Prediction of potential genes in microbial genomes
Time: Wed May 18 00:49:08 2011
Seq name: gi|222159358|gb|ACAB01000001.1| Bacteroides sp. D1 cont1.1, whole genome shotgun sequence
Length of sequence - 83528 bp
Number of predicted genes - 76, with homology - 73
Number of transcription units - 43, operones - 17 average op.length - 2.9

   N   Tu/Op    Conserved   S          Start      End  Score
                pairs(N/Pv)

   1   1 Op  1  .         + CDS       3 -    656    154  ## BT_4010 hypothetical protein
   2   1 Op  2  .         + CDS     661 -   1527    380  ## COG0338 Site-specific DNA methylase
   3   1 Op  3  .         + CDS    1524 -   2414    354  ## BT_4012 putative ABC oligo/dipeptide transport, ATP-binding protein
                          + Prom   2649 -   2708    6.8
   4   2 Tu  1  .         + CDS    2901 -   3089    159  ## BT_4012 putative ABC oligo/dipeptide transport, ATP-binding protein
                          + Term   3107 -   3152   -0.9
   5   3 Op  1  13/0.000  - CDS    3256 -   4578    921  ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains
   6   3 Op  2  .         - CDS    4571 -   6880   1358  ## COG0642 Signal transduction histidine kinase
   7   3 Op  3  9/0.000   - CDS    6896 -   8257    460  ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12
   8   3 Op  4  27/0.000  - CDS    8258 -  11371   2546  ## COG0841 Cation/multidrug efflux pump
   9   3 Op  5  .         - CDS   11368 -  12486    894  ## COG0845 Membrane-fusion protein
  10   3 Op  6  .         - CDS   12502 -  13056    481  ## PROTEIN SUPPORTED gi|34557871|ref|NP_907686.1| ribosomal protein N-acetylase
                          - Prom  13245 -  13304    6.8
                          + Prom  13028 -  13087    6.3
  11   4 Op  1  9/0.000   + CDS   13188 -  14432    611  ## COG3275 Putative regulator of cell autolysis
  12   4 Op  2  .         + CDS   14413 -  15168    654  ## COG3279 Response regulator of the LytR/AlgR family
                          + Term  15212 -  15270    7.3
                          - Term  15204 -  15254    5.2
  13   5 Tu  1  .         - CDS   15389 -  15823    299  ## BT_4016 hypothetical protein
                          - Prom  15959 -  16018    4.8
                          - Term  15960 -  15999    5.4
  14   6 Tu  1  .         - CDS   16103 -  16570    385  ## BDI_0747 hypothetical protein
                          - Prom  16632 -  16691    2.5
  15   7 Tu  1  .         - CDS   16831 -  17067     73  ## gi|262405309|ref|ZP_06081859.1| predicted protein
                          - Prom  17094 -  17153    3.5
                          + Prom  17042 -  17101    1.5
  16   8 Op  1  .         + CDS   17133 -  17444    203  ## BDI_0746 hypothetical protein
  17   8 Op  2  .         + CDS   17491 -  17796    249  ## BDI_0745 hypothetical protein
                          + Term  17862 -  17927   21.2
                          - Term  17799 -  17848    0.1
  18   9 Op  1  .         - CDS   18087 -  18509    247  ## BDI_0743 hypothetical protein
  19   9 Op  2  .         - CDS   18531 -  19751    608  ## BDI_0742 integrase
  20   9 Op  3  .         - CDS   19765 -  20994    931  ## COG0582 Integrase
                          - Prom  21193 -  21252    3.8
                          + Prom  20953 -  21012    5.1
  21  10 Op  1  .         + CDS   21259 -  21498    108  ## BT_4023 transposase
  22  10 Op  2  .         + CDS   21495 -  21752    310  ## BT_4024 hypothetical protein
  23  10 Op  3  .         + CDS   21791 -  22105    267  ## BT_4025 hypothetical protein
  24  10 Op  4  .         + CDS   22102 -  23145    662  ## BT_4026 hypothetical protein
  25  10 Op  5  .         + CDS   23166 -  23447    209  ## BT_4027 hypothetical protein
  26  10 Op  6  .         + CDS   23444 -  23713    143  ## BT_4028 hypothetical protein
                          + Term  23873 -  23920    4.4
                          + Prom  23892 -  23951    2.9
  27  11 Tu  1  .         + CDS   24089 -  24553   -154  ## gi|299148371|ref|ZP_07041433.1| conserved hypothetical protein
                          + Term  24609 -  24650   -0.2
                          - Term  25927 -  25981    1.0
  28  12 Tu  1  .         - CDS   26016 -  26486    622  ## BT_4032 putative non-specific DNA-binding protein
                          - Prom  26576 -  26635    3.0
                          - Term  26594 -  26637    8.1
  29  13 Tu  1  .         - CDS   26675 -  26788    106  ## gi|237716547|ref|ZP_04547028.1| predicted protein
  30  14 Tu  1  .         - CDS   26914 -  27387    431  ## COG3023 Negative regulator of beta-lactamase expression
  31  15 Tu  1  .         - CDS   27564 -  27815    216  ## BT_4030 hypothetical protein
                          - Prom  27884 -  27943    4.3
                          - Term  28599 -  28640   -0.2
  32  16 Tu  1  .         - CDS   28669 -  28860    108  ## gi|237716551|ref|ZP_04547032.1| predicted protein
                          - Prom  28900 -  28959    3.6
  33  17 Op  1  .         - CDS   29398 -  29718    255  ## BT_4035 hypothetical protein
  34  17 Op  2  .         - CDS   29734 -  30282    601  ## BT_4036 hypothetical protein
                          - Prom  30303 -  30362    3.2
  35  18 Op  1  .         - CDS   30369 -  30617    181  ## gi|237716554|ref|ZP_04547035.1| predicted protein
  36  18 Op  2  .         - CDS   30575 -  30670     61  ##
                          - Prom  30699 -  30758    6.5
  37  19 Tu  1  .         + CDS   30928 -  31818    838  ## COG0524 Sugar kinases, ribokinase family
                          + Term  31892 -  31934    9.4
                          + Prom  31907 -  31966    5.5
  38  20 Op  1  .         + CDS   32066 -  35131   2836  ## BT_3569 hypothetical protein
  39  20 Op  2  .         + CDS   35196 -  36728   1634  ## BT_3568 hypothetical protein
                          + Term  36733 -  36786    2.3
  40  21 Op  1  .         + CDS   36871 -  39180   2415  ## COG1472 Beta-glucosidase-related glycosidases
  41  21 Op  2  .         + CDS   39229 -  40602   1254  ## COG5368 Uncharacterized protein conserved in bacteria
  42  22 Tu  1  .         - CDS   40630 -  43191   1825  ## COG1629 Outer membrane receptor proteins, mostly Fe transport
                          - Prom  43378 -  43437    6.0
                          + Prom  43057 -  43116    2.4
  43  23 Tu  1  .         + CDS   43212 -  43373    120  ## gi|295086358|emb|CBK67881.1| hypothetical protein
  44  24 Tu  1  .         - CDS   43493 -  43942    292  ## BT_3564 hypothetical protein
                          - Prom  44151 -  44210    6.0
  45  25 Tu  1  .         - CDS   45418 -  46992   1481  ## BT_4046 hypothetical protein
                          - Prom  47088 -  47147    7.0
                          + Prom  47098 -  47157    6.0
  46  26 Tu  1  .         + CDS   47183 -  48736   1105  ## BT_0374 hypothetical protein
                          + Term  48851 -  48905   -0.9
  47  27 Tu  1  .         - CDS   48785 -  49651    603  ## COG1864 DNA/RNA endonuclease G, NUC1
                          - Term  49664 -  49710    5.0
  48  28 Op  1  .         - CDS   49731 -  51683   1421  ## COG4085 Predicted RNA-binding protein, contains TRAM domain
  49  28 Op  2  .         - CDS   51704 -  52546    798  ## BT_3561 hypothetical protein
  50  28 Op  3  .         - CDS   52594 -  55122   2147  ## BT_3560 hypothetical protein
                          - Prom  55182 -  55241    7.7
                          + Prom  55073 -  55132    6.2
  51  29 Op  1  .         + CDS   55323 -  56348    904  ## BT_3559 hypothetical protein
  52  29 Op  2  .         + CDS   56345 -  57502    486  ## COG1864 DNA/RNA endonuclease G, NUC1
                          + Term  57558 -  57609    6.2
                          - Term  57627 -  57674    8.0
  53  30 Op  1  .         - CDS   57722 -  59020    926  ## COG3174 Predicted membrane protein
  54  30 Op  2  .         - CDS   59073 -  59540    499  ## COG2954 Uncharacterized protein conserved in bacteria
                          - Prom  59569 -  59628    4.2
  55  31 Tu  1  .         - CDS   59652 -  60677    819  ## BT_3553 hypothetical protein
                          - Prom  60796 -  60855    8.2
                          - Term  60782 -  60825    3.1
  56  32 Op  1  .         - CDS   60940 -  61074    167  ##
                          - Prom  61097 -  61156    3.7
  57  32 Op  2  .         - CDS   61159 -  62118   1009  ## COG1186 Protein chain release factor B
  58  32 Op  3  .         - CDS   62210 -  62275     59  ##
                          - Prom  62299 -  62358    3.8
                          + Prom  62312 -  62371    4.5
  59  33 Tu  1  .         + CDS   62442 -  62873    567  ## BT_3551 putative zinc protease
  60  34 Tu  1  .         - CDS   63113 -  64927   1599  ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming)
                          - Prom  65090 -  65149    5.4
                          - Term  65045 -  65102    8.5
  61  35 Op  1  .         - CDS   65151 -  66164    528  ## gi|237716577|ref|ZP_04547058.1| predicted protein
  62  35 Op  2  .         - CDS   66189 -  67361    845  ## BF1313 hypothetical protein
  63  35 Op  3  .         - CDS   67379 -  68497    888  ## BVU_0617 glycoside hydrolase family protein
  64  35 Op  4  .         - CDS   68525 -  70084   1252  ## BF1327 hypothetical protein
  65  35 Op  5  .         - CDS   70110 -  73391   2454  ## BT_1280 hypothetical protein
                          - Prom  73418 -  73477    4.6
  66  36 Op  1  6/0.000   - CDS   73575 -  74480    616  ## COG3712 Fe2+-dicitrate sensor, membrane component
                          - Prom  74544 -  74603    4.1
  67  36 Op  2  .         - CDS   74617 -  75162    377  ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
                          - Prom  75185 -  75244    8.1
                          + Prom  75317 -  75376    7.3
  68  37 Tu  1  .         + CDS   75407 -  76474    842  ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases
  69  38 Tu  1  .         - CDS   76563 -  77738    694  ## COG0526 Thiol-disulfide isomerase and thioredoxins
                          - Prom  77765 -  77824    7.9
  70  39 Tu  1  .         + CDS   78138 -  78524    287  ## gi|237716586|ref|ZP_04547067.1| predicted protein
                          + Term  78551 -  78585    3.5
                          - Term  78539 -  78573    4.3
  71  40 Op  1  .         - CDS   78582 -  79556    664  ## RPE_1593 hypothetical protein
  72  40 Op  2  .         - CDS   79546 -  80061    180  ## BVU_1718 putative transposase
  73  40 Op  3  .         - CDS   79991 -  80494    160  ## BVU_1718 putative transposase
                          - Prom  80593 -  80652    4.8
  74  41 Tu  1  .         - CDS   80673 -  81692    853  ## BT_3536 hypothetical protein
  75  42 Tu  1  .         + CDS   82030 -  82497    348  ## Fisuc_0074 hypothetical protein
                          - Term  82550 -  82602    1.1
  76  43 Tu  1  .
- CDS 82720 - 83355 460 ## BT_4231 hypothetical protein - Prom 83395 - 83454 2.6 Predicted protein(s) >gi|222159358|gb|ACAB01000001.1| GENE 1 3 - 656 154 217 aa, chain + ## HITS:1 COG:no KEGG:BT_4010 NR:ns ## KEGG: BT_4010 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 217 245 462 462 337 85.0 2e-91 FKVSEIYNREKRKQYIDNFDTNQKPDLSNEASEQWSVQDIVDNKGQVLINSERREIKKAN NQKARNRAGLVPKTLILHINNPKINKIFEELKHIQVKTCPNASSVLLRVFLELSVDAYLE RYDLVKNNAITACSSKEDLNGKVCKVLNHMTQLGTMSNDLSKGIRSEINDKNSVLSIESL NAYVHNEFFYPKADNLIIGWDNIESFFIQLWESINKE >gi|222159358|gb|ACAB01000001.1| GENE 2 661 - 1527 380 288 aa, chain + ## HITS:1 COG:sll0729 KEGG:ns NR:ns ## COG: sll0729 COG0338 # Protein_GI_number: 16332213 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Synechocystis # 5 270 22 282 285 93 29.0 5e-19 MEFLSPLRYPGGKAKVADFVQCLIKENALLDGTYVEPYVGGGSVALSLLFNEYVSDIYIN DKDISIYAFWYSVLNNTDALCQLIKDTPINVETWFKQKEFQQNKENRNVDLLNLGFSTFF LNRTNRSGILKAGIIGGYDQTGNYKIDARFNKEDLIKRIQRIADYAERIHLTNDDAVTLV QRLKDELPYNTLLYLDPPYYIKGKGLYLNYYNDTDHQNIANAIGAIVNCKWIVSYDNVPF ITNLYSKYRQQCFELNYSASNSGKGEEIMIFCDDIVIPKHKLFNHSTK >gi|222159358|gb|ACAB01000001.1| GENE 3 1524 - 2414 354 296 aa, chain + ## HITS:1 COG:no KEGG:BT_4012 NR:ns ## KEGG: BT_4012 # Name: not_defined # Def: putative ABC oligo/dipeptide transport, ATP-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 96 295 1 200 426 381 95.0 1e-104 MKITHIHIERFRGFQNEDFEVGSQLTAIAGQNGTQKSTLLGIVTQTFTLKPEDPMRAEKP LCGGSYISAFKDKFRLSPIFDRPKGHEWTISFDIGVDDFTVESIKRTGDPNVRFWKKGAR QEGDGYISFPTIFLSLKRLVPMAEEAKIITDDTLLTPEELSEFKQLHNKILIVQTPISSA TTITSKNKQSIGVSTELYDWNQNSMGQDNLGKIILALFSFKRLHDKYPQQYKGGILAIDE MDATMYPASQVELLKILRKYASKLNLQILFTTHSMSLLKVMDDLVQEVSKQEETAK >gi|222159358|gb|ACAB01000001.1| GENE 4 2901 - 3089 159 62 aa, chain + ## HITS:1 COG:no KEGG:BT_4012 NR:ns ## KEGG: BT_4012 # Name: not_defined # Def: putative ABC oligo/dipeptide transport, ATP-binding 
protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 62 365 426 426 107 79.0 1e-22 MEQINAGGELGRQDAKKWFNGQLEYWGRNGNKVFNPFLRSIPIEVQEFKRNFENMIKRYI YD >gi|222159358|gb|ACAB01000001.1| GENE 5 3256 - 4578 921 440 aa, chain - ## HITS:1 COG:hydG KEGG:ns NR:ns ## COG: hydG COG2204 # Protein_GI_number: 16131834 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 2 428 4 439 441 276 38.0 6e-74 MDKTRIIVVEDNIVYCEYVCNLLAREGYSTVKAYHLSTAKKHLQQATDDDIVVSDLRLND GNGIDLLRWMRKEGKMQPFIIMTDYAEVHTAVESMKLGSIDYIPKKLIEDKLVPLIRSIQ KERQSRHSRMPVFAREGSAFQKIMHRIKLVAVTDMSVMIFGENGTGKEHIAHHLHDKSKR AGKPFVAVDCGSLSRELAPSAFFGHVKGAFTGADSTKKGYFHEAEGGTLFLDEVGNLALE TQQMLLRAIQERRYRPVGDKSDKSFNVRIIAATNEDLEAAVSEKRFRQDLLYRLQDFVIT VPPLRDCQEDIMPLAEFFREIANKELECNVSGFSSEARKALLTHAWPGNVRELRQKIMGA VLQAQEGVVTKEHLELAVTKPTSPVSFALRNDAEDKERIMRALKQTNGNRSAAAELLGIS RATLYSKLEEYGLKYKFKQS >gi|222159358|gb|ACAB01000001.1| GENE 6 4571 - 6880 1358 769 aa, chain - ## HITS:1 COG:VCA0736_1 KEGG:ns NR:ns ## COG: VCA0736_1 COG0642 # Protein_GI_number: 15601492 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 269 518 450 709 726 126 31.0 2e-28 MEFVRKAIALSHTFIILIVVAIAFTCHNEWQEVEALEVGNRHIDEFRKEVNRIHIQLIEF SLLGETALDWDETDLENYHAQRIALDSTLCLFNETHVIGRIDSVRSLLEDKERQMFQIVR LIDEQQSINKKIASQVPLIVQTSMQEQPKKPKRKGFLGIFGKKEETKPPTTTTTLRSPNR SMVSEQKEQSRRLSEQADSLAARNADLNRQLKGLICQIENKVQADLQGREDEIVAMRGKS FMQVGGLMGFVLLLLLISYIIIHRDAKSIKQYKRKTTDLIGQLEQSVQRNEALIASRKKA VHTITHELRTPLTAITGYTELLQKECSKENNVNFLQSIQQSSDRMRDMLNTLLDFFRLDN GKEQPKLLPCRISTITHTLETEFMPVAMNKGLSLTVKNGNDAVVLTDKERIIQIGNNLLS NAIKFTEKGGVSLTIDYTNGVLTLIFEDTGTGMTEDEQQRVFGAFERLSNAAAKDGFGLG LAIVHNIVAMLDGKIHLESEKGKGSRFTVEIPMQKAEEIPEKEIQTYIHREERNLNVVAI DNDEVLLLMLKEMYAQEGIHCDTCTDAAELMEMIRRKEYNLLLTDLNMPGINGFELLELL RSSNVGNSQTIPVVVATASGSCDAEELLERGFAGCLFKPFSISELLEISDKCAIKATQDG 
KPDFSALLSYGNEVVMLEKLITETEKEMQAVRDAAMRNDLQELDALTHHLRSSWEILRAD QPLKVLYGVLHGKDKSDNEALSNSVTTVLDKGAEIIQLAKEERRKYENG >gi|222159358|gb|ACAB01000001.1| GENE 7 6896 - 8257 460 453 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 3 451 2 455 460 181 28 9e-45 MKKTAIYIILSGWMLTGCGTYSRYHRPDLSMENLYSTLPADADTTTLASLSWREMFTDPK LQSLIETGLDRNTDLNVARLRVEAAASALLTAKLSYLPSLGLNAEGNAGKHDGATAKTYN AGATASWELDIFGNLTAAKRGAAAALQGSRDYRQAVQTQLVATIADSYYTLAMLDAQMAI SHRTLENWQTTVRTLEALKKAGQSNEAGVLQAKANVMQLESSLLSIRKSISETENALFAI LAMPSHSIARSNLVEAAFPDTVSIGVPLQLLSNRPDVRQAEMELAQAFYTTNAARAAFYP HITLSGTLGWTNNGGGIITNPGQWLLNAIGSLTQPLFNRGANIANLKTAQIRQEEAKLLF QHSLLNAGKEVNDALTAWQTAKSQLEINARQVETLCDAVRKTESLMRHSNITYLEVLTAQ QSLLEAEVQQLQTRFERIQSVIKLYHALGGGRF >gi|222159358|gb|ACAB01000001.1| GENE 8 8258 - 11371 2546 1037 aa, chain - ## HITS:1 COG:BMEI1629 KEGG:ns NR:ns ## COG: BMEI1629 COG0841 # Protein_GI_number: 17987912 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Brucella melitensis # 4 1024 3 1022 1051 730 39.0 0 MKLRFFIDRPVFSGVISVVIVLLGVIAMFSLPVEQYPDIAPPTINVWASYPGANAETVQK AVIVPLEEAINGVEDMTYMTSTASNTGDASINIYFKQGANADMAAVNVQNRVNAALSQLP AEATKTGVTTEKQQNAELLTFALYSPDDRFDQTFLNNYMKINVEPRLKRISGVGKAQLFG SNYSMRLWLKPDRMAQYGLIPDDISAVLARQNIEAATGSFGANHPTANEYTMKYRGRLSG ADEFGELVVKSLPGGNVLRLKEVADVELGDEYYNYSSQVNGHPAAMMMINQKAGSNASST INEIHEVLHELDRDLPEGAEFVVLTDTNKFLYASIRSVIRTLLEAILLVIVVVYVFLQDI KSTLIPAISIFVSIIGTFAVMSMIGFSINLLTLFALVLAIGTVVDDAIVVVEAVQAKFDE GYQSAVLAADDAMKDVSSAILTSTIIFMAVFFPVAMMGGTSGAFYAQFGITMAVAVGISA INAFTLSPALCALLLKPYIDEHGNTKDNFAARFRKAFNAVFYRLSRRYVRGVMFILHRRW LMWSSIAVSFVLLVLLVNATKTGLIPEEDTGTVMVSMNTKPGTSMAQTVKVMERINSRLD SIGEIEYNGAVAGFSFSGSGPSQAMYFVTLKDWEQREGEGQSVNDVIGKIYAATSDIPDA TVFAMSPPMIAGYGMGNGFELYLQDKAGGDITAFKKEADKFVEALSQRPEIGEVYSSFAT DYPQYWVDIDAAKCEQSGVSPAEVLSTLSGYYTGQYVSDFNRFSKLYHVTMQAPAEYRVN AESIHHIYVRTSDGGMAPLSRFVRLTKTNGPSDLTRFNLFNAISINGSPARGYSSGQVLE 
AISETARKVLPANYTYEFGGISREESKTTNNATLIFLLCMVLVYMILCALYESVFIPFAV LLAVPCGLMGSFLFAWLFGLENNIYMQTGLIMIIGLLAKTAILLTEYAGKRRSEGMTLAQ AAYSAAKVRLRPILMTVLSMVFGLIPLMLAHGVGANGSRSLATGVIGGMIVGTLALLFLV PSLFIVFQYIQERIKHN >gi|222159358|gb|ACAB01000001.1| GENE 9 11368 - 12486 894 372 aa, chain - ## HITS:1 COG:YPO3483 KEGG:ns NR:ns ## COG: YPO3483 COG0845 # Protein_GI_number: 16123629 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Yersinia pestis # 13 367 7 365 386 139 29.0 7e-33 MNKRFIIINFAAFAVMLFLSTGCKQDGKQDAVPSYRVITVSSMPVEISESYSATIRGRQD VDIMPQISGRITRLCVKEGTRVTTGQVLAVIDQTPYLAALRTALANVSAARAKAETARIE LQGKQALFDEKVISEYDLSLARNQLAVALAELQQAEAQEADARNNLSYTEIKSPSNGVVG TLPYRIGALVSPNMEQPFTVVSDNAEMYAYFSVSENMLRHLSTRYGSIDSLIAKMPEVSL QLNDGSLYKAKGRIETVSGVVDPATGAVQIKALFPNPERELLSGSIGNIVLRNPQTDAIT IPMTATVELQDKIIAYRLKNGRAEAAYLTVDRLNDGNRFIVKEGLSVGDTLIAEGVGLVR EGMSITPKDETK >gi|222159358|gb|ACAB01000001.1| GENE 10 12502 - 13056 481 184 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|34557871|ref|NP_907686.1| ribosomal protein N-acetylase [Wolinella succinogenes DSM 1740] # 1 173 1 175 182 189 52 3e-47 MKIETERLVIRDFQKRDVVGLLEYLSNPRVNCFAADRLCSEEAAFVYMQYSQKDMQRYAV SLKKEDFIIGDVFVLRENEDTYNVGWHFNKRFEGKGLAREAVIGLLDYLFREAGARRIYG FVEDDNIRSKRLCERLGMRREGCFKEFVTFINNPNGTPKYEDTRVYSILKKEWNSISAKS ITNN >gi|222159358|gb|ACAB01000001.1| GENE 11 13188 - 14432 611 414 aa, chain + ## HITS:1 COG:SMb21546 KEGG:ns NR:ns ## COG: SMb21546 COG3275 # Protein_GI_number: 16264735 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Sinorhizobium meliloti # 198 381 167 353 383 85 29.0 2e-16 MQIGFNKYNVYFTNSYYRFDSNFIRFSLFILSTKGVKPYLCRIKSNFAPDMTQVKMKIMS WKNGLLLSAVSYVFYLVMWFILDDKTIDQLPGMTIADYMVDFLLCMLFTYISLGFCFLVF RVLPFRTSYVWGMVYASCLMALNNIVAFGMITLFNFLWDETDNGLLDELLNMKGAYTFAM ISTFLSSVYANSFYLQSYIKARDEKQALEMALMKEKEIALQSQLNSLKLQINPHFMFNNF NNLLELIKEDTELAGKFLSNLSKVYHRYIITNLDRNLIPVADEIKFLDSYLYLMKVRHNE 
GVIAKVSPGVRQCKGFLPPAVLQLLVENAIKHNSFSSEHPLVINIKLSDDYITVCNLKRP LMSPIESTGLGLKNIIERYALLCDKKVKIENAENFYSVSLPIIKNINPYENTDS >gi|222159358|gb|ACAB01000001.1| GENE 12 14413 - 15168 654 251 aa, chain + ## HITS:1 COG:SA0251 KEGG:ns NR:ns ## COG: SA0251 COG3279 # Protein_GI_number: 15925964 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Staphylococcus aureus N315 # 1 235 1 231 246 91 31.0 1e-18 MKILILEDEQRNAKRLIRLLNDIDRTFIVEGPLASIKETVEFFQSGKTTDLILADIRLTD GLSFEALKHAQATVPIIFTTAYDEYAVQAFKFNSFDYLLKPLDPDELEAAIDKAVKAGKN YTDENLQQLFDALQKSKFRYRERFLLPYRDGYKTVRVSDINHIETENKIVHLRLNNGTSE VVNVSMDELEHQLNPDYFFRANRQYIINVEHVLFLGNYFGGKLIVRLKGYPKTEIQVSKE KAQRLKEWIDR >gi|222159358|gb|ACAB01000001.1| GENE 13 15389 - 15823 299 144 aa, chain - ## HITS:1 COG:no KEGG:BT_4016 NR:ns ## KEGG: BT_4016 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 140 1 140 140 223 86.0 1e-57 MKKKSKYGRNPKLNPKTHCVMVRFDDVEWNRFLTMYEESNVYAKAVFLKAHFFGQKFKVL KVDKTLVDYYTKLSDFHAQFRGIGTNYNQVVKELRIHFSEKKAMALLYKLEECTIDLVKL SREIMELSREMEAKWQQRRDDSMV >gi|222159358|gb|ACAB01000001.1| GENE 14 16103 - 16570 385 155 aa, chain - ## HITS:1 COG:no KEGG:BDI_0747 NR:ns ## KEGG: BDI_0747 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 155 1 155 155 228 85.0 4e-59 MKREPSITEQQAREIVEKMGRRESYTPKSMNDIYRRIGLEPDEPEQPGKTVTEETETAMA DEPSSEAVGETAMPQKRVSSKQRRLSLEEYRATYLQVPKIINRKPVFVSETVRDELDRVV RFLGGKGMSASGLIENLVRLHLDTYRNDIEQWRKL >gi|222159358|gb|ACAB01000001.1| GENE 15 16831 - 17067 73 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262405309|ref|ZP_06081859.1| ## NR: gi|262405309|ref|ZP_06081859.1| predicted protein [Bacteroides sp. 
2_1_22] # 1 78 1 78 78 157 100.0 2e-37 MGFPEVVACCADTVVIVRDTDFSFGGYHSFSRKEIHNPASGNGFNLVWQQVAQGGRGWHR HGSPIAQSAQSLALNSYI >gi|222159358|gb|ACAB01000001.1| GENE 16 17133 - 17444 203 103 aa, chain + ## HITS:1 COG:no KEGG:BDI_0746 NR:ns ## KEGG: BDI_0746 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 103 1 103 103 184 93.0 1e-45 MEIVSIERKTFEELVAKFDRFVCRMDAICHRHGEKKMSEWMDNQDVCRMLNISPRTLQTL RDNGTLAYSQINHKTYYRPEDVQRIVSIVEDRRKEAKFKGRTI >gi|222159358|gb|ACAB01000001.1| GENE 17 17491 - 17796 249 101 aa, chain + ## HITS:1 COG:no KEGG:BDI_0745 NR:ns ## KEGG: BDI_0745 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 97 1 97 97 170 92.0 2e-41 MNELINKDNEWIIHFMGSLDRLLDNVEHLTASYRPTLNGERFFTDKEVSARLKVSRRTLQ DYRNEGRIAYIQLGGKILYRESDIERMLTDSYRTAYRQTAI >gi|222159358|gb|ACAB01000001.1| GENE 18 18087 - 18509 247 140 aa, chain - ## HITS:1 COG:no KEGG:BDI_0743 NR:ns ## KEGG: BDI_0743 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 124 1 124 124 197 77.0 1e-49 MSREIITISKMGAVTVPTAPVWMTQFEIADLFGVFSCNIRKMIQAIYKNKELNETDTMKY IKQTDGISYDVYNFEMIIAIAFRICSKETLLFRRFVINEICTTKKGIPVTLFVSYGKNGN LWYSRGSSRQPPVPDARMQR >gi|222159358|gb|ACAB01000001.1| GENE 19 18531 - 19751 608 406 aa, chain - ## HITS:1 COG:no KEGG:BDI_0742 NR:ns ## KEGG: BDI_0742 # Name: not_defined # Def: integrase # Organism: P.distasonis # Pathway: not_defined # 1 403 1 403 403 738 91.0 0 MRSTFSLLPYINRSKVRADGTTAVLCRITIDGKQTAISTGIYCRPEDWNGRKNEIKTVRE NNRLREYLRLTEEAYTEILKSQGVVSAEMLKNHISLNNIHPTTLLQMGEWERERLKKHSE EIDSTSSYRHSMYYQKYLTDFIASIGKKEIPLEEVTEDFGKSYKAFLKKCKNFSSSQTNK CMCWLNRLLYLAVDKEIIRVNPCEDLEYEPKPEARHRYISRDEFKKILSTPMYDKRMELA RRAFIFSTLTGLAYVDIKLLHPHHIGTNAEGRRYIRINRKKTKVEAFIPLHPIAEQILSL YNTTDDEKPVFPLPSRDSLWFDIHEMGVAIGKEENLSYHQSRHSFGTFLISADIPIESIA KMMGHSNIRTTQGYARITDDKISKDMDKLMERRKKISAGEKKENSI >gi|222159358|gb|ACAB01000001.1| GENE 20 19765 - 20994 931 409 aa, chain - ## HITS:1 COG:SMc02489 
KEGG:ns NR:ns ## COG: SMc02489 COG0582 # Protein_GI_number: 15966799 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Sinorhizobium meliloti # 216 397 136 325 330 68 28.0 2e-11 MKVEKFKVLLYLKKSGLDKSGKAPIMGRITVNRTMAQFGCKLSCPPELWNPCESRLNGKS KEAVETNTKIEKLLLAVNNAFDNLVSRKMDFDATDVKNHFQGSMETQMTLMRMTDVVCDD LKARIGIDRAKTSYSTYHYMRLTLGEFIGHQYKVKDLAFGQLTEQFIHDYQVFAMENKGY AIDTVRHHLAILKKICRLAYKEGYADKIHFQHFTLPKQSDKTPRALSRESFEKIRDVEIE PHRKSHILARDMFLFGCYTGVSYADVISITDENLYTDDNGALWLKYRRKKNEHRASVKLL PEAIALIEKYHSEDRDTLFPLLRWPNLRRHMKALAALAGIKDDLCYHQARHSFASLITLE AGVPIETISRMLGHSDISTTQVYARVSPKKLFEDMDKFIEATQDFKLTL >gi|222159358|gb|ACAB01000001.1| GENE 21 21259 - 21498 108 79 aa, chain + ## HITS:1 COG:no KEGG:BT_4023 NR:ns ## KEGG: BT_4023 # Name: not_defined # Def: transposase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 79 39 118 118 111 75.0 1e-23 MLALPCIFHMARHTFATTITLQHEIPLETVSKMLGHTKITTTQVYARVVDTKVMRDMATL KDMYSHKEDKSPNNKAANE >gi|222159358|gb|ACAB01000001.1| GENE 22 21495 - 21752 310 85 aa, chain + ## HITS:1 COG:no KEGG:BT_4024 NR:ns ## KEGG: BT_4024 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 82 1 82 86 100 65.0 2e-20 MKATVIINQEELELKAIDSMIAYEKSFITYSEMKKAVSDALRHYGSREGHRKIVLKGWII KTIYALDSNQLKDLDRITFEYLNEH >gi|222159358|gb|ACAB01000001.1| GENE 23 21791 - 22105 267 104 aa, chain + ## HITS:1 COG:no KEGG:BT_4025 NR:ns ## KEGG: BT_4025 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 104 1 98 98 131 69.0 8e-30 MKTETNSMRIVKPEKTDTAGTPQQEKDPEKSKYKPVTPVMLDTVPEDAVFIRSADVCKLL NISNSTLRNLRAERAIPFYKLGGTFLYSKEEIMNYLASNYSRRI >gi|222159358|gb|ACAB01000001.1| GENE 24 22102 - 23145 662 347 aa, chain + ## HITS:1 COG:no KEGG:BT_4026 NR:ns ## KEGG: BT_4026 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 337 1 325 332 439 68.0 1e-122 
MSKNGFSYYKAETDRFQDIKIKRLKKKYGCDGYAVYQYALNEIYRVDGSYIRWTEDQLFD CADYWGMNEERVKEIVDYCAEICLFDPVVWKMKCILTSRAIQSRYIDICKLAKKKMYIPL DILLVEPEQPIKEPVTMPLFEAATAAEPTGQNIPDVSVSPTTVEPTFEKLPEDFQNFPET SGNLPEKIDKEKKSKEKENKEKSSSIPLPTIGELTEEEARALLSSTPLGRNKPETTIATG GNTASTTTGKRPPQVEAAAHEQKPRNPKGLIEALSPYNLSPRELEEVLKLSGHGEIGNPV WQILGEMHGNKRLRMPRLFLLKRLRDAVGVIADSSHGDDEIKNKSAS >gi|222159358|gb|ACAB01000001.1| GENE 25 23166 - 23447 209 93 aa, chain + ## HITS:1 COG:no KEGG:BT_4027 NR:ns ## KEGG: BT_4027 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 73 2 71 75 75 57.0 6e-13 MSENKKATDTPLYYPRKMRTGWCVAHEVTAAGITIERYGIHCQTYAEAYRRAEVMNRKQQ VETEAGREGTQVGETIGSLEPSASDGEKGGERS >gi|222159358|gb|ACAB01000001.1| GENE 26 23444 - 23713 143 89 aa, chain + ## HITS:1 COG:no KEGG:BT_4028 NR:ns ## KEGG: BT_4028 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 83 1 83 90 90 65.0 2e-17 MSEKVSTITLRLTAEEVTQLEILKNLTGKRTASEAIKHVVQEYPRFCTHYKQEAKEHGEL KRRYQEQGEAVRGFLSALDRLEKAGREKE >gi|222159358|gb|ACAB01000001.1| GENE 27 24089 - 24553 -154 154 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|299148371|ref|ZP_07041433.1| ## NR: gi|299148371|ref|ZP_07041433.1| conserved hypothetical protein [Bacteroides sp. 
3_1_23] # 1 150 25 174 210 267 96.0 2e-70 MVRGNACSGNTAAHTAPSPAAALRQCLIGTFLHRQRFSDVQCFCAPCRFPCFPVAAQWAQ RKPPSDSVSMERSWGKSSQKLLLCVTVEAGKVGDYGSYYRLTHTAPASAHTPAQPDYKRQ PMRWKPLRKGIPCGTGDSCDARCGFAPPGTCCRD >gi|222159358|gb|ACAB01000001.1| GENE 28 26016 - 26486 622 156 aa, chain - ## HITS:1 COG:no KEGG:BT_4032 NR:ns ## KEGG: BT_4032 # Name: not_defined # Def: putative non-specific DNA-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 154 1 156 158 194 73.0 1e-48 MAITIKPALRKNPQDKAAAAKYYAQVVLAPEMKQKQIVDQIADRCTLTGSDIKAVLDALM VVIKRNLANGSPVRLGDLGSFRPSVSGKGTEDASKCGANSVKKARVIYVPSTEIKEAVAM YSFSKAGASAANEEGGDEKPDPKPDEGGGEAPDPAA >gi|222159358|gb|ACAB01000001.1| GENE 29 26675 - 26788 106 37 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716547|ref|ZP_04547028.1| ## NR: gi|237716547|ref|ZP_04547028.1| predicted protein [Bacteroides sp. D1] # 1 37 31 67 67 73 100.0 4e-12 MKPFWRTTLKVLKVIGKIFVWLTGADSPVKDDEKRND >gi|222159358|gb|ACAB01000001.1| GENE 30 26914 - 27387 431 157 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 49 145 2 98 116 95 49.0 2e-20 MIKQRNITLIVVHCTASRCTSNLTPEALDSMHKRQGFAECGYHYYITKDGTIHHMRDITH VGAHAKGYNTPSIGVAYEGGLDASGHAADTRTDAQKQSLETLLRFLLLTYHGAKICGHRD LSPDLNHNGTIEPCEYIKQCPCFCVSVEYGYLMKNDE >gi|222159358|gb|ACAB01000001.1| GENE 31 27564 - 27815 216 83 aa, chain - ## HITS:1 COG:no KEGG:BT_4030 NR:ns ## KEGG: BT_4030 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 83 1 84 84 130 80.0 1e-29 MYTNMHQQNNCQSLTNRSYGFKELAVLYFPNIAPASASIRLKQWIKDDSELLEALQETNY QLSNRILTPKQKELITISFGSPF >gi|222159358|gb|ACAB01000001.1| GENE 32 28669 - 28860 108 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716551|ref|ZP_04547032.1| ## NR: gi|237716551|ref|ZP_04547032.1| predicted protein [Bacteroides sp. 
D1] # 21 63 1 43 43 72 100.0 9e-12 MHSAPAPAHTPAQPDHRRQPMRWTPLRTIAEVTPVLYTAPVAAPSLAEYSKGSSTLCDGD PFG >gi|222159358|gb|ACAB01000001.1| GENE 33 29398 - 29718 255 106 aa, chain - ## HITS:1 COG:no KEGG:BT_4035 NR:ns ## KEGG: BT_4035 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 106 1 106 106 138 76.0 5e-32 MENRMKSIAQRLEEHASPTPSKWRELFDFLETNKSWLRHSQNIAMLMLDRMEELGMSQKQ LAEKMNCSPQYISKVLRGRENLSLETLTKIENALEISIIKEEPMAV >gi|222159358|gb|ACAB01000001.1| GENE 34 29734 - 30282 601 182 aa, chain - ## HITS:1 COG:no KEGG:BT_4036 NR:ns ## KEGG: BT_4036 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 182 1 182 182 280 79.0 1e-74 MTFDDITGDRKLWAVHFEGEDDNEFYKVFNNWADVVWLRTFFKENIDDLNAYFKITDIKE AISDTIEDSERLRYIIMDISPETDLSKIFRPLDNNQASDVMLQKEKARLKRKYGHSSWLR LYAIKLIQGNYIITGGAIKLTATMQEREHTRQELVKIDKVRRYLLEEGIIDDEGFIEYIS EL >gi|222159358|gb|ACAB01000001.1| GENE 35 30369 - 30617 181 82 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716554|ref|ZP_04547035.1| ## NR: gi|237716554|ref|ZP_04547035.1| predicted protein [Bacteroides sp. 
D1] # 1 82 1 82 82 97 100.0 2e-19 MQNKRYQNSYIKGLKRNVINNTIAGIVIAGIIFIIQLIGGCEQPEITHQKTLIDSLQINT NDEINDWDTDTTIYHTTTKPTK >gi|222159358|gb|ACAB01000001.1| GENE 36 30575 - 30670 61 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MINIAFNLLTSQRKKYKLCKIKGIRILILKA >gi|222159358|gb|ACAB01000001.1| GENE 37 30928 - 31818 838 296 aa, chain + ## HITS:1 COG:TM0415 KEGG:ns NR:ns ## COG: TM0415 COG0524 # Protein_GI_number: 15643181 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Thermotoga maritima # 9 285 5 277 286 124 29.0 1e-28 MNKHQLCCIGHITLDKVVTPQNTVYMPGGTAFYCSHAIRHFNDIDYALVTAVGATEMNVV EQLREMGIHITALPSKYSVYFENIYGANPDDRTQRVLAKADPFTAGQLKDIDAQIYHLGS LLADDFSLEVIKELSQKGLIAVDSQGYLREVRDTHVYPVDWTDKREALQYIHFLKVNEHE MEVLTGLSDPHEAARQLHEWGVKEVLVTLGSMGSLIFDGKDFYRIPVYKPKEVVDATGCG DTYTIGYLYQRVSGAGIEEAGRFAAAMSTLKIEKSGPFNGSKEDVIQCMTTAEQMF >gi|222159358|gb|ACAB01000001.1| GENE 38 32066 - 35131 2836 1021 aa, chain + ## HITS:1 COG:no KEGG:BT_3569 NR:ns ## KEGG: BT_3569 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1021 1 1021 1021 1863 91.0 0 MEKFNAMRTRLLQHLQKKAIRSKSIMTLVCLLLASASAFAQTKTVTGTVTDAANEPLIGA SVLVQGTSTGTITDMDGKYSISVTPEDVLAFSYVGMTSQTIKVGTQNVINVTLKEDSQVL AETVVIGYGSAKKRDLTGSITNIKGEELANKPAMNPLSSLQGKVAGVQIVNSGRAGSDPE IRIRGTNSINGYKPLYIVDGLFNDNINFLNPEDIESMEILKDPSSLAIFGVRGANGVIII TTKKAKEGQTLVNINTSFGFKKVVDKVKLVNGSQFKELYNEQLANQKDDPFDYTGWDANT DWQDEIFQTAFITNNNISITGASPKHSFYLGVGYSYEQGNIEHEKFSKVTINASNDYKIT DFLKVGFQFNGARMLPADSKQVLNALRATPIAPVYNNEYGLYTALPEFQKAQINNPMVDV ELKANTTKAENYRASGNIYGEVDFLKHFTFKAMFSMDYASNNGRTYTPIVKVYDAAVNGG ISTLGTGKTEVSQFKENETKVQSDYLLTYTNSFDNGNHNLTATAGFTTYYNSLSRLDGAR KQGVGLVIPDNPDKWFVSIGDAATATNGSTQWERSTLSVLARVIYNYKGKYLFNGSFRRD GSSAFSYTGNEWQNFFSLGGGWLMSEEEFMKDIKWLDMLKIKASYGTLGNQNLDKAYPAE PLLTNAYSAVFGKPSIIYPGYQLAYLPNPNLRWEKVEAWEAGFETNLLRNRLHFEGVYYK KNTKDLLAEVPGISGTIPGIGNLGEIQNKGVEMAVTWRDQIGDWGYSVSANLTTIKNEVK 
SLVQEGYSIIAGDKQQSYTMAGYPIGYFYGYKVAGVYQSQADIDASPKNTLATVTPGDLK FADVNGDGEITPEDRTMIGNPTPKVTYGFSLGVDYKNWSLGIDMMGQGGNKIFRTWDNYN FAQFNYLEQRLDRWHGEGTSNTQPLLNTKHSINNLNSDYYIENGSFFRIRNVQLAYTFDK SLIAKIRLQALKVYVNIQNLKTWKHNTGYTPELGGTATAFGVDNGSYPVPAVYTFGINLT F >gi|222159358|gb|ACAB01000001.1| GENE 39 35196 - 36728 1634 510 aa, chain + ## HITS:1 COG:no KEGG:BT_3568 NR:ns ## KEGG: BT_3568 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 509 1 509 509 943 88.0 0 MKLNKYIFAALITSTTLFLGSCSDFLDRSPQGQFTEDDNPNALVNGKIYNVYTMMRNYNV TAGPPAFAIHCFRSEDSEKGSIASDGSDVAEMYDDFVYTPTNGLLGAYWGQNYAIIYQCN EILDAIAEKETAGQTETEDIINKGEASFFRAYCYFNLVRAFGEVPLVTYKINDASEANIP KTTADKIYEQIDKDLKTAEESLPETWSSEYTGRLTWGAARSLHARTYMMRNDWNNMYTAS TDVIKKGLYNLKTPYNEIFTDDGENNGGSIFELQCTATAALPQSTVIGSQFCEVQGVRGA GQWDLGWGWHMATEYMAEAYEQGDPRKNSTLLYFRHSDSDPITPENTNEPYGESPVSPAI GAYFNKKAYTDPALRKEYTNKGFWVNIRLIRYADVLLMGAESANEKGIPGEAIDYLEQVR ARARGTNTNILPKVTTTDQGELREAIRHERRVELGLEFDRFYDLVRWGIAKEVLHAAGKT NYQDKNALLPLPQTEIDKSKGVLVQNPDYQ >gi|222159358|gb|ACAB01000001.1| GENE 40 36871 - 39180 2415 769 aa, chain + ## HITS:1 COG:PA1726 KEGG:ns NR:ns ## COG: PA1726 COG1472 # Protein_GI_number: 15596923 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Pseudomonas aeruginosa # 21 768 26 763 764 682 48.0 0 MIYKKLSLSILLFAAGFLTASAQKSPQDMDRFIDVLMNKMTLEEKIGQLNLPVTGEITTG QAKSSDIAAKIKKGEVGGLFNLKGVEKIREVQKQAVEDSRLGIPLLFGMDVIHGYETMFP IPLGLSCTWDMTAIEESARIAAVEASADGISWTFSPMVDISRDPRWGRVSEGSGEDPFLG AMIAEAMVRGYQGKNMERNDEIMACVKHFALYGAGEAGRDYNTVDMSRQRMFNDYMLPYE AAVEAGVGSVMASFNEVDGIPATANKWLMTDILRGQWGFNGFVVTDYTGISEMIDHGIGD LQTVSARAINAGVDMDMVSEGFVGTLKKSVQEGKVSMETLNTACRRILEAKYKLGLFDNP YKYCDPKRPARDIFTKAHRDAARRIAAESFVLLKNDSPDGNPNGNPLLPFNPKGNIAVIG PLANSRSNMPGTWSVAAVLDRCPSLVEGLKEMTAGKANIMYAKGSNLISDASYEERATMF GRSLNRDNRTDQQLLDEALNVARRSDIIIAALGESSEMSGESSSRTDLNIPDVQQNLLKE LLKTGKPVVLVLFTGRPLTLNWEQEHVPAILNVWFGGSEAAYAIGDALFGYVNPGGKLTM 
TFPKNVGQIPLYYAHKNTGRPLKDGKWFEKFRSNYLDVDNDPLYPFGYGLSYTTFSYSDI DLSHSSMDMNGSLTAAVEVTNTGTWPGSEVVQLYIRDVVGSSTRPVKELKGFQKIFLEPG EMKIVRFKIAPEMLRYYNYDLQLVAEPGDFEVMIGTNSRDVKTAKFTLN >gi|222159358|gb|ACAB01000001.1| GENE 41 39229 - 40602 1254 457 aa, chain + ## HITS:1 COG:AGl3503 KEGG:ns NR:ns ## COG: AGl3503 COG5368 # Protein_GI_number: 15891871 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 41 457 9 414 425 322 40.0 1e-87 MSALPLMLLFLSAVALTFQSCKGKSKSNNLSAATDSLSDDALMDTVQRRTFLYFWEGAEP NSGLAPERYHVDGVYPENDANVVTSGGSGFGIMAILAGIDRGYVTREEGLARMERIVSFL EKADRFHGAYPHWWYGDTGKVKPFGQKDNGGDLVETAFLIQGLLAVHQYYVNGNEKEKAL AQRIDQIWRDVDWNWYRQGGQNVLYWHWSPTYGWEMNFPVHGYNECMIMYILAAASPTHG VPAAVYHDGWAQNGAIVSPHKVEGIELHLRYQGTEAGPLFWAQYSFLGLDPVGLKDEYCP SYFHEMRNLTLVNRAYCIRNPKHYKGFGPDCWGLTASYSVDGYAAHSPNEQEDKGVISPT AALSSIVYTPEYSMQVMRHLYGMGDKVFGPFGFYDAFSETDNWYPQRYLAIDQGPIAVMI ENYRSGLLWKLFMSHPDVQAGLTKLGFNTNKQDVRQK >gi|222159358|gb|ACAB01000001.1| GENE 42 40630 - 43191 1825 853 aa, chain - ## HITS:1 COG:CC0815 KEGG:ns NR:ns ## COG: CC0815 COG1629 # Protein_GI_number: 16125068 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Caulobacter vibrioides # 116 819 45 720 737 162 23.0 2e-39 MKHRLLLLILFVFTTSVAGWAQKSTAPSYTVKGVLIDSLTQEGEPYATIKIAKKNAPDKA VKMAVTGANGKFQEKLNVAAGNYIITISSIGKAPIVKEFILKPSVKEVDLGTMISSEANN VLKGVEVVAQKPLVKVDVDKIEYNIEDDPDSKSNSILEMLRKVPLVTVDGEDNVQVNGSS SFKIHVNGKPNNMMSNNPKEVLKSMPANTIKYIEVITSPGAKYDAEGIGGILNIVTVGSG FEGYTATFRGNANNNGVGAGTYAMVKQGKLTVSANYNYNYNNSPRSYSDSYRENYEPDKE NEKYLESESSSKSKGNFQYGNLEASYEIDTLRLLTVAFGMYGSSDKSNSDGNTIMHGADR QDFAYRYRTDNHGKGSWYSINGNIDYQRTSRKNKERMITLSYKINSQPQTNDSYNTYLDI EPDENKLDIIKELSLKNFHSDGKTNTMEQTFQVDYTTPIGKLHTIETGAKYIFRRNSSDN RFYEAEGASENYAYNENRSSEYRHLNHILSAYAGYTLKYKGFTFKPGLRYEQTIQEVKYL VGPGENFDSNFSDLVPSVSLGIKLGKTQNLRGGYNMRIWRPGIWNLNPYFDNQNPMFINQ 
GNPNLKSEKSHSFDLSYSSFTSKFNINISLRHSFNNNGIERISRLITDKDGEILEGGHKA PYGALYSTYGNIGKSRETGLNFYLNWNASPKTRIYVNGRGNYSDLRSPAQELHNYGWNAS AYGGIQQTLPGKVRLSLNGGGSTPRISLQGKGSGYSYYSLGLSRSFLKEERLSLNIYCSN VLEKYRTYNNHTEGANFISKSSSKYPSRYYGFSISYRLGELKASVKKAARSIDNDDVKGG GGGGGNTGGGGGQ >gi|222159358|gb|ACAB01000001.1| GENE 43 43212 - 43373 120 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295086358|emb|CBK67881.1| ## NR: gi|295086358|emb|CBK67881.1| hypothetical protein [Bacteroides xylanisolvens XB1A] # 1 53 1 53 57 68 84.0 2e-10 MFIRRKTFKNAANSRAKSSWFIYTYEFFGVKNIKRMDKKSYYSIFLVFLNISI >gi|222159358|gb|ACAB01000001.1| GENE 44 43493 - 43942 292 149 aa, chain - ## HITS:1 COG:no KEGG:BT_3564 NR:ns ## KEGG: BT_3564 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 122 1 119 146 89 42.0 3e-17 MPKWISVEEAAAKYGINKEVIWLWADMKRFPMSYEKGITTVDEESLIGFLHQNKDRVTAE YIDTLEDLCIEKANICNLYAEIIGCQDKELLYQREQIARMKEIQTAMKRQNSRLRDCEKV FTKYEENFSTCWVGRICAHLRRLIWLIRR >gi|222159358|gb|ACAB01000001.1| GENE 45 45418 - 46992 1481 524 aa, chain - ## HITS:1 COG:no KEGG:BT_4046 NR:ns ## KEGG: BT_4046 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 519 2 515 518 829 80.0 0 MDTTDTTFRIYPIGIQNFEDLRNNNNVYVDKTELIYRLANTNKVYFLSRPRRFGKSLLVS TLDAYFRGKKDLFQGLAMERLEKEWNVYPVLHLDFSMTKYTALSDLLGQLNLNLYDWEKL YGKEEVEGTPAERFRGVIRRAYEQTGKPVVVLIDEYDAPLLDSNHLPELQNELREEMRKF FSPLKAQGEYLRFLFLTGISKFSQMSIFSELNNLQNISMRDDYSAICGITERELRTQLKT DIEMMAQANNETYEEACTHLKQQYDGYHFSENCEDIYNPFSLFNAFAQKKYANFWFSTGT PTFLINILQQSNFDIRELDGATATAEQFDAPTSVITDPLPVLYQSGYLTIKGYDPEFQLY TLAYPNKEVRKGFIESLMPAYVHLPARENTFYVVSFIKDLRAGKLTECLERIRSFFASIP NDLENKQEKHYQTIFYLLFRLMGQYVDTEVKSAIGRADVVVKMQDAIYVFEFKVDGTPEE ALEQINSKGYIIPYQPDHRKVVKVGVNFDSATRSIGDWKIVEEV >gi|222159358|gb|ACAB01000001.1| GENE 46 47183 - 48736 1105 517 aa, chain + ## HITS:1 COG:no KEGG:BT_0374 NR:ns ## KEGG: BT_0374 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined 
# 1 515 1 515 516 819 77.0 0 MSTKIYPIGIQNFEKIRNDGYLYIDKTALMYQMVKTGSYYFLSRPRRFGKSLLISTLEAY FQGKKELFTGLAVERLEKDWIKYPILHLDLNIEKYDTPESLDNILEKSLTAWEKLYGAEP SERSFSLRFAGIIERACKQAGQRVVILVDEYDKPMLQAIGNEELQKQFRNTLKPFYGALK TMDGYIKFAFLTGVTKFGKVSVFSDLNNLDDISMRKDYIEICGVSDQELHENLDIELHEF AETQSLSYDKLCTKLKEYYDGYHFTHNSIGIYNPFSLLNAFKYKEFGSYWFETGTPTYLV KLLKKHHYDLERMAHEETDAQVLNSIDSESTNPIPVIYQSGYLTIKGYDERFGIYRLGFP NREVEEGFIRFLFPFYANVNKVESPFEVQKFVREVETGDYDSFFHRLQSFFADTTYEVIR EQELHYENVLFIVFKLVGFYTKVEYHTNNGRVDLILQTDKFIYIMEFKLNGTAEEALQQI NNKRYALPFEADGRKLFKIGINFSEKTRNIEKWVVAS >gi|222159358|gb|ACAB01000001.1| GENE 47 48785 - 49651 603 288 aa, chain - ## HITS:1 COG:BB0411 KEGG:ns NR:ns ## COG: BB0411 COG1864 # Protein_GI_number: 15594756 # Func_class: F Nucleotide transport and metabolism # Function: DNA/RNA endonuclease G, NUC1 # Organism: Borrelia burgdorferi # 93 284 10 192 195 102 31.0 7e-22 MFRKKEMKYRGINVWVACLCLLLAACDKDENNNSNNFATGIVELPALRNGANDVFVTHYT TFNGQKVTSFSMEYDKSKKHSRWIAFRFDNQTKQQNVSRSDEPFDADPAIGSQYQRVQAD FGKQGYDRGHLCASADRLYSREVNEQTFYYTNMSPQRNKFNTGIWLTLEGQVQSWGRSCT ASDTLYVVKGGTIDKEDQIRGYISNDRSKPIPRYYYMALLFKKGDSFKAIAFWMEHTDSP KSTKLVDYALSIDELEKKTGIDFFPNLNDNLENALEATYSTKAWPGLE >gi|222159358|gb|ACAB01000001.1| GENE 48 49731 - 51683 1421 650 aa, chain - ## HITS:1 COG:BS_yhcR_1 KEGG:ns NR:ns ## COG: BS_yhcR_1 COG4085 # Protein_GI_number: 16077984 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein, contains TRAM domain # Organism: Bacillus subtilis # 536 645 50 158 372 75 38.0 4e-13 MKKILNALFLTLLAVFTFSSCSDVPAPYDILGEGDVPGLTGDGTKENPYSIEAAQQKQDG TIAWVQGYIVGTVENYEDPSGSAKFAAPFTAKNNLLIAASATETNVKNCVCVQLSSGTEL YSKLNLAENATNLGHILAIQGSLEKFYGFPGVKSTTAATLDGKDVGGSGETDPDNPLGLD DSNPVNSFSATFDDAVNNNDYLLTNWYNVAVTGGRRWQGKIFNNTDKYIQATSYNASGSN FECWFVTPAFKVDEIADKTVSFKCAVYNYATAAANSNLEVYFLKLVNGKMESSKLTIDGM PTTDNTWVPLEAKLDSYAGQTGFVGFKYTSTSSTEALSYRLDDIQAGKGQGGGETPGEGT ELLTNGGFENWADGYPIGWKSTSTAGNATLEQSTDKRSGTYSVLVKGVSSSNKRLGSAEM 
TLKAGTYMFSAYFKAATADIASARLGYVLVDETGKAGSYVYDADYVNGISNEDWVTKSFT FTLSGEQKICLVIMNSGNPGKDLLVDDVSLKTEDGGIVGGVEEPDPEPGDADGTEAHPYS AADVITLNSTRKGPFYVKAYIVGAADGAMSKIQTSNFSVDTNIVVADKQGETEVNKLVPV QLPSGAIRTAWALKTNPDKLGKQVIIQANLENYFSTPGLKSPTSITEVGK >gi|222159358|gb|ACAB01000001.1| GENE 49 51704 - 52546 798 280 aa, chain - ## HITS:1 COG:no KEGG:BT_3561 NR:ns ## KEGG: BT_3561 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 280 1 277 277 299 57.0 6e-80 MKKLLSILCTGLLLCGAGLTSCENYDDPVTGNAYGNNSIPEGRTISIAALKEKYNDFIDT SKDTYTTIEGETRIEGVITCDDESGNLYKKLVVADETGAIVIGVNATGLYAFCPVGQKVV IDCKGLQIGSYRKQAQIGTVYNNSVGRMPEYVWKQHVRLINEPKLYYPELTPIEITTPAD LAAIDLKEAPVLVTFKDIKLSEADGTATYAPGDEGSVKRYFTYADGTQSGSNLFLYTSAY ANFSMEVMPQGSVNITGILLRYNNQWEVVVRTLSDIKRNN >gi|222159358|gb|ACAB01000001.1| GENE 50 52594 - 55122 2147 842 aa, chain - ## HITS:1 COG:no KEGG:BT_3560 NR:ns ## KEGG: BT_3560 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 842 1 846 846 1277 79.0 0 MKQRLGLAIVLFCLMPALFAQQKAKQNAREDNASFMFTESQLNEDDDAAQSTSALVTSNN DVYLSNVGYLFSPMRFRVRGYDSQYSDMYINGVQFNDAETGRFSYGLIGGLNDATRNKEG IGPFEFNNFTFGAIGGATNINLRASQYAAGSKLTLSGSNRNYILRGMYTYSTGLMNNGWA FTGSLGYRWGNEGNIEGIKYNSFSYFLGAEKVFNERHSLSLATWGTPTERGQQMAATEEA YYLANSHYYNPNWGYQNGEKRNARIVRQFEPSAIASWNFTIDDNKKLVTSAGFKYSNYGK SALGWNGNAADPRPDYYKKLPSSIFDVWESVPTADELQQFNEVTDNWKNNKAYRQLDWDA LYFANKQANALGKETLYYVEERHDDQLAFNLSSVFNHQWNEHNSYIAGVAVNTTKGMHYK KMKDLLGGQLYTDVDKFAVRDHGASSSMVQNDLDNPNRRIGEGDKFGYDYNIYVNKQSAW VRYQGNNGGSLNYFASGKIGSTQMFRDGLMRNGRAPLKSLGSSGTAKFLEGGIKAGLNWA INGNHSFTLNAGYEERAPLAYNSFIAPRIKNDFVRDLKTERIIGGDLTYNFNTPWVMGRL TGYYTRFQNEVEMDAFYNDSEARFTYLSMNGIEKEHWGIEAAATFKLTSELSLTAIGTWS EAKYTNNPDAVLTYESENESNLDRVYAKGMRANGTPLSAYSLALDYNVKGWFFNLTGNYY DRVYIDFSSYRRLGSVLDKNGAGVDANGNPVLHVPGQEKLDGGFMLDASIGKYIRLRNGK SISLNLSLTNILNNTDLRTGGFEQNRDDNYKDGDARVYKFSKNSKYFYAFPFNAFLNIGY RF >gi|222159358|gb|ACAB01000001.1| GENE 51 55323 - 56348 904 341 aa, chain + ## HITS:1 COG:no 
KEGG:BT_3559 NR:ns ## KEGG: BT_3559 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 341 1 342 343 577 80.0 1e-163 MKKSLMTLGVLALFVLMAYGQEKKFALYSVAFYNMENLFDTIHDEGKNDYEYLPNGTNQW NTMKYKAKLKNMSEILSMLSTDKLPMGPAIIGVSEIENYRVLEDILKQPALADRGYQYVH YEGADQRGVDCAFFYNPKLFELTNSKLVPYVYINDTVHKTRGFLIASGNIAGEKMHFIVN HWPSRAAASPVRERAGEQVRAIKDSLLREDSAAKIVIMGDMNDDPMDKSMAVALGAKRKP ADVGPTDLYNPWWDTLKKGYGTLMYKGKWNLFDQIVFTGNLLGTDRSTLKFYKHEIFRRD FMFQKEGKYKGYPKRTQAGGVWLNGYSDHLPTIIYLIKEVK >gi|222159358|gb|ACAB01000001.1| GENE 52 56345 - 57502 486 385 aa, chain + ## HITS:1 COG:BB0411 KEGG:ns NR:ns ## COG: BB0411 COG1864 # Protein_GI_number: 15594756 # Func_class: F Nucleotide transport and metabolism # Function: DNA/RNA endonuclease G, NUC1 # Organism: Borrelia burgdorferi # 199 370 16 177 195 84 32.0 4e-16 MTKALFKLFILFIACSTAISCSEQDSPELPDNPGNTNQGIASIDQTQINANGGGFIIRVK ADGTWQASSSETWCTLSRASGNGNGSISGYMKANTGAERSVIITIIAGKEKAEFTLKQLA GNGSNPDPDPDPEKPSGYAGRIEIPALRSGDMYKFITHTTKENNKEIITYSYEYDCNKMH SRWVACTFSTATSDQDAGRNENFTEDLSLPPAYRLGEKAFSGSNYSRGHLIASEDRQYSV AANKKTFYMSNMSPQIQDGFNGGIWLNLERQVQSKGYSITNSKDTLYVVKGGTIRDDQIL KYISDGSHNIAVPKYYFMALLSLKDGKYSAIGYWFEHKSYNSKEPFSKYEVTIDELEANT DIDFFPNLPSDIEKDVEKSKDNWKW >gi|222159358|gb|ACAB01000001.1| GENE 53 57722 - 59020 926 432 aa, chain - ## HITS:1 COG:MA1450 KEGG:ns NR:ns ## COG: MA1450 COG3174 # Protein_GI_number: 20090309 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Methanosarcina acetivorans str.C2A # 14 348 4 329 413 81 26.0 3e-15 MDMEQLYSYVPRELVTFVLVTLFSLLIGLSQRRISLKREGETTLFGTDRTFTFIGILGYL LYILDPTDMRLFMGGGAVLGLLLGLNYYVKQSQFHVFGVTTIIIALITYCMAPIVATQPS WFYVMVVVTVLLLTELKHTFTEFAQRMKNDEMITLAKFLAISGIILPMLPHKNLIPDINL TPYSIWLATVVVSGISYLSYLLKRYVFHESGTLVSGIIGGLYSSTATISVLARKSRKASE QEATDYVAAMLLAVSMMFLRFMILILIFSREIFLSIYPYLLTMAVVAAIVAWFIHSRQKR PEDQPAETEEDDSSNPLEFKVALIFAVLFVIFTFLTHYTLVYAGTGGLNLLSFVSGFSDI TPFILNLLQNTGSVAALIITACSMQAIISNIMVNMFYALFFAGKGSKLRPWILGGFGVVI TCNLVLLLFFYI 
>gi|222159358|gb|ACAB01000001.1| GENE 54 59073 - 59540 499 155 aa, chain - ## HITS:1 COG:XF2357 KEGG:ns NR:ns ## COG: XF2357 COG2954 # Protein_GI_number: 15838948 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 1 155 1 161 165 140 47.0 1e-33 MAQEIERKFLVIGEFKSSAFAQSHIVQGYISSARGRTVRVRIRDEKGYLTIKGASNASGT SRYEWEKELALSEAEELMRLCEPGIIDKTRYLVRSGKHVFEVDEFYGENEGLIVAEVELE SEDEAFVKPDFIGEEVTGDIRYYNSQLMKKPYKTW >gi|222159358|gb|ACAB01000001.1| GENE 55 59652 - 60677 819 341 aa, chain - ## HITS:1 COG:no KEGG:BT_3553 NR:ns ## KEGG: BT_3553 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 341 10 350 350 635 91.0 0 MLLLATMAFGQAKEDAGDGKMLDKQTMEFKDYLPEIHGTIRGKYEYQTETSESRFEVRNA RFSVSGNVHPIVAYKAEIDLSDEGSIKMLDAYARVFPVKDLNFTIGQMRVPFTIDAHRSP HQQYFANRSFIAKQVGNVRDVGLTAGYTNKDGFPFILEGGLFNGSGLTNQKEWHKTLNYS IKAQLLPNKNWNLTLSTQMIKPENVRINMYDAGVYYQNDRFHIEAEYLYKMYGHEAFKDV HAVNSFINYDLPLKKVFNKISFLARYDMMTDHSDGKMDETTKALIINDYARHRVTGGITL SLSKAFIADLRLNFEKYFYKNSGVPKESERDKIVIEFMTRF >gi|222159358|gb|ACAB01000001.1| GENE 56 60940 - 61074 167 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGKVKKKVAISKKEEEQAQKVVKIVFVSLIILALIMLIAFSFFG >gi|222159358|gb|ACAB01000001.1| GENE 57 61159 - 62118 1009 319 aa, chain - ## HITS:1 COG:SA0709 KEGG:ns NR:ns ## COG: SA0709 COG1186 # Protein_GI_number: 15926431 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Staphylococcus aureus N315 # 7 318 23 330 330 247 44.0 2e-65 MKLVKDLQKWIEGYNELKTLADELELAFDFYKEELVTEEDVDAAYAKASEAVEALELKNM LRDEADQMDCVLKINSGAGGTESQDWASMLMRMYLRYAETNGYKATIANLQEGDEAGIKT CTINIEGDFAYGYLKGENGVHRLVRVSPYNAQGKRMTSFASVFVTPLVDDSIEVNILPAN ISWDTFRSGGAGGQNVNKVESGVRLRYQYKDPYTGEEEEILIENTETRDQPKNRENAMRQ LRSILYDKELQHRMAEQAKVEAGKKKIEWGSQIRSYVFDDRRVKDHRTNYQTSDVNGVMD GKIDGFIKAYLMEFSSQES >gi|222159358|gb|ACAB01000001.1| GENE 58 62210 - 62275 59 21 aa, chain - ## HITS:0 COG:no 
KEGG:no NR:no MITIEQLKDVKERTDALRRYL >gi|222159358|gb|ACAB01000001.1| GENE 59 62442 - 62873 567 143 aa, chain + ## HITS:1 COG:no KEGG:BT_3551 NR:ns ## KEGG: BT_3551 # Name: not_defined # Def: putative zinc protease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 143 888 1030 1030 192 67.0 4e-48 MTPKELKSEFYRLACTFYVSPGNERTYVVLSGLNENMPESEAAFKLAKDGLTNRLRTERI IKGDIIWSYINAQDLGQNVDPRIKLYNDIQNMTLKDIADFQKQWVKGRTYVYYILGDKKD LELDKLKAVGLIEELTQEQIFGY >gi|222159358|gb|ACAB01000001.1| GENE 60 63113 - 64927 1599 604 aa, chain - ## HITS:1 COG:VC2484 KEGG:ns NR:ns ## COG: VC2484 COG1022 # Protein_GI_number: 15642480 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Vibrio cholerae # 5 604 7 600 601 498 41.0 1e-140 MTYHHLSVLVHRQAEKYGDKVALKYRDYETAQWIPISWKQFSRTVRQAANAFVALGVEEQ ENIGIFSQNKPEWFYVDFGAFANRAVTIPFYATSSPAQAQYIINDAQIRFLFVGEQYQYD AAFSIFGFCTSLQQLIIFDRSVVKDPRDVSSIYFDEFMAMGEGLPHNDTVEERTERASYD DLANILYTSGTTGEPKGVMLHHSCYLEQFHTHDDRLTTMSDKDVSMNFLPLTHVFEKAWC YLCIHKGVQICINLRPADIQTTIKEIRPTLMCSVPRFWEKVYAGVQEKINETTGLKKALM LDAIRVGRIHNLDYLRLGKTPPVMNQLKYKFYEKTIYSLLKKTIGIENGNFFPTAGAAVP DEINEFVHSVGINMVVGYGLTESTATVSCTLPVGYDIGSVGVVLPGIEVKIGEDNEILLR GKTITKGYYKKAEATAAAIEPDGWFHTGDAGYFKNGQLFLTERIKDLFKTSNGKYVAPQA LETKLVIDRYIDQIAIIADQRKFVSALIVPVYGFVKEYAEEKGIKYKDMEELLKHPKIVG LFRARIDTLQQQFAHYEQIKRFTLLPEPFSMERGELTNTLKLKRSVVAKNYSEQIEKMYE ENEK >gi|222159358|gb|ACAB01000001.1| GENE 61 65151 - 66164 528 337 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716577|ref|ZP_04547058.1| ## NR: gi|237716577|ref|ZP_04547058.1| predicted protein [Bacteroides sp. 
D1] # 1 337 1 337 337 676 100.0 0 MKSIKKNIITSMVLLLPLWFVACESEREVGSTLFPEENSGDIVKAFIDNRCFYPKNYMES TVVQTGNGGDLIAGSEQVDLRVQLTNAAPQDLVFSMKVENSVSDAREDDVTWLGEDAISF LNQTVTIKKGELESEEMISFTLDEESGSLKELAESGMVALSLVTTDAVEISENYSTYLWK VNKEITNIDVNGSLKDKTMIDVTSYDVIGGYGVPTQDLSDDNMNTFLMSYIGYPNAKIPF HMHEPQEIIGLSITPAGYWGAWNLDYSKVELLGGDEPDNLTRIGIATCTTGMPSDHTPWE IAFYSPIKVRYLTVHVLDNFSNGGSINALCAEIRLYR >gi|222159358|gb|ACAB01000001.1| GENE 62 66189 - 67361 845 390 aa, chain - ## HITS:1 COG:no KEGG:BF1313 NR:ns ## KEGG: BF1313 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 387 3 378 380 207 34.0 5e-52 MNRYLLKQNIIGLLFFTCLFVGCDNAEYSVLHNQAYISQTGTNPNTALKVFIDQDLVTAN LNVRLSDPAEKDYNFEFVEDAELLAKYNEVNYTSYKFLPAEQYNFKGKEAVIKQGEVLSE SSGLEILPLTQEMKDSGNKYAIALTLQSKDGTTEVLKPGSTILYLLDPAIVTSVPVFNSR HNVAFSLIEDLSLSEWTLEFCVNMSKLGKKVGELNNQALFDGSSSLGESDGQIYTRFGDA PIEGNRLQIKTQGLQMNSATLFEEDKWYQIAWVCTSSKLYLYVDGKLDNSIDVPGKVTNL SKTKCKIGNTEYLKADVQMSEFRLWKRALSQREIANNLYATDPHSNALFAYFKFNEGKGD RFTDATGNGNEAWCIDPVEWRDNVRLNANN >gi|222159358|gb|ACAB01000001.1| GENE 63 67379 - 68497 888 372 aa, chain - ## HITS:1 COG:no KEGG:BVU_0617 NR:ns ## KEGG: BVU_0617 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: B.vulgatus # Pathway: not_defined # 20 372 20 351 352 301 46.0 3e-80 MKRLYKYFFGILGISFALTACDDWLDTEIKDPANLTISNKDEAYYARLREYKKSDHPVAF GWYGNWTGTGASYENSLKGLPDSVDFVSLWGNWKNPSPAMMEDLRYVQEKKGTKVLFCFL VLDIGDQITPPMPQEEINNGTSIEDWRHKFWGWDYSLENRLVAVEKYANALCDTIEKYNY DGFDFDAEPNVQHPFPTDKELWQNNGQVIAKFVETMSKRVGPKSGTGKMLVVDGEPDALP AELFHHFDYLILQTYTTYYKQDNSRLDARFDKQYAHFKDVATAGEIAKKIIICENFEDHA KTGGAAFILPDGTEINSLSGFAYWNPSVGGIQYRKGGVGTYHMEYEYKVNAQGSETYPAL RKAIQIQNPSIK >gi|222159358|gb|ACAB01000001.1| GENE 64 68525 - 70084 1252 519 aa, chain - ## HITS:1 COG:no KEGG:BF1327 NR:ns ## KEGG: BF1327 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 513 4 510 514 284 36.0 7e-75 
MKKINIVRSWLIGSFVLFGITSCTAGFEDANRPGHETSLEELGRDDYAVSSFITELQNFA FPEQENTYQNTEDLIGNYLGRYMTYVKPDFSVKNYTCFNASDDWKIVPWRDVFAKATSSF NAVAVLTQEEGPMYALALILRAQLMLRFTDTYGPLPIGVEEDANAYSSQEKVYSHLIETL DKAISIITPLVETNPGLVLKADADKVYGGKMNQWCKFANSLKLRMAIRMRFVNQDEAQRI AVEAYKAGVIEQNEDNCAITYIPNGQYKTSVDWGDSRACADLESYMNGFGDPRIYQYFKN TEEYGRRSVVGCRAGANVTNKDLAMKKYSAANIEEDSRGMWMTAAEMTFCRAEGAMLGWD MGGTPGALYNQAVRLSFEQWGVAGQEYYYLENSENTQEYYYDAADGYGGNHAPVSTITVK WDDNDSPERKMERLIVQKWIALFPDGQEGWNEIRRTGYPCVFPVAQSTNGYDLDVPNRIP FDPREMNGNNQENYRKAVEMLGGKDDYAARMWWQNGGNK >gi|222159358|gb|ACAB01000001.1| GENE 65 70110 - 73391 2454 1093 aa, chain - ## HITS:1 COG:no KEGG:BT_1280 NR:ns ## KEGG: BT_1280 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 1093 16 1110 1110 940 47.0 0 MNLYLKNIRKSILIICLLLVGSAIYAQNSSKITIKRKNISLQEALAEVRKLTHMSISYND SQLPVNRISLDIEKQPLEQALKVILKGTGFTYLIKDNYIMIVPEQNIKKAKSRNISGNVV DAKGEPLIGVTVIEKGTTNGAVTDLDGNYKITTKTATPVLVFSYVGYQTKETHATENIVN IVLEDGAQELGEVVVTALGIKRSEKALSYNVQKVGNDAVTTVKSANFMNSLSGKVAGVNI NASSAGMGGAARVVMRGPKSISQSNQALYVIDGIPVTGRSQGELKGDAMMYANQPGTESI ADVNPEDIESISVLSGPAAAALYGSAAAQGVVMITTKKGQEGKVSVTISNSSQFANPFVM PKFQDQYVNRPGEIKTWGDKATSEFGTYEPADFFNTGTNIQNNISLTAGTSKNQTYLSVG TTNAQGIIPNNSYDRYNFTFRNTTSFLNDKMTCDFNFNYIREKDKNLMAQGQWFNPLTSL YLFPRGESFDAIRTFEVYDPVRKIYVQNWNYGDALKMQNPYWVANRMNRTNDRNRYMVSA SLKYEILDWLNVTGRLRWDDAATKQEDKRYASTLKLFAPSDYGFYGYDKINDQTLYGDLM LNINKTLGENFSISSNMGASFSRLKYDVTGFQGGLKAPSNVFTPNAIDYGNATNDNRPIF ESYKHYINSMFISAELGYRSMLYLTLTGRNDWDSALHGTAQTSFFYPSVGVSAVISEMAK MPQMINYLKVRASWASVGSAIEPNLSSAWRYEYNPALGTYKTVTYKFPKKFYPERTDSWE AGVTARLFGNALSVDLTVYQSNTRKQTLLRDVTSGAAGFNKEYIQTGNIRNRGLELSVGY TKSWADFTWSSSLAYSMNRNKIVELLENPNEVVRQAGLSGCGVVLKKGGGMGDIYTYTDF KRDAEGNIALDSNGNVMQTNLSNPQYRGSVLPKGNLGFSNDFSWKGVNLGFVLTARLGGI CMSQTQAILDEYGVSAVSAEARNNGGIAVNTGKISAEGYYAVVGGDNPIWSEYIYSATNV RLQEAHISYTLPRKWLKSKELTLGVTANNLFMIYRKAPFDPESTASTGTYYQGFDYFMQP SLRTLGFNIKLKL >gi|222159358|gb|ACAB01000001.1| GENE 66 73575 - 74480 616 301 aa, chain - ## HITS:1 
COG:AGpAbx251 KEGG:ns NR:ns ## COG: AGpAbx251 COG3712 # Protein_GI_number: 16119537 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 88 283 105 301 311 75 29.0 1e-13 MISPDSAEEKEMALHNLWFKTKGKFGHGMEHSFQQVLDKIGIEYTPMVTNVNRWKLWKSV AAAAVIVILSVTTTLWVSYNHFDRDNIAMVEHYVNNGARETISLPDGTTVCLNSGSYVFY PENLEGKTRTVYLMGEAEFKVAKNPEKPFIVRSSNMAITALGTEFNVKAYPEEDIITTSL IEGKVRVDCNDTISYVLTPGYQVVYNKCTADCQVLTANMKDVTAWQRGEIVFNKATIAEI IQTLERHFGIMFYVSTKKRNQDRYNFVFKENAGLEEVLEVMQVVVGQFDYRLKDNACHII W >gi|222159358|gb|ACAB01000001.1| GENE 67 74617 - 75162 377 181 aa, chain - ## HITS:1 COG:RSp0849 KEGG:ns NR:ns ## COG: RSp0849 COG1595 # Protein_GI_number: 17549070 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Ralstonia solanacearum # 38 166 37 161 169 69 31.0 4e-12 MIITRMEDKVLRFKKFFDLNFPKVKTFAWQLLKSEEDAEDIAQDIFVKLWEKPDLWLERE KLDSYLYTVVRNHIYNFLKHKAVEYDYLDVAAEKMRMAELGLPTPDDEFCAHELELFVQM ALERMPEQRRRVFLMSREEGMTSPEIAEKLNVSVRTVEQHIYKALQDLKKIILFLFFFYL D >gi|222159358|gb|ACAB01000001.1| GENE 68 75407 - 76474 842 355 aa, chain + ## HITS:1 COG:AGpA709 KEGG:ns NR:ns ## COG: AGpA709 COG0624 # Protein_GI_number: 16119709 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 66 354 69 383 387 103 27.0 7e-22 MKYDIPYMTTEAVSLLKSLISIPSISREETQAADFLQNYIEMAGMQTGRKGNNVWCFSPM FDLKKPTILLNSHIDTVKPVNGWRKDPFTPREENGKIYGLGSNDAGASVVSLLQVFLQLC RTSQKYNLIYLASCEEEISGKDGIESVLQGLPPVSFAIVGEPTEMQPAIAEKGLMVLDVT ATGKAGHAARNEGDNAIYKVLDDIAWFRDYRFEKESPLLGPVKMSVTVINAGTQHNVIPD KCTFIVDVRSNELYSNEELFAEIKKHISCEAQARSFRLNSSRIDEKHPFVQKAMKLGRVP FGSPTLSDQALMSFPSVKIGPGRSSRSHTAEEYIMLKEIEEAIGLYLELLDGLLI >gi|222159358|gb|ACAB01000001.1| GENE 69 76563 - 77738 694 391 aa, 
chain - ## HITS:1 COG:BS_resA KEGG:ns NR:ns ## COG: BS_resA COG0526 # Protein_GI_number: 16079372 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus subtilis # 263 373 45 155 181 75 31.0 2e-13 MILCTFCAIAIFSSAQEKFSIRGIANEELNNQLLYLCLMGDGEKAKEVVLDSTTVEKGKF SFSGVHQMPDIAIIKDMDGETYPLIFEKGEISVNIAANKRGGTPLNDSLNIALNRMQLIM DNMLKTSNDIYKLMLGMKPEVFSDKMRNDTVFKAEFDRNNKLFLAQMDSVSHCIKDYKNS IVGVYLFSVGGMMMPFDDMKILMKEASPLFSQNNLVRNIVEKKNQAELRMKAELENKMTP EQREEQKKRQEMDAKIKIGERFPDAKVKDNAGNMKLLSDYVGKGKYVLIDFWASWCGPCR HEMPNVKAAYEKYASKGFEVISISTDRKLKPWRAAVEELGMNWVQLLDVDASDIYGIHAI PRTFLVDPTGIVIDKNLRGEKLEEALSKLFE >gi|222159358|gb|ACAB01000001.1| GENE 70 78138 - 78524 287 128 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716586|ref|ZP_04547067.1| ## NR: gi|237716586|ref|ZP_04547067.1| predicted protein [Bacteroides sp. D1] # 1 128 1 128 128 207 100.0 2e-52 MKKFLSMTLLLTAMFLTFSACSSDDDETNYPDITGSWSEISETPGEYISSYMEVMWTFNS DNTATQRVILKFNNVTSRDTSTNFTYEYKGSTITLKNDKVTLDYEISISGNNMKLGNEKD GYFNLTKK >gi|222159358|gb|ACAB01000001.1| GENE 71 78582 - 79556 664 324 aa, chain - ## HITS:1 COG:no KEGG:RPE_1593 NR:ns ## KEGG: RPE_1593 # Name: not_defined # Def: hypothetical protein # Organism: R.palustris_BisA53 # Pathway: not_defined # 17 287 82 359 405 165 35.0 3e-39 MENKVLKAKFGSDKTPLYLGELSIPCYVLEDGTRVFSGRGIQNAIGANPNYSGTWLSKFI NSKPISTNLPPGIYDKLSHPIKIKRPTASGSQSDTYGYEVTLLIDLCYAIIDAYDSRVYQ VSEEYYKAARIITRAVSKVGIIALVDAVTGYDKEKKRAKDELQKFLNQFLSDEASKWIKT FEDSFFEMIYKMRGWNWTMTNKRPGVVGQWINNIVYERIAPLTLSTLNEKNPKNDKGYRK DKHHQFFTQDIGKPKLKEYLASVEALGRAANYNWNIFMELLDRAFPKQPQKIIDVEAEEI KEDSPNDEFDKGIDDILGYEPKED >gi|222159358|gb|ACAB01000001.1| GENE 72 79546 - 80061 180 171 aa, chain - ## HITS:1 COG:no KEGG:BVU_1718 NR:ns ## KEGG: BVU_1718 # Name: not_defined # Def: putative transposase # Organism: B.vulgatus # Pathway: not_defined # 6 159 139 299 307 142 47.0 5e-33 
MRLLIGGKNKNRHWNKKVPRSQGRSFKDKTPVFGLIQQGGRVIAKVVPNTQVKNLSPLIL RYVKLGSDLYTDEWNYGKKADTLYNHQNVNHKKGFYGKGAFTTNHIENFWSVVKRGVIGV YHYWSRRHMQKYIDEFVYRANNRELTNREKFDRLLENLEYRLTYKELIYGK >gi|222159358|gb|ACAB01000001.1| GENE 73 79991 - 80494 160 167 aa, chain - ## HITS:1 COG:no KEGG:BVU_1718 NR:ns ## KEGG: BVU_1718 # Name: not_defined # Def: putative transposase # Organism: B.vulgatus # Pathway: not_defined # 19 148 17 137 307 131 50.0 8e-30 MFNSRFTSLSDLRKAFPTEQSCIDYLEERRWHGNVISPFDKTSKVYKCKDNKYRCKNTGK YFDVKTKTIFQGTRIPLISWFEAIWTILSYKKGISSVQLGENIKVSQKTAWFMMHRIRKA LGIDNDVGSEDEGGGGKLSGTVEIDETFNWRKEQEQALEQESSKKSR >gi|222159358|gb|ACAB01000001.1| GENE 74 80673 - 81692 853 339 aa, chain - ## HITS:1 COG:no KEGG:BT_3536 NR:ns ## KEGG: BT_3536 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 339 2 329 329 338 55.0 2e-91 MKNIFLPKVMMCIIALFTLILPASAQNTNATDEYKETLKKIMVLSGASATTDDFFQKITS VMKLNAPEKDEAYWNEFSKNWKEKIESKVFEMYMPVYEKHLTLEELKAVAAFYESPVGKK YKEASLIVMRETMPLLVKQLQTETFKVVRSEKSERVKRGEQRLKEYEQKKKRDKELYAQA YMLPSDSIVIVPEEVYEKAYENGMSTSPSLYSIERRKNDTKVTFIQPIYWDWQWLYYSPG FKIIDKKSGDEYNVRGYDGGAPIGRLLAVKGFNHKYIYISLLFPKLKKSVKEIDILELPH KKDKEQLPSNDDGKSKSYFNIKVKDYQTISDKKNKKIYY >gi|222159358|gb|ACAB01000001.1| GENE 75 82030 - 82497 348 155 aa, chain + ## HITS:1 COG:no KEGG:Fisuc_0074 NR:ns ## KEGG: Fisuc_0074 # Name: not_defined # Def: hypothetical protein # Organism: F.succinogenes # Pathway: not_defined # 47 154 31 141 142 83 42.0 3e-15 MKKILFLYIIHLGFALQPIQIQACCLDNEVVLANGTAIILQSPKVDNTKTIQFIKDFYAN YVFGAKNYVPAVKKHCTAKLQKQLKDNYEYDGEGYAIWNFRTGMQDGPSDISKVTSVTTL GNGLYKVNFIDMGIKGNRTLKIIDVNGTLKFDAIK >gi|222159358|gb|ACAB01000001.1| GENE 76 82720 - 83355 460 211 aa, chain - ## HITS:1 COG:no KEGG:BT_4231 NR:ns ## KEGG: BT_4231 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 207 1 207 212 207 48.0 2e-52 MAILYDWYENPGESDDSEEKGLHPRIFLNGKVGTDKLCRMIHGRSSLSVGDVKNAFEMLA 
QICGEELREGREVHIEGLGYFAPILRSTQKVTRSTKNKWSKMELKTIGFRPDARLRGELV GVKASRSKYARHSESLSAVEIDMRLKEYFADHDVMLRYDFQEVCCMTRTTANRHLRRLLE EGKLKNIGKRMQPIYVAAAGYYGVSRDVLRR Prediction of potential genes in microbial genomes Time: Wed May 18 00:52:49 2011 Seq name: gi|222159357|gb|ACAB01000002.1| Bacteroides sp. D1 cont1.2, whole genome shotgun sequence Length of sequence - 34231 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 11, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 56 - 115 5.3 1 1 Tu 1 . + CDS 179 - 1522 946 ## BT_4230 hypothetical protein + Term 1641 - 1699 13.6 - Term 1623 - 1693 8.1 2 2 Op 1 . - CDS 1754 - 2842 775 ## BF0741 hypothetical protein 3 2 Op 2 . - CDS 2869 - 4923 1789 ## COG3533 Uncharacterized protein conserved in bacteria - Prom 4945 - 5004 6.4 - Term 4990 - 5041 7.3 4 3 Tu 1 . - CDS 5065 - 6195 1254 ## COG2017 Galactose mutarotase and related enzymes - Prom 6268 - 6327 2.9 5 4 Op 1 1/0.000 - CDS 6340 - 7803 1461 ## COG3538 Uncharacterized conserved protein 6 4 Op 2 . - CDS 7806 - 10097 2299 ## COG3537 Putative alpha-1,2-mannosidase 7 4 Op 3 . - CDS 10110 - 12587 2541 ## BT_3526 glutaminase 8 4 Op 4 . - CDS 12618 - 14789 1846 ## BT_3525 hypothetical protein 9 4 Op 5 . - CDS 14818 - 16077 1128 ## COG4833 Predicted glycosyl hydrolase - Prom 16110 - 16169 5.3 - Term 16221 - 16270 11.3 10 5 Op 1 . - CDS 16310 - 17776 1231 ## BT_3523 hypothetical protein 11 5 Op 2 . - CDS 17799 - 18941 1120 ## BT_3522 hypothetical protein 12 5 Op 3 . - CDS 18967 - 20286 1399 ## BT_3521 alpha-1,6-mannanase 13 5 Op 4 . - CDS 20313 - 22232 1619 ## BT_3520 hypothetical protein 14 5 Op 5 . - CDS 22244 - 25684 3280 ## BT_3519 hypothetical protein - Prom 25744 - 25803 3.0 15 6 Tu 1 . - CDS 25834 - 27033 927 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 27068 - 27127 3.7 16 7 Tu 1 . - CDS 27150 - 27722 274 ## BT_3517 RNA polymerase ECF-type sigma factor - Prom 27754 - 27813 7.1 17 8 Tu 1 . 
- CDS 27828 - 28817 803 ## COG3507 Beta-xylosidase - Prom 28968 - 29027 6.0 + Prom 28853 - 28912 2.9 18 9 Tu 1 . + CDS 28983 - 30092 393 ## PROTEIN SUPPORTED gi|90020424|ref|YP_526251.1| ribosomal protein L11 methyltransferase + Term 30225 - 30270 4.0 + Prom 30213 - 30272 5.7 19 10 Tu 1 . + CDS 30341 - 32161 1923 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 32201 - 32247 10.1 + Prom 32197 - 32256 4.1 20 11 Tu 1 . + CDS 32309 - 33835 788 ## COG3119 Arylsulfatase A and related enzymes + Term 33892 - 33939 -0.9 Predicted protein(s) >gi|222159357|gb|ACAB01000002.1| GENE 1 179 - 1522 946 447 aa, chain + ## HITS:1 COG:no KEGG:BT_4230 NR:ns ## KEGG: BT_4230 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 47 441 26 416 419 547 66.0 1e-154 MNAITLIWRQVLKELKLFQMKENKEDAVEKTDKEPQKMLKCHEDKYTIPLPDTAPPTGRE KEAERKQPVRLTQEVRTFLQKQYDFRYNLLTEETEYRPANKRAVPFEPVSKRELNTFCME AHDQGISCWDKDLSRYIYSTCIPEYHPFQLYMEELPGWDGTDRLEALAQRVSDNQLWIKS FHTWMLGLTAQWLGMTGIHANSVAPILVSSEQGRQKSTFCKALMPPALSRYYMDNLKLSA QGKPERLLAEMGLLNMDEFDKYGTQQMPLLKNLMQMANLNICKAYQKNFRNLPRIASFIG TSNRFDLLTDPTGSRRFICIEAEHLIDCEGIMHDQIYAQLKAELLSGIRYWFTKEEEREL QRHNAAFYHPCPAEEVFHACFRIAKDDEEGSERLSASDIFRRLKMYNPAAMRGSNPATFA QVLLAAGVTRKHTKYGNVYLVKPMKAA >gi|222159357|gb|ACAB01000002.1| GENE 2 1754 - 2842 775 362 aa, chain - ## HITS:1 COG:no KEGG:BF0741 NR:ns ## KEGG: BF0741 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 362 1 362 362 610 88.0 1e-173 MKRIVFLDYIRVFACFLVILVHASENFYCAPGATDMAGPQSFLANEVDRLWVSVYDGFSR MAVPLFMIVSAFLLAPMKEEQSMWQFYRQRCLRILPPFFIFMILYSTLPMLWGQIDGETS MKDLSRIFLNFPTLAGHLWFMYPLISLYLFIPIISPWLRKATAKEERFFIGLFVLSTCMP YLNRWCGEVWGQCFWNEYHMLWYFSGYLGYLVLAHYIRVHLTWNRSKRFIIGTILMVIGA VWTIYSFYVQAIPGELHSTPVIEIGWAFCTINCVLLTTGTFLMFTCIKRPQAPKLVTETS KLSYGMYLMHIFWLGLWVTVFKDTLALPTVAAIPCIAVVTFICCFVTTKIISFIPGSKWI VG >gi|222159357|gb|ACAB01000002.1| GENE 3 2869 - 4923 1789 684 aa, chain - ## HITS:1 COG:SMb20631 KEGG:ns NR:ns ## 
COG: SMb20631 COG3533 # Protein_GI_number: 16265291 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Sinorhizobium meliloti # 326 528 309 525 640 90 30.0 1e-17 MKKNKTKGLFASIVLGVAVTACAPASSCGTVTVVDRPDIQSVNTNYIGYRAPLRPLNFIK LPVGSIQPEGWVKKYLELQRDGLTGHLGEISAWLEKDNNAWLTTGGDHGWEEVPYWLKGY GNLAYILNDPKMIEETKYWIEGVFASRQPDGYFGPVNERNGKRELWAQMIMLWCLQSYYE YSQDQRVIDLMTNYFKWQMTVPDDQLLEDYWEKSRGGDNIISIYWLYNHTGDAFLLELAE KIHRNTADWTKSTSLPNWHNVNIAQCFREPATYYMQTGDSAMLKASYNVHHLIRRTFGQV PGGMFGADENARLGYIDPRQGVETCGLVEQMASDEIMLCMTGDPMWAEHCEEVAFNSYPA AVMPDFKALRYITCPNHAISDSKNHHPGIDNRGPFLSMNPFSSRCCQHNHAQGWPYFTEH LVLATPDNGVATAIYAACKATVKVGDGKEITLHEETNYPFEEAIAFTVSTDEKVAFPFYL RIPSWTQKAEVRVNGKKISAAPVAGKYLCINREWANGDCVELTLPMSLSMRTWQVNKNSV SVDYGPLTLSLKIAEKYVEKDSRETAIGDSKWQKGADSKKWPTTEIYPDSPWNYSLVLDK KEPLKNFKVIRKSWPADNYPFTVASVPLEVKAIGRPVPEWKIDETGLCGVLPEEDAVKGD KEEITLIPMGAARLRISAFPNTKE >gi|222159357|gb|ACAB01000002.1| GENE 4 5065 - 6195 1254 376 aa, chain - ## HITS:1 COG:TM0282 KEGG:ns NR:ns ## COG: TM0282 COG2017 # Protein_GI_number: 15643051 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Thermotoga maritima # 46 375 25 355 356 287 45.0 2e-77 MKKLCVWAVAALLMAACTPKAEKATDSGLLQSKFQTEVDGKKTDLFTLRNKNNMEVCITN FGGRIVSVMVPDKDGVMQDVVLGFDSIQDYISKPSDFGATIGRYANRINQGQFTLDGVEY QLPRNNYGHCLHGGPQGFQYRVFDAALLNPQELQLTYRAEDGEEGFPGNITCKVVMKLTD DNAIDIQYEAETDKPTIVNMTNHSYFNLEGDAGNNSGHLLMVDADYYTPVDSTFMTTGEI VPVEGTPMDFRTPTPVGERINDYDFVQLKNGNGYDHNWVLNTKGDVTRKCASLKSPKTGI VLDVYTNEPGIQVYAGNFLDGSLTGKKGITYNQRASVCLETQKYPDTPNKAEWPSAVLRP GEKYTSQCIFKFSVDK >gi|222159357|gb|ACAB01000002.1| GENE 5 6340 - 7803 1461 487 aa, chain - ## HITS:1 COG:XF0843 KEGG:ns NR:ns ## COG: XF0843 COG3538 # Protein_GI_number: 15837445 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Xylella fastidiosa 9a5c # 21 474 45 497 516 461 47.0 1e-129 
MKKQIKYISAGMLAGMLLCGGQLQASNRMTEMHVCLADAIQKDNRPEISNRLFRSNAVEK EILRVQKLLKNAKLAWMFTNCFPNTLDTTVHFRKGSDGKPDTFVYTGDIHAMWLRDSGAQ VWPYVQLANADPELKEMLAGVILRQFKCINIDPYANAFNDGAIPDGHWMSDLTDMKPELH ERKWEIDSLCYPLRLAYHYWKTTGDASIFDEEWIQAITNVLKTFKEQQRKDGVGPYKFQR KTERALDTVSNDGLGAPVKPVGLIVSSFRPSDDATTLQFLVPSNFFAVSSLRKAAEILEK VNKKTALSKECKDLAQEVETALKKYAVYNHPKYGKIYAFEVDGFGNHHLMDDANVPSLLA MPYLGDVNVNDPIYQNTRRFVWSEDNPYFFKGKAGEGIGGPHIGYDMVWPMSIMMKAFTS QNDAEIKTCIKMLMDTDADTGFMHESFHKDNPKKFTRAWFAWQNTLFGELILKLVNEGKV DLLNSIQ >gi|222159357|gb|ACAB01000002.1| GENE 6 7806 - 10097 2299 763 aa, chain - ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 34 756 11 715 717 416 32.0 1e-116 MKKLALFAFTLLSAWAMTAKTITGPVDYVSPLVGTQSKHALSTGNTYPAIALPWGMNFWV PQTGKMGDGWAYTYDADKIRGFKQTHQPSPWINDYGQFAIMPVTGKAVFNQDERASWFSH KAETATPYYYKVYLADHDVVTEIAPTERAAAFRFTFPENDHSYVVVDAFDKGSFVKVIPS ENKIIGYTTKNSGGVPANFKNYFVLVFDKPFTYTAAVASGVIDTNKLEATDNHAGALIGF KTRKGEQVNVRVASSFISPEQAELNLKELGTDNIEQIAAKGRKVWNEVLGRIEVKDDDID HLRTFYSCLYRSVLFPRSFYEIDAKGDVMHYSPYNGEVLPGYMFTDTGFWDTFRCLFPFL NLMYPSMNTKMQEGLVNTYKESGFLPEWASPGHRGCMVGNNSASVVADAYLKGLKGYDIE TLWEAVKHGANAVHPNVRSTGRLGHEYYNKLGYVPYNVGINENAARTLEYAYDDWCIYQL GKALKKPKKEIEIFAKRAMNYKNLYDPEHKLMRGKNEDGTFQSPFNPLKWGDAFTEGNSW HYTWSVFHDPQGLIDLMGGKDGFNQMMDSVFILPPIFDESYYRAVIHEIREMQIMNMGNY AHGNQPIQHMLYMYNYSGQPWKAQHWIREVMDKLYTPAPDGYCGDEDNGQTSAWYVFSAM GFYPVCPGTDEYVLGTPYFKEMKLHLENGKTVTISAPNNGDDKRYISSMTLNGKEYTKNY LTHQDLLNGASISYKMDAKPNQQRGTKESDFPYSFSNEFKKKK >gi|222159357|gb|ACAB01000002.1| GENE 7 10110 - 12587 2541 825 aa, chain - ## HITS:1 COG:no KEGG:BT_3526 NR:ns ## KEGG: BT_3526 # Name: not_defined # Def: glutaminase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 825 1 825 825 1506 89.0 0 MKQQLMMLLLGTASVLCSCETQIEQHEKNELRAPAYPLVTIDPYTSAWSTTDNLYDSPVK HWTGKDFSLLGVAKVDGQTYRFMGTEELELRPLVKTSEQGSWTGKYTTQQPADGWQNAGF 
NDKAWKEGEAAFGTMENEHTAKTQWGEEFIWVRRVADIQEDLTGKNVYLEFSHDDDAIIY INGIKVVDTGNACKKNERVKLSEEVVASLKPGENLIAGYCHNRVGNGLLDFGLLVELDGY RSFHQTAQQTSVDVQPMQTYYTFTCGPVDLKLTFTAPMFMDNLDLLSRPVNYISYEVASN DGKKHQVELYFEASPQWAIDQPHQESVADSFTDGDLLFLRTGSRNQDILKKKGDDVRIDW GHFYLAAEKENSTSAIGDGRELRKNFVANKLEAPTTNGYDKLALVRSLGETQKADGHLLI GYDDIYSIQYFGDNLRPYWNREGNETIVSQFQKAEKEYKTQMKNSAAFDKKLMEEATAAG ERKYAELCALAYRQALAAHKLVQAPNGDLVFLSKENFSNGSIGTVDLTYPGAPLLLYYNP ELVKATMNHIFYYSESGKWAKPFAAHDVGTYPLANGQTYGGDMPVEESGNMVVLAAAIAK VEGNADYAQKHWETLTTWTDYLVENGLDPANQLCTDDFAGHFAHNANLSIKAIMGVASYG YLADMLGKKDVAEKYTQKAKEMAAEWVKMADDGDHYRLTFDKPGTWSQKYNLVWDKLLNL QIFPKNVAETEIAYYLSKQNKYGLPLDNRETYTKTDWIMWTATLANDKATFEKFIEPVYL FMNVTPNRVPMSDWVFTDEPNQRGFQARSVVGGYYIKMLEGKLIK >gi|222159357|gb|ACAB01000002.1| GENE 8 12618 - 14789 1846 723 aa, chain - ## HITS:1 COG:no KEGG:BT_3525 NR:ns ## KEGG: BT_3525 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 723 1 721 721 1351 91.0 0 MNNKQLLIIFSLLLTGNYASFAQTQKAKDKEAKTTFQTAEPWKPETDVRADATMVYGTLD KKGVSFEQRVQSWRDKGYQTQFMTGVAWGDYKDYFLGKWDGIDGHLKEGQRDRNGNEIAH GHLIPYIVPTESFIRYMQETQIKRVIDAGITSIYLEEPEFWMRGGYSEAFKSEWQKYYNF PWRPQHESPENTYLSNKLKYHLYYNALDKIFTYAKEYGKSKGLDVKCYVPTHSLINYTSW QIVSPEASLASLDCVDGYIAQVWTGTAREPNFYNGVKKERVFENAFLEYGCMKSMTAPLN RKMYFLTDPIEDRAKDWLDYKINYQATFAAQLMYPMVDTYEVMPWPDRIYQGLYRIAGTD QKERIPRSYSTQMQVMINTLNDIRTSDKQISGTHGIGVLMANSLMFQRFPDHNGYDDPQF SSFYGQTLPLLKRGIPVELVHMENTPFKDTFNGLQVLVMSYSNMKPMKSEYHNYLADWVK KGGTLVYCGEDVDPYQTVLEWWNTEGNAYKAPSEHLFEAMGLSRNPGDGTYRFGKGTVIV MREDPKHFVLKGGNDRKYFETIASAYESKTGKKIEIKNNFMVERGPYTIAAVMDESSSKE PLKLSGLYIDLFDKDLPVLTVKQINPGEQGYLYDLNKVSGKVKAKVLCGASRIYDEKVGK QSYSFVAKSPLHTTNVSRVLLPRKPGKVLVNGKAEQPEWNESSKTLLLSFENDPAGVNVS IEW >gi|222159357|gb|ACAB01000002.1| GENE 9 14818 - 16077 1128 419 aa, chain - ## HITS:1 COG:lin0763 KEGG:ns NR:ns ## COG: lin0763 COG4833 # Protein_GI_number: 16799837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosyl hydrolase # Organism: Listeria 
innocua # 62 411 28 334 341 71 25.0 2e-12 MKGKILLALTLLLGVSTTTWAVGNLGKANQKKHAYTNEDVWAAYEGFNNTLLDSNKYIYK TNSSYPSAVDRGNGAAAIWCQPIYWDMAMNAYKLAKAQKDRKKTSYYKTLCEKIFAGNKA QYCQFDFDDNNENTGWFIYDDIMWWTISLARGYELFGVDEYLKLSEASFKRVWYGSEKVG DTGSYDKENGGMFWQWQPIQNPKPNKFSDGKMACINFPTVVAALTLYNNVPENRKESTDK RPDYQTKAQYLAKGKEIYEWGVENLLDKATGKIADSRHGNGDPAWKAHVYNQATFIGASI LLYKATGEKRYLDNAILAADYTVKDMSAEHKVLPFEGGIEQGIYTAIFAEYMAWLVYDCG QTQYLPFLKRTIKTGWANRDKTRNVCGGEYYKKLPEGAEIDSYSASGIPALMLLFPAKK >gi|222159357|gb|ACAB01000002.1| GENE 10 16310 - 17776 1231 488 aa, chain - ## HITS:1 COG:no KEGG:BT_3523 NR:ns ## KEGG: BT_3523 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 484 4 489 491 527 58.0 1e-148 MKCLTNKWREGAMLLSFLLISSLAGIFTACDDIEDEYITDTQLSILQESRTSLNYLLKNS TYGTAPGTYPETGKDILNAAIAELDALITRVEAGEELDETTLEAAVAKVNQAIDEFKKSK YYNLSPEAQQYINNLLAKADEILAIVNDETKWGNHQGQYPVENKSVLESAAKDLESLAER IKSGSITDMTQEIYDEAIAAADKKVEEVEDSAWPDNSQITWNLFVDGNAGSYIDFGYSED YVKFGEDDNQAFTIELWVNIKEYCNKQGEDNCTFLSTMTNDPHWSGWRAQDRTKGLLRTM VAHWQDNNHTNPQEWEPGWKKSDNWTKDRWTHYAFLFRDKGLPGFDTPTDVKCYSMIDGT RQGEVIRVGESWRTYINKQSIANQIKMTGFCMMDNNGNRNEWFSGYIKKIRIWKTNRTEN QVYASYMGNEEGVSADNPNLVEAWDFEVKGDQPTQSATRTITGLKGHTATLVGDNWQWIE STDITDNK >gi|222159357|gb|ACAB01000002.1| GENE 11 17799 - 18941 1120 380 aa, chain - ## HITS:1 COG:no KEGG:BT_3522 NR:ns ## KEGG: BT_3522 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 380 5 390 390 518 68.0 1e-145 MNTKYSNWKMWYYVLCMVLTLQLAACSEETHDEYTAAPEIEDTYIDQLDALIAEMTDLQQ NSDYGDKKGQYPTESRAILTDAIDDANRVVLLIKYQQPSPSESEKQRYVAEAKASIEQFK STIRTEDAETTPAELFVDGRGDGGSYIDFGRSEEYVNFGTEGNQAFTVEFWVKVTKGGGK DQNVFLSTYMGGDGWRNGWMMYWRNADGGIYRATWGETGGNICEPSLKAPEDGEWQHFLF VYSDKGLPGSPELRAKLYVNGEVKTTEGSVGNRFYNSSNYANYNAPMTAFGRYMRTSDNL FEEGFAGYMKKIRIWKSAKDNEYVQNSYNGTAEVTGKEADLAAAWDFMTKPSGSGNEVID LTGRHTAKIIGTYEWQRIVE >gi|222159357|gb|ACAB01000002.1| GENE 12 18967 - 20286 1399 439 aa, chain - ## HITS:1 COG:no KEGG:BT_3521 NR:ns ## 
KEGG: BT_3521 # Name: not_defined # Def: alpha-1,6-mannanase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 439 2 429 429 688 76.0 0 MKQYIFSAVCLMSGVLCMSSCNEDKQAKPYTPDYEIVPEYTNADTWTAYEAFNDNLLDPD KNIYKTSTAYTAATDRNNGAAAIWCQPIYWDMAMNAYKRAKAEGDTERENKYKQLCDDLF AGNKAHYVNFDFDDNNENTGWFIYDDIQWWTITLARAYELFKVEEYRSLAEASFARVWYG SPRVGDTGSYADPEKNLGGGMFWQWQPIGNPNENAAGDGKMACINFPTVVAALTLYNNVP ADRVADPNPESWSNKYGDFTRPHYETKEAYLAKGKEIYEWAVKNLVDSNTGEVADSKHGE GNPAWSDHVYNQATFIGASLLLYKATGEKTYLDNAILGADYTMNTMSETYDLLPFESGVE QGIYTAIFAEYMAMLVNDCGQTQYVPFLKRNINYGWANRDRTRNLCGGEYYKAQIEGATI DSYSASGIPALMLLFPADK >gi|222159357|gb|ACAB01000002.1| GENE 13 20313 - 22232 1619 639 aa, chain - ## HITS:1 COG:no KEGG:BT_3520 NR:ns ## KEGG: BT_3520 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 639 1 642 642 1065 79.0 0 MKKYNYIIVSLLACLLTTSCNDYFDQVPDDRLSLKEIFTTRDGALRYLSNVYTFLPDEFN QRQVHETSLYRTPGPWTGSSDEAEWTNDNKGKLINNNSIDATEGTMVLYRWKSWFSGIHE AAVFTENVDQAPLTVTERNQWKAEARALRAIYYFYLVRTYGPVPLLEKDFPMDTPSDELQ LPRNTVDECFDFIVSELKGAQNDGLLDDASTDKVSGYGRIDKAIAQAFIIEALTYRASWL FNGECNYYSDLANTDGTKLFPNKPDEATKRANWQKVINECNTFFSNYGSRYHLMYTNKDG VSVSGPDSEGFSPTESYRCAVRTLFSEMGNNKEMIFYRLDNAAGTMQYDRMPNRSGNTTN YRGGSLLGATQEMVDAYFMSNGESPISGYSADGVTPIINEKSDYVEEGVSTTEYKGTDGT LYAPAGTRMMYVNREPRFYVDITFSNSKWFDGTEGDYIVDFTYSGSCGKEQGSNDYTSTG YLVRKGMDSGDRNQNLVCVLLRLTNIYFDYIEALAHVSPTHEDIWTYMNMIRKRAGIPGY GETVNLPKPTTTEEVMELIRKEKRIELSFENCRYFDVRRWGLVNEYFNKAIHGMNVNYDG NEFFKRTEIVKRIFDRQYFFPIPQGEIDIDKNLVQNTGF >gi|222159357|gb|ACAB01000002.1| GENE 14 22244 - 25684 3280 1146 aa, chain - ## HITS:1 COG:no KEGG:BT_3519 NR:ns ## KEGG: BT_3519 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1146 6 1151 1151 1982 87.0 0 MFNMKNAIRERKSKVLGLFLCFLLLGIDYSFASYNNYSQFKTLSVSVSNSTLREVLKTIE KSSQFVFFYLDDAVNLERKVSIDSKNKNIEEILSELFEGTSCTYRISDRQIFISGKAPAS TEQQQNKRKISGRVTDIKGEPLIGVNVTVDGDANGSITNMDGLYEIFVTKKSVVLKFTYI 
GFKTSEIRTNASTNIYDVTLEEQVNELEETVIVGYGTQRKISNIGAQSSMKMEDIKTPSA SLTTTLAGRLAGVVAVQRTGEPGKDAADIWIRGISTPNTSSPLVLVDGVERSFNDIDPED IESLTTLKDASATAVYGVRGANGVILIKTKPGKVGKPTVSADYYESFTRFTKMVDLADGI TYMNAANEAMRNDGIATKYTEDQIRNTIAGKDPYLYPNVDWLKEIFNDWGHNRRVNVNVR GGSEKVAYYASVSYFNETGMTVTDKNINTYDSKMKYSRYNFTTNLNIDVTPTTKVEIGAQ GYLGEGNYPAISSADLYNAAMSISPVEYPKMFFVNGEAFVPGTSTNNNFNNPYSQATRRG YDNLTKNQIYSNLRVTQDLDMLTKGLKLTAMYAFDVYNEIHVHQDRAESTYNFLDTSVPY DMNGQPILQRIYEGSNVLSYKQETSGNKKTYLEASLNYDRTFNDDHRVSALFLFNQQSKL LYPKGTLEDAIPYRMMGIAGRATYSWKDRYFAEFNIGYNGAENFSPKHRFGTFPAFGVGW VVSNEKFWQPLSKAVSFLKIRYTDGKVGNSEVSDRRFMYLDQMKENGDYGYKFGPNGTKW SGYETVNMAVDLIWEESRKQDLGIDLKLFNDDLSIVFDLFKERRENILLKREHSIPSFLG YNTSAPYGNIGIIENKGFDGTIEYNKRINKDWVIALRGNVTFNKDKWIQGELPEQKYEWM NQYGHNINGVKGYVAEGLFTQAEIDDMARWESLSDANKAITPKPFASQFGTVKAGDIKYK DLNNDGQIDAYDQTYISRGDVPTTVYGFGFTVGWKDLSVGMMFQGVAGAERVLNGSSVNP FNGGGGSGNLYSNIGDRWTEENPDQNAFYPRLSYGSETTSNINNFQKSTWWVRNMNFLRL KTLQISYNLPKPWVNKVHLKNAAVYVMGTNLFTLSRFKLWDPELNTDNGASYPNTTSYSV GINFTF >gi|222159357|gb|ACAB01000002.1| GENE 15 25834 - 27033 927 399 aa, chain - ## HITS:1 COG:RSc2919 KEGG:ns NR:ns ## COG: RSc2919 COG3712 # Protein_GI_number: 17547638 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Ralstonia solanacearum # 196 384 71 256 274 81 30.0 2e-15 MNSEQKHTDFSLYTFEEFLQNDFFISSMNHPTEETQKFWDEFEQMNPSNIDEYIAAKSYL EVFSKEKEEVLSNEETNDLWTRIQATNINKEKAKRKNYFLIGLSSAASVAILVGCFFLLK SYSSVLDPDIATFAVNTKADLPLTEETLLILAEDNVVSLKEKETEITYDSVEIKTNQESI QKEKSAAYNQLVIPRGKRSVLTFADGSKVWVNAGTRVIYPVEFEKDKREIYVDGEIYIEV ARDENRPFYVRTKDMNVRVLGTKFNVTAYESEAIRSVVLAQGCVQVETARTPKAILAPNQ MFSSADGKENISQVDVEQAISWINGLYCFQSADLGIVLQRLSTYYGVNVEFDPALSKIKC SGKIDLKDSFETVINGLTFVAPISYAYDEQYKTYRVVKK >gi|222159357|gb|ACAB01000002.1| GENE 16 27150 - 27722 274 190 aa, chain - ## HITS:1 COG:no KEGG:BT_3517 NR:ns ## KEGG: BT_3517 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # 
Pathway: not_defined # 1 190 1 190 190 276 78.0 4e-73 MEPDIHTLSDSLLWKRFLEGDSSAYTQIYNQTVQDLFRFGLLYTSDKELIKDCIHDVFVK IHMNRAKLAPTDNIAAYLTVALKNTLFNALKKTTDSLSFDEIGEREETVDESPSTPETIY INNEQEKQVQATVHTMMSVLTDRQREIIYYRYIKEMSIDEISKVTDMNNQSVSNSIQRAL GRIRDLFKRK >gi|222159357|gb|ACAB01000002.1| GENE 17 27828 - 28817 803 329 aa, chain - ## HITS:1 COG:CC0813 KEGG:ns NR:ns ## COG: CC0813 COG3507 # Protein_GI_number: 16125066 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Caulobacter vibrioides # 21 311 52 337 540 70 27.0 4e-12 MKLKHLILLSIICCSQSVFGQKANQQPKTSGNPVFPGWYADPEGVVFGDEYWIYPTYSAP YDEQTFMDAFSSKDLVNWTKHPKVLSKENISWFKRALWAPAVIHANDKYYIFFGANDIQS NNELGGIGVAVADNPAGPFKDVLGKPLIDKFVNGAQPIDQFVYKDDDGQYYMYYGGWGHC NMVKLAPDLLSIVPFEDGTLYKEVTPEKYVEGPFMLKRNGKYYFMWSEGGWTGPDYCVAY AIADSPFGPFKREAKILQRDPNIGTGAGHHSVVKGPGEDEWYIIYHRHPLGETDGNARVT CVDRMYFDKDGKIKPIKMTFEGVKASPLK >gi|222159357|gb|ACAB01000002.1| GENE 18 28983 - 30092 393 369 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020424|ref|YP_526251.1| ribosomal protein L11 methyltransferase [Saccharophagus degradans 2-40] # 39 351 5 310 314 155 33 2e-37 MKSKLLLLAVLLIIICASCQPKKKTEPEKEAVTGATYTNPLRERGAEPWAVFYKGKYYYT QGSESRIMLWETSDITNLNDSLRKPVWIPTDPSNSHHLWAPEMHRINNKWYIYFAADDGN MDNHQIYVIENEADIPTEGKFVMKGRIPTDKDNNWAIHASTFEHNGQRYMIWCGWQKRRI DSETQCIYIASMENPWTLSSDRVLISKPEYEWECQWVNPDGSKTAYPIHVNEAPHFFQPK NKDKVCIFYSASGSWTPYYCVGLLTADANANLLDPASWKKHPIPVFQQKPENEVFGPGGS SFVSSPDGKECYMLYHARQIPNDAPGAMDSRTPRLQKIEWDKDGMPVLGIPDKTGTILPK PSGTKIAEK >gi|222159357|gb|ACAB01000002.1| GENE 19 30341 - 32161 1923 606 aa, chain + ## HITS:1 COG:L0025 KEGG:ns NR:ns ## COG: L0025 COG3250 # Protein_GI_number: 15673962 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Lactococcus lactis # 109 485 109 458 996 97 26.0 1e-19 MNMKKNFLTMLLALALCGSTFAQWKPAGDKIKTSWGEQLDPKNVLPEYPRPIMERNDWKN LNGLWKYAITKKGDPTPAAYQGDILVPFAVESSLSGVGKMINEKEELWYQRTFDIPSAWR 
GKQILLHFGAVDWKAEVWVNDVKVGEHTGGFTPFYFDITSVLNKGNNDLVVKVWDPSDRG EQPRGKQIANPHGIWYTPVTGIWQTVWLEPVATQYITNLKTTPDIDNNSVKVEVAANTTS ADKVEVKVFDGKNLVAKGAALNGVPVELTMPANAKLWSPDSPFLYNMEVTLYKDGKAIDQ VKSYTAMRKYSIRKGQNGITRLQLNNKDYFQFGPLDQGWWPDGLYTAPTDEALVYDLKKT KDFGYNIVRKHVKVEPARWYTHCDQLGLIVWQDMPNGGPSPQWQARNYFNGTEVIRSAAS EANYRKEWKEIIDCLYSYPSIAVWVPFNEAWGQFKTPEIVAWTKEYDPSRLVNPASGGNH YTCGDILDLHHYPGPNMFLYDPRRATVLGEYGGIGLVVEGNTWVNDKKNWGYVKFNTSDE VTNEYIKYGKHLLELIRKGFSAAVYTQTTDVEGEINGLMTYDRKVIKMNEAKVREINQQI CNSLNK >gi|222159357|gb|ACAB01000002.1| GENE 20 32309 - 33835 788 508 aa, chain + ## HITS:1 COG:MT0310 KEGG:ns NR:ns ## COG: MT0310 COG3119 # Protein_GI_number: 15839682 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Mycobacterium tuberculosis CDC1551 # 23 459 6 399 465 129 27.0 2e-29 MNSKHIFLSTLVGPLVIPSLNAAETNRPNILFVISDDQSYPHASAYGSPIAKTPAFDFVA RNGWLFNNAFVTSPGSSPSRASILTGLYPWQIAEAGTHASSFPAEYKCFPDILAEAGYHV GCTGKGWGPGDWKISGRKHNPAGPEYNKRHIQPPFKGVSNTDYYQNFKEFLHARKEKQPF YFWVGSREPHRPYEKDSWKKANRKLADAKVPSFLPDKEAIKGDLLDYGTEIEWYDSQLSR ILDTLKEEGELENTLIIVMADNGMPFPAAKATCFEYGIHIPMAICWIGKINKGGISDEFV SGVDLAPTILDAVGIHNKGQMVGISLLPYLQKKRKDTGRTMVFAGRERHASARYNNWGYP SRALRTKDFLYIRNFRPERWPAGDPCAIDKKGRLEAMHSAYYDIDQCPAWDFIVSNRDDR EIYPYFLKAAGKRPYEELYNVSTDPGCLHNLVGNGKCTEKLNNLRKEMDKLLKKTKDTRY LDDPTQDDIWETYPRLSGAIRTFPTPLF Prediction of potential genes in microbial genomes Time: Wed May 18 00:53:59 2011 Seq name: gi|222159356|gb|ACAB01000003.1| Bacteroides sp. D1 cont1.3, whole genome shotgun sequence Length of sequence - 4920 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 54 - 107 0.4 1 1 Op 1 . - CDS 266 - 1642 1465 ## COG3119 Arylsulfatase A and related enzymes 2 1 Op 2 . - CDS 1698 - 3035 1043 ## COG3119 Arylsulfatase A and related enzymes 3 1 Op 3 . 
- CDS 3085 - 4677 1209 ## COG3119 Arylsulfatase A and related enzymes - Prom 4706 - 4765 5.3 Predicted protein(s) >gi|222159356|gb|ACAB01000003.1| GENE 1 266 - 1642 1465 458 aa, chain - ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 9 429 14 474 497 182 30.0 1e-45 MKEEFFGLLCGCTLLPAFLHAQTERPNIVIVLADDLGWGDVGFHGSEIKTPSLDALVGEG VELERFYTSPISTPTRAGLMTGRYPNRFGVRSAVIPPWREDGLDENEETMADMLARNGYK NRAIIGKWHLGHTKKVHYPMNRGFSHFYGHLNGAIDYFDLTREGELDWHNDWETCHDKGY STELITQEAIRCIDAYEKEGPFMLYVAYNAPHTPLQAQEKDIKLYTDNFDSLTPKEQKKA TYSAMVSCMDRGIGAIVDALKKKGIMDNTFFIFFSDNGTAGVPGSSSGPLRGHKFDEWDG GGHAPAVLYWKKAEKQYKNLSSQVTGFVDLVPTLKDLVGDHSRPKREYDGISILPVLNGK KTCIDRDFYLGHGAVVNKDYKLIRKGMKPGLDLKQDFLVDYKTDPYEKKNASAGNEKIVK ALYEVALKYDTITPCIPEVPYGKGRDGFKAPKEWKVVR >gi|222159356|gb|ACAB01000003.1| GENE 2 1698 - 3035 1043 445 aa, chain - ## HITS:1 COG:MT0310 KEGG:ns NR:ns ## COG: MT0310 COG3119 # Protein_GI_number: 15839682 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Mycobacterium tuberculosis CDC1551 # 13 427 9 421 465 115 25.0 2e-25 MGATSLSFAQQTEKPNFLLFIADDCSHYDLGCYGNVDSKTPNIDHFATQGVRFTQAYQAV PMSSPTRHNLYTGVWPVRSGAYPNHTCANEGTLSVVHHLQPLGYKVALIGKSHIAPKSVF PFDLYVPPLKGVDLNFEAIQKFISDCKAKGEPFCLFVASNQPHTPWNKGDASQFNADKLT LPPMYVDIPQTRELFTHYLAEINYMDQEFGNVLSILDQEKMTDKSVVVYLSEQGNSLPFA KWTCYDAGVHSACIVRWPGVIKPGSVSDALVEYVDIVPTFVDIAGGKPLAKVDGESFKPV LTGKKKAHKEYSFSLQTTRGINAGSPYYGIRSVYDGRYRYIVNLTPEATFQNVETKSPLF KKWKSLAETDSHAKAMTIKYQHRPAIELYDVKNDPYCMKNLAEDAKQASTISRLDKELKR WMKDCGDEGQPTEMRAFEHMPGKRK >gi|222159356|gb|ACAB01000003.1| GENE 3 3085 - 4677 1209 530 aa, chain - ## HITS:1 COG:PA0183 KEGG:ns NR:ns ## COG: PA0183 COG3119 # Protein_GI_number: 15595381 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: 
Pseudomonas aeruginosa # 22 524 3 527 536 313 35.0 6e-85 MVKKNFLMTAPLLMASFCSAQEKPNVVLIMVDDMGYSDIGCYGGEILTPNIDALASKGVR FTQFYNTSRSCPTRASLMTGLYQHQTGIGQMSEDPFNQPNQKSPNDWGVPGYKGFLNRNC VTIAEVLKESGYHTYMAGKWHLGMHGEEKWPLQRGFERFYGILAGACSYLRPSGGRGLTL DNTKLPVPEAPYYTTDAFTDYAIRFVDEQKDDKPFFLYLAFNAPHWPLQAKEEDIQKFTK IYREKGWDEIREARRKRMTKLGIIEPNTEFAEWENRKWNELTEQEKDKVAYRMAVYAAQV HCVDYNVRKLLDYLKKNRKLDNTLVMFLSDNGACAEPYAELGGGKVSEINDPACSVFPSY GRAWAQVSNTPFRKYKCRSYEGGISTPLIVSWKSNLKNKKGEWCRVPGYLPDIMPTILEA TGATYPETYHGGNRIYPLVGSSLFPAIQKKTDSLHEYMYWEHQNNRAIRWGNWKAIRDEK GKEWELYDVVKDRTERNNLAEQHPEVLTKLVAEWEKWANANFVLPKHPKK Prediction of potential genes in microbial genomes Time: Wed May 18 00:54:00 2011 Seq name: gi|222159355|gb|ACAB01000004.1| Bacteroides sp. D1 cont1.4, whole genome shotgun sequence Length of sequence - 2266 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 4 - 63 2.1 1 1 Tu 1 . 
+ CDS 87 - 2265 1596 ## BT_3508 hypothetical protein Predicted protein(s) >gi|222159355|gb|ACAB01000004.1| GENE 1 87 - 2265 1596 726 aa, chain + ## HITS:1 COG:no KEGG:BT_3508 NR:ns ## KEGG: BT_3508 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 726 35 758 851 1223 80.0 0 MNQAQTFANNFPREKAYLHFDNTSYYVGDTIWFKAYVTLAGQQIFSQISRPLYVELVDQT GHITDKQIIKLTQGEGNGQFVLPHSMLSGYYEVRAYTRWMLAFSEPQYFSRTFPIYQLTN SDKLERSITTYELSPSMENRPLETKEKLSVRFFPEGGQLVEGVTSQVAFKAESKDEGNIE LSGTIYTKEGAEISSFETLHDGMGHFKYTPSAQPAVAKVDFQGKKYEFTLPQALPNGYVL STVNNAGALLVKVSCNAATPQDTLAVFISHQGRPYVHQLISCRADTPQEFILPTRKLPAG VLQVSLINRAGNTLCERFVFSNPRAPLQLSAEGLKEVYTPYAPIRCELQVKNAKGEPISG DVSVSIRDAVRSDYLEYDNNIFTDLLLTSDLKGYIHQPGYYFASPSPRKQTELDILLIVH GWRKYDMSQAISTAPFTPLQLPEAQLVLNGQVKSTILKNKLKDIALSVIVKKDDQFITGG TVTDENGRFTIPVEDFEGTTEAVIQTRKVGKERNKDASILIDRNFSPAPRAYGYKELHPE WKDLTHWQQKAENFDSLYMDSIRKVEGLYVLDEVEIKSKRRQGNNMATKINEKSIDAYYD VRRSVDLLRDNGKIVTTIPELMEKLSPQFDWDRSNDKLTYRQKPICYIMDNHILSETETQ MMLTEVDGLASIIISKGTGGIDDEIIQNTKMSEVTDSTGVDISKLDKYSVFYLIPLPRRD VLNKSQ Prediction of potential genes in microbial genomes Time: Wed May 18 00:54:09 2011 Seq name: gi|222159354|gb|ACAB01000005.1| Bacteroides sp. D1 cont1.5, whole genome shotgun sequence Length of sequence - 3934 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 283 158 ## BT_3508 hypothetical protein + Term 318 - 376 -0.6 - Term 142 - 194 -0.5 2 2 Tu 1 . - CDS 208 - 369 99 ## - Prom 560 - 619 4.7 - Term 732 - 768 1.8 3 3 Op 1 . - CDS 993 - 1766 309 ## COG3293 Transposase and inactivated derivatives - Prom 1792 - 1851 2.4 4 3 Op 2 . - CDS 1860 - 2126 220 ## ZPR_0220 hypothetical protein + Prom 2415 - 2474 4.6 5 4 Tu 1 . 
+ CDS 2574 - 3731 610 ## COG1373 Predicted ATPase (AAA+ superfamily) + Term 3777 - 3810 -0.2 Predicted protein(s) >gi|222159354|gb|ACAB01000005.1| GENE 1 2 - 283 158 93 aa, chain + ## HITS:1 COG:no KEGG:BT_3508 NR:ns ## KEGG: BT_3508 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 92 760 851 851 152 78.0 3e-36 AVLGTRQTVIQGYTHALEYYSPAYPTKELYMDKVDKRRTLYWNPSVRTDENGKAVIECYN NQYSTPVIIQAETMSKDGQIGSMKYSTIGQAEQ >gi|222159354|gb|ACAB01000005.1| GENE 2 208 - 369 99 53 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNNGDEGFRIRNNFILINNICYEVIYNKNYCSACPMVEYFMLPICPSLLIVSA >gi|222159354|gb|ACAB01000005.1| GENE 3 993 - 1766 309 257 aa, chain - ## HITS:1 COG:MA2835 KEGG:ns NR:ns ## COG: MA2835 COG3293 # Protein_GI_number: 20091659 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Methanosarcina acetivorans str.C2A # 4 255 17 270 278 226 44.0 2e-59 MGQYSTNLTDKQWQVIEKIINPQERRRKHSLRNIMNAILYLLKTGCQWRMIPKNFAPWES VYYYFSKWKNEGIIEELLDTLRSMVRIQSGRNESPSMGIIDSRNVKTSHHVDSVRGLDGN KKIKGRKQHIIVDSQGYLMSVAVHEANVHDSKGAPKVIERLSYKFPRLAKILADGGYRGS LADWVFDKFKWELEVVLRPDECPSKFKVIPKRWIVERAFAWLENFRRLTIDYEFHADTAE TMVQLAFCKIMLNKIIV >gi|222159354|gb|ACAB01000005.1| GENE 4 1860 - 2126 220 88 aa, chain - ## HITS:1 COG:no KEGG:ZPR_0220 NR:ns ## KEGG: ZPR_0220 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 1 86 48 137 466 94 47.0 1e-18 MVIYGANFGTDPSIISVKIGGKEAIVVSSKGNSLYCLTPSLCFEGSVEVKIGKQSSKAQA KYEYEPQLVVSTLCGYLDEYGKKLFKIY >gi|222159354|gb|ACAB01000005.1| GENE 5 2574 - 3731 610 385 aa, chain + ## HITS:1 COG:MT0627 KEGG:ns NR:ns ## COG: MT0627 COG1373 # Protein_GI_number: 15840000 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Mycobacterium tuberculosis CDC1551 # 1 350 3 375 411 162 28.0 1e-39 MIERILKSKLIEMANKYPIVTLTGPRQSGKSTLLRNSFQDYEYVSLEDPDMRLFATDDPR 
GFLSTYPDKTIIDEVQRVPSLFSYIQTHTDKENKEGMYMLAGSHNFLLMESVNQSLAGRT AVLKLLPFSHYEMEKGEILPSNVNEEIFKGAYPRIYDKAINPDDYYPFYIQTYVERDVRL LRNIGDLSKFIKFLKLCAGRIGQLLNLSSLANECGISVTAATNWLSILEASYICYLLKPD YNNYAKRLVKTPKLYFYDTGLACSLLDIQNTEQITTHFLRGGLFENLIINEFVKESYNRG VEPGLSFWRDSTGNEVDLLRMVGGKQYAYEIKSGATYSPDFFKGISKWAKLSNTPTEQCF AIYNGDRDIKTSVGQVNGWNHFSLF Prediction of potential genes in microbial genomes Time: Wed May 18 00:54:29 2011 Seq name: gi|222159353|gb|ACAB01000006.1| Bacteroides sp. D1 cont1.6, whole genome shotgun sequence Length of sequence - 47818 bp Number of predicted genes - 34, with homology - 34 Number of transcription units - 12, operones - 5 average op.length - 5.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 78 - 293 114 ## COG3292 Predicted periplasmic ligand-binding sensor domain - Prom 334 - 393 3.1 2 2 Op 1 . - CDS 416 - 1717 1115 ## COG4942 Membrane-bound metallopeptidase 3 2 Op 2 . - CDS 1714 - 2292 496 ## BT_3463 hypothetical protein 4 2 Op 3 . - CDS 2289 - 4061 1850 ## COG0457 FOG: TPR repeat 5 2 Op 4 . - CDS 4086 - 4517 473 ## COG0756 dUTPase - Prom 4635 - 4694 4.4 - Term 4685 - 4717 1.5 6 3 Op 1 . - CDS 4812 - 7091 1777 ## COG1472 Beta-glucosidase-related glycosidases 7 3 Op 2 . - CDS 7137 - 8705 942 ## gi|237716628|ref|ZP_04547109.1| predicted protein 8 3 Op 3 . - CDS 8726 - 10627 1144 ## BF3493 sialic acid-specific 9-O-acetylesterase 9 3 Op 4 . - CDS 10670 - 11656 853 ## BT_3473 hypothetical protein 10 3 Op 5 . - CDS 11676 - 13772 1717 ## BT_3474 hypothetical protein 11 3 Op 6 . - CDS 13785 - 16961 2360 ## BT_3475 hypothetical protein 12 3 Op 7 . - CDS 16979 - 18370 951 ## BT_3476 hypothetical protein - Prom 18531 - 18590 6.7 + Prom 18321 - 18380 4.9 13 4 Tu 1 . + CDS 18559 - 19662 375 ## PROTEIN SUPPORTED gi|90020424|ref|YP_526251.1| ribosomal protein L11 methyltransferase - Term 19565 - 19601 4.1 14 5 Tu 1 . - CDS 19666 - 23649 1820 ## COG0642 Signal transduction histidine kinase - Prom 23801 - 23860 5.3 15 6 Tu 1 . 
+ CDS 23911 - 25254 933 ## COG0232 dGTP triphosphohydrolase + Term 25300 - 25356 2.1 - Term 25127 - 25167 7.3 16 7 Op 1 . - CDS 25226 - 26242 939 ## COG3176 Putative hemolysin 17 7 Op 2 . - CDS 26269 - 27087 608 ## COG3176 Putative hemolysin - Prom 27178 - 27237 6.8 + Prom 27065 - 27124 8.4 18 8 Tu 1 . + CDS 27346 - 27849 465 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Term 28021 - 28083 3.1 + Prom 28232 - 28291 4.2 19 9 Op 1 29/0.000 + CDS 28312 - 28782 296 ## COG2001 Uncharacterized protein conserved in bacteria 20 9 Op 2 . + CDS 28754 - 29683 909 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis 21 9 Op 3 . + CDS 29690 - 30046 382 ## BT_3454 hypothetical protein 22 9 Op 4 26/0.000 + CDS 30119 - 32245 2072 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 23 9 Op 5 4/0.000 + CDS 32275 - 33723 1576 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase 24 9 Op 6 28/0.000 + CDS 33813 - 35081 1192 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase + Term 35102 - 35132 0.7 + Prom 35103 - 35162 2.6 25 9 Op 7 25/0.000 + CDS 35220 - 36554 1478 ## COG0771 UDP-N-acetylmuramoylalanine-D-glutamate ligase 26 9 Op 8 31/0.000 + CDS 36587 - 37909 1105 ## COG0772 Bacterial cell division membrane protein 27 9 Op 9 26/0.000 + CDS 37969 - 39087 1197 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase + Prom 39100 - 39159 3.2 28 9 Op 10 . + CDS 39200 - 40600 1293 ## COG0773 UDP-N-acetylmuramate-alanine ligase 29 9 Op 11 . + CDS 40676 - 41413 251 ## PROTEIN SUPPORTED gi|163752975|ref|ZP_02160099.1| 30S ribosomal protein S12 + Prom 41423 - 41482 7.3 30 10 Op 1 35/0.000 + CDS 41513 - 42970 1435 ## COG0849 Actin-like ATPase involved in cell division + Term 42986 - 43040 6.5 31 10 Op 2 . + CDS 43096 - 44406 1433 ## COG0206 Cell division GTPase + Prom 44411 - 44470 4.9 32 10 Op 3 . 
+ CDS 44493 - 44942 276 ## PROTEIN SUPPORTED gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 + Term 45051 - 45107 12.3 33 11 Tu 1 . + CDS 45698 - 47395 659 ## BT_3442 hypothetical protein + Prom 47445 - 47504 2.3 34 12 Tu 1 . + CDS 47555 - 47816 75 ## gi|237720385|ref|ZP_04550866.1| conserved hypothetical protein Predicted protein(s) >gi|222159353|gb|ACAB01000006.1| GENE 1 78 - 293 114 71 aa, chain - ## HITS:1 COG:XF1330_1 KEGG:ns NR:ns ## COG: XF1330_1 COG3292 # Protein_GI_number: 15837931 # Func_class: T Signal transduction mechanisms # Function: Predicted periplasmic ligand-binding sensor domain # Organism: Xylella fastidiosa 9a5c # 1 71 37 109 740 58 46.0 4e-09 MSHNTVWCALQDSYGFIWLGTSDGLNRYDGRGNKVYRNVLNEKFSLENNFVEALIEVDKN IWVGTNSGLYI >gi|222159353|gb|ACAB01000006.1| GENE 2 416 - 1717 1115 433 aa, chain - ## HITS:1 COG:YPO0063 KEGG:ns NR:ns ## COG: YPO0063 COG4942 # Protein_GI_number: 16120414 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Membrane-bound metallopeptidase # Organism: Yersinia pestis # 18 433 55 450 450 84 24.0 4e-16 MMKRFFVILISCLWLAVPLSAQSNKLIRELESKRGALQKQIAESESILKDTKKDVGSQLN SLAVLTGQIEERKRYIMAINNDVEAIQRELASLERQLHGLEKDLKDKKKKYEASVQYLYK NKSIEEKLMFIFSAKNLGQTYRRMRYVREYATYQRLQGEEILKKQEQIRKKKVERQQVKA AKENLLRERESEKVKLEEQEKEKRTLVTNLQKKQKGLQNEINKKRREANRLNARIDKLIA EEIERARKRAQEEARREAAARKKEESKEGKSATTETAAKSKPLESYTMSKADRELSGNFA ANRGKLPMPISGAYIITSRYGQYAVEGLRNVKLDNKGIDIQGKPGAQARAIFDGKVAAVF QLNGLFNVLIRHGNYISVYCNLSSASVKAGDTVKTKQSIGQIFSDGTDNGRTVLHFQLRR EKEKLNPEPWLNR >gi|222159353|gb|ACAB01000006.1| GENE 3 1714 - 2292 496 192 aa, chain - ## HITS:1 COG:no KEGG:BT_3463 NR:ns ## KEGG: BT_3463 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 192 1 193 193 232 74.0 5e-60 MKRIIYLLLLVVVVLAGCKSSKHLATSETKTSTKAPTSSYLASKLQLTIPSKKGSMSVGG TMKMKTHERVQISLLMPILRTEVARIEVTPDEVLLVDRMNKRFVRATKAELKNVLSKNVE FSRLEKILMDASLPSGKTELTGKDIGIPSLEKAKVQLYEFSTQEFSMTPTELTSKYRQIP LEELVRMLVALL 
>gi|222159353|gb|ACAB01000006.1| GENE 4 2289 - 4061 1850 590 aa, chain - ## HITS:1 COG:aq_854 KEGG:ns NR:ns ## COG: aq_854 COG0457 # Protein_GI_number: 15606205 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Aquifex aeolicus # 117 576 83 528 545 97 22.0 5e-20 MKIKIGWLFVTVLVLTSCGGTRSMRTAKTAVKADASFSVKESLLPAEQQRKYDYFFLEAM RMKEKNEYDAAFGLLQHCLDINPNASSALYEISQYYMFLRQVPQGQAALEQAVAFAPDNY WYSQGLVSLYQQQNELDKAVTLLEKMVTRFPSKQEPLFSLLDIYSRQEKYNDVISTLNRL EKRLGKNEQLSMEKFRIYLQMKDDKKAFQEIESLVQEYPMDMRYQVILGDVYLQNGKKQE AYDAYQKVLAVEPDNPMALFSMASYYEQTGQKELYQQQLDTLLLNKKVTSDTKISVMRQV IVENEQSSAKDSTQVIALFDRMMKQDIDDPQIPMLYSQYLLSKNMEQEAVPVLEQVVDLD PTNKAARLMLVSAAVKKEDYKQIIKVCEPGIEATPDALELYYYLAIAYHQAEQTDSVLSV CSRALEHVTADTRKEVISDFYSIMGDIYHTKKQMAEAYAAYDSSLVYNPSNIGALNNYAY YLSVERRDLDKAEEMSYKTVKAEPNNSTYLDTYAWILFEKGNYAEARIYIDNAMKNDGEK SDVIVEHCGDIYFMTGDVEGALKYWKKALEMGSESKTLKQKIEKKKYIAE >gi|222159353|gb|ACAB01000006.1| GENE 5 4086 - 4517 473 143 aa, chain - ## HITS:1 COG:FN1028 KEGG:ns NR:ns ## COG: FN1028 COG0756 # Protein_GI_number: 19704363 # Func_class: F Nucleotide transport and metabolism # Function: dUTPase # Organism: Fusobacterium nucleatum # 1 142 4 145 146 172 62.0 2e-43 MNIQVINKSKHPLPAYATELSAGMDIRANLSEPITLEPLQRCLVPTGLYIALPKGFEAQV RPRSGLAIKKGITVLNSPGTIDADYRGEVCVILVNLSSEAFVIEDGERIAQMVIAKHEQP VWQEVEVLDETERGAGGFGHTGV >gi|222159353|gb|ACAB01000006.1| GENE 6 4812 - 7091 1777 759 aa, chain - ## HITS:1 COG:PA1726 KEGG:ns NR:ns ## COG: PA1726 COG1472 # Protein_GI_number: 15596923 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Pseudomonas aeruginosa # 7 756 6 761 764 682 48.0 0 MKYQYLLLGVLYGLLLSSCSLSAEEGDKEMDRYIANLMGKMTLHEKLGQLNLPSGGDLVT GNVNSAELTKMVRNQEIGGFFNVKGIRKIVDLQRIAIDSTRLGIPLLVGADVIHGYETIF PIPLALSCSWDTLAVERMARISAVEASADGICWTFSPMVDICRDSRWGRIAEGSGEDPYL GSLLAKAYVRGYQGNNMQGKNEILSCVKHFALYGASESGKDYNVVDMSRQRMYNEYLAPY KAAVEAGVGSVMSSFNLVDGIPATANKWLLTDLLRNEWGFCGLLVTDYNSIAEISSHGVA 
PLKEASVRALQAGTDMDMVSCGFLNTLEESLKEGKVTEEQINAACRRVLEAKYKLGLFSD PYKYCDTLRVEKELYTTAHRAVARKIAAETFVLLKNEDHLLPLERKGKIALIGPMADARN NMCGMWSMTCTPSRHGTLLEGIRSAAGDKAEILYARGSNIYHDAELEKGGAGIRPLERGN ELQLLDEALHTAARADVIVAALGECAEMSGESASRTELGIPDAQQDLLKALVKTGKPVVL LLFTGRPLVLNWEDANVHSILNVWFGGSETGDAIADVLFGKVTPGGKLTTTFPRSVGQLP LYYNHLNTGRPDPDSHSFNRYSSNYLDMSNEPLYPFGYGLSYTNFSYGNLQLSSDVLSKN GELTVSVTVTNDGDFDGYEIVQMYLHDIYAEISRPVKELKGFERIFLKKGESREVKFVIT EDDLRFYNSGLQYVYEPGEFDVMVGANSRDVQTERFRAD >gi|222159353|gb|ACAB01000006.1| GENE 7 7137 - 8705 942 522 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716628|ref|ZP_04547109.1| ## NR: gi|237716628|ref|ZP_04547109.1| predicted protein [Bacteroides sp. D1] # 1 522 1 522 522 1029 100.0 0 MKKLNFLFLGLLSFMPIWGCNDDDSLPEAVVEVKEGHNEDIVSVIDYDIKNDGTLIGSQL NNLVGQSYGKTLYFPAGTYNLTEPIVLPLEYTKNVNLIFDKNATVKSDVHLEALIKVGYS ETYFTDVSHRRFSYIEGGILDCYNADNGILVNGRKQLVQIRTMSLVRGRNTHIRIHVPEG IGTGGTGSSDTKIDNVTIQGISSNDNVYGIYIDESCCDCKISDTFIYCTKEALVTKSAGH ILNNVHILSWDTTGGTHLEDGKNYRSTVGIRIASGGFFVFNQIYYDTIDRGIVVNEGYSP DLMLDQQISFSYLNNFGTSFIQAEGNNPNLKVKLSNSTFTVLNDNFKILDCPVDLVGWDI ADKFSFLNCVVANSQRLYPYETTLMQKLRGVSSDGIIWTTLANYDTNWNVLGAVISSPMR NTLEIDLADNLTVKLNFMFKGVEPLLQGYEVLGEGVEDIEFGYAVSGQFCVLLFRAKRSA SFNPIIYDTLGSGRFMPTPYKGKKYTLVDYGLSDSDIKILSR >gi|222159353|gb|ACAB01000006.1| GENE 8 8726 - 10627 1144 633 aa, chain - ## HITS:1 COG:no KEGG:BF3493 NR:ns ## KEGG: BF3493 # Name: not_defined # Def: sialic acid-specific 9-O-acetylesterase # Organism: B.fragilis # Pathway: not_defined # 4 631 12 644 654 688 53.0 0 MSGLFASLIACAKVKLSPLFSDNMVLQQEAEVPVWGKASPEMRVIVIPSWSGEKYENIAD RTGNWKVNIKTPKAGGPYHLEIIDGEKVRLENVMIGEVWLCSGQSNMQMPVEGWGKVKNY QQEVAQANYPNIRLMTVSNTISLSPSQEFTAVGGGWQVCSSGTIREFSAAAYFFGREIAR TQQVPVGLICAHWGGTNIESWISAQALGEVPDFVEQLKLIRRLGNKDCDLQAEEEQRQAK ILSLDKGMRNGKPFWNTLSYNDEGWISHSFPGNIEKTFPDFDGIVWGRKTVDIPEQWEGK PLSLHMGYVDDEDITYFNGIEIGSTKGYTHSRTYEIPGNLVKAGKAVITVRIVDTGGGCG IGGEMKLSKDVGDWILISGEWKCKVAAQSHIDPVFEMNPNVQTVLYNGMIHPLAPYKFRG VIWYQGENNVGRATQYRILLPLLIQSWREEWGNDFPFYLVQLANYLERADEPGDSQWAEL 
REAQRQTALYYDNVGMAVTIDIGDMNDIHPKNKQEVGRRLALLARADTYGERIVCSGPVY QTYRVKGDKVILSFECADGGLKTSDGKKLTGFAIAGIDGKYHWAEAEICGNEIIVSSEMV KHPLAVRYGWGDNPTCNLINGAGLPASSFQTLR >gi|222159353|gb|ACAB01000006.1| GENE 9 10670 - 11656 853 328 aa, chain - ## HITS:1 COG:no KEGG:BT_3473 NR:ns ## KEGG: BT_3473 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 328 2 333 333 323 50.0 4e-87 MKERIYYPLLAVLMVLFCAACNEEWTDEQYEHYVSFKAPMNYAKGVTDIYVKYKPNGMVT YQLPLIMSGSTMAGSDTEVQVAIDSDTLKSINWEYFHNRKDLYYRELTSGYYELNDMKVM IPAGDCMGILDINFKLSGIDLVDKWVLPLTIAESPTGSYEPHPRKNYKKALLRVIPFNDY SGTYSTTTMQVGFKDAGNKMTTNERTAFVVDEHSIFFYAGLINEEELERRKYKIIFTLNA DGTVTLSAPDPDIHFKLIKNPTWKVTEKMDASLPYLKHKTITIENFKYEFDDVTSVPGTP ISYTVEGTVNMERKINIQIPDEDQAIEW >gi|222159353|gb|ACAB01000006.1| GENE 10 11676 - 13772 1717 698 aa, chain - ## HITS:1 COG:no KEGG:BT_3474 NR:ns ## KEGG: BT_3474 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 696 18 679 681 761 58.0 0 MKWLYNTGAILFMVCYLASCSDYLEVEHNFKDRLTIEKVFTNRDYTDSWLSNTYSYLRDQ NQDIGGKESVPTNFADDIYWADWNNWGNKNIAGGYYSYFRNVEYNEDWMQETYTNCYLGI NKACVFLKYIHMNTQLTEEERLDYAAQARFVRAYYYWLLLRKYGPIPLIPDAGEDYTQDY DALARPRNTYDECVEYITSEMVLAAKDLPLERGANNIARPTRGAALATRAKVLLYAASPL MNGNTDSYAQQMVDDEGNRLLSAEYDESKWARAAAAARDVMELPGNNNGHRYQLYVKNRI RGGGTDDYPETIEPFDDNNFSKKSWPDGYADIDPFESYRSVFNGELSAYANPELIFSRVD NITVDHSGEGTNSPDGIANMVLHQLPTVAGGWSMHGMTQKQCDSYYMADGTDCPGKDKEI GRGDGSARLSGYVTSEDVDAGRYKPLRAGVSLQYANREPRFYASVAFNGSVWNMSSLNGK DGAASPNQQVWYYRGTSEGYNGGNRIFTGIGIKKYVNPYDAKYQNSFYYRQKTETAIRYA EILLIYAEALNELDGSYDIPSWDGSAMYTLRRDVNELKKGIQPVRIRAGVPDYTSSEYQN KEMFRKKLKRERQIEFMGENHRYYDLRRWKDAPQEEETLIYGCNMLMDEARRDLFHTPVA IMSLPVRFVEKSYFWPFSHGQLQRNKRLTQNPGWTYYD >gi|222159353|gb|ACAB01000006.1| GENE 11 13785 - 16961 2360 1058 aa, chain - ## HITS:1 COG:no KEGG:BT_3475 NR:ns ## KEGG: BT_3475 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1058 1 1039 1039 1272 60.0 0 
MKQQFLVIIAALCCFAAGISAQQSTHRQIVVTGVVTDENQEPLIGANVSVKDVPGLGSIT NINGKFSIKMQPYNRLIFSYIGYETQEILVREERTVNVVMKEKTETTLDEVVITATGAQK KLTVTGAVSTINVEQLRTSPTGSISNSLAGNVAGIIARQTSGQPGKNVSEFWIRGISTFG AGSSALVLVDGFERDMNELNYEDIESFTVLKDASETAIYGSRGANGVVLITTRRGKINKI TVDAKVETIYNTRTFTPDFVDGITYANLANEARRTRNLEPIYSGNELKILTQGLDPDLLP NVDWMDTLLKNGAMSYRATLNLSGGGQNARYFVSGSYLDEGGMYKVDKSLNEDYNTNSNN KRWNYRMNADIDVTRTTLLQVGIGGSLKKMNESGMTSHEIWNSILNQTPTSMPIMYSNGY TPTNSDGDINPWVASTQCGYNEQWWNNIQTNITLNQKLDFITKGLNFIGRFGFDTDNYNF IKRYKCPALWSADRFRKDDGTINFTRKRAEQEMTQTSGNTGERKEYFEAELQYNRNFKSH ILSGTLKYTQDSKVKTQEIGNDIKNSLPMRHQGLAGRIAYNWNYRYFLNFNFGYTGSENF AAGHQFGFFPAYSVAWNIAEEPFVKKNLKWVNMFKVRYSWGKVGNDKMYEANGTTLIRFP YLYTIGYGGAPNPNIWGDNAGYYSAFGGYNWADYGYEKYGTTLFQGLRYTSYANDGVTWE IATKHDIGLDMSLFDDKFTLTIDYFHEDRKGIFMYRRYLPATAGVENGMTLPQANVGKVL SKGVDGNFAYKQQVGSVQLTMRGNMTLSKNEVKDKDEQNNVYPYLMEQEYRVNQAKGLIA LGLFKDYTDIRNSPQQTFGAYQPGDIKYKDVNGDGIISDMDRVAIGATTVPNLIYGMGIS AQWKGLDVNLHFQGAGKSSFFINGSGVRPFVNGSKGNILKDVVAGRWISRDESGTESTEN VRAEYPRLSYGGSNNNYQESSFWLRDGSYVRLKTLEIGYTLPKPLVNKMHFNNVRIYLIG TNLLTWSKFKMWDPEMNSSNGAAYPLSKSITVGLNVNL >gi|222159353|gb|ACAB01000006.1| GENE 12 16979 - 18370 951 463 aa, chain - ## HITS:1 COG:no KEGG:BT_3476 NR:ns ## KEGG: BT_3476 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 460 16 453 457 268 36.0 4e-70 MMNRTGKRMNVVKAFCLLALLGCFIGCKDDEKRKEQAFDNSLPIEIVGFEPREVTPRTQL LIYGKNFGNDPSLINLKLGGVETRVVNCNNECIYCMVPRNADKGTIEISIGDQAAIAPDR FTYIANTQVTTLCGYVDETGRYEVKDGPFSDCGLNSPTWLSIDPKNPNHIYMIEDKSSIR LIDMEVQELTTVATVGQLNIQDPRSLCWSLTGDTLFVSNYSNGDSPIVVTMLLRSEGFKK AHSLFSGYKWVISATIHPQTGDIYFAQESDATIKRYNMATKEVETMFSVGGNWIWMYVYF HPSGDFAYLMRPREGTIFKCQFNKTTNWLELPSIYVGNWNQGYNEGVGTNVRMGVMWQGT FVKNEKYIEQHKDDIYDFYVADGELYDNHPGDLIWKVTPNAEAIRYAGRGSVAMDNSHFG YVDGDLLQEARFNGPHSIAFDEVNKIFYIGDQHNHRIRQIIVE >gi|222159353|gb|ACAB01000006.1| GENE 13 18559 - 19662 375 367 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020424|ref|YP_526251.1| ribosomal protein L11 methyltransferase [Saccharophagus 
degradans 2-40] # 37 351 5 313 314 149 31 4e-35 MKFQNCLFLILLIVCLSCQSKKTDKQTGIDSTLFYTNPVIGGGSEPWAIFHEGKYYYTQS LADCIILWETDDITQLAEAAKKNVYSPQNAPGISHLWGPEIHYMDNKWYIYCAADDGNID NRQIYVLENESPDPMKGEFVMKGRISTDKNNNLAIHANPFEHNGKRYLIWCGWETRRIYQ ETQCIYIAEMKNPWTLASDRILISKPEYEWECQWISVDGNKTGYPIHVNEAPQFFRSKNK DKLLIYYSASGNWTPYYCVGLLTADANSDLLNPNSWKKSLEPVFKQAPENHVYGPGSLCF IPSPDKKEWYILYHARNALRDMFVLDGRTTRIQKIEWDENGIPILGIPQKESTLLQKPSG TPTSDRN >gi|222159353|gb|ACAB01000006.1| GENE 14 19666 - 23649 1820 1327 aa, chain - ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 808 1065 46 314 318 136 30.0 2e-31 MKKQVTAFICIIVFNFEAIFASTDFSFKRYQVEDGLSHNTTWCSIQDSYGFLWIGTSDGL NCYDGCGNRVYRNALNNKYSLGNNFVSSLLEEGENIWVGTNSGLYIYDRSTDRFSYFNET TRYGVFVSSEVKKIIKTQSGLIWIATLGQGVFIYEPVTGVLTQNSIHTSFVWDICQSADS RIYVSSMQEGLFCFDEKGTFLQNYPVLSDSNIGSQRINCLYSIKNEIWFGVGTNLLGCYQ EGSPDLKIYDASAQNFGAILCLMDYNEKELLVGTDNGLYLFNRDLKTFRRGDSPFGLRAL GDPTVNAMMWDAEGTLWVMTNSGGVSYMTKQLKRFDYYSLISLSGTLELGREIGPLCEDQ NGNIWVGTRNGLCFFNTQIHSLSEYTLKNVDARYDIRSLLLDGDRLWIGTYAEGLHVLNL KTGAVKSYVHSRDIPYTICSNDVLSLFKDRKGDVFVGTSWGLCRYNSQLDNFVTIIHVGS MASVTDIYEDMYNSLWVATSNSGVFRYNTVNDRWEHFEHIREDTTTITSNSVITLFEDMK GTMWFGTNGGGLCSFSKDTETFIDFDPENTLLPNKVIFSIEQDRTGDFWISCNAGLCRIN PVSQKNFHLFTVNDGLQGNQFTAQSSLRSSTGKFYFGGINGFNVFEPGQFTDNTYLPPVY VTDISFPYLTDEQEVKRLLRLDKPFYKADKIQLPYTNNSFTVRFVALSYEDPHRNRYTYM LKGIDKEWIHSSNNTASYTNLPAGEYEFLVCGSNNDHHWNEKAASLWIVITPPWWRTSVA YFMYILLFLCTACYIGWRWNLYVKRKYKRRMEDYQVAKEKEMYKSKINFFINLVHEIRTP LSLIRLPLEKLREEEKEERKEKYLSVMDKNVNYLLGITNQLLDFQKMESGALQLSMKECS INALMSDIFNQFNGSVDLQGLTICLSLPPEEISVMIDSEKVSKILVNLVGNAIKYAKSQV EMRLILLGDRVNIIVSDDGPGVPEGQQNKIFQAFYQLPDDPVATSTGTGIGLAFAKSLAE AHHGTLSLENRMEGGSSFILSLPLETVVTESIPNEFAEMQMIDVEGKKMPDSEFGNKRFI VLLVEDNVELLNLTRESLSDWFKVFSACNGKEALEILSRESIDVIVSDIMMPEMNGLELC NYIKSDLVYSHIPVILLTAKTTLDSKAEGFENGADAYIEKPFSIKQLHKQIENLLKLRLA 
FHKLMINLSGSPASSLADFALSQKDVEFIERVNTILSELLADESCSIDLLAEQMNMSRSN FSRKIKALSGMPPHDYIKNFRLNKSTELIMSGIRIAEVAEQLGFTTSSYFAKCFKEKFGV LPKDYKG >gi|222159353|gb|ACAB01000006.1| GENE 15 23911 - 25254 933 447 aa, chain + ## HITS:1 COG:sll0398 KEGG:ns NR:ns ## COG: sll0398 COG0232 # Protein_GI_number: 16331575 # Func_class: F Nucleotide transport and metabolism # Function: dGTP triphosphohydrolase # Organism: Synechocystis # 1 444 1 440 440 288 38.0 2e-77 MNWKQLISAKRFGMEEFHEERQENRSEFQRDYDRLIFSAPFRRLQNKTQVFPLPGSVFVH NRLTHSLEVASVGRSLGDDVAKALLERHPELQDSFLPEIGSIVSAACLAHDLGNPPFGHS GEKAISTFFSEGKGVRLKEKQPNGEQLSPMEWEDLTHFEGNANAFRILTHQFEGRRKGGF VLTYTTLASIVKYPFSSSLAGTKSKFGFFVSEEESFQKIATELGLTLLNEHPLKYARHPL VYLVEAADDICYQMMDIEDAYKLKILTTEETKELLMTYFSEERQEHLRKTFLIVNDVNEQ IAYLRSSVIGLLIRECTRVFLDHEQEILSGTFEGSLIKRIAERPAAAYKHSVEVSINKIY RSRDVLDIELAGFRIISTLLELMIDAVTSPEKAYSKLLIDRVSSQYNINSPVLYERIQAV LDYISGMTDVFALDLYRKINGNSLPAV >gi|222159353|gb|ACAB01000006.1| GENE 16 25226 - 26242 939 338 aa, chain - ## HITS:1 COG:VCA0646 KEGG:ns NR:ns ## COG: VCA0646 COG3176 # Protein_GI_number: 15601404 # Func_class: R General function prediction only # Function: Putative hemolysin # Organism: Vibrio cholerae # 4 222 304 514 605 135 37.0 1e-31 MEEIIKPVSKELLKAELTEDKRLRMTNKSNNQIYIITHQNAPNVMREIGRLREIAFRAAG GGTGLSMDIDEYDTMEHPYKQLIVWNPEAEEILGGYRYLLGTDVRFDEKGAPILATSHMF HFSDAFIKEYLPQTIELGRSFVTLEYQSTRAGSKGLFALDNLWDGLGALTVVMPNVKYFF GKVTMYPSYHRRGRDMILHFLKKHFYDKEKLVTPIEPLQLETSEEELNALFCKSTFKEDY KILNCEIRKLGYNIPPLVNAYMSLSPTMRMFGTAVNYEFGDVEETGILIAVDEILEDKRI RHIQTFIESHPDALRLSSCEGEEVFTPKVVTPQADCSR >gi|222159353|gb|ACAB01000006.1| GENE 17 26269 - 27087 608 272 aa, chain - ## HITS:1 COG:VCA0646 KEGG:ns NR:ns ## COG: VCA0646 COG3176 # Protein_GI_number: 15601404 # Func_class: R General function prediction only # Function: Putative hemolysin # Organism: Vibrio cholerae # 53 267 66 273 605 81 30.0 1e-15 MTDDSLFLIDVDKILRTKAPKHYKYIPKFVVSYLKKIVHQDEINVFLNESKDRLGVDFLE ACMGFLDAKVDVKGIENLPKDGLYTFVSNHPLGGQDGVALGYVLGKHYDGKVKYLVNDLL 
MNLRGLAPLCIPINKTGKQAKDFPKMVEAGFQSDDQLIMFPAGLCSRRQNGVIRDLEWKK TFVVKSIQAKRDVIPVHFGGRNSDFFYNLANICKALGIKFNIAMLYLADEMFKNRHKTFT VTFGKPIPWQTFDKSKTPAQWAEYVKDIVYKL >gi|222159353|gb|ACAB01000006.1| GENE 18 27346 - 27849 465 167 aa, chain + ## HITS:1 COG:mll3697 KEGG:ns NR:ns ## COG: mll3697 COG1595 # Protein_GI_number: 13473184 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 3 164 5 161 183 97 36.0 7e-21 MKSLSFRKDLVGVQDELLRFAYKLTTDREEANDLLQETSLKALDNEDKYTPDTNFKGWMY TIMRNIFINNYRKVVRDQTFVDQTDNLYHLNLPQDGSSESTERAYDLKEMHRVVNKLPKE YRVPFAMHVSGFKYREIAEKLNLPLGTVKSRIFFTRQKLQEELKDFR >gi|222159353|gb|ACAB01000006.1| GENE 19 28312 - 28782 296 156 aa, chain + ## HITS:1 COG:CAC2133 KEGG:ns NR:ns ## COG: CAC2133 COG2001 # Protein_GI_number: 15895402 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 4 127 2 119 142 67 33.0 1e-11 MIRFLGNIEARADAKGRVFIPATFRKQLQAASEERLIMRKDVFQDCLTLYPESVWNEELN ELRSRLNKWNSKHQLIFRQFVSDVEVVTPDSNGRILIPKRYLQICSIHGDIRFIGIDNKI EIWAKERAEQPFMSPEEFGAALEEIMNDDNRQDGER >gi|222159353|gb|ACAB01000006.1| GENE 20 28754 - 29683 909 309 aa, chain + ## HITS:1 COG:BS_ylxA KEGG:ns NR:ns ## COG: BS_ylxA COG0275 # Protein_GI_number: 16078578 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Bacillus subtilis # 14 309 4 310 311 238 43.0 2e-62 MIIDKMEKDELTYHVPVLLKESVDGMNIQPDGTYVDVTFGGAGHSREILSRLGEGGRLLG FDQDEDAERNIVNDPHFIFVRSNFRYLQNFLRYHDIEQVDAILADLGVSSHHFDDSERGF SFRFDGALDMRMNKRAGITAADIVNTYEEERLANLLYLYGELKNSRKLASVIVKARSGQY IRTIGEFLEIIKPLFGREREKKELAKVFQALRIEVNQEMEALKEMLLAATEALKPGGRLV VITYHSLEDRMVKNIMKTGNVEGKAESDFFGNLQTPFRLVNNKVIVPDEAEIERNPRSRS AKLRIAEKK >gi|222159353|gb|ACAB01000006.1| GENE 21 29690 - 30046 382 118 aa, chain + ## HITS:1 COG:no KEGG:BT_3454 NR:ns ## KEGG: BT_3454 # Name: 
not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 118 1 118 118 172 95.0 4e-42 MEEEVVNKKTEEGKKKNRTSLKSILGGDILATDFFRRQTKLLVLIMVFIIFYIHNRYASQ QQQIEIDRLKKELIDIKYDALTRSSELMEKSRQSRIEDYISNKESDLQTSTHPPYLIK >gi|222159353|gb|ACAB01000006.1| GENE 22 30119 - 32245 2072 708 aa, chain + ## HITS:1 COG:CAC2130 KEGG:ns NR:ns ## COG: CAC2130 COG0768 # Protein_GI_number: 15895399 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 3 708 15 724 729 176 26.0 1e-43 MTRYFFVILLMALIGIAIVVKAGITMFAERQYWQDVADRFVKENVTVKPNRGNIISSDGK LMASSLPEYRIYMDFMSGEKDEKRRRKDQARRDSILNANMDSICIGLHKIFPDRSAAQFK AHLKKGRQAKSRNYLIYPKRISYIQYKEVKRLPVFCLNRYKGGFKELAYNQRKKPFGSLA ARTLGDVYADTAKGARNGIELAFDTILKGRDGLTHRQKVMNKYLNIVDVPPVDGCDLIST IDVGMQDICEKALVDKLKELNASVGVVVLMEVATGEVKAIVNMMQGKDGEYYEMRNNAIS DMLEPGSTFKTASIMVALEDGKITPDYVVDTGNGQMPMYGRVMKDHNWHRGGYGKLTVTE ILGVSSNVGTSYIIDHFYGSNPQKFVDGLKRMSIDQPLHLQISGEGKPNIRGPKERYFAK TTLPWMSIGYETQVPPINILTFYNGIANNGVSVRPKFVKAAVKDGEVVKEYPTEVINPKI CSDKTLAQIREILRKVVGEGLAKPAGSKQFHVSGKTGTAQISQGAAGYKTGRTNYLVSFC GYFPSEAPKYSMIVSIQKPGLPASGGLMAGSVFSKIAERVYAKDLRLPLTNAIDTNSVVI PNVKAGEMREAQRVLEELEINIQGKVADSGKEVWGNTHVAPQAVVLESRSNMQNFVPSVI GMGAKDAVYLLESKGLKVNLVGVGKVKSQSIANGTIVKKGQTVTLTLK >gi|222159353|gb|ACAB01000006.1| GENE 23 32275 - 33723 1576 482 aa, chain + ## HITS:1 COG:CAC2129 KEGG:ns NR:ns ## COG: CAC2129 COG0769 # Protein_GI_number: 15895398 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Clostridium acetobutylicum # 1 482 1 479 482 337 39.0 2e-92 MLLKELLKAIQPVEVAGDSDIEITGVNIDSRLVEAGQLFMAMRGTQADGHAYIPAAIAKG ATAILCEDMPEEPVAGITYIQVKDSEDAVGKIATTFYGDPTSKMELVGVTGTNGKTTIAT LLYNTFRYFGYKVGLISTVCNYIDDEPIPTEHTTPDPITLNRLLGRMADEGCKYVFMEVS SHSIAQKRISGLKYAGGIFTNLTRDHLDYHKTVENYLKAKKKFFDDLPKSAFSLTNLDDK NGLVMTQNTRSKVYTYSLRSLCDFKGRVLESHFEGMLLDFNNHELAVQFIGKFNASNLLA 
VFGAAVLLGKKEEEVLVALSTLHPVAGRFDAVRSPLGVTAIVDYAHTPDALINVLNAIHG VLEGKGKVITVVGAGGNRDKGKRPIMAKEAAKASDRVIITSDNPRFEEPQDIINDMLAGL DAEDMKKTISIADRKEAIRTACMLAEKGDVILVAGKGHENYQEIKGVKHHFDDKEVLKEI FK >gi|222159353|gb|ACAB01000006.1| GENE 24 33813 - 35081 1192 422 aa, chain + ## HITS:1 COG:YPO0552 KEGG:ns NR:ns ## COG: YPO0552 COG0472 # Protein_GI_number: 16120880 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Yersinia pestis # 1 422 1 360 360 212 33.0 1e-54 MLYYLFEWLQKLNFPGAGMFGYTSFRALMAVILALLISSIWGDKFINLLKRKQITETQRD AKTDPFGVNKVGVPSMGGVIIIVAILIPCLLLGKLHNIYMILMLITTVWLGSLGFADDYI KIFKKDKEGLHGKFKIIGQVGLGLIVGMTLYLSPDVVIRENIEVHNPGQEMEVIHGTNDL KSTQTTIPFFKSNNLDYADLVSFMGEHAQTAGWILFVIVTIIVVTAVSNGANLNDGMDGM AAGNSAIIGATLGILAYVSSHIEFAGYLNIMYIPGSEELVIYICAFIGALIGFLWYNAYP AQVFMGDTGSLTIGGIIAVFAIIIHKELLIPILCGVFLVENLSVILQRAYYKAGKRKGVK QRLFKRTPIHDHFRTSMSLIEPGCTVKFTKPDQLFHESKITVRFWIVTIVLAAITIITLK IR >gi|222159353|gb|ACAB01000006.1| GENE 25 35220 - 36554 1478 444 aa, chain + ## HITS:1 COG:BS_murD KEGG:ns NR:ns ## COG: BS_murD COG0771 # Protein_GI_number: 16078584 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramoylalanine-D-glutamate ligase # Organism: Bacillus subtilis # 2 444 10 450 451 244 37.0 3e-64 MKRIVILGAGESGAGAAVLAKVKGFETFVSDMSAIKDKYKELLDSHQIAWEEGHHTEELI LNADEVIKSPGIPNDAPIILKLKAQGTPVISEIEFAGRYTDAKMICITGSNGKTTTTSLI YHIFKSAGLNVGLAGNIGKSLALQVAEDYHDYYIIELSSFQLDNMYNFRANIAVLMNITP DHLDRYDHCMQNYIDAKFRITQNQTPDDAFIFWNDDPIIRQELAKHGLKAHLYPFAAVKE DGAIAYVEDHEVKITEPIAFNMEQEELALTGQHNLYNSLAAGISANLAGIAKENIRKALS DFKGVEHRLEKVARVRGIDFINDSKATNVNSCWYALQSMTTKTVLILGGKDKGNDYTEIE DLVREKCSALVYLGLHNEKLHEFFDRFGLPVADVQTGMKDAVEAAYKLAKKGETVLLSPC CASFDLFKSYEDRGDQFKECVRAL >gi|222159353|gb|ACAB01000006.1| GENE 26 36587 - 37909 1105 440 aa, chain + ## HITS:1 COG:BH3275 KEGG:ns NR:ns ## COG: BH3275 COG0772 # Protein_GI_number: 15615837 # Func_class: D Cell cycle 
control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Bacillus halodurans # 13 398 15 398 398 140 29.0 4e-33 MDLLKNIFKGDKVIWIIFLCLCLISIIEVFSAASTLTYKSGDHWGPITQHSIILMVGAVV VVFLHNVPYKWFQVFPVFLYPISLVLLAFVTLMGIITGDRVNGAARWMTFMGLQFQPSEL AKMAVIIAVSFILSKRQDEYGANPNAFKYIMILTGLVFLLIAPENLSTAMLLFGVVCMMM FIGRVSAKKLFGMLGILALVGGVAVGILMAIPAKTLHNTPGLHRFETWQNRVSGFFEKEE VPAAKFDIDKDAQVAHARIAIATSHVVGKGPGNSIQRDFLSQAFSDFIFAIVIEEMGLIG GIFVVFLYLWLLMRAGRIAQKCERTFPAFLVMGIALLLVSQAILNMMVAVGLFPVTGQPL PLVSKGGTSTLINCAYIGMILSVSRYTAHLEEQKAHDAQIQLQLEADAAANSEAQTAAEP TAQILNSDAKFEDEHDIDKR >gi|222159353|gb|ACAB01000006.1| GENE 27 37969 - 39087 1197 372 aa, chain + ## HITS:1 COG:BH2565 KEGG:ns NR:ns ## COG: BH2565 COG0707 # Protein_GI_number: 15615128 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: Bacillus halodurans # 5 366 1 363 363 247 37.0 2e-65 MEKELRIIISGGGTGGHIFPAVSIANAIIELRPDAEILFVGAEGRMEMQRVPDAGYRIIG LPIAGFDRKRLWKNVSVLIKLMRSQWKARKVIKDFRPQVAVGVGGYASGPTLKTAGMMGV PTLIQEQNSYAGVTNKLLAQKAKVICVAYDGMEKFFPADKIIMTGNPVRQNLTKDMPSKE EALRSFNLQPGKKTILIVGGSLGARTINNTLTAGLATIKENTDVQFIWQTGKYYYPQVKE AVKAAGTLPNLYVTDFIKDMAAAYAAADLVISRAGAGSISEFCLLHKPVILVPSPNVAED HQTKNALALVNKQAAIYVKDSEAETTLMDVALSTVNDEQKLKGLTENIAKLALPDSARII AQEVIKLAEAKS >gi|222159353|gb|ACAB01000006.1| GENE 28 39200 - 40600 1293 466 aa, chain + ## HITS:1 COG:CAC3225 KEGG:ns NR:ns ## COG: CAC3225 COG0773 # Protein_GI_number: 15896472 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate-alanine ligase # Organism: Clostridium acetobutylicum # 7 444 11 448 458 236 34.0 8e-62 MNIETIKSVYFVGAGGIGMSALVRYFLFKGKIVAGYDRTPTPLTETLIAEGAQIHYEENV DLIPEACKDKESTLVIYTPAVPQEHEEIVYFRNNGFEIQKRAQVLGTITHSSKGLCVAGT HGKTTTSTMTAHLLHQSHVECTAFLGGISKNYGTNLLLSQTSPYTVIEADEFDRSFHWLS PYMTVITSTDPDHLDIYGTEQAYLESFEHYTTLIQPGGALIIRKGISLQPKVQPGVRVYT YSRDEGDFHAENIRIGNGEIFIDFVAPDTRINDIQLGIPVSINIENGVAAMALAHLNGVT 
DEEIKRGMASFRGVDRRFDFKIKNDRIVFLSDYAHHPSEIKQSVLSMRELYKDKKITAIF QPHLYTRTRDFYQDFADSLSLLDEVILVDIYPAREAPIPGVTSKLIYDNLRPGIEKSMCK KEDILNILKDKNIEVLITLGAGDIDNYVPEIKELLETKKLRMNNEE >gi|222159353|gb|ACAB01000006.1| GENE 29 40676 - 41413 251 245 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163752975|ref|ZP_02160099.1| 30S ribosomal protein S12 [Kordia algicida OT-1] # 49 244 47 239 239 101 30 9e-21 MVKRILLSIVMLVLIAYLAVAITAFNRKPADQTCRDMELVIKDTAYAGFITKEELKGILQ HKGIYPIGKKMERISTKSLERELSKHPLIDEAECYKTPSGKVCVEVTQRIPILRVMSSNG QNYYLDNKGTVMPPDAKCVAHRVIVTGNVEKSFAMKDLYKFGVFLHNNKFWDAQIEQIHV LPDQNIELVPRVGDHLVYLGKLENFEDKLARLKEFYKKGLNRVGWNKYSRINLEFNNQII CTKRE >gi|222159353|gb|ACAB01000006.1| GENE 30 41513 - 42970 1435 485 aa, chain + ## HITS:1 COG:RSc2840 KEGG:ns NR:ns ## COG: RSc2840 COG0849 # Protein_GI_number: 17547559 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell division # Organism: Ralstonia solanacearum # 5 334 7 338 410 139 27.0 1e-32 MATTEFIAAIELGSSKITGVAGRKNSDGSMQVLAYAQEDSSTFIRKGVIFNLDKTAQSLT SIINRLEGELKNSIAKVYVGIGGQSLRTVRNVVSRDLEEEAIISEELVSAIGDENIAIPV VDMDILDVAPQEYKVGNNLQANPVGLVGSHIEGRFLNIVARASVRKNLEHCFQQAKIDIA DQLIAPLVTANAVLTESERRSGCALIDFGADTTTISVYKNNILRFLTVLPLGGNSITRDI TTLQMEEEEAERLKKAYGDALYEEDPEQEEEATCKLEDDNRIIKVADLNNIIEARAEEIV ANVWNQIQLSGYEDKLLAGLILTGGAANLKNLDETLRKRSKIEKIRMAKLPRNTVHAPNN ILKKDGSQNTLFGLLFEGNQNCCLTETAPQAAAPVPPVSKPEPEVHRTADIFEDDQELKE QARLARLKKEEEEREAKLAAKEAEKLRKQKEKEEKERRKREAGPSWIQRKIDSLTKEIFS DDDMK >gi|222159353|gb|ACAB01000006.1| GENE 31 43096 - 44406 1433 436 aa, chain + ## HITS:1 COG:BB0299 KEGG:ns NR:ns ## COG: BB0299 COG0206 # Protein_GI_number: 15594644 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division GTPase # Organism: Borrelia burgdorferi # 11 405 22 396 404 221 40.0 3e-57 MDEIVQFDFPTDSPKIIKVIGVGGGGGNAVNHMYREGIHDVTFVLCNTDNQALAESPVPV KLQLGRSITQGLGAGNRPERARDAAEESIEDIKAQLNDGTKMVFITAGMGGGTGTGAAPV 
IARIAKEMDILTVGIVTIPFIFEGEKKIIQALDGVERIAQHVDALLVINNERLREIYADL TFMNAFGKADDTLSIAAKSIAEIITMRGTVNLDFADVKTILKDGGVAIMSTGFGEGENRV TKAIDDALHSPLLNNNDIFNAKKVMLNVSFCPSSELMMEEMNEIHEFMSKFREGVEVIWG VAIDNSLDTKVKITVLATGFGVEDVPGMDSLHAARSQEEEERQLQLEEEKEKNKERIRKA YGESASNIGSKSLRKRRHIYLFNTEDLDNDDIIAMVEDSPTYMRDKTTLTKIRTKAALEE EVATEEATDDNGVITF >gi|222159353|gb|ACAB01000006.1| GENE 32 44493 - 44942 276 149 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 [Lactobacillus johnsonii NCC 533] # 1 148 1 146 147 110 42 1e-23 MDLFEQVSEDIKNAMKAKDKVALETLRNIKKFFLEAKTAPGANDILTDDAALKIIQKLVK QGKDSAEIYIGQGRQDLADVELGQVAVMEKYLPKQMSAEELEAALKEIIAETGATSGKDM GKVMGVASKKLAGLAEGRAISAKVKELLG >gi|222159353|gb|ACAB01000006.1| GENE 33 45698 - 47395 659 565 aa, chain + ## HITS:1 COG:no KEGG:BT_3442 NR:ns ## KEGG: BT_3442 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 565 1 563 563 701 68.0 0 MRKSIAIIITSILVLSGCNKQQSVSDRLLNEVEKAIAINPDSASNLLKGISSPEKLDNKT FARWCMLSGKITDEIFNSILPTYQLERAYDWYSSHGSPDEQVQILIYLGRSYFADGDYDK AMSIYTNALDIAEKNKLNNLTGYTYSYIGDLYGEKFMRTEAIKRYKAAAECFKKENNTDS YACALRDVGREYACIDSLSRALKILTIADSVARNTKNIEVTASIDNALGNIYAMQNKYDK AEEYFLKALVGRETMPDYMALIDLYIASGAINKAQELLSKILQDNPKYTYSIKYLYYQIY NEEKNYKEALTNLKEYVEITDSIIYADNQSKILNIESKYNHLKISKEVDRLKIKQQSYII VLVICIGILLLIIIGYLLYRKKAKEKIQRQQEELNRIKTDLLYVSLELEKKKRLLDTFKE KNESYEEMQEEISLLTTNYKQLQNKILENSNPLHKELIHLANQNKPRNNKPLITDKQWKL IADEITYIYPNLRKYIYSRCPDLPEQDFWYCCLYISGFDTNTEAKLLNITVDSVRKKRLR LRQKLNIILPDNNATLYEYLIENMH >gi|222159353|gb|ACAB01000006.1| GENE 34 47555 - 47816 75 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237720385|ref|ZP_04550866.1| ## NR: gi|237720385|ref|ZP_04550866.1| conserved hypothetical protein [Bacteroides sp. 
2_2_4] # 1 87 1 87 122 143 100.0 3e-33 MRKLSILLLLIGIPFMIIEANNLSTQRTKIIRWHKCQKTPNSIPIEATQDENSIEVRFLE DIDDQVTFQVKDQLGNIIFQDVILTPN Prediction of potential genes in microbial genomes Time: Wed May 18 00:55:50 2011 Seq name: gi|222159352|gb|ACAB01000007.1| Bacteroides sp. D1 cont1.7, whole genome shotgun sequence Length of sequence - 12672 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 7, operones - 5 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 11 - 70 4.1 1 1 Op 1 . + CDS 90 - 1085 357 ## BT_3439 hypothetical protein + Prom 1095 - 1154 5.7 2 1 Op 2 . + CDS 1180 - 2289 471 ## DSY2015 hypothetical protein + Term 2300 - 2340 10.1 + Prom 2307 - 2366 6.7 3 2 Op 1 . + CDS 2392 - 2931 242 ## gi|237716658|ref|ZP_04547139.1| predicted protein 4 2 Op 2 . + CDS 2959 - 3684 418 ## gi|298480569|ref|ZP_06998766.1| hypothetical protein HMPREF0106_00998 5 2 Op 3 . + CDS 3765 - 4514 323 ## gi|237716660|ref|ZP_04547141.1| conserved hypothetical protein + Term 4589 - 4622 4.1 - Term 4394 - 4439 8.2 6 3 Op 1 . - CDS 4559 - 5785 792 ## BT_3434 hypothetical protein 7 3 Op 2 . - CDS 5808 - 6755 365 ## BT_3433 hypothetical protein - Prom 6816 - 6875 4.2 8 4 Tu 1 . - CDS 6942 - 7670 458 ## BT_3431 DNA repair protein - Prom 7790 - 7849 3.7 + TRNA 8190 - 8264 50.0 # Glu TTC 0 0 + Prom 8191 - 8250 79.3 9 5 Op 1 . + CDS 8304 - 8558 421 ## PROTEIN SUPPORTED gi|153809175|ref|ZP_01961843.1| hypothetical protein BACCAC_03485 + Term 8592 - 8626 4.0 + Prom 8576 - 8635 3.1 10 5 Op 2 . + CDS 8757 - 10715 2213 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit + Term 10761 - 10815 13.3 + Prom 10840 - 10899 3.6 11 6 Tu 1 . + CDS 10928 - 11137 96 ## gi|237716666|ref|ZP_04547147.1| conserved hypothetical protein - Term 10865 - 10910 5.2 12 7 Op 1 . - CDS 11134 - 11715 580 ## BVU_1376 hypothetical protein - Term 11725 - 11763 8.4 13 7 Op 2 . 
- CDS 11782 - 12657 933 ## COG0696 Phosphoglyceromutase Predicted protein(s) >gi|222159352|gb|ACAB01000007.1| GENE 1 90 - 1085 357 331 aa, chain + ## HITS:1 COG:no KEGG:BT_3439 NR:ns ## KEGG: BT_3439 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 331 218 558 558 500 75.0 1e-140 MVELIQSYGEFVLTSYYTGGRASALYYGLDTQSTDFNSKERDMDKSINATYSWNKDSVSG DFSVGTKDGNSSTETKSFSELHVSIKTLGGAYGYNVATPPYDVKSTSINLTSWLQSLNDS RTHTMIDIQDGGLYPISDFILEENFKQRYNDTHMDFQHQEQLEEPYIEIMKVYVRKSSSG EKLYDIVPVLNTRQGDKLIFSNSSAASQSDAELKANSVATTFTEKSNAIKDEKSKYYKLA IKANPNKVINPILQTTLSFSINDVDEAGMYKFKNTNTNIWYIYNPTSLYCFAYYDDDYIL DAYGILDWVNSIPVKSVSMTTLYQRYRIFGL >gi|222159352|gb|ACAB01000007.1| GENE 2 1180 - 2289 471 369 aa, chain + ## HITS:1 COG:no KEGG:DSY2015 NR:ns ## KEGG: DSY2015 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 51 356 85 402 404 152 34.0 2e-35 MKKLLLPLTILLFFGLVVSSCSSDDVLQRENNEQPNIVKRNNEKAREAGIRLLESFGMTK TRSLSTSDYPDYYGGSYINGNGKLVVFLKGEIESTKATLIRLIGENDVIYTQGNYSYTEL NNVLTKITSFISSNKDSQIAKNIRYYYLNDFENCVVVELDKSNEMEIKEFKSEVVNFSGI VFKQCTREFQNHSLSPGSSIGTPKGTASMGYRATRFNTDGFVTAGHAYNTGDPAYYNNTL IGSCDFSIQGGSVDAAFISITNFSIVPNNGNLTGEEYNIWAGDNVTKLGETIAQSEGYIT STNANVNADGIYYTDMGEATYLSAFGDSGGPVYLTDSKKVIGIHKGSTAFTSIFVKVDNI ASKLNAHFN >gi|222159352|gb|ACAB01000007.1| GENE 3 2392 - 2931 242 179 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716658|ref|ZP_04547139.1| ## NR: gi|237716658|ref|ZP_04547139.1| predicted protein [Bacteroides sp. D1] # 1 179 17 195 195 316 100.0 5e-85 MLISCHNKTGNSKSTVNLEYENSIDSLNSYKMDISHFINSIEPGETIVEEKQMQGEEFIT RENENIKVRFLLKDSCSKDGFLKTEIINNTNDTIIYISNTFLHYRESNNSWIPLVYSDNY VKNDLGYLIPSHESVKIRFFLPSQEKYPKGKYKLQLLFRNTSKSIDYYIDKIFLSNKSN >gi|222159352|gb|ACAB01000007.1| GENE 4 2959 - 3684 418 241 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|298480569|ref|ZP_06998766.1| ## NR: gi|298480569|ref|ZP_06998766.1| hypothetical protein HMPREF0106_00998 [Bacteroides sp. 
D22] # 1 241 1 241 241 442 98.0 1e-123 MNTKKIVGIFLLSLCTMCFVNCSDDDTPDDPADTITLNMLNEHNGKTYLGESNVYINEAN NFITSKNFISDVGNGAGVGANILPSLTNLTHEVAVTPGHIYQIFDKNTLINFPSGNYAIQ VETSYYQAYVVSKIVNSDMNIGAIVKYISVFPNNNGLPAYRYNIGSLYRIGETVELALPQ DIEFFLKEHSAGIKGLNVTYANNKLRITLTKAPDIVNGPYGTFDLYIRSNNVFTIVEVHV E >gi|222159352|gb|ACAB01000007.1| GENE 5 3765 - 4514 323 249 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716660|ref|ZP_04547141.1| ## NR: gi|237716660|ref|ZP_04547141.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 249 1 249 249 491 100.0 1e-137 MRIDRFLLFVLLLASSYGCSEEQPYEVEKESVSIWEGDTRHVIIETDLTEHNYKLESENQ EIATATLDEMGVCIATYKSGSTMIRLIDTDNNKIACEIYVYVIYFNSSEIINWGIPLEGH PGSPGIIIKAIDLRIPPEIEIELREKEKPFIGATYTFNQETKKFTMKTDSGIFKEGTYEW NITSLTLMYDDKTEKFGFKFATGMSHGYILQSDKTSEYQQRYPDAGITEVKVNYVWKDNE IIQLGGLTF >gi|222159352|gb|ACAB01000007.1| GENE 6 4559 - 5785 792 408 aa, chain - ## HITS:1 COG:no KEGG:BT_3434 NR:ns ## KEGG: BT_3434 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 408 1 411 411 606 70.0 1e-172 MIRNRNAIQLLCLLCLLVHPAINANLEARTIIGQRNDSVSHPLIRHQLGFDVRPGYIVST HSFLQGDNAQQKKIDQSLSFHFKYAFRFSKESNLGRLFPHTYQGIGISYHTFFSPAELGN PVSVYAFQGSRIAQLSPRLSFDYEWNFGASFGWKKYDELSNPQNVLIGSKINAYINLGFL LNWQVHKYWKLAAGVDLTHFSNGNTHYPNGGLNVIGGRIGIVRTLGVDEGLGPITPGRLF IKPHVSYDLVIYGATRKRGLIGDDVSSMIPGSFGVAGINFAPMYNFNNYFRAGLSADAQY DESANLKEYRVGEYYSGDLKFHRPPFREQFSVGLSLRAELVMPVFSINVGVGRNLIYSGD DTKGFYQVLALKTYVTRHLFLHVGYQLSKFKDPNNLMLGLGYRFHDKR >gi|222159352|gb|ACAB01000007.1| GENE 7 5808 - 6755 365 315 aa, chain - ## HITS:1 COG:no KEGG:BT_3433 NR:ns ## KEGG: BT_3433 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 311 1 314 315 285 50.0 1e-75 MKRFLYLTVCLLILGEFYGCNGDVFVDDFRPSDSELTLDGNGDVATIQFAASNWDWMYIA FGFPIQYKIYDADNRLITTDQGPSLNGLGKIVCTGEMIDFTIERVHPKEVKITVGENALS APFQFQLRASNEYEWQEIRVEISPTDRYVMDSITYSLNAYSYDPENRTERKEGVSFHNFT DVFSTYTFSPFESFYHFMRFKSDVPEAFQLLGEAGLTVEIPSMKDGYLTMNGKRAPLISA 
QQMLSCSNTEQEEDIIPPRTARTYTLWMVYDWLETRYTLYVSHPKTKKQRTVSGTLHSEM PVKYHITHRDVSLKN >gi|222159352|gb|ACAB01000007.1| GENE 8 6942 - 7670 458 242 aa, chain - ## HITS:1 COG:no KEGG:BT_3431 NR:ns ## KEGG: BT_3431 # Name: not_defined # Def: DNA repair protein # Organism: B.thetaiotaomicron # Pathway: Homologous recombination [PATH:bth03440] # 1 242 1 242 242 447 94.0 1e-124 MLQKTKGIVLHTLKYNDTSIIVDMYTELSGRASFLVTVPRSRKAAVKSVLFQPLSFIELE ADYRPNATLYRVKEAKSFYPFSSIPYDPYKSSMALFLSEFLYRAIREEAENRPLFAYLQH SIIWLDECGDGFANFHLVFLMRLSRFLGLYPNLEDYHTGDYFDLLNACFTSIRPQLHSSY INPEEAARLRQLMRMNYETMHLFGMSRAERTRCLTIMNDYYRLHLPDFPALKSLEVLKEL FD >gi|222159352|gb|ACAB01000007.1| GENE 9 8304 - 8558 421 84 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|153809175|ref|ZP_01961843.1| hypothetical protein BACCAC_03485 [Bacteroides caccae ATCC 43185] # 1 84 1 84 84 166 100 6e-41 MANHKSSLKRIRQEETRRLHNRYYGKTMRNAVRKLRATTDKAEAVAMYPGITKMLDKLAK VNIIHKNKANNLKSKLALYINKLA >gi|222159352|gb|ACAB01000007.1| GENE 10 8757 - 10715 2213 652 aa, chain + ## HITS:1 COG:MA1584 KEGG:ns NR:ns ## COG: MA1584 COG0187 # Protein_GI_number: 20090442 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Methanosarcina acetivorans str.C2A # 1 645 1 626 634 691 56.0 0 MSEEQITPNNGSYSADSIQVLEGLEAVRKRPAMYIGDISVKGLHHLVYEIVDNSIDEALA GYCDHIEVTINEDNSITVQDNGRGIPVDYHEKEKKSALEVAMTVLHAGGKFDKGSYKVSG GLHGVGMSCVNALSTHMTTQVFRNGKIYQQEYEIGKPLYSVKEVGTADITGTRQQFWPDG TIFTETVYDYKILASRLRELAYLNAGLRISLTDRRVVNEDGSFKSEQFYSEEGLREFVRF IESSREHLINDVIYLNSEKQGIPIEVAIMYNTGFSENVHSYVNNINTIEGGTHLAGFRRA LTRTLKKYAEDSKMLEKVKVEISGDDFREGLTAVISVKVAEPQFEGQTKTKLGNNEVMGA VDQAVGEVLAYYLEEHPKEAKTIVDKVILAATARHAARKAREMVQRKSPMSGGGLPGKLA DCSDKDPSKCELFLVEGDSAGGTAKQGRNRMFQAILPLRGKILNVEKAMYHKALESDEIR NIYTALGVTIGTDEDSKEANIQKLRYHKIIIMTDADVDGSHIDTLIMTFFFRYMPQVIQN GYLYIATPPLYLCKKGKVEEYCWTDAQRQKFIDTYGGGSENAIHTQRYKGLGEMNAQQLW ETTMDPENRMLKQVNIDNAAEADYIFSMLMGEDVGPRREFIEENATYANIDA >gi|222159352|gb|ACAB01000007.1| 
GENE 11 10928 - 11137 96 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716666|ref|ZP_04547147.1| ## NR: gi|237716666|ref|ZP_04547147.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 69 1 69 69 100 100.0 2e-20 MKMKTTAPNPLFAKIRLFLIVLTSCVIATCIFLIIIHPEAWIMPACGIIIALCILLYQLY CIRKNRKRI >gi|222159352|gb|ACAB01000007.1| GENE 12 11134 - 11715 580 193 aa, chain - ## HITS:1 COG:no KEGG:BVU_1376 NR:ns ## KEGG: BVU_1376 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 193 1 193 193 334 87.0 8e-91 MIQIGDVVVSLDVFQEKFLCDLGACKGACCIEGDAGAPVELDEVMELEEVLPVIWDELAP EARAVIEKQGVVYTDEEGDLVTSIVNNKDCVFTCYDEKGCCYCAIEKAYREGKTDFYKPV SCHLYPIRIGDYGPYKAVNYNRWDVCKAAVLLGKKENLPVYQFLKEPLIRKFGEEWYKEL VTVAEELKKQQYI >gi|222159352|gb|ACAB01000007.1| GENE 13 11782 - 12657 933 291 aa, chain - ## HITS:1 COG:MA4007 KEGG:ns NR:ns ## COG: MA4007 COG0696 # Protein_GI_number: 20092802 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglyceromutase # Organism: Methanosarcina acetivorans str.C2A # 4 290 228 518 521 289 50.0 3e-78 MVQAMQESYDEGVTDEFIKPIVNANCDGTIKEGDVVIFFNYRNDRAKELTVVLTQQDMPE VGMHTIPGLQYYCMTPYDASFKGVHILFDKENVSNTLGEYLASKGLSQLHIAETEKYAHV TFFFNGGRETPFDKEERILVPSPKVATYDLKPEMSAFEVKDKLVAAINENKYDFIVVNFA NGDMVGHTGIYEAIEKAVVAVDACVKDVIEAAKAQDYEAIIIADHGNADHALNEDGTPNT AHSLNPVPCVYVTENKAAKVEDGRLADVAPTILKIMGLEVPAEMDGNVLIK Prediction of potential genes in microbial genomes Time: Wed May 18 00:56:50 2011 Seq name: gi|222159351|gb|ACAB01000008.1| Bacteroides sp. D1 cont1.8, whole genome shotgun sequence Length of sequence - 1694 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 624 784 ## COG0696 Phosphoglyceromutase - Prom 847 - 906 6.3 + Prom 795 - 854 4.1 2 2 Tu 1 . 
+ CDS 972 - 1692 557 ## COG0598 Mg2+ and Co2+ transporters Predicted protein(s) >gi|222159351|gb|ACAB01000008.1| GENE 1 3 - 624 784 207 aa, chain - ## HITS:1 COG:MA4007 KEGG:ns NR:ns ## COG: MA4007 COG0696 # Protein_GI_number: 20092802 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglyceromutase # Organism: Methanosarcina acetivorans str.C2A # 6 207 15 217 521 218 51.0 5e-57 MSKKALLMILDGWGLGDQKKDDVIFNTPTPYWDYLMSTYPHSQLQASGENVGLPDGQMGN SEVGHLNIGAGRVVYQDLVKINRACADNSIMKNPEIVSAFSYAKENGKNIHFMGLTSNGG VHSSLVHLFKLCDIAKEYNINNAFIHCFMDGRDTDPKSGKGFIEELSAHCEKSAGKIASI IGRYYAMDRDKRWERVKEAYDLLVNGV >gi|222159351|gb|ACAB01000008.1| GENE 2 972 - 1692 557 240 aa, chain + ## HITS:1 COG:CAC0294 KEGG:ns NR:ns ## COG: CAC0294 COG0598 # Protein_GI_number: 15893586 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Clostridium acetobutylicum # 19 240 24 245 315 160 39.0 2e-39 MRTYLYCEAGFVEKAQWLPNSWVNVVCPNNDDFEFLTQTLNVPESFLNDIADTDERPRTD TEGNWLLTILRIPVVNKQNGSLPFDTVPIGIITNNEIIVSVCYHNTDLLPDFIEHTRRKG IVVRNKLDLIFRLIYSSAVWFLKYLKQINIDISAAEKELERSIRNEDLLRLMRLQKTLVY FNTSIRGNEVMIGKLRTIFQDTDYLDTELVEDVIIELKQALNTVNIYSDILTGTMDAFAS Prediction of potential genes in microbial genomes Time: Wed May 18 00:57:04 2011 Seq name: gi|222159350|gb|ACAB01000009.1| Bacteroides sp. D1 cont1.9, whole genome shotgun sequence Length of sequence - 55796 bp Number of predicted genes - 43, with homology - 43 Number of transcription units - 24, operones - 8 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 25 - 207 159 ## BT_3414 Mg2+/Co2+ transporter 2 2 Tu 1 . - CDS 344 - 1885 1291 ## BT_3413 hypothetical protein - Prom 1924 - 1983 4.6 3 3 Op 1 . - CDS 2066 - 2668 478 ## COG0164 Ribonuclease HII - Prom 2698 - 2757 1.5 - Term 2696 - 2738 5.5 4 3 Op 2 . - CDS 2759 - 4963 2287 ## COG3808 Inorganic pyrophosphatase - Prom 4990 - 5049 6.2 - Term 5534 - 5572 0.6 5 4 Tu 1 . 
- CDS 5580 - 5780 131 ## BT_3134 two-component system sensor histidine kinase/response regulator, hybrid ('one-component system') + Prom 6069 - 6128 2.6 6 5 Tu 1 . + CDS 6227 - 8416 2304 ## COG3345 Alpha-galactosidase 7 6 Tu 1 . - CDS 8525 - 8851 459 ## COG0393 Uncharacterized conserved protein - Prom 8953 - 9012 3.6 - Term 9097 - 9133 7.1 8 7 Op 1 24/0.000 - CDS 9142 - 10353 1230 ## COG0520 Selenocysteine lyase 9 7 Op 2 41/0.000 - CDS 10373 - 11716 1440 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component 10 7 Op 3 41/0.000 - CDS 11730 - 12512 211 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 - Prom 12533 - 12592 3.7 11 7 Op 4 . - CDS 12598 - 14100 1420 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component 12 7 Op 5 . - CDS 14078 - 14608 499 ## BT_3405 hypothetical protein - Prom 14644 - 14703 5.7 13 8 Tu 1 20/0.000 - CDS 14812 - 17901 3708 ## COG0532 Translation initiation factor 2 (IF-2; GTPase) - Term 17940 - 18003 10.8 14 9 Op 1 . - CDS 18007 - 19281 595 ## PROTEIN SUPPORTED gi|17988250|ref|NP_540884.1| transcription elongation factor NusA 15 9 Op 2 . - CDS 19284 - 19751 561 ## BT_3402 hypothetical protein - Prom 19905 - 19964 4.7 + Prom 19699 - 19758 7.3 16 10 Tu 1 . + CDS 19924 - 20319 512 ## BT_3400 hypothetical protein 17 11 Tu 1 . - CDS 20477 - 21175 518 ## COG1451 Predicted metal-dependent hydrolase - Prom 21279 - 21338 6.2 18 12 Tu 1 . + CDS 21305 - 22087 619 ## BT_3398 hypothetical protein + Term 22119 - 22165 8.1 - Term 22107 - 22153 5.1 19 13 Op 1 . - CDS 22270 - 22749 470 ## BT_3397 hypothetical protein 20 13 Op 2 . - CDS 22739 - 23248 266 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 21 13 Op 3 . - CDS 23310 - 24083 731 ## COG0548 Acetylglutamate kinase 22 13 Op 4 . - CDS 24140 - 26032 1852 ## COG1166 Arginine decarboxylase (spermidine biosynthesis) 23 13 Op 5 . 
- CDS 26102 - 27172 1101 ## BF0195 hypothetical protein 24 13 Op 6 . - CDS 27182 - 28948 1387 ## COG0326 Molecular chaperone, HSP90 family 25 13 Op 7 . - CDS 28958 - 31456 2866 ## COG0790 FOG: TPR repeat, SEL1 subfamily - Prom 31510 - 31569 8.9 26 14 Tu 1 . - CDS 31593 - 32120 471 ## COG0703 Shikimate kinase - Prom 32153 - 32212 3.9 - Term 32160 - 32212 -0.9 27 15 Tu 1 . - CDS 32236 - 32841 480 ## COG3560 Predicted oxidoreductase related to nitroreductase - Prom 32984 - 33043 5.4 + Prom 32811 - 32870 8.9 28 16 Op 1 . + CDS 32977 - 33612 558 ## COG3341 Predicted double-stranded RNA/RNA-DNA hybrid binding protein 29 16 Op 2 . + CDS 33642 - 35414 1375 ## COG1807 4-amino-4-deoxy-L-arabinose transferase and related glycosyltransferases of PMT family + Term 35432 - 35494 7.2 - Term 35183 - 35221 -1.0 30 17 Op 1 . - CDS 35401 - 37380 1169 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily 31 17 Op 2 . - CDS 37427 - 37846 262 ## BT_3390 hypothetical protein 32 17 Op 3 3/0.000 - CDS 37830 - 38786 799 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 33 17 Op 4 . - CDS 38783 - 39586 693 ## COG0726 Predicted xylanase/chitin deacetylase 34 17 Op 5 . - CDS 39616 - 41451 216 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein - Prom 41474 - 41533 4.9 - Term 41518 - 41562 1.0 35 18 Op 1 . - CDS 41572 - 42696 661 ## COG1672 Predicted ATPase (AAA+ superfamily) - Prom 42717 - 42776 2.6 - Term 42738 - 42766 -0.9 36 18 Op 2 . - CDS 42778 - 44862 1364 ## COG5545 Predicted P-loop ATPase and inactivated derivatives - Prom 44882 - 44941 3.0 + Prom 45386 - 45445 7.3 37 19 Tu 1 . + CDS 45466 - 46077 342 ## BT_3384 hypothetical protein + Term 46129 - 46178 5.1 - Term 46116 - 46166 9.0 38 20 Tu 1 . - CDS 46247 - 47800 1170 ## BT_1828 hypothetical protein - Prom 47825 - 47884 4.4 39 21 Op 1 . - CDS 47983 - 49536 1201 ## BT_1828 hypothetical protein 40 21 Op 2 . 
- CDS 49582 - 50727 683 ## COG1373 Predicted ATPase (AAA+ superfamily) - Prom 50963 - 51022 4.5 + Prom 51058 - 51117 6.0 41 22 Tu 1 . + CDS 51147 - 51632 273 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Prom 51775 - 51834 4.9 42 23 Tu 1 . + CDS 51862 - 54267 1161 ## gi|262405490|ref|ZP_06082040.1| predicted protein - Term 54600 - 54634 -0.5 43 24 Tu 1 . - CDS 54743 - 55615 135 ## Fjoh_0316 hypothetical protein - Prom 55688 - 55747 9.2 Predicted protein(s) >gi|222159350|gb|ACAB01000009.1| GENE 1 25 - 207 159 60 aa, chain + ## HITS:1 COG:no KEGG:BT_3414 NR:ns ## KEGG: BT_3414 # Name: not_defined # Def: Mg2+/Co2+ transporter # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 60 250 309 309 100 81.0 1e-20 MKRMTSLSIVLMLPTLIASFYGMNVDIHLEEVPFAFSLIVLFSIGLSTLAFVIFRKIKWF >gi|222159350|gb|ACAB01000009.1| GENE 2 344 - 1885 1291 513 aa, chain - ## HITS:1 COG:no KEGG:BT_3413 NR:ns ## KEGG: BT_3413 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 513 1 485 485 523 55.0 1e-147 MATNKLLWSSKIIGVFFMMLVCTLSANAQFLRTSYFMEGTHYRQQLNPALTPTKGYFNLP VIGAVNATVGSTSLGYQDIIDIIDDGDDFYTKPDFMNRLKDNNKLNVNFSTEILSAGWYK GKNFWSFNIGLRTDIGANLTKSMFTFLNEMDGIEDNWRNSNYDISNQQLNINAYAEVGLG LSRQINSRLTVGARVKALLGIGNMDLKLNNVAMNASLPSDARISQLQDPAYLSGLGVDDI TKLRSEIESYHANLAVDAHLESSFKGLNLKKEDGQDYISDFEFESKDMGIAGYGFGIDLG ASYKIMDNLTVSASILDLGFISWSKSSTQIANAKASGIDMKGSDYTSGIDPSDIPRSITA IENNIKNLQTDANGYMERVSGGDVLDYEMLQLRTEEASKSRKSRLASTLVIGAEYGFFNN KLAVGALSTTRFVQPDALTELTFSANYRPKSWFNVALSYSVIQSAGKSFGLGLKLGPLFV GTDYMFLGKNSNSVNGFVGVSIPLGGRKANKEG >gi|222159350|gb|ACAB01000009.1| GENE 3 2066 - 2668 478 200 aa, chain - ## HITS:1 COG:NMA0075 KEGG:ns NR:ns ## COG: NMA0075 COG0164 # Protein_GI_number: 15793104 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HII # Organism: Neisseria meningitidis Z2491 # 9 196 2 193 194 179 51.0 2e-45 MLLPYLNENLIEAGCDEAGRGCLAGAVYAAAVILPKGFKNELLNDSKQLSEKQRYALREV 
IEKEAIAWAVGVVLPEEIDEINILRASFLAMHRAVDQLTTRPQHLLIDGNRFTKYPGIPH TTVVKGDGKYLSIAAASILAKTYRDDYMNRLHEEFPYYDWDHNKGYPTKKHRAAIAERGT TPYHRMTFNLLGDGQLTLNF >gi|222159350|gb|ACAB01000009.1| GENE 4 2759 - 4963 2287 734 aa, chain - ## HITS:1 COG:MA3879 KEGG:ns NR:ns ## COG: MA3879 COG3808 # Protein_GI_number: 20092675 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase # Organism: Methanosarcina acetivorans str.C2A # 3 727 11 683 685 528 48.0 1e-149 MDNILFWLVPVASVLALCFAYYFHKQMMKEGEGTPQMIKIAAAVRKGAMSYLKQQYKIVG WVFLGLVILFSIMAYGFHVQNAWVPIAFLTGGFFSGLSGFLGMKTATYASARTANAARTS LNAGLRIAFRSGAVMGLVVVGLGLLDISFWYLLLNAVIPADALTPTHKLCVITTTMLTFG MGASTQALFARVGGGIYTKAADVGADLVGKVEAGIPEDDPRNPATIADNVGDNVGDVAGM GADLYESYCGSILATAALGAAAFIHSADTVMQFKAVIAPMLIAAVGIILSIIGIFAVRTK ENATMKDLLGSLAFGTNLSSVLIVAATFLILWLLQLDNWIWISCAVVVGLLVGIIIGRST EYYTSQSYRPTQKLSESGKTGPATVIISGIGLGMLSTAIPVIAVVIGIIASYLLASAGDF ANVGMGLYGIGIAAVGMLSTLGITLATDAYGPIADNAGGNAEMSGLGAEVRKRTDALDSL GNTTAATGKGFAIGSAALTGLALLASYIEEIRIGLTRLGNVDLTFADGSSISVANATFID FMDYYEVHLMNPKVLSGMFLGSMMAFLFCGLTMNAVGRAAGHMVDEVRRQFREIKGILTG EAEPDYERCVAISTKGAQREMVVPSLIAIIAPILTGLIFGVPGVLGLLIGGLSSGFVLAI FMANAGGAWDNAKKYVEEGNFGGKGGEVHKATVVGDTVGDPFKDTSGPSLNILIKLMSMV AIVMAGLTVAWSLF >gi|222159350|gb|ACAB01000009.1| GENE 5 5580 - 5780 131 66 aa, chain - ## HITS:1 COG:no KEGG:BT_3134 NR:ns ## KEGG: BT_3134 # Name: not_defined # Def: two-component system sensor histidine kinase/response regulator, hybrid ('one-component system') # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 64 1187 1248 1271 72 54.0 6e-12 MALTDLSSLDFIRIIRLKHAAELLQEGELRTNEICDRVEFQSPSYFAKVFQKQFKVTSTE FAQQNK >gi|222159350|gb|ACAB01000009.1| GENE 6 6227 - 8416 2304 729 aa, chain + ## HITS:1 COG:BH2223 KEGG:ns NR:ns ## COG: BH2223 COG3345 # Protein_GI_number: 15614786 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidase # Organism: Bacillus halodurans # 27 720 14 724 748 341 30.0 4e-93 
MKKLFVLLTALFTLAQVNAQEKNVIRIATDNTDLILQVAPNGRLYQAYLGDKLLNEKDIN NFSPYVKGGSDGSVSTRGWEVYPGSGAEDYFEPAVAITHNDGNPSTILRYVSSEQKAVAG GTETTIQLKDDQYPVEVTLHYIAYPKENVIKTWSEIKHAEKKPVTIWRYASTMLYFSGNE YYLTEFSSDWAKEAQMSTQPLLFGKKVIDTKLGSRAAMHTHPFFELGFEQPAQETQGRAM LGTIGWTGNFQFTFEVDNVGNLRVIPAINPYASDYELKPNEVFTTPEFIFTFSNNGTGEA SRNLHAWARNHQLKDGQGDRMTLLNNWENTYFKFDEKLLAELMKEAKHLGVDMFLLDDGW FGNKHPRNSDNAGLGDWEVMKSKLPGGIPALVKSAKEAGVKFGIWIEPEMINPKSELFEK HPDWAITLPNRETYYYRNQLVLDLSNPKVQDFVFGVVDNILTANPEVAFFKWDCNSPITN IYSPYLKNKQGQLYIDHVRGIYNVLERIKDKYPNVPMMLCSGGGARCDYEALKYFTEFWC SDNTDPIERLFIQWGFSQIFPAKAMDAHVTSWNKKTSVKFRTDVASMCKLGFDLGLKELN ADEQTFCQNAVANWTRLKKVILDGDQYRLVSPYDGNHMSVMYAAPDKNKAVLYTYDIHPR FGEKLLPVKLQGLDAKKMYKVKEINLMPNSKSNLAANEKTYSGDYLMKVGINAFTTNQAF SRVIELTAE >gi|222159350|gb|ACAB01000009.1| GENE 7 8525 - 8851 459 108 aa, chain - ## HITS:1 COG:ECs0952 KEGG:ns NR:ns ## COG: ECs0952 COG0393 # Protein_GI_number: 15830206 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 104 1 104 107 123 61.0 8e-29 MLMTTTPIIEGKRIVKYYGIVSGETIIGANVFRDLFASIRDIVGGRSGSYEEVLRMAKDT ALKEMEAQAAQMGANAVIGVDLDYETVGGSGSMLMVTANGTAVKIEED >gi|222159350|gb|ACAB01000009.1| GENE 8 9142 - 10353 1230 403 aa, chain - ## HITS:1 COG:mlr0021 KEGG:ns NR:ns ## COG: mlr0021 COG0520 # Protein_GI_number: 13470346 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: Mesorhizobium loti # 2 403 11 412 413 455 51.0 1e-128 MDIQKIREDFPILSRTVYGKPLVYFDNGATTQKPRLVVDALVDEYYSVNANVHRGVHYLS QQATELHEASRETVRQFINARSTNEVVFTRGTTESINLLVSSFGDEFMEEGDEVILSVME HHSNIVPWQLLAARKGIAIKVIPMNDKGELLLDEYEKLFSERTKIVSVVHVSNVLGTVNP VKEMIATAHAHGVPCLIDAAQSIPHMKVDVQELDADFLVFSAHKIYGPTGVGVLYGKEEW LDRLPPYQGGGEMIQHVSFEKTTFNELPFKFEAGTPDYIGTTGLAKALDYVNGHGLEQIA AHEHELTTYALQRLKEIPQMRIFGEAAERGAVISFLVGDIHHFDLGTLLDRLGIAVRTGH HCAQPLMQRLGIEGTVRASFAMYNTKSEIDTLVAGIERVSKMF >gi|222159350|gb|ACAB01000009.1| GENE 9 10373 - 11716 1440 447 aa, chain - ## HITS:1 COG:alr2494 KEGG:ns NR:ns 
## COG: alr2494 COG0719 # Protein_GI_number: 17229986 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Nostoc sp. PCC 7120 # 43 425 62 440 453 185 32.0 1e-46 MIVEQQYIDLFSQTEAMICKHSAEVLNAPRAAAFADFERLGFPTRKMEKYKYTDVSKYFE PDYGLNLNRLAIPVNPYEVFKCDVPNMSTALYFVVNDAFYNRALPKVNLPEGVIFGSLKE VAGQHPELVKKYYGQLADTSKDGVTAFNTAFAQDGVVFYVPKNVVVEKPIQLVNILRADV NFMVNRRVLIILEDGAQARLLICDHAMDNVNFLATQVIEVFAGENTVFDMYELEETHTST VRISNLYVKQEANSNVLLNGMTLHNGTTRNTTEVLLAGEGAEINLCGMAIADKNQHVDNH TSIDHAVPNCTSNELFKYVLDDQSVGAFAGLVLVRPDAQHTNSQQTNRNLCATRDARMYT QPQLEIYADDVKCSHGATVGQLDEGALFYMRSRGIAEKEARLLLMFAFVNEVIDTIRLEA LKDRLHLLVEKRFRGELNRCQGCAICK >gi|222159350|gb|ACAB01000009.1| GENE 10 11730 - 12512 211 260 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 1 245 1 232 318 85 28 5e-16 IVNGKIVHIIMLEIKDLHASINGKEILKGINLTVNPGEVHAIMGPNGSGKSTLSSVLVGN PAFEVTKGSVTFYGKNLLELSPEDRSHEGIFLSFQYPVEIPGVSMVNFMRAAVNEQRKYK GLPALTASEFLKLMREKRAVVELDNKLANRSVNEGFSGGEKKRNEIFQMAMLEPRLSILD ETDSGLDIDALRIVAEGVNKLKTPETSTIVITHYQRLLDYIKPDIVHVLYKGRIVKTAGP ELALELEEKGYDWIKKEVGE >gi|222159350|gb|ACAB01000009.1| GENE 11 12598 - 14100 1420 500 aa, chain - ## HITS:1 COG:SMc00530 KEGG:ns NR:ns ## COG: SMc00530 COG0719 # Protein_GI_number: 15965488 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Sinorhizobium meliloti # 26 500 11 489 489 713 70.0 0 MQQEEPNKMKPNQKDELEKKTDNEFVRKFAEEKYKYGFTTEVHTDIIERGLNEDVIRLIS SKKDEPEWLLEFRLKAYRHWLTLEMPTWAHLRIPEIDYQAISYYADPTKKKEGPKSMEEV DPELIKTFNKLGIPLEEQMALSGMAVDAVMDSVSVKTTFKETLMEKGIIFCSFSEAVREH PDLVKKYMGSVVGYRDNFFAALNSAVFSDGSFVYIPKGVRCPMELSTYFRINARNTGQFE RTLIVADDDSYVSYLEGCTAPMRDENQLHAAIVEIIVHDRAEVKYSTVQNWYPGDAEGKG GVYNFVTKRGNCKGVDSKLSWTQVETGSAITWKYPSCILTGDNSTAEFYSVAVTNNYQQA 
DTGTKMIHLGKNTRSTIVSKGISAGHSENSYRGLVRVAEKADNARNYSQCDSLLLGDKCG AHTFPYMDIHNETAVVEHEATTSKISEDQIFYCNQRGIPTEDAIGLIVNGYAKEVLNKLP MEFAVEAQKLLTISLEGSVG >gi|222159350|gb|ACAB01000009.1| GENE 12 14078 - 14608 499 176 aa, chain - ## HITS:1 COG:no KEGG:BT_3405 NR:ns ## KEGG: BT_3405 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 176 1 176 176 256 88.0 3e-67 MATIDIIILIVIGAGAVVGFVKGFIRQLASILGLIVGLLAAKTLYASLAEKLCPTVTDSM TVAQVLAFIMIWIAVPLIFVLIASLLTKAMQAISLNWLNRWLGSGLGALKFLLLTSVVIG AIEFVDSDNKLISATKKEESLLYYPMETFAGIFFPAAKNMTQQYILENKDATRRTQ >gi|222159350|gb|ACAB01000009.1| GENE 13 14812 - 17901 3708 1029 aa, chain - ## HITS:1 COG:BH2413 KEGG:ns NR:ns ## COG: BH2413 COG0532 # Protein_GI_number: 15614976 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 2 (IF-2; GTPase) # Organism: Bacillus halodurans # 456 1028 157 730 730 538 52.0 1e-152 MTIRLNKVTRDLNVGIATVVEFLQKKGHTVEANPNTKISEEQYAILVKEFSTDKNLRLES ERFIQERQNKERNKASVSIEGFEKQPEKQKSEDVIKTVVPEDARPKFKPVGKIDLDKLNG RKPEKVEKEPEQKQEVPVVERPVVKPEVKKEPEKREPEVKKEEVVIPPVSVVEPTPVVVE PVVVPEPVVETKPVEVEKVVEEVKKEEPKVVETAPVKAEEHKEEKKVETAQAEVTPVAEK APEDDGVFKIRQPELGAKINVIGQIDLAALNQSTRPKKKSKEEKRREREEKEKIRQDQKK LMKEAIIKEIRKDDSKVVKNGPKESADAAKKKRNRINKEKVDVNNVATSNFAAPRPNIQG KGGNGNNNGQGGQGNNNRRNNNNNKDRFKKPVIKQEVSEEDVAKQVKETLARLTTKGKNK TSKYRKEKREMASNRMQELEDQEMADSKVLKLTEFVTANELATMMDVSVNQVIATCMSIG IMVSINQRLDAETINLVAEEFGFKTEYVSAEVAQAIVEEEDAPEDLQPRAPIVTVMGHVD HGKTSLLDYIRKANVIAGEAGGITQHIGAYNVQLEDGRRITFLDTPGHEAFTAMRARGAK VTDIAIIIVAADDNVMPQTKEAINHAMAAGVPIVFAINKVDKPTANPDKIKEELAAMNYL VEEWGGKYQSQDISAKKGMGVEDLLEKVLLEAEMLDLKANPDRNATGSIIESSLDKGRGY VATVLVSNGTLKVGDIVLAGTSYGRVKAMFNERNQRIKEAGPSEPALVLGLNGAPAAGDT FHVVESDQEAREITNKREQLAREQGLRTQKILTLDELGRRIALGNFQELNIIVKGDVDGS VEALSDSLIKLSTEQIQVNVIHKGVGAISESDVSLAAASDAIIVGFQVRPSGAAAKMAEQ EGVDIRKYSVIYDAIEEVKAAMEGMLAPELKEQITATIEIREVFNITKVGLVAGAMVKTG KVKRSDKARLIRDGIVIFTGNINALKRFKDDVKEVGTNFECGISLVNCNDMKVGDMIETF EEIEVKQTL 
>gi|222159350|gb|ACAB01000009.1| GENE 14 18007 - 19281 595 424 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|17988250|ref|NP_540884.1| transcription elongation factor NusA [Brucella melitensis 16M] # 1 352 1 349 537 233 36 1e-60 MAKKEETISLIDTFSEFKELKNIDRTTMVSVLEESFRSVIAKMFGTDENYDVIVNPDKGD FEIWRNREVVADEDLTNPNMQISLSEAQKIDASYEEGEEVTDEVIFAKFGRRAILNLRQT LASKILELEKDSIYNKYIDKVGTIINAEVYQIWKKEMLLLDDEGNELLLPKTEQIPSDFY RKGETARAVVARVDNKNNNPKIILSRTSPVFLQRLFEMEVPEINDGLITIKKIARIPGER AKIAVESYDDRIDPVGACVGVKGSRIHGIVRELRNENIDVINYTSNISLFIQRALSPAKI SSIRLNEEEKKAEVFLKPEEVSLAIGKGGLNIKLASMLTEYTIDVFRELDESVADEDIYL DEFRDEIDGWVIDAIKAIGIDTAKAVLNAPREMLIEKTDLEEETVDEVIRILKSEFEEGE EIEN >gi|222159350|gb|ACAB01000009.1| GENE 15 19284 - 19751 561 155 aa, chain - ## HITS:1 COG:no KEGG:BT_3402 NR:ns ## KEGG: BT_3402 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 155 1 155 155 251 96.0 5e-66 MIEKKTVCQIVDEWLEGKDYFLVEVTVSPDDKIVVEIDHAEGVWIEDCVELSRYIESKLN REEEDYELEVGSAGIGQPFKVLQQYYIHIGQEVEVLTKDGRKLAGILKDADEEKFTVAVQ KKVKLEGSKRPKLQEEDETFTYEQIKYTKYLISFK >gi|222159350|gb|ACAB01000009.1| GENE 16 19924 - 20319 512 131 aa, chain + ## HITS:1 COG:no KEGG:BT_3400 NR:ns ## KEGG: BT_3400 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 131 1 131 131 207 83.0 1e-52 MAHRLNTNKQFMVGNGILAFAVIFVVVIFVYMSMRLQREKQEERHFIESYTISLVKGFAG DSISLFVNDSLISNKTISEEPYTVEVGRFAEQSALLIVDNNTELVSTFDLSEKGGAYQFE KESDGIKQLAK >gi|222159350|gb|ACAB01000009.1| GENE 17 20477 - 21175 518 232 aa, chain - ## HITS:1 COG:RSc0521 KEGG:ns NR:ns ## COG: RSc0521 COG1451 # Protein_GI_number: 17545240 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Ralstonia solanacearum # 114 231 175 283 288 77 35.0 2e-14 MDKIIEDDELGRLVIRVNSRARSLVFRTKSDAVYVSVPPGTTLKEVKQAIENLRGKLLAS RQKIARPLIDLNYKIDAEHFKLSLVSGEKDQFLANSRLGVMEIVCPPHADFTDEKLQSWL HKVIEESLRRNAKSILPSRLAFLSKQCGLPYSSVKINSSQGRWGSCSARKAINLSYYLVL 
LPSHLIDYVLLHELCHTREMNHSERFWALLNQFTEGKALILRGELRKYRTEI >gi|222159350|gb|ACAB01000009.1| GENE 18 21305 - 22087 619 260 aa, chain + ## HITS:1 COG:no KEGG:BT_3398 NR:ns ## KEGG: BT_3398 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 259 1 260 261 261 55.0 2e-68 MKTSLLLVLVTGMFLVLTAGSCLQGKHIVGSKNYISKEVKADHFNEIKLVGSANISYWQD TCSHVEIHGSDNIIPLIETYVEGSTFIIKFKKNVTIWKGKLEIKVFAPELNKLSVNGSGN IRLINGIQTSKDIEFHINGSGNIQGEGFNCQKMAVSINGSGDVRLQQIESQECQASISGS GNINLNGKAIQASYSIAGSGNIQAADLQAENTDASISGSGNISCYASQKLVARVKGSGDI AYKGDPQEVDAPRKNIRQIK >gi|222159350|gb|ACAB01000009.1| GENE 19 22270 - 22749 470 159 aa, chain - ## HITS:1 COG:no KEGG:BT_3397 NR:ns ## KEGG: BT_3397 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 159 1 158 158 265 89.0 3e-70 MDYKDIEQLLERYWQCETSVEEEATLRDFFAKEEVPAHLLRYKNLFVYQQVQQEVGLGED FDARILAEVEPTVVKAKRLTLTGRFIPLFKAAAVIAIILSLGNVAQHSFSGDDGSVLATD TIGKQVTAPSVAISNDVKAEQVLADSLAKINHKVQVINE >gi|222159350|gb|ACAB01000009.1| GENE 20 22739 - 23248 266 169 aa, chain - ## HITS:1 COG:MT1259 KEGG:ns NR:ns ## COG: MT1259 COG1595 # Protein_GI_number: 15840665 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mycobacterium tuberculosis CDC1551 # 15 168 93 247 257 65 28.0 4e-11 MQEISFRNDILPLKDKLFRLALRITLDRAEAEDVVQDTMIRVWNKRDEWSQFESVEAYCL TVAKNLAIDRSQKKEAQNVELTPEMEEEPDANSPYDQMIHDERMNIINRLVNELPEKQRL IMQLRDIEGESYKKIATLLNLTEEQVKVNLFRARQKVKQRYLEIDEYGL >gi|222159350|gb|ACAB01000009.1| GENE 21 23310 - 24083 731 257 aa, chain - ## HITS:1 COG:MK1631 KEGG:ns NR:ns ## COG: MK1631 COG0548 # Protein_GI_number: 20095067 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate kinase # Organism: Methanopyrus kandleri AV19 # 5 256 1 246 246 155 39.0 7e-38 MREKLTVIKVGGKIVEEEATLRQLLNDFAAIDGHKVLVHGGGRSATKIAAQLGIESKMVN GRRITDAETLKVVTMVYGGLVNKNIVAGLQARGVNALGLTGADMNVIRSVKRPVKDVDYG 
FVGDVEQVDATLLSDLIHKGVVPVMAPLTHDGHGNMLNTNADTIAGETAKALSALFDVTL VYCFEKKGVLRDENDDDSVIPQITRAEFGQYVADGVIQGGMIPKLENSFEAINAGVSEVV ITLASAINDSGGTRIKK >gi|222159350|gb|ACAB01000009.1| GENE 22 24140 - 26032 1852 630 aa, chain - ## HITS:1 COG:all3401 KEGG:ns NR:ns ## COG: all3401 COG1166 # Protein_GI_number: 17230893 # Func_class: E Amino acid transport and metabolism # Function: Arginine decarboxylase (spermidine biosynthesis) # Organism: Nostoc sp. PCC 7120 # 2 629 51 677 679 610 46.0 1e-174 MRKWRIEDSEELYNITGWGTSYFGINDKGHVVVTPRKDGVSVDLKELVDELQLRDVASPM LVRFPDILDNRIEKMSSCFKQAAEEYGYKAQNFIIYPIKVNQMRPVVEEIISHGKKFNLG LEAGSKPELHAVIAVNTDSDSLIVCNGYKDESYIELALLAQKMGKRIYLVVEKMNELKLI AKMAKQLNVQPNIGIRIKLASSGSGKWEESGGDASKFGLTSSELLEALDFLESKGLKDCL KLIHFHIGSQVTKIRRIKTALREASQFYVQLHSMGFNVEFVDIGGGLGVDYDGTRSSNSE GSVNYSIQEYVNDSISTLVDVSDKNGIPHPNIITESGRALTAHHSVLIFEVLETATLPEW DDEEEIAPDAHELVQELYGIWDSLNQNKMLEAWHDAQQIREEALDLFSHGIVDLKTRAQI ERLYWSITREINQIAAGLKHAPDEFRGLSKLLADKYFCNFSLFQSLPDSWAIDQIFPIMP IQRLDEKPERSATLQDITCDSDGKIANFISTRNVAHYLPVHALKKTEPYYVAVFLVGAYQ EILGDMHNLFGDTNAVHVSVNEKGYNIEQIIDGETVAEVLDYVQYNPKKLVRTLETWVTK SVKEGKISLEEGKEFLSNYRSGLYGYTYLE >gi|222159350|gb|ACAB01000009.1| GENE 23 26102 - 27172 1101 356 aa, chain - ## HITS:1 COG:no KEGG:BF0195 NR:ns ## KEGG: BF0195 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 350 1 350 359 548 81.0 1e-154 MRYTLEIQKLLLQTQNDSLHPREKANLLKEAIRIADENEDVEWATELRLDLIYELNLLSA DAEEITVFSKILDDYENHKDVIKEDDLLWKYKWIWACTFDLPEIPMEQVKAIGEDYKTRI LRNGYSLRSYYHRWSVECVWMRQYDKAKEYIDKMLNEKIDGQSCEACELNFMLDYYLETG QFDEAYSRAQPLINKQVTCYEANLRAYLKLSYYAQKAGKPEVAADMCARAEEALQGREKD EYLLLYLGLFIAYNIMTKPERGWEYAERCIGWSLRTNTLKKYRFSCDMVEALKYETRPEV SLSLPEEFPLYRPDGIYQVSELRNYFYQQAEELALRYDARNENSGYMDRLKQLMSN >gi|222159350|gb|ACAB01000009.1| GENE 24 27182 - 28948 1387 588 aa, chain - ## HITS:1 COG:lin0941 KEGG:ns NR:ns ## COG: lin0941 COG0326 # Protein_GI_number: 16800010 # Func_class: O Posttranslational modification, protein turnover, chaperones # 
Function: Molecular chaperone, HSP90 family # Organism: Listeria innocua # 9 534 8 556 564 305 33.0 2e-82 MEKEGNNLFQVNLKGMIALLSEHIYSNPNTFVRELLQNCVDAITALRNIDENYKGRIDVF LNEDKTVVFRDNGVGLKEEEVYRFLTVIGESSKRDTPDADDFIGRFGIGLLSCFVVTNEI TVESRSAMGGQPVCWCGKVDGTYQLTLSDEERPIGSQVVLHPKGDWMHLFEYETFKKILV SYGEVLPYPIYLHYQGEEELVNTPSPVWLDPKATRKELLDYGAKVFQSSALDAFRIYTES GKVEGVLYVLPFRTQFSVRNSHKVYLKRMLLSEDDCNLLPPWAFFIRCLVNADGLLSTAS RESLVSNDQLKDARKEIGVAIKDYLRGLVQNDRAMFNRILDVHHFHIKAIASEDNELLRL FMDYLPFETNKGLRGFGSIRSASNVICYTKNLEDFRQVRRIAGAQGWLVVNAAYTFDETL LKKYVRLNPELTLDEISPSRLLEQFGEVEANKEFQAFEMKANELLKRFGCVCRLKHFTPV DIPVIFVAEEKENVAKSANNPLAAVLGAVNTAKQIPPTLTFNADNEMVQTLLQIQGDNKL FQHVVHILYVQSLLQGKYPVNSEEMELFNRSLSELMTSKMNDFINFLN >gi|222159350|gb|ACAB01000009.1| GENE 25 28958 - 31456 2866 832 aa, chain - ## HITS:1 COG:ECU11g0430 KEGG:ns NR:ns ## COG: ECU11g0430 COG0790 # Protein_GI_number: 19074843 # Func_class: R General function prediction only # Function: FOG: TPR repeat, SEL1 subfamily # Organism: Encephalitozoon_cuniculi # 304 759 119 575 590 156 29.0 1e-37 MKTLKEKFGELSARIKASGQPARVWFPQYTPASLLNAENWWEALAVCEYALDTKEDEKLI EDFFELIFSAFDCNVEVDLNAEEYEFWWEKVMQVCDRVAVFSGAGWAQKGAQYSEARYGK RDMSYLFPYYEKAADMGWAEAEATVAYWRYMGFYCEQNKEEGERRFAALTSPEAILWGKH YRAFVEEFTGDKAKALQIRNDLLAELPEGERLRAHVYAALGDALDRAEGSVAEEAAYYEK SLEIVPNLYSLKNLATLYFRYPELNKPKELGFELWEKAWHAGVWSAANFLGYNYQEEEWL DMPKAIEWLEKGMLYCESYSAYELALIYLYNDEYKNVERGLMCLNRCVEDNYIQGIEGLA NIYFNGELVEEDMNRAKELLEKAIELGSGNAAYRLGWMYERGFLSEQPDYVKALEYYEKA ASLNNADGYCRMALYLANGYSGVKDADKSRECYEKAAELGSCFALVELAFLYENGDGVEK NYEKAFELISRAAGQGYPYAMFRSGLYMEKGVLGEAKPEEAFAWYTKAAEADDNDAIFAL GRCYKEGIGTAEDWDKALEWFGKGAEKEESRCLTELGLAYENGNGVEENPQKAVEYMMKA AEQDYGYAQFKMGDYYFFGCGPCLEDNKKAMEWYEKAVANEIPMAMLRMGEYYLYDYDSL NESEKAFAYFKKAAEYEWYSEGLGICYEMGIGVEDNETEAFKYYTLAADNGNTMSMYRTG LCYYNGVGVKQNYAEAYRWFTDAAGNENIAATYYLGKMMMYGEGCTPDPEAAVQWLLKAA EKNSDKAQFELGNAYLTGKGVEENDEIAMEWFEKAAENGNEKALKITGRRRK >gi|222159350|gb|ACAB01000009.1| GENE 26 31593 - 32120 471 175 aa, chain - ## HITS:1 COG:alr1244 KEGG:ns NR:ns 
## COG: alr1244 COG0703 # Protein_GI_number: 17228739 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Nostoc sp. PCC 7120 # 2 159 8 162 181 106 36.0 2e-23 MVRIFLTGYMGAGKTTLGKAFARQMDIPFVDLDWYIEERFHKTVGELFTERGETGFRELE RNMLHEVAEFENVVISTGGGAPCFFDNMDFMNRTGKTVFLDVHPDVLFRRLRVAKQQRPI LQGKEDDELKAFIVQALEKRAPFYHQAQYIFNADELEDRWQIETSVQRLQQLLGL >gi|222159350|gb|ACAB01000009.1| GENE 27 32236 - 32841 480 201 aa, chain - ## HITS:1 COG:CAC3314 KEGG:ns NR:ns ## COG: CAC3314 COG3560 # Protein_GI_number: 15896557 # Func_class: R General function prediction only # Function: Predicted oxidoreductase related to nitroreductase # Organism: Clostridium acetobutylicum # 2 200 1 198 198 233 54.0 1e-61 MMERSFSEALKQRRTYYSITNQSPISDQEIECIVNMTVRHVPSAFNSQTTRVVLLLGESH KKLWQIVKDALKKIVPAEAFAKTEEKIDHSFACGYGTVLFFEDQKVVKGLQEAFPSYQEN FPGWSLQTSAMHQLAIWVMLEDVGFGASLQHYNPLIDEEVRRAWNLPEHWHLIAEMPFGL PIGKPGEKEFQPLEERVRIFK >gi|222159350|gb|ACAB01000009.1| GENE 28 32977 - 33612 558 211 aa, chain + ## HITS:1 COG:BH0863 KEGG:ns NR:ns ## COG: BH0863 COG3341 # Protein_GI_number: 15613426 # Func_class: R General function prediction only # Function: Predicted double-stranded RNA/RNA-DNA hybrid binding protein # Organism: Bacillus halodurans # 1 211 1 196 196 184 48.0 8e-47 MAKQKFYVVWEGVTPGIYTSWTDCQLQIKGYEAAKYKSFDTREEAERALTMSPYAYIGKN AKSKSGGSKPSSDTLPSCVIDNSLAVDAACSGNPGPMEYRGVHIASRQEIFHFGPMKGTN NIGEFLAIVHGLALLKNKGFDMPIYSDSANAISWVRQKKCKTKLPRTPETEELFLLIERA EKWLQGNTYTTRILKWETKEWGEIPADFGRK >gi|222159350|gb|ACAB01000009.1| GENE 29 33642 - 35414 1375 590 aa, chain + ## HITS:1 COG:all2870 KEGG:ns NR:ns ## COG: all2870 COG1807 # Protein_GI_number: 17230362 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 4-amino-4-deoxy-L-arabinose transferase and related glycosyltransferases of PMT family # Organism: Nostoc sp. 
PCC 7120 # 35 500 55 561 641 83 25.0 2e-15 MKTLTSNKAFWLLLVICVVTILPFLGLSEYHTKGEPREAIVSYSMLDSGNWILPRNNGGE MAYKPPFFHWSIAAVSAVVNGGQVTEMTSRLPSAIALIAMTLCGFLFFAKRKGVQLALLA AFITLTNFELHRAGANCRVDMVLTALTVGALYCLYRWYEKGLKGIPWLAILLMSCGTLTK GPVGTIIPCLVVGIFLLLRGVNFFKAFLLLSAWAILSLILPICWYVAAYQQGGEEFLALV MEENLGRMTNTMSYDSCVNPWHYNFVTLFAGYVPWTLLAVLSLFSLTYRKFNIQPAAWWK RFTAWIKNMDPVDLFSFTSIVIIFVFYCIPQSKRSVYLMPIYPFIAYFLAKYLFYLVKKQ SKVIKVYGSILAAISLLLFTCFIVLKCGLIPETIFHGRHAQDNINFMRAIQNISGAGALL LVAVPTILGIYWWFYQRKHALSNRFLYALVVLTMGLYLALDGAYQPAALNSKSVKFVAAE IEKIAPESEGTMYEFIEESLHAAGDPVHYFEINFYLNNRLDNFYQKRPAKGFLLIGTNDA EKYLPEFEKEGYQFEQVYESPKRVLRQIAKVYKFIKNEQPEKTETTPIVE >gi|222159350|gb|ACAB01000009.1| GENE 30 35401 - 37380 1169 659 aa, chain - ## HITS:1 COG:PA1689 KEGG:ns NR:ns ## COG: PA1689 COG1368 # Protein_GI_number: 15596886 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Pseudomonas aeruginosa # 19 575 35 607 700 195 27.0 2e-49 MKRFLNLIVYILTIHISALLIAGLFRLVLFISSYHQLTSEALSDKSLSMLAFVHGVWFDN VIGCYILLLPLTIAVVCGVCNYYGKALFRFFTIFFSVFYGLVYLISASDIPYFAYFFKHI NSSIFEWFGYAGTTAGMILGESAYYLSIGLFLLFLAGFIVWLVCLSRYFHRRSLTISASF PFWKRGVVVLVGACLIGLCIFGIRGRTGYNPIKVSAAYFCQDAFLNQLGVSPTFNLLTSV MDDMRPENKYLHLMDEQEAITKAQALLNRQGDANVSPLAVYRHAAKADSVQQRRPNVVLI MMESMSSKLMKHFGQSETLTPFLDSLYTRSISFRNFYSAGIHTNHGLYATLYSFPAMMKR NLMKGSVIPRYSGLPTVLKENGYYNLFFMTHEGQYDNMNAFFRTNGYDEVFSQEDYPADK VVNSFGVQDDFLYDYAIPVLNQRAATGQPFFATLLSISNHPPYVIPPFFHPKTSEPEMQI VEYADWALRQFFEEARKQPWFDNTIFVLEGDHGKLVGDAECELPESYNHIPLMIYSSRIH PEEKNTFGGQVDIQPTILGLLNIDYLQNNFGVDLLKEERPCMFYTADNMIVGRNDTLLYL YNYETQQEFTYHIDNGKLKAVPMDDRFLPLKEYSFSMLQSAEFLVKHGKTVNSVPFTQQ >gi|222159350|gb|ACAB01000009.1| GENE 31 37427 - 37846 262 139 aa, chain - ## HITS:1 COG:no KEGG:BT_3390 NR:ns ## KEGG: BT_3390 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 136 2 133 143 127 47.0 2e-28 
MTKISKAIFDRHPDWRDKFWQFVRFGVVGTVSSAIHYGVYCLVLLVANANISFTAGYAVG FICNYFLTTFFTFRSKPSSRNAVGFGFSHLINYLLEIGLLNLFLWIGAGELLAPILVMII VVPINFLILHFVYIYKGRK >gi|222159350|gb|ACAB01000009.1| GENE 32 37830 - 38786 799 318 aa, chain - ## HITS:1 COG:lin1066 KEGG:ns NR:ns ## COG: lin1066 COG0463 # Protein_GI_number: 16800135 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Listeria innocua # 4 314 6 316 329 316 47.0 4e-86 MIKLAIVSPCYNEEEVLEESTTRLTALFDDLVAKKKISADSFVLFVNDGSKDRTWSIIKQ LHRTNPYIKGMNLARNVGHQYAIMAGMMTAKDWSDAVITIDADLQDDLNAIEEMIDAYTQ GYDVVYGVKVSRQADPMLKRLSATAFYKLQHRMGVETIYNHADFRFLSRRVLEQLSHYQE RNVYLRGIIPLLGFPSTTVDDVIRERTAGTSKYTVRKMFSLALDGITSFSVKPIYGIVYL GIIFVFISILIGIYVLYSLISGTAEHGWASLMLSVWFVGGIVLMSIGAVGLYIGKIYKEV KRRPLYNVEEVLYDDQDK >gi|222159350|gb|ACAB01000009.1| GENE 33 38783 - 39586 693 267 aa, chain - ## HITS:1 COG:MA0797 KEGG:ns NR:ns ## COG: MA0797 COG0726 # Protein_GI_number: 20089681 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Methanosarcina acetivorans str.C2A # 2 257 16 248 250 70 27.0 3e-12 MILLSFDTEEFDVPREHGVDFPLDEAMKVSVYGTNRILDCLKKNGVKATFFCTSNFAENA PEVMRRIMDEGHEVAAHGCDHWQPQASDVSRSKEILERLTGRTIEGYRQPRMFPVSDTEL ERMGYVYNSSLNPAFIPGRYMHLSEPRTCFMTGKLLQIPASVTPWIRFPLFWLSCHNLPM WLYQSLVNRVLKHDGYFVTYFHPWEFYPLGEHPEFKMPFIIRNHSGKGMEERLDMLIRNL KEKGYPFMTYSEFAQIKLAELNKPDEK >gi|222159350|gb|ACAB01000009.1| GENE 34 39616 - 41451 216 611 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 370 607 1 229 311 87 28 1e-16 MKEFLQLMRRFVSPYKKYIGWAILLNILSAVFNVFSFTFLIPILSILFKTEGADKVYHFM EWGSGDLADVAKNNFYYYISQMIVDNGPTVALIFLGLFLMVMTLFKTGCYFASSAVMIPL RTGVVRDIRIMVYAKVMRLPMSFFSEERKGDIIARMSGDVGEVENSITSSLDMLMKSPIL IIIYFVTLVAVSWQLTLFTIIVLPGMGWLMGVVGRKLKRQSLEAQAKWSDTMSQLEETLG GLRIIKAFIAEDKMINRFTKCSNELRDATNKVAIRQAMAHPMSEFLGTILIVAVLWFGGT 
LILGKNATIDAPTFIFYMVILYSVINPLKDFAKAGYNIPKGLASMERVDKILKAENKIKE IPNPKPLKGLNDKIEFKDISFSYDGKREVLKHVNLTVPKGKTIALVGQSGSGKSTLVDLL PRYHDVQEGDITIDGTSIRDVRIADLRSLIGNVNQEAILFNDTFFNNIAFGVENATMEQV VEAAKIANAHDFIMEKPEGYNMNIGDRGGKLSGGQRQRISIARAILKNPPILILDEATSA LDTESERLVQEALERLMKTRTTIAIAHRLSTIKNADEICVLYEGEIVERGKHEELIELNG YYKRLHDMQQL >gi|222159350|gb|ACAB01000009.1| GENE 35 41572 - 42696 661 374 aa, chain - ## HITS:1 COG:MA1854 KEGG:ns NR:ns ## COG: MA1854 COG1672 # Protein_GI_number: 20090704 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Methanosarcina acetivorans str.C2A # 1 373 2 386 390 77 23.0 6e-14 MIRNPFITSGYVSADYFCDRQQESEQLVREVMNGNNLALVSTRRMGKTGLIRHCFQFPEI KQDYYTFFIDIYDSRSLRDLVFALSKEILEVLKPIGKKALHGFWECVKSLQASISFDVNG MPSLNLGLGDIQAPATTLDEIFRYLEQADKPCLIAIDEFQQITGYAEKNVEATLRTYVQH CNNARFIFAGSQRHVMGNMFLTPSRPFYQSVSMMHLESIPLEEYIRFACMHFKRAGKEME ESAVTTVYQQFEGVTWYIQKVLNTLYDMTPEHGVCKVEMVSEAIRQIIDSFRYTYSEILF RLPEKQKELLIAITKEGKAKAITSGAFIKKYRLASASSVQAALKGLLEKDFVTQEKGVYQ IYDRFLGIWLKENY >gi|222159350|gb|ACAB01000009.1| GENE 36 42778 - 44862 1364 694 aa, chain - ## HITS:1 COG:all8519 KEGG:ns NR:ns ## COG: all8519 COG5545 # Protein_GI_number: 17232892 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Nostoc sp. 
PCC 7120 # 332 674 306 634 836 77 25.0 9e-14 MRITLVRDDGKVNTMRTLRIEQLVEQMKVETKTQPVSKMREVLPFMLPGDKNDYVQKVPK LIPAAAFFRKGGITTMSEYNGIVMIQVNNLSGRMEADEVKERVKELPQTYLAFTGSSGKS VKVWVRFTYPNDRLPTTSEEAELFHAHAYRLAVKFYQPQLPYDIELKVPSLEQYCRLTFD PHLYFNPEAMPIYMKQPAAMPGEVTYRERVQTETSPLQRLAPGYEKCNALSVLFEAAFAR ALDEETDYQPEGDKQSLLINLAGHCFRAGIPEEDTVRWSRAHYRLPKDDTLVRGTVRNVY RTCEGFASKSSLLPEQLFVMQMDEFMKRRYDFRFNQLTSQVECRERNSFNFYFHPVDKRL MASIAMNAHYEGLKLWDKDVVRYLNSDHVPVYQPIEEFLYDLPHWDGKDHIGDLAKRVPC DHPHWAQLFRRWFLSMIAHWRGMGKNHANSTSPILIGPQAYRKSTFCRLILPPCLQAYYT DSIDFSRKRDAELYLNRFLLINMDEFDQIGINQQSFLKHILQKPVVNTRRPNASAVEELR RYASFIGTSNHKDLLTDTSGSRRFIGVEVTGVIDVVRPIDYEQLYAQAMALLRSNERYWF DEKEEAIMTEANREFEQSPAIEQLFQVYYRVAEEEEEGEWLLAADILQRIQKASKMKISS GQVNYFGRILQKLGVKSFRKTRGVYYHVVAVGQG >gi|222159350|gb|ACAB01000009.1| GENE 37 45466 - 46077 342 203 aa, chain + ## HITS:1 COG:no KEGG:BT_3384 NR:ns ## KEGG: BT_3384 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 203 1 203 203 303 76.0 2e-81 MAIQFEFYKNPQPEKEGEEPSYHPRVVNFQHVTTQKLAREIHMATTFGKAEVEAMLMELS RCMGNHLCEGERVHLDGIGYFQITLQATEPVHSLTTRADKVKLKSINFQADRDLKSLCMT SHLRRSKYKPHSASLSEEEIDKKLTAYFATHPVLTRSNMQSLCSFTQSMASRQIRRLKAQ GNLQNIGKPTQPIYIPTPGHYEK >gi|222159350|gb|ACAB01000009.1| GENE 38 46247 - 47800 1170 517 aa, chain - ## HITS:1 COG:no KEGG:BT_1828 NR:ns ## KEGG: BT_1828 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 516 1 516 520 801 76.0 0 MSGKIYPIGVQNFESLRNDGYFYIDKTAMIYQLVKTGRYYFLSRPRRFGKSLLISTLEAY FQGKRELFEGLAMEKLEKDWITYPVLHLDLNTEKYDTPESLENKLNGALVEWEKMYGAEP SEKSLAMRFEGIIKRACRQEGQRVAILVDEYDKPMLQAIGNDALQKSFRSTLKAFYGALK SQDGCIKFAMLTGVTKFGKVSVFSDLNNLNDISMWNKYIDICGVSEQELYDNLDAELHEF ANVQGVTYEEICVRLKEMYDGYHFTHNSKGMYNPFSLLLAFDRNEFKSYWFETGTPTYLV ELLKKHHYDLHRMAHEETDEQVLNSIDSESTNPIPVIYQSGYLTIKGYDEEFGIYRLGFP NREVEEGFIRFLLPYYANVDKVESPFEIQKFVREVRAGDYESFFRRLQSFFSDIPYELAR ELELHYQNVLYIVCKLVGFYVKAEYHTSEGRVDMVLQTDKFIYIMEFKLNGTAEEALQQI 
NDKHYSRPFETDSRKLFKIGVNFSAETRNIEKWLVEN >gi|222159350|gb|ACAB01000009.1| GENE 39 47983 - 49536 1201 517 aa, chain - ## HITS:1 COG:no KEGG:BT_1828 NR:ns ## KEGG: BT_1828 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 516 1 516 520 806 77.0 0 MSGKIYPIGIQNFEKIRKGGYFYIDKTALVYQLVKTGSYYFLSRPRRFGKSLLISTLEAY FQGKRELFEGLAMEKLEKDWVKYPVLHLDLNTEKYDTPESLENKLNGALVEWEKIYGEEP SEKSLAMRFEGIIKRACQQEGQSVVILVDEYDKPMLQAIGDDVLQKSFRNTLKAFYGALK SQDGCIKFALLTGVTKFGKVSVFSDLNNLNDISMWNKYIDICGVSEQELYDNLDAELHEF ANVQGVTYEEICARLKEMYDGYHFTHNSKGMYNPFSLLLAFDRNEFKSYWFETGTPTYLV ELLKKHHYDLHRMAHEETDEQVLNSIDSESTNPIPVIYQSGYLTIKGYDERFGIYRLGFP NREVEEGFVRFLLPYYANVDKVESPFEIQKFVREVEAGDYESFFRRLQSFFSDIPYELAR ELELHYQNVLYIVCKLVGFYVKAEYHTSEGRVDMVLQTDKFIYIMEFKLNGTAEEALQQI NDKHYARPFEMDSRKLFKIGVNFSAETRNIEKWLVEN >gi|222159350|gb|ACAB01000009.1| GENE 40 49582 - 50727 683 381 aa, chain - ## HITS:1 COG:TM1265 KEGG:ns NR:ns ## COG: TM1265 COG1373 # Protein_GI_number: 15644021 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Thermotoga maritima # 2 381 16 387 387 120 26.0 7e-27 MLLSNISTDFIRYLHDEIEWSSRLIAILGSRGVGKTTMLLQHIKLYDNINETLFVTADDL YFAEHKLFDLAMDFYQHGGKKLYIDEIHKYAGWAREIKNIYDLIPKLQVVYTGSSILDLE VGEADLSRRKLEYRMVGLSFREYLAISYGYHLPVYSLEDILKHKIDFPYAEARPILLFKE YLQHGYYPFFQEKGYLLRLQSIIKQTLENDIPTFANMNIATALKLKRLLYIIAKSVPFKP NFTKLATLLDMNRNTVSDLMCYLEKAGIINQLRAETEGVRLLGKVDKVYLNNTNLAYALS DNTPDIGNVRETFFFSTLRVVCPVTTSEVADFTVGGYTFEVGGKNKSQKQVHDVENAYVV KDDIEYGMRNVVPLWAFGFLY >gi|222159350|gb|ACAB01000009.1| GENE 41 51147 - 51632 273 161 aa, chain + ## HITS:1 COG:PAB0773 KEGG:ns NR:ns ## COG: PAB0773 COG0110 # Protein_GI_number: 14521366 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Pyrococcus abyssi # 2 161 9 160 205 162 52.0 3e-40 MIDTTAKIAANAQIGKGTIIEEFTVIAPNAKIGDECKIHRNIFIDSNVIIGNRVKIQDNV MIPHGVTLEDGVFVGPSASFTNDKYPRSINPDGTLKSSEDWDVSETVVKYGASIGANATI 
LCGVTLGEWCMVAAGAVVTKDVPAYTLVAGNPAKIVGPVNF >gi|222159350|gb|ACAB01000009.1| GENE 42 51862 - 54267 1161 801 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262405490|ref|ZP_06082040.1| ## NR: gi|262405490|ref|ZP_06082040.1| predicted protein [Bacteroides sp. 2_1_22] # 1 801 66 866 866 1554 100.0 0 MLVSQVDLARYKIRAIFDITLFVITIIIFFITDGQYRWYAMIPVGGIVITECCANYKHIL QRIPFKRIKEFPYLKYEQAALIIPKGLVGILSTTLKEVKDIPANPRMMKIHFATTDESYE TRQVITDKSIIERIIKNEIVAYYDSLTEKIREKGYDRSQFFYLENENGHKEFICDILKDI YSLENVYNAIIQVQSDITKDGSIRFIGDKVGIRGYKLEKGELLLDIYETDHFTFKVFKKI FKDRRFKRIFQEIICRTNKANDNIKLLLVESLAFLFSSFGIDIIIGGRDASGAKKMLVAA RSGNIESDNRSTLHVPVNESFSRTDLVEHDQYYSPYHCVLRGIKEELGIPEEICKKTSVS FHDFAIVSDEGEIGLGCYVDFSFVMPLEEARLYPGQDKYLELADILIVPYPPFKWSPSAY EDYFYKTTGNEKFCMQWQSFTTLLYQRAILRNAEASIPIIWLVDSTIILTLLFLLTNYTQ IDLTSTIISLILGGLALFIMKRFNNKPVHKLTYGEFLKPFVPQWNGDVRVLQSTMHSQIV KGSRKETNPIADGIMFGLNSPSGQKLKLSDIHLLTPPFCTVRREFVNFTEYPISFYYVAE KNGDIGNCLSFIKIPYALSSDDLSLLLTVKTEKGHIVSYNFTKPIHSDIILDFTNTLDEK QVNTFSKYYRMNKDLLKNVQIASLDENFQKRWFPLDLFNAGNDYFWSVIDMEDELEKHDS YDFDFKIGKKTQPRDLYMKVIAKNMDRSSFSIRLNGNLKDMETFLCAFTSRNDNRRKMSD LDIYMMQLFLIRIGIVYAKKK >gi|222159350|gb|ACAB01000009.1| GENE 43 54743 - 55615 135 290 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_0316 NR:ns ## KEGG: Fjoh_0316 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 1 248 1 238 281 139 36.0 1e-31 MISIIICSRFQSISKELKDNIENTVGVVHEIICIDNSKSQYDIFSAYNEGVKRSQYPLLC FMHEDILHYTNDWGKLLINHFRDLKVGLIGISGPRFISQIPGIWWGPGSTDAGKDAICQY SIDTNRADPTITYNTCFKPIADANAIEVVAVDGCFFCMRKSVFDKIRFDEINYKGFHFYD LDISLQVYMLGLKSLCVYDILIEHVSNSKLDNEWLTNARIFFTKWKDCLPIVNYPVSAKE RRVLEKNNLQTMNYIINENNRKPSRYYKFSEKLYLLRYCCDKKMLQSLIY Prediction of potential genes in microbial genomes Time: Wed May 18 00:58:27 2011 Seq name: gi|222159349|gb|ACAB01000010.1| Bacteroides sp. 
D1 cont1.10, whole genome shotgun sequence Length of sequence - 35448 bp Number of predicted genes - 31, with homology - 31 Number of transcription units - 17, operones - 5 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 5 - 64 7.9 1 1 Tu 1 . + CDS 156 - 1154 328 ## gi|262405492|ref|ZP_06082042.1| glycosyl transferase + Prom 1206 - 1265 5.5 2 2 Tu 1 . + CDS 1306 - 2097 204 ## HMU12050 family 6 glycosyltransferase - Term 2052 - 2097 1.5 3 3 Op 1 . - CDS 2118 - 3029 350 ## HH0072 hypothetical protein 4 3 Op 2 11/0.000 - CDS 3090 - 4064 357 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 5 3 Op 3 . - CDS 4076 - 5053 309 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Prom 5225 - 5284 4.9 6 4 Tu 1 . + CDS 5330 - 6247 204 ## BVU_1071 glycosyl transferase family protein - Term 6107 - 6140 -0.9 7 5 Tu 1 . - CDS 6265 - 7269 287 ## BF2799 glycosyltransferase - Prom 7317 - 7376 7.1 + Prom 7126 - 7185 8.9 8 6 Op 1 . + CDS 7318 - 7500 56 ## gi|294646362|ref|ZP_06724009.1| hypothetical protein CW1_2060 9 6 Op 2 11/0.000 + CDS 7540 - 8550 177 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Prom 8864 - 8923 5.3 10 6 Op 3 1/0.000 + CDS 8972 - 9781 449 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 11 6 Op 4 . + CDS 9783 - 10832 873 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 12 6 Op 5 . + CDS 10829 - 11560 581 ## COG1213 Predicted sugar nucleotidyltransferases 13 6 Op 6 3/0.000 + CDS 11577 - 12389 661 ## COG3475 LPS biosynthesis protein 14 6 Op 7 . + CDS 12419 - 13552 651 ## COG0438 Glycosyltransferase 15 6 Op 8 . + CDS 13549 - 14361 430 ## gi|237716729|ref|ZP_04547210.1| conserved hypothetical protein 16 7 Tu 1 . - CDS 14362 - 15273 434 ## BT_3364 hypothetical protein - Prom 15400 - 15459 2.0 + Prom 15392 - 15451 8.9 17 8 Tu 1 . 
+ CDS 15505 - 16563 430 ## COG0859 ADP-heptose:LPS heptosyltransferase - Term 16315 - 16356 2.5 18 9 Op 1 5/0.000 - CDS 16546 - 17622 708 ## COG0438 Glycosyltransferase 19 9 Op 2 . - CDS 17682 - 18713 744 ## COG0859 ADP-heptose:LPS heptosyltransferase 20 9 Op 3 . - CDS 18717 - 19322 533 ## BDI_2820 hypothetical protein - Prom 19412 - 19471 7.3 + Prom 19273 - 19332 7.7 21 10 Tu 1 . + CDS 19440 - 20480 523 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases - Term 20372 - 20412 2.2 22 11 Tu 1 . - CDS 20558 - 21133 347 ## COG0299 Folate-dependent phosphoribosylglycinamide formyltransferase PurN - Prom 21265 - 21324 7.8 + Prom 21204 - 21263 5.2 23 12 Op 1 27/0.000 + CDS 21283 - 21519 403 ## COG0236 Acyl carrier protein 24 12 Op 2 1/0.000 + CDS 21535 - 22797 1397 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase 25 12 Op 3 . + CDS 22724 - 23830 655 ## COG0571 dsRNA-specific ribonuclease + Term 23863 - 23897 -0.2 - Term 23708 - 23763 1.1 26 13 Tu 1 . - CDS 23772 - 24782 863 ## COG0205 6-phosphofructokinase - Prom 24899 - 24958 5.6 + Prom 24834 - 24893 2.0 27 14 Tu 1 . + CDS 24920 - 26446 1210 ## BT_3355 putative auxin-regulated protein + Term 26609 - 26647 -0.4 28 15 Tu 1 . - CDS 26687 - 27826 754 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain - Prom 27936 - 27995 1.9 + Prom 27774 - 27833 3.0 29 16 Tu 1 . + CDS 27880 - 30720 1419 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family + Prom 30751 - 30810 7.7 30 17 Op 1 . + CDS 30866 - 33472 1642 ## BT_3328 hypothetical protein 31 17 Op 2 . + CDS 33494 - 35447 1370 ## BT_3329 hypothetical protein Predicted protein(s) >gi|222159349|gb|ACAB01000010.1| GENE 1 156 - 1154 328 332 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262405492|ref|ZP_06082042.1| ## NR: gi|262405492|ref|ZP_06082042.1| glycosyl transferase [Bacteroides sp. 
2_1_22] # 1 318 1 318 332 632 100.0 1e-179 MKCYICLTDSIVNRKDYFNMLKVTLISARKNTSLRLICLYDGKIGDPVYNLLKDFKVEII IHELPYKKELMEIYSHEWMETELGKDIEYSRIFGTFMRMEIPIIEKEDEYVLYTDMDIIF NDDILLKDLPHPAYLAAAPEFERDTKRMSYFNAGILVMNIQGMRIKYQQFVEMMKKRQRS TSGLFDQGYLNELCFNDMEILPIEYNWKPYWGINGNAKLIHFHGMKPCSNLEEAGFDTRE SFFRTIFDNNSQGYAGYIYYFILFFNYLGQKQDQWLCYHLQYILDLYKKPLIALAQKPNY KPKYRKYKRLYSIFVSISILLAILLLTALFLV >gi|222159349|gb|ACAB01000010.1| GENE 2 1306 - 2097 204 263 aa, chain + ## HITS:1 COG:no KEGG:HMU12050 NR:ns ## KEGG: HMU12050 # Name: not_defined # Def: family 6 glycosyltransferase # Organism: H.mustelae # Pathway: not_defined # 2 233 44 271 306 184 45.0 3e-45 MRIGILYICTGKYDIFWKDFYLSAERYFMQDQSFIIEYYVFTDSPQLYDEENNEHIHRIK QKNLGWPDNTLKRFHTFLRIKEQLERETDYLFFFNANLLFTCPIGKEMLPSSNSNGLLGT IHPGFYNKPNSEFTYERRVASTAYIPEGKGLYYYAGGLSGGCTESYLQLCTTICSWVDKD AANHIIPIWHDESLINKYFLDNPPAITLPPAYLYPEGWSLPFKPIILIRDKNKPEYGGHE FLRRKNSLWVKIKLICQKIKLAD >gi|222159349|gb|ACAB01000010.1| GENE 3 2118 - 3029 350 303 aa, chain - ## HITS:1 COG:no KEGG:HH0072 NR:ns ## KEGG: HH0072 # Name: not_defined # Def: hypothetical protein # Organism: H.hepaticus # Pathway: not_defined # 25 274 6 253 320 122 32.0 2e-26 MSIIRKWQRSCYKRMLKAHLVTPKVVVLMDGGICSQMLQYLLGEFFRKRGCKVVYDLSFY KEWGSDMDYKFARNFDLLKVFPYLGLTIATEFEISYYKRHFYYVGNNTSEWIDDFSFLEK NPPIYLGGYYHLPINIWVPAFMSTYRISSEIFDEGSKMLMSEIKQNINPVAVHVRRGDLS VEVRAYGKPASLSYFKRAVDYMEKETVTPFFYFFSDEPEWVASELIPYLEFANENYKVVD INGSDKGYMDLFLIAYCKHQITSKGTLGKYGALLRDSSDKIVILCDDKVEYQWKGVFYNS IFL >gi|222159349|gb|ACAB01000010.1| GENE 4 3090 - 4064 357 324 aa, chain - ## HITS:1 COG:SP1771_1 KEGG:ns NR:ns ## COG: SP1771_1 COG0463 # Protein_GI_number: 15901601 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pneumoniae TIGR4 # 8 215 7 223 259 110 34.0 4e-24 MEKESPLVSVIIPVYNAHNSIAKTLQSVINQTYTNLEIVVVNDGSTDESLDIIKTYAVED PRIVVFDKQNEGLVQARKSGIDIATGKYIQYLDSDDIMHEDAITRLVNKAEETQADMVVA 
PFMFCYDGESHKSTFFNFVELSGVEFLKNILLKKAYWCVWSKFHLRSLYQNEIERPDISF GEDVVLSTQLLLYLQKVVSVDYIIIDYNFTNTSMSHPANFDDKKYGDFVNYTIWIENYID KKGLGEEMKEALAHFHLENAFRRIYWKRFAGIRKDMKRLVNEMESYPCLINNLSKRERKI VNAYRVSGFWGDLKLRYYNMRHKL >gi|222159349|gb|ACAB01000010.1| GENE 5 4076 - 5053 309 325 aa, chain - ## HITS:1 COG:SP1365 KEGG:ns NR:ns ## COG: SP1365 COG0463 # Protein_GI_number: 15901219 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pneumoniae TIGR4 # 7 214 6 214 328 124 37.0 3e-28 MESQYSVSVIIPVHNTAPYLRRCIESVRNQTLKDIEIILVDNLSTDDSPSICNEYANIDS RVKVLHLDEAGLSIARNAGIEIATAPYIGFIDSDDYIEPTMYENMLSAMLQNKVYMVYCN FCFEYEDGRIEQMYPNTDNVYVRTSQDVQRDIIFEKVSSSACTKLFDKIFLILIASYRYV FEDHATIYRWISICDKIVWIDAAYYHYIQRGDSICHTINSIKRYHYFLAEYSRLEFAKER RLFEGRDWCDAVNVIVENCFNHFQEFMTDPNHQLYPVEIKDMRNKLSRWRFLSRKELSSK YYKRLRKITYFWSIYYWTHFAKKKN >gi|222159349|gb|ACAB01000010.1| GENE 6 5330 - 6247 204 305 aa, chain + ## HITS:1 COG:no KEGG:BVU_1071 NR:ns ## KEGG: BVU_1071 # Name: not_defined # Def: glycosyl transferase family protein # Organism: B.vulgatus # Pathway: not_defined # 1 288 4 293 310 322 53.0 1e-86 MKKLAIIIPAYKARFLQETLDSIAKQNSHEFTVYIGDDASPYPLKTIVDHYKNKFDIIYH RFEQNMGKKDLPGHWERCISLSEEELIWLFSDDDLMPFDGVARIIQAAQKHSEGKYIFRF PLEVVDEHGKLKYKNPPFKTNLTSGYEFLLDKLSGKISSAACEYVFSRAVWEQTGGFIKF PLAWCTDDATWAKFADFTGGIISLPGNPVSWRNAEGENISNSTHFNKEKIRATILFIEWI GINYHSHLNERKFKKAIKRYIYKFLQHSLKNDFSLHDLLRLCNALRKLTLNVSLQVLVHH VWKKL >gi|222159349|gb|ACAB01000010.1| GENE 7 6265 - 7269 287 334 aa, chain - ## HITS:1 COG:no KEGG:BF2799 NR:ns ## KEGG: BF2799 # Name: not_defined # Def: glycosyltransferase # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 332 1 348 348 219 37.0 2e-55 MVYIICYDWPSTSGNHTGMRYLYEYIQKRNPELYKMYTFNMGRCFLDKGKRGKKISVFFT ALKLAMTYKSGDKFILTEYLHRDSYQILFAKIIRFIRPKAPVYAMVHLVPEKLERRYSKA QIKKSSRFVTEIVTLGSSLTCYLNNLGIENVYTSFHYVDNNYYTPSANLYNRSDDVKVIV MGALARDFEQIAYIVSRLPQIHFYICKGRENIDYLFSGLKNVTLVGYVPEAELKHYMNDS 
DISLNVMKDTIGSNVICTSMATGLAMVVSDVGSIRDYCDETNACFCSSPDDFIEKIMILA NDRELLNSLKKMSLKKAQQFSIDNYCASLSSLLK >gi|222159349|gb|ACAB01000010.1| GENE 8 7318 - 7500 56 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294646362|ref|ZP_06724009.1| ## NR: gi|294646362|ref|ZP_06724009.1| hypothetical protein CW1_2060 [Bacteroides ovatus SD CC 2a] # 1 60 1 60 60 89 100.0 5e-17 MQNLHQSHYLDKHNILNKGNENNPKFYSNEIILFYTRQARDKKSIKNEITFIITNIKVTL >gi|222159349|gb|ACAB01000010.1| GENE 9 7540 - 8550 177 336 aa, chain + ## HITS:1 COG:SP1365 KEGG:ns NR:ns ## COG: SP1365 COG0463 # Protein_GI_number: 15901219 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pneumoniae TIGR4 # 3 221 5 215 328 101 33.0 2e-21 MCLITLSMPIYNVEQYIERALLSALNQTYQNLEILIVDDLGHDNSMNIVYQLKHTHPRGN CIRIITHKKNLGLGGTRNTAIESAQGKYLYFMDSDDAIVPDCIETLYNIISQEKVDFVAA GINQIDENENSLKITNYPNITRKGKLCMAQYFYEEKNDIWVTTWNKLYNLSFLREFHIHF LEHVYHEDIVMHAMLALNTRSFTLTSQTTYLYNCKREDSITNMEQGYTLKHLNDLTIVFS EIRRLIQADNNLSKHLQKEINLYYLFEIYHRYLPAMYSNIKRVDLIKGLARLSQNCLPPI LETRLTFKSIKRFLFLKLPFNIQYLIKSHKYKKSRA >gi|222159349|gb|ACAB01000010.1| GENE 10 8972 - 9781 449 269 aa, chain + ## HITS:1 COG:Cj1135 KEGG:ns NR:ns ## COG: Cj1135 COG0463 # Protein_GI_number: 15792460 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Campylobacter jejuni # 10 261 257 514 515 186 40.0 5e-47 MNKPMASNISTSLIISTYNRSDALELCVKSVLRQSLLPDEIIIADDGSKEDTRELIHQLA ASSEVPIIHVWHEDLGFRLASIRNKAIAKASKEYIIQIDGDIVLHKDFVKDHARFAQKGS FVTGSRVLIREGLTKKMLAEKNCIISIHDKGTKNTINGVHLPWLSPLLQHYRQWDISYSR GCNMAFWKEDLLKVNGYNEAITGWGSEDHELVCRLINNGVRKRTIKFAGIVFHLHHELHG TDNLNNNRSILNETKVRKLTWCDKGIIQN >gi|222159349|gb|ACAB01000010.1| GENE 11 9783 - 10832 873 349 aa, chain + ## HITS:1 COG:BS_hisC KEGG:ns NR:ns ## COG: BS_hisC COG0079 # Protein_GI_number: 16079319 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic 
aminotransferase and cobyric acid decarboxylase # Organism: Bacillus subtilis # 8 346 34 359 360 133 30.0 5e-31 MEKNLNFLDRNEFNYSPSKEVVEALKNFDINKLCFYTRIYDEGKKSILSVFLSELYDIDE TQVLLGYGGEDILKQAVHYFLTQEDGNKTMLIPKFSWWYYKSIADEVNGHTLQYPLYEDG NTFKYDFETLKDMIQKENPKILLLASPNNPTGNGLTPKELDELLAEVPSQTVVLIDEAYA SFVSTDTSYIKKLVNKYQNLIISRTLSKFYGLPGLRMGFGFMSKELEKFSRYSNKYLGYN RISEDIAIAALKSDAHYRNIAKLMNEDRERYEKEIGVLPGFKVYESVANFILIKYPIELK EALQKAFAEQSYKVKFMNEPDINTHLRITLGRPEQNRIVIDTIKEIASK >gi|222159349|gb|ACAB01000010.1| GENE 12 10829 - 11560 581 243 aa, chain + ## HITS:1 COG:AF1142 KEGG:ns NR:ns ## COG: AF1142 COG1213 # Protein_GI_number: 11498742 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted sugar nucleotidyltransferases # Organism: Archaeoglobus fulgidus # 1 235 1 232 241 124 35.0 2e-28 MKAVILAAGIASRLRPLTDTTPKCLLKIGERCLLERAFDALIQNGFDEFIIVTGYRQQQI VDFLQARYPVQNITFIYNDRYESTNNIYSLWLTRPYTDGEEILLLDSDIVFDPQIVEKLL DSDKADILALNRHELGAEEIKVIVDDAQKVVEISKVCSISDAIGESIGIEKMSAEYTKAL FRELDIMITTEGLDNIFYERAFERLIPQGYSFYVMDTTEFFSAELDTVEDFQQAQKLIPA SLY >gi|222159349|gb|ACAB01000010.1| GENE 13 11577 - 12389 661 270 aa, chain + ## HITS:1 COG:L15884 KEGG:ns NR:ns ## COG: L15884 COG3475 # Protein_GI_number: 15672196 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: LPS biosynthesis protein # Organism: Lactococcus lactis # 1 257 4 268 278 124 30.0 1e-28 MANYDIRPLQLRILKNLLAVDKVCKEHNLRYYIMAGTMLGAVRHKGFIPWDDDLDIGMPR ADYDLLMANAKEWLPKPYEAVCAENDKEYPLPFAKVQDADTTLIERMHLKYLGGVYIDIF PLDGVPESRMAQRMHFAKYEFYKRVLYLIHRDPYKHGKGPSSWIPLLCRKLFTLTGAQES IRKVMKKYDFDQSALVCDYDDGMKGIMSKDILGIPTPVLFEDEEVWGVQKYDAYLSQKYG DYMTIPKQSGQRQHNFHYLDLNKPYREFEV >gi|222159349|gb|ACAB01000010.1| GENE 14 12419 - 13552 651 377 aa, chain + ## HITS:1 COG:BMEI1404 KEGG:ns NR:ns ## COG: BMEI1404 COG0438 # Protein_GI_number: 17987687 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Brucella melitensis # 1 374 1 363 372 140 28.0 4e-33 
MRIGFDGKRAVQNFTGLGNYSRYIVDILCQFYPENEYVLYAPKKRENKRLNKLTKQYRQL QLSYPTTSFWKKLSSLWRVLGVTRQLEKERIDIFHGLSNELPLNIHKSEVKSIVTIHDLI FLRYPQYYHSIDRNIYTYKFRKACENADRIIAISECTKRDIIEYFGIPADKIEVVYQGCD TSFTHPVTEEKKREVRAKYQLPEHYILNVGSIEERKNALSAVQALTMLPEQIHLVIVGRH TEYTDKIERFIKENKLEERVHIISNVPFDDLPTFYQLAEIFVYPSRFEGFGIPIIEALYS GIPVVAATGSCLEEAGGPDSIYIHPDDIKGMANAFKQIYSDPERKKVMIEKGQIFAKRFS EEKQAEEILNIYKKLMR >gi|222159349|gb|ACAB01000010.1| GENE 15 13549 - 14361 430 270 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716729|ref|ZP_04547210.1| ## NR: gi|237716729|ref|ZP_04547210.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 270 1 270 270 511 100.0 1e-143 MNILKHTIKKFIYGTLPYYFMKGYKPGSPYLKYYEYIKEHGYSRHLYEFKDEYANMPVDV QKDEEKGLYYVQKEEKRLYFRKSTPARKIQKYYRALSMEQDKRSPHHYFNSVKEVTGKVF VDVGCAEGYSSLEIIEEAKHVYLFEQDEQWLEAIRATFEPWQDKVTIVQKYVSDHNSSRE QTLDDFFNNQTDEHLFLKMDIEGAERHALAGCKNLFQNCQKLDFAICTYHLHDDEAVISA FLDKHNCTYTNQKGFFRHKIRSVVMRGSKS >gi|222159349|gb|ACAB01000010.1| GENE 16 14362 - 15273 434 303 aa, chain - ## HITS:1 COG:no KEGG:BT_3364 NR:ns ## KEGG: BT_3364 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 239 1 238 247 189 40.0 8e-47 MRVIVNPKYAHLQKEIEEIPRSFQEKGDVVYDGRNVLKRIGLGSIDVVVKSFKKPHIINR VVYSFFRQSKAERSYIYSMEIQQHGFDTPEPVAMIEQFQSGLLSHSYYICCYDGGETVRS LMDGKVEGNEDKLSAFARYTAALHQAGILHLDYSPGNILIHQNDANEYSFSLVDVNRMQL LSDIDCDTVCRNMCRLCISREVLTYIMTEYASLRGWDVESTVSLALRYSDQFFTHYIYRR AARKEKERHIVSLILFFRLYRSVRKFFSWEPHISRFLLKKEKRIYDTYLCKYDYCELLSA DYQ >gi|222159349|gb|ACAB01000010.1| GENE 17 15505 - 16563 430 352 aa, chain + ## HITS:1 COG:FN0992 KEGG:ns NR:ns ## COG: FN0992 COG0859 # Protein_GI_number: 19704327 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Fusobacterium nucleatum # 7 340 10 344 358 207 38.0 3e-53 MKPIKKILIIRFRQIGDSILAVALCSTLKKSFPDAEIHFVLNKNIASLYEGHPDIDKVIT FDKNENKPFTAYIKKVWQVTHQNKYDVIIDMRSTIRTLFFSLFSLKTPFRIGRIKGYTRL 
LLNYRTDTYSNSLTTDMVKRNLLLAAPLEKIKPIQYTKEFRLYLTDQEKKDFRYYMEKEG IDFARPVFLIGVTTKLLHKKWNTEFMITTLKRILEEYKDIQMIFNYAPGYEEEDARNIYK ELGCPERIKIDIQASSLRQLAALCANCSFYFGNEGGARHIAQALEIPSFAIYSPSASKSM WLPANSVLARGISPDDILPPEQQATLTYEERFALITPEKVYDQLTSTLRQLS >gi|222159349|gb|ACAB01000010.1| GENE 18 16546 - 17622 708 358 aa, chain - ## HITS:1 COG:SMb21078 KEGG:ns NR:ns ## COG: SMb21078 COG0438 # Protein_GI_number: 16264405 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Sinorhizobium meliloti # 133 354 179 399 402 92 29.0 1e-18 MEKEIQRKKKVLIDLTNYGSLTAGFGQIAANYAAAFSSMPIEDLHFVYLLRQKYMQEFGP NVTSVPVRRINKFFPFTLPKVDVWHAVNQQRKLLRIAGGTKFIFTIHDFNFLTEKKPWKA KMYLRRMQNKVNKAAVVTTISHYVADVIRQHIDLKGKEIRVIYNGVERIDMLEGTQPSFA TGRPFFFTIGQIRRKKNFHLLVDVMRHFPEYDLYICGDAHFAYAEEVRNLIREKQVTNVF LTDVITQSEKIWLYRNCEAFLFPSEGEGFGLPVVEAMQFGKAVFAANRTSLPEVCNGHAI MWEHLDTESMVQSIREHLPDFYKDEERLTRMKEHAASFSYEKHIQAYLDLYRELAQLS >gi|222159349|gb|ACAB01000010.1| GENE 19 17682 - 18713 744 343 aa, chain - ## HITS:1 COG:FN0546 KEGG:ns NR:ns ## COG: FN0546 COG0859 # Protein_GI_number: 19703881 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Fusobacterium nucleatum # 3 339 6 332 335 90 26.0 4e-18 MARILIIRFSALGDVAMTIPVIHSLAVQYPQHEITVLSRAVWQPLFQGLPANVGFVGADL TGKHKGLWGLNSLYSELKTMHFDYIADFHHVLRSKYLCLRFRLANKPVASICKGRAGKKK LVRRHDKVMENQKSSFRRYADVLEKLGLPVLLNFSSIYGEGKGNFAEIEPVTGPKEGQKW IGIAPFAKHRGKIYPLELQEQVIAHFAANPQVKVFLFGGGKNEQEVFDAWIAKYPSVVSM IGKLNMRTELNLMSHLDVMLSMDSANMHLASLVNIPVVSIWGATHPYAGFMGWKQLPVNT VQLDLSCRPCSVYGQKPCWRGDYACLRDIKPEQVIAKIEGIVG >gi|222159349|gb|ACAB01000010.1| GENE 20 18717 - 19322 533 201 aa, chain - ## HITS:1 COG:no KEGG:BDI_2820 NR:ns ## KEGG: BDI_2820 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 201 1 201 201 293 74.0 3e-78 MTFSNLCNEIFWKSTTDYHVTDSVDAPMNNPYELKTIEYYLYLKNWIDAVQWHFEDIIRD PQIDPVEALALKRRIDKSNQDRTDLVELIDSYFLDKYKEVKPLSDATINTESPAWAIDRL 
SILALKIYHMQQEVERTDTTEEHRLQCQTKLNILLEQRKDLSAAIEQLLADIEAGRKYMK VYKQMKMYNDPALNPVLYAKK >gi|222159349|gb|ACAB01000010.1| GENE 21 19440 - 20480 523 346 aa, chain + ## HITS:1 COG:STM2370 KEGG:ns NR:ns ## COG: STM2370 COG0111 # Protein_GI_number: 16765697 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Salmonella typhimurium LT2 # 1 344 1 354 378 252 38.0 7e-67 MKIIIDNKIPFIKEAVQRIADEVVYVPGKDFTPELIRDADALIVRTRTHCNRNLLEGSRV KFIATATIGFDHIDTEYCKQAGIEWTNAPGCNSASVAQYIQSSLLVWKSCRNKKLNELTI GIIGVGNVGSKVAKVARDFGMRVLLNDLPREEKEGTEQFASLSKIAEECDIITFHVPLYK EGKYKTFHLADENFFQSLKRKPVIINTSRGEVIDTDALLKALDSRIISDAIIDVWEHEPE INRELLEKAFIGTPHIAGYSADGKANATRMSLDAICKFFQIEADYKINAPAPASPIIHAK NHEEAILQIYNPVEDSTRLKNQPEQFETLRGDYPLRREETAYLIKY >gi|222159349|gb|ACAB01000010.1| GENE 22 20558 - 21133 347 191 aa, chain - ## HITS:1 COG:MA0316 KEGG:ns NR:ns ## COG: MA0316 COG0299 # Protein_GI_number: 20089214 # Func_class: F Nucleotide transport and metabolism # Function: Folate-dependent phosphoribosylglycinamide formyltransferase PurN # Organism: Methanosarcina acetivorans str.C2A # 2 178 8 187 204 137 44.0 1e-32 MKKNIAIFASGSGSNAENIIRYFQKNDSVQVSLVLSNKSDAYVLERAHRLGVPCNVFPKE DWIAGDEILAILQEYRIDFVVLAGFLVRVPDLLLHAYPDKIINIHPALLPKFGGKGMYGD RVHEAVVAAGEKESGITIHYINEHYDEGNAIFQATCPVFPTDSPDDVAKKVHALEYEHFP QVIEQVLRNKY >gi|222159349|gb|ACAB01000010.1| GENE 23 21283 - 21519 403 78 aa, chain + ## HITS:1 COG:SMc00573 KEGG:ns NR:ns ## COG: SMc00573 COG0236 # Protein_GI_number: 15964896 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Sinorhizobium meliloti # 1 75 1 75 78 77 64.0 6e-15 MSEIASRVKAIIVDKLGVEESEVTTEASFTNDLGADSLDTVELIMEFEKEFGISIPDDQA EKIGTVGDAVSYIEEHAK >gi|222159349|gb|ACAB01000010.1| GENE 24 21535 - 22797 1397 420 aa, chain + ## HITS:1 COG:BS_yjaY KEGG:ns NR:ns ## COG: BS_yjaY COG0304 # 
Protein_GI_number: 16078199 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Bacillus subtilis # 1 418 1 411 413 419 53.0 1e-117 MELKRVVVTGLGAITPVGNSVPEFWENLVNGVSGAGPITHFDASLFKTQFACEVKNFDAT KYIDRKEARKMDLYTQYAIAVAKEAVSDSGLDVEKEDLNKIGVIFGAGIGGIHTFEEEVG NYYTRQEMGPKFNPFFIPKMISDIAAGQISIMYGFHGPNYATCSACATSTNAIADAFNLI RLGKANVIVSGGSEAAIFPAGVGGFNAMHALSTRNDEASKASRPFSASRDGFVMGEGGGC LILEELEHAKARGAKIYAEVAGVGMSADAHHLTASHPEGLGAKLVMKNALEDAEMDPKEV DYINVHGTSTPVGDISEAKAIKEVFGDHAFELNISSTKSMTGHLLGAAGAVESIASILAI KNGIVPPTINHEEGDNDENIDYNLNFTFNKAQKREVNVALSNTFGFGGHNACVIFKKYAE >gi|222159349|gb|ACAB01000010.1| GENE 25 22724 - 23830 655 368 aa, chain + ## HITS:1 COG:SA1076 KEGG:ns NR:ns ## COG: SA1076 COG0571 # Protein_GI_number: 15926816 # Func_class: K Transcription # Function: dsRNA-specific ribonuclease # Organism: Staphylococcus aureus N315 # 53 268 24 241 243 108 36.0 2e-23 MLPCPIHLDLAVTMRALSSRNTQSNIVLRNEIDKIRLLFRKDRESYLCFYRILGFYPRNI QLYEQALLHKSTSVRSDKGRPLNNERLEFLGDAILDAIVGDIVYKRFEGKREGFLTNTRS KIVQRETLNKLAVEIGLDKLIKYSTRSSSHNSYMYGNAFEAFIGAIYLDQGYERCKQFME QRIINRYIDLDKISRKEVNFKSKLIEWSQKNKMEVSFELIEQFLDHDSNPVFQTEVRIEG LPAGTGTGYSKKESQQNAAQMAIKKVKEPVFMSTVEEIKAQHSATTTEPEVELATNSETE LATELENEFENELEAELNVIPETKPENVLEENTSDNEISTVQPTMEAEETSQSKSPCDAP ESPNRDLQ >gi|222159349|gb|ACAB01000010.1| GENE 26 23772 - 24782 863 336 aa, chain - ## HITS:1 COG:Cgl1221 KEGG:ns NR:ns ## COG: Cgl1221 COG0205 # Protein_GI_number: 19552471 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Corynebacterium glutamicum # 1 318 4 333 346 268 46.0 8e-72 MRIGILTSGGDCPGINATIRGVCKTAINYYGMEVVGIHSGFQGLLTKDVESITDKSLSGL LNLGGTMLGTSREKPFKKGGVVSDVDKPSLILQNIREMELDCVVCIGGNGTQKTAAKFAA MGVNIVSVPKTIDNDIWGTDISFGFDSAVSIATDAIDRLHSTASSHKRVMVIEVMGHKAG WIALYSGMAGGGDVILLPEIAYDIKNIGNTILERLKKGKPYSIVVVAEGIRTDGRKRAAE YIAQEIEYETGIETRETVLGYIQRGGSPTPFDRNLSTRMGGHATELIAKGEFGRMVALKG 
DDIASIPLEEVAGKLKLVTEDHDLVIQGRRMGICFG >gi|222159349|gb|ACAB01000010.1| GENE 27 24920 - 26446 1210 508 aa, chain + ## HITS:1 COG:no KEGG:BT_3355 NR:ns ## KEGG: BT_3355 # Name: not_defined # Def: putative auxin-regulated protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 502 1 502 503 1004 96.0 0 MNITKIISKTFDSRLKQIDLYASQASEIQHRVLSRLTRQAAQTEWGRKYDYSSIRNYEDF RKRVPIQTYEEIKPYVERLRAGEQNLLWPSEIRWFAKSSGTTNDKSKFLPVSKEALEDIH YRGGKDAAALYFRINPDSHFFSGKGLILGGSHSPNLNSNHSLVGDLSAILIQNVNPLINF IRVPSKKIALMSEWETKIEAIANSTIPVNVTSLSGVPSWMLVLIKRVLEKTGKQTLEEVW PNLEVFFHGGVAFTPYREQYKQVITTPKMHYVETYNASEGYFGTQNDLSDPAMLLMIDYG IFYEFVPLEEVGKENPRAYCLEEVELNKNYAMVISTSCGLWRYMIGDTVKFTSKNPYKFV ITGRTKHFINAFGEELIVDNAEKGLAKACAETGAQICEYSAAPVFMDENAKCRHQWLIEF AKMPDSVEKFAAILDATLKEVNSDYEAKRWKDIALQPLEVIVARQGLFHDWLAQKGKLGG QHKVPRLSNTREYIEAMLVLNNSAHPEE >gi|222159349|gb|ACAB01000010.1| GENE 28 26687 - 27826 754 379 aa, chain - ## HITS:1 COG:CAC2233 KEGG:ns NR:ns ## COG: CAC2233 COG0482 # Protein_GI_number: 15895501 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Clostridium acetobutylicum # 5 375 2 354 355 238 37.0 1e-62 MIEENKRVLLGMSGGTDSSVAAMRLLEAGYEVIGVTFRFYELNGSTEYLEDARNLAERLG IRHITYDAREIFARQIIEYFVQEYLAGRTPVPCTLCNNYLKWPLLAKIADEMGIFYIATG HYAQNIQLNETFYITYAADSDKDQTFFLWGLKQDILNRMLLPMGDITKAEARAWAAEHGF RKVATKKDSIGVCFCPMDYRTFLKDWLVRNGQMSVSNSEASINNSQTSGSDKQTWSDRIR RGRFVDEKGNFIAWHEGYPFYTIGQRRGLGIHLNRPVFVKEIDPEKNEVMLASLSSLEKT EMWLKDWNLVNQERTLGRSDIIVKIRYRKQENYGTITVTSDHLLHVQLHEPLTAIAPGQA AAFYGDGLLLGGGIIVNAR >gi|222159349|gb|ACAB01000010.1| GENE 29 27880 - 30720 1419 946 aa, chain + ## HITS:1 COG:PA0799 KEGG:ns NR:ns ## COG: PA0799 COG0553 # Protein_GI_number: 15595996 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Pseudomonas aeruginosa # 462 930 163 639 663 328 40.0 4e-89 
MKERPTNGQVIIVFTEHPILGILLIPYIAERLNDGTLQLVEQAFHASPEAMSIMSEAERQ AIDIASYYTEKYLMGLYSREKTVSRFLHKLSEDPERIKNNIRPFIEKKLLEMLALIRENG LPFYQKQAGSKILYAHHIYHINPHDVEIRVTFHVDSKTFRYQLQCYYEGQPFSLSELKPV VVLTSSPATLLLGMELYFFPHIESARILPFTKKRSISVDALQIEKYIDNIVIPIARYHDI ETHGLNITEEECACEAVLSFEDATYNGQALQLVFRYGDQTFAPDSANEMKKIIYRKTSGE IGFFRRNITVEEQAVQLLTNAGLQQLNATHFQLSAKAPEKTIVEWINNHREMLQQSFHLT SNVGNALYCLDEIRIEQSCDDEVDWFELHITVVIGNLRIPFSRFRKHILEEKREYLLPDG RMILLPEEWFSKYANLLEMGVQTEKGIRLKHAFIGAVQTALGEDGVKKFPAKQQIHNVAV PRTLKATLRPYQQKGFSWMVHLHKQGFGGCLADDMGLGKTLQTLTLLQYIYKPSAPKQPA TLIVVPTSLLHNWRREAKRFTGLSMMEYNNTVAIDKKRPEKFFGHFHLIFTTYGMMRNNI DILSSYHFEYVVLDESQNIKNSESLTFRSAIQLQSKHRLALTGTPIENSLKDLWAQFHFI QPDLLGTESAFQKQFIMPIRQGNERVRVQLQQLTAPFILRRSKKEVAPELPELTEETIYC DMTEEQNTCYEQEKNSLRNILLQHPQSTNRLHSFSVLNGILRLRQLSCHPQLILPDYTGT SGKTAQIIETFDTLQSEGHKVLIFSSFVRHLEVLAEAFRERGWKYALLTGATNNRPSEIA HFTEQKDVQAFLISLKAGGVGLNLTQADYVFIIDPWWNPAAESQAIARAHRIGQDKQVIA YRFITQNSIEEKILHLQDEKRKLAETFVADSESLPILSNEQWVDLL >gi|222159349|gb|ACAB01000010.1| GENE 30 30866 - 33472 1642 868 aa, chain + ## HITS:1 COG:no KEGG:BT_3328 NR:ns ## KEGG: BT_3328 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 868 8 868 868 515 36.0 1e-144 MKNRSITYLLLFGICSMLAIMQSCQKTDDLEADINSLKDRVAALEKATEGLNTSFASLQA LMQKNKIIIGITPTKDGLGYLLELSDGTSIKVMESEAVQASVPEFSVDEEGYWIYKTSND TDFKYLPGADGEKISAWPRDGEGNVVATPLISVSSSGYWQVSYDNGQTYTSLGTKAEGGS QGGTSIFSKVEYNEANHTFSFTLADGEKTYTFPVDDSFGLIIYGLNDAESEQAVQVFAPN ESHKEYKVEQNDVQQAAIQAPKGWDVLLSENLLTITPQATVVKDVEETIKIVLTSSKNYI RIVSIEVKQLSNETGAKAWQQFVNADQQNVLLDFSYAGYKHGEVAPPEIETLIAQGYKVY DVTDPQYGAIPNDGKSDRAAFMKVLEEIARETKQEDLNNMTDRYIKENAKAIIYFPEGNY ILQDKDSKDRRIRISMSDIVLKGAGRNKTTLEMTAANNSPKPTEEMWNAPVMMEFKHNTG LGESIGAITEDAPIGSKTITASLTGVSAGSWVCLVLGTPKLGNTDNDVINSELSPYQWQD IKVQQGITPNIKTNGIQIFEYHQIEKISGNSITFKEPIMHAINKDWGWNVHKFANYANVG VEDLTFKGHAKEKFIHHGSDIDDGGFKLIDFVRLTNSWMRRVNFESVSEAMSITSSANCS AYDITIGGNRGHASIRSQASSRIFIGKVTESSNGYTLRKGEGESTLMEYKTNVGQYHACG 
VSKQSMGAVIWNVKWGDDSCFESHATQPRATLIDCCTGGFMHWRQGGDSAQMPNHMENLT IWNFYATNVQTDQDIDTEGKFTWWDSNGFWWKFMPPIIVGFHGSPLDFDATQMKRLESNG TAVEPYSLYEAQLRKRLGYVPSWLSSLK >gi|222159349|gb|ACAB01000010.1| GENE 31 33494 - 35447 1370 651 aa, chain + ## HITS:1 COG:no KEGG:BT_3329 NR:ns ## KEGG: BT_3329 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 31 651 32 632 645 169 29.0 3e-40 MKKVWSMFMLLAVCLVACTNIDDLEDDVDALKKRVTALETQVRDINSNTEALRELYNEGT FITNIEEKSDSYTLTLSNGKTVNLYMKNDNNLLCPIIGIDSEGYWTVLYNKNETPERLTV NGQPVKANGESGKTPTFNVDSEGYWQVSYDEGKNYEYIYKEGTTDKVSATGDGSAPAEDK NFKSVTVENNELVLVLAGEDAPTIRIPIISDFECSFAAEDLEQIQEFSAGETKEFTMTMR GVKNTMITAPEGWSAKFSKEAGKENVLIVTAPASSAKMMTRATADNSTDIAILATSGKYA MIAKIQVSIKNRTDYKADFDHGKDITIGGITINNQIYSDADIQILDATDADVALDTYFSA TMSKPVILFLTGTAHNFTTTGVKSISNDVIIIGRYDDEQVTLRPINCWKSCKGKLLFKNI KIDLSDLNGGSNAGYFINNAGVISKGDFTDICIDNCLIANVLKPIYYDAAQKTYFGIDNI SVQDTRIEVNAIKIALINIYKGFNLGDYKTFNFKNNIVYSQTPQEGVQILNWATGNIPLS DGVLSAEIINNTFVNMIGSNIFFRYQKGTSLTISKNIFDVSPEAEFGSYYYSFLESCTPQ IDVTDNIVYGLTKNWNYYHTSSLVKEPTSGNNITKHATAPITQYDYVNGIF Prediction of potential genes in microbial genomes Time: Wed May 18 00:59:57 2011 Seq name: gi|222159348|gb|ACAB01000011.1| Bacteroides sp. D1 cont1.11, whole genome shotgun sequence Length of sequence - 69251 bp Number of predicted genes - 42, with homology - 41 Number of transcription units - 25, operones - 9 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 72 - 131 2.4 1 1 Tu 1 . + CDS 245 - 1057 765 ## COG0561 Predicted hydrolases of the HAD superfamily + Term 1070 - 1118 -0.0 + Prom 1159 - 1218 4.8 2 2 Tu 1 . + CDS 1288 - 2769 1453 ## COG0215 Cysteinyl-tRNA synthetase + Term 2809 - 2849 5.1 + Prom 2881 - 2940 6.6 3 3 Tu 1 . + CDS 3023 - 3928 694 ## BDI_3256 putative transposase 4 4 Tu 1 . - CDS 3945 - 6806 2304 ## BT_3350 putative chondroitinase (chondroitin lyase) - Prom 6829 - 6888 4.4 5 5 Tu 1 . 
- CDS 6937 - 8487 1352 ## COG3119 Arylsulfatase A and related enzymes - Prom 8520 - 8579 5.2 + Prom 8254 - 8313 3.1 6 6 Tu 1 . + CDS 8562 - 8759 57 ## 7 7 Tu 1 . - CDS 8652 - 9854 1159 ## BT_3348 putative unsaturated glucuronyl hydrolase - Prom 10040 - 10099 6.4 + Prom 9943 - 10002 4.9 8 8 Tu 1 . + CDS 10060 - 10470 490 ## COG2050 Uncharacterized protein, possibly involved in aromatic compounds catabolism - Term 10513 - 10575 -0.4 9 9 Op 1 . - CDS 10641 - 11516 870 ## BT_3342 hypothetical protein 10 9 Op 2 . - CDS 11461 - 13416 1557 ## BT_3341 hypothetical protein - Prom 13491 - 13550 9.0 + Prom 13453 - 13512 8.2 11 10 Tu 1 . + CDS 13715 - 14326 524 ## BT_2534 hypothetical protein + Term 14355 - 14405 12.1 - Term 14343 - 14391 7.9 12 11 Op 1 . - CDS 14451 - 15407 990 ## COG5464 Uncharacterized conserved protein 13 11 Op 2 . - CDS 15420 - 15572 116 ## gi|294647597|ref|ZP_06725169.1| conserved domain protein - Prom 15658 - 15717 4.2 + Prom 15552 - 15611 5.6 14 12 Tu 1 . + CDS 15706 - 16182 640 ## BT_2538 hypothetical protein + Term 16232 - 16280 11.0 - Term 16218 - 16266 11.0 15 13 Op 1 . - CDS 16279 - 17928 1513 ## BT_2821 hypothetical protein 16 13 Op 2 . - CDS 17943 - 20708 2295 ## BT_2820 hypothetical protein 17 13 Op 3 . - CDS 20742 - 22538 1395 ## BT_2819 hypothetical protein 18 13 Op 4 . - CDS 22550 - 25726 2489 ## BT_2818 hypothetical protein - Prom 25805 - 25864 10.7 + Prom 26009 - 26068 6.5 19 14 Tu 1 . + CDS 26161 - 29268 3046 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 29328 - 29378 12.9 + Prom 29300 - 29359 3.3 20 15 Op 1 27/0.000 + CDS 29450 - 30625 1038 ## COG0845 Membrane-fusion protein 21 15 Op 2 9/0.000 + CDS 30642 - 34073 3151 ## COG0841 Cation/multidrug efflux pump 22 15 Op 3 . + CDS 34085 - 35476 374 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 23 15 Op 4 . 
+ CDS 35495 - 36265 810 ## COG1043 Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase + Term 36287 - 36333 7.2 - Term 36265 - 36329 10.1 24 16 Tu 1 . - CDS 36427 - 37068 429 ## BT_3335 hypothetical protein - Prom 37125 - 37184 7.1 + Prom 37095 - 37154 5.4 25 17 Tu 1 . + CDS 37181 - 41239 3146 ## COG0642 Signal transduction histidine kinase + Prom 41322 - 41381 3.6 26 18 Tu 1 . + CDS 41532 - 43067 1302 ## COG3119 Arylsulfatase A and related enzymes + Prom 43080 - 43139 5.6 27 19 Op 1 . + CDS 43191 - 46385 2964 ## BT_3332 hypothetical protein 28 19 Op 2 . + CDS 46413 - 48185 1811 ## BT_3331 hypothetical protein 29 19 Op 3 . + CDS 48216 - 49316 781 ## BT_3330 hypothetical protein + Term 49395 - 49438 10.1 + Prom 49340 - 49399 4.7 30 20 Op 1 . + CDS 49544 - 50491 730 ## COG0042 tRNA-dihydrouridine synthase 31 20 Op 2 . + CDS 50546 - 51823 913 ## BT_3325 hypothetical protein + Term 52060 - 52100 3.1 - Term 52182 - 52217 1.3 32 21 Op 1 . - CDS 52241 - 55312 2176 ## BT_3324 chondroitinase (chondroitin lyase) precursor 33 21 Op 2 . - CDS 55348 - 56040 389 ## BVU_0159 hypothetical protein - Prom 56063 - 56122 1.9 - Term 56068 - 56109 9.0 34 22 Tu 1 . - CDS 56156 - 57526 1008 ## BF3314 hypothetical protein - Prom 57586 - 57645 2.9 + Prom 57996 - 58055 3.9 35 23 Op 1 . + CDS 58076 - 59026 793 ## Cpin_2255 hypothetical protein 36 23 Op 2 . + CDS 59032 - 60507 1099 ## COG1215 Glycosyltransferases, probably involved in cell wall biogenesis 37 23 Op 3 . + CDS 60538 - 63507 2161 ## Cpin_2252 TPR repeat-containing protein 38 23 Op 4 . + CDS 63482 - 65635 1355 ## Cpin_2251 coagulation factor 5/8 type domain protein 39 23 Op 5 . + CDS 65674 - 66189 418 ## gi|237716783|ref|ZP_04547264.1| conserved hypothetical protein - Term 66195 - 66262 17.4 40 24 Tu 1 . - CDS 66283 - 66843 590 ## BT_3323 hypothetical protein 41 25 Op 1 . + CDS 67130 - 68935 430 ## BDI_3446 hypothetical protein 42 25 Op 2 . 
+ CDS 68966 - 69251 241 ## gi|237720541|ref|ZP_04551022.1| predicted protein Predicted protein(s) >gi|222159348|gb|ACAB01000011.1| GENE 1 245 - 1057 765 270 aa, chain + ## HITS:1 COG:VC1364 KEGG:ns NR:ns ## COG: VC1364 COG0561 # Protein_GI_number: 15641376 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Vibrio cholerae # 3 266 2 265 273 176 35.0 4e-44 MKYKLIVLDLDGTLTNSKKEITPRNRETLIRIQEQGIRLVLASGRPTYGIVPLANELRMN EFGGFILSYNGGEIINWETKEMVYENVLPNEVVPMLYECARTHQLSILTYDGADIITENS QDPYVQKEAFLNKMAVRETNDFLTEITLPVAKCLIVGDADKLIPLEAELSLRLQGHINVF RSEPYFLELVPQGIDKALSLAVLLKEIGVAREEVIAMGDGYNDLSMIKFAGLGIAMGNAQ EPVKKAANYITLSNEEDGVAEAIDKFCSQQ >gi|222159348|gb|ACAB01000011.1| GENE 2 1288 - 2769 1453 493 aa, chain + ## HITS:1 COG:DR1670 KEGG:ns NR:ns ## COG: DR1670 COG0215 # Protein_GI_number: 15806673 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Cysteinyl-tRNA synthetase # Organism: Deinococcus radiodurans # 5 490 52 531 532 457 47.0 1e-128 MEHQLTIYNTLDRKKELFVPLHAPHVGMYVCGPTVYGDAHLGHARPSITFDVLFRYLTHL GYKVRYVRNITDVGHLEHDADDGEDKIAKKARLEELEPMEVVQYFLNRYHKAMEALNVLS PSIEPHASGHIIEQIQLVQKILDAGYAYESEGSVYFDVAKYNKDHHYGKLSGRNLDDVLN TTRDLDGQSEKRNPADFALWKKAQPEHIMRWPSPWSDGFPGWHAECTAMGRKYLGEHFDI HGGGMDLIFPHHECEIAQSVASQGDDMVHYWMHNNMITINGTKMGKSLGNFITLDEFFNG THKLLAQAYTPMTIRFFILQAHYRSTVDFSNEALQASEKGLQRLIEAIDALDKITPAPAT SEGINVKELRAKCYEAMNDDLNTPIVIAQLFEGARIINNIIAGNATISAEDLKELKETFH LFSFDIMGLKEEKGSSDGREAAYGKVVDMLLEQRMKAKANKDWATSDEIRNTLTALGFEI KDTKDGFEWRLNK >gi|222159348|gb|ACAB01000011.1| GENE 3 3023 - 3928 694 301 aa, chain + ## HITS:1 COG:no KEGG:BDI_3256 NR:ns ## KEGG: BDI_3256 # Name: not_defined # Def: putative transposase # Organism: P.distasonis # Pathway: not_defined # 1 301 1 301 301 555 90.0 1e-157 MFSEAKVTEIYCMADDFCKEFTLQQKKYMIEDKGHKHRNKPNRMNDAEVMVILILFHSGG FRCFKHYYKEYVCKHLRGLFPRCVSYNRFVELEKEILLPLTIFIKKVLLGSCTGISFVDS TPLRVCRNQRILIHKTFEGLAERGRCSMGWFFGFKLHLIINDKGEILNFMFTPGNVDDRE 
PLKQGKFLENIKGKLYADKGYIGQTLFENLFLNGIQTITKVKNNMKNSLMSIADKILLRK RALIETVNDELKNIAQIEHSRHRSFNNFIANALSAIAAYCFFEKKPAIDVKFVNDGQLAM F >gi|222159348|gb|ACAB01000011.1| GENE 4 3945 - 6806 2304 953 aa, chain - ## HITS:1 COG:no KEGG:BT_3350 NR:ns ## KEGG: BT_3350 # Name: not_defined # Def: putative chondroitinase (chondroitin lyase) # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 953 1 953 953 1726 86.0 0 MKSSCFIILFLLGIIPSAFAQLIGFEEEVPEAFKVSGKGEVKISSLFYKEGESSLEWDFQ PGSTLDVQIAPLSLNTRNEKQFGITLWIYNEKPQQDSIRFEFLNKEGEVSYWFSYRLQAA GWRACWISFEYMQGDKKDKKIVAYRLVAPQRKGRIFLDRLIFPEKKMNLRTTPDQQLPTN NGLSNRDLWHWCLVWKWEQQSYDIPLPSKLTSEQKKELKTIEQRLTDFLEVKKAPQGPIN AAYKTFEKAAISPSIAGTGFIGTPIVAPDEQDKKKGEMSWNDIETMLSGFAYDAYYNQNE TSKKNYFTVFDYAIDQGFAYGSGMGTNHHYGYQVRKIYTTAWLMRDAIYKHPHRDAYLST LRFWAALQETRQPCSPTRDELLDSWHTLLMAKFISAMMFPDAREQAQALSGLSRWLSSSL RYTPGTIGGIKVDGTTFHHGGFYPGYTTGVLATVGEYIAFTNGTSFELTEDARKHMKSAF IAMRNYCNFYEWGIGISGRHPFGGKMGSDDIEAFANIALSGDLSGQGNTFDRGLAADYLR LIRNSDTPNARFFKKEGIQPAQAPQGFFVYNYGSAGIFRRADWMVTLKGYTTDVWGSEIY TKDNRYGRYQSYGSVQIMGKGNPVSRAGSGFVQEGWDWNRLPGTTTIHLPFNLLDSPLKG TTMARSKENFSGSSSLDGKNGMFAMKLAERDYENFTPDFVARKSVFCFDNRMVCLGTGIA NSNADYPTETTLFQTKYNGKEPKVGEDNYWLHDGYDNYYHVVDGTVRAQVAEQESRHEKT REITKGKFSSAWIEHGKAPKEGTYEYMVLIQPSASDLDELRKTPAYEVLQRDQTAHVVYD KKTGITAYAAFEAYQPAADKVFVAIPAETMVMYAKESDKGIRLSVCDPNLNIEEKTYTTK EPSRPITKEIRLKGHWRLTSPMENVRLEQQGDQTVLTVTCQHGQPVEMLMENK >gi|222159348|gb|ACAB01000011.1| GENE 5 6937 - 8487 1352 516 aa, chain - ## HITS:1 COG:YPO0829 KEGG:ns NR:ns ## COG: YPO0829 COG3119 # Protein_GI_number: 16121138 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Yersinia pestis # 14 512 28 511 517 305 37.0 1e-82 MNKPINLLVGGLTLFAAQGCKAPKQVPQQAEHPNIIYVFPDQYRNQAMGFWSQDGFRDKV NFKGDPVHTPNLDAFARESMVLSSAQSNCPLSSPHRGMLLTGMYPNKSGVPLNCNSSRPI SSLREDAECIGDVFSKAGYDCAYFGKLHADFPTPNDPEHPGQYVEEKRPAWDAYTPKERR HGFNYWYSYGTFDEHKNPHYWDTDGKRHDPKEWSPLHEAGKVVSYLKNDGNVRDTKKPFF 
IMVGMNPPHSPYRSLNDCEEQDFNLYKDQPLDSLLIRPNVDLKMKKAASARYYFASVTGV DRAFGQILETLKEMGLDKNTVVIFASDHGETMCSQRTDDPKNSPYSESMNIPFLVRFPGK IQPRVDDLLLSAPDIMPTVLGLCGLGDSIPAEVQGRNFAPLFFDEKAEIVRPTGALYIQN VDGDKDENGLVQTYFPSSRGIKTAQYTLALYIDRDTKELKKSLLFDDVKDPYQLHNLPLE ENKEIVAQLCGEMGAMLKEINDPWYTEKILSDRIPY >gi|222159348|gb|ACAB01000011.1| GENE 6 8562 - 8759 57 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MINGKHVTLESVEYSSFTAYRSLFFIISYSSFLFQISFTHQSFQEVIVSIVQRDIDFLTV RNTTY >gi|222159348|gb|ACAB01000011.1| GENE 7 8652 - 9854 1159 400 aa, chain - ## HITS:1 COG:no KEGG:BT_3348 NR:ns ## KEGG: BT_3348 # Name: not_defined # Def: putative unsaturated glucuronyl hydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 400 1 400 400 789 96.0 0 MKTILSALGLSLLILTSCGGQKKVEVDFIQDNIDNAVAQNTIQTDIIEKSGKILNPRTIN KDGSISYIPIEDWCSGFFPGSMWLTYNLTGDQKWLPLAEKYTEALDSVKYLKWHHDVGFM IGCSYLNGYRLADKKEYKDVIIETAKSLSTRFRPNAGVIQSWDADRGWQGTRGWKCPVII DNMMNLELLFEATALSGDSTFYNIAVKHADTTMAHHFRPDNSCYHVVDYDPETGEVRKKQ TAQGYADESSWARGQAWALYGYTTCYRYTKDKKYLDQAQKVYNFIFNNKNMPEDLVPYWD YDAPNIPNEPRDASAAACTASALYELDGYLPDNHYKETADKIMESLGSPAYRAKVGTNGN FILMHSVGSIPHGQEIDVPLNYADYYFLEGLMRKRDLEKK >gi|222159348|gb|ACAB01000011.1| GENE 8 10060 - 10470 490 136 aa, chain + ## HITS:1 COG:MA0735 KEGG:ns NR:ns ## COG: MA0735 COG2050 # Protein_GI_number: 20089620 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized protein, possibly involved in aromatic compounds catabolism # Organism: Methanosarcina acetivorans str.C2A # 4 132 16 143 146 110 49.0 7e-25 MTPQEFFKNDIFAKNAGIILLEVRKGYSKAKLEIKAEHLNAGARTQGGAIFTLADLALAA AANSHGTLAFSLSSTITFLRASGPGDTLFAEARERYIGRSTGCYQVDITNQDGELIATFE SSVFRKDQKVPFEVQE >gi|222159348|gb|ACAB01000011.1| GENE 9 10641 - 11516 870 291 aa, chain - ## HITS:1 COG:no KEGG:BT_3342 NR:ns ## KEGG: BT_3342 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 34 291 2 259 259 500 92.0 1e-140 
MKTRKRKIGIEIIRTAVVDARRNLIIGLVALLTGIFALPASAQCEAKNDAFQSGEHVMYD LYFNWKFVWVKAGLASLTTNATTYHSQPAYRINLLALGSKRADFFFKMRDTLTCVIGEKL EPRYFRKGAEEGKRYTVDEAWFSYKDGLCLVNQKRTYRDGAFDESEASDSRCIYDMLSIL AQARSYDPADYKVGDKIKFPMATGRKVEEQTLIYRGKENVKAENGVTYRCLIFSLVEYDK KGKEKEVITFFVTDDLNHLPVRLDLFLNFGSAKAFLNNVTGNRHPLTSIVK >gi|222159348|gb|ACAB01000011.1| GENE 10 11461 - 13416 1557 651 aa, chain - ## HITS:1 COG:no KEGG:BT_3341 NR:ns ## KEGG: BT_3341 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 628 1 616 642 1051 87.0 0 MWYLLNKNAYFCSMLQFENQKIKQIVRKVLPVVTLGAMLYSCASIGRPDGGPYDETPPRF IGSTPEAGALNNKRTKVSLMFDEFIKLEKATEKVVVSPPQIQQPEIKASGKRIQVNLLDS LKPNTTYTIDFSDAIVDNNEGNPLGNFAFTFSTGTEIDTMEVAGTMLDASNLEPIKGMLV GLHSNLNDSAFNKLPFDRVARTDSRGRFSIRGVAPGKYRIYGLMDADQNFAFNQKSEMIA FHDSLIIPRMEERIRMDTAWVDSLTYDTIVEKKYMHYLPDDVILRAFKELNYSQYLIKSE RLVPQKFTFYFAGKADTLPVLKGLNFDEKDAFVIEKNQRNDTIHYWVKDSLLFKQDTLAM SLTYLYTDTLNQLVPRTDTLNLVSKQKYKKEEPDKDKKKKKKKKGEEDEPEPTKFLPVNV SAPSSMDVYGYISLNFEEPIASYDTAAIHLRQKVDTLWKDIPFEFEQDSVNLKKFNLYYD WEPTLEYEFSVDSTAFHGIYGLFTDKIKQGFKVRSEDEYFTLHFNVTGADSLAFVELLDA QDKVVRKRRVEEDGMVDFYFLNPGKYAARLINDTNGNGEWDTGDFAKGIQPEAVYYYPTI LEYKALWDVTQPWDIHATPVDKQKPDELKKQKPDEDKKKKDRDRNNQNRRR >gi|222159348|gb|ACAB01000011.1| GENE 11 13715 - 14326 524 203 aa, chain + ## HITS:1 COG:no KEGG:BT_2534 NR:ns ## KEGG: BT_2534 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 200 1 200 202 311 74.0 1e-83 MSIHYDLYDTPDIQKKGEEQPLHPRVVFNGTIDQEEFLDRVHKFTGISRSLLVGAMQSFQ NELRDLLANGWIVELGDIGYFSVSLQGPPVMKKKDVHAQSISLKNINFRAGKQFKKEVAQ QMRPERGASFTRPHGKGRSEEECLKLINKHLQQFPCLTRADYCRMTGHDKQRAINELNTF IERGLLIRYGAGKLVVYAKKSEE >gi|222159348|gb|ACAB01000011.1| GENE 12 14451 - 15407 990 318 aa, chain - ## HITS:1 COG:RC0439 KEGG:ns NR:ns ## COG: RC0439 COG5464 # Protein_GI_number: 15892362 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Rickettsia conorii # 119 318 1 183 192 67 26.0 5e-11 
MEQKDKYIRFDWAIKRLLRNKANFGVLEGFLTVLLGEEIHILEILESEGNQQREDDKFNR VDIKARNSKDEIILVEVQNTRELYYLERILYGVAKTITEHIDLGEIYSNVKKVYSISILY FDIGQGSDYLYHGQNTFLGVHTGDHLKVTTKEQGAIVRKLPAEIFPEYFLIRVNEFNKVA VTPLEEWIEYLKTGCIRPDTKAPGLEEARKKLVYYNMDKAEKLAYDRHIDAVMIQNDVLS TAKLEGHEEGHQLGLEEGRQRGLEEGIKEGIEKGIEQGIEQGIEKEKTITARKLKSLNLP VDTIIQVTGLSSEEIEKL >gi|222159348|gb|ACAB01000011.1| GENE 13 15420 - 15572 116 50 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294647597|ref|ZP_06725169.1| ## NR: gi|294647597|ref|ZP_06725169.1| conserved domain protein [Bacteroides ovatus SD CC 2a] # 1 50 1 50 50 85 100.0 1e-15 MLLYWGILCKKDGTEGLLSQVISIYTDKGYDSIILLAEIYPFYQYNSYLC >gi|222159348|gb|ACAB01000011.1| GENE 14 15706 - 16182 640 158 aa, chain + ## HITS:1 COG:no KEGG:BT_2538 NR:ns ## KEGG: BT_2538 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 156 1 158 160 188 67.0 7e-47 MALKYVVKKRVFGFDKTKTEKYVAQNVITNTVNFRDLCKEVSMFGMIPEGAVKHVIDALI DTLNTNLNKGLSVQLGDFGCFRPGMNCKSQNEEKDVDADTVRRVKIVFTPGYKFKDMLDN VSIYKTEGNGGTSSGNGGGNKPENPDEGGSGEAPDPAA >gi|222159348|gb|ACAB01000011.1| GENE 15 16279 - 17928 1513 549 aa, chain - ## HITS:1 COG:no KEGG:BT_2821 NR:ns ## KEGG: BT_2821 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 549 4 573 573 401 42.0 1e-110 MRKYILLYTGLLLSVSGCSLLELDESTGLDREEAYSYFSNVKGLATYVYSQLPGDLGVLD GALRESATDNSVYVWSDNSVHDFYNNAWSPNNAVDNMWSKCYGAIRSVNSFLENYSQEKL ERFRWNDTYEEDIAKATMYREELRVLRAFYLFELAKRYGDIPLLTRTYALDEINGVEKTS FNEVIKYICDECSDAAKTLPVSHQDFWAETGRVTKGTALALKSRALLYAASLLHNPAQDA DKWKAAADAAYAIIKENWYSLPKTNVDPLYDKNGGNDVLKSPQLIFERRNGESFDFEANN LPISYEKGKTGNVPTQNLVDAFQMTNGKDFDWEQITPGQNPYEGRDPRFYKTVLCNGDTW MNSTIQSYEGGKDGAGTTGATTTGYYLKKYMNETVSLAPSNEKKKPHHFIIFRYAEILLN YAEAMDAWKDADYTDNDHPLSARAALNQVRAAADMPAITTSGDAFTESVRRERRVELAFE DHRFWDIRRWKIGDKTKAIYCIKITMENGLPVYKKELLETRNWDDKMYLYPIPQTEYYKN PNLGQNTGW >gi|222159348|gb|ACAB01000011.1| GENE 16 17943 - 20708 2295 921 aa, chain - ## HITS:1 COG:no KEGG:BT_2820 NR:ns ## 
KEGG: BT_2820 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 921 1 926 926 583 38.0 1e-164 MNKSLYIVAFSLAFATTGIAQNNEGQDKVIDLGTQTTTELRKTQAVSTIYSNELDKNAAT SPYNAIYGLLPGLSVMQNTSWGTDKSRLNVRGRGSLNGDTPLFVVDGFARPLEYINLSEI ESISVLKDGAATALWGGRGANGVVLITTKRGIYSQKDIKVDYKFGLGFPVNQPEFADAYT YAKARNEALRYDGLQPDMDEASFLSGGNSDLYPNVDWQKEALRNHTTSHQLDITFRGGGK RLRYFSVLNYKNDMGLLNSKYTDYTDRYNSQMKKYFLNLRMNLDVDITDATKLKLSMLGM LRETKRPTTSEATLFEQIFNTPSAAFPVQTQNGFWGSNNVLKTNPIANLADVGYYKLNQR MLQADLRLVQDLSVLTRGLSAELAIAYDNNATYQETAKKSFMYQTIEKGTDGEPIYTNYG NPNDELEISNKGLANQYIHANFEAKVNYHRTWDKHDFTASAMFRQESMTLTGANNSRYRQ YIIGTAGYNFDNRYMVDIVANCFGSSVLGKNDKYRAYPAISAAWILSNENFMKEISAFDY LKLRASYGRSGYDIYDYDMDKQYWVGSGAYYFQAGNTSAGSSLKEGVLAMEQLDLEVADK YNIGLDMSLFKGLTFSIDGFYDKRTNILIDGSGLISSAIGVTIPQMNAGKVETKGTELSA MWKKEYKDFNYYIGANFSYAKSKVVENGEGYQPYGYLSKKGYPVGQCFGWEAIGYFRDEA DIKSSPVQKFSEVRPGDVKYRDLNGDNVIDNNDQKAIGYSTAIPEIYYGINLGFEYKGFG VDALFQGVAHYSVMLNTASVYWPLRNNTNISNWYLNDRIRWTEETKDIANVPRLTTLDNA NNFRNSTQWLENGSYFKLRNLNVYYNLPSSWAKKVKMEKIQVYARANNLFSLDHVKHMNC EDLKINYPDMTSVYFGLNINF >gi|222159348|gb|ACAB01000011.1| GENE 17 20742 - 22538 1395 598 aa, chain - ## HITS:1 COG:no KEGG:BT_2819 NR:ns ## KEGG: BT_2819 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 598 12 585 585 331 39.0 6e-89 MKNYKKYIGCLSLSVLTLIGCEDLKFGEKFLDKPISNEQNIDSVFNKKVYAEQALAETYH SLPDYLPMQGRLGYGVLEMLTDLGDWTKKGAPKFYTGTVDGTNTYLEHLPYRLDVANTTI GVGPVYGIRRAYIYIENVDRVPDMTADEKAIGKAEAQVIIAYHYSQMLRYYGGMPWIDHA YMAEDAMKFPRMTVEETVQKIVGLLDAAASVLPWQVNADNDGRMTAASALALKSRVLQFV ASPLFNAEKPYLEGDASSQFLTWYGNYSPDRWQKALDAGLEFMRANKKNSDAYQLVNTGN PRDDFAAGYFNRHNGEVLISSRRFTTYATGKLPFAQVRYGVASPTLTYVDMFQMKDGTEF DWNNPDHKKFPFFDKDGNPRRDIRLYETVAVNGDKFRGAQKVQIYEGATQAPYKDGRMSY NGVAMRKFIRNFNDEVNGKFYSCPLIRLPEVYLNMAEAMNELNKAEVRDEFGNTAYDYLN KTRLRAGMPAISATDVPAGKPLREAILRERAIEFGYEEVRYFDLVRWKRSDIFTGQLSRL IIKKAAGEPSGFSYTVSHAMAETRQYAKPEKWNDKYFLLPLPIDEINKKYGLIQNPGW >gi|222159348|gb|ACAB01000011.1| GENE 
18 22550 - 25726 2489 1058 aa, chain - ## HITS:1 COG:no KEGG:BT_2818 NR:ns ## KEGG: BT_2818 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 31 1058 31 1057 1057 865 46.0 0 MNKLTRLKRCKRCISTLMLMCCSLLAWAQNTVTGTVLDESNLEVIGANVKEKGTANGAIT DMNGVFHLKVSNLQKAVLQISFMGYEPQEVPLNGRKQLKIILKETANELQEVTVVAYGTQ KKETLTGAISAVNNEALIRSPNASVANSLAGQITGLSSVQTSGQPGQEDPKVFIRGVGSL TESASSPLILVDGIERSFYQMDPNEIESVTVLKDASATAVFGVRGANGVILVTTRRGKEG KAKISVNSSVGIQMPTRMLKMADSYTYATLRNEIITNDKPNATENDLVFNNYAVERFRLN DEPIMYPSIDWRKYLLNKTSVQTQHNLNISGGTKDIRYFISLGFLYQNGLLKQFEGVGYD NNYKYKRYNYRANLDFNVTKTTTLKIGIGGIVGNRNEPMINDATNSIWTLINQTTPFSSP GVIDGQLIVTPEERFDNKINLGNSALPKCYGTGYSTAINNTMNLDLLLNQKLDFITKGLS AEIKGAYNTSYGFTKKRKGQVEQFMPFYQSSLEDPSLGYDSPDFNKNIVYKIKGENKSLQ YEETSSRARDWYLEASLRYNRKFGFHNVGGLFLYNQSKKYYPAQWVGVPSAYIGFVGRLT YDYKSRYMAEVNFGYNGSENFAPGKRFGAFPAGSIGYILSEEAFMKKQKVVDYLKFRASV GLVGNDNLGNNRFLFLPDSYDVNLSGVDGWNNNKYGFNFGYNSKALILGALEKRLGNPNV TWETALKQNYGLDVHFLKSRLKISLDYFLEERKDILINRKTVPLLTGLTSSILPAVNMGK VKNRGYEVEVRWNDKIGQVQYNVQANVSYSKNKIIFQDEVEPNEPYMWRTGNPVGTLFGY VADGFYTEADFGENGKLVAGLPDPGVSVKPGDVKYRDLNGDEEITSDDQTIIGNPTRPAY TFGLNYGINYKGFFLTMNWTGAAQRSLLLDGAFREPFGNGKIRGLMQFHADTRWTPETAN TATTPRFTETNAVYNMRSSSLWVRDGSYLKLKNVTIGYNFTDKKMLKKLGIQQLGIKLTG YNLLTFDKFDIMDPECNPNNADSYPIIKIYNLGINLTF >gi|222159348|gb|ACAB01000011.1| GENE 19 26161 - 29268 3046 1035 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 25 1027 7 983 1087 704 38.0 0 MKKQLLSCCLAALGLTAIQAQSFNEWKDPEVNSVNRSAMHTNYFAFASADEAKAGIKENS ANFMTLNGLWKFNWVRHADARPTDFYQTNFNDKGWDDLQVPGVWELNGYGDPIYVNVGYA WRSQFKNNPPLVPTENNHVGSYRKEIILPADWKGKEIFAHFGSVTSNMYLWVNGRYVGYS EDSKLEAEFNLTNYLKPGKNVIAFQVFRWCDGSYLEDQDFFRYSGVGRDCYLYARDKKYI QDIRITPDLDSQYKDGTLNIAVDLKGSGTVALNLTDAQGNSVATADLKGSGKLNTTLSVS 
NPAKWTAETPNLYTLTATLKNGNNVVEVIPVKVGFRKIELKGGQILVNGQPVLFKGADRH EMDPDGGYVVSTERMLQDIKVMKELNINAVRTCHYPDDNRWYDLCDQYGLYVVAEANVES HGMGYGDQTLAKNPSYAKAHMERNQRNVQRGYNHPSIIFWSLGNEAGMGPNFEKCYTWIK NEDKTRAVQYEQAGTSEFTDIFCPMYYDYDACIKYSEGNIQKPLIQCEYAHAMGNSQGGF KEYWDIIRKYPKYQGGFIWDFVDQSCHWKNKDGVSIYGYGGDFNKYDASDNNFNDNGLIS PDRVPNPHAYEVAYFYQDIWTTPADLAKGEINIFNEYFFRDLSAYYMEWQLLANGEVVQT GIVSDLKVAPQQTVKVQIPFDTKNICPCKELLLNVSYKLKAAETLLPAGATIAYDQLSIR DYKAPELKLENQQASNLPVIVPTILDNDRNFLIVKGENFSMDFNKHNGYLCRYDVNGMQL MEDGSALTPNFWRAPTDNDFGAGLQHKYAAWKNPELKLTSLKHAIENDQAVVRAEYDMKS IGGTLSLTYTINNKGAVKVTQKMAADKSKKVSDMFRFGMQMRMPVNFNEIEYYGRGPGEN YADRNHAAMLGKYRQTVEEQFYPYIRPQETGTKTDIRWWRLLNISGNGLQFVSSAPFSAS ALNYTIESLDDGDGKDQRHSPEVEKADFTNFCIDKAQTGLACVNSWGAIPLEKYRLPYQD YELSFIMTPVYHKIK >gi|222159348|gb|ACAB01000011.1| GENE 20 29450 - 30625 1038 391 aa, chain + ## HITS:1 COG:ECs4393 KEGG:ns NR:ns ## COG: ECs4393 COG0845 # Protein_GI_number: 15833647 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli O157:H7 # 5 375 8 382 385 166 31.0 9e-41 MKSRIVLFAFCVALLSSCGNKGNDTGKVPEYAVQELQKTTANLTTAYPATIKGKQDVEIR PQVSGFITKLCVDEGAAVRKGQVLFIVDPTQYEAAVRTAKAAVATAEAAVSTQQMTYDNK KELNKKQIISDYDLAMAENSLAQTKAQLAQAKAQLTTAQQNLSFTQVKSPSDGVINNIPY RVGALVSPSIATPMTTVSEIDEVYVYFSMTEKELLAMTKSGSTIKEEISKIPTIKLQLID GSTYDIEGKVDAITGVIDQSTGSVSIRAIFPNKEHVLRSGGTANVLIPYTMENVITIPQS ATVEIQDKKFVYVLQPDNTVKYTEIKIFNLDNGKEYLVTSGLNSGDKIVIEGVQNLKDGQ KVQPITPAQKEANYQQHLKDQHDGNLATAFN >gi|222159348|gb|ACAB01000011.1| GENE 21 30642 - 34073 3151 1143 aa, chain + ## HITS:1 COG:all3143 KEGG:ns NR:ns ## COG: all3143 COG0841 # Protein_GI_number: 17230635 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Nostoc sp. 
PCC 7120 # 1 1123 1 1036 1057 622 34.0 1e-177 MKLDRFINRPVLSTVISILIVILGAIGLATLPITQYPDIAPPTVSVRATYTGASASTVLN SVIAPLEEQINGVENMMYMTSTASNTGSGEISIYFKQGTNPDMAAVNVQNRVSMAQGLLP AEVTKVGVTTQKRQTSMLVVFSLYDETDTYSESFIENYAKINLIPQVQRVPGVGDANVLG QDYSMRIWLRPDVMAQYKLVPGDVSAALAEQNVEAAPGQFGERSNQTFQYTIRYKGRLQQ PEEFENIVIKSLPDGEVLRLKDIAEIQLDRLGYNFTNRVDGHKSVTCIVYQMAGTNATQT ISDIEALLDEASKTLPTGLKLNISMNANDFLFASIHEVVKTLIEAFILVFIVVYIFLQDL RSTLIPTIAIPVALIGTFFILSLVGFSLNLLTLCALVLAIAIVVDDAIVVVEGVHAKLDQ GYTSARLASIDAMNELGGAIVSITLVMMAVFVPVSFMGGTAGTFYRQFGMTMAIAIGLSA LNALTLSPALCAVLLKPHKKEDGTTEDSTLKERMKVAYTAAHTTMINRYTEAIGKMLHPG ITLTLTLIAILGMIFGLFSINPIVTAIFVVLSILALIGMSTKKFKNRFNDTYESILKRYK KRVLFFIQKKWLSMGLVVASIAILVFFMNTTPTGMVPNEDTGTLMGAVTLPPGTSQDRSE KILARVDSLIASDPAVLSRTMISGFSFIGGQGPSYGSFIIKLKDWDERSMIQNSDVVVGS LYMRAQKIIKEAQVLFFAPPMIPGYSASTDIEVNMQDKTGGDLNKFFDVVNDYTAALEAR PEINSAKTSFNPNFPQYMIDIDAAACKKAGISPSDILTTMQGYYGGLYASNFNRFGKMYR VMIQSDPLSRKNLESLKNIKVRNSAGEMAPIAQFISVEKVYGPDIISRFNLYTSMKVMVA PASGYTSGQALTALAEVAQENLPTGYTYELGGMAREEAQSSGSATGLIFVLCFVFVYLLL SAQYESYILPLAVLLSIPFGLLGSFLFVNGVSAIGNISALKMILGTMSNNIYMQIALIML MGLLAKNAILIVEFALDRRKMGMSITWAAVLGAGARLRPILMTSLAMVVGLLPLMFAFGV GAHGNRTLGTASIGGMLIGMICQIFIVPALFVIFQYLQEKVKPMEWEDIDNADAVTEIEQ YAK >gi|222159348|gb|ACAB01000011.1| GENE 22 34085 - 35476 374 463 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 3 460 2 455 460 148 26 7e-35 MKKQILYMLCATALLSSCHIYKSYDRPEDITTSGLYRDPVADNDTLVSDTANFGNLPWRE VFTDPQLQSLIETGLKQNTDLQSAILSIKAAQAPLLAAKLAYVPALALSPQGTISSFDNS KASKTYSLPVTASWEISLFGGLLNAKRGADVAVKQAKAYKQAVQTQIIANVANLYYTLLM LDRQLEITRETAEILKRNAETMEAMKNAAMYNINSAGVEQSKAAYAQVMASIPEIEQSIR EVENSLSTLLGEAPRHIQRGTIEKQILPSEFSVGIPIQLLSNRPDVKAAEMSLASAYYNT NSARSAFYPQITLSGSAGWTNSAGSAIINPGKFLATAIGSLTQPLFYHGANIAKLKMAKA QEEQAKLSFQQTLLNAGSEVSNALSLYQKTSEKVESRQMQVESAKKASEDTKELFNLGTS TYLEVLSAQQAYLSAQISQVSDCFDRMQAVVSLYQALGGGREE >gi|222159348|gb|ACAB01000011.1| GENE 23 35495 - 36265 810 256 aa, chain + ## HITS:1 
COG:ECs0183 KEGG:ns NR:ns ## COG: ECs0183 COG1043 # Protein_GI_number: 15829437 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase # Organism: Escherichia coli O157:H7 # 2 256 8 262 262 166 35.0 5e-41 MISPLAYVDPEAKLGKNVTVLPFAYIEKNVEIGDDCVIMSYASILQGTKMGKGNKVHQNA VLGAEPQDFHYTGEESSLIIGDNNDIRENVVISRATFAGNATKIGNGNYLMDKVHLCHDV QINNNCVVGIGTTIAGECTLDDCAILSGNVTLHQYCHIGSWTLVQSGCRISKDVPPYVIM SGNPVTYHGVNAVVLSQHHNTSERILRHIANAYRLIYQGNFSVQDAVQKIIDQVPMSEEI ENIVNFVKNSERGIVK >gi|222159348|gb|ACAB01000011.1| GENE 24 36427 - 37068 429 213 aa, chain - ## HITS:1 COG:no KEGG:BT_3335 NR:ns ## KEGG: BT_3335 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 213 1 213 213 413 90.0 1e-114 MKVKKISAANVEACSLPKLFDEEKIDFQPIQCVNWAEYPYKPKVSFRIAHTQNSILLHFK VKEESVRAKYGEDNGSVWTDSCVEFFSIPAGDGIYYNIECNCIGTILIGAGPVRNNREHA PKEVTALVQRWSSLGNQPFAERVEETDWEVALIIPYAVFFKHQIESLDGKEIKANFYKCG DELQTPHFLSWSPIEIDQPDFHRPDFFGTLEFE >gi|222159348|gb|ACAB01000011.1| GENE 25 37181 - 41239 3146 1352 aa, chain + ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 813 1060 4 247 294 139 36.0 3e-32 MRKVILLFLLFLSVAGVRTQGQNITFSHLTTDDGLSQFSVNSLYIDERGIIWIGTREGLN RYNGNDIKSFKLNKNDPNSLFSNTVLRITGNKNGKVYLLCTDGVAEFDLTTQRFKTLLQG NVDAIYFNEKLYIGKREEVFVYNESTGNFDLYYHLAGKDITLSCLHLDEKKNLWMGTTSN GLYCLSGDKKISQPVTRGNIASIYEDSSKELWICTWEEGLYRIKTDSTIENFRHDPKNPN SICADFVRSCCEDNAGNLWIGTFHGLNRYEKSTGKFQLYTANANKPDGLTHSSIWCIVKD EQGTIWLGTYFGGVNYFNPEYEIYTRYKTGDTEKEGLSSPIVGRMTEDKDGNLWICTEGG GVNVYNRKNNTYRWYRHEEGKNSISHNNVKAIYYDRTNEIMWIGTHLGGLNKLDLRTNRF TVYRMKAGDPTSLPSDIVRDIVPYKDKLVVATQNGVCLFNPATGACQQLFKETKEGRGIG MVASLYIDKDGTLWIAATGEGVYSYRFDSGKLTNYPHNPANPNSLSNNNINSIMQDSNGN LWFCTSGSGLDRYRKESDDFENFDVQTDGLSSDCIYEVCESSIQKGDLLLITNQGFSQFN YPSKKFYNYGTENGFPLTAVNENALFVTHDGEVFLGGIQGMISFWEKKLHFTPKSYNIIL SRLLVNGKEVVPGDESGILEQSICHTPEISLKANQSMFSIEYATSNFIPANRDEILYRLE GFSDEWNHTYRKQTLITYTNLNPGKYTLVIKSQREGIKEARLLIIVLPSWYETWWAYLIY TIVTISLLWYLIQNYNSRIKLRESLKYEKKHIEDLEALNQSKLRFFTNISHEFRTPLTLI VGQVETLLQVQTFTPNIYNKVLGIYKNSLQLRELITELLDFRKQEQGHMKIKVSQHNLVN FLYENYLLFLEYASSKQINFKFNKQKDDIEVWYDQKQMQKVINNLLSNAVKHTKAEDTIS INVSQEKDHVIIEIKDTGTGIAAAEIDKIFDRFYQTEHLNSLNTGAGTGIGLALTKGIVE LHHGTIRVESEPGKGSSFIITLKLGKEHFTEEQIAKDDTETIQQTETIVPSVEIIPDSEW KEEDNKRIEDAKMLIVEDNESIKQMLVGIFETFYQVSTASDGVEALEMIQKDMPSIILSD VVMPRMSGTELCKQIKTDFNTCHIPVVLLTARTAIEHNIEGLKIGADDYITKPFNTNLLI SRCNNLVNSRRLLQEKFSKQPQAFAQMLATNPMDKEMLDRAMAIIERHLDNTDFNVNIFA REMGMARTNLFTKLKAVTGQTPNDFILSIRLKKGAVMLRNNPELNITEISDRIGFSSSRY FSKCFKEIYHVSPLAYRKGEESEEEGDGEETD >gi|222159348|gb|ACAB01000011.1| GENE 26 41532 - 43067 1302 511 aa, chain + ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 13 501 11 467 497 160 28.0 6e-39 MKNVSRLLPLLSGIVTLSGCSHAPQKSNGQNNQKPNIIYIFADDLGIGDLSCYGATKVST PNIDRLAGQGVQFTNAYATSATSTPSRFGLLTGMYPWRQENTGIAPGNSELIIDTTCVTM ADMLKDAGYATGAVGKWHLGLGPKGETDFNNRITPNAQSIGFDYEFIIPATVDRVPCVFV 
ENGHVVELDPNDPITVNYDHKVGDWPTGEENPELVTLKPSQGHNNTIINGIPRIGWMTGG KSALWKDEDIADIITNKAKNFIASHQEEPFFLYMGTQDVHVPRIPHPRFAGKSGLGTRGD VILQLDWTIGEIMHTLDSLHIADNTILIFTSDNGPVIDDGYQDQAYELLNGHTPMGIYRG GKYSAYEAGTRVPFIVRWPARVKPNKQQALFSQIDVYASLASLLDQPLRKGAAPDSQEHL NVLLGKNNTNREYVVQQNLNNTLAIIKGQWKYIEPSDGPAIEYWTKMELGNDKQPQLYDL SSDPSEKTNVSKQYPDIVKELSELLESVKEK >gi|222159348|gb|ACAB01000011.1| GENE 27 43191 - 46385 2964 1064 aa, chain + ## HITS:1 COG:no KEGG:BT_3332 NR:ns ## KEGG: BT_3332 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 1064 1 1053 1053 1564 73.0 0 MKKHLSTQTRKRMLSSIGLILFSVSFILAQVLVKGTVKDNLGEGVPGASVQVKGTSQGTI TDLDGKFAFNVPNKNSILVISFIGYVTVEMKVDTQKPMVITLKEDTKTLDEVVVVGYQEI RKKDLTGSVAKASMAELLSTPTASFGETLGGRIAGVNVSSGEGMPGGQMNIVIRGNNSLT QDNSPLYVIDGFPVEDPSIAAAINQNDIESLDFLKDASATAIYGARGANGVVMITTKKGT IGQPKIKYDGSFGIQHITKTIPMMDAYEFVKLQAERSPKDMETTYFMNYDGKKWGLEDYR NIPQYNWQDEIFRSAWMQSHNVSLTGGSEGVRYNASLSYYDQDGILLESNYKRVQGRMGT TIQKKKLKIYLTTNYSSTTTTGGSPSQNSYSGMNNLFYSVWGYRPVTEPDRPLNSLMDNI MDDAINNTNDYRFNPIMSLKNEYRKTYANYIQFNGFAEYEFIKGLKLKVSGGYTFDTRKG ETFNNSKTRYGNPKSSDKVNAEIYHSQRATWLNENILTYQTNIKRKHFFNSMVGVTLQNS DYEYYSYKTVQIPNEALGMAGMSEGTPSTTKSLKSSWSMLSFLGRLNYNYKSLYYATVSF RSDGSSKFRGDNRFGYFPSGSLAWGFMEEDFMKPLKSVVSSGKLRASWGLTGNNRVGEYD TYALYQILKDKVGDFISIGSLPSGVYPFENSLTSVGTVPTSLRNRKLKWETTEQWNLGLD LGFLDERIGLTVDWYRKTTRDLLLNTALPTSSGYFSAMKNVGKVRNQGIEFTLNTTNIKN RHFSWTTNFNIAFNKNKVLELAENQSSLLSAAKFDQNYNSQYSYIAKVGYPMGMMYGFIY EGTYKYEDFDKVGDTYTLKRNVPYFSSESNTQPGMPKYADLNGDGIIDDNDRTMIGNGMP KHTGGFTNNFEYKGFDLSIFFQWSYGNDVLNANRLFFENSNKTRDLNQYASYADRWTPEN PESNIPRATDSGSNKVFSTRIIEDGSFLRLKTVSLGYTLPKQLTKKWKIDNARVFVAGQN LWTCTGYSGYDPEVSIREGALTPGLDFSAYPRAYSISFGINLGF >gi|222159348|gb|ACAB01000011.1| GENE 28 46413 - 48185 1811 590 aa, chain + ## HITS:1 COG:no KEGG:BT_3331 NR:ns ## KEGG: BT_3331 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 590 1 572 572 695 60.0 0 
MKTIKIIIISLLIGLNLTSCDFLEKDPTYTTPENFFKNEADATSWLTGTYAILGQSSFYG NEFLYLVGGDDLGHYGGANRGPNKSGLICNNANTSDPAVAALWYTLYSGINRANIFLENI DAVPDMNDDTRKQYKAEARFLRAFYYFTLVECWGDVPFKTNSTEDAYNLAIPRTDKQTIY DFIIREMYGSAEDLKSAQDLNYLPGRVSKSAAWGMLARVYMFRAGEPKRDKEVGLANSTT NAEITEYFKKASYYAQLVRNEGHSLTAKYWDFFIDICSDKYNTALNKDGAKANESIWEVE FAGNRSTDVRAEGRIGNIIGIQGKDLSSKASITGKGDPGYAYAFIWNTPKLLELYEANGD IDRCNWNIAPFTYTQSAGEGTPVDGREFVKGKRDEVEQQYWDKSFSYGKTEPGSTYGDRE SKNDANKNRNRAAAKYRREYEADKKSKNDTSINFPLLRYSDILLMIAEAENEVNHGPNDL AYECINAVRERAGINKLAANLDETNFRKAIKDERAMELCFEYTRRFDLIRWGDYYKLMQE QVDKAQADESWKFGINVYTYFNIPKSYNYFPIPANEIGSNGAIKTNNPGW >gi|222159348|gb|ACAB01000011.1| GENE 29 48216 - 49316 781 366 aa, chain + ## HITS:1 COG:no KEGG:BT_3330 NR:ns ## KEGG: BT_3330 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 366 1 347 347 193 34.0 1e-47 MKTIYKSLMTIAFAGLSLASCDKELKEETAMEVGVVTDSNVSFDGKTVTVKKGSPVTFSF DGDPDFISFFSGEIGHEYKHRDRIEMQPEDVEKCEINFSILYDYGNAKTIEGSTHILISD RFGGISGNNVEKDKEAVTNCDWTELVSQEDLPKATKVTKDYSCPLTSYLGKEISIAFRLN PLDNSATMPVIHIKGLQLNLEFNNGKSTTINAKNFEFSALNVTYNLDDLSKNNTHLTKLK EALGNKNLTLEEMKSAEYADKIAYTTVDGNIPYFWRISQPNDFVTSGGSKDYTKGDTWLI SNPILLNGSCDPDAGVAIKNISQTLEIYSHTYEEAGTYTATFVANNANYVHQGGQVVREL TINVVE >gi|222159348|gb|ACAB01000011.1| GENE 30 49544 - 50491 730 315 aa, chain + ## HITS:1 COG:CAC3454 KEGG:ns NR:ns ## COG: CAC3454 COG0042 # Protein_GI_number: 15896694 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-dihydrouridine synthase # Organism: Clostridium acetobutylicum # 8 314 4 307 311 185 33.0 1e-46 MQKTLPIHFAPLQGYTDVIYRNAHAACFGGIDTYYTPFVRLEKGGFRHRDVRGIEPGNNQ VPHLIPQLIAPSFEKAEKILSLLIEKGYKEVDINMGCPFPMLAKRHNGSGILPYPEEVQA LLSWLITEYPQISFSIKMRLGWEDPEECLKLAPIINELPLRQVVMHPRLGKQQYKGEVDL KAFEAFQGVCKHPLIYNGDINSVEDIHRIQEQFPGLAGMMIGRGLLANPALALEYRQNRA LEFDEMREKLQSMHKCVYNQYAEQLEGGDEQLLNKMKTFWEYLMPQADRKLLKAIHKSTS LNKYNQAILAFFNQR >gi|222159348|gb|ACAB01000011.1| GENE 31 50546 - 51823 913 425 aa, chain + ## HITS:1 COG:no 
KEGG:BT_3325 NR:ns ## KEGG: BT_3325 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 425 1 409 409 650 78.0 0 MKKYILIFLCLITVSGSFAQMQDFKFKFYGQIRTDFYYNSRANEETVDGLFYMYPKDKVR DPEGNDLTSTPNSNFYTLYSRLGVDVAGPKLGTAKTSAKVEVDFRGTGTSYSVIRLRHAY LNLDWGKSALLLGQNWHPLFGDVSPQILNLSVGAPFQPFSRAPQIRYRYTNKNFQLTGAA VWQSQYTSQGPEGKTHKYLKQSCIPEIYLGLDYKSKQLQAGVGVEMLSLKPRTEAILHHY FTDPYTGYMHSMETTFKVDERITTLSYEAHVKYTNKDWFIGAKSVLGSNLTQASGLGGFG IKSVNEQTGEQEYTPIRFSSSWFNVVYGQKWKPGIFVGYAKNLGTSDALYAPKGNDAKLY GTGTDLNQLVTAGAELTYNVPHWKFGLEYTLSSAWYGSLNTSNGKIQDTHAVCNNRIVAV AMFMF >gi|222159348|gb|ACAB01000011.1| GENE 32 52241 - 55312 2176 1023 aa, chain - ## HITS:1 COG:no KEGG:BT_3324 NR:ns ## KEGG: BT_3324 # Name: not_defined # Def: chondroitinase (chondroitin lyase) precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 1023 3 1014 1014 1841 84.0 0 MMKQPFTKFGVTTLFSLLCSAFLHAQVVTDERMFSFEEPQIPDCITATHSRLSVSDLHYK DGKHSLEWTFEPGGILELKKDLKFEKKDPTGKDLYLSAFIVWVYNEVPQDTIIEFQFLKD GKRCTSFPFGINFSGWRAAWVCYERDMQGTPEEGMNELRIIAPNSKGSLFIDHLITATKV DARQQTADLQVPFVNAGTTNHWLVVYQHSLLKPDIELTPVDDKQRAEMQLLEKRFRDMIY TKGKTTDKEVETIRKKYDFYQITYKNGQVSGVPIYMVRASEAYERIIPNWDKDMLTKMGV EMRAYFDLMKRIAVAYNNVVNPVIREEMKKKFLAMYDHITDQGVAYGSCWGNIHHYGYSV RGLYLAYFLMKDVLRETGKLQEAERTLRWYAITNEVYPKPEVNGIDMDSFNTQTTGRIAS ILMMEDTPEKLQYLRSFSRWIDFGCRPALGLSGSFKVDGGAFHHRNNYPAYAVGGLDGAT NMIYLFSRTEFAVSELAHETVKNVLLTMRFYCNKLNFPLPMSGRHPDGKGKLVPMHFAMM ALAGSPDGKEEYDSEMASSYLRLISDPSIENDSPEYMPKVSNAEERKVAKRLVEKGFRPE PDPQGNIAMGYGCVSVQRRSNWSAVARGHSRYLWAAEHYLGANLYGRYLAHGSLQILTAA PGQTVTPATSGWQQEGFDWNRIPGVTSIHLPLEQLQAKVLNVDSYSGMEEMLYSDEAFAG GLSQQKMNGNFGMKLHEHDKYNGSHRARKSYHFIDGMIVCLGSDIENTNTEFPTETTIFQ LAVTDKAGHDYWKNYQGDKKVYVDHLGTGYYVPTPIRFEKNFPQYSRMQNTGKETKGDWV SLVVDHGKVPKNGSYEYAVLPQTNEALMKKFAKKPTYKVLRQDRNAHIVESVSEQIISYV LFETSETTLPGGLLQRVDTSCLVMTHKESADKIKLTVAQPDLALYRGPSDEAFDKDGKRI ERSIYSRPWIENASGEIPVTVTIKGQWNVEETPFCKVISSDKKQTVLQFSCKDGASFEVE LRR >gi|222159348|gb|ACAB01000011.1| GENE 33 55348 - 56040 389 230 aa, 
chain - ## HITS:1 COG:no KEGG:BVU_0159 NR:ns ## KEGG: BVU_0159 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 30 225 38 233 1106 192 45.0 8e-48 MTKRFFIGYSLITFFLLNSVVLLAQHKASFVQQWKIEDASHALQIIERADTLELIVPDGL TMWYRQRLTGDYEICYRICMVMQGGKYDRLSDLNCFWAANDPKYPDDLFARSQWRDGIFK NYNTLNLFYVGYGGNDNSTTRFRRYKGEYYGVADDKVKPLLKEYTDASHLLVPNQWYEIR IRVEKGITTYSVNDEELFRYTLAGGEGDGHFGLRLLQNHVLFTDFKATIL >gi|222159348|gb|ACAB01000011.1| GENE 34 56156 - 57526 1008 456 aa, chain - ## HITS:1 COG:no KEGG:BF3314 NR:ns ## KEGG: BF3314 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 41 453 63 429 430 100 24.0 8e-20 MKIKSLMYVLMGTMLVTTSCSDNELEKGNDGSGTVDPVNASTLVNVYSDKSASEASLLVG DVLVKDSRTLTLNVPAACEKVYMKYNTVSGTEATKEFALSPVSRGVDQSTGFNFETNRLA SVTLALPEDAVQPTNETDQGYLFYHNTGVVMFEDGWPIQLDSWYDEDFNDVVFEYDLKVT ECHSQQMMETVGGKEELLLTLDVRAVGGIYPTVLGVVLDGLKSEYVDRITASLVLKGGQG TMTDLAKEELSTKNIVKVENKNWNWSNDTRKEPRFAILTVDKAQAEGTVITLDGLTSLMD NNQDMFQVTQGKVREGLPMLRAEVRLIGKEGLTGAERDAQLAAFRELILDTNRQNFFIKV NGGKEIHMRGYAPTSAYKAEYEALVAGDTTLDANVYYSNTKGSTWGVKLPVGTRHAYERV PFREAYPDFTKWVDSKGASNQKWYENFVDEKTIRYW >gi|222159348|gb|ACAB01000011.1| GENE 35 58076 - 59026 793 316 aa, chain + ## HITS:1 COG:no KEGG:Cpin_2255 NR:ns ## KEGG: Cpin_2255 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 81 312 145 373 375 114 27.0 6e-24 MKSISLDSNTLSEEEIADRLQYDTKKRPKPNELRIITQLLTEIKSVHEDEINELNYQTIQ TVFQITRFLERELQFGSKRSKIQALKLIQSINGYASEAVLVRFLYHRELELRNSARYTYM WLSQGDPFRFFDEDIGMKLRQWDMMELHAILEHRKKVGYNTPSFIKWVNTSAEENVKIFF INEIRLYNETDSAPILAKQLNARSVEIRGEAIRTLGKLKYKEIEPKLIEMYHVQPEEVKR QIISAVADLKTDKALGFLYNAYDEADNWGTKRIILKSLYEYSAMGRKTFDQLERKADSHT AILFAHTRHPLINQLI >gi|222159348|gb|ACAB01000011.1| GENE 36 59032 - 60507 1099 491 aa, chain + ## HITS:1 COG:mlr6694 KEGG:ns NR:ns ## COG: mlr6694 COG1215 # Protein_GI_number: 13475588 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases, probably 
involved in cell wall biogenesis # Organism: Mesorhizobium loti # 58 469 63 470 475 291 37.0 1e-78 MRDIIFTFFNYFVFFYTSMLAIFFITFAFLSFISLKRRKDYYVESYMRKTIKESPYTPGI SVIAPAFNEEKTIIDNVNSMLALEYPLFEVVIVNDGSTDSTLKKMTEYYELVEVPYAYIE RIKTRPFRRLLKSSNPKYSRLIVVDKENGGTKADASNAGINVASYPYFICTDVDCILEKY ALYRCISPIISSEKQVIAVSGTMLMANGCVVKDGQIVDVRTPRTPIPLFQNLEYMRSYLI GKMGWSAINGMPNVSGGFGLFDRSVAIAAGGYDAPSFAEDMDLITRMVGYMCDFSRPYKI VQIPDTCCWTEGPPNLAMLYRQRTRWARGLFQTLSIHHKMIFKKTYKQMGLLTLPYMFIF EFLAPIIELTGLIVFIYLAFTGAVNWNTAWMIYLTIYTFCQFLSIVVITYDYYVGMLYKR GYEYLWIIIASILEPIFYHPIITFCSLRGYLSYLTNRDFKWKNMERKGFKQKEESADNTD TTPMKPEPATI >gi|222159348|gb|ACAB01000011.1| GENE 37 60538 - 63507 2161 989 aa, chain + ## HITS:1 COG:no KEGG:Cpin_2252 NR:ns ## KEGG: Cpin_2252 # Name: not_defined # Def: TPR repeat-containing protein # Organism: C.pinensis # Pathway: not_defined # 28 989 27 954 954 152 22.0 8e-35 MNREFLHKITILGCLFLLITSSSGTDNGFQTPEQYAQIVQEHFANEEWEAGKELLEEGLQ KYPNVSDLEWLMGKYWFHEKNYDQSRYHLVKAIDDNYNNVNAKHLLVDVEDITENYSSAI CYVNELLEVNPYWRGLWRRKIELYRKQNNDVEADRLLKRINQIYPNDTILRKDYIYSMEV GYQQMKKGGNRKEAIEKLTELIKVSPQNEEYYLDIINLHLQEGNREAALGWSSNGLAAIP GSGALIVKRASILSELARYPEALVFIREQMRRNNSPAIRRMYNDLLMEAARAEKQRDPYV LYGMAYEGGNKNKEALDYLLNTSVTRGYTDDALFYIREAKKQYGNNDKGILYKEYMLYRQ MNEDDLAYSTLKKMYEMYPDDYDITLAMSAQHMKKAEKLMELGLYAEALPHVLFVSQKHV DDNEVNGAAWEKALSCYINMKRYNEALATLDTITLHFPDYENCTLKRAFILDKMDKTEEA LQLYLSAIEQSDEDMRIFYVIGYEELAVPYIKKCMEAGATKKAYEEAIKLVSLNPSSDLG LRYAINSSGLLGKYDEFEKYTAQGINYYPEEPFYQAKRATVLERDKRYEASLEFLKPILN KYPSNKEIIGAFSQSSEYRALQLTKAKEPEKALAVLDTALLYDSQNKSLKYTKGVVYEAN RQADSAYYYQKYYEPSIMEYRSFQRHLSGLRSMMLKNEIALTYLRARYGEEDIITSVATA EYTRKGQKNSYTGRLNYAGRSGSASDSMEAEEQTPGGVGIQVQGEWTHHFSPKWSLTANA AFATKYFPDITADVALRHYLKNDWEIGAHVGYRRVAAYTKRYEWNDEFFAGGTGDNGYLF TGWNESKTNLLTVGGELAKTIEVVRLNTKIDMHFFNSNFYYNAQIGAKYFPASDTKTNIN AMASIGSAPETAVLDYALPGSFSHTNTMVGLGGQYMVSPNITIGLMGTWNTYYNQTNTVR GTSPSNQIESISTRYKNLYNIYAQVYISF >gi|222159348|gb|ACAB01000011.1| GENE 38 63482 - 65635 1355 717 aa, chain + ## HITS:1 COG:no KEGG:Cpin_2251 NR:ns ## KEGG: 
Cpin_2251 # Name: not_defined # Def: coagulation factor 5/8 type domain protein # Organism: C.pinensis # Pathway: not_defined # 157 654 165 674 752 181 26.0 9e-44 MHKFIFHSDTLGPVLAGMICICCLLSGCRMQARTEMFPSKEGYLVTIGEDPTDRDTRWAK YLYEHLKKRANDDEIVAFGVSEMDMWRIIIQIDPTLQRGFKVACKGSDIRLTASDDKQML WLQYQLIKKISKEDPRIDGSDLPPALINLNDTCGAFAFDYQSIYSPYGLNADHTGVIGLN NFDDSWGIWGHNLRKVLGKDAEKVYATIHGKTDDSQLCFSSEDMYRQIESYIVDNFGEKG NFRFVIAPDDTPYACTCATCTALGNTEKNATPAVTELILRLSQRFPKHTFFTTSYLTTQQ VTDKQLPPNVGVIVSAIDYPLRRTDGKDEQDKKFAEQLDNWKKVTNNIYIWDYINNFDDY LTPFPILKIAQQRLQLFKQHGASGIFFNGSGYSYSSFDEMRTFVLSALLINPELPVDELI KSYFNQEYPVSKKWLYDYYTELENNAQSGKRLGLYAGIRESEKGFLYPEKFIKFYDEMGD FVSEAKGKERKKLHELQTALSFTRMELARDHSFDAYGYAKRNGKDIQPLPQARKWVTQLK EHQAFAGMGYYNESAYEIDYYIKEWEQYILASDIKKSLFLGMHPSATPKLNKNDSKKLTD GTHGLPGDYHCGWVIIPGEECTINLPVKGLNASGTFYISFLNLPRHRIYAPQQVQLLKDG VAYKTIDLKPEDSPEKGEMMKATVPADLNGTEQLSIKISCLKKPGTQMGIDEIAFIP >gi|222159348|gb|ACAB01000011.1| GENE 39 65674 - 66189 418 171 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716783|ref|ZP_04547264.1| ## NR: gi|237716783|ref|ZP_04547264.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 171 4 174 174 351 100.0 8e-96 MNKSIFYILLLTALPLYFTGCRKEVRPTSMTIKDSVRHYYPIKQGQQLDIMFTITNTGDA PLIISEMQPSCGCIILDKSSHIIIPEDGIRQFKATYNSIKNVGEVVHRIRIFGNMLPDGR AELKFDVNVVPDADYTRDYEELYQEFNTKNGIVREMVDGKESELGYYVGEP >gi|222159348|gb|ACAB01000011.1| GENE 40 66283 - 66843 590 186 aa, chain - ## HITS:1 COG:no KEGG:BT_3323 NR:ns ## KEGG: BT_3323 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 186 1 173 173 302 86.0 4e-81 MRAKKFVCCLLAMMLLAGVSFISCGNSSKAKADSELTTQDGEDFKSFLDKFTSSAAFQYT RIKFPLKTPITLLADDGETEKTFPFTREKWPLLDSETMKEERITQEEGGIYVSKFTLNEP KHKIFEAGYEESEVDLRVEFELQSDGKWYVVDCYTGWYGYDLPIGELKQTIQNVKEENAA FKEIHP >gi|222159348|gb|ACAB01000011.1| GENE 41 67130 - 68935 430 601 aa, chain + ## HITS:1 COG:no KEGG:BDI_3446 NR:ns ## KEGG: BDI_3446 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 601 1 660 660 305 32.0 5e-81 MKTIYTVFLLLLLLSCTSSSEKLAWEIANNSQTNKKELTRFLEHYKTNKDKDKYKAACFL IENMPNKYSINGKEQKIYDIDIVKADSLIKSLEHSFFLKEKSPYLKNYTFEQFCEYILPY RVADESLQYYWKWDCSRKFEKQCTNDIIQTAQNINAQIKIELSPEFYKDTLKSYSSIIKT GYGKCDDRTALVTMALRSVGIPAAFEFVPYWGSNNNGHSFVSIILPDNKIYPLQNTDKQA NGDYYLSRKTPKIYRKMYSIQDLAKHIDNIPELFRHNDLLDVTKLHNIGSCDVTVSTNIN KEKENFLSVFSPKRWVPVAFSSSQTFHHIGTGNIYNVDRNKEAIDLGDGIVYLPTHWVNE EAIPIGSPIIVSEDSVREIKPDTKHLERVVCKRKFPLNMRIVDFSKLMIMGVFEGANKAD FSDATELYKITKTPESKMQKIEISAEKAYRYIRYRKPKGTFSIAEFCLYQSDEKLLPFHP IACDAIYEDSTMLNIFDGQPLTYYQVSGGIDLWVGVDLYKPVKISKIGFAPRNDDNAIVS TDTYELFYWQDQWISLGRKRPIGDSVVYDNVPQKALLWLRNLTKGREERPFTYENGKQIW W >gi|222159348|gb|ACAB01000011.1| GENE 42 68966 - 69251 241 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237720541|ref|ZP_04551022.1| ## NR: gi|237720541|ref|ZP_04551022.1| predicted protein [Bacteroides sp. 
2_2_4] # 1 95 1 95 350 160 100.0 2e-38 MKKITLFLSLLLIGSVGFSQIIPGVNIGKRKEYMMRTYKINSQKADEYEQILFSLQKEND QLKNRKISSTQFKAEQKKLYKKYGTIISQAFSGGK Prediction of potential genes in microbial genomes Time: Wed May 18 01:03:12 2011 Seq name: gi|222159347|gb|ACAB01000012.1| Bacteroides sp. D1 cont1.12, whole genome shotgun sequence Length of sequence - 65879 bp Number of predicted genes - 48, with homology - 47 Number of transcription units - 24, operones - 9 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 98 - 157 1.5 1 1 Op 1 . + CDS 182 - 868 495 ## gi|237720541|ref|ZP_04551022.1| predicted protein 2 1 Op 2 . + CDS 876 - 1454 232 ## gi|237716787|ref|ZP_04547268.1| predicted protein 3 1 Op 3 . + CDS 1372 - 2418 438 ## gi|237720542|ref|ZP_04551023.1| predicted protein 4 1 Op 4 . + CDS 2423 - 2881 267 ## gi|237720543|ref|ZP_04551024.1| predicted protein 5 1 Op 5 . + CDS 2878 - 3324 208 ## gi|237716790|ref|ZP_04547271.1| predicted protein + Term 3390 - 3447 10.3 6 2 Tu 1 . - CDS 4456 - 4584 60 ## - Prom 4655 - 4714 5.9 + Prom 4619 - 4678 6.0 7 3 Tu 1 . + CDS 4736 - 5110 299 ## gi|298480418|ref|ZP_06998615.1| O-antigen polymerase superfamily 8 4 Tu 1 . - CDS 5281 - 6624 339 ## gi|237716792|ref|ZP_04547273.1| predicted protein - Prom 6673 - 6732 10.1 - Term 6688 - 6741 -0.8 9 5 Tu 1 . - CDS 6764 - 8047 873 ## BT_3321 hypothetical protein - Prom 8080 - 8139 5.2 + Prom 8045 - 8104 4.3 10 6 Op 1 . + CDS 8131 - 8895 873 ## COG0289 Dihydrodipicolinate reductase 11 6 Op 2 2/0.000 + CDS 8931 - 10415 1544 ## COG0681 Signal peptidase I 12 6 Op 3 . + CDS 10458 - 11399 509 ## COG0681 Signal peptidase I + Prom 11407 - 11466 6.4 13 7 Tu 1 . + CDS 11491 - 12117 570 ## BT_3317 hypothetical protein + Term 12245 - 12284 -1.0 14 8 Tu 1 . - CDS 12172 - 13881 1204 ## COG1874 Beta-galactosidase - Prom 13917 - 13976 5.6 - Term 13916 - 13963 7.5 15 9 Tu 1 . - CDS 14077 - 14565 312 ## BT_3316 hypothetical protein - Prom 14813 - 14872 6.2 16 10 Tu 1 . 
+ CDS 15509 - 15733 111 ## gi|237716802|ref|ZP_04547283.1| predicted protein 17 11 Tu 1 . - CDS 15730 - 16539 640 ## COG3177 Uncharacterized conserved protein - Prom 16564 - 16623 5.8 - Term 16644 - 16698 11.2 18 12 Tu 1 . - CDS 16753 - 18999 2544 ## COG1472 Beta-glucosidase-related glycosidases - Prom 19129 - 19188 5.0 - Term 19109 - 19156 12.7 19 13 Op 1 . - CDS 19191 - 21188 1803 ## BT_3313 hypothetical protein 20 13 Op 2 . - CDS 21211 - 22701 1445 ## COG5520 O-Glycosyl hydrolase 21 13 Op 3 . - CDS 22732 - 24252 1306 ## BT_3311 hypothetical protein 22 13 Op 4 . - CDS 24267 - 27272 2651 ## BT_3310 hypothetical protein - Prom 27345 - 27404 9.6 - Term 27383 - 27420 2.2 23 14 Tu 1 . - CDS 27483 - 29126 1331 ## BT_3309 transcriptional regulator - Prom 29269 - 29328 6.5 + Prom 29216 - 29275 8.0 24 15 Tu 1 . + CDS 29390 - 30178 480 ## COG1712 Predicted dinucleotide-utilizing enzyme - Term 30014 - 30063 -0.6 25 16 Tu 1 . - CDS 30161 - 31045 566 ## COG1052 Lactate dehydrogenase and related dehydrogenases - Prom 31067 - 31126 5.2 26 17 Tu 1 . - CDS 31217 - 31498 199 ## BF4289 hypothetical protein - Prom 31659 - 31718 6.8 + Prom 31607 - 31666 6.2 27 18 Op 1 6/0.000 + CDS 31784 - 32350 495 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Prom 32406 - 32465 3.4 28 18 Op 2 . + CDS 32499 - 33446 611 ## COG3712 Fe2+-dicitrate sensor, membrane component + Term 33609 - 33658 6.7 + Prom 33577 - 33636 4.5 29 19 Op 1 . + CDS 33669 - 36923 2667 ## Phep_1362 TonB-dependent receptor 30 19 Op 2 . + CDS 36929 - 38215 1103 ## Phep_1361 RagB/SusD domain protein 31 19 Op 3 . + CDS 38229 - 39167 594 ## Phep_1360 exopolysaccharide biosynthesis protein 32 19 Op 4 . + CDS 39180 - 40961 1184 ## COG3391 Uncharacterized conserved protein 33 19 Op 5 . + CDS 40968 - 42227 956 ## Phep_1359 NHL repeat containing protein + Term 42232 - 42273 9.2 34 19 Op 6 . 
+ CDS 42291 - 43205 636 ## COG0584 Glycerophosphoryl diester phosphodiesterase + Term 43278 - 43331 12.1 - Term 43266 - 43318 12.7 35 20 Op 1 . - CDS 43335 - 44432 848 ## Phep_1387 hypothetical protein 36 20 Op 2 . - CDS 44491 - 45711 933 ## COG0612 Predicted Zn-dependent peptidases - Prom 45789 - 45848 4.6 + Prom 45621 - 45680 3.1 37 21 Op 1 . + CDS 45817 - 46350 538 ## COG1611 Predicted Rossmann fold nucleotide-binding protein 38 21 Op 2 . + CDS 46355 - 46960 570 ## COG0794 Predicted sugar phosphate isomerase involved in capsule formation 39 21 Op 3 . + CDS 46948 - 47868 752 ## COG0524 Sugar kinases, ribokinase family + Term 48055 - 48101 3.1 + Prom 48920 - 48979 4.3 40 22 Tu 1 . + CDS 49004 - 50935 1636 ## COG0513 Superfamily II DNA and RNA helicases - Term 50937 - 50997 8.7 41 23 Op 1 . - CDS 51008 - 55078 2310 ## COG0642 Signal transduction histidine kinase 42 23 Op 2 . - CDS 55098 - 55298 121 ## gi|298483819|ref|ZP_07001991.1| hypothetical protein HMPREF0106_04287 - Prom 55439 - 55498 6.1 + Prom 55271 - 55330 5.6 43 24 Op 1 . + CDS 55433 - 58570 2772 ## PRU_2712 hypothetical protein 44 24 Op 2 . + CDS 58583 - 60202 1291 ## PRU_2713 putative lipoprotein 45 24 Op 3 . + CDS 60227 - 62041 1300 ## PRU_2714 putative lipoprotein 46 24 Op 4 . + CDS 62050 - 63189 901 ## COG4833 Predicted glycosyl hydrolase 47 24 Op 5 . + CDS 63198 - 64376 943 ## COG4833 Predicted glycosyl hydrolase 48 24 Op 6 . + CDS 64413 - 65877 1229 ## COG3538 Uncharacterized conserved protein Predicted protein(s) >gi|222159347|gb|ACAB01000012.1| GENE 1 182 - 868 495 228 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237720541|ref|ZP_04551022.1| ## NR: gi|237720541|ref|ZP_04551022.1| predicted protein [Bacteroides sp. 
2_2_4] # 1 228 123 350 350 384 100.0 1e-105 MRALYKAESEWVKERDKMHKDTGEAWEKYENSDTMVSELNIKIKQILGTENGTWYIEYKR LFFRALDNMDKYGVTYKDAFTIAKIEDTYKQKRANILNSNKKNAEREVELMAIDDEMAKK IAKTVPSVSVKWKKVNNAALDHTLKSRYGLNQEQINKFKTAYNKYAIEEYKILNQKKLSD SDKYDQLSQLGETFCKTVNPLFKVDNYKKWYGWWKYDFERKMKRKGLK >gi|222159347|gb|ACAB01000012.1| GENE 2 876 - 1454 232 192 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716787|ref|ZP_04547268.1| ## NR: gi|237716787|ref|ZP_04547268.1| predicted protein [Bacteroides sp. D1] # 1 192 1 192 192 375 100.0 1e-102 MKKSTYIYTVLCVIAIVISIVNCKDEDLLESGLQTDMKDFSLQEAKNFFQTQAHANLTLS RSLDNKRNKTVSPGDFVPNWDAAVSSTNNGLACYDIPITPTYHFKAIYVDERNGKPSAGK VNVYQKLVIVKDVKSNRMDQYILTLIPSKLYDSRNGAQTCNNFINCADKGGFYRSSIIFL CIFTSYSSYKYF >gi|222159347|gb|ACAB01000012.1| GENE 3 1372 - 2418 438 348 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237720542|ref|ZP_04551023.1| ## NR: gi|237720542|ref|ZP_04551023.1| predicted protein [Bacteroides sp. 2_2_4] # 7 348 172 513 513 629 100.0 1e-178 MQTRVVFTGVALYSCVYSQVTARISTFKNGVKTRGVFLLNASGKTNLSDKYEQARALAST VYIQKKKMVLTRGEDDYNYDYDYGNEDDYTYIGETLEEVIITPESNNNETSGGNDEWEII APPDSGTIDPEPTEPESTSTEDDTVTENNNGDQNSDEKSIPLSTAEKKAVNSLLIQLEKL KNIDRTKYTIEKQNYCRSTARTSKDGVLQLCQLFFSSDNLTEIDRIATIWHEMYHIDHKH YGKLEMTILEKTIVLNPPPYIEKILNERLDIMYGKYIMTPETREADFKQELIIDRYGTIE YYKNELETHKAERENFPEVSHYYENERTWLEWTYEQLLIIATEQSSNK >gi|222159347|gb|ACAB01000012.1| GENE 4 2423 - 2881 267 152 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237720543|ref|ZP_04551024.1| ## NR: gi|237720543|ref|ZP_04551024.1| predicted protein [Bacteroides sp. 2_2_4] # 1 152 1 152 152 283 100.0 2e-75 MNTIKSFFISILLCMSIPSKGQTQDLEITFAYNRQDNALILKLFNNTDKEIIVLNQSLLN ESSGSCIILTEKHDNGQSDLIISLYDYEDGQWIRSKTINPNERLELFYSFEAIPANNVTR ARLFLSTYFRDRKTGKLVSKRYKNDLPIKQIK >gi|222159347|gb|ACAB01000012.1| GENE 5 2878 - 3324 208 148 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716790|ref|ZP_04547271.1| ## NR: gi|237716790|ref|ZP_04547271.1| predicted protein [Bacteroides sp. 
D1] # 1 148 1 148 148 277 100.0 1e-73 MITSKHLFIGIVLLGISIFCTAQEVKIEFSYDKPNNSLTLILTNNIDKEILVMNQGRLSE FSGSYIVLTESSNGKSADLTICLFTLESGKWILHKSLSPKGRIELSYPLDSIPANNVVRA HLFLSTYSNDEKTGKLTSKRYEKNLYID >gi|222159347|gb|ACAB01000012.1| GENE 6 4456 - 4584 60 42 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEAILLTMYFAKIDIVNTFMVLLFIIPIMLMNNISNGKTWPG >gi|222159347|gb|ACAB01000012.1| GENE 7 4736 - 5110 299 124 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|298480418|ref|ZP_06998615.1| ## NR: gi|298480418|ref|ZP_06998615.1| O-antigen polymerase superfamily [Bacteroides sp. D22] # 1 124 407 530 530 255 100.0 5e-67 MYIYSKSSLEDYPLNTKPQILQTAARIAPNSELYIKMGDFWKQKRDYAQAEACYQTAAAM IPHHITPSYKLFQLYIDKGNINAAIDMGNYLLKQPIKKKGTKALRMEAEILEFLHKEKNI KKTQ >gi|222159347|gb|ACAB01000012.1| GENE 8 5281 - 6624 339 447 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716792|ref|ZP_04547273.1| ## NR: gi|237716792|ref|ZP_04547273.1| predicted protein [Bacteroides sp. D1] # 1 447 1 447 447 890 100.0 0 MNKIISLFIAVFFNVYTSAQINAGSTKESLMTLDEMPSWILSNIQFPQEAYKYGIAGIEQ VCISASWDGKVFITSILNTLNPAFEKEIMDVISKAPRCRYNGSQPKDIYKYMLIDFHQYI PEDKREQIQQVTMHIPPRLSNIPTSPFNSRDKFVQWIHNNIQIPSTLKCYSDTLLFQYTI TKKGKVNNISILQCKNDIVKCAIEDLLKKSPKWEPAIADRTTPIDVTICDKIIIKTDNDG MLLPLIVYRDDVFCNTRSKPTDPDMIVFNPEIKAKYNEEGNFLKNIMCDVIVDKKMVLNG SFVIEKDGTTSHIEISNSPDAETDSIVTEAIARTKWIAAMQGESAVRTIYSFGVNKQPRK QNQSSKYSYYDIFGKYFIALQANPMRTSYRFIQGDGTIQNYPFNNQGLFDYKAYYQGMLY YYKNMAGKNSNISRDYFDKLYKMYAGY >gi|222159347|gb|ACAB01000012.1| GENE 9 6764 - 8047 873 427 aa, chain - ## HITS:1 COG:no KEGG:BT_3321 NR:ns ## KEGG: BT_3321 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 427 1 427 427 761 83.0 0 MEHLLHYVWKHKLFPLKVLQTTNGLPVEVIDSGLQNPNAGPDFFNAKLKIDGALWVGNIE IHTHSSDWFRHGHHSDKAYDSVILHVVSEADTEITRTNGEQIPQLLLTCPDNVQLHYHEL CVADQYPACHPILASLPKLTIHSWLTALQTERLEQKAQLITQRLKHCNSNWEDAFFITLA RNFGFGLNGDAFETWAGLLPFRAMDKHRNDLFQIEAFFYGLAGLLEETFLKKEQEDEYSL RLCKEFRYLQRKFEIRQGMDATLWRFLRLRPENFPHIRLAQLAYLYQKGDKLFSRLLEAE 
TLVDVRNLLDARTSPYWENHYLFGRPSSQKEKTMGERSKDLIIINTVVPFLYTYGLHKAD ERMCERAGRFLEELKAESNHIIRSWSDAGLPVVSAADSQALIQLQKEYCDKRKCLYCRFG YEYLRKK >gi|222159347|gb|ACAB01000012.1| GENE 10 8131 - 8895 873 254 aa, chain + ## HITS:1 COG:RC0190 KEGG:ns NR:ns ## COG: RC0190 COG0289 # Protein_GI_number: 15892113 # Func_class: E Amino acid transport and metabolism # Function: Dihydrodipicolinate reductase # Organism: Rickettsia conorii # 1 233 48 266 285 109 34.0 4e-24 MKIALIGYGKMGKEIEKVARSRGHEIVCIIDINNQDDFESEAFKSADVAIEFTNPMVAYS NYMKAFKADVKLVSGSTGWMAEHGEEIKKLCTEGGKTLFWSSNFSLGVSIFSAVNKYLAK IMNQFPAYDVTMSETHHIHKLDAPSGTAITLAEGILEKLDRKDKWVKGTFLAPDGTISGT NDCAPNELPIASIREGEVFGLHTIRYESDVDSITITHDAKSRGGFVLGAVLAAEYTATHE GFLGMSDLFPFLND >gi|222159347|gb|ACAB01000012.1| GENE 11 8931 - 10415 1544 494 aa, chain + ## HITS:1 COG:STM2582 KEGG:ns NR:ns ## COG: STM2582 COG0681 # Protein_GI_number: 16765902 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Salmonella typhimurium LT2 # 438 489 269 320 324 73 57.0 1e-12 MRQATRAQWIKCAIAILLYLVFLLWVRSWWGLIVVPFIFDIYITKKIPWSFWKKSKNPAV RSVMSWVDAIVFALVAVYFVNIYIFQNYQIPSSSLEKSLLVGDFLYVSKMSYGPRVPNTP LSMPLAQHTLPVFNTKSYIEWPQWKYKRVPGFGKVKLNDIVVFNFPAGDTVAVNYQQTTD FYTLAYGEGQRIYSKQIEMDSLTRSQQRAIYDLYYDAGRKQILNNPRTYGEVLWRPVDRR ENYVKRCVGLPGDTLQIVDGQVMIDGKAIENPENLQFNYFVQTTGPYIPEDMLRELGISK DDTMLIEDSGWEGGLLDMGLDNRNAQGKLNPVYHLPLTKKMYDTLLGNKKLISKIVMEPE EYAGQMYPLNLYTKWNRNNYGPIWIPAKGATITLTEDNLPIYERCIVAYEGNKLEVKPDG IYINGEKTNEYTFKMDYYWMMGDNRHNSADSRYWGFVPEDHVVGKPIVVWLSLDKDRGWF DGKIRWNRLFKWVD >gi|222159347|gb|ACAB01000012.1| GENE 12 10458 - 11399 509 313 aa, chain + ## HITS:1 COG:YPO2717 KEGG:ns NR:ns ## COG: YPO2717 COG0681 # Protein_GI_number: 16122921 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Yersinia pestis # 16 307 78 327 332 78 28.0 1e-14 MNIRKFKWILAFAGAVVVVLLLRGFAFTSCLIPSTGMENSIFQGERILVNKWSYGLRVPF 
MSLFSYHRWCESPVRQQDIVVFNNPAGIREPIIDRREIYISRCLGVPGDTLLVDSLFSVI SPEAQFNPDKKRLYSYPASKENLITSLMHTLSITNDGLMGSNDSTHVRSFSRYEYYLLEQ AMNGKESFVQPLSNREDAEPNPLIVPGKGKFIRVYPWNITLLRNTLVMHEGKQAEIKNDT LYVDGKPTQHCYFTKDYYWMGSNNTVNFSDSRLFGFVPQDHIIGKASIIWFSKEKETGLF DGYRWNRFFRTVK >gi|222159347|gb|ACAB01000012.1| GENE 13 11491 - 12117 570 208 aa, chain + ## HITS:1 COG:no KEGG:BT_3317 NR:ns ## KEGG: BT_3317 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 206 1 206 209 357 85.0 2e-97 MKIAYLSSAYLAPVEYYTKLLAYDKVLVEQHDHYIKQTYRNRCTIASPSGELVLSIPTVK PDTLKCPMKDIRISDHGNWRHLHWNAIESAYNSTPFFEYYKDDFRPFYEKKYEFLIDFNE ELCRMVCELIDIHPTMERTSEYKMEFAPGEFDFREVIHPKKDFREVDTEFIPQPYYQVFE PKLGFLPNLSIIDLLFNMGPESLLVLGK >gi|222159347|gb|ACAB01000012.1| GENE 14 12172 - 13881 1204 569 aa, chain - ## HITS:1 COG:CC2801 KEGG:ns NR:ns ## COG: CC2801 COG1874 # Protein_GI_number: 16127033 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Caulobacter vibrioides # 23 569 125 628 628 383 39.0 1e-106 MRNTLFGILFLFILPLQAHQKLPYLQKQGSTTQLMVDGKPFLVIGGELGNSSASSIEDIE RIFPKLQRMGLNTVLVPAYWDLTEPQEGKFDFTLTDKVIQQARANDLKVVFLWFGAWKNS MSCYAPIWFKEDYKKYPRAYTKAGKPLEIASSFSENVFQADSRAFSQWMKHIASVDKEEG TVIMIQIENEIGMLEDARDYSKEADKLFYAPVPSLFIGYLQKNKRSLHPEMLAKWESQGF KKKGTWQEVFGADVYTDEIFMAWSYAQYVERMAKLARSIYNIPLYVNAAMNSRGRKPGEY PSAGPLAHLIDVWHYAAPNIDFLAPDLYDKGFVDWVAKYKLHNNPLFIPEIRLEENDGVR AFYVFGEHDAIGFCPFSVESGSDRADAPLVQSYIKLKELMPLLTKYQGKGVMNGLLFDEE NKERILSYDDLEITCRHYFTLPWDPRARDGTVWPEGGGVLLRLAPDEYIVAGSGLVLEFK KQGENKMKSSPALGEDGFASVGGKASKTENSWQGGMRAGIGSVDEVNVNEDGSLKYIRRL NGDQDHQGRHVRIPVGEFSILHVKLYEYK >gi|222159347|gb|ACAB01000012.1| GENE 15 14077 - 14565 312 162 aa, chain - ## HITS:1 COG:no KEGG:BT_3316 NR:ns ## KEGG: BT_3316 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 147 1 148 150 196 68.0 2e-49 MEEVLSNQQARPGDATQLMHVIFSSDDEMMSFYLTLNRFMNPESYLVERTDRKRLEDLAS 
TLCSNVAAFEAIRNYKSISVKEVIRGFGAHMMNTLISNTNRFQSADAVGTLMNCILNTTK NSWQFKKMDRNNDIHLQNVRYLLNRLDAAESNEEKNCEEVAI >gi|222159347|gb|ACAB01000012.1| GENE 16 15509 - 15733 111 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716802|ref|ZP_04547283.1| ## NR: gi|237716802|ref|ZP_04547283.1| predicted protein [Bacteroides sp. D1] # 1 74 1 74 74 135 100.0 9e-31 MILYYFMYLFHHKYLVYNYSANGETDEPLFKTIRELHKFDINKCNRRYCNGDFVNEKQHN IQWVEVLSKEALQD >gi|222159347|gb|ACAB01000012.1| GENE 17 15730 - 16539 640 269 aa, chain - ## HITS:1 COG:mlr2757 KEGG:ns NR:ns ## COG: mlr2757 COG3177 # Protein_GI_number: 13472455 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 39 252 41 240 263 89 32.0 7e-18 MEKGIWLEIEELYKEFQQLGISQSVDYEKYYLYSLITHSTAIEGSTLTEMDAQLLFDEGV TAKGKPLVYHLMNEDLKKAYELAKKEAQRNTVITPAFLQKLNATLMRTTGGRHNTIGGSF DSSRGEFRLCGVTAGVGGRSYMGYQKVPVKVEELCFLLQERQKNVETFREQYELSFNAHL NLVTIHPWVDGNGRAARLLMNYIQFCYHLFPAKIFKEDRADYILSLQQAQDDETNQPFLD FMAVQLKKSLSLEIQKYKASHDKGFSFMF >gi|222159347|gb|ACAB01000012.1| GENE 18 16753 - 18999 2544 748 aa, chain - ## HITS:1 COG:YPO2803 KEGG:ns NR:ns ## COG: YPO2803 COG1472 # Protein_GI_number: 16123001 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Yersinia pestis # 19 748 14 714 793 428 35.0 1e-119 MKFKAMLLGLSVITALPAFAQKPVYLDTGKPIEERVKDALNRMTLEEKVKMIHAQSKFSS AGVPRLGIPEVWATDGPHGIRPEVLWDEWDQAGWTNDSCIAYPALTCLSATWNPEMSHLY GKSIGEEARYRKKDILLGPGVNIYRTPLNGRNFEYMGEDPYLSATMVVPYIKGVQENGVA ACVKHYALNNQEFNRHTTNVQLSDRALYEIYLPAFKAAVQEGGTWSIMGSYNLYQGQHAC HNKRLLKDILRDEWGFDGVVVSDWGGVHNTEQAIHNGMDLEFGSWTNGLSAGTRNAYDNY YLAFPYLKLIKEGKVGTKELDEKVSNVLRLIFRTSMDPHKPFGSLGSPEHGQAGRQIGEE GIVLLQNNGNLLPIDLNKAKKIAVIGENAIKMMTVGGGSSSLKVKYEISPLDGLKSRVGS KAEVVYARGYVGDPTGEYNGVKTGQDLKDNRSEDELLAEALQVAKDADYVVFFGGLNKSN HQDCEDSDRASLGLPYAQDRVISELAKVNKNLIVVNISGNAVAMPWVNEVPAIVQGWFLG SEAGTALASVLVGDANPSGKLPFTFPAKLEDVGAHKLGEYPGNKEELAQSKHRGDTINEI 
YREDIFVGYRWADKEKIKPLFPFGHGLSYTTFAYGKPSADKKTMTVDDTISFTVNVKNTG TREGQEVVQLYISDKKSSLPRPVKELKGFQKVKLAPGEEKAVTLTIDKKALSFFDDAKHE WVAEPGKFEAIIGSSSRDIKGTVPFELK >gi|222159347|gb|ACAB01000012.1| GENE 19 19191 - 21188 1803 665 aa, chain - ## HITS:1 COG:no KEGG:BT_3313 NR:ns ## KEGG: BT_3313 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 663 1 666 667 891 68.0 0 MKKNSYIIVALAGMLSMNSCNDDEFLPGNPSMEIKAENADALFGDSLPFTIKASDVDVPL STLKAQLFYGEEQVSETVIRTKTSGNDYTGKIFVPYYANIPNGKATLKYILQNIHFTTTE MTKELALARPDFPYLTLVDEEGKEYRMERQAMYKYSVTGDFSQKMKAYIKTPKVGENGNE LTFGWENGTIEAGSTNAISFSNTEPGNYAIKFNTLTYEAEPFAKLKVNGEDMELVENDIY AIKLTLKKNDILAFEGVPDYDNWWIDQDYFEKQEDGTLKFLPIDGSYQITANGKMKYFSV IALKNGEAAKLQDDGTGAIWAIGTGIGKPSVALSEVGWTPENGLCMPQLTAKKYQLTFIA GVTMKVDDINFKFFHINKWDNGEFKGDAISTTSELVKISSDGNLGLEEGQKFERGGIYRF TVDVTKGNTKAVLTVEKVGKVDLPAPDIFFGNDKMEVTDTDIYKSDQAFTQGQMITVTGI DNLNEWWIDPDFFEKQSDGALKFLPINGDYRVTANAVLKYFSVMALKDGKPAKLQDDGTG AIWAIGKGIGKPSVTSSEVGWEPSKALCLAQVASKKYQLTLKAGETLKTSGDPEVISFKF FHQNDWGGEFGNYASSILVEQLKLADSGNLEMQDNKAFEEGAVYRFTIDVTNGNANADLK VEKIN >gi|222159347|gb|ACAB01000012.1| GENE 20 21211 - 22701 1445 496 aa, chain - ## HITS:1 COG:CC1757 KEGG:ns NR:ns ## COG: CC1757 COG5520 # Protein_GI_number: 16126001 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: O-Glycosyl hydrolase # Organism: Caulobacter vibrioides # 27 492 15 469 469 240 32.0 5e-63 MIDMRNIALTFLGCFTILAACSNSDDAEKPVTPVPTGDVTIYATTSSLTRDLTRDAVNFS SKDNLAPTSITLNPTEQYQTMDGFGAAITGATCFNLLQMKPEDRHAFLTETFSDDKGFGF SYIRISIGCSDFSLSEYTCCDTKGIEHFALQSEEKDYILPILKEILSINPSIKVIAAPWT CPKWMKVKSLTDLTPLDSWTNGQLNPAYYQDYATYFVKWVQAFNAEGIDIYAVTPQNEPL NRGNSASLYMSWEEQRDFVKTALGPKFKTAGLATKIYAYDHNYDYSDIETEKNYPGKMYE DAAASQYLAGAAYHNYGGNREELLNIHKAYPEKELLFTETSIGTWNSGRDLSKRLLEDMK EVALGTINNWCRGVIVWNLMLDNDRAPNREGGCQTCYGAVDISNSDYKTIIRNSHYYIIA HLSSVVKPGAVRIGASGYADSNIMYSAFENPDGTYAFVLMNNNEKTKKITLSDGKRHFAY DVPGKSVTSYRWAKSE >gi|222159347|gb|ACAB01000012.1| GENE 21 22732 - 24252 1306 506 aa, chain - ## 
HITS:1 COG:no KEGG:BT_3311 NR:ns ## KEGG: BT_3311 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 506 1 506 506 920 89.0 0 MKLRTIFYGLTSGLLLGLSSCSLNYEPLDTYSDVTEGVTSDGTKIVFKDKAAVESHLTTL YNQMRDRQEHWYVDLLLISDSHSDNAYAGTTGAEVVPFENNSIEGSNSVLERDWNRYLED VARANKLICNIDLVTDNSLTTAERAQYKAEAKIFRAMVMFDMVRLWGDFPVITTVADDIT SENIDEVYPQYFPKQNTELEAYQQIEKDLLDAVLYAPDNTPGNKTLFSKSVARTLLAKIY AEKPLRDYTKVIQYCDEVKADGFDLVDDFSDLFGMNAAGTDAKMRNTKESILEAQFTSGA GNWCTWMFGRDLVNWNNNFTWAKWVTPSRDLISAFKQEGDEVRFKESIVYYDCNWSNYYP SDNYPFMYKCRSANSSIIKYRYADVLLLKAEALIMQDTPDLEGAADIIDEVRDRAKLGAL PTSVRSNKNVLLNALLKERRLELAFEGQRWFDLVRLDKVEEVMNAVYSKDSGRKAQIYTF DKNSYRLPIPQSVIDANDKIHQNPGY >gi|222159347|gb|ACAB01000012.1| GENE 22 24267 - 27272 2651 1001 aa, chain - ## HITS:1 COG:no KEGG:BT_3310 NR:ns ## KEGG: BT_3310 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1001 1 1001 1001 1813 93.0 0 MKKNRRKILSGSRKIFFAILAMFLSLSASAQQITASGQVLDAQKEPLIGVSVQEKGTSNG AITDLDGNFTLNVKQNAILIFSYVGYKSQEVKAAQQMKITLQEDNEVLDEVVVIGYGSVK RKDVTTAISSVSTRDLDMRPIVSAGQAIQGKAAGVSVIQPNGTPGGEMSIRVRGTTSMNG SNDPLYVVDGVPVDNIKFLSPNDIESMQILKDASSASIYGSRAANGVILITTKAGAAGNA KVSLTAQFGLNKVADKVESLNAAQYKELQDEIGLVSLPDGLPDRTDWFDETYTTGKTQNY QVAVSNGNEKMKYYLSAGYLKEQGVLDISYYKRYNFRVNLENQVRKWLTVSANISYSDYT SNGGGAMGTGSNRGGVILAVINTPTYAPVWDALNPNQYYNNFYGVGNITNPLENMARAKN NKDKENRLLASGNILLTPFPELKFKSTLTLDRRNAVNTTFLDPISTAWGRNQYGEASDNR NMNTVLTFDNVLTYNKNFKKHGLEVMAGSSWTDSDYSNSWINGSHYRSDQIQTLNAANKI SWDNTGTGASQWGIMSFFGRVAYNFDSKYLVTANLRADGSSKLHPDHRWGVFPSFSAAWR ISSEKFMENLTWIDDLKLRGGWGQTGNQSGIGDYAYLQRYNIGRIEWFKKGGEGDSTDYA NAVPTISQANLRTSDLTWETTTQTNIGLDLTILNGRLTFNADYYYKKTKNMLMNVSLPAG AAAATSIARNEGEMVNKGFELSISSKNLRGGAFTWDTDFNISFNRNKLTKLELQKVYYDA KTADVVNDYVVRNEPGRALGGFYGYISDGVDPETGELMYRDLNNDGKISSSDRTYIGDPN PDFTYGMTNTFSWKGFNLSIFIQGSYGNDIYNASRIETEGMYDGKNQSARVLNRWKIPGQ ITDVPKANFKLLNSTYFVEDGSYLRLKDVSLSYNVKGKLLKKWGITRLQPYFTATNLLTW TSYSGMDPEVNQWGNSGTVQGIDWGTYPHCRSYVFGINVEF 
>gi|222159347|gb|ACAB01000012.1| GENE 23 27483 - 29126 1331 547 aa, chain - ## HITS:1 COG:no KEGG:BT_3309 NR:ns ## KEGG: BT_3309 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 547 1 547 547 804 81.0 0 MKKKYLIFLLLLSFPLYTRADKTLDSLLNVLDLTIQEHEIYVVQRESRIKHLKELTHGIE PNSAEQYNLNSQIYKEYKAFICDSAIHYLNENIRIAERLRDADRQIESQLQLSLLLSSTG MYKESLDVLESVDRRKIIPRLIADYYTCFDHVYGELGVYTQDKTLSGRYWSISQAYRDSL YAILPPESEEYLLMREASFRDQRQYEEALKVNDLRLTKIEPYTPQYAMATYHRSLIYKYS NDSLGEKRNLCLSAISDIRSAIKDHASLWMLAQLLYEDGDMERAYQYMRFSWNATKFYNA RLRSWQSADVLSLIDKTYQAMIEKQNDRLQQNLLLITALLVLLIVALGYIYRQMKKLADA RNHLQVANKQLNGLNEELRQMNSCLSSTNIELSESNQIKEEYIARFIKLCSTYINRLDAY RRMVNKKVSAGQIAELLKITRSQDALDEELEELYANFDTAFLHLFPNFVGKFNDLLQENE QILPKKGELLNTELRIFALIRLGIEDSSQIAEFLRYSVNTIYNYRAKVRNKARGSREDFD DLVRKIR >gi|222159347|gb|ACAB01000012.1| GENE 24 29390 - 30178 480 262 aa, chain + ## HITS:1 COG:MA0958 KEGG:ns NR:ns ## COG: MA0958 COG1712 # Protein_GI_number: 20089836 # Func_class: R General function prediction only # Function: Predicted dinucleotide-utilizing enzyme # Organism: Methanosarcina acetivorans str.C2A # 1 260 1 267 271 102 31.0 7e-22 MKKLVIVGCGRLAGIVADAVVKGLLPDYDLVGVYSRTASKAAHIVHKMQQHGKPCIACAT LEELLALKPDYLVESASPAAMRELALPALRNGTSVITLSIGALADETFYREVAETAKANG TRIYIASGATGGFDVLRTASLMGNTTARFFNEKGPNGLKGTPVYDDSLQTEQRTVFSGSA AEAIRLFPTKVNVTVAASRASVGPENMQVSIQSTPGFVGDTQRVEIKNDQVHAVVDIYSA TSDIAGWSVVSTLINITSPIVF >gi|222159347|gb|ACAB01000012.1| GENE 25 30161 - 31045 566 294 aa, chain - ## HITS:1 COG:SA0791 KEGG:ns NR:ns ## COG: SA0791 COG1052 # Protein_GI_number: 15926519 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Staphylococcus aureus N315 # 26 271 22 293 319 76 26.0 4e-14 MFQKLVAIEPVSLVPSAEKALYSFAGQVVMYPDVPASDDEIIARIGDADAVLLSYTSRIN RYVLECCLNVKYIGMCCSLYSPESANVDIRYANERGITVTGIRDYGDEGVVEYVVSELVR 
CLHGFGQESWEELPREITGLKVGIVGLGKSGGMIADALKFFGADISYYARSEKEAATAKG YCFLPLEKLLAESEVICCCLNKNTVLLHEEEFRQMGNRKILFNTGLSPAWDEAAFAEWLE GDNLCFCDTIGALGSEQLLNHPHVRCMQVSTGRTRQAFDRLSAKVLANLSEYNG >gi|222159347|gb|ACAB01000012.1| GENE 26 31217 - 31498 199 93 aa, chain - ## HITS:1 COG:no KEGG:BF4289 NR:ns ## KEGG: BF4289 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 34 93 148 204 206 63 51.0 2e-09 MRVGIRGDLERRKCVDFNTSCKWDRENLKKLVLQYIGEHGFITRTTYTELTGRLKNTALD DLKSFAAEGIIKREGRGNQMHFIALPRKEPDGE >gi|222159347|gb|ACAB01000012.1| GENE 27 31784 - 32350 495 188 aa, chain + ## HITS:1 COG:mlr8088 KEGG:ns NR:ns ## COG: mlr8088 COG1595 # Protein_GI_number: 13476697 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 19 174 54 210 217 66 26.0 3e-11 MKIDEIQCIKELRNGSYQAFTQIYEAYADRLYSFVLKQLKNRSLAQDIVQDTFLRLWDNR SQLNSFGNLQAFIFTIAKHQVIDYFRKQVNELQFEDFMEYCENQESDVSPEDLLLYDEFL QQLKKSKNILSQREREIYELSREEHIPVKQIAEQLELSEQTVKNYLTSALKILRSEIMKY NILFIFFL >gi|222159347|gb|ACAB01000012.1| GENE 28 32499 - 33446 611 315 aa, chain + ## HITS:1 COG:PA1301 KEGG:ns NR:ns ## COG: PA1301 COG3712 # Protein_GI_number: 15596498 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 71 286 85 295 327 79 25.0 1e-14 MATHFKDLYRRYMEKGLPSSDLREFKKELEHIPDEELWNTMINMEKDSAPEIGMPPMMKK QIRKELHQIIWRRRWYQFAKYAAVIALLVTSSFGIYSLFDTPSSQQMITANVKPGSKSEI ILPDGTKVQLNGATTIRYDINDTEQRLVHLSGEAFFDVAKNPDCPFRVMVNDFQIEVLGT SFNVNTYKKDVIETSLLTGKIKISGGSLPHEYTLTPGEKATYSSVDKALKITKADVHVET GWCNDYLIFDSEPLIDVIEQIERWYGVEIELKYPQIAQDLLSGSFRHENIQNVIHSLSLQ YKFKYEIHKDKIIIY >gi|222159347|gb|ACAB01000012.1| GENE 29 33669 - 36923 2667 1084 aa, chain + ## HITS:1 COG:no KEGG:Phep_1362 NR:ns ## KEGG: Phep_1362 # Name: not_defined # Def: TonB-dependent receptor # Organism: P.heparinus # Pathway: not_defined # 1 1084 49 
1140 1140 932 46.0 0 MILLSTTGALAQKGNVSVNINNGTVKTFIKEIEKQTRYTFVYRNNVLNDQAKVTVNCKNK PLDQVLSQVFTPLNVSYSLNNNTIVLVKQEVQQQKKNEKKTIKGTVTDGKGEPIIGANVI QEGGIGTITDVEGNFTVTADLSKPLEISYIGYKKKSIRIGASPTINIALEEDAHVMDEVI VIGYGSKTKRDVTSSIGTYKPGEVNVRQVLGVDELLQGRVSGVNITSASGVPGSRNRVSI RGIGSITAGNEPLYVIDGVPINNTSGDTGAWGAQSMNGLNDFNPSDVESIQILKDAASAA IYGSRATNGVILITTKKGSKGQAKVNIDTNVSFSNLTRTDKLDMTDTDLFLEVLNEAIDN YNLQTNSTQARIDNPAPGKAQTNWLDLVLRTAVTYTTTASVSGGTDKTNYYLSANYKHNE GVIINNQLKRYNLKVNLDTEIKKWLKVGTALNLSYSRNNRVPTGYNIGTSVITRAIEQRP WDSPYRPDGEYAVGGQELANHNPIQALNEEDVYIDNYRALGSLYMLFNITKDLNFKTTLG EDFNYTEEHIYYTADHPYGNKVGKLIDGRKSYASTLWENVLTYKHSFEEDFSLDVMLGHS IQKDVTSSAAQTGIGFPSPSFDVNSVAAEYSDISTGLSSFLLQSFFGRLSLNYKNRYLLT GTMRADGSSKFISSNRYGYFPSVSAGWNLGEESWWKYPQTDAKFRASWGCTGNQGGIGSY AYQALAGGGYNYNGENGLGLTTAGNRDLKWEKAQQADIGVDLSFFRGAITFTADAFIKDT KDLLYQKPTPATSGYTSQVCNIGSMRNKGLEFTLGANLSKGSFSWHSDFNISFIRNKLTA LLDNNEILTTSSMHALKVGEPIGSFYMIKWKGIYQSDDEIPAKIYDQGVRAGDCIYEDVD GNDVIDENDKQFVGSANPKFTGGFNNTFKYKGVDLSLFFTFSSGNKLYELWTGGLRMGNG TWPILKSSAESRWTGPGSTNENPRAIYGYTWNSTKFVNTRMLHDASYIRCRTASIGYTLP KSWINRIHIDNLRIYFQADNLFILTKWPYLDPEVNVSLSATNMGYDYLYPSQPRTFTIGV NLKF >gi|222159347|gb|ACAB01000012.1| GENE 30 36929 - 38215 1103 428 aa, chain + ## HITS:1 COG:no KEGG:Phep_1361 NR:ns ## KEGG: Phep_1361 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 18 428 18 426 426 407 52.0 1e-112 MKNFIISLLTVLLVFATSCNEMDQYPHNAVSSDNLTEEDAQLLLTGLYFYIQNKPTVNGY LTQDIVGGDLVRGGATGLKDPVLLVKDLVTPESGFVSGPWDGFYTALYQVNSLIVAVDKL AASQSRNEILGVASFFRGLIYYHLVTRYGEVPILEAPFSGDIAASTETEGWSFVEKNFQV AIDYAPTFSDKYYVSKQAAKALMARTKLAQGKMTEAAKLAEEVIGDANFSLADFDQIFRG KANREEIFSFVNLLNESSVNLSASLYSRASANGGSYTYAPTTKVMNMFEPDDKRTAISID MQETNEVINKYPGGEVTTDPIIITRLGEMYLISAEAQGLSKGLTRLNELRNFRGLPSVHP ATEEDFIDAILNERHTELLAEGFRWFDLVRLNRLESDLGFERKYNKLPIPAKERSLNKLL NQNSYWAN >gi|222159347|gb|ACAB01000012.1| GENE 31 38229 - 39167 594 312 aa, chain + ## HITS:1 COG:no KEGG:Phep_1360 NR:ns ## KEGG: Phep_1360 # Name: not_defined # Def: 
exopolysaccharide biosynthesis protein # Organism: P.heparinus # Pathway: not_defined # 7 307 1 297 303 224 39.0 4e-57 MNLLKYMQILSFVLPVAFLSCSNDTIEDVHYTPQTKIGQKLLAGSETVARVYTDTSFVVA LGVTETDVHFQKADSRSTHIFIIDIDLNEPGVSLEVGMPYDADVRNNFQRQTLTEMADYA DRPWHRVAAMINADFWDVSTMDIRGPIHRNGVILKNSFIFKETLPQQALSFIALTKDNKM VIADSVEYRGMQYNLKEVTGSGVIVLRDGEISGATYPGIDPRTCLGYSDDGHVYFMVADG RVEFYSYGLTYPEMGSIMKALGCSWAVNLDGGGSTQMLIRHPIADIFQIRNRPSDGQERP VVNTWMVTVNEP >gi|222159347|gb|ACAB01000012.1| GENE 32 39180 - 40961 1184 593 aa, chain + ## HITS:1 COG:MA2021 KEGG:ns NR:ns ## COG: MA2021 COG3391 # Protein_GI_number: 20090869 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 311 468 142 312 341 67 31.0 9e-11 MKNSIFILSILFCCNLFCGCSESDESDSRQFPPKLISIIPKAGSTGGTAIISGVHFSETA IDNEVFINGVQAEIIDATQNRLVIALPNNPEGTYSIKVSVKGEVVEGLKFTYATPQAPPE LAVLQVMPSSAYAGDLVTLIGQCFSTVVSENQVTINGVTAEVKEATSSQLKIIIPDTEEG SYPIRVRIGTKEAESPLFTYLHTVTLTTSSLTPARGKAGEEVVISGEGFGTTVEENMVSI NGKQATVKVVTATTLTIIAPENPSGTYPVKVTVADKTVENLSFTYEDLSYTVATVAGNSA TTSTDGKGTAASFKFPQGLALAPNGDIWIAERGNNVIRKMDQEYNVSTVAKSGTVTFNAP WQGGFDLSGIYYVANKALNNIIKITQEGTCSVFSTETTFKSPMSVTFDSNGNMYIADRDN KAVKKITSGGTVTNYDMSSLKAGPNCMAVDKKGRIFVGTGGTYQLHMFDTDGTLKTVFGT GVVPTAATYSDGEQNDLSKATMGATFGIAFGPDEVLYITDYTMHTIRTLTPDAEGDYTKG TLKTIAGIPGTKGKIDGSALTATFNCPASVLVSDKVYIADEQNHLIRTITVNK >gi|222159347|gb|ACAB01000012.1| GENE 33 40968 - 42227 956 419 aa, chain + ## HITS:1 COG:no KEGG:Phep_1359 NR:ns ## KEGG: Phep_1359 # Name: not_defined # Def: NHL repeat containing protein # Organism: P.heparinus # Pathway: not_defined # 1 416 1 435 439 154 31.0 7e-36 MRKIYILLLACPLLSLFVACNDNDYKASVTELRLVLVKPTNVYSGEIATILGRNFSTVPE ENAVFINDQQATVIEAFKDELKIILPEMAPGKYSIRVKSPSGELTGLELNYLKTPDQEYI VQTIVGQKGVFEMTDGVGTEATTKLPTGIAFAPDGSLWFTERGYNYIRRISPDFLVTSLL DVAVDGSSAIWQGGFDSKGNYYFIDKGKGMLRKIETGSMSVSTIASEMKSPMNVTFDDED NIYVSARDNKAIYKFTPSGTKTTFATLNVSPNYIVFDKNKNMIVGTSNGYVLIQISPDGT 
QKTIAGDGVKGQEYYDGEPGNPLSAKVGATFGVAAGSDGCLYLSDNTYNCIRKLTPDANG DYSKGTLETIAGSGKAGFSDGKGLKATFNQPYEIIITEDCKTMYVAGAVNYLIRRITVK >gi|222159347|gb|ACAB01000012.1| GENE 34 42291 - 43205 636 304 aa, chain + ## HITS:1 COG:AGl598 KEGG:ns NR:ns ## COG: AGl598 COG0584 # Protein_GI_number: 15890416 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 39 292 36 294 306 119 32.0 9e-27 MTKKLFFLLPILLTAFSISAQTRTDKLLKNLHDNESKYIFVIAHRGDWRNAPENSLQSIE KAIAMKVDMIELDIQPTKDGSFICMHDETLDRTSTGKGPIKDYTTEELKKFVLRSGNGIK TRQPIPTLKEALNVCKGHILVNIDKGGTYIKEIMPIIQECGMEKQVIIKGYYPVEKVKKE YGSNESMLYMPIVNLWDKEAVATIQTFIKNFTPIAYELCFKDDANPNLKIIDEIAKSGSR IWMNTLWDSLCGGHDDENALLESKDKHWGWMLKHKATMIQTDRPQELIHYLEEKGLRDLE QSSL >gi|222159347|gb|ACAB01000012.1| GENE 35 43335 - 44432 848 365 aa, chain - ## HITS:1 COG:no KEGG:Phep_1387 NR:ns ## KEGG: Phep_1387 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 25 362 20 358 358 315 45.0 1e-84 MRNLTRNVWICLGLIMFPLTTFGQQKSNFTYVPAQELLLVGKATTEGEYFHRVDTAKYCT MPPAVKKLFTNSAGLAISFTTNSPVIKAKWTVPDNYQLPNLTRLAQKGLDLYIKRDGKWQ FAGVGMPGGVTTERVIVDNMGTEEKECLLYLPLYDELKSLEIGVSSDAHIHKGENPFKEK IVVYGSSILQGASASRPGMAYPARLSRSSGYNFINLGLSGNGKMEKEVAGMLADIDADAF ILDCIPNPSPKEITERAVDFVMTLREKHPDTPIIIIQTLIRETGNFNQKARENVKQQNEA IAEQVEVLRKKGVKNLYFIKEDRFLGTDHEGTIDGTHPNDLGFDRMLKKYKPAISKILKI KFRDE >gi|222159347|gb|ACAB01000012.1| GENE 36 44491 - 45711 933 406 aa, chain - ## HITS:1 COG:BMEI1451 KEGG:ns NR:ns ## COG: BMEI1451 COG0612 # Protein_GI_number: 17987734 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Brucella melitensis # 8 395 68 453 490 211 34.0 2e-54 MHCNEYTLPNGLRIIHEPTLSKVAYCGFAIDAGTRDEAENEQGMAHFVEHLIFKGTEKRK AWHILNRMENVGGDLNAYTNKEETVVYAAFLKEHLERALELLGDIVFHSTFPQHEIEKET EVIIDEIQSYEDTPSELIFDDFEDMIFRNHPLGRNILGKPELLRSFRTEDVLSFTRRFYQ PGNMVFFVQGQYDFKKIIRLVEKYLSDIPDVRVENRRTPPPLYMPEHLTVPRDTHQAHVM 
IGSRGYNAYDDKRTALYLLNNVLGGPGMNSKLNVSLRERRGLVYNVESNLTSYTDTGAFC IYFGTDVDDMDTCLKLTYKELKRMRDVKMTSSQLAAAKKQLIGQIGVASDNFENNALGMA KTYLHYHKYESSELVFKRIEELTAQQLLEVANEMFAEEYLSTLIYK >gi|222159347|gb|ACAB01000012.1| GENE 37 45817 - 46350 538 177 aa, chain + ## HITS:1 COG:BH3084 KEGG:ns NR:ns ## COG: BH3084 COG1611 # Protein_GI_number: 15615646 # Func_class: R General function prediction only # Function: Predicted Rossmann fold nucleotide-binding protein # Organism: Bacillus halodurans # 3 161 2 160 187 116 38.0 2e-26 MEKIGIFCSASENIDKVYFESARQIGEWMGQKGKTLIYGGASLGLMECIARAVKENGGKV IGVVPAKLEENGKVSTLLDEEIHTRNLSDRKDIITEKSEVLVALPGGVGTLDEIFHVIAA ASIGYHRKKVIFYNEYGFYDELLKALHTLEDKGFARQPFSTYYEVANTLNELKEKIN >gi|222159347|gb|ACAB01000012.1| GENE 38 46355 - 46960 570 201 aa, chain + ## HITS:1 COG:YPO3577_1 KEGG:ns NR:ns ## COG: YPO3577_1 COG0794 # Protein_GI_number: 16123721 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted sugar phosphate isomerase involved in capsule formation # Organism: Yersinia pestis # 6 194 17 201 212 142 42.0 4e-34 MIDSIKQLLQQEAQAVLNIPITDAYEKAVKLIVEQIHQKKGKLVTSGMGKAGQIAMNIAT TFCSTGIPSVFLHPSEAQHGDLGILQKNDLLLLISNSGKTREIVELTRLAHNLDPDLKFI VITGNPDSPLAKESDVCLSTGKPAEVCVLGMTPTTSTTAMTVIGDILVVQTMKETGFTIA EYSKRHHGGYLGEKSRSLCEK >gi|222159347|gb|ACAB01000012.1| GENE 39 46948 - 47868 752 306 aa, chain + ## HITS:1 COG:VCA0656 KEGG:ns NR:ns ## COG: VCA0656 COG0524 # Protein_GI_number: 15601414 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Vibrio cholerae # 1 257 18 276 323 89 24.0 8e-18 MRKVIGIGETILDIIFRGDQPSAAVPGGSVFNGIVSLGRMGINVGFISETGNDRVGNIIL QFMRENNIPTDHVNVFPDGKSPVSLAFLNEQSDAEYIFYKDYPKQRLDVLFPKLEEDDIV MVGSYYALNPVLREKVLELLDQAREKKAIIYYDPNFRSSHKNEAMKLAPTIIENLEYADI VRGSLEDFLYMYNMQDIDKIYKDKIKFYCPRFICTAGGEKVALRTNLVNKDYPVEPLQAV STIGAGDNFNAGLIYGLLKYDVRYRDLNNLNEEIWDKIIQCGKDFAAEVCGSFSNSVSVE FAKKYR >gi|222159347|gb|ACAB01000012.1| GENE 40 49004 - 50935 1636 643 aa, chain + ## HITS:1 COG:BH2384 KEGG:ns 
NR:ns ## COG: BH2384 COG0513 # Protein_GI_number: 15614947 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Bacillus halodurans # 1 543 5 522 539 370 38.0 1e-102 MKTFEELGVSPEIRRAIEEMGYENPMPVQEEVIPYLLGENNDVVALAQTGTGKTAAFGLP LLQQIDVKNRIPQSLILCPTRELCLQIAGDLNDYSKYIDGLKVLPVYGGSSIDSQIRSLK RGVHIIVATPGRLLDLMERKTVSLSTIRNVVMDEADEMLNMGFTDSINAILADVPQERNT LLFSATMSPEIARISKNYLRNAKEITIGRKNESTNNVKHVVYTVQAKDKYEALKRIVDYY PQIYGIIFCRTRKETQEIADKLMQEGYNADSLHGELSQAQRDTVMQKFRICNLQLLVATD VAARGLDVDDLTHVINYGLPDDTESYTHRSGRTGRAGKTGTSIAIINLREKGKLREIERI IGKKFITGEMPTAAGICQKQLIKVIDDLEKVKVNEEEIAGFMPEIYRKLEWLSKEDLIKR VVSHEFNRFAEYYRHRPEIEQPTIESRSERARKSDRKENGFEKRSRKAAPGLNRLFINLG KTDSFFPSDLIGLLNSNTRGRIELGRIDLMQNFSFFEVPEKETVNVLKALNRAKWNGRKV VVEVSSEEGGKGHENGSGERKGGKRSGKNEERAPRYENKDRKSKDASAKGSKSSKKEKPS RAERGYSDARGPKRKDNWQEFFKDKEPDFSEEGWARRKPKKQL >gi|222159347|gb|ACAB01000012.1| GENE 41 51008 - 55078 2310 1356 aa, chain - ## HITS:1 COG:BH1581 KEGG:ns NR:ns ## COG: BH1581 COG0642 # Protein_GI_number: 15614144 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 808 1054 345 588 594 116 29.0 3e-25 MGKTYLSRFFFFLVIGWWSLLDVEATAYSLRQFSSKNGLSNSAILSICQDCDGLIWIGSC DGLNVFDGTELRLYKPVDVRNNLSGNLIDNIMEAEDHVLWIQTNYGLDRFDTHLQTVQRF KDFQDISCMAKTRENDVLVIKDDGYIYYFDRKGLKFHRLNAKRLAFEDIQYICVDSANVL WIFSSSGDVRSYVIEKKEDSIPELILQSLFEQEGELFWTSGEGDLVYFIDGAYDLYEYDL RSRNKYFIADLKEEIAQRGQVSSIVKQRDDYYVGFKSSGLICLKYQSDSKVKYVMQSIGI ESGIFCLFKDRFQDIIWVGTDGQGVYMYFIDHFSIRNTLLDTSDYQINTPVRALFLDDEQ TLWVGTKGDGILRMLRYAPDTGGKFDVERLHTGNSSLTDNSVYCFAPSCWNRLWIGTEHG MNYYSYRERKVKEFPVMVDGKTLRYVHSICELNDSTIWVATVGEGIVKIVLNVSDGEPKV MFTRRIVLDGGRRASNYFFTSHQENDSILWFGNRGYGAYRMNVKTEVMMPFRFDQEVSNQ AVNDIFAILSNKKGYWLGTSFGLTHMQSPGNYRVYNEADGFPNNTVHGILEDGEHNLWLS TNQGVVRFNIEANTVQAYNRQNNLEVTEFSDGAFFKDERTGTLFFGGTNGFITINENDLT 
AMEYMPKLQFYRLSIFGKEYNIYHFLRQDKDVEVLELDYSQNFFNLSFVAVDYINGNNYT YSYMIDGLSNNWIENGSSTTAVFSNLPPGQYTLWVKCRSNIMGKESKPYSLVIRIAPPWY MTQLAYWGYFLLFLLLLWGLVYMAIRRYRRKRDIIIEKLNRQKRDEIYESKLRFFTNITH EFCTPLTLISGPCEKILSYAGTDGYIRKYADLVQQNAQKLNSLIQELIEFRRLETGHKVL DIQEVAVDEHTRGIAESFGEWVESRKVDYQLNIEEGDRWNTDVSCLSKIVNNLISNAFKY TSDGGKITVEQCIEGERLCIRVSNSGKGIKKENLDKIFDRYKILDDFETQNKNEAFPRNG LGLAICHSMVNLLAGEIRVMSIPEEVTTFEVVLPMLTVTDAKSGELKELKKQVMPISDKP VEQKKDIVADYDASRQTIMIIDDDPSMLWFVTEIFAGKYNVQPFNSAQEALEQLKLKQPD LILSDVMMPDMDGRAFAKIVKEDKLLSHIPLVLLSALNYIDEQVKGIESGAEAYVTKPFN VEYLEKIVERQIRRKEDMKEYYSSIYSAFKLEDGHLLHKEDKSFFEKMMRVIDEYVENPE LSVELLSASLGCSTRQFYRKLKNVTDKTPADIIKEYRLTVAERLLLTTNLTVEEIMNKAG YTNRGTFYKVFSQKFGMPPRQYREMKKKDLKEKDFH >gi|222159347|gb|ACAB01000012.1| GENE 42 55098 - 55298 121 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|298483819|ref|ZP_07001991.1| ## NR: gi|298483819|ref|ZP_07001991.1| hypothetical protein HMPREF0106_04287 [Bacteroides sp. D22] # 16 66 1 51 51 82 100.0 6e-15 MKIRRCKYGGGAEECMYEFMPFMSFLDKMAGIEEEMAENRAIYYLFIIGVRSFFITFVLM NCILCY >gi|222159347|gb|ACAB01000012.1| GENE 43 55433 - 58570 2772 1045 aa, chain + ## HITS:1 COG:no KEGG:PRU_2712 NR:ns ## KEGG: PRU_2712 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 34 1045 3 1010 1010 1358 67.0 0 MTCRKYKLLFFSLAIASLFILAPDINAGQASILSTSQTKNTLVKGTVKDASGEPLIGVSV SVKGKSGIGTITDINGNFSIQCDANDTLVFSYIGYATLEFPVNGKSSLSISMKEDTKVLD EVIVVGYGTTTRKSAVGAVDQVKADMIENRPVANMTQALQGAAPNVIIQQRNHNPNDNKT NFNIRGISTLNDNSPLFVIDGLVADGESFNKLNPMDIENISILKDAGTAAIYGSRSSNGV VVVTTKKGKKNQRPVVRLSGMIGWENPDILFSPVAGYQNATLRNLAETNAGNAPKYTPDQ IRDLAAHQNEESWFFDQIMRTAMQQNYNLSVSGGSEHSTYMISMGYYDQESNYVGNDSFG VQRYNFRTSLSTELGRFKLTGILAYARNNSVSTTGGSLEVDAARTPTYYYYKMKSADGRY LLNDILSEFNPLGQLEAGGRNKYRNNYINTNVSAEMKIIDGLKLKGVFGADIMNDTRFTR NHAVAYYSSEEATEPRPIKKENNKTSNWNSNAYLINTQLLLDYNKTFGKHTVNGLVGLTN ESYTQSSNEIEKKYVDPDLGIATDETTSEPGNITGKTSVDDSNRTSITSFLGRAGYSYAD RYYAEFSFRYDGASKFHKDYRWGFFPSVSLGWRPTEESFMEFYKEKIGDLKLRASYGILG 
SQAIGTYDRFTVYDVYDNSYAYNNKTVSGAGFKLGLENLTWEKTQTFNIGVDASFLQNSL TVTFDYFHKRTNDILMKPLISSVFGTEMPMANIGKMQNQGWDLSVNYRLKTGAFTHNFNF NLGDSWNKVLEFPGDEQITQVEELSRIIRVGVPLNSYYGYKMAGFFQSYDEIEASAIPVG AKVQPGDIKFVDRNDDGIIDSKDKFILGNAFPRYTFGFTYGLNWKGIDFSMFWQGVGKRD MMLRGELIEPYHANYSYTIYKHQLDFWTPTNTEARWPRLAAPGSDSNRNNYGNGNGSDLF LLDGKYLRLKNLTIGYTLPKEWTKHLGMQKARLYINGQNLLTFSNNSFIDPESSEFDSKM STSGANSGRSYPTLRYFGFGVDIEF >gi|222159347|gb|ACAB01000012.1| GENE 44 58583 - 60202 1291 539 aa, chain + ## HITS:1 COG:no KEGG:PRU_2713 NR:ns ## KEGG: PRU_2713 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 1 539 8 545 545 698 62.0 0 MKRYKLLYLVAGVASVLMTGCNDMENTPSNKFTDNSFWNSTEKAQYVVNMAYSQMYNAGR MWQDESLSDNVFDGRNVTDPRAIRKGMGTPSLELFKNEWKDLYGGIKTCHVFLEKVDLVP GMNVAVKARMIAEIRFIRAFIYFRLTNLYGDVPFFDKDITIDESNSIARTPRATIISFIH QELEDIANALPTRDQLQDAEQGKITKGAVSALQARVYLMDSDWDNVIKYCDNLIKKQNEY GTYSLLPTYRSVFEETNEYNQEIILDRAYVPFLLTWGEMSDMAPLSVGARLCNRAPQQSL VDSYLMVDGKAFNENSPLYNPSTPYANRDLRLTATVVYDGYDWSKNVSDGSTGTIIQINP QSNTVDKYEAGSNKTATGYYTRKYYSPQAKGDMNSGVNLSIIRYADILLMYAEAQFEKGN MDATIWNQTILPLRNRAGFTEEAKAYPTEKTEAEMRQIIRNERRCELALEGLRWYDIKRW KAGKEYLEGYVYGANFNNGNPIRLDKRQFDENRDYLWAVPQSQINLNPNLAPNNPGYSN >gi|222159347|gb|ACAB01000012.1| GENE 45 60227 - 62041 1300 604 aa, chain + ## HITS:1 COG:no KEGG:PRU_2714 NR:ns ## KEGG: PRU_2714 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 1 379 1 374 374 381 53.0 1e-104 MKTKITKVVALFTTLAIIFSCTEDMEYRDTAVSPVNQLYEPISGKSVELVASATASLFFE WEAAKAEDSGSPLYEIVFDKEGGNFSNPLYKVLSDNNGARNYATISHKTLNKIGAAAGLN SGETGTIIWTVIASRGLSTAMSNVSHKLTITRMLGFAEVPSQLFLTGEATEGGSDISQAV PCSSPEQETFEIFTRLKGGKSYKLVDKTNDNANYFYIDGNRIVEGEGETTVEKDGIYRIT MDFSIASVSIQEVKSVGVFFCPSNKVIFELSYKGNGIFEGEGKVTFKQESWGKDQRYKLL MSYSDKDMYWGTLNDVDSSPNGAGQEDSYYFIKEYGINEGNEPQWNHKWKFDNQFDGSMT KVTVKFNGANYTHFVSLGDGVSPEEPETTPKNLFLTGEASEGGNDISKAVQYNKLGEGIF EIFTRLQGGKTYKMASSQEADAKFFYIEDGEIKSGDGITTVEKDGIYRVKVDFSNKSVIT 
KEIKSMGLWHCWDRNVLLDLPYIGNGVFEGTGTISPSNNDNRYKLLMVYADDSELIWGTK NDTDVVPGANPDPSYFYIMETNDNAEAGQWDRKWKFNDGIIGNNTKVRVIFTGENYTHSI EPVN >gi|222159347|gb|ACAB01000012.1| GENE 46 62050 - 63189 901 379 aa, chain + ## HITS:1 COG:lin0763 KEGG:ns NR:ns ## COG: lin0763 COG4833 # Protein_GI_number: 16799837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosyl hydrolase # Organism: Listeria innocua # 67 375 40 341 341 156 35.0 8e-38 MKKLLLCVPFLALMFQSCSNIEDEYMYDKGLYTINWEAAADSSSVTIINRFWNETGNYFN YESDGYDETFHYWPQAHAMDVLIDAYIRTHDAKYKDCFDKWYAGINSKNGGSYWNNFYDD MEWIALTMIRLYEVTDEAKYLDTAKQLWNWIKEGWNEEYCNGGIAWNHGDVWSKNACSNG PAGLIACRLYQIEKVEEYKEWAIKIYQWEREHLFNPATGAVYDNIDGRTDNLNTLTLSYN QGTFLGTAHELFKITGEASYLKDARKAAYFGISDGSMIDAGNNLLRDEGNGDGGLFKGIF IRYFVKLILDDNLEPIYQKKFITFFNNNADILWRKGVNKSDLLYGSSWAKGAEGSTQLTI QTSGCTLIEAKAYYEKYKK >gi|222159347|gb|ACAB01000012.1| GENE 47 63198 - 64376 943 392 aa, chain + ## HITS:1 COG:lin0763 KEGG:ns NR:ns ## COG: lin0763 COG4833 # Protein_GI_number: 16799837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosyl hydrolase # Organism: Listeria innocua # 140 354 90 303 341 75 29.0 2e-13 MRIIETVLCAAFLTGCAPKGNSNPNINLERAQQTLDSLYVHYSVPNTCLLRENFPFDEKH KVTYLASEEQANVPNAYSYLWPYSGTFSAVNALFEATQNKEFQKLLDERVLPGLDEYFDT ARTPNAYSSYVRTAPVSDRFYDDNVWLGIDFTDTYQLTQNKKYLDKATLIWDFIESGTDS VLGGGIYWCEQKKESKNTCSNAPGSVFALKLFKATNDSSYFKKGKALYEWTKEHLQDSTD YLYFDNIRLDGKIGKAKFAYNSGQMMQSAALLYQLTNNPVYLKEAQSIAKECYNYFFYDF TPVSGEPFKMIKKGDIWFTAVMLRGFIELYHLDKNKTYLDAFNKSLDYAWENARDEKGLF HTDLSGNKKDNKKWLLTQAAMIEMYSRLSAFE >gi|222159347|gb|ACAB01000012.1| GENE 48 64413 - 65877 1229 488 aa, chain + ## HITS:1 COG:XF0843 KEGG:ns NR:ns ## COG: XF0843 COG3538 # Protein_GI_number: 15837445 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Xylella fastidiosa 9a5c # 42 476 60 497 516 480 50.0 1e-135 MIKRNARLCTLATVLFIGSNMNASTFICPKDNTMIAKQAEITGKYKSQRPAKKDRLFTSQ AVEAEILRVKNLLTNSKLAWMFENCFPNTLETTVHYRTTDGKPDTFVYTGDIHAMWLRDS 
GAQVWPYIQLAHKDPELKKMLEGVIRRQFKCINIDPYANAFNDGAKGGDWMTDLTDMKPE LHERKWEIDSLCYPLRLAYQYWKETGDASIFDNEWIQAVINILRTFKEQQRKEGVGPYKF QRKTERALDTLNNDGLGAPVNPVGLIVSAFRPSDDATTLQFLVPSNFFAVTSLRKAAEIL ETVNQKTELATQCSELAQEVETALKQYATYNHPKYGTIYAFEVDGFGNHFLMDDANVPSL LAMPYLGDVDVNDPIYQNTRRFVWSKDNPYFFKGKAGEGIGGPHIGYDMIWPMSIMMKAF TSQDDQEIKECIQMLINTDAGTGFMHESFHKDNPEKFTRAWFAWQNTLFGELILKLVNEG KIDLLNSI Prediction of potential genes in microbial genomes Time: Wed May 18 01:06:17 2011 Seq name: gi|222159346|gb|ACAB01000013.1| Bacteroides sp. D1 cont1.13, whole genome shotgun sequence Length of sequence - 35009 bp Number of predicted genes - 26, with homology - 26 Number of transcription units - 9, operones - 5 average op.length - 4.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 154 - 2295 1848 ## BT_3289 hypothetical protein 2 1 Op 2 . - CDS 2315 - 3670 1446 ## COG1808 Predicted membrane protein 3 1 Op 3 . - CDS 3673 - 5169 1372 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 4 1 Op 4 . - CDS 5200 - 5976 658 ## COG4099 Predicted peptidase 5 1 Op 5 . - CDS 6039 - 7070 1135 ## COG2255 Holliday junction resolvasome, helicase subunit - Prom 7113 - 7172 5.3 + Prom 7005 - 7064 5.0 6 2 Op 1 . + CDS 7198 - 8430 1030 ## COG2715 Uncharacterized membrane protein, required for spore maturation in B.subtilis. 7 2 Op 2 . + CDS 8447 - 8866 428 ## COG0319 Predicted metal-dependent hydrolase + Term 8947 - 8991 6.8 - Term 8932 - 8982 12.2 8 3 Op 1 . - CDS 9035 - 9451 439 ## gi|237716841|ref|ZP_04547322.1| conserved hypothetical protein 9 3 Op 2 . - CDS 9458 - 12181 1793 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases 10 3 Op 3 . - CDS 12165 - 14519 1807 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases 11 3 Op 4 . - CDS 14537 - 16897 1794 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases 12 3 Op 5 . - CDS 16943 - 18655 1328 ## BT_3274 hypothetical protein 13 3 Op 6 . 
- CDS 18673 - 19437 620 ## gi|294644438|ref|ZP_06722201.1| hypothetical protein CW1_0760 14 3 Op 7 . - CDS 19475 - 20989 915 ## BT_3272 putative outer membrane protein 15 3 Op 8 . - CDS 21000 - 24341 2523 ## BT_3271 hypothetical protein 16 3 Op 9 . - CDS 24381 - 25580 844 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 25679 - 25738 6.1 17 4 Tu 1 . - CDS 25827 - 26396 364 ## BT_3269 RNA polymerase ECF-type sigma factor - Prom 26523 - 26582 6.4 + Prom 27675 - 27734 2.5 18 5 Op 1 . + CDS 27757 - 29643 1545 ## COG0445 NAD/FAD-utilizing enzyme apparently involved in cell division 19 5 Op 2 . + CDS 29694 - 30224 498 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins 20 5 Op 3 . + CDS 30294 - 32138 1164 ## COG0322 Nuclease subunit of the excinuclease complex 21 6 Tu 1 . - CDS 32243 - 32413 102 ## gi|237716856|ref|ZP_04547337.1| predicted protein - Prom 32521 - 32580 2.1 + Prom 32252 - 32311 6.0 22 7 Op 1 . + CDS 32384 - 32836 442 ## COG1490 D-Tyr-tRNAtyr deacylase 23 7 Op 2 . + CDS 32876 - 33214 509 ## COG1694 Predicted pyrophosphatase 24 7 Op 3 . + CDS 33201 - 34103 1011 ## COG0274 Deoxyribose-phosphate aldolase + Term 34138 - 34187 2.4 - Term 34121 - 34173 4.0 25 8 Tu 1 . - CDS 34288 - 34479 65 ## BF0107 hypothetical protein 26 9 Tu 1 . 
+ CDS 34502 - 35009 282 ## BT_3262 hypothetical protein Predicted protein(s) >gi|222159346|gb|ACAB01000013.1| GENE 1 154 - 2295 1848 713 aa, chain - ## HITS:1 COG:no KEGG:BT_3289 NR:ns ## KEGG: BT_3289 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 713 1 713 713 1361 91.0 0 MKFRLTVLIVFSLCLSNVFADEGMWLLGNLRKNKQTDRVMKELGLQMPVNKIYNPKKPSL SDAVVSFGGFCSGVVVSEDGLVFTNHHCGFSSIQQHSSVEHDYLKDGFFARSLDEELPNP ELYVRFLLRTEDVTKRVLSAARYAKTETERRVAVDSIMNVIGLEVSEKDSTLTGIVDAYY AGNEFWLSVYRDYNDVRLVFAPPSSVGKFGWDTDNWMWPRHTGDFSVFRIYANTKNGPAD YSPDNVPYHPEYVAPISLDGYKEGSFCMTLGYPGTTERYLSSYGIEEMMNGINQAMIDVR GVKQTVWKREMDRRPDIRIKYASKYDESSNYWKNSIGTNKAIKHLKVLEKKRAAEVALRD WIQSHPEEREKLIRLFSSLELSYSNRRETNRALAYFGESFINGPELVQLALEILNFDFEA EEKLVITRMKKLLEKYDNLDLSIDKEVFAAMLKEYQLKVDKKYLPAMYEKIDTLYNGNIQ AYVDSLYATSNITSPKGLKRFLERDTTYNLIEDPAVSLSLDLIVKYYEMNQSISEVSEQI EQGERLFNAAMRRMYADRNFYPDANSTMRLSFGTVGGYTPFDGATYDYYTTVKGIFEKVK EHAGDIDFAVQPELLSLLSSGDFGRYANAQGDMNVCFISNNDITGGNSGSAMFNAKGELL GLAFDGNWEAMSSDIVFEPDLQRCIGVDVRYMLFIMEKYGKAGNLVQELKIAR >gi|222159346|gb|ACAB01000013.1| GENE 2 2315 - 3670 1446 451 aa, chain - ## HITS:1 COG:SP1264 KEGG:ns NR:ns ## COG: SP1264 COG1808 # Protein_GI_number: 15901124 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pneumoniae TIGR4 # 33 332 14 311 347 233 41.0 6e-61 MKTDERNKFAIKSFLGEYLDLKKDKDNELATVDSIRKGVEFKGANLWILIFAIFMASLGL NVNSTAVIIGAMLISPLMGPIMGVGLSVGLNDFELMKRSLKSFLITTAFSVTTATIFFLL APIAGSQSELLARTSPTIYDVFIALFGGLAGVVALSTKEKGNVIPGVAIATALMPPLCTA GYGLASGNLIYFLGAFYLYFINSVFISLATFLGVRVMHFQRKEFVDKTREKTVRKYIVLI VVLTMCPAVYLTFGIIKSTFYEAAANRFISDQLSFENTQVLDKKISYEHKEVRVVLIGPE VPDASISIARSKMKEYKLEDTKLVVLQGMNNEAVDVTSIRAMVMEDFYKNSEQRLQQQAV KISQLETTLEQYRTYDAMSRTLVPELKVLYPSITTLSIAHSLEVRVDSMKTDTVTLAVLK FDRHPSAAEKQKISEWLKARVGAKKLRLITE >gi|222159346|gb|ACAB01000013.1| GENE 3 3673 - 5169 1372 498 aa, chain - ## HITS:1 COG:TM0620 KEGG:ns NR:ns ## COG: TM0620 COG2244 # Protein_GI_number: 15643386 # 
Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Thermotoga maritima # 5 460 6 444 479 74 20.0 4e-13 MAGLKSLAKDTAIYGLSSIVGRFLNYMLVPLYTAVLPASTGGYGVVSNVYAFTALMLVLL TFGMETGFFRFANKSGEDPMKVYANSLLSVGGVSLIFVLLCLLFLQPISNLLDYGDHPEF IAMMAVVVALDSFQCIPFAYLRYKKRPVKFAAIKLLSIIGGIGLNLFFLLVCPWLNVHCP ATISWFYDPDYLVGYIFISNLIISVVQMFFFIPELTGFAYKLDRVLLKRMVVYSFPVLIL GLVGILNQTVDKMIYPFLFEDRQEGLVQLGIYAATSKIAMVMAMFTQAFRYAYEPFVFGK DREGDNRKMYAAAMKYFLIFSLLAFLAVMFYLDLLRYLVAKGYWEGLGVVAIVMLAEICK GIYFNLSFWYKLTDKTYWGAYFSVIGCVIIVVLNILFVPVYGYLASAWASVAGYAVILLL SYWIGQKEYPIRYDLKSLGLYVLLAAVLYVIGEQVPIPNLVLRLAFRTVLLLLFIAYIIK KDLPLSQIPVINRFIKKK >gi|222159346|gb|ACAB01000013.1| GENE 4 5200 - 5976 658 258 aa, chain - ## HITS:1 COG:TM0033 KEGG:ns NR:ns ## COG: TM0033 COG4099 # Protein_GI_number: 15642808 # Func_class: R General function prediction only # Function: Predicted peptidase # Organism: Thermotoga maritima # 34 256 170 395 395 177 40.0 2e-44 MMKQWITLFVFLFLSLSLSAQQGYGRDIFVSSKGDSLPYRMIHPESVKPGEKYPLVLFLH GAGERGNDNEKQLTHGGQMFLNPVNQEKYPAFVLIPQCPTDGYWAYTERPKSFIPAEMPV GQEISPILQTLKQLLDSYLVMPEVDTQRVYIIGLSMGAMGTYDLVIRYPEIFAAAIPICG IVNPSRLSAAKDVKFRIFHGDEDDVVPVKGSREAYKALKAAGADVEYIEFPGCNHGSWNP AFNYPGFMDWLFKQKKKR >gi|222159346|gb|ACAB01000013.1| GENE 5 6039 - 7070 1135 343 aa, chain - ## HITS:1 COG:NMB1243 KEGG:ns NR:ns ## COG: NMB1243 COG2255 # Protein_GI_number: 15677115 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, helicase subunit # Organism: Neisseria meningitidis MC58 # 14 330 22 338 343 387 61.0 1e-107 MEQEDFNIREHQLTSKERDFENALRPLSFEDFSGQDKVVENLRIFVKAARLRGEALDHVL LHGPPGLGKTTLSNIIANELGVGFKVTSGPVLDKPGDLAGVLTSLEPNDVLFIDEIHRLS PVVEEYLYSAMEDYRIDIMIDKGPSARSIQIDLNPFTLVGATTRSGLLTAPLRARFGINL HLEYYDDDILSNIIRRSASILDVPCSVRAASEIASRSRGTPRIANALLRRVRDFAQVKGT GSIDTEIAQFALEALNIDKYGLDEIDNKILCTIIDKFKGGPVGITTIATALGEDAGTIEE VYEPFLIKEGFMKRTPRGREVTELAYKHLGRSLYNSQKTLFND 
>gi|222159346|gb|ACAB01000013.1| GENE 6 7198 - 8430 1030 410 aa, chain + ## HITS:1 COG:RSc0452_1 KEGG:ns NR:ns ## COG: RSc0452_1 COG2715 # Protein_GI_number: 17545171 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, required for spore maturation in B.subtilis. # Organism: Ralstonia solanacearum # 1 210 1 215 251 198 46.0 2e-50 MVLNYIWIGFFVVAFIIALVKVIFLGDTEIFTAIMNATFDSSKTAFEISLGLTGVLALWL GIMKIGENSGLINALARFLSPVLCRLFPDIPKGHPVLGSIFMNMSANMLGLDNAATPLGL KAMKELQELNPKKDTASNPMIMFLVINTSGLIIIPISIMVYRAQMGGAQPTDVFIPILLS TFISTLVGVIAVSIAQKINLINKPILILMGIICLFFSGLIYLFLSVSREDMGTYSTLIAN ILLFSVIILFILTGVRKKINVYDSFVEGAKEGFTTAVRIIPYLVAFLVGIAVFRTSGAMD FLVGGIGYIVGLCGVDTSFVGALPTALMKSLSGSGANGLMIDTMKELGPDSFVGRMSCVI RGASDTTFYILAVYFGSVGITKTRNAVTCGLIADFSGIIAAILISYLFFF >gi|222159346|gb|ACAB01000013.1| GENE 7 8447 - 8866 428 139 aa, chain + ## HITS:1 COG:TP0650 KEGG:ns NR:ns ## COG: TP0650 COG0319 # Protein_GI_number: 15639637 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Treponema pallidum # 37 122 40 135 160 67 37.0 5e-12 MAVTYQTEGVKMPDIKKRETTEWIKAVAASYGKRLGEIAYIFCSDEKILEVNRQYLQHDY YTDIITFDYCEGDRLSGDLFISLDTIHTNAEQFGTSYEDELHRVIIHGILHLCGINDKGP GEREIMEGAENKALSIRKV >gi|222159346|gb|ACAB01000013.1| GENE 8 9035 - 9451 439 138 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716841|ref|ZP_04547322.1| ## NR: gi|237716841|ref|ZP_04547322.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 138 1 138 138 270 100.0 2e-71 MKKLLMTVMVLSCFAFSSYGQKVEKQIIGKWCNPYTYESTGELKGFEFKKGGKCSAINVP SLDLRTWKIDNGYLIIEGFSKEDNGKVEVYKTRERIGYVTSDSLELVVQEAQPRLAFLYL NMKSIKKLVTPEVTPDKK >gi|222159346|gb|ACAB01000013.1| GENE 9 9458 - 12181 1793 907 aa, chain - ## HITS:1 COG:FN1128 KEGG:ns NR:ns ## COG: FN1128 COG1506 # Protein_GI_number: 19704463 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Fusobacterium nucleatum # 636 879 419 660 660 77 26.0 9e-14 MKGEDKMNRTIIRIGIAVGLLIGFGQNIKAQTPKKPIDIDACMSWMRVESPDISPTGRWV TYRIVPMEYNPEYKDGKILHLFDSRTRKEILLEDVENIEFYNSDQAAYYQVSDSTGNAKT ILLELPSGTKTEWTYKEAFRPAKGIPYSISVTNVPKDTINHVPSFDRLIVRHLKNGTSFQ IDSIGYYTLYNERQSIIFVRKQAKGNALCYGPLTGPYQTIYQSSVKKEPVSFNLNAKQMI GGFSIKDSLWYNFSLKKNTCDLVFDRKEIVLPARMELVRANLSSSQKFLTMELRPYQEKI NKDKKEEEIKPDKSFELELWTWDEYEVPTLQTRSRYFRPQYSKYIYDIASRKLTEVAPGH ADLLEPDRAEEIHNVLYTDETPYCAQKDWLNEMPFDIYSVNVHTGEKQLVGRSYRTRPRW SMNGKWAVMYDPIAQVWNKFDGKTGEVTDISTAIGYPIFEETYDKPNPAPACGIAGWTAD GNNVLIYDAYDWWKIDLTGERQPECITKGYGRKNQRSIRKMTSNIDKEVFNPDETVIVSV WDENTMDEGIYSLDMKGRLKKLAEGPYIYSIHRFSDNQKYCIWNRQNISEFRDLWWSKSD FSDPIRITNANPQQSEYKWGTAKLVEWTNYENKPNKGILYLPEDYDAKKEYPVLVQFYET HSGGLNTYHAPMLSSAMADVMYFVSNGYIVFMPDVHFTIGTPGQSSYDAVVSGTKYLIEQ GIAHPGKIGLQGHSWSGFQTSYLVTKTDIFACANIGAPITDMVTGYLGIRGGSGLPRYFM YEEWQSRMGKSLWEAKDKYLASSAIVEADKIHTPLLIWHNDKDEAVAYEQGRALYLAMRR LQRPAWHLNYKGEGHFLGNQAAQKDWTIRMKQFFDYYLKGTKEPRWMKEGIHLRERGIDQ KYDLLEK >gi|222159346|gb|ACAB01000013.1| GENE 10 12165 - 14519 1807 784 aa, chain - ## HITS:1 COG:CC2154 KEGG:ns NR:ns ## COG: CC2154 COG1506 # Protein_GI_number: 16126393 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Caulobacter vibrioides # 101 739 102 711 738 123 24.0 2e-27 MKNSLFRYLCLAAVALICSMASAQQKANYQLAEKFRLLTQNPIMKYSTEVNPTFINDTDC FYYSFTTREGEKYYYVNPKKKEKRLLFDTPELLSKIAVYTKKAYSAAAPHLSFTFMKDNE TIRLDFDRGLYTYNIRTKVLKKLDEKPIYKDGDPYWKKYSPDSLYMLYASKDNLYFVGNP 
KKGQDTIPVQLTTDGGPDYTFNREDEGKMEGRFGAESAHWLPGTHRFYAVREDNRKVRDL WLINSLSKFPELTTYKAELAGDKDVTQYELLLGDVDKREVKKVDINRWPDQYIDVLYTTN DGKRLYFQRYNRSWNQSDICEVDVETGQVRVVIHEENKPYLDYQMRSVSFLKDGKEILFR SERNGWGHYYLYDTATGNLKNQVTDGTWVAGPIAKIDTVGRKFYFYGYGREKSIDPYYYI LYEAKLDQLNAIRLLTPENASHEARISPSNHYIVDSYSTVSQEPVNVVRNSRGKVVMKLE KPDLQPVYEMGWKAPERFKVKAADGVTDLYGVMWKPADFDSTKVYPIISNVYPGPFFEYV PTRFTVNDVYNTRLAQLGFIVITVGHRGGTPMRGKAYHTYGYNNMRDYPLADDKYAIEQL AARYPFIDGTKVGIYGHSGGGFMSAAAICTYPDFYSAAVSSAGNHDNRIYNKGFVEIHFG VDEKVKTGKDSLGVEHTTYDYSVRVRPNQELAKNYKHGLLLFTGAVDNTVNPANTLRLVH ALIKADKDFDMFVLPKCTHGFFGESEVFFEHKMWRHFARLLLNDHSADADIDLNKYMIED ERRR >gi|222159346|gb|ACAB01000013.1| GENE 11 14537 - 16897 1794 786 aa, chain - ## HITS:1 COG:CC2154 KEGG:ns NR:ns ## COG: CC2154 COG1506 # Protein_GI_number: 16126393 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Caulobacter vibrioides # 171 772 157 737 738 122 25.0 4e-27 MVLKRVFIVGSLSFIFALQGWGQASGKHDTILANYELVQEFQEFTLGGKLSNNSLSIYPR EINDTDNFWFDFQTTAGKDYYYVMPAAGKREPLFDKSKMAMQLSEFTKGVVDKNKLDISS ITFSKDQRSFEFDYKGKQYSYNRLTGKLTIVEKKEEKKESSDPSYTWMNFSPDKKYILYA KNHNLYVKGNKSLGVDTTEVQLTTDGMKDFSYAREDDYEPAEDGEVPTLARWCPDNRHVY AVRDDNRLLRDFWVINSIEDSPSLVKYKYEFPGDKYVTQNDLTIIDIVEKTAKKAKVDKW KDQYVMPLHVTSDSKYLFFERTKRTWDEVDVCSVNLSTLEVKEIIHEEDKPYRDPHARNV AILNDGKDILFRSERTGWGHYYHYDGDGNLKNVMTSGEWVTGYINSIDTLKRKVYLYGYG KDKKVNPFYYMLFEVDIDKEGVTPLSTEDGQHNVNFLKSHNYYIDTYSRVDMEPKIMLKD RKGKVIMELAKPDLDLVYASGWKKPERFVVKAADNITDLYGVMWKPSDFNPEKKYPIISV VYPGPYFGFVPTNFTLDDSYCTRMAQLGFIVITVGHRGDTPMRGKAYHRYGYGNMRDYPL ADDKYAIEQLAQRHSFINGKKVGIYGHSGGGFMAAAAIFTYPDFYTAAVSCSGNHDNSIY NRGWGECYNGVKEVEKVVKDSLGNETKEYEYKFSVKSNAEIAKNLKGHLMLVTGDMDKNV NPAHTYRVAQALIEAGKDFDMLVIPGAGHGYGSADKYFEKKMYRFFAKHLLGDNRADYWG DINCTK >gi|222159346|gb|ACAB01000013.1| GENE 12 16943 - 18655 1328 570 aa, chain - ## HITS:1 COG:no KEGG:BT_3274 NR:ns ## KEGG: BT_3274 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 
12 189 1 160 534 71 27.0 9e-11 MKKLGIYAVMAVFVGVMSGCLDDDNNYNYKQINDLQGGNFNIENINSGYNLIEGDELVLA PTFKFTIDSITPDVSYEWYIDKQLQTGESGATYTFKADKSGTYQVTFAVTDNKSGVQFGK STIIKVMSMFQRGWTILSDEGGRSVLHFIVPTTQHYQVTYNGETFTRDSLVYHIVKRDVV SNLGSNPKGLMNNIGYIDYNLQYGISVYDELVVKQDRWVELNGNTLEREVYTDEEFRGDI PAHFSPIEAAMTYTAKALLDKNGLIYWEKKADAADFHAGTYMSIGLNNETRFSRLFQAYK FNYYYTNVMLALTKEDNSLVGILDVGNVAGSESSAIGEMTSSESGNMYNIADPSGEDHFS NIKKTVVDALPAPYDGGNDFTMAYPFWTVLLKDEATSVYELRYFGLEADSRSVSCMDGWY YEAPLGVINDYRGMANFGNKRYVVIASGNQLYYYQYGWDSYGDVEYRGSLMPLGEPLPAA VKTLSGMDVTTNLRKYKYPYSGQLGVALEDGSFYIYSVVETRLKDGTCTAVSLKQQFPNE TTSEENKNFGEIVDVLYKWGSGDDYMSFSF >gi|222159346|gb|ACAB01000013.1| GENE 13 18673 - 19437 620 254 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294644438|ref|ZP_06722201.1| ## NR: gi|294644438|ref|ZP_06722201.1| hypothetical protein CW1_0760 [Bacteroides ovatus SD CC 2a] # 1 254 1 254 254 507 100.0 1e-142 MKMKIKYAIWGALICIADLMLISCEEKGLLVNSNDTAYLRFANDMTKDTTTVSFKMYNEG EDAEIPIEVSVYGQIQEDDLHFNVSVDAEYTTLPADLYVLPTDCRIRKGLLTDTIYVILK NAPMLEKETKILALQVVEANGVKQGDHSYSRAFITVTDRLFKPHWWSVIDAGSESNPGNS VDWFYLGEYSERKYQMFLDELKKDDMVFDGKNKQVLRKYSLRLKNTLKQMNEGKKKEDWE KDEHGNVITVEVAG >gi|222159346|gb|ACAB01000013.1| GENE 14 19475 - 20989 915 504 aa, chain - ## HITS:1 COG:no KEGG:BT_3272 NR:ns ## KEGG: BT_3272 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 501 4 486 488 245 34.0 2e-63 MIRKIYTTFVSFAFVVLTLSSCSDWLDLYPSDEIKEEYLFSSGDGFRTATNGIYRKMATF NLYGSNLTWGIMDAWAQSYYIDQAPSIGGGTAMRNMAALQFKNTELVPVTDAMWNAAWNV VANCNELAQQAAKADPNLFSELDGERQMILGEAIGLRAFIQFDLLRIYAPSPSSVGFGKD DRTFIPYVNVYPSYVNNHQTVSYCLEQIIADLKEAQRILKEVDATSSMGANARFNIETVS ESLFAKSRGYRLNYYAVTAELARVFLYAGKNAEAYAEAKKIIDEENKTGYFEASTNSSGF RNGNMKMYNDIIFGLYSSKELVEWDQEINHGSDGASEENYLCLDGDVAKELYGDEINSDW RFKYQLEEKYYGFYYRTLKYNKQTESTNEGKTNNRMLPMIRMSEVYYIAAEAIFDTKPKE ARGYLELVKKGRGISKKFDNVTNKSEFIDLLVNDARREFLGEGQIFYMYKRLNRLIPASS YYSSDILPTDENVVLPKPDSESNI >gi|222159346|gb|ACAB01000013.1| GENE 15 21000 - 24341 2523 1113 aa, 
chain - ## HITS:1 COG:no KEGG:BT_3271 NR:ns ## KEGG: BT_3271 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1113 1 1116 1116 1073 49.0 0 MREKRWLLCFLVAVCCTFSTWALPSQDKTVTLNLHNVSIETVLDAVKKQTGVNMLYNSQM FKGVPPVSINAKNEKWEVALKLILNPQGFDYVVKDGIVVVRKLQAEKRDNRIRGTVIDSN KEPIPGASIIVKGTRTGTSTNIEGEFTLDVKDDKVTLEISFIGMKKQTLQVDATRKKMLE ITMLDDVKTLEDVVVTGYSNVRKSSFTGSSTQITGDELRKVSQTNVLDAMQSFDPSFRLM SNTQFGSDPNALPEMYIRGRSGVGTRDLDKNQLSKSNLENNPNLPTFIMDGFEVSIEKVY DLDPSRIESMTILKDAAATAIYGSRAANGVVVITTVTPKPGEVRVSYNFTGTVEMPDLSD YNLANASEKLEMERLAGLFDNSNMENQILDYNKRFAQIQRGVDTDWMVLPLRNAFDHKHS LFIEGGTQNLRWGVDASYNAANGVMKGSGRDRYSVGFSLDYRMKSLQVKNTVSFTHSKSK ESPYGSFSDYTILQPYDTPYNDDGTLRKRLSFSKNSNLREANNPLYEATLGNYTWSAYDE VSNNLSLNWYLTDYWTVRGQFSVNRKYSSGERFIDPLSSKTTAAPNEGGHNLGDLYVDDG NSLNWNANAALYYTRSFNKHNLNLSVAWEASSGSSDATNVHYRGFPSGQFHSSNYAAEIY EKPSRTEGTSRMVSAWATGNYTWNDIYLADFSVRFDGSSDFGSKQRWAPFFSGGLGVNIH NYEFLKGNEMVNKLKVRASYGRTGKASFPAYAATTMYEALFDEWYATGFGAVLKALGNNN LTWEKTDKFNVGIEMQFLNRRLTVDADYYHDKTIDLVNDVSLSQTSGFSSYKDNMGEVLN QGFELQVRGEVFRNRDWLVALWGNMAHNKNKILKISDSQRAYNQRVKEFYENIVNEEYGR YNSKYGVPISQYEEGQSLTSIWAVKSLGIDPTTGKEIFLNRDGSVSDTWNATQEVAVGNT EPKFNGSIGFNAAYKQWSLFAAFQYEWGGQEYNQTLVDRVENARIQYQNVDLRVLTDRWK QPGDIAQFKDIKNADSATMPTSRFVQDKAYLRLSALTLSYDFNREWIKKYLGMNMLRLEA STRDFINWNSIRQERGLSYPKSWTVDFSIKAQF >gi|222159346|gb|ACAB01000013.1| GENE 16 24381 - 25580 844 399 aa, chain - ## HITS:1 COG:AGl2871 KEGG:ns NR:ns ## COG: AGl2871 COG3712 # Protein_GI_number: 15891547 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 198 343 135 278 331 82 39.0 1e-15 MKKFENVYQDAALMKKALLGEANESEQQELEKRLAECPDLQKVYEQLQNGETLRVAFEEY KNYSSKKAYESFLQKIGQTEPEVIKKSRAFRIWWSVAAAVVLVIGLSFYMSNYGSIEEES RPLIQPGVQQAQLTLPDGSIIDVHKKEVNVIVDGVQVKYKEGVLSYKPTATTQYTEKSVV EKPVISNELVIPRGGENTVVLADGTTVHLNAGSKLTYPVRFVGKRRIVALEGEAYFEVVQ 
DESHPFVVQTHLGEVMVLGTAFNVNAYTDASVCYTTLVHGKVQFSAPNVGTVTLQPGEQA VVSANGTEKRTVDLDEYIGWVNGVYNFKNRSLGEIMETFERWYDIQIYYETPDLRDITYS GSLKRYGTINSFLDALELTGDLTYKISGRKVLIYDGMKE >gi|222159346|gb|ACAB01000013.1| GENE 17 25827 - 26396 364 189 aa, chain - ## HITS:1 COG:no KEGG:BT_3269 NR:ns ## KEGG: BT_3269 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 189 1 189 190 277 74.0 1e-73 MNDKIDTIVAGVNRKDEKMWGDFYDRFYTALCVYVSKILPVPDAVEDLVQDVFISVWEGK RTFSDIKELTNYLYRACYNNTLLYIRNNQIHDTILNSLAEEESVEDEDMIYALTVKEEII RQLYCYIEELPSEQRRIILMRIEGHTWEEIAERLEISINTVKTQKTRSYKFLRERLGDSV HSIILCLFL >gi|222159346|gb|ACAB01000013.1| GENE 18 27757 - 29643 1545 628 aa, chain + ## HITS:1 COG:RSc3328 KEGG:ns NR:ns ## COG: RSc3328 COG0445 # Protein_GI_number: 17548045 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: NAD/FAD-utilizing enzyme apparently involved in cell division # Organism: Ralstonia solanacearum # 4 624 6 627 647 592 48.0 1e-169 MDFKYDVIVIGAGHAGCEAAAAAANLGSKTCLITMDMNKIGQMSCNPAVGGIAKGQIVRE IDALGGQMGLVTDETAIQFRILNRSKGPAMWSPRAQCDRAKFIWSWREKLENTPNLHIWQ DTVCELLVENGEATGLVTAWGVTFKAKCIVLTAGTFLNGLMHVGRHQLPGGRMAEPASYQ LTESIARHGITYGRMKTGTPVRIDARSVHFDQMDTQAGECDFHKFSFMNTSTHHLKQLQC WTCYTNEEVHRILREGLPDSPLFNGQIQSIGPRYCPSIETKIVTFPDKDQHQLFLEPEGE TTQELYLNGFSSSLPMEIQIAALKQIPAFKDLVIYRPGYAIEYDYFDPTQLKHTLESKII KNLFFAGQVNGTTGYEEAGGQGLIAGINAHINCHGGEAFTLARDEAYIGVLIDDLVTKGV DEPYRMFTSRAEYRILLRMDDADMRLTEKAFKLGLAKEDRYQLVKSKKEAVEQIVSFARN YSMKPALINDALEKIGTTPLRQGCKLIEILNRPQVTIENITEYVPAFQRELEKATSSDQD RKEEILEAAEILIKYQGYIDRERMIAEKLARLESIKIKGKFDYSTIQSLSTEARQKLVKI DPETIAQASRIPGVSPSDINVLLVLSGR >gi|222159346|gb|ACAB01000013.1| GENE 19 29694 - 30224 498 176 aa, chain + ## HITS:1 COG:YPO3123 KEGG:ns NR:ns ## COG: YPO3123 COG0503 # Protein_GI_number: 16123288 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Yersinia pestis # 11 162 
18 169 187 159 50.0 2e-39 MIMSKETLIKSIREVPDFPIPGILFYDVTTLFKDPWCLQELSNIMFDMYKDKGITKVVGI ESRGFIMGPILATRLNAGFIPIRKPGKLPAETIEESYDKEYGKDTVQIHKDALDENDVVL LHDDLLATGGTMKAACELVKRLKPKKVYVNFIIELKELNGKSVFGDDVEVESVLTL >gi|222159346|gb|ACAB01000013.1| GENE 20 30294 - 32138 1164 614 aa, chain + ## HITS:1 COG:lin1197 KEGG:ns NR:ns ## COG: lin1197 COG0322 # Protein_GI_number: 16800266 # Func_class: L Replication, recombination and repair # Function: Nuclease subunit of the excinuclease complex # Organism: Listeria innocua # 9 590 2 573 603 406 41.0 1e-113 MNVEPESKTNEYLKGIVANLPEKPGIYQYLNTEGTIIYVGKAKNLKKRVYSYFSKEHEPG KTRVLVSKIADIRYIVVNTEEDALLLENNLIKKYKPRYNVLLKDDKTYPSICVQNEYFPR IFKTRKVIRNGSSYYGPYSHIPSMYAVLDLIKHLYPLRTCNLNLSPENIRAGKFNVCLEY HIKNCAGPCIGLQSQEEYLKNIDEIKEILKGNTQDISRMLLEKMQTLAAEMKFEEAQKVK EKYLLIESYRSKSEVVSAVLHNIDVFSIEEDESNSAFINYLHITNGAINQAFTFEYKKKL NESKEELLTLGIIEMRERYKSQSREIIVPFELDLELNNVVFTVPQRGDKKKLLDLSVLNV KQYKADRLKQAEKLNPEQRSMRLMKEIQQELHLERPPLQIECFDNSNIQGSDAVAACVVF KKAKPSKKDYRKYNIKTVVGPDDYASMKEVVRRRYQRAIEENSPLPDLIITDGGKGQMEV VREVIEDELNLAIPIAGLAKDNRHRTSELLLGFPPQTIGIKQQSPLFRLLTQIQDEVHRF AISFHRDKRSKRQVASALDTIKGIGEKTKTALLKEFKSVKRIKETSFEEIAAIIGEAKAK TVKEGLSNESSKNE >gi|222159346|gb|ACAB01000013.1| GENE 21 32243 - 32413 102 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716856|ref|ZP_04547337.1| ## NR: gi|237716856|ref|ZP_04547337.1| predicted protein [Bacteroides sp. 
D1] # 1 56 1 56 56 83 100.0 5e-15 MAYPLYNYSHCYLFYFVVRKGKFTKFERENGVLAGKYYKVLPKALMVYYLGMDYTD >gi|222159346|gb|ACAB01000013.1| GENE 22 32384 - 32836 442 150 aa, chain + ## HITS:1 COG:SP1644 KEGG:ns NR:ns ## COG: SP1644 COG1490 # Protein_GI_number: 15901480 # Func_class: J Translation, ribosomal structure and biogenesis # Function: D-Tyr-tRNAtyr deacylase # Organism: Streptococcus pneumoniae TIGR4 # 1 149 1 147 147 153 51.0 8e-38 MRIVIQRVSHASVTIEGHCKSSIGKGMLILVGIEESDGQEDIDWLCKKIVNLRIFDDENG VMNKSILEDGGEILVISQFTLHASTKKGNRPSYIKAAKPDISIPLYEQFCKDLSGALGKE IGTGIFGADMKVELLNDGPVTICMDTKNKE >gi|222159346|gb|ACAB01000013.1| GENE 23 32876 - 33214 509 112 aa, chain + ## HITS:1 COG:SA1292 KEGG:ns NR:ns ## COG: SA1292 COG1694 # Protein_GI_number: 15927040 # Func_class: R General function prediction only # Function: Predicted pyrophosphatase # Organism: Staphylococcus aureus N315 # 2 99 3 101 105 80 44.0 7e-16 MTLEEAQKQVDQWVKTYGVRYFSELTNMVVLTEEVGELARVMARKYGDQSFKEGEKDNID EEIADVLWVLLCIANQTGVDITDAFRKSMEKKTKRDNKRHINNPKLKDHGRE >gi|222159346|gb|ACAB01000013.1| GENE 24 33201 - 34103 1011 300 aa, chain + ## HITS:1 COG:SMb21300 KEGG:ns NR:ns ## COG: SMb21300 COG0274 # Protein_GI_number: 16264552 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Sinorhizobium meliloti # 54 293 71 314 334 186 40.0 3e-47 MEENNSQPNKYKAALAKYNTNLSDADIQARVAELIEKKVPENNTEDVKKFLFNCIDLTTL NSTDSDESVMHFTEKVNQFDDEYPDLKNVAAICVYPNFAAIVKNTLEVDGVNIACVSGGF PSSQTFIEVKVAETALAIADGADEIDIVISIGKFLSGDYEGMCEEIQELKEVCKEHHLKV ILETGALKSASNIKKASILSMYSGADFIKTSTGKQQPAATPEAAYVMCEAIKEYYQKTGN KIGFKPAGGINTVNDAIIYYTIVKELLGEEWLDNQLFRLGTSRLANLLLSDIKGEEIKFF >gi|222159346|gb|ACAB01000013.1| GENE 25 34288 - 34479 65 63 aa, chain - ## HITS:1 COG:no KEGG:BF0107 NR:ns ## KEGG: BF0107 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 11 54 5 48 57 65 79.0 4e-10 MKSVRTVAMYFPEASASRYEYTSNSPTLHSIYNKIIYMKVRETRLLITIHNFEFCRIPGT GNK 
>gi|222159346|gb|ACAB01000013.1| GENE 26 34502 - 35009 282 169 aa, chain + ## HITS:1 COG:no KEGG:BT_3262 NR:ns ## KEGG: BT_3262 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 169 1 169 226 263 69.0 1e-69 MDAYTRTLRFNHNPLNLILGTEKKKGLRIGYMEAGLQGFYLNSMETGVHPQKLSRLLAEE FHCTDTESVTGLFQFLINEGDRVSYQIMLPYLLSTENINEFESIIQKRFFGVERFIQQGK NLYRFVKYTEERRDPIIWINDLEKGIIGWDMGLLVSLARASQTCGHISK Prediction of potential genes in microbial genomes Time: Wed May 18 01:07:20 2011 Seq name: gi|222159345|gb|ACAB01000014.1| Bacteroides sp. D1 cont1.14, whole genome shotgun sequence Length of sequence - 2540 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 173 60 ## BT_3262 hypothetical protein 2 2 Tu 1 . - CDS 174 - 1148 737 ## COG0142 Geranylgeranyl pyrophosphate synthase - Prom 1202 - 1261 7.3 + Prom 1158 - 1217 4.2 3 3 Tu 1 . 
+ CDS 1237 - 2539 1056 ## COG0258 5'-3' exonuclease (including N-terminal domain of PolI) Predicted protein(s) >gi|222159345|gb|ACAB01000014.1| GENE 1 3 - 173 60 56 aa, chain + ## HITS:1 COG:no KEGG:BT_3262 NR:ns ## KEGG: BT_3262 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 51 171 221 226 78 72.0 9e-14 QAWKYIEQAAQLCSLDLHTAEEIDKSFLLGKAMKSGKIEDWDRLLSCYSLLGKHRR >gi|222159345|gb|ACAB01000014.1| GENE 2 174 - 1148 737 324 aa, chain - ## HITS:1 COG:PA4569 KEGG:ns NR:ns ## COG: PA4569 COG0142 # Protein_GI_number: 15599765 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Pseudomonas aeruginosa # 26 322 25 320 322 171 34.0 2e-42 MDSISLIRTPIEAELRDFKELFDSSLSSSNALLDSVVSHIRQRNGKMMRPILVLLVARLY GAVCPSTLHAAVSLELLHTASLVHDDVVDESTERRGQLSVNAIFNNKVAVLTGDYLLATS LVHAELTNSHRIIQLVSTLGQDLADGELLQLSNVSNHSFSEEVYFDVIRKKTAALFAACT KAAAFSVGVGEGEAELARLLGEYIGICFQIKDDIFDYFDNREIGKPTGNDMLEGKLTLPA LYVLNTTKNDEAQELAIKVKEGAATLDEIAHLISFIKENGGIEYAVQTMNIYKQKAFDLL ASLPESDICVALRAYLDYVVDREK >gi|222159345|gb|ACAB01000014.1| GENE 3 1237 - 2539 1056 434 aa, chain + ## HITS:1 COG:lin1600_1 KEGG:ns NR:ns ## COG: lin1600_1 COG0258 # Protein_GI_number: 16800668 # Func_class: L Replication, recombination and repair # Function: 5'-3' exonuclease (including N-terminal domain of PolI) # Organism: Listeria innocua # 5 288 2 285 316 212 43.0 1e-54 MDSSNKLFLLDAYALIYRAYYAFIKNPRINSKGFNTSAILGFVNTLEEVLKKENPTHIGV AFDPAGPTFRHEAFEQYKAQREETPEAIRLSVPIIKDIIRAYRIPILEVAGYEADDVIGT LATEAGRQGITTYMMTPDKDYGQLVSDKVFMYRPKHTGGFEVMGVEEVKAKFDIQSPAQV IDMLGLMGDSSDNIPGCPGVGEKTAQKLISEFGSIENLLEHTDQLKGALKTKVETNREMI TFSKFLATIKIDVPIQLEMDSLVREEADEDSLRSIFEELEFRTLIDRVLKKEVSGNGITS TPKSKVATGKSAPSPLPLFPEEGGGIQGDLFANFTPNEPNEAKKSNLETLESQSRNYQLI DTEKKRREIIQKLLTSKILSLDTETTETEPMDAELVGMSFSIAENEAFYVPVPSDRDEAL KIVKEFRPVFENEN Prediction of potential genes in microbial genomes Time: Wed May 18 01:07:35 2011 Seq name: gi|222159344|gb|ACAB01000015.1| 
Bacteroides sp. D1 cont1.15, whole genome shotgun sequence Length of sequence - 52321 bp Number of predicted genes - 50, with homology - 50 Number of transcription units - 22, operones - 13 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 32 - 1546 1671 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains + Term 1706 - 1742 -0.8 + Prom 1814 - 1873 4.8 2 2 Tu 1 . + CDS 1937 - 2332 516 ## BT_3259 hypothetical protein + Term 2370 - 2417 6.1 - Term 2360 - 2400 4.4 3 3 Op 1 9/0.000 - CDS 2422 - 3201 726 ## COG3279 Response regulator of the LytR/AlgR family 4 3 Op 2 . - CDS 3198 - 5249 1399 ## COG3275 Putative regulator of cell autolysis - Prom 5310 - 5369 5.1 + Prom 5284 - 5343 7.6 5 4 Op 1 . + CDS 5373 - 6278 782 ## COG1045 Serine acetyltransferase 6 4 Op 2 . + CDS 6295 - 7737 1549 ## COG0116 Predicted N6-adenine-specific DNA methylase + Prom 7780 - 7839 3.5 7 5 Op 1 . + CDS 7882 - 8430 345 ## BF4320 hypothetical protein 8 5 Op 2 . + CDS 8445 - 10643 2017 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases 9 5 Op 3 . + CDS 10712 - 11986 1543 ## COG0151 Phosphoribosylamine-glycine ligase 10 6 Op 1 . + CDS 12101 - 13111 672 ## BT_3252 hypothetical protein 11 6 Op 2 . + CDS 13096 - 13575 381 ## COG1238 Predicted membrane protein 12 6 Op 3 . + CDS 13620 - 14423 620 ## COG4121 Uncharacterized conserved protein 13 6 Op 4 25/0.000 + CDS 14462 - 15391 754 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin 14 6 Op 5 . + CDS 15429 - 16265 199 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein + Term 16486 - 16552 17.6 + Prom 16763 - 16822 3.0 15 7 Tu 1 . + CDS 16858 - 20241 3305 ## BT_3247 hypothetical protein + Prom 20259 - 20318 2.5 16 8 Tu 1 . + CDS 20352 - 20966 607 ## COG0726 Predicted xylanase/chitin deacetylase + Prom 21479 - 21538 1.6 17 9 Op 1 . 
+ CDS 21601 - 21831 240 ## gi|262405650|ref|ZP_06082200.1| conserved hypothetical protein 18 9 Op 2 . + CDS 21931 - 22422 506 ## gi|237716884|ref|ZP_04547365.1| conserved hypothetical protein 19 9 Op 3 . + CDS 22443 - 23078 621 ## BT_2676 hypothetical protein + Term 23114 - 23172 10.1 + Prom 23081 - 23140 2.9 20 10 Op 1 . + CDS 23184 - 23903 564 ## BT_2675 hypothetical protein 21 10 Op 2 . + CDS 23953 - 24423 555 ## gi|160882198|ref|ZP_02063201.1| hypothetical protein BACOVA_00144 + Term 24475 - 24527 16.2 - Term 24458 - 24519 17.7 22 11 Tu 1 . - CDS 24541 - 25782 1405 ## BT_3233 hypothetical protein - Prom 25887 - 25946 6.4 - Term 25891 - 25954 16.0 23 12 Op 1 9/0.000 - CDS 25975 - 26778 1131 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 24 12 Op 2 . - CDS 26819 - 27661 1017 ## COG3717 5-keto 4-deoxyuronate isomerase - TRNA 28069 - 28142 52.1 # Gln TTG 0 0 - Term 28167 - 28202 4.4 25 13 Op 1 . - CDS 28250 - 29542 713 ## PROTEIN SUPPORTED gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 - Prom 29564 - 29623 6.8 26 13 Op 2 . - CDS 29629 - 30321 492 ## COG0084 Mg-dependent DNase - Prom 30351 - 30410 2.5 - Term 30353 - 30393 3.9 27 14 Op 1 . - CDS 30618 - 31007 173 ## PROTEIN SUPPORTED gi|126646897|ref|ZP_01719407.1| 50S ribosomal protein L34 28 14 Op 2 . - CDS 31013 - 31759 578 ## BF0075 uroporphyrinogen-III synthase 29 14 Op 3 . - CDS 31764 - 32597 315 ## BT_3225 hypothetical protein 30 14 Op 4 . - CDS 32594 - 33181 523 ## COG1611 Predicted Rossmann fold nucleotide-binding protein 31 14 Op 5 . - CDS 33261 - 35501 1029 ## Cpin_4389 TonB-dependent receptor plug 32 14 Op 6 . - CDS 35501 - 36508 649 ## gi|237716900|ref|ZP_04547381.1| conserved hypothetical protein 33 14 Op 7 . - CDS 36524 - 37516 468 ## COG1651 Protein-disulfide isomerase - Term 37523 - 37571 14.6 34 15 Op 1 . 
- CDS 37590 - 38501 592 ## gi|237716902|ref|ZP_04547383.1| conserved hypothetical protein - Prom 38562 - 38621 5.6 - Term 38569 - 38621 1.1 35 15 Op 2 . - CDS 38671 - 38841 133 ## gi|294644389|ref|ZP_06722152.1| hypothetical protein CW1_0710 - Prom 39019 - 39078 5.6 - Term 39235 - 39277 -0.0 36 16 Tu 1 . - CDS 39432 - 39803 367 ## gi|237716903|ref|ZP_04547384.1| conserved hypothetical protein - Prom 39869 - 39928 5.0 37 17 Tu 1 . - CDS 39930 - 41594 657 ## BT_3570 TPR repeat-containing protein - Prom 41638 - 41697 6.8 - Term 41678 - 41743 12.0 38 18 Tu 1 . - CDS 41775 - 43067 1496 ## COG0192 S-adenosylmethionine synthetase - Prom 43292 - 43351 3.4 + Prom 43026 - 43085 4.0 39 19 Tu 1 . + CDS 43305 - 43493 238 ## BT_3217 hypothetical protein + Term 43551 - 43584 0.2 40 20 Op 1 . - CDS 43468 - 43977 279 ## PROTEIN SUPPORTED gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 41 20 Op 2 . - CDS 43953 - 45011 1082 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) 42 20 Op 3 . - CDS 45029 - 45751 677 ## COG0130 Pseudouridine synthase 43 20 Op 4 . - CDS 45751 - 46605 694 ## COG1968 Uncharacterized bacitracin resistance protein 44 20 Op 5 . - CDS 46608 - 46850 238 ## BT_3211 hypothetical protein - Prom 46874 - 46933 2.9 45 21 Op 1 . - CDS 46964 - 47845 564 ## COG2177 Cell division protein 46 21 Op 2 . - CDS 47849 - 48748 816 ## BT_3209 hypothetical protein - Prom 48788 - 48847 5.6 - Term 49175 - 49228 10.4 47 22 Op 1 . - CDS 49247 - 49666 175 ## gi|237716914|ref|ZP_04547395.1| predicted protein 48 22 Op 2 . - CDS 49656 - 50330 264 ## gi|237716915|ref|ZP_04547396.1| predicted protein 49 22 Op 3 . - CDS 50358 - 51515 538 ## BT_3206 hypothetical protein 50 22 Op 4 . 
- CDS 51515 - 51955 260 ## BT_3205 hypothetical protein Predicted protein(s) >gi|222159344|gb|ACAB01000015.1| GENE 1 32 - 1546 1671 504 aa, chain + ## HITS:1 COG:ECs4786_2 KEGG:ns NR:ns ## COG: ECs4786_2 COG0749 # Protein_GI_number: 15834040 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Escherichia coli O157:H7 # 3 504 134 635 635 458 50.0 1e-128 MIVLQNYGAVVKGPLFDTMIAHYVLQPELRHGMDYLAEIYLHYQTIHIDELIGPKGKNQK NMRDLDPKDVYLYACEDADITLKLKNVLEKELKENDAERLFYDIEMPLVPVLVNIERNGV LLDTEALQQSSAHFTAQMEQIEKEIYELAGETFNIASPKQVGEVLFDKLKIVEKAKKTKT GQYVTSEEVLESLRHKHPVVEKILEHRGLKKLLGTYIDALPLLINPRTGRVHTSFNQTVT ATGRLSSSNPNLQNIPIRDENGKEIRKAFIPDEGCLFFSADYSQIELRIMAHLSEDKNMI DAFLSNHDIHAATAAKIYKIDLKDVDSDMRRKAKTANFGIIYGISVFGLAERMNVDRKEA KELIDGYFETYPGVKAYMDKSIQVAQEKGYVETIFHRKRFLPDINSRNAVVRGYAERNAI NAPIQGSAADIIKVAMARIYQRFQAEGIQAKMILQVHDELNFSVPVNEKERVEEIVIEEM EKAYRMHVPLKADCGWGKNWLEAH >gi|222159344|gb|ACAB01000015.1| GENE 2 1937 - 2332 516 131 aa, chain + ## HITS:1 COG:no KEGG:BT_3259 NR:ns ## KEGG: BT_3259 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 131 1 131 131 186 83.0 2e-46 MKAKVSLKMFVLSAALLVVSLATSARSYDNQLIYNPIEENGMTVGQTVYKMDGNTLANYM KYNYKYDDNKRMIESEALKWNSNKDEWEKDLRINYTYEGKTVTTNYYKWNAKKQAYILVP EMTVTMDNTNM >gi|222159344|gb|ACAB01000015.1| GENE 3 2422 - 3201 726 259 aa, chain - ## HITS:1 COG:VC0693 KEGG:ns NR:ns ## COG: VC0693 COG3279 # Protein_GI_number: 15640712 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Vibrio cholerae # 11 254 5 232 237 84 27.0 2e-16 MMETEEKYKVVIVDDERTAIDALRRELEPYREFEVKGIAGNGAKGKKMIMELHPDLLFLD VELPDTLGLNLLSEIREDILWDMKVVFYTSYDKYLLQALRESAFDFLLKPFEAEDLKVIM ERYRKAMTSAALPLLPSFASSISALMPQQGMFMISTVTGFKLLRLEEIGFFEYLKDKRQW QVVLFNQTRLNLKRNTKAEDIISYSQAFVQISQSAIVNVNYLAMIDGKCCQLYPPFHDKN DLMISRSYLKELQERFFVL >gi|222159344|gb|ACAB01000015.1| GENE 4 3198 - 
5249 1399 683 aa, chain - ## HITS:1 COG:ECs3260 KEGG:ns NR:ns ## COG: ECs3260 COG3275 # Protein_GI_number: 15832514 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Escherichia coli O157:H7 # 469 676 350 551 565 91 27.0 4e-18 MNRKTAMKLFGLLGLLMLLSCGYADRHRNNSSCEQVLVDSLEVRVQDSLFSNVHYSRSQV LDALAQAQDSQIYYRLLALYGKTFFVSSDFDSILYYNRRVKEFSRNASQSSESLQSPQWN DVLSDVYNIEGNVWMQLNRPDSAITDYKKAYEYRLKGKKLHLLPDICINTADAYLHRSDL AHTASYYRRALFLCDSLNLSEHAKFPVYYGLGQTYMELRDFDLSNHYYELAGQYFDEMNV SERWTYLNNRGNHYYYRKDYQEALKYMRRANVLVSSYPQMVFEQNFIKVNLGELYLLTDK LDSAQVCLDESYRFFSDIQHNSAVHYIETQMIELALKKGNIAQAKTMIARTAPVGHLDAN MLTIRNQYLQHYFEQIGDYRHAYEYLKRDCHLDDSIRSERVQMRVAELDMRYRQDTIVLR KEMQIQRQAGEMRVLKLSVFIWVLVCVLLVAGVVVVVWYMRKKREFLRQRFFQQINRVRM ENLRSRISPHFTFNVLGREINQFNGSEEVKHNLMELVKYLRRSLELTEKLSVSLQDELDF VRTYIELERGRVGEDFVATITVEDGLDATRIMIPSMIVQIPVENAIKHGLAGKDGKKELM VYVSRETNGIRITVTDNGRGYLPQVVSATRGTGTGLKVLYQTIQLLNTKNKSDKIRFNIT NRSDGQTGTEVSVYIPFRFSYDL >gi|222159344|gb|ACAB01000015.1| GENE 5 5373 - 6278 782 301 aa, chain + ## HITS:1 COG:PA3816 KEGG:ns NR:ns ## COG: PA3816 COG1045 # Protein_GI_number: 15599011 # Func_class: E Amino acid transport and metabolism # Function: Serine acetyltransferase # Organism: Pseudomonas aeruginosa # 133 295 8 161 258 140 44.0 3e-33 MSPLNFTHILTQAVDELSESESYKGLFHQHKDGEPLPSAKVLYEIIELSRSILFPGYYGN STINSRTINYHIGVNIEKLFDLLCEQILAGLCFSTSIEGKCIVCSDSKREEAARLAAKFI SKLPAMRRILATDVEAAYNGDPAAESYGEVIFCYPAIKAISNYRIAHELLELGVPLIPRI ITEMAHSETGIDIHPAAKIGTHFTIDHGTGVVIGATSIIGNNVKLYQGVTLGAKSFPLDA DGKPIKGIPRHPILEDNVIVYSNATILGRITIGRDATVGGNIWVTENVPAGARIVQTKAK K >gi|222159344|gb|ACAB01000015.1| GENE 6 6295 - 7737 1549 480 aa, chain + ## HITS:1 COG:slr0064 KEGG:ns NR:ns ## COG: slr0064 COG0116 # Protein_GI_number: 16331495 # Func_class: L Replication, recombination and repair # Function: Predicted N6-adenine-specific DNA methylase # Organism: Synechocystis # 9 374 16 384 384 272 39.0 1e-72 MSEQFEMIAKTFQGLEEILAEELTALGANDIQIGRRMVSFTGDKRMMYKANFCLRTAIRI 
LKPIKNFTAKDADEVYNQIQAIPWEEYLDVNKTFAIDAVVFSDEFRHSKFVSYKVKDAIV DYFRDKTGKRPSVRINNPDVLLNIHIAQTTCTLSLDSSGESLHRRGYRQEAVEAPLNEVL AAGMILMTGWRGECDLIDPMCGSGTIPVEAALIAKNIAPGVFRKGFAFEKWVDFDADMFD EIYNDDSQEREFAHKIYGYDNNPKANEIATHNIKAAGVSKDIVLKLQPFQQFEQPQEKSI IITNPPYGERISTNDLLGLYQMIGERLKHAFVGNEAWVLSYREECFDQIGLKASQKVPLF NGPLECEFRKYEIFDGKYKEFKSQEEGEEKKEAEGEQAPRFRERKEFKPRREGEFKPRRD GESRPRREGEYKPRREGGEFKGGDGRDRKPRGEFRGGRDSRGPREFRGNREPRIPKKEEE >gi|222159344|gb|ACAB01000015.1| GENE 7 7882 - 8430 345 182 aa, chain + ## HITS:1 COG:no KEGG:BF4320 NR:ns ## KEGG: BF4320 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 3 161 2 170 196 142 49.0 4e-33 MNDIVIIENKIYEIRGQKVMLDFDLAEMYEVTTGNFNKAIKRNIERFPARFMFQLNKDEF NLIFQNGISSWGGTRKMPYAFTEQGVAMLSGVLKSPRAIEVSINIMDAFVHMRQYLLSHA PKQELEELRKRIEYLEEDISSDRESYEKQFDDLFTAFAKLSAAIQIKQTPLDRVKIEGFK NK >gi|222159344|gb|ACAB01000015.1| GENE 8 8445 - 10643 2017 732 aa, chain + ## HITS:1 COG:CC2154 KEGG:ns NR:ns ## COG: CC2154 COG1506 # Protein_GI_number: 16126393 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Caulobacter vibrioides # 133 714 145 715 738 286 33.0 8e-77 MSKIGKRAMLLLLALPIGFNVMAQETKKPTLEELIPGGESYRYADNLYGLQWWGDECIKP GVDTLYSIQPKTGKETMVITREQINKVLEENKAGKLSHLYSVSFPWADKPQMLFKIAGKF IVYDFKSNQVVSTLKPKDGADNEDYCAASGNVAYTIGNNLYVNEKAVTNEPEGIVCGQTV HRNEFGINKGTFWSPKGNLLAFYRMDESMVTQYPLVDITARVGEVNNVRYPMAGMTSHQV KVGIYNPATGKSIYLNAGDPTDRYFTNISWSPDEKSLYLIEVNRDQNHAKLCRYNAETGE PMGVLYEEMHPKYVEPQNAIVFLPWDPTKFIYQSQRDGYNHLYLFDADAANMKGETYNSA NGGTYFQAGKVKQLTKGNWLVSDVLGFNTKRKEIIFTAVEGLRSGHFAVNVSNGKISQPF ENCKESEHSGMLSASGTYLIDRYSTKDQPRIINLVDTKNFKETANLLTAENPYEGYQMPS IETGTIKAADGTTDLHYRLMKPANFDPAKKYPVIVYVYGGPHAQCVTGGWQNGARGWDTY MAGKGYIMFTIDNRGSSNRGLAFENATFRRLGIEEGKDQVKGVEFLKSLPYVDSERIGVH GWSFGGHMTTALMLRYPEIFKVGVAGGPVIDWGYYEVMYGERYMDTPESNPEGYKECNLK NLAGQLKGHLLIIHDDHDDTCVPQHTLSFMKACVDARTYPDLFIYPCHKHNVAGRDRVHL HEKITRYFEQNL >gi|222159344|gb|ACAB01000015.1| GENE 9 
10712 - 11986 1543 424 aa, chain + ## HITS:1 COG:VC0275 KEGG:ns NR:ns ## COG: VC0275 COG0151 # Protein_GI_number: 15640304 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylamine-glycine ligase # Organism: Vibrio cholerae # 1 422 1 420 429 381 47.0 1e-105 MKILLLGSGGREHALAWKIAQSPKVEKLYIAPGNAGTTAVGENVNIKATDFEAISAFALK ENIQMIVVGPEDPLVKGIYDYFQNRPELKHIAVIGPSAQGAQLEGSKEFAKGFMMRHNIP TARYKSITSENLEEGLAFLETLEAPYVLKADGLCAGKGVLILPTLDEAKKELKEMLGGMF GSASATVVIEEFLSGIECSVFVLTDGDNYKVLPVAKDYKRIGEGDKGLNTGGMGSVTPVP FADEVYMEKVRTRIIEPTINGLKEEKITYKGFIFLGLIKVKGEPMVIEYNVRMGDPETES VMLRIQSDFVELLEGTAAGNLNEKTLVMDPRSASCVILVSGGYPEAYEKGFPISGLEQAA ATDSIIFHAGTAMKDGQVVTNGGRVIAVCSYGATKEEALAQSYKVADMIDFDKKYFRRDI GFDL >gi|222159344|gb|ACAB01000015.1| GENE 10 12101 - 13111 672 336 aa, chain + ## HITS:1 COG:no KEGG:BT_3252 NR:ns ## KEGG: BT_3252 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 336 1 336 336 447 77.0 1e-124 MRNKRLQSQVTAGRWTLPAVIFICALCWVLTYFLFPDSIASTTPKESSSFLQSTRDLLLP GWAERIVSFLIYAIIGYFLIELNNQFSIIRMRASMQTAIYFLLVTVCPGMHLLYIGDIVA LGSLISIYFLFKSYQQAQAAGYLFYSFFFIGAGSILFPQLTILSVLWLFEAYRFQSLTPR SFCGALLGWMLPYWMLFGHAFFYNEMELFYRPFNQLLAFGEIFNLQILQPWELAILGYLL VMFIVSAVHCIAAGFEDKIRTRAYLQFLIDLTLFLFLLIALQPIYCSALLPLLIISNSIL IGHFFVLTNSKSSNVFFIISLVGLILLFAFNIWTLL >gi|222159344|gb|ACAB01000015.1| GENE 11 13096 - 13575 381 159 aa, chain + ## HITS:1 COG:Cj0341c KEGG:ns NR:ns ## COG: Cj0341c COG1238 # Protein_GI_number: 15791709 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Campylobacter jejuni # 19 152 13 146 147 85 35.0 2e-17 MDAFVDSMIQLLIEWGLPGLFISALLAGSIVPFSSELVLVALVKLGLPPIACLISATLGN TVGGMTCYYMGRLGKISWIEKYFKVKKEKVDKMVKFLQGKGALMAFFTFLPAIGEVIAIA LGFMRSNTWLTIVSMFVGKLIRYILLLYVLESAWDAMAG >gi|222159344|gb|ACAB01000015.1| GENE 12 13620 - 14423 620 267 aa, chain + ## HITS:1 COG:SMb20878 KEGG:ns NR:ns ## COG: SMb20878 COG4121 # Protein_GI_number: 16264920 # Func_class: S Function unknown # Function: 
Uncharacterized conserved protein # Organism: Sinorhizobium meliloti # 16 265 26 237 238 82 28.0 1e-15 MERIIERTDDGSATLFVPELNEHYHSTKGARTESQHIFIDMGLKASPAATPRVLEIGFGT GLNAWLTLEEAERSRRNIHYTGLELYPLDWQTVEQLGYISSDEQLTIHRKQTATDEQFTP NDEEEQQPAIELFKQLHTSPWEKDVQLTPHFTLRKIETDVNQWITGKKEESKMNSTNNNA VISQYSTSNTPFNTIYFDAFAPEKQPEMWSQELFNRLYVLLDRDGILTTYCAKGVVRRML QTAGFTVERLPGPPGGKREILRARKQE >gi|222159344|gb|ACAB01000015.1| GENE 13 14462 - 15391 754 309 aa, chain + ## HITS:1 COG:MTH604 KEGG:ns NR:ns ## COG: MTH604 COG0803 # Protein_GI_number: 15678632 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Methanothermobacter thermautotrophicus # 10 302 13 292 295 194 35.0 2e-49 MKQQLKDMRTLFLLGTCLLIAACTGRSSQASNDDETKPVITVTLEPQRYFTEAIAGDKFK VVSMVPKGSSPETYDPIPQQLVSLGDSKAYFRIGYIGFEQTWMDRLMNNTPHIQVFDTSK GIDLILNNDDHGHAHGHNSHDGHIHAVEPHVWNSTGNALIIAGNTYKALSQLDKANEPYY RNRYDSLCQRIQHTDSLIRRQLSTPEAAKAFIIYHPALSYFARDYGLHQISIEEGGKEPS PAHLKDLIDVCQAENVGVIFVQPEFDKRNAETIAQQTGTKVVPINPLSYDWEEEMLNVAK ALAPQAATK >gi|222159344|gb|ACAB01000015.1| GENE 14 15429 - 16265 199 278 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 1 234 1 210 311 81 28 1e-14 MKPIIEIKNLSAGYDGRTVLHDVNLSIYERDFLGIIGPNGGGKTTLIKCILGLLKPTGGE IIFHTPEKAPIRTSTEGTSTETSAKNSHQSSSAKSQLFLGYLPQYSTIDRKFPISVEEVI LSGLSIQKSLTSRFTPEQREKGKQIIARMGLEGLESRSIGQLSGGQLQRALLGRAIISDP AVLILDEPSTYIDKRFEARLYELLAEINKECAIILVSHDIGTVLQQVKSIACVNETLDYH PDTGVTTEWLERNFNCPIELLGHGTLPHRVLGEHHHHH >gi|222159344|gb|ACAB01000015.1| GENE 15 16858 - 20241 3305 1127 aa, chain + ## HITS:1 COG:no KEGG:BT_3247 NR:ns ## KEGG: BT_3247 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1127 11 1124 1124 2115 90.0 0 MGWLTFIIAATVYCLTIEPTASFWDCPEFITTGYKLEVGHPPGAPFFMLVANLFSQFASD VTTVAKMVNYMSALMSGACILFLFWSITHLVRKLIITDENNITKGQLITVMGSGLIGALV 
YTFSDTFWFSAVEGEVYAFSSLFTAVVFWLILKWEDVADQPHSDRWIILIAYLTGLSIGV HLLNLLCLPAIVLVYYYKKTPNATAKGSLIALLGSMVLVAAVLYGIVPGIVKVGGWFELL FVNGLGMSFNSGVVVYIILLAAALIWGVYESYTEKNKARMAISFILTIALLGIPFYGHGA SSIIIGILVIAALGLYLAPSVQAKIKERWRITARTMNTALLCTMMIVIGYSSYALIVIRS TANTPMDQNSPEDIFTLGEYLGREQYGTRPLFYGPAFSSKVALDVKDGYCIPRQSEAGSK FVRKEKTSPDEKDSYIELPGRVEYEYAQNMFFPRMYSSSHAPLYKQWVDIKGHDVPYDQC GEMVMVNMPNQWENIKFFFSYQLNFMYWRYFMWNFAGRQNDIQGSGEIEHGNWITGIPFI DNLLVGNQELLPQDLKNNKGHNVFYCLPLILGLIGLFWQAYHSQRGIQQFWVVFFLFFMT GIAIVLYLNQTPAQPRERDYAYAGSFYAFAIWVGMGVAGIIRMLREYCKMQELPAAVLAS VLCLFVPIQMAGQTWDDHDRSGRFVARDFGQNYLMTLQEKGNPIIYTNGDNDTFPLWYNQ ETEGFRTDARTCNLSYLQTDWYIDQMKRPAYDSPSLPITWDRVEYVEGQNEYIPIRTEMK AFIDSYFKQANELAAQGDTTILSLVHSIFGENPYELKEIINRWMLGKNDQLKELLKKTGK DIQLPLIPTDSIVMKVDKEAVRRSGMKIPEALGDSIPEYMTITLRDANGNPKRALYKSEL MMLEMLANANWERPIYMAITVGSENHLGMGNHFMQEGLAYRFTPFDTDKLDSKIDSEKMY DNLMNKFKFGGIDKPGIYIDENVMRMCYTHRRIFTQLVGQLIKEGKKDKALAALDYAEKM IPSYNVPYDWANGAFQMAESYYQLGQNEKANKIIDELANKSLEYMIWYLSLNDNQLAIAG ENFVYNASLLDAEVRLMEKYKSEELAKHYSTQLDQLYNEYVTRMKGK >gi|222159344|gb|ACAB01000015.1| GENE 16 20352 - 20966 607 204 aa, chain + ## HITS:1 COG:all4345 KEGG:ns NR:ns ## COG: all4345 COG0726 # Protein_GI_number: 17231837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Nostoc sp. PCC 7120 # 20 201 100 284 305 121 38.0 7e-28 MFIEQPPWLFRALYPQAIFRMDPNERAVYLTFDDGPIPEVTPWVLEVLEKHHIKATFFMV GDNIRKHPDEYRMVVEHGHRIGNHTFNHIRGFEYSNPDYLANARKVDEIIHSDLFRPPHG HMGFRQYYTLRYHYRIIMWDLVTRDYSKRMRPEQVLNNVKRYARNGSIITFHDSLKSWNN GNLQYALPRAIEFLKEEGYEFKVL >gi|222159344|gb|ACAB01000015.1| GENE 17 21601 - 21831 240 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262405650|ref|ZP_06082200.1| ## NR: gi|262405650|ref|ZP_06082200.1| conserved hypothetical protein [Bacteroides sp. 
2_1_22] # 1 76 1 76 76 159 100.0 7e-38 MIHAGQLIERTLHEQGRTVTWFATQLCCTRPNVYKIFRKENIDIHLLWRISYILGHDFFR DLSDSINTGSFPSVSK >gi|222159344|gb|ACAB01000015.1| GENE 18 21931 - 22422 506 163 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716884|ref|ZP_04547365.1| ## NR: gi|237716884|ref|ZP_04547365.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 163 1 163 163 321 100.0 1e-86 MKTKKLFLIIALVVGCAVGAHAQKTVFKFRDAQARAGDAVTEVCVKPTVVEVKILEDKGR IKDEWTLSKEEVEIAMKGELDNIRAWGTYLSTIKYNCDVIMGATFKVEDNEKTGGYTVTV VGYPGIFVNWHPATKEDYEWIHLQKLSPTDGKSQIAPVVKNKN >gi|222159344|gb|ACAB01000015.1| GENE 19 22443 - 23078 621 211 aa, chain + ## HITS:1 COG:no KEGG:BT_2676 NR:ns ## KEGG: BT_2676 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 211 1 210 210 366 86.0 1e-100 MKKVLSMVAALLLCVGTQAQIVSSRSAIVKTEKQASSTQWFLRAGLNIMNFSGDGAEGAD SNIGYNATFGYQKPLGSTGGYWGMEFGLGSRGFKVDDTKCMAHNIQYSPFTFGWKFAVAD NVRIDPHVGVFASYDYTSKMKEDGESISWGDYADYMEVDYNHFDAGMNIGVGVWYDRFNL DLTYQRGFIDAFSDADGFKTSNFMIRLGIAF >gi|222159344|gb|ACAB01000015.1| GENE 20 23184 - 23903 564 239 aa, chain + ## HITS:1 COG:no KEGG:BT_2675 NR:ns ## KEGG: BT_2675 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 239 2 241 241 377 76.0 1e-103 MTKRQFLVILYLLVGIIPLTAQTFTEQKKNYPVSADGNKYVVSGFTPFSDMQDENIYANA LLWIVKNVCPQLREGIAEVNVPAKNFSCDLILASQADSSQKNTYYCKALFQVKDGKLVYY LSNIRIESSAVIMKKITPMEKLQPDKKASHKGIMDDFVQVESQVLNKMFDFIITNQLSPI THWNEININKPVKGMTEDECLLAFGKPQTIQESNGEVQWMYSSSFYLFFKDGHVETIIK >gi|222159344|gb|ACAB01000015.1| GENE 21 23953 - 24423 555 156 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160882198|ref|ZP_02063201.1| ## NR: gi|160882198|ref|ZP_02063201.1| hypothetical protein BACOVA_00144 [Bacteroides ovatus ATCC 8483] # 1 156 27 182 182 272 100.0 6e-72 MKTIWKATVCIVAVLLSFSIYSCGDDDDDAVGSRDLLLGTWNGVYYLSQEWEDGEKVSDS KEDFVNGTNRYSIEFKEDGTYVEKDVYNSSGSTNYYHGTWSYSGNKLTLIDTEEDNYTEV WTVTTMTENELVYELREKEKEDGTTYEYYEQHAFTR >gi|222159344|gb|ACAB01000015.1| GENE 
22 24541 - 25782 1405 413 aa, chain - ## HITS:1 COG:no KEGG:BT_3233 NR:ns ## KEGG: BT_3233 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 413 1 414 414 730 88.0 0 MKKILILFAVVLIGFASCADSKQSMTITVTNPLALERVGEMVEVPMSDVVAKLKLADTAQ IVVLDVDGQQVPYQVTYDEKVVFPTMVAANGTSIYTIQPGTPAPFDVVACGKYYPERLDD VAWENDLGGFRAYGPALQARGERGFGYDLFTKYNTTEPILESLYAEELNPEKRAKIAELK KTDPKAASELQKAISYHIDHGYGMDCYAVGPTLGAGVAALMAGDTIIYPYCYRTQEILDN GPLRFTVKLEFNPLVVRGDSNVVETRVISLDAGSYLNKTIVSYTNLKEAMPVTTGLVLRE PDGATVADAANGYITYVDPTTDRSGANGKIFVGAAFPAQVKEAKVVLFSEKEKKERGGAD GHVLAISEYEPGSEYTYYWGSAWDKAAIKTPDAWNKYMAEYAQKLRTPLTVAY >gi|222159344|gb|ACAB01000015.1| GENE 23 25975 - 26778 1131 267 aa, chain - ## HITS:1 COG:CAC2607 KEGG:ns NR:ns ## COG: CAC2607 COG1028 # Protein_GI_number: 15895865 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Clostridium acetobutylicum # 1 267 1 267 267 427 79.0 1e-120 MNQYLNFSLEGKVALVTGASYGIGFAIASAFAEQGAKVCFNDINQELVDKGMAAYAAKGI KAHGYVCDVTDEPAVQAMVATIAKEVGTIDILVNNAGIIRRVPMHEMDAADFRRVIDIDL NAPFIVAKAVLPAMMEKRAGKIINICSMMSELGRETVSAYAAAKGGLKMLTRNICSEYGE YNIQCNGIGPGYIATPQTAPLREKQADGSRHPFDSFICAKTPAGRWLDPEELTGPAVFLA SEASNAVNGHVLYVDGGILAYIGKQPK >gi|222159344|gb|ACAB01000015.1| GENE 24 26819 - 27661 1017 280 aa, chain - ## HITS:1 COG:YPO1725 KEGG:ns NR:ns ## COG: YPO1725 COG3717 # Protein_GI_number: 16121985 # Func_class: G Carbohydrate transport and metabolism # Function: 5-keto 4-deoxyuronate isomerase # Organism: Yersinia pestis # 6 280 2 278 278 296 49.0 4e-80 MKTNYEIRYAAHPEDAKSYDTTRIRRDFLIEKIFVPNEVNMVYSMYDRMVVGGALPVGEV LTLEAIDPLKAPFFLTRREMGIYNVGGPGIVKAGDATFELDYKEALYLGSGDRIVTFESK DAAHPAKFYFNSLTAHRNYPDRKVTKADAIVAEMGSLEGSNHRNINKMLVNQVLPTCQLQ MGMTELAPGSVWNTMPAHVHSRRMEAYFYFEIPEDHAICHFMGEVGETRHVWMKGDQAVL 
SPEWSIHSAAATHNYTFIWGMGGENLDYGDQDFSLITDLK >gi|222159344|gb|ACAB01000015.1| GENE 25 28250 - 29542 713 430 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 [Phaeobacter gallaeciensis BS107] # 2 427 8 414 418 279 36 3e-74 MNFVEELRWRGMLQDIMPGTEELLSKEQVTAYLGIDPTADSLHIGHLCGVMILRHFQRCG HKPLALIGGATGMIGDPSGKSAERNLLDEETLRHNQACIKNQLAKFLDFESDVPNRAELV NNYDWMKDFTFLDFVREVGKHITVNYMMAKDSVKRRLNGEARDGLSFTEFTYQLLQGYDF LHLYETKGCKLQMGGSDQWGNITTGAELIRRTNGGEVFALTCPLITKADGGKFGKTESGN IWLDARYTSPYKFYQFWLNVSDSDAERYIKIFTSIEKEEIEALIAEHQEAPHLRLLQKRL AKEVTVMVHSEDDYNAAVDASNILFGNATSEALRKLDEDTLLAVFEGVPQFEISRDVLAE GVKAVDLFVDNAAVFASKGEMRKLVQGGGVSLNKEKLAAFDQVVTTADLLDEKYLLVQRG KKNYYLLIAK >gi|222159344|gb|ACAB01000015.1| GENE 26 29629 - 30321 492 230 aa, chain - ## HITS:1 COG:BS_yabD KEGG:ns NR:ns ## COG: BS_yabD COG0084 # Protein_GI_number: 16077107 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Bacillus subtilis # 53 191 57 203 255 81 33.0 1e-15 MKKNVTDILDIHTHKQEVDTQGKSIINYPLLADSPLYMPLAENVEVAVGRGSYYSIGIHP WEVRESNVSQQLSFLQQQLQRKQFVAVGEAGLDKLAKAPMELQLAVFKEQVKLSEKLGLP LIIHCVKAMEELLGVKKESRPQQPWIWHGFRGKPEQAVQLLKKGFYLSFGEYYPDETMQI VPDERLFLETDDSLLDIEDILCQAARVRGVEVEALCEVIRRNIQNVFFKA >gi|222159344|gb|ACAB01000015.1| GENE 27 30618 - 31007 173 129 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646897|ref|ZP_01719407.1| 50S ribosomal protein L34 [Algoriphagus sp. 
PR1] # 4 127 3 122 130 71 36 1e-11 MGIYTLCKAERLNSKILIGKMFEGGVSKSFSIFPIRVVYMPVEQGEAPASILISVSKRRF KRAVKRNRVKRQIREAYRKNKSLLVDELQRREQRLAVAFIYLSDELVATAELEEKMKIAL ARISEKLFS >gi|222159344|gb|ACAB01000015.1| GENE 28 31013 - 31759 578 248 aa, chain - ## HITS:1 COG:no KEGG:BF0075 NR:ns ## KEGG: BF0075 # Name: not_defined # Def: uroporphyrinogen-III synthase # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 248 1 248 250 451 93.0 1e-125 MKIKKVLVSQPKPASEKSPYYDIAEKYGVKIDFRPFIKVESLSAKEFRQQKISILDHTAV IFTSRHAIDHFFTLCTELRVTIPETMKYFCVTEAVALYIQKYVQYRKRKIFFGATGKIED LIPSIVKHKTEKYLVPMSDVHNDDVKNLLDKNSIQHTEAVMYRTVSNDFTPDEEFDYDML VFFSPAGVTSLKKNFPDFNQKEIKIGTFGSTTAQAVRDAGLRLDLEAPTVQAPSMTAALD MFIKENNK >gi|222159344|gb|ACAB01000015.1| GENE 29 31764 - 32597 315 277 aa, chain - ## HITS:1 COG:no KEGG:BT_3225 NR:ns ## KEGG: BT_3225 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 43 276 6 239 240 351 89.0 2e-95 MIGDTLSSQNTDTLSLLQQKQISPAKADSDSLQLADLHAVQEVDSGFEGTPISYSPRTDD AIALTLLACFFLSSIALARGKKFLSQQVKDFVLHRERTSIFDSSTAADVRYLLVLVLQTC VLSGITFLNYFHDTCPALMDHVSSLLLLGIYVGFCLAYFLLKWLLYMFLGWTFFDKNKTN IWLESYSALIYYVGFALFPFVLFLVYFDLSLTNLVIIGSIILIFTKILMFYKWIKLFFHQ FSGLFLLILYFCALEIVPCLLLYQGMIQMNNILLIKF >gi|222159344|gb|ACAB01000015.1| GENE 30 32594 - 33181 523 195 aa, chain - ## HITS:1 COG:PA4923 KEGG:ns NR:ns ## COG: PA4923 COG1611 # Protein_GI_number: 15600116 # Func_class: R General function prediction only # Function: Predicted Rossmann fold nucleotide-binding protein # Organism: Pseudomonas aeruginosa # 4 189 3 185 195 164 45.0 7e-41 MNQINSVCVYSASSTKIDAVYFQAAETLGRLFAEHHIRLINGAGSIGLMCSVADAVLKNG GEVTGVIPRFMVEQNWHHTGLTELIEVESMHERKQKMANLSDGIIALPGGCGTLEELLEI ITWKQLGLYLNPIIILNTNRFFDPLLEMLEKAIDENFMRRQHGDIWKVAQTPEEAVQLLY ETPVWDISIRKFAAI >gi|222159344|gb|ACAB01000015.1| GENE 31 33261 - 35501 1029 746 aa, chain - ## HITS:1 COG:no KEGG:Cpin_4389 NR:ns ## KEGG: Cpin_4389 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: C.pinensis # 
Pathway: not_defined # 25 746 27 775 775 424 35.0 1e-117 MEKYLLLFILSCCGNFTICNAQTITVRGRVTDRKTGEVLSDANIGDLMSRTGVAANTYGL YSIRIKGGRCVLRCSMLGYVTQMDTLTLTANSVHNFALMPDNYQLSDVEVMGNQKAGGQL TLNQKDIQALPTLGSEPDVLKSLQYLPGVISGNEGSNNISVRGSNQWGNLILLDEAMVYN PNHALSFFSVFNNDAIQQVSLYKSYFPLKYGGRTSSVIDVKMREGNNQEKHRSATIGVVA SKIQLEGPIKKGKTSYLVAGRFAYPGAVLNVLKQFRGTKMSFYDVNAKINSTLNDRNRIF FSVYNGGDHTFFNQLVRSYGMNWGNTTATFRWNHVWTDRLSGNFSAIFSNYYYRYKSITD GMKFLWKSNIQSYQLKYDADYAVNNVLHIRSGLSAHVFTTMPGSISSWGDFSNVVPYRMD RRFLLDMAAYGEATYKISSVWRLSGGIRFPVFYTPKVGELKQKGYIMPEPRAELSYSPGT GNRLHAAFTQSSQNLHMLSNSSVGIPSDMWVPANRQLKPAVMKQVALGYEKSLEKGMYTF SLETYYRKTDHIVDFVDNANIFLNNQIETQLNTGYSKAYGAEFYVSKNRGRLTGWISYTL SHARNYIAALEDKEYPPVYDRPHSLKIFLNYEAGRKRRCAFAATFSYNSGMNLTLPIAHY RVNGTAFYIYSTRNGYRAPAFHELNLSMTCKTGKRGRLILSVMNVYNRKNVFTIYTSRDD YDFSDIGMHKMYLYGALPSISYQFTF >gi|222159344|gb|ACAB01000015.1| GENE 32 35501 - 36508 649 335 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716900|ref|ZP_04547381.1| ## NR: gi|237716900|ref|ZP_04547381.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 335 1 335 335 673 100.0 0 MRLYFNLNKITYLFLMQALYLCIACEEPFDVGMPIPEDAIVFDGVITDEPPPYYFVLSKP STKLKYPENRSFDRINDAEIVIVDLTTGIRDTLQNAKLTGYQDFRFYDHYRDKDVTVYMK WLPGETPGGLYVTNKIYGVENHTYELHIKYKGKEYTACERMVPKTPIDKIVMKRIDTGEG EPNETPCISFYNPPEEHNYYLLKTDFCSSKVLRVASVYNLYYGTTNSAGWPYSILDDEYL AENVIDYVVSEGEQFVLPNRPGFSYPVSDSIWIKMQSISENCYQVFDQMIKQIRSDGGTF SPRPTSVKSNIDNGAYGIFRVSAISEIYFYKKHRI >gi|222159344|gb|ACAB01000015.1| GENE 33 36524 - 37516 468 330 aa, chain - ## HITS:1 COG:AF1354 KEGG:ns NR:ns ## COG: AF1354 COG1651 # Protein_GI_number: 11498950 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Protein-disulfide isomerase # Organism: Archaeoglobus fulgidus # 176 330 133 301 305 75 30.0 2e-13 MKILYFFLFSFLFIACNTPVINKNDVVAEVNGEQILLSELASQSKQEIFDILNTAYEIKS RVLAGLIKQKLLEDAAKKENMSLEEFIDWFVQQKICVGQDSLKKRYGFNTQSFYVKGELI PLVKGSLEEKLSYQQKLRSRIVQALVDSLYQKADIKRFLYPPKQPECVVRDLCVYYRGNL DSPVSFIVASDYNCERCVQFEKTLSKLYDNYKERVKFGFVHFADAPSLAALACEAAGEQK QFWTFHDTIFNYSGVADSAFIYNLAKSKRLNMTEFDAYLHSSDKYKEMDKVINQLVERGL MATPTIIINDRLVYVTNSYEELSRLLEYEL >gi|222159344|gb|ACAB01000015.1| GENE 34 37590 - 38501 592 303 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716902|ref|ZP_04547383.1| ## NR: gi|237716902|ref|ZP_04547383.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 303 1 303 303 618 100.0 1e-175 MKEKLLVLGTMVALCCSCAGSSVLLDTESELNGLNNEYELGTLEAGDNGLKCETSGDEDS IQVLKDNYIAEIMGRMEVAKEKVMRTAYGATSPRVGVFRVGSCGNYKYLSIKIDCENGNS KTSVSGNVGDTYVDGGDNMRLEFCMVDAGFRYPGGVFLFEDVPLETMTLVRYHDTEDGGH NGVWSDDPNYYDVMHISGMSKLDSNATLAWNINRNMTKWGDIPVGPAGINYGVIAPAEMA SGNLYFDDEDHNNKNWAQVWLGPQNQYEPHGTYYGVQLDRNTRYHVCLNTDKTNFTKLVR SAI >gi|222159344|gb|ACAB01000015.1| GENE 35 38671 - 38841 133 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294644389|ref|ZP_06722152.1| ## NR: gi|294644389|ref|ZP_06722152.1| hypothetical protein CW1_0710 [Bacteroides ovatus SD CC 2a] # 1 56 1 56 56 91 100.0 1e-17 MEMKLLFFLLSAILFQSSISIYAMEQIKRGWVALKDVTQMRSHSIFVFPNNFLGSS >gi|222159344|gb|ACAB01000015.1| GENE 36 39432 - 39803 367 123 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716903|ref|ZP_04547384.1| ## NR: gi|237716903|ref|ZP_04547384.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 123 1 123 123 199 100.0 4e-50 MKAKFLPLLLLAILLQVSMFSFANEMTSARRIRLKNKTEVQHRSIPVTPDACIENSLLSI DLLSTVPTVTVIIKNAETDEVVYTSTDLNVDKVYIDLTGEEKGKYTLEIQLPKEAFTGDF ELE >gi|222159344|gb|ACAB01000015.1| GENE 37 39930 - 41594 657 554 aa, chain - ## HITS:1 COG:no KEGG:BT_3570 NR:ns ## KEGG: BT_3570 # Name: not_defined # Def: TPR repeat-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 546 20 556 566 276 33.0 2e-72 MKTRCCYPKRYLLLFFSLLVSVGSILMSVSLSSCSSSVKSPLLLSADSLMEIYPDSALSI LESISSPQKLPRADRALYALLLTQARHKNYIALGDDSLIKTAVEYYGDKKKSVRAAKAHY YWGATYGEKGYTSFAVDEYLTAIRLMPVRDEFLAMIYDNLADCYEKDELDNVAMEAYRAA YQILKGKSGQIYPMRGIARMFLLQNRKDSALFYYQQALDCALADKDSSLIGALYHDFAMV YNEKKDYILANQYVSKAIMIQGEGAVNACLSKAQIMLNLNQLDSASYFYNKNMSQLDIYG KAVCYDGMYQIAKKKGEWKVATENMDIYKVLYDSIQIMTDNEELNRLMDKHQLEEHKRLL SEHAKTLVFTLVAVFFFLMIICVFCFMWNDRKRKKYYIALQHELTQKRVDTMLLKDEEVS ESNKEHIDKKRSELTEQQIQLCVSVLKTTDCYDQLETLEKATPKQLLAMRSLRKDIRSTI SNAFVDVMVNLKERYPTLTGDDVFYCVLSLLYCSKTVMMELMDATSDALKTRKNRIKNKV DAQLFERVFGADNQ >gi|222159344|gb|ACAB01000015.1| GENE 38 41775 - 43067 1496 430 aa, chain - ## HITS:1 COG:TM1658 KEGG:ns 
NR:ns ## COG: TM1658 COG0192 # Protein_GI_number: 15644406 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylmethionine synthetase # Organism: Thermotoga maritima # 1 430 1 395 395 427 53.0 1e-119 MGYLFTSESVSEGHPDKVADQISDAVLDKLLAYDPSSKVACETLVTTGQVVLAGEVKTKA YVDLQLIAREVIKKIGYTKGEYMFESNSCGVLSAIHEQSPDINRGVERQDPMEQGAGDQG MMFGYATNETENYMPLSLDLAHRILQVLADIRREGKVMTYLRPDAKSQVTIEYDDNGAPV RIDTIVVSTQHDDFIQPEDDSQAAQLKADEEMLSIIRRDVIEILMPRVIASIHHDKVLAL FNDKIIYHVNPTGKFVIGGPHGDTGLTGRKIIVDTYGGKGAHGGGAFSGKDPSKVDRSAA YAARHIAKNMVAAGVADEMLVQVSYAIGVARPINIFVDTYGRSHVNMTDGEIARVIDQLF DLRPKAIEERLKLRNPIYQETAAYGHMGREPQVITKKFSSRYEGDKTVEVELFTWEKLDY VDKIKAAFGL >gi|222159344|gb|ACAB01000015.1| GENE 39 43305 - 43493 238 62 aa, chain + ## HITS:1 COG:no KEGG:BT_3217 NR:ns ## KEGG: BT_3217 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 62 1 62 63 79 79.0 4e-14 MILNKKLKTLLVLALYIGFTIAIYAIVCHFIDKPFQEIHLLYAVLIGCIAYLPVFIAEKK KK >gi|222159344|gb|ACAB01000015.1| GENE 40 43468 - 43977 279 169 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 [Streptococcus pneumoniae SP9-BS68] # 1 162 111 264 278 112 39 6e-24 CNADFGQIMAKVYLGLGTNLGDKEQNLRDAVQKIEEQVGKIVSLSAFYVTAPWGFSSDNS FLNAAVCVDTELAPIDVLQRTQAIEQELGRTKKSVNGIYSDRLIDIDLLLYGDLILSTTS PSGAKLILPHPLMAERDFVMKPLAEIAPGLVHPVLGKTMKELTSSFSPQ >gi|222159344|gb|ACAB01000015.1| GENE 41 43953 - 45011 1082 352 aa, chain - ## HITS:1 COG:SP1416 KEGG:ns NR:ns ## COG: SP1416 COG0809 # Protein_GI_number: 15901270 # Func_class: J Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Streptococcus pneumoniae TIGR4 # 1 349 1 341 342 296 43.0 3e-80 MKLSQFKFKLPEDKIALHPMKYRDESRLMVLHRNTGKIEHKMFKDVLDYFDDKDVFIFND TKVFPARLYGNKEKTGARIEVFLLRELNEELRLWDVLVDPARKIRIGNKLYFGPDDSMVA EVIDNTTSRGRTLRFLYDGPHDEFKKALYSLGETPLPHSIINRPVEPEDAERFQSIFAKN 
EGAVTAPTASLHFSRELMKRLEIKGVDFAYITLHAGLGNFRDIDVEDLTKHKMDSEQMFV NEMAVKTVNRAKDNGRNVCAVGTTVMRAIESAVSTDGHLKEFEGWTNKFIFPPYEFTVAN SMISNFHMPLSTLLMIVAAFGGYDQVMDAYHVALKEGYRFGTYGDAMLILDK >gi|222159344|gb|ACAB01000015.1| GENE 42 45029 - 45751 677 240 aa, chain - ## HITS:1 COG:MT2862.1 KEGG:ns NR:ns ## COG: MT2862.1 COG0130 # Protein_GI_number: 15842331 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridine synthase # Organism: Mycobacterium tuberculosis CDC1551 # 8 216 8 210 298 174 45.0 9e-44 MNFKKGEVLFFNKPLGWTSFKVVGHARYHICRRIGVKKLKVGHAGTLDPLATGVMIVCTG KATKRIEEFQYHTKEYVATLRLGATTPSYDLEHEIDATYPTEHITRELVEEVLTHFIGAI DQVPPAFSACMVDGKRAYELARKGEEVELKAKQLVIDEIELLECRLDDPEPMIQIRVVCS KGTYIRALARDIGEALHSGAHLTGLIRTRVGDVRLEDCLNPEHFKEWIDGQEIENEEENN >gi|222159344|gb|ACAB01000015.1| GENE 43 45751 - 46605 694 284 aa, chain - ## HITS:1 COG:aq_2195 KEGG:ns NR:ns ## COG: aq_2195 COG1968 # Protein_GI_number: 15607126 # Func_class: V Defense mechanisms # Function: Uncharacterized bacitracin resistance protein # Organism: Aquifex aeolicus # 4 268 1 247 256 179 43.0 5e-45 MGDLTTFETIIIAIVEGLTEFLPVSSTGHMIITQNILGVESTEFVKAFTVIIQFGAILSV VCLYWKRFFRLNHTPAPAGASALKCFLHKFDFYWKLLVAFIPAAILGFLFSDKIDEMLES VAIVAVMLVIGGIFMLFCDKIFSKGSEDTVLTERKAFNIGLFQCIAMIPGVSRSMATIVG GMAQKLTRKDAAEFSFFLAVPTMFAATGYKVLKLFLDGGTEILLNNMPALIIGNVVAFIV ALLAIKFFISFVTKYGFKAFGWYRIIVGGTILVMLLLGYNLEIG >gi|222159344|gb|ACAB01000015.1| GENE 44 46608 - 46850 238 80 aa, chain - ## HITS:1 COG:no KEGG:BT_3211 NR:ns ## KEGG: BT_3211 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 77 1 77 77 129 94.0 2e-29 MSDKQKFAFDKVNFILLAIGMAIVIIGFLLMTGPTSSETSFEPDIFSVRRIKVAPVVCLF GFLSMIYAVLRKPKTQKTEE >gi|222159344|gb|ACAB01000015.1| GENE 45 46964 - 47845 564 293 aa, chain - ## HITS:1 COG:L2 KEGG:ns NR:ns ## COG: L2 COG2177 # Protein_GI_number: 15672955 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: 
Lactococcus lactis # 23 285 29 311 311 81 26.0 2e-15 MGDQNKYKSGSIFDMQFITSSISTTLVLLLLGLVVFFVLTAHNLSVYVRENISFSVLISD DMKEADILKLQKKLNQEPFVKQSEYISKKQALKEQTEAMGTDPEEFLGYNPFTASIEIKL HSDYANSDSIARIEKMIKKNTNIQDVLYRKELIDAVNDNIRNISLVLLALAVVLTFISFA LINNTIRLAIYSKRFLIHTMKLVGASWGFIRGPFLRKNVWSGILAAIVADSILMGTAYWA VTYEQELLQVITPEVMLIVCASVLAFGIVITWLCAYFSMNKYLRMKANSLYYI >gi|222159344|gb|ACAB01000015.1| GENE 46 47849 - 48748 816 299 aa, chain - ## HITS:1 COG:no KEGG:BT_3209 NR:ns ## KEGG: BT_3209 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 21 299 1 279 279 541 92.0 1e-152 MEKLSINACPVCGGAHLKRVMTCTDFYASGEQFELYSCEDCGFTFTQGVPVEAEIGKYYE TPDYISHTDTRKGAMNNIYHYVRSYMLGRKARLVAKEAHRKTGRLLDIGTGTGYFSDAMV RRGWKVEAVEKSPQAREFAKLHFDLDVKPESALKEFAPGSFDVITLWHVMEHLEHLDEVW QRLHELLTEKGVLIVAVPNCSSYDAQRYGEYWAAYDVPRHLWHFTPGTIQQLASRHGFIM AARHPMPFDAFYVSMLSEKHRGSSCSFLKGMFAGTLAWFNALGRKERSSSMIYVFRKKR >gi|222159344|gb|ACAB01000015.1| GENE 47 49247 - 49666 175 139 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716914|ref|ZP_04547395.1| ## NR: gi|237716914|ref|ZP_04547395.1| predicted protein [Bacteroides sp. D1] # 1 139 1 139 139 264 100.0 1e-69 MRNKENHRKLGKPPKNKRILKKEREKKQGRIALIIFFVIVSSPFWLYVGIIRNMILPIFG EKSKAALTEIRGPGIWAGRYYYNEPDFYYTFYIRDKLYRGNSGIQPNDSLFHLGDTVDVI YLKRFPFINALKLPKEKKE >gi|222159344|gb|ACAB01000015.1| GENE 48 49656 - 50330 264 224 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716915|ref|ZP_04547396.1| ## NR: gi|237716915|ref|ZP_04547396.1| predicted protein [Bacteroides sp. 
D1] # 1 224 8 231 231 373 99.0 1e-102 MSGDSYYQFSPYSYCGGNPICHRDEEGKFINNIIGAVIGGAADLGVQVASNLIQGKTAFE DINWVSVGASALEGGLTSGLSAGRTLAVKASVALAGNTTKAIMDGKLNNTNDILDVAKSS ITEVVSDRIIGKAGNIVGKGIKSVAPGIDKAISQKANKLILSNNKATDLAKKLPGVSNTK AARAIATSENGLVEASKTVSNTLQKVPENVTKFGLKIIKKENEK >gi|222159344|gb|ACAB01000015.1| GENE 49 50358 - 51515 538 385 aa, chain - ## HITS:1 COG:no KEGG:BT_3206 NR:ns ## KEGG: BT_3206 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 351 9 347 348 185 34.0 3e-45 MRTTFLNLILLFVIVGCKQPTINKVQQAVEAQAKLLVDSGLIANEYVVLYELAINDSNHI YSIQAADCPADLKFEYPSKIIKYKDKYLCFIELDEAPMSAKEMIEASSYSGNLVVEGDGG KSWLLVVSKLGEKKILIDTSLLRGWGTYFNITELWPYFSGYVKGCPVQMGVMSHDVELSD SYLSCNVDSIKRNLLWNENQNTTMIKNVYGQMYLKNNTDSVVYLSSSTKKHYAVVDGQDS LYLSLSDSLPIILGPNEKKILEYKSLPRQDEFFRNLTLKEDPWEYFYNLFCRSTYSFINV NGKKSQTKVMFHDIDNYGFNVSMTSPSIQFQILNHGIYDKKYGEMSRFRFWSDKWGAMNN ADKQRLSDDADERFQRNVNQVNVLP >gi|222159344|gb|ACAB01000015.1| GENE 50 51515 - 51955 260 146 aa, chain - ## HITS:1 COG:no KEGG:BT_3205 NR:ns ## KEGG: BT_3205 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 112 104 208 244 73 44.0 2e-12 MYNMQDGKSGGLETYNSTSSLEDAAAVFKFGADNTSVEWKLDIYNDKGDKTAIIGTSGRE DSVFADKQSELNVKGDKVIDMHSHPYNAQASDQDMKNLKIKTGTVYHRDSKVLFFYNSEN SRIGNNEYKIDTSKTLLDKLNDKFMK Prediction of potential genes in microbial genomes Time: Wed May 18 01:10:01 2011 Seq name: gi|222159343|gb|ACAB01000016.1| Bacteroides sp. D1 cont1.16, whole genome shotgun sequence Length of sequence - 1574 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 95 - 146 12.1 1 1 Op 1 . - CDS 188 - 646 194 ## BT_3203 hypothetical protein 2 1 Op 2 . 
- CDS 717 - 1574 561 ## BT_3202 cell well associated RhsD protein precursor Predicted protein(s) >gi|222159343|gb|ACAB01000016.1| GENE 1 188 - 646 194 152 aa, chain - ## HITS:1 COG:no KEGG:BT_3203 NR:ns ## KEGG: BT_3203 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 150 38 187 190 209 65.0 3e-53 MIDSNEVEYVHFWFVGDIDTNHALENCEDVVFMQESHDTIMRDRRIIERFVSVINRSKPI NPKSNYDLRVSSLVRLKPINGEKRPDIKVCIGNYGRRVLLNDVLMKGDHEELQKFIQEEL YDALTPYQWLPEVIKDYLKDHPEERNNYLPDE >gi|222159343|gb|ACAB01000016.1| GENE 2 717 - 1574 561 285 aa, chain - ## HITS:1 COG:no KEGG:BT_3202 NR:ns ## KEGG: BT_3202 # Name: not_defined # Def: cell well associated RhsD protein precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 285 1053 1337 1337 449 76.0 1e-125 AFGGTFASADNSIQPYKYNGKELDTKNGLNWYDYGARQYDVAIGRWNAVDPMAEKYYNWS PYSYCMSNPIKYIDPSGQTVVIWYKNDSGQTVSYSYSGGNVAHPNPFVQSVITAYQYNKA NGVKVGNGGGASTVKIVENADIKVNVMETPLEVTYNPYAARGAGCIYWKSDWGLQNENGT VTSPATDFDHEAAHALEHKTNAQEYEVNRVRGNDSQYDSKEERRVITGPEQRTARANGET RPGQMTRRNHKGRTVVTKGVTSNVIDIQKTKEYEKRKKPVWTSEP Prediction of potential genes in microbial genomes Time: Wed May 18 01:10:20 2011 Seq name: gi|222159342|gb|ACAB01000017.1| Bacteroides sp. D1 cont1.17, whole genome shotgun sequence Length of sequence - 47902 bp Number of predicted genes - 35, with homology - 33 Number of transcription units - 19, operones - 10 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 1349 1047 ## BT_3202 cell well associated RhsD protein precursor 2 1 Op 2 . - CDS 1369 - 4449 1557 ## Cpin_7100 YD repeat protein - Prom 4526 - 4585 3.7 3 2 Op 1 . - CDS 4599 - 4700 80 ## 4 2 Op 2 . - CDS 4691 - 5227 323 ## COG3023 Negative regulator of beta-lactamase expression - Prom 5252 - 5311 4.0 5 2 Op 3 . - CDS 5325 - 5828 559 ## BT_3199 putative non-specific DNA-binding protein - Prom 5853 - 5912 4.2 6 3 Op 1 . - CDS 5951 - 6841 579 ## BT_3198 hypothetical protein 7 3 Op 2 . 
- CDS 6829 - 7149 236 ## BT_3197 hypothetical protein + Prom 7455 - 7514 2.3 8 4 Tu 1 . + CDS 7614 - 8327 713 ## BT_3196 hypothetical protein - Term 8225 - 8282 5.9 9 5 Tu 1 . - CDS 8511 - 9884 471 ## PROTEIN SUPPORTED gi|227395721|ref|ZP_03879044.1| SSU ribosomal protein S12P methylthiotransferase - Prom 10034 - 10093 6.4 + Prom 10019 - 10078 6.4 10 6 Op 1 . + CDS 10221 - 10682 337 ## gi|237716929|ref|ZP_04547410.1| conserved hypothetical protein + Prom 10758 - 10817 1.9 11 6 Op 2 . + CDS 10860 - 12359 1731 ## COG0427 Acetyl-CoA hydrolase + Term 12404 - 12466 12.1 - Term 12471 - 12510 -0.9 12 7 Tu 1 . - CDS 12675 - 14081 906 ## COG2027 D-alanyl-D-alanine carboxypeptidase (penicillin-binding protein 4) - Prom 14177 - 14236 6.7 - Term 14185 - 14228 3.3 13 8 Op 1 . - CDS 14238 - 15581 668 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 14 8 Op 2 . - CDS 15623 - 16075 387 ## BT_3185 hypothetical protein - Prom 16098 - 16157 4.3 + Prom 15876 - 15935 5.7 15 9 Op 1 . + CDS 16182 - 17753 1727 ## COG0029 Aspartate oxidase 16 9 Op 2 . + CDS 17756 - 18031 163 ## BT_3183 hypothetical protein 17 10 Op 1 . + CDS 18497 - 19555 856 ## COG2152 Predicted glycosylase 18 10 Op 2 . + CDS 19555 - 20817 1069 ## COG0477 Permeases of the major facilitator superfamily 19 10 Op 3 . + CDS 20801 - 22534 1381 ## Phep_4002 hypothetical protein 20 10 Op 4 . + CDS 22552 - 25620 2422 ## BDI_1907 hypothetical protein 21 11 Op 1 5/0.000 - CDS 25692 - 27257 856 ## COG3436 Transposase and inactivated derivatives - Prom 27283 - 27342 1.9 22 11 Op 2 . - CDS 27344 - 27694 104 ## COG3436 Transposase and inactivated derivatives 23 11 Op 3 . - CDS 27682 - 28083 226 ## BVU_0481 hypothetical protein - Prom 28272 - 28331 4.3 + Prom 28034 - 28093 3.3 24 12 Tu 1 . + CDS 28162 - 29967 1170 ## COG0642 Signal transduction histidine kinase + Term 30029 - 30069 2.0 25 13 Tu 1 . 
- CDS 30168 - 30908 380 ## COG0863 DNA modification methylase - Prom 30954 - 31013 5.7 - Term 30915 - 30948 -0.2 26 14 Op 1 . - CDS 31017 - 32330 476 ## Cthe_2471 hypothetical protein 27 14 Op 2 . - CDS 32323 - 33168 121 ## COG0338 Site-specific DNA methylase - Prom 33238 - 33297 6.5 - Term 33257 - 33304 9.1 28 15 Tu 1 . - CDS 33322 - 33900 785 ## COG1592 Rubrerythrin - Prom 34075 - 34134 6.7 + Prom 33902 - 33961 5.8 29 16 Tu 1 . + CDS 34150 - 35829 1843 ## COG0659 Sulfate permease and related transporters (MFS superfamily) + Term 35885 - 35932 2.5 30 17 Tu 1 . - CDS 36034 - 38358 1833 ## COG3525 N-acetyl-beta-hexosaminidase - Prom 38390 - 38449 3.0 31 18 Tu 1 . + CDS 38357 - 38554 89 ## - Term 38372 - 38420 5.1 32 19 Op 1 . - CDS 38457 - 40280 1392 ## BT_1629 hypothetical protein 33 19 Op 2 . - CDS 40324 - 42138 1425 ## BT_1630 hypothetical protein 34 19 Op 3 . - CDS 42152 - 45508 2557 ## BT_1631 hypothetical protein 35 19 Op 4 . - CDS 45538 - 47190 1362 ## BT_1632 chitinase - Prom 47273 - 47332 2.6 Predicted protein(s) >gi|222159342|gb|ACAB01000017.1| GENE 1 2 - 1349 1047 449 aa, chain - ## HITS:1 COG:no KEGG:BT_3202 NR:ns ## KEGG: BT_3202 # Name: not_defined # Def: cell well associated RhsD protein precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 45 447 269 718 1337 208 33.0 3e-52 MKRIYLLLSLLLCSLLCMSQVSTSQNYISTRTYTSPDHSGCREQVVYFDGLGRPSQTVDC GITPDGKDLVSLQEYDDQGRKLRTWLPAKSAGNGNYMTLSSLQSGASSLAGGDARPYLQT TYEASPLNRPVAQHGAGETWAEHPVSYQYITRNPRSFPSSYYMTPSGNFLGVCTTDEDGN QAYEFKDGFGRTILAGRMDGSDPCFAYYEYDNHDDLVNVYPPFVTYYQMNEPEGPYATQS SYSYHYDFLHRCVYKKLPERDAIYYIYDHGDHQVLSQDGEQRVRGEWAFSLSDELNRPVL TGICHNSYYYDDFPLCEIDVKARRDDTGTAFHGYILENIPMTNPVVYTVNYYDDYSFIGK HGVPTSLNYVTPPSGYGICYTESSKGLLTGTVTARLDATGVTGYDYAAFYYDERGRIIQS RATNHMGGIEEEYVAYSFTGEPLKRQHVH >gi|222159342|gb|ACAB01000017.1| GENE 2 1369 - 4449 1557 1026 aa, chain - ## HITS:1 COG:no KEGG:Cpin_7100 NR:ns ## KEGG: Cpin_7100 # Name: not_defined # Def: YD repeat protein # Organism: C.pinensis # Pathway: 
not_defined # 309 1022 389 1060 1065 108 24.0 9e-22 MKRLIFILSFFAALFPFTAHAASTAPYVPLSVPTSPQAAAFKMYGDVAVNPAMGVPDISI PLFDIDHHGYRLPLSLKYNPAPFHPGYNYDVFGRGWALSISSCISRSIECMPDELTDFKL DIDKFGKSYRNLSEEYLNTLELKSDLFTAILPDGSSFEFVIRKNYDGKIEYVVSGGRDVK ITHATMDRKIISFKVVDEQGVEYAFTGADTTFRGTGCTITPYNTTYVSWQLTSIRLPHSS ELITFDYDKSIHSNFGYQQAEPALRFHHFYELGCSPDVFDVTVHTYTRQHAYQMKLLTSV TYGTTTIRINYKDGTNSEYYNYAQSISIKDGDRLVRTINLEQHQGTLQSSSLSKSPFSML DCVTLLGGNDASSETYRCGYTSSYANFGGTDHWGNLNFRSNNYDVAYMNLFVGFDVSRAS SSFVTDVPKDENDLSPLDKVRLSNVPYNTRKPAGPESHCVLRKLTYPTGGYTEFDFENHK FFSFTDDDGDYIHDKKKRVKTEATGFRIREIVNYTAEGVRSDSKHYCYGKTEYEENGWEI GRYYHTGAGEPTLDPTIQTYMNYQSTDLPMSVLNMVLGLDPNGQYRSFKSNPFMSYSPSG GYPVYTWEWECTFSAFNFQRLLNGRPAVVYPEVTVYYLKDGATQFSPENCNGKTVYQYDI YEAMQNDTAFFEKPQYYGNVLWYEGEKYRYNLLKEQTDYMSNGREYKLKRRESLIWRYGG SSVIDWEYFNSYGTPSFVPATAVLGNFYTSTCQMLGHSLLYERKVTTYDESETGITDTEH YTYLSGNRMQWKERNAGGLHRETNWEYPAIARTGTTPAIVRKMVEKHILNPVLKEIQNQN SYNYEGNRREFGEFPTESGDMLILPARFYQTIYDSSSHYSEVLAYSSNGNPREVVSKDKL HMVYLWGYGDRYLIAEIKNATLEQVETAMSSVFGMTSSALAKSTTPDVAKLKALRNHTSL SEAHVTTFTHLPLVGVTSISDPTGKTSYYDYDGLGRLKANYYYEGNVVDESKKRVVQEYD YHYRNQ >gi|222159342|gb|ACAB01000017.1| GENE 3 4599 - 4700 80 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAMKKSVWDMILKVVIAVASALAGVLGANAMNL >gi|222159342|gb|ACAB01000017.1| GENE 4 4691 - 5227 323 178 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 46 142 2 98 116 102 48.0 4e-22 MRIINLIVVHCSATRGDCTLSPEDLDRLHRRRGFNGTGYHYYICKDGTVHLTRPIERIGA HAKGFNAHSIGICYEGGLDCRGRPADTRTPAQRATLRQLVGQLQEKFSGCRVCGHRDLSP DLNGNGEIEPEEWIKQCPCFEVAKEFKELEEFAIKTENTDEHRVTQHIKKQKGGKLWQ >gi|222159342|gb|ACAB01000017.1| GENE 5 5325 - 5828 559 167 aa, chain - ## HITS:1 COG:no KEGG:BT_3199 NR:ns ## KEGG: BT_3199 # Name: not_defined # Def: putative non-specific DNA-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 150 1 150 
168 236 85.0 2e-61 MPVLYKPFQSVLEDKNKKKLFHPRVIYTANVSTTQLAKEIAAYSSLSTGDVKNTLDNLVT VAAQHLQASESVTLDGFGTFRMVMKSNGKGVELPEKVSAAQASLTVRFLPNYTKNPDRTT ATRSLVTGAKCVRFDLADTSASGGGDSGKPDGGGSGDGGGEAPDPAA >gi|222159342|gb|ACAB01000017.1| GENE 6 5951 - 6841 579 296 aa, chain - ## HITS:1 COG:no KEGG:BT_3198 NR:ns ## KEGG: BT_3198 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 296 1 302 302 440 71.0 1e-122 MGRIKQGLDYFPMSTSFMHDRIVRRVMKREGDAAFATLVETLSYIYAGKGYYIPASDEFY DELTDSLYNTDLDDVKRIIAPSVECGLFDAGLFRQYGILTSADIQRQYLFITKRRSSSLI DPAYCLLEAEELASYHTSPNSKNSAEDADNKTGCDVTPTADTVTSTTDSATSEAEMSTLG TQNKEKQIKTNQNKLNHLSDSPQGENGGGKILKSRKVMTQEDIDNLQPPPDGMQRNFEGL LENLHSYKIPPSEQYAIILKSNFGAIGHPVWKGFSSIRGSNGKIKLPGHYLLSVIN >gi|222159342|gb|ACAB01000017.1| GENE 7 6829 - 7149 236 106 aa, chain - ## HITS:1 COG:no KEGG:BT_3197 NR:ns ## KEGG: BT_3197 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 106 1 106 106 139 73.0 4e-32 METIMNKSLCICGCCKRELPQEAFYMNQRTHTTDSYCKECRKANARRHRNQDKSTLFENK SVSYPVITEVKDYALRMFLIRHARQVVSDSIRRKQEKEKLRALWEE >gi|222159342|gb|ACAB01000017.1| GENE 8 7614 - 8327 713 237 aa, chain + ## HITS:1 COG:no KEGG:BT_3196 NR:ns ## KEGG: BT_3196 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 237 1 236 236 385 77.0 1e-106 MDTFLDVTGIVKRAKQVLNFKNDSELAEYLGVSRATVSNWGARNSIDFRLLLDKFGDKVD YNWLLLGKGNPKHQPRHCESELVQGEVEIIHNPKIPEPIDDRSVTLYDITAAANLKTLFT NKKQYALGKILIPNISVCDGAVYVNGDSMYPILKSGDIIGYKEISSFDNVIYGEIYLVSF MIDGDEYLAVKYVNRSEQEGHLKLVSYNTHHEPMDIPFASINAMAIVKFSIRRHMMM >gi|222159342|gb|ACAB01000017.1| GENE 9 8511 - 9884 471 457 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227395721|ref|ZP_03879044.1| SSU ribosomal protein S12P methylthiotransferase [Haliangium ochraceum DSM 14365] # 22 452 1 450 461 186 29 3e-46 MNELTGADFKSATEMTDDNKKLFIETYGCQMNVADSEVIASVMQMAGYSVAETLEEADAV FMNTCSIRDNAEQKILNRLEFFHSLKKKKKRLIVGVLGCMAERVKDDLITNHHVDLVVGP 
DAYLTLPDLIAAVETGEKAINVELSTTETYRDVIPSRICGNHISGFVSIMRGCNNFCTYC IVPYTRGRERSRDVESILNEVADLVAKGYKEVTLLGQNVNSYRFERPTGEVVTFPMLLRT VAEAAPGVRIRFTTSHPKDMSDETLEVIAQVPNVCKHIHLPVQSGSSRILKLMNRKYTRE WYLDRVAAIKRIIPDCGLTTDIFSGFHSETEKDHAMSLSLMEACGYDAAFMFKYSERPGT YASKHLEDNVPEEVKVRRLNEIIALQNRLSAESNQRCIGKTYEVLVEGVSKRSRDQLFGR TEQNRVVVFDRGTHRVGDFVNVRVTEASSATLKGEEI >gi|222159342|gb|ACAB01000017.1| GENE 10 10221 - 10682 337 153 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716929|ref|ZP_04547410.1| ## NR: gi|237716929|ref|ZP_04547410.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 153 1 153 153 234 100.0 1e-60 MEDFLKFLLIAGVILVGIFKEVNKNSKAKKAKAKHPVPPMPSPVEVNPDAVPIPEAWGSP KSLDELLRPILREQPVAKQPSKQQTSKQKKKKEEVSVAASLANSVAQDEQNSRQGAHYNT PHDSPDNKEDFTIHSAEEARRAIIWGEILQRKY >gi|222159342|gb|ACAB01000017.1| GENE 11 10860 - 12359 1731 499 aa, chain + ## HITS:1 COG:ygfH KEGG:ns NR:ns ## COG: ygfH COG0427 # Protein_GI_number: 16130821 # Func_class: C Energy production and conversion # Function: Acetyl-CoA hydrolase # Organism: Escherichia coli K12 # 5 489 7 490 492 500 49.0 1e-141 MSLNRISAAEAASLIKHGYNIGLSGFTPAGTAKAVTAELAKIAEAEHAKGNPFQVGIFTG ASTGESCDGVLSRAKAIRYRAPYTTNADFRKAVNNGEIAYNDIHLSQMAQEVRYGFMGKV NVAIIEACEVTPDGKIYLTAAGGISPTICRLADQIIVELNSAHSKSGMGMHDVYEPLDPP YRREIPIYKPSDRIGLPYVQVDPKKIIGVVETNWPDEARSFAAADPLTDKIGQNVADFLA ADMKRGIIPSTFLPLQSGVGNIANAVLGALGRDKTIPPFEMYTEVIQNSVIGLIREGRIK FGSACSLTVTNDCLEGIYNDMDFFRDKLVLRPSEISNSPEIVRRLGIISINTAIEVDLYG NVNSTHIGGTKMMNGIGGSGDFTRNAYISIFTCPSVAKEGKISAIVPMVSHHDHTEHDVN IVITEQGVADLRGKSPKERAQAIIENCAHPDYKELLWDYLKLAGNRAQTPHAIQAALGMH AELAKSGDMKNTNWAEYAK >gi|222159342|gb|ACAB01000017.1| GENE 12 12675 - 14081 906 468 aa, chain - ## HITS:1 COG:BS_pbp KEGG:ns NR:ns ## COG: BS_pbp COG2027 # Protein_GI_number: 16078896 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase (penicillin-binding protein 4) # Organism: Bacillus subtilis # 36 465 51 488 491 153 28.0 9e-37 MKKSLLLVALSVCLLPLWGQQNFSRIDSLIKKMLPEASEVGISVYDLTAKKSLYNYRAEK 
LSRPASTMKLLTAITALSRPEAAEPFRTEVWHDGVIEHDTLQGNLYVVGGFDPEFNSQSM DSLIEEVITFPFSVINGQVYGDVSMKDSLYWGSGWAWDDTPAGYQPYLSPLMFCKGTVQV SVVPSTVQGDTASVSCQPVSSYYTVTNQTKTRTSSAGKFSFTRDWLTNGNNLLVSGNVTS IRKDDVNIYDSPRFFMHTFLERLRGKGITTPQSYGFAELPRDSVNVERMACWNTSVQKVL NQLMKESDNLNAEALLCRLGAQATGKKQVAAEDGIVEIMKLIRRLGHDPKDYKIADGCGL SNYNYLSPALLVDFLKYAYSQTEVFQMLYKSLPIGGVDGTLKFRMKGTPAFRNVHAKTGS FTAINALAGYLKMKNGHEVAFAIMNQNVLSAAKARAFQDKVCEVIIGK >gi|222159342|gb|ACAB01000017.1| GENE 13 14238 - 15581 668 447 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 2 447 3 448 458 261 33 4e-69 MKYQVIIIGGGPAGYTAAETAGKGGLSVLLIEKNSLGGVCLNEGCIPTKTLLYSAKTYDS AKHASKYAVNIPEVSFDLPKIIARKSKVVRKLVLGVKAKLTANNVTIVSGEAQIIDKNTV RCGEETYEGENLILCTGSETFIPPIPGVDAVNYWTHRDALDSKELPASLAIVGGGVIGME FASFFNSLGVQVTVVEMMDEILGGMDKELSALLRAEYTKRGIKFLLSTKVVGLSQTEEGA VVSYENAEGNGSVIAEKLLMSVGRRPVAKGFGLENLNLEKTERGAIRINEKMQTSVPGVY VCGDLTGFSLLAHTAVREAEVAVHSILGKEDAMSYRAIPGVVYTNPEIAGVGETEESAST KGINYQVIKLPMAYSGRFVAENEGVNGVCKVLLNEQQRVIGAHVLGNPASEIITLAGTAI ELGLTAAQWKKIVFPHPTVGEIFREVL >gi|222159342|gb|ACAB01000017.1| GENE 14 15623 - 16075 387 150 aa, chain - ## HITS:1 COG:no KEGG:BT_3185 NR:ns ## KEGG: BT_3185 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 129 1 129 131 192 80.0 4e-48 MNRIFHARIAWYQYFLLVVLTVNAVGALWCKYILVAVLFMLMLIVVIEQIIHTTYTLTTN GDLEVSRGRFIRKKIIPLSEITTVRKYHSMKFGRFSVTDYILIEYGKGKFVSVMPVKEQE FAELLEKRMFAKKIKATEAIDPVDSQADES >gi|222159342|gb|ACAB01000017.1| GENE 15 16182 - 17753 1727 523 aa, chain + ## HITS:1 COG:PA0761 KEGG:ns NR:ns ## COG: PA0761 COG0029 # Protein_GI_number: 15595958 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate oxidase # Organism: Pseudomonas aeruginosa # 4 519 6 520 538 486 47.0 1e-137 MVRKFDFLVIGSGIAGMSFALKVAHKGKVALICKSGLEEANTYFAQGGVASVTNLLVDNF DKHIEDTMIAGDWISSRDAVEKVVREAPAQIEELIKWGVDFDKNEKGEFDLHREGGHSEF 
RILHHKDNTGAEIQDSLIKAVQRHPNITVIENQFAIEILTQHHLGVTVTRQTPDIKCYGA YILDPKTGKVDTYLAKVTLMATGGVGAVYKTTTNPLVATGDGIAMVYRAKGTVKDMEFVQ FHPTALYHPGDRPSFLITEAMRGYGGVLRTMDGKEFMQKYDPRLSLAPRDIVARAIDNEM KNRGDDHVYLDVTHKDPEETKKHFPNIYEKCLSLGIDITKDYIPVAPAAHYLCGGILVDL DGQSSIERLYAVGECSCTGLHGGNRLASNSLIEAVVYADAAAKHSLQAVDQYSYNEDIPE WNDEGTRSPEEMVLITQSMKEVNQIMSTYVGIVRSDLRLKRAWDRLDIIYEETESLFKRS VASREICELRNMVNVGYLIMRQAMERKESRGLHYTIDYPHVKK >gi|222159342|gb|ACAB01000017.1| GENE 16 17756 - 18031 163 91 aa, chain + ## HITS:1 COG:no KEGG:BT_3183 NR:ns ## KEGG: BT_3183 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 86 1 86 90 136 79.0 2e-31 MKTLIRTFILFFCITGLASCYAGIPLKSGKSENNQTYEVSYLFEHDGVKVYRFLDLGNYV YFTTRGDVTSIKNDSTRKRTITIYKDSIPNR >gi|222159342|gb|ACAB01000017.1| GENE 17 18497 - 19555 856 352 aa, chain + ## HITS:1 COG:lin0857 KEGG:ns NR:ns ## COG: lin0857 COG2152 # Protein_GI_number: 16799931 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Listeria innocua # 6 347 5 351 355 309 45.0 6e-84 MDIAKRCEQNPLLAPKDLKAGINGMEITCLLNPGVFKYQGKIWLLLRVAERPIQKEGIIS FPIYNEKGKIEVVSFLKNDPDLDASDPRVIGYKGKNYLTTMSYLRLVSSTDGIHFKDEPE FPPIFGKGELEAFGIEDCRVATTEDGFYLTFTEVSSVAVGVGLIHTNDWKNYTRHGMIFP PHNKDCALFEQKIGDKYFALHRPSSPELGGNYIWLAESPDRLHWGNHKCVATTRDGMWDC ARVGAGAAPIKTEEGWLEIYHGADFNHRYCLGALLLDLNDPSKVIARSEEPIMEPIAAYE QTGFFGNVVFNNGHLVDGDTITMYYGASDEVICKAELSISEILNVLKQTQEK >gi|222159342|gb|ACAB01000017.1| GENE 18 19555 - 20817 1069 420 aa, chain + ## HITS:1 COG:BS_yqgE KEGG:ns NR:ns ## COG: BS_yqgE COG0477 # Protein_GI_number: 16079556 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus subtilis # 21 417 20 410 430 76 23.0 9e-14 MNKLKKEYLFFKQQEPNMRTLLVTNMLYALVLPIVEIFVGAYIMRNTSDPVMVAFYQLAM 
YIGIITTSLVNGYLLKKYSVKTLYSVGILISGISMFGMMTIKSLGFVELGLAGFLMGAAS GFFWTNRYLLALYNTNDNSRNYFFGLESFFFSITSIGVPLIIGAFISQIDGKEVFGFLFN INTSYCVVTMAAVVITVIACMTLWKGKFENPKETNFLYFRFNILWKKMLWLAGLKGMVQG FLVTAPAILVLKLVGDEGALGLIQGVSGALTAVLVYVLGRMARPQDRLKIFVGGLIVFFA GTLFNGILFSATGVILFVLCKVIFQPLFDLAYFPIMMRTIDAVAKIENRNEYAYILSHEF GLFLGRAFGLLLFIFLAYCVSQDFALKYALVIVAGLQLAAYPLAKNITNQTDTIEHENDK >gi|222159342|gb|ACAB01000017.1| GENE 19 20801 - 22534 1381 577 aa, chain + ## HITS:1 COG:no KEGG:Phep_4002 NR:ns ## KEGG: Phep_4002 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 30 566 25 564 578 573 50.0 1e-162 MKMTSKGVFVLMTVCFLASCSGNGNNFGEKVKQVSIHRVDSMPDMPETYKMLDWKQKAQK YDQFIFDWNNKSEVGPLIWLDDTRRNMDQTTFGLYTAIKDIRQGKNANNGEFHESLNSLA AILGAGLVGIDKTNQDGYNYVKMVQNYFNSDNGWNIVMNNTTPSVALLGGGYARDWWYDV LPNALYYAICDVFPNVDGAEKIQKSIAEQFVKADSVLNGNYDYSYFDYAQMKGMVNHIPL QQDAAGGHAYVLLCAYHKFGDPRYLQHSKSAIEALLAQKESRFYEALLPLGVYTAAYLNA VEGANYDVAKLLDWVFDGCKSPAGRTGWGIIVGKWGDYDVSGLQGSITDGGGYAFLMNSI KPAWPFIPMVKYQPQYAKAIGKWMLNNASACRLFYPGEIDETHQWAPELKDITYDNVSYE GLRKTDDYGKASLKGVSPVAIGDGPKWIKGNPTESMFSVYSSSPVGILGAIVCQTNVEGI LRLDCNVTDFYTEKPYPVYLYYNPHKETKTITYQATQPCDLFDIVAKEYIAKNIKTNGSV EIPANDARVIVELPAGTELELKNGKIIANKQNIISYN >gi|222159342|gb|ACAB01000017.1| GENE 20 22552 - 25620 2422 1022 aa, chain + ## HITS:1 COG:no KEGG:BDI_1907 NR:ns ## KEGG: BDI_1907 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 21 1022 53 1083 1083 634 40.0 1e-180 MSKHLLLFILILAVCTNVFAQQTKSISGVVYDSNTGEALIGVSVLEVGTTNGTITDFDGR YTLKVSSNKVSFSYVGFKTEIVNVTNSGTYNVNLVSDNKLDEVVVIGYGTQRKSDLTGAL ASISSKDIKNYAVSNASELLTGKAAGVFVAASSGQPGSDAVIRVRGLGTVNDNNPLYVVD GQFMDNISSLNPSDIERMEVLKDASACAIYGSRGSNGVILITTKGGVKGETTVTLDAYVG VKNSYKALNMMNSDQYYNFIMKAYENDASFQNSMKDKFTNQYQKGYNTNWWNEVTRTAFN QNYNLSIRKGTDNSRSSLSLGYVDDQGAIITTEFKRLSLKANLEYDINKFITVGANVNLA KIRKRDAGAIPSFDFIQKADPFTPVISPLVDPSSENYEYNKYAPTEWSYDPNPVAMLELP NRYNDIFNVFGNVFAHIKLYKGLSYRVQYSFERYHDTFKDFRPVYSSTFSEDNLANQESK 
YNKETQLNNNSAVTSNYLVEQRLNYNTTIGRHKLDAMVAMTYEKNSSEGINAFKRKALGN DEIYQILDAQTAGDNTSGGKETSSMLSYLGRINYVYDDRYLATVNFRADGSSRFAKRNRW GYFPSVSLGWRVSNEEFFKNLNIENTISNLKLRVGWGQNGNQRIDRDAPLTLIGTNNENQ WYFGNGYSQGYVPTYVGNADIKWETSQQTNVGLDMSFFKNSLDVSMDFYVKKTSDMLLNM PIPSFGAFPNSPFFNAGDLKNTGFEIVVNYRNQIGKDFNYNVGLNMSTYKTEVTKLTSEY LSGNTSRTYVGGPIGRFWGYKQIGIFQNQEEIDNYVDKNGTKIQPNAQPGDFKFAKLGES GELNDDDDRTFIGDPNPDLIYGFNLGFSYKNFDVSMAFQGTIGNDIWNVAKGSLASAGRQ NALADAYTKAWTKDGDLDAVYPRITNSDSNNNMRGSSFYVENGSYLRLQNMQIGYTLPSH ICQKSKLFSSCRFYVSGQNIFTLTGYSGLDPELGINNPLDMGVDTTRYPSSRTFTFGVNL QF >gi|222159342|gb|ACAB01000017.1| GENE 21 25692 - 27257 856 521 aa, chain - ## HITS:1 COG:ECs3866 KEGG:ns NR:ns ## COG: ECs3866 COG3436 # Protein_GI_number: 15833120 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 147 515 97 457 463 124 27.0 5e-28 MKKDEIIELLKEQIKGLRDDNNRLLGQVDALIKEVSSLKEALLQKGESLSKQQRLAKGLA KLVSNTSEQQQAPQPAMSEEERQKREAEKADKRKARKNNGAKRDMHYEMEEEEHVVYPDD PDFDINKARLFTTAPRICVRYECVPMRFIKHVYKIHTYTQDGRLFEGKTPVSAFLNSSYD GSFIAGLMELRYIQSLPVERIINYFEGHGFTLKKPTAHKLIEKASGLFENLYKCIRQTAL SDPYKAADETYYKILVPEKNSKGKGVRKGYLWVVVGINTRMIYLLYDDGSRSERVILNEL GSCKGIIQSDGYSPYRKLESDAYPNITRIPCLQHIKRKFIDCGENDPDAKRIVELINTLY QNEHKHKVGVDGWTVEQNLMHRKKYAPDILGEIKDVFDEIEERGDLLPKSELQEAITYLR NEWNAVVDIFNYGDTYLDNNIVERMNRYISLSRKNSLFFGSHKGAERGAILYTIALTCRM HKVNLFEYLTDVINRTAEWQPNTPLEKYRELLPDRWEKANG >gi|222159342|gb|ACAB01000017.1| GENE 22 27344 - 27694 104 116 aa, chain - ## HITS:1 COG:ECs3847 KEGG:ns NR:ns ## COG: ECs3847 COG3436 # Protein_GI_number: 15833101 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 113 1 110 115 62 33.0 1e-10 MYSLTSANRYYLYQGFVRMNLGIDGLFKIIRLEMKDLSPVSGDIFLFFGKNRQSVKILRW DGDGFLLYYKRLEGGSFELPTFNPHTGNYEISYQVLSFILNGVSLKSVRLRKRFRI >gi|222159342|gb|ACAB01000017.1| GENE 23 27682 - 28083 226 133 aa, chain - ## HITS:1 COG:no KEGG:BVU_0481 NR:ns ## 
KEGG: BVU_0481 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 133 1 134 134 236 86.0 3e-61 MDLENRFRRTWKRFQEVLKTDYNCSLADVCREQHTTFGGMSSWMSRRGYSVKQAKADVVR DYYGGVEPSPLSTTSPAFTQIAPAMSSEEEFSLAGITITFNSGTTISVKRATPGGVIKML LDYERKEGDPCIL >gi|222159342|gb|ACAB01000017.1| GENE 24 28162 - 29967 1170 601 aa, chain + ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 89 342 54 312 318 109 29.0 2e-23 MEIGNNPNIIFTQLPPGEYQLEVKSTNGDKVWGDNVYRLTIKVGYPWWLSVPALIIYAIL CIIIFYITKSVIKNRIRLSRQVLIAQVEKRHEQKIYESRLNFFTNVAHEFFTPLTLIYTP AQHLLEQDGLDGSTKKYLQIIKNNAERMQKLISELMEFRKTKSSNMDLNPENVDVKSLME YASNNYVDILKENKIDFKVEAHDTSEIYSDRNALEKIIFNLLSNAFKYTPRYGYIHVEIS QNEPDKTLYLLVRNSGKGLTEQQMAEIFDKYKIFDTPKLGNSVSNGIGLNLTKSLVELLG GQISVNSELGKSVELSVVIPSLPSDHSLVVTDEKDSLQKEKNGQDHLSPRKDTVILIVED EKNIRDLLKDVLFDYVIHEAKDGVEALKEIEHNHPDIIISDIVMPNMDGLSLINKLKSDL KTSYIPIIGISAKASVEDQINAFNHGADAYIVKPFHPRQVISTIENLLSRQMLLKDYFNS SMSSVKVKDGIVLHPEDEELIQNVTDFIKNNIDDELLSPSSIAEFVGVSKATLYRKFKEI IDKTPSEFVRGIRLEYAAKLLRTTKLTVSEIMFKCGFSNKSYFYREFLKQYGVSPKDYRN Q >gi|222159342|gb|ACAB01000017.1| GENE 25 30168 - 30908 380 246 aa, chain - ## HITS:1 COG:HP0092 KEGG:ns NR:ns ## COG: HP0092 COG0863 # Protein_GI_number: 15644722 # Func_class: L Replication, recombination and repair # Function: DNA modification methylase # Organism: Helicobacter pylori 26695 # 1 221 41 261 277 266 57.0 3e-71 MIFADPPYFLSNGGISVQSGKIVCVDKGDWDRSFGKESIDNFNYKWIADCRDKLKDNGTI WISGTYHNIFSVANQLTELGFKILNCITWVKTNPPPNISCRYFTYSAEYIIWARKNNNVS HYYNYDFMKMSNANHQMTDVWNLSAIEGWEKIHGKHPTQKPINLLARVIAASTRPGAWIL DPFAGSSTTGVTANLLKRRFLGIDIEQKYLELSILRRKDLDFPKVRADFLSRIYSINVHA ISLEDK >gi|222159342|gb|ACAB01000017.1| GENE 26 31017 - 32330 476 437 aa, chain - ## HITS:1 COG:no KEGG:Cthe_2471 NR:ns ## KEGG: Cthe_2471 # Name: not_defined # Def: hypothetical protein # 
Organism: C.thermocellum # Pathway: not_defined # 5 434 4 423 426 404 48.0 1e-111 MNNASKIDTNWTKIFNKYPILQTIKDEGKYIITAQQIKEFWEPRLMTKHDHSVNRPQIFI DNSLSILPITRGSYIIGAMDLYHDFALSDREKVFDSTDSMPVSSPAFIESIDFNNITSEA TAINSIYVSDILRDFLNEPTLLPTVNGRMGAGSFSFSVNCLSGEDTSLLVDVSNSQIEID GGYEGTNSLCLIEAKSSLSTDFLVRQLYYPFRLWTNKITKPIRPVFLLYSNGTYYLFEYA FEEIGNYNSLKRVQYKKYRIENDVITLQDILEIPKRIPVVKEPQIQFPQADSLERIINLC EIMNSDNKAFNKYGIAKIYSFDERQSDYYANAGVYLGLIQRYKKGSIYNYKLSNLGKQIF KLPLRSRHLRVAELILSHSPFRQTLKSYIDNANIPSKEEVIDIMRHCELYRMSENLYYRR SSTVISWINWVLNLRTE >gi|222159342|gb|ACAB01000017.1| GENE 27 32323 - 33168 121 281 aa, chain - ## HITS:1 COG:PH1032 KEGG:ns NR:ns ## COG: PH1032 COG0338 # Protein_GI_number: 14590870 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Pyrococcus horikoshii # 12 279 4 291 330 233 45.0 3e-61 MSSRRTNKLVVPFVKWVGGKRQILSEIEKYLPSKFATYYEPFIGGGALLFHIQPKNAIIN DFNAELINLYTVIKENPLELIADLKKHKNEADYFYEIRSLDRDREKFSALTNIERASRII FLNKTCYNGLFRVNSSGEFNTPFGRYKNPNIVNEITIKAVSLYLNNSNVLIMNGDFEEAL HNVSKGDFVYLDPPYDPLSNSSSFTGYTQGGFDRTEQERLRNVCDWLHAKKAKFLLSNSC TDLILDLYKNYNITEIKAIRSINSDAEKRGQVSEVLIRNYE >gi|222159342|gb|ACAB01000017.1| GENE 28 33322 - 33900 785 192 aa, chain - ## HITS:1 COG:CAC2575 KEGG:ns NR:ns ## COG: CAC2575 COG1592 # Protein_GI_number: 15895835 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 3 192 2 195 195 199 57.0 2e-51 MAKSIKGTQTEKNLLTSFAGESQARMRYTYFASAAKKEGYEQIAAIFTETADQEKEHAKR MFKFLEGGMVEITASYPAGVIGNTLENLRAAAAGEHEEWSLDYPHFADVAEQEGFPMIAA MYRNISIAEKGHEERYLAFVNNIENMTVFAKEGEVVWQCRNCGYIEIGKEAPEVCPACLH PQAYFEVKKENY >gi|222159342|gb|ACAB01000017.1| GENE 29 34150 - 35829 1843 559 aa, chain + ## HITS:1 COG:CT856 KEGG:ns NR:ns ## COG: CT856 COG0659 # Protein_GI_number: 15605592 # Func_class: P Inorganic ion transport and metabolism # Function: Sulfate permease and related transporters (MFS superfamily) # Organism: Chlamydia trachomatis # 8 559 13 560 567 407 42.0 
1e-113 MKLFEFKPKLVSCLKNYSKETFMADLMAGIIVGIVALPLAIAFGIASGVSPEKGIITAII AGFIISLLGGSKVQIGGPTGAFIVIIYGIIQQYGEAGLIVATLMAGVLLVLLGVFKLGAV IKFIPYPIIVGFTSGIAVTIFTTQIADIFGLSFGDEKVPGDFVGKWLIYFRHFDTINWWN TIVSIVSIIIIAITPKFSKKIPGSLIAIIVVTIAVYLMKTFGGIDCIQTIGDRFTIKSEL PDAVVPALDWEAIRELFPVAITIAVLGAIESLLSATVADGVIGDRHDSNTELIAQGAANI IAPLFGGIPATGAIARTMTNINNGGKTPIAGIIHAVVLLLILLFLMPLAQYIPMACLAGV LVIVSYNMSGWRVFRALLKNPKSDVTVLLITFFLTVIFDLTVAIEVGLVIACVLFMKRVM ETTEISVITDEIDPNKESDITVHEENLMIPKGIEVYEINGPYFFGIATKFEETMTQLGDR PKVRIIRMRKVPFIDSTGIHNLTALCEMSQKEKTTIILSGVNDNVHNVLEKAGFYELLGK ENICPNINVALDRAKNLVE >gi|222159342|gb|ACAB01000017.1| GENE 30 36034 - 38358 1833 774 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 28 614 29 594 757 357 35.0 6e-98 MKKEILIILVLLTSLISMHAANLTSPPISIVPRPAQVVPGEGNFTFSSQTLFSVENHEQA VIVRNFINLLKRSSGITSQLAIGGSEKSHVCFTTDSSLKSEAYQLAVTPEQIQVKASDIK GFFYALQTIRLLLPSAVEGNQQAKNIRWSIPAVTIQDEPRFGYRALMLDAARFFIPKENV LRIIDCMGMLKINTLHFHLTDDNGWRLEIKKYPRLTEVGAWRVNRMDVPFYFRRNAEPGE PTPIGGFYTQEDIKEMVSYAADRQIEIIPEIDMPAHSNAALAAYPQLACPVVDSYIGVIP GLGGSNSGVIYCAGNDSAFTFLQDVLDEVMAIFPSRYIHLGGDEAQKGYWKKCPLCQQRI KKEHLANEEDLQGYFMKRMSDYVRSKGREVIGWDELTNSSFLPEGVIIQGWRGLGTAALK AAEKGHQFIMTPARILYLIRYQGPQWFEPQTYFGNNTLKDIYDYEPVQADWKPEYASLLK GVQASMWTEFCNKPEDVDYLVFPRLAALAEVAWTQPEHKDWTAFLKGLDAYNEHLTAKGI VYARSMYNIQHTVTPNDGMLEVKLECIRPDMEIHYTTDGSEPTATSPLYEKIVPVKEALT LKGATFAYGRQMGKTLILPVRWNLATAKPVLGTNPTEKLLTNGIRGSLKYSDFEWCSWAD NDSVSFTIDLLKPEMLNTLTLGCITNNGMAIHKPASVRVEVSDDNSKFREAVVQSFTSEE IFREGNFIENLSLKLGGTQARYVRVTAKGPGVCPASHVRPGQTSRIFFDEVMLE >gi|222159342|gb|ACAB01000017.1| GENE 31 38357 - 38554 89 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVIKLLRVKMSKEMHKHLTSDILSINLLKAEHYISRIIDLHLPSMSHLRPNTCYSYKNES NQFPS >gi|222159342|gb|ACAB01000017.1| GENE 32 38457 - 40280 1392 607 aa, chain - ## HITS:1 COG:no KEGG:BT_1629 NR:ns ## KEGG: BT_1629 # Name: 
not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 595 1 573 593 449 43.0 1e-124 MKNIFYKLFSISLLGVCLMGCEDETKYLPLPQAVPFTMTVNNTTFVMGEKLVVDLDINPD KDGSEVLANEDFDIYFTAKTGTEDVSGVFKDFHNIVTFPKGEKSIRVEFPLKETGLEGSK NFNFVAFSRGYEIGNPSFPMKVSDYYRVNMAIQGNADNIVTEGDKFVLVASLDKPRAIPI VVSISAKAEDESSFENLPSELTIPAGALNAKTDPISLVMDGAKTGDKELTLNFTSNSEAN PMKSDKLVIMMTDLESLANPDMYDPTLVYEHPEYPFVSKGNKSKFDAWWGSDRSSSVEIT GKTGAKPGTDYASLTSHPNQKLAEEGWKFWSGVEFHFIQSSFGWSMPTKPNTSFGNYRHW SFNYIPTKSIQDVFAANRDKCSNVTEDGIYQLWVQAGSTPGDGLGASRDYGSAAVYIGRG GSGASTRFIHINPGMRIEMRVRVTGDRTGIVPVIELRDAPGGVFDKQIQRIDILRNTKGN LVTQGVFSTMEDGTKTSPLPKIGDYNIYWVELTTDGIKVGINGSTNVDVPSTSSWGFAPA DGLALDFLFAPADNRPDGWDSVLKAISEPKTDPNTPMMEIDWIRFYTNSTYSDEGVTYWA NAGQLFY >gi|222159342|gb|ACAB01000017.1| GENE 33 40324 - 42138 1425 604 aa, chain - ## HITS:1 COG:no KEGG:BT_1630 NR:ns ## KEGG: BT_1630 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 603 1 590 591 869 72.0 0 MKKIYLLAVALAGTLLTSCSDFLDKVPLVVPSSETFLTNQSAVTNYVNGLYIALPSAGAY GMGIMGEEKNSDNMVAMVYDRRMNGELQESIGGVTEWQKGYQNLRSVNYFLEYYKVPAAE ETAEVLSLKGEAYFFRAYWHYYLLTKFGSIPVMDKFWDGNATVGGLQIPARDRSAVAQFI LDDLKAAKELLYSRSKYQGLRICKEAAMVMAMRVALFEGTWEKYHKGTDFAAAEYKSDDF LQQVLTLGDELFQMGITLNTKENDKSVKNIDDAYGNLFNSKDLSATQEVLFWKKYSIADN LVHSLNTNLSSGMCDAEGPAGLSQSLVDNYLSVDGNFIDPTAAKFKDFNESFKDRDGRLL ATVMHSDCKYKSMTVKGAKKMLVKAYSEEDKDLINPPQIVSTGNQSNTTGYHIRMGIDTT FVSGQGETATPMIRYAEALLAYAEAAEELGKCDAAVLEKTIKPLRERAGVTYLAPSAIDP NFPDFGYTISANLQEIRRERRSELPLQGMRLDDLMRWRAHKLFQGKRGGGAYFGAGSVLY KSIDPKNEELAQILVDNNDYLDPLKNVLPRGFQFDAGRDYLLPIPPSEISLNKELTQNPG WRKN >gi|222159342|gb|ACAB01000017.1| GENE 34 42152 - 45508 2557 1118 aa, chain - ## HITS:1 COG:no KEGG:BT_1631 NR:ns ## KEGG: BT_1631 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1118 1 1119 1119 1729 76.0 0 MKKEKLMKIQERLVFLLIFLFLLPIGVIAQQKMITGQVVDEQGEAVIGASVMVKGAKTGT 
ITDLDGNFSVQGKAGQTLSISYVGYTPLDVKITKLQNNRFVLKENTEVLDEVVVVGMDTQ KRNTITAAVSVVKGDDIVNRPITDLTSALQGNVAGLNFASDAMGGAKGGELGTDIKFNIR GVGSINGGEPYVLVDGIEQSMQNVNPADVASISVLKDASAAAIYGARAAYGVVLVTTKSG NSEKAKVSYSGTVGFSSPIRTPQMMNSLEFAYYNNALYDAGASASGINRISDSVIEKIKG FMQNPYSEEFPGIDVSSNGEDWASAYYAQYGNTDWFKYYYKDKSIRHSHNLSVQGGSQKI NYYIGMGYVYQEGFLDHVKDDLSKYNLNTKLQAKPTDWLRFNLNNNITLQMIKRPMPNQS IIYNKIGSHRPSQVTHLPIDSKYNIPSWNETLFMQNSNYENNRLSDALSVSATATLMKGW DVTAEMKIRFDVENNNMILYENPQYETPLGTLKPGDATGRRQGFSYPNIDWKLMKFGSYA RGSLFNYYLSPNVSSMYIQQWGKHSFKAMAGFQMEMKEDSNEYMYKDGMMSNDIFSFANA SGNVIAGEDRTHWATMGAYLKLNWNYDNIYFFEFSGRYDGSSRFAKGNRWGFFPSFSAGY DIARTDYFQRLKLPVSQLKVRVSYGRLGNQNGAGLYDYLATMSLDALGSNGWLLPTASSA SAIKGTIAKTPKMVSPFITWEKVDNANLGLDLMLFDNRLSLEANIYQRMTRDMIGPAEAI PDISGIASENRSKINNATLRNRGWELSVNWHDQLKCGFSYGVGFNVFSYKAVVTKYNNPD GTIYNNHTGYGPNKGYYEGMDLGEIWGYQADDLFMTNREIDDYLRKVDMSALKPDDMWRR GDLKYIDSNNDGKIDGGDGTIYNHGDLKIIGNTTPKYSFGINLNVGYKGFEVSTLLQGVA KRDFPISGSSYMFSAGSYNFKEHLDYFSPETPNGFLPRLTSWKNNQDFYVNTGYNSTRYL LNAAYMRMKNLTVSYNFDKKLLKHIGISRLKLYVACDNLFTVTKLPKQFDPETLNQVNGM AGGTTTEAPGLTSPVVSNGNGMVYPLNRNFVFGLDFTF >gi|222159342|gb|ACAB01000017.1| GENE 35 45538 - 47190 1362 550 aa, chain - ## HITS:1 COG:no KEGG:BT_1632 NR:ns ## KEGG: BT_1632 # Name: not_defined # Def: chitinase # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 544 16 548 554 563 53.0 1e-159 MRNKFFLCLFVLVSLTACKDTKWVEVPGGMPPEGAITGQPGVDDPAEKTLINCQYIGGDP FELNRIPVENLIVSNDLIYLAARPYANGDIAFDLPINDATFTGGVSHEATYEGRSGVIKF DGTGKMNGKDGLLHSTEETFNKSFTFATYIFIDEWVPGAYIFKKTSGDKVLAALQLGANE QTVSWALGNASVTAGTYVDAKGLAPGAWHYLAVDYDGNKKQFRIFVDAWSSTLDKEIVLP TARADFYIGENFKGRLDETSFWSMAAGTGGKNGIKFDNWNMVAKVLAYWQYDDSTNPGKD SHTWLDRWKKIRETLAAQVGNDSRKLRLGFGSGQWRQMMSADNTRTAFANNLKSVLKEYE FDGVDLDIEWPTTETEYANYSATIVKIRQILGKDVCFSVSLHPVAYKISKNAIEALDFMS YQCYGPAVMRYSYEQFVKDAEMAVEYGIPADKLVLGVPFIGTTGVNGEQVSYSDFINGGL SDVALDEFTYNGKTYTLNGQNTIRKKAKYACEEGYRGIMSWGSDVSVNNEKSLLKAIQEE FVAYEINNPK Prediction of potential genes in microbial genomes Time: Wed May 18 01:12:23 2011 Seq name: 
gi|222159341|gb|ACAB01000018.1| Bacteroides sp. D1 cont1.18, whole genome shotgun sequence Length of sequence - 56825 bp Number of predicted genes - 35, with homology - 35 Number of transcription units - 16, operones - 7 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 146 - 307 86 ## gi|262405725|ref|ZP_06082275.1| conserved hypothetical protein - Term 135 - 169 2.5 2 2 Tu 1 . - CDS 356 - 2113 1329 ## BVU_3461 hypothetical protein - Prom 2185 - 2244 3.4 3 3 Tu 1 . - CDS 2250 - 2435 201 ## gi|298483253|ref|ZP_07001432.1| conserved hypothetical protein - Prom 2631 - 2690 3.7 + Prom 2361 - 2420 3.8 4 4 Op 1 . + CDS 2641 - 5088 2296 ## BT_3173 hypothetical protein 5 4 Op 2 . + CDS 5125 - 9069 2533 ## COG0642 Signal transduction histidine kinase + Term 9156 - 9200 4.1 - Term 9142 - 9187 8.1 6 5 Op 1 9/0.000 - CDS 9227 - 10591 397 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 7 5 Op 2 27/0.000 - CDS 10604 - 13783 2782 ## COG0841 Cation/multidrug efflux pump 8 5 Op 3 . - CDS 13798 - 14904 1225 ## COG0845 Membrane-fusion protein - Prom 15007 - 15066 7.2 + Prom 14966 - 15025 2.1 9 6 Tu 1 . + CDS 15128 - 16039 679 ## BT_2939 putative transcriptional regulator + Term 16268 - 16304 -0.8 - Term 15912 - 15964 3.3 10 7 Op 1 . - CDS 16127 - 17137 529 ## COG3550 Uncharacterized protein related to capsule biosynthesis enzymes 11 7 Op 2 . - CDS 17140 - 17463 328 ## BF0796 hypothetical protein 12 7 Op 3 . - CDS 17475 - 17669 189 ## gi|237716967|ref|ZP_04547448.1| conserved hypothetical protein - Prom 17708 - 17767 5.0 + Prom 17667 - 17726 5.8 13 8 Tu 1 . + CDS 17788 - 17982 159 ## gi|237716968|ref|ZP_04547449.1| conserved hypothetical protein - Term 18361 - 18435 10.0 14 9 Op 1 . - CDS 18478 - 20517 1849 ## gi|294806811|ref|ZP_06765637.1| putative lipoprotein 15 9 Op 2 . - CDS 20548 - 21021 569 ## gi|237716971|ref|ZP_04547452.1| conserved hypothetical protein 16 9 Op 3 . 
- CDS 21034 - 22215 1119 ## BT_2913 unsaturated glucuronylhydrolase 17 9 Op 4 . - CDS 22231 - 24162 1664 ## Phep_2654 hypothetical protein 18 9 Op 5 . - CDS 24187 - 26001 1740 ## Slin_2455 heparinase II/III family protein 19 9 Op 6 . - CDS 26022 - 27833 1910 ## BF0333 hypothetical protein 20 9 Op 7 . - CDS 27863 - 31135 3201 ## BF0387 hypothetical protein - Prom 31385 - 31444 7.3 + Prom 31094 - 31153 6.5 21 10 Op 1 . + CDS 31403 - 32836 1191 ## BT_3171 sialic acid-specific 9-O-acetylesterase 22 10 Op 2 . + CDS 32845 - 36873 2879 ## COG5002 Signal transduction histidine kinase + Term 36974 - 37038 8.1 - Term 36865 - 36914 11.7 23 11 Tu 1 . - CDS 37087 - 37860 1020 ## COG0731 Fe-S oxidoreductases - Prom 37881 - 37940 9.6 - Term 37884 - 37921 -0.4 24 12 Op 1 . - CDS 37976 - 40087 1067 ## Cpin_1424 hypothetical protein 25 12 Op 2 . - CDS 40113 - 42461 1235 ## Cpin_6153 hypothetical protein 26 12 Op 3 . - CDS 42477 - 44150 1124 ## gi|237716983|ref|ZP_04547464.1| conserved hypothetical protein 27 12 Op 4 . - CDS 44168 - 44851 646 ## gi|237716984|ref|ZP_04547465.1| conserved hypothetical protein 28 12 Op 5 . - CDS 44861 - 46384 1141 ## Cpin_1098 hypothetical protein 29 12 Op 6 . - CDS 46401 - 49754 2941 ## Cpin_5147 TonB-dependent receptor plug - Prom 49797 - 49856 7.3 30 13 Op 1 . - CDS 49888 - 51066 675 ## COG3712 Fe2+-dicitrate sensor, membrane component 31 13 Op 2 . - CDS 51098 - 51607 442 ## BT_3037 RNA polymerase ECF-type sigma factor 32 13 Op 3 . - CDS 51651 - 53699 1154 ## BT_3167 hypothetical protein - Prom 53788 - 53847 5.8 + Prom 53666 - 53725 6.0 33 14 Tu 1 . + CDS 53844 - 54422 486 ## BT_3166 hypothetical protein - Term 54713 - 54747 -0.3 34 15 Tu 1 . - CDS 54783 - 55310 616 ## COG0566 rRNA methylases - Prom 55403 - 55462 4.9 + Prom 55713 - 55772 7.0 35 16 Tu 1 . 
+ CDS 55811 - 56749 841 ## COG0379 Quinolinate synthase Predicted protein(s) >gi|222159341|gb|ACAB01000018.1| GENE 1 146 - 307 86 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262405725|ref|ZP_06082275.1| ## NR: gi|262405725|ref|ZP_06082275.1| conserved hypothetical protein [Bacteroides sp. 2_1_22] # 1 53 10 62 62 90 100.0 2e-17 MVKLLHFTRKLLIHRRDREKVKGEMERWKINIQWGNLFDTVMIQYYTEMNITL >gi|222159341|gb|ACAB01000018.1| GENE 2 356 - 2113 1329 585 aa, chain - ## HITS:1 COG:no KEGG:BVU_3461 NR:ns ## KEGG: BVU_3461 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 583 1 582 590 970 85.0 0 MEYVSSERKLLPYGMMNFADIRLDNYYYVDKTSFIPVIEQSDRFFFFIRPRRFGKSLTLN MLQHYYDVRTRDKFDALFGDLYIGKHPTRDRNSYLVLYLNFSGISGELHNYRQGLDAHCN TSFDYFCDIYAEYLPKGIKEVLNEKAGAVEQLDYLYHQCELAGQQIYLFIDEYDHFTNAI LSDAESIHRYTEETHKEGYLRAFFNRVKAGTYSSIKRCFITGVSPVTMDDLTSGFNIGTN YSLTPKFNAMMGFTEDEVREMLTYYSTKAPFHHTVDELIELMKPWYDNYCFAQECYDQPT LYNSNMVLYFVKNYIDNNGKAPRNMIESNIRIDYEKLRMLIRKDKEFAHDASIIQTLVSQ GYITGELKDGFPAANIVDSDNFVSLLYYFGMLTVSGTFEGKTKLIIPNQVVREQLYTYLL NTYNEADLSFSNHEKDELSSALAYRGAWQSYFDYIADCLKRYASQRDKQKGESFVHGFTL AMTAQNRFYRPISEADTQSGYVDIFLSPMLEIYPDMSHSYIIELKYARYKDPESRVEELR AEAIAQTNRYADTDRVKNAIGTTQLHKIVVVYKGMEMRVCEEVNS >gi|222159341|gb|ACAB01000018.1| GENE 3 2250 - 2435 201 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|298483253|ref|ZP_07001432.1| ## NR: gi|298483253|ref|ZP_07001432.1| conserved hypothetical protein [Bacteroides sp. 
D22] # 1 61 33 93 93 100 98.0 3e-20 MMSGLEINTTFVSYIKTSMYLTGMSREELYAEAYKDLIAISTQMNMFIDKVCKKATMMSP F >gi|222159341|gb|ACAB01000018.1| GENE 4 2641 - 5088 2296 815 aa, chain + ## HITS:1 COG:no KEGG:BT_3173 NR:ns ## KEGG: BT_3173 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 815 1 815 815 1516 87.0 0 MKKLCPLILCTLFSLSAASIETVDYTQGLSVWFDTPNNLDGQAIWLRANGTGANPDKAWE SRSLPIGNGSLGANIMGSISAERITLNEKTLWKGGPNTAKGAEYYWNVNKQSSGVLKEIR QAFLDGDSKKAGYLTQENFNGLGAYEEKDETPFRFGAFTTMGELYVETGLSEINMSNYRR ILSLDSAMAVVQFYKDGIRYQRKYFISYPDSVMVMKFTADKGGKQNLVLSYCPNNEAKSH LEADGNDGLVYTGVLNNNGMKFAFRIKAIHKGGTLKAENDRIIVKDADEVVFLLTADTDY KMNFAPDFKDPKAYVGNDPSQTTLAMMDNALKKGYDELYRNHEADYTALFNRVRFEINPE IGTPNLPTYKRLASYKKGVPDYQLEQLYYQFGRYLLIASSRPGNMPANLQGLWHNNTDGP WRVDYHNNINIQMNYWPACPTNLSECTWPLIDFIRSLVKPGEKTAQAYFNARGWTASISA NIFGFTAPLSSKSMAWNLNPTVGPWLATHIWEYYDYTRDTKFLKEIGYDLIKSSAQFAVD HLWHKPDGTYTAAPSTSPEHGPVDEGVTFAHAVVREILLDAIQASKVLGTDAKERKQWEN VLTKLVPYRIGRYGQLLEWSTDIDDPKDEHRHVNHLFGLHPGHTISPVTTPELAQAARVV LEHRGDGATGWSMGWKLNQWARLQDGNHAYKLYGNLLKNGTLDNLWDTHAPFQIDGNFGG TAGITEMLLQSHMGFIQLLPALPDAWANGSISGICAKGNFEVSISWKEGQLEKAIIHSKS GIPCNVRYGDKTLKFKTVKGKKYEITLKGDKLAVL >gi|222159341|gb|ACAB01000018.1| GENE 5 5125 - 9069 2533 1314 aa, chain + ## HITS:1 COG:slr1393_3 KEGG:ns NR:ns ## COG: slr1393_3 COG0642 # Protein_GI_number: 16329802 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 792 1019 45 290 301 127 33.0 1e-28 MKTKYAVLSSLFFIIQLVGVLPLRAEHYYFKQISLKEGLPSNVRCILRDEQGFVWIGTKS GLGKFDGHELKRYKHQANDPNSLLHNLIYQIAEDKQHNIWVLTEKGIARYQQQSNDFTFP TDEDGKNITAYSFCLVPDGILFGGKDRIYKYSYNDSSLRLLQYFNANSFKINALSMWDSK TLLCCSRWTGIYLVDLHTGEHRRPPFDCGPEITTMMTDSRKRIWIAPYSGGLRCYSYDGK LLASYSTRNSALSNDIVLSLAEREGQLWIGTDGGGINILTSETGEISQLEYIPGRENYSL PANSILCLHNDHNNNIWAGSTCNGLISIREVFMKTYTDVVPGNDRGLSNSTVRSLYRQST DSIWIGTDGGGINLFNPRTEKFTHYLSTWNDKIAFISGFTPGKLLISLFSKGVFIFNPAT 
GEKQPFTIVDKETTTQLCNRGKSVNLYQNTPNTVLLLGDHVYQYHLKEKKFDKVTGEEGK SIVGTLVPFKHENNRTYINDFKHIYELHDGDSLLQTTFECYQDTVISSASHDEHGDFWIG SNFGLIHYNPVSQKQTHILTNLFTEVNMVQCDQRGKVWIGADNLLFAWLIQEQKFVLFGE SNGAIQNEYLPNARLVNNEGDVYIGGVKGMLRIDGQLLLNTSEMPELQLLDIIINGEPAQ NKLYSHPAAISVPWDSNITIRIMSKEEDIFRKKVYRYRIEGLNDQYIESYQPELVIRSLP PGNYKIMASCTAKDGSWIPNQKILELTVLPPWYRTWWFILGCAILIATIIIETFRRTLKR NQEKLKWAMKEHEKQMYEEKVRFLINISHELRTPLTLIHAPLNQILKSLSSGDSQYLPLK AIYRQAQRMKNLINMVLDVRKMEVGESKLQIQPHPLNQWIEHVSQDFISEGEAKNVHIRY QLDPRIEKVSFDKDKCEIILSNLLINALKHSPQDTEITIASELLPEENRVRISVIDQGCG LKQVDTHKLFTRFYQGTGEQSGTGIGLSYSKILVELHGGSIGAQDNEETGASFFFELPLR QEPEEIICEPKAYLNELMMDDNSGQPSAKEVFDTTPYTILVVDDNPDMTDFLKRTFEGYF KRIIIAGDGVEALQLVRSHVPDIIVSDVMMPRMNGYKLCKTIKEDINISHIPVILLTARD DRQSQISGYKNGADGYLSKPFEVEMLMELIRNRLKNREYTKKRYLNAGLIPTPEESTISL TDETFLIKLNSIIQENIENSNLGIQFICQEVGLGRSSLYNKLKALTGMGANEYISKLRIE KAITLISGTDMPFSEIAEKTGFTTPSYFSTAFKQYTGETPSQYKERKKKGLAGK >gi|222159341|gb|ACAB01000018.1| GENE 6 9227 - 10591 397 454 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 41 452 35 455 460 157 28 1e-37 MKKNIITLVAISLSLSGCGIYTKYEPATVVPDDLYGGEVVTEDTAGLGNMDWRELFTDPY LQSLIEVGLQTNTDYQSAQLRVEEAQAVLMSAKLAFLPAFALAPQGTVSSFDTQKATQTY SLPVTASWELDVFGRMRNSKKQAKVLYAQSEDYRQAVRTQLIAGIANTYYTLLMLDEQLD ISRQTEEAWKETVASTRALMNAGLANESAVSQMEAAYYQVQGSVLDLQQQINQVENSLAL LLAETPRNYERGVLAKQQFPADFSIGIPVRMLSSRPDVRSAERTLEAAFYGTNAARSAFY PSITLSGSAGWTNSAGAMIINPGKFLASAVGSLTQPLFNRGQVVAQYRIARAQQEEAALG FQQTLLNAGSEVNDALIAYQTSQGKRILLDKQITSLQTALRSTTLLMEHGNTTYLEVLTS RQSLLSAQLSQTANHFTEIQSLINLYRALGGGQE >gi|222159341|gb|ACAB01000018.1| GENE 7 10604 - 13783 2782 1059 aa, chain - ## HITS:1 COG:BMEI1629 KEGG:ns NR:ns ## COG: BMEI1629 COG0841 # Protein_GI_number: 17987912 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Brucella melitensis # 6 1042 5 1035 1051 729 40.0 0 MKGNVFIKRPVMAISISVLILAIGLISLFTLPVEQYPDIAPPTVYVTASYTGADAEAVMN 
SVIMPLEESINGVEDMMYVSSSASNAGLAIIQVYFKQGTDPDMAAVNVQNRVAKAQGLLP AEVTKVGVSTMKRQTSFLQIGALVCTDGRYDQTFLANYLDINVIPQIKRIEGVGDVMELG DTYSMRIWLKPERMAQYGLVPSDITAILGEQNIEAPTGSLGESSKNVFQFTMKYRGRLKS VEEFRNTVVRSREDGSILRLQDVAEVELGTMTYSFRSEMDSQPAVLYMIFQTAGSNATAV NKEITTQIERMEKNLPEGTEFVTMMSSNDFLFASIHNVVETLIIAIILVILVVYFFLQDL KSTLIPSISIIVSLVGTFACLVAAGFSLNILTLFALVLAIGTVVDDAIVVVEAVQSKFDA GYKSPYLATKDAMGDVTMAIVSCTCVFMAVFIPVTFMGGTSGVFYTQFGITMATAVGISM ISALTLCPALCAIMMRPSDGTKSAKSINGRVRAAYNASFNAVLGKYKRGVMFFIRHRWMV WTSLAVAVALLVYLMSTTKTGLVPQEDQGVIMVNVSISPGSTLEETTKVMDRLENILKDT PEIEHYARVAGYGLISGQGTSYGTIIIRLKDWSERKGKEHSSDAVVSRLNGQFQAIKEAQ VFSFQPAMIPGYGMGNSLELNLQDMTGGELATFYEAAIQFLGALNERPEVAMAYTSYAIN FPQISVEVDAAKCKRAGISPSAVLDAVGSYCGGAYISNYNQYGKVYRVMMQASPEYRLDE QALNNMFVRNGTQMAPVSQFVTLKQVLGPETANRFNLYSTITANVNPADGYSSGEVQKVI EEVAAQSLPAGYGYEYGGMAREEASSGGAQTVFIYAICIFLIYLILACLYESFLIPFAVI FSVPFGLMGSFLFAKILGLENNIYLQTGVIMLIGLLAKTAILITEYAIERRRKGMGIVES AYSAAQVRLRPILMTVLTMIFGMLPLMFSSGAGANGNSSLGTGVVGGMAVGTLALLFVVP VFYIIFEFLQEKIRKPMEEEPDVQVLLEKEKSEVERERK >gi|222159341|gb|ACAB01000018.1| GENE 8 13798 - 14904 1225 368 aa, chain - ## HITS:1 COG:Cj0367c KEGG:ns NR:ns ## COG: Cj0367c COG0845 # Protein_GI_number: 15791734 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Campylobacter jejuni # 15 368 12 362 367 140 29.0 4e-33 MITVSKKWIRLIGIVGCTVWMASCKQASDAGVKSSSYAVMQIEAVDKEFSSSYSASIRGR QDIDIYPQVSGTIEKLCVTEGQKVRRGQSLFIIDQVPYKAALKTATANVEAARAALGTAE LTYKSNKELYAQKVVSEFSLKTAENSYLTAKAQLSQAEAQEISARNNLSYTEVKSPSDGV VGALPYRAGALVSANMPYPLTTVSDNSDMYVYFSMTENQLLALTRQYGDMDEALKNMPEV ELRLNDNSVYDKKGVIESISGVIDRQTGTVVARVVFPNESRLLHSGASGTVVVPSIYKDC IAIPQTATVKMQDKTIVYKVVDGKAVSTLITVAENNDGREYVVLDGLKAGDEIVSEGAGL VREGTQVK >gi|222159341|gb|ACAB01000018.1| GENE 9 15128 - 16039 679 303 aa, chain + ## HITS:1 COG:no KEGG:BT_2939 NR:ns ## KEGG: BT_2939 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 303 1 303 303 538 83.0 1e-152 
MEKENIITVDIEDFKDSQHVLDYIDDDFAIVNSLEGTPYSNDTIKLNCFLIAVCIEGCIQ LDVNYRTYKLQAGELLLGLPNTIISHTMLSPKYKVRLAGFSTRFLQRIIKMEKETWNTAI HIHNNPVKSVDNGEDQTVFGFYRDLIIAKINDEPHCYHKEVIQHLFSAIFCEMMGQLHKE IEASGNMEGSKEGIKQVNYILRKFMELLSKDKGMHRSVSYFANELCYTPKHFSKVIKQAC GRTPLDLINETTVEHIKYRLKRSEKSIKEIAEEFNFPNQSFFGKYVKAHLGTSPANYRNR KEE >gi|222159341|gb|ACAB01000018.1| GENE 10 16127 - 17137 529 336 aa, chain - ## HITS:1 COG:SMa0592 KEGG:ns NR:ns ## COG: SMa0592 COG3550 # Protein_GI_number: 16262763 # Func_class: R General function prediction only # Function: Uncharacterized protein related to capsule biosynthesis enzymes # Organism: Sinorhizobium meliloti # 61 307 123 361 390 97 37.0 3e-20 MGTDIHICPSLLRETGGESGYSPKALKQLSDGKNISCELPYRHFDDNVGIDLFNNNSKRI SVSGVQIKYSLVADDGILRLTKEGEQGEFILKPVPNNLRNKEFCPANEHLTMQIAAQVYG IPTAPNGLCFFQDGTPAYFVRRFDLSEKGKLQKEDFASLAGLTRKNGGSDYKYDNLAYEE FAGIIDKYSSVPQVDKLRFFELILFNFVYSNGDAHLKNFSLLEEKKGRFRLSPAYDLLNT HLHINDGIFAMSKGLFVNPEPDYFGGASAVTGKTFYHFGLKIGLPERLVNSCLEKYSKIY ELTDQLIEHSFLSKELKKQYRLMYKSRVGSYLSVIE >gi|222159341|gb|ACAB01000018.1| GENE 11 17140 - 17463 328 107 aa, chain - ## HITS:1 COG:no KEGG:BF0796 NR:ns ## KEGG: BF0796 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 3 107 2 106 109 127 60.0 1e-28 MSRSAKVYIKGVYAGLLTEIDREHYSFCYDTDYYNNPQLPAVSLTMPKTQQEYTSSYLFP VFFNMTSEGDNRIIQARNLHIDEEDDFGILLATAHTDTIGAITIKKM >gi|222159341|gb|ACAB01000018.1| GENE 12 17475 - 17669 189 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716967|ref|ZP_04547448.1| ## NR: gi|237716967|ref|ZP_04547448.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 64 1 64 64 113 100.0 4e-24 MDLIEIGKHIRERRKELGLDQSTLATLAGIGINALVRLERGSGNPRFDVLFNVLKTLGLS IHIQ >gi|222159341|gb|ACAB01000018.1| GENE 13 17788 - 17982 159 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716968|ref|ZP_04547449.1| ## NR: gi|237716968|ref|ZP_04547449.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 64 1 64 64 125 100.0 8e-28 MLFTIYGDLKEIHAYKGKTLIAERSANMKNCIFYLSCSLYFEPFLALHEKRPPFRVTDFA NNTK >gi|222159341|gb|ACAB01000018.1| GENE 14 18478 - 20517 1849 679 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294806811|ref|ZP_06765637.1| ## NR: gi|294806811|ref|ZP_06765637.1| putative lipoprotein [Bacteroides xylanisolvens SD CC 1b] # 1 679 1 679 679 1309 100.0 0 MKNRLFAIGAMILMMTSCTQDELISSDNGKEPATAGSCLTLVGLSSPQTRISIGDKTGDV YPVIWSEGDALGVFSRTAGTDINNVQSLLSDESIGQNSGVFTSDDVKMAEEGATELLIYY PYRASTELAENGNKITSTLSVEQEQSRPGDSHHIGKYGFAFAKATVSGPDMLAKFTLNHA MAYVKFSISSQELSTYKLKSVSLYDKETKTPLSGVFTADLDTDELTYGPDVKPYATVSLT TPELLASAQDIYLTTYPADLSGKEVYIVITLENDQQTVTIPILKEGKQLKANAVNTIAVN NLKLSDNSCEWYEPVETRLLAGGWAYGESNCLLTNISTSGVSNTISVKARGNFMEVEEPK YAKTILGCDLNKDHKMIAVNGSTTDISPIGSDYNITINAYKVSGGYDGGCGQVAIYGADQ TTVIWSFIIWMTPTPAEHPYGNTGYVVLDRNLGTYMTCEGDNWKQNGVYFQWGRPTPVGW SGSVGTNIPTEATNVRFSIENPRALLYTNNVENTKSDWYLGAWTGARTDRKDDFWGNPNE SSTYLNPSDGHKSIYDPCPKGYRVVSPRVLDEIEQKGEFVKQSATAVFKYCYDGTNYAYW PLAGCKWGSNGGNNGNNTGLDTAKGAACYWSNSSASSYGNDKDQGATSLYYKVSDKTWTH SSGRSHAFSVRCMKDTENR >gi|222159341|gb|ACAB01000018.1| GENE 15 20548 - 21021 569 157 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716971|ref|ZP_04547452.1| ## NR: gi|237716971|ref|ZP_04547452.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 157 1 157 157 312 100.0 4e-84 MIMKNLHISMAFIFVLLLITASCSDKDDSAPRVIPTMNLTVSEIADNTALITSEQQTGTT FGAKVIEFYPVADIGFDYNIEVKLVKFVEENGEPVSLPYTRKITEGLRPGVNYISAIIAY NAEGRAVCSAFQTWKASGTEGAWSDGGSAGDLEENEW >gi|222159341|gb|ACAB01000018.1| GENE 16 21034 - 22215 1119 393 aa, chain - ## HITS:1 COG:no KEGG:BT_2913 NR:ns ## KEGG: BT_2913 # Name: not_defined # Def: unsaturated glucuronylhydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 389 7 393 402 404 52.0 1e-111 MRILLISILGLLTLASCTKQEPMEALIDRVFTVAEQQYTTMDTRLTEKTLPRTLSADGEF VPSNIYWWCSGFYPGSLWYIYEYTGNEAVKTLAEKNTLKLDSIQYVTRDHDVGFQLNCSY GNAFRLTGNEAYKQVLYQGAKSLSTRFNPIAGVIRSWDFVRKGCDWKFPVIIDNMMNLEL LLSMSKAYADDSLQNIACTHANTTIQHHFRDDYSTYHLVDYDPETGAVRGKQTVQGFSDD SSWSRGQAWALYSYTMMFRLTGYQNYLLQAGHIADMLLRRLPADGIPYWDFDAPVEEQTY RDASAAAIMASAFIELSRYIPGTEAKESYFAMAEKQLRTLASKEYLAEPGTNECFILKHS VGALPDKSEVDVPLTYADYYFLEALLRYKNLQK >gi|222159341|gb|ACAB01000018.1| GENE 17 22231 - 24162 1664 643 aa, chain - ## HITS:1 COG:no KEGG:Phep_2654 NR:ns ## KEGG: Phep_2654 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 48 640 34 628 847 477 42.0 1e-133 MKTILFKTIIIAFFVGTAVSCSDEDENRFRPGSTHEKPNPTEPEGGLDYSKLTADNHPRL LMNAEAFTALKAKVDANSSANLTLLHNTIMSVCNSKGMNATALTYKLDASNKRILDVSRD ALLRIFTCAYAYRMTGDTKYLTKAETDMNAVCNFPDWNSKRHFLDVGEMATAVAFGYDWL YNELSAATRTKAANALLKFAFQQAQNKNWNLNFYEATNNWNQVCNGGLVCAALASYENNP SEAKDMIEKALESNKPALEVMYSPDGNYPEGSGYWCYGTLYQVLMLAALNSTLGTDNGLS DTPGFSKTAEYMLYMTGLNSKFFNYSDCAPSSTAALASWWFADKYSNPSLLYNELKMLKN GEYASCAENRLLPMIMAFANNLNLDAISAPSNKLWSGKGETPVVMVHTDWTYTDTDKYLG IKGGKAGSSHGHMDAGSFVYDAYGVRWSMDFGLQSYTTLESKLSALGGNLWDMGQNSMRW DVFRLNNLNHSTISINDARHRVNGVATLTTTINTATELGATFDLTEVVSDQAASATRTVK IVNDKDLVVMDEIKARTDKSAKVRWCMVTPAVPTVESNRIVLTNGSKVMYLTASGSVKPT YKQWSTTSENSYDQANPGTYMVGFEATVTANQTATFTTTLSPK >gi|222159341|gb|ACAB01000018.1| GENE 18 24187 - 26001 1740 604 aa, chain - ## HITS:1 COG:no KEGG:Slin_2455 NR:ns ## KEGG: Slin_2455 # Name: not_defined # Def: heparinase II/III family protein # Organism: 
S.linguale # Pathway: not_defined # 28 602 34 609 628 589 48.0 1e-166 MKKILLLLLIFVSGYTGVVAQQFDYGKIAPHPRLLLPAGGEEAIRKAIAEYPPLATVHQR IMELCDRTLTEQPVERIKEGKRLLAISRIALKRIYYLSYAYRMTGDKKYAHRAEQEMLAV SRFTDWNPTHFLDVGEMVMALAIGYDWLYDSLQPDTRRVVREAIIAKGFDAAKNTRHAWF YTAKNNWNSVCNSGLAYGALALFEEIPEVSKGIIEKCMETNPKAMVGYGPDGGYPEGFGY WGYGTSFQVMLIAALESAFGTDNGLSQAPGFMESARFMQYMTAPGGDCFCFSDSPVEAEC NMMMFWFAGKAKDLSLLWIERQYLDRPDMQFAEDRLLPSLLVFCSQLDLKHIGKPKRNFW FSRGDTPVFIYRGGWDSKEDTYLGVKGGSPSTSHAHMDAGSFIFERDGVRWAMDLGMQSY ITLESKGVDLWNMSQNGQRWEVFRLSNIAHNTLTINGERHLVKSNAPITRTFESKKQKGA EVDLSSVFANSVKKVVRTVILDQKDHLEVTDRLETGDKEAAVSWIMVTPAEAKITGKNRM ELTKDGQRMLLTVDADTEVEMKTWSNVPPHEYDFRNPGTIRVGFETVIPANRASQLKVRL IPLK >gi|222159341|gb|ACAB01000018.1| GENE 19 26022 - 27833 1910 603 aa, chain - ## HITS:1 COG:no KEGG:BF0333 NR:ns ## KEGG: BF0333 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 601 4 614 615 311 33.0 6e-83 MKKILMILTMALAATSCMDILDVAPEDQIASENMWTTEELADKGMAGLYFPFYATQLSST QLRRADGLNRQGIEAMSFATDYYSNNYPVELLSLATKPANDFQVWYEWKFCYTIIHACND AIANLHKADMSANKLARYQCEARFLRAWAYNRLNMLYQGVPVYLEPINNEDCTRGQSSVD EVWQVILDDLTYCINNPDFPNNTLNENYGRPSKGAAYALRGMVYMWKKQYKEAGNDFKEV ETCGYGLWTGEYADFFKYENEKDKEMIFSLQFSEETGYCDNIQQMTGARDTYDGWTEIKP SADFVDYYKNADGSDFKWSEVDGLQDWDLLTPKQREIFFCRDGLESMSSQKNALIKRVGE DIYQKYYLNSGNEARIKKAYNSRDPRLQQTVVTPYVPVDCYKPNYAGDANQIGKQLRWPL KEQGTNGGDFWLDKRTSAFYCYRKYNEFEKGRLISRSRCHTDWPLIRYTDVLLQYAEALA QTDQMGEAIRLVNKVRTRAHMPALTEGGSGPCAVNGKEDMLERIRYERRVEFCLEGINFF DEVRWGTYKETKFQGKDINGGKSWWGELAEYNWYYTDYMWPWTAPISETQKNPNLTKRSG WAY >gi|222159341|gb|ACAB01000018.1| GENE 20 27863 - 31135 3201 1090 aa, chain - ## HITS:1 COG:no KEGG:BF0387 NR:ns ## KEGG: BF0387 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 24 1090 50 1117 1117 796 42.0 0 MKKKHYLLFLLASLFLLTDVVWAQTSATVSGVVIDENGETLPGVSVVEVGTTNGVLTDLN GHYTLKTTSAKPSVSFSYIGYQTTTLPLNGRTKLDVQMKVETKILDEVVVVGYGVQKKVN 
LTGSVTSINFADQTEGRPIMSVSSALSGLAAGMNVTQASGQPGSDGATIRVRGNGTFNTN SPLVLVDGIEWSMDNVNPNDIESISVLKDAASTAIYGTRAANGVILITTKSGKGKPQISY SYSGVVQMPYNNLSFVSDYARYMGLVNEACDNVNTKGIFSQESIDRWRAASADPNGLNEY GVPNYVAYPNTDWFDEVFDTGYSQEHNLSVSGSSEKVKYMLSLGYLDNQGVMNRWNLDSS TQKINFRTNLEAKIVKWMTVGTRLYGQKQDYGMANISNGFKYLYQTTPGVYPGEPNYWGR AALASEESSNANNIFGQMAGATGFNTVWRLNASVYGIITPYKGLNIEGTFNYSPTFTDKS SYSRQNGYWDYVTDQRVSESALENASITNTSARTWRQSAEILVRYNTTIKKDHDLGALLG YSAQEYYSKSFAVSRKGATDWTLNELSTYETLVSSSSSAPAKWGLLSYFGRVNYGYKGRY LFEANLRADASSRFGVNQRWGYFPSFSGGWRISEESFMQGASDYLSNLKLRVSWGKTGNN STGNYDWQANYATGNVVIDGEGTKGLVRKKLSNDKLHWESTATTDIGLDFGFFNHRLTGE IDYYNKYTSDILYHPELYLSMGVVGSAPENLGEVRNRGVEFTLNWNDRIGKDFEYRVGMN FSFNANKVMKFKGELQKYWTYDAQGNKVSYVNNFSDVSESGFGGYICEGRQLGETYMYKV YRGSGEGYTGGAVDIHAGPKDGMIRTKGDMVWVQAMIDSGYSFGGMKTVAKDQLWYGDIL YADSNGDMNYGDTNDRDFSGHTSVPKFNLGFNCAFSYKNIDFSMLWSGAFGHYLNWNTDY YNSTLVSHGYGIIEHIADNHYFFDPSNPDDPRTNQWGKYPRLTYGTTYNNRIQSDWNEYK GDYFKLKNIQIGYTLPQRISSKFFVNKLRAFVSMDNILTITSYPGLDPEIGTAIGYPLMR QISFGGQITF >gi|222159341|gb|ACAB01000018.1| GENE 21 31403 - 32836 1191 477 aa, chain + ## HITS:1 COG:no KEGG:BT_3171 NR:ns ## KEGG: BT_3171 # Name: not_defined # Def: sialic acid-specific 9-O-acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 477 1 477 477 810 80.0 0 MKKHFLLLILSLLFLPIAQGKVKLPAMMGDHMVLQQNSSVKLWGWADGKKVTVTTSWNNR TYQASTDKDGAWLVKVDTPEGGYTPYSITISDGTPVTLSDILIGEVWICSGQSNMEMRMM GNAAQPIDNSLETLLNSGNYRDRIRFITVPRTNDTERRTDFEKRKWEVSSPETTIDCSAA AYFFARQLTESLHLPVGLVINSWGGSAIEAWMDEPTLKTVEGMNIEAAKNPKRGVHQRLE CLYNSMLWPVKNFTAKGFLWYQGESNISNYQFYAPMMTAMVQLWRNVWEAPDMPFYYVQI APYKYENSSNTGAALLREAQMEALKTIPNSGMVPTTDIGDEFCIHPPQKDVVGLRLATLA LTKTYGIRRLPSNGPTMTKVDYADKKAIVTFDNAPVGLFPTFSELEGFEIAGADKKFYPA KAKIVGRTNTVEVWSEEVAQPVAVRYAFRNYVGNITLRNTFGLSAFPFRTDTWDDVK >gi|222159341|gb|ACAB01000018.1| GENE 22 32845 - 36873 2879 1342 aa, chain + ## HITS:1 COG:BS_yycG KEGG:ns NR:ns ## COG: BS_yycG COG5002 # Protein_GI_number: 16081092 # Func_class: T Signal transduction mechanisms # Function: Signal transduction 
histidine kinase # Organism: Bacillus subtilis # 815 1047 369 603 611 128 29.0 1e-28 MKRSILSICILLTSVSLVFANIYKKYDVRSGLSGNCVRSILQDSIGYMWFATQDGLNRFN GIEFTNYGHSSENGGNSYMNIVTICRHQDNNQIWVASTEKLYLFDSWEEKFSVFDKQTED GVSVNSVFGMAYDNDGQLWIGTTNGLFVYNEKKGTLRQYLHSLSDPHSLPDNHIWVIYND SFGTIWIGTRNGLAKYNQRTDNFTGYISEGTSFGRPACNEIISLMESSQGVLWAGTWYGG LARFNKETGQFRYYFGEGDTLTIPRIRTLFQRTANSFYLGSDDGLYTFNTTTGECLPTDD GQNKESIYACYQDREGGIWIGTYFSGVSYLSPKHKDIEWYYPNGTENSLSGNVISQFCED PDGNIWIATEDGGLNLFDPRTKKFKNHLLRSSNPNIGYHNIHALLYNEGKLWIGSFSRGL YILDTQTGKMKNYRHNRANPHSIPNDHIYSIYQTKDGSIYLGTLSGFCRYDPESDSFRTL EPLSHIFIYDMVEDQHGDMWLASKRDGIWRYNRQTGKLHNYRNDPVNPDSPCSNWVIRVY IDHKQHLWFCTEGGGICRYHYQEDRFENFSTKENLPNNIIYGILDDQSGNYWLSSNRGLI RYEPQNKRAQLYTIEDGLQSNQFNFRSSLQASDGKFYFGGVNGFNCFYPFKLSINKVRPT ASISAVYMHSPDDKVSLSKRIPALSGQVTIPYQVVSFDIAFESLSYVAPSKNLYAYKLDG IHKEWIYTDKHNVSFLNLPPGEYTFRVKASNNDEYWSNDDCCLHIEILPPPWKTIYAKIF YLLIACGLAYFLIQLYLRKQQAVKARKMKEMEQIKNQELFQSKITFFTQVAHEIKTPVSL IKAPLEAILETHEWNSEVESNLSVIQKNTNRLMELIKQLLDFRKVDKEGYTLSFNEVDIN RMIEDIIDRFRAISLTGISFSVSLPKEHLQYNVDQEALTKIVSNLLTNAMKYARTRIMVI LDEHLSAEGRTLSLCVRDDGPGIPQEECSKVFEPFYQVGNTGNNGSGVGIGLSLVKLLVE KHKGKVYINPGYTEGCEVCVEIPYLEKSISVSPSITSMPDKVPALEEEGEPAGYSLLVVE DTTDMLEFLAKNLGNTYTIHTAANGKEALECLETTTVDLIISDIVMPHMDGFELLKSIRS DNMLCHIPFILLSALDSIDSKIAGLDYGADAYIEKPFSLSHMKATINNLLENRRMLFNHF TTVPNMSYDQTLMNKTDVKWLNTINEIITRNFTNEEFTIDKMAEEMAISRSNLQRKLKGL TGMPPNDYIRLIRLKTAGELLREGEYRINEVCFIVGFNNPSYFARCFQKQFGILPKDYVK KGNIIREVHILKHSHCDEKKYS >gi|222159341|gb|ACAB01000018.1| GENE 23 37087 - 37860 1020 257 aa, chain - ## HITS:1 COG:PAB0577 KEGG:ns NR:ns ## COG: PAB0577 COG0731 # Protein_GI_number: 14521069 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductases # Organism: Pyrococcus abyssi # 10 251 4 232 314 99 31.0 7e-21 MTIIFPSPIFGPIHSRRLGVSLGINLLPDDGKVCSFDCIYCECGFNAEHRTKKLLPTREE VRTALEEKLKDMQANGPAPDVLTFAGNGEPTAHPHFPEIIEDTLALRDKYFPKAKVSVLS NSTFIDRPAVFEALNKIDNNILKLDTVDEEYIHLLDRPNGKYSVKKIIERMKEFKGNCIV 
QTMFLKGGYQGKDMDNTSDKYVLPWIEAVKEIAPRQVMIYTIDRETPDHDLQKATHEELD RIVALLEKEGIPATASY >gi|222159341|gb|ACAB01000018.1| GENE 24 37976 - 40087 1067 703 aa, chain - ## HITS:1 COG:no KEGG:Cpin_1424 NR:ns ## KEGG: Cpin_1424 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 25 621 61 754 853 130 21.0 2e-28 MKRRRYVIGILLCVTALLQAQDSVKSFDEFFVAGMDKIDGVFPVYVAEKEIYLEIPEKYI GREIEVSGQIDRGFDLLNRPVDGLGVVRIISPDKATICFQKPFYTERILDEKSTYQQSFS LSNMQPAGKSYPVVAYSKEQGAIIRITEYLMTGDDWFSYNDSFIRSLVPELSEIMKIHPF KEGVSFTVRRYHGVEAERYMLSSSAIILPEGSMPLEVTCAVRLLPLKKDQIRLADYRIPY RTLSFKDYSQNPYCMVEDSLILRWDMSQPLTFYVDTLFPKEYFQAVKEGVEAWNTAFHKA GIHDALQVRYADRKIIPAEQRAFISYDLRIPGIKSDFICHPRTGEILSCRLNIGHGFLKG KLDDYLLSCGASDSRILADRYSKEVEKELLQNEITEEIGYLLGLRRSLSKSSCGKTLKVG DDDCRTIYFGYHPFKGDQNCYDEREKLRQWIDHNLPDHIRLFQPSGKENSNSSFPEDYAV KISDLQTVVSRLDKIVYKGKKYDKGSSLTDIYRKAIRLYGSYLMEMAKVVGSSQPADAQR QAMLDLDNYLFHSVKKMECTYVKENLLETRNNLLYPELTRLFKQLLSVKTISALRLQALQ SDRKGYDDIDFFRDLYKGLFNGFDPQSAVSYEQMDIQLICLEAWLDIMQEATEHNSIKKR LTNELHSLHDRLEELSTTHSQREVRDMYILLLRRMDQYFRVVS >gi|222159341|gb|ACAB01000018.1| GENE 25 40113 - 42461 1235 782 aa, chain - ## HITS:1 COG:no KEGG:Cpin_6153 NR:ns ## KEGG: Cpin_6153 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 40 743 51 759 812 343 31.0 1e-92 MLKGIIGAVFLAVVCTCLYGQQQSITTFIKEEAPVEVIPGMFTTYRSDKHIYWEIPDSLI GREFAVTTTILTAPARPDRDMEKKFGYAGDMIGPVFFSFRKQGDELWMMDPQYERVIENP EGTYAKIAAQRGNERLYKILPIKARNQGSSLIEIGEVLKDFPLFTLDIVSFDLLIGTRLR EKDYIKEIKGYDNRLLIHASRVYRSSSMKIPGKPVAPPYIGDWDTGICIKLLSKRPLEAV AANTGAYFSISKECFQGNQPAIRKSVVKRWRLEIRAEDEERYMRGELVEPIQPIIFYIDR NTPEKYIDCIIEAVRDWRPAFEKAGFKNAIDARLAPTVEENPDFSIYDSTYPFISWKISG QNNAYGPTPCEPRSGEIIACHIGIFCSVLNLEQKWYFAQCGANDPQAWNIELPDSLQYEQ IKQVLTHEVGHTLGLEHNFLGSSHYSIDQLRDNDFLSQYSIGSSIMDYVRCNYALRPQDK VDLRNRRVRVGEYDKWAIEWGYRIFPGKDASEREKNRSLWNQEKQKDPSLHFSGRMDVRA QAEDLGNDHVMVNTQGIENLKYLCEHPDVWNVTDKTSLRVLQGRYEAVLEHYKQWVQHVL SHLGGKRLAEPDDENIYIPEKADYNKKVMSFIQAYILQPPVWMFDEKLTAKLEINGSREF 
DRFYEELMSEMIRSLREVERTENVCEDMLSVDEFLESIHKELFTEWADNIPVSDAKYKVQ LLYVSKLVKLLDRSEKITSSRLLVSIMQALNRIKEESLDYSHKITDPVAKKRAMFLVDSI LF >gi|222159341|gb|ACAB01000018.1| GENE 26 42477 - 44150 1124 557 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716983|ref|ZP_04547464.1| ## NR: gi|237716983|ref|ZP_04547464.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 557 1 557 557 1113 100.0 0 MRKYIICIVSFIGAALLQTGCYDDKGDYDYHDVNTMDIVIPETKVRMPKEEAVEVSIIPE ISQTLEQNEKNLVFQWKKTIEGKKAGSDRLSDYKDYSVGKECKVTVEPYESENIGLMLVI TDKKNGTTWYQIGEVAIIRPLNPCWFVLQEKEGKGVLGAIEGTPEGYYVYPDVFKSELNQ SFPLEGKPLAVSARKNYGDSFLSSMLGFFGFKVSPALMVVTDRDLALLTPSTLITRYPSN KILFEPTGKGEPLNIELYKMSTHGELFVNSGKVYCAPMDGFCVPFSVKKESEFPAISAYG SYGGGYLFFDSENHRFLASSILGFSDYMVPNSATQDIRNYGTKWTDQKPVRTYLMSSDES NVFDPNKIDPSLEVHDIVTGGNGGNFAYAIAAPHNGKELTVFKFSAQDEDPICAARYTIA LPSEVNVETAKFAASYAYTANLIFMTSGNKLYRIDLGRGRAIELYTYETDPSAQIVSFKF KDPESVREEDDDEEIGEYKEKLGMSLGLGINTADKGVVVELQLTVAGDVAREENSICVYE DPDQPIGKIVDISYNYE >gi|222159341|gb|ACAB01000018.1| GENE 27 44168 - 44851 646 227 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716984|ref|ZP_04547465.1| ## NR: gi|237716984|ref|ZP_04547465.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 227 1 227 227 442 100.0 1e-123 MKRIIYFVLAILVCGIYVGCSENEIDLYDQTPRINFYGSTTHVRTLVDTDYVKKEPYAVD SFEVRIQGDFLKENRDFCVKVTPNNDYQNSVDVLLESKYTYTDLDTVCQIFYYKINRPKV EAGRNVYGCYLEFDLNNPLHQFDKGLVEKNQIVLNVRWELKPTEWSDYVGFGSYSDAKYM FIMDVCQRVWDDLEDEDIDVIKQAYREYKEEGNPPILGEEGDEIEYE >gi|222159341|gb|ACAB01000018.1| GENE 28 44861 - 46384 1141 507 aa, chain - ## HITS:1 COG:no KEGG:Cpin_1098 NR:ns ## KEGG: Cpin_1098 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 6 504 5 487 488 230 33.0 1e-58 MKQIIYILLTTTMLGLMSCSDWLDVSPKTSIPTDKQFESESGFKDALTGIYLKLGTTTLY AGDLTYAYLDELAGLYSDYPGYNTNAVFDQSIVFDYENMFLSKKNGIYSTMYNIIANINN FLEYVDKNKDVLVTERYYETMKGEALGLRAFLHFDLLRMFGPVYKEHPASKAIPYRIAFD KDATPVLPASEVVDAILKDLNDAEKLLKESDPLDFFTDQTDEDFIDKNHFLVNREFRMNL YAVKAMLARVYCYKGDAESKGLATEYAKQVIAASKYFTLYKSQTASNYNSIRYAEQIFGI TVNEFSNLLIGNYMDMENTNTQQHFYLDGDKFKFFYETADAGNTDWRKNTEMFEVINGAS RTDVFCRKYNQKPLNGGYAYSGADAIPLIRLPEMYYIVAECVPSASESADALNTVRFARG ISYSDEIPTTGYDDLDNTSEEDKNQTKRINEIMKEYRKEYFAEGQLFYFLKAHNYSTYYG CGIETMTEAHYQMTLPDDEYIFGNNSK >gi|222159341|gb|ACAB01000018.1| GENE 29 46401 - 49754 2941 1117 aa, chain - ## HITS:1 COG:no KEGG:Cpin_5147 NR:ns ## KEGG: Cpin_5147 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: C.pinensis # Pathway: not_defined # 9 1117 55 1164 1164 645 34.0 0 MKKNQAQKFIVLFLFFLFAYSWRVEAQTGKEKQITMEFKNEGLPSIFKRFEKVSGYKVLF IYDEISSYTSTGKVEKATVDEALKVIIGKNPLKYHIDGQFINITKEDSKKFFSQVKGKVF SEEDGLPVIGATIIVEDANNIRAITDNNGNFQLSNVSKDSRVRISYVGLETQFLNPSSYM SVVMKSDTKALDEVVVTGMFNRKKEGFTGSAVTIKGEELKKYSTNNVAKAIAAVAPGLRI MDNINMGSNPNNLPDMRMRGGANMDLNAQATVDLNSASNDVLAVQGEYETYANQPLLIMD GFEISIQTLADMDPDRVASIVVLKDAAATAIYGSRAANGVIVIESKTPKPGRIWVTYGGE LRIEAPDMTGYNLMNAREKIDAELQSGLYTYGGETVEKWQLYQSKLREVLAGVNTYWLDK PLQTAFQQRHTVTLEGGDEALRYRMYVGYNSSPGVMKDSKRDVLTGSLDFQYRLKKVLLK NSITLDNSVANESPWGSFSEYTRLNPYLRPYGENGEIQKRLDNFEGVGGESSYLNPMYNT TFNSKDQSKNFTVRELFRVEYNPTNELRFEGAFNLSKSVGHRDIFRPAQHTLFDNVTDPT LRGDYRRSQSEAVSWGIDLTGSWNKQLKDHYLTANARMSVLENNSETYGNYVTGFPNDNM 
DNLLFGKKYNEKVTGDERTTRSIGWVAAGGYSYKYKYSFDFNIRLDGSSQFGKNNRWAPF WSTGLRWDLKKENFMKDVSFISDFILRGTYGTTGSQGFDPYQAHGYYTYSNLLLPYYSSD ATGSEILAMHNESLKWQTTKSTNLALELGFFDQRFTARVEYYRKITDNMVANISLAPSLG FSSYPENLGKIENKGWEISLSAIPYKNTAKQAYWTITVNGSHNTDKLLKISEAMKYRNDK SASDLKDTPLPRYEEGESLSRIWVVRSLGIDPASGDEILLKRNGEMTSAVNWSANDVVPI GNTEPKWQGYINSSFTYRGWGADVSFRYQFGGQVYNQTLLDKVENANLKYNVDRRVTQLR WAKPGDKAQFRILNPNGLETKATSRFIMDENIFQGSSLSVYYRMDRTNTKFISHWGLSSA KVTFNMEDFFYWSTVKRERGLYYPYSRQFTFALNVAF >gi|222159341|gb|ACAB01000018.1| GENE 30 49888 - 51066 675 392 aa, chain - ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 191 391 125 324 331 87 32.0 5e-17 MDEKNINISKSEEELLEIMENRSHITADQLHNLKEDEECLQACSDLAEIVIEMQKKQNML AIDVRNEMADFHNKHSKNNRRKNTRMLLWASITGVAAAVVIILVLRAMMISSQPEIIQVF QANHVAQEVTLQVNDEKEIKPLKEVVESLSSSSTAQLSSKEIDYSRALLQTETKEVGKQK VQIHRLSIPRGETFKVVLSEGTEVFLNSDSRLAYPTIFKGKERVVSLEGEAYFKVTKDAK HPFIVKSGNLQVRVLGTEFNVRSYSPIDVRVTLITGKVAVSDTCGIHSVEMMPGQSAQLS SDGTFAVNEVDIESFLYWKEGFFYFDDVALVDMMKEIGRWYNIDIEFRNSKIMDLRMHFF ANRHQDIFHLIELLNRMERIHAHFETGKLIIE >gi|222159341|gb|ACAB01000018.1| GENE 31 51098 - 51607 442 169 aa, chain - ## HITS:1 COG:no KEGG:BT_3037 NR:ns ## KEGG: BT_3037 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 162 11 169 169 84 28.0 2e-15 MDDRATFDKMFNEWYAQFVYFAYYFINDAEVCRDIVSDAFEYLWRNYEKIEEATAKTYLY TIIRTRCIDYLRKQNIHEEYVEFTSQLTDKMIESDSQHSDSRVLRIRKAMEKLTPYNYHI LEACYIHNKKYKEVAEELNVSVAAIHKNIVKALRILREELGQEGNRNRL >gi|222159341|gb|ACAB01000018.1| GENE 32 51651 - 53699 1154 682 aa, chain - ## HITS:1 COG:no KEGG:BT_3167 NR:ns ## KEGG: BT_3167 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 682 1 682 682 1167 83.0 0 
MKTIQLLRLISIINSLLIIPACSAQNLSESLLEDVLEDLSVNNGTDNSVNTPNWENELEE LSNRMQEPVNLNVATREQLEQFPFLSDIQIEHLLAYIYIHGQMKTIYELQLVEEMDRQTI QYLLPFVCIKAINNEPAFRWKSLLKSAAKYGKNELLTRFDIPFYRRKGYEHTYLGPSVYN SVKYSFRYSDRLYAGVVAEKDAGEPFGALHNRYGYDYYSFYLLLKDCGRLKALAIGNYRL SFGQGLVISTDYLMGKTIYASSFNNRSGGIKKHSSSDEYNYFRGVAATVALSKDWDLSGF YSHRSLDGVITDGEITSIYKTGLHRSQKEADKKNLFTMQLTGGHVSYQHNRIRLGITGIY YLFNRPYEPELTGYSKYNLHGNNFYNLGIDYAYRWHRFSFQGETAIGKQGWASLNRLQYS PVQNTQIMLIHRFYSYNYWAMFAHSFGEGSTTQNEQGYYIGMETSPFAYWKFFASFDLFS FPWKKYRVNKPSRGTDGLLQATFTPRSYLSMYLKYRYKRKERDWTGSKGSLTLPIFHHQL RYRLNYSLGDVLSSRTTLDYNHFHSQDRAANIGYQVTQMISSQLPWARLFADVQGSYFFT EDYDSRVYASESGLLYTFYTPSFQGRGFRCSVRLRYELNKHWLFITKFGETVYLDRNEIG SGNDLICGNKKGDVQMQLRIKF >gi|222159341|gb|ACAB01000018.1| GENE 33 53844 - 54422 486 192 aa, chain + ## HITS:1 COG:no KEGG:BT_3166 NR:ns ## KEGG: BT_3166 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 192 3 190 190 331 92.0 9e-90 MTLFAIVCCSLRLQAQDKQSINGYLVPMCIYNGDTIPCVQLRTVYIFRPLKFKNEKERQE YYRLIRNVKKVYPISREINQAIIETYEYLQTLPNEKARQKHIKRVEKGLKDQYTPRMKKL SFAQGKLLIKLIDRQSNSTSYELVKAFMGPFKAGFYQTFAALFGASLKKEYDPQGEDKLT ERVVLMVENGQI >gi|222159341|gb|ACAB01000018.1| GENE 34 54783 - 55310 616 175 aa, chain - ## HITS:1 COG:FN1519 KEGG:ns NR:ns ## COG: FN1519 COG0566 # Protein_GI_number: 19704851 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Fusobacterium nucleatum # 25 167 89 228 234 80 33.0 2e-15 MRKLKITELNRISAEEFKQVEKLPLVVVLDDIRSLHNIGSVFRTSDAFRIECIYLCGITA TPPHPEMHKTALGAEFTVDWKYVNNAVDAVDNLKNEGYIVYSVEQAEGSIMLDELQLDKT KKYAIVMGNEVKGVQQEVINHSDGCIEIPQYGTKHSLNVSVTTGIVIWDLFKKLH >gi|222159341|gb|ACAB01000018.1| GENE 35 55811 - 56749 841 312 aa, chain + ## HITS:1 COG:all4673 KEGG:ns NR:ns ## COG: all4673 COG0379 # Protein_GI_number: 17232165 # Func_class: H Coenzyme transport and metabolism # Function: Quinolinate synthase # Organism: Nostoc sp. 
PCC 7120 # 3 307 20 323 324 369 57.0 1e-102 MNELIKAINELKKEKNAIILGHYYQKGEIQDIADYVGDSLALAQWAAKTEADIIVMCGVH FMGETAKVLCPDKKVLVPDMAAGCSLADSCPADQFAQFVKEHPGYTVISYVNTTAAVKAV TDVVVTSTNAKQIVESFPKDEKIIFGPDRNLGNYINSVTNRNMLLWDGACHVHEQFSVEK IVELKTQHPEALVLAHPECKSTVLKLADVVGSTAALLKYAVNHPENTYIVATESGILHEM QKKCPQTTFIPAPPNDSTCGCNECSFMRLNTLEKLYECLKNEAPEITVDPEVAKKAVKPI QRMLEISAKLGL Prediction of potential genes in microbial genomes Time: Wed May 18 01:15:29 2011 Seq name: gi|222159340|gb|ACAB01000019.1| Bacteroides sp. D1 cont1.19, whole genome shotgun sequence Length of sequence - 13927 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 6, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 60 - 902 937 ## COG1360 Flagellar motor protein 2 1 Op 2 . - CDS 934 - 1515 315 ## PROTEIN SUPPORTED gi|71274727|ref|ZP_00651015.1| Ham1-like protein 3 1 Op 3 . - CDS 1566 - 2483 758 ## COG1284 Uncharacterized conserved protein - Prom 2512 - 2571 3.5 4 2 Tu 1 . - CDS 2594 - 5428 2888 ## COG0495 Leucyl-tRNA synthetase - Prom 5448 - 5507 7.9 + Prom 5869 - 5928 7.8 5 3 Tu 1 . + CDS 6164 - 8734 2146 ## PRU_1162 putative glycolsyl hydrolase, family 18/alpha-rhamnosidase + Term 8807 - 8858 15.3 6 4 Op 1 . - CDS 8868 - 9785 540 ## COG1708 Predicted nucleotidyltransferases 7 4 Op 2 . - CDS 9802 - 11553 1599 ## BVU_3461 hypothetical protein - Prom 11694 - 11753 3.1 + Prom 11600 - 11659 2.0 8 5 Tu 1 . + CDS 11684 - 12559 684 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 12560 - 12609 4.3 - Term 12541 - 12604 13.3 9 6 Tu 1 . 
- CDS 12633 - 13769 821 ## COG1785 Alkaline phosphatase - Prom 13801 - 13860 3.8 Predicted protein(s) >gi|222159340|gb|ACAB01000019.1| GENE 1 60 - 902 937 280 aa, chain - ## HITS:1 COG:PA1461 KEGG:ns NR:ns ## COG: PA1461 COG1360 # Protein_GI_number: 15596658 # Func_class: N Cell motility # Function: Flagellar motor protein # Organism: Pseudomonas aeruginosa # 91 267 69 245 296 95 32.0 7e-20 MKKITLFTLLTLLLCTSCVTKKKFMLAEMAATASKDSLQGLLNNSREVGNQLSAQVKNLL RDTTKMGNSIRQYQSMLNVNMTEQEKLNALLSQKKNELNERERTINELQDMIKAQNDKVQ NLLSNVKDALLGFSTDELTVREKDGKVYVAMSDKLLFQSGSARLDKRGEEALGKLAEVLN KQTDIDVFIEGHTDNKPINTVQFKDNWDLSVIRATSVVRILIKNYNVNPLQIQPSGRGEY MPVDDNETIEGRSKNRRTEIIMAPKLDKLFQMLQSSEESK >gi|222159340|gb|ACAB01000019.1| GENE 2 934 - 1515 315 193 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|71274727|ref|ZP_00651015.1| Ham1-like protein [Xylella fastidiosa Dixon] # 1 193 1 197 200 125 41 1e-28 MKRKLVFATNNAHKLEEVAAILGDQVELLSLNDIGCQTDIPETAETLEGNALLKSSYIYK NYHLDCFADDTGLEVEALNGAPGVYSARYAGGEGHDAQANMLKLLHELDGKENRKAQFRT AISLILDGKEYLFEGVIKGEIIKEKRGDSGFGYDPVFMPEGYDRTFAELGNDIKNQISHR ALAVQKLCEFLQS >gi|222159340|gb|ACAB01000019.1| GENE 3 1566 - 2483 758 305 aa, chain - ## HITS:1 COG:TM0177 KEGG:ns NR:ns ## COG: TM0177 COG1284 # Protein_GI_number: 15642951 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 9 303 1 281 283 151 31.0 1e-36 MHKLTKAEIMREVKDYIYITLGLISYSLGWAAFLLPYQITTGGTTGIGAIIYYATGFPIQ WSYFIINAVLMTFAIRVLGPKFSIKTTYAIFTLTFLLWLFQLVVNNYVEAPDMTPDGKPL LLGTGQDFMACIIGAAMCGVGLGITFNYNGSTGGTDIIAAIVNKYKDVSLGRMIMICDVF IISSCYFIFHDWRRVIFGFVTLFIIGVVLDWIINSARQSVQFFIFSKKYDEIADRIIKDA DRGVTVLDGTGWYSKNNVKVLVVLAKKRQSLEIFRLVKRIDPNAFISQSSVIGVYGEGFD KLKVK >gi|222159340|gb|ACAB01000019.1| GENE 4 2594 - 5428 2888 944 aa, chain - ## HITS:1 COG:SP0254 KEGG:ns NR:ns ## COG: SP0254 COG0495 # Protein_GI_number: 15900189 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Streptococcus pneumoniae TIGR4 
# 3 944 4 832 833 720 42.0 0 MEYNFREIEKKWQKRWVEEKTYQVTEDDSKQKFYVLNMFPYPSGAGLHVGHPLGYIASDI YARYKRLQGFNVLNPMGYDAYGLPAEQYAIQTGQHPAITTVNNINRYREQLDKIGFSFDW NREIRTCDPEYYHWTQWAFQKMFNSYYCNDEQEARPIEELEKAFAIYGNEGLNAACSEDI SFTAEEWNAKSEKEKQEILMNYRIAYLGETMVNWCAELGTVLANDEVVDGVSERGGFPVI QKKMRQWCLRVSAYAQRLLDGLDTIEWTDSLKETQRNWIGRSEGAEVQFKVKDSDLEFTI FTTRADTMFGVTFMVLAPESELVAQLTTPEQKAEVDAYLDRTKKRTERERIADRSVTGVF SGSYAINPFTGEAVPVWISDYVLAGYGTGAIMAVPAHDSRDYAFAKHFGLEIRPLVEGCD VSEESFDAKEGIVCNSPRLDVTPYCDLSLNGLTIKEAIETTKKYVKEHNLGRVKVNYRLR DAIFSRQRYWGEPFPVYYKDGMPYMIDEASLPLELPEVAKFLPTETGEPPLGHATKWAWD TVNKCVVENEKIDHVTVFPLELNTMPGFAGSSAYYLRYMDPHNHQALVDPKIDQYWKNVD LYVGGTEHATGHLIYSRFWNKFLYDMGVSVMEEPFQKLVNQGMIQGRSNFVYRIKDTNTF VSLNLKDQYEVTPIHVDVNIVSNDILDLEAFKAWRPEYKTAEFILEDGKYVCGWAVEKMS KSMFNVVNPDMIVEKYGADTLRMYEMFLGPVEQSKPWDTNGIDGVHRFIRKFWSLFYSRT DEYLVTDEPATKEELKSLHKLIKKVTGDIEQFSYNTSISAFMICVNELFNLKCSKKEILE QLVITLAPFAPHVCEELWDVLGHETSVCDAQWPAYNEEYLKEDTINYTISFNGKARFNME FAADEASDTIQAAVLADERSQKWIDGKTPKKIIVVPKKIVNIVI >gi|222159340|gb|ACAB01000019.1| GENE 5 6164 - 8734 2146 856 aa, chain + ## HITS:1 COG:no KEGG:PRU_1162 NR:ns ## KEGG: PRU_1162 # Name: not_defined # Def: putative glycolsyl hydrolase, family 18/alpha-rhamnosidase # Organism: P.ruminicola # Pathway: not_defined # 18 856 344 1191 1193 1270 70.0 0 MKMNLHCFANLIPIMILALFSNSGLINATEVGKRTDALEASAWNDSQWISAVNAPVVKGH NNGRAADGASWFVSIVKNEQKIVSAKWMTAGLGVYELYVNGKPVGEEFLKPGFTHYAKTK RSFTYDITDIIRTKPNAENMLSVQVTPGWWADKIITPGGYDGMIGKKCAFRGVLELTFSD GTKKRYGTDLENWKAGIAGPVKHAGIFDGEEYDAREPMGFECVDKLSIPEENTEFSGDIL PSDGAEVYLRTDLALAPVRAYAWKNVEGVKENEFGKVIIAREFASDTEMTVSPGETLVVD FGQNCAGVPSFVFKAAEGTVLTCLPAELLNDGNGARIRGMDGPEGSCHRENLRIPHTGIR LDYTFAGGDNYVAYYPHCTFFGYRYVSITATGNVAIKSLKSIPVTSIAKELEAGTITTGN DLVNKLISNTYWGQLSNYLSIPTDCPQRDERLGWTADTQVFAETGTFFANTMKFFHKWMR DMRDTQNSLGGFPGVAPLAQYGDEKMRLGWADAGIIVPWTVWKQFGDTQIIEESWSAMDL FMNHINDTKYNHEALCGENGNYQWADWLSYEPLESCSGLAFSPQGPLPDAVSYWNYLSAS YWVIDAFMMRDMAAATGRDAAKYQQMADSAKAYIKKNFLNEDGTFKTAILNTMQTPALFA 
LKNQLLEGEPKAKMIDRLRENFAQHDLCLQTGFLGTSILMPTLTENGMEDIAYELLFQRK NPSWLYSVDNGATTIWERWNSYMIDKGMGPRGMNSFNHYAYGCVCEWIWETVAGIAADPA TPGFKHIIMKPIPDKRLGHVTAEYRSASGLIKSAWKYEGDTWIWEFTIPKGVTATVTLPG EVKSKEYGSGTYKVTK >gi|222159340|gb|ACAB01000019.1| GENE 6 8868 - 9785 540 305 aa, chain - ## HITS:1 COG:SMb20835 KEGG:ns NR:ns ## COG: SMb20835 COG1708 # Protein_GI_number: 16264326 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferases # Organism: Sinorhizobium meliloti # 1 291 27 321 331 124 29.0 3e-28 MKRSIKRLPKRTQEELAVLQELILSNLTNVRMIILYGSYARGKYVIWDETYDERGVTTYY QSDLDILVICDTRDANKAERHAREVIVPKYDTRMEGKRHPAPPSIIVENPTTINRAIRRK HYFFYEIIKDGILLYNDGTFQIGKPEKLPYREIKQYAEEEYAECFPLAEGFLRHGHLDKE EENYKLGSFELHQACERYYKTFTLVYSGIRPKSHELKVLGAMVRSCSREFANVFPTHTFE DNKAFDKLCRAYIEARYNRLFTVSKEQYEYMLARTEVLREVTIRECAARMAYYDEMIEKE EKEKI >gi|222159340|gb|ACAB01000019.1| GENE 7 9802 - 11553 1599 583 aa, chain - ## HITS:1 COG:no KEGG:BVU_3461 NR:ns ## KEGG: BVU_3461 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 3 583 2 582 590 967 83.0 0 MEKYIAPDRKRIPYGMMNFAVIRRDDCYYVDKTRFIPMIEEADKFFFFIRPRRFGKSLTV NMLQHYYDILAKDKFEALFSDLYIGKHPTRDRNSYLVLYLNFSGIVGELHNYRKGLDAHC QTMFDYFCDIYADYLPKGIKEELDKKEGAVEQFEYLFTECNKTNQRIYLFIDEYDHFTNA ILSDIESLHRYTDETHGEGYLRAFFNKIKAGTYSSIERCFITGVSPVTMDDLTSGFNIGT NYSLTPEFNEMIGFTEEEVRQMLTYYSTTSPFNHSVDELIEIMKPWYDNYCFAEECYGET TMYNSNMVLYFVKNYIQRGKAPRDMVEDNIRIDYEKLRMLIRKDKEFAHDASIIQTLVSE GYVTGELKKGFPAVNITNPDNFVSLLYYFGMLTISGTYKGKTKLTIPNQVVREQIYTYLL STYNEAELNFSSYEKNELASALAYDGDWKAYFGYIADCLKRYTSQRDKQKGEFFVHGFTL AMTAQNRFYRPISEQDTQAGYVDIFLCPLLDIYSDMKHSYIVELKYAKYKDPESRVEELR LEAIAQANRYADTDMVKRAVGTTQLHKIVVVYKGMDMPICEEV >gi|222159340|gb|ACAB01000019.1| GENE 8 11684 - 12559 684 291 aa, chain + ## HITS:1 COG:BS_ybfH KEGG:ns NR:ns ## COG: BS_ybfH COG0697 # Protein_GI_number: 16077290 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases 
of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus subtilis # 9 291 10 292 306 185 40.0 9e-47 MELKHGYYHLIAILVVAIWGLTFISTKVLINHGLTPQEIFFYRFFIAYLGIWVISPKRLF AGNWKDELWLMAGGFFGGSLYFFTENTALGITQASNVAFIICTAPLLTTILSLLFYKSEK ATKGLIYGSILALIGVGLVVFNGSFVLKLSPVGDLLTLLAALSWAFYSLVIKKMTGRYPT VFITRKIFFYGVLTILPAFLLHPLQPDFDVLLKPVVLSNLLFLAVLASLVCYVLWNVVLK QLGTVRASNYIYLNPLVTMVASVIILHEKITWITLLGAGCIIFGVYQAEKK >gi|222159340|gb|ACAB01000019.1| GENE 9 12633 - 13769 821 378 aa, chain - ## HITS:1 COG:TM0156 KEGG:ns NR:ns ## COG: TM0156 COG1785 # Protein_GI_number: 15642930 # Func_class: P Inorganic ion transport and metabolism # Function: Alkaline phosphatase # Organism: Thermotoga maritima # 52 329 21 310 434 161 36.0 2e-39 MNVKFKAHLFIFSLFFCHNLSAQEIVKGDFKNENPQSSYVPSFDMDATDRPVKNVILMIG DGMGLAHICSGMYANQGQLTITNLKTCGFVRTQSANKFTTDSAASGTAYSTGKKTKNGAL GMDENNQVIPNLPEKLSGYGYISGIVTTDNLDGATPAAFFAHQPERGMSKEIWADLPNSK LTFFSAGSYELFEKQAPNVQKEIKKEFTIIEEPNDKAIKKSKKLGYLPTKSKTASVNENR GDFLPSTTQMAIDYLSSRSTNGFFLMVEGARIDKSAHSNDYSAVVREVLDFDKAVEAAIR FAEKDGNTLVIISADHETGALALRDGNIKEGKMKAMFVSKGHTPIMVPLFAYGPQSKLFG GVQENSDVSNKILQLLAK Prediction of potential genes in microbial genomes Time: Wed May 18 01:15:47 2011 Seq name: gi|222159339|gb|ACAB01000020.1| Bacteroides sp. D1 cont1.20, whole genome shotgun sequence Length of sequence - 1746 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 1540 890 ## COG3537 Putative alpha-1,2-mannosidase - Prom 1668 - 1727 6.3 Predicted protein(s) >gi|222159339|gb|ACAB01000020.1| GENE 1 1 - 1540 890 513 aa, chain - ## HITS:1 COG:CC0533 KEGG:ns NR:ns ## COG: CC0533 COG3537 # Protein_GI_number: 16124788 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Caulobacter vibrioides # 221 513 287 552 770 125 27.0 2e-28 MYKRKFIPLVLLMILIVNGCSNKQEVVSTPVDYVNPYMGNISHLLVPTYPTVHLPNSMLR IYPERADYTSDKIKGLPVIVTSHRGKSAFSLSFYQGAESGLQPVYYYCYDNEVIKPYSYS SYFEEEEVSTKFAPSHQAGIYELSFRKNDAPYLILSTTNGELKTTENTISGYQNIDHQTK VYVYMETSALPEKTMTVSKEEINHSHTAIKGKNVGIALLYSTDIKTIKVRYGISFIDEKQ AKANLQREIKDYDVENQMNIAKNIWNQTLSKIKVEGIDENAKAIFYTSLYRTYERMICLS EDNRYYSAFDNSIHKDSVPFYTDDWIWDTYRAVHPLRVIIEPQMEADMIQSFIRMAQQME HNWMPTFPEVTGDSRRMNSNHGVATVIDSYIKGICNFDLSAAYEACKLGITEKTLAPWSG IKGGEITKFYWENGYLPALAPGEQETADEVHPFEKRQPVAVTLGTSYDEWCLAQIAKQLG YEKDYNYFLQGSKNYRNIFNPKTKFFHPKNAKG Prediction of potential genes in microbial genomes Time: Wed May 18 01:16:01 2011 Seq name: gi|222159338|gb|ACAB01000021.1| Bacteroides sp. D1 cont1.21, whole genome shotgun sequence Length of sequence - 56842 bp Number of predicted genes - 32, with homology - 32 Number of transcription units - 11, operones - 7 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 392 - 451 4.2 1 1 Tu 1 . + CDS 554 - 3172 2516 ## COG0249 Mismatch repair ATPase (MutS family) + Term 3252 - 3313 11.2 - Term 3246 - 3292 6.1 2 2 Op 1 . - CDS 3303 - 4139 911 ## COG0682 Prolipoprotein diacylglyceryltransferase 3 2 Op 2 . - CDS 4157 - 5074 1019 ## COG1893 Ketopantoate reductase 4 2 Op 3 . - CDS 5122 - 6225 1239 ## COG0012 Predicted GTPase, probable translation factor - Prom 6256 - 6315 5.4 5 3 Op 1 . - CDS 6385 - 7659 758 ## Cpin_3746 hypothetical protein - Prom 7684 - 7743 3.2 6 3 Op 2 . - CDS 7753 - 8751 750 ## Cpin_1736 hypothetical protein 7 3 Op 3 . - CDS 8753 - 10435 847 ## Phep_0306 hypothetical protein 8 3 Op 4 . 
- CDS 10450 - 13770 1827 ## Cpin_2191 TonB-dependent receptor plug 9 3 Op 5 . - CDS 13792 - 15918 1464 ## gi|237717014|ref|ZP_04547495.1| conserved hypothetical protein - Prom 16125 - 16184 7.5 - Term 16123 - 16155 -0.2 10 4 Tu 1 . - CDS 16202 - 17794 961 ## BT_0763 hypothetical protein - Prom 17869 - 17928 5.9 - Term 17836 - 17886 -0.9 11 5 Op 1 4/0.000 - CDS 17941 - 19083 1182 ## COG2814 Arabinose efflux permease - Prom 19284 - 19343 4.4 12 5 Op 2 . - CDS 19385 - 20791 1110 ## COG1073 Hydrolases of the alpha/beta superfamily 13 5 Op 3 . - CDS 20816 - 23200 2185 ## BT_3111 hypothetical protein 14 5 Op 4 . - CDS 23197 - 24060 580 ## BT_3110 hypothetical protein - Prom 24266 - 24325 4.5 15 6 Tu 1 . + CDS 24508 - 26034 1548 ## COG3119 Arylsulfatase A and related enzymes + Prom 26041 - 26100 4.1 16 7 Op 1 . + CDS 26154 - 27806 1471 ## Cpin_2848 hypothetical protein 17 7 Op 2 . + CDS 27828 - 28238 267 ## gi|237717023|ref|ZP_04547504.1| conserved hypothetical protein 18 7 Op 3 . + CDS 28251 - 29876 1347 ## Cphy_1995 hypothetical protein 19 7 Op 4 . + CDS 29921 - 30910 670 ## COG3507 Beta-xylosidase + Prom 30954 - 31013 2.8 20 8 Op 1 . + CDS 31051 - 34239 3294 ## BT_0364 hypothetical protein 21 8 Op 2 . + CDS 34236 - 36146 1578 ## PRU_2735 hypothetical protein 22 8 Op 3 . + CDS 36169 - 39252 2934 ## Phep_3875 TonB-dependent receptor plug 23 8 Op 4 . + CDS 39278 - 41074 1546 ## PRU_2737 putative lipoprotein 24 8 Op 5 . + CDS 41104 - 42006 848 ## Slin_2105 hypothetical protein + Term 42038 - 42102 12.1 25 9 Op 1 . - CDS 42139 - 43104 763 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen 26 9 Op 2 . - CDS 43181 - 47191 2846 ## COG0642 Signal transduction histidine kinase - Prom 47298 - 47357 3.4 + Prom 47203 - 47262 6.7 27 10 Op 1 . + CDS 47400 - 49457 1864 ## COG3534 Alpha-L-arabinofuranosidase 28 10 Op 2 . 
+ CDS 49546 - 51093 1349 ## COG3119 Arylsulfatase A and related enzymes 29 10 Op 3 . + CDS 51124 - 52254 933 ## BT_3094 putative secreted xylosidase 30 10 Op 4 . + CDS 52251 - 53918 1661 ## COG3119 Arylsulfatase A and related enzymes 31 10 Op 5 . + CDS 53924 - 56128 1849 ## COG3250 Beta-galactosidase/beta-glucuronidase - Term 56188 - 56250 13.2 32 11 Tu 1 . - CDS 56298 - 56741 478 ## COG0346 Lactoylglutathione lyase and related lyases - Prom 56782 - 56841 7.1 Predicted protein(s) >gi|222159338|gb|ACAB01000021.1| GENE 1 554 - 3172 2516 872 aa, chain + ## HITS:1 COG:HI0707 KEGG:ns NR:ns ## COG: HI0707 COG0249 # Protein_GI_number: 16272647 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Haemophilus influenzae # 9 869 11 860 861 620 40.0 1e-177 MNEEEIVLTPMMKQFLDLKAKHPDAVMLFRCGDFYETYSTDAIVAAEILGITLTKRANGK GKTIEMAGFPHHALDTYLPKLVRAGKRVAICDQLEDPKMTKKLVKRGITELVTPGVSIND NVLNYKENNFLAAVHFGKASCGVAFLDISTGEFLTAEGPFDYIDKLLNNFAPKEILFERG KRLMFEGNFGNKFFTFELDDWVFTETTAREKLLKHFETKNLKGFGVEHLKNGIIASGAIL QYLTMTQHTQIGHITSLARIEEDKYVRLDKFTVRSLELIGNMNDGGSSLLNVIDRTISPM GARLLKRWMVFPLKDEKPINERLNVVEYFFRQPDFKELIEEQLHLIGDLERIISKVAVGR VSPREVVQLKVALQAIEPIKQACMEADNASLNRIGEQLNLCISIRDRIAKEIKNDPPLLI NKGGVIQDGVNADLDELRQISYSGKDYLLKIQQRESEETGIPSLKVAYNNVFGYYIEVRN VHKDKVPPEWIRKQTLVNAERYITQELKEYEEKILGAEDKILVLETQLYTDLVQALMEFI PQIQINANQIARLDCLLSFANVARENRYIRPIIEDNDVLDIRQGRHPVIEKQLPIGEKYI ANDVMLDSDTQQIIIITGPNMAGKSALLRQTALITLLAQIGSFVPAESAHIGLVDKIFTR VGASDNISVGESTFMVEMNEAADILNNVSSRSLVLFDELGRGTSTYDGISIAWAIVEYIH EHPKAKARTLFATHYHELNEMEKSFKRIKNYNVSVKEVDNKVIFLRKLERGGSEHSFGIH VAKMAGMPKSIVKRANEILKQLESDNRQQGIAGKPLAEVSENRGGMQLSFFQLDDPILCQ IRDEILNLDVNNLTPIEALNKLNDIKKIVRGK >gi|222159338|gb|ACAB01000021.1| GENE 2 3303 - 4139 911 278 aa, chain - ## HITS:1 COG:VC0674 KEGG:ns NR:ns ## COG: VC0674 COG0682 # Protein_GI_number: 15640693 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Prolipoprotein diacylglyceryltransferase # Organism: Vibrio 
cholerae # 11 273 10 261 271 127 34.0 3e-29 MNTMLLSINWNPNPELFNLFGISIRYYGLLWAVGIFFAYVVVHYQYRDKKIEEKKFDPLF FYCFFGILIGARLGHCLFYDPGYYLSHFWEMILPIKFMPDGNWKFTGYEGLASHGGTLGL IIALWLYCRKTKLHYMDVLDMIAVATPITACFIRLANLMNSEIIGKPSDVPWAFVFERVD MLPRHPGQLYEAIAYLILFFIMIYLYKNYSKKLHRGFFFGLCLTYIFTFRFFIEFLKENQ EDFENSMMFNMGQWLSVPFIIIGVYFMFFYDKKKNKIA >gi|222159338|gb|ACAB01000021.1| GENE 3 4157 - 5074 1019 305 aa, chain - ## HITS:1 COG:BH1763 KEGG:ns NR:ns ## COG: BH1763 COG1893 # Protein_GI_number: 15614326 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate reductase # Organism: Bacillus halodurans # 1 291 1 287 304 109 26.0 7e-24 MKYLIAGTGGVGGSIAGFLSLAGKDVTCIARGAHLQSIQTNGLKLKSDLKGEHTLRIPAT TAEEFNGKADVIFVCVKGYSVDSIVELIKRAAHKDTVVIPILNVYGTGPRIQKLVPEVTV LDGCIYIVGFVSGTGEITQMGKIFRLVYGAHRGTVVKPGLLEAIQQDLQEAGIKVDLSPD INRDTFIKWSFISAMAVTGAYYDVPMGEVQKPGKIRDTFIGLSTESAALGKKLGVEFPED PVSYNLKVIDKLDPESTASMQKDLARGHDSEIQGLLFDMIAAAEEQGIDIPTYRMVAEKF KESNH >gi|222159338|gb|ACAB01000021.1| GENE 4 5122 - 6225 1239 367 aa, chain - ## HITS:1 COG:PA4673 KEGG:ns NR:ns ## COG: PA4673 COG0012 # Protein_GI_number: 15599868 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted GTPase, probable translation factor # Organism: Pseudomonas aeruginosa # 1 367 1 366 366 429 59.0 1e-120 MALQCGIVGLPNVGKSTLFNCLSNAKAQAANFPFCTIEPNVGVITVPDERLNALAELVHP QRIVPTTVEIVDIAGLVKGASKGEGLGNKFLANIRETDAIIHVLRCFDDDNVTHVDGSVN PIRDKEIIDYELQLKDLETIESRIQKVQKQAQTGGDKAAKLAYDVLVQYKDALEQGKSAR TVTFETKDEQKIAHELFLLTSKPVMYVCNVDEASAVNGNKYVDMVREAVKDENAQILIVA AKTEADIAELETYEDRQMFLAEVGLEESGVARLIKSAYKLLNLETYFTAGVQEVRAWTYE KGWKAPQCAGVIHTDFEKGFIRAEVIKYDDYIKYGSEAAVKEAGKLGVEGKEYVVQDGDI MHFRFNV >gi|222159338|gb|ACAB01000021.1| GENE 5 6385 - 7659 758 424 aa, chain - ## HITS:1 COG:no KEGG:Cpin_3746 NR:ns ## KEGG: Cpin_3746 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 2 357 29 381 436 102 26.0 3e-20 MENPDKRIQTKMQEYTDILTGAPDGWIAEINTQLGGIHTLWMQFTADNRVSMMFDYVEYY 
RDLKASPFESSYILKPLQGPTISFDTYSFLSIFADPNQLMNGAGQAGTGLGADYEYEIIS YKNDQFLLKGRKNKMEATLTKATNEEREAIKNGALMENQDQAPIYQKKYFTFSYKGQAYD FVSNGRKTGFLSANNGNPTLQIEGSKIDLNGNIVMMNPLILNGYEIYQFNKTSTGYTTQV GNDILEIKGGTTPIVPLFNAPPGAIHQTMVVYNDIQNQWSSEYFDKHKAAVSNLFAFTQT QFYIVYVAIHMYTDGSIVEIGYKIYQNGSLGADTYGFYYTFNYQKNSDGSITFGDYTPFD TSNPNHEEFDKCLSPILDGYFKKHKFFIKSWPALYNNSSQYVMSSLIPAEDKTLGVMVGV PMTF >gi|222159338|gb|ACAB01000021.1| GENE 6 7753 - 8751 750 332 aa, chain - ## HITS:1 COG:no KEGG:Cpin_1736 NR:ns ## KEGG: Cpin_1736 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 31 305 22 288 303 211 42.0 4e-53 MKTTNHIAKWIQAYFMLLFCISTVIFTACNEDKLNLDQEIYGLGGNDEQKNELDKWLYEN YTQAYNIEVKYKWNSYELNTTAQLVPIMERFVKPSMDMIQRVWFEPYKQLAGDKFLREMT PKKIILVGSPEYNADGSQVLGQAEGARKITLFDGNSYNPSDADWIRSIMHTIEHEFAHIL HQTKMYDSSFKDISAGDYNPTGWTSETEVSALLAGFYSAYAMSGVDEDFVEVASLIMVYG KEWRDYRINLLKSLTTSPPSEGETAEETAQRIALANQAQTALTRLLAKEEIVINYFKNVW NVQFYDDAMGNKGLVSLVQDAINQIVNENTPE >gi|222159338|gb|ACAB01000021.1| GENE 7 8753 - 10435 847 560 aa, chain - ## HITS:1 COG:no KEGG:Phep_0306 NR:ns ## KEGG: Phep_0306 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 1 552 1 481 481 289 34.0 2e-76 MKRIKIYLYIGIMSFCTSCSDFLDKVPDERTQIDSPDKVSALLVNAYPKVSYAAIVNTRC DYITDFGSVYSGAQPGALFNFMGENFLWKDVIENGNDSFENFWTGCYAAIAVSNQALVSI DELGTPSSTINQKGEALMTRAFSHFCLASLFAQQYDESTASLYPGIPYVEKPETQPVEQY DRETVADTYRKITQDFEDSYPLMTDGSIYTAPKYHFNKKSAAAFGTRLYLLMKQYDKVLE YANQLISIPSQFETLTDSKGEPLKNQDGTPQQYISKDDPAFSFVSNNFHPFATTYMDASG YIEMRSYFCSSSTNANLLITEALSSIAWGELPFYSRYGLSSTDIGKTINADNVTNGMWAI PYYGTSNFYFIPKFTQFTKQESLDASNFLPYANIPVLRMEEVLLNRAEAHLMLNQYDKAI ADLNLYASQRFVIARSPASSRYYDSKELCITRDKILNFYKTKLNNTDHFINKYNQTDSWS DLKKGILLAILDFRSIEFYNEGLRWYDIVRWNIPVTHTQANGSKSSLMPDDDRRVLQIPE MATISGIKLNPRNNVNNIWE >gi|222159338|gb|ACAB01000021.1| GENE 8 10450 - 13770 1827 1106 aa, chain - ## HITS:1 COG:no KEGG:Cpin_2191 NR:ns ## KEGG: Cpin_2191 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: 
C.pinensis # Pathway: not_defined # 19 1106 131 1219 1219 1166 54.0 0 MNKHIILLFIILCIGLTRASAQHSITGKVLDATNQEPIPFASVTIKNVKSGTTTDNNGNF TLKVNNNDILIVSFMGYETKEVNIGTQKKVTILLSESSVLLDAVVVTGFQNLKKTTFTGS SVKIKADDINLPGETDISRMLEGKAAGVSIQNVSSTFGSAPKIRVRGATSINGENKPLWV VDGIVLEDVVNVSNDQLASGDPTTLLGSSVAGINTNDIETIDVLKDAAATALYGARAMNG VVVITTKRGKEGKPRIHYSGNYTVRTKPRYSEFNIMNSADQMSVYAEMERKGLLTSDIVN NQSSGIYGIMYNRINTWDESKQQYLLENTYAARHNFLMEHAAANTDWFDMLFTNSIMNEH TLSVSSGSQKSKSYISLGFLNDPGWTVADKVNRITMNFRNDYQLNDKIGFSFQTVGSVRM QNAPGSLSRSSDVVSGISSRNFDINPFNYALNTSRALRAYDPNGEREYYTLNFAPFNILE ELDNNYIKLNVIDLKAQAEFNWQIIKGLKYNFTGAIRYVRSKQEHEIHENSNMANAYRAA GNSTIRENNPYLYKDPSDPSAEPIVVLPKGGFYNTAENLLLNYDIRNSVSFSKIWNEKHE FNALLGQQIKYSDRQTNSNTGYGFQYDMGGTVSMNYLIMKQMIEQNFDYYSRALTYDRFA AFYANVDYTYDRRYSISGTVRYDGSNTMANSSSARWLPTWNISGKWNIGNEAFMKDFVWL DALALRAGVGLTASMPPLSNATPIFINSNTTRPSTDIESGIEIYTLGNSDLTWEKSYQSN FGFDATFLKGRLDFSFDYFIRNSFDLLSVIKVSGIGGERYKWANSADLSASGFDITLSGK PIVTKDFMWSSSFTMGYSKNEIKNSMQQPQINTLVGITGGNKNGYPVNGLFSIPFAGLNP ENGVPMFYDENGTITSDINFQSTSTDYLKYEGPTDPKYTGGFNNSFTWKGISLDIFFTFQ AGNVIRLNPVFNSSYNDSQATPKEFFNRWEFRGDEKHTNIPAIADKDLISDLSSTYPYSA YNYSDIRVAKGDFIRLKSLSVGYTIPQKLFKSWKLFSDIRLRVTGKNLWLLYSDKKLNGQ DPEFVNTGGVAQPIQKQVIFSLDVSF >gi|222159338|gb|ACAB01000021.1| GENE 9 13792 - 15918 1464 708 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237717014|ref|ZP_04547495.1| ## NR: gi|237717014|ref|ZP_04547495.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 708 1 708 708 1395 100.0 0 MKNLIWLSAYLIISAIFSSCSKDDENTSIIPEGEKTNVNVSLNIAPINAMETNINDNRTS IENFWLLEFDQQGKNVAVIKELRFSINSAETVELVAGANMKLVVIANVDKSIVFQLGKQT YEALKNTIYTLPVNSSNAIPLVGEQVVSITKENQTLDAITLERIAAEVKFTINNTSSDFQ LTSIALKNIYKMYYFSKGTISQSNIADYSEIPCTIGEENYTHYIAENLMGTNSSITTDKD KVTDKLATFIEIVGERMSEGNKEIATFKMFIGENEKDFNVTRNNAYNYNINLNLADASDK RITIEKIPLEGIAAEANCYILNPNSEHELLVPVKRVNAFWGTEEGGFKTDHMLNDGNNQG KVTKWVARIITKDSSKELLRFTTQQGSSANDYIGIKPTGNEGNVLIGIYDATAGEPAQDA KPLWSWHIWITDYNPGGDINGIIPAISQNPGKADVTGGAIYRFGGIDNNQTKDGSTQAMM DRNLGALSATPTDGAKTIGMYYQCGRKDPIMTDWSQITKLGTSSLGGDVYSIHIKTAKGP VSKETAVKNPDTYFLGDADHFTYKDWLIEGSGSTELWYKKSDEKTKTLYDPCPAGWRVPY KMNTYGEINGEDVDNGIYLCPNSSIKIFYPFGGHLNAITGQYPGGSLFEYKTAQNGWLGS YPDNPNNAYSYHIFSAALVDEYDGPFYPIMPNNADGRYLGYSVRCVKE >gi|222159338|gb|ACAB01000021.1| GENE 10 16202 - 17794 961 530 aa, chain - ## HITS:1 COG:no KEGG:BT_0763 NR:ns ## KEGG: BT_0763 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 519 16 551 561 187 26.0 1e-45 MGVISIFTSCNPHPNLSKADQLINFYPDSSLTYLKAIQHPENMSPKDYAKWCLLVAQAKY NNAQSLVPDSFVFKAFDYFSKDTSDPILTARTYFYTALISKEIRNIEEAAQYLLKARDFA EKANDYNLAFKISHDLCDIYSKQGLFDYKLTEAWRGYNYAQISKDSLSIYYALADLGHAY STKEQIDSALYFFEKAFPIAQKVHPKATSSALNDICYAYINKKDYISALKICNEAIACEK DSIDRYNNYIIKGVIFQDIQQYDSAIHYFTLSSKNQYIYAKALSYSYLGEIYEKKGNILK ALEFIKTYELLRDSIEDQNQSVAIIRMQNLFQNEKLQKNNKELSKKMEEISSLVYKISAI AAFALLITGLCYFITYKKKSERIRKQEREISQAKDKLQQQAIENLQKEEKLSSLQADFLR RLVAINIPSLNNKANDLVIKLSNEDFANLEKDINATFDQFTVRLKKEYPLLDKNEIQFCC LIKIQLDLNTLANIYCRSKAAISKRKLRIKQEKLNITDKNISLDDFIQRF >gi|222159338|gb|ACAB01000021.1| GENE 11 17941 - 19083 1182 380 aa, chain - ## HITS:1 COG:STM0394 KEGG:ns NR:ns ## COG: STM0394 COG2814 # Protein_GI_number: 16763774 # Func_class: G Carbohydrate transport and metabolism # Function: Arabinose efflux permease # Organism: Salmonella typhimurium LT2 # 1 367 1 367 390 311 50.0 2e-84 MKKSLIALAFGTLGLGIAEFVMMGILPDVAKDLGISIPTAGHFISAYALGVCVGAPVLTL 
ARKYPLKHILLVLVTLIMIGNICAAMAPNYWVLLAARFISGLPHGAYFGVGSIVAEKLAD KGKGSEAVSIMIAGMTIANLFGVPLGTSLSTMLSWRATFLLVGIWGIVILYYIWRWVPHV EGLKDTGFKGQFRFLRTPAPWLILGATALGNGGVFCWYSYINPMLTNVSGFSAESITPLM ILAGFGMVMGNLISGRLSDRYTPGKVGTAAQALICIMLLLIFFLSPYKWAAALLMCLCTA GLFAVSSPEQILIIRVSKGGEMLGAACVQVAFNLGNAIGAYAGGLAVSGGYRYPALAGVP FALIGFTLFLIFYKKYQAKY >gi|222159338|gb|ACAB01000021.1| GENE 12 19385 - 20791 1110 468 aa, chain - ## HITS:1 COG:MA2933 KEGG:ns NR:ns ## COG: MA2933 COG1073 # Protein_GI_number: 20091752 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Methanosarcina acetivorans str.C2A # 29 465 42 489 496 352 41.0 6e-97 MKTAVFKTKMRALVFLLAGLALSSSLSAQDISGAWHGKLSLPTSSLTIVFHISQTEQGAY VTTLDSPDQGANGIKTQTTSFNDSTLIIQIPVIHASYKGKLNSDNTINGTFTQGMPLPLN LKKGEASRPKRPQEPQPPFPYRSEEVTVRNERDGINLAGTLTLPEKGTKFPAVVMVTGSG AQNRDEEIMGHKPFFVIADYLTRNGIAVLRCDDRGTAASQGTHATATNEDFATDTEAMVN YLRSRKEINAKKIGIIGHSAGGIIAFIVAAKDPSIAFVVSLAGAGVRGDSLMLKQVELIS KSQGMPDAVWQGMKPSIRNRYAILQQTDKTPEELQKELYADVTKTMSPEQLKDLNTIQQL SAQISSMTSPWYLHFMRYDPAQDLKKLKCPVLALNGEKDIQVDAAMNLAAIQERITGNGN KNVTVKAYPNLNHLFQTCKKGTLAEYGQLEETINPEVLKDIIEWIRKQ >gi|222159338|gb|ACAB01000021.1| GENE 13 20816 - 23200 2185 794 aa, chain - ## HITS:1 COG:no KEGG:BT_3111 NR:ns ## KEGG: BT_3111 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 794 1 793 793 1257 86.0 0 MKHLKRFIATLAITVLSFHSYVNGCGGDFYDYMNYYNLFDQLLLENKGLQPFLLTTDDAF YGEDMNPDGKQQQPDENINAWMTFFKKNKTLQDMNEEQFKALLYTASYQSLKQPSSPYVT ALNKTDAGKHTLTYLQYAKELEPYAQLSNNGDWWEMKRTSSPSEETYTHYKDKGLELYRN CPYDELKLRYGYQLVRLAHYMRNKNNEAIRMYNLYVKPLKQEHYIYYAALEQTAGALYNI GKLANANYLYSRVFDHSDNRKKTAYTSFKIQNEIDWNEAMSWCKDNREKAAMYALRGYNT FSNELEEVENILNIYPESPYIKLLAIRYINKMERDILTRYSHSDATDDTSSFMQPSGKVL AEYERAQQIIKAVMNHPKVGDKDFWALYHAHMSFLCKDYRQAATFIDSLQTTKPELLKQK SRTQFSLYLAQLKIIGEDEEQAIRQYLQTSHADEDFINEIVGHLYKMQKDYGKAFLTHNR IENLRQNPDPDIINSLLANAGKENDQTLLTQLYELKGTYYLRMNNFAEAAKWFAKVPESY 
SLTHYKYDYETEKYIPTGISSDDFNGYSEISPLIFSNGFKRLFSVPAASQLTDVMYEQYL YLNKEHNKATLTAALMQLEKESQMMTEKGARAAYMLANYYYNISPTGYYRNIPAYFTDNS YYWSAYVSYGTTGPNSIPDYSKEYNYRDFTQEYMIVDNMENALALYEQAATYFTDREYKA RALFMASSCTMDLYAQNWWSNWNNILNPDFSRSDDEKKVDSYFRQLKKSYSDTQFFKEAV HECKYFEYYVKTEL >gi|222159338|gb|ACAB01000021.1| GENE 14 23197 - 24060 580 287 aa, chain - ## HITS:1 COG:no KEGG:BT_3110 NR:ns ## KEGG: BT_3110 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 287 57 341 341 513 85.0 1e-144 MFDVDNKGNGAFPTADYSPSFSLDNPGFQQKIVPVIFITNKAIRQCTAADIEKLARNCAD RIDTLYRLHFNKLPTEYQFDCDWTEKTKENYFNFLNHIRKLRKGVPISCTIRLHQIKFKD KTGIPPVDKGTLMYYASSEPTDFENENTILNNKDAASYIEKIGSYPLHLDIALPLYSWGI VRNPFGQIKLINGVRKATIDARPDYYQPTKDGIYCILQSHYINGVWVNKDYELKVEEVSP ETLLEAAKLLRKKLRKENREIIFYHLDKEIIKQYSTEQLINIINTFS >gi|222159338|gb|ACAB01000021.1| GENE 15 24508 - 26034 1548 508 aa, chain + ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 23 507 29 483 497 166 26.0 7e-41 MKKIIYLVPTLSLVGCTSQHVEEKPNVIVILADDLGFGDVSAYGSTTIHTPNIDSLAHGG VCFMNGYATSATSTPSRYALMTGMYPWKNKDAKILPGDAPLIINENQFTLPKMMQQCGYV TGAIGKWHLGMGEGNVNWNETVKPGAKEIGFDYSCLIAATNDRVPTVYVENGDVVGLDPA DPIEVSYEHNFEGEPTAISHPEMLKMQWAHGHNNSIVNGIPRIGYMKGGQKARWKDEDMA DYFVDKVKNFVTEHKNTPFFLYYGLHEPHVPRAPHQRFVGKTTMGPRGDAIVEADWCVGE LLAHLKKEGLLEKTLIIFSSDNGPVLNDGYKDGAPELAGDHLPAGGLRGGKYSLFDGGTH IPLFVYWKGKIQPVVSDALVCQVDILASLGSMIHADLPEGLDSRNYLDAFFGKEQTARKD VVLEAQGRMAYRSGDWIMMPPYKGSERNLTGNELGNLGEYGLFNVKADRTQYQNMAAQQP ELLDSLKQNFFAEVDGYYRSEVEEEPLK >gi|222159338|gb|ACAB01000021.1| GENE 16 26154 - 27806 1471 550 aa, chain + ## HITS:1 COG:no KEGG:Cpin_2848 NR:ns ## KEGG: Cpin_2848 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 11 392 9 377 386 136 29.0 2e-30 MKMKAFLMFVLLFLCSMGTKAQDLGANYNENIDYPITEIEMLKQSKVTWVRGFVNIPNLF 
LQNENGKIVGVKENAIRTHIPTLKFIQAKKALGDRVKFILSLKIPFELYTDTVPKVGTKE MEYIFQATEVLLKTYDMAKNIEILVMGNEPEWENALDTDLCHADGEDYRAFLNEFANRLT AWKQANGWTFDIYAGALNRVSELPKSETVPAVVSVVNNNPNVVGLDLHVHALKINQAEDD FRIIRDKYGVTKKLICTEFSMVRALNPHVADALGEWGTKHGYTAGMKIYEYLNLIAEKAN AGTPVSATEFKSLFESYSWYPKNWYKTFYEVFKKYDTYAITGRFSVVPGGARAVYDAKTE MWELGGIYFSRYLGLDADGFYNPNPLLYPDFIAARDGLAVSSLVGGQRELFIRWGNGADK TGILSVTDENGTEVVRRSLDSENDYTLVENLVPGTAYHVALLKSDDAVLWKDDVKTKSVT GKFPLLKYQQVDEYMLVQLLNLPADVSSYKVKLDGQEINIVNKNLNGQILTAEVTYKDGS VEILSTTVRK >gi|222159338|gb|ACAB01000021.1| GENE 17 27828 - 28238 267 136 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717023|ref|ZP_04547504.1| ## NR: gi|237717023|ref|ZP_04547504.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 136 1 136 136 275 100.0 1e-72 MKVIYVVLVCLLAFTSCAETEYAENGVDKKNITGYTEVGFYSLDGICTFTEQEQLQAAIN PKRLTYRLQNSDQSKYIHVKFDGKPTAVGQKIKAEFSFQGVSFLKEGMEMEVVRIDGDKV WLSGDNVGMVIPVFNN >gi|222159338|gb|ACAB01000021.1| GENE 18 28251 - 29876 1347 541 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1995 NR:ns ## KEGG: Cphy_1995 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 144 396 410 631 691 82 29.0 6e-14 MKRYIPVVASLMAFSLIACGEVMDLTQPEKAEVTYSNITLSLYQTGKYDLYLDEPEYQYN IMVEKSHCEKEAKAELAVVDAKEFGEEYHLLPVEYYDLDGSNFNFKGDDVLRMVNLRFHD LGTLDGSKKYVLGLKLVSDDLAVNQEKSTMTFFLQQKQGEIDNPYTVATTSDLITLGEKL KDGKTIYAKIENDIDLQGVDWQPIETSVSKQLVLDGGGHTIRNLKVNTSSSVNQGFFGLL VGKCSNINFENAQITANTKMAGILAGQVGAATSPGIVENVRVSGTISLTSGTGAAAWDNG QAGGICARLHGADSKIHQCTSATDITAVWCAGGICGETRGGATISQCSSTGDIVTESCVG GIVGRMLYSIVTQCYSSGLAQAFPMKVANPAGGIAGFVDPSPTAVISYCYSDCEVSAQNQ VGGIMGFANKATGITVTHCVAWNQKLFSNGAPKSGRICGRFNKNEANNCYANPNMECRFS ANTPAIVDEQIPNYAAQNFGADRYNGLSTMSTPIDAAKALSWDKSIWNLTGDKPGLVWMI D >gi|222159338|gb|ACAB01000021.1| GENE 19 29921 - 30910 670 329 aa, chain + ## HITS:1 COG:BH3683 KEGG:ns NR:ns ## COG: BH3683 COG3507 # Protein_GI_number: 15616245 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Bacillus halodurans # 
36 285 14 265 528 68 25.0 1e-11 MKKHKMTKRLILALGIICLSVTGVRADSVNEKEITYADPTIYVENGKYYLTGTRNQEPQG FAILESTDLEHWTVPDGSSLQLILRKGDRTYGEKGFWAPQYFKDKRTYYFTYTANEQTVI ASSKSVFGPFRQKEVKPIDASAKNIDSFLFKDDDGKYYLYHVRFNKGNYLWVAEFDIKKG SIKPETLKQCMDCTEPWEKTPNYKSAPVMEGPTVMKWDGVYYLFYSANHFMNIDYSVGYA TASSPFGPWKKHPNSPIIHRSLVGENGSGHGDVFKGLDGKYYYVYHVHRSDSTVSPRKTR IVPLILKKGNDGIYNITVDKEHVIKPMWK >gi|222159338|gb|ACAB01000021.1| GENE 20 31051 - 34239 3294 1062 aa, chain + ## HITS:1 COG:no KEGG:BT_0364 NR:ns ## KEGG: BT_0364 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 1060 3 1025 1027 552 35.0 1e-155 MKRNRLYFSFFYLCCFLLCALDVQAQQIYWKGQVKDAVSGEPMIGVSVRVKGTGSGTITD FDGNFTVKASKGDILVISYVGYKTLELDLKNKTTLGVISLGEDTETLEEVVVVGYGVQKK VSSVGSIATAKGDDLLKIGSVNSVSEALQGQMPGVVAINSTSKPGADKASLLIRGKSTWG EADPLVLVDGIERDFNDVDVNEIESISVLKDASATAVYGVKGANGVILLTTKRGLEQKPE ISFTANFGFKQPSAAPEWSDYVTSMKQYNRAQANDGNWGAMVPESTIAAWENAYATGNYG PYNDVFPEVDWWKELVKNVGYEQAYNLNVRGGTKKMSYFVSLGYLHDGDIFNTTKQEDFD PSFSYRRYNWRSNFDFNITSTTKLSFNVAGKMGYQNQPSYYENVDSPDERFFKTFFTAPS NEFPIKYSNGIWGDGLSSDQNIACLMNEGGSRNIKQHQGFYDVILNQKLDFITKGLSLKA SLSYTTSSSWTTQIMPGKVLGKDDLVAQRTHIRINRVYDYTNPIYNADGTITYNYTEKRY PDENAPGDLPVGGVYDGFKAYGRKLYYELALNYSRQFGDHDVSALFVFNRKMNESTNTTN SGVMNFPAYEEDWVGRVTYNFKERYLAEFNGAYTGSEKFAPGRRFGFFPSASIGWRISEE PWVKKLTKGVLTNLKVRYSYGVVGNDKGATRFNYIQKFEQLSANAQFGKYQTSNWGPLYK EGKLADPDATWEESIKQNIGIEIGLWGKLNFTVDLFDEKRNNILMTRNTIPSWADSGIAF PQVNLGKTKNHGLELDIAWNDRIGKFNYYAKFNFATSENRIVFIDDPKNQSEYLKQAGKS IGYVNKYLATGNFQSLDDIYNSAQSTIANGAHNTLIPGDLYYIDYNGDGMIDAKDMVPMK NLNYPTTTLGFTLGGSYKGIGFNMLWYSAMDVYKEAIPSYLWDFPEGNIKAQPNTLNTWT ADAPIQSGPVRPSIHVQRSYNSVASTYTYTNHAYLRLKNLEVNYQIPKRWLQPLRLTKLQ VYVNGSNLLTFSKGDSRRDPEHSGQNVYPMVRRYNIGFRLGL >gi|222159338|gb|ACAB01000021.1| GENE 21 34236 - 36146 1578 636 aa, chain + ## HITS:1 COG:no KEGG:PRU_2735 NR:ns ## KEGG: PRU_2735 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 20 636 7 624 624 190 27.0 1e-46 
MRHFKFTDMRRHKLNWIWIILLGLFIGSCEDYLDRSPDDGLSEDDVYKDYSSLLGFMDRI FQNDDILIFTHGINSYSNAYVTVGNLSDEYASVRDNDPSKFVNAGNWLENAGTRFEIGSK SDGKNFKSAISRAYTGLRIVNRVISGVDQVKSITDDQRRKLLGQAHFLRAYFYFEIIKRY GGMPIFDQLWGASDDFDFPRKTYQESNAWMQTDLDKAIEYLPISWPDEEHGRPDRVSAMA LKAMTQLYAASPLMQNDLNSIENKGYGKEMAAEAARSAQKAINAIESHEYYRLMNHDEYR SIQLMPNSNQFAQPEYLWFLRWHHGNWSAFVRAQWLTQPYDNKTGAEGTPYNAPTQNAVD MYERKGADGNYYPITDPRSGYDAVKTTNPYSDRDPRFTNNILVPGEQWGKNLQGVPYYLT TYSGGYSENFISTNQFTRGSQQTGYMCKKFIWPEASVPLFGEAGFQLYRLVAVYIRVSQV YLDMAEASYEATGDPDAVVTGCTMSARQALNKIRVRAGIGELPDGVDFREAYRRERGVEL MFEGHRWYDIRRWMIAEDLFKGEYPIMGVKATPINHSYTADQLKVEKACTYKLTDFTYEY VPVRTAVRTFNKRNYWYPLPMDEVAALDNLQQNPGW >gi|222159338|gb|ACAB01000021.1| GENE 22 36169 - 39252 2934 1027 aa, chain + ## HITS:1 COG:no KEGG:Phep_3875 NR:ns ## KEGG: Phep_3875 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 131 1027 46 937 937 384 30.0 1e-104 MRNKIRYIRLIGLIFVLIVGSSLTFAQNKSKRKSAKKPLVEVVSIVTDEKGSPLKNVSII CGEGAVVLYTDAKGRFQTKVENDATVMVEALGYEDKVYKLTGSNIVPEKIVLKKEPLFLT EESLVNRADGGKTYKGNEVGATSVLENGQFGTFPDLTLTNMLQGKMLGLQVRSTVSGLGN NTPDLFIRGQHGMSENTAIVIIDGVERPAADLIPEEIERIELLKDATAKILYGARAANGV LWVTTRRGKANRRIYNATAEAGVVQMTRTPDFLNSYQYANLYNEARANDGLTPYYNQKQL EGYKNSKGANDLLYPNVDLYDQLLNKNANYRKVSFDMTGGTDRVRYALIAGYVGGSGFED VTYTPQLHRLTLRGNLDFNVTDFLTISADVAGRMEMRKWGQLDCGQVFTALSTHRPNEYP LTMSPEETGLASSDGIPLFGASLLRPMNAYAETMYGGYTDERYTRSQTNIGLKFDLDMLT KGLKAGAFLSFDNYDYLQLSLSKVYPTYAIKTYRNFAGEEQIMYTQMKKTDVATSQSRKS TTLQQTLGWNAFAGYENTFNQKHDVSARLTYMYSKTTNQGVTQDIINANYALRLNYMYDR RYAVEADMALMGSNRFKPGNKYFFSTAGGLAWILSNEDFLKDNEYVNFLKLKTSAGILGY DRSTEHLLYERAWSQDGSFRFGTTNNGATAYYSTFVRAGNPNLKWEKAAEWNIGVEGLFL NNRLYTEINYFREKHTDIIGSVDASYGDYTGNFTYQDNMGSVLNHGIEGMFTWSDRINDW SYSVGANFVWSKNKVLKWNQVKHGEEYRYTVGRSTDAMMGLVAEGLFGKDVAINGHAPQT FADYQEGDIAYKDLNGDKIIDGRDVKELGNSFPRTTLGIDFNVSYKGWGLYLQGYSELGV HTWATNAYYWNNGEGKYSKLALDRYHPANNPTGSYPRLTTTAGENNFRNSSFWLKNTSFF RMKNVELSYTFNQFSPSSVVKKMKVFARGANLFVLSSVKDLDPELLNAGVTNYPVTRTFT GGVSFVF >gi|222159338|gb|ACAB01000021.1| 
GENE 23 39278 - 41074 1546 598 aa, chain + ## HITS:1 COG:no KEGG:PRU_2737 NR:ns ## KEGG: PRU_2737 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 1 597 1 589 589 244 31.0 7e-63 MKHILGSCVLAGLFFFCACDDTLDTKVTWQIGDSDTWRVPELAQGVLYKAYNGIANRPDC FEDNFLDCATDNAVTNLRNSSVYKLNMGGMTAFNNPIGNWSNAYNMLNYVNSFLENGLTD QVQYNRTDPEVDKQIKLRLEGESYFLRAWWHFELLKMYGGKARNGKALGIPLADHFISQD EAAQNGEFLRPTYQATVDFIVNDLNNAIELLPNVYQGDDLEFGNTQIGRATKAAAAVLKS RALLYSASPAMQDDDVTKITGMGQFEILNPTVYQAKWEAVAKEINKILGMEGFGTYVPVT ASSIADVQTESSDYAFRRYFNNNLLEGFHFPPFYYGSARATPSHNLVKAFYAKNGYPATD VRSGVDLSDPDFDLTQLYAVLDNRFALNVYYHSATFGDSGQPLDMSEGGKDSPSFSENAT RTGYYLAKFVSKKSAMLNPIQTLNSVHYNPLLRKSEVLLNFAEAANEAWGNPTVKAEGCL YSAYEILKTIRSQAGGINFDLYLDEMAQSKDSFRKLIQNERRLEFTFENHRYFDMRRWVL PLNEEVEGVAVTRNEDGTFSFKVQKVEQRKYEVKNYFMPLPYAELEKNKNLMNNQGWE >gi|222159338|gb|ACAB01000021.1| GENE 24 41104 - 42006 848 300 aa, chain + ## HITS:1 COG:no KEGG:Slin_2105 NR:ns ## KEGG: Slin_2105 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 26 226 16 226 331 67 27.0 7e-10 MKRLNILWIGLISVLMTACYDDYEKDYDKSSVYFASQKPLRTLVADTDMSIKVGVAIGGK REVHTDDWATFEIDPSLLEGTGLTLMPENYYQLANPNKMTISNPNLAIADVKVTFSDAFY NDDAALDKHYAIPFRLVDHNQDEVSTDVNGQLKDYSIVVVKFVSQYHGTYFVKGKVTNLS TQQVTEYNNKDLSQNMTRDFVSLGRNKLRRPGFGNTLENNESVNLTVNPDGSVTIEAGGS VAIIDASASLDPAAESLEFVGKQPKFTLSYKYTKGGVTYQVDEELIRRQNPEADLRFQEW >gi|222159338|gb|ACAB01000021.1| GENE 25 42139 - 43104 763 321 aa, chain - ## HITS:1 COG:PAB0790 KEGG:ns NR:ns ## COG: PAB0790 COG2865 # Protein_GI_number: 14521387 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Pyrococcus abyssi # 3 237 11 237 287 116 31.0 7e-26 MSESQNIEYKESWRDEYLKWICGFANAQGGRIYIGIDDNQQVVGVADTKRLMEDIPNKIV NYLDIVADVNLLHKEDKDYIEIAVQPCNLPIAYHGIYHYRSGSTKQELKGAALQQFLLRK MGHSWDDIENERATLDDIDRQAIDFFLRKAVDAGRMPVDSLNETTEKVLSNLNLIGGEGK 
LRNAALLLFGKNPAKFFTSVQFRIGRFGRDEADLMFQDVVDGNIIQMTDRVIEVLKSKND DPLNDPLNDPLKLSSLEKDILKLIVKDNQSTYDNLAISLSVSSATIKRAFRNLKVNGCII RKGSKKIGYWMITEKGKSILK >gi|222159338|gb|ACAB01000021.1| GENE 26 43181 - 47191 2846 1336 aa, chain - ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 773 1073 12 315 318 141 32.0 7e-33 MRTTALKIVLSFSLVLLNASLSTGQIQYNNIRFKQLSIAEGLPHNTINAITQDNHGFIWF GTRNGLCRFDGYNINLFAHNEADSTSLCHDFITRLYNDSLRNVLWISTDQGICSYDYRTE DFTRYRIKGNTKDDVCFLNTSDGMLLAACSNGLYHYNEQDSLFVPFLLNEEKPHVRYFAE DGDKTLWIDTNKGLMRYSLEKKQFVSLPTLIQPFAQQCNNAVLISSNQLLFNTNNDFFVY HIQSNTLCNLSKDLEVKDFRCASTDHTGNIWVGTEYGIFVFNKLYQPIAHYEQSERDLSA LNDSPIYSLYQDKAHNMWVGTYFGGVNYYIFGSDQFRIYPYGGSFNHLSGKAVRQIINAP DNGLYIATEDGGLNYLNSKKEITRAERLHKQMQIYAKNIHSLWLDKDNSLWLGLFLKGAL HYIPHLNRTVDYNLLSEEVSSGFCIIEDKNDHVWYGGPSGLFLIDKKKANARPEKISPLR IFNLVQFNDSILWAGTRKGSMFQINTRTLKVTSLPILPHTDLYVTYIYPDSHQRIWIGTD NNGLYVSDRNGSIIASYSKEQLGSNAIKGIIEDEMNNIWVGTGSGLCCINPQMENIDRYT TADGLPINQFNYSSACKKPDGELYFGTINGMISFYPEQVRSVNPHFNIALTAIWSNSEYM SPNNEKASIPSSISELTEITLTHDQAQSLRLEYSGLNYQYTYNTQYAMKMEGIDKDWQFV GNQHQVRFSNLPSGRYILKIKASKDGIHWDETGQKDLAIRILPPWWLSPGAYFVYALLSL LIIYAAYRYTKTRIILLMRLKTEHEQRVNMENMNQQKINFFTYISHDLKTPLTLILSPLQ RLIQQPQISNNDKEKLEVIYRNANRMNYLINELLTFSKIEMKQMHISVRKGDIMHFLEEL SHIFDIVAGEREIDFIVSLEDTKEEVWFSPSKLERILYNLLSNAFKYTQPGGYVRLSAKL IKEEKETMVQISVKDSGRGIPKDVQEQIFESYYQVEKRDHREGFGLGLSLTRSLIHMHKG EIRVESEVNKGSDFIVTLNVSEDAYAPDERSSENITTEEIQKYNQRIKETIELIPEKLTN KEKDSGRESIMIVEDNKEMNDYLASIFGEKYDIIRAYNGAEACKKIARQLPNLIISDLMM PVMDGLEFTERVKQDVTTSHIPVILLTAKTDENDHTEGYLRGADAYITKPFNAKNLELLV QNIQKSRKQNIEHFKQAEELNIKQITNNPRDEVFMKELVELIMANLTKEDFGVTEITAHL RISRSLLHMKLKSLTGCSITQFIRTIKMKEAKTHLLNGMNVSEASFAVGISDPNYFTKCF KKEFNITPTEFLKQLK >gi|222159338|gb|ACAB01000021.1| GENE 27 47400 - 49457 1864 685 aa, chain + ## HITS:1 COG:CAC3436 KEGG:ns NR:ns ## COG: CAC3436 COG3534 # 
Protein_GI_number: 15896677 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Clostridium acetobutylicum # 6 536 9 510 835 270 31.0 6e-72 MNHPTLKSLTLGIGLFFTLPLVYANSSFPSSSDGTLYINKSKTRKVAPVKYGFHYEEIGM MGEGALHAELIRNRSFEEATPPAGLSVKNGLYENVPAPRVKEKKVFQADPLIGWTTYPLS YAPVFVSRTETDPMSEENKYSMLVNVTEDIANHPDALILNRGYYGMNLKTDTSYRLSLFL KSRNYSAPLRVFLVDELGQQVSNVIEVNIENRDWTKYTGELKPEKNVQRGMLAIQPMSKG QFQIDVVSLFPSDTWNEGKSVFRKDIVQNLKEFAPSFIRFPGGCIVHGVNEETMYHWKKT LGPIENRPGQWSKWAPYYRTDGIGYHEFYELCEYVGADAMYVMPTGMICSGWVKQSPQWN FRHIDVDLDAYIQDALDAIEYAIGDTTTKWGAERAKNGHPAPFPLKYIEIGNEDFGPVYW ERYEKIYQALSTKYPDLIYIANSVIRVVGRENDDKRKDIPNFVNPKNVKVFDEHYYNSIE WACEQHYRFDNYKRGVADLFIGELGISGKYPYNLLATGAIRMSIERNGDLNPLFAERPVM RHWDFLEHRIFLPMLINGVDSSVKTSFFYLAKMFRDHTFDVCLDAAIKDMEGMQNIFVTM GYDTGSKQYILKLINLQDKKVTLQPEVSGFKRPVKAHKTSLVLVPGKENTPATPNEVKPV ETEVGLDLNKPLELEAASMTVYRFK >gi|222159338|gb|ACAB01000021.1| GENE 28 49546 - 51093 1349 515 aa, chain + ## HITS:1 COG:PM0598 KEGG:ns NR:ns ## COG: PM0598 COG3119 # Protein_GI_number: 15602463 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pasteurella multocida # 41 497 1 456 467 161 28.0 4e-39 MKNTLFCITGLAIQGLAMSAAAQTDKPNIVVIMTDQQRADLCGREGFPLEVTPYVDQLAQ ENVWFNKAYTVMPASSPARCSMFTGRFPSATHVRTNHNIPDIFYQQDLVGVLKENGYQTA LVGKNHAYLKPADLDFWSEYGHWGKNKKATPAEKETARFLNQQARGQWLEPSPIFLEEQH PTKIVNEALAWIEKQKENPFFVWVSFPEPHNPYQVCEPYYSMFSPDKLPVLKTSRKDLAK KGEKYRILAELEDASCPNLEQDLPRIRANYIGMIRLIDDQIKRLIESLKASGQYENTIFV VLSDHGDYWGEYGLIRKGAGLSESLARIPMVWAGYQIKNQPAPMDGHVSIADLFPTFCSA IGAEIPAGVQGRSLWPMLTGKAYPKEEFSSMVVQQGFGGADVGLDASLTFEQEGALTPGK IAHFDELNTWTQSGTSRMIRKDDWKLVMNHYGNGELYNLKKDPSEVHNLFGEKKYSEIQT ELLTRLLAWELRLQDPLPLPQRRYHFKQNPFNYLK >gi|222159338|gb|ACAB01000021.1| GENE 29 51124 - 52254 933 376 aa, chain + ## HITS:1 COG:no KEGG:BT_3094 NR:ns ## KEGG: BT_3094 # Name: not_defined # Def: putative secreted xylosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 376 1 
376 376 694 88.0 0 MKLIGRLGLILLTGFISICTASAQLYGTANTNAPELRVPKFVKPAFDYWMRDTWATLGPD GYYYITGTTSTPDRYFPGQRHCWDWNDGLYLWRSKDMKSWEARGQIWSMEKDGTWQKKPK VYKAGEKYQKKSINGDPMDNRFHAVWAPEMHYIKSAKNWFIVACMNESAGGRGSFILRSK TGKPEGPYENIEGNKDKAIFPNIDGSLFEDTDGTVYFVGHNHYIARMKPDMSGFAEELKT LKEKKYNPEPYIEGAFIFKYDNKYHLVQAIWSHRTSEGDTYIEKKGVTSPKTRYSYDCII STADNVYGPYSERYNAITGGGHNNLFQDKDGNWWATMFFNPRGAQAAEYKVTCRPGLIPM IYENGKFKPNFNYNTK >gi|222159338|gb|ACAB01000021.1| GENE 30 52251 - 53918 1661 555 aa, chain + ## HITS:1 COG:PA0183 KEGG:ns NR:ns ## COG: PA0183 COG3119 # Protein_GI_number: 15595381 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 22 532 4 527 536 284 34.0 4e-76 MKTSLLIAGSLALVPMTYAQDKPNIIIILADDLGFSDLGCFGGEIHTPVLDKLAKNGIRM TQMYNSARSCPSRANLLTGLYPHQTGLGHMDGSHPAWPKGYSGFRSNSDNVTIAEVLKDA GYFTAMSGKWHLGNKSNPILRGFQEYYGLLGGFNSFWNPEVYTRLPKDRTPRHYEEGTFY ATNVITDYAIDFIDQAHQEKKPLFLYLAYNAPHFPLHAPKEVTDKYMSLYMQGWDKIRDA RWKRIVDLKLMQGKPELSPRGVVPESLFEDETHPLPAWDSLTKDQQTDLARRMSIFAAMV DVMDANIGRVVDELKKNGELDNTFIMFMSDNGACAEWHEFGFDKQTGVEYHTHTGAELDQ MGLPGTYHHYGTGWANVCCTPFTLYKHYAHEGGISTPCIISWGNQVKNKGGLDHQPAQFS DIMSTCVELAGATYPKEYQGRAILPTAGKSIFPIVKGKKMPERYIYAEHEGNRMVRKGDW KLVSANFKGDEWELYNIKEDRTEQHNLIGKYPEMAKELETAYFEWADKSDVLYFQKMWNT YNKNRRKDFKEYKTR >gi|222159338|gb|ACAB01000021.1| GENE 31 53924 - 56128 1849 734 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 108 501 117 495 1087 84 25.0 8e-16 MDKRILLTLALGAMSQVFTHAWEPKGDKIKTVWAEQVTPENVWQSYPRPQLQRAEWINLN GLWKYAVTDQNTPRKNVSFEGEILVPFAIESSLSGVQKTFLPTDKLWYQREFTLDPSWKN KSTILHFGAVDYECQVWVNNRLVGTHKGGNNPFSFDITKYLKKSGPQSIEVAVTDPTDTE SISRGKQQLNQEGIWYTPVSGIWQTVWLEAVNKTYIRQVLPSTDIERKSVKLAFDIIGAK GNEEVKIEVLDDGKVIKTVEQKLTNAIEIDVPNAVLWSPESPKLYHLNIVLSNGGRVLDR 
VKSYFALRKVDVRKDECGYNRICLNNQPIFQYGPLDQGWWPDGLLTPPSEEAMLWDMVQL KKMGFNMIRKHIKVEPEQYYYYADSLGLMMWQDMVSGFATSRKKEEHVNPLATTDWNAPE EHTRQWQKEMFEMIDRLRFYPCITTWVVFNEGWGQHNTVEIVNKVIKYDDTRLINGVTGW TDRGVGDMYDVHNYPVTSMILPENNGNRISVLGEFGGYGWAIKEHIWNPNMRNWGYKNID GAMALIDSYGRLVYDLETLIAQGLSAAVYTQTTDVEGEVNGLITYDRKVTKIPEGLLHLM HNRLYEITPAKAVTLIADSQNGSKNTRLVSLNGQELKMTSLPFDCPPRSTVVSEATFKVD KDFSHLSLWLNVAGEAKVWLNGVEVFAQEAKQTRQYNQYNISDYSRYLRKGSNLLKIEVK DSKKMRFDYGLRAY >gi|222159338|gb|ACAB01000021.1| GENE 32 56298 - 56741 478 147 aa, chain - ## HITS:1 COG:alr2922 KEGG:ns NR:ns ## COG: alr2922 COG0346 # Protein_GI_number: 17230414 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Nostoc sp. PCC 7120 # 1 131 3 132 135 114 39.0 8e-26 MRFSNVRLLVKDFAKCFKFYTEQLGLEPSWGDENSGYANFKVADGIEGFTLFVSDWMAPA VGNADKQQPVAMREKLMVSFSVDNLDETFADLKAKGVTFITEPTDMPDWGMRTLYLRDPE ENLIELFTPLAPEQCSQELLEENKKFQ Prediction of potential genes in microbial genomes Time: Wed May 18 01:18:21 2011 Seq name: gi|222159337|gb|ACAB01000022.1| Bacteroides sp. D1 cont1.22, whole genome shotgun sequence Length of sequence - 12146 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 2, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 415 - 474 4.1 1 1 Op 1 . + CDS 545 - 1573 455 ## COG3177 Uncharacterized conserved protein + Prom 1575 - 1634 6.6 2 1 Op 2 . + CDS 1760 - 3409 1551 ## BT_3091 putative regulatory protein + Term 3461 - 3511 9.2 + Prom 3490 - 3549 3.8 3 2 Op 1 . + CDS 3616 - 6621 2983 ## BT_3090 hypothetical protein 4 2 Op 2 . + CDS 6637 - 8127 1281 ## BT_3089 hypothetical protein 5 2 Op 3 . + CDS 8164 - 9678 1492 ## BT_3088 hypothetical protein 6 2 Op 4 . + CDS 9693 - 11471 1603 ## BT_3087 cycloisomaltooligosaccharide glucanotransferase 7 2 Op 5 . 
+ CDS 11508 - 12144 490 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases Predicted protein(s) >gi|222159337|gb|ACAB01000022.1| GENE 1 545 - 1573 455 342 aa, chain + ## HITS:1 COG:NMA2029 KEGG:ns NR:ns ## COG: NMA2029 COG3177 # Protein_GI_number: 15794909 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Neisseria meningitidis Z2491 # 74 341 17 276 290 116 32.0 7e-26 MEEEIQQYLRFHPLSSRSELMEGVNTKVSVATFKRLLAAMISAGSIEVIGQGPATCYKLT PQTFVTSYFDLESYFRKEVDEREIQQAFNFSLIPDILPNVDPFTMDERKHLTALQETFRR NVLEMTDGEYRKEMERLGVDLSWKSSQIEGNTYNLLETERLLLEKEEAKGKTKEEAIMLL NHKEALDFILDNPDYLQYLSIRKIEDIHSILIKELGVERHIRSRRVGITGTNYRPLDNEY QIREAMEDTCQLINNKESIFDKAFLALVLLSYIQPFADGNKRTARITSNAILIANGYCPL SFRSVDSIDYKKAMLVFYEQNNIYPFKKIFIEQFEFAVNTYF >gi|222159337|gb|ACAB01000022.1| GENE 2 1760 - 3409 1551 549 aa, chain + ## HITS:1 COG:no KEGG:BT_3091 NR:ns ## KEGG: BT_3091 # Name: not_defined # Def: putative regulatory protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 549 1 550 550 905 91.0 0 MKKVILIFVTIVLSGLLYAKDNKSTDALLREIDGIIKNRQTYGAEKEARIADLKKLLAEA TSDEQRYGFCGRLFDEYRAYNLDSSFVYAQRKEELAHRMDKLDYLDDAAMNMAEVMGTTG MYKEALELLGQIDKKTLPDYLYGYYYHLYRTIYGLMGDYAVTEKVKKEYYRMTDLYRDSL LQVNASDSLGHVLVMADKCIVHAQYDEAIRMLMEYYNKPSLDDHSKAMLTYTLSEGYRLK GDKQGQKHYLALSAIADLKSAVKEYVSLRKLASLVYDEGDIDRAYNYLKCSLEDATLCNA RLRTLEISQVFPIIDQAYQLKTKRQQQEMKVSLICISLLSVFLLVAIFFVYKQMKKVAAA RREVVDTNTLLQELNEELHDSNSQLKEMNHTLSEANYIKEEYIGRYMDQCSTYLDKMDLY RRSLNKIAAAGRVEELYKAIKSSQFLDEELKEFYANFDMTFLQLFPNFVEEFNALLTEPM QPKPGELLNTELRIFALIRLGITDSTKIAQFLRYSVTTIYNYRTRVRNKALGERDEFETK VMQIGKVEE >gi|222159337|gb|ACAB01000022.1| GENE 3 3616 - 6621 2983 1001 aa, chain + ## HITS:1 COG:no KEGG:BT_3090 NR:ns ## KEGG: BT_3090 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 1001 1 999 999 1823 94.0 0 MYMKQSMKSKGFERRLLLIMWGLFLSLSAFAQQISIKGHVVDATGEPVIGASVVEGRTTN GTITDVDGNFSLSVPANSTLTISFVGYKTQTVPVNGKNSLKVTLQEDTEVLDEVVVVGYG 
TMKKSDLTGAVSSVGVKDIKDSPVANIGQAMQGKVSGVQIIDAGKPGDNVTIKIRGLGTI NNSNPLIVIDGIPTDLGLSSLNMADVERVDVLKDASATAIYGSRGANGVVMITSKRGAEG AGKVTVNANWAIQNATKVPDMLNAAQYAALSNDMLSNNDDNTNPYWADPSLLGTGTNWLD EMLRTGVKQSYSVSYSGGTEKAHYYVSGGFLDQSGIVESVNYRRFNFQANSDAQVNKWLK FTTNLTFSTDVKEGGSYSIGDAMKALPIQPVKNEDGSWSGPGQEAQWYGSVRNPIGTLYM MTNETKGYNFLANITGEIAFTKWLKLKSTFGYDAKFWFVDNFTPAYDWKPNPVEESSRYK SDNKSFTYLWDNYFVFDHTFAQKHRVGVMAGSSAQWNNYDYLNAQKNIFMFDNIHEMDNG EKMYSIGGSQSDWALLSLMARLNYSYEDKYLLTATVRRDGSSRFGKNNRWGTFPSVSLAW RISQEEWFPKDNSPVNDLKLRIGYGVTGNQEIGNYGFVASYNTGVYPFGNNNSTALVSTT LSNPNIHWEEVRQTNFGVDMSLFNSRVNLSLDAYIKKTADMLVKASIPITSGFEDTTETF TNAGKMRNKGVEMTLRTINLKGLFSWESALTATYNKNEILDLNSETPMFINQMGNSYVTM LRAGYPINVFYGYVTDGLFQNWEEVNRHATQPGAAPGDIRFRDLNNDGVINDEDRTVIGN PNPNWFFSLSNNFSYKGWELSVFLQGVSGNKIYNANNVDNEGMAAAYNQTTAVLNRWTGE GTSNSMPRAIWGDPNQNCRVSDRFVENGSYLRLKNITLSYTLPKKWMQKIQLENARISFS CENVATITGYSGFDPEVDINGIDSSRYPISRTFSMGLNFNF >gi|222159337|gb|ACAB01000022.1| GENE 4 6637 - 8127 1281 496 aa, chain + ## HITS:1 COG:no KEGG:BT_3089 NR:ns ## KEGG: BT_3089 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 496 1 496 496 956 90.0 0 MKKIITLVFMSAVLALTSCNDFLDKYPKYGVDPESEVTNEIAVALTTACYKTLQSSNMYN QRIWSLDIIAGNSEVGAGGGTDGLETVQASNFIAQSDNGFALYVWRSPWVGIGRCNIVLS NLPSASVSDEIKTRCMGEAYFLRAHYYYILVRLYGGVPLRLEPFEPGESADIARNTVDEV YAQILSDCKNAVNMLPPKSSYGDADRGRACKEAAMAMLADIYLTLAPNHRDYYNEVVTLC NNIAAMGYDLTQCNYADNFNATINNGAESLFEVQYSGSTEYDFWGGDNQASWLSTFMGPR NSGMVAGAYGWNLPTEEFVKQYEDGDLRKDITVLYQGCPAFDGMEYKRSWSNTGYNVRKF LVSKTVSPEYNTNPNNFVVYRYADVLLKKAEALNEQGHPDQAAEPLNIVRKRAGLTELPT TLTQTEMREKIIHERRMELAFEGHRWFDMIRVDNGNYALTFLKSIGKNNVTKERLLLPIP QTEMDSNHLMTQNPGY >gi|222159337|gb|ACAB01000022.1| GENE 5 8164 - 9678 1492 504 aa, chain + ## HITS:1 COG:no KEGG:BT_3088 NR:ns ## KEGG: BT_3088 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 504 1 504 504 850 84.0 0 MKKYIYQLLCSLFIGGTMVACTEDYMETDKGHDTLTLSVNQQELVLNEKNHSQEALALSW 
TTGTNYGSGNRISYTLELAKAGTDFANPYSVDLGTGTYQWTKKVEELNQFLHTQFGIGYA VKAELEARITAVVAGMEDQKQIATAAFTVTTYQPVTTTLYLIGDATPNGWSADNATAMER TDNGQFTWTGKLNTGSFKFITTLGEFLPSYNRDATAEEGFKLTYRTSGDQPDEQFTISKE ATYIVKANLLDLTLTLTETEDIGWRYEEFFIVGSFTGNNGWGFEALSKDAVQMDLFHYGA VIPWKADGEFKFASVTDFGQADAFFHPTMANAPYTSTSVVLGGDDNKWQMKEAECGKPYK VWFYTGKGKEKMLMRPFTPYEGLYLVGEATPNGWDLGNATPMTKSADSPYIFTWSGTLKA GEMKISCDKQSDWNGDWFMADKSNKAPTGEVETALFVAKSDAELNSMYPDTDLGSLDYKW NIQEAGSYRITIDQLKETISIVKQ >gi|222159337|gb|ACAB01000022.1| GENE 6 9693 - 11471 1603 592 aa, chain + ## HITS:1 COG:no KEGG:BT_3087 NR:ns ## KEGG: BT_3087 # Name: not_defined # Def: cycloisomaltooligosaccharide glucanotransferase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 592 1 592 592 1102 89.0 0 MKKIIYLVAAFLCMACSDDHGSNQENGGASGSVTEVTPVTSDLSVDLSTDKAFYKPGEKV VFTAEDALPAGTKVRYRLLGEVVGEESVNGTSWTWQPPTTDFKGYMAELYRQENGTDVIL GTIAVDVSSDPARFPRYGFVADFSREKTAEKTQEEMAYLNRHHINWVQFQDWHNKHHWPL GGTRTQLDEVYMDIANREVYTSSVKNYIEAQHRFGMKSMFYNLCFGALKDAAADGVKEEW YLFKDASHTIKDSHDLPGGWKSNIYLVDPSNKEWQKYLNERNDDVYANFAFDGYQIDQLG RRSTLYNYNGIPVNLREGYASFIEATKQAHPDKSLVMNAVSRYGARQIGETGKVDFFYNE VWADEADFTNLKAILYENGVYGNYQLNTVFAAYMNYNKADNRGEFNTPGILLTDAVMFAL GGSHLELGGDHMLCKEYFPNENLTMSEELKTAMVRYYDFLTSYQNLLRDGGTENSVSMNC TNGEMRLNSWPPQQGSVTTYAKQVGGKQVIHLLNFSQANSLSWRDVDGTMPEPALITKAA LQMNLSAKVNKLWVASPDVHGGALQELAFTQENGVVSFTLPSLKYWTMIVTE >gi|222159337|gb|ACAB01000022.1| GENE 7 11508 - 12144 490 212 aa, chain + ## HITS:1 COG:alr4773 KEGG:ns NR:ns ## COG: alr4773 COG1501 # Protein_GI_number: 17232265 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Nostoc sp. 
PCC 7120 # 16 212 13 206 779 89 29.0 6e-18 MIRYRLWVWVLCLMFPTWVWAENEIRTCTSFTQQGRQVTFHLADSAALQLQLCSSSVVKI WFSPDGQLQRRNASFAVINEELEEVGTIHVDEQAACYEIFTPKLRIRVNKSPMSLQIFDK YQKLLFSDYADKGHVSEGTKKVEYKVLRRDEHFFGLGEKAGKMDRRGESYKMWNSDKPCY SVVEDPLYKSIPFFMSNYRYGIFLDNTYKTEF Prediction of potential genes in microbial genomes Time: Wed May 18 01:19:03 2011 Seq name: gi|222159336|gb|ACAB01000023.1| Bacteroides sp. D1 cont1.23, whole genome shotgun sequence Length of sequence - 31564 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 9, operones - 4 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 1866 1536 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases + Term 1872 - 1908 -0.3 + Prom 2694 - 2753 8.7 2 2 Tu 1 . + CDS 2872 - 5442 1401 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 5557 - 5601 10.2 - Term 5179 - 5224 -0.9 3 3 Tu 1 . - CDS 5453 - 6319 490 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 6379 - 6438 11.1 + Prom 6341 - 6400 4.7 4 4 Op 1 . + CDS 6478 - 8055 1232 ## COG4146 Predicted symporter 5 4 Op 2 . + CDS 8070 - 9095 912 ## COG2152 Predicted glycosylase + Term 9122 - 9180 -0.4 + Prom 9169 - 9228 8.3 6 5 Op 1 . + CDS 9320 - 12448 2369 ## ZPR_0351 TonB-dependent receptor Plug domain protein 7 5 Op 2 . + CDS 12459 - 13997 1274 ## ZPR_0352 hypothetical protein 8 5 Op 3 . + CDS 14014 - 14991 841 ## ZPR_0353 hypothetical protein 9 5 Op 4 . + CDS 15009 - 16472 996 ## Csac_2519 coagulation factor 5/8 type domain-containing protein 10 5 Op 5 . + CDS 16483 - 17967 1164 ## Csac_2519 coagulation factor 5/8 type domain-containing protein 11 5 Op 6 . + CDS 17981 - 19165 949 ## COG4124 Beta-mannanase + Term 19200 - 19235 -0.1 + Prom 19214 - 19273 5.0 12 6 Op 1 1/0.333 + CDS 19398 - 20267 826 ## COG0657 Esterase/lipase 13 6 Op 2 . 
+ CDS 20297 - 20638 349 ## COG1917 Uncharacterized conserved protein, contains double-stranded beta-helix domain + Term 20658 - 20699 5.4 + Prom 20667 - 20726 2.2 14 7 Tu 1 . + CDS 20760 - 21707 859 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 + Term 21806 - 21870 11.8 - Term 21916 - 21948 -0.9 15 8 Tu 1 . - CDS 21962 - 24232 1921 ## COG1472 Beta-glucosidase-related glycosidases - Prom 24259 - 24318 2.2 16 9 Op 1 . - CDS 24369 - 25631 1312 ## PRU_2519 polysaccharide hydrolase-like protein 17 9 Op 2 . - CDS 25652 - 27328 1518 ## PRU_2518 putative lipoprotein 18 9 Op 3 . - CDS 27340 - 30528 2844 ## PRU_2517 TonB dependent receptor 19 9 Op 4 . - CDS 30549 - 31364 710 ## COG2273 Beta-glucanase/Beta-glucan synthetase - Prom 31502 - 31561 2.8 Predicted protein(s) >gi|222159336|gb|ACAB01000023.1| GENE 1 1 - 1866 1536 621 aa, chain + ## HITS:1 COG:alr4773 KEGG:ns NR:ns ## COG: alr4773 COG1501 # Protein_GI_number: 17232265 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Nostoc sp. 
PCC 7120 # 1 543 208 747 779 339 34.0 9e-93 FGTESRDYYSFEAPNGEMVYYFIFGKDYKEIISQYVGLTGKPIMPPKWALGFAQCRGLLT SEKLSREIAEGYRKRGIPCDIIYQDIGWTEYLQDFEWRKGNYENPKKMLSDLKEMGFKVV VSQDPVIAQANKKQWEEADRLGYFVKDNTNGKSYDMPWPWGGNCGVVDFTLPAVADWWGA YQQKPIDDGISGFWTDMGEPAWSNEEQTERLVMKHHLGMHDEIHNVYGLTWDKVVKEQFE KRNPDRRVFQMTRAAYAGLQRYTFGWTGDSGNGDDVLQGWGQLANQIPVMLSAGLGLIPF SSCDITGYCGDIEDYPAMAELYTRWIQFGAFNPLSRIHHEGDNPVEPWLFGPEAEKNAKA AIELKYRLLPYIYTYAREAYDTGLPIMRPLFLEYPMDMETFSTDAQFMFGRELLVAPVVK KGARTKNVYLPEGTWIDYNNKQTVYTGEQWTTVDAPLSSVPIFVKQGSIIPTMPVMNYTH EKPVYPLTFEIFPAQEGAQAAFTLYEDEGEDLGYQRDEFVKTPIICSTLANGYKLTVSAR EGKGYTVPGPRNLLFRIYSAKAPKEVTVKGKKIKKTKPERLEENLENDTESVVWSWDKET GVCSVRIPDKGVDEQLMITFK >gi|222159336|gb|ACAB01000023.1| GENE 2 2872 - 5442 1401 856 aa, chain + ## HITS:1 COG:XF0846 KEGG:ns NR:ns ## COG: XF0846 COG3250 # Protein_GI_number: 15837448 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Xylella fastidiosa 9a5c # 40 852 59 876 891 553 36.0 1e-157 MVALAMVTTFSTLNGNISANTLQKELHSGWRFKQARLSNWYPATVPGVVHTDLIDNKIIE DPFYRLNERGVQWVDKEDWIYETTFDVAPELMDKNNIRLYFKGLDTYADVYLNGEKILEA DNMFREWKLPVKDKLKAKDNKLRIYFHSPIKRDMPKWDALPFHYEAINDQSENGGVFNRK VSVFARKAGYHYGWDWGPRLVTSGIWRPVLLEAWDDLRIENVFIRQKEVSYKHADISNIV EILADKDIEKAEVTVRDNKTGKIYGATATCLTKGLNKIAVNFFIKNPNLWWSNGLGKANL YEFRTEVKTNEQNKDCLITSTGIRSLKVIRKEDKEGRSFYFQLNGIPVFSKGANYIPCDN FLPRVTKELYEKTILDAVNANMNTLRVWGGGTYEDDYFYELCDKYGILVWQDFMFACSVY PSEGELLENIRQEAIDNVRRLRNHPSLALWCGNNECLEAWFGWNWKENYAKQNPEYARII WQQYEDLFHKMLPEVVTENSPETFYWPSSPFSRYDGVSENNKGDTHYWAVWHAKKPISEY NKVRSRFFSEYGFQSFPEFESVKMYAPYPEDWEITSEVMMSHQRGGEFANKLIEDYLLNE YRKPKDFESFLYMNLVLQGDAIKTAIEAHRRDMPYCMGTLFWQHNDCWPVASWSSRDYYG RWKAQHYFAKAAFRDVLVSPIVNNDRLDVYIVSDRLRKTSAILELEVCDMEGKLVNSIRR SVTIPANESKVVMSHRLNSFIKSQPENQLVISATLTDQQGTIYTNNYFLTKQKEMLYPQV NISYQLKSLPDGYELTLKADRFARAVYLSLDGIDNFFEDNYFDLMPGKDKIVKVRTDISY TDFSRQLKIKSLVDGY >gi|222159336|gb|ACAB01000023.1| GENE 3 5453 - 6319 490 288 aa, chain - ## HITS:1 COG:AGl1086 KEGG:ns NR:ns ## COG: 
AGl1086 COG2207 # Protein_GI_number: 15890663 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 32 281 42 295 305 148 36.0 1e-35 MKDIQKEITPISSDDLFIVLNHPNAKFDYPVHYHSDYEINLVMNTYGERIVGDSVEKFDN LDLVMTGPNLPHAWKGEIVEGNHVVTIQFSDKLLNFPILEKRLFSPIKQLLLDSQKGISF SENTMLAMKEKILQLTRMQGFHTVLEFLSILYDLSTADRHILVSNLYDTKDTIRTSKSRR IAKVCDYIEKNFEEPIKLGDVAALVNMSESAFSHFFRKKTNCTFIDYINNLRVARACQML TETSHTVAEICYSCGFNNLSNFIRIFKKKKGNTPSEYRTILQQMLIKY >gi|222159336|gb|ACAB01000023.1| GENE 4 6478 - 8055 1232 525 aa, chain + ## HITS:1 COG:yidK KEGG:ns NR:ns ## COG: yidK COG4146 # Protein_GI_number: 16131549 # Func_class: R General function prediction only # Function: Predicted symporter # Organism: Escherichia coli K12 # 5 489 3 492 571 193 28.0 9e-49 MESFSIGTIDIIIVVVYLLFIIWWGVKNGKSSDAQSYFLAGRSMPWWVVGLSLFAASISS TTLIGQSGDAYDTGLAVFNYNLTGIVVMVFFAVFLLPLYIRSKVFTIPEFLEKRFDKRSR YYFSAICIIGNIFLDAAGALYAAALIIKLVLPEADLQMIIIVFAIIAASYTIPGGLSSAI NAELIQAVVLILGSVLLTYFCFQQGGSYFYHLFENDDILVKLIRPLSDNATPWLGLIVGM PILGIYFWANNQTLVQRVLSARTIDEGRRGVIFTGFLTMSTLFIIAIPGLIARDLFPGLD KPDMVYPNMVMQLMPMGLLGIMLAALLSALTSTISAILNSTSTLFTMDFYAKFNKHADTK KLVLVGKIASCVIILLAALWAPQIGKFGSLLKYYQEMLSYISPPIVAAFLLGIFNKRVNG GGAFMGLILGLVVAVLLLFFKNTVFGDLHFLLMVPFLLIFSIMVIYVSSFFYAAPGKDKL TDTTFSFSDLKEEYVTLKTVVWYKNYQVWAGVLLALSVLIWIIFQ >gi|222159336|gb|ACAB01000023.1| GENE 5 8070 - 9095 912 341 aa, chain + ## HITS:1 COG:TM1852 KEGG:ns NR:ns ## COG: TM1852 COG2152 # Protein_GI_number: 15644595 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Thermotoga maritima # 1 334 3 293 296 194 36.0 2e-49 MKLKKFEGNPILSPSEKNDWESLVTCNPGVFYDNGMFYMLYRAAGNDREHVIRLGLATSR DGFYFERHSDRPAFSPSVDGPDSGCVEDPRIVKFDGEYYITYAYRPVPPGQYWKFGHDEV LLPECGKYAPAAIAKNLGNTGLAVTTDFCHFRRLGRLTSPVLDDRDVILFPEKVNGQFVM LHRPKQYIGEKFGVEYPSIWIKFSDDLLDWEGKESHLLLTGRENTWEEKIGGSTPPLKTD KGWLMLYHGVEHGGKGYYRVGALLLDLDNPLKILAKTHESILEPEEDFEINGYYNGCVFP 
TGNVIVDDTLFVYYGSADKYVCVATCSVNELLSYLLTDCKV >gi|222159336|gb|ACAB01000023.1| GENE 6 9320 - 12448 2369 1042 aa, chain + ## HITS:1 COG:no KEGG:ZPR_0351 NR:ns ## KEGG: ZPR_0351 # Name: not_defined # Def: TonB-dependent receptor Plug domain protein # Organism: Z.profunda # Pathway: not_defined # 27 1042 43 1052 1052 1206 59.0 0 MNKQKKRLGSMSLCKVVLICLFLSVFSTAWAAVEQTSSKITVTGVVKDVQGEPLIGATIK IKGSAVGTVSDFDGNYSLANVPVNGVLEFSYVGMNSIEISVSGKKVVNAVLSDATQLLDQ VVVIGYGSVKKSDLTGSVATVKPEALSDIPANSVEGLLQGRAAGVQVINSSQDPGAGSTV RIRGNSSLRGSNSPLIVVDGFPLGDAGDLKQINTDDIVSMEVLKDASASAIYGSRGANGV IIVTTRRAKEGTTTVSVKQQLTFSEFSSKLDLWRDPVMMAMLNNEGRVNGGLQPLYIGAN NSAGVYYPSIEELQTTWTTNTRWDELVFRDAPVSNNTTVSVSSANEKTSFNLSANYYTDQ GMYIDDDYSKGGYNLNVSHKIYKNFTIRASNILYLGNRNNNGGMAYWRNPIYPVYNEDGT YYQTTAQDYSHPIAIREHKKDKTKMLDVISSGAFEWQVIPQLRLTSQLNYKYGKSTNDQY LPQKYTQVGTENRGRGVIGNTENQNFVSETFANYFQMFGKHEVGAMVGHSYEWYQYRDSY LTANGFVNEALGNENMGAGEKANNIISNGFSKSKLVSGMARLNYTYDNKYLLTATFRADG SSKFGKNNKWGYFPSGAISWKAHEEKFIKKLNTFDELKFRLSYGISGNQGISPYQTLSRY GTTKYYHQGQWLTAIGPGYESGQSGQDGIWKIWSGIPNPDLKWETTAQWDFGIDMAFFNR RLRVTLDIYNKETSDLLRERNLALSSGYDRMWVNDGKIRNRGFEVTIDGVMVDKKDWSLN GTFIFSMNRNKVLDLGSSVESGLLKDARTGMEYEVVGNLSEQYRSYTNILAIGQPINVFY GYKCDGIIQTLPEGLNAGLSGEDAMPGEFKYVNTYDGDGTQGVVNENDKCIIGDPNPDFT ASLNLSFRWKKLDASIFLNGVFGNDVLNTKTFGDPHNAALRWTPDNPTNKYPSLRDGRQT RFSDYWIEDGSFVRIQNLTVGYTFDLPKQKVFASKIRLYMNASNLYTFTKFDGYDPEVGQ NGIYSGGYPRLRKWTVGLDITF >gi|222159336|gb|ACAB01000023.1| GENE 7 12459 - 13997 1274 512 aa, chain + ## HITS:1 COG:no KEGG:ZPR_0352 NR:ns ## KEGG: ZPR_0352 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 3 510 4 497 497 494 51.0 1e-138 MKKNIYTFILISCILGFSGCSLDEEPYGFYSEDNFYKTEGDAMAAIHYAYATLTYIEYSR ALYFIGDMPTEIMTTKGDASVDNRALNLWRIDNFDTNATLVNFFKYSFIAINRANAVIKK MPGTDIDSKLKDQYLGEAYFLRAYNYFCLSRNFGLVPMHYGVVETLSQTGTPLANNMDEI YDLMIDDCKKAEELLPVYAAPIMGRIDRVGAQALLAKIYLYAASAKANNVPLYKEMAKDV NSLYQDAMTYAGRALGLDPDYPQNTYGFESDLLKIYDVENFAGKENIFLMSMDRTGEAEG 
QYSKISKMFLPYIEGATIWIKQGDKDEYIKSHDGWGEYRTEMNFYNEFDGNDLRKKDLIV DKVYNEDGTVKAEWKPDGGDFEYPFCRKYIDPKFDGDKTSTRPFLIRFTDIAMVYAEAAG PTTKAYELINFIRHRAGLGDLTPGLPIDEFREKVYEERKFEMAFEGDRMYDIRRWGRIGE IKEVKDAGLNESQYSFYPLPQTEINLNPSLRE >gi|222159336|gb|ACAB01000023.1| GENE 8 14014 - 14991 841 325 aa, chain + ## HITS:1 COG:no KEGG:ZPR_0353 NR:ns ## KEGG: ZPR_0353 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 1 322 1 311 312 261 47.0 3e-68 MNKLLKYTLFILCMGVMQSCVKDMQDDLNDGGWNNERSVTSLKFKNQVGQAEIERIDDTT GEVTVTLNVGAIPDLSKVEIESIELSYQATASAGKGSTLDFSNPERKSVLTVTSATGKTR EYMIYVNEFRESLEGTWSINNLIIYGGTGPEWGGGRVYPFMDKSWCWYDEYSPEKEYDNT LTFIMENITDDGNTSGKCINAAGPDGKYADFIFKGSSNPADKTDIDLKKYYRKIPVGESR WIRNYSLGTITFIDSDGNESTAVLENPGTYTLFQGGGYTNTVTLPNNAFSFQINGVDEWT YIYTDYDVFAKQARKFFVLVTKNED >gi|222159336|gb|ACAB01000023.1| GENE 9 15009 - 16472 996 487 aa, chain + ## HITS:1 COG:no KEGG:Csac_2519 NR:ns ## KEGG: Csac_2519 # Name: not_defined # Def: coagulation factor 5/8 type domain-containing protein # Organism: C.saccharolyticus # Pathway: not_defined # 6 486 20 493 628 190 29.0 9e-47 MKKYFFIFIVAFFASCSPVEQDVIITIDTETVLNDCYIGNGAQWDPYQLDYGKGKLTISE ADWEKLYDRLDFMRPQFIRIMINTSSFVDQGRLVPERGMEVLSKMLDYCQSRNVTVMFGD WGWSVIRAREAEIDSQSLRHAAALVDYLVHNKGYSCIRYYNLVNEPNGYWAATNKSFPLW ASSVRQFYQYMKEFDLIGEVGIVGPDIAIWDKAETGWMDSCAVQLNEQIELYDIHTYPSK VTVNSGEYADIIAAYKEKIPEGKKIVMGEIGFKFVEKADSLYQKENIRRAGSKVNASLED SQMFVYDYMYGTDMADALIQTVNTGFSGSIAWMLDDAMHSNEAPDKLKIWGFWNIFGEEC FGAEEENVRPWYYAWSLLTRYMPAGTAVYRTDTTGEPFVKAAVSGKDGKYMIALVNVSDS PKTVKLKSNVLSSLSGCKKFIYSDGTLKQKGDCLLLPNDENLILHLEKGETVTMPAESLI VYTNYDY >gi|222159336|gb|ACAB01000023.1| GENE 10 16483 - 17967 1164 494 aa, chain + ## HITS:1 COG:no KEGG:Csac_2519 NR:ns ## KEGG: Csac_2519 # Name: not_defined # Def: coagulation factor 5/8 type domain-containing protein # Organism: C.saccharolyticus # Pathway: not_defined # 39 467 53 465 628 152 27.0 2e-35 MNLKTIIGTFVVGMLLVSCTHSDNRLTVNVSQEVVTSGYIGNGVEWDPYDEAESWGASVS 
DEDWQKLFKRLDFMRMGYVRCMINSPFRYYDPQTGKYDKTRNIASLSRLLQYCTDHNIVV MYGEYNPPTWAMKDEQCWVDMSVDYLNYLVNDLGFSCIKYFVIFNEPDGNWASTDGDYDL WKSMLLRFADKMKEYPGLDGKVKFAGPDAVVDYKNPSSPFDAAGWVKQAATDLDSHIGIY DIHAYPGQHEVRSGQYAEILARYKEFIPAGKKIVLGEAGYKYWREADSLLMKEYNRRVEG HPFTKGTDSNMLVYDYFYGLDMPLLCMEVMNGGYSGIAAWMLDDAMHSSGDSGKTEDIKL WGMWNILGEEVFDDPSQEEIRPWYYTWSLMCRYFPTGSNILKTTLDRTKGVYAAAGEYQG KYVFAAVNIGDEDRSLDISLPAALENVSLYVYEEGNRQTNNEGHPVPVQTGLTVDGTYQT ILKAQSFILLTNID >gi|222159336|gb|ACAB01000023.1| GENE 11 17981 - 19165 949 394 aa, chain + ## HITS:1 COG:BS_ydhT KEGG:ns NR:ns ## COG: BS_ydhT COG4124 # Protein_GI_number: 16077655 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-mannanase # Organism: Bacillus subtilis # 137 306 131 294 362 63 27.0 5e-10 MKKQINILLVGLASVFMSCSDVEEVSALYPEYEEVKELSIVDKNATSETKALYSNLWAIQ SKGFMFGHHDDLWYGRKWYNEEGRSDTHDVCGDYPAVFSFDVAEIMDDRYQNPENEIRKR VALEAYERGEVLIACAHLNNPLTGGDSWDNSSNEVVKEILKEGSPTHLKFKTWLDRLANF ALNLKGKNGELVPVIFRPYHEHTQTWSWWGSSCTTSDEFTSFWKFTVNYLRDTKNVHNLI YAISPQIDSVQGRENFLFRWPGNDYVDFLGMDSYHGLSPVVLTTNMNTLSRLSKELKKPC GVTETGVEGIQRDGVPEKNYWTQQILTPVTGQFVSMIVMWRNKYDPSESESHFYSVFKGH ASEKSFIKMYESPVSLFSADLPDMYTMAEGMNVK >gi|222159336|gb|ACAB01000023.1| GENE 12 19398 - 20267 826 289 aa, chain + ## HITS:1 COG:DR0821_2 KEGG:ns NR:ns ## COG: DR0821_2 COG0657 # Protein_GI_number: 15805847 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Deinococcus radiodurans # 66 253 6 195 242 105 31.0 8e-23 MDAIWLYAAYWERILKNTEINKIMKKICLLLFLLCACIAQAQNAYRTDKDISYVSGSETD TYRRERCKLDIYYPENKKDFSTIVWFHGGGMEGGNKFIPKELREQGFAVVAVNYRLSPKA KNPAYIEDAAEAVAWVFKNIEKYGGRRDHIFVSGHSAGGYLSLILAMDKKYMAAYGADAD SVAAYLPVSGQTVTHFTIRKERGLPDGIPVVDEYAPVNKARKETAPLVLITGDKHLEMAA RYEENALLEAVLKSIGNKKVTLYEMQGFDHGQVLGPACYLIADYVKRFK >gi|222159336|gb|ACAB01000023.1| GENE 13 20297 - 20638 349 113 aa, chain + ## HITS:1 COG:CAC3376 KEGG:ns NR:ns ## COG: CAC3376 COG1917 # Protein_GI_number: 15896618 # Func_class: S Function unknown # Function: Uncharacterized conserved 
protein, contains double-stranded beta-helix domain # Organism: Clostridium acetobutylicum # 5 112 2 109 114 104 43.0 5e-23 MKTRSDAFQFEKELDWEKPAPGIRRQIMGYDGQLMMVKVEFEKGAVGSMHQHYHSQATYV ASGKFELTIGDRKEILSTGDGYYVEPDQPHGCVCLEAGVLIDTFSPMRADFLI >gi|222159336|gb|ACAB01000023.1| GENE 14 20760 - 21707 859 315 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 4 311 3 307 308 335 56 2e-91 MEKIARKLTDLVGNTPLLELSNYNTSNDLKARLIVKIESFNPAGSVKDRIALAMIEDAET KGILHPGATIIEPTSGNTGVGLAFVSAAKGYKLILTMPDTMSIERRNLLKALGAELVLTP GADGMKGAIAKAEELKAVTPGAVILQQFENPANPAMHLRTTGLEIWRDTEGKVDIFVAGV GTGGTVSGVGEALKMRDPSIKIVAVEPSDSPVLSGGKPGPHKIQGIGAGFVPKTYKASIV DEIIQVQNDDAIRTSRELAKQEGLLVGISSGAAVYAATELAKRPENAGKMIVALLPDTGE RYLSTILYAFEEYPL >gi|222159336|gb|ACAB01000023.1| GENE 15 21962 - 24232 1921 756 aa, chain - ## HITS:1 COG:TM0076 KEGG:ns NR:ns ## COG: TM0076 COG1472 # Protein_GI_number: 15642851 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 25 748 11 745 778 364 33.0 1e-100 MTASIAMTACSKQAVPTAIPEDAKIEQQVEELLSKMDLDAKIGQMTELAIDVLGETINGE FQLDEAKLHKAIAEYKVGSFLNAPGPVAQNPEKWNEIIGRIQELSMTEIGIPCIYGLDQN HGTTYTLGGTLFPQNINMGAAFNPDLTYEAARVTAYETRASNCPWTYSPTVDMARDPRWP RVWENYGEDCLVNAIMGSTAVRGFQGDDPNHIPADRIATSVKHYMGYSMPRTGKDRTPAY ISVSELREKCFAPFKACVEAGALTIMVNSGSINGKPVHADRELLTQWLKEDLGWDGMLIT DWADINNLYTREHVAADKKEAIEMAINAGIDMAMEPYDLNYCTLLKELVQEKKIPMSRID DAVRRVLRLKFRLGLFDHPNTLLKDYPLFGSKEHALIALHAAEESEVLLKNKDNILPLPQ GKKLLVTGPNANSMRCLNGGWSYSWQGHLTDRFADKYNTIYEAICNKFGADHVRLEQGVT YKPEGAYMEENEPEIEKAVAAARNVDIIIACIGENSYCETPGNLSELAISANQSKLVKAL AATGKPIILILNEGRPRIINDLEPLADAVIDILLPGNYGGDALANILAGDVNPSAKMPYT YPRHEAALTTYDYRVSEEMDKMEGAYDYNAVVSVQWPFGYGLSYTTFEYSNFQTDKSSFT TGDDLHFSIDVTNTGKYTGKEVVMLFSSDLVASLTPENRRLRAFKKVELQPGETQTVTLS IKGSDLAFVNSDGQWVLEQGDFRMQCGNQVLTITYK >gi|222159336|gb|ACAB01000023.1| GENE 16 24369 - 25631 1312 420 aa, chain - ## HITS:1 COG:no KEGG:PRU_2519 
NR:ns ## KEGG: PRU_2519 # Name: not_defined # Def: polysaccharide hydrolase-like protein # Organism: P.ruminicola # Pathway: not_defined # 5 418 9 451 454 114 25.0 7e-24 MKKIYIALFALLAVGSTFTACTEEEPFATVTENDDPHILAPVFPDRTNGQLATFANISRD ANLSIALTVTPKDYTTVTWFIDGQEVESGTDSDKEINRSLKAGTYNLKIEVETVKGKKTS REGLVVVNPLADDPQSKEVAFERIVSPGKTARLYGSNLQNVTAILLGGNTITDPTYVESE DENYLEYIVPTGVSEGDYRIVLQDAAGNEYGADMVKVTNASLVISGANRATANVDWTISG INLENIASLTIGGQTVSQFSNQSSTEVTLTCPDLSDGSYTLTGKTRSGEAVQFLNDNATT TEQTVIISTEITLWSGHHYVSWDMPEGDPNRTFNLIPMDAFTGITAGSTLKVVYSIEPTA EYHQMRLATGNWSYFTDPINFSENGEYALTLTQDMLNMIQAENGFICVGHGYYVDLVTVK >gi|222159336|gb|ACAB01000023.1| GENE 17 25652 - 27328 1518 558 aa, chain - ## HITS:1 COG:no KEGG:PRU_2518 NR:ns ## KEGG: PRU_2518 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 19 558 23 576 580 445 44.0 1e-123 MRQSKYITIITMACALFFASCSDEYMENMNTDPSKAATIDPNAQLTTAQLQTYGDLSMME IYRNYHYAFTQQLMGCWNTTNYGGRHTLDNNEMSRIWTSFYTQSLKNIIDAQYRTAEDAE KVNINSVLRIYRVYLMSIITDTYGDAPFSEAGLGFLEGKFNPKYDKQEDIYNAFFLELED AVNKIDPTKDKVTGDLIYAGDVTKWQQLANSLRLRFAMRISNVNPTKAQTEFENALAANG GVITDASSDALIKYMTIAFSFGQEAYSDYRGNSLSQLLFGNDPANNPSYLCSTFFNQLRQ SGDPRTFKISRCYYDGLMSATSPDNRVDITQEMIEKGIDFSPRNPGAYSWEPWPTGYDSD ICKELAVNNPSVTATMAREVEPKLANNFLKSDNPGVVMTSAEVKFLMAEATVKKWNVGSV SAEDLYKQGVRAAMDFLTDNYGCTATTDAEFDAFIQGRGTFGHTDNQKLEAINTQAWILH FTNPAECWANVRRSGYPKLKSPAEYGFGQYLTGGTEIPVRLCYPVLESSYNKKSYNEAIE RMGGTDNWHSLLWWDTEN >gi|222159336|gb|ACAB01000023.1| GENE 18 27340 - 30528 2844 1062 aa, chain - ## HITS:1 COG:no KEGG:PRU_2517 NR:ns ## KEGG: PRU_2517 # Name: not_defined # Def: TonB dependent receptor # Organism: P.ruminicola # Pathway: not_defined # 9 1062 10 1057 1057 1350 64.0 0 MNMEKCKYLLFCGTLALWMAGSFQQVYAATEPSSQAIQQQSNKVTGKVSDATGPIIGASV VEKGTANGTITDLDGNFSLNVKAGATLVVSYVGYKSEEVKAGRGPLNITLKEDAKALDEV VVTALGIKRERKALGYGIDEVKGEALTKAKETNLINSMAGRVPGLVVSQTAGGPSGSTRV ILRGSTEMTGNNQPLYVVDGVPLDNTNFGSAGTNGGFDLGDGISSINADDVENMSVLKGP AASALYGSRASHGVILITTKRANKDKISVEYNGTLTFDTQLAKWDEVQQIYGMGSNGTYS 
YDAISNTNKSWGPKADGSNMLKYFDGVERPFLIVPDNTSNFFRTGITATNSAIIGVNSGK TGIRFTYTDMRNKDIVPQTHMSRDIFNLRANTSAGKVDLDFSVNYTREDVKNRPALGDSK SNIGKNLMTLATTYDQEWLQTYQTADGEYSNWNGMDPYNVNPYWDIYKNFNKSKKDLFRM NGKAVWNIDPHLKLQATLGAELNWFTFDDYKAPTTPGFEAGRLQNSAFRNRMYNFEVLAL YNNHWGDFDFNATLGGNVYKVNNQTTVTTAQDMKIRDVPSLTSFNEISVVPGSYRKQINS VYGAVNVGWKHMLYLDATLRGDQSSTLPTGNNMYVYPSFSGSFVFSELTKLGDLLPYGKV RMSWAQVGSDTDPYQLGLVYTKSKYAYPGYTIGYIDNGTIPNKDLKPTKTNSVEMGLELK FLKNRIGLDFTYYSQISKNQIMGMASSWTTGYNYRLINAGKIENKGIEIALSTRPIQTRD FSWDINLNFSKNSNKVKELDGESDMFELEKASWLDVQVAAKVGENFGSIVGPDFQRNEKG DILIDPQTGLPQYDKSNHVLGNASWDWTGGLLTNFTYKNLSLVAVFDVKVGADLYSMSAR AAYESGKSPETLAGRDAWYRSEEERLAAGIAKGADNWKPTGGFVAPGVIDNGDGTYRPND IYINPEDYWMSVCRNAPSMFIYDNSYVKCRELTLSYNVPKSWLKNVISGLTVSFVARNPF IVWKNIPNIDPDSNYNNTTGMGLEYGSLPSRRSYGFNVNVKF >gi|222159336|gb|ACAB01000023.1| GENE 19 30549 - 31364 710 271 aa, chain - ## HITS:1 COG:TM0024 KEGG:ns NR:ns ## COG: TM0024 COG2273 # Protein_GI_number: 15642799 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucanase/Beta-glucan synthetase # Organism: Thermotoga maritima # 40 270 221 455 642 95 31.0 1e-19 MKKVALFLFLSVCFAQQSCSSDSVGTEPEENPQDILFKDDFNFFDEKVWTKETHEPGWTN QELQAYDAAHVSVGKDGDKSVLILTAERKGNKIYSGRINSKGKKSFKYRKIEASIKLPKT NGGLWPAFWMMGDNDKQWPACGEIDIMEMGEQSGMAAGDSEKQVNTAIHYGPSAAAHEQQ YYKANVANSLQDGNYHTYSLDWDENNLTISVDNVKFHTFDISSNTYFHDNFYILFNLAVG GAFTGITDINKLTGLKDGQKVNMYIDWVKIL Prediction of potential genes in microbial genomes Time: Wed May 18 01:20:19 2011 Seq name: gi|222159335|gb|ACAB01000024.1| Bacteroides sp. D1 cont1.24, whole genome shotgun sequence Length of sequence - 69687 bp Number of predicted genes - 58, with homology - 56 Number of transcription units - 27, operones - 13 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 204 - 251 8.2 1 1 Tu 1 . - CDS 341 - 4312 2631 ## COG0642 Signal transduction histidine kinase - Prom 4384 - 4443 3.9 - Term 4733 - 4782 1.1 2 2 Tu 1 . 
- CDS 4811 - 7213 2282 ## COG1472 Beta-glucosidase-related glycosidases - Prom 7237 - 7296 10.8 3 3 Tu 1 . + CDS 7630 - 7752 57 ## - Term 7627 - 7663 5.3 4 4 Tu 1 . - CDS 7873 - 8352 430 ## COG3467 Predicted flavin-nucleotide-binding protein - Prom 8382 - 8441 6.0 - Term 8391 - 8432 -0.1 5 5 Tu 1 . - CDS 8543 - 10699 1244 ## PROTEIN SUPPORTED gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 - Term 10948 - 10987 2.1 6 6 Op 1 . - CDS 11050 - 12198 788 ## Paes_0204 TIR protein 7 6 Op 2 . - CDS 12235 - 13368 726 ## Fisuc_2304 hypothetical protein 8 6 Op 3 . - CDS 13372 - 14412 644 ## BT_3244 hypothetical protein 9 6 Op 4 . - CDS 14420 - 15736 672 ## BT_3243 hypothetical protein 10 6 Op 5 . - CDS 15749 - 16645 944 ## BT_3242 hypothetical protein 11 6 Op 6 . - CDS 16668 - 18236 814 ## BT_3238 hypothetical protein 12 6 Op 7 . - CDS 18260 - 21880 2504 ## BT_3240 hypothetical protein - Prom 21927 - 21986 5.0 13 7 Op 1 . - CDS 22032 - 23195 689 ## COG3712 Fe2+-dicitrate sensor, membrane component 14 7 Op 2 . - CDS 23258 - 23860 452 ## BVU_0609 RNA polymerase ECF-type sigma factor + Prom 23811 - 23870 9.1 15 8 Op 1 . + CDS 23905 - 24912 705 ## COG0451 Nucleoside-diphosphate-sugar epimerases 16 8 Op 2 . + CDS 24903 - 25868 668 ## BT_3074 hypothetical protein 17 9 Op 1 . - CDS 25958 - 26953 497 ## PROTEIN SUPPORTED gi|148828154|ref|YP_001292907.1| ribosomal protein L11 methyltransferase 18 9 Op 2 . - CDS 26962 - 28887 1029 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen - Prom 28945 - 29004 5.1 19 10 Op 1 . + CDS 29001 - 30260 1407 ## COG0826 Collagenase and related proteases 20 10 Op 2 . + CDS 30301 - 30702 493 ## COG0824 Predicted thioesterase 21 10 Op 3 . + CDS 30744 - 31820 1042 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake 22 10 Op 4 . 
+ CDS 31902 - 34391 1505 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family + Term 34397 - 34428 1.8 - Term 34375 - 34425 2.0 23 11 Op 1 . - CDS 34615 - 34923 190 ## BVU_3461 hypothetical protein 24 11 Op 2 . - CDS 34989 - 35351 218 ## BVU_3461 hypothetical protein - Prom 35407 - 35466 2.2 25 12 Tu 1 . - CDS 35531 - 35803 270 ## BVU_3461 hypothetical protein 26 13 Op 1 . - CDS 35904 - 37031 657 ## COG1672 Predicted ATPase (AAA+ superfamily) 27 13 Op 2 . - CDS 37050 - 37229 80 ## - Prom 37266 - 37325 4.2 + Prom 37238 - 37297 6.0 28 14 Tu 1 . + CDS 37336 - 39114 1606 ## COG5016 Pyruvate/oxaloacetate carboxyltransferase + Term 39192 - 39239 9.6 29 15 Tu 1 . - CDS 39410 - 39910 223 ## gi|237717093|ref|ZP_04547574.1| predicted protein - Prom 39990 - 40049 8.8 30 16 Op 1 . + CDS 40344 - 41099 269 ## PRU_1492 hypothetical protein 31 16 Op 2 . + CDS 41174 - 42184 401 ## gi|237717095|ref|ZP_04547576.1| predicted protein 32 16 Op 3 . + CDS 42181 - 43443 409 ## COG0595 Predicted hydrolase of the metallo-beta-lactamase superfamily 33 16 Op 4 . + CDS 43440 - 44009 313 ## GFO_0152 metallophosphoesterase domain-containing protein (EC:3.1.-.-) + Term 44166 - 44203 -0.9 34 17 Tu 1 . - CDS 44265 - 44660 310 ## gi|160884700|ref|ZP_02065703.1| hypothetical protein BACOVA_02689 - Prom 44690 - 44749 5.7 35 18 Op 1 . + CDS 45054 - 45923 929 ## BVU_1036 hypothetical protein 36 18 Op 2 . + CDS 45977 - 47185 664 ## gi|237717101|ref|ZP_04547582.1| conserved hypothetical protein 37 18 Op 3 . + CDS 47197 - 49074 1127 ## gi|237717102|ref|ZP_04547583.1| conserved hypothetical protein 38 18 Op 4 . + CDS 49105 - 49686 488 ## BVU_3810 hypothetical protein 39 18 Op 5 . + CDS 49755 - 50762 476 ## gi|237717104|ref|ZP_04547585.1| conserved hypothetical protein 40 18 Op 6 . + CDS 50759 - 52333 1094 ## BF0477 hypothetical protein 41 18 Op 7 . + CDS 52346 - 53632 909 ## gi|237717106|ref|ZP_04547587.1| conserved hypothetical protein + Term 53717 - 53769 3.1 + Prom 53715 - 53774 4.6 42 19 Op 1 . 
+ CDS 53848 - 54141 231 ## COG1669 Predicted nucleotidyltransferases 43 19 Op 2 . + CDS 54125 - 54511 261 ## BT_2210 hypothetical protein 44 20 Tu 1 . - CDS 54475 - 54630 82 ## gi|237721912|ref|ZP_04552393.1| predicted protein - Prom 54746 - 54805 3.5 + Prom 54532 - 54591 8.4 45 21 Op 1 . + CDS 54629 - 54988 192 ## Caka_0557 oxidoreductase domain protein 46 21 Op 2 . + CDS 55057 - 56334 512 ## BT_3061 hypothetical protein 47 21 Op 3 . + CDS 56344 - 57351 549 ## BT_3060 hypothetical protein 48 21 Op 4 . + CDS 57406 - 58371 792 ## BT_3059 hypothetical protein + Term 58389 - 58434 9.0 + Prom 58527 - 58586 5.8 49 22 Tu 1 . + CDS 58636 - 59475 586 ## BT_3058 transcriptional regulator + Term 59577 - 59612 -0.6 - Term 60010 - 60054 7.5 50 23 Op 1 36/0.000 - CDS 60100 - 60855 855 ## COG0479 Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit 51 23 Op 2 . - CDS 60892 - 62871 2364 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 52 23 Op 3 . - CDS 62909 - 63613 830 ## BT_3053 putative cytochrome b subunit - Prom 63730 - 63789 5.5 53 24 Tu 1 . - CDS 63825 - 64694 687 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 64723 - 64782 6.7 + Prom 64824 - 64883 4.5 54 25 Tu 1 . + CDS 64968 - 65930 773 ## BT_3050 chitinase precursor 55 26 Op 1 . - CDS 65965 - 66150 77 ## gi|262405889|ref|ZP_06082439.1| predicted protein 56 26 Op 2 . - CDS 66059 - 66847 500 ## Fisuc_1212 DNA-damage-inducible protein D 57 26 Op 3 . - CDS 66875 - 67513 548 ## BT_3041 hypothetical protein - Prom 67548 - 67607 3.4 - Term 68028 - 68077 10.9 58 27 Tu 1 . - CDS 68192 - 69685 1594 ## COG0793 Periplasmic protease Predicted protein(s) >gi|222159335|gb|ACAB01000024.1| GENE 1 341 - 4312 2631 1323 aa, chain - ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 788 1012 8 229 294 126 36.0 3e-28 MVRFNLVITLLLMINMAVKAELTNRMFDVRHVGYAEGLSSQRVFSIVEDGDGAMWIATKT GIDRYNGHTVKNYDLPGSFYYGDLAGRRLYLLYDAQQGLFAYDHTGRIYRYSTILDHFEQ VLHLGQLIQEEVILNKLCLDSDGTWWMGADKGLYKQEADHRIVAVLKGQYVNDIAFAGES LFVGTSNGVWQLSHALPDKKRQLLEGWNVQTLFCDKPKKELWIGTFGSGLSVMNLDTSKV LALEGQGSTFLHPIRAITDYDVHTILIGVDGGGVYAIDKDTKKAHLLMNTKDDTDTYLRG NGVYAVTRDDQGNIWIGSYTGGVSVAILLKHPISILAHEKGNTHSLISNNVNDIEENPDG NQWFATDDGISIRNTLSGTWKHVLKEIVTISLCTSGNGNVWVGTYGDGVYLLDNNGRVLR HLTKQQGQLTTNYIFSVRQDMEGDLWIGGLDGCLIMFEKEKGSKQSFDVNWVQSIEPIDR NRVAVATVNGFFLVDKHTGNIQHYANSQEFHNQNVSAYIISMLFNDDGTVWLGTEGGGLN LYDMKNRTVKTFTVQEGLPSNDIYSLQRDDKKRLWVSTGKGIALIDSLRVSNLNYAGNID KEYNKSSFARLMNGEFVYGSTDGAVFIMPLDISTVDYWTLLRFTGLTVDYQNVQEEESLR PAIHDMLADRAVRLGYKYNSFTVSFESINYRFQRDIVYQHILEGYDNDWSKPSAEGKASY TKVSPGTYLFKVRSLRRSDGKTISESTLEVKVSQPWWNSWWAWTIYLFIISFVFYFILRY KSNQLQKKYDEDKIRFFIDTAHDIRTPVTLIMAPLEDLNREQDLSDKARYYLNLAHESTR KLHSLITQLLEFEKVDAHKPSSSSVPLCLNEVLLKEVSVFRSFCEKKQLTLNLTQPDESV FVSADKHLIEMLLDNLLSNACKYTMPQGEISLDLKATKRKAILSLKDNGIGIPKKAKKHI FSDIYRAENARQSQEEGTGFGLLQVQRIVKMLHGKITFCSEEGKGTTFIVTLPRTTTVAE PVSHESSLEHLAGTSNNNSHETDKMKDPDDRNTLLIVEDHETLRHYLRQTFEHLYRVIDV ADGHEAIACLANEYPDIILSDVMMPGIRGDELCRMVKENPDTSGIPVVLLTAKANHEAIV EGLKKGADDYIPKPFSTEILKLKVQGLIDNRNRQRQFFMRQAIAQVEAGGKRDDNESNEN NESIDNKNVTTASETMAEGDRRFIMQATRFVLEHLDEPDFNINLLCHEMAMSRTLFYSRL KSLTGKGPQEFIRIIRLQKAAELLKEGKSVTDVAAETGFVNTKYFSSLFKKQFGMQPSKY SGK >gi|222159335|gb|ACAB01000024.1| GENE 2 4811 - 7213 2282 800 aa, chain - ## HITS:1 COG:SSO3032 KEGG:ns NR:ns ## COG: SSO3032 COG1472 # Protein_GI_number: 15899739 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Sulfolobus solfataricus # 67 800 4 736 754 502 38.0 1e-141 MKKLLCLALLVSAGSIYSGSISANNKPTDNKSGNNSKDIYKKTWIDFNKNGIKDVYEDPS APIEARIADLLSQMTLEEKTCQMATLYGSGRVLKDAWPTAGWSAEIWKDGIGNIDEQANG LGKFGSEISYPYANSVKNRHTIQRWFVEQTRLGIPVDLTNEGIRGLCHDRATMFPAQCGQ GATWNKKLIREIAKVTANEAKALGYTNIYSPILDIAQDPRWGRVVESYGEDPYLAGELGK 
QMILGLQSEGIVATPKHFAVYSIPVGGRDGGTRTDPHVAPREMKTLYLEPFRKGIQEAGA LGVMSSYNDYDGEPVSGSYHFLTEILRQQWGFKGYVVSDSEAVEFLHTKHRITPTEEEMA AQVVNAGLNIRTNFTPPQDFILPLRHAINEGKVSLHTLDQRVSEILRVKFMMGLFDNPYP GDDRRPETVVHNDAHKAVSMKAALESVVLLKNKNQMLPLSKNFKKIAVIGPNAEEVKELT CRYGPANASIKTVYQGIKEYLPNSEVRYAKGCDIIDKYFPESELYNVPLDTQEQAMIHEA VELAKASDIAILVLGGNEKTVREEFSRTNLDLCGRQQQLLEAVYATGKPVVLVMVDGRAA TINWANKYVPAIIHAWFPGEFMGDAIAKVLFGDYNPGGRLAVTFPKSVGQIPFAFPFKPG SDSKGKVRVDGVLYPFGYGLSYTTFGYSDLKISKPVIGPQENITLSCTVKNTGKKAGDEV VQLYIRDDFSSVTTYDKVLRGFERIHLQPGEEQTVNFTLTPQDLGLWDKNNQFTVEPGSF SVMVGASSQDIRLKGSFEVQ >gi|222159335|gb|ACAB01000024.1| GENE 3 7630 - 7752 57 40 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKQGEGNCLPSLTITFTLTPVMPMDKEIQAEDEGVRVKVF >gi|222159335|gb|ACAB01000024.1| GENE 4 7873 - 8352 430 159 aa, chain - ## HITS:1 COG:MA2197 KEGG:ns NR:ns ## COG: MA2197 COG3467 # Protein_GI_number: 20091038 # Func_class: R General function prediction only # Function: Predicted flavin-nucleotide-binding protein # Organism: Methanosarcina acetivorans str.C2A # 5 153 6 152 152 92 33.0 2e-19 MKTVIIEDKQRIESIILHCDACFVGITDLEGNPYVVPMNFGYENGIIYLHSGPEGSKLEM LEHNNNVCITFSLGHKLVYQHEKVACSYSMRSESAMCRGQVEFIEEIDDKRRALDIIMRH YTDNAFSYSDPAVRNVKVWKVAVRQMTGKVFGLRANEKP >gi|222159335|gb|ACAB01000024.1| GENE 5 8543 - 10699 1244 718 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 [Clostridium acetobutylicum ATCC 824] # 89 717 70 703 730 483 39 1e-136 MAKKKEKKEKKAGKRMSKKELAALLIDFFHAKSSETLSMKYIFSELRLTTHPQKMLCVDI LHDLLADDYISEIEKGKFRLNNHGTEMTGTFQRKSNGKNSFIPEGGGEPIFVAERNSAHA MNNDKVKIAFYAKRKNREAEGEVIEILERANDTFVGTLEVAKSYAFLVTENRTLANDIFI PKDKLKGGKTGDKAIVKVTEWPDKAKNPIGQIIDILGQAGDNTTEMHAILAEFGLPYVYP KAVETAADKIPAEISAEEIAKREDFRKVTTFTIDPKDAKDFDDALSIRKLKDGLWEVGVH IADVTHYVKEGGIIDKEAEKRATSVYLVDRTIPMLPERLCNFICSLRPNEEKLAFSVIFD ITEKGEIKDSRIVHTVINSDRRFTYEEAQQIIETKEGDFKEEVLTLDTIAKALREKRFSA GAINFDRYEVKFEIDEKGKPISVYFKESKDANKLVEEFMLLANKTVAEKIGCVPKNKKAK 
VLPYRIHDLPDPEKLENLSQFIARFGYKVRTSGTKTDISKSINHLLDDIHGKKEENLIET VSIRAMQKARYSTHNIGHYGLAFEYYTHFTSPIRRFPDMMVHRLVTKYMDGGRSVSEAKY EDLCDHSSNMEQIAANAERASIKYKQVEFMSERLGQIYDGVISGVTEWGLYVELNENKCE GLVPVRDLDDDYYEFDEKNYCLRGRRKNKIYSLGDAITVRVARANLEKKQLDFELIEK >gi|222159335|gb|ACAB01000024.1| GENE 6 11050 - 12198 788 382 aa, chain - ## HITS:1 COG:no KEGG:Paes_0204 NR:ns ## KEGG: Paes_0204 # Name: not_defined # Def: TIR protein # Organism: P.aestuarii # Pathway: not_defined # 121 382 143 359 360 85 28.0 4e-15 MKKFGFFLFAVLGLIACGDDNNDPTPEQHVTCSISAPAEGATVNIAEKMTIKGEATIDFG EISNVTLKVGDKAISEVTAVPFSYDYTFEANQAEGALKIELTVKGDQGTTATSEVNVTLQ KTKPTPEPEEGKMIDPRDNHEYKIVTIGEQIWMAENLAYLPSVSKPEAAATSDGDPLYFV FNYDGEDVNAAKATKEYKTYGVLYNWYAAMNQKNATGGDAEAIPSGIQGICPSGWHLPSK AEWKKLESFVVSELAPVEGNVWTDEDGNKYSDKDCKNVWSALTGKLDTDGWGESGMIDEN PDLANGPRDTYGFNVIPAGQCYHSGGFETPKSKSRTDFWSTDQATYGAGTVYFSNMSYGL GYSSDKGGIQEKRGLSVRCVKN >gi|222159335|gb|ACAB01000024.1| GENE 7 12235 - 13368 726 377 aa, chain - ## HITS:1 COG:no KEGG:Fisuc_2304 NR:ns ## KEGG: Fisuc_2304 # Name: not_defined # Def: hypothetical protein # Organism: F.succinogenes # Pathway: not_defined # 143 376 530 740 747 79 29.0 2e-13 MKKILLTSFIVALGLLGASCKDDNSTASGGGEILPDQQVSCEIFMPTNGETVIISDKLII RGEGTTNYGKIISAELKVGGEVITDISSVPFYYEYTFPKEAEPGELKIELAVKGDHEGSA LATITVTTELGDRPAPPQIGEDLTDTRDGNVYKTVQLADQLWMAENLRYLPEQNFDISST APKYYVMFDSDIKTDLGKAYLKAYGAYYNLPAALQGETALGEDETRNIKGVCPDGWHIPS QKEWQTLSKYVLDSGMAAIMSDGQVDETAIAKALASTTMWMLPEYTEIEPQPTWVGVEME KNNATLFNGLPIGFRACAGDEDWMHTCYSAGWWSSTAGVQMGPEFGITVRLWSDLHTFVT NAEFNPGVGLPVRCIKD >gi|222159335|gb|ACAB01000024.1| GENE 8 13372 - 14412 644 346 aa, chain - ## HITS:1 COG:no KEGG:BT_3244 NR:ns ## KEGG: BT_3244 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 233 1 232 354 162 34.0 2e-38 MRKLFNIFLFVLCSVAIAGCNDTESSESKLEIKAVNTNFQATGGKGYIQLQSTGNITADV DVDWCVLKEVNPNEVVFEVKENTGYSGRNALLTISNGVETEAFNINQSGAVFIFGKDEWM LRTDNKAATLPIKLQSSFDYTIDIPAEAQEWLSFEQNAKGINFKVKENTSGKMRGAIVNV 
AAKDRSASYQVIQYDVDELTGTWQGMFSDGQMNYGLKDVTIEKQEDGTYLLSNILTGLPY KLKAKAIDNCLAFGAGQNLGVFEDNLYLSFEILSSDLYYVKDPSVTISLGPVMLTDGTFV FAFSGIKESDPFGFVFRVYEDAKLQKVIDNLSIFINCILFKEDIIQ >gi|222159335|gb|ACAB01000024.1| GENE 9 14420 - 15736 672 438 aa, chain - ## HITS:1 COG:no KEGG:BT_3243 NR:ns ## KEGG: BT_3243 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 436 1 443 443 282 38.0 2e-74 MKKYLSIYTLLALACIVLQSCLFSEEEIFDESSANRATADVIKCQEILKDVPNGWKLEYY IGSNYSAGAVTLLMKFDGKQVEMASETGAESYKPGTIITSLYQVKSEQSTMLTFDSYNPL IHMFSGPLGLNMNLGGDYEFIIMSATPDKVILQGKKYKNIMEMTPMPKDIPWRIQIEDII NIEKDAFLNTYRMEKGGQVLNYFIRNNGTMATFSAYSADYSSARSLPYIYTEKGLKLQSP YNVNGLEVQNFKWDRKSRLFVCTDAGATDIVLKEYYPENYLQYEDYIGTYTATIDDYDEG PTSQSVTISPKVRGESYTLKSIGGFNFTLQYDKASGKLVLDSQSISSTSSSSYYFACAAG VEGYAHTELSLPSRLRNGLVNVTTNSNPFTFYFADKASQEITSLIIWAYSSDEYSTSGLM GYWSWYNSILMVKENESN >gi|222159335|gb|ACAB01000024.1| GENE 10 15749 - 16645 944 298 aa, chain - ## HITS:1 COG:no KEGG:BT_3242 NR:ns ## KEGG: BT_3242 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 298 1 304 304 377 62.0 1e-103 MKKYIIYSLIMTLTCGLGACNNDEDVDKANSIFSTEEVDRNPFDKWILGNYTHPYNIALK YRMEDNESDMTHILAPAEYKKSVVLAKIIKHVWLEAYDEATGNPNFLRQYIPKTIHFIGS PAYEDNGTMVLGTAEGGMKITLYNVNDINPDQIDINLLNDYYFQTMHHEFAHILHQTKNY DPAFDRITENAYIGSDWYMVGADRNAWQQGFVTPYAMSESREDFVENIAVYVTNTKDYWN NMLQNAGENGRALIKQKFEIVYSYMEQTWGINLDELREIVLRRQDDIANGYVDLSIIE >gi|222159335|gb|ACAB01000024.1| GENE 11 16668 - 18236 814 522 aa, chain - ## HITS:1 COG:no KEGG:BT_3238 NR:ns ## KEGG: BT_3238 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 520 10 517 519 547 53.0 1e-154 MKYRNVLYTTLISALCLGTTSCSDFLDEMPDNRTELTTEESITKVLVSAYPMTTNCHIGE FYSDNIDENSRAYSYLFRLNEHLYRWQQTTEENQDSPHALWNDCYNSIASANQALAAIEQ MGNPQSLSAQKGEALICRAYNHFVLATTFCKAYGTNGDKDLGIPYIKEPETSVNPQYSRG TVAEVYENIAADLEEGLPLIDDNIYSRVKYHFNKKAAYAFAARFYLYYTQPDFSNCQKVI 
NYANIVLGNNASQYLRDWAALGALSPNKNIQPNAYVDADNRANLMVISAASYWPLVSDPG YANCERYCMNNITASESCKSEGPWGDQSSYHQIPFSPGGSIKNGFRRLVIYQQFTSGNSW IGYMLYPAFTTDEALLCRAEAYTLLKRYDEAAADIDAWQKAFTKNTQTLTKETINDFYAR LKYYTPEAPTVKKELHPDFVVEKGMQENLIHCILHARRLLTLEEGLRWQDIKRYGIIIYR RYYEGYTLVNITDKMDTNDPRRAIQLPASVITAGMQPNPRND >gi|222159335|gb|ACAB01000024.1| GENE 12 18260 - 21880 2504 1206 aa, chain - ## HITS:1 COG:no KEGG:BT_3240 NR:ns ## KEGG: BT_3240 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 142 1206 2 1058 1058 1716 79.0 0 MRNNWNLFIVCFLCVLCFQPYQVNAQTQSQKKITIEISNERLPSVLKRLEKLSGYKILFT YDDVKKFTVSGSVKDKSIEQTLDIILANKPLEYHIEDQFITITSKGPSKQAKVFNVKGVV ISGDDGQPLIGATVVIKGSKSGVLTDIDGKFSIENVSNKSLLQFSYIGMKPQDLTPTPTM NVTLMPDVQTLSEVVVTGMQKMDKRLFTGATKQLSADEVKLDGLPDISRGLEGRAAGVSV QNVSGTFGTAPKIRVRGATSIFGSSKPLWVVDGVIMEDAIDVGPELEDAIDVGPDDLSSG DAETLISSAIAGLNSDDIESFQILKDGSATSIYGARAMAGVIVVTTKKGKAGVSKMSYTG EFTTRMIPSYKEFNIMNSQEQMGIYKEMEQKGWLNNSDTYRAKDSGVYGRMYQLINQYNP VTGQFGLANTPEARNAYLREAEMRNTDWFNTLFSNNVTQNHSVSITSGTEKSSFYASLSA MSDPGWYKQSEVKRYTANLNTTYNIYKNLSINLISSASYRKQKAPGTLSSEVNAASGEVT RQFDINPYSYALNTSRALDPTVDYTANYAPFNILHELDNNYMDLNVADVKFQGEIKWKAL PELELSALGAVRYQASSQEHNIKDHSNQATAYRTGMDDATIRDENNLLYTNPDNPYALPI SILPEGGIYQRKDYRMLGVDFRATASWNHLFAEKHITNLFAGMEVNDLKRMQNYFQGWGM QYTMGEIPSYVYEFFKKGIESGESYYGLNHTTTRSVAGFANATYSYDGRYTVNGTFRYEG TNRMGKSRSSRWLPTWNMSGAWNVHEENFFKTFSSVLSHLTFKASYSLTADRGPEYVTNS HAIITSYSPFRPFTSGQETGLYVSDPENSELTYEKKHELNIGADMGFLDNRINFSIDWYK RNNYDLIGITPTQGVGGSIYKYANIASMKSHGIEFTISSKNIQSKNFSWHTDFIFSKAKN EVTELNARSSVMDLVSGYGFARQGYPVRGLFSIPFVGLNSNGIPMYNINGKITSTDIDFQ TRDNIDYLKYEGPTDPTITGSFGNIFTYKAFKLNVFITYSFGNVVRLNPCFSYQYSDLSS MPREFKNRWTVPGDEKRTNIPAIISKRQYEDNKDLMYAYNAYNYSTERIAKGDFIRMKEI SLSYDFPQSWIAPAKISNLSLKLQATNLFLIYADKKLNGQDPEFFNTGGVASPVPRQFTL TLRLGF >gi|222159335|gb|ACAB01000024.1| GENE 13 22032 - 23195 689 387 aa, chain - ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; 
T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 192 375 131 309 331 72 29.0 2e-12 MNKDIDNINNTDESVEKALDIMESSSEINVEQLQEMLKNEETLQACRDIMDSSLFLQQKG GIELPNVEMELERFKKKQHSTRMRSNLWKITIGIAAMITVLFGSYYLINSLTTPALEPIT VFTADTTPQHIVLQKDDGEKIVLDEPQSNNQVLPKTAIAKSEKKELDYRQVISTTTQTHV LTVPRGESFKVVLCDGTEVWLNANSNFVYPTAFIGNERIVTLEGEAYFKVAKDAKHPFIV KTKTVQTRVLGTEFNIRSYTPEDTHVVLINGKVEVSNTKGGSYTRLYPGEDAHLQSDGNF VLTEVDLDSYVYWKDGYFYFDNITLKDIMQNLGRWYNVNIEFRNKEAMTYKMHFISDRTK DLEHTISLLNRMKKVTVTLQGNTLTID >gi|222159335|gb|ACAB01000024.1| GENE 14 23258 - 23860 452 200 aa, chain - ## HITS:1 COG:no KEGG:BVU_0609 NR:ns ## KEGG: BVU_0609 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.vulgatus # Pathway: not_defined # 43 199 15 171 175 101 36.0 2e-20 MPQRYDKYKIKAILLFSIIICLFATRFIYYSTLLKHVEDKADFDFLFKEYYPQLYYYAFH LINNMEASKDIVSDAFEFIWANYAKIDKATAKSYLYIYVRNKSIDFLRHQNIHEQYVQIY SELTKSYVETEYLEQDERMMHISKAMEKLTPHTRHILEECYIQRKKYQEVAEELNISVSA VRKHIVKALQVIREECAKKS >gi|222159335|gb|ACAB01000024.1| GENE 15 23905 - 24912 705 335 aa, chain + ## HITS:1 COG:PAB2145 KEGG:ns NR:ns ## COG: PAB2145 COG0451 # Protein_GI_number: 14520521 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Pyrococcus abyssi # 4 331 6 303 307 88 29.0 2e-17 MESILITGASGFIGSFIVEEALKRKFGVWAGIRSTSSKRYLKNRKIHFLELDFAHPNELR AQLSGHKGTYNKFDYIIHCAGVTKCPDKHSFDYVNYLQTKYFIDTLKELNMVPKQFIYIS TLSVFGPIREKDYTPIKADDPAVPNTAYGLSKLKAELYIQSMPGFPYVIYRPTGVYGPRE ADYYLMAKSIRKHVDFSVGFRRQDLTFVYVKDIVQAIFLGIEKKVVRKAYFLTDGKVYKS RAFSDLIQKELGNPFVLHLKCPLIVLKVISLFAEFIATRSGRSSTLNSDKYKIMKQRNWQ CDITPVMDELGYVPEYDLEKGVRETIAWYKNEGWL >gi|222159335|gb|ACAB01000024.1| GENE 16 24903 - 25868 668 321 aa, chain + ## HITS:1 COG:no KEGG:BT_3074 NR:ns ## KEGG: BT_3074 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 321 1 321 321 542 
96.0 1e-153 MALDLFKRIETRKGLFAVEKITLIYNLLTSILILFLFQRMDHPWHMLLDRAMIAAMTFLL MYLYRLAPCKFSAFVRIVIQMSLLSYWYPDTFEFNRFFPNLDHVFATAEQFIFNGQPAIW FCHTFPHLIVSEAFNMGYFFYYPMMLIVALFYFIYKFEWFEKMSFVLVTSFFIYYLIYIF VPVAGPQFYFPAIGIDNVSKGIFPAIGDYFNHNQELLPGPGYQHGFFYSLVEGSQQVGER PTAAFPSSHVGISTILMIMAWRGSKKLFACLIPFYMLLCGATVYIQAHYVIDAIVGFFSA FLLYVVVTWMFKKWFAQPMFK >gi|222159335|gb|ACAB01000024.1| GENE 17 25958 - 26953 497 331 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148828154|ref|YP_001292907.1| ribosomal protein L11 methyltransferase [Haemophilus influenzae PittGG] # 1 287 1 284 326 196 39 4e-49 MKIAHIDLGKHPILLAPMEDVTDPAFRLMCKKFGADMVYTEFVSSDALIRAVSKTAQKLS ISDAERPVAIQIYGKDTETMVEAAKIVEQAQPDILDINFGCPVKRVAGKGAGAGMLQNIP KMLEITRAVVDAVKIPVTVKTRLGWDANNKVIVELAEQLQDCGIAALTIHGRTRAQMYTG EADWTLIGEVKNNPRMHIPIIGNGDVTSPQRCKECFDRYGVDAVMIGRASFGRPWIFKEV KHYLETGEELPPLSFEWCMEVLRQEVVDSVNLLDERRGILHVRRHLAASPLFKGIPNFRN TRIAMLRAETKEELFRIFDEITSQRKENPEI >gi|222159335|gb|ACAB01000024.1| GENE 18 26962 - 28887 1029 641 aa, chain - ## HITS:1 COG:FN0191 KEGG:ns NR:ns ## COG: FN0191 COG2865 # Protein_GI_number: 19703536 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Fusobacterium nucleatum # 21 417 6 380 477 94 27.0 8e-19 MENKKPIVLNNLEEWDFTDYIDSLIQKEEAVDLEFKIAKDGLPNSLWDTYSSFANTDGGV IILGVKEYKQQFIIEGLTKEQISQYKKDFWNQINNPDCVNENLLSDNDLYEGCYKGKNLL LIYVPRATRIQRPIYRTRNPFGGHTFKRNHEGDYKCTDAEVKRLIADSDENHPRDSRILC NYSIEDIDKETLTQYRQLFANLKPSHPWLSLNDLEFLTKLEAYRKDRHTKEEGFTLAGIL MFGKTESITDPECAPNYFPDYREHLGADDSIRWSDRICPDGTWEANLFQFYRKVYPKLTV ILPKPFQIRNGIRIDETPTHIAIREAFINTLIHCDFSEEGNIVVEQWVDKYRFKNPGTML VSKTQYYSGGDSICRNKALQKMFMLIGFSEKAGSGVNKIIKGWREANWQKPYVEELNRPD KVELTLPMISLLPDDAVIKLKELFDGKIEKLTQDELTALVTCYSENEINNTQLQYVVPQH RSDITKMLKKLCNEGFLIAEGNGRGTKYHINESEGKFETLEDKVESSENNMESSESNMES SESNMESSESNMESLKSNMESSKQNKQRMTFEELQALIISLSQDYISIEEIATKVHKTPK YLINFIIPRMFQNGTIERLYPGIPNHPKQKYKATNKNFNKQ 
>gi|222159335|gb|ACAB01000024.1| GENE 19 29001 - 30260 1407 419 aa, chain + ## HITS:1 COG:aq_1015 KEGG:ns NR:ns ## COG: aq_1015 COG0826 # Protein_GI_number: 15606313 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Aquifex aeolicus # 1 405 7 401 409 268 37.0 2e-71 MAPVGSRESLAAAIQAGADSIYFGIENLNMRARSANTFTIDDLREIARTCDEHGMKSYLT VNTIIYDKDIPLMHTIVDAAKEAGISAVIAADVAVMNYARQIGQEVHLSTQLNISNAEAL KFYAQFADVVVLARELNLEQVAEIYRQIQEEHICGPSGEQLRIEMFCHGALCMAVSGKCY LSLHEMNHSANRGACMQVCRRSYTVRDKETDVELDIDNEYIMSPKDLKTIHFMNKMLDAG VRVFKIEGRARGPEYVRTVVECYKEAIKAYLDGTFTDEKIAAWDERLKTVFNRGFWDGYY LGQRLGEWTRNYGSAATERKIYVGKGIKYFSNIGVSEFLVEAAEVSVGDKLLITGPTTGA LFMTLEEARVDLESVQTVKKGQHFSMKSDKIRPSDKLYKLVSTEELKKFKGLDIEQKRG >gi|222159335|gb|ACAB01000024.1| GENE 20 30301 - 30702 493 133 aa, chain + ## HITS:1 COG:CC3234 KEGG:ns NR:ns ## COG: CC3234 COG0824 # Protein_GI_number: 16127464 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Caulobacter vibrioides # 5 117 14 126 147 60 32.0 1e-09 MNYIYELDMKVRDYECDLQGIVNNANYQHYLEHTRHEFLTSVGVSFAALHEQGVDPVVAR ISMAFKTPLKSGDEFVSKLYMKKEGIKYVFYQDIFRKNDNKVVVKSTVETVCVVNGRLSD SELFDNVFAPYLK >gi|222159335|gb|ACAB01000024.1| GENE 21 30744 - 31820 1042 358 aa, chain + ## HITS:1 COG:FN1068 KEGG:ns NR:ns ## COG: FN1068 COG0758 # Protein_GI_number: 19704403 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Fusobacterium nucleatum # 69 356 10 285 288 191 36.0 2e-48 MVPGIGHIGAKHLIDGMGNAVDVFRLRKEIPERIPEVSQRVIEALDCPQAVLRAEQEYEF IRKNRISCLSFHDEAYPSRLRECEDAPVVLFFKGNADLNSLHILNMVGTRNATDYGTQIC ASFLRDLKALCPEVLVVSGLAYGIDIHAHREALANELPTVGVLAHGLDRIYPHVHRKTAV DMLEKGGLLTEFLSGTNPDRHNFVSRNRIVAGVCDATIVIESAEKGGSLITAELAEGYHR DCFAFPGRVNDEYSKGCNLLIRENRASLLLSAEDFVKAMGWEVSAHSAKVANVQRSLFLD 
LSEEEQKVIEILEKRGDLQINTLVVEADIPVQKMNTILFELEMKGVVRVLVGGMYQLL >gi|222159335|gb|ACAB01000024.1| GENE 22 31902 - 34391 1505 829 aa, chain + ## HITS:1 COG:CPn0849 KEGG:ns NR:ns ## COG: CPn0849 COG0553 # Protein_GI_number: 15618758 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Chlamydophila pneumoniae CWL029 # 221 669 698 1152 1166 233 32.0 1e-60 MSMAKKTKADKKTKSTVNKVSYHYRPDNMTLQDWQIALRRQAAMKEKFVIFERDKKEYPG YYTVINPTSGNEYNVVYRGHQSPWNYCSCMDFKASQLGTCKHLEGVKLWIREKRRKVCRV TPPYSSVYLSYQGERKVCLRIGTDNEEEFRKLASPYFTPDGVMRPAAIDSITEFLRAATR LNNTFRWYPDALGFILEQRDLRRRSQLLPDYASDTALDTLLKTKLYPYQKEGIRFAFRAG KSIIADEMGLGKTIQAIGTAELMRKHQFISSALIICPTSLKYQWKKEIERFTDAKAIVVE GNHLTRKVLYGAEEFYKIVSYNSVCNDIKILKSLHTDFLIMDEVQRLKNWNTQISKAARH IESDYSVILSGTPLENKLEELYSIMQFVDQFCLGPYYQFLDQTVVRSDTGKVVSYKNLNA IGEQMKNVLIRRRKKDVALQLPGRMDKVLFVPMTEQQRNMHDEYQSIVSQLVFKWTKTRF LSEKDRKRLLLMLSQMRMLCDSTYILDQKTRYDTKVEETLNILRNVFESGDEKVVIFSQW ERMTRLIAKELDVLGVRYEYLHGGIPSEARKNLTDNFTELPESRVFLSTDAGSTGLNLQV ASILINLDLPWNPAVLEQRIARIYRIGQKKNIQVINLVASQTIEERMLSTLNFKTSLFEG ILDNGEDTIFLENSKFDKMMDSLKVVAEPSEEVDGAHNRMAGCSIDLNDNEEKVVEETAV LQTTDTDSLALLEEETPEEGVTEVSTEKKSSVSAADNNPEQLLQQGFSFLSGLARTLSSP EATQRLVDSIVEEDKETGRTSIRIPVSDKESVVGILNLFGKLLAAGNKT >gi|222159335|gb|ACAB01000024.1| GENE 23 34615 - 34923 190 102 aa, chain - ## HITS:1 COG:no KEGG:BVU_3461 NR:ns ## KEGG: BVU_3461 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 102 481 582 590 172 82.0 3e-42 MTAQNRFYRPISEQDTQAGYVDIFLCPLLDIYSGMKHSYIVELKYAKYKDPESRVEELRL EGITQANRYAETETVKNAVDATQLHKIVVVYKGMDMPVCEEI >gi|222159335|gb|ACAB01000024.1| GENE 24 34989 - 35351 218 120 aa, chain - ## HITS:1 COG:no KEGG:BVU_3461 NR:ns ## KEGG: BVU_3461 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 38 104 375 441 590 114 79.0 1e-24 MTTTALRKNVMAKPPCTTQIWCSTSSRIIFNEAKHPGITNPDNFVSLLYYFGMLTISGMY 
EGKTKLTIPNQVVREQIYTYLLSTYDEADLSFSGYEKNELPAPLPTTLTGKPISVILPTA >gi|222159335|gb|ACAB01000024.1| GENE 25 35531 - 35803 270 90 aa, chain - ## HITS:1 COG:no KEGG:BVU_3461 NR:ns ## KEGG: BVU_3461 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 3 90 2 89 590 113 72.0 2e-24 MEEYIIPRRKRIPYGMMNFAVIRRDDCYYVDKIRFIPMIEEADKFFFFIRPCRFGKSLTV NMLQHYYDILAKNKFDALFDDLYIGKHPTP >gi|222159335|gb|ACAB01000024.1| GENE 26 35904 - 37031 657 375 aa, chain - ## HITS:1 COG:MA1854 KEGG:ns NR:ns ## COG: MA1854 COG1672 # Protein_GI_number: 20090704 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Methanosarcina acetivorans str.C2A # 2 375 4 389 390 74 23.0 3e-13 MEKPFVFGVATSGDNFTDREKETQRLLLNFAHGVNTVLISPRRWGKTSLVKKVAQLSQDS KRKVVYIDIFSCRTENEFYRLFATSILKQTSSKWEEWIENTKQFLSHINPKISIGADPMN DFSISFEYNMQNNTENDILQLPEKIAIEKGIQIVICIDEFQQISDFEDSKTFQKKLRSVW QLQQHVSYCLFGSKKHLMNELFEKKNLPFYKFGDAIYLAKIETKYWIEYICKRFENTGKH ISPELAEEICRLVDNHSSYVQQLSWLLWIRTINTATTEQLSYALEDLLDQNNILFQSETE NLSAYQMNFLKAIVDGIHTKFSSKEVISKYNLGTSANIVRLKNALLQKELIETNGKEIVL ADPVFGVWFKKEMLY >gi|222159335|gb|ACAB01000024.1| GENE 27 37050 - 37229 80 59 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFNIIMQENHLHIDPFTINNKSNCFMFAFILLRILTSLLVTTLLVTNLLVFQRITIFAQ >gi|222159335|gb|ACAB01000024.1| GENE 28 37336 - 39114 1606 592 aa, chain + ## HITS:1 COG:AF1252m KEGG:ns NR:ns ## COG: AF1252m COG5016 # Protein_GI_number: 18677784 # Func_class: C Energy production and conversion # Function: Pyruvate/oxaloacetate carboxyltransferase # Organism: Archaeoglobus fulgidus # 2 457 1 431 480 186 32.0 8e-47 MMKKEIKFSLVYRDMWQSSGKYQPRVDQLVRIAPLIIEMGCFARVETNGGAFEQVNLLYG ENPNKAVRAFTAPFKEAGIQTHMLDRGLNALRMYPVPADVRKLMYKVKHAQGVDITRIFC GLNETRNIIPSIKYALEAGMIPQATLCITYSPVHTVEYYARIADQLIEAGAPEICLKDMA GIGRPGMLGELVRTIKEKHPDILIQYHGHSGPGLSMASILEVCENGADIIDVAMEPMSWG KVHPDVISVQAMLKDLGFQVPDINMKAYMKARAMTQEFIDDFLGYFMDPTNKYMSSLLLK CGLPGGMMGSMMADLKGVHSGINMILRSKNEPELSLDDLLVMLFDEVEYVWPKLGYPPLV 
TPFSQYVKNVALMNLMQQVKGEDRWTMIDNHTWDMILGKSGRLPGKLAPEIIELAKSKGY EFVDTDPQLNYPDALDEYRKEMDENGWEYGEDDEELFELAMHDRQYRDYKSGVAKKRFEE ELQHAKDAAMAKNGYSEEEIKKLKRAKADPVIAPDNGQVLWEVSVEGPSIAPFIGRKYQH DEVFCYLSTPWGEYEKILTGFTGRVVEICAQQGANVHKGDVIGYILRSDIFA >gi|222159335|gb|ACAB01000024.1| GENE 29 39410 - 39910 223 166 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237717093|ref|ZP_04547574.1| ## NR: gi|237717093|ref|ZP_04547574.1| predicted protein [Bacteroides sp. D1] # 1 166 23 188 188 328 100.0 8e-89 MAYKKNPKKKDALSIKRAVESLCFQIDWGLKLLGAEKGDLFHQLAKVEVDFISELNLTQD ILAIKSLVDGVKQNLQIEPTPESGDFTHSVVALALGIASISHLNNISLPESWREQIEKKL LTIYYPEKQRNKVVDWAKANGYSTSSYLGRPIVKFKQLYLIIERTK >gi|222159335|gb|ACAB01000024.1| GENE 30 40344 - 41099 269 251 aa, chain + ## HITS:1 COG:no KEGG:PRU_1492 NR:ns ## KEGG: PRU_1492 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 2 250 51 296 296 187 36.0 3e-46 MHREAIAELFGVEVECDTSSYEYYISSSSQLKNDKTRQWLLNSFTVSNMIEAGRNMKDRI LFEEIPEGTEYLQTVIDAMQRKKELKIDYKPFNGHQSIFHLQPYAMKVYHQRWYVVGYLK EQEGIRNIALDRILEMELTDDSFILPNDFDAEEYYAHTVGIFVNGKLKPQKVVVRVFGVH VEYMRSLPLHFSQQEIKTEHKEYSDFEYQLCLTPELTSQLLAMGEKIEVMEPIGLREEIR KRLFAAINRYK >gi|222159335|gb|ACAB01000024.1| GENE 31 41174 - 42184 401 336 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717095|ref|ZP_04547576.1| ## NR: gi|237717095|ref|ZP_04547576.1| predicted protein [Bacteroides sp. 
D1] # 1 336 6 341 341 639 100.0 0 MVNNYTIKTTYYDRKMDEKLLTQINERFPWIISYVKSHNCLDFQTGNDPKTNRSWFSIYR GTGRILTFRSHSGKVNEICDVAEAYKELMQPDFFRNPTPDQFDTYLAKIASTEKFKRYYE DTKEVRNEGYYQTLIGRRYTFEVKDSDDFILFDKELVIGFKTKDIKDEWNKEIVDQQTLK IEQLRKTYNGTLPEEIKPEYGEFDFLGLNTNGDILIMELKQNDPTKTALSPIQTSYYYLQ FQKLAREDDKLYQRIKAMIEQKIDYGLIGSSYKNKIPLKLSGRIIPCVIVGEDSNLSKTI CERYRFIRDLFLPEMKAYTYIREEGTLVTSEKLETE >gi|222159335|gb|ACAB01000024.1| GENE 32 42181 - 43443 409 420 aa, chain + ## HITS:1 COG:PH0466 KEGG:ns NR:ns ## COG: PH0466 COG0595 # Protein_GI_number: 14590378 # Func_class: R General function prediction only # Function: Predicted hydrolase of the metallo-beta-lactamase superfamily # Organism: Pyrococcus horikoshii # 131 389 215 491 514 65 27.0 2e-10 MNLIIHRGADQIGGCITEISTENCKILIDFGSNLPGCKKEELTEEQVKSIIGNADAVFYT HYHSDHVGLHHLIPTNVLQYIGVGAKEVMLCKYDALRVHGDYSKQIEAIERMETYCAAKR IDVSKKGKIFVTPYFVSHSAFDAYMFLIECEGKKILHTGDFRRHGYIGKGLFPTLKKNVG EVDILITEGTMLGRSQECVISESEIQKNIIKALREHKYVFALCSSTDLDRLATFHAACKK TGRIFLVDEYQNRVLNVFTKYAGCKSDLFQFNAFKLINYRTVNVRNKLQKEGFLMPIRMS SGYLLKGMLDIYNDEKPWLIYSMWGGYTKEGKDYTNSDVINIRNLFGDRILDGTMDGVHT SGHADVETLKEVCQTVHPRIGVIPIHKDENSRYDSISGISSYFIFDEGDVDIHDIHISIK >gi|222159335|gb|ACAB01000024.1| GENE 33 43440 - 44009 313 189 aa, chain + ## HITS:1 COG:no KEGG:GFO_0152 NR:ns ## KEGG: GFO_0152 # Name: not_defined # Def: metallophosphoesterase domain-containing protein (EC:3.1.-.-) # Organism: G.forsetii # Pathway: not_defined # 1 186 1 203 205 136 40.0 5e-31 MKILQISDTHNQHRQLTDLPAADVIVHCGDFTDNGTEEEVLNFLNWFIELPYSHKIFITG NHDLCLWEAEGIEDLPNNVYFLQDCECEIDGIKFFGLAYNHPETLIPNDVDVLITHEPPI MILDESAGIHWGNALLRNKVYEVKPHYHLFGHAHEGYGTFKDEHIIFSNGAILDDHYNSC HKPKLIIYK >gi|222159335|gb|ACAB01000024.1| GENE 34 44265 - 44660 310 131 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160884700|ref|ZP_02065703.1| ## NR: gi|160884700|ref|ZP_02065703.1| hypothetical protein BACOVA_02689 [Bacteroides ovatus ATCC 8483] # 1 131 1 131 131 221 100.0 1e-56 METDLKKIVGHRLQMLRMEKNLTQEQMGEKLNLSTSAYCKIEYGETDLTLTRLNKIAEVL 
NMSALELFNKIDGNVYVNNSGTIGTNIGVAKDCSSVHIEAADDLRELIKANSRLLDMLYK RIELLENKVLL >gi|222159335|gb|ACAB01000024.1| GENE 35 45054 - 45923 929 289 aa, chain + ## HITS:1 COG:no KEGG:BVU_1036 NR:ns ## KEGG: BVU_1036 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 26 280 22 267 273 95 27.0 2e-18 MKKILLLFSICLIPMLGYAQTDENGKERVYIDYFSRPGTISNILAEALRNKVIEGIQKMD RVELIDVDSNEALKTEAKRRQEASAMGDAVCRSEVMTTLGAQYLIQGNITSMQGVKKTDS KGKTYYQGSVSYTLKIVDPSNGTLKGTQTFTHEGLTGNIGDTPDEAIIKTLDYVVISMDD FVDEYFKMKGTIVQIESTKKDKAQTVYIDLGTKRGVQKGQKFIVYIEMDIAGELSLKEVG RLNVKEVLSGTRSLCTVSKGGEEIMKASKEEKKLIILSRKNTFLGGLGL >gi|222159335|gb|ACAB01000024.1| GENE 36 45977 - 47185 664 402 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717101|ref|ZP_04547582.1| ## NR: gi|237717101|ref|ZP_04547582.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 402 20 421 421 758 100.0 0 MAQESYSNDVTCVKADDNIAVITASGTAEKKKDVYNMALKSIFNAIFLNGIDGVENGQPL VGKEDSYYMNQFFSSRYMLFVKNYETVGDPVRQSSNLYKGTVTAQILLGALKKDLIRNKL MTRPQEEMSMEETRQQMALPTIMVVPYKSNDRSSYADILKNDFDLRIAVSTVKEGFVKLG VKTVAAEGKQSGTLRASEWESKNADSNDKQLLMNSGADVYVIVDLQKDISAASGSRVSLI MTARETATGVDLASRKSWTNRFRTTDVDKLCAYAAQDVLDGFLKDISKEFARKVQQGNTV VLRVSLADNAINTMNSRINGSTTLSAHIRNWVRKNAQGGRYHIQGVVDDSLIFDSIQIPA KDSDGLPMDCITFADNLVNYLNDSGIDSEHRVDGSTIYLTIQ >gi|222159335|gb|ACAB01000024.1| GENE 37 47197 - 49074 1127 625 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717102|ref|ZP_04547583.1| ## NR: gi|237717102|ref|ZP_04547583.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 625 1 625 625 1204 100.0 0 MKHWLLFILVISWFCFPLSGQQTNRPDWVKQHPVSGLSYIGIGMAEISEGDYQQKAKQNA LSDLVSEIQVVIAANSLLNTLEDDGNVKQTFAESIRTEARAEIENFRLVDSWRSDNEYWV YYELNKDDYAALVAARRQKAIRNGFDFWYKGHITLQQGDLMTAIELFSNGMEAIRPVLNQ ELFCSYEGKTINLATELYAALAGVFDGITIVLNPVTVSVTPFQGIREPIAIGVYRNGNPL RNIRLKAEFVSGAGDLSSMSPTNESGVAALYVRNITSKQAQQEIGISLIDDVFSLFRKGS YAALFKQMLSSLPGATLTVNTVQTQTSAYVRSAQSDIEAVERTVKSLLNNHFFNVVASPS EADIIVTLDNKCRKGNTVPGELYNFIEFFSTLGIKIENNRTGQILLNYSINDERTLVPEN KSASQGKNMAARELIKRLNREFARELKKITFDRTGKIPERQKMLPDVPVPIVGASVPEKE ADPVISVPVVVPEVIPAPLVPVKSAKPENQKAIRVEWLDGVFVEFDKLATLGDKSRIHLK IVNTNADDCEVDLYSGNLTVINEKGEESPIVSVKLGSKFNDRRVTALIVPDLPTEMVIEV SKLQSVALLQLKDFKNNIVKLRGLK >gi|222159335|gb|ACAB01000024.1| GENE 38 49105 - 49686 488 193 aa, chain + ## HITS:1 COG:no KEGG:BVU_3810 NR:ns ## KEGG: BVU_3810 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 42 184 24 160 160 78 34.0 1e-13 MKRINLLGTALLTIAVMTSCGSSKPVTQTVQQPAVQQDVEINVPCSGPEFQTNKEYFRAS SMGLSTDMSIAKKKAMTEARAEIATAINAKVKSVTDSYVSSYQQGENDESKSRYQSLTRT VVEQELSGTRVICEKTMKTPDGKYKVYVSLELAGEEIMNAMANRIKNDDKLRIDFEYEKF KKVFEEEMSKNAQ >gi|222159335|gb|ACAB01000024.1| GENE 39 49755 - 50762 476 335 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717104|ref|ZP_04547585.1| ## NR: gi|237717104|ref|ZP_04547585.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 335 1 335 335 659 100.0 0 MMKLNLSDIRYNWVVIFIIQLLLSPGIDVFSQAMPVDTVYNQSYDWRPGQRETLPEWVFA SQRKGRVVGISDPCMKSEAARMLALQRAAYLYSLQQGVQLRLLSDVFSTMETASNTYEDQ RDKMLVLGVIEHPVQHVSYRIEHEYTSIFGEKFLEVSFTPSNDSCDFSYHSISELMLLFT KERVEEEEVKFNLLLESDSCREQSDQSWFQLKGTQSSPQIISYMNGVEICSSQEDCWYED VGFGGQGGLEKMDMRNAFWNAYMSSLVKALLLYPFSDVNVKQVDDRFNGGTDSGCGLYRE KVVAVLSISPFIKDIRNNKLYVDWQITEQQNIIRK >gi|222159335|gb|ACAB01000024.1| GENE 40 50759 - 52333 1094 524 aa, chain + ## HITS:1 COG:no KEGG:BF0477 NR:ns ## KEGG: BF0477 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 9 523 8 518 518 410 41.0 1e-113 MRKNRIRILSMVVFSLCLPFSLFAQIDKDMKEFDDFVKQQQKEFDDFVSEQNKEFAEFLK ETWKEYDLQKPVARPQRPEPVKPVSFDHKKPATKPQEVKVGEVTRLPIPTVTPSATFSPS KGKTQVPVVPGGKDVAVEPEKDFAVLPEIKPEVKPEETVPAKRSLSGVEFSFYSKDCVVN ASLKNRLTLKGIAEKDISAGWETLSRSNYQPLIDDCLAFKKDNALNDYGYLLLTRKVATE LCGSTHSDEIALMQMFLLSQSGYRVKVARMDNRLTLFYASGNMIYATCFITLNGVNYYRF DTTPNKTNSIYTYNRDFANAKNPVNMNITTPQPFSGTYVEKTLQAKAYPSVKVCSKVNSG LISFYKDYPQCDFSVYVGAPVSQEVQQTVLPSLQAAIQGKKQSEAANILINFVQTAFDYK TDGDQFGYEKPFFVDELFYYPYSDCEDRAVLYSYLVRTLMGLDVVLLEYPNHMATAVCFD ENIDGDYITVSGKKYIICDPTYIGASIGLAMSQFKNVAAKVLKY >gi|222159335|gb|ACAB01000024.1| GENE 41 52346 - 53632 909 428 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717106|ref|ZP_04547587.1| ## NR: gi|237717106|ref|ZP_04547587.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 428 1 428 428 841 100.0 0 MKNIKKYLCAIVMMVISMTAFAQEDVFNGDWLGANKEGDLKVAFTLNCDGNWQLNPYNEN ARCNGFMEVNMLEPGGRESLMATYEFYVESMNGNELTLSFVGGRPEVDAGISGQCKVVYK DDKLSFTGLDKGGKDAAFNGLTLVKSGSGADAIADAAADDGVPLGVKILAVLQLILYIAV VLFIVGHMFFVWYKGARYKEVFTVEGMLNKRLAAGMPEKMTDEEITEAWKLMDEAFATWT VIEKTDDDEFRKPTKMKQIKKSVLLIDQVIGMCPTDADVIERLNSLTDVINSGEERHFDG SRKLIWLGVIVGILMYWMMGVGMMFSTLIATGLYVVASRTPQFLIDKRALRGGGNIHNGI FAGVFGLLAGAQTVRTIYKFNDGHKEYSDDHSQHWIALAIGLVVLFVLAMMMAFWALLNY LRNYVLYF >gi|222159335|gb|ACAB01000024.1| GENE 42 53848 - 54141 231 97 aa, chain + ## HITS:1 COG:MJ1215 KEGG:ns NR:ns ## COG: MJ1215 COG1669 # Protein_GI_number: 15669400 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferases # Organism: Methanococcus jannaschii # 1 81 5 86 86 63 47.0 8e-11 MKTKDEIIAILRNFKEEFGERYGIEKLGLFGSVARGEQKEDSDIDICVKLQDPDYFTRME IKESLEERFNAKVDVVSLTAIMRSLFRNHIEKDAIYI >gi|222159335|gb|ACAB01000024.1| GENE 43 54125 - 54511 261 128 aa, chain + ## HITS:1 COG:no KEGG:BT_2210 NR:ns ## KEGG: BT_2210 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 125 1 125 125 102 43.0 5e-21 MQSISNADILDMLLFVEERINTTIERCGSVISVNDFLASPDKMDIFDATCMRLQTIGETV KNIDNLTNHELLINYAGTPWRSIIGLRNIISHEYLSIDPEEIFKIVKVHLPGLLLAIQQI RNDIAASI >gi|222159335|gb|ACAB01000024.1| GENE 44 54475 - 54630 82 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237721912|ref|ZP_04552393.1| ## NR: gi|237721912|ref|ZP_04552393.1| predicted protein [Bacteroides sp. 
2_2_4] # 7 51 1 45 45 80 100.0 4e-14 MYQSGFMNRSHFYKKFTKRIGVAPKDYRLHNKKIGKLQRSHILAAISFLIC >gi|222159335|gb|ACAB01000024.1| GENE 45 54629 - 54988 192 119 aa, chain + ## HITS:1 COG:no KEGG:Caka_0557 NR:ns ## KEGG: Caka_0557 # Name: not_defined # Def: oxidoreductase domain protein # Organism: C.akajimensis # Pathway: not_defined # 38 94 128 184 472 89 66.0 4e-17 MISSMVSEVVTRSLDAKLIRCSLMKFLGGDKISRRKFLATPDHSYFPICMEAMKLGIHVY VEKPLVRTSYECELLMQAEQKYGVITQMSYQGHSFCLLLTLNSQNLQNRFNRFTESFMS >gi|222159335|gb|ACAB01000024.1| GENE 46 55057 - 56334 512 425 aa, chain + ## HITS:1 COG:no KEGG:BT_3061 NR:ns ## KEGG: BT_3061 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 425 1 425 425 661 73.0 0 MKHGIYILAILMVLLGLPESGYASIPFFAVANDTIRRVMIYFDANDAGVNSCYKSNNQAI ATLDSLLSGNLGTKYVTALNVKTFVSPDGNESYNRSLAARRNDSIKEFLQRYNSDVSVDK IHFSSEGEDWSEFRKLVASDSNLPDREEVLILIDYHKNDVNKRKQLLRKLNRGIAYRYIV HNIFPELRRSVITIVGETSKLGKEAFEPVSSVFGLFVPNQEEALPKDQPDKSVGESEEKQ ACEVDISETEGPVKSQTVLAVKNNLLYDLALAPNIEVEIPIGKRWSLNTEYKCPWWLNSK HDFCYQLLSGGMEGRCWLGNRQKRNRLTGHFIGLYAEGGIYDFQLRGDGYQGKYYGAAGV TYGYARQLARHFSLEFSLGIGYLTTEYKKYTPYEGDIIWTNSGRYNFIGPTKAKVSLVWL ITTKR >gi|222159335|gb|ACAB01000024.1| GENE 47 56344 - 57351 549 335 aa, chain + ## HITS:1 COG:no KEGG:BT_3060 NR:ns ## KEGG: BT_3060 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 309 1 279 304 333 58.0 4e-90 MRNTRYRFLVLLSSLLMLTGCSRRDILDDYPVSGVDIKLDWDGMTDQLPEGVRVIFYPKN GEGRKVDKYLSVRGGEMKVPPGRYSVVTMGYNFNSDRIRIRGEESYESIEAYTEYCNDLG IAGMEKMVWSPDSLYVLNIDELKIEKSEEVLHLDWKMESVVKKYFFAVEAKGLEYVATVV GSIDGLSDCYCIGKGRGVCSSQPIYFEVKKGDNKVTASFTAFKQVKEMTMPTRMSISERE TSSEKDAIILILKFIKTDNTVQEATIDVTEIIGTLENAGTGEDGKPTPPPEIELPPDDKI EVDKPETPPNPDGGGGMGGNVDGWGPEDNVELPVK >gi|222159335|gb|ACAB01000024.1| GENE 48 57406 - 58371 792 321 aa, chain + ## HITS:1 COG:no KEGG:BT_3059 NR:ns ## KEGG: BT_3059 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 1 317 1 319 324 320 61.0 6e-86 MKKILLAVTAALAITGCSQNEEFEAPSQKAEINFNTAVTRATELDIDGLKSSGFQVYAYN TKAEEMSATVTLSTPWINGSATYSDSKWTVSGGPYYWPLAENLQFFAYSPKDGVTYTAPN GTTDKGYPKFTYTVGNTAALQKDLVIASVANAQKKTNEAATDVSLTFKHALTQVNIEVTK EAGYTYTISKVELTGIKGSGTFTYAGVNAGTWTAGTETSVSYAYELGAFSDDKTAVKAGN ALMLIPQALTDAKISIEYTVEKDGSKIAENSKEVSLTSTAAWAFGKKILYKLTLPIGAQE VGITATVTGWDAEDPVTPTVD >gi|222159335|gb|ACAB01000024.1| GENE 49 58636 - 59475 586 279 aa, chain + ## HITS:1 COG:no KEGG:BT_3058 NR:ns ## KEGG: BT_3058 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 274 1 274 280 481 83.0 1e-135 MPSICQGTWDCQICPKAVSNAITRVIYQRGFHKPAQKCEENLILFLMKGEILVNSKEYAG TMLKGGEFILQAIGSMFEMLAMTECECIYYRFIQPELFCDFRFNHIMKEVSPPLIYTPLK IIPELQYFLNGSITYLKGDKVCRDLLSLKRKELAFVLGYYYSDYDLSSLVHPLSKYVNSF QYFVIQNYKKVKTVEELAQLGGYTLSTFRRIFNNVFHEPVYEWMLARRKEGILDDLNNSK CSISEICYKYGFESLPHFSNFCKKSFGASPRNLRRQDIS >gi|222159335|gb|ACAB01000024.1| GENE 50 60100 - 60855 855 251 aa, chain - ## HITS:1 COG:Cgl0368 KEGG:ns NR:ns ## COG: Cgl0368 COG0479 # Protein_GI_number: 19551618 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit # Organism: Corynebacterium glutamicum # 5 243 1 238 249 234 46.0 1e-61 MDKNISFTLKVWRQAGPKAKGAFETYQMKDIPGDTSFLEMLDILNEQLISERKEPVVFDH DCREGICGMCSLYINGHPHGPATGATTCQIYMRRFNDGDTITVEPWRSAGFPVIKDLMVD RTAYDKIMQAGGYVSVRTGAPQDANAILIAKPIADEAMDAASCIGCGACVAACKNGSAML FVSAKVSQLNLLPQGKPEALRRAKAMLSKMDELGFGNCTNTRACEAECPKNISISNIARL NRDFIIAKLKD >gi|222159335|gb|ACAB01000024.1| GENE 51 60892 - 62871 2364 659 aa, chain - ## HITS:1 COG:Cgl0367 KEGG:ns NR:ns ## COG: Cgl0367 COG1053 # Protein_GI_number: 19551617 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Corynebacterium glutamicum # 4 658 28 673 673 606 48.0 1e-173 MIKIDSKIPEGPVAEKWTNYKAHQKLVNPANKRRLDIIVVGTGLAGASAAASLGEMGFRV 
FNFCIQDSPRRAHSIAAQGGINAAKNYQNDGDSVYRLFYDTVKGGDYRAREANVYRLAEV SNAIIDQCVAQGVPFAREYGGTLDNRSFGGAQVSRTFYAKGQTGQQLLLGAYSALSRQVN VGTVKLYTRYEMQDVVIVDGRARGIIAKNLVTGELERFAAHAVVIATGGYGNAYFLSTNA MGCNCTAAISCYRKGAVFANPAYVQIHPTCIPVHGDKQSKLTLMSESLRNDGRIWVPKKK EDAVKLQKGEIKGSDIPEEDRDYYLERRYPAFGNLVPRDVASRAAKERCDAGFGVNNTGL AVFLDFSEAINRLGIDVVLQRYGNLFDMYEEITDVNPGELAKEISGVKYYNPMMIYPAIH YTMGGIWVDYELQTTIKGLFAIGECNFSDHGANRLGASALMQGLADGYFVLPYTIQNYLA DQITVPRFSTDLPEFAEAEKAVQAKIDKFMSIQGKESVDSIHKKLGHVMWEYVGMGRTAE GLKKGIAELKEIRKEFETNLFIPGSKEGMNVELDKAIRLYDFITMGELVAYDALNRNESC GGHFREEYQTEEGEAKRDDENFFYVACWEYQGDDEKAPVLHKEPLVYEAIKVQTRNYKS >gi|222159335|gb|ACAB01000024.1| GENE 52 62909 - 63613 830 234 aa, chain - ## HITS:1 COG:no KEGG:BT_3053 NR:ns ## KEGG: BT_3053 # Name: not_defined # Def: putative cytochrome b subunit # Organism: B.thetaiotaomicron # Pathway: Citrate cycle (TCA cycle) [PATH:bth00020]; Oxidative phosphorylation [PATH:bth00190]; Benzoate degradation via CoA ligation [PATH:bth00632]; Butanoate metabolism [PATH:bth00650]; Metabolic pathways [PATH:bth01100]; Biosynthesis of secondary metabolites [PATH:bth01110] # 1 234 1 234 234 374 95.0 1e-102 MWLSNSSVGRKVVMSVTGIALVLFLTFHMAMNLVAIISADGYNMICEFLGANWYALVATA GLAALFVIHIIYAFWLTMQNRKARGSERYAVVDKPKTVEWASQNMLVLGLIVIVGLGLHL FNFWAKMQLPELMHNMGMHADTLTLAYAANGAYHIQQTFSSPVYVVLYLVWLFALWFHLT HGFWSSMQSLGWNNKVWINRWKCISNIYSTIVVLGFALVVVVFFVKTLICGGAC >gi|222159335|gb|ACAB01000024.1| GENE 53 63825 - 64694 687 289 aa, chain - ## HITS:1 COG:PA1713 KEGG:ns NR:ns ## COG: PA1713 COG2207 # Protein_GI_number: 15596910 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 131 277 122 268 278 60 31.0 4e-09 MPGDSFCIQYTNCTGCPKNVENILVYRNLPKGEHIPKDKCTQNCMLFMIKGELLINSEEH PGITLREKQIVLQAIGSKVEILALTDVEYVVYWFNELPLLCEDRYKEMMEQAEAPLTYTP LIMSERLYHLVTSMPEFLAEENPCSKYIDLKCKELVFLITNFYPLPQLGSFFYPISTYTE SFHYFVMQNYSNVKNVEEFAHLGGYTTTTFRRLFKNLYGVPVYEWILEKKREGILEDLQH TKMRITEICNRYGFDSLSHFAHFCKDSFGDTPRALRKKAAGGEKIGKVQ 
>gi|222159335|gb|ACAB01000024.1| GENE 54 64968 - 65930 773 320 aa, chain + ## HITS:1 COG:no KEGG:BT_3050 NR:ns ## KEGG: BT_3050 # Name: not_defined # Def: chitinase precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 318 1 319 321 555 84.0 1e-157 MLMLRKIVFFFFLFVGLSLFAQQDYMVSSYVRGNFYNRGRISTESLRASDDLIFLNVHPN KDGSLSFENPRVFQGKGVTTWEGLIKSVRAKVKGTKVKIRLGASSGEWKAMVADEAARTT FAKNIKTVLEKNKLDGIDLDFEWAENEKEYKDYSLAILKMREVLGDKYLFSVSLHPVCYK ISKEAIEAVDFISLQCYGPSPVRFPIEKYCSDIQMVLEYGIPREKLVAGVPFYGVTKDNS KKTEAYFNFVQNGLVTSPAQNEVIYQGDTYVFDGQDNIRIKTRYAKEQKLKGMMSWDLAT DLPLSDSRSLLKTMTEELRE >gi|222159335|gb|ACAB01000024.1| GENE 55 65965 - 66150 77 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262405889|ref|ZP_06082439.1| ## NR: gi|262405889|ref|ZP_06082439.1| predicted protein [Bacteroides sp. 2_1_22] # 21 61 1 41 41 77 100.0 4e-13 MSQSEICLLVEAYFLNKFLQMKILRQGYFSKEKRFNTFFINLIYNKGGEEPIKTEPVRQH L >gi|222159335|gb|ACAB01000024.1| GENE 56 66059 - 66847 500 262 aa, chain - ## HITS:1 COG:no KEGG:Fisuc_1212 NR:ns ## KEGG: Fisuc_1212 # Name: not_defined # Def: DNA-damage-inducible protein D # Organism: F.succinogenes # Pathway: not_defined # 1 256 1 256 278 356 68.0 6e-97 MKSEEIKELFKQFESIVCEYNKVECWSARELYPLLGYSQWRNFLSIIEKVKDACKNAGEN IAYHFADVSKMVILGSGAEREVDNIFLTRYACYLVAQNGDSRKQEIAFAQNYFAIQTRRA ELVEQRLIEYERVQARTKLAETEKLLSGVLYERGVDNQGFAIIRSKGDKALFHMDTKMLK RKLGVPDSRPLADFLPTISIKAKDFAAEMTSVNVQQKDLYGQTAIENEHVDNNVAVRDML VGRGIFPEQIPANEDIAAGVFL >gi|222159335|gb|ACAB01000024.1| GENE 57 66875 - 67513 548 212 aa, chain - ## HITS:1 COG:no KEGG:BT_3041 NR:ns ## KEGG: BT_3041 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 209 1 208 212 309 79.0 5e-83 MAIQFELYKTPRPKDEEDKEVYHARVVNFQHIDTDYLAKEIQIATSLTEGDVKSVLESLS HFMGDRLREGQSVHLDGIGYFQIKLNSQEPITSPKLKANQIKLKANISFKADVKLKKSVS VVHVERSKLKPHSAVLSNDEIDKLLTNYFKSNPVLTRRDFQGLCGFTPTTAARQIKRLKE EEKKLKNINTYYNPIYVPMPGYYGKAEMKTDE >gi|222159335|gb|ACAB01000024.1| GENE 58 68192 - 69685 1594 497 aa, chain - ## 
HITS:1 COG:XF2704 KEGG:ns NR:ns ## COG: XF2704 COG0793 # Protein_GI_number: 15839293 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Xylella fastidiosa 9a5c # 4 308 84 386 508 224 43.0 4e-58 AEFAISHFYVDKVDEDKLVEEAIIKMLAQLDPHSTYSDAEEVKKMNEPLQGNFEGIGVQF QMIEDTLLVVQPVSNGPSEKVGILAGDRIVAVNDSAIAGVKMSTEDIMKRLRGPKGSKVN LTIVRRGVQDPLLFTVKRDKIPILSLDASYMIQPKTGYIRINRFGATTAEEFKKAMTSLQ KQGMKDLILDLQGNGGGYLNAAIDLANEFLGQKELIVYTEGRTAKRSDFYAKGNGDFRNG RLIILVDEYTASASEIVSGAVQDWDRGIIVGRRSFGKGLVQRPIDLPDGSMIRLTIARYY TPSGRSIQKPYDSTVDYNKDLIERFNHGELMNADSIHFPDSLKAQTKKLGRTVYGGGGIM PDYFVPIDTTLYTDYHRNLVAKGAVIKFTMQFIEGHRKELKNKYKKFESFDEKFVVDDDM LATLKEIGEKEGVKFNEEQYQKSLPLIKTQLKALIARDLWDMNEYFQVMNTTNESIQKAL EILNSDEYQKKLKQGIQ Prediction of potential genes in microbial genomes Time: Wed May 18 01:23:58 2011 Seq name: gi|222159334|gb|ACAB01000025.1| Bacteroides sp. D1 cont1.25, whole genome shotgun sequence Length of sequence - 13396 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 2, operones - 2 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 1348 1355 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit 2 1 Op 2 . - CDS 1358 - 2494 973 ## BT_3032 hypothetical protein - Prom 2601 - 2660 5.0 - Term 2597 - 2649 2.0 3 2 Op 1 . - CDS 2672 - 4591 1298 ## Slin_1080 protein of unknown function DUF303 acetylesterase putative 4 2 Op 2 . - CDS 4610 - 5389 591 ## gi|237717127|ref|ZP_04547608.1| conserved hypothetical protein 5 2 Op 3 . - CDS 5392 - 8058 1219 ## BT_2524 alpha-rhamnosidase 6 2 Op 4 . - CDS 8084 - 9481 553 ## PROTEIN SUPPORTED gi|90020673|ref|YP_526500.1| ribosomal protein L9 7 2 Op 5 . - CDS 9489 - 10658 854 ## COG2152 Predicted glycosylase 8 2 Op 6 . - CDS 10685 - 12046 716 ## Dtur_0402 hypothetical protein 9 2 Op 7 . - CDS 12051 - 13091 854 ## COG2730 Endoglucanase 10 2 Op 8 . 
- CDS 13102 - 13308 178 ## gi|160884642|ref|ZP_02065645.1| hypothetical protein BACOVA_02631 Predicted protein(s) >gi|222159334|gb|ACAB01000025.1| GENE 1 1 - 1348 1355 449 aa, chain - ## HITS:1 COG:CT661 KEGG:ns NR:ns ## COG: CT661 COG0187 # Protein_GI_number: 15605394 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Chlamydia trachomatis # 17 449 7 429 605 396 50.0 1e-110 MEENELIPVDNNNAVEYTDDNIRHLSDMEHVRTRPGMYIGRLGDGAHAEDGIYVLLKEVI DNSIDEFKMQAGKKIEITVEENLRVSVRDYGRGIPQGKLIEAVSMLNTGGKYDSKAFKKS VGLNGVGVKAVNALSSRFEVRSYRDGKVRIATFSKGNLLTDETQNTEEENGTYIFFEPDN TLFLNYCFKPEFIETMLRNYTYLNTGLAIIYNGHRILSRNGLVDLLNDNMTATGLYPIIH LKGEDIEIAFTHTGQYGEEYYSFVNGQHTTQGGTHQSAFKEHIARTIKEFFNKNMDYTDI RNGLVAAIAVNVEEPIFESQTKTKLGSTNMVPGGVTVNKYVGDFIKQEVDNFLHKNADVA EAIQQKIQESEKERKAIAGVTKLARERAKKANLHNRKLRDCRIHLNDPKGKGLEEDSCIF ITEGDSASGSITKSRDVNTQAVFSLRGKP >gi|222159334|gb|ACAB01000025.1| GENE 2 1358 - 2494 973 378 aa, chain - ## HITS:1 COG:no KEGG:BT_3032 NR:ns ## KEGG: BT_3032 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 378 1 378 378 746 96.0 0 MAITIKKVSTKRELKKFIRFNYRMYKGNPYSVPDLYDDMLNTFNKKKNAAFEFCEADYFL AYRDDKIVGRVAAIINNQANEKWECKNVRFGWIDFIDDPEVSSALIKTVEEWGKERGMTH IAGPLGFTDFDAEGMLVEGFDQLSTMATIYNYPYYPVHMEKLGFEKDADWVEYKIYIPDA IPDKHKRISELIQRKYNLKIKKYTSGRKIAKDYGQKIFELMNEAYSPLYGYSPLTQRQID QYVKMYLPILDLRMVTLITDANDELVCVGISMPSLAEALQKSHGRLLPLGWFYLLKALFM KRRAKMLDLLLVAVKPEYQNKGVNALLFSDLIPVYQKLGFIFAESNPELELNGKVQAQWD YFETQQHKRRRAFIKEIK >gi|222159334|gb|ACAB01000025.1| GENE 3 2672 - 4591 1298 639 aa, chain - ## HITS:1 COG:no KEGG:Slin_1080 NR:ns ## KEGG: Slin_1080 # Name: not_defined # Def: protein of unknown function DUF303 acetylesterase putative # Organism: S.linguale # Pathway: not_defined # 7 633 9 635 652 526 42.0 1e-148 MKKLLSLTVCLLFVLSQIMAQLSVPSFFSDHMVLQREKPINIWGTASAGERISVTLGNAQ KSTRTDKNGKWSVSLPPMQAGGPYTLNVKSLKQTLSFTDILIGEVWICSGQSNMEFRLRS 
ANHATEEVAAANYPQIRSFNVIQEMGHTPKTDLKGKWEVCSPASASDFSAVGYFFARELY QKLNIPIGFINSSWGGTDIETWMSMEVIEHFPKYEKSLARMRSSEFEEYIKHSDKVKKEF EQAIINEPGEKEKWYLENTPTENWKEHIVPSLWSNEELSGIDGVVWFTYQFSIPANCLGQ DAELSLGTIDDDDITWVNGHEVGRTVGYDLKRLYKIPAEVLKEQNTITIKISDYRGGGGL YGPKDEVYLKINNRIFPLCDNWKYKVAVSSAQYDYVEYGPNAFPSLLFNAMIHPLVGLGM KGVIWYQGENNAARANEYIDLFPALIKDWRSRWNNEFPFYWVQLANFMSPAKQPSESHWA NLRDTQSKTLALPHTGQAVIIDIGEENDIHPRNKQDVGKRLALHALHNDYGYNSIVCTGP IFQSVKRIGDALEITFNACDEQLVARNKYGYLSGFAIAGADGKYQWAQAKIENNKVIVWN KEIQQPVSVRYAWGDNPDDANLYNAAGLPASPFEGHISQ >gi|222159334|gb|ACAB01000025.1| GENE 4 4610 - 5389 591 259 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237717127|ref|ZP_04547608.1| ## NR: gi|237717127|ref|ZP_04547608.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 259 1 259 259 540 100.0 1e-152 MYKRITLFTFICLLALAGIAQNKDTYTILLSGASFAEPNNKWFEMGCRALHAIPINRAVS AESIAHTANKMLDGTLYTPEEFDNIDVFVLMQVHEKDVYNEANLKENYKDYKTPFDASDY AVCYDYVIKRYISDCYNQKFNPKSRYYNTPYGKPASIVLCTHWHDSRPIFNTSVRKLAEK WGFPVVEFDRYIGFSKNQKHPVTGKQYSLIYTGDSQETHGEVFGWHPPHGEHSFIQQRMA AIFADTLRKILLPKEYINE >gi|222159334|gb|ACAB01000025.1| GENE 5 5392 - 8058 1219 888 aa, chain - ## HITS:1 COG:no KEGG:BT_2524 NR:ns ## KEGG: BT_2524 # Name: not_defined # Def: alpha-rhamnosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 24 877 23 877 881 918 53.0 0 MAHIPIRSYLYLLLGTTLFSCVPQELKPPFELKCENIPVPVGVDTQTPRLSWKLPLLEED SINRVEIWLSTDSTQLSGRQSGYWNKSIIGAPIRVSYDGQPLDSYTTYYWKIGYQTSSKQ KTTFSPISSFTTGCLSPDNWKGKWITDKHDITYRPAPYYRKSFQLDKTIEQALLTIASAG LHELSINGQRAGNHFLDPMYTHFDKRILSVTHDVTSLLSLGENVIGVQLGNGWYNHQSTA VWFFDKASWRNRPKFTAQLHLRYTDGTTEYLGTDSTWQTTDSPVIFNSIYTAEHYDAQKE LAGWDSPGFNATGWYHAQETESPTETIKSQVMHPIRETARYTATQCKKINDSCYVYHFPQ NIAGVTELKVKGKKGTKLRLKHGELLDKNGMVNMANIDYHYRPTDDSDPFQTDIVILSGK QDRFMPKFNYKGFQFVEVSSSAPIQLSDENLIAVEMHSDVPIIGYWSSSSDLLNKIWKAT NSSYLANLFGYPTDCPQREKNGWTGDAHIAIETGLYNFDGISIYEKWMNDFCDEQKDNGV LPCIIPTSVWGYDWANGVDWTSAVAIIPWEIYRFYGDTTLLRRMYGPIKKYVSYIESIST NHLTDWGLGDWVPVRSKSNITLTSSIYYYTDVCILAKAARLFGYAEDASYYNTLAQKIKE 
AINTSFLNKETGIYAEGTQTELAMPLYWGIVPEEDKKKVAARLHELVEKDDYHLDVGLLG SKALLSALSDNGYAETAYKVASQDTYPSWGYWIKQGATTLHENWRTDVVIDNSYNHIMFG EIGAWLYKGLGGIQIDEKHPGFKHILLKPFFPADMNELTIRYNTPYGWLNINWVRQTNDC IRYTIDIPAGTSATFVPFTMPEPQKSITLQAGKHSLELDFIHQLINQR >gi|222159334|gb|ACAB01000025.1| GENE 6 8084 - 9481 553 465 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020673|ref|YP_526500.1| ribosomal protein L9 [Saccharophagus degradans 2-40] # 1 463 5 521 522 217 29 3e-56 TMENIKLREKIGYGLGDAASSMFWKLFTMYLLFFYTDVVGISSAVVGTMFLITRIWDTFL DPFVGILGDRTNSRWGKFRPYLLWIAIPFGICGILTFSSFGDNMTTKIIFAYATYTLMMM VYSLINVPYASLLGVMSANPQVRTEFSSYRMTFAFGGSILVLFLIEPLVDIFSKMKITEN IPDIAFGWQMAAVVFAIMASGMFLLTFLWTKERVQPIKEEKGSLKEDLKDLGRNKPWWIL LCAGIMALVFNSLRDGSAVFYFKYYVDSSDTFSFSFMNSAITLITIYLVLGQAANILGIM FVPSLTKRIGKKKTYFVAMVGATILSVLFYFLPKDFIWGILCLQVLISICAGIISPLLWS MYADISDYSEWKTGRRATGLIFSSSSMSQKFGWTIGGALTGWLLAYFGFKANVIQSDFAQ TGICMMMSIFPAIATMLSAFFISRYPLNEKRLYEISTELEERRKK >gi|222159334|gb|ACAB01000025.1| GENE 7 9489 - 10658 854 389 aa, chain - ## HITS:1 COG:MA2382 KEGG:ns NR:ns ## COG: MA2382 COG2152 # Protein_GI_number: 20091213 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Methanosarcina acetivorans str.C2A # 46 365 1 316 335 113 29.0 5e-25 MKSNRLEELTQNYEALINRKNEICNNSNGIYKRYYHPVLTAEHAPLIWKYDFDEKQNPFM EERIGINAVMNTGAIKINHKYYLVARVEGADRKSFFAVAESNSPVDGFRFWDYPIEMPET DIPDTNMYDMRLTAHEDGWIYGIFCAERKDTNAPAGDLSSAVAVAGIARTKDLKTWQRLP DLKSPSQQRNVVLHPEFVNGKYALYTRPQDGFIDAGNGGGIGWALIDDICHAEIKEEKII NKRFYHTIKEVKNGEGPHPIKTPQGWLHLAHGVRGCAAGLRYVLYLYMTSLEDPTEIIAE PAGYFMAPIGEERIGDVSNVLFSNGWIEDDNGKIYIYYASSDTRLHVAESTVSQLVDYCL HTPTDGFRSIESVKRIITMVNHNKQYLKQ >gi|222159334|gb|ACAB01000025.1| GENE 8 10685 - 12046 716 453 aa, chain - ## HITS:1 COG:no KEGG:Dtur_0402 NR:ns ## KEGG: Dtur_0402 # Name: not_defined # Def: hypothetical protein # Organism: D.turgidum # Pathway: not_defined # 33 453 1 441 442 326 41.0 9e-88 MFKRFSICYILFMFCITGTSAQEDRWTGNATNLSKGNLRVNSSGRYLEYSDGTPFLYMGD 
TAWELISRLNDKETELYLENRREKGFTVIQTVILDELDDMDVSSNGEPKLIDGNIDKPAP GYFTHVDKVISLAAAKGLYIALLPTWGDKVDKQWGKGPEIFTPENAYRYGKWLGERYMNA PNLIWIIGGDRSGDGKNFAIWNALATGIKSVDKNHLMTYHPHGEHSSSFWFHNASWLDFN MCQSGHAQQDFAIYQRLLLPDLKKEPHKPCMDGEPRYENIPINFKKENGRFGDDDIRHTL YQSMFSGACGYTYGCNDIWQMFDTGREPKCDADTPWYQSMDKQGAWDLIHFRRLWEKFDF TQGKNQQTIFGNIPLENKNYPVAFGNKDYLLVYFPQGGERTIYLPSMKASKRSLKWMNPR NGRITFHQNTTADTIPVSSPTKGKGNDWVLIIE >gi|222159334|gb|ACAB01000025.1| GENE 9 12051 - 13091 854 346 aa, chain - ## HITS:1 COG:RSp0162 KEGG:ns NR:ns ## COG: RSp0162 COG2730 # Protein_GI_number: 17548383 # Func_class: G Carbohydrate transport and metabolism # Function: Endoglucanase # Organism: Ralstonia solanacearum # 35 340 115 418 420 186 34.0 5e-47 MKKVFISAFLLLSLLTLNGCKSNQPPAKETGEPYGVNLACADFGSSFPGEYNKDYTYPTD QDLEYWQKKGLKLIRLPFKWERLQLDLKGPLNQHDLNKMKELVRAAEKRDMVVILDLHNY CRRFMNNEHTLIGNNELTIEDLASFWQAIAKEFSTFKNIYGYGLMNEPHDLAPETKWFDM AQASINAIREVDTNTLIMVGGNDWSSAERWIEQSDTLKFLKDPANNLAFEAHVYFDKDAS GTYKYSYEEEECYPEKGIDRVKPFVEWIKQNKFHGFIGEYGIPDNDPRWNETLDLFLGYL QENGINGTYWAAGPWWDTYFMAITPKDGKDRPQMPIIEKYTSTLKK >gi|222159334|gb|ACAB01000025.1| GENE 10 13102 - 13308 178 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160884642|ref|ZP_02065645.1| ## NR: gi|160884642|ref|ZP_02065645.1| hypothetical protein BACOVA_02631 [Bacteroides ovatus ATCC 8483] # 1 68 773 840 840 140 100.0 3e-32 MERTGNMVTLTNTGKHPAIGVHVEVPEKMDQLIVSENYIWLNPQESKILKINLESPVIVK GWNLQSPY Prediction of potential genes in microbial genomes Time: Wed May 18 01:24:55 2011 Seq name: gi|222159333|gb|ACAB01000026.1| Bacteroides sp. D1 cont1.26, whole genome shotgun sequence Length of sequence - 55428 bp Number of predicted genes - 42, with homology - 42 Number of transcription units - 19, operones - 9 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 2271 1101 ## COG3250 Beta-galactosidase/beta-glucuronidase 2 1 Op 2 . - CDS 2290 - 3375 773 ## COG2730 Endoglucanase 3 1 Op 3 . 
- CDS 3435 - 4607 914 ## BF0761 putative lipoprotein 4 1 Op 4 . - CDS 4636 - 6399 987 ## BF0834 hypothetical protein 5 1 Op 5 . - CDS 6412 - 9663 2569 ## BF0759 putative outer membrane protein - Prom 9685 - 9744 6.3 6 2 Tu 1 . - CDS 9783 - 10760 513 ## COG2730 Endoglucanase - Prom 10879 - 10938 10.7 + Prom 10848 - 10907 8.6 7 3 Tu 1 . + CDS 11072 - 13069 645 ## Kfla_1880 fibronectin type III domain protein - Term 12910 - 12954 -1.0 8 4 Tu 1 . - CDS 13066 - 14055 928 ## COG0530 Ca2+/Na+ antiporter - Prom 14153 - 14212 9.5 + Prom 14100 - 14159 6.7 9 5 Tu 1 . + CDS 14183 - 16039 1650 ## COG0668 Small-conductance mechanosensitive channel - Term 16151 - 16195 11.4 10 6 Op 1 . - CDS 16245 - 17183 952 ## BT_3017 acid phosphatase - Prom 17214 - 17273 4.4 11 6 Op 2 . - CDS 17345 - 20155 2985 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 20180 - 20239 3.4 - Term 20222 - 20252 0.0 12 7 Op 1 . - CDS 20347 - 20835 414 ## BVU_1305 hypothetical protein 13 7 Op 2 . - CDS 20856 - 21509 326 ## gi|237717146|ref|ZP_04547627.1| conserved hypothetical protein 14 7 Op 3 . - CDS 21557 - 22747 861 ## BT_3008 hypothetical protein 15 7 Op 4 . - CDS 22817 - 23596 444 ## COG2173 D-alanyl-D-alanine dipeptidase - Prom 23616 - 23675 1.8 16 8 Op 1 . - CDS 23808 - 26711 2141 ## BT_3006 hypothetical protein 17 8 Op 2 . - CDS 26725 - 29712 1872 ## BT_3005 hypothetical protein 18 8 Op 3 . - CDS 29700 - 31055 1055 ## BT_3004 hypothetical protein - Prom 31124 - 31183 5.9 + Prom 31558 - 31617 3.6 19 9 Tu 1 . + CDS 31648 - 32151 488 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Term 32183 - 32237 6.2 + Prom 32195 - 32254 3.9 20 10 Tu 1 . + CDS 32315 - 33091 741 ## COG0561 Predicted hydrolases of the HAD superfamily + Term 33170 - 33208 5.5 - TRNA 33201 - 33277 73.3 # Met CAT 0 0 - Term 33152 - 33202 11.5 21 11 Op 1 . - CDS 33369 - 33680 268 ## BT_2979 hypothetical protein 22 11 Op 2 . 
- CDS 33686 - 34150 451 ## COG1522 Transcriptional regulators - Prom 34199 - 34258 9.9 - Term 34233 - 34291 11.3 23 12 Op 1 . - CDS 34311 - 35189 1004 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 24 12 Op 2 . - CDS 35210 - 35794 719 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 - Prom 35994 - 36053 5.7 25 13 Tu 1 . + CDS 36268 - 36966 565 ## COG0846 NAD-dependent protein deacetylases, SIR2 family + Term 37039 - 37079 6.1 26 14 Op 1 . - CDS 36987 - 37211 222 ## BT_2974 hypothetical protein 27 14 Op 2 1/0.000 - CDS 37225 - 37575 123 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 37591 - 37640 5.2 28 14 Op 3 . - CDS 37646 - 38422 715 ## COG0500 SAM-dependent methyltransferases + Prom 38768 - 38827 3.1 29 15 Tu 1 . + CDS 38865 - 39722 789 ## BT_2961 hypothetical protein + Term 39776 - 39816 1.3 - Term 39649 - 39686 1.3 30 16 Op 1 26/0.000 - CDS 39735 - 40484 696 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 31 16 Op 2 . - CDS 40477 - 41742 910 ## COG0438 Glycosyltransferase 32 16 Op 3 . - CDS 41753 - 42871 1011 ## BVU_0890 putative glycosyltransferase 33 16 Op 4 . - CDS 42844 - 43866 921 ## BVU_0889 hemolysin hemolytic protein 34 16 Op 5 . - CDS 43863 - 44726 692 ## COG3475 LPS biosynthesis protein 35 16 Op 6 . - CDS 44782 - 45834 680 ## BDI_2786 hypothetical protein 36 16 Op 7 . - CDS 45824 - 47278 973 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid - Term 47279 - 47335 12.6 37 16 Op 8 . - CDS 47351 - 49390 2354 ## COG0143 Methionyl-tRNA synthetase - Prom 49540 - 49599 6.0 + Prom 50194 - 50253 6.2 38 17 Tu 1 . + CDS 50326 - 52386 1685 ## COG1042 Acyl-CoA synthetase (NDP forming) + Prom 52477 - 52536 7.5 39 18 Tu 1 . + CDS 52626 - 52907 89 ## gi|295085629|emb|CBK67152.1| Glycosyl hydrolases family 43. 40 19 Op 1 . + CDS 53029 - 54324 1038 ## COG0673 Predicted dehydrogenases and related proteins 41 19 Op 2 . 
+ CDS 54321 - 54566 133 ## BT_3471 hypothetical protein 42 19 Op 3 . + CDS 54627 - 55428 839 ## COG0673 Predicted dehydrogenases and related proteins Predicted protein(s) >gi|222159333|gb|ACAB01000026.1| GENE 1 3 - 2271 1101 756 aa, chain - ## HITS:1 COG:TM1624 KEGG:ns NR:ns ## COG: TM1624 COG3250 # Protein_GI_number: 15644372 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 92 674 25 632 785 213 27.0 2e-54 MRTRIITIITLLLSTPMIIAQKSMDKIDRESFAAKLSPIEVKGIQMTEAGNIPLVRDTPA KISLDGTWQLAEGGTEKERLHTIWTDQIPAHVPGSIHTALVENGIIPDPYIGQNDSIAEK QSYKTWWMKREFELDSPSSHCILSFGGIANKCTIWLNGKLLGTHEGMFGGPDFSIGNYLK NKNTLIVKLEAIPQMFLGNWPPNANESWKYTVVFNCVYGWHYAQIPSLGIWRSVQLKEQA AVEIESPFIATRSLDGQMRLTLDLHKKSSPLKGVLYAEVSPKNFKGITQYYRFDINSQKK QETLSLDFQIKDPHLWWPNDRGEQSLYDLNLFFVPQKGKTAHIKTSFGIRTIEMRPLIDG AKEDYYNWTFVINGKPMFIKGTGWCTMDALMDFSRNKYEHLLQIAQSQHIQMLRAWGGGM PETDDFYELCDKYGILVMQEWPTAWNSHNTQPYTILQETVERNTKRLRNHPSLIMWGAGN ESDKPFGPAIDMMGRLSIELDGTRPFHRGEAWGGSLHNYNCWWDDAHLNHNLNMTAPFWG EFGIASLPHIETVRRYLDEEKEVWPPQRSGNFTHHTPIFGTMREMEKLTQYSGYFMPKDS LASFILGSQLAQVVGVRHTLERARTLWPHTTGALYYKMNDNYPGVSWSCVDYYGIIKPVH YFVQKSFAPLAAVMLFDRSNLASQEVSLPVYLLDDCQTLEKEPYQVKVSIYNALLDTVAT HTFNGTGDDNVVKKLGEISLNREQTKSTMLFFVLDI >gi|222159333|gb|ACAB01000026.1| GENE 2 2290 - 3375 773 361 aa, chain - ## HITS:1 COG:RSp0162 KEGG:ns NR:ns ## COG: RSp0162 COG2730 # Protein_GI_number: 17548383 # Func_class: G Carbohydrate transport and metabolism # Function: Endoglucanase # Organism: Ralstonia solanacearum # 32 356 106 418 420 175 34.0 1e-43 MLKDLFSLLTIIALLFSSCSKSDEEENGDEPQPTKQTAYFGVNLSGAEFGNVYPGVDGTH YGYPTEKDLDYFKAKGLYLVRFPFRWERIQPTMNGELNATELAKMKKFVKAAEDRNIQIL LDMHNFGRYCVYCDGQSSQNNQYAIIGNARCTVDNFCDVWKKLAKEFKDYKNIWGYDIMN EPYEMLASTPWVNIAQACINAIRTIDTKTTIIVSGDEFSSARRWKECSDNLKTLTDPSNN LIFQAHIYFDSDSSGNYNKGYDEDGATVQTGVARLKPFVDWLKENNKRGFVGEYGIPDTD GRWLDILDAALKYLQENGINGTYWSAGPRWGDYPLSVQPTNNYTQDRPQLSTLLKYKSTQ Q >gi|222159333|gb|ACAB01000026.1| GENE 3 3435 - 
4607 914 390 aa, chain - ## HITS:1 COG:no KEGG:BF0761 NR:ns ## KEGG: BF0761 # Name: not_defined # Def: putative lipoprotein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 43 390 34 367 368 135 30.0 3e-30 MKKIKYFAIIAASIFTLTSCTDIVEVDDLKAKENKPSTGAPTVDKVVLATDAEFPIEGAN FEQVVRIEGTNLGDITSLKFNDIEVDSKEIYSTYDMLLAPIPRALPKEVTNTIYITTKHG ELSIPFVVSIPDLTINGLKNEFTQPGDTTVITGDNFDLYGITIEEAIVNLGNLPVNVIDA TRTELTIEIPANAAPKSTLTIKGANMDEAYKLTYMDPGVSQLFDFNNWPGSGAFTHSSQF PDAPKDFLCDGTLEGQPEPLVEGGKYIRFNNSVKAWGWMVMWAGYITVPAEVAADPSSYD LRFEICTGAKFPISAQARIILGDYGWYPSKGGIPVNTYGGWQTVRISADTEALLPNPIDP STNTAFKIIFSPESAQDFDLSMCNFRFVHK >gi|222159333|gb|ACAB01000026.1| GENE 4 4636 - 6399 987 587 aa, chain - ## HITS:1 COG:no KEGG:BF0834 NR:ns ## KEGG: BF0834 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 586 1 555 555 383 40.0 1e-104 MKKKNIFIYLMASSLLLSGAVMTSCESMIEEKPFDFIVPEDVEDSDNGADMWVTGVYNTL HEAMFRYGSFPRPLDYDCDYISGAVWQFSQFGSGNFQGGDGQADVLWTGMYSLINRANIA VSEINKMQNVSEVFKKNALGECYFLKAWAYFYLVRAYGAIPIYSVSVNESGQYTNNPRIP IAQVYTETIIPLLKDAKDMIYKNSDNGFKPGRVCAATAAGLLAKVYATIGSASMATGEQI TVKTGAPFVMQNVNGTMTKVYTEPVPTTFSKDQVAGYESFSSQEYYRLAYEIAGDIIGGE YGIHKLEDYDLVWSPSGKTCSEHLFSLQTKSGDELYGTLFSSHYCGRLNAAGNIDNSLTV GCRKHWYLLFEEKDYRVDKGVLHCWIRQNSDTSWGGGSYYPNFGKWQRMVEAKEPPFDNP KVTSGWRCDEGGSEQFFAFTTKYSQQIADQTQPRTDANYPFLRYADIVLIFAEAANELNG PTKESVDALNDVRTRSNATGKELANFTDKTSLRSAILEERAMELALEGDRRWDLIRWGIY LQAMNALGGMDEANNVKQRSSKHLLFPIPTLEILTNQGINENNPGWD >gi|222159333|gb|ACAB01000026.1| GENE 5 6412 - 9663 2569 1083 aa, chain - ## HITS:1 COG:no KEGG:BF0759 NR:ns ## KEGG: BF0759 # Name: not_defined # Def: putative outer membrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 43 1083 24 1045 1045 1107 56.0 0 MKKNLFSFPRSKVRMLKGSKGVWLFLIMFWMINTAASAAGIEIKGTVTDSKGEPLPGVNI VELGVKKNNGTISDLNGKYTITVESQKSVLQYTFIGYKTTEVTVGNRKTINVSLKDDTQS LDEVVVIGYGTMRKKDLSGAVASIKSDDLMLGNPTSISQALQGKLAGVQVNQSDGAPGSG VSITIRGANSFSTNSQPLYIVDGIPFEVGDTPSSKANEGNNSTTNPLSLINPNDIESIDI 
LKDASATAIYGSRGANGVVLITTKRGRAGDAKVEFSTNFGLSKIAKMVKMLDAYTYANYV NEGVINGAAYDNLPYSYLPYRGKWNYRRDENDKIVPNSGKYYASPEDYLNPGYREDEYGN KEWVEGTNWMDEILQDALTQEYNLSVSGGNEKSNYAFSGNYTDQTGIIKNSGYERFAVRA NIGSHVKPWLNTGLNINFTRSLTKFAKSNSYDYSIIRSAMLYLPTLYVGDKTEDDSYAWL SANPRTYVNTAKDELKSINVFTSAFAEIKIFDCLKFRQNLGISYSVNDRASYYNRETGEG KASNGRAGKSDNFWQNLTAESLVTFDKTFNKLHHLNVVAGFTYEKSDWGGKTMNASNFPT DITQDFDMSQALNIETPASYRGQAVLVSLLGRANYTLKERYIFTASFRRDGSSRFAPGNK FANFASGAVAWTISEEEFIKNLNIFSNLKLRFSYGQTGNQAISSYQTIASLAPSNYPLDG TLSSGFAGQTYKGPLNDKLKWETTDQYNVGLDMGFWNNRISLSANYYYKKTNDLLQNVSI PNSTGYTTMWTNFGHVKNKGLELTGKIIALDKKDWTLDFDGNISFNKNEIGGLTADQYAN QLWYSAKEVFLQRNGLPIGTLFGYIEDGFYDNIAEVRADPIYAKASDDEARRMIGEIKYL DKNNDGKITSEDRAIIGDTNPDFIYGLNANLRWKNLTLGLFFQGTHGNDIFNGNLTNIGM SSIANITQDAYDSRWTPENAANAKWPRVTTAMTRDMKLSDRYVEDGSYFRLKTINLNYNF GSVIKGISNLSVFGTVTNVFTITGYSWFDPDVNAFGSDASRRGVDIFSYPSSRTYSIGFK LTL >gi|222159333|gb|ACAB01000026.1| GENE 6 9783 - 10760 513 325 aa, chain - ## HITS:1 COG:BS_bglC KEGG:ns NR:ns ## COG: BS_bglC COG2730 # Protein_GI_number: 16078874 # Func_class: G Carbohydrate transport and metabolism # Function: Endoglucanase # Organism: Bacillus subtilis # 6 313 21 338 508 238 42.0 1e-62 MKKVILFITLFSMISLFSYSKDPVKQWGQLQVKGNQLCSQTGDPIVLRGVSYGWHNLWPR FYNKQSVKWLKKDWKCTVLRAAMGTVIEDNYIENPEFALKCMNKVIKAAIKNDLYIIIDW HTYYPQKKEAKAFFSMMAQKYGKYPHIIYEIYNEPMEDSWESVKEYATDIISEIRKYDPD NIILVGSPHWDQDLHLVAESPLEGSDNIMYTLHFYAATHKQELRDRAEAAWEKGIPIFVS ECAGMECTGDGPLDIPEWTRWVEWLESKKISWVNWSISDKNETCSMILPRANKNGGWDES LIKPAGRQSRKFIRQYNSHIYKNKE >gi|222159333|gb|ACAB01000026.1| GENE 7 11072 - 13069 645 665 aa, chain + ## HITS:1 COG:no KEGG:Kfla_1880 NR:ns ## KEGG: Kfla_1880 # Name: not_defined # Def: fibronectin type III domain protein # Organism: K.flavida # Pathway: not_defined # 293 660 475 847 1090 157 29.0 9e-37 MKRSIFLVCFVVFGVVSLSAIGSIVSVAQSGWQTDSISYGISRMVDIASLPLLDRGISVH YEGSIDKRGKNADWDWSLYQDQRGEWVIFDVDGPGCIYNLVQHRYMSSSDPLFRFYFDGE ETPRFSLHLSEFGEKEPFIKPLAESYIGPFDNGRGPIRVARSFVPMPFNKGCRVTTDVKL 
EGYERTKGEGGWGHVVYHTYADNGIKTFTGKENYDTLIQLWKKQGSNLLCKDQLAYHRKS EQKINAGESITLLDEKGEGAIGSLKFYLPEINEQHLQDVWIHMFWDAHQQPDISCPLACL GGNSLGFHDTNYLLSGYNTDGWFYNYFPMPYWKHAKIIIENRSGVPVSLGFSEIAVSRSV YPTSNTGYFRNTPYYTRKHVAGIDSPIAAIQGRGKMVAAHVTCHAERSHIISCEGDVRVY IDGKRTPQVESDGSESYVCYGWGFPTPPEVHPMGGYDGLSDNPWSMTRFCIGDSYPFYSE LKFGIESGEYNNQYLEHSGTIFYYGQDKNVLVKTDSLDLNSSRSIKRHSYKAMGNVRRTK LESFFEGNEDRILYVGETVRFESFSSFRVNISSQNEGVRLRRLSDQNDVRQAARVFVDGE EVTERLWYVADSNPYKRWLEDDFEIPAKYTKGKKSLNIRIVPVSMSKEGKNTWNEAEYQV FCYNN >gi|222159333|gb|ACAB01000026.1| GENE 8 13066 - 14055 928 329 aa, chain - ## HITS:1 COG:BH0465 KEGG:ns NR:ns ## COG: BH0465 COG0530 # Protein_GI_number: 15613028 # Func_class: P Inorganic ion transport and metabolism # Function: Ca2+/Na+ antiporter # Organism: Bacillus halodurans # 16 327 16 316 318 223 45.0 6e-58 MNILLLIGGLILILLGANGLTDGAASVAKRFHIPPIVIGLTIVAFGTSAPELTVSVSSAI KGSADIAIGNVVGSNIFNTLMIVGCTALFAPIVITRNTLRKEIPLCILSSIVLLICANDV FLDKAPENILNRVDGLLLLCFFVIFMGYTFAIASKPSTTEQAGQIEAPTIEEETEIKSLP WWKSILYIIGGLAALIFGGQLFVDGATGIARNLGVSESIIGLTLVAGGTSLPELATSIVA ALKKNPEIAIGNVIGSNLFNIFFVLGCSASITPLRLSGITNFDLFTLVGSGILLWLFGLF FAKRTITRIEGGIMILCYVAYTVVLIYNI >gi|222159333|gb|ACAB01000026.1| GENE 9 14183 - 16039 1650 618 aa, chain + ## HITS:1 COG:sll0590_2 KEGG:ns NR:ns ## COG: sll0590_2 COG0668 # Protein_GI_number: 16331818 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Synechocystis # 358 595 1 240 264 250 51.0 5e-66 MEINKIKRSLLVLFFSFLAIGAQAQLEQAVKKIFAGDTITGGHVPLKRDSDSIHLVNMQK SLEEARLNEANMRMEMEQMKLQMATADSVKYAQQRQRIDSLRQFTKGIPVVADGDTLFYL FTKRGGYTPQQRAQMTGAAIEEIGRRFNLQPDSVAIDHSDIVSDLMYGSKVLLSLTDQDA LWEGVSRDSLAKERQQSVITKLHEMKAEHGLWRMAKRVLYFVLVITGQYFLFRLTNWLFR KLKARILRLKDTKIKPVAIQGYELLDAQKQANLLVFLSSIGRYILMGLQLLFTVPLIFII FPQTEGLAYQLLGYIWNPIRGIFVGIIDYVPKLFTIIVIWYAVKYLVRLVLYLAREVEAG RLKINGFYPDWAMPTFHIARFLLYAFMIAMIYPYLPGSDSGVFQGISVFVGLIVSLGSST VIGNIIAGLVITYMRPFKMGDRIKLNDTTGDIIEKTPLVTRIRTPKNEVVTVPNSFIMSS 
HTVNYSTSAREYGLIIHSEVSIGYDIPWRQVNQILIDAALNTPGVVDDPRPFVLETSLSD WYPVYQINAYIKQADKMAQIYSDLHQNIQDKFNEAGIEIMSPHYMAVRDGNESTIQKGAV KNSGPAKTAEQGDVSEKA >gi|222159333|gb|ACAB01000026.1| GENE 10 16245 - 17183 952 312 aa, chain - ## HITS:1 COG:no KEGG:BT_3017 NR:ns ## KEGG: BT_3017 # Name: not_defined # Def: acid phosphatase # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 312 1 310 310 606 91.0 1e-172 MNMKIKSLLILALVSVCTFVQAQLTDYSIFDKKFNFYVANDLGRNGYYDQKPIAELMGTM GEEIGPEFVLATGDVHHFEGVRSVNDPLWMTNYELIYSHPELMIDWFPILGNHEYRGNTQ AVLDYTHISRRWTMPARYYTQTFEEKGATIRIVWIDTTPLIEKYRQESDKYPDACKQDVN KQLSWLESVLANAKEDWVIVAGHHPIYAYTPKEESERLDMQKRVDTILRKHKVDMYICGH IHNFQHIRVPGSDIDYVVNSSGSLARKVEPIEGTKFCSPEPGFSVCSIDKKELNLRMIDK KGNILYTVTRKK >gi|222159333|gb|ACAB01000026.1| GENE 11 17345 - 20155 2985 936 aa, chain - ## HITS:1 COG:CC0995 KEGG:ns NR:ns ## COG: CC0995 COG1629 # Protein_GI_number: 16125247 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Caulobacter vibrioides # 125 936 53 903 903 180 24.0 1e-44 MKRFLKFVSFVLLVMSTSGNTFAEEKVNVVKQGTIRGRIVDTSKQTLPGASIYIEKLHTG VTSDVNGYYTFANLTPGTYTVKVSYVGYSPVEMKITIPAGKTLEKDVILNEGLELQEVVV GGAFQGQRRAINSQKNTLGITNVVSADQVGKFPDSNIGDALKRINGINVQYDQGEARFGQ VRGTSADLSSVTINGNRLPSAEGDTRNVQLDLIPADMIQTIEVSKVVTSDMDGDAIGGSI NLITKSTPYKRTISATAGTGYNWISQKAQLNLGFTYGDRFFNDKLGMMAAVSYQNAPVGS DDVEFEYDVNKKGEVVMVEAQKRQYYVTRERQSYSLAFDYEINPNHRLTLQGIYNRRHDW ENRYRVTYKDLDKTGPDDEGEMQQSAQIETKGGTPNNKNARLELQQTMDFSLGGEHQFGK LSMNWGASYARASEDRPNERYFNLKQDFLGFDIVDAGGRFPYVTTDVNLHNGEVDGERGK WKVKELTESNQEIYEKDLKFKVDFELPLANGIYGNSLKFGAKYASKTKNRDVTVYDYADA YKDAYKTAYMDNLTSEIRDGFMPGNQYKATDFVSKEYLGSLDLKNMEGEQVLEESSGNYH AKENVTSAFFRFDQNLGKKLKMMLGLRMEATHIKYDGWNWMVDEEENETLEPTGDHKNNY VNWLPSVLLKYDVTDDFKVRASFTETLSRPKYSALIPCVNINRSDNELVMGNSDLTPTLS YNFDLSADYYFKSVGLVSAGIFYKKINDFIVDQVIGDYTYQNNEYKKFTQPKNAGDADLL GVELAYQRDFSFIAPALKCIGFYGTYTYTHTKVNNFNFEGRENEKDLSLPGSPEHTANAS LYFEKKGFNVRLSYNFASSFIDEMGEVAALDRYYDAVNYMDLNASYTFGKKFKTTFYADA 
TNLLNQPLRYYQGTKDRTMQSEHYGVKINAGVKVNF >gi|222159333|gb|ACAB01000026.1| GENE 12 20347 - 20835 414 162 aa, chain - ## HITS:1 COG:no KEGG:BVU_1305 NR:ns ## KEGG: BVU_1305 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 151 1 165 165 97 38.0 1e-19 MKKYILILFSIISLWSCKETEDESIDITVLPSATTTGANTFGCLMDGWIYVGGRYLNWGH SYVWTYDSFHYYTEEDKLSVSVSVKPGINLSFTILSPQEGKESTITDIEFGREELEDGTA FISRFDTEMKIISVTFGNGKRLTNGRFDIHYTEHNSNTGAQS >gi|222159333|gb|ACAB01000026.1| GENE 13 20856 - 21509 326 217 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237717146|ref|ZP_04547627.1| ## NR: gi|237717146|ref|ZP_04547627.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 217 1 217 217 430 100.0 1e-119 MNQTPIKAILLLIVFTLSCHTPLFSQRDFERHEFSFHAGYGVMFHNPPTLTLSTHSYQRT LAQGVSWDGQYHFRPLKRFIFGGIYTGFSSKGSHPEGKDHLWVHFIGTQIGMCNANTKHW QIRVTTGPGGVILRNNSEVFGNTRKVRAFTIGLLTNANLTYKLNSNLGVSLGVQYMYSEL LRIRTYYHEEKVTVKLNSNDDVNLTRLNITTGLSYYF >gi|222159333|gb|ACAB01000026.1| GENE 14 21557 - 22747 861 396 aa, chain - ## HITS:1 COG:no KEGG:BT_3008 NR:ns ## KEGG: BT_3008 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 395 1 394 395 687 82.0 0 MIQISPETQLFIREHSSDDVRALALQAKKYPDINMPTAITQIAGRQVAAEKIPSWREIEE IWYPKHLSLEQCSSEITARYKARLFQGDSLTDLTGGFGIDCSFLATGFKSATYVERQEEL CEIAAHNFPILNLNHINVRNEDGVAYLQSMSPIDCIFLDPARRNEHGGKTVAISDCEPNV AELEALLLNKANRVMIKLSPMLDLSLALKELKHTQEIHILSVNNECKELLILLGQTSPTE ITIHCVNLLTKGTQEEQHLVFTREQEQRSQCTYTDSLGNYLYEPNASLLKAGAFRSIAAA YPVRKLHPNSHLYTSDSFIENFPGRIFRIVNQCSFNKKEVKENLADLKKANVTVRNFPAT VAELRKRLHLTEGGDTYLFASTLNNGQKVIIRCEKV >gi|222159333|gb|ACAB01000026.1| GENE 15 22817 - 23596 444 259 aa, chain - ## HITS:1 COG:CC2273 KEGG:ns NR:ns ## COG: CC2273 COG2173 # Protein_GI_number: 16126512 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine dipeptidase # Organism: Caulobacter vibrioides # 55 252 17 198 212 95 33.0 1e-19 MIILKYSLAIFCLLLTGCSFFSSQKEKENPALTEYEYTTDDMQSGEESATVLPPPPKRSA 
MALYMDSLGLVNVTDLDSSLVVKLMYTQADNFTGEVLYDNLTEAYLHPDAAYALIEAQKA LKKLHPSYSLIIYDAARPMSVQKKMWNVVKGTSKYKYVSNPNRGGGLHNYGLAVDISIQD SLGQPLPMGTQVDHLGVEAHITNENELVHNGKMSETERQNRILLRKVMKEAGFRALPSEW WHFNFCSRDVARQKYKLIP >gi|222159333|gb|ACAB01000026.1| GENE 16 23808 - 26711 2141 967 aa, chain - ## HITS:1 COG:no KEGG:BT_3006 NR:ns ## KEGG: BT_3006 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 965 4 965 967 1454 71.0 0 MTLEERLNEIVEKQQGDAIIPFLQGLTQEERKSLVPRLNKLEEYYSKFVQLNKNTYGTRG TSAQHRIINLTALVIYSLKEFRKHQWGINTEQLNELIPWYVPSWLDSFFKEGEGKEFGGF YGMDYEVLMDWIEQGILTVTPSPQTIAGYLVSYINTTAILQKREITLKEHIWYLFEYDCG QNWMDNRNGGHPYYPYKYLIENGKLDRMRVLRESLLAVNRNMNKNLCGWFAGMFTALTPS IEEQLTLQPEILAVLSAPHSRPVNIMLGLLKGLCNHPQFRIEEFLNQTSVLFASDVKAIH QNTLAVLNKLAKERKEFRDAICCVAAQGLMSREESTQSKIVKLILAYGETESATLREALA VYTETMLANTKKELKAYLENNESGHTSAHNEAAPAHSEQPDNDSASFVYEPILPIIREDN RIQEIASPEDLLFLASQVLDGNEIYHFDLLLGALVQWDRQQDTKQISQWAPILQRAYKLL MSGGSSRNGLLDQLMATFLLDYAKLLVKRFPKEGKELSDLHQKMVQKDELQKGKWNYRNL QKLTIRQKTNQKEKFPVHKQLLCRTLDLLESKEKPLPLLSTPTHTPMFIAPATLVERLKQ YQQANTEPDDMDMQIALSRVALDNSSQELPIILQDLKEEYQRLLSFLLGAEDVLPQAPFT HPSWWMTAGLIKSPETIYSEFKDFSYNKGPREFLTGNFTWRTYLRTHSYTDYNKKVVEWN SATLTFDIPESKNSHVANKNEYNEKISYHSYDSHPLVVEMYPLIERFDDIQNDLPRLAWL TPNMPEPLLVWCIRSAIYDPMLNEVREIGITKATIEALHQLRHTWHEASYILEASCMLVA DKTSRSYAAEIWIDRVSKGCIDSKRIGKILGSHQHTGWGPLKRLTGLIQQQMMNVSPLHN RELENLIVAMLAGLPEKPIKDLKKLLEIYAELLSINHSKAKDEQVLHLLNVWKGVANLKK AVANIQH >gi|222159333|gb|ACAB01000026.1| GENE 17 26725 - 29712 1872 995 aa, chain - ## HITS:1 COG:no KEGG:BT_3005 NR:ns ## KEGG: BT_3005 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 994 1 948 948 1376 69.0 0 MEELKKELEKLSKAYVDTPENEEKILIPFIKRLLELPMKDRRKLLPLIRELQWIKGRFAG FSSETTCSAARAHFLSAVQFVCANRREMDMAYHVKFDMLCKLLPLYNPTWMTDFINDDKT WFNFDLNYEELMQLMDMGYLKEIAPSRIAHVLPWITRIRNKNPKGDDTFNSELLLKRDIT LKEHIWTLFEHESIIGYQDDCAKNAYKKGITTRDESISAALYRFSLDGHLDREQLLRATL 
ATFHRSFKKDMAGWFARFFETLQPTAGELLSLQEEIMQTFTSSYTKPVNIMLQQLKSIAD EEGFRYQEFIERATTLFFSSPKNSLLTIYSIFEKIVAQHSEMKEPCCITLCQLFLKKDES LQKKAANFISKYGDASSSNLQETLQSYQPEMFQSVQAILASFKPQSIDSQSTEPHLAKEA NATDTGVTEDILHTEGKNTERNSTDENSTDNSLLSEEPSLEAIRICREDNRIPFPADKED FLFQLSRLFDMEENWEIETTIAAIIAFHPQLDKEDLNRMEPIFQRAANIVANSWEPYEDL LATFLLEYQRLWAQTDNSNTGVLRNMFTRLEEKLKGIDKNRGAYDERSFKRLADWKPGYS NATCFIPIQQLWLNVIRQIKGGNSLPLLSTPTHTPAYLQATKLIRRLAAYQEAGKKPCSW DFQLAIARCALEDKEEAIVTARQLLQDEYLHLCLFLLDENTQPEPPYNHPTAWIAAGLVK APETEFKAFKSFSCNTLPHNHLTGDYEWKEVKPKENSYETDRRLLQLDFHKWHKYIERNS HQLWQEHLIINSKYNMDDSRYMEPLLCCYPNRPEPLIAQIISNYMAFGSPQEDSKRTLAC ALRMLLSFHCPLKEMSLLLLSGSLLFVDKTVRSYAAELWVEGLSTGRINNHRVGEILARL INMELAPLKRFTTQVYESMYKRSAFHNRQLEELLTVFISGLPDKPVTGLKQLLELYLELL TINRSKVTNEQLLQRLQEWATNSNLKKVTTSLNKL >gi|222159333|gb|ACAB01000026.1| GENE 18 29700 - 31055 1055 451 aa, chain - ## HITS:1 COG:no KEGG:BT_3004 NR:ns ## KEGG: BT_3004 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 451 1 451 451 855 92.0 0 MTDTISYQYAAPSTLQRSADQDELFLAKYSEIEKREAPCFFWGKLTQPYMTARCLIALSN VVQSSFNLTPAQLTMLKDPIVTAGNNRLRFEGFSNCAGVYARVDVLPDGHNGEFLENGTT NVDFNPGMISALGGIGRQENVVMSVGPKEVGLYNKGEKVIERKVPLPVKWIKGLTTVQIY QSVAEKVYSFNRIQTLQLFQTLPKSSVKCDYFLVMRGQKPAFSPVKSMNGICIGGLHRLR LLEPLLPFADELKVFAHPTMQSTIWQLYFGPVRFSLSLSRECWRGFSGEGAALESLLEDV PERWIEAMDKYSYANQQFNPTLFAIEEHIDLDKVDSLSARLAAMGLLGFDLDENSFFYRR LPFKTERILSLNPRMIAAEKLLEEDKVEIIANDGNRTEARVAGSGGVRHTVILDRENEKE RCTCTWFSSNQGERGACKHILAVKKLAQWKN >gi|222159333|gb|ACAB01000026.1| GENE 19 31648 - 32151 488 167 aa, chain + ## HITS:1 COG:mll3697 KEGG:ns NR:ns ## COG: mll3697 COG1595 # Protein_GI_number: 13473184 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 3 164 5 161 183 96 36.0 2e-20 MKSLSFRKDLIGVQDELLRFAYKLTTDREEANDLLQETSLKALDNEDKYTPDTNFKGWMY TIMRNIFINNYRKVVRDQTFVDQTENLYHLNLPQESGFESTERAYDLKEMHRVVNALPKE YRVPFAMHVSGFKYREIAEKLNLPLGTVKSRIFFTRQKLQEELKDFR 
>gi|222159333|gb|ACAB01000026.1| GENE 20 32315 - 33091 741 258 aa, chain + ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 258 1 256 256 117 32.0 2e-26 MIKAIMLDVDGTLVSFETHKILQSSVEALKKIHDRGIRIVIATGRAAGDLHEIAAVPYDG IIALNGADCVLLDGTVIRRHLIPKDDFKKAMEITKAFDFAVAIELDEGVFVNRLTPTVER IAKIVEHPIPAVVDIEELFDRKECCQLCFYIDDEMEQQVMPLLPNLSLSRWHPLFADVNL AGISKATGLSAFADYYGIEMAEIMACGDGGNDIPMLKAAGIGVAMGNASETVKASADFVT DTVENDGLCKALKHFGII >gi|222159333|gb|ACAB01000026.1| GENE 21 33369 - 33680 268 103 aa, chain - ## HITS:1 COG:no KEGG:BT_2979 NR:ns ## KEGG: BT_2979 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 103 1 103 103 185 96.0 4e-46 MEFLNEYHLAGLFIGICTFLIIGLFHPVVVKAEYYWGTKCWWIFLVLGIAGVIASLSIDN VILSSLLGVFAFSSFWTIKEVFEQEERVQKGWFPKNPKRKYKF >gi|222159333|gb|ACAB01000026.1| GENE 22 33686 - 34150 451 154 aa, chain - ## HITS:1 COG:YPO0002 KEGG:ns NR:ns ## COG: YPO0002 COG1522 # Protein_GI_number: 16120355 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Yersinia pestis # 3 148 6 151 153 115 40.0 4e-26 MERIDNLDRQILEIISQNARIPFKDVAAECGVSRAAIHQRVQRLIDLGVIVGSGYHVNPK SLGYRTCTYVGIKLEKGSMYKSVVAELQKIPEIVECHFTTGPYTMLTKLYACDNEHLMDL LNNKMQEIPGVVATETLISLEQSIKKEIPIRVEK >gi|222159333|gb|ACAB01000026.1| GENE 23 34311 - 35189 1004 292 aa, chain - ## HITS:1 COG:STM4397 KEGG:ns NR:ns ## COG: STM4397 COG0545 # Protein_GI_number: 16767643 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Salmonella typhimurium LT2 # 76 287 22 219 220 177 48.0 2e-44 MKKVSIFMAIAAAASLASCTAQAPKANLKSDIDSLSYSIGMAQTQGLKGYLTGRLDVDTA YMADFIKGLNEGANKTSKKDIAYMAGLQIGQQISNQMMKGINQELFGTDSTKTISKENFM AGFIAGTLEKGGVMTMEAAQEYTRTAMETIKAKALAEKYADYKAENEKFLAENKTKDGVK 
TTASGLQYKVITEGKGEIPADTCKVKVNYKGTLIDGTEFDSSYKRNEPSTFRANQVIKGW TEALTMMPVGSKWELYIPQELAYGARESGNQIKPFSTLIFEVELLSIEKDKK >gi|222159333|gb|ACAB01000026.1| GENE 24 35210 - 35794 719 194 aa, chain - ## HITS:1 COG:ECs5185 KEGG:ns NR:ns ## COG: ECs5185 COG0545 # Protein_GI_number: 15834439 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Escherichia coli O157:H7 # 5 194 67 259 259 188 53.0 5e-48 MDKFSYSIGLGIGQNLSSMGIGNLAVDDFAQAIKDVLEGNQTAISHNEAREIVNKYFEEL ETKMGAVAIEQGQAFLEENKKGPGVVVLPSGLQYEIIKEGTGKKPKATDQVRCHYEGTLI DGTLFDSSIQRGEPAVFGVNQVIPGWVEALQLMPEGSKWKLYIPSELGYGARGAGEMIPP HSTLIFEVELLEVL >gi|222159333|gb|ACAB01000026.1| GENE 25 36268 - 36966 565 232 aa, chain + ## HITS:1 COG:jhp1180 KEGG:ns NR:ns ## COG: jhp1180 COG0846 # Protein_GI_number: 15612245 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Helicobacter pylori J99 # 1 226 1 220 234 232 49.0 5e-61 MKNLVVLTGAGMSAESGISTFRDAGGLWDRYPVEQVATPEGYQRDPALVINFYNERRKQL LEVKPNRGHELLAELEKSFHVTVVTQNIDNLHERAGSSHIIHLHGELTKVCSSRDPHNPH YIKELKPEEYEVKMGDKAGDGTQLRPFIVWFGEAVPEIETAIQYVEKADIFVIIGTSLNV YPAAGLLHYVPRGAEVYLIDPKPVDTHTSRSIHVIQKGASEGVAELKQLLGV >gi|222159333|gb|ACAB01000026.1| GENE 26 36987 - 37211 222 74 aa, chain - ## HITS:1 COG:no KEGG:BT_2974 NR:ns ## KEGG: BT_2974 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 74 2 72 72 106 81.0 3e-22 MEKENKILIFRTSITQRRDIKRIGELLAEFPQIDKWNVDFEDWEKILRIECRDISALEIS EVLRNNHIFATELE >gi|222159333|gb|ACAB01000026.1| GENE 27 37225 - 37575 123 116 aa, chain - ## HITS:1 COG:BH3506_1 KEGG:ns NR:ns ## COG: BH3506_1 COG2207 # Protein_GI_number: 15616068 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 10 116 2 107 130 73 36.0 7e-14 METKDINKQEYQLRINKVTDYIHNHIDQPLSLQKMAGIACFSPFHFHRVFTILTGETPTD 
YIKRTRIEKAALLLKRNKELSATEIARLCGFSSLSLLSRNFRLHFSMTIREFRSLK >gi|222159333|gb|ACAB01000026.1| GENE 28 37646 - 38422 715 258 aa, chain - ## HITS:1 COG:slr1117 KEGG:ns NR:ns ## COG: slr1117 COG0500 # Protein_GI_number: 16329224 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Synechocystis # 25 257 19 253 253 194 41.0 1e-49 MSNELKTIHEFDFTLICNYFKGLKRQGPGSPEVTQKALSFTNELSDNARIADIGCGTGGQ TMALANYTKGQITGIDLFPDFIELFNKNAIEAHCEDRVKGIVGSMDALPFQEEELDLIWS EGAIYNIGFERGMNEWNKFLKKNGFIAVTEASWFTPERPSEIEDFWMANYPEIDTIPRKI MQMEKAGYIPTAHFILPENCWLEHFYAPQFPVQEAFLKEYAGNDAAADLITGQRYEESLY NKYKEYYGYVFYIGQKKD >gi|222159333|gb|ACAB01000026.1| GENE 29 38865 - 39722 789 285 aa, chain + ## HITS:1 COG:no KEGG:BT_2961 NR:ns ## KEGG: BT_2961 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 285 1 285 285 491 76.0 1e-137 MKSKLLLACTILFFCSSFLCGQNQTSKVANSAETNNGAVRHPWQGKRVGYLGDSITDPNC YGDKIKKYWDFLQEWLGITPYVYGISGRQWNDVPRQAEKLKQEHGGEVDAILILMGTNDF NDGVPIGEWFTETEEQVMAARGQVKKLETRKKRTPIMDGNTYKGRINIGIHRLKQLFPDK QIVLLTPLHRSLADFGEKNVQPDENYQNSCGEYVDAYVQAVKEAGNVWGVPVIDFNAVTG LNPMVEEQLIYFYDAGYDRLHPSTKGQIRMARTLMYQLLALPATF >gi|222159333|gb|ACAB01000026.1| GENE 30 39735 - 40484 696 249 aa, chain - ## HITS:1 COG:jhp0094 KEGG:ns NR:ns ## COG: jhp0094 COG0463 # Protein_GI_number: 15611164 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Helicobacter pylori J99 # 9 187 3 182 260 140 40.0 2e-33 MHSVHPTPKFSIITVTYNAEKVLEDTIQSVISQTYHHIEYIIVDGASKDGTLSIINRYRS RIHTVVSEPDKGLYDAMNKGIALASGDYLCFLNAGDCFHEDDTLQQMVHTINGSELPDVL YGETAIVDQDRHFLRMRRLSAPETLTWKSFKQGMLVCHQAFFPRHTLVEPYNLKYRFSAD FDWCIRIMKKARTLHNTHLTIIDYLDEGMTTRNQKASLKERFRIMAKHYGLIGTVAHHIW FVIRAVIHR >gi|222159333|gb|ACAB01000026.1| GENE 31 40477 - 41742 910 421 aa, chain - ## HITS:1 COG:all4426 KEGG:ns NR:ns ## COG: all4426 
COG0438 # Protein_GI_number: 17231918 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Nostoc sp. PCC 7120 # 1 416 1 412 417 237 33.0 4e-62 MRVLIINTSERIGGAAIAASRLMESLKNNGIKAKMLVRDKQTDQISVVRLKSNWLQVWKF MWERIVIWSANRFRRYHLFDVDIANTGTDITSLPEFRQADVIHLHWINQGMLSLNDIRKI LTSGKPVVWTMHDMWPCTGICHYARECNNYQQECHDCPYIYKGGGRKDLSYRTFRKKQKL YSYAPIHFVTCSHWLKEQAQTSALFEGKSVTNIPNAINTNLFKPMNKKEARAKFMLPEGK KLVLFGSLKITDKRKGVDYLIGACKLLAEKHPEWKDSLGVVVFGNQSQQLQEQLPFHVYP LPYIKNEHEVVNIYNAVDLFAIPSLEENLPNMIMEAMACGVPCVGFNVGGIPEMIDHLHN GYVAQYKSSEDFANGIHWILTEPEYNELSAQACRKVLGNYSESIVAKKYTDVYNKITGKY A >gi|222159333|gb|ACAB01000026.1| GENE 32 41753 - 42871 1011 372 aa, chain - ## HITS:1 COG:no KEGG:BVU_0890 NR:ns ## KEGG: BVU_0890 # Name: not_defined # Def: putative glycosyltransferase # Organism: B.vulgatus # Pathway: not_defined # 1 372 1 371 371 608 78.0 1e-172 MAEKKQAPLNKLLTIYFYHTRLTRESYEEWKEYKFPGHILYGLPLLENYGIHSVMHKCKY FSGRLKLMLYATKEILFCKEKYDVLYATSFRGIEPVIFLRALGLYRKPIVIWHHTAVVTN PKPWREQISRLFYKGIDQMFLFSRKLIQDSQKTRKAPSHKLKLIHWGPDLPFYDHLLAEM PDRKPEGFISTGKENRDVDTLLQAFAATNEKLDLYIAVSCGNINYKKIIDPYALPDSIRI HYTDGVIPYELGKLVARKSCIVICCLDFPYTVGLTTLVEAFALGIPVICSRNPNFEIDID KEEIGITVEYNDVQGWIDAIRYIADHPEEARRMGENARKLAEERFNLEIFSREIAESLLE ISNISSKNRTFA >gi|222159333|gb|ACAB01000026.1| GENE 33 42844 - 43866 921 340 aa, chain - ## HITS:1 COG:no KEGG:BVU_0889 NR:ns ## KEGG: BVU_0889 # Name: not_defined # Def: hemolysin hemolytic protein # Organism: B.vulgatus # Pathway: not_defined # 3 339 1 337 338 611 85.0 1e-173 MNMKSALVDVPVLILFFNRPQQLSQVFEQVKKARPSRLFLYQDGARNERDLPGIKACREI VSQIDWECEVERLYQEKNFGCDPSEYISQKWAFSKVDKCIVLEDDDVPAVSFFQFCKEML DKYEYDTRISMIAGFNPEEITQDMPYDYFFTTTFSIWGWASWKRVVDQWDEFYNFLDDSF NMQQLEQLIKERKFRSDFIYMCQRHREQQKAFYETIFHASILFNSGLSIVPTRNMINNLG ATADSTHFAGSIHTLPKGYRRIFTMKRYEVDFPLKHPRYVIENVAYKESVYRIMGWDHPW IKIGRSFEELFLNLKYGNFSIITKAVKNRINKWLKRNKHH >gi|222159333|gb|ACAB01000026.1| GENE 34 43863 - 44726 692 287 aa, chain - ## HITS:1 COG:SP1274 KEGG:ns NR:ns ## COG: 
SP1274 COG3475 # Protein_GI_number: 15901134 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: LPS biosynthesis protein # Organism: Streptococcus pneumoniae TIGR4 # 8 133 7 128 269 83 34.0 5e-16 MNKKYTAEELDLLHTELYDILGETIRVCQKHNIPYFVIGGTAIGALYDQAILPWDDDIDI GMTRENYNKFLKVAPGELGPSYFLSWIETDPHTPYYFAKVKKNDTLFVEEMFKNVPMHPG IFVDIFPFDKIPDNKLLRRIQSEALGFLKCCLMGKEIWMWKHFGTCEIENPTNRGAFSCF LNRVVDLLFSKKAIYRMLVSVQSCFNSRNTRYYNNVMATADHVTVESIRHLQPVKFGPLT VTAPDDLEGFLRYNYPTLHRFTEEEQEKVNNHYPAALSFSTTPKQEL >gi|222159333|gb|ACAB01000026.1| GENE 35 44782 - 45834 680 350 aa, chain - ## HITS:1 COG:no KEGG:BDI_2786 NR:ns ## KEGG: BDI_2786 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 307 1 313 353 288 44.0 3e-76 MNPNDLATKYRLLNSSFKKTMIYHIGIDAGFFTEYTYMLHAMLYCLQHKIQFKLYSDDAN FGWEKGWEDCFAPFCEQVHEPFHHTYNTHRLPSWQALMQDKKLSKTKLLKWKLKVTCKNI IGKALAFFTYGKPIRLNFQVTFNPNQHFHIPELGIDGDYLHTFQKLTEITWKLNDTTAQE CRQFATTLQLPPQYAGCQIRGGDKITETNLLPPEHYIRLIKEKTAIRDVFVLTDDYRLFE QVQTLAPDIHWYTLCSPNEQGYVNSAFTQTAKELKQKQMARFLCSIQLLMDASVFIGSIT TGPSLFLLKKFYPEINPADCLLKDFPQASVLPIPGRGRVATEFMQGNLKL >gi|222159333|gb|ACAB01000026.1| GENE 36 45824 - 47278 973 484 aa, chain - ## HITS:1 COG:mll5270 KEGG:ns NR:ns ## COG: mll5270 COG2244 # Protein_GI_number: 13474395 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Mesorhizobium loti # 6 481 75 553 561 160 26.0 4e-39 MSENQSLKEKTAKGLFWGGFSNGIQQLLNLLFGIFLARLLTPADYGMVGMLAIFSLIASS IQESGFTAALVNKKEVTHNDYNAVFWFNAAISLSLYLLLFLCAPLIADFYHTPELTPLAR YSFIGFFIASLGISHSAYLLRNLMVKQRALSSVIGLTVSGITGVTLAYFGFSYWGIATQS IVYVAVNTACYWHFTRWRPSLQFNFTPIKEMFGFSGKLLVTNVFNHINNNLFSVILGKFY TEKEVGYYNQANKWCGMGQLFISGMINGVAQPVLTKVSDDLERQKRVFRKMLRFTAFVSF PAMLGLGIVSEELITITITDKWYSSIPIMQILCISGAFTPIAYLYQQLIISKGKSRIYMW NTIALGIILLSGVLLVHSHGIYAMLAVYVSTNILWLLTWHYFVWQEIGLKLRHALIDILP YAVIATTVMVITYYSTRSIENIYLRLASKIVLAAALYVAAMWGSRSVTFKESIQYFIKKK PHEP 
>gi|222159333|gb|ACAB01000026.1| GENE 37 47351 - 49390 2354 679 aa, chain - ## HITS:1 COG:PAB2364_1 KEGG:ns NR:ns ## COG: PAB2364_1 COG0143 # Protein_GI_number: 14521189 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA synthetase # Organism: Pyrococcus abyssi # 7 546 3 553 562 558 49.0 1e-158 MEKKFKRTTVTSALPYANGPVHIGHLAGVYVPADIYVRYLRLKKEDVLFIGGSDEHGVPI TIRAKKEGITPQDVVDRYHSLIKKSFEEFGISFDVYSRTTSPTHHQLASDFFKTLYDKGE FIEKTSEQYYDEEAKTFLADRYITGECPHCHSEGAYGDQCEKCGTSLSPTDLINPKSAIS GSKPVMKETKHWYLPLDKHEGWLRKWILEDHKEWRPNVYGQCKSWLDMGLQPRAVSRDLD WGIPVPVEGAEGKVLYVWFDAPIGYISNTKELLPDSWETWWKDPETRLIHFIGKDNIVFH CIVFPAMLKAEGSYILPDNVPSNEFLNLEGDKISTSRNWAVWLHEYLADFPGKQDVLRYV LTANAPETKDNDFTWKDFQARNNNELVAVYGNFVNRAMVLTQKYFDGRVPAQGELTDYDK ETLKEFADVKAEVEKLLDVFKFRDAQKEAMNLARIGNKYLADTEPWKLAKTDMERVGTIL NISLQLVANLAIAFDPFLPFSSEKLRKMLNMNTFEWSELGRDNLLPVGHQLNKPELLFEK IEDATIEAQVQKLLDTKKANEEANYKANPIRPNIEFDDFTKLDIRVGTILECQKVPKADK LLQFKIDDGLETRTIVSGIAKHYQPEELVGKQVCFIANLAPRKLKGIVSEGMILSAENND GSLAVIMPGREVKPGSEVK >gi|222159333|gb|ACAB01000026.1| GENE 38 50326 - 52386 1685 686 aa, chain + ## HITS:1 COG:AF1211 KEGG:ns NR:ns ## COG: AF1211 COG1042 # Protein_GI_number: 11498810 # Func_class: C Energy production and conversion # Function: Acyl-CoA synthetase (NDP forming) # Organism: Archaeoglobus fulgidus # 5 682 3 679 685 322 33.0 2e-87 MITTQLLRPESIVVVGASNNVHKPGGAILKNLINGGYRGELRAVNPKEKEVQGVPAFADV NDLPDTDLAVLAVPALMCPDMVETLASKKQTRAFIILSAGFGEETHEGALLEECILETVN KYGASLIGPNCIGLMNTWHHSVFSQPIPNLNPKGVDLISSSGATAVFILESAVTKGLQFN SVWSVGNAKQIGVEDVLQFMDENFDPETDSHLKLLYIESIQNPDRLLFHASSLIRKGCKI AAIKAGSSESGSRAASSHTGAIASSDSAVEALFRKAGIVRCFSREELTTVGCVFTLPELK GKNFAIITHAGGPGVMLTDALSKGGLNVPKLEGEVAEELKAQLFPGAAVGNPIDILATGT PEHLRLCIDYCEEKLDNIDAMMAIFGTPGLVTMFEMYDVLHEKMQTCKKPIFPILPSINT AGAEVSAFLAKGHVNFADEVTLGTALSRIVNAPKPAVPEIELFGVDVPRIRRIIDSIPED GYIAPNYVQALLHAAGIPLVDEFVSDNKEEIVAFARRCGFPVVAKVVGPVHKSDVGGVVL NIKSEQHLALEFDRMMQILDARAIMVQPMLKGTELFVGAKYEEKFGHVVLCGLGGIFVEV 
LKDVSSGLAPLSYEEAYSMIHSLRAYKIIQGTRGQKGVNEDKFAEIIVRLSTLLRFATEI KEMDINPLLATEKAVVAVDARIRIEK >gi|222159333|gb|ACAB01000026.1| GENE 39 52626 - 52907 89 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295085629|emb|CBK67152.1| ## NR: gi|295085629|emb|CBK67152.1| Glycosyl hydrolases family 43. [Bacteroides xylanisolvens XB1A] # 6 93 21 108 108 183 100.0 3e-45 MPVFHGGGKAQTKSAVGYKNLIVDIARPDSTVSKAADGCFYVYVTEFTRNVPIMKSKDLV EWMYCNTAFPIRLVLGLSLKMEFGHPILKLYGR >gi|222159333|gb|ACAB01000026.1| GENE 40 53029 - 54324 1038 431 aa, chain + ## HITS:1 COG:BH1248 KEGG:ns NR:ns ## COG: BH1248 COG0673 # Protein_GI_number: 15613811 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus halodurans # 45 282 5 236 340 92 32.0 1e-18 MKHTPISRRDFLKNLGIAGAGTLLAASPWLSAFSEVMNTSNEKCRLAIIGPGSRGRFLMG FLAKNPKVDIVALCDIYKPSIENALKLAPNAKVYGDYREVLEDKSIDAILVATSLSSHCK IVLDAFDAGKHVFCEKSIGFTMEECYRMYQKHRSTGKIFFTGQQRLFDPRYIKAMEMIHA GTFGEINAIRTFWNRNGDWRRSVPSPNLERLINWRLYKEFSKGLMTELACHQLQIGSWAL RKIPEKVMGHGAITYWKDGRDVYDNVSCVYVFDDGVKMTFDSVISNKFYGLEEQIMGNLG TVEPEKGKYYFESVAPAPAFLQMVNDWENKVFDSLPFAGTSWAPETANENKGEFIIGERP KSDGTSLLLEAFVEAVITQKQPKNIAEEGYYASMLCLLGHQALEEERTLYFPDEYKIDYL NHQSVKTSEAI >gi|222159333|gb|ACAB01000026.1| GENE 41 54321 - 54566 133 81 aa, chain + ## HITS:1 COG:no KEGG:BT_3471 NR:ns ## KEGG: BT_3471 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 78 1 78 80 82 64.0 6e-15 MKDTIVTARRKKIELITLLVCFVVSNLLHLYAIIAYHAPFTEMITSIFYIIIFTFVLYAF WGILRLLFYGMQALFKKKSRS >gi|222159333|gb|ACAB01000026.1| GENE 42 54627 - 55428 839 267 aa, chain + ## HITS:1 COG:STM4425 KEGG:ns NR:ns ## COG: STM4425 COG0673 # Protein_GI_number: 16767671 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Salmonella typhimurium LT2 # 41 239 3 189 336 66 27.0 5e-11 MVTRRDFLKTMSMASAGLALGTGDLLHAQTASPKKGRGDKVKIAYIGIGNRGEQIIEDFA 
RTGMVEVVALCDVDMGAKHTQKIMAKYPKAKQFRDFRQMFDKAGNEFDAVAIATPDHSHF PISMLALASGKHVYVEKPLARTFYEAELLMQAALKRPNLVTQVGNQGHSEANYFQFKAWM DAGIIKDVTAITAHMNNPRRWHKWDTNIYKLPSGQQLPKDLDWDTWLGVTPYHEYNKDYH LGQWRCWYDFGMGALGDWGAHILDTAH Prediction of potential genes in microbial genomes Time: Wed May 18 01:26:54 2011 Seq name: gi|222159332|gb|ACAB01000027.1| Bacteroides sp. D1 cont1.27, whole genome shotgun sequence Length of sequence - 61156 bp Number of predicted genes - 37, with homology - 37 Number of transcription units - 16, operones - 11 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 617 491 ## BT_3470 putative dehydrogenase 2 1 Op 2 . + CDS 677 - 2032 1294 ## BT_3469 hypothetical protein + Term 2110 - 2152 2.5 + Prom 2072 - 2131 3.5 3 2 Tu 1 . + CDS 2179 - 2997 487 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis + Prom 3344 - 3403 3.1 4 3 Op 1 . + CDS 3498 - 7526 2405 ## COG3292 Predicted periplasmic ligand-binding sensor domain + Prom 7530 - 7589 14.2 5 3 Op 2 . + CDS 7620 - 9629 1113 ## Phep_1705 hypothetical protein + Term 9655 - 9702 12.2 + Prom 9656 - 9715 5.5 6 4 Tu 1 . + CDS 9817 - 11826 1413 ## COG1554 Trehalose and maltose hydrolases (possible phosphorylases) + Prom 11887 - 11946 4.8 7 5 Tu 1 . + CDS 12067 - 14097 1116 ## BT_2899 hypothetical protein + Term 14204 - 14245 -0.9 + Prom 14131 - 14190 10.3 8 6 Op 1 . + CDS 14263 - 17367 2472 ## PRU_2188 hypothetical protein 9 6 Op 2 . + CDS 17385 - 19277 1294 ## PRU_2187 putative lipoprotein 10 6 Op 3 . + CDS 19297 - 20442 791 ## PRU_2186 putative lipoprotein 11 6 Op 4 . + CDS 20476 - 20979 443 ## gi|237717189|ref|ZP_04547670.1| predicted protein 12 6 Op 5 . + CDS 21011 - 22075 978 ## COG3507 Beta-xylosidase + Term 22082 - 22132 6.1 + Prom 22079 - 22138 2.5 13 7 Op 1 . + CDS 22161 - 24707 1997 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases + Term 24722 - 24760 4.5 + Prom 24709 - 24768 9.2 14 7 Op 2 . 
+ CDS 24802 - 24975 99 ## gi|295085613|emb|CBK67136.1| hypothetical protein + Term 25007 - 25050 1.2 + Prom 25057 - 25116 11.4 15 8 Op 1 . + CDS 25136 - 26566 1011 ## BT_3298 hypothetical protein 16 8 Op 2 . + CDS 26596 - 29652 2261 ## BT_3297 hypothetical protein 17 8 Op 3 . + CDS 29661 - 31673 1407 ## BT_3296 putative outer membrane protein 18 8 Op 4 . + CDS 31693 - 32679 992 ## BT_3295 hypothetical protein + Term 32787 - 32845 11.1 - Term 32985 - 33052 -1.0 19 9 Tu 1 . - CDS 33195 - 34748 1382 ## COG0642 Signal transduction histidine kinase - Prom 34768 - 34827 9.7 20 10 Op 1 . - CDS 34846 - 36384 1083 ## COG0606 Predicted ATPase with chaperone activity 21 10 Op 2 . - CDS 36398 - 37468 766 ## BT_2845 hypothetical protein - Prom 37511 - 37570 6.9 - Term 37507 - 37551 12.5 22 11 Op 1 . - CDS 37580 - 39274 1729 ## BT_2844 hypothetical protein - Prom 39338 - 39397 4.8 23 11 Op 2 . - CDS 39410 - 40369 657 ## COG4974 Site-specific recombinase XerD - Prom 40401 - 40460 4.9 + Prom 40333 - 40392 3.3 24 12 Op 1 . + CDS 40432 - 40851 249 ## COG0757 3-dehydroquinate dehydratase II 25 12 Op 2 . + CDS 40887 - 42344 1552 ## COG0469 Pyruvate kinase 26 12 Op 3 . + CDS 42365 - 43021 634 ## COG4122 Predicted O-methyltransferase 27 12 Op 4 . + CDS 43038 - 43370 312 ## COG0858 Ribosome-binding factor A + Term 43414 - 43458 5.2 + Prom 43372 - 43431 6.8 28 13 Tu 1 . + CDS 43541 - 44614 745 ## COG4591 ABC-type transport system, involved in lipoprotein release, permease component + Prom 44629 - 44688 6.3 29 14 Op 1 6/0.000 + CDS 44760 - 45311 265 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Term 45391 - 45437 -0.1 + Prom 45359 - 45418 8.3 30 14 Op 2 . + CDS 45468 - 46583 776 ## COG3712 Fe2+-dicitrate sensor, membrane component + Prom 46585 - 46644 8.3 31 15 Op 1 . + CDS 46735 - 50124 2387 ## ZPR_4655 TonB-dependent receptor Plug domain protein 32 15 Op 2 . + CDS 50136 - 52082 1444 ## ZPR_4656 RagB/SusD family protein 33 15 Op 3 . 
+ CDS 52115 - 53755 1207 ## gi|237717210|ref|ZP_04547691.1| conserved hypothetical protein + Term 53778 - 53842 -0.0 34 15 Op 4 . + CDS 53854 - 55416 1148 ## BVU_3139 hypothetical protein 35 15 Op 5 . + CDS 55428 - 57077 1289 ## BVU_3139 hypothetical protein + Term 57145 - 57188 5.1 + Prom 57222 - 57281 4.2 36 16 Op 1 1/0.000 + CDS 57301 - 59508 1912 ## COG1472 Beta-glucosidase-related glycosidases + Term 59526 - 59577 7.0 37 16 Op 2 . + CDS 59588 - 61154 1348 ## COG1472 Beta-glucosidase-related glycosidases Predicted protein(s) >gi|222159332|gb|ACAB01000027.1| GENE 1 3 - 617 491 204 aa, chain + ## HITS:1 COG:no KEGG:BT_3470 NR:ns ## KEGG: BT_3470 # Name: not_defined # Def: putative dehydrogenase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 204 257 460 460 402 92.0 1e-111 FLELGLPYEVTMQYAKGHNDYFFPYSSTILFRFPQRKGMPPVDITWYDGLDNLPPIPAGY GVSGLDPNIPVTNQGDTPKSKLNPGKIIYTKDLIFKGGSHGSTLSIIPEEKAKEMADKLP KVPKSPSNHFENFLLACNGIEKTRSPFEINGVLSQVFSLGVMAQRLNTQLFFDPRTKQIT NNEFANAMLAGLPPRKGWDEFCKL >gi|222159332|gb|ACAB01000027.1| GENE 2 677 - 2032 1294 451 aa, chain + ## HITS:1 COG:no KEGG:BT_3469 NR:ns ## KEGG: BT_3469 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 451 1 449 449 818 90.0 0 MIMKKNIVLTLLLFCTASLGAQNWEPLFNEKNLKGWKKLNGKAEYKIVDGAIVGVSKMGT PNTFLATTKNYGDFILEFDFKVDDGLNSGVQLRSESKKDYKKGRVHGYQFEIDPSKRAWS GGIYDEARRNWLYPLTLNPSAKTAFKNNAWNKARIEAVGNSIRTWINGVPCANIWDDMTP VGFIALQVHAIGNAADEGKTVSWKDIRICTTDVERYQTPEAQAAPEVNLIANTISPNEAK EGWTLLWDGKTTDGWRGAKLSTFPAKGWKIEDGILKVMKSGGAESANGGDIVTTRKYKNF ILKVDFKITEGANSGIKYFVNPDMNKGAGSAIGCEFQILDDDKHPDAKLGVKGNRKLGSL YDLIPAPKNKPFNKKEFNTATIIVKGNHVEHWLNGVKLIEYDRNNDMWNALVAYSKYKNW PNFGNPEEGNILLQDHGDEVWFKNVKIKELK >gi|222159332|gb|ACAB01000027.1| GENE 3 2179 - 2997 487 272 aa, chain + ## HITS:1 COG:CAC2766 KEGG:ns NR:ns ## COG: CAC2766 COG1477 # Protein_GI_number: 15896021 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine 
biosynthesis # Organism: Clostridium acetobutylicum # 42 267 28 291 319 79 25.0 1e-14 MQVPIQHMYKPSHEGSGLLYAWFLSMHTRVDIILYSKKTEGELLFVVNCIYDALCRLEKM ANFYNPASELAYVNRTAFVSPVVLSEELYSMIDLCLEYNGKTLGCFDITVHSENYNQNTI QSVHLSAEDHSIYFSRPGVAVNLSGFLKGYALETIKSILNECVIENALINMGNSSVLALG NHPVGTGWKVNNILLHNECLTTSGNDSPERRHIVSPRDGKLVEGARQISVVTTNGAIGEI LSTALFAADSELRKDLLAEFSSVLSQFFLIGY >gi|222159332|gb|ACAB01000027.1| GENE 4 3498 - 7526 2405 1342 aa, chain + ## HITS:1 COG:XF1330_1 KEGG:ns NR:ns ## COG: XF1330_1 COG3292 # Protein_GI_number: 15837931 # Func_class: T Signal transduction mechanisms # Function: Predicted periplasmic ligand-binding sensor domain # Organism: Xylella fastidiosa 9a5c # 29 766 28 739 740 130 21.0 1e-29 MKKVLILLICLIGYQISCHAQMADEHYYFKNLSVQNGLSQNTVNAILQDRQGFMWFGTKD GLNRYDGLSFRKFKHDDRTRRSIGNNFITALYEDTKGDIWVGTDVGLYIYDPEKDSFRHF TELSEENTKIEHTVTAISGDNEGCVWVAVESQGLFCYNLEKGELQNYTLQNFSFLTTNVE TFAFDNSGTLWIGCYGDGLFYSKDRLQTLHPYLSPLDNKEFYANDVVTCIVKGAYNCLYV GSLKGGVKELNLTSNKLHDLLSEDESGEPVFCRELLVASDNELWIGAESGLYIYNFRLSK YAHLRSSINDPYSLSDNAIYSLCKDREGGIWIGSYFGGINYYPRFYTNFEKYYPKGTDKG LHGKRVREFCQDNQGILWIGTEDGGLNRFNPKTKTFSFFTPSNAFTNVHGLCLVDDNLWV GTFSKGVKVVDTRTGAIVKTYQKTDSPRSLIDNSVFSICRTTTGDIYLGTLFGLLRYNRQ SDDFDRIPELNGRFVYDIKEDFGGNLWLATYANGAYCYNVSEKKWKNYLHNENDPKSLPY DKVVSIFEDSHRQVWLTTQGGGFCLFHPETETFTSYNLADGLPNDVVYQIVEDKDGFFWL TTNNGLVCFQPTTGTMKVYTTSNGLLGDQFNYRSSLETEDGTIYLGSIDGFIAFNPKTFS ENRSLPSIVITDFLLFGKEVYVGEPGSPLKKSITFSDELILQSNQNSFSFRVTALDFQAP RMSKIMYKLDGFDADWLTIGESPIVTYSNLRYGNYTFKVKVSNSDGVWNENEISLKVHIL PPFYFSAWAYFVYALLIIGCSLYIIMYFKRRSNNKHRRQMEKFEQEKEREVYHAKIDFFT NVAHEIRTPLTLIKGPLENIILKKQVDAETREDLNVMKQNTERLLNLTNQLLDFRKTESQ GFRLNFAKCNITEVLKETHVRFTSLAKQKGLDFTLQVPEKDFYAHVNQEAFTKIISNLLN NGMKYAESYVHVMLEAPQRDDNNLFRIRTVNDGVIIPDAMKEEIFKPFVRFNEQEDGKVT TGTGIGLALSRSLAEFHLGTLVMEAGEESNIFCLTLPVVQDTTITLTPEAEVEIERVNEI PAEQAGKKDNRPAVLVVEDNPDMLTFVVRQLSRDYTVLTATNGAEALQVLDGNYVNLVIS DVVMPVMDGFELCKTIKSDLNYSHIPVILLTAKTNIQSKIEGMELGADAYIEKPFSVEYL QACASNLIQNREKLRQAFAESPFIAANTMALTKADEEFIKRLNEVIQINYGNPEFSMDDM 
ADKLNMSRSNFYRKIKGVLDLSPNEFLRLERLKKAAQLLKEGENRVNEICYMVGFNSPSY FAKCFQKQFGVLPKDFVNSKEE >gi|222159332|gb|ACAB01000027.1| GENE 5 7620 - 9629 1113 669 aa, chain + ## HITS:1 COG:no KEGG:Phep_1705 NR:ns ## KEGG: Phep_1705 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 29 655 55 693 706 612 47.0 1e-173 MMTRQKTVWLALLLIGLYTDFLNASEIMRWMIASSDSICWKVNGAHNDHIEMSGLKVSTV LRYGVNEAGEWVIDRNMVLPTFRTIPNDTHGSLQHHFNGDWAHLCLVNGQPLVGEKVETV SLNGIMTVKSIYAARGISLTRTLFPSTSQPAFCEKYELENTTDHPQTVQFPSTTLSYSTD ETKGVEGSYTLTATLSSPVKNGTYLLKAGEKAWFQVVYAGYKKHDQELALDVNNELLARR RFLRRIQGNLVLETPNDVINTMFSFAKIRGSESIFDTKGGYMQSPGGEAYYAAVWANDQA EYINPFFPYLGYQAGNRSAMDSFGLFMRYMNDEYKPLPSSIIAEGIDCFGVAGDRGDVAM VGYGAARYALASGNRNEAEKLWPLIEWCLEYCHRQLNVDGVVKSDSDELENRFPAGKANL CTSSLYYDALISAAYLGKALKKEHKQIKSYQDRAAELYKNIDVYFARNVEGYDTYRYYDG NTVLRSWICIPLTMGIYEQAKGTVEALFSDKLWMENGLLTQSGTSTYWDRSTLYAFRGAY SCGARDIATEYLDKYSATRLLGDHVPYAVEAWPEGNQRHLSTESALYCRIMTEGLFGIRP IALDAFVMTPQLPESWNMMSLKRVCAFGSIFDIEVKRDKDNKLSVLIKKDHKIVKRSKIM NGKPIEVRI >gi|222159332|gb|ACAB01000027.1| GENE 6 9817 - 11826 1413 669 aa, chain + ## HITS:1 COG:PH0746 KEGG:ns NR:ns ## COG: PH0746 COG1554 # Protein_GI_number: 14590619 # Func_class: G Carbohydrate transport and metabolism # Function: Trehalose and maltose hydrolases (possible phosphorylases) # Organism: Pyrococcus horikoshii # 219 656 210 692 737 199 30.0 2e-50 MRSLLIVFYCLVSALTIKANGQDRLWKIQTTDYHGTYYGATVANGGIGILPWKEPFSIRH VMLNHVFDSATPQDVSRVLRGINPFNLQMQINGQTVNGDNISGWEQCIDMKEATHNTHFI CDGKADVSYSICALRNLPYAGLVRVEVTALGDMYLTVSNPIEVPDEYKNIGSKLVNVNVN GNDIKIVRTWALSKHREQKVSASSAFIYDKNSNVQQQYNGNGQISIPLKKGEKFFFTLLG AVCTGRDFIDPYNEADREVIYGAKEGLARLMDGHQKLWDEMWKGDIVIEGDDEAQRAVRF ALYNLYSNARKGSRLSIPPFGLSSQGYNGHIFWDTEFWMYPPMLFMNQGIARTMMDYRTD RLEAACQKALSYGYDGAMFPWESDDAGEEACPTWALTGPFEHHITADIAIAAWNYYCMSG DQRWLREEGFPLMEKVAEFWVSRVEKNDDGSYSIRNVVCADEYAEGDDNAFTNGTVIRAL QDATKAAFICGVKAPVIWKEIAAKLRIPKFENGVTMEYDGYSGQVIKQADANLLVYPLNL ITRPEEIRKDLEYYEGKIDKGGPAMSFSVLALQYARLGEGDKAYELFVQSFRPNQLPPFG 
VLSEGAGGTNPYFSTGAGGLLQTVINGFCGLEITDNGIKQLPSKLPEHWRRLVVKGVGPD GKTYVRLQK >gi|222159332|gb|ACAB01000027.1| GENE 7 12067 - 14097 1116 676 aa, chain + ## HITS:1 COG:no KEGG:BT_2899 NR:ns ## KEGG: BT_2899 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 675 1 675 677 1136 78.0 0 MNLKCLFCGILVGLYVEAIEAVNYTPEKISASISLKVPGNESEKYSLDMRQLKNDRSSYL LEPSKGIPMVITQRMEDMNGKLRINVSLTALEDVYFNFSEQLSTGFNHDDCQFYMPGFWY RRNLRSPKEAPSFHTSDSWLVREDRLSAPLTGIYSEKEKSFVTVSRIDNFANEVLSTHRE GEIILSGKTSIGFTGFENQNGVATLSFGFPYREAPKSYIRKLTLAPAVEAFQSLKKGETM LLTWEIMENKADDYSDFIRNAWEYSYDTYAPAPVDTPYSIEDMKEVMSRFFVNSLVTGNS LTFNSGIHLRTADCQSNGQAEVGFIGRVLLNAFNAWEYSWQCGREDLKENSMKVFDSYLK NGFTQAGFFKESVNFDRGYEDPVHSIRRQSEGIYAMLHFLAYEKENGRRHPEWEQKMKNM LDILLQLQHADGSFPRKFHDDFTVVDDSGGSTPSATLPLIMGYKYFKDKRYLASAKRTAD YLEKVLISKADYFSSTLDANCEDKEASLYAATATYYLSLITKGDEHRHYADLTRKAAYFA LSWYYVWDVPFAQGQMLGDIGLKTRGWGNVSVENNHIDVFVFEFASVLQWLSKEYEEPRF AHFAEVISTSMRQLLPHEGHLCGIAKSGFYPEVVQHTNWDYGKNGKGYYNDIFAPGWTVA SLWELLTPGRAEHFLK >gi|222159332|gb|ACAB01000027.1| GENE 8 14263 - 17367 2472 1034 aa, chain + ## HITS:1 COG:no KEGG:PRU_2188 NR:ns ## KEGG: PRU_2188 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 43 1034 47 1045 1045 1300 64.0 0 MDLKKALFVCLLLYYGISANSALPEHSSMFGKTNMSEQAKGIRISGTITDKDKNPLPGVN VRVLEQTGTGVITDMDGHFYLDVPGKNSVIEISYIGFKTQQIKVGSKINFDVILEEDIEA LDEVVVTGYGSQKKMSVIGSIETLQPKKLQVGATRSLSNNLAGQIAGVIAVQRSGEPGYD SSNFWIRGIASFSGGQSPLVLVDGIERDLNNIDPAEIESFSVLKDASASAMYGVRGANGV IVINTKRGKVGAPSVNLRVEHSIAEPTKLPDFIGAADHMTLINEITENKSRLPFSQEQID RTRYGYDRDLYPDVNWLDEITKDYAYSTRANLTVSGGSDFLRYSLVGSYFGEKGIMETDK SLPYDTGTKLTRYNMRANVDLDVTKTTVLRLNVGGFLQTHRRQAFSTDDAFDRAFITSPF VYPARYSDGTIPIMAINDVNPWAAVTQRGYDVVTASQIQSLFAIEQNLKMITPGLKAKVT FSFDRWNQSSITRGKDATYHSIATGRDIEGNLIHSVLKYGDESLGHGNKGEYGNSRVYFE GTLTYARTFGKHDVDALFLYNQQSYDNGSVQPYRKQGIAGRLSYTFDSRYVTEFNFGYNG SENFAKGKRFGFFPSVAVGWLLSEEPFMDRFRNTLSKLKLRGSIGKVGNDDIGGRRFAYI 
TTLNTNADKYNWGDTGQIGRTGISEGEVGVENLTWETALKMNLGFELGLWNELELQVDVF KEKRTNIFMQRKIIPSQTGFLTNPWANFGEVTNRGVEFSLNYNKQINKDWFVGFRSNFTY AINRVDEYDEPETVKGTYRGLTGRSLNTLYGLQAERLFTEDDFDANGNLKFGIPTQDVGA SKVRPGDIKYIDMNGDGIITDADEGYIGGTVDPRIVYGFGGNVSYKNWDLNFFFQGTGDT YRVIGGTTYFIPGSGQTLQGNIYSNYNDRWTEENPAQDVFWPRLTEEPNRQNYRNSTWWK KNMRFMRLKTLELGYTLPQSVTDKIHSKAIRFYVSGNNLFCFSPFKLWDPELNTNTGLRY PAMRSVMFGVDLNF >gi|222159332|gb|ACAB01000027.1| GENE 9 17385 - 19277 1294 630 aa, chain + ## HITS:1 COG:no KEGG:PRU_2187 NR:ns ## KEGG: PRU_2187 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 16 629 18 635 641 749 61.0 0 MKLKKYIVAGLIGIVSLGSCTYLDKEPDTEIDIDMVFENKTKVESWLANVYSRIPNPGND WLNTYGWEVFADDLTASARWQQWDWPNITKIFGGWTPNTQWAGNFWDGLPRKIRQAYIFI ERVHALPDSDLPQREVDYMKAEARFLAAYYWWQLLEAYGPIPFRPNYIAPTDFTLSDLMV GQRPFDEVVSHLDNEMLEASKQLPAVWQSVEKYGRITSVMCLTVRAKMLLFAASPLVNGN EWYIGHKNKEGEELFNSTYSQEKWVKAAEACKQLLDVAEQAGYRLYVEYNDEGEIDPFTS LENLWFTKFQDGNKEILFPYTKEDAYQSYQFYTNKAVTKEYGGGNGLGVYQGLVDAFFMK NGLPKDDDNSEYVEEGFSTKVEARSTNWTGGTGQKGEITAEGTYNMYCNREPRFYTAVSY NGSWYALAGRKFEFFKNQRDNDYTHDAPQNGYLVRKKVYPQDSPKNGSYKWRQMFLYRLA ASYLDYAEAVNEAYDNRASREDALKYVNKVRERAGVRQYTLNTVALDDPKYIHVDDNQLA VRAIVRMERRVELCCEGSRWSDIRRWKIVEQLPEMCGDDYGMNFAGINSQEFYKRTVFQT RVWKKAMYWLPVYIDEMNKNTNLVQAPFWN >gi|222159332|gb|ACAB01000027.1| GENE 10 19297 - 20442 791 381 aa, chain + ## HITS:1 COG:no KEGG:PRU_2186 NR:ns ## KEGG: PRU_2186 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 19 379 19 374 376 223 35.0 7e-57 MNKIMKLICLLFLMSIAASCNKDPEYYALETPVDQMLLKASSTEITLEKEKENEVAVLFT WDAAAQRGGADATITYFFRMYMADLHSNVTDLYEIESEERSISFTHKELNDILASWSVLP GDKVTIEAEVIGQVNSSVEYLKPELSKTQLDVIGYDKNATAIYMVMVDDAGEKTVRRMTE KVVGSGVYQSTAELKPCEYFFALSPDSDYPCYMKAENGDNSLQYVAEAGNGEMFENTKSG TYTIVVDLNRLDVNIVSIYPLPQDGIWIVGDASTIGWDWGKMKKEGAFVNNDPRHPERWT YTGNFYGGKEFKLALEPNGADFAGKFFFAPSQGANPGVEHELGEARYQDNGGDLKWIVAD DGKYILTVDLNEMKIYLEPTE >gi|222159332|gb|ACAB01000027.1| GENE 11 20476 - 20979 443 167 aa, 
chain + ## HITS:1 COG:no KEGG:no NR:gi|237717189|ref|ZP_04547670.1| ## NR: gi|237717189|ref|ZP_04547670.1| predicted protein [Bacteroides sp. D1] # 1 167 4 170 170 313 100.0 3e-84 MKKLFIILYLLTNIFVLGACGDVESEGDPQPRVPRVEVESDDYYVQITQENMDGVAFVYR WLDIETASKYVMTLGLQVDGEITETMGVEDKSILTYNGVREQLFTNQQIMEYMKAFGVDE KEAQFSITLEAFQSDGMPITSEMDKESKTSIIHIVLGPGVELNSSME >gi|222159332|gb|ACAB01000027.1| GENE 12 21011 - 22075 978 354 aa, chain + ## HITS:1 COG:BS_abnA KEGG:ns NR:ns ## COG: BS_abnA COG3507 # Protein_GI_number: 16079933 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Bacillus subtilis # 52 314 44 303 313 105 32.0 1e-22 MLYLQHTITLGILSLLTWGCGNSDKQKGEAEQAPLADKVWYSNPVIDSDNPDPTVIKAED GYFYLYATGGNTWIHKSKDLVHWTYVGGAYEDDKKPSWEPEAGIWAPDINYINGKYVMYY SLSKWGGGATCGIGVSVSDKPEGPFTDQGKLFRSNEIDVHNSIDPFYIEDDGKKYLFWGS WYGIWGVELTEDGLGLKGGIENAKATKIQVAANTGGTNAYEGSYILKRGKYYYLFCSTGT CCDGANSTYQTVVCRSTELFGPYLNKSGDNMNDNKHEVLIHRNDAFLGTGHNAEIVQDDE GKDWIIYHAYRTADPSLGRVVLLDRVYWDDDDWPYLVGDGPVVKAEAPTFKKSK >gi|222159332|gb|ACAB01000027.1| GENE 13 22161 - 24707 1997 848 aa, chain + ## HITS:1 COG:SP0312 KEGG:ns NR:ns ## COG: SP0312 COG1501 # Protein_GI_number: 15900245 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Streptococcus pneumoniae TIGR4 # 96 672 10 564 679 433 40.0 1e-121 MKRGLLLLLCVMQSFIFVLHAQNTQRNIAYTDDNVRFTVISDGAIRLEYAPDGKFVDDRS HIVVNRNYPQVDYKLKTRGGWVEITTSKMKMRYKKNSGQFTDKNLVITASKEILPFTWKP GMQQKGNLKGTYRTLDGMDGDTQTQTWVADTKKGDKLKLEDGLLATDGWTLIDDSQGLLF DDNKDWDWVKERPANGGQDWYFIAYGHDYKAALKDYTLFAGKVPLPPRYAFGYWWSRYWL YSDREFRNLIDNFHTYQIPLDVLVVDMDWHYTEQGKGGWTGWTWNRDLFPNPKEFLGYLK QNDVKITLNLHPADGVASYEEKYPGLAKDMGVDPQSKQTIPWINSDKKFIKNMFKNVLTP MEKDGVDFWWLDWQQGIYDPKVKNLSNTWWINYAFFSNMEKNRDTRPILYHRWGGLGNHR YQVGFSGDAVVSWKSLDFQPYFNSTASNVLYGYWSHDLGGHIGESIDPEMYTRWLQFGAL SPIMRTHSQKSAGLNKEPWVFNKEYCDVLRKTIQQRYEMAPYIYTMARKGYDEGLALCRP MYYDYPDNKEAYEFRNEYMFGDDILVAPATAPAKDGYVQVRVWLPEGEWYELHTGTLLKG 
GQIVERPFAIDEYPIYVKAGAVLPMYTRKVMNLNGNDEEVVVTVFPGGNGISSFNFYEDN GNDKNYASEYAVTKLTSSSQGQERTIVIGKRDGQYKDMPESRSFKVKVLSSLVPQSVTVN GQPAKYEYLGEEFALSVDLSVLSCDQEKVIKIVYPTETADMNGLLGVSRRIAKSMEQLKY RDSYICFKEEFGKMGSLSEAVIYAPEKISSLVSDFWKSYTDLPEVLKRQGLNDENKAWFL QSICWSKQ >gi|222159332|gb|ACAB01000027.1| GENE 14 24802 - 24975 99 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295085613|emb|CBK67136.1| ## NR: gi|295085613|emb|CBK67136.1| hypothetical protein [Bacteroides xylanisolvens XB1A] # 1 57 1 57 57 106 100.0 4e-22 MLLRFRKCLYFYDHLYNYAICNITFGGGDIGWMIIEESTQHNKIVKVLQNILEIVVL >gi|222159332|gb|ACAB01000027.1| GENE 15 25136 - 26566 1011 476 aa, chain + ## HITS:1 COG:no KEGG:BT_3298 NR:ns ## KEGG: BT_3298 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 476 17 470 470 283 35.0 9e-75 MKNLFKTKKGGKQVRFVLFSLICLGGLGSCMDKNVEVNLPSYVASDPNVEVVFTDYSPLE GAVRTNMFINGSNFGTDPSLIRVVVGGKEAKVVSSSGTQIYCIVPSRADGGFVQVDILNK DKSLNATYTFEQKFSYQYNVVVGTLCGVVDSEGNTSITDGNFDKAGTQTPLAMYYDESSG ERCIYFFENEHSLRKIYLDKDSIATVITNGAAGWSSAPSGLGWSVGRDTMFVNCPQGNVE RAGAYYLLRKENFKNSYPALNGDDFNTLFTHPIDGTIFLCRGRDATLHKAVWNEKTQMWD PKQIAKMGPSGWFQCVTFHPSGNFAYLIARDKQCVMKAEYNWAGKYLETPITFAGEFGSY GYKDASQNSARFDNPMQGCFVLNEEYVAEQRLDVYDFYLTDAANHCIRKITPDGIVTTFA GRGSYSTDQIVSGYIDGDPRETARFNYPLGLCYEESTGTFYVGDNGNHRVRTIALQ >gi|222159332|gb|ACAB01000027.1| GENE 16 26596 - 29652 2261 1018 aa, chain + ## HITS:1 COG:no KEGG:BT_3297 NR:ns ## KEGG: BT_3297 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1018 1 1026 1026 1316 63.0 0 MKKFLFMILLLTTGCVIDMYAQSQTIEVTGMVRDAEKIPLAGVNISVKNVAGLGVITNID GEYKIKIAPYSTLIFSYIGFETQEVLIKDNMSKVDVTLRESSASVIDEVVITGTGAQKKI TMTGAVSTVDVDVLKSSPTSSLVNALAGNVPGVLARQTSGRPGANMSEFWIRGISTFGAG SGALVLVDGFERSLNEVNVEDIESFSVLKDASATAIYGSRGANGVVLITTKRGKAGKINI NAKVETSYNTRTMTPDFVDGYTYATMLNESRTTRSQQPMYSQDELRLIQTGLDPDLYPNV DWMDVLLRDGAMTYRANLDISGGGSVARYFVSASYVSEGGMYKTDGKLKDYNTNANYQRW 
NYRMNVDFDVTKTTMVTVGVSGWLSTQNDPGLGSDALWKSVMGQTPINMPVIFSNGRIPA AGTSERTNPWVIATQTGYYEEWQNVIQTNATLDQKLDFITKGLKFTGRFGFDTYNNNYRN HKQWPEQWQAERQRDSNGEIVFKKVAEKQLMFHESSASGNRRQFLEAILQYDRRFGDHSV GGTLKYTQDEYVNTQSTAGYDWLPNRHMGLAGRVTYGWKYRYMVDFNFGYNGSENFAPGN QFGFFPAYSVAWNIAEEPIIKKNLKWMNMFKLRYSYGKVGSDNIGTRFPYLELFESRNPY NWGDYNTPNVDNGQIYKQISSSGVTWEIATKHDIGVDFSLWDDRFTGTVDYFHETRDGIF MQRQSLPGTLGLSYLTPSANVGKVRSTGFDGNVAFKQDIGNVSLTVRGNFTYSNNKILAA DEANSAYPYIRDTGYRVNQAKGLIALGLFKDYDDIRNSPQQTEWGTVMPGDIKYKDVNSD GVINDSDRVPIGSTTRPNLIYGFGMSATWKGFDVNLHFQGAGKSSFFIDGFGVRPFSGGD WGNILTDVVGNYWSLGTNENPDAKYPRLTWQNNSNNNRESTYWLRDGSYLRLKTVEVGYT LPKNISRAILMNNVRIFFIGTNLLTFSKFKLWDPEMGSSNGQQYPLSKILTLGLTVNI >gi|222159332|gb|ACAB01000027.1| GENE 17 29661 - 31673 1407 670 aa, chain + ## HITS:1 COG:no KEGG:BT_3296 NR:ns ## KEGG: BT_3296 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 670 1 674 674 630 48.0 1e-179 MKKKHILFSTIISMGMMVSSCSDYLSVEGKLGENTQSLENIFENKEWSEQWLATAYAWLT WSNIDIGSKDNCITNFSDDMCYSDRNLEYRVFKYCEYDENWKQDSWAQAYDGIRHASIFI QNIDRNKEMTPDEIVDYKAQARFVRAYFYWKLLQKYGPIPIIPNDGVMDYNAAYEDLYIP RSTYDDCVNYIAEEMKLAAKDLPLKRDTRNMARPTRGAALATRAKALLYAASPLNNPRPT DTERFTDLVDDEGRYLMAQEYNEEKWAKAAAAARDVVELGRKGVYELHTVSAHSTGTIDN PATIAPPTHAVYSHADFPAGWRNIDPLQSYETLFNGVIFPSENKEMIFTTGQNNGDINTM IQHQMPIAFGGYNCHAMTGKQCDAYQMNTGKPFDKTKDWTGDENYVSAEEAASGDWAPLV EGVNKQYGHREPRFYATVAYNGCLWNGTNAVQSYDRNLIIWYYRGEESGWSNSGDRWLAT GIGIRKFVSSRDNFKTEGGVISKPVIGIRYADVLLWYAEALNELGASTYQIPSWDGTQSY DISRDKNEMSYAISRVRIRGGVPDFGSDVYENADEFRKHLKHERQIELFAENSRYFDLRR WKDAEYEESQQMYGCNAYMSKTERDLFHTQVINSNLPTTFSRKMYFWPISHDQLRRNLRL TQNPGWTYYD >gi|222159332|gb|ACAB01000027.1| GENE 18 31693 - 32679 992 328 aa, chain + ## HITS:1 COG:no KEGG:BT_3295 NR:ns ## KEGG: BT_3295 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 328 1 334 334 298 47.0 2e-79 MKNYIVYFILLLTLVLTGACNNEWVEEVYVQNVGLKAPVNSKGVTDVYLRYQANGEVTYR 
LPVIVGGSTMNERDLDVAIDVDPDTLIGLNEARWKQREDLYFQLLDKKHYEFASPTCHIP AGECQGLFDIKFKFNGLDLVDKWVLPLTILDSDLYDPNPRKNYRKALLRVMPFNDYSGSY GATNMFVYMMDNTSAMVSSTRTLHVVSENQCFFYAGVISEELKDRDKYKIVLTFNENGTL TAEPADPTNAMNFRLIGDAPTYTTSMSLDEVTPYLEHHFTTIYMSYTYDDVTTYEDKDIV VPCRAEGSMLMERKVNSLIPDTDQAIQW >gi|222159332|gb|ACAB01000027.1| GENE 19 33195 - 34748 1382 517 aa, chain - ## HITS:1 COG:YPO0256_1 KEGG:ns NR:ns ## COG: YPO0256_1 COG0642 # Protein_GI_number: 16120593 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Yersinia pestis # 233 515 350 612 693 87 27.0 9e-17 MRKSIIAILMTFCLSIMTYAQTPRDRATELKEQAQSSLNQKDYIKARYLFKKAYEAFATR ENYPQAIECGVQANALYVRENFYKEGFELCRDMDQLIWAGEQNQKKVFYDLRFLVNKERL QMYTALKNPAQAKTQLNKLEETANLAKNDSLTEVLLYTKANYYYTFNQNTQGDACFRKLI NQYKEKKDYAKVSDCYKKLIGIARKANNAPLMERTYESYIVWTDSVKALTAQDELNVLKR KYDESQLTIQEKDDTLSAKQYIIIGLCVLVVILVAGLAILAAILLKFIAGNRKLKKSVKI ANEHNELKTKFIQNISSQMEPTLNTLATSANELLQKAPQEASQMQSQVAALKKFSDDIQE LSSLENSLTELYELGEINVGTFCENVMDKVKEHIKIDVTPSVNAPKLQVKTNKEQLERIL LYLLKNAAFYTEQGRISLEFKKRGAHTHQFIVTDTGIGIPAEQQENIFKPFTEVKDLTTG DGLGLPICSLIAAKMNGSLTLDTSYTKGTRFILELHV >gi|222159332|gb|ACAB01000027.1| GENE 20 34846 - 36384 1083 512 aa, chain - ## HITS:1 COG:slr0904 KEGG:ns NR:ns ## COG: slr0904 COG0606 # Protein_GI_number: 16331658 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATPase with chaperone activity # Organism: Synechocystis # 1 507 1 507 509 533 52.0 1e-151 MLIKVFGAAVQGIDATLITIEVNSSRGCMFYLVGLPDSAVKESHQRIISALLVNGYKMPT SNIVVNMAPADIRKEGSAYDLPLAIGLLGANETISSEKFSRYLLMGELSLDGSIQPIKGA LPIAIKAREDGFEGLIIPQQNAREAAVVNQLKVYGVSNIKEVVEFFNNERELEPTVVNTR EEFYQQQTNCDLDFADVKGQENVKRALEVAAAGGHNLIMVGAPGSGKSMMAKRLPSILPP LSLGESLETTKIHSVTGQLKRGSSLISQRPFRDPHHTISQVAMVGGGSFPQPGEISLAHN GVLFLDELPEFNRGVLEVLRQPLEDRQISISRIKSTISYPANLMLIASMNPCPCGYYNHP TKACVCSPGQVQKYLNKISGPLLDRIDIQIEIVPVPFDKISDQRQGESSNIIRQRVIKAR QMQEKRYTEYTGIYCNAQMNSKLLAMYAQPDAKGLALLKNAMERLNLSARAYDRILKVAR 
TIADLEGAEQILPNHLAEAISYRNLDRENWAG >gi|222159332|gb|ACAB01000027.1| GENE 21 36398 - 37468 766 356 aa, chain - ## HITS:1 COG:no KEGG:BT_2845 NR:ns ## KEGG: BT_2845 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 344 13 360 371 447 70.0 1e-124 MKKSSILILIIICLCSCGKSSKKVSITGEIKGLGTDTLYLYGMDESFDRIDTIFAKNDKF SYTASIDTITSAFLLIDNQTEYPIFLDKGNQIKIKGDVNHPEYLDINGNIYNEEFTNFQK ELNSLPTPSEKDLEQKAEEFIMKHHSSFVSLYLLDKYFVQKDSPDFNKIKKLIEIMTGIL QDKLYIEQLNEAISLSEKTETGKYAPFFSLSNIKGEKITRSSEDFKKKNLLINFWASWGD SISNHQSNTELKEIYQKYKKNKHIAMLGISLDIDKQEWQDAIKRDTLNWEQVCDFGGLNS EVAKQYAIKQVPSNILLSADGKILAKNLKGEQLKKKIEEVVTAAEEKEKQDNKKKK >gi|222159332|gb|ACAB01000027.1| GENE 22 37580 - 39274 1729 564 aa, chain - ## HITS:1 COG:no KEGG:BT_2844 NR:ns ## KEGG: BT_2844 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 564 1 553 553 912 92.0 0 MTKKLYLPLLMAMVVALFSSCSKKMGELSADYFTVTPQVLEAVGGKVPATINGKFPEKYF NKKAVVEVTPVLKWNGGEAKGQPATFQGEKVEGNDQTISYKMGGSYTMKTSFDYVPEMAK SELYLEFKATIGKKVVTIPAVKIADGVISTSELVNNTLGNANPALGEDAFQRIIKEKHDA NIMFLIQQANIRSSELKTAKEFNKEVANVNEAANKKISNIEVSAYASPDGGVSLNTTLAE NREGNTTKMLSKDLKKAKIDAPIDAKYTAQDWEGFQELVSKSNIQDKELILRVIAMYQDP AQRESEIKNISAVYKELANTILPQLRRSRLTLNYEIIGKSDEEIAKLASSNPSELNVEEL LYAATLTSDPAKQEVIYTQATKQFPNDYRGYNNLGKLAYQAGNIDKAESYFKKAASVNAT PEVNMNLGLISLMKGDKAAAEAYFGKAAGTKELGESMGNLYIAQGQYERAVNSFGDSKTN SAALAQILAKDYNKAKNTLANVERPDAYTDYLMAVLGARTNNSSMVTSSLKSAVAKDSSL AKKAATDLEFAKFFTNADFMNIIK >gi|222159332|gb|ACAB01000027.1| GENE 23 39410 - 40369 657 319 aa, chain - ## HITS:1 COG:SA1328 KEGG:ns NR:ns ## COG: SA1328 COG4974 # Protein_GI_number: 15927078 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Staphylococcus aureus N315 # 14 306 2 294 295 216 44.0 7e-56 MEINEKKHKKEQQELIIRKYQQYLKLEKSLSLNTLDAYLTDLDKLMSFLTLEGIYVLDVC LSDLQRFAAGLHDIGIHPRSQARILSGIKSFFRFLILENYLEADPSELLEGPKIGFKLPE 
VLTVEEIDRIISAVDRSKAEGQRNRAILETLYSCGLRVSELITLKLSDLYFDEGFIKVEG KGSKQRLVPISPRAINEIKLYITDRNQIEVKKGHEDFVFVSQRRGKGLSRIMIFHMIKEL AQKAGITKNISPHTFRHSFATHLLEGGANLRAIQCMLGHESIATTEIYTHIDRNMLRSEI IEHHPRNIKYRKEKEELFH >gi|222159332|gb|ACAB01000027.1| GENE 24 40432 - 40851 249 139 aa, chain + ## HITS:1 COG:sll1112 KEGG:ns NR:ns ## COG: sll1112 COG0757 # Protein_GI_number: 16329990 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate dehydratase II # Organism: Synechocystis # 2 138 6 144 152 156 55.0 1e-38 MKIQIINGPNINLLGKREPSIYGSVTFEDYLADLRKRYVDVEIDYFQSNIEGEMIDCIQQ VGFDVDGIILNAGAYTHTSIALQDAIRSVTSPVIEVHISNVHSRESFRHVSMIACACKGV ICGFGLNSYRLALEALLDR >gi|222159332|gb|ACAB01000027.1| GENE 25 40887 - 42344 1552 485 aa, chain + ## HITS:1 COG:BB0348 KEGG:ns NR:ns ## COG: BB0348 COG0469 # Protein_GI_number: 15594693 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: Borrelia burgdorferi # 1 471 1 473 477 390 47.0 1e-108 MLLKQTKIVASISDRRCDVDFIKQLFEAGMNVVRMNTAHASREGFEALIANVRAVSNRIA ILMDTKGPEVRTTANAEPIPYKIGEKVKIVGNPDLETTRECIAVSYSNFVSDLNIGGTIL IDDGDLELEVIDKTDGYLLCEVKNDATLGSRKSVNVPGVRINLPSLTEKDRNNILYAIEK DIDFIAHSFVRNRQDVLDIREILDAHNSDIKIIAKIENQEGVDNIDEILEVADGVMVARG DLGIEVPQERIPGIQRVLIRKCILAKKPVIVATQMLHTMINNPRPTRAEVTDIANAIYYR TDALMLSGETAYGKYPVEAVKTMTKIAAQAEKDKLEENDIRIPLDENSNDVTAFLAKQAV KATSKLKIRAIITDSYSGRTARNLAAFRGKYPVLAICYKEKTMRHLALSYGVEAIYMPEL ANGQQYYFAALRRLLKEGRLQPSDMVGYLSSGKAGTKTSFLEINVVEDALKHAEETVLPN NNRYL >gi|222159332|gb|ACAB01000027.1| GENE 26 42365 - 43021 634 218 aa, chain + ## HITS:1 COG:aq_1507 KEGG:ns NR:ns ## COG: aq_1507 COG4122 # Protein_GI_number: 15606661 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Aquifex aeolicus # 5 211 7 211 212 144 37.0 1e-34 MQETESIDEYILQHIDPESDYLKSLYRDTHVKLLRPRMASGHLQGRMLKMFVEMIQPRQI LEIGTYSGYSALCLAEGLPKGGLLHTFEINDEQEDFTRPWLEKSPFADKIRFYIGDALEL VPRLGVTFDMAFIDGDKRKYIEYYEMTLAYLSEGGYIIADNTLWDGHVLEQPRNTDAQTI 
GIKAFNDLVAQDVRVEKVILPLRDGLTIIRKRIVSSKS >gi|222159332|gb|ACAB01000027.1| GENE 27 43038 - 43370 312 110 aa, chain + ## HITS:1 COG:BS_rbfA KEGG:ns NR:ns ## COG: BS_rbfA COG0858 # Protein_GI_number: 16078728 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-binding factor A # Organism: Bacillus subtilis # 3 109 2 107 117 63 37.0 6e-11 METTRQNKISRLLQKELSEIFLLQTKSMPGTLVSVSAVRISPDMSIARVYLSVFPSEKAE EMVKNINNNMKSIRYELGTRVRHQLRIIPELKFFVDDSLDYIEKIDSLLK >gi|222159332|gb|ACAB01000027.1| GENE 28 43541 - 44614 745 357 aa, chain + ## HITS:1 COG:HI1548 KEGG:ns NR:ns ## COG: HI1548 COG4591 # Protein_GI_number: 16273448 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ABC-type transport system, involved in lipoprotein release, permease component # Organism: Haemophilus influenzae # 37 354 91 406 416 95 24.0 2e-19 MVASFFTAFDPQLKITVREGKVFDAQDERIRTVCALPEVEVFTETLEENAMVQYKDRQAM VVLKGVEDNFEELTAIDSILYGAGEFVLHDSIVNYGVMGVELVATLGTGLEFVDPLQVYL PKRNSKVNMANPGASFNRDYLYSPGVVFVVNQQEYDGKYILTSLGFLRQLLDYTTEVSAM ELKLKPNVNTSSVQSKIENILGDDFVVQNRYQQQADVFRIMEIEKLISYLFLTFILMIAC FNVIGSLSMLILDKKDDVVTLRSLGASDKLISRIFLFEGRLISLFGAISGIVLGLILCFI QQKFGIISLGGGGGTFVVDAYPVSVHAWDVVLIFITVLAVGFLSVWYPVRYLSKRLL >gi|222159332|gb|ACAB01000027.1| GENE 29 44760 - 45311 265 183 aa, chain + ## HITS:1 COG:DR0180 KEGG:ns NR:ns ## COG: DR0180 COG1595 # Protein_GI_number: 15805216 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Deinococcus radiodurans # 10 176 53 222 229 70 23.0 1e-12 MKLENNEICRIVDGDEIAFNRFMEHYSSRLYHYTFALLGQKESAEEIVSDVFFEVWKNRK GLAEIGNMNAWIQTITYRKAISFLRKETGKYELSFDDIEDFIFEPVQSPAEEMISKEEMA KINDAIQQLPPKCKHVFFLAKIDGLPYKDIADMLNISVKTINNHIAFALDEIAKRLNMKS RKS >gi|222159332|gb|ACAB01000027.1| GENE 30 45468 - 46583 776 371 aa, chain + ## HITS:1 COG:PA3900 KEGG:ns NR:ns ## COG: PA3900 COG3712 # Protein_GI_number: 15599095 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction 
mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 171 362 119 305 317 75 26.0 2e-13 MSLKEYHRLADLFSNLNSQENIEDDFTSMPDAEKLKFFWENCAQEKIDPSSIIEKTQRKM RKDAMKRRKKYFLVASASIAASFLICISTIYFLNQNGSVNLDFQAIAKEMDSQSVEEVTL ITSKEQLNLDEDVFIKYSKEGKVAVNSQVIKEKEEKTKEEQEYNQLLVPAGKRARVELSD GTRLVVNSQSKVIYPRCFKGDIRKIYAQGEVFLEVAHDKKHPFIVESDDFKLQVLGTKFN ISNYKGGTTNIVLVEGAVEVTDKNEKKARLNPNDLLNIANGTIAYQKQVDVAEYISWVEG IMLLNGNDLSQIIQRLSIYYGISIQCEPIIGKEKVYGKLDLKDDIDEVIECIQQTLPFTI EKSDTSIYLNK >gi|222159332|gb|ACAB01000027.1| GENE 31 46735 - 50124 2387 1129 aa, chain + ## HITS:1 COG:no KEGG:ZPR_4655 NR:ns ## KEGG: ZPR_4655 # Name: not_defined # Def: TonB-dependent receptor Plug domain protein # Organism: Z.profunda # Pathway: not_defined # 130 1129 5 997 997 1068 54.0 0 MVKKTIFLHLKKRLKLFCIINVFFMIPFSSFGTNSISWNSEEDLFSVQYENVTVKDILDY IEKHSKYIFIYSANVQKNLNNKVSISVSNKKIDAVLKELFSETGLNYKMSGRQITISVPE APKVQQTIQQKGIKVTGNVSDEKGEPLIGVTIILKNDSTVHALTDMNGNYSIIVPERKSV LSFRYIGFVPKEEVVNNRKVVNVQMVEDVGQLDEVVVVAYGAQKKESVVGSITTIEPAKL KVSTTRSISNNLAGTVAGVLAVQRSGEPGYDNSSFWIRGISTFQDAGQNPLVLIDGIERD LNNIDPEEIESFSVLKDAAASAVYGVRGANGVILINTKRGQVGKPRVTVKAEFAATQPVK LPEYLGAADYMQVLDDILMDTGQQPKYTDRIAKTRAGYDPDLYPDVNWMDAIANDYASNQ RVTVDISGGTETLRYSFVAAAYNERGILKRDKSYDWDPTIKLQRYNVRSNVDLKLSPTTQ LRFNIGGYLQDRNSTTKDISQIFQKAFVAVPHAFPAQYSSGQIPTTEEPNVWAWATQSGY KRRSDSKIETLFSVEQDLKFLLPGLKVKGTFSFDRFSSGTVSRGKTPDYYVPATGRDDEG NLIIASKSNGTNFLDYSKSGDYGNKSVYMEATLSYDRTFAEKHSVAAMLLFNRRNYDDGS KLPYRNQGLAGRASYTYSGKYVGEFNFGYNGSENFAKGKRYGFFPSGAIGWIVSEEAFMQ PLRKVISKLKLRASYGQVGNANLGGRRFAYLSTITDDYDTLNMYKWGLDSSYGLTGMAEG EFAVQDLTWEIVNKMNLGVELGFLNGMIDLQLDYFDERRKDIFMPRESVPMTAGFMKQPW KNFGKVTNQGVEVSLNVNKQFGKDLFVSLMGTFTYAHNEITEKDEPSAVVGTNRAETGHP VGQLTGYIAEGLFTEDDFEDVSTGKLKEGIPTQSFVSKLRPGDIRYRDVNGDGKVDVFDK SPIGGTKDPEIVYGFGLNMKYKNLDFGALFQGIGRSWNILGSSIIPGANRGVTGNMFTNA NDRWTVDNPSQNVFYPRLDDGINSNNNQPSTWWLRNMSFLRLKNIELGYSLPKNLWRNTT VISGIRLFVRGTNLLTFSKFDLWDPEVENTTGAAYPIMKSLSAGFEIKF >gi|222159332|gb|ACAB01000027.1| GENE 32 50136 - 52082 
1444 648 aa, chain + ## HITS:1 COG:no KEGG:ZPR_4656 NR:ns ## KEGG: ZPR_4656 # Name: not_defined # Def: RagB/SusD family protein # Organism: Z.profunda # Pathway: not_defined # 1 643 1 615 615 508 45.0 1e-142 MKNSILKFIVLSIITTFTFSSCSDYLDKQPDDQLDLESVFENKKNMERWLAYVYRGLPEY YTYDGPDAIADELIPSVGWEAQGFKAIQYQKGNWTADNPGVIAYWNTYPKYIRSAYLFIK HAHPLEAVPAEEVDFMKAECRFFIAYYNSMMAIAYGSVPIIREASESTSADDLMLKQEPF YNVIDWADQEMLEASKQLPATQTEDKKYGRVTSVGCLAMRARMLLFAASDLVNGNPALAN IKNIDGTPIFNSAHDPERWKRAVDACKLVIDEAEAAGYHLHYEYLDNGDIDPFLSYQNAV MKRWNEANRELLFVRTMDSGGWYDKNCIPRGLGINGVGAIGITQSLVDAFFMRNGQRPII GYNSDGSPKVNPDANYSETGFSTQKETYSTKWQYGSSEGDRNKDENVVVDANTYKMYCDR EPRFYISVLHNEQWHIGGKRNTDFYMDGKDGGPSHDAPWSGYLVRKRVDPSANPKEGSGD YKNRHGALCRLAEIYLSYAEALNEYSIEKGTYTANQKEILKYVNLIRERAGIPEYSVSAE EGKITAPSDPVEMRELIRQERRVELNCESGLRFNDLRRWKLAEKVLDGDFYGMNAYIKVS DADYRNKYYTRTVYQTRKFISYWWPIPQDDIDKNWNLVQTPDWTVGNQ >gi|222159332|gb|ACAB01000027.1| GENE 33 52115 - 53755 1207 546 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717210|ref|ZP_04547691.1| ## NR: gi|237717210|ref|ZP_04547691.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 546 1 546 546 1090 100.0 0 MKNGYIFKKNLLRFFFLTLASCFIVACNDDDDATNSLDNRTSLVLTPSDYEIELKEDTPD EVALTLNWTEADPIGSDYYISYLYKMDLENNSFGTNTMIREYIDDDFSKSYTHKELQSLL VSKWKQQPGDLVKLQARIIGSVEGPKFVKPEVSTVTIKVKMYSEKTFVADHLYMSGTAVD GEDLEILPMESQPKRYVSICDLKAGNLHFPIVWKDENKINAISPVAAEQQITDGAMEAKI KGIDNAGYWVIPEDGQYRVVVDFETRTVTIGLASNFIEADKIYIAGTCVSADVEMTRTIE DENQYAFHAELQPGTIYFPILFNGQKDMAIAPEESGDFTDGTAMNISTMSPEAAALAYHW NIKTAGVYRVVININTKKVTIYSPETDPKPMVVSWTWNNNTMTTTIERVFIWGPYDGWAK DGTGDTGFTMAHSMTPSLANPYLFIYKGAELPRKNSIKDKDGNAHPGGLNFKVGPQSAGC YTFGSTADAIRGSYDGCLDIAESDYNQKQTVVGGQSHNRYAFFSVPVGVNYIELDIKELT VFFDKR >gi|222159332|gb|ACAB01000027.1| GENE 34 53854 - 55416 1148 520 aa, chain + ## HITS:1 COG:no KEGG:BVU_3139 NR:ns ## KEGG: BVU_3139 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 95 511 8 410 417 192 30.0 2e-47 MVIRKISYYRVIVSLCLFMFLLSPVFGDENSAVSFDKELPENNIVTIQPDSLRILHNPLT GWVLYASMGVDAADFWAQYDHMYIPELGHNVSVTDYAHTLYIRASWTDFNPQEDVYGWKI DSNLRAYIEGAYQRNMRLAFRVVVDSRDKRTEFTPQFVKDAGAKGFMNKGKWSPYSDDPV FQKYYTKFVKALAKDFNDPSKVEFIDGFGLGKWGEYHTMIYSTGDDTPKKAVFDWVTDIY SQAFDKVPVVINYHRWIGAGKDWVDDEHFAADSEEMLEEAIKKGYSLRHDAFGMTTYYGS WERRFAKKYRNICPIIMEGGWIVKSHSYWQDPRGYRKDSHEDVRRGEFDDSKEAHVNMMD FRFGDTESWFKDAFDLVRRFLREGGYRLYPSEISLPLEVSKKTTPTIKHCWNNMGWGYCP TNIPQWNQKYKVAFALLDSETNEVRYTFVDTNTDLSKWIKGSPAEYVFEPDMSKVETGTY VWAVGLVDRSNKDLIGLDIAAKGDILDSGWLKLSKVSIKK >gi|222159332|gb|ACAB01000027.1| GENE 35 55428 - 57077 1289 549 aa, chain + ## HITS:1 COG:no KEGG:BVU_3139 NR:ns ## KEGG: BVU_3139 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 103 546 5 415 417 184 31.0 6e-45 MKINFQNMCSLRKISCGLIFSMAISLVACGSDNDEEGGGNGDDGLVEDTSIPGNSIVTVK QNRNIILHNPLSGWVLYAGIGDGLSSTFWQDYDNFPSSEGTVKVSDYANTLYLRGAWADF NPEEGKYAWNSDCDTPSAKRLKMLIEGAKQRNMKLAFTFVVDSRDKHYNFTPNFVKEAGA KGYETQTGSVKVWSPYPDDPIFQKYYEKFIRALAKDFNDPDKVQFVSGSGFGKWGEYHSV WYYQVRELGKPELPTREAVFDWVTDLYSQVFDKVPVFVNYHRWIGTSKEWDGNNYDKDTE 
RLIGKAVAKGYSLRHDAFGMKTYYSTWERNFIAKWKYLVPVVMEGGWVKNSHGNSIQGDG YANYAEVRQGEFDEAKTACVNMMDLRYNSDFRNGETYSWFNEAFQLVKQFCTEGSYRLFP DRISLPTTISNGKQIEIAHRWNNFGWGYCPTNIPQWKNKYKVAFALLDIKNDKPKYVFVD GEPEACDWVKGTAKSYTFTTRVEGVEAGKYMWAVGIVDTAKQNEIGIHLAVKNNVTSAGW LKLFEVTAR >gi|222159332|gb|ACAB01000027.1| GENE 36 57301 - 59508 1912 735 aa, chain + ## HITS:1 COG:YPO0616 KEGG:ns NR:ns ## COG: YPO0616 COG1472 # Protein_GI_number: 16120942 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Yersinia pestis # 29 728 4 710 727 519 40.0 1e-147 MRKKVLILGLCLLGVTHSLSSKDKKSIPLYKDAKAPIEKRIDDLISRMTLEEKVLQLNQY TLGRNNNVNNVGEEVKKVPSEIGSLIYFDINPELRNSMQKKAMEESRLGIPIIFGYDAIH GFRTIYPISLGQACSWNPGLVEQACAVSAQEARMSGVDWTFSPMIDVARDPRWGRVAEGY GEDPYTNGVFAAASVRGYQGDDMSAENRMAACLKHYVGYGASEAGRDYVYTEISAQTLWD TYLLPYEMGVKAGAPTLMSSFNDISGVPGSANPYIMTEILKKRWKHDGFIVSDWGAVEQL KNQGLAATKKDAARYAFNAGLEMDMMSHAYDRHLKELVEEGKVTMAQVDESVRRVLRVKF RLGLFERPYTPVTNEKDRFFRPQSMAVAAQLAAESMVLLKNDNQILPLTNKKKIAVVGPM AKNGWDLLGSWCGHGKDTDVEMLYDGLTAEFGGDAELRYAMGCKPQGNDRSGFAGALDVA RWSDVVIVCLGEMLTWSGENASRSTIALPQIQEELVKELKEAGKPVILVLSNGRPLELNR MEPLCDAILEIWQPGINGARSMAGILSGRINPSGKLAMTFPYSTGQIPIYYNRRKSGRGH QGFYKDITSDPLYPFGHGLSYTEFKYGTVTPSATKVKRGDKLSAEVTVTNTGSRDGAETV HWFISDPYCSITRPVKELKHFEKQLIKAGETKTFRFDIDLERDFGFVNEDGKRFLEAGEY HILVQGQTVKIELID >gi|222159332|gb|ACAB01000027.1| GENE 37 59588 - 61154 1348 522 aa, chain + ## HITS:1 COG:TM0025 KEGG:ns NR:ns ## COG: TM0025 COG1472 # Protein_GI_number: 15642800 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 29 388 2 391 721 253 38.0 5e-67 MKKKIGGIGLCMLWSILTCAQTITPQAEQRAKDIVSKMTLQEKIEYISGYTSFSLRAIPR LGIPEIKLADGPQGIRNHAPKSTLYPSGILSASTWNRELLYKLGQGLGQDAKARGVNILL GPGVNIYRAPLCGRNFEYFGEDPYLTGETAKQYILGVQSEGVIATIKHFAANNQEWSRHH ASSDIDERTLQEIYFPAFRKAVQEANVGAVMNSYNLLNGVHATEHKWLNIDVLRNLWGFK GILMSDWTSVYSAVGAANAGLDLEMPKGRFMNLENLLPAIKAGTVTEETINLKVQHILQT 
LIAYDMLDKEQKDSNIAEDNPFSRQTALELAREGVVLLKNEGNLLPLKGKTAVMGPNANL IPTGGGSGFVTPFSTVSVAQGLKELKKKNLLLLTDDVIYEDIVHEFYTDANRQMKGFKAE YFKNKTLSGQPEVIRTESSVDYDWGYGAPLDGFPTDGFSVRWTACYMPQTDGQLKLHIGG DDGYRLFVNDKHITGDWGNHSYSSREVELPVEGGKEYRFRIE Prediction of potential genes in microbial genomes Time: Wed May 18 01:30:13 2011 Seq name: gi|222159331|gb|ACAB01000028.1| Bacteroides sp. D1 cont1.28, whole genome shotgun sequence Length of sequence - 187702 bp Number of predicted genes - 122, with homology - 120 Number of transcription units - 59, operones - 28 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 882 771 ## COG1472 Beta-glucosidase-related glycosidases 2 1 Op 2 . + CDS 912 - 2024 1108 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins 3 1 Op 3 . + CDS 2078 - 3946 1717 ## BT_2458 putative pyridine nucleotide-disulphide oxidoreductase 4 1 Op 4 . + CDS 3954 - 6020 1658 ## Phep_0375 hypothetical protein + Term 6114 - 6152 -0.9 + Prom 6171 - 6230 4.7 5 2 Tu 1 . + CDS 6255 - 7553 901 ## COG3458 Acetyl esterase (deacetylase) + Prom 7587 - 7646 7.1 6 3 Op 1 . + CDS 7668 - 9464 1746 ## COG3250 Beta-galactosidase/beta-glucuronidase 7 3 Op 2 . + CDS 9486 - 10340 711 ## Cphy_0623 hypothetical protein + Term 10439 - 10473 -0.6 - Term 10399 - 10436 -0.2 8 4 Tu 1 . - CDS 10487 - 11692 650 ## COG1373 Predicted ATPase (AAA+ superfamily) - Prom 11723 - 11782 6.0 - Term 12184 - 12230 8.5 9 5 Op 1 23/0.000 - CDS 12256 - 13266 1107 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit 10 5 Op 2 . - CDS 13270 - 15120 1930 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit - Prom 15341 - 15400 78.0 + TRNA 15324 - 15398 54.5 # Arg CCT 0 0 - Term 15506 - 15553 9.6 11 6 Tu 1 . 
- CDS 15569 - 18580 2843 ## COG0342 Preprotein translocase subunit SecD - Prom 18605 - 18664 1.8 - Term 18695 - 18733 2.1 12 7 Op 1 . - CDS 18746 - 20830 2027 ## COG0339 Zn-dependent oligopeptidases 13 7 Op 2 . - CDS 20861 - 21946 798 ## BT_2833 hypothetical protein 14 7 Op 3 . - CDS 21951 - 22856 766 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) 15 7 Op 4 . - CDS 22837 - 23514 627 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) - Prom 23704 - 23763 7.4 + Prom 23615 - 23674 9.7 16 8 Op 1 . + CDS 23736 - 24008 353 ## COG0776 Bacterial nucleoid DNA-binding protein + Term 24031 - 24071 8.1 17 8 Op 2 . + CDS 24092 - 25909 2103 ## COG0018 Arginyl-tRNA synthetase 18 8 Op 3 . + CDS 25973 - 27160 901 ## COG4292 Predicted membrane protein - Term 27213 - 27275 6.4 19 9 Op 1 . - CDS 27397 - 29421 1318 ## BT_2828 hypothetical protein - Term 29431 - 29479 6.1 20 9 Op 2 . - CDS 29494 - 31839 2230 ## COG0550 Topoisomerase IA - Prom 31906 - 31965 9.8 - Term 31925 - 31988 4.2 21 10 Tu 1 . - CDS 32019 - 36089 2277 ## COG0642 Signal transduction histidine kinase - Prom 36126 - 36185 6.4 + Prom 36197 - 36256 6.5 22 11 Tu 1 . + CDS 36496 - 37095 554 ## BT_2225 hypothetical protein + Term 37107 - 37147 -0.4 23 12 Op 1 . + CDS 37645 - 38733 810 ## gi|237717238|ref|ZP_04547719.1| predicted protein 24 12 Op 2 . + CDS 38746 - 40077 567 ## Dfer_2199 hypothetical protein + Term 40285 - 40333 1.1 + Prom 40195 - 40254 7.6 25 13 Tu 1 . + CDS 40462 - 42603 1082 ## COG3669 Alpha-L-fucosidase + Prom 42678 - 42737 6.5 26 14 Op 1 . + CDS 42763 - 45873 2001 ## BT_4247 hypothetical protein 27 14 Op 2 . + CDS 45894 - 47876 1343 ## BT_4246 hypothetical protein 28 14 Op 3 . + CDS 47898 - 49328 1223 ## BVU_0948 hypothetical protein 29 14 Op 4 . + CDS 49355 - 51478 1248 ## gi|237717244|ref|ZP_04547725.1| predicted protein 30 14 Op 5 . 
+ CDS 51515 - 53185 930 ## Dfer_2199 hypothetical protein + Prom 53227 - 53286 2.2 31 15 Op 1 1/0.000 + CDS 53306 - 55339 1107 ## COG3250 Beta-galactosidase/beta-glucuronidase + Prom 55341 - 55400 6.2 32 15 Op 2 . + CDS 55422 - 57617 1611 ## COG3345 Alpha-galactosidase 33 15 Op 3 . + CDS 57632 - 58360 131 ## COG2071 Predicted glutamine amidotransferases + Prom 58403 - 58462 2.7 34 16 Op 1 1/0.000 + CDS 58505 - 61351 1817 ## COG3250 Beta-galactosidase/beta-glucuronidase + Prom 61404 - 61463 3.0 35 16 Op 2 . + CDS 61576 - 63777 1803 ## COG1472 Beta-glucosidase-related glycosidases 36 16 Op 3 . + CDS 63786 - 64874 825 ## Slin_0490 hypothetical protein + Term 64933 - 64970 4.1 + Prom 64943 - 65002 7.6 37 17 Tu 1 . + CDS 65042 - 67042 1800 ## BT_1871 putative alpha-glucosidase + Term 67068 - 67117 5.3 - Term 67062 - 67099 5.2 38 18 Op 1 . - CDS 67159 - 68880 1391 ## BT_2817 putative TonB-dependent receptor 39 18 Op 2 1/0.000 - CDS 68901 - 71918 2980 ## COG0457 FOG: TPR repeat 40 18 Op 3 . - CDS 71942 - 73252 861 ## COG1672 Predicted ATPase (AAA+ superfamily) - Term 73951 - 73991 -0.0 41 19 Tu 1 . - CDS 74007 - 74531 575 ## COG1051 ADP-ribose pyrophosphatase - Prom 74691 - 74750 5.9 + Prom 74656 - 74715 5.4 42 20 Op 1 11/0.000 + CDS 74745 - 75311 629 ## COG0450 Peroxiredoxin + Term 75337 - 75379 5.3 43 20 Op 2 . + CDS 75396 - 76946 395 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 + Term 77020 - 77057 5.1 - Term 77007 - 77044 2.4 44 21 Tu 1 . - CDS 77197 - 77904 491 ## Coch_1347 hypothetical protein - Prom 77963 - 78022 4.5 45 22 Tu 1 . - CDS 78092 - 82039 2104 ## COG0642 Signal transduction histidine kinase - Prom 82059 - 82118 4.1 + Prom 82254 - 82313 4.3 46 23 Op 1 . + CDS 82407 - 85367 2332 ## BF4062 putative TonB-linked outer membrane protein 47 23 Op 2 . + CDS 85382 - 87022 1517 ## BVU_3705 hypothetical protein 48 23 Op 3 . + CDS 87046 - 88719 1223 ## Coch_1345 hypothetical protein 49 23 Op 4 . 
+ CDS 88738 - 89763 489 ## COG3858 Predicted glycosyl hydrolase + Term 89841 - 89897 14.1 + Prom 89836 - 89895 6.0 50 24 Tu 1 . + CDS 89935 - 91761 1268 ## COG3568 Metal-dependent hydrolase + Term 91781 - 91831 4.3 - Term 91766 - 91822 3.5 51 25 Op 1 . - CDS 91907 - 93673 873 ## BVU_2031 hypothetical protein 52 25 Op 2 . - CDS 93693 - 94961 1089 ## COG0477 Permeases of the major facilitator superfamily 53 25 Op 3 . - CDS 94964 - 95536 512 ## COG2731 Beta-galactosidase, beta subunit 54 25 Op 4 . - CDS 95566 - 96495 909 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 55 25 Op 5 . - CDS 96522 - 98276 977 ## BVU_2031 hypothetical protein 56 25 Op 6 . - CDS 98303 - 100027 1092 ## Dfer_2403 RagB/SusD domain protein 57 26 Op 1 . - CDS 100151 - 103462 2664 ## Dfer_2402 TonB-dependent receptor plug 58 26 Op 2 2/0.000 - CDS 103422 - 104408 821 ## COG0524 Sugar kinases, ribokinase family 59 26 Op 3 . - CDS 104448 - 105371 725 ## COG0524 Sugar kinases, ribokinase family - Prom 105520 - 105579 9.6 60 27 Tu 1 . - CDS 105599 - 107140 939 ## BT_2802 hypothetical protein - Prom 107195 - 107254 3.0 - Term 107202 - 107251 14.0 61 28 Tu 1 . - CDS 107280 - 107762 435 ## BT_2538 hypothetical protein - Prom 107819 - 107878 3.4 + Prom 108144 - 108203 4.2 62 29 Op 1 . + CDS 108259 - 108666 334 ## gi|260173218|ref|ZP_05759630.1| hypothetical protein BacD2_15210 63 29 Op 2 . + CDS 108733 - 109683 421 ## mru_0017 hypothetical protein + Term 109902 - 109935 0.0 64 30 Tu 1 . - CDS 109760 - 109978 137 ## gi|237717280|ref|ZP_04547761.1| predicted protein - Prom 110134 - 110193 2.6 + Prom 110541 - 110600 3.6 65 31 Tu 1 . + CDS 110628 - 110912 144 ## gi|237717282|ref|ZP_04547763.1| predicted protein 66 32 Tu 1 . - CDS 110974 - 111573 521 ## BT_2225 hypothetical protein - Prom 111674 - 111733 5.9 + Prom 111839 - 111898 6.4 67 33 Op 1 . + CDS 111993 - 115412 2484 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins 68 33 Op 2 . 
+ CDS 115433 - 117166 1433 ## Phep_2529 RagB/SusD domain protein 69 33 Op 3 . + CDS 117186 - 117977 483 ## COG3568 Metal-dependent hydrolase 70 33 Op 4 . + CDS 118006 - 120162 1452 ## Amuc_0060 alpha-N-acetylglucosaminidase (EC:3.2.1.50) 71 33 Op 5 6/0.000 + CDS 120171 - 121313 784 ## COG1929 Glycerate kinase 72 33 Op 6 . + CDS 121320 - 122576 980 ## COG2610 H+/gluconate symporter and related permeases 73 33 Op 7 . + CDS 122591 - 123676 580 ## COG4299 Uncharacterized conserved protein 74 33 Op 8 . + CDS 123683 - 125866 1728 ## BT_0438 alpha-N-acetylglucosaminidase precursor + Term 125929 - 125980 5.1 + Prom 125900 - 125959 12.7 75 34 Op 1 6/0.000 + CDS 125985 - 126566 510 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Prom 126581 - 126640 6.5 76 34 Op 2 . + CDS 126675 - 127631 506 ## COG3712 Fe2+-dicitrate sensor, membrane component 77 34 Op 3 . + CDS 127663 - 128436 596 ## RB8407 hypothetical protein 78 34 Op 4 . + CDS 128444 - 129223 247 ## gi|237717296|ref|ZP_04547777.1| conserved hypothetical protein 79 34 Op 5 . + CDS 129237 - 130766 1036 ## gi|260173237|ref|ZP_05759649.1| hypothetical protein BacD2_15305 + Term 130899 - 130946 8.2 + Prom 130812 - 130871 7.4 80 35 Tu 1 . + CDS 131087 - 132073 722 ## COG3712 Fe2+-dicitrate sensor, membrane component + Prom 132150 - 132209 4.2 81 36 Op 1 . + CDS 132236 - 135451 2457 ## Dfer_0714 TonB-dependent receptor plug 82 36 Op 2 . + CDS 135472 - 136950 1411 ## Dfer_4709 RagB/SusD domain protein 83 36 Op 3 . + CDS 136974 - 138050 710 ## ZPR_4337 glycoside hydrolase, family 5 84 36 Op 4 . + CDS 138087 - 139964 1270 ## BT_2892 hypothetical protein 85 36 Op 5 . + CDS 139991 - 141880 1237 ## gi|237717303|ref|ZP_04547784.1| predicted protein 86 36 Op 6 . + CDS 141920 - 145174 2468 ## BVU_0750 TPR domain-containing protein + Term 145200 - 145257 10.1 + Prom 145189 - 145248 4.8 87 37 Op 1 . 
+ CDS 145371 - 145922 194 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 88 37 Op 2 . + CDS 145952 - 147091 788 ## COG2017 Galactose mutarotase and related enzymes + Prom 147113 - 147172 1.8 89 38 Op 1 . + CDS 147208 - 148011 717 ## COG0483 Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family + Term 148012 - 148048 2.4 90 38 Op 2 . + CDS 148055 - 148762 373 ## COG1040 Predicted amidophosphoribosyltransferases - TRNA 149478 - 149554 79.9 # Asn GTT 0 0 - TRNA 149580 - 149656 79.9 # Asn GTT 0 0 + Prom 149650 - 149709 2.7 91 39 Op 1 . + CDS 149734 - 150723 937 ## COG0524 Sugar kinases, ribokinase family 92 39 Op 2 . + CDS 150770 - 152458 1851 ## COG0793 Periplasmic protease + Term 152486 - 152537 16.4 - Term 152472 - 152527 11.3 93 40 Tu 1 . - CDS 152595 - 153785 1011 ## COG0642 Signal transduction histidine kinase - Prom 153843 - 153902 8.4 + Prom 153770 - 153829 7.5 94 41 Tu 1 . + CDS 153969 - 155387 1482 ## COG0499 S-adenosylhomocysteine hydrolase + Term 155416 - 155477 5.8 - Term 155162 - 155204 -0.4 95 42 Tu 1 . - CDS 155412 - 155585 99 ## + Prom 155448 - 155507 6.1 96 43 Tu 1 . + CDS 155539 - 158052 2143 ## BT_2796 hypothetical protein + Term 158103 - 158153 6.6 + Prom 158055 - 158114 4.5 97 44 Op 1 4/0.000 + CDS 158336 - 159667 1151 ## COG1538 Outer membrane protein 98 44 Op 2 . + CDS 159690 - 160781 1113 ## COG1566 Multidrug resistance efflux pump 99 44 Op 3 . + CDS 160831 - 162468 1081 ## BT_2793 putative MFS transporter 100 44 Op 4 . + CDS 162507 - 163382 535 ## COG2207 AraC-type DNA-binding domain-containing proteins 101 45 Tu 1 . - CDS 163387 - 164040 724 ## COG0035 Uracil phosphoribosyltransferase 102 46 Tu 1 . + CDS 164321 - 165928 1653 ## COG1866 Phosphoenolpyruvate carboxykinase (ATP) + Term 165951 - 165990 9.1 + Prom 165940 - 165999 7.0 103 47 Tu 1 . + CDS 166079 - 167386 547 ## COG0249 Mismatch repair ATPase (MutS family) + Prom 167419 - 167478 6.7 104 48 Op 1 . 
+ CDS 167504 - 169996 1892 ## BT_4682 hypothetical protein 105 48 Op 2 7/0.000 + CDS 170028 - 170585 421 ## COG2059 Chromate transport protein ChrA 106 48 Op 3 . + CDS 170582 - 171106 472 ## COG2059 Chromate transport protein ChrA + Term 171134 - 171185 6.1 - Term 171121 - 171173 14.0 107 49 Tu 1 . - CDS 171196 - 172995 1771 ## COG1217 Predicted membrane GTPase involved in stress response - Prom 173141 - 173200 4.1 + Prom 172948 - 173007 4.1 108 50 Tu 1 . + CDS 173151 - 173420 446 ## PROTEIN SUPPORTED gi|160883111|ref|ZP_02064114.1| hypothetical protein BACOVA_01079 + Term 173446 - 173483 6.1 + Prom 173473 - 173532 9.7 109 51 Op 1 . + CDS 173607 - 174182 647 ## COG1396 Predicted transcriptional regulators 110 51 Op 2 . + CDS 174218 - 175867 1584 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II + Term 175915 - 175977 8.2 - Term 175903 - 175964 4.2 111 52 Tu 1 . - CDS 176015 - 177097 774 ## COG0836 Mannose-1-phosphate guanylyltransferase - Prom 177292 - 177351 80.3 + TRNA 177262 - 177348 50.8 # Leu CAA 0 0 + Prom 177273 - 177332 80.4 112 53 Tu 1 . + CDS 177521 - 178330 424 ## COG3177 Uncharacterized conserved protein + Term 178460 - 178494 -0.7 - Term 178164 - 178196 -0.2 113 54 Op 1 . - CDS 178356 - 179096 106 ## Fjoh_3691 NERD domain-containing protein 114 54 Op 2 . - CDS 179106 - 179549 144 ## gi|54307151|ref|YP_133696.1| hypothetical protein NBU1_02 115 54 Op 3 . - CDS 179605 - 180324 440 ## gi|54307150|ref|YP_133695.1| hypothetical protein NBU1_01 - Prom 180551 - 180610 3.6 - Term 180442 - 180485 4.2 116 55 Tu 1 . - CDS 180643 - 182046 932 ## BVU_1439 mobilization protein 117 56 Tu 1 . - CDS 182258 - 183214 415 ## BVU_1440 DNA primase - Prom 183258 - 183317 2.9 118 57 Op 1 . - CDS 183446 - 184636 619 ## BDI_2140 hypothetical protein 119 57 Op 2 . - CDS 184641 - 184955 205 ## Amuc_0323 phage transcriptional regulator, AlpA - Prom 185023 - 185082 4.4 - Term 185003 - 185040 6.2 120 58 Op 1 . 
- CDS 185125 - 186066 703 ## gi|54307147|ref|YP_133700.1| hypothetical protein NBU1_06 121 58 Op 2 . - CDS 186074 - 187411 687 ## PG0838 integrase - Prom 187456 - 187515 3.6 122 59 Tu 1 . - CDS 187525 - 187701 92 ## Predicted protein(s) >gi|222159331|gb|ACAB01000028.1| GENE 1 1 - 882 771 293 aa, chain + ## HITS:1 COG:YPO2803 KEGG:ns NR:ns ## COG: YPO2803 COG1472 # Protein_GI_number: 16123001 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Yersinia pestis # 16 285 433 705 793 201 40.0 1e-51 FDNISSAIIRFNAYSLNEAKLRQGLAKVDNVVFCTGFNSNTEGEGFDRPFALLRYQELFI KKIASMHPNVVVVLNAGGGVDFTNWYDAAKAILMAWYPGQEGGQAIAEILTGKISPSGKL PISIEWKWDDNPVHGSYYENLKAEIKRVDYSEGVFVGYRGYDRSGKEPFYPFGYGLSYTT FAYSNMAAEKTGEHQVTVSFDIENTGKMDASEVAQVYVHDVQSSVPRPLKELKGYEKVFL KKGEKKRVAVVLDEDAFSFYDMNQHRFVVEKGDFEILVGPVSSQLPLKATVEL >gi|222159331|gb|ACAB01000028.1| GENE 2 912 - 2024 1108 370 aa, chain + ## HITS:1 COG:YPO0840 KEGG:ns NR:ns ## COG: YPO0840 COG4225 # Protein_GI_number: 16121148 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Yersinia pestis # 52 369 44 352 352 209 35.0 7e-54 MKKTLLLGASLLCSVFIMATEIPFQKAEIKSIMRKVADWQIANPHPAPEHDDLNWPQGAL YVGMVDWAELAEKEDNDDTYYKWLTRIGRRNCWQPDKRFYHADDIAVSQSFLDLYRKYKD EAMIIPTLARTEWIVNHPSEGSFKLVEGDLKTLERWTWCDALFMAPPVYAKLYMLTGEKK YIKFMNREYKATYDYLFDKEENLFYRDWRYFDKREANGKKVFWGRGNGWVLGGLVEILKE LPKKDKNRKFYEELFAKLATRVAGLQHPDGFWHASLLDPASYPSPETSCTTFIVYSIAYG INEGLLDKEIYLPVMIKGWNALVSAVEPNGKLGYVQQIGADPKKVTRDMTEVYGVGAFLM AGNEIYKMAR >gi|222159331|gb|ACAB01000028.1| GENE 3 2078 - 3946 1717 622 aa, chain + ## HITS:1 COG:no KEGG:BT_2458 NR:ns ## KEGG: BT_2458 # Name: not_defined # Def: putative pyridine nucleotide-disulphide oxidoreductase # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 616 26 623 626 790 61.0 0 MKKFLLTCIAVACSLVAVAEELLIEAESFSQRGGWVLDQQFMDQMGSPYLMAHGMGIPVA 
DAMAEINIPQAGTYYVYARTYNWTSPWTDAEGPGKFRLALGGKLLKATLGHTGNSWQWQF AGKTVLKAGTTTLALKDLTGFDGRCDAIYLTTDANTQPATWDTAETAALRTRLRQQQTVP AHQYDFVVVGGGIAGMCAATSAARLGCKVALVNDRPVLGGNNSSEIRVHLGGIIEMGPNQ GLGRMIREFGHERSGNAQPGDYYEDQKKEDFIDAEKNITLYASQRAVAVKMQGDRIASVT IQHIETGGQTELTAPLFSDCTGDATIGYLAGADWAMGREGRDEYGESLAPEEPDSLVMGA SIQWYSKDMKKKTSFPHFEYGVRFDAENCEPVTMGEWKWETGMNRNQVSEAERVRDYGLL VIYSNWSYLKNHYKDHKKYANRSLDWVAYVSGKRESRRLLGDYVLSQDDIDKNVAHEDAS FTTTWSIDLHFPDSVNSVRFPGNEFKSATVHRWIHPYAVPYRCLYSRNVDNLFMAGRNMS CTHVALGTVRVMRTTGMMGEVVGMAAGLCHKHRVEPRDIYHHHLPELKQLMQAGLGKRDV PDNQRFNEPNKLLEVPGAYIKP >gi|222159331|gb|ACAB01000028.1| GENE 4 3954 - 6020 1658 688 aa, chain + ## HITS:1 COG:no KEGG:Phep_0375 NR:ns ## KEGG: Phep_0375 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 27 686 38 698 700 654 49.0 0 MSKKNIITIFLSAICTLPLWGGQQYYAFLKGDTLRIGNNYMERAMLWNNGAPVTISLTDK QHGKNIPVQGKQPDFSIVKGIPTDATFTVNEIPTNGIHASYLQATVACTIGSLNIERRYR IYADCPAIACDTYLKGQVELYQNKEDNRSNADRKNIEHTADMATGVKTPTLDRLQLSGNH WSARTIEFFDYTDWNDNLVAGRTWLPYRRNTYRGNLLFAHDVVTRQGFFFLKEAPSSSTQ LHYPGSDFVADFSDFMVVGLGIASHDVKPDSWTRVYGCVTGIYTGGEQEALTALRLYQKQ LRHHTAAQDEMIMLNTWGDRSQDAKIDEAFCLAELDRAARMGITLFQLDDGWQSGKSPNS KTAGGSFKDIWKNTGYWTPNPTKFPHGLKPIVEKGKKLGIRIGLWFNPSIQNDFADWQKD AQAIIGLYKKYGICCFKIDGLQIPTKTAEQNLRRLFDTVLEQTNYEVIFNLDATAGRRGG YHYMNEYGNIFLENRYTDWGNYYPYRTLRNLWMLSRYVPAEKVQIEFLNKWRNADKYDAA DPFAPARYSFDYLFAITLAAQPLAWMEASNLPEEAYITASLLKKYQPLQLRFHQGVILPI GEEPSGRSWTGFQSTVSGTQGYLVVYREDNEQARGTIDTWLPEGKKVTFTPVMGSGKKFA AKVGAQGRVSFELNDKNSFTLYQYEVKP >gi|222159331|gb|ACAB01000028.1| GENE 5 6255 - 7553 901 432 aa, chain + ## HITS:1 COG:TM0077 KEGG:ns NR:ns ## COG: TM0077 COG3458 # Protein_GI_number: 15642852 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acetyl esterase (deacetylase) # Organism: Thermotoga maritima # 128 409 12 303 325 109 29.0 1e-23 MIKSVRFLLLLIAFVSMQIVAWGQPQERLVQVQVTPDHTNWLYKPGEKVKFKVAVLKCNI PQDNLEVRYEISEDMMKPHQTGKQPLKNEKLEINAGTMKKEGFLRCRAFVTCQGREYEGV 
ATVGFSPEKLQPTTPLPADFLEFWKSTKEAAEKWALEPIMTLLPERCTDKVNVYHVSFAN NDYASRMYGILCVPKASGKYPAILKVPGAGIRAYNGEAERAGKGFIILEIGIHGIPVNLT GDVYHRLYNGALKNYHSFNMDNRDKYYYKRVYTGCVRAIDFIYTLPEFNGNLATFGGSQG GALSIVIAGLDDRVKGLVSFYPALCDMAGYTHGRAGGWPHLFKDAKNRTPEKIKTIQYFD VVNFARQVKVPGFYTFGYNDMVCPPTTTYSAYNVINAPKELFVAETTAHYAYAEQWSAAW NWVMNFLKNKSK >gi|222159331|gb|ACAB01000028.1| GENE 6 7668 - 9464 1746 598 aa, chain + ## HITS:1 COG:TM1062 KEGG:ns NR:ns ## COG: TM1062 COG3250 # Protein_GI_number: 15643820 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 82 557 32 529 563 202 31.0 1e-51 MKTGEKLSLCLLLLIVFAGSVAAQELITNVYGRNIHSLNGKWNAIIDLYDQGQRMKIYEN RQPEGNIDFYEYAFEGGLRLNVPGDWNSQSPELKYYEGTVWYARHFDAKRLADKRQFLYF GAVSYRCKVYLNGIEIAEHEGGFTPFQVEVTDLLKDGDNFLAVEVNNRRTKDAIPAMAFD WWNYGGITRDVLLVKTPRTFIEDYFIQLDKNAPDRIIARVRLSDEKAGEKVTVAIPELKI NAELTTDAEGKAETVLNAKKLQRWSPEEPKLYGVTVSSSADRVEEQIGFRNITVKGTDIY LNGKPTFMCCISFHEEIPQRMGRAFSEADAAMLLNEAKALGVNMIRLAHYPQNEYTVRLA EKMGFLLWQEIPIWQGIDFTDDDTRKKAQRMLSEMIKRDQNRCAVGYWGVANETQPSKER NEFLTSLLETGKQLDTTRLYVAAFDLVHFNSEKQRFVMEDSFTSQLDVVAINKYMGWYHP WPVEPKDAIWEVVTDKPLIISEFGGEALYGQSGDENVVSSWSEEYQARLYRDNIRMFDNI PNLRGVSPWILFDFRSPFRFHPTNQDGWNRKGLISDQGMRKKAWYLMRDYYMKKRNNR >gi|222159331|gb|ACAB01000028.1| GENE 7 9486 - 10340 711 284 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0623 NR:ns ## KEGG: Cphy_0623 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 28 254 1 223 236 155 40.0 2e-36 MKQYKIVVICMCILLLLAGGVYAQQKTVRILAIGNSFSQDAVEQYLHELAEAEGISTIIG NMFIGGCSLERHVKNARDNAPAYAYRKIGTDGKKREKGKMSLEAVLADEAWDYVSLQQAS PFSGMYETYEASLPELIEYVKARLPKKTKLMLHQTWAYASTSKHSGFKNYNCNQLTMYQA IADAVKKAAKVNKIKIVIPSGTAIQNARTSFIGDHLNRDGYHLDVKIGRYTAACTWFERI FKHSVVGNPYAPEGLDEARKAVAQKAAHAAVKHPYKVTELSITN >gi|222159331|gb|ACAB01000028.1| GENE 8 10487 - 11692 650 401 aa, chain - ## HITS:1 COG:FN1382 KEGG:ns NR:ns ## COG: FN1382 COG1373 # Protein_GI_number: 19704717 # Func_class: R General 
function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 7 400 1 401 402 314 47.0 2e-85 MNFPRILKKRKGYIDRIKPFMQKSVVKVLTGQRRVGKSFLLYQLIEEILAKEPDANIIYV NLEDFTFSSLQTAEDLHSYIISHSKEKAKNYIFIDEIQDIPGFEKIIRSLLLNEDNDIYI TGSNAKMLSGELATYLSGRYIEFKIYSLSYSEFLEFHGLTESETSYELYSRYGGLPYLLN LPLEDETVNEYLKSVYSTIVFRDVVSRYKLRNTLFLEKLIQFLSENIGNLFSAKNISDYL KSQHTTISVNQIQSYTEYLNNAFLIHRVERYDLIGKRVFEIGEKYYFENMGIRNIVIGYR ITDKAKILENLVYNHLLYKGYDIKVGYYGDKEIDFIGEKNGEKLYIQVALKIDSDKTAER EFGNLLKIQDNYPKIVVTEDTFSGNSYEGIRHCPIRQFLME >gi|222159331|gb|ACAB01000028.1| GENE 9 12256 - 13266 1107 336 aa, chain - ## HITS:1 COG:Rv2454c KEGG:ns NR:ns ## COG: Rv2454c COG1013 # Protein_GI_number: 15609591 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Mycobacterium tuberculosis H37Rv # 2 335 38 372 373 325 46.0 1e-88 MSDKVYTVQDYKSGQPRWCPGCGDHAFLNSLHKAMAELGVAPHDIAVISGIGCSSRLPYY VNTYGFHTIHGRAAAVATGAKVANPNLTIWQISGDGDGLAIGGNHFIHALRRNVDLNMIL LNNRIYGLTKGQYSPTSERGLVTKSSPYGTVEDPFHPAELAFGARGRFFARCIAVDGAAS VEVLKAAANHKGASVVEVLQNCVIFNDGTHASVATKEGRAKNAIYLEHGKPMLFGENKEF GLMQEGFGLKVVKLGENGITEKDILIHDAHCQDNTLQLKLALMEGPDFPIALGVIRDVDA PTYNDAVVEQIEEIKGKKKYHNFQELLMTNETWEVK >gi|222159331|gb|ACAB01000028.1| GENE 10 13270 - 15120 1930 616 aa, chain - ## HITS:1 COG:MT2530_2 KEGG:ns NR:ns ## COG: MT2530_2 COG0674 # Protein_GI_number: 15841979 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Mycobacterium tuberculosis CDC1551 # 213 606 1 389 425 373 51.0 1e-103 MADEMMVKELEEVVVRFSGDSGDGMQLAGNIFSNVSATVGNDICTFPDYPADIRAPQGSL TGVSGFQVHIGAGQVYTPGDRCHVLVAMNPSALKTQIKFCKPQGLIITDSDSFEARDLEK AQFKTNNPFEELGIKQEVLEVPISSMCKESLKDSGLDNKSILRCKNMFALGLVCWLFNRN LAAAEKMLREKFAKKPEIAEANIKVLNDGFNYGANTHASVSTYKIESKAPKSKGLYTDIN 
GNKATAYGLIAAAEKAGLGLYLGSYPITPATDILHELAKHKSLGIKTVQCEDEIAGCASA VGAAFAGALAVTTTSGPGVCLKSEAMNLAVIGELPLVIVNVQRGGPSTGLPTKSEQTDLL QALYGRNGESPMPVIAATSPTNCFDAAYMACKIALEHMTPVVLLTDAFVANGSAAWKLPN LDEYPAINPPYVTPDMAGNWTPYQRNPETGVRYWATPGTEGFMHRIGGLEKSNETGAIST EPENHNKMVHLRQAKVDKIADYIPELEVLGDEDADLLIVGWGGTYGHLRLAMDYMREHGK KVAFVHFQYINPLPKNTADVLHKYKKIVVAEQNLGQFAGYLRMKVPGLNISQFNQVKGQP FVTRELIDAFTKLLEE >gi|222159331|gb|ACAB01000028.1| GENE 11 15569 - 18580 2843 1003 aa, chain - ## HITS:1 COG:AGc2877_1 KEGG:ns NR:ns ## COG: AGc2877_1 COG0342 # Protein_GI_number: 15888881 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecD # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 389 662 275 548 562 244 44.0 8e-64 MQNKGFVKVFAVLLTLVCVFYLSFSFVTRHYTNKAKEIANGDPKVEQDYLDSLSNEKVML WNWTLKQCREMEISLGLDLKGGMNVILEVSVPDVIKALADNKPDEAFNNALATAAKQAVN SQDDVITLFIREYHKIAPDAKLSELFATQQLKDKVNQKSSDAEVEKVLRAEVKAAVENSF NVLRTRIDRFGVVQPNIQSLEDKMGRIMVELPGIKEPERVRKLLQGSANLEFWETYTARE VLPAMQSADAKLRVILAEGTTADTDTIEAVLTEATPVEKKTVSAADSLAAALKGDVTAED KSAANMEEIKKQYPLLSILQLNSSGQGPVIGYANYKDTADINKYLAMPEIKADLPKDLRL KWGVSPSEFDKKGQTFELYAIKSTERNGKAPLEGDVVTDAKDEFDQYSKPAVSMTMNSDG ARRWAQLTKQNIGRSIAIVLDNYVYSAPNVNSEITGGRSQITGHFTPEQAKDLANVLKSG KMPAPAHIVQEDIVGPSLGQESINAGIFSFVVALILLMIYMCSMYGFIPGMVANCALILN FFFTLGILSSFQAALTMSGIAGMVLSLGMAVDANVLIYERTKEELRAGKGVKKALADGYS NAFSAIFDSNLTSIITGIILFNFGTGPIRGFATTLIIGILVSFFTAVFITRIVYEHFMNK DKWLNLTFTTKISKNLMTNTHFDFMGTNKKSLIIVSAIIVVCIGSFAIRGLSQSIDFTGG RNFKVQFENPVEPEQVRELISSKFGDANVSVIAIGTDKKTVRISTNYRIEDEGNNVDSEI ESYLYETLKPVLTQNITLETFIDRDNHTGGSIVSSQKVGPSIADDIKTGAIYSVVLALLA IGLYILLRFRNIAYSIGSIVALSCDTIMIIGAYSLFWGILPFSLEIDQTFIGAILTAIGY SINDKVVIFDRVREFFGLYPKRDKRVLFNDSLNTTLARTINTSLSTLIVLLCIFILGGDS IRSFAFAMILGVVIGTLSSLFIASPIAYNMMKNKKVVVPATEE >gi|222159331|gb|ACAB01000028.1| GENE 12 18746 - 20830 2027 694 aa, chain - ## HITS:1 COG:XF1944 KEGG:ns NR:ns ## COG: XF1944 COG0339 # Protein_GI_number: 15838538 # Func_class: E Amino acid transport and 
metabolism # Function: Zn-dependent oligopeptidases # Organism: Xylella fastidiosa 9a5c # 24 694 35 716 716 596 46.0 1e-170 MIKKTLTILAVSCMMYSCATKTESNPFFTEFQTQYGVPSFDKIKLEHYEPAFLKGIEEQN QNIEAIIESPEIPTFENTIVALDNSAPILDRVSAIFFNMTDAETTDSLTALSIKLAPVLS EHDDNISLNEKLFKRVNDVYQKKDSLNLTSEQERLLDKTYKRFVRSGANLNAEDQTRLRE INKELSTLGITFSNNILNENNAFKLFVDKEEDLAGLPEWFRQSAAEEAKAAGQPGKWLFT LHNASRLPFLQYSENRPLREQIYKAYINRGNNNDENDNKENIRKIVSLRLEKAKLLGFDC YANFVLDETMAKNASNVMSLLNNLWSYALPKAKAEATELQKLMDKEGKGEKLEAWDWWYY TEKLRKEKYNLSEEDTKPYFKLENVRNGAFTVANKLYGITLTKLEGVPTYHPDVEVFEVK DADGSQLGIFYVDYFPRSGKSGGAWMSNYREQQGDTRPLVCNVCSFTKPVGDTPSLLTMD EVETLFHEFGHALHGLLTKCEYKGTSGTNVVRDFVELPSQINEHWATEPEVLKMYAKHYQ TGEVIPDEIIEKILQQKTFNQGFMTTELLAAAILDMNLHTVTDVKNIDMLAFEKEAMDKL GLIPEIAPRYRVTYFNHIIGGYAAGYYSYLWANVLDNDAFEAFKEHGIFDKNTADLFRHN VLEKGDSEDPMVLYRNFRGAEPSLEPLLKNRGMK >gi|222159331|gb|ACAB01000028.1| GENE 13 20861 - 21946 798 361 aa, chain - ## HITS:1 COG:no KEGG:BT_2833 NR:ns ## KEGG: BT_2833 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 361 1 361 361 593 81.0 1e-168 MEHIGKFVAYLILAVNALFVGTLILSAYSPYLNPKINPLASSLGLAFPIFLTINLIFIGF WAFVNYRYALLPAIGFLICIPQIRTYIPFNSTTKTIPEGSIKILSYNVMSFNNLEKKDGK NPVLSYLVDSNADIICLQEYNTATNKKYLTEQDIKKALKAYPYQSVHQQGKGDVQLACFS KFPILSIHPIKYESNYNGSMKYVLNVNNDTLTLINNHLESNKLTKEDRGMYEDMIKDPNA KKVKTGLRQLIRKLAEASAIRASQADSVAKAIAACKYPTTIVCGDFNDGSISYTHRILTQ KLDDAFTQSGKGLGISYNQNKFYFRIDNILISPNLKAYNCTVDRSIKASDHYPIWCYISK R >gi|222159331|gb|ACAB01000028.1| GENE 14 21951 - 22856 766 301 aa, chain - ## HITS:1 COG:MA3859 KEGG:ns NR:ns ## COG: MA3859 COG0705 # Protein_GI_number: 20092655 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Methanosarcina acetivorans str.C2A # 34 214 39 212 226 88 32.0 1e-17 MGHIIADLKETFRRGNIFIQLIYINVGIFLIGTLINVFLRLFEVSTPDIFGIFALPASFI GFIHQPWSLFTYMFMHAGILHILFNMLWLYWFGSLFLYFFSAKHLRGLYVLGGICGGLLY 
MIAYNVFPLFSSQVVGSTLVGASASVLAIVAATAYREPNYRVQLFLFGAIRLKYLALVVI GIDVLSITSSNAGGHIAHLGGALAGLWFAASLNKGTDLTSWINWILDGFISLFQKKTWKR KPKMKVHYGSSATGREKDYDYNAHKKAQSDEVDRILEKLKKSGYDSLTTEEKKSLFDASK R >gi|222159331|gb|ACAB01000028.1| GENE 15 22837 - 23514 627 225 aa, chain - ## HITS:1 COG:XF0649 KEGG:ns NR:ns ## COG: XF0649 COG0705 # Protein_GI_number: 15837251 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Xylella fastidiosa 9a5c # 1 222 9 211 224 147 43.0 2e-35 MPTVTKNLIIINVLVFFGTIVAQRYGLDLTNYLGLHFFLASDFNPAQLITYMFMHGGFSH IFFNMFAVFMFGPILEQTWGPKRFLFYYILCGIGAGLIQEGVQYIQYVIELSQHTQVNLI GYGVIPMEQYLNMLTTVGASGAVYAILLAFGMLFPNNRLFIFPLPFPIKAKFFVIGYAAI ELWAGLANSTGDNVAHFAHLGGMLFGLILILYWRKKSNNNGTYYS >gi|222159331|gb|ACAB01000028.1| GENE 16 23736 - 24008 353 90 aa, chain + ## HITS:1 COG:YPO3154 KEGG:ns NR:ns ## COG: YPO3154 COG0776 # Protein_GI_number: 16123316 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Yersinia pestis # 1 89 1 89 90 79 60.0 1e-15 MNKSDLISAMAAEAQMSKADAKKALDAFITSVTNAMKAGDKVSLVGFGTFSVSERAERTG INPSTKATITIPAKKVAKFKAGAELSAAVE >gi|222159331|gb|ACAB01000028.1| GENE 17 24092 - 25909 2103 605 aa, chain + ## HITS:1 COG:TP0831 KEGG:ns NR:ns ## COG: TP0831 COG0018 # Protein_GI_number: 15639817 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Arginyl-tRNA synthetase # Organism: Treponema pallidum # 8 605 12 589 589 471 40.0 1e-132 MKIEDKLVASVISGLKALYGQEVPEKMVQMQKTKKEFEGHLTLVVFPFLKMSKKGPEQTA QEIGEYLKANDPAVAAFNVIKGFLNLTIASATWIELLNEIQADEQYGLMQVTDASPLVMI EYSSPNTNKPLHLGHVRNNLLGNALANIVAANGNKVVKTNIVNDRGIHICKSMLAWKKYG NGETPETSGKKGDHLVGDYYVSFDKHYKAEVKELMAKFTAQGMSGDEAKAKAEAESPLMQ EAREMLVKWEAGDPEVRGLWEMMNNWVYAGFDETYKKMGVSFDKIYYESNTYLEGKEKVM EGLEKGFFYKKEDGSVWADLTAEGLDHKLLLRGDGTSVYMTQDIGTAKLRFADYPINKMI YVVGNEQNYHFQVLSILLDKLGFEWGKSLVHFSYGMVELPEGKMKSREGTVVDADDLMEE 
MIATAKETSQELGKLDGLTQEEADDIARIVGLGALKYFILKVDARKNMTFNPKESIDFNG NTGPFIQYTYARIQSVLRKAAESGIVIPEQIPAGIELSEKEEGLIQMVADFAAVVKQAGE DYSPSIIANYTYDLVKEYNQFYHDYSILREENEAVKVFRIALSANVAKVVRLGMNLLGIE VPSRM >gi|222159331|gb|ACAB01000028.1| GENE 18 25973 - 27160 901 395 aa, chain + ## HITS:1 COG:RSc0230 KEGG:ns NR:ns ## COG: RSc0230 COG4292 # Protein_GI_number: 17544949 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Ralstonia solanacearum # 19 352 19 353 398 248 41.0 1e-65 MFYSPHPLLRKRETETATVSYSELLFDLIYVFSVTQLSHYLLHNLTWEGLLKETILWFAV WMLWQHTIWVTNWFNPDTRPIRILLFISMLVGLVMAAAIPYAFTYRGLIFAVCYVLIQAG RTLYIIGVLGDHHLAANFKRIMGWFCISAVFWITGAILQGEWQILLWIIAAICDYTAPMH GFALPRLGRSDSSKEWTIEGHHLVERCQLFVIIAFGETLLMTGASLSEVEEWTPLVIISA VISFIGSLAMWWVYFDVSSEAGSRKIQEVKDPGKLGLIYNAIHIVLVGALIICAVGDELI VAHPEQEMRAEVVFVLIIGPIVYILANSIYKYVTCRMLPLSHIIAVIALALLLPWPYHIS LLTMNILVTSVFIFVIVFDMLFPNKGFKIKWEPKI >gi|222159331|gb|ACAB01000028.1| GENE 19 27397 - 29421 1318 674 aa, chain - ## HITS:1 COG:no KEGG:BT_2828 NR:ns ## KEGG: BT_2828 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 674 1 660 670 383 35.0 1e-104 MKQTNLKYLMIGVLTIASISCMDKERDLSWERRHMPKEAYFDFNMTQAVALDINYCFKSE NYRVLFDIYDQDPIEYSADGSVSQKDIEPIYRAVTDEEGKFSGEMNIPADLSEVWLSSDY LATVSPLKLTIDDSRRLSFNQDAYIATLRSQTASKTRGVTVNQHTYLKEWHVLPDADWDD SGRPTNLEPKINIPPADVLYNIKYVFRKVTVKDESGKSKVMNISQNYPEFFDGSIKMTSD IPIVNPTEVSLVFITSSAAWYNTVGYYTYPTNNPPQSASDIKQIIAFPNTSPIYKTLGAG ALVCGEEIKLKYWNEDTQEYEDKFPAGVTIGWCLQGMGFKSKLTSETDKDKVGDIIKGMG ARYSTRNLNTNNTQRTVSLRDSKSGQIVAVGFEDNIDFDYADAIFYIHTSEKNAIDPTLP PLPEDPEAIPEQYKISYSGTLAFEDLWPKLGDYDMNDVMVKYTSTMTRNALDNRIYEIED KFILQHCGGYLQNGFGYQLHKLSNSNVKSVKITGPDASGLSSSIYMEGKETEPGQSHPTI LLYDDMTKFKNITDESKKEYTVTITLDGASEKEVVPPYNPFIFISSNEGRGKELHLINYP PTDKADLSLLGTGKDIYRPEEGMYYVSADLMPFAINMPVSNLPVPEEGKRIDQSYPKFSG WVSSNGKQNKDWYK >gi|222159331|gb|ACAB01000028.1| GENE 20 29494 - 31839 2230 781 aa, chain - ## HITS:1 COG:AGc2398_1 KEGG:ns NR:ns ## COG: AGc2398_1 COG0550 # 
Protein_GI_number: 15888629 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 585 2 579 580 437 41.0 1e-122 MQKNLVIVESPAKAKTIEKFLGKDFKVLSSYGHIRDLKKKEFSIDVEKNFTPSYEIPADK KALVNTLKTEAKDAETVWLASDEDREGEAIAWHLYEVLKLKPENTKRIVFHEITKSAILK AIEQPRDIDLNLVNAQQARRILDRIVGFELSPVLWRKVKPALSAGRVQSVAVRLIVERER EIHAFQTEAAYRITAVFLVPDTDGKLVEMKAELARRIKTKEEAKAFLNACQGASFAIDDI TTRPVKKTPPAPFTTSTLQQEAARKLGYTVAQTMMLAQRLYESGFITYMRTDSVNLSEFA TTGSKDAIIKMMGDRYVHLRHFETKTKGAQEAHEAIRPTYMENQSVEGTAQEKKLYDLIW KRTIASQMADAELEKTTATITISGSSDVFTAIGEVIKFDGFLRVYRESYDDDNEQEDESH LLPPLKKGQKLEHGPIIATERFTQRPPRYTEASLVRKLEELGIGRPSTYAPTISTIQQRE YVEKGNKDGEERQFNVMTLKDRQIKDENHTEITGAEKAKLFPTDTGTVVNDFLTEYFPDI LDFNFTASVEKEFDEIAEGEVKWTSIMKNFYDKFHPSVENTLAIKTEHKVGERILGEEPG TGKTVSVKIGRFGPVVQIGTVEDEEKPRFAQMKKGQSMETITLEEALELFKLPRTLGEYE GKSVSVGVGRFGPYVLHNKVYVSLPKTLDPMEITLEEAEQLILEKRQKETERHIKKFEEE PELEILNGRYGPYITYKGSNYKIPKDIVPQDLSLASCLELIKLQDEKGPATTKKGRFAKK K >gi|222159331|gb|ACAB01000028.1| GENE 21 32019 - 36089 2277 1356 aa, chain - ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 813 1207 269 659 676 127 28.0 2e-28 MKKIVLLITIQILCMHSFVVNAQRIFSTSYTMDDGLAANRVYSILQDSCGFMWFGTDDGL SRFDGIAFKNYYLSEYINATTSNSVKKIFIDRRGRMWIGLDTGIVIYDSKTDTFQPFHAK TETGEMIQTYVTDMIEDSDGEVWIATSGEGLYRFSPSDKVRLRVYRNIPGKINSISQNII MTLQQDSKQNIWIGTYSEGLCCFDKQKNTFVTYKKSNHPNSLSDNSIQKVFEDSHGNLWI GTFQNGLDLFDPVAKTFTNYRDRSPNNLLYHIHDIKEYRPGELFISSDNGMGIFKADKGE IIQSDNPNLKIRTGANKFIYTIYIDKEESLWLGSYFDGIKFYSAFQNNFKYYSCSSSSTP QSGKVINVIKEDKDDKYWIGTDDNGIFRFDAKTQEITPFRDAASIGTTYYCIHDLLVDGD KLYAATYGRGLEVIDLKTGKVESFLNNPEDSTSIPSSRVFTLYKASNGCIYVGTSAGFCY YNPEKNNFIRIGSFTGKISDIIEDYFGRIWIGTSISGLYSYNVRTQKITSYQRSDHPNSL TKNVITTLAIDSKKQLWVGTYGQGLCRYNDETDDFTRYDHLKLPNKIINSIIPKGDLLWI 
STNKGLVVYNPDTKHIKTYSKSNGLYNEQFTPCSGFESSDGRLFLGSTDGFCYFFPQDLR ENTYNPPVILTNMTIFGKDIQADTPNSPIHQSIGYTDEIVLKYNQSMIGFDFAALSYIAP KENDYQYMLEGLDSEWQFTKGSNNHLSYANLPVGEYVLRIKGTNSDKLWSSNEVQLKIKV LPPFFRSQLAYLIYALVLLIAIMLTVWYYVKRTEKRQKERIKRLNDEKEKELYNSKIDFF TNIAHEIRTPLSLIIGPLEYLMKTSSINNVYGEYLSIIEQNYKRLYALVTQLLDFRKVDT GSYKLSYDCYRIKEIICKVSCIFELSARQKKVAIDTSSIPEELSIVIDEEAFTKIISNLL SNALKYAKSTIRITTIEKDSEIVVTVTDDGIGITDQEKTKIFDAFYQVKNNSEINKLGIG IGLHMTRSLVQLMNGKIEVSDREGGENGVSISVYFPKQAAITALPQVAKRVEDTIIPENS IEENELESTLPGEPLKKQYAIMVVDDNPEILDFLSKILSEEYFVISASSGEEALQILEKN NIDLIISDVMMEEMDGFELCGKIKSNINISHVPVILLTAKTDTESKIKGLEAGADAYIEK PFSPFHLKAQLLNLLKKRESQQKTYASTPLSDLHSAVHNKLDEEFMNKCTEIIQNNIEDS EFSVNTLAQELGMSRTSVFTKIKGIIGMTPNDFIKVTRLKKACKMMVEGEYRVTEIGFLV GFSSSSYFAKCFQKQFGMLPTEFLKKIKEDPSSATK >gi|222159331|gb|ACAB01000028.1| GENE 22 36496 - 37095 554 199 aa, chain + ## HITS:1 COG:no KEGG:BT_2225 NR:ns ## KEGG: BT_2225 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 199 1 199 201 213 48.0 4e-54 MSVNYDLYETPDLAKSGEEQPLHARVVLKGSYTAEEFVEQVTVFQHMPHAQVVGVIEAIS KELRHLLLKGFSVELGDIGYFSLSLSVDKKVMDPKDLRSPSVSLKDINFRVNRQFKKDIE SEIELQRYHSPFRVKNPLDRERCLQRLEKFLEDHPCINRQDYAMLVGKTKIQALQDINAF IKDGVLKKYGSGRSVVYIK >gi|222159331|gb|ACAB01000028.1| GENE 23 37645 - 38733 810 362 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717238|ref|ZP_04547719.1| ## NR: gi|237717238|ref|ZP_04547719.1| predicted protein [Bacteroides sp. 
D1] # 1 362 1 362 362 768 100.0 0 MGILKCILIACSCLVWATHAFCVEVKDTIEYRAQVWVDKIDLERYGGEASFKKNLQQMFH NTTRFWNESPNKFNYYFRFVPADELCVYDIQGDKDRYGEFQRKAFGRLDLSKYDFVLFLA LGAKNEGLSCGGGGASGQSVVMCYVREAHNIFTDALYPDQGTYSNLGHEYGHVRGATDLY QYMIAAEDNPVSHEKLTPPKCNMGTGYRVWSDYCSALFNYTAKMRPLDKDLSDKVFPRKL VIKVGKMGKPLSNCTVNFYGTRAGGKYNKRDVYPKAYRTYTTSKRGIIEITDLYKLYHPD MTDANIPPKEPQDLFPYSYWFSFLVEIIDEVGQKKYVWLPDVELQREHLETGNDTYTVNV TF >gi|222159331|gb|ACAB01000028.1| GENE 24 38746 - 40077 567 443 aa, chain + ## HITS:1 COG:no KEGG:Dfer_2199 NR:ns ## KEGG: Dfer_2199 # Name: not_defined # Def: hypothetical protein # Organism: D.fermentans # Pathway: not_defined # 47 438 38 438 442 377 45.0 1e-103 MCCSKRNFIYLFCGLMLALNIQMLQAKLGNKIIFIPEDDLKKHGFDVPDGRFGYDCMAES DNLVIFWERSFGKEPAVNMDESKRFYPNEILSEGERYYRYFVDKLKFVQKGKSYTDKYKM IIWMYDDNEKTVYGGAHDNVGMTWFRPCRINGYPYCTLAHELGHSFQFMVEADGGKGFPG TTLYEYTSQWMLWQVHPDWVTIENYHLNNYMKQTHYTLFHKTNQYCAPQFMEYWSYKHGL PVIGRMWSEALKEEDPVSTYMRITKTSQDLFNEEIYDAATRFVTWDLPRIKSVCSSYANE HRCKLKKMGNGWYQITKEYCPQSYGYNAIRLKVPKGGTVIDLTFEGMAGNEGFYSENKAY AGWRYGFVAMQKNGKRVYSKMNRAVNSVNQTVNFIVPDDTQFLWLVVTGAPDKHTKYHQK MEQVAEWPYRIKISKTSFHPDTL >gi|222159331|gb|ACAB01000028.1| GENE 25 40462 - 42603 1082 713 aa, chain + ## HITS:1 COG:SP2146 KEGG:ns NR:ns ## COG: SP2146 COG3669 # Protein_GI_number: 15901959 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Streptococcus pneumoniae TIGR4 # 23 496 1 451 559 232 30.0 2e-60 MSCKLSFTLLCIAMLASCISTKMDPPQPYGVLPTDRQLAWHEMEMYNMIEFNQNNALDIH WGWGNESPELVNPVDFDAEKIVLLAKKIGSKGFLINAKHHGGFCMWPTKTTDYNISKSPW KDGKGDWVGEWAKACRQHGLKLGLYLSPWDRNAACFGTPEYLKMYKEQLTELCTNYGELF TLWFDGAPGAGGSGYYGGANEYRGGFIEYYDWDNIYAIVRELQPNAVIFNDPGPDIRWVG NEDGYAGRYKKTCWATLFPEGNWNMRPRPNSDGERIKMLNEGQRHGKYWIPAECDFSLKR RFAYHTTDSLTTKSPQKLFDIYLASVGLGQGFDMCLPLTPKGVMEWRDVKSLEGFAELMN KTFSKNFIKGARLVPSNVRGNITRHYGTAKLTDNDRYSYWATDDTVTNASLMIELPKKQV FNLIKVRENIKLGQRIDSMKVDAWVDGGWKEIAEATSIGACRIIRLENDLNTDRIRLRFF APVALAVSEVSLFKEPDNLEAPKIYRKKDGMVSIRTDRPVISIRYTTDGTEPSFTSNEYK 
EPFLFDKQGVVRAAVFTSDKKSGEITSVIFDQCKKNWKIISPVKSGVDNMIDDNVESYFH TYDAGNKKEFVPDEVIVDMGTTIPVSEIIYTPRQDMYRNVDGVIENYEIYLSEDGLKWEM ASKGVFQNIRQTLVPQEIKLNKVYRSRYLKFVVNGVLEKNFISVAEIGVRTVD >gi|222159331|gb|ACAB01000028.1| GENE 26 42763 - 45873 2001 1036 aa, chain + ## HITS:1 COG:no KEGG:BT_4247 NR:ns ## KEGG: BT_4247 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 40 1036 112 1117 1117 1189 59.0 0 MKYYLRRIAFVTLGAFIVAGTTSLPAIADSGVKGAVSAQQQQSKRQVTGTVTDALDGSPI IGANIALKGTQTGVISDLDGNYSIPVNSTKDILVVSFIGYKTREVPVETSGIIDIKLTSD NEMLDEVVIVGSGTQKKVSVTGAISTVKGLELKAPSSSLTSSMAGKLAGVIVNTKSGEPG TASEFFIRGVSTFGGRATPLILLDDVEIATGDLDNIPAETIESFSILKDASATAIYGARG ANGVMLITTKKGQENSKTRINITLENSFNKPMNFPEFVDGATWMEMYNEAELTRNKSIRQ GRYSRQQIEGTRYGLNPYVYPDVNWGDLIFKDFATNQRANINISGGGQKVTYYMSLNVNH DTGLLDSPKYYSWNNNINNMGYNFQNNISVKVTSTTTIDLNMNTQIRNNRGPKNSVDDLF KMMLSSNPINFPAVFPAQEGDQHVKYGNSILTANYTRINPYAYMATTFKQVDSNMLHTTL KIKQNLDFVTKGLSANALVNFKNWSSSEYNRTIEPYYYRVKENSYDPKNPVSYELERLGT SGKQYIAESGITRSGDRTIFMQFQINYTRQFGKHNVGGMLMYMQRDFKKDVLPNRNQGFS GRFTYDYGQRYLLEFNFGYNGTERLAKGDRFEFFPAASLGWVVSNEAFFEPLRDKIDNLK LRASYGLVGSDETGTDFGHFLYLDKVELDNIKYSTGYDWQTTFGGPKISSYYVLDAGWER SKKLDVGLEMTLFRHWNITADYFYEKRSNILMQRKTWPSQLGYSATPWSNVGKVDNWGWE FSTNYHHSINKDLSMDFRGNFTYVENKYVYRDEPYTPYPWKRETGRPLNSTWGYIAEGLY QTEDEIKYGPKVELGSTAQVGDIKYRDLNGDGVINADDQGMISEYGGNPRIQYGFGLNIN YKKFDLGVFFNGSAMRKIMLTGIHPFGESDYNIFKFVAKDYWTEANPNPNAAYPRLGLQK ANSDNNVVPSTFWMRNGNFLRFKTLEIGYRFCRYGRIYVTGDNLAVFSPFKEWDPELEWY KYPLQRTFNIGLQLNF >gi|222159331|gb|ACAB01000028.1| GENE 27 45894 - 47876 1343 660 aa, chain + ## HITS:1 COG:no KEGG:BT_4246 NR:ns ## KEGG: BT_4246 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 658 1 642 642 532 43.0 1e-149 MKLLNTIRILSGALFISLCCLTASCNFLEIVPVEQADLDDATQDFNSTLGFLHSCYAGIQ SPVAFENTEASADEYAVPLEYSLQGYNSLKVAHNLFTGNNNDGGKWSNYYRYIGQIHLFL QELKKAPVSDLIKEEWEAEAYFLMAYYHFEVLRCYGPCPISDRLLPQTASENEYPGRMHY 
DYVTDWIVRLLDEKVLAGDKLRNTRSNTEVGRATRPIALALKAKVLLYAASPLWNGKFPY PEWKNVKLGKAMETPGYGTALVSTTYDREKWVRAEEAAQEALNAALDAGHYLFGTRAGDE SLQEQKGLPMPYIPNIEGTDTEVKAFQNKVLTLRYMMTAKGSDGNLEILWELHKTTNFVF GSIPNNIMKLNNGSEYGGWSSTSPYLYAVEHFYTNKGTLPKIAASKNEYPAESKWLDPAG VTGRSELFNICVAREPRFYAWLGFHQGDYSSRLANGQPIQIDTRDPDKQGYNPAKYNRNH LVTGFAIQKFLDPNTQLDLSGGGGIPNPAYPLIRMADLYLMLAECQANLSSDGKSDTYAT EALKNLNAVHQRAGLKAITKEDLTDMPLMEWIKNERYVEFYAEGQRYFDVRRWVEGGKYL AAGKREGLRAEEKGLEYDEFRQRIKVNQPYEWNNRQYILPIYYVDVNSNPQLVQAPGYGN >gi|222159331|gb|ACAB01000028.1| GENE 28 47898 - 49328 1223 476 aa, chain + ## HITS:1 COG:no KEGG:BVU_0948 NR:ns ## KEGG: BVU_0948 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 367 1 374 450 95 23.0 5e-18 MKNIYILFLLGVVVTTLSCEDGNKNYTEEFDSVYYYLKEGVNEFDFYSVNSAKVQTITVG RGGHGLNGVSTINLVPFTQAEMDEYNATIDGDYKLLAPEYYDMPATVAFEDGVEYRNVDV VFKGNMSELAKTGEYMLPIRLEADNGTVNDNKNIIYFKPNIIVPSIITDPTGVHVIRMEE GKREKQSFELAFYLDVKNEWDLKLRIENNEDELEKAVRQYNETMGVDYLLLPKENRSFES TILFPAGEALASSFVEVNNNNLMMDDYLLPIIPKKVVGMPFDVREAICYIHVMVTGELKL ISLADDMLSSNSIHPGQPLRNMLDNDIKTYYESIWTSNTDPKHDSKYGVYIDIDVSDVTP MIEKQIKIDYSTREYANAVPNQIILYVGTSVDNLRKVGELSATKNNLPKKGGVWTDDSEQ NVDMPQFSVGRQEISLIRVSFLSSANGTQIKNLTQSGAISGDQNSVAISGFRLYGK >gi|222159331|gb|ACAB01000028.1| GENE 29 49355 - 51478 1248 707 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717244|ref|ZP_04547725.1| ## NR: gi|237717244|ref|ZP_04547725.1| predicted protein [Bacteroides sp. 
D1] # 1 707 1 707 707 1347 100.0 0 MEALMKNIWRIIPVLLLSVAMAGCSDDDDTTSSFFELSSTDLECNSNNVIDVVVPIEGQT YKLTVKSSVDVIWTTKLEGGSWIVATPVTKQSGDGEILIVASSNPTNIKNRVATVTIRNS VNDEIYRYAFTQSYDPNQMAQKEELYAYIHYNEDRYNKEELIYTTSINKVEGNVSISSLG QEELDKYNEDNLTLYKLLPTDCYTLPNSISFDGEVSALLDIKFNKKTGDLSTSEEYMLPI IVSVDDRKAELIWLVVNIYDRNISRSVKQFDFYNANQDITTLLTINKEEGEVTLSAFTEE ELNAHNSLYGTSYMSIPEDYIVMPASVDFDDSDLSKDVKITFKKEVGELDEDKEYVWGIR VYVDGFPVSEIFMKPRIATPVVTMESKEYKHILILDQGEKKASYGFNLSLDVINQWEFTV EFEENETALKAAVDKYNADKGTSYTLLPKGNCVLSSSEFTQEDNEKLVTATLSGDGLTLD KDYLYPIIPIGCGNSPFEVKEKITYVHVIMETKVESISDLRSINLEANMLKASATDGGNN VGNLLAYNNYWQSIWSPNGGKNDPKIDPVYGVYLDIDFSNKPLSQAFSFNYLPRNWPNAV PNEIVVYAGTSNDDLKKIGELKYEDNQLPFADKTWIGRVSEEDLSKLSLYSLKESGATLV RLSFISSRDKTYSNNSVKNILGYDYTKWNNSGDYPCVALQQLKIYGK >gi|222159331|gb|ACAB01000028.1| GENE 30 51515 - 53185 930 556 aa, chain + ## HITS:1 COG:no KEGG:Dfer_2199 NR:ns ## KEGG: Dfer_2199 # Name: not_defined # Def: hypothetical protein # Organism: D.fermentans # Pathway: not_defined # 138 544 27 436 442 288 38.0 5e-76 MNILKWMWKILPIMLLAVNATACNDDDDSGKRERPLTFEVTFLSLVCNYTRPVVVNAPFT GKDYHLTVTASPEVTWKVEVVSGDLVTATPSGEQSGSGEITLTVAANPAKDPGKKSEVVI TNNANDDTYRFVFTQVEKVLLIPEHTMIGQSGPELFENENSKFNKHYMKESDNVALFWEK SFGTNPQNAERKFNPDDLLKMAEEVYDFMKYDLYFGNKEESVTNKYKLIVYVINDSEGGA TGGGNYPVGELAIRPHHSGNPNMVYHEVSHSFQYLALWDSGIKDAWFPGPIFEMTSQWTL LRRSPEWIDQEFNHFTNYISGTHRSLGHNDNAYNNPYMFEYWANKHGVEIMSRIFQETTL DDKTESGQLNFIKTYKRLTHINQEQLNEEMYDAASRFITWDLPRIEMAYAARGANVHTCQ LVQLGVTYRISPERCPSNYGYNGIKLTVPEAGTTVKVNFRGIINSSEYNIHKPNNAQWRY GFLAVLKDGSRVYGEPSKEDIGSASLQVPENTEHLWLVVAATPKEIYDTGADNQWPYQFT LDNTEPDGDKCRVIKK >gi|222159331|gb|ACAB01000028.1| GENE 31 53306 - 55339 1107 677 aa, chain + ## HITS:1 COG:SSO3036 KEGG:ns NR:ns ## COG: SSO3036 COG3250 # Protein_GI_number: 15899743 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sulfolobus solfataricus # 23 578 6 552 570 166 28.0 2e-40 MKHRLWAFVVAFLCYSSIVFAERQDILLNNQWKFRFSHQVQKGTEVRVDLPHTWNAQDAL 
SGKIDYKRGIGNYEKKLFVQSQWKGKRLFLRFEGVNSIANVFINRKHIGEHRGGYGAFVF EITDRVEYGKDNSILVRVNNGEQLDIMPLVGDFNFYGGIYRDVHLLVTDEICISPLDYAS PGVRLIQDSVSHEYACVRALIDLSNGSDKLRELELNIILSNAGKVVKEKNEKVTLYPDSV SQKEMRIELNNPHLWNGRQDPFLYQVEVSLLHNGEILDQVIQPLGLRFYRIDPDEGFFLN GEHLPLRGVCRHQDRSEIGNALHPEHHEEDAALMLEMGVNAVRLAHYPQATYFYDLMDKL GIVVWAEIPFIGPGGYEDKGFVDSPSFRANGKEQLKELIRQHYNHPSICVWGLFNELAEY GDNPVEYINELNVLAHQEDTTRPTTSASNQSGKINFVTDLIAWNRYDGWYGGTPQDLGKW LDWMHKEYPEIRIAISEYGAGASIYHQQDSLVKTNPTSWWHPENWQTYYHIENWKAITER PFVWGSFVWNMFDFGAAHRTEGDRPGVNDKGLVTFDRKVRKDAFYFYKANWNKEDPMIYL IGKRNIIRVQRQQTIIAFSNQPEAELFVNGKSCGKVETDRYSILEWKHVILTPGKNKIEV KTTNKKQTLTDMYYCIL >gi|222159331|gb|ACAB01000028.1| GENE 32 55422 - 57617 1611 731 aa, chain + ## HITS:1 COG:BH2223 KEGG:ns NR:ns ## COG: BH2223 COG3345 # Protein_GI_number: 15614786 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidase # Organism: Bacillus halodurans # 67 681 70 695 748 344 33.0 4e-94 MIINMKQSLIAAIGLLLSSSLFAKDYLISTSKTSLLITANAGEPSKMQYYGVSIAPGQIQ SIYDAGLVLNANSYPAFGLQTVGEKAIAITQPDGNMSLDLAVDAVTQYTSKDGEITEVLL KDKIYPVWVKQCYKAYKGTDIITTWVEIGNAGKKPVTLFQFASAYLPMFRGDNWLTHFHG AWGAEHTMDEEKLTNGQKVIANKDGLANTETDNPSFMLSLDGKPQEETGKVFGGVLAWSG NYKLKLDVRNTALNIIAGMNEESSQYILEPKETFVTPEFAMTYSTNGKGGVSRAFHHWAR QYKMNRGERLHDILLNSWEGVYFKVNQKGMDEMMEAFSAMGGELFVMDDGWFGNKYPRNG GNSSLGDWEVCKEKLPEGIEGLLASARKHHIKFGIWIEPEMSNTKSELFEKHPDWILKID NRPLSTGRGKTQVVLDLTNPKVQDFVFGVVDNLMANYPEISYMKWDDNCSLLDYGSSYLP KNKQSHLYIEYNRGLQKVLQRIRAKYPELVMQLCAGGGARVNYGLLPYFDEFWTSDDTDA LQRIYMQWGVSGFFPAIAMASHVSADKNHQTGRRIPLKFRFDVAMSGRLGMEIQPKDMND VDKAFAKRAIAAYKDIRPVVQFGDLYRLVSPYDNKGVSSLMYVTSEKNKAVFYAYKISHF INMVIPKVCMNGLDPSKNYRLVDLTPVDSSKPCDLNGKIISGKILMEEGIALKSLLKSEY SSLALQLQAVD >gi|222159331|gb|ACAB01000028.1| GENE 33 57632 - 58360 131 242 aa, chain + ## HITS:1 COG:FN0505 KEGG:ns NR:ns ## COG: FN0505 COG2071 # Protein_GI_number: 19703840 # Func_class: R General function prediction only # Function: Predicted glutamine amidotransferases # Organism: Fusobacterium nucleatum # 10 238 6 
241 243 178 41.0 7e-45 MRMIKREIPVIGISTEIDIPGRISVKRKYVDAVLQAGGIPFILPFTDNVQILQSVVSFID GLLLTGGGDISPVIYGESTLPECGECCRDRDDFDYALLRLASERQIPVLGICRGMQIINT YFGGTLYQDLPAQYPSEINHRSPDAFMILQHNVRCLRTGKLYSVTGKESLKVSSIHHQAV KKLACGFKASAFADDGVIESIESDSEHIWGVQFHPELQAVEGDEAMKKLFVYFISQARTL SC >gi|222159331|gb|ACAB01000028.1| GENE 34 58505 - 61351 1817 948 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 96 433 117 453 1087 86 24.0 2e-16 MRNRLAGTLLLLFVFLCTSWAQTSGLRKMIDLSGTWSFALDPHGKGEQEKLWNMDFAETV TLPGTTDSNRKGIENTNKSETTYLSRYYKYEGAAWYSRNIEIPADWKEKHIVLFLERTRP TTVWIDGREVGKCNYLSTPQEFNLTHFLAPGSHKLTIRVDNGKSIPGQIRSSSHACTEST QTNWNGIIGRMELQAMNPLFIESIQAFPQVEDRSVKVKITLSNSSGIGGKKLQLSASAFN TAKKHKARSAEYNLKEGSKEYEFTYQLGDNAVLWSDLQPALYQLKAEINGIDEKTVRFGL RDFKTEETHFTINGAKTFLRGKHDACVFPLTGHTAMDLEQWRRYFRICKEYGLNHCRFHS WCPPAACFEAADLEGIYLQPELPIWGGFKKESAELMDFLMKDGENIMREYSNHASFVLFA LGNELGGDINVMKEFVDRFRSIEPRHLYTYGSNIFLGSRGHIPGEDFLVTCRVGSGEGYS THARASFSFADAEEGGYLNNTYPNSVMNFDEALEKSPVPVIGHETGQFQTYPNYEEMKKY TGVLAPWNFEVFRDRLEKAGMLEQADDFFKASGAWSVELYRADIEMNLRSKRMAGFQLLD LQDYPGQGSAYVGILDAFMDSKGLVEPKKWREFCSEVVPLLTTAKFCWTGGESFAGTVEI ANYGETSLNEKSISWELKNGKKSLGKGKMAIPSGLGLLTAGTIRLTLPDVEQAYKAELLL KVSGTSYQNSYPLWIYPAKKQLKAGNVVVARQLTDDVLNALKQGGKVLLMPLEEDCKEVT VGGLFQTDYWNYRMFKSICDRIKKPASPGTLGILTNPEHPVFDDFPTEYHTNWQWYPIIK HSYPLILDGMPKEYRPIVQVIDNVERNHKLGLLMELNVEGSKLLLCMSDLEAVRDTPEGL QFYAALLAYMNSSDFKPSTSLSVESFKNLFETGVRKEGIKVLDNISYE >gi|222159331|gb|ACAB01000028.1| GENE 35 61576 - 63777 1803 733 aa, chain + ## HITS:1 COG:STM2166 KEGG:ns NR:ns ## COG: STM2166 COG1472 # Protein_GI_number: 16765496 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Salmonella typhimurium LT2 # 32 727 37 759 765 592 43.0 1e-168 MMKRLITYSCMAIFAMGSMAQNQKDEAQMNRFISGLMKKMTLEEKVGQLSQCSGGFATGP 
DNTRISRTEDISKGLIGSLLNVSGAANTRKYQEAAMKSRLQIPLLFGLDVIHGYRTGFPL PLAEAASFDLDAIERGSRCAAKEAAAAGLHWTFAPMVDISWDARWGRVMEGAGEDPYYGS LVAKARVHGLQGDDLSKENTILSCIKHFAGYGAAIAGKDYNSVDMSMGQFVNFYMPPYKA GAEAGAATFMSAFNDFNNEPSTGSTFLLRELLKNQWGFKGFVVSDWGSVGEMMNHRYAKD EKEAAYKGIKAGLDMEMVSECYSKNLVSLVKEGKVSIKLVDDAVRRILEQKYKLGLFDDP FRYCDEERERTVIGSQESRKEACYVSERSIVLLKNENSVLPLSSSIKKVALIGALSKSQK DMCGAWSCAEVGKVVTLYEAMEKRGVDINYNDGYDLKTNKIVNLDQTLAAARQSDVVIVA MGEQAWESGEMRSKGDISIAAEQQRLVSELVKTGKPVVVLMMCGRPVIFNEVRREAPAIL CTWWLGSEAGNAICNVLWGDYNPSGKLPMTFPQHNGQIPLYYQYKSTGRPTALGGWCAKY IDIPTEPAYPFGYGLSYTTFNYSDVKLLPGDGKGTYTRVAVTVTNTGKCEGEEVVQLYVR DEVASITRPIKELKGFEKIKLSPGESKTVTFDVKDEQLGFYNNQMKFIVEKGDFTLMVGG NSRDLQEIKYILN >gi|222159331|gb|ACAB01000028.1| GENE 36 63786 - 64874 825 362 aa, chain + ## HITS:1 COG:no KEGG:Slin_0490 NR:ns ## KEGG: Slin_0490 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 40 362 31 332 691 94 25.0 6e-18 MKNNWKRWLIFPALLPFFSWMEAKTLEENQIRQSLDTIEYRVFVAVDKAGVEHWGGKEAY QAKLNAFFDQVNDFWNKAGNGRFNYYFRYIPDLQVIYDCSSRQLEKIYQKSAGFPNHDVL LIIDSILDFDDEESAKGWYCGGGADDLNMVICRSRSKTEHEDLFGIDYFHRGVAHEFGHY RGVTDLYADRIRAKNNPVNHIEYEPDSCVMNSHYKTYKWSSYAVHIINHTAKSKRPRRDF DGFFKQMFPENIQVSVKVKGKKQKGVKLNLYGSRAKFNDLIATPYRTYETDKKGEYLITG VPNLYDSPAPPLHTDELPYNRWFTFLLEAEYKGEKKYVWLPEYEVQQTFFENKDTYQVTI DF >gi|222159331|gb|ACAB01000028.1| GENE 37 65042 - 67042 1800 666 aa, chain + ## HITS:1 COG:no KEGG:BT_1871 NR:ns ## KEGG: BT_1871 # Name: not_defined # Def: putative alpha-glucosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 664 1 661 662 754 52.0 0 MKRMMKRLLPVLMAIAPLTICAQKVYELKDISGKVQVNVIVNDKNVEYSVLHDNDVMVAP SPIFMKLTDGTAFGLNPKVKKISRRSVNETIYPPIYKKKSIKDQFNELTIDFKGGYSLVF RAYEDGAAYRFVSELKKPFMVESEQASFCFPNDPKVFVASPKGRMNEGKKDPFYSSFQNT YLETALSAWDKEQIAFLPVLVEGKNGKKICITEADLMNYPGMYVKHGEHGYSLDGIFAAY PKTIVDEVRGLKGVVKSREPYIARVEGNTAFPWRVMVIAKDDAELLCNDMVYKLATPAQF TDFSWIKPGKVAWDWWNDWNLYNVDFRAGINNETYKYYIDFASKFGIEYVILDEGWAVPG 
KADLFEVIPEIDLKELISYAKSKNVDLILWAGYRAFEKDMDRVCKHYAAMGIKGFKIDFM DRDDQQVVEFNRKAAETGAKYKLLIDLHGTFKPTGLQRTYPNTINFEGVHGLEEMKWAEP GTDQVIYDVTAPFIRMVAGPLDYTQGAMNNVIMKNFHAVYTEPMSQGTRCRQLALYTIFD SPINMLCDAPTNYLKEEECTKFIAAIPTVWNQTLPISGKVGEHIVMAREKDGIWYVGGLT DWKERDVEVDLSFLGDGEFNAEIFRDGINADRVGKDYKREVIRVSANKKLNLHMAPGGGF VVRITK >gi|222159331|gb|ACAB01000028.1| GENE 38 67159 - 68880 1391 573 aa, chain - ## HITS:1 COG:no KEGG:BT_2817 NR:ns ## KEGG: BT_2817 # Name: not_defined # Def: putative TonB-dependent receptor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 573 1 573 573 699 62.0 0 MKRSHYIIRGLALAISMPSCLQAQTTQPKDTTMTRTVVVEQEYNPDIMDASKVNVLPKVE EPTVSKKEVEYATTFFPATSVPAGLMRPYTGKEVQPGTTPGYVRAGYGNYGNLDILANYL FRLSQKDKLNVRFQMDGMDGKLTLPFTDSEKWNAFYYRTRANVDYTHQFNKLDLNIAGNF GLSNFNFQPESVNSKQKFTSGDFHAGIHFTDETAPLRFNAETNLLMYERQHNMFNESEAN TGIKETIIRTKGDVTGAIGDQQLITIALEMNNLLYNGYTKNVSTGDEYFKNYTTLLLNPY YELDNDDWKLHIGANVDLSFGFDKSFRISPDITAQYIFSDSYIVYAKATGGKQLNDFRRL ESICPYGELPEANTTATLGYVQRPYDTYEQINGSIGFKASPYPGLWFNVFGGYQNLKNDL SYLGFDPSNIHSGSYLSFAQDNTENLYLGGEISYDYKDILGISAKYTYRNWDSKTEEYLL AVKPVSEMSFNVRIHPISALNINLGYDYISRKEVKEYAKMTAINDLHIGASYNVFKGVSV YAQVHNLLNKKYQYYLGYPTEGFNFLGGLSFRF >gi|222159331|gb|ACAB01000028.1| GENE 39 68901 - 71918 2980 1005 aa, chain - ## HITS:1 COG:MA1613 KEGG:ns NR:ns ## COG: MA1613 COG0457 # Protein_GI_number: 20090471 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Methanosarcina acetivorans str.C2A # 30 968 4 992 1885 65 20.0 7e-10 MKKEITRLICAAICCTPIIGFAQTGDKFTSTDNLYKEGKELFQERNYAAALPALKAFVKQ KPAASLLQDAEYMLVSSAYELKDKNRIELLRKYLDRYPDTPYANRIYALLASCYFYEGKY DEALALFNSADLDLLGNEERDDCTYQLATCYLKTDNLREAAIWFETLRANSPKYAKDCDY YLSYIRYTQKRYSEALKGFLPLQDDSKYKALVPYYIAEIYTQLKNYDKAQIVAQNYLSAY PNNEHAAEMYRILGDAYYHFGQYHQAVEAFNNYLNKDRSAPRRDALYMLGLSYYQTKVYS KAAETLGKVTTDNDALTQNAYLHMGLSYLQLAEKNKARMAFEQAAASNANMQIKEQAAYN YALCLHETSFSAFGESVTAFEKFLNEFPTSPYAEKVSSYLVEVYMNTRSYDAALKSIDRI AKPSAQILEAKQKILFQLGTQSFANADFEQALKYLNQSIAIGQYNRQTKADAYYWCGESY 
YRLNRMVEAARDFNAYLQLTTQPNNEMYALANYNLGYIAFHRKDYTQASNYFQKYIQLEK GENATALADAYNRIGDCHLHVRNFEEAKQYYSQAEQMNTSSGDYSFYQLALVSGLQKDYT GKITLLNRLVGKYPASPYAVNAIYEKGRSYVLMDNNNQAITSFKELLTKYPESPVSRKAA AEIGLLYYQKGDYNQAIEAYKQVIEKYPGSEEARLAMRDLKSIYVDLNRIDEFAALANAM PGHIRFDANEQDSLTYAAAEKIYARGRMEEAKTSLNKYLQTFPEGAFSLNAHYYLCLIGN EQKNYDMILLHSGKLLEYPNNPFAEEALILRAEVQFNQQNMAEALASYKMLKEKATNVER RQLAETGILRCAFLLRDDIETIHAATDLLAEAKLSPELRNEALYYRAKAYTKQKADKKAM EDYRELAKDTRNSYGAEAKYQVAQSLYDAKEYAAAEKELLNYIEQSTPHAYWLARSFILL SDVYHATGKDLDARQYLLSLQQNYQGNDDIESMIESRLSKLKVEN >gi|222159331|gb|ACAB01000028.1| GENE 40 71942 - 73252 861 436 aa, chain - ## HITS:1 COG:PAB1371 KEGG:ns NR:ns ## COG: PAB1371 COG1672 # Protein_GI_number: 14521702 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Pyrococcus abyssi # 3 415 10 428 472 191 33.0 3e-48 MKFYDRKKEMAILYAAKKQSQTSACFTVMTGRRRIGKTALLLESVKDTRFVYIFVARKSE VLLCAQYQPVIQETLGIRIYGEIREFAQLFELLMQHSQKENFTLIIDEFQEFLYINPSIV SDIQRIWDQYHATSKINFIVCGSVYSMMKRIFEDRKEPLYGRLTARISLHPFTTTVIKEI LADHAPQFTSEDLLCLYLLTGGVAKYITLLMEAKAFNKTAMINYALSEGSPFLTEGKDML ISEFGKDYANYFSILSLIAEGKTTQREIDSIINKNTGSYLANLEDEYSLIKKVKPMFAKP NSRATRYCLQDNFLRFWFRFIFSNNAVLEMGKNHLLIEYVEKNYEQYSGLLLEKYFRELI AEEEEVTDVGNYWDKNGENEIDLIALNRFNKTALIGEVKRNPRKISIPALEKKGETLHKE LNDYKITFKGFSLKDM >gi|222159331|gb|ACAB01000028.1| GENE 41 74007 - 74531 575 174 aa, chain - ## HITS:1 COG:MK1028 KEGG:ns NR:ns ## COG: MK1028 COG1051 # Protein_GI_number: 20094464 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Methanopyrus kandleri AV19 # 42 160 27 142 154 69 38.0 3e-12 MEHPLDQFLYCPKCGSSHFEINNEKSKKCADCGFVYYFNPSAATVALILNEKNELLVCRR AKEPAKGTLDLPGGFIDMNETGEEGVAREVLEETGLKVKKAIYQFSLPNIYVYSGFPVHT LDMFFLCTVEDMSHFSAMDDVADSFFLPLSEIHPEDFGLDSIRRGLSLFLAQCR >gi|222159331|gb|ACAB01000028.1| GENE 42 74745 - 75311 629 188 aa, chain + ## HITS:1 COG:STM0608 KEGG:ns NR:ns ## COG: STM0608 COG0450 # Protein_GI_number: 16763985 # Func_class: O 
Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Salmonella typhimurium LT2 # 4 188 3 187 187 247 59.0 9e-66 MEPILNSQLPEFSVQAFHNGAFKTVTNNDLKGKWAILFFYPADFTFVCPTELVDMAEKYD QFKAMGVEIYSVSTDSHFVHKAWHDASESIRKIQYPMLADPTGALSRALGVYIEEEGMAY RGTFVVNPEGKIKVVELNDNNIGRDASELLRKVEAAQFVASHDGEVCPAKWKKGESTLKP SIDLVGKI >gi|222159331|gb|ACAB01000028.1| GENE 43 75396 - 76946 395 516 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 212 505 2 297 306 156 33 5e-37 MLESALKEQLKGIFAGLEANFTFDISVSSSHENKTELLELLGDVADCSDHITCSVNEGEA LKFTLLKNSDRTGITFRGIPNGHEFTSLLLAILNLDGKGKNFPDEAVCNRVKALKGPIHL TTYVSLTCTNCPDVVQALNAMTTLNPAITHEMVDGALYQDEVDALKIQGVPSVFADGKLL HVGRGEFGELLAKLEDQYGIDETKANAEVKEYDVIVVGGGPAGVSAAIYSARKGLRVAIV AERVGGQVKETVGIENLISVPETTGNELADNLKTHLLRYPVDLLEHRKVEKVEVVGKQKQ ITTSVGEKFLAPALIIATGASWRKLNVPGEAEYIGRGVAFCPHCDGPFYKGKHVAVVGGG NSGIEAAIDLAGICSKVTVFEFMDELKADSVLQERLKSLPNVEVFASSQTTEVIGNGDKL TALRIKDRKTEEERLVELDGVFVQIGLSANSSVFRDIVETNRLGEIMIDAHCRTNVTGIY AAGDVSTVPYKQIIISMGEGAKAALSAFDDRVRGII >gi|222159331|gb|ACAB01000028.1| GENE 44 77197 - 77904 491 235 aa, chain - ## HITS:1 COG:no KEGG:Coch_1347 NR:ns ## KEGG: Coch_1347 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 36 231 282 498 508 154 42.0 2e-36 MRNYYLTLLLICICGLCFAQQQTNDISTKNPPNKEWNFNNLDGWKYGHQDNNPDNQCILE NGYLRIFTRANSVDRKKVHTVERIYTTGRYTWRTHIPQMGIGDQCSVGSWIYHDDQHELD FEVGYGKDTVRKELNAAPNEMIAYMTSQAYPFSSVPVVIKTGWHLFEIDLTLKNGNYYIT WLIDNEPKHELQLEFGKDIAFHIFCSVENLKFIGEQPAQQENSGLFDFVRYTYHD >gi|222159331|gb|ACAB01000028.1| GENE 45 78092 - 82039 2104 1315 aa, chain - ## HITS:1 COG:MA1149_2 KEGG:ns NR:ns ## COG: MA1149_2 COG0642 # Protein_GI_number: 20090015 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 802 1029 21 258 279 125 30.0 5e-28 
MSKAHVLKRYSFLIWILFFVLSTKAEQQDYYFRQISLEQGLSQSRVQCIYRDHLGVIWIG TKWGLNSYDQSELKSYFHDREQPNSLPDNFIRFITEDRLGDLYVSTNKGIAIYNKAENQF QPLKYNGKPFNAWSYLQIGDNFLFGGEETLYQYNLTDKSITTIFPDIDGDKLKCINRIFQ WSPDMLITSSKKDGLWMYDLIKKKMYRCPFVKEREINTIFVDSQNRLWVSFYGKGIACYS KEGKRLFSLSTKNSGLNNDIIFDFLEKDNQLWIATDGGGINILDFQTMKFSHLKHISDDE QSLPNNSIYRLYKDQMDNIWIGSIHGGLFAIKKVFIKTYKDVPLNNPNGVSERTVVSIFE DKDTLLWIGTDGGGINSFDQKTNTFHHYPTTYGEKVTSITDFSENELLLSCFNKGVFTFN KRTAQMQPFPIINDSISKREFSSGDLVNLYATKDNIYILGAKVYIYNKHTRQTSILYAPQ IDIQRQIAMQAIYSDDTHLYLMGTNNLFKLNFKTNELSSLVNMKEGDDFTSACRDDKGNF WIGSNFGLLFYNKQTGKTEKIHTNLFNSVSSLAYDKKGKVWIGAQNMFFAYIINEKRFVI LDESDGVPSNELIFTPIPALRTPNLYMGGTMGLVRINTDIIFESNSSPILKLLEVKLNGK STLKQVNNNCISIPWNHSSFNIKVIADEKNSFRKHLFRYVITGKDKMVIESYLQTLELGT LASGEYTISVSCDTSNGEWSQPIEILTIIVSPPWWKSIWFIILCIFFAFLVAGVVFFSLI RKKENRLKREMREHEKKIYEEKIRFLINISHELRTPLTLIYASLKRILNKEVKQDELPEY LQGAFKQANQMKDIINIVLDARKMEVGQEVLHISSHPLHKWIQEVAETFQTASKAKEIEI TYDFDDSIQSIAYDDTKCKVVLSNLIMNALKYSPNQTRIVIKTIRTNESIQVHVQDQGIG LDNVDIKKLFTRFYQGKHNEGGSGIGLSYAKMLIDLHGGRMGAFNNEDRGATFFYEIPAN LQEQEVSCPQHSYLNELLSSPEEEEKIESGAFSLQGYSLLIVEDKQDLREFLKNALKDKF KKIYQAENGLVALEVIKQQQPDIIVSDVMMPQMNGYQLCKEIKENLNISHIPVILLTARA DSESQMLGYKLGADAYLPKPFEMEMLLSVIQNQMRNREYIKSRYRGNQFILSPQEATFSN ADEQFMIKLNEMIDQNLSQPDLDVKFLTAQMAMSRTSLYNKIKELTGMGANDYINRRRID KAIILLIQSDMSITEISEQVGFTYQRYFSTLFKEMKGMTPSQFRAQHGSTQQQSE >gi|222159331|gb|ACAB01000028.1| GENE 46 82407 - 85367 2332 986 aa, chain + ## HITS:1 COG:no KEGG:BF4062 NR:ns ## KEGG: BF4062 # Name: not_defined # Def: putative TonB-linked outer membrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 986 1 998 998 819 45.0 0 MDNLRKTLGCLLLFLFAAVTSTYAQVAKQYSGTVTDADSNEPVIGVNVTLKDAQTGTVTD LSGKFSISAPVGSTLLFSYIGYVTKEVKLGTNTSLKITMQEDQNQLSEVVVIGYGVVKKS DITGAVASVSSKQFKDQPVKRVEDILQGRTAGVAVTSVNGLPGGTVKVRVRGTTSLNTSN DPLYVIDGIMSGGLDVNPADIQSIEVLKDASATAIYGSRGANGVVLVTTKKGVEGKVQIY ADVAIGVSNILKKYDLLNAYEYATALKEYNGISFADDEMEAYKNGSKGIDWQNLMLQTGI SQDYKLGISGGTAKNKYLISANVLNMTAMTITTKYQRAQLRINLDNELTKWLTLSTKINA 
SRTHSHNGGIDIMNFLNYSPTMEMKDPVTGVYNMDPYNSVNGNPYGARVANYGDSYVYAL NTNMDLTFKIMKGLTLSVQGAANYSHVPSYSFTSSLAKPGQISGMENASRMNLFWQNTNN VTYNTSFGDHHLTATAVFEASGAEGRNLKLTGSDLANEFVGYWNAKNAKTRDGENGYSAE AIVSGLGRIMYNYKGKYMLTGTFRADGSSKFQKKNKWGYFPSAAVAWDVAKENFMSKQNI VQQLKLRASFGVVGNQSIGAYTTLGMLAPTNYDGYGSDAIHTGYWTGNLATPDVTWESTY QYNIGLDASVLDGRLSFTAEWFRKDTKDLLLRKPAPQYNGGGSFWVNQGEVRNSGVEFTI TATPLTDKDIFGWETSLNASYLKNKIIDLAGSDFIVGENYTSIGGGPIQIKKVGYPIGSF YLYEWANFNDQGANLYKHQSNGSLTTNPGADDLVTKGQAEPNWTFGWNNTFTWKNWTLNL FINAALGQDRLNVSRYAMGSMTGVYRFISLSDAYYKSWDKVANKADAVYASHKNSDNRNY PDSDFWLEDASFVKLKNISLTYNIPKKITKVADIQLSVSAQNLFTLTKYTGMDPEVYSES DYGFNGVDMGSYPVPRTFTFGMKLNF >gi|222159331|gb|ACAB01000028.1| GENE 47 85382 - 87022 1517 546 aa, chain + ## HITS:1 COG:no KEGG:BVU_3705 NR:ns ## KEGG: BVU_3705 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 15 544 10 532 534 401 44.0 1e-110 MDMKTIYRIFKSVGLALIALWMVSCQDLLTEDPKGQLAVTNFFNSKGDLDLALNGMYSKV ASDMYANIWAGFESVMGDDISTHPAANKQGLREVDTYNVSDNNTWVTELWGARWRLVKAA NFIIDNAGRTPEVSQEEKDAAIGQAYYWRAYSYFYFVMAWGEVPMVVKDEINYNMPLATV PEIYELIISDLKKAETMVPANYTKEPYARNGVNIAVSQGAVKATLAYVYMAMAGWPLNKG TEYYQLAAAKAKEVIDASKKGTYYYKLLPDYKQVYSMEYNKNNPEVLLGVYYNLGIDALT NAPLADFLADYAYGGGGWGDTNGEIKFWYDFPEGSRKDASYFPKIILKNETKLRDWWEDP NPEAPRVVVAPCFMKKVETTTGEEFDYTNPKISMNQNGEKTHQIIRLAEVYCWYAEAVGR SKTGSITEAVNLLNEVRNRANGSVVADRDIYKTTMSYDELAEAAYNEHGWEIAGYYWGNI ATRARDMFRMNRIKDHFEYRKLNPEIEVAPGVFRKEAVSVSGTWNDSKMYLPRPFVDSSI NPNLKN >gi|222159331|gb|ACAB01000028.1| GENE 48 87046 - 88719 1223 557 aa, chain + ## HITS:1 COG:no KEGG:Coch_1345 NR:ns ## KEGG: Coch_1345 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 138 552 67 499 499 194 34.0 1e-47 MKLRYLYLAIGVLCNASLVSCGDSFKEKTEIVACGISTNALTFGVSSTEVQTVDITSEAA WEVAVDQAGGNWLTVSPLEGTGNGTLTIAADKNNGPKRSATLTIAAKGAELRTITVIQDG YKGTIYNYGDFTGLQKTGLVAGINPITIVDNDECEDGKALRIYTRAGEEYEGTNGDRFKV QTTTQFGSGRYEWRIYVPKFGMNDRASIGAFLYFDDGHELDFEICSGTAADRAAHSAGPD 
DMLCLVTSQANPFFSEFTPIKGDAWHTFVLDLKLENKKYLAEWSVDGKVLKRAQLDYGEE AYFRAISSVENLYGMGDHAATQENYALFDYLEYVPYDYSMKPIIEGQLPPEPEGTTVKWD FEEAGFVPVGWTNNGGTIADGSLNLSNGNNFVYSPEVGAGKYTWEIDVPLVGVGEKWLAG GNIAATNAEERSFSMFVFSGTENDRAACTVPPVPGQMLVRCYTEAMGVYGVPIDPGKHTL TIDLRLNADGAYWAAWIIDGEVAKTFTTWYTPAQFKFGFSMMTFADGGGWQGDKPTAKTY TVKYDYIEYKKYNYDEE >gi|222159331|gb|ACAB01000028.1| GENE 49 88738 - 89763 489 341 aa, chain + ## HITS:1 COG:BS_yaaH KEGG:ns NR:ns ## COG: BS_yaaH COG3858 # Protein_GI_number: 16077084 # Func_class: R General function prediction only # Function: Predicted glycosyl hydrolase # Organism: Bacillus subtilis # 90 332 184 412 427 71 24.0 3e-12 MRYLSIIGTAICCVMLVMTSVSCKQEIFSSTTQEKKVAPWYVYIDGSSLKDIEPVKEIIS SISVFGNPPKSFIDECHQNHIEVYQAVGGNEETIDTPQKRKALVEKYVSDCNANGYDGID LDLEHLNPDIQDAYTEFLKLASKELHAVGKKLSHCVSFYPALYQDNETKMFHDPAVLNTT CDLVRVMCYDMYFAPGIDKPELKHRDDCMGIGPTSNYLWTREAMLFWMKHIPNDKLVMAL PAYANDYAVTGGIKGRQIYQSVPDSINGILPSPTWLCYEKVNMYLYDGTDGNRHLFYASD ARSTEALLELADELGISQIGFWHFNSVDPQMWDTAAKWQKK >gi|222159331|gb|ACAB01000028.1| GENE 50 89935 - 91761 1268 608 aa, chain + ## HITS:1 COG:SMb20092 KEGG:ns NR:ns ## COG: SMb20092 COG3568 # Protein_GI_number: 16263840 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Sinorhizobium meliloti # 16 244 5 245 252 93 31.0 9e-19 MKKIFLLISAILFIFPVQAQHTLRLMTYNIKNANGMDDICSFQRVANVINNASPDVVAIQ EVDSMTRRSGQKYVLGEIAECTQMHACFAPAIEFEGGKYGIGLLAKQVPLRLQTIPLPGR EEARTLILAEFEDYIYCCTHLSLTEEDRMKSLEIVKSLIASCKKPLFLAGDMNAEPESDF IKELQKDFQILSNPEKHTYPAPDPKETIDYIAASKQNATGFAVISARVVNEPMASDHRPI LVELRTAEKADKIFRTKPYLQNPVGNGITVMWETTVPSYCWVEYGTDTTRLERARMIVDG QVVCNNKLHKIRIDGLQPGQKYYYRVCSQEMLLYQAYKKVFGNTAQSDFSEFTLPATDTD SFTAVVFNDLHQNTQTFRALCKQIKNLNYDFVVFNGDCVDDPVDHNQATAFISELTEGVR GDRIPTFFMRGNHEIRNAYSIGLRDHYDYVEDKTYGSFNWGDTRIVMLDCGEDKLDSHWV YYGLNDFTQLRNEQVDFLKKELSAKEFKKAKKRVLIHHIPLYGNYEKNLCADLWIKLLEK APFNVSLNAHTHKYAYHPQGELGNNYPVIIGGGYKMDSATVMILEKKNDELRIKVLNVRG EVLLDITV >gi|222159331|gb|ACAB01000028.1| GENE 51 91907 - 93673 873 588 aa, 
chain - ## HITS:1 COG:no KEGG:BVU_2031 NR:ns ## KEGG: BVU_2031 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 56 556 27 539 558 322 38.0 3e-86 MKYMYSLCRSIKLHACISMFLLCLVSYSCGGSDDGISEPPGNPGTDPAPDPDDPSDDWRT LYNGIILPSVWPPKNINISSYEPMNVPYLQTPPEVVPIDVGRQLFFDNFLIANTDLERKF YKPKKYAQNPILKPETALETSNIPGASAKDGGVWWEPREQIFKMWYEAGWLNKMAYATSN DGVNWVRPDLNNGSNELLTLSRFPANSCAVVLDYEASDNERYKMFLRSPNADATDNAGYC MISSNGKQWSWFTKTGPCGDRSTMFYNPFRKKWVFSIRTLGVLGNSPHGRARYYREHSDF LAGAVWSKADVVFWCNADNKDTPDPEFNLPPELYNLNAVGYESIMLGLHQILLDENEIAK AANRPKITELKVSFSRDGFHWDRPYRDAFIPATRVPGSWDRGYVQSVGGICTVVGDQLRF YYIGFKGGDSSSYMHSNGATGMATLRRDGFASMSTTGEGSLTTHPVKFSGKYMFVNVNCP AGELKVEILDKNNKVIDGFSADKCKAVTIDSTIQQVEWNGASDLSSLAGQEVRFRFKLKN GDLYAFWVSPSTNGESNGYVAGGGPGYTSNKDTEGKKAYEKASGFQKF >gi|222159331|gb|ACAB01000028.1| GENE 52 93693 - 94961 1089 422 aa, chain - ## HITS:1 COG:CC2486 KEGG:ns NR:ns ## COG: CC2486 COG0477 # Protein_GI_number: 16126725 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Caulobacter vibrioides # 8 373 40 424 519 103 24.0 5e-22 MLKDKKYYPWMVVGMLVVVSLLNYLDRQMLATMRPFIMVDIKELESITNFGRLMAIFLWI YAIMSPISGMIADRLNRKWLIVISLFVWSTVTLMMGYVDDISHLYILRALMGVSEAFYIP AALSLTADYHQGKTRSIAIGVLTSGIYLGQAIGGFGATVAAHTSWQFTFHSFGMIGIIYS LVLIALLHEKKTYTVQTEQKKSIKENLSMTFKGLAVLFSNIAFWVMLYYFAALNLPGWTT KNWLPTMVSDTLNMDMEFAGPLSTTTLALSSFVGVFLGGFISDKWVQKNIKGRVYTSATG LSLIVPALLLMVYGNNIVFIVAGAILFGLGFGMYDTNNMPILCQFIPSRYRATGYGLMNF VGIAAGAIITNELGKAFEANYKDLVFLIMIFSVCLSIFLQLKVLNPKTTDMTDELMTESK NK >gi|222159331|gb|ACAB01000028.1| GENE 53 94964 - 95536 512 190 aa, chain - ## HITS:1 COG:PM1255 KEGG:ns NR:ns ## COG: PM1255 COG2731 # Protein_GI_number: 15603120 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Pasteurella multocida # 51 188 11 149 158 
96 36.0 3e-20 MIITMTCSCENIKKNNSIATSAADPTEIDWRYGWTVAASPILDMDALKKHQAQYPERWKA TFEYLKNTDLGRLSPGEHEIIGREVYAIVSEYTPKEATECNFEAHRKYIDFQYLISGKEK MGVTTLDKVVPICEYDEEKDIVFFKPDASAIYEVATPDLFYVFFPKDPHRPSIKDENGVT VKKIVIKIKM >gi|222159331|gb|ACAB01000028.1| GENE 54 95566 - 96495 909 309 aa, chain - ## HITS:1 COG:yagE KEGG:ns NR:ns ## COG: yagE COG0329 # Protein_GI_number: 16128253 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Escherichia coli K12 # 1 263 8 270 309 161 31.0 2e-39 MMNKDLFEGVITPMVTPLIDRENIDFKGLEKLLDHLINGGVSGVFLMGTTGEGTSISPRM RKDLIKYSIEYVKGRVPVFVSIADCCIEESLNMARYAKECGVTYLVSALPFYLGLTQKEI IDYYTTIADNVPLPLFLYNIPAQTKLMISVEAVKTLAKHPNIIGMKDSSGNGTYFNTLLA EIKAEYPNFTILVGPDEMLASTMAMGGNGGVNSGSNLFPELYVNLFKACKAKDTERILKL QKLVMKVSTGVYSVDKSSVSFLKGLKAALFTEGLITDYICEPLQKVNNADLETIRKNVTE LKQQIIQVL >gi|222159331|gb|ACAB01000028.1| GENE 55 96522 - 98276 977 584 aa, chain - ## HITS:1 COG:no KEGG:BVU_2031 NR:ns ## KEGG: BVU_2031 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 38 548 32 539 558 353 40.0 2e-95 MKRRDFIRNAGLALSTAFIPNIAFPSFLRGGASPSSALYNGITLPDVWPPRNISMNYDPM PLPYLTNRPEIVPIDLGRQLFVDNFLIEQTDMERRSYTPRKMSFNPVLKPETELEQGTYG IPGASAKDGGVWWDPKDNIFKMWYEAGWLHRMAYATSKDGIHWERPNLDVVAGTNQIVPE IVADSSTVWLDHFTKNPEERFKMFLRSPNSIPGSTERFNYGFSMVSPDGIHWGKPVKTGP CGDRSTMFYNPFRQKWVYSLRNPENINRVPIGRFRYYHEHPNFLEGAKWTKEDLIFWLGA DYLDDPDPYIGDKAQLYNLSAVPYESIMLSLPQIHLGPSNERCREAGVPKITELKVAYSR DGFYWDRTDRHTFIPAERKAGSWDRGYVQSVGGLCTIVGDQLWFYYIGFEGNKKKKSTVY EKNGMHANGSTGVAVLRRDGFVSYSAQKEKGTLVTRPVTFTGEYLFVNVNCPNGELKVEV LDADNKVIQGFSANDCKPLTVDSTIAQIRWKNKKDLSTLKSQSVRFKFYLTNGDLYSFWV SPDTEGASNGYNAAGGPGFSGGIDKEGMRAYKKAQDFPNLLERC >gi|222159331|gb|ACAB01000028.1| GENE 56 98303 - 100027 1092 574 aa, chain - ## HITS:1 COG:no KEGG:Dfer_2403 NR:ns ## KEGG: Dfer_2403 # Name: not_defined # Def: RagB/SusD domain protein # Organism: D.fermentans # Pathway: not_defined # 1 
570 27 576 576 461 46.0 1e-128 MTPENSLKNETELKLYINGLLPMLSGASNVDNGASVAAGRMMEKTDDVIWPTLPDYMIGK RSSTQSAGSWDWDNLRKINIFLKYSVNCPDETVREKYNAMAYYLRAQFYYDKLKTFGGVP WYDTVLEDNSPELYKPRDSREMIADKILEDLDKAIKNGVEAKSLNEITKWTALALKSRFC LFEGTFRKYHNIEGSEKFLKECVSASEALMTSGKYTIDKGEGTDVAYRDLFAQPATNNAS DVEVITAYAYSISLGVKHNTNYSIINASGNQIGLSKAFMDSYLMNDGTRFTDQPGYATMI FGEEFENRDPRMFQTVRCPGYARIGKTHDVNRLYEAMIISTTGYMPIKYIQSGDYDKQTS NENDIILYRYAEVLLNYAEAKAELGTLLQTDLEQSIKLIRDRVGMPNMDMAAANANPDPV LSAQYPLVNGSNKGVILEIRRERRVEMVMENLRYDDIIRWKAGHLFTEQFKGVYFSTVTA PSTKYDLDKPAYTARNANFYIYIGERPSNLTEKNSIGLNVGIFLSNGDSGNKVVNTDKVK TWNEDRDYLAPLPSSALVINPNLKQNPYWDSPSN >gi|222159331|gb|ACAB01000028.1| GENE 57 100151 - 103462 2664 1103 aa, chain - ## HITS:1 COG:no KEGG:Dfer_2402 NR:ns ## KEGG: Dfer_2402 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: D.fermentans # Pathway: not_defined # 47 1103 161 1217 1217 967 47.0 0 MKTYSNRRFILSGRNLQLVTDSIRHKLILLLSCLFIFSMNVYAQNSVSGTVRDGNGEPLL GVSVIVKHGKTTTGTVTDLNGHYQVKANPSDTIEFSFIGFKSVIVPANQKVINVTLHDDN QLLDEVVVVGYGTQKKVNLTGAVSQVTAEVLENRSTSSVTQMLQGAMPNVNIQVNTGSPG AGGTISIRGMGSVNSSSPLILVDGIPGSIDQLNPNDIESISVLKDASSAAIYGARAAFGV VLVTTKKAKSGKAKISYDGYFSWSKPTVSTDFVTSGYEHAYIYDTSYMGQKGFMGAATGY TEEDYKELEARRYDTTENPDRPWVVTTTESNGRQKYNYYANFDWWNWMYKERMPSQSHNV SISGGTDKIQYYVSANFYSKDGMMKRVDEKYTQYTLTSKFNAEIKPWLKIGNQTNYFDSS HTYPGENAANAAFARTMINCAPYYVPIGPDGNYTGLMKNGKVLNEGRVADIYGGVSKGNV GKRRFRNTFSVEITPIKNLSVNADYTFGFTMDDNWKRQGLVYISNGYANETQLSTTSTHK KDYIQKSMSYDPSHVFNVYATYGNTFAKAHHVSATVGMNYEKQQNHDLLGYRTDVLSETL NDLNLATGTGASDLKATGGASAYELFGLFFRANYNYKERYLLEVNGRYDGSSRFLSGNRY GFFPSVSAGWRISEEQFMKGLNIFDNLKVRASYGVLGNQLGVSMYPYSTITQKQSSYIVG NSLAYYLTTPAPVAGDYTWEKVGTLNMGIDISVLDNRLNFSGDYFIRKTSDMFVDGVTLP AVYGANPPRQNAGEMKTKGYEITVTWRDNFKLAGRSLNYEIFGSIGDATSEITKYQGNDT NILTDYYVGKKIGEIWGYRTGGLFQSDEEATEWTSKIDQSAVNRDILKSQGKWSVARGGD VKYLDKDGNNIINNGANTLEDHGDLEVIGNETPRYNYGFGANINWYGFDFSISFQGIGQR DLYPNKEMEKFWGSWGRVNSAFLPKGMAEQAWSEENKGAYFPRLERGSAAYNDNGQMQVV NDRYLQNLAYLRLKNLSIGYTLPTKLTTKAGIERLRIYFAGENLCYWSPFKTDYIDPEQA 
MSANDARIYPFSKTVSVGVNLTF >gi|222159331|gb|ACAB01000028.1| GENE 58 103422 - 104408 821 328 aa, chain - ## HITS:1 COG:SA0258 KEGG:ns NR:ns ## COG: SA0258 COG0524 # Protein_GI_number: 15925971 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Staphylococcus aureus N315 # 2 298 4 303 304 201 41.0 2e-51 MKVVVIGSSNIDMVAQVNHLPAPGETVGDASFMQSLGGKGANQAVAAARLGGSVTFVTSL GNDMYADILKKNFEKEGITTDYIINDTQHPTGTALIFVADSGENCIAVAPGANYSLSPES INHFSKVIDETEIVVMQAEIPYNTIKSIALLAQQKGKKVLFNPAPACLIDAELMKAIDIL VVNEVEAAFVSGIKHTGDNLEKIAEALIAAGTQNVVITLGSHGVYMKNAHTSVRLPSYQV NAIDTIAAGDTFCGALAVMCAQKEINIEAINFANAAAAIAVTRSGAQPSIPTLEEVNHFI IENELPYRSTSKFNEHENLLEPTIYSIR >gi|222159331|gb|ACAB01000028.1| GENE 59 104448 - 105371 725 307 aa, chain - ## HITS:1 COG:HI0505 KEGG:ns NR:ns ## COG: HI0505 COG0524 # Protein_GI_number: 16272449 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Haemophilus influenzae # 8 304 2 298 306 232 46.0 7e-61 METISIHKPKVVVIGSCNTDMVVKANRLPVPGETVLGGTFYMNPGGKGANQAIAASRLGA EVTFISKIGYDLFGLQALEIYKSEKINTEFIFTDQKSPSGVALISVDSFGENSIIVAPGA SRSLSIEDINKAEEKIKEADIILMQLEVPIETAEFAATIANKYNKKVILNPAPASTLSDT FLRNIHTILPNRIEAEMLSGIKVIDIESAHRAAKTIGDKGIENVVITLGKDGAYVKEKDK YAMIPAKQVETIDTTGAGDVFCGAFSVCLSEKHSLTEAVEFANAAAAIAVTRIGAQSAIP YKMEVTL >gi|222159331|gb|ACAB01000028.1| GENE 60 105599 - 107140 939 513 aa, chain - ## HITS:1 COG:no KEGG:BT_2802 NR:ns ## KEGG: BT_2802 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 512 1 512 513 846 89.0 0 MKRVDSVISLIYSLSKSEKKHFSLQVVKDKEEKDYLVIYDIITKSKQPNGNAVKEEFYKR RPKGSFEVSIQYLYEKLVDTLLTLRKKKDIYYDLLNDLCKARMLYERSLFEECFEILSNT IEQAQFYENNEILTIALKQELEYLLRLNFPNMTEQELFHKHFIQNESLKKIRKITEQSSL HNLLKYRLSHIGAIRTSKQKQDMNDLMVNELYIAASSDAEGNFELTRNHKLFQANYLMGA GDYRAALNSYKELNSLFEQNQQFWSNPPIYYLSVLEGVLGSLRSVGNYNEMPYFLEKLKR LIAEDSSLEFKVNATCLLFQYELFPYLDKGDFLDCTEIMNRYQETLYDKEAWLSPIRKSE 
LLLYTTIVHIGNQNYKAAKKYISNAIVDHNIKYLPLMRTIRLVRLIAYYETQEYELIRHE SRSITRSLSSPKEQTFKTERIILWFLNKRNLPILRKDREAFWEKLSPEIQELYNDKYENQ LLRIFDFTAWMESKIRKEKLSEVLRIHTNAKEP >gi|222159331|gb|ACAB01000028.1| GENE 61 107280 - 107762 435 160 aa, chain - ## HITS:1 COG:no KEGG:BT_2538 NR:ns ## KEGG: BT_2538 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 158 1 158 160 134 52.0 1e-30 MALKYVVKKTTFGFDKEKAEKYVARPFNVVTVDFKMLCDQVTKVGFVPRGTVKSVLDGLI DSLITYMEIGASVSLGEFGTFRPSFGCKSQDDEKEVTTDTLKNRKIIFTPGSMFKGMIKS VSIQKLDSSKSNNSGTPDNGNGGSGEGGNEGGGEAPDPAA >gi|222159331|gb|ACAB01000028.1| GENE 62 108259 - 108666 334 135 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173218|ref|ZP_05759630.1| ## NR: gi|260173218|ref|ZP_05759630.1| hypothetical protein BacD2_15210 [Bacteroides sp. D2] # 1 135 23 157 157 245 100.0 5e-64 MRKIIIVSAISMVIFSCCIGVSSNTDKDKKYLYIEATGTNSHGDYSAKIDTLEIMEKNDS LAYLKAFEELCVSQRASALVVEIMKEKMGDRFDEYEEVHDFRLLNEKLEEVDRTIVPDSV LAEIAQSIFSLKLEK >gi|222159331|gb|ACAB01000028.1| GENE 63 108733 - 109683 421 316 aa, chain + ## HITS:1 COG:no KEGG:mru_0017 NR:ns ## KEGG: mru_0017 # Name: not_defined # Def: hypothetical protein # Organism: M.ruminantium # Pathway: not_defined # 2 133 46 180 312 108 40.0 2e-22 MKYDVFISYSSKDSGVAFDVCAMLEAAGLTCWIAPRNVYGGKSYAREIIEAITESQIVLF IFSSYSNCSSHVESEIDIAFNQEKVIIPFRIEEVKMSPELTYYINKKHHIDGIPEPALSF DILKESVLNNIPRLQKELDKERAYKLLREDLGDFDIEFLKSVLQNSRNNRADPRNVNCED NVAENEFNILQNAAGELCLLIRSRNGRPRKPCFICDNSNFTILFRNNSSAVFLEDINPVA VEALNKVNQMLVVELNNDEVVREYVVPIVLIEGLTSYLQDDAIYDKDSGIQGTLDRVLSH KQQTLLRKLYCNWREN >gi|222159331|gb|ACAB01000028.1| GENE 64 109760 - 109978 137 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237717280|ref|ZP_04547761.1| ## NR: gi|237717280|ref|ZP_04547761.1| predicted protein [Bacteroides sp. 
D1] # 1 72 1 72 72 135 100.0 1e-30 MQNRLLRICKLLTIFIIELNYHLTVSYGEAFDKRRSVFGGEFGNLFVGKLFVALVAEDGK PTIHVATFGTFG >gi|222159331|gb|ACAB01000028.1| GENE 65 110628 - 110912 144 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717282|ref|ZP_04547763.1| ## NR: gi|237717282|ref|ZP_04547763.1| predicted protein [Bacteroides sp. D1] # 1 94 25 118 118 195 100.0 8e-49 MMGVSRAGYYKWKRRDPSTRDLNRETMVEFVEQMHSEHSTHGYRWVVAFIRNELRATVSD NFVYKCFRYLGIQSETRPCFAIGYDTPVNYHLIS >gi|222159331|gb|ACAB01000028.1| GENE 66 110974 - 111573 521 199 aa, chain - ## HITS:1 COG:no KEGG:BT_2225 NR:ns ## KEGG: BT_2225 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 198 1 199 201 181 45.0 2e-44 MSVYYDLYASGNPQKKDEQQPLHARVIPSGTLDAKKFIELVSKSNGFSQATIEGCLQAVT DELQHWLKQGWIVEVGELGYFSLSLKCDHPVMEKKEIRSPSIHLNKVNLRINKKFRENME PLPLERMESPYRSNGNPDEDKCLSILMQHLDEQGCITCVDFTRLAGISRYKATILLNTYL EEGIIRKYGGGKTVVYLKK >gi|222159331|gb|ACAB01000028.1| GENE 67 111993 - 115412 2484 1139 aa, chain + ## HITS:1 COG:PA0931 KEGG:ns NR:ns ## COG: PA0931 COG4771 # Protein_GI_number: 15596128 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Pseudomonas aeruginosa # 196 333 41 191 742 64 35.0 1e-09 MEKDSPFSSTGRRCIVTLMCIFVTAFVYAQQQKISVSIKELPLKEAISQIAEKASMHVAY SKEFVDTSRKVSLEVKDTEVNKALTLLLKGTNIGFRFLDDSILFYNKEYQNKTEPVDSQG EKKELYVKGKVTDENTEPIIGATVSVKGSTTGTITDINGQYSIKVPYGSTLRYSYVGYRE ESVIAKATTVNVVMKEDAVSLEDVVVVGYGVQKKVNVTGAVSMVKAEAIENRPITNVTTG LQGLLPGVSIVSSSGQPGAVPSINIRGTGTINSSTAPLILIDGVAGGDINLLNPSDIESV SVLKDAASSAIYGARAANGVILVTTKKGEKKERVVFQYNGYAGFQTPTALPELVNGREYM ELSNEAMSAAGFSKPYTQEAFDKYDSGLYPNEYSNTDWIDEIYKSRAFQTGHNVSARGGS EKTGFFMSYGFLDQDGLVVGDGYSSKRHNARISVNTEVYDRLKLTGQMSYVDFYKKDLGY SGTSGVFRLSQRMSPLLPVMWQIPDENGRMVDSENWSYGSVRNPLQVAYESGMEERKTRV LNGIFNADLKIIDGLNVGMQYSANIYTRQVDEFNPKMLSYYSDGSPLKANEDAKDYISQS HLDVMTQTLQFTLNFNKTIGRHELGALLGFSQEWENRSTLGATRDNVMVEDIHVISAGMI 
NFMNSGTKDEWALRSYFGRVNYAFDGKYLFEANLRADGTSRFAKGNRWGYFPSFSAGWNF SREKFMEFATSVLSSGKLRASWGELGNQNIPGNYYPYLSPIITEESYPIGASNTPVMGLW QNKIGNPDIKWETIRMLNFGVDLSFLNNRLNVDFDWYKKENIDALVRPDVPAIVGVSSSN VGYVNLGKIDVKGWELNLSWRDKIGSVNYNLGFNLSDARNKITDLGGTPESLTSTASGSY RRVGDPIGAFYGYLTDGLAQVYDFESVNTTTGKYQKPKFPLVASQNGIVQPGDIKYRDIS GPDGAPDGVIDDYDKVVFGEKEPHYTYAIKGGLEWKGIDFSFYLQGVGKVAGYLEDEARH AFINDYSIPKKEHLDRWTPMNPNASYPRLYQSQEHNRLFSDYWKEDASYLRLKNIQIGYR FPARMVAPLGINSLRVYASADNLFTKTDYFGAYDPEVRTTSGDVYPQVKTYVFGLSITF >gi|222159331|gb|ACAB01000028.1| GENE 68 115433 - 117166 1433 577 aa, chain + ## HITS:1 COG:no KEGG:Phep_2529 NR:ns ## KEGG: Phep_2529 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 4 576 24 579 579 296 37.0 1e-78 MKYSKYFLIIALFGIVTSCSDFLDRTNPNEPDNVTFWVNEDQLKNALPPCYEALQKDYLV NWSESTAETVMWGNITSGLSKVSGGKHSYTDGFPFTTYWTGAYSYIYRCNNFLDNYNKAQ VAQNKKDVYAAEVKTIRALMYFYLTVFWGDVPWVGEVIQPEDAYIERTPREKVIDQLVED LKWAAERMPEERYTGDKLGRLDRWGALAILARIALQNERWELAAKTSEYIIENSPYGLYE YEKLFHHEGDVENDPKNIEAIVYSLFVPEIRTQSLPNETCSPTDYIRLNPTKSLVDAYLC TDGKPAKTGLEYYKKTGVQTSSLYKSPEEHYVDYFQNRDPRMKMTLYAPGDKWPGGDDGD PDTDKANEIFNLPRFASLQDNNRVGANSRTGFYLKKYNDIDLAGSSVGGHGNLNVIRFAE ILLIYAEATFELQGKKLTQTQIDYSINRLRDRVNMHRMNLDELSAWGMDLETELRRERRI ELAGEGTRYADVMRWREGELRFGRAITGPSLKVCMNDLGANPYPDTGVDEFGDVIYEKST AEGGARYFDATKHYLWPVPNPERQKNPLLGQNPGWEK >gi|222159331|gb|ACAB01000028.1| GENE 69 117186 - 117977 483 263 aa, chain + ## HITS:1 COG:CC0523 KEGG:ns NR:ns ## COG: CC0523 COG3568 # Protein_GI_number: 16124778 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Caulobacter vibrioides # 28 253 5 245 259 65 26.0 7e-11 MYSMKKYLLLFLCLTLGVAYAQDTLRVRVMTYNLRFGELASLEELAHHIKSFKPDFVALQ EVDSKTDRKRTPHQKGKDFISELAYHTGMFGLYGKTIDYSTGYYGIGMLSKYPYISVQKI MLPHPVKEHERRAMLEGLFEMGNDTIVFTSTHLDVNSQETRAEQIKFITGHFKNYKYPVI LGGDFNARHYSEAIRGMDSWFAASNDDFGMPAWKPVIKIDYLFAYPQKGWRVISTQTVQS LLSDHLPIITELEYVKEASKKKY >gi|222159331|gb|ACAB01000028.1| GENE 70 118006 - 120162 1452 718 aa, 
chain + ## HITS:1 COG:no KEGG:Amuc_0060 NR:ns ## KEGG: Amuc_0060 # Name: not_defined # Def: alpha-N-acetylglucosaminidase (EC:3.2.1.50) # Organism: A.muciniphila # Pathway: not_defined # 31 715 33 721 848 652 45.0 0 MIHTIIKYLLISTTLFFCSCHKPKTDIITPAKQLIERQIGERAQSIHFEYIEPSEGKDIF EVIASDGRLTLRGSSSVAICYAFHTYMKEACKSMKTWSGEHITSVMPWPDYELYEQMSPY ELRYFLNVCTFGYTTPYWDWERWEKEIDRMALYGVNMPLATVASEAIAERVWLRMGLNKE EIREFFTAPAHLPWHRMGNLNKWDGPLSDTWQQNQIALQHQILTRMRELGMQPIAPAFAG FVPEAFAQKHPDTQFRHMRWGGFDEEYNAYVLPPDSPFFEEIGKLFVEEWEKEFGENTYY LSDSFNEMELPIDKEDKEAKYKLLTEYGETIYKSITAGNPDAVWVTQGWTFGYQHSFWDK ESLKALLSNVPDDKMIIIDLGNDYPKWVWNTEQTWKVHDGFYGKKWIFSYVPNFGGKNTM TGDLDMYASSSVKALRAANKGNLIGFGSAPEGLENNEVVYELLADMGWSSDSIDLDDWMK IYCEARYGGYPDAMEEAWKLFRKTAYSSLYSYPRFTWQTVVPDQRRISKIDLSDDYLQAI RLYASCADELKSSELYRNDLIEFVSYYLAAKAENFYKQALKDDSENRVFAAQRNLQQTVD LLMDVDRLLASHPLYRLEEWVEFARNSGTTLQEKDAYEANAKRLITSWGGIQEDYAARFW SGLIKDYYIPRIQLYFTKDRNKIREWEEQWITSPWSNSTTPFDDPVKAALNLIEKTNK >gi|222159331|gb|ACAB01000028.1| GENE 71 120171 - 121313 784 380 aa, chain + ## HITS:1 COG:SA2220 KEGG:ns NR:ns ## COG: SA2220 COG1929 # Protein_GI_number: 15928010 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerate kinase # Organism: Staphylococcus aureus N315 # 1 376 3 378 380 271 41.0 1e-72 MKKIVLAFDSFKGSVGSFEIAKAAEKAIQEELPDCQIIRFPIADGGEGTTEALCSALHAQ TVSCRVHDPFMKPIEVSYGIVNNDAMAIIEMASACGLPLIDSSRRSPMKTTTYGVGEMVA DALKRGCREFIIGIGGSATNDAGIGMLKALGARFLDSGNGELEPVGENLIKVHQIDISQL NPALKESKFTIACDVSNPFFGKEGAAYVYAPQKGANSLQVIELDNGLRHYAQVIKEYTNM DISQLPGAGAAGGMGGGLLPFLNAELQSGIEVILKTLRFEEVVRQADLILTGEGKLDRQT CMGKALDGILRVGERCQVPVIALGGAVEATEALNRMGFTAVLPIQPFPVTLEEAMQPEFT KENIERTVRQVVRIIKQFTK >gi|222159331|gb|ACAB01000028.1| GENE 72 121320 - 122576 980 418 aa, chain + ## HITS:1 COG:HI0092 KEGG:ns NR:ns ## COG: HI0092 COG2610 # Protein_GI_number: 16272066 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Haemophilus 
influenzae # 1 414 4 414 419 340 53.0 3e-93 MTAIGALIGLSLSILLIIRKLSPTYSLIIGAIVGGLLGGLSLNETVTVMTEGVKEVTPAV LRILTAGVLSGILIQTEATTVISNAIINKMGEKRVFMALALATMLLCTVGVFIDVAVITV APVALSIGKRLNLSPSVLLIAMIGGGKCGNIISPNPNTIIAAGNFNADLSAVMFANILPA VIGLFFTVFVIVRLMPQTVKNKKTMMQTEDKEEERNLPSLRTSLIAPVVTIVLLALRPAA GINIDPLIALPVGGLCGAICMKQWKNILPSIEYGLQKMSVVAILLIGTGTIAGIIKNSSL KDWILTGLDHAHMSDALIAPISGALMSAATASTTAGATLASSSFAETILAIGISAVWGAA MINSGATVLDHLPHGSFFHATGGVCELNFKERLKLIPYESLIGIVLAAGTTILYIISN >gi|222159331|gb|ACAB01000028.1| GENE 73 122591 - 123676 580 361 aa, chain + ## HITS:1 COG:all1887 KEGG:ns NR:ns ## COG: all1887 COG4299 # Protein_GI_number: 17229379 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. PCC 7120 # 6 361 2 375 375 186 33.0 6e-47 MNPNKRLLSLDVLRGITVAGMILVNNTGKCGYNFAAFAHAKWDGFSPADLVFPMFMFLMG ISTYISLCKYNFQCRPAIAKIIKRSLLLIFIGLVMEWFITAIDSGNYFDLSQLRLMGVMQ RLGICYGITALLAVTIPHKKFMPLAIILLVVYFIFQLFGNGFEKSVDNIVGIVDSAILGS NHMYLQGRQFVDPEGILSTIPAVSQVMIGFVCGKIIIDIKDNDRRMLNLFLIGTTLLFAG YLLSYACPLNKRLWSPSFVLLTCGIATLSLALLLYIIDVKQNKKWFSFFETFGANPLVIY VFSCIAGGLLVHWHIHTTVFNNLLNPLFGNYFGSFMYGVFFLLFNGLLGYVLLKRKIYIK L >gi|222159331|gb|ACAB01000028.1| GENE 74 123683 - 125866 1728 727 aa, chain + ## HITS:1 COG:no KEGG:BT_0438 NR:ns ## KEGG: BT_0438 # Name: not_defined # Def: alpha-N-acetylglucosaminidase precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 691 1 699 730 670 45.0 0 MKRKSVYTCLVMLFMSLVLQAKDKDVAVAEALLKRLLPSYIESFQFQKLKGEKDCFTIES VKDKIVIGGNNANSMAMGLNHYLKYYCLTTVSWYADIAVEIPEELPMVDEKVVSEARVDT RFFLNYCTYGYTMPWWQWKEWERFIDWMALNGINMPLAITGQEAVWYKVWSKMGMSDIEI RSYFTGPPYLPWHRMANIDRWNGPLPMEWLEHQVSLQKKILARERELNMKPVLPAFAGHV PADLKRIYPEADIQHLGKWAGFADAYRCNFLNPNDALFAKIQKLFLDEQKKLFGTDHIYG LDPFNEVDPPSFEPEYLRKIASDMYATLTAADPKAQWMQMTWMFYFDKDKWTSERMKALL TGVPQNKMILLDYHCENVELWKRTEHFHDQPYIWCYLGNFGGNTTLTGNVKESGARLENA LINGGGNLKGIGSTLEGLDVMQFPYEYILEKAWNLNVDDNKWIECLADRHVGCVSQPVRD AWKRLFNDIYVQVPRTLGTLPGYRPALNKNSEKRTSNVYSNVELLEVWRKLNEAPSDRRD 
AFRLDLITVGRQVLGNYFLDVKMEFDRMVEAKDHQALKACGEKMKEILNDLDKLNAFHPY CSLDKWIDDARKMGDSPQLKDYYEKNARNLITTWGGSLNDYASRSWAGLISDYYAKRWEV YVNTFIKAAEEGVEVDQKQLEDELKEIEEGWVNATDRKDTRKDVHSTTDGLLSFSTFLFS KYQRLVK >gi|222159331|gb|ACAB01000028.1| GENE 75 125985 - 126566 510 193 aa, chain + ## HITS:1 COG:all2193 KEGG:ns NR:ns ## COG: all2193 COG1595 # Protein_GI_number: 17229685 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Nostoc sp. PCC 7120 # 11 178 19 189 201 67 26.0 2e-11 MSQKYSQDEAQALVKALKEGNQLAFSIVYKTYAAQTFSLAFKYLLNKELAEDAVQNLFLK LWLKKEEIDETKPINRYLFTMLKNDLLNTLRDSKKNIYLLEDCLSMVLELEDNSQNENLK QEQMNIIQQALEQLSPQRRKVFEMKVSGKYSNQEIADKLNLSINTIKFQYSQSLKQIRAT VGELSLLLLYCMM >gi|222159331|gb|ACAB01000028.1| GENE 76 126675 - 127631 506 318 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 119 308 131 315 331 72 28.0 1e-12 MKSKKKHTNQDFEVVDGIGKYMDDDIQRIIEGKQLELGDRELPPFDEYRIYKNIQEAVMR EEKRKNRRIRMPLFFKWTVACAIVLLVIGVGYNFYQSHNEANLVYREVCAVRGEKLFVLL PDGSRVWLNADSKLTYPEQFAKYNRNVTLEGEAYFEIAENKKSPFQVLAENVKIQVTGTC FNVKAYASDKVIKTTLDEGSINIGHVQSRRPMQQMLPGQTAVYEKRSNVIKIKTDRYHDD ASSWKSNRLIFRNASLKEVLTTLSRHFDIEITVKNEKIASFTYDFVCKGNDLNYVLEVMQ SITPVSFKKISEYTYTVE >gi|222159331|gb|ACAB01000028.1| GENE 77 127663 - 128436 596 257 aa, chain + ## HITS:1 COG:no KEGG:RB8407 NR:ns ## KEGG: RB8407 # Name: not_defined # Def: hypothetical protein # Organism: R.baltica # Pathway: not_defined # 26 255 44 272 281 140 34.0 4e-32 MKALLKPIVWVCLFFFAYQSTYAQALKIMSYNCRMSGEMTGYSVKEYAVFIRKYNPDVVM LQEIDYNTKRNKNQDFTTQLAAELGLFSVFGKAMDTGGGEYGVAILSKYPFVYINNKTFE GIDGAKEPRTLLYVDIQEPGTSDVIRIGTTHLDHSTDLIRSAMAEQINERIGTGDTPTLL GGDFNARTDSNVICEVMKNWQRICDDTFTYPADQPTIKIDYIFGLPQNKWKVKSFKVLSN PEVSDHRALFAEVEFVK >gi|222159331|gb|ACAB01000028.1| GENE 78 
128444 - 129223 247 259 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717296|ref|ZP_04547777.1| ## NR: gi|237717296|ref|ZP_04547777.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 259 1 259 259 489 100.0 1e-137 MKSNYLYLLFICVISCLFVSCSDEDNKGEDIPEWNRFRTTNVLVYAHLSEQNLFSSCSYK EVASSIRNTVHSVALLDRTNAVYGQTAIINTGTETARESKKVPVFVPVSYSNGEKKIIGS TVLFPSTISEMTQYVVKNDCRYLETKTEAVNGIDMLFCSVSLNSEDLIAPAVDVFKKKVN EQTVLVGTVKRALLPNLESAITSNLTSDTYSFVEVENRNRDSEYCIFVLTSHKWAFRGVT ETSVSGDLHCFQLQIESLK >gi|222159331|gb|ACAB01000028.1| GENE 79 129237 - 130766 1036 509 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173237|ref|ZP_05759649.1| ## NR: gi|260173237|ref|ZP_05759649.1| hypothetical protein BacD2_15305 [Bacteroides sp. D2] # 1 509 1 509 509 984 98.0 0 MKQYIKNIMILILALPWGLLLTNCSDSKSDYSDDNAYPPPTVELTSPSEIDVVEYNSTVT VSARSFSAVGIHSIYATLLKMDENGEYEEINATERQRLKIDTLQTDMTLEFDLNVKVNTR EAAGILVTSTDVLTKTAQKVIPIKKITKLPSQIFTEPSDFPVLVPDEEVSLSVVIRSAVG IKSIKHTLCNKVLGDLKEYTTIPVSGNPLEMEFILKTVVDNKETNGIKIVVEDIEGLKEE KIINVEGLEGVDNNVALVFNDIEMAPEWEHSTEPDQPYIFSIEGIMIRGVQKHVLSLKEI KGYGSKANSIDFAFINIWRNPSFVAVKNRGFSYVSASRINGGPIGRAYDVNDWIKPAGVA TNKTLFTLIPDDKVVDLGIDAMMANAASDVKTFEALNVLESIAKGGADMLMQRVNASDGY PNDPCSLQIKDGSYIAFVTAAGKYGVIHVIEAANDMDALVAGGCKIATPTGVVGSQGPAY SGAGITGLTYDGVALLYGRTCKLKIVVQK >gi|222159331|gb|ACAB01000028.1| GENE 80 131087 - 132073 722 328 aa, chain + ## HITS:1 COG:AGpAbx251 KEGG:ns NR:ns ## COG: AGpAbx251 COG3712 # Protein_GI_number: 16119537 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 15 274 24 258 311 76 27.0 9e-14 MEDIYIIISKYLLRTASPQEEMEIMEWRNADAGHEQEFQELCESWQIAHAGIHPVIPDKE RVWEKIMSNLNLVKPVKMYTQRLLYRAVGIAAMLALALGFSLSLLVSEKEEVGLVSFTAP VGQKAEVSLPDGTKVWLNSGSTLTYSTDYAKDCRSVKLNGQAFFDVVQDSKRQFDVSVGD VKVLVHGTAFDVNGYGDHSELEVVLLRGHVTVVSTLTDKLLADMKPNQKVIIPLHEMEKC KLEACDAEVESVWRLGKLKIENENLQEIVQKMERWYGIKIQLHDVPENKRYWMTIKTESL 
REMLEIINRVTPITYTINGEEVSITGRK >gi|222159331|gb|ACAB01000028.1| GENE 81 132236 - 135451 2457 1071 aa, chain + ## HITS:1 COG:no KEGG:Dfer_0714 NR:ns ## KEGG: Dfer_0714 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: D.fermentans # Pathway: not_defined # 1 1071 7 1124 1124 847 41.0 0 MLKLKQILCSILVLGSLLSSPSLWAQKETKVNIAARQISLKNLIQAVEKQTDYTFVFDNS IPLSRIVSLKGGSQYLSDVLKQAFDKSDIAYEIVGKQIVLQKVQTKTNRTISGVVKDEQG EAVIGASVLVKGTTNGTVTDFNGKFELQNVPESATIDVTFIGYAPQSKKVVAGVRSMNFV LEEDTETLDEVVVVGYGVQRKRDLSGSIASVKGDIITEYANTSVASALQGRVSGVQIQQT NGQPGAGIQVRVRGSNSIRGDNEPLWIINGFPGDINMINTADIESVEVMKDASATAIYGS RGANGVILITTKQAKEGKITVEYNASFGVQSLAKELELLDAWEYMSYLNEKAAINNQPTI YTDEEIKSTRHSTNWQRELFRNALVTDHSVNVSGGTEKVQGTLGASYFDQQGIVKESGYK RMSIRSSLNYHISKYVTVSSNLIFSRSNHNQMNSQGGSRGTSVIGSTLILPPTATPHYDD GTWNDFQTQPIAPVNPLAYVKEVDNKWYANRIMANASLTIKPIDGLSIQLSANVNNNQNR KDYYKSLQYPNSQGAASITFGETVGITSNNIITYNKSFLKKHHLSVMGGFTYEQSTSKTA GTGTAEGFLSDVTETYDMDAATVKGLPTSSYSDWRLFSFLGRVNYNYADRYLLTASLRAD GSSRYSKGNKWGYFPSAAAAWRLSQESFLRDVEWLSDLKFRLSYGVTGSTAISPYSTQNT LRTENVVFDKNTTVAYVPSDTYTGDLKWETTSQFNVGVDLSLFNNRLRMTADYYRKKTTD LLNNVEMPRSSGYTTALRNIGSIRNSGFELQLDGRIIDRAVKWDLGVNFSLNRSKVLVLS EDKDIFGGELDNTILKDQLNLMRVGEPMYVFYGYVEDGYDENGHIVYKNMDDDPAITAAD KTIIGDPNPDFLVNLTTAVSYKGFTLSAFFQSSIGNDIYSLSMAAQAYDYGYNGNTLREV YYNHWTPENPTAKYPNLDQTSYKMSDRFVYDGSFVRLKNLELAYDVPCARSKFIKRARVY VSAQNLFTITSYPFWDPDINANGGGSSMIQGVDSYCYPSARTYTIGCRLTF >gi|222159331|gb|ACAB01000028.1| GENE 82 135472 - 136950 1411 492 aa, chain + ## HITS:1 COG:no KEGG:Dfer_4709 NR:ns ## KEGG: Dfer_4709 # Name: not_defined # Def: RagB/SusD domain protein # Organism: D.fermentans # Pathway: not_defined # 1 492 1 479 479 283 37.0 1e-74 MKKLLIYIAAFAMILAMNTSCEDMGSLEEHPKKVDATTFMANAKEVESVINSIYFQLRRD PGFSRYLIVLEEGLADYCIGRGNYATAYDTGLTSGGVGFSKDSWAVLYRAIRFANNILDG IGNTPLSQQEYNNLTGETRFLRAFAYSWLARNWGAVPFFDEQNMNDFNKPRTPEADIWKF VIDEADYAASNLPEVAKSAGRPSRYAALALKTEACLYAGRYEEAAEAAGLIISSKRYSLV EVGKADDFLDLYGHTANATSEEIFYIKFNRDSGSTIAYMYLCKPNPFVNMGAVGIYTDYQ 
KNKFIQNWDQNDLRYQFGLYKQTQNGTLNALTKTGMICSKFRDSEWTGSSTTPNDNPVYR YADILLYYAEAVCRWKGAPTDDAMEKLNMVRRRAYGQKPTQASSSDYKLADYASKDAFLA LVLQERGYETIFEGKRYNDLKRCGKLAEAALAAGRISALSEVGDAAYWWPIPTDEFNYNM ALDQTKDQNPGY >gi|222159331|gb|ACAB01000028.1| GENE 83 136974 - 138050 710 358 aa, chain + ## HITS:1 COG:no KEGG:ZPR_4337 NR:ns ## KEGG: ZPR_4337 # Name: not_defined # Def: glycoside hydrolase, family 5 # Organism: Z.profunda # Pathway: not_defined # 1 355 1 350 353 327 45.0 4e-88 MNRLNILIVTILFLLTTTGCAQQAERWSTEKANAWYASQKWPVGINYVTATAINQFEMWQ EETFDPKTMELELGRAGELGFNTVRIFLHDMVWEADPAGFKQRLDTFLGICQKHGMRAIV TFFTNGGRFESPKLGVQPASVQGVHNSQWIQSPGAPSVNDPSTYPRLERYVKDVMTTFKA DDRILLWCLYNEPENFKQKAHSMPLLREVFRWAREVNPSQPLSSPIWIYPGGHGTRSNLP IISFLGENCDVMTFHCYYGPEEMEKFIAFMKQFDRPVICQEYMGRPRSTFEEIMPILKRE KVGAISWGLTAGKCNFHLQWSSKAGDPEPEIWFHDIFRLDGTPYSQQEIDFIKSMTSN >gi|222159331|gb|ACAB01000028.1| GENE 84 138087 - 139964 1270 625 aa, chain + ## HITS:1 COG:no KEGG:BT_2892 NR:ns ## KEGG: BT_2892 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 96 486 65 448 570 91 26.0 8e-17 MKKLKTHRYLLVTVMGLLVACTSYTSTDEWPNPRPNSNPDPEPVTGEITSIQSLNKSENV AVNSHADEWGNSSLELDYRRLLTLESPALSAVNALYPRIKKLGDGTYLLLYQQGPQAWNV YYALSTNLITWQNASSPLFQSESARQTSGASDTRCFSSCDAVVLANKDILAFASFRLNQG YRVDPQSNGIMMRRSSDNGRTWSTAQIIYQGTTWEPYALQLRSGEIQVYFTDSEPLTADS GTAMLRSMDNGRTWTVVGKVIRQKTGLAIDGSGKQIYSDQMPSARELNNSTKIAVATETR FRDEGDVYHISMAWSSDNWASAPLTGDEVGPSDRKLNFVQKAAAPYLVQFPSGETVLSYN ASSLFTMQVGNAEASQWGETYQPFSGKGYWGATEIIDPHTLVAAMPATFVNSENKDAARI QIGQFVLNHRINASGMTPVIDADNSDWSAVSDALFIGSVSTTQAVFRFAYDAENVYCLVE RLDKDLTTDDSMELIFQGGDATGTPLKISLIPDAVQYTIKCSHSSVTCKGAVHGTFGDTA ADKGYVVEMAIPRTLLRVVADRLMFNATLYDQNGSDTFTGLTATNYEKWLPIVLKAATDP EPLPGEGDTGTGPSWNNGDTEGTWK >gi|222159331|gb|ACAB01000028.1| GENE 85 139991 - 141880 1237 629 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717303|ref|ZP_04547784.1| ## NR: gi|237717303|ref|ZP_04547784.1| predicted protein [Bacteroides sp. 
D1] # 1 629 1 629 629 1208 100.0 0 MKKNKILILLVIGLSLGSCVQTDDEAFSPEPAVKGGEKSITAICAVTATTRTSLNASMNV VWTQNDEIRLFGTSTPNGAVYTTTASSVRTAVFDPVDTSVDDAIRYAVYPASAASGSQLE GTTLAVDFSALAGQKWMGAFNADSEISSLPMAAASDGEAFSFKNLCGGVRIQLTDYQLMG INVKSVAVRGNDGEQVSGIADVNVADGIVSLRKSSTPSAAATALVTCDTSVPLSADNNPS AHTDFVLFLPAVNYTKGLTFVITDAAGRIYEQATPGAFTIEAGVVKPMELLPVTLYYGKA NCYRTASAGTLEIDVTPYYSLAGDYTYENRPRVNINGELVDKAVSATVLWTQTNSSSSGD VLSAVPALEGTTLKVPVSGVKGNALVAIRDASGKNVWSFHIWVTEASDLTYINEERGTFK MMDRNLGATSVTPKDQNAYGAWYQWGRKDPFPRPLDIVRSSATTVSDKELTANATTSAEV GTVSYTISNPDTRIFSANDWHNEWRNNGLWGNSDGLTKNVKTVYDPCPEGYCVPDQNCYQ GFIFTSKTECDNNYGHLFVIDGSQTSYFPTGGYLDRGANKIAYQEYRGYQWTSNPGTTGA YYFYYNNANLNFTGLDRASAASVRCVRIE >gi|222159331|gb|ACAB01000028.1| GENE 86 141920 - 145174 2468 1084 aa, chain + ## HITS:1 COG:no KEGG:BVU_0750 NR:ns ## KEGG: BVU_0750 # Name: not_defined # Def: TPR domain-containing protein # Organism: B.vulgatus # Pathway: not_defined # 12 995 8 1019 1113 517 32.0 1e-144 MKYFNILFLILFLTTSAKAQSVKVSVSKVSIPTYTEPEREELPMFAENRVHQRSSGNPYP NKIVLKVNREQKVDKEYTLIKLENEYLELQILPEIGGKIYAAKDKTNGYDFFYKNHVIKP ALIGALGSWISGGLEFNWPFHHRASSFMPTDYEIEKLPGGGVIVWVSEHDPTDRMKGMVG IVLNPGESIFETRVKLSNITPLRHSFLWWENVAVPSNKNYEIFFPHDVSHVFFHYKRSVT TYPVATNAAGIFNGIRYDGAVDISKHKNTIQPTSYFSAASQYDFFGGYDTGRKCGVVHIG DHHVSPGKKMFTWAYNQLSQSWENALTDTDGAYCELMAGSYSDNQPDFTWLEPMETKTFS QYWFPIGEIGVPDFANTTGAIYVKDTIKVQLNKTRNVKITVKGDNQILYSGKATVKAREE YMLPADVRMKLGYSIDVTANDGTVLMSYTVKKHDTFNIPHTTQDMPNIKKVESPHLLYLE GLHVDQYRDPATKGESYYKEALERDPNFAPALIALGEAKLRNAFYSEALEYLLRAEKVLT RFNTRLENGKLYYLLGHVYLALDEQEKSYDYFRKAAWSSAYVSSAMTYAAMLDIRKLEYD KAVQHLTTAITYHKDNAVANALMIYASYLQGDKKASERQYLSVEANDKLNHLARYFGVLT GKVSARDFMEKIRTDKNQVCLDLIETLLVANLQKEAVSLIEMLQTHEPLIFSLSAIYADI KGGSPNDSATEGIAFPSRRIEMNSLSHWAKQGSRKAQLLLGCALYAKGHYEKAVALWEGL SGTDYRAARNLAVAYYSHMDRKNEVLPLLKQALSLKPNDEQLIFETVYVMGKLGVAPIER ISFLNNHKSVISRDDIMLEWARAYNMAGQEDKAIELLRGRNFVPAEGGEHAVAEQYMFAY FLKGRRLMKENKMQEAVDCFKAAQTLPQNLGAGLWNIVRLVPFKYYEAICLKSLGQEDKA NENFDFITGIEVDYFSNMNLPELPFYQALCYRETGMPFKGDMLINYKLQDWKEGMKTVDA 
GYFATTPFFISFCDRAVQQRSAYYSYLLALAYRYTGDTKLAQKYIEQAAVSDPYALNIFA ERQF >gi|222159331|gb|ACAB01000028.1| GENE 87 145371 - 145922 194 183 aa, chain + ## HITS:1 COG:PA0149 KEGG:ns NR:ns ## COG: PA0149 COG1595 # Protein_GI_number: 15595347 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 39 174 36 172 181 72 29.0 5e-13 MRLINTIPGTMDIQAFRNYYDTYYEQLCCFLNFYTHDVAVIEDIIQEVFLKLWENKDCIE ITYIKTYLFRAAKNRVLNYLRDEENRHQLLENWFNQQLEERKYKDCFDIDALTKVVNRAI EQLPERCREIFSLSRKEGLSYRQIAERLGISVKTVETQISIALKRIREILSSSAFAFLWL FIR >gi|222159331|gb|ACAB01000028.1| GENE 88 145952 - 147091 788 379 aa, chain + ## HITS:1 COG:TM0282 KEGG:ns NR:ns ## COG: TM0282 COG2017 # Protein_GI_number: 15643051 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Thermotoga maritima # 26 378 8 355 356 279 44.0 6e-75 MNRFFACLYLAVLVSACAECEVIPGVKTAFYGVTQGGDTVTQYILTNASGAQFKVIDYGC RVTNIMVPDREGKMADVVLGYENLKDYETGAERFFGALLGRYANRIAGGDFMIDSVRYQL SCNESPNGHPGHLHGGVKGFDRVMWKAMPVNCSDTLGIVFTRCSLDGEEGYPGNLDCKVT YYWTPDNTWRIEYEAVTDQPTIVNMSQHCYFNLQGYDGGSVLNHIVQINADSVTVNTPWY VPFSVESVVGGPLDFRTPHSFAERVTSPNEHMKLMGGYSANWILRDYNGDLRYAATVTEP QSGRQLETYTTEPGLLIYTGIGLSEKIVAKGGPQQKYGGLILETIHHPDTPHHPEFPSCV LRPHEKYYSVTEYRFTVQK >gi|222159331|gb|ACAB01000028.1| GENE 89 147208 - 148011 717 267 aa, chain + ## HITS:1 COG:PA3818 KEGG:ns NR:ns ## COG: PA3818 COG0483 # Protein_GI_number: 15599013 # Func_class: G Carbohydrate transport and metabolism # Function: Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family # Organism: Pseudomonas aeruginosa # 13 264 10 263 271 160 37.0 2e-39 MLDLKQLTADVCRIATEVGSFLKEERKNFRRERVVEKHAHDYVSYVDKESEVRVVKALTA LLPEAGFITEEGSATYQDEPYCWVIDPLDGTTNYIHDEAPYCVSIALRSRTELLLGVVYE VCRDECFYAWKGGKAFMNGEEIHVSNVEDIKDAFVITELPYNHLQYKQTALHLIDQLYGV VGGIRMNGSAAAAICYVAIGRFDAWMEAFLGKWDYSAAALIVQEAGGKVTDFYGEDHFIE GHHIIATNGNLHPVFQKLLLEVPPLNM 
>gi|222159331|gb|ACAB01000028.1| GENE 90 148055 - 148762 373 235 aa, chain + ## HITS:1 COG:DR1389 KEGG:ns NR:ns ## COG: DR1389 COG1040 # Protein_GI_number: 15806406 # Func_class: R General function prediction only # Function: Predicted amidophosphoribosyltransferases # Organism: Deinococcus radiodurans # 14 218 17 204 219 97 33.0 2e-20 MKHTLLIKDWLSSFLSLLFPRCCVVCGRPLAKGEECICTVCNINLPRTNYHLRKDNPVER LFWGQIPLERATSFFFYEKGSDFRLILHRLKYGGQKEIGAIMGRYMAAELLSSNFFQGID VIIPIPLHKKKQQIRGYNQSEWIARGIAAVTGIPIDTESILRKKNTETQTRKSVFERRDN VEGIFELQHPETLAGKHILIVDDVLTTGSTTLACASCLVDMEEIRISILTLAMVE >gi|222159331|gb|ACAB01000028.1| GENE 91 149734 - 150723 937 329 aa, chain + ## HITS:1 COG:mll5335 KEGG:ns NR:ns ## COG: mll5335 COG0524 # Protein_GI_number: 13474450 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Mesorhizobium loti # 4 308 19 326 343 181 36.0 2e-45 MDKIIGLGNALVDVLATLKDDTLLDEMGLPKGSMQLIDDAKLQQINTKFSQMKTHLATGG SAGNAILGLACLGAGTGFIGKVGNDNYGEFFRENLQKNKIEDKLLNSDRLPSGVASTFIS PDGERTFGTYLGAAASLRAEELTLDMFKGYAYLFIEGYLVQDHEMILHAIELAKEAGLQI CLDMASYNIVANDLEFFSLLINKYVDIVFANEEEAKAFTGKEPEEALGVIAKKCSIAIVK VGASGSYIRKGTEEIKVSAIPVQKVVDTTGAGDYFASGFLYGLTCGYSLDKCAKIGSILS GNVIQVIGTTMPQERWDEIKLNINRILAE >gi|222159331|gb|ACAB01000028.1| GENE 92 150770 - 152458 1851 562 aa, chain + ## HITS:1 COG:aq_797 KEGG:ns NR:ns ## COG: aq_797 COG0793 # Protein_GI_number: 15606169 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Aquifex aeolicus # 7 409 2 398 408 235 36.0 2e-61 MKKLLNRRIAIVAVAVIATVAFFSFKSGDDRNFQIAKNLDIFNAIVKELDMFYVDTIDPN KTIREGIDNMLYTLDPYTEYYPEEDQSELEQMIKGSFGGIGSYIAYNTKLKRSMISEPFE GTPAAKAGLKAGDILMEIDGKDLAGKNNAEVSQMLRGQAGTSFKLKIERPNVKGGRTPME FTIVRESIQNPAIPYTAVLDNNIGYIGLSTFSGNPSKEFKKAFLDLKKQGATSLVIDLRS NGGGLLDEAVEIANYFLPRGKVIVTTKGKIKQASNTYKTLREPLDLDIPIAVLVNSGTAS ASEILSGSLQDLDRAVIVGNRTFGKGLVQVPRSLPYGGTMKVTTSKYYIPSGRCVQAIDY KHRNEDGSVGTIPDSLTKVFYTAAGREVRDGGGVMPDITIKQEKLPNILFYLVRDNLIFD 
YATQYCLKHPTIVAPEKFEVTDADYNDFKALVKKADFKYDQQSEKILKTLKEAAEFEGYM DDASEEFKVLEKKLNHNLDRDLDYFSTDIKKMIATEIIKRYYYQRGNIIQQLKDDDGLKE AMKILNDPVKYKEMLSAPVAKK >gi|222159331|gb|ACAB01000028.1| GENE 93 152595 - 153785 1011 396 aa, chain - ## HITS:1 COG:VCA0709_1 KEGG:ns NR:ns ## COG: VCA0709_1 COG0642 # Protein_GI_number: 15601465 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 168 393 495 727 738 156 41.0 7e-38 MRHYEGKTKEELLKIIEQLEEKIDFLSSHLSPPSQERFRDKYSTRILDALPDMLTVFDHD ANIIELASSPATNHVEGINASNITTTNVKDILPEEAYESVRKNMDKVILTGESSTARHDL MLDGVLHHYENRIFPLDNKYLLCMCRDISQQWEAEQTNAQQQKELKAARVKAEESDRLKS AFLANMSHEIRTPLNAIVGFSKLITFATSTEEKNQYAEIIERNSEMLLNLFNDILDLSSL EADSLKFNIRPIKLTDICLQLKQQFCHKTQNGVKLILDDVDADMYASGDWNRIIQIISNL LSNATKFTPKGEIHFGYREKEDFVEFYVKDSGIGIPAERVATIFRRFGKVNDFVQGTGLG LTLCRMLVEKMGGRIWLRSQEGQGSRFYFTLPLIRQ >gi|222159331|gb|ACAB01000028.1| GENE 94 153969 - 155387 1482 472 aa, chain + ## HITS:1 COG:XF1037 KEGG:ns NR:ns ## COG: XF1037 COG0499 # Protein_GI_number: 15837639 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylhomocysteine hydrolase # Organism: Xylella fastidiosa 9a5c # 33 472 1 446 446 648 68.0 0 MSTELFSTLPYKVADITLADFGRKEIDLAEKEMPGLMALREKYGESKPLKGARIMGSLHM TIQTAVLIETLVALGAEVRWCSCNIYSTQDHAAAAIAATGVPVFAWKGETLADYWWCTLQ ALSFAGGKGPNVIVDDGGDATMMIHVGYDAENNAAVLDKEVHAEDEIELNAILKKVLAED STRWHRVAEEMRGVSEETTTGVHRLYQMQEEGKLLFPAFNVNDSVTKSKFDNLYGCRESL ADGIKRATDVMIAGKVVVVCGYGDVGKGCSHSMRSYGARVLVTEVDPICALQAAMEGFEV VTMEEACMEGNIFVTTTGNIDIIRIDHMEKMKDQSIVCNIGHFDNEIQVDALKHYPGIKC VNIKPQVDRYYFPDGHSIILLADGRLVNLGCATGHPSFVMSNSFTNQTLAQMELFNKKYD INVYRLPKHLDEEVARLHLEKIGVKLTKLTPEQAAYIGVSVDGPYKADHYRY >gi|222159331|gb|ACAB01000028.1| GENE 95 155412 - 155585 99 57 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIKARMAIKSGRNFFIYNDTIYYLRLKFEYKINYDLTIQYLQLITVEYRSPTFINRK >gi|222159331|gb|ACAB01000028.1| GENE 96 155539 - 158052 2143 837 aa, chain + ## HITS:1 COG:no KEGG:BT_2796 NR:ns ## KEGG: BT_2796 # Name: 
not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 825 1 824 843 1444 85.0 0 MKKFLPDLIAILAFIILSFAYFFPADIEGRILFQHDTAAGVGAGQESKEYLERTGERTRW TNSIFGGMPTYQMSPSYDSTTSLKGVEKVYRLFLPDYVVLTFIMMLGFYILLRAFGISAW LAGLGGVIWAFSSYFFILIPAGHIWKFVTLAYIPPTIAGVVLAYRKKYLLGGIITALFIA LQIQSNHIQMSYYFMFVILFFVGAYFEDAYKKKELPHFFKASAILALAAVVGVCINISNL YHTYEYSKETMRGKSELKQEGAAASQTSSGLDRDYITNWSYGIGETLTLLVPNVKGGGSG STMSQSEVAMAKANPMYSGIYSQLPQYFGEQPWTAGPVYVGAFVMFLFVLGCFIVKGPLK WALLGATIFSILLSWGKNFMGLTDFFIDYVPMYNKFRAVSSILVIAEFTIPLLAIFALKE ILSKPDTLKLKENRGGVIATLVLTAGVALILAVAPGTFFSGFITTQEMAALKQALPAEHL APFVANLTEMREAIIASDAWRSFFIIMIGCLFLFLYQQRKLKASFTLAGIALLCLIDMWS INKRYLNDEQFVPKSKQTEAFVKTQADEMILQDTTLNYRVLNFIGFPGNTFNENNTAYWH KSVGGYHAAKLRRYQEMIDHHIVPEMQETYQAVATAGGQMDSVDASKFRVLNMLNTKYFI FPAGEQGQAVPVVNPYAYGNAWFVDKVQYVNNANEEIDALNDILPTETAVVDVKFKEQLK GVTEGYKDSLSTIQLTSYEPNRLVYKASTPKDGVVVFSEIYYPGWQATIDGQPVDIARAD YILRAINIPAGEHTIEMWFDPQSIQVTESIAYAALTLLLIGVIIFALMQRSKIAKKP >gi|222159331|gb|ACAB01000028.1| GENE 97 158336 - 159667 1151 443 aa, chain + ## HITS:1 COG:aq_1332 KEGG:ns NR:ns ## COG: aq_1332 COG1538 # Protein_GI_number: 15606535 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Aquifex aeolicus # 4 435 11 414 415 99 20.0 2e-20 MKRLLFIFSFFFLFVLRGNTQEVEILTLEDCLRIGVDNNLSLEGKRKEIQKSKYGVSENR SKLLPQINAIAGYNNNFDPPVSVTDGSSYGVPYNITKTLQHSANAGLEMQMPLFNQTLYT SMSIAKVMEEISRLSYGKAREDVILQISKMYYLGQVTAEQIMLIKANITRLEELRDITQA FFDNGMSMEVDLKRVNINLENLKVQHDNAQAMMKQQLNMLKYIMDYPAEKEIALTPVNTD SITTVALTGLSENIYELQLSQSQVQLAERQKKIITNGYIPSLSLTGSWRYAAYTDKGYHW FHSGPSNQWFRSYGVGLTLRIPIFDGLDKTYKIKKAMIDIENKRLAWEDARKNLQTQYLN AVNDLMNNQRNFKKQKDNYLLAEDVYAVTSDRYREGIASMTEVLQDEMQMSEAQNNYISA HYNYRVTNLMLLKLTGQIESLVK >gi|222159331|gb|ACAB01000028.1| GENE 98 159690 - 160781 1113 363 aa, chain + ## HITS:1 COG:BMEII0793 KEGG:ns NR:ns ## COG: BMEII0793 COG1566 # Protein_GI_number: 17989138 # Func_class: V 
Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Brucella melitensis # 56 360 14 316 325 160 33.0 4e-39 METMENNLPSATHEEKAKKMKKLRRWQIAISLLGVAIIVWGVIEVICLFLNYSQTETSND AQIEQYVSPINLRASGYIDKIYFTEHQEVHKGDTLLVLDDREYKIRVMEAEAALKDAQAG ATVINATLNTTQTTASVYDASIAEIEVRLAKLEKDRKRYENLVKRNAATPIQLEQIVTDY EATRKKLEATKRQKKAALSGVDEVSYRRMNTEAAIQRATAALEMARLNLSYTVVIAPCDG KLGRRSLEEGQFISAGQTITYILPDTQKWIVANYKETQIENLHIGQEVFVTVDAISDKEF KGKVTSISGATGSKYSLVPTDNSAGNFVKIQQRIPVRIDFTDLSKEDNERLAAGMMVVVK AKL >gi|222159331|gb|ACAB01000028.1| GENE 99 160831 - 162468 1081 545 aa, chain + ## HITS:1 COG:no KEGG:BT_2793 NR:ns ## KEGG: BT_2793 # Name: not_defined # Def: putative MFS transporter # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 545 1 545 545 893 86.0 0 MPSYPKNYPFYSWMPKPLGIIILLFFFLPILTVGGVYSVNSTEMMSGLGIISEHIQFANF VTSIGMAAFAPFLYQLVCVRREKMMCIVGFAFMYIFSYICAKTDSVFLLALCSLLTGFLR MVLMMVNLFTLIWYAGGMEATRNITPGLEPKDTAGWNKLDIERCVSQPAVYLFFMILGQS GTALTAWLSFEYEWKYVYYFMMGILLISILLLFITMPNYKFPGRFPINFRKFGNVTAFCI SLTCLTYVLVYGKVLDWYDDESIRWATAVSILFAGIFLYMDVTRRSPYVLLDAFKLRTIR MGTLLYLLLMVINSSAMFVNVFAGVGMHLDNLQNASLGNWCMVGYAIGAVIAMVLGGKGL HFKYLFAMGFFFLSLSAVFMYFEVQTAGVYERLKYAVIIRATGMMILYALTAAYANQRMP FKYLSTWICIMLTVRMVVGPSIGGAIYTNVLQERQQHYITRYAQNVDLLNPDASTSFLGT VQGMKYQGKSETEARNMAAISTKGRIQVQATLSALKEMAGWTIYGGLICMIFVLVVPYPK RKLLT >gi|222159331|gb|ACAB01000028.1| GENE 100 162507 - 163382 535 291 aa, chain + ## HITS:1 COG:BMEII0641 KEGG:ns NR:ns ## COG: BMEII0641 COG2207 # Protein_GI_number: 17988986 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Brucella melitensis # 122 288 123 291 307 72 26.0 1e-12 MDSGQDRLLQFDRDLLSGKNICSSEGIFVNFPPSLKKPFQMKGLGVIICRQGNFQFSLNQ KKHFAGAGESLFIPEDGEFQVLQESEDMEVRILIYQIEPIRDIMGNLVVSMYMYSRLTPE EPSCVWSTGEEEEIVKYMSLLDNVLQSEENSFKLYEQKLLLLALTYRICSIYNRKLVNDG REVGGRKNEVFIHLIQLIEKYYMQERGVEFYADKLCLSPKYLSAVSKSICGYTVQELVFK AIIRKSISLLKNTQKDIQEISNAFGFPNASYFGTFFKKQVGVSPQQYRKNL >gi|222159331|gb|ACAB01000028.1| GENE 101 163387 - 
164040 724 217 aa, chain - ## HITS:1 COG:MTH1114 KEGG:ns NR:ns ## COG: MTH1114 COG0035 # Protein_GI_number: 15679125 # Func_class: F Nucleotide transport and metabolism # Function: Uracil phosphoribosyltransferase # Organism: Methanothermobacter thermautotrophicus # 1 214 8 211 215 137 38.0 2e-32 MKVIDFGQTNSILNQYISEIRNVEVQNDRLRFRRNIERIGEIMAYEMSKEFKYSVKNIQT PLGIAPVSTPDNNLVISTILRAGLPFHQGFLSYFDGAENAFVSAYRKYKDTLKFDIHIEY IASPRIDDKTLIITDPMLATGGSMELSYQAMLTKGHPAEIHVASIIASQKAIDHIKNVFP EDKTTIWCAAIDPELNEHSYIVPGLGDAGDLAYGEKE >gi|222159331|gb|ACAB01000028.1| GENE 102 164321 - 165928 1653 535 aa, chain + ## HITS:1 COG:VC2738 KEGG:ns NR:ns ## COG: VC2738 COG1866 # Protein_GI_number: 15642731 # Func_class: C Energy production and conversion # Function: Phosphoenolpyruvate carboxykinase (ATP) # Organism: Vibrio cholerae # 2 535 10 541 542 811 74.0 0 MANLDLSKYGITGVTEILHNPSYDVLFAEETKPSLEGFEKGQVTELGAVNVMTGIYTGRS PKDKFFVKNEASADSVWWTSEEYKNDNKPCTEEAWADLKAKAVKQLSGKRLFVVDTFCGA NEATRMKVRFIMEVAWQAHFVTNMFIRPTAEELANYGEPDFVCFNASKAKVDNYKELGLN SETATVFNLKTKEQVILNTWYGGEMKKGMFSIMNYMNPLRGIASMHCSANTDMEGTSSAI FFGLSGTGKTTLSTDPKRKLIGDDEHGWDNEGVFNYEGGCYAKVINLDKESEPDIYNAIK RDALLENVTVDANGKIDFTDKSVTENTRVSYPIYHIENIVKPVSKGPHAKQVIFLSADAF GVLPPVSILNPEQAQYYFLSGFTAKLAGTERGITEPTPTFSACFGAAFLSLHPTKYAEEL VKKMEMTGAKAYLVNTGWNGSGKRISIKDTRGIIDAILDGSIDKAPTKVIPFFDFVVPTE LPGVDPKILDPRDTYECACQWEEKAKDLAGRFIKNFAKFTGNEAGKALVAAGPKL >gi|222159331|gb|ACAB01000028.1| GENE 103 166079 - 167386 547 435 aa, chain + ## HITS:1 COG:aq_308 KEGG:ns NR:ns ## COG: aq_308 COG0249 # Protein_GI_number: 15605835 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Aquifex aeolicus # 157 434 489 774 859 97 30.0 5e-20 MTYLDTDKQTYADLSITETVNNEQFLFSLFSKTETKEGKSLMMNWVMYPLSDLDMIRKRQ EAVAWDALPELLLNEEELDFIEYYLAYRYQIREAHVLLSCATVIDRLLRYDSTRYVICRG VKLVIHLLHCLERWAKELDEDAPQLMKESARMVNDILSGSELGEVLEQTSGEERRLSNYT IDKYDYLFRCTRLLSLKELLSVLYLLDVCRTAHRVAREKNFCCTPKVVETMDFSVEGVVH 
PFVKNVRENSWVMSRGNISIFTGSNMAGKSTTLKALTLAVWLAHCGLPVPVKSMVCPLYE GIYTSINLPDSLRDGRSHFMAEVLRIKEVMQKALTGKRCLVVLDEMFRGTNAKDAFEASL AVNELLKRFSCCHFLISTHILEYAKAFEKDPACCFYYMEAEIIDDAFVCHHRLLQGISEA RVGYWVVKKLLAGLI >gi|222159331|gb|ACAB01000028.1| GENE 104 167504 - 169996 1892 830 aa, chain + ## HITS:1 COG:no KEGG:BT_4682 NR:ns ## KEGG: BT_4682 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 829 1 810 812 1423 81.0 0 MKKLYFFLICSLFLVSGLYAGETDYTKGLSIWFDTPNTLQGYAVWYGGRPDLWKGGDKPE TAGNAGHNPDPAWESQSLPIGNGSLGANIMGSVEAERITFNEKTLWRGGPNTAKGADYYW NVNKQSAHLLDEIRKAFTEGNQEKAEMLTRQNFNSEVSYDADGETPFRFGSFTTMGEFYV ETGLNIIGMSDYKRILSLDSAMAVVQFKKDHVVYQRNYFISYPANVMVMRFSADQPGKQN LVFSYAPNPVSTGNMASDGNKGLVYSASLDNNGMKYVVRIQAETKGGTLFNADGKLTVKG ADEVVFYITADTDYKPNFDPDFKDPKTYVGVNPEETTKEWMNNAVSQRYTALFSQHYNDY AALFDRVKLNLNPAIKGRNLPTPQRLKNYRAGQPDYYLEELYFQFGRYLLISSSRPGNMP ANLQGIWHNNVDGPWRVDYHNNINIQMNYWSVCSTNLNECMLPLVDFIRTLVKPGEKTAK SYFGARGWTASISGNIFGFTTPLESQDMSWNFNPMAGPWLATHIWEYYDYTRDLKFLKET GYELIKSSADFAVDYLWHKPDGTYTAAPSTSPEHGPIDQGATFVHAVVREILLDAIEASK VLGVDKKERKQWEHVLANLVPYKIGRYGQLMEWSVDIDDPKDEHRHVNHLFGLHPGHTVS PVTTPELAKAAKVVLVHRGDGATGWSMGWKLNQWARLQDGNHAYTLFGNLLKNGTLDNLW DTHSPFQIDGNFGGTAGITEMLLQSHMGFIQLLPALPDAWKEGSVSGICAKGNFEVDMVW ENNQLKEAVVHSNAGGNCVIKYADKTLSFKTVKGRSYRVEYDVTKGLIRQ >gi|222159331|gb|ACAB01000028.1| GENE 105 170028 - 170585 421 185 aa, chain + ## HITS:1 COG:FN0712 KEGG:ns NR:ns ## COG: FN0712 COG2059 # Protein_GI_number: 19704047 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 6 179 7 180 186 120 40.0 1e-27 MVLEYLKLFVTFAKIGMFTIGGGYAMIPLIEREIVAKQWMSKEEFMEMFALTQSLPGVFA VNISIFVGYKLYKVGGSLVCALATILPSFVIMMLIAMFFARFQDNEVMIRIFNGIRPAVV ALILFPCISAVRALKLKYLQLVAPAIATLLIWQFGLSPIYIVLAGILGGLVYTLWLKEKI MNKEV >gi|222159331|gb|ACAB01000028.1| GENE 106 170582 - 171106 472 174 aa, chain + ## HITS:1 COG:FN0713 KEGG:ns NR:ns ## COG: FN0713 COG2059 # Protein_GI_number: 19704048 # 
Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 1 174 1 173 176 121 41.0 5e-28 MIYLQLLWVYLKIGMFGFGGGYAMLSLIQHEIVDIHHWLTPQQFTDVVAISQMTPGPIGI NSATYVGYAVTQSVWGAVLATVAVCLPSFILVLLISYFFAKCKDNKYIKAAMSGLLPMSV ALIASAALLMMNRENFMDYKSIGIFAAAFLVTWKWNLHPILLICLAGVVGLLLY >gi|222159331|gb|ACAB01000028.1| GENE 107 171196 - 172995 1771 599 aa, chain - ## HITS:1 COG:DR1198 KEGG:ns NR:ns ## COG: DR1198 COG1217 # Protein_GI_number: 15806217 # Func_class: T Signal transduction mechanisms # Function: Predicted membrane GTPase involved in stress response # Organism: Deinococcus radiodurans # 5 594 4 593 593 661 55.0 0 MQNIRNIAIIAHVDHGKTTLVDKMLLAGNLFRSNQNSGELILDNNDLERERGITILSKNV SINYNGTKINIIDTPGHSDFGGEVERVLNMADGCILLVDAFEGPMPQTRFVLQKALEIGL KPIVVVNKVDKPNCRPEEVYEMVFDLMFSLNATEDQLDFPVIYGSAKNNWMSTDWQQQTD SITPLLDCIVENIPAPQQLEGTPQMLITSLDYSSYTGRIAVGRVHRGTLKEGMNITLVKR NGDMFKSKIKELHVFEGLGRVKTNEVSSGDICALVGIDGFEIGDTVCDFESPEALPPIAI DEPTMSMLFAINDSPFFGKDGKFVTSRHIHDRLMKELDKNLALRVRKSEEDGKWIVSGRG VLHLSVLIETMRREGYELQVGQPQVIFREIDGVKCEPIEELTINVPEEYSSKIIDMVTRR KGEMVKMENTGERINLEFNMPSRGIIGLRTNVLTASAGEAIMAHRFKEYQPHKGEIERRT NGSMIAMESGTAFAYAIDKLQDRGKFFIFPQDEVYAGQVVGEHSHDNDLVINVTKSKKLT NMRASGSDDKVRLIPPIQFSLEEALEYIKEDEYVEVTPKAMRMRKVILDEIERKRANKN >gi|222159331|gb|ACAB01000028.1| GENE 108 173151 - 173420 446 89 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883111|ref|ZP_02064114.1| hypothetical protein BACOVA_01079 [Bacteroides ovatus ATCC 8483] # 1 89 1 89 89 176 100 7e-43 MYLDAAKKQEIFGKYGKSNTDTGSAEAQVALFSYRISHLTEHMKLNRKDYSTERALTMLV GKRRRLLDYLKARDIERYRAIVKELGLRK >gi|222159331|gb|ACAB01000028.1| GENE 109 173607 - 174182 647 191 aa, chain + ## HITS:1 COG:MTH659 KEGG:ns NR:ns ## COG: MTH659 COG1396 # Protein_GI_number: 15678686 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Methanothermobacter thermautotrophicus # 7 190 6 189 190 194 57.0 6e-50 
MDTSKIVGEKIKALREDKSISIEELAQRSGLAIEQIERIENNIDIPSLAPLIKIARVLGV RLGTFLDDQDEVGPVVCRKKEAKDAISFSNNAIHSRKHMEYHSLSKSKADRHMEPFIIDV MPTQDTDFVLSSHEGEEFIMVMEGIMEISYGKNTYLLEEGDSIYYDSIVPHHVHAYEGQA AKILAVVYTPI >gi|222159331|gb|ACAB01000028.1| GENE 110 174218 - 175867 1584 549 aa, chain + ## HITS:1 COG:MTH657 KEGG:ns NR:ns ## COG: MTH657 COG0318 # Protein_GI_number: 15678684 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Methanothermobacter thermautotrophicus # 1 544 1 545 548 724 61.0 0 MQLFDRTLGQWLEHWAEETPDKEYIVYSDRNLRFTWSQLNRRVDDMAKGLIAIGVERGTH VGIWAANVPDWLTLLYACAKIGAVYVTVNTNYKQSELEYLCQNSDMHTLCIVNGEKDSDF VQMTYTMLPELKTCERGHLKSERFPYMRNVVYVGQEKHRGMYNTAEILLLGDNVEDDRLN ELKSKVDCHDVVNMQYTSGTTGFPKGVMLTHYNITNNGFLTGEHMKFTADDKLCCCVPLF HCFGVVLATMNCLTHGCTQVMVERFDPLVVLASIHKERCTALYGVPTMFIAELHHPMFDL FDMSCLRTGIMAGSLCPVELMKQVEEKMYMKVTSVYGLTETAPGMTASRIDDPFDVRCNT VGHDFEFTEVKVIDPETGEECPVGVQGEMCNRGYNTMKGYYKNPEATAEVLDENNFLHSG DLGIKDEDGNYRITGRIKDMIIRGGENIYPREIEEFLYKLDGVKDVQVAGIPSKKYGEAV GAFIILQEGVQMQEADVRDFCRNKISRYKIPKYIFFVNEFPMTGSGKIQKFRLKDLGLQL CKEQGIEII >gi|222159331|gb|ACAB01000028.1| GENE 111 176015 - 177097 774 360 aa, chain - ## HITS:1 COG:CAC3072 KEGG:ns NR:ns ## COG: CAC3072 COG0836 # Protein_GI_number: 15896323 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannose-1-phosphate guanylyltransferase # Organism: Clostridium acetobutylicum # 9 340 5 337 350 252 39.0 6e-67 MTNQDNYCVIMGGGIGSRFWPFSRKTLPKQFLDFFGTGRSLLQQTFDRFQKVIPTENIFI VTNAMYADLVKEQLPEVNEEQILLEPARRNTAPCIAWAAYHIRALNPNANIVVAPSDHLI LKEDEFLAAIEKGLDFVSRSEKLLTLGIKPNRPETGYGYIQIDEPAGGNFYKVKTFTEKP ELELAKVFVESGEFYWNSGLFMWNVNTIIKASEDLLPELASKLAPGKDIYATDKEKAFIE ENFPACPNVSIDFGIMEKADNVYVSLGDFGWSDLGTWGSLYDLSERDPEGNVTLKCHSLI YNSKDNMVVLPKGKLAVIDGLEGFLIAESDNVLLICRKDEEHAIRKYVNDAQMQLGDDFI >gi|222159331|gb|ACAB01000028.1| GENE 112 177521 - 178330 424 269 aa, chain + ## HITS:1 COG:mlr2757 KEGG:ns 
NR:ns ## COG: mlr2757 COG3177 # Protein_GI_number: 13472455 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 39 255 41 243 263 101 33.0 1e-21 MEQGVWQEIELLYQKFQKLGISEAVDYDKYYLYSLITHSTAIEGSTLTELDTQLLFDEGV TAKGKPLVHHLMNEDLKQAYELAKTESGSLVPITPAFLKRLNAMLMRTTGSVHSVLGGSF DSSKGEFRLCGVTAGVGGHSYMNYLKVPAKVDELCAILQEKQKKMGTLREQYELSFNAHL NLVTIHPWVDGNGRTARLLMNYIQFCYHLFPTKIFKEDREEYILSLRQCQNEETNQPFLD FMARQLKKSLSIEIERFNVSRKKGFSFMF >gi|222159331|gb|ACAB01000028.1| GENE 113 178356 - 179096 106 246 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_3691 NR:ns ## KEGG: Fjoh_3691 # Name: not_defined # Def: NERD domain-containing protein # Organism: F.johnsoniae # Pathway: not_defined # 24 243 36 255 259 164 41.0 2e-39 MHIAITVIFFAVVIFIKLKMPMWKGKYSEKLVNNKIQELPEEYVVFNDLLFESNGYSTQI DHIVVSPYGVFVIETKGYKGWILGRENGEYWTQTIYKSKHQFYNPIKQNAGHVRFLHHLL KCSTDILFIPIVVFNNSAELKVHADNNIVVNRYNLKRAILQYRTAVLNQETINWIIQTIN QNRIIADKEKLKQHKHNAKARQYRSSRLINQGVCPQCGGHLILRKGKYGTFYGCSNFPTC KFTINS >gi|222159331|gb|ACAB01000028.1| GENE 114 179106 - 179549 144 147 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|54307151|ref|YP_133696.1| ## NR: gi|54307151|ref|YP_133696.1| hypothetical protein NBU1_02 [Bacteroides uniformis] # 1 147 1 147 147 297 100.0 1e-79 MKKLTIIFYLFLIPLCVFAQNESSISHTKTVNDSTLIKIHAIVEDIDFRMHGLDRYKLYP TENIYNFLQLDTMTGKIEIVQWSLDDDKEGSATLNNEDLSLGTGCGTFELYPTKNIYQFL LLDKVTGRRWHVQWGFESSKRWIKRIY >gi|222159331|gb|ACAB01000028.1| GENE 115 179605 - 180324 440 239 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|54307150|ref|YP_133695.1| ## NR: gi|54307150|ref|YP_133695.1| hypothetical protein NBU1_01 [Bacteroides uniformis] # 1 239 1 239 239 436 100.0 1e-121 MKQNICVVIFLLLSTICKAQTFIYPQNDSILTEYNDGKFWAYLNKNDFVVGLSCFEEKDD YGKYYHLDIFIKNLSETPIVFQPDSVHSNLLTKKDDTLKLEVYTNEEYQKKIKRSQAWAM ALYGISAGINAGTAGYSTSYSTSYSSNGYAYTTITNHYDANAAYQANVASTNQMLTLGQM MDNDRTIREQGYLKTTTIYPDEAIIGYMNIKRKKGKILIVNIPIGDYIYTFKWDVNRKK >gi|222159331|gb|ACAB01000028.1| GENE 116 180643 - 182046 932 467 aa, chain - ## 
HITS:1 COG:no KEGG:BVU_1439 NR:ns ## KEGG: BVU_1439 # Name: not_defined # Def: mobilization protein # Organism: B.vulgatus # Pathway: not_defined # 1 467 1 467 467 786 89.0 0 MATKSSIHIKPCNIASSEVHNRRTAEYMRNIGESRIYVVPELSTNNEQWINPDFGTPELR THYDNIKQMVKEKTGRAMQEKERERKGKNGKIIKVAGCSPIREGVLLIRPDTTLADVRKF GEECQRRWGITPLQIFLHKDEGHWLNGQPEAEDRESFKVGNRWFKPNYHAHIVFDWMNHE TGKSRKLNDEDMAAMQTLASNILLMERGQAKAVTGKEHLERNDFIIKKQKAELQRIEETK RHKEQQVSLAEQELKQVKAEIRTDKLKSAATDVATAITSGVGSLFGSGKLKDLEQSNENL RHEIAKRDKGIDELKAKMQQMQEQHSKQIRNLQGIHNQELEAKDKEISRLNAILEKAFNW FPLLKEMLRMEKLCYAIGFTKDMINSLLTKKEAIRCNGRIYSEEHKRKFDIKNDIFKVEK NPTDDSKLILTINRQPIGEWFKEQWEKLRQGLRQLAEEPRKSRGFRM >gi|222159331|gb|ACAB01000028.1| GENE 117 182258 - 183214 415 318 aa, chain - ## HITS:1 COG:no KEGG:BVU_1440 NR:ns ## KEGG: BVU_1440 # Name: not_defined # Def: DNA primase # Organism: B.vulgatus # Pathway: not_defined # 1 318 1 318 318 615 95.0 1e-175 MTIEEAKSIRIADYLHSLGYSPVKQQGINLWYKSPFREESEASFKVNTEREQWYDFGLGK GGNIIALAAHLYATDHVPYILKRIAEQTPHVRPVSFSFGKQSSSEPSFQQLEIVPLSSPA LLAYLQGRGINIELAKRECSEARFTHNGKRYFAIAFPNGSGGFEVRNRYFKGCIAPKEIS HIRQSGKARNTCYVFEGFMDYLSFLTLRQESCPNYPELDGQDYIVLNSVSNVNKALYPLG NYERIHCFFDNDHAGIEALQQIRKEYSRDRYIRDASQIYSGCKDLNEYLQKQVERKRQVQ SVKGMSSQSPKKKNGFRL >gi|222159331|gb|ACAB01000028.1| GENE 118 183446 - 184636 619 396 aa, chain - ## HITS:1 COG:no KEGG:BDI_2140 NR:ns ## KEGG: BDI_2140 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 396 18 413 413 644 78.0 0 MKEEEYMRVGTTLYKVVNQPCANGGYEKKRVVWNNSTLRQDYGKNYLATVPKYDGFCTVP NHLNYQKEIEGFLNLYEPIEHKPQQGDFSHIQSLMRHIFGEQYELGMDYMQLLYLQPTQK LPIVLLVSEERNTGKSTFLNFLKAVFENNVTFNTNEDFRSQFNSDWAGKLLIVVDEVLLN RREDSERLKNLSTTFTYKVEAKGKDRTEIAFFAKFVLCSNNEYLPILIDAGETRYWVRKI MPLQSDDTNFLQKLKAEIPAFLYFLTQRELSTTQESRMWFNPRLTHTAALQKIIRSNRNR LEIEMTELLLDIMSNMNVESVSFCLNDLVTLLLYSQVKVEKYQVRKVVQEVWKLTSAHNS LSYTAYEFAPHRECHYEPKRKTGRFYTVTKEQLTAI >gi|222159331|gb|ACAB01000028.1| GENE 119 184641 - 184955 205 104 aa, chain - ## HITS:1 COG:no KEGG:Amuc_0323 
NR:ns ## KEGG: Amuc_0323 # Name: not_defined # Def: phage transcriptional regulator, AlpA # Organism: A.muciniphila # Pathway: not_defined # 14 101 18 105 106 62 38.0 5e-09 MTDILAIIQNGNGNIKLEVTGEDLLLFSNQLISRAKHELSTAIAEARKEKYLTKEEVKKM CGVCDTTLWHWSKKNYLKPVKVGNKVRYRQSDIQRILGEHNPLI >gi|222159331|gb|ACAB01000028.1| GENE 120 185125 - 186066 703 313 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|54307147|ref|YP_133700.1| ## NR: gi|54307147|ref|YP_133700.1| hypothetical protein NBU1_06 [Bacteroides uniformis] # 1 313 1 313 313 563 100.0 1e-159 MGQQEYDNFKRLIKEWLDSHPNEYADFVEEMNDKKFKGFFNIFNTAVRLVPKYKEAARKR IGDDRNPDFEELENVLLQSDLAEKIVNEFHTPNKRSIVPAMLAWLYYGRSYECMVEQGEE LTKRKDIPTLYKWLVSGMVKFIIRKSIANGMRTKEDWQVFRKQQKAIEENSLVEWAIEED EDMDEEETDMLQEEQPKTAGRKADTRTLPELLIERHDILIERIGTRLKTHATETDIARLF IALVEYRFMRKCPIKTFRNALYEQFKEQEIVHERGIQKAYRNLTSPFGNSKKLVKDIGED HEAIEELKAYLSN >gi|222159331|gb|ACAB01000028.1| GENE 121 186074 - 187411 687 445 aa, chain - ## HITS:1 COG:no KEGG:PG0838 NR:ns ## KEGG: PG0838 # Name: not_defined # Def: integrase # Organism: P.gingivalis # Pathway: not_defined # 219 439 202 424 432 84 28.0 8e-15 MKVTFIIKKAAKRYDTESMATIYVRFRNGRQLDSVAPTQLAINPNLWDDKDECVKTKAVC NEEMRTHINEEIRQLKTYIEKVYQQEKEAIDKEWLKTTLDKFYHPEKYFLPEEVVIKPTI GELFDEFLNKHPLSEVRKKNFRVVKRALLRYELYVRATKRGQKGFILDVDLVTPDTLRDM WDFFQNEYQYYELYPSIYEAIPEKRTPQPRSKNTLIDCFSRIRTFFLWCFDNKRTTNRPF DKFPIEECTYGTPYYITLEERDRIFNADLSATPQLAIQRDIFIFQTLIGCRVSDLYRMTK LNVVNEAIEYIPKKTKEGNPVTVRVPLNDKAKEILERYKEYEGKLLPFISEQKYNDAIKK IFKLAGVDRIVTILDPLTHNEIKRPIYEVASSHLARRTFIGNIYKKVKDPNLVSALSGHK EGSKAFRRYRDIDEEMKKDLVKLLD >gi|222159331|gb|ACAB01000028.1| GENE 122 187525 - 187701 92 58 aa, chain - ## HITS:0 COG:no KEGG:no NR:no NFGMTNELRGVEKKNGESISTFSASVPRPAFEIIKVGKMWVKRKMRKNATITLYFNML Prediction of potential genes in microbial genomes Time: Wed May 18 01:37:20 2011 Seq name: gi|222159330|gb|ACAB01000029.1| Bacteroides sp. 
D1 cont1.29, whole genome shotgun sequence Length of sequence - 5497 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 120 - 1292 605 ## gi|237717341|ref|ZP_04547822.1| conserved hypothetical protein - Prom 1445 - 1504 3.8 + Prom 1487 - 1546 4.2 2 2 Tu 1 . + CDS 1571 - 1915 361 ## gi|237718791|ref|ZP_04549272.1| conserved hypothetical protein + Prom 3069 - 3128 7.0 3 3 Tu 1 . + CDS 3160 - 3843 532 ## BT_2762 TonB + Term 3865 - 3903 6.1 + Prom 3848 - 3907 3.7 4 4 Op 1 . + CDS 3949 - 4572 342 ## BT_2761 hypothetical protein 5 4 Op 2 . + CDS 4617 - 5180 419 ## COG2096 Uncharacterized conserved protein 6 4 Op 3 . + CDS 5254 - 5475 357 ## PGN_1678 hypothetical protein Predicted protein(s) >gi|222159330|gb|ACAB01000029.1| GENE 1 120 - 1292 605 390 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237717341|ref|ZP_04547822.1| ## NR: gi|237717341|ref|ZP_04547822.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 390 1 390 390 758 100.0 0 MRTLFFILPFIFITTGCIKQNTEKSDILTTEDIAKFDSLFNAWEKEIHNNPKVQLSSSTR AYTILPQFKELKAMGKKIIPCIIEQFEKDNNTFFALPLYDELQDNDSLKSHRETSEQDKV SEIVQKFREQEKRKMWEEANSYKTSITDTTFLKTRGTMSYAAIDLFTGRGFNKDYAFNQV VYLKALQRAQKHLSVVNGQLACDLKSGTEIYISEDLYLFITDLFKDWNTWLESGKYEIIK DEQGLYTVLPKKTNQSSQTNSVTLRPIPRDIPTGEDISPASLSIETEHDHYPLSTTEVKI TITNHSHYEYECGEKYSLAYYNKSKRQWETLPTDPVVNDILWIFPPEHPSHQQTIKLYTS ESPNHPGKYRIYKSFNRGTKVAYGEFELLP >gi|222159330|gb|ACAB01000029.1| GENE 2 1571 - 1915 361 114 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237718791|ref|ZP_04549272.1| ## NR: gi|237718791|ref|ZP_04549272.1| conserved hypothetical protein [Bacteroides sp. 
2_2_4] # 1 102 1 102 114 117 99.0 2e-25 MRKQILSLAVALLIGSTVCMAQNRQNGKADREKRMEQIVADLGLNEKQAKDFKAAMKEMR PAKNKSNERPTREEMQKKRQEIDAKIKSILTDEQYKKYQDMRKKDNAKKKKKAK >gi|222159330|gb|ACAB01000029.1| GENE 3 3160 - 3843 532 227 aa, chain + ## HITS:1 COG:no KEGG:BT_2762 NR:ns ## KEGG: BT_2762 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 227 1 227 227 325 83.0 5e-88 MEAKKSKKAAIENQRGSWLLMGLVVALAFMFVSFEWTQHDVRVAALSPDDESIFVTELVP ITFPEEKLEPPPPPETKVTELFEIVENNMEVTDNVSTVSEDMNAVHDVIWIPPVVETETV VEDIIHVSVEVMPEFPGGTAALMKYLGSNIKYPTISQETGSQGKVIVQFVVDRDGTISNP EVVRGVDPYLDKEAIRVISSMPKWTPGVQNGKKVRVKYTVPVLFRLQ >gi|222159330|gb|ACAB01000029.1| GENE 4 3949 - 4572 342 207 aa, chain + ## HITS:1 COG:no KEGG:BT_2761 NR:ns ## KEGG: BT_2761 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 207 6 212 212 326 77.0 4e-88 MNLLLCIKRPFIWLSRFRHRCGYGVHSPFAFSLITDVIYEKMPYYAYSSLKEEQKKMVRE RGWTKGSQKVNRFLFRLVNRVQPNTIIEVGRPSSTALYLQSAKPSASYLFASDLSELFLD ADTSVDFLYLNDYQNPDLLEEVFRVCVHRTTLKSVFVVHGICYSKEMKALWKKWQADERV GITFDLYDVGLLFFDKTKIKQQYIINF >gi|222159330|gb|ACAB01000029.1| GENE 5 4617 - 5180 419 187 aa, chain + ## HITS:1 COG:lin1172 KEGG:ns NR:ns ## COG: lin1172 COG2096 # Protein_GI_number: 16800241 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 6 185 3 179 188 154 44.0 1e-37 MKKSLVYTKTGDKGTTSLVGGSRVPKTHIRLEAYGTVDELNSHLGWLHTYLQEESDRDFI LSIQHKLFAIGSHLATDQEKTQLKPASIITPEHVESIEREIDKLDEQLPELCAFIIPGGS RGAAVCHVCRTICRRAERRILALSETCTISPEVLAFVNRLSDYLFVLSRKINFDEQNNEI FWDNSWK >gi|222159330|gb|ACAB01000029.1| GENE 6 5254 - 5475 357 73 aa, chain + ## HITS:1 COG:no KEGG:PGN_1678 NR:ns ## KEGG: PGN_1678 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 1 73 13 85 85 117 98.0 2e-25 MYWTLELASKLEDAPWPATKDELIDYAMRSGAPLEVIENLQEMEDEGEIYESIEDIWPDY PSKEDFFFNEEEY Prediction of potential genes in microbial genomes Time: Wed May 
18 01:41:18 2011 Seq name: gi|222159329|gb|ACAB01000030.1| Bacteroides sp. D1 cont1.30, whole genome shotgun sequence Length of sequence - 125137 bp Number of predicted genes - 108, with homology - 107 Number of transcription units - 40, operones - 17 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 204 - 248 7.4 1 1 Op 1 . - CDS 486 - 3332 2465 ## COG3345 Alpha-galactosidase 2 1 Op 2 . - CDS 3362 - 4387 328 ## BDI_1319 glycoside hydrolase family protein 3 1 Op 3 . - CDS 4419 - 5642 993 ## COG3754 Lipopolysaccharide biosynthesis protein - Prom 5662 - 5721 4.0 4 2 Op 1 . - CDS 5875 - 8154 1702 ## COG1472 Beta-glucosidase-related glycosidases 5 2 Op 2 . - CDS 8171 - 10282 1528 ## COG1874 Beta-galactosidase 6 2 Op 3 . - CDS 10323 - 11315 742 ## BT_3516 arabinan endo-1,5-alpha-L-arabinosidase A precursor + Prom 11385 - 11444 10.5 7 3 Tu 1 . + CDS 11598 - 12695 409 ## PROTEIN SUPPORTED gi|90020424|ref|YP_526251.1| ribosomal protein L11 methyltransferase + Term 12703 - 12758 5.1 + Prom 12812 - 12871 5.7 8 4 Op 1 . + CDS 12896 - 14071 895 ## COG0584 Glycerophosphoryl diester phosphodiesterase 9 4 Op 2 . + CDS 14102 - 17098 2789 ## BVU_0500 hypothetical protein 10 4 Op 3 . + CDS 17111 - 18886 1392 ## Slin_5420 RagB/SusD domain protein 11 4 Op 4 . + CDS 18939 - 20399 1102 ## gi|294810052|ref|ZP_06768724.1| putative lipoprotein 12 4 Op 5 . + CDS 20422 - 21321 686 ## gi|237717360|ref|ZP_04547841.1| conserved hypothetical protein 13 4 Op 6 . + CDS 21339 - 23567 1621 ## gi|237717361|ref|ZP_04547842.1| conserved hypothetical protein 14 4 Op 7 . + CDS 23595 - 24497 768 ## COG0584 Glycerophosphoryl diester phosphodiesterase + Term 24535 - 24585 9.2 + Prom 24581 - 24640 5.5 15 5 Tu 1 . + CDS 24730 - 28653 2118 ## COG0642 Signal transduction histidine kinase 16 6 Tu 1 . - CDS 28862 - 29272 153 ## COG4804 Uncharacterized conserved protein + Prom 29050 - 29109 4.4 17 7 Tu 1 . 
+ CDS 29352 - 29540 78 ## gi|237717365|ref|ZP_04547846.1| predicted protein + Term 29558 - 29608 5.4 - Term 29547 - 29593 7.1 18 8 Op 1 . - CDS 29629 - 30798 1231 ## BT_2758 hypothetical protein 19 8 Op 2 2/0.000 - CDS 30832 - 31890 1168 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D 20 8 Op 3 . - CDS 31926 - 33236 1343 ## COG2704 Anaerobic C4-dicarboxylate transporter - Prom 33390 - 33449 7.2 + Prom 33309 - 33368 7.9 21 9 Tu 1 . + CDS 33414 - 34847 1544 ## COG1027 Aspartate ammonia-lyase + Term 34882 - 34926 11.1 + Prom 34900 - 34959 5.0 22 10 Tu 1 . + CDS 35069 - 36439 488 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 + Prom 36535 - 36594 3.2 23 11 Tu 1 . + CDS 36618 - 36797 102 ## gi|237717371|ref|ZP_04547852.1| predicted protein + Term 36873 - 36936 -0.8 + Prom 37209 - 37268 5.0 24 12 Tu 1 . + CDS 37479 - 37910 243 ## gi|237717372|ref|ZP_04547853.1| conserved hypothetical protein + Term 37964 - 38009 3.4 + Prom 38063 - 38122 4.8 25 13 Tu 1 . + CDS 38171 - 38746 708 ## BT_2753 hypothetical protein + Term 38779 - 38815 4.0 + Prom 38856 - 38915 6.3 26 14 Op 1 . + CDS 38942 - 41398 1863 ## COG1198 Primosomal protein N' (replication factor Y) - superfamily II helicase 27 14 Op 2 . + CDS 41450 - 43246 1583 ## BT_2751 hypothetical protein + Term 43392 - 43423 4.5 + Prom 43367 - 43426 4.0 28 15 Tu 1 . + CDS 43488 - 43958 485 ## COG0394 Protein-tyrosine-phosphatase + Term 44054 - 44110 10.2 - Term 43772 - 43808 0.0 29 16 Tu 1 . - CDS 43933 - 46005 565 ## PROTEIN SUPPORTED gi|163762592|ref|ZP_02169656.1| ribosomal protein S21 - Prom 46039 - 46098 4.6 + Prom 46015 - 46074 3.5 30 17 Op 1 . + CDS 46100 - 47617 1400 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases 31 17 Op 2 . + CDS 47631 - 48854 1253 ## COG1519 3-deoxy-D-manno-octulosonic-acid transferase + Term 48930 - 48994 19.3 - Term 48916 - 48981 6.1 32 18 Tu 1 . 
- CDS 48997 - 49518 390 ## COG0663 Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily - Prom 49540 - 49599 3.5 33 19 Tu 1 . - CDS 49667 - 51448 1217 ## COG0006 Xaa-Pro aminopeptidase + Prom 51406 - 51465 4.1 34 20 Op 1 . + CDS 51636 - 51827 310 ## PROTEIN SUPPORTED gi|160883083|ref|ZP_02064086.1| hypothetical protein BACOVA_01051 35 20 Op 2 . + CDS 51901 - 52782 727 ## COG4974 Site-specific recombinase XerD 36 20 Op 3 . + CDS 52798 - 53097 193 ## PROTEIN SUPPORTED gi|163755828|ref|ZP_02162946.1| 30S ribosomal protein S21 + Term 53143 - 53191 3.2 + TRNA 53195 - 53271 73.6 # Thr TGT 0 0 + TRNA 53354 - 53439 65.6 # Tyr GTA 0 0 + TRNA 53465 - 53537 68.9 # Gly TCC 0 0 + TRNA 53548 - 53619 81.6 # Thr GGT 0 0 37 21 Tu 1 . + CDS 53671 - 54855 1405 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 + Term 54870 - 54910 8.2 + TRNA 54909 - 54984 81.1 # Trp CCA 0 0 + Prom 54911 - 54970 80.2 38 22 Op 1 . + CDS 54996 - 55187 140 ## BF4198 preprotein translocase SecE subunit 39 22 Op 2 45/0.000 + CDS 55205 - 55747 561 ## COG0250 Transcription antiterminator 40 22 Op 3 55/0.000 + CDS 55809 - 56252 743 ## PROTEIN SUPPORTED gi|160883077|ref|ZP_02064080.1| hypothetical protein BACOVA_01040 41 22 Op 4 . + CDS 56268 - 56966 1160 ## PROTEIN SUPPORTED gi|29348146|ref|NP_811649.1| 50S ribosomal protein L1 42 22 Op 5 . + CDS 56982 - 57500 849 ## PROTEIN SUPPORTED gi|237717390|ref|ZP_04547871.1| ribosomal protein L10 43 22 Op 6 28/0.000 + CDS 57548 - 57922 589 ## PROTEIN SUPPORTED gi|153805949|ref|ZP_01958617.1| hypothetical protein BACCAC_00194 + Term 57948 - 57992 9.2 + Prom 57945 - 58004 4.3 44 23 Op 1 58/0.000 + CDS 58027 - 61839 2908 ## PROTEIN SUPPORTED gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 + Prom 61861 - 61920 2.3 45 23 Op 2 . + CDS 61946 - 66229 4411 ## COG0086 DNA-directed RNA polymerase, beta' subunit/160 kD subunit + Term 66246 - 66305 16.2 + Prom 66269 - 66328 9.8 46 24 Tu 1 . 
+ CDS 66376 - 66684 340 ## BT_2732 hypothetical protein + Prom 66737 - 66796 5.1 47 25 Op 1 56/0.000 + CDS 66821 - 67231 701 ## PROTEIN SUPPORTED gi|160883069|ref|ZP_02064072.1| hypothetical protein BACOVA_01032 + Prom 67254 - 67313 2.2 48 25 Op 2 51/0.000 + CDS 67375 - 67851 809 ## PROTEIN SUPPORTED gi|160883068|ref|ZP_02064071.1| hypothetical protein BACOVA_01031 49 25 Op 3 4/0.000 + CDS 67897 - 70014 1879 ## COG0480 Translation elongation factors (GTPases) + Term 70021 - 70076 7.2 + Prom 70016 - 70075 3.8 50 26 Op 1 40/0.000 + CDS 70095 - 70400 495 ## PROTEIN SUPPORTED gi|29348137|ref|NP_811640.1| 30S ribosomal protein S10 51 26 Op 2 58/0.000 + CDS 70419 - 71036 1070 ## PROTEIN SUPPORTED gi|237717399|ref|ZP_04547880.1| 50S ribosomal protein L3 52 26 Op 3 61/0.000 + CDS 71036 - 71662 1046 ## PROTEIN SUPPORTED gi|237717400|ref|ZP_04547881.1| 50S ribosomal protein L4 53 26 Op 4 61/0.000 + CDS 71679 - 71969 480 ## PROTEIN SUPPORTED gi|160883062|ref|ZP_02064065.1| hypothetical protein BACOVA_01025 54 26 Op 5 60/0.000 + CDS 71975 - 72799 1416 ## PROTEIN SUPPORTED gi|153805960|ref|ZP_01958628.1| hypothetical protein BACCAC_00205 55 26 Op 6 59/0.000 + CDS 72820 - 73089 465 ## PROTEIN SUPPORTED gi|153805961|ref|ZP_01958629.1| hypothetical protein BACCAC_00206 56 26 Op 7 61/0.000 + CDS 73126 - 73536 679 ## PROTEIN SUPPORTED gi|29348131|ref|NP_811634.1| 50S ribosomal protein L22 57 26 Op 8 50/0.000 + CDS 73542 - 74273 1248 ## PROTEIN SUPPORTED gi|160883058|ref|ZP_02064061.1| hypothetical protein BACOVA_01021 58 26 Op 9 . + CDS 74297 - 74731 737 ## PROTEIN SUPPORTED gi|153805964|ref|ZP_01958632.1| hypothetical protein BACCAC_00209 59 26 Op 10 . 
+ CDS 74737 - 74934 321 ## PROTEIN SUPPORTED gi|160883056|ref|ZP_02064059.1| hypothetical protein BACOVA_01019 60 26 Op 11 50/0.000 + CDS 74931 - 75200 456 ## PROTEIN SUPPORTED gi|160883055|ref|ZP_02064058.1| hypothetical protein BACOVA_01018 61 26 Op 12 57/0.000 + CDS 75203 - 75568 596 ## PROTEIN SUPPORTED gi|29348126|ref|NP_811629.1| 50S ribosomal protein L14 62 26 Op 13 48/0.000 + CDS 75588 - 75908 534 ## PROTEIN SUPPORTED gi|160883053|ref|ZP_02064056.1| hypothetical protein BACOVA_01016 63 26 Op 14 50/0.000 + CDS 75908 - 76465 913 ## PROTEIN SUPPORTED gi|237717411|ref|ZP_04547892.1| 50S ribosomal protein L5 64 26 Op 15 50/0.000 + CDS 76471 - 76770 503 ## PROTEIN SUPPORTED gi|153805970|ref|ZP_01958638.1| hypothetical protein BACCAC_00215 + Term 76788 - 76816 -0.9 65 26 Op 16 55/0.000 + CDS 76823 - 77218 664 ## PROTEIN SUPPORTED gi|160883050|ref|ZP_02064053.1| hypothetical protein BACOVA_01013 66 26 Op 17 46/0.000 + CDS 77234 - 77803 968 ## PROTEIN SUPPORTED gi|237717414|ref|ZP_04547895.1| 50S ribosomal protein L6 67 26 Op 18 56/0.000 + CDS 77825 - 78169 541 ## PROTEIN SUPPORTED gi|29348120|ref|NP_811623.1| 50S ribosomal protein L18 68 26 Op 19 . + CDS 78175 - 78693 848 ## PROTEIN SUPPORTED gi|160883047|ref|ZP_02064050.1| hypothetical protein BACOVA_01010 69 26 Op 20 . + CDS 78703 - 78879 281 ## PROTEIN SUPPORTED gi|53715448|ref|YP_101440.1| 50S ribosomal protein L30 70 26 Op 21 53/0.000 + CDS 78912 - 79358 739 ## PROTEIN SUPPORTED gi|237717418|ref|ZP_04547899.1| 50S ribosomal protein L15 71 26 Op 22 2/0.000 + CDS 79363 - 80703 876 ## PROTEIN SUPPORTED gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 72 26 Op 23 9/0.000 + CDS 80719 - 81516 607 ## COG0024 Methionine aminopeptidase 73 26 Op 24 . + CDS 81518 - 81736 239 ## PROTEIN SUPPORTED gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 74 26 Op 25 . 
+ CDS 81745 - 81861 198 ## PROTEIN SUPPORTED gi|53715443|ref|YP_101435.1| 50S ribosomal protein L36 75 26 Op 26 48/0.000 + CDS 81895 - 82275 637 ## PROTEIN SUPPORTED gi|29348113|ref|NP_811616.1| 30S ribosomal protein S13 76 26 Op 27 36/0.000 + CDS 82287 - 82676 665 ## PROTEIN SUPPORTED gi|29348112|ref|NP_811615.1| 30S ribosomal protein S11 + Prom 82689 - 82748 3.2 77 26 Op 28 26/0.000 + CDS 82794 - 83399 1028 ## PROTEIN SUPPORTED gi|160883039|ref|ZP_02064042.1| hypothetical protein BACOVA_01002 78 26 Op 29 50/0.000 + CDS 83411 - 84403 1021 ## COG0202 DNA-directed RNA polymerase, alpha subunit/40 kD subunit 79 26 Op 30 . + CDS 84407 - 84910 834 ## PROTEIN SUPPORTED gi|237717427|ref|ZP_04547908.1| 50S ribosomal protein L17 + Term 84928 - 84988 16.4 + Prom 84953 - 85012 6.7 80 27 Op 1 . + CDS 85058 - 85453 125 ## BT_2699 hypothetical protein + Term 85482 - 85529 1.9 + Prom 85485 - 85544 5.4 81 27 Op 2 . + CDS 85566 - 86108 423 ## BT_2698 hypothetical protein + Term 86250 - 86303 13.3 82 28 Tu 1 . - CDS 86261 - 88075 1191 ## COG0249 Mismatch repair ATPase (MutS family) - Prom 88096 - 88155 1.8 + Prom 88229 - 88288 7.3 83 29 Tu 1 . + CDS 88330 - 88974 491 ## BT_2695 hypothetical protein + Term 89134 - 89168 1.0 - Term 88996 - 89052 1.2 84 30 Tu 1 . - CDS 89178 - 89357 60 ## gi|237717432|ref|ZP_04547913.1| predicted protein - Prom 89388 - 89447 5.8 + Prom 89406 - 89465 4.2 85 31 Op 1 1/0.000 + CDS 89515 - 91512 1555 ## COG2987 Urocanate hydratase + Term 91518 - 91587 17.1 + Prom 91515 - 91574 4.8 86 31 Op 2 1/0.000 + CDS 91741 - 92643 862 ## COG3643 Glutamate formiminotransferase + Prom 92667 - 92726 2.3 87 31 Op 3 1/0.000 + CDS 92781 - 94034 932 ## COG1228 Imidazolonepropionase and related amidohydrolases 88 31 Op 4 1/0.000 + CDS 94077 - 94706 670 ## COG3404 Methenyl tetrahydrofolate cyclohydrolase 89 31 Op 5 . + CDS 94703 - 96202 1403 ## COG2986 Histidine ammonia-lyase + Term 96242 - 96308 13.4 90 32 Tu 1 . 
- CDS 96405 - 96644 68 ## - Prom 96729 - 96788 1.8 + Prom 96385 - 96444 4.3 91 33 Op 1 . + CDS 96550 - 97200 700 ## BT_2689 hypothetical protein 92 33 Op 2 13/0.000 + CDS 97256 - 98602 1327 ## COG1538 Outer membrane protein 93 33 Op 3 27/0.000 + CDS 98636 - 99673 1127 ## COG0845 Membrane-fusion protein + Term 99739 - 99781 2.7 + Prom 99676 - 99735 1.5 94 33 Op 4 . + CDS 99788 - 102946 2963 ## COG0841 Cation/multidrug efflux pump 95 33 Op 5 . + CDS 102961 - 103254 351 ## BT_2685 hypothetical protein + Term 103284 - 103327 7.6 + Prom 103258 - 103317 7.9 96 34 Op 1 . + CDS 103478 - 104869 1271 ## BT_2683 putative periplasmic protein 97 34 Op 2 . + CDS 104823 - 105773 608 ## BT_2682 putative periplasmic protein 98 34 Op 3 . + CDS 105760 - 107247 889 ## COG1696 Predicted membrane protein involved in D-alanine export + Term 107311 - 107367 19.6 - Term 107396 - 107444 3.8 99 35 Op 1 . - CDS 107494 - 110649 1180 ## gi|294644103|ref|ZP_06721880.1| putative lipoprotein 100 35 Op 2 . - CDS 110652 - 113129 1647 ## gi|237717447|ref|ZP_04547928.1| conserved hypothetical protein 101 35 Op 3 . - CDS 113133 - 114275 770 ## BF1976 hypothetical protein 102 35 Op 4 . - CDS 114297 - 115685 932 ## gi|237717449|ref|ZP_04547930.1| predicted protein - Prom 115705 - 115764 3.0 103 36 Tu 1 . - CDS 115853 - 117202 1267 ## gi|237717450|ref|ZP_04547931.1| predicted protein - Prom 117283 - 117342 3.2 104 37 Op 1 . - CDS 117397 - 118902 1124 ## BT_1064 hypothetical protein 105 37 Op 2 . - CDS 118913 - 119500 439 ## BF2182 hypothetical protein - Prom 119710 - 119769 4.8 106 38 Tu 1 . - CDS 119939 - 121117 437 ## BT_0036 integrase - Prom 121230 - 121289 5.4 + Prom 121065 - 121124 10.2 107 39 Tu 1 . + CDS 121340 - 122344 1069 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) + Term 122365 - 122415 14.1 + Prom 122536 - 122595 5.2 108 40 Tu 1 . 
+ CDS 122642 - 123787 394 ## D11S_1182 hypothetical protein + Term 123843 - 123877 5.3 Predicted protein(s) >gi|222159329|gb|ACAB01000030.1| GENE 1 486 - 3332 2465 948 aa, chain - ## HITS:1 COG:BH2223 KEGG:ns NR:ns ## COG: BH2223 COG3345 # Protein_GI_number: 15614786 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidase # Organism: Bacillus halodurans # 111 737 129 743 748 353 34.0 6e-97 MIIINKRNLFFLISVWLLSTLLSAQNVTISTPQTQLLLSVPNGGTPEQLYYGSRTSDADI RSICETACRRNAYPVYGMGYPCETALSVRHADGNLTLQMAVIGVKETRLTKENATLTVIE LKDKVYPFFVNICYKAWQDADVIETWTEIRHEEKKPVQLQQFASAYLPVRRGNVWLSHLS GAWANEGQLCQEALQPGMKVIKNTDGVRNSQSAHAEVMFSLDGKPQENTGRVIGAALCYS GNYKLRIDTQEDDWHHFLAGINEENSWYNLKKEEVFRTPALALTYSDEGMSGCSRKFHQW ARLHKLANGNTPRKILLNSWEGVYFDINEQGMDQMMGDIAAMGGELFVMDDGWFGDKYPR ENDSYALGDWTVDKTKLPGGLQSLLDNARKHGIRFGIWLEPEMANTKSELYEKHPEWIIK APEREVVCARGGTQVVLDLSNPQVQDFIVQTVDELMNSYPDIDYIKWDANMSIITQGSQY LTKDNQSHLNIEYHRGFENVCRRIRASYPQLTIQACASGGGRVNYGVLPYFDEFWTSDNT DALQRIYIQWGTSYFFPAIGMGAHISASPNHQTSRSVPLKFRIDVAMSGRLGMEIQPKNM TEEEKALCRNAIAEYKTIRPVVQFGDIYRLLSPYDKQGAASLMYVSPEKDKAVFYWWKTE HFCNRHLPRVKMAGLAPDKYYKVHELNRIDTEPLKFEGKSFSGAYLNDNGLEIPSTHRVE PSKQNEYASRVLYLEKVTPSFSDNRIEQRPPLRVLCLGNSITRHEYKADIEWFSEWGMAA SKEENDYCHQLEKMLSQNRPGTVVTPLNIAYWERNLNCNIDSLIGTHVTDKDVIVIRLGE NVQDKEAFKSGILRLVEYCKRKADKVVITGCFWKDEEKERAIINAAHMHGLTFIPIDWID RLYNSRPKVGDTLYDIHGKPYTVTKDFIIAHPDDEGMKKIAEAIYRVL >gi|222159329|gb|ACAB01000030.1| GENE 2 3362 - 4387 328 341 aa, chain - ## HITS:1 COG:no KEGG:BDI_1319 NR:ns ## KEGG: BDI_1319 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: P.distasonis # Pathway: not_defined # 4 333 2 327 583 493 67.0 1e-138 MLYKKCQIIVLIPIFLFHVVTSFAQQRDSRVREYLSPIHIVWQQESQLIQGAEYLLRSGH GQANLVNNELCKLSSTGQQHPAILFDFGKELQGGLQIVTGMPASHAPVTIRVRLGESVSE AMCDIDEVNGATNDHAMRDFVISVPWLGVLEVGNSGFRFARIDLLDDSAELHLKEIRAIS IFQDIPYKGSFRCNDERLNQIWQTGAYTVHLNMQDYIWDGIKRDRLVWIRDLHPEVMTVN TVFGYNEVIPKSLDLIRDSTPLPQWMTMCTYSLWWILIQRDWYLYQGNLDYLKEQKGHLC 
DLLQLIMTRIGEDGLEKFNDNEGRFLDWPSCENPLYTKSFH >gi|222159329|gb|ACAB01000030.1| GENE 3 4419 - 5642 993 407 aa, chain - ## HITS:1 COG:CC0633 KEGG:ns NR:ns ## COG: CC0633 COG3754 # Protein_GI_number: 16124886 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipopolysaccharide biosynthesis protein # Organism: Caulobacter vibrioides # 46 401 222 566 818 125 27.0 1e-28 MHYLRTLTYLWSLLTLLLMVAITTICSCVSTPQRSGQLKEQDEYDVAAYIWPSCHNDPMG RDTLWSEGTGEWEIIKKGNPRFEGHYQPKVPLWGYEMDDDTQVMEKWIDVATAHGINTFI FDWYWFNGQPFLESTVNNGFLKAKNNKKMKFYLMWANHNVAHNYWNVHRYKDNNSRLWTG VVDWANFKIIVERVIRQYFHKPNYYKIDNCPVFSIFGFHELLESFGNADGVKKAMDYFRS EVKKAGFPDLHLQLIGDGSPTDKFIELVVKAEVNSVTKYNWGWPYEEDYLAWGTKAMQRR DQWTAKLDSLGVPFFPNASIGWDDTPRFPNKTAKEVVHYNDSPESFAAFLQKTKEYVDQR PDRPKLITINSWNEWVEGSYLLPDMKHGYGYLNAVKRVMNGEYDYKP >gi|222159329|gb|ACAB01000030.1| GENE 4 5875 - 8154 1702 759 aa, chain - ## HITS:1 COG:PA1726 KEGG:ns NR:ns ## COG: PA1726 COG1472 # Protein_GI_number: 15596923 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Pseudomonas aeruginosa # 22 756 26 761 764 689 49.0 0 MKIRNLISCALCGLTLFACSPSGGGKDREMDRFITDLMDRMTLREKLGQLNLPSGGDLVT GSVMNGELSDMIRKQEIGGFFNVKGIQKINALQHLAVEESRLKIPLLVGADVIHGYETIF PIPLALSCSWDTLAVERMARISAIEASADGINWTFSPMVDICRDGRWGRIAEGNGEDPYL GSLMAKAYVRGYQGNNMQGNDEILACVKHFALYGASESGRDYNTVDMSRLRMYNEYLAPY KAAVDAGVGSIMSSFNIVDGIPATANKWLLTDLLRNEWGFGGLLVTDYNSIAEMSSHGVA PLKEASIRALQAGTDMDMVSCGFLNTLEESLKEAKVTEEQINMACRRVLEAKYKLGLFSN PYKYCDALRAKEKLYTPEHRSAARAIAAETFVLLKNDNNLLPLEQKGRIALIGPMADARN NMCGMWSMTCTPSRHRTLLEGIRAAAGDKAEILYAKGSNIYYDAETEKAATGIRPLERGD NRELLDEALRVASRSDIIVAALGECAEMSGESASRTSLEIPDAQQDLLKALVKTGKPVVL LLFTGRPLVLNWEDANVHSILNVWFGGSETGDAVADVLFGKITPSGKLTTTFPRSVGQLP LFYNHLNTGRPDPDSHIFNRYASNYLDESNEPLYPFGYGLSYTDFVYGELQISSETLPKN GELTVSVTVTNKGNYDGYETVQLYLRDIYAEVARPMKELKGFERIFLKSGESRDVKFVIT KDDLKFYNSELHYVYEPGEFDVMVGPNSRDVQTKSFQAE >gi|222159329|gb|ACAB01000030.1| GENE 5 8171 - 10282 1528 703 aa, chain - ## HITS:1 COG:TM1195 KEGG:ns 
NR:ns ## COG: TM1195 COG1874 # Protein_GI_number: 15643951 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Thermotoga maritima # 34 696 2 646 649 334 31.0 5e-91 MKKSLLVLALLFSLVSFAQSPVDKSFPPEELFALGSYYYPEQWDSSQWERDLKKMSEMGI KFTHFAEFAWAMMEPEEGKYDFEWLDRAVSLAEKYGLKVIMCTPTPTPPVWLSKKYPDIL IQRDNGVSIQHGRRQHASWSSDRYRRYVENIVSRLAMRYGNHPAVIGWQIDNEPGHYGVV DYSENAQIKFRVWLQKKYGTIDKLNDTWGTSFWSETYQNFEQVRLPNQQEVPEKPNPHAM LDLNRFMADELAGFVNMQADILRRHIHKDQWITTNLIPVFNPVDPVRIDHTDFLTYTRYL VTGHNQGIGSQGFRMGIPEDLGFSNDQFRNRVGKTFGVMELQPGQVNWGVYNPQPLPGAV RMWVYHVFAGGGKFVCNYRFRQPLKGSEQYHYGMIMTDGVTLSPGGEEYVRITQEMKKLR AAYDKKSRMPKQLASRRIGLLFDMNNYWEMEFQRQTDQWRTMPHIHKYYNLLKSFAAPVD VISEKEDFSDYPFLIAPAYQLLDNNLVKRWTEYVKNGGHLILTCRTGQKDRDAKLWEAPL AAPIHQLAGINSLYYDHLPHSLYGKVDFGSEEYAWNNWADVLTPAAGTDVWAVYADQFYK GAASVIHRRLGKGTVTYIGTDTDDGKLEKEVVRRVYTEAGVPTEDLPYGVVKEWRDGFYI ALNYTSDIQEIVIPDEAEVLIGSARLEPAGVVVWKEKSDNKYK >gi|222159329|gb|ACAB01000030.1| GENE 6 10323 - 11315 742 330 aa, chain - ## HITS:1 COG:no KEGG:BT_3516 NR:ns ## KEGG: BT_3516 # Name: not_defined # Def: arabinan endo-1,5-alpha-L-arabinosidase A precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 329 1 328 637 537 76.0 1e-151 MNSKIVFFFMIMCCCSVTLFAQTNTGKKQTSTNPVFDGWYADPEGVVFGDKYWIYPTYSA PFEQQLYMDAFSSPDLVHWRKHPKVLSVENISWLRNALWAPAVIEANGKYYFYFGANNIY HESEVGGIGVAVADRPEGPFKDALGKPLIGKIVNGAQPIDQFVFKDDDGQYYMYYGGWGH CNMVKMGSDLISIVPFEDGTTYKEVTPENYIEGPFMLKHNGKYYFMWSEGDWTGPDYCVA YAIADSPFGPFKRVGKILEQDTKIANGAGHHSIIKGPGKDEWYIVYHRRPLGETDGNYRI TCIDRMYFNKNGTIKPVKMTSGGIKARPLR >gi|222159329|gb|ACAB01000030.1| GENE 7 11598 - 12695 409 365 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020424|ref|YP_526251.1| ribosomal protein L11 methyltransferase [Saccharophagus degradans 2-40] # 37 351 5 313 314 162 33 9e-39 MRILLYFLLILLVGCSSGQNKKTHERVDLDSVPVYTNPLLSRGGDPWAVFHNGIYYYTQS MSDRITLWATDDITRLAEASQKDVWVPQNSSNAFHLWGPEIHYINNKWYIYCCADDGNID NRQIYVLENESADPMKGEFVMKGRISTDEDNNWAIHASTFEHGGKRYLIWCGWQKRRVYQ 
ENQCIYIAEMENPWTLASDRILISEPEYEWECQWISIDGNKTAYPIRVNEAPQFFYSLKK DKLLIYYSASGNWTPYYCVGLLTADTDSDLLNPASWKKAPEPVFKQAPENHVYGPGSICF IPSPDGKEWYMLYHARKSLYDMLVLDSRTPRMQKIEWDKEGIPVLGIPQKEGFPLRKPSG TPERK >gi|222159329|gb|ACAB01000030.1| GENE 8 12896 - 14071 895 391 aa, chain + ## HITS:1 COG:AGl598 KEGG:ns NR:ns ## COG: AGl598 COG0584 # Protein_GI_number: 15890416 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 138 287 32 184 306 89 36.0 1e-17 MRNENLLSFLLIKRRILKLLAIKYAYSLIVLCGISFMACSDDDPVKKNPYLQTSTRAMLK EVVEVVFNNIDSNTDITVDFGDGTVKEGKAATPITHAYTQSGDYTMLVTAGERVVQKRIR IYDLLALTEAMKQFRDADNKMVWAMTHRSHTTDKTIPENSVSAVEAAINAGADVIECDTH LTSDGVVMVCHDQTINATTNGTGDITKMTYAEIQQYNLLDRNGRVTDEKMPTLEEFLKAG RGKIYFNLDYSPRTASTQEVMNVVKELDMMEQVFFYCNSSQKAEEVLSISKKAHVYTWAQ SAYKPLVGLPGNYFVQYSYAPDGQSTPIGTAVSEGMIPTVNMLHVLSGLIPEYDINTTYL DELLQIYPEVRMIQTDAPEVLISALQARGLR >gi|222159329|gb|ACAB01000030.1| GENE 9 14102 - 17098 2789 998 aa, chain + ## HITS:1 COG:no KEGG:BVU_0500 NR:ns ## KEGG: BVU_0500 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 23 998 118 1092 1092 944 49.0 0 MKNIHFKTCTCLLGILLILFSGIQVKGASLEVQQESKVSGTVTDEKGEPVIGASVVVVES KQGTITDIDGKFLVNVPKNGHLKVSYIGYETQEVAVKNGQLLKIVLKEESTALNEVVVVG YGTMKRKEMTSAISHIGAKDLNQISSLDASMLLQGKVSSVSVSNTALADPNNQGSIQIRG VSSRNAGLGPLIVVDGVPGGDMTNINPADIESIDVLKDGAASAIYGTRGSNGVILINLKK GTKDGQVHTTYSTSVTFNKAKKELDIMNAEEYRAYRTVSNPLSDMGADSDWFDAVTQLGV THMHTLTFSGGNARTNYRVTADYRNARGIDLRSDRREYGARANINHTTKDGLFTFSANIT PRVIDRNKSANVYSNAIKNNPTMPIYDSESANGYYRFPSGTEGSNIVEQLNEEENGTEIK LLEWNATAAVNLLPLFNPKNPDMVLKSQVTISQYQVDKFNGWFTPSTYGSNVNDGVTGKA SRDFDKSTTNNLEWVTNFSTRIKNHQIRAMVGYSYNYGVNSGMSAENWNFSSDGLTYNNL GSGLEAAKDGKTMMGSYKNDHKLISFFGRVNYDWKERYLFTFSLRHEGSSRFGENHKWGN FPAVSVGWRISDEKFMKGLSWIDDLKVRYDYGVTGNQEIGNYNSLATYRAFGWYQYNGNR FHVWGPSKNTNSELRWEKGHNQNVGLDFSLFSHKVTGSFNYFHRKQSDLLGDYSVPVPPN LFSTIYTNVGTLRNTGFELDVTVNAVRTKDFTYSFTLVGATNNNKFVSFSNDIYKGQKYY 
STCSMANPNNPGYLQRIEEGERIGNYFTYRYAGVDKKGDWLIYDKKGNVIPVAQGTEEDK SVTGNGLPKFTGSMTHNFTYKNFDLSVALRGAAGFDIFNVHDFYFGLQSMTTNQLTTAYS KNAHITTGKNVITDYFIEPGDYLKIDNVTLGYTMNLNKKYIEKIRLFGTANNLYTFTKFT GVDPSTYEVNGLTPGTFGGGYNYYPSAFQFILGLQVSF >gi|222159329|gb|ACAB01000030.1| GENE 10 17111 - 18886 1392 591 aa, chain + ## HITS:1 COG:no KEGG:Slin_5420 NR:ns ## KEGG: Slin_5420 # Name: not_defined # Def: RagB/SusD domain protein # Organism: S.linguale # Pathway: not_defined # 23 586 25 575 575 397 41.0 1e-109 MKITKYIIGLGIMAGTVISLPSCTDLSETVYDQVMSQNYYNTKQDVINAVFRPFEHIYES IVRRHEHEELTGDQLITPTRGTWWYDGGKWEKYHYHQYDDINQEDWTNEWEGMYGGIGQC NLVLDDLSQLNPSKFNLSDSEFDSFKAQLRCMRAYCYLRLLNSYRNCILTTTSDQAVNEQ PENRKQVSPQVMFDFIESELRDVCLVALPSKIGQSGNGGQQGQFTKAGAAALLVRLYLNA EKWIGQPKWQECIDMCERITGGEFGYYAIADNWYEPFDWNNEMCNEVILGFPGSYGLTSW HLKNDKRTVYGRALPYGCEKYLKIEGDGSRNPKYALSPSYNNQQPRALFDYELGMVTQKF AKYAGDKRYKQYVNTSNNTREGMFFMEGKIANSSISGGYAKNPDNQYVLYLLDQVGRFEG NAESGYIANPNYQESKLGNGDFNSGLYCVKYPFYPFDGGYFIESDYTEIRLAEIIYSQAE CLLRLGQAGEAGKLLNSVRKRNYENFTADIAYQPEGNVVLDLKEMLDEWGREFLAESRRR TDLIRFGRFQDAWWDKKRDADTHYELFPFSQTQLEQNEYLKQNPGYPDIAR >gi|222159329|gb|ACAB01000030.1| GENE 11 18939 - 20399 1102 486 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294810052|ref|ZP_06768724.1| ## NR: gi|294810052|ref|ZP_06768724.1| putative lipoprotein [Bacteroides xylanisolvens SD CC 1b] # 1 486 1 486 486 909 100.0 0 MKNKLLTISKIGLLAFTILVSQACQEDYEMVDPPMITDYDDDLDEEVVLQKGLESYFVTQ FGEGDMDGTSWDQAMDVAGFRKLLSGSIDLSKSTIYMSQGKYVMSETGGLGVIIRKDIKA IKGGYSLLSEGTDLTNRRIDTYKTVISGDVNGNNQADSGDCGLLLVKGGIIGIEGVTFQY GYLSNNDAKSNECGSGIYINGNVNSTSVELTDCIIRDCKTEAVNGQGGVAGGTAILIASG SSKLNNVKFLDNAADSRGGAIRCNSNKAVVFMNNCLITGNSVRELFGVGIQISSGHICMN NTTIVGNPGKGAALNGGGSFMLANSTIVGHDIDQEYGAFRCETSIDGDTKFINNLLISEN STAPSFILNGANKEAYSMGYNLYQRVNNFTMDVSDTAYPTLVNGNLTEEGVYKWNIDQIG QVTGYATKQAVVNAVKSFNPAASPVVNLGEVFVEWMGEDAFGLDQRGVTRNPDKMQMGAY DAVLSN >gi|222159329|gb|ACAB01000030.1| GENE 12 20422 - 21321 686 299 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717360|ref|ZP_04547841.1| ## NR: 
gi|237717360|ref|ZP_04547841.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 299 1 299 299 583 100.0 1e-165 MKNKFYIMTALIISLTGCSSIEDGTLPTVNFPEDHIIRVVTEVKAPVSRAGYDASNLESF GLIIHNSENIAYDYNSQMVKSDNDWNTVDGQQMLWDNARTPVTVIAYAPYISEAGLDTPL AINIQSNQTTEENVIASDFLLTKSMVDPKQDLTADGRLKVTLDHAMSKLIIKVTVNNGME DAAISKLGDMAVNGTIVGGICDLSVPEPVVIPREDAVATTIAPYKGTDGYECILLPQTII EGFSVNFSYDGKLYIWTAEEPIELQKGVEHTLTLNVNETARLATMQMKYRATGQIIDQK >gi|222159329|gb|ACAB01000030.1| GENE 13 21339 - 23567 1621 742 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717361|ref|ZP_04547842.1| ## NR: gi|237717361|ref|ZP_04547842.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 742 1 742 742 1430 100.0 0 MKITKHILTACILTGMAACDTDKEIAVFNEDSGAIKINAVIDAAYTRSNPTGTNEQQMQF NNGDQILLSCEDGSVTYMLSGGQWAPTDNYYLRWGNEPVTYSAFYPVTEGTSVANFSLPI NQQSLENLASADYMTCTVEDAINEGSGVLHLNMNRRMAKVIMTLDDIDSQSKALGVKIGS YQGYTDGNVSSGTALVSPYVTIPEGGKAGQSGCKYTAIVAPGTANPNSTFVSLNYKGEDL VLPGIPAIKAGFCYEFTLKVEGSVIRLSEPIVTPWETGTINGGDATELQLDAYYVKEHAT GNATGMDWDNAMGVDGLRNLLRTNTNSAITTANAKKLDGKNIYVAGGTYLIADQEAGLKI EYSGYSKQVEIKVVCGYDPQSTRKDLSKRDPVRYLTTFTGDANNNGIADAGDYSLFTLGN QIDITFEGCTFSCGYHPNEKINGYSGGFLIANGSSGNATLQLNHCIIEKCYNAGVNGSGE AGGSGIFMYKGTAKLNHVQLRNNKASSRGGAIRVNDSGSILFMNNCSITGNEGGQFGYAI QMSNGHLCMNNTTVTNNSGRDGTINGAGSMLIVNSTIIEDGAQNSGAVIRCESWPARQSF LMNNIILNKNAANPVVEMSGDDTRHLTSKGYNLMGGTIMPASTNKFTTSEFDKYNSTISS LNVVWDANRSVYTWDGNVNEFTRTAADAVKNAITTGFKPDSCPFQNLGEEFGKWLDEMDA FSTDQLGNPRDKNAMWPGAYQE >gi|222159329|gb|ACAB01000030.1| GENE 14 23595 - 24497 768 300 aa, chain + ## HITS:1 COG:CC3172 KEGG:ns NR:ns ## COG: CC3172 COG0584 # Protein_GI_number: 16127402 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Caulobacter vibrioides # 41 300 1 271 295 130 32.0 4e-30 MKNYKLIWLGIFCCLQPLILFAQRGEMIRKAFLDPKSDQVLVVAHRGNWRSAPENSVAAI DSAIQMGVDIVEIDIHKTKDGQLILMHDDRVDRTTNGKGLIKDYTLAEIKKLKLKDKDGN LTEHIVPTLEEALLAGKGQIMFNLDKAYSVFDEVYAVLEKTGTASLVIMKGGQPVETVKS 
EFGDYLDKVIYIPVIGLDGKDGEKKVRDYMTQLHPAAFELVYSDSSSPLPRKVKSIILGK SRIWYNTLWASLAGGHEDDQALKDPDGNYGYLIEELGARILQTDRPGFMLDYLRKKGLHD >gi|222159329|gb|ACAB01000030.1| GENE 15 24730 - 28653 2118 1307 aa, chain + ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 789 1032 46 296 318 131 29.0 8e-30 MASAEFSFKKYQVENGLSHNTTWCAIQDSYGFLWIGTSNGLNCYYGSGNKIYRNVLNDKY SLGNNFVGSLMEEGEDLWIGTSSGLYIYDRSDDHFLYFDKTTKYGVFISCEVKKIVKTKN GLIWIATLGQGIFIYDPVKDVLTQNSIRTFFAWDICQGTDSLVYVSSMQEGLLCFDEQGT FLQSYPVSFELNIGNQRINCLHSIDHEIWCGVGTNSLGCYQEGSQELKIYNTSAESFSSI LCLMDYNGKEMLVGTDNGIYLFDRERKTFRRGDSPFGLWALSDPTVHAMMRDNEGTLWVM TNSGGINYMSKQSKRFSTYSLSLTGMEKADKEIGPFCEDREKNVWVGTRNGLWFFNSQTH SFHEYFLKKSNIKYDIRSLLLEGDNLWVGTYAEGLHVLNLKTGNIKSYTYSRDIPNTLCS NDVLSLFKDRKGEIFVGTSWGLCRYNPQSDNFITVINVGAMASVTDIYEDMYNNLWIATS NRGVFCYNTVGEEWKHFEHIREDLTTITSNSVITLFEDRKGTMWFGTNGGGLCSFSKETE TFVDFDPENIWLPDKVIYSIEQDLAGNFWISCNSGLYQFNPADKNKNCLFTINDGLQGNQ FTAQSSLASSTGKMYFGGVNGFNVFEPKEFTDNTYLPPVYVINISFPNLNNEREVRRLLR LDKPFYTVDKIKLPYENNSFTIRFAILSYEDPLRNRYAYILNGVDKEWINNSSNNTASYT NLPPGEYEFLVRGSNNDHYWTEKAISLWIVVTPPWWRTTIAYVVYALLLLAAIVFLGWRW NLYVRRKYKRRMEEYQTTKEKEMYKSKINFFINLVHEIRTPLSLIRLPLEKLREEEIKGK ENKYISVIEKNVDYLLGIINQLLDFQKLESSALQLNLKKCNVNALVSDIFNQFCGSAELR GLTLHLCLPHEEINMMIDSEKMSKILVNLIGNAMKYAKSRIELKVLSVDDEVRIEVTDDG PGIPEDQRDKIFQAFYQLPDDPVASSKGTGIGLAFARSLAEAHHGSLFLENSSEGGASFV LSLPLGTVETEDVSDELTETDAVDVGEEVTSVSEFANRRFTVLLVEDNLELLNLTREALS TWFRVIRAGNGKEALEILTRQSVDVIVSDVMMPEMDGLELCNHVKSDMAYSHIPVILLTA KTTLESKVEGLESGADVYLEKPFSVKQLYKQIENLLKLRLAFHKQMTDITIGSSSSLSDF AISQKDVEFIDKINAILSEQVINENYSIDILADQMNMSHSNLYRKMKGLFGMPPNNYVKN FRLNKAAELIANGVRIAEAAERLGFASSSHFAKCFKEKFGMLPKDYK >gi|222159329|gb|ACAB01000030.1| GENE 16 28862 - 29272 153 136 aa, chain - ## HITS:1 COG:MA2994 KEGG:ns NR:ns ## COG: MA2994 COG4804 # Protein_GI_number: 20091812 # Func_class: S Function unknown # 
Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 39 117 104 182 345 82 50.0 3e-16 MSQIFLNILDKRDWNNSVDQIQKLTESPIKSTREYPFTLSWSHYLILMRIESDEERRFYE IESQKQNWSVRQLQRQYNSSLYERLALSKDKDSVLRLAQEGQTINKPDDIIKNPLTLFSN HCSIIYTKNYGQFVMF >gi|222159329|gb|ACAB01000030.1| GENE 17 29352 - 29540 78 62 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717365|ref|ZP_04547846.1| ## NR: gi|237717365|ref|ZP_04547846.1| predicted protein [Bacteroides sp. D1] # 1 62 1 62 62 119 100.0 6e-26 MEFITSTHLKQWADTKECQSLLPELMRRLICASVKQLNRLSFPSGDAVHMSEWAGADTWV LI >gi|222159329|gb|ACAB01000030.1| GENE 18 29629 - 30798 1231 389 aa, chain - ## HITS:1 COG:no KEGG:BT_2758 NR:ns ## KEGG: BT_2758 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 389 7 389 389 694 91.0 0 MLLLAGSIQGVYAQKTEKKDKFNPNTPLFEELTDVKKKTDKFNLYLNMQGSFDAHFQNGF QEGDFNMHQLRIEAKGNINNWLSYRYRQRLNRSNDANGMIDNLPTSIDYAGIGIKLNDQF SLFAGKQCAAYGGIEFDLNPIDIYQYSDMIDYMSNFMTGLNVGYNITPEQQLNLQILNSR NSSFDSTYGITEDAEGNLPDLKSGKMPLVYTLNWNGNFNNVFKTRWSASVMNEAKSHNMY YYALGNELNLGKWNAFVDFMYSKEDIDRKGIITSIVGRPGGHNAFDANYLSVVAKCNYRF LPKWNVFVKGMYETASVGKASEGIEKGNYRTSWGYLGGIEYYPMETNLHFFVTYVGRSYD FTSRAKVLGQENYSTNRISVGFIWQMPVF >gi|222159329|gb|ACAB01000030.1| GENE 19 30832 - 31890 1168 352 aa, chain - ## HITS:1 COG:STM3106 KEGG:ns NR:ns ## COG: STM3106 COG0252 # Protein_GI_number: 16766407 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Salmonella typhimurium LT2 # 1 352 1 348 348 373 62.0 1e-103 MKQFKSFGLVVVTLLFSVTMAFAAKPNIHILATGGTIAGTGSSATGTSYTAGQVAIGALL DAVPEIKDIANVTGEQIVKIGSQDMNDQVWLTLAKKINELLKRPDIDGIVITHGTDTMEE TAYFLNLTVKSNKPVVLVGAMRPSTALSADGPLNLYNAVVTAAAKESKDKGVLVAMNGLI LGAQSTVKMNTVDVQTFQAPNSGALGYVLNGKVFYNQVTLKKHTTQSVFDVTHLNALPKV GIVYSYSNIEADMVTPMLNNGYKGIIHAGVGNGNIHQNIFPVLTDARQKGILVVRSSRVP 
TGPTTLDAEVDDAKYQFVASQELNPQKSRVLLMLALTKTTDWKQIQQYFNEY >gi|222159329|gb|ACAB01000030.1| GENE 20 31926 - 33236 1343 436 aa, chain - ## HITS:1 COG:HI0746 KEGG:ns NR:ns ## COG: HI0746 COG2704 # Protein_GI_number: 16272687 # Func_class: R General function prediction only # Function: Anaerobic C4-dicarboxylate transporter # Organism: Haemophilus influenzae # 2 430 6 433 440 369 55.0 1e-102 MILQLAFVLTAIIIGARLGGIGLGVMGGVGLAILTFVFGLQPTAPPIDVMLMIAAVISAA SCMQAAGGLDYMVKLAEHLLRKNPSHVTLLSPLVTYLFTFVAGTGHVAYSVLPVIAEVAT ETKIRPERPLGIAVIASQQAITASPISAATVALLGLLAGFDITLFDILKITIPATIVGVL VGALFSMKVGKELVEDPEYQKRLKEGLFNNKKIEIKDVKNKRSAMISVLIFILATAFIVL FGSFEGMRPSFLIDGEIVTLGMSSIIEIVMLSAAAIILLVTKTDGIKATQGSVFPAGMQA VIAIFGIAWMGDTFLQGNMGQLTLSIEGIVQQMPWLFGVALFVMSILLYSQAATVRALVP LGIALGISPYMLIALFPAVNGYFFIPNYPTVVAAINFDRTGTTKIGKYVLNHSFMMPGLV STIVAIALGLLFIQIF >gi|222159329|gb|ACAB01000030.1| GENE 21 33414 - 34847 1544 477 aa, chain + ## HITS:1 COG:Cj0087 KEGG:ns NR:ns ## COG: Cj0087 COG1027 # Protein_GI_number: 15791475 # Func_class: E Amino acid transport and metabolism # Function: Aspartate ammonia-lyase # Organism: Campylobacter jejuni # 9 468 3 462 468 516 55.0 1e-146 MEQNLSKKTRTESDLIGSREVPESALYGVQTLRGIENFRISKFHLNEYPLFIQALAITKM GAAVANRELDLLTEEQTDAILRACKEILEGKHHEQFPVDMIQGGAGTTTNMNINEVVANR ALELMGHQRGEYQYCSPNDHVNRSQSTNDAYPTAIHIGLYYTHLKLVKHFEVLIEAFRKK GREFAHIIKMGRTQLEDAVPMTLGQTFNGFASILEHEVKNLDFAAQDFLTVNMGATAIGT GITAEPEYAEKCIAALRKITGLDIKLADDLVGATSDTSCMVGYSGAMRRIAVKMNKICND LRLLASGPRCGLGEINLPAMQPGSSIMPGKVNPVIPEVMNQVAYKVIGNDLCVAMSGEAA QMELNAMEPVMAQCCFESADLLMNGFDTLRTLCIDGITANEEKCRRDVHNSIGVVTALNP VIGYKNSTKIAKEALETGKGVYELVLEHNILSKKDLDTILEPENMIKPVKLDIHPNR >gi|222159329|gb|ACAB01000030.1| GENE 22 35069 - 36439 488 456 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 1 451 1 444 458 192 28 7e-48 MNAFDVIIIGFGKGGKTLAAEFAKRGRKVAIVERSDKMYGGTCINIGCIPTKTLVHQAKI ASGMKDATFEERSEFYRNAISVKEAVTSALRNKNYHNLADNPNVTVYTGVGSFVSSDVVS 
VRTSTEEIMLTSKQIIINTGAETVIPPIEGVVGNPLVYTSTSIMELTELPRRLVIIGGGY IGLEFASMYASFGSQVTVLESYPELIVREDRDIAASVKETLEKKGIVFRMNAKVQSVKHI EDGAVVVFSDSQTGEVVELEADAVLLATGRRPNTKDLNLEVAGVETDARGAIIVDEYLKT TQPNIRAVGDVKGGLQFTYISLDDYRIIREDLFGDKERTTNDRNPVSYSVFIDPPLARIG LNEEEARKQNLDIIIKKLPVMAIPRAKTLGETDGLLKAVIDKNTGKILGCMLFAPDSSEV INTVAVAMKTGQDYTFLRDFIFTHPSMSEALNDLFS >gi|222159329|gb|ACAB01000030.1| GENE 23 36618 - 36797 102 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717371|ref|ZP_04547852.1| ## NR: gi|237717371|ref|ZP_04547852.1| predicted protein [Bacteroides sp. D1] # 8 59 1 52 52 97 100.0 3e-19 MISLCKAMGGSWYPYKETAVSTYRNCGIHTWTLSFPALETAVSRLGTVVSYAGNRRLLG >gi|222159329|gb|ACAB01000030.1| GENE 24 37479 - 37910 243 143 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717372|ref|ZP_04547853.1| ## NR: gi|237717372|ref|ZP_04547853.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 143 19 161 161 246 100.0 2e-64 MAQWISIEEAAAKYGFETEYIWVLIEIREITVDYQRTDIDIIDDESIQDFINRNKKGITL MYIEKLERYCMDKSQVCSIYASLLGKQDKEITIYKGAKFLYDALQSKWLKQCDRVRHLEK ELALEQTTCRTCWLKRLCMIIRK >gi|222159329|gb|ACAB01000030.1| GENE 25 38171 - 38746 708 191 aa, chain + ## HITS:1 COG:no KEGG:BT_2753 NR:ns ## KEGG: BT_2753 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 191 1 196 196 303 81.0 2e-81 MKKIFGALMIAVCIGMAMPAQAQIHFGVKGGLNLSKASFSNVGDNFKKDNFTGFFIGPMA EFNIPIVGLGVDAALLFAQRGIKVSEGNEDLTIKQNGIDIPVNLKYTIGLGSMAGIYLAA GPDFYFDFEKKSGIDKKKAQVGINVGAGLKLLNHLQVGANYNIPLGDTADIEGTDASYKT KTWQVSVAYIF >gi|222159329|gb|ACAB01000030.1| GENE 26 38942 - 41398 1863 818 aa, chain + ## HITS:1 COG:lin1938 KEGG:ns NR:ns ## COG: lin1938 COG1198 # Protein_GI_number: 16801004 # Func_class: L Replication, recombination and repair # Function: Primosomal protein N' (replication factor Y) - superfamily II helicase # Organism: Listeria innocua # 1 817 1 793 797 467 34.0 1e-131 MKKYVDVILPLPLPKSFTYSLPDECAEDVKIGCRVVVPFGRKKFYTAIVLNVHYCAPTEY EVKDISALLDASPILLPVQFKFWEWIADYYLCTQGDVYKAALPSGLKLESETIVEYNLDF 
EADAPLPEREQRILDLLAVDAQQSVTKLEKDSGIKNILTAIKSLLDKEAIFVKEELKRTY KPKTEARVRLAGTADEKQLHILFDILSRAPKQLALLMKYVEYSGILGTGTPKEVSKKELL QRANVAPSVLNGLVDKKIFEIYYHEIGRLNKQEKEVVELNALNEFQQRAHDEIVQSFQEK NVCLLHGVTSSGKTEVYIHLIEETIRQGKQVLYLLPEIALTTQITERLQRVFGARLGIYH SKFPDAERVEIWRKQLGENGYDIILGVRSSVFLPFRNLGLVIVDEEHENTYKQQDPAPRY HARSAAIVLAAMYGAKTLLGTATPSIESWQNAGEGKYGFVQLKERYKEIQLPEIIPVDIK ELHRKKRMVGQFSPLLIQYMKEALEQEEQVILFQNRRGFAPMVECRTCGWVPKCKNCDVS LTYHKGINQLTCHYCGYTYQLPKSCPACEGTELVNRGFGTEKIEDDIKILFPEAAVARMD LDTTRTRSAYEKIIADFEQGKTDILIGTQMVSKGLDFDHVSIVGILNADTMLNYPDFRSY ERAFQLMAQVAGRAGRKNKRGRVVLQTKSIEHPIIHQVIANDYEDMVGGQLAERQMFHYP PYYRLVYVYLKNHNEALLDQMAAVMADKLRAVFGNRVLGPDKPPVARIQTLFIKKIVVKI EQNAPMGRARELLLRIQREMLEDERYKSLIVYYDVDPM >gi|222159329|gb|ACAB01000030.1| GENE 27 41450 - 43246 1583 598 aa, chain + ## HITS:1 COG:no KEGG:BT_2751 NR:ns ## KEGG: BT_2751 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 598 1 598 598 1127 87.0 0 MKRLTIFSLTCLFSVGAVFAQQGVTQCGVPTGQPKFPLLTYQELPDPAAPSDKEWAAVTT TQVSWGTTDVRYAKHGLPRLKNQQTVALKGWRGERVNAQAVVWTGVELKDLNFSFGDFKD KKGNVLPQDAFTGGFVRYVMTDELNKDGRGACGHRKAVDYDSLLVADPIDTSLKAMALPA RTVQPVWVQCWIPQSAVPGTYKGELLINDGSRLLQRLNLEITVSSRELPAPSEWAYHLDL WQSPYAVARYYQVPLWSQEHLDAMRPLMKMLADAGQKIITATLMHKPWNGQTEDYFDTMV TWMKRADGTWSFDYTIFDRWVEFMMSVGIDKQINCYSMVPWELSFQYYDQATNSLKFVKT APGEEAYEEMWVAMLSSFSKHLKEKGWFDICAIAMDERPMEVMQKTLKVIRKADPDFKVS LAGNYHAEIEPDLYDYCIVIGQNFPEEVRLRRVAENKRTNYYTCCTEAHPNTFTFSDPAE AVWISYYSSKKHLDGYLRWAYNSWPLEPLLDSRFRSWAGGDTYLVYPGARSCIRFERLIE GIQAHEKINILRQEFEKKGNKAGLKKIEKMLAPFNLGSMPETPAAVTVNRANQILNSL >gi|222159329|gb|ACAB01000030.1| GENE 28 43488 - 43958 485 156 aa, chain + ## HITS:1 COG:alr5068 KEGG:ns NR:ns ## COG: alr5068 COG0394 # Protein_GI_number: 17232560 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Nostoc sp. 
PCC 7120 # 3 153 4 155 161 149 49.0 1e-36 MKKILFVCLGNICRSSTAEGVMLHLIKEAGLEKEFVIDSAGILSYHQGELPDSRMRAHAA RRGYQLVHRSRPVRIEDFYHFDLIIGMDDRNIDDLKDKAPSPEEWKKIHRMTEYCTRIPA DHVPDPYYGGAEGFEYVLDVLEDACAGLLTSLTPDS >gi|222159329|gb|ACAB01000030.1| GENE 29 43933 - 46005 565 690 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762592|ref|ZP_02169656.1| ribosomal protein S21 [Bacillus selenitireducens MLS10] # 196 687 243 732 750 222 29 8e-57 MEHLKKKKSFSWRDLLYKSLLFVSTVTLIVYFLPRDGKFNYQFDINKPWKYGQLIATFDF PIYKEDAVVKREQDSLMAFFQPYYQLDKDIEKNAIAKLKENYHTNLKGILPSIDYLRYIE RTLKEIYQAGIVSTENIQQLQKDSTSSVMVIDDKLANPQATGNIYTVKKAYEHLLTADST HFNREVLRQCALNEYITPNLTFDEERTLAAKEEILNNYSWANGLVVSGQKIIDRGEIVSP HTYNILESLRKESIKRNESMGQNRLILGGQVLFVGMLMLCFMLYLDLFRKDYYQRKGSLS LLFTLIVFYSVITAFMVTHNIFNVYIIPYAMLPIIIRVFLDSRTAFLTHVITILICSISL RFPHEFILTQLAAGLVAIFSLRELSQRSQLFRTALLVILTYAAIYFAFELMTENGLSTDF SKLNIRMYTHFIINGILLLFTYPLLFLLEKTFGFTSNVTLVELSNINNDLLRQMSETVPG TFQHSMQVANLAAEAAIRIGAKSQLVRTGALYHDIGKMENPAFFTENQSGGVNPHKNLNY EQSAQVVISHVTDGLKLADKHNLPKAVKDFISTHHGRGKTKYFYISWKNEHPDEEPNEEL FTYPGPNPFTKEQAILMMADAVEAASRSLPEYTEETISNLVDKIIDSQVTEGYFKECPIT FKDIATVKAVFKEKLKIAYHTRISYPELKK >gi|222159329|gb|ACAB01000030.1| GENE 30 46100 - 47617 1400 505 aa, chain + ## HITS:1 COG:BB0372 KEGG:ns NR:ns ## COG: BB0372 COG0008 # Protein_GI_number: 15594717 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Borrelia burgdorferi # 7 505 4 487 490 336 37.0 5e-92 MAERKVRVRFAPSPTGALHIGGVRTALYNYLFARQHGGDLIFRIEDTDSNRFVPGAEEYI LESFKWLGIHFDEGVSFGGECGPYRQSERREIYKKYVQVLLENGKAYIAFDTPEELDAKR AEIANFQYDASTRGMMRNSLTMPKEEVDALIAEGKQYVVRFKIEPNEDVHVNDLIRGEVI INSSILDDKVLYKSADELPTYHLANIVDDHLMEVSHVIRGEEWLPSAPLHVLLYRAFGWE DTMPEFAHLPLLLKPEGNGKLSKRDGDRLGFPVFPLEWHDPKSGEISSGYRESGYLPEAV INFLALLGWNPGNDQEVMSMDELIKLFDLHRCSKSGAKFDYKKGIWFNHQYIQQKPNEEI AELFLPFLKEHGVEAPFEKVVTVVGMMKDRVSFIKELWDVCSFFFVAPTEYDEKTVKKRW KEDSAKCMTELAEVIAGIEDFSIEGQEKVVMDWIAEKGYHTGNIMNAFRLTLVGEGKGPH MFDISWVLGKEETIARMKRAVEVLK 
>gi|222159329|gb|ACAB01000030.1| GENE 31 47631 - 48854 1253 407 aa, chain + ## HITS:1 COG:YPO0055 KEGG:ns NR:ns ## COG: YPO0055 COG1519 # Protein_GI_number: 16120408 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic-acid transferase # Organism: Yersinia pestis # 49 405 51 416 425 140 27.0 7e-33 MLYDLAIVIYDFIVHLAAPFSRKPRKMMKGHWVVYELLRQQVEKGEQYIWFHAASLGEFE QGRPLIEMIRAKYPNYKILLTFFSPSGYEVRKHYRGADIVCYLPFDKPRNVKKFLDIANP CMAFFIKYEFWKNYLDELHKRRIPVYSVSSIFRREQIFFKWYGGTYRNVLKDFDHLFVQN EASKRYLSKIGISRVTVVGDTRFDRVLQIREEAKELPLVEKFKGTNSFTFVAGSSWGPDE DLFLEYFNNHPEMKLIIAPHVIDENHLVEIISKLKRPYVRYTRADERNVLKADCLIIDCF GLLSSIYRYGEIAYIGGGFGVGIHNTLEAAVYGIPVIFGPKYQKFMEAVQLIEAKGAYSI KDYDELKTLLDRFLADELFLRETGTNAGYYVTSNAGATEKIMHMINF >gi|222159329|gb|ACAB01000030.1| GENE 32 48997 - 49518 390 173 aa, chain - ## HITS:1 COG:RP516 KEGG:ns NR:ns ## COG: RP516 COG0663 # Protein_GI_number: 15604376 # Func_class: R General function prediction only # Function: Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily # Organism: Rickettsia prowazekii # 3 170 2 168 185 149 44.0 3e-36 MALIKSVRGFTPEIGENCFLADNATIIGDVKIGNDCSIWFNTVLRGDVNSIRIGNGVNIQ DGSVLHTLYQKSTIEIGDHVSVGHNVTIHGATIKDYALVGMGSTVLDHVVVGEGAIVAAG SLVLSNTIIEPGSIWGGVPAKFIKKVDPEQAKELNQKIAHNYLMYSQWYKEDK >gi|222159329|gb|ACAB01000030.1| GENE 33 49667 - 51448 1217 593 aa, chain - ## HITS:1 COG:Cj0653c KEGG:ns NR:ns ## COG: Cj0653c COG0006 # Protein_GI_number: 15792013 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Campylobacter jejuni # 6 593 5 595 596 462 42.0 1e-130 MKQSIKERVHALRMTFHPNSIKAFIIPSTDPHLSEYVAPHWMSREWISGFTGSAGTAVIL MDKAGLWTDSRYFLQATKELEGSGITLYKEMLPETPSITEFLCQHLKPGESVSIDGKMFS VQQVEQMKEELAAHQLQVDIFGDPLSSIWKDRPAMPDSPAFIYDIKYAGKSCEEKISAIR TELKKKGVYALFISALDEIAWTLNLRGNDVHCNPVIVSYLLITQDEVTYFISPEKVTAEV ETYLKERQIGIQKYDEVETFLNSFPGKNILIDPRKTNYSIYSSINPQCSILRGESPVALL KAIRNEQEVAGIHAAMQRDGVALVKFLKWLEESVSTGKETELSIDKKLHEFRAAQPLYMG 
ESFDTIAGYKEHGAIVHYSATPESDVTLQPRGFLLLDSGAQYLDGTTDITRTIALGELTE EEKTDYTLILKGHIALAMAKFPAGTRGAQLDVLARMPIWNHRMNFLHGTGHGVGHFLSVH EGPQSIRMNENPVILQPGMVTSNEPGVYKAGSHGIRTENLTLVCKDGEGMFGEYFKFETI TLCPICKKGIIKEMLTNEEIEWLNNYHQTVYEKLSPDLNEEEKVWLQEATASL >gi|222159329|gb|ACAB01000030.1| GENE 34 51636 - 51827 310 63 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883083|ref|ZP_02064086.1| hypothetical protein BACOVA_01051 [Bacteroides ovatus ATCC 8483] # 1 63 1 63 63 124 100 3e-27 MIVVPVKEGENIEKALKKFKRKFEKTGIVKELRSRQQFDKPSVTKRLKKERAVYVQQLQQ VED >gi|222159329|gb|ACAB01000030.1| GENE 35 51901 - 52782 727 293 aa, chain + ## HITS:1 COG:SA1328 KEGG:ns NR:ns ## COG: SA1328 COG4974 # Protein_GI_number: 15927078 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Staphylococcus aureus N315 # 2 293 4 295 295 193 35.0 3e-49 MLIESFLDYLQYERNYSEKTVLAYGEDIKQLQEFAQEEYGKFNPLEVEAELIREWIVSLM DKGYTSTSVNRKLSSLRTFYKYLLRQGETTIDPLRKIKGPKNKKPLPVFLKENEMNRLLD ETDFGEGFKGCRDRLIIEMFYATGMRLSELIGLDNKDVDFSASLLKVTGKRNKQRLIPFG DELQELMLEYINVRNETIPERSEAFFIRENGERLYKNLVYNLVKRNLSKVATLKKKSPHV LRHTFATTMLNNEAELGAVKELLGHESITTTEIYTHATFEELKKVYKQAHPRA >gi|222159329|gb|ACAB01000030.1| GENE 36 52798 - 53097 193 99 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163755828|ref|ZP_02162946.1| 30S ribosomal protein S21 [Kordia algicida OT-1] # 1 98 4 101 102 79 41 1e-13 MDVRIQSIHFDASEQLQAFIQKKVSKLEKYYEDIKKVEVSLKVVKPEVAENKEAGIKILI PNGEFYASKVCDTFEEAIDLDVEALGKQLVKYKEKQRSK >gi|222159329|gb|ACAB01000030.1| GENE 37 53671 - 54855 1405 394 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 394 1 407 407 545 66 1e-154 MAKEKFERTKPHVNIGTIGHVDHGKTTLTAAITTVLAKKGLSELRSFDSIDNAPEEKERG ITINTSHVEYQTANRHYAHVDCPGHADYVKNMVTGAAQMDGAIIVCAATDGPMPQTREHI LLARQVNVPRLVVFLNKCDMVDDEEMLELVEMEMRELLSFYDFDGDNTPIIRGSALGALN GVEKWEDKVMELMDAVDNWIPLPPRDVDKPFLMPVEDVFSITGRGTVATGRIETGVIHVG 
DEIEILGLGEDKKSVVTGVEMFRKLLDQGEAGDNVGLLLRGIDKNEIKRGMVLCKPGQIK PHSKFKAEVYILKKEEGGRHTPFHNKYRPQFYLRTMDCTGEITLPEGTEMVMPGDNVTIT VELIYPVALNPGLRFAIREGGRTVGAGQITEIID >gi|222159329|gb|ACAB01000030.1| GENE 38 54996 - 55187 140 63 aa, chain + ## HITS:1 COG:no KEGG:BF4198 NR:ns ## KEGG: BF4198 # Name: not_defined # Def: preprotein translocase SecE subunit # Organism: B.fragilis # Pathway: Protein export [PATH:bfr03060]; Bacterial secretion system [PATH:bfr03070] # 1 63 1 63 63 100 90.0 2e-20 MKKVIAYIKESYDELVHKVSWPTYSELANSAVVVLYASLLIALVVWGMDVCFQNFMEKIV YPH >gi|222159329|gb|ACAB01000030.1| GENE 39 55205 - 55747 561 180 aa, chain + ## HITS:1 COG:CC3205 KEGG:ns NR:ns ## COG: CC3205 COG0250 # Protein_GI_number: 16127435 # Func_class: K Transcription # Function: Transcription antiterminator # Organism: Caulobacter vibrioides # 7 179 12 183 185 150 47.0 9e-37 MAEIEKKWYVLRAISGKEAKVKEYLEADLKNSDLGEYVSQVLIPTEKVYQVRNGKKIVKE RSYLPGYVLVEAALVGEVAHHLRNTPNVIGFLGGSEKPVPLRQSEVNRILGTVDELQETG EELNIPYVVGETVKVTFGPFSGFSGIIEEVNSEKKKLKVMVKIFGRKTPLELGFMQVEKE >gi|222159329|gb|ACAB01000030.1| GENE 40 55809 - 56252 743 147 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883077|ref|ZP_02064080.1| hypothetical protein BACOVA_01040 [Bacteroides ovatus ATCC 8483] # 1 147 1 147 147 290 100 2e-77 MAKEVAGLIKLQIKGGAANPSPPVGPALGSKGINIMEFCKQFNARTQDKAGKILPVIITY YADKSFDFVIKTPPVAIQLLEVAKVKSGSAEPNRKKVAELTWEQVRTIAQDKMVDLNCFT VEAAMRMVAGTARSMGIAVKGEFPVNN >gi|222159329|gb|ACAB01000030.1| GENE 41 56268 - 56966 1160 232 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348146|ref|NP_811649.1| 50S ribosomal protein L1 [Bacteroides thetaiotaomicron VPI-5482] # 1 232 1 232 232 451 100 1e-126 MGKLTKNQKLAAEKIEAGKAYSLKEAASLVKEITFTKFDASLDIDVRLGVDPRKANQMVR GVVSLPHGTGKEVRVLVLCTPDAEAAAKEAGADYVGLDEYIEKIKGGWTDIDVIITMPSI MGKIGALGRVLGPRGLMPNPKSGTVTMDVAKAVKEVKQGKIDFKVDKSGIVHTSIGKVSF SPDQIRDNAKEFISTLNKLKPTAAKGTYIKSIYLSSTMSAGIKIDPKSVDEI >gi|222159329|gb|ACAB01000030.1| GENE 42 56982 - 57500 849 172 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|237717390|ref|ZP_04547871.1| ribosomal protein L10 [Bacteroides sp. D1] # 1 172 1 172 172 331 100 9e-90 MRKEDKSTIIEQIAATVKEYGHFYLVDVTAMNAAATSALRRDCFKSDIKLMLVKNTLLHK ALESLEEDFSPLYGSLKGTTAVMFSNTANVPAKLIKDKAKDGIPGLKAAYAEESFYIGAD QLDALVSIKSKNEVIADIVALLQSPAKNVISALQSGGNTLHGVLKTLGERPE >gi|222159329|gb|ACAB01000030.1| GENE 43 57548 - 57922 589 124 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|153805949|ref|ZP_01958617.1| hypothetical protein BACCAC_00194 [Bacteroides caccae ATCC 43185] # 1 124 1 124 124 231 100 1e-59 MADLKAFAEQLVNLTVKEVNELATILKEEYGIEPAAAAVAVAAGPAAGAAAAEEKTSFDV VLKSAGAAKLQVVKAVKEACGLGLKEAKDLVDGAPSTVKEGLAKDEAESLKKTLEEAGAE VELK >gi|222159329|gb|ACAB01000030.1| GENE 44 58027 - 61839 2908 1270 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 [alpha proteobacterium BAL199] # 9 1269 16 1387 1392 1124 45 0.0 MSSNTVNQRVNFASTKNPLEYPDFLEVQLKSFQDFLQLDTPPEKRKNEGLYKVFAENFPI ADTRNNFVLEFLDYYIDPPRYTIDDCIERGLTYSVPLKAKLKLYCTDPDHEDFDTVIQDV FLGPIPYMTDKATFVINGAERVVVSQLHRSPGVFFGQSVHANGTKLYSARIIPFKGSWIE FATDINNVMYAYIDRKKKLPVTTLLRAIGFENDKDILEIFDLAEDVKVNKTNLKKVLGRK LAARVLKTWIEDFVDEDTGEVVSIERNEVIIDRETVLEEEHIDEILESGVQNILLHKDEP NQSDFSIIYNTLQKDPSNSEKEAVLYIYRQLRNADPADDASAREVINNLFFSEKRYDLGD VGRYRINKKLNLTTDMDVRVLTKEDIIEIIKYLIELINSKADVDDIDHLSNRRVRTVGEQ LSNQFAVGLARMSRTIRERMNVRDNEVFTPIDLINAKTISSVINSFFGTNALSQFMDQTN PLAEITHKRRMSALGPGGLSRERAGFEVRDVHYTHYGRLCPIETPEGPNIGLISSLCVFA KINELGFIETPYRKVENGKVDLSDNGLIYLTAEEEEEKVIAQGNAPLNDDGTFVLNRVKS RQDADFPVVEPSEVDLMDVAPQQIASIAASLIPFLEHDDANRALMGSNMMRQAVPLLRSE APIVGTGIERQLVRDSRTQITAEGDGVVDYVDATTIRILYDRTEDEEFVSFEPALKEYRI PKFRKTNQNMTIDLRPICDKGQRVKKGDILTEGYSTEKGELALGKNLLVAYMPWKGYNYE DAIVLNERVVREDLLTSVHVEEYSLEVRETKRGMEELTSDIPNVSEEATKDLDENGIVRI GARIEPGDIMIGKITPKGESDPSPEEKLLRAIFGDKAGDVKDASLKASPSLKGVVIDKKL FSRVIKNRSSKLADKALLPKIDDEFESKVADLKRILVKKLMTLTEGKVSQGVKDYLGAEV IAKGSKFSASDFDSLDFTSIQLSNWTSDDHINGMIRDLVMNFIKKYKELDAELKRKKFAI TIGDELPAGIIQMAKVYIAKKRKIGVGDKMAGRHGNKGIVSRVVRQEDMPFLADGTPVDI 
VLNPLGVPSRMNIGQIFEAVLGRAGKTLGVKFATPIFDGATMEDLDQWTDKAGLPRYCKT YLCDGGTGEQFDQAATVGVTYMLKLGHMVEDKMHARSIGPYSLITQQPLGGKAQFGGQRF GEMEVWALEGFGAAHILQEILTIKSDDVVGRSKAYEAIVKGEPMPQPGIPESLNVLLHEL RGLGLSINLE >gi|222159329|gb|ACAB01000030.1| GENE 45 61946 - 66229 4411 1427 aa, chain + ## HITS:1 COG:mlr0277 KEGG:ns NR:ns ## COG: mlr0277 COG0086 # Protein_GI_number: 13470543 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta' subunit/160 kD subunit # Organism: Mesorhizobium loti # 13 1395 18 1356 1398 1327 50.0 0 MAFRKENKTKSNFSKISIGLASPEEILENSSGEVLKPETINYRTYKPERDGLFCERIFGP IKDYECHCGKYKRIRYKGIVCDRCGVEVTEKKVRRERMGHIQLVVPVAHIWYFRSLPNKI GYLLGLPTKKLDSIIYYERYVVIQPGVKAEDGVAEYDLLSEEEYLDILDTLPKDNQYLED NDPNKFVAKMGAEAIYDLLARLDLDALSYELRHRAGNDASQQRKNEALKRLQVVESFRAS RGRNKPEWMIVRIVPVIPPELRPLVPLDGGRFATSDLNDLYRRVIIRNNRLKRLIEIKAP EVILRNEKRMLQESVDSLFDNSRKSSAVKTDANRPLKSLSDSLKGKQGRFRQNLLGKRVD YSARSVIVVGPELKMGECGIPKLMAAELYKPFIIRKLIERGIVKTVKSAKKIVDRKEPVI WDILEHVMKGHPVLLNRAPTLHRLGIQAFQPKMIEGKAIQLHPLACTAFNADFDGDQMAV HLPLSNEAILEAQMLMLQSHNILNPANGAPITVPAQDMVLGLYYITKLRAGAKGEGLTFY GPEEALIAYNEGKVDIHAPVKVIVKDVDENGNIVDVMRETSVGRVIVNEIVPPEAGYINT IISKKSLRDIISDVIKVCGVAKAADFLDGIKNLGYQMAFKGGLSFNLGDIIIPKEKETLV QKGYDEVEQVVNNYNMGFITNNERYNQVIDIWTHVNSELSNILMKTISSDDQGFNSVYMM LDSGARGSKEQIRQLSGMRGLMAKPQKAGAEGGQIIENPILSNFKEGLSVLEYFISTHGA RKGLADTALKTADAGYLTRRLVDVSHDVIITEEDCGTLRGLVCTDLKNNDEVIATLYERI LGRVSVHDIIHPTTGELLVAGGEEITEEVAKKIQESPIESVEIRSVLTCEAKKGVCAKCY GRNLATSRMVQKGEAVGVIAAQSIGEPGTQLTLRTFHAGGTAANIAANASIVAKNNARLE FEELRTVDIVDEMGESAKVVVGRLAEVRFVDVNTGIVLSTHNVPYGSTLYVSDGDLVEKG KLIAKWDPFNAVIITEATGKIEFEGVIENVTYKVESDEATGLREIIIIESKDKTKVPTAH ILTEDGDLIRTYNLPVGGHVIIENGQKVKAGEVIVKIPRAVGKAGDITGGLPRVTELFEA RNPSNPAVVSEIDGEVTMGKIKRGNREIIVTSKTGEVKKYLVALSKQILVQENDYVRAGT PLSDGATTPADILAIKGPTAVQEYIVNEVQDVYRLQGVKINDKHFEIIVRQMMRKVQIDE PGDTRFLEQQVVDKLEFMEENDRIWGKKVVVDAGDSQNMQAGQIVTARKLRDENSMLKRR DLKPVEVRDAVAATSTQILQGITRAALQTSSFMSAASFQETTKVLNEAAINGKIDKLEGM KENVICGHLIPAGTGQREFEKLIVGSKEEYDRILANKKTVLDYNEVE 
>gi|222159329|gb|ACAB01000030.1| GENE 46 66376 - 66684 340 102 aa, chain + ## HITS:1 COG:no KEGG:BT_2732 NR:ns ## KEGG: BT_2732 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 102 1 101 101 164 89.0 1e-39 MEEQNNNGQLQIELREEVAQGTYANLAIITHSSSEFILDFVRVMPGIPKAGVQSRIIVAP EHAKRLLRALEDNIAKYERVFGPIRTSDEPPISPLTGVKGEA >gi|222159329|gb|ACAB01000030.1| GENE 47 66821 - 67231 701 136 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883069|ref|ZP_02064072.1| hypothetical protein BACOVA_01032 [Bacteroides ovatus ATCC 8483] # 1 136 1 136 136 274 100 1e-72 MPTIQQLVRKGREVLVEKSKSPALDSCPQRRGVCVRVYTTTPKKPNSAMRKVARVRLTNQ KEVNSYIPGEGHNLQEHSIVLVRGGRVKDLPGVRYHIVRGTLDTAGVAGRTQRRSKYGAK RPKPGQAAAPAKGKKK >gi|222159329|gb|ACAB01000030.1| GENE 48 67375 - 67851 809 158 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883068|ref|ZP_02064071.1| hypothetical protein BACOVA_01031 [Bacteroides ovatus ATCC 8483] # 1 158 1 158 158 316 100 4e-85 MRKAKPKKRLILPDPVFNDQKVSKFVNHLMYDGKKNTSYEIFYAALETVKAKLPNEEKSA LEIWKKALDNVTPQVEVKSRRVGGATFQVPTEIRPDRKESISMKNLILFARKRGGKSMAD KLAAEIMDAFNEQGGAYKRKEDMHRMAEANRAFAHFRF >gi|222159329|gb|ACAB01000030.1| GENE 49 67897 - 70014 1879 705 aa, chain + ## HITS:1 COG:HP1195 KEGG:ns NR:ns ## COG: HP1195 COG0480 # Protein_GI_number: 15645809 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Helicobacter pylori 26695 # 3 700 4 692 692 842 59.0 0 MAKHDLHLTRNIGIMAHIDAGKTTTSERILFYTGLTHKIGEVHDGAATMDWMEQEQERGI TITSAATTTRWKYAGDTYKINLIDTPGHVDFTAEVERSLRILDGAVAAYCAVGGVEPQSE TVWRQADKYNVPRIAYVNKMDRSGADFFEVVRQMKDVLGANPCPIVVPIGAEESFKGLVD LIKMKAIYWHDETMGADYSIEEIPAELIDEANEWRDKMLEKVAEFDDALMEKYFDDPSTI TEEEVLRALRNATVQMAVVPMLCGSSFKNKGVQTLLDYVCAFLPSPLDTENVIGTNPNTG AEEDRKPSDDEKTSALAFKIATDPYVGRLTFFRVYSGKIEAGSYIYNSRSGKKERVSRLF QMHSNKQNPVEVIGAGDIGAGVGFKDIRTGDTLCDETAPIVLESMDFPEPVIGIAVEPKT QKDMDKLSNGLAKLAEEDPTFTVKTDEQTGQTVISGMGELHLDIIIDRLKREFKVECNQG 
KPQVNYKEAITKTVNLREVYKKQSGGRGKFADIIVNIGPADADFTLGGLQFVDEVKGGNI PKEFIPAVQKGFTNAMKSGVLAGYPLDSLKVTLVDGSFHPVDSDQLSFEICAMQAYKNAC AKAGPVLMEPIMKLEVVTPEENMGDVIGDLNKRRGQVEGMESSRSGARIVKAMVPLAEMF GYVTALRTITSGRATSSMTYSHHAQLSSSIAKAVLEEVKGRADLL >gi|222159329|gb|ACAB01000030.1| GENE 50 70095 - 70400 495 101 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348137|ref|NP_811640.1| 30S ribosomal protein S10 [Bacteroides thetaiotaomicron VPI-5482] # 1 101 1 101 101 195 100 1e-48 MSQKIRIKLKSYDHNLVDKSAEKIVRTVKATGAIVSGPIPLPTHKRIFTVNRSTFVNKKS REQFELSSYKRLIDIYSSTAKTVDALMKLELPSGVEVEIKV >gi|222159329|gb|ACAB01000030.1| GENE 51 70419 - 71036 1070 205 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237717399|ref|ZP_04547880.1| 50S ribosomal protein L3 [Bacteroides sp. D1] # 1 205 1 205 205 416 100 1e-115 MPGLLGKKIGMTSVFSADGKNVPCTVIEAGPCVVTQVKTVEKDGYAAVQLGFQDKKEKHT TKPLMGHFKRAGVTPKRHLAEFKEFETELNLGDTITVEMFNDASFVDVVGTSKGKGFQGV VKRHGFGGVGQATHGQHNRARKPGSIGACSYPAKVFKGMRMGGQLGGDRVTVQNLQVLKV IADHNLLLVKGSIPGCKGSIVLIEK >gi|222159329|gb|ACAB01000030.1| GENE 52 71036 - 71662 1046 208 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237717400|ref|ZP_04547881.1| 50S ribosomal protein L4 [Bacteroides sp. 
D1] # 1 208 1 208 208 407 100 1e-112 MEVNVYNIKGEDTGRKVTLNESIFGIEPNDHAIYLDVKQFMANQRQGTHKSKERSEISGS TRKIGRQKGGGGARRGDMNSPVLVGGARVFGPKPRDYFFKLNKKVKTLARKSALSYKAQN DAIVVVEDFTFEAPKTKDFVAMTKNLKVSDKKLLVILPEANKNVYLSARNIEGANVQTIS GLNTYRVLNAGVIVLTENSLKAIDNILI >gi|222159329|gb|ACAB01000030.1| GENE 53 71679 - 71969 480 96 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883062|ref|ZP_02064065.1| hypothetical protein BACOVA_01025 [Bacteroides ovatus ATCC 8483] # 1 96 1 96 96 189 100 6e-47 MGIIIKPLVTEKMTAITDKLNRFGFVVRPEANKLEIKKEIEALYNVTVVDVNTVKYAGKN KSRYTKAGIINGRTNAFKKAIVTLKEGDTIDFYSNI >gi|222159329|gb|ACAB01000030.1| GENE 54 71975 - 72799 1416 274 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|153805960|ref|ZP_01958628.1| hypothetical protein BACCAC_00205 [Bacteroides caccae ATCC 43185] # 1 274 1 274 274 550 100 1e-155 MAVRKFKPTTPGQRHKIIGTFEEITASVPEKSLVYGKKSSGGRNSEGKMTMRYIGGGHRK VIRIVDFKRNKDGVPAVVKTIEYDPNRSARIALLFYADGEKRYIIAPNGLQVGATLMSGE TAAPEIGNALPLQNIPVGTVIHNIELRPGQGAALVRSAGNFAQLTSREGKYCVIKLPSGE VRQILSTCKATIGSVGNSDHGLESSGKAGRSRWQGRRPRNRGVVMNPVDHPMGGGEGRAS GGHPRSRKGLYAKGLKTRAPKKQSSKYIIERRKK >gi|222159329|gb|ACAB01000030.1| GENE 55 72820 - 73089 465 89 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|153805961|ref|ZP_01958629.1| hypothetical protein BACCAC_00206 [Bacteroides caccae ATCC 43185] # 1 89 1 89 89 183 100 3e-45 MSRSLKKGPYINVKLEKRILAMNESGKKVVVKTWARASMISPDFVGHTVAVHNGNKFIPV YVTENMVGHKLGEFAPTRTFRGHAGNKKR >gi|222159329|gb|ACAB01000030.1| GENE 56 73126 - 73536 679 136 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348131|ref|NP_811634.1| 50S ribosomal protein L22 [Bacteroides thetaiotaomicron VPI-5482] # 1 136 1 136 136 266 100 5e-70 MGARKKISAEKRKEALKTMYFAKLQNVPTSPRKMRLVADMIRGMEVNRALGVLKFSSKEA AARVEKLLRSAIANWEQKNERKAESGELFVTQIFVDGGATLKRMRPAPQGRGYRIRKRSN HVTLFVGAKSNNEDQN >gi|222159329|gb|ACAB01000030.1| GENE 57 73542 - 74273 1248 243 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883058|ref|ZP_02064061.1| hypothetical protein BACOVA_01021 [Bacteroides ovatus ATCC 8483] # 1 243 1 243 243 
485 100 1e-136 MGQKVNPISNRLGIIRGWDSNWYGGNDYGDSLLEDSKIRKYLNARLAKASVSRIVIERTL KLVTITVCTARPGIIIGKGGQEVDKLKEELKKVTDKDIQINIFEVKRPELDAVIVANNIA RQVEGKIAYRRAIKMAIANTMRMGAEGIKIQISGRLNGAEMARSEMYKEGRTPLHTFRAD IDYCHAEALTKVGLLGIKVWICRGEVFGKKELAPNFTQSKESGRGNNSGNNGGKNFKRKK NNR >gi|222159329|gb|ACAB01000030.1| GENE 58 74297 - 74731 737 144 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|153805964|ref|ZP_01958632.1| hypothetical protein BACCAC_00209 [Bacteroides caccae ATCC 43185] # 1 144 1 144 144 288 100 9e-77 MLQPKKTKFRRQQKGRQKGNAQRGNQLAFGSFGIKALETKWITGRQIEAARIAVTRYMQR QGQIWIRIFPDKPITRKPADVRMGKGKGAPEGFVAPVTPGRIIIEAEGVSYEIAKEALRL AAQKLPITTKFVVRRDYDIQNQNA >gi|222159329|gb|ACAB01000030.1| GENE 59 74737 - 74934 321 65 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883056|ref|ZP_02064059.1| hypothetical protein BACOVA_01019 [Bacteroides ovatus ATCC 8483] # 1 65 1 65 65 128 100 2e-28 MKIAEIKEMTTNDLVERVEAETANYNQMVINHSISPLENPAQIKQLRRTIARMKTELRER ELNNK >gi|222159329|gb|ACAB01000030.1| GENE 60 74931 - 75200 456 89 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883055|ref|ZP_02064058.1| hypothetical protein BACOVA_01018 [Bacteroides ovatus ATCC 8483] # 1 89 1 89 89 180 100 3e-44 MISLMEARNLRKERTGVVLSNKMDKTITVAAKFKEKHPIYGKFVSKTKKYHAHDEKNECN IGDTVSIMETRPLSKTKRWRLVEIIERAK >gi|222159329|gb|ACAB01000030.1| GENE 61 75203 - 75568 596 121 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348126|ref|NP_811629.1| 50S ribosomal protein L14 [Bacteroides thetaiotaomicron VPI-5482] # 1 121 1 121 121 234 100 2e-60 MIQVESRLTVCDNSGAKEALCIRVLGGTGRRYASVGDVIVVSVKSVIPSSDVKKGAVSKA LIVRTKKEIRRPDGSYIRFDDNACVLLNNAGEIRGSRIFGPVARELRATNMKVVSLAPEV L >gi|222159329|gb|ACAB01000030.1| GENE 62 75588 - 75908 534 106 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883053|ref|ZP_02064056.1| hypothetical protein BACOVA_01016 [Bacteroides ovatus ATCC 8483] # 1 106 1 106 106 210 100 3e-53 MSKLHIKKGDTVYVNAGEDKGKTGRVLKVLVKEGRAFVEGINMVSKSTKPNAKNPQGGIV KQEASIHISNLNPVDPKTGKATRIGRKKSSEGTLVRYSKKSGEEIK >gi|222159329|gb|ACAB01000030.1| GENE 63 75908 - 
76465 913 185 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237717411|ref|ZP_04547892.1| 50S ribosomal protein L5 [Bacteroides sp. D1] # 1 185 1 185 185 356 100 3e-97 MSNTASLKKEYAERIAPALKSQFQYSSTMQIPVLKKIVINQGLGMAVADKKIIEVAINEM TAITGQKAVATISRKDIANFKLRKKMPIGVMVTLRRERMYEFLEKLVRVALPRIRDFKGI ESKFDGKGNYTLGIQEQIIFPEINIDSITRILGMNITFVTSAETDEEGYALLKEFGLPFK NAKKD >gi|222159329|gb|ACAB01000030.1| GENE 64 76471 - 76770 503 99 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|153805970|ref|ZP_01958638.1| hypothetical protein BACCAC_00215 [Bacteroides caccae ATCC 43185] # 1 99 1 99 99 198 100 1e-49 MAKESMKAREVKRAKLVAKYAEKRAALKQIVRTGDPADAFEAAQKLQELPKNSNPIRMHN RCKLTGRPKGYIRQFGISRIQFREMASNGLIPGVKKASW >gi|222159329|gb|ACAB01000030.1| GENE 65 76823 - 77218 664 131 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883050|ref|ZP_02064053.1| hypothetical protein BACOVA_01013 [Bacteroides ovatus ATCC 8483] # 1 131 1 131 131 260 100 3e-68 MTDPIADYLTRLRNAIGAKHRVVEVPASNLKKEITKILFEKGYILNYKFVEDGPQGTIKV ALKYDSVNKVNAIKKLERISSPGMRKYTGYKDMPRVINGLGIAIISTSKGVMTNKEAAEL KIGGEVLCYVY >gi|222159329|gb|ACAB01000030.1| GENE 66 77234 - 77803 968 189 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237717414|ref|ZP_04547895.1| 50S ribosomal protein L6 [Bacteroides sp. 
D1] # 1 189 1 189 189 377 100 1e-103 MSRIGKLPISIPAGVTITLKDNVVTVKGPKGEMSQYVNPAINVTIEDGHVTLTENDKEML DNPKQKHAFHGLYRSLVHNMVVGVSEGYKKELELVGVGYRASNQGNIIELALGYTHNIFI QLPAEVKVETKSERNKNPLIILESCDKQLLGQVCSKIRSFRKPEPYKGKGIKFVGEVIRR KSGKSAGAK >gi|222159329|gb|ACAB01000030.1| GENE 67 77825 - 78169 541 114 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348120|ref|NP_811623.1| 50S ribosomal protein L18 [Bacteroides thetaiotaomicron VPI-5482] # 1 114 1 114 114 213 96 5e-54 MTTKIERRVKIKYRVRNKISGTTECPRMSVFRSNKQIYVQIIDDLSGKTLAAASSLGLTE KVAKKEQAAKVGEMIAKKAQEAGITTVVFDRNGYLYHGRVKEVADAARNGGLKF >gi|222159329|gb|ACAB01000030.1| GENE 68 78175 - 78693 848 172 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883047|ref|ZP_02064050.1| hypothetical protein BACOVA_01010 [Bacteroides ovatus ATCC 8483] # 1 172 1 172 172 331 100 1e-89 MAGVNNRVKITNDIELKDRLVAINRVTKVTKGGRTFSFSAIVVVGNEEGIIGWGLGKAGE VTAAIAKGVESAKKNLVKVPILKGTVPHEQSARFGGAEVFIKPASHGTGVVAGGAMRAVL ESVGVTDVLAKSKGSSNPHNLVKATIEALSEMRDARMVAQNRGISVEKVFRG >gi|222159329|gb|ACAB01000030.1| GENE 69 78703 - 78879 281 58 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|53715448|ref|YP_101440.1| 50S ribosomal protein L30 [Bacteroides fragilis YCH46] # 1 58 1 58 58 112 100 7e-24 MSTIKIKQVKSRIGAPADQKRTLDALGLRKLNRVVEHESTPSILGMVDKVKHLVAIVK >gi|222159329|gb|ACAB01000030.1| GENE 70 78912 - 79358 739 148 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237717418|ref|ZP_04547899.1| 50S ribosomal protein L15 [Bacteroides sp. 
D1] # 1 148 1 148 148 289 100 5e-77 MNLSNLKPAEGSTKTRKRIGRGAGSGLGGTSTRGHKGAKSRSGYSKKIGFEGGQMPLQRR VPKFGFKNINRVEYKAINLDTIQKLAEAKSLTKVGVNDFIEAGFISSNQLVKVLGNGTLT NKLEVEAHAFSKTATAAIEAAGGTVVKL >gi|222159329|gb|ACAB01000030.1| GENE 71 79363 - 80703 876 446 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 [alpha proteobacterium BAL199] # 13 442 19 447 447 342 41 7e-93 MRKAIETLKNIWKIEDLRQRILITILFVAIYRFGSYVVLPGINPAMLAKLHEQTSEGLLA LLNMFSGGAFSNASIFALGIMPYISASIVIQLLGIAVPYFQKLQREGESGRRKMNQYTRY LTIAILLVQAPSYLLNLKMQAGPSLNASLDWTLFMVTSTIILAAGSMFILWLGERITDKG IGNGISFIILIGIIARFPDALLQEVVSRVANKSGGLIMFIIEIVFLLLVIGAAILLVQGT RKIPVQYAKRIVGNKQYGGARQYIPLKVNAAGVMPIIFAQAIMFIPITFIGFSNVNDAGG FLHAFTDHTSFWYNFVFAVMIILFTYFYTAITINPTQMAEDMKRNNGFIPGIKPGKKTAE YIDDIMSRITLPGSFFLALVAIMPAFAGVFGVQAGFAQFFGGTSLLILVGVVLDTLQQVE SHLLMRHYDGLLKSGRIKGRAGVAAY >gi|222159329|gb|ACAB01000030.1| GENE 72 80719 - 81516 607 265 aa, chain + ## HITS:1 COG:BH0156 KEGG:ns NR:ns ## COG: BH0156 COG0024 # Protein_GI_number: 15612719 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Bacillus halodurans # 1 251 1 246 248 266 50.0 4e-71 MIFLKTEDEIELLRQSNLLVGKTLAEVAKLVKPGVTTCELDKVAEEFIRDHGATPTFKGF PNQYGEPFPASLCTSVNEQVVHGIPGDIVLKDGDIVSVDCGTYMNGFCGDSAYTFCVGEV DEEIRNLLKVTKEALYIGIQNAVQGKRIGDIGYAIQQYCESHSYGVVREFVGHGIGKEMH EDPQVPNYGKRGYGPLMKRGLCIAIEPMITLGDRQVIMESDGWTVRTRDRKCAAHFEHTI AVGAGEADILSSFKFIEEVLGDKAI >gi|222159329|gb|ACAB01000030.1| GENE 73 81518 - 81736 239 72 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 [Streptococcus pneumoniae TIGR4] # 1 72 1 72 72 96 61 5e-19 MAKQSAIEQDGVIVEALSNAMFRVELENGHEITAHISGKMRMHYIKILPGDKVRVEMSPY DLSKGRIVFRYK >gi|222159329|gb|ACAB01000030.1| GENE 74 81745 - 81861 198 38 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|53715443|ref|YP_101435.1| 50S ribosomal protein L36 [Bacteroides fragilis YCH46] # 1 38 1 38 38 80 100 3e-14 
MKVRASLKKRTPECKIVRRNGRLYVINKKNPKYKQRQG >gi|222159329|gb|ACAB01000030.1| GENE 75 81895 - 82275 637 126 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348113|ref|NP_811616.1| 30S ribosomal protein S13 [Bacteroides thetaiotaomicron VPI-5482] # 1 126 1 126 126 249 100 3e-65 MAIRIVGVDLPQNKRGEIALTYVYGIGRSSSAKILDKAGVDKDLKVKDWTDDQAAKIREI IGAEYKVEGDLRSEIQLNIKRLMDIGCYRGVRHRIGLPVRGQSTKNNARTRKGRKKTVAN KKKATK >gi|222159329|gb|ACAB01000030.1| GENE 76 82287 - 82676 665 129 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348112|ref|NP_811615.1| 30S ribosomal protein S11 [Bacteroides thetaiotaomicron VPI-5482] # 1 129 1 129 129 260 100 2e-68 MAKKTVAAKKRNVKVDANGQLHVHSSFNNIIVSLANSEGQIISWSSAGKMGFRGSKKNTP YAAQMAAQDCAKIAFDLGLRKVKAYVKGPGNGRESAIRTIHGAGIEVTEIIDVTPLPHNG CRPPKRRRV >gi|222159329|gb|ACAB01000030.1| GENE 77 82794 - 83399 1028 201 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883039|ref|ZP_02064042.1| hypothetical protein BACOVA_01002 [Bacteroides ovatus ATCC 8483] # 1 201 1 201 201 400 100 1e-110 MARYTGPKSRIARKFGEGIFGADKVLSKKNYPPGQHGNSRKRKTSEYGVQLREKQKAKYT YGVLEKQFRNLFEKAATAKGITGEVLLQLLEGRLDNVVFRLGIAPTRAAARQLVSHKHIT VDGEVVNIPSFAVKPGQLIGVRERSKSLEVIANSLAGFNHSKYAWLEWDEASKVGKMLHI PERADIPENIKEHLIVELYSK >gi|222159329|gb|ACAB01000030.1| GENE 78 83411 - 84403 1021 330 aa, chain + ## HITS:1 COG:BMEI0781 KEGG:ns NR:ns ## COG: BMEI0781 COG0202 # Protein_GI_number: 17987064 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, alpha subunit/40 kD subunit # Organism: Brucella melitensis # 8 325 11 321 337 238 42.0 1e-62 MAILAFQKPDKVLMLEADSRFGKFEFRPLEPGFGITVGNALRRILLSSLEGFAITTIRID GVEHEFSSVPGVKEDVTNIILNLKQVRFKQVVEEFESEKVSITVENSSEFKAGDIGKYLT GFEVLNPELVICHLDSKSTMQIDITINKGRGYVPADENREYCTDVNVIPIDSIYTPIRNV KYAVENFRVEQKTDYEKLVLEISTDGSIHPKEALKEAAKILIYHFMLFSDEKITLESNDT DGNEEFDEEVLHMRQLLKTKLVDMDLSVRALNCLKAADVETLGDLVQFNKTDLLKFRNFG KKSLTELDDLLESLNLSFGTDISKYKLDKE >gi|222159329|gb|ACAB01000030.1| GENE 79 84407 - 84910 834 167 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|237717427|ref|ZP_04547908.1| 50S ribosomal protein L17 [Bacteroides sp. D1] # 1 167 1 167 167 325 100 5e-88 MRHNKKFNHLGRTASHRSAMLSNMACSLIKHKRITTTVAKAKALKKFVEPLITKSKEDTT NSRRVVFSNLQDKIAVTELFKEISVKIADRPGGYTRIIKTGNRLGDNAEMCFIELVDYNE NMAKEKVAKKATRTRRSKKAAEAAPAAVEAPATEAPKAEEPKAESAE >gi|222159329|gb|ACAB01000030.1| GENE 80 85058 - 85453 125 131 aa, chain + ## HITS:1 COG:no KEGG:BT_2699 NR:ns ## KEGG: BT_2699 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 131 1 131 131 195 73.0 5e-49 MAKFVKVILFLLLTVAFHSIAGNVSTERMVDKQGCDITYAMGQRGEICAPDLPGKPVAEL TNLQSHQISVTRIQRVQLGEYFFSLKNALQGCADRESSLSQHWGRIYDTTTSYYCQPSSE YYVYTLRRIII >gi|222159329|gb|ACAB01000030.1| GENE 81 85566 - 86108 423 180 aa, chain + ## HITS:1 COG:no KEGG:BT_2698 NR:ns ## KEGG: BT_2698 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 179 3 178 179 258 82.0 6e-68 MATTQTIDATIFASTHPDIAKRISVSGVLISSVMLLIGILAFASTFELDDKSSTTSMALM VLGTGLFLIGIFRLFWKSKEVVYLPTKSVAKEHSVFFDLKHMDALKNLVNSGSFSADSKI KSEASGNIRMDVILSADKKFAAVQLFQFVPYTYQPITSVQYFTNEKASAIVAFLSKSKIQ >gi|222159329|gb|ACAB01000030.1| GENE 82 86261 - 88075 1191 604 aa, chain - ## HITS:1 COG:CAC3034 KEGG:ns NR:ns ## COG: CAC3034 COG0249 # Protein_GI_number: 15896285 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Clostridium acetobutylicum # 15 601 11 597 598 249 27.0 8e-66 MEKLEQIITTYQQIIQKTELELQNARKRIYYISLLRLILFVGAVANAIIFWSDGWLCLSA FAVLPFILFIWLVKRHNFWFYRKDFLKKKIEINEQELRAMQYDFSDFDDGKEFINPSHLY TLDLDVFGEHSLFQYINRTATPIGKQHLANWFNAHLENKAAIDHRQEAIRELSTDLEYRQ QIRLLGLLYKGKPADTTEIKEWAASPSYYRKRTLLRILPIAVTVINFLCISLAISGIISA NIAGGVFVSFVLFSTIFSKGITKLQTTYGEKLQILSTYADQILLTEQKEMHSHILQELKT DLTSQNQTASQAVRQLSKLMNALDQRSNLLMSTILNGLIFWELRQVMRIEKWKEIHASDL PRWIETIGEIDAYCSLATFAYNHPDYIYPTIAAQSFHLQAEALGHPLMDRNQCVRNGIDI EKRPFFIIITGANMAGKSTYLRTVGVNYLLACIGAPVWAKQMKIYPARLVTSLRTSDSLA 
DNESYFFAELKRLKLIIDKLEAGEELFIILDEILKGTNSMDKQKGSFALIKQFMHMNANG IIATHDLLLGTLIESFPQNIRNYCFEADITNNELTFSYQMRNGVAQNMNACFLMKKMGIA VIND >gi|222159329|gb|ACAB01000030.1| GENE 83 88330 - 88974 491 214 aa, chain + ## HITS:1 COG:no KEGG:BT_2695 NR:ns ## KEGG: BT_2695 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 214 8 221 222 354 80.0 1e-96 MKKGCILCLLISVISLPLIAQNIIYSNLKELLTQHGDTTAVLRVEKRSRNQIVLTGGADY RITAGDDESMCRRLKKRCFAVRDEKGNLYLNCRKLRYKKLRFGAWYAPAVQLGKNIYFCA MPLGSVIGGNFVEEDDVKLGGNIGDALAASSLVTKRVCYELNGETGKVDFLGKDRMLQLL KNYPELKQAYLKDDSQEAKHTFKYLLELQNKQKK >gi|222159329|gb|ACAB01000030.1| GENE 84 89178 - 89357 60 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237717432|ref|ZP_04547913.1| ## NR: gi|237717432|ref|ZP_04547913.1| predicted protein [Bacteroides sp. D1] # 1 59 1 59 59 104 100.0 2e-21 MSKLSYAFIQARIQAVLSKQIDNPFTEVDSRVGITLNCKTIIKTNGGASLKLKIDFSLK >gi|222159329|gb|ACAB01000030.1| GENE 85 89515 - 91512 1555 665 aa, chain + ## HITS:1 COG:SPy2082 KEGG:ns NR:ns ## COG: SPy2082 COG2987 # Protein_GI_number: 15675840 # Func_class: E Amino acid transport and metabolism # Function: Urocanate hydratase # Organism: Streptococcus pyogenes M1 GAS # 1 659 13 671 676 1019 70.0 0 MRITLGNTLPPYPDFVEGIRRAPDRGYTLTPAQTITALKNALRYIPIEWHEQLAPELMEE LRTRGRIYGYRFRPAGDLKAKPIDEYRGQCIEGKAFQVMIDNNLCFDIALYPYELVTYGE TGQVCQNWMQYRLIKQYLEELTQEQTLVIESGHPLGLFRSRPDAPRVIITNSMMIGQFDN QHDWHIAAQMGVANYGQMTAGGWMYIGPQGIVHGTFNTLLNAGRLKLGIPQDQNLRGHLF VSSGLGGMSGAQPKAAEIAGAVSIIAEVDSSRIETRYRQGWVGHVTADIAEAYRMASQAM QRREPCSIAYHGNVVDLLEYAERERIPIELLSDQTSCHAVYEGGYCPAGLTFEERTRLLH ESPEQFRHLVDISLHRHFEVIKKLVARGTYFFDYGNSFMKAIYDAGVKEISRNGVDEKDG FIWPSYVEDIMGPQLFDYGYGPFRWVCLSGKHEDLIKTDHAAMECIDVNRRGQDLDNYNW IRDAEKNQLVVGTQARILYQDAVGRMNIALRFNEMVRRGEVGPIMLGRDHHDVSGTDSPF RETSNIKDGSNVMADMAVQCFAGNCARGMSLVALHNGGGVGIGKAVNGGFGMVCDGSERV DEILRSAMLWDVMGGVARRSWARNPHAMETSEAFNDSHARDYQITMPYVADEELIKKIVP YIVGK >gi|222159329|gb|ACAB01000030.1| GENE 86 91741 - 92643 862 300 aa, chain + ## HITS:1 
COG:SPy2083 KEGG:ns NR:ns ## COG: SPy2083 COG3643 # Protein_GI_number: 15675841 # Func_class: E Amino acid transport and metabolism # Function: Glutamate formiminotransferase # Organism: Streptococcus pyogenes M1 GAS # 5 299 3 298 299 332 53.0 6e-91 MSWNKIIECVPNFSEGRDLEKIDQIVAPFRIKAGVKLLDYSNDEDHNRLVVTLVGEPEAL YEAIVEAVGVAVRLIDLNQHTGQHPRMGAVDVIPFIPIKNTSMEEAIELSKKVAAKVAEI YHLPVFLYEKSATASHRENLASVRKGEFEGMAEKIKLPEWQPDFGPAERHPTAGTVAIGA RMPLVAYNINLSTDNLEIATKIAKNIRHINGGLRYVKAMGVELKERNITQVSINMTDYTR TALYRAFELVRIEARRYGVTIVGSEIIGLVPMEALIDTASYYLGLENFSMQQVLEARIME >gi|222159329|gb|ACAB01000030.1| GENE 87 92781 - 94034 932 417 aa, chain + ## HITS:1 COG:BH1984 KEGG:ns NR:ns ## COG: BH1984 COG1228 # Protein_GI_number: 15614547 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Imidazolonepropionase and related amidohydrolases # Organism: Bacillus halodurans # 21 411 24 409 426 335 45.0 1e-91 MSENLIIFNARIVTPIGFSARKGEEMSQLQVIENGTVEVTKGIITYVGENRGEDRDGYYQ HYWHYNARGHCLLPGFVDSHTHFVFGGERSEEFSWRLNGESYMSIMERGGGIASTVKATR QMNFLKLRSAAEGFLKRMSTMGVTTVEGKSGYGLDRETELLQLKVMRSLNNTEHKRVDIV STFLGAHALPEEYAGRSDEYIDFLIREMLPLVRDGELAECCDVFCEQGVFSIEQSRRLLQ AAKEHGFLLKLHADEIVSLGGAELAAELGALSADHLLHASDAGIRAMADAGVVATLLPLT AFALKEPYARGREMIDSGCAVALATDLNPGSCFSGSIPLTIALACIYMKMSIEETITALT LNGAAALQRADRIGSIEVGKQGDFIVLNSNNYHILPYYIGMNCVIMTIKGGMLYPVA >gi|222159329|gb|ACAB01000030.1| GENE 88 94077 - 94706 670 209 aa, chain + ## HITS:1 COG:FN0739 KEGG:ns NR:ns ## COG: FN0739 COG3404 # Protein_GI_number: 19704074 # Func_class: E Amino acid transport and metabolism # Function: Methenyl tetrahydrofolate cyclohydrolase # Organism: Fusobacterium nucleatum # 2 207 3 212 212 134 40.0 2e-31 MLADLTVKDFLDKVACSDPVPGGGSIAALDGALASALSTMVARLTVGKKGYEASEEVMQH AQTITLRLLDEFIALIDKDSAAYNEVFACFKLPKTTDEENAARSAAIQEATKQAALVPLE VARKALDMMSVIADVARLGNRNAVTDACVAMMSARSAVLGALLNVRINLGSLKDRDFVLQ LQTEADTIEQTACRREKELLDAVNQDLRV >gi|222159329|gb|ACAB01000030.1| GENE 89 94703 - 96202 1403 499 aa, chain + ## HITS:1 COG:FN1406 
KEGG:ns NR:ns ## COG: FN1406 COG2986 # Protein_GI_number: 19704738 # Func_class: E Amino acid transport and metabolism # Function: Histidine ammonia-lyase # Organism: Fusobacterium nucleatum # 3 493 2 495 511 411 45.0 1e-114 MSKNVYHIGSGALTFEIIERIINENLKLELAPEAKLRIQKCRDYLDHKIASSEEPLYGIT TGFGSLCTKNISPDELGTLQENLIKSHACSVGEEIRPVIIKLMMLLKAHALSLGHSGVQV ITVQRILDFFNNDVMPIVYDRGSLGASGDLAPLANLFLPLIGVGDVYYKGKKCEAISVLD EFGWEPVKLMSKEGLALLNGTQFMSANGVFAMLKAFRLSKKADLIAALSLEAFDGRIDPF MDCIQQIRPHQGQIETGEAFRKLLAGSELIERHKEHVQDPYSFRCIPQVHGATKDAIRYV ASVLLTEINSVTDNPTIFPDEDRIISGGNFHGQPLAISYDFLAIALAELGNISERRVSQL IMGLRGLPEFLVANPGLNSGFMIPQYAAASMVSQNKMYCYAASSDSIVSSNGQEDHVSMG ANAATKLYKVMDNLEHILAIELMNAAQGIDFRRPLKTSPVLERFLHAYRKEVPFVKDDIV MYKEIHKTVAFLKRTQIDY >gi|222159329|gb|ACAB01000030.1| GENE 90 96405 - 96644 68 79 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPFEVKASVAAVMIRSLNSTRWEASFPCSAIIRFSILLKCSFRFLFISKGKTEQRYIENS INKSSFLLFIMHLGIIIIQ >gi|222159329|gb|ACAB01000030.1| GENE 91 96550 - 97200 700 216 aa, chain + ## HITS:1 COG:no KEGG:BT_2689 NR:ns ## KEGG: BT_2689 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 214 2 212 215 338 87.0 8e-92 MMAEHGKDASQRVELRERIITAATEAFTSKGIKSITMDDIAAALGISKRTLYEVFSDKES LLKECILKAQADRDKYLQKIFEQSHNVLEVILAVFQKSIEVFHQTNKRFFEDIKKYPKVY EMMKNRQDSDSQKTMSFFKTGVEQGIFRPDVNFAIVNLLVREQFDVLLNTDICNEYPFIE VYESIMFTYIRGISTEKGAKVLEDFIQEYRKNRIAD >gi|222159329|gb|ACAB01000030.1| GENE 92 97256 - 98602 1327 448 aa, chain + ## HITS:1 COG:PA4144 KEGG:ns NR:ns ## COG: PA4144 COG1538 # Protein_GI_number: 15599339 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Pseudomonas aeruginosa # 39 447 56 464 471 72 19.0 2e-12 MNKMKRLAGKKILLVAVALCAFGFAKAQDTQAEKNILTLTLDKALEIALDENPTIKVAEE EVALKKVASKEAWQSLLPEASIAGSVDHTIKAAEMKLNDNVFKMGQDGSNTVNAGLSINL PLFVPGVYRAMSMTKTDIELAVEKSRASKLDLVNQVSKAYYQLMLAQDSYEVLQGSYKLA 
EDNYNVVNAKYQQGTVSEYDKISAEVQMRSIKPNLISAANAVTLAKLQLKVLMGITADVE IKINDSLTNYETAMFANQLREENVNLDNNTTMKQLDLNMKLLEKNVKSLKTNFMPTLSMS FSYQYQSLYNPNINFFNYNWSNSSSLMFNLSIPLYKASNFTKVKSARIQMRQLDWNRIDT ERQLNMQIVSCRNNMSASTEQVVSNKENVMQAKKAVVIAEKRYDVGKGTVLELNSSQVSL TQAQLTYNQSIYDYLVAKADLDQVLGKE >gi|222159329|gb|ACAB01000030.1| GENE 93 98636 - 99673 1127 345 aa, chain + ## HITS:1 COG:VC0165 KEGG:ns NR:ns ## COG: VC0165 COG0845 # Protein_GI_number: 15640195 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 1 343 1 363 368 111 27.0 3e-24 MKRSIQLVALLLTVFMGSCTGGKDKAAAEHVDAKPIVKLADVKARPVDQIQDYTATVEAE VKNNIAPSSPVRIDQIFVEVGDRVSKGQKLVQMDAANLKQTKLQLDNQEIEFNRIDELYK VGGASKSEWDASKMQLDVKKTAYKNLLENTSLQSPINGVVTARNYDNGDMYSGGEPVLVV EQITPVKLLINVSETYFTKVKKGTPVDVKLDVYGDEVFTGTINLIYPTIDATTRTFQVEI RLDNKDQRVRPGMFARATLNFGTAENVVVPDLAIVKQAGSGDRYVFVYKDGKVSYNKVEL GRRMGTEYELKSGVPDNSQVVVAGQSRLVNGMEVEVEASSQPSSK >gi|222159329|gb|ACAB01000030.1| GENE 94 99788 - 102946 2963 1052 aa, chain + ## HITS:1 COG:BH3816 KEGG:ns NR:ns ## COG: BH3816 COG0841 # Protein_GI_number: 15616378 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Bacillus halodurans # 7 1020 5 1005 1093 493 32.0 1e-139 MSLYEGAVKKPIMTSLCFLAVVIFGLFSLSKLPIDLYPDIDTNTIMVMTAYPGASASDIE NNVTRPLENTLNAVSNLKHITSRSSENMSLITLEFEFGNDIDVLTNDVRDKLDMVSSQLP DDVENPIIFKFSTDMIPIVLLSVQANESQSALYKILDDRVVNPLARIPGVGTVSISGAPQ REIQVYCDPNKLEAYHLTIETISSIIGAENKNIPGGNFDIGSETYALRVEGEFDDSRQLA DVVVGTHNGANVFLRDVARIVDTVEERAQETYNNGVQGAMIVVQKQSGANSVEISKKVAD ALPRLQKNLPSDVKIGVIVDTSDNILNTIDSLTETVVYALLFVVIVVFLFLGRWRATLII CITIPLSLIASFIYLAISGNTINIISLSSLSIAIGMVVDDAIVVLENVTTHIERGSDPKQ AAVHGTNEVAISVIASTLTMIAVFFPLTMVSGMSGVLFKQLGWMMCAIMFISTVAALSLT PMLCSQLLRLQKKPSKMFKLFFTPIEKALDGLDTWYAKMLNWAVRHRPIVIVGCIAFFIV SLLCAKGIGTEFFPAQDNARIAVQLELPIGTRKEIAQELSEKLTNQWLTKYKDIMKVCNY TVGQADSDNTWASMQDNGSHIISFNISLVDPGDRDITLEAVCDEMREDLKAYPEFSKAQV ILGGSNTGMSAQASADFEVYGYDMTMTDSVAARLKRELLKVKGVTEVNISRSDYQPEYQV 
DFDREKLAMHGLNLSTAGNYLRNRVNGAVASKYREDGDEYDIKVRYAPEFRTSLESLENI LIYNAQGQSVRVKDVGKVVERFAPPTIERKDRERIVTVSAVISGAPLGDVVAAGNKVIDK MDLPGDVTIQISGSYEDQQDSFRDLGTLAILIVVLVFIVMAAQFESLTYPFIIMFSLPFA FSGVLMALFFTNSTLSVMSLLGGIMLIGIVVKNGIVLIDYITLCRERGLAVLNSVVTAGK SRLRPVLMTTATTVLGMIPMAIGGGQGSEMWSPMAIAVIGGLTISTVLTLILIPTLYCVF AGTGIKNRRRKLRRQRELDVYFQANKDSIIKK >gi|222159329|gb|ACAB01000030.1| GENE 95 102961 - 103254 351 97 aa, chain + ## HITS:1 COG:no KEGG:BT_2685 NR:ns ## KEGG: BT_2685 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 97 1 97 97 198 95.0 6e-50 MKSVLITFDQAYYERIMALLDRLGCRGFTYLEKVQGRGSKTGDPHFGSHAWPSMCSAILT VVDDSKVDPLLDTLHIMDLETEQLGLRAFVWNIERTI >gi|222159329|gb|ACAB01000030.1| GENE 96 103478 - 104869 1271 463 aa, chain + ## HITS:1 COG:no KEGG:BT_2683 NR:ns ## KEGG: BT_2683 # Name: not_defined # Def: putative periplasmic protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 463 1 464 464 863 88.0 0 MEIIKNYLKYSLWFVLIVFAVLLGLHWLPALTIDGHTMRRVDLLSDLRYPESETAAADSD SIPLPPVVKPAFVDTCRTGMTCIEDYSDSTLRGMTPFYKALDRVSSDDSDDKQVRIAVFG DSFIEADIFTADLREMLQKQFGGCGVGFVTITSMTSGYRPTVRHTFGGWSSHAVTDSVYF DKKKQGISGHYFVPRNGAYVELRGQNKYASLLDTCQRASIFFHNRDSVLLSARVNKGENK NYSLGPSDGLQQIQVDGRIGSIRWTVDRADSTLFYGLAMDGKKGIILDNFSLRGSSGLSL RGIPPQMLKQFNRQRPYDLIILEYGLNVATERGRNYDNYQKGLLTAIEHLKECFPQAGIL LLSVGDRDSKNENGELRTMPGVKNLIRYQQNIAAESGIAFWNMFEAMGGEGSMAKLVHAK PSMANYDYTHINFRGGKHLAGLLYETMIYGKEQYDRRRAYEQE >gi|222159329|gb|ACAB01000030.1| GENE 97 104823 - 105773 608 316 aa, chain + ## HITS:1 COG:no KEGG:BT_2682 NR:ns ## KEGG: BT_2682 # Name: not_defined # Def: putative periplasmic protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 34 316 6 288 288 531 88.0 1e-149 MVRNNTTGGVLMSKNNLQSVFILILAGMLSVPCFLAEAIAQDRIPACPPLGKTAKLIKPL REMNWANDTITVQISFPSAFRETGRNEIIDSIALLAPVFEHLRQVRAGLSEDTVRIVHIG DSHVRGHIYPQTIGARLTETFGAVSYIDKGVNGATCLTFTHPDRIAEIAALKPELLILSF GTNESHNRRYNINVHYNQMDELVKLLHDSLPNVPILLTTPPGSYESFRQRRRRRTYAINP 
RTATAAETIRCYAKDHRLLVWDMYDVVGGKRRACTNWTDAKLMRPDHVHYLPEGYILQGN LLYQALIKAYNDYVSH >gi|222159329|gb|ACAB01000030.1| GENE 98 105760 - 107247 889 495 aa, chain + ## HITS:1 COG:PA3548 KEGG:ns NR:ns ## COG: PA3548 COG1696 # Protein_GI_number: 15598744 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane protein involved in D-alanine export # Organism: Pseudomonas aeruginosa # 40 418 19 390 520 248 40.0 1e-65 MFPIDIDFSRLKEVLTYDPQAPMIFSSGIFLWLFAAFMVVYVLLQRKYTARIMFVTLFSY YFYYKSSGTYFFLLAIVTVADFFLAQLMDRVEGYWKRKGLVALSLGVNLGLLAYFKYTNF LGGVIASLMGGEFTALDIFLPVGISFFTFQSLSYTIDVYRRDIKPLTNLLDYAFYVSFFP QLVAGPIVRARDFIPQIRKPLFVSQEMFGRGIFLIVSGLFKKAIISDYISINFVERIFDN PTLYSGVENLMGVYGYALQIYCDFSGYSDMAIGIALLLGFHFNLNFNSPYKSASITEFWR RWHISLSSWLKDYLYISLGGNRKGKFRQYLNLIITMFLGGLWHGASWNFVLWGTFHGVAL ALHKMWMSITGRKKGEESHGWRRVFGVIITFHFVCFCWIFFRNADFQNSMDMLGQIFTTF RPQLFPQLLEGYWKVFALMLLGFLLHFAPDSWENAACRGVIRLPFLGKAVLMVALIYLVI QMKSSEIQPFIYFQF >gi|222159329|gb|ACAB01000030.1| GENE 99 107494 - 110649 1180 1051 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294644103|ref|ZP_06721880.1| ## NR: gi|294644103|ref|ZP_06721880.1| putative lipoprotein [Bacteroides ovatus SD CC 2a] # 1 1051 1 1051 1051 1987 100.0 0 MKNKTTNILIVLTIMALTSCVREELLPSSGNGEEGERVKVELSFKIPPASSPQELQSRSM STDNAFSVEFFKETSPAKTRSGEVTELYNLWLFQFHSDGSIHGSPQQVSDEVTMVNDMAL LNVTLRVGTDQTLYLVALGKKVTVAANLAEIRNINELENLQLDYVDNRNGLYYSRITMES EIPYAGAAKGVSVRKTGDSDNGEIVYGTPDGFSGGIEVKRLVSRITLKHKFDVTENTLEG MRLLKVPIKLCINPASADKDVIDNIPMADLESDNAFEDADKDTDGFFTSQWYVAQNKQGT VADIASERDRYRKVTDGTGKAPEAGTNIEAWSYAKSSRSLYTVHQIYVGNNNTDNFDVEI NSYYDLRTIINSVDLNDGRIRSYTAEQKVYLSSSTRSSSGNGTGDIIGATGVYFDAHYGW RPVIIYAQGRKVTVGIYKDARCTQLVNMNSPTENWLQLSSYPNYTEVVRNGGANTLTNQI ETNIFVPTRFKLYLYADEYITDENGMIVDPEFDRNKIYKAVTNYTTSDFVYERTLYVQVT TEEINAEGIPKSKKGTYTIKQKRGHYAGLFGGEIKDGQYTEGLIIDAFDEHKSRIDDQIT PANYIYSGYYNVMTFYTWNKDKDAETNYCSFMNGKQATFNLATNPGNYVVNNGRESIIPT MRKINGNIDLYQYNYYNSFHARFCHDLNRDKNGNGVIDYLPDDPDNNELEWYLPSAYQAF GILASAGTIIYNSPMALALELDASTAPGWSIELAGWNSSNTNKNSSKEVRCVRNIPVSPE 
RKAGTKVTTYTDGADSYAMIDLSTLPYGILDRTTAEGKAELYEKLDLYSYSTTGTEYDES VSQSDASKPLGKTVFRVRKNYPITGSTITNGILVSSFSSPKFIVSPTDVYNDGDTKSNTA SDILQNAAGTPGKNSITMTWAEANGRLNTANTQSWDISSKAMNTGCYAYKGKSGTDEPGS WRVPNSRELSMILIFAHELEKKSSETGFNALNKVKTETYYNSGYWSSTERSSTSRPKEAL CGSIDVNYGLTIDASTKSSRRNRLRCIKDIP >gi|222159329|gb|ACAB01000030.1| GENE 100 110652 - 113129 1647 825 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237717447|ref|ZP_04547928.1| ## NR: gi|237717447|ref|ZP_04547928.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 825 1 825 825 1595 100.0 0 MKMRIRYRILALAAAALLSGCAKESAVMNLPEAEKVTVQIKTKALTGSVVEGVTIEKVRF MLFQNETLLNNYTTSNGTLNIDDSGLFNLAVSPGVYYFAAIINETPELTEKLNAVTKRTD IDGAGFDWQSSALTSGNMPLANLGEIKIQPGTTSGQGMAQLRPVYNSEGGRYKGDFTTAS TTATIDMPRVMSQVSVFLRQDDAVTEEIKISNVQVSNIPRNSYLVDRLDADNTKQDLFNG SEMTISASEGDKEYNGKLYHTFPTAIIAEKSFYENGTDDGTKATDPDNAAYLFINATYGG VPTTYKILLREGDANFRLLRNTNYNVYATIKQIGSKGIYVVIEPVKLYNITVNWKPVEGL VIVSDREADFNKNVNVWSDYTAYSGILKVYKGDAYHDALFKYGSLIATGNDATTTIEQGF TAPTAVEATNDVIWYPGNFNVTGIGSWDDVPYITDASNIPSGNTPELVAQGKGDPCRLAA LSPHQIGVEGKVDNQQWHMANPTEYAILMKAANGAESENNNGYCSFHELLIPNVKYRNES GVLVSSHYYKGNYWSTESNKAFSFDSQNPTAAVLGAAAPGQGYTVRCVRNNIPAARITIN PPTSVNYKGAAINGLPFYVDSNVPYWKMELIKSGDHVGTSMNFDDFSFTPLATGVMHEIE GSYTQTPKAYIARRESRTEDRTFGVKFTSMHFTGEETTYYFTITQNRYSIRGIPSINGLG TDNRIKGEGGKYTIHIELTPDDVAMPIGAQLKVRYTYLSAPRGEESTVATITDGNQHEYD VELDILANDTPDVIGLSFVVYMKEKDESGYRDISSSSYNYYQNTK >gi|222159329|gb|ACAB01000030.1| GENE 101 113133 - 114275 770 380 aa, chain - ## HITS:1 COG:no KEGG:BF1976 NR:ns ## KEGG: BF1976 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 16 378 18 316 318 73 23.0 2e-11 MNKTYFIGILLITASLFAGCTKENTEDCFSGLRLDFYFTHHTGNGNLFGEKVRRIHVYLF DTEGILQLHATDNGDKLEYSYLQNGQIQTVSKPNPWGKLSDEYVMNLDKVPPGKYRIISW ADNGQKDNTTYFHGEMKNPASHNLRKDVTLGATMADDLYMFLKCRTASGLPEELVPEAEE INDLWYGAAGSRHPQSDMYTYETVEVKNSVVTARRIELIRNTNLLKVTLSGLEHLTPEVR DTHAANRDTDFKLWAVATDERYKSDNTFDGYTRSVRYSPFKTLMDANTLSADIKVLRLTM 
DRPVLLYIETPGGRRIPEQPIDIVSMLFKARNPDTGAYIYQSQSDFDRIYEHPVEVRIGA NLNIRIFIGEWEIVNVKPAE >gi|222159329|gb|ACAB01000030.1| GENE 102 114297 - 115685 932 462 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237717449|ref|ZP_04547930.1| ## NR: gi|237717449|ref|ZP_04547930.1| predicted protein [Bacteroides sp. D1] # 1 462 1 462 462 903 100.0 0 MRKVFFVGLIAILTGCSQESRVSSPEDSGIELQINLSLGSVSTRAVDAPITGNDGEGYVL PFTEVKTLKVDLYKSLEAAPIFTYTATDAEIAGIRNTATGKLARLSIPKIPVTTQYVKVT INRFKEENPLINQLQVSSQDDPPAPTMNRTEIPYEGVTKEIIVIPEESDPFSVKVKAIVE VAPVLSRFEIIPGEIVVTNPATTGASFDWTDGTAGRDKIKNFTEADITVAEAGARENFRK KYGTDATAAAIYSYRVRLIRSPFPDGFDINSTPTGFYFYMNYFKPSLNAATIIKNTNDGI SDWNMTTGFAEYKKGGQQSNMYDARTSETTKVNAFHLFPQSVAADATLQQIKDGMPHLIM GFSTDKIKRWLTVRAFSDKSNNEIINAFKPGYCYSLDLDDVVITPWSLGLDVRITDGDKV TESDPIIKDFDQTSHVPEPEGADLALGLRVSKWRDKDLEVEL >gi|222159329|gb|ACAB01000030.1| GENE 103 115853 - 117202 1267 449 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237717450|ref|ZP_04547931.1| ## NR: gi|237717450|ref|ZP_04547931.1| predicted protein [Bacteroides sp. 
D1] # 1 449 1 449 449 794 100.0 0 MKKIIFMALFAATLSACSNENNEVTQETTNLKSLDFSFDVATPKTRMSDDEVTTAITAVR NNIQNITIEYFNASNASLGTYDFTPEQVATAKGDDQTATAAGRKPVNISNIPSATTKVNV YMNVDKNAADINDLQIGYTKMEYRGTPKDITLKTPEGERIGDNTNDLYSVEVSVAPVLSR FEFHGKASDIKVNSSASGILPSGVADGTKDQAKAHVSDATITAAEKAAQEAWRKANPGTA DPSTWSFKYTVAYAYDDAYTIESIDEYYMNNIPLTKDGDLVLNANDGAGNWNDAAKEKYK IGGDMEKMFDTTVEADKVIAYNLFPQTATNGTGASTTDVKTSMPHFVLKLTTKANGATAP RWLTIRALKAADVLITSFDAGKVYVLNATDININQYSAKLKVTTNNDIPGPEIPEPVDPT DPNPEPVGKDLDVLVKITEWTIVNVKPEW >gi|222159329|gb|ACAB01000030.1| GENE 104 117397 - 118902 1124 501 aa, chain - ## HITS:1 COG:no KEGG:BT_1064 NR:ns ## KEGG: BT_1064 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 118 498 3 378 392 380 51.0 1e-104 MKRQYMLFVIVLLFCTNTSAQQKEYSGKIHVTALSLQQEGDSVYVKLSFDISGVNVDSRR SISLIPTLVAAGDRLDLPEVVVKGRENYNVYRRETALMSNREKTAYAAEAPYAVVPGFKS GNAKTIAYSVAVKYNPWMADAKLDMYEDLCGCGNPARRIGITMLANRITLEKIIEPYEIT PSMAYVQPVVEPVKRREIISEAFLDFVVNKTDIRPDYMNNPQELKKVTDLVAEIKDDPDV TVRAINVIGYASPEGLLSNNQRLSEGRAKALTGYLASRFDYPRSLYQVAFGGENWDGLKE CVEASQMPYRKEVLEFIGSFPAECGYADQVRRKKTLMNLKGGEPYRYIIREFCPLLRKAI CKIDFDVRNFSIEQAKEVFKSRPQNLSLNEMFLVANTYEKGSQEFIDLFETAVKLYPDDV TANLNAAAAALSRKDTVYAKRYLNKIEKSLDIPEYHNTMGVLEMLCGNYDKAASHLNKAT ETGLPEAKQNLAEMTKKKQQR >gi|222159329|gb|ACAB01000030.1| GENE 105 118913 - 119500 439 195 aa, chain - ## HITS:1 COG:no KEGG:BF2182 NR:ns ## KEGG: BF2182 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 21 195 19 188 188 180 50.0 2e-44 MKIRSIFTLLILTLLVSYSRAQDIAIKTNLLYGGYTYTPNLSLEIGLGKRSTLDLGGGYN PWNLDGTAENNKKLVHWLGEIEYRYWLCQRFSGHFLGVHVLGSQYNIAQQDLPLIFGKGS KEYRFEGYGYGGGISYGYNFFLGIRWSIEANLGIGYARLHYDKYKCQKCGEKIGTESRNY FGPTKAGISLIYYLK >gi|222159329|gb|ACAB01000030.1| GENE 106 119939 - 121117 437 392 aa, chain - ## HITS:1 COG:no KEGG:BT_0036 NR:ns ## KEGG: BT_0036 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 392 1 401 403 303 41.0 5e-81 
MITIKVKFRESTIKKKKGTIYYLLTRNKEYREITTPYKIHSHEWNDKRSLIAISNAEYQR RYELQLIENSIIRDIRHLQHILTSENNIDTIIKLFKKIQGESLLSVFSKSVTNDLYIKNQ PRTAKSYIASVKSFLRFRNGVDISFEELTPKLISDYERYLKEQQICNNTISFYMRNLRAI YNRAVEESYTEQKHPFKKVFVGNDKTIKRAIDEDVISRLKTLDLSSKPRLAFSRDMFMFS FYARGMAFIDLAYLTKENIQGEYIIYRRHKTGQELSIKLEICLKTIIDRYSHYSNGTFLF PVLSESTQNYDSALRLHNIHLKKISGFMGLPKPLTSYVARHSWATLAKKKGISTQIISES MGHNSEKTTRIYLSSLDRSVIDDANAKLISGI >gi|222159329|gb|ACAB01000030.1| GENE 107 121340 - 122344 1069 334 aa, chain + ## HITS:1 COG:YPO2980 KEGG:ns NR:ns ## COG: YPO2980 COG0667 # Protein_GI_number: 16123161 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Yersinia pestis # 7 332 3 328 329 386 58.0 1e-107 MENSFSYQPAAGRYERMPYKYCGKSGLQLPLISLGLWHNFGSVDNFDVATDMIKYAFDHG VTHFDLANNYGPVPGSAETNFGRILRENFQGYRDEMIISSKAGHDMWAGPYGGNSSRKNL MASIDQSLRRTGLEYFDIFYSHRYDGVTPVEETIQTLIDIVKQGKALYIGISKYPPEQAR MAYEMMTKAGVPCLISQYRYSMFDREVEAETLPLAAEYGSGFIAFSPLAQGLLTDKYLNG IPEGSRAARSSGFLQLSQVTPEKVEAARQLNEIALRRGQTLAEMALAWVLKDERMTSVIV GASSVNQLADNLKALDHLEFSADELKEIEQILPE >gi|222159329|gb|ACAB01000030.1| GENE 108 122642 - 123787 394 381 aa, chain + ## HITS:1 COG:no KEGG:D11S_1182 NR:ns ## KEGG: D11S_1182 # Name: not_defined # Def: hypothetical protein # Organism: A.actinomycetemcomitans # Pathway: not_defined # 92 376 22 313 316 91 26.0 5e-17 MKQLTVVFALLCIFACSSEDIVSTVEESVKPANEVAIQTVNLLDSLEIMDYIYSIIGPMA KTRSNLPGGGYILPDNWDSQLTDEQRELIRNSFPRPLFSAERNALRKSFPLVEYSNTVVM NYPSSGYNCFAYSLGFNNKWIEFSTWDQVRYGYENASSVYHAAYDYMKGATSISRYYPVV WGWGNTPLHASLGGSPHCEAPYSKMGRMWLLWHLVSVFSNGMYGVPVETYGAVSPTRSLS EIDANAMKEISEDIHENIIFSPDELMMIARKVKTCRDSSRFESLFNEWKEAWHYSLSNNT ATTRNLPQYADLKAMGKEIIPLLIEKMVTEEDNFFAIRLYEDLQDNPNLIIRYANDDPHQ LEGLQQTTKKTIKKWLEYNSN Prediction of potential genes in microbial genomes Time: Wed May 18 01:46:19 2011 Seq name: gi|222159328|gb|ACAB01000031.1| Bacteroides sp. 
D1 cont1.31, whole genome shotgun sequence Length of sequence - 148025 bp Number of predicted genes - 103, with homology - 102 Number of transcription units - 47, operones - 29 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 29 - 88 2.5 1 1 Op 1 . + CDS 163 - 465 390 ## BT_4733 hypothetical protein + Term 501 - 544 6.1 + Prom 467 - 526 2.4 2 1 Op 2 . + CDS 563 - 856 251 ## COG4680 Uncharacterized protein conserved in bacteria 3 1 Op 3 . + CDS 861 - 1226 343 ## BF4068 hypothetical protein + Term 1248 - 1309 18.7 - Term 1397 - 1439 2.2 4 2 Op 1 . - CDS 1469 - 3586 1928 ## COG5545 Predicted P-loop ATPase and inactivated derivatives 5 2 Op 2 . - CDS 3674 - 3904 251 ## gi|237715798|ref|ZP_04546279.1| conserved hypothetical protein - Prom 4072 - 4131 6.4 + Prom 3927 - 3986 4.0 6 3 Op 1 . + CDS 4098 - 4601 563 ## BT_4735 hypothetical protein 7 3 Op 2 . + CDS 4622 - 4789 193 ## gi|160887574|ref|ZP_02068577.1| hypothetical protein BACOVA_05594 8 3 Op 3 . + CDS 4776 - 5219 480 ## COG3023 Negative regulator of beta-lactamase expression + Term 5229 - 5286 18.1 - Term 5170 - 5213 5.1 9 4 Op 1 11/0.000 - CDS 5247 - 5945 755 ## COG1180 Pyruvate-formate lyase-activating enzyme 10 4 Op 2 . - CDS 5991 - 8219 2676 ## COG1882 Pyruvate-formate lyase - Prom 8287 - 8346 4.2 + TRNA 8461 - 8548 48.9 # Ser TGA 0 0 + TRNA 8612 - 8699 58.0 # Ser GGA 0 0 - Term 8718 - 8757 -0.1 11 5 Op 1 1/0.167 - CDS 8787 - 9263 592 ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit 12 5 Op 2 1/0.167 - CDS 9283 - 11049 1610 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain 13 5 Op 3 . - CDS 11063 - 12970 1684 ## COG1894 NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit - Prom 13112 - 13171 5.0 + Prom 12957 - 13016 6.7 14 6 Tu 1 . + CDS 13133 - 15886 2077 ## BT_0126 hypothetical protein + Term 16025 - 16076 0.3 + Prom 16000 - 16059 6.7 15 7 Op 1 . + CDS 16115 - 16672 385 ## BF1765 hypothetical protein 16 7 Op 2 . 
+ CDS 16699 - 18732 1944 ## BF1827 hypothetical protein 17 7 Op 3 . + CDS 18757 - 21123 2156 ## BF1763 putative outer membrane protein + Term 21145 - 21204 15.2 + Prom 21232 - 21291 5.3 18 8 Op 1 . + CDS 21343 - 22713 1115 ## BF1089 hypothetical protein + Prom 22724 - 22783 2.8 19 8 Op 2 . + CDS 22850 - 23872 888 ## COG3746 Phosphate-selective porin + Term 24047 - 24093 9.0 + Prom 24129 - 24188 4.3 20 9 Tu 1 . + CDS 24209 - 25621 1007 ## BVU_0121 glycoside hydrolase family protein + Term 25728 - 25789 12.0 21 10 Tu 1 . + CDS 26069 - 27982 1265 ## BT_0127 putative transmembrane protein 22 11 Tu 1 . + CDS 28036 - 32118 2964 ## COG0642 Signal transduction histidine kinase + Term 32139 - 32193 -0.9 + Prom 32123 - 32182 5.3 23 12 Tu 1 . + CDS 32233 - 32607 290 ## BT_0128 hypothetical protein + Term 32654 - 32696 4.2 + Prom 32766 - 32825 5.8 24 13 Op 1 . + CDS 32845 - 33375 448 ## BT_0139 RNA polymerase ECF-type sigma factor 25 13 Op 2 . + CDS 33425 - 36673 2803 ## BVU_0126 hypothetical protein 26 13 Op 3 . + CDS 36707 - 38503 1591 ## BVU_0125 hypothetical protein 27 13 Op 4 . + CDS 38540 - 39733 952 ## gi|160887598|ref|ZP_02068601.1| hypothetical protein BACOVA_05620 28 13 Op 5 . + CDS 39783 - 41822 1587 ## gi|237715824|ref|ZP_04546305.1| IPT/TIG domain-containing protein 29 13 Op 6 . + CDS 41884 - 43179 1139 ## Dd586_1768 exopolysaccharide inner membrane protein 30 14 Tu 1 . + CDS 43295 - 44701 888 ## BVU_0121 glycoside hydrolase family protein + Term 44722 - 44773 4.3 - Term 44709 - 44761 12.1 31 15 Op 1 . - CDS 44786 - 45631 523 ## gi|237715827|ref|ZP_04546308.1| predicted protein 32 15 Op 2 3/0.167 - CDS 45709 - 46896 1084 ## COG1312 D-mannonate dehydratase 33 15 Op 3 . - CDS 46926 - 47741 195 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 - Prom 47961 - 48020 5.4 + Prom 47752 - 47811 5.4 34 16 Op 1 . + CDS 47862 - 48743 607 ## COG2207 AraC-type DNA-binding domain-containing proteins 35 16 Op 2 . 
+ CDS 48813 - 50762 1744 ## BT_0132 alpha-glucosidase, putative + Term 50960 - 51025 14.6 36 17 Op 1 . - CDS 50980 - 51363 282 ## BT_0134 hypothetical protein 37 17 Op 2 . - CDS 51341 - 52123 781 ## BT_0135 hypothetical protein - Prom 52239 - 52298 7.1 + Prom 52198 - 52257 7.4 38 18 Tu 1 . + CDS 52277 - 53887 1035 ## BT_0136 hypothetical protein + Prom 53912 - 53971 3.3 39 19 Tu 1 . + CDS 53995 - 55923 1633 ## COG3533 Uncharacterized protein conserved in bacteria + Prom 55953 - 56012 7.3 40 20 Op 1 . + CDS 56036 - 57958 1224 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 41 20 Op 2 . + CDS 58003 - 60282 1639 ## Csac_2721 heparinase II/III family protein + Term 60320 - 60371 12.4 + Prom 60284 - 60343 8.0 42 21 Tu 1 . + CDS 60384 - 63155 1137 ## COG3292 Predicted periplasmic ligand-binding sensor domain + Term 63158 - 63203 2.9 + Prom 63257 - 63316 6.8 43 22 Op 1 . + CDS 63342 - 63857 437 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 44 22 Op 2 . + CDS 63878 - 67111 2993 ## BT_0140 hypothetical protein 45 22 Op 3 . + CDS 67139 - 68860 1325 ## BT_0141 hypothetical protein 46 22 Op 4 . + CDS 68895 - 70361 1196 ## COG5492 Bacterial surface proteins containing Ig-like domains 47 22 Op 5 . + CDS 70412 - 71677 1202 ## Dd586_1768 exopolysaccharide inner membrane protein + Term 71702 - 71752 14.6 + Prom 71680 - 71739 3.1 48 23 Tu 1 . + CDS 71775 - 74930 2058 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 74991 - 75043 12.2 + Prom 74986 - 75045 5.3 49 24 Tu 1 . + CDS 75214 - 77946 1829 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 78142 - 78197 -0.9 + Prom 78451 - 78510 5.9 50 25 Tu 1 . + CDS 78637 - 79152 624 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Prom 79206 - 79265 1.8 51 26 Op 1 . + CDS 79285 - 82521 3003 ## BT_0140 hypothetical protein 52 26 Op 2 . + CDS 82525 - 84258 1470 ## BT_0141 hypothetical protein 53 26 Op 3 . 
+ CDS 84290 - 85828 1239 ## COG5492 Bacterial surface proteins containing Ig-like domains 54 26 Op 4 . + CDS 85834 - 87237 1219 ## BT_0143 putative transmembrane protein + Term 87304 - 87348 12.1 + Prom 87351 - 87410 5.3 55 27 Op 1 . + CDS 87487 - 89136 1511 ## COG3507 Beta-xylosidase 56 27 Op 2 . + CDS 89203 - 90423 1154 ## BT_0146 unsaturated glucuronyl hydrolase + Term 90519 - 90571 14.1 - Term 90512 - 90554 8.5 57 28 Op 1 . - CDS 90585 - 91058 527 ## BT_0147 hypothetical protein 58 28 Op 2 . - CDS 91116 - 92408 1427 ## COG1253 Hemolysins and related proteins containing CBS domains - Prom 92429 - 92488 7.3 59 29 Op 1 . - CDS 92660 - 93445 984 ## COG0501 Zn-dependent protease with chaperone function 60 29 Op 2 . - CDS 93479 - 95725 2061 ## BT_0150 putative ferric aerobactin receptor - Prom 95897 - 95956 7.3 + Prom 95959 - 96018 6.1 61 30 Op 1 . + CDS 96039 - 96473 377 ## BT_0151 hypothetical protein 62 30 Op 2 . + CDS 96500 - 97318 698 ## COG0627 Predicted esterase + Term 97396 - 97444 8.1 - Term 97640 - 97680 -0.8 63 31 Op 1 . - CDS 97726 - 98520 424 ## BF1404 hypothetical protein 64 31 Op 2 . - CDS 98527 - 100002 852 ## BT_0154 putative periplasmic protease - Prom 100144 - 100203 3.6 + Prom 99988 - 100047 4.9 65 32 Op 1 . + CDS 100186 - 101253 697 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain + Prom 101255 - 101314 4.8 66 32 Op 2 . + CDS 101342 - 101596 365 ## gi|237715862|ref|ZP_04546343.1| predicted protein 67 32 Op 3 . + CDS 101586 - 101927 259 ## BVU_0489 hypothetical protein + Term 102090 - 102137 -0.2 68 33 Tu 1 . - CDS 101998 - 102657 532 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain - Prom 102682 - 102741 3.3 - Term 102667 - 102720 8.4 69 34 Op 1 . - CDS 102746 - 103336 657 ## BT_0157 hypothetical protein 70 34 Op 2 . - CDS 103403 - 104830 996 ## COG1757 Na+/H+ antiporter - Prom 104851 - 104910 5.0 71 35 Op 1 . 
- CDS 104985 - 105419 304 ## BT_0160 hypothetical protein 72 35 Op 2 . - CDS 105466 - 106641 852 ## COG0477 Permeases of the major facilitator superfamily 73 35 Op 3 7/0.000 - CDS 106668 - 109109 1440 ## COG4953 Membrane carboxypeptidase/penicillin-binding protein PbpC - Prom 109156 - 109215 3.4 74 35 Op 4 . - CDS 109259 - 114889 4182 ## COG2373 Large extracellular alpha-helical protein - Prom 114945 - 115004 7.1 + Prom 114858 - 114917 7.3 75 36 Tu 1 . + CDS 114990 - 115391 257 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 + Term 115556 - 115591 3.1 76 37 Tu 1 . - CDS 115537 - 115899 410 ## COG4828 Predicted membrane protein - Prom 116000 - 116059 2.6 77 38 Tu 1 . - CDS 116179 - 116337 163 ## - Prom 116453 - 116512 7.7 - Term 116498 - 116536 1.1 78 39 Tu 1 . - CDS 116557 - 116721 81 ## BT_0173 hypothetical protein - Prom 116774 - 116833 4.1 - Term 116838 - 116872 -1.0 79 40 Op 1 . - CDS 116873 - 117211 221 ## BT_0167 hypothetical protein 80 40 Op 2 . - CDS 117204 - 117503 268 ## BT_0168 hypothetical protein - Prom 117750 - 117809 7.1 - Term 117741 - 117793 8.0 81 41 Op 1 . - CDS 117811 - 118821 1281 ## BF3207 hypothetical protein 82 41 Op 2 . - CDS 118841 - 119542 818 ## COG0822 NifU homolog involved in Fe-S cluster formation - Prom 119634 - 119693 3.8 + Prom 119518 - 119577 6.3 83 42 Tu 1 . + CDS 119781 - 120185 321 ## BT_0175 hypothetical protein + Term 120430 - 120491 1.5 + Prom 120529 - 120588 3.0 84 43 Op 1 . + CDS 120622 - 121122 600 ## BT_0176 hypothetical protein + Term 121125 - 121164 1.2 85 43 Op 2 . + CDS 121199 - 121969 521 ## BT_0177 hypothetical protein 86 43 Op 3 . + CDS 122002 - 122547 437 ## BT_0178 hypothetical protein 87 43 Op 4 . + CDS 122557 - 123009 440 ## COG0691 tmRNA-binding protein 88 43 Op 5 . + CDS 123098 - 125845 2735 ## COG1410 Methionine synthase I, cobalamin-binding domain 89 43 Op 6 . + CDS 125857 - 126459 596 ## COG0778 Nitroreductase 90 43 Op 7 . 
+ CDS 126528 - 126653 121 ## gi|153807903|ref|ZP_01960571.1| hypothetical protein BACCAC_02189 91 43 Op 8 . + CDS 126650 - 128200 1428 ## COG0591 Na+/proline symporter + Term 128382 - 128419 -0.6 - Term 128036 - 128071 -0.6 92 44 Op 1 . - CDS 128266 - 129657 1373 ## COG4623 Predicted soluble lytic transglycosylase fused to an ABC-type amino acid-binding protein 93 44 Op 2 . - CDS 129702 - 130355 560 ## COG0572 Uridine kinase - Prom 130382 - 130441 3.8 + Prom 130278 - 130337 3.1 94 45 Op 1 . + CDS 130422 - 130721 433 ## BF3194 hypothetical protein 95 45 Op 2 . + CDS 130737 - 132815 1777 ## COG4232 Thiol:disulfide interchange protein + Term 132880 - 132914 1.1 - Term 132805 - 132855 1.0 96 46 Op 1 . - CDS 132933 - 135173 1376 ## COG1472 Beta-glucosidase-related glycosidases 97 46 Op 2 . - CDS 135181 - 137184 1218 ## Dfer_4710 xanthan lyase 98 46 Op 3 . - CDS 137181 - 139496 1516 ## COG1472 Beta-glucosidase-related glycosidases - Term 139503 - 139557 9.1 99 47 Op 1 . - CDS 139583 - 140797 954 ## gi|237715895|ref|ZP_04546376.1| predicted protein 100 47 Op 2 . - CDS 140812 - 142167 1116 ## XCC0740 hypothetical protein 101 47 Op 3 . - CDS 142172 - 143818 1122 ## Dfer_1699 secreted protein-putative xanthan lyase related 102 47 Op 4 . - CDS 143852 - 145438 1345 ## Fjoh_2078 RagB/SusD domain-containing protein 103 47 Op 5 . 
- CDS 145451 - 148024 1838 ## Fjoh_2077 TonB-dependent receptor Predicted protein(s) >gi|222159328|gb|ACAB01000031.1| GENE 1 163 - 465 390 100 aa, chain + ## HITS:1 COG:no KEGG:BT_4733 NR:ns ## KEGG: BT_4733 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 100 6 101 101 143 83.0 2e-33 MQINNHQIVDYDAVLDAKFGAEGTPERAEAEEKAYAFYTGQIIEDARKKAKITQAELARR IGSDRSYISRVESGQTEPKVSTFYRIMNALGCRIEFSMSL >gi|222159328|gb|ACAB01000031.1| GENE 2 563 - 856 251 97 aa, chain + ## HITS:1 COG:asl4856 KEGG:ns NR:ns ## COG: asl4856 COG4680 # Protein_GI_number: 17232348 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. PCC 7120 # 17 97 5 85 85 85 43.0 3e-17 MRIFTEQALKEYAEEHPDSKVALQEWVTIVKKSEWACFADIKKTFNSADSVGNQHYVFNV KGNNYRLVVVVKFTVKFVYIRFIGTHKEYDKIDCANI >gi|222159328|gb|ACAB01000031.1| GENE 3 861 - 1226 343 121 aa, chain + ## HITS:1 COG:no KEGG:BF4068 NR:ns ## KEGG: BF4068 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 121 1 121 121 154 66.0 7e-37 MTKIENQTQYEWAVKRVEDLLPLVKDDTPLNDPNSIELELLSNLVADYSEEHFSLGEPSL VDVLKLRMYEMGLNQKSLSQLIGVSPSRLSDYISGKCEPTLKVAREISKKLNIDASIVLG V >gi|222159328|gb|ACAB01000031.1| GENE 4 1469 - 3586 1928 705 aa, chain - ## HITS:1 COG:L109011 KEGG:ns NR:ns ## COG: L109011 COG5545 # Protein_GI_number: 15672499 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Lactococcus lactis # 462 634 210 385 480 65 25.0 3e-10 MDHFTIFQDFYRVIAEMTEEEIVSAISNGTYRETVEDVRRIFAEQGEKAANEKKIELPEI TFSANYRGGRSNATLVKYLGYIVVDIDHQTQEALARILALAKKCAHTRIAFISPKGMGLK IVVRVCRTDGTLPETIQEIEDFHHAAYHKIASFYAQLCGVEVDTSGQDVSRTCLFSYDPD IYFNPDATAVIVEQPPMFFKSQKKKRASGKKKKTTAPDNNPLTEQVALNYHSSHASLMVT LNYYHNKSEKYVTGNRNNYLHCLACMYNRYGVPQEEAAAFIKSQFTDLPADEMDALIGSA YGHNEEFDTRKLNSTQKRILQIEQHIKENYDTRYNEVLHIMEYRRRKTDTEQPEPFHILD EMMENSIWMEMNELGYSCTVKTIQNLIYSDFSITYHPIREYLDSLPEWDGTDYIGILANS 
VHTSHQKFWVECLERYLVGMCAAATQDDVVNHTVLLLCSEIQNIGKTTFINNLLPPELRA YLSTGLINPSNKDDLAKIAQAMLINLDEFEGMNGRELNIFKDLVTRKVISIRLPYARRSQ NFPHTASFAGTCNYQEVLHDTTGNRRFLCFHVNSMEFIKINYAQLYAQIKYLLNKPGYQY WFTQSENSRIEENNEDFIFHSPEEELVLTHIRKPERFEKVHYLTVTEIAELIRERTGYQY SHGTKAQLGKVMSKHGFEFHKGKNGRRYTVFIIDTEQVKSNRLYE >gi|222159328|gb|ACAB01000031.1| GENE 5 3674 - 3904 251 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237715798|ref|ZP_04546279.1| ## NR: gi|237715798|ref|ZP_04546279.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 76 1 76 76 136 100.0 4e-31 MYYPDQPFHIRTYKKSELAHLYNPNVCLKVALQILRRWIVYNLPLLQELEQEGYRARNRL LSPRQVATIIRYLGEP >gi|222159328|gb|ACAB01000031.1| GENE 6 4098 - 4601 563 167 aa, chain + ## HITS:1 COG:no KEGG:BT_4735 NR:ns ## KEGG: BT_4735 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 165 1 165 167 228 81.0 6e-59 MAMTVTYSVVPRKNPAKKDESAKYYAQAQASGELDFEELCEGITSRSTCTETDVRAAISG ILYEAKRALKAGRIVRLGDLGSLQIGLNSEGAVSVKEFSSSLITAAHIIFRPGKTLADIT KILSYQQVTTRAVAQTGGSGNEGEDDDKGSGGGSDGGGSGEAPDPAA >gi|222159328|gb|ACAB01000031.1| GENE 7 4622 - 4789 193 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160887574|ref|ZP_02068577.1| ## NR: gi|160887574|ref|ZP_02068577.1| hypothetical protein BACOVA_05594 [Bacteroides ovatus ATCC 8483] # 1 55 9 63 63 85 100.0 1e-15 MKEIVTKILDVIMFLVPFFGKRKRNRIVREVRFNATHKEVCNVKTTEREKDDEKD >gi|222159328|gb|ACAB01000031.1| GENE 8 4776 - 5219 480 147 aa, chain + ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 47 142 3 98 116 88 40.0 5e-18 MRKISLIVIHCSATRADRDFTAKDVDTAHRFRGFSCWGYHYYIRKSGQIEPMRDEDTVGA HARGFNAISLGVCYEGGLDENGKAADTRTSRQKEAMHRLVNELLQRYPDAKVVGHRDLSP DTNYNGIVDPWERIKECPCFEVKAETW >gi|222159328|gb|ACAB01000031.1| GENE 9 5247 - 5945 755 232 aa, chain - ## HITS:1 COG:VC1869 KEGG:ns NR:ns ## COG: VC1869 COG1180 # Protein_GI_number: 
15641871 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Vibrio cholerae # 2 229 14 244 246 197 42.0 1e-50 MGTFDGPGLRLVVFLQGCNFRCLYCANPDTIAGKGGTPTPPEEIVRMAMSQRPFFGKRGG VTFSGGEPTFQAKALVPLVRELKEKGIHVCIDSNGGIWNEEVEELFKLTDLVLLDIKEFN PARHQALTGRSNEQTIRTAAWLEENEKPFWLRYVLVPGYSDFEEDIRRLGEALGKYKMVQ RVEILPYHTLGVHKYEAMEQEYKLKDVKENTPEQLEKAAEVFKEYFTTVVVN >gi|222159328|gb|ACAB01000031.1| GENE 10 5991 - 8219 2676 742 aa, chain - ## HITS:1 COG:CAC0980 KEGG:ns NR:ns ## COG: CAC0980 COG1882 # Protein_GI_number: 15894267 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Clostridium acetobutylicum # 7 742 8 743 743 943 62.0 0 MELNKIFKDGLWSTEINVRDFVSHNITPYYGDASFLEGPTERTKAVWNRCLEALAEERAN NGVRSLDNVTVSTITSHKAGYIDKENELIVGLQTDELLKRAIKPFGGINVVSKACHENGV EVDDRVKDIFTHYRKTHNDGVFDVYTEEIRSFRSLGFLTGLPDNYARGRIIGDYRRMALY GIDRLIEAKKEDLRNLTGPMTEARIRLREEVAEQIKALKDMKVMGEYYGLDLSRPAYTAQ EAVQWVYMAYLAAVKEQDGAAMSLGNVSSFLDIYMEYELSKGTITESFAQELIDQFVIKL RMVRHLRMQSYNDIFAGDPTWVTESLGGRLNDGRTKVTKTSFRFLQTLYNLGPSPEPNLT VLWSPELPEGFKEFCAKVSIDTSSIQYENDDLMREVRQSDDYGIACCVSYQEIGKQIQFF GARCNLAKALLLAINGGRCENTGTVMVKNIPVLTSDTLNFEEVMSNYKKVLTEIARVYNE AMNIIHYMHDKYYYEKAQMALVDTNPRINLAYGVAGLSIALDSLSAIKYAKVTARRNDIG LTEGFDIQGEFPCFGNDNDKVDHLGVDLVYFFSEELKKLPVYKNARPTLSLLTITSNVMY GKKTGATPDGRAKGVAFAPGANPMHGRDKNGAIASLSSVAKLRYRDSQDGISNTFSIVPK SLGATEEDRVENLVTMMDGYFTKGAHHLNVNVLNREMLYDAMEHPEKYPQLTIRVSGYAV NFVKLSREHQLEVISRSFHERM >gi|222159328|gb|ACAB01000031.1| GENE 11 8787 - 9263 592 158 aa, chain - ## HITS:1 COG:TM0012 KEGG:ns NR:ns ## COG: TM0012 COG1905 # Protein_GI_number: 15642787 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # Organism: Thermotoga maritima # 4 156 16 171 176 143 45.0 1e-34 MSDIKLACDMTEQIKTICDKHGNKPGELINILHEAQHLQGYLPEETQRIIASKLGIPVSK VYGVVTFYTFFTMTPKGKHPISVCMGTACYVRGSEKLLEEFKRVLGIEVGDTTPDGKFSL 
DCLRCVGACGLAPVVMIGEKVYGRLQPVDVKKIIEELE >gi|222159328|gb|ACAB01000031.1| GENE 12 9283 - 11049 1610 588 aa, chain - ## HITS:1 COG:TM0201_2 KEGG:ns NR:ns ## COG: TM0201_2 COG4624 # Protein_GI_number: 15642974 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 222 586 5 364 372 380 53.0 1e-105 MEEKQITLQIDGHFITVPEGSTILEAAIKIGINIPTLCHIDLKGTCIKNNPASCRICVVE VMGRRNLAPACATRCTEGMVVKTSTLRVMNARKVVAELILSDHPNDCLTCPKCGNCELQT LALRFNIREMPFNGGELSPRKREITASIVRNMDKCIFCRRCESVCNDVQTVGALGAIRRG FNTTIAPAFDRMMTESECTYCGQCVAVCPVGALTERDYTNRLLDDLANPDKVVIVQTAPA VRAALGEEFGFPPGTLVTGKMVYALRELGFDYVFDTDFAADLTIMEEGSEILNRLTRYLN GDKSVRLPILTSCCPAWVNFFEHHFPDMLNIPSTARSPQQMFGSIAKSYWAEKMGIPREK LVVVSIMPCLAKKYECARDEFKVNGIPDVDYSISTRELARLIKRANIGFPLVLDSPFDNP MGESTGAGVIFGTTGGVMEAALRSVYEIYTGEPLKNVNFEQVRGLNGVRRATINLNGFEL KVGIAHGLGNARHLLEDIRNGHNEYHVIEIMACPGGCIGGGGQPLHHGNSEILYARANAL YREDANKPLRKSHENPYIKTLYEDYLGKPLGEKSETLLHTHYFNKAID >gi|222159328|gb|ACAB01000031.1| GENE 13 11063 - 12970 1684 635 aa, chain - ## HITS:1 COG:TM0010_1 KEGG:ns NR:ns ## COG: TM0010_1 COG1894 # Protein_GI_number: 15642785 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit # Organism: Thermotoga maritima # 46 566 8 527 527 684 61.0 0 MKILSIHDLATIRKRAEHNLSLREESNEKVTEKCYGLASGAQHLQILICGGTGCKASSSQ GITDNLLKAIKKNEITDKVEVITVGCFGFCEKGPIVKIIPDNTFYTQVTPEDAEEIINEH IIGGRRIERLLYVDPKTEHTVSDSKHMDFYRKQLRIALRNCGFIDPENIEEYIAREGYFA LADCLLNKQPTDVIDIIKRSGLRGRGGGGFPTGLKWEFANKQQSDVKYVVCNADEGDPGA FMDRSIMEGDPHSIVEAMCVCGYSIGSTKGLVYIRAEYPLAINRLKTAINQAREYGLLGD HILGTDFSFDIEIRYGAGAFVCGEETALIHSMEGKRGEPTLKPPFPAESGYLGKPTNVNN VETLANIPIILTKGADWFATIGTERSKGTKVFALAGKINNVGLIEVPMGTTLREVIYEIG GGIKGGKKFKAVQTGGPSGGCLTEKHLDTPIDFDNLLSAGSMMGSGGMIVMDEDDCMVSV ARFYLDFTVEESCGKCTPCRIGNKRLLELLNKITEGKGTEKDLDTLATLGQVIKDTALCG LGQTSPNPVLSTLDNFYDEYMEHVRDKTCRSKQCKSLLTYTISPERCIGCHLCAKNCPAD AISGLVRKPHVIAPDKCIKCGMCMARCKFNAILVC 
>gi|222159328|gb|ACAB01000031.1| GENE 14 13133 - 15886 2077 917 aa, chain + ## HITS:1 COG:no KEGG:BT_0126 NR:ns ## KEGG: BT_0126 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 917 1 917 918 1705 87.0 0 MRRILLGLCFLSFLNSASGQEIPLPEKMPQTHPRVLTTPAGKQETWKLIKKEEWAKDVFN KLKERTEVYTNLTDAQPAWLLSRLAMYWKSHATEVYVKGETFDHAGGERAPYPTVRYTGT RGTAATHGRPKLADVVPYDDEDGNVTFCNNALPDRPMESVHPSKTGRNIESLNCEILGIA RDAAFLYWMTDEEKFAKLAAGVFDTYMTGIYYRNVPIDLNHGHQQTLVGLTSFEVIHEDA LHIAVPLYDFLYNYLKANYPDKMEIYAGAFKKWADNIIANGVPHNNWNLLQARFIMNVGL VLEDNKEYADGKGREYYIDYVMNRSSIRQWSLTRLADYGFDINTGIWAECPGYSSVVIND YANFVNQFDTNLQYDLVKAMPVLSKAVATTPQYLFPNRMICGFGDTHPGYLSTNFFIRMI QNAQANGKKEQENYFTALLKCLNPDLGNDKTEKKNVRVSVNSFFEDKPLTLNPKVQPGKI EDYVSPLFYAPNVSWLVQRNGMHPRNSLMISLNGSEGNHMHANGISMELYGKGYVLGPDA GIGLFLYSGLDYAEYYSQFPSHNTVCVDGISSYPVMKSNHSFDLLSCFPASAEPGKAFTS VTYSNLYFREPESRADQTRMMSIVTTGAETGYYVDVFRSRKEKGGDKMHDYFYHNLGQTL TLTAADGSDLNLQPTEELAFAGAHLYAYSYLYDKKVAATNKDVKATFTIDMKDKDGDDIY MNLWMKGEPDREVFTALAPMTEGLSRTPNMPYNIKEQPTLTFVARQHGEAWNRPFVSIYE PSTKKEPSAIQSVSYFDAEGAGLEDFAGICVKSKNGRIDHIFSLSDAAQTATYQGMKVKA DYAVISNEYAGNRTLFLGNGTQLVAPGVMIQTDNAANVLLEKKEGKWYIISSAPCTVVIG DKKIKSDAASEPMLLRI >gi|222159328|gb|ACAB01000031.1| GENE 15 16115 - 16672 385 185 aa, chain + ## HITS:1 COG:no KEGG:BF1765 NR:ns ## KEGG: BF1765 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 25 185 41 201 201 121 43.0 1e-26 MVYVMRNGRRVIAALSFLLLCCGVHVHAQRVAVKTNALGWLTASPNVEAEFVLGSHVSLN MGIAANPISTDNFKTTFTHFQPEVRYWLNRPMVSHFLGITAFVNNFDMMVKDVHHKGDAY AAGLTYGYAWVLGDHWNIEATAGVGVLRYRQFKYDKGTPKPGAVNDSKTTIAPVKLGVSF VYIIR >gi|222159328|gb|ACAB01000031.1| GENE 16 16699 - 18732 1944 677 aa, chain + ## HITS:1 COG:no KEGG:BF1827 NR:ns ## KEGG: BF1827 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 3 672 1 670 709 491 39.0 1e-137 MKINLKRDRLTGILLVAMMFVGMNVSAQRIRVQGHITNPQGKSVPNVNVLNPVNDERIEM 
SDEDGRYSVLVEKNGSLKFTCVGYEDKTVKVAGKQILNVVLKDAVIELDEVTITSKVKDK VIPEPTDIEIKGNYFHLKTRVPVPKEMFNSHRRLVLQPSIYDVTLKKRLLMRPVVFDGDT YNTTQNRMYDYDMDKDPLHDYIRVKTTSSRKGDIIAYHDSIYIEYLQHDYRADVHLAMEN YRNIIYRDSFSIARGTVNPLRFLEYKFSAFSLTDEKYLPKPVMQLRDTKGEVNLTFLVGK ADLDDKNPQNQVELNRLNQELRAIETNPDASLKSFHITGVASPDGSYATNLRLAKLRTDK ALERILAQLDPETRKLLEVKSDASVASWKEVAELLKKNSKPELAKEVEDLIKQYAATPYR LNGVLKSKPFYKELAATYLPKLRKVQYTYGYSIFRSLTDDEIRELYRKNPKQLTRFEYYR MITTAKTPDEREKYCREALELYDNFTYAANELAVATIQKDTPDSRILEPFVSKSAPAELL SNQAIALLHEGKYTKADSVLTLVPEEAVSEDLQAIVQALAGYYNDAFEKVAATSPFNEVV MLLAMKKNQEAWDKISTMDVETAREYYIKAIAANRLEKIGDAIMSIEKALELDPSLLEVA KVDGDIIDLLPEEQKIK >gi|222159328|gb|ACAB01000031.1| GENE 17 18757 - 21123 2156 788 aa, chain + ## HITS:1 COG:no KEGG:BF1763 NR:ns ## KEGG: BF1763 # Name: not_defined # Def: putative outer membrane protein # Organism: B.fragilis # Pathway: not_defined # 30 389 42 367 391 155 31.0 6e-36 MNIFRINNKYVAVAIFLMLAVTLQAQDYGALQYMLQKRPANEKFESNKFNEHLFFSAGIG PYSLLTSGDSQDGMGMTAHLFMGKWITPVHGLRIGVNLGYLPSSIYDSKIKMGGGSLDYL LNMSSLAYGYNANRCFELVGIAGIEAGYSKVGDNSDRSEKYPDLKGGGQLYYGAHLGLQG NVRLSSTLDLFVEPRIGWYNDGFAYTESWRNYKMAGSVLVGLTYMPAAPMGTKIHFDDFD KSSFLNHMFISLSGGISTLKVPGIKNTIKGLGPQFSAGIGKWFSPSSGLRLSGTVGLSDT PSGSASGYFKHVDLHADYLLNINNVLWGYDEDRIFSLIGIAGVNLAGTKGVDKTAKYAPG IGVGVQGSFRINRSVDLFIEPRLNVYNKRYAGGRGVGGNTDQLGELNFGLTYHTIDRAAR PKNGFSSNHIADNLFMTSGIGVQMFLNKTNLENLGSLGPQASVSIGKWLSPYSGLRLVGT GGFFTNYVVPGSVKAGKLRHASVSGGLDYLWNITSTMSGYNPDRIFDLIGSVGVNLAYTS KSDHKFQPGINAGIQGLWHVNDFLGLYIEPQIRLYGDKFIEGNLGFMQKDVMVGVNAGFH YRFVPYSKAANRSVFGQDDKRYFISGALGLGSLLVANKDLVKNAGVEAKGSIGKWYTPLS AWRVNGTIMYKAKTSSKMNLHYAGLGMDYMMSLATLAKGYSPDHVIDVVPFVGVTAGLVR RYGKFRAVPGLDAGVQVKLKVASSLYLYAEPKVGIRTDTYDGSEQGRPDRVASMVGGLLY RFKMPTFQ >gi|222159328|gb|ACAB01000031.1| GENE 18 21343 - 22713 1115 456 aa, chain + ## HITS:1 COG:no KEGG:BF1089 NR:ns ## KEGG: BF1089 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 10 426 8 424 446 580 73.0 1e-164 
MKTITEAISKQSSNRRLADFLFILWAGGAALLSYSLVYTLRKPFTAATFDGIEAFGFDYK VLVTIIQIAGYLIAKFIGIKLISELKRENRLKFILVSIAVAELSLVAFGALPTPYNMFAM FFNGLSLGCMWGVIFSFIEGRRTTDILASLLGISIVISSGTAKSIGLFVMNTLNVSEFWM PALIGAFALPLLALLGYSLTRLPQPTAQDIEQKSSRVTLNGKQRKELFIDFMPFLVLLFV ANLMLVVLRDIKEDFLVKIIDMNGQSSWMFAQVDTVVTLIILALFGAMVFVKSNIKVLVA LLGLVVLGTATMSFISFNYDSLQLDAITWLFVQSLCLYIAYLCFQSIFFDRFIACFKIKG NVGFFIVTIDFIGYTGTVLVLMFKEFAHVDINWLEFYNILSGYVGLICTVAFTCSMIYLI QRYKKEKQLKKAKEAEMSNGIKFTGEGMEPTTFSQI >gi|222159328|gb|ACAB01000031.1| GENE 19 22850 - 23872 888 340 aa, chain + ## HITS:1 COG:XF0975 KEGG:ns NR:ns ## COG: XF0975 COG3746 # Protein_GI_number: 15837577 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate-selective porin # Organism: Xylella fastidiosa 9a5c # 53 312 103 364 389 79 25.0 7e-15 MDGGVYLKNPNNFGNGTEFNDLRIGVKATYQNWGMKLEMGYAGNKVAIKDAFATYSYKNS SIQIGQFYEPFSLDMICSTFDLRFNQSPGAVLALTNSRRMGVAYSYRTQYYYLCGGFFTD NDLSNLKNASQGYAIDGRLVYRPLYEQAKLIHIGLAAIHRTPDGTLPEDENRNTFTYKSP GVSTIDNRTLIQADVDHAASQFKIGTELLIYYHKFFLQGEYIRAHVKREKGFENYTAQGA YLQCSWLLLGQNYLYDEEVACPGRPEGKALELCARFNYLSLNDAGIKGGTQKDLSFGLNY YINKHIAVKLNYSYFIPGSHIKEIESTNFSVVQGRFQFIF >gi|222159328|gb|ACAB01000031.1| GENE 20 24209 - 25621 1007 470 aa, chain + ## HITS:1 COG:no KEGG:BVU_0121 NR:ns ## KEGG: BVU_0121 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: B.vulgatus # Pathway: not_defined # 25 470 16 461 461 713 74.0 0 MNIVGVLSGKRKCLLAIAIAFSTFGNAQLVTYPEGLNTGMPHNDDYTVKVREAGGEWKDV FEYEVQVDMDRVQSASMVQFDIGSPVEVMVKKNNGTIQDVKIRPLAIGIQHTVNHNAIFF TLTRPQCLSIEFNGDRLHNLHLFANPLETETYTESSDKVMYFGPGVHRPKDLPNTQIQIP SNTTVYLAPGAVVKAKLLIDKAENVRIVGRGILDHPIRGIEVTHSKNIWIDGITVINPDH YTVFGGESTGLTINNLKSFSCKGWSDGIDLMCCSDVLIDNVFMRNSDDCIAIYAHRWNYY GGSRNVTLQNSILWADIAHPINIGGHGNPDDKAGEILENITVRNVDILEHDEDDLLYQGC MAVDCGDKNLVRKALFEDIRVENIQEGRLFHINVRFNSKYDKQPGRGIEDIIFRNIIYNG VGENPSLLKGFDKERSVKNIIFDNVIINGMKMKNIDDFITNEYIKNITVK >gi|222159328|gb|ACAB01000031.1| GENE 21 26069 - 27982 1265 637 aa, chain + ## HITS:1 COG:no KEGG:BT_0127 NR:ns ## 
KEGG: BT_0127 # Name: not_defined # Def: putative transmembrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 22 629 478 1083 1097 1016 78.0 0 MINKIIWIIGIVWIGNIQNGVQAQRRNFIHPGITYTQGDLDRMKAMVKAKHEPYYSTFLK LKESPYSSLNTQVINRGKQIREGRFNATIGVDGRRAHDLALLWHLTGDEAYARKAVEYLN ANSYYTNTSSRGTGPLDNGKIYLLIDAAELMRDYSGWKVEGQQRFKNMLVYPGYSNTEDF SSKYANYWDDSKNGVTFYWNIFNFDAARFGNQGLFAARGMMAMGIYLDNEVMYDRAYRYL LGMKHRPDDLPYPSGPTVTSKEPIKKSPTMLDYKLEGRENKISDYGYDEQLQYYIYPNGQ CQESSRDQGHVLAGIHNYVAIAEMAWNQGDSLYNCLNNRLLLGLEWNYRYNLSKVQTYEE QKEPWEPTSYSKNSNEVTFENGKFLQIKSRSGRWESVAVSPHGRGDVAGSGGTREMALAH YAVRAGLPENNYTWLKRYRDYMIEHHGCENWGIAPNWFYEWTDWGTLTKRLTPWMAGDPV SFSTGKRVSGIHLLPSEISAADYDYYCLAEDPEGHTYHNVGKKRGNEYRPDGAVELRKEE DNYVVTQIEDGEWMNYTVSIPTDGDYTVYLVYQSKGNSLLSVASDSEVKTEPMQLPSSIQ WTEREIGKLTLPAGACVLRLQIEQAGDKLEIQKIIVK >gi|222159328|gb|ACAB01000031.1| GENE 22 28036 - 32118 2964 1360 aa, chain + ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 844 1104 8 271 294 150 35.0 2e-35 MAMKIRICLLIILMGLLYPGYISSQESPYVFRQIGVVEGLPDNYVKSVFPIPDGRIGVRS TVLLSLYDGANYSNFPFNIHGEYSIAYNHIIPEQYIDADKRLWMKERKSLRVFDLTTEQY IYNVDSLFQQFGLKDKVADLIIDSEKHCWFLTPGSSVYMYDAETKSIEQVCRNDEFMEYY GGLLGVESHGKYSWMVHQKGAIRCYDLEKKRFVHQLDFLKDQLKPDDRVVLKILDNGDFW LMWDRGVGYYDVYNKKWNQISGIQLGHYSWFTSMDVDKGGNAWVGTVIDGFYVIDMHNFS VTRTLDIPLLSGNTVRNGIQSIYCDRENNSVWIGLYNQGMCYYHPSMNKIVLYNKKMING DWKGEEIRCMLETSKGEILMGTTQGVYRYEPETKSMNRFYHEFSQKNCRVLYEDSKKRVW VGTYHDGLYCIDHGKVQSYDYPDTDYQNELDFSNIRVMVEDSSGRLWVSIYGGVGLFNPE NGQINLLSEQFPELKKYKVANALAIDNQSRLVVGSDNGLYIYDPATNKIWIPEEDGQANS IFNQGSIKYNQILKDHEGTLWFATQYGLSVLTYNGQSYTLGKEEGLSKAILNVVEDKNHD IWISTVTSIYKIKVDRRADKYSFHVISCLSEDEIRQDDLYSFPSLMTRDNQLFLGLMNGF IAFSPENMIDNQCLNRPLFTSFRLFNVPVVSGEIYNGRVLFDKALSYSDEVQLKYDENYI TLEFSGLNFPNPSQTSFRYQLEGFDKEWTETLFENGQGRVVYNNLPSGEYIFRVSAAGND RIWGPESAFKIVIHPPFWDTLAARIFYAILVILLIFGLIYVINRRNRQKMIRMQEEEALK QKEELDQMKFRFFTNISHELRTPLTLIITPLDMVIRRLTDDAMKKQLNTIYKNAQNLLSL VNQLLDFRKLEMKGERLHLMNGDMEEFIVSAYNNFMPMAVEKHLNFVNQSEHRALYMFFD RDKVHKIMNNLLSNAFKYTPQGGTVNLQLATEEIEERNYVRISVSDTGIGISESDLPYIF DRFYQVGNEGDEKIGSGIGLHLVREYVNIHGGRIKVDSQIDRGSVFTVWLPMDLKPEPNE LPEEIIGTETPDIKEKETTTSTVDDNLKKLLLVEDNQEFRTFLKEQLEDFYQIIEAADGG EGERKAIEENPNLIISDIMMPKVDGIELCRRIKTNVQTSHIPVILLTARTADDIKINSYE VGADSYMSKPFNFDMLMVRIEKLIEQQEKRKQEFRKNIEVNPSAITITSVDEQLIQKCLE YIEKNMDNPEYGVEELSGDLGMVRMSLYRKLQSITGHTPTDFIRSIRLKRAAQLLQGSQL PIVEIANRVGFSSPSYFSKCFREMFGMLPKQYAEESGRKE >gi|222159328|gb|ACAB01000031.1| GENE 23 32233 - 32607 290 124 aa, chain + ## HITS:1 COG:no KEGG:BT_0128 NR:ns ## KEGG: BT_0128 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 122 10 131 132 199 80.0 2e-50 MKVLVLISLVGGLMLSSCGGSSSKNNANVKEVNPADTIYLGDLREKFANDSVFFKIVAPD LMLMDYQYLWAVTESEAVEKGLTKEYYKRVKKEISDTNEAIKRGVMKGANVKRIPDFQAS QENK >gi|222159328|gb|ACAB01000031.1| GENE 24 32845 - 33375 448 176 aa, chain + ## HITS:1 COG:no KEGG:BT_0139 NR:ns ## KEGG: BT_0139 # Name: not_defined # Def: RNA polymerase ECF-type 
sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 170 1 169 171 125 44.0 8e-28 MNKQQATFDDFFSECYLKYKEYIKNYIAIRICHPHEAEDLAQDVFVRLWEHRAFVNKDTV WSLLFTIARNIVTDKIRRYYKQEDFVAYIYNRMEDTSRNTTEDTIHFRELKKMHDQVMEA LPVKRRQIYELSFNHELSCPAIAGKLSLSPRTVECQLLLARKTVRTYLKNEFSKVG >gi|222159328|gb|ACAB01000031.1| GENE 25 33425 - 36673 2803 1082 aa, chain + ## HITS:1 COG:no KEGG:BVU_0126 NR:ns ## KEGG: BVU_0126 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 18 1082 21 1108 1108 676 38.0 0 MEIKSKFLCSKGVALAFAMLLGGSPGVLLYANDSVEESLMVVQTGRIIKGLVTDANGEPL IGCNVVVIGSNAGVITDIDGRFTLNIPADAKQIKVSYIGYVDQIINLHGRSDFKVVLKED NNALDEVVVVGYGTQKKATLTGAVEQIGSQVLESRAITNVGAALQGATPGLVVTRSSSRP GNEGLNFQIRGLTSVNGGSPLIIVDGVPVLNSESFQNLNSDDIENISVLKDGSASIYGAK AANGVILVTTKKGKGKTTVDYGFNMRFTTNGIMAFSPSMQEYASMWLEANKEMPEHDWWG WGEENLKKMAQGIEGIYESTVADWGTMFVGNANRLDELFARRYSYQHNLSVAGSTDKSDY RISLAYADNQANLATAYDGQKQLNLRLNYGIKLTDWFKLETLASMIKTNTETPTHGIDRT LYGNDAPFFPAKNPYGQWYANFGNVGDRNAAAATTDGGRDEREKLTTRVDFKALVDIWKG ITFEGTASFQNEEYRRERYSLPVQCYNWFGEQTAKLVYETTQTLSTPQDVLNFKDSHQPG YLVQANNARYQYYSGLLKYKRTFAEVHNIDAMFGINAEKWVTKKVVTAREKFEDAGIYDL NLATGTQGNGGGKTHNGTYSYIARLNYNYAEKYMVELMGRRDGNSKFAPGYRFKSFGSVS LGWAFSEEQFVEFLKPVLSFGKLRLSYGSSGNDVGLGDYDYVSTVSLGTAGFGTIPANQV SSGFGGLISYDRTWEKVSQKNFGIDLNFFDNRLKATFDYFIKDNTGMLVNVTYPGVLGGK SPKTNSGHLNVKGWEFTIGWRDQIKDFSYYANFNIGDTKSLLKEMEGADSYGAGWNAAVN GYPLNSYFLYRTDGYFKDQAEVDRYYALYGEGKEDLTGVGAGSASRLRPGDTKRLDLNGD YKISGAGNENSDLQYLGDSNPHFVFGFTLGGAWKGFDVNAMFQGVGKQYVIRNDWMAYPF QTRTANQNPTYLGKTWTESNPNAEFPRLTTNANLARWNYQNNDFMMQNNRYIRLKTLIVG YTLPQIWTRKVKLEKVRVYFSGNDLWEATSIRDGFDPEMGAASNNSGYPFARTWSFGLNI TL >gi|222159328|gb|ACAB01000031.1| GENE 26 36707 - 38503 1591 598 aa, chain + ## HITS:1 COG:no KEGG:BVU_0125 NR:ns ## KEGG: BVU_0125 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 598 1 675 675 336 36.0 2e-90 MKKITNYILICTCALGFSSCVNTFLDLEPLDAKTDVIYFKTPEHFREYANGLYGQLLGWQ 
SSYGSIFDHMDAASDLSTCFRYSYGVGTGVMGIPNDDSRWNNCYSNIRATNHMFERAVSS YTGNLADIKKELAEGHFFRAYNYFYLLKFFGGVPVVTKVLDVTSPELYGKRNSRYEVINL ILSDLDEAIAGLPLEQNITSADKGKISKQAAQAFKARVLLYEATWRKYNGTSTDFEGSAG PASDQVNAFLEESVQLSETVMGDAAYSLWNYNNVAAMRNMSNRYLFNIEEEASNPAGAGR ATNKEFIIASVYSQETRKGQVDLNQVIYTDMRPSRKLIDMFLCTDGLPVSMSDKFQGYKN PGDEFQNRDFRLTSYVGSYATSLTVENCGYGVSKFAITDIQRQSKDESANYPVLRLAEVY LNYAEAVMERYGEISDDQLNKSINKIRARAGIANLTNALAKRIQEGVPANATKTVNQVML DEIRRERALELYMEGFRCDDLKRWGIAEENLNESRCGAVVGNASYPTAFVDENGNATSAF NPAIFTKGTEVVETGKGKLPCVVLLKSSDCAFTKGDYLWAIPRNQINLNSNLVQNPGY >gi|222159328|gb|ACAB01000031.1| GENE 27 38540 - 39733 952 397 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160887598|ref|ZP_02068601.1| ## NR: gi|160887598|ref|ZP_02068601.1| hypothetical protein BACOVA_05620 [Bacteroides ovatus ATCC 8483] # 1 397 1 397 397 761 100.0 0 MKFLFKYTFHLAILACVSALFSGCEQDPKYRVYDYPVPVVESIYPTDGYVTTQVVITGTN FGDRAEAVKVFFGEAQSNKVLDCKNNRLVVEVPETAVTGNLSLQIYNKKVENIGHYTVLP TPRVITVTSDSEDGEGVADTGDKVTITGENFGTDPSDISVSFNGTPAEFELVDESTIVAT TPADYQTGSVTVTIHGYTMTGGAMFNPNSKGDVTVLYLQNYKQPFAKANDESWKNGEWWT PAVWNQNAASFNAKGNTTVSGMQYKSEEGLTLAFQNGWDKEAYTNGKMWQVATLRPGKYR LEVTYAYTLVVTDAGNFISVLIAKGDSESDIPNVADLEQLNGVYVAYDKMGTANDSGTLV TPSFEVTETTDVVIGFLTSLAKGNSYFKVTELKLILE >gi|222159328|gb|ACAB01000031.1| GENE 28 39783 - 41822 1587 679 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715824|ref|ZP_04546305.1| ## NR: gi|237715824|ref|ZP_04546305.1| IPT/TIG domain-containing protein [Bacteroides sp. 
D1] # 1 679 20 698 698 1256 100.0 0 MKKIYSYLFPVFLSFFALAACDEDNEEIVPMSYTDPVATVTKIEPVEGYVGNEFTINGDD FGIRTEDVKVFIGSQEAVVVSCADDAILAKVPESATNGKITVEVFGQRVETDLVYRVLGK PGVSVVKPSYGFPGASIVFEGQEFVSSKTLYTLTFGTSTDKAEIVGTPTDTEFTAKIPET AVSGVMTLIMAEQTIDLASYPFTVLKHATLDIPKEDEPVPSGFAGSKFAITGTNLVQELL DKSVEGLEPLKVTFSKAGGEPVEAVIDTDNLMDKSIPLTVPASLEAGDYTITVITPFETI GTQLKYTVLPMPTVTGISTKAGYINAEVTIIGQNFGTKAENIQVFFGETVCDKVTLNDKG NIVVNVPKGVSSEAPVKIKLIIQGKEIEMGESGTFEVWETPEITSVETPYIYPYGTLVKA GQEITFTGKGFGTDKNSVTVTFEGISVPVTVKEITTTSITVTVPQGFNGGRVTMVFEGIA QPVESDMLQPLPVDGDISQYVLMNYKQPFEYVKKGGDKGFHKKGEWAKAAYWIVQNSNLT AGEGGAAVDLAFKTKYGDGSDAGLALQTDWGFDNPKNDGKVYQTSHLQKGKYKLTAHVYE YDGRGFTGYVAVCKGNEMANTSDIPSKSLANASISGTGDVVVDILVEEPTDVVIGFVCTI TVKQGRAKIDNFKLELVEQ >gi|222159328|gb|ACAB01000031.1| GENE 29 41884 - 43179 1139 431 aa, chain + ## HITS:1 COG:no KEGG:Dd586_1768 NR:ns ## KEGG: Dd586_1768 # Name: not_defined # Def: exopolysaccharide inner membrane protein # Organism: D.dadantii_Ech586 # Pathway: not_defined # 60 429 47 406 410 293 43.0 9e-78 MKKIYYITAVFVTLFLTGCGDGIDLPGVNVETDLNKIPLPDNNVNLEQVELKPSTEPMLH EGLHTEEDFQRIRDKKAAGEEPWVSAYQLLVESQFSQKTADTYPTEWIKRGVSGDENYMN AARGATIAYQQALRWKIEQDDEYAAKAVENLNKWVQTCVGVTGNTNLSLAAGLYGYEFAI AGELLRDYEGWDRADFAAFQNWLLKVFYPANDDFLKRHHDTNPLHYWANWCLCNIAAKMA IGIVTDRRDIYNEGIGHLQTGDTNGRLRRAIYHDYAPDYNFAQWQESGRDQGHTLMCVGL MGIICQLAWSQGDDFFAYDDNLFMRACEYAACCNYTNETVPYTTYIWQKQSQWGYPIPEE QTTLGGGKWIKRAIWALPYYHYKGVKDISDDNLKYTKIATEYVGIEGGGGYYDANSGGYD VLGLGTLMYAR >gi|222159328|gb|ACAB01000031.1| GENE 30 43295 - 44701 888 468 aa, chain + ## HITS:1 COG:no KEGG:BVU_0121 NR:ns ## KEGG: BVU_0121 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: B.vulgatus # Pathway: not_defined # 9 468 5 461 461 496 54.0 1e-139 MNKLIRIALTFLLVMTTGVIQAEIVIYPVPQGIYYARHNDDYTVKVRQVGEKDWVDLYEY NVKVDMDTKSDATMVQFDFSGKVEVLVQKNSGEIRSAVVRPLSKGIQPEIDGNFLLFTID KPQKLSVEFNGDRLNNLHVFANPIIKNVPDKNDPNVIYFEAGIHEPTDTAGKCFRIPSNT TVYLEGGAVLKGCLTCDSVENVKILGHGMLLEPQQGISVTYSRNVLIDGVTVVNPRHYTV 
SGGQSTGITIKNLKSFSYQGWSDGLDFMSCSDVMIDDVFLRNSDDCIALYTHRWNYYGDC RNIHVLNSTLWADIAHPINIGTHGNTETGDEVLEDIVFRNIDILEHDEDDRDYQGCMTIN VGDHNLAQNITFEDIRVEHIQEGQLFHLRVMYNQKYNTGPGRGVKNITFRNISCTGKYIN SSLIEGYDKNRKIENILFENIVLNGRRITSLEELNIDKKDFVEKIRIK >gi|222159328|gb|ACAB01000031.1| GENE 31 44786 - 45631 523 281 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237715827|ref|ZP_04546308.1| ## NR: gi|237715827|ref|ZP_04546308.1| predicted protein [Bacteroides sp. D1] # 1 281 1 281 281 534 100.0 1e-150 MKAKMTKLCLLFILLCSLPVFYSCSDEADVFYLYEYWEVMLDPEPPISETSLPITGGTKL GIKGGIAPYTAEIADGQIATAYIDENNDIQISSIKLGSTSLMVKDADGRIIKIGLKVVNG KQSFSVNSVEARITGIDESLLDEQQKEKLEAVIKKIKDEAGIQATGGIAFSYDQKSSGKV TIVTSKDNAPKIEGSFSRSTSESGTTFQITINGKEYDCKFKLPERPDTSKSITTRDLGPI PYWLVEDVTEDYKSDVTDLFGINATSLKIERIYIGSFTPLR >gi|222159328|gb|ACAB01000031.1| GENE 32 45709 - 46896 1084 395 aa, chain - ## HITS:1 COG:uxuA KEGG:ns NR:ns ## COG: uxuA COG1312 # Protein_GI_number: 16132143 # Func_class: G Carbohydrate transport and metabolism # Function: D-mannonate dehydratase # Organism: Escherichia coli K12 # 2 381 1 383 394 355 47.0 1e-97 MMEKTWRWFGKKDKITLPMLRQIGVEGIVTALHEVPNGEIWTVEAINDLKSYIESYGLRW SVVESLPVCEAIKYAGTEREQLIENYKVSLANLGKCGVKTVCYNFMPVIDWIRTDLQYPW PDGTSSLYYDRIRFAYFDIKILEREGAEKDYTEEELHKVAELDKVITDTEKDNLIDTIIV KTQGFVNGNIKEGDKNPVAIFKRLLGLYKDIDRDALRENMCYFLSAIMPVCDEYGINMCV HPDDPPFQVLGLPRIVTNEEDIAWFLNAVDNPHNGLTFCAGSLSAGAHNDTRELAKKFAG RTHFVHLRSTEAMPGGNFIESSHLSGRGHLIDLIRIFENENPGLPMRVDHGRMMLGDEDK GYNAGYSFHGRMLALAQVEGMMTVVDDEKQHKIEF >gi|222159328|gb|ACAB01000031.1| GENE 33 46926 - 47741 195 271 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 10 263 4 238 242 79 27 7e-14 MNELFSIAGKVAVITGAGGVLGGNIAQHFVQQGAKVVAIDIRQEQLDNRVAELKQYGNDV IGIIGNVLDIASLEKVAEEVVAKWGQIDILLNIAGGNMPGATLEPDQHFYDMDISCWEKV TNLNMNGTVYPSMIFGKVMAKQKKGCIINVSSMAAYSAITRVPGYSAAKTAVANFTQWLA SEMALKYGDGIRVNAIAPGFFIGDQNRRVLINPDGSLTDRSKKVLAKTPMNRFGDIKELN GAVQFLCSEAASFVTGAMLPIDGGFSAFSGV 
>gi|222159328|gb|ACAB01000031.1| GENE 34 47862 - 48743 607 293 aa, chain + ## HITS:1 COG:AGl1135 KEGG:ns NR:ns ## COG: AGl1135 COG2207 # Protein_GI_number: 15890685 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 25 293 40 300 313 137 29.0 2e-32 MKKIMNEKLTITTSNPVRARFYEYPRFTYPWHFHSEYEIIYVEKGEGDCLVGDSIISYSK GDLILFGSELPHSMQSPPDGGEESDNEEKSEPKVRGVNIQFEKDFMHYSISQYSQFIPIR NLLEDACRGIKFTITRSGKIIKLLEQIPSAKGADQIILLLSLLQMMATSNHKKYLTTSHY TPSPSIMRNERMEKVIAYLNKHYTESIGLDEIASYTAMNPAAFCRYFKENTGKTFKEYVL DMRIGYACKLLNSSMMNISQISATCGFESPVHFNRIFKRVTGMTPTFYREQME >gi|222159328|gb|ACAB01000031.1| GENE 35 48813 - 50762 1744 649 aa, chain + ## HITS:1 COG:no KEGG:BT_0132 NR:ns ## KEGG: BT_0132 # Name: not_defined # Def: alpha-glucosidase, putative # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 645 1 640 644 1205 88.0 0 MNVNRMKKRILLLLLVIPALVKAQDVMTETRRELTSPDGAYRFTFYQRAVGEDNAQMYYT LTYKNRPVIEESKLGVLIENQLFESALGIPNDTCHFWCENLKLTETEHQKTDERWKPVYG ERAEVRDCYNEMTLKLKKGEGNGNQDGGYDKRKNYFMNIIVRAYNEGVAFRYHFPEMTNG LFLHIVGEQTSFTMPEGTMAYYERWAQGPYELRPLEGWGKEESERPLTLKLPDGLSVALL EAEMVDYVRGKFRLSAEKPSTLETSLYSSVDIISPYSTPWRVIMAAERPVDLINNNDIVL NLNPACKLADTSWIKPGKVFRSGDLKQDRVKAAIDFAAERGIQYVHMDAGWYGPEMKMSS DATTVSPDKDLDIPALCKYAESKGIGLMVYVNQRALVQQLDTLLPLYKKWGLKGVKFGFV QIGNQRWSTWLHDAVRKCGEYGLMVDIHDEYRPTGFSRTYPNLMTQEGIRGNEEMPDATH NTTLPFTRYLAGAGDYTLCYFNNRVKNTKAHQLAMAAVYYSPLQFMFWYDRPEFYQGEEE LEFWKAIPSVWDDSRALDGEVGEYIVQARRSGNDWFVGAMTNTEARTITLTTDFLEPGKK YMLHLYEDDDKLNTRTKVRSTHKKIKAGDKLVLKLKASGGAALHFIPFK >gi|222159328|gb|ACAB01000031.1| GENE 36 50980 - 51363 282 127 aa, chain - ## HITS:1 COG:no KEGG:BT_0134 NR:ns ## KEGG: BT_0134 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 127 1 127 127 237 90.0 9e-62 MQERHLYEYAVIRFVPKVEREEFINVGIVLFSKRCKYLKSLYTIDENKLKLFSSELDINC LIEGLKVFDKICQGNKEGGVIANMDIPDRFRWLTAVKSSCIQVSRPHPGFSTDLDNTLER LFKELVL 
>gi|222159328|gb|ACAB01000031.1| GENE 37 51341 - 52123 781 260 aa, chain - ## HITS:1 COG:no KEGG:BT_0135 NR:ns ## KEGG: BT_0135 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 260 1 260 260 488 92.0 1e-137 MDLRTANVTRYILPLREGGSLPALAEADDEFKYVVKFRGAGHGTKALIAELIGGEIARTL GFRVPEIIFLNLDEAFGRTEADEEIQDLLQWSRGLNMGLHFLSGSLTFDPVVHKVDGKTA SQIVWLDALLTNVDRTIKNTNMLMWHKELWLIDHGASLYFHHSWTNWQKHALSPFVQIKD HVLLPFADQLEEVDATFKQLLTDDKIREIVNAVPDDWLNWTEGQETPQDLRDVYIQFLEE RIKHSEIFVNEAQNARKALI >gi|222159328|gb|ACAB01000031.1| GENE 38 52277 - 53887 1035 536 aa, chain + ## HITS:1 COG:no KEGG:BT_0136 NR:ns ## KEGG: BT_0136 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 536 1 536 536 983 83.0 0 MKLLRFWVFGIMSVILLPSNMAAEWQWSVQLPGIISNETNDHPQAFLWIPSDCTQVKAVI IGNHNMTEETLFENSLFRERLSETGIALIWITPGWDQRWDITTGCQEVFEKMMEDFANVS GYSELKYAPIVPLGHSAMATYPWNFAAWNPDRTLAIISLHGDAPRTNLTGYGRDNMEWGK RTIDGIPGLMIEGEYEWWEDRVNPALAFRMMYPKSCVSFLCDTGRGHFDIADRTAGYIAL FLKKALEYRMPATYDLDKPVELKKLNPQNGWLAERWHPNQKKRAKAAPYKEYKGDSHDAF WYFDKEMAEMTEERYRRERGKKPQYLGFVQKGQLLTYNPKSHVKVAARFLPEKDGLTFHL KAVYTDSLHTAISDEHSLTSPEITRICGPVQKVNDTTFTVRFYRMGMYNQRRTGDICLLA GNDGDKHYKSTVQELSFRIPYRNTEGKRQHILFPGLDDVRKGTETISLHATSDCGLPVYY YVKEGPAEIEGNKLVFTRIPPRSKFPLKVTVVAWQYGLAGEVQTAESVERSFFIYQ >gi|222159328|gb|ACAB01000031.1| GENE 39 53995 - 55923 1633 642 aa, chain + ## HITS:1 COG:BH1877 KEGG:ns NR:ns ## COG: BH1877 COG3533 # Protein_GI_number: 15614440 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 49 642 4 594 758 377 36.0 1e-104 MIMNKKSLVAIALLAVSAISSAQSVYPGQHQGKMKKETVAPVRVESFDLKDVRLLPSRFR DNMLRDSAWMTSIDVSRLLHSFRTNAGVFAGREGGYMTVKKLGGWESLDCELRGHTTGHL LSAYALMYAATGSEIFKLKGDSLVNGLTEVQNALKGGYLSAFPEELINRNIRGKSVWAPW YTLHKLYSGLIDQYLYADNQQALKTVTKMGDWAYNKLKPLSEETRKLMIRNEFGGVNESF YNLYAITGDERYRWLAEYFYHNDVIDPLKELRDDLGTKHTNTFIPKVIAEARNYELTENE 
TSKKLSEFFWHTMIDHHTFAPGCSSDKEHFFDPKKFSKHLTGYTGETCCTYNMLKLSRHL FCWTGDSSIADYYERALYNHILGQQDPETGMVTYFLPLLSGSHKLYSTKENSFWCCVGSG FENHAKYGEAIYYHNNQGIYVNLFIPSQVTWKEKGLTLLQETEFPKEETTRFTIRAEKPV RTTVYLRYPSWSKKAEVLVNGKKVAVKQKPGSYIAITRDWKDNDRISATYPMQIALEATP DNPNKVALLYGPLVLAGERGTEGMQAPAPFSNPAFYNDYYTYNFHVPADLRTSLKVDMKH PERTLQRTGKELKFITEQGDVIRPLYDLHHQRYVVYWDLQSK >gi|222159328|gb|ACAB01000031.1| GENE 40 56036 - 57958 1224 640 aa, chain + ## HITS:1 COG:MA3635 KEGG:ns NR:ns ## COG: MA3635 COG0596 # Protein_GI_number: 20092435 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Methanosarcina acetivorans str.C2A # 372 639 16 282 282 94 27.0 5e-19 MKGRLLIVIFISICFIAGSCNVSSVQGSRYNFSSLDSVIQGWVSKEYYPGASICVVKNDS VIFQKNYGSYTPDTKVYVASAGKWVAAAVIGAVVDSTSLDWDDPVEKWLPEFKGDAKGKI LLRQLLSHTSGVRPYLPEPRVDNYNHLDSAIIEILPLDTVFTSGSRFQYGGLAMQIAGRM AEVAMGKEFETLFQELLAQPLEMKNSHFTPINTDGGHAPMLGGGLCTTLNDYIHFLSMIY HDGMYNDKRIISAKTVKEMQADQVKDAIIPSNNSDNYVAKGLGQSHNGIYGLGEWRELID KKTGEAYQISSPGWAGAYPWINKRENVYGFFIAHVVGASSKEDGFSSFYGSPVISRTVSE IVKGHPLVVKQGRVEVGNGSLYYEEAGTGAPVIFVHGHSLDHRMWDEQFSVLAKKYRVIR YDLRGYGISSSQTEDYQFTHAGDLVTLMDSLHIKKAHIVGLSLGGFITADMLAYFPDRML SAFLASGNIRKSKGPSEPMTPEEARVRDKEIAALKEKGVDVMKKEWFEGLMKSGGSQRER MRESLWQMIDEWDAWQPLHKEVRVVAGLDAIEKLKKNHPDVPALIVEGHSSGNRFSKEPP ILQYLPNGKLKVIDDCGHMLNMERPEEFNAALEEFLKNVQ >gi|222159328|gb|ACAB01000031.1| GENE 41 58003 - 60282 1639 759 aa, chain + ## HITS:1 COG:no KEGG:Csac_2721 NR:ns ## KEGG: Csac_2721 # Name: not_defined # Def: heparinase II/III family protein # Organism: C.saccharolyticus # Pathway: not_defined # 27 759 14 752 752 454 33.0 1e-126 MKQTIIAVICFFCVSSLYIQAQKINHPSLLYTPQRIQQVKQRMQHEPKLQEAWGDIKKTA DEALQKKDFNRLDYLSLAYLMTDNKEYADAIKEILLKAVEAESWGDVEMMARIPAWRSQL GMAHKSFLSAVGYDAAYNVMSSSERKKIAEGLKRLAVEPALGDWLLEPTRIHSLNSMGHN WWTSCVCQGGILALSLQNELPEVKEWVEQLHESLPEWFDFAGDVLQQKAKSFDEAGGMYE SLNYANFGIQEALLFRIAWINTHPGQNPGDIPQLAKLPSYFSQVCYPRTGMLHSLNFGDS 
HKNVSAESSMMLLYALGVKDPTILWYIAQVEQGQHRDGFFLNRPMGFFYTPDLSKAPAVP QLPTSQLFADFGWATMRDSWEKDATMLAVKSGHTWNHSHADANSFIVFHKGVDIIKDGGN CWYPNPAYRNYFFQSQAHNVVLFNGEGQPREQQYSGSTLRGYLYHLLDAGNVKYVLANGT GPVSNNFSRNFRHFLWMDNVIYMIDDLKTHKIGQFEWLWHTNGTYKKSGVDVNVTNGNSS VVIRPLYPRLLAKSDFVHDYPEDLYWEEIEAPTEDLKGTETYYSFHLPAEVNRVKGLTAI ILKETPNEKDLPQMERREGQDWIGLRIRHKGKVTDLYINQLADGRLMHSNSWIMPDGWMT DAYMFAVSYPEGTEAADAKDFFICHGSALRRDKETYFSSLAKLFVIQKEEDKKLNLWIDG QPKIHASFRSKKKPIRVEVNNKRIPVVYKQSQLSIKLVD >gi|222159328|gb|ACAB01000031.1| GENE 42 60384 - 63155 1137 923 aa, chain + ## HITS:1 COG:VC1353_1 KEGG:ns NR:ns ## COG: VC1353_1 COG3292 # Protein_GI_number: 15641365 # Func_class: T Signal transduction mechanisms # Function: Predicted periplasmic ligand-binding sensor domain # Organism: Vibrio cholerae # 7 587 13 577 675 73 22.0 1e-12 MNNLIRHITSLIFISLSLSAYPISYSFRGLSETNGLSDLTVSALYKDSVGYLWIGTATSV ECFDGFRFKHYPIFGDDEKLKWVNVIAETQGNQILVGNDMGLWRINKESGKLEPFVAEVI KFGVRSLLTDAQGTLYIGSEKGLFVCKNNELEQITINADMWSSDNFIVDLNMGEKDTLWI LTRSKLYSMNLSTRKITLCPCGDNDRQDYTYRKMARIGSRLYLGIMEHGVVVYDIPTGKF RDNWIDLGCNVISSLSGDGKDILYVGTDGNGVHFVSVKENKVMKSFRYEPGTNGGIRSNS VYSLLVDREGLIWVGFYQLGLDYTLYQSGLFSTYSYSPYFDSKDIPVRAIAINENERLIG SRNGLFYVDETKRRFRSFKSPQLRSNMIMCIYPFQGKYYIGTYGGGMYILDSSTLTIHDF EPSETPFVKGHVFCIKSDNDGCLWIATSMGLYQYKDGEQLNHYTSSNSKLPEGNVLDICF DSTGKGWICTATGLCIWDPSTRSIKSDIFPEGFIHKEKIRTVYEDSSHELYFLPDKGPIF ISDLSMTHFRRFQPGILLEGKDAMFMIEDREGWLWIGTNLGLYRYDKKSTIVPYTFVDGL PSSVFITCCPVIDASGTIWFGNSKGLIYLTRDLRDINEENSYPLAITDVYVNGKEPYHPA IQREQHLYKVSLESSHKNLTICFSGFTYSDPAYMSYEYKMEGMDDDWQLLGGKSELTYYD LSSGKYQFKIRRIGEPGSEICLSVTIASSFNATLWGTILLLIAIICLSYAYYRRKKNSPA LAVDAIETGKPVVEEKYRKSNVSMEECHRLANELDMLMQEKRLYVNPDLKIADLASVLNV SPYTLSYVFNQFLNKNYYDYLNDYRIAEFKRLLNEDEYSKYTLSALAELCGFSSRTSFFR YFKKTTGITPNEYVQSIGRTNED >gi|222159328|gb|ACAB01000031.1| GENE 43 63342 - 63857 437 171 aa, chain + ## HITS:1 COG:PA1300 KEGG:ns NR:ns ## COG: PA1300 COG1595 # Protein_GI_number: 15596497 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma 
subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 12 156 9 158 175 65 25.0 6e-11 MENINMKSTQLVADSYKSYHHSVYLYIYYKIGKDEEAKDLSQDVFLRLMDYKQILCPETV KHFIFAICRNLVTDYLRRFYKAQEVTSYLFDRIPTCTNEVESQIIANNLLTCEWERANQL PEQRRKIYIMSRFENKSSADIAAQLGISSRTVENHLFISRKEIREYIQQCI >gi|222159328|gb|ACAB01000031.1| GENE 44 63878 - 67111 2993 1077 aa, chain + ## HITS:1 COG:no KEGG:BT_0140 NR:ns ## KEGG: BT_0140 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 1077 15 1097 1097 1164 54.0 0 MLVLNFMYMKKNLFRFSSRRFLFSTVMASALVTGAPQVVFANVGEVQAVSQTGAIKGKIV DSYGEPVIGANVKVKGTTNGTITDMNGNFTLNNVSGGTLVISFIGYKTLEVSVKGTNLAK IIMHEDTEVLDEVVVVGYGTQKKESLTGSVTVVDQKLFKDKGTVANPLSAMQGQVPGLRI TRSSAAPGEEGWGVSIRGSVSKNKVEPLLIIDGVPASGVSEMAQLNADDIETINFLKDAS AAIYGAKAAGGVILVTTKRPDAGKTRVEYSGSVTRKIVGLQPRMMSMDEWTDGVLQARLN DGYGEDDNWVRYARLAKAMKGSWINLHGGNNPDEDPIPGGFRGVADFVFHDMNWTDVLWG NATSTQHNLSVSGGSEKSTYRLSLGYLNDQGTLQWGNNSNERYNVRLFNSFKINNRISLE TNMSASRQHQVAPTQIGSILGSSIPQPGLPVSTIDGKPYAWGGIHTPNWSAELGGDNKLV VTTMNVNEILKVNILDGLDFQGTLGYSTNNAARDEQYLSIDWYQYDGTPILNENSPYPAK EKSSYTKSSARTDNYTASAFLTYKKLFDEVHDISLMGGVQYDYAAYDYSGTKAMDVEAAI ESLNGKGQIYIDKVDRWEEAILSYFGRLNYNYKSRYMVEVNARYDGSSKFLPKNRWNFFW GASAGWRITEEKFMENLRDYVNELKLRASYGEVGNQNGIGRYDGIQLYNYKSNSGALLGN SKGTYVESAGLVSTVRTWERIYNYNLGIDFGFLNNRLTGTFEVFKKKNDNMLVSRLHPGV LGGTAPSTNSGKFEANGYDASLTWRDKIGSFNYHVGGTLTYMTNKLVDGGNIVVSAGYNE AVNGYPLNSVFGYRYVGKVQNQDQLDYYVDKYVGSNSIALPANLRLGDHMYEDVNKDGKL TQDDLVFLGSDDPKYSFSFNFGCEWKGFDFNTVFQGVGKRTIYREIDSWKVPFKAIWLNT TNQSVGNVWSPETPNNHFPTYSNSNDINNYNYIPSTWSADNGAYLRLKEIVLGYTLPKAW MNKSGFISNLRIYVSGADLWEKSYITDGWDPEATRKVENKQRYPFNRTVTFGVNATF >gi|222159328|gb|ACAB01000031.1| GENE 45 67139 - 68860 1325 573 aa, chain + ## HITS:1 COG:no KEGG:BT_0141 NR:ns ## KEGG: BT_0141 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 573 5 579 579 602 55.0 1e-170 MKKIAYILGIICAMTMQSCLDLEPKTQLADTNYWKSPEHFKLFATQFYGWSADFRWLDDS 
QHADIRSDLFAGSTVNIYSNGTNSIPSSDKHYTDNYNRIRQTNTLLQQADAYEQQADIAI SVGEARFFRAYCYFELLQAYGDLIITRTPLDIDSPEMQKARNPRTEVADFIIDDLKEAAK LLPVDKAISSSDEGRLSSEAALAFLSRVALYEGTWEKFHNGGQNTERTKNLLNTAAQAAH DVMAGGAFELFAPQELGTEAYKYLFILENEKSNPASINKTGNKEYIFTRRHDPVINSIGF NITQGRLGNALYVTRKMANMYLQSNGLPINPQTWNYTKVDDEYKNRDNRMNNQLMVPGQT YWGTNGGRIDWSGNAAEIANASHRNFMPQTGTGYFPHKWCSERDGVPTGMEAYDFPIIRY AEVLLNYAEAVFERDGQISDEDLAISLNLTRKRINPEMPDLTNDFVTANNLDMRTEIRRE RTIEFFDENFRIDDLKRWKTAEDEMPMNLTGVKWKGTDFENRWPDASFKDKDGEGCIIHE KGRNWHEKHYLLPLPTDQLKLNSNLKQNPGWNQ >gi|222159328|gb|ACAB01000031.1| GENE 46 68895 - 70361 1196 488 aa, chain + ## HITS:1 COG:CAC3086 KEGG:ns NR:ns ## COG: CAC3086 COG5492 # Protein_GI_number: 15896337 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Clostridium acetobutylicum # 4 210 147 345 498 80 33.0 8e-15 MTYKNILQKGAFAALMTAGLMMTVSCNDDDVSPVLEERVLVKEINLEVTPELPLLIGTDT LIKYTVGPDEAFNKEVVWKSTVPDVATVDAEGRITAVKAGSTVISAKPAVGYSATSTINV KVVNEIIHITDINLTNEDLEVFVTSTLPLTWQTSPADPTYPGLIWKSLTPDKATVNEKGE VKGIAEGKARIQVTATDDKHFTKEFEIKVKPIIYIESMEFVKEADKLALGETYTPQMNVT PIDATLSYIVWESSKPDVVEIDEKGRFVAKAYGNAVITALADYGDGKPVKATMSVNVSEG LMNDQFTVENIWQPFGNFTHREFEWMENDGVLRIFPGEDKTYKAVRICRNGGFGFNVTNY PILAFKMKFPENVFEASTSIEWYLDIWGGSGITVAGKYGEDTDKGKNAMTVVDCGEYKVF YADFTKKFLGKNQQYMPTSLTKYDTVELQFWKIWYEANETGSIYVDWIKTFESEQELKDL IKNESGKQ >gi|222159328|gb|ACAB01000031.1| GENE 47 70412 - 71677 1202 421 aa, chain + ## HITS:1 COG:no KEGG:Dd586_1768 NR:ns ## KEGG: Dd586_1768 # Name: not_defined # Def: exopolysaccharide inner membrane protein # Organism: D.dadantii_Ech586 # Pathway: not_defined # 45 419 44 407 410 293 44.0 8e-78 MKNLIYTFVGSVTLVLLTSCSAEFENGTPEPWQREINPNENYEYTIKHPCLVNTEADFER ARRKVNERAEPWISGWEKLCESRFAQLGYNANPQEEIVRHSQGGNFNTAAFDAGAAYQLA VRWKISGDEDYAKAAVKILNGWAKTCKGVNYKTWPDDSHRLLAAGFIGYQFAAPAELMRD YEGWKTEDFEVFKKWMDKTFYPICDDFLDNHFNSSAISGWMSWDLPAMLTILSIGVLNDD DAKIKQALEFFYHGKGMGCIEWSVKGMHEDPAGKVKGRHLAQSQEMGRDQGHATLNVGLH 
AYFCRTAYNMGIDLFAYNDNIILDLCEYTAKYNLTSAEDVEMPFEPYYHPKYGWHEKVSA DGKGRARPGWELLYNHYAKVKGMNAPYSQAFAEKERPEGYGERGTAEAGDLGFGTLLYTE E >gi|222159328|gb|ACAB01000031.1| GENE 48 71775 - 74930 2058 1051 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 29 1045 7 985 1087 702 39.0 0 MINRLFSTLLFLSCLFCGLSPTAKAQNNEWENPTQYEWNKEKPHADFRLYERADDAVNDN PQKSPWQYSLNGTWKFVYAPSIAESIKDFYRTDLSDNDWDTIIVPSNWEIQGFGEPIIRN IQYVFSPNPPYIDVDNPVGTYRRTFTVPQNWQGREVLLHFGSISGYARIYVNGQQVGMTK ASKTPAEFNVTNYLKKGENLLAVQVYRWHDGSYMEDQDFWRLSGIERDVFLTAYPQTTIW DFFLHAGLDDTYRHGQFRATVDLRSFNTNATLQKGTLTLDLKDAGGKTVLSAQKVYCISD ISTTLTFEGTVRNVRKWSAEHPSLYDCILTLRSDGDKQQTVVAHKVGFRRIEIKNARLLV NGVPTYIKGVNRHEHNDTLGHVQTREIIMNDLRLIKQLNMNAVRTSHYPNHPLFYQLCDQ YGIYVVDEANIETHGMGSVPYFKDTIPHPAYRPEWYAAHVDRITRMVERDKNHPCIIGWS LGNECGNGIVFHDEYKRLKKYDPGRFVQFEQAWEDWNTDVVCPMYPNMWKITEYRKSGKQ RPFIMCEYAHAQGNSNGNFKDLWDIIYDSPNLQGGFIWDFMDQGFKVKTEPRDGRTYWTY NGKMGSYKWLEDKKGELNTGTDGLISANGIPKPQAYEVKKVYQYIQFSAKDLGKGIISIK NRYDFTNLDEYAFIWEIYKNGEKISTGDFNVDLIPHEEKEIRLSLPVIPEDGNEYFLNLY AHTRVATDLVPAGYEVAREQMQLNKSSFFTSLPPCSGKLSYETKDNILSFQSGGVSGKID LKKGILFDYMINGEQPIKQYPEPAFWRAPIDNDFGNKMPTLAGVWRTAHVNRYVKKVTID EQNEKGLSIRIDWILNDIQVPYMMEYLICNNGAIVVTGSIDLTGTKLPELPRFGMRMELH QPYENLVYYGRGPFENYIDRYSSSFIGRYEDKVENQFYWYTRPQETGNKTDVRWLSLLDS EGQGVRITGLQPIAFSALHFSPEDLDPGLTRKLQHTIDIVPQKNIFLHVDLKQRGLGGDN SWGMYPHNEYRLLDKKYTYSYMIELVKKNSD >gi|222159328|gb|ACAB01000031.1| GENE 49 75214 - 77946 1829 910 aa, chain + ## HITS:1 COG:PA3423 KEGG:ns NR:ns ## COG: PA3423 COG2207 # Protein_GI_number: 15598619 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 726 903 65 243 247 75 29.0 3e-13 MSDGLSGLLVNAIYKDSEGFVWLGTDNCLDRFDGVKVRHYEFRGVDSGRKKRVNCITETA DNQLWVGNGIGLWRLNRANGQLERIVPEKIDFAVNTLLPDGDILYIGTEKGLFIQKDGQL 
LQVLTDRNMLAACNRIMDLCLNEDKSALWLATVQGLFSYSLKDGKIDSWHFQENVPEADY FRCLTRIGETLYLGTMSQGVVRFDINKKSFSHTVSLGCDVISDISSDGKETVYIATDGNG VHFLSHKAQQVARRFCHDVSDKEGIRSNSIYSLLVDDRGAVWVGHFQAGLDYSLYQNGLF QTYAYPPLFNSANLSIRSFVNRGHEKVIGSRDGLYYINEATGVVKSFVKPVLTSDLILTI CFYEGEYYIGTYGGGMMVLNPETLSLEYFARGGDTELFQKGHIFCVRPDEKGNLWIGTSQ GLFCYNRQAKQIKHFTSANSQLPEGNVYEVSFDSTGKGWIATETGMCIYDPASQSLRSNV FPEGFVHRDKVRTIYEDTGHNLYFIREKGSLFTSTLTMDRFRNRSVFSTLPDNSLMSIVE DNQGWLWVACNDGLLRIKEEGEEYDAFTFNDGVPGPTFTNGAAYKDEKGILWFGNTKGLI YVDPKRVDEVRGKVRPIVFTDILANGISFTNPSLKYNQNNLTFCFTDFAYGLPSALLYEY QLEGVDKDWKLLAAQNEVSYYGLSSGTYTFRVRLPGNEQSEAVYQVTVRPMIPWWGWALS VLLVVGIIAFIRYYVWNRMRRLLTSPASAISASIENTQQKDQTIEQAEVISEESSIAATE EKYKTNRLTEEECKELHKKLVAYVEKEKPYINPDLKMGDLASALDTSSHSLSYLLNQYLN QSYYDFINEYRVTQFKKMVVDSQYSRYTLTALAELCGFSSRASFFRSFKKSTGVTPNEYI RSIGGTAKEE >gi|222159328|gb|ACAB01000031.1| GENE 50 78637 - 79152 624 171 aa, chain + ## HITS:1 COG:PA1300 KEGG:ns NR:ns ## COG: PA1300 COG1595 # Protein_GI_number: 15596497 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 12 156 9 158 175 60 25.0 1e-09 METTANTSDNIITRSYEEYYQVILTYIAYRITHRYEAEDLTQDVFVRLLDYKQMLRPDTV KYFLFTIARNIVIDYIRRYYKKQEIDSYIYDTMSTSTNETEEKIIGDDLMTMERTRLAAM PEQRRLIYTLNRFENKTSPEIANELNLSCRTVENHLFLGRREMREFFRNCI >gi|222159328|gb|ACAB01000031.1| GENE 51 79285 - 82521 3003 1078 aa, chain + ## HITS:1 COG:no KEGG:BT_0140 NR:ns ## KEGG: BT_0140 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1078 22 1097 1097 1348 62.0 0 MKRRGLIFNSRPNKTAVFALGLMLGLCPVSKAWADTNMKDSQVVAQAQKKVKGQVVDATG EPLIGVNITVIGGTEGTITDIDGNYTIGVPAGAKLKFSYIGYKDQIIDVAAQTVINVKLQ EDSEVLDEVVVVGYGSQKKETLTGAVTVVSDKMLKNKGTMSSPLQAMQGQVPGVTITRNS AAPGDESWGLKLRGSVSANNAEPLIVIDGVAYDGVNALRNINPSDIQSINFLKDASAAIY GSRAAGGVVLVTTKQAKEGKARIEYSGSYTYKMVGLQPELMTMNEWANAVLQTCQNDGQP DSYSWVKYAKMALANEGKYIDLDHSANPFAPSYSDVMDFVFMDTNWQDVLFGNSYSTQHD 
LAVSGGTEKNLYRLSVGYMYDDSNLKWGNNNNQRFNLRLTNKLKVFDNFSIESVIAYNRQ DQVAPSQLNRTLTSSYPQPGLPASTIDGKPYSWGTWLSPIWFAELGGDNKLKVSEINISE KFTYNINKHLDVVANLGYNSGVASRDIKKMAITSYNYAGTKINTKADGYKQEESSYEKTS SRTDFYSMAGYVDYHNTFAQHHNVSAMVGAQYELKEYDYFGVSVKDIQNSLETVNGAGLV NLTDKHGTKWHEAVMSYYSRLNYNYKSKYLLEVNMRYDGSSKFKPENRWDFFYGISGGWR ITEEAFMKNIKWLNDLKIRLSYGEVGNQSGIDRYDGTQFYKFESQSGAYIGPNKGTIIDT NGKIASLGREWERIKNYNLGLDFSLFNSRLTGTAEVYMKRNDNMLINVSYPGVLGDNAGM SNNGKFRSHGWEVMVNWSDKIGKDFTYHVGGTYSFNTNKLTDIGAVSVLKSGFVDKQQGY PLNSIFGLRYAGKAQTEEERQKYLYRFLTGNTIGLTEQNFRLGDNMYADVNNDGKLDQND IVYLGTDDPKISFSFNVGAEWKGFDLSLVFQGAAQRTIFRTGDNNGNEIWRIPMKALYLN TSNQSVGNTWSPENRGAYYPTYSNKNEINDYNYQASSWSVEDGSYIRLKNVTLGYNVPSA FLAKTKAISSCRVYVAGADLWEYSKINDGWDPEASRKVSAAGRFPFVRTVTFGLNLTF >gi|222159328|gb|ACAB01000031.1| GENE 52 82525 - 84258 1470 577 aa, chain + ## HITS:1 COG:no KEGG:BT_0141 NR:ns ## KEGG: BT_0141 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 576 9 577 579 639 56.0 0 MKNMKLYKKALLILSLGLTLNSCLDLDPQDQLADGNLWGAADDFKYFATNFYGWTRDFTS VISDGAHSDWRSDLMTSSSVNMYSNGSNPIPTSDGNYTSNYAHIRRCNLLLQKAASFSGN GDISRYVAEAKFFRAYSYFDLVQLFGNVIITKEPLDITSPELRMSRNDRSEVIDFVIQDL KDAAEVLPETITTDDEGRISKWGAYAFLSRVALYEGTWQQNRNNEERGKALLSIAADAAK KVIDSKQFELFKPAALGEMAYKYLFILENVQSNPANLTKSANKEYIFYRRHDETIAPIGT NITHGCVANVQYINRKFVNMYLCSNGLPIDHAKSADLFKGYGGLTTEFENRDNRMKNCLL AHGTKYWDNDKPRIDWKGMDGEDATNAIICNVFQGSGYQNQKWGTERKVADTFEGYDFPI IRYAEVLLNYAEAMYELGTTGKALDDALNISLNLVRLRVNPDMPKLTTSFASENSLNMRE EIRRERTIELYNEGFRVDDLKRWNTAIVEMPKPILGVKWTGTDFATSWAGASNMAKDSDG CLILENSRRWGGKNDLYPLPVDQCQLNPNLGQNPGWQ >gi|222159328|gb|ACAB01000031.1| GENE 53 84290 - 85828 1239 512 aa, chain + ## HITS:1 COG:CAC3086 KEGG:ns NR:ns ## COG: CAC3086 COG5492 # Protein_GI_number: 15896337 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Clostridium acetobutylicum # 66 252 202 374 498 70 33.0 9e-12 MKRDYIKYFVGLAACSFLGLTSCDDDKDLGGKMDEMITVSTITLNETQYDAGNKTICLLK 
NKELQLSWSIAPENATNANVQWTSSDESIVAVTREGKVVTKDKAGKAIITLTPEIGFGPE ATIVTRTVEVMDEYTYMSAINITNVPAEEIAAGDEYQLTVSSEPATTTFKRYKWTSSNPE VATVDEKTGLVTGISKGDATIIVTADDFSSNPVSASCEIGVKIVTPITGMTFTEDAELNQ LGYGQEYQIKYTLEPVDATASLLTWTSDNPEVISVDKTGKLKVHTTKAASAVITASYGPV VQNVTVTVADGRFWYSLGNGLGNGLDNNWYLDGNNASVVSSDGNKTTVQMGGPDSKVKYR GDLWLARSGKGKVVNITPAEYRYLAVKIGFKSQLHVGNNSQGCIKLEVFDDGIKKIGPEY WGTAGDANNRYAILGTSEISITKPNVLYYDLQARYKTVTPTDWNQVFDLSQCKFVIADFP ETAQTYDLYWIRSFKSIEELQAFVDSENNTNE >gi|222159328|gb|ACAB01000031.1| GENE 54 85834 - 87237 1219 467 aa, chain + ## HITS:1 COG:no KEGG:BT_0143 NR:ns ## KEGG: BT_0143 # Name: not_defined # Def: putative transmembrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 22 467 18 458 458 410 50.0 1e-113 MNKMKLYTIIGAGLFALTSTTLLACSEIEDGSNDMSSWPEVKDYVTTLEHPCMLHTETDF EFVKGKVQAGAQPWKNAFDHLSRGGNSLSQSNYKASPVKLLARLDQNNWAGKYPNDWNNY TKLMKDAAAAYQLALRWKLSETDGAQYADAAVAILNDWAKTCTGFIVNDKGEFIDPNEFL IFIQVHQIANAAEIMRSYSGWQEADFIKFKAWIADVFYPHITKFLSTHNGNECALHYWLN WDLSAMTALLSIGILADDNFKINEAIQYFKFGIGSGNIGNGVPFIHLDPDSNEMLGQCQE SGRDQGHATLCVSLLGAFCQMAKNVGEDLFIFDDGRALAMCEYVAKYNIGGAETGSSSAS WKMTGFRYTDNDLPYTTYTNCSGSWDTISAQERREGKDSRGEVRPAWELVNRLAQDYGKS SIYAKMWVDKMRENASRGNSDGGAGDYGPNSGGYDQLGFGTLMFAKE >gi|222159328|gb|ACAB01000031.1| GENE 55 87487 - 89136 1511 549 aa, chain + ## HITS:1 COG:CAP0114 KEGG:ns NR:ns ## COG: CAP0114 COG3507 # Protein_GI_number: 15004817 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Clostridium acetobutylicum # 10 548 16 529 531 214 30.0 4e-55 MRKKILFLAGMMACSSLAGAQEISKTWVADKGNGTYQNPVLHADYSDPDICAAGDDFYMT ASSFNCIPGIPILHSNDLVNWSLVNYALPVQEPEPFFDKAQHGKGVWAPAIRFHNGEFYI YWGDPDYGIYMIKTKDPKGKWSKPVLVKAGKGMIDATPLWDEDGKVYLIHAYAGSRSGVN SILVICELNAEGTEVISDPVMVFDGNDGKNHTVEGPKLYKRNGYYYIFAPAGGVAAGWQL VLRSKNIYGPYESKIVMAQGKTTINGPHQGGWVDTNTGESWFVHFQDKGAYGRVIHLNPM MWVNDWPVIGVDKDKDGCGEPVTTYKKPNVGKTYSVATPPESDEFNTRHLGLQWQWHANK KDTYGFTTDLGFIRLYAGSLSKEFVNFWEVPNLLMQKFPAEEFTATTKLTFTAKQDGEQT 
GIIVMGWDYSYLSIRKAGDQFILQQAVCKDAEQQNPEQIKELANIPVKYLKMPGVADNEW QTVYLQVKVRKGAVCTFAYSLDGKKYTTVGESFTARQGKWIGAKVGLFCVTPNEGNRGWA DVDWFRMNK >gi|222159328|gb|ACAB01000031.1| GENE 56 89203 - 90423 1154 406 aa, chain + ## HITS:1 COG:no KEGG:BT_0146 NR:ns ## KEGG: BT_0146 # Name: not_defined # Def: unsaturated glucuronyl hydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 406 1 406 406 765 91.0 0 MKKIVVGLAVMLGFCTCVHQPSGTLDVNKALDYCAEQTQRTLAELKTDSGIDYTMMPRNI MTDEHHWNCRKATKEEWCAGFWPGVLWYDYEYTKDKQIQEEAEKFTNSLEFLSRTPAFDH DLGFLVFCSYGNGYRLTKNPAYKQVILDTADTLATLFNPIVGTILSWPREVEPRNWPHNT IMDNMINLEMLFWAAKNGGNPYLYDIAVSHADKTMKCQFRPDYTSYHVAVYDTITGNLIK GVTHQGYADSTMWARGQAWAIYGYTVVYRETKDPKYLDFAQKVTDVYLERLPEDKVPYWD FSAPGIPDAPRDASAAAVVASALLELSTYLPNGTGKRYKDAAIEMLASLNSDSYRSGKSK PSFLLHSVGHWPAHSEIDASIIYADYYYIEALLRLKRLQEGHGVLG >gi|222159328|gb|ACAB01000031.1| GENE 57 90585 - 91058 527 157 aa, chain - ## HITS:1 COG:no KEGG:BT_0147 NR:ns ## KEGG: BT_0147 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 157 11 152 153 214 89.0 6e-55 MKKVIMLAAVVAALASCQSKANKAAEAQADSLALAMTPITELTEVYEGTLPAADGPGIDY VLTLNAATDGVDTAYTLDMTYLDAEGQGQNKTFTSKGKQQTVHKVVNKKPVTAVKLTPKD GEAPMYFVVVNDTTLRLVNDSLQEAVSDLNYDIIKVK >gi|222159328|gb|ACAB01000031.1| GENE 58 91116 - 92408 1427 430 aa, chain - ## HITS:1 COG:sll0260 KEGG:ns NR:ns ## COG: sll0260 COG1253 # Protein_GI_number: 16331101 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Synechocystis # 1 422 8 430 448 317 42.0 2e-86 MEFLIILFLLILNGIFAMYEIALVSSSKARLETLVAKGNKSARGVLKQLEEPEKFLSTIQ IGITLIGIVSGAYGGVAIADDLVPFFSLIPGAEAYARNLAMITTVAIITYLSLIIGELVP KSIALSNPERYATLFSPVMILLTKVSYPFVWLLSVSTRLLNKLIGLKSEERPMTQEEIKM ILHQSSEQGVIDKEETEMIRDVFRFSDKRANELMTHRRDLIILHPDDTQEKVMKIIEEEH YSKYLLVDERKDEIIGVVSVKDIILMVGNKKVFNLREIARPPLFIPESLYANKVLELFKK NKNKFGVVVNEYGSTEGIITLHDLTESIFGDILEEDEMEEEEIVTRQDGSMLVEASMNID 
DFMEEMGILSYEDLESEDFTTLGGLAMFLIGRIPKAGDIFTYKNLQFEVVDMDRGRVDKL LVIKRDDEQE >gi|222159328|gb|ACAB01000031.1| GENE 59 92660 - 93445 984 261 aa, chain - ## HITS:1 COG:YPO0927 KEGG:ns NR:ns ## COG: YPO0927 COG0501 # Protein_GI_number: 16121232 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Yersinia pestis # 1 254 1 245 250 153 36.0 3e-37 MKREIAMIAFVLLGMGMTASAQFGKKINLGKALQAGKDVVSAVTLSDADIANMSKEYMAW MDTHNPLTKPDTEYGKRLEKLTGHIKEVDGLKLNFGVYEVIDVNAFACGDGSVRICAGLM DVMTDEEVMAVVGHEIGHVVHTDSKDAMKNAYLRSAVKNAAGAANDKVAKLTDSELGAMA EALAGAQFSQKQENEADDYGVEFCVKNGIDPYAMANALSKLAELAKDAPKASYAQRMFSS HPDTQKRIERTKAKADSYAKK >gi|222159328|gb|ACAB01000031.1| GENE 60 93479 - 95725 2061 748 aa, chain - ## HITS:1 COG:no KEGG:BT_0150 NR:ns ## KEGG: BT_0150 # Name: not_defined # Def: putative ferric aerobactin receptor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 748 45 792 792 1339 87.0 0 MILGLNKGGVTNAEGHFTIEQVPPGIYRLQATAIGYKSVTTPEYILSTKDLNISIEMEEN LTELAGVTVTASPFRRDLESPVSLRIIGLQEIEKSPGANRDISRIVQSYPGVAFSPIGYR NDLIVRGGSPSENRFYLDGVEIPNINHFSTQGASGGPVGILNADLIREVNFYTGAFPTDR GNALSSVLDFKLRDGDMEHNSLKATLGASEVSLASNGHIGKKTSYLVSVRQSYLQFLFDM LDLPFLPTFTDAQFKLKTRFNEQNELTVLGLGGIDNMRLNTKADSEDNEYILSYLPKIKQ ETFTLGAVYRHYAGPHVQSVVVSHSYLNNRNTKYRQNDESIPENLMLRLRSTEQETKFRF ENNSSFRNWKVNLGVNLDYSQYTNTTFQKAYTNQVQTFDYHTYLGMMRWGLFGTISYSSM DERFTASLGLRADANNYSSAMKSLSDQLSPRISLSYQLAEHWFVSGNAGLYYQLPPYTAL GFKDNNGTYVNKYNLRYMKVSQESLGISWRKGDTFEVSVEGFYKDYDKIPLSVVDGIPLT CKGNDYGVIGNELLTSTAQGRSYGAEILVKWLIAKKLNLASSFTIFKSEYRNDKESEYIA SAWDNRYIFNLRGTYNLPHQWSVGMKVSCIGGAPYTPYDEEKSSLVSAWDAQGKAYYDYS KYNKERLPAFAQVDLRIDKTFYLKHCMLGFYLDLQNITASKLKQQDVLMSTGIIENPEAP ANSQHYKMKRLKQSSGTLLPTLGITFEY >gi|222159328|gb|ACAB01000031.1| GENE 61 96039 - 96473 377 144 aa, chain + ## HITS:1 COG:no KEGG:BT_0151 NR:ns ## KEGG: BT_0151 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 144 1 144 144 247 81.0 8e-65 
MKTVKLITCNDAMKAHILQGALENEGIESILHNENFSTLYKSCVSSIAGVDILVADEDYE NAVQVLKDNDSWPEELTLCPYCGSSDIQLVLRKEKRWRAMGAAIISALMVIPPGDNHWNY TCKQCHKNFEMPVAKFNPAAEPEE >gi|222159328|gb|ACAB01000031.1| GENE 62 96500 - 97318 698 272 aa, chain + ## HITS:1 COG:PM1451 KEGG:ns NR:ns ## COG: PM1451 COG0627 # Protein_GI_number: 15603316 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Pasteurella multocida # 4 272 2 265 269 174 37.0 2e-43 MEKKKNLLIALLLIWVAPSFAAKVDTLLIKSPSMNKEVQVVVVTPDAASGKKAVACPTIY LLHGYGGNAKTWIGIKPNLPQIADEKGIIFVCPDGKTSWYWDSPVNPSFRYETFISSELV KYIDEHYKTIADRKGRAITGLSMGGHGAMWNAIRHKDTFGASGSTSGGMDIRPFPKNWDM AKQLGEYESNKEVWDNHTVINQIDKIENGDLAIIVDCGEDDFFLNVNKDLHNRLLERKID HDFITRPGAHNGQYWNNSIDYQILFFDKFFKK >gi|222159328|gb|ACAB01000031.1| GENE 63 97726 - 98520 424 264 aa, chain - ## HITS:1 COG:no KEGG:BF1404 NR:ns ## KEGG: BF1404 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 11 247 15 249 274 150 38.0 5e-35 MKKIAILLLAILTFSCSDDDEKGTVEDKGQWAMIFNETVKSDSNPVDRTEKFMFDGERLI QHIIKQRYFEEEISNEVNLSYSDNQVTVTTDYLTLVYTLNSEGYASQCVYSLSSQNRIYQ FSYSAEGYLTGIVENIDDTEYSSTSLTYENGDITSISSKMNGLENKFIYEPGEESSTYHL PCLGLLEIHPLTFHIEALYAGLLGKDPRHFTMHSCPAGSNDEKTVYSYGFDKKGNPSRMI CQTTYAGGQASYYPYTRNISISFE >gi|222159328|gb|ACAB01000031.1| GENE 64 98527 - 100002 852 491 aa, chain - ## HITS:1 COG:no KEGG:BT_0154 NR:ns ## KEGG: BT_0154 # Name: not_defined # Def: putative periplasmic protease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 491 1 484 484 612 65.0 1e-173 MTKLRKIILIPALILVSISGFFSCGVDRWPEYAHQTALDTWMYDIMQQNYLWYQDLPSYD DVNLFLEPASFLSKVKSKNDSYSFVDSVMETPLPTYGFDYSLVRSADIDTAYNALITYVI PGSPAADAGLVRGDWIMKVDTSYISKKYETQLLQGIPARDLVMGVWKEVPVVEEEDKTKA DTEEGGEGETEYVYKVVPNGKTLKLPAARVVEDNPVHKYTVLPAKENGQEIEVGYLMYNS FTAGTSSDPEKYNNELRQVSQEFKTAGVKYVILDLRYNAGGSLDCVQLLGTILTSGERYD SNAPMAYLEYNDKNRDKDATIHFDGEVLKGGTNLNLPGLFVITSSTTAGAPEMLIRSLYL KDSYPVVAIGGATKGQNVATEQFINEEFLWSINPAVCTVYDSNHDTYGSISPATDLKVSE 
TIIDGITNYSEFLPFGNKDERLLKVALGVIDGSYPPKDEETEETTKAQFKIEKSVISPAS RRFSSNGLRLK >gi|222159328|gb|ACAB01000031.1| GENE 65 100186 - 101253 697 355 aa, chain + ## HITS:1 COG:BB0682 KEGG:ns NR:ns ## COG: BB0682 COG0482 # Protein_GI_number: 15595027 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Borrelia burgdorferi # 1 353 1 350 355 303 43.0 2e-82 MEIAALLSGGVDSSVVVHLLCEQGYKPTLFYIKIGMDGAEYMDCSAEEDIELSTATARKY GLSLEVVDLHQEYWENVAAYAIDKIRQGLTPNPDVMCNKLIKFGCFEQRVGKDFDFTATG HYATTLQRDGKTWLGTAKDPVKDQTDFLAQIDYLQVSKLIFPIGGLMKQEVREIANRAGL PSAKRKDSQGICFLGKINYNDFVRRFLGEKEGAIVELETGKKVGTHRGYWFHTIGQRKGL GLSGGPWFVVKKDIEENTIYVSRGYGVETQYGNEFRMHDFHFITDNPWKGQEKEVDIIFK IRHTPEFTKGKLIQEGEKLFHILSSEKLQGIAPGQFGVIYDEEVKVCVGSGEIIC >gi|222159328|gb|ACAB01000031.1| GENE 66 101342 - 101596 365 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715862|ref|ZP_04546343.1| ## NR: gi|237715862|ref|ZP_04546343.1| predicted protein [Bacteroides sp. 
D1] # 1 84 1 84 84 80 100.0 4e-14 MTTMELKSLKMDLVEELLSLNDKEMLNRVKNYLKRLKKMEAEKEEEEITKEEVLAGIDAG LKEVKLSMEGKLEVKTAREFINEL >gi|222159328|gb|ACAB01000031.1| GENE 67 101586 - 101927 259 113 aa, chain + ## HITS:1 COG:no KEGG:BVU_0489 NR:ns ## KEGG: BVU_0489 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 112 1 112 113 109 51.0 3e-23 MNYEIIVKPTFQREAKRLAKHYSSFKEDFVSLIDDLEQNPLLGTDLGHGLRKVCMKITSK GKGKRGGARVITFTLVVSQQDAVLNLLYIYDKADRASISEKEIEQLLKQNGLK >gi|222159328|gb|ACAB01000031.1| GENE 68 101998 - 102657 532 219 aa, chain - ## HITS:1 COG:RSc0292 KEGG:ns NR:ns ## COG: RSc0292 COG2197 # Protein_GI_number: 17545011 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Ralstonia solanacearum # 1 213 1 210 210 93 27.0 3e-19 MREFIIADNQDISKAGMMFLLSKQKEVSLLLEADNKAELIQQLRLHPQAVIVLDYTLFNF AGADELIVLQERFKEADWILFSDELSLNFLRQVLFSSMAFGVVMKDNSKEEIMTAIQCAT RKQRYICNHVSNLLLSGASSPLAASSVDDHLLTQTEKNILKEIALGKTTKEIAAEKNLSF HTINSHRKNIFRKLGVNNVHEATKYAMRAGIVDLAEYYI >gi|222159328|gb|ACAB01000031.1| GENE 69 102746 - 103336 657 196 aa, chain - ## HITS:1 COG:no KEGG:BT_0157 NR:ns ## KEGG: BT_0157 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 196 1 196 196 300 94.0 2e-80 MKKFIALVALVLVSASTMMYAQESNAAARRAERKAQRDAERAKLRAEEEVQDMAAYQQAV QALKNKQFVLEANQVVFRNGMSAFVTSNTNFVLMNGNRATVQTAFNTPYPGPNGIGGVTV DGNSSDMKMNIDKKGNVNCSFSVQGIGISAQVFINMSSGNNTASVSISPNFSNNNLTLNG NIVPLDQSNIFKGRSW >gi|222159328|gb|ACAB01000031.1| GENE 70 103403 - 104830 996 475 aa, chain - ## HITS:1 COG:FN1422 KEGG:ns NR:ns ## COG: FN1422 COG1757 # Protein_GI_number: 19704754 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Fusobacterium nucleatum # 11 474 1 445 473 270 35.0 4e-72 MKKAPSPLISLIPIVVLVLLLFATIRTFGSDALSGGSQVSLLTTTAICILIGMAFYKIPW 
KDYELAITNNIAGVATAIIILLIIGALSGIWMISGVVPTLIYYGMQIIHPSFFLASTCII CALISVMTGSSWTTIATIGIALMGIGKAQGFEDGWIAGAIISGAYFGDKISPLSETTILA SSITDTPLFRHIRYMMITTIPSLIITLIIFTVAGLSHDASNTQHIAEVATALNEKFHITP WLLIVPVVTGILIARKVPSIVTLFLSTLLAGVFALIFQPELLQEISGVAVSGFDSLFKGL MMTVYGATNLHTDNTVLTDLIATRGMAGMMNTIWLILCAMCFGGAMTASGMLGSITSIFV RFMKKTVSVVGATVCSGLFLNLTTADQYISIILTGNMFRDIYAKKGYESCLLSRTTEDAV TVTSVLIPWNSCGMTQATILSVPTLVYLPYCFFNIISPLMSITVAAIGYKIARRS >gi|222159328|gb|ACAB01000031.1| GENE 71 104985 - 105419 304 144 aa, chain - ## HITS:1 COG:no KEGG:BT_0160 NR:ns ## KEGG: BT_0160 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 141 1 141 144 226 87.0 3e-58 MKYGVNRQILLITAGAVWIIAGANILRIGIVTWLNTSQDWMFKIGEATVVFLLFFVLVFR RLYYKHTQRIEQKKERRNCPFSFFDVKSWITMIFMISLGITIRSFHLLPETFISVFYTGL SIALILTGVLFIRYWWIRRKTFAS >gi|222159328|gb|ACAB01000031.1| GENE 72 105466 - 106641 852 391 aa, chain - ## HITS:1 COG:CAC3482 KEGG:ns NR:ns ## COG: CAC3482 COG0477 # Protein_GI_number: 15896719 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Clostridium acetobutylicum # 15 385 19 389 394 247 39.0 4e-65 MKQPLKENGGLPASILWTLAIVAGVSVANIYYIQPLLNMIRHELGISEFRTNLIAMVTQI GYAAGLLFITPLGDLYQRKKIILVNFTVLIFSLLTIALTHSFHLILIASFLTGVCSMIPQ IFIPIAAQFSRPEHKGRNVGIVLSGLLTGILASRVVSGFIGELIGWREMYHIAAGMMFIC AIVVLKVLPDIQTNFQGKYSDLMKSLLALVKEYPQLRIYSIRAALNFGSLLAMWSCLAFK MGQAPFFANSDVIGMLGLCGVAGALTASFVGRYVKRVGVRRFNFIGCGLILFAWLLFFVG ENTFVGIIAGIIIIDIGMQCIQLSNQTSIFELNPRASNRINTVFMTTYFIGGSMGTFLAG SFWQLYGWHGVISTGVVLTGISLLITIFYKK >gi|222159328|gb|ACAB01000031.1| GENE 73 106668 - 109109 1440 813 aa, chain - ## HITS:1 COG:FN0580 KEGG:ns NR:ns ## COG: FN0580 COG4953 # Protein_GI_number: 19703915 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase/penicillin-binding protein PbpC # 
Organism: Fusobacterium nucleatum # 24 812 1 721 724 374 32.0 1e-103 MNPIKFFKRLSVTKKVMTISITLLIIGYIFCLPRQLFHVPYSTVVTDRNEELLGARIASD GQWRFPPRKTTPEKIKQCLITFEDKHFYHHWGVNPLATGRALYQNLKNKRVVSGGSTLTM QTIRLARNKPRTLGEKVVEMIWATRLEFRASKEEILSMYVSHAPFGGNVVGLDAAAWRYF GHSAEDLSWAESAMLAVLPNAPAMIHLSKGRKTLLSKRNRLLKQLLEEEIIDASTYELAV SEPLPDEPHPLPQIAPHLVSRFYQERNGLYTRSTIDRGIQSHIESLAERWSNEFNRSDIR NLAILVIDIPTNQVVAYCGNVHFDRKQGGSQVDVIQAPRSTGSILKPFLYNAMLQEGSLL PKMLLPDVPVNINGFTPQNFSMQFEGAVPASEALARSLNIPAVTMLQRYGVPKFHHLLQQ MEFKTINRTASHYGLSLILGGAEATLWDVTNAYAQMGRSLSSSPSPTVSQGKEAQILLET ENTEQINNDTKQTSRNHKSNHNKQEKREQSNSSPLSVEEDSEETTSAKTGAAWLTLSALT EVNRPEEIDWKSIPSMQTIAWKTGTSYGFRDAWAVGVTPRYTVGVWVGNATGEGKPGLVG AQTAGPVLFDIFSYLPSSPWFERPTGVFVDAEICRQSGHLKGRFCEETDTVLILPAGLRT EACPYHHLVTLSADESHRIYENCANTEPTIQKSWFALPPVWEWYYKQHHPEYKPLPPFKA GCGEDSFQSMQFIYPPMNAHIKLPKQLDGSKGFLTVELAHSNPNATIFWHLDDTYQTQTQ DFHKISLQPAPGKHSLTAVDGEGNTVSTTFFIE >gi|222159328|gb|ACAB01000031.1| GENE 74 109259 - 114889 4182 1876 aa, chain - ## HITS:1 COG:FN0579 KEGG:ns NR:ns ## COG: FN0579 COG2373 # Protein_GI_number: 19703914 # Func_class: R General function prediction only # Function: Large extracellular alpha-helical protein # Organism: Fusobacterium nucleatum # 262 1869 54 1604 1611 393 25.0 1e-108 MGLTKTTRSISATGLLLLIMMTVGLYSCTRTQKDIIPSADYAPYVNAYTGGVISQNSTIR IELTHDQPMVDLNSELKNNPFSFSPSLKGKAYWVSNNTIEFVPEEGALKPGTLYEGTFRL GDFIEVDKKLKEFNFSFRVQERNFTLQLESLPITAAQPDEINIKGEIRFSDVVKKEEVEK MLTASDGKKSYPVEVTATDNLTRYQFNIRQIPREADDYPLTITANGSPAGIDRKQSEEVL IPAKDCFRFMSAERIEQPENGIEIVFSAPLSTTQDLKGLIEIPEVSSSIFQISENRVFIY FEANTQNKLTLNIHEGVKDSQGKALGTSHTISFSEVSLKPQVEMSTSAAILPDSKSLIIP FRAVNLYAVDLSVIRIFENNVLMFMQTNSLASTNELRRSGRLVYKKTLWLAKDASKDIHH WGDYSIDLAGLIHQEPGAIYRVILSFRQEYSAYPCGGNENQDMKFADSNTSDGLTKVSGS VLSEEDEAIWNTPEAYYYYNGGTMDWSVYRWTERDNPCHPSYYMNSDRIAACNVFASNLG MIVKRNSLNKLWIAVSNILDTKPIGKAQVTAYNFQLQPIGKGETNGDGFVEITPKGVPFI IVAESEKQKAYVRVVDGEEQSVSRFDVGGKDIQKGLKGFIYGERGVWRPGDTLYISFILE DREKRIPDKHPVALEIYNPRGQFYTKMISTQGMNGFYTFDVPTQATDPTGLWNAYIKVGG 
TTFHKGLRIETIKPNRLKINLALPKILQATDKDVYAPLTSTWLTGATASKLKAKIEMSLS KVNTQFKNYGQYIFNNPATNFTTIKTDIFDGTLDAEGKTSVTLKVPTATEAPGMLNATFT TRVFEPGGDASIYTQTIPFSPFTSYVGINLNQPKGKYIETDKDHVFDIVTVNTQGQLVNR TNLEYKIYRIGWSWWWENSGESFGTYINNSSITPVASGNLQTRGGKASFKFRIDYPSWGR YLVYVKDKESGHATGGTVYIDWPEWRGRSSKTDPSGIKMLAFSLNKDLYEIGETATAIIP AAAGGRALVSIENGSTVLRQEWIEVSNGGDTKYTFKITPEMTPNVYLHISLLQPHAQTVN DLPIRMYGVVPVFVTNSQTVLQPQIQMPEVLRPETNFNVTVSEKSGKPMTYTLAIVDDGL LDLTNFKTPDPWNDFYSREALGIQTWDMYDNVLGASAGSYSSLFSTGGDATLKPADAKAN RFKPVVKFIGPFYLGKGKSQTHTLKLPMYVGSVRAMVVAGQDGAYGNAEKTAFVRTPLMM LSTLPRVLSIQEEITVPVNIFAMENQVKNVTVSLQASGGGVQIVGTNQQSLKFTQPGDQL VFFTLKTGSKTGKATIHLTANGSGQQTKETIEIDVRNPNPVVTLRNSQWIEAGQSKELSY NLSSSSANNQIKLEVSRIPSVDISRRFDFLYNYQHHCTEQLTSKALPLLFVAQFKTIDKT EAEKIKTNVQEAIRQIYGRQLPNGGFVYWPGNAAADEWISSYAGMFLTLAQEKGYAVHAN VLNKWKRFQRAAAQNWRMPQEASGWQQWQSELQQAFRLYTLALAGVPEYGAMNRMKEQTG LSIQAKWRLAAAYALTGKMKPAEELVYNVETTVNPYSSMNQIYGSSDRDEAMILETLILM NRERDALQQAKVVSKNLSQEDWFSTQSTAFALMAMGRLAEKLSGTLDFVWSWNDKQQPAV KSAKAVFEKEIATTPKSGTVSVKNQGKGALSVDLITRTQLLNDTLPAISDNLRMDIRYAN LNGTPLSVNDIIQGTDFMAITSISNISGTSDYTNLALTHIIPSGWEIYNERMVAPETENA AADGSGQSVSKYSYQDIRDDRVLTYFNLRRGETKVFTVRLQATYAGNFILPAVQCEAMYD VNVQARSKAGRTTVSR >gi|222159328|gb|ACAB01000031.1| GENE 75 114990 - 115391 257 133 aa, chain + ## HITS:1 COG:CC3636 KEGG:ns NR:ns ## COG: CC3636 COG0545 # Protein_GI_number: 16127866 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Caulobacter vibrioides # 3 132 37 167 177 115 44.0 2e-26 MGKKKEYKDANRRFLKKLSFQEGVFALPCGIYYKVLETGEGTISPGARSIVTVHYKGSLI DGRVFDNSYERTCPDALRLSDVIEGWQVALQKMHVGDKWIIYIPYAMGYGIKSFDSIPAY STLIFEVELLGVA >gi|222159328|gb|ACAB01000031.1| GENE 76 115537 - 115899 410 120 aa, chain - ## HITS:1 COG:DR1328 KEGG:ns NR:ns ## COG: DR1328 COG4828 # Protein_GI_number: 15806346 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Deinococcus radiodurans # 9 106 1 96 109 57 34.0 5e-09 
MLLTFLNTLATIISVISLLIVTYGVLVGFVAFLRNEIKRFNGTYTINNIRQLRADFGSYL LLGLEFLIASDILKTVVDPTLDELAILGGVVVVRTVLSVFLNKEIKELAEDNSAKDIKEL >gi|222159328|gb|ACAB01000031.1| GENE 77 116179 - 116337 163 52 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEYINEFHKQWIERTRKSIALWSKQPASLEEKRKQQERLDQQRAIREGKLKS >gi|222159328|gb|ACAB01000031.1| GENE 78 116557 - 116721 81 54 aa, chain - ## HITS:1 COG:no KEGG:BT_0173 NR:ns ## KEGG: BT_0173 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 54 283 336 336 102 92.0 5e-21 MTDTMGRMHSDAQFAGSSSVPAHVEMMRFLGIGNTPMVGCTVACAVDRAQALGK >gi|222159328|gb|ACAB01000031.1| GENE 79 116873 - 117211 221 112 aa, chain - ## HITS:1 COG:no KEGG:BT_0167 NR:ns ## KEGG: BT_0167 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 110 1 110 112 195 95.0 5e-49 MDELIKKHLQDILTAIEEVESFFGNAPKVYDDFYSNLCLRRAVERNIEIIGEAMNRILKV DKDIAITNSRKIVDARNYIIHGYDSLSVDILWSMVINHLPKLKNEVTALLNI >gi|222159328|gb|ACAB01000031.1| GENE 80 117204 - 117503 268 99 aa, chain - ## HITS:1 COG:no KEGG:BT_0168 NR:ns ## KEGG: BT_0168 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 99 1 99 99 163 93.0 2e-39 MKLIENNIQKIIDLCKKHKVHKLFVFGSVLTSRFNDNSDVDLIVDFNKAEVSDYFDNFFD FKYALENLFGRKVDLLEEQTIKNPYLKKNVDATKTLIYG >gi|222159328|gb|ACAB01000031.1| GENE 81 117811 - 118821 1281 336 aa, chain - ## HITS:1 COG:no KEGG:BF3207 NR:ns ## KEGG: BF3207 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 336 1 336 336 644 97.0 0 MIREVKFESQDRRIKQIIAALNANGIKDIEEANAICEAHGLDPYKTCEETQPICFENAKW AYVVGTAIAIKKGCTNAAEAAEAIGIGLQAFCIPGSVADDRKVGIGHGNLAAMLLREETK CFAFLAGHESFAAAEGAIKIAAKADKVRKEPLRCILNGLGKDAAQIISRINGFTYVQTQF DYFTGELKVVREIAYSDGPRAKVKCYGADDVREGVAIMWKEGVDVSITGNSTNPTRFQHP VAGTYKKERVLAGKPYFSVASGGGTGRTLHPDNMAAGPASYGMTDTMGRMHSDAQFAGSS SVPAHVEMMGFLGIGNNPMVGCTVACAVDVAQALSK >gi|222159328|gb|ACAB01000031.1| GENE 82 118841 - 119542 818 
233 aa, chain - ## HITS:1 COG:CAC2565 KEGG:ns NR:ns ## COG: CAC2565 COG0822 # Protein_GI_number: 15895825 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Clostridium acetobutylicum # 1 233 1 230 230 371 79.0 1e-103 MTYSHEVEHMCVVKKGPNHGPAPIPEEGKWVKSKEIVDISGLTHGIGWCAPQQGACKLTL NVKEGIIQEALIETIGCSGMTHSAAMASEILPGKTVLEALNTDLVCDAINTAMRELFLQI VYGRTQSAFSEGGLIIGAGLEDLGKGLRSQVGTLYGTLAKGPRYLEMAEGYIKQIFLDKN DEICGYEFVHMGKFMDEIKKGTDANEALKKVTGTYGRVTAEQGAVKSIDPRHE >gi|222159328|gb|ACAB01000031.1| GENE 83 119781 - 120185 321 134 aa, chain + ## HITS:1 COG:no KEGG:BT_0175 NR:ns ## KEGG: BT_0175 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 133 1 133 134 202 86.0 3e-51 MELNQIDIHYSIAAICVISSALVFYTIGVWGERLQRKLKFWHIIFFLLGLLADTVGTSLM EHIAELTHLHDEMHTVTGAIAILLMFVHALWAIWTYVKGTPIEKRHFNRFSIVVWCIWLI PYLIGVYLGMRLHV >gi|222159328|gb|ACAB01000031.1| GENE 84 120622 - 121122 600 166 aa, chain + ## HITS:1 COG:no KEGG:BT_0176 NR:ns ## KEGG: BT_0176 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 166 1 162 162 276 92.0 2e-73 MIEVVESALQKAAGEGMDEFIQAFTDKYKEVIGGELTAETMPLLTGEQHSLLAYQIFRDE MMVGGFCQLIQNGYGGYIFDNPFAKVMRLWGAEEFSKLVYKAKKIFDANRKDLEKERTDD EFMAMYEQYEAFDELEEAYLEMEEQVTALIASYVDDHLELFAKIIK >gi|222159328|gb|ACAB01000031.1| GENE 85 121199 - 121969 521 256 aa, chain + ## HITS:1 COG:no KEGG:BT_0177 NR:ns ## KEGG: BT_0177 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 239 1 248 263 131 39.0 2e-29 MFMKQVKFFLVALMAVVMGMSVTSCMNGDDNHNVTMTVPVKYNYGSFLMGDATTKLVPTT ELGFLDGNMYIISCQYDQSQVTANSTSIPVTLLSTPLCIDPKDNERLSAIKSEPTNPLYS LDKQQSSLVYYDKNTIVLTMPYWGKVTNSSVEESEVKKHSFILYYNPDEIKSTDTKLNLY ISHRVNDEEGESVTRSNFTYAYRAYSISSALNAFSSKTEGKLPKYLVLKAQTNNSKDELK EENGETSVEYDYAFTE >gi|222159328|gb|ACAB01000031.1| GENE 86 122002 - 122547 437 181 aa, chain + ## HITS:1 COG:no KEGG:BT_0178 NR:ns 
## KEGG: BT_0178 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 181 1 181 181 310 92.0 2e-83 MKLISSPAKAWEEISMEEDRRKVFMAFVYPMIGLCGLSVFIGSLLTNGWGGPQSFQIAMT NCCAVAVALFGGYFLAAYAINEMGTRMFGMHSNMPLTQQFAGYALVVSFLLQIVTGLLPD FRIIAWLLQFYIVYVVWEGVPILMGVEEKQRLKYTLLSSVLLILCPAVIQIVFNRLTAIL N >gi|222159328|gb|ACAB01000031.1| GENE 87 122557 - 123009 440 150 aa, chain + ## HITS:1 COG:TM0254 KEGG:ns NR:ns ## COG: TM0254 COG0691 # Protein_GI_number: 15644629 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: tmRNA-binding protein # Organism: Thermotoga maritima # 10 149 16 155 158 134 48.0 4e-32 MKQPPVNIKNKRATFDYELIDTYTAGIVLTGTEIKSIRLGKASLVDTFCYFTKGELWVKN MHIAEYFYGSYNNHTARRERKLLLSKKELEKLQREMKNPGFTIVPVRLFINEKGLAKLVV ALAKGKKEYDKRESIKEKDDRRDMARMFKR >gi|222159328|gb|ACAB01000031.1| GENE 88 123098 - 125845 2735 915 aa, chain + ## HITS:1 COG:VC0390_2 KEGG:ns NR:ns ## COG: VC0390_2 COG1410 # Protein_GI_number: 15640417 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I, cobalamin-binding domain # Organism: Vibrio cholerae # 324 914 1 590 899 710 60.0 0 MKKTISQVVSERILILDGAMGTMIQQYNLKEEDFRGERFAHIPGQLKGNNDLLCLTRPDV IQDIHRKYLEAGADIIETNTFSSTTVSMADYHVEEYVREMNLAAVKLARDLADEYTAKNP DKPRFVAGSVGPTNKTCSMSPDVNNPAYRALSYDELAASYQQQMEAMLEGGVDAILIETI FDTLNAKAAIFAAEQAMKATGVEVPVMLSVTVSDIGGRTLSGQTLDAFLASMQHANIFSV GLNCSFGARQLKPFLEQLAARAPYYISAYPNAGLPNSLGKYDQTPADMAHEVREYIEEGL INIIGGCCGTTDAYIAEYPALVKGAKPHIPALAPDCMWLSGLELLEVKPEINFVNVGERC NVAGSRKFLRLINEKKYDEALSIARQQVEDGALVIDVNMDDGLLDAKTEMTTFLNLIMSE PEIARVPVMIDSSKWEVIEAGLKCLQGKSIVNSISLKEGEEVFLEHARIIRQYGAATVVM AFDEKGQADTAARKIEVCERAYRLLVDKVGFNPHDIIFDPNVLAVATGIEEHNNYAVDFI EATAWIKKNLPGAHISGGVSNLSFSFRGNNYIREAMHAVFLYHAIQQGMDMGIVNPGTSV LYSDIPTDVLEKIEDVVLNRRPDAAERLIELAESLKATMSGTAGQPAAKQDAWREESVQE RLKYALMKGIGDFLEQDLAEALPLYDKAVDVIEGPLMDGMNYVGELFGAGKMFLPQVVKT ARTMKKAVAILQPIIESEKVEGSAAAGKVLLATVKGDVHDIGKNIVAVVMACNGYDIVDL 
GVMVPAETIVQRAIEEKVDMIGLSGLITPSLEEMAHVALELEKAGLDIPLLIGGATTSKM HTALKIAPVYHAPVVHLKDASQNASVASKLLNPQLKAELVNELNSEYEALREKSGLLKRE TVSLEEAQKNKLNLF >gi|222159328|gb|ACAB01000031.1| GENE 89 125857 - 126459 596 200 aa, chain + ## HITS:1 COG:MA0330 KEGG:ns NR:ns ## COG: MA0330 COG0778 # Protein_GI_number: 20089228 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Methanosarcina acetivorans str.C2A # 38 200 9 174 179 161 45.0 8e-40 MKKIFSFLCLIAAIVVAMSACSSTKEEKGTSETGNAALDNIFARKSVRTYLNKGVEKEKI DLMLRAGMAAPSGKDVRPWEFIVVSDRAKLDSMAAALPYAKMLTQARNAIIVCGDSVRSS YWYLDCSAAAQNILLAAESLGLGAVWTAAYPYEDRMQVVRKYTNLPDNILPLCVIPFGYP ATKENPKQKFDEKKIHYNQY >gi|222159328|gb|ACAB01000031.1| GENE 90 126528 - 126653 121 41 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|153807903|ref|ZP_01960571.1| ## NR: gi|153807903|ref|ZP_01960571.1| hypothetical protein BACCAC_02189 [Bacteroides caccae ATCC 43185] # 1 41 1 41 41 78 100.0 1e-13 MFGIDDPFIILPYLLSVVCVIFAAWFGLKYWNKDDEKDETR >gi|222159328|gb|ACAB01000031.1| GENE 91 126650 - 128200 1428 516 aa, chain + ## HITS:1 COG:MTH1856 KEGG:ns NR:ns ## COG: MTH1856 COG0591 # Protein_GI_number: 15679844 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Methanothermobacter thermautotrophicus # 1 513 1 512 526 484 49.0 1e-136 MNTFTLGLIVIGYLLSLAYLGFLGYKKTTNTSDYLVGGRQMNPIVMALSYGATFISASAI VGFGGVAAAFGMGIQWLCFLNMFIGVVIAFIFFGLRTRRMGAKLNVSTFPQLLGRHFRSR NIQVFIAAVIFVGMPLYAAVVMKGGAVFIEQIFQIDFNISLLIFTLVIAAYVIAGGMKGV MYTDALQAVIMFGCMLFLLFSLYQVLGMGFTEANKELTAIAPLVPEKFKALGHQGWTAMP VTGSPQWYSLVTSLILGVGIGCLAQPQLVVRFMTVESSKQLNRGVFIGCFFLIITVGAIY HAGALSNLFFLKTEGAVATEVVQDIDKIIPYFINKAMPDWFAALFMLCILSASMSTLSSQ FHTMGASVGSDIYGTYKPRSRNKLTNVIRLGVLFSILVSYIICYMLPHDIIARGTSIFMG ICAAAFLPAYFCALYWKKATKQGVMASLWVGTIGSLFALVFLHQKESAALGICKALFGRD VLITTYPFPVIDPILFALPLSVLAIIVISLMTKNKY >gi|222159328|gb|ACAB01000031.1| GENE 92 128266 - 129657 1373 463 aa, chain - ## HITS:1 COG:VC0866 KEGG:ns NR:ns ## COG: 
VC0866 COG4623 # Protein_GI_number: 15640882 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted soluble lytic transglycosylase fused to an ABC-type amino acid-binding protein # Organism: Vibrio cholerae # 31 456 30 460 530 168 28.0 2e-41 MNVKTLIIPLFLVLFCCVFGCRNKQHSTTDESARDLPQIKDSGELVVLTLYSSTSYFIYR GQEMGFQYELSEQFAKSLGLKLRIEVAKSVDEMIQKLRAGEGDMIAYNLPITKEWKDSLL YCGEDVITHQVIVQQGRGKQKPLKDVTELVGKDIYVKPGKYYDRLVNLNSELGGGIRIHE VTNDSTTIEDLITQVAQGKIPYTVADNDLAKLNKTYYPNLNIDLSISFDQRSSWAVRKDS PELAAAATKWHQENMTSPAYTASMKRYFENSKMMPHSPILSLKEGKISHYDDLFRKYSKD IGWDWRMLASLAYTESNFDTTAVSWAGAKGLMQLMPATARAMGVPPGKEQNPEESVKAAI KYIAATDRSFSMIPDKQERLNFILASYNAGLGHIYDAMALAEKYGKNKLVWKDNVENFIL LKSNEEYFTDPVCKNGYFRGIETYNFVRDIMSRYESYKKKIKA >gi|222159328|gb|ACAB01000031.1| GENE 93 129702 - 130355 560 217 aa, chain - ## HITS:1 COG:BH1275 KEGG:ns NR:ns ## COG: BH1275 COG0572 # Protein_GI_number: 15613838 # Func_class: F Nucleotide transport and metabolism # Function: Uridine kinase # Organism: Bacillus halodurans # 13 212 3 202 211 221 54.0 9e-58 MQKYSIWFKYTFKEMLIIGIAGGTGSGKTTVVRKIIESLPAGEVVLLPQDSYYKDSSHVP VEERQNINFDHPDAFEWSLLSKHVMMLKEGNSIEQPTYSYLTCTRQPETIHIEPREVVII EGILALCDKKLRNMMDLKIFVDADPDERLIRVIQRDVIERGRTAEAVMERYTRVLKPMHL QFIEPCKRYADLIVPEGGSNKVAIDILTMYIKKHLKS >gi|222159328|gb|ACAB01000031.1| GENE 94 130422 - 130721 433 99 aa, chain + ## HITS:1 COG:no KEGG:BF3194 NR:ns ## KEGG: BF3194 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 99 1 99 99 172 89.0 3e-42 MVKHIVLFKLKDEVPAEEKLVVMTKFKEAIEALPAKISVIRKVEVGLNMNPGETWNIALY SEFDTLEDVKFYATHPDHVAAGKILAETKESRACVDYEL >gi|222159328|gb|ACAB01000031.1| GENE 95 130737 - 132815 1777 692 aa, chain + ## HITS:1 COG:HI0885 KEGG:ns NR:ns ## COG: HI0885 COG4232 # Protein_GI_number: 16272825 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol:disulfide interchange protein # Organism: Haemophilus influenzae # 247 605 182 521 
579 97 27.0 1e-19 MRKIISFLLLSFVVYALQAQIKDPVKFKTELTPLSDTEAEVVFTAAIDKGWHVYSTDLGD GGPISATFNVDNKSGVELVGKLKPVGKEVATFDKLFEMKVRYFENTAKFVQKVKFTGGAY AIEGYLEYGACDDESCLPPTQVPFKFSGVAKAGNAAATKTEQSKAEQPEQKVVDKADKKE EATSVASKDSSAMMELVPATTTEAATDIQPAVASSELWKPVISDLQALGEEHGQEDMSWI YIFITGFLGGLLALFTPCVWPIIPMTVSFFLKRSKDKKKGIRDAWTYGASIVVIYVALGL AITLIFGASALNALSTNAIFNILFFLMLVIFAASFFGAFEIRLPSKWGNAVDSKAESTTG LLSIFLMAFTLSLVSFSCTGPIIGFLLVQVSTTGSVVAPAIGMLGFAIALALPFTLFALF PSWLKSMPKSGGWMNVIKVTLGFLELAFALKFLSVADLAYGWRLLDRETFLALWIVIFAL LGFYLLGKIKFPHDDDDNKVGVTRFFMALISLAFAVYMVPGLWGAPLKAVSAFAPPMQTQ DFNLYKNEVHAKFDDYDLGMEYARLNGKPVMLDFTGYGCVNCRKMEAAVWTDPKVSDLIN NDYVLITLYVDNKTPLTEPVKIIENGTERTLRTVGDKWSYLQRVKFGANAQPFYVLLDNQ GKPLNKSYAYNEDIPKYIEFLQTGLENYKKEK >gi|222159328|gb|ACAB01000031.1| GENE 96 132933 - 135173 1376 746 aa, chain - ## HITS:1 COG:YPO0616 KEGG:ns NR:ns ## COG: YPO0616 COG1472 # Protein_GI_number: 16120942 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Yersinia pestis # 28 745 12 720 727 572 41.0 1e-163 MYMKKILCLLYLLSIFCFSHAQNENTYLEQKIDSTLSGMTIREKAGQLNQLDGRGTIENL KILIRKGEIGSVMNVTEPEIVNELQEIAYKQSRTGIPLVFTRDVVHGFKTMLPIPLGQAA TFHPELIQKGARIAAIEATEHGVRWSFAPMIDISRDARWGRIAESFGEDTYLTEQMAVAV VNGFQGDNLSNPQSMAACAKHFIGYGTVEGGRDYNSTHIPERQLRDVYLPPFEKAVKANC SSIMTSFNDNDGIPATGNKKLLKGILRKEWKFDGVVVSDWGSVTEMIKHGFAEDRKDAAR KAIEAGLDMDMSSKAFIQNIEELIAKGIITEETLDNAVRNVLRLKFRLGLFDNPYTDINK KKETYSDKHLAIAKKIAEESVVLLKNENRTLPLSPKIKSILIVGPLSDAPHDQLGTWTMD GETERTQTPVKALREMYGDKVEIHFVKGLEYSRDKNKMNFNKVLAKASQVDVIIAFIGEE AILSGEAHCLADIKLQGAQSELIKILSGTNKPLITVIMAGRPLIINEELNLSDAVLYAWH PGTMGGNALADILFGKTTPSGKLPVTFPKATGQIPIYYNHTNTGRPATGKEKSLDEIPLN AKQSVLGHSSYYLDLGAQPLFPFGYGLSYTSFEYSDLKTDHTVLTPNDTLSISVKVKNTG QYKGTEVVQLYVSDLFGSVTRPVKELKGFKRIELSPNEEKIVVFELSSYELSFWNINMKK EVEPGKFKIRIGTDSQSGLETFFEIK >gi|222159328|gb|ACAB01000031.1| GENE 97 135181 - 137184 1218 667 aa, chain - ## HITS:1 COG:no KEGG:Dfer_4710 NR:ns ## KEGG: Dfer_4710 # Name: not_defined # Def: xanthan lyase # 
Organism: D.fermentans # Pathway: not_defined # 28 665 39 677 680 790 58.0 0 MKPDMKNLILLLTVLTFLFSCKSNPKEEYDICIYGGTSAGVIAAYSAKMLDKKVLLIEPQ SRLGGLTSGGLGFTDIGNKQVVTGLSKDFYRRLGAYYGKLEQWIFEPKVADSLFNDYIKR ADVKVLYKYRITDVQLANGYIKNITLESSDGTKLGKSIAAKVFIDCTYEGDLMAKAGVSY IIGREDNKQYGETYNGVQLMKGHQFPDGVDPYKIKGDSTSGLLWGISPAALSSDGTGDKL VQAYNYRICLTDNPANKIEITRPENYDSTKYELLLRLFDAQPNKRKLNHYFIWSRMPNNK TDINNRGGFSTDMIGMNHNYPEASYEERAEIIKAHKDYTQGLLYFYKTDPRVPQELRDEI QAWGYPKDEYTEDNHWSPQLYIRESRRMTGDYVMTQAHCEGRETVTDGIGMAAYTMDSHN CQRLLVKKDGKYIVKNEGNVEIGGGLPYPISYRSIIPKEEECKNLLVPVCLSASHIAYGS IRMEPVFMVLAQSAAIAAAEAINTGSVQTVDIKKVQALLHENPLLDDSFSEILIDDSELD LSINNDWEVIKKQGGYGPTFLKSKVRNGSPVRFSPHMEHEGKYKVYTYYHMRKDISPAIT YSISNGIDSWTRVIHKDSVRIEGQTTGEWIELGTYNFQKSSMPYIEISTGDTSGAVIADA VLFIPVK >gi|222159328|gb|ACAB01000031.1| GENE 98 137181 - 139496 1516 771 aa, chain - ## HITS:1 COG:TM0076 KEGG:ns NR:ns ## COG: TM0076 COG1472 # Protein_GI_number: 15642851 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 28 760 3 749 778 520 38.0 1e-147 MKIKLLFILLITFQPLVFGANDIPERPLYLDPSQSVQTRVENLMSLMTLKEKVAQMCQYV GLEHMRDAEKNITEEELLNGHARGFYKGLHSTGVERMVTQGEIGSFLHVLTPAEANHLQK LAEKSRLKIPLLIGIDAIHGNGLVSGSTIYPSPIGMASTFAPDLIEQASRQTALEMRVTG SHWAFTPNIEIACDARWGRVGETFGEDPYLVSRMGVASIKGLQTDNLTGLNTVLACAKHL VAGGIANNGTNAGPVELSEGKLRNFFLPPFKAAIQEAKPFTLMPAHNELNGIPCHANKWL MTDIMRNEYGFDGFIVSDWMDMEAISTRHRISENTTDAFFLSVDGGVDMHMHGPVFFDAI LKLIKEGKLTEERVNKACAKILEAKFRLGLFENRYVTEAGIKKTVFTKKHQQTALEIARR SIVLLKNESLLPVDTRKFKKILVTGPNANNQSIMGDWVFEQPEKNVSTILEGIKEEASGT QINYVDVGWNMRALDSAKIEEAIQTAKSSDLAIVIVGEDSFRQHWKEKTCGENRDRMDIT LWGKQDYLVESIYKTGVPTIVILINGRPLATRWIAENIPAVIEAWEPGSMGGKAIAEILF GKVNPSGKLPITIPRHVGQISTVYNHKPSQFLHPYIDGDKTPLYPFGYGLSYTSFKYDQL KVNKADYHANEEIEITVNVSNTGERQGEEVVQLYIRDDYSNTTRPVKELKRFQRILLEKG ENKVVSFRLNKEDLSYYNHKAEYVLEPGTFTVMVGGSSLDKDLQKVKFNVK >gi|222159328|gb|ACAB01000031.1| GENE 99 139583 - 140797 954 404 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|237715895|ref|ZP_04546376.1| ## NR: gi|237715895|ref|ZP_04546376.1| predicted protein [Bacteroides sp. D1] # 1 404 1 404 404 718 100.0 0 MKLLKYNLGILLCLTAIFFIACDSDDETIQQNPKPTIKFTKVGDEPVIAYPGTELSFSVE MTGTAGIKKVVTMLDSQEIPGSAKEYPGNTDKDSYSVSYTIKSEEVGKTLNFVILATDNE EKKSTAEYVVYIQAAKPEIEIKIPDTAPETVTAGEVVTFDIEITSATTLRSIKTYLEGLE ITDLTKETFENPNSDTYVFSYTTTDLNAGQTLSFTFEVMDANGGIVRSEYSIEVTRAVEL DINEFYTLKIGAQASTDAGPFLNTTNGEIYVRDGAAAKSANIDITLFYSNGSYGYYFVSP SDASIEAIFKAPDAITTWEHRNSTKLKVIQMTPDEFLAINSAEMIQNLYTNSSEAEVEKL TNKLGVGSIIGFKTVADKYGIIIVRSFASGSSKGNVTIDMKVEK >gi|222159328|gb|ACAB01000031.1| GENE 100 140812 - 142167 1116 451 aa, chain - ## HITS:1 COG:no KEGG:XCC0740 NR:ns ## KEGG: XCC0740 # Name: not_defined # Def: hypothetical protein # Organism: X.campestris # Pathway: not_defined # 44 451 27 430 432 255 36.0 3e-66 MKLKYIITSVFVLATVLNVACSDRDNYTIPKGSFEEKPIVPPVKVETSKVDGKWRLLADG QELYVKGAACNNFYAEAADFGANVVRTYGVSDKSKAILDAAQEKGLYVNFGLYIKRETDG FDYNNAAAVKAQFDEMKATVERFKDHPALLVWSIGNEAEASYTNLKLWDAINDIAKMIHE TDPNHPTTTTLASSNVNHIKNIIEKAPHIDILSVNTYAPNLPGVLGNLQSAGWTKPYMIT EFGPRGTWQMNPEPERVLPWGGLVEQTSSEKEADYLKAYQENIAVNKDNGCLGSFVFLWG YQTHGEVLTWYGLFDKKGYTFPAVDAMQYAWTGSYPKNRAPVIATRNDILMNGKKAEDAI IVSPNSSNEAKVTATDPDGDALTYDWMIMKEKTASSDGSLPDGITGLIDDNTKKEITFKA PSTVGNYRLIVFVRDVKNKKVASAVIPFSVQ >gi|222159328|gb|ACAB01000031.1| GENE 101 142172 - 143818 1122 548 aa, chain - ## HITS:1 COG:no KEGG:Dfer_1699 NR:ns ## KEGG: Dfer_1699 # Name: not_defined # Def: secreted protein-putative xanthan lyase related # Organism: D.fermentans # Pathway: not_defined # 1 544 1 548 550 448 43.0 1e-124 MRKKITILTLLFCSFVTGLFANTPQYDICIYGESASGVIAAIQGARMGKKVVLISKNDHV GGLVTSGLTATDMNRNDLIGGITKEFYNKIYNYYLQPEVWHNQDRESFMVSTLKRTYRGK NDERQIQWVYESSVAERIMRDMLKTAGVEILFNHRLDLNKNVRKEENIIRSIQLENGKVI EAKMFIDASYEGDLLARAGVSYTVGRESNLQYGETYNGIRLNYKQGKDLTKISPYIKEGN VKSGALPYVTDREWGKQGDADKRVQAYCYRMTLTNDPDNRISIQKPKNYNPLWYEIYARM LKLEPETKLQQIITLTPMPNKKTDTNHLDFFGASYNYAEADYKTRQEIEQLHKDYALGML WFLGHDKRVPEHIRKEMLDWGLPKDEFTDTQNFPYQIYVREARRMIGAFVMTEKNVRKTD 
RTPVEHSVGLGSYALDCHYVSRVIDQEGKLRNEGTIFAPTIPYSISYYSLTPKEEECANL LATVCLSSSHVAYSTIRMEPVYMILGQSAATAAALAIDSNLPVQKISYDVLKYKLLQDGQ LLSVPTKK >gi|222159328|gb|ACAB01000031.1| GENE 102 143852 - 145438 1345 528 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_2078 NR:ns ## KEGG: Fjoh_2078 # Name: not_defined # Def: RagB/SusD domain-containing protein # Organism: F.johnsoniae # Pathway: not_defined # 3 527 2 529 530 230 33.0 1e-58 MKKSILFIFTACFLFTTSCSDFLDEEHKTKYSSEYVFGTPEGLKLAVNALYALQRYYAND TENATIFALERGTDLAVTNGGTGNFYGIYDPNYLKPSASQVGFMWRTMYQIIGKANEIIA AAEDLEDTPSLRATVSEAKCFRAQSYFLLYRTFDRIWLNIQPTTAENVNDPRDFHAASEK EVFDLIYEDLEYAITNLDWVSDEAGRFTQAAARHMKAKAALWLKDWDTTLEQVEEIEKSG HFDLIALNEVFNAGDLNHKEALMVQQWSKNPGGNLSNATPKGNYYAAYFIAQYRTEIGGT AEYACSYDNWGYTYGRCLPSPYLFSLYDKAKDKRYQEYYIHQYKNTTDKNITYGSATVKP GDYFPLYKNGSINKNVYPGCIKYGDKWTRTASETRSYKDVIVYRLAESYIMAAEAALMKN DQTLAKYYFNKTWERAGNDKFTGVLTMKDIMDEQARELSFEGDRWYFLKRLGILIEQIKA YAGDPEIPASILGRNNLPANPHFVRWPIPEAEVINMGAENFPQNIGYK >gi|222159328|gb|ACAB01000031.1| GENE 103 145451 - 148024 1838 857 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_2077 NR:ns ## KEGG: Fjoh_2077 # Name: not_defined # Def: TonB-dependent receptor # Organism: F.johnsoniae # Pathway: not_defined # 5 857 147 1008 1008 638 41.0 0 NSTPSSSLGEMLRGQAAGVQVTMSNAAPGGSSNILIRGRRSLSGGNDPLYIVDGVPMTSI DDINSNDIASLEVLKDASSQSIYGARAANGVILITTKRGQTGKMKISYNTYAASQSIHKN FEFYNGEEWAALRKEAYYNANLSYDETDCFRGLMLDVFKSGEYVDWEKLMISSAWQQKHD ILIQSGGDKTKYALGLGYFDQNGMVPNSGFQRLSGRLNIDHKLLKNLTIGTNFLYTKSWK KTADGSFNSFITMPPLAKVYNDDGSLREDVTEAGESHYNPLWNIDYSNNKSQTDRLLINF FVDWKITKDLSYRANGSLNTRTVHSNTYQGTKHTTGRNNNGKATAGTSFSNDYLFENIVN YVKDFNKNHHFDATFMQSVNVIEWKNLGINGTGFANDDLTYNAIGSANEYGTPTWELSDR KLLSFLGRVRYNLFEKYLFTFALRVDGSSVFGKNNKYGYFPSGAFAWRINEESFLKEAKW LSNLKLRLSYGAVGNQGVTPYKSLGLTDRYLTEFGDKTIIGYLPGTELTNPNLKWETSTS GNIGLDFGFFNGRINGTIEYYNTKTTDLLVTKSIPSSLGYSTQTVNLAEMKNNGIEITLN TTPVKIKDFRWDVNFTFTKNKNEIKKIDGQVDENGKPLDDVNNKWFVGYPMNVYYDYVFD GIWQKDDDIANSHMPTATPGSIKLRDVNNDNQITADDRVVMQRDPKWIGTVGTSFNYKGF 
DLSADLYISHGGTIYNPYLTTFENGGDLTAKRNGIRRNYWTQNNPSNEAPAPNMTQAPAY ISSLGYQDASYVRLRNVTFGYNFPRALISKAYMQSLRLYMTLSNFWTKTDVQAYGPEQTP GDYPEPRTVLFGLNVTF Prediction of potential genes in microbial genomes Time: Wed May 18 01:52:09 2011 Seq name: gi|222159327|gb|ACAB01000032.1| Bacteroides sp. D1 cont1.32, whole genome shotgun sequence Length of sequence - 16737 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 7, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 695 480 ## BT_0190 hypothetical protein - Prom 877 - 936 6.3 - Term 797 - 831 -0.0 2 2 Op 1 6/0.000 - CDS 964 - 1941 748 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 1961 - 2020 8.3 3 2 Op 2 . - CDS 2068 - 2670 421 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Prom 2679 - 2738 6.0 4 3 Tu 1 . + CDS 2845 - 3498 460 ## BT_0197 hypothetical protein + Term 3669 - 3707 2.5 + Prom 3604 - 3663 8.2 5 4 Op 1 . + CDS 3823 - 5727 1252 ## BT_0198 hypothetical protein 6 4 Op 2 . + CDS 5801 - 6292 367 ## BT_0199 hypothetical protein + Term 6470 - 6503 4.5 + Prom 6447 - 6506 9.8 7 5 Op 1 18/0.000 + CDS 6538 - 7389 1050 ## COG0040 ATP phosphoribosyltransferase 8 5 Op 2 19/0.000 + CDS 7492 - 8781 1258 ## COG0141 Histidinol dehydrogenase 9 5 Op 3 13/0.000 + CDS 8798 - 9838 950 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 10 5 Op 4 . + CDS 9842 - 10966 1076 ## COG0131 Imidazoleglycerol-phosphate dehydratase 11 5 Op 5 . + CDS 10993 - 11556 414 ## COG2365 Protein tyrosine/serine phosphatase + Term 11659 - 11708 3.2 - Term 11645 - 11694 3.2 12 6 Tu 1 . - CDS 11740 - 13665 1571 ## COG0171 NAD synthase + Prom 14218 - 14277 4.8 13 7 Tu 1 . 
+ CDS 14311 - 16735 1805 ## BT_0206 hypothetical protein Predicted protein(s) >gi|222159327|gb|ACAB01000032.1| GENE 1 2 - 695 480 231 aa, chain - ## HITS:1 COG:no KEGG:BT_0190 NR:ns ## KEGG: BT_0190 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 231 1 230 1146 213 53.0 4e-54 MNNLPNIKRKGRNVHSFLAFLLFVCLSSLPCYIQAQEKKNITLDVKNETVENVFNQLSKQ TGYKFFYDQEIVNAAPRISIKARNSSLENILSMITVQTNLYFNKKNNTISVGKQKSQETI KSTRTKTVNGTVTDQNGEPIIGANVLVKGTTNGIITDINGNYSLANVTEDATIQFSYIGY QTTEVKANSKELARIILKEDSELLDEVIVVGYGVQKRSDVTGAISSVTSEK >gi|222159327|gb|ACAB01000032.1| GENE 2 964 - 1941 748 325 aa, chain - ## HITS:1 COG:RSc2919 KEGG:ns NR:ns ## COG: RSc2919 COG3712 # Protein_GI_number: 17547638 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Ralstonia solanacearum # 122 322 67 268 274 79 29.0 9e-15 MKNTENNKKLYRRYLDDLYTTEDARQLLNSLHDPDNHETLNELSSDVWEETATQQPYTDL EREHYKREARQLLKHIEHKKRTWFHRIAVVTASTAAIVCLVLSGIHYLEHLNEQQIIYLE ASTSYGERKQLLLPDGTQLTLNSCSHVRYPNNFTGEERKIVLEGEGYFQVHRNEEQPFIV STRRFDVRVLGTCFDIKSYSSDEIVSVEVESGKVQVDLPEAMMRLKGKEQVLINTISGEY SKRREERPVAIWKKGGLRFNSTPIRDVAKELERMYNCHITFANGKFNNLISGEHDNKSLE AVLQSIEYTSGIRYKKEGNHILLYK >gi|222159327|gb|ACAB01000032.1| GENE 3 2068 - 2670 421 200 aa, chain - ## HITS:1 COG:RSc2361 KEGG:ns NR:ns ## COG: RSc2361 COG1595 # Protein_GI_number: 17547080 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Ralstonia solanacearum # 3 181 22 210 213 77 28.0 1e-14 MCATNIPGNERSLVIRLIAGDEDAFCELYAAYKNRLIYFAMRFLKSREYAEDIFQDAFTT IWESRRFINPNTSFSSYLYTIMRNRILNHLRELASEDRLKEQILSQAIDYSNETNNEIIA NDLQRLISHALEQLTSRQREIFEMSREKQMSHKEIAEALEISVNTVQEHISSSLRSLRTY LEKHSVTGTDLILLLICLNL >gi|222159327|gb|ACAB01000032.1| GENE 4 2845 - 3498 460 217 aa, chain + ## HITS:1 COG:no KEGG:BT_0197 NR:ns ## KEGG: BT_0197 # Name: not_defined # Def: hypothetical protein # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 19 217 1 199 199 309 75.0 6e-83 MSRIILNTKIWLFVLLFGMSFPAWAQSDDFNTWTKFKVNYKIDSRFSVSGDLELRMKDDV SRLDRWGLTVGGSYRPCSFLNLGVGYETHLRNLGDSDWKLRHRYHISATANFRYQWLKVS LRERFQQTFDRGDSETRLRSRLKLSYAPTKGIVSPYFSVEIYQSLDDTSFWRADRMRYRP GVEIALAKRWSLDAFYCYQYASSQGRHIAGIEVGYSF >gi|222159327|gb|ACAB01000032.1| GENE 5 3823 - 5727 1252 634 aa, chain + ## HITS:1 COG:no KEGG:BT_0198 NR:ns ## KEGG: BT_0198 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 634 19 634 634 729 57.0 0 MKTCHNLFVGVGFFILCCTMFTACVNHISEEEGEIINNGDIPLKFVADIHEIMNTRVVNN SFGEKDEVGLFALAGTTTMQEERYADNLHFVRSSTGEFVSDESVYYPDDGVTLNLISYYP YQNSGVAMGESSMQVTVATTQDKLDDYSHSDFLVASKKEVLASKDAVALTYNHQFFRMKI VLVPGEGENIEEILSVKPTLSVSGFYTKTIYDFQKKTFSAYSEEKDITPAGEWEIKDGRL VGKELILIPQEATVGYQYITLEAAGKLYTSLLPSTLQLESGKQRELEITFVSAEDILMSK VNGEIGDWDGTEVDHTESGILHKYIDVSKLTFEKSNVYKVIHSGKQVAEICKEYLVTPDF SSQAIVAYPMKEDGSVNLSQGIVAQLLGKSGKVNGGSVSWNMEDHSLTYVDGTLLARHNV YVLADGTISLSVTLADDVLPVLAQEDIVRDVRGGVIHNYPLVKIGTQYWMRSNLETSLYV NGDALPKLNQVTANIAGYLQSTTERYFYTANVALSGRILPTHWSIPNWEDWNILKDYLKG EASLLKSGTWLSLKTEEQVQPATNLSGFNGIPVGMYVGAFQADYENKHLAYWTLDNTNAT IDTKVFYMKSDTNIIEESNAGIDTKAFAIRCIRK >gi|222159327|gb|ACAB01000032.1| GENE 6 5801 - 6292 367 163 aa, chain + ## HITS:1 COG:no KEGG:BT_0199 NR:ns ## KEGG: BT_0199 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 163 1 163 163 318 93.0 5e-86 MKKIINPWKGLEGYNCFGCAPNNEAGVKMEFYEDGDEVVSIWKPRPEYQGWIDTLHGGIQ AVLMDEICAWVVLRKLQTTGVTSKMETRYRKSVDTKDSHIVLRASIKEVKRNIVIVEAKL YNEDGEVCTESVCTYFTFSKEKSKDEMHFTKCDVEPEEILPLI >gi|222159327|gb|ACAB01000032.1| GENE 7 6538 - 7389 1050 283 aa, chain + ## HITS:1 COG:PM1195 KEGG:ns NR:ns ## COG: PM1195 COG0040 # Protein_GI_number: 15603060 # Func_class: E Amino acid transport and metabolism # Function: ATP phosphoribosyltransferase # Organism: Pasteurella multocida # 2 282 7 298 299 246 45.0 4e-65 
MLRIAVQAKGRLFEETMALLGESDIKLSTTKRTLLVQSSNFPIEVLFLRDDDIPQTVATG VADLGIVGENEFMEKEEDAEIIKRLGFSKCRLSLAMPKDIEYPGLSWFNGKKIATSYPVI LRNFLKKNGVNAEIHVITGSVEVSPGIGLADAIFDIVSSGSTLVSNRLKEVEVVMKSEAL LIGNKNMSDEKKEVLEELLFRMNAVKTAEDKKYVLMNAPKDKLEEIIAVLPGMKSPTVMP LAQEGWCSVHTVLDEKRFWEIIGKLKGLGAEGILVLPIEKMIV >gi|222159327|gb|ACAB01000032.1| GENE 8 7492 - 8781 1258 429 aa, chain + ## HITS:1 COG:hisD KEGG:ns NR:ns ## COG: hisD COG0141 # Protein_GI_number: 16129961 # Func_class: E Amino acid transport and metabolism # Function: Histidinol dehydrogenase # Organism: Escherichia coli K12 # 8 427 13 431 434 429 56.0 1e-120 MKLIKYPSKEQWTELLKRPALNTENLFDTVRSIINKVRAEGDKAVLEYEATFDKVTLSAL AVTPEEIQVAGTLVSDELKAAISLAKQNIETFHASQRFIGKKVETMNGVTCWQKSVGIEK VGLYIPGGTAPLFSTVLMLAVPAKIAGCKEIVLCTPPDKNGNIHPAILFAAQLAGVSKIF KAGGVQAIAAMAYGTESVPKVYKIFGPGNQYVTAAKQLVSLRDVAIDMPAGPSEVEVLAD ASANPVFVAADLLSQAEHGIDSQAILITTSEKLQTEVMAEVERQLAELPRREIAAKSLEN SKLILVKDLDEALELTNAYAPEHLIIETENYMEVAERVTNAGSVFLGSLTPESAGDYASG TNHTLPTNGYAKAYSGVSLDSFIRKITFQEILPEGIKAIGPAIEEMAANEHLDAHKNAVT VRLKAIQNS >gi|222159327|gb|ACAB01000032.1| GENE 9 8798 - 9838 950 346 aa, chain + ## HITS:1 COG:YIL116w KEGG:ns NR:ns ## COG: YIL116w COG0079 # Protein_GI_number: 6322075 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Saccharomyces cerevisiae # 4 343 5 377 385 228 38.0 1e-59 MKTLQELTRPNIWKLKPYSSARDEYKGVTASVFLDANENPYNTPHNRYPDPMQCELKTLL SKIKKISPEHIFLGNGSDEAIDLVFRAFCEPGKDNVVAIDPTYGMYQVCADVNDVEYRKV LLDDDFQFSADKLLAATDEHTKLIFLCSPNNPTGNDLLRSEIEKILCQFEGLVMLDEAYN DFSKAPSFLEELDKYPNLVVFQTFSKAWGCAAIRLGMAFASKEIIDILSKIKYPYNVNQL TQQQAIAMLHKHYEIERWVKTLKEERDYLEAEFEKLPCTIKLFPSDANFFLAKVTDAVKI YNYLVGEGIIVRNRHNISLCCNCLRVTVGTRVENNTLLATLKKYPG >gi|222159327|gb|ACAB01000032.1| GENE 10 9842 - 10966 1076 374 aa, chain + ## HITS:1 COG:VC1135_2 KEGG:ns NR:ns ## COG: VC1135_2 COG0131 # Protein_GI_number: 15641148 # Func_class: E Amino acid transport and metabolism # Function: 
Imidazoleglycerol-phosphate dehydratase # Organism: Vibrio cholerae # 175 374 3 200 200 222 53.0 9e-58 MKKKVLFIDRDGTLVIEPPVDYQLDSLEKLEFYPKVFRNLGFIRSKLDFEFVMVTNQDGL GTSSFPEETFWPAHNLMLKTLEGEGITFDEILIDRSFPEDNAPTRKPRTGMLTKYLNNPE YDLAGSFVIGDRPTDVELAKNMGCRAIYLQNSPETLKEKGLEEVCALATTDWDQIAEFLF AGERKAEVRRTTKETDIDVTLNLDGNGACDISTGLGFFDHMLEQIGKHSGMDLTIRVKGD LEVDEHHTIEDTAIALGECIYQALGSKRGIERYGYALPMDDCLCQVCLDFGGRPWLVWDA EFKREKIGEMPTEMFLHFFKSLSDAAKMNLNIKAEGQNEHHKIEGIFKALARALKMAIKR DIYHFELPSSKGVL >gi|222159327|gb|ACAB01000032.1| GENE 11 10993 - 11556 414 187 aa, chain + ## HITS:1 COG:PA3885 KEGG:ns NR:ns ## COG: PA3885 COG2365 # Protein_GI_number: 15599080 # Func_class: T Signal transduction mechanisms # Function: Protein tyrosine/serine phosphatase # Organism: Pseudomonas aeruginosa # 38 178 48 189 218 81 28.0 7e-16 MQKRILLSLLLGVIFSISIFSQNLKVEKITLPDSELTNLYKIDSGVYRSEQPSHEDFKAL EKYGIGEALNLRNRHSDDDEAAGTNVKLHRVKTKAHSINEEQLIEALRIIKNRKAPIVIH CHHGSDRTGAVCALYRVVFQNVSKEDAIHEMTEGGFGFHRIYKNIIRRIKEADIEQIRRK VMCTEGF >gi|222159327|gb|ACAB01000032.1| GENE 12 11740 - 13665 1571 641 aa, chain - ## HITS:1 COG:CAC1050_2 KEGG:ns NR:ns ## COG: CAC1050_2 COG0171 # Protein_GI_number: 15894337 # Func_class: H Coenzyme transport and metabolism # Function: NAD synthase # Organism: Clostridium acetobutylicum # 326 634 1 309 310 457 66.0 1e-128 MNYGFVKVAAAVPHVKVADCKFNVEKIESLITVAEGKGVQIIIFPEMSITGYTCGDLFGQ QLLLEEAEMGLMQILNNTRQLDIISIVGMPVVVNSTVINAAVVIQKGKVLGVAAKTYLPN YKEFYEQRWFTSALQLTTNNVRLCGQIVPIGANLLFETSDTTFGIEICEDLWSTIPPSSS LALQGAEILFNMSADNEGIGKNNYLCSLISQQSARCIAGYVFSSCGFGESTTDVVFAGNG LIYENGSLLARSERFSMKEQLIISEIDVERIRAERRINTTFAANQANLGDKKAVSIATEF VNSKELTLTRKFNAHPFVPQGIELNEHCEEVFSIQVAGLAQRLVHTGAKTAVVGISGGLD STLALLVCVKTFDKLGLSRKGILGITMPGFGTTDRTYHNAIDLMKSLGISIREISIKDAC IQHFKDIEHDVNVHDVTYENSQARERTQILMDVANQTWGMVIGTGDLSELALGWATYNGD HMSMYGVNASVPKTLVKYLVQWVAENGMDENSKATLLDIVDTPISPELIPADENGEIKQK TEDLVGPYELHDFFLYYFLRFGFRPSKIFYLARTTFKDTYDEETIKKWLSTFFRRFFNQQ FKRSCLPDGPKVGSISISPRGDWRMPSDANSAMWLKEIENL 
>gi|222159327|gb|ACAB01000032.1| GENE 13 14311 - 16735 1805 808 aa, chain + ## HITS:1 COG:no KEGG:BT_0206 NR:ns ## KEGG: BT_0206 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 808 1 807 980 1087 68.0 0 MRKSILLFVLFTLTSIPLLLFAQGGYQVTGHIISAEDNQPMIGVSVLEKGTTNGVITDMN GNYSITVTKSPATLQFSYVGMKTVDKQVTASTRINLTMENDAQMVDEVVVVAYGVRKKGT IAGSVSTVRAEKMENVPAASFDQALQGQAPGLMVMSGSGEPSVAASFQIRGINSLSSGTS PLFILDGVPVSSGDFNTLNPSDIESISVLKDASSTSIYGARAANGVVVITTKRGLALDKA KVTFRTQLGISQLAQDKWNQMNTEERILFEKEVGLDKGKDYDLLRKTDINWLDVVFNDKA MLQNYEVSVNRATDRLNYYVSGNFFDQDGIAQGSGFRRYNMRANADVKASNWLKVGTTST VSYEDIEQAQTGEYTSVTPISASHFMMPYWNPYNEDGSIASTKDDSWTGTNQNPLDWMRN NPVSYKKYKVLSTLYAEVNPIKGLTIKSQFAADYAHMTAFRQSFPSFSTNNGSGNAGRSS NDRLSLTITNTANYMFTLREKHSFNFLLGQEGVDAQSEGFSISMRGQNNDLLTNISNGTL AASWSDTAAGTLYSYSYLSFFGRGEYNYDGRYYVDFSLRTDASSRFGKDNRWGAFWSVGL MWNLKKEKFLNECKWLTTTRLALSTGTSGNSDIGYYAWQSLVKGGMDYMGETGIYPAQSG NPDLSWEKTWTTNLALHLGFWNRINLDVELYNKKTTDMLMDVPQSYAVNGSGSRWDNIGA MVNRGVEVMVNGDVIRTKDFSWNLSANFSYNKNKITELYNGVDEYEIASTNLKYVVGRSS TEFYINRYAGVNPANGDALWYTKDGEIT Prediction of potential genes in microbial genomes Time: Wed May 18 01:52:36 2011 Seq name: gi|222159326|gb|ACAB01000033.1| Bacteroides sp. D1 cont1.33, whole genome shotgun sequence Length of sequence - 2287 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 25 - 519 447 ## BT_0206 hypothetical protein 2 1 Op 2 . + CDS 538 - 2067 1043 ## BT_0207 hypothetical protein 3 1 Op 3 . 
+ CDS 2089 - 2287 189 ## BT_0208 hypothetical protein Predicted protein(s) >gi|222159326|gb|ACAB01000033.1| GENE 1 25 - 519 447 164 aa, chain + ## HITS:1 COG:no KEGG:BT_0206 NR:ns ## KEGG: BT_0206 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 164 817 980 980 257 71.0 1e-67 MTGKSFVAPWQGGFGTTLTWKGISLAAQFSWVADRWVFNNDRYLDEGNGLFETYNQSRRL LYDRWKKPGDVTDIPRYGVTPQMDSRFLEDASFLRLKNLMLSYNFPQALLKKTNFLTHLR VFVQGQNLLTFTKFSGLDPEGVSNMYQAQYPSTRQFTFGLEVSF >gi|222159326|gb|ACAB01000033.1| GENE 2 538 - 2067 1043 509 aa, chain + ## HITS:1 COG:no KEGG:BT_0207 NR:ns ## KEGG: BT_0207 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 509 1 511 511 625 61.0 1e-177 MKNKIKIYLLLIAVGIGSVSCLDKYPDDAIQAGQAIKTVDDANQAIIGIYAAFKDASLYS GRLTLLPDLQTDLVYAVTGYSNQYGDLWRWNILATNEDIEAVYGALYAIINRCNFLLDYI PGVQQSTTDDDALDKLEMIHGEALFARALCYSELIKLFCNSYESDGEAENELGVVLTSHY DSVDNTRRASLKDSYQFVLDDLELAAEYLKIDENSVSKDDLYSTTYINEYTVYALRARIA LYMKHYTEAIKYSTKVIDSGYYVLSSASIQYGSTGQSFYQYMWTNDASTEIIWKVGFTPT SYGGALGTIFFNYDYSTVRPDYVPAKWALELYDPYDLRFSIFFQTYLTGYPHALQWSLLQ KYWGNTTLMESYNIRHVSMPKPFRLSEQYLIRAEAYCFKNSPDYGSAGKDLGTLCAARYS SYSGGVSVSEKNALEVIEQERVKELYMEGFRLQDLKRWHKGFERKPQSESLDNGSSLKVE KDDPRFVWPIPQNELDAPGSDIQPNESNK >gi|222159326|gb|ACAB01000033.1| GENE 3 2089 - 2287 189 66 aa, chain + ## HITS:1 COG:no KEGG:BT_0208 NR:ns ## KEGG: BT_0208 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 66 1 66 299 63 46.0 2e-09 MKKIFIGLSAIFAFAFGFTGCDTDYVTYSGPDYIMFSDTLTVLPVQNNEEYFDIPVAATE ACSYDR Prediction of potential genes in microbial genomes Time: Wed May 18 01:52:52 2011 Seq name: gi|222159325|gb|ACAB01000034.1| Bacteroides sp. D1 cont1.34, whole genome shotgun sequence Length of sequence - 20493 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 7, operones - 5 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
+ CDS 3 - 716 500 ## BT_0208 hypothetical protein
2 1 Op 2 . + CDS 734 - 2290 1202 ## BT_0209 hypothetical protein
3 1 Op 3 . + CDS 2302 - 4995 1919 ## BT_0210 leucine-rich repeat-containing protein
4 1 Op 4 . + CDS 5016 - 6725 1287 ## BT_0211 hypothetical protein
5 1 Op 5 . + CDS 6753 - 8786 1426 ## COG1404 Subtilisin-like serine proteases
6 1 Op 6 . + CDS 8840 - 9610 314 ## BT_0213 hypothetical protein
7 1 Op 7 . + CDS 9643 - 10125 403 ## BT_0214 hypothetical protein
+ Term 10149 - 10199 6.2
8 2 Tu 1 . - CDS 10129 - 10275 108 ## gi|294808843|ref|ZP_06767572.1| hypothetical protein CW3_0919
- Prom 10413 - 10472 12.3
+ Prom 10435 - 10494 10.2
9 3 Op 1 . + CDS 10540 - 10968 370 ## COG0735 Fe2+/Zn2+ uptake regulation proteins
10 3 Op 2 1/0.500 + CDS 10991 - 11551 703 ## COG1592 Rubrerythrin
+ Term 11575 - 11615 7.1
+ Prom 11565 - 11624 5.6
11 4 Tu 1 . + CDS 11646 - 12281 428 ## COG0778 Nitroreductase
+ Term 12338 - 12382 6.1
- Term 12326 - 12370 5.3
12 5 Op 1 4/0.000 - CDS 12426 - 12890 529 ## COG0526 Thiol-disulfide isomerase and thioredoxins
13 5 Op 2 . - CDS 12908 - 13207 235 ## COG0526 Thiol-disulfide isomerase and thioredoxins
- Prom 13263 - 13322 5.5
+ Prom 13225 - 13284 4.9
14 6 Op 1 . + CDS 13377 - 14504 492 ## COG1819 Glycosyl transferases, related to UDP-glucuronosyltransferase
15 6 Op 2 . + CDS 14501 - 14683 251 ## BF3012 hypothetical protein
+ Term 14699 - 14766 3.0
- Term 14686 - 14754 4.1
16 7 Op 1 . - CDS 14770 - 16926 1972 ## BT_0236 hypothetical protein
17 7 Op 2 . - CDS 16927 - 19086 2401 ## BT_0237 hypothetical protein
18 7 Op 3 .
- CDS 19141 - 20385 1035 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) - Prom 20409 - 20468 7.1 Predicted protein(s) >gi|222159325|gb|ACAB01000034.1| GENE 1 3 - 716 500 237 aa, chain + ## HITS:1 COG:no KEGG:BT_0208 NR:ns ## KEGG: BT_0208 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 231 68 298 299 218 46.0 1e-55 LGVEILDKKTNAVEGRHFSLESNTVTIKAGERTANVRIRGIYDNIEVTDSLGMVLQLVSK EESKWELYGTETKVILEKVCPFDIHLFEGYALIVTSTYFGSYMKDVSQRLIRTKVDPNAE NTIIMKDYFYKGYDLKVKFTTNDLLNPLIEMEDQPFASTMEAFDTIYGDWEVWAYQSGYY LSYYSSCERFIFQYMTLHVPGMPAGKDEVGTYINVVKWVSDDEAQILIEEGVNNSLK >gi|222159325|gb|ACAB01000034.1| GENE 2 734 - 2290 1202 518 aa, chain + ## HITS:1 COG:no KEGG:BT_0209 NR:ns ## KEGG: BT_0209 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 513 3 516 650 280 34.0 1e-73 MRKFKQRNLVAHIRKFSLESCFTALFMALCISACSDDDDAAKVVFPEKQTVNCVYGDTKE LNFEANANWQLTSSATWCRFMNDGHEDYSLSGTAGKQTVTLKITDEVVAFDAPTVAHLTM MIGADKAIIADVVRDNKTRELKIYDMEGNEIQEIEVGYDDYKPFQVKANFRFAATNRPEW LEVAGNAIVGTVNEMTKGEVKVIDNPLYAKYVQNGTLTFADEDGVMSYSFPLVYKGMDPK KIKILNSNPWNWEVSLDGKTFIQTSSAGASASTTTSTYNKFIPYTVQALNDEVVPVYIQK VVEYGMVQMKIGEEDGVDWIKLEDDKKGNLRLKVNDSSEEREGYVLLLPQALYDEIKDEL WENLIEMDMETYEQDIKYTYQQNNLLINFVQKEKKQEVAQAFKVTYFDSSWNTQEAICTK VTDTDIINLYPDVSDIYTMEWPSAQGGVTIDPLEGDFELEWNFKVMRNGEDITSEKICEG SDTSLNAFIEGTLTEEFHILIEKDGEIKKVLIVTPNYN >gi|222159325|gb|ACAB01000034.1| GENE 3 2302 - 4995 1919 897 aa, chain + ## HITS:1 COG:no KEGG:BT_0210 NR:ns ## KEGG: BT_0210 # Name: not_defined # Def: leucine-rich repeat-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 897 5 904 904 878 52.0 0 MNKKLYSILFFVPLIVSLLAMTVVTGCSDDDEPLQSTYGYVQFKLLKSASMDKGTTTRAV TDKLDKLSDAQKVKVVMQYNGSTITQTLLLDSYNAENAEFGLRSDKLKLLTGTYTIIGYY LYDKLDNVLYAGPAGKDNVFTIIADGLHTQKLSVDAVSRGMVTFKLVKDFVKTRAAEDEA YPLSNILSIDITVKNLFTQELTTIEKIRVKYVEDFTDEPADGYGDRNQETSYAECDTVAW LKAGTYQISSYKTYSDEKANNSLEVATVQTSKSFVVKDNEVTKDAEVPIRLDETAEYIKD 
YIALKAIWEKMGGPSWKYYGESAPMGVNWNFNKDIDMWGNQPGVQLLENGRVALISLAGF GAEGVVPDEIGQFTELRILSLGTHDEKLGGHLFDNYSVNMSDEQRKAMRMDYDTKFLSRD AREDLSTILRKGINDNPKMSPIKSSRISTKDVQFGALTNKITGVSKAIMRLTKLEQFFIS NSPVRSDGLFVNVKPDSPYYEEQEEWNWKNMATLTDIEIYNCPHLEKLPVDMLKQLPELV SLNIAHNPSIDGDQLKSDWEALLGEDSQCADKIQLLYMGYNKLKEFPEYGLLKKMKKLTL LDCTNNEIEILHPFGKEVKLMKVYLDNNKIKEIPGVEDGNGLKYFFGWNDVETLSCTNNQ LTEVPNIFNAKSDYVMGTVDFSNNLIDKFEDGENHRGINASTVNLHHNKFTIFPKLLFEK ESPMQTLNLSANGMTTIPKGALKGKTSLEYLTSLDLTYNKLSSLTDDFYVINMPILYGLD LSYNRFSKFPSQPLSINYLVVFGISHQRDDNGNRTLREWPTGLYKNPSLKVFYIGSNDLR KIDDTISSTIVLFEIKDNPNISIDMSGICPYIRAGQYTLIYDKSQDIRGCDALDLDK >gi|222159325|gb|ACAB01000034.1| GENE 4 5016 - 6725 1287 569 aa, chain + ## HITS:1 COG:no KEGG:BT_0211 NR:ns ## KEGG: BT_0211 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 552 1 625 660 431 39.0 1e-119 MKYLVKIALGLFVYMAAVASCKDDDDSGITGFSIDKEDITMGADGGKDIVTVSSGGEWAV SASEPWVNISPANGFGATECTVSIDSTLINGMRKAEIRFIPQGQAPCVMTVHQTGYGKMI YIEKPDVEIKASDTYDNRHFDVIVTTNVAFKMNTEYDVIPEKEWLTLPEDPTVDLDRGSR PRTTKIRVEWTMNPDFDIRTAKIHFTPKSTEDKLEQPAVLTISQKASPRIEDNRSGDSLT LLTIRERLEIGNNWNPGENMRYWDNVVLWEEGDEGLPKGENVVGRVRSVSFNMINTKESV PQEVHYLTYVESLTFFGNSNTATKSITLEDDVCGLEYLKSLTVSAYGLSAISDNLVLLGD RLETLDLSSNNFNSVPSIITKENFPKLKSLNLIGNRRSVISDLRNAKDPVKYPDGIGLFF NTKDDNTLRRLFMWDNLEELRLSYNFIEGTLPDFEIGVDGVTGYSQADVEAFGGDTIQYL VNEGAHIPKILPKMRKLSVNLNFFTGNLPEWVLYHPHLIEWDPEVLIYNQMEKGLNSEGK MVRFDNEPTNFDKYFEAFPKFKEKYELKD >gi|222159325|gb|ACAB01000034.1| GENE 5 6753 - 8786 1426 677 aa, chain + ## HITS:1 COG:alr1615_2 KEGG:ns NR:ns ## COG: alr1615_2 COG1404 # Protein_GI_number: 17229107 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Nostoc sp. 
PCC 7120 # 200 513 37 297 416 129 36.0 2e-29 MTTFALLAGACSDNDLNDDAPVVPVTPDIVIPEGAAEGELLVKFVPEMTEILDHTMKIRA VGASTRSGIPSTDEVLSILGAYDFERVFPVNSRTEEKSREAGLHLWYLVRFDKGTDLKEA MGKLSKLGEISKLQINPTIQRAYNPKKKPVPVSEAALNKMKTRAADNGFKFNDELLPHQW GYINRGDYSFVTEKAPAIVGSDVNCEEAWELCTGDPEIIVAVLDEGVMWKHPDLEANMWV NESEVLGSDDDADKNGYKGDRYGYNFVKNSGIISWQAAEDTGHGTHVAGTIAAVNHNGIG VSGIAGGNGTKPGVRIMSCQVFDGNSQVKMVNEARAIKYAADNGAVILQCSWGYNSAYSN PLQGYVPGPATDEEWSKTYPLEKEALDYFIHNAGSPNGVIEGGLAIFASGNEYSYMSSYP AAYEGCLSVASIAADYTPSSFSNYGMEVDFCAPGGDSEYHCVPGEDTDGNNVNIDQGMIL STLVVEGKAAYGYNEGTSMACPHVSGVAALGLSYALQQRRHFKVQEFINLMKQTAREVDS YYKGYKTYYYLHNSPGWSATRMDLSSYRGKMGKLVDAGALLHAIAGSGSDMKMPNLYVGI DKIVKIDLARYFLNGENLIYECEISDTSKATVTLEGTILKVKGVVAGMTSATMRTSDGKE QSFTITVRKNANDSGWM >gi|222159325|gb|ACAB01000034.1| GENE 6 8840 - 9610 314 256 aa, chain + ## HITS:1 COG:no KEGG:BT_0213 NR:ns ## KEGG: BT_0213 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 20 252 34 267 271 220 46.0 3e-56 MVFFPAHYLGAQGIAYQSVEVDSLINPPLLKEGGEMFSFDSLRIHIGSIYDTDAPRTYTF PFRNVSGKNVRITKITTSCGCTAAAFSLGTLAPGMESMVTLVYNPKNRIGTVETYAFVYT DISEKHPVARLMLIGEVVCSDEWRYLPSTMGTLRMKRKQIVFSEMTSNVCPSERILCANI GKKPLKLSAQMLPPYAVFQTEPEVIPPGQEADIVITVDGSKLPDNLGNGLQFNFIVEGVD APLTDRLVRVTIKRLQ >gi|222159325|gb|ACAB01000034.1| GENE 7 9643 - 10125 403 160 aa, chain + ## HITS:1 COG:no KEGG:BT_0214 NR:ns ## KEGG: BT_0214 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 157 1 156 158 204 60.0 9e-52 MRILFRIIPLCMLLFSFMACDDDEEVKVETLEVTPANLNGTWRLVEWNGEPMADGTYCYI TFVRRDKTFEMYQKFDSMYARFLSGIFEIEKDTYLGYIISGKYNNSLGGRWNNSYIVTEL LPAGTMIWTVKDDAGDVSKYVRCDGVPSKIVEEARDDIDN >gi|222159325|gb|ACAB01000034.1| GENE 8 10129 - 10275 108 48 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294808843|ref|ZP_06767572.1| ## NR: gi|294808843|ref|ZP_06767572.1| hypothetical protein CW3_0919 [Bacteroides xylanisolvens SD CC 1b] # 1 48 1 48 48 71 100.0 2e-11 MSEKELKMFDMNAQLNAAFSKEGSSKRKKRGFTTFSINEVNPLFLSIS 
>gi|222159325|gb|ACAB01000034.1| GENE 9 10540 - 10968 370 142 aa, chain + ## HITS:1 COG:FN2045 KEGG:ns NR:ns ## COG: FN2045 COG0735 # Protein_GI_number: 19705335 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Fusobacterium nucleatum # 7 135 14 140 142 115 47.0 2e-26 MKPYDRLLEHNIKPSMQRIAIMEYLMEHPIHPSADDIYTALSPSMPTLSKTTVYNTLKLF SEQGAALMLTIDEKNTNFDADTSVHSHFLCKRCGHIYDLKCPEAIKQVENLEMDGHQVSE VHYYYKGICKNCLSKDKETRID >gi|222159325|gb|ACAB01000034.1| GENE 10 10991 - 11551 703 186 aa, chain + ## HITS:1 COG:CAC3598 KEGG:ns NR:ns ## COG: CAC3598 COG1592 # Protein_GI_number: 15896832 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 1 182 1 180 181 231 71.0 4e-61 MKKFRCTVCGYVHEGDAAPEKCPLCKAPASKFVEVVEVEGDALSFADEHVIGVAKGCDEE MIKDLNNHFMGECTEVGMYLAMSRQADREGYPEVAEAFKRYAWEEAEHAAKFAELLGDCV WDTKTNLQKRKDAEQGACEDKKRIATRAKALNLDAIHDTVHEMCKDEARHGKGFEGLYNR YFGDKK >gi|222159325|gb|ACAB01000034.1| GENE 11 11646 - 12281 428 211 aa, chain + ## HITS:1 COG:PAE2336 KEGG:ns NR:ns ## COG: PAE2336 COG0778 # Protein_GI_number: 18313271 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Pyrobaculum aerophilum # 21 208 50 252 274 111 35.0 7e-25 MRKVQLLLVCLMLSAAAFAADKVIKLPKPNLNRTGAVMKALSERHSTREYASKALSLADL SDLLWAANGINRKESGMRTAPSALNKQDVDVYVVLPEGSYLYDAKNHQLTLIVAGDYRGA VAGGQAFVKTAPVSLVLISDLSRFGDAKSPRSQLMGAMDAGIVSQNISIFCSAANLATVP RASMDNEQLKKVLKLKDSQMPMMNHPVGYFK >gi|222159325|gb|ACAB01000034.1| GENE 12 12426 - 12890 529 154 aa, chain - ## HITS:1 COG:BB0061 KEGG:ns NR:ns ## COG: BB0061 COG0526 # Protein_GI_number: 15594407 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Borrelia burgdorferi # 40 152 3 115 117 104 38.0 7e-23 MKKVLVMVALVMVSVIVYAFNDSGESNQGKKEVTGNGEVVVMDKDMFLKDVFDYEKSKEW 
KYKGDKPAIIDLYADWCGPCRQTAPIMKELAKEYAGKIVIYKVNVDKQKELAALFNATSI PLFVFIPMKGDPQLFRGAADKATYKKAIDEFLLK >gi|222159325|gb|ACAB01000034.1| GENE 13 12908 - 13207 235 99 aa, chain - ## HITS:1 COG:RSc1188 KEGG:ns NR:ns ## COG: RSc1188 COG0526 # Protein_GI_number: 17545907 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Ralstonia solanacearum # 6 97 16 107 108 95 44.0 3e-20 MEKFEDLIQSPVPVLVDFFAEWCGPCKAMKPVLEELKLIVGDKARIAKIDVDQHEDLATK YRIQAVPTFILFKNGEAVWRHSGVIHSSELQGVIEKHYS >gi|222159325|gb|ACAB01000034.1| GENE 14 13377 - 14504 492 375 aa, chain + ## HITS:1 COG:MTH884_2 KEGG:ns NR:ns ## COG: MTH884_2 COG1819 # Protein_GI_number: 15678904 # Func_class: G Carbohydrate transport and metabolism; C Energy production and conversion # Function: Glycosyl transferases, related to UDP-glucuronosyltransferase # Organism: Methanothermobacter thermautotrophicus # 2 331 1 315 348 69 21.0 1e-11 MKFLFIVQGEGRGHFTQAITLEEMLLRNGHEVVEVLVGKSSTRTLPGFFNRSIHAPVKRF ISPNFLPTADNKRANLTKSFAYNLLKLPEYIRSMYYINQRIRETGAEVVINFYELLTGLT YALFRPSVPYICVGHQYLFLHHDFEFPDKSSCQLWMLRFFTRMTALKSSKKLALSFREME QDDNNQIVTVPPLIRQEVTAIRPEEGNYIHGYMVNSGFADSVEHFHARHPEVPLTFFWDK SDTEEVIRVDETLSFHQIDDVKFLNAMAGCRAYASTAGFESICEAMYLGKPVLMVPAHIE QDCNAYDAMKAGAGIISDSFDLQPLLRFVGKYAPNRHFIYWVRSCERRIILELEKLAASQ SEITSIPTFNNYLPI >gi|222159325|gb|ACAB01000034.1| GENE 15 14501 - 14683 251 60 aa, chain + ## HITS:1 COG:no KEGG:BF3012 NR:ns ## KEGG: BF3012 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 60 1 60 60 86 70.0 4e-16 MKLNRTDYVLESASDGGYYAWLTVNMHCNAYGESPEEAIQNLQNIMNEMIDEMYMVEEFI >gi|222159325|gb|ACAB01000034.1| GENE 16 14770 - 16926 1972 718 aa, chain - ## HITS:1 COG:no KEGG:BT_0236 NR:ns ## KEGG: BT_0236 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 37 718 1 682 682 1331 92.0 0 
MRKQILFVLFSLATLSIHADEGMWMLTDLKTQNAVAMRELGLEIPIEEVYDANGLSLKDA VVHFGGGCTGEIISSEGLVLTNHHCGYGAIQQHSNVEHDYLTDGFWAMNRDAELPTPGLT VTFIDRILDVTDYVNEQLKKDPDPEGVNYLSPSYLGTVAERFAKAENIEITPATKLELKA FYGGNKYYMFIKTVYSDIRMVGAPPSSIGKFSADTDNWMWPRHTGDFSLFRIYADKNGKP AKYSKDNVPLQVKKHLKISIAGVQEGDFTFVMGFPGRNWRYMISDEVEERMQTTNFMRQH VRGARQKVLMEQMLKDPAVRIHYASKYASSANYWKNAIGMNEGLVRLNVLDTKRAQQEEL LARGREKGDDSYQKAFDEIRSIVTHRRDAIYHQQAINEALVTALDFMRIPSTMELVAALK SKDKEQIKEAKLKLKQEADKYFASVPFPEVERMVAKEMLKTYANYIPEEQRINIFEIINS RFKGSIDAFVDACFEHSIFGNPKNFEKFIKKPSLYKIGYDWMVLFKYSITDGILKTAIAM KEANQNYDAAHKVWVKGMMDMRQEKGTPIYPDANSTLRLTYGQVLSYEPADGVVYDAHTT LKGVMEKEDQGNWEFVVPQKLKELYKSQDYGRYGKNGEMPVCFIVNTDNTGGNSGSPVFN SKGQLVGTAFDRNFEGLTGDIAFRPSSQRAACVDIRYTLFIIDKYAGASHIIDELSIE >gi|222159325|gb|ACAB01000034.1| GENE 17 16927 - 19086 2401 719 aa, chain - ## HITS:1 COG:no KEGG:BT_0237 NR:ns ## KEGG: BT_0237 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 25 719 1 695 695 1306 90.0 0 MNKLRFYLVALAALFVFSVRADEGMWLLQLMQQQHSIDMMKKQGLKLEAQDLYNPNGVSL KDAVGIFGGGCTGEIISPEGLILTNHHCGYASIQQHSSVEHDYLTDGFWATSRDKELPTP GLKFTFIERIEDVTDIVNAKIAAKEITESESFSNTFLQKLAHDLYFKSDLADKKGIVPQA LPFYAGNKFYLFYKKIYPDVRMVAAPPSSIGKFGGETDNWMWPRHTGDFSMFRIYADANG EPAEYSENNVPLKTKKHLSISIKGLKEGDYAMIMGFPGSTSRYLTVSEVKERMESENDPR IRIRSARLAVLKEVMNASDKIRIQYANKYAGSSNYWKNSIGMNKAIIDNDVLGTKAAQEA KFAEFAKAQNNAEYAAVVKNIDDLVAKTTPLNYQYTCLRETFFGAIEFGNVMLSKTREAL LEKNDSVIEARMKALESTYESIHNKDYDHEVDRKVAKALFPLYAEMVPANQRPSIYKVIE QKYKGDYNKFVDDMYDNSIFANRANFEKFTKKPSVKAIDNDLALQYCQSKYDLMDKLVSQ LKDMDQELALLHKTYIRGLGEMKLPVPSYPDANFTIRLTYGNVKPYDPKDGVHYNYYTTT KGILEKENPEDREFVVPAKLKELIEKKDYGRYALPNGDMPVCFLSTNDITGGNSGSPVLN ENGELIGCAFDGNWESLSGDINFDNNLQRCINLDIRYVLFILEKLGNCGHLINEMTIVE >gi|222159325|gb|ACAB01000034.1| GENE 18 19141 - 20385 1035 414 aa, chain - ## HITS:1 COG:MA2647 KEGG:ns NR:ns ## COG: MA2647 COG0641 # Protein_GI_number: 20091470 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # 
Organism: Methanosarcina acetivorans str.C2A # 13 405 9 396 446 412 48.0 1e-115 MKTSTFAPFAKPLYVMVKPVGAVCNLACEYCYYLEKANLYKDNPKHIMSDELLEKFIDEY INSQTMPQVLFTWHGGETLMRPLSFYKKAMELQKKYAHGRTIDNCIQTNGTMLTDEWCEF FRENNWLVGVSIDGPQEFHDEYRKNKLGKPSFVKVMQGINLLKKHGVEWNAMAVINDFNA EYPLEFYRFFKEIGCQYIQFAPIVERILSHEDGRHLASLAENKAGTLADFSITPEQWGNF LCTLFDEWVKEDVGKYYVQIFDSTLANWMGEQPGICTMAKTCGHAGVMEFNGDVYSCDHF VFPEYKLGNIYSKTLVEMMHSERQHNFGNMKYQSLPTQCKECEFLFACNGECPKNRFSQT AEGEPGLNYLCKGYYQFFKHVAPYMDFMKNELMNQRPPANIMEALRNGELRVES Prediction of potential genes in microbial genomes Time: Wed May 18 01:53:48 2011 Seq name: gi|222159324|gb|ACAB01000035.1| Bacteroides sp. D1 cont1.35, whole genome shotgun sequence Length of sequence - 10452 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 5, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 50 - 784 519 ## BT_0239 hypothetical protein 2 1 Op 2 . + CDS 827 - 2581 1212 ## COG0457 FOG: TPR repeat + Prom 2583 - 2642 3.0 3 2 Tu 1 . + CDS 2747 - 3010 269 ## gi|262407546|ref|ZP_06084094.1| conserved hypothetical protein - Term 3144 - 3199 2.4 4 3 Tu 1 . - CDS 3323 - 4081 565 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase - Prom 4327 - 4386 5.2 + Prom 3984 - 4043 4.7 5 4 Op 1 28/0.000 + CDS 4202 - 5455 1072 ## COG0420 DNA repair exonuclease 6 4 Op 2 . + CDS 5452 - 8319 2628 ## COG0419 ATPase involved in DNA repair - Term 8286 - 8324 5.4 7 5 Op 1 . - CDS 8339 - 8914 608 ## BT_0247 hypothetical protein 8 5 Op 2 . - CDS 8920 - 9432 622 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 9 5 Op 3 . 
- CDS 9501 - 10403 792 ## COG1410 Methionine synthase I, cobalamin-binding domain Predicted protein(s) >gi|222159324|gb|ACAB01000035.1| GENE 1 50 - 784 519 244 aa, chain + ## HITS:1 COG:no KEGG:BT_0239 NR:ns ## KEGG: BT_0239 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 244 1 244 244 378 82.0 1e-103 MNVNKILPFLLLLPFLASCTSKYKIEGTSSVNSLDGKMLYLKSLRDGEWVKLDSAEVVHG LFSMKGKIDSVQMVTLYMDEESIMPIVLESGKITVTISNTDLKAVGTSLNNALYEFISKR NQLEESISELEQKETRMVLDGGDLDEIHSQLVVEGDSLMQAMNQYVKTFISDNYENVLGP SVFMMLCSSLPYPIMTPQIDDIIKDAPYSFKDNKLVREFLSKARENMKLIEEHQRLEQNA STNK >gi|222159324|gb|ACAB01000035.1| GENE 2 827 - 2581 1212 584 aa, chain + ## HITS:1 COG:alr3017 KEGG:ns NR:ns ## COG: alr3017 COG0457 # Protein_GI_number: 17230509 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. PCC 7120 # 102 287 105 294 340 61 24.0 5e-09 MKTTTVSVYAAFFFILVLSVSCRRDSEASDVAFQKIEICMESLPDTALYLLKSVPHPEKL RGKSQADYALLLTQAMDQNYVKFTSDSLIALALNYYTVERGDTAMRAKAQYYYGRVLREL GKDEEALSFLSSAKEMFGKIQCCKMFAMATDEIGMVNRKKKLYQESLKNFQESYAIYEEL KDSVGIVRAGQNIGRAYLFQNNWDSCYFYYNNALELARKKQYPSEVSILHELGILYRSMG ELKKSERYFLAAYEKETDEERKYVECMSLGYLYIQMGDVENARKYFKMSINSSKEYTRID AYNNLYFLEKDIDNFEEAITYHEKADSIVSVLDEVDSLELITELQKQYENEKLRSDNLQM KMHRTVFLFCGTIVFLIVAFYMCYYYYKSRNHKKKIAEIESQIRDNEEEIKRYQQEMEEI QELKDQVLEENRILEENRMKVGELNGKIVLLSMQNKTLSGLLKELGGELTVGPSSEQYIS AFRLLLAIKEGTLRGKLSDGERYKLFSLFDLLYANYVTRLLDKAPLLTKRDLEICCFLKF GLTNEELARIFQTSSDSVTKAKGRLKGRLGISSQEDLNAFLRDF >gi|222159324|gb|ACAB01000035.1| GENE 3 2747 - 3010 269 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262407546|ref|ZP_06084094.1| ## NR: gi|262407546|ref|ZP_06084094.1| conserved hypothetical protein [Bacteroides sp. 
2_1_22] # 1 87 1 87 87 161 100.0 1e-38 MKNPNELSATSSWRGNELQLIEQKANIVVGVANSRTRLRIGVLLSKYSEYQVDYISSESE TGKLIEEADLIIGAGLLPMKGCYAGNR >gi|222159324|gb|ACAB01000035.1| GENE 4 3323 - 4081 565 252 aa, chain - ## HITS:1 COG:TM1693 KEGG:ns NR:ns ## COG: TM1693 COG0204 # Protein_GI_number: 15644441 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Thermotoga maritima # 55 220 59 223 247 108 35.0 8e-24 MKILYYIYQICIALPILLVLTILTAIVTIVGSLLGGAHFWGYYPGKIWSQLICLFLLIPV KIEGREKLHDKTSYIFVPNHQGSFDIFLIYGFIGRNFKWMMKKSLRKLPFVGKACESAGH IFVDRSGPKKVLETIRQAKDSLKDGVSLVVFPEGARTFTGHMGYFKKGAFQLADDLQLAV VPVTIDGSFEILPRTGKWIHRHRMILTIHDPIPPKGKGMENIKATMAEAYAAVESALPEQ HKGMVKNEDQDR >gi|222159324|gb|ACAB01000035.1| GENE 5 4202 - 5455 1072 417 aa, chain + ## HITS:1 COG:PA4281 KEGG:ns NR:ns ## COG: PA4281 COG0420 # Protein_GI_number: 15599477 # Func_class: L Replication, recombination and repair # Function: DNA repair exonuclease # Organism: Pseudomonas aeruginosa # 2 389 1 380 409 274 38.0 2e-73 MIRILHTADWHLGQTFFGYDRTGEHEVFLNWLAEEIRQKEIDALIIAGDVFDVSNPSAAS QSMYYQFIYRVTVENPNLQIVIVAGNHDSAARLEAPLPLLQAMRTEVRGVVRKLEGGEID YDHLIVELKNRKGEVELLCMAVPFLRQGDYPVVQTEGNLYAEGVRELYSQLLQRLWKQRT ANQSILAIGHLQATGSEIAEKDYSERTVIGGLECVSPEAFSEQIAYTALGHIHKAQRVSG RENVRYAGSPIPMSFAEKHYHHGVVMVTFDGGCAVDIERLECPKLIPLVSVPNGDPALPE VVLEALKELPETKDVAPYLEVKVLLEEPEPMLRQEIEEALADKNYRLARIVSTYRTDVEN TEKENENWKRGLQEMSPLQIAQSAFEKIYQVEMPAELTGLFQEAYLAATHKEEEEEE >gi|222159324|gb|ACAB01000035.1| GENE 6 5452 - 8319 2628 955 aa, chain + ## HITS:1 COG:PA4282 KEGG:ns NR:ns ## COG: PA4282 COG0419 # Protein_GI_number: 15599478 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Pseudomonas aeruginosa # 1 385 1 378 1211 184 37.0 8e-46 MKILAIRLKNLTSIEGMVEVDFTAEPLHSAGIFAISGPTGAGKSTLLDALCLALYDKAPR FATSVESVNLADVGDNQINQSDVRNLLRRGTSDGYAEVDFLGIDGRRYRSRWSVRRTRNK ISGSLQPQTLEVKELDTEKEFQGTKKELLIQLVELVGLTYEQFTRTVLLAQNDFATFLKS 
KGAAKAELLEKLTGTGVYSRISQEVYARNKAAQEEVTLIQNRMNVIELMPEEELLALQKE KELLAEKRVTGIKLLAEQNEQLNVVRSLKMQEDLWKKKQQEEQEEQARLKMLQGALASQE EGLVHFKAQWEAIQPDLKKARQLDIQIQSQQDSYTQSKQMLQSANKQVSEQEQKMRMATE QLQVSYSSLNRLLNHVGIKEVLQLEQVEEILRQEENKLTAGVNTNEERLLRLNSFGYPLL TEEQMKLQKELTRQQNIRQLTETQTKTKAEIERLEKETTDCLKQLTEQETALKVTQRLYE NARMAVGKDVKALRQQLQEGEACPVCGSTAHPYHQEQEVVDTLFRSIEQEYNAAVANCQQ INNRSIVLQRDWTHQKMVDGQIGEQLAALYKAGIDAGNEEQIQHRLTELAERILEYRNLY AEWQRSDEEIKKMRAHCEALRENVSLCRLAMQKVSSAKEQLLLLQNTASAEQKRFEVIEK ALNVLRQERSQLLKGKSADEAEAAVAKREKELNLALEKARKEVEAVHNRLSGLQGEMKQI TLAIGELQEQYKKIESPEQLPEIIKKQQEENLNIERTFSTMEARLLQQAKNKLTVEQIAK ELAEKQTIAERWAKLNKLIGSADGAKFKVIAQSYTLNLLLLHANKHLSYLSKRYKLQQVP DTLALQVIDCDMCDEIRTVYSLSGGESFLISLALALGLSSLSSNNLKVESLFIDEGFGSL DAESLRTAMEALEQLQMQGRKIGVISHVQEMSERISVQVQVHKKVNGKSVLTVVG >gi|222159324|gb|ACAB01000035.1| GENE 7 8339 - 8914 608 191 aa, chain - ## HITS:1 COG:no KEGG:BT_0247 NR:ns ## KEGG: BT_0247 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 187 1 187 192 249 85.0 3e-65 MELEELKKSWNALDEHLKDKEFIKEEELEKLIRHADKGIHAIASLNIKLILISLPILILF LAEVLLHNRLNPIYIIIIFAWIPALCWDIVTTRYLQRTQIDEMPLVEVISRVNRIHRWTI RERLIAIAFLLVLAVLSFIYWQVWQYGIGMIAFFILLWGGGLGLILWIYRKKFLNRIHEI KKNLSELNELM >gi|222159324|gb|ACAB01000035.1| GENE 8 8920 - 9432 622 170 aa, chain - ## HITS:1 COG:CC3310 KEGG:ns NR:ns ## COG: CC3310 COG1595 # Protein_GI_number: 16127540 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Caulobacter vibrioides # 42 166 33 158 166 81 34.0 8e-16 MREQATEANPPIEQEFLSVIREYERVIYKVCYLYANPNAPLNDLYQDVLLNLWKAYPKFR KECKVSTWIYRIALNTCISFYRKEKNVPEIVSLTKDTDWTIEAHDPINEMLKQLYQMINQ LGQLDKSIILLYLEDKSYEEIAEITGLTVTNVATKLSRIKDKLKRMKKEE >gi|222159324|gb|ACAB01000035.1| GENE 9 9501 - 10403 792 300 aa, chain - ## HITS:1 COG:VC0390_2 KEGG:ns NR:ns ## COG: VC0390_2 COG1410 # Protein_GI_number: 15640417 # Func_class: E Amino acid transport and metabolism # Function: Methionine 
synthase I, cobalamin-binding domain # Organism: Vibrio cholerae # 9 299 615 897 899 202 40.0 6e-52 MILSYKIHTVTPYINWIYFFHAWGFQPRFAAIANIHGCDVCRASWLTTFPEEERNKASEA MQLFKEANRMLDLLDRDYEVKTLFKLCKANSDGDNLIIEKEKDQFVTFPLLRQQTPKRDG SPFLCLSDFIRPLSSGIPDTIGAFASSIDADMEGLYEQDPYKHLLVQTLSDRLAEAATEK MHEYVRKEAWGYAKEENLGIADLLVEKYQGIRPAVGYPSLPDQSVNFLLDELLDMKQIGI SLTENGAMYPHASVCGLMFSHPASEYFSVGKIGEDQLEDYTRRRGKSIEEMRKFLAANLQ Prediction of potential genes in microbial genomes Time: Wed May 18 01:54:06 2011 Seq name: gi|222159323|gb|ACAB01000036.1| Bacteroides sp. D1 cont1.36, whole genome shotgun sequence Length of sequence - 17325 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 7, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 1332 1128 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 2 1 Op 2 . - CDS 1329 - 2072 772 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 2130 - 2189 5.9 + Prom 2037 - 2096 4.6 3 2 Op 1 . + CDS 2332 - 5706 2847 ## COG1197 Transcription-repair coupling factor (superfamily II helicase) + Prom 5714 - 5773 2.2 4 2 Op 2 . + CDS 5794 - 6006 283 ## gi|237715945|ref|ZP_04546426.1| conserved hypothetical protein 5 2 Op 3 . + CDS 5994 - 6290 166 ## gi|237715946|ref|ZP_04546427.1| conserved hypothetical protein + Term 6456 - 6492 -0.4 6 3 Tu 1 . - CDS 6323 - 8179 1425 ## COG1032 Fe-S oxidoreductase - Prom 8209 - 8268 2.7 - Term 8180 - 8252 8.4 7 4 Op 1 . - CDS 8285 - 8509 143 ## BT_0255 hypothetical protein 8 4 Op 2 . - CDS 8509 - 8826 502 ## BT_0256 hypothetical protein 9 4 Op 3 . - CDS 8905 - 11814 2420 ## BT_0257 xanthan lyase - Prom 11964 - 12023 4.7 - Term 11944 - 11997 12.4 10 5 Tu 1 . - CDS 12033 - 14024 1802 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase - Prom 14075 - 14134 4.4 11 6 Op 1 . - CDS 14173 - 14949 452 ## COG0388 Predicted amidohydrolase 12 6 Op 2 . 
- CDS 14987 - 15541 416 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase 13 6 Op 3 . - CDS 15549 - 15818 376 ## BT_0261 hypothetical protein - Prom 15855 - 15914 4.2 14 7 Tu 1 . - CDS 15942 - 17129 726 ## COG1373 Predicted ATPase (AAA+ superfamily) - Prom 17225 - 17284 5.2 Predicted protein(s) >gi|222159323|gb|ACAB01000036.1| GENE 1 3 - 1332 1128 443 aa, chain - ## HITS:1 COG:XF0988 KEGG:ns NR:ns ## COG: XF0988 COG0044 # Protein_GI_number: 15837590 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Xylella fastidiosa 9a5c # 1 442 1 445 449 412 48.0 1e-115 MKRTLIQNAVIVNEGRKVLGSVVIENEKIAEILVGEEKATAPCDEVIDASGCYLLPGAID EHVHFRDPGLTHKADITTESHAAAAGGVTSIMDMPNTNPQTTTLEALEEKFTLLGEKSAV NYSCYFGATNNNYTQFAQLDKHRVCGVKLFMGSSTGNMLVDRMASLRNIFGGTDLLIAAH CEDQGIIKENTDKYKKEYGDDVPLALHPLLRSEEACYRSSELAVQLARETNARLHIMHIS TAKELSLFSNVPLAQKRITAEACVSHLLFTEEDYQTLGARIKCNPAIKTAQDRKALQEAV NSGLIDAIATDHAPHLLSEKEGGALKAMSGMPMIQFSLVSMLELADKGVFTIEKVVEKMA HAPAQMYEIPNRGFIRKGYQADLVLVRPGSEWTVTTDCILSKCKWSPLEGHTFDWKVEKT FVNGHLLYNNGEIDETYRGQELF >gi|222159323|gb|ACAB01000036.1| GENE 2 1329 - 2072 772 247 aa, chain - ## HITS:1 COG:Rv2051c_2 KEGG:ns NR:ns ## COG: Rv2051c_2 COG0463 # Protein_GI_number: 15609188 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Mycobacterium tuberculosis H37Rv # 7 239 3 229 264 213 45.0 3e-55 MQTSDSIVIIPTYNERENIENIIRAVFGLPKVFHILVIEDGSPDGTATIVKTLQQEFPER LFMIERKGKLGLGTAYITGFKWALEHAYEYIFEMDADFSHNPNDLPRLYQACSEQGGDVS IGSRYISGVNVVNWPMGRVLMSYFASKYVRFITGIPVHDTTAGFVCYRRQVLEMIDLDHI RFKGYAFQIEMKFTAYKCGFKIIEVPVIFINRELGTSKMNSSIFGEAIFGVIKLKVNSWF HKFPQKS >gi|222159323|gb|ACAB01000036.1| GENE 3 2332 - 5706 2847 1124 aa, chain + ## HITS:1 COG:BS_mfd KEGG:ns NR:ns ## COG: BS_mfd COG1197 # Protein_GI_number: 16077123 # Func_class: L Replication, recombination and repair; K Transcription # Function: Transcription-repair coupling factor 
(superfamily II helicase) # Organism: Bacillus subtilis # 34 1037 31 1086 1177 628 34.0 1e-179 MTITELQQQYAAHPNMAVMKRLLKDTSVQTVFCGGLCASAASLFSSVLVQESGCPFVFIL GDLEEAGYFYHDLTQILGTEKVLFFPSSFRRSIKYGQKDAANEILRTEVLSRLQKREEGL CIVTYPDALAEKVVSRKELSDKTLKLNVGEKVDTTFITDVLHSYGFEYVDYVYEPGQYAV RGSIIDVFSFASEYPYRIDFFGDEVESIRTFEVESQLSREKKSGVSIVPDLAVTGDVTTS FLDFIPKDTTLAMRDFLWLRERIQVVHDEALTPQAIAVQEAAENGGITLEGKLIDGSEFT VRALDFRRLEFGNKPTGTPNASVTFNTSAQPIFHKNFDLVASSFKDYLEKGYSLYICSDS MKQTDRIKAIFEDRGDQINFTPVERTIHEGFVDNTLRLCIFTDHQLFDRFHKYNLKSDKA RSGKVALSLKELNQFTPGDYVVHTDHGIGRFSGLVRIPNGDTTQEVLKLVFQNEDVVFVS IHSLHKVSKYKGKEGEAPRLNKLGTGAWEKLKERTKSKIKDIARDLIKLYSQRRQEKGFS YSPDSFLQRELEASFIYEDTPDQSKATIDVKADMESDRPMDRLVCGDVGFGKTEVAIRAA FKAVADNKQVAVLVPTTVLAYQHFQTFRERLKGLPCRVEYLSRARTAAQTKAVLKGLKDG DVGILIGTHRILGKDVQFKDLGLLIVDEEQKFGVSVKEKLRQLKVNVDTLTMTATPIPRT LQFSLMGARDLSVISTPPPNRYPIQTEVHTFNEEVITDAINFEMSRNGQVFFVNNRIANL PELKAMIERHIPDCRVAIGHGQMEPTELEKIILDFVNYDYDVLLATTIIESGIDIPNANT IIINQAQNFGLSDLHQMRGRVGRSNKKAFCYLLAPPLGSLTAEGRRRLQAIENFSDLGSG IHIAMQDLDIRGAGNLLGAEQSGFVADLGYETYQKILTEAVHELKTDEFAELYADEIKGE GQISGEEFVEECQVESDLELLLPANYVTGSSERMLLYRELDGLTLDKDVEAFRSRLEDRF GPVPRETEELLRIVPLRRLSARLGAEKIFLKGGRMTLFFVSNPDSPFYQSKAFGKVIDYM MKYTRRCDLREQNGRRSMLIKNVTNVETAVSVLQEIVALPVKEE >gi|222159323|gb|ACAB01000036.1| GENE 4 5794 - 6006 283 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715945|ref|ZP_04546426.1| ## NR: gi|237715945|ref|ZP_04546426.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 70 1 70 70 111 100.0 2e-23 MAKKSYKPQDNELPQVNEPVAAYHVTTPNANLYVPTEYEQEIIMHSEKDYEEGRLHTQED VDKLVEQWLS >gi|222159323|gb|ACAB01000036.1| GENE 5 5994 - 6290 166 98 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715946|ref|ZP_04546427.1| ## NR: gi|237715946|ref|ZP_04546427.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 98 1 98 98 175 100.0 1e-42 MAKLKIVWTETATIVFQQILTFYNVRNGNTQYSRSIYTMVRDVLQLVAKYPYMYKATSVP NIRVFHCDYFKVYYRVLEKQILVEAVFDTRQDPDKAPF >gi|222159323|gb|ACAB01000036.1| GENE 6 6323 - 8179 1425 618 aa, chain - ## HITS:1 COG:PA4928 KEGG:ns NR:ns ## COG: PA4928 COG1032 # Protein_GI_number: 15600121 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Pseudomonas aeruginosa # 9 611 23 667 747 513 40.0 1e-145 MKEYRLTDWLPTTKKEVELRGWNELDVILFSADAYVDHPSFGAAVIGRILEAEGLKIAIV PQPNWRDDLRDFKKLGRPRLFFGISGGCMDSMVNKYTANKRLRSEDAYTPDGRPDMRPEY PSTVYSQILKKLYPDVPVVIGGIEASLRRLSHYDYWQDKVQKSILCDSGADLLIYGMGEK PLPDLVKNMKSLLTTEEPVLTSSKFRTIIGSVPQTAYLCRATEWTSAENDLSLYSHEECL ADKKKQASNFRHIEEESNKYSASRITQAVGNKIVVVNPPYPPMSQEDLDRSFDLPYTRLP HPKYKGKRIPAYDMIKFSINIHRGCFGGCAFCTISAHQGKFIVSRSKESILKEVKEVIQL PDFKGYLSDLGGPSANMYQMKGKDEAICKKCKRPSCIHPKVCPNLNTDHRPLLDIYRAVD ALPGIKKSFIGSGVRYDLLLHQSKDEAINRSTAEYTRELIVNHVSGRLKVAPEHTSDRVL SVMRKPSFEQFETFKRIFDRINREENLRQQLIPYFISSHPGCKEEDMAELAVITKRLDFH LEQVQDFTPTPMTVATEAWYTGFHPYTLEPIFSAKTQREKLAQRQFFFWYKPEERRNIIN ELRRIGRSDLIDKLYGKR >gi|222159323|gb|ACAB01000036.1| GENE 7 8285 - 8509 143 74 aa, chain - ## HITS:1 COG:no KEGG:BT_0255 NR:ns ## KEGG: BT_0255 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 74 1 74 74 127 86.0 2e-28 MITVDTCGITAYSPLIPAIRAMCTASPGETIEIIMNHADAFQDLKEYLSEQGIGFREIYD GEQMTLQFTINGKL >gi|222159323|gb|ACAB01000036.1| GENE 8 8509 - 8826 502 105 aa, chain - ## HITS:1 COG:no KEGG:BT_0256 NR:ns ## KEGG: BT_0256 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 105 1 105 105 175 93.0 5e-43 MYTIQANPSGTRSIEVSNENLRTIEKYALFRHLIDSTGIVDEAVLDKLKLNIRSLIASQE EDSKDLLDLCIDVIYHNNMKAFGLQQLIKLYLTWLSNTEAEEEEE >gi|222159323|gb|ACAB01000036.1| GENE 9 8905 - 11814 2420 969 aa, chain - ## HITS:1 COG:no KEGG:BT_0257 NR:ns ## KEGG: BT_0257 # Name: not_defined # Def: xanthan lyase # Organism: B.thetaiotaomicron # Pathway: not_defined # 
1 969 1 970 970 1767 88.0 0 MKKIIVFLLCLTVSANFLFAQDIERNVKERLTDYFNKYTATAKISTPKLNSFDINYDRKT IAIYASESFAYQPFRPETVENIYNQVKELLPGPVHYYQLTIYADGKPIEDLVPNFYRNKK KDKERLSLNIDYKGAPWVKNISRPNEISRGLQDRHIAIWQSHGNYFKNDKNEWGWQRPRL FCTTEDMFTQSFVLPYVIPMLENAGAIVYTPRERDTQKNEIIVDNDTPNASLYLEVGSKK ANWANAPVRGFAQKKTIYKEGENPFTDGTCRFIPTERKKKKKKDQVFAEWVPTLPATGKY AVYVSYQTLPNSVSDAKYLVFHNGGVTEFKVNQKIGGGTWVYLGTFEFDKGNNDYGMVVL SNESSEHGVVCADAVRFGGGMGNIARGGRTSRLPRYLEGARYSAQWAGMPYEVYAGRKGE NDYTDDINTRSNAINYLSGSSVYNPQQSGLGVPLEMTMALHSDAGCSKTDEFIGSLGIYT TDFNNGKLNAGTDRYASRDLADILLTQIQKDIYSSYSIPWTRRSMWNRNYSETRLPATPS TIIELLSHQNFADMQLGHDPNFKFTVGRAIYKGILQFITSQHDKEYIVQPLPVSNFAIQF GKKKNTLELSWKGEDDPQEPTARPREYIVYTRIGYGGFDNGTLVSKTSHTVKIEPGLVYS FKVTAVNRGGESFPSEILSAYKAKREQEKVIIINGFDRISGPAVVNTSDRAGFDLSQDPG VPYISNISFCGAQTGFDRTQAGKEGKGSLGHSGNELEGMKIAGNTFDYPFIHGKAIQAAG KYSFVSCSDEAVENGLVTLEDYPVVDYILGLEKEDPANKAYYKTFSSAMQRIMTSYCQSG GSLFVSGAYVGSDMSGTQGNREFTEKILKYGYQSSLTDKSSNQIKGLGRTITIPRLPNEN SYAVPAADCIVPVDTAFPVFTYAPGNQSAGIAYKGNYRTFVLGFPFESIQSEADRATIMA GILGFFTQK >gi|222159323|gb|ACAB01000036.1| GENE 10 12033 - 14024 1802 663 aa, chain - ## HITS:1 COG:BS_nagB KEGG:ns NR:ns ## COG: BS_nagB COG0363 # Protein_GI_number: 16080555 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Bacillus subtilis # 42 279 7 238 242 171 38.0 6e-42 MKTNLSSQISLHRVSPRYYRPENAFEKSVLTRLEKIPTDIYESVEEGANYIAREIAQTIR EKQKAGRFCVLALPGGDSPSHVYTELIRMHKEEGLSFRNVIVFNMYEYYPLAPDAINSNF NALKSMLLDHIDIDKQNIFTPDGSIAKDTIFEYCRLYEQRIESFGGIDIALLGIGRVGNI AFNEPGSRLNSTTRLILLDNASRNEASKIFGTLDNTPISSITMGVSTILGAKKVYLLAWG ENKAAMIKECVEGAITDTIPASYLQTHNNAHVALDLSAAMNLTRIQRPWLVTSCEWNDKL IRSAIVWLCQLTGKPILKLTNKDYNENGLSELLALYGSAYNVNIKIFNDLQHTITGWPGG KPNADDTYRPERAKPYPKRVIIFSPHPDDDVISMGGTLRRLVEQKHEVHVAYETSGNIAV GDEEVVRFMHFINGFNQLFNNSADQVINEKYAEIRNFLKEKKDGDMDSRDILTIKGLIRR GEARTACTYNNIPLERCHFLDLPFYETGKIQKNPISEADVEIVRNLLREVKPHQIFVAGD LADPHGTHRVCTDAVFAAVDLEKEEGAKWLKDCRIWMYRGAWAEWEIENIEMAVPISPEE 
LRAKRNSILKHQSQMESAPFLGNDERLFWQRSEDRNRGTATLYDKLGLASYEAMEAFVEY VPL >gi|222159323|gb|ACAB01000036.1| GENE 11 14173 - 14949 452 258 aa, chain - ## HITS:1 COG:STM0308 KEGG:ns NR:ns ## COG: STM0308 COG0388 # Protein_GI_number: 16763691 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Salmonella typhimurium LT2 # 1 258 1 255 255 236 45.0 3e-62 MESIRISIIQTDIVWENKQENLRLLHEKLQSLRGITEIVVLPEMFSTGFSMQSKILAEPN SGETITTLKQWAAKFQLAICGSYIATENEQFYNRAFFLTPEGEEFYYDKRHLFRMGREAE HFSAGDKRLIIPYHGWNICLLVCYDLRFPVWSRNVGNEYDLLIYVANWPIPRRLVWDTLL RARALENQCYVCGVNRVGTDGYQLSYNGGSKVYSAFGEEIGSIPDEKEGITTVSVNLTAL NQFREKFPVWKDADEFHL >gi|222159323|gb|ACAB01000036.1| GENE 12 14987 - 15541 416 184 aa, chain - ## HITS:1 COG:CC1900 KEGG:ns NR:ns ## COG: CC1900 COG0204 # Protein_GI_number: 16126143 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Caulobacter vibrioides # 11 177 16 182 196 125 40.0 3e-29 MKKAIYSFIYYRLLGWKTNVTVPNYDKCVICAAPHTTNLDLFIGKLFYGAIGRKTSFMMK KEWFFFPLGIFFKAVGGIPVDRSRKTSLVDQMVHKFAEYKKFNLAITPEGTRKANPNWKK GFYFIALKAQVPIVLIGIDYSKKTISATKAIMPSGDINKDMREIKLYFKDFKGKHPENFA LGEI >gi|222159323|gb|ACAB01000036.1| GENE 13 15549 - 15818 376 89 aa, chain - ## HITS:1 COG:no KEGG:BT_0261 NR:ns ## KEGG: BT_0261 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 89 1 89 89 162 95.0 2e-39 MASRRELKKNVNYIAGELFTECLINSMFIPGTDKVKADELMAEVLKMQDEFVTRISHTEP GNVKGFYKKFRADFNAKVNEIIEAIGKLN >gi|222159323|gb|ACAB01000036.1| GENE 14 15942 - 17129 726 395 aa, chain - ## HITS:1 COG:Ta0724 KEGG:ns NR:ns ## COG: Ta0724 COG1373 # Protein_GI_number: 16081801 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Thermoplasma acidophilum # 2 384 26 405 441 80 23.0 8e-15 MYKRVQYQTITKRLKEPRHFLQVVLGPRQVGKTTVIKQVVNDLNLPYQIYSADSIPATQT SWISDCWNTARVQMRVEKLSEFILIIDEIQKIKNWSEVVKKEWDADTFNDINMKVVLLGS 
SRVLLEKGLSDSMMGRFEEIRMTHWSYPEMRDAFNMSLEQYLYFGGYPGAAFLIEDEERW GQYINGAIIDATINKDILYDSPISKPALLRQTFELGTSYSGEIVSLTKMVGALQDAGNTT TLAGYLNLLGDSGLLTGLQKFAMDKSRQRASAPKFQVFNNALKTVYNDLTFKEAILNRKE WGRIFESAIGAHIVSNAFTGNYEVFYWREKDKEVDYILKKKNRIVAIEVKSNSEMYNAGL EEIRKMYQPYASFVVGEGGMKAEQFLSINPAKLFE Prediction of potential genes in microbial genomes Time: Wed May 18 01:55:35 2011 Seq name: gi|222159322|gb|ACAB01000037.1| Bacteroides sp. D1 cont1.37, whole genome shotgun sequence Length of sequence - 231900 bp Number of predicted genes - 186, with homology - 180 Number of transcription units - 87, operones - 45 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 114 - 173 10.6 1 1 Tu 1 . + CDS 235 - 1188 607 ## BT_0291 integrase 2 2 Op 1 . + CDS 1603 - 2352 493 ## BT_0292 hypothetical protein 3 2 Op 2 . + CDS 2359 - 3576 752 ## BT_0293 hypothetical protein 4 2 Op 3 . + CDS 3624 - 5084 1375 ## BT_0294 hypothetical protein + Term 5188 - 5223 1.1 - Term 5374 - 5414 -0.8 5 3 Tu 1 . - CDS 5509 - 7725 2176 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily - Prom 7763 - 7822 7.3 + Prom 7707 - 7766 4.2 6 4 Op 1 27/0.000 + CDS 7997 - 9253 1305 ## COG0845 Membrane-fusion protein 7 4 Op 2 9/0.000 + CDS 9257 - 12361 2929 ## COG0841 Cation/multidrug efflux pump 8 4 Op 3 . + CDS 12361 - 13740 437 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 + Term 13775 - 13822 9.1 + Prom 13812 - 13871 4.6 9 5 Tu 1 . + CDS 13947 - 15593 1982 ## COG0205 6-phosphofructokinase + Term 15604 - 15664 4.4 + Prom 15595 - 15654 2.9 10 6 Op 1 . + CDS 15766 - 16659 686 ## COG3757 Lyzozyme M1 (1,4-beta-N-acetylmuramidase) 11 6 Op 2 3/0.000 + CDS 16736 - 18079 682 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 12 6 Op 3 . 
+ CDS 18099 - 18821 393 ## COG0095 Lipoate-protein ligase A + Prom 18847 - 18906 5.7 13 7 Op 1 24/0.000 + CDS 18930 - 20366 1415 ## COG0508 Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes 14 7 Op 2 . + CDS 20404 - 22440 2134 ## COG0022 Pyruvate/2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, eukaryotic type, beta subunit 15 7 Op 3 . + CDS 22453 - 22959 559 ## COG0716 Flavodoxins + Term 23009 - 23068 8.1 - Term 22996 - 23055 4.3 16 8 Tu 1 . - CDS 23186 - 24034 1495 ## PROTEIN SUPPORTED gi|237715971|ref|ZP_04546452.1| ribosomal protein L11 methyltransferase - Prom 24059 - 24118 2.5 + Prom 23777 - 23836 3.5 17 9 Op 1 . + CDS 24046 - 24240 117 ## gi|237722251|ref|ZP_04552732.1| predicted protein 18 9 Op 2 . + CDS 24260 - 25594 491 ## BT_0315 hypothetical protein 19 9 Op 3 . + CDS 25620 - 27215 1396 ## BVU_2305 hypothetical protein + Term 27266 - 27314 9.4 - Term 27176 - 27227 0.6 20 10 Op 1 . - CDS 27335 - 28144 685 ## BT_0320 hypothetical protein 21 10 Op 2 . - CDS 28184 - 28654 542 ## COG0590 Cytosine/adenosine deaminases - Prom 28749 - 28808 5.6 22 11 Tu 1 . - CDS 28810 - 29796 885 ## COG0673 Predicted dehydrogenases and related proteins - Prom 29829 - 29888 4.7 - Term 30105 - 30147 10.4 23 12 Op 1 . - CDS 30309 - 31148 756 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain 24 12 Op 2 . - CDS 31166 - 33229 1589 ## BT_0328 hypothetical protein - Prom 33278 - 33337 2.1 25 13 Op 1 22/0.000 - CDS 33339 - 33881 662 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit 26 13 Op 2 . - CDS 33902 - 34666 636 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit 27 13 Op 3 . - CDS 34679 - 34852 200 ## BF1647 hypothetical protein 28 13 Op 4 . 
- CDS 34860 - 35942 1372 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit 29 13 Op 5 . - CDS 35954 - 36181 314 ## BT_0333 hypothetical protein 30 13 Op 6 . - CDS 36236 - 36517 214 ## BT_0334 hypothetical protein - Prom 36542 - 36601 5.9 + Prom 36817 - 36876 5.1 31 14 Op 1 2/0.000 + CDS 36902 - 37747 753 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 37806 - 37847 7.8 + Prom 37809 - 37868 5.7 32 14 Op 2 . + CDS 37900 - 38850 855 ## COG1073 Hydrolases of the alpha/beta superfamily + Prom 38912 - 38971 2.0 33 15 Tu 1 . + CDS 39017 - 39586 471 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Term 39660 - 39716 -0.1 34 16 Tu 1 . - CDS 39475 - 39720 109 ## - Prom 39774 - 39833 7.5 35 17 Op 1 . - CDS 39835 - 41394 1275 ## BVU_2520 hypothetical protein 36 17 Op 2 . - CDS 41437 - 43002 1088 ## BVU_2520 hypothetical protein - Prom 43052 - 43111 3.7 37 18 Tu 1 . - CDS 43196 - 43492 271 ## BDI_1586 putative nucleotidyltransferase - Prom 43520 - 43579 1.9 - Term 43579 - 43609 2.7 38 19 Tu 1 . - CDS 43637 - 43810 176 ## BT_1707 hypothetical protein - Prom 43842 - 43901 3.2 39 20 Op 1 . - CDS 43961 - 45754 1160 ## BVU_2520 hypothetical protein 40 20 Op 2 . - CDS 45787 - 47553 1030 ## BVU_2520 hypothetical protein 41 20 Op 3 . - CDS 47589 - 48794 715 ## BVU_2521 hypothetical protein 42 20 Op 4 . - CDS 48891 - 50174 1203 ## BVU_2522 hypothetical protein 43 20 Op 5 . - CDS 50243 - 51295 867 ## BVU_2523 hypothetical protein 44 20 Op 6 . - CDS 51292 - 52590 649 ## BT_3061 hypothetical protein 45 20 Op 7 . - CDS 52627 - 52998 254 ## BVU_2524 hypothetical protein 46 21 Tu 1 . - CDS 53406 - 54383 708 ## BVU_2525 tyrosine type site-specific recombinase - Prom 54460 - 54519 8.6 47 22 Tu 1 . + CDS 54334 - 54504 69 ## + Term 54512 - 54549 -0.9 48 23 Tu 1 . - CDS 54615 - 55052 324 ## BT_4511 hypothetical protein - Prom 55099 - 55158 5.4 - Term 55144 - 55189 -0.7 49 24 Tu 1 . 
- CDS 55224 - 56981 1264 ## BVU_3461 hypothetical protein - Prom 57128 - 57187 4.0 + Prom 57156 - 57215 4.4 50 25 Tu 1 . + CDS 57236 - 57715 474 ## COG2839 Uncharacterized protein conserved in bacteria 51 26 Tu 1 . + CDS 57827 - 59686 1227 ## BT_0338 hypothetical protein + Prom 59741 - 59800 6.8 52 27 Tu 1 . + CDS 59839 - 62085 1808 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases + Prom 62148 - 62207 6.9 53 28 Op 1 . + CDS 62230 - 64035 1784 ## COG5012 Predicted cobalamin binding protein + Prom 64090 - 64149 3.6 54 28 Op 2 . + CDS 64169 - 65740 1375 ## COG4146 Predicted symporter + Term 65754 - 65816 11.5 - Term 65746 - 65799 3.4 55 29 Op 1 . - CDS 65821 - 66498 360 ## BT_0342 hypothetical protein 56 29 Op 2 . - CDS 66491 - 67501 838 ## COG0407 Uroporphyrinogen-III decarboxylase - Prom 67559 - 67618 5.4 - Term 67582 - 67626 10.1 57 30 Op 1 . - CDS 67652 - 68086 538 ## COG0698 Ribose 5-phosphate isomerase RpiB 58 30 Op 2 . - CDS 68086 - 70095 2269 ## COG0021 Transketolase - Prom 70128 - 70187 3.3 - Term 70127 - 70175 2.5 59 31 Op 1 1/0.000 - CDS 70238 - 71782 1616 ## COG3534 Alpha-L-arabinofuranosidase - Prom 71835 - 71894 5.3 60 31 Op 2 . - CDS 71914 - 74316 1808 ## COG3533 Uncharacterized protein conserved in bacteria - Prom 74364 - 74423 7.0 61 32 Tu 1 . - CDS 74458 - 76053 1339 ## COG1070 Sugar (pentulose and hexulose) kinases - Prom 76078 - 76137 2.4 - Term 76087 - 76146 12.6 62 33 Op 1 5/0.000 - CDS 76160 - 77689 1706 ## COG2160 L-arabinose isomerase 63 33 Op 2 . - CDS 77731 - 78414 846 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 64 33 Op 3 . - CDS 78420 - 79163 809 ## COG1051 ADP-ribose pyrophosphatase - Term 79180 - 79214 -0.1 65 34 Op 1 . - CDS 79270 - 80964 1877 ## COG4146 Predicted symporter 66 34 Op 2 . - CDS 80990 - 82129 395 ## PROTEIN SUPPORTED gi|15900011|ref|NP_344615.1| aldose 1-epimerase - Prom 82257 - 82316 7.6 + Prom 82260 - 82319 11.3 67 35 Tu 1 . 
+ CDS 82507 - 84489 1895 ## COG3534 Alpha-L-arabinofuranosidase + Term 84553 - 84591 5.2 - Term 84536 - 84582 13.1 68 36 Op 1 . - CDS 84605 - 85285 430 ## PG0078 hypothetical protein 69 36 Op 2 . - CDS 85292 - 86503 915 ## COG1106 Predicted ATPases - Prom 86586 - 86645 6.1 - Term 86631 - 86675 10.5 70 37 Op 1 . - CDS 86702 - 87856 1339 ## COG0153 Galactokinase 71 37 Op 2 . - CDS 87904 - 89244 1467 ## COG0738 Fucose permease 72 37 Op 3 . - CDS 89296 - 90393 376 ## PROTEIN SUPPORTED gi|15900011|ref|NP_344615.1| aldose 1-epimerase + Prom 90606 - 90665 3.5 73 38 Tu 1 . + CDS 90720 - 91691 934 ## COG1482 Phosphomannose isomerase + Term 91711 - 91754 7.2 - Term 91627 - 91668 -0.4 74 39 Op 1 . - CDS 91787 - 93490 969 ## gi|237716026|ref|ZP_04546507.1| predicted protein 75 39 Op 2 . - CDS 93487 - 94131 403 ## gi|237716027|ref|ZP_04546508.1| predicted protein 76 39 Op 3 . - CDS 94109 - 96328 999 ## COG3344 Retron-type reverse transcriptase - Prom 96572 - 96631 4.1 + Prom 96420 - 96479 4.4 77 40 Tu 1 . + CDS 96604 - 97569 518 ## BT_0595 integrase + Term 97752 - 97799 -1.0 78 41 Tu 1 . - CDS 97484 - 97669 79 ## - Prom 97897 - 97956 5.5 + Prom 97639 - 97698 6.7 79 42 Op 1 . + CDS 97905 - 98483 541 ## BT_0596 putative transcriptional regulator 80 42 Op 2 . + CDS 98512 - 99261 643 ## BT_0613 putative membrane protein involved in polysaccharide export 81 43 Op 1 . - CDS 99265 - 99678 268 ## COG3023 Negative regulator of beta-lactamase expression 82 43 Op 2 . - CDS 99682 - 99789 103 ## 83 43 Op 3 . - CDS 99825 - 100313 506 ## BT_1705 hypothetical protein - Prom 100511 - 100570 5.2 + Prom 100452 - 100511 6.4 84 44 Tu 1 . + CDS 100534 - 100752 212 ## BT_1704 hypothetical protein + Term 100862 - 100896 0.9 85 45 Op 1 . - CDS 100927 - 102759 1380 ## BT_1638 hypothetical protein 86 45 Op 2 . - CDS 102803 - 103435 501 ## BT_1702 hypothetical protein - Prom 103460 - 103519 6.2 + Prom 103493 - 103552 4.7 87 46 Tu 1 . 
+ CDS 103589 - 103738 222 ## gi|237716039|ref|ZP_04546520.1| predicted protein + Prom 103778 - 103837 2.1 88 47 Op 1 2/0.000 + CDS 103889 - 105295 845 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 89 47 Op 2 2/0.000 + CDS 105360 - 106169 748 ## COG1596 Periplasmic protein involved in polysaccharide export 90 47 Op 3 . + CDS 106176 - 108617 1912 ## COG0489 ATPases involved in chromosome partitioning + Prom 108639 - 108698 4.3 91 48 Op 1 . + CDS 108729 - 109169 226 ## Kkor_2547 ExoV-like protein 92 48 Op 2 . + CDS 109156 - 109575 140 ## Kkor_2547 ExoV-like protein 93 48 Op 3 . + CDS 109575 - 111110 269 ## BVU_2391 putative transmembrane protein 94 48 Op 4 . + CDS 111173 - 111982 452 ## COG3774 Mannosyltransferase OCH1 and related enzymes 95 48 Op 5 . + CDS 111997 - 112374 196 ## COG3594 Fucose 4-O-acetylase and related acetyltransferases + Prom 112391 - 112450 4.0 96 49 Op 1 . + CDS 112545 - 112967 127 ## gi|294643767|ref|ZP_06721565.1| hypothetical protein CW1_1423 97 49 Op 2 . + CDS 112974 - 113942 516 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 98 49 Op 3 . + CDS 113944 - 114981 683 ## NGR_c02340 hypothetical protein 99 49 Op 4 . + CDS 115025 - 116155 893 ## COG0562 UDP-galactopyranose mutase 100 49 Op 5 . + CDS 116158 - 117162 414 ## Fnod_1455 galactofuranosyltransferase + Term 117211 - 117257 1.8 101 50 Tu 1 . + CDS 117593 - 118414 79 ## gi|237716051|ref|ZP_04546532.1| predicted protein + Prom 118505 - 118564 8.8 102 51 Op 1 25/0.000 + CDS 118587 - 119645 338 ## COG0438 Glycosyltransferase 103 51 Op 2 8/0.000 + CDS 119656 - 120747 836 ## COG0438 Glycosyltransferase 104 51 Op 3 8/0.000 + CDS 120759 - 121325 403 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 105 51 Op 4 . + CDS 121364 - 122479 363 ## COG0438 Glycosyltransferase 106 52 Op 1 2/0.000 + CDS 123050 - 123529 256 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase 107 52 Op 2 . 
+ CDS 123526 - 123996 327 ## COG5017 Uncharacterized conserved protein 108 52 Op 3 . + CDS 124027 - 124782 675 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Term 124785 - 124819 -0.7 109 53 Tu 1 . + CDS 125667 - 127220 1157 ## BT_0374 hypothetical protein + Term 127336 - 127378 -0.2 - Term 127106 - 127143 -1.0 110 54 Op 1 . - CDS 127227 - 127484 278 ## BT_0406 hypothetical protein 111 54 Op 2 . - CDS 127488 - 127850 289 ## BT_0407 hypothetical protein 112 54 Op 3 . - CDS 127883 - 128092 386 ## BF1659 hypothetical protein - Prom 128120 - 128179 8.3 + Prom 128137 - 128196 7.4 113 55 Tu 1 . + CDS 128293 - 129063 900 ## COG4221 Short-chain alcohol dehydrogenase of unknown specificity + Term 129101 - 129155 -0.1 + Prom 129093 - 129152 3.6 114 56 Tu 1 . + CDS 129179 - 129673 620 ## BT_0410 hypothetical protein + Term 129674 - 129719 -0.0 + Prom 129686 - 129745 4.4 115 57 Op 1 . + CDS 129925 - 130749 668 ## COG1218 3'-Phosphoadenosine 5'-phosphosulfate (PAPS) 3'-phosphatase 116 57 Op 2 1/0.000 + CDS 130761 - 132314 1390 ## COG0471 Di- and tricarboxylate transporters 117 57 Op 3 8/0.000 + CDS 132330 - 132938 595 ## COG0529 Adenylylsulfate kinase and related kinases 118 57 Op 4 18/0.000 + CDS 133018 - 133926 1016 ## COG0175 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes 119 57 Op 5 . + CDS 133939 - 135399 1665 ## COG2895 GTPases - Sulfate adenylate transferase subunit 1 + Term 135418 - 135470 11.0 + Prom 135402 - 135461 4.5 120 58 Op 1 . + CDS 135483 - 136589 1105 ## BT_0416 hypothetical protein 121 58 Op 2 . + CDS 136615 - 137583 868 ## BT_0417 hypothetical protein + Term 137604 - 137656 12.1 + Prom 137602 - 137661 8.1 122 59 Op 1 . + CDS 137900 - 139039 1295 ## BT_0418 outer membrane porin F precursor + Term 139061 - 139111 12.2 123 59 Op 2 . + CDS 139114 - 139530 283 ## COG0816 Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. 
subtilis) 124 59 Op 3 . + CDS 139574 - 140128 689 ## COG0242 N-formylmethionyl-tRNA deformylase 125 59 Op 4 . + CDS 140205 - 142235 2066 ## COG0457 FOG: TPR repeat + Prom 142237 - 142296 2.1 126 60 Op 1 16/0.000 + CDS 142318 - 144258 1943 ## COG0441 Threonyl-tRNA synthetase 127 60 Op 2 . + CDS 144331 - 144942 517 ## COG0290 Translation initiation factor 3 (IF-3) + Term 144953 - 144991 1.1 128 61 Op 1 . + CDS 145011 - 145208 334 ## PROTEIN SUPPORTED gi|153808045|ref|ZP_01960713.1| hypothetical protein BACCAC_02331 129 61 Op 2 . + CDS 145307 - 145657 595 ## PROTEIN SUPPORTED gi|29345835|ref|NP_809338.1| 50S ribosomal protein L20 + Term 145683 - 145723 9.2 + Prom 145757 - 145816 5.3 130 62 Tu 1 . + CDS 146016 - 146777 365 ## COG1145 Ferredoxin 131 63 Op 1 . - CDS 146795 - 147367 523 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins - Prom 147388 - 147447 3.2 132 63 Op 2 1/0.000 - CDS 147458 - 148765 1210 ## COG1541 Coenzyme F390 synthetase 133 63 Op 3 11/0.000 - CDS 148777 - 149361 654 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit 134 63 Op 4 . - CDS 149365 - 150957 1670 ## COG4231 Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits - Prom 150980 - 151039 4.0 135 63 Op 5 . - CDS 151041 - 152078 605 ## COG1559 Predicted periplasmic solute-binding protein - Prom 152167 - 152226 4.7 - Term 152427 - 152478 -0.9 136 64 Op 1 . - CDS 152537 - 153415 564 ## COG1864 DNA/RNA endonuclease G, NUC1 137 64 Op 2 . - CDS 153443 - 154024 402 ## COG1636 Uncharacterized protein conserved in bacteria - Prom 154119 - 154178 3.0 + Prom 153866 - 153925 5.9 138 65 Tu 1 . + CDS 154129 - 156168 1363 ## BT_0761 hypothetical protein + Term 156356 - 156405 15.1 + TRNA 156277 - 156349 81.4 # Thr CGT 0 0 + Prom 156279 - 156338 76.9 139 66 Tu 1 . + CDS 156473 - 156568 65 ## + Term 156757 - 156807 3.3 - Term 156531 - 156570 1.2 140 67 Tu 1 . 
- CDS 156571 - 157779 326 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase - Prom 158001 - 158060 9.6 + Prom 157812 - 157871 10.6 141 68 Op 1 . + CDS 158114 - 159988 1264 ## BT_0434 hypothetical protein 142 68 Op 2 . + CDS 160033 - 161130 756 ## BT_0435 hypothetical protein 143 68 Op 3 1/0.000 + CDS 161136 - 162539 1087 ## COG0477 Permeases of the major facilitator superfamily 144 68 Op 4 . + CDS 162581 - 163753 1222 ## COG2942 N-acyl-D-glucosamine 2-epimerase 145 68 Op 5 . + CDS 163760 - 165898 1369 ## Phep_2992 hypothetical protein 146 68 Op 6 . + CDS 165936 - 169064 2181 ## BT_3604 hypothetical protein 147 68 Op 7 . + CDS 169101 - 170795 1582 ## BT_3603 hypothetical protein 148 68 Op 8 . + CDS 170826 - 172694 1619 ## BT_3602 hypothetical protein 149 68 Op 9 . + CDS 172764 - 173966 1032 ## BT_3591 hypothetical protein + Term 173989 - 174031 4.3 + Prom 174013 - 174072 6.0 150 69 Op 1 . + CDS 174107 - 176269 1291 ## BT_3590 alpha-N-acetylglucosaminidase precursor 151 69 Op 2 . + CDS 176293 - 178872 1301 ## COG1472 Beta-glucosidase-related glycosidases 152 69 Op 3 . + CDS 178894 - 179976 566 ## BT_3593 hypothetical protein - Term 180736 - 180780 2.3 153 70 Tu 1 . - CDS 180805 - 181596 489 ## BT_3593 hypothetical protein - Prom 181819 - 181878 4.3 154 71 Op 1 . - CDS 181881 - 183068 678 ## COG1649 Uncharacterized protein conserved in bacteria 155 71 Op 2 . - CDS 183131 - 185461 879 ## COG4632 Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase 156 71 Op 3 . - CDS 185469 - 186872 743 ## BT_0446 hypothetical protein 157 71 Op 4 . - CDS 186904 - 187788 571 ## gi|237716108|ref|ZP_04546589.1| conserved hypothetical protein 158 71 Op 5 . - CDS 187831 - 188706 407 ## gi|294647256|ref|ZP_06724853.1| hypothetical protein CW1_2827 - Term 188708 - 188757 9.0 159 72 Op 1 . - CDS 188765 - 190525 1252 ## gi|237716110|ref|ZP_04546591.1| conserved hypothetical protein 160 72 Op 2 . 
- CDS 190583 - 192592 1198 ## Cphy_1063 hypothetical protein 161 72 Op 3 . - CDS 192610 - 194496 1485 ## Slin_6287 RagB/SusD domain protein 162 72 Op 4 . - CDS 194510 - 197536 2013 ## BF1062 hypothetical protein 163 72 Op 5 1/0.000 - CDS 197557 - 198822 1249 ## COG2942 N-acyl-D-glucosamine 2-epimerase 164 72 Op 6 . - CDS 198867 - 199331 325 ## COG0477 Permeases of the major facilitator superfamily - Prom 199435 - 199494 3.6 + Prom 199313 - 199372 5.6 165 73 Tu 1 . + CDS 199445 - 201076 1371 ## COG4409 Neuraminidase (sialidase) + Prom 201205 - 201264 7.3 166 74 Op 1 1/0.000 + CDS 201367 - 203979 1729 ## COG3250 Beta-galactosidase/beta-glucuronidase 167 74 Op 2 . + CDS 204014 - 206338 2247 ## COG3525 N-acetyl-beta-hexosaminidase 168 74 Op 3 . + CDS 206379 - 206618 94 ## gi|294646589|ref|ZP_06724222.1| conserved domain protein 169 74 Op 4 . + CDS 206506 - 208413 1478 ## COG3525 N-acetyl-beta-hexosaminidase + Term 208530 - 208587 12.1 + Prom 208529 - 208588 6.2 170 75 Op 1 . + CDS 208609 - 211353 2287 ## BT_0483 hypothetical protein 171 75 Op 2 . + CDS 211368 - 213050 1369 ## BT_0484 hypothetical protein + Term 213175 - 213215 2.1 172 76 Tu 1 . - CDS 213060 - 214199 811 ## COG4335 DNA alkylation repair enzyme - Prom 214401 - 214460 9.7 - Term 214426 - 214488 11.1 173 77 Op 1 5/0.000 - CDS 214513 - 215700 1431 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain 174 77 Op 2 . - CDS 215816 - 216397 801 ## COG0576 Molecular chaperone GrpE (heat shock protein) - Prom 216593 - 216652 4.4 + Prom 216578 - 216637 8.3 175 78 Tu 1 . + CDS 216721 - 218340 1965 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Term 218451 - 218501 -0.8 - Term 218388 - 218426 0.4 176 79 Tu 1 . - CDS 218566 - 220314 1431 ## BT_2553 hypothetical protein - Prom 220390 - 220449 3.9 177 80 Tu 1 . - CDS 220504 - 222696 1445 ## BF3670 hypothetical protein - Prom 222756 - 222815 6.6 - Term 222755 - 222806 6.3 178 81 Tu 1 . 
- CDS 222824 - 223081 78 ## gi|298480880|ref|ZP_06999075.1| hypothetical protein HMPREF0106_01318 - Prom 223288 - 223347 9.5 + Prom 222715 - 222774 4.4 179 82 Tu 1 . + CDS 222980 - 223204 232 ## gi|237716128|ref|ZP_04546609.1| conserved hypothetical protein + Term 223315 - 223367 19.0 + Prom 223234 - 223293 9.1 180 83 Tu 1 . + CDS 223469 - 223705 122 ## gi|295084563|emb|CBK66086.1| hypothetical protein 181 84 Tu 1 . - CDS 223712 - 223957 217 ## BT_0708 hypothetical protein - Prom 223982 - 224041 7.0 + Prom 223985 - 224044 7.0 182 85 Op 1 . + CDS 224253 - 224753 602 ## BT_0707 hypothetical protein 183 85 Op 2 . + CDS 224804 - 225247 360 ## COG3023 Negative regulator of beta-lactamase expression + Prom 225252 - 225311 2.5 184 86 Tu 1 . + CDS 225355 - 225513 154 ## + Term 225540 - 225586 3.1 + Prom 225527 - 225586 7.1 185 87 Op 1 . + CDS 225611 - 228634 1417 ## Slin_2121 YD repeat protein 186 87 Op 2 . + CDS 228644 - 231899 1334 ## BT_2927 putative cell wall-associated protein precursor Predicted protein(s) >gi|222159322|gb|ACAB01000037.1| GENE 1 235 - 1188 607 317 aa, chain + ## HITS:1 COG:no KEGG:BT_0291 NR:ns ## KEGG: BT_0291 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 312 2 310 321 495 81.0 1e-139 MQKMNKNGFSQCAGAYIERLRKEGRYSTAHVYKNALFSFSKFCGTGNISFRQVTRECLRC YGQHLYGSGLKLNTVSTYMRMLRSIYNRGVEAGSAPYVPRLFHDVYTGVDIRQKKALPVT ELHKLLYEDPKSERLRRTQTIAALMFQFCGMSFADLAHLEKSALDRNVLQYNRIKTKTPI SLEILESAKEMVNQLRSNKPALPDCPDYLFDILHGDKKRKDEKAYKEYQSALRRFNNSLK DLARVLRLDSPVTSYTFRHSWATTAKYRGVPIEMISESLGHKSIKTTQIYLKGFGLRERT EVNRKNLSYVKNYNVSR >gi|222159322|gb|ACAB01000037.1| GENE 2 1603 - 2352 493 249 aa, chain + ## HITS:1 COG:no KEGG:BT_0292 NR:ns ## KEGG: BT_0292 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 249 1 249 249 392 75.0 1e-108 MKQTKAAFRNKPFISLKGSILFLILIPMSGLVHAQYSMGTTGQLMIPTAEMQETGTFMGG VNFLPEQVTPSVFSFPTMNYFVDMTLFSFIEFTYRMTLLKMTTGTGRTGYHNQDRSNTIR 
IRPLKESRYFPAVVIGGDDLLTEKKTPYWGAYYGVLTKTIGFRSGDQLAVTAGWYIHQGD CRVFNKGPFGGVRYTPSFCKELKLMVEYDTRGWNMGAAMRFWKHLSVNVFTREFTCVSAG LRYECTLMH >gi|222159322|gb|ACAB01000037.1| GENE 3 2359 - 3576 752 405 aa, chain + ## HITS:1 COG:no KEGG:BT_0293 NR:ns ## KEGG: BT_0293 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 405 1 406 410 644 73.0 0 MKRIVSATAVFLCGISLLQAQPVRVSETLKELDMENISVVEKRDTITAAFETSAYRGIYN GIGIAIRHLVIIPEIPTLQLLILDNALPQLCITIPAELIQKYQAGECALDEVYRKMGMTT STETAVRQLKGVKRKESSFGKVDLVVYPNVMLVNNVTYKLYKAALELQPAVEMQLWKGAS LRMQVSLPIVSNEDGKWNCVRLGYMTFRQDFRLANHWKGYLTGGSFSNDRQGLAAGIGYF SANGQWTVEGGGGITGSAHFYGSEWKMSQWKRVNGQISVGYYIPEVNTLVKVEGDRFIYG DYGVRGTLSRYFGEYIVGIYGMYTNGATNAGFNFSIPLPGKKRKRHLLRVMLPEYFAFQY DMRSGNEYAHRSLGESYTVEPKSAENSHFWQPDYIRYYLIKTSEK >gi|222159322|gb|ACAB01000037.1| GENE 4 3624 - 5084 1375 486 aa, chain + ## HITS:1 COG:no KEGG:BT_0294 NR:ns ## KEGG: BT_0294 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 485 1 475 476 452 60.0 1e-125 MKSVKWFYVGAMAIALGMLTFATVACHDDDDDPKPAEGEVIETPKPVVEYYIMGTVTSGG EAMNGVKVKVGSKNYTTDSNGKFSVTESATGTYSIEASSNGYLSQKTSVVIAENAENRSV VTVALALTKESRKETVSVTAEEEIKVEDNSESNTTIEDPGEVAPEVVVEDKPLVKVELAI PAGAIDTEASASLISDGNVGISVTTFVPAPAEVTTEVKAEEVNQNVEKSIPLAAAKFEPS GLKFKNDVTISIPNPIPGITFADADMILTYQNPDTGEWGDAKDNDGNVIKNVSSTTENGA VTAYTADVDHFSAYAIENKVYSKISNETVTTNILGQASRDNSENAKAVTGIELKYKEKSG WDYDKNDAGLVAEVKSQLGAGASAEDTKTVNAMVAFMKTRMFSLMGSVSGITETERVYNT VNVNGYTTMSYTCYAKARTTTLTANVKFNGQNKTVSITATRYTGTDHQYKTVTYNPTHSG GKGGSI >gi|222159322|gb|ACAB01000037.1| GENE 5 5509 - 7725 2176 738 aa, chain - ## HITS:1 COG:PA3339_1 KEGG:ns NR:ns ## COG: PA3339_1 COG1752 # Protein_GI_number: 15598535 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Pseudomonas aeruginosa # 27 297 22 299 308 195 41.0 3e-49 MKKQIFSTLVLSIGILLPFSLHSQEQRKKVGVVLSGGGAKGMAHIKALKVIEEAGIPIDY 
IAGTSMGAIVGGLYAIGYTTEQLDSMVRKQDWTFLLSDRIKRSAMSLTDRERSEKYTVSI PFTKTPKDAATGGLMKGQNLANLFSDLTVGYHDSIDFNKLPIPFACVAANVVNGEQIVFH DGILSTAMRASMAIPGVFTPVRQDSMVLVDGGIVNNYPADVVKAMGADIIIGVDVQNALK KADKLNSVPDILGQIVDITCQSNHEKNVDLTDTYIRVNVEGYSSASFTPAAIDTLMRRGE EAAKEQWNSLLALKKKIGITEDYTPKQHGPYSSLSNARTVYVTDISFSGVEVDDKKWLMK KCNLKENSDITTLQIEQALYQLRGSQSYSSASYTLKETPEGYHLNFLLQEKYERRINLGI RFDSEEIASLLVNATADLKTHIPSRLALTGRLGKRYAARIDYTLEPMQQRNFNFSYMFQY NDINIYEEGDRAYNTTYKYHLAEFGFSDVWYKNFRFGLGLRFEYYKYKDFLFKKPEISDL KVESEHFLSYFAQVQYNTYDKGRFPSKGSDFRATYSLYTDNMAQYNDHAPFSALNASWAS VIPVTRRFSVIPSIYGRILIGRDFPYPLQNAIGGDVPGFYIPQQLPFAGVTNLELMDNTI MIASIKFRQRMGAIHYLTLTGNYGLTDSNFFDILKGKQLFGVSAGYGMDSIFGPLEISLG YSNQTDKGSCFVNLGYYF >gi|222159322|gb|ACAB01000037.1| GENE 6 7997 - 9253 1305 418 aa, chain + ## HITS:1 COG:mll6731 KEGG:ns NR:ns ## COG: mll6731 COG0845 # Protein_GI_number: 13475614 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Mesorhizobium loti # 38 393 53 402 402 209 37.0 6e-54 MRLFFSKHELKLRRKRTIAAIMCMVVVLGVYWILTRPQKAAPEMPTVIVEPVVKDDVEIY GEYVGRIRAQQFVEVRARVEGYLENMLFAEGTYVNKNQVLFVINQDQYRAKADKARAQLK KDEAQALKAERDLKRIRPLFEQNAASQLDLDNAEAAYESAEATVAMSEADLAQAELELGY TIVRSPLSGHISERNVDLGTLVGPGGKSLLATIVKSDTVLVDFSMTALDYLKSKERNINL GQQDSTRSWQPNITITLADNTVYPFKGYVDFAEPQVDPQTGTFSVRAEMPNPKQVLLPGQ FTKVKLLLDVREGALVVPMKAVTIEKGGAYIYTMRKDNAVEKRFIELGPEVGNNVVVERG LAEGEMVVVEGFHKLTPGMKVRVSDPEAEAGDSITTTKNEVTGVKENTTGTKDNAKGE >gi|222159322|gb|ACAB01000037.1| GENE 7 9257 - 12361 2929 1034 aa, chain + ## HITS:1 COG:SMa1662 KEGG:ns NR:ns ## COG: SMa1662 COG0841 # Protein_GI_number: 16263363 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Sinorhizobium meliloti # 6 1028 7 1030 1044 840 43.0 0 MKVTFFIDRPVFSAVISIVIVIVGIIGLTMLPVDQYPQITPPVVKISASYPGASALTVSQ AVATPIEQEINGTPGMLYMESNSSNSGGFSATVTFDVSADPDLAAVEIQNRVKLAESRLP AEVIQNGISVEKQAPSQLMTLCLTSTDPKFDEIYLSNFATINVLDVIRRIPGVGRVSNIG SRYYAMQIWAQPDKLANFGLTVQDLQNALKDQNRESAAGVLGQQPVQGLDITIPITTQGR 
LSTVGQFEDIVVRANANGSIIRLKDVARVSLEASSYNTESGINGENAAVLGIYMLPGANA MEVAERVKEAMDEISKNFPEGLSYEIPFDMTTYISESIHEVYKTLFEALVLVVLVVYLSL QSWRATLIPVVAVPISLIGTFGFMLIFGFSLNILTLLGLILAIGIVVDDAIVVVEGVEHI METEHLSPYEATKKAMNGLASALIATSLVLAAVFVPVSSLSGITGQLYRQFTVTIVVSVL ISTVVALTLSPVMCSLILKPDNGKKKNIVFRKINEWLGIGSNKYVAAVTRTIKHPRRVLS AFGMVLIAIMLIHRIIPTSFLPVEDQGYFKIELELPEGATLERTRIVTERAIAYLEKNPY IEYIQNVTGSSPRVGSNQGRAELTVILKPWEERKSTTIEKIMDTVEKHLREYPECKVYLS TPPVIPGLGSSGGFEMQLEARGEATFDNLVDAADTLMYYASKRKELAGLSSSLQSEIPQL YFDVDRDKVKMLGVPLADVFSTMKAYTGSVYVNDFNMFNRIYKVYIQAEAPYREHKDNIN LFFVKASNGAMVPLTSLGNASYTTGPGSIKRFNMFTTAVIRGAAAQGYSSGQAMEIMEQI ARDHLPDNIGLEWSGLSYQEKQAGGQTGMVMALVFLFVFLFLAAQYESWTVPIAVLLSLP VAALGAYLGVWVCGLENDVYFQIGLVMLVGLAAKNAILIVEFAKVQVDKGEDLVQSAIYA AKLRFRPILMTSLAFVLGMLPMVLASGPGSASRQAIGTGVFFGMIFAIVFGIILVPFFFV MVYKTKSKILKHKK >gi|222159322|gb|ACAB01000037.1| GENE 8 12361 - 13740 437 459 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 10 456 7 457 460 172 27 9e-42 MKRKKLYIAILLLAGVLTSCKVGKSYVRPDLHLPDSLERAQDSISFGDQDWRDIYTDATL RSLIERALDHNKDMLIAAARVKEMAAQKRISTAALLPDIKGKVTAERELENHGGDAFKRS ETFEAQFLVSWELDLWGNLRWARSASIAEYLQSIEAQRALRMTIVAEVAQAYYELVALDT ELDIVKQTLKAREEGVRLARIRFAGGLTSETSYRQAQVELARTATLVPDLERKISLKEND IAYLAGEYPNKIARSRLLQEFNSPETLPVGLPSTLLERRPDIRQAEQKLIAANAKVGVAY TNMFPRLALTGGFGSESTSLSELLKSPYAVMEGALLTPIFGWGKNRAALKAKKAAYEAEV HSYEKAVLEAFKETRNAIVNFNKIKEVYELRANLERSAKSYMDLAQLQYINGVINYLDVL DAQRGYFDAQIGLSNAIRDELIAVVQLYKALGGGWEQNP >gi|222159322|gb|ACAB01000037.1| GENE 9 13947 - 15593 1982 548 aa, chain + ## HITS:1 COG:TP0542 KEGG:ns NR:ns ## COG: TP0542 COG0205 # Protein_GI_number: 15639531 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Treponema pallidum # 1 546 1 559 573 634 54.0 0 MTKSALQIARAAYQPKLPKALASGAVKAVAGAATQSVADQEAIKALFPNTYGMPLITFEA GEAVALPAMNVGVILSGGQAPGGHNVISGLFDGIKKLNPENKLYGFILGPGGLVDHNYME LTADIIDEYRNTGGFDIIGSGRTKLEAESQFEKGYEIIKELGIKALVIIGGDDSNTNACV 
LAEYYAAKNYGVQVIGCPKTIDGDLKNDMIETSFGFDTACKTYAEVIGNIQRDCNSARKY WHFIKLMGRSASHIALECALQVQPNVCIVSEEVEAKDMSLDDVVTYIAKVVADRAAQGNN FGTVLIPEGLVEFIPAMKRLIAELNDFLAANAEEFGQIKKSHQRDYIIRKLSPENSAIYA SLPEGVARQLTLDRDPHGNVQVSLIETEKLLSEMVATKLASWKEAGKYVGKFAAQHHFFG YEGRCAAPSNFDADYCYSLGYTASMLIANGKTGYMSSVRNTTAPAAEWIAGGVPITMMMN MERRHGEMKPVIQKALVKLDGAPFKAFAAQRDRWAIETDYVYPGPIQYFGPTEVCDQATK TLQLEQAK >gi|222159322|gb|ACAB01000037.1| GENE 10 15766 - 16659 686 297 aa, chain + ## HITS:1 COG:yegX KEGG:ns NR:ns ## COG: yegX COG3757 # Protein_GI_number: 16130040 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lyzozyme M1 (1,4-beta-N-acetylmuramidase) # Organism: Escherichia coli K12 # 70 292 48 265 275 187 43.0 3e-47 MPQRNNPMSAVQKKRTVSTTKKKGTTSSSKTSRTSKKEQMKHRTVMPVWIRNILAVVIVG CFSVVFYYFFIRPYAYRWKPCHGLKEYGVCIPDGYDIHGIDISHYQGKIDWKRLLQNKET ATPLHFVFMKATEGGDHNDTTFEANFANARNHGFIRGAYHFYIPGTDALKQADFFIRTVK LDTGDLPPVLDVEVTGRKEKKELQQGIKRWLDRVESHYGVKPILYTSYKFKTRYLDDSIF NTYPYWIAHYYVDSVKYQGKWDFWQHTDVGSVPGIKEDVDLNVFNGSLEELKKLTIK >gi|222159322|gb|ACAB01000037.1| GENE 11 16736 - 18079 682 447 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 3 444 4 446 458 267 33 3e-70 MNFDIAIIGGGPAGYTAAERAGANGLKAVLFEKKAMGGVCLNEGCIPTKTLLYSAKILDS IKSASKYGVSADSPSFDLSKIMSRKDKTVKMLTGGVKMTVNSYGVTIVEKEAFIEGEKEG MIRITCDGETYSVKYLLVCTGSDTVIPPIPGLSGVSYWTSKEALEIKELPKTLVIIGGGV IGMEFASFFNSMGVKVHVVEMMPEILGAMDKETSGMLRAEYAKRGVTFYLNTKVVEVNAH GVVIEKEGKVSTIEAEKILLSVGRKANLSKVGLDKLNIELHRNGVKVDEHLLTSHPRVYA CGDITGYSLLAHTAIREAEVAINHILGVEDRMNYDCVPGVVYTNPEVAGVGKTEEELIKS GLSYRVSKLPMAYSGRFVAENEQGNGLCKLIQDEEGKIIGCHMLGNPASELIVIAGIAIQ RGYTVEEFQKTVFPHPTVGEIYHEIMF >gi|222159322|gb|ACAB01000037.1| GENE 12 18099 - 18821 393 240 aa, chain + ## HITS:1 COG:SPy1033 KEGG:ns NR:ns ## COG: SPy1033 COG0095 # Protein_GI_number: 15675030 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate-protein ligase A # Organism: Streptococcus pyogenes M1 GAS # 14 240 13 
242 329 137 34.0 2e-32 MIRCIYSPFSDIYFHLAAEEYLLKQGNEDIFMLWQDTPSVVIGKHQRLRSEVDQEWAERE QVHIARRFSGGGAVYHDLGNVNLTFIETTPRLPEFVTYLQRTLDFLNSMGLMATGGERLG IYLNGLKISGSAQCLYKDRVLYHCTLLYDTDLTALHQALNPEPMVDDETLSSVYAVPSVR SEVTNIRRHLPAGTVTDFKEKAFQYFSKSQSVSAFTKEEIEAVNQLREEKYIQKEWIYSR >gi|222159322|gb|ACAB01000037.1| GENE 13 18930 - 20366 1415 478 aa, chain + ## HITS:1 COG:BH2761 KEGG:ns NR:ns ## COG: BH2761 COG0508 # Protein_GI_number: 15615324 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes # Organism: Bacillus halodurans # 5 475 4 417 426 275 40.0 1e-73 MSKFEIKMPKLGESITEGTIVSWSVKVGDMIQEDDVLFEVNTAKVSAEIPSPVAGKVEEI LYKEGDTVAVGIVVAIIDLDGEESSGTEPASEGATNEGADASQVAADVSGTSQSAADIAK SQSVNTASPPVDTSKPVAVEEERWYSPVVIQLAREAKIPKEELDAIQGTGYEGRLSKKDI KDYIEKKKRGDMAEPKPASAVAAPAASKPSVAVAPEPITPKTSPAASAPAVQSAATSSKS SAPVAMPGVEVKEMDRVRRIIADHMVMSKKVSPHVTNVVEVDVTKLVRWREKNKDAFFRR EGVKLTYMPVITEAVAKALVAYPQVNVSVDGYNILFKKHINVGIAVSLNDGNLIVPVVHD ADHLNLNGLAVAIDSLALKARDNKLMPEDIDGGTFTITNFGTFKSLFGTPIINQPQVAIL GVGYIEKKPAVIETPEGDTIAIRHKMYLSLSYDHRVVDGMLGGNFLHFIADYLENWQG >gi|222159322|gb|ACAB01000037.1| GENE 14 20404 - 22440 2134 678 aa, chain + ## HITS:1 COG:CT340_2 KEGG:ns NR:ns ## COG: CT340_2 COG0022 # Protein_GI_number: 15605063 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, eukaryotic type, beta subunit # Organism: Chlamydia trachomatis # 354 678 5 328 328 258 44.0 3e-68 MKKKYDIKTTDVETLKKWYHLMTLGRALDEKAPSYLLQSLGWSYHAPYAGHDGIQLAIGQ VFTLGEDFLFPYYRDMLTVLSAGMTPEEIILNGISKATDPGSGGRHMSNHFAKPEWHIEN ISSATGTHDLHAAGVARAMVYYGHKGVAITSHGESATSEGFVYEAINGASLERLPVIFVI QDNGYGISVPKSEQTANRKVAENFSGFKNLKIIYCNGKDVFDSMNAMTEAHEYARETRNP VIVQANCVRIGSHSNSDKHTLYRDENELEYVKDADPLMKFRRMLLRYKRLTEEELQQIEA DAKKELSAANRKALAAPDPDPKSIYDFVMPEPYQPQKYKEGTHEAEGEKTFLVNAINETL KAEFRYNPDTFIWGQDVANREKGGVFNVTKGMQQEFGEARVFSAPIAEDYIVGTANGMSR 
FDPKIHVVIEGAEFADYFWPAVEQYVECTHEYWRSNGKFAPNITLRLASGGYIGGGLYHS QNIEGALTTLPGARIVCPSFADDAAGLLRTSMRSKGFTLFLEPKALYNSVEAATVVPEDF EVPFGKARIRREGTDLSIITYGNTTHFCLHVAERLEKEGGWKVEVIDIRSLIPLDKEAIF ESVKKTSKALVVHEDKVFSGFGAELAAMIGEEMFRYLDGPVQRVGSTFTPVGFNPILEKE ILPDEAKIYEAARKLLEY >gi|222159322|gb|ACAB01000037.1| GENE 15 22453 - 22959 559 168 aa, chain + ## HITS:1 COG:alr2405 KEGG:ns NR:ns ## COG: alr2405 COG0716 # Protein_GI_number: 17229897 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Nostoc sp. PCC 7120 # 2 168 3 169 170 136 44.0 1e-32 MKKIGLFYATKAERTSWVAEKIQKEFGEDKIEVVPIEQAWQNDFAAYDCFIVGASTWFDG ELPTYWDELLPELRTMKLKGKKVAIFGLGDQIRYPENFADGIGLLAEVFEGDEATLVGFT SSEGYTFERSKALRGEQWCGLVVDLDNQSEQAEKKIKAWCQQVKKEFA >gi|222159322|gb|ACAB01000037.1| GENE 16 23186 - 24034 1495 282 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237715971|ref|ZP_04546452.1| ribosomal protein L11 methyltransferase [Bacteroides sp. D1] # 1 282 1 282 282 580 100 1e-164 MKYFEFTFRTQPCTETVNDVLAAILGEVGFESFVECEGGLTAYIQQTLCDENAIKIAINE FPLPDTDITYTYTEAEDKDWNEEWEKNFFQPIIIGNRCVIHSTFHQDVPKAEYDIVINPQ MAFGTGHHETTSLIIEELLDSELKDKSLLDMGCGTSILAILARMRGARPCTAIDIDEWCV RNSIENIELNHVDDIAVSQGDASSLVGKGPFDVIIANINRNILLNDMKQYVACMHTNSEL YMSGFYIDDIAAIREEAEKNGLTFVHYKEKNRWAEVKFIYKG >gi|222159322|gb|ACAB01000037.1| GENE 17 24046 - 24240 117 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237722251|ref|ZP_04552732.1| ## NR: gi|237722251|ref|ZP_04552732.1| predicted protein [Bacteroides sp. 
2_2_4] # 1 64 1 64 64 117 100.0 2e-25 MLVNNVQRYSNPGIESQVCLNFMPVSVVSSTKIVRNRDFPIILWGKSPPDEKSLPYLCNV NKFK >gi|222159322|gb|ACAB01000037.1| GENE 18 24260 - 25594 491 444 aa, chain + ## HITS:1 COG:no KEGG:BT_0315 NR:ns ## KEGG: BT_0315 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 20 206 2 189 429 254 88.0 3e-66 MKKIVYFLLFALCLPIGIFAQSVDDDLYFVPSKDKQEKKETPVKKEPKKQVTTNIYTSPG TTVVVQDRKGRTRDVDEYNRRYDARENEFVMDNDTLYIKEKSNPDLDGEWVTGEFNGTTD DYEYAERIIRFRNPRFAISISSPLYWDVVYGPNSWDWNVYTDGMYAYAFPTFSNPLWWDW RYGSYGWGWNYGWGWNRPYYSWGYYPGYWGGGYWGGWYGGGYWGHHHHWHGGPSWGWGGG GRPYYAGRSVINGNNRSYYSGSRNYNSTSYRRGVGSSTTRPANSSVGQSVRRSGSTGTNS RVVGTRDYTRSGSNSSVRSGSSYSRRNTETYTRPSSTRTSGTSTRNSGSSYNRSNSSTRS SGTNSSRSSSSYSRGSSTSPSRSYTPDRSSSRSSNSGSYSRSSGSSYSSGSSGSYSRSSG SSGSSSRSSGGGGSSRSSGGSYRR >gi|222159322|gb|ACAB01000037.1| GENE 19 25620 - 27215 1396 531 aa, chain + ## HITS:1 COG:no KEGG:BVU_2305 NR:ns ## KEGG: BVU_2305 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 531 1 523 523 394 46.0 1e-108 MKSNRILAICILLISVGFLNAQTTIYDANRWMGSDLNGTARFVGMGGAMGALGGDITTIG TNPAGIGIYRSNDVMVSFGFDNTGVKANGASLDKFHGSFDNAGFVFSTKIGNTTALRFAN FGFNYRKMKSFNRSMLVSGVFNTSQTVQMANMVNFDSYRDFDPFTEAALRSDDAFQNPEL PWLGIMGYNAHLVNPVYGKVDPENPDAVPPFEGYEPYFQAGDAVSQSYRSKESGGIHSFD LNGALNFYDRFYVGATLGLYSVNYDRTSEYNEDFTDKDGNGHGGYTLGNDFWVDGSGVDF KLGFILRPFESSSFRIGAAVHTPTFFSLKERNTAYINFDLNETTRGITKPYDARGNDTEG EYEYKLITPWKFNASMGYTIGSSVALGVEYEYSDRTSAKMKDPDGYELGQTEDIKAMMKA VHTLRVGAEFKLAPEFAFRLGYNHITAPLKSDAFKYLPVNSMRTDTEFSNPGATNNYTLG CGYRGESFYVDMAYMYNTYKETFYAFDSLDLPGTKVTNNNHKVVLTLGMRF >gi|222159322|gb|ACAB01000037.1| GENE 20 27335 - 28144 685 269 aa, chain - ## HITS:1 COG:no KEGG:BT_0320 NR:ns ## KEGG: BT_0320 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 269 1 267 267 272 52.0 1e-71 MKRKSLLFVMASVCLFSCQQQEELQDPSKGVIDFSTSIDQSINNALTRNSSSLTLPLKSN 
FAAGDVISMSVAEQDYHPFAIGMDSQTWNEAGTDSETVTFYAHYPELTNEAATTRSLSSR YREIKGGLEYLFGTAQANKGSKNVALTFKRMTTPVVLLDENNQPYEGRAIVKLFLKNKGV QDLFSGKIEADPNAKPEYIDIRKVSEGILTNLIPQIIKAGEKIGTVILEDGKEEPIIAEE DITIEAGTPVAVKMYARRGIIDERTPLFR >gi|222159322|gb|ACAB01000037.1| GENE 21 28184 - 28654 542 156 aa, chain - ## HITS:1 COG:MA3407 KEGG:ns NR:ns ## COG: MA3407 COG0590 # Protein_GI_number: 20092219 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Methanosarcina acetivorans str.C2A # 6 156 13 162 162 189 60.0 2e-48 MTKEALMRKAIELSKENVENGGGPFGAVIATKDGEIVATGVNRVTASCDPTAHAEVSAIR AAAAKLGTFDLSGYEIYTSCEPCPMCLGAIYWARLDKMYYGNNKTDAKNIGFDDSFIYDE LQLKPEDRKLPSEILLHNEALTAFKAWVAKEDRVEY >gi|222159322|gb|ACAB01000037.1| GENE 22 28810 - 29796 885 328 aa, chain - ## HITS:1 COG:BH1248 KEGG:ns NR:ns ## COG: BH1248 COG0673 # Protein_GI_number: 15613811 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus halodurans # 4 244 2 244 340 110 27.0 6e-24 MSEKMIKWGFIGCGEVTKTKSGPAFQKVEHSEVVAVMSRDGAKAKAYAKERGIKKWYDDA QELIDDPEVNAVYIATPPSSHATYAIMSMKAGKPAYIEKPMAVTYEECTRINRISNETGV PCFVAYYRRYLPYFQKVKELVENGTIGNVINVQIRFAQPPRDLDYNRDNLPWRVQADIAG GGYFYDLAPHQIDLLQDMFGCILEASGYKSNRGGLYPAEDTLSACFQFDNGLVGSGSWCF VAHDSAREDRIEIIGDKGMICFSVFTYEPIGLHTEKGREEICIGNPEHVQQPLIQAVVDH LLGKSVCSCDGESATLTNWVMDKILGKL >gi|222159322|gb|ACAB01000037.1| GENE 23 30309 - 31148 756 279 aa, chain - ## HITS:1 COG:AF0231 KEGG:ns NR:ns ## COG: AF0231 COG0834 # Protein_GI_number: 11497847 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Archaeoglobus fulgidus # 70 276 62 262 264 76 29.0 4e-14 MVTRPKLRLSRYLVPVIIVLAIIFSIRQCGNQEKPSGHPRDYAAIAKEGILRVATEYNSI SFYVDGDTVSGFHYELIQAFAYDKGLKTEITPLMSFEERLEGLSEGRYDVIACGILATSE 
LKDSLLLTSPITLNKQVLVQRRENGENDSLYIRNQLDLAGRTLHVVKGSPSILRIQNLGN EIGDTIYIKEIDKYGSEQLISMVAHGDIDYAVCDESIARAAADSIPQIDINTAISFTQFY SWAVSKQSPALLDSLNTWLDKFQKEKEYQKIYKKYYDKE >gi|222159322|gb|ACAB01000037.1| GENE 24 31166 - 33229 1589 687 aa, chain - ## HITS:1 COG:no KEGG:BT_0328 NR:ns ## KEGG: BT_0328 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 687 1 680 680 1177 84.0 0 MPPIMRRILLLYIIFSVIGLTTVHAQFDNGRQRVDENGYDQYGNQVDPAMIPDRLDSANV EVQGLPPKLYMWRINNQLGDRTIIPADTAYHHFQNSNLTEGLTGHYNYLANMGSPRMSRI FFERRDPEPTIFMEPFSSFFIRPTEFNFTNSNVPYTNLTYHKAGNKINGEERFKSYFSVN VNKKLAFGFNIDYLYGRGYYNNQNTAYFNAAIFGSYIGDRYQMQAIYSNNYLKTNENGGI EDDRYITAPEEMAQGQREYESTNIPTVLSATTNRNHDFYVFLTQRYNLGFSRDIPQAEND TTPAKQEFVPVTSFIHTIQVERARHSFNSDDDMREKNYYQNTYLEPDNPIARDSTTYMGI KNTIGIALLEGFNKYAKAGLTAFASYKISKYTLMNMEGNPLPDKYNENEIFVGGELSKRE GNVLHYHAIGEVGLAGKAIGQFNVKGDIDLNFPLWKDTVSLIARGEVSNKLAPFYMRHYH SKHFMWDNDMDKEFRTRIEGELSIARWRTRLKAGVENIKNYTYFNQQAVPEQKSGSIQVL SASLNQDFKLGIFHLDNEVTWQKSSDQTVLPLPDLSLYHNFYMQFKLAKKVLSVQLGADV RYFSKYNAPAYMPAIQNFYLQPENDQVEIGGYPIVNVYANLHLKRTRFYVMMYHVNQGMS SPNYFLSPHYPINPRVLKFGLSWNFYD >gi|222159322|gb|ACAB01000037.1| GENE 25 33339 - 33881 662 180 aa, chain - ## HITS:1 COG:MA2909_2 KEGG:ns NR:ns ## COG: MA2909_2 COG1014 # Protein_GI_number: 20091730 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Methanosarcina acetivorans str.C2A # 7 179 12 183 186 115 39.0 4e-26 MKEEIIIAGFGGQGVLSMGKILAYSGLMEGKEVTWMPAYGPEQRGGTANVTVIVSDDKIS SPILSKYDAAIILNQPSLEKFESKVKPGGILIYDGYGIINPPTRKDIKVYRIDAMDAANE MNNAKAFNMIVLGGLLKLRPIVTLENVIKGLKKTLPERHHHLIPMNEEAIKKGMELIREV >gi|222159322|gb|ACAB01000037.1| GENE 26 33902 - 34666 636 254 aa, chain - ## HITS:1 COG:MA2909_1 KEGG:ns NR:ns ## COG: MA2909_1 COG1013 # Protein_GI_number: 20091730 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 
2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Methanosarcina acetivorans str.C2A # 5 252 6 262 296 259 48.0 3e-69 MTKEEIIKPENLVYKKPTLMNDNAMHYCPGCSHGVVHKLIAEVIEEMGMEDKTVGISPVG CAVFAYNYLDIDWQEAAHGRAPAVATAVKRLWPDRLVFTYQGDGDLACIGTAETIHALNR GENITIIFINNAIYGMTGGQMAPTTLVGMKSSTCPYGRDVELHGYPLKITEIAAQLEGTA YVTRQSVQSVPAIRKAKKAIRKAFENSMNGKGSNLVEIVSTCSSGWKMTPEKANKWMEEH MFPFYPLGDLKDKE >gi|222159322|gb|ACAB01000037.1| GENE 27 34679 - 34852 200 57 aa, chain - ## HITS:1 COG:no KEGG:BF1647 NR:ns ## KEGG: BF1647 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 57 1 57 57 79 96.0 3e-14 MNPDKIRNVLNILFMILAVAAIITYFVAKDDFKMFIYVCGAAIFVKLMEFFIRFTNR >gi|222159322|gb|ACAB01000037.1| GENE 28 34860 - 35942 1372 360 aa, chain - ## HITS:1 COG:TM1759 KEGG:ns NR:ns ## COG: TM1759 COG0674 # Protein_GI_number: 15644505 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Thermotoga maritima # 6 356 7 351 356 382 54.0 1e-106 MAEEVVLMKGNEAIAHAAIRCGADGYFGYPITPQSEVLETLAELKPWETTGMVVLQAESE VAAINMVYGGAGSGKKVMTSSSSPGVSLKQEGISYLAGAELPCLIVNVMRGGPGLGTIQP SQADYFQTVKGGGHGDYKLIALAPASVQEMADFVALGFELAFKYRNPAIILADGVIGQMM EKVVLPAQKPRRTEAEVIEQCPWAATGKAKDRKPNIITSLELKPEAMEINNLRFQAKYRE IEENEVRFEEINCEDAEYLIIAFGSMARIGQKAMELAREKGIKVGILRPITLWPFPTKAI AAYADKVKGMLVTELNAGQMIEDVRLAVNGKVKVEHFGRLGGIVPDPDEIVTALKEKIIK >gi|222159322|gb|ACAB01000037.1| GENE 29 35954 - 36181 314 75 aa, chain - ## HITS:1 COG:no KEGG:BT_0333 NR:ns ## KEGG: BT_0333 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: Citrate cycle (TCA cycle) [PATH:bth00020]; Metabolic pathways [PATH:bth01100] # 1 75 1 75 75 118 98.0 7e-26 MAKIKGAIVVDTERCKGCNLCVVACPLDVIALNKEVNMKGYNYAWQVKEDTCNGCSSCAT VCPDGCISVYKVKVE >gi|222159322|gb|ACAB01000037.1| GENE 30 36236 - 36517 214 93 aa, chain - ## HITS:1 COG:no KEGG:BT_0334 NR:ns ## KEGG: BT_0334 # Name: 
not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 93 1 93 93 141 91.0 6e-33 MEQIDNLKELINQGDVDTAIKQLDQLLQDSSVEKEKDTLYYLRGNAYRKKGDWKQALDNY QYAIEINPDSPAVQARKMAIDILNFYHKDMFNQ >gi|222159322|gb|ACAB01000037.1| GENE 31 36902 - 37747 753 281 aa, chain + ## HITS:1 COG:VC0192 KEGG:ns NR:ns ## COG: VC0192 COG2207 # Protein_GI_number: 15640222 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Vibrio cholerae # 200 279 188 266 270 67 45.0 3e-11 MTKGDAEQVEPMKLIFNYNNVFYSFYYDDLGGCIHRSREYAINYVYSGEMILDNGKEQIH VRKGECVFIPRDHHITMYKRTYNGERYCGIFLMFTRRFLREMYSKLGQRRVPVNTPKLDS GVIKLPQTVELASLFASMTPYFNPEVKPKDDFMNLKIQEGLLALLDIDERFAPTLFDFNE PWKIDILEFMSENYMYEFTMEEMAHYTGRSLATFKRDFKKISDLTPEKWLIRKRLEVAYN KMKEGGKKVVDVYAEVGFKNPSHFSTAFKKQYGISPTAIFA >gi|222159322|gb|ACAB01000037.1| GENE 32 37900 - 38850 855 316 aa, chain + ## HITS:1 COG:PA2218 KEGG:ns NR:ns ## COG: PA2218 COG1073 # Protein_GI_number: 15597414 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Pseudomonas aeruginosa # 12 314 19 322 367 365 62.0 1e-101 MKTTNLTSVAIAVLLGTVMLAGAGCTNSKTNQETSDNMETLNLTQEWDKKFPQSDKVNHR KVTFKNRYGITLAADLYEPKNVEGKLAAIAVSGPFGAVKEQTSGLYAQTMAERGFLTLAF DPSYTGESGGEPRNVASPDINTEDFSAAVDFLTALSDVDAERIGIIGICGFGGMGLNAAA MDTRIKATVASTMYDMSRVNANGYFDAEDSADDRNKKREMMNVIRTRDAQSGTITPGVPG LPDSITGEEPQFVKDYFDYYKTNRGFHVRSINSNGAWSPTMTLSFINMPLLAYIHEISRP VLLIHGENAHSRYFKS >gi|222159322|gb|ACAB01000037.1| GENE 33 39017 - 39586 471 189 aa, chain + ## HITS:1 COG:MA0410 KEGG:ns NR:ns ## COG: MA0410 COG0110 # Protein_GI_number: 20089303 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Methanosarcina acetivorans str.C2A # 3 185 9 191 191 160 43.0 1e-39 MNDIFAKDLSGEMVSPDEPGYDELIADIFATIKTATEMNTGYRTPEEVHEYMERILDKEL DATTTVLPPLYIDYGKPVNIGKRCFIQQCCTFFGRGGITIGNDVFIGPKVNLITINHDPD 
PDNRSATYGRPIVIEDKVWIGINSTILPGVRIGYGAIVGAGSVVTKDVPAMTIVAGNPAR IIKKIETSE >gi|222159322|gb|ACAB01000037.1| GENE 34 39475 - 39720 109 81 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGSKQSAEGKIFISTTLPSLELERTFINYSFFRYNSYFIREQPLYYSLVSIFLMIRAGLP ATIVIAGTSLVTTLPAPTIAP >gi|222159322|gb|ACAB01000037.1| GENE 35 39835 - 41394 1275 519 aa, chain - ## HITS:1 COG:no KEGG:BVU_2520 NR:ns ## KEGG: BVU_2520 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 9 327 4 317 568 163 35.0 2e-38 MPLKNKTFLSNICLYTFLILGGGCTDEAGMENIGNTDRRIPLTISATVSRYTAIDKKNTS VTRTPIENDYTTEFSDGDAIGIFALRNFETPNVATIDGVYNLKLVYTKAADGTGSWAPAT GDTHALYSYDDNLAYVAYYPYRNGITIKQDNKAEILKDLANNAKLQPAADQSTSAAYTGS DLMAAVASPTVDPANANKKMLTLKFEHLHALLVLKPMGLVNCVPPAGASFEYAGGVLRLD VTAKDATINGIKALRMDDGTFRAIVPSPAGDFVPAGSYLTKGDKTILYTGTSLTAGKLAA GKYYTQQVETAVYTNGSITRALQVGDYFCSDGKIIPGETSIIKRTYIGLVFKVGRFTADD SEYVNGNGEPMTNINGYTVALKDANNGAAIKWGNKKFALDSNRNGSSPYGYFNGFKITQL LKADGIGNYPAANACISYSPAADAKTSGWFFPAGGQMLELRNTRGELIKKTVFTNYNSGR YWQSVQNGSGDSWSVGLTNGNTTQSYISTAYYVRPIAAF >gi|222159322|gb|ACAB01000037.1| GENE 36 41437 - 43002 1088 521 aa, chain - ## HITS:1 COG:no KEGG:BVU_2520 NR:ns ## KEGG: BVU_2520 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 8 452 3 461 568 171 33.0 5e-41 MTFKNKHSFISGFLGAALLIASSCALGGCTEIADEPQGIDDAGNGTPLTIRATASGFVSC VQGDTPATRTPTENGYTTEFNDGDAIGIFALKEYATPDVTPVDGVYNLKLVYTKAADGSG SWAPASGDTHVLYSYDDLVYVAYYPYREGITIKQDIETEIFKDLAANAKLQPAADQSTLA AYTGSDLMAAFARPATDPTDANKKVLTLAFEHLHALLVLKTNKLFNCTPPAGAGFEYSDP MLGADATAKDAVINGIKALPMGDGTFRAIVKTASADIVPTGNYKTTGDKTVLYTGASLAA ATFTAGKYYTQQVNMLESPDESVIRPLQEGDFFCSNGKISPYEANSFSSPVIGVVFYVGR HPEDNGVYVDKNGNPMEVHGYVINRSEGTDQWHNGGGTDFGTYKSTTDFNGYFNTDKLQA ISAGYNDMMTYVRKQTNSPQTTSGWFLPSIGQLIYIKKKYAEVISKSFDKCGGAMGRGWQ HHYWSSTEATAANSYNINWDSGTYSQKNKSSGSMVRTALAF >gi|222159322|gb|ACAB01000037.1| GENE 37 43196 - 43492 271 98 aa, chain - ## HITS:1 COG:no KEGG:BDI_1586 NR:ns ## KEGG: BDI_1586 # Name: not_defined # Def: putative 
nucleotidyltransferase # Organism: P.distasonis # Pathway: not_defined # 1 90 20 109 124 87 43.0 1e-16 MITQRAAVAATPDDFLLSPDGMLRLDAICINLIALDEAVKGLDKITRGELLPGYPEIYWS GVMKMRNKIAHHYFEMDAEVVFKTLKEDIPMMPPLCTA >gi|222159322|gb|ACAB01000037.1| GENE 38 43637 - 43810 176 57 aa, chain - ## HITS:1 COG:no KEGG:BT_1707 NR:ns ## KEGG: BT_1707 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 57 1 57 97 69 57.0 4e-11 MKSTAEILELLRIYKTQFASKYGFKRLGVFGSVARGEQTEQSDVDVCYEGEPPSLLT >gi|222159322|gb|ACAB01000037.1| GENE 39 43961 - 45754 1160 597 aa, chain - ## HITS:1 COG:no KEGG:BVU_2520 NR:ns ## KEGG: BVU_2520 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 9 597 4 568 568 296 34.0 1e-78 MRNIIKHPFINLIFFVALFAFSGCTNEMEPDIAGNSNLPADGTPLTVRATASGFQAPAQP GGSPATRTPVMEATSTKFQTGDAIGLFCVRKNASGTEYISTDIYNLKMTYTAAADGTGTW EAPTAVSAPLLYTDAVAYFAYYPYTDALTTASVNNEQDIRNWLKTNKPLATDMLTKEVFT ANDLMTATALPSLADADRGALVLNFKHEYALLVIKPMLMSPCIPPTGVTAYGYHAEARKW GIDRNVVQGGGDDFKMKINNRKASEMSDGTYRILTQATNTGSSIGCEYTTYDDDLGLCPV TATGSAISGGFQANTCYTLEVHCTSRTGSTAVERALQPGDFVYQHNGKIEIYPVDGAVDA GGKTPDYANAVGIVMTCDPTRMTDAGCNAKGWNHAYVMGLANISSGQKTLWMKNGHSLST PYPYPPVADMNTAETYMNGYAETETMLGKTPLSDYPIFAALQQYRNDNAVPAGINRSPWF IPSIGQWFDVMITLCGKSPYDFVSVGEWGIWQSDNSQRIEMLGKANNYLAKIGTTFLMPS SPEEHEIGFWLTSHFANQGIWAFYGTDRSGVFAIESTYGNLENYDVPVAVARPFFAF >gi|222159322|gb|ACAB01000037.1| GENE 40 45787 - 47553 1030 588 aa, chain - ## HITS:1 COG:no KEGG:BVU_2520 NR:ns ## KEGG: BVU_2520 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 33 588 18 568 568 319 37.0 2e-85 MRYIKHLFVRSFYQCAAPARLALLCAGFAALGGCTNETDSAGGSSTADSHIPLALEITAS GFTGQPDASPDTRASEQDNYDTWFEETDAIGLFAVRGIGTPAAAIVDGINNSKLTYAPAA DASHKPTWQPADAATTLYYYADVTYIAYYPYKDGIAIDPTQSAATILASFSGKTELQPAA DQSTPEAYAASDLMTADGTATDTADPSRKLLSLTFTHSYSLLVLKAIDLSPKDFVAPDGA FVYPPKVTAPSSDVDATDAVLNGIKMRKMGDGKFYAIVKPASGDIPIKGSYTTNSALIVY 
DGSLVAPGLEAGKKHEWTVTATLPYDSNPVERALKPGDFVFHNGSDIEIYPGDGAVDTNG RIPNYTNAIGIVATCNPQRMSATDRSKGWTHAYVMGLENISGSLQWSNVSVDESVIANTS PLIEGAENNMDGYTETEAMLTERASKGDLGNYPAFNTVNTYRNNNAVPAALTGKRSPWFM PSVGQWFDVMVNLCGRSPKTFRNNTNYNWRDETYGTEMWETINKQLSKINKPLTYIAWNS AHYLCSSEQDAGKSWIAGFEEYNIHVVVNGANKNLAEWQRTVRCFFAF >gi|222159322|gb|ACAB01000037.1| GENE 41 47589 - 48794 715 401 aa, chain - ## HITS:1 COG:no KEGG:BVU_2521 NR:ns ## KEGG: BVU_2521 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 21 349 15 345 356 139 31.0 3e-31 MKNRTIKAGRLFIHKNGIRAGVTVFLLTLLAACGADDNTPDMPQGDIPLDFGVSVDQAVS RAAETTASSLSSMGVYAYYTGSSNLSTSDKPNFMCNQKVERTNSASPWTYSPVKYWPNNP ADKVSFYAYGPYAPKGLNVSGTTQSGPPTMEYTIQGAEADQADLVIAGALPNQTYASNNG KVSFKMFHALTRVDINVTNVDKATGMTITVFTMGSLLDGKRPLPYDNEEWNLLTGGSGIL TKVTCIPTRLPYSPATDGTKTNLATFFVMPIRKGLAYKPSFKIAYTTPGNVPSGEAPVQT IEWNDKIPSPETWTMGAHISYNFKLEKKKLTITTSTHPTWDDAGTGTVTGSVVITYAVNP SDPNWGTGGSGSVNGKPVVTHSTVRNDVQWEDGGTEIVEKS >gi|222159322|gb|ACAB01000037.1| GENE 42 48891 - 50174 1203 427 aa, chain - ## HITS:1 COG:no KEGG:BVU_2522 NR:ns ## KEGG: BVU_2522 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 4 427 3 378 379 147 30.0 6e-34 MKTINAMMLSLAALTIAGCSQNEVTEISPDAHPQVGFGVYTGVPTRGVDMTTESMKDDPT DANKYGGFGIMAYFTGQDNFETVKTTVTPSFMHNQMVKFDGTNNVWTYSPVKYWPNRQND KISFFAYAPYESDWQNGSKSGVITSAATAPGIPYIKFKLQTTDKLDKMVDLVVADQRDKT YTAENGGKISFTFEHTLSRISFRAQLGAGDFDGMDGTNSFVYITHMWIVGTDHSADGSNL SLIDPASPVNANSKFYTSAKWSELHWNYEADATIAQADFSLDKMLNLESPGIDISTPAAG HDARTQGIRITKASQGTAEKAVPLFKDKEYLYLIPVGEKSGTDLTQNKGCAKGDIKIGFH YDIVSKDATNAGKFIASHGEAFIELPAGHMKRKESYLYTLKINLHKIEISDATVTPWEDI KTEATVE >gi|222159322|gb|ACAB01000037.1| GENE 43 50243 - 51295 867 350 aa, chain - ## HITS:1 COG:no KEGG:BVU_2523 NR:ns ## KEGG: BVU_2523 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 15 349 20 348 348 288 46.0 2e-76 MSDKWLITSSGQLLFFCLLCVSILLVSCERRDLTYYEVSEITLTADWNDSGLDDKEQKYG 
ATAVFYPRNGGEPKIFMMGDRSGDAVRLPMGVYDIIVFNRSFNDFSNIAFRGNSYETLEA YARKVETRVDKKTRVETRTIISSPDELAAATLEGFTVTEDMLGNYSQTTYGRTAPSRSSE ETPDGLYHLHFVPKKLTREVSAVLHIEGLNNIRSATCRLGGVAESIFLATGKTSANTVTQ EFTPSNPEFSPGSPFNGTLSCEFEIFGLNVIDNNNLHLDALLVDGKTEFEGDCTNVKITE NDDGTGSVTLILEASTEKVPDVKPEGGAGSGFDVDVDGWGNEINTDIPIN >gi|222159322|gb|ACAB01000037.1| GENE 44 51292 - 52590 649 432 aa, chain - ## HITS:1 COG:no KEGG:BT_3061 NR:ns ## KEGG: BT_3061 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 36 424 34 423 425 309 39.0 1e-82 MKNKPTYLKRLAFLFGLLLSLSADIFAQKGHTEEISVKPALLYFRFDKALVDSGYMDNGR TLRRLDELFSDSIPTARIDSIYILSFASPEGVPSYNNRLAMRRSYAVRNYLIWKYPHLDK CRILPCPQGENWQELRRMIAEDANLPKQKEVLQIIDCTPETEKRKAQLKKLDNGIPYHYI RQKILRYLRNASICTVKLYADTLPALPEPVCVKPEYQRQEPYAHNLPNPLTEETRRTMEQ VRQQEQLAKNKRPLFALKTNLLFDVAMMPNIEIEVPIGKRWSINGEYMFPWWLFDGDKYS MQILMGGLEGRYWLGSHQKRENSEVLTGHFFGLYAGGGKYDLQWKENGYQGEFFIAAGIS YGWATRIARNLHLEFNIGIGMLRTDYRHYHARDNYQTLLWQENGKYTWFGPTKAKLSLVW LLNRKVRKGGIK >gi|222159322|gb|ACAB01000037.1| GENE 45 52627 - 52998 254 123 aa, chain - ## HITS:1 COG:no KEGG:BVU_2524 NR:ns ## KEGG: BVU_2524 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 7 84 9 86 112 63 42.0 3e-09 MLTQATQAAIAILHDISTGIYFRSANFRFTEEEWENLFHKLEEGHIIRLLPNCEANIISS YVLCRSLNDLSLLDILEALGEPIYCIRPTPESLYLQNMQVAQKTGVLNQVTRMLLTGIKI SDW >gi|222159322|gb|ACAB01000037.1| GENE 46 53406 - 54383 708 325 aa, chain - ## HITS:1 COG:no KEGG:BVU_2525 NR:ns ## KEGG: BVU_2525 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.vulgatus # Pathway: not_defined # 1 325 1 362 362 371 57.0 1e-101 MDKEEKNVEMVHLSPYMKSVIERLTADKKRPAVHTYNATLNSFTKFFGGQGTEEMLVTDV FTAGKLKEYEAWLRSRNASWNTVSTYMRVLKAVYNRLVEAKRLTYDARLFDSVYTKVEAQ SKRSLTEEQMNTLLHTDFEKLPEDVQNVLAYFLLMFLFRGMPFIDMAYLRKQDLKEHCIT YCRHKTGKKMVVRIPHEATALFEKCRNKKTDSGYLFPILDETTENDKKLYENYRQALRTF NRKLAKMAALLLPGTNISSYTARHTWATLAFYSGIPIGIISKALGHSSIKVTETYLKPFE NEKVDAANDELIMSVVKRSKEKIVA 
>gi|222159322|gb|ACAB01000037.1| GENE 47 54334 - 54504 69 56 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYGDRCTISTFFSSLSINLNFNFRIPRSFLKQTYWGNVYGCKIKNFMGEENKRCFN >gi|222159322|gb|ACAB01000037.1| GENE 48 54615 - 55052 324 145 aa, chain - ## HITS:1 COG:no KEGG:BT_4511 NR:ns ## KEGG: BT_4511 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 145 9 144 144 168 55.0 5e-41 MNDNKLKNKTVPFNYARCYNEQCPKACNCLRRVAALLTTADTSYISIVNPMCIPATGIDC PHFQNAEKIHVAWGISHLLDNVPYKDGTNIKQQLIGHFGKTLYYRFYREERFLSPADQNY IRQLFRRKGITEEPVFDSYTDEYNW >gi|222159322|gb|ACAB01000037.1| GENE 49 55224 - 56981 1264 585 aa, chain - ## HITS:1 COG:no KEGG:BVU_3461 NR:ns ## KEGG: BVU_3461 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 4 585 1 582 590 929 81.0 0 MDPMEYIVPNRKRLPYGMMNFAVIRREDYYYVDKTRFIPMIEQADRFFFFIRPRRFGKSL TLNVLQHYYDVRTRDKFNDLFGDLYIGKHPTANRNSYLVLYLNFSGITGELNDYRKGLDA HCSITFMNFCKRYADLLPPETLEELRQVNGAVEQLDYLYQACERAGQKMYLFIDEYDHFT NAILSDAKSLHRYTDETHGEGYLRAFFNKVKAGTYSSIERCFITGVSPVTMDDLTSGFNI GTNYSLTPQFNQMMGFTEEEVREMLTYYSTNSPFRHTVDELMEIMKPWYDNYCFAQDCYG ETTMYNSNMVLYFVKNYIDNGKAPREMIEDNIRIDYEKLRMLIRKDKEFAHDASVIQTLV SQGYITGELKKGFPAVNITNPDNFISLLYYFGMLTISGIDKGKTKLTIPNLVVQEQLYTY LLNTYNDADLNFSSYEKSELSSQLAYDGNWQAYFDYIADCLKRYASQRDKQKGEFFVHGF TLAMTAQNRFYRPISEQDTQAGYVDIFLCPMLDIYSDMKHSYIVELKYAKYRDSENRVEE LRQEAIAQANRYADTDTVKQAIGSTQLHKIVVVYKGMEMRVCEEL >gi|222159322|gb|ACAB01000037.1| GENE 50 57236 - 57715 474 159 aa, chain + ## HITS:1 COG:NMB0932 KEGG:ns NR:ns ## COG: NMB0932 COG2839 # Protein_GI_number: 15676826 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis MC58 # 5 142 3 142 162 69 37.0 2e-12 MLDIILIVISALCMIAGLAGCILPFLPGPPIAYVGLVILHFTDKVQYSTTQLIVWLLIVA VLQVLDYFTPMLGSKYSGGSKWGNWGCIIGTLVGLFFLPWGIILGPFLGAVIGELLGNKE FSQALKSGVGSLLGFIFGTLLKFVVCGYFCYQFIIGLIR >gi|222159322|gb|ACAB01000037.1| GENE 51 57827 - 59686 1227 619 aa, chain + ## HITS:1 
COG:no KEGG:BT_0338 NR:ns ## KEGG: BT_0338 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 619 1 616 616 1058 79.0 0 MKKTFIYLSFIIFLGWFPSLFAGEIYVSLQGNDKNSGTKEAPFYTLNRAIKQAREWRRLN RPEVAGGIYIRLEEGVYAQRNSLFLRPEDSGTPDSPTVICAVDGAHPVISGGVAVTGWKR GCNHPAIPEKLKQKIWSAEAPLIGNRRVETRQMWVNGHKVQRAAQFPDGGLERMIDFNPE EQTITIPVSQSVNPKRLQNAGQLEMIVHQRWAIAILRVKSIDAKDGQAVVRFHEPESHLE FAHPWPQPVIGGEKGNSSFCLTNALELLDQPGEWFQEYPSGTIYYYPQAGENMETAEVII PALETLVTIDGTLSRPVKHIQFNGITFAHTSWMRPSFQGHVTLQGGFPLLDAYKLQEPGL PEKAELENQAWITRPETAIRVKGAEHIDFKHCTFRHLSSTGLDYEWAVTASSVEDCQFTD IGGTALLVGAFPDGGFETHVPFIPVDVRELCSHITIRNNFISNVTNEDWGCVGIGAGYVR NMDISHNEVCHLNYSGICVGWGWTSLESGMCNNRIEANYVHHFARRLYDAGGLYTLSNQP GSVMRNNRIEHLIEAPYATNDRAFYIYLDEATDGYTMENNWCPTERFDSNRPGKKNVWKN NGPQVADSIKYKAGRIKQD >gi|222159322|gb|ACAB01000037.1| GENE 52 59839 - 62085 1808 748 aa, chain + ## HITS:1 COG:BH1905 KEGG:ns NR:ns ## COG: BH1905 COG1501 # Protein_GI_number: 15614468 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Bacillus halodurans # 185 738 143 689 773 380 36.0 1e-105 MKPTNYHLFDFLDFDTELSRNESLWKAYKPTAVYEKEGDIYVTVPFQKQKLANDMMADTD VPQEEYTLIIRQYNIGITRLFLGFGDYVLTDESEMLQFSDRIQKVPLFIKEQKGKWILST EDGTKRAVINVEEPLLDRWSELLPDPQETLDITLYPDGKREIRLSAYDHFSPPRYDALPV AFCKRDGRKERATLSFECKPDECFAGTGERFFKMDLSGQTFFLKNQDGQGVNNRRTYKNI PFYLSSRMYGTFYHTCAHSKLSLAGHSTRSVQFLSDQALLDVFVIGGDTMEDILRGYRDL TGYPSMPPLWSFGVWMSRMTYFSADEVDEICDRMRAEHYPCDVIHLDTGWFRTDWLCEWK FNEERFPDPKGFIGRLKKNGYRVSLWQLPYVAENAEQIDEARANDYIAPLTKQQATDGSN FSALDYAGTIDFTYPKATEWYKGLLKQLLNMGVTCIKTDFGENIHMDALYKGMKPELLNN LYALLYQKAAYEITKEVTGDGIVWARSAWAGCQRYPLHWGGDSCSSWDGMAGSLKGGLHF GLSGFAFWSHDVPGFHTLPNFMNSVVADDVYMRWTQFGVFTSHIRYHGTNKREPWHYPAI APLVKKWWKLRYSLIPYIVEQSKLAIESGYPLLQALILHHPEDKLCWHIDDEYYFGNDFL VAPVMNSENRRDIYLPEGKWVNFFTGERLEGACWLKDVYVPLEEMPVYVRANAVIPIYPE DVDCTDEMDLSKSMALRIDNDYKGFWNR >gi|222159322|gb|ACAB01000037.1| GENE 53 62230 - 64035 1784 601 aa, chain + ## 
HITS:1 COG:mlr1231 KEGG:ns NR:ns ## COG: mlr1231 COG5012 # Protein_GI_number: 13471298 # Func_class: R General function prediction only # Function: Predicted cobalamin binding protein # Organism: Mesorhizobium loti # 388 600 19 231 238 186 42.0 1e-46 MKTWKSNLEETKQRYINWWNHKGIILNMWEHFQEGVQPHAEIMPPAPAKDLSQKWFDPQW RAEYLDWYVAHSSLKADILPVANTQLGPGSLAAILGGVFEGGEDTIWIHPNPDFTDEIVF NLEHPNWILHKELLKACKAKANGHYFVGMPDLMEGLDVLAALKGTDRVLLDTVMQPEILE QQMQQINDIYFKVFDELYDIIREGDEMAFCYFSSWAPGKMSKLQSDISTMISQDDYRRFV QPFIREQCQKIDYTLYHLDGVGAMHHLPALLEIEELNAIQWTPGVGEPQGGSPKWYDLYK KILAGGKSVMACWVTLDELKPLLDHIGADGVHLEMDFHNEKEVEQAMRIVEEYTGSSTAV NTNKHQQDADLAATGQERICIREEQHREEDKLKPLYEAIVAGKLEPAVEITRQAIAEGVA PQMIINNYMIKAMGEVGQRFQDGKAFVPQLLMAGRAMKGALELLKPLLAGSASTTIGKIV IGTVKGDLHDIGKNLVASMLEGCGFEVINIGIDVTCDKFVEAVKENHADILCMSALLTTT MTYMKEVIQALEEAGIRNQVKVMIGGAPVSQGFADEIGADGYSDNANTAVAVAKELIGNK K >gi|222159322|gb|ACAB01000037.1| GENE 54 64169 - 65740 1375 523 aa, chain + ## HITS:1 COG:BH2222 KEGG:ns NR:ns ## COG: BH2222 COG4146 # Protein_GI_number: 15614785 # Func_class: R General function prediction only # Function: Predicted symporter # Organism: Bacillus halodurans # 13 521 4 523 580 176 29.0 8e-44 MHVKFLETLDWSILIAYFLILIAIGIWASSKRKKGSSLFLAENSLRWYHIGFSMWGTNVG PSMLIASASAGFTTGIVSGNYAWYAFVFICLLAFVFAPRYLGSRITTLPEFMGKRFGQST RNILAWYTIITILISWLALTLFAGGVLIRQVFDIPMWQSALLLLVISAFFTMLGGLKAVA YTNVYQMILLIVVSATLAIMGIYKVGGVGALVDAVPADYWNLFHPNDDPAFPWLPIILGY PIMGVWFWCTDQSMVQPVLAARNLKEGQMGTNFTGWLKILDVPLYILPGIICLALFPQLE NPDEAYMTMVTHLFPTGMVGLVLAVLTAALVSTVGSALNALSTVFTMDIYVKKIRPQAKQ REIIRVGQVVTVVGALISVIITIAIDSIKGLNLFNVFQSVLGFIAPPMAAVFLFGVFWKR TTTLAANMALTLGTVFSIGVGILYLWVFPAEQYDAWPHFMLLSFYLFVIIGIGMIVVSLW DKSPQLGILNMEKIEDKPARIVLILWGLLIVTMIGLYIFFNGH >gi|222159322|gb|ACAB01000037.1| GENE 55 65821 - 66498 360 225 aa, chain - ## HITS:1 COG:no KEGG:BT_0342 NR:ns ## KEGG: BT_0342 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 225 1 254 257 365 68.0 1e-100 
MVEQELTYQKLQIPSSDIYEAMGYKDSMPDGMVIEEINTLLDRITPLLRPRFFFFLTDGL LDTEKATLTVKDTVLSTGKTIARQLRGSEAFAFFTATAGVEFEEFQHLLQQEDDMVKVYI ADSLGSIIAEKAADCMEEELAAFIEKRGWKHTNRYSPGYCGWHVSEQQKLFSLFPVASPC GIQLTDSSLMIPIKSVSGIIGVGSHVRKLEYTCGLCTYENCFRRK >gi|222159322|gb|ACAB01000037.1| GENE 56 66491 - 67501 838 336 aa, chain - ## HITS:1 COG:MA0146 KEGG:ns NR:ns ## COG: MA0146 COG0407 # Protein_GI_number: 20089044 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III decarboxylase # Organism: Methanosarcina acetivorans str.C2A # 62 323 68 327 339 107 30.0 3e-23 MGKLNMKEWISQTIQRKETIAIPIMTHPGIEFIGKTVHDAVTNGQVHYEAIKALCDKYPA AAATVIMDLTVEAEAFGAEIIFPENEVPSVSGRLLADEAAIEKLEVPALNKGRIPEYLKA NMLTARNVTDRPVFAGCIGPYSLAGRLYDMSEIMMLIYINPDAANTLLRKCSDFITRYCM ALKATGVNGVVMAEPAAGLLSNEDCLQYSSLFVKEIIEKVQDDHFAVVLHNCGNTGNCTQ AMVYTGAAAYHFGNKIKMEEALKEVPADALAMGNLDPVSLFKMAGPETMKEATLQLLEAT CAYPNFVLSSGCDIPPHTPSVNIDVFYTALEEFNNG >gi|222159322|gb|ACAB01000037.1| GENE 57 67652 - 68086 538 144 aa, chain - ## HITS:1 COG:TM1080 KEGG:ns NR:ns ## COG: TM1080 COG0698 # Protein_GI_number: 15643838 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Thermotoga maritima # 4 141 3 140 143 147 51.0 6e-36 MKTIGLACDHAGFELKEYVRGWLEAKGWAYKDFGTNSTASVDYPDYAHPLALAVESGECY PGIAICGSGNGINMTLNKHQGVRAALCWNAEIAHLARQHNDANVLVMPGRFISTEEADMI LTEFFSTKFDGGRHQNRIDKIPVK >gi|222159322|gb|ACAB01000037.1| GENE 58 68086 - 70095 2269 669 aa, chain - ## HITS:1 COG:BH2352 KEGG:ns NR:ns ## COG: BH2352 COG0021 # Protein_GI_number: 15614915 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase # Organism: Bacillus halodurans # 10 669 9 663 666 462 41.0 1e-130 MNDNKLMNRAADNIRILAASMVEKANSGHPGGAMGGADFVNVLFSEFLVYDPENPRWEGR DRFFLDPGHMSPMLYSTLALTGKFTLDELKEFRQWGSPTPGHPEVDIMRGIENTSGPLGQ GHTFAVGAAIAAKFLKARFDEVMNQTIYAYISDGGIQEEISQGSGRIAGALGLDNLIMFY DSNDIQLSTETKDVTVEDTAMKYEAWGWNVLNINGNDPDEIRAAIKEAQAEKERPTLIIG 
KTVMGKGARKADGSSYEANCATHGAPLGGDAYVNTIKNLGGDPTNPFVIFPEVAELYAKR AEELKKIVAEKYAKKAAWAKANPELAAKLELFFSGKAPKVDWAAIEQKAGSATRAASATV LGALATQVENMIVASADLSNSDKTDGFLKKTHSFKKGDFSGAFFQAGVSELSMACICIGM SLHGGVIAACGTFFVFSDYMKPAVRMAALMEQPVKFIWTHDAFRVGEDGPTHEPVEQEAQ IRLMEKLKNHKGHNSMLVLRPADAEETTIAWKLAMENMSTPTGLIFSRQNIANLPAGTDY EQAAKGAYIVAGSDENPDVILVASGSEVSTLVAGTELLRKDGVKVRIVSAPSEGLFRNQP KEYQEAILPADAKIFGMTAGLPVTLQGLVGCHGKVWGLESFGFSAPYKVLDEKLGFTAEN VYNQVKAML >gi|222159322|gb|ACAB01000037.1| GENE 59 70238 - 71782 1616 514 aa, chain - ## HITS:1 COG:BH1874 KEGG:ns NR:ns ## COG: BH1874 COG3534 # Protein_GI_number: 15614437 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Bacillus halodurans # 25 514 4 497 498 600 54.0 1e-171 MKAKLLVSTAFLAASVSLSAQKSATITVHADQGKEIIPKEIYGQFAEHLGSCIYGGLWVG ENSNIPNIKGYRTDVFNALKDLSVPVLRWPGGCFADEYHWMDGIGPKENRPKMVNNNWGG TIEDNSFGTHEFLNLCEMLGCEPYVSGNVGSGTVEELAKWVEYMTSDGDSPMANLRRKNG RDKAWKLKYLGVGNESWGCGGSMRPEYYADLYRRYSTYCRNYDGNRLFKIASGASDYDYK WTDVLMNRVGHRMDGLSLHYYTVTGWSGSKGSATQFNKDDYYWTMGKCLEVEDVLKKHCT IMDKYDKDKKIALLLDEWGTWWDEEPGTIKGHLYQQNTLRDAFVASLSLDVFHKYTDRLK MANIAQIVNVLQSMILTKDKEMVLTPTYYVFKMYKVHQDATYLPIDLTCEKISVRDNRTV PMVSATASKNKDGVIHISLSNVDADEVQEITINLGDTKAKKAIGEILTASKLTDYNSFEK PNIVKPAPFKEVKINKGTMKVKLPAKSIVTLELQ >gi|222159322|gb|ACAB01000037.1| GENE 60 71914 - 74316 1808 800 aa, chain - ## HITS:1 COG:BH1877 KEGG:ns NR:ns ## COG: BH1877 COG3533 # Protein_GI_number: 15614440 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 30 721 4 666 758 474 38.0 1e-133 MKTTSFILALLLSTSLGKAQNAPQVSYFPLQNVKLLDSPFLQAQQTDLHYILALDPDRLL APFLREAGLQPKAPSYTNWENTGLDGHIGGHYLSALSMMYAATGDTAVYNRLNYMLNELN RAQQTVGTGFIGGTPGSLQLWKDIKAGKIRAGGFDLNGKWVPLYNIHKTYAGLRDAYIYA GSDLARQMLIAFTDWMIDITSGLSDEQMQDMLRSEHGGLNETFADVAEITGDKKYLELAR RFSHKLILDPLIKEEDKLTGMHANTQIPKVIGYKRIAELSQDDKNWNHAAEWDHAARFFW NTVVNHRSVCIGGNSVREHFHPSDNFTSMLNDVQGPETCNTYNILRLTKMLYQNSHNPNQ 
TNEPDPNYVNYYERALYNHILASQEPDKGGFVYFTPMRPGHYRVYSQPETSMWCCVGSGL ENHTKYGEFIYAYRKDTLYVNLFIPSQLTWKEQGITLTQETCFPDDGKVTLRIDEAPKKK RTLMIRIPEWANQSKGYSVSINGKRKMFIMAKGNQYLPLSRKWKKGDVVTFHLPMKVSVE QIPDKKDYYAFLYGPIVLAASTGTEHLDGLYADDSRGGHIAHGKQIPLQEVPMLIGNPDS ICKSLQKEQNSRITFNYNGEVYPAQGKALELVPFFRLHNSRYAVYFRQASEEQFKAIQEE MATAERKATELANQTIDLIFPGEQQPESDHGIQYEQAETGTNKDRHFRRAKGWFGYQLKV KEEASRLLITVRKDDRNKVAILLNNEKLAVHPTVSEADKDGFITLSYVLPQKLNTGSCPI RFIPDGTEWTSAVYEVRLLK >gi|222159322|gb|ACAB01000037.1| GENE 61 74458 - 76053 1339 531 aa, chain - ## HITS:1 COG:CAC1344 KEGG:ns NR:ns ## COG: CAC1344 COG1070 # Protein_GI_number: 15894623 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Clostridium acetobutylicum # 1 529 1 531 534 653 60.0 0 MKLDAKSTIEAGKAILGIELGSTRIKAVLIDQENKPIAQGSHTWENQLVDGLWTYNIEAI WSGLQDCYADLRANVKNLYGIEIETLAAIGVSAMMHGYMPFNKKEEILVPFRTWRNTNTG RAAAVLSELFVYNIPLRWSISHLYQAILDNEAHVNDIDFLTTLAGYVHWQITGEKVLGIG DASGMLPIDPTTHNYSAEMVDKFDKLIAPKEYSWKLEDILPKVLSAGENAGVLTPEGSKK LDVSGHLKAGIPVCPPEGDAGTGMVATNAVKQRTGNVSAGTSSFSMIVLEKDLSKPYEMI DMVTTPDGSLVAMVHCNNCTSDLNAWVNLFKEYQELLGIPVNMDEIYSKLYNIALTGDTD CGGLLSYNYISGEPVTGFADGRPLFVRSANDKFNLANFMRTHLYASVGVLKIGNDILFNE EKIKVDRITGHGGLFRTKGVGQRILAAAINSPISVMETAGEGGAWGIALLGSYLVNNEKK QSLADFLDESVFVGDAGIEVSPTPEDVAGFNTYIENYKAGLPIEEAAVKFK >gi|222159322|gb|ACAB01000037.1| GENE 62 76160 - 77689 1706 509 aa, chain - ## HITS:1 COG:TM0276 KEGG:ns NR:ns ## COG: TM0276 COG2160 # Protein_GI_number: 15643046 # Func_class: G Carbohydrate transport and metabolism # Function: L-arabinose isomerase # Organism: Thermotoga maritima # 7 506 6 495 496 504 50.0 1e-142 MNNVFDQYEVWFVTGAQLLYGGDAVIAVDAHSNEMVNGLNESGKLPVKVVYKGTANSSKE VETVFKAANNDEKCIGVITWMHTFSPAKMWIHGLQQLKKPLLHLHTQFNKEIPWDTMDMD FMNLNQSAHGDREFGHICTRMRIRRKVVVGYWKEEDTLHKIAVWMRVCAGWADSQDMLII RFGDQMNNVAVTDGDKVEAEQRMGYHVDYCPVSELMEYHKEIKNEDVDALVATYFKEYDH DASLEDKSTEAYQKVWNAAKAELAMRAILKSKGAKGFTTNFDDLGDIEYNGFDQIPGLAS 
QRLMAEGYGFGAEGDWKSAALYRTVWVMNQGLPKGCSFLEDYTLNFDGANSSILQSHMLE ICPLIAANKPRLEVHFLGIGIRKSQTARLVFTSKTGAGCTATVVDMGNRFRLIVNDVECI EPKPLPKLPVASALWIPMPNLEVGAGAWILAGGTHHSCFSYDLTAEYWEDYAEIAGIEMV HINKDTTISCFKKELRMNEVYYMLNKALC >gi|222159322|gb|ACAB01000037.1| GENE 63 77731 - 78414 846 227 aa, chain - ## HITS:1 COG:ECs5174 KEGG:ns NR:ns ## COG: ECs5174 COG0235 # Protein_GI_number: 15834428 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli O157:H7 # 2 227 1 227 228 318 66.0 6e-87 MLEELKEKVFHANLELVKHGLVIFTWGNVSAIDRESGLVVIKPSGVSYDDMKAEDMVVVD LDGKVVEGRLKPSSDTPTHVVLYKAFPEIGGVVHTHSTYATAWAQAGCDIPNIGTTHADY FHDAIPCTADMTEAEVKGAYELETGNVIVKRFEGLNPVHTPGVLVKNHGPFSWGKDAHDA VHNAVVMEQVAKMASIAYAVNPNLTMNPLLVEKHFSRKHGPNAYYGQ >gi|222159322|gb|ACAB01000037.1| GENE 64 78420 - 79163 809 247 aa, chain - ## HITS:1 COG:alr2484 KEGG:ns NR:ns ## COG: alr2484 COG1051 # Protein_GI_number: 17229976 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Nostoc sp. 
PCC 7120 # 33 239 21 237 248 133 34.0 3e-31 MDKDNSQESMISSNSLSFGKNRGEAYYSSNPTFYVGIDCIIFGFNEGEISLLLLKRNFEP AMGEWSLMGGFVQNNESVDDAAKRVLHELTGLENVYMEQVGTFGAIDRDPGERVISVAYY ALININEYDRKLVQKHNAYWVNMNELPPLIFDHPEMVEKARELMKQKASVEPIGFNLLPK LFTLSQLQSLYEAIYGETMDKRNFRKRVAEMDYIEKTDKIDKLGSKRGAALYKFNGKAYR KDPKFKI >gi|222159322|gb|ACAB01000037.1| GENE 65 79270 - 80964 1877 564 aa, chain - ## HITS:1 COG:BH2222 KEGG:ns NR:ns ## COG: BH2222 COG4146 # Protein_GI_number: 15614785 # Func_class: R General function prediction only # Function: Predicted symporter # Organism: Bacillus halodurans # 6 449 3 436 580 186 29.0 1e-46 MEALDWLVIGVFFLALIGIIVWVVRQKQNDSADYFLGGRDATWLAIGASIFASNIGSEHL IGLAGAGASSGMAMAHWEIQGWMILILGWVFVPFYTRSMVYTMPEFLERRYNPQSRTILS VISLVSYVLTKVAVTVYAGGLVFQQVFGIKELWGIDFFWIAAIGLVVLTALYTIFGGMKS VLYTSVLQTPILLLGSLIILVLGFKELGGWDEMMRVCGAVTVNDYGDTMTNLIRSNDDAN FPWLGALIGSAIIGFWYWCTDQFIVQRVLSGKNEMEARRGTIFGAYLKLLPVFLFLIPGM IAFALHQKYIGAGGEGFLPMLANGTANADAAFPTLVAKLLPAGVKGLVVCGILAALMSSL ASLFNSSAMLFTIDFYKRFRPETPEKKLVGIGQVATVVIVILGILWIPIMRSVGDVLYTY LQDVQSVLAPGIAAAFLLGICWKRTSAQGGMWGLIAGMVIGLTRLGAKVYYSNAGEVADS TFKYLFYDMNWLFFCGWMFLFCIIVVIVVSLATKAPTAEKIQGLVFGTATKEQKAATRAS WNHWDIIHTVIILAITGAFYWYFW >gi|222159322|gb|ACAB01000037.1| GENE 66 80990 - 82129 395 379 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900011|ref|NP_344615.1| aldose 1-epimerase [Streptococcus pneumoniae TIGR4] # 48 378 20 345 345 156 31 6e-37 MKKHFLLAGFAALMLVACNNKPASELTLSGLDPVKFQTEVNNAKTALYTLKNKAGMEVCI TNFGGRIVSVMVPDKNGKMQDVVLGFDSIADYINVPSDFGASIGRYANRINQGRFALDGD TIQLPQNNFGHCLHGGPKGWQYKVYEANLIDPTTLELTLVSPDGDENFPGNVTAKVTYKL TEDNAIDIKYSATTDKKTIINMTNHSYFNLAGDPSKASTDNILYVNADYYTPVDSTFMTT GEIASVKDTPMDFTTPKAVGKEIDNYDFVQLKNGKGYDHNWVLNTKGDLSQVAAKLTSPE SGITLEVYTNEPGVQVYTGNFLDGTVTGKKGIVYNQRASVCLETQHYPDSPNKADWPSVV LEPGQTYNSECIFKFSVEK >gi|222159322|gb|ACAB01000037.1| GENE 67 82507 - 84489 1895 660 aa, chain + ## HITS:1 COG:CAC3436 KEGG:ns NR:ns ## COG: CAC3436 COG3534 # Protein_GI_number: 15896677 # Func_class: G Carbohydrate transport and 
metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Clostridium acetobutylicum # 41 546 54 549 835 421 43.0 1e-117 MRRYANLLAVLALSTNLALHAQTNELVIQTKKLGAEIQPTMYGLFFEDINYAADGGLYAE LVKNRSFEFPQHLMGWKTYGKVSLMNDGPFERNPHYVRLSNPGHAHKHTGLDNEGFFGIG VKKGEEYRFSVWARLPQGSTKETLRIELVDTQSMGERQALVAGNLTIDSKDWKKYQIILK PGSTHPKSVLRIFLTSKGTVDLEHVSLFPVDTWKGHENGLRKDLAQALADIHPGVFRFPG GCIVEGTDLETRYDWKKSVGPVENRPLNENRWQYTFTHRFFPDYYQSYGLGFYEYFLLSE EMGAAPLPILNCGLSCQYQNNDPKAHVAVCDLDNYIQDALDLIEFANGDVNTKWGKVRAD MGHPAPFNLKFVGIGNEQWGKEYPERLEPFIKAIRKAHPEIKIVGSSGPNSEGKEFDYLW PEMKRLKADLVDEHFYRPESWFLAQGARYDNYDRKGPKVFAGEYACHGKGKKWNHYHAAL LEAAFMTGLERNADIVHMATYAPLFAHVEGWQWRPDMIWFDNLNSVRTTSYYVQQLYAQN KGTNVLPLTMNKKNVTGAEGQNGLFASAVYDKDKNELIVKVANTSATIQPISLNFEGLKK QDVLSDGRCIKLRSIDLDKDNTLEQPFAIVPQETPVSIEGNVFTTELEPTTFAVYKFTKK >gi|222159322|gb|ACAB01000037.1| GENE 68 84605 - 85285 430 226 aa, chain - ## HITS:1 COG:no KEGG:PG0078 NR:ns ## KEGG: PG0078 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis # Pathway: not_defined # 6 226 5 229 232 205 52.0 1e-51 MSRLLPKHRFEEIKREQRQAKLDKRRKVAVRNVSVSFLIVCEGERTEPNYFKALIKDRYS DIREVTIEGKGQGTVSLIKETIAIRDKSNKEFDRVWAVFDKDDFNDFNDAIQLAKKNHIL CAWSNESFELWYYLHFQYLDTGISRSQYIEKIEREIQNRTNDSNYRYKKKSPETFDILQR IGDESLAIKHAQRLRESFSGTDYAAHKPCTTVYELVEELTHPEKLL >gi|222159322|gb|ACAB01000037.1| GENE 69 85292 - 86503 915 403 aa, chain - ## HITS:1 COG:FN1198 KEGG:ns NR:ns ## COG: FN1198 COG1106 # Protein_GI_number: 19704533 # Func_class: R General function prediction only # Function: Predicted ATPases # Organism: Fusobacterium nucleatum # 1 403 1 415 420 164 32.0 4e-40 MIINFTVGNYRSFKDKKVLSMEATAIKELNESVIEKEGYRLLPSAVIYGANSSGKSNFLK AIGKFGEIVNYSSKMSSTDKLNITPFLLNQKSTQEPSFFEIEILIDNQTYTYGFTADNLK VYEEWLYIKEGKAKKEKCLFVRTEEGIGIADEYNEGKGLEEKVRDNGLFLSTIDSFNGEI AKKIIHKIDCIWVISGIDHESWSVFTNDMCNNKNPELSFQQEIQDFLKSMNVGFNRFELP EDEKLAKEVKAYTIHNLYNEQEEVIGETRFSMKEHESSGTNKLFDLAALVIGQLALGGLL VIDELDSKLHPLLTQHIIKLFNNPKTNPKGAQLIFATHDTNLLNVKTFRRDQIWFTEKDH 
SEATDLYSLAEFREPEGNKIRKDRSFEKDYINGRYGAIPYIKD >gi|222159322|gb|ACAB01000037.1| GENE 70 86702 - 87856 1339 384 aa, chain - ## HITS:1 COG:CAC2959 KEGG:ns NR:ns ## COG: CAC2959 COG0153 # Protein_GI_number: 15896212 # Func_class: G Carbohydrate transport and metabolism # Function: Galactokinase # Organism: Clostridium acetobutylicum # 1 383 6 388 389 256 40.0 8e-68 MDTEYVRSRFIKHFDGTTGFLYASPGRINLIGEHTDYNGGFVFPGAVDKGMIAEIKPNGT DKVKAYSIDLKDYVEFGLNEEDAPRASWARYIFGVCREMIKRGVDVKGFNTAFAGDVPLG AGMSSSAALESTYAFALNELFGENKIDKFELAKVGQATEHNYCGVNCGIMDQFASVFGKA GSLIRLDCRSLEYQYFPFHPEGYRLVLMDSVVKHELASSAYNKRRQSCEAAVAAIQKKHP HVEFLRDCTMAMLEEAKADISAEDYMRAEYVIEEIQRVLDVCEALEKDDYETVGQKMYET HHGMSKLYEVSCEELDFLNDCAKEYGVTGSRVMGGGFGGCTINLVKDELYDNFVEKTKAA FKAKFGRSPKVYDVVIGDGSRRLE >gi|222159322|gb|ACAB01000037.1| GENE 71 87904 - 89244 1467 446 aa, chain - ## HITS:1 COG:BMEII1053 KEGG:ns NR:ns ## COG: BMEII1053 COG0738 # Protein_GI_number: 17989398 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Brucella melitensis # 12 438 24 412 412 129 28.0 2e-29 MTQEKKNGNLVAIITMFFIFAMISFVTNLAAPFGTIWRNEYAGSNTLGMMGNMMNFLAYL FMGIPAGNMLVKIGYKKTALIAMAVGFLGLFTQYLSSLFGAGAEVFAFGEYVIKLNFVIY LLGAFICGFCVCMLNTVVNPMLNLLGGGGNKGNQLIQTGGALNSLSGTLTPLFVGALIGT VTSSTAMSDVAPLLFVAMGVFVAAFIVISFVAIPEPHLQKGGVKKEKFSHSPWSFRHTLL GVIGIFIYVGIEIGIPGTLNFYLADSSDKGAGIMMNGAAIGGAIAAIYWLLMLVGRTASS AISGKVSSRAQLIAVSATAIIFVLIAIFTPKDVTVSMPGYTVGEGFMMAQVPVSALFLVL CGLCTSVMWGGIFNLAVEGLGKYTAQASGIFMMMVVGGGVLPLIQQSISDSVGYMASYWL IIAALAYLLFYGLVGCKNVNKDIPVE >gi|222159322|gb|ACAB01000037.1| GENE 72 89296 - 90393 376 365 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900011|ref|NP_344615.1| aldose 1-epimerase [Streptococcus pneumoniae TIGR4] # 38 364 27 345 345 149 31 1e-34 MNNTFPTEGNLSGLSRKDFQKDINDKKTDLFILKNTKGMEVAVTNYGCAILSIMVPDKNG KYANVILGHDSIDHVINSPEPFLSTTIGRYGNRIAKGKFTLFGEEHELTINNGPNSLHGG PTGFHARVWDAVQIDESTVQFNYVSADGEEGFPGNLEVEMTYRLENEVNALTIEYRATTD KATVVNLTNHGFFNLAGISNPTPTVNNHIVTINADFYTPIDEVSIPTGEIAKVEGTPMDF 
RAPHTVGERIDDKFQQLIFGAGYDHCYVLNKMESGSLDLAATCKDPESGRIMEVYTTEAG VQLYTGNWLNGFEGAHGATFPARSAICFEAQCFPDTPNKPHFPSATLLPGDEYQQITVYK FTVEE >gi|222159322|gb|ACAB01000037.1| GENE 73 90720 - 91691 934 323 aa, chain + ## HITS:1 COG:CAC2918 KEGG:ns NR:ns ## COG: CAC2918 COG1482 # Protein_GI_number: 15896171 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Clostridium acetobutylicum # 1 323 1 310 326 197 37.0 2e-50 MYPLKFEPILKQTLWGGDKIIPFKHLNSDLKGVGESWEISGVEDNESVVANGPDKGLTLA DMVRKYREELVGEANYARFGNKFPLLIKFIDAKQDLSIQVHPTDELAKKRHNSMGKTEMW YVVDADKGAKLRSGFSEQITPKEYKERVLNNTITDVLQEYEIHPGDVFFLPAGRVHSIGA GSFIAEIQQTSDITYRIYDFNRKDANGKTRELHTDLAREAINYEVLDDYRTKYEPLKDEP VELVACTYFTTSLYDMTEEISCDYSELDSFVIFICMEGSCKMRDNEGNELTVSAGESILL PATTQDITITPEGGNVKLLETYV >gi|222159322|gb|ACAB01000037.1| GENE 74 91787 - 93490 969 567 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716026|ref|ZP_04546507.1| ## NR: gi|237716026|ref|ZP_04546507.1| predicted protein [Bacteroides sp. D1] # 1 567 1 567 567 1082 100.0 0 MKPALIYYKENVFSQLDYFQNIRLATSGHTVDERFYLTFNDLSESYNKEEITAIFIMNTL SDNAIELQGLQMAHYIRIITRGEGVGKLPIFLIGSENITELLRLSDYSSILQTPAVELLP YSKEKVKNVIDNIHTYKHLENYEGYLKKTHIPQPNNYLSHHSITSEWSLYQWSKCLNIKS AIENHIERNLYYRYVTSQHPECGDKVTIDYPPYPSESESEKKRILIIDDEADKGWGELYK RLFSHYSNIEVNTLKGFDYASDTKEALLPKVKETIEEKPFEPYHLVLLDIRLLQDDFNKK TDFSSFDVINMLQTLNKGIQIIIVTASNKAWNLQYTLNKNVFAYITKESIFDKSCSKEKL EELLKQSVNAISKSHFLKKVAENEKKILDRLNDKTSTISVPIRKKMLDNIREQIEIAWIM LTNYRFDERYLRYAYLSYYQILELFAEPSPDMYIKIKNKKDIYCEDINGNDINCSEKLKM IYDKTWKGNLIQQTQGTHKEKELELTPYARIGSWMFARTTNRHFMTLLELNSTRNNNTHG GSSTSNIDLEKRLLKMMALIIDFMDNA >gi|222159322|gb|ACAB01000037.1| GENE 75 93487 - 94131 403 214 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716027|ref|ZP_04546508.1| ## NR: gi|237716027|ref|ZP_04546508.1| predicted protein [Bacteroides sp. 
D1] # 1 214 3 216 216 390 100.0 1e-107 MKILIIDEKKTRREELASINKEVNNILKNCDQLHILTGNECTSFIEEIRSNKETSHIAKY AIICCHHTFVEKIEDQLKKICRKNSIPLIFFSGRYSYSYMSDNVLQLSVDKFYTQALPCI VQDIKAENPLILEKIEFGEDYEVAILMNTRNKLIEWLEAEDDTQTYSELDLDSYVLELAN DASLTECVHEDKGYNPTLLREQINSISSLIKQKI >gi|222159322|gb|ACAB01000037.1| GENE 76 94109 - 96328 999 739 aa, chain - ## HITS:1 COG:MA2102 KEGG:ns NR:ns ## COG: MA2102 COG3344 # Protein_GI_number: 20090946 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Methanosarcina acetivorans str.C2A # 46 285 67 292 563 134 37.0 8e-31 MTEQEIKIRQQVAQSFQDIKTVADLTKLMNEVWSYLCKGVHKRIPLKDVTYFSNYKLAKD AYYKFLIPKKSGKTREIQAPIKDLKRLQICLNFILSSLYHPHPSAKGFILGQNIGDAAKP HVRMPYVFHLDLKDFFTSISLYRVKACLTLPPFNLNGDKERIAYCIANICCTNDGNRAFL PQGAPTSPILSNIVSLRLDRKLTGLAKRFSARYTRYADDITFSSYQDIANNTEFQQELVR IISGQNFQIQPSKTRAEGRGYRQTVCGLTINEKVNVSKSYVKEIRLYLYLWERYGYERAQ MYLDSDIKKTKDNCSDIPQLSNYLSGKIQYMRMIKGNGDTTYKTLQNKFIYLYIPQWKEW KKNILDFCDAVQNSKLSIEELNKWYKTISTNINIHLLKDTPLYTSLTKALSCLTLKASDT PTQTVFKEQIHNATLLPSFLYENFSKNDPLKFITHIWDGNADNCKFEGYEDFIRKEQIAF KEITERFKTIDKNLFYCFYGFLHNPLNNRGWGQYKIKSGWSSSWLKAWCSEHPERSPFDC PIPENKREIAKNVKLNYFSDIVELFKSEFQFRLETHQLKKLLRELVKQYLNFDFHVTFEL TDTKLYTNVYMIRNILSDILHDMAQRKQFPNILVKVEDLGSDYVDILLSQQDSNYYATHQ QLMQEIESGDFCEWKRKMINLCDWYVEAQCKDGVFRIKYLNSIQSDRTIAEPLLLDGVKG FTHRIRIYKHYAYENPNYR >gi|222159322|gb|ACAB01000037.1| GENE 77 96604 - 97569 518 321 aa, chain + ## HITS:1 COG:no KEGG:BT_0595 NR:ns ## KEGG: BT_0595 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 315 5 317 318 504 81.0 1e-141 MRNKNGFSRCAEFYIGRLRKEGRHSTAHVYKNALFSFSKFCGTSNVSFRLVTRERLRRYG QYLYECGLKPNTISTYMRMLRSIYNRGVEAGSAPYVPRLFHDVYTGVDVRQKKALPATEL HKLLYEDPKSERLRRTQAIAALMFQFCGMSFADLAHLEKSALNQNVLRYNRIKTKTPMSV EVLNTAKEMINQLRSKESSHPDCPDYLFDILRGDKKRKDERGYREYQSALRRFNNNLKDL ARTLHLQSPVTSYTLRHSWATTAKYRGVSIEMISESLGHKSIKTTQIYLKGFELKERTEV NKGNLSYVRNCCVGRNKSVKF >gi|222159322|gb|ACAB01000037.1| GENE 78 97484 - 97669 79 61 aa, 
chain - ## HITS:0 COG:no KEGG:no NR:no MEEYLLTIYVFVYLCVKFKFRYLRSNTHISVLIIRTLHFYFCLHNSFLRKINSLYLPPYV L >gi|222159322|gb|ACAB01000037.1| GENE 79 97905 - 98483 541 192 aa, chain + ## HITS:1 COG:no KEGG:BT_0596 NR:ns ## KEGG: BT_0596 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 192 1 192 192 340 86.0 2e-92 MILTKPKSISAGPSDGTGEGVAHSKRWYVALVRMHHEKKVAERLDKIGIENFVPVQQEIH QWSDRRKMVESVLLPMMVFVHVNPKERKEVLSFSTVSRYMVMRGESSPAVIPDEQMARFR FMLDYSEEAICMNSSPLARGEKVRVVKGPLTGLVGELVNVDGKSKIAVRLNMLGCACVNM PIGYVEAICEKN >gi|222159322|gb|ACAB01000037.1| GENE 80 98512 - 99261 643 249 aa, chain + ## HITS:1 COG:no KEGG:BT_0613 NR:ns ## KEGG: BT_0613 # Name: not_defined # Def: putative membrane protein involved in polysaccharide export # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 249 68 317 317 271 54.0 2e-71 MNMKFSCAFALIFSFFFFSACQSYKKVPYLQDAEVLKQVNTQVSPVQDARLIPGDEVSIL VSTSDPVVSQPFNAQGSTFLLDDQGNINYPVLGKLPLNGLTSREAENLITERLKSYVKER PTVVVRMSGFKVSVLGEVASPGVYPVVNEQLNVLEALAMAGDLTIYGVRDNVKLIREDKN GHKQFVTLNLNDADLLLSPYYQLQQNDILYVTPNKTKAQSADIGTSTTMWISGFSILVSI ASLLVNILR >gi|222159322|gb|ACAB01000037.1| GENE 81 99265 - 99678 268 137 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 47 129 2 97 116 80 41.0 9e-16 MRTITLIIIHCSATPEGKFLSAEACRQDHIQHRGFRDIGYHFYITRDGEICQGRPLEKVG AHCRDHNTHSIGICYEGGLDIAGRPQDTRTLAQRGSLLALLRELRKRFPKALIVGHHDLN PMKECPCFNCIKEYGEL >gi|222159322|gb|ACAB01000037.1| GENE 82 99682 - 99789 103 35 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSNSSSPRSVWSFIIKVIITVATAIGGLIGVQSCM >gi|222159322|gb|ACAB01000037.1| GENE 83 99825 - 100313 506 162 aa, chain - ## HITS:1 COG:no KEGG:BT_1705 NR:ns ## KEGG: BT_1705 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 160 1 164 166 256 88.0 
2e-67 MIRYKIYQNQQQKGLNAGKWFARAVSDETFDLAKLAEHMSKHNSPYSGGVIKGVLSDMVD CIKELLLDGKCVKIDDLAIFGVGIRSKAAETLEDFSLEKNITGMRLKARATGNLSTTNLK LDSQLKQQAEYQKPTTPGGDSDSGNTPNPNPGGSGEAPDPAA >gi|222159322|gb|ACAB01000037.1| GENE 84 100534 - 100752 212 72 aa, chain + ## HITS:1 COG:no KEGG:BT_1704 NR:ns ## KEGG: BT_1704 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 72 1 72 72 114 80.0 8e-25 MTEFKIRAYGRMELAQLYSPTLTDIAAYRKMKKWISLCPGLLQRLYDLGYESKRRSFTPL EVRVIVDALGEP >gi|222159322|gb|ACAB01000037.1| GENE 85 100927 - 102759 1380 610 aa, chain - ## HITS:1 COG:no KEGG:BT_1638 NR:ns ## KEGG: BT_1638 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 610 1 615 615 929 74.0 0 MTDIESLRRLTEAVETAGADIAPTYAEYVQLAFAIATDCGEAGREFFHRLCRTSAKYQRE HAERIFSNALTTRHGEVHLGTAFHLAEMANVKLCNTEVMNNRRNTENTENTPSKVLTHAH VYNKVENDEPDESEELLNGSDPNQPLPTFPEADWPKILLLIMSYATSPTQRDVMLLGALT AIGASMERYVRCPYAGKLQSPCLQSFIVAPSASGKGILSLIRLLVEPIHDEIRQQVATEV KAYKKEKAAYDVMGKERSKVEAPQMPKNRMFLISGNNTGTGILQNIMDANGTGLICETKA DTISAAIGSEYGHWSDTLRKAFDHDRLSYNRRTDQEYREVKKSYLSVLLSGTPAQVKPLI PSTENGLFSRQLFYYMHGIWAWINQFESGEADLEAIFTDIGLEWKKQLDLMKTHGVHTLR LTDEQKQEFNTLFSDLFFRSGLANDNEMSSSIARLAVNTCRIMAEVAMIRALECDQPYQF KNSSIHLLTPDKEIATDNIKDGIITRWDVTITAEDFKAVLELVTPLYRHATHILSFLPST EVKHRANADRDALFEAMGNQFTRAQLSEQATIMKIKPNTAFGWLNRLIKKGLFTNADDKG IYTRTHVCVC >gi|222159322|gb|ACAB01000037.1| GENE 86 102803 - 103435 501 210 aa, chain - ## HITS:1 COG:no KEGG:BT_1702 NR:ns ## KEGG: BT_1702 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 206 1 206 208 275 67.0 6e-73 MTNDFRMSYFMPPIAPIKDEHGQLVTPPTLIPCCEVSVEQVFQMITCNENLKVLTEQVRN SEDIRTAKVSLLPYVTPCGTFSRRSSKCLIDPSLLTVVDIDYLTSYQEAVEMRKTLFNDP LLHPVLTFISPSGRGVKAFIPYNHLPMADDANCITEKMKLAMLYTVMIYGTGTPPPFGEK KKGVDFSGKDIVRSCFLCHDPGALFRATNE >gi|222159322|gb|ACAB01000037.1| GENE 87 103589 - 103738 222 49 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|237716039|ref|ZP_04546520.1| ## NR: gi|237716039|ref|ZP_04546520.1| predicted protein [Bacteroides sp. D1] # 1 49 15 63 63 81 100.0 2e-14 MKKYSHVALGLFIAEQFKKAGVKIAPMCAEIGLGKAVYYAKVIKGETFV >gi|222159322|gb|ACAB01000037.1| GENE 88 103889 - 105295 845 468 aa, chain + ## HITS:1 COG:ECs2852 KEGG:ns NR:ns ## COG: ECs2852 COG2148 # Protein_GI_number: 15832106 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Escherichia coli O157:H7 # 136 458 138 455 464 234 39.0 2e-61 MQEVQRFNKVLKSFVLLGDIILLNLLLWMFTSIWENRSPFEYSMPLLQNMALMTLCYLVC NIRSGVILHRPVVRPEQIMLRVARNMIPFVLIVFGLSYIFHFECVNLRQLGVFYIVLIIV IISYRLTFRSILELYRKSGKNVRKVVLVGSHENMQELYHSMTDDPTSGYRVLGYFEDFPS DRYPMNIAYLGQPCEAVDYLTRNAGKVDQLYCSLPSARSAEIVPIINYCENHLIRFFSVP NVRNYLKRRMYFEMLGNVPVLSIRREPLELLENRMMKRGFDIICSLLFLCILFPIIYVIV GLAIKISSPGPVFFKQKRSGEDGREFWCYKFRSMRVNAQCDTLQATEHDPRKTRIGDLIR KTNVDELPQFINVLKGDMSLVGPRPHMLKHTEEYSHLINKYMVRHFVKPGITGWAQVTGF RGETKELWQMEGRVQRDIWYIEHWTFLLDLYIMYKTVYNVIRGDKEAY >gi|222159322|gb|ACAB01000037.1| GENE 89 105360 - 106169 748 269 aa, chain + ## HITS:1 COG:PM1016 KEGG:ns NR:ns ## COG: PM1016 COG1596 # Protein_GI_number: 15602881 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Pasteurella multocida # 39 233 76 251 387 70 29.0 3e-12 MRKLKRLTLGALLAFLLVSCQSYKKVPYLQDTAFVNDTEQSVRQTGVKVMPKDLLTIAVS CSTPELAAPFNLVNSGTASGTEGKTVGQRNASSALKQYLVDNQGNINFPVLGEIHVGGLT KLEIENLIIDKLKVYLKEAPLVTVRIVNYRISVLGEVTKPGSFVVSNEKINLLEALAMAG DLTIYGMRDNVKLIRTGQDNKQEIITMDLNKAETVLSPYYQLQQNDIIYVTPNKTKAKNS DIGTNTGLWVSATSILVSLANILVILLNK >gi|222159322|gb|ACAB01000037.1| GENE 90 106176 - 108617 1912 813 aa, chain + ## HITS:1 COG:VC0937_2 KEGG:ns NR:ns ## COG: VC0937_2 COG0489 # Protein_GI_number: 15640953 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Vibrio cholerae # 
496 794 2 299 302 123 30.0 1e-27 MKENSYDNNMSELDEEKVNYQELLFKYIIHWPWFVASVLACFIGAWMYLHFQTPVYQVSA SIMIKNDKKNGGGNTADLESLGLGGVITSTQSIDNEIEVLRSKTILKEVVNNLELYITYY DEDEFPKKELYKTSPVIVNLTAQEADKLPNVALVDMKLSPEGGLDVNLKIGLNEYNKHFD KLPAVLPTDAGTFGFTLKDSLSNGKIVGQSVVRNISAVVSQPFGVAKGYQWALEIAPTSK TTSVAVVSLMNTNIQRGQDFINKLMEMYNRNTNNDKNEVAEKTREFINERIKIIDEELGT TEDKLEAFKRNAGLTDISSDAQLAVSGNAEYERKRVENGTQINLVRDLNKYINNPSNEYE VLPSNIGLSDNGLTTQIDRYNELIIERKRLLRTSTESNPMIVNLDTSIRAMKANVKAAID GTLQGLLIVKADLDRESSRFSRRISDAPGQERQYVSIARQQEIKAGLYLMLLQKREENAI TLAATANNAKIIDEPAAEGAPVSPKPRIIYLIALVVGVGLPVSIIFLIGLTKFKIEGRGD VEKLTSLPIVGDVPLTEEANGSIAVFENQNTLMSETFRNLRTNLQFMLENDQKVILVTST VSGEGKSFISSNLAISLSLLGKKVVIVGLDIRKPGLNKIFNIPRKEQGITQYLSNPDKNL MDFVQPSDVSKNLFILPGGTVPPNPTELLARDSLDKAIEVLKKNFDYIILDTAPVGMVTD TLLVGRVADLSVYVCRADYTRKAEFTLINELADSNKLPNLCTVINGLDLQKKKYGYYYGY GKYGKYYGYGKRYGYGYGYGETHHGGKGERTEE >gi|222159322|gb|ACAB01000037.1| GENE 91 108729 - 109169 226 146 aa, chain + ## HITS:1 COG:no KEGG:Kkor_2547 NR:ns ## KEGG: Kkor_2547 # Name: not_defined # Def: ExoV-like protein # Organism: K.koreensis # Pathway: not_defined # 27 138 39 143 288 87 42.0 2e-16 MLLHKRADEIVLNVSMNLLCNKVFHSNIGDDINYYLIKELSHKRILNYWDFFNLREQPNF MVIGSIIGWMTNKDSIIWGSGVREPDNPLPAIPRKVLAVRGPLTRKYLISQGVECPEIYG DPALLLPKIYPLPFVNKKIPNRGDFT >gi|222159322|gb|ACAB01000037.1| GENE 92 109156 - 109575 140 139 aa, chain + ## HITS:1 COG:no KEGG:Kkor_2547 NR:ns ## KEGG: Kkor_2547 # Name: not_defined # Def: ExoV-like protein # Organism: K.koreensis # Pathway: not_defined # 1 131 148 274 288 83 36.0 3e-15 MILHKNDLGNSIIKEFIERERNKARQIDIKHYKDWRQVIKEIVECEMIISSSLHGLILSD AYHIPNIWIKFSDETFDGSFKYLDYFASVKRPIDRPLIIRSRLDLSDLLQYKDSYSPITF DAQKLLSVCPFIDKNKILP >gi|222159322|gb|ACAB01000037.1| GENE 93 109575 - 111110 269 511 aa, chain + ## HITS:1 COG:no KEGG:BVU_2391 NR:ns ## KEGG: BVU_2391 # Name: not_defined # Def: putative transmembrane protein # Organism: B.vulgatus # Pathway: not_defined # 4 510 8 512 512 328 39.0 3e-88 
MSGNDRIAKNTIFLYFRMLCTIVVSLYTSRIILQTLGVNDYGIYQTVGGVAGLLSYIING SLASGSSRFITFEMGRRDKGKLSDTFSSLLTVHLLFGVVVALLAETIGLWFLYNKLVIPP ERFSAAVFTYHLSILMSIVGITQVPYTAVIIGHEKMNIYAYTSIIEAILKLLLVYILMVS DWDKLMLCASLLFVVQCGITFFYRYYCIRHYEESHYHFSFDKSIIKKVLGFSSWNLVENT SISLNAQGTTILLNMFFNSGIVTGRSVANIVSMTANSFVNNFRTAANPQIVKRFSANDFD GSKHLLLISTKYSYFLMLILAFPVFLVAKQLLYLWLGQIPDYSVVFLQFAIVTTLFGVFD QSFYSAFTAKGQIKETTICSICVGYLSFPVIYVLFKLGYSPVSLSWVMLFSSLILAIFIK PFLLVKIMGYTWTDIFVLFKSCIIVTVVSVPIPLLLYIFRNTLFHTPYLDFFILSISGVV CVALAVWFLGLDDGIRKRIQNNIKIKLLKRK >gi|222159322|gb|ACAB01000037.1| GENE 94 111173 - 111982 452 269 aa, chain + ## HITS:1 COG:FN1241 KEGG:ns NR:ns ## COG: FN1241 COG3774 # Protein_GI_number: 19704576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannosyltransferase OCH1 and related enzymes # Organism: Fusobacterium nucleatum # 1 85 1 88 243 81 48.0 2e-15 MIPKIIHYCWLSNDPMPAELKRCIKSWKRKLPDYKIKKWDTHNFDIYSIPFVAEACKMRK WAFACDYIRVYVLYTEGGIYMDSDVFVRNSLDFCLANRAFSAVECYPDLVEKIYAEGSVD AEGNKRKDIQYIDGIQIQAAILGAEKGHPFMRDCMNYYHDKHFILPDGSLNNKIISPFIF ANIAIDYGFKFKDEEQDLKSGLKIYSSSLFASNMELITGKAVAIHCCAGSWRWMPSSSMA YAVQYMKEIIKNILFRLHLRGGKTRGTLK >gi|222159322|gb|ACAB01000037.1| GENE 95 111997 - 112374 196 125 aa, chain + ## HITS:1 COG:CAC3042 KEGG:ns NR:ns ## COG: CAC3042 COG3594 # Protein_GI_number: 15896293 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose 4-O-acetylase and related acetyltransferases # Organism: Clostridium acetobutylicum # 2 122 3 136 337 68 33.0 2e-12 MRIDSLDILKGIGIILVVVGHMIGNQLYIRPWIYAFHMPLFFMLSGYCFNIAKHPQLLPF AVSRVWTLLVPCVLYTVVSLVVSPEYICEGRYYELKTLLPGALWFLPILFIVEVLGYNVA RFKGG >gi|222159322|gb|ACAB01000037.1| GENE 96 112545 - 112967 127 140 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294643767|ref|ZP_06721565.1| ## NR: gi|294643767|ref|ZP_06721565.1| hypothetical protein CW1_1423 [Bacteroides ovatus SD CC 2a] # 1 140 28 167 167 225 100.0 8e-58 MSLWKYCLLGLLGCVLTYVYVRTDALKHSQEILNYFEIGMVLACFSAIVGVASFSVFSLK 
RMPKLGNVLIYIGRNTLIIMSVQGIFISLADYFLRPLIPSFAIGKLVQFIFVFSCCLIMI PLINKYIPVLAGKGIKHRSL >gi|222159322|gb|ACAB01000037.1| GENE 97 112974 - 113942 516 322 aa, chain + ## HITS:1 COG:BS_yveT KEGG:ns NR:ns ## COG: BS_yveT COG0463 # Protein_GI_number: 16080481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 2 220 4 237 344 107 28.0 4e-23 MFLSVIIPVYNVEPYLHRCIDSVIAQDMSDEIELLLVDDGSKDASGAICDEYASKYSWIH AFHIPNGGVGNARNYGIEHVQGEYFTFIDSDDFIDPGFYKEVLRLHRQTPSDVYLFGYKD YPLNSSDGHRLKQCRCDDTESLAQLYLDMKKNYLMFSVINKIFNSIENRKHRFLTNIHYF EDCLFALDCLGKAKSVGVIEQAPYNYVHHPGEHLGGKYTAPEVVVEVARELKRRSDLLPQ SDELTQYTILEYYNNMLHAVDSSRGIKQRLQYIRILLREIETFGFKTEFKKYLGRRKILM QFSSPVGVLMMCLLRNLILKFR >gi|222159322|gb|ACAB01000037.1| GENE 98 113944 - 114981 683 345 aa, chain + ## HITS:1 COG:no KEGG:NGR_c02340 NR:ns ## KEGG: NGR_c02340 # Name: not_defined # Def: hypothetical protein # Organism: Rhizobium_NGR234 # Pathway: not_defined # 87 269 233 413 417 78 29.0 3e-13 MYQNYKVVVNTAAGRRRYMQYLVPPILNADIVDRYDIWVNTRNMVDIEFFKKLAQKYPKV NLVWQPDGVVNGIASINAFYRDCCDKDTIYMKLDDDVVWFEPELFEKMVKFRVDNPEYFL VSPLVINNALSTYLLQVHNKIKLDKYYMSICGERTICFDGWFAADLHDWFMEKYLIAGKY QELYVGKHPMGMARFSINCVLWFGNEMAEFKGEVPGDDEEFLSCIKPTQLGKANCFNGDA LIAHFAFGPQREGLDKMDILNRYGKILHDLWKQDESMREIDISVQQMIKKVEAREAELNS LPSPYKCIPKVKVPFFMRLGKKLPERVRCAIRELQRKQRYKFIER >gi|222159322|gb|ACAB01000037.1| GENE 99 115025 - 116155 893 376 aa, chain + ## HITS:1 COG:glf KEGG:ns NR:ns ## COG: glf COG0562 # Protein_GI_number: 16129976 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-galactopyranose mutase # Organism: Escherichia coli K12 # 4 373 2 362 367 445 57.0 1e-125 MKQYDYLIVGSGLFGSTFACMAKRQGKKCLVIDKRPQLGGNLYCENIEGINVHQYGAHIF HTSNKEVWDFVNSIVEFNRYTNSPVANYKGKLYNLPFNMNTFYQMWGVTTPLEAEAKLEE QRTEAKVALNGREPENLEEQAQLLIGKDIYETLIMGYTEKQWGRPCTELPAFIIRRLPVR MVFDNNYFNDKYQGIPIGGYNKLIEVLLDGVECRLNCDFGENREELTALADKIVYTGAID 
EFYGYRLGRLQYRTVSFETEIYDTANYQGNAVVNYTDRGVPYTRIIEHKHFEMFGQQLFD CPKTVVSKEYSAEWKPGMEPYYPVNDTLNNDLADKYRALAAHEKDIIFGGRLAEYKYYDM APIVKRAFEVVRSQGL >gi|222159322|gb|ACAB01000037.1| GENE 100 116158 - 117162 414 334 aa, chain + ## HITS:1 COG:no KEGG:Fnod_1455 NR:ns ## KEGG: Fnod_1455 # Name: not_defined # Def: galactofuranosyltransferase # Organism: F.nodosum # Pathway: not_defined # 25 332 20 340 350 151 33.0 4e-35 MKKYILIIKTIANDKLSKSLIGGGASVKAPQDIHKIALQNGYEEYPIILRGYKNKLLFIV VLFLKMIRLAINLPNGATLLIQYPSLNPKMLYFIFPFLKKKYLITLLHDINSVREKGELS GFENKVLSNFDEIIVHTPEMQTYFEQRLRPGIKYHYLGCFPYIAVPDKEARQLSKQVCFA GNIDKSVFFSDFVFENKDLDLIVYGSCSSNNAMKNKYEYKGVFKPDMIGHLEGSWGLVWD GDSTETCSGTWGSYLKIIAPHKFSLYVLAGLPLIVWKDSAMAKLVEMKNLGITVTSLSEI SARISAVSDNDYKEYCANILKFQPVLLKGENVCL >gi|222159322|gb|ACAB01000037.1| GENE 101 117593 - 118414 79 273 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716051|ref|ZP_04546532.1| ## NR: gi|237716051|ref|ZP_04546532.1| predicted protein [Bacteroides sp. D1] # 1 273 125 397 397 496 100.0 1e-139 MIAMYCYLKGINAFAQEGVHIDLSFSAIYYHPMWLAPIAGLANVILLWCLFQLQNKCFRC IVLSILLLSIYVTVVAASRTALFASVITMVLYIVYNARNVKKIILYLLVIGFLATISIPV YLEHSTQIQNKFEGGKGEKYGSRSAHFGEGFDKLNESPLIGSGFATAWYRGVLHKGRLES GSGWLSILFQLGALGAIIMLFILKKVTRVFKYIRHDRRLQLFVISLLFLCLHSCFEGYLL TVGYYIGFVFWLLISHIICYPDMVKKYKLNFES >gi|222159322|gb|ACAB01000037.1| GENE 102 118587 - 119645 338 352 aa, chain + ## HITS:1 COG:TM0622 KEGG:ns NR:ns ## COG: TM0622 COG0438 # Protein_GI_number: 15643387 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Thermotoga maritima # 17 331 19 354 388 61 25.0 2e-09 MNLFFFQNCISPHQIPFIEELSVFTDVDRVVVIAPRVDYDDRKLMGWKTSKLLETKGIEF LITPTMKVVQRLYEECKGIETFCFFSGINAFPEIVPWMKLSFNYSFKRGVITEPPLLYNH PLWLHKLRFALKDWRYVKYFDYLLVMGDEFVPYYRFWSKKWKVLPFVYCTEWRERIYPIP TSEKLKVLYVGSLSDRKNVVEMFQVLCQKTDLELGIVGDGEKRAQIEEMSMQANTEVVLY GMQPMERISDIMQQYDVLILPSKHDGWGAVVNEALILGLYVITSNHCGASYLLKDKQQGM IFTLEEAQSLSNVADVCIAKKDWIRETVNERITWSKNYISGKAVANYIVQNL >gi|222159322|gb|ACAB01000037.1| GENE 103 
119656 - 120747 836 363 aa, chain + ## HITS:1 COG:slr1077 KEGG:ns NR:ns ## COG: slr1077 COG0438 # Protein_GI_number: 16329521 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Synechocystis # 69 363 81 374 386 182 35.0 7e-46 MKVIHYIPSLDRSSGGTTAYMQLLTKELTRLVELYVVSHASENPVVIDNCTVYFIPEFRN FMGMKRQWRILLTQLQPDVVHVNCCWMPACAFIQKWAQALGYKVVLTPHGMLEPWIMARH YWTRKLPALWFYQKAAVMKADVLHATAESEKENLLKLGYNNRIKIIANGIDVENIEMKSS WKRNKEILFLSRVHVKKGINFLLEAVAQLREQVEGYVIRIAGEGDAIYIDELKQLTERLG ISKLVFFEGGVYGNRKWELFRQADLFILPTHSENFGIVVAEALASGTPAVTTMGTPWSEL ESRRCGWWTKVGTEATVQALRNFLSLTENELEKMGRNGRKLVEEKYSARKVAEEFVEMYK SIL >gi|222159322|gb|ACAB01000037.1| GENE 104 120759 - 121325 403 188 aa, chain + ## HITS:1 COG:AGl141 KEGG:ns NR:ns ## COG: AGl141 COG0110 # Protein_GI_number: 15890178 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 12 168 30 185 202 133 45.0 2e-31 MGFKTLYSKRNKFRRLIWSMVWTCLARPFPRSMAMGWKRMLLRAFWAKIASTAAVYAIAK VFQSWLLAMDDYPCLAEGVDCYNTVPIRIGRNATVSQRAFLCTSGHDITDSRYHQTNASI VIEDRAWVCAEAFVGQGVTVGEGAVCAARAVVIKDVESWTVVGGESGKVYKEKDVNKYQQ DTLVSYFI >gi|222159322|gb|ACAB01000037.1| GENE 105 121364 - 122479 363 371 aa, chain + ## HITS:1 COG:NMB1705 KEGG:ns NR:ns ## COG: NMB1705 COG0438 # Protein_GI_number: 15677553 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Neisseria meningitidis MC58 # 135 358 137 348 354 62 26.0 1e-09 MKVNIITGPFGCLPPYAIGAVEKLWYSIGTDMRNKGHQVIFISKKPLKESSMDDNLLLHG YERTGSWVKDFVLDFVFSIKALSKMPKCDMLVLNSIWSPILCLLFKWKYRRALYNVARFP KKQMGAYFAMSSLACVSTAVYNALIEQSPSMKSRACVIPNPIDTHIFCNEHMVKTLSDSP EVVYSGRVHKEKGLDILVKAVTRLHEVGVSVGLRIIGATKIEDGGSGEDYVDYLESLVRG YRITWVEPIFSPSLLAKEIRKGDIFCYPSIAGLGETFGVAPLEAMGLGLVPIVSNLDCFK DFIVDNENGLVFDHTDVRCDELLANCLMRLLSDEKMYSEMSAKAIKKSAGFSVSRVSDMY MSIFKKVLGGE >gi|222159322|gb|ACAB01000037.1| GENE 106 123050 - 123529 256 159 aa, chain + ## 
HITS:1 COG:all2287 KEGG:ns NR:ns ## COG: all2287 COG0707 # Protein_GI_number: 17229779 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: Nostoc sp. PCC 7120 # 11 153 2 142 151 67 29.0 9e-12 MVALMLHKSPKICLACSAGGHLRELQLAIGAIPEQWDCYWLTLKTTSTKAFMADKEHVFL VNFQPAKKWTLIVNCLQAIFWVLVKRPDVIITTGAGVTVPTVFFAKKLLGTKVIFVNSAA DVTHASKTPVWIERYSDLFLVQWEEMRQLFPNSICCGVL >gi|222159322|gb|ACAB01000037.1| GENE 107 123526 - 123996 327 156 aa, chain + ## HITS:1 COG:MA2172 KEGG:ns NR:ns ## COG: MA2172 COG5017 # Protein_GI_number: 20091014 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 1 146 1 144 155 62 27.0 3e-10 MIFASLGTMNIAFNRMAKAVDEWAAITKDEVIVQTGYTDYPYKHAKAFKFCTKEQMKGYI SRADILILQGGWGAISEAMELRKRIVVIPRYDKVEHIHDQFQLIRKLDTLGCVLGVFDER ELAAKMEQAKTFEFKQIEKGDAQKLIEKKLQEWFPR >gi|222159322|gb|ACAB01000037.1| GENE 108 124027 - 124782 675 251 aa, chain + ## HITS:1 COG:jhp0094 KEGG:ns NR:ns ## COG: jhp0094 COG0463 # Protein_GI_number: 15611164 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Helicobacter pylori J99 # 1 250 2 249 260 204 40.0 1e-52 MKVSIITSCFNRAATIRGAIESVLAQDYNNIEFIVVDGASTDGSLEIIREYEGRISTIIS EPDHGMYEAINKGIRVATGDVIGLLHSDDFFYDNGVISRIVEHMKTTRADFLYGDGLFVN PDNTDKVVRNWIGGDYRLWKVRHGWLPLHPTCYIRREVMMRLGLYNESYKIAADSELLVR YLMTGGLSVTYLKEYVVRMRMGGLSTDSAKRKKMWGEDIRVYSSHGLWPTLTKLEKMAWK VPQFVLALLKG >gi|222159322|gb|ACAB01000037.1| GENE 109 125667 - 127220 1157 517 aa, chain + ## HITS:1 COG:no KEGG:BT_0374 NR:ns ## KEGG: BT_0374 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 516 1 516 516 852 80.0 0 MSNKIYPIGIQNFEKIRQDGYFYVDKTALMYQMVKTGSYYFLSRPRRFGKSLLISTLEAY FQGKKELFTGLMVEKLEKDWIEHPILHLDLNIEKYDALESLDNILDKSLTAWEKLYGAEP SERSFSLRFAGIIERACQKTGQRVVILVDEYDKPMLQAIGNEDLQKQFRDTLKPFYGALK 
TMDGCIKFALLTGVTKFGKVSVFSDLNNLKDISMDERFVDICGITEKEIHDNLEEELHQL AEKQKMSYEQVCAELKECYDGYHFMEHTIGIYNPFSLLNTFDKMKFGSYWFETGTPTYLV NLLKKHHYDLERMAHEETDEQVLNSIDSESSNPIPVIYQSGYLTIKGYDEEFGIYRLGFP NREVEEGFVRFLLPYYANVNKVESPFEIQKFVREVRSGDYNSFFRRLQSFFADTGYDVIR EQELHYENVLFIVFKLVGFYTKVEYHTSEGRIDLVLQTDKFIYVMEFKLNGTAEEALKQI NEKHYSLPFEADNRKLFKVGVNFSSQTRNIEKWIVEE >gi|222159322|gb|ACAB01000037.1| GENE 110 127227 - 127484 278 85 aa, chain - ## HITS:1 COG:no KEGG:BT_0406 NR:ns ## KEGG: BT_0406 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 85 1 85 85 117 78.0 9e-26 MNAQTPQKFTLEEIAERKKKLLNEIHAQKKAMTATTREIFAPLAPATNKADAIMRSFNTG MAVFDGVVMGIKIMRKIRAYFRNLK >gi|222159322|gb|ACAB01000037.1| GENE 111 127488 - 127850 289 120 aa, chain - ## HITS:1 COG:no KEGG:BT_0407 NR:ns ## KEGG: BT_0407 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 118 1 118 119 143 93.0 2e-33 MFADDKSIENFQQLFFEFKKYLELQKEYTKLELTEKLTILFSTLIMILILIILGMVALFY LLFALAYILEPLVGGLMSSFAIIAGINVVLIALVIIFRKQLIISPMVNFLANLFLTDSNK >gi|222159322|gb|ACAB01000037.1| GENE 112 127883 - 128092 386 69 aa, chain - ## HITS:1 COG:no KEGG:BF1659 NR:ns ## KEGG: BF1659 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 69 1 69 69 79 91.0 3e-14 MKGLNVLAAFLGGAAVGAALGILFAPEKGEDTRHKIAEILRKKGIKLNRSEMETLVDEIA AEMKGEIAE >gi|222159322|gb|ACAB01000037.1| GENE 113 128293 - 129063 900 256 aa, chain + ## HITS:1 COG:all0475 KEGG:ns NR:ns ## COG: all0475 COG4221 # Protein_GI_number: 17227971 # Func_class: R General function prediction only # Function: Short-chain alcohol dehydrogenase of unknown specificity # Organism: Nostoc sp. 
PCC 7120 # 1 253 4 256 257 296 54.0 4e-80 MEAKIVFITGASSGIGEGCARKFAKEGWNLILNARTVSKLEELKAELEATHGVRVYILPF DVRDRKLAAASLESLPEEWKAIDVLVNNAGLVIGVDKEFEGNLDEWDIMIDTNIRGLLAM TRLVVPGMVERGRGHIINIGSIAGDAAYPGGSVYCATKAAVKALSDGLRIDLVDTPLRVT NIKPGMVETNFTVVRYRGDKEAADNFYKGIRPLTGDDIAETVYFAASAPAHIQIAEVLLM PTYQATGTISYKKKPE >gi|222159322|gb|ACAB01000037.1| GENE 114 129179 - 129673 620 164 aa, chain + ## HITS:1 COG:no KEGG:BT_0410 NR:ns ## KEGG: BT_0410 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 164 1 164 164 271 89.0 6e-72 MKKLVVLGMGVCLVLAFASCKSSESAYKKAYEKAKQQELAEPQVEAPVEVTPVVAAPVTT TKVADTSGVRQEKVTVVSGNEGLKDYSIVAGSFGVKANAEGLKDWLDGQGYHSTIAFNAD KAMYRVIVNSFADKAAAAEARDAFKAKYPNRSDFQGAWLLYRVY >gi|222159322|gb|ACAB01000037.1| GENE 115 129925 - 130749 668 274 aa, chain + ## HITS:1 COG:aq_337 KEGG:ns NR:ns ## COG: aq_337 COG1218 # Protein_GI_number: 15605852 # Func_class: P Inorganic ion transport and metabolism # Function: 3'-Phosphoadenosine 5'-phosphosulfate (PAPS) 3'-phosphatase # Organism: Aquifex aeolicus # 10 267 6 249 268 258 53.0 6e-69 MEQKYVMAAIDAALKAGEKILSIYNDPASDFEIERKADNSPLTIADRKAHEAIVAILNDT PFPVLSEEGKHLGYETRREWDTLWIVDPLDGTKEFIKRNGEFTVNIALVQNSVPVFGVIY VPVKKELYFGIEGAGAYKCSGIVSWEGDGVALEELVAKSERLPLKEVHDHLIVVASRSHL SPETESYIADLKKKHGSVELISSGSSIKICLVAEGKADVYPRFAPTMEWDTAAGHAIARA AGMEVYQAGKEEPLRYNKEDLLNPWFVVEPKREH >gi|222159322|gb|ACAB01000037.1| GENE 116 130761 - 132314 1390 517 aa, chain + ## HITS:1 COG:BH3384 KEGG:ns NR:ns ## COG: BH3384 COG0471 # Protein_GI_number: 15615946 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Bacillus halodurans # 1 305 2 300 589 164 33.0 3e-40 MTFEIVFVLLSLLGMVAALVADKMRPGMILFSVVVLFLCAGILTPKEMLEGFSNKGMITV ALLFLVSEGIRQSGTLGQVIKKLLPQGKTTVFKAQLRILPSVAFISAFLNNTPVVVIFAP IIKHWAKSVNLPATKFLIPLSYVTILGGICTLIGTSTNLVVHGMILEAGFEGFSMFELGK VGIFIAIAGIIYIFLFSKRLLPDARPDTAVPDEEVEEGEKLHRVEAVLGARFPGINKKLK 
DFNFQRHYGAEVKEIKTRNGQRFVTNLDDVVLHEGDTLVVMADDTFIPTWGESSVFVLLT NGNEPDTTGKKKRWFALLLLVLMIVGATVGELPITKEMFPGIKLDMFFFVSITTIIMAWT NLFPARKYTKYISWDILITIACAFAISKAMVNSGVADSVAKFIIGLSDDYGPHVLLAMVF IITNLFTELITNNAAAALAFPLALSISAQLGVSPTPFFVVICMAASASFSTPIGYQTNLI VQGIGNYKFTDFVRIGLPLNIITFLISIILIPLIWNF >gi|222159322|gb|ACAB01000037.1| GENE 117 132330 - 132938 595 202 aa, chain + ## HITS:1 COG:BH3385 KEGG:ns NR:ns ## COG: BH3385 COG0529 # Protein_GI_number: 15615947 # Func_class: P Inorganic ion transport and metabolism # Function: Adenylylsulfate kinase and related kinases # Organism: Bacillus halodurans # 13 198 16 201 208 205 53.0 5e-53 MEEKNHIYPIFDRMMTRQDKEELLGQHSVMIWFTGLSGSGKSTIAIALERELHKRGLLCR ILDGDNIRSGINNNLGFSETDRVENIRRIAEVSKLFLDSGIITIAAFISPNNDIREMAAN IIGKDDFLEIFVSTPLEECEKRDVKGLYAKARKGEIQNFTGISAPFEVPEHPALSLDTSK LTLEESVNRLLELVLPKVKSIK >gi|222159322|gb|ACAB01000037.1| GENE 118 133018 - 133926 1016 302 aa, chain + ## HITS:1 COG:VC2560 KEGG:ns NR:ns ## COG: VC2560 COG0175 # Protein_GI_number: 15642555 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes # Organism: Vibrio cholerae # 1 302 14 315 315 440 69.0 1e-123 MEEYKLSHLKELEAESIHIIREVAAEFENPVMLYSIGKDSSVMVRLAEKAFYPGKVPFPL MHIDSKWKFKEMIQFRDEYAKKYGWNLIVESNMEAFHAGVGPFTHGSKVHTDLMKTQALL HALDKYKFDAAFGGARRDEEKSRAKERIFSFRDKFHQWDPKNQRPELWDIYNARVHKGES IRVFPISNWTELDIWQYIRLENIPIVPLYYAKERPVINLDGNIIMADDERLPEKYRDQIE MKMVRFRTLGCWPLTGAVESGAATIEEIVEEMMTTTKSERTTRVIDFDQEGSMEQKKREG YF >gi|222159322|gb|ACAB01000037.1| GENE 119 133939 - 135399 1665 486 aa, chain + ## HITS:1 COG:PA4442_1 KEGG:ns NR:ns ## COG: PA4442_1 COG2895 # Protein_GI_number: 15599638 # Func_class: P Inorganic ion transport and metabolism # Function: GTPases - Sulfate adenylate transferase subunit 1 # Organism: Pseudomonas aeruginosa # 7 436 11 433 451 516 61.0 1e-146 MADNKLDIKAFLDKDEQKDLLRLLTAGSVDDGKSTLIGRLLFDSKKLYEDQLDALERDSK 
RVGNAGEHIDYALLLDGLKAEREQGITIDVAYRYFSTNGRKFIIADTPGHEQYTRNMITG GSTANLAIILVDARTGVITQTRRHTFLVSLLGIKHVVLAVNKMDLVDFSEERFNEIVADY KTFVTPLGIPDVNCIPLSALDGDNVVDKSERTPWYKGISLLDFLETVHIDNDHNFTDFRF PVQYVLRPNLDFRGFCGKVASGIVRKGDTVMALPSGKTSKVKSIVTYDGELDYAFPPQSV TLTLEDEIDVSRGEMLVHPDNLPTVDRNFEAMMVWMDEEPMDINKSFFIKQTTNLSRTRI DTIKYKVDVNTMEHLSLENGQLTKETLPLQLNQIARVVLTTAKELFFDPYKKNKSCGSFI LIDPITNNTSAVGMIIDRVEMKDMSDTEDVPVLDMTKLGIAPEHYEAVEKAVKELERQGL AVRLIK >gi|222159322|gb|ACAB01000037.1| GENE 120 135483 - 136589 1105 368 aa, chain + ## HITS:1 COG:no KEGG:BT_0416 NR:ns ## KEGG: BT_0416 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 368 1 368 368 673 96.0 0 MGLLEFNKLPINTLVGADWKTFKAITAGREIDAAYKGKYRLTKAVCRLLSPLASLQDKRY EKLLANQPLEHDPVFILGHWRSGTTFVHNVFSCDKHFGYNTTYQTVFPHLMMWGQPFFKK NMSWLMPDKRPTDNMELAVDLPQEEEFALSNMMPYTYYNFWFLPKYQQEYADKYLLFDDI TDAELKVFEEVFTKLIKISLWNTHGTQFLSKNPPHTGRVKELVKMFPNAKFIYLVRNPYT VFESTRSFFTNTIQPLKLQDVSNEQLEENILSIYAKLYHKYESDKKFIPEGNLMEVKFED FEADAMGMTETIYKSLSIPGFAEARSDIEKYVGGKKGYKKNKYKYDDRTIQLVEKNWDFA LKQWDYKL >gi|222159322|gb|ACAB01000037.1| GENE 121 136615 - 137583 868 322 aa, chain + ## HITS:1 COG:no KEGG:BT_0417 NR:ns ## KEGG: BT_0417 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 322 1 322 322 613 91.0 1e-174 MIQKSIHRLLMTGFVAFISSLSLMAQHKVEVVPFGDMNQWVDRQIKESSIIGGNTKNVYA IGPTSVIKGDQVYKNMGGSPWATSNVMAKVAGITKTNTSVFPEKRGDGYCARLDTRMESV KVLGLVNITVLAAGSIFTGSVHEPIKGTKNPQKMLQTGIPFTKKPVALQFDYKVKMSDRE NRIRATGFSKITDVPGKDYPAAILLLQKRWEDANGNVYAKRIGTMVTYYYHSTDWKNNVT YEIMYGDITNRPEYKAHMMRLQATESYTVNSKGESVPIHEVAWGDENDVPTHMYIQFTSS HGGAYIGSPGNTLWVDNVKLVY >gi|222159322|gb|ACAB01000037.1| GENE 122 137900 - 139039 1295 379 aa, chain + ## HITS:1 COG:no KEGG:BT_0418 NR:ns ## KEGG: BT_0418 # Name: not_defined # Def: outer membrane porin F precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 379 1 372 372 597 82.0 1e-169 
MKKGLLFILLAAASVCLPAQEKENAAKSYRVETNRFGANWFISGGVGAQMYFGDNDGKAD FGKRLAPALDIAVGKWFTPGIGLRVAYNGLQAKGATPYANGPLVDGGQYSNGYYKEKWNV VNLHGDVLLNLTNMFCGYKEDRLYSFIPYVGAGFVHTGKGPGYDELGINAGLINRFRLSS ALDLNVELRGLLMKGAFGNSGPEGLAGLTVGVTYKFKKRGWDAVPTVPMVPESQLNDMRD RVNALKGENESLKRDLVEARNKKPEVIVKKEAGFIPRYVVVFNIGKSNISKREYMNIEAM AKGIKATDKVFTVTGYADKGTGSAEYNMKLSKKRAEAVRDLMVNEFGVPASQLKVDYKGG VGNMFYDDAKLSRVAIVEE >gi|222159322|gb|ACAB01000037.1| GENE 123 139114 - 139530 283 138 aa, chain + ## HITS:1 COG:CAC1680 KEGG:ns NR:ns ## COG: CAC1680 COG0816 # Protein_GI_number: 15894957 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) # Organism: Clostridium acetobutylicum # 3 135 2 134 135 86 39.0 2e-17 MSRIVAIDYGRKRTGIAVSDTMQLIANGLTTVPTHELLNFIGEYMAKEPVERIIIGLPKQ MNNEVSENMKNIEPFVRSLKKRYPDLPVEYVDERFTSVLAHRTMLEAGLKKKDRQNKALV DEISATIILQTYLESKRF >gi|222159322|gb|ACAB01000037.1| GENE 124 139574 - 140128 689 184 aa, chain + ## HITS:1 COG:TM1661 KEGG:ns NR:ns ## COG: TM1661 COG0242 # Protein_GI_number: 15644409 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Thermotoga maritima # 5 166 4 155 164 131 47.0 9e-31 MILPIYVYGQPVLRKVAEDITPDYPNLKELIENMFETMVHADGVGLAAPQIGLPIRVVTI TLDPLSEEYPEFKDFNKAYINPHILEVGGEEVNMEEGCLSLPGIHETVKRGDKIRVKYMD ENFVEHEEEVEGYLARVMQHEFDHLDGKMFIDHISPLRKQMIKGKLNTMLKGKARSSYKM KQVK >gi|222159322|gb|ACAB01000037.1| GENE 125 140205 - 142235 2066 676 aa, chain + ## HITS:1 COG:all0889 KEGG:ns NR:ns ## COG: all0889 COG0457 # Protein_GI_number: 17228384 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. 
PCC 7120 # 18 625 36 584 605 147 23.0 5e-35 MVKRILTVLLLFPTLVCAQINTDRVMTIARNALYFEDYVLSIQYFNQVINAKPYLYEPYF FRALAKLNLEDFQGAETDCDAAIQRNPFVVGAYQIRGLARIRQSKFDGAIEDYKKALHYD PENITLWHNLTLSHIQKKDYDAAKEDLESLLKVSPRYTRAYLMRGEVSLQQKDTIAALND FNKALELDKYDPDAWSARAIVKLQQAKYAEAEADFNRAIPLSAKNAGNYINRALARFHQN NLRGAMSDYDLALDIDPNNFIGHYNRGLLRARVGDDNRAIEDFDFVIKMEPDNMMAVFNR GLLRAQTGDYRGAIQDYTTVINQYPNFLAGYYQRSEARRKIGDKKGAEQDEFKVMKAQID KQNGVTNKDVAQNKDKENEEEGGEKTRKKSDKNMNNYRKIVIADDSEAEQRYTSDYRGRV QDKNVNIKLEPMFALTYYEKMSDVKRSVNFHKYIEGLNRTGILPKRLRITNMEAPLTEEQ VKVHFALIDTHTSAIVEDDKNASKRFARAIDFYLVQDFSSAVSDLTQTILLDGDFFPAYF MRALIRCKQLEYQKAEQAVETDVVPGDNKRKEITAVDYEVVRKDLDKVINLAPDFVYAYY NRANVSAMLKDYRAAIVDYDKAIELNPDFADAYFNRGLTHIFLGNNKLGISDLSKAGELG IVSAYNVIKRFTDQSE >gi|222159322|gb|ACAB01000037.1| GENE 126 142318 - 144258 1943 646 aa, chain + ## HITS:1 COG:DR2081 KEGG:ns NR:ns ## COG: DR2081 COG0441 # Protein_GI_number: 15807075 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Deinococcus radiodurans # 2 641 1 647 649 643 51.0 0 MIKITFPDGSVREYNEGVNGLQIAESISSRLAQEVLACGVNGETYDLGRPINEDANFVLY KWDDEEGKHAFWHTSAHLLAEALQELYPGIQFGIGPAIENGFYYDVDPGEAVIKESDLPA IEAKMLELAAKKEEVVRKSIAKTDALKMFGDRGETYKCELISELEDGHITTYTQGAFTDL CRGPHLMTTAPIKAIKLTSVAGAYWRGHEDRKMLTRIYGITFPKKKMLDEYLILLEEAKK RDHRKIGKEMQLFMFSETVGKGLPMWLPKGTALRLRLQEFLRRIQTRYDYQEVITPPIGN KLLYVTSGHYAKYGKDAFQPIHTPEEGEEYFLKPMNCPHHCEIYKNFPRSYKDLPLRIAE FGTVCRYEQSGELHGLTRVRSFTQDDAHIFCRPEQVKDEFLRVMDIISIVFRSMDFQNFE AQISLRDKVNREKYIGSDDNWEKAEQAIIEACAEKGLPAKIEYGEAAFYGPKLDFMVKDA IGRRWQLGTIQVDYNLPERFELEYMGSDNQKHRPVMIHRAPFGSMERFVAVLIEHTAGKF PLWLTPEQVVILPISEKFNEYAEQVKMYLKIHEIRAIVDDRNEKIGRKIRDNEMKRIPYM LIVGEKEAENGEVSVRRQGEGDKGTMKFEEFAKILNEEVQNMINKW >gi|222159322|gb|ACAB01000037.1| GENE 127 144331 - 144942 517 203 aa, chain + ## HITS:1 COG:BH3140 KEGG:ns NR:ns ## COG: BH3140 COG0290 # Protein_GI_number: 15615702 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 3 
(IF-3) # Organism: Bacillus halodurans # 12 172 26 187 190 140 46.0 1e-33 MKNDTLKGQYRINEQIRAKEVRIVSDDIEPKVYPIFQALKMAEERELDLVEISPNAQPPV CRIIDYSKFLYQLKKRQKEQKAKQVKVNVKEIRFGPQTDDHDYNFKLKHAKGFLEDGDKV KAYVFFKGRSILFKEQGEVLLLRFANDLEDYAKVDQMPILEGKRMTIQLSPKKKEASKKP ATAGTPKPAAPAQKAEKPEKGEE >gi|222159322|gb|ACAB01000037.1| GENE 128 145011 - 145208 334 65 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|153808045|ref|ZP_01960713.1| hypothetical protein BACCAC_02331 [Bacteroides caccae ATCC 43185] # 1 65 1 65 65 133 100 8e-30 MPKMKTNSGSKKRFTLTGTGKIKRKHAFHSHILTKKTKKRKRNLCYSTTVDTTNVSQVKE LLAMK >gi|222159322|gb|ACAB01000037.1| GENE 129 145307 - 145657 595 116 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29345835|ref|NP_809338.1| 50S ribosomal protein L20 [Bacteroides thetaiotaomicron VPI-5482] # 1 116 1 116 116 233 100 4e-60 MPRSVNHVASKARRKKILKLTRGYFGARKNVWTVAKNTWEKGLTYAFRDRRNKKRNFRAL WIQRINAAARLEGMSYSKLMGGLHKAGIEINRKVLADLAMNHPEAFKAVVAKAKAA >gi|222159322|gb|ACAB01000037.1| GENE 130 146016 - 146777 365 253 aa, chain + ## HITS:1 COG:MA4170 KEGG:ns NR:ns ## COG: MA4170 COG1145 # Protein_GI_number: 20092963 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Methanosarcina acetivorans str.C2A # 2 240 12 247 294 75 26.0 8e-14 MIFYFSGTGNSKWIANQLSKEQKEELVFIPDALKNRTFEFCLREDEKIGFVFPIYSWAPP EIVLNFIRQLSLKGYKRQYLFFVCSCGDDTGLTQQVLEKALSHKGWKCHAGFSVTMPNNY VLLPGFDVDKKELEEKKLADAIPTLNQINASISRREELFLCHEGSIPFIKTRIINPLFNR FQMSPENFYTTDACIGCKRCEKSCPVGNIMMVGRKPVWGMDCTSCLACYHVCPQHAVQYG KRTKDKGQYFNPN >gi|222159322|gb|ACAB01000037.1| GENE 131 146795 - 147367 523 190 aa, chain - ## HITS:1 COG:BS_xpt KEGG:ns NR:ns ## COG: BS_xpt COG0503 # Protein_GI_number: 16079265 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Bacillus subtilis # 1 180 1 181 194 178 49.0 6e-45 MQLLKKRILQDGKCYEGGILKVDGFINHQMDPVLMKSIGVEFVRRFAATNVNKIMTIEAS GIAPAIMTGYLMDLPVVFAKKKSPKTIQNALSTTVHSFTKDRDYEVVISADFLTPNDNVL 
FVDDFLAYGNAALGIIDLIKQSGANLVGMGFIIEKAFQNGRKKLEEQGVRVESLAIIEDL SNCCIKIKDE >gi|222159322|gb|ACAB01000037.1| GENE 132 147458 - 148765 1210 435 aa, chain - ## HITS:1 COG:AF2013 KEGG:ns NR:ns ## COG: AF2013 COG1541 # Protein_GI_number: 11499595 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Archaeoglobus fulgidus # 2 433 8 438 440 441 47.0 1e-123 MSTQYWEEELETMSREKLQELQLQRLKKTINIAANAPYYKEVFSKHGITADSIQSLDDIR KVPFTTKSDMRAHYPFGLVAGDMSNDGVRIHSSSGTTGNPTVIVHSQHDLDSWANLVARC LYAVGIRKTDVFQNSSGYGMFTGGLGFQYGAERLGCLTVPAAAGNSKRQIKFINDFKTTA LHAIPSYAIRLAEVFQEEGLDPKGTTLKTLVIGAEPHTDEQRRKIEKMLGVKAYNSFGMT EMNGPGVAFECQEQNGMHFWEDCYLVEIIDPETGEPVPEGEIGELVLTTLDREMMPLIRY RTRDLTRILPGKCPCGRTHIRIDRIKGRSDDMFIIKGVNIFPMQVEKILVQFPELGSNYL ITLETVNNQDEMIVEVELSDLSTDNYIELEKIRKDITRQLKDEILVTPKVKLVKKGSLPQ SEGKAVRVKDLRDNK >gi|222159322|gb|ACAB01000037.1| GENE 133 148777 - 149361 654 194 aa, chain - ## HITS:1 COG:PH0764 KEGG:ns NR:ns ## COG: PH0764 COG1014 # Protein_GI_number: 14590633 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Pyrococcus horikoshii # 4 193 5 200 202 107 35.0 1e-23 MKKDIILSGVGGQGILSIATVIGKAALKEGLYMKQAEVHGMSQRGGDVQSNLRISDQPIA SDLIPSGKCDLIISLEPMEGLRYLPYLSPTGWLVTNETPFVNIPNYPETDKVMAEINKLP HKIVLNVDKVAKEVGSARVANIVLLGATIPFLGIDYEKVQDSIREIFLRKGEAIVEMNLK ALAAGKEIAEKLMQ >gi|222159322|gb|ACAB01000037.1| GENE 134 149365 - 150957 1670 530 aa, chain - ## HITS:1 COG:CAC2001 KEGG:ns NR:ns ## COG: CAC2001 COG4231 # Protein_GI_number: 15895271 # Func_class: C Energy production and conversion # Function: Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits # Organism: Clostridium acetobutylicum # 3 529 2 521 584 359 39.0 9e-99 MSKQLLLGDEAIAQAALDAGLSGVYAYPGTPSTEITEYIQMAPITTEQNIHNRWCANEKT AMEAALGMSFVGKRALVCMKHVGMNVAADCFVNSAITGVKGGLIVIAADDPSMHSSQNEQ DSRFYGDFSLIPMYEPSNQQEAYDMVYSGFEFSEKLGEPILMRMVTRLAHSRSGVERKEQ 
KPQNSISFSEDPRQFILLPGNARKRYKVLLARQDEFIKASEESPYNKYTDGPNKKLGIVA CGIGYNYLMENYPEGCEYPVLKIGQYPLPKKQLHQLIESCDEILVLEDGQPFVEKQLKGY LGIGVKVKGRLDGTLSQDGELNPDSVARAVGKENKSEFGIPSVVEMRPPALCEGCGHRDM YITLTEVLKEEYPSHKVFSDIGCYTLGANAPFNAINSCVDMGASITMAKGAADGGLFPAV AVIGDSTFTHSGMTGLLDCVNENASVTIVISDNETTAMTGGQDSAGTGRIEAICAGIGVD PAHIRVVTPLKKNYEEMKQIIREEIEYRGVSVIIPRRECIQTLARKKRSK >gi|222159322|gb|ACAB01000037.1| GENE 135 151041 - 152078 605 345 aa, chain - ## HITS:1 COG:XF0675 KEGG:ns NR:ns ## COG: XF0675 COG1559 # Protein_GI_number: 15837277 # Func_class: R General function prediction only # Function: Predicted periplasmic solute-binding protein # Organism: Xylella fastidiosa 9a5c # 22 342 21 343 350 122 27.0 1e-27 MKKKKRNILLSILIGAFLLCAVAGGTFYYYLFAPQFHPSKTVYIYVDRDDTADSIYHKIK EFGHVNKFTGFQWMAKYKDFNQNIHTGRYAIRPNDNVYHVYSRFSRGYQEPMNLTIGSVR TLDRLARSIGKQLMIDSAEIASQLFDSTFLVQMGYTSITLPSLFIPETYQVYWDMSVDEF FKRIKDEHKRFWNKDRLSQATAIGMTPEEVSTLASIVEEETNNNEEKPMVAGLYINRLHQ DMPLQADPTIKFALQDFGLRRITNEHLKVNSPYNTYINTGLPPGPIRIPSKKGIDSVLNY TKHNYIYMCAKEDFSGTHNFASNYADHMANARKYWKALNERKIFK >gi|222159322|gb|ACAB01000037.1| GENE 136 152537 - 153415 564 292 aa, chain - ## HITS:1 COG:BB0411 KEGG:ns NR:ns ## COG: BB0411 COG1864 # Protein_GI_number: 15594756 # Func_class: F Nucleotide transport and metabolism # Function: DNA/RNA endonuclease G, NUC1 # Organism: Borrelia burgdorferi # 101 273 2 175 195 118 38.0 2e-26 MNRNKKGKNRKLFKKKSHSNNRLGCIIAIIVLIPILFGVYLYCQQINIQKNNEPQTDTSF QIPPGKDLETPISLVPRQEQIIRHSGYTVSYNKDLKIPNWVSYELTRKETKGKEKRGNRF ITDPLVTGPIATNADYTRSGYDKGHMAPAADMKWSPEAMKESFYFSNMCPQHPQLNRRGW KNLEEKIRDWAIADSTIIIICGPIIEKYPKTIGKNKVVVPQKFFKVVLSPFVKPMRAIGF LFNNEQAVEPLSSYAVTIDSIESLTNMDFFAPLPDEIENKIEADINYSLWPN >gi|222159322|gb|ACAB01000037.1| GENE 137 153443 - 154024 402 193 aa, chain - ## HITS:1 COG:NMA1447 KEGG:ns NR:ns ## COG: NMA1447 COG1636 # Protein_GI_number: 15794352 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis Z2491 # 10 190 20 204 241 217 
54.0 1e-56 MKKKFQLEVPGGADKVLLHTCCAPCSSAIIECMMQHHITPVIYYCNPNIYPQEEYMIRKE ECTRYAQSLGLEIIDADYDHENWRCHIIGMEQEPERGARCLRCFKLRLLETARYAHEHGF SVITTTLASSRWKSLEQINEAGQYATASYPDVTYWEQNWRKGGLSERRIAIIKEYNFYNQ QYCGCEFSMRKEE >gi|222159322|gb|ACAB01000037.1| GENE 138 154129 - 156168 1363 679 aa, chain + ## HITS:1 COG:no KEGG:BT_0761 NR:ns ## KEGG: BT_0761 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 679 1 682 682 1165 88.0 0 MNELNCGQEEQYAGPEKKKSTSKIVKRTLVVAALALAVYVVYSVVYLFVSPDRNIQQIYL VPEDAAFIIQSSAPIEDWEKFSGSETWQCLKKAKSFEEVTESVEKLDSVVKSNKVLLSLV GERDMLISLHKTRATKWDFLLILDMQKTSKMDLLKDQVETVLVMSGFTVTNRMHNGINIL EMRDSETRDIFYIAFVDNHLVGSYTSGLVESAIDSRNKPKIGLDQSFIETEKLVSGKGLV RVFINYARVPQFMSIYLGARNEYIDLFSNSMNFAGLYLNTDKERMEVKGYTLRKDSADPY VTALLNSGKHKMKAHEILSGRTALYTNIGFNNPVTFVKELENAMSVHNKQLYDSYQSSRK KIEGLFGISLEENFLSWMSGEFAITQSEPGLLGHDPELILAIRAKSIKDARKNMELIEKK IKRRSPVKIKTVNYKDFEINYIEMKGFFRLFFGKLFDKFEKPYYTYVDDYVVFSNKAASL LSFVEDYEQKNLLKNNPGFENALSYLKSSSTIFLYTDVHKFYSQLKPMMNPATWNEIQSN KDILYSFPYWTMQVIGDGRSASLQYVMDYSPYQPEEVVAVAADEDDEEMNEDAETEKEQM SELKRFYVEKFEGNVLREFYSEGALKSEVEVKEGKRHGRYREYYEDGTLKLRGKYANNKP KGTWKYYTEDGKFERKEKF >gi|222159322|gb|ACAB01000037.1| GENE 139 156473 - 156568 65 31 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLSRIIVLVVAGVAVVYIVRFIDNFFSQRRR >gi|222159322|gb|ACAB01000037.1| GENE 140 156571 - 157779 326 402 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 88 394 6 316 319 130 25 6e-29 MNQQFLKEIEKGSKSALVKKRIITHYIYNGSSTIPDLSKELDLSVPTVTKFIGEMCDDGY INDYGKLETSGGRHPNLYGLNPESGYFLGVDIKRFAVNIGLINFKGDMVELKMNIPYKFE NSIEGMNELCKHILNFIKKLTINKEKILNINVNVSGRVNPESGYSFSQFNFEERPLADVL SEKLGYKVTIDNDTRAMTYGEYMQGCVKGEKDIIFVNVSWGVGIGIIIDGKIYTGKSGFS GEFGHMSAYDNEIICHCGKKGCLETEASGSALHRILLERIQSGESSILSTRIATEENPIT LDEIIAAVNKEDLLCIEIVEEIGQKLGKQIAGLINIFNPELVIIGGTLSLTGDYITQPIK TAVRKYSLNLVNKDSAIITSKLKDKAGIVGACMLARSRMFES >gi|222159322|gb|ACAB01000037.1| GENE 141 158114 - 159988 1264 624 aa, chain 
+ ## HITS:1 COG:no KEGG:BT_0434 NR:ns ## KEGG: BT_0434 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 624 2 627 627 954 72.0 0 MRNYFLGLCLLFALCFTACSHSDDSVDVLIIGGGASGVTAGIQSARMGAATLIVEETEWL GGMLTSAGVSAVDGNYDLPAGLFGEFRGHLADYYGGLDSLKTGWVSAVLFEPSVGNKIFH EMVDAEKNLKVWHNATLVKLERENDAWIAQIQMKDNTIKKIHAKILIDGTELGDIAKMCG VKYDVGMESRHDTKEDIAPEEKNNIVQDITYVAILKDYGKDVTIPCPEGYNKDEFACACA SHVCITPKEPDRVWSKDMMITYGKLPNNKYMINWPIEGNDYYVNLIEMTREEREEALKYA KHYTMCFVYFLQHELGFNTLGLADDEYPTADKLPFIPYHRESRRIHGLVRFDLNHACEPF RQSQPLYRTCIAVGNYPVDHHHTRYHGYEELPNLYFHPIPSYGLPLGTLIPKDVEGLIVA EKSISVSNIINGTTRLQPMVMQIGQAAGALAALAVKEGKNMREVSVREVQNAILDGKGYL LPYLDVELDHPMFKSLQRIGSTGILKGIGKSVDWSNQMWFRADTLLLANELKGLGDVYPF VNKQVFEGNNTISIQKATELIGGIAEKEGFEMKEGRVEEIWNEFELKDFDMNRGILRSEM AILIDQILDPFNNKKVDITGQYIQ >gi|222159322|gb|ACAB01000037.1| GENE 142 160033 - 161130 756 365 aa, chain + ## HITS:1 COG:no KEGG:BT_0435 NR:ns ## KEGG: BT_0435 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 365 1 366 366 659 86.0 0 MKTLDRRDFLKKATLASASALTVPTLFESCTSKASASTVVATDGLESKLIVSKNNGLKIT GTFLDEISHDIPHQNWGEKEWDLDFQHMKNIGIDTVIMIRSGYRKFITFPSPYLLKKGCY MPSVDLVDMFLRLAEKYGMKFYFGLYDSGKYWDTGDMTWEVEDNKYVIDEVWENYGSKYK SFGGWYISGEISRATKGAIGAFHALGKQCKDISNGLPTFISPWIDGKKAIMGTTKMTKED AVSVQQHEKEWDEIFDGIHDVVDACAFQDGHIDYDELDAFFSVNKKLADKYGMQCWTNAE SFDRDMPIRFLPIKFDKLRMKLEAAKRAGYDKAITFEFSHFMSPQSAYLQAGHLYDRYKE YFEIK >gi|222159322|gb|ACAB01000037.1| GENE 143 161136 - 162539 1087 467 aa, chain + ## HITS:1 COG:CAC1339 KEGG:ns NR:ns ## COG: CAC1339 COG0477 # Protein_GI_number: 15894618 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Clostridium acetobutylicum # 9 463 13 453 469 283 37.0 7e-76 MKSTINFGYLIFLSVVAALGGFLFGYDTAVISGTIAQVTQLFQLDALQQGWYVGCALVGS 
IVGVLFAGILSDKLGRKLTMVISAVLFSTSALGCALSADFAQLVVYRIIGGVGIGVVSIV SPLYISELAVAQYRGRLVSLYQLAVTVGFLGAYLVNYQLLAWAESGTQLSVDWLNKIFIT EVWRGMLGMETLPAILFFIIIFFIPESPRWLIVRGKELKAVNILEKIYNSITEAKSQLNE TKSVLTSETKSEWSLLMKPGIFKAVIIGVCIAILGQFMGVNAVLYYGPSIFENAGLSGGD SLFYQVLVGLVNTLTTVLALVIIDKVGRKKLVYYGVSGMVLSLVLIGLYFLFGDSLGVSS LFLLVFFLFYVFCCAVSICAVVFVLLSEMYPTKVRGLAMSIAGFALWIGTYLIGQLTPWM LQNLTPAGTFFLFALMCVPYMMIVWKLVPETTGKSLEEIERYWTRSE >gi|222159322|gb|ACAB01000037.1| GENE 144 162581 - 163753 1222 390 aa, chain + ## HITS:1 COG:all3695 KEGG:ns NR:ns ## COG: all3695 COG2942 # Protein_GI_number: 17231187 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Nostoc sp. PCC 7120 # 2 388 4 388 388 469 57.0 1e-132 MDFKKLANQYKDELLDNVLPFWLENSQDHEYGGYFTCLDREGKVFDTDKFIWLQGREVWM FSMLYNKVEKRKEWLDCAVQGGEFLKKYGHDGDYNWYFSLDRSGRPLVEPYNIFSYTFAA MAFGQLSLATGNQEYADIARKTFDIILSKVDNPKGKWNKLHLGTRNLKNFALPMILCNLA MEIEHILGKDYLEQAMDTCIHEVMDVFCRPELGGIIVENVDVNGNLMDCFEGRQITPGHA IEAMWFIMDLEKRLNRPELIEKAKNITLTMLEYGWDKEYGGIYYFMDRDGYPPQQLEWDQ KLWWVHIETLISLLKGYQFAGDKQCLEWFEKVHNYTWEHFKDQEHPEWFGYLNRRGEVLL SLKGGKWKGCFHVPRGLYQCWKTLEAISQK >gi|222159322|gb|ACAB01000037.1| GENE 145 163760 - 165898 1369 712 aa, chain + ## HITS:1 COG:no KEGG:Phep_2992 NR:ns ## KEGG: Phep_2992 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 3 711 7 725 725 561 40.0 1e-158 MNKLRIISILFFCLFLFSCGVKKEKIVCYGDAHSNLAQLLTDEGYQLHFCTSVTEALRIA SEQAPVLLLCPSYPEQGTVVTSEDLALIQSKSLRVFMDFPQQIGEHLCVKTDTMELERIV VCDSLTPQLPSMALMAFHRCVLKELDLTPDSTYLVAARVAGFDKAVYGLANTPVHPLLYQ QNNQLMVAATSISNFAVCRYLPEQRVQSMFEYIMNWLLHKEGVTFSSWLTYVSPSYTATE PLPEEAGKQSIAKGVEWYYNGHFLVHPSWKKDWADKYMGDGLMPVGPELPADMPDGDGSL GVLEGHMSGIKYDGTQMYRYWMRDDVQGETSFAFAAAGILLDNSQYTKVAANLLDYSFTE YRDSVRNDPKSPSYGLLGWAYTHKGTYYGDDNARSLLGSIASSALMNNPKWDKQIVEGIV GNFRTTGLNGFRGQNILESDLQKRGWKSYYNANLVNLHAHFEAWNWACYLWLYNQTHYQP LLERVKRGITMMMEGYPEQWSWTNGIQQERARMILPLAWLYRVEPTEEHLDWLHFMTNEL 
LRNQVPFGGIREELGDESKSLFGRTPSNAEYGNNEAPLIFDNGDPVADMLYTTNFAFVGL CEAAKATQDTTYIKAVNQMRDFLIRIQVRSDKFKNVDGAWFRAFNYEDWNYWASNADAGW GVWSTLTGWIQSWIVGTQFVLEEDSSLWDIANQKDVSTVASEVIEEMITRQL >gi|222159322|gb|ACAB01000037.1| GENE 146 165936 - 169064 2181 1042 aa, chain + ## HITS:1 COG:no KEGG:BT_3604 NR:ns ## KEGG: BT_3604 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 1042 8 1051 1051 1373 65.0 0 MKHKAMQKGRFCSMRILFFSLTILFVCASNSYAQSTVNGVVVDATGLPLPGVSVTVKGSA TGTTTDLNGHFTLNAPRSATLVLSYIGMITQEVKVNSRSMINVTLQEDINSLDEVVVVGY GTQKKIHLTGSISSVSSKELLKSTTSNVSQALVGKLPGLISQQATGAPGADDVSLLVRGH STYNGGDGPLILVDGVERSMAYINPNEVESVTILKDAASCAVYGMKAAAGVILVTTKRGT EGKTTINYKGSLTLSHATTLPKFMNGTHYMQWYNYARRMDGEKVYFTDEEIAMTTNGDPT DGFENTDWQEPIYRTTLMHQHNLSISGGNEKTRYFLSGGFMKQNGFIKGFELERGNFRSN IDTQVTKDISVSLNVAGKINDYYQPGGDSYENQTTNNVVGVLLYAAPFVPLEYEGMPASG YRGASNPDYAAGHSGYSKTRTMRLETSAKIEYSFPFLKGLKAGMFVGWDWQDRDSRSFKY SYELMLYKPESKKYVRQYASNLQPTGGMSVGDEKEQQVVLRPSISYNQKFGLHDVGALFL YEQTERKGNTLTGYRSDFALLDIDELPFGSTIHPTDGNFSSSIRQGYAGYVGRFNYAYNN RYLAEFTFRYDGSYHFKEGNRWGFFPSASLGWVASEEDFFKELFPQVERFKLRASFGILG SDNVDPFLYRKQYAWSKNNTVFGTTPQAVNTLYNKVSYPMENLTWEKCRSINVGFELSAW NGLLGIEFDVFYKYTYDILRPIGGVYPPSLGGHYPSIENSGTFDNRGFEITIKHRNHIGK FNYSLNGNLSFAHNKILRMTQADNTKPWQNRLGTSVGAIWGLKSLGLYQTQTEIDAAPLP ISETPRLGDIRYLDYNGDGLISWDDEVKIARPTTPEMMFSLMADANWKGFDLSVQLQGAA LCDKLLCGEWNNGARDQTPLTRPFYAGWDNAPYYLVENSWRPDNTNAEYPRLSTVAYANN AQVSDFWKRNGAYVRLKNVTLGYTLPQSWVKKAGISNLRLFASGHNLFTFTEFKYLDPES ANVIQGYYPQQRTFTFGVDVTF >gi|222159322|gb|ACAB01000037.1| GENE 147 169101 - 170795 1582 564 aa, chain + ## HITS:1 COG:no KEGG:BT_3603 NR:ns ## KEGG: BT_3603 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 564 5 561 561 688 60.0 0 MKTKNLLLGLALSTLTLQSCNVLDIDPTDTYSESTAYASIKNLDLYVKSFYGVFYSNADI NVGANLAMDDGVSDLIKYSWFNVAEGSVNKLFYYDNMMSPDGNYRSNWDDMYEQIRRFNE YFYDVHSGFADKLDSDQLAIRTAEVRFMRAFAYQELVLRHGGVILRINEDRVDDHNQRAQ 
ARSSEDDCWTFILDEYEKAAQVLPEEWTGSEAGRLTRGAAYGMKARAALYAKRWQDAVNA ALEVEKLEKKGVYQLLSGTSKDSYMQIFNTVNNKELILPVYFQQKTKQHMFNHFFCPPYD TEKAGLQPGTLGAAATPTDEYASAFDIKVGQEWKSFDWTHLNEYSEGPWGNRDPRFYASI LYNGANWIGRQLELYVGGKDGYMDFSTSASQDYVRTSTTGYIFRKFMDESDNINYVDIES TSYWIEMRYAEIVLILSEAYARLDKFKEAYDYLNKIRTRVGLPILAQQSTWDNYLKDLSK ERVCELGLEGHRWYDLVRWDIAQKVLNGQRLHGIKIEKTGSSFLYTRVECDTQDRKFPKK YNIFPIPSSELRNNTMCEQTPDWK >gi|222159322|gb|ACAB01000037.1| GENE 148 170826 - 172694 1619 622 aa, chain + ## HITS:1 COG:no KEGG:BT_3602 NR:ns ## KEGG: BT_3602 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 619 1 617 619 551 48.0 1e-155 MKLKHIINGFWVFSLCLAGCVDPDDLVRTESEITDQLIITGRFAGNENIEYETVIDRVNG TAVVQTPYYISDTEPIQGDITRMRLKATLPVGARFDPPLSGIHNLKEGFSSTLIEADGSK KNYTFKAIYKKSERANIIKVELANVRAQIVVADPVEEDKKGQIIIYKTSSSIDGELQNAL ITIAPWATIESDAYDPMTKILDLSVMPTVTITAQDGVKKTVYETVYKTPEFVEYGVGYVA GLFGFQTLKENSHGFEVGANRTMAVVDDYLILSNSNNFVNMPVFNRYNGKLLDQVKVNVD GIPSNRIIRAITSDDNGHMVAMAYVSTRAGSYATPYTDVNVWVWKNGIENTPTLILDKPF SDPVFDQAPMGVNWNYNIDMGSSIHITGDITSGKAVLTTTSPFSYRMVFISFVDGIMSDK AHVEFANVSGARVEDYTKILTTNTEKPFSYISTPANETNMVICAPVGDSADRAFAFTVPD SHYWGWQGFTKGIDYVDFNGARLLAVQNGSGSFNGAQRLYVADITKNPNAKSMSDGFIFD SQQGNAIGNADVPGGVPGSGYTATGYTSPWAFDGVSSVLGENIPCTGDVIFAKSQDGLAV QVYMLTTDHGLIAFELTKFKGL >gi|222159322|gb|ACAB01000037.1| GENE 149 172764 - 173966 1032 400 aa, chain + ## HITS:1 COG:no KEGG:BT_3591 NR:ns ## KEGG: BT_3591 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 400 1 393 393 453 54.0 1e-126 MNKKCMKKIVLTFVSLLMGGLCSVLVACSDEEHTVPVFPEDGSGEIEIKPDQKYDWETNR QSILANTDMVLLYSGGDQRLIWTQERAQPYITYVDEQNTPHWFFDSFLMLEILNTSDNWQ TVREYTKGMRYESATKAEWMKLIDCYFNSETGIAAIEAGVKKAMTTMGAPAYKRQVIIGI PEPIDVQNELVSGSSSVYWGEIDNMSLDFSKPADRVKACKWFIDQVRARFSEKEYQYVDL AGFYWIAEDASHTGNIITPIANYLNELKYSFNWIPFFNSDGHESWKELGFHYAYYQPNYY FDDKIPLTRLDEACKEALRCNMQMEIEFEDDVLAAHGKAYRLENYMAKFKEYGVWEKCRL AYYQSNNALLTLKYSSEPADVALYHKFCKFVIERPIRDSH >gi|222159322|gb|ACAB01000037.1| 
GENE 150 174107 - 176269 1291 720 aa, chain + ## HITS:1 COG:no KEGG:BT_3590 NR:ns ## KEGG: BT_3590 # Name: not_defined # Def: alpha-N-acetylglucosaminidase precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 715 14 728 732 1148 75.0 0 MVSCQRNQDPATNTLSEMCERLFPQHAHSFQFQLLTDSVDIDRFTLESDNGKILIKGNNR NSLAVGLNHYLKYYCQAHVSWYASDSVVMPAQLPEVEAPVILRSKCKNRFFLNYCTFGYS MPYWKWSDWERLIDWMALNGVTMPLAITGQESIWYKVWTEMGLSDEEVRTYFTGPAHLPW HRMSNVDYWQSPLPQSWLADQEKLQKLILERERAFDMTPILPAFAGHVPAELKELYPEAK IYTMSQWGGYDEKYRSHFIDPMDSLYSVIQRRFLEEQTKVYGTNHIYGIDPFNEVDSPNW NEEFLSNVSDKIYKSIQGVDSAAQWLQMTWMFYHAKEKWTQPRIKSFLNAVPQDKLILLD YYCDYTEIWRDTEQYYGKPYIWCYLGNFGGNTFLAGDLNDVDFKIDRLFKEGGDNVYGLG VTLEGLDVNPLMYEFVFERAWQNSMPVHQWIANWAQCRGGNVDNHIVKAWKQLYEKIYTS AALCGQAVLMNARPQLEGVEGWNTLPGYDYKNIDLWEIWKELLKAEGVYHSEYHFDVINV GRQVLGNLFADYRDKFTDCYRKKDLEGTKVWGQRMDQLLLDVDRLLCCSPVFSIGKWIKD ARDFAVNEQEQKYYEENARCILTVWGQKDTQLNDYANRGWGGLTRTFYRERWKRFTEEVI AAMTRHKNFDEEKFHQDITQFEYEWTLKNEDFPIISEENPISLAKELILKYDDDFCSLYP >gi|222159322|gb|ACAB01000037.1| GENE 151 176293 - 178872 1301 859 aa, chain + ## HITS:1 COG:TM0076 KEGG:ns NR:ns ## COG: TM0076 COG1472 # Protein_GI_number: 15642851 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 28 767 4 764 778 464 36.0 1e-130 MKFIIKDILIICLFCCASSLNAQHAFPYKNPSLPTEERVNDLLNRMTLQEKIAQISHLQS WDVFDGQKLNTAKLAKMCGDKGYGFFEGFPLTAAQCRKNFRIIQTYLLEQTRLGIPGFSV AESLHGVVHEGSTIYPQNIAIGSTFNPELAYEMTKHIAGELNTIGVKQVLAPCIDVVREL RWGRVEESFSEDPFLCARMAVAEVKGYMDHGISPMAKHYGPHGNPLGGLNLASVECGIRD LFDVYLEPFEAILAETDILAVMSSYNAWNREPNSASKFMLTDILRDRFGFRGYVYSDWGV IDMLKNFHETAGNDFEAASQVLTAGLDVEASSLCFKSLESKVLAGEFDVRYIDRAVKRVL RAKFELGLFEDPYLEKNSYRWPLRAKECVSLSRQIADESTVLLKNEGNLLPLDIKKLRSV AVIGPNADCVQFGDYTWSKNKEDGITPLQGICRLAGKKVKVNYAQGCSIASFDQSGIEEA VCAAQQSDVALLFVGSSSTAFVRHSSAPSTSGEGIDLSGVELTGAQEELIEAVCATGKPV VLILVAGKPFAIPFAKKNVPAILVQWYAGEQAGNSIADILFGKVNPSGKISFSFPQSSGH LPAFYNHLTTDKGFYKEPGTYETPGRDYVFSSPNPLWAFGHGLSYTTFDLVSAIADKTHY 
QAHDTIAVKVKIANSGEVAGKEVVQLYIRDVVSTVMTPVKQLKAFEKISLNPAETKEITL KVPVHELYLTDNIGNRYLEPGTFEIKVGTASDRIVHRISIEVGSKLEKTPVVDSPQIIKV TPSGEFITVQGFVRDAQATPVAYVTVRAISSGQETQTDEKGFYSINLRTDDSIAFVKARF ITQQMEVKGHKNINIRLVK >gi|222159322|gb|ACAB01000037.1| GENE 152 178894 - 179976 566 360 aa, chain + ## HITS:1 COG:no KEGG:BT_3593 NR:ns ## KEGG: BT_3593 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 360 10 358 358 452 63.0 1e-125 MSFSVYRKIANLILLLVASNLSAQVTYYDAAEFQLLGKATAATTERYVRLPDSLEHISRL PLWQLSRNSSGMAVRFRSNSTQVAVKWELLVNFHMDHMTDVAVKGLDLYCLEGKNWYFVN SARPMGKSTESSLISGMEAKEREFMIYLPLYDGLVSLSIGIDAGASINQPAMESPVREKP VVFYGTSILQGGCASRPGMAHTNIISRRLNRECINLGFSGNAFLDLGVAQVMAGVDAGVF VLDFVPNVTVEQMNERMEKFYRILRDRHPHTPVIFIEDPQFMDSYYNNANARKIKTLNDT LRRIFNELKKGKEKNIYYISSKRMLGSDREATVDGIHFTDVGMMRYADLVTPVIKKLLKK >gi|222159322|gb|ACAB01000037.1| GENE 153 180805 - 181596 489 263 aa, chain - ## HITS:1 COG:no KEGG:BT_3593 NR:ns ## KEGG: BT_3593 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 263 97 357 358 391 68.0 1e-107 MTDTGVKGLDLYCLEGNCLWRFVNSARPTGKINQVTIIANMQPEEREYMLYLPLYDGLVS LAIGVDSLSTIDQPLIDYPIRKKPVVFYGTSILQGGCASRPGMAHTNIISRRLNRECINL GFSGNALLDLEVAKVIAEVDASVFVLDFVPNASVEQMKERMETFYHIIRSKHPDTPVIFI EDPIFTHTLYDERVSKEVQKKNDTLKEIFNRLKKENEKNIIFISSKNMLGEDGEATIDGI HFTDLGMMRYADLVYPIIKKAIK >gi|222159322|gb|ACAB01000037.1| GENE 154 181881 - 183068 678 395 aa, chain - ## HITS:1 COG:all4081 KEGG:ns NR:ns ## COG: all4081 COG1649 # Protein_GI_number: 17231573 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. 
PCC 7120 # 12 393 60 421 430 166 29.0 6e-41 MSEKLKFCLCFAILFISLQSIAKKTEQGIRGVWVPAPRFTSVLHTYQGVKDFVKTLDELN MNSIFLVSYAETKTIYKSDVLMHYSTYKTQEESYLLSGYSKQYQSPTNDPVRDLIDEAHK HDIKVFFWFEYGFMGEGRPISLDNPLLAKNPHWLGIDNQQHPANYNQHDYYFNAYNPAVQ NFLIELIEEALTLYPDLDGIQGDDRLPAMPRNSGYDTYTVSLYQSQHQGKNPPADYNNPE WVRWRLDILNTFAKRLYKRIKAKSPNVMISFAPNPYPWCEENLMQEWPRWCKEKVCDLLA VQCYRYSIEAYRATVSEVLKYIHQNNPNQLFAPGMILMEGSSSKMSPELLREQLRVNREL GINSEIYFYNKGIDNPSVREVLKQAYHQKIKFPAN >gi|222159322|gb|ACAB01000037.1| GENE 155 183131 - 185461 879 776 aa, chain - ## HITS:1 COG:CAC2633 KEGG:ns NR:ns ## COG: CAC2633 COG4632 # Protein_GI_number: 15895891 # Func_class: G Carbohydrate transport and metabolism # Function: Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase # Organism: Clostridium acetobutylicum # 657 776 242 353 354 64 32.0 9e-10 MKLLIKKITCVILILCSYCSYGEAKIQLPSILGNGMVLQQKSEVKLWGKATPNKKVVVYT SWNAQQQEVDSNQKGDWSVAVTTPEAGGPYTIRISDGEELILDDVLIGEVWLCSGQSNME MPVKGFRGQPAAESQNTIVNANSNRSLRLFTVQRAYSSVPQENVAGQWERNTPKSVSTFS AVAYYYGDQLQKVLGIPVGLIHASWSGSSIEPWISKENLLQFPEIDLTPAANPQSKYANG TPTVLYNAMIKPLENYNIKGMIWYQGESNSARPEQYQRLFAVWAKQNRTLFRSKDFPIYY TEIAPVASPVDRPFQRAIFREAQLESMYEISNTGMAFTNDLGSEKFIHAPQKREIGQRLA YWALAKTYQLKGFEYSGPIHRSYMKNGKVIEILFDHADDGLNPENEPLVGFEVASEDSIF YPANAEIINGTSRVKVWNDKVIQPVYVRYAFRNFLRGNLINNAGLPATPFRMDLRKLDFQ NPENLGWTRVTTFGKLPEYVNVYHSPEWIESTRTNAYIAVIDTKKGGSLDVGGEESGIKT PTEFYQSEKRKPVIVLNGGYFANGKTVSLICKDGRILSDNISVVNRILEGKKTAYYPTRS VFSLYKDGTYHVDWIYKSNQQTYAYDMPALNSSTRPPLSVPSKGFPRGAKVWSAEMGIGA GPVLIKDGMIRNSWVEELLDVASGINPQTCQPRSAVGITQDGKLVLFVCEGREQTPDVPG MTLDQLARLMKAFGCVDALNLDGGGSSCMLINGKETIKPCNKDHQQRPVATVLFAR >gi|222159322|gb|ACAB01000037.1| GENE 156 185469 - 186872 743 467 aa, chain - ## HITS:1 COG:no KEGG:BT_0446 NR:ns ## KEGG: BT_0446 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 467 1 469 469 501 58.0 1e-140 MNKRAFSLDALRGYAIITMVLSGTIASGVLPGWMYHAQVGPRSNFAFDPSFYGITWVDLV 
FPFFLFAMGAAFPFSIGNKLEKGESKLRIAWDCLLRGFRLTFFAIFIQHMYPWVTSSPQD VRSWLLALFAFVLMFPMFMRIPVKMSKWLRGGIQLSAYALGIVMLLTVNYANGREFSLSY SNIIILVLANMSIFGSLAYLFTAKNKWARIAILPFIMAVFLGSKTDGSWVKALMNYSPIP WMYSFYYLKYLFIIIPGSIAGEYLKEWLQSKSEADSPLEEKRRIPLILLLTIGIIIFNLY GLYTRQLLLNLCGTVVILVCIYWLLKTRSNNMDYWRKLFVAGAYLLMLGLFFEAYEGGIR KDDSTYSYYFVTAGLAFMAMIAFSILCDIYKCRQLTRPLEMAGQNPMIAYVATNLVVMPV LNLIGVASYLSYLQQNAWLGFLRGIIITTLAALIAIIFTKLKWFWRT >gi|222159322|gb|ACAB01000037.1| GENE 157 186904 - 187788 571 294 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716108|ref|ZP_04546589.1| ## NR: gi|237716108|ref|ZP_04546589.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 294 1 294 294 588 100.0 1e-167 MKKYLFYSLFLSLAMGFYNCSDSDDGEPEPEPTDPINTYFKDEIKPQASVACFAGAYYHK AVTSKDLWLGIGGTIKLPTATFDEDRKNPSKPGQYLDNPSIYLGGNMGGQETDIGLTWEV VKDEQGNISAERKAFRPFMRRTSHSSGQASNYSNAPAQKEYYWYPGEEVTISIQIIRSGV LKFIVDGNGKHYESEYECAGYKQGTRGEFKRVNAIDQVSNEGKPTQATNTKVEGAQWKES FLFRMYDGKIVKAPLHTGRYTDMRCPDAMYFDINSSEAEKKTGAESVNINGAGY >gi|222159322|gb|ACAB01000037.1| GENE 158 187831 - 188706 407 291 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294647256|ref|ZP_06724853.1| ## NR: gi|294647256|ref|ZP_06724853.1| hypothetical protein CW1_2827 [Bacteroides ovatus SD CC 2a] # 1 291 10 300 300 598 100.0 1e-169 MKQNYNYIFLLTCLIIGGTSCSHKYRTSSPKSVETFFKNEVKPEKFIECFPGAYYRKVNS SKDVWLGVGGTVTLPQLSFDQTRKNTKKPGQFLDNPSVYLGGNMGDQETDFGLAWEVIRE KDGKLSKERKAFRPFMRRSEHINGQEPNYAQAPPEDKYYWYPGEKVTMYFQVLETRKVHF VIEGAGKRFECDYDCEGYIPGELGTFKRVNAIDQVANEGKPAQATKTKVLNSRWDESYYF RKYKNEIVKVPIHEGRFTDMRCPDSHFFEVTSTDEGRKIGAETISINGNGY >gi|222159322|gb|ACAB01000037.1| GENE 159 188765 - 190525 1252 586 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716110|ref|ZP_04546591.1| ## NR: gi|237716110|ref|ZP_04546591.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 586 12 597 597 1118 100.0 0 MCITALFTVTFSSCSDDDENRAYESVQIVGVKVNNELYTPSSASATETTVLIPAGVDLSK AKLQLLVINGTANFINDQEYDARTPLDLTLNGFDGTTVQTKLRIQSPPKLVSFIIEGMTV PNSDIHTGEESLIVQVPEETDLTALKVTMEYINGTIMDFQNGVALDYTNPRSFKIKGVDE ETIYTYEFIITTEKVGPASIKAMTINGIETDYVLTDDKNVAVPYIPALMDFTSVNVELTA GFGNKIDESFTGQGLNLMNGNNKVSIKGSNGVTTEFTIGIPQISAEPVFKKDYTELAGFG SDNLISVGISDPYIIAGNHSSTKKTPAYFDYTGNKIENLNDKGLSIAGHGIRIMATDDKG NILGTSLALSGDKPVLYRWSSVTAEAKEFISYDKSALGESATPRLAGIGIIGDLDGDATI VATKAQSVDVFVWKVTNGVVNPTPQKYAFPVATPSYYWNVVPMPVGMTGYMGFFSTSATN GLIWMNSTMGEVSRSSGVRTSGGDVITINGRVYVAYTAYSGDQKGVMRICDITDGKYNQI FNYTMEASGANGNSTASACLMVKNNELYAVFGCTGSGLYFYRIACK >gi|222159322|gb|ACAB01000037.1| GENE 160 190583 - 192592 1198 669 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1063 NR:ns ## KEGG: Cphy_1063 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 31 654 201 830 1263 460 40.0 1e-127 MNRLIILLSLLLVPVFISAQTVVTIEAASPDPTLTVRGLEKQIDLFNNPVWEKQEKGIVL LSTEYQNAIKSTQSIYTAVIVNKDMKVTKVLNGVISKNIQPVFKTPLDIELGQAEFALIG YDADYSKDGYRKFLAENFHVGDVVKLRINGEIHSLDKVIAFSQGSIPPQIELDNDFLFTV VGSKTTLSGCIANYDRKAGYQLFIESQTEIKPVPLTAKGLFHSQLTLNNGTNFFNWILKK GGKEITRKPVVVFSKAPDQQQSELVMWVEQFPNAKVLTNREAVTTMVNNVKKAGFTSIGL DVKGPEGYVSYRKNDLSKTPYLTATKNPNKQVKDDGFDLLEVVLQEAHKIGLKVYTSFNF FTEGNITVNDYAILHEHKDWEEIVQRPEDKGKLLKITESTRGKEAAKGKLLALAFVNPSN KEVQDFQLLRVEEVLKNYDIDGIVLDRCRYDNLYADFSHVTRNAFEEYLEKEGKVLENFP ADAFRINKEGVLIKGKFFKEWITFRSQTICDFTNRIRLLVDKYKVEKNPDLKMAAYVGSW YEVYYQNGVNWASNQFKYDDRLSFPDSEIYGKSYNRTSYLGNLDFLMIGTYYKTPKEVNR YITLGNILTCGQVPLLGSMSLPDLSVSDQGKVFGASLKNSSGLMIFDNCYVDWETFFEQM KIAFSIKKK >gi|222159322|gb|ACAB01000037.1| GENE 161 192610 - 194496 1485 628 aa, chain - ## HITS:1 COG:no KEGG:Slin_6287 NR:ns ## KEGG: Slin_6287 # Name: not_defined # Def: RagB/SusD domain protein # Organism: S.linguale # Pathway: not_defined # 3 626 2 531 531 241 32.0 8e-62 MKKAIKYIFNNKVLASGALILSLGIAGCSDYLDKEPMSEYLSSNFYNNDGAIAQGANGCY QRLLMDHSNTSSIPYCILWDMYTPYGIERADNSSIGVGNIELRTNFTVEQTWAILYTSVA 
RCNSVLDGAKPFYNELSDKAKIYLAEIQVLRSHYYIQLVSLWGDIPYFTSAVTEEQTKQV SRTPWKEVVDDILKTLDEAADILPWTAANYGRVDKSVALGLKARLALYAGSWCKFGFGMD GEKDQVKATEYFKIAAAASKKVIDESGRDLATNYADLFTRTGQLKEDVKKETMLFMMFSN QIHSFTQYMSLGEQVRMIGQSGRFPTQQLVDTYEMKNGKRIDETGSGYDPKKPFDSRDPR LKETVYTHHDVIIGNTGGDNKMKFLMEVYNPQTTSWDKDGNEKLVANLDYAGAVAQYGYV SSGVGFAWKKYNHFDDEANANPSYNILIMRFAEILLTYAEAKIELNELDATVVNAIDRVR ARVDMPGILSVDPTRENDQLKMRQIVRRERKVELAKEALMLFDMRRWRTGDIQNAEPTYG YPKATGVDPTTGKYPDGYEQATPDMVPSYGASGSDRDINDIASYAAYGDKLRSRDKGRSW NNRHYLWPIPQTERNKCPWLEQNKGYGE >gi|222159322|gb|ACAB01000037.1| GENE 162 194510 - 197536 2013 1008 aa, chain - ## HITS:1 COG:no KEGG:BF1062 NR:ns ## KEGG: BF1062 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 7 1008 122 1126 1126 561 36.0 1e-158 MRKILFLILMLSSVGIQAQNLIVKGVVSDESDVLPGVSIFVEGTSKGTISDINGKYSIEV KKGSKLVFSYVGYRTEELIANAPVLNVKMKADAIQLEEAIVVGYAKQKKATLTGAVSSVS SETITKRSVASLSTALQGAMPGVTIQQTSGEPGGDGGSIRIRGIGSINSNTDPLVLVDGI EMSIDQVDANTVESISVLKDAASASIYGSRASNGVILITTKRGQKGKITTTYSGYLTIQR PTNMPEPVAAWEYLQAELNAWDNAEITVSDAQRAQQLQQIEEQKTLRPDNWNRYDTDWKD ETMKNHSIMHNHNVTISGGSDKLTFFGSGTYLYQDGLIPNDNYSRTNLRLNADAQILPWA KFSIETALRQGKKVNPGLSTPKQIINQSLYMLPTLSAARELDGNWGYGKNGMNPTAQAYD SGEKITKGTDAVVNGTLTLTPIKGLELVGQYSRRQSTSRGRTLITPYTTSLRGQIMGSYP TDDSLTESWSETVRNYYRAQASYENKFFDHYGKILVGFQAEDNLNTSFSGGKRGFDLGRY YLGNGDSATATSSGGANSWAMMSWYARLNYNYKQRYLLEVNGRYDGSSRFTRDNRWGFFP SLSAGWVISEENFMKSTRKVLDFLKVRASYGLLGNQNIGNYPYAATIATGYGYYLGGEEA DKELVSGVAQTTLANSDISWEKSKQINFGIDFSLWNGLLSVTADYYIKNIYDMLMKFPLP YYVGMSPAYTNAGDMSNKGWEVSVSHKNKLNDFTYGVTFTLSDNRNKITNLNGLNSQDKT MVEGYPNKGIWGYVTDGYYKDWDDVNNSPKLGDARPGFVKYVKTYQGEDSDPMTIDTRDM VYLGDPFPHFEYGVTLNAGWKNFDFTAFFQGVGQRVNYMSGVGLKPFANGSNLFRHQMDS WTPDNQDAAYPILVPEANAGPNYQKSDKWVRDASYCRLKNVVLGYTLPNSWTKKLNIGSL RVYASGQNLFTISNFYKGYDPEVAYSGSVGGEFYPIMQTFTFGIDLKF >gi|222159322|gb|ACAB01000037.1| GENE 163 197557 - 198822 1249 421 aa, chain - ## HITS:1 COG:all3695 KEGG:ns NR:ns ## COG: all3695 COG2942 # Protein_GI_number: 17231187 # Func_class: G 
Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Nostoc sp. PCC 7120 # 31 417 4 388 388 482 58.0 1e-136 MDSKNNIGHSADISLTAELPIPIYNGNTIMDFKKLASLYKDELLDNVLPFWLEHSQDHEY GGYFTCLDREGKVFDTDKFIWLQSREVWMFSMLYNKVEKRQEWLDCAIQGGEFLKKYGHD GNYNWYFSLDRSGRPLVEPYNIFSYTFATMAFGQLSLATGNQEYADIARKTFDIILSKVD NPKGKWNKLHPGTRNLKNFALPMILCNLALEIEHLLDESYLKETMETCIHEVMEVFYRPE LGGIIVENVDIDGSLVDCFEGRQVTPGHAIEAMWFIMDLGKRLNRPELIEKAKETTLTML NYGWDKKYGGVYYFMDRNGCPPQQLEWDQKLWWVHIETLISLLKGYQLTGDKQCLGWFEK VHDYTWTHFKDKEYPEWYGYLNRRGEVLLPLKGGKWKGCFHVPRGLYQCWKTLEEITNIV S >gi|222159322|gb|ACAB01000037.1| GENE 164 198867 - 199331 325 154 aa, chain - ## HITS:1 COG:BS_araE KEGG:ns NR:ns ## COG: BS_araE COG0477 # Protein_GI_number: 16080449 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus subtilis # 3 152 318 464 464 96 39.0 1e-20 MTTVLALVIIDKVGRKKLVYYGVSGMVVSLILIGLYFLFGDSLGVSSLFLLVFFLFYVFC CAVSICAVVFVLLSEMYPTKVRGLAMSIAGFALWIGTYLIGQLTPWMLQNLTPAGTFFLF AVMCVPYMLIVWKLVPETTGKSLEEIERYWTRSE >gi|222159322|gb|ACAB01000037.1| GENE 165 199445 - 201076 1371 543 aa, chain + ## HITS:1 COG:STM0928 KEGG:ns NR:ns ## COG: STM0928 COG4409 # Protein_GI_number: 16764290 # Func_class: G Carbohydrate transport and metabolism # Function: Neuraminidase (sialidase) # Organism: Salmonella typhimurium LT2 # 193 533 57 399 412 114 30.0 6e-25 MRRIYYLLFLILLGYSFDVKASDTVFIHETQIPVLIERQDNVLFYLRLDAKESKKLDEII LDFSKSTNLTDIQAIKLYYGGTEALQDKDKNRFAPVEYISSHRPGGTLAAIPSYSIKCAE VGSSEKVVLKGNYNLFPGVNYFWISLQMKKDASLQTKILSDLCAVKVDGKELCCKSISPK NIVHRMAVGVRHAGDDGSASFRIPGLVTTNKGTLLGVYDVRYNSSVDLQEYVDVGLSRSI DGGKNWEKMRLPLSFGEYGGLPKAQNGVGDPSILVDTKTNTVWVVSAWTHGMGNQRAWWS SHPGMDLNHTAQLVLAKSTDDGKTWSKPINITEQMKDPSWYFLLQGPGRGITMSDGTLVF PTQFIDSTRVPNAGIMYSKDRGKTWKMHNMARTNTTEAQVAEIEPGVLMLNMRDNRGGSR 
AIAITKDLGETWTEHPSSRQALQEPVCMASLIHVDAKDNILNKDILLFSNPNTTKGRNHI TIKASLDKGLTWLPEHQLMLDEAEGWGYSCLTMIDKETIGILYESSVAHITFQAIKLTDI IKE >gi|222159322|gb|ACAB01000037.1| GENE 166 201367 - 203979 1729 870 aa, chain + ## HITS:1 COG:XF0846 KEGG:ns NR:ns ## COG: XF0846 COG3250 # Protein_GI_number: 15837448 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Xylella fastidiosa 9a5c # 50 849 59 860 891 527 38.0 1e-149 MMINLIGKKTQIACGLLCCCSMAYAQSNDNSEVVVLNTGWEFSQAGTELWRPAQVPGTVH QDLIYHKQIPDPFYGINEQKIQWVENEDWEYRTAFTVTPEQLKRDDAQLVFEGLDTYADV YLNGALLLKADNMFVGYTIPVKSQLRLGENLLHIYFHSPIRQTMPQYNSNGFNYPADNDH HEKHVSVFSRKAPYSYGWDWGIRMVTSGIWRPVTIRFYDAASISDYHVKQLSLTDQLAKL SNELEINNILPQALQAEVRINTSFEGNTEKGISQAITLQPGINHISIPSEVLSPVRWMPN GWGKPALYDFSAQIIVEDKVVAQQSHRIGLRTVRLVNEKDQDGESFYFEVNGVPMFAKGA NYIPQDALLTNVTTERYQTLFRDIKEANMNVIRVWGGGTYEDDRFYDLADENGILIWQDF MFACTPYPSDPTFLKRVEAEACYNIRRLRNHASLAMWCGNNEILEALKYWGFDKKFTPEI YQEMFQGYDKLFHQLLPTKVKELDADRFYIHSSPYLANWGRPESWGIGDSHNWGVWYGQK TFESLDTDLPRFMSEFGFQSFPEMKTIATFAAPEDYQIESEVMNAHQKSSIGNALIRTYM ERDYIIPEKFEDFVYVGLVLQGHGIRHGLEAHRRNRPYCMGTLYWQLNDSWPVVSWSGID YYGNWKALHYQAKRAFAPVHINPLLEGDNLCVYLLSDHLDTREKLTLEMRLTNFAGKKAG RTVVLPSLTLPANTSQCVYRTSLTTLFFPAKRPLADDLRHCFMQLTLKDKSGHTVAETVH FFRKTKDLLLPKTTVSCKIKQKDGVCELTLLSPCLAKDVFIEVPIQGARFSDNFFDLLPG ERKTVVITSPQIKKGEELPLTIKHIRETYN >gi|222159322|gb|ACAB01000037.1| GENE 167 204014 - 206338 2247 774 aa, chain + ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 31 608 31 596 757 432 42.0 1e-120 MKQLLKLTGCLALAGLFASCQSAQQDANYQIIPMPQEIVTAQGNPFILKSGVKILYPEGN EKMQRNAQFLADYLKTATGKDFAIEAGTEGKNAIVLTLGTANENPESYQLKVAGDGITIT GPTEAGVFYGIQSLRKSLPVAVGADISMPAVEINDAPRFGYRGAHFDTSRHFFTVDEVKT YIDMMALHNMNRFHWHITEDQGWRLEIKKYPKLTEIGSKRTETVIGRNSGEYDGKPYGGF YTQEQAKEIVAYAAERYITVIPEIDLPGHMQAALAAYPELGCTGGPYEVWRQWGVSEDVL 
CAGNDQVLKFLEDVYSELIEIFPSEYIHVGGDECPKVRWEKCPKCQARIKALGLKSDDKH SKEERLQSFVINHIEKFLNDHGRQIIGWDEILEGGLAPNATVMSWRGEKGGIEAAKQKHD VIMTPNTYLYFDYYQTKDTENEPLGIGGYLPLERVYSYEPMPASLTPEEQKYIKGVQANL WTEYIPTFSHAQYMVLPRWAALSEIQWSAPDKKNYEDFLSRLPRLIKWYDAEGYNYAKHV FNVTAEYTPNPTDGTLDITLSTIDNAPIHYTLDGTEPTAASPLYESPLKIKENVTFSAIA VRPTGNSRVVSEKVNFSKSSMKPIVANQPVNKQYMFKGESTLVDGLKGNGNYKTGRWIAF YKNDMDMTIDLQQPTEISSVAISTCVEKGDWVFDARGLSVEVSDDGKNFTKVASEEYPAM KESDKNGIYEHKLSFSPVKTQYVKVVALSESKMPAWHGGKDSPAFLFVDEITID >gi|222159322|gb|ACAB01000037.1| GENE 168 206379 - 206618 94 79 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294646589|ref|ZP_06724222.1| ## NR: gi|294646589|ref|ZP_06724222.1| conserved domain protein [Bacteroides ovatus SD CC 2a] # 1 79 5 83 83 152 100.0 5e-36 MGIGNVLNRTFPQSTVRHTCPIKNGAGDGLFFTVREHKTLYKLAGFRSAASGKLLASIAG SFEEREKEGYAEYAFSVDN >gi|222159322|gb|ACAB01000037.1| GENE 169 206506 - 208413 1478 635 aa, chain + ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 44 466 100 518 757 373 44.0 1e-103 MQGLEAQLLENCLQALPVHLKKGKKKDTQNMLSLLITEKNHQLPSPESYTLSVTPQQILI RATSGAGLFYGVQTLLQLAQPSGAGSYSIASVEIEDTPRFAYRGLMLDVSRHFSTKEFIK KQIDALAYYKINRLHLHLTDAAGWRLEIKKYPLLTEFAAWRTDPTWKQWWNGGRKYVRFD APGAYGGYYTQDDIREILEYARQHYITVIPEIEMPSHSEEVLAAYPQLSCSGEPYKNSDF CVGNEETFTFLENVLTEVMELFPSEYIHIGGDEAGKSAWKTCPKCQKRMKDEHLANVDEL QSYLIHRIEKFLNNHGRHLLGWDEILQGGIAPNATVMSWRGEEGGIAAVTSGHRAIMTPG AYCYLDSYQDAPYSQPEAIGGYLPLKKVYSYNPVPASLTAEQAKLVYGVQGNLWVEYIPT PEHVEYMIYPRILALAETAWSAPERKSWPDFHTRALSAVADLQAKGYHPFDLKKEIGSRP ESLQPVSHLALGKKVIYNSPYSSHYPAQGNTALTDGIRGDWTYGDGSWQGFISDNRLDVT IDMEKETSIHSVTAAFMQVVGAEVFLPETVVISISDDGTHFTELRKQHFEVSKETPIRFT DISWQGEAKGRYVRYQAQAGSEFGGWIFTDEIIVK >gi|222159322|gb|ACAB01000037.1| GENE 170 208609 - 211353 2287 914 aa, chain + ## HITS:1 COG:no KEGG:BT_0483 NR:ns ## KEGG: BT_0483 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 914 1 914 914 1635 90.0 0 MVQHARLIFYSLLLLVIPCEGTLAQKIPVAPIDSLITVGYATGSLKTLSGSVEKITETQM NKDQITNPLEAIRGRVPGLTIQRGSNGPAALDAVRLRGTTSLTSGNDPLIIVDGVFGDLS MLTSIYPTDIESFTILKDASETAQYGSRGASGVIEVTTKKGMSGRTQVAYNGSFGISTVY KNLKMLSGDEFRRVASERGISILDKGNNTDFQKEIEQTGLQQNHHIAFYGGSSESSYRVS LGFMDRQGVILNEDMKNFTSNMNMNQKMFDGFLNCELGMFGSIQKNHNLVDYQKTFYSAA TFNPTYPNHKDPVTNSWDGITTASQITNPLAWMEVQDDDATSHISTHARLTFNLMEGLKL SLFGAYTYNIVENSQYLPTSVWANGQAYKGTKKRESLLGNMMLTYKKNWKKHFFDVLALA ELQKETYTGYYTTVSNFSTDKFGYNNLQAGALRLWEGTNSYYEQPRLASFMGRFNYTYAD RYVLTLNARTDASSKFGANHKWGFFPSASAAWVISEEEFMKQFPVIDNLKFRIGYGLAGN QSGIDSYTTLNLVKPNGVVPVGNSAIVSLGDLRNTNPDLKWEVKHTFNAGFDIALFGNRL LLSANYYNSRTTDMLYLYNVSVPPFTYNTLLANIGSMRNWGTEIAIGITPLKTKDMELNI NANITFQRNKLLSLSGMYNGEMLSASEYKSLASLDGAGFHGGYNHIVYQMVGQPLGVFYL PHSTGLESDGNGGYTYGIADLNGGGVSLEDGEDRYVAGQAVPKTILGSNISFRYKRFDLS LQINGAFGHKIYNGTSLTYMNMNIFPDYNVMKKAPQQNIKDQTATDYWLEKGDYVNFDYV TLGWNVPIEKVQKLKKYVRSLRLAFTVNNLATISGYSGLSPMINSSTVNSTLGVDDKRGY PLARTYTLGLSINF >gi|222159322|gb|ACAB01000037.1| GENE 171 211368 - 213050 1369 560 aa, chain + ## HITS:1 COG:no KEGG:BT_0484 NR:ns ## KEGG: BT_0484 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 28 560 45 577 577 970 85.0 0 MKRYMKNRNQFLQISKQIIYPSRILLFFLVTSLTLFSCDKFLQENPKDKLPEEDVYNTIS EVYLNAVASLYTYVGGYSDSQGLQGTGRGVYDLNTFTSDEAIIPTRGGDWYDGGFWQGLY LHDWGVENDAIQATWEYLYKVVMLSNKSLERIDKFAETHSDAALPAYRAEVQAMRAMYYY YLMDLFGRIPLVQSSSVAMKEVVQSERKTVFEFVFKELQEAAPLLSDAHSNQSGPYYGRI TRPVATFLLAKLALNSEVYTDNDWTDGQRPDGKNIKFTVNGNELNAWETVIYYCDQLKAL GYKLEPEYETNFSIFNEPSIENIFTIPMNKTLYTNQMQYLFRSRHYNHAKAYGLSGENGP SATIEALETFGYETAEQDPRFDICYFAGVVRDLKGNIIKLDDGTVLEYLPWKVALDITDT PHEQTAGARMKKYEVDPTATKDGKLMENDIVLFRYADALLMKSEAKVRNGASGDEELNEV RSRVNASSRPATLENILAERQLELAWEGWRRQDLVRFGKFTRAYSSRPQLPDEGNGYTTV FPIPEKIRVMNTKLKQNPGY >gi|222159322|gb|ACAB01000037.1| GENE 172 213060 - 214199 811 379 aa, chain - ## HITS:1 COG:BS_yhaZ KEGG:ns NR:ns ## COG: BS_yhaZ COG4335 # Protein_GI_number: 
16078046 # Func_class: L Replication, recombination and repair # Function: DNA alkylation repair enzyme # Organism: Bacillus subtilis # 5 377 4 356 357 258 37.0 1e-68 MAEPFKNMFNEQFFDLFTKDLKLVIDDFDACEFVSQVMDDEWEGRELKQRCMHITTVLRK FLPADYKEAIAKILELLDHIKKTRPDFSVIDDTKFGLTLEYGGILDNYVEQYGLDDYETS VKAIEKITQFTSCEFVAHSFIIKYPDQMMKQMLVWSKHEHWGVRRLASEGCRPRLPWAMA LPNLKENPAPIIPILENLKNDPARFVRLSVANNLNDIAKDNPETVIDLVKKWKGESKEVD WIIKHGCRTLLKQGNPEVMELFGFNSTISNICVEDFQISSPEVKVGDSLEVSFKLLNKND QTTKIRLEYGIYYQKANGTLTKKVHKISEKEYAGNSTTRITRKHSFRVVTTRKLHLGLHQ IAMIINGNELEKYSFELIE >gi|222159322|gb|ACAB01000037.1| GENE 173 214513 - 215700 1431 395 aa, chain - ## HITS:1 COG:ECs0015 KEGG:ns NR:ns ## COG: ECs0015 COG0484 # Protein_GI_number: 15829269 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Escherichia coli O157:H7 # 4 395 3 372 376 278 42.0 1e-74 MAEKRDYYEILEVTKTATVEEIKKAYRKKAIQYHPDKNPGDKEAEEKFKEAAEAYDVLSN PEKRSRYDQFGHAGVSGAAGNGGPFGGFGGEGMSMDDIFSMFGDIFGGRGGGFGGGFGGF SGFGGGGGSQQRRYRGSDLRVKVKLTLKEISTGVEKKFKLKKYVPCDQCHGTGAEGDGGS ETCPTCKGSGSVIRNQQTILGTMQTRVTCSTCGGEGKIIKNKCKKCGGDGIVYGEEVVSV NIPAGVAEGMQLSMGGKGNAGKHNGVAGDLLILVEEEPHQDLIRDENDLIYNLLLSFPTA ALGGAVEIPTIDGKVKVKIDSGTQPGKVLRLRGKGLPNVNGYGTGDLLVNISIYVPEALN KEEKNTLEKMEASDNFKPNTSVKEKIFKKFKSFFD >gi|222159322|gb|ACAB01000037.1| GENE 174 215816 - 216397 801 193 aa, chain - ## HITS:1 COG:alr2445 KEGG:ns NR:ns ## COG: alr2445 COG0576 # Protein_GI_number: 17229937 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone GrpE (heat shock protein) # Organism: Nostoc sp. 
PCC 7120 # 42 192 79 229 248 76 33.0 2e-14 MDPKEKKVKEEELNVEETQNHAEEQPQNEQAEDATPLTHEEELEKELEKAQEEIEEQKDK YLRLSAEFDNYRKRTMKEKAELILNGGEKSLSSILPVVDDFERAIKTMETATDVNAVKEG VELIYNKFMAVLAQNGVKVIETKDQPLDTDYHEAIAVIPAPSEAQKGKILDCVQTGYTLN DKVLRHAKVVVGE >gi|222159322|gb|ACAB01000037.1| GENE 175 216721 - 218340 1965 539 aa, chain + ## HITS:1 COG:BS_ykpA KEGG:ns NR:ns ## COG: BS_ykpA COG0488 # Protein_GI_number: 16078507 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Bacillus subtilis # 1 538 1 540 540 664 57.0 0 MITVSNVSVQFGKRVLFNDVNLKFTSGNCYGIIGANGAGKSTFLRTIYGDLDPTTGTIAL GPGERLSVLSQDHFKWDSYTVMDTVMMGHTVLWDIMKQREELYAKEDFTDEDGLKVSELE EKFAELDGWNAESDAAMLLSGLGVKEDKHYVLMGELSGKEKVRVMLAQALYGNPDNLLLD EPTNDLDMETVTWLEEYLSNFEHTVLVVSHDRHFLDSVCTHTVDIDYGKINMFAGNYSFW YESSQLALRQQQNQKAKAEEKKKELEEFIRRFSANVAKSKQTTSRKKMLEKLNVEEIKPS SRKYPGIIFTPEREPGNQILEVSGLSKKTEEGVVLFNDVNFNVEKGDKVVFLSRNPRAMT AFFEIINGNMKPDAGTFNWGVTITTAYLPLDNTDFFNTDLNLVDWLSQFGEGNEVYMKGF LGRMLFSGEEVLKKVSVLSGGEKMRCMIARMQLRNANCLILDTPTNHLDLESIQAFNNNL KTYRGNILFSSHDHEFIQTVANRIIELTPNGIIDKMMEYDEYITSDHIKELRAKMYGDK >gi|222159322|gb|ACAB01000037.1| GENE 176 218566 - 220314 1431 582 aa, chain - ## HITS:1 COG:no KEGG:BT_2553 NR:ns ## KEGG: BT_2553 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 582 1 582 583 996 83.0 0 MMKQIPYGLTDFGRIQKENYYYVDKTMFIEKIEMQPSYLFLIRPRRFGKSLTLAMLEAYY DVRYADQFDELFGHLYIGQHPTPIHNQFLIMRFNFSEVSSNINEVEESFRLHCCGKLRHF LQKYEHILGKEIWNVLNEETLEEPGALLSAINSYATLKGDIKIYLLIDEYDNFTNTILST YGTDLYRKATHGEGYIRRFFNVIKAATTGMGSAVNRLFITGVSPVTMDDVTSGFNIGTNI TTDPWFNDLVGFSEKELREMLTYYKEQGALPMSVDDAVTMMKPNYDNYCFSKNKLADCMF NSDMVLYCMKSLILHGVKPDEIVDPNIRTDFNKLAYLVRLDHGLGENFSVIKEIAEQGEI VTEIVTHFSALEMTDVGNFKSLLFYFGLLSIKGVDMMGRPLLHIPNLVVREQLFNFLIQG YARHDIFKLDVNRLRTLFENMSFKGDWKPLFEFLAEAIREQSRIREYIEGEAHIKGFLLA YLSMFRYYQLYPEYEMNKGFADFFFKPSPAAPVSPPYTYLLEVKYAKAGASEKEIRALAD 
DAREQLIRYSKDECVAEAREKGGLKLATIVWRSWELVLMEEV >gi|222159322|gb|ACAB01000037.1| GENE 177 220504 - 222696 1445 730 aa, chain - ## HITS:1 COG:no KEGG:BF3670 NR:ns ## KEGG: BF3670 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 716 1 704 719 587 44.0 1e-166 MENISITTYRGLSLVSGSISIRQMFEFIRGDVYRDRIRRLREAMDAGETVKADHMKKQLP YCTITATYAKERLAYSLDTYQDIITLDCDDMPAEKIPEFRQLVNDCPDTLGSFVSPRMHG LKIFVYLTGNEAEALRTELNALGTIDFLTLERYHHRIYALASSQYEKLLNTKVDTSGSDP GRGFFVSHDPDAFLSTERLENVKPLTVKVTLPTEEECKNKKRKNPGKRSPLLPVQENASP IDLQVQLDFRKALEYTKRKERLEIGNRDNFFYCLGNQCYHRHITEEEAVSLAHSHFGDLP DFDLELPLHNAYQYTSKTDQAEEESQQPRICQVIKFMDEHYEIRRNVVKEQIEFRKIIPD LPKTEQPPFSTLRTKDVNTFYINAQMKKIYSSQANLKALVDSDYAKPFNPFIHYFTSLQT WDGKTDHIGQLTKTVKAADQAFFEDSFRRWLVGMVACAIDDEAQNHQLMLLHGAQGKGKS TFVRHLLPPELKDYYRNGMISPDNKDHLLQMSSCLLINLDEFDTLSPARMQELKSLITQD VMNERKVYDIQNYTFIRRASFIASTNNPHCLPDIGENRRILFNTLLEIDYHTPVNHQGIY AQAYALYRQGFQYWYENQEITFLNNRNEAFRQKDPVEENLFFYFRAARPNDIQAKWYPAS QLLSILSMNGRTQANAQMKQMLVTVLENNHFHSRKTSNNITEYWVVEYSAEERKENSIRP QLPVQTGLEL >gi|222159322|gb|ACAB01000037.1| GENE 178 222824 - 223081 78 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|298480880|ref|ZP_06999075.1| ## NR: gi|298480880|ref|ZP_06999075.1| hypothetical protein HMPREF0106_01318 [Bacteroides sp. D22] # 1 85 1 85 85 156 98.0 4e-37 MLGTVNNLYTNRIEVSPGDRKKFLLPFVHLCYDHDSVFNIKSPKYFAELLFMAVDFKKSD DSDESLSVEEIDCEIQRMKRTLLED >gi|222159322|gb|ACAB01000037.1| GENE 179 222980 - 223204 232 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716128|ref|ZP_04546609.1| ## NR: gi|237716128|ref|ZP_04546609.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 74 1 74 74 129 100.0 5e-29 MVITEMDKGEEKLLAVAWTYLYSVGIQIVDSAKHLRDKLGYLGKFSLFECYSLFVLVNEA FGKNVFFLMILSVI >gi|222159322|gb|ACAB01000037.1| GENE 180 223469 - 223705 122 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295084563|emb|CBK66086.1| ## NR: gi|295084563|emb|CBK66086.1| hypothetical protein [Bacteroides xylanisolvens XB1A] # 1 78 1 71 71 107 85.0 3e-22 MYYTASHHNIDRARLQGWHYKTDIGQNTPITAFVFIIANLDFIIANLDFIIANSDFIIAN LDFIIANFIFVIANQHLP >gi|222159322|gb|ACAB01000037.1| GENE 181 223712 - 223957 217 81 aa, chain - ## HITS:1 COG:no KEGG:BT_0708 NR:ns ## KEGG: BT_0708 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 81 1 81 81 122 82.0 3e-27 MKAEKEEIAEEPFVIRPYLKSELAHLYNPYVPLVYAMRKMREWIRNNKELYDAMYSGGEG KNDHTYSARQVRLIVRYLDEP >gi|222159322|gb|ACAB01000037.1| GENE 182 224253 - 224753 602 166 aa, chain + ## HITS:1 COG:no KEGG:BT_0707 NR:ns ## KEGG: BT_0707 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 143 1 136 162 214 83.0 9e-55 MAIPFKRMGRKDPRKVDGVVKYHPQLVTQGQSVDLDKLAYTMKEKSSLSLGDIQSVLTNL VEAMRTALFDGKSVNIHDFGVFSLSATTRGVDTKDECTMKNIKTVNINFRPSSSVRPNLT STRAGEKIEFLDLDAPKKKKTDGEDPGDEGGGDSGGGSGEAPDPAA >gi|222159322|gb|ACAB01000037.1| GENE 183 224804 - 225247 360 147 aa, chain + ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 47 143 2 98 116 104 51.0 6e-23 MMRKIDLIVIHCSATRADRSLTPDDLEMQHRRRGFNGTGYHYYIRKDGTVHLTRPIERIG AHVKGFNSNSVGICYEGGLDAHGCPADTRTPEQRAALRLLVHQLLETFPGSRVCGHRDLS PDRNGNGEIEPEEWIKACPCFEVKAEF >gi|222159322|gb|ACAB01000037.1| GENE 184 225355 - 225513 154 52 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVIEISQLLIEYGKEVCKVMAVKKSLWDVILKVVIAVASAVAGVLGGNAMNL >gi|222159322|gb|ACAB01000037.1| GENE 185 225611 - 228634 1417 1007 aa, chain + ## HITS:1 COG:no KEGG:Slin_2121 NR:ns ## KEGG: 
Slin_2121 # Name: not_defined # Def: YD repeat protein # Organism: S.linguale # Pathway: not_defined # 22 643 30 703 1837 114 24.0 3e-23 MRKKNFIEQMLCLLLLLGTGTHIYAQNSSGFIPTDFQGYYFSPQNLGFSTPQTAEFVQYG NTRVNYYNGLLDLDIPLFDYKDTAFELNMSIKYISDGFKPGRRPSVVGNNWILNVGGAIT RNVVGNPDDVRQEQKSGLLAAIRDGKFKQYSKEDLLKLKIFNATEDRLYPDTEYDMAPDI FDFNFGPHKGRFIIDNSGNAKCISGGGYRIDLSEMSVQDYSTTNAPKRSVIKITTPDGYL YYFGGDVSCLEYSLPNNPGRLRSRPVQITSWYLSSIQDETKNNGISFSYQSCLQKNKYHL FMNSNVTGTRWINYKPDKNGYTKPAEIIGINDKDTDHFLMEDKVYTPILRTISTGSVLVN FITETFPVNFFGDSDGNDLIYLSSITMTKAPQVIKSCKFDYETSGRYFFLKNVTLHDQSE GPAIYSFDYNMSNDLPDPLTTNVDHWGFWNGGYEKIDNANTFFYDGNFEQRKAVNTGVSS CTMLNTITYPTKGEEKIDYEYNRYRHYLTKRTDSFAWDTNMATYDAPLGGVRVKHLTLHD PVTKKDRQRSFNYLDPTTGMESGRTHELPRYRMPIEDMVYSYNYYDYMETKNLNVYSISS NCMGRFNNISEYPIGYSYVTETFDDGSFCRYHFSSLADIPDNAEFGQAVTRTPDNLNRRS FGFYQVLDKALNYAPNDLAAFRGKLLFKTTYNNRYHKVAEEEYHYNVENKTADYEVSIDT GTGAILASKIFTVPCLLIQEKLTDENGVSILHNYEYNANGFVTQKETVNSNGDHVYLKYV HPGESSNLPSTIYYDNLIRLNRIEEPVAIIKYLKKSQDEKRKIIDFIYLYYEPTSSAGIQ KRTLRGSSLLKRLPENMDLTSYLTDSLTSFIEAYDNYDKYGNLITLRNFHNDITIYLWSY YGKYPIAEIKGSTYNEVKSALKRKPESLSEESEPNIKNIEQLRRLLPHAQITIYDYKQQI GMKYTSQPDGKTSFFQYDLQGRLKKRFRRGEKGDIQLMEYNRYHYSK >gi|222159322|gb|ACAB01000037.1| GENE 186 228644 - 231899 1334 1085 aa, chain + ## HITS:1 COG:no KEGG:BT_2927 NR:ns ## KEGG: BT_2927 # Name: not_defined # Def: putative cell wall-associated protein precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 436 1084 55 684 1074 388 38.0 1e-106 MKRHLIFFLFLFCALMAVDGRNNPQLVDKRMHFADIYKEPEIAHFSDVPRYIFYEGELHN YVIYTFTIQDRSTLFMANMMGTDTDYTSFRLNYRPLNSTGRFRKKSYSSDSHKVKEVLSN QDKFNFTFDEDLTQVHLHSPVLYEVLEPGEYELYSVASEDTKKPTKRLETNIYLSPVGVN EYTSVDIGHFTRNFSQEIRLGTFSGGDVYCKFSLDSINMISVVPLDTIGVNPILKMETLY RSLVSSNEYPLNENSLPQIKDFRCGVGDFYIHAYYPDSAYRKPKLRVTGIVDNAGIDWEH PLEVGSFSDSLDYSTFYYPYMFVPSGSSSRPCVYHRFTIKDTMDISIISEEYLNNITLYD STRHLVRKEEKDDYLTSLDVFGLSPGTYFFVTEGGNGTIVIKVKGKKSVQQLSDIKNYVS TEKSIINTPFASDLQGPLSATREIDYLDPFGRTEQTIQYGITPALNSLVNRKEYDRMYRD 
SCSWLPAVCPGSGDYVSSSDFRKHILELYNDEYACEKSLYDGSPLNRLTEKYNPGKDWHT TGHSEKVSYQTNSGTSQIRLFEVNGEKDNPVLSQKGFYATGELVVEEKKDEAGLPTLTYT NKLEQTIVSRTISGADTLDTYQVYDDFGNLAFVLPPMAVSSLQTLGQKDALDLYAYQYRY DELNHYRGKKLPGAEWIDMYFDKDGKLLRSRDGEMRKRNEWKCTFYDKLRREVVTGIYKG MMSYSSSASVEFAPDKPNNHYGYLFSSYLKMDSLNIQKVAYYDTYQYKKANSCFTAAMDY IDSNDYGHRFGDDSEQLNCKNLQTGVMTRIIGTDQMLCTSTYYDYFQRPVQVRSTDINGK VHVQNMAYDFCDHITASNDKVENISLVQTKTYDHAGRLLTEARLVNDLVSDTLRYNYDEL GHIANVKRVNGNHSLTSTNRYNLRGWLTSIESPLFSQKLYYTDGIGIPCYNGNISSMTWK TSANPDIRGYRFEYDLLSRLKNATYGEGETLSLNVNRFNEQITGYDKNGNILGLKRSGQT SANGYGLVDNLSVTLSGNQLKRVDDSVSGSAFGDNFNFKDGVKQNTEYFYDANGNLSKDL NKKII Prediction of potential genes in microbial genomes Time: Wed May 18 02:04:03 2011 Seq name: gi|222159321|gb|ACAB01000038.1| Bacteroides sp. D1 cont1.38, whole genome shotgun sequence Length of sequence - 2587 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 311 199 ## gi|237716135|ref|ZP_04546616.1| conserved hypothetical protein 2 1 Op 2 . + CDS 321 - 749 206 ## gi|237716136|ref|ZP_04546617.1| conserved hypothetical protein + Prom 776 - 835 2.9 3 2 Tu 1 . + CDS 872 - 1333 -19 ## gi|237716137|ref|ZP_04546618.1| conserved hypothetical protein + Prom 1348 - 1407 4.8 4 3 Op 1 . + CDS 1454 - 1708 115 ## gi|294647793|ref|ZP_06725346.1| hypothetical protein CW1_4860 5 3 Op 2 . + CDS 1777 - 2388 232 ## gi|237716138|ref|ZP_04546619.1| predicted protein + Term 2415 - 2474 10.1 Predicted protein(s) >gi|222159321|gb|ACAB01000038.1| GENE 1 3 - 311 199 102 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716135|ref|ZP_04546616.1| ## NR: gi|237716135|ref|ZP_04546616.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 60 102 1 43 43 87 100.0 4e-16 KSKEELDRNQANVAKSIDTNVSGMMPNGDPAPKIDPNDKRRIIKVGRKVIIGLAVDRACM ELTNPDPTQDAYETHTNQVEGKKEYSTDYVGNIYNWIKELFK >gi|222159321|gb|ACAB01000038.1| GENE 2 321 - 749 206 142 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716136|ref|ZP_04546617.1| ## NR: gi|237716136|ref|ZP_04546617.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 142 1 142 142 218 100.0 1e-55 MGDIFILIFKKITILNSILMIVHCLFHPDFGEFDYMYMVVILLLISWIAWLIGKVLNLLI SNKERKEKNTKINNIVDFLYLKFRQDIDMTYAVFFAIYLVFYRHEKAIAYLILLLFGLYL GKKIAIRANRYIVDQANKKNTP >gi|222159321|gb|ACAB01000038.1| GENE 3 872 - 1333 -19 153 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716137|ref|ZP_04546618.1| ## NR: gi|237716137|ref|ZP_04546618.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 153 1 153 153 224 100.0 1e-57 MRYLFIAIYKKSLLWFSFMVIPYCIFSSFLFEYPTLIFIFRVILAIALAIIVIGGMMSYL FPYIEIPKRSLKGKGLIRALYLKSDRIITQAYCWFLVVYITFSIEYKSTLYNYLLLFLIG LFLGYKITIRSNKYSLDEASRRKQASKKESHTR >gi|222159321|gb|ACAB01000038.1| GENE 4 1454 - 1708 115 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294647793|ref|ZP_06725346.1| ## NR: gi|294647793|ref|ZP_06725346.1| hypothetical protein CW1_4860 [Bacteroides ovatus SD CC 2a] # 1 84 1 84 84 118 100.0 1e-25 MTSFGRVGKYLMYIQKMLYILCLIKILFSLFFYEYESSFMKNITFTLPLLLAQIVIPIIK GYKMKNSKLRYYLCYYSHLCSVLP >gi|222159321|gb|ACAB01000038.1| GENE 5 1777 - 2388 232 203 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716138|ref|ZP_04546619.1| ## NR: gi|237716138|ref|ZP_04546619.1| predicted protein [Bacteroides sp. D1] # 1 203 1 203 203 400 100.0 1e-110 MKTILILFVSFFISFNNCLSQTVPDERDKREYKEIISLYPKSLVSHFPRKIDDKKIGLMA LTFPRGKYLSYIHLAISYEDSDIEKLKKKVTLEAKEVYHIKDSCLMVIPYNYDTFEIVTL DSIQNCESIDVLPIPNFRLWESKFPSDFYDNAVLYVLNAEKGRFLKKDHLSRSGIGLPER WLHGYTKGLTFYKNYVVYWLEVW Prediction of potential genes in microbial genomes Time: Wed May 18 02:04:44 2011 Seq name: gi|222159320|gb|ACAB01000039.1| Bacteroides sp. 
D1 cont1.39, whole genome shotgun sequence Length of sequence - 28425 bp Number of predicted genes - 21, with homology - 21 Number of transcription units - 13, operones - 6 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 305 226 ## gi|237716139|ref|ZP_04546620.1| predicted protein 2 1 Op 2 . + CDS 317 - 778 231 ## gi|237721174|ref|ZP_04551655.1| conserved hypothetical protein + Term 789 - 844 14.2 - Term 773 - 832 2.4 3 2 Tu 1 . - CDS 837 - 1934 1120 ## BT_1240 hypothetical protein - Prom 1954 - 2013 4.6 + Prom 1872 - 1931 7.8 4 3 Tu 1 . + CDS 1990 - 2886 884 ## COG1266 Predicted metal-dependent membrane protease 5 4 Tu 1 . - CDS 2858 - 4990 189 ## PROTEIN SUPPORTED gi|227384144|ref|ZP_03867559.1| SSU ribosomal protein S1P - Prom 5016 - 5075 3.0 + Prom 4975 - 5034 6.5 6 5 Op 1 . + CDS 5109 - 7043 1483 ## COG0642 Signal transduction histidine kinase 7 5 Op 2 . + CDS 7040 - 8782 1639 ## COG0737 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases 8 5 Op 3 . + CDS 8878 - 9780 637 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 9845 - 9876 1.1 - Term 9636 - 9682 1.1 9 6 Tu 1 . - CDS 9783 - 10367 476 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Prom 10388 - 10447 6.0 + Prom 10381 - 10440 2.5 10 7 Tu 1 . + CDS 10476 - 11204 430 ## BT_1233 hypothetical protein + Term 11270 - 11312 0.4 - TRNA 11404 - 11478 57.4 # Glu CTC 0 0 - TRNA 11543 - 11632 55.5 # Ser GCT 0 0 - Term 11961 - 12000 -0.5 11 8 Op 1 7/0.000 - CDS 12063 - 13496 818 ## COG2425 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain 12 8 Op 2 . - CDS 13465 - 15183 1214 ## COG0714 MoxR-like ATPases - Prom 15212 - 15271 7.8 + Prom 15171 - 15230 7.4 13 9 Tu 1 . + CDS 15253 - 15699 527 ## BF1816 hypothetical protein + Prom 15876 - 15935 6.4 14 10 Op 1 . + CDS 16071 - 17735 1568 ## COG2985 Predicted permease 15 10 Op 2 . 
+ CDS 17801 - 19795 1866 ## COG3855 Uncharacterized protein conserved in bacteria 16 11 Op 1 . - CDS 19990 - 21138 909 ## BT_1227 hypothetical protein 17 11 Op 2 . - CDS 21193 - 22851 1581 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) - Prom 22873 - 22932 2.6 - Term 23203 - 23260 9.2 18 12 Op 1 14/0.000 - CDS 23275 - 24345 1059 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Prom 24365 - 24424 3.7 - Term 24361 - 24424 13.7 19 12 Op 2 . - CDS 24447 - 25517 1220 ## COG1089 GDP-D-mannose dehydratase - Prom 25547 - 25606 3.6 20 12 Op 3 . - CDS 25611 - 26780 1057 ## COG1301 Na+/H+-dicarboxylate symporters - Prom 26802 - 26861 6.9 + Prom 26770 - 26829 4.8 21 13 Tu 1 . + CDS 26890 - 28365 1603 ## COG0362 6-phosphogluconate dehydrogenase Predicted protein(s) >gi|222159320|gb|ACAB01000039.1| GENE 1 3 - 305 226 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716139|ref|ZP_04546620.1| ## NR: gi|237716139|ref|ZP_04546620.1| predicted protein [Bacteroides sp. D1] # 24 100 1 77 77 145 100.0 1e-33 KSKEELDRNQANVAKSIDTNVSGMMPNGDPAPKRNPKDGGKKTMVGMILGAIGTLSKETL DVTNPDPSQDSYEVHTKKVEKKEIYGPTIWEEYFDWKINF >gi|222159320|gb|ACAB01000039.1| GENE 2 317 - 778 231 153 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237721174|ref|ZP_04551655.1| ## NR: gi|237721174|ref|ZP_04551655.1| conserved hypothetical protein [Bacteroides sp. 
2_2_4] # 1 153 11 163 163 225 94.0 8e-58 MRYLLIAIYEKTALLGSIILIAYRIFLPYIYKIPYLSSSVQIIFLIFWAITLIGGMMSYL FPYIRTPKRPLKAEGIIRTFYLKSDKIIMQVYCYVFGAYAIFYLDRVSVLFDYLMLLLLG MFLGYKIAVRANKYSLDETSKKKQVSKKESHTR >gi|222159320|gb|ACAB01000039.1| GENE 3 837 - 1934 1120 365 aa, chain - ## HITS:1 COG:no KEGG:BT_1240 NR:ns ## KEGG: BT_1240 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 365 1 349 349 641 89.0 0 MYSLFFLYLSEQIYKIMKIKLLLISFLLAANALGAAAQVSKTYYVSKPGTLISMMTEEEA NSVTHLTLTGKLNAEDFRHLRDEFDNLKVLDISNAEIKMYSGKAGTYPNGKFYIYMPNFI PAYAFSNVVDGVTKGKATLEKVILSEKTKNIEDAAFKGCENLKICQIRKKTAPNLLPEAL ADSVTAIFVPLGSSDSYRYKDRWQNFAFIEGEPVETTLQVGAMGKLEEEILKAGLQPRDI NFLTVEGKLDNADFKLIRDYMPNLVSVDISRTNATAIPDFTFAQKKYLLNMKLPHNLKSI GQRVFSNCGRLCGTLELPASVTAIEFGAFMGCDNLRYVLATGNKITTLGDNLFGEGVPSK LVYKK >gi|222159320|gb|ACAB01000039.1| GENE 4 1990 - 2886 884 298 aa, chain + ## HITS:1 COG:FN0640 KEGG:ns NR:ns ## COG: FN0640 COG1266 # Protein_GI_number: 19703975 # Func_class: R General function prediction only # Function: Predicted metal-dependent membrane protease # Organism: Fusobacterium nucleatum # 71 288 75 288 293 84 27.0 3e-16 MEADNIAGGKEPKRLPVWACIPLFIVILFILLGLYGTLARGCLSLVLGVEARHPGVMGYI ILEASMLLAVLTAAIPMLRFERRPFSDLGLSLKGHVKGLWYGFLMAILLYLFGFGISFVL GEIEVTSFQFKPLELLGSWVFFLLVALFEEILMRGYILGRLLHTTMNKFLALFVSAALFA FMHIFNPEIAFLPMLNLLLAGMLLGASYLYTRNLCFPISLHLFWNWIQGPILGYQVSGNN FTTSMLTLRMPEENVLNGGAFGFEGSLICTVLMIVFTILIVWWGEKREAISLAVPRSC >gi|222159320|gb|ACAB01000039.1| GENE 5 2858 - 4990 189 710 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227384144|ref|ZP_03867559.1| SSU ribosomal protein S1P [Jonesia denitrificans DSM 20603] # 633 710 203 279 488 77 47 1e-13 MELFHKMISGFLGIPERQISSTLHLLGEGATIPFISRYRKEATGGLDEVQIEQIKEQHDK LCDIAKRKETILGTITEQGKLTAELEKRINDTWNPTELEDIYLPYKPKRKTRAEVARQKG LEPLATILLLQRENNLSAKAASFVKGEVKDVEDALKGARDIIAEQVNEDERARNAVRNQF GRQAEITAKLVKGKEEEAAKYRDYFDFSEPLKRCTSHRLLAIRRAESEGLLKVSINPDDE ACIERLERQFVRGNNECSRQVGEATTDAYKRLLKPSIETEFAAQSKEKADDEAIRVFTEN 
LRQLLLAPPLGQKRVLAIDPGFRTGCKVVCLDAQGNLLHNENIYPHPPINKTGEAASKLR KMIEAYQIEAISIGNGTASRETEDFINSQSFDRQIPVFVVSEQGASIYSASKIARDEFPD YDVTVRGAVSIGRRLMDPLAELVKIDPKSIGVGQYQHDVDQTKLKKALDQTVENCVNLVG VNLNTASSHLLTYISGLGPQLAQNIVNYRAENGAFSSRKELMKVPRMGAKAFEQCAGFLR IPGAKNPLDHTAVHPESYHIVEQMAKDLKCTIDELIADKELRRKINISDYITPTVGLPTL QDILQELDKPGRDPRKAIKVFEFDKNVRTIADLREGMILPGIVGNITNFGAFVDIGIKEN GLVHLSQLVERFISDPTEVVSIHQHVMVRVMNVDYDRKRIQLSMIGVPQD >gi|222159320|gb|ACAB01000039.1| GENE 6 5109 - 7043 1483 644 aa, chain + ## HITS:1 COG:MA4377_3 KEGG:ns NR:ns ## COG: MA4377_3 COG0642 # Protein_GI_number: 20093164 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 400 637 13 255 311 157 40.0 9e-38 MLGRLGYFFICFLWLLQSSKVLAVSNDKKPILIIYSYNPAAHQTSVTISDYMDEYSKLGG QRDIIIENMNCKSFSEAPLWSKMMTQILAKYQGEKHPAQIILLGQEAWAAYLSQRDEMQV KVPVMCSLASSNVVILPEDTVESLDSWMPESVDLFDDHLNIPELKSGFINQYDIEGNIQV IQAFYPKTKHIAFISDNTYGGVTMQALVRKEMKKFPDLDLILMDGRKHSIYTIVEELRQL PENTVILVGTWRVDMNEGYFMRNATYAMMEVTPTIPTFTPSSVSLGHWAIGGVLPDYRKV GGEMAMESVRMDTHPQDTVKHLSVIGCKAVLDSRKVKEWGLDPAVLPFKVQLVNQPVSFY QQYTYQIWSACALFVILVLGLCISLFYYFRTKRLKDELLKSEKDLRVAKDRAEESNRLKS AFLANMSHEIRTPLNSIVGFSDVLAVGGSTEEEQQSYYQIIKTNSDLLLRLINDILDLSR LEANRVTLTWEECDVVQLCSQVVASVSFSRQSSENQFLFTTSFESFRMVTDVQRMQQVMI NLLSNANKFTKCGKITLDFSVNEETQMAVFSVTDTGCGIPKEKQGLVFERFEKLNEYAQG TGLGLSICKLIVHKWKGSIWIDPDYTGGARFVFSHPLNLNIEKE >gi|222159320|gb|ACAB01000039.1| GENE 7 7040 - 8782 1639 580 aa, chain + ## HITS:1 COG:CAC0353 KEGG:ns NR:ns ## COG: CAC0353 COG0737 # Protein_GI_number: 15893644 # Func_class: F Nucleotide transport and metabolism # Function: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases # Organism: Clostridium acetobutylicum # 27 568 542 1093 1193 234 30.0 5e-61 MKRIVLIYGLLLCLALSVAAQEKVVKLKIVETSDVHGNYYPYNFITRHEWKGSLARVYSF VQKEREQYKENLILLDNGDILQGQPTAYYYNYIDTVSPHLCSEMMNYMKYDAGNMGNHDV ETGRAVFDRWIATCDFPVLGANIIDTSTGQPHLAPYKVLERDGVKIVVLGMITPAIPAWL 
SENLWKGLRFDDMEETARKWMKVIREKENPDLVIGLFHAGQEAFKMSGKYNENASLNVAK NVPGFDIVLMGHDHSRECKKVMNVAGDSVLIIDPASNGIVLSNVDVTLKLKDGKVQSKDI KGVLTETEAYGISEDFMKRFAPQYETVQKFVSKKIGTFTESISTHPAFFGPSAFIDLIHT LQLDITGADISFAAPLSFDSEIKKGDVFVSDMFNLYKYENMLYVMTLSGKEIKDFLEMSY YMWTNRMKSPEDHLLWFKEKRREGAEDRASFQNYSFNFDSASGIIYTVDVTKPQGEKITI TSMADGSPFRMDKIYKVALNSYRGNGGGELLTKGSGIPQEKLKDRIIFSTDKDLRFYLMN YIEEKGTMDPKALNQWKFVPEKWTVPAAKRDYEYLFRSVR >gi|222159320|gb|ACAB01000039.1| GENE 8 8878 - 9780 637 300 aa, chain + ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 204 298 191 286 288 74 37.0 2e-13 MDIQSVPKVGISSVVHSKHIDADNIDVVDNDIALFDTESVISLYNGPTKLEVLTVGLCLE GTGTFKISLREFQLCPGLMVIALPNQIIEQRYFSHDFKGIFFAVSKNLLETLPKIGNVLS LFFYLKDYPCFDLTPHEQEVVKEYHAFIRKRLRNKEALYRREVVMGLMQGFFFELCTIFT NHAPANATTMRNKSRKEYIFERFYESLVESYQSERSVKYYADQLCLTPKHLSGVVKEVSG KTVGEWIDELVILEAKALLNSSSMNIQEIADRLNFANQSFFGKYFKHYTGMSPKEYRKSR >gi|222159320|gb|ACAB01000039.1| GENE 9 9783 - 10367 476 194 aa, chain - ## HITS:1 COG:CAC3336 KEGG:ns NR:ns ## COG: CAC3336 COG0664 # Protein_GI_number: 15896579 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 39 190 40 192 199 75 28.0 8e-14 MDTLLRETVNAVVNSRFPEMSIEGRRQIESILIREEFPKGAIALNEGEVAHELVFVGKGM LRQYYYKNGKDVTEHFSYEGCIVMCIESFLKQVPTRLIVETLEPSIIYLFPRDMIQKLAR ENWEINMFYQKILEYSLIVSQVKADSWRFESARERYNLLLETHPEIIKRAPLAHIASYLL MTPETLSRVRSGVL >gi|222159320|gb|ACAB01000039.1| GENE 10 10476 - 11204 430 242 aa, chain + ## HITS:1 COG:no KEGG:BT_1233 NR:ns ## KEGG: BT_1233 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 242 3 244 244 401 79.0 1e-111 MNKIIGLTVLLFCLSGCVRDNDAIYYPVGNVDVERGGPALEAGKGDLIARSYNTEDYVLD 
TLAQYPGDPTLGKLTFMINLKNQSADREVDGFNGVGRSELTMSLGYKDGNYPVESQVPVY TSSDVTASYAIKLRLKGELTLTGDEWMIDYMYAQLAGLFQPYPPTSFPEVFMCKGGEQPF ATFDSFRRTWTFDITYDRSNLSFSQLYFNLFVNLAGQKREERVRLRIDKESYFEIYKEKE EM >gi|222159320|gb|ACAB01000039.1| GENE 11 12063 - 13496 818 477 aa, chain - ## HITS:1 COG:VCA0762 KEGG:ns NR:ns ## COG: VCA0762 COG2425 # Protein_GI_number: 15601517 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Vibrio cholerae # 130 459 126 469 481 133 30.0 7e-31 MPMNKRTESIRLKHLQDIYYEKLQGIAYDVYDEQLHNLIIRPEELDADIHRYFRHTQPSL QDFYSHYASQWEYFHEMNEASDTKFLHFLKNSAYTFSMKYHLVDLNVKYYLQRFDAISPR SKEWKALRTLFFDKWHTLLSNNEFNYQMEHIERLCDDFYRLQLSLAKNLPVRGGSRLVWL LRNHKQIAEQILEYEETIKRNPVIRELVEILGKKHQSSRKRFKMTAGIHREQIISHATRS DITGICEGNDLNSLLPLEYCYLAEKSLQPIFFERFIEKRLQVIDYQSHEKQTINDKKTIG NEVSEEAEGPFIVCLDTSGSMAGERERIAKSTLLAIAELTEVQHRKCYVILFSDDIECIE ITDLGGSFDRLVDFLCQSFHGGTDMEPVITHALRKISEEGYMEADIITVSDFEMRPVDQL LSRTIEHAKAKQTKMYAISLGGKSAETSYLKLCDKYWEYSIQNAESLNKNRIEESNN >gi|222159320|gb|ACAB01000039.1| GENE 12 13465 - 15183 1214 572 aa, chain - ## HITS:1 COG:VCA0763 KEGG:ns NR:ns ## COG: VCA0763 COG0714 # Protein_GI_number: 15601518 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Vibrio cholerae # 38 384 1 350 552 298 44.0 1e-80 MIFPHETENSYLSGGNCLNLGVNPSGEVYFKEFLYLCLLLTTHEIIRRMKSIKSHITQLL KSLNEGVFEKEHTIALSLLSAMAGESIFLLGPPGVAKSLVARRLKLAFKDADAFEYLMSR FSTPDEIFGPVSISKLKDEDTYERITKGYLPTASIVFLDEIWKAGPAIQNSLLTVINEKI YRNGQFTVRVPLKALIAASNELPAKGEGLEALYDRFLIRQFVGCIEQEYAFDQMISSTRE IEPEIPEKLQVNDELYNQIQAESEKVGIHYTIFELIHNIKREIEQYNTGRDENTPPIYVS DRRWKKIVGLLRTSAYLNESPGIHFSDCLLMSACLWDEISQLPIIEEIVEQSIARGINTY LLGEKRLEQKLDTLKENMKSEHSLRELSDPGIQVVDTFYHRIEGYHIAGNLLIFASDYQS LKKDSNRLFYIQQDKFRPVNKILKAYDFVKNRNIAQKNIYSLRKGRRSVFVNNQEYPLLC YDNCNPLPAQQDGNTPFEFTLQEVIDLLHQMEVEYKTISERETAYTKEHLFLSSSQKSKI KRILGETAHIIENYRNELRIIAHAHEQENREY >gi|222159320|gb|ACAB01000039.1| GENE 13 15253 - 15699 527 148 aa, 
chain + ## HITS:1 COG:no KEGG:BF1816 NR:ns ## KEGG: BF1816 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 148 1 148 148 179 63.0 2e-44 MKTSTFKYFGLALMAILMVSFTSCEVEIDSFYDDDNNGAGYYNRSADLCSRTWVSFYRDM DGNYCRQELDFFLDRTGIDYIRVEYPNGAVDQYEYNFRWSWENYAQTSIRMSYGPNDVSY LDDVYIGGNRLSGYLDGRNNFVEFQGKR >gi|222159320|gb|ACAB01000039.1| GENE 14 16071 - 17735 1568 554 aa, chain + ## HITS:1 COG:ECs4625 KEGG:ns NR:ns ## COG: ECs4625 COG2985 # Protein_GI_number: 15833879 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Escherichia coli O157:H7 # 17 552 16 557 561 332 37.0 1e-90 MEWLYSLFIEHSALQAVVVLSLISAIGLGLGRVHFWGVSLGVTFVFFAGILAGHLGLSVD PQMLNYAESFGLVIFVYSLGLQVGPGFFSSFRKGGVTLNMLALGVVLLGTLLTVVASYAT GVSLPDMVGILCGATTNTPALGAAQQTLKQMGINSSTPALGCAVAYPMGVVGVILAVLLI RKVLVRKEDLEIKEKDDANKTYIAAFQVHNPAIFNKSIKDIARMSYPKFVISRLWRDGHV SIPTSDKILKEGDRLLVVTAEKDALALTVLFGEQENTDWNKEDIDWNAIDSELISQRIVV TRPELNGKKLGALRLRNHYGINISRVYRSGVQLLATPGLILQLGDRLTVVGEAAAIQNVE KVLGNAVKSLKEPNLVVIFIGIVLGLALGAIPFSFPGVSTPVKLGLAGGPIIVGILLGTF GPRIHMITYTTRSANLMLRALGLSMYLACLGLDAGAHFFDTVFRPEGLLWIGLGAGLTII PTVLVGFVAFKMMKIDFGSVSGMLCGSMANPMALNYVNDTIPGDNPSVAYATVYPLCMFL RVIIAQVLLMFLLN >gi|222159320|gb|ACAB01000039.1| GENE 15 17801 - 19795 1866 664 aa, chain + ## HITS:1 COG:CAC1572 KEGG:ns NR:ns ## COG: CAC1572 COG3855 # Protein_GI_number: 15894850 # Func_class: G Carbohydrate transport and metabolism # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 664 1 663 665 797 60.0 0 MTAQSNITPESIVGDLRYLQLLSRSFPTIADASTEIINLEAILNLPKGTEHFLTDIHGEY EAFQHVLKNASGAVKRKVNEIFGNTLREAEKKEICTLIYYPEEKLQLVKAREKDLDDWYL ITLNQLVKVCQNVSSKYTRSKVRKSLPAEFSYIIQELLHESSIEPNKHAYINVIISTIIT TKRADDFIIAMCNLIQRLTIDSLHIVGDIYDRGPGAHIIMDTLCNYHNFDIQWGNHDILW MGAASGNDSCIANVIRMSMRYGNLGTLEDGYGINLLPLATFAMDTYADDPCTIFMPKMNF ADAHYNEKTLRLITQMHKAITIIQFKLEAEIIDRRPEFGMTNRKLLEKIDFERGIFVYEG KEYALRDTNFPTVDPADPYRLTEEERELVEKIHYSFMNSEKLKKHMRCLFTYGGMYLVSN 
SNLLYHASVPLNEDGSFKHVKIRGKEYWGRKLLDKADQLIRTAYFDEEGEEDKEFAMDYI WYMWCGPEAPLFDKDKMATFERYFVEDKELHKEKKGYYYTLRNREDVCDQILAEFGASGP HSHIINGHVPVKTIQGEQPMKANGKLFVIDGGFSKAYQPETGIAGYTLVYHSHGMQLVQH EPFQSRQKAIEEGLDIKSTNFVLEFNSQRMMVKDTDKGKELVTQIQDLKKLLVAYRTGLI KEKV >gi|222159320|gb|ACAB01000039.1| GENE 16 19990 - 21138 909 382 aa, chain - ## HITS:1 COG:no KEGG:BT_1227 NR:ns ## KEGG: BT_1227 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 382 1 382 382 572 83.0 1e-161 MKIKKTILTAAILMAAVSLPAQNKSAGINISIWKDICTQPHDSTQTTYINIGLLSTMNRL NGVGINALGSVVHGDMNGVQITGLANLAGGTMRGVQLAGISNISGDNTVGLSAAGLVNIT GDKAQGVVISGLTSIGGDNNSGLMISGFMNVTGNMASGLHFSGIANITGQSFGGLMASGL LNVVGEHMNGLQIAGIANITASKLNGVQVALCNYATKARGLQIGLVNYYKEDMKGFQLGL VNANPDTKVQMMVYGGNATPANIGVRFKNQLFYTILGVGSMYQGLNDKFSASASYRAGLS FPLYKGLSISGDLGYQHIEAFDNKDEVIPKRLYALQARANLEYQFTKKFGIFATGGYGLT RFYNKSSNYDKGAIIEAGIVLF >gi|222159320|gb|ACAB01000039.1| GENE 17 21193 - 22851 1581 552 aa, chain - ## HITS:1 COG:aq_999_1 KEGG:ns NR:ns ## COG: aq_999_1 COG1022 # Protein_GI_number: 15606303 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Aquifex aeolicus # 25 546 14 499 600 216 29.0 1e-55 MEQSFIAYIENSIKNNWDLDALTDYKGATLQYKDVARKIEKLHIIFEESGIRKGDKIAVC GRNSSHWGVTFLATLTYGAVIVPILHEFKADNVHNIVNHSEAKLLLVGDMVWENLNESAM PLLEGILMMNDFTLLVSRSERLTYAREHLNEMFGKKYPKNFRKEHVAYHKDEPEELAVIN YTSGTTSYSKGVMLPYRSLWSNTKFAFEVLELEAGDKIVSMLPMAHMYGLAFEFLYEFSV GCHIYFLTRMPSPKIIFQAFEEVKPSLIVAVPLIIEKIIKKSVLPKLETPAMKILLKVPI INDKIKATVREEMIKAFGGNFKAVIVGGAAFNQEVEQFLKMIDFPYTVGYGMTECGPIIC YEDWRKFKPGSCGKAVPRMDVRVLSSDPENIVGEIVCKGPNVMLGYYKNKEATQEVIDKD GWLHTGDLALMDEEGNVTIKGRSKNLLLSSSGQNIYPEEIEDKLNNLPYVAESIIVQQNE KLVGLVYPDFDDAFAHGLKTEDIEQVMEENRVTLNTMLPAYSQISKMKIYPEEFEKTPKK SIKRYLYQEAKG >gi|222159320|gb|ACAB01000039.1| GENE 18 23275 - 24345 1059 356 aa, chain - ## HITS:1 COG:Cj1428c KEGG:ns NR:ns ## COG: Cj1428c COG0451 # Protein_GI_number: 15792746 # Func_class: M Cell 
wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Campylobacter jejuni # 1 354 1 342 346 352 48.0 6e-97 MEKNAKIYIAGHRGLVGSAIWKNLQDKGYTNLIGRTHKELDLLDGMAVRKFFDEEQPEYV FLAAAFVGGIMANSIYRADFIYKNLQIQQNIIGESFRHNVKKLLFLGSTCIYPRDAEQPM KEDVLLTSPLEYTNEPYAIAKIAGLKMCESFNLQYGTNYIAVMPTNLYGPNDNFDLERSH VLPAMIRKIHLAHCLKEGNWEAVRKDMNQRPVEGVNGDSSKEDILAILKKYGISETEVTL WGTGTPLREFLWSEEMADASVFVMEHVDFKDTYKEGSKDIRNCHINIGTGKEITIRQLAE RIVETVGYQGKLTFDSSKPDGTMRKLTDPSKLHALGWHHKIEIEEGVQRMYEWYLK >gi|222159320|gb|ACAB01000039.1| GENE 19 24447 - 25517 1220 356 aa, chain - ## HITS:1 COG:BMEI1413 KEGG:ns NR:ns ## COG: BMEI1413 COG1089 # Protein_GI_number: 17987696 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: GDP-D-mannose dehydratase # Organism: Brucella melitensis # 1 350 1 346 362 481 67.0 1e-135 MKKALISGITGQDGSYLAEFLLQKGYEVHGILRRSSSFNTGRIEHLYFDEWVRDMKQKRT INLHYGDMTDSSSLIRIIQQVQPDEIYNLAAQSHVKVSFDVPEYTAEADAIGTLRMLEAV RILGLEKKTRIYQASTSELFGKVQEVPQKETTPFYPRSPYGVAKQYGFWITKNYRESYGM FAVNGILFNHESERRGETFVTRKISLAAARIAQGEQDKLYLGNLDARRDWGYAKDYIECM WLILQHDVPEDFVIATGEMHTVREFATLAFKEAGIELRWEGEGVNEKGIDVATGKSLVEV DPKYFRPSEVEQLLGDPTKAKTLLGWDPCKTSFEELVSIMVRHDMEKVKRMIATKH >gi|222159320|gb|ACAB01000039.1| GENE 20 25611 - 26780 1057 389 aa, chain - ## HITS:1 COG:Cgl2969 KEGG:ns NR:ns ## COG: Cgl2969 COG1301 # Protein_GI_number: 19554219 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Corynebacterium glutamicum # 8 381 5 379 412 326 52.0 5e-89 MKKIKIGLLARIVIAIILGIAIGTFFPAPLVRIFVTFNGIFSEFLNFSIPLIIVGLVTVA IADIGKGAGKMLLVTALIAYFATLFSGFLSYFTGVTVFPSLIEPGAPLEEVSEAQGILPY FSVSIPPLMNVMTALVLAFTLGLGLASLNSDALKNVARDFQEIIVRMISAVILPLLPLYI FGIFLNMTHSGQVYSILMVFIKIIGVIFALHIFLLVFQYSIAALFVHKNPFKLLNKMLPA YFTALGTQSSAATIPVTLEQTKKNGVSAEVAGFVIPLCATIHLSGSTLKIVACALALMMM QGIPFDFPLFAGFIFMLGITMIAAPGVPGGAIMASLGILQSMLGFDESAQALMIALYIAM DSFGTACNVTGDGAIALIIDKIMGKNRAE 
>gi|222159320|gb|ACAB01000039.1| GENE 21 26890 - 28365 1603 491 aa, chain + ## HITS:1 COG:TP0331 KEGG:ns NR:ns ## COG: TP0331 COG0362 # Protein_GI_number: 15639322 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconate dehydrogenase # Organism: Treponema pallidum # 8 491 4 488 488 575 57.0 1e-164 MANQNKTDIGLIGLAVMGENLALNMESKGWHVSVYNRTVPGVEEGVVDRFMNGRAKGKNI EGFTDIKAFVDSIAIPRKIMMMVRAGSPVDELMDQLFPLLSPGDILIDGGNSNYEDTNRR VKLAESKGFLFVGSGVSGGEEGALNGASIMPGGSEQAWPEVKPILQSIAAKAPDGTPCCQ WVGPAGSGHFVKMIHNGIEYGDMQLIAEAYWVMKKLLDLTNEEMADVFARWNEGKLRSYL VEITANILRHKDKSGGYLIDKILDAAGQKGTGKWSVINAMELGMPLGLIATAVFERSLSA QKDLRHLASRQYQCQHTQPIYNKVELVKNIFSALYASKLVSYAQGFAVLQRASDAFDWHL DLASIARMWRGGCIIRSIFLNDIAAAFEATDKPKHLLLAPYFKEEMKTLLPGWKNLVAEA MKEELPVPAFSSALNYFYSLTSADLPANLVQAQRDYFGAHTFERKDELRGQFFHENWTGH GGDTKSGTYNV Prediction of potential genes in microbial genomes Time: Wed May 18 02:05:19 2011 Seq name: gi|222159319|gb|ACAB01000040.1| Bacteroides sp. D1 cont1.40, whole genome shotgun sequence Length of sequence - 12398 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 8, operones - 6 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 15/0.000 + CDS 38 - 1534 1347 ## COG0364 Glucose-6-phosphate 1-dehydrogenase 2 1 Op 2 . + CDS 1531 - 2262 190 ## PROTEIN SUPPORTED gi|163781723|ref|ZP_02176723.1| 50S ribosomal protein L13 3 1 Op 3 . + CDS 2303 - 3253 971 ## COG2837 Predicted iron-dependent peroxidase + Term 3279 - 3324 6.5 4 2 Tu 1 . - CDS 3289 - 3501 151 ## BT_1232 hypothetical protein - Prom 3641 - 3700 4.9 + Prom 3687 - 3746 9.2 5 3 Tu 1 . + CDS 3816 - 4154 250 ## gi|160885865|ref|ZP_02066868.1| hypothetical protein BACOVA_03869 6 4 Op 1 . - CDS 4151 - 4618 361 ## gi|237716165|ref|ZP_04546646.1| conserved hypothetical protein 7 4 Op 2 . - CDS 4638 - 5024 173 ## gi|237716166|ref|ZP_04546647.1| predicted protein 8 4 Op 3 . 
- CDS 5037 - 5417 263 ## gi|237721145|ref|ZP_04551626.1| predicted protein - Prom 5483 - 5542 7.3 + Prom 5448 - 5507 5.5 9 5 Op 1 . + CDS 5571 - 5864 264 ## gi|237716168|ref|ZP_04546649.1| conserved hypothetical protein 10 5 Op 2 . + CDS 5880 - 6116 207 ## gi|160885871|ref|ZP_02066874.1| hypothetical protein BACOVA_03875 + Prom 6248 - 6307 3.4 11 6 Op 1 . + CDS 6551 - 6757 241 ## gi|160885873|ref|ZP_02066876.1| hypothetical protein BACOVA_03878 12 6 Op 2 . + CDS 6775 - 7032 82 ## gi|260170657|ref|ZP_05757069.1| hypothetical protein BacD2_02235 13 6 Op 3 . + CDS 7029 - 7220 170 ## gi|160885875|ref|ZP_02066878.1| hypothetical protein BACOVA_03880 + TRNA 7283 - 7356 59.6 # Undet ??? 0 0 + TRNA 7957 - 8026 27.8 # Pseudo ??? 0 0 + Prom 8305 - 8364 4.6 14 7 Op 1 . + CDS 8444 - 9226 425 ## COG4712 Uncharacterized protein conserved in bacteria 15 7 Op 2 . + CDS 9232 - 10146 516 ## BVU_2860 hypothetical protein 16 7 Op 3 . + CDS 10143 - 10592 233 ## COG0629 Single-stranded DNA-binding protein 17 8 Op 1 . + CDS 10743 - 11051 226 ## BVU_2858 hypothetical protein 18 8 Op 2 . + CDS 11070 - 11738 761 ## BVU_2857 hypothetical protein 19 8 Op 3 . + CDS 11671 - 12006 98 ## BVU_2856 hypothetical protein 20 8 Op 4 . 
+ CDS 12003 - 12396 257 ## COG1061 DNA or RNA helicases of superfamily II Predicted protein(s) >gi|222159319|gb|ACAB01000040.1| GENE 1 38 - 1534 1347 498 aa, chain + ## HITS:1 COG:VCA0896 KEGG:ns NR:ns ## COG: VCA0896 COG0364 # Protein_GI_number: 15601650 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate 1-dehydrogenase # Organism: Vibrio cholerae # 5 498 9 501 501 560 54.0 1e-159 MDKFAMIIFGASGDLTKRKLMPALYSLYREKRLTGEYSILGIGRTVYSDDNYRSYILEEL QQFVKSEEQDTALMASFVSHLYYLPMDPAKEEGYPQLRQRLVDLTGEVEPDNLLFYLATP PSLYGVVPLYLKAAGLNTPHSRIIVEKPFGYDLESARELNKTYASVFNEHQIYRIDHFLG KETAQNVLAFRFANGIFEPLWNRNYIDYVEITAVENLGIEQRGGFYETAGALRDMVQNHL IQLVALTAMEPPAVFNADNFRNEVVKVYESLTPLNEVDLNEHIVRGQYTASGNKKGYREE KGVAPDSRTDTYIAMKLGISNWRWSGVPFYIRTGKQMPTKVTEIVVHFRETPHQMFHCAG GNCPRANKLILRLQPNEGIVLKIGMKVPGAGFEVRQVTMDFSYAQLGGVPSGDAYARLID DCIQGDPTLFTRSDAVEASWKFFDPVLRYWKDNPDAPLYGYPAGTWGPLESEAMMHEHGA DWTNPCKNLTNTDQYCEL >gi|222159319|gb|ACAB01000040.1| GENE 2 1531 - 2262 190 243 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163781723|ref|ZP_02176723.1| 50S ribosomal protein L13 [Hydrogenivirga sp. 
128-5-R1-1] # 31 233 34 217 228 77 31 4e-14 MKLSVFPSSIETSRALILRLVEIMNEEPDRVFNIAVSGGNTPALMFDLWANEYMEITPWD RMRIYWVDERCVPPDDSDSNYGMMRNLLLGLTPILYENVFRIRGEAKPAKEAVRYSELVR QQVPLKRGWPEFDIVLLGAGDDGHTSSIFPGQEDLLTSNSIYVVSAHPRNGQKRIAMTGY PIQNARYVIFLITGKNKVDVVEEICNSGDTGPAAYIAHHAQNVELFVDKAAAAYIDDSNK KMN >gi|222159319|gb|ACAB01000040.1| GENE 3 2303 - 3253 971 316 aa, chain + ## HITS:1 COG:MT0820 KEGG:ns NR:ns ## COG: MT0820 COG2837 # Protein_GI_number: 15840211 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted iron-dependent peroxidase # Organism: Mycobacterium tuberculosis CDC1551 # 13 311 8 306 335 294 49.0 2e-79 MNPYQNSFGGNIPQDVAGKQGENVIFIVYTLKDTPETLDKVKDVCANFSGMIRSMRNRFP ELMFSCTMGFGADAWGRLFPEKGKPKELNTFEEIKGGKHTAVSTPGDILFHIRAKQMGLC FEFASIIDEKLQGVVEPVDETHGFRYMDGKAIIGFVDGTENPAVDENPYHFAVVGEEDAD FAGGSYVFVQKYIHDMVAWNALPVEEQEKVIGRRKFNDVELSDEEKPQNAHNAVTNIGDD LKIVRANMPFANTSKGEYGTYFIGYASTFSTTRQMLESMFIGNPVGNTDRLLDFSTAVTG TLFFAPSYDLLGELGE >gi|222159319|gb|ACAB01000040.1| GENE 4 3289 - 3501 151 70 aa, chain - ## HITS:1 COG:no KEGG:BT_1232 NR:ns ## KEGG: BT_1232 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 67 1 67 68 73 55.0 3e-12 MEYSVEELKNALIERCEKEGILYATVAMDRRTKEMILPDTLEGALKHPEYFVCTCRRVKD QYIVEEITKV >gi|222159319|gb|ACAB01000040.1| GENE 5 3816 - 4154 250 112 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160885865|ref|ZP_02066868.1| ## NR: gi|160885865|ref|ZP_02066868.1| hypothetical protein BACOVA_03869 [Bacteroides ovatus ATCC 8483] # 1 112 1 112 112 185 100.0 6e-46 MKSLLKNVLRRISKKQSSKEDNATAFYPQCCAKVDDSARMRIKMSYDQNVKETISSLKTL ANDMSSGFVTFKKFQTRRYQYNPDADATLYASRLLRAASILEFLLTDPDNKS >gi|222159319|gb|ACAB01000040.1| GENE 6 4151 - 4618 361 155 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716165|ref|ZP_04546646.1| ## NR: gi|237716165|ref|ZP_04546646.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 155 1 155 155 281 100.0 1e-74 MRPIRTVPPKDEREYPLVITAEEKDKVLNYILVVANGKRTAKLNYKDIPDLRISKEQYEI VLEEFKNRRFIDYKGYGIEYLTLNFEIFNFAEKGGFTVERDLYILSFDTFQMQLERLEKE LSPDTAAKVDDVVGKAKNITELLIGLSALAEKMNL >gi|222159319|gb|ACAB01000040.1| GENE 7 4638 - 5024 173 128 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716166|ref|ZP_04546647.1| ## NR: gi|237716166|ref|ZP_04546647.1| predicted protein [Bacteroides sp. D1] # 1 128 1 128 128 236 100.0 3e-61 MKKRFLILSFLFVLIFNSCSDDSINLAGTTWTSAKDWYGKTRLSFEEGTPYLRSFFAISF DLKSFTIYNVADDNEDLEYEWKETVSGKYSINDNIVNLIVEKDNLTIPCEIEKDIMYYSN TRMKLYKQ >gi|222159319|gb|ACAB01000040.1| GENE 8 5037 - 5417 263 126 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237721145|ref|ZP_04551626.1| ## NR: gi|237721145|ref|ZP_04551626.1| predicted protein [Bacteroides sp. 2_2_4] # 1 126 1 126 126 207 98.0 1e-52 MINRIKEVITYSGLSERGFAIKCGLKPTTINNQLIGKREISLATIIAISSSFEEISAEWL LRGKGSMLLQKEETEPGMDKLKSIVYTIANLQDEINEKTVLTQRLLEENQKLKGELAMLK NERNIG >gi|222159319|gb|ACAB01000040.1| GENE 9 5571 - 5864 264 97 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716168|ref|ZP_04546649.1| ## NR: gi|237716168|ref|ZP_04546649.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 97 1 97 97 155 100.0 5e-37 MSYNLSQIMKSAHRNYKKGGKTFSECLKSAWSFAKLQESFSPEAVKSRTDKFLAERHEAM SKTAKATPSKEYNNLNIPASAYYNPNSTHYGAHYVGD >gi|222159319|gb|ACAB01000040.1| GENE 10 5880 - 6116 207 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160885871|ref|ZP_02066874.1| ## NR: gi|160885871|ref|ZP_02066874.1| hypothetical protein BACOVA_03875 [Bacteroides ovatus ATCC 8483] # 1 78 1 78 78 105 100.0 1e-21 MDKRTELEIQRDKYEAVIEERDALISSLRGENEKLKRDLESERGFYREKVSQCDDLKKFI ESQRNLMDIVLKNNQSIL >gi|222159319|gb|ACAB01000040.1| GENE 11 6551 - 6757 241 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160885873|ref|ZP_02066876.1| ## NR: gi|160885873|ref|ZP_02066876.1| hypothetical protein BACOVA_03878 [Bacteroides ovatus ATCC 8483] # 1 68 9 76 76 122 100.0 9e-27 MEKDIQRRNVIDVLRSMDVGAIEVFPIVQKPSVTNTLNARLYKEKAEGMAWKTKSDVKNM QFIVTRIA >gi|222159319|gb|ACAB01000040.1| GENE 12 6775 - 7032 82 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170657|ref|ZP_05757069.1| ## NR: gi|260170657|ref|ZP_05757069.1| hypothetical protein BacD2_02235 [Bacteroides sp. 
D2] # 1 75 5 79 89 153 100.0 3e-36 MIRGEMAEILLDNILRLFSTETFGKDKSAYYVGGEKKLMNLIEAGKIESDKPTNVQNGKW HCNAAQVLLHCRCAGRKVKSKKRKK >gi|222159319|gb|ACAB01000040.1| GENE 13 7029 - 7220 170 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160885875|ref|ZP_02066878.1| ## NR: gi|160885875|ref|ZP_02066878.1| hypothetical protein BACOVA_03880 [Bacteroides ovatus ATCC 8483] # 1 63 1 63 63 98 100.0 2e-19 MKKIKVIQYAMMFIALWTTLYLIDSIEVSKKEFIAAFVLVTVVSVNYICFRYYEDRKQNK DSL >gi|222159319|gb|ACAB01000040.1| GENE 14 8444 - 9226 425 260 aa, chain + ## HITS:1 COG:CAC1936 KEGG:ns NR:ns ## COG: CAC1936 COG4712 # Protein_GI_number: 15895209 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 20 187 3 169 229 201 61.0 9e-52 MTARKNTVSTVQNEEKKKNSIRPLLASEIECRVGTMKPDGSGCSLLLYKDARVDMRILDE VFGEMNWKRHHDVVNGNLFCTLSIWDNEKKEWVSKQDVGTESSTEKEKGQASDAFKRAGF NWGIGRELYTGPFIWIPLEKNEIYQSKTGSPALYTKFSVKEIGYNEQKEIILLVIVDNKN RVRFAYGNTKEKVYAPNVSASNASGKVYTGVDLDRAIKQMTGVKSREELERVWAEHPELH NNKEFRNITIDMQKTYPPRN >gi|222159319|gb|ACAB01000040.1| GENE 15 9232 - 10146 516 304 aa, chain + ## HITS:1 COG:no KEGG:BVU_2860 NR:ns ## KEGG: BVU_2860 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 303 1 301 306 392 66.0 1e-108 MIELVKSSVVFNEENHTYMLGEKQLQGITGMISRQLFPDKYKDVPDFVLKRAAEKGSLIH AQCQFVDATGLPPESIEAENYLKERTKAGYKAFANEYTVSDNEYFASNIDCVWEKAGRIC LGDIKTTLHLDEEYLSWQLSIYAYLFELQNPLLKVDKLFGIWVRGDKHELVEIPRKPDKE VKKLMECEKKGEQYLSILPVPAPDDDKLLIPMQLVNTIIGIEEELADLTKIQKDYKAKLK TAMRENGVKSWDAGRLRVSYTPASTSDNFDTKKFQADYPELYSKYIKTVPKADSIRVTIR EDKS >gi|222159319|gb|ACAB01000040.1| GENE 16 10143 - 10592 233 149 aa, chain + ## HITS:1 COG:XFa0061 KEGG:ns NR:ns ## COG: XFa0061 COG0629 # Protein_GI_number: 10956771 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Xylella fastidiosa 9a5c # 2 111 3 111 136 101 43.0 6e-22 
MSLNKLMLIGHVGKDPDIRILEAGSKVATFSFATTEKGYTLANGTQVPERTEWHNIVVWR GLADVVEKYVHKGDKLYLEGKIRTRSYDDSRGIKRYITELFVDNMEMLSVKPQQAPPPPP LPEHTNNQTRSAVNECPPPPPPTKDDLPF >gi|222159319|gb|ACAB01000040.1| GENE 17 10743 - 11051 226 102 aa, chain + ## HITS:1 COG:no KEGG:BVU_2858 NR:ns ## KEGG: BVU_2858 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 102 49 150 150 169 74.0 3e-41 MWKWFQCIGACLREYTGEEYWSTAAGVQDIHDLYCKKFLVKQVHVNGKVETIVRGTSKLN TLEMHNFMESVKIDAATEFGITLPLPEDQHYLDFIHEYQNRY >gi|222159319|gb|ACAB01000040.1| GENE 18 11070 - 11738 761 222 aa, chain + ## HITS:1 COG:no KEGG:BVU_2857 NR:ns ## KEGG: BVU_2857 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 221 1 221 233 175 49.0 9e-43 MIANLRNYEPETIEFVVPDSIREKFPPVLFQGSTNVDELIKLVNEHFNATFPESEVTQRL LDEFEISEIREEYCIKQENEVPKRERELLEAIERAKKIKSDAQDRLASIKTEIKDLAAEV KKGTREYHLSSKNTIRFALDGYFLYYSWVNGEFKLVKAEKIPDWDKRSLWAQEDRNRKAM LDLFGIEYPEVERPIDDTEDYGDKFEEDLSDKLPEEEPEDDE >gi|222159319|gb|ACAB01000040.1| GENE 19 11671 - 12006 98 111 aa, chain + ## HITS:1 COG:no KEGG:BVU_2856 NR:ns ## KEGG: BVU_2856 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 21 111 1 91 91 112 64.0 5e-24 MGTSSKKTCLINFLKKNRKTMSRLQHKKGRKSNYVKRLVNNPDWEEAKRKVRIRDGHKCQ MCGKDFNLEIHHKTYRVNGKSIVGHELEHLDCLVTLCGDCHSKVHKYHIKL >gi|222159319|gb|ACAB01000040.1| GENE 20 12003 - 12396 257 131 aa, chain + ## HITS:1 COG:VC1636_1 KEGG:ns NR:ns ## COG: VC1636_1 COG1061 # Protein_GI_number: 15641641 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA or RNA helicases of superfamily II # Organism: Vibrio cholerae # 3 126 65 188 420 91 42.0 3e-19 MTYQLRDYQKSASDAAVSVFKSKEKKNYVIVLPTGAGKSLVIANIAARIDGPLIVFQPSK EILEQNFAKLQSYGIFDCGVYSASAGRKDINRITFAMIGSVMKHMSFFKHFKHVLIDECH LVNPEKGMYKE Prediction of potential genes in microbial genomes Time: Wed May 18 02:06:33 2011 Seq name: gi|222159318|gb|ACAB01000041.1| Bacteroides sp. 
D1 cont1.41, whole genome shotgun sequence Length of sequence - 55427 bp Number of predicted genes - 58, with homology - 58 Number of transcription units - 21, operones - 15 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 76 - 762 285 ## COG1061 DNA or RNA helicases of superfamily II 2 1 Op 2 . + CDS 768 - 1232 350 ## BDI_0857 putative recombination protein 3 1 Op 3 . + CDS 1268 - 2080 381 ## Coch_0881 hypothetical protein 4 1 Op 4 . + CDS 1977 - 2624 360 ## gi|237716185|ref|ZP_04546666.1| conserved hypothetical protein 5 1 Op 5 . + CDS 2581 - 2802 186 ## gi|260170645|ref|ZP_05757057.1| hypothetical protein BacD2_02175 + Term 2809 - 2867 0.1 + Prom 2820 - 2879 2.1 6 2 Op 1 . + CDS 2904 - 4622 931 ## COG1475 Predicted transcriptional regulators 7 2 Op 2 . + CDS 4651 - 7164 1686 ## Slit_1937 DNA methylase N-4/N-6 domain protein + Prom 7184 - 7243 5.0 8 3 Op 1 . + CDS 7263 - 7556 136 ## BVU_2843 hypothetical protein 9 3 Op 2 . + CDS 7617 - 8273 456 ## BVU_2842 hypothetical protein 10 3 Op 3 . + CDS 8242 - 8994 303 ## BVU_2841 hypothetical protein + Prom 9010 - 9069 3.9 11 4 Op 1 . + CDS 9106 - 9681 241 ## BVU_2840 hypothetical protein 12 4 Op 2 . + CDS 9730 - 10377 523 ## HAPS_0636 hypothetical protein 13 4 Op 3 . + CDS 10406 - 10870 147 ## gi|237716194|ref|ZP_04546675.1| conserved hypothetical protein + Term 10904 - 10944 -0.1 + Prom 10966 - 11025 3.7 14 5 Op 1 . + CDS 11192 - 12877 485 ## COG0270 Site-specific DNA methylase + Prom 12883 - 12942 2.5 15 5 Op 2 . + CDS 12966 - 13211 193 ## gi|237716197|ref|ZP_04546678.1| conserved hypothetical protein 16 5 Op 3 . + CDS 13238 - 14029 596 ## BF2327 putative lipoprotein 17 5 Op 4 . + CDS 14052 - 14210 107 ## gi|293372930|ref|ZP_06619299.1| hypothetical protein CUY_0816 + Prom 14323 - 14382 3.6 18 6 Op 1 . + CDS 14455 - 14871 287 ## gi|237716200|ref|ZP_04546681.1| predicted protein 19 6 Op 2 . 
+ CDS 14893 - 15120 173 ## gi|237716201|ref|ZP_04546682.1| predicted protein + Prom 15281 - 15340 12.2 20 7 Op 1 . + CDS 15409 - 15840 295 ## Mnod_4116 hypothetical protein 21 7 Op 2 . + CDS 15852 - 16769 596 ## COG1032 Fe-S oxidoreductase + Term 16802 - 16838 3.3 22 8 Tu 1 . + CDS 16843 - 17649 209 ## COG0451 Nucleoside-diphosphate-sugar epimerases + Term 17682 - 17736 2.7 + Prom 17654 - 17713 2.7 23 9 Op 1 . + CDS 17742 - 18284 350 ## BVU_2837 hypothetical protein + Term 18307 - 18341 -0.7 24 9 Op 2 . + CDS 18346 - 20361 924 ## COG1783 Phage terminase large subunit + Prom 20378 - 20437 4.3 25 10 Op 1 . + CDS 20488 - 21933 577 ## BVU_2835 hypothetical protein 26 10 Op 2 . + CDS 21887 - 22948 562 ## BVU_2834 hypothetical protein 27 10 Op 3 . + CDS 22966 - 24543 1365 ## BVU_2833 hypothetical protein 28 10 Op 4 . + CDS 24548 - 24970 478 ## BVU_2832 hypothetical protein 29 10 Op 5 . + CDS 24979 - 25521 337 ## BVU_2831 hypothetical protein 30 10 Op 6 . + CDS 25518 - 26090 183 ## BVU_2830 hypothetical protein 31 10 Op 7 . + CDS 26087 - 26926 588 ## BVU_2829 hypothetical protein 32 10 Op 8 . + CDS 26923 - 27078 72 ## gi|237716214|ref|ZP_04546695.1| conserved hypothetical protein + Term 27081 - 27126 9.7 33 11 Op 1 . + CDS 27203 - 27649 277 ## BVU_2828 hypothetical protein 34 11 Op 2 . + CDS 27680 - 27841 198 ## gi|160885930|ref|ZP_02066933.1| hypothetical protein BACOVA_03935 35 11 Op 3 . + CDS 27844 - 31755 3433 ## COG5283 Phage-related tail protein 36 11 Op 4 . + CDS 31758 - 32339 311 ## BVU_2826 hypothetical protein + Prom 32355 - 32414 8.0 37 12 Op 1 . + CDS 32447 - 32734 95 ## gi|160885933|ref|ZP_02066936.1| hypothetical protein BACOVA_03938 38 12 Op 2 . + CDS 32724 - 33197 159 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Term 33217 - 33283 10.3 39 13 Tu 1 . - CDS 33282 - 34085 216 ## COG0117 Pyrimidine deaminase - Prom 34247 - 34306 5.8 40 14 Op 1 . + CDS 34426 - 36417 1046 ## BVU_2822 hypothetical protein 41 14 Op 2 . 
+ CDS 36422 - 37735 597 ## BVU_2821 hypothetical protein 42 14 Op 3 . + CDS 37747 - 43920 3804 ## gi|237716224|ref|ZP_04546705.1| conserved hypothetical protein 43 14 Op 4 . + CDS 43923 - 44711 673 ## gi|237716225|ref|ZP_04546706.1| conserved hypothetical protein 44 14 Op 5 . + CDS 44737 - 45546 441 ## gi|262407834|ref|ZP_06084382.1| conserved hypothetical protein + Term 45580 - 45605 -0.5 + Prom 45570 - 45629 9.8 45 15 Op 1 . + CDS 45693 - 46100 193 ## gi|160885941|ref|ZP_02066944.1| hypothetical protein BACOVA_03946 46 15 Op 2 . + CDS 46115 - 47536 351 ## COG3344 Retron-type reverse transcriptase 47 15 Op 3 . + CDS 47499 - 47777 291 ## gi|160885943|ref|ZP_02066946.1| hypothetical protein BACOVA_03948 + Term 47801 - 47833 4.0 + Prom 47802 - 47861 8.3 48 16 Op 1 . + CDS 47975 - 48268 229 ## gi|294807657|ref|ZP_06766450.1| hypothetical protein CW3_0225 49 16 Op 2 . + CDS 48277 - 48792 295 ## gi|262407839|ref|ZP_06084387.1| conserved hypothetical protein 50 16 Op 3 . + CDS 48776 - 49354 512 ## COG0860 N-acetylmuramoyl-L-alanine amidase 51 16 Op 4 . + CDS 49375 - 49857 250 ## gi|237716233|ref|ZP_04546714.1| conserved hypothetical protein + Term 49931 - 49987 -0.9 + Prom 49859 - 49918 4.9 52 17 Op 1 . + CDS 50016 - 50630 283 ## gi|237716234|ref|ZP_04546715.1| conserved hypothetical protein 53 17 Op 2 . + CDS 50685 - 51746 472 ## COG2856 Predicted Zn peptidase 54 17 Op 3 . + CDS 51776 - 52039 324 ## gi|237716236|ref|ZP_04546717.1| predicted protein + Prom 52041 - 52100 5.0 55 18 Tu 1 . + CDS 52322 - 52531 172 ## AAur_3080 hypothetical protein + Prom 52575 - 52634 4.2 56 19 Tu 1 . + CDS 52659 - 53465 468 ## PCC7424_0225 hypothetical protein + Term 53567 - 53608 8.1 + Prom 53687 - 53746 8.6 57 20 Tu 1 . + CDS 53895 - 55007 503 ## gi|237716239|ref|ZP_04546720.1| predicted protein + Term 55226 - 55263 4.1 - Term 55126 - 55162 7.3 58 21 Tu 1 . 
- CDS 55180 - 55425 115 ## BF2302 hypothetical protein Predicted protein(s) >gi|222159318|gb|ACAB01000041.1| GENE 1 76 - 762 285 228 aa, chain + ## HITS:1 COG:VC1636_1 KEGG:ns NR:ns ## COG: VC1636_1 COG1061 # Protein_GI_number: 15641641 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA or RNA helicases of superfamily II # Organism: Vibrio cholerae # 4 193 231 420 420 109 35.0 4e-24 MLKFITRTRPKVFTDVIYHCQVSELLAKGFLASLKYYDITKLDLSRVRTNSTGADYDEKS LLQEFERVDIYKDIVGWTKRLLNPKSGIPRKGILIFTRFIREAEKLASEIPNCAIVSGST PKEERARILKGFKDGRIKVVANVGVLTTGFDYPELDTVVLARPTKSLSLYYQMVGRVIRP CQGKEGWVVDLSGNFRRFGRVEELRIEQPEKGKWCIMSRGRQLTNVVF >gi|222159318|gb|ACAB01000041.1| GENE 2 768 - 1232 350 154 aa, chain + ## HITS:1 COG:no KEGG:BDI_0857 NR:ns ## KEGG: BDI_0857 # Name: not_defined # Def: putative recombination protein # Organism: P.distasonis # Pathway: not_defined # 14 154 17 157 157 197 70.0 1e-49 MWRNYKKKEKKKPLFEVEGVKVKKKPDLVDKLDRIFSLFIRYRDTMPNGYFQCISCGKIK PFNKADCGHYINRQHMSTRFDEMNCNAQCSHCNRFMEGNIQDYRRRLVAKYGERNVLILE AKKNVTKQFSDFQLEKLITHYKEEAKKLKEAKGL >gi|222159318|gb|ACAB01000041.1| GENE 3 1268 - 2080 381 270 aa, chain + ## HITS:1 COG:no KEGG:Coch_0881 NR:ns ## KEGG: Coch_0881 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 1 109 1 110 298 105 50.0 2e-21 MERNSFIFYKGWREAIKDLPDDVRLEIYESIIEYATTGNLRGLKPMANIAFNFIKIDIDR DTEKYMSIVERNKSNGSKGGRPKSENPKEPKEPTKPTGLFGNPKEPTKPDNDNEYDNDYV DDNDSHLKKKETSPKGESKKDELSLFPEEKIDWGGLMDYFNSTFKGKLPAIKSIDAKRKK AIKARVAQYGKQAVFDVFQLVLDSPFLLGQNDKNWRCTFDWIFKSANFTKILEGNYNGKR TDTAATRRESVSSLTDLAEKLLQSSMPQEG >gi|222159318|gb|ACAB01000041.1| GENE 4 1977 - 2624 360 215 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716185|ref|ZP_04546666.1| ## NR: gi|237716185|ref|ZP_04546666.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 60 215 1 156 156 295 99.0 1e-78 MENELILRPQEENRLAVLRTSPKNYCKALCPKKVEDVFQSDEPSIGTIIRKFGEPQARAV LVILIADALEFFNVGNPMSATQVATTVDLIIEEYPYMKTDDFKLCFKNAMKMKYGNIYNR IDGQVIMSWLREYNKERCAVADNQSWNFHKENLSEEVSYTSGLSYEEYRNELKLRVEQGD EEAAKALSLSNEIISYLNKRENGKQEAEGDNLLEH >gi|222159318|gb|ACAB01000041.1| GENE 5 2581 - 2802 186 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170645|ref|ZP_05757057.1| ## NR: gi|260170645|ref|ZP_05757057.1| hypothetical protein BacD2_02175 [Bacteroides sp. D2] # 1 73 1 73 73 125 100.0 6e-28 MANKKQKVTIYWNTRHIKLEDIPEVKRRIRERFGIPNHTTVNGETDCYIREEDMELLRET EKRGFIQIRNKPA >gi|222159318|gb|ACAB01000041.1| GENE 6 2904 - 4622 931 572 aa, chain + ## HITS:1 COG:SMc02801 KEGG:ns NR:ns ## COG: SMc02801 COG1475 # Protein_GI_number: 15967087 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Sinorhizobium meliloti # 4 197 39 198 296 105 34.0 4e-22 MEVQNIRIDLISPSPLNPRKTFDEAALEELASNIEKQGLLQPVTVRVAKSEEMTNLETGD VTPLPYTYEIVCGERRFRAVSLLKAKEDEANVAKIKAHRKKSEKFQTISCIVREMTDDEA FEAMITENLQRKDVDPIEEAFAFAQLAEKGRTLEDIALKIGKSTRFVFDRIKLNSLIPEL KERVRNGDIPLSGAMILSKLDEDTQKEFHEEEEEQCTTAMIREFVSNSFMELGNAPWIKD DSDNWENTDIKSCSQCENNTCNHGCLFYEMNSKDARCINAACYEKKQIAYVTRKIQLEYE HLVKVGEPLSFGKTVIIARRPDTYWGEDRKVFYEKTLEAVKQLGFEIVDPDEIFRCKCWY SEDDERTLKMLEDGEVYRCLSFFGHYSPEFNVSFYYVRKATASSTSAVADLKEIEREKIN AQLKRAKDIVKEKSAEEMRKWAQEKTYYQRTKEFSENEQLVFDVLVLSGCSSTYLEKLNL KKWNGESDFVNYVKNNQADRHQWYRAFIAECLSSNNVNFCSYLQKCQKILFAEQYPDDYN ALTKKLADSYSKKEMKLKQQLEELNNDNTEEA >gi|222159318|gb|ACAB01000041.1| GENE 7 4651 - 7164 1686 837 aa, chain + ## HITS:1 COG:no KEGG:Slit_1937 NR:ns ## KEGG: Slit_1937 # Name: not_defined # Def: DNA methylase N-4/N-6 domain protein # Organism: S.lithotrophicus # Pathway: not_defined # 4 835 8 837 843 937 53.0 0 MKDYIEFLKDKMAISHNTGFEVNPNEISTSLYPHVRDTVRWAVSGGCRAIFSSFGMQKTV TQLEICRVIINQYFGKALIVCPKRVVVEFITQAKEHMNMTVKYVKTMSEVRACKCDIMIT NYERVRDGEDGVRIEPSFFTVTSLDEASVLRGFGTKTYQEFLPLFADVPFRFVATATPSP NRYKELIHYAGYLGVMDTGQALTRFFQRDSTKANNLTLYPHKEKEFWLWVSTWALFLTKP 
SDLGYPDTGYELPELRVHEEVVSVDNSTAGTDRDGQVKMFREAALGLADAAKERRDNMAE KIARVVEIINRPENKDEHFLLWHDLESEREALCKAIPGCKAVYGSQDDEEADKVIADFKN GRLKYLAAKPEMLGEGLNFQYHCHKAIMFIDYRFNDKFQAIARIYRFMQKHPVDLYLVYA ESEGEIFKSFMQKWAQHREMVSKMTDIVRENGLFGLQAEEKMMRWMFASREEKSGKLWKA INNDNVLECQKMESNSVDLVVTSIPFSNHYEYTPTYNDFGHNESNDKFFEQMDYLTPELM RILKPGRLACIHVKDRVLFGNATGDGMPTIDPFSEMTVFHYMKHGFRYMGRITVDTDVVR ENNQTYRLGYTEMCKDGSKMGIGCPEYVLLFRKLPSDTSRAYADLPVTKNKNEYSLARWQ IDAHASWKSSGNTLLSYEDMKVAGIDKIRHLFRNYEREHIYNYEEHVAFAEELEVYGKLP KTFMAVDPVSKKPWIWDDVTRMRTLNTKQSQKKRQNHICPLQLDIVERLIERYSNKGDLV FDPFGGIGTVPYCAIRLGRKGLSTELNYDYWKDSLSYLHEAEIEVNAPTLFDLMEAI >gi|222159318|gb|ACAB01000041.1| GENE 8 7263 - 7556 136 97 aa, chain + ## HITS:1 COG:no KEGG:BVU_2843 NR:ns ## KEGG: BVU_2843 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 97 1 97 97 97 51.0 1e-19 MITLNRFAQRCLNIMRKRFKMNEHSSRKAFSIRIEAVWRKFDIASKYRSDNLPKYSEDEE LAAEMIIYLVAYLKRFGCEDIEQLIKDKIEFDDRKND >gi|222159318|gb|ACAB01000041.1| GENE 9 7617 - 8273 456 218 aa, chain + ## HITS:1 COG:no KEGG:BVU_2842 NR:ns ## KEGG: BVU_2842 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 217 1 216 217 371 85.0 1e-101 MTEIIQVCLLDFNKGQLTGLPKNPRFFRDYRFEAMKKSIQDSPEMLELRELIVFPYNDGR YIVVCGNLRLRACKELGYKELPCKILAPDTPVKKLREYATKDNVNFGENDLDVMENEWNK AELQDWGIEFAPEKKEDEFKERFDAITDDTAIYPLIPKYDEKHELFIITSSNEVDSNWLR ERLDMQHMKSYKTGKVSKSNVIDIKDVRHALQNSNTKS >gi|222159318|gb|ACAB01000041.1| GENE 10 8242 - 8994 303 250 aa, chain + ## HITS:1 COG:no KEGG:BVU_2841 NR:ns ## KEGG: BVU_2841 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 250 1 250 250 476 88.0 1e-133 MPCKIVIPSHKRHDRVFAKKLVNDPIICVAESQADLYQQFNPECEIVTHPDDVMGLIPKR NWMAKHFGELFMLDDDVHACKPIYVEKGEPSRIKDKDKITNIIQSLFEMASMMDVHLFGF TARISPVMYDESAFLSLSKMITGCSYGVIYNKNTWWNEEIRLKEDFWISCYMKYKERKVL TDLRYNFEQKNTFVNAGGLASIRNQEEERKSILFIKKNFGDSILLKSATTNGKDKTKQLV QYNISCKFKF >gi|222159318|gb|ACAB01000041.1| GENE 11 9106 - 9681 241 
191 aa, chain + ## HITS:1 COG:no KEGG:BVU_2840 NR:ns ## KEGG: BVU_2840 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 190 1 189 190 314 85.0 1e-84 MIIRTVCGYDFFEVSSAMQKAIRRADTGVAGFFALELWASGYRDYVWKRLFTISAEDCYG IITKEIEALWQGHELVNKTATEPKGRIFVSKAVILLCECRKNRDADHLQNFIYDRKDIDI EKWINDVRRYPIPIPDYTFDVHTRKGKKHGRTKEEFFQEEYKALQPRVPGLFDDLVQPSQ PKLFNDETTAK >gi|222159318|gb|ACAB01000041.1| GENE 12 9730 - 10377 523 215 aa, chain + ## HITS:1 COG:no KEGG:HAPS_0636 NR:ns ## KEGG: HAPS_0636 # Name: not_defined # Def: hypothetical protein # Organism: H.parasuis # Pathway: not_defined # 1 213 1 213 213 188 50.0 1e-46 MNTYYKFAPNVFLAKCDEKHEKGETIEVTTKYGKENESIVFNLIFEKDGFYYYSIVRADG FNVQEWAKQRAERRHEWASSAVQKSNEYFQKSNKHRDFLSLGEPIKVGHHSERGHRKMID DAWNNMGKSVEFSDKAAEHERVAKYWEKRANTINLSMPESIDFYEHKLEQAKEYHEGLKS GKYRREHTYAMAYANKAVKEAKKNYDLAVKLWGDV >gi|222159318|gb|ACAB01000041.1| GENE 13 10406 - 10870 147 154 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716194|ref|ZP_04546675.1| ## NR: gi|237716194|ref|ZP_04546675.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 154 1 154 154 326 100.0 3e-88 MRELSKETSLQRVMRASGRVPVQCSCSVCKQQCHTPCLGTPDDIERIIDAGYADRLALTN WAAGIFLGVINIAIPMIQPVSGKEFCAFFENGLCILHDKDLKPTEGRLSHHTVRKDNFNP TMSIAWNVAKEWLMPENEDVLSRVVNKFLNARKP >gi|222159318|gb|ACAB01000041.1| GENE 14 11192 - 12877 485 561 aa, chain + ## HITS:1 COG:mlr8517 KEGG:ns NR:ns ## COG: mlr8517 COG0270 # Protein_GI_number: 13477024 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Mesorhizobium loti # 5 548 25 656 667 163 25.0 9e-40 MSSINLLYIDLFCGAGGTSTGVESARIDGKQCAKVIACVNHDANAIASHAANHPDALHFT EDIRTLELSPLIEHLAKCKAQYPGAAVVLWASLECTNFSKAKGGQPRDADSRTLAEHLFR YIEAICPDYIQIENVEEFMSWGDMDENGKPISMDKGRLYQRWVRNVRKYGYNFDFRILNA ADYGAYTTRKRFFGIFAKNGLPIVFPQPTHCKNGKQDMFGRLEKWRPVKEILDFSDEGTS IFREKPLAEKTMERIYAGLIKFVAGGKDAFLIKYNSMSRTGKYNAPGIDEPCPVVATQNR LGVAQVCFLSKQFSGHPESKNVSINEPAGTITCKDHHAFVSAHYGNGFNRSINEPSATVT TKDRLSLVSPYFIDQQYGNSKPSSTEKPLGCITANPKYNLVSCKPWIMNTNFSNVGSSIE EPAQTVTANRKWHYLMNPQFNSAGGSVDNPCFTLIARMDKMPPYLIATETGHVVIEIYDT DSPMTKKIKEFMGLYGIIDIKMRMLRIPELKRIMGFPENYVLIGTQADQKKFIGNAVEVN MARVLCECISKKLRELGSVAA >gi|222159318|gb|ACAB01000041.1| GENE 15 12966 - 13211 193 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716197|ref|ZP_04546678.1| ## NR: gi|237716197|ref|ZP_04546678.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 81 1 81 81 151 100.0 1e-35 MKRNEKIAKLERLGIFNQWKYNTERANETFNIECPDFSMTNEERMNNLLDVDCCFHWFLT ISFPFNNTPEGVAFWNDIAKK >gi|222159318|gb|ACAB01000041.1| GENE 16 13238 - 14029 596 263 aa, chain + ## HITS:1 COG:no KEGG:BF2327 NR:ns ## KEGG: BF2327 # Name: not_defined # Def: putative lipoprotein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 155 10 165 210 286 85.0 6e-76 MIIAWFSCGATSAVACKIALSLYDDVHIYYIETGSGHPDNTRFLADCEKWYNQSIHIIRS DKYTCVSDVLRKGYINGAHGAACTLELKKKVRYKLEKELQHWDGQVWGFDYDPKEINRAI RLKQQYPDTKPLFPLIEKQITKQDAMGMLWKAGIEIPAMYKMGYNNNNCIGCVKGGMGYW NKIRKDFPDVFNEVAQIERDVGATCLKDKDGRIFLDELPTWRGDPVEEIIPDCSLICQIE FQEIIDRQVERVLKGEISINDVA >gi|222159318|gb|ACAB01000041.1| GENE 17 14052 - 14210 107 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293372930|ref|ZP_06619299.1| ## NR: gi|293372930|ref|ZP_06619299.1| hypothetical protein CUY_0816 [Bacteroides ovatus SD CMC 3f] # 1 52 1 52 134 101 96.0 2e-20 MNFKSLVAQLANRINQPHVVETYMRKVFASGVEWQKKQSPWIRVEERLPDEE >gi|222159318|gb|ACAB01000041.1| GENE 18 14455 - 14871 287 138 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716200|ref|ZP_04546681.1| ## NR: gi|237716200|ref|ZP_04546681.1| predicted protein [Bacteroides sp. D1] # 1 138 1 138 138 242 100.0 5e-63 MMTAKELSKLITTGRKLKKFIKETLPKIREEFQSHSNSGIDKHTDGFGRRESIQSMNISN LCYFSFSGSYGSGDTYSDIANMDTDLMQEYFIKYLNRHKDEIMEGVADLMINDAKSGQED AIKEIDEYKKSLLKLLEE >gi|222159318|gb|ACAB01000041.1| GENE 19 14893 - 15120 173 75 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716201|ref|ZP_04546682.1| ## NR: gi|237716201|ref|ZP_04546682.1| predicted protein [Bacteroides sp. 
D1] # 1 75 1 75 75 146 100.0 5e-34 MQYILTEQEYRALTPISEVNKLKEEVQLLNDKVMELSEHPCGSGADYRSVTFYCDDCPIG AFGTGTCTKSQQYSK >gi|222159318|gb|ACAB01000041.1| GENE 20 15409 - 15840 295 143 aa, chain + ## HITS:1 COG:no KEGG:Mnod_4116 NR:ns ## KEGG: Mnod_4116 # Name: not_defined # Def: hypothetical protein # Organism: M.nodulans # Pathway: not_defined # 3 122 4 124 154 166 61.0 3e-40 MAKIYVASSWRNVFQQDVVDILRDLGHEVYDFKNPPHGNGGFQWSDIDPDWQNWTTEQYR EALNHPIAQKGFDSDFNGMQWADVCVMVLPCGRSANTEAGWMKGAGKKVMVYSPKKEEPE LMYKIYDFVSDSIFRINDEIIGV >gi|222159318|gb|ACAB01000041.1| GENE 21 15852 - 16769 596 305 aa, chain + ## HITS:1 COG:Ta1390 KEGG:ns NR:ns ## COG: Ta1390 COG1032 # Protein_GI_number: 16082367 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Thermoplasma acidophilum # 101 234 73 207 425 83 35.0 6e-16 MNIGILAVDSNYPNLALMKISAWHKARGDNVEWYNPLCSYDKVYSAKVFSFTPDYGYYIN TNQVEKGGTGYDISKVLPVEVDKIVPDYNLYNIDKNLAYGFLTRGCPNRCKWCVVPAKEG NITPYMDIEEVSAGRKNVILMDNNVLASDYGLQQIEKIVSMGVRVDFNQGLDARLVTDDI ARLLARVKWMKRIRFGCDTPGQIAECERATALIDKYGYKGEYFFYCILLSDFKESFERVN HWKNKGGRFLPHCQPYRDLNNPRQIIPQWQKDLAGWADKKWVFRSCEFKDFTPRKGFKCR EYFQK >gi|222159318|gb|ACAB01000041.1| GENE 22 16843 - 17649 209 268 aa, chain + ## HITS:1 COG:XF2279 KEGG:ns NR:ns ## COG: XF2279 COG0451 # Protein_GI_number: 15838870 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Xylella fastidiosa 9a5c # 4 214 22 296 342 90 29.0 2e-18 MRKMIVTGSEGFIGKALCRELAKRDVEVIGLDRKSGIEATKVCELLKNGGIDCVFHLAAQ TSVFNGNLEQIRKDNIDTFMRVADACNQYHVKLVYASSSTANPENTTSMYGISKYFDEQY ASIYCKAATGCRLHNVYGPNPRKRTLLWFLIEKENVSLYNCGQNIRCFTYIDDVVEGLIY AVGCNRQLINICNVQPVTTMYFASLVKYYKPLEIELINKKRDFDNLEQSVNQDIYLVPLS YTSVEDGVKKVFAMRREDNSQKNAGAEK >gi|222159318|gb|ACAB01000041.1| GENE 23 17742 - 18284 350 180 aa, chain + ## HITS:1 COG:no KEGG:BVU_2837 NR:ns ## KEGG: BVU_2837 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # 
Pathway: not_defined # 7 180 9 181 181 220 67.0 2e-56 MSRENVLTLKQEKFCQYYVDIDGNASEAYRMAYDCTKMKQESVWRNAHALMQNIKVTSRI KEIREKRAKESEVKRETVERVLMDIITSDPNDLYIVDELTGKVKMKSPSQLPKRTRNALK KIQNKRGEVVYEFNGKTEAARLLGAWNGWEADKNVNIKGGDGNKVGELRIGFEDNENSEE >gi|222159318|gb|ACAB01000041.1| GENE 24 18346 - 20361 924 671 aa, chain + ## HITS:1 COG:BS_yqaT KEGG:ns NR:ns ## COG: BS_yqaT COG1783 # Protein_GI_number: 16079672 # Func_class: R General function prediction only # Function: Phage terminase large subunit # Organism: Bacillus subtilis # 6 173 3 169 431 82 30.0 4e-15 MVINYKKLNPNGFYLLKYLNDETIRFIILYGGSSSGKSYSVAQTILIQTLQDGENTLVMR KVGASILKTIYEDYKVAAIGLGISHLFKFQQNTIKCLVNGAKIDFSGLDDPEKIKGISNY KRVQLEEWSEFEHPDFKQLRKRLRGKKGQQIICTFNPISESHWIKKEFIDKDKWHDVPMT VTIAGKELPKELTKVKSVKKNAPRQILNLRTKQIEEQAPNTVIIQSTYLNNFWVVGSPDG AYGFYDEQCVADFEYDRVHDPDYYNVYALGEWGVIRTGSEFFGSFNRGKHSGEHKYVPDL PIHISVDNNVLPYISVSYWQVDFTTGTKVWQFHETCAESPNNTVKKASKLVAKYLKSIQY SDRLYVHGDASTKAANSIDDEKRSWMDLFIDTLQKEGFEIEDKVGNKNPSVAMTGEFVNA IFDCTVPGIEIYIDESCSVSIEDYMSVQKDANGAILKTKVKNKTTLQTYEEHGHLSDTFR YVVVDLCSEQYIEFSNRRKRNLYACNGTINFFNPDTECKYTKKILYVMPNVNGKFVLIQA FRCGNKWHVVDVVFMDTTSTEDIRSSILSHESDSCVIECTDAYFPFIRELRSSTNKEIRV MKEFPDVDKRIAATSDYVKNSILFSASKVESDTEYVAFMNNLMDYNKDSETKEASAVLSG LVQFVVKLGLN >gi|222159318|gb|ACAB01000041.1| GENE 25 20488 - 21933 577 481 aa, chain + ## HITS:1 COG:no KEGG:BVU_2835 NR:ns ## KEGG: BVU_2835 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 5 481 4 475 475 473 49.0 1e-131 MNIFFDNLFGKKSKTKGEVEIVTSSENKDIDTQSGKAEKWSVAYIEDLTSPIVAGSNYLT LFSTIPEVFFPIDYIASRIAGANFQLKKTKDDSIVWANKRMNGILSRPNCLMRWKELIYQ HHIYKLCTGNSFIRAAMPDVFSTAEKWRYCDNYWVLPSDKTIVEPVYGNMPLFGIAQTED IIRSYRLEYGWNGSLEIPPYQIWHDRDGSAEFYSGAMFLKSKSRLASQNKPMSNLIAVYE ARNVIYVKRGGLGFIVSKKTDATGSIALTDDEKEQLLKQNFEKYGVRKGQVPYGISDADI DFVRTNLSIAELQPFEETLADAINIAGAYGIPAVLVPRKDQSTFSNQATAEKSVYCSTVI PMAKQFCKDFTAFLGLEGGGYYLDCDFSDVDCLQEGLKESEDVKTNINKRCREQFSCGLI TLNDWRAQIGESMIENPLFDKLKFDMSDEELDKVNRVFNTKSGDEKDGRENQKPSVQDKG K 
>gi|222159318|gb|ACAB01000041.1| GENE 26 21887 - 22948 562 353 aa, chain + ## HITS:1 COG:no KEGG:BVU_2834 NR:ns ## KEGG: BVU_2834 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 329 3 336 351 322 52.0 2e-86 MEEKIKSLQYKTKANDVDEKGIVTVAVNGIGVKDSQNDISMPGSFNKTLKENIGRMRWFL NHRTDQLLGVPLSGKETEGNLVMVGQLNLEKQIGRDTLADYKLFAENGRTLEHSIGVKAI KRDSVDPCKVLEWRMMEYSTLTSWGSNPQTFLVNIKSATADQVKEAVDFVRKAFLQHGYS DERLKGYDMELSLLLKSLNGGAVVSCPHCGHQFDYDAETEHTFAQQVLDYAADYQRWITQ DIVREEMEKLTPEIRTQVISLIDSVKSEKKEFSQKGLQDLMNYVRCPHCWGKVYRSNAIL QNTSEDTTGKNEPSVDTQEKNDGENGNDEVTIKAADNGTLFDFKSLNSCFENK >gi|222159318|gb|ACAB01000041.1| GENE 27 22966 - 24543 1365 525 aa, chain + ## HITS:1 COG:no KEGG:BVU_2833 NR:ns ## KEGG: BVU_2833 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 21 525 59 561 562 363 40.0 1e-98 MPIRKFTVSDFNLKTDGLPAEQKAFMENIVGMMCEVVNKSLEGIASPDEVSKQFDDINKL LKSYDNEKFQQLVKDNEELVAQVKTLGESIEKMKQKGLSMNAINKFDEKLNEMLDSEKFR DFAEGKTRKSGEFDGFSLKDVVSMTDNYTGDLLITQQQKRVVTQVANKKLHMRDVLTTLT ADPAYPQLAYAQVYAFNRNARFVTENGRLPESSIKVKEIQTGTKRLGTHIRISKRMLKSR VYIRSYILNMLPEAVWMAEDWNILFGDGNGENLLGIINNTGVTSVEKIISTAIVTGAAGA VKAITGYNGDKDVIVEFAEPQDLILDGMSITFAGAAVLTELNKTHALVKMEDGRILIPGV AFSGAETATDKMTFSVHEAGFKNIEEPNSEDVVKTAFAAMTYAQYFPNAIILNPMTVNGM ESEKDTTGRNLGIVKMVDGVKYIAGRPIIEYGGILPGKYLLGDFNQAANLVDYTTLTLEW AEDVETKLCNEVVLMAQEEVIFPIYMPWAFAYGDLAALKTAITKA >gi|222159318|gb|ACAB01000041.1| GENE 28 24548 - 24970 478 140 aa, chain + ## HITS:1 COG:no KEGG:BVU_2832 NR:ns ## KEGG: BVU_2832 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 129 1 116 140 62 37.0 6e-09 MDYILRGNDKDVTNVLKEQRIRINRGMIQLIPISECGLVTEEDARKTLECMLAEKNEEIG RLTASIAEKDKTIVELTEERETMKARIAELEVQVPSDEKNLPVADSKDLQEEDAKEVTVT DDKAVSVEDEKKTGKSKTSK >gi|222159318|gb|ACAB01000041.1| GENE 29 24979 - 25521 337 180 aa, chain + ## HITS:1 COG:no KEGG:BVU_2831 NR:ns ## KEGG: BVU_2831 # Name: not_defined # Def: hypothetical protein # Organism: 
B.vulgatus # Pathway: not_defined # 1 180 3 173 173 216 57.0 4e-55 MLIDVSYFMSGPRHIENVSVAEMPSPQSLAVNEVINGYIKAFQPEFLRNVVGVTLSQAIT DYLELIEREKEDSSDEVDISEEKEAPQSGYAVLCEKLCEPFADYVFYHILRDANTQATIT GLVRLKCANEYVAPLKRQVSTWNSMVEKNKQFVEWAMSNDCPFDVQITKNLLTPINAFNL >gi|222159318|gb|ACAB01000041.1| GENE 30 25518 - 26090 183 190 aa, chain + ## HITS:1 COG:no KEGG:BVU_2830 NR:ns ## KEGG: BVU_2830 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 189 1 188 188 233 62.0 4e-60 MIDLDITELFEEIVKELPEGLEILYPNGKGGTKVVKSPRLNYIFGSSQYIKDILDEYSKS SAQSERKFPLVALFTPISEDRGDADYFSKAKVSLIIACSSCKEWSNEMRRTTSFKNILRP IYKRLLEVLYEDSRFDCDYDEKVKHSYSENYSYGRYGAYTDSGEAVSEPIDAINIRSMEI KINNLNCRRK >gi|222159318|gb|ACAB01000041.1| GENE 31 26087 - 26926 588 279 aa, chain + ## HITS:1 COG:no KEGG:BVU_2829 NR:ns ## KEGG: BVU_2829 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 278 1 277 277 404 67.0 1e-111 MRKIRTCKGSRMNTGSSACSIDWKKVKGAILTEHGVKLPADITGEKLLELCHADRPGRIY PILPFLEYAKNGGEPQVNPVGYGASEYNGLSAQTDTFTLKKFDEVLNAQLLKCANKGWDV YFWNQDNMLIGYNDDTDILAGIPMSTVYPTVTQYPTSSAKSAMTVSFSHEDVEDSQLHFD YVQLDFNPKNFVKGLVDVVFQKLEAENTYKIVEVVGGYDRTEEFGSLIADGAAEVMNNVT SATYSDGIITIVPKAGAVPSLKAPSVLYEKGIRGIEQVS >gi|222159318|gb|ACAB01000041.1| GENE 32 26923 - 27078 72 51 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716214|ref|ZP_04546695.1| ## NR: gi|237716214|ref|ZP_04546695.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 51 1 51 51 73 100.0 4e-12 MKVDNVTFVEAAVKGMTKEEFINAHIKVVWQELKEADRKKKLSEVYDAITK >gi|222159318|gb|ACAB01000041.1| GENE 33 27203 - 27649 277 148 aa, chain + ## HITS:1 COG:no KEGG:BVU_2828 NR:ns ## KEGG: BVU_2828 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 147 26 172 173 207 62.0 8e-53 MEEHKNVLVDCIQEQLYSGLDGTEHLLNPDYDTDTYFNEPGPWQNRAEQYKRWKERITPP LRSEMLYLPPRPIEVPNLFITGTFYDSITADRIDSGLRFSTKGFTDGSSIEKKYGEQILG IGDTAKEYFNIMYLRPWMERFFSECGYL >gi|222159318|gb|ACAB01000041.1| GENE 34 27680 - 27841 198 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160885930|ref|ZP_02066933.1| ## NR: gi|160885930|ref|ZP_02066933.1| hypothetical protein BACOVA_03935 [Bacteroides ovatus ATCC 8483] # 1 53 10 62 62 95 100.0 1e-18 MQSELERISDLAKKAAVLDGCMYVVYQKEDGTYAFDKLGVEIKGKIVEYRHYL >gi|222159318|gb|ACAB01000041.1| GENE 35 27844 - 31755 3433 1303 aa, chain + ## HITS:1 COG:Z2987 KEGG:ns NR:ns ## COG: Z2987 COG5283 # Protein_GI_number: 15802339 # Func_class: S Function unknown # Function: Phage-related tail protein # Organism: Escherichia coli O157:H7 EDL933 # 314 600 209 512 696 91 27.0 1e-17 MADLKLKDFVDENDLQKLVELDNTIERVRADYVNAAKELAKGLKLNVEGVADLEKLSNLY NTQAKTAGSASAELTEALRKQSEITQTVSKKIEEKLNVEKLSAAELKKLTKANSDNAASL EKAAKAEANLTKAQNAGNTTRKKAVLSEEERLKLIRTAITLTNQEVHSRSQAKEMNKQLQ KAVDVLKDTDENYIRTLARLNSTIGINTDYIKRNSDRYSQQKMTIGAYREEVKAAWIEIQ NGNKSMQNMGIIARNAGMMLKTEMAPGLNKVGAGLKGWAAGYIGAQAVVSGVVALFTKLR EGVGDIVKFELANSRLAAILGTTSDKVKELTADAQRLGATTKYTASEATDLQIELAKLGF TRKEILDATEHVLKFAQATGAELADAASLAGASLRMFNADTRETERYVSAMAVATTKSAL SFSYLATALPIVGPVAKAFNFSIEDTLALLGKLSDAGFDASMAATATRNVFLNLADSNGK LAKALGKPVKTLPELVEGLKSLKEKGVDLNTTLELTDKRSVAAFNAFLTAVDKILPLREQ ITGVERELGDMAHTMGDNVHGALANLSSAWEAFMLSFSESTGPAKEFLNWMADKIRGIAN DLKSPEEKIEKIDYNFRTLAKKDANKKLLEVEKDFQAEYKRLIDAGDTEEQAYTKAVIQM KNKRIEVTAQEREALKRMKTRAQYATSEFEDMSWIKNGAAKMFGYYTSEAEKADKAQLEF SKNLFKIASSDEFNRGLDVIAEKFRPKGNDKNGSGITVLTDKEKREQEKALKEKLKIHET YQESELALMDEGLEKELAKIGVAYSKKIAAVKGNSKEEIATRQNLAKEMQEKLDEFTIKY 
NSDREKKDVENALAVVKKGSQEELDLKLHQLELQREAEIDAAEKTGEDVFLIDDKYAKKK QELYERHASDQVQLIAENAAHEQEIRDAAYVMDTLALKKQLASKEITQQEYAELEYQLKL DYVRKTTEAAIDALELELRNENLSAEDRAKIAEQLQKLKADLSQQEAEAEIDAINKVTKA DEKAQKERQRNLKKWLQTASQAVGAIGDLVSTIYDGQIQKIEEEQEANDEKYDKDVERIQ NLADSGAISEEEAEARKRAAKERTEAKNAELEKQKQEMARKQAIWEKATSVAQAGIATAL AITEALPNIPLSIVIGAMGAIQVATILATPIPSYADGTQGNDRHPGGAALVGDAGKHEVI MYSGKAWITPDTPTLVDIPKGAQVFPDVDKVDISNFDIPDWDFPTFSPTYFASSSGDTIV FNDYSRLEKRVDRTNFLLMKSLKMQRQDASNREFELYKLSKLK >gi|222159318|gb|ACAB01000041.1| GENE 36 31758 - 32339 311 193 aa, chain + ## HITS:1 COG:no KEGG:BVU_2826 NR:ns ## KEGG: BVU_2826 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 192 1 188 189 144 42.0 1e-33 MIERLNQITLSDFIELSCGNYACLLSDCKSMSESTLKEIASKLLVEYRSIVNPSNMKAMV MDKEDMLKERAKLLSLRICQALVSLGFYDDVRQVLGQLNVDTRNMSDEQVISKIDYLLHS AIFEQKRNEERRSEEHKGSKATPEQIRSSFDAEIAFLMTFFKMSIDSRVISAAVYANIVH QADVEISIRKRST >gi|222159318|gb|ACAB01000041.1| GENE 37 32447 - 32734 95 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160885933|ref|ZP_02066936.1| ## NR: gi|160885933|ref|ZP_02066936.1| hypothetical protein BACOVA_03938 [Bacteroides ovatus ATCC 8483] # 1 95 1 95 95 167 100.0 2e-40 MNRKNSIHCINRHLYNVLLSELRTLETKCNRITAEVSEVKKMIALLPPDIGTLISSIERS AKEMHEQSIMHRKYVERCINGEPKIHLIRRADNGL >gi|222159318|gb|ACAB01000041.1| GENE 38 32724 - 33197 159 157 aa, chain + ## HITS:1 COG:SMc01419 KEGG:ns NR:ns ## COG: SMc01419 COG1595 # Protein_GI_number: 15965855 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Sinorhizobium meliloti # 3 156 22 173 184 70 27.0 1e-12 MDFEKELSEIYPWILKVARKFCCSMQDAEDLAGDTVYKLLVNRDKFDCSKPLQPWCLIIM RNTYIIRYNRNSLIHFTGLDMVDGSAISNCTAHSILFDDLVSTIQRCAKKSRCIDSVMYY ASGYSYDEISEILNIPVGTVRSRISSGRKQLCQELKY >gi|222159318|gb|ACAB01000041.1| GENE 39 33282 - 34085 216 267 aa, chain - ## HITS:1 COG:TM1828_1 KEGG:ns NR:ns ## COG: TM1828_1 COG0117 # Protein_GI_number: 15644572 # Func_class: H Coenzyme transport 
and metabolism # Function: Pyrimidine deaminase # Organism: Thermotoga maritima # 10 126 5 123 144 88 47.0 1e-17 MEQKLSNVDLMKIAIEEQSKCTSFPKVGAVIAKDGIILAKAFKDEESSKHAERIAIEKLD KSTLNGATLVTTLEPCINIANNQPLQSCTDLIIESGIKDVIIGILDPNGAIYCQGYEKLL ENNINVSFFTPKLRNKIESSTFIYGDCNIGYGSGIRRVAVIGSGKNFEIKFSEKDNRSIK FRWCTLQYVHGIVDLMGPNESIRSAKGAQKFEDITDPFVFREPSHFARMKVGDIAIISPT DSTFVILIKLLEMTETDITFQWQVRNR >gi|222159318|gb|ACAB01000041.1| GENE 40 34426 - 36417 1046 663 aa, chain + ## HITS:1 COG:no KEGG:BVU_2822 NR:ns ## KEGG: BVU_2822 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 400 1 423 892 146 29.0 3e-33 MLCKYVLTVDSISYDIPKSCIQNWDEIKFSRKRSGLEGITRTFTSKFQFVGEAYDLILEE YLSKYLASNASITVYTITNSHTYEEFFSCRLDFGSLTYDGNTVSINSIDDSVANIIKANK GTQYEYSVDEIKDTYQLYYDSVSMNYSQPHTLGGNTVENDASLQYIVIDKGIYVEAITYS LPLYISGGELPSRDSPLEFYDVPQESKDDPNVFVKALSDIDIVLNFSFEYYISYSDAYTT KAEIVLGGRYEDGRLVELKRWGYNKGDVTPSNLNESIKIHLTKGQALFFDLNVTFNRVNA STGNIYFRNFKFETRFTSRANPIYVDAIRPIDVLNRLLKSMNGGNEGIYGEIASGVDERL DNCVILAAESIRGIPQAKLYTSYTKFKNWMETVFGFVPVINGVTVFFKHRDKLFSDNNVK DLNSSFSSFEYKVDSSRIYSLVRVGYDKQDYESMNGRDEFRFTTEYTTGIDITDNVLELI SPYRADVYGIEFLSQKRGQDTTDSESDNDVFFVCASTTLHDNGGVQTYKEYRLIRSGWEI SGVLDPETMFNTMYWQGGILQANAGYIGMFTKKLSYSSSDGNSDVVVNGIGMKDDFNVES GIITCGDVSFTTYNEDIPPTDDETIKILKDDLVYEGYIKEVSSTVERNEGVKYDLFVRSI TKA >gi|222159318|gb|ACAB01000041.1| GENE 41 36422 - 37735 597 437 aa, chain + ## HITS:1 COG:no KEGG:BVU_2821 NR:ns ## KEGG: BVU_2821 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 266 3 266 275 273 48.0 9e-72 MIISPFTPLFFSPSTDKFGAKSKYVQLFARTDRIFVELILTAKEQEPIVYINNLLSNIST PVSLSSWKMNDDKILYFYNISLLPCGYYTVTVNGNTSEIFKVTDDECELSETSLIQYSMK DNKQRLDAVWWIDGMQYFFDFRVPGGFKDNGWTFGVDNEQFVTSDEDIVELFSHEYTTVL FTLGNGMGCPVWFAELLNRVLCCNYVYFDGVRYTRKESNVPELNQQIEGLKSFVFNQMLQ KVRTMNPVLEWNNQLAMRCVQSGAYRIADDEGMRSIKYGSESEVAEVGAYINMTKAIPNT GVSINSDTMVTVNSIHHPGVDENSYWDLIAIKTTDIDNKYIGRRGYGKLTVNGLDRLKND 
LDNGSINLRAVLYKGDSYTNLIEGSVISRDGVCVLKGINGGDIGALKEFQLYLDNVYDCD IDNLGMTIELVWVYEND >gi|222159318|gb|ACAB01000041.1| GENE 42 37747 - 43920 3804 2057 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716224|ref|ZP_04546705.1| ## NR: gi|237716224|ref|ZP_04546705.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 2057 1 2057 2057 4036 100.0 0 MTETEKQQIISLVLQALKTNSLTIEQLTDTTELSKDMYVEVSGGRKISIDLLSSTIAKMV NGDFDALVENVNKIAKDLSDGDAELLKRITGVSDKSNPLTDPFKSIGSFTTIGSFKDKLK TMYSGDSSIGNYRCILSVDSSKIPVNIQIERLELDKVCQSFTSCIQLATMSDNAEGVYLG TVCTISRIGIVSNESVTWGKWTSVINDFEERIGKANGIAPLNEESKVPSECLPEPLSLGE GEEEAFPGNRGKSLEDTMKNIPSDIIKPGSFSVLSDASYLNVYFKKVSKTTGKETDDSFR LPSATLEQAGLLSAEDKQALEDMKSGTPADDVTHPIVIVDEIRPLKDGYYTLETAIAAIV SYQQESGVKYERTGLIITYKTGEYEMETRQFQGAVSDFATPSLWKPFGNGGGGSVVETSD EPAEGGKDAFSTGGAYAYVPANLDVNVETEGIVKLQMKNAAGETLGDEVQFAIGTGGGGQ TGGTIVAIAFQSTPVYGSYGSTLRTFAAIRSVTSNGVESSDNLIEKLELVDRESGLTVWT ETVNKASSGDMKDFSFELDFTAYFTAAGTRKFKLIATDESGNTGSKNVNVTAVDITCTCV QVLNYTPETLLTPTTESFSLPLYKFGNNTSDKGISAQVDIKINGEWQSLSTAVVNDNYSH SVVIRPASLGLEHGTYSLRIQGTDVASGVKGNVIYTAVMVIDPNSSTPLVALRYDDKNGG VVRLYETVELDVACYDPLEMTSPVSVKANDVQVTQIAASRNKTYQVKQQLQGYKADGTDT VNYTAVCKDVTSEPVRVTVSGSAIDAAIKEGAIYNFDFSSRTNQETDHSIVSGNYEMKVD GANWTTNGFGTFLGENCLRVAENVGVSLNHAPFAGSSIESNGAAIQFAFASKNVTDDDAL LLSCYDETSGAGFYVTGRVVGIFCNNGVSRREERAYRQGEKITVAVVVEPANNYVERDGT RYSMMKLFLNGEEVACLGYVPGGGSLIQTKYITMDGKLGDLYLYYMMAWNSYMEWAQAFK NYLVRLTDTEVMVKEYAFEDILKSQTAEGSTQSRPSAAEIYSRGMPYIVECPYEGSDIEA LDGTTSTSTKIYITLYYFDPERPWRNFKAVSVQTRNQGTTSAKRPVKNKRYYLAKSKGKN KDTRIILLNPDDTTEEGRRAIALAAINKVQVGDNTIPVDVITVKVDYSDSGNANDCGACE MMNVTYRALGGNYMTPVQRAFDGTFDSGDLHIEDLQMNHSTANHPVATYRCKDDSLQNVY FHAKGNWKEDKGEQFALGFKDTPGYNKGCLNYGDFIEFFGTPDETLDAIEIRFKQTDGVD TDSVYLLSLYCGSSYRIMRYQDSSWKKQSGSMKYENGKWNVTGDVLNPVEGFELLNYQGM DWFQGVGSVQDMMAMKTDKSSWVQKLVDNGTISADTFPAWTYYFESLVDDDQLAIDYALG KKVPYNLYRWLRFCDSCDYSKGGNWQRTWKENLYKYACPESVLSYDIFTDYLAATDQRAK NMQPMWFLEEYASVTDGVYSSEDAMRMYLNKIYDCDTLNSKDNDGGCTVDTEVDPNRTSD ETFTNPYAGYGSVLFNNIYLQQVVWTDSSGTELSLRTVAAAMRNVQATIDGVTLHPFSPE 
GATHFFIDKRLKKWQKLVSSYDGERKYISYTATSDAIYFYALQGLGLTALPSFIERRWRI RDGYFQTGDFFSGVISGRVSSKSNATIRIVAAKNGYFGVGNDASGNLSESCFLEAGEEYV FTNFSHEEGALLYIYQADRMKLLDLSEISLSSTVSFSAMQLVETLILGSDTHTEQSIGSY APLTSLNCGEMPFLVSLDIRNTQIATLVTDKCPRIAHINASGSKLENITLAETSPINDIS LPPTMTSLRFVGLPELTYTGLSAPSGLQIESMPNVQRLRLETSPQLDAIQMLRDVLASQA ASRKLSMLRISNMTLKADGSELLAILEYGVAGMDEDGNRQDKPVVNGTYELTVIRETDEI ESLESGIDGLVILTVIDAYIDLINWFNNESYGGEPYYDNVTLDNINEVLEYYNGETYEEY LERFAEDNMDINDLINK >gi|222159318|gb|ACAB01000041.1| GENE 43 43923 - 44711 673 262 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716225|ref|ZP_04546706.1| ## NR: gi|237716225|ref|ZP_04546706.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 262 1 262 262 472 100.0 1e-132 MTNEQSATLLRLNKQAQVAALNAVGFSDITENSRASEFGQRIKWAAGLFDLTLACNRISD NSKAYFTAAEWNSLTLANKQLYIKRGLRIRAHGHSFVIAAQECYNADMTTTFYWGGQGKA IDGLNQKGLGAMYGCFTGEEDTELIITGLKDQNNSGVIGAPAAEAARAYRAYTLESDGIE DESNWFLPSSGQMLLMYRYRDKINEMMRTFWSSDSMLMTDKYYWSSTIWDVNSAWAFELN TGRITNQNKNSNLLHVRAVASE >gi|222159318|gb|ACAB01000041.1| GENE 44 44737 - 45546 441 269 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262407834|ref|ZP_06084382.1| ## NR: gi|262407834|ref|ZP_06084382.1| conserved hypothetical protein [Bacteroides sp. 
2_1_22] # 1 269 1 269 269 504 97.0 1e-141 MDKNIASAMLLRLNKQDQIEALQSIGFTTVNENTPASDIAKYMQWSGTLLDLSLATLRIE DGEQVFFTASEWNSMSANNRSKYIRIGIRLRAECHQFIIAKSDCVDAGGNKTFKWGGYGT DLRGLKNYGSGNQGLYDTFDGKENTDVIIETLAGVKDTQGTVGAPAAEVARAYKACTLES DGIEDTTVWNLPALGELMLMAKYKTEINELITSMLGNQNIFTNDWYWSSTEYDASSSWNV SFSGGSVSSGLRQGAGRVRPLAAINTLSL >gi|222159318|gb|ACAB01000041.1| GENE 45 45693 - 46100 193 135 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160885941|ref|ZP_02066944.1| ## NR: gi|160885941|ref|ZP_02066944.1| hypothetical protein BACOVA_03946 [Bacteroides ovatus ATCC 8483] # 1 135 1 135 135 262 100.0 5e-69 MALTQDLPISNSMYKLLNLIIDARQQFPKAFRYEFGTELMMLAVHCCEYIRYANTDMNLE HRADYLMKFLCEFDALKLLLRVCEERHLTSLTQTAEICLLAESIGKQSTGWYKKTVADLQ RQKANGSQQVAKPES >gi|222159318|gb|ACAB01000041.1| GENE 46 46115 - 47536 351 473 aa, chain + ## HITS:1 COG:alr3497 KEGG:ns NR:ns ## COG: alr3497 COG3344 # Protein_GI_number: 17230989 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Nostoc sp. 
PCC 7120 # 58 426 5 347 352 99 26.0 2e-20 MSEQLELFIGHPPGDEPGKTKIADATASSSWNVNFNNGNVSTGNRQGAGRVRPLAATGNI IYDILLSSIFEASEDCARQKRTSTDCVEFYNDYQSALVRLWYSIIYGEYVPDFSKVFIRT YPVYREVFAAAFIDRVVHHWIALRIEPILEERFREQGNVSKNCRKGEGCLSAVHYLNNMI VEVSEHYTADAYIFKDDLFSFFMSISKSLVWEMLNIFVRDNYKGDDIECLLYLLAVTIFH CPQNKCIRRSPVSMWDKLPSNKSLFHNDPDRGVAIGNLPSQLIANFLASVYDYFVMEILG FRHYVRFVDDFCIVVKSPEEILSKVHLLDGFLKEQLLLRLHPRKLYLQHYKKGVLFVGAF ILPGRIYVSNRVVGNTYNAVRKFNRIAENGFAEAYVEKFVSTMNSYYGLMKHFATYNIRR KIAAMLLPEWWEYVYIEGHFEKFVLKNKYNHRKQLIKHIKKHGSKKYLTAWDC >gi|222159318|gb|ACAB01000041.1| GENE 47 47499 - 47777 291 92 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160885943|ref|ZP_02066946.1| ## NR: gi|160885943|ref|ZP_02066946.1| hypothetical protein BACOVA_03948 [Bacteroides ovatus ATCC 8483] # 1 92 1 92 92 183 100.0 4e-45 MDQKNILPRGIAKPIEQQPDGTWIVRHHFRVVGTSENGEELVTFASSEYPEKPTLQQIQR SIDRYRVCLTMYGDTISDEIEKVDLSVYMFTD >gi|222159318|gb|ACAB01000041.1| GENE 48 47975 - 48268 229 97 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294807657|ref|ZP_06766450.1| ## NR: gi|294807657|ref|ZP_06766450.1| hypothetical protein CW3_0225 [Bacteroides xylanisolvens SD CC 1b] # 1 97 1 97 97 147 100.0 3e-34 MSMGIKVLYDWLLQSNRPAHVKAGMFVFVVMLIFCFLLLGIDFCKSAIVSLTTTAIAAIV VEYIQKKCGFIFDWLDALATVLLPGLITVFSILVVTL >gi|222159318|gb|ACAB01000041.1| GENE 49 48277 - 48792 295 171 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262407839|ref|ZP_06084387.1| ## NR: gi|262407839|ref|ZP_06084387.1| conserved hypothetical protein [Bacteroides sp. 
2_1_22] # 1 171 1 171 171 317 100.0 2e-85 MRWLYELFNVDQIRIIFVSMFSSLLAYLTPTKGFLIALVVMFGFNIWCGMRADGVSIIRC KNFKWDKFKNALVELLLYLIIIEVVFSFMSLIGDGENSLLVIKTITYVFSYVYLQNAFKN LIIAYPRNKGFRIIYHVIRFEFKRATPTHVQGIIDRIENELDKEERYENID >gi|222159318|gb|ACAB01000041.1| GENE 50 48776 - 49354 512 192 aa, chain + ## HITS:1 COG:BH1294 KEGG:ns NR:ns ## COG: BH1294 COG0860 # Protein_GI_number: 15613857 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus halodurans # 2 188 3 173 253 60 28.0 1e-09 MKILIDNGHGSNTPGKCSPDGRLREYSYTREIAGRVVFELRKLGIDAELVVKEEIDVPLS ERCRRVNEYKTSDAILISIHCNAAGNGSNWMQARGWEAWTSVGQTKADKLADCLYATAEE CLFGMKIRKDMADGDPDKESSFYILKHTKCPAVLTENLFQDNKEDVDFLLSEEGKRTIVS LHVKGICKYLKV >gi|222159318|gb|ACAB01000041.1| GENE 51 49375 - 49857 250 160 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716233|ref|ZP_04546714.1| ## NR: gi|237716233|ref|ZP_04546714.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 160 9 168 168 288 100.0 7e-77 MFLMSGIWFTSCKTSHSIESQKQIDYSGDFLYLRNLIESLQLDVNKQTKVTTDKLSDLKI ENKTVYLSLPDSTGKQYLVKESTTTASKQEQERSEVDETLSITLQQFSNRLDTINNKVNA LLNQREKVVELSWWDLHKDKVYCYVIGLILAGWLGCKFKK >gi|222159318|gb|ACAB01000041.1| GENE 52 50016 - 50630 283 204 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716234|ref|ZP_04546715.1| ## NR: gi|237716234|ref|ZP_04546715.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 98 204 1 107 107 204 100.0 2e-51 MAKIKNVAETAKRKRIINAKECEYELRESLEKLFDAFWNAVRNYEKEVIQTPFTARCRGF EASLLNSKIIQSVQSVFKDDWTFGKYKRFMLRVNGYIMLFKKLNSKNMPMNVPTRFSSSI QNQEQGYLFDMYDNGTEPILFFGYNKSRFGEIINPKLVYIDENRVRWTISENDIFTVNRT MDVQPAAASLSVRQNIKKKEGTNN >gi|222159318|gb|ACAB01000041.1| GENE 53 50685 - 51746 472 353 aa, chain + ## HITS:1 COG:MT2073_2 KEGG:ns NR:ns ## COG: MT2073_2 COG2856 # Protein_GI_number: 15841500 # Func_class: E Amino acid transport and metabolism # Function: Predicted Zn peptidase # Organism: Mycobacterium tuberculosis CDC1551 # 70 296 5 225 279 114 32.0 3e-25 MEINYKQIIFAREYRGYSQTELASKIVGLSQSNLSKYEKGIGPLSTDVLNRIIDFLGFPT DFYEKKISNIAENAHYRRKKGMTKNERSQIDLSNKLLGYIVDQMGESVEFPDMSFRMIDL EDGYTPETVAQYTRKYLGLKDEPVRNIFSLLERNGIIIIELDYDVDLFDGVSFLTDGGYY VIIINKNFSNDHKRFTLAHELGHLIMHTSNEFLISEYRDKEDEANRFASEFLMPSDAISN SLRGLKLQYLVELKRYWLTSMASIVRRAKDLKCITNEKYKYFSIELSRRGYRKSEPVSVY IDMPNMYNEAYKLHKNELEYSNEEMATAFSLPIDVLTRFCCPTKTNLKLRLSI >gi|222159318|gb|ACAB01000041.1| GENE 54 51776 - 52039 324 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716236|ref|ZP_04546717.1| ## NR: gi|237716236|ref|ZP_04546717.1| predicted protein [Bacteroides sp. 
D1] # 1 87 1 87 87 159 100.0 5e-38 MAKTIKRFTYAVKDKYDNMVTVYARIEKEGGLYYWYTSHLTKPQDADGIGIYNPSNVESN LDTAEAFLKVYISMMKDSKVIVPNNHY >gi|222159318|gb|ACAB01000041.1| GENE 55 52322 - 52531 172 69 aa, chain + ## HITS:1 COG:no KEGG:AAur_3080 NR:ns ## KEGG: AAur_3080 # Name: not_defined # Def: hypothetical protein # Organism: A.aurescens # Pathway: not_defined # 1 66 1 66 274 84 59.0 1e-15 MTLHEAIIKLLKEKGTPMTTTEIANALNENKWYLKKDKSEITPFQIHGRTKNYDKLFRRL GNTVYLVTD >gi|222159318|gb|ACAB01000041.1| GENE 56 52659 - 53465 468 268 aa, chain + ## HITS:1 COG:no KEGG:PCC7424_0225 NR:ns ## KEGG: PCC7424_0225 # Name: not_defined # Def: hypothetical protein # Organism: Cyanothece_PCC7424 # Pathway: not_defined # 1 265 101 365 370 126 28.0 1e-27 MNIYEEGQIKDIKIWKATEPSVVSKAVGYMTKPIVLVANRIIPQKAIQGALTGSNTLAKL ITDEQDIIRDAKVNSIIELKTKDLQLSDKLADEVHNWALAGLGLEGGVAGFFGLPGMFVD IPMVVTIALRTIHKIGICYGYKALTLEDTQFVYSIMSVAGSNSMQEKNMALVTLKQLNVI LVKQSWKKMAEKATVNKYCNEAILITIKSLAKQLGINITKRKAIQAIPLIGAGVGATMNI AFINDICWAARRSFQERWLIDNGKVQSI >gi|222159318|gb|ACAB01000041.1| GENE 57 53895 - 55007 503 370 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716239|ref|ZP_04546720.1| ## NR: gi|237716239|ref|ZP_04546720.1| predicted protein [Bacteroides sp. 
D1] # 1 370 1 370 370 700 100.0 0 MIRMKRNVKGWLVWGTVILILIIIVILAFYFIQTKGKFADKQTDWGEFGSLLGAIAGLIA FVGVLFTLRQNKQQFLNSEDRAVFFELLRIFISYRDALRVKRIDWVYDEKQCEWKITPYN EFCTPEKTYRQIYVELYHTFYLEIRRGIPENFSKEEFVRRIIPQNMSKEQWMFIYGQLNA AINNIYSEHEFGIHKGKINIYPVHINTYDYLCLNAIKIYFEQNNFKPIAEACAKAADHCF APYKNQLGTYFRNAYYILEMTSEFTSPLKYSNIFRAQLSKYELVLLFFNSFSSLSTIETR RLYLNADLFNNLELKDVRLKEGINDESVSRRMEYIHFPPVLFQKANKNEYMSSNLLEKLY NVTLSENNIL >gi|222159318|gb|ACAB01000041.1| GENE 58 55180 - 55425 115 81 aa, chain - ## HITS:1 COG:no KEGG:BF2302 NR:ns ## KEGG: BF2302 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 4 80 63 139 140 77 51.0 2e-13 YIERAEEAILRDKLGELPEAISFSYIAKKYFGKSRNWLYQRINGNIVNGKKARFTDNELK TFLNALNDVSEMIHQTSLKIS Prediction of potential genes in microbial genomes Time: Wed May 18 02:11:12 2011 Seq name: gi|222159317|gb|ACAB01000042.1| Bacteroides sp. D1 cont1.42, whole genome shotgun sequence Length of sequence - 29683 bp Number of predicted genes - 26, with homology - 25 Number of transcription units - 14, operones - 6 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 331 185 ## gi|293372962|ref|ZP_06619331.1| conserved hypothetical protein 2 1 Op 2 . - CDS 373 - 543 269 ## - Prom 580 - 639 5.5 3 1 Op 3 . - CDS 641 - 1825 396 ## BVU_2806 hypothetical protein - Prom 2002 - 2061 5.6 - Term 2180 - 2216 1.9 4 2 Op 1 9/0.000 - CDS 2232 - 2930 731 ## COG3279 Response regulator of the LytR/AlgR family 5 2 Op 2 . - CDS 3027 - 4088 870 ## COG3275 Putative regulator of cell autolysis 6 2 Op 3 36/0.000 - CDS 4127 - 5347 417 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 - Prom 5389 - 5448 6.5 7 2 Op 4 24/0.000 - CDS 5488 - 6231 254 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 8 2 Op 5 13/0.000 - CDS 6280 - 7500 1135 ## COG0845 Membrane-fusion protein 9 2 Op 6 . 
- CDS 7547 - 8944 1425 ## COG1538 Outer membrane protein
- Prom 9107 - 9166 11.6
+ Prom 9063 - 9122 7.4
10 3 Op 1 . + CDS 9163 - 9399 221 ## BT_1211 hypothetical protein
11 3 Op 2 31/0.000 + CDS 9420 - 10967 1718 ## COG1271 Cytochrome bd-type quinol oxidase, subunit 1
12 3 Op 3 . + CDS 11001 - 12143 887 ## COG1294 Cytochrome bd-type quinol oxidase, subunit 2
13 4 Tu 1 . - CDS 12392 - 13348 900 ## COG1052 Lactate dehydrogenase and related dehydrogenases
- Prom 13396 - 13455 4.9
14 5 Op 1 . - CDS 13496 - 14767 1144 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase
15 5 Op 2 . - CDS 14774 - 17350 2307 ## BT_1204 putative outer membrane protein
- Prom 17384 - 17443 4.7
- Term 17664 - 17702 2.2
16 6 Tu 1 . - CDS 17717 - 18802 1027 ## COG0381 UDP-N-acetylglucosamine 2-epimerase
- Prom 18874 - 18933 5.3
+ Prom 18747 - 18806 4.0
17 7 Tu 1 . + CDS 18895 - 19518 683 ## COG2860 Predicted membrane protein
18 8 Tu 1 . - CDS 19717 - 20214 562 ## BT_1200 hypothetical protein
- Prom 20320 - 20379 7.6
19 9 Op 1 3/0.000 - CDS 20397 - 21197 658 ## COG0501 Zn-dependent protease with chaperone function
- Prom 21247 - 21306 3.5
- Term 21294 - 21333 5.2
20 9 Op 2 . - CDS 21405 - 21965 772 ## COG1704 Uncharacterized conserved protein
- Prom 22173 - 22232 5.1
+ Prom 21943 - 22002 9.1
21 10 Tu 1 . + CDS 22205 - 22708 556 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
+ Term 22743 - 22789 9.0
- Term 22734 - 22774 4.4
22 11 Op 1 . - CDS 22979 - 23722 460 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein)
23 11 Op 2 . - CDS 23700 - 24416 895 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins
- Prom 24447 - 24506 4.0
- Term 24452 - 24496 4.4
24 12 Tu 1 . - CDS 24579 - 25931 1158 ## COG0534 Na+-driven multidrug efflux pump
- Prom 26000 - 26059 5.1
+ Prom 25832 - 25891 3.4
25 13 Tu 1 .
+ CDS 26023 - 27408 1344 ## COG0657 Esterase/lipase + Term 27434 - 27490 11.6 + Prom 27461 - 27520 5.1 26 14 Tu 1 . + CDS 27601 - 29472 1433 ## COG0642 Signal transduction histidine kinase + Term 29501 - 29541 6.4 - TRNA 29543 - 29618 70.7 # Lys TTT 0 0 Predicted protein(s) >gi|222159317|gb|ACAB01000042.1| GENE 1 1 - 331 185 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293372962|ref|ZP_06619331.1| ## NR: gi|293372962|ref|ZP_06619331.1| conserved hypothetical protein [Bacteroides ovatus SD CMC 3f] # 1 110 1 110 149 190 100.0 3e-47 MIDWNDCLPTKEMQADFERFKELKTTEEKEAFKKEMQDKYNKLPEAQKEAYKKASEAGLK ATVNACNDYIERAEEAILRDKLGELPEAISFSYIAKKYFGKSRNWLYQRI >gi|222159317|gb|ACAB01000042.1| GENE 2 373 - 543 269 56 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKDEEKKELEQEYENLKLLASFHEAYGVPENAKEREALINDILDRMNEIQEKLKKL >gi|222159317|gb|ACAB01000042.1| GENE 3 641 - 1825 396 394 aa, chain - ## HITS:1 COG:no KEGG:BVU_2806 NR:ns ## KEGG: BVU_2806 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 392 1 393 393 518 66.0 1e-145 MATLTLVIVPAKRLSDGTHKIRIRVAHNSETRFITTDIVVRENEFKNGKIVHRPDKDFLN TKLQQLYNLYFKRYMELDYPDSLTCTQLVKMITNPLNGEKHRKFEDIVDEYLSQIDEEER TKTYKLYRLATNKFMQFIGNGSLMEHITPIRMNQYISWLKKTKLSSTTINIYITLLKVII NYAIKMRYVTYDIDPFITARIPSAQKRETQITVEELKTIRDANLEHYNLNVTRDIFMLTY YLAGMNLVDILAYDFRTDEINYIRKKTKNTKEGDSLISFSIPEEAKPIIKKYMKKNTGKI IFGKYKNYTSCYNLLARKISQLGKVAGIRHKFTLYSARKSFVQHGYDLGIPLSTLEYCIG QSMKEDRPIFNYVTIMRKHADKAIREILDNLKNE >gi|222159317|gb|ACAB01000042.1| GENE 4 2232 - 2930 731 232 aa, chain - ## HITS:1 COG:FN0219 KEGG:ns NR:ns ## COG: FN0219 COG3279 # Protein_GI_number: 19703564 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Fusobacterium nucleatum # 2 204 1 207 240 99 32.0 4e-21 MMLRCAIVDDEPLALSLLESYVNKTPFLQLAGKYSSAVQAMKELPGEEVDLLFLDIQMPE LNGLEFSKMVDPHTRIVFTTAFGQYAIDGYRVNALDYLLKPISYVDFLQAANKALQWFEL VQKPEEIDSIFVKSDYKLVQVDLKKIMYIEGLKDYIKIYTEDAPKPILSLMSMKAMEELL 
PSSRFIRVHRSFIVQKDKIRVIDRGRIVFDKTYIPISDSYKQVFQTFLDERS >gi|222159317|gb|ACAB01000042.1| GENE 5 3027 - 4088 870 353 aa, chain - ## HITS:1 COG:ECs3260 KEGG:ns NR:ns ## COG: ECs3260 COG3275 # Protein_GI_number: 15832514 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Escherichia coli O157:H7 # 132 326 334 531 565 101 33.0 2e-21 MKQTFTSARRPLEVLIHIISWGIMFGFPFFFVERGNGNINWMAYTRHLAVPLSFMIVFYV NYFILVPRYLFQSQAKRYVVYNIIFLCAIGVLLHLWQSLTFDPSFAPKSKRPGMPPGWLF FLRDMLSLVFTIGLSAAIRMSARWTQNEAARKEAERNRAEAELKNLRNQLNPHFLLNTLN NIYALIAFDSDKAQQAVQELSKLLRYVLYDNQQTYVPLCKEVDFIRNYIELMRIRLSANV QMITKFDIQPDSQTLIAPLIFISLIENAFKHGISPTESSFISIHILENDNEVVCEIRNSN HPKTVEDKSGSGVGLEQVSRRLEILYPGAYTWLKGVSKDEKVYESRLSIKIRE >gi|222159317|gb|ACAB01000042.1| GENE 6 4127 - 5347 417 406 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 7 406 9 413 413 165 29 4e-40 MNGTNLLKIALRALANNKLRAFLTMLGIIIGVASVITMLAIGQGSKKSIQQQISEMGSNM IMIHPGADMRGGVRQDPSAMQTLKLADYEALRDETSFLSAISPNVSSSGQLIAGNNNYPA SVNGVGTEYLDIRQLTVENGDMFTEADIQSSAKVCVIGKTIVDNLFPDGSDPVGKIIRFN KIPFRVVGVLKAKGYNSMGQDQDAVVLAPYTTVMKRLLAVTYLQGVFASALTEDMTDYAT DEISTILRRNHKLKASDNDDFTIRTQQELSTMLNSTTDLMTTLLACIAGISLVVGGIGIM NIMYVSVTERTREIGLRMSVGARGVDILSQFLIEAIMISITGGIIGVIIGCGASWIVKSV AHWPIYIQPWSVFLSFAVCTVTGVFFGWYPAKKAADLDPIEAIRYE >gi|222159317|gb|ACAB01000042.1| GENE 7 5488 - 6231 254 247 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 224 1 221 223 102 30 3e-21 MKKVIEIQNIKRNFQVGDETVHALRGVSFNINEGEFVTIMGTSGSGKSTLLNILGCLDTP TSGEYLLDDIPVRTMSKPQRAVLRNRKIGFVFQSYNLLPKTTAVENVELPLMYNSAVSAS ERRRRAIESLQAVGLGDRLEHKSNQMSGGQMQRVAIARALVNNPAVILADEATGNLDSRT SFEILVLFQKLHAEGRTIIFVTHNPELSQYSSRNIRLRDGQVIEDTANPKILSAAEALAA LPKNDED >gi|222159317|gb|ACAB01000042.1| GENE 8 6280 - 7500 1135 406 aa, chain - ## HITS:1 COG:AGc3332 KEGG:ns NR:ns ## COG: 
AGc3332 COG0845 # Protein_GI_number: 15889118 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 2 371 37 432 437 196 32.0 5e-50 MKTKKIILIAVVVVVVVGAGIWLFAGSPAKHKVTYATATVSKGDISNSVTATGTIEPVTE VEVGTQVSGIIDKIYVDYNSVVTKGQLIAEMDRVTLQSELASQQATYDGAKAEYEYQKKN YERSKGLHEKSLISDTDFEQALYNYQKAKSSYDSSKASLAKAERNLSYATITSPIDGVVI SRDVEAGQTVASGFETPTLFTIAADLTQMQVVADVDEADIGGIIEGQRASFTVDAYPNDV FEGIVTQIRLGDASSTSSTNSTSTVVTYEVVISAPNPDLKLKPRLTANVTIYILDKKDVL SVPNKALRFTPEKPLIGNNDIVKDCEGEHKLWTREGTTFTAHPVEVGISNGISTEIISGI PEGTKVVTEATIGVMPGENMGPEGNMENSGERSPFMPGPPGSKKKK >gi|222159317|gb|ACAB01000042.1| GENE 9 7547 - 8944 1425 465 aa, chain - ## HITS:1 COG:RSc1854 KEGG:ns NR:ns ## COG: RSc1854 COG1538 # Protein_GI_number: 17546573 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Ralstonia solanacearum # 112 439 126 446 475 79 21.0 2e-14 MNMINVRRLTVMTFLGASMLSGISAQVSPLQVDTLKETKIPDQWDLQSCIDYAKEQNITI RKNRITAVSTQIDVKTAKAALFPSLSFSTGQQVVNRPYQETSSRVSGSEIISSNSKTSYN GNYGLNASWTLYNGNKRLKTIQQEKLNNQMAELDVATSENNIQESIAQVYIQILYAAESV KVNENTLQVSIAQRDRGQELLNAGSIAKSDLAQLEAQVSTDRYQLVTAQATLEDYKLQLK QLLELDGENEMNIYLPALSDENVLAPLPTKRDVYISALSLRPEIEASKLNVEASELGINI AKSSYFPTISLSAGIGTNHTSGSDFTFGEQVKNGWNNSIGLSVSVPIFNNRQTKSAVQKA KLQYETSILSLLDEQKALYKTIESLWLDANSAQQRYAAANEKLKSTQISYELISEQFNLG MKNTVELLTEKNNLLQAQQEQLQAKYMAILNTQLLKFYQGDQLAL >gi|222159317|gb|ACAB01000042.1| GENE 10 9163 - 9399 221 78 aa, chain + ## HITS:1 COG:no KEGG:BT_1211 NR:ns ## KEGG: BT_1211 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 78 1 78 78 130 88.0 2e-29 MKNTLLTIWNFYLEGFRSMTLGRTLWVIILLKLFVMFFILKMFFFPDFLGDHPTDADKGT YVGNELIERAIPDKSTDF >gi|222159317|gb|ACAB01000042.1| GENE 11 9420 - 10967 1718 515 aa, chain + ## HITS:1 COG:Cj0081 KEGG:ns NR:ns ## COG: Cj0081 COG1271 # Protein_GI_number: 
15791471 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 1 # Organism: Campylobacter jejuni # 6 515 3 507 520 548 55.0 1e-156 MIESIDTSLIDWSRAQFAMTAMYHWIFVPLTLGLAVVMGIMETLYYKTGNEFWKKTAQFW MKLFGINFAIGVATGLILEFEFGTNWSNYSWFVGDIFGAPLAIEGILAFFMEATFIAVMF FGWGKVSKRFHLASTWLTGLGATISAWWILVANAWMQHPVGMEFNPDTVRNEMVDFWAVA TSPVAVNKFFHTVLSGWVLGAIFVVGISCWYLLKKRNREFALASIKIGAIFGLVASLLSV WTGDGSGYQIAQTQPMKLAAVEGLYEGGTNVGLVGIGVLNPEKKTYNDGKDPFLFRFEIP SMLSFLAERNVDGYVPGITNIIEGGYQLKDGSKALSAAEKIERGKTAIGALAAYRAAKSA GHEEDAKIAYNVLQENIPYFGYGYIKDVNQLVPNVPLNFYAFRIMVILGGYFILFFIVVL FFIYKKDLSKMRWMHWIALLTIPLGYIAGQAGWVVAECGRQPWAIRDMLPTMAAISKLDV SSVQTTFFIFLLLFTVMLIAGVGIMVKAIKKGPDA >gi|222159317|gb|ACAB01000042.1| GENE 12 11001 - 12143 887 380 aa, chain + ## HITS:1 COG:Cj0082 KEGG:ns NR:ns ## COG: Cj0082 COG1294 # Protein_GI_number: 15791472 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 2 # Organism: Campylobacter jejuni # 5 380 10 374 374 305 50.0 9e-83 MYIFLQQYWWLVVSLLGAILVFLLFVQGGNSLLFCLGKTEEHRKMMVNSTGRKWEFTFTT LVTFGGAFFASFPLFYSTSFGGAYWLWMIILFSFVLQAVSYEFQSKAGNLLGKKTYQTFL VINGVVGPVLLGGAVATFFTGSDFYINKANMTDTIMPVISHWGNGWHGLDALTNIWNVIL GLAVFFLARVLGALYFINNIDDKELTDKCRRAVRNNTVLFLVFFLSFVIRTLVSEGFAVN PETQEIYMQPYKYLTNFIEMPVVLALFLIGVVLVLFGIGKTLLKKTFDKGIWFAGIGTVL TVLSLLLVAGYNNTAYYPSYTDLQSSLTLANSCSSEFTLKTMAYVSILVPFVIAYIFYAW RSIDRHKITEKEMDEGGHSY >gi|222159317|gb|ACAB01000042.1| GENE 13 12392 - 13348 900 318 aa, chain - ## HITS:1 COG:CAC2945 KEGG:ns NR:ns ## COG: CAC2945 COG1052 # Protein_GI_number: 15896198 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Clostridium acetobutylicum # 1 318 1 324 324 343 54.0 2e-94 MKIVVLDGFAANPGDLSWEGMKVLGECTIYDRTAPEEVLERAAGAEAILTNKVIINADHM AALPELKYIGVLATGYNVVDTAAAKERGIVVTNIPSYSTASVAQMVFSHILNITQQVQHH 
SEEVHKGRWTNNKDFCFWDTPLMELRDKKIGLVGLGNTGYTTARVAIGFGMQVYALTSKS HFQLPPEIKKMDLDQLFSECDIISLHCPLTPDTREMVNARRLAMMKPTAILINTGRGPLI NEQDLADALNSGKIYAAGVDVLSTEPPCADNPLLTAKNCYITPHIAWATIEARERLMNIA ISNLQAYISGKPENVVNK >gi|222159317|gb|ACAB01000042.1| GENE 14 13496 - 14767 1144 423 aa, chain - ## HITS:1 COG:CAC0326 KEGG:ns NR:ns ## COG: CAC0326 COG2256 # Protein_GI_number: 15893618 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Clostridium acetobutylicum # 3 413 16 429 443 399 49.0 1e-111 MQPLAERLRPKTLDDYIGQKHLVGPGAILRKMIDAGRISSFILWGPPGVGKTTLAQIIAN KLETPFYTLSAVTSGVKDVREVIERAKSNRFFSQSSPILFIDEIHRFSKSQQDSLLGAVE NGTVTLIGATTENPSFEVIRPLLSRCQLYVLKSLEKEDLLELLQRAITTDAVLKERKIEL KETTAMLRFSGGDARKLLNILELVVQSETEETVVITDEMVTERLQQNPLAYDKDGEMHYD IISAFIKSIRGSDPDGAIYWLARMVEGGEDPAFIARRLVISAAEDIGLANPNALLLANAC FETLMKIGWPEGRIPLAETTIYLATSPKSNSAYSAINDALELVRSTGNLPVPLHLRNAPT KLMKQLGYGQEYKYAHSYEGNFVKQQFLPDELKDKRIWQPQNNPAEQKHAERMIQLWGDK FKK >gi|222159317|gb|ACAB01000042.1| GENE 15 14774 - 17350 2307 858 aa, chain - ## HITS:1 COG:no KEGG:BT_1204 NR:ns ## KEGG: BT_1204 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 858 1 858 858 1649 93.0 0 MIQRYNKILILFLLVLVASNAFAQQIKGVVTDSITHEPLMYISVYYQEKRDMGTITNIDG EYSLDTRRNGGTLVFSAVGYISKTVRVGSNNQTVNVKLAPDNVLLNEVVVKPQKEKYSRK NNPAVEFMKKVIEHKKAQVLEVNEYYQYDKYEKMKMSINDLTPEKLEKGIYKKYSFLRDQ VEVSETTNKLILPISVQETSSQTIYRKNPENKKTIIKGKNSNGIEEFFSTGDMLGTVLKD VFADINIYDDDIRLLQQRFVSPIGNNAISFYKYYLMDTLMVNKRECVHLTFVPQNSQDFG FTGHLYVLNDSTYAVQKCTMNLPKKTGVNFVNRMDITQQYEQLPNGNWVLADDDMTVDLS WNSNKTAGGLQVERTTKYSNYKFDPIEQRLFRLKGSVIKEADMLSKSDEYWASVRQVPLT KKESSMDVFVNRLEQIPGFKYIIFGAKALIENFVETGSKEHPSKVDIGPINTMISSNYID GTRFRLSGMTTAHFDKHWFLSGYGAYGLKDERWKYSGTVTYSFNKRDYVVWEFPKHYLSA TYSYDVMSPMDKFLFTDKDNIFLSVKTTTVDQMSYMRDATINYELETLTGFGVKAMLRHR NDEPTGKLEYLRNDAAQTRVHDVTTSEASLTLRYAPGESFVNSKQRRVPVSLDAPIFTIT 
HAMGFKGVLGGDYNFNRTEASVWKRFWLPASWGKIDCSVKAGAEWNTVPFPLLILPEANL SYITQRETFNLINNMEFLNDRYASMSLSYDMNGKLFNRIPLIKNLKWREMFRVRALWGTL TDKNNPFKSNNPDLFRFPTRDGKFTSFVMDPKVPYVEASVGIYNIFKLLHVEYVHRFTYR DNPGINKNGIRFMVLMVF >gi|222159317|gb|ACAB01000042.1| GENE 16 17717 - 18802 1027 361 aa, chain - ## HITS:1 COG:MJ1504 KEGG:ns NR:ns ## COG: MJ1504 COG0381 # Protein_GI_number: 15669698 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine 2-epimerase # Organism: Methanococcus jannaschii # 1 359 1 362 366 195 33.0 1e-49 MKITIVAGARPNFMKIAPITRAIEAARALGKSISYRLVYTGRKDDTSLDASLFSDLDMKA PDVYLGVESSNPTSLTAGIMVAFEQELTENPAHVVLVVDDLTATMSCAIVAKKQGIKVAH LVAGTRSFDMKMPKEVNRMITDGLSDYLFTAGMVANRNLNQTGTESENVYYVGNILIDAI RYNRNRLLKPIWFSVLGLQEGNYLLFTLNRRVLLGNKENLRQLMKTIIDKSAGMPIVAPL HTYVRNAIKELDIEAPNLHIMPPQNYLFFGYLINKAKGIITDSGNVAEEATFLGIPCITL NTYAEHPETWRMGTNELVGEDPALLAKTMDTLMHGEWKRGELPERWDGRTAERIVQILTS K >gi|222159317|gb|ACAB01000042.1| GENE 17 18895 - 19518 683 207 aa, chain + ## HITS:1 COG:VC2382 KEGG:ns NR:ns ## COG: VC2382 COG2860 # Protein_GI_number: 15642379 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Vibrio cholerae # 3 197 37 233 239 120 39.0 2e-27 MPTFVQILDFIGTFAFAISGIRLASAKRFDWFGAYVVGLATAIGGGTIRDVLLDVTPGWM TDPIYLICTGLALLWVICFGRWLIRLNNTFFIFDTIGLALFTVVGVGKSIALGYPFWVAI IMGSITGAAGGVIRDVFINEIPLIFRKEIYAMACVVGGIAYWICDLAGLESYACQLIGGS AVFLTRILAVKYHICLPILKGGEEPEE >gi|222159317|gb|ACAB01000042.1| GENE 18 19717 - 20214 562 165 aa, chain - ## HITS:1 COG:no KEGG:BT_1200 NR:ns ## KEGG: BT_1200 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 159 1 159 164 228 77.0 8e-59 MRTKKQIFMLLLALLVGIPTLSAQSKKEKKEQKKEAVKKLIESENYKIDVNTAMPMRGRS IPLTSSYSLEIRNDSVISYLPYYGRAYSIPYGGGDGLNFKAVLKEYSMEMDKKGNAVIEF IARNPEDRYEYRVKVFPNGSASIDVNMQNRQSISFQGELYIKEEK >gi|222159317|gb|ACAB01000042.1| GENE 19 20397 - 21197 658 266 aa, chain - ## HITS:1 COG:lin0962 KEGG:ns NR:ns ## COG: lin0962 COG0501 # 
Protein_GI_number: 16800031 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Listeria innocua # 6 264 39 302 304 146 34.0 4e-35 MVNHFFLSSLPYSMGIVIIWFLIAYWANTSIINSATGSKPLDRIENKRVYNLVENLCMSQ GMKMPKINIIYDSSLNAFASGINERTYTVTLSEGIIKKLNDEELEAVIAHELSHIRNRDV RLLIISIVFVGIFSMLTEITLYTITHIRVRSNSKGSGGIFLFILLALVIAAIGYLFSSLM RFAISRKREYMADAGSAEMTKNPLALASALRKISADPAIEAVERKDVAQLFIQNPKKKSK SIFSGINGLFATHPPIEKRIEILEQF >gi|222159317|gb|ACAB01000042.1| GENE 20 21405 - 21965 772 186 aa, chain - ## HITS:1 COG:lin0961 KEGG:ns NR:ns ## COG: lin0961 COG1704 # Protein_GI_number: 16800030 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 4 186 5 185 185 157 47.0 1e-38 MTILIIVGIIVLLGIIFASMYNSLVKLRNNRENAFADIDVQLKQRHDLIPQLVDTVKGYA AHEKETLDRVIQARNGAVGAKTIDDKIAAENQLSSALAGLKITLEAYPDLKANQNFLQLQ EEIADVENKLAAVRRYFNSATKEYNNAVQTFPSNIVAGMTGFQREIMFDLGKNERANLDQ APKISF >gi|222159317|gb|ACAB01000042.1| GENE 21 22205 - 22708 556 167 aa, chain + ## HITS:1 COG:mll3697 KEGG:ns NR:ns ## COG: mll3697 COG1595 # Protein_GI_number: 13473184 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 3 164 5 161 183 102 38.0 4e-22 MKSLSFRKDLIGVQEELLRFAYKLTANREEANDLLQETSLKALDNEEKYVPDTNFKGWMY TIMRNIFINNYRKVVRDQTFVDTTDNYYHLNLPQDSGFESTEGAYDLKEMHRIVNALPRE YKIPFSMHVSGFKYREIAEKLGLPLGTVKSRIFFTRQRLQQELKDFV >gi|222159317|gb|ACAB01000042.1| GENE 22 22979 - 23722 460 247 aa, chain - ## HITS:1 COG:CC0325 KEGG:ns NR:ns ## COG: CC0325 COG0744 # Protein_GI_number: 16124580 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Caulobacter vibrioides # 26 219 25 212 229 218 55.0 1e-56 MHKQLLIKKLLRYTRNLLIFFFASTLLAVIIYRFMPVYVTPLMVIRSVQQVFSGDKPTWK HTWVSFDKISPNLPMAVIASEDNRFAEHNGFDLVEIKKAMKENETRKRKRGASTISQQTA 
KNVFLWPQSSWVRKGLEVYFTFLIELFWSKERIMEVYLNSIEMGNGIYGAQATAKNKFGT TADKLTRGQCALIAATLPNPIRFNSAKPSSYILKRQSQILRLMNLVPKFPPEEKAVEKKK SKKKKSK >gi|222159317|gb|ACAB01000042.1| GENE 23 23700 - 24416 895 238 aa, chain - ## HITS:1 COG:FN1265 KEGG:ns NR:ns ## COG: FN1265 COG2885 # Protein_GI_number: 19704600 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Fusobacterium nucleatum # 47 227 37 202 202 82 36.0 9e-16 MNKMKFMVLFMSIAMIFGSCGSMNNTGKGAAIGGGSGAALGAILGGVIGKGKGAAIGAAI GTAVGAGTGALIGKKMDKAAAEAKQIEGAQVEQITDNNGLQAVKVTFDSGILFTTGNANL SAAAKSALSKFANNVLNQNRDMDVSIYGYTDNQGWKNSTAAQSQQKNLNLSQERAQSVSS YLLSCGVSTNQIKSVQGMGESDPVASNDTAAGREQNRRVEVYMYASEQMIRDAQAATH >gi|222159317|gb|ACAB01000042.1| GENE 24 24579 - 25931 1158 450 aa, chain - ## HITS:1 COG:TM0815 KEGG:ns NR:ns ## COG: TM0815 COG0534 # Protein_GI_number: 15643578 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Thermotoga maritima # 12 435 18 445 464 133 25.0 8e-31 MQGIKNLTQGPINRQLFNLAMPIMATSFIQMAYSLTDMAWVGRLGSEAVAAIGSVGILTW MSGSISLLNKVGSEVSVGQSIGAQSQEDARSFASHNITIALIISICWGTLLFIFAEPIIR IYELEDHITANAIQYLRIISTGLPFIFLSAAFTGIYNAAGRSKVPFFISGTGLILNIILD PLFIFGFGLGTNGAAYATWIAEASVFLIFVYQLRCRDALLGGFPFFTRLKKKYTRRIFKL GLPVATLNTLFAFVNMFLCRTASEQGGHIGLMTFTTGGQIEAITWNTSQGFSTALSAFIA QNYAAGRVERVLRAWYTTLWMTGIFGTFCTLLFVFFGNEVFAIFVPEQAAYEAGGVFLRI DGYSQLFMMLEITMQGVFYGIGRTIPPAIISISCNYMRIPLAILFVRMGMGVEGIWWAVC VTTVAKGLILLSWFIIIKKKCLSVPSTIKG >gi|222159317|gb|ACAB01000042.1| GENE 25 26023 - 27408 1344 461 aa, chain + ## HITS:1 COG:CC2313 KEGG:ns NR:ns ## COG: CC2313 COG0657 # Protein_GI_number: 16126552 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Caulobacter vibrioides # 35 244 82 305 328 137 36.0 6e-32 MKQKFSTIFILLLLFFAGSRVVAQNAPKPFDIEQPSLRVFLPAPELATGRAVVACPGGGY SHLAFEHEGCDWAPYFNKQGIALIVLKYRMPNGDRTLPISDAEAAMKLVRDSANVWNLNP 
NDIGIMGSSAGGHLASTIATHAKPELRPNFQILFYPVITMDKSYTHRGSHDNLLGKDASA ELELEYSNEKQVTKDTPRAFIVYSDDDKVVPPANGVNYYLALNKNNVPSVLHIYPSGGHG WGIREGFLYKNEMLDELTSWLRSFKVPHKDAIRVACIGNSITYGARIKNRDRDSYPAVLS RMLGEAYWVKNFGVSARTLLNKGDHPYMNEKAYQDALAFNPNIVVIKLGTNDSKSFNWKY KADFTKDLQTMVDAFKALPAQPKIYLCYPSKAYQTGDNINDDIISKQIIPMIKKVAKKNN LSVIDLHAAMDGMPQLFPDKIHPNEEGAKVMAKAVYQSLKK >gi|222159317|gb|ACAB01000042.1| GENE 26 27601 - 29472 1433 623 aa, chain + ## HITS:1 COG:MA2256_2 KEGG:ns NR:ns ## COG: MA2256_2 COG0642 # Protein_GI_number: 20091095 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 378 616 36 282 345 159 39.0 2e-38 MNRLQDFTFFSQLMANANMGWWKANLSTANYECSDFIVELLGLDQTGVISFEDFNKRIQK EEQLSTTVHSYGVYQRPEVVYLLDTVKGSVWVRSKVCFQETDEDGNEIVYGISEVQDGPD MASAYQALQYSERLLSNIFKYLPIGIELYDMDGVLVDLNDKELEMFHIEKKEDVLGINIF DNPIFPKEMKERLKKNEDADFTFRYDFSKVGSYYQNTQKQGTIDLMTKVTTLYNSEHQPI NYLLINADKTETTVAYNKIQEFEEFFELVGDYAKVGYAHFNILSGHGYAQKSWYRNVGEA YETPLSDIFGTYRHFHPDDRALLIRFLDDARNGLTTQLSKEMRVLREDGTYTWTHVNLLV KKYAPQDRIIEIISINYDITELKRTEEMLVKARDKAEASDRLKSAFLANMSHEIRTPLNA IVGFSSLLTSTENAAEKELYNSLIGHNNKLLLNLINDVIDLSKIESGYLELRPDWVNLTE LLDESVAEYAHQVPSGVELLTNYPAHDSLVELDRLRIKQILSNFLSNALKNTTTGHVEIF YEVDHQSVRIGVKDTGRGIPQNMLEKIFERFEKLDSFAQGAGLGLPICKLIVEKMNGRIL VDSQLGIGTTFIIELPCRSMLVE Prediction of potential genes in microbial genomes Time: Wed May 18 02:12:13 2011 Seq name: gi|222159316|gb|ACAB01000043.1| Bacteroides sp. D1 cont1.43, whole genome shotgun sequence Length of sequence - 127235 bp Number of predicted genes - 107, with homology - 106 Number of transcription units - 44, operones - 26 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 1879 - 1938 7.5 2 2 Tu 1 . + CDS 1965 - 5210 2446 ## BT_1185 OmpA-related protein + Term 5219 - 5272 12.5 - Term 5215 - 5253 3.2 3 3 Op 1 . - CDS 5258 - 6355 552 ## BT_1184 hypothetical protein 4 3 Op 2 . 
- CDS 6401 - 9058 1862 ## COG0642 Signal transduction histidine kinase 5 3 Op 3 . - CDS 9129 - 10343 1089 ## COG1215 Glycosyltransferases, probably involved in cell wall biogenesis 6 3 Op 4 . - CDS 10352 - 12070 256 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily - Prom 12094 - 12153 1.7 7 4 Op 1 . - CDS 12159 - 13250 341 ## NT01CX_0022 hypothetical protein 8 4 Op 2 26/0.000 - CDS 13255 - 14106 630 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 9 4 Op 3 . - CDS 14103 - 15317 630 ## COG0438 Glycosyltransferase 10 4 Op 4 . - CDS 15274 - 16419 604 ## BF2054 hypothetical protein 11 4 Op 5 8/0.000 - CDS 16463 - 17914 1043 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 12 4 Op 6 . - CDS 17907 - 18875 434 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 13 4 Op 7 . - CDS 18917 - 20032 613 ## BT_1175 hypothetical protein - Term 20057 - 20095 -0.7 14 4 Op 8 . - CDS 20107 - 20445 214 ## gi|237716281|ref|ZP_04546762.1| conserved hypothetical protein - Prom 20567 - 20626 5.6 + Prom 20447 - 20506 6.4 15 5 Tu 1 . + CDS 20594 - 21562 668 ## BT_1173 hypothetical protein + Term 21565 - 21610 0.1 + Prom 21662 - 21721 4.6 16 6 Tu 1 . + CDS 21744 - 23537 1379 ## BT_1172 DNA primase/helicase + Term 23554 - 23589 -0.8 17 7 Tu 1 . - CDS 23634 - 23891 220 ## BT_1170 hypothetical protein - Prom 24021 - 24080 4.2 + Prom 23868 - 23927 6.9 18 8 Tu 1 . + CDS 24072 - 25631 1348 ## BT_4046 hypothetical protein + Prom 25633 - 25692 6.0 19 9 Op 1 . + CDS 25817 - 26359 547 ## BT_1517 hypothetical protein 20 9 Op 2 . + CDS 26416 - 26721 360 ## BT_1518 hypothetical protein 21 9 Op 3 . + CDS 26726 - 27175 329 ## COG3023 Negative regulator of beta-lactamase expression + Term 27191 - 27239 9.1 22 10 Tu 1 1/0.167 - CDS 27270 - 28412 809 ## COG0438 Glycosyltransferase - Prom 28458 - 28517 3.8 23 11 Op 1 . 
- CDS 28522 - 29715 731 ## COG1215 Glycosyltransferases, probably involved in cell wall biogenesis 24 11 Op 2 . - CDS 29743 - 30669 526 ## COG1216 Predicted glycosyltransferases 25 11 Op 3 . - CDS 30679 - 31878 651 ## BF2060 putative transmembrane surface-related protein 26 11 Op 4 . - CDS 31967 - 34192 1495 ## BT_1164 hypothetical protein 27 11 Op 5 . - CDS 34204 - 34950 374 ## BT_1163 hypothetical protein - Prom 34988 - 35047 4.9 28 12 Op 1 . - CDS 35088 - 36329 1002 ## BT_1162 hypothetical protein 29 12 Op 2 . - CDS 36292 - 36516 58 ## gi|237721024|ref|ZP_04551505.1| predicted protein - Prom 36561 - 36620 11.0 + Prom 36442 - 36501 7.1 30 13 Tu 1 . + CDS 36578 - 37972 1269 ## COG3579 Aminopeptidase C + Term 38036 - 38086 15.0 + Prom 38054 - 38113 3.0 31 14 Op 1 7/0.000 + CDS 38145 - 39494 1437 ## COG1726 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrA 32 14 Op 2 9/0.000 + CDS 39628 - 40800 1072 ## COG1805 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrB 33 14 Op 3 9/0.000 + CDS 40814 - 41488 763 ## COG2869 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrC 34 14 Op 4 9/0.000 + CDS 41510 - 42151 655 ## COG1347 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrD 35 14 Op 5 7/0.000 + CDS 42262 - 42888 720 ## COG2209 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrE 36 14 Op 6 . + CDS 42907 - 44181 1216 ## COG2871 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrF + Term 44237 - 44294 11.2 + Prom 44342 - 44401 5.0 37 15 Op 1 . + CDS 44422 - 44601 200 ## gi|237716306|ref|ZP_04546787.1| conserved hypothetical protein 38 15 Op 2 . 
+ CDS 44648 - 45982 1242 ## COG0513 Superfamily II DNA and RNA helicases + Prom 45988 - 46047 6.1 39 16 Op 1 6/0.000 + CDS 46188 - 47255 1144 ## COG1932 Phosphoserine aminotransferase + Prom 47272 - 47331 8.6 40 16 Op 2 2/0.000 + CDS 47355 - 48275 1261 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases + Term 48375 - 48424 8.1 + Prom 48324 - 48383 6.5 41 16 Op 3 . + CDS 48438 - 49685 1292 ## COG4198 Uncharacterized conserved protein + Term 49786 - 49838 10.1 + Prom 49697 - 49756 3.2 42 17 Tu 1 . + CDS 49941 - 50558 527 ## COG1739 Uncharacterized conserved protein + Prom 50571 - 50630 5.1 43 18 Op 1 . + CDS 50655 - 51404 448 ## gi|293373029|ref|ZP_06619398.1| hypothetical protein CUY_0916 44 18 Op 2 . + CDS 51484 - 52569 979 ## BT_1149 type II restriction enzyme HpaII 45 18 Op 3 . + CDS 52638 - 53207 526 ## COG0778 Nitroreductase + Term 53264 - 53309 9.0 46 19 Op 1 . - CDS 53200 - 53325 60 ## gi|237716317|ref|ZP_04546798.1| predicted protein 47 19 Op 2 . - CDS 53394 - 56243 2663 ## COG1003 Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain - Prom 56264 - 56323 4.7 48 20 Op 1 . - CDS 56356 - 56994 446 ## COG0491 Zn-dependent hydrolases, including glyoxylases 49 20 Op 2 . - CDS 57038 - 57658 511 ## COG0357 Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division - Prom 57680 - 57739 5.4 50 21 Tu 1 . - CDS 57772 - 58665 820 ## BT_1144 hypothetical protein - Prom 58696 - 58755 2.9 51 22 Op 1 . - CDS 58768 - 59001 251 ## BDI_1027 hypothetical protein 52 22 Op 2 . - CDS 59074 - 60159 577 ## COG1408 Predicted phosphohydrolases - Prom 60306 - 60365 6.3 + Prom 60200 - 60259 6.1 53 23 Tu 1 . + CDS 60356 - 60754 152 ## BT_1140 hypothetical protein + Prom 60799 - 60858 2.7 54 24 Op 1 . + CDS 60888 - 63101 2304 ## BF2084 putative TonB-dependent outer membrane receptor protein 55 24 Op 2 . + CDS 63171 - 63494 463 ## BT_1092 putative heavy-metal binding protein 56 24 Op 3 . 
+ CDS 63593 - 65803 2287 ## COG2217 Cation transport ATPase + Term 65892 - 65942 5.2 - Term 65880 - 65930 5.2 57 25 Tu 1 . - CDS 65982 - 66821 480 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 66857 - 66916 8.5 + Prom 66806 - 66865 4.1 58 26 Tu 1 . + CDS 66888 - 67571 579 ## COG0321 Lipoate-protein ligase B 59 27 Op 1 . - CDS 67528 - 68445 888 ## BT_1088 hypothetical protein 60 27 Op 2 . - CDS 68448 - 70004 1221 ## COG0657 Esterase/lipase 61 27 Op 3 . - CDS 70011 - 71999 2160 ## COG1297 Predicted membrane protein - Prom 72019 - 72078 9.7 - Term 72049 - 72081 2.0 62 28 Op 1 . - CDS 72105 - 72500 331 ## BT_1085 hypothetical protein 63 28 Op 2 . - CDS 72598 - 73203 449 ## BT_1084 hypothetical protein 64 28 Op 3 . - CDS 73211 - 73816 575 ## COG0218 Predicted GTPase - Prom 73865 - 73924 6.7 - Term 73870 - 73930 5.4 65 29 Tu 1 . - CDS 73941 - 75395 785 ## COG0591 Na+/proline symporter - Prom 75627 - 75686 4.4 + Prom 75398 - 75457 3.3 66 30 Op 1 . + CDS 75487 - 76104 544 ## COG0353 Recombinational DNA repair protein (RecF pathway) 67 30 Op 2 . + CDS 76131 - 76589 298 ## BT_1080 hypothetical protein 68 30 Op 3 . + CDS 76586 - 77122 348 ## PROTEIN SUPPORTED gi|229254479|ref|ZP_04378409.1| acetyltransferase, ribosomal protein N-acetylase + Term 77146 - 77183 0.1 69 31 Op 1 . - CDS 77126 - 77716 405 ## COG1678 Putative transcriptional regulator - Prom 77742 - 77801 3.4 - Term 77720 - 77770 4.4 70 31 Op 2 . - CDS 77811 - 79127 1109 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 79297 - 79356 77.0 + TRNA 79280 - 79353 85.5 # Asp GTC 0 0 71 32 Op 1 . - CDS 80270 - 81307 745 ## gi|237716342|ref|ZP_04546823.1| predicted protein 72 32 Op 2 . - CDS 81355 - 83442 1204 ## Dtox_2889 protein of unknown function DUF291 73 32 Op 3 . - CDS 83492 - 84346 687 ## BF3848 hypothetical protein 74 32 Op 4 . - CDS 84370 - 85536 787 ## BF3849 hypothetical protein - Prom 85614 - 85673 4.4 + Prom 85619 - 85678 5.7 75 33 Op 1 . 
+ CDS 85762 - 87021 695 ## gi|262407949|ref|ZP_06084497.1| predicted protein 76 33 Op 2 . + CDS 87018 - 88148 535 ## gi|237716347|ref|ZP_04546828.1| predicted protein 77 34 Op 1 . - CDS 88154 - 89170 791 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III - Prom 89208 - 89267 4.8 78 34 Op 2 . - CDS 89309 - 89434 94 ## - Prom 89504 - 89563 10.0 + Prom 89620 - 89679 8.2 79 35 Op 1 . + CDS 89742 - 90071 91 ## gi|237716349|ref|ZP_04546830.1| conserved hypothetical protein 80 35 Op 2 . + CDS 90080 - 90484 390 ## BT_1071 hypothetical protein 81 35 Op 3 8/0.000 + CDS 90488 - 91792 857 ## COG3969 Predicted phosphoadenosine phosphosulfate sulfotransferase 82 35 Op 4 . + CDS 91789 - 92331 543 ## COG1475 Predicted transcriptional regulators + Term 92399 - 92438 2.1 + Prom 92645 - 92704 2.4 83 36 Op 1 . + CDS 92789 - 93016 119 ## gi|237720862|ref|ZP_04551343.1| predicted protein 84 36 Op 2 . + CDS 93034 - 93399 216 ## BT_1067 hypothetical protein 85 36 Op 3 . + CDS 93421 - 93993 609 ## BT_1066 hypothetical protein 86 36 Op 4 . + CDS 94002 - 95513 1467 ## BT_1064 hypothetical protein 87 36 Op 5 . + CDS 95571 - 97280 1327 ## BT_1063 hypothetical protein + Term 97306 - 97344 8.1 + Prom 97462 - 97521 2.9 88 37 Tu 1 . + CDS 97557 - 98555 639 ## BT_1062 hypothetical protein + Prom 98563 - 98622 1.9 89 38 Op 1 . + CDS 98729 - 99670 671 ## gi|294807955|ref|ZP_06766734.1| putative lipoprotein 90 38 Op 2 . + CDS 99745 - 101319 966 ## gi|237716360|ref|ZP_04546841.1| conserved hypothetical protein 91 38 Op 3 . + CDS 101364 - 104591 1682 ## BDI_2707 hypothetical protein + Term 104606 - 104660 10.2 - Term 104594 - 104648 10.2 92 39 Tu 1 . - CDS 104691 - 105650 606 ## BT_1061 hypothetical protein - Prom 105805 - 105864 3.6 - Term 105797 - 105837 7.8 93 40 Tu 1 . - CDS 105887 - 106843 805 ## BT_1060 hypothetical protein - Prom 106863 - 106922 6.5 + Prom 106891 - 106950 8.1 94 41 Op 1 . + CDS 107047 - 107976 677 ## COG0451 Nucleoside-diphosphate-sugar epimerases 95 41 Op 2 . 
+ CDS 107996 - 109129 1055 ## COG0642 Signal transduction histidine kinase + Term 109134 - 109184 5.2 96 42 Op 1 . - CDS 109277 - 112156 2022 ## BT_1057 hypothetical protein 97 42 Op 2 . - CDS 112186 - 113007 488 ## BT_1056 hypothetical protein 98 42 Op 3 . - CDS 113012 - 113629 431 ## COG1180 Pyruvate-formate lyase-activating enzyme 99 42 Op 4 . - CDS 113626 - 116796 2444 ## COG1074 ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) 100 42 Op 5 . - CDS 116805 - 117431 594 ## COG2731 Beta-galactosidase, beta subunit 101 42 Op 6 . - CDS 117428 - 118036 440 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 118056 - 118115 6.9 + Prom 118083 - 118142 8.2 102 43 Tu 1 . + CDS 118198 - 119190 759 ## COG3712 Fe2+-dicitrate sensor, membrane component + Term 119225 - 119260 -0.4 - Term 119251 - 119314 1.2 103 44 Op 1 . - CDS 119524 - 120483 510 ## BT_1050 hypothetical protein 104 44 Op 2 . - CDS 120501 - 121700 742 ## BT_1049 putative patatin-like protein 105 44 Op 3 . - CDS 121709 - 122848 829 ## BT_1048 putative secreted endoglycosidase 106 44 Op 4 . - CDS 122876 - 124426 1312 ## BT_1047 hypothetical protein 107 44 Op 5 . 
- CDS 124444 - 127233 2242 ## BT_1046 hypothetical protein Predicted protein(s) >gi|222159316|gb|ACAB01000043.1| GENE 1 57 - 1748 1857 563 aa, chain + ## HITS:1 COG:RSc2913 KEGG:ns NR:ns ## COG: RSc2913 COG0488 # Protein_GI_number: 17547632 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Ralstonia solanacearum # 9 561 5 554 555 642 57.0 0 MAADDKKIIFSMVGVSKAFTPNKNVLKDIYLSFFYGAKIGIIGLNGSGKSTLLKIIAGLE KSFQGEVVFSPGYSVGYLAQEPYLDNTKTVKEVVMEGVQPIVDALTEYEEINQKFGLPEY YEDQDKMDALFARQGELQDIIDATDAWNLDSKLERAMDALRCPPEDQPVENLSGGERRRV ALCRLLLQKPDVLLLDEPTNHLDAESIDWLEQHLQQYEGTVIAVTHDRYFLDHVAGWILE LDRGEGIPWKGNYSSWLEQKTKRMEMEEKTASKRRKTLERELEWVRMAPKARQAKGKARL NSYDKLLNEDVKEKEEKLEIFIPNGPRLGNKVIEAKHVAKAYGDKLLFDDLNFMLPPNGI VGVIGPNGAGKTTLFRLIMGLETVDKGEFEVGETVKVAYVDQQHRDIDPNKSVYQVISGG NELIRMGGRDVNARAYLSRFNFSGGDQEKLCGVLSGGERNRLHLAMALKEEGNVLLLDEP TNDIDVNTLRALEEGLEDFAGCAVVISHDRWFLDRICTHILAFEGDSNVFYFEGSYSEYE ENKMKRLGNEEPKRVRYRKLMTD >gi|222159316|gb|ACAB01000043.1| GENE 2 1965 - 5210 2446 1081 aa, chain + ## HITS:1 COG:no KEGG:BT_1185 NR:ns ## KEGG: BT_1185 # Name: not_defined # Def: OmpA-related protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1081 1 1090 1090 1556 71.0 0 MLKRMRSFLVAAMLMVIATVSAQVTTSSMSGKVTAQDEPIIGATVVAIHEPSGTRYGTVT NISGQFNLQGMRTGGPYKVEVSYVGYQTAIYKGVNLSLGEVYTLNVVLKESSELLDEVIV TAQKTVEKMGTVTNVSERQLTTLPTINRSITDFTKLSPYAGGSNSFAGRDGRYNTITVDG AALNNNFGLSTNNLPGGDAQPISLDAIDEISVNVSPYSVTYSNFTGASINAVTKSGTNEL KGTVYTYQKPKNFIGKSINDVDVPNVESYKSSLYGFTLGAPIIKNKLFFFVNGELENSTS PGILWTPSQEEGGSGDNQNHISRTWIKDLKTISDFVKDKYGYDPGSYDKFDDFESKNWKL MARLDWNINKSHKLSLRFNTVKSENDASISSTSSVITKANSNRYGVDAFAFGNSNYGFRN IVTSLSGELNSNFSSSVQNKLLVTYTHIRDSRTTKGDAFPMVDIYKDGKQYMTLGTELFT PFNDVENNVFSVTDNVTINKGNHLITAGATFERQYFMNSYLRAPYGYYRYASMDDFMTGE KPMLYGITYGYNGKDAPGAELTFGMLGAYAQDEYSITPNLKLTYGLRFDLPLYFDDLLGN AAIKEQSFNGTNVDVSEWPKSKLLISPRLGFNWDIKGDRSIVLTGGTGLFTGLLPFVWFT 
NQPTNAGQMQNMVEFETSELPANFAFNPNYKETLTQNPDMFPSTPGNEVPGAIAYVDPNF KMPQVWRSNVNAEFQLPYGFMLSVGAMYTRDIYNVVQKNMNEKAPSGTYNEQPGRVYWTK NNYYDNPKTKTVIQLTNGDEKGYQYSFNAVLTKKFDFGFTGSFGYTYTMAKDMTANPGSS AASVWQNNVAVNSLNDPGVSYSLFSTPHRLIANASYEVAYANMKTTVSLFYTGYQQGRFS YTYSNDMNGDGNYSDLMYVPASKEEMTFVDIKDKQNNVTYSAVDQQEDFWNYVNNDSYLN DHKGQYVERASSLEPWIHRFDMKIAQDFYAKIGSRKYGIQVSLDMLNIGNLLNSKWGAYR SCGLQSYDNVRLLKTASKVGEPLTYQMNASSREVFQKNSKWDYTASTGSAWQMQLGVKFT F >gi|222159316|gb|ACAB01000043.1| GENE 3 5258 - 6355 552 365 aa, chain - ## HITS:1 COG:no KEGG:BT_1184 NR:ns ## KEGG: BT_1184 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 365 1 369 370 576 72.0 1e-163 MKAFYRLFYYISIVLTSILAGVTIAGAFVGNAAPDSFKFMPFIGLILPILLLANLASAIY WTIRWRCWVFIPLIAIFSNWGYMSCVLQSPFFSPASSPMVKMNVYTPGVLTVATYNVDAF NHEHTGYSCKEIASYMRNLQADILCFQEFGINDEFGIDSISAALSNWPYHYIPSSPEGKN LLQLAVFSRYPIKEEHLIIYPDSKNCSLACDIEINGRTIRLFNNHLQTTEVSQNKRKLEK GLRTDDSQRVEHATLGLIDGLHENFRKRAVQADLLKQLIAASPYPTLVCGDFNSLPSSYV YHTIKGDKLQDGFQTSGHGYMYTFKYFKHLLRIDYILHSPELNSTDYFSPDLNYSDHNPV VMRMK >gi|222159316|gb|ACAB01000043.1| GENE 4 6401 - 9058 1862 885 aa, chain - ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 486 732 61 317 328 170 39.0 1e-41 MNRILHDANNAQRILKLTADTMLLVDRHGVCVDIEPHCDLWFLQEDILLGKNIFELLPEY TRERVMPIFQIVLEEQRSISKNFKLVLKGETFYFKCLMFPYDGMVLCQYRDITQRSNVKR QLEQANLTLRAIQKVAQIGQWTYNTKQNIFHYLGYTGVLCEENVQNLAIEKYVELIVEED RQSFMEWCRVNEKELNMESISYRVRLNGEIFYMRIQTYLREQRSDGSFNIEGYIQNITDI QHRRNDINTLTHAINNAKESIFAAKPDGTIIFANRRFLYNHGISENEDISQLKIYNVAAD MPTQEAWNERCKDVIHGGSSNFVAHHPSKINKGILAYEGTMYNVTNDSGEESYWSFAHDI SERIRYEAQIKRLNQIMDTTINNLPAGIVVKEINNDFRYIYRNREAYNRDLYKNDPVGKN DFDFYPPIVAEKKRQEDIQVATTGKGLHWTAEGKDRNGNMIILDKRKIRVDGDELSSPII VSIEWDITELEMIKRELQSSKEKAEMSDSLKSAFLANMSHEIRTPLNAIVGFSHLIAESD 
DAEERKTYYNIVNANNERLLQLINEILDLSKIESGTIEFSFGPASLHNLCREVHDAHIFR TPQGVSLVYESSDESLMIETDKNRVFQVISNLIGNAVKFTKEGSISYGYKLADNQIVFHV TDTGTGIEPEKVGRVFERFAKLNNHAQGTGLGLSICKSIVERLGGKISVNSEFGKGTTFT FTLPYTIANPANVDSSKKENGEGNIAGIASAGKTADSSVDSSANTRHACILVAEDTDSNF DLLEAILGKDHRLIRAHDGMEAVTMFDEVKPDLILMDIKMPNLDGLEATKIIRELSATVP IIAQSAFAYEQDRKAAEEAGCNDFIAKPIADDKLKAMIHKWLLPS >gi|222159316|gb|ACAB01000043.1| GENE 5 9129 - 10343 1089 404 aa, chain - ## HITS:1 COG:PAE0419 KEGG:ns NR:ns ## COG: PAE0419 COG1215 # Protein_GI_number: 18311929 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases, probably involved in cell wall biogenesis # Organism: Pyrobaculum aerophilum # 57 310 42 295 365 83 28.0 8e-16 MNEIITLCLEIIFWFALFIVFYTYLGYGIVLYVLVKLKELFVKPVKRSLPASDAGLPEVT LFITAFNEEDVIDEKMENSLELDYPADKLHIVWVTDGSNDGTNERLKTRWQGKATIHFQP LRQGKTAAMTRGMTLVDTPLVVFTDANTMVNREAIQEIVLAFQDPKVGCVAGEKRIAVQT KDGAAAGGEGIYWKYESTLKALDARLYSAVGAAGELFAVRRELFEAMEPDTLLDDFILSL RITMKGYTIAYCTNAYAIESGSADMREEEKRKVRIAAGGLQSIWRLRPLLNPFRYGTLSF QYTSHRVLRWSITPFLLFALFPLNIAILLLGGSAIFYGVLLAMQVLFYGLGYWGYYLSTK QIKNKLLFIPYYFLFMNVNVLKGIRYLKKKKGSGAWEKAKRAEK >gi|222159316|gb|ACAB01000043.1| GENE 6 10352 - 12070 256 572 aa, chain - ## HITS:1 COG:BS_yqgS KEGG:ns NR:ns ## COG: BS_yqgS COG1368 # Protein_GI_number: 16079540 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Bacillus subtilis # 4 539 3 554 638 140 25.0 6e-33 MNIKLLIKQFTFPLSLILLCYINLQYMHYSILIDKYSLTPFSYLENIIHAFIDICLLFFI PLYLIGKKTYLFFIPYTIITILVIVNVSYSRFFYTYMPPILYGEFNNLDGLSANVFATIR FSDITILITTLASILSYKQFHRAYSEIRRKTRRIFSLYILGIAFILLGGLIATATIHWSS LNYKYVHPFKNSPTESIFKFGIIYGTIIQCASNNKQGCNPEEVAKLEPFFYESKYSIEYP PKENIILIIVESLLSFPTDLKINGIEITPTLNKLVEKGAYYNNNMTSLIQLGESSDGQFT YLNGLIPKTKGVTIYDYFNNTFVSLPKLLKKQKPGIECRMVIPTSSKTWRQDGVCIQYGF DKLYSRKEYTLSNYKENWLNDKLLFEYAASIDKNSKQPFFSLILTSSTHSPYTKAVEDYL IPFPDTYSEELKNYLSNVHYMDKYLGKYLNFLKENHLYHNSLIIIASDHSISNDWLKSKE 
EDNVSFQIPLYIVNSPQKIDKTSDYVITQADLFPTLLDLGGIHSEWRGVGNSLLCPDSIL NTEREKKRIIYREKISDIILDSDYFKGKKISK >gi|222159316|gb|ACAB01000043.1| GENE 7 12159 - 13250 341 363 aa, chain - ## HITS:1 COG:no KEGG:NT01CX_0022 NR:ns ## KEGG: NT01CX_0022 # Name: not_defined # Def: hypothetical protein # Organism: C.novyi # Pathway: not_defined # 33 317 69 345 496 125 31.0 2e-27 MKIIRKIILVVICVILVDYIYQYNQFQKYNQAPLIIKTPEGSNQPYHPSVIYIPEGWNGY KYWMAETPYPLGEDGDWKGLPPYRERWENPCVHVSKDGIHWNDFEDSQNPIDDLDENNII NKDYFSDPHLVFYKDTLECWYRISHQKNNATYILRKYTLNGKDWSPREVMINLQDTSIIK NETGNMVISPAIHKGTNGYVMWYVNSIEKPREICRSFSTDGKKWSKKETCHLPDNSVTPW HIDLAYIDKIYYLVIYDYDTNNLVLYSSSDGLSFNNKKYILSKAPMLGSFYSYGLYRSSL IKDNEKYKLYFSAFEKKTAIGLMEGNSISTLHATSAEGNFISFCKFPIVYLRNKKQSLKE LFQ >gi|222159316|gb|ACAB01000043.1| GENE 8 13255 - 14106 630 283 aa, chain - ## HITS:1 COG:TVN0223 KEGG:ns NR:ns ## COG: TVN0223 COG0463 # Protein_GI_number: 13541054 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Thermoplasma volcanium # 42 140 10 106 226 62 35.0 1e-09 MKWYTKYLQVYEQPFDSAPKEAVDFVKTKLQKLSENKQPLASVILICYNEEKRLLSCLWS LCDNICDFPIEILAVNNNSSDRTESVLQQLGVTYFNELKKGPGHARQCGLNQAKGTYHIC IDTDTMYPNNYIKTHVKKLMQPNVVCTFSLWSFIPDEQHSKWGLWWYESLRDFYLRIQAI QRPELCVRGMTFAFKTELGKQLGFRTDIIRGEDGSLALAMKPYGKLVFIHSSKARVITGY GTVGADGGLFKSFKVRFIKGLKGIGGLFTRKKKYKDKDNNLIK >gi|222159316|gb|ACAB01000043.1| GENE 9 14103 - 15317 630 404 aa, chain - ## HITS:1 COG:SMb21250 KEGG:ns NR:ns ## COG: SMb21250 COG0438 # Protein_GI_number: 16264502 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Sinorhizobium meliloti # 5 399 35 409 427 137 28.0 3e-32 MNWTSDHSIQYLKSNPMNKKVKIAYCIPSLYYPSGMERVLTLKANYFAEHLGYDIHIILT DGKGKEPYYKLYPSITTHQLDINYDELYGLSLPKRIHRYWSKQKLFKKRLETCLNEIEPD ITISLLRRDINFINKMKDKSIKLGEIHFNKSNYREFSDNCLPGIIQRAVKQYWMWQLIRQ VRQLKSFVVLSHEDAAEWTELNNVTVIYNPLPFLPEQQSDNTPKQVIAVGRYVPQKGFDR LISAWSIVNKKHPDWILRIYGDGMREQLQNQIYELGISPSCILEHSTPDIVDKYCKSSIF 
VLSSRYEGFGMVIIEAMACGVPPVSFTCPCGPRDIISDGINGLLVENGNIEGLAEKICYL IENENVRREMGRQARMDIERFRIEPIAEQWKTLFESIIEKNKDL >gi|222159316|gb|ACAB01000043.1| GENE 10 15274 - 16419 604 381 aa, chain - ## HITS:1 COG:no KEGG:BF2054 NR:ns ## KEGG: BF2054 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 369 1 369 372 511 64.0 1e-143 MKALFLIFHGFDKANGISKKIHYQVKALKECGLDVRLCYYDISPSGERRWMADNAVIAEL GQGVFAKIRKRLGLNCIANYVLRENIHFVYIRSYHNANPFTINMVKRMKKKGAKVVMEIP TFPYDQEYITWPMKFKRLTDRCFRHRLASFLNGIVTFSNAKNIFGERTIRISNGIDFDAI PMKKQMNDTTHELHLIGVAEVHYWHGFDRLIRGLAEYYCTNPDYKVYFHIVGPLSGEREK QEILPVIRDNKLESYVILHGPQHDQQLDAMFEQADFAIGSLGRHRSGITHIKTLKNREYA ARGLAFTYSEIDEDFDKMPYIWKAPPDESPINIQQLISFQKSLTMTPQNIRESIRPLSWT AQMKKVIDELDIRPQHTILEK >gi|222159316|gb|ACAB01000043.1| GENE 11 16463 - 17914 1043 483 aa, chain - ## HITS:1 COG:L13324 KEGG:ns NR:ns ## COG: L13324 COG2244 # Protein_GI_number: 15672194 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Lactococcus lactis # 5 473 1 469 475 216 30.0 7e-56 MASNIKNQLFSGVFYTALAKYSGIVISLVIAGILARLLSPDDFGIVAVATVIIAFFSLLT DMGISPAIIQHKSLTKDELSTIFSFTVWTGIGISILFFAASWMIADYYESEILRTLCQLL SVNLFFASATIVPGALFYRNKEFKFIAIRSFVIQISAGAAAVTAALCGAGLYALIINPII SSILIFVISYQRYPQRLRFTLGLTALRKIFSYSAYQFLFNVINYFSRNLDKLLIGKYMSM SDLGYYEKSYRLMMLPLQNITQVITPVMHPIFSDFQNDKAKLATSYERILRFLAFIGLPL SVLLFFTAEEVTLIIFGVQWLPSVPVFRLLSLSVGIQIILSSSGSIFQAAGDTRSLFVCG VFSSVLNVTGILLGIFYFGTLTAVASCIVVTFSINFIQCYWQMYRITFRRSAWPFIRQLI SPFIISILIALALIPMQYALEGMNIFVTIIAKSIVSFIIFGIYIQMTHEYDIIGKVRSIL CKR >gi|222159316|gb|ACAB01000043.1| GENE 12 17907 - 18875 434 322 aa, chain - ## HITS:1 COG:YPO0187 KEGG:ns NR:ns ## COG: YPO0187 COG0463 # Protein_GI_number: 16120528 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Yersinia pestis # 4 227 2 223 329 135 34.0 8e-32 
MKYNEIKVSVVIPVYNTEKYVRQAVESVMYQSLKELEIIVVDDGSTDKSLSIVEKLGDTD KRIQIYTQANQGQSIARNRGISHAHGEYIYFMDSDDLLEEDALELCYHKCKEEKLDFVFF DALVFFENNVENAPTLNYKHTEKLEDKIYTGQEALEIQLQNKEYTPSVCLHFIHRNVIEK HNLSFYPEIVHEDQLFTTLLYLQSTKAACIKRTFFHRRMRKDSTMTSKFAMRNIKGYLVV TEEILSFRRQTTEDKNKEIIDLYLSQMLDAAMWQAHSLKLPERIKLARLCLQKYKKYVST KTIGALLFKFLFINHKKTVDNG >gi|222159316|gb|ACAB01000043.1| GENE 13 18917 - 20032 613 371 aa, chain - ## HITS:1 COG:no KEGG:BT_1175 NR:ns ## KEGG: BT_1175 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 367 1 367 372 544 70.0 1e-153 MSDSETDSPSIKALIVFRENGETDNLFVPILCDAIRMAGINVRYSQKEFWESDTTYDIIH FQWPEELVGWTCKDPDVIRCLKERIDFFRSRGTHFIYTRHNIHPHYANDIISRVYDIIES ESDTVVHMGHYSQTEFIQKYPDSRNVIIPYHIYQYTYKEDISIERARQYLNLPQEAFVVT AFGKFRNREERRMLLGAFRDWDKERKLLIAPRLCPFSRRNNYGRNILKRWASRIGYYVLM PLLNRMFRLQAGANDEPIDECDLPYYMSASDVIFIQERNALNSVNIPLAFLFHKIVVGPN TGNIGELLKNTGNPTFNPNHKADIIRALKMAWQLSTWGKGEMNYTYALENMSIDKVGKQY AEVYRDLNSRL >gi|222159316|gb|ACAB01000043.1| GENE 14 20107 - 20445 214 112 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716281|ref|ZP_04546762.1| ## NR: gi|237716281|ref|ZP_04546762.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 112 1 112 112 193 100.0 2e-48 MDTKRNQTLEEIEENKIVNEHYQNRVMLIKKLLKTSRLATVDLCVHIDISEASYYRYINF TSYMKADIFIHACLFLKQYIESHHIPYTQEEKRLIKTLDLFQISSNSNLNCN >gi|222159316|gb|ACAB01000043.1| GENE 15 20594 - 21562 668 322 aa, chain + ## HITS:1 COG:no KEGG:BT_1173 NR:ns ## KEGG: BT_1173 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 322 1 328 328 334 52.0 3e-90 MKKDQYFNLEVNLLNDDNIACMMSEMNAAEALGIYVMLLLHLRTKDAYEASCKPVLLKAM ARRYDVDEVAVERVLREFDLFELDEERQMFRSSYLDRVMKSLEEKRKMDIENGKKGGRPK KVAKSAETPVSKGRKPTENQKRREEESKEEESKGSVSVVNNNRSNIETPSLVSRLADEGN HGPLQPVLPWEKLVDQLSTFQSYMELAGQHSGLGKLFVDHQKLILEIFKKHIRLYDKGAG LLFPEDVKRYFSNYIAAGSVTCRTLRETLLKELENTVDKDVNRFESVVDGRRTYLGHLIP VDAPPRPDASAVWDDVKKRWAH >gi|222159316|gb|ACAB01000043.1| GENE 16 21744 - 23537 1379 597 aa, chain + ## HITS:1 COG:no KEGG:BT_1172 NR:ns ## KEGG: BT_1172 # Name: not_defined # Def: DNA primase/helicase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 597 1 598 599 1039 80.0 0 MRNFANYDVDIHGKSSGVLKTICKKCLPTRKNKRDRSLRVNVDTGHCHCYHCGADFYVPD DVEERKNAERVAARKRRAAAIPQHFQRPVFDASKTTLSEDTERWLVETRCIPQSVIAALR ITEQEEFMPQSGKKERCICFNYFEGEQLINTKFRALPKLFKMVQGAELIPYNIDSILGQT SCIIHEGELDAASSIAAGFKSAISVPAGANSNLSWLDRFMETHFEDLEEIIIAVDADSAG IRLRNELINRLGAERCRVVTYGPECKDANEHLCKYGIASLRIAIEQAAEVPLEGIFTAAD LHDDLRALFDNGFGPGAETGWEEMDKICTYERGRSVYVTGVPGAGKSEWVDELVLRLCLR HQWKIGFFSPENTPIVYHLRKLIEKLTGHRFQNGCGMTEGLLANSEDFLTENVSHISLKG NVSPDRVLAKAHELVVRRGCRIIVFDPLNRFDHNPQPGQTETQYISNLLNKFTEFAVQHN CLLVVVVHPRKMNRNPVTGITPRVEMYDINGSADFYNKADYGIIVERDKEVGATRVYVDK VKFKHLGVGGMASFVYDPVSGRYLPCEESHDPSLSADQRVRNTMFDNSCWLPEKELF >gi|222159316|gb|ACAB01000043.1| GENE 17 23634 - 23891 220 85 aa, chain - ## HITS:1 COG:no KEGG:BT_1170 NR:ns ## KEGG: BT_1170 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 85 1 85 85 131 81.0 9e-30 MKTKNKILPEQEEEKTFQYRTYGKGELALLYLPNILQQSAVDRFNEWIEAAPGLKERLLA TGMNPRARYYTPAQVRLIVEVLQEP >gi|222159316|gb|ACAB01000043.1| GENE 18 
24072 - 25631 1348 519 aa, chain + ## HITS:1 COG:no KEGG:BT_4046 NR:ns ## KEGG: BT_4046 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 518 1 517 518 837 79.0 0 MEGIFRRYPIGIQSFERLRNDNCVYIDKTELIYRLVNTHTTYFLSRPRRFGKSLLVSTLE AYFSGKKDLFKGLMMEQLEKDWTVYPVLHIDFSISKYMNAGMLRSAINNRLVEWERIYGC DTSEDTFSLRLKGIIKRAYEQSGRQVVLLVDEYDSPMLDSNNNEELQAEIRGIMRDFFSP LKAQGEYLRFLFLTGISKFSQMSIFSELNNLQNISMQDAYSAICGITENELRTQLEEDIR RMAEANGETYDEACIHLKQQYDGYHFSENSEDIYNPFSLFNAFAQKKYANFWFSTGTPTF LIDILQQSDFDIRQLDGVSATAEQFDAPTNVITDPLPVLYQSGYLTIKEYDRDFQIYTLA YPNKEVRKGFIESLMPAYVHLPARENTFYVVSFIKDLRVGNLDQCMERIKSFFASIPNDM NNKEEKHYQTIFYLLFRLMGQYVDAEVKSAVGRADVVIKMQEAIYVFEFKVDGTPEEALA QINSKQYAIPYQADHRKVIKVGVNFDSSTRTIGEWIIEN >gi|222159316|gb|ACAB01000043.1| GENE 19 25817 - 26359 547 180 aa, chain + ## HITS:1 COG:no KEGG:BT_1517 NR:ns ## KEGG: BT_1517 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 142 3 140 176 120 42.0 3e-26 MATVTVVRYKRRKRLGDDKSPMMYLLKPKAGESKIYSIDSLAQEIESIGSLSVEDVSHVM KSFVRAMKKVLVAGNKVKVDGLGIFYTTLTCPGVEQEKDCTVKNITRINLRFKVDNSLRL ANDSTATTRGGDNNMMFELYTEKKSAAGGNGGDGSDDDGKGDGGEPGGGSGGGEAPDPAA >gi|222159316|gb|ACAB01000043.1| GENE 20 26416 - 26721 360 101 aa, chain + ## HITS:1 COG:no KEGG:BT_1518 NR:ns ## KEGG: BT_1518 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 94 1 93 101 123 82.0 2e-27 MLDKIIEIIMTILPFLGSSRKKRQMMAQEVKEFSELVKDQYTFLMQQLEKVLKDYFDLSS KVKEMHAEIFSLRGQLTQAAALQCSSKECVQRVQMVTETEG >gi|222159316|gb|ACAB01000043.1| GENE 21 26726 - 27175 329 149 aa, chain + ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 46 142 2 98 116 111 51.0 5e-25 MRTINLIVIHCSATRADRDFTEDDLEVCHRRRGFNGTGYHFYIRKNGDIKTTREIERIGA 
HAKGHNQNSIGICYEGGLDCHGHPADTRTEWQIHSMRVLILALLRDYPGCRVCGHRDLSP DLNGNGEIEPEEWIKECPCFEVKEFCSKM >gi|222159316|gb|ACAB01000043.1| GENE 22 27270 - 28412 809 380 aa, chain - ## HITS:1 COG:CAC2911 KEGG:ns NR:ns ## COG: CAC2911 COG0438 # Protein_GI_number: 15896164 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Clostridium acetobutylicum # 1 370 1 369 374 156 28.0 5e-38 MKIAIEAQRIFRPNKHGMDFVALETIRELQKMDHENEYFIFVSPGEDKCLESSDNVHIIE LKCPTYPLWEQVALPRAVKSIKPDLLHCTSNTAPLHCPVPLVLTLHDIIYLEKRQSSSLS WYQEMGWHYRRLVVPRILPKCEKIITVSQFERKRILEALHLPEKQLVAVYNGFNSHFHLQ PKAPEITRKYIDADEYLFFLGNTDPKKNTPRVLKAYSGYLKKSAKKLPLLIADLKEDAID RILEEEKMMDIKSYLSFPGYIENTDLAALYSGAFAFLYPSLRESFGIPMLEAMACGTPII AGNTSAMPEIAGDGALLVDPFSPEDITAKILKLENDGTFYQQQVEYGLKRSQMFSWRNTA ESLLSIYKELSLSNICPVSK >gi|222159316|gb|ACAB01000043.1| GENE 23 28522 - 29715 731 397 aa, chain - ## HITS:1 COG:CAC1691 KEGG:ns NR:ns ## COG: CAC1691 COG1215 # Protein_GI_number: 15894968 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases, probably involved in cell wall biogenesis # Organism: Clostridium acetobutylicum # 1 261 1 271 425 82 28.0 1e-15 MNIVDWILFVLFALCVCYLLLYAIASKFYRVPEFPEARTLRRFVILFPAYKEDRVIVSSV RSFLRQEYPKEMFDIIVISDQMQSATNELLRSLPIRLLIADYKDSSKAKALTMAMDAIDG IPYDVVVIMDADNLTSPHFLTAVNRAFDSGVRCIQAHRTGQNLNTDISVLDGISEEINNG IFRSGHNALGLSAALSGSGMAFEADWFRKNVRLLETAGEDKELEVLLLQQRIHTTYLPEI PIYDEKTQKEEAISNQRKRWIAAQFGILRSSLSGLQKAIRQGNIDYCDKIIQWMLPPRLI QIAGVFGLTFIFTAIGIWLSLKGDSGNEWMIAIKWWILSIAQIVAMILPIPGNLLNKRLG KAIIKIPILALTTIGNLFKLKGAYKKFIHTEHGEGHF >gi|222159316|gb|ACAB01000043.1| GENE 24 29743 - 30669 526 308 aa, chain - ## HITS:1 COG:CAC3069 KEGG:ns NR:ns ## COG: CAC3069 COG1216 # Protein_GI_number: 15896320 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Clostridium acetobutylicum # 8 260 4 258 299 166 33.0 7e-41 MTNNIPDISFITICYNGFKDTCELIESLQNKIHSVSYEIIVVDNASHENEAAKIHQLYPT 
VVAIRSNENSGFSGGNNIGIQVAKGKYIFLINNDTYIESDHIAYLVERLESRPEIGGVSP KIRFAFPPQHIQFAGFTPLSQITLRNHMLGFDCPDDGTFDTPHTTPYLHGAAMMFKREVI EKIGMMPEIFFLYYEELDWSTSMTRAGYELWYDPCCTIFHKESQSTGQLSKLRTYYLTRN RLFYARRNLKGFNRIASIIYQSTVAATKNSLAFILKGRFDLAAAIFYGVNSGLLRPSSDK KSVVSPDK >gi|222159316|gb|ACAB01000043.1| GENE 25 30679 - 31878 651 399 aa, chain - ## HITS:1 COG:no KEGG:BF2060 NR:ns ## KEGG: BF2060 # Name: not_defined # Def: putative transmembrane surface-related protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 390 45 434 444 501 61.0 1e-140 MVLILLFTTICLIQLGNDERTNWKPGQNIMTYFFIVWLLFYILEILNPNNVMAAWNINLT PYALTPLICAFVVPIVIRTKKDIELLLIIWSIFVLIFTLKGYWQKNHGFSSKDLYFLHVV GGARTHIIWSGIRYFSCFSDAANYGVHAAMSTVTFAIASFFVDSRWKRIYFLFIAFCGIY GMGISGTRSAMGVLMGGMLMITVIAKNWKALLGGIFISISIFAFFYYTNIGSGNQYIHKM RSSFHPTEDASYLVRVENRMRMKELMAKKPIGYGVGLSTGNFEPKEQMPYPPDSWLVGVW VETGIVGLILYLGIHGILFAWCSWLLMFKVRNKSLRGLVAAWLCMDAGYFIAAYVNDVMQ YPNPLTVYIGFALCFAAPHIDKRISEEEKEKELVATQKN >gi|222159316|gb|ACAB01000043.1| GENE 26 31967 - 34192 1495 741 aa, chain - ## HITS:1 COG:no KEGG:BT_1164 NR:ns ## KEGG: BT_1164 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 741 1 741 742 776 55.0 0 MNYIISIIRALFRHRWLILLGTTFFTLLVIYYTRHMQGGYDVKATLYTGVASGYNLESDK RTDWATVQNSMDNLISIMQAESTLKRVCLRLFARILIQGNPDKENNGITASSYNYTYNHL KNSPNGAEILKLIDKSSEDKTVANLEKYMRPHRDNYIYGLFYYNHPFYSYNALKNIKVQR RLTSDLLDISYSSGDPGIVYNTVSILMDEFVEEYRRIRYGETDKVIKYFEEELKRIGKKL NLEEEDLTKYNVEKRIINYIDETKEIAAISKEYELREQDALFAFNSSKSMLEELEKHMDS NAKQVLKNMEFVDKLREASSITGRISEAEAMTVSSKTDDQTLINEKKRLSEIKQELNDLT SSYIGHKYTKEGASRTNIIDQWLEQTLLYEKAKAELQIVQNARSELNDRYVFFAPVGTTI KQKERMINFTERNYLTVLQSYNEALLRKKNLEMTSATLKVLNEPTYPISSNSTNRKQIVI AACAVSFLIIVALLLLVEMLDRTLRDASRTRRVTGFKVIGAIPNTSPSRYGGLTKTYVQL SVKELSNSLLRFLTKRKSPGVFIINLFSTSDNSGEEELGNLICGYMQSRMLNARFITYGV DFNTDSTQFLLAKSITDFYTLQGEDVLIVAYPPLSTSNIPSALLHDANANILVASADRGW KTIDKQLCEQLTQQLSKTDVPFRICLTNANRDAVEDFTGQLPPHTLLRRIGYRLSQLSLT EKIIFNLRRKAKEAADEDDDE >gi|222159316|gb|ACAB01000043.1| 
GENE 27 34204 - 34950 374 248 aa, chain - ## HITS:1 COG:no KEGG:BT_1163 NR:ns ## KEGG: BT_1163 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 247 1 247 249 317 66.0 4e-85 MKQFIYFFSGLLLLFISTTRVVAQEVSDYSQLSEEDYSKIVLPPLSVLFENAKNSPIYEM ADVKARIERKLLQKEKWSFLGFFSLRGSYQYGMFGNESTYTDVAVAPYLTYSTQAQNGYT VGAGLSIPLNDLFDLKGRVSRQRLTLRSAELEREMKYDEIKKSIIEMYTMAISQMKVLQM RSESLVLANVQYEISEKNFANGTIESTDLSTDKERQSQAREAYENSKFELTKSLMILEVI SRTPIIRK >gi|222159316|gb|ACAB01000043.1| GENE 28 35088 - 36329 1002 413 aa, chain - ## HITS:1 COG:no KEGG:BT_1162 NR:ns ## KEGG: BT_1162 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 413 2 409 409 721 85.0 0 MLPSINMMSATADSTAVKSDELQMSTTSISDTTPTDFEGAPSDTIPALSKRELRRQRVAN RNLHYNILGGPSYTPDFGLLVGGSALMTFRMNPSDTTQRRSVVPMAIALMFKGGLNLMTK PQLFFKGDRFRIFGTFSYKNTIENFYGIGYSTNKDYERGEDTSEYRYSGIQVNPWFLFRL GESNFFAGPQVDLNYDKITKPAAGMVNEPSYIAAGGTEHGYKNFSSGLGFLLTYDTRDIP ANAYSGTYLDFRGMMYNKTFGSDNNFYRLEIDYRQYKTVGRRKVIAWTVQSKNAFGDVPL TKYVLSGTPFDLRGYYMGQFRDKSSHVMMAEYRQMINTDKSTWVKKMLSHIGYVAWGGCG FMGPTPGKIEGVLPNLGLGLRIEVQPRMNVRLDFGRDMVNKQNLFYFNMTEAF >gi|222159316|gb|ACAB01000043.1| GENE 29 36292 - 36516 58 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237721024|ref|ZP_04551505.1| ## NR: gi|237721024|ref|ZP_04551505.1| predicted protein [Bacteroides sp. 
2_2_4] # 1 74 13 86 86 127 98.0 2e-28 MYGIKSCNFIYQHIEKHLILQKYDSFLDLLSKNIKTNRNKIVFFAVEFLKEKYEEIYNRN IGDASIHQYDVSYG >gi|222159316|gb|ACAB01000043.1| GENE 30 36578 - 37972 1269 464 aa, chain + ## HITS:1 COG:TP0112 KEGG:ns NR:ns ## COG: TP0112 COG3579 # Protein_GI_number: 15639106 # Func_class: E Amino acid transport and metabolism # Function: Aminopeptidase C # Organism: Treponema pallidum # 27 463 3 440 450 278 33.0 2e-74 MNKQILSIFVFCAFSYSTQAQEVKGGISDSMMQQIKQSYANTPTDKAIRNAIGNNDIRKL ALNQDNLKGMDTHFSIKVSSKGITDQKSSGRCWLFTGLNVMRAKAIAKHNLGSFEFSQTY PFFFDQLEKANLFLQGIIDTSSKPMDDKMVEWLFRNPLSDGGTFTGVADIVSKYGLVPKD VMPETNSSENTSRMAGLIALKLREQGLQLRDLAAQGVKPAALEKTKTEMLSTIYRMLVLN LGVPPTEFTWTEYNAKGEPVSTETYTPLSFLKKYGDEKLIDNYVMLMNDPSREYYKCYEI DYDRHRYDGKNWTYVNLPIEDIKEMAISSLKDSTMMYFSCDVGKFLNSDRGLLDVKNYDY ESLMGTSFGMNKKQRIQSFASGSSHAMTLMAVDLDKNGKPTKWMVENSWGPAAGYQGYLI MTDDWFNEYMFRLVVETKYASKKALEVLKQKPIRLPAWDPMFAE >gi|222159316|gb|ACAB01000043.1| GENE 31 38145 - 39494 1437 449 aa, chain + ## HITS:1 COG:YPO3240 KEGG:ns NR:ns ## COG: YPO3240 COG1726 # Protein_GI_number: 16123399 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrA # Organism: Yersinia pestis # 4 447 1 446 447 290 35.0 6e-78 MANVIKLRKGLDINLKGKAAETYATVKEPGFYALVPDDFPGVTPKVVVKEQEYVMAGGPL FIDKYHPEVKFVSPVSGVVTSVERGARRKVLNIVVEAAAEQDYEDFGKKNVASMDAEAVK SALLGAGLFAFIKQRPYDIIADPTVTPKGIFISAFDTNPLAPDFEFALKGEEANFQTGLD ALAKLAKTYLNISVKQNAAALTQAKNVTITAFDGPNPAGNVGVQINHLDPVSKGETVWTI DPQAVIFIGRLFNTGHVDFTRTVAVTGSEVLKPAYCKLQVGALLTNVFAGNVTKDKDLRY ISGNVLTGKQVSPNGFLGAFHSQLTVIPEGDDIHEMLGWIMPRFNQFSANRSYFSWLMGK KEYTLDARIKGGERHMIMSGEYDKVFPMDILPEYLIKAIIAGDIDRMEALGIYEVAPEDF ALCEFVCSSKMELQRIVRAGLDMLRSEMA >gi|222159316|gb|ACAB01000043.1| GENE 32 39628 - 40800 1072 390 aa, chain + ## HITS:1 COG:PA2998 KEGG:ns NR:ns ## COG: PA2998 COG1805 # Protein_GI_number: 15598194 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrB # 
Organism: Pseudomonas aeruginosa # 3 384 2 401 403 323 44.0 2e-88 MKALRNYLDKIKPNFEEGGKFHAFQSVFDGFETFLFVPSKTAKTGTHIHDAIDSKRIMSI VVISLIPALLFGMYNVGYQHFTHTGATGSFIEMFIYGFLAVLPKIIVSYVVGLGIEFVVA QWKKEEIQEGFLVSGILIPMIVPVDCPLWILAVATAFSVIFAKEVFGGTGMNVFNVALIT RAFLFFAYPTKMSGDAVWVSGDSIFGLGQSVDGLTVATPLGQAATTGSVPAFNMDMITGL IPGSIGETSVIAILIGAVILLWTGVASWKTMISVFVGGAFMAWVFNSIGMENNTMAQMPW YEHLVLGGFCFGAVFMATDPVTSARTERGKYIFGFLIGVMAIVIRVLNPGYPEGMMLAIL LMNIFAPLIDYCVVQSNISRREKRTIKSNQ >gi|222159316|gb|ACAB01000043.1| GENE 33 40814 - 41488 763 224 aa, chain + ## HITS:1 COG:VC2293 KEGG:ns NR:ns ## COG: VC2293 COG2869 # Protein_GI_number: 15642291 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrC # Organism: Vibrio cholerae # 2 224 4 250 257 89 29.0 5e-18 MNTNSNSYTIIYASVMVIIVAFLLAFVSSSLKSTQDKNVQLDTKKQILAALNIKNVEDAD AEYQKYVKGDMLMNVDGTLTENTGEFATNYEKEAKEQQRLHVFVCEVDGQTKYVVPVYGA GLWGAIWGYVALNEDKDTVYGTYFSHASETPGLGAEIATDHFQNEFVGKKTLENGAITLG VVKNGKVEKPDYQVDGISGGTITSVGVDAMLKSCLNSYLSFLTK >gi|222159316|gb|ACAB01000043.1| GENE 34 41510 - 42151 655 213 aa, chain + ## HITS:1 COG:HI0168 KEGG:ns NR:ns ## COG: HI0168 COG1347 # Protein_GI_number: 16272134 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrD # Organism: Haemophilus influenzae # 10 211 8 208 208 213 58.0 2e-55 MGQLFSKKNKEVFSAPLGIDNPVTVQVLGICSALAVTAKLEPAIVMGLSVTVITAFANVV ISLLRKTIPNRIRIIVQLVVVAALVTIVSEILKAFAYDVSVQLSVYVGLIITNCILMGRL EAFAMQNGPWESFLDGVGNGLGYAKILIIVAFFRELFGSGTLLGFNILNYEPIQNIGYVN NGLMLMPPMALIIVACIIWYQRARHKELQEQSN >gi|222159316|gb|ACAB01000043.1| GENE 35 42262 - 42888 720 208 aa, chain + ## HITS:1 COG:HI0170 KEGG:ns NR:ns ## COG: HI0170 COG2209 # Protein_GI_number: 16272135 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrE # Organism: Haemophilus influenzae # 1 208 1 198 198 209 60.0 3e-54 
MEHLLSLFVRSIFVDNMIFAFFLGMCSYLAVSKNVKTAVGLGIAVTFVLLVTLPVNYLLQ TKVLAANAIIEGVDLSFLSFILFIAVIAGIVQLVEMVVERFTPSLYASLGIFLPLIAVNC AIMGASLFMQQRINLGPSDPKYIGDIWDALSYALGSGIGWLLAIVGLAAIREKMAYSDVP APLKGLGITFITVGLMAIAFMCFSGLNI >gi|222159316|gb|ACAB01000043.1| GENE 36 42907 - 44181 1216 424 aa, chain + ## HITS:1 COG:PA2994 KEGG:ns NR:ns ## COG: PA2994 COG2871 # Protein_GI_number: 15598190 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrF # Organism: Pseudomonas aeruginosa # 6 424 6 407 407 430 51.0 1e-120 MDMNLILASIGVFLVVILLLVVILLVAKNFLVPSGNVKLTINGEKELEVASGSTLLNTLS VNGIFLSSACGGKGSCGQCKCQVLEGGGEILPSEVPHFSRKQQQDHWRLGCQVKVKGDMA IKIDESVLGVKEWECEVISNKNVATFIKEFIVALPKGEHMDFIPGSYAQIKIPKFSMDYD KDIDKSLIGDEYLPAWEKFGLLGLKCKNDEETIRAYSMANYPAEGDRIMLTVRIATPPFK PKEQGPGFMDVMPGIASSYIFTLKPGDKVIMSGPYGDFHPIFDSKKEMMWIGGGAGMAPL RAQIMHMTKTLHTTDRKMSYFYGARALNEVFYLEDFLQIEKDFPNFTFHLALDRPDPAAD AAGVKYTPGFVHNVIYETYLKNHEAPEDIEYYMCGPGPMSKAVEKMLDDLGVPAQNLMFD NFGG >gi|222159316|gb|ACAB01000043.1| GENE 37 44422 - 44601 200 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716306|ref|ZP_04546787.1| ## NR: gi|237716306|ref|ZP_04546787.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 59 1 59 59 100 100.0 4e-20 MEYHRISFIHNDTEYSFVKAMSSSLTGYALVTACRAEVTIYMKENNLKGYYILTGMAKV >gi|222159316|gb|ACAB01000043.1| GENE 38 44648 - 45982 1242 444 aa, chain + ## HITS:1 COG:lin1214 KEGG:ns NR:ns ## COG: lin1214 COG0513 # Protein_GI_number: 16800283 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Listeria innocua # 2 444 4 468 470 224 31.0 3e-58 MLEKNEIIQSALRNLKIEELNPMQEASLEQATGRKDVILLSPTGSGKTLAYLLPLLLTLK PNDDSVQVLILVPSRELALQIDSVFKAMGTSWKTCCCYGGHPIAEEKKSILSNHPAIIIG TPGRITDHLSKGNFNPETIETLIIDEFDKSLEFGFHDEMAEIITQLPGLKKRMLLSATDA AEIPEFTGLNRTVKLNFLSDDSEEQESRLKLMKVLSPSKDKIDTLYNLLCTLGSASSIVF CNHRDAVDRVHQLLADKKLLAERFHGGMEQPDRERALYKFRNGSCHVLISTDLAARGLDI PEVGHIIHYHLPVNEEAFTHRNGRTARWDATGTSYLILHAEEKLPSYILEEMEIVVLPEN PPRPPKSVWATIYIGKGKKEKLSRIDIAGFLYKKGNLTREDVGAIDVKEHYAFVAVRQAK VKQLLNLIQGEKIKGMKTIIEEAK >gi|222159316|gb|ACAB01000043.1| GENE 39 46188 - 47255 1144 355 aa, chain + ## HITS:1 COG:BS_serC KEGG:ns NR:ns ## COG: BS_serC COG1932 # Protein_GI_number: 16078066 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoserine aminotransferase # Organism: Bacillus subtilis # 5 353 6 356 359 331 47.0 1e-90 MKKHNFNAGPSILPREVIEDTAKAILDFNGSGLSLMEISHRAKDFQPVVDEAEALFKELL NIPEGYSVLFLGGGASMEFCMVPYNFLEKKAAYLNTGVWAKKAMKEAKGFGEVVEVASSA EATYTYIPKDYTIPTDADYFHITTNNTIYGTELKKDLDSPVPMVADMSSDIFSRPIDVSK YICIYGGAQKNLAPAGVTFVIVKNDALGKVSRYIPTMLNYQTHVDSGSMFNTPPVVPIYA ALQTLRWIKAQGGVKEMERRAIEKADMLYAEIDRNKMFVGTAAKEDRSRMNICFVMAPEY KDFEADFLKFATERGMVGIKGHRSVGGFRASCYNALPKESVQALIDCMQEFEKLH >gi|222159316|gb|ACAB01000043.1| GENE 40 47355 - 48275 1261 306 aa, chain + ## HITS:1 COG:aq_1905 KEGG:ns NR:ns ## COG: aq_1905 COG0111 # Protein_GI_number: 15606928 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # 
Organism: Aquifex aeolicus # 2 305 3 320 533 159 35.0 8e-39 MKILVATDKPFAKVAVDGIRKEIEAAGYEFALLEKYTEKAQLLDAVKDANAIIIRSDIVD AEVLDAAKELKIVVRAGAGYDNVDLAAATAHNVCVMNTPGQNSNAVAELALGMMVYAVRN FYNGTSGTELMGKKLGIHAYGNVGRNVARVAKGFGMEVYAYDAFCPKEVIEKDGVKALDS AEELYKTCQVVSLHIPATAETKNSINYALLKDMPKGAMLVNTARKEVINEAELIKLMEER ADFKYITDIMPAANAEFAEKFAGRYFSTPKKMGAQTAEANINAGIAAAQQIVGFLKDGCE KFRVNK >gi|222159316|gb|ACAB01000043.1| GENE 41 48438 - 49685 1292 415 aa, chain + ## HITS:1 COG:CAC0016 KEGG:ns NR:ns ## COG: CAC0016 COG4198 # Protein_GI_number: 15893314 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 414 1 413 414 417 49.0 1e-116 MAIIKPFKGIRPPQNLVEQVASRPYDVLNSEEARAEAEGNEKSLYHIIKPEIDFPVGMDE HDERVYAKAAENFQLFQDKGWLVQDNKENYYIYAQTMNGKTQYGLVVGAYVPDYMNGIIK KHELTRRDKEEDRMKHVRVNNANIEPVFFAYPDNEKLDVIIKKYTANKPVYDFIAPGDGF GHTFWIVDQDADIATITAEFAKMPALYIADGHHRSAAAALVGAEKAKQNPNHRGDEEYNY FMAVCFPANQLTIIDYNRVVKDLNGLTPEQFLAALDKNFIVEEKGADIYKPSGLHNFSLY LGGKWYSLTAKAGTYNDNDPIGVLDVTISSNLILDEVLGIKDLRSDKRIDFVGGIRGLGE LKKRVDSGEMKVALALYPVSMKQLMDIADTGNIMPPKTTWFEPKLRSGLVIHKLD >gi|222159316|gb|ACAB01000043.1| GENE 42 49941 - 50558 527 205 aa, chain + ## HITS:1 COG:NMA0240 KEGG:ns NR:ns ## COG: NMA0240 COG1739 # Protein_GI_number: 15793258 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Neisseria meningitidis Z2491 # 6 178 4 176 203 184 48.0 1e-46 MTAEDTYKTIIEPSEGIYTEKRSKFIAIALPVRTLDEIKMHLETYQKKYYDARHVCYAYM LGAARKDFRANDNGEPSGTAGKPILGQINSNELTDILIIVVRYFGGIKLGTSGLIVAYKA AAAEAIAAATIIEKTVDEDVTVMFEYPFMNDVMRIVKEEEPEILNQSYDMDCSMTLRIRR SMMPKLRARLEKVETARILDDENIL >gi|222159316|gb|ACAB01000043.1| GENE 43 50655 - 51404 448 249 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293373029|ref|ZP_06619398.1| ## NR: gi|293373029|ref|ZP_06619398.1| hypothetical protein CUY_0916 [Bacteroides ovatus SD CMC 3f] # 1 249 22 270 270 519 99.0 1e-146 MSVQAQAQIQETQKLAPNNIPVPWYSQKIAGCPYSHCSLASSLMVFDYFKGMTADTQRTA 
QDAEKKLIEYQRNYFLKKRAPFHRRTSIGQGGYYSFEIDSLTRYYENMISAEHFQQKDYR ILKDYIDRGIPVLVNVRYTGAVRGLRPGPRGHWMVLRGIDDKHVWVNDPGRSPEMRTKGE NICYPIKKQPGNPSYFDGCWTGRFIIVTPREWIRNSLFAQVGKLPPLEEVTHIVPPVVSS VTLPQVINQ >gi|222159316|gb|ACAB01000043.1| GENE 44 51484 - 52569 979 361 aa, chain + ## HITS:1 COG:no KEGG:BT_1149 NR:ns ## KEGG: BT_1149 # Name: not_defined # Def: type II restriction enzyme HpaII # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 361 1 361 361 633 85.0 1e-180 MAFEATKKEWCELYSFFRLLADGKVVLGTAEAKAGDTFWPVAMIQREEHDGTRQYYIEED TIRIEGENGSKSMPREDFGIVADLILQAVKSSPENDVASPEGVEEFLDEAAIFDLEAKTE DRTDFSITFWHPKAPLRGFNVRSRLGVMNPLLDGGRAANLKLEQSGVKFATPTVNKINAL PESPNEVAERMMMIERLGGVLKYADVADRVFRSNLLMIDLHFPRVLTEMVRIMHLDGISR ISELTEVIKQMNPLKIKDELINKHKFYEFKMKQFLMALVLGMRPAKIYNGLDSAVEGILL VDGNGEVLCYHKSEKQIMEDFLFLNTRLEKGSLEKDKYGFLERENGVYYFKLNAKIGLVK R >gi|222159316|gb|ACAB01000043.1| GENE 45 52638 - 53207 526 189 aa, chain + ## HITS:1 COG:MA1774 KEGG:ns NR:ns ## COG: MA1774 COG0778 # Protein_GI_number: 20090624 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Methanosarcina acetivorans str.C2A # 3 180 31 214 220 105 34.0 5e-23 MKTNEVLENIKARRSVRAYTGQQVLEEDLQAILEAATYAPSGMHLETWHFTAIQNMDKLT ELNERIKGAFAKSDDSRLQERGHSKTYCCYYHAPTLVIVSNEPTQWWAGMDCACAIENMF LAAQSLGIGSCWINQLGTTCDDPEVREFITALGVPANHKVYGCVALGYPDSKIPIKEKKV KANTITIVR >gi|222159316|gb|ACAB01000043.1| GENE 46 53200 - 53325 60 41 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716317|ref|ZP_04546798.1| ## NR: gi|237716317|ref|ZP_04546798.1| predicted protein [Bacteroides sp. 
D1] # 1 41 1 41 41 78 100.0 1e-13 MYQNSPFIEITLATTFCGEGDFYMRVSTQPLILSDKKLYSI >gi|222159316|gb|ACAB01000043.1| GENE 47 53394 - 56243 2663 949 aa, chain - ## HITS:1 COG:YPO0905_2 KEGG:ns NR:ns ## COG: YPO0905_2 COG1003 # Protein_GI_number: 16121210 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain # Organism: Yersinia pestis # 465 941 4 484 494 581 59.0 1e-165 MKTDLLASRHIGINEQDTAIMLRKIGVNSLDELINQTIPANIRLKEPLALTSPLTEYEFG KHIAELAAKNKLYTTYIGLGWYNTITPAVIQRNVFENPVWYTSYTPYQTEVSQGRLEALM NFQTAVCDLTAMPLANCSLLDEATAAAEAVSMMYALRSRAQQKANANVVFVDENIFPQTL AVMTTRAVPQGIELRVGKYKDFEPSPEVFACILQYPNSHGNVEDYSEFTEKAHAADCKVA VAADILSLALLTPPGEWGADIVFGTTQRLGTPMFYGGPSAAFFATKDEYKRNMPGRIIGW SKDKYGKLCYRMALQTREQHIKREKATSNICTAQALLATMAGFYAVYHGQEGITTIASRI HSITVFLEKQLKKCGYTQVNAQYFDTLRFELPEHVSAQQIRTIALTKEVNLRYYENGNVG FSIDETTDVAAANVLLSIFAIAAGKDYQKVDDIPERSNIDKDLKRTTPFLTHEVFSKYHT ETEMMRYIKRLDRKDISLAQSMISLGSCTMKLNAAAEMLPLSRPEFMGMHPLVPEDQAEG YRELIKNLSEDLKVITGFAGVSLQPNSGAAGEYAGLRVIRTYLESIGQGHRNKVLIPASA HGTNPASAIQAGFVTVTCACDEQGNVEMADLRVKAEENKDELAALMITYPSTHGIFETEI KEICDIIHACGAQVYMDGANMNAQVGLTNPGFIGADVCHLNLHKTFASPHGGGGPGVGPI CVAEHLVPFLPGHGIFGNSQNQVSAAPFGSAGILPITYGYIRMMGAEGLTQATKIAILNA NYLATCLKDTYGVVYRGANGFVGHEMILECRKVHEEAGISENDIAKRLMDYGYHAPTLSF PVHGTLMIEPTESESLAELDNFVDVMLNIWKEIQEVKNGEADKDDNVLINAPHPEYEIVS DRWEHSYTREKAAYPIESVRDNKFWINVARVDNTLGDRKLLPTRYGTFE >gi|222159316|gb|ACAB01000043.1| GENE 48 56356 - 56994 446 212 aa, chain - ## HITS:1 COG:VC1270 KEGG:ns NR:ns ## COG: VC1270 COG0491 # Protein_GI_number: 15641283 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Vibrio cholerae # 11 210 13 210 218 152 40.0 4e-37 MKIKRFEFNMFPVNCYVLWDDTKEAVVIDPGCFYEEEKQALKKFILTNELNVKHLLNTHL HLDHIFGNPFMLKEFGLSAEANKADEYWIDEAPKQSRMFGFQLQEEPVPLGKYLHDGDII TFGHTKLEAIHVPGHSPGSLVYYCKEDNCMFSGDVLFQGSIGRADLTGGNFDELIEHICS RLFVLPNETVVYPGHGAPTTIGMEKAENPFFR >gi|222159316|gb|ACAB01000043.1| GENE 
49 57038 - 57658 511 206 aa, chain - ## HITS:1 COG:SA2499 KEGG:ns NR:ns ## COG: SA2499 COG0357 # Protein_GI_number: 15928295 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division # Organism: Staphylococcus aureus N315 # 10 149 16 164 239 100 36.0 1e-21 MEIILKYFPDLTEEQRKQFAALYDLYTDWNSKINVISRKDIENLYEHHVLHSLGIAKVIQ FRPGTSIMDLGTGGGFPGIPLAILFPEVKFHLVDSIGKKVRVATEVANAIGLKNVTFRHA RAEEEKRTFDFVVSRAVMPLADLIKIIKKNISSKQQNALPNGLICLKGGELEHETMPFKH KTVIHSLSDNFEEEFFKTKKVVYVPI >gi|222159316|gb|ACAB01000043.1| GENE 50 57772 - 58665 820 297 aa, chain - ## HITS:1 COG:no KEGG:BT_1144 NR:ns ## KEGG: BT_1144 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 297 1 283 283 508 88.0 1e-143 MNDKTNVEFTEKMGMKKQYVSLLAIILSVSGFLFSCHDKMNKNTGALQFDSIQVNKTAHL FNDTAKPACNIIINFAYPIKSSDDMLKDSLNTYFISACFGDKYIGEKPEEVVKQYTENYI SEYRRDLEPMYTEDEKDKEDESSIGAWYSYYKGIESHVQLYDKDLLVYRIDYNEYTGGAH GIYMATYLNMDLTLMRPLRLDDIFVGDYKDALTDLIWNQLMADNKVTTHEALEDMGYAST GDIAPTENFYLSKEGITFYYNVYDITPYSMGPVKVTIPFAMMEHLLGSNPILGELKN >gi|222159316|gb|ACAB01000043.1| GENE 51 58768 - 59001 251 77 aa, chain - ## HITS:1 COG:no KEGG:BDI_1027 NR:ns ## KEGG: BDI_1027 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 77 1 77 77 124 81.0 1e-27 MEQLHAHEVLHMMEGNSYSESSLREAIIKKFGSQQRFYACSAENMDVDTLIEFLKMKGKF MPAEDGFTVDITKVCKH >gi|222159316|gb|ACAB01000043.1| GENE 52 59074 - 60159 577 361 aa, chain - ## HITS:1 COG:CAC3027 KEGG:ns NR:ns ## COG: CAC3027 COG1408 # Protein_GI_number: 15896279 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 41 356 55 384 392 175 32.0 1e-43 MLTYLLIIITLYLAGNAYIFIRAKQALKVKSLGVKIFLTVLFWICALSFFGTMLTRNLEM PVFISHSMYTIGTSWLIFTLYMALFLLLFDILKLFKVVYKYRFYLSLVFTLGLLGCGVYN YHHPETNVVSILTNKRYEDTPQAIKIVAISDVHLGNGTGKAALKKYVEMINAQHPDLILI 
SGDLIDNSVVPLYTENMAEELANLKAPMGIYMVLGNHEYISGIDESIRYIKSTQIQLLRD SVVTLPNGIQLIGRDDRHNRKRHSLQELMVNIDKSKPIILLDHQPFDLEKTEAAGIDLQF SGHTHHGQIWPINWVTDYIFEQSHGYRQWGNSHVYVSSGLSLWGPPFRIGTHSEMVIFNF Q >gi|222159316|gb|ACAB01000043.1| GENE 53 60356 - 60754 152 132 aa, chain + ## HITS:1 COG:no KEGG:BT_1140 NR:ns ## KEGG: BT_1140 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 132 1 132 132 147 68.0 1e-34 MKKLAYITALILSLLIIYSGGGVSIVHYCCVRCETVQSCCDTGCPKCKKTHTCDSKKGCK DKGCTAIIYKLDLMKHTTELTTSALVVDLLCEHFCYLLTPTYADKSVEYDSLTSPPPLCS RQKLALYSTYII >gi|222159316|gb|ACAB01000043.1| GENE 54 60888 - 63101 2304 737 aa, chain + ## HITS:1 COG:no KEGG:BF2084 NR:ns ## KEGG: BF2084 # Name: not_defined # Def: putative TonB-dependent outer membrane receptor protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 13 735 13 735 735 1181 79.0 0 MIIGLFLLFAGNIYAQVRGTVKDSTGEAIPGANVFWMNTGQGVTTKEDGSFSITKPSKSH MLIVSFIGFQNDTIHVSSKNQQLDIVLRDGVELNEVNIVTRKLGTMKLRSSVMNEDMISS AELSRAACCNLGESFVTNPSVDVSYSDAATGAKQIKLLGLSGTYVQMLTENIPNYRGAAA PYGLGYVPGPWMQSIQVSKGTSSVKNGYEAITGQINVEFKKPQLPEADWVSANLFASTTN RYEANADATLKLSKRWSTSLLAHYENETKAHDGNDDGFVDIPQVEQYNVWNRWAYMGDHY VFQAGIKALSETRTSGQANHGGTMHSGDLYKVGIDTERYEFFTKNAYIFNKEKNTNLALI LSTTLHNQDATYGRKLYNVDQTNVYASLMFETEFNPQNSFSAGLSFNYDAYDQHYRLENT TDNPLKAFEKEAVPGAYVQYTLNLNDKWMVMAGLRGDYSNEHGFFVTPRAHLKYNPNDYV NFRLSAGKGYRTNHVLAENNYLLSSSRKVEVAKNLDMEEAWNYGASVSTYIPIFGKTLNV NAEYYYTDFLKQVIVDMDSNPHEVAFYNLDGRSYSHVFQVEASYPFFKGFTLTGAYRLTD AKTTYKGERMEKPLTSKYKGLLTASYQTPLGIWQFDATLQLNGGGRMPTPYELGDGQLSW ERRYGSFEQLSLQVTRYFRRWSIYVGGENLTNFKQKNPIIDAANPWGNNFDSTMIWGPVH GAKGYIGIRFNLARNSE >gi|222159316|gb|ACAB01000043.1| GENE 55 63171 - 63494 463 107 aa, chain + ## HITS:1 COG:no KEGG:BT_1092 NR:ns ## KEGG: BT_1092 # Name: not_defined # Def: putative heavy-metal binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 107 1 107 119 135 81.0 4e-31 MRTKRWMATCVVALLSVTAVLAKDIRVVVFKVSQMHCEKCEKKVKDNMRFEKGLKDISTE 
VKTKMVTITYDAEKTNVKKLQAGFNKFSYEAEFVKETKKDDQKADKK >gi|222159316|gb|ACAB01000043.1| GENE 56 63593 - 65803 2287 736 aa, chain + ## HITS:1 COG:alr7635 KEGG:ns NR:ns ## COG: alr7635 COG2217 # Protein_GI_number: 17158771 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Nostoc sp. PCC 7120 # 1 733 1 744 753 610 44.0 1e-174 MGNTTKKAFPVLNMHCAGCANNVERTVKKLPGVIEASVNFATNTLTVSYEKEQLTPGEIR AAVLAAGYDLIVEEAHKEERREEEQHKRYTRLKWKVIGAWILVVPLLVYSMILMHVPHSN EIQMVLAIPVMVFFGGGFFTGAWKQAKLGRSNMDTLVALSTSIAFLFSLFNTFFPEFWYD RGLEPHVYYEASAVIIAFVLTGKLMEERAKGNTSTAIRKLMGMQPKAARVLRNGVEEEIL IEKLQVGDLVVVRPGEQIPVDGQLSEGDSYVDESMISGEPVPVEKKKGDKVLAGTINQRG SFIIRAAQVGSETVLARIIAMVQEAQGSKAPVQRIVDRITGIFVPVVLGIAILTFVLWVT IGGSEYISYGILSAVSVLVIACPCALGLATPTALMVGIGKAASQHILIKDAVALEQMRKV DVVVLDKTGTLTEGHPTATGWLWAQSQEPHFKDVLLAAEMKSEHPLAGAIVAALQDEEKI KPAVLDSFESITGKGIKASYEGHTYWVGSHKLLKDFSATVSDVMAEMLVHYESDGNGIIY FGRENELLAIIAVSDPIKATSAEAVKELKRQGIDICMLTGDGQRTALAVSSRLGIERFVA DALPDDKAEFVRELQMQGKKVAMVGDGINDSQALALADVSIAMGKGTDIAMDVAMVTLMT SDLLLLPKAFQLSKQTVKLIHQNLFWAFIYNLIGIPIAAGILFPLNGLLLNPMLASAAMA FSSVSVVLNSLSLGRK >gi|222159316|gb|ACAB01000043.1| GENE 57 65982 - 66821 480 279 aa, chain - ## HITS:1 COG:BMEII0641 KEGG:ns NR:ns ## COG: BMEII0641 COG2207 # Protein_GI_number: 17988986 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Brucella melitensis # 152 279 161 293 307 71 30.0 2e-12 MKMDFPQVDLPTEVLAWTNVTEDILNIYKQSCRLQACIVAICTEGSMKASINLLDYEIRP NDLITLLPGTIIQFRERTEKVCLCFAGFSAHCTGRINLMKNIGNAYPKLIEQPVVPLNEE VASYLKDYFALLSRASCNENFEMDSELVELSLQTILTSIRLIYHKFPGENGSSNRKKEIC RELIHVITENYKNERRAQFYADKLGISLQHLSTTVRQVTGKSVLDTIAYIVIMDAKAKLK GTNMTIQEIAYSLNFPSASFFGKYFRRYVGMTPLEFRNR >gi|222159316|gb|ACAB01000043.1| GENE 58 66888 - 67571 579 227 aa, chain + ## HITS:1 COG:MT2274 KEGG:ns NR:ns ## COG: MT2274 COG0321 # Protein_GI_number: 15841708 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate-protein ligase B # Organism: 
Mycobacterium tuberculosis CDC1551 # 11 205 30 208 240 154 45.0 1e-37 MKTFFIDWNLIPYAEAWQRQTEWFDNIVRAKVQGESYENRIVMCEHPHVYTLGRSGKENN MLLSDEQLKAIDATLYHIDRGGDITYHGPGQLVCYPIVNLEEFQLGLKEYVHLLEEAVIR VCASYGIEAGRLAKATGVWLEGDTPRARKICAIGVRSSHYVTMHGLALNVNTDLRYFSYI HPCGFIDKGVTSLRQELKHDVPMDEVKQRLEEELRKLFQLPVARCGA >gi|222159316|gb|ACAB01000043.1| GENE 59 67528 - 68445 888 305 aa, chain - ## HITS:1 COG:no KEGG:BT_1088 NR:ns ## KEGG: BT_1088 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 305 1 310 310 504 80.0 1e-141 MKRVKHFLLASCLFPTLLGAQEMASAYFLELTPDAQSAGMAGTGLATTDNGSTAIFHNAS TIAFSQEVMGASYSYAKINQDYALHSASLFYRIGREGIHGFAVGFRHFKDPKVLDYRPHI WDLEAAYFRNVAKNLSLSLTFRYLQAKAAENADSKNSVCLDFGATYYRNMALLDEMASWS IGFQAANLGKKLDGQKLPARLGLGGTIDLPFSIENRLQVALDFNYLLPSEIRHLQAGIGA EYNFLKYGVVRAGYHFGDKDKGVGNYGTLGCGINFWPIRADFSYALADKDCFMRRTWQLG VGIVF >gi|222159316|gb|ACAB01000043.1| GENE 60 68448 - 70004 1221 518 aa, chain - ## HITS:1 COG:CC2313 KEGG:ns NR:ns ## COG: CC2313 COG0657 # Protein_GI_number: 16126552 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Caulobacter vibrioides # 26 261 45 305 328 159 34.0 1e-38 MRRLLSIIVSLVTAISFAQQPVELPLWPDGAPNSSGLTGEEQETRPHFVTNVTHPTLTVY HPEKPNGMAIIMCPGGGYRGLGMDGEGYDMAPWFCGQGITYIVLKYRMPNGHWEVPVSDA EQAIRMVRQHAKEWNVNPYKVGLMGASAGGHLTATLATHYNSETRPDFQILLYPVVTMMQ VTRGNTRTALLGKNPTMEQIQKFSAELQVTPDTPQAFIALTSDDPSVAPYHGVNYYLALQ KNKVPATLHVYPTGGHGWGFKDHFKYKQQWTQELEKWLRDGVVFPENPEPMLRIGKSYLG TKYVANTLDQDGEESLVIRTDAVDCLTFVEYTLAQALGSSFADNLQKIRYRDGIINGYPS RLHYTSEWIENGIRHGFLTDITAKNSAHTQKISLSYMSTHPKQYKKLADSPENVRQMAEY EKAISGKVVHWLPKSELPEAGLPWIMNGDIIAITTKMPGLDIAHVGIAEYKEGKLHLLHA SSTLGKVVVSDEPLNHMLNNNKSWTGIRVVRMSHSKNN >gi|222159316|gb|ACAB01000043.1| GENE 61 70011 - 71999 2160 662 aa, chain - ## HITS:1 COG:PH0361 KEGG:ns NR:ns ## COG: PH0361 COG1297 # Protein_GI_number: 14590271 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Pyrococcus horikoshii # 25 626 3 595 
626 238 30.0 2e-62 MKQEEDKFTGLPENAFRELKPGEVYNPLMGPSKNYPEVNIWSVAWGIAMAILFSAAAAYL GLKVGQVFEAAIPIAIIAVGVSGAAKRKNALGENVIIQSIGACSGVIVAGAIFTLPALYI LQAKYPEMTVTFMQVFISSLLGGVLGILFLIPFRKYFVSDMHGKYPFPEATATTQVLISG EKGGSQAKPLLMAGMIGGLYDFIVATFGWWNENFTTRVCSAGEMLAEKAKLVFKVNTGAA VLGLGYIVGLKYASIICAGSLAVWWIIIPGMSAIWGDSVLNAWNPEITSTVGMMSPEEIF KYYAKSIGIGGIAMAGVIGIIRSWSIIKSAVGLAAKEMGGKGNVEKSIIRTQRDLSMKII AIGSIITLILIVLFFYLDVMQGNLLHTLVAIVLVAGISFLFTTVAANAIAIVGTNPVSGM TLMTLILASVVMVAVGLKGPSGMVAALVMGGVVCTALSMAGGFITDLKIGYWLGSTPAKQ ETWKFLGTIVSAATVGGVMIILNKTYGFTSGALAAPQANAMAAVIEPLMSGVGAPWLLYG IGAVLAIILTLCKIPALAFALGMFIPLELNVPLVVGGAVNWFVTTRSKDASLNTERGEKG TLLASGFIAGGALMGVISAAMRFGGINLVNEAWLNNTWSEVLALGAYALLILYFIKASMK VK >gi|222159316|gb|ACAB01000043.1| GENE 62 72105 - 72500 331 131 aa, chain - ## HITS:1 COG:no KEGG:BT_1085 NR:ns ## KEGG: BT_1085 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 131 1 131 131 244 96.0 8e-64 MKKILFLMTLLVMGVSFAFAQTNADIKFDKTTHDFGKFSENSPVVSCTFTFTNIGDAPLV IHQAVASCGCTVPEYTKEPIMPGKKGTIKVTYNGTGKYPGHFKKSITLRTNAKTEMVRLY IEGDMTAKDAK >gi|222159316|gb|ACAB01000043.1| GENE 63 72598 - 73203 449 201 aa, chain - ## HITS:1 COG:no KEGG:BT_1084 NR:ns ## KEGG: BT_1084 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 201 1 199 199 258 85.0 6e-68 MKMKKSFLSIAFLAVFMLTATNSQAQSWSDLLNKDNISKVVNAITGTTESIDMTGTWSYK GSAVEFESDNLLMKAGGAAAATMAENKLNEQLSKIGIKDGQMSFTFNADSTFTSTVGKKT LKGTYSYNASTKQVDLKYLKLLNLHAKVNCSSSSLELLFNSDKLLKLMAFIGSKSSSTAL KTVSSLAENYDGMMLGFQLSK >gi|222159316|gb|ACAB01000043.1| GENE 64 73211 - 73816 575 201 aa, chain - ## HITS:1 COG:FN2013 KEGG:ns NR:ns ## COG: FN2013 COG0218 # Protein_GI_number: 19705309 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Fusobacterium nucleatum # 1 192 1 189 194 138 45.0 6e-33 MEITSAEFVISNTDVKKCPAGIFPEYAFIGRSNVGKSSLINMLTSRKGLAMTSSTPGKTM LINHFLINKNWYLVDLPGYGYARRGQKGKDQIRTIIEDYILEREQMTNLFVLIDSRLEPQ 
KIDLEFMEWLGENGIPFSIIFTKADKLKGGRLKMNINAYLRELGKQWEELPPHFVSSSED RTGRVDILNYIENINKDLNVK >gi|222159316|gb|ACAB01000043.1| GENE 65 73941 - 75395 785 484 aa, chain - ## HITS:1 COG:sll1087 KEGG:ns NR:ns ## COG: sll1087 COG0591 # Protein_GI_number: 16330938 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Synechocystis # 1 422 3 423 512 116 27.0 1e-25 MMILVTILCYFAVLLLIARITGRKGGSNAAFFKGENQSPWYIVSFGMIGASISGVTFVSV PGMVRGMDMTYMQTVLGFFFGYMAVAHILLPLYYKLNLTSIYGYLGTRIGVRAYRTGSFF FLLSRMLGTAAKLYLVCLILHTYVFQEMHVPFWLIAVGSVALVWIYTHKSGIKTIVWTDT LQTFCLIAALVFIIYFTIQRLDLNFSGIVQTIQNSEHSRIFIFDDWVSRQNFFKQFFSGI FIVIVMTGLDQDMMQKNLSCRNLREAQKNMYCYGFSFIPLNFLFLCLGILLIALAGQMQL ELPAMNDDILPMFAAQGYLGQSVLVLFTIGIIAAAFSNSDSALTAMTTSVCVDLLNTEKD TEETARRKRGKVHLSLSVLLAFFICLVEILNNKSVIDAIYIIASYTYGPLLGMFAFGLFT RRQTNDRWVPLIAILSPLLCYLADWWIGKETGYKFGYELLMLNGTLTFAGLICMSKKRKT LKVP >gi|222159316|gb|ACAB01000043.1| GENE 66 75487 - 76104 544 205 aa, chain + ## HITS:1 COG:DR0198 KEGG:ns NR:ns ## COG: DR0198 COG0353 # Protein_GI_number: 15805234 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Deinococcus radiodurans # 4 198 2 193 220 199 49.0 3e-51 MNQQYPSILLEKAVGEFSKLPGIGRKTAMRLVLHLLRQDTATVEAFGNSIITLKREVKYC KVCHNISDTETCQICANPQRDASTVCVVENIRDVMAVEATQQYRGLYHVLGGVISPMDGV GPNDLQIESLVQRVAEGGIKEVILALSTTMEGDTTNFYIYRKLDKLGVKLSVIARGISVG DELEYADEVTLGRSIVNRTLFTGTV >gi|222159316|gb|ACAB01000043.1| GENE 67 76131 - 76589 298 152 aa, chain + ## HITS:1 COG:no KEGG:BT_1080 NR:ns ## KEGG: BT_1080 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 149 1 149 149 238 87.0 6e-62 MEEQIKRAVRNLNISYVFFWVLPAFLLGAGEFELFPVGGLVDNAPAIYYFETVGILLTAL CVPLSLKLFSLVLKKKIDHMTITLALKRYVQWNIVRLGVLEVAIVVNLLCYYLTLSSTGN LCMLIGLTASLFCLPSEKRLRNELHIAKEEKL >gi|222159316|gb|ACAB01000043.1| GENE 68 76586 - 77122 348 178 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|229254479|ref|ZP_04378409.1| acetyltransferase, ribosomal protein N-acetylase [Capnocytophaga ochracea DSM 7271] # 11 172 2 162 166 138 40 1e-31 MRQSFLMNDRIYLRAVEPEDMDIMYEMENDPSMWDISNFTVPYSRYVLRQYIEGSQCDVF ADKQLRLMMVRKSDQCILGTIDITDFVPLHSRGEVGIAVHKDYRQQGYATDALKLLCEYA FDFLSLSQLYAHVTTDNDVCVKLFTSCGFVQCGLLKNWLQVEGCYKDALLLQCLNPKK >gi|222159316|gb|ACAB01000043.1| GENE 69 77126 - 77716 405 196 aa, chain - ## HITS:1 COG:CPn0139 KEGG:ns NR:ns ## COG: CPn0139 COG1678 # Protein_GI_number: 15618063 # Func_class: K Transcription # Function: Putative transcriptional regulator # Organism: Chlamydophila pneumoniae CWL029 # 19 196 10 188 188 104 32.0 8e-23 MNIDSDIFKIQSNNVLPSRGRILISEPFLRDATFGRSVILLVDHTDEGSMGLVINKQLPL FLNDIIMEFKYLDEIPLYKGGPIATDTLFYLHTLSDIPGSISISKGLYLNGDFDEIKKYI LQGNKISECIRFFLGYSGWDSEQLNNEIRENTWLVSEEEKSYLMKNNIKDMWRTALEKLG SKYETWSRFPQVPTLN >gi|222159316|gb|ACAB01000043.1| GENE 70 77811 - 79127 1109 438 aa, chain - ## HITS:1 COG:MJ0001 KEGG:ns NR:ns ## COG: MJ0001 COG0436 # Protein_GI_number: 15668173 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Methanococcus jannaschii # 26 277 12 241 375 77 25.0 4e-14 MKNTPIERHLIDETINEFQIVDFSKATIREVKAIASKAEAASGVEFIKMEMGVPGLPPSA VGVKAEIEALQNGIASLYPDINGLPELKKEASNFIKAFINVDLNPEGCVPVTGSMQGTFA SFLTCSQCDEKKDTILFIDPGFPVQKQQLVVMGQKFETFDVYDYRGDKLKEKLESYLKKG NISAIIYSNPNNPSWICLKEEELQIIGELATQYDVIVLEDLAYFAMDFRQDLSKPYQPPF QPSVAHYTDNYVLLISGSKAFSYAGQRIGVSCISDKLYHRSYPGLTKRYGGGTFGTVFIH RVLYALSSGTSHSAQFAMAAMLKAANEGQYNFLNEVKIYGERAQKLKEIFLRHGFHLVYD NDLGDPIADGFYFTIGYPGMTSGELAKELMYYGVSAISLVTTGSHQEGLRACTSFIKDHQ YAQLDERMKLFAENHPIA >gi|222159316|gb|ACAB01000043.1| GENE 71 80270 - 81307 745 345 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716342|ref|ZP_04546823.1| ## NR: gi|237716342|ref|ZP_04546823.1| predicted protein [Bacteroides sp. 
D1] # 1 345 1 345 345 663 100.0 0 MKIHNHTLPKAIRRHSIRRWLATAATLTAMLTATLATLTGCSADNDPETATGGDGTSVTF GATIAAREGHDNGNAPASRAIPDGTFAEGDQILVHMDGKRKAFVYSTAQGFVHAPNYPGT NVDPTPPVWQDGEAEKSLLAYGPTACISSAGDGEYIYEVYVSVAQNDDREYKYSDYVYTA QTLYRSNPKLSFRHGMARVVLRLRSGGSLTDEDVAGASVLLGDKNLFTRADIDPQTGTLT AHKPKPNGLQPQTITPHRCADTPAGYAVAYEALLPPQDVSGKEFIQVRLSDGTELGYIAE SGSMLEGGHEYIYNVTVDAAQQVPAKNISYSITRHVHRPTSQTTE >gi|222159316|gb|ACAB01000043.1| GENE 72 81355 - 83442 1204 695 aa, chain - ## HITS:1 COG:no KEGG:Dtox_2889 NR:ns ## KEGG: Dtox_2889 # Name: not_defined # Def: protein of unknown function DUF291 # Organism: D.acetoxidans # Pathway: not_defined # 346 684 712 1061 1672 75 26.0 8e-12 MKQFTSTRLFAVTSVVAAMLAAGCSEENNEPSGNASNAAIVTASIGKTDNVVSSRAANTV WDVDDCIGISTSSVNGKTNYVNIRYKTNGSVFSPVPGAAGEDNTIYFQDKSPTTFTAYYP YEGANGTKPGSDGIITKELTVADQSPENLPAIDYLWAQQTAQSSNPKVDFRFSHRMSRLI LNFKAGAGTALPNGLTYTLTQLATEGTFDTSTGETKATGTVSKLENLPTSTIGRQITGIA ILWPQAVSSIHLQLKLGDNTFGAALTFPTGTAGEVLAPSTSYTFNVTVERTHITVGQADI DDWTSGGSKDITLQEARTLTFDPNGGSGTIKGDKVFEGATTSLPDDNGLTPPSGKTFVGW NTLSGGGGEFYAAGSTLTMPAGDLTLYAQWSGDGSAEDNPILITDAQGMKDIGASIESMR KHYRLCNDLVLDDWEAIYYGYGNGQGNPFSGTFDGGGHTVTLNGVKPIRAQFNNASGYFI SLFAEIRGEVRRLRVDGQITVDGTDENAPYHAGGICGYNNYGTVTDCISNVTVTATGKMG GLSAGSIAGWNDSGSIFNCYATGKIESTATAPFVSLGGIVGDSNSRIANCATLNSNISGK DGQTNTYIHRITGSNGGYIVNNYASATLPASEDKGLDKPDGKDCDAKPAASWWKEPGRWA NSYTTPAGSTKLFTPWNFTTTWEVTDGNLPVLLRE >gi|222159316|gb|ACAB01000043.1| GENE 73 83492 - 84346 687 284 aa, chain - ## HITS:1 COG:no KEGG:BF3848 NR:ns ## KEGG: BF3848 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 7 280 9 302 304 147 32.0 5e-34 MKITITLLCICATLFLAGCDVKDPIYNTPHPDKGAIVITADWTQRGEGIATPAKYITEAA GTSFTLHEPVAALAGLFVPGRTELLGYNPAAHVTVKDGVATVDNDPDNAGMQHPDPGVLL GGTVTADVVADDTVSVVLPMRQLFRKIAFEVAINGGDSRRIVGIEARLEGVAPSVDLRTG EVTGSAAAVSVPFVLSAEKLTGTVWVPGMIPGAVQRMVVALTFMDGMKDMVETDFSEMFK DFNTDKLHPMRLTGDVYAPVNSETGGTIIGWTETSGGDVNAGME >gi|222159316|gb|ACAB01000043.1| GENE 74 84370 - 85536 787 388 aa, chain - ## 
HITS:1 COG:no KEGG:BF3849 NR:ns ## KEGG: BF3849 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 26 383 115 490 493 196 34.0 1e-48 MYKPMKHLHKILSVLLLWGVSLCEISHARASETAGADTVVSFRFLPGENMFTRAGNEAEL ERLYVLIDCHKAEIAAGRMPVYVDGYCASQPTAKENLNTAFIRANRVKSELITRKGLKET DFVTANYARAYHNNKDMVVVTLRIPSAKATTETTPAKEEVEREEPRQKEVTVEPETKPGS QQATVKKDPVAEKQPEANRNVEQQPAPVTERLTSVPEQPYRFAVRTNVLYDAFLLPTLGV EWRVNRDLGVKLDGSLSWWGDEYGRVQKMWLVNPEVRWYLLDKKRFYVGASGSYGEYNIY KYMLGGIVSKDTGYQGKLWNAGLTAGYQLYLSRDFSIDFNLGLGYTRSEYDSFGMTDGVR VYKERNKSKHFWGPTQAGISLVWTIGSK >gi|222159316|gb|ACAB01000043.1| GENE 75 85762 - 87021 695 419 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262407949|ref|ZP_06084497.1| ## NR: gi|262407949|ref|ZP_06084497.1| predicted protein [Bacteroides sp. 2_1_22] # 1 419 1 419 419 791 100.0 0 MIKIPYIWKVDISSIYVTAIIPLIPFFSAIASGILVAFSLQDCLKREERRLKRIVLFYFS ISGIAWFITICYAFSPVLFTWLNIVCLLSFILPAIFFFRIIRFLTRLGQPENFSLLHYVL PGILAMVTLIWSLSVPFDVQLEIVRGRAEIVPAGYEAFARFYTLKPLFRVVFGLTYYLFA IGVLAVYYKRATGNKAIVHRPANWVLFIVGISLASLFSSVLPTFLMKRTEFYSSIWTLIV ALSIAMQHVLLAYHVMRRDYVPYIITEKTPKQPKATKKHVPQTEEQAVETKEKAKAPRKQ HSGKLNRRRFESFIRNEKPYLNPDYKITDLVEALDINRTTLSAFINRTYGVNFNRYLNRL RLKELEKLRSQPDGQGKSISSLLDKAGFKDFRNYSRAAAAEREDAEQKNETDKKKGDTE >gi|222159316|gb|ACAB01000043.1| GENE 76 87018 - 88148 535 376 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716347|ref|ZP_04546828.1| ## NR: gi|237716347|ref|ZP_04546828.1| predicted protein [Bacteroides sp. 
D1] # 1 376 1 376 376 690 100.0 0 MINIDQIDAFAFSAPIVCALVCMIMMLMDAVVRRHNVQEKRLRLFLALTYLITSLGWLGM VFYSISPRLFASYYTVFLFTLMLDQVMIYRFVSIITNTGERRKLNRLHLIIPLLFTLISI ISDMIVPVEQQRAVIFSEVSADEPNLWFRIMYVLTTTVFIVYNTLYPFLNLRNIRRYRKF IVNYSSDAYNASLAWLAVIQALILITVPVPLAGLLFRIPTISFSYFAWVGTLPYFINYLI LCYNLLNDNYLIIQPEDVKEDAAAKVSFISRKQFEHYLREKKPYLNPNLRITDLAAGLHT NRSYISGFINKEYDMNFCRLINRCRLYHLDRLRLSSLNSEKDNIDLVLMAGFSSYRSYLR VKNEEDRLSLLKVFEK >gi|222159316|gb|ACAB01000043.1| GENE 77 88154 - 89170 791 338 aa, chain - ## HITS:1 COG:lin2305 KEGG:ns NR:ns ## COG: lin2305 COG0332 # Protein_GI_number: 16801369 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Listeria innocua # 4 328 1 311 312 276 43.0 6e-74 MENINAVITGVGGYVPDYILTNDEISKMVDTTDEWIMGRIGIKERRILKDEGLGTSYIAR KAVKQLIKRTHTNPDDIDLVIVATTTPDYRFPSTASILCERLGLKRAFAFDMQAVCSGFL YALETGANFIRSGNYKKVVVVGAEKMSSIINYTDRATCPIFGDGGAAVMLEPTTEELGVM DAVLRTDGKGLPFLHIKAGGSVCTPSYYTLDNQMHYIYQEGRTVFKYAVSNMADACESII ARNHLSKDNIDWVIPHQANQRIITAVTQRLEVPAEKVMVNIERYGNTSAGTLPLCLWDFE DKLKKGDNLILTAFGAGFAWGAIYIKWGYDGKKNKNVY >gi|222159316|gb|ACAB01000043.1| GENE 78 89309 - 89434 94 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDNQSFNWLLHGCHVLSEVAVAYSPLPHTKMVIYNFRQILV >gi|222159316|gb|ACAB01000043.1| GENE 79 89742 - 90071 91 109 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716349|ref|ZP_04546830.1| ## NR: gi|237716349|ref|ZP_04546830.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 109 1 109 109 197 100.0 1e-49 MEQMKCTMQNTDEEIKRLEHVLCEMERISDTHREPRVIPHVKVGDVFAYLCRVFEFLQHR LKLHFKYKLACEVILAKISKYRDNNMKNKCFRNRNDEFDIFSLFLCVHF >gi|222159316|gb|ACAB01000043.1| GENE 80 90080 - 90484 390 134 aa, chain + ## HITS:1 COG:no KEGG:BT_1071 NR:ns ## KEGG: BT_1071 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 134 2 135 135 203 73.0 2e-51 MQIIRLKGKDKHLYRLLAPMVMDPEVIRANNNYPFKTGEEYVWFIAIEDKEVVGFVPVEQ KSRKKAVINNYYVKAEDAGREEILSHLLPAVIAEFGPESWLLNSVTLVQDKETFEKFEFV SMDKKWTRYVKMYR >gi|222159316|gb|ACAB01000043.1| GENE 81 90488 - 91792 857 434 aa, chain + ## HITS:1 COG:lin1347 KEGG:ns NR:ns ## COG: lin1347 COG3969 # Protein_GI_number: 16800415 # Func_class: R General function prediction only # Function: Predicted phosphoadenosine phosphosulfate sulfotransferase # Organism: Listeria innocua # 9 433 3 434 434 439 49.0 1e-123 MAKKKIAGTKNVYELAQERLKVIFNEFDNIYVSFSGGKDSGVLLNMCIDYIRKNNLKVRL GVFHMDYEIQYKMTIDYVDRMLEANKDILDVYRVCIPFRVATCTSMYQSFWRPWEDNKKN IWVRSMPKKAMTKDDFPFYNTTMWDYEFQMRFAQWIHNKNDAVRTCCLIGIRTQESFNRW RCIYMSRKFQMYHKYKWTSKVGNDIYNAYPIFDWKTTDVWTANGKFQWDYNTLYDLYYRA GVNLERQRVASPFINEAQESLQLYRVLDPNTWGKMVGRVNGVNFTGMYGGTHAMGWQSVK LPEGYTWREFMYFLLSTLPERARKNYLRKLSVSVNFWRTKGGCLSDATIQKLIDAKVPII VMDNSNYKTLKKPVRMEYQDDIDIPEFKEIPTYKRMCVCILKNDHACKYMGFSPTKEEMS KRSQIMEQYRIIVS >gi|222159316|gb|ACAB01000043.1| GENE 82 91789 - 92331 543 180 aa, chain + ## HITS:1 COG:L69383 KEGG:ns NR:ns ## COG: L69383 COG1475 # Protein_GI_number: 15673430 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Lactococcus lactis # 6 165 7 166 180 221 63.0 4e-58 MSVDKSPVYEVKAVPVEKVYANDYNPNVVAPPEMKLLELSIWEDGFTMPCVCYYNKDEDH YILVDGYHRYTVLKTSQRIYKRENGLLPIVVIDKDLSNRMSSTIRHNRARGMHNIELMCN IVAELDKAGMSDQWIMKNIGMDRDELLRLKQISGLADLFANREFSIPDEVAPTETERKTL >gi|222159316|gb|ACAB01000043.1| GENE 83 92789 - 93016 119 75 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237720862|ref|ZP_04551343.1| ## NR: 
gi|237720862|ref|ZP_04551343.1| predicted protein [Bacteroides sp. 2_2_4] # 1 74 27 100 130 116 87.0 4e-25 MREITKLTISILYLIFLLSVMCVQGSNVTGTLFQLTSKVTTCADIYARLLTKPLSNPLLG SLSYSDTQETIPMNL >gi|222159316|gb|ACAB01000043.1| GENE 84 93034 - 93399 216 121 aa, chain + ## HITS:1 COG:no KEGG:BT_1067 NR:ns ## KEGG: BT_1067 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 120 484 607 608 109 50.0 3e-23 MEKEVHIRQREAVPATGSTLQLMPTGNSNSEVYEIDYFCLTGEVDESAKNWIRLYSSVMP EGHTGEERIIVEDGKIYIKVLPNVEADSRNGIVHLTTMVSSVKTGISNVQRIKIDVTQLG Q >gi|222159316|gb|ACAB01000043.1| GENE 85 93421 - 93993 609 190 aa, chain + ## HITS:1 COG:no KEGG:BT_1066 NR:ns ## KEGG: BT_1066 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 190 1 189 189 333 83.0 2e-90 MKKVAILILFMLGCAISEGHAQKVALKSNLLYDATTTMNLGLEIGLARKWTLDIPVNYNP WKLSDGKRLRHLGVQPEIRYWFCESFRRMFVGMHGHYADFNVGGWPDWSFISDNMQHTRY QGYLYGGGFSVGYSWILKKRWSIETSVGVGYAHIVYDKYPCATCGTRLKESSKNYFGPTK ASVSLIYVIK >gi|222159316|gb|ACAB01000043.1| GENE 86 94002 - 95513 1467 503 aa, chain + ## HITS:1 COG:no KEGG:BT_1064 NR:ns ## KEGG: BT_1064 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 118 500 4 386 392 598 77.0 1e-169 MIIRKIGFLVCMLVVCSLNINTAMAQSRSYEGGITVNPVRLEQKGDFIHVDIDFVLNNVK VKSARGMDFIPRLVTPGRTQNLPKVSIKGRDEYLAYERELALMNAKEKRNYEKPYIVEKA GKLRNDTIRYQYLVPFESWMKDARLDVQRDECGCGETALMNIEEFGKVTLERVWTPYVVV PQFAYLQPKAEEIKQRDIQAECFLDFEVNKVNIRPEYMNNPQELAKIRKMIDELKSDPNV KVNRLDIIGYASPEGTLAANKRLSEGRAMALRDYLAYRYDFPRNQYYIVFGGENWDGLEK ALETIELEYKDEVLDIIRNIPIEKGRETKLMQLHGGTPYRYLLKYIFPSLRVAICKVNYE VRDFSVEEAKEIIKTRPQNLSLNEMFLVANTYPTGSQEFIDVFETAVRMYPQSEIANINA ATAALSRNDLVSAERYLSMVNSNKNLPEYNNAMGILMLMKGDYELSKKYLKVAEQSGLDA ARGNLEELVRKKANAAEMRKNGK >gi|222159316|gb|ACAB01000043.1| GENE 87 95571 - 97280 1327 569 aa, chain + ## HITS:1 COG:no KEGG:BT_1063 NR:ns ## KEGG: BT_1063 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 2 564 1 561 566 88 23.0 7e-16 MLKIKSILVSLLAIVALASCSNENLEENPEVEGGAGYDVAYVSISLTNPKVPGSRASGEQ PALPAESAINELYVITFDAGKVVTKDEKATKYATVLGSGSFGTNSGVTTPNTPVKVDPNT KYLLVIANPGYQLKDRLDNLSAGATYATINGMITVPTNNTKPNNAYLVEEVVHSNGCAMI NVGFYDDSDSDPSNHAWKDECLLDVSDKIVLVSDYKSEAQAQNAAKSNPATLEIERLAAK LEVMIGSPLAVGPFEDGTNASLGQFDFGNWTIDYYNSLFYPFAKKTTTASSHTTGFYKSN FYTVDPNFTTSGGTEYLTGIVKNTLDANREPKVEWVAESADAGDNYKYCIENTMAEGYQK FGAATRLVLKGQYAPWKSGEFTLGDDWYRLPNGTNSVNFKSFADLLAAYTPAKAKQTNSD PMTAQEKLLVTACELFYTQIQSELTTNDPSSFALLTQTILDDNNIKNGGELCKKEGCIYW YPKSLNYYYYEIRHDNAANSYMEYGKYGVVRNNYYTLTLTKVNGNGTPWYPGGGPEDPDE EEDIDKKGAYLHFEIKVAPWIYWTTNFEI >gi|222159316|gb|ACAB01000043.1| GENE 88 97557 - 98555 639 332 aa, chain + ## HITS:1 COG:no KEGG:BT_1062 NR:ns ## KEGG: BT_1062 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 325 11 310 317 153 32.0 1e-35 MFGKTKSLFLIVASMLCMASCDSIREDLPRCELWLEFVFDYNMEYADAFNPQVKSVDVLV FDSDDKLLFTKSAEVAALVGGNRMSLTDELDFGSYKVLTVGSLSDRFRLSDNAGNKLAPG TSTLQQVIVSLKRETDVVNFEFQHLYFGEVVEVDHLPSSTDHKIYPVNLIRDTNRFNLAL MGYEENKVDGTQYTFEIQAPENAVYSWENEPAGQGPVTYVPYYTGPGEISDVVMSARLNT MRLLNRSGWDYKFIIRDANTEAEVWSYNLMTLLSIARPVSRYDGTELPFQEYLDRQSEWN LIFTVVEKNGGGFLQIGIVVGNWIHWLHGMEV >gi|222159316|gb|ACAB01000043.1| GENE 89 98729 - 99670 671 313 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294807955|ref|ZP_06766734.1| ## NR: gi|294807955|ref|ZP_06766734.1| putative lipoprotein [Bacteroides xylanisolvens SD CC 1b] # 1 313 67 379 379 588 100.0 1e-166 MRFIVFGSTPGGVRLDVNEHILLSTPETATDIDAQLLEVTSSNDILVVVIANEPQSLTSK LDGIANLLTLQEMIYDISSILNSDGQIISATGMPMTGVIRDISIAPDETKTVQMVIERAV ARVDVFIEAIDGGAVTGYIAGSTSVTLHNFSYDSYFVMGNVDNGTRDNADSSKNYGKVKE DVSESNLLTHKWTAATTETWAYSSAPGAENRKLLCSFYTAERLFKSDYSDRLSISMANVP KGPSDVTGITEKVIESVTKVDGTGSPTAQPFTEIRRNNVYQVTARVGKIGIQILTISVED WGERQDIDLDMDL >gi|222159316|gb|ACAB01000043.1| GENE 90 99745 - 101319 966 524 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716360|ref|ZP_04546841.1| ## NR: 
gi|237716360|ref|ZP_04546841.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 524 12 535 535 981 100.0 0 MAMAVLSAVLFSCVREEMTCDKELIVVQRIGEGGYVYTRGAAISSNTDLKEETFGLYGSL TPNASVPQPYFNASATVNADLTATISPLQYWPGLLNASMKFFSWYPYNDANAPTASFTDP GEMVLNYTANASAANHVDVLAAISGPVWVEGVNIHFYHTLTKVTFTFKKVAPVPDEVTIE KIEFQNVGKSGKLAMTEIPTTTTKNGKPKFVWSDVATGKVVSTLTDNKTVTEDATLMGDT FLMLPTDAFSATAKIVVTTNFGDREFLFSDIIAKNPHSWESGEYINYNLTISNETYQLSA TPLEWTESPVNVIFDKQYYLKLSQTKVQTAGDGVTVNIEAKTNYDANPDTGYLPGASLNK STMDTWATVNMAETSFSGGVYTYNIQVVMPGFNSGTGTKRETGFYINAGNLRHHVLLSQW KEDGEWLTSNVELDGSDGGTGNMQRRKLVFTSGNPGTWEWKITQIQDPDKILLNSETMLE TSGVSGDVVYFYFRADAVSGDTATLTLTNTNGDNLPIAVTLTAP >gi|222159316|gb|ACAB01000043.1| GENE 91 101364 - 104591 1682 1075 aa, chain + ## HITS:1 COG:no KEGG:BDI_2707 NR:ns ## KEGG: BDI_2707 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 16 421 7 374 720 73 23.0 4e-11 MQYGISIIRRCYLLPVLLMTVWLTGCVQEELGSTPSPTGNGIRFTLTVPDVNLPSVSSRT MTGTGTAKKEDEIETVDILVFDMSKTPAVYLECVSATGVTQDLADNSTVSFSAVLSPTTA STCIVVVANKELDNIVPGFIKGTTTKVEAMEKMLHAQTGKWLADGSTTDGYTRIPMYGEK VISKITPSMDPITGINMKRMLARIDIRNNSATSNFTVEEVYLANYNTTGYIAPAWNTNGQ VTDPAPDTPVLPVNSGKMTEEGDAILYSVNGNTPYEGEIYTFESIAAEDVGGVGQDGEAS RKNATCLIVKGRIGTGESTFYRVDFTKTGNTGEQVEYLSLKRNHKYIITITKALGIGYKS FSEALASYTVMSNLKFRLIHYDRDKVKDVVYNGQYILGVGESEVAVTQYQNNSYAIDVFT DNPGGWKATVTAGSDWLKFEGGADAASGVANDDTQLKLKIPYFNNDNIGVERKATVTLTA GRLTHNIEVTQYTIDPGIIKFVDAYGNVLERGLFFPIRNPDGDELPLEPQTVYVMFSVDK IEGKLFDANDIGMIQYNAGGLIPQLDRTNTVPFSERVQAFTIQPNPRRAGDGTVEDGEGW WWRWDFILFGLYDKEGYLTHVQFPINQGELEFSFRYIPTTVNSRTYKVNLGAEQYLQLFV NNNWEIMDIEELNITNDDGTGLIRTDSDNDVIIGRKNADDKMYLESIIGYNDGKGNDVVG HGYDFRLKLHPGKWKEGKSGTIRITFRNVMHTVSDEEFPFYRTIDLQMVSETKSYTTAGE PLFYLYPLRFDNRLYYKEIGETKQSVGRLENVADAEDICRNIGDGWRLPTASELLLSFAY ENALGGNAENYNNHDSQNIYGWYQNWTGNYWTSSYYEAKGSATRFWMEMSAGFLDSESIS KNNYFRCVRNNTNSGKKYPYLTVESSGVTIVSRDASGGTDPSVLFASGETPGTSDAMNKI APKFQIENTSSNGKTWSEAKAACENKGSGWRLPTQREMYLVLSMGGAVTSIENQGFGGST 
TWTGSGFEKISAVHWTLTSRDGGYWVVGHNNGEFGAWTTGESTDWAWYRCVRTIE >gi|222159316|gb|ACAB01000043.1| GENE 92 104691 - 105650 606 319 aa, chain - ## HITS:1 COG:no KEGG:BT_1061 NR:ns ## KEGG: BT_1061 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 319 1 319 319 484 78.0 1e-135 MKNGILNELISTMRERIPQDMNLANTLADILCMGKEAVYRRLRGEVSFTIDEVALLSQKL GISIDQIVGSHVSNKVTFDLNLLHASSALESYYEIINRYLQIFDYVKTDNTTEVYTASNS LPFTLYSSYENLSKFRLCRWMYQNGDIKTPHSLEEMSVDERIVNVHKKLSESIRQCPKTF FIWDTNIFYSFVKEIKYFASLNLISKDDVMHLKEELLQLLGVVEHLSIKGEFSENKKVSF YLSNISFEATYSYIEKHDYQVSLLRVYSINSMDSQSSYICQMQRNWIQSLKRHSILVSES GEAQRIAFLQKQLEVINTL >gi|222159316|gb|ACAB01000043.1| GENE 93 105887 - 106843 805 318 aa, chain - ## HITS:1 COG:no KEGG:BT_1060 NR:ns ## KEGG: BT_1060 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 318 2 317 317 450 70.0 1e-125 MTNNPNANLIEAMKEKLPLKGQLADMLMDTLYIGKEAVYRRLRGEVPFTLQESALISRKL GISLDKIIGLSFKSNAMFNINIVDYDDPFESYYNILEKYVSLINTMPDDPNSVMGTSANI IPQTLYLKHELLAKFRLFKWMYQNKYIDCKSFEELDIPPKLVNIQKDYVAMTRHIHSIDY IWDNMIFQHLINDIQYFASIHLISDETKEEIKKELFLLADELEELAINGKTADGNRVRIY VSNINFEATYSYVDTNNLQLSLIRIYSINSITTMDNEIFCTLKEWIQSLKKFSTLISESG EMQRIQFFKQQREIIDAL >gi|222159316|gb|ACAB01000043.1| GENE 94 107047 - 107976 677 309 aa, chain + ## HITS:1 COG:XF0611 KEGG:ns NR:ns ## COG: XF0611 COG0451 # Protein_GI_number: 15837213 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Xylella fastidiosa 9a5c # 3 307 22 326 329 484 70.0 1e-136 MKRILVSGGAGFIGSHLCTRLINEGHDVICLDNFFTGSKDNIIHLMDNHHFEVVRHDVTY PYSVEVDEIYNLACPASPIHYQHDPIQTAKTSVMGAINMLGLAMRLDAKILQASTSEVYG DPIVHPQPESYWGNVNPVGYRSCYDEGKRCAETLFMDYHRQNNVRVKIIRIFNTYGPRML PNDGRVVSNFILQALNNEDITIYGDGKQTRSFQYIDDLIEGMIRMMNTEDEFTGPINLGN PNEFPVLELAERIISMTGSSSKIVFKSLPDDDPKQRQPDITLAKEKLGWQPTVELEEGLK RMIEYFKNV >gi|222159316|gb|ACAB01000043.1| GENE 95 107996 - 109129 1055 
377 aa, chain + ## HITS:1 COG:CAC3391 KEGG:ns NR:ns ## COG: CAC3391 COG0642 # Protein_GI_number: 15896632 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 130 373 342 576 579 119 34.0 7e-27 MNMEINPSEYKVLIVDDVISNVLLLKVLLTNEKFKIVTAGNGTQALEQVKKENPDLVLLD VMMPDISGFEVAQQMKADPEMAEIPIIFLTALNSTADIVKGFQVGGNDFISKPFNKEELI IRVTHQISLVAAKRIIVAQTEELRKTIMGRDKLYSVIAHDLRSPMGSIKMVLNMLILNLP SDTIGDEMYELLTMANQTTEDVFSLLDNLLKWTKSQIGKLKVVYQDINMVEVVEGVSEIF TMVASLKNIKIVQDVPVENVAVRADIDMIKTVIRNLISNAIKFSNEGSEVVVSLTEEDGM AIVSVKDSGCGIDDENQKKLLHTDTHFNTFGTNNEEGSGLGLLLCQDFVVKNGGKLWFTS KKGDGSTFSFSIPLLEK >gi|222159316|gb|ACAB01000043.1| GENE 96 109277 - 112156 2022 959 aa, chain - ## HITS:1 COG:no KEGG:BT_1057 NR:ns ## KEGG: BT_1057 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 959 1 954 954 1655 86.0 0 MQSFLQLVAHDLYAKIGNDLSRTALIFPNKRANLFFNEYLAGESDQPIWSPAAMSISDLF QKLSVQKSGDPIRLVCELYKVFKEETQSQETLDDFYFWGELLISDFDDVDKNMVDADKLF SNLQDLKNLMDDYEFLDKEQEEAIQQFFQNFSIERRTELKEKFISLWDKLGTIYHHYREN LTELGIAYEGMLYRNVIEQLDTDQLKYDKYIFVGFNVLNKVENEFFRKLKDAGKALFYWD YDIFYTQQIRKHEAGEFLKRNLGEFPNELPESFFDTFKEPKKIRYISASTENAQARFLPE WIKMITDNHSQIAEEKEKENAVVLCNEALLLPVLHSIPQEVKNVNITMGFPLAQTPIYSF INAAMELQTNGYRSDTGRFTYEAVSAILKHPYTRQISSHAGPLERELTKTNRFYPLPSEL KQDDFLATLFTPRNGIKELCDYLIELIKDISTIYRKEGEYNDIFNQLYRESLFQSHTKIN RLYSLIESGELSIRTDTLKRLITKVLTASNIPFHGEPAIGMQVMGVLETRNLDFRNLIIL SLNEGQLPKSGGESSFIPYNLRKAFGMTTIEHKNAVYAYYFYRLIQRAENITLLYNTSSD GLNRGEESRFMLQLLVEGPHDITREYLEAGQSPQSIQEIQIEKTPEVLRRIYRTYDSTNP SSVILSPSALNAYLDCRLRFYYRYVAGLKTPDEVSAEIDSALFGTIFHLSAQLAYTDLTA NGKMIQKEDIERLLRNEVKLQSYVDQAFKEELFKVAPEEKPEYNGIQLINSKVIVSYLKQ LLRNDLQYTPFEMVAMEKKVSEEITIQTGQGPFTLRLGGTIDRMDAKESTLRIVDYKTGG SPKIPANIEQLFTPSETRPNYIFQTFLYAAIMSRKQSLMVAPALLYIHRAASESYSPVIE MGEPRKPKIPVNNFAFFEDEFRERLQALLEEIFNEKEPFMQTEDTKKCSYCDFKAICKR >gi|222159316|gb|ACAB01000043.1| GENE 97 112186 - 113007 488 273 aa, chain - ## HITS:1 COG:no 
KEGG:BT_1056 NR:ns ## KEGG: BT_1056 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 273 1 269 269 314 63.0 2e-84 MVREKQICKILKEIRRQIAEANDIEFITSECQYQGDCLGTCPKCEAEVRYLEQQLERKQI AGKAITVLGISAGLIAMTPMTSCTNSANKGMNKEVPSDTTSYENLIMGEPVPTPAEDTII ASIKDAPPPPPPAPTPDDVLEGDICEEPPVVVGFIAPDISNTSLSVSDTLEVAPVMPEFP GGQQALIQFLGKKIKYPTVAQGEMGLQGRVIIRFVVDKEGNVVNPKVVRSVDPYLDKEAL RIINQMPKWKPGELEDGTKVAVYFTVPVMFRAQ >gi|222159316|gb|ACAB01000043.1| GENE 98 113012 - 113629 431 205 aa, chain - ## HITS:1 COG:AF1450 KEGG:ns NR:ns ## COG: AF1450 COG1180 # Protein_GI_number: 11499045 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Archaeoglobus fulgidus # 12 202 12 256 302 98 26.0 7e-21 MKAVPLIGIARHRLTIDGEGVTTLVAFHGCPLRCKYCLNPTSLQPNGVWESYNCNQLYEE VRKDELYFLASCGGVTFGGGEPLLQSEFIRQFRQLCGPEWRITVETSLNVPQQNVEELIF TIDNYIVDIKDMNNDIYQRYTGKNNENVLGNLRYLIGKDKTKQIVIRTPLIPSYNTEKDI NLSIELLKEMGITQFDRFTYKTPND >gi|222159316|gb|ACAB01000043.1| GENE 99 113626 - 116796 2444 1056 aa, chain - ## HITS:1 COG:jhp1446 KEGG:ns NR:ns ## COG: jhp1446 COG1074 # Protein_GI_number: 15612511 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) # Organism: Helicobacter pylori J99 # 3 818 6 728 946 143 26.0 2e-33 MSELLVYKASAGSGKTFTLAVEYIKLLIFNPRAYRQILAVTFTNKATAEMKERILSQLYG IQIGDKDSEAYLNRIKEETGKTEQEIREAAGVALNYMLHDYSRFRVETIDSFFQSVMRNL ARELELSPNLNIELNNTEVLSDAVDSMIEKLGPTSPVLAWLLDYINERIADDKRWNVSDE VKSFGRNIFDEGYIEKGEGLRQRLRNPNTIKEYRKQLKALETEILEQMKGFYDQFEGELD GHALTADDLKNGSRGIGSYFRKLNNGVLSNDIRNATVEKCLEDAKNWATKTSPRYADIIN LANSSLIQILEDAEKLRSKNNLLLNSCRLSLQHLNKVQLLANIDEKVRQLNHDNNRFLLS DTNALLHQLVKDGDSSFVFEKIGTNIHNVMIDEFQDTSRMQWGNFKLLLLEGLSQGADSL IVGDVKQSIYRWRNGDWGILNSLNDHIEHFPIRVKTLATNRRSETNVIRFNNRIFTAAAN YLNGVYKQQLGKDCEDLQKAYADVVQESPRSTEKGYVKVSFLEPDEEHDYTEQTLISLGA 
EVENLLASGVQLNDIAILVRKNKSIPRIADYFDKELHYKVVSDEAFRLDASLAICMMLDA LRFLSDENNKIARAQLAIAYQNEVLQKGLDWNTLLLLPAENYLPAAFLEKIKELRLMPLY ELLEELFSIFEMNLIKDQDAYLFAFFDAVTDYLQSNSSELDGFIRYWNETLCSKTIPSGE VEGIRIFSIHKSKGLEFHTVLLPFCDWKLENETNNQLVWCTPQAAPFDALDILPINYSTQ MAESIYGNDYLQERLQLWVDNLNLLYVAFTRAGKNLIIWSRKNQKGTMSELLANTLPIVA KEEGIDWEEDCYEQGQLCPSEKERTKTSTNKLTQKPEKLPIRMESMRHDIEFRQSNRSAD FIQGIEEEDSDDRFINRGRMLHTLFSVIETAEDIDPAIERLIFEGVIRNDEKEKVAREVA TKAFSSPEIQDWYSGKWTLFNECAIIYKEKGVLQTRRPDRVMMKDNQVVVVDFKFGKENP KYNKQVKGYMQLLTKMGYKNITGYLWYVDEEKIEKV >gi|222159316|gb|ACAB01000043.1| GENE 100 116805 - 117431 594 208 aa, chain - ## HITS:1 COG:CAC0836 KEGG:ns NR:ns ## COG: CAC0836 COG2731 # Protein_GI_number: 15894123 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Clostridium acetobutylicum # 61 205 4 150 152 70 34.0 2e-12 MKSNFVFSSSSLFIGLFLFFFTENVYSQESADNWTLKQARQWTQKQEWANGLKAMPHKTT DYQEFASQYHKNKKVWDKTFQWLATHDLVNMPAGRYEVDGEHCYINVQDATTQDVSKRKI EAHRHGIDLQYVVKGTERFGITSAEYAEPITEYKPDVTFYKAKKIKYVDSTPDTFFMFFP KNFHQALVKAGKEPEEIRVIVAKIEYIP >gi|222159316|gb|ACAB01000043.1| GENE 101 117428 - 118036 440 202 aa, chain - ## HITS:1 COG:XF2239 KEGG:ns NR:ns ## COG: XF2239 COG1595 # Protein_GI_number: 15838830 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Xylella fastidiosa 9a5c # 2 189 4 196 206 67 24.0 1e-11 MSQSPIYLDINNNKSVITALKAGEEKVFDVVYRHYFRRLCAFCSQYVSEQEEIEEIVQET MMWLWENRCTLMEDLTLKTLLFTIVKNKALNRLSHFEIKRKVHQEIVDKYDSELNNPDFY LSDELFRLYEEALKRLPKEYLEAYEMNRNQHLTHKEIAEKLNVSPQTINYRIGQALKLLR VALKDYLPLFILIFGPNFFEQS >gi|222159316|gb|ACAB01000043.1| GENE 102 118198 - 119190 759 330 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 114 324 116 322 331 78 
30.0 1e-14 MDETILVNYLQGECNDEEAARVEAWCEEGPENRKTLEQLYYTLFVGERIAVMNTVDTEAS LDQFKSAIREKEKKAKRKSISIRWGRYATVAAAFLTGLVFAGGIAWGLLSNKLSDYEVIT AAGQRAQTVLPDGSKVWLNASSKIVYHNSLWSSDRQIDLSGEAYFEVSHDKHAPFIVNSK QIKTCVLGTKFNVRARQDENRVVTTLLQGLVRVDSPRTEENGYLLKPGQTLNVNTDTYQA ELIEYNQPTDVLLWINGKLEFKQQSLLEITNIMEKLYDIKFIYKDEALKSERFTGEFSTD STADEILNVLMHTNHFSYKKDGRIVRVMKK >gi|222159316|gb|ACAB01000043.1| GENE 103 119524 - 120483 510 319 aa, chain - ## HITS:1 COG:no KEGG:BT_1050 NR:ns ## KEGG: BT_1050 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 314 1 299 301 166 36.0 8e-40 MYMKNIFQSLSIATLLGLFVPFISSCSDDEEVFNEWNATYVSLQRNDYLSGNVKKFNLTH DANGIGGDEIKMAFTVKTQKAVSTDMVIALSAKSETEGLDASQIVLSSSQVTLKEGQMTS EEITATVDPTIFASIMEKTSFSFSVSISNVTTNDKNTVISSNLSILPVIINKAAYCNLKS GTPSNSQLISNRAGWIVNVEEGVDGAPNNLIDGKTGTDVALNNKGFWFTVDLGETKVLTG IKTNHWGNAYAPREVEILQSENGMTWKSLSSLAIPGSSTQNITFISPITTRYLKYQIITI STNGRTDVTEFNVYEPKSE >gi|222159316|gb|ACAB01000043.1| GENE 104 120501 - 121700 742 399 aa, chain - ## HITS:1 COG:no KEGG:BT_1049 NR:ns ## KEGG: BT_1049 # Name: not_defined # Def: putative patatin-like protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 399 1 393 393 474 60.0 1e-132 MKRMNYIKILCFSAVITGLGSIVSCGDAKYDTLDTHAYIEEALSSTSQKVTVQATGESFT TLNVHLSNLSSTDNHYKLVTDQSVLDTYNHINGTGYIMLPKDYYTLPETITVKAGQYAAD ALSIALKAFSQEMMKSGESYALPVKLVSQDGSISPMENTGTFVILAESIIEFSAPMFVGA PSLKANKFTESPETYSQYTIEVRFQVANTADRDRAVFKNGGDDANFILLRFEDPQSDNEN YKAHSLVQIVGRNRLYLNPSNSFKPNEWQHLALVCDGSNYRLYINGVDSGVLSIPTGATT FSDVNWFCLGDDSYSRWGNCKILMSEARIWSVVRSASQIQNNMTQVSPKSVGLEAYWRFN EGQGNVFEDATGKGHTLTTSATPTWIDGILSTDKATEWK >gi|222159316|gb|ACAB01000043.1| GENE 105 121709 - 122848 829 379 aa, chain - ## HITS:1 COG:no KEGG:BT_1048 NR:ns ## KEGG: BT_1048 # Name: not_defined # Def: putative secreted endoglycosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 378 15 374 375 588 79.0 1e-166 MKKIYLLYIVLISLATTSLIGCSDWTESEAKTFPESIVSDEYYAALRAYKQTDHQVAFGW 
FGGWSGEGAYMKSSLAGIPDSVDIVSIWGNWSNITEAQKKDLEFCQQVKGTRFTMCFIIR SVGDQITPQNIRENWENMGFSSEKEAVNDFWGWPSDESNKEAIEASIRKYASAIADTVNK YGYDGFDIDYEPNFGNPGNIVDEDDRMFAFVDELGKYFGPKSGTGKLLVIDGEPQSITGR PEVGLYFDYFIIQAYNNSSPGSDSKLDKRLITGGVAGAGLVQTYSSVMSEEQITKMTIMT ENFEATDAAMDGGYDYTDRYGNKMKSLEGMARWQPSNGFRKGGAGTYHMEAEYGTSPEYK NIRRAIQIMNPSSHSLLKN >gi|222159316|gb|ACAB01000043.1| GENE 106 122876 - 124426 1312 516 aa, chain - ## HITS:1 COG:no KEGG:BT_1047 NR:ns ## KEGG: BT_1047 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 516 1 514 514 763 73.0 0 MRKYLRNITVGAALLLLAGSCTDKFEEYNTNQYQIHDADPATLMKSMIETIVNIQQNDSQ MQDQMVGQLGGYLCCSNTWSGTNFSTFNQSDAWNATPWNTPFEKIYGNFFQIQEATNSTG HYYAFACMIRAITMLRVADCYGPMPYSQVKKGNFYVSYDTQEQVYTSILSDLANAADVLY NYYVETNGNAPLGANDPVFDGNYSGWAKLANSMRLRVAMRISGTWPGIAQEAAEAAVTHK AGLIESNSDNAMLSCGTQSNPYQLAAVSWGDLRVNANIVDYMNAYGDPRMPKFFNKSTLA GKTDKYVGMRTGDADFKKADAAGFSIPAYTATSKLMVFCAAETAFLRAEGKLRGWNVGSK TAKAYYEDGINLSMEQYQVSATEYMKIDEAPVVSHESDVVQNATATITNTVSVMWDDSEA DNVNGKNFQRVITQKWIANYPLGLEAWAEYRRTGYPELYPCIDNLSDCGVSSQRGMRRLS FPYTEAQNNKANYDLGVAELGGADNEATDLKWAKKN >gi|222159316|gb|ACAB01000043.1| GENE 107 124444 - 127233 2242 929 aa, chain - ## HITS:1 COG:no KEGG:BT_1046 NR:ns ## KEGG: BT_1046 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 929 11 940 940 1488 83.0 0 SQTITLANAQPIKVTLGEDTQVLDEVVVTALGIKRATKALSYNVQEVKGDELTAVKDANF MNSLSGKVAGVQINSGATGAGGATRVVMRGMKSLTKDNNALYVIDGVPIFNTGKSGGEGL FGEMGGSDAVADLNPDDIASISMMTGPSAAALYGSAAANGVVLITTKKGQTEKTTITVSN STTFSKAYIMPDMQNRYGTSSGLFSWGELANRRYNPSDFFETGSNVINSVALSTGNTKNQ TYLSASTTNSGGILPNNSYNRYNFTARNTTHFLNDKLTLDIGAQYIVQNNKNMVSQGQYY NPLPSLYLFPRGDNFDEIRLYERYNTNYGYMEQYWPYGDASLSLQNPYWIQNRINRTSNK KRYMMNASLKWQATDWLNVVGRVNLDNSDYRNKNEKSASTLTTFCGVSGGFEDAMRQERS LYADFLANIDKTFGDFHLTANVGASIYHTSMDQLYIAGDLVIPNFFQINNINFSANYKPD PTGYEDEIQSIFASAELSWKNQLYLTVTGRNDWDSKLAFSKQKSFFYPSVGLSALLSEMV KLPEVISYAKIRGSYTVVASSFDRFLTNPGYEYNSQTHNWANPTVYPMDNMKPEKTKSWE 
IGLNLKFWGNRFNLDATYYRSNTLNQTFKVDIPSSSGYKQAIVQAGNVQNQGIELALGFS DKWAGFGWSSNATFTLNRNKVKRLASGSVNPVTGEAIQMDEMNVGWLGKENVAPRVILTE GGSMTDIYVYNQLTKDNNGNIKVDQNGNLGITSSNTPVKVGNLDADFNLGWTNHFTYKGI DLGVVLSARVGGLAYSATQGILDYYGVSETSATARDNGGIPINNGKVNAQKYYQTIGTGE GGYGRYYLYSATNVRLQELSLNYTLPKRWFKNVANVTLGIVGRNLWMIYCKAPFDPELSA STSSNYYMNVDYFMQPSLRNFGFNVKVQF Prediction of potential genes in microbial genomes Time: Wed May 18 02:17:43 2011 Seq name: gi|222159315|gb|ACAB01000044.1| Bacteroides sp. D1 cont1.44, whole genome shotgun sequence Length of sequence - 2762 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 506 - 550 6.1 1 1 Tu 1 . - CDS 639 - 1874 323 ## BT_1041 integrase - Prom 2116 - 2175 10.0 + Prom 1980 - 2039 6.1 2 2 Tu 1 . + CDS 2252 - 2762 272 ## BT_1042 hypothetical protein Predicted protein(s) >gi|222159315|gb|ACAB01000044.1| GENE 1 639 - 1874 323 411 aa, chain - ## HITS:1 COG:no KEGG:BT_1041 NR:ns ## KEGG: BT_1041 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 410 1 410 413 665 81.0 0 MFKYSKDGISVLTILDTRRAKKSGQFPVKVQVVYRRKQKYYSTGKEVSKEDWARLLKVKS RLLAEIRSDIESSFSTIKQQVNELIQKGEFSIETLSARLGRQMNDMTLRSAFRLKMKELE ANEQANTYLNYQSALKSLEDFGGSTIPLENITIDWLKRCEKFWISEGKSYSSISIYFRAL KCVLNRAVHDGIIKESSFPFGKNKYEIPEGCGRKLALTLSEIKKVMSYQCETKDIEEFRD LWVFSYLCNGINFMDLLFLQYSNIMDGEICFVRSKTSRTTKHNKEIHAIITPEMWNIIQK WGNPRLSPQTYIFKYARGTEDAFEKIRLVRRIITKCNRRLKKIAQDIGIFQLTTYTARHS FATVLKRGGAKTSYISESLGHSNLSVTEHYLAYFEKEERIRNAQLLTNFNL >gi|222159315|gb|ACAB01000044.1| GENE 2 2252 - 2762 272 170 aa, chain + ## HITS:1 COG:no KEGG:BT_1042 NR:ns ## KEGG: BT_1042 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 170 1 169 1097 249 85.0 2e-65 MKEIIKHKFSYLLFFLLLVSCFASAYGQERTITLNLSKVPLNTALKEIEKQTSMSVVYNT NDVDINRVISIKVTKESLNNVMNQLFRGVNVSFSIVDNHIVLSAKSNKEEQQKKTPITAS 
GTVTDSKGEPLIGVSILVKGTSNGTITDMDGNFKIQAAKGDVLEVSYIGY Prediction of potential genes in microbial genomes Time: Wed May 18 02:17:51 2011 Seq name: gi|222159314|gb|ACAB01000045.1| Bacteroides sp. D1 cont1.45, whole genome shotgun sequence Length of sequence - 2293 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 2292 1410 ## BF1326 hypothetical protein Predicted protein(s) >gi|222159314|gb|ACAB01000045.1| GENE 1 3 - 2292 1410 763 aa, chain + ## HITS:1 COG:no KEGG:BF1326 NR:ns ## KEGG: BF1326 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 760 172 925 1101 722 51.0 0 SQAITLTSAQPLKVTLGEDTQVLDEVVVTALGIKRATKALSYNVQEVKGDELTTVKSANF MNSLAGKVAGVNINASSSGVGGATRVVMRGTKSISSDNNALYVIDGVPIFNTNKGDTNGQ YSAQPRGEGISDINPEDIESMSVLSGPAAAALYGSNAASGVILITTKKGKEGKARIIISN NSTFSNPFIMPEFQNSYINRAGSFASWGDKASSLFGTYEPKDFFNTGTNIQNNVSLSVGN EKNQTYLSVGTTNATGIIQNSKYDRYNFTFRNTTKFLKDKMTLDVGFSYIIQEDLNLMAQ GEYFNPLPAVYLFPRGENFEAVRMYKTYDTTRKIDVQNWGWGDPGYSMKNNPYWVANEMN HGMKKQRYMANASLKYEILDWLDVTGRVRIDNATNDYSDKRNASTDLYFTNSSIYGFYKY YKADDRQAYADVMANINKRFGDLSLSANIGGSFTQTYYDERGFQGGLKDMSNVFSLYNMT TTLDKDTYPIESGYKQRTNSIFASAEIGWKSMLYMTLTGRNDWDSALINTEQSSFFYPSI GLSGVISEMVKLPKAITYLKVRGSFASVGAPIPKNLSSNKTYEWDPATSQWKLQTYRPLP KLYPERTNSWEAGLNAKFFNNSLSLEVTWYKANTRKQTFQVPLSGTAVYATMYAQSGNIE NKGMEFSLGYNKSWGDFSWNSNLTFSFNKNKIVELLDDAVDDEGNHYSLNEIDKGGIGSA KVILRKGGSMGDMYVTNRLKRDNEGNVYIDKASQNVKKEDIKN Prediction of potential genes in microbial genomes Time: Wed May 18 02:18:06 2011 Seq name: gi|222159313|gb|ACAB01000046.1| Bacteroides sp. D1 cont1.46, whole genome shotgun sequence Length of sequence - 27687 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 7, operones - 5 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 75 - 1652 1254 ## BF1327 hypothetical protein 2 1 Op 2 . 
+ CDS 1685 - 2749 874 ## BF1328 putative secreted endoglycosidase 3 1 Op 3 . + CDS 2772 - 3983 891 ## BF1329 hypothetical protein 4 1 Op 4 . + CDS 4043 - 4729 392 ## BF1330 exo-alpha sialidase 5 1 Op 5 . + CDS 4731 - 4973 155 ## gi|298481354|ref|ZP_06999547.1| F5/8 type C domain-containing protein + Term 4996 - 5048 12.4 + Prom 4988 - 5047 9.2 6 2 Tu 1 . + CDS 5138 - 7339 1395 ## BT_1035 hypothetical protein + Term 7349 - 7386 4.1 + Prom 7357 - 7416 7.2 7 3 Op 1 . + CDS 7438 - 8766 883 ## COG0477 Permeases of the major facilitator superfamily 8 3 Op 2 . + CDS 8783 - 9751 958 ## COG2152 Predicted glycosylase 9 3 Op 3 . + CDS 9759 - 12032 2048 ## COG3537 Putative alpha-1,2-mannosidase + Term 12070 - 12113 11.4 + Prom 12039 - 12098 6.5 10 4 Tu 1 . + CDS 12334 - 13221 752 ## BT_1031 hypothetical protein + Prom 13299 - 13358 8.0 11 5 Op 1 . + CDS 13471 - 14895 856 ## BT_1026 hypothetical protein 12 5 Op 2 . + CDS 14926 - 18303 2869 ## BT_1029 hypothetical protein 13 5 Op 3 . + CDS 18315 - 20576 1619 ## BT_1024 hypothetical protein + Term 20603 - 20656 10.4 + Prom 20710 - 20769 4.1 14 6 Op 1 . + CDS 20802 - 22541 1530 ## BT_1023 hypothetical protein 15 6 Op 2 . + CDS 22544 - 23212 757 ## BT_1022 hypothetical protein + Term 23280 - 23328 8.3 + Prom 23302 - 23361 4.8 16 7 Op 1 . + CDS 23418 - 24341 705 ## BT_1021 arabinosidase 17 7 Op 2 . 
+ CDS 24373 - 27685 2637 ## BT_1020 hypothetical protein Predicted protein(s) >gi|222159313|gb|ACAB01000046.1| GENE 1 75 - 1652 1254 525 aa, chain + ## HITS:1 COG:no KEGG:BF1327 NR:ns ## KEGG: BF1327 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 74 524 63 514 514 329 42.0 2e-88 MKIYKFITQISIAGCMLVGTPMLQGCMDNSLNDPLGKVSEEELGRDGFAANAFFLTLCDY AYPIATENIYNTNESLIGDCYGRYQVMATDKFKGANFATYNAPENWLNWQFADDNVMVLT YGAWNKIKNVTQGTGVNFAWAQILRIATMHRMLDMYGALPYSQISSDKLSTPYDTMEEAY HAMFTDLTDAINVMTVYVTENPNNRTMAKYDKVYDGDFEKWVKFANSLKLRMAIRIRFAD REYARQMAEEAINHSIGVITSNEDNATSLGTKNQLYTVLVQWPDQCVSADITSYMKGYND PRMSKYFKKKTSDKGDDDYVGMRAGLSYGNDITGPTFSRINVGEMDRTLWMSAAETAFNK AEAAMLGWDVKGETVKDLYESAIRLSFAQWGVSDGIDQYLNDESSTQADYVDPNENKGNE SAISSITIKWKDDAGEEEKLERLITQKWIAMWPLGQEAWSEYRRTGYPRFFPLLNGNVTN LPSANRIPFPPSEYIRNAANVNAAVSKLGGADNYQTKVWWQRRDK >gi|222159313|gb|ACAB01000046.1| GENE 2 1685 - 2749 874 354 aa, chain + ## HITS:1 COG:no KEGG:BF1328 NR:ns ## KEGG: BF1328 # Name: not_defined # Def: putative secreted endoglycosidase # Organism: B.fragilis # Pathway: not_defined # 6 354 5 350 350 426 61.0 1e-118 MRKIFYFIMLLFGITVANTACDDWTDMEPKFQEDMTQSSLPEEYYAQLRAYKKTDHPVAF GWFGNWTGNGATLEKCLAGLPDSVDFVSIWGNWRNLTEAQTKDLRYVQNVKGTKALMCFI VQNIGDQLTPEEYKDNYLEFWGWNENKEEAIKKYAHAICDSIDKYGYDGFDIDYEPNFGH RGNMSGSDENMLLFIQTLGERIGPKSGTDRLLVIDGEPQSIVSESGPYFNYFIVQAYDSP GDNTGRDHLDSRLNSTIRNFDGHLTAEEVAKKYIVTENFEKWALSGGADFTDRYGNKMKS LEGMARWTPIINGKKCVKGGVGTYHMEYEYTIDGTEESYPYLRNAIRIMNPPVK >gi|222159313|gb|ACAB01000046.1| GENE 3 2772 - 3983 891 403 aa, chain + ## HITS:1 COG:no KEGG:BF1329 NR:ns ## KEGG: BF1329 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 18 401 15 376 378 219 37.0 1e-55 MRYLVKNKNTIMFAFAILLCAIIASCEDAKYDTIGSRIYLAESSSLSYESKKVTVAEEGS VVTVTPRSGQIATEDIEVTLGLSPKSLEVYNTKSGTNYKSMPEGSYSFDQKTVVIKKGEL VAPTVKVTIDPLTKEMLDSGDKFALPVAITAVSGGQQTLEGADVMIYIMDQVIITSAPVL TGTKPITMEMRQDYTVTQWSLEMRINMSELGDGVTPGVGYPGPPSYQNQAIFSAGPGNGI 
AEDGEIYIRFGDGPIPGNILQIKTQGTQVNAATKFKNNQWYHLAFVCDGVKLTIYVNGNI DATLDLPRKPLRLVKNSFGICNGDWMVTDAIVSEVRFWTTAISQSQIQNNMFAINPGTDG LEAYWKLNEGAGTEFKDATGHGNKATAPNGVVRWVDGIRSDGK >gi|222159313|gb|ACAB01000046.1| GENE 4 4043 - 4729 392 228 aa, chain + ## HITS:1 COG:no KEGG:BF1330 NR:ns ## KEGG: BF1330 # Name: not_defined # Def: exo-alpha sialidase # Organism: B.fragilis # Pathway: not_defined # 11 195 22 204 337 91 30.0 3e-17 MSLMASPLCISCDDDDEVTVNSKFYLSTKVFPVDSYEASLSVILKDNSVVVNELQPNYRF VACATERLSKDVKITADSDDNLVSAYNKANDTEYNILPAENYSFTNKTVTIENGESVSGD SIKIELLNVGSLTTEGGYLLPVTISSIEGNNLDALSSNRGVVYVKIQNIHVNVESGQPAE GTLIADRSGWTVKVAPTTRGDAKNLIDGTNSDLARDGGAEYWLTVDIG >gi|222159313|gb|ACAB01000046.1| GENE 5 4731 - 4973 155 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|298481354|ref|ZP_06999547.1| ## NR: gi|298481354|ref|ZP_06999547.1| F5/8 type C domain-containing protein [Bacteroides sp. D22] # 1 80 242 321 321 151 98.0 2e-35 MQTLTGIRNKCYASSYSPTAVEVFTSSDGAKWKSIGAVTISRSGTQYIKFSKAVETRYLK YYVKQGPNTVSLTEFDLYAK >gi|222159313|gb|ACAB01000046.1| GENE 6 5138 - 7339 1395 733 aa, chain + ## HITS:1 COG:no KEGG:BT_1035 NR:ns ## KEGG: BT_1035 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 733 1 734 734 1318 82.0 0 MKLSIILRICLWFMGISMAFPSLSQTMLVTNGNTSSRIILKENNQISWTAANLLQTFIQK VTSCKLPVVISQTPRKGDILIGGQSPAEVTEDGFSISTQDGILKISGKENGVVYGVVTLL EQYLGIDYWGENEYSLAPSKTVNLPLINKVENPGFRYRQTQCYAIHTDSIYKWWNRLEEP NEVFAAGYWVHTFDKLLPSFIYGKEHPEYYSYFKGKRHPGKASQWCLSNPEVFEIVAQRV DSIFKANPDKHIMSVSQNDGNYTNCTCDACKAIDDYEGALSGSIITFLNKLAARFPDKEF STLAYLYTMNPPKHVKPLPNVNIMLCDIDCDREVTLTENASGKEFVKAMEGWSKITNNIF VWDYGINFDNYLAPFPNFHILQDNIRLFKKNHATMHFSQIAGSRGGDFAELRAYLVSKLM WNPEVNVDSLMQHFLHGYYGEAAPYLYQYIKIMEGALIGSGQRLWIYDSPVSHKYGMLKP ALMRRYNHLFDLAEKAVEAEPGFLKRVQRARLPIQYSELEIARTETEKDLVDINKKLDLF EERVKEFQVPTLNERSNSSVDYCKLYRERYMPQKEKSLALGAKVTYLIPPTGKYAALGKN ALIDGLFGGATFVDSWIGWEGTDGAFVIDLGEAKEIHSVETDFLHQIGAWILFPLKVVYS YAEDGEHYTHWRTIDLPEERTGEVKFRGVKAESAEPIKTRYVKVEVTGTKECPTWHYGVG HPSWFFIDEVIIK 
>gi|222159313|gb|ACAB01000046.1| GENE 7 7438 - 8766 883 442 aa, chain + ## HITS:1 COG:YPO3162 KEGG:ns NR:ns ## COG: YPO3162 COG0477 # Protein_GI_number: 16123324 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 56 440 53 410 492 93 21.0 7e-19 MQQNKTASQRKTINPACWVPTAYFAMGLPFIAINLVSVFMFKDLGISDTQIAFWTSLIMM PWTLKFLWSPFLEMYRTKKFFVLITELLSGVLFGVVAFSLFFDYFFAISISTMAVIAFSG ATHDIACDGVYMAELNKEDQAKYIGVQGAFYNVAKLVANGGLVAMAGALAEHFGAIEGAS IDANKGAYSSAWMIIFGVIAAIMVLIGIYHIKMLPSTQVPSTTKKTASEVGHELVAVIAN FFTKKHILYYICFIILYRLAEGFIMKIAPLFLRASRDVGGLGLSLTEIGTLNGIFGSAAF VLGSLLAGIYVSKFGLKKTLFTLCCIFNLPFVAYTFLAVAQPTNVYLIGTCITMEYFGYG FGFVGLTLFMMQQIAPGKHQMSHYAFASGIMNLGVMLPGMVSGYLSDLLGYRNFFIYVLI ATIPAFLITYFIPFTYDDSKNK >gi|222159313|gb|ACAB01000046.1| GENE 8 8783 - 9751 958 322 aa, chain + ## HITS:1 COG:TM1225 KEGG:ns NR:ns ## COG: TM1225 COG2152 # Protein_GI_number: 15643981 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Thermotoga maritima # 6 321 11 326 326 376 58.0 1e-104 MNKIQIPWEDRPVGCTDVMWRYSQNPVIGRYHIPSSNSIFNSAVVPFEDGFAGVFRCDNK AVQMNIFAGFSKDGIHWDINHEPIQFKAGNTEMIESEYKYDPRVTWIEDRYWVTWCNGYH GPTIGIAYTFDFKEFFQCENAFLPFNRNGVLFPQKIDGKYAMLSRPSDNGHTPFGDIYIS YSPDMKYWGEHRCVMKVTPFPESAWQCTKIGAGSVPFLTDEGWLLFYHGVITTCNGFRYA MGAAILDKDHPEKVLYRTREYLLGPAAPYELQGDVPNVVFPCAALQDGERVAVYYGAADT VVGMAFGYIKEIIDFTKRTSII >gi|222159313|gb|ACAB01000046.1| GENE 9 9759 - 12032 2048 757 aa, chain + ## HITS:1 COG:CC0533 KEGG:ns NR:ns ## COG: CC0533 COG3537 # Protein_GI_number: 16124788 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Caulobacter vibrioides # 35 756 35 752 770 337 32.0 5e-92 MKRIILTYTLAFSLLSAYAGEGGSPPLKNTDKSLLIDYVDPFIGTTNFGTTNPGAICPNG MMSVVPFNVMGSSENTYDKDARWWSTPYEYTNCFFTGYAHVNLSGVGCPELGSLLLMPTT 
GELNVDYKEYGSKYKDEQASPGYYSNYLTKYNVKTEVSATPRSGIARFTFPKGKSHILLN LGEGLTNESGAMLRRVSDSEVEGVKLLGTFCYNPQAVFPIYFVLRVKKNPSATGYWKKQR PMMGVEAEWDKDQGKYKLYTRYGKEIAGDDIGTYFSFDTEEGEQVEVQMGVSFVSIENAR LNLDREQAGKDFEKIHAEARSKWNHDLSRITVEGGTDAQKTVFYTALYHLLIHPNILQDV NGEYPAMESDKILTTKGDRYTVFSLWDTYRNVHQLLTLVYPERQMEMVRTMLDMYREHGW LPKWELYGRETLTMEGDPSIPVIVDTWMKGLRDFDVDLAYEAMYKSATLPGAENLMRPDN DDYMSKGYVPLREQYDNSVSHALEYYIADFALSRFAAALGKKKDAEMFYKRSLGYKHYYS KEFGTFRPILPDGTFYSPFNPRQGENFEPNPGFHEGSSWNYTFYVPHDVYGLAKLMGGKK PFIDKLQMVFDEGLYDPANEPDIAYAHLFSYFKGEEWRTQKETQRLLDKYFTTKPDGIPG NDDTGTMSSWAIFNMIGFYPDCPGLPEYTLTTPVFNKVTIRLDPKWYKENELVIETNRTQ PGVLYINKVLLNGKKFNKYHITHDELVHGKYLKFDLK >gi|222159313|gb|ACAB01000046.1| GENE 10 12334 - 13221 752 295 aa, chain + ## HITS:1 COG:no KEGG:BT_1031 NR:ns ## KEGG: BT_1031 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 295 3 296 297 522 87.0 1e-147 MRRNLFKHILWILILAECFPLLAIAGSQQKEQRYKIAVCDWMILKRQKIGSFQLVHELNG DGVELDMGGLGKREMFDNKLREPHFQQLFRETAQKYQLEVSSIAMSGFYGQSFLERANYK DLVQDCLCAMKVMNAKVAFLPLGGIKAGWEKIPALRTEVVKRLKEVGDMAASEGVVIGIE TQLDAKGDVKLLKEINSPGIKIYFKFQNALENGRDLCKELKTLGKKRICQIHCTDTDRVT LPYNERLDMNKVKKTLDKMGWRGWLVVERSRDKDDVRNVKKNYGTNIKYLKEVFQ >gi|222159313|gb|ACAB01000046.1| GENE 11 13471 - 14895 856 474 aa, chain + ## HITS:1 COG:no KEGG:BT_1026 NR:ns ## KEGG: BT_1026 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 26 473 26 476 481 283 38.0 1e-74 MKGKLKFEKPIVVSAVCLSMILSVGSCAGDGFDEETFSGGVTNTQLESPDASKVSFEKLA GKPSVKVEWPVIMGAGGYLFSMYIVDDPTNPVAVVKDSIIDGCSVACTYLDDTNYKVEIK TLGNDKYNNKDAASVSEVTWSTLVPATTIPAGTDLNTYFENNPITSGKDVEVAYELEAGG SYTISGDLNLGVNNVQLRGNKVNHAKVTFTGSGAIVSSGGGMALKFIDFDCDVVSKGSFV KFGDVPEEIKGMNSYGIVTNPMIIQSCEIRNVRHYLVNINGKKYAIQNFIVRDCLIKFYQ DAEIFNFNSNNSFVKDFELSSSTLYNLSEASNGRFMRVAGGKGADVGWADCSMNFTSNTF YNISCDQEAFNSNVWNRQKNTVNLSKNIFYDSCKGEFNRRIVGGRTDNAKTCDNNCYWYK GGSGLEKEANGNYGDKSTSAYGVDPGFKDPANGDFTVRHSEVISHGSGDPRWLK >gi|222159313|gb|ACAB01000046.1| GENE 12 
14926 - 18303 2869 1125 aa, chain + ## HITS:1 COG:no KEGG:BT_1029 NR:ns ## KEGG: BT_1029 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 1125 10 1102 1102 1088 51.0 0 MIHKNTFKLCLGILLSFMMLLPTLAQNTKSFKGVVVDETGEPIIGAAVKVVDTTIGTITD LNGKFVINVPAGKLVEISYIGYISQRISDFKLTKVVLKEDTQQLDEVVVVGYGTQKKAHL TGAIATVPMDDIQDISSGSLANTLSGLVNGVSVSGGNSRPGENARITIRQNDVLASMGNN NLEPLYVIDGYIYPTEVKVGSSIENLGATAFNNLDPSMIEGISVLKDAAAAVYGARAANG VILVTTKKGKLGTPQISYSGTFGIADEVSRPKMLNAYEYGRLWNAVRAADPTDTSINLLE DLFQADELNAMKGLNYDLLDKYWKSALTQQHSVNLSGATEKANYFAGISYFNQDGNLGNL DYDRWTYRAGVNVKVSKWLNAGVQVSGDYGKKNTPLNKVGGSKTEDYNTLLTHPYYIPEE INGLPIAAFGISNSQVSNVQKYNFSAIQNNGDYSRNMTSNMNINANLEYDFGWSKWLKGL KVRFTYSKSINSAKINQYGSSYDLYYMADRAGSGKHLYTPIPGQEDAYDFLLTADNFLYA NNGKPVVNGDTSFLSRNMNRTDNYQMNFTVTYNRTFGDHTLGGLFSIERSEAESEYVEGS ITYPYEFTNGQSNTMSAEGKGNTSFSRSESGTLSYIGRINYAYADKYLLEFLLRSDASTK FAPENYWGFFPSVSAGWVISQEDWFKNNVKGIDYLKLRASFGLTGRDNTLPWQWAQNYAM DKDKGFIFGTGTDLNAGSHITINKNISAVNRNAHWDKSYKANFGLDFNVLNNRLVFAIDG YYDWNREMLLPYKSSIPGTVGTQSAYVNYGEMDAYGVELSVTWRDKIGKDFKYKVQVNTG YTDNKVLVTDWDSPMTFKSLHKGERSDIGTWGLQCIGMFRSYQDIDEYFAKYNITKYMDM TKDQVRPGMLIYKDVRGELQPDGTYAGPDGVVDKENDQVRMANRSNPYGFTTNLSAEYKG FSISAQLNASWGGYSFVPTNAIKLGNKGDAAGGYRDLEYANMPSFWNVDNMYVYQDVLDA AGNVVVKANHDAKYPNLQYASVNSAESTFWRVSGTRIRLNRLTLAYKIPSKYTKMIGIES CRFNVTGQNLLSFNNPYPDNFMDPMISYGTYPTLRKFTIGVNVTF >gi|222159313|gb|ACAB01000046.1| GENE 13 18315 - 20576 1619 753 aa, chain + ## HITS:1 COG:no KEGG:BT_1024 NR:ns ## KEGG: BT_1024 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 753 10 765 765 563 44.0 1e-158 MKRKHILIASVALMATSFLESCSDEFLQDKKNYDYASPQDAYNNYSGALARVNDIYKFIM PDVKGAPSFQYNSTGDSDAQSQSTEEYSGFGAFVDPRTPLTYQTGNNPVPDYFHGSSGNI QTMTYGIIRNCNDAIEGIEGSTLSQEQKNELLGQLYFLRAWRYYNLLKWYGGVPIVDKVQ EVTAESVTPRSTTKACIDFICNDLKKSSELLAPFTQNGGWTGKDLGRVTSGTAEALLGRV RLLYASPLFNRSNDVNRWQQAYDDIKASIAVLEKCGNGLQNFEAPGNNAAGWAKLFSEVE 
GNKEAVMITLYNTTASTSAADYSKNNPWERGIRPKNTLGNGGKNPSGMMVDLFPMADGKR PATYGSYTKLETSEYTYDTNPESPTCTPVFMNRDPRFYRTFAFPGVHWHFKGDPRNDKNN NPYDGSSYELWNYVWYTSEKDRDDIESGNTYGADNLLDNVKGLYIRKRSDDLQVSSNPRY QYSQENGFTYNANPYIEIRYAEVLLNYAEAACGLAYAKGGDNALLKEAVDRLKLIRQRVG YTGDCGLQANLESDPAACMSAILYERQIELAYEGKRFDDCRRWLLYDGDGMGGGLYTDDL PASWKLTGWNGNTCNWLGVVPLNGQRRDNIEYRVRNDYNNGLGGNSWPVGDASKNPDPLK DIKRPKALDLNEDISTSQETLKEFYDTYLIRKKKKGDSYDSNKGEYKITFYPRYYFLGLN QGAQGANSAVQQTIGWGDYNNGGANGTFDPLAE >gi|222159313|gb|ACAB01000046.1| GENE 14 20802 - 22541 1530 579 aa, chain + ## HITS:1 COG:no KEGG:BT_1023 NR:ns ## KEGG: BT_1023 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 579 1 579 579 1081 92.0 0 MKKKQVLLAVFAATMLTPTVAWAQYPQISGEAKENYTKMMTEERKRSDEAWAKALPIVQK EAREGRPYIPWAGRPYDLPQADIPSFPGAEGGGMYSFGGRGGKVITVTNLNDRGPGSFRE ACETGGARIIVFNVAGIIRLESPIIVRAPYVTIAGQTAPGDGVCIAGESFWVDTHDVVVR HMRFRRGETKVWHRDDSFGGNPVGNIMIDHCSCTWGLDENISFYRHMYDPSEGQYESKDL KLPTVNVTIQNTISAKALDTYNHAFGSTLGGENCAFARNLWASNSGRNPSIGWNGIFNFV NNVVFNWVHRSSDGGDYTAMFNMINNYYKPGPATPKDSNVGHRILKPEAGRSKLDHKEYG RVYADGNIMEGYPEITKDNWNGGIQIETQPNTDGYTEYMRSYQPFEMPYINIMGAKDAYD YVLKHVGANIPCRDIVDERVIEEVRTGIPYYEKKLPKDAYGDLTGLSPKSMGEDGQFKYR RLPKDSYKQGIITDVRQMGGYPEYKGTPYVDTDKDGMPDEWEIANGLNPNDPSDANKDCT GDGYTNIEKYINGISTKHKVDWRDMKNNYDTLAEKGKLM >gi|222159313|gb|ACAB01000046.1| GENE 15 22544 - 23212 757 222 aa, chain + ## HITS:1 COG:no KEGG:BT_1022 NR:ns ## KEGG: BT_1022 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 222 1 222 222 358 86.0 8e-98 MKKILILSLLFISGWMSAQAVDLNKENRDPEYVKSIVGRSQKIVDKLGLTDAKIAEDVRN VIANRYFELNDIYEVRDAKVKKVKESGLTGEAKNEALKAAENEKDAALYRSHFAFPANLS LFLDEKQIEAVKDGMTYGVVKVTYDSHLDMIPTLKEEEKAQIYAWLIEAREFALDAENSN KKHAAFGKYKGRINNYLAKRGYDLKKEREEWYKRIKARGGSI >gi|222159313|gb|ACAB01000046.1| GENE 16 23418 - 24341 705 307 aa, chain + ## HITS:1 COG:no KEGG:BT_1021 NR:ns ## KEGG: BT_1021 # Name: not_defined # Def: arabinosidase # Organism: B.thetaiotaomicron 
# Pathway: not_defined # 27 307 1 281 283 542 91.0 1e-153 MKKLRNTLAAISLLFLASCGGNKDYYMFTSFHEPADEGLRYLYSEDGIHWDSIPGVWLKP ELGQHQLMRDPSMVRTPDGTYHLVWTTSWKGDLGFGYAHSKDLIHWSEQQMIPVMADEPT TINVWAPEIFYDDENDQFMVVWASCVPGRFEKGIEEENNNHRLYYITTKDFKTVSKAKLL YDPGFSTIDAVIVKRAKNDYVMVLKDNTRPERNLKIAFSDSMTGPYSPASQPFTESFVEG PSVEKVGDDYLIYFDVYKKKIYGAMRTKDFRNFTDVTEEVSIPVGHKHGTIFTAPESVVK ALLEEKK >gi|222159313|gb|ACAB01000046.1| GENE 17 24373 - 27685 2637 1104 aa, chain + ## HITS:1 COG:no KEGG:BT_1020 NR:ns ## KEGG: BT_1020 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1104 1 1107 1109 2179 93.0 0 MNNSAKILVVLAAGWLTTTAFAQDRIHYTGKELSNPACHDGQLSPVVGVHNIQLVRANRE HPDASNGNGWTYNHQPMLAYWNGQFFYQYLADPSDEHVPPSQTFLMTSKDGYRWTNPEIV FPPYQVPDGYTKESRPGVQAKDLIAIMHQRVGFYVSKSGKLITMGNYGVALDKKDDPNDG NGIGRVVREIKKDGSYGPIYFIYYNHGFNEKNTDYPYFKKSKDREFVKACQEILDNPLYM MQWVEEADREDPILPLKKGYKAFNCYTLPDGRIASLWKHALTSISEDGGHTWAEPVLRAK GFVNSNAKIWGQRLSDGTYATVYNPSEFRWPLAISLSKDGLEYTTLNLVHGEITPMRYGG NYKSYGPQYPRGIQEGNGVPADGDLWVSYSVNKEDMWISRIPVPVEINASAHADDDFSKN KSIAELTDWNIYSPVWAPVSLEDQWLKLQDKDPFDYAKVERKIPASKELKVSFDLKAGQN NKGTLHIEFLDENGIACSRIELTDDGIFRLKGGSRFANMMKYEAGKIYHVEAVLSTVDRN IQVYVDGKRVGLRMFYAPVAAVERIVFRTGVPRTFPTVDTPADQTYDLPNAGAQDPLAEY GIANVKTSSTDKDASSAFLKYADFSHYADYFNGMEDENIVQAIPNAKASEWMEENIPLFE CPQHNFEEMYYYRWWSLRKHIKETPVGYGMTEFLVQRSYSDKYNLIACAIGHHIYESRWL RDPKYLDQIIHTWYRGNDGGPMKKMDKFSSWNADAVLARYMVDGDKDYLLDMKKDLETEY QRWERTNRLKNGLYWQGDVQDGMEESISGGRRKKYARPTINSYMYGNAKALSCIGILSGD EGMAMKYGMRADTLKNLVENELWNTRHQFFETMRTDSSANVREAIGYIPWYFNLPDTTQK YEIAWKEIMDEKGFSAPYGLTTAERRHPEFRTRGVGKCEWDGAIWPFASAQTLTAMANFM NNYPQTVLSDSVYFRQMELYVESQYHRGRPYIGEYLDEVTGYWLKGDQERSRYYNHSTFN DLIITGLIGLRPRLDNTIEVTPLIPADKWDWFCLDNVLYHGHNLTILWDKNGDRYHCGKG LRIFVDGKEVGQANTLTKIVCENA Prediction of potential genes in microbial genomes Time: Wed May 18 02:19:42 2011 Seq name: gi|222159312|gb|ACAB01000047.1| Bacteroides sp. 
D1 cont1.47, whole genome shotgun sequence Length of sequence - 42274 bp Number of predicted genes - 25, with homology - 24 Number of transcription units - 14, operones - 4 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 34 - 2868 1885 ## BT_1019 putative secreted hydrolase + Prom 2870 - 2929 3.7 2 2 Op 1 . + CDS 3069 - 4421 915 ## COG5434 Endopolygalacturonase + Prom 4430 - 4489 4.7 3 2 Op 2 . + CDS 4518 - 6482 1949 ## BT_1017 hypothetical protein + Term 6551 - 6592 8.2 + Prom 6577 - 6636 5.8 4 3 Tu 1 . + CDS 6685 - 7476 754 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily + Term 7487 - 7547 7.7 - Term 7503 - 7545 1.2 5 4 Tu 1 . - CDS 7556 - 8443 640 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) - Prom 8466 - 8525 11.4 - Term 8485 - 8531 10.1 6 5 Tu 1 . - CDS 8557 - 12417 2893 ## COG4692 Predicted neuraminidase (sialidase) - Prom 12437 - 12496 4.2 7 6 Op 1 . - CDS 12547 - 14013 1647 ## BT_1012 hypothetical protein 8 6 Op 2 . - CDS 14034 - 15236 1155 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins 9 6 Op 3 . - CDS 15250 - 17718 1628 ## BT_1010 hypothetical protein - Prom 17930 - 17989 5.7 + Prom 17689 - 17748 7.3 10 7 Tu 1 . + CDS 17940 - 18557 498 ## COG1793 ATP-dependent DNA ligase 11 8 Tu 1 . + CDS 18932 - 19633 279 ## BF0032 two-component system response regulator + Prom 19719 - 19778 7.4 12 9 Op 1 . + CDS 19867 - 21408 1076 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen 13 9 Op 2 . + CDS 21425 - 22132 388 ## BT_4629 hypothetical protein 14 9 Op 3 6/0.000 + CDS 22187 - 25480 2307 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family 15 9 Op 4 . 
+ CDS 25507 - 29364 2272 ## COG1002 Type II restriction enzyme, methylase subunits + Term 29407 - 29464 5.2 - Term 29250 - 29289 5.3 16 10 Tu 1 . - CDS 29350 - 29502 72 ## gi|255016179|ref|ZP_05288305.1| hypothetical protein B2_19929 - Prom 29645 - 29704 2.7 - Term 29650 - 29683 -0.8 17 11 Tu 1 . - CDS 29751 - 30092 222 ## NT01CX_1300 DNA modification methylase, putative - Prom 30212 - 30271 1.9 18 12 Op 1 . + CDS 30054 - 30227 57 ## 19 12 Op 2 . + CDS 30224 - 30592 303 ## BT_0029 hypothetical protein 20 12 Op 3 . + CDS 30619 - 32364 1319 ## BT_0030 hypothetical protein 21 12 Op 4 . + CDS 32394 - 33287 718 ## COG3568 Metal-dependent hydrolase 22 12 Op 5 . + CDS 33304 - 35466 1338 ## GYMC10_6263 metallophosphoesterase 23 12 Op 6 . + CDS 35478 - 36368 446 ## COG3568 Metal-dependent hydrolase + Term 36618 - 36660 9.1 - Term 36600 - 36654 12.1 24 13 Tu 1 . - CDS 36871 - 39303 1839 ## COG0642 Signal transduction histidine kinase - Prom 39324 - 39383 5.5 + Prom 39457 - 39516 8.1 25 14 Tu 1 . + CDS 39544 - 42222 2644 ## BT_0997 hypothetical protein Predicted protein(s) >gi|222159312|gb|ACAB01000047.1| GENE 1 34 - 2868 1885 944 aa, chain + ## HITS:1 COG:no KEGG:BT_1019 NR:ns ## KEGG: BT_1019 # Name: not_defined # Def: putative secreted hydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 32 944 1 923 923 1692 86.0 0 MKFKKILLSLLICLSCFVQAKNITISRLTCEMQEGLVVVEGSPRLGWVMESPENGTRQSA YEIDIREAFTGRSVWNSGKVYSSQSQLVSTKGADIRPDNSFNYSWRVRVWDETDTPSEWS NEAKFRAVPERLSSGQWIGAITRQNAHLPEGRKFHGGELKKPEVKAAWEAVDTLAKKSIC LRRTFQVGDAKEGGANRKPGKKIVEATAYVCGLGFYEFSLNGKKVGNSEFAPLWSDYDKT VYYNTYDVTEQLRRGENVVGILLGNGFYNVQGGRYRKLQISFGPPTLLFELVINYEDGTC TTVHSDNNWKYDFSPVTFNCIYGGEDYDARREQKGWNQIGFDDSHWRPVVIQEAPKGILR PQMAAPVKIMERYDIQKVTKLNADQVASASVSTKRTVDPSAFVLDMGQNLAGFPEIKVHG KRGQKVTLIVAEALTEEDACNQRQTGRQHYYEYTLKGEGDETWHPRFSYYGFRYIQVEGA VLKGQKNPQKLPVLKNIQSCFVYNSARKVSTFESSNRIFNVAHRLIEKAVRSNMQSVFTD CPHREKLGWLEQVHLNGPGLLYNYDLTAYAPQIMQNMADAQHSNGAMPTTAPEYVIFEGP 
GMDAFAESPEWGGSLVIFPFMYYETYGDDSLIKKYYPNMRRYVDYLKTRADKGILSFGLG DWYDYGDFRAGFSRNTPVPLVATAHYYMTVMYLVQAAKMVGNDFDIHYYTSLAQDIMVAF NKCFLHKDTAQYGTGSQCSNALPLFLQMTQDADEQGSYRPDADLNEKVFANLIKDVEAHG NRLTTGDVGNRYLIQTLARNGEHELIYKMFNHEEAPGYGFQLKFGATTLTEQWDPRQGSS WNHFMMGQIDEWFFNSLVGIRPSTTPKQGYQKFIIAPQPVGDLKYVKASYETLYGTINVD WTCENGTFTLNVSVPVNTTAVVYLPGEKEPKEIQSGTYQLVCAK >gi|222159312|gb|ACAB01000047.1| GENE 2 3069 - 4421 915 450 aa, chain + ## HITS:1 COG:TM0437 KEGG:ns NR:ns ## COG: TM0437 COG5434 # Protein_GI_number: 15643203 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Thermotoga maritima # 24 439 28 433 448 279 37.0 7e-75 MNLRTTLFVFLCFCATTVLRAERVDMLKAGAKANGKTLNTKLINSTIDRLNRGGGGTLFF PAGTYLTGSIHLKSNITLELEAGATLLFSDNFDDYLPFVEVRHEGVMMKSFQPLIYAVDA ENITIKGEGTLDGQGKKWWMEFFRVMIDLKDNGMRDVNKYQSMWDAANDTTAIYAETNKD YVNTLQRRFFRPPFIQPVRCKKVKIEGVKIINSPFWTVNPEFCDNVTIKGITIDNAPSPN TDGVNPESCRNVHISDCHISVGDDCITIKSGRDAQARRLGVPCENITITNCTMLSGHGGV VIGSEMSGSVRKVTISNCVFDGTDRGIRIKSTRGRGGVVEDIRVSNVVMSNIKQEAVVLN LKYSKMPVEPKSERTPIFRNVHISGMTVTNVKTPIKIVGLEEAPISDIVLRDIHIQEGKQ KCIFENCERITMDDVIVNGEVIKSTTNTTD >gi|222159312|gb|ACAB01000047.1| GENE 3 4518 - 6482 1949 654 aa, chain + ## HITS:1 COG:no KEGG:BT_1017 NR:ns ## KEGG: BT_1017 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 653 1 653 653 1185 89.0 0 MKRILFTMLLAASLSAEAQTQTYETEFARPLNEVLTDIQNRFGVRLKYDIDTVGKVLSYA DFRIRPYSVEESLTNVLAPFDYKFVKQKGNMYKLKAYEYPRRTDADGEKMLAYLNTLYTD QQSFQLRADSLKKEVRQRLGIDTLLAQCVKTKPILSKIRKFDGYTVQNFALETLPGLYIC GSIYTPQSKGKHALIICPNGHFGGGRYREDQQQRMGTLARMGAVCVDYDLFGWGESALQV GSAAHRSSAAHTIQAMNGLLILDYMLASRKDIDTSRIGTNGGSGGGTHTVLLSVLDDRFT ASAPVVSLASHFDGGCPCESGMPIQLSAGGTCNAELAATFAPRPQLIVSDGGDWTASVPT LEFPYLQRIYGFYQAKDKVTNVHLPKEKHDFGPNKRNAVYDFFAEVFKLDKKMLDESKVT IEPESAMYSFGEKGALLPEGAIRSFDKVAAYFDKKAYANLKSDASLEKKAIDWVASLELN DDKKAGFAVTAIYNHLRKVRDWHNEHPYTTIPEGINPLTGKPLSKLDREMIADSAMPKEV HERLMKDLRRVLTEEQIEQILDKYTVGKVAFTLKGYQAIVPNMTEEETAFVLEQLKLARE 
QAIDYKNMKQISAIFEIYKTKCEQYFNEHGRNWRQMFKDYVNKRNAEKKAQGKK >gi|222159312|gb|ACAB01000047.1| GENE 4 6685 - 7476 754 263 aa, chain + ## HITS:1 COG:PA1640 KEGG:ns NR:ns ## COG: PA1640 COG1752 # Protein_GI_number: 15596837 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Pseudomonas aeruginosa # 9 190 1 180 345 181 49.0 1e-45 MANRNVYDMDKQKVALVLSMGGARGIAHIGVIEELLRHNFEITSIAGSSMGAMVGAMYAS GKMEECKEWLFSWDKRKMWELADLTLSRDGLVKGDRFIKELKQIIPDMNIEDLPVPYVAM ATDIVRDQEVRFDRGSLHEAIRASISIPMLFRPLRKDGMVLIDGGILNPLPLSHVQRTEG DILIAVDVNAPIDTGKKKKVSPYNLLTESSRMMMQQITRYQIERCRPDILIQISGDTYDM LEFHHAASIVKTGVEVARRILMQ >gi|222159312|gb|ACAB01000047.1| GENE 5 7556 - 8443 640 295 aa, chain - ## HITS:1 COG:BH1510 KEGG:ns NR:ns ## COG: BH1510 COG1028 # Protein_GI_number: 15614073 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Bacillus halodurans # 52 293 2 241 243 202 46.0 7e-52 MADNYIERQQEQYEARKAAWKQAQKYGKKKLTAVRPAESKSHTSTVSKPEDSKRRVFITG GAEGIGKAIVEAFCLAGNQVAFCDINETSGQETAKATGATFHKVDVSDKNALESCMQTIL AEWNDIDIIVNNVGISKFSSITETSVEDFDKILSINLRPVFITSRLLAIHRKKQPSPNPY GRIVNICSTRYLMSEPGSEGYAASKGGIYSLTHALALSLSEWNITVNSIAPGWIQTHDYG QLQPEDHSQHPSRRVGKPEDIARMCLFLCEENNDFINGENITIDGGMTKKMIYIE >gi|222159312|gb|ACAB01000047.1| GENE 6 8557 - 12417 2893 1286 aa, chain - ## HITS:1 COG:STM1252 KEGG:ns NR:ns ## COG: STM1252 COG4692 # Protein_GI_number: 16764604 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted neuraminidase (sialidase) # Organism: Salmonella typhimurium LT2 # 924 1281 21 344 347 152 33.0 4e-36 MNKRIVSFFFLLLFAGSLYAAIGITDLRTEQLKNPAGIDVRQPRLSWRIESDEQNVIQTA YHILVASSPELLEQGKGDIWDSGKIESDASQWITYQGKTLKCNAPYYWKVKIDTNKGATN WSAPAFWTTGLFNEADWQGQWIGLDRTAPGDSETQWSRLAARYLRKEFAVKKSVKRATVH 
IAGMGLYELFINGQRIGNQVLAPAPTDYRKTILYNTYDVTSLLQTENAIGVTLGNGRFYT MRQNYKPYKIPTFGYPKLRLNLIVEYADGSKETIATNTSWKLTTEGPIRSNNEYDGEEYD ARKELGAWTQTGYDDKNWMPAQRVSIPSGTLRAQMMPGMKVTETLKPVSIKKLGNKYILD IGQNMAGWVRFRIKGQAGDSIRLRFAESLQDNGELYTRNFRDARSTDVYVVSGRETKDAT WAPRFIYHGFRYVEVSGYPNAKAEDFVAEVVEDEMEHIGTFNCSDETLNKIIRNAFWGIR SNYKGMPVDCPQRNERQPWLGDRTMGCWGESMLFDNYAMYTKWTRDIREAQREDGCIPDV APAYWNYYSDNVTWPAALPMACDMLFTNFGDKRPIEENYPAIKKWVSHIREYYMTEDFII TKDKYGDWCVPPESLELIHSKDPSRKTDGALIATAYYLKVLQLMHRFASLQGLKADAEEW EDLEHRMKDAFNARFLHVKEGTSPVPGHTLYPDSIFYGNNTVTANILPLAFGLVPKNYIN EVAKNAVTSIIKTNKGHISTGVIGVQWLLRELSRRGHANVAYLLATNKTYPSWGYMVEKG ATTIWELWNGDTANPEMNSGNHVMLLGDLLPWCFNNLAGIRADRWKSGYKHIVFQPAFEI QELSNVDASYMSIYGKITSRWTKTPTHLEWDIELPANTTGEVHLPDGRKEKIGSGKYHFS VDIPTRNTAILTDEFLYENASFPECHGATIVELKNGDLVASFFGGTKERNPDCCIWVCRK PKGSKEWTTPKLAADGVFSLKDPQAVLAGIDSTCTPVKDAKGTLIARRKACWNPVLFQIP GGDLILFYKIGLKVSDWTGWLVRSRDGGKTWSKREALPKGFLGPIKNKPEYINGRIICPS STEGSNGWRVHFEISDDKGKTWKMVGPLDAELSVPTQNRKKGGVNVDDQEGGEAIKGEGA KPVYAIQPSILKHKDGRLQILCRTRNAQVATAWSSDNGDTWSKVTLLNVPNNNSGTDAVT MKDGRHILIYNNFSTLPGTPKGPRTPLCVAVSEDGINWQPVLTLEDSPISQYSYPSIIQG KDGKLHAIYTWRRQRIKYAEIDPTKF >gi|222159312|gb|ACAB01000047.1| GENE 7 12547 - 14013 1647 488 aa, chain - ## HITS:1 COG:no KEGG:BT_1012 NR:ns ## KEGG: BT_1012 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 474 1 474 483 907 96.0 0 MRKFKILPLLLLLLTMATSAAAQKKTQKTYIPWDNGKLVVSEEGRYLKHENGAPFFWLGE TGWLLPERLNRDEAEYYLEQCKRRGYNVIQVQTLNNVPSMNIYGQYSMIDGYNFKNINQK GVYGYWDHMDYIIRTAAKKGQYIGMVCIWGSPVNRGEMTVEQAKAYGKFLAERYKDEPNI IWFIGGDIRGDVKTAEWEALATSIKAIDKNHLMTFHPRGRTTSATWFNNAPWLDFNMFQS GHRRYGQRFGDGDYPIEENTEEDNWRFVERSLAMKPMKPVIDGEPIYEEIPHGLHDENEL LWKDYDVRRYAYWSVFAGSFGHTYGHNSIMQFIKPGVGGAYGAKKPWYDALNDPGYNQMK YLKNLMLTFPFFERIPDQSIITGQNGERYDRAIATRGNDYLMVYNYTGRPMEVDFSKISG AKKNAWWYTTKDGKLEYIGEFDNGVHKFQHDSGYCSGNDHILIVVDSSKNYVEKAWTELP NAQQKWAK >gi|222159312|gb|ACAB01000047.1| GENE 8 14034 - 15236 1155 400 aa, chain - ## HITS:1 COG:CAC0359 KEGG:ns NR:ns ## COG: 
CAC0359 COG4225 # Protein_GI_number: 15893650 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Clostridium acetobutylicum # 57 397 23 361 361 311 42.0 2e-84 MNKNLLKLIVACGTACFLACAAPQKTETEKWSERMARSEMKRFPEPWMIEKAKKPRWGYT HGLVVKSMLEEWKHTGDSVYYEYAKIYADSLIDADGHIKTMKYLSFNIDNVNGGKILFDL YAQTGDERYKIAMDTLRKQMAEQPRTSEGGFWHKLRYPHQMWLDGIFMASPYLVQYGATF NEPALFDEATKQILLINSKTYDPATGLFYHGWDESREQKWSNPETGCSPNFWSRSIGWYG AAIVDVLDFLPQETAGRDSIIQILQGLAKAIVKYQDPSSGTWYQVTDQGAREGNYLESSA TALFIYTLAKAINKGYIGNEYIEPTQKAFDGMVKTFTRLEEDGSYTITNCCAVAGLGGDS KRYRDGSFEYYISEPIIENDPKSVGSFILAAIEYEKMTKK >gi|222159312|gb|ACAB01000047.1| GENE 9 15250 - 17718 1628 822 aa, chain - ## HITS:1 COG:no KEGG:BT_1010 NR:ns ## KEGG: BT_1010 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 822 3 824 824 1607 92.0 0 MNKKRLIGYLFVTMSVGCIQAQEQKSSVPEYKLWYDCPAQVWTEALPLGNGRLGAMVYGT PGTEQIQLNEESIWAGRPNNNANPDALEYIPKVRELVFAGKYLEAQTLATEKVMAKTNSG MPYQSFGDLRIAFPGHTRYSNYYRELSLDSARAIVRYEVDGVQYQRETITSFTDQVVMVR LTANRPGQITFNAQLTSPHQDVMIASEEGNCVTLSGVSSLHEGLKGKVEFQGRLTAKNKG GKIACADGILSVEKADEAVIYVSIATNFNNYQDITGNQTERAKNYLAKAMVHPFIESKKN HVDFYRQYLTRVSLDLGRDQYANVTTDKRVENFKNTNDTHLVATYFQFGRYLLICSSQPG GQPANLQGIWNDKLFPSWDSKYTCNINLEMNYWPSEVTNLSELNEPLFRLIKEVSDTGKE TAKIMYGANGWVLHHNTDIWRITGAVDKAPSGMWPSGGAWLCRHLWERYLYTGDVEFLRS VYPILKESGRFFDEIMVKDPVHNWLVVCPSNSPENVHSGSNGKATTAAGCTMDNQLIFDL WTAIISASQILDTDQEFASHLTQRLKEMAPMQVGHWGQLQEWMFDWDDPKDVHRHISHLY GLFPSNQISPYRTPELFDAARTSLIHRGDPSTGWSMGWKVCLWARLLDGDHAYKLITDQL TLVRNEKKKGGTYPNLFDAHPPFQIDGNFGCAAGIAEMLMQSYDGFIYLLPALPTVWKAG SIKGIIARGGFELDLSWKNGKVSRLVVKSHKGGNCRLRSLNPLTGNGLKRAKGENPNPLY AVPTIPEPLINEKANLNKVEIAETYLYDLPTKAGKEYVLIEK >gi|222159312|gb|ACAB01000047.1| GENE 10 17940 - 18557 498 205 aa, chain + ## HITS:1 COG:AGl502_1 KEGG:ns NR:ns ## COG: AGl502_1 COG1793 # Protein_GI_number: 15890358 # Func_class: L Replication, 
recombination and repair # Function: ATP-dependent DNA ligase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 25 184 1 157 546 173 53.0 2e-43 MAKNSPDEGKNIKMVEKQAVEASMINLKEYRAKRQFGKTPEPMADTVETPNRMPVFVVQK HDASHFHFDFRLEVDGVLKSWVVPKGPSMNPKDKRLAIEVEDHPLSYAHFEGVIPEGNYG AGTVEIWDSGTYAYVGNNRNISAAIKNGILEFKLHGHKLKGLFVLIHTNMDDQDKDWLLI KKDDVFAATHVYDAKIIPSYDEVFL >gi|222159312|gb|ACAB01000047.1| GENE 11 18932 - 19633 279 233 aa, chain + ## HITS:1 COG:no KEGG:BF0032 NR:ns ## KEGG: BF0032 # Name: not_defined # Def: two-component system response regulator # Organism: B.fragilis # Pathway: not_defined # 43 229 55 241 579 161 42.0 2e-38 MNDCYLSLLLNFRALLSIDSGANLVNSKQAIGVILPDLSLLGYAVPVESFGIGLFIVLFA YLAVEYNVQLLSHEKLKIAMNMLHTTHTPLILLRNQLEELKTGNLPESFSQQVEEALGYT ECIIYCNQNIVTLNKVNKKILPKTSTVNLELSSYITSIVNQCRPYADSRRIQLTVSECSD CVSCRINENIMTAALQHLISKLILGSDLGCCISINITHTTDSWQLQITSSKST >gi|222159312|gb|ACAB01000047.1| GENE 12 19867 - 21408 1076 513 aa, chain + ## HITS:1 COG:UU038 KEGG:ns NR:ns ## COG: UU038 COG2865 # Protein_GI_number: 13357594 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Ureaplasma urealyticum # 9 509 8 459 463 308 36.0 2e-83 MQIHNNTLIAECSAYDFKEMLERKKVKSWLKSVSAFANTDGGSLFYGVNDDGVIVGLENP QADADFISEMIKARLDPVPEVQLIPIEHEGHTLLEVKVKAGTLTPYYYYQDGTRTAYVRV GNESVECNSQQLLSLVLKGTHMTWDSLPTQVDASKHSFIILANTFREQTHQEWNDKYLES FGLVTPDGKLTNAGLLFVDNCTVFQSRIFCTRWTGLYKDDAISSVEHRANLVLLLKYGMD FIKNYTMSGWVKMPNYRLNLPDYSDRAIFEGLVNHLIHRDYTVMGGEVHIDIYDDRVELV SPGAMLDGTQIQDRDIYKVPSMRRNPVIADVFTQLDYMEKRGSGLRKMRELTEKLPNFLQ GKEPQYQTEATSFYTTFYNLNWGDNGRMPVEEVANRVNSTLEKYPVNEESSVEKFGVNTK EFGVNEESSVEKFGVNADKFGDTSETQKKVSKTAQKIIDLVISDPSITADNMANKIGVTK RAIEKNIKSLRGMGILVHEGSDKAGYWRIIVKP >gi|222159312|gb|ACAB01000047.1| GENE 13 21425 - 22132 388 235 aa, chain + ## HITS:1 COG:no KEGG:BT_4629 NR:ns ## KEGG: BT_4629 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 232 1 232 234 421 99.0 1e-116 METSFIPFFAIIAVLIIAFLFASLPQVNHKTRYRVLYAIAIIMLLAVIPISEYMAGGIQN SSNNYLLVLIFDIAVGYFCMYIAALLKFNVLKRKNQALENALTEKQQENVAILLEHQNEK QQALRQRELEWLAGKIKMFTEEEQEAILASALSFAEHDLIVAPSISIQPKETCSQQELMY FVCSAFYNMDKSRSEVVGFLYKVFPLYFPAGESALAKKMPGLEKVKERREKENAQ >gi|222159312|gb|ACAB01000047.1| GENE 14 22187 - 25480 2307 1097 aa, chain + ## HITS:1 COG:FN0414 KEGG:ns NR:ns ## COG: FN0414 COG0553 # Protein_GI_number: 19703756 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Fusobacterium nucleatum # 88 829 21 784 1014 179 25.0 2e-44 MSTKFFTNSTGNTLFDKFKGIAADMINFDSFLAVVGFFRSSGYFKLRKELGNVSEIKILV GINIDDIFRKHNKALLMLADEDKAKEIYHNGFKEDIINAQYSPEVEEGILQMCEDLVSGR LQMRIHATKNLHAKFYLCLPQNHSEHSDGWVIMGSSNISDSGLGIKQPPQYELNVAMKDF DDVKYCSDEFWALWNEAVPLTVEDIEEYKKKTYLGYQPTPYELYIKVLIDTFGDQVEDDF SIQLPDGVKDLKYQKDAVIQGYQMLMQHNGLFLADVVGLGKTMIATMIAKRFVEANGKNT NILVVYPPALEDNWRNTFKLFGIYKKAQFITNGSLSKVLDGKDNYKDKEEFDLIIVDEAH GFRSDSSGKYDELQKICKSPCINMGLLKSSQKKVMLLSATPLNNRPDDLQNQLLLFQNSQ SCTIDGVPNLKGFFSEYILDYKRLMRERDQRDVTSEVDKIYEQIRSKVIDKVTVRRTRNN ILNDPDYRADIKSQGIIFPNILPPNELEYVMDSDTSNRFYETLKQLTDGKTDENPEGKGL TYARYRAVEFLKPEYRNKYKNAVHIGQTLAAIYRVHMVKRLESSFYAFKKSLRTLLRITT DMIKMFEEDKVIIAPDLKVKDLQAKNMELDEIIEYAITKGYATEDILFTADAFSSDFVEM LHHDREILEQLNADWEKENDDPKFDKFQENLTHNFFDKERNPSGKLVLFSESVDTLNYLY DRLTKEIGRSDVLMVTAANRNRLGQTIKENFDANFDSDSMRYNIIITSDVLAEGVNLHRS NVIVNYDSPWNATRLMQRIGRVNRIGSVATNIYNYMFYPSQQGDKEIQLYKNALVKLQGF HSAFGEDAQIYSKEEIVKEFQMFDSNVKDSIDKKIALLREVRELYNSDRKLYHKIKALPM KSRVMRDTGKHSGKSIIFVSSNVKTEFYLASATSVEVIDFLEAVKYLKAKPEEQPVPFSK EEQHYKHVNSALAQYTTEYVEAADTSSINRTDLDKTSLEANKFLRTIKQITTDNELKSQC DVLMGYINEGIYAQLPRYLKTLSREYKNDRAKMKQNEYSLQNKISELLSEYQTMNKEQRH DAQDISNPQIIISESFK >gi|222159312|gb|ACAB01000047.1| GENE 15 25507 - 29364 2272 1285 aa, chain + ## HITS:1 COG:MJECS02 KEGG:ns NR:ns ## COG: MJECS02 COG1002 # Protein_GI_number: 10954534 # Func_class: V Defense mechanisms 
# Function: Type II restriction enzyme, methylase subunits # Organism: Methanococcus jannaschii # 198 623 279 625 1181 113 28.0 2e-24 MATIYTSDSLRNIFQSSFNLTQWYSFSQHFFNASELKEKPERIIENTSDEGYYLGNINTT DSYRIGLFHYNIRQGSVANKRVGLRNLVKSFINPTWGEFDAALVVFDSGDHWRLSFICDI NGEATSPKRYTFVLGDKGSHYNTPVARFIDLQQKGLSFTNIKEAFSVEALSKDFYNKLYN WYLWALSEDINVTFPNNPNTEKDDRENINVKLIRMITRLLFVWFIKQKGLVPDSIFNPKQ LKSILVDFDETSHIDGNYYNAILQNLFFATLNCAIIDEEGNPRCFAASKSGRDTRNLYRY KEMFQQKEEEILTLFAHVPFLNGGLFECLDKPKDLYLNQEYDIFYDGFSRNATKSSNGNF KYRAFVPNILFFNDDEDQPGLINLLKQYNFTIEENSPTDAVISLDPELLGRVFENLLAAY NPETQESARKSTGSFYTPRPIVDYMVDEAIKSYLLGKRLDRISEEKLNALFKERTVSTDW TDSNKDAIANALKQVKILDPACGSGAFPMGCLLRIVDIIELLKGDTVDRYQLKLTIIENC VYGVDIQPIAMLICKLRFFISLICEQNDIDFSSPETNFGINTLPNLETKFVAANTLISAN IRNYEDDWTNDEKLDSMKEKLLCIRNDHFLAKGRVAKKRSERQDNATRQQLLDYIVSHAQ KPDLEKIAAHERLLQQLSDEWELYKDEVWVDKTHPVEQTLFGVVEHPDSLFREDINKKKR KELTVRIKATKAEIAKEQNKGEITGFEAAVKQITEWNPYDQNSVSSFFDPEWMFCLKEKF DIVIGNPPYISTKGVKEEDKVKYEKEFGFSDDTYNLFTFKGLALCKDGGTLTYITPKTFW TTQTKRNMRDLLLSNTIRYIFDTANPFEAAMVDTCITQTVKQPMVDEHIVNFYDGTADLS HPIVFTPIQQSMFINAQNSVIFKPTPLNLRIYELYGKKVKELYDKWWSKIETSKKISQNH KELEAYRASLKPGDIALLGCLTEGGQGLATANNGKYIAVRRSTKWAKNIIESRPKKLVEA IKKKKIKVEGLDAYANEKEFLASLSEKEIANLFDNLKEKYGRDIFGQGYIYKIIDDSELA NVDELTKDEKENGIDTSKNFYVPYDKGDKDGNRWYLETPFAIAWSKENVQFLKTNSGKKG EGMPVVRNPQFYFREGFCWNNVLTTYMKCKKKEKTVQSTESMSFFSCVNQVPEFYLISLM NSRFAAIYVDNFINSTSHCTTGDAKLIPVLIPDNEILKSCNRLFTSAFELKKSVAKGITT DISIQSELEIIEEENDNLIACLYSI >gi|222159312|gb|ACAB01000047.1| GENE 16 29350 - 29502 72 50 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|255016179|ref|ZP_05288305.1| ## NR: gi|255016179|ref|ZP_05288305.1| hypothetical protein B2_19929 [Bacteroides sp. 
2_1_7] # 1 50 55 104 104 94 98.0 2e-18 MNKIEQLCKEAICLKKDSFSSLVDRTTAEEKLLALQRDLDYYVQAELYGI >gi|222159312|gb|ACAB01000047.1| GENE 17 29751 - 30092 222 113 aa, chain - ## HITS:1 COG:no KEGG:NT01CX_1300 NR:ns ## KEGG: NT01CX_1300 # Name: not_defined # Def: DNA modification methylase, putative # Organism: C.novyi # Pathway: not_defined # 1 113 1067 1189 1191 85 40.0 5e-16 MFLKCRKKEKSIHDVKSMSIFGVSNLLSEDYIITMINSTLISHYVDNFVNNTQTFQINDA RQLPIIIPTSTDDEQAKSFVSNAIKIKKGHTTNNALDVIQKEVDSYVEKIYNL >gi|222159312|gb|ACAB01000047.1| GENE 18 30054 - 30227 57 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYAFFFLTALQKHCVYITPTQSLPKVERITLVSRPWICLQKLDLIPDHMLVLFLPTL >gi|222159312|gb|ACAB01000047.1| GENE 19 30224 - 30592 303 122 aa, chain + ## HITS:1 COG:no KEGG:BT_0029 NR:ns ## KEGG: BT_0029 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 122 969 1087 1087 116 50.0 2e-25 MSQVWSEENTNAYFPHPRGYVALGSNRELAVVNTKYLQNLAYCRLKNLSIGYTLPDKWLS KIGFEKIRVYFSGENLLTFTKLHSDYIDPEQASASNSWKTSKSDANIYPWAKTYSFGVDI TF >gi|222159312|gb|ACAB01000047.1| GENE 20 30619 - 32364 1319 581 aa, chain + ## HITS:1 COG:no KEGG:BT_0030 NR:ns ## KEGG: BT_0030 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 581 1 605 609 479 45.0 1e-133 MKIMKYTLWATVGLSLLFASCESILDINPKDRLTTKDYFTNEEQLQLYSNQFYSNNFPGD GDIYRDDADMLIVSPLSDEVSGQRVIPETGGGWSWSALRSINFLLDNLGNCKDQKVRDKY EALARFFRAYFYFEKVKRFGDVPWYDKVLGSDDADLYKARDSREFVMGKIMEDLNFAIEV FKETNRTKELYRVTWWTAQALKSRVGLFEGTYRKYHGLGDYEKYLNDCVSASNEIMTASG GYSLYQSGSQSYRNLFKSENAIDAEIILARDYNNDLSLVHKVQAFENSPTLGRPGLSKKL VNYYLMKNGNRFTDQPNYATMEFKDEIVNRDPRLAQTIRTSNINMNVTMTGYHLLKYAND NMSYTGDSSNDLPLFRLAEVYLNYAEAKAELGTLTQTDIDNTINKLRTRAGVTGKLNMNA ANLSPDPYMCAPETGYVNITGDNKGVILEIRRERAIELVMEGFRYYDLMRWKEGQCMAQS FKGFYLPATAINKAYDIDGDGTNDVCFYTTSSQPNVGSVTYVKLASDGSGTSLSEGNYGN LLCYSWIDRTWNENRDYLYPIPRQEITLSNGVVTQNPGWNE >gi|222159312|gb|ACAB01000047.1| GENE 21 32394 - 33287 718 297 aa, chain + ## HITS:1 COG:lin0348 KEGG:ns 
NR:ns ## COG: lin0348 COG3568 # Protein_GI_number: 16799425 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Listeria innocua # 55 296 4 255 257 107 30.0 2e-23 MKKIVIYTLITFIFSVFASSCEDNKDNSLYYPDFTWDTGDGEEDEDPVTETSMRVATYNL QVETGTGWTNRRERVAQLIRDYDFEICGFEEASWEQRSYLGTQLASDYQILAYGRDTGND DNKAGEMSGILYKKSRYTLLDAGRFWFSETPEIPSNGWDETNFKRFCVWGKFKDSKTQKE FYLFETHMPLADNARKHACQMLVDAVSDKAKDNTPAFCTGDFNATPDAPEIATTICQSGI LKDAYREAAVQHGALFTFPSKKTRIDFIFVKHATVLSTRTIVSSLSDHYPMVIVVEI >gi|222159312|gb|ACAB01000047.1| GENE 22 33304 - 35466 1338 720 aa, chain + ## HITS:1 COG:no KEGG:GYMC10_6263 NR:ns ## KEGG: GYMC10_6263 # Name: not_defined # Def: metallophosphoesterase # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 65 706 983 1559 2013 152 28.0 6e-35 MKNKFYHISTYSVMRYWTILWMAILSFSCSDFNPMDSYSRIPPDRNTDIDDGDEGDGAGG LFEKGYGTVNKPYLVMDVIQIQNMSEALVKGKMIYFQLGADIDMKSISNWDPLNPTGDYY IYFDGNNHIIKNFTCTDKAYASFFGILTGTCKNVGFYNAHVEAATNSGAGVIGGYIGVKA PNAVEKTGQVENCYVSGKVKGKYAGGIASRMGRPYGGQICYIKNCYSTAEVISTGDECGG IVGSMYENSEVSYCYSTGVLIGANSVGGIAALPSEGAKITSCVAWNWKITGPAARSGRIS GVLSQGENGHQADPVASECYAWEDMICTGFTPEDNAGSVSTGKYDGVSESALTLQNSIAN WGTPWHNVGNIDMGFPILEWQLDREDYASYGGHDNEPEGDFANGDGTQNNPYVIANAIHI QNMSKALIEKQTTYFVLSADIDMQGIKWAPLNDANGYHKWIDFDGRNHVIKNLTCESGTY RSFFGVLCGECRNVGFVDANISSPNTGIGIIAAYVGLAAGAENYTGKITNCYTTGVLKGS GAAGGIGGVLGGSGYIKNCYSSATVIDQIANNTGKAGGIIGRVNGNASGSSIENCYTSGD INAIGGGNAGGIVGKVDGGKLVIKNCIAWNSMLVSTDKAKVGRIVGGTANATYENCYAYD GVILKAGEATFTVSDETSPSGSSFQGVAKSANELKNTVINWDSSLWKEGSNGYPVFKWSK >gi|222159312|gb|ACAB01000047.1| GENE 23 35478 - 36368 446 296 aa, chain + ## HITS:1 COG:lin0348 KEGG:ns NR:ns ## COG: lin0348 COG3568 # Protein_GI_number: 16799425 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Listeria innocua # 41 289 1 250 257 165 35.0 9e-41 MICNMKCSINKKYYFLLLILLFAGGLPAKSSNLDKTKPEDIIRLASYNIRTKGDKGDKSW EVRLNALVDVVRRNKFDMFAIQEGRTSQLKDMMILNEYSYIGRDRDGDNKGEHCAIYYKK 
DRFKVLKHGDFWYSETPDIPSYGWGARCRRICTWGYFKDLRTGKKFYVFNSHTDHEATEA RRQSSFMLLEQVRKIAKGRPTFCTGDFNATPDEEPIQLLLKDSLLLDSYKCTLTPPKGPS GSFYAYDKTGKTAKRIDFIFVTPKIKILSYHTIDDDIKYNKYSSDHFPVMVEALLK >gi|222159312|gb|ACAB01000047.1| GENE 24 36871 - 39303 1839 810 aa, chain - ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 518 803 17 307 328 184 36.0 5e-46 MKEQTNYDYEKYVQIAQMAKMGWWESDLKNKEYICSDFIVDLLGLESNRISFTEFHQRIR EDHRLRLKNEYMSLSYLETYEQMFPIRAKDGEIWVYSKINFQKPDKEGYRNMTGFLQYID RPIDTTNGNIDFFQVSNLLYQQTNISYSLLAFLQCDDVAQVVNKTLGDLLHQFLGDRIYI FEINRKEQRQDCTYEVTAEGISKEQEFLSNIPWDPSTWWNHQIAERRAIILNTLDDMPEE AAEYRQTLEMQDIKSLMVVPLISKEEVWGYMGIDMVRTQRSWSNVDYQCFSSLANIISIC IELRKSELQAKEDRLALDNSEKILRNIYKNLPAGVELYDKDGYLVDINDKELEIFGLSDK NEALGVNLFDNPNIPSEVKERLRAKEDVNFSINYDFSKINQYVDSRRNGIINLNTKVTAL YDSQNRFINYLFINIDTTETTNAYTKIQEFENLFLLIGDYAKVGFAHFNVLTRDGYAQDT WYRNLGEKEGIPMPQVIGVYAHVVPEDQAVLKNFVGEVKTGKATSLRKEVRVCRENGKYT WTSINVMVRDYRPQDGIIEMLCINYDITPLKETEQKLIIARDKAEELDRLKSAFLANMSH EIRTPLNAIVGFSSLLAETDSRSERQEYIKIVQENNELLLQLISDILDLSKIEAGTFNFV YTNVDVNETCSEIIKSMGMKVGKGVELILEEPFPECYIYTDKNRFTQVISNFINNALKFT QQGSITLGYEQVSHQKIKFYVRDTGMGIPEEKQKSVFERFVKLNTFVQGTGLGLSICKSI VSQMGGEIGVDSTEGIGSCFWFTHPYHAAD >gi|222159312|gb|ACAB01000047.1| GENE 25 39544 - 42222 2644 892 aa, chain + ## HITS:1 COG:no KEGG:BT_0997 NR:ns ## KEGG: BT_0997 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 20 892 1 873 873 1731 95.0 0 MFAACTLFLAACGRQTVKIMTPPDASNRVLFGAEQLQTTLDKAGYQVMMQQGDTTFSDPE IKTILLTEVNDTTLKKEGFHISTTGNLTRVSGRDGSGVIYGCRELIDRVNDSDGKLNFPE ELKDGPEMVLRGACVGLQKMTYLPGHGVYEYPYTPESFPWFYDKEQWIKYLDMLVANRMN SLYLWNGHPFASLVKLEDYPFALEVDEETFKMNEEMFSFLTEEADKRGIFVIQMFYNIIL SKPFAEHYGLKTQDRNRPITPLIADYTRKSITAFIEKYPNVGLLVCLGEAMCTVEDDVEW FTKTIIPGVKDGLQALGRTDEPPLLLRAHDTDCKLVMDAALPLYKNLYTMHKYNGESLTT 
YEPRGPWSKIHTDLSSLGSIHISNVHILANLEPFRWGSPDFVQKAVTAMHNVHGANALHL YPQASYWDWPYTADKLPNNEREFQLDRDWIWYQTWGRYAWNCHRDRTDEMGYWDHQLGKF YGTSDENASNIRVAYEESGEIAPKLLRRFGITEGNRQTLLLGMFMSQLVNPYKYTIYPGF YESCGPEGEKLIEYVEKEWKKQPHVGEMPLDIVAQVIEHGDKAVAAIDKAAGSVSSNKDE FARLQNDMHCYREFAYAFNLKVKAAKLVLDYQWGKEIKNLEEAIPLMEQSLEHYRKLVEL TDEHYLYANSMQTAQRRIPIGGDDGKNKTWKELLVHYEKELENFKANLALLKEKQNGNAV TETVEIAAWTPANVKLISNYPTVKVDEGTSLFVDVPGKIEAVAPELKGMKALRFNGNEQR EKGTSITFETDAPVKLLVAYFKDDQKKYAKAPKLEIDASANDYGQAEPVLTNAVRINGMP LANVHAYSFPAGKHTLMLPKGYLQVLGFTAAETKVRNAGLAGDEETMDWLFY Prediction of potential genes in microbial genomes Time: Wed May 18 02:21:06 2011 Seq name: gi|222159311|gb|ACAB01000048.1| Bacteroides sp. D1 cont1.48, whole genome shotgun sequence Length of sequence - 22546 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 9, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 46 - 4257 3413 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 4286 - 4331 1.1 + Prom 4663 - 4722 5.1 2 2 Tu 1 . + CDS 4754 - 5302 358 ## COG2207 AraC-type DNA-binding domain-containing proteins 3 3 Tu 1 . - CDS 5299 - 6798 768 ## COG0642 Signal transduction histidine kinase - Prom 6909 - 6968 7.3 - Term 6979 - 7033 -0.8 4 4 Tu 1 . - CDS 7041 - 8567 792 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 8717 - 8776 4.7 + Prom 8518 - 8577 5.5 5 5 Op 1 . + CDS 8712 - 11948 2317 ## COG3250 Beta-galactosidase/beta-glucuronidase 6 5 Op 2 . + CDS 11985 - 14834 2462 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 14849 - 14884 1.0 + Prom 14868 - 14927 6.3 7 6 Op 1 40/0.000 + CDS 14980 - 15666 611 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 8 6 Op 2 1/0.500 + CDS 15663 - 17030 1012 ## COG0642 Signal transduction histidine kinase + Prom 17060 - 17119 2.0 9 7 Tu 1 . 
+ CDS 17151 - 19802 1894 ## COG0474 Cation transport ATPase + Term 19876 - 19935 5.6 + Prom 19842 - 19901 9.5 10 8 Tu 1 . + CDS 19965 - 21479 939 ## BT_0987 putative cytochrome c-type biogenesis protein + Term 21521 - 21568 -0.9 - Term 21359 - 21396 -0.5 11 9 Op 1 . - CDS 21472 - 21990 133 ## gi|237716436|ref|ZP_04546917.1| predicted protein 12 9 Op 2 . - CDS 21980 - 22546 127 ## Desal_2860 hypothetical protein Predicted protein(s) >gi|222159311|gb|ACAB01000048.1| GENE 1 46 - 4257 3413 1403 aa, chain + ## HITS:1 COG:TM1062 KEGG:ns NR:ns ## COG: TM1062 COG3250 # Protein_GI_number: 15643820 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 370 944 3 557 563 118 26.0 1e-25 MKKLLLAVFSIATTFSLYAQREVPQERMEQIYEEVKTPYKYGLAVAPADNYHKIDCPTVF RQGDKWLMTYVVYNGKGGTDGRGYETWIAESDNLLEWRTLGRVLSYRDGKWDCNQRGGFP ALPDMEWGGSYELQTYKGRHWMTYIGGEGTGYEAVKAPLYVGLAWTKGDISTAHEWESLD KPILSIHDKDAQWWEKLTQYKSTVYWDKDKTLGAPFVMYYNAGGHHPETNLKGERVGIAL SKDMKTWKRYSGNPVFAHEADGTITGDAHIQKMGDVYVMFYFSAFEPSRKYKAFNTFAAS YDLVNWTDWKGTDLIIPSKNYDELFAHKSYVVKHDGVVYHFYCAVNNAEQRGIAIATSKP MGRSAVRFPKPEIKNRRQIIELNEGWKTWRVENGKLRVESEKTVNIPHNWDDYYGYRQLT HGNLHGTVLYKKDFTLNNSQFSILNSQLKKYFLRFDGVGTYATITVNGKDFGRHPIGRTT LTLDVTDELKQGVNRLEVKAEHPEMIADMPWVCGGCSSEWGFSEGSQPLGIFRPVVLEVT DEIRIEPFGVHIWNDEKAANVFVETEVKNYSKTTETVELVNKLSNADGKQVFRLVEKVTL APGEMKVIRQQAPVENPVLWNTENPYLYKLASMIKRDTKTTDEISTPFGIRTISWPVKRN DGDGRFYLNGKPVFINGVCEYEHQFGQSHAFGNEQVAARVKQIRAAGFNAFRDAHQPHHL DYQKYWDEEGILFWTQFSAHVWYDTPEFRENFKKLLRQWVKERRNSPSVVMWGLQNESTL PREFAQECSDLIREMDPTAKTMRVITTCNGGEGTDWNVIQNWSGTYGGDVTKYDRELSQA NQLLNGEYGAWRSIDLHTEPGDFQVNGVWSEDRMCQLMETKIRLAEQAKDSVCGQFQWIY SSHDNPGRRQPDEAYRKIDKVGPFNYKGLVTPWEEPLDVYYMYRANYVPAAKDPMVYLVS HTWANRFEKGRRRATIEAYSNCDSVLLYNDLTNEKATFLGRKKNNGTGTHFMWENRDIRY NVLRAVGYYKGKPVAEDLILLNGLEQAPNFELLYQDDKKILKGEAGYNYLYRLNCGGDDY TDSFGQLWLQDNTNYSRSWAENFKDLNPYLASQRTTNDPIRGTRDWTLFQHFRFGRHQLE YRFPVADGTYRIELYFTEPWHGTGGSASTDCEGLRIFDVAVNDSVVLDDLDIWAESGHDG 
VCKKVVYATVKGGMLKIHFPEVKAGQALISGIAIASTDQELKPTVFPASGWSWEKADKEV MEKTPKELLPEDKNARVSISYEAETAVLKGKFQKKEHRKQMGVFFGKGKGNSIEWNVSTG LAQVYALRFKYMNTTGKPIPVLMKFIDSKGVVLKEDVLNFPETPDKWKMMSTTTGTFINA GHYKVLLSAENMDGIAFDALDIQ >gi|222159311|gb|ACAB01000048.1| GENE 2 4754 - 5302 358 182 aa, chain + ## HITS:1 COG:BMEII0641 KEGG:ns NR:ns ## COG: BMEII0641 COG2207 # Protein_GI_number: 17988986 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Brucella melitensis # 38 182 150 293 307 73 29.0 2e-13 MKATQHLLPVIMENPIVALSPLQACSYKMFYESSILSYASLRTRENKEIVKAVLTMFIQG ATEIYKLQNNWYLSSQSRKYEIYQEFLKLVMKHYTIHHGTSFYADELGLSLPHFCSTIKK AAGNTPLEVIASVILMEAKSRLKSTDEPVKNIALSLGFNNISFFNKFFKQHTGITPQEYR GR >gi|222159311|gb|ACAB01000048.1| GENE 3 5299 - 6798 768 499 aa, chain - ## HITS:1 COG:MA2348_2 KEGG:ns NR:ns ## COG: MA2348_2 COG0642 # Protein_GI_number: 20091183 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 189 496 114 427 427 154 34.0 5e-37 MEDYKQTLYNDELLNILPDGVIIFDTNGEVIQLNQQAFAELHVHPSVNDMLPFPTNRLFK LLNKKEDILSTILEKIRQGENTYSLPEHTFMQEQVDYTQFPIRGEFATIRDRTNLNKILF FFRNITVELTQEYILNTALQRTRIYPWYYDISRSEFTLDDRYFEHLGIPAGENNTLTMEE YVNMIHPDDRQPMADAFVVQLSGNTTFDKTVPFRLRRGDGTWEWFEGQSTYIANISGHPY RLVGICLSIQEYKDIENTLIEARKKAEESDRLKMAFLANMSHEIRTPLNAIVGFSDVIGS TYDELSEEERADFVRLISINSEHLVRLIDDILDLSKIESNTIKFTFSNCSLNSLMMDIEK EQAMKPISEIEIKSLLPDEDVYINTDITRLKQVICNFINNARKFTQKGYIHFGYTLDNRN ADSVQIFVEDTGSGIPQECLNEIFDRFYKVDTFKQGTGLGLSICKTIVEHLQGDISVKSE IGKGSCFTVTLPFDRKTED >gi|222159311|gb|ACAB01000048.1| GENE 4 7041 - 8567 792 508 aa, chain - ## HITS:1 COG:SP0659 KEGG:ns NR:ns ## COG: SP0659 COG0526 # Protein_GI_number: 15900560 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Streptococcus pneumoniae TIGR4 # 216 364 33 174 188 70 29.0 7e-12 
MMMKKLLFSSLFLLGSLVSQAQHEYTIKGEVKGVKDGTHVSLFLTDGRVGSIVGTDTIRN GTFFFKRNAGESGMDQLSLMCRDTDFPPMSLDIYATPGAKIKVTGTNPLIYTWRVDSPVK EQQEHNRFIEDSRDLWDEFQRLAIKERSMRSASETERKALRTKSDSISSIINQRELKLMR ELPISNVWMEKLLRLSMSLKYNPKFTNKEEILALYDRLNEEQKASIEGQEIRVNLFPPKT VKEGDDMADADLFDLDGKVHHLADFKGKYMLLDFWSSGCGPCIMALPEMKEIQEQYKDRL TIISLSSDTQNRWKAASAQHEMTWQNLSDLKQTAGLYAVYDVNGIPNYVLISPEGKIVKM WSGYGKGSLKLKMRRYLDAPKREMSITGDTNRKVVNHPSFESTNTDILEVKQVELTDTAT IVHFYAYYIPKYWIQVSVNAKLVDEQGASYTLQKADGITPGKHFFLPESGEAEFSLTFKP LPLKTKSFNFTEGTAKNDWQINGIKLTR >gi|222159311|gb|ACAB01000048.1| GENE 5 8712 - 11948 2317 1078 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 22 1049 6 957 1087 612 35.0 1e-174 MKIYTLLLGALFVSPTQAQTMHDWENHHVLQINREPARATFTPFSVQKGDGSISLDGTWK FRWTPVPNERVVNFYQINFDDKDWTDFPVPANWEVNGYGTPIYVSAGYPFKIDPPRVMGE PRTDYTTYKERNPVGQYRRTFVLPTGWEVNGQTFLRFEGVMSAFYVWINGERVGYSQGSM EPSEFNVTKYLKSGENQISLEVYRYSDGSYLEDQDFWRFGGIHRSIHLIHTPDIHVRDYA VRTLPASAGDYEDFILQIDPQFSVYRGMTGKGYVLQGVLKDASGKEVATLKGDVEDMLDL EHKASRMNEWYPQRGPRKMGRLSAIIKSPERWTAETPYLYKLHLTLQNGEGKVVEQIEQA VGFRSVEIKKGQLLVNGNPVRFRGVNRHEHDPRTARVMSEERMLQDILLMKQANINAVRT SHYPNVSRWYELCDSLGLYVMDEADIEEHGLRGTLASTPDWHAAFMDRAVRMAERDKNYP CIVMWSMGNESGYGPNFAAISAWLHDFDPTRPVHYEGAQGVDGNPDPKTVDVISRFYTRV KQEYLNPGIAEGEDKERAENARWERLLEIAERTNDDRPVMTSEYAHSMGNALGNFKEYWD EIYSNPRMLGGFIWDWVDQGIYKTLPDGRTMVAYGGDFGDKPNLKAFCFNGLLMSDRETT PKYWEVKKVYAPVELKVENGELRVTNRNHHIDLSLYRCLWTLSVDGKEKERGEITLPEIA PGESSTIGLPAFRSLKTLSDYQLKVSIVLKSDALWAKAGHEVAWEQFCLQKGDLASADLI NKGALQVKEDDNSLLISGRSFSVQWEKKVNGSMTSLIYKGKEMLAHSDDFPVQPVTQVFR APTDNDKSFGNWLAKDWKLHGMDHPQINLESFHHEKRADGAVIVRIQTSNLYKEGKVVTT SVYTVFSDGTIDLKTTFLPQGVLPEIPRLGIAFCLAPAYDTFTWYGRGPQDNYPDRKTSA MIGLWKGSVADQYVHYPHPQDSGNKEEVHYLTLTDKQNKGIRVDAVENAFSASALHYTVQ DIYEETHDCDLKPRAEVILSMDAAVLGLGNSSCGPGVLKKYAIEKKEHTLHLRISSKQ >gi|222159311|gb|ACAB01000048.1| GENE 6 11985 - 
14834 2462 949 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 42 584 63 629 1087 232 30.0 2e-60 MKKVTTLLSTLALATTLAAQNLPQTERQYLSGHGCDDMVEWDFFCTDGRNSGKWTKIGVP SCWELQGFGTYQYGITFYGKPFPKGVADEKGMYKYEFEVPEKFRGKQVNLVFEASMTDTE VKVNGRKVGSKHQGAFYRFSYNVTDFLKYGKKNLLEVTVAKESENASVNLAERRADYWNF GGIFRPVFLEVKPAVNLRHIAIDAKMDGTFRANCYTNISNDGMSIRTQILDKKGKKITET TVPVKAGGDWTSLQLNVSNPALWTAETPNLYKAQFSLLDKAGKVLHSETENFGFRTIEVR ESDGLYINGVRINVRGVNRHSFRPESGRTLSKEKNIEDVLLMKGMNMNSVRLSHYPADPE FLEACDSLGLYVMDELGGWHGKYDTPTGVRLIEGMIERDVNHPSIIWWSNGNEKGWNTEL DGEFHKYDPQKRPVIHPQGNFSGFETMHYRSYGESQNYMRLPEIFMPTEFLHGLYDGGHG AGLYDYWEMMRKHPRCIGGFLWVLADEGVKRVDMDGFIDNQGNFGADGIVGPHHEKEGSY YTIKQLWSPVQIMNTSINKQFDGKFSVENRYDYLNLNTCRFLWKQVKFPLATDASNAAIQ VLKEGEVQGSDVVAHSAGILDIKTNILPNADALFLTAIDPYGHELWRWTFPVNQLNQQTE QLSPLSSRPAYTETENELTVKANKRTFIFSKKDGQLKGVSVDNRKISFANGPRFIGARRA DRSLDQFYNHDDEKAKEKDRTYSEFPDAAVFTKLDVKEDGGNLVVTANYKLGNLDKTQWT INPSGELALDYTYNFSGVVDLMGIRFDYPEDQVISKRWLGAGPYRVWQNRIHGTQYDVWE NDYNDPIPGETFTYPEFKGYFGDVSWMNIQTKEGTISLTNETPDAYIGVYQPRDGRDRLL YTLPESGISVLNVIPPVRNKVNSTDLCGPSSQPKWVNGPQTGRVIFRFM >gi|222159311|gb|ACAB01000048.1| GENE 7 14980 - 15666 611 228 aa, chain + ## HITS:1 COG:ECs0609 KEGG:ns NR:ns ## COG: ECs0609 COG0745 # Protein_GI_number: 15829863 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 4 224 3 221 227 182 44.0 4e-46 MYTILIIEDEPRVASLLMNGLEENGYQTMVAYDGLMGLRLFQTHTFDLVISDIVLPKMDG FELAKEIRKTNPNIPILMLTALGSTNDKLDGFDAGADDYMVKPFDFRELNARIKVLLKRV TGNAQELPQELVYADLRIDLQRKDVERNGISIKLSPKEYNLLLYMVENAERVLSRVEIAE KVWNTHFDTGTNFIDVYINYLRKKIDRDFEPKLIHTKAGMGFILTDKI >gi|222159311|gb|ACAB01000048.1| GENE 8 15663 - 17030 1012 455 aa, chain + ## HITS:1 COG:RSp1043 KEGG:ns 
NR:ns ## COG: RSp1043 COG0642 # Protein_GI_number: 17549264 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 173 453 184 463 466 151 32.0 3e-36 MKIRTTLTLQYAGLTAAVFFVFVMAVYYVSEHSRSNAFFRNLQSEAITKAHLFLKDQVDA KTMQSIYLNNQKFINEVEVAVYTTDFKILYHDALQNDIVKETPEMIKRILKRKNINFYVD EYQAIGLVYPFEGKDYVVTAAAYDGYGYANRDALRNMLILLFIGGLSVLVVVGYILSRST LKPIRNIVKEAEKITASHIDKRLPVKNEQDELGELSTTFNALLERLEKSFNSQKMFVSNV SHELRTPMAALTAELDLALLKERSSEQYQMAIGNALQDSRRIVNLIDGLLNLAKADYQSE QITMEEVRLDELLLDARELVLKAHPDYHIELVFEQEAEEDNVLTVIGNSYLLTTAFVNLI ENNCKYSSNRTSSVLIAYWEQWAIIRLSDTGVGMSDTDKENLFTLFYRGENKNIAPGNGI GMALTQKIIHLHKGELTVSSHKDEGTTFVVKLPHI >gi|222159311|gb|ACAB01000048.1| GENE 9 17151 - 19802 1894 883 aa, chain + ## HITS:1 COG:PA4825 KEGG:ns NR:ns ## COG: PA4825 COG0474 # Protein_GI_number: 15600018 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Pseudomonas aeruginosa # 40 883 45 903 903 964 54.0 0 MVWKKKLRSPQYTFNSEKVFLVATQPGKSIYSYLQTTKLGLTQGEVQERQSIYGRNEVVH EQKKNPFILFIKTFINPFIGVLTGLAIISLFLDVLMADPGEQEWTGVIIISSMVLFSAIL RFWQEWRASEATDSLMKMVKNTCLAKRAGEQEEEIEITELVPGDIVYLAAGDMVPADIRI IDSKDLFISQASLTGESEPIEKFSEIQGQQFRKGSVIELDNICYMGSNVISGAAKGIVFE TGNKTYLGTIAKSLVGHRATTAFDKGISKVSFLLIRFMLVMVPFVFFVNGFTKGDWFEAF IFAISVAVGLTPEMLPMIVTANLSKGAIAMSKKKTIVKNLNAIQNFGAMDILCTDKTGTL TCDKIVLEKYINADGSDDNSKRILRHAYFNSYFQTGLRNLMDKAILSHVRDLSLEHLKDD YTKVDEIPFDFTRRRMSVVIEDRQGKRQIITKGAVEEVLDVCSYAEFNGQIHPLTDALKI KAQMISEEMNQQGMRVLAVSQKSFIEKDCNFAIEDEKEMVLIGYLAFLDPPKPSAAEAIE QLYAHGVAVKILSGDNDVVVKAIARQVGIDTSHFLTGIEIENMDETALKEAVKDTTLFSK LTPLQKTQIISLLQEQGNTVGFLGDGINDAGALRQSDIGISVDSAVDIAKESADIILLEK DLMVLEDGVLEGRKTFGNINKYIKMTASSNFGNMFSVMFASAFLPFLPMMPIHLLIQNLL YDISQTTIPFDRMDPEFLKKPRKWDASDLSRFMIYVGPISSIFDIITYLVMWYVFSCNSP EHQTLFQTGWFVEGLLSQTLIVHMIRTRKIPFIQSRATWPVMGLTFLIMAIGILIPFTAF GRSIGLTALPLSYFPWLVGILLSYCILTQIVKNWYIKKFVRWL >gi|222159311|gb|ACAB01000048.1| GENE 10 19965 - 21479 939 504 aa, chain + ## HITS:1 
COG:no KEGG:BT_0987 NR:ns ## KEGG: BT_0987 # Name: not_defined # Def: putative cytochrome c-type biogenesis protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 504 1 504 504 798 79.0 0 MKHYYSFLIITSLLLAGCGQKDRAKSQFESSVQTEESYSLAKEYIKEAVITGKVLNRDFY PQEKELTLIIPFFWKMENQYRTPIQEDGSFSFRFPVYAKLREVSIRNYAEHLYIHPGDSI HVEIDFKDLFHPKVTGDAEKLNQEILAFTESAYYYIQNYSINPNLNIKDFEAELKKEYDF RLERRNEYLTKYKPMDDVTLFTEELLKQDYYYALLSYGNQCQFKTKKEMDRYHKLLSAIN KLYNKGILSARLYDIADEVEHYIAYGITYKDKKNPSVEEIMSAVGENELNQYIYTKMTVG SLNANDTLALTTRHTQFDSIVKMPHLRAQVMQIYNQTKSYLENPQPVSNNLLYGEFHENS KLKTSMPYMEPVYNILEKNHGKVIYFDFWARWCPPCLAEMEPLKQLRSKYSTKDLVIYSI CVSEPKEEWEECLNEYSLKNRGIECIYASDYFGKDNLQKIRKQWKIDRMPYYLLINRKGQ IVDFGTTARPSNPQFVSRIDDALK >gi|222159311|gb|ACAB01000048.1| GENE 11 21472 - 21990 133 172 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716436|ref|ZP_04546917.1| ## NR: gi|237716436|ref|ZP_04546917.1| predicted protein [Bacteroides sp. D1] # 1 172 1 172 172 306 100.0 3e-82 MSCKCFNRRFVFKNTDSFDHKYVNSKCKCTSRFTICENKSKFTIASKDISEVDKIKIDGY FDSSSEHRKCDYLFVYTSPLSCVYIFVELKGTDIAHAITQIGNTVNLFYNQGYLKDKKVI GAIVSSRHPSNDGTYRKAKQTLEKSLSSKIKSFRIEKKNKEMTYDPINDKVI >gi|222159311|gb|ACAB01000048.1| GENE 12 21980 - 22546 127 188 aa, chain - ## HITS:1 COG:no KEGG:Desal_2860 NR:ns ## KEGG: Desal_2860 # Name: not_defined # Def: hypothetical protein # Organism: D.salexigens # Pathway: not_defined # 18 187 239 404 409 106 37.0 3e-22 DILDIKVSFNNNGDTVYLMNENKEVKLSQTSSGIQSIIPLWIVFNQYVESKKKQILVIEE PELNLFPSTQHFLIDWIMRKMRKSNGSIVITTHSPYVLSVVDNLILAQEILKKSNKKKLV LSKIKELIPSMALIDFDDVSSYFFHSDGTVKDIRDTDIKSLGAEYIDTASDKLGYIFDEL CNIERNEL Prediction of potential genes in microbial genomes Time: Wed May 18 02:21:28 2011 Seq name: gi|222159310|gb|ACAB01000049.1| Bacteroides sp. 
D1 cont1.49, whole genome shotgun sequence Length of sequence - 22506 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 6, operones - 3 average op.length - 4.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 28 - 67 -0.6 1 1 Tu 1 . - CDS 85 - 732 383 ## Slin_2713 hypothetical protein - Prom 800 - 859 4.3 - Term 739 - 777 2.5 2 2 Op 1 . - CDS 1018 - 1746 447 ## BF3041 hypothetical protein 3 2 Op 2 . - CDS 1740 - 2021 285 ## CLD_0105 hypothetical protein - Prom 2103 - 2162 4.2 - Term 2059 - 2098 -0.0 4 3 Tu 1 . - CDS 2164 - 2682 324 ## BF2331 hypothetical protein - Prom 2790 - 2849 7.0 + Prom 3188 - 3247 10.1 5 4 Op 1 . + CDS 3303 - 3890 212 ## BDI_2596 hypothetical protein 6 4 Op 2 . + CDS 3898 - 5439 887 ## BT_1064 hypothetical protein 7 4 Op 3 . + CDS 5495 - 7096 1316 ## BT_1063 hypothetical protein 8 4 Op 4 . + CDS 7100 - 8071 458 ## BT_1062 hypothetical protein 9 4 Op 5 . + CDS 8093 - 10558 1390 ## gi|237716447|ref|ZP_04546928.1| conserved hypothetical protein 10 4 Op 6 . + CDS 10602 - 12869 1271 ## gi|237716448|ref|ZP_04546929.1| conserved hypothetical protein 11 4 Op 7 . + CDS 12897 - 13688 131 ## gi|237716449|ref|ZP_04546930.1| conserved hypothetical protein 12 4 Op 8 . + CDS 13748 - 14131 253 ## BF1869 hypothetical protein 13 4 Op 9 . + CDS 14184 - 15080 424 ## BF2115 putative AraC-type transcription regulator + Term 15143 - 15192 11.0 + Prom 15153 - 15212 4.2 14 5 Tu 1 . + CDS 15280 - 18579 2922 ## BT_0986 putative DNA-binding protein 15 6 Op 1 . + CDS 18696 - 20129 1147 ## BT_0985 putative sialic acid-specific acetylesterase II 16 6 Op 2 . 
+ CDS 20163 - 22506 1277 ## BT_0984 hypothetical protein Predicted protein(s) >gi|222159310|gb|ACAB01000049.1| GENE 1 85 - 732 383 215 aa, chain - ## HITS:1 COG:no KEGG:Slin_2713 NR:ns ## KEGG: Slin_2713 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 1 215 1 214 401 72 29.0 8e-12 MAHLIVTDFGAIKSANIEIKKYNFFIGHTSSGKSTIAKLLAIFNNSIFWTIKEGDFNSFL RLLDKYNINFEFTSTTIIRYSNEKYYWEIGLNKFHSNYEDADLMEMANTSKSYDFILKFI ERKENNFAYKEFIKSLKNLLDLKDSAMVELIKPALVGLLYEECIPVYIPAERLLISTFSN SIFSLLQAGASIPDCIKDFGSLYEKARIQSIKILI >gi|222159310|gb|ACAB01000049.1| GENE 2 1018 - 1746 447 242 aa, chain - ## HITS:1 COG:no KEGG:BF3041 NR:ns ## KEGG: BF3041 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 171 1 169 171 164 49.0 3e-39 MLTVYHGSTYRVEQPLAGVCRPNLDFGVGFYLTDLKDQAIRWALRTADIRHEKSVWLNIY SLDIDACRNSSFHYLHFTTYDAHWLDFVVACRQGNVIWQDYDIIEGGIADDRVIRTIDLY MRGDYTREEALSRLIHQEPNNQICITNQKVIDEHLHFVDAILLPIPSPSKEIPNADIVMQ GKYYSIVELLATRLHISSLQALDIFYNSESYQRIVHRLGDLYLMSDAYIVDELMRELQKR QG >gi|222159310|gb|ACAB01000049.1| GENE 3 1740 - 2021 285 93 aa, chain - ## HITS:1 COG:no KEGG:CLD_0105 NR:ns ## KEGG: CLD_0105 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B1 # Pathway: not_defined # 3 69 1 67 71 65 47.0 6e-10 MTLKEKQLEFIIYCIENTAERLGRYSADVYNKLKELGAIDGYINAFYDTLHTQGKAYIVD SLLEYIYHRDPQWLPKDYRPFQISTQQKGDKSC >gi|222159310|gb|ACAB01000049.1| GENE 4 2164 - 2682 324 172 aa, chain - ## HITS:1 COG:no KEGG:BF2331 NR:ns ## KEGG: BF2331 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 165 1 152 163 85 32.0 1e-15 MVTVTINIKPYLAGYMYVRYRQSLEPDPENQSHSSSPSSSKRLIPIHLSHITPVYHFLHQ LSVPHPQNTSWKEIGNICFVLPKPRNGKNPEVYNYIGNDSALIIEKEIETEMKAELYSFL LDNKFNKGVMFKKSIEQFVEHYEMVGLVQEETLMRAFQRWRKLVKEEKAIKL >gi|222159310|gb|ACAB01000049.1| GENE 5 3303 - 3890 212 195 aa, chain + ## HITS:1 COG:no KEGG:BDI_2596 NR:ns ## KEGG: BDI_2596 # Name: not_defined # Def: hypothetical protein # 
Organism: P.distasonis # Pathway: not_defined # 21 194 18 187 188 172 52.0 4e-42 MKIQYIAILLLICWGSILKCSAQKVAVKTNLFYGAYSGTPNLGVEWGLSPRTTIELGAGL NWFTPNKASSNKKLVHWLGTMEYRYWTCERFSGHFWGIHIIGTQYNIAGHHLPLLFGDNS SHYRYEGWGAGGGISYGYHFLLSNRWSLEANIGAGYVRLHYDKFRCKTCGEKVGTENRNY FGPTKAAISLIFLIK >gi|222159310|gb|ACAB01000049.1| GENE 6 3898 - 5439 887 513 aa, chain + ## HITS:1 COG:no KEGG:BT_1064 NR:ns ## KEGG: BT_1064 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 130 512 12 388 392 355 49.0 2e-96 MTKLRSFLYLLIIFLTISSLSYGEENPKKWYTGIIQPTALELQQAGDSLHIQILYNFDNI QVSSNHSIELIPVLIAANHQMELPEISIKGRKNYQNLKRKLALMNTQERAFYQSEAPYSI LKGYGMTGKEQIRYSLIIPFEPWMKDAYLNVKKEVMGCCRPGRLLSAIPLFSAVTLEKLP VPYQITPHISYVQPKVEPVKSREMSCEAFLDFVVSKTDIKPDYMNNPTELKKISSMLAEV KNDTAITIRGISVIGYASPEGAVLFNKQLSEGRAKALVNYLLPRFPFSKELYKVEYGGEN WEGLRKMVAGSDMAEKDGILHIIDHIPVEINYRTNTSRKKSLMLYKQGNPYRFMLREYYP HLRKAICKIEYDVQNFNIEQAKVLIHSRPQNLSLNEIYLVALTYKNGSPEFIELFETAVS VFPDDKIANLNAASAALSRKDTLLAEKYLKKAETSTPEYENAVGVLHLLRGDYEQAKLHL NKAAESGLKQANLNLEELAKKEENIELMSKLDY >gi|222159310|gb|ACAB01000049.1| GENE 7 5495 - 7096 1316 533 aa, chain + ## HITS:1 COG:no KEGG:BT_1063 NR:ns ## KEGG: BT_1063 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 528 1 562 566 124 25.0 7e-27 MKTRSFLLSTLAVFIFAGCSSEDAREGNIPGGELDGKAYLSLSLQSHTATSRAANVEEKP GSSGESKAKAVKVLLFDEDDVCLDVADFDGLTVGNSGGESGGTGTPEAVASDAKLVPEKT KKVFVVINPYTDGWDLTSETVKGKPWSAINTAIEAAIANIAANENFMMASAGEGAGIEGA LTGVKVHKPTAYTSEAINQAKTDAQSDPAKISVDRLSAKVELAVKESFSTKPDGATFAFS GWELSVTNKSVKLYSERITYDNATIGAVYRRDKNYLSDEQPDVSNESTMEANMDAAFNYL KNIDSESEEMPAVAQSKGTSLYCLENTMEAKAQQLGFTTKVVVKAKYTPYGLNENSSYFS WKGNYYTLDQLKTEYLKHSDGSGLKVDLPIFLKKAGIMTQEQFDGDQDTKNSVVASLSEG ATATQLNAKTGIIGRFCAVRYYHESVCYYDVLIRHDQNVTEKMALGRYGVVRNNWYHLEL QSVSGPGTPWIPDPSDPDNPTPPGTDDDEADAYISVKITINPWTYWTQGVDLH >gi|222159310|gb|ACAB01000049.1| GENE 8 7100 - 8071 458 323 aa, chain + ## HITS:1 COG:no KEGG:BT_1062 NR:ns ## KEGG: BT_1062 # 
Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 22 319 22 313 317 138 30.0 3e-31 MGIKRHIKKYKIILAAFIVAGITGCDVLHDDLDQCDLFLKFRYDYNMANEDWFAEQVEEV KVFVFDTQGKYIQTLTDNGHSLKSPDYRMLIPYRLKGCTAVVWAGKTDKFYQLPTMDTGD PINKLTLKYEPENNISNNHLDALWHSGPLLMFSSENISNTENVSLVRNTNDITIGITRGN NPVDASKYDIQLITANNFYDYKNNVGESSKNIIYHPCSVEEEDSKTALQTRLHTLRFIKD ADTTFSITEKASGKTIDIGGKTTINLIDYLLMSKPKMMGDQEYLDRRYEWDINIRIGDKE ENGYIALSITINNWTYWFQPTDM >gi|222159310|gb|ACAB01000049.1| GENE 9 8093 - 10558 1390 821 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716447|ref|ZP_04546928.1| ## NR: gi|237716447|ref|ZP_04546928.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 821 1 821 821 1582 100.0 0 MARLRMNITINIKFSANILLLTMFALFTYSCTKENEVTGARIYEGEKIPMSIKMGTRNAT TDKDEVINSVRAIVFNDKNELVYNDVSDASINVDSTYTASIRAARGYNNIYIICNETSEL AEKLAAITLENEIEKVTFSAVGIVAPPPMYGNVARAYVESRSDGKNATVTINNIKMTELP VSVTRMVSRISFTAIKNITNPNEDFKVTKLTVKVCRMPAATPIEEGRAYTEDIWSDDLTI SGTGELDKNGDYIIKDNNYTIPDTVDFITIPATYIPEHLLSQPENASQATYLKIDAQCVL KNGSSQVLNCVYLLNIGQEPPKNYNLTRNNHYQIYATITGMGAMGLYAEIVAMEEHDITI NWKPIDGLVIVSDKAADYDAVADTSRNVNIWNDVSVYSGILKAYHSETGYKDVLFKYGSL IAVHSGTSAGEGFIAPTNASALNDVLWYPGSYDPLSISGWTDIPYLNTDGIPANNTVDQV KAGVGDPCKLAGLSETQIKAQNIVDNGQWHMATPNENQALIAASDNENNSYGYPSFHWLL SPHNRYRDAGGVSQGDRSNGWYWNDDASVFEFSGNSSGASFQSGKDRQSAYMIRCVRNEI PESKMEAGAISSPTYQGTEKGVKAYFGIKSNVPYWTATLVTGEGAGTAEPTDFSFASDGT IVHTTHGSNTENIPVYVKRKESASPRSFRVKVEGVGLDGQTRSILLTVSQSGYDLRATTG LSSLGNISQTGGTYTVSIQLTPTDISIPAGKLFLQVIYGGVQKCLSTKVNTEPNTYSYSV SIEIPENSSPSSINLTVNILLAKDTGVTVPLGSPSITQNGY >gi|222159310|gb|ACAB01000049.1| GENE 10 10602 - 12869 1271 755 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716448|ref|ZP_04546929.1| ## NR: gi|237716448|ref|ZP_04546929.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 755 12 766 766 1481 100.0 0 MNLRTKYKRYFRAACTTLLLLAFVGCAKDKLVEYGPGDATGKVVMVSLKVGIPPAVEPAS TSGYSAAKSLFRDRNDSGLSFTVALEQGQEPHEVTTRVSDGTTKLHNLWLFQFDEHGSIK GNPHKLSDAVTAINDLMTIDVPLVVSENQTLYLLVLGPKLNYDMSGVSTLDELKNWSFDY LINVEGHTQSLITADDEVPLAGEVSGVTVVDIDGGKRGLVEYNKPAGFVGGIEIERLMAR VTLRYKFEVENYRLQGLKLLNVNNTIRLTNLKKNTDADTYATFEMDQLGEPDSNGYYSAT WYVAQNRQGTVTTILSESQRYYKVVNKVPSGAAPPLGTQIEAWAYPTTGTNEYAIYQIYV GNNNTNNFDVEPNHFYNLRTAINAEINSAKNDERIRAYTISQYVEFFSSLNVKASGASFS TLYNKTGATYDLDAAYSVRPIVIQTQGRKVEVEIYTDKDCTQRADKGSSWLLLSSSSNYT DAFNNVKEPLDTRVTASSILPTQVKFYLYNKEYIYDDDGKLVDPGESDKLGKRSLYIKVT TMPEGEGETLQTFHIFRLDQRPAVYAGCFGGERDTDGNYTMGLVHDRPQRSVYQYDVSSG KVEYGYDGIVTAAHSYGTDDVYYGKNATVNLAENIKNLTWSGYIPVPQKDASGHILLYQY QHPASTFSARACYDKNRDEDGNGRIEGEELKWYLPASNQLIGLYIGSLLDASSGQTITED GGTSAKRWYYGLNSYYKTEGGARCVRDIPLPSVTY >gi|222159310|gb|ACAB01000049.1| GENE 11 12897 - 13688 131 263 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716449|ref|ZP_04546930.1| ## NR: gi|237716449|ref|ZP_04546930.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 263 1 263 263 502 100.0 1e-140 MKRITVIIFASLFILLKGYSQTIEVYEDSRGNRYAAINSKDLPKQRIRDKSAVFYNNIQL YEYTPTGQDTNPITDSEGNPVYTDRIIRHIDSMSEAGKINKTVSAFFIVSPDIVYSDGND SEGKDSGTQTMNWATANGYLATANSNSYSTEKNIAVPKGCAMYRGKDGQDAPGTWRIPTL REGSLIMIYYKELEATATQGTDCKAFALSDIKPTTYWLATENTISSQAWSMTIYPDATRV KYKSVRDLSKTSSYYLRCIRDIP >gi|222159310|gb|ACAB01000049.1| GENE 12 13748 - 14131 253 127 aa, chain + ## HITS:1 COG:no KEGG:BF1869 NR:ns ## KEGG: BF1869 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 13 114 13 112 122 70 34.0 3e-11 MKTIKAEILMIEGAPFPVIKEIYDPSNERRNGTITPEAPIVITGQNLDMLTWESTNLYLV SSVNDRMLIECEDIHKYSDDKIYMTVPDISEGEYFLALMILMKDKESFLYIFPISLVVQF AQSKWHD >gi|222159310|gb|ACAB01000049.1| GENE 13 14184 - 15080 424 298 aa, chain + ## HITS:1 COG:no KEGG:BF2115 NR:ns ## KEGG: BF2115 # Name: not_defined # Def: putative AraC-type transcription regulator # Organism: B.fragilis # Pathway: not_defined # 22 291 20 289 290 253 44.0 4e-66 MEKKQPYKQDKVFDNILETEEFNVYTKIDDLPLDENPMYLEEGINGICTGGSALFNVFGN KRRIVSNDLVVIFPFQLASVTEISSDFSMTFLKVPKSLFMDTISGICIPTLDFFFYMRKN FSTPLYDEECQRFIHFCNILIFRINLPRNLFRRESIMQLLRVFYWDIYVAYKRNPKAAEL VKYTRKEKLLFDFFCLVIEFHTVSRDVAFYAKKMCVSAKHLTMVITDMSGRSAKDWIIDY SLLEIKALLRDSDLEIKEVASRTNFQSNSVMTRFFREHTGMTPSEYRERIYVKNEIDL >gi|222159310|gb|ACAB01000049.1| GENE 14 15280 - 18579 2922 1099 aa, chain + ## HITS:1 COG:no KEGG:BT_0986 NR:ns ## KEGG: BT_0986 # Name: not_defined # Def: putative DNA-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1083 1 1086 1105 1929 84.0 0 MKSRLKQRIFALSLLAWTAVCPADAQQTRSLRDQFLSPSDEAKPWTFWYWMYGTVSKEGI TADLEAMKHAGLGGTYLMPIKGIHEGAQYDGKAQQLTPEWWEMVRFSMEEADRLGLKLGM HICDGFALAGGPWITPEESMQKVVWSDTIVNGGKLMAIRLPQPEAYENYYEDIALFALPV EDAADEMQAKITCVNLATTGNIKAAQTVNMDAAGVIRSSYPCYIQYEYGQPFTCRNIEIV LTGNNYQAHRLKVMASDDGVNYRLVKQLVPARQGWQNTDENSTHSIPPTTARFFRFYWTP EGSEPGSEDMDAAKWKPNLKIKELRLHREARLNQWEGKAGLVWRVAQATKEEEVGKQDCY SLSQVINLTKQYTGHSNGKTLTATLPKGKWKLLRMGHTATGHTNATAGGGKGLECDKFNP 
KTVRKQFDNWYAQAFVKTNPEIARRVLKYMHVDSWECGSQNWNKRFAIEFQKRRGYDLMP YLPLLAGIPMESVEQSEKILRDVRTTISELVVDVFYQVLADCAKEYDCQFSAECVAPTMV SDGLLHYQKVDLPMGEFWLNSPTHDKPNDMLDAISGAHIYGKNIIQAEGFTEVRGTWDEY PGMLKALLDRNYALGINRLFYHVYVHNPWLDRKPGMTLDGIGLFFQRDQTWWDKGAKAFS EYATRCQSLLQYGHPVTDIAVFTGEEVPRRSILPERLVPSLPGIFGAERVESERIRLANE GQPLRVRPVGVTHSANMADPEKWVNPLRGYAYDSFNKDAILRLAKAENGRITLPGGASYK VLVLPLSRPMNPEPVLSSEVQKKINELKEAGILVPSLPYTEEDFSVYGLERDMIVPEDIA WTHRRGELGELYFVANQKDETRTFTASMRINGKKPECWNPVTGEMNIHPSYHINGNRTEV TLTLAPNESVFIVYPAEGVDEGYGESSLQLQKEKKDTSKAPLNIALEAKEYTITFAANKK TLTRKELFDWSQESDEQIRYYSGTATYKTTFRWKNKPNKDQQIYLNLGTVYNLATVRVNG VDCGTIWTAPYRADITGALKKGINELEIEVTNTWANALAGADEGKAPFDGIWTNAKYRRA EKTLLPAGLLGPLSFSTTE >gi|222159310|gb|ACAB01000049.1| GENE 15 18696 - 20129 1147 477 aa, chain + ## HITS:1 COG:no KEGG:BT_0985 NR:ns ## KEGG: BT_0985 # Name: not_defined # Def: putative sialic acid-specific acetylesterase II # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 477 1 477 477 928 93.0 0 MKSSIIKTGTMLAGFLLAACLSTHAEVKLPAIFSDGMVMQQQTNANLWGTATPHKKVTVT TSWNGKQYAATADKNGAWKLTVATPEAGGPYTVTFDDGTQKTLNNILIGELWLCSGQSNM EMPMKGFKNQPVENANMDILHSKNPQIRLFTVKRTSTFTPQNDVIGSWKEATPVSVREFS ATAYYFGRLVNEILDVPVGLVVAAWGGSACEAWMTADWLKAFPEAKIPQTEADIKSKNRT PTVLYNGMLHPLIGMTMKGVIWYQGEDNWNRAHTYADMFTRLINGWRAEWKQGDFPFYYC QIAPYDYGIITEKGKEVINSAYLREAQAKVEHRVANSGMAVLLDAGMEKGIHPAKKQVAG ERLALLALTKTYGVEGVNGESPYYKSIEIKNDTVVVSFERANMWISGKNCFESKNFQVAG EDKAFYPAKAWIERSKMLVKSDKVPHPVAVRYCFENYVEGDVYCDGLPLGSFRSDDW >gi|222159310|gb|ACAB01000049.1| GENE 16 20163 - 22506 1277 781 aa, chain + ## HITS:1 COG:no KEGG:BT_0984 NR:ns ## KEGG: BT_0984 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 781 1 803 803 1371 80.0 0 MKRFYLIITILFVGVSALWSQHANVIWNTPSRNSSESMPCGGGDIGMNIWVEDGDVMFYV SRSGTFDENNCQLKQGRVRLRLSPNPFKDAKDFRQELKLKDGYVEIAAGNTQIQFWVDVF HPVIHVEVTNAQPLQTEVFYENWRHQERPVRKGEGQQCSYKWAPPKGTVTEADFISLDKI TDSTGKAETNRTSIKRNQLLFYHRNPEQTVFDVVVAQQGMNEVKSQMMNPLKNLTFGGTL 
FGENLEFAGTADDVYAGTDYRAWKFRSSKAARKEHICIVLHTDQTETIEEWEQGLQTALQ RIVPKGKVSSKTIIQDKKQTRSWWNSFWQRSFIEAEGEAKEITRNYTLFRYMLGCNAYGS VPTKFNGGLFTFDPCHVDEKQSFTPDYRKWGGGTMTAQNQRLVYWPMLKSGDFDMMPSQF DFYNRMLKNAELRSRIYWQHDGACFSEQIENFGLPNPAEYGFKRPDWFDKGLEYNAWLEY EWDTVLEFCQMILETKNYANADITPYLPLIESSLTFFDEHYQQLASRRGRKALDGNGHLI LFPGSACETYKMTNNASSTIAALKVVLETYGEKEEMLKAIPPIPLRYIEIKDTLNPTIAP VLKQTISPAVSWERINNVETPQLYPVFPWRIYGVGKEDLDIARNTYFYDPDAIKFRSHTG WKQDNIWAACLGLTEEAKKLSLAKLSNGPHRFPAFWGPGYDWTPDHNWGGSGMIGLQEML LQTNGEQILLFPAWPKEWNVHFKLHAPGETTVEATLKNGKVTDLKVLPESRKKDIVIMIE K Prediction of potential genes in microbial genomes Time: Wed May 18 02:23:27 2011 Seq name: gi|222159309|gb|ACAB01000050.1| Bacteroides sp. D1 cont1.50, whole genome shotgun sequence Length of sequence - 8075 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 66 - 2720 2229 ## COG3250 Beta-galactosidase/beta-glucuronidase + Prom 2746 - 2805 3.5 2 2 Op 1 . + CDS 2886 - 7280 3547 ## COG0642 Signal transduction histidine kinase 3 2 Op 2 . 
+ CDS 7339 - 8074 658 ## BT_0980 hypothetical protein Predicted protein(s) >gi|222159309|gb|ACAB01000050.1| GENE 1 66 - 2720 2229 884 aa, chain + ## HITS:1 COG:SMb21655 KEGG:ns NR:ns ## COG: SMb21655 COG3250 # Protein_GI_number: 16263752 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sinorhizobium meliloti # 45 801 3 731 755 237 28.0 7e-62 MNKRPFIILSAFLLLFSLIEKGQATETYRPETSVAGFIQLPGSGRQVYNFNPGWRFFRGD VRGAEAVNFDDRSWNVVSTPHTVELMPAEGSGCRNYQGPAWYRKHFVLPAETKGQRVVLH FEAAMGKQILYLNGKRIQEHLGGYLPFTLDLTANGVQAGDSCLLAVFTDNSDDKSYPPGK RQYTLDFAYHGGIYRDVWMIAKSPVAITDAIDSQIVGGGGVFVHFDKISEKSAQVYVNTE VQNDDARSESVTVETTLTDADGKVIKRSSGKLSLKPGEKKSIRQQMEVKNPTLWSPDTPY LYRVQSRIKKGNKSIDGGITRVGIRLAEFRGKDGFWLNGKPFGQLVGANRHQDFAYVGNA LPNSQQWRDAKRLRDVGCTIIRVAHYPQDPAFMDACDELGLFVIVATPGWQYWNKDPKFG ELVHQNTREMIRRDRNHPSVLMWEPILNETRYPLDFALKALKITKEEYPYPGRPVAAADV HSAGVKEHYDVVYGWPGDDEKEDKPEQCIFTREFGENVDDWYAHNNNNRASRSWGERPLL VQAMSLAKSYDEMYRTTGLFIGGAQWHPFDHQRGYHPDPYWGGIYDAFRQKKYAYEVFRS QSPASLQHPLAECGPMIFIAHEMSQFSDKDVVVFTNCDSVRLSIYDGTKTWTKPVVHAKG HMPNAPVIFENIWDFWEARGYSYTQKNWQKVNMVAEGIIDGKVVCTQKKMPSRRSTKLRL YVDTQKVNLIADGSDFIVVVAEVTDDSGNVRRLAKENIVFTVEGEGEIIGDATIGANPRT VEFGSAPVLIRSTRKAGKIKVKARVQFEGTQAPTATEIELESVPAELPFCYEEQTYEIQR TTPSTLNANPVKGSSEGKVQLTEEERQRVLDEVERQQTEFGTEK >gi|222159309|gb|ACAB01000050.1| GENE 2 2886 - 7280 3547 1464 aa, chain + ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 928 1181 8 256 294 159 36.0 5e-38 MIGMYAFAYPNMIVEHYTAERGLPNNIVNCTLKGQDGFIWFGTWYGLCSFDGSKFRSYDN HDGFYSTDIPPRKIQRIVEDKNGFLWIKTIDRKLYLFDKRHESFHAVYDDVKEYSENIQI IKIQTTEEGDVLLLTKDKSLLRAYTDKEGKITMKQLHDSRPNVNVYDMRLKHNIFCETAE FINWIGMDYQILSLRKGEALKDKPADFIQKKVSANPDQFTCASYNSKFLWLGDKKGHIYS IDPQNGVVNRYEIPEIKQPVSNLLVTESGLMYITTNEGAYEYNIGYKQLTKLPFTIPEKD NGIIFYDKYDKVWFQEGNQALTYYDPLNRSHHRFTFPNQNAIGNFEMQDAGEQGMFFMTP GGEILLFDREKLEMTRINQLKPFSDDLPNQLFFHLLLDKDGILWLASTSSGVYRLNFPKK QFQLLTEVSPSPVVPERSTSWNQGIRALFQAQNGDIWVGTRWQALYRLDRNGQVKQIFSD KNYLLGAVYHIMEDKDGNLWFSTKGNGLVKAEPDMNSPHGLRFTRYINDPKNPNSISNND VYFTYQDSQGRIWVGLLGGGLNLISEENGAITFIHKYNGLKQYPAYGLYMEVRTMTEDED GRIWVGTMDGLMSFDGHFTAPEQIQFETYRQVSENSNVADNDIYVLYKDTDSQIWVSVFG GGLNKLVRYDKEKREPIFKSYGIREGMNNDVVKSIVEDKNGNLWFTTEIGLSCFNKATEQ FRNYDKYDGFLNVELEEGSALRTLNGDLWIGTRQGILTFSPDKLETLHTNYDTRIVDFKV SNRDLRSFRECPILKESITYAKAIELNYNQSMFTIEFAALNFYNQNRVSYRYILEGYEKE WHYNGKNRIASYTNVPPGDYTFRVETVDEANPELVSNCTLAITILPPWWLSWWATLIYVI LGLAALYFSLRLAFFMLKMKNDIYIEQKVSEMKIKFFTNISHELRTPLTLIKGPIQELRE REKLSPKGLQYVDLMEKNTNQMLQLVNQILDFRKIQNGKMRLHVSLIDFNEMIASFQKEF RVLSEENEISFTFQLPDEPIMVWADKEKMSIVIRNIISNAFKFTHSGGSIYITTGLTDDG KRCYVRVEDNGVGIPQNKLTEIFERFSQGENAKNSYYQGTGIGLALSKEIVNLHHGQIRA ESPEGQGAVFIVELLMDKEHYRPSEVDFYVGDTETAPVSVEQDPVANAISEDGTEEEPEI DASLPTLLLVEDNKDLCQLIKLQLEDKFNIHIANNGVEGLKKVHLYHPDIVVTDQMMPEM DGLEMLQSIRKDFQISHIPVIILTAKNDEDAKTKAITLGANAYITKPFSKEYLLARIDQL LAERKLFRERIRQQMENQTTTEEDSYEQFLVKKDVQFLEKIHQVIEENMDDSDFNIDTIA SGIGLSRSAFFKKLKSLTGLAPVDLVKEIRLNKSIELIKNTDLSVSEIAFAVGFKDSGYY SKCFRKKYNQSPREYMNEWRKGER >gi|222159309|gb|ACAB01000050.1| GENE 3 7339 - 8074 658 245 aa, chain + ## HITS:1 COG:no KEGG:BT_0980 NR:ns ## KEGG: BT_0980 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 245 1 245 560 477 93.0 1e-133 MNNIKKVLSAWMLVACVLPVAAQYPVIPDSVKARGAKQEAEFERKSDAAWEKALPTVLEE AKKGRPYKPWASKPEDLIKSNIPAFPGAEGGGMYTPGGRGGKVIVVTSLEDSGPGTLREA CETGGARIIVFNVAGVIRLKSPISVRAPYVTIAGQTAPGDGICVTGQSFLIDTHDVVIRH 
MRFRRGAQDVAFRDDAVGGNAVGNIMIDHCSASWGLDENMSIYRHVYNRGADGHGLKLPT VNITI Prediction of potential genes in microbial genomes Time: Wed May 18 02:23:34 2011 Seq name: gi|222159308|gb|ACAB01000051.1| Bacteroides sp. D1 cont1.51, whole genome shotgun sequence Length of sequence - 11604 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 6, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 946 791 ## BT_0980 hypothetical protein 2 1 Op 2 . + CDS 977 - 4102 2329 ## BT_0979 hypothetical protein + Term 4104 - 4157 12.1 + Prom 4128 - 4187 5.6 3 2 Op 1 . + CDS 4279 - 4830 585 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 4 2 Op 2 . + CDS 4913 - 5164 274 ## BT_0977 hypothetical protein + Term 5220 - 5283 8.6 5 3 Tu 1 . - CDS 5283 - 6536 815 ## COG0477 Permeases of the major facilitator superfamily - Prom 6781 - 6840 3.2 + Prom 6502 - 6561 8.3 6 4 Op 1 12/0.000 + CDS 6614 - 7381 747 ## COG2966 Uncharacterized conserved protein 7 4 Op 2 . + CDS 7383 - 7868 407 ## COG3610 Uncharacterized conserved protein 8 4 Op 3 . + CDS 7880 - 8608 527 ## BT_0973 hypothetical protein + Term 8675 - 8720 12.3 - Term 8661 - 8708 10.2 9 5 Op 1 . - CDS 8758 - 9567 616 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 10 5 Op 2 . - CDS 9605 - 10225 504 ## COG1011 Predicted hydrolase (HAD superfamily) - Prom 10403 - 10462 5.2 + Prom 10385 - 10444 6.9 11 6 Tu 1 . 
+ CDS 10479 - 11582 590 ## BT_0969 hypothetical protein Predicted protein(s) >gi|222159308|gb|ACAB01000051.1| GENE 1 2 - 946 791 314 aa, chain + ## HITS:1 COG:no KEGG:BT_0980 NR:ns ## KEGG: BT_0980 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 314 247 560 560 547 90.0 1e-154 NSIFSEALDTYNHAFGATIGGHNSMFCRNLFASNISRNSSVGMDGDFNFVNNVVFNWWNR SVDGGDHNSFYNMINNYFKPGPITPIGKPISYRILKPEAGRDKNRPLSFGKAYVNGNIIH GNAKVTKDNWDGGVQLKEEVDAAKFLPLIKSDEAFKMPPVTVMDTKKAYTFVLDNVGANF PKRDAVDARVIKTVQTGKAIYAKDAPEFVSPYVKRRLPADSYKQGIITDIRQVGGLPEYK GEAVVDSDGDGMPDAWEIANGLNPNDPADANMDCNGDGYTNIEKYINGIDTRKKVDWTDL KNNYDTLSKRKSLL >gi|222159308|gb|ACAB01000051.1| GENE 2 977 - 4102 2329 1041 aa, chain + ## HITS:1 COG:no KEGG:BT_0979 NR:ns ## KEGG: BT_0979 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1041 1 1031 1031 2013 90.0 0 MKIQVRTILLGLLSIGFVQSYAQTFALQVKNDQITYLNDDRGNRILDFSTCGYKSSEQDI PSVRNVVFVPWKAGDNTARIQRAIDYVASLTPDASGFRGAVLLDQGEFSLSGSIRISASG IVLRGTDKEKTILLKKGVDRGALIYMEGMDDLNVQDTLKVLSHYVPVNARTLEVASGVSL KKGDRVMVTRPSGKEWIASLGCDIFGGGISALGWKEGDMDLTWDRTVCEVNGNQVTLDAP LTVALDANYGTSSLLTYQWNGRIHDCGVENMTLISDYDKRYPKDEDHCWTGISIEDAENC WVRLVNFKHFAGSAVIVQRTGSKITVEDCISKEPVSEIGGMRRCTFHTLGQQTLFQRCYS EQGIHDFAAGYCAAGPNAFVQCDSYESLGFSGSIDAWACGLLFDVVNIDGHNLTFKNLGQ DKNGAGWNTANSLFWQCTAAEIECYAPAKDAMNRAYGCWAQFSGDGEWAQSNNHVQPRSI FYAQLGERLNKECAERARILPRNTSATSSPTVEVAMELAKEAYNPRLTLEHWIGDNKFAP SVASTGVKSIDDIKEKKSAALANSSSYSAAAKLLTQPEVTVTNGRIQMDGALLVGGSHTT PWWNGKLKTNYLKKASPAITRFVPGREGLGLTDRIDSVVDFMKQKNILVFDQNYGLWYDR RRDDHERVRRRDGDVWGPFYEQPFGRSGQGTAWEGLSKYDLKRPNAWYWSRLKEFAEKGN KDGLLLFHENYFQHNILEAGAHWVDSPWRSSNNINQTGFPEPAPFAGDKRIFVADMFYDI THPVRRELHRQYIRQCLNNFADNSNVIQLTSAEFTGPLHFVQFWLDVIAEWETETGKKAK VALSTTKDVQDAILADPKRAAVVDIIDIRYWHYKTDGIFAPEGGKNMAPRQHMRKMKVGK VTFTEAYKAVNEYRQKFPQKAVTFYAQNYPAMGWAVFMAGGSCPVIPCTDKAFLKDAAAM EVEETNTDEYKKMVKSDIGSIIYSKSGTEIPVQLSSGKYVLKYIHPASGKIETINKSLKI NGLYNLKVPDKKEGIYWFHKL 
>gi|222159308|gb|ACAB01000051.1| GENE 3 4279 - 4830 585 183 aa, chain + ## HITS:1 COG:BH3216 KEGG:ns NR:ns ## COG: BH3216 COG1595 # Protein_GI_number: 15615778 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus halodurans # 5 183 3 178 193 67 26.0 2e-11 MMEQTNHSTDTLLASFQAGNMAAFSQLYNLHINVLFNYGLKLTIDKELLKDCIHDIFVKL YTKKDELGTIDNLRSYLFISLKNKLCDELRRRMYMSDTAVEEVSISTPTDVEDDYMEEEQ RKNEFSLVRRLLDQLSPRQREALTLYYIEEKKYEDICEIMNMNYQSVRNLMHRGLTKLRS LAS >gi|222159308|gb|ACAB01000051.1| GENE 4 4913 - 5164 274 83 aa, chain + ## HITS:1 COG:no KEGG:BT_0977 NR:ns ## KEGG: BT_0977 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 83 1 83 83 137 83.0 1e-31 MCSKVMDFLTDDDFINYVLGVTPQSASQWETYFREHPEEMADAEEAKAVLLAPANVDCGF SIVENNELKDRIISSIKDFSGIL >gi|222159308|gb|ACAB01000051.1| GENE 5 5283 - 6536 815 417 aa, chain - ## HITS:1 COG:BMEI0267 KEGG:ns NR:ns ## COG: BMEI0267 COG0477 # Protein_GI_number: 17986551 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Brucella melitensis # 11 379 20 365 397 113 28.0 8e-25 MNNQFTHPKERIRFAILTFFFAQGLCMASWASRIPDFKDVFAANYAFYWGLILFMIPVGK FVAIPLAGYLVSKLGSRSMVQVSILGYASSLLCIGLAHEVYLLGFLLFCFGVCWNLCDIS FNTQGIEVERIYGKTIMATFHGGWSLGACAGALIGFVMILAGVSPIWHYTLIFIIILIIA LSGRKYLQESAPQEAEVSDTKMKERNTAKAPNGFRLLFQKPEMLLLQLGLVGLFALIVES AMFDWSAVYFESVVHVPKSLQIGFLVFMIMMATGRFLTNYAYQLWGKKKVLQLAGSFICI GFFVSALLGGVFESMAMKVIINSLGFMLVGLGISCIVPTLYSFVGAKSKTPVSIALTILS SISFIGSLIAPLLIGAISQAFDIRIAYMIIGILGGCIVLIVSFSSAFDIQESGDKPE >gi|222159308|gb|ACAB01000051.1| GENE 6 6614 - 7381 747 255 aa, chain + ## HITS:1 COG:Cj1166c KEGG:ns NR:ns ## COG: Cj1166c COG2966 # Protein_GI_number: 15792490 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # 
Organism: Campylobacter jejuni # 13 247 12 249 258 142 34.0 7e-34 MKTGESLLSIGKFIAEYSAHLMGAGVHTSRVVRNTKRIGEAFGLDVKLSVFHRNIILTII DKETNEACNEVIDIPAHPISFEHNSELSALSWEAVDNHLSLEELKDKYKKIISAPRIHPL FVLLLVGFANASFCKLFGGDLISMGIVFSATITGFYLKQQMQAKKINHYVVFIVSAFVAS LCASTALIFDTTSEIAMATSVLYLVPGVPLINGVIDVVEGYVLTGFARLTEASLLIVSIA IGLSFTLLMVKNSLI >gi|222159308|gb|ACAB01000051.1| GENE 7 7383 - 7868 407 161 aa, chain + ## HITS:1 COG:Cj1165c KEGG:ns NR:ns ## COG: Cj1165c COG3610 # Protein_GI_number: 15792489 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Campylobacter jejuni # 6 161 5 160 164 85 35.0 4e-17 MIALDILTDGFFAAIAGIGFGAISDPPLRAFKMIAILAALGHACRFCLMTYLGVDIATGS LFAGLVIGFGSLWLGKKVYCPMTVLYIPALLPMIPGKFAYNMVFSLIMCLQNVNDPDKLD KFMSMFFSNTLIASTVIFMLAVGATFPMFLFPHRAFSLTRH >gi|222159308|gb|ACAB01000051.1| GENE 8 7880 - 8608 527 242 aa, chain + ## HITS:1 COG:no KEGG:BT_0973 NR:ns ## KEGG: BT_0973 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 241 1 241 242 379 93.0 1e-104 MEIFWRTIAYYNSATWLLQIVIILIGIALTGLLIGRPRPWVKMAMKFYMIGLYTWISLVY YYIYCEERSYNGVMAMFWGVMAIIWIWDTITGYTTFERTHKYDLLSYVLLAMPFIYPLVS LARGLSFPEMTSPVMPCSVVVFTIGLLLLFAQKVNMFLVLFLCHWSLIGLSKTYFFQIPE DFLLASATIPGLYLFFREYFLNNLHADTKPKAKYINWLLISVCVGLAVLLTTTMFLELVP KG >gi|222159308|gb|ACAB01000051.1| GENE 9 8758 - 9567 616 269 aa, chain - ## HITS:1 COG:mll2118 KEGG:ns NR:ns ## COG: mll2118 COG1028 # Protein_GI_number: 13471973 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Mesorhizobium loti # 7 182 10 186 271 160 47.0 3e-39 MQPQVILITGASSGFGKITAQMLSERGHIVYGTSRKPSEDMNQVKMLVVDVTNSFSVCQA VERILLEQGRIDVLINNAGIGIGGALELATEEEVNIQMNTNFFGVVNMCKAVLPSMRKAR KGKIINISSIGGVMGIPYQGFYSASKFAVEGYSEALALEVHPFHIKVCVVEPGDFNTGFT 
DNRNISEQTRLDADYGESFLKSLEIIEKEERNGCHPQKLGAAICKIVERTNPPFRTKVGP WIQVLFAKSKKWLPDAVMQCALRIFYAIK >gi|222159308|gb|ACAB01000051.1| GENE 10 9605 - 10225 504 206 aa, chain - ## HITS:1 COG:L111950 KEGG:ns NR:ns ## COG: L111950 COG1011 # Protein_GI_number: 15672092 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Lactococcus lactis # 2 191 3 192 207 96 30.0 4e-20 MIKNIVFDFGGVIVDIDRDKAVQAFIKLGLADADTRLDKYHQTGIFQELEEGKLSADEFR KQLGDLCGRPLTMEETKQAWLGFFNEVNLNKLDYILELRKSYHVYILSNTNPFVMSWACS PDFSSKKKPLNDYCDKLYLSYQVGHTKPAPEIFEFMVNDCNIIPSETLFVDDGASNIHIG KELGFETFQPKNGEDWREEMTAILEK >gi|222159308|gb|ACAB01000051.1| GENE 11 10479 - 11582 590 367 aa, chain + ## HITS:1 COG:no KEGG:BT_0969 NR:ns ## KEGG: BT_0969 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 362 1 362 362 546 81.0 1e-154 MNNLRILLYMMLLSLAACHPEGTSVKQGLDKAAQLMEQDPDTASIILETIQSSQMNEAQL AEYNLLCTQLNEDKNIPHSSDKQIRQAASYYEKHGDEYQKSKAYYYLACVESDLEQKENA EIHFKEAIKLAKETEEYDHLAKICKRCSLYYQKYGNFDEALEMERKAYASQLILNDNKSD SSVILSSALGMFGVMSLLLGLLWKKNRHALSQLDLFKEEILKKDVESDKLMLRCNHLEEK YQSLQLHIYESSPVVSKVRQFKERNVLSSKIPSFSEKDWTELLRLQENVYGLVSKLKEIS PKLTEEDLRVCAFLREGVQPAYFADLMKLTTETLTRRISRIKTEKLMLINSKESLEDIVK SLGASPL Prediction of potential genes in microbial genomes Time: Wed May 18 02:24:04 2011 Seq name: gi|222159307|gb|ACAB01000052.1| Bacteroides sp. D1 cont1.52, whole genome shotgun sequence Length of sequence - 15523 bp Number of predicted genes - 11, with homology - 10 Number of transcription units - 7, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 557 - 616 8.1 1 1 Op 1 . + CDS 771 - 3905 2008 ## BT_0968 hypothetical protein 2 1 Op 2 . + CDS 3917 - 4162 240 ## BT_0967 hypothetical protein + Prom 4168 - 4227 4.3 3 2 Op 1 6/0.000 + CDS 4263 - 4814 466 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Prom 4851 - 4910 6.7 4 2 Op 2 . 
+ CDS 4946 - 5887 616 ## COG3712 Fe2+-dicitrate sensor, membrane component + Term 5918 - 5957 -0.0 + Prom 5902 - 5961 3.4 5 3 Tu 1 . + CDS 6013 - 8637 1561 ## BT_0964 hypothetical protein + Prom 8710 - 8769 6.1 6 4 Tu 1 . + CDS 8794 - 9264 434 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold 7 5 Tu 1 . - CDS 9297 - 9485 103 ## - Prom 9587 - 9646 2.9 + Prom 9316 - 9375 5.9 8 6 Op 1 . + CDS 9454 - 10962 905 ## COG1145 Ferredoxin 9 6 Op 2 . + CDS 10982 - 12385 1007 ## COG1453 Predicted oxidoreductases of the aldo/keto reductase family 10 6 Op 3 . + CDS 12394 - 12846 444 ## BT_0961 hypothetical protein + Term 12882 - 12931 10.5 - Term 12868 - 12919 7.1 11 7 Tu 1 . - CDS 12961 - 15480 1413 ## COG5002 Signal transduction histidine kinase Predicted protein(s) >gi|222159307|gb|ACAB01000052.1| GENE 1 771 - 3905 2008 1044 aa, chain + ## HITS:1 COG:no KEGG:BT_0968 NR:ns ## KEGG: BT_0968 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1044 1 1044 1044 1824 82.0 0 MSYLLFKRGNDLGGWIVFLIAAFVYGMTIEPTASFWDCPEFISCAEKLQVGHPPGAPFYM LVGNLFTQFASDASQVSRMVNFLNALLSAGCILFLFWSITRLVRSILTDDARKLSTTDVI IILGAGFVGALAYTFSDTFWFSAVEGEVYAFSSFLTALVFWMILRWQDEADSISGDRWII LIAYIIGLSIGVHLLNLLCIPAIVLVFYYRKYQAVSLKGVIGTIALSGLLIVLILFVYIP GMADMGGWFELFFVNVLGFPFQTGLIIFLTLVLSLLIGAIYRFRKRIVHTGLWCLLMLTI GYTTYAVILIRANANTPLNENAPDHIFTLKSYLNREQYESAPLLYGRTYASEPEYVPDRD YYKVKTKKGSAIYRQDKEEGKYKIIGYKENVCYTQNMLFPRMWNDRLAASYKGWSGSTND VPTQKENLTYFITYQLNYMYVRYFLWNFVGRQNDIQGSGEPEHGNWITGISWLDNLRLGD QKLLPESLQQNKGHNVFYGLPLLLGLLGIYWQWARGKKGKQQFSVLFFLFFMTGLAIVLY LNQTPGQPRERDYAYAGSFYAFAIWIGMGAAGCCDMLRRKHFKVLPVSLLMLLCLLIPVQ MASQTWDDHDRSNRYTCRDFGANYLMTLPDTGNPIIFCNGDNDTFPLWYNQDTEEVRRDA RICNLSYAQTDWYIYQQQCPLYNAPGLPISWKQNQYQEGKNEYVAIRPELKKQIEELYQK HPEEARDSFGDDPYEIKNILKHWVFAEKQEFHVIPTDTINIHIDKDAVLRSGMMLPKAIR HLKGEELKNAIPDKLSISLKDIRLLTKVDLLILEILANCNWERPLYMAISVGNATKLKFD DYFVQEGLAFRFTPFNYKKWGDAEGDNGYAIDTEKFYENVMNRYKYGGLDTPGLYLDETT 
MRICYSHRRLFAQLAKELVKQGDNARAQKVLAYAEQAIPAYNVPQVYESGSYEIATAYAS IGESGKAITLLNDLIAESRDYIDWAFSLGDSRIAMVQRDCLYKFWQWNQCNELLKDMDKE RYKQSNQQFEEKYMRLAQLMNYQN >gi|222159307|gb|ACAB01000052.1| GENE 2 3917 - 4162 240 81 aa, chain + ## HITS:1 COG:no KEGG:BT_0967 NR:ns ## KEGG: BT_0967 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 77 1 67 73 75 73.0 8e-13 MKTKIVKEKELETVLSKKNYIILIIGSILIIAGYILMSGEGSTLAAYHPDIFSETRIRIA PLVCLLGYLLNVFGILYRPLK >gi|222159307|gb|ACAB01000052.1| GENE 3 4263 - 4814 466 183 aa, chain + ## HITS:1 COG:XF2239 KEGG:ns NR:ns ## COG: XF2239 COG1595 # Protein_GI_number: 15838830 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Xylella fastidiosa 9a5c # 4 179 9 194 206 68 26.0 8e-12 MSNNIDIKTLEAFQDGNHKAFETIFIAYYNKTKTFIDGYIKSEPDAEELTEDLFVNLWIN RHSIDTSKSFHSYLHTIARNAAINFLKHKYVCDAYLNNNQDTEYSSTSEEDLIAKELEML IDKLVGGMPEQRRMIYTLSRNEGLSNAEIAERLNTTKRNVESQLSLALKEIRKVISCFLV SLL >gi|222159307|gb|ACAB01000052.1| GENE 4 4946 - 5887 616 313 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 107 264 123 278 331 81 35.0 2e-15 MFLSARFPSETEEKVQKWIIKDKNQQAKAKASLDFWNELDVEADSNTYASLERVNLRTGY NKEHLTNIVSYQKFARIAAVIIPLFLFAGGMFYYLSPHNEMIEVSVAYGEQKHLILPDSS EVWLNAGSTILYPETFAKDKRLVMLDGEAYFSVKKDTASPFIVEASQLSVKVLGTRFNVK AYPNDEKITTTLTSGKVEVSVQSQPPHILKPNEQLTYDKKSSDIHISMIDTNDTNCWIVG KLVFTNASAGEIFRTLERHYNTTIDNTATIPTSKRYTVKFLKDESLDEILNILKDIIGFD YQQYEKKIVVTQP >gi|222159307|gb|ACAB01000052.1| GENE 5 6013 - 8637 1561 874 aa, chain + ## HITS:1 COG:no KEGG:BT_0964 NR:ns ## KEGG: BT_0964 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 812 1 799 800 1309 81.0 0 
MNRINTKKRAGKLAIFLFFTFLSLTTVAQNKEKKITIQNKKISLKEAFAQIETQTGYSIA YEQSKLDIEKKHSLSLKNATVDKAMTQLLKGTGYAYKIKGYHIIISLQDNKQKPANNDTQ KLTQTIRGIVVDSKTNTPIEYASVCITEDPSRGGSTDERGNFRINNVPVGRYNIQATFMG YRSSIITEVSVTSSKEVFVEIPMDENVQDLAEVLVKPEIKKDRTVNPMAITGGRMISIEE ASRFANGFDDPARLSSAFAGVAGDVGTNAVAIRGNAPQFTQWRLEGIEIPNPTHFADLSG LGGGFLSGLSTQVIGNSDFYNGAFPAEYSNALSGVFDMHLRNGNNQKYEHAFQVGLMGID LASEGPINKKRGSSYLINYRFSTTSLASGNDINLKYQDLAFKLNFPTRKAGTFSIWGLGL IDRNKAEVLDRSEWETMGDRSSGSNKLDKLAGGLTHKYVINENTYIRSSLSATYSKDHSL VNLQTDDGTIVQVGDIQNSRWNFVFNSYLNKKFSSRHTNRTGITITELKYDLDYKVSPYF GLNQPMEQLSKGSGESTVFSAYSSSVINLSNNLTTSLGVTGQYFTLNKNWSIEPRVALKW EINPAHSLAVAYGLHSHRERLDYYFVEQVINGKKESNRYLDFSKAHHFGFTYDWNINQSL HLKIEPYYQYLFHIPVEKNSSFSIINYEEFYLDRILTSTGTAKNYGIDITLEQYMKNGFY YMITGSLFKSKYRGGDRIWRNTRLDKSYLVNLLAGKEWMVGRLKQNVLSINGRLFFQGGG RYTPVDEEKSQEERDIVFDESKAYTKRFNPSINGDVSISFRINKKRVSHEFSLKILNVGM RTGMHFYEYNERTSVVEEKDGSGLIPNISYKIYF >gi|222159307|gb|ACAB01000052.1| GENE 6 8794 - 9264 434 156 aa, chain + ## HITS:1 COG:BH2089 KEGG:ns NR:ns ## COG: BH2089 COG2220 # Protein_GI_number: 15614652 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Bacillus halodurans # 57 155 259 357 370 91 36.0 7e-19 MRIIFKKFRTRMIVGCILAVIALLAVSVVVFINQPSFGRTPRGERLERVMKSPNYRNGGY DTHYAEIGNRFPNIDLAILENGQYDKEWSLIHLMPQYMAQTARDLKAKKVLTVHHSKYAL AKHRWDEPLKNAEEMKNKDYLNVLIPEIGEVVTLEK >gi|222159307|gb|ACAB01000052.1| GENE 7 9297 - 9485 103 62 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAAESLIVRSINFFNFQFIVVISKCPKQSGFNLMYFNVANLISEICFSTIQITECFTNFT DK >gi|222159307|gb|ACAB01000052.1| GENE 8 9454 - 10962 905 502 aa, chain + ## HITS:1 COG:MTH401 KEGG:ns NR:ns ## COG: MTH401 COG1145 # Protein_GI_number: 15678429 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Methanothermobacter thermautotrophicus # 194 484 21 323 337 71 24.0 5e-12 MLRTIRLSAAIVCFTLITLLFLDFTGTLHTWFGWLAKIQFLPALLALNIGVVLFLIVLTF 
LFGRIYCSVICPLGVFQDIVSWISGKQKKNRFRYSSAMKWLRYGVLGVFIIMMVAGLNSL AILLAPYSAYGRIASSLFAPVWQWGNNLLAYFAERMDSYAFYEVDVWMKSLSTLIIAVVT LIVLFILAWRNGRTYCNTICPVGTVLGFISKYAIFKPVIDTSKCNSCGLCARNCKASCIN SKAHEIDYSRCVACMDCIGKCKHGAIKYTRRKPKNETATSEDMKAKAVTTEQIDNARRSF LSASAIFATTSVLKAQEKKVDGGLATIEDKKIPNRENPIYPPGALSARNFTQHCTACQLC VSVCPNQVLRPSDNLMTLMQPEMSYERGYCRPECTKCSEVCPAGAIHLTSLAEKSAIQIG HAVWIKENCVPLTDGMECGNCARHCPTGAIQMVASDPDKADSLKIPVVNVEKCIGCGACE NLCPSRPFSAIYVSGHQMHRII >gi|222159307|gb|ACAB01000052.1| GENE 9 10982 - 12385 1007 467 aa, chain + ## HITS:1 COG:MA0422 KEGG:ns NR:ns ## COG: MA0422 COG1453 # Protein_GI_number: 20089314 # Func_class: R General function prediction only # Function: Predicted oxidoreductases of the aldo/keto reductase family # Organism: Methanosarcina acetivorans str.C2A # 54 467 1 385 400 231 36.0 2e-60 MEEKNKKDINRRDFIKIVGISAATSTGLLYGCSSKGTTSSSSATGEGEVPTDKMTYRTSP TTGDRVSLLGYGCMRWPLKSAPDGNGEVIDQDAVNGLIDYAIAHGVNYFDTSPAYVQGFS EKATGIALSRHPRDKYYIATKLSNFSPDTWSREASLKMYHKSFAELQVDYIDYMLLHGIG MGGMEALKGRYLDNGILDFLVKEREAGRIRNLGFSYHGDIEVYDYLLSRHDEFKWDFVQI QLNYVDWKHAKETNTRNTDAEYLYGELAKRGIPAIIMEPLLGGRLSKLNDNLVARLKQRR PESSVASWAFRFAGSFPDILTVLSGMTYMEHLQDNLRTYSPLEPLTDEEKEFLEETAQLM LKYPTIPCNDCKYCMPCPYGLDIPAVLLHYNRCVNEGNVARSGQDENYAKARRAFLVGYD RSVPKLRQASHCIGCNQCVAHCPQNIDIPKELHRIDQFVEQLKQGTL >gi|222159307|gb|ACAB01000052.1| GENE 10 12394 - 12846 444 150 aa, chain + ## HITS:1 COG:no KEGG:BT_0961 NR:ns ## KEGG: BT_0961 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 140 1 140 144 233 82.0 2e-60 MEELINLLHTGGYSCTIANGGKIRTFTQRGVADLYDLLTQEPEFLKGALIADKVVGKGAA ALMILGGIKELYTDIISTKALELFRKSDVKVDFAQEVAFIWNRDRTGGCPVETMCIEVES AEEILPLIRDFLEKSEVGSKKEITETRNKV >gi|222159307|gb|ACAB01000052.1| GENE 11 12961 - 15480 1413 839 aa, chain - ## HITS:1 COG:BH4026 KEGG:ns NR:ns ## COG: BH4026 COG5002 # Protein_GI_number: 15616588 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 
288 530 357 598 607 137 32.0 7e-32 MWIGTAIGLYLLNRNTGEYQYIEMEVGITYINTLYQADNGLLYIGTNGMGVFIYHPQDKT FEHYFSDNSALVSNRIFTILSEVDGRIMMSTENGITCFHTKEKIFRNWTKGEGLLPSYFN AAAGTVRKNKNFVFGSTDGAIELPVNVKFPDYKFSRLIFSDFHLSYQPVYPGVKGSPLQK SIDETDVLELAYDENTFSFEVSTINYDSPGSALYSWKLEGFYEKWTQPGANNLIRFTNLP PGKYTLHVRAVSREEHDIVFQERTMKVIITQPFWSSWWAILCYILLVIGGFYFVLRVINL RKQKNISDEKTQFFINTAHDIRTPLTLIKAPLEELLEEETLTDNGITRTNIALRNVEVLL RLVSNLINFERTDVYSSKMSVSEYELNTYMNDIYDTFASYAAIRRIEYTYESTFSYMNVS FDKEKMDSILKNIISNSLKYTPENGKVSISVSDTNDSWKVIIKDTGIGIPASEQSKLFKL HFRASNAINSKVTGSGIGLMLVGKLVSLHGGKISVDSVEHQGTTIKIVFPKKNKNSQNIS DEAPSKFEALAPVLPAPNVPAKTTATIDNPNLRRILVVEDNDELRSYLVSSLSSIYNVQA CANGKEALIIIKEYWPELVLSDIMMPEMRGDELCVAIKSDIEISHIPVLLLTALGEENNI LDGLSIGADEYLIKPFSVKILRANIANLLANRELLRMRYANLDIEAKSMVPSANGTNSLD WKFISNVKKIVDENINNPEFSVNMLCESSGMSRTSFYCKLKALTGQSPTEFIRVMRLKRA TELLKEGEYAINEISDMVGFSETKYFREVFKKYYKMSPSRYAKGGGNPAATDLEDDDED Prediction of potential genes in microbial genomes Time: Wed May 18 02:24:46 2011 Seq name: gi|222159306|gb|ACAB01000053.1| Bacteroides sp. D1 cont1.53, whole genome shotgun sequence Length of sequence - 37772 bp Number of predicted genes - 34, with homology - 34 Number of transcription units - 11, operones - 9 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 1459 794 ## BT_0958 two-component system sensor histidine kinase/response regulator, hybrid ('one component system') - Prom 1492 - 1551 5.8 - Term 1561 - 1601 4.1 2 2 Op 1 22/0.000 - CDS 1701 - 2873 769 ## COG0842 ABC-type multidrug transport system, permease component 3 2 Op 2 9/0.000 - CDS 2857 - 4041 650 ## COG0842 ABC-type multidrug transport system, permease component 4 2 Op 3 13/0.000 - CDS 4064 - 5050 930 ## COG0845 Membrane-fusion protein 5 2 Op 4 . - CDS 5057 - 6469 1206 ## COG1538 Outer membrane protein - Prom 6715 - 6774 77.5 + TRNA 6701 - 6773 79.0 # Gly CCC 0 0 + Prom 6703 - 6762 79.8 6 3 Op 1 . 
+ CDS 6980 - 8473 1658 ## COG0442 Prolyl-tRNA synthetase + Term 8500 - 8540 4.5 + Prom 8505 - 8564 6.6 7 3 Op 2 . + CDS 8686 - 9363 177 ## PROTEIN SUPPORTED gi|149011191|ref|ZP_01832496.1| 30S ribosomal protein S9 8 3 Op 3 . + CDS 9386 - 9943 417 ## BT_0927 two-component system sensor histidine kinase 9 3 Op 4 . + CDS 9940 - 10659 536 ## COG0642 Signal transduction histidine kinase 10 4 Op 1 24/0.000 - CDS 10791 - 11747 725 ## COG1277 ABC-type transport system involved in multi-copper enzyme maturation, permease component 11 4 Op 2 5/0.000 - CDS 11740 - 12483 296 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 12 4 Op 3 . - CDS 12487 - 13638 950 ## COG1470 Predicted membrane protein - Prom 13806 - 13865 5.7 13 5 Op 1 . + CDS 13958 - 14395 509 ## BT_0923 putative periplasmic protein 14 5 Op 2 . + CDS 14417 - 15274 829 ## BT_0922 hypothetical protein + Term 15311 - 15376 19.5 - Term 15310 - 15353 5.1 15 6 Tu 1 . - CDS 15366 - 19820 3205 ## BT_0921 hypothetical protein - Prom 19944 - 20003 4.6 + Prom 19796 - 19855 4.9 16 7 Op 1 . + CDS 20021 - 21040 650 ## PROTEIN SUPPORTED gi|227425790|ref|ZP_03908856.1| SSU ribosomal protein S18P alanine acetyltransferase 17 7 Op 2 . + CDS 21080 - 22312 680 ## COG1546 Uncharacterized protein (competence- and mitomycin-induced) + Prom 22356 - 22415 4.6 18 8 Op 1 . + CDS 22435 - 22695 443 ## PROTEIN SUPPORTED gi|29346326|ref|NP_809829.1| 50S ribosomal protein L28 19 8 Op 2 . + CDS 22717 - 22905 320 ## PROTEIN SUPPORTED gi|53713719|ref|YP_099711.1| 50S ribosomal protein L33 20 8 Op 3 . + CDS 22923 - 23081 244 ## PRU_0750 hypothetical protein + Term 23107 - 23156 9.0 + Prom 23134 - 23193 8.6 21 9 Op 1 3/0.000 + CDS 23220 - 24179 734 ## PROTEIN SUPPORTED gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 22 9 Op 2 . + CDS 24176 - 25486 1208 ## PROTEIN SUPPORTED gi|229870452|ref|ZP_04490046.1| SSU ribosomal protein S12P methylthiotransferase 23 9 Op 3 . 
+ CDS 25479 - 25751 251 ## BT_0912 DNA-binding protein HU 24 9 Op 4 . + CDS 25765 - 27183 905 ## BT_0911 putative integration host factor IHF alpha subunit + Term 27198 - 27247 12.2 + Prom 27337 - 27396 6.2 25 10 Op 1 23/0.000 + CDS 27424 - 28419 1041 ## COG0714 MoxR-like ATPases 26 10 Op 2 . + CDS 28503 - 29372 796 ## COG1721 Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) 27 10 Op 3 . + CDS 29384 - 30460 856 ## BT_0908 hypothetical protein 28 10 Op 4 5/0.000 + CDS 30510 - 31493 858 ## COG2304 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain 29 10 Op 5 . + CDS 31539 - 32567 947 ## COG2304 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain 30 10 Op 6 . + CDS 32576 - 33298 712 ## BT_0905 hypothetical protein 31 10 Op 7 . + CDS 33330 - 35150 923 ## BT_0904 hypothetical protein 32 10 Op 8 . + CDS 35202 - 36002 559 ## BT_0903 hypothetical protein + Term 36039 - 36076 2.1 + Prom 36045 - 36104 7.7 33 11 Op 1 . + CDS 36131 - 36421 260 ## BT_0902 hypothetical protein + Term 36435 - 36477 6.4 34 11 Op 2 . 
+ CDS 36499 - 37620 599 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins + Term 37678 - 37727 8.5 Predicted protein(s) >gi|222159306|gb|ACAB01000053.1| GENE 1 1 - 1459 794 486 aa, chain - ## HITS:1 COG:no KEGG:BT_0958 NR:ns ## KEGG: BT_0958 # Name: not_defined # Def: two-component system sensor histidine kinase/response regulator, hybrid ('one component system') # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 479 1 479 1329 689 67.0 0 MKYIIYILLFFPIWVTAQTYKYIGIENGLSNRRIFNIQKDTQGYMWFLTNEGMDRYNGKD IKHYKLNKEASTLDAPVRLGWLYTGPQIGIWVIGKQGRVFRYEANEDDFRMVYKLPDISE TISCGYLDRNNNIWLCCKDSILLYNINDAHILQFSNILHSNITAIEQIDEQHFFIATELG VRYVKLENGILETIPVKTLDYFHAQVSELYFHKQLKRLFIGSFERGVFVYDMNTQEIIRP EADLSDVNIARISPLNETELLIATEGMGVYKVNANTCELEHYIVANYQSYNEMNGNNIND VFVDEEKRIWLANYPTGITVIDNRYENYHWMKHAMGNVQSLINDQVQAVIEDHEGDLWFG TSNGISLYNSKTGQWHSFLSSFNQQLKDKNHIFITLCEVSPGIIWAGGYTSGIYKINKKT LSVEYFSPYLLSHVNMRPDKYIRDIVKDSRGYIWSGGYYNLKCFDLETNTTRLYSGLNSN PFGSNG >gi|222159306|gb|ACAB01000053.1| GENE 2 1701 - 2873 769 390 aa, chain - ## HITS:1 COG:jhp1379 KEGG:ns NR:ns ## COG: jhp1379 COG0842 # Protein_GI_number: 15612444 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Helicobacter pylori J99 # 13 389 6 374 376 129 28.0 9e-30 MKTLSKLQQLSFIIRREFLAISTSYAVLLVLMGGIFVYGLLYNYMYAPNIVTDVPVAVVD NSHSELSRDFIRWLDATPQAEIYSQAMDYHEAKEWMKEGKVQGILYLPHDFEKRVFRGDE AVFSLYATTDAFLYFEALQEASSRVMLAINDKYRPDVAVFLPPQGLLAVTMAKPINVIGT ALYNYTEGYGSYLIPAVMMIIIFQTLLMVIGMVTGEEHSNRGIRAYTPFGNGWGVAIRIV AGKTSVYCALYAIFAFFLLGLLPHFFSIPNIGNGLYIVLLLIPYLMATSFLGLAASRYFT DSEAPLLMIAFFSVGLIFLSGVSYPMELMPWYWKAAHYILPAAPGTLAFVKLNSMGASMA DIRPEYITLWIQVFIYFIISIWVYKKKLEI >gi|222159306|gb|ACAB01000053.1| GENE 3 2857 - 4041 650 394 aa, chain - ## HITS:1 COG:VC1608 KEGG:ns NR:ns ## COG: VC1608 COG0842 # Protein_GI_number: 15641616 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Vibrio cholerae 
# 16 346 12 340 387 103 22.0 6e-22 MDESQTYFPFRSVLLREWQRMTSRRLYFGVCIVLPLFTLFFMATIFGNGQMENIPIGIVD QDNTATSRTIARNISAVPTFKVTKHFADEAAARESVQKKEIYGYLSIPPQFEQDAITGKN ATLSYYYHYALLSVGGELMAAFETSLAPVALSPVVMQAVALGVEQNQITTFLLPVQANNH PIYNPSLDYSVYLSQPFFFVLFQVLILLITVYTVGIEIKFRTANDWLTTAKGNIVTAVLG KLLPYTIIYILIGWLANYVMFGILHIPFQGSWWLMNIMTVLFIIATQALGLFLFSLFPAI SLVISVVSMVGSLGATLSGVTFPVSNMYPLVRDASYLFPVRHFTEMMQTMLYGGGGFIHL WPSAVILCIFPLLALLLLPHLKRAIESHKYENIK >gi|222159306|gb|ACAB01000053.1| GENE 4 4064 - 5050 930 328 aa, chain - ## HITS:1 COG:HP1488 KEGG:ns NR:ns ## COG: HP1488 COG0845 # Protein_GI_number: 15646097 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Helicobacter pylori 26695 # 36 327 37 327 329 207 36.0 2e-53 MKPTSKTLSWAFVIILLAVGIFTGLGVILMHKQQLVLQGQAEATEIRISGKLPGRIDTFF VQEGDWVHQGDTLVVINSPEVYAKYQQVNALEQVAVQQNKKIDAGTRRQIVATALQLWNK TKSDLTLAQTTYNRILTLYKDSVVTSQRKDEVEAMYKAAVAAERAAYEQYQMAVDGAQKE DKASAASMVDAARSTVDEISALLVDSRLTAPESGQIATIFPKRGELVAPGTPIMNLVVMD DIHVVLNVREDLMPQFKMDGTFVADVPAIGKENIEFKIYYISPLGSFATWKSTKQTGSYD LRTFEIHARPTEKVDDLRPGMSVLLTLD >gi|222159306|gb|ACAB01000053.1| GENE 5 5057 - 6469 1206 470 aa, chain - ## HITS:1 COG:VC1606 KEGG:ns NR:ns ## COG: VC1606 COG1538 # Protein_GI_number: 15641614 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Vibrio cholerae # 25 467 23 460 476 169 27.0 9e-42 MKKNKRLFILLTLYFSISQIDAQTPLSFEESLHLLNQGNQSLKIADKSIEIAKAERDKLN AFWYPSLQSTGAFVHMSEKIEVKQPLSQFTDPAKDFVHSIIPDDQVISSILDQIGANTLV FPLTPRNLTTVDLSAEWVLFSGGKRFRATNIGRTMVDLARESRAQVSANQQNLLVESYYG LRLAQQIVTVREETYNGLKKHYENALKLEAAGMIDKAGRLFAQVNMDEAKRALEAARKEE TVVQSALKVLLNKKDTDANIIPTSPLFMNDSLPPKMLFDLSVNSGNYTLNQLQLQQHIAK QEVRIAQSGYLPNIALFGKQTLYSHGIQSNLLPRTMIGIGFTWNLFDGLDREKKVRQSKL TEQTLALGQMKARDDLAVGVDKLYTQLEKAQDNVKALNATITLSEELVRIRKKSFTEGMA TSTEVIDAETMLATVKVARLAAYYEYDVALINLLSLCGTPEQFANYQPKP >gi|222159306|gb|ACAB01000053.1| GENE 6 6980 - 
8473 1658 497 aa, chain + ## HITS:1 COG:BB0402 KEGG:ns NR:ns ## COG: BB0402 COG0442 # Protein_GI_number: 15594747 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Prolyl-tRNA synthetase # Organism: Borrelia burgdorferi # 8 497 5 488 488 475 49.0 1e-134 MAKELKDLTKRSENYSQWYNDLVVKADLAEQSAVRGCMVIKPYGYAIWEKMQRQLDDMFK ETGHVNAYFPLLIPKSFLSREAEHVEGFAKECAVVTHYRLKNAEDGSGVVVDPAAKLEEE LIIRPTSETIIWNTYKNWIQSYRDLPILCNQWANVFRWEMRTRLFLRTAEFLWQEGHTAH ATREEAEEEAIRMLNVYGEFAEKYMAVPVVKGVKSANERFAGALDTYTIEAMMQDGKALQ SGTSHFLGQNFAKAFDVQFVNKENKLEYVWATSWGVSTRLMGALIMTHSDDNGLVLPPHL APIQVVIVPIYKNDEQLKQIDAKVEGIVAKLKALGISVKYDNADNKRPGFKFADYELKGV PVRLVMGGRDLENNTMEVMRRDTLEKETVTCEGIETYVQNLLEEMQANIYKKALDYRNSK ITTVDTYDEFKEKIEEGGFILAHWDGTTETEEKIKEETKATIRCIPFDSFVPGDKEPGKC MVTGKPSACRVIFARSY >gi|222159306|gb|ACAB01000053.1| GENE 7 8686 - 9363 177 225 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149011191|ref|ZP_01832496.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP19-BS75] # 1 183 1 190 226 72 31 3e-12 MKILIVEDEPSLRELIQCSLEKERYVVETASDFNSALRKVEDYDYDCILLDIMLPDGSGL NLLERLKALHKRENVIIISAKDSLEDKVLGLELGADDYLPKPFHLVELNARIKSVIRRHQ HDGEIDIRQGNVRIEPDKYRVFVNDQEVELNRKEYDILLYFINRPGRLINKNTLAESVWG DHIDQVDNFDFIYAQIKNLRKKLKDSGANIEIKAVYGFGYKMVVE >gi|222159306|gb|ACAB01000053.1| GENE 8 9386 - 9943 417 185 aa, chain + ## HITS:1 COG:no KEGG:BT_0927 NR:ns ## KEGG: BT_0927 # Name: not_defined # Def: two-component system sensor histidine kinase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 182 1 182 425 318 90.0 4e-86 MKLIYYIILRITLALTFILTVWAIFFYVTMIDEVNDEVDDALEDYSETIIIRALAGEELP SKTNGSNNQYYMMEVSKEYAESREDIQYKDSMVYIEEKEETEPARILTTIFKDDEGRYHE LTVSTPSIEKDDLRDAIQVWIIFLYVALLFCIIIISVWVFYRNMRPLYVLLHWLDGYQTG KKISR >gi|222159306|gb|ACAB01000053.1| GENE 9 9940 - 10659 536 239 aa, chain + ## HITS:1 COG:mll7952 KEGG:ns NR:ns ## COG: mll7952 COG0642 # Protein_GI_number: 13476585 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium 
loti # 8 215 210 423 452 85 25.0 9e-17 MSNETQITEFRKLNEAAIRYVERTEQMFEQQKQFIGNASHEIQTPLAICRNRLEMLMEDD SLSEKQLEELMKTHQTLEYITKLNKSLLLLTKIDNGQFTDTKQLELNGLLKQYLQDYEEV YDYRNIEVTIDEQDIFNVTINESLAVALLTNLLKNAFVHNIDGGHIRITVTKNSITFRNS GVKNPLDKEHIFERFYQGTKKEGSTGLGLAIADSICRLQHLSIRYYFEQKEHCFEISRQ >gi|222159306|gb|ACAB01000053.1| GENE 10 10791 - 11747 725 318 aa, chain - ## HITS:1 COG:BH3213 KEGG:ns NR:ns ## COG: BH3213 COG1277 # Protein_GI_number: 15615775 # Func_class: R General function prediction only # Function: ABC-type transport system involved in multi-copper enzyme maturation, permease component # Organism: Bacillus halodurans # 8 317 31 344 345 319 57.0 4e-87 MNKVNHPFWVIVNKEISDHVKSWRFIILIGIIALTCMGSLYTALTNIGEAIKPNDPDGSF LFLKLFTVSDGTLPSFVLFINFLGPLLGIALGFDAVNSEQNKGTLSRMLSQPIHRDCIIN AKFVAALIVIGIMLFVLGFWVMGCGLIAIGIPPTAEEFWRIVFFIITSIFYVAFWLNLAI LFSLRFRQAATSALASVAVWLFFSVFYTMIVNLVAKGLSPSQMASPYQIISYQKFILGLM RLAPSELFNEATTTLLMPSVRSLGPLTMEQVQGAIPSPLPLGQSLLVVWPQLTGLIAATV ICFAISYIMFMRREIRSR >gi|222159306|gb|ACAB01000053.1| GENE 11 11740 - 12483 296 247 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 11 216 17 222 318 118 32 5e-26 MGEQVIVLTDLTKQYGNFTAVDHIRLNIRKGEIFGLLGPNGAGKSTTILMMLGLTEPTSG TVEICGINSTTHPIEVKRRIGYLPEDVGFYDDMTGPENLIYTARLNGIPDKEAKTKAMEL MKRVGLEEQLTKKTGKYSRGMRQRLGLADVLIKNPEIIILDEPTSGIDPAGVQEFIELIR WLSKEEGLTVLFSSHHLDQVQKVCDRVGLFSNGQLLALIDMAELKDKKQELSDIYNHYFE EGGERHE >gi|222159306|gb|ACAB01000053.1| GENE 12 12487 - 13638 950 383 aa, chain - ## HITS:1 COG:BH3215 KEGG:ns NR:ns ## COG: BH3215 COG1470 # Protein_GI_number: 15615777 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus halodurans # 8 383 9 385 385 284 40.0 2e-76 MTMRTNYFLLLAILLGVIPMSYTHANDSIPKSVILYTPYTKISVSPGASIDYSIDLINNT DQLTNANLSVSGLSASWKHEMKSGGWSLSQLSVLPKEKKTFNLKVEVPLKVNKGNYHFVV YAGNAKLPLNVVVAQKGTYQTEFTTDQPNMQGNSKSTFTFSATLKNQTADQQLYALMANA PRGWNVVFKPNYKQATSAQVEANSTQNVSIDITPPANVEAGSYKIPVRAATGTTSAELEL 
EVVVTGSYQMELTTPRGLLSTDVTAGDVKKLELEVRNTGSSLLKDIQLSANKPADWEVTF EPSKVDALKAGETSTVMATLKASKKALPGDYVTTIMAKTPEVNADAQFRIAVKTPMIWGW VGVLIIIATIGVVYYLFRKYGRR >gi|222159306|gb|ACAB01000053.1| GENE 13 13958 - 14395 509 145 aa, chain + ## HITS:1 COG:no KEGG:BT_0923 NR:ns ## KEGG: BT_0923 # Name: not_defined # Def: putative periplasmic protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 144 1 144 145 243 90.0 1e-63 MKKLVFLLVCFFTLQTVARADDDKPIQVTQMPQQAQQFIKQHFADSKVALAKMESDFFYK SYEVIFTNGDKVEFDNKGNWEEVNCKYSAVPTAIIPATIQKYVTTNYPDAKILKIERDKK DYEVKLSNRTELKFDLKFNLIDIDF >gi|222159306|gb|ACAB01000053.1| GENE 14 14417 - 15274 829 285 aa, chain + ## HITS:1 COG:no KEGG:BT_0922 NR:ns ## KEGG: BT_0922 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 284 1 292 293 361 64.0 1e-98 MKLKIYTLLLALCVTWSLQSCDNDDDNSIAVPTELQNAFASKYPNAANVKWETKCGYYVA DFYDGYEASAWFTQDGKWQMTETDIPYNALPQAVKTSFEKSEYASWKQDDVDKLERTGVE TIFVIEVENQNQEIDLYYSADGTLIKSIVDTDDDNTGHLPVQLTEAMRNFINEKYPNAKI MEIDVEDDRNDWDFGYTEVDIIHNGISKDVLFDQTGDWHSTSWEVRQNELPEAVKNTINN QYGEYRFDEAKRIEKADGTIYYRIELEKMNVDLEVNINEDGIVIP >gi|222159306|gb|ACAB01000053.1| GENE 15 15366 - 19820 3205 1484 aa, chain - ## HITS:1 COG:no KEGG:BT_0921 NR:ns ## KEGG: BT_0921 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1483 1 1484 1484 2614 88.0 0 MSVFVAKELSEVLNTKVVIGRINIGLLNRIIIDDVLLDDQSGQEMLKVTRLSAKFDIMPF FKGKISISSVQLFGFNIDLRKETPESPPNFKFVLDAFASKDTVKRESSLDLRINSILIRR GRMSYHVLSEQETPGKFNAKHIQLQNIIANISLKALSKDSLNLGIKRLSLDEKASGFSLK KMSLKLVANDKQTNIDNFTIELPETSLKLDTIHLEYDSLKAFDRFTEQVHFSFRTLPSQV TLKDISPFVPILSHFKEPVTLDMEVKGTVDQLTCSHLEITADDRQFRLRGDVSLQDLSRP QDAYVYGTLSELSANTRGVGFLVRNLSHNYNGVPPLLERLGNVSFQGEISGYFTDLVTYG QLQTSLGNVKTDLKLSSDKAKGLFAYSGAVKTEDFQLGKLLDNEKLEEITFNLDVHGRHI TGQLPAVELKGLIASVDYSRYRYENITLDGEYKQGGFNGKVALDDPNGSIYLNGDVNVTS KVPTFNFLAVVNKVRPHDLNLTTKYPDAEFSLKLKANFTGGSVDEMIGEINVDSLEFTAP DKAYFMQNMNIRATKQNGENQLRLTSEFLKASIEGKFQYHTLPASILNIMRKYVPSLILP 
PKKPIETHNNFLFDVHIYNMDILSTIFDIPLTVYTHSTLKGYFNDALQRLRVEGYFPRLQ YKNNFIESGMILCENPADHIRAQVRLTSLKKKGAVNLSLDAQAKDDNVSTTLNWGNNAAV TYSGQLAAVAKFLRTSGEKPLLKAMVDVKPTDVILNDTLWKIHASQVVVDSGRVDVNNFY FSHQDRYVRINGRLSENPKDTVKVDLKDINMGYVFDIASISDDVNFEGDATGTAYASGVL KKPIMNTRLYIKNFSLNQGRLGDLDIYGEWDNENRGIRLDASIQDISPSPSRVTGIIYPL KPESGLDLNIEANELNLKFLEHYMKSIANDIKGRGTGKVHFYGKFKGLNLDGAVMTDASM KFDILNTHFAVKDTIHLAPTGLTFNNIHISDMEGHSGRMDGYLHFQHFKNLNYRFEIQAN NMLVMNTKESADMPFYGTVYGTGNVLLAGNATQGLDVNVAMTTNRNTTFTYINGSVASAT SNQFIKFVDKTPRRTIQDSIQVISYYEQIQQKRQAEEEQKTDIRLNILVDATPDATMRII MDPIAGDYISGKGTGNIRTEFYNKGDVKMFGNYRINQGVYKFSLQEVIRKDFIIKDGSTI TFNGAPLDANMDIQASYTVNSASLNDLIPDASAIIQQPNVRVNCIMNLSGMLVRPTIKLG IELPNERDEIQTLVRNYISTEEQMNMQILYLLGIGKFYTEDARNNNQNSNVMSSVLSSTL SGQLNNALSQVFETNNWNIGTNLSTGDKGWTDMEVEGILSGQLLNNRLLINGNFGYRDNP MANTNFVGDFEAEWLITRSGDIRLKAYNETNDRYYTKTNLTTQGVGIMYKKDFNKWSDLY FWNKWRLRNKRKREEAEKVKTHQTDSITDKTAKSAVKRNHSQQQ >gi|222159306|gb|ACAB01000053.1| GENE 16 20021 - 21040 650 339 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227425790|ref|ZP_03908856.1| SSU ribosomal protein S18P alanine acetyltransferase [Atopobium parvulum DSM 20469] # 4 333 479 811 832 254 42 4e-67 MSAIILGIESSCDDTSAAVIKDGYLLSNVVSSQAVHEAYGGVVPELASRAHQQNIVPVVH EALKRAGVTKEELSAVAFTRGPGLMGSLLVGVSFAKGFARSLNIPMIDVNHLTGHVLAHF IKAEGEEERQPAFPFLCLLVSGGNSQIILVKAYNDMEILGQTIDDAAGEAIDKCSKVMGL GYPGGPIIDKLARQGNPKAFTFSKPHIPGLDYSFSGLKTSFLYSLRDWLKDDPDFIEHHK VDLAASLEATVVDILMDKLRKAAKEYKINEIAVAGGVSANNGLRNAFQEHAEKYGWNIFI PKFSYTTDNAAMIAITGYFKYLDKDFCSIDLPAYSRVTL >gi|222159306|gb|ACAB01000053.1| GENE 17 21080 - 22312 680 410 aa, chain + ## HITS:1 COG:FN1929_2 KEGG:ns NR:ns ## COG: FN1929_2 COG1546 # Protein_GI_number: 19705234 # Func_class: R General function prediction only # Function: Uncharacterized protein (competence- and mitomycin-induced) # Organism: Fusobacterium nucleatum # 243 409 1 160 165 132 49.0 2e-30 MFAEIITIGDELLIGQVVDTNSAWMGQELNKIGIEVLRIVSIRDREEEIMEAIDNAMERV NIVLVTGGLGPTKDDITKQTLCKYFHTELVFNEEVFENVKRVLAGKIPMNALNKSQAMVP 
KDCMVINNPVGSASVSWFERDGKVLVSMPGVPQEMIAVMTESVLPKLHDRFQTDVIMHQT FLVQHYPESVLAEKLESWENTLPECIKLAYLPKLGIIRLRLTGRGQNREEVKVLLEREKL KLEEILGEDIFSEEDTPLEVIVGELLKKKKLTVSTAESCTGGSIAARLTSIAGSSEYFNG SVVAYSNEVKMGLLHVSSETLERYGAVSEETVIEMVKGAMKTLKTDCAVATSGIAGPGGG TPEKPVGTVWIAAGYKNEIHTYKQETNRGRGMNIERAGNNALLMLRDLLK >gi|222159306|gb|ACAB01000053.1| GENE 18 22435 - 22695 443 86 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29346326|ref|NP_809829.1| 50S ribosomal protein L28 [Bacteroides thetaiotaomicron VPI-5482] # 1 86 1 86 86 175 96 4e-43 MSKICQITGKKAMIGNNVSHSKRRTKRTFDLNLFNKKFYYVEQDCWISLSLCANGLRIIN KKGLDAALTEAVAKGYCDWKSIKVIG >gi|222159306|gb|ACAB01000053.1| GENE 19 22717 - 22905 320 62 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|53713719|ref|YP_099711.1| 50S ribosomal protein L33 [Bacteroides fragilis YCH46] # 1 62 1 62 62 127 100 8e-29 MAKKAKGNRVQVILECTEHKDSGMPGTSRYITTKNRKNTTERLELKKYNPILKRVTVHKE IK >gi|222159306|gb|ACAB01000053.1| GENE 20 22923 - 23081 244 52 aa, chain + ## HITS:1 COG:no KEGG:PRU_0750 NR:ns ## KEGG: PRU_0750 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 51 1 51 52 79 90.0 3e-14 MAKKTVASLHEGSKEGRAYTKVIKMVKSPKTGAYVFDEQMVPNEKVQDFFKK >gi|222159306|gb|ACAB01000053.1| GENE 21 23220 - 24179 734 319 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 [Bacillus selenitireducens MLS10] # 5 317 2 322 336 287 46 8e-77 MGFFSFFSKEKKETLDKGLSKTKESVFSKIARAVAGKSKVDDEVLDNLEEVLITSDVGVE TTLNIIKRIEKRAAADKYVNTQELNLILRDEIAALLTENNSNDVADFDVPITRKPYVIMV VGVNGVGKTTTIGKLAYQFKKAGKSVYLGAADTFRAAAVEQLMIWGERVGVPVVKQKMGA DPASVAYDTLSSAVANNADVVIIDTAGRLHNKVGLMNELTKIKNVMKKVVPNAPDEVLLV LDGSTGQNAFEQAKQFTLATEVTAMAITKLDGTAKGGVVIGISDQFKIPVKYIGLGEGME DLQVFRKNEFVDSLFGENA >gi|222159306|gb|ACAB01000053.1| GENE 22 24176 - 25486 1208 436 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229870452|ref|ZP_04490046.1| SSU ribosomal protein S12P methylthiotransferase [Spirosoma linguale DSM 74] # 1 433 6 436 437 469 52 1e-132 
MKRKRIDIITLGCSKNLVDSEQLMRQLEEVGYSVTHDTENPQGEIAVINTCGFIGDAKEE SINMILEFAERKEEGDLKKLFVMGCLSERYLKELAVEIPQVDKFYGKFNWKELLQDLGKV YHDELYIERTLTTPQHYAYLKISEGCDRKCSYCAIPIITGRHISKPMEEILDEVRYLVSQ GVKEFQVIAQELTYYGIDRYKKQMLPELIERISDIPGVEWIRLHYAYPAHFPTDLFRVMR ERDNVCKYMDIALQHISDNMLQLMRRQVSKKDTYRLIEQFRKEVPGIHLRTTLMVGHPGE TEEDFEELKEFVRKVRFDRMGAFAYSEEEGTYAAESYEDSIPQEVKQARLDELMDIQQGI SAELSAEKIGKQMKIIIDRLEGDYYIGRTEFDSPEVDPEVLVKRSERELKIGQFYQVEVT NADDFDLYAKIINDYE >gi|222159306|gb|ACAB01000053.1| GENE 23 25479 - 25751 251 90 aa, chain + ## HITS:1 COG:no KEGG:BT_0912 NR:ns ## KEGG: BT_0912 # Name: not_defined # Def: DNA-binding protein HU # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 90 1 90 90 147 90.0 1e-34 MNNKEFTSELAERLGYTIKDTSELIGSLLSSMTQELEEGNVIAVQGFGSFEVKKKAERIS INPASKQRMLVPPKLVLSYRPSNTLKDKFK >gi|222159306|gb|ACAB01000053.1| GENE 24 25765 - 27183 905 472 aa, chain + ## HITS:1 COG:no KEGG:BT_0911 NR:ns ## KEGG: BT_0911 # Name: not_defined # Def: putative integration host factor IHF alpha subunit # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 472 1 487 487 381 58.0 1e-104 MNERLTIQDLTDLLAAKHSMTKKDAEAFVKEFFLLIEQALENEKTVKIKGLGTFKLIDVD SRESVNVNTGERFQIKGHTKVSFSPDANLRDTINKPFAHFETVVLNENTILEDTPIEDTE EEEAGEEASAQTVLNEIGENTPPLVTEEYESTDDELSEKELIQEEQITAQPSVEDSIEEP VIVENVSTVEESIEPSSHVEPKRTSTETNIEEKVEQLEDEEVPEEEVAMVEQQPAAPIIE EKEEITAEKIIEQELMKANLQPVAPIIPPAKKETIKPVKSEHVSQPASKKTAPVKEKSPV PYLIAVIVIVLLLCGGVILFIYYPDLFSSSSDKNALDMPPVTTQPVQPETQLSDTIEQKD TIKEITPDVPKVVTPTPPVAQKEETAPAKAEPQTAPQQPATSAYLDSASYKITGTKTKYT IKEGETLTRVSLRFYGTKAMWPYIVKHNPKVIKNPNNVPYGTTIEIPELTKE >gi|222159306|gb|ACAB01000053.1| GENE 25 27424 - 28419 1041 331 aa, chain + ## HITS:1 COG:Rv1479 KEGG:ns NR:ns ## COG: Rv1479 COG0714 # Protein_GI_number: 15608617 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Mycobacterium tuberculosis H37Rv # 30 331 52 352 377 314 52.0 1e-85 MAESIDIRELNERIERQSAFVTNLTTGMDQIIVGQKHLVESLLIGLLSDGHVLLEGVPGL 
AKTLAIKTLASLIDAKYSRIQFTPDLLPADVIGTMVYSQKDESFKVQRGPIFANFVLADE INRAPAKVQSALLEAMQERQVTIGKETFILPEPFLVLATQNPIEQEGTYPLPEAQVDRFM LKVVIDYPKLEEEKLIIRQNINGEKFNVKPILKADEIIEARKVVRQVYLDEKIERYIVDI VFATRFPEKYDLKELKDMIGFGGSPRASINLALAARTYAFIKRRGYVIPEDVRAVAHDVL RHRIGLTYEAEANNMTSDEIISKILNKVEVP >gi|222159306|gb|ACAB01000053.1| GENE 26 28503 - 29372 796 289 aa, chain + ## HITS:1 COG:BB0175 KEGG:ns NR:ns ## COG: BB0175 COG1721 # Protein_GI_number: 15594520 # Func_class: R General function prediction only # Function: Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) # Organism: Borrelia burgdorferi # 3 276 8 278 291 141 31.0 1e-33 METSEILKKVRQIEIKTRGLSNNIFAGQYHSAFKGRGMSFSEVREYQFGDDIRDIDWNVT ARFNKPYVKVFEEERELTVMLMVDVSGSLEFGTIKQLKKDMVTEIAATLAFSAIQNNDKI GVIFFSDRIEKFIPPKKGRKHILYIIRELIDFQPESRRTNIRLALEYLTNVMKRRCTAFI LSDFIDQDSFKNALTIANRKHDVVALQVYDRRVSDLPPVGLMRIKDAETGHEQWIDTSSK AVRRAHRDWWIQKQTELNDTFTKSNVDAVSVRTDQDYVKALLNLFAKRN >gi|222159306|gb|ACAB01000053.1| GENE 27 29384 - 30460 856 358 aa, chain + ## HITS:1 COG:no KEGG:BT_0908 NR:ns ## KEGG: BT_0908 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 357 5 361 362 587 91.0 1e-166 MNRNIILIALLALLSVGKAAAQSVTVEAKIDSLQILIGEQAKVQLQVAMDAKQRAIFPAY TDTLVRGVEIIETVKPDTQFLNDRQRMLITQEYIITSFDSALYYLPPMPVTVDDKVYKSK ALALKVYSMPVDTLHPDQFFGQKPVMKAPFAWEDWYGLIACSFLALPLLGLLIYLIIRIR DNKPIIRKIKVEPKLPPHQAAMKEIERIKTEKIWQKGQSKEYYTELTDTLRTYIKNRFGF NALEMTSSEIIDQLLELNDKEAISDLKLLFQTADLVKFAKHDPQMNENDANLINAIDFIN ETKQPEEENQKPQPTEITIIEKRSLRVKAMLICGIALLSAALIGTFIYIGLQLYNLFV >gi|222159306|gb|ACAB01000053.1| GENE 28 30510 - 31493 858 327 aa, chain + ## HITS:1 COG:VCA0172 KEGG:ns NR:ns ## COG: VCA0172 COG2304 # Protein_GI_number: 15600942 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Vibrio cholerae # 3 318 4 313 318 157 31.0 3e-38 
MVFANIEYLFLLLLLIPYIVWYILKQKKSEATLQISDARVYAHTPKSYKNYLLHVPFLLR CIALVLVILVLARPQTTNKWQNSEIEGIDIMLAIDVSTSMLAEDLKPNRLEAAKDVAAEF INGRPNDNIGITLFAGETFTQCPLTVDHAVLLDMIHNIKCGLITDGTAVGMGIANAVTRL KDSKAKSKVIILLTDGTNNKGDISPMTAAEIAKSFGIRVYTIGVGTNGMAPYPYPVGNTV QYVSMPVEIDEKTLTEIAGTTDGNYFRATSNSKLKEVYEEIDKLEKTKLNVKEYSKRDEE YHWFALAAFLCVLLEVLLRNSVLKKIP >gi|222159306|gb|ACAB01000053.1| GENE 29 31539 - 32567 947 342 aa, chain + ## HITS:1 COG:VCA0172 KEGG:ns NR:ns ## COG: VCA0172 COG2304 # Protein_GI_number: 15600942 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Vibrio cholerae # 6 328 5 318 318 93 26.0 6e-19 MFRFGEPTYLYLLLLLPFLAAFYLYSNYKRRKNIRRFGDPTLLAQLMPDVSKYRPDVKFW IIFVAIGLFSVLLARPQFGSKLETVKRKGVEVIIALDISNSMLAQDVQPSRLEKAKRLIS RLVDELDNDKVGMIVFAGDAFTQLPITSDYISAKMFLESINPSLISKQGTAIGEAINLAA RSFTPQEGVGRAIIVITDGENHEGGAVEAAKAAAEKGIQVSVLGVGMPDGAPIPVEGTND YRRDREGNVIVTRLNEAMCQEIAKEGKGIYVRVDNSNSAQRAINQEVNKMAKSDVESKVY TEFNEQFQAIAWVILLLLLAEILILDRKNPLFKNIHLFSNKK >gi|222159306|gb|ACAB01000053.1| GENE 30 32576 - 33298 712 240 aa, chain + ## HITS:1 COG:no KEGG:BT_0905 NR:ns ## KEGG: BT_0905 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 240 3 242 242 261 82.0 2e-68 MLKSKYILFAVFLLATVGVSAQKAERDYIRKGNRLFNDSVFVDAEVNYRKALEANPKSTV SMYNLGNTLSQQQKFQDAMEQYVSASKIEKDKMKLAHIYHNMGVLFQAGKDYAKAVDAYK MSLRNNPADHETRYNLALAQKMLKDQQNQQNQDQNQDQNKDQQKQDQKQDQNKDKQKDQK QDEKKDQQQPPKSEKKQDNQMSKENAEQLLNSVMQDEKDVQDKVKKQQKVMQGGRLEKDW >gi|222159306|gb|ACAB01000053.1| GENE 31 33330 - 35150 923 606 aa, chain + ## HITS:1 COG:no KEGG:BT_0904 NR:ns ## KEGG: BT_0904 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 606 1 608 608 1057 91.0 0 MKKLIIILMALIAYSTQMLADKVSFTASAPDAVVVGDQFRLSYTVTTQKVKDFRAPSIKG FDVLMGPSRSQQSNTQIVNGNVTSTSSITFTYILMANNAGEYTIPGASIVADGDQMVSNS VRIKVLPQDQGDSNSSSSSSTHSSSGTGVSNQDLFITASASKTNVYEQEAFVLTYKIYTR 
ESNLQLNNAKLPDFKGFHSQEIEMTTNARWTPEHYQGRNYYTTVYRQFVLFPQQSGKLYI DPAQFQMTVGKPVQSDDPFDAFFNGGSNVIEIKKSISTPKIAINVNPLPAGKPADFSGGV GEFNISSSINNKELKTNDAITIKLVISGTGNLKLISNPEIKFPDDFEVYDPKVDNQVRLT REGLTGNKVIEYLAIPRHAGTYKIPGVSFSYFDIRSKSYKTLKTEEYVINVEKGAGNADQ VIANFTNKEDLKVLGEDIRYIKQNEVTLQPKGSFFYGSMTYWLFYIIPALAFIIFFIIYR KQAAENANVAKMRTKKANKVATKRMKLAGKLLSENKKDAFYDEVLKALWGYISDKLNIPV SRLSKDNIEEKLRNHGVNEELIKEFLNALNDCEFARFAPGDENQAMDKVYSSSIEVISKM ENSIKH >gi|222159306|gb|ACAB01000053.1| GENE 32 35202 - 36002 559 266 aa, chain + ## HITS:1 COG:no KEGG:BT_0903 NR:ns ## KEGG: BT_0903 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 266 12 277 277 417 85.0 1e-115 MSLTCFAQDSLNIDSKQTNGADSIHASHTTFSSNTLEDATKAEGDSAYIKEDYAAAIQIY EALLKNGEAADVYYNLGNSYYKIGEIAKAVLNYERALLLQPGNGDIRANLEVARAKTIDK VEPVPEVFFVSWIKSLTNSMSVDAWATWGIVSFILLIIALYFFIFSKQIVLKKVGFILGI VFLIVTICSNLFASQQKEHLVNRNEAIVMNPSVTVRSTPSESGTSLFILHEGRKVNVKDN SMKEWKEIRLEDGKVGWVPASAIEVI >gi|222159306|gb|ACAB01000053.1| GENE 33 36131 - 36421 260 96 aa, chain + ## HITS:1 COG:no KEGG:BT_0902 NR:ns ## KEGG: BT_0902 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 96 1 96 96 178 95.0 5e-44 MRTITFNELRKIKDSLPSGSMHRIADELGLHVDTVRNFFGGHNFKEGKSVGIHLEPGPDG GLVMLDDTTVLDRALKILDELNMSMQKEQATESVQV >gi|222159306|gb|ACAB01000053.1| GENE 34 36499 - 37620 599 373 aa, chain + ## HITS:1 COG:MA2866 KEGG:ns NR:ns ## COG: MA2866 COG0589 # Protein_GI_number: 20091690 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # Organism: Methanosarcina acetivorans str.C2A # 86 243 8 151 152 61 29.0 2e-09 MEDKLVTLAILTYTKAQILKNVLENEGIETYIHNVNQIQPVVSSGVRLRIKESDLPRALK ITESSTWLSESIVGEKEPKTENKSNKILIPVDFSNYSMKACEFAFNLAKTENAEVILLHV YFTPIYASSLPYGDVFNYQIGDEESVKTIIQKVHSDLNALSEKIKEKVTSGDFPNIKYSC ILREGIPEEEILRYAKEQRPMVIIMGTRGKNQKDIDLIGSVTAEVIDRSRTAVLAIPENT 
PFKQFSEVKRIAFITNFDQRDLIAFEAFFNTWKSFHFSVSLIHLTDSKDTWNEIKLAGIK EYFHKQYPGLEIHYDVVMNDNLLKGLDQYIKDNQIDIITLTSYKRNIFARLFNPSIARKM IFHSDTPLLVINS Prediction of potential genes in microbial genomes Time: Wed May 18 02:25:50 2011 Seq name: gi|222159305|gb|ACAB01000054.1| Bacteroides sp. D1 cont1.54, whole genome shotgun sequence Length of sequence - 11847 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 64 - 1278 1001 ## BT_0900 TPR repeat-containing protein 2 1 Op 2 . - CDS 1311 - 3887 2239 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit - Prom 4042 - 4101 6.4 + Prom 4154 - 4213 3.0 3 2 Op 1 1/0.000 + CDS 4242 - 6770 1975 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 + Term 6811 - 6845 4.0 + Prom 6817 - 6876 5.4 4 2 Op 2 . + CDS 6903 - 8948 1777 ## COG0326 Molecular chaperone, HSP90 family + Term 8985 - 9032 8.2 + Prom 8994 - 9053 5.1 5 2 Op 3 . 
+ CDS 9120 - 11420 1443 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily + Term 11518 - 11584 30.0 + TRNA 11500 - 11573 54.1 # Arg ACG 0 0 + TRNA 11594 - 11667 55.8 # Arg ACG 0 0 + TRNA 11689 - 11762 54.1 # Arg ACG 0 0 Predicted protein(s) >gi|222159305|gb|ACAB01000054.1| GENE 1 64 - 1278 1001 404 aa, chain - ## HITS:1 COG:no KEGG:BT_0900 NR:ns ## KEGG: BT_0900 # Name: not_defined # Def: TPR repeat-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 404 1 398 398 628 88.0 1e-178 MKRVLFSMVLLMAVSFSFAQMKNVKEAKSMANDVKPNFKQAEQLIKEAMKNPETKDLADT WDVAGFIQRRINEEQMKNAFLKKPYDTLKVYNSILKMYEYYTKCDELAEIPNEKGKVKNK YRKANASSMLAERPNLINGGIQYFNLDKNKEALKFFATYVESASYPMLADKELAKNDTLI PQIAYYATLAADRVGDKDAIIKYAPMALSDKDGGKFAMQLMADAYKAKGDTVAWIKSLEE GILKFPGNDYFFANLVDYYNSSNQASKAMEFADRMLSNDPNNKLYLYVKAYLYHNMKEYD NAIEYYKKAIAADPEYAEAYSNVGLVYLMKAQDYADKATTDINDPKYAEAQAVVKKFYEE AKPFYEKARALKPDQQDLWLQGLYRVYYNLNMGPEFEEIDKLMK >gi|222159305|gb|ACAB01000054.1| GENE 2 1311 - 3887 2239 858 aa, chain - ## HITS:1 COG:BH0007 KEGG:ns NR:ns ## COG: BH0007 COG0188 # Protein_GI_number: 15612570 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Bacillus halodurans # 1 812 1 804 833 866 55.0 0 MLEQDRIIKINIEEEMKSSYIDYSMSVIVSRALPDVRDGFKPVHRRILYGMMELGNTSDK PYKKSARIVGEVLGKYHPHGDSSVYFAMVRMAQEWAMRYPLVDGQGNFGSVDGDSPAAMR YTEARLNKLGEAMMDDLYKETVDFEPNFDNTLVEPKVMPTRIPNLLVNGASGIAVGMATN MPPHNLSEVIEACEAYIDNPEITVEELMEFVKAPDFPTGGFIYGVSGVREAYLTGRGRVI MRAKAEIESGQTHDKIVITEIPYNVNKAELIKYIADLVNDKKIEGISNANDESDRDGMRI VIDIKRDANASVVLNKLYKMTALQTSFGVNNVALVHGRPKTLNLRDLIKYFIEHRHEVVI RRTQFDLRKAKERAHILEGLIIASDNIDEVIRIIRAAKTPNDAIAGLIERFNLTEIQSRA IVEMRLRQLTGLMQDQLHAEYEEIMKQIAYLESILADDEVCRQVMKDELLEVKTKYGDER RSEIVYSSEEFNPEDFYADDQMIITISHMGYIKRTPLTEFRAQNRGGVGSKGTETRDEDF VEHIYPATMHNTMMFFTQKGKCYWLKVYEIPEGTKNSKGRAIQNLLNIDSDDNVTAYLRV KSLEDSEFINSHYVLFCTKKGVIKKTLLEQYSRPRQNGVNAITIREDDSVIEVRMTNGNN 
EIIIANRNGRAIRFHEAAVRVMGRTATGVRGITLDNDGQDEVVGMICIKDLETESVMVVS EQGYGKRSEIEDYRKTNRGGKGVKTMNITEKTGKLVTIKSVTDENDLMIINKSGITIRLK VADVRIMGRATQGVRLINLEKRNDQIGSVCKVMTESLEDEIPAEEAEGTIVSDPNADAPD IDDAADVNENESNNEIEE >gi|222159305|gb|ACAB01000054.1| GENE 3 4242 - 6770 1975 842 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 833 2 806 815 765 48 0.0 MNNQFSQRVSDIIVYSKEEANRLRSRYIGPEHLLLGILRDGEGKAIEILSKLNTNLAAIK QQIEAQLKAEADDMLLPDAEVPLSNDAAKILKMCILEARGMKSNIADTEHVLLAILREKN NMAASVLEANDINYVKVLEQATLQPDINSGMGFTEDDDDDEEMSSPRSGRGGSDERQQAQ TASKKPSNDTPVLDNFGTDMTKAAEEGRLDPVVGREREIERLAQILSRRKKNNPILIGEP GVGKSAIVEGLALRIIQKKVSRILFDKRVVALDMTAVVAGTKYRGQFEERIRSILNELQK NPNVILFIDEIHTIVGAGSAAGSMDAANMLKPALARGEIQCIGATTLDEYRKNIEKDGAL ERRFQKVIVEPTTAAETLQILRNIKDKYEDHHNVYYTDEALEACVKLTDRYITDRNFPDK AIDALDEAGSRVHLTNVNVPKEIEEQEKLIEEAKSKKNEAVKSQNFELAASFRDKEKELS VQLDEMKKEWEANLKENRQTVDAEEIANVISMMSGIPVQRMAQAEGIKLAGMKEDLQAKV IAQDTAIEKLVKAILRSRVGLKDPNKPIGTFMFLGPTGVGKTHLAKELAKYMFGSADALI RIDMSEYMEKFTVSRLVGAPPGYVGYEEGGQLTEKVRRKPYSIVLLDEIEKAHPDVFNIL LQVMDEGRLTDSYGRMVDFKNTVIIMTSNIGTRQLKEFGRGVGFATQSRLDDKEFSRSVI QKALNKSFAPEFINRVDEIITFDQLSLEAITKIIDIELKGLYDRIESIGYKLVIEDKAKE FIAGKGYDVQYGARPLKRAIQTYLEDGLSELIISSSLKEGDTIQVSLNEEKGELEMKVVT PE >gi|222159305|gb|ACAB01000054.1| GENE 4 6903 - 8948 1777 681 aa, chain + ## HITS:1 COG:alr2323 KEGG:ns NR:ns ## COG: alr2323 COG0326 # Protein_GI_number: 17229815 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone, HSP90 family # Organism: Nostoc sp. 
PCC 7120 # 1 581 2 601 658 461 41.0 1e-129 MQKGNIGVTTENIFPIIKKFLYSDHEIFLRELVSNAVDATQKLNTLASIGEFKGELGDLT VHVELGKDTITISDRGIGLTAEEIEKYINQIAFSGANDFLEKYKNDANAIIGHFGLGFYS AFMVAKKVEIITKSYRDGAQAVKWTCDGSPEFTIEEIEKADRGSDIILYIDDDCKEFLEE ARISELLKKYCSFLPVPIAFGKKKEWKDGKQVETAEDNIINDTTPLWTRKPSELSDEDYK SFYSKLYPMSDEPLFWIHLNVDYPFHLTGILYFPKVKSNIELNKNKIQLYCNQVYVTDSV EGIVPDFLTLLHGVIDSPDIPLNVSRSYLQSDSNVKKISTYITKKVSDRLQSIFKNDRKQ FEEKWNDLKIFINYGMLTQEDFYDKAQKFALFTDTNDNHYTFEEYQTLIKDNQTDKDGNL IYLYANNKDEQYSYIEAATNKGYNVLLMDGQLDVAMVSMLEQKLEKSRFTRVDSDVVDNL IVKEDKKGETLEANKQDAITTAFKSQLPKMDKVEFNVMTQALGENSAPVMITQSEYMRRM KEMANIQAGMSFYGEMPDMFNLILNSDHKLIKQVLNEEESACQAEVAPILSEMDNVNKQR NELKDKQKDKKEEEIPTSEKDELNNLDKKWDDLKSKKEAVFIGYASNNKVIRQLIDLALL QNNMLRGEALNNFVKRSIELI >gi|222159305|gb|ACAB01000054.1| GENE 5 9120 - 11420 1443 766 aa, chain + ## HITS:1 COG:PA3339_1 KEGG:ns NR:ns ## COG: PA3339_1 COG1752 # Protein_GI_number: 15598535 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Pseudomonas aeruginosa # 22 299 24 301 308 160 35.0 1e-38 MRKIFLVLITLWLIIPAIHAQKVGLVLSGGGAKGMTHIGIIRALEENNIPIDYIAGTSMG AIIGSLYAMGYSPDDMVELLKSEDFKRWYSGEVEEKYVYHFKKNLPTPEFFNIRFSFKDS LKSLKPQFLPTSVVNPIQMNLVFVDLYARATAACKGDFDKLFVPFRCIASDVYNKKQLVM KEGDLGDAVRASMSFPFMFKPIEIDNVLAYDGGIYNNFPTDVMRDDFHPDVIIGSVVSTN PTKPKENDLMSQIENMVMQKTDYSIPDSMGILMTFKYDNVNLMDFQRIDELHDIGYNRTI SMMDSIKSRIHRRVNLDNIRLRRMVYRSNFPELRFKNIIIDGANPQQQVYIKREFHKSDN KEFTYEDLKQGYFRLLSDKMISEIIPHAIYNPEDDTYDLHLKVKLENNFAVRLGGNISTS NSNQIYLGLSYQDLNYYAKEFILDGQLGKVYNNVQFMAKIDFATAIPTSYRLIGSISTFD YFKKDKLFSRNNKPAFNQKDERFLKLQVGLPFLSSKRAEFGVGIAKIEDKYFQKSVIDFG NDKFDKSRYDLFGGSISFNGSTLNSKQYPTRGYREALVAQIFVGKERFYPGEGSTTNNNN KDHHSWLQLSYMKEKYHNMSEHWVLGWYLKALYASKNFSENYTATMMQAGEFSPTLHSKL TYNEAFRANQFVGAGIRPIYRLNQMFHLRGEFYGFMPIYPIEKNSLNKAYYGKAFSKFEY LGEISVVCQLPFGDISAYVNHYSSPKREWNVGLSIGWQLFNYRFIE Prediction of potential genes in microbial genomes Time: Wed May 18 02:26:05 2011 Seq name: gi|222159304|gb|ACAB01000055.1| 
Bacteroides sp. D1 cont1.55, whole genome shotgun sequence Length of sequence - 35974 bp Number of predicted genes - 30, with homology - 29 Number of transcription units - 13, operones - 6 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 246 - 734 235 ## gi|294810652|ref|ZP_06769301.1| hypothetical protein CW3_3010 2 1 Op 2 . - CDS 740 - 1192 207 ## gi|237715457|ref|ZP_04545938.1| predicted protein 3 1 Op 3 . - CDS 1274 - 2554 698 ## gi|237715458|ref|ZP_04545939.1| conserved hypothetical protein - Prom 2771 - 2830 1.8 4 2 Tu 1 . - CDS 3302 - 3499 75 ## COG3344 Retron-type reverse transcriptase + Prom 4507 - 4566 7.0 5 3 Tu 1 . + CDS 4591 - 4743 59 ## gi|237721718|ref|ZP_04552199.1| predicted protein 6 4 Op 1 . - CDS 4799 - 5560 595 ## COG0584 Glycerophosphoryl diester phosphodiesterase 7 4 Op 2 . - CDS 5604 - 7496 1285 ## BF0272 hypothetical protein 8 4 Op 3 . - CDS 7515 - 10673 2259 ## BF0273 hypothetical protein 9 4 Op 4 . - CDS 10677 - 10805 70 ## - Prom 10833 - 10892 9.6 - Term 10980 - 11013 -0.7 10 5 Op 1 . - CDS 11065 - 15033 1820 ## COG3292 Predicted periplasmic ligand-binding sensor domain 11 5 Op 2 . - CDS 15106 - 17154 1229 ## BT_2899 hypothetical protein - Prom 17242 - 17301 7.0 + Prom 17278 - 17337 6.9 12 6 Tu 1 . + CDS 17385 - 17627 195 ## Dfer_4026 transposase IS204/IS1001/IS1096/IS1165 family protein + Term 17707 - 17764 10.6 - Term 17530 - 17568 -0.8 13 7 Tu 1 . - CDS 17784 - 18911 733 ## BT_3151 hypothetical protein - Prom 18940 - 18999 5.5 + Prom 18844 - 18903 3.8 14 8 Tu 1 . + CDS 19089 - 20375 385 ## BVU_1598 transposase + Term 20496 - 20547 -0.8 + Prom 21053 - 21112 4.4 15 9 Tu 1 . + CDS 21146 - 21325 131 ## BT_1110 hypothetical protein + Prom 21341 - 21400 6.0 16 10 Op 1 . + CDS 21507 - 21962 277 ## COG1528 Ferritin-like protein 17 10 Op 2 . + CDS 22043 - 22690 422 ## COG2095 Multiple antibiotic transporter 18 10 Op 3 . + CDS 22737 - 23219 594 ## COG1528 Ferritin-like protein 19 10 Op 4 . 
+ CDS 23243 - 24295 943 ## COG1830 DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes 20 10 Op 5 . + CDS 24308 - 25057 608 ## COG0588 Phosphoglycerate mutase 1 + Term 25058 - 25092 1.3 21 11 Tu 1 . + CDS 25134 - 25562 519 ## COG0071 Molecular chaperone (small heat shock protein) 22 12 Op 1 . + CDS 25663 - 26160 316 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 23 12 Op 2 . + CDS 26183 - 27169 786 ## COG0205 6-phosphofructokinase 24 12 Op 3 . + CDS 27182 - 28957 1044 ## COG0475 Kef-type K+ transport systems, membrane components 25 12 Op 4 . + CDS 28964 - 31522 2030 ## COG0058 Glucan phosphorylase 26 12 Op 5 . + CDS 31607 - 32347 159 ## BT_1099 putative arginase 27 12 Op 6 . + CDS 32399 - 32623 331 ## COG1803 Methylglyoxal synthase 28 12 Op 7 . + CDS 32624 - 32938 199 ## COG1803 Methylglyoxal synthase + Prom 32949 - 33008 5.5 29 13 Op 1 13/0.000 + CDS 33173 - 35578 1305 ## COG0642 Signal transduction histidine kinase 30 13 Op 2 . + CDS 35556 - 35973 255 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains Predicted protein(s) >gi|222159304|gb|ACAB01000055.1| GENE 1 246 - 734 235 162 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294810652|ref|ZP_06769301.1| ## NR: gi|294810652|ref|ZP_06769301.1| hypothetical protein CW3_3010 [Bacteroides xylanisolvens SD CC 1b] # 1 162 1 162 162 305 100.0 4e-82 MQTVPGNTWIDGGEENVAVAKTIHYTADEVKAGDYIYSGRSTSDGGLRKRYPNGKAQVIA DPKPQSVAGKTVAGVVFCIPKDTDPTGRLTPARLTDDKIMMKDFPNAEIWNYIVNPSLVA AGGPTPIYTTSYWSSTEAYYSPNNAYAIHFPDATLESNDKYL >gi|222159304|gb|ACAB01000055.1| GENE 2 740 - 1192 207 150 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237715457|ref|ZP_04545938.1| ## NR: gi|237715457|ref|ZP_04545938.1| predicted protein [Bacteroides sp. 
D1] # 1 150 1 150 150 285 100.0 6e-76 MLMLAGGLLVSCGKMETGESTTRLPDGKYPLRLTAEVAQPHPRAGGKEIGVMLDGMLSLQ RYVMDASGNTVPKDAENTIYRKSTTETCVTARTPNADIDQSGGYAGFGLLYVTVVGGYDQ AVSLRFNHRMAKVEFTLMAGEGVTEEKRMK >gi|222159304|gb|ACAB01000055.1| GENE 3 1274 - 2554 698 426 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237715458|ref|ZP_04545939.1| ## NR: gi|237715458|ref|ZP_04545939.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 426 1 426 426 834 100.0 0 MSGKTATPTGSALTDTEFFAPLVSAWQPQDDQSTHAAYTASDLMTAEGSATTGEDNTLHL SFTMNHRMALAVIEMPNTVKYKFTDERIPDYAVSPATTFSGIAQPLRVNDGTYRYLVNHA TPAPTIEGHYDEGSKEFTITPSGLSTGSYKRYKVDGAVTTVKDYTMQRGDYLLADGNLLP KGTTLTEEQKASVAPIVFWTPAETNPEGRITPASLDFDKIMVKEHPNCTHGLAVSIKDAP GNVSWQNVNDWVADFQRGTDFNPVDKDEYVNIATGFDATGNINRILGYQNTKVLWAYNGY CKTNGKTDALVNPAEVLKTFIANNPAPANSTGWFLPSVKELHMLCYKDVDNIAYTRDNTE TRDIVEVSISAVGGDALSPRNNHKRFWSSSESPSNKNGAFSVYFYNAFAQLSEKDGALNV RAVCAF >gi|222159304|gb|ACAB01000055.1| GENE 4 3302 - 3499 75 65 aa, chain - ## HITS:1 COG:MA4184 KEGG:ns NR:ns ## COG: MA4184 COG3344 # Protein_GI_number: 20092976 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Methanosarcina acetivorans str.C2A # 1 65 92 156 251 76 49.0 1e-14 MALELITESEADANSYGFRKFRSTADAIDALHRWLSRDCLPQWILEGDIKGCFDHINHEW LLNNV >gi|222159304|gb|ACAB01000055.1| GENE 5 4591 - 4743 59 50 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237721718|ref|ZP_04552199.1| ## NR: gi|237721718|ref|ZP_04552199.1| predicted protein [Bacteroides sp. 
2_2_4] # 1 50 17 66 66 79 100.0 6e-14 MELNYKFGRLPSYTRLTFIVSAIVLFSISIQLYQHEQSFENLKLIFQKHK >gi|222159304|gb|ACAB01000055.1| GENE 6 4799 - 5560 595 253 aa, chain - ## HITS:1 COG:SA0220_2 KEGG:ns NR:ns ## COG: SA0220_2 COG0584 # Protein_GI_number: 15925931 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Staphylococcus aureus N315 # 26 242 1 217 242 85 31.0 7e-17 MKKVMCLLAIMFMIANTSAQTRVIAHRGFWKTQGSAQNSITSLLKADSIGCYGSEFDVWL TKDNGLVVSHDGIIQGHKVEESTLKELTGLWLANGECVPSLKELLETAKRKTSLKLVLEL KAHSKPEREIKAVEEIVSMIKKMGLEPRMIYITFSSHALKELIRQAPTSTPVYYLKGDLS PQQLKELGSAGPDYHFSEFYRYTDWIESCHALGLKVNVWTVNKKEDMQYFWDKVDFITTD EPLGLLKDIMKVK >gi|222159304|gb|ACAB01000055.1| GENE 7 5604 - 7496 1285 630 aa, chain - ## HITS:1 COG:no KEGG:BF0272 NR:ns ## KEGG: BF0272 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 630 3 614 614 372 37.0 1e-101 MKNIYTKLCLSAIGCLMMASCDFLDVVPQGTATVDDIYRTQYQAEGMVLSCYANIPNYFH PQQFPDFTGGNEIITSKGGTTRWFHFRSLVYGEESPTTTYYSLWSNTAKSYPQGAVKKAV WESIRNCYNVLNNLDRVSDITPENLSWWKGEVLFLIGYYHQIMLEYYGPIVIIDKEIPME SSPAEMMTSRSPYDTCVDFIANKYSEAARLLPGVWDSSKRNRATSSAALAFKARLLLYAA SPLVNGNSEFYSDFKNPDGTFLINQTYDREKWKRAMDAAKEAIDLCEENGYKLYGNSTND LEQGKKNYHEAFVGDGISGSGFNWNEVLFGFAEQGTISYCIKNMAPRVEFTSYSTKGFRG SLFPTWDCVSRYYTKNGLPWADDPETKKLDPYSIAPGDSTVRFHRNRDPRFYASIGFDRG NYDVQGKTIVLKCRRGEMQQNNGNAKDEYQTDNGYYCQKWVSKYDTYNRTTDQITYNRWC FPYMRLAELYLSYAEADFEYSGTLSTASLSYLNKVRERCGLPTFADSWAKAGGIPSGEKL REILHDERSIELAMEGRRFHDMRRWKIAHTEMMRYQKSWTLSGKTADSFYKLTDMKETGV RNFTAPKNYWLAIPQDQIEVNPNLVQNPGY >gi|222159304|gb|ACAB01000055.1| GENE 8 7515 - 10673 2259 1052 aa, chain - ## HITS:1 COG:no KEGG:BF0273 NR:ns ## KEGG: BF0273 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 41 1052 123 1132 1132 864 45.0 0 MNCKVKQGCLTSESCCILLWLAFTLFPVSVVTAADNRKIDESQQKEVKAETKGYVGDENG EPVIGATVLESGTQNGVITDVDGIFTIQLAKNASLTISYVGYQKKTVVVKDKKFISVNLT 
PDRSMMLDEVVVVGFGKQKKESLVGAVQAVKPEDLKMTSSNLSTSFAGNVPGIIAVQTQG EPGSDEAKFYIRGISTFGSNTSPLIILDGVEINATMMNNIPPESIASFSVLKDATATSLY GSRGANGVIIVTTKQGQLSEKMSVDIRFDNTFSMPTYVQKMADGPTYMDMYNEAVYNQAI SNNQEYEPFYSRDKIDKTRANANPYLFPNNDWYSMLFKDFTVNQNLNISIKGGSKNVDYF LNAGIFYENGIIRQPKEDKLDVGMRNKKYLFQSNVTARVTSTTKVGLNMNTQLFYHHAPK TSTRNLFAYSMHGNPVRFPATLPAEPGDTYIRYGSNDPWDVGKSEPNPYAKLSEGYTERN YVYMTTAFNLEQDLKFVTPGLKLTGLASFYNYSFNWLDHWIVPFYYKVSDDYTMDDQGNY LYKTSTIGEPGEPYLKSNSGRDPTESVWSLQGALDYSRQFGGHDVGATLVYHMKETKKVK DGGAEKDLLPYREQGMAGRLTYSFGQRYLLEATFGYNGSENFRSGHRFGFFPAIALGWTI SNEKFFQPLKKTVSTLKIRATYGLTGNDALATRFPYVTEVSMNNNLDWWTGSGTRVNGPL VNIYGNANATWEKSKKLNLGVDMTLLESMDITIDYFKEDRSGIFMQRSSVPSTMGVTGML PYANIGSVKNKGVDMSVAYSKVIGKDWVLRLNGSLTYAHNEITEIDEPVNVEPYSSRIGH PINSIMGYVSDGLFTSQEEIDRSPKQSFGNYTVGDIKYKDLNGDNVVNGYDRTIIGNPEI PEIIYGFGGTLKYKKWDLSLFFQGVAKVSLMMSDIHPFSEAGHKGYNIAQYIVDDHWSES NNVAGAAYPRLSPEFITNNAQTSTYYLRNAAFLRLKNAELGYSFFPWLRVYAAGTNLLTF SPFSTWDPEMGSGNGLKYPLQRTVKIGIQFHY >gi|222159304|gb|ACAB01000055.1| GENE 9 10677 - 10805 70 42 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNNRKENEQSCIVMNKKEGTSCITTHLPVFQRKIYTTNQKIY >gi|222159304|gb|ACAB01000055.1| GENE 10 11065 - 15033 1820 1322 aa, chain - ## HITS:1 COG:XF1330_1 KEGG:ns NR:ns ## COG: XF1330_1 COG3292 # Protein_GI_number: 15837931 # Func_class: T Signal transduction mechanisms # Function: Predicted periplasmic ligand-binding sensor domain # Organism: Xylella fastidiosa 9a5c # 31 753 28 739 740 141 23.0 9e-33 MQRLRILVILTLLSLMPLYSLAVLNEGQYAFRSLDINNGLSQNTVHAILQDKQGFMWFGT KDGLDRYDGISFRTFMKESGTLGNNFITSLYEDNLGQIWIGTDVGLYVYCPQMEKVRHFT LISNPNIDIDCTVNLITGDQKGGIWVVTQTRGIFYYNPQDSQLVNYQSDGSGTLNLKTSG QLYFDSDDVCWLDIRDGNLYYSKDKLKTLTPIFPEDSKVSFRDEYIYKLLPGPYNCMYVG TVFGLKEVNLTNKTIRTLLSKDELGGDIYIRELAFYSDDELWIGTESGLYIYNLHTAKII HLQNVNGDPYSISDNAIYSILKDREGGMWIGTYFGGVNYYPRQYTYFDKVYPQKESNKMG KRVREFCAAHDGTLWIGTEDKGLFHYYPSTGKIEPFIHPDIYHNVHGLYLDGDYLWVGNF AKGLKRIDLRTYAVKHYDNIASDIFSICRITAGDLYLGTTIGLFRYNPDTERFKRVPELG WTFVYYIKEDKQGNLWLATYADGVYKKNVRTGGWEHFVHEEADSSTLPSNKVLSIFEDSQ 
NQLWFTTQGGGFCRFVPSANTFVRYDGIGLPSNVIYRIEEDEKGLFWVSTNKGLVHFNPK NSSFKVYTVANGLLSNQFNYQSSYKDKNGRIYFGCINGFISFAPSSFIDNDFLPSVLITD FMLFNKKVVVGEKGSPLKQSITLSNHIELQSNQNSFSFRIAAISYQSPSMNTLLYRLEGY DSEWHTAGKGPITYSNLPYGTYMLRVKGANSDGVWNPDVRTLGIRILPPFYLSVWAYLIY ILLILGTFFALFFYLRKRTIEKQRKEMEKFEQEKEQELYTTKIDFFTNVAHEIRTPLTLI KCPLENVLADRELSENVRMELEIMDQNVERLLNLINQLLDFRKAENKGFKLNPKEYNIGT IVHSVYKYFTTLAKQRGIKLEVEVPEEELLASVDKEALTKILSNLFANALKYARTYVYLH LSVDEKNEVFTISMSNDGNIVPIEMRENIFKAFVQYRDGKDIVSGTGIGLAMARYLAELH QGMLVMDRELDCNRFILSIPILHQTLPDDSEEQHKKPGYDDEDREVSDKDKREVASILVV EDNKDMLAFVFRQLSSLYHVLIAENGVEALDVLEQKSVDLIISDIMMPLMDGVELCKHLK QNLDYSHIPVILLTAKTNLESRIEGLEEGADAYIEKPFSMEYLRANVANLLSNRERLRRH FIEFPFIKADAMAQTKADEMFISKLNEYVLRHLDNTDLQIDDIADAMNMGRSNFYRKLKG ILNMSPNEYLRLFRLKQAASILKEGTYGVVEVSYMVGFSTPSYFSSCFKKQFGVLPKDFI SH >gi|222159304|gb|ACAB01000055.1| GENE 11 15106 - 17154 1229 682 aa, chain - ## HITS:1 COG:no KEGG:BT_2899 NR:ns ## KEGG: BT_2899 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 682 3 677 677 1017 70.0 0 MTKLRCMLCCLMICTITIDARTVNDIYKRISAQVSLKVPGNQSRNYSLAFQGAVDDKGIY LLESEEKIPLIITERIERDNVKCVMVVSITALEDVYFNYQQQLKTGFRHNDCMFYLPGFW YSRNLRSPKGAPSFHISESWLVREDRLSSPLTGIFNQKDGRYMTVARKDDFQWDALATHQ TGEIILSGKTSLGFTGFESHDGTSTLSFGFPYREAPKTYIRKLTLAPEVTSFQYLKKGET VLLTWEFIEGQATDYSDFICHTWEYCYDTYRPQPVKIPFSPEEIKGVLSNYFVESFVGDK ALAYYSSPEMKVAACANMNVAEIGFVGRVLLNAFNAWEFGNENGRDDLAASSKKIFDSYL ENGFTETGYLREWVNLENDSDVKEERPVHSIRRQSEGIYAMFHYLNYEQKYKRHHPDWEQ RLKKMLDMFLQLQNPDGSFPRKFHDDFTVVDGSGGSTPSATLPLVMGYKYFKDKRYLSAA RRTAEYLEKELISKADYFSSTLDANCEDKEASLYTATATYYLSLITNGTEHNHYASLTRK AAYFALSWYYVWDVPFAPGQMLGDIGLKTRGWGNVSVENNHIDVFIFEFADVLRWLSKEY NEPRFSDFAEVISTSMCQLLPYKGHMCGVIKVGYYPEVIQHTNWDYGRNGKGYYNDMFAP GWTVASLWELLTPGRTENILLK >gi|222159304|gb|ACAB01000055.1| GENE 12 17385 - 17627 195 80 aa, chain + ## HITS:1 COG:no KEGG:Dfer_4026 NR:ns ## KEGG: Dfer_4026 # Name: not_defined # Def: transposase IS204/IS1001/IS1096/IS1165 family protein # Organism: D.fermentans # Pathway: 
not_defined # 8 80 8 80 281 76 53.0 3e-13 MRAQPQVPSLCIDETSLSCGELYTVVTNRAGRGGRGTLVTMIRGTKSEDVIKVLEMIHVS KRKTAKEVTLDLLPTMMRIV >gi|222159304|gb|ACAB01000055.1| GENE 13 17784 - 18911 733 375 aa, chain - ## HITS:1 COG:no KEGG:BT_3151 NR:ns ## KEGG: BT_3151 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 242 1 243 256 179 43.0 1e-43 MKSLFILQLVCCVITAMLALQLAMASLQVRWKVWRYEISRWILVASMLFFSVHYLLQMIH GLRAQGTDVGAAFNILFYTPVAFAITLSIINIESTGNKVRRYCLRGMMAYILIAIVFVIG MFKSQSLHIGNMLYVMLGLFVASMAYFILIIRKETKARKQKLMENFGIDLIPYVRYSQAS IILLYFAAGLLPVAILFNTLLYIIGPLILLSVIFFVHTFIAMGYYITPKEVIPEENDAEA KVTEAEDMKDGKNTHGTNILTANRKMEIELALKKWCEEGFYKDYEVNIYSLATKLGYKKN ELTEYFNQSEYTNFRTWLSDIRFNEAVRMMKANPEYSIDAISTECGFSSHTWIYRIFKQK TGMSPSQWRKQFASI >gi|222159304|gb|ACAB01000055.1| GENE 14 19089 - 20375 385 428 aa, chain + ## HITS:1 COG:no KEGG:BVU_1598 NR:ns ## KEGG: BVU_1598 # Name: not_defined # Def: transposase # Organism: B.vulgatus # Pathway: not_defined # 1 428 1 429 429 633 70.0 1e-180 MVKVQIKSEKLTSFGGIFPIMEKFDRMLSCTIDSTLGLRSKVYGYQYSEIIRSLMCVYFC GGSCVEDVTSHLMEALCLHPPLRTCSADTILRAIRELTTPNLTYRSDSGKSYDFNVASDL NNLLVNALLATGQLLPDNEYDFDFDHQFIETEKFDAKITYKKFTGYSPGIATVGDVIVGI ENRDGNTNVRFHQEDTLKRIFERLENARIHINRARMDCGSCSEAMVEMVEKHSRHFYIRA NRCVSLYDDIFALRGWKTEYINGHEFEICSIIAEKWVGKAYRLVIQRQRSINKELDLWEG EYTYRCILTNDYTSTDRDIVEYYNKRGGAERVLDDMNNGFGWKRLPKSFMAENTVFLLLT ALIRNFYRHLISDANMKSFGLKRTSRIKTFVFKYVSVPAKWIKTARQHILNIYTSNDAYM MAFKFDFG >gi|222159304|gb|ACAB01000055.1| GENE 15 21146 - 21325 131 59 aa, chain + ## HITS:1 COG:no KEGG:BT_1110 NR:ns ## KEGG: BT_1110 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 59 6 60 69 68 71.0 7e-11 MAFNNPPGHTEEELWQQILSLSRELQECRACLHYHNTLDYIYDKINMIEKELDILLRML >gi|222159304|gb|ACAB01000055.1| GENE 16 21507 - 21962 277 151 aa, chain + ## HITS:1 COG:MTH158 KEGG:ns NR:ns ## COG: MTH158 COG1528 # Protein_GI_number: 15678186 # Func_class: P Inorganic ion transport and metabolism # 
Function: Ferritin-like protein # Organism: Methanothermobacter thermautotrophicus # 1 126 2 128 171 71 27.0 5e-13 MDKRIEYAMNTLINTEIWSTNLYLSLQVYFEGQQLPILASWLSAQAQDNMGKVYQMMNRI YHEGGAVTIHEIRRDIRQWPTPLAALNTLLEHEQYISRQISELHVLCQNTDSSIHSFIKG LYTKRIYVSTAFMELLRILAMEYERRLPCFI >gi|222159304|gb|ACAB01000055.1| GENE 17 22043 - 22690 422 215 aa, chain + ## HITS:1 COG:BS_yvbG KEGG:ns NR:ns ## COG: BS_yvbG COG2095 # Protein_GI_number: 16080438 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Multiple antibiotic transporter # Organism: Bacillus subtilis # 3 205 2 210 211 132 40.0 7e-31 MDIFAYSMLCFTSLFTLMDPLGVMPVFLQMTEGMSAGERRSIALKSCALTFVILVLFTLC GRFLFHFFGISTNGFRIAGGIIIFKIGYDMLQAHFTHVKLNENEKKEYSRNITVTPLAVP MLCGPGVISSGITLMEDAPEHIFKIALVCVIALVCLLSFIILCVSTRLLKILGETGNNVM MRLMGLILMVIAVECFINGMQPVLTDILRQAHACP >gi|222159304|gb|ACAB01000055.1| GENE 18 22737 - 23219 594 160 aa, chain + ## HITS:1 COG:MTH158 KEGG:ns NR:ns ## COG: MTH158 COG1528 # Protein_GI_number: 15678186 # Func_class: P Inorganic ion transport and metabolism # Function: Ferritin-like protein # Organism: Methanothermobacter thermautotrophicus # 1 158 2 161 171 126 45.0 2e-29 MTENLQKALNGQITAELWSANLYLSMSFYLEKEGFSGMARWMRKQSAEETGHACAIAEYM AKREAEAKVDKVDVVPQGWGSPTEVFEHALEHERHVSRLIDELVHLASEEKDNATRDFLW GFVREQVEEEANFLNIVNLMKKAGESGILFMDAKLGERQS >gi|222159304|gb|ACAB01000055.1| GENE 19 23243 - 24295 943 350 aa, chain + ## HITS:1 COG:ECs2900 KEGG:ns NR:ns ## COG: ECs2900 COG1830 # Protein_GI_number: 15832154 # Func_class: G Carbohydrate transport and metabolism # Function: DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes # Organism: Escherichia coli O157:H7 # 1 350 25 374 374 489 66.0 1e-138 MSTIIDLLGTQAGYYLDHVCKTIDKKLIHIPEPNMIDKAWVDSDRNIRTLESLQALYGHG RLANTGYVSILPVDQGIEHSAGASFAPNPLYFDPGNIVKLAIEGGCNAVASTFGVLGAVA RKYAHKIPFIVKLNHNELLTYPNSYDQVMFGTVKEAWNMGAVAVGATIYFGSEQSRRQIV EVSQAFEYAHELGMATVLWCYLRNSSFKKDGIDYHAAADLTGQANHIGVTIKADIVKQKL 
PSNNGGFKAIGFGKTDGRMYTELVSDHPIDLCRYQVANGYMGRVGLINSGGESHGTSDLH DAVVTAVVNKRAGGMGLICGRKAFQKTMKDGVSLLNTIQDVYLDPSITVA >gi|222159304|gb|ACAB01000055.1| GENE 20 24308 - 25057 608 249 aa, chain + ## HITS:1 COG:STM0772 KEGG:ns NR:ns ## COG: STM0772 COG0588 # Protein_GI_number: 16764136 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglycerate mutase 1 # Organism: Salmonella typhimurium LT2 # 3 245 5 247 250 287 58.0 2e-77 MKRIVLLRHGESLWNKENRFTGWTDVDLSDKGIAEACKAGDMLKEAGFSFEAAYTSYLKR AVKTLNCVLDRLNEDWIPVEKSWRLNEKHYGILQGLNKRETADKYGEEQVHIWRRSYGVS PEPVKEDDPRYPGNDTRYAGVPEMELPRTESLKDAVMRVMPYWECVILPTLMHRDNLLVV AHGNSLRGIVKHLKNISDTDISLLNLPTAVPYVFEFDERPVLVRDYFLGDQEEIRRRTEA VAEQGMIRR >gi|222159304|gb|ACAB01000055.1| GENE 21 25134 - 25562 519 142 aa, chain + ## HITS:1 COG:TM0374 KEGG:ns NR:ns ## COG: TM0374 COG0071 # Protein_GI_number: 15643142 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone (small heat shock protein) # Organism: Thermotoga maritima # 3 134 12 139 147 65 37.0 4e-11 MVPVKTNSNWLPSIFNDFFENEWLAKTGVTAPAINVIENDKDYKVEMAAPGMTKDDFKVN VDENNNLTICMEKKEEKKEEKKDKKYLRREFSYSKFQQTILLPENVEKDKISAKVEHGIL SIEIPKVKEEEKQKTSKAIEVK >gi|222159304|gb|ACAB01000055.1| GENE 22 25663 - 26160 316 165 aa, chain + ## HITS:1 COG:AGl1386 KEGG:ns NR:ns ## COG: AGl1386 COG1595 # Protein_GI_number: 15890818 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 3 164 10 166 189 94 34.0 1e-19 MKEFNFEKELLSVQDELFRFAYKLTADREKAEDLLQDTLLKAMLHKESYNKNTNFKGWLF IIMRNTFINGYRAEVGHTRLYISSDPAYYSRIMDESRTEEVDRNYDLEKIRNAIKSIPES HFIPFEMYLSGFKYREIAERTGVSLGTIKSRIFHCRKKLKAILAE >gi|222159304|gb|ACAB01000055.1| GENE 23 26183 - 27169 786 328 aa, chain + ## HITS:1 COG:BH3164 KEGG:ns NR:ns ## COG: BH3164 COG0205 # Protein_GI_number: 15615726 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # 
Organism: Bacillus halodurans # 7 328 4 319 319 293 50.0 3e-79 MKKDEYIGILTSGGDASGMNAAIRAVTRTAIFNGFKVKGIYRGYEGLIAGESRELTTEDV SSIIQRGGTILKTARSEAFTTPEGRREAYDTMRKEKIGALVVIGGDGSLTGARIFAEEYP VTCIGLPGTIDNDLYGTDFTIGYDTALNTIVECVDKIRDTATSHDRIFFVEVMGRDAGFL AQNSAIASGAEAAIIPEDRTDADQLETFIGRGFRKTKNSSIVIVTESPGNKNGGAMHYAD RVKREYPGYDVRVSILGHLQRGGAPSANDRILASRLGEAAIQALMEDQHNIMVGIHNNEI VYVPFDQAIKNDKPIDKNLIRVLNELSI >gi|222159304|gb|ACAB01000055.1| GENE 24 27182 - 28957 1044 591 aa, chain + ## HITS:1 COG:PA5529 KEGG:ns NR:ns ## COG: PA5529 COG0475 # Protein_GI_number: 15600722 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, membrane components # Organism: Pseudomonas aeruginosa # 7 443 6 447 585 306 38.0 6e-83 MSEVAPLISDLATILIIAGIITVIFKWLGQPVIVGYIVAGIMAGPSISLFPTVSDQANIK IWADIGVIFLLFAIGLDFSFRKLISVGASAIFSTVIIVCGMMFLGYTAGNAMGFSHTSCI FLGGMLSMSSTAIVFKAFSDMGLLDQKFTGIVLGILIIEDVVAVIMMVVLSTLAVGKHFE GFEMLESILKLAAFLIFWSALGIYLIPTLLKRLNRFISNEILLTTSLGLCLGMVMIATKA GFSAALGAFVMGSLLAETDKAEEIAHIVQPVKDLFASVFFVSVGMMIDPAMIWEYAVPVF ILTALVLCGQVLFGSFGVLLSGQPLKIAIQAGFSLAQVGEFAFIIASLGISLNVMDKYLY PVIVAVSVITTFLTPYMIRLSDPVYCFADRHLPQFLKDYLTHYSSGTMTTRHQGSWHKLI RSMLVSVTLYLVVCLFFIALFFSYAYPLVMGRIPGMKGSLLSFVLVLLIISPFLCAIIMK KNNSVEFRKLWADNRFNRGLLVSMIVIKVLICIIIVMGIIIRLFNVALGSGLILSFLIIV VIYFSKRIRKRSLSMEKQFLENFQGTAGTFPQEEPETGETSTEKNHELSNK >gi|222159304|gb|ACAB01000055.1| GENE 25 28964 - 31522 2030 852 aa, chain + ## HITS:1 COG:PH1512 KEGG:ns NR:ns ## COG: PH1512 COG0058 # Protein_GI_number: 14591294 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Pyrococcus horikoshii # 22 851 18 833 837 595 40.0 1e-169 MEYSSYHVNVPQWREITVGSHLPAELRRFAEMAHNLWWTWNEDAKSLYSGLNPELWEEAE QNPVLFLERMDYEELEALTHDGNFMRKMENVYSTFKAYLDVEPDHSRPSVAYFSMEYGLD RVLKIYSGGLGILAGDYLKEASDSNVDLCAVGLLYRYGYFDQALAMDGQQQVHYDPQNFG QLPIEKVMQPDGRQLVIHVPYADSFTVHANVWKANVGRVSLYLLDTDNELNSEFDRPITH HLYGGDWENRLKQEILLGIGGMMTLKVLGIEKDVYHCNEGHAALINIQRLCDYISEGLDF 
GQAMELVRASSLYTVHTPVPAGHDYFDEGLFNKYMKGYPDKLGITWDELMNLGRQTPGNK GERFCMSVFACKTSQAVNGVSKLHKSVSQQMFAPLWKGYFPEENHVGYVTNGVHFPTWCT AEWKKLFKDNFDENFMNDQSNQEIWKGVYNIPDEEIWNMRKRLKTKLISYIKWKCGRDWL KSQVDPALGVSIFEKFNPNALLVGFGRRFATYKRAHLLFTDLDRLARIVNNQEHPIQFVF AGKAHPNDTAGQGLIKQIVEISRRPEFLGKIIFLENYDMDLARHLISGVDIWMNTPTRLA EASGTSGEKALMNGVLNFSVLDGWWYEGYRKEAGWAITDKATYQDEQYQNQLDAETIYFL LEHNILPLYYERKEKDYPETWVKYIKNSVAQIAPRFTMKRQLDDYYDGFYNKLSEHFHVL AADNYAKAKTLAGWKAAVDSRWNAVEIVSVNAGKGLDATVEAGKEYEMTVVIDEKGLDNA IGLESVIIRHEENEDRIHEVIPFSLTSKDGNLYTFKAITRMFSAGSFKQAFRMYPNHPLL PHRQDFCYVRWF >gi|222159304|gb|ACAB01000055.1| GENE 26 31607 - 32347 159 246 aa, chain + ## HITS:1 COG:no KEGG:BT_1099 NR:ns ## KEGG: BT_1099 # Name: not_defined # Def: putative arginase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 243 24 266 269 401 74.0 1e-110 MNFTGVYAHEVFARNNQFIWLDCRHLSGTRGYCDKEGIRKLKRVIAGYPAEGIHFIDSGN YHYLTKLWTDKLRVPFSLIVFDHHPDMQPPLFKGMLSCGSWVKDMLDWNMLCKKVVIVGA SDKLIRTVPEEYGQRVSFYSEATLAHEKGWRNFSSAYIEGPVYLSIDKDVLNPASAVTDW DQGSFSLQELEELLAIVLRKERVVGIDICGECSATLTLFEERREATVDSRANKELLKLIQ SFSCFL >gi|222159304|gb|ACAB01000055.1| GENE 27 32399 - 32623 331 74 aa, chain + ## HITS:1 COG:TM1185 KEGG:ns NR:ns ## COG: TM1185 COG1803 # Protein_GI_number: 15643941 # Func_class: G Carbohydrate transport and metabolism # Function: Methylglyoxal synthase # Organism: Thermotoga maritima # 7 74 16 76 166 62 47.0 2e-10 MMKTIVRKIGLVAHDAMKKDMIEWVLWNSERLIGHKFYCTGTTGTLIKKALEEKHPETEW DITILKSGPLGGDQ >gi|222159304|gb|ACAB01000055.1| GENE 28 32624 - 32938 199 104 aa, chain + ## HITS:1 COG:TM1185 KEGG:ns NR:ns ## COG: TM1185 COG1803 # Protein_GI_number: 15643941 # Func_class: G Carbohydrate transport and metabolism # Function: Methylglyoxal synthase # Organism: Thermotoga maritima # 1 89 78 165 166 76 48.0 1e-14 MGSRIVEGEIDYLFFFTDPMTLQPHDTDVKALTRLAGVENIVFCCNRSTADHIISSPLFL DPTYKRIHPDYTNYTQRFENKEIVSEAVERVKKRMSRNENNMIE >gi|222159304|gb|ACAB01000055.1| GENE 29 33173 - 35578 1305 801 aa, chain + ## HITS:1 COG:mlr3215_2 KEGG:ns 
NR:ns ## COG: mlr3215_2 COG0642 # Protein_GI_number: 13472804 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 321 665 1 342 382 125 27.0 3e-28 MWLKLKILLGYVILLLLLVLTIHIFRKEQTRRNSLRQDEHELACIRHLAGETYAGLLGLA TYGETASVWDESDLGLYHTKKDSVCNMLQTLKRYVKSPEQQSRIDSLCLLLERKELLLDT VMDTFGRLRKTGEIVNSKIPAIVSHIQQADVLPAEKKKEEETPKKGFWSFIRPGRKKSAY LQQKEQLERQRQSIGKHQGTTSVTSMLHSLDREVTDMQKAERERLLEQMDLLYSNNTDLN HRLHRIVRDFEADAGIRLDERYRQFISTRDRSFHTVSLLAVLISLLTVLLYLIIHRDLNR INQYQRQLEASNRENTELLQSRKRMMLTIAHDLRAPLATIKGCAELLPGEEKKSRKDEYA ENILHSSDYMIGLVNTLIGFYLLDTGRNKPILSIFRLGTLFSETARNYGSLAKKKKLRLT TAFSGLDVVVSGDRSQLQQILNNLLSNAIKFTRQGEIRLQAEYRNKELHFSVQDTGTGMT EEETTRIFTAFERLDNARNVPGFGLGLAIASRLVSGMQGSLTVKSKPGEGSTFTAFLPLL EADESTQMDETRIATDYHLDGINVLVIDDDRMQLNITKEMFNRNGVRCDCCQTSRELVTR LRSQRYDLLLTDIQMPETDGYGILELLRASNMENAKTIPVIAVTARVDDDNEYLSGGFSG CIHKPFSMEELINTVAQVIGEKDRKEYAPDFSLILSGEDNREEMLALFIEESRKDLAALT AALDRQDKEAAASILHKNLPLWETVRLDFPLSHLRELVTEPATEWTNRQSMEMRDIIRAV EKLIVYAEKYGRKAYENNPDY >gi|222159304|gb|ACAB01000055.1| GENE 30 35556 - 35973 255 139 aa, chain + ## HITS:1 COG:CPn0586 KEGG:ns NR:ns ## COG: CPn0586 COG2204 # Protein_GI_number: 15618496 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Chlamydophila pneumoniae CWL029 # 1 137 3 142 386 84 32.0 6e-17 MKTILIIEDDIVFSRSISNWLVKKGMKTECVATLANARKAIGQKEFDLILADLRLPDGNS TSLLKWMNEKYYSIPFLIMTNYGQVENAVTTMQLGAINYLCKPVQPDNLLALITEILDKN DNEQEFYRGESPKAHEMYR Prediction of potential genes in microbial genomes Time: Wed May 18 02:27:46 2011 Seq name: gi|222159303|gb|ACAB01000056.1| Bacteroides sp. D1 cont1.56, whole genome shotgun sequence Length of sequence - 74262 bp Number of predicted genes - 76, with homology - 73 Number of transcription units - 34, operones - 18 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 14 - 874 645 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains + Term 911 - 964 8.2 + Prom 907 - 966 4.9 2 2 Tu 1 . + CDS 1089 - 1706 363 ## BT_0016 hypothetical protein + Prom 1790 - 1849 6.5 3 3 Tu 1 . + CDS 2048 - 2293 239 ## BT_0100 hypothetical protein 4 4 Tu 1 . + CDS 2509 - 3069 361 ## BF1249 putative mobilisation protein 5 5 Op 1 . + CDS 3208 - 3507 133 ## gi|294646545|ref|ZP_06724182.1| conserved domain protein 6 5 Op 2 . + CDS 3541 - 3813 224 ## BT_2306 putative mobilization protein 7 5 Op 3 . + CDS 3844 - 5553 1241 ## COG3505 Type IV secretory pathway, VirD4 components 8 5 Op 4 . + CDS 5592 - 5762 268 ## gi|255691005|ref|ZP_05414680.1| D12 class N6 adenine-specific DNA methyltransferase 9 6 Tu 1 . + CDS 5934 - 6335 359 ## COG0338 Site-specific DNA methylase + Term 6535 - 6582 5.5 10 7 Op 1 . - CDS 6195 - 6596 67 ## 11 7 Op 2 . - CDS 6624 - 7517 466 ## BDI_2141 DNA primase 12 7 Op 3 . - CDS 7522 - 9351 609 ## COG3344 Retron-type reverse transcriptase 13 8 Tu 1 . - CDS 9901 - 10059 190 ## BDI_3503 DNA primase - Prom 10079 - 10138 3.8 + Prom 10025 - 10084 3.6 14 9 Tu 1 . + CDS 10166 - 10357 123 ## - Term 10495 - 10533 1.9 15 10 Tu 1 . - CDS 10553 - 10684 58 ## gi|293373352|ref|ZP_06619710.1| conserved domain protein - Prom 10707 - 10766 5.6 + Prom 10635 - 10694 3.0 16 11 Op 1 . + CDS 10759 - 11898 1206 ## PGN_0581 hypothetical protein 17 11 Op 2 . + CDS 11904 - 13328 646 ## BT_0017 hypothetical protein 18 11 Op 3 . + CDS 13333 - 14148 589 ## gi|237715497|ref|ZP_04545978.1| predicted protein 19 11 Op 4 . + CDS 14153 - 15601 650 ## gi|237715498|ref|ZP_04545979.1| predicted protein 20 11 Op 5 . + CDS 15608 - 15919 329 ## gi|237715499|ref|ZP_04545980.1| predicted protein 21 11 Op 6 . + CDS 15949 - 17019 627 ## BF3847 hypothetical protein 22 11 Op 7 . + CDS 17040 - 18806 858 ## gi|237715501|ref|ZP_04545982.1| predicted protein 23 11 Op 8 . 
+ CDS 18818 - 20020 639 ## gi|237715502|ref|ZP_04545983.1| predicted protein + Prom 20060 - 20119 2.0 24 12 Op 1 . + CDS 20155 - 20892 441 ## gi|237715503|ref|ZP_04545984.1| predicted protein 25 12 Op 2 . + CDS 20919 - 21035 72 ## gi|294646527|ref|ZP_06724164.1| conserved domain protein 26 12 Op 3 . + CDS 21102 - 22994 1114 ## COG0550 Topoisomerase IA 27 12 Op 4 . + CDS 23006 - 23224 305 ## gi|237715505|ref|ZP_04545986.1| predicted protein + Term 23311 - 23356 9.4 - Term 23018 - 23048 -0.5 28 13 Tu 1 . - CDS 23259 - 23666 211 ## BF0137 hypothetical protein - Prom 23757 - 23816 4.5 + Prom 24036 - 24095 6.6 29 14 Op 1 . + CDS 24205 - 24324 145 ## + Prom 24364 - 24423 2.0 30 14 Op 2 . + CDS 24498 - 24809 331 ## BF3840 hypothetical protein + Term 24833 - 24871 4.1 - Term 24821 - 24858 7.1 31 15 Tu 1 . - CDS 24875 - 26104 664 ## COG0582 Integrase - Prom 26279 - 26338 5.1 - Term 26392 - 26425 -1.0 32 16 Tu 1 . - CDS 26456 - 28132 1435 ## COG3507 Beta-xylosidase - Prom 28161 - 28220 4.4 - Term 28186 - 28231 8.1 33 17 Op 1 . - CDS 28287 - 29528 1222 ## BT_4186 hypothetical protein - Prom 29576 - 29635 4.5 34 17 Op 2 . - CDS 29681 - 31300 1317 ## COG5434 Endopolygalacturonase - Prom 31450 - 31509 7.6 + Prom 31428 - 31487 5.2 35 18 Op 1 . + CDS 31652 - 32509 1058 ## COG0623 Enoyl-[acyl-carrier-protein] reductase (NADH) 36 18 Op 2 . + CDS 32555 - 33523 851 ## BT_4189 hypothetical protein + Prom 33531 - 33590 4.2 37 18 Op 3 . + CDS 33615 - 34319 509 ## COG0313 Predicted methyltransferases + Term 34483 - 34525 2.1 38 19 Op 1 . - CDS 34285 - 35403 649 ## BT_4191 hypothetical protein 39 19 Op 2 . - CDS 35467 - 36315 442 ## COG0320 Lipoate synthase 40 19 Op 3 . - CDS 36315 - 38525 1733 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases - Prom 38655 - 38714 5.2 + Prom 38486 - 38545 4.1 41 20 Tu 1 . + CDS 38680 - 38982 187 ## BT_4196 hypothetical protein + Term 39055 - 39092 -0.3 42 21 Tu 1 . 
- CDS 39044 - 39940 993 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase - Prom 39961 - 40020 8.3 - Term 39988 - 40046 14.0 43 22 Op 1 . - CDS 40062 - 40616 624 ## BT_4204 hypothetical protein 44 22 Op 2 . - CDS 40638 - 41405 817 ## COG1043 Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase 45 22 Op 3 . - CDS 41424 - 42809 1437 ## COG0774 UDP-3-O-acyl-N-acetylglucosamine deacetylase 46 22 Op 4 . - CDS 42821 - 43861 1017 ## COG1044 UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase - Prom 43881 - 43940 3.6 47 23 Op 1 . - CDS 43944 - 45173 887 ## COG1078 HD superfamily phosphohydrolases 48 23 Op 2 . - CDS 45204 - 46028 869 ## COG0284 Orotidine-5'-phosphate decarboxylase - Prom 46049 - 46108 8.6 49 24 Op 1 . - CDS 46111 - 47223 1150 ## COG0216 Protein chain release factor A 50 24 Op 2 . - CDS 47227 - 48393 1119 ## COG0150 Phosphoribosylaminoimidazole (AIR) synthetase - Prom 48513 - 48572 5.9 - Term 48596 - 48625 -0.2 51 25 Op 1 . - CDS 48664 - 49245 779 ## COG1704 Uncharacterized conserved protein 52 25 Op 2 . - CDS 49254 - 49574 173 ## gi|237715530|ref|ZP_04546011.1| predicted protein 53 25 Op 3 . - CDS 49571 - 49870 182 ## gi|237715531|ref|ZP_04546012.1| predicted protein 54 25 Op 4 . - CDS 49942 - 50865 635 ## COG1512 Beta-propeller domains of methanol dehydrogenase type 55 25 Op 5 . - CDS 50938 - 51888 650 ## COG1073 Hydrolases of the alpha/beta superfamily 56 25 Op 6 . - CDS 51908 - 52654 775 ## COG0169 Shikimate 5-dehydrogenase 57 25 Op 7 . - CDS 52694 - 53431 557 ## PROTEIN SUPPORTED gi|163754278|ref|ZP_02161401.1| 30S ribosomal protein S15 - Prom 53490 - 53549 3.9 58 26 Op 1 . - CDS 53593 - 54537 1137 ## COG0152 Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase 59 26 Op 2 . - CDS 54551 - 55498 1035 ## COG1702 Phosphate starvation-inducible protein PhoH, predicted ATPase - Prom 55520 - 55579 3.9 60 27 Op 1 . - CDS 55600 - 56277 626 ## BT_4219 hypothetical protein - Prom 56299 - 56358 2.1 61 27 Op 2 . 
- CDS 56360 - 57220 422 ## BVU_1438 hypothetical protein 62 27 Op 3 . - CDS 57159 - 57413 67 ## gi|237715540|ref|ZP_04546021.1| predicted protein + Prom 57327 - 57386 5.4 63 28 Tu 1 . + CDS 57412 - 57849 334 ## BF3551 hypothetical protein + Term 57867 - 57926 -0.8 + Prom 58002 - 58061 5.7 64 29 Op 1 . + CDS 58136 - 58525 248 ## gi|298484496|ref|ZP_07002647.1| hypothetical protein HMPREF0106_04950 65 29 Op 2 5/0.000 + CDS 58513 - 58863 207 ## COG3436 Transposase and inactivated derivatives + Prom 58872 - 58931 4.3 66 29 Op 3 . + CDS 58958 - 60559 622 ## COG3436 Transposase and inactivated derivatives + Term 60707 - 60760 7.1 67 30 Tu 1 . - CDS 61017 - 62147 619 ## COG1672 Predicted ATPase (AAA+ superfamily) - Prom 62222 - 62281 7.5 - Term 62234 - 62284 12.4 68 31 Op 1 . - CDS 62305 - 63957 2004 ## COG2268 Uncharacterized protein conserved in bacteria 69 31 Op 2 . - CDS 63978 - 64475 444 ## BT_4221 hypothetical protein - Prom 64510 - 64569 7.0 + Prom 64510 - 64569 4.7 70 32 Op 1 . + CDS 64604 - 65353 571 ## COG0101 Pseudouridylate synthase 71 32 Op 2 . + CDS 65372 - 66274 816 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 72 32 Op 3 . + CDS 66320 - 66970 575 ## BT_4239 hypothetical protein + Term 67219 - 67253 1.1 - Term 66883 - 66929 10.2 73 33 Op 1 . - CDS 67002 - 68090 1091 ## BT_4240 hypothetical protein 74 33 Op 2 . - CDS 68121 - 71462 3277 ## COG3250 Beta-galactosidase/beta-glucuronidase 75 33 Op 3 . - CDS 71535 - 72425 718 ## COG1284 Uncharacterized conserved protein - Term 72465 - 72500 2.6 76 34 Tu 1 . 
- CDS 72540 - 73949 1536 ## COG0673 Predicted dehydrogenases and related proteins - Prom 74040 - 74099 7.2 Predicted protein(s) >gi|222159303|gb|ACAB01000056.1| GENE 1 14 - 874 645 286 aa, chain + ## HITS:1 COG:hydG KEGG:ns NR:ns ## COG: hydG COG2204 # Protein_GI_number: 16131834 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 1 281 157 441 441 235 44.0 6e-62 MIACADISVLLRGASGTGKEHIAAEIHARSHRKNKPYLAIDCGAISDELAASEFFGHQKG AFTGAESDKVGLFRAVNGGTLFLDEIGNLSYKTQMLLLRALQEKRCKPVGSTKEYSFDIR LVAATNENLEKAIGEGRFREDLFHRLNEFTLRIPTLAECREDILPLAYFFLKLTCAKAHK SFRGFDRLAEAALLEYPWPGNIRELKNVIGRAVLICQEQWISVSDLNLEISLPKEEETQW TEEEKEKALLLQTLEKTGDNRSKAARLLNVSRTTLYEKLRKYHIID >gi|222159303|gb|ACAB01000056.1| GENE 2 1089 - 1706 363 205 aa, chain + ## HITS:1 COG:no KEGG:BT_0016 NR:ns ## KEGG: BT_0016 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 25 200 24 209 212 67 21.0 3e-10 MKPFTECRIFNYLSLASSPKQTVSDEEFSSSYTEYEQYLYDLAIESVSVSERLRHLLHSK VELISLKKLFTRTGHFHTAVAEFYLDKCLLLVEAEIELVNFGVQYPGTITTPSSFLSSLH WKGSLVNLMELISSLDYSGLITDESGKRLSFAGIVSAFEKLFNVAIPKPYDLRADLARRK KNYSVLLPKLKETFEKNIAACGNGK >gi|222159303|gb|ACAB01000056.1| GENE 3 2048 - 2293 239 81 aa, chain + ## HITS:1 COG:no KEGG:BT_0100 NR:ns ## KEGG: BT_0100 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 81 65 145 145 103 69.0 3e-21 MKVVKIDKAATDYYIKLTNLQSEYRRIGVNYNQAVKALHTGLSEKKALAMLYKLEQLTIE LISLNREIIRLTQEFEQWLQK >gi|222159303|gb|ACAB01000056.1| GENE 4 2509 - 3069 361 186 aa, chain + ## HITS:1 COG:no KEGG:BF1249 NR:ns ## KEGG: BF1249 # Name: not_defined # Def: putative mobilisation protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 186 78 263 412 204 54.0 9e-52 MLTDEQLTAIGQEYMEKMGYGNQPYIIYRHEDIGRPHIHIVSLRIDEQGKKIKDCKEWQR STAVCRELERKYHLLPAEKMERRESLPLTAVDYRKGDIKHQIANVVKPVMQGYKFQSVKE 
FKALLGLFHVTVEEAHKTIKGKTYHGLVYAATDEKGERTGVAIKSSKIGKSVGYEALQKK FVKSKQ >gi|222159303|gb|ACAB01000056.1| GENE 5 3208 - 3507 133 99 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294646545|ref|ZP_06724182.1| ## NR: gi|294646545|ref|ZP_06724182.1| conserved domain protein [Bacteroides ovatus SD CC 2a] # 1 99 1 99 99 164 100.0 2e-39 MIYGVTDIDHNSKTVFKGSLLGKEYSASVINRKYGTIPPEKTEEAPVIHPSEPEMKETEL VEGLLDIFSLESYPYPADDLTQSPYGKKKKRKRRGPHLG >gi|222159303|gb|ACAB01000056.1| GENE 6 3541 - 3813 224 90 aa, chain + ## HITS:1 COG:no KEGG:BT_2306 NR:ns ## KEGG: BT_2306 # Name: not_defined # Def: putative mobilization protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 86 1 87 669 109 60.0 3e-23 MSQDETRGLNKNLDFMRAISILFLVMNVYYFCYPYFLSMNLNIGVVDKILLNFQRDTGLF SHSLVSKSFSLLFLFFSCMGAKGRKDVEMS >gi|222159303|gb|ACAB01000056.1| GENE 7 3844 - 5553 1241 569 aa, chain + ## HITS:1 COG:alr7213 KEGG:ns NR:ns ## COG: alr7213 COG3505 # Protein_GI_number: 17233229 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Nostoc sp. 
PCC 7120 # 99 456 117 466 589 86 25.0 2e-16 MLLFFGSELLLTARFPVALLTLLYVATVAAGYISLLTAGTWISRLLKNQLMDDVFNDENE SFMQERRLIANEYSVNLPTRFRYQRKTYSGWINVINPFRASLILGTPGSGKSYAIINNYI RQQIEKGFAAYIYDFKYPDLSIIAYNQLLKNKDKYAKPVGFYVINFDDPRYSHRCNPLNP SFLSDIADAYESAYVIMLNLNKSWIQKQGDFFVESPIVLFAVVIWYLKIYENGKFCTFPH AIELLNKPYSDLFTILTSYRELENYLSPFMDAWKGGAMEQLQGQIASAKIPLSRLISPAL YWIMTGDDFTLDINNPEEPKVLCVGNNPDRQNIYSCALGLYNARIVKMVNRKGRLKCSIL VDEVPTLYFKGLDTLIATARSNKVAVCLGAQDFSQLIRDYGDKEARVIQNTIGNIFSGQV VGETAKNLSERFGKVLQQRKSINMTREDTSTNISTQLDSLIPASKISNLSQGEFVGSVCD NFGEKIEQKIFHCEIVVDNERVAAETKAYKPIPVITDFTGADGKDHMQEEIERNYYQIKE DVTQIIEKELLRIQNDPNLKHLLETANDE >gi|222159303|gb|ACAB01000056.1| GENE 8 5592 - 5762 268 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|255691005|ref|ZP_05414680.1| ## NR: gi|255691005|ref|ZP_05414680.1| D12 class N6 adenine-specific DNA methyltransferase [Bacteroides finegoldii DSM 17565] # 1 56 1 56 56 102 100.0 7e-21 MEKKKSSPANLPGKPCFPWVGGKRRLLPVLIESLPKDFGQMDTYVEPFVGGGALFF >gi|222159303|gb|ACAB01000056.1| GENE 9 5934 - 6335 359 133 aa, chain + ## HITS:1 COG:MJ0598 KEGG:ns NR:ns ## COG: MJ0598 COG0338 # Protein_GI_number: 15668778 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Methanococcus jannaschii # 8 133 115 240 289 110 43.0 9e-25 MEKRTYYNEGNPNNITRAALFIFFMRTCYNGIYSVNHSGKLSVTFGAGGRVKLLEEELIR FNHKLLQDVVILDGDYRQTAEYTGANSLFYFDPPYKPVNEGNSCTSYMPQDFGDEEQINL ANFCKGIGETGAK >gi|222159303|gb|ACAB01000056.1| GENE 10 6195 - 6596 67 133 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHERQAIKKRVKSLLEGIWLSFYCYPATTGYLFLIISDQQFIDRFSFLGYGTDGTEDADS LDMETGIEVVEEGVLRVLLPGIGVGKHSFGTGFTYSLAEICQVDLLFISEVLWHVGGAGV TLIDRLIRRVKIK >gi|222159303|gb|ACAB01000056.1| GENE 11 6624 - 7517 466 297 aa, chain - ## HITS:1 COG:no KEGG:BDI_2141 NR:ns ## KEGG: BDI_2141 # Name: not_defined # Def: DNA primase # Organism: P.distasonis # Pathway: not_defined # 38 294 58 318 323 205 42.0 2e-51 MLSTESRMHGDMQVRFGGRYGKTYRRKAARRPVPSLRIGQGGDIITLAMELQKTKDISYA 
LKTIEGHFPAAFRPVAASPRQAEPQATGYRQVRIDPLTNPVLLGYLKERGILPEIAREAC KEVHFQNKGKWYFAVGFANRSGGYEIRNKYLKGSISPKEITHIKNGSDRCCIVVEGFMDY LSYLTLKATHPGNGQPKGNGPDYIVLNSVSNVGKAIPVLKEYKSALCLLDNDSAGRQAFQ QMAQAGCPVRDKSDCYREYNDLNDYLLGRKMAQENKTDHQHETAPKLIEKPAKKQSG >gi|222159303|gb|ACAB01000056.1| GENE 12 7522 - 9351 609 609 aa, chain - ## HITS:1 COG:Q0050 KEGG:ns NR:ns ## COG: Q0050 COG3344 # Protein_GI_number: 6226520 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Saccharomyces cerevisiae # 35 607 255 825 834 296 33.0 1e-79 MENVRDIMRSPEQVLKALNKHGKVSDYKFERLYRILFNEEMFHVAYQRIYAKPGNMTPGT DGKTINRMSLQRINKVIASLRDESYKPNPAKRTHIPKKNGKKRPLGIPSFEDKLVQEVVR MILEAIYEEVFANTSHGFRPNRSCHTALTHIQKTFTGTKWFVEGDIKGFFDNIDHNVLIA TLRKRIDDNRFLRLIRKLLNAGYIEDWRFHNTNKGTPQGGNISPILANIYLDNFDKYMEE YALRFNKGKERHITKEYKQLSGKMQGILKSIKNIKDADARLQLRDEYVKLGRERQKIESR DSMDETYRRFRYVRYADDFLIGVIGSKADCVKIKSDITNYMEENLKLELSQEKTLITNAQ TPAKFLGFEVSVRKSDVVKRNKNNVSARYYNGKIVLKVAIETVRNKLEEYSAIRYKVENG RQVWFAKFRGNLMKKKIEDIVAAYNSEIRGFYNYYCIANNVAYALSKFGYIMEYSMYHTI AAKTNSTVSKVIDKYKIGNDIIVPYQDAKGNLRHRKFYNEGFKRKPPMYYTEVNDLSYTI AIPQPTLTERLEARTCELCGKVGPVVMRHVRKLNQLKGKTECDRLMLEKHRKTLVVCEKC YAKIHNHAK >gi|222159303|gb|ACAB01000056.1| GENE 13 9901 - 10059 190 52 aa, chain - ## HITS:1 COG:no KEGG:BDI_3503 NR:ns ## KEGG: BDI_3503 # Name: not_defined # Def: DNA primase # Organism: P.distasonis # Pathway: not_defined # 1 52 1 52 312 76 63.0 3e-13 MNIEEAKSIQLEDYLRRMGFNPVKQQGDSIWYCSPFREEKTPSFKVSASRNL >gi|222159303|gb|ACAB01000056.1| GENE 14 10166 - 10357 123 63 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVAGRYAGMTDIADMPTYAYIAPFPHMQFAQVCIQRGKAARVPDFDVPAVTAAIPGFNDF PAS >gi|222159303|gb|ACAB01000056.1| GENE 15 10553 - 10684 58 43 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293373352|ref|ZP_06619710.1| ## NR: gi|293373352|ref|ZP_06619710.1| conserved domain protein [Bacteroides ovatus SD CMC 3f] # 1 43 1 43 43 73 100.0 3e-12 MRKLSLIVMFCHLLSCPAFAQSPSKGYKAGLSAVIYRIHSARQ >gi|222159303|gb|ACAB01000056.1| GENE 
16 10759 - 11898 1206 379 aa, chain + ## HITS:1 COG:no KEGG:PGN_0581 NR:ns ## KEGG: PGN_0581 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 4 320 1 338 494 132 31.0 2e-29 MNDVNLAPENKEATPEHGYLMAYDKKEQKAKGVKGIAANGELETLEANEANRDQFIKVDQ RGNFFTNFGKNFLYQYNNPGRYSLYNMPKETLVEQAKEKIEAAQEPQNEAVRRELASTRV YNNHRFNEREVNWEQAANYGITPDGLKNAKDSLERMLQGKTSAIAFRVAKNSELGRENGD AKLSLFRDENGAVKFDIHYIRQAPKIGEDYRGHVLTEEDLKALNQTGNLGKAVDVVIDYR TKETKSCYLSKDPVTNELFHMPVEQARIPRKVKDYTLSPKEYDAAVRGEEVPIRFKSDNG KFYATSIQMSAAERGVEFLWERSTKKLEEAQKQGQEQDGSQQQPHAPVQVAGKPRKKEEA SQQAEKKPRTRKPSITPKM >gi|222159303|gb|ACAB01000056.1| GENE 17 11904 - 13328 646 474 aa, chain + ## HITS:1 COG:no KEGG:BT_0017 NR:ns ## KEGG: BT_0017 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 178 348 32 211 315 66 26.0 3e-09 MATYISDDPKLLDELFRKDGEGQLLVGYETGKEKPHAESSYMLYPANPDRQDPVYTFMAL FSQQSIKAKYSAFVPNTRLEIYSFPKMTDVPAISGDISKKEYINQVLLPYIREKGLAPLI STNLRNVLFAQSRSDILMISGELPKLTTQQLDELVHFHQKQDELAARYDYNPVYKLPLHA VETSKGILFFSDTKMGREGLKSFYQQLSGNYFWVHGEPGPVRQYNVNCLSDDICPLVDAC YRKNPQSGKGEYDFDNAVFSKEAFRDRKQWKLAFETDMEPSASEFLRLNEFAGCPASRNN ADISKLLYLMENGFKRDIINDPDFGYRNVFQEYVTRIDDCINGQSSGPDLSDVLDDMRWK AKNILLTDFDVRGHRTLERTLNDRSVPFLINGTDAGEAMRQALLEGKWIYCPQISKSMPD LHFLHAEKTCNRVMAYTKSPVNKTVYQEKNGKIIPYVPALKKVSKTKRNNSLKM >gi|222159303|gb|ACAB01000056.1| GENE 18 13333 - 14148 589 271 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715497|ref|ZP_04545978.1| ## NR: gi|237715497|ref|ZP_04545978.1| predicted protein [Bacteroides sp. 
D1] # 1 271 1 271 271 556 100.0 1e-157 MKQPEQSYTAIETAHGFVFFTDTTEGQKNRQDFLQFMADHYFDPHFNLGPVNVYRAEGVL KDGSYVNPGEGLYPEYAYLQMDKTPEMELVYRNEMKPTWEDFGSFCHNMHCTSSHRNRNI ADILEEIESKDRKLLELSKQGTASDIRQQIEETGQDKALLDKLLKQYYDVRGHRTVGNIL RDPMECVTVDGVRLFTPHRQVLAAGHGLFLPGEAKSNPSHAYAWINGDFTRIVFSKDPPA NKQVFKVKTVIEKALNKKQDVKKKRNTHPKL >gi|222159303|gb|ACAB01000056.1| GENE 19 14153 - 15601 650 482 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715498|ref|ZP_04545979.1| ## NR: gi|237715498|ref|ZP_04545979.1| predicted protein [Bacteroides sp. D1] # 1 482 1 482 482 973 100.0 0 MDKRNQMENPFFDPDKPGSIFVGMDRYHQYSPHQPRNALTFIQKGDADSLFRKFLIDNIK EAECCPYIPDTELLRFDLANMRQVPPVDTHTPFEEYISKELLPYFQEHCIPPAKRISLRD AVYTYKYKNEPDGGILKKYLMQEPAYLEFRLQQQEKRTLYRCQPRYTFPLKVVENDFGYL IFSGNEIGRNGFRECIRYITDHYFDPHYDTGHLAVYDSTFMDKNLVPLIDAAYKPCKPME LDYSFDFYPASYIGLDELPKEFIDSLKPVCYHSMEATAGDFIKFATDWHFNKDTQVSISR ENHDIYRLLTVMRNGYMNIHEQPFTYFNELLPYAKEFEKVTQVKSAGEFDTGKFKRLSTE IRKAADGILKRDFDVRGHRSLENMLNDSTVTFTVGSRKLNEVQKTALASGYALYLPENNK EATRHLLFCKADFEQGRIEGSSKPFGVRTYVIKDGLLCPLPEEKNTVKKTENKNRHNNNR LK >gi|222159303|gb|ACAB01000056.1| GENE 20 15608 - 15919 329 103 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715499|ref|ZP_04545980.1| ## NR: gi|237715499|ref|ZP_04545980.1| predicted protein [Bacteroides sp. 
D1] # 1 103 1 103 103 187 100.0 2e-46 MIPKSQMYLGARIVENDPEEETPVVPYKGTVTAIEETGKGELDYFVYIRLDDESMKQKRI SLCCPDKIMTCFPWTIDLEEKQKMEQKMKNKKVPDPSKRHRLS >gi|222159303|gb|ACAB01000056.1| GENE 21 15949 - 17019 627 356 aa, chain + ## HITS:1 COG:no KEGG:BF3847 NR:ns ## KEGG: BF3847 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 139 338 27 241 301 70 24.0 6e-11 MSTYIGFNLNSNRQIEHFQTIENRYGINSDGGKFLFGQAELALKGSYIPKEEVYLIPYQG AVQPGNIERFIKDMTHNGGLSCATHFPLRDIAFVYENTSPYGIHNVDSIQRMLQKAKDNP LLKKQLNAYRAFHQEKEKDIYNRVITAINTNQGVLMFNDTGRGIQCAQKYLQHIGDNFFS PVYRDADKLQIYYFSTSNINLIKEASKCSNMFEHGLKKIYLPQKAHFLDSNMIANYTPAV ECSMAPSLECYNQLAEKLNLGKSQKNYNIGVLDRICKTGQIGNLEKDSRFNHQNSFVSLD ERIRLSYVGKQDGTLLKNALERTIKDTAKRILQTDYAVRGYEPPKQEKKKSRSITM >gi|222159303|gb|ACAB01000056.1| GENE 22 17040 - 18806 858 588 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715501|ref|ZP_04545982.1| ## NR: gi|237715501|ref|ZP_04545982.1| predicted protein [Bacteroides sp. D1] # 1 588 1 588 588 1150 100.0 0 MQKQEISNIMIFFVTQDLEGQPRQLEMHLMPEKEVSMMNQRFTEYLQRQREMYKPSLVQS HLPDLYLCRYQFPAGVSYPDIRLFDKDNSLVQKFITRNGGSMQGNVSLRGLEYLHSHDEE KSLPMLVASGLADHLLVQPEAKRFALAQDTLHDDPSETLTAVETAKGVLLFEYSGFGKTC CHAYMQHLADRFFITDEEKPEFVNLYKLTRPDAEVVKAFQASPNAFSLYTNSFLPEKAQY LDATILRNARLDRSHRIEPTFDAYDKFASSYNVLPSIANAQILRLLSLQETAGIYGIDYT TRRIPFIHKNSFNSQFNALQNIPAENKGGQEKVKSQIRDQAAYILKRDYGLIPDSLQNKE IDPIISLQTPKGAVYLPATDEGAIYKQCYLQYLADRFFTPEVQALGRIREFYISCPNHST EHYMQKHLDLFRSNPFYGQLAKMPLYPIEQSELLKKGGYPIEPTYHAFKQFTEDYRLSVT PENAEIFTLLFIREYGLPADFNTNESYKEFIHKGNFKPLDQEMSELQSKKGYSEKAFYNI QNRQQQLADKILGLRYRLTCPPLQLTGPAASEKRKTASRQNKSHNPRI >gi|222159303|gb|ACAB01000056.1| GENE 23 18818 - 20020 639 400 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715502|ref|ZP_04545983.1| ## NR: gi|237715502|ref|ZP_04545983.1| predicted protein [Bacteroides sp. 
D1] # 1 400 1 400 400 775 100.0 0 MPAWMEKIKDTFTAKADLKSLFIVLQPDNQVSTVMVVSYVPTDKDSFQVFMDLTARIAMD QKTIPDNLLLHFEGIPAKDIPFTSELPAKDSEKALSFIASYGGITENNSVPLRKAAYLRA CQRELTAENIRDLDYSPAYKCFIAHEDAMEKIAAGKQAKQFYTIAETEQGVRVFNDGLSG TIKFRDYLQSTADNFYSSSLQDVESLNIYRIETVSRRMLELSNGNQTTMPQAGMEILANY KPSVTFDMHPTGENLNRFVTAGALELSIRNRNIMTLQDIAARGYAHLPADESFAYKKDFL FVEKGIREITRQKELYRDYPFRQKMDELQNAARSLAQTLLNRDGVRKNYHRVSPPVVVSK KGEAQTAEKPQDKPGLSCGNEKKKVKTKTASVKKQAKPKL >gi|222159303|gb|ACAB01000056.1| GENE 24 20155 - 20892 441 245 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715503|ref|ZP_04545984.1| ## NR: gi|237715503|ref|ZP_04545984.1| predicted protein [Bacteroides sp. D1] # 1 233 46 278 290 451 100.0 1e-125 MDTYRFNTIRWSASNLHEPSWGKRIGLFFRTKNPIAILQDDYRSNRSNLLNKWETSSLNR IQMELVRMGSMPIQNKDRHTFKALNSIEKKIKQLQEYEGCPDMSHEKQLLATYKYILSSP FDRKFGGLPPDEICGMLQQNGLSESNLPYQNYEGILQGRETVMYELATGKNGEKYLQPAD QVKLNAGMSGISMDLISRFPAKEIPLPNLTESIQVSEKKEKLSIPKEPKKRVRKEKSKVK SVKIK >gi|222159303|gb|ACAB01000056.1| GENE 25 20919 - 21035 72 38 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294646527|ref|ZP_06724164.1| ## NR: gi|294646527|ref|ZP_06724164.1| conserved domain protein [Bacteroides ovatus SD CC 2a] # 1 38 1 38 38 68 100.0 8e-11 MKTICICEKPSMARSIARVLGVTEKQEGYLSGNGYAVT >gi|222159303|gb|ACAB01000056.1| GENE 26 21102 - 22994 1114 630 aa, chain + ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 2 564 66 643 709 326 37.0 7e-89 MLPIIPDPFKLVVRQIKTEDGYKADPTALKQLEVIRKLLGQADQVISCTDAGREGELIMR YVLEYLGYHKETKRLWISSMTEKSIREGFDSLKSSKEFDNLYRAAKARRESDWVVGMNAS LSLSMAAGKSNYSLGRVQTPALGMICRRYLDNRDFIAKPYYLLQLRTTKAGKELVLTCTG KYDTPEKLDVDRKKVYEETTAKVVQVEKKEVPEEAPLLYDLTALQRSANTKLGLTAEQTL NIAQKLYEGGYISYPRTGCSYITEDIFEQVPSLIGLLKQHPRFTWHAENLCNQPLNRHCV DDSKMTDHHALIITENYPQRLSLDEQNIYSMIAGRMLEAFSGKCLKETVSVQADCNGVLF GIKGSQIKVPGWRGIYNEPSEKEEGSLLPEFQEDEILPVLGIDTLVKKTKPQPIFTEASL 
LAAMEGCGRTLDDEKEKEAMEDSGLGTPATRAGIIELLIARHYVERNGRSLIPTPKGLEV YDIVKEKMIANVSMTGGWECALHEIETGKVSTETFTQSINSYTQQITSELLALKLNHPDL PHCNCPKCGAETIIVFNKVAKCSDPNCGFLLFRTFNGRELTDNQMLLLLSGKRTGYLKFT SKKGKKYEASLELDDNYRIEMTFKDNKPKK >gi|222159303|gb|ACAB01000056.1| GENE 27 23006 - 23224 305 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715505|ref|ZP_04545986.1| ## NR: gi|237715505|ref|ZP_04545986.1| predicted protein [Bacteroides sp. D1] # 1 72 1 72 72 119 100.0 6e-26 MKKEMEEIPDELNPDLMLNTIASELLIKIAKGEIDIQKLVRKQLSDRGIDDQRNWIGPDK ARKYWEKYKMPV >gi|222159303|gb|ACAB01000056.1| GENE 28 23259 - 23666 211 135 aa, chain - ## HITS:1 COG:no KEGG:BF0137 NR:ns ## KEGG: BF0137 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 131 1 133 140 89 36.0 4e-17 MAKLQLLAITTLDGYLFDRTVSSPLWDNPNKYGLTKIRERATQTLGPDVSFISLTQWKKK NEGIYFIEAAPDTISVISSMFRYWLVDEIILFVAPYIQGDGIRLFTEIPGPSSWEMTGNK CFRTGICRLAYKRIE >gi|222159303|gb|ACAB01000056.1| GENE 29 24205 - 24324 145 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MELTIIETNAYQDLKKQLSMLSVQMMDFQKKIAPVTPDK >gi|222159303|gb|ACAB01000056.1| GENE 30 24498 - 24809 331 103 aa, chain + ## HITS:1 COG:no KEGG:BF3840 NR:ns ## KEGG: BF3840 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 103 1 100 100 154 76.0 1e-36 MADIITKDSEEFKELTGWIKRTGKNLEAAAARIRPTIADEHYLSGDEVCRMLHVSKRTLQ TLRDEKAIPYTSITSVGGKLLYPESGLYEVLKKNYKDFRRYLK >gi|222159303|gb|ACAB01000056.1| GENE 31 24875 - 26104 664 409 aa, chain - ## HITS:1 COG:SSO0375 KEGG:ns NR:ns ## COG: SSO0375 COG0582 # Protein_GI_number: 15897309 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Sulfolobus solfataricus # 215 386 110 270 291 67 31.0 5e-11 MRSTFRVLFYTKNQSIKNGKVPVMGRITINKTTACFSCKKEVSISLWDAKANRAKGKSEE ARMLNQELDNAKAQIAKHYQYICDHDSFVTAKKVYSRYVGFKEDSHTLMELFREQLESYK EKVGKEKAKSTYLGLVADYKSALLFLKDKKNVEDIALDELDKDFIEDYYNWMLGTCGSAS STAFGRVNTMKWLMHIAQEKGLIKVHPFTGFGCKPGYKRRSFLSEEELQRLIHVELRYKR 
QQAMRDMLLFMCFTGLAFADLKAITYKNIHTDSDGGTWLMGNRIKTGVAYVVKLLPIAIE LVEKYKGDNKKKDSPDCVFPVGDYETMKSSFKVLGKKCDCNVNITPHIGRHTFAVLAILK GMPLETLQKVLGHKSILSTQVYAELINPKVGEDTDRMCDKIGSVYRLAN >gi|222159303|gb|ACAB01000056.1| GENE 32 26456 - 28132 1435 558 aa, chain - ## HITS:1 COG:CAP0114 KEGG:ns NR:ns ## COG: CAP0114 COG3507 # Protein_GI_number: 15004817 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Clostridium acetobutylicum # 18 556 22 529 531 195 30.0 2e-49 MKRLTQTLAFCLLTVFTAVAQKNYVSEVWVSDLGNGKYKNPVLYADYSDPDACRVGDDFY MTSSSFNCLPGLQILHSKDLVNWTIIGAAVPNALPPIETPERPEHGNRVWAPAIRHHNGE FYIFWGDPDQGAFMVKAKDPKGPWSEPVLVKAGKGIIDTCPFWDEDGKVYMVHAYAGSRA GLKSVITICELNAEANAAFTPSRIIFDGHEAHQTCEGPKMYKRNDYYYIFHPAGGVPTGW QVVLRSKNIYGPYEWKTVLAQGDSPINGPHQGAWVDTPTGEDWFLHFQDVGAYGRIMHLQ PMKWVNDWPVIGIDKDGDGCGEPVLTYKKPNVGQTYPICTPQESDEFDGYTLSPQWQWHA NINEKWAYYAGDKSYVRLYSYPVLKEYKNLWDVANLLLQKTPSDNFTTTMKLTFSPNPKL KGERTGLVVMGRDYAGIILENTDKGLILSQVECKRADKGKPEQANASVNLSQNTVYLKVR FSCDGKKIKGSEGGHDLIVMCNFSYSLDGKKYHPLGNPFQAREGQWIGAKVGMFCTRPAI VTNDGGWADVDWFRITKK >gi|222159303|gb|ACAB01000056.1| GENE 33 28287 - 29528 1222 413 aa, chain - ## HITS:1 COG:no KEGG:BT_4186 NR:ns ## KEGG: BT_4186 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 408 1 408 410 751 88.0 0 MKKNLFIFTFLLGAFSLSAQAQKQEKTITIEVQNNWNQVKADAPVVINLHELHAGFKVKS AVVMEGTKEIPSQLDDLNRDRKMDELVFVTDLPAHGRKIFQVTLSSEKSAKTYPERVYAD MFIVDNRKGKHQRVQAITVPGTSNIYSMVRPHGPVLESELVGYRLYFNEKQTPDIYGKFN KGLEIKESQFYPTDEQLAKGFGDDVLRVFDSCGPGALKGWDGQKATHITPVDTRTERIIS YGPVRVIAEIEVTGWKYQDQELDMMTRYTLYAGHRDLHIETFFDEPLNKEVFCTGVQDIV GTSKSFSDHKGLVGSWGTDWPVNDTVKYAKETVGLGTCIPQRYVKSEEKDKANFLYTITA PGNKYFQYHTTFTSMKETFGYKTPEAWFAHLREWKEELAHPVTVKIKDNRTNK >gi|222159303|gb|ACAB01000056.1| GENE 34 29681 - 31300 1317 539 aa, chain - ## HITS:1 COG:TM0437 KEGG:ns NR:ns ## COG: TM0437 COG5434 # Protein_GI_number: 15643203 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # 
Organism: Thermotoga maritima # 50 506 18 435 448 236 32.0 8e-62 MNTLTKRLFWMAISCLPFISSGCKQSETAVKESSISDALYQNLPFEMPKVQQPVFPAYEV NIEKFGAKGDGLFLNTKAINDAIKEVNQRGGGKVIIPEGIWLTGPIELLSNVNLYTEQNA LVLFTGDFEAYPIIATSFEGLETRRCQSPISARNAENIAITGYGTFDGNGDCWRPVKKGK LTASQWKKLVNSGGVLDEKQEIWYPTAGSLKGAMACKDFNVPEGINTDEEWAEIRPWLRP VLLSIVKSKKVLLEGVTFKNSPSWCLHPLSCEDFTVNNIMVINPWYSQNGDAIDLESCKN ALIINSVFDAGDDAICIKSGKDEDGRRRGEPCQNVIVKNNTVLHGHGGFVVGSEMSGGVK NIYVEDCTFMGTDVGLRFKSTRGRGGVVENIYINNINMINIPNEPLLFDLFYGGKGAGEE SEEDLLNRMKTSIPPVTEETPAFCNIHISNIVCRGSGRAMFFNGLPEMPISNITVKNVVM TEATDGVVISQVDGVTLENVYVESSKGNNILNVKNAKNLTVDGKVYEELGAKEEILSLK >gi|222159303|gb|ACAB01000056.1| GENE 35 31652 - 32509 1058 285 aa, chain + ## HITS:1 COG:BMEI1958 KEGG:ns NR:ns ## COG: BMEI1958 COG0623 # Protein_GI_number: 17988241 # Func_class: I Lipid transport and metabolism # Function: Enoyl-[acyl-carrier-protein] reductase (NADH) # Organism: Brucella melitensis # 5 260 4 257 272 127 34.0 2e-29 MSYNLLKGKRGIIFGALNDQSIAWKVAERAVEEGATITLSNTPMAIRMGEVDALAQKLNC QVIPADATSVEDLQNVFKTSMDILGGQIDFVLHSIGMSPNVRKKRTYDDLDYGMLDKTLD ISAVSFHKMIQSAKKLNAIADYGSIVALSYVAAQRTFYGYNDMADAKALLESIARSFGYI YGREHSVRVNTISQSPTFTTAGSGVKGMDKLFDFSNRMSPLGNATADECADYCIVMFSDL TRKVTMQNLFHDGGFSSVGMSLRAMATYEKGLDEYMDENGNIIYG >gi|222159303|gb|ACAB01000056.1| GENE 36 32555 - 33523 851 322 aa, chain + ## HITS:1 COG:no KEGG:BT_4189 NR:ns ## KEGG: BT_4189 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 320 14 333 335 583 88.0 1e-165 MLFASCGISTGKGMEQKEEEISVLRYDKLLSEYVRSNSFSAMQKLTMDYRMPTKILIEDV LSIGTVKDDTISQRLQKFYSDTTLVRLLSDVEAKYPNLDEVEKGLNKGFRKLKKEVPDTK VPFIYSQVSAFNESIILVDSLLGISLDKYMGEDYPLYKRFYYDYQCRSMRPERIVPDCFA FYLLSRYGMNYHEGTCLIDLMMHSGKINYVVQNLLGYSDIGEAMGYSKEENDWCKENEKE IWNYICTNDHLHARDPMVIRYYMKPAPAVDMLGAQAPALIGTWMGARIIASYMKKHKDMK LKDLLEFTDYHVMLSESNYLAS >gi|222159303|gb|ACAB01000056.1| GENE 37 33615 - 34319 509 234 aa, chain + ## HITS:1 COG:NMA0547 KEGG:ns NR:ns ## COG: NMA0547 COG0313 # Protein_GI_number: 15793541 
# Func_class: R General function prediction only # Function: Predicted methyltransferases # Organism: Neisseria meningitidis Z2491 # 1 233 6 239 241 211 47.0 1e-54 METALYLLPVTLGDTSIEKVLPSYNKEIISGIRYFIVEDVRSARRFLKKVDREIDIDALT FYPLNKHTSPDDISGYLQPLVGGASMGVISEAGCPAVADPGADVVAIAQRKKLKVVPLVG PSSIILSVMASGFNGQSFAFHGYLPIEPGERAKKLKALEQRVYVENQTQLFIETPYRNHK MVEDILLNCRPQTKLCIAANITCEGEYIQTRTVKDWKGHVPDLSKIPCIFLLYK >gi|222159303|gb|ACAB01000056.1| GENE 38 34285 - 35403 649 372 aa, chain - ## HITS:1 COG:no KEGG:BT_4191 NR:ns ## KEGG: BT_4191 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 372 1 373 373 603 86.0 1e-171 MAHKGTLSSAFNMSLGFIPVIISILLCELITQDTAIYIGTGIGIIGIYLSYRRKGLLIPN FILYIATGVLALLSVAALIPGDYVPPGALPLTLEVSILIPMLSLYMHKKRFINHFLKQIG SCNKRLYAQGAEAAVVSARIALIFGILHFIIISITIVFQDPLSKTSMFVLYKVLPPTVFL MSILFNQIAIRFFNHLMSHTEYVPIVNTKGDVIGKTPAIEAVNYKNAYINPVIRIAISTH GMLFLCDRPSTAILDRGKADIPMECYLRYGESLEAGATRLINNAFPHEKGIKPEFNIVYH FENEVTNRLIYLFIVDIKDDSILCTPRFKNSKLWNFKQIEENLGKGFFSSCFEDEYEHLK GVIYIREKYRES >gi|222159303|gb|ACAB01000056.1| GENE 39 35467 - 36315 442 282 aa, chain - ## HITS:1 COG:SA0785 KEGG:ns NR:ns ## COG: SA0785 COG0320 # Protein_GI_number: 15926513 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate synthase # Organism: Staphylococcus aureus N315 # 5 281 9 288 305 290 49.0 2e-78 MTDRVRKPEWLKINIGANERYTETKRIVDSHCLHTICSSGRCPNMGECWGKGTATFMIGG DICTRSCKFCNTQTGRPHPLDANEPTHVAESIALMKLDHAVITSVDRDDLPDLGATHWAR TIQEIKRLNPQTTIEVLIPDFQGRMELLDLVIEARPDIISHNMETVRRISPLVRSATNYD TSLQVIKHISEKGVKSKSGIMVGLGETPEEVETLMDDLLATGCQILTIGQYLQPSHRHYP VAAYITPQQFAEYKTVGLEKGFNIVESAPLVRSSYHAEKHIR >gi|222159303|gb|ACAB01000056.1| GENE 40 36315 - 38525 1733 736 aa, chain - ## HITS:1 COG:CC2154 KEGG:ns NR:ns ## COG: CC2154 COG1506 # Protein_GI_number: 16126393 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Caulobacter vibrioides # 138 721 141 719 738 306 32.0 1e-82 
MKKISITLLLCLLCLTGMAQGQKALDLKDITSGRFRPENIQGVIPMPDGEHYTQMNADGT QIIKYSFKTGEKVEVIFDVNTTRECDFKNFDSYQFSPDGQKLLIATKTTPIYRHSYTAVH YIYPLKRNDKGVTTNNIIERLSDGGPQQVPVFSPDGTMIAFVRNNNIFLVKLLYGNSESQ VTEDGKQNSVINGIPDWVYEEEFGFDRALEFSADNTLIAFIRFDESEVPSYSFPVFAGQA PRIDALKDYPGEYTYKYPKAGYPNSKVEVRTYDIKSHVTRTMKLPLDADGYIPRIRFTKD ANKLAIMTLNRHQDRFDLYFADPRSTLCKLILRDESPYYIKENIFDNIQFYPEYFSLLSE RDGYSHLYWYSMGGNLIKKVTNGKFEVKDFLGYDEEDGSFYYTSNEESPLRKAVYKIDKK GKKLKLSQQVGTNTPLFSKSMKYYMNKFSNLNTPMLVTLNDNSGKTLKTLITNDGLKQTL SRYAVPQKEFFTFQTTDGVKLNGWMMKPVNFSASKKYPVLMYQYSGPGSQQVLDTWGISW ETYMASLGYIVVCVDGRGTGGRGEAFEKCTYLKIGVKEAKDQVETALYLGKQPYVDKDRI GIWGWSYGGYMTLMSMSEGTPVFKAGVAVAAPTDWRFYDTIYTERFMRTPKENAEGYKES SAFTRADKLHGNLLLVHGMADDNVHFQNCAEYAEQLVQLGKQFDMQVYTNRNHGIYGGNT RQHLYTRLTNFFLNNL >gi|222159303|gb|ACAB01000056.1| GENE 41 38680 - 38982 187 100 aa, chain + ## HITS:1 COG:no KEGG:BT_4196 NR:ns ## KEGG: BT_4196 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 98 1 98 100 166 89.0 2e-40 MNSKRKDLQYASVFLLAVALTFLMGKGDNLWLVSWGDLIPSLFVLFIAGDCLHSSLLRIK RGEEEGGARWSTCFTFLVFSIIFMGDLFFIGNFVVNKLWS >gi|222159303|gb|ACAB01000056.1| GENE 42 39044 - 39940 993 298 aa, chain - ## HITS:1 COG:BH2366 KEGG:ns NR:ns ## COG: BH2366 COG0324 # Protein_GI_number: 15614929 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Bacillus halodurans # 4 282 5 287 314 190 36.0 3e-48 MPTLIVLIGPTGVGKTELSLRLAETFQTSIVSADSRQLYAELKIGTAAPTPDQLKRVPHQ LVGTLHLTDYYSAAQYETEALEILEKLFTQHEVVILTGGSMMYVDAICKGIDDIPTVDAE TRQLMLQKYEEEGLEQLCAELRLLDPEYYRIVDLKNPKRVIHALEICYMTGRTYTSFRTQ QKKQRPFRILKVGLTRDREELYDRINRRVDQMMEEGLLEEVRSVLPYRHLNSLNTVGYKE LFKYLDGEWELSFAIDKIKQNSRIYSRKQMTWFKRDEEIKWFHPEQETEILACLRQSL >gi|222159303|gb|ACAB01000056.1| GENE 43 40062 - 40616 624 184 aa, chain - ## HITS:1 COG:no KEGG:BT_4204 NR:ns ## KEGG: BT_4204 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined 
# 1 184 1 184 184 283 91.0 2e-75 MIYRFTIISDEVDDFVREIQIDPEATFYDFHEAILKSVGYANDQMTSFFICDDDWEKGKE VTLEEMDDNPEIDSWVMKDTAISELVEDEKQKLLYVFDYITERCFFIELSEIITGKDMNG AKCTKKSGDAPKQTVDFEEMTAAGGSLDLDENFYGDQDFDMEDFDQEGFDIGGDTSSPYE EEKF >gi|222159303|gb|ACAB01000056.1| GENE 44 40638 - 41405 817 255 aa, chain - ## HITS:1 COG:VC2248 KEGG:ns NR:ns ## COG: VC2248 COG1043 # Protein_GI_number: 15642246 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase # Organism: Vibrio cholerae # 1 255 1 262 262 207 42.0 1e-53 MISPLAYIHPEAKIGENVEIAPFVFIDKNVVIGDNNKIMANANILYGSRIGNGNTIFPGA VIGAIPQDLKFRGEESTAEIGDNNLIRENVTINRGTAAKGRTIVGNNNLLMEGVHVAHDA LVGNGCIIGNSTKMAGEIVIDDNAIVSANVLMHQFCHVGSHVMIQGGCRFSKDIPPYIIA GREPIAFSGINIIGLRRRGFSNEVIESIHNAYRIIYQSGLNTTEALKKIEDEFEKSPEID YIVNFIRNSERGIIK >gi|222159303|gb|ACAB01000056.1| GENE 45 41424 - 42809 1437 461 aa, chain - ## HITS:1 COG:XF0803 KEGG:ns NR:ns ## COG: XF0803 COG0774 # Protein_GI_number: 15837405 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-acyl-N-acetylglucosamine deacetylase # Organism: Xylella fastidiosa 9a5c # 1 319 1 297 304 180 34.0 4e-45 MLKQKTLKDSFSLSGKGLHTGLDLTVTFNPAPDNHGYKIQRIDVEGQPTIDAVADNVTET TRGTVLSKNGVKVSTIEHGMAALYALGIDNCLIQVNGPEFPILDGSAQYYVQEIERVGTV EQNAVKDFYIIKSKIEFRDETTGSSIIVLPDENFSLNVLVSYDSTIIPNQFATLEDMHNF KDEVAASRTFVFVREIEPLLSAGLIKGGDLDNAIVIYERKMSQESFDKLADVMGVPHMDA DQLGYINHKPLVWPNECARHKLLDVIGDLALIGKPIKGRIIATRPGHTINNKFARQMRKE IRLHEIQAPSYDCNREPIMDVNRIRELLPHRYPFQLVDKVIEIGANYIVGIKNITANEPF FQGHFPQEPVMPGVLQVEAMAQVGGLLVLNSVDDPERYSTYFMKIDGVKFRQKVVPGDTI IFRVELLAPIRRGISTMKGYAFVGEKVVCEAEFMAQIVKNK >gi|222159303|gb|ACAB01000056.1| GENE 46 42821 - 43861 1017 346 aa, chain - ## HITS:1 COG:FN1909 KEGG:ns NR:ns ## COG: FN1909 COG1044 # Protein_GI_number: 19705214 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase # Organism: Fusobacterium nucleatum # 1 336 1 331 
332 224 36.0 2e-58 MEFSAKQIAAFIQGEIIGDENATVHTFAKIEEGMPGAISFLSNPKYTPYIYETQSSIVLV NKDFTPEHEIKATLIRVDNAYESLAKLLNLYEMSKPKKQGIDSLAFVAPSAKIGENVYIG AFAYIGENTVIGDNTQIYPHTFVGDGVKIGNSCLLYSNVNVYHDCRIGNECILHSGAVIG ADGFGFAPTPNGYDKIPQIGIVILEDKVDIGANTCVDRATMGATVVHSGVKLDNLIQIAH NDEIGSHTVMAAQAGIAGSTKVGEWCMIGGQVGIAGHAKIGDKVGLGAQSGVPGDIKSGS QLIGTPPMELKQYFKSSIAQRSLPDMQKELRNLRKEVEELKQLLNK >gi|222159303|gb|ACAB01000056.1| GENE 47 43944 - 45173 887 409 aa, chain - ## HITS:1 COG:BS_ywfO KEGG:ns NR:ns ## COG: BS_ywfO COG1078 # Protein_GI_number: 16080812 # Func_class: R General function prediction only # Function: HD superfamily phosphohydrolases # Organism: Bacillus subtilis # 4 406 10 410 433 191 28.0 3e-48 MPYERKIINDPVFGFINIPKGLLYDIVRHPLLQRLTRIKQVGLSSVVYPGAQHTRFQHSL GAFHLMSEAITQLASKGNFIFDSEAEAVQAAILLHDIGHGPFSHVLEDTIVKGVSHEEIS LMLMERMNKEMNGQLSLAIQIFKDEYPKRFLHQLVSGQLDMDRLDYLRRDSFYTGVTEGN IGSARIIKMLDVADDRLVVESKGIYSIENFLTARRLMYWQVYLHKTSVAYEKMLISTLLR AKELASQGVELFASPALRFFLYNDINPTEFYNNPDCLENFIQLDDNDIWTALKVWSTHTD KVLSTLSTGMINRNIFKVEISSEPISEDRKKELTLHISQQLGITLSEANYFVSTPSIEKN MYDPADDSIDIIYKDGTIKNIAEASDMLNISLLSKKVKKYYLCYQRLHR >gi|222159303|gb|ACAB01000056.1| GENE 48 45204 - 46028 869 274 aa, chain - ## HITS:1 COG:RSc2773 KEGG:ns NR:ns ## COG: RSc2773 COG0284 # Protein_GI_number: 17547492 # Func_class: F Nucleotide transport and metabolism # Function: Orotidine-5'-phosphate decarboxylase # Organism: Ralstonia solanacearum # 4 268 22 285 288 201 40.0 2e-51 MDKQQLFENIKRKKSFLCVGLDTDIKKIPEHLLKEEDPIFAFNEAIIDATADLCIAYKPN LAFYESMGVKGWIAFEKTVKYIKDNYPDQFIIADAKRGDIGNTSAMYARTFFEELDIDSV TVAPYMGEDSVTPFLTYEGKWVILLALTSNKGSHDFQLTEDVNGERLFEKVLRKSQEWAG DDRMMYVVGATQGRAFEDIRKIVPNHFLLVPGVGAQGGSLEEVCKYGMNSTCGLIVNSSR GIIYVDKTEKFAEAARTAAQEVQAQMAEQLKAIL >gi|222159303|gb|ACAB01000056.1| GENE 49 46111 - 47223 1150 370 aa, chain - ## HITS:1 COG:VC2179 KEGG:ns NR:ns ## COG: VC2179 COG0216 # Protein_GI_number: 15642178 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor A # 
Organism: Vibrio cholerae # 5 365 3 355 362 332 47.0 1e-90 MADNSTILEKLDGLVARFEEISTLITDPAVIADQKRYVKLTKEYKELDDLMKARKEYIQL LGNIEEAKNILANESDAEMREMAKEEMDNSQERLPVLEEEIKLMLVPADPQDSKNAILEI RGGAGGDEAAIFAGDLFRMYAKFCETKGWKMEVSNANEGTAGGFKEIVCSVTGDNVYGIL KYESGVHRVQRVPATETQGRVHTSAASVAVLPEAEEFDVVINEGEIKWDTFRSGGAGGQN VNKVESGVRLRYIWKNPNTGIAEEILIECTETRDQPKNKERALARLRTFIYDKEHQKYID DIASKRKTMVSTGDRSAKIRTYNYPQGRITDHRINYTIYNLAAFMDGDIQDCIDHLIVAE NAERLKESEL >gi|222159303|gb|ACAB01000056.1| GENE 50 47227 - 48393 1119 388 aa, chain - ## HITS:1 COG:MJ0203 KEGG:ns NR:ns ## COG: MJ0203 COG0150 # Protein_GI_number: 15668375 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazole (AIR) synthetase # Organism: Methanococcus jannaschii # 47 369 49 323 350 116 30.0 7e-26 MSNQRYMMRGVSASKEDVHNAIKNIDKGIFPKAFCKIIPDILGGDPEYCNIMHADGAGTK SSLAYMYWKETGDLSVWKGIAQDALIMNIDDLLCVGAVDNILVSSTIGRNKLLIPGEVIS AIINGTDELLAELREMGVGVYATGGETADVGDLVRTIIVDSTVTCRMKRSDVIDNANIRP GDVIVGLASYGQATYEKEYNGGMGSNGLTSARHDVFGKYLAEKYPESYDAAVPEELVYSG KLKLTDSVEDSPINAGKLVLSPTRTYAPVVKKLLDALRPEIHGMVHCSGGAQTKVLHFVE NVRVVKDNLFPVPPLFKTIQEQSGTDWAEMYKVFNMGHRLEVYLSPEHAEKVIAISESFG IPAQIVGRVEACEQTELIIKSEFGEFRY >gi|222159303|gb|ACAB01000056.1| GENE 51 48664 - 49245 779 193 aa, chain - ## HITS:1 COG:PM0785 KEGG:ns NR:ns ## COG: PM0785 COG1704 # Protein_GI_number: 15602650 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Pasteurella multocida # 1 193 1 192 193 196 55.0 2e-50 MKKSIIIILAVVVILVIWAVSVYNGLVTMDENVSGQWANVETQYQRRADLIPNLVNTVKG YATHEKETLEGVVAARSQATQIKVDAADLTPEKLAQYQKAQGAVTSALGKLLAITENYPD LKANQNFLELQAQLEGTENRINVARKNFNDAAQAYNTNIRRFPKNIFAGMFGFDKKAYFE AEEGSEKAPKVEF >gi|222159303|gb|ACAB01000056.1| GENE 52 49254 - 49574 173 106 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237715530|ref|ZP_04546011.1| ## NR: gi|237715530|ref|ZP_04546011.1| predicted protein [Bacteroides sp. 
D1] # 1 106 1 106 106 181 100.0 1e-44 MTNIVWTQSALQTLELVYLQTLQYTKNERIAAKLHNKLIKEAEMLRTFPNAGNILNTSER VTLCYRALVVDTNYKLIYYVDANKDVIIVTVWDVRQNPDKLTKTIE >gi|222159303|gb|ACAB01000056.1| GENE 53 49571 - 49870 182 99 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237715531|ref|ZP_04546012.1| ## NR: gi|237715531|ref|ZP_04546012.1| predicted protein [Bacteroides sp. D1] # 1 99 1 99 99 137 100.0 3e-31 MKENKEYQEKKEQEKVVHEPTGSYGMDTITMRQKIMERVMLMEENQLKEMLRFSNELEQK SWLLPHTQEELEIAINRGMEDVKAGRIVSHEDVLKYYMK >gi|222159303|gb|ACAB01000056.1| GENE 54 49942 - 50865 635 307 aa, chain - ## HITS:1 COG:TM0962 KEGG:ns NR:ns ## COG: TM0962 COG1512 # Protein_GI_number: 15643722 # Func_class: R General function prediction only # Function: Beta-propeller domains of methanol dehydrogenase type # Organism: Thermotoga maritima # 4 161 6 150 238 88 32.0 2e-17 MKSILTFILATFLLFPLQAQEKVYTVDNLPKVHLQNKMQYVCNPAGILSQAACDSIDSML YALEQQTGIETVVAVVPSIGEEDCFDFCHQLLNKWGVGKKGKNNGLVILLVTDQRCIQFY TGYGLEGVLPDAICKRIQTRYMIPYLKDGNWDAGMVAGLKATCQRLDGSMENDALSDSND GGSFDFILAILCFIAIGGGLAFFSARKQSRCPNCGKHQLQRSGSRVVSRINGVKTEDVTY TCRNCGHTIIRRQQSYDNDYHHRGGGGGGPFIGGFGGGRGGFGGGGGFGGGSFGGGMGGG GGAGSRF >gi|222159303|gb|ACAB01000056.1| GENE 55 50938 - 51888 650 316 aa, chain - ## HITS:1 COG:SPy1892 KEGG:ns NR:ns ## COG: SPy1892 COG1073 # Protein_GI_number: 15675706 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Streptococcus pyogenes M1 GAS # 9 313 12 305 308 216 39.0 4e-56 MRRKVVYSIIFIMLALTGCTIGGSFYMLNFSLTPNGKILSKDADSYPFMYKNYPFLRPWV DSLKQVDALKDTFIINPHGIQLHAYYVAAPQPTSKTAVIVHGYTDNAIRMFMIGYLYNRD LGYNILLPDLQHQGESEGPAIQMGWKDRWDVLQWMNIANEIFGDSTQMVVHGISMGGATT MMVSGEEQKPFVKCFVEDCGYTSVWDEFSHELKTSFHLPPFPLMYTTSWLCEKKYGWNFK EASSLKQVAKSQLPMLFIHGDKDTYVPTWMVYPLYEAKPEPKELWIVPGAAHAVSYQENK QEYTDRVRAFVGRYIH >gi|222159303|gb|ACAB01000056.1| GENE 56 51908 - 52654 775 248 aa, chain - ## HITS:1 COG:MK0117 KEGG:ns NR:ns ## COG: MK0117 COG0169 # Protein_GI_number: 20093557 # Func_class: E Amino acid 
transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Methanopyrus kandleri AV19 # 5 246 15 271 290 138 33.0 8e-33 MEKYGLIGYPLRHSFSIGYFNEKFKSEGINAEYVNFEIPSINNFMEVIEENPNLCGLNVT IPYKEQVIPFLDELDRDTAKIGAVNVIKIIRQPKGKVKLVGYNSDIIGFTQSIQPLLQPH HKKALILGTGGASKAVYHGLKNLGIESIFVSRTHKADDMLTYEELTPEIMAEYTVIVNCT PVGMFPKVDFCPNIPYELLTPNHLLYDLLYNPNVTLFMKKGEAQGAVVKNGLEMLLLQAF AAWEIWHK >gi|222159303|gb|ACAB01000056.1| GENE 57 52694 - 53431 557 245 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163754278|ref|ZP_02161401.1| 30S ribosomal protein S15 [Kordia algicida OT-1] # 24 245 1 221 221 219 47 4e-56 MDYPQQHIKPYDEEGKKTEQVERMFDNIAHAYDKLNHTLSLGIDRSWRKKAIAWLRPFQP QRMMDVATGTGDFAILACRKLQPAELIGTDISEGMMNVGREKVKKEGLSDKISFAREDCT SLSFADNDFDAITVAFGIRNFEDLDKGLSEMCRVLKPGGHLVILELTTPDRFPMKQLFSI YSKVVIPLLGKLLSKDNSAYRYLPDTIKVFPQGEVMKGVIARAGFSEVNFKRLTFGICTL YTATK >gi|222159303|gb|ACAB01000056.1| GENE 58 53593 - 54537 1137 314 aa, chain - ## HITS:1 COG:CC3242 KEGG:ns NR:ns ## COG: CC3242 COG0152 # Protein_GI_number: 16127472 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase # Organism: Caulobacter vibrioides # 10 312 13 318 320 271 45.0 1e-72 MKALTKTDFNFPGQKSVYHGKVRDVYNINGEKLVMVATDRISAFDVVLPEGIPYKGQMLN QIAAKFLDATTDICPNWKMATPDPMVTVGVLCEGFPVEMIVRGYLCGSAWRAYKSGVREI CGVKLPDGMRENQKFPEPIVTPTTKAEMGLHDEDISKEEILKQGLATPEEYEILEKYTLA LFKRGTEIAAERGLILVDTKYEFGKHNGTIYLMDEIHTPDSSRYFYAEGYQERFEKGEPQ KQLSKEFVREWLMENGFQGKDGQKVPEMTPAIVQSISDRYIELFENITGEKFVKEDTTNI AERIFKNVETFLNR >gi|222159303|gb|ACAB01000056.1| GENE 59 54551 - 55498 1035 315 aa, chain - ## HITS:1 COG:DR1988 KEGG:ns NR:ns ## COG: DR1988 COG1702 # Protein_GI_number: 15806986 # Func_class: T Signal transduction mechanisms # Function: Phosphate starvation-inducible protein PhoH, predicted ATPase # Organism: Deinococcus radiodurans # 18 307 65 354 380 267 47.0 2e-71 MIEKLIVLEDIDPVIFYGVNNANIQLIKALYPKLRIVARGNVIKVLGDEEEMCAFEENIT 
KLEKYCAEYNSLKEEVIIDIIKGNAPQAEQTGNVIVFSVTGKPIIPRSENQLKLVEGFAK NDMVFAIGPAGSGKTYTAIALAVRALKNKEIKKIILSRPAVEAGEKLGFLPGDMKDKIDP YLQPLYDALQDMIPAAKLKEYMELNIIQIAPLAFMRGRTLNDAVVILDEAQNTTAQQIKM FLTRMGMNTKMIVTGDMTQIDLPASQTSGLVQALRILKGVKGISFVELNKKDIVRHKLVE RIVDAYEKFDKEAKF >gi|222159303|gb|ACAB01000056.1| GENE 60 55600 - 56277 626 225 aa, chain - ## HITS:1 COG:no KEGG:BT_4219 NR:ns ## KEGG: BT_4219 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 225 1 225 225 392 84.0 1e-108 MTIYKLRILPLFLLVIGLTTLTSGCKKKDMSLKLNEPRNIRGVVSYKRSFPDLNDAHLEV AKKIGIRPLADREAAEDMKEKLTHITDNEFYVVDSLTHSIPYLVPRASALLDTIGSNFLD SLAAKGLNPNQVIITSVLRTENDVKRLRRRNGNASANSAHCFGATFDVSWKRFKKVEDKD GRPLQDVSADTLKLVLSEVLRDLRQAEKCYIKYELKQGCFHITAR >gi|222159303|gb|ACAB01000056.1| GENE 61 56360 - 57220 422 286 aa, chain - ## HITS:1 COG:no KEGG:BVU_1438 NR:ns ## KEGG: BVU_1438 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 10 286 3 277 277 169 36.0 8e-41 MKTKHLPGALLSFLYTIIFFSCGNNDEKKIYSLSFEKEYYERPLLGTTNITITGGNRDYT VTVEKTDILNIDVDLSSSIGMGSLRVTPKKKGETKVKVKDNITNETVDLRIKITDSYLAY AIEKSNHPALSNGTVVYLINNEAKECYFFRYIEFRDELSHTPIAKGTYDFFTKLESGSGN SSPTYAIPYLTLNYTSDEQENFTDASVPPTPHKLRFELYDGVTSANAVLNLISRFLGVDW KELVEKALTRSEHTIVPTLKTTIDNTDYTIIGILNTYPEIPENILE >gi|222159303|gb|ACAB01000056.1| GENE 62 57159 - 57413 67 84 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237715540|ref|ZP_04546021.1| ## NR: gi|237715540|ref|ZP_04546021.1| predicted protein [Bacteroides sp. 
D1] # 18 84 1 67 67 108 100.0 9e-23 MCVMFSGKVAYFYKPLQMYENSWRTRNKYTIYFNTLIKSLYLPPTSIQELIKLYLSCKTI KQNNHEDKTFTWSTTVISLYNYLF >gi|222159303|gb|ACAB01000056.1| GENE 63 57412 - 57849 334 145 aa, chain + ## HITS:1 COG:no KEGG:BF3551 NR:ns ## KEGG: BF3551 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 143 1 143 144 137 45.0 1e-31 MKTDFDYNSMPVSFAHCLNGHCLRADKCLRRQVTLRMPKERAAVMVINPEHVTSDGEDCT YFIDEKPVLFARGMKHLLDRVPLADATVIKRQMIAYFGKTIYYRCCNKERLIKPKEQKYI QELFRKRGVTETPQYDEYIEYYDLG >gi|222159303|gb|ACAB01000056.1| GENE 64 58136 - 58525 248 129 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|298484496|ref|ZP_07002647.1| ## NR: gi|298484496|ref|ZP_07002647.1| hypothetical protein HMPREF0106_04950 [Bacteroides sp. D22] # 1 129 1 129 129 224 98.0 2e-57 MTAHDLYTRTLSDYKSCLCSSPALTLRSYCRERHVNYHGMCLWLSRHGISIRELRPSSSV TTAKEPASSFSRLLPVCSSPSSSPDMLYGVTITFPDGVTVSIKQGSSFSVSRFIDRYNSK IQEEESCLL >gi|222159303|gb|ACAB01000056.1| GENE 65 58513 - 58863 207 116 aa, chain + ## HITS:1 COG:SMc03279 KEGG:ns NR:ns ## COG: SMc03279 COG3436 # Protein_GI_number: 15966878 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Sinorhizobium meliloti # 6 108 7 107 118 65 35.0 3e-11 MFALTDGMHYYLCQHYVDMRKGMTGLYRLVKSDMSLSPVSGDVFVFFSRKRDMVKILRWD TDGFILYQKRLEEGTFEVPRFNPDTGSYELSWKTFLLIMQGVSLRSAKCRKRFRLK >gi|222159303|gb|ACAB01000056.1| GENE 66 58958 - 60559 622 533 aa, chain + ## HITS:1 COG:ECs3866 KEGG:ns NR:ns ## COG: ECs3866 COG3436 # Protein_GI_number: 15833120 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 178 521 119 461 463 108 27.0 4e-23 MQKDRLIELLEDQIRSLKEQLEAAARREEHGQLLICELTMQVRQLTESVHSLEEALTLKN GKLKKQENISRGLGKLINNESEKQVPGKAEAVRQTAPSRPCPSPKERGNNKSKRKEHFEL EEKVIEVNPEHPLFSISLAKFMGYRDSIRYVYTPPKFEKLVYRQNIYSLNGTVFCGSAPQ APFLNSNYDGSFVAGLCQLRYIYSMSVERIIGFFRESGFELEKPTAHHLLGRAAEVLENL 
YRALRMAVLEDSYLCCDESYHKVLVEEKNSRGKGVRQGYIWAAVAVKQKLILYLYENGSR SGKVLFDLLSEYEGTVQSDAYSPYRKLESDAYPGIKRIACLQHVKRKFLEVQEEPEAQKI VELTNKLYQKEHEHCVGRQGWTDKDNLRHRKRYAPQILSEIKRELLRIKSKPDLLPKSEM AGAVDYMLSQWEAIKGIFTEGYYYLDNNLVERYNRYISLSRRNSLFFGSHKGAERGALFY SLACSCRMQGINTFEYITEVINKAAKLPPNTDIKVYRNLLPDKWKENRSRIET >gi|222159303|gb|ACAB01000056.1| GENE 67 61017 - 62147 619 376 aa, chain - ## HITS:1 COG:MA1854 KEGG:ns NR:ns ## COG: MA1854 COG1672 # Protein_GI_number: 20090704 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Methanosarcina acetivorans str.C2A # 1 376 1 388 390 91 24.0 2e-18 MKPYNPFLVYGYNSPEYFCDREKETDKMISALLNERNLTLISPRRMGKTGLIKNVFYQMK RENNPNAAYFYMDIYSTRDLKAFIQLLAQNVLGELDTLSQNILRQMTAFFKSCRPIISAD ERSGMPTVTLDFAPTHAEQTLKEILNYMVASKKQCYLAIDEFQQITEYPEKGVEALLRSH IQFMPNVHFIFSGSKKHVMEEMFTSAKRPFYQSTQIIVLTEIPLENYYSFAHSFFAKEKR ELTLETFSYLYQLENGHTWYVQSILNRLYEKKINPIDNRLVDRCINDILDEQETIYQSNL TLLTNNQVDLLKAIATEGCIKSINANDFIKKHHLKTPSSVNVALKSLLNKELIYNTPDGY IVYDRFFGKWLKDAVI >gi|222159303|gb|ACAB01000056.1| GENE 68 62305 - 63957 2004 550 aa, chain - ## HITS:1 COG:BS_yuaG KEGG:ns NR:ns ## COG: BS_yuaG COG2268 # Protein_GI_number: 16080153 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 1 369 1 377 509 113 28.0 9e-25 MTQEMLIMAAILVAVILLTFIGILSRYRKCKSDEVLVVYGKTGGDKKSAKLYHGGAAFVW PIIQGYEFLSMKPMQIECKLTGALSAQNIRVDVPTTITVAISTDPEVMQNAAERMLGLTM DDKQNLITDVVYGQMRMVIADMTIEELNSDRDKFLAKVKDNIDTELRKFGLYLMNINISD IRDAANYIVNLGKEAESKALNEAQANIEEQEKLGAIKIANQIKERETKVAETRKDQDIAI AETKKQQEISVANADKDRISQVAIANAEKESQVAKAEAEKNIRIEQANTEKESRVAELNS DMEIKQAEAAKKAAIGRNDAQKEVALSNAELAVTQANADKQAGEAAAKSEAAVQTAREIA QKEVEEAKARKVESSLKAEKIVPAEIARQEAILQANAIAEKITREAEARAKATLAQAEAE AKAIQMKLEAEAEGKKRSLLAEAEGFEAMVRAAESNPAIAIQYKMVDQWKEIAGEQVKAF EHMNLGNITVFDGGNGGTSNFLNTLVKTVAPSLGVLDKLPIGETVKGIINPESKTEEKPA TKADEKKDKK >gi|222159303|gb|ACAB01000056.1| GENE 69 63978 - 64475 444 165 aa, chain - ## HITS:1 COG:no 
KEGG:BT_4221 NR:ns ## KEGG: BT_4221 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 165 1 165 165 246 89.0 2e-64 MSSTIFLIIALVTTGIFLIQFVLSIFFGDIDADVDVDADISSVVSFKGLTHFGIGFGWYM YLAGNTEMQSYVIGILVGLFFVFAVWFLYKKAYQLQQVNHNEQTDQLVGRECVIYFKQSD SKYTVQTTRDGAMREVDVISESGKAYQTGDRTMITSYKDGTLFIQ >gi|222159303|gb|ACAB01000056.1| GENE 70 64604 - 65353 571 249 aa, chain + ## HITS:1 COG:CAC3099 KEGG:ns NR:ns ## COG: CAC3099 COG0101 # Protein_GI_number: 15896350 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Clostridium acetobutylicum # 1 243 1 244 244 157 37.0 2e-38 MQRYFIYLAYDGTNYHGWQIQPNGISVQECLMKALSTFLRREIEVIGAGRTDAGVHASLM VAHFDFDELLDEVSVADKLNRLLPPDISVYRVCRVNPDAHARFDATARTYKYYVTTSKYP FNRQYRWRVYNQLDYALMNEAARTLFEYTDFTSFSKLHTDVKTNICHITHAEWTQEDDAT WVFTIRADRFLRNMVRAIVGTLIEVGRGKLTVEGFRRVIEQQDRCKAGTSAPGQALFLVN VEYPESIFE >gi|222159303|gb|ACAB01000056.1| GENE 71 65372 - 66274 816 300 aa, chain + ## HITS:1 COG:CAC1984 KEGG:ns NR:ns ## COG: CAC1984 COG0697 # Protein_GI_number: 15895255 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Clostridium acetobutylicum # 2 296 4 283 285 130 31.0 4e-30 MWLLLAFLSATLLGFYDVFKKKSLKDNAVLPVLFLNTFFSSLIFLPFILISVYKPDLLGG TIFNVPVVGWEQHKYIIIKSFIVLSSWIFGYFGMKHLPLTIVGPINATRPVMVLVGAMLV FGERLNLYQWIGVMLAIVSFFMLSRSGKKEGIDFKHNKWIFFIVLAAITGAISGLYDKYL MKSLNPMLVQSWYNVYQVFIMCPILLLLWCPKRKSTTPFRWDWTIILISIFLSAADFVYF YALSYDDSMISIVSMVRRGSVVVSFTFGALLFREKNLKSKAIDLILVLIGMIFLYLGSKN >gi|222159303|gb|ACAB01000056.1| GENE 72 66320 - 66970 575 216 aa, chain + ## HITS:1 COG:no KEGG:BT_4239 NR:ns ## KEGG: BT_4239 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 216 1 216 216 335 81.0 5e-91 
MKINKLVITIFFSAVGLFASTVLWAQEAKTLFVNMPDSLSPLLTKVNREDCIDFLESKMK AQVENRFGKKSEMTDLSKDYIRMQMSSQSTWQMKVLALNDSTNVICTVSTACAPACDSSI RFYTDDWKPLTASLFITLPVMGDFLNTPDSAGVYEFDEARRSADILLMKADFNKENTELT VTLATPDYMSTETAEKLKPFLRRPIVYHWKNGTFTK >gi|222159303|gb|ACAB01000056.1| GENE 73 67002 - 68090 1091 362 aa, chain - ## HITS:1 COG:no KEGG:BT_4240 NR:ns ## KEGG: BT_4240 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 360 1 360 363 696 93.0 0 MKDLSSIVANFKVQGTIEEIKPLGTGLINDTYKVNTKEADAPDYVLQRINHAIFQNVEML QSNITAVTNHIRKKLTEAGEADIERKVLSFLETEEGKAYWFDGDSYWRVMVFIPRAKTYE TVNPEYSNYAGEAFGNFQAMLADIPETLGETIPDFHNMEFRLKQLREAVAKDAAGRVSEV KYYLDEIEKRADEMCKAERLYREGKLPKRVCHCDTKVNNMMFDEDGKVLCVIDLDTVMPS FVFSDYGDFLRTGANTGDEDDKDLDRVNFNMEIFKAFTKGYLKGAKSFLTPIEIENLPYA AALFPYMQCVRFLADYINGDTYYKIKYPEHNLVRTKAQFKLLQSVEANTPEMIAFINECL KS >gi|222159303|gb|ACAB01000056.1| GENE 74 68121 - 71462 3277 1113 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 19 1077 5 966 1087 706 38.0 0 MGIFSLTTFNVMAEKAVKPYWQDVQVVEVNKEYPRTSFMTYNNRADALSGKFERSKYYRL LNGTWKFYFVDSYKKLPDNITDPNTNTDSWNDIQVPGNWEVQGHGIAIYTNHGYEFKPRN PQPPALPEANPVGVYRRDIDIPTDWDGRDIYLHLAGAKSGVYVYINGQEVGYSEDSKNPA EFLINNYVKPGKNVLTVKIFRWSTGSYLECQDFWRMSGIERDVYLYSQPKAALKDFRVKS TLDDSYKNGIFSLNVDLRNHEKAATNLTLVYELLDAQGKVISTEEKTAYIPSDEVRTLSF DQKLADVNTWTSEHPNLYKLLMTVKENGKINEIIPFNVGFRRIEIKPIEQKAANGKPYVC LFINGQPLKLKGVNIHEHNPSTGHYVTEELMRRDFELMKQHNLNSVRLCHYPQDRRFYEL CDEYGLYVYDEANIESHGLYYDLRKGGSLGNNPEWLKPHMDRTINMFERNKNYPSVTFWS LGNEAGNGYNFYQTYLWLKEADKELMNRPVNYERAQWEWNSDMYVPQYPGADWLENTGKN GSDRPVAPSEYAHAMGNSTGNLWGQWQAIYKYPNLQGGYIWDWVDQGLLQKDENGREYWA YGGDFGVDAPSDGNFLCNGLVNPDRGPHPAMAEVKYVHQNVGFEAVDAAAGIFKITNRFY FTNLKKYQIHYSVLANGKTIKGGKVSLDIAPQASKEFTVPVNGLKAQSGVEYFVNFSVTT TEPEPLIPTGYEIAYDQFQLPIQAEKAIYKANGPALKTTTQGDELIVSSSKVNFVFNKNS 
GLVTSYKVDGTEYFKDGFGIQPNFWRAPNDNDYGNGAPKRLQVWKQSSKNFHVTDATMTT ENKVVSLHVTYLLAAGNLYVVTYKIYPNGVVNVNAKFTSTDMQATETEVSEATRMATFTP GSDAARKAASKLEVPRIGVRFRLPAQMNNVQYFGRGPEENYIDRNHGTLVGVYKTTADKM YFNYVRPQENGHHTDTRWIALSPVKGNGLVLVADSTIGFNALRNSIEDFDSEEALPHPYQ WNNFSREEVANHDENAARNVLRRMHHVNDIVPRDFVEVCVDMKQQGVGGYDSWGARPEPF HQIPANRDYHWGFTLVPVRSANQANEAAKYDYR >gi|222159303|gb|ACAB01000056.1| GENE 75 71535 - 72425 718 296 aa, chain - ## HITS:1 COG:TM0177 KEGG:ns NR:ns ## COG: TM0177 COG1284 # Protein_GI_number: 15642951 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 13 294 1 281 283 150 32.0 3e-36 MKTAIPKPSKQSIIREARDYVMIAIGMILYGIGWTVFLLPNDITTGGVPGIASIVYWATG FPVQYTYFSINFFLLLLALKLLGMKFCIKTIFGVFTLTFFLSVIQKLTAGFGLLHDQPFM ACVIGASFCGGGIGVAFSANGSTGGTDIIAAIINKYRDITLGRVVLICDMIIISSSYFVL KDWEKVVYGFVTLYICSFVLDQVVNSARQSVQFFIISNKYEEIGRHINEYPHRGVTIINA TGFYTGREVKMMFVLAKKRESPIIFRLIKDIDPNAFVSQSAVIGVYGEGFDHIKVK >gi|222159303|gb|ACAB01000056.1| GENE 76 72540 - 73949 1536 469 aa, chain - ## HITS:1 COG:lin2262 KEGG:ns NR:ns ## COG: lin2262 COG0673 # Protein_GI_number: 16801326 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Listeria innocua # 64 314 9 225 349 65 27.0 2e-10 MKKLLTATAIGLALLTWHTSCTQQPKAPEAFTPIKVETPARPAGQEDVIQLVTPKIDTVR VGFIGLGMRGPGAVARWTHIPGTKIVALCDLLPERVEKSQEILKNAGLPVAASYSGEEDA WKKLCERDDIDLVYIATDWKHHAAMGVYAMEHGKHVAIEVPAAMTLDEIWQLINTSEKTR KHCMQLENCVYDFFELTSLNMAQQGVFGEVLHVEGSYIHNLEDFWPEYWNNWRMDYNHLH RGDVYATHGMGPACQVLNIHRGDRMKTLVSMDTKAVNGPAYIKKQTGEEVTDFQNGDQTS TLIRTENGKTMLIQHNVMTPRPYSRMYQIVGADGYASKYPIEEYCLRPTQVDSKDVPNHE NLNAHGSVSENVKKALMDKYKDPIHIELEETAKKVGGHGGMDFIMDYRLAYCLQNGLPLD MDVYDLAEWCCMAELTRLSIENNSAPVEVPDFTRGGWNKVQGYRHAFAK Prediction of potential genes in microbial genomes Time: Wed May 18 02:31:17 2011 Seq name: gi|222159302|gb|ACAB01000057.1| Bacteroides sp. 
D1 cont1.57, whole genome shotgun sequence
Length of sequence - 14684 bp
Number of predicted genes - 15, with homology - 15
Number of transcription units - 9, operones - 2 average op.length - 4.0

   N   Tu/Op     Conserved   S       Start      End   Score
                 pairs(N/Pv)
                             - TRNA    395 -    467    68.9 # Gly TCC 0 0
                             - TRNA    473 -    558    63.9 # Tyr GTA 0 0
                             - Term    350 -    387     4.1
   1    1 Tu  1  .           - CDS     610 -   1104     407 ## COG0054 Riboflavin synthase beta-chain
                             - Prom   1129 -   1188     3.6
   2    2 Tu  1  .           - CDS    1255 -   1938     830 ## BT_4254 hypothetical protein
                             - Prom   2075 -   2134     5.3
                             + Prom   1946 -   2005     6.0
   3    3 Op  1  .           + CDS    2099 -   3217    1025 ## COG1195 Recombinational DNA repair ATPase (RecF pathway)
   4    3 Op  2  .           + CDS    3204 -   3491     256 ## BT_4256 hypothetical protein
                             + Term   3729 -   3771     1.3
   5    4 Op  1  .           - CDS    3478 -   4269     678 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family
   6    4 Op  2  .           - CDS    4214 -   4807     203 ## COG0212 5-formyltetrahydrofolate cyclo-ligase
   7    4 Op  3  .           - CDS    4719 -   6470    1709 ## COG0793 Periplasmic protease
   8    4 Op  4  .           - CDS    6501 -   6950     411 ## COG2131 Deoxycytidylate deaminase
   9    4 Op  5  .           - CDS    6966 -   7457     336 ## BT_4261 hypothetical protein
  10    4 Op  6  .           - CDS    7482 -   9542    1821 ## COG0339 Zn-dependent oligopeptidases
                             - Prom   9626 -   9685     5.8
                             - Term   9647 -   9699    13.2
  11    5 Tu  1  .           - CDS    9716 -  10726    1023 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase
                             - Prom  10786 -  10845     6.8
                             + Prom  10782 -  10841     7.3
  12    6 Tu  1  .           + CDS   10869 -  11315     432 ## COG1970 Large-conductance mechanosensitive channel
                             + Term  11328 -  11378     9.6
                             + Prom  11358 -  11417     4.3
  13    7 Tu  1  .           + CDS   11447 -  12970    1533 ## COG0519 GMP synthase, PP-ATPase domain/subunit
                             + Term  12992 -  13035    10.1
                             - Term  12972 -  13029     7.6
  14    8 Tu  1  .           - CDS   13064 -  13732     635 ## BT_4300 Crp family transcriptional regulator
                             - Prom  13947 -  14006     6.0
                             + Prom  13782 -  13841     9.9
  15    9 Tu  1  .
+ CDS 13870 - 14469 616 ## COG2095 Multiple antibiotic transporter + Term 14535 - 14590 2.2 Predicted protein(s) >gi|222159302|gb|ACAB01000057.1| GENE 1 610 - 1104 407 164 aa, chain - ## HITS:1 COG:BH1557 KEGG:ns NR:ns ## COG: BH1557 COG0054 # Protein_GI_number: 15614120 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase beta-chain # Organism: Bacillus halodurans # 19 164 11 156 156 139 50.0 3e-33 MATAYHNLSEYDFNSVPNAEAMKFGIVVSEWNFNITGALLKGAVDTLKKHGAKDENILVK TVPGSFELTFGANQMMENCDLDAIIAIGCVIKGDTPHFDYVCMGATQGITELNATGDIPV IYGLITTNTMEQAEDRAGGKLGNKGDECAITAIKMIDFVWSLNK >gi|222159302|gb|ACAB01000057.1| GENE 2 1255 - 1938 830 227 aa, chain - ## HITS:1 COG:no KEGG:BT_4254 NR:ns ## KEGG: BT_4254 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 227 1 227 227 377 96.0 1e-103 MAEQKNQNEHLNVEDALTQSEAFLIKYKNAIIGGVVAVIIIVAGFIMYKNLYAEPREEKA QAALFKGQEYFEQDAFEQALNGDSIGYTGFLKVADEYSGTKAANLAKAYAGICYAQLGKY EEAVKMLDSFNGKDQMVAPAILGAAGNCYAQLGQLDKAASTLLSAADKADNNTLSPIFLI QAGEILVKQGKYDDAVNAYTKIKDKYFQSYQAMDIDKYIEQAKLMKK >gi|222159302|gb|ACAB01000057.1| GENE 3 2099 - 3217 1025 372 aa, chain + ## HITS:1 COG:BH0004 KEGG:ns NR:ns ## COG: BH0004 COG1195 # Protein_GI_number: 15612567 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair ATPase (RecF pathway) # Organism: Bacillus halodurans # 1 364 1 369 371 174 31.0 2e-43 MILKRISILNYKNLEEVELGFSAKLNCFFGLNGMGKTNLLDAVYFLSFCKSSGNPIDSQN IRHEQDFFVIQGFYEAEDGTPEEIYCGMKRRSKKQFKRNKKEYSRFSDHIGFLPLVMVSP ADSELIAGGSDERRRFMDVVISQYDKEYLEALIRYNKALVQRNTLLKSEFPVEEELFLVW EEMMSQAGEIVFRKREAFIREFIPIFQSFYSFISQDKEAVGLSYESHARDASLLEVLKQS RERDKIMGFSLRGIHKDELNMLLGEFPIKKEGSQGQNKTYLVALKLAQFDFLKRTGRTVP LLLLDDIFDKLDASRVEQIVKLVGGDNFGQIFITDTNRGHLDRILHKVGSDYKIFRVEEG TIQEMEADDEAQ >gi|222159302|gb|ACAB01000057.1| GENE 4 3204 - 3491 256 95 aa, chain + ## HITS:1 COG:no KEGG:BT_4256 NR:ns ## KEGG: BT_4256 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 95 1 95 95 160 95.0 1e-38 MKRNDAEQIGKLIQQFLRQESLESPLNEQRLLDAWPQVLGPAAAYTSNLYIRNQTLYVHL TSAALRQELMMGREVLVRTLNQRVGAMVITNIIFR >gi|222159302|gb|ACAB01000057.1| GENE 5 3478 - 4269 678 263 aa, chain - ## HITS:1 COG:DR0470 KEGG:ns NR:ns ## COG: DR0470 COG1387 # Protein_GI_number: 15805497 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Deinococcus radiodurans # 6 254 8 251 260 90 27.0 2e-18 MKTNYHTHTTRCHHATGSDEEFVLSAIKGGYQEIGFSDHTPWKYHTNYISDIRMLPEELP GYVESLRSLQEKYKNQISIKIGLECEYFPEYIHWLKGIIKEYKLDYIIFGNHHFHTDEKF PYFGRNTDSVDMLELYEESAIEGMESGLFAYLAHPDLFMRSYPEFDHHCKLVSRHICRTA ARLNLPLEYNLGYEEYNDIHGITTIPYPDFWKVAAHEGCTAIIGVDAHNNQYLENPFYYS RATETLRKLGIKVIDRIPFLNEK >gi|222159302|gb|ACAB01000057.1| GENE 6 4214 - 4807 203 197 aa, chain - ## HITS:1 COG:aq_1731 KEGG:ns NR:ns ## COG: aq_1731 COG0212 # Protein_GI_number: 15606807 # Func_class: H Coenzyme transport and metabolism # Function: 5-formyltetrahydrofolate cyclo-ligase # Organism: Aquifex aeolicus # 4 178 3 174 186 112 35.0 5e-25 MERKKELRKRIALLKTQHADSTMRKLQSANILTALEAHPAFRAANTVLLYHSLNDEVDTH EFIRKWSNKKRILLPVVVGDDLELRIYTGPENMSICGVYGIEEPTGEAFTDYAAIDFIVV PGVAFDAKGNRLGRGKGYYDRLLPRIPTAYKAGICFPFQLVEEVPAESFDVRMDIIITIN EDELSHPYHPLPSCDRE >gi|222159302|gb|ACAB01000057.1| GENE 7 4719 - 6470 1709 583 aa, chain - ## HITS:1 COG:aq_797 KEGG:ns NR:ns ## COG: aq_797 COG0793 # Protein_GI_number: 15606169 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Aquifex aeolicus # 51 358 43 346 408 211 40.0 3e-54 MSTKNSSRFTPVIIAVSVVVGILIGTFYAKHFAGNRLGIINGSSNKLNALLRIVDDQYVD TVNMTDLVEKAMPQILAELDPHSTYIPAQNLEEVNSELEGSFSGIGIQFTIQNDTIHVNA VVQGGPSEKIGLMAGDRIVTVDDSLFVGKKVTNERAMRTLKGPKGSQVKLGIKRTGEKDL LHFNITRGDIPQNTVDAAYMLNDDIGYVKVSKFGRTSHVELLNALAQLNHKKCKGLIIDL RGNTGGYMEAAIRMVNEFLPEGKLIVYTQGRKYPRAEEFANGTGSCQKMPLVVLIDEGSA 
SASEIFTGAIQDNDRGTVVGRRSFGKGLVQQPIDFSDGSAIRLTIARYYTPSGRCIQRPY ESGKDRNYELDLYTRYEHGEFFSRDSIKQNESERYNTSLGRTVYGGGGIMPDIFVPQDTT GVTSYLSTVINRGLTIQFTFQYTDNNRKKLSQYETEEELLNYLRHQGLVEQFVRFADSKG VKRRNILIQKSYKLLEKNLFGNIIYNMLGLEAYLQYFNKTDATVIKGIEILEKGEAFPKA PVAVEEEEVTKDKKDGKKKRTAQAYSITEDPARGFDYAKAAIS >gi|222159302|gb|ACAB01000057.1| GENE 8 6501 - 6950 411 149 aa, chain - ## HITS:1 COG:AF1764 KEGG:ns NR:ns ## COG: AF1764 COG2131 # Protein_GI_number: 11499353 # Func_class: F Nucleotide transport and metabolism # Function: Deoxycytidylate deaminase # Organism: Archaeoglobus fulgidus # 6 149 2 156 157 117 42.0 6e-27 MDIEKKQLELDKRYIRMASIWAENSYCQRRKVGALIVKDKMIISDGYNGTPSGFENVCED ENNLTKPYVLHAEANAITKIARSNNSSDGATMYVTASPCIECAKLIIQAGIKRVVYSEHY RLEDGIELLKRAGIEVIYTELDNHSSPDK >gi|222159302|gb|ACAB01000057.1| GENE 9 6966 - 7457 336 163 aa, chain - ## HITS:1 COG:no KEGG:BT_4261 NR:ns ## KEGG: BT_4261 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 161 17 167 169 150 56.0 1e-35 MKRSIFQIVGLLLLLPLFSGCNDSDDLQGIFTGKTWKLTYINLKDKGGWMNGFSEKSIKI LNENQESYTITFTGTEEDNRISNGAVKGRIITADLTGTWSANGKNNEFHASVTNVNENDD LAKEFIKGLNNASSYIGDDNGLFLYYNPAGSQQTYVLAFHVQR >gi|222159302|gb|ACAB01000057.1| GENE 10 7482 - 9542 1821 686 aa, chain - ## HITS:1 COG:XF1944 KEGG:ns NR:ns ## COG: XF1944 COG0339 # Protein_GI_number: 15838538 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent oligopeptidases # Organism: Xylella fastidiosa 9a5c # 9 685 36 716 716 469 39.0 1e-131 MSNITNAQNPFYGQYHTPHGTVPFDRIETEHYEPAILEGIKLQNAEIEAIIQNPEKADFS NTIEAFEASGELLDKVVAVFGNMLSAETNDDLQELAQKIMPLLSEHSNNITLNEKLFARV KEVYNQKETLQPTQEQKQLLENAYNSFVRHGANLEGEAREEYRRLTNELSKLTLTFSENN LKETNAYQMLLTKKESLAGLPEIIIEAAAETAKSEEKEGWAFTLHAPSYVPFMTYSDNRD LRQKLYMAYNTKCTHDNEFNNIDIVKKIANTRMKIAQLLGYKDYAEYTLKKRMAENSESV YKLLNQLLEAYAPTAQQEYKEIQELAREEQGDDFVVMPWDWSYYSNKLKDKKFNINEEML RPYFELEQVKKGVFGLAEKLYGITFRKNTEIPVYHKEVEAFEVFDKDGKFLAVLYTDFHP RLGKRAGAWMTSYKDQWIDKKTGENSRPHVSVVMNFTKPTENKPALLTFNEVETFLHEFG 
HSLHGMFANSTYRSLSGTNVYWDFVELPSQIMENFAIEKDFLNTFARHYQTGEVLPDELI ERLVDASNFNIAYACLRQLSFGLLDMAWYTRDTPFEGDVKAYEQEAWKDAQILPVVPEAC MSTQFSHIFAGGYAAGYYSYKWAEVLDADAFSLFKQKGIFNQEVADSFRNNILSKGGTEH PMVLYKRFRGQEPSIDALLIRNGIRK >gi|222159302|gb|ACAB01000057.1| GENE 11 9716 - 10726 1023 336 aa, chain - ## HITS:1 COG:VC2000 KEGG:ns NR:ns ## COG: VC2000 COG0057 # Protein_GI_number: 15642002 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Vibrio cholerae # 2 332 3 330 331 489 77.0 1e-138 MIKVGINGFGRIGRFVFRAAMTRNDIQIVGINDLCPVDYLAYMLKYDTMHGQFDGTIEAD VENSKLIVNGQAIRITAERNPADLKWNEVGAEYVVESTGLFLSKDKAQAHIEAGAKYVVM SAPSKDDTPMFVCGVNEKTYVKGTQFVSNASCTTNCLAPIAKVLNDKFGILDGLMTTVHS TTATQKTVDGPSMKDWRGGRAASGNIIPSSTGAAKAVGKVIPALNGKLTGMSMRVPTLDV SVVDLTVNLAKPATYAEICAAMKEASEGELKGILGYTEDAVVSSDFLGDARTSIFDAKAG IALTDTFVKVVSWYDNEIGYSNKVLDLIAHMASVNA >gi|222159302|gb|ACAB01000057.1| GENE 12 10869 - 11315 432 148 aa, chain + ## HITS:1 COG:ECs4156 KEGG:ns NR:ns ## COG: ECs4156 COG1970 # Protein_GI_number: 15833410 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Large-conductance mechanosensitive channel # Organism: Escherichia coli O157:H7 # 5 148 2 134 136 152 57.0 2e-37 MGKSTFLQDFKAFAMKGNVIDMAVGVVIGGAFGKIVSSLVANVIMPPIGLLVGGVNFTDL KWVMKAAEIGADGKEIAPAVTLDYGQFLQATFDFLIIAFAIFLFIRLITKLTTKKQAEVA AAPPTPPAPTKEEVLLTEIRDLLKEKNN >gi|222159302|gb|ACAB01000057.1| GENE 13 11447 - 12970 1533 507 aa, chain + ## HITS:1 COG:FN1444_2 KEGG:ns NR:ns ## COG: FN1444_2 COG0519 # Protein_GI_number: 19704776 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Fusobacterium nucleatum # 193 507 1 318 318 422 63.0 1e-118 MQEKIIILDFGSQTTQLIGRRVRELDTYCEIVPYNKFPKEDPTIKGVILSGSPFSVYDKD AFKVDLSEIRGKYPILGICYGAQYLSYTNGGKVEPAGTREYGRAHLASFCKDNVLFKGVR EESQVWMSHGDTITAIPDNFKKIASTDKVEIAAYQIEGEQVWGVQFHPEVFHSEDGTQIL 
KNFVVDVCGCKQDWSPASFIESTVAELKEQLGDDKVVLGLSGGVDSSVAAVLLNKAIGKN LTCIFVDHGMLRKNEFKNVMKDYECLGLNVIGVDASEKFFTELAGVAEPEAKRKIIGKGF IDVFDVEAHKIKDVKWLAQGTIYPDCIESLSITGTVIKSHHNVGGLPEKMNLKLCEPLRL LFKDEVRRVGRELGMPEHLITRHPFPGPGLAVRILGDITPEKVRILQDADDIFIQGLRDW GLYDKVWQAGVILLPVQSVGVMGDERTYERAVALRAVTSTDAMTADWAHLPYEFMGKVSN DIINKVKGVNRVTYDISSKPPATIEWE >gi|222159302|gb|ACAB01000057.1| GENE 14 13064 - 13732 635 222 aa, chain - ## HITS:1 COG:no KEGG:BT_4300 NR:ns ## KEGG: BT_4300 # Name: not_defined # Def: Crp family transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 220 1 220 220 380 86.0 1e-104 METMFDTLLQLPLFQGLCHEDFTSILDKVKLHFIKHKVGETIIESGSPCKQLCFLLKGEV SIVTNSKENIYTVIEQMEAPYLLEPQSLFGMNTHYTSAYVAHTEAHTVSISKAFVLSDLF KYEIFRLNYMNIVSNRAQNLYSRLWEEPTQDLTEKIIRFFLLHCEKIQGEKIFKMKMDDL ARYLDDTRLNTSKALNELQDKGLLELRRKEILIPDAQKLVSE >gi|222159302|gb|ACAB01000057.1| GENE 15 13870 - 14469 616 199 aa, chain + ## HITS:1 COG:BMEI0883 KEGG:ns NR:ns ## COG: BMEI0883 COG2095 # Protein_GI_number: 17987166 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Multiple antibiotic transporter # Organism: Brucella melitensis # 7 193 4 207 209 77 29.0 1e-14 MFAGFNWQQMVSAFIVLFAVIDIIGSIPIIINLKEKGKDVNAMKATVISFVLLIGFFYAG DMMLKLFHVDIESFAVAGAFVIFLMSLEMILDIEIFKNQGPIKEATLVPLVFPLLAGAGA FTTLLSLRAEYASVNIIIALVLNMIWVYFVVSMTGRVERFLGKGGIYIIRKFFGIILLAI SVRLFTANITLLIEALHRS Prediction of potential genes in microbial genomes Time: Wed May 18 02:31:49 2011 Seq name: gi|222159301|gb|ACAB01000058.1| Bacteroides sp. D1 cont1.58, whole genome shotgun sequence Length of sequence - 77429 bp Number of predicted genes - 57, with homology - 56 Number of transcription units - 21, operones - 14 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 42 - 653 300 ## BT_3384 hypothetical protein + Term 724 - 765 1.1 2 2 Tu 1 . - CDS 544 - 819 122 ## - Prom 842 - 901 7.2 + Prom 795 - 854 6.0 3 3 Tu 1 . 
+ CDS 884 - 3481 1245 ## BT_3853 hypothetical protein + Term 3541 - 3579 3.5 + Prom 3557 - 3616 6.3 4 4 Op 1 . + CDS 3835 - 6963 2015 ## BT_3854 hypothetical protein 5 4 Op 2 . + CDS 6970 - 8928 1435 ## BT_3855 putative outer membrane protein 6 4 Op 3 . + CDS 8961 - 9653 401 ## BT_3856 hypothetical protein 7 4 Op 4 . + CDS 9669 - 11906 1181 ## COG3537 Putative alpha-1,2-mannosidase 8 4 Op 5 . + CDS 11937 - 13100 675 ## BT_3862 endo-alpha-mannosidase 9 4 Op 6 . + CDS 13127 - 14248 960 ## BT_3860 hypothetical protein + Term 14423 - 14447 -1.0 10 5 Tu 1 . - CDS 14428 - 15063 466 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) - Prom 15113 - 15172 8.6 + Prom 15106 - 15165 10.3 11 6 Op 1 4/0.000 + CDS 15277 - 17214 1651 ## COG3408 Glycogen debranching enzyme 12 6 Op 2 2/0.000 + CDS 17231 - 18496 1151 ## COG0438 Glycosyltransferase 13 6 Op 3 . + CDS 18514 - 19968 1297 ## COG1449 Alpha-amylase/alpha-mannosidase + Term 19992 - 20038 4.4 - Term 20262 - 20330 26.1 14 7 Op 1 . - CDS 20344 - 22083 1174 ## BT_4306 hypothetical protein 15 7 Op 2 . - CDS 22106 - 22939 693 ## COG0297 Glycogen synthase + Prom 22909 - 22968 9.0 16 8 Op 1 12/0.000 + CDS 23082 - 23927 712 ## COG0414 Panthothenate synthetase 17 8 Op 2 . + CDS 23946 - 24299 522 ## COG0853 Aspartate 1-decarboxylase + Term 24316 - 24382 23.0 - Term 24310 - 24364 16.6 18 9 Op 1 . - CDS 24444 - 26744 2545 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases - Prom 26819 - 26878 5.1 - Term 26824 - 26864 4.9 19 9 Op 2 . - CDS 26903 - 28177 1431 ## COG0172 Seryl-tRNA synthetase - Prom 28202 - 28261 12.5 - Term 28355 - 28404 10.6 20 10 Op 1 32/0.000 - CDS 28428 - 28691 451 ## PROTEIN SUPPORTED gi|237715589|ref|ZP_04546070.1| 50S ribosomal protein L27 21 10 Op 2 . - CDS 28713 - 29030 534 ## PROTEIN SUPPORTED gi|237715590|ref|ZP_04546071.1| 50S ribosomal protein L21 - Prom 29115 - 29174 6.9 22 11 Op 1 . 
- CDS 29223 - 29861 489 ## COG0546 Predicted phosphatases 23 11 Op 2 . - CDS 29900 - 30784 866 ## COG3735 Uncharacterized protein conserved in bacteria 24 11 Op 3 . - CDS 30811 - 31341 316 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) 25 11 Op 4 . - CDS 31386 - 32093 212 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 26 11 Op 5 . - CDS 32132 - 33625 1060 ## BT_4319 hypothetical protein 27 11 Op 6 . - CDS 33650 - 36487 2983 ## COG0612 Predicted Zn-dependent peptidases - Prom 36508 - 36567 1.6 28 11 Op 7 . - CDS 36569 - 37369 860 ## COG2877 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase - Prom 37408 - 37467 2.1 29 12 Op 1 . - CDS 37493 - 38419 847 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase 30 12 Op 2 . - CDS 38477 - 39397 955 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase - Prom 39426 - 39485 6.0 - Term 40070 - 40120 13.5 31 13 Op 1 . - CDS 40273 - 42888 1695 ## BT_4324 hypothetical protein 32 13 Op 2 . - CDS 42949 - 44154 904 ## BT_4325 hypothetical protein - Prom 44199 - 44258 2.9 33 14 Op 1 . - CDS 44261 - 44905 197 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 34 14 Op 2 . - CDS 44926 - 46473 1491 ## BT_4327 hypothetical protein 35 14 Op 3 . - CDS 46514 - 47212 609 ## COG1385 Uncharacterized protein conserved in bacteria 36 14 Op 4 . - CDS 47221 - 47814 576 ## COG1259 Uncharacterized conserved protein 37 14 Op 5 1/0.000 - CDS 47828 - 49078 1246 ## COG0477 Permeases of the major facilitator superfamily - Prom 49231 - 49290 8.1 38 14 Op 6 . - CDS 49602 - 50780 572 ## PROTEIN SUPPORTED gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative - Prom 50821 - 50880 5.2 39 15 Op 1 . - CDS 50886 - 51554 428 ## BT_4332 hypothetical protein 40 15 Op 2 . 
- CDS 51563 - 52225 518 ## BT_4333 hypothetical protein - Prom 52331 - 52390 7.0 + Prom 52263 - 52322 8.5 41 16 Op 1 . + CDS 52352 - 54844 2356 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins 42 16 Op 2 . + CDS 54864 - 55511 559 ## BT_4335 hypothetical protein 43 16 Op 3 . + CDS 55548 - 56498 707 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 + Term 56531 - 56568 1.6 - Term 56618 - 56673 7.7 44 17 Tu 1 . - CDS 56694 - 59123 2157 ## COG3525 N-acetyl-beta-hexosaminidase - Prom 59174 - 59233 6.9 - Term 59194 - 59239 3.4 45 18 Op 1 . - CDS 59281 - 60570 1025 ## COG0526 Thiol-disulfide isomerase and thioredoxins 46 18 Op 2 . - CDS 60591 - 61751 894 ## gi|237715615|ref|ZP_04546096.1| conserved hypothetical protein 47 18 Op 3 . - CDS 61753 - 62892 756 ## HCH_00467 PDZ domain-containing protein 48 18 Op 4 . - CDS 62903 - 64609 1414 ## COG1404 Subtilisin-like serine proteases 49 18 Op 5 . - CDS 64641 - 65516 718 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 65536 - 65595 2.4 50 19 Op 1 . - CDS 65678 - 66685 722 ## gi|237715619|ref|ZP_04546100.1| conserved hypothetical protein 51 19 Op 2 . - CDS 66723 - 68267 1042 ## Cpin_3049 RagB/SusD domain protein 52 19 Op 3 . - CDS 68287 - 71853 2207 ## BT_3279 hypothetical protein 53 19 Op 4 . - CDS 71883 - 73043 561 ## COG3712 Fe2+-dicitrate sensor, membrane component 54 19 Op 5 . - CDS 73111 - 73665 289 ## Dfer_2829 RNA polymerase, sigma-24 subunit, ECF subfamily 55 19 Op 6 . - CDS 73700 - 74401 624 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Prom 74425 - 74484 8.4 - Term 74463 - 74506 8.6 56 20 Tu 1 . - CDS 74518 - 76707 2300 ## COG3968 Uncharacterized protein related to glutamine synthetase - Prom 76763 - 76822 6.3 - Term 76766 - 76822 11.8 57 21 Tu 1 . 
- CDS 76848 - 77321 289 ## BT_4340 hypothetical protein Predicted protein(s) >gi|222159301|gb|ACAB01000058.1| GENE 1 42 - 653 300 203 aa, chain + ## HITS:1 COG:no KEGG:BT_3384 NR:ns ## KEGG: BT_3384 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 203 1 203 203 260 63.0 2e-68 MAVQFEFYKNPGTGEEGEEEQYHPRVVNFNSVSTEYLAAEIHRATTFGEAEVEGVLMSLS HFMSSHLKNGERVHLKGIGYFQVTLQTTEPVYDVKTRSDKVSFKAIRFQADKELKGELYD MHIQRSKWKCHSAVLSEEEIDRRLTEFFTTHQVLIRRNLQSICQFTQIMASRHIQRLKEQ GKIENIGTRFQPIYVPRSGYYGK >gi|222159301|gb|ACAB01000058.1| GENE 2 544 - 819 122 91 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTNKKFPSICNYDRRRSSNAKTHIKYLSTSNIPISSPPIMIFRPYLRLSTQYQGWFIFHN TRIWAHKSVENEFLYSLFYLVLSNVVYDAMP >gi|222159301|gb|ACAB01000058.1| GENE 3 884 - 3481 1245 865 aa, chain + ## HITS:1 COG:no KEGG:BT_3853 NR:ns ## KEGG: BT_3853 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 862 8 854 857 811 48.0 0 MKQKRYLLISLLIFCLFVTLQWGYANTTILSNYRYGLTFKAHTVNQDQRTSLDLTPDRSL HLGNKFSLSFDIKLREELHTYGYVFRIISDDTSSCLDFISYQLRSRFNFILNKGSSIVEN VDFIDSTQIVKDEWIKIRLDFTEQGIKISVNEATQMMRHSFKDFNSLSFLFGSNRHPKYY TTDVPPVSLRNIELYNQDGRIIRCWKLALHNRKGETMDEVEAQRAIVQNGIWEIDRHYNW QKEKSLKFDSKKFQCQNLQLAYDSIGGRIFMVLKEKVLVYHVDNKQVDTLSVEKGEPYFG VSRQVIYSAATDELISYSTEHPLLSKYNFATHEWSIPSTWEVDSRQHHNRMIDPETNHLI LFGGYGRHTYNADFVEKGLNDGDQWQIISLDSCITPRYLSAMGIEDKDHILIMGGYGSCS GKQEESPQNYYDLYRLNIRDKSCSKVWSFFNEGEHFTFSNSMIIDTVADKLYALVYNNDR FHTNLNLCSFDICTSNPQSKIVSDSIEYDFLDIKSYCDLFLYQDQMLYAFVVQEKIPGEA VLDVYSLVYPPLSSDEIYMEKNKDCHFSTWVIYGSILICLVFFFFVIVYVYKKKKSKTTG VSMTISKVEYGEQEIAKPSNRKISAILLLGGFQVFDKQGNNITGEFTPTLKLLFLFLLLN SIKGGKGTTSQRLEETFWFDMSKTSAANNRRVNIRKLRLILETVGEVQIVNKNDYWYIDM GKDTLCDYHQVCQILNAFEFYNSSNKEMIVNFVELASLGVLLPNVSTEWVDEYKSEYSHN VIELLLSISVREEIGKDNKLLLKIADIILMHDCTDEDAIRIKCRALFLSGQKGLAKQCFD KYCADYNRLLNTSPEFTYDDVICES >gi|222159301|gb|ACAB01000058.1| GENE 4 3835 - 6963 2015 1042 aa, chain + ## HITS:1 COG:no KEGG:BT_3854 NR:ns ## KEGG: BT_3854 # 
Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 32 1042 1 1012 1012 1205 59.0 0 MKARICKEKHWNVHLRFFLVMMGDLLFSVSVLAQSNLITGKVIDNASEPIIGASIQSSVV GKGTISNMNGDFSIEVPVGTELTISFLGYRTQKIKVSDMRSLVIKLLEDTQALDEVVVVG YGVQKKESLTGAISNIKNDEIIKTKATSLAQSLEGKVSGLKIRQNDGEPGQFRSDINIRG LGTPLFIIDGIVRDGANEFQRLNPDDIESISFLKDGTAAIYGMNSANGAIVVTTKKGYKG KAKISLSANWGWSKPTNIPRMANAAQYMELRNDAAILGEGNPLVTKEELELWKTGASGYA STDLYDHVIKNSSIQQQYTLSLQGGSDIVSYFGSFSYASDEGLLKSGDLGYEKFTFRSNT DIKIAKGLTAGVNLAGRYDKTSQPWNSFYEIFKQTRVNVPTVPAYANNNPDYLATQSMGI NPIALADADMTGYHYYYNKNFQSTFTLKYEVPFVQGLSIQGQLGYDYNHNKHKGLRKKFS TYTYAIETDEYIENEYNNPSMIQVNNNEASRLDMQAQISYNRLFNDTHNVSATLVYERRQ EKSDWSNIERKFDLLTYDEVDYAGLKDAVSGGMSNEQAFISYVGRFNYDYKGRYLLELGF RNDGSYRYAPGSRWAFFPMGSIGWRLSEEKFLKEKLGFLDNLKIRASYGESGQDAGDPFQ YVSGYNLNVGIYEFNDGTTTSGISAPSITNTNLTWYKAKLLDIGFDLSIFKGLFSLEFDL YQRKRSGLLATRVVALPNTFGSQLPQENLNSDLTRGIDITLGHTNKIRNFSYSVKANMNL ARTRMEHVERGPFTSSMDRWRNQSSGRWNDFTWGYQTNGQFQTTEEIKYAPVQDGELGNS KELPGDYRYVDANGDGIIDDNDRMPLFWSGTPLIHYGFNLEASWKNFDIYALLQGAGMYT VQFSEVYAEVLWSKGANTPAYFYDRWRKEDPYNPNSKWVAGKWPATRLIQDTGAMYKESN IWRRDASYLRLKTLELGYTLPDSILRKSGISNIRFYINGYNLLTFCDSFVKAFDPEKVEG SYSAGMNYPLTKSFNIGFTANF >gi|222159301|gb|ACAB01000058.1| GENE 5 6970 - 8928 1435 652 aa, chain + ## HITS:1 COG:no KEGG:BT_3855 NR:ns ## KEGG: BT_3855 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 652 1 643 643 765 56.0 0 MKYKIILLLNMALLFAGCTDILDIEPNNKVVADKVLTTEEGIAMHMANLYGRLPIEDFSY AIDKGFNLGIGNTDPNNGGRAAAMYCDEAMHSEFDEWGSEGFNFWERYTDDKGLYNASVG IYTLIRDINNLLETIPTLNTSHNKKTLLEGEAAFLRAYTYFALAKRYGGVSIIKKVQVYN GDVEELKVPRSTEKDTWDFILSDCDDAIRKLSESSEDTRRANKWIALALKSRAALYAASI AKYTHEPYVGLSGPAVEQKLVGIDASEADRYYEICMQSSDEIMKSGLYGLYGAEPQSRDE AIVNYKKLFEDPNSVLNGTKEPIFIKGYTSGTQLAHNYDIWYRPNQVANGWPHPGRMNPS LDLADIYEDYTDDGQGLSKPIRTRVDNNEIFSGFDRNANYYSYPMDAPYEIFNDKDIRMW ATMILPGTQWKGQKIIIQGGFIRPDGSYVFRTDASTQGKDGKTYYSFGAQDQKMYSGFSS 
VGGNYTRSGFLLRKFLQESKDVTSEWNKGFTDWIEFRYAEILMNYAEASIEHSSATGEQL KKGKKALNAVRKRAAHKDEIPLTIENVRKERFVEFAFENRRRWDLIRWRIFHKEYENRIK KSLVPFLDLRGETPKYIFVRMDAPGVNKHSFDYSWYYQDIPGTGSNGLVQNP >gi|222159301|gb|ACAB01000058.1| GENE 6 8961 - 9653 401 230 aa, chain + ## HITS:1 COG:no KEGG:BT_3856 NR:ns ## KEGG: BT_3856 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 230 1 231 231 239 51.0 6e-62 MKATFYLLTIIFFSICFVACEKDNLDIAPNATFGGEIRDMETGELVEQEIIRGSQIYFIE LGWDNPPVQAMAVKNDGTFQNNLMFAGDYKIILNKGNYIPQDTIDFRINKGKNYKVFEVL PYIRIVNPDIRYKGNKVVAKFSLQQYTVDPVKTIALFAHTESHVSSNIQTAKVSSEINTA VSPLTEYELSMDLSQTTGIEREHSYFFRIGALSSTSEAKYNYSPAVEIDL >gi|222159301|gb|ACAB01000058.1| GENE 7 9669 - 11906 1181 745 aa, chain + ## HITS:1 COG:CC0533 KEGG:ns NR:ns ## COG: CC0533 COG3537 # Protein_GI_number: 16124788 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Caulobacter vibrioides # 214 726 290 756 770 248 30.0 4e-65 MKRKIQDCLAYRLFLLLPLFSFGGCVEIKAPVDYVNPYMGNISHLLVPTYPTVHLPNSML RIYPQREDYTSNKLKGLPVIVTSHRGNSAFSINPFNSFCDSLEKGSRSSFILYNYDNEKI KPYYYSVFLSDYDIEVKYVPSHQSALYQLDFRAGQPGLLIGATNGELHVEKNKVFGYQQI GSYETRVYLYLETDVVPLQTEEGNTVVMKFPENDKRINLRYGISFISVDQAKKNMEREVA GKDLETLTRTGRTIWNDALGKIQVKGNCEDSKTIFYTSLYRTYERMVCLSEDGQYYSAFD NQVHQDSGVPFYTDDWIWDTYRATHPLRILLEPEREKYMIQSFIRMAGQMKHFWMPTFPE ITGDSRRMNSNHGVVTVLDAYVKGITDFDLKEAYRACRNAITEKTLAPWSDMEAGKLDSF YVSNGYFPALASGEEEVVSEVHSFEKRQAVAVTLGTVYDEWCLAQIARLLDNKEDYNYFL QRSLNYHQLFNAETKFFHPKDSEGNFITPFDYRYSGGMGGRDYYGENNGWVYRWDVPHNV ADLIKLMGGKEAFVDELEKMFDTPLGKSKYEFFSRFPDHTGNVGQFSMANEPCLHIPYLY NYANKPWMTQKKIRALLEEWFRNDLMGMPGDEDGGGMSAFIVFSQLGFYPVTLGLPIYVI GSPVFERAKIQLSDDKTFEVYCHNYSPKHKYIQSVKLDGVEWDKSWFSHEELMKGNKLEV TMGAYPNKEWASQDQSVPPSFEMRK >gi|222159301|gb|ACAB01000058.1| GENE 8 11937 - 13100 675 387 aa, chain + ## HITS:1 COG:no KEGG:BT_3862 NR:ns ## KEGG: BT_3862 # Name: not_defined # Def: endo-alpha-mannosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 44 386 
38 375 382 408 58.0 1e-112 MMQRLFFMMIFFLSLSLESCSKEDDNNPSNSENNGGNNNLGTELDYDTFCFYYDWYGSEA IDGQYRHWAHAIAPDPNGGSGQNPGTIPGTQESIASNFYPQLGRYSSSDPNILTKHMDMF VMARTGVLALTWWNEQDETEAKRIGLILDAADKKKIKVCFHLEPYPSRNVQNLRENIVKL ITRYGNHPAFYRKDGKPLFFIYDSYLIEPSEWEKLLSPGGSITIRNTAYDALMIGLWTSS PTVQRPFILNAHFDGFYTYFAATGFTYGSTPTNWVSMQKWAKENGKIFIPSVGPGYIDTR IRPWNGSVIRTRTDGQYYDAMYRKAIEAGVSAISITSFNEWHEGSQIEPAVPYTSSEFTY LDYENREPDYYLTRTAYWVGKFRESKQ >gi|222159301|gb|ACAB01000058.1| GENE 9 13127 - 14248 960 373 aa, chain + ## HITS:1 COG:no KEGG:BT_3860 NR:ns ## KEGG: BT_3860 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 373 1 376 376 290 43.0 6e-77 MKKVTKVKMLLIMLCALFISVCSISCSDDDDNVLKVDTVAFISQLTDYQNQLETLRDGSQ YGNKTGMYPEESRDILELVLIQIKKSIRQIENGEETNPSQEKVDELIQAAKKAMEDFKAT VIIEYVSVAAELYVDGQNDGYIDFGASENYSKFGNTGEQVFTVELWVKLKYTSGFGSVVD VFIEDGTGHYRKGWVINNFDNKRLRMSLGMDNWGLLEPGIDFSTTEKWVHFAAVINEKGV DGDVNSEGKPIVVKVYMNGELKDQATGLDGNPYNPNDLGTSMIAFRHMGGNLGMTNDWKL SGYIKHFHLWKSAKSQGEIKKLMNEEIRVTGQEDDLVCGWEFDATVEDDTNIPDLTGKYS AKLLGEYKWIPLE >gi|222159301|gb|ACAB01000058.1| GENE 10 14428 - 15063 466 211 aa, chain - ## HITS:1 COG:Rv1337 KEGG:ns NR:ns ## COG: Rv1337 COG0705 # Protein_GI_number: 15608477 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Mycobacterium tuberculosis H37Rv # 7 190 35 223 240 92 33.0 7e-19 MKQDIQRIMLAAAKPLFLIFILYILKIMEVGMDWDFSRLGVYPMEKRGLTGILTHPLIHS GFSHLLANTIPLFFLSWCLFYFYRGIAGKIFILIWLGSGLLTFLIGKPGWHIGASGLIYG FAFFLFFSGILRKYVPLIAISLLVTFLYGGIIWHMFPYFSPTNMSWEGHLSGGIMGTLCA FVFVNHGPQRLEPFADETEEENNEEEKGKSP >gi|222159301|gb|ACAB01000058.1| GENE 11 15277 - 17214 1651 645 aa, chain + ## HITS:1 COG:MA0905 KEGG:ns NR:ns ## COG: MA0905 COG3408 # Protein_GI_number: 20089784 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen debranching enzyme # Organism: Methanosarcina acetivorans str.C2A # 1 636 22 669 680 280 31.0 9e-75 
MSYLRFDKTLMTNLEESLQREILRTNKAGAYHCTTIVDCNTRKYHGLLVIPVPNLDDENH VLLSSLDETVIQHGAEFNLGLHKYQGNNFSPNGHKYIREFDCEHIPATTYRVGGVILRKE KIFVHHENRILIRYTLLDAHSATTLRFRPFLAFRSVREYTHENPQASREYQLVENGIKTC MYPGYPELYMQLNKKCEFHFMPDWYRGIEYPKEQERGYDFNEDLYVPGYFEVDIKKGESI VFSAGTSEVTPRRLKQTFEAEVADRTPRDSFYHCLKNSAHQFHNQQEDEHYILAGYPWFK CRARDMFIALPGLTLALDEIDQFEDVMKTAEKAIRNFINEEPVGYKIYEMEHPDVLLWAV WAMQQYAKETSREQCRQKYGELLKDIMEFIRQRRHENLFLHDNGLLFANGTDKAITWMNS TVNGHPVIPRTGYIVEFNALWYNALRFIADLVREGGDVYLADELDAQAEVTGKSFVEVFR NEYGYLLDYVDGNMMDWSVRPNMIFTVAFDYSPLDRAQKKQVLDIVTKELLTPKGIRSLS PKSGGYNPNYVGPQIQRDYAYHQGTAWPWLMGFYLEAYLRIYKMSGLSFVERQLISYDDE MTSHCVGSIPELFDGNPPFKGRGAVSFAMNVAEILRVLKLLSKYY >gi|222159301|gb|ACAB01000058.1| GENE 12 17231 - 18496 1151 421 aa, chain + ## HITS:1 COG:Ta0340 KEGG:ns NR:ns ## COG: Ta0340 COG0438 # Protein_GI_number: 16081471 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Thermoplasma acidophilum # 1 415 20 388 388 223 32.0 8e-58 MKVLMFGWEFPPKIYGGLAVASYGITKGLSLQGDMETIFCMPKPSGEEEKFLKIIGMNQV PIVWRDVHYDYLKSRLLEMTPEEYYSFRDHIYADFSYMHVNDLGCMEFAGGYPGNLHEEI NNFSIIAGVVARQQEFDIIHAHDWLTYPAGVHAKMVSGKPLCIHVHATDFDRSRGKVNPT VYSIEKNGMDHADCIMCVSELTRRTVINEYHQDPRKVFAMHNAVYPLSQELLDIPRPDHS KEKVVTFLGRITMQKGPEYFVEAAALVLQRTRNIRFVMAGSGDMLNAMINLVAERGIADR FHFPGFMKGRQVYEVYKNSDVFVMPSVSEPFGIAPLEAMQCGTPSIISKQSGCGEILDKV IKTDYWDIHAMADAIHSLCTNPSLFEYLKEEGKKEVDGITWEKVGLRIRALYESVLRNYG K >gi|222159301|gb|ACAB01000058.1| GENE 13 18514 - 19968 1297 484 aa, chain + ## HITS:1 COG:MA4052 KEGG:ns NR:ns ## COG: MA4052 COG1449 # Protein_GI_number: 20092845 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-amylase/alpha-mannosidase # Organism: Methanosarcina acetivorans str.C2A # 1 387 1 390 396 265 36.0 1e-70 MRTICLYFEIHQIIHLKRYRFFDIGNDHYYYDDYANETGMNEVAERSYIPALNTLIEMAK NSGGAFKVALSISGVALEQLEIHAPAVIDLLHQLNETGCCEFLCEPYSHGLSSLANEDCF REEVLRQRDKMKQMFGKEPKVFRNSSLIYSDEIGGLVASMGFKGMLTEGAKHVLGWKSPH YVYHCNQAPSLKLLLRDFKLSDDISLRFSNSDWAEYPLFADKYINWIDALPQEEQVINIF 
MELSALGMAQPLSSNILEFMKALPECAKAKGITFSTPTEIVTKLKSVSQLDVAYPMSWVD EERDTSCWLGNVMQREAFNKLYSVAERVHLCDDRRIKQDWDYLQASNNFRFMTTKNNGMW LNRGIYDSPYDAFTNYMNILGDFIKRVDALYPVDVDSEELNSLLTTIKNQGDEITELEKV LAKLQAKVEAAKKTAVKKATNAKEPVVKEAPATKSKPAAKKAEVKKATAESKKSTGKAKK AAVK >gi|222159301|gb|ACAB01000058.1| GENE 14 20344 - 22083 1174 579 aa, chain - ## HITS:1 COG:no KEGG:BT_4306 NR:ns ## KEGG: BT_4306 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 579 1 540 542 584 58.0 1e-165 MKAKYALIALLAITFWGCDDNTAGLGLGMFPGSDQNINGKLSTFDVTTESVHAGKIYAMT NVGYVGKFTDETFGTYQAGFLAELNCPSGMTFPGVYNGTALNDKKKATNIMVDEAGLGDD AIGIYNTDNINDPKRKLIGNIHTIELYLWYSSYFGDSLTACRLSVYALNENLDTKNAYYT DITPEDYYDANTDNSLLGTKAYTAVDLSVKDSIRKLSTYVPSVHVSFRDKAAKEIGKEII KRANELGVNFDNKEFRKIFKGIYVKSDYGDGTVLYIDQAQMNVVYKCYAVDTLTGVKLEK KVVKEGESKDSTYYGYRTFATTREVIQANQLDNDKDAIQKCINEDTWTYLKSPAGIFTQI TLPISQIADSLLNQTAEKRQSDILNAVKLGIPIYNETSDKKFGMSTPSNVLLIRKKYKDS FFEKNQLSDEITSSLFRPTTTSFTQYTFNNITQMINDCLADREKAEKEIHEKGSITIKIT DLDGNSKDETVNNIKDWEDLSEWNKFVLIPVLVTTDSSSSNSYYGSSNVISIQHDLKPGY ARLKGGKKGTIQDAKGNPVYPEYVLKLEVVSTNFGTKSK >gi|222159301|gb|ACAB01000058.1| GENE 15 22106 - 22939 693 277 aa, chain - ## HITS:1 COG:FN0853 KEGG:ns NR:ns ## COG: FN0853 COG0297 # Protein_GI_number: 19704188 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen synthase # Organism: Fusobacterium nucleatum # 6 237 2 219 461 84 27.0 2e-16 MTKANKVLFITQEITPYVSESEMANIGRNLPQAIQEKGREIRTFMPKWGNINERRNQLHE VIRLSGMNLIIDDTDHPLIIKVASIQSARMQVYFIDNDDYFQNRMQTADENGVEYDDNDS RAVFYARGVLETVKKLRWCPDVIHCHGWMTALAPLYIKKAYKDEPSFRDAKVVFSVYEDD FKNTLSDDFAAKLMLKGISKKDLGDLKEPVDYAALCKLAVDYSDGVIQNSEKVDESIIEY ARQSGKLVLDYQNPENYADACNEFYDQVWDATANEEE >gi|222159301|gb|ACAB01000058.1| GENE 16 23082 - 23927 712 281 aa, chain + ## HITS:1 COG:CAC2915 KEGG:ns NR:ns ## COG: CAC2915 COG0414 # Protein_GI_number: 15896168 # Func_class: H Coenzyme transport and metabolism # Function: Panthothenate synthetase # Organism: Clostridium acetobutylicum # 1 279 
1 279 281 231 42.0 1e-60 MKVVHTIKDLQAELTVLRAQGKKVGLVPTMGALHAGHASLVKRSVSENGVTVVSVFVNPT QFNDKNDLEKYPRTLDADCRLLEECGADFAFAPSVSEMYPEPDTRQFSYAPLDTVMEGAF RPGHFNGVCQIVSKLFNAVQPDCAYFGEKDFQQLAIIREMVRQLKYNLEIVGCSIVREED GLALSSRNKRLSAEERENALNISRTLFKSRNFAATHTVSETQKMVEDAIEAAPGLRMEYF EIVDGNTLQKISNWEDTSYAVGCITVFCGEVRLIDNIKYKE >gi|222159301|gb|ACAB01000058.1| GENE 17 23946 - 24299 522 117 aa, chain + ## HITS:1 COG:NMA1492 KEGG:ns NR:ns ## COG: NMA1492 COG0853 # Protein_GI_number: 15794392 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate 1-decarboxylase # Organism: Neisseria meningitidis Z2491 # 1 115 1 114 127 127 60.0 4e-30 MMIEVLKSKIHCARVTEANLNYMGSITIDEDLLDAANMIAGEKVYIADNNNGERFETYII KGERGSGKICLNGAAARKVQPDDIVIIMSYALMDFEEAKSFKPAVIFPDPATNKVVK >gi|222159301|gb|ACAB01000058.1| GENE 18 24444 - 26744 2545 766 aa, chain - ## HITS:1 COG:TM1640 KEGG:ns NR:ns ## COG: TM1640 COG0493 # Protein_GI_number: 15644388 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Thermotoga maritima # 307 760 5 459 468 427 50.0 1e-119 MNKIISKERFSEKVFKFEIEAPLIAKSRKAGHFVIVRVGEKGERMPLTIAGSDLKKGTIT LVVQEVGLSSTRLCELNEGDYITDVVGPLGQATHIEKFGTVVCAGGGVGVAPMLPIVQAL KAAGNRVITVLAGRNKDLIILEKEMRESSDEVIIMTDDGSYGRKGLVTEGVEEVIKREKV DKCFAIGPAIMMKFVCLLTKKYEIPTDVSLNTIMVDGTGMCGACRITVGGKTKFVCVDGP EFDGHQVNFDEMLKRMGAFKNIEREEMHKLQPECEATKEIDEKSRNAAWRQELRKSMKPK ERTAIPRVEMNELDAEYRSHSRKEEVNQGLTAEQAVTEAKRCLDCANPGCMEGCPVGIDI PRFIKNIERGEFLEAAKTLKETSALPAVCGRVCPQEKQCESKCIHLKMNEKPVAIGYLER FAADYERESGQISVPVIAEKNGIKIAVIGSGPAGLAFAGDMAKYGYDVTVFEALHEIGGV LKYGIPEFRLPNKIVDVEIENLSKMGVNFIKDCIVGKTIGVEDLKAEGFKGIFVASGAGL PNFMNIPGENSINIMSSNEYLTRVNLMDAASEDSDTPVAFGKNVAVIGGGNTAMDSVRTA KRLGAERAIIIYRRSKEEMPARIEEVKHAKEEGVEFLTLHNPIEYIADEQGCVKQVILQK MELGEPDASGRRSPVAIPGATETIDIDLAIVSVGVSPNPIVPSSIKGLELGRKGTITVDD NMESSIPMIYAGGDIVRGGATVILAMGDGRKAAAAMNEQLKANAGN >gi|222159301|gb|ACAB01000058.1| GENE 19 26903 - 28177 
1431 424 aa, chain - ## HITS:1 COG:aq_298 KEGG:ns NR:ns ## COG: aq_298 COG0172 # Protein_GI_number: 15605830 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Seryl-tRNA synthetase # Organism: Aquifex aeolicus # 1 423 1 422 425 328 43.0 1e-89 MLTIKQITENTEAVLRGLEKKHFKNAKETIDQVIALNNKRRSTQNELDKNLAEVNSLSRT IGQLMKEGKKEEAETARARVAELKEGNKELDAAMTQAATDMQNVLYTIPNIPYDSVPEGV GAEDNVVEKMGGMETELPKDALPHWELAKKYDLIDFDLGVKITGAGFPVYKGKGAQLQRA LINFFLDEARKSGYTEIMPPTVVNAASGYGTGQLPDKEGQMYHCEVDDLYLIPTAEVPVT NIYRDVILEEKQLPIMNCAYTQCFRREAGSYGKDVRGLNRLHEFSKVELVRIDKPEHSKQ SHQEMLDHVEGLLQKLELPYRILRLCGGDMSFTAALCFDFEVYSEAQKRWLEVSSVSNFD TYQANRLKCRYRNAEKKTELCHTLNGSALALPRIVAALLENNQTPEGIRIPKALVPYCGF DMID >gi|222159301|gb|ACAB01000058.1| GENE 20 28428 - 28691 451 87 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237715589|ref|ZP_04546070.1| 50S ribosomal protein L27 [Bacteroides sp. D1] # 1 87 1 87 87 178 100 9e-44 MAHKKGVGSSKNGRESQSKRLGVKIFGGEACKAGNIIVRQRGTEFHPGENIGMGKDHTLF ALVDGTVNFKVGKEDRRYVSIVPATEA >gi|222159301|gb|ACAB01000058.1| GENE 21 28713 - 29030 534 105 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237715590|ref|ZP_04546071.1| 50S ribosomal protein L21 [Bacteroides sp. 
D1] # 1 105 1 105 105 210 100 2e-53 MYAIVEINGQQFKAEAGQKLFVHHIEGAENGSTVEFEKVLLVDKDGNVTVGAPTVEGAKV VCQVVSNLVKGDKVLVFHKKRRKGHRKLNGHRQQFTELTITEVVA >gi|222159301|gb|ACAB01000058.1| GENE 22 29223 - 29861 489 212 aa, chain - ## HITS:1 COG:MA2967 KEGG:ns NR:ns ## COG: MA2967 COG0546 # Protein_GI_number: 20091785 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Methanosarcina acetivorans str.C2A # 1 210 63 272 279 187 47.0 1e-47 MNYKTYLFDFDYTLADSSRGIVKCFRIVLTRHQYLTVTDEAIKRTIGKTLEESFSILTGI TDPEQLESFRQEYRLEADVHMNVNTRLFPDTLSTLKELKKRGARVGIISTKYRFRILSYL EEYLPKDFLDIVVGGEDVKAPKPSPEGVLFALEHLGSTPEETLYIGDSTVDAETARNAGV DFAGVLNGMTTAEELRIYPHKIIMQNLGELVQ >gi|222159301|gb|ACAB01000058.1| GENE 23 29900 - 30784 866 294 aa, chain - ## HITS:1 COG:SMc02488 KEGG:ns NR:ns ## COG: SMc02488 COG3735 # Protein_GI_number: 15966800 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Sinorhizobium meliloti # 21 294 84 358 358 89 27.0 6e-18 MKSFIGAVIFICVAFSANAQLLWKVSGNGLTQPSYIIGTHHLAPFSIMDSIAGLQKAMNE TQQVYGELKMSEMQSPATMGKMQKAMMIESDTTLTSLLSPEDFETANKFCKENLMVDLNM APKLKPAFLLNNVVVMAYVKHVGKFNPQEQLDTFFQSQAAQNGKKVDGLETAEFQFNLLF NGSSLQRQAQLLMCTLNNIEAEVENLKKLTNAYMKQDLNTMFKISEERKGNQCDALPSEE DALIYNRNKIWAEKLPAIMKAAPTFVAVGALHLPGEKGLLKLLKSQGYTVEPVK >gi|222159301|gb|ACAB01000058.1| GENE 24 30811 - 31341 316 176 aa, chain - ## HITS:1 COG:ECs3067 KEGG:ns NR:ns ## COG: ECs3067 COG0791 # Protein_GI_number: 15832321 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Escherichia coli O157:H7 # 51 175 63 184 188 106 40.0 2e-23 MIEIQLMKAKLKFIYLLVGLAAIFSSCRTSAPRLDYQALARASILLGVDVNLEDNHKLYL EAADWIGVPYRGGGDSKRGTDCSGLVYQVYRKVYRTQVPRNTEDLKKESNKVAKRNLREG DLVFFTSSRSKKKVAHVGIYLKNGKFIHSSTSKGVIVSNLNESYYTKHWISGGRIR >gi|222159301|gb|ACAB01000058.1| GENE 25 31386 - 32093 212 235 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S 
ribosomal protein L34 [Vibrio campbellii AND4] # 1 231 1 240 245 86 25 5e-16 MITIDKLKKNFGEKVAVDIEHYEINQGDMLGLVGNNGAGKTTLFRLMLDLLKADDGKVII NDIDVSQSEDWKSITGAFIDDGFLIDYLTPEEYFYFIGKMYGLKKEEVDERLIPFERFMS GEVIGHKKLIRNYSAGNKQKIGIISAMLHYPQLLILDEPFNFLDPSSQSIIKHLLKKYNE EHQATVIISSHNLNHTVDVCPRIALLEHGVIIRDIINEDNSAEKELEDYFNVAEE >gi|222159301|gb|ACAB01000058.1| GENE 26 32132 - 33625 1060 497 aa, chain - ## HITS:1 COG:no KEGG:BT_4319 NR:ns ## KEGG: BT_4319 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 497 1 496 496 798 87.0 0 MMIFNELRKHGRLAAKRHPMYEKNKVAKILGYVMGAFWAGYLIFFGTTFAFGFSDMVPNR EPYHVMNAVVLIFILALDFLLRVPLQKTPTQEVKPYLLLPVKRIRVIDFLLIRSGLSLFN LFWLFLFVPFSFITITKFFGISGVITYLIGILLLIIANNYWYLLCRTLINERIWWILLPI AFYGGIGCLLFIPEDSPLFYFFMDLGDGYIQGNIFYFLGTILVIVLLWLINRKIMSGLIY AELAKVDDTQIKHVSEYKFFERYGEVGEYMRLELKMLLRNRRCKGALRNISIVIVAFSVA LSFSSVYDGNFMTSFICVYNFAVFGMIILSQIMSFEGNYIDGLMSRKESIMSLLKAKYYT YSIGEIIPFILMIPAIIMNKLTLLGAFAWFFYTIGFIYFCFFQLAVYNKQTVPLNEKVTS RQNNSAIQMLVNFGAFGVPLILYSLLNALLGETITYTILLVIGLGFTLTSPLWIKNVYHR FMKRRYENMEGFRDSRQ >gi|222159301|gb|ACAB01000058.1| GENE 27 33650 - 36487 2983 945 aa, chain - ## HITS:1 COG:BB0536 KEGG:ns NR:ns ## COG: BB0536 COG0612 # Protein_GI_number: 15594881 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Borrelia burgdorferi # 8 894 15 885 933 300 28.0 7e-81 MKHLLRGLFIAVLFICCNFQLVLAQPMQELPVDKNVRIGKLDNGLTYYIRHNALPEKRVE FYIAQKVGSILEEPQQRGLAHFLEHMAFNGTKNFPGDETGLGIIPWCETKGIKFGTNLNA YTSVDQTVYNISNVPTENINVVDSCLLILHDWSSAIDLADKEIDKERGVIREEWRSRNSG MLRIMTNAQPTMYPDSKYSDCMPIGSIDVINNFPYQDIRDYYAKWYRPDLQGIVIVGDIN VDEIEAKLKKVFADVKAPVNPAERIYYPVADNQEPLIYIGTDKEVKNPSVNIFFKQDATP DSLKNTIAYYATSYMVSMAMNMLNNRLNELRQTANPPFTSAGAEYGEYFLAKTKEAFSIS ASSKIDGIDLATKTILEEAERARRFGFTATEYDRARANYLQAVESAYNEREKTKSGSYVN EYVNNFLDKEPIPGIEVEYTLLNKLAPNIPVEAVNQIMQQLITDNNQVVLLAGPEKEGVK YPTKEEIAALLKQMKSFDLKPYEDKVSNEPLISEELKGGKIVSEKAGDIYGTTKLVLSNG 
VTVYVKPTDFKADQIVMKGVSFGGTSIFPNEEIINIAQLNGVALVGGIGNFSKVDLGKAL AGKRANVAAGIGNTTETVSGSCAPKDFETMMQLTYLTFTSPRKDNEAFESYKNRLKAELQ NADANPMTAFSDTITSVLYGHHPRAIRMKEYMVDQINYDRILEMYKDRYKDASDFTFYLV GNVDLATMKPLIAKYLGSLPSINRKETFKDNHMDIRKGQIKNVFAKAQETPMATIMFLYS GSCKYDLRNNVLLSFLDQALDLVYTAEIREKEGGTYGVSCNGSLGKYPKEELVLQIVFQT DPAKKDHLSAIVVEQLHKMAKEGPSAEHMQKIKEYMLKKYKDAQKENGYWLNNMDEYFYT GVDNTKDYEKLVNSITAKEVQDFLAKLLKQNNEIQVIMTVPEENK >gi|222159301|gb|ACAB01000058.1| GENE 28 36569 - 37369 860 266 aa, chain - ## HITS:1 COG:FN1224 KEGG:ns NR:ns ## COG: FN1224 COG2877 # Protein_GI_number: 19704559 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase # Organism: Fusobacterium nucleatum # 12 266 30 284 286 263 50.0 2e-70 MIELKNNPAGNFFLLAGPCVIEGEEMAMRIAERVVKITEALQIPYVFKGSYRKANRSRLD SFTGIGDEKALKVLRKVHETFGVPTVTDIHSADEAAMAAEYVDVLQIPAFLCRQTDLLVA AAKTGKTINIKKGQFLSPLAMQFAADKVVEAGNKNVMLTERGTTFGYQDLVVDYRGIPEM QSFGYPVILDVTHSLQQPNQTSGVTGGMPQLIETVAKAGIAVGADGIFIETHENPAVAKS DGANMLKLDLLEGLLTKLVRIREAIK >gi|222159301|gb|ACAB01000058.1| GENE 29 37493 - 38419 847 308 aa, chain - ## HITS:1 COG:TM0358 KEGG:ns NR:ns ## COG: TM0358 COG1597 # Protein_GI_number: 15643126 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Thermotoga maritima # 10 284 6 281 304 114 28.0 2e-25 MSVEPKKWGVIYNPKAGTRKVKKRWKEIKEYMDSKGVDYDYVQSEGFGSVERLAKILANN GYRTIVIVGGDGALNDAINGIMLSDAEDKENIALGMIPNGIGNDFAKYWGLSTEYKPAVD CIINHRLKKIDVGYCNFYDGKEHQRRYFLNAVNIGLGARIVKITDQTKRFWGVKFLSYVA ALFSLIFERKLYRMHLRINDEHIRGRIMTVCIGSAWGWGQTPSAVPYNGWLDVSVIYRPE FLQIISGLWMLIQGRILNHKVVKSYRTKKVKVLRAQNAAVDLDGRLLPRHFPLEVGVLSE KTTLIIPN >gi|222159301|gb|ACAB01000058.1| GENE 30 38477 - 39397 955 306 aa, chain - ## HITS:1 COG:TP0637 KEGG:ns NR:ns ## COG: TP0637 COG0324 # Protein_GI_number: 15639624 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 
tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Treponema pallidum # 4 286 20 306 316 221 42.0 2e-57 MPDYDLIAILGPTASGKTPFAAALAYELNTEIISADSRQIYRGMDLGTGKDLADYTVNGR TIPYHLIDIADPGYKYNVFEYQRDFLISYESIKQKGCLPVLCGGTGMYLESVLKGYKLMP VPENPELRIRLANHSLEELTEILGRYKTLHNSTDVDTVKRAIRAIEIEEYYAAHPVPERE FPELNSLIIGVDIDRELRREKITRRLKQRLDEGMVDEVRQLIEQGIAPDDLIYYGLEYKY LTLYVIGKLTYEEMFNGLEIAIHQFAKRQMTWFRGMERRGFTIHWMNAELPMKEKIAFVK EKLEGI >gi|222159301|gb|ACAB01000058.1| GENE 31 40273 - 42888 1695 871 aa, chain - ## HITS:1 COG:no KEGG:BT_4324 NR:ns ## KEGG: BT_4324 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 870 13 880 880 1393 77.0 0 MAFFFGMLFCFTSLSGQNTQDTILAYFNLLEKVPQEKLYLHLDKPFYGAGEKIWFKGYLV NSVTHQDNTQSNFIITELVNRSDSIVERKKIRRDSLGFHNAFALPPTLPAGDYYLRGYSN WMLNQEPEFFYSRNLKIGNSIDNTIVSNIEYQQEDESHYTARVRFTSNTQEAFGNTTIRY RTIENGKIKDKGKRKTDESGLISISLPDLKPIATRQIEVEFDDPQYIYKRTFYLPSFTKD FDVKFFPEGGVLLTVTHQNIAFKAQGSDGFSTEIEGFLFDAKGDTLTAFRSEHDGMGVFT LNPIAGNSYYVIAKSSDGITKRFDLPAAEEKGIALSMTHYKKEIRYEIQKTEATQWPQKL FLIAHTRGKLAILQPVSADRTFGRMNDSLFNAGITHFMLIDQQGNALSERLVFVPDRNPH QWQILADKPTYRKREKVSLQISAKDDNGTPVEGSFSVSITDRRSIQPDSLADNILSNLLL TSDLKGYVENPGYYVLQQDLRTLRTIDFLMMTHGWRRHHIRNVLTSPSLNLTNYMEKGQT ISGRIKGFFGGNVKKGPICILAPKQNIVATTTTDDKGEFIVNTSFRDSTTFLVQARTKRG FAGVDIVIDAPQYPVASPKSPFHDGTSTSFMEDYLLNTRDQYYMEGGMRVYNLKEVVVTG SRKKASSESIYTGGINTYTIEGDRLEGFGAQTAFDAVSRLPGVSVTNGNEIHIRNNPEQP VIVIDDVVYEDDNDILTMIQTSDMSSLSLLRGADAAILGSRGSAGAIVITLKDGKDLPAR PAQGIITCTPLGYSDSVEFYQPTYDTPEKKNDQRSDLRSTIYWNPSLQLNADGKATIEYY TPDSTAPEDIIIEGVDKNGKICRTIQTINKK >gi|222159301|gb|ACAB01000058.1| GENE 32 42949 - 44154 904 401 aa, chain - ## HITS:1 COG:no KEGG:BT_4325 NR:ns ## KEGG: BT_4325 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 401 1 401 401 738 94.0 0 MPTLVWKLLRQHISIGQLAGFFLANLFGMMIVLLSVQFYKDVIPVFTEGDSFMKKDFIIA TKKISTLGSFAGKSNTFSAEDIADLKKQPFTKTIGAFTPSQFKVSAGLGMQEAGIHLSTD 
MFFESVPDEFVDIKLDKWHFDESTHTIPIIIPRNYLNLYNFGFAQSRSLPKLSEGLMGLI QMDIMMRGNGRVEQYKGNIVGFSNRLNTILVPQSFMKWANENFAPNAEAQPARLIIEVSN PADSAIASYFQKKGYETEDGKLDAGKTTYFLRLIVGIVLGVGLFISILSFYILMLSIFLL LQKNTTKLESLLLIGYSPNKVALPYQLLTVGLNVIVLVLSIGLVSWLRSYYIDSIRLLFP QLETGSLWAAISMGVVLFIVVSVINILAVKRKVLSIWMHKS >gi|222159301|gb|ACAB01000058.1| GENE 33 44261 - 44905 197 214 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 18 212 22 213 223 80 28 3e-14 MNSIHLQQTLPQVFADRNSVTSDVWHQDLIFRKEEMYLIEAASGTGKSSLCSYIYGYRND YQGIINFDEINIKAYSVKQWVDLRKHSLSMLFQDLRIFTELTALENVQLKNNLTGCKKKK EILSFFEQLGIADKINVKAGKLSFGQQQRVAFIRALCQPFDFLFLDEPISHLDDDNSRIM GELIIAEAKTQGAGVIATSIGKHIELPYNHILQL >gi|222159301|gb|ACAB01000058.1| GENE 34 44926 - 46473 1491 515 aa, chain - ## HITS:1 COG:no KEGG:BT_4327 NR:ns ## KEGG: BT_4327 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 515 1 515 515 776 81.0 0 MAKKTISRLSVLAVLIVFLAACSKTSEYTNVIPADASVVASINLKSLASKAGLDDKENEA AKQKVLEALKSGMNAATFQQLEKVMKNPGESGIDVESPFYVFSSSSFPYPTVVGKVNNED NLHASLDVMAKEQICQPISEADGYSFTTMNGGLLAFNNSTVLIVNVSGTTQTKKAKEAIT NLLKQTADNSIVKSGAFQKMEKQKSDINFLASMEAIPATYRNQISMGLPTEVKAEDITLV GGLNFEKGKIALKTENYTENEAVKALIKKQMESFGKANNTFVKYFPSSTLMFFNVGVKGE GLYNLLSENKEFRNTVSIAKADEVKELFGSFNGDISAGLINVTMNSAPTFMMYADVKNGN ALETIYKNKQSLGLKRGEDIIQLGKDEYVYKTKGMNIFFGIKDKQMYATNDELLYKNVGK AADKSIKDAPYASDMKGKNIFVAINAEAILDLPVVKMVAGFGGQEVKTYIELANKVSYLS MSSEGEISEIDLCLKDKDVNALKQIVDFAKQFAGM >gi|222159301|gb|ACAB01000058.1| GENE 35 46514 - 47212 609 232 aa, chain - ## HITS:1 COG:PA0419 KEGG:ns NR:ns ## COG: PA0419 COG1385 # Protein_GI_number: 15595616 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 13 223 17 228 240 97 30.0 2e-20 MHVFYTPDIQKSNELPEEEAQHCTRVLRLGIGDEITLTDGKGNFYKAEITVATNKRCFVT IKETIFQEPLWPCHLHIAMAPTKNMDRNEWFAEKATEIGFDELTFLNCRFSERKVIKTER 
IEKILVSAIKQSLKARLPKLNEMIEFDPFIRQEFKGQKFIAHCYEGEKPLLKNVLKPGED ALVLIGPEGDFSEEEVKKAIEQGFVPISLGKSRLRTETAALVACHTLNLQNQ >gi|222159301|gb|ACAB01000058.1| GENE 36 47221 - 47814 576 197 aa, chain - ## HITS:1 COG:MT1877 KEGG:ns NR:ns ## COG: MT1877 COG1259 # Protein_GI_number: 15841299 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mycobacterium tuberculosis CDC1551 # 6 164 3 157 164 90 36.0 2e-18 MDKKVELQVINITNSQAQVGAFAMLLGEVDGERQLPIIIGPAEAQATALYLKGVKTPRPL THDLFTTSLTILGASLIRVLIYKAKDGIFYSYIYLKKDEEIIRIDARTSDAIALAVRADC PILIYESILEQECLHMSSEKRTRSEETDNDEETEEEHDLPDATSRTLEEALEQAIKDENY ELAARIRDQINSRNKNQ >gi|222159301|gb|ACAB01000058.1| GENE 37 47828 - 49078 1246 416 aa, chain - ## HITS:1 COG:STM3113 KEGG:ns NR:ns ## COG: STM3113 COG0477 # Protein_GI_number: 16766414 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Salmonella typhimurium LT2 # 1 409 1 406 418 413 54.0 1e-115 MSIKVRLIIMNFLQFFVWGSWLISLGGYMGRELHFEGGQIGAIFATMGIASLVMPGIIGI IADKWFNAERLYGLCHIAGAGCLFYASTATGYDQMYWAMLLNLLVYMPTLSLANTVSYNA LEQYKCDLIKDFPPIRVWGTIGFICAMWAVDLTGFKNSSAQLYVGGASALLLGLYSFTLP ACRPAKSENKSWLSAFGLDALVLFKKKKMAIFFLFSMLLGAALQITNTYGDLFLSSFASI PEYAESFGVKHSVILLSISQMSETLFILAIPFFLRHFGIKQVMLISMFAWVFRFGLFGFG DPGSGLWMLILSMTVYGMAFDFFNISGSLFVEQEANSSIRASAQGLFFMMTNGLGAIIGG YASGAVVDAFSVYADGRLVSREWMDIWLIFAAYALVIGILFALVFKYKHQQESKTN >gi|222159301|gb|ACAB01000058.1| GENE 38 49602 - 50780 572 392 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative [Thermococcus barophilus MP] # 1 392 1 393 396 224 36 8e-58 MHKVYLKPGKEDSLKRFHPWIFSGAIARFDGEPDEGEVVEVYTSKKEFIAEGHFQIGSIA VRVLSFHQEPIDHDFWKRKLEIAYDMRRSIGIATNPTNNTYRLVHGEGDNLPGLVIDVYA KTAVMQAHSAGMHVDRMTIAEALSEVMGDKIENIYYKSETTLPFKADLFPENGFLKGGSS 
DNIAQEYGLQFHVDWLKGQKTGFFVDQRENRSLLERYAKDRSVLNMFCYTGGFSFYAMRG GAKLVHSVDSSAKAIDLTNKNVELNFPGDSRHEAFAEDAFKYLDRMGDQYDLIILDPPAF AKHKDALRNALQGYRKLNAKAFEKIKPGGILFTFSCSQVVSKDNFRTAVFTAAAMSGRSV RILHQLTQPADHPVNIYHPEGEYLKGLVLYVE >gi|222159301|gb|ACAB01000058.1| GENE 39 50886 - 51554 428 222 aa, chain - ## HITS:1 COG:no KEGG:BT_4332 NR:ns ## KEGG: BT_4332 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 203 1 203 216 376 92.0 1e-103 MIVRRTIDKDELKELPKTVFPGRIYVIQSEAETERAVAYLQSRSVIGIDSETRPSFTKGQ SHKVALLQISSEECCFLFRLNMTGLTQPLVDLLENPAVIKVGLSLKDDFMMLHKRAPFTQ QSCIELQDYVRQFGIQDKSLQKIYAILFKEKISKSQRLSNWEADVLSDGQKQYAATDAWA CLNIYNLLQELKQTGDWEIAALPPAPKEREEVTIGPISNQQS >gi|222159301|gb|ACAB01000058.1| GENE 40 51563 - 52225 518 220 aa, chain - ## HITS:1 COG:no KEGG:BT_4333 NR:ns ## KEGG: BT_4333 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 220 1 219 219 347 88.0 2e-94 MEKESQTIFDKNVIEFVTVAAEFCAFLERAERMKRSTFVDTSLKILPLLYLKASMLPKCE TIGDEAPETYVTEEIYEILRINLSGLMGDKDDYLDVFVQDMVYSDQPIKKSISEDLADIY QDIKDFIFVFQLGLNETMNDSLAICQENFGTLWGQKLVNTLRALHDVKYNQLEEEEEENG NEEGFYEPSDDNDCCEEEGCHCHDDDCHCHEDGCHCHDDE >gi|222159301|gb|ACAB01000058.1| GENE 41 52352 - 54844 2356 830 aa, chain + ## HITS:1 COG:BS_spoIIIE KEGG:ns NR:ns ## COG: BS_spoIIIE COG1674 # Protein_GI_number: 16078743 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Bacillus subtilis # 242 815 221 776 787 400 41.0 1e-111 MAKKKLDKEAESTPSSPSKIVAVCKNETVHFVIGLMLVIFSVYLLLAFSSFFFTGAADQS IIDSGSSADLAAVNNQVKNYAGSRGAQLASYLINDCFGISSFFILVFLAVAGLKLMRVRV VRLWKWFIGCTLLLVWFSIFFGFAFMDHYQDSFIYLGGMHGYNVSRWLISQVGVPGVWMI LLITAICFFIYISARTVIWLRKLFALSFLKREKKEKEEVTPEGEGDQEFTTSQPQEVEFN LKRTYKQTPPPASVMDIQAEEPEDEFHVNQPEPEESPLSDESEGVTMVFEPTVPNPVPAV QDEPLEEAEPGFEVEPATSEEEYEGPELEPYNPTKDLENYRFPTIDLMKHFENDDPTIDM 
DEQNANKDRIINTLRSFGIEISTIKATVGPTVTLYEITPEQGVRISKIRGLEDDIALSLS ADGIRIIAPIPGKGTIGIEVPNKNPKIVSGQSVIGSKKFQESKYDLPIVLGKTITNEVFM FDLCKMPHVLVAGATGQGKSVGLNAIITSLLYKKHPAELKFVLVDPKKVEFSIYSVIENH FLAKLPDGGEPIITDVTKVVQTLNSVCVEMDTRYDLLKMAHVRNVREYNEKFINRRLNPE KGHKFMPYIVVVIDEFGDLIMTAGKEVELPIARIAQLARAVGIHMIIATQRPTTNIITGT IKANFPARIAFRVSAMMDSRTILDRPGANRLIGKGDMLFLQGADPVRVQCAFIDTPEVEE ITKFIARQQGYPTPFFLPEYVSEDSNSEVGDIDMGRLDPLFEDAARLVVIHQQGSTSLIQ RKFAIGYNRAGRIMDQLEKAGIVGPTQGSKARDVLCIDDNDLEMRLNNLQ >gi|222159301|gb|ACAB01000058.1| GENE 42 54864 - 55511 559 215 aa, chain + ## HITS:1 COG:no KEGG:BT_4335 NR:ns ## KEGG: BT_4335 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 215 1 215 215 316 81.0 4e-85 MRKYIFSVLIALLSLPVIAQQQQSQAKVILDKTAEAFRKAGGVKADFTVKAVANGLVEGA ENGTIQLKGEKFVLKTSDIITWFDGKTQWSYVTKNDEVNVSNPTQEELQQINPYTFLYMY QKGFSYKLGATKTFRGKAVWEVVLTARDKKQELERITLFVTKDTYEPLYILLQQRGQQTR NEITVTSYQTGQNYTDRVFIFDKKQYPNAEVIDLR >gi|222159301|gb|ACAB01000058.1| GENE 43 55548 - 56498 707 316 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 10 312 5 305 306 276 47 2e-73 MAEIEKVKCLIIGSGPAGYTAAIYAGRANLCPVLYEGLQPGGQLTTTTDVENFPGYPEGI SGPQLMEDLRAQASRFGTDIRFGIATAADLSKAPYKITIDGDKVIETESLIIATGATAKY LGLEDEKKYAGMGVSACATCDGFFYRKKVVAVVGGGDTACEEAVYLAGLASKVYLIVRKP FLRASKIMQERVMNHEKIEVLFEHNAVGLFGDNGVEGVNLVKRWGEPDEERYSLPIDGFF LAIGHQPNTEIFKEYIDTDEVGYIITDGDSPRTKVPGVFAAGDVADPHYRQAITAAGSGC KAALEAERYLSSKGLV >gi|222159301|gb|ACAB01000058.1| GENE 44 56694 - 59123 2157 809 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 31 636 31 605 757 468 40.0 1e-131 MKKSISLLLLSLLMIAPGCQKTKEVVNEYNIVPKPNQILPQEGRFELNSKVCLVVPSDAP EVKSIADSLAGQLKLTAGISLKEAESADGKSAICFVVQEGMPKEGYKLSVTPSLITLTAS 
QPNGFFYGVQTLYQLLPPAVYGNQLDKKANWSVPAVEIEDSPRFVHRGLMLDVCRHYVPI DYIYKFIDLLAMNKMNVFHWHLTDDQGWRIEIKKYPKLTEIGSKREKTLVDYYYVNYPQV FDGKEHGGYYTQEQIKDVVAYAASKYINVVPEIEMPGHALAALAAYPELSCDSTQTYKVS PTWGVFEQVFCPSETTFKFFEGVMDEVIELFPSEYIHIGGDECPKTAWKNSAFCQQLIRQ LGLKDDTAPSKTDGIKHSKEDKLQSYFVTRMEKYLNSKGRNIIGWDEILEGGLAPNATVM SWRGVEGGMNAAKAGHNAIMTPNPYVYLDYYQEEPEIAPTTIGGYNTLKKTYSYNPVPDD ADELAKKHIIGIQGNIWREYMQTSERTDYQAFPRAMAIAETAWTQNANKDWKNFCERMVT EFERLEVMNTKPCLNFFDVNVNTHADENAPLMVLLESFYPNAEIRYTTDGSEPNKASTLY EKPFVLEGNIDLKAAAFKDGKMLGKVAHKPLYGNLLTGKPYTVNYKMGWTGDIFDENDVL GADKTTFGLTNGKRGNNASYTPWCSFGIVEGKDLEFIVNLDKPTQISKIIFGSLFNPAMR ILPAGGVAVEVSADGKQYTPVAEKALKHDYPETGRIAFTDSIEFEPTQATFLKVKIKNGG TLRNGVNFEKNNGPEVIPAELWIDEIEAY >gi|222159301|gb|ACAB01000058.1| GENE 45 59281 - 60570 1025 429 aa, chain - ## HITS:1 COG:BB0061 KEGG:ns NR:ns ## COG: BB0061 COG0526 # Protein_GI_number: 15594407 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Borrelia burgdorferi # 318 417 5 104 117 111 48.0 3e-24 MKRIITKMYLCLLAFCITGGISAQTQNSMTEVIPFKTIDGKIIVEATINGEVADFVLDLS GHNALLPEALKKLHINTEKRGTFSSYQDFVFKQVPVGKVYEMGTVAIGKNTFANDLPAFT LEDEPYLRKLGVMGVLSGAVFRTSVLTIDMQRKKITITQPYRPSYMKLNYRENFNLITGL GVVCPINIQGKPISFVLDTWSEGLVNLTEADFNTWSAQYTKGSNQKVSNGYKEISQDEES LILPETMFVKTKIEDAIAVKNPFLKRSVLGKKILDYGIISIDYIHQKIYFQPFDMVPIPE AEAKVTETKVEDGKLNPITRQFFLEHIFDYRKGNDFVYNGDKPVVIDFWATWCGPCMRLL PEMEKLAEKYKGKVIFYKVNADKEKDLCSHFSVQALPTLFFIPVGGKPIIEVGATPEKYV QIIEEQLLK >gi|222159301|gb|ACAB01000058.1| GENE 46 60591 - 61751 894 386 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237715615|ref|ZP_04546096.1| ## NR: gi|237715615|ref|ZP_04546096.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 386 1 386 386 741 100.0 0 MKRIGILTGLWLLLFLNCGQEMVAQIQNKVCDTIPYEFIQEKIIIPVTVNGVNVKYIVDT GGRTGTMYDAATEMKATAAGYMRISDVNAQGSNYQEAHVQNVSIGKNYKIKQLKTMVLPK NPFFTDLGVVGILGGDAFAQSVVTFDSRSKIMVINYPYRPEKLKVTDGIPLLDETEHHSI VNVRLGNNDLKVLFDTGAGGFLLYSTEDYNRLADISQVTNHGYGIVAAGITGLGKPVDIK KVSVPSINIMGKEFTNVGSTTTVMNGTIIGVDLLQYGKVIIDYMRRRFYFLPFEEGKIDM GGAPALWNVSILPRNERFEITTIWDSMKDTVAFGDEVININGTSLKDCPMSQMAVEEIMN AIPGDTGYIIIKKDNQEKRIEIRKEK >gi|222159301|gb|ACAB01000058.1| GENE 47 61753 - 62892 756 379 aa, chain - ## HITS:1 COG:no KEGG:HCH_00467 NR:ns ## KEGG: HCH_00467 # Name: not_defined # Def: PDZ domain-containing protein # Organism: H.chejuensis # Pathway: not_defined # 22 373 39 388 395 96 24.0 1e-18 MKYTRLIGTVIAFCMAVPLFAQQYKATIPYRMVGEKMVIEMKVNGNARPFIFDTGGRTAL TTKACQALQITATDSMKVTDVNNVESYYKTTRIENLTTPDDVINFKNAPSLIINEVKGWE CFGVDGIIGSDLFASTIVSIDSQTKNIIVTSAEKPSTVSLRKMLNFTKEGGMPIVNVQIA PVSNITVLFDTGSPSLLSLIESDFERIKPEASMEVVSEGYGEGSIGVAGQADKASSYRVH IPLLSVGATKFRNLTTHTDKHPYTLLGVKLLQYGKVTIDYPRGRFYFEAFQPDNEINNQC NNFDLTVKDGDLFVSTVWSSTKGKIEVGDKVIKINGKPAKKYDFCESILNGIPELKEKKQ TKLTIETASGIKNIIYKKE >gi|222159301|gb|ACAB01000058.1| GENE 48 62903 - 64609 1414 568 aa, chain - ## HITS:1 COG:alr0996 KEGG:ns NR:ns ## COG: alr0996 COG1404 # Protein_GI_number: 17228491 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Nostoc sp. 
PCC 7120 # 270 515 117 360 488 87 30.0 7e-17 MKKLIFPIVCCFISVTVSAQLIKPKAETQKKQSELDWFNCSFDQDSVYGAEVNKAYDYLK ANKKKAKKRPVVALIGTGMDVEHEDLKHAIWINPKEKSNQKDDDKNGLVDDINGWNFIGG KDGQVMEALTREGEREFFRLKDKYADYIFDGKKYYKIVNGKRQEVPAPENIEEYNYYRYK VMPESRIGGAYGGLQLSYVIEEYVEKFDKDMKKRFPGKELTVEDFQSCYDPKAERDSLSE VAFVFTAYYFSIYNTDKWEPVYQNMGKKSVETGKASYEEALKRYGSDNRKEIVGDNPLDI NDTHYGNNVLLTSDAATGVMKAGVIAAKRDNGIGSNGIADNAEIMTLRIHPGEGEPYLKD MALAIRYAVNHGADVIVLPEQNSIYPKEQKQWVSEALKEAEKKGALVIVPVWDLSMDMDK DEFFPNRKMNKEGELTNFMVVASSDKNGNPVLNTNYGATALDIYAPGTDIYSSYMGDTYQ KGTGEGMASATVAGVAALVKSYFPKLTGSQIRDILLKSVTSRKGVEVEKGIRVNDSPSQD LFLFDDLCISGGIVNAYQAILEAEKVSK >gi|222159301|gb|ACAB01000058.1| GENE 49 64641 - 65516 718 291 aa, chain - ## HITS:1 COG:BS_yneN KEGG:ns NR:ns ## COG: BS_yneN COG0526 # Protein_GI_number: 16078864 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus subtilis # 150 270 36 148 170 72 35.0 7e-13 MNEGVVLSCIIGTNNTFSTDEINIIPDVYKLYLGETEQNIYLENNPVTINGYFDEKNPEQ SSLSFTGIDPFLTLQNYMPAEKDPDIATISTSVKGKLTPSMASALAYLADVNDYQSNKML LDMIPEQDRKSLSAKWLVNRVEILSHQIIGAECPDFTFTDANGKNVSLKDFRGKIVVLDF CASWCGPCRKEMRSMLTIYNELKADDLEFISVSLDDSEAKWRKMLDEEKLPWVMLWDKTG FPKNSKTPSAIQTDYGFYSIPFLVVIDKEGKLAARNVRGEQVREAILKIRQ >gi|222159301|gb|ACAB01000058.1| GENE 50 65678 - 66685 722 335 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237715619|ref|ZP_04546100.1| ## NR: gi|237715619|ref|ZP_04546100.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 335 3 337 337 674 100.0 0 MKKIYILWGLLACMALFTSCYEEDTLTPTEGGIELRFKVPQGNNSWDDDIAQIYEDYNVY LIYKDLQRADFNRSWTGISYGSGYEGQGCVNDEMTNYYVEFMKKHIFAYLNPPITSKVLP MYWYLGYNVYSKSVLEVGGVILASWIVPIHENYDGLDYWCTCMFGEDNPSDPYLIPTDRA LLDRRRKMILAPVLEKAVKAGNIIIPEEFEVGFDHVTNLIAGLGMEDDPNYYLTRGYPGS VNTYVFNSISEPANNSYPPTNEETFIGYMHLSMYYNQARLAEVYPADKYPFLTEKFAFVQ KYLKDKYQIDLEAIANGPEDWDLPLPPIPETPDEE >gi|222159301|gb|ACAB01000058.1| GENE 51 66723 - 68267 1042 514 aa, chain - ## HITS:1 COG:no KEGG:Cpin_3049 NR:ns ## KEGG: Cpin_3049 # Name: not_defined # Def: RagB/SusD domain protein # Organism: C.pinensis # Pathway: not_defined # 1 507 1 465 475 243 33.0 1e-62 MKRIIYILTGIVLLSLSSCSDFLEPKSQSEYVPKDANALQEMLIGSAYPKQDKSNFLLPF LSFLDDDIQFHKTDYEFSINSLKNIEAKQAVYTWQPDMFFIMERNGYPLQNIWEGYYNYI LGANAALDYIGDVNGTEAEKNYVIAQSLGLRAFYYFMLVNHFGAPYNYNKQALGVPLKLD SNLLPEDQLLMTRNTVEEVYNQIVDDLNEAERLFLTLSKDKQYEPNYLVSLPMIQLLKSR VFLYMENWKDAAIYANKVIKDWSFALVDLNNLPSPTVAEPYYNFTSLKSSEVIWLYGSVS DLTVFNDESVEYEEEGYFGNTTTYYREAFIASDNLIESFEDGDLRKEKYIAKEFNKDDKV FYEDSYTTFGKYKLSATGEPSGSENFALSFRLGEAYLNLAEAAAHNNDESTALSALKTLL AKRYEPDKFVEPTGLTGDALKTFIKNERRKELCFEGQRWFDLRRYGMPQIIHRWGEQVYT LKQNDPSYTMPIPDAVLIKNKKLEQNPLAPKRES >gi|222159301|gb|ACAB01000058.1| GENE 52 68287 - 71853 2207 1188 aa, chain - ## HITS:1 COG:no KEGG:BT_3279 NR:ns ## KEGG: BT_3279 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 1186 9 1180 1182 677 35.0 0 MKKRTQPSSKGVSYVSRKFVKFCLLWIFSLSSLMIQAQDNQRISVDLRGESLESTLWYLQ NRTKFIFMYATEDIANITDISVRAKNKTITEILDECLEGTNLTYEVSGTAIVIKKRKTNK ITISGWIRDASGEALPGATVTVRGSKHGAIAGLDGHYTFNIPAQEGLILTYSFIGMEKKT VRYTGKKTINVTLNSSSTEIDEVVVTGYQNIQRRDLVGSITTVKAKDILMPSYTTIDQML QGRVAGMIVTNTSSRVGTAPKIQIRGTSTLLGNQDPLWVVDGIIQEDPLELNASSLMTED LKNIIGNQISWLNPADIESISILKDASATAIYGSKASNGVIEITTKKNTTDRLSVNYTSN FVMGTRPNYGQFNYMNSKERVLFSQEAFNWGTPYGAEPIKQIYTYEGLLNMYLSHDISSE EFLAQRNVLETQNTDWFKLLTRRSFSHNHNISVSGGTNKYSYAASMGYSNSEGQEIGNDS ERMTGRVAITIRPVQKLTINATINGSVSTNNGFAGGVNPMGYATTTNRSIDPDAYYQMKA 
SYPYNSGVKSLSYNFVNERNNSGSKSKSSFMSASLDLKWNILDWLTYQFTGGYSNNNSTN EAWETEQTFYIAETYRGYDFNSVSPNSKEFGAALLPFGGELFTNNAQQYSYNIQNKLQFS KAFNNENRINALVGMELRSSTNKGVSNTVWGYAPDRGEVITLPTTPQAFTPITGSKDEGW GLLKKIYDGGWRKVNTTDNYLSVFATLAYSLKNRYVVNANIRNDASNRFGQDANHRIDPT YSFGFSWRASEEDFMKKYVKWITTLNFRGTYGIQGNAVTRISPDLILNQGKVANLYNRYQ STISQIPNPNLSWERTKSWNFGVDLELFSMFYMNLEYYTRRSNAIVELELPYEYGITSMK RNGGIIHNRGIEYTLTFTPIQKRDYALSISLNASKNWNEGGHTDIDVNAASFLSGRSDIV LKEGYPLSSFWSYSFAGLDGKTGEALFNLLDIPEEERSRQIDPTTYLVYSGQKEPYFTGG LSLSFRYKSLTLNTSFSLLLGNKKRLPSPYNQFASSYYMPDPYTNINRDLLNRWKEPGDE AHTIIPSLPKAGMAYIELPNNENVYSIPLWEQSDAMVVSGSFLRCRNIGLSWQMKREWCE KIYMKNLSLNFNMDNIFVIASKRFNGFDPEVSNSVLPRNYSLGINIGF >gi|222159301|gb|ACAB01000058.1| GENE 53 71883 - 73043 561 386 aa, chain - ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 186 385 129 327 331 100 31.0 5e-21 MNSFKEKFKIAKILASLFTHSSTPEEEKNYHAWLDENPEHQKIANRILNKETYEENSRLI KSFSSQKAWDRIYPLLGGEKTSGFLSWKRSLKYAALLLLLLIPISYLIYNWVAGEPIGEI TPGTHGGELTLSNGNTFNLFENVLPEGATEVFIIDSKGINYQTPANKPKVKEIKNTLRTL HGMECHIVLSDGTKVHLNAESQLTYPICFSDKERIVQVEGEAYFDVAPDKAHPFIVQTPH TSIRVTGTSFNVRAYADEEVESTTLISGAIEISSENEDYELIPNQHFVYDKNSRESTVSN VNTELYTSWESGSFIFMNVPLENVMSYLSKWYGFTYTFEDDAARQVQIGAYLNRYANMNP IIDMIMELNMVDIKQREGILHISYKQ >gi|222159301|gb|ACAB01000058.1| GENE 54 73111 - 73665 289 184 aa, chain - ## HITS:1 COG:no KEGG:Dfer_2829 NR:ns ## KEGG: Dfer_2829 # Name: not_defined # Def: RNA polymerase, sigma-24 subunit, ECF subfamily # Organism: D.fermentans # Pathway: not_defined # 8 181 17 196 206 92 35.0 6e-18 MNIHIENIIADIRRGNKQAFKKLFDDYYPILCVFASHYIEDKEVCKDIAQDVLLAYWERK EDFDDILKVKSFLYTVTRNKCLNHLKHEQLDIPNFSGQEEFDSGFEAAIIEQETFHMVRK AVEELPTQMRNIILYSMKGLKNHEIADKLQISEGTVHTLKKFAYRKLRESLKGINYTLLL FLCK >gi|222159301|gb|ACAB01000058.1| GENE 55 73700 - 74401 624 233 aa, chain - ## HITS:1 
COG:slr0449 KEGG:ns NR:ns ## COG: slr0449 COG0664 # Protein_GI_number: 16332256 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Synechocystis # 18 233 16 233 238 79 25.0 6e-15 MVKMNMSDIDISEPLSDMLAPLNNEQKEFLMNNYTIQTYKKNETIYCEGETPSHLMCLIS GKVKIFKDGVGGRSQIIRMIKTREYFAYRAYFAKQDFVTAAAAFEPSVVCLIPMSAITTL ISQNNDLAMFFIRQLSIDLGISDERTVNLTQKHIRGRLAESLIFLKESYGLEEDGSTLSI YLSREDLANLSNMTTSNAIRTLSQFATERLITIDGRKIKIIEEEKLKKISKIG >gi|222159301|gb|ACAB01000058.1| GENE 56 74518 - 76707 2300 729 aa, chain - ## HITS:1 COG:slr0288 KEGG:ns NR:ns ## COG: slr0288 COG3968 # Protein_GI_number: 16331104 # Func_class: R General function prediction only # Function: Uncharacterized protein related to glutamine synthetase # Organism: Synechocystis # 5 729 7 724 724 623 44.0 1e-178 MSKMRFFALQELSNRKPLEVTAPSNKLSDYYGSHVFDRKKMQEYLPKEAYKAVTDAIEKG TPISREIADLIANGMKSWAKSLNVTHYTHWFQPLTDGTAEKHDGFIEFGEDGGVIERFSG KLLIQQEPDASSFPNGGIRNTFEARGYTAWDVSSPAFVVDTTLCIPTIFISYTGEALDYK TPLLKALAAVDKAATEVCQLFDKNITRVYTNLGWEQEYFLVDSSLYNARPDLCLTGRTLM GHSSAKDQQLEDHYFGSIPPRVTAFMKELEIECHKLGIPAKTRHNEVAPNQFELAPIFEN CNLANDHNQLVMDLMKRIARKHHFNVLLHEKPYSGVNGSGKHNNWSLCTDTGINLFAPGK NPKGNMLFLTFLVNALMMVYKNQDLLRASIMSASNSYRLGANEAPPAILSCFLGSQLSST LDEIVRQVGNEKMTPEEKTTLKLGIGRIPEILLDTTDRNRTSPFAFTGNRFEFRAAGSSS NCAASMIAINAAMANQLNEFRASVEKLMEEGVGKDEAIFRILKETIIASEPIRFEGDGYS EEWKQEAARRGLTNICHVPEALMHYIDNQSKSVLIGERIFNETELNSRLEVELEKYTMKV QIEGRVLGDLAINHIVPTAVAYQNRLLENLRGLKEIFPAEEYEVLSADRKELIREISHRV TSIKVLVREMTEARKVANHMENYKERAFEYEEKVRPYLDQIRDHIDHLEMEVDDEIWPLP KYRELLFTK >gi|222159301|gb|ACAB01000058.1| GENE 57 76848 - 77321 289 157 aa, chain - ## HITS:1 COG:no KEGG:BT_4340 NR:ns ## KEGG: BT_4340 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 104 156 1 53 66 67 69.0 2e-10 METKQVTVQPAVGGMIKASKDSFVIPKEELKDFYLKFAFLLNPDSCSINRTEFELLNILL 
KDLKKILAALTHLNTHPWDEGIAEIQISCGVYSLQDNLSKEKRMQMNTSTGKHLQFLTQM AMDSPVFKQLFKNYHNHYIQVESLVKQMAKEMDQQKQ Prediction of potential genes in microbial genomes Time: Wed May 18 02:34:35 2011 Seq name: gi|222159300|gb|ACAB01000059.1| Bacteroides sp. D1 cont1.59, whole genome shotgun sequence Length of sequence - 54220 bp Number of predicted genes - 35, with homology - 34 Number of transcription units - 17, operones - 7 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 478 - 537 5.8 1 1 Tu 1 . + CDS 558 - 1382 591 ## BDI_2654 hypothetical protein + Term 1398 - 1448 7.1 - Term 1447 - 1485 -0.8 2 2 Tu 1 . - CDS 1496 - 1597 63 ## - Prom 1632 - 1691 4.7 + Prom 1392 - 1451 6.1 3 3 Tu 1 . + CDS 1551 - 3758 1523 ## BT_4341 hypothetical protein + Prom 3865 - 3924 7.1 4 4 Tu 1 . + CDS 3966 - 5096 664 ## COG3344 Retron-type reverse transcriptase + Term 5303 - 5353 -0.4 - Term 5022 - 5056 2.2 5 5 Tu 1 . - CDS 5147 - 11116 4287 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits - Prom 11230 - 11289 7.8 + Prom 11206 - 11265 6.3 6 6 Tu 1 . + CDS 11285 - 13084 3063 ## PROTEIN SUPPORTED gi|237715632|ref|ZP_04546113.1| 30S ribosomal protein S1 + Term 13107 - 13146 8.9 + Prom 13225 - 13284 6.2 7 7 Op 1 . + CDS 13436 - 14374 761 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III 8 7 Op 2 . + CDS 14349 - 15905 1086 ## COG2989 Uncharacterized protein conserved in bacteria 9 7 Op 3 . + CDS 15930 - 16481 514 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 10 7 Op 4 . + CDS 16529 - 16906 460 ## BT_4348 hypothetical protein 11 7 Op 5 . + CDS 16934 - 17401 430 ## BT_4349 hypothetical protein + Term 17409 - 17453 -1.0 - Term 17441 - 17485 10.8 12 8 Op 1 . - CDS 17620 - 18408 1006 ## COG3956 Protein containing tetrapyrrole methyltransferase domain and MazG-like (predicted pyrophosphatase) domain 13 8 Op 2 . 
- CDS 18446 - 19615 782 ## COG1373 Predicted ATPase (AAA+ superfamily) - Prom 19638 - 19697 5.0 - Term 19630 - 19704 19.1 14 9 Op 1 . - CDS 19751 - 20521 236 ## gi|237715640|ref|ZP_04546121.1| predicted protein 15 9 Op 2 . - CDS 20533 - 21120 358 ## gi|237715641|ref|ZP_04546122.1| predicted protein 16 9 Op 3 . - CDS 21132 - 22406 643 ## COG3152 Predicted membrane protein 17 9 Op 4 . - CDS 22413 - 23309 524 ## gi|262408652|ref|ZP_06085198.1| predicted protein - Prom 23349 - 23408 6.8 18 10 Tu 1 . - CDS 23534 - 24091 417 ## Coch_0427 hypothetical protein 19 11 Op 1 . - CDS 24485 - 25387 311 ## BT_2509 putative transcriptional regulator 20 11 Op 2 . - CDS 25467 - 26537 801 ## BT_4351 hypothetical protein - Prom 26563 - 26622 5.4 + Prom 26575 - 26634 2.2 21 12 Op 1 . + CDS 26660 - 27298 655 ## BT_4352 hypothetical protein 22 12 Op 2 . + CDS 27363 - 29999 2689 ## COG0525 Valyl-tRNA synthetase + Term 30030 - 30069 6.0 + Prom 30031 - 30090 3.9 23 13 Tu 1 . + CDS 30116 - 32011 1045 ## COG0642 Signal transduction histidine kinase + Term 32103 - 32165 6.0 - Term 32083 - 32139 1.2 24 14 Tu 1 . - CDS 32325 - 32900 367 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 32971 - 33030 5.4 + Prom 32864 - 32923 4.9 25 15 Tu 1 . + CDS 33035 - 34057 634 ## COG3712 Fe2+-dicitrate sensor, membrane component + Prom 34144 - 34203 6.6 26 16 Op 1 . + CDS 34304 - 37585 2394 ## COG4206 Outer membrane cobalamin receptor protein 27 16 Op 2 . + CDS 37597 - 39324 1055 ## Cpin_1801 RagB/SusD domain protein 28 16 Op 3 . + CDS 39336 - 40004 534 ## gi|237715654|ref|ZP_04546135.1| predicted protein + Term 40013 - 40055 3.2 + Prom 40067 - 40126 4.5 29 17 Op 1 . + CDS 40146 - 41603 1027 ## COG3119 Arylsulfatase A and related enzymes + Prom 41605 - 41664 2.8 30 17 Op 2 . + CDS 41695 - 43170 835 ## COG3119 Arylsulfatase A and related enzymes 31 17 Op 3 . + CDS 43160 - 45916 1466 ## Phep_2759 alpha-L-rhamnosidase 32 17 Op 4 . 
+ CDS 45951 - 47345 748 ## COG3119 Arylsulfatase A and related enzymes 33 17 Op 5 . + CDS 47369 - 49600 1807 ## COG1472 Beta-glucosidase-related glycosidases 34 17 Op 6 . + CDS 49604 - 51820 1401 ## Phep_0964 alpha-L-rhamnosidase 35 17 Op 7 . + CDS 51799 - 54096 740 ## Phep_0964 alpha-L-rhamnosidase Predicted protein(s) >gi|222159300|gb|ACAB01000059.1| GENE 1 558 - 1382 591 274 aa, chain + ## HITS:1 COG:no KEGG:BDI_2654 NR:ns ## KEGG: BDI_2654 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 274 1 305 305 329 57.0 6e-89 MGRFINPFTDFGFKFLFGREVEKELLIDFLNDLLVGEHVITDIQFLNNEQQPEVKTERGL IYDIYCRTNTGEHIIVEMQNREQPYFKDRALFYLSRAITQQARKGIWNFQLDAVYGVFFM NFVMDKDIPSKIRTDVVLSDRDTGKLFNSKFRQIFIELPNFNKEEDECENDFERWIYILK HMDTLDRMPFKARKAVFERLEKLASKANMTQEERMQYEEEWKIYNDYFNTLDFAEQKGIQ KGIRETARKLKELGVDDDIIIKSTGISKEEIEKL >gi|222159300|gb|ACAB01000059.1| GENE 2 1496 - 1597 63 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHPINILSLNRRISFINVGKYKKEYRKAMELSY >gi|222159300|gb|ACAB01000059.1| GENE 3 1551 - 3758 1523 735 aa, chain + ## HITS:1 COG:no KEGG:BT_4341 NR:ns ## KEGG: BT_4341 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 735 1 719 719 1314 91.0 0 MKEIRRLRDKILIGCILCLFSFIAKGAANNYIMNPYTHQEISADSIIERVMTFAPLYETI VSDYRANLYIKGKMDIQKKNFILRYVPSMFRLQKGVREYLLETYSDLHYTAPNIYDQKVK ASQGTVRGNRGLPGLLEYFNVNIYSSSLLNDERLLSPLAKNGQKYYKYRIDSVMGDPNNL DYRIRFIPRTKSDQLVGGYMVVSSNVWSVREIRFSGRSELITFTCWIKMGDVGKKNEFLP VRYDVEALFKFLGNKVDGNYTASLDYKSIELKEKKVRKKEKRKYNLSESFSLQCDTNAYK TDASTFAIMRPIPLNESEKKLYFDNALRRDTATIQKPSKSQAFWGTMGDLMVEDYKFNLS NIGSVRFSPFINPLLFSYSGSNGLSYRQDFRYNRLFRGDKLLRIVPKLGYNFTRKEFYWS LNADFEYWPQKRGFFRLNVGNGNRIYSSKVLDELKAMPDSIFNFDLIHLDYFKDLYFNFR HTVEVVNGLDIGLGFSAHKRTAVEPSRFVITGDYPMPPPEFMDKFKNTYISFAPRIRIEW TPGLYYYMNGKRKINLHSIYPTFSVDYERGIKGVFKSTGEYERIEFDLQHQIRMGLMRNI YYRFGFGAFTNQDELYFVDFANFSRHNLPVGWNDEIGGVFQVLDSRWYNSSRRYVRGHFT YEAPFLILRHLMKYTRYVQNERIYISALSMPHLQPYLEVGYGIGTHIFDVGVFVSSENWK FGGIGCKFTFELFNR 
>gi|222159300|gb|ACAB01000059.1| GENE 4 3966 - 5096 664 376 aa, chain + ## HITS:1 COG:SA2010 KEGG:ns NR:ns ## COG: SA2010 COG3344 # Protein_GI_number: 15927789 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Staphylococcus aureus N315 # 107 333 51 273 338 161 38.0 2e-39 MDGILITILIVTGFLIYKMISSSRKQEGYKRWKATDGSKEEKTTKVKWEHGAWLRHALGA TVPRAQYVQKYDESAVVWCANLLGMETGQLKEILRNVSVHYREFWMRKRKGGYRMISAPN KTMQSIQATINSRILSPVTMIHPAAVGFRNGHSVVDNANPHLGKRYVLKMDIHDFFFSIR SPRVKKTFEKIGYPENVSKVLGTLCCLHRHLPQGAPTSPSLSNIVGYEMDRKLAALAAEY GLTYTRYADDLTFSGDVFPKEQIIPRIKQIIRDEKFEPNHKKTRFINEYGRKIITGVSIS SGVKLTIPKARKRESRKNVYFILTKGLAEHQRRIGSSDPVYLKRLIGELCYWRSIEPDNS YVSDSIAALKRLEKGY >gi|222159300|gb|ACAB01000059.1| GENE 5 5147 - 11116 4287 1989 aa, chain - ## HITS:1 COG:MA3490 KEGG:ns NR:ns ## COG: MA3490 COG1112 # Protein_GI_number: 20092301 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Methanosarcina acetivorans str.C2A # 377 1825 11 1475 1939 564 30.0 1e-160 MDERLDKVKVQCDYLPLINFAIQQNGASIIHQLSIENTTPVPLKDIQVVITTEPTFGNAA PMAVEQIPANDSIRLSSFNLTLSANYFTQLTERLSGNLKIEITAEAEPIFCQTYPIDILA YDQWGGLNVLPEMLAAFITPNHTAISPIIKRAASILGQWTGNPSLDEYQSRTPDRVRKQM AAIYTAIAEQQIIYSTVPASFEEYGQRVRLADSVMAQKLGTCLDMALLYASCLEAIGLNA LIVITQGHAFAGAWLVPETFPDPTIDDVSLLTKRTAEGIYDITLVETTCMNMGHSSDFDD AVKKANGKLADGNNFLLAIDIKRARYSGVRPIPQRILHGQVWEVDEKETNIQKSAVHATP QSINPYDLSGNETQTVITKQLLWERRLLDLSLRNNLLNIRITKNTLQLIPANLACLEDAL ADGEEFRILHRPADWESPAMDFGIYSSVPESDPVVGFINSELSQKRLRFYLSENDLGKAL THLYRSSRTSIEENGANTLYLALGLLKWYETPSSERLRYAPILLMPVEIIRKSAAKGYVI RSREEETMMNITLLEMLRQNFGITVSGLDPLPTDESGVNVKLIYSIIRNSIKNQRKWDVE EQAILGIFSFNKFIMWNDIHNNANKLVQNKIVSSLINGKIEWEAATEEIDATDMDKQLSP TDIVLPIIADSSQLEAIYEAVHDKTFILHGPPGTGKSQTITNIIANALYKGKRVLFVAEK MAALSVVQNRLAAIGLAPFCLEIHSNKTKKSAVISQLKETTEIIRQTPPEEFKKEAERLL NLRAELNQYIEALHKEYPFGVSLYDAIIHYQSVDVEPCFEIPQPYLDTLDKDTFAQWEEA 
IESLVRTANACGHPYRHPLTGISISEYSSAGKEEASQLLTGFIDLLNTIRQKLDVFSVLL KDTDIHPTRKDFQTIACIIRRILDIPELTPGLLTLPLLNETLNEYREVVVHGQKRDEQRK EIEAGFTKEILSINAKQTVAEWNRVSDQWFLPRYFGQRKIKKAINIYALKTIETEDIKPL LHRIIRYQEEKDAVQKYTGQLPSLFGRFGKNEDWTAIEQIINDMASLHSHLLNYAKDIAK VSQIKQNLSVQLTEGIQTFRDIHAHSFNELYQLSDTLTVIEKKLSGTLGISTEELYTSSA DWITIALSKAQTWKHNLDKLKDWYQWLQAYQTLNKLGIGFVATEYKEKNIPTDQLTDIFC KSFYQAVIQYIIAKEPTLELFNGKIFNDIIAKYKQISAKFEETTKKELFARLASNIPSFT HEAIQSSEVGILQKNIRNNARGISIRKLFDQIPTLLSRMCPCMLMSPLSVAQFIDTDADK FDLIVFDEASQMPTYEAVGAIARGKNVIIVGDPKQMPPTSFFSVSTVDEDNIEMEDLESI LDDCLALSIPSKYLLWHYRSKHESLIAFSNSEYYDNKLMTFPSPDNIESKVRIVNINGYY DKGKSRQNRAEAQAVVDEIARRLRSEELRKKSIGVVTFSIVQQALIEDLLSDLFIFHPEL ETLALECDEPLFIKNLENVQGDERDVILFSVGYGPDAEGRVSMNFGPLNRVGGERRLNVA VSRARYEMIIYSTLRSDMIDLNRTSSIGVAGLKRFLEYAEKGTRNTINSVTAQSTETAAS IENIIADKLRSLGYTVHTDIGCSGYKIDIGIVDTENTSNYQLGIICDGKNYKRTKTARDR EIVQNNVLKALGWDIYRIWTMDWWEKPDEVIAAIQEAIARKKSSKVNAQTTTTTEIDSAP MTAEKESVNKESTDKEKITKEEPIKEESIKEAVPTAERDNNEISFVLKASPATSEKQTAS ASSAQSGIQQKYRSAKITPGSYSPEDFFFSESYSILTSQIRKIIENEAPVSKSLLCKKIL SEWGISRLGARVETQIETALDTLNIYRTEHEGFVFCWKDREQCISYSIYRPVSEREATDI APEEIANAIRQLLTDSISLPVADLIKACAQQFGFARMGSNIDAAMQRGIREAVKRNYAKI ENERVTIAD >gi|222159300|gb|ACAB01000059.1| GENE 6 11285 - 13084 3063 599 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237715632|ref|ZP_04546113.1| 30S ribosomal protein S1 [Bacteroides sp. 
D1] # 1 599 1 599 599 1184 100 0.0 MENLKNVAPIEDFNWDAYENGETVTNVSHDELEKAYDGTLNKVNDREVVDGTVIAMNKRE VVVNIGYKSDGIIPLNEFRYNPDLKVGDTVEVYIENQEDKKGQLVLSHRKARATRSWDRV NAALENEEIIKGYIKCRTKGGMIVDVFGIEAFLPGSQIDVKPIRDYDVFVGKTMEFKVVK INQEFKNVVVSHKALIEAELEQQKKEIIGKLEKGQVLEGTVKNITSYGVFIDLGGVDGLI HITDLSWGRVSDPKEVVELDQKLNVVILDFDDEKKRIALGLKQLTPHPWDALDTDLKVGD KVKGKVVVMADYGAFIEIAPGVEGLIHVSEMSWSQHLRSAQDFMKVGDEVEAVVLTLDRE ERKMSLGIKQLKQDPWETIEEKYPVGSKHTAKVRNFTNFGVFVEIEEGVDGLIHISDLSW TKKVKHPSEFTQIGADIEVQVLEIDKDNRRLSLGHKQLEENPWDVFETVFTVGSVHEGTI IEMLDKGAVVALPYGVEGFATPKHLVKEDGSQAQLDEKLEFKVIEFNKDAKRIILSHSRI FEDVAKAEERAEKKAASGAKKASSGKREDSPMIQNQAASTTLGDIDALAALKEQLEGKK >gi|222159300|gb|ACAB01000059.1| GENE 7 13436 - 14374 761 312 aa, chain + ## HITS:1 COG:slr0050 KEGG:ns NR:ns ## COG: slr0050 COG1234 # Protein_GI_number: 16331469 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Synechocystis # 5 304 2 307 326 201 37.0 1e-51 MEKFELHILGCGSALPTTRHFATSQVVNLREKLFMIDCGEGAQMQLRRSRLKFSRLNHIF ISHLHGDHCFGLLGLISTFGLLGRTADLHIHSPRGLEELFAPMLAFFCKTLTYKVFFHEF ETKEPMLVYDDRSVTVTTIPLKHRIPCCGFLFEEKQRPNHIIRDMVDFYKVPVYELNRIK NGADFVTPEGEVIPNLRLTRPSAPARKYAYCSDTIYRPSLAEQISNVDLLFHEATFAQTE QARAKETYHTTAAQAAQLALDANVKQLVIGHFSARYEDESILLNEASAIFPQTILAKENM CIDVDGGTVYEK >gi|222159300|gb|ACAB01000059.1| GENE 8 14349 - 15905 1086 518 aa, chain + ## HITS:1 COG:BMEI1014 KEGG:ns NR:ns ## COG: BMEI1014 COG2989 # Protein_GI_number: 17987297 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Brucella melitensis # 239 512 272 521 531 149 33.0 2e-35 MEELCMRNRRLLWGAVILFVVCLAACKKEKPQSPIPDSLASLQELFNPAYQIDADSIRLM IRSYLNENPITPWDSALLAHYREKDEFFWLNDSLVSDKPAAQVADSMLFWLGDISRHGIN PNLYSVDSIRERLQQIRSLKLQEGKTMNRLLADVEYQLTAAYLSYVCQLKFGFLPSERRW NDSINRIPLKHCDVTFANAALDSLRANPIAAFHGAQPSSPLYRKMQEELVRVNAWGKTDT TDYYRNRLLVNMERARWQYALDKGQKYVIANVAAFMLQAINEETDSILEMRICVGSVKNK 
TPLLSSRIYYMELNPYWNVPQSIIRKEIIPTYRRDTTYFTRNRMKVYDKNGLQVNPHQVN WAKYAGKGVPYTVKQDNKTGNSLGRIIFRFPNPHSVYLHDTPSRWAFTRNNRAVSHGCVR LQKALDFAFFLLKEPDELLEDRIRIAMDIKPVSEEGKKLPVSAAYRELKHYSLEKYIPLF IDYQTVYLSADNNLRYCEDTYKYDSSLLEAMNNLNLKP >gi|222159300|gb|ACAB01000059.1| GENE 9 15930 - 16481 514 183 aa, chain + ## HITS:1 COG:RSc1055 KEGG:ns NR:ns ## COG: RSc1055 COG1595 # Protein_GI_number: 17545774 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Ralstonia solanacearum # 5 182 2 190 199 82 28.0 5e-16 MIPYNEREVLKLLQEESTQRKGFEMIVAQYSEQLYWQIRRMVLSHEDANDLLQNTFIKAW TNIDYFRAEAKLSTWLYRIALNECLTFLNKQRAMTTVDIDDPEAAIVQKLESDSYFSGDE IQLCLQKALLTLPEKQRMVFNLKYYQEMKYEEMSEIFGTSVGALKASYHHAVKKIEKFLE EID >gi|222159300|gb|ACAB01000059.1| GENE 10 16529 - 16906 460 125 aa, chain + ## HITS:1 COG:no KEGG:BT_4348 NR:ns ## KEGG: BT_4348 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 125 1 128 128 186 85.0 3e-46 MKEEDILLKKLGKENSFKVPDGYFENLTSEVMNKLPEKEKVVFKEESVSTWTRLKPLLYL AAMFVGAALIIRVASTDHKPAAVDEVAVTEVDTEVVSDEMLDVALDRAMLDDYSLYVYLS DASVE >gi|222159300|gb|ACAB01000059.1| GENE 11 16934 - 17401 430 155 aa, chain + ## HITS:1 COG:no KEGG:BT_4349 NR:ns ## KEGG: BT_4349 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 155 1 154 155 225 81.0 3e-58 MRKLIAWVVLLCGFMPVLWAADGCDQHLSREEFRAKQKEFITEQAGLSKKEAAKFFPVYF ELQDKKKKLNDESWDLMRKGKDDKTTEAQYAEINDKIANNRIAADQLDKTYLGKFKKILS SKKIFLVQRAEMRFHREMIKGMNRGKDKGNDSKKK >gi|222159300|gb|ACAB01000059.1| GENE 12 17620 - 18408 1006 262 aa, chain - ## HITS:1 COG:BS_yabN KEGG:ns NR:ns ## COG: BS_yabN COG3956 # Protein_GI_number: 16077126 # Func_class: R General function prediction only # Function: Protein containing tetrapyrrole methyltransferase domain and MazG-like (predicted pyrophosphatase) domain # Organism: Bacillus subtilis # 12 259 233 484 489 217 48.0 1e-56 
MSHTRQEQMEAFGRFLDILDELRVKCPWDRKQTNESLRPNTIEETYELCDALMRDDKKEI CKELGDVLLHVAFYAKIGSETGDFDIKDVCDKLCDKLIFRHPHVFGEVKAETAGQVSENW EQLKLKEKDGNKSVLSGVPAALPSLIKAYRIQDKARNVGFDWEEREQVWDKVKEEIGEFQ DEVANMNKDKAEAEFGDVMFSLINAARLYKINPDNALELTNQKFIRRFNYLEEHTIKEGK SLKDMSLEEMDAIWNEAKKKGL >gi|222159300|gb|ACAB01000059.1| GENE 13 18446 - 19615 782 389 aa, chain - ## HITS:1 COG:TM1265 KEGG:ns NR:ns ## COG: TM1265 COG1373 # Protein_GI_number: 15644021 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Thermotoga maritima # 36 387 42 385 387 112 27.0 2e-24 MDRLYENFNKLLKATSTDFIRYKYTEINWDSHMLGLVGPRGVGKTTMFLQHIKQNMNPKD TLYVSADNMYFADNSLIDLTDKFSKRGGKHLFIDEIHKYPNWSRELKQIFDSYPDMQVLF TGSSILDIYKGTADLSRRAPIYEMQGLSFREYLSMFHQIHVPVYTLEEILEHKVEIPGIA HPLPLFAEYIQHGYYPFSKDITFEIELNQVINQTMENDIPQYANMNVSTGRKLKQLLMII AKSVPFKPVMQKLADIIGVSRNYLSDYLLYIEKAGMIAQVRDDTGGVRGLGKVEKIYLDN TNLIYALGRENSNIGNIRETFFYNQMRVKQDIISSKISDFQIGERTFEIGGKNKGQQQIT GAKEGYIVKDDIETGYGNIIPLWNFGMNY >gi|222159300|gb|ACAB01000059.1| GENE 14 19751 - 20521 236 256 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237715640|ref|ZP_04546121.1| ## NR: gi|237715640|ref|ZP_04546121.1| predicted protein [Bacteroides sp. D1] # 1 256 1 256 256 479 100.0 1e-134 MINNTKQCPFCGEEIQATAKKCRHCGEWLEDSVSNTKNQATTEVSFQRDSNNHKTEVNHL KTPISDFVLILFWTGVIATFISMSHQSGVCHLTNPHKWLQIMQWATYIPEWVADLLSGLV DIIFAYALYIGMKQQTKPMSGLLITNIIITVVVSFLILCMDLISIADEDYIGILISLFVI LGMLITSTIIGVQFIRHFNGLLNKLGWGMLASLIIVISAAALISEDEFSMTNTIISFIEF WIISYILYIQAELLTD >gi|222159300|gb|ACAB01000059.1| GENE 15 20533 - 21120 358 195 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237715641|ref|ZP_04546122.1| ## NR: gi|237715641|ref|ZP_04546122.1| predicted protein [Bacteroides sp. 
D1] # 1 195 1 195 195 355 100.0 1e-96 MKHNNLLTSFTLLSLLCIFCTSCNDMEMAKKACGTWESIRIVEEDGCKEKTYYDFGSVKG SIPGGDLKETTYFTECGEVEDGQEYNMQCKSIIEGTWKIEVGDIYFTYDLSTLKVTFEGI SFPGADRLSESLAQSFIKNYGQSLIQESIEELKEHLYNWYSENENNEDCYQNVNIKGDNM SFDATDGVIKLIRIK >gi|222159300|gb|ACAB01000059.1| GENE 16 21132 - 22406 643 424 aa, chain - ## HITS:1 COG:PA0563 KEGG:ns NR:ns ## COG: PA0563 COG3152 # Protein_GI_number: 15595760 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Pseudomonas aeruginosa # 70 182 3 105 117 88 39.0 3e-17 MEKETKKCPFCGEEILAVAKKCKYCGEWLEEKQTVTKKMVACPVCTEQREEDATSCENKE TPQPSFLSHYFIRVICKQYADFNGMATRKQYWLYILFYNIIAIAASCIDILSGIDFQLFG ESLGYGWLFTIVSLTLTIPSLAIGIRRLHDIGKSGWWFLIVLLPLVGIFWLLFLLCKKGN AANASTPLTITDKIVLNASIVVVAILVVLTYFKSANHLQMPQLQHSDWTLDGDTAGVDTL GNIGYSKNRQSINSGNNAEIKKRIIHDFSCYYAIYTFCYNMKQAITKGYKGEGLLTLTEE LVENDDLLVISNTEFKWLLQQFDFSNFSRLLQAYASGNRSFRSDFTESDLGHTAIECLKK RKVVFNSQWFEIFDNYLPHDEVQIEEFHNLGNKEYQVVFDGFHNISPNFVYNVSYANGLT ITTN >gi|222159300|gb|ACAB01000059.1| GENE 17 22413 - 23309 524 298 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262408652|ref|ZP_06085198.1| ## NR: gi|262408652|ref|ZP_06085198.1| predicted protein [Bacteroides sp. 
2_1_22] # 1 298 75 372 372 527 100.0 1e-148 MESLKAKSKEFEKIKVEIENAISKKELEQKRVDSIQKVNDAERIVQERIADSLYLIRAAQ TWEWLHDKKGKWETIDRSYPKEEHYQVNSLFPQYQVIDQNAYLNGKLVGVCHPAKNNKNN SEKERCFRIDVMTFLCQQDFLNNKYEIQKESPKTQEYIKQKLGLKKQLEPTDSPVYKEMI ANASKLRIAQERLRKGEIDLNTYNRIKTKLGAKFTGTVYQEMADSNSPEINAGKRYLEQL RTDNKSLIGEYTIQRIDGTNFTYQFRNNEGKKTFSVKVSFFVNEKKNVLYTISSLQKK >gi|222159300|gb|ACAB01000059.1| GENE 18 23534 - 24091 417 185 aa, chain - ## HITS:1 COG:no KEGG:Coch_0427 NR:ns ## KEGG: Coch_0427 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 1 185 6 198 199 87 31.0 3e-16 MEKVSLLMKDSSEGDSQKETMIDFILSWTLRRSIQQYSEEKPILYQYCRKILGKLIGIEM TDDVQVTSVETWKQWKYIDLWANIRITCNGKEEFHAVLIENKAYTPTHHNQLARYKAIFD QVCEEYMPNTKRHYILITALDEMPAMLANECKENGYIPFCLGDLNDYEQQDSESDLFNEF WLRYW >gi|222159300|gb|ACAB01000059.1| GENE 19 24485 - 25387 311 300 aa, chain - ## HITS:1 COG:no KEGG:BT_2509 NR:ns ## KEGG: BT_2509 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 299 1 299 300 364 58.0 2e-99 MAKDLFNRYLWLVDTIYRAGTITLDEINRKWLQCEMSNGEEIPNRTFHNHRKAIEELFDI NIECDRHNGCNYYIENTEELKKEGLRKWLLNTFAVNTTLNKSHKLKDKILLEDYPSGEQY LIPAIEAIHNCITLEITYQSYWQDKSYTFQIEPYCIKAFKQRWYIAARSPYYNKVLIYSL ERILEMEQTDLSFECPQTFNPKVYFDNSFGVIVDEEYDVEKIRIKVYGNQCKYFRSLPLH HSQKESETHSNYSIFDYSLRPTYDFCQAILSHGNLVEVLEPQWFRIQIGTMIQKMNQLYQ >gi|222159300|gb|ACAB01000059.1| GENE 20 25467 - 26537 801 356 aa, chain - ## HITS:1 COG:no KEGG:BT_4351 NR:ns ## KEGG: BT_4351 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 356 1 350 355 558 79.0 1e-157 MTDFIQMMKNKPNVRYVALGALLILVWLTQWIPALATIYSQTIYPLISYVLSFFSGLFPF AIGDLFIFLSITGVIVYPFYARLRKKLPWKKILLRDGEYLLWIYVWFYLAWGLNYSQKNF YQRTEIPYTAYTPEIFQEFVDDYITQLNRSYTPVNSINQDLIREETVRIYHQLSDSLGVN RPPHEHPRVKTMLFTPFISMVGVTGSMGPFFCEFTLNGDLLPVNYPATYAHELAHLLGIT SEAEANFYAYQICTRSEAMGIRFSGYFSILGHVLGNAQRLLPEEKYTRLFKRIRPEIIEL 
AKNNQAYWAAKYSPVVGAVQDWIYDLYLKGNKIESGRQNYSEVVGLLISYQEWKKK >gi|222159300|gb|ACAB01000059.1| GENE 21 26660 - 27298 655 212 aa, chain + ## HITS:1 COG:no KEGG:BT_4352 NR:ns ## KEGG: BT_4352 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 212 1 212 212 326 76.0 3e-88 MATSEEIEKYCRNCVSRDFVNGKGLVCKRTGELPAFEEECDSFEQDKELERLAPPKPEDF PVFMTEEEMLAEENLPKGILCAVVACIVGAVAWGLISVSTGRQIGFMPIAIGFMVGFAMR QGKGIRPIFGITGAALSLVSCILGDFLSIIGYISQDYEMGYFQVLAGVDYGEIFSILLKN VMSMTALFYGFALYEGYKFSFRAQKHPEGGKI >gi|222159300|gb|ACAB01000059.1| GENE 22 27363 - 29999 2689 878 aa, chain + ## HITS:1 COG:FN2011 KEGG:ns NR:ns ## COG: FN2011 COG0525 # Protein_GI_number: 19705307 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Valyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 2 872 3 882 887 732 45.0 0 MELASKYNPADVEGKWYQYWLDHKLFSSKPDGREPYTIVIPPPNVTGVLHMGHMLNNTIQ DILVRRARMEGKNACWVPGTDHASIATEAKVVNKLASQGIKKTDLSRDEFLKHAWEWTDE HGGIILKQLRKLGASCDWDRTAFTMDEKRSESVIKVFVDLFDKGLIYRGVRMVNWDPKAL TALSDEEVIYKEEHGKLYYLRYKVEGDPEGRYAVVATTRPETIMGDTAMCINPNDPKNEW LKGKKVIVPLVNRVIPVIEDDYVDIEFGTGCLKVTPAHDVNDYMLGEKYNLPSIDIFNDN GTLSEAAGMYIGMDRFDVRKQIEKDLEAAGLLEKIEAYTNKVGYSERTNVVIEPKLSMQW FLKMQHFADMALPPVMNDDLKFYPAKYKNTYRHWMENIKDWCISRQLWWGHRIPAYFLPE GGYVVAATPEEALAKAKEKTGNAALTMEDLRQDEDCLDTWFSSWLWPISLFDGINNPGNE EIKYYYPTSDLVTGPDIIFFWVARMIMAGYEYEGQMPFKNVYFTGIVRDKQGRKMSKSLG NSPDPLELIEKYGADGVRMGMMLSAPAGNDILFDDTLCEQGRNFCNKIWNAFRLIKGWTN GKGSIPVPPDAHLAVQWFDQRLDAAAVEVADLFSKYRLSEALMLVYKLFWDEFSSWLLEI VKPAYGQPINGFIYSMVLSSFERLLELLHPFMPFITEELWQQLREREPGASLMVTLMKEP LEVNEQFLQEFELAKEIISNVRSIRLQKNIALKEQLELQVVGSHPVEKMNPVIIKMCNLS AINVVFKKAEGAASFMVGTTEFAVPLIDMIDIDAEITRLLAELKHKESFLQGIVKKLSNE KFVNNAPAAVIELERKKQADAESIIRSLKESLTILLKK >gi|222159300|gb|ACAB01000059.1| GENE 23 30116 - 32011 1045 631 aa, chain + ## HITS:1 COG:mlr3786_1 KEGG:ns NR:ns ## COG: mlr3786_1 COG0642 # Protein_GI_number: 13473249 # Func_class: T Signal transduction mechanisms # Function: Signal 
transduction histidine kinase # Organism: Mesorhizobium loti # 370 627 192 462 478 177 36.0 5e-44 MKQVLRTISFLGLFLFIEGFLLSVSARTEITNKEVLVLNSINFNLPWAKNFYWCIHDALQ KQGISAKAESLSVPALVDKMEADAVVTHLREKYPVAPAAIVLIGDPGWIVCRELFDDVWK DVPVIVTNARDRLPASLDVLLSHAPLTETNSVPGEEWRRGYNLTLLKQHYYVKETVELIY KLIPDMQRLAFISDDRYISEETRRDVKEAVEENFPDLRLDLLSTTQLSTEMLLDTLHSYK PNTGIIYYSWFESHNENDNNYLFDHIQEIITNFTPSPLFLLSPENLSNNTFAGGYYVSTE SFCDSLLEILNRILNGEQARNIPGGVGGEGNSYLCYPVLKSYNIPSSRYPDDAVYIDKPQ TFFQQHYVKIIASSVFLLLLIAAIVYYIRILRKTYSRLSEAVEKAEQANQLKSAFLANMS HEIRTPLNAIVGFSNMLPHTEDPVEMREYADIIETNTDLLLQLINDILDLSKIEAGTFDF YPSSIDVNQTMEEIEQSMRLRLKNSDVTLAFTERLPGCLFYIDKNRLIQLLANFVNNAIK FTQTGTICMGYRMTDTDTIYFYVSDTGCGMSNEQCEHVFERFVKYNPFIQGTGLGLSICR TIVERLGGKIGVDSEEGKGSTFWFTLPYRKR >gi|222159300|gb|ACAB01000059.1| GENE 24 32325 - 32900 367 191 aa, chain - ## HITS:1 COG:VC2467 KEGG:ns NR:ns ## COG: VC2467 COG1595 # Protein_GI_number: 15642463 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Vibrio cholerae # 6 182 4 190 190 62 26.0 5e-10 MKGTYQSDNDFLLSAVQRGDQKAFDTLFRRYYPMLCAYGHRFVELEDAEEIVEDSLLWIW ENRETLVIESSLNSYLFKMVYRRALNKLAHIDATQRADTRFYEEMQEMLQDTDYYQIEEL AKRIEDAVAALPESYREAFVMHRFRDMSYKEIAETLGVSPKTIDYRIQQALKQLRVDLKD YLPLLLPLLFP >gi|222159300|gb|ACAB01000059.1| GENE 25 33035 - 34057 634 340 aa, chain + ## HITS:1 COG:PA1301 KEGG:ns NR:ns ## COG: PA1301 COG3712 # Protein_GI_number: 15596498 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 86 301 85 293 327 75 24.0 1e-13 MDKTDRNTIEELLPRYCEGVATEEERLQVEMWMSESDENRRMAKQIHALFLATDTVNVMK KVDTEKALLKVKSRMSGNRHKGTMWWQWAQRAAAILFIPLLVTLMLQYWGGNEQELAQIM EVKTNPGMMTSLTLPDGTLVYLNSESTLSYPSRFDNDTRNVTLQGEAYFEVAKNPEKKFI VSTSHQSQIEVLGTHFNVEAYEKEDRVSATLVEGKIGFIFKRDNVSKKVLMKPGQKLVYD FKDSKVQLYTTSCESEIAWKEGKIIFKDTPLEESLRMLEKRYNVEFIIKNERLKKYSFTG TFTHQRLERILEYFKISSRIRWRYIDSADIKDEKSRIEIY 
>gi|222159300|gb|ACAB01000059.1| GENE 26 34304 - 37585 2394 1093 aa, chain + ## HITS:1 COG:PA1271 KEGG:ns NR:ns ## COG: PA1271 COG4206 # Protein_GI_number: 15596468 # Func_class: H Coenzyme transport and metabolism # Function: Outer membrane cobalamin receptor protein # Organism: Pseudomonas aeruginosa # 181 329 21 169 616 67 33.0 1e-10 MKLFIICLICSVSITWATEVYAQQAMISIDVRNQTVGEILEEIESQSDFDFFFNNKHVDL SRRVSVSVNNSNIFQVLKEVFAGTGVKFSVLDKKIILSTDILVPEQDKKMKVSGKVVDRN DEPVIGATIKEEDGSVGTVTDIEGNFTLSVSSDKAMINVSYIGYQPQLVKVKEGKTLKVV LEEDTKLLDEVVVVGYGTQKKVNLTGSVATISSKDIAETHASNTSTLLAGRLPGVISMAA NGFPGQGASVLIRGKSSWNDAPVLYVVDGIQVNKEAFDAINIDEIENISVLKDASAAVYG SRAANGVILVTTKRGENGRPKFSFSTNLGISTPTAYPELLNAYEYATLWNEAQTNMGYDI NNSSDAHLFYNDEQIQKARTESSDWFGETFKKHSLNQKYNMSVNGGSDRIKYFMSLGYLH DEGMYDGIDYKRYNLRANIDAKINNYINVRLDLEGMESTLEQPQVASASLFEYVVRRSPL EKIYNADGSYYDMGSRHPVAERDDSGYKRNKKNNYRAKLGFDIKIPYVDGLKFNALFSYV RGANHSKSFLTPYSLYLLGDDGSVANERIFGKTSLSEEMYIGNNMTSTLTLTYNKKIKGH SMSGLLVYEQFESLGSTLGASRTNFPFTSIDQLFAGGDDEEQSNWGTPAQDARKGLIGRF NYDYKGKYLAEFSFRYDGSLKFHPDRRWGFFPSVSLGWRLSEEAFMKKFSNLDNLKVRGS YGILGNDAVGGWQWLTNYSFGNAYIFNQAPIKSIVSGGIPNVDLTWEKTATFNFGVDASI YKGLLAVEADVFYKRTYDILGSRNASMPQTFGATLPSENYGIVKVKGFELQVKHDNKIGD FRYHVGGNVSWSRNKVIEKDYAAGAEPWNIPVGKTMGYRACFVALGLFQSDEEAAAWPRF KGTQPTAGDVKYADLNNDGIMDERDKKVVSPYSNTPEIMFGLNLSASWKGFDFSALFQGA ANRNVMLSGFATQMFINGASNLPKYLYEDHWTPDNRDARYPKAWGPDHPSNNKSSTFWLI NGNYVRLKNVELGYSISKDLLSRVGVDRLRIYVGGSNLFSIDHMPGYDPEKQDGGPNYYP QQRVFNMGVNITF >gi|222159300|gb|ACAB01000059.1| GENE 27 37597 - 39324 1055 575 aa, chain + ## HITS:1 COG:no KEGG:Cpin_1801 NR:ns ## KEGG: Cpin_1801 # Name: not_defined # Def: RagB/SusD domain protein # Organism: C.pinensis # Pathway: not_defined # 1 574 1 578 587 255 33.0 4e-66 MKKYIFYLIAFLAVSCEDVLDKKNLGAIDDSYVWSDPNMIELNVNSFYANWMPTGLFERP GLAELGIISDEARSGYNGRAVNWINGVRFGADRTDVPYQKWYYGGIRKANEFLQNIEERY ILPTNATQAQKERRDRFIGEVRFYRAHLYWEMIKVYGGVPIITKVIDKDETDESLLYPKR NTTDESFDFVIKELKEAADLLSITYSGDNWGRITKGAALAYCARIQLYRASPMFNPTNER 
KYWEDAYDTYKDVIELDVYDLHPMFSEIWKEKGENNKEIIWFKDYKKGTITHGWDAGNMM RSQAVGDATANCPVQELVDAFPMKDGTPYVKSNPETNPYDYRDPRLRATVVWNGDTYGPR NEKVYTFVSESTDPNSPMYNFDGIDSHQSATSTGYYMRKMTDESLDGTKGDYGYGKGSYT QWVELRYAEVLLGLAEAANEIGETEEGVEQLKLIRKRAGILPGDDERYGIPENISKDDFR TLVQNERYIELAFENKRYWDLRRWKLAHLKLNNQLTAMEIRKKINGEEISYTYTRRVQRH DKTNKPVFLEKFYFLPLPKEELLRNPNLEQNEEWK >gi|222159300|gb|ACAB01000059.1| GENE 28 39336 - 40004 534 222 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715654|ref|ZP_04546135.1| ## NR: gi|237715654|ref|ZP_04546135.1| predicted protein [Bacteroides sp. D1] # 1 222 8 229 229 452 100.0 1e-126 MKKICLYILVCMFIGYSCDDMNEFEQGTLSGYVLDAVTGEGLADVDVVLEPVVAANGSVT SKPDGHFNLSRLVAGTYRVNVRKNGASIIDNAQDEITITDGCTLDREYKLTPRVSVFDFS VDYDKSDPTKFVVHFKARGNEGNKFNYYSVMWNEYPNFIFADLPNTQRKAVKHATSEEAE VTYEVSGLDLKKGTTYYIRVGVTHIANGGDYNHSKMIPIMFE >gi|222159300|gb|ACAB01000059.1| GENE 29 40146 - 41603 1027 485 aa, chain + ## HITS:1 COG:PM0598 KEGG:ns NR:ns ## COG: PM0598 COG3119 # Protein_GI_number: 15602463 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pasteurella multocida # 45 469 3 456 467 166 28.0 1e-40 MKNNFFISGLCLLTASQTMEAQDNKPENVNILFIQADQHRYDCTGFSGKGLVKTPNIDKL ASEGVIFTNSYSCIPTSCPARQSLISGKWPEQHKGLWNYDITLPVTPFNGPTWTEKLSEK DIKMGYVGKWHVSDRKSPKDFGFDDYVPEWSYNNWRKKNNLPDYVWQDSRWVMGGYDPVD KMQSRTHWLAQRVIEMIKKYQSEGKKWHVRFDTSDPHLPCYPVREFLAMYDKEKIQEWPN YRDDLSNKPYIQRQQIYNWELEDSNWEMWQGYLQRYFANITQLDDAVGMVIEALKEMGVY DNTFIVYTTDHGDAAGSHNMVDKHYVMYEEEVHVPLVMKIPGVSHRIIDRFVNNQLDMAA TFCDMYQLDYKTQGESLLPLIEEKKEASDWREYAFSNYNGQQFGLFVQRMIRDKRMKYVW NLTDTDELYDLESDPWELNNLVYSKEYKAELVRLRKALYEDLKQRKDPLIWQNAAKRQLI DNKKL >gi|222159300|gb|ACAB01000059.1| GENE 30 41695 - 43170 835 491 aa, chain + ## HITS:1 COG:PM1682 KEGG:ns NR:ns ## COG: PM1682 COG3119 # Protein_GI_number: 15603547 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pasteurella multocida # 26 477 3 439 453 140 26.0 5e-33 
MKRSSITKVWLSSCCALPLLGFAQKRPHIILIFTDQQNVNAMSAAGNPFLYTPNMDALAN DGIRFTNAYCTSPVSGPSRASIVTGLMAREAGVEWNDNSKLSEGIHTVGDLLGENGYRTV WAGKWHIPEIYPQRSKNEIKYLHGFELLPFWDAPNKHWLLGAETDPPLTDAVVSFLDGYD EREKPLFLAISYHNPHDICMYPRKVGWETMSDSLLNIRPFGKYKLPEPMGVHPDSLSYLP PLPGNFSKNVDEPEFIIDKRVKPNPYGDEVQLSSRFSGREWQAYLNSYYRLTELVDKEIG EVIEALKRNGMYENSLIIFTSDHGDGMAAHEWAAKLSFYEESVKVPLIMVLPEKWQRGAV NSGLVSLVDLVPTFCDYAGVSPKTNFAGMSLRRAIYPTEERWRDFVVAELADHLKDRTRK GRMIRTGRYKYAIYSSGERNEQLFDLMTDPGETINLAYSKEYWEILKKHRSLLVKWMKER GDNFKVVTYEK >gi|222159300|gb|ACAB01000059.1| GENE 31 43160 - 45916 1466 918 aa, chain + ## HITS:1 COG:no KEGG:Phep_2759 NR:ns ## KEGG: Phep_2759 # Name: not_defined # Def: alpha-L-rhamnosidase # Organism: P.heparinus # Pathway: not_defined # 1 918 1 915 916 847 47.0 0 MRNNIYFIYLLLLAVLPVHSRNTSLFSVYEITCEQQIEPSGIESKSPRFSWKVACQQRGY RQSAYRILVADTEDALTADKGTVWDSGTFLSSNSVLVPFQGKELHSATRYYWKVKVWNEA GEESAWSRQGSFVTGMMKERDWGKAQWIALEKDDSAKNLYPGIHAPLVRKMIGDRKVGGY KLPMFRKELPINKEVKEAIVNISGLGHFDFYINGEKVGNHFLDPGWTNYAKTALYQTFDV TALLKKRNVMGVMLGNGFYNVPRERYFKQLISFGAPKVKLALSLKYADGSSEVVVTDQSW KVCESPITYSSIYGGEDYDVRKYQVSWMNPGFDDDSWKTPVIAKTEMGLKAQVSAPLTVR DILPVVRKYKNSKGNWVFDFGQNFSGIIRLKVKGHSGHKITMKPGELLEKDSTLNQRASG GPYLWNYTLSGKGVEEWQPQFTYYGFRYVEVSDIEKLELIELAGMHTTNSAPEAGSFSCS LPMFNKIYELIDWSIRSNLASILTDCPHREKLGWLEVAHLMQHAMQYRYQLDGLYSKVMG DIKDSQTQEGIVPSIAPEYVRFADGFENSPEWGSAFIIIPWYVYRWYGDKSLLSTYYPYM KRYLEYLSTRADNYIVAYGLGDWFDLGPKHPGYSQLTSNGVTSTGMYYYNASIMSQIAKL LGEEKDEERFKQLATCIRKAYNEKFYNTVTKQYDRNSQTANAISLYFGLVEEQNREVVYQ NLINDIKNRDYALTSGDIGYRYLLRVLEENGDSDIIYKMNTKYDVPGYGWQLAYGATALT ESWQAYGFVSNNHCMLGHLMEWFFSGLGGIGQQTESVGFKKVKIKPQIPIGINSASTSYT SPYGDIACRWVRKSGKIRLYVQIPGNSEAIIYLPAKTVEEITESGIPLKDVGECRILEAH NEDYILVSVGSGNYIFEV >gi|222159300|gb|ACAB01000059.1| GENE 32 45951 - 47345 748 464 aa, chain + ## HITS:1 COG:YPO0829 KEGG:ns NR:ns ## COG: YPO0829 COG3119 # Protein_GI_number: 16121138 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Yersinia pestis # 10 452 27 
487 517 106 23.0 1e-22 MKYNLFSLGILALSSQMVMAYEKGKMNIVFLLADDLRYDAVGYMNKTTIQTPHIDCLAND GTVFTNMYATSAISCCSRASIFTGMYNRRNGITDFSGTLRGEALKNTYPMLMKDNGYYVG FIGKYGVGNYLPQKEFDYWRGLAGQGTYYQKDKDGNPRHLTGLISEQIDEFLANRDETKP FCLSVSFKSPHTESEQDPFPFDKKYSFMYEDEFFDKPETFGENYYRLFPESFRKDGKWEN EGYVRFKNRYGTDEKYQSSVKGYYRLIAGIDEAIGKLRKRLKEMGLDKNTLIIFTSDNGY YLGEHGLEGKWYGHEESIRLPLVIYDPRLEKPVKKIDEIALNIDLTSTMLDYAGIRQLEK MQGESLRPLMEGKKIKWRQEFLYEHLMNLDKKGWYVYIPQTEGLVTKRYKYMRYFVNNQS HTPIYEELFDIKKDPYEKKNLIKNKADLAGRMRTRVDNLIKVIK >gi|222159300|gb|ACAB01000059.1| GENE 33 47369 - 49600 1807 743 aa, chain + ## HITS:1 COG:CC1105 KEGG:ns NR:ns ## COG: CC1105 COG1472 # Protein_GI_number: 16125357 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Caulobacter vibrioides # 37 733 18 729 743 597 46.0 1e-170 MKTFVKLATGMLFLVSCGGNISDKVSTLSIPDKYEQRVDSVLKLMTLDEKIGQLNQYTGN WQATGPVVEDPTKIEQIKAGKVGSMLNIKSVKHTRELQEYAMQSRLRIPLMFGLDVVHGL RTIYPIPLGEAASFDLDLMKRTAAGAAKEASAQGVHWTFAPMIDISRDARWGRVMEGAGE DTWYGCKVAQARVSGFQGDDLSEPHTIMACAKHFAAYGACIAGKDYNTVDISEQTLHEVY LPPFKAAVDAGVASFMNSFNDINGIPATGNTYIQRDLLKGSWNFNGLTVSDWGAIREMIP HGYVSDLKGAAEKAILAGCDIDMESRAYHIHLKKLVEEGTVSEDYIDDAVRRILFKKFEL GLFDNPFLYCDETREKEVVLSEELKNLSREAGAKSIVLLKNDQGSLPLNNPKKIAVIGSL AKSQKDMLGFWANEGIVDEVVTVYEGLKNKYPESDVVYADGYDLATNELHLMDARNAAMQ SDVVIVAVGERFENSGEAKSRADINIHPNHQLLVKELKKTGKNVVVLLMGGRPMIFNEMT PHADAILLTWWLGTEAGNAIADVLAGDYNPSGKLPMTFPAHVGQIPIYYNYKNTGRPENK EIGYSCRYQDIDFEPAYPFGYGLSYTDFLISEPVVKDSVFSLKTPLEVCVKVKNTGKYAG KETVQLYIRDLVASLTRPVKELRGFQQVELQPGEEKEISFMLTEKELGFESACKGWTVEP GLFDVMVGNSALNVKKTRVELKQ >gi|222159300|gb|ACAB01000059.1| GENE 34 49604 - 51820 1401 738 aa, chain + ## HITS:1 COG:no KEGG:Phep_0964 NR:ns ## KEGG: Phep_0964 # Name: not_defined # Def: alpha-L-rhamnosidase # Organism: P.heparinus # Pathway: not_defined # 29 719 43 741 770 671 50.0 0 MNELKKLVCILLSLIGMHTIQAEIITYPAEVTPGSWLCFRKEISVEKDASHNLLKIAADS KYWLWINGELVVREGGLKRGPNPKDTYCDILQDVKGLVPGKNTIALLVWYFGKEGFSHRN 
SPTAGISVDLTIGKQRYISDDSWKVSIHPSFYIPKGIKPNFRLPESNIGFDAEKKVAFWD KDFDDTQWKNVKVIKKELSGWGQLVERPIPMWKDYGLKDYVKVERKSDTLLVAYLPYNAQ VNPYIKLKAKAGRLIDIRTDNYRGGGTPNVYAEYITKSGIQEFEAWGWMNGHQVLYTIPK DVEVLELKFRETGYDTELAGSFSCEKQFYNKLWNKSLRTLYITMRDTYMDCPDRERAQWW GDVVNELGEAFYSLDQNAHLLTRKAILELMNWQRPDSTIFAPIPAGNWNQELPMQMLASV GYYGFWTYYMGTGDKNTIKAVYPNVKKYIHVWKLDEEGLVVPRKGGWTWGDWGDNKDLVL LYNLWYSLALEGFHLMADLLGEKEDSRWACQVNERLKRAFHTKYWNGTFYLSPHYKGKPD DRAQALAVVAGVLPESEYSVIRPFFKQQYHSSPYMEKYVLEALCVMGYYEDALDRMKKRY HDMVESELTTLWEGWGIGNKGFGGGSYNHAWSGGPLTILSQYFAGISPMKPAFKEFAIKP AWNCFEHIQSVTPTQWGEIELNITNESDMIVMKVKIPRNTKGYFYIPDRIKRYRINGKEK INKKREILKKGIWDIELY >gi|222159300|gb|ACAB01000059.1| GENE 35 51799 - 54096 740 765 aa, chain + ## HITS:1 COG:no KEGG:Phep_0964 NR:ns ## KEGG: Phep_0964 # Name: not_defined # Def: alpha-L-rhamnosidase # Organism: P.heparinus # Pathway: not_defined # 13 666 20 680 770 639 49.0 0 MGHRIILTVILLSFTICSFGDNIFNGAQWISVSESQNTPNQWFCFRKQVECERKHSIAEL NIAVDSKYWLWVNDSLVTFEGGLKRGPNPLDTYYDCVDISRFLKKGLNIIAIQVWFWGKD GYCHKNSGRAGLLVDLRLGKKERVLSDKSWKVKVHPAYGESSLPHPNYRLPEANVHFDAR KDIEGWQSLDYEDQLWGYATACGEYPCLPWNKVHLRPFPNWKDSGIIRYDSLKTDEAGKI VGFLPKNISITPYLKIKADAGKLIDIRTDNYKGGSEYNVRAEYITKEGVQSFEAFNYMNG HSVVYSIPQEVEVIELGYRETRFNTELEGSFVCEDEFYNRLWCKALNTMNLNMRDAIQDP DRERSQWWGDAVIVSGEIFYACDLNGKLLVKKAIKNLVDWQKDDGVLYSPVPAGSWDKEL PVQMLASVGKYGIWNYYVYTGDSATIKEAYPAVRKYLSLWKLDERNLVKHRTGGWDWSDW GQDIDVCVLDNAWYSLALEGLANMATLLGDQLIAEDCLFKMKKVREAVNKYYWNGRLYRN PFYNGRTDDRANALAVLAGFATENQWKTIQEYLSNYQAASPYMEKYILEAFFCKGDIKGG LQRMKNRYQYMVNHRLTTLWEDWNIGGAGGGSINHGWAGGPLSLLSQYVAGIAPLTAGWK TFMIRPSDVLFKKIKCSVPLGQGVVVLNMDAYQRKINVDCDVNQNFILAVPRTWFKGASL CILNGKEYDIQELKGIQNKKISFDSEDTQSFYFKIVDQSVQIHII Prediction of potential genes in microbial genomes Time: Wed May 18 02:36:49 2011 Seq name: gi|222159299|gb|ACAB01000060.1| Bacteroides sp. 
D1 cont1.60, whole genome shotgun sequence Length of sequence - 100827 bp Number of predicted genes - 84, with homology - 81 Number of transcription units - 36, operones - 17 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 1139 1083 ## COG1874 Beta-galactosidase - Term 1418 - 1465 11.1 2 2 Op 1 . - CDS 1536 - 1982 547 ## BT_4360 hypothetical protein 3 2 Op 2 . - CDS 2044 - 3150 1016 ## BT_4361 hypothetical protein - Prom 3255 - 3314 6.7 - Term 3192 - 3240 15.1 4 3 Op 1 . - CDS 3392 - 6709 4057 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) - Prom 6846 - 6905 3.8 5 3 Op 2 . - CDS 6907 - 8481 1662 ## BT_4363 putative alkaline phosphatase 6 3 Op 3 . - CDS 8481 - 9668 707 ## BT_4364 hypothetical protein 7 4 Op 1 . + CDS 10273 - 11109 520 ## COG1521 Putative transcriptional regulator, homolog of Bvg accessory factor 8 4 Op 2 . + CDS 11096 - 12409 1043 ## BT_4367 putative outer membrane protein 9 4 Op 3 . + CDS 12475 - 13800 1437 ## BT_4368 hypothetical protein 10 4 Op 4 . + CDS 13805 - 14461 515 ## BT_4369 hypothetical protein 11 5 Tu 1 . + CDS 14549 - 15775 1069 ## COG1253 Hemolysins and related proteins containing CBS domains + Term 15804 - 15842 -0.7 + Prom 15784 - 15843 3.6 12 6 Tu 1 . + CDS 15895 - 18036 2498 ## BT_4371 peptidyl-prolyl cis-trans isomerase + Prom 18127 - 18186 6.3 13 7 Tu 1 . + CDS 18248 - 21601 2293 ## CPS_1799 hypothetical protein + Prom 21607 - 21666 4.8 14 8 Op 1 . + CDS 21705 - 22739 882 ## COG0820 Predicted Fe-S-cluster redox enzyme 15 8 Op 2 . + CDS 22758 - 23798 909 ## BT_4373 hypothetical protein 16 8 Op 3 . + CDS 23803 - 24897 861 ## PROTEIN SUPPORTED gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 17 8 Op 4 . + CDS 24907 - 26148 1040 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 18 8 Op 5 . + CDS 26135 - 26662 475 ## BT_4376 hypothetical protein 19 8 Op 6 . 
+ CDS 26668 - 27441 426 ## BT_4377 hypothetical protein 20 8 Op 7 . + CDS 27448 - 27843 374 ## BT_4378 preprotein translocase subunit SecG + Term 27889 - 27926 -0.5 + Prom 27954 - 28013 3.3 21 9 Op 1 . + CDS 28093 - 29481 1565 ## BT_4379 putative oxalate:formate antiporter + Prom 29496 - 29555 6.7 22 9 Op 2 . + CDS 29579 - 29929 422 ## BT_4380 hypothetical protein + Term 30166 - 30196 1.4 - Term 29895 - 29923 -0.0 23 10 Op 1 . - CDS 29948 - 30232 361 ## BT_4381 hypothetical protein 24 10 Op 2 . - CDS 30269 - 31336 1181 ## BT_4382 hypothetical protein 25 10 Op 3 . - CDS 31409 - 32926 599 ## PROTEIN SUPPORTED gi|225093729|ref|YP_002662469.1| ribosomal protein S15 - Prom 33105 - 33164 5.6 + Prom 34081 - 34140 5.8 26 11 Tu 1 . + CDS 34172 - 34393 176 ## BT_4384 hypothetical protein + Prom 34434 - 34493 4.1 27 12 Op 1 . + CDS 34587 - 36131 1521 ## COG3104 Dipeptide/tripeptide permease 28 12 Op 2 . + CDS 36172 - 36708 749 ## COG0634 Hypoxanthine-guanine phosphoribosyltransferase 29 12 Op 3 . + CDS 36768 - 37337 697 ## COG0563 Adenylate kinase and related kinases + Term 37362 - 37417 -0.1 + Prom 37345 - 37404 2.6 30 13 Op 1 . + CDS 37427 - 38590 1269 ## COG0536 Predicted GTPase 31 13 Op 2 . + CDS 38614 - 39399 440 ## COG1496 Uncharacterized conserved protein 32 13 Op 3 . + CDS 39450 - 40115 653 ## COG3382 Uncharacterized conserved protein 33 13 Op 4 . + CDS 40144 - 40782 518 ## COG0739 Membrane proteins related to metalloendopeptidases - Term 40737 - 40772 1.5 34 14 Tu 1 . - CDS 40826 - 42007 773 ## COG2706 3-carboxymuconate cyclase - Prom 42196 - 42255 3.8 35 15 Tu 1 . - CDS 42786 - 43715 315 ## PROTEIN SUPPORTED gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 36 16 Tu 1 . - CDS 43893 - 44024 171 ## gi|295086997|emb|CBK68520.1| Uncharacterized conserved protein - Prom 44093 - 44152 5.8 - Term 44091 - 44134 5.6 37 17 Tu 1 . - CDS 44171 - 45598 677 ## BT_4408 hypothetical protein - Prom 45668 - 45727 7.1 + Prom 45697 - 45756 4.6 38 18 Tu 1 . 
+ CDS 45846 - 46907 671 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair + Term 47082 - 47138 -0.8 - Term 47068 - 47124 3.0 39 19 Tu 1 . - CDS 47257 - 47403 106 ## - Prom 47626 - 47685 3.4 + Prom 47173 - 47232 11.9 40 20 Op 1 . + CDS 47386 - 49320 1674 ## BT_4410 hypothetical protein 41 20 Op 2 . + CDS 49380 - 50081 809 ## BT_4411 hypothetical protein + Term 50119 - 50164 7.6 - Term 50198 - 50230 -1.0 42 21 Op 1 . - CDS 50242 - 51420 1174 ## BVU_3875 aminopeptidase C - Prom 51443 - 51502 5.1 - Term 51448 - 51488 6.2 43 21 Op 2 . - CDS 51514 - 51651 126 ## - Prom 51842 - 51901 7.1 + Prom 51751 - 51810 7.5 44 22 Tu 1 . + CDS 51976 - 54702 2168 ## BT_4412 hypothetical protein + Term 54741 - 54782 1.4 45 23 Op 1 . - CDS 54805 - 55560 571 ## COG3142 Uncharacterized protein involved in copper resistance - Term 55566 - 55595 2.1 46 23 Op 2 . - CDS 55625 - 57160 1809 ## COG1418 Predicted HD superfamily hydrolase - Prom 57181 - 57240 8.4 - Term 57179 - 57243 7.1 47 24 Op 1 . - CDS 57285 - 57575 302 ## BT_4418 hypothetical protein 48 24 Op 2 . - CDS 57587 - 57901 245 ## BT_4419 hypothetical protein - Prom 58032 - 58091 7.0 - Term 58434 - 58475 2.3 49 25 Tu 1 . - CDS 58687 - 59598 561 ## COG1609 Transcriptional regulators + Prom 59762 - 59821 5.8 50 26 Op 1 . + CDS 60036 - 61355 1078 ## Ccel_0951 hypothetical protein 51 26 Op 2 . + CDS 61370 - 63160 963 ## COG0591 Na+/proline symporter 52 26 Op 3 . + CDS 63192 - 64556 999 ## gi|237715716|ref|ZP_04546197.1| predicted protein 53 26 Op 4 . + CDS 64562 - 67648 2361 ## BT_2461 hypothetical protein 54 26 Op 5 . + CDS 67673 - 69412 1464 ## BT_2460 hypothetical protein 55 26 Op 6 . + CDS 69434 - 70387 955 ## SG0242 hypothetical protein 56 26 Op 7 . + CDS 70405 - 70734 302 ## gi|237715720|ref|ZP_04546201.1| predicted protein 57 26 Op 8 . + CDS 70722 - 72098 937 ## Dfer_0342 hypothetical protein 58 26 Op 9 . + CDS 72098 - 73372 746 ## Ccel_0950 HI0933 family protein 59 26 Op 10 . 
+ CDS 73413 - 75650 1592 ## PRU_0396 histidine acid phosphatase family protein 60 26 Op 11 . + CDS 75637 - 76959 979 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases 61 26 Op 12 . + CDS 76981 - 78066 680 ## Phep_1387 hypothetical protein 62 26 Op 13 . + CDS 78063 - 79091 693 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 63 26 Op 14 . + CDS 79096 - 79962 522 ## COG0524 Sugar kinases, ribokinase family + Term 80001 - 80037 3.0 + Prom 79964 - 80023 3.5 64 27 Tu 1 . + CDS 80048 - 80614 515 ## BT_4451 putative MTA/SAH nucleosidase + Term 80622 - 80659 0.7 - Term 80543 - 80571 -1.0 65 28 Tu 1 . - CDS 80584 - 80943 246 ## COG2832 Uncharacterized protein conserved in bacteria - Prom 80965 - 81024 5.0 + Prom 80960 - 81019 6.1 66 29 Op 1 22/0.000 + CDS 81073 - 81429 459 ## COG0720 6-pyruvoyl-tetrahydropterin synthase 67 29 Op 2 . + CDS 81448 - 81996 217 ## PROTEIN SUPPORTED gi|157803532|ref|YP_001492081.1| 50S ribosomal protein L35 68 29 Op 3 17/0.000 + CDS 82048 - 82788 660 ## COG0247 Fe-S oxidoreductase 69 29 Op 4 . + CDS 82785 - 84170 1128 ## COG1139 Uncharacterized conserved protein containing a ferredoxin-like domain 70 29 Op 5 . + CDS 84167 - 84748 555 ## BT_4457 hypothetical protein + Term 84850 - 84894 8.4 71 30 Tu 1 . + CDS 85083 - 85955 847 ## COG2240 Pyridoxal/pyridoxine/pyridoxamine kinase + Term 85981 - 86030 2.9 - Term 85977 - 86009 -0.3 72 31 Op 1 . - CDS 86073 - 87356 897 ## BT_4460 hypothetical protein 73 31 Op 2 . - CDS 87353 - 87925 405 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 74 31 Op 3 5/0.000 - CDS 87926 - 88582 413 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase 75 31 Op 4 4/0.000 - CDS 88607 - 89563 888 ## COG4589 Predicted CDP-diglyceride synthetase/phosphatidate cytidylyltransferase 76 31 Op 5 . - CDS 89560 - 90216 453 ## COG0558 Phosphatidylglycerophosphate synthase - Prom 90381 - 90440 8.0 + Prom 90184 - 90243 6.4 77 32 Tu 1 . 
+ CDS 90406 - 92256 2052 ## COG2304 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain + Term 92364 - 92424 5.1 78 33 Op 1 . - CDS 92490 - 92633 80 ## - TRNA 92516 - 92588 80.5 # Trp CCA 0 0 79 33 Op 2 . - CDS 92654 - 93007 324 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family - Prom 93028 - 93087 2.6 80 34 Tu 1 . - CDS 93140 - 93718 322 ## COG3663 G:T/U mismatch-specific DNA glycosylase - Prom 93838 - 93897 6.6 81 35 Op 1 . + CDS 93887 - 96958 2482 ## BT_4470 outer membrane protein 82 35 Op 2 . + CDS 96974 - 98569 1279 ## BT_4471 hypothetical protein 83 35 Op 3 . + CDS 98643 - 99350 683 ## BT_4472 hypothetical protein + Prom 99375 - 99434 2.8 84 36 Tu 1 . + CDS 99504 - 100817 346 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 Predicted protein(s) >gi|222159299|gb|ACAB01000060.1| GENE 1 3 - 1139 1083 378 aa, chain + ## HITS:1 COG:XF0840 KEGG:ns NR:ns ## COG: XF0840 COG1874 # Protein_GI_number: 15837442 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Xylella fastidiosa 9a5c # 1 210 401 602 612 140 35.0 4e-33 FNQGWGTILYRTTLPEATPVGTVLKITEVHDWAQIYADGKLLARLDRRKGEFTTTLPALK KGTQLDILVEAMGRVNFDKSIHDRKGITEKVELVSGNQAKELKNWTVYNFPVDYSFIKDK KYNDTKILPAMPAYYKSTFKLDKVGDTFLDMSTWGKGMVWVNGHAMGRFWEIGPQQTLFM PGCWLKEGENEILVLDLKGPAKASIKGLKKPILDVLREKAPETHRKDGEKLKLAGEKVTH EGAFTPGNGWQEVRFAAPAKGRYFCLEALSPQANDNIAAVAEFDVLGADGKPVSREHWKI RYADSEETRSGNRTADKIFDLQESTFWMTVDNVPYPHQLVIDLSKVEIVTGFRYLPRAEK EYPGMIKEYRIFIKKEDF >gi|222159299|gb|ACAB01000060.1| GENE 2 1536 - 1982 547 148 aa, chain - ## HITS:1 COG:no KEGG:BT_4360 NR:ns ## KEGG: BT_4360 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 148 1 148 148 200 84.0 2e-50 MKLSQQSLSIIESAIQKAVAKYVCNCEQTVVTDIHLQPDQTSGQLNIYNDDDEELANIMV EEWATYEGDDFLESVEPSLRNILCRMKDAGDFDKVTILKPYSFVLVDEEKETIAELLLVD DDTILVNDELLKGLDKELDEFLKELLEK >gi|222159299|gb|ACAB01000060.1| GENE 3 2044 - 
3150 1016 368 aa, chain - ## HITS:1 COG:no KEGG:BT_4361 NR:ns ## KEGG: BT_4361 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 368 1 369 369 541 75.0 1e-152 MAKSYSLAFVTIFLLTLGSCQSLEQISIDYLQPADLSFPPQLRKVAIVNNTSNTPDNKLI TTTEKIKEGTPLVSRATAYANGDPKIATESLAEEIAHQNYFEEVVICDSALRANDKLARE STLSQEEVRQLASSLGVDFIIALENLQLKATKSVRFLNEFNCFQGAVDVKVYPTVKVYLP ERSRPMTTLHPNDSIFWEEFGGTAVEAATRMIRDKQMLEEAAVFAGTVPVKYLVPMWKKG TRYLYTGGSVPMRDAAIYVRENSWDDAYELWNQAFEGTKNQKKKMRAALNIAVYYEMKDS LAKAEEWAEKAQQLAKKIDKKNITDSVQASIDDVPNYYLTSLYLAELKERNAQLPKLKLQ MSRFNDDF >gi|222159299|gb|ACAB01000060.1| GENE 4 3392 - 6709 4057 1105 aa, chain - ## HITS:1 COG:PA4403 KEGG:ns NR:ns ## COG: PA4403 COG0653 # Protein_GI_number: 15599599 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Pseudomonas aeruginosa # 3 1103 2 914 916 633 37.0 0 MGFNEFLSSIFGNKSTRDMKEIKPWVEKIKAAYPEVEKLDNDALRAKTEELKKYIHESAT AERAKVEELKASIETLELEDREEVFAQIDKTEKEILEKYEKALDEVLPVAFSIVKATAKR FTENEEIVVTATDFDRHLAATKDFVRIEGDKAIYQNHWIAGGNDTVWNMVHYDVQLFGGV VLHKGKIAEMATGEGKTLVATLPVFLNALTGNGVHVVTVNDYLAKRDSEWMGPLYMFHGL SVDCIDRHQPNSDARRQAYLADITFGTNNEFGFDYLRDNMAISPKDLVQRQHNYAIVDEV DSVLIDDARTPLIISGPVPKGDDQLFEQLRPLVERLVEAQKVLATKYLSEAKKLIASNDK KEVEEGFLALYRSHKALPKNKALIKFLSEQGIKAGMLKTEEIYMEQNNKRMHEVTDPLYF VIDEKLNSVDLTDKGVDLITGNSEDPTLFVLPDIAGQLSELENQHLTNEQLLEKKDELLT NYAIKSERVHTINQLLKAYTMFEKDDEYVVIDGQVKIVDEQTGRIMEGRRYSDGLHQAIE AKERVKVEAATQTFATITLQNYFRMYHKLSGMTGTAETEAGELWDIYKLDVVVIPTNRPI ARKDMNDRVYKTKREKYKAVIEEIEKLVQAGRPVLVGTTSVEISEMLSKMLAMRKIEHKV LNAKLHQKEADIVATAGLSGTVTIATNMAGRGTDIKLSPEVKAAGGLAIIGTERHESRRV DRQLRGRAGRQGDPGSSVFFVSLEDDLMRLFSSDRIASVMDKLGFQEGEMIEHKMISNSI ERAQKKVEENNFGIRKRLLEYDDVMNKQRTVVYTKRRHALMGERIGMDIVNMIWDRCANA IENNDYEGCQMELLQTLAMETPFTEEEFRNEKKEKLAEKTFNIAMDNFKRKTERLAQIAN PVIKQVYENQGHMYENILIPITDGKRMYNISCNLKAAYESESKEVVKSFEKSILLHVIDE 
AWKENLRELDELKHSVQNASYEQKDPLLIYKLESVTLFDAMVNKINNQTISILMRGQIPV QEAPDEQAARRVEVRQAAPEQRQDMSKYRENKQDLSDPNQQAAASQDTREQQKREPIRAE KTVGRNDPCPCGSGKKYKNCHGKNL >gi|222159299|gb|ACAB01000060.1| GENE 5 6907 - 8481 1662 524 aa, chain - ## HITS:1 COG:no KEGG:BT_4363 NR:ns ## KEGG: BT_4363 # Name: not_defined # Def: putative alkaline phosphatase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 524 1 524 524 1002 89.0 0 MKGLLTSLITVLTFTGLQAQSLPSAPKLVVGLTIDQLRTDYLEAFSSLYGEKGFKRLWKE GRVFHNAEYTFCGVDRASAIAAIYSGTTPSMNGIISQRWMDASTLRPVNSTDDTEFMGYY TDQTAAPTKLLTSTIADELKIATQGKGLVYAIAPDCDAALFAAGHAGNGAFWLNPNTGKW SGTTYYGEFPWWASQYNDRQAIDFRIAGMTWEPVFPRGMYTFLPDWRDILFKYKFDDDRN NKYRRFIASPFVNDEVNALAEEALNKSSIGMDDITDLLALTYYAGNYAHKSVQECAMEIQ DTYVRLDRSIENLLEALDKKVGLQNVLIFVTSTGYTDSESADSGLYKIPSGEFYLNRCAA LLNMYLMATYGEGKYVEAHHNQQIYLNHKLLEKKELNLAEIQQKSAEFLMQFSGVNEAYS ANRLLLGSWTPEIHKIRNGYHRKRSGDLVIDVLPGWTIVNENGGENKVVRHSYIPSPLIF MGHSVKPAIIQTPVTIEHIAPTLAHFMRIRAPNACTSAPITDLR >gi|222159299|gb|ACAB01000060.1| GENE 6 8481 - 9668 707 395 aa, chain - ## HITS:1 COG:no KEGG:BT_4364 NR:ns ## KEGG: BT_4364 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 395 1 395 395 666 83.0 0 MKRFFLYTILSFFLLLPSAGQAPANSNDSIRLSLLTCAPGEEIYSLFGHTAIRYENPSQG IDVVFNYGLFSFNTPNFIFRFSLGETDYQLGVTDYEHFAAEYAFYGRSVWQQTLNLTDEE KTKLIQLLQENYHPENRVYRYNFFYDNCATRPRDKIEESIAGKVVYPTEPQDGSRTFRDI VHQYCKGHPWARFGIDLCIGSEADQPITQRQMMFAPFYLMDAFDGAQISTDTYSRPLVKA NELVIDVTPEPDESGWMPTPLQCSLLLFILTAAATIYGIRRRTGLWGIDLVLFGMAGIVG CVLAFLALFSQHPAVSSNFLLLVFHPGQLLFLPYIVYCVRKGKKCWYLTLNLVVLTLFIV LFPAIPQRFDFAVVPLALVLLIRSASNLIVTSKKK >gi|222159299|gb|ACAB01000060.1| GENE 7 10273 - 11109 520 278 aa, chain + ## HITS:1 COG:TM0883 KEGG:ns NR:ns ## COG: TM0883 COG1521 # Protein_GI_number: 15643645 # Func_class: K Transcription # Function: Putative transcriptional regulator, homolog of Bvg accessory factor # Organism: Thermotoga maritima # 38 267 3 239 246 86 30.0 5e-17 
MFGGISKMKKEDNSSSMLQLFFFYFTFAPAKEPKDLNLIIDIGNTKAKIAFFDGGEMVDV VAESNQSLGCLKALCSQYPVEQGIVATVIDLNEKVLADLAALPFPLLWLNHQTPLPVVNL YETPETLGYDRMAAVVGANEQFPHRDILVIDAGTCITYEFIDSKGQYHGGNISPGMQMRF KALHQFTGRLPLVDTNGRKLPMGRDTETAIRAGVMKGMEYEISGYIESMKHKYPELLVFL TGGDDFSFDSSVKSIIFADRFLVLKGLNRILNYNNGRI >gi|222159299|gb|ACAB01000060.1| GENE 8 11096 - 12409 1043 437 aa, chain + ## HITS:1 COG:no KEGG:BT_4367 NR:ns ## KEGG: BT_4367 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 437 1 431 431 635 77.0 0 MVGFKHTLCALLLTMVTGMAIAQNNTNSPYTRYGYGDLSDQSFGNSKAMGGIAFGLRDGA QINPLNPASYTAIDSLTFIFEGGVSLQNMNISGGGVKLNAKNSSFDYLAMQFRLHPRIAM SIGLLPFSNVGYSVSDTQTATDPGTGATADYARNYTGDGGLHQLYAGLGVKVLKNFSVGV NASYFWGDINRTRVVYYPSVSGAYNYNHQSIASVSSYKLDFGAQYTLDIDKKHSVTIGAV YSPELKLGNDYSVTTQMVSNSTGTAVSTTTLNPDATLKVPNTFGVGFTYNYDKRLTVGLD YSLQQWSKAKFGVNTSDDAVREDFDETYTYCDRHKISVGAEYIPNLIGRSYLSHIKYRLG AYYTTPYYKIGGKDAAREYGVTAGFGLPVPRSRSILSISGQFVRVSGQESTFVNENIFRV SIGLTFNERWFFKRRVE >gi|222159299|gb|ACAB01000060.1| GENE 9 12475 - 13800 1437 441 aa, chain + ## HITS:1 COG:no KEGG:BT_4368 NR:ns ## KEGG: BT_4368 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 441 1 433 433 731 93.0 0 MKIKTLVAMLFLSAGATTVVAQDATNCNSNSSISHEAVRAGNFKDAYTPWKAVLENCPTL RFYTFTDGYKILKGLMGQIKDRNNPEYQKYFDELMNTHDLRIKYTDEFLAKGTKVSSADE ALGIKAVDYIAFAPKIDVNQAYQWLSQSVNAVKAESAAATIFYFLQMSLDKLKTDPNHKE QFIQDYLAASEYADAAIAAETNEAKKKNLQGIKDNLVALFVNSGTADCESLQNIYGPKVE ANQTDLAYLKKVIDIMKMMRCTESEAYQQAAFYVYKIEPSADAATGCAYQAFKKGDIDGA VKFFDEAIGLETDNVKKAEKAYAAAAVLASAKKLSQARAYCQKAIGFNENYGAPYILIAN LYAMSPNWSDESALNKCTYFAVIDKLQRAKQVDPSVAEEANSLIGRYSGHTPQAKDLFML GYKQGDRITIGGWIGETTTIR >gi|222159299|gb|ACAB01000060.1| GENE 10 13805 - 14461 515 218 aa, chain + ## HITS:1 COG:no KEGG:BT_4369 NR:ns ## KEGG: BT_4369 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 25 218 1 193 193 335 84.0 5e-91 
MLRKRRNRLLHKSMSITITFGVVVMLLLLSSCGGKQKAMGEAITERDSLPVMTTLGVTTL ISDSGVTRYRVNTEEWMMYDRKKPSYWAFEKGVYMEQFDSIFNVEASIKADTAYYYDKER LWKLIGNVDIQNRKGERFNTELLYWNEATQKVYSDKFIRIQQPDRIITGHGFDSNQQMTI YTIRNIEGIFYVDEEASSGPPQPETKALPDSVNKDAAK >gi|222159299|gb|ACAB01000060.1| GENE 11 14549 - 15775 1069 408 aa, chain + ## HITS:1 COG:FN1486 KEGG:ns NR:ns ## COG: FN1486 COG1253 # Protein_GI_number: 19704818 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Fusobacterium nucleatum # 4 408 17 420 426 155 27.0 1e-37 MVFSAFFSGVEIAFVSVDKLRFEMERKGGITSRILSIFFKNPNEFISTMLVGNNIALVIY GILMAQIIGDNLLAGFIDNHFLMVLAQTVISTLIILVTGEFLPKTIFKINPNLVLNVFAI PLIVCYVVLYPISKLASGLSCIFLRLFGMKVNKDASDRAFGKVDLDYFVQSSIDNAENEE ELDTEVKIFQNALDFSNIKIRDCIVPRTEVVAVDLTTSLDELKSRFIESGISKIIVYDGN IDNVVGYIHSSEMFRAPKNWHENVKQVPIVPETMSAHKLMKLFMQQKKTIAVVVDEFGGT SGIVSLEDLVEEIFGDIEDEHDNTSYISKQIDEREYVLSARLEIEKVNETYGLDLPESDD YLTVGGLILNQYQSFPKLHEVVRVGRYQFKIIKVTATKIELVRLKVLE >gi|222159299|gb|ACAB01000060.1| GENE 12 15895 - 18036 2498 713 aa, chain + ## HITS:1 COG:no KEGG:BT_4371 NR:ns ## KEGG: BT_4371 # Name: not_defined # Def: peptidyl-prolyl cis-trans isomerase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 713 1 712 712 1215 90.0 0 MATLQNIRSKGPLLVIVIGLALFAFIAGDAWKVMQPHQAHDVGEVNGDALSAQEYQNLVE EYTEVVKLMRGVTALNDEQTNQVRDEVWRSYVNNKLIEKEAKALGLTVSAAEIQDILKAG VHPLLRQTPFQNPQTGNFDKDMLNKFLVEYAKMSESQMPAQYAEQYNNMYKYWSFIQKTL IESRLAEKYQALVSKALLSNPVEAQDAFDARVNQYNVLMAGIPYSSVVDSTIVVKESELK DLYNKKKEQFKQYQETRDIKYIDVQVTASAEDRAAIQKEVDEATEQLATTTEDYTSFIRS TGSEAPYVDLFYNKTAFPSDVVARLDSASVGTIYGPYYNGADNTINSFKIVAKAAAADSV EFRQIQVYAEDAVKTKTLADSIYNAIKGGANFADLAKKYGQTGDANWLTSAQYEGAQIDG DNLKFISAINNAGVNELVNLPMGQANVILQVTNKKSVKDKYKVAVIKREVEFSKETYNRA YNDFSQFIAANPSVEKMVANAEEAGYRLLDRMDLSSSEHNIGGVRGTKEALRWAFDKAKP GDVSGLYECGESDHMMVVGLVGIKPEGYRPLKAVQDQLRAEIVKDKKAEKIMADMKAANA TSFNQYKEMTNAVSDSLKMVTFAAPAYVSALRSSEPLVGAYASAAEVNQLSAPIKGNAGV FVLQVYGKDKLNETFDAKAEEATLTNMHARFASRLMNDLYLKANVKDTRFLFF 
>gi|222159299|gb|ACAB01000060.1| GENE 13 18248 - 21601 2293 1117 aa, chain + ## HITS:1 COG:no KEGG:CPS_1799 NR:ns ## KEGG: CPS_1799 # Name: not_defined # Def: hypothetical protein # Organism: C.psychrerythraea # Pathway: not_defined # 14 563 7 532 1241 135 26.0 1e-29 MTKQLAQYSVPAENSRVIRVFLSSTFRDMEMERSALVKLFKGLQVKAASRGATISLVDLR WGITEEDAKSGKVVEICLKEIVNSRPFFIGMVGDRYGWCPSYEDLSQTLNDSLEYRWIED DLNHHLSVTEIEMQFGVLRNPNPLHAYFYIKQSADDERSCPEEESLKLKRLKESILQQDR YPVQEYDSPDQLCNLVEAAFTELLEREFPDVLTVEDNQALQEQLTRNELLFNYHPIPEAD QAFADFLAADEQRCLVVTGGCGLGKSALLAHWSDLVNNDMPMIPIYHRLDSTTLSSYPET LARMLASKCQNALKHQSLGEENLFAAEVSKELDSTQSTAKSLMDAVKNSVLMGDAGLNTF SGNTLRNLKEIEEDLNSIQKFSQLWSALGASRYPIILLIDDISYLNPTEASLFSLFASIP PNVKVVLSFSASSTAYLPFVQNGYAHFQLNGFSQADAKSFSKQYLSTYSKALSAQQEDIL ASWVLAKQPRCLSVLLNELVSFGQYDALNEYMSGYCRLNEVGQFYDSVLRRLSADYGFEE IGRTLLMLSLTLEGFTEDEVKSMADINQILWSQLRVEMSSWLTNKGGRYCIGDTQMVEAI ERYFAQDDECIDDSRHEIISALLDEEEILSHPLTFADYNYRMKQFCYHDSYRYKVEITYQ CYKMQEWDILKDWICDVEIFEILYRTNRSLLEDSWKAIMNDNPEVTPEVYAELDFDEIDS FLIPVIANDMATFLSSSFHLTKAAAAVSEKSMEGAAMPLIAKSVLKMNEGCRYARNEEYE TACDCFLKALVMQENIVPTPELEIANTCRNLALAYYYNEQYNEAVTYLNRALDYHAASAD EKSQAEVIELSEYLAYCDYYKDEEESAAEKFRKVAEMHESLNGRLSGGVAKCLRMQGKCL YYIKRYDEAWMLMNQALDIAIQIDSKKQIVACHKQLYYLCREFKRMMNERGDEQASTLFF RESLLHEMYFSEKPRLAELTVRYEALRCDIMQQYYMNKDYDNVIRIATSLDFHDDADPNA SCLVYYYKAQAYVKLENYPMAKEAFFRELELRKKYLGWEEEDTILACQNLGVLHSFCNER EEALACFREAYGHEVKKNGEDSEMAQKLLQYISVVES >gi|222159299|gb|ACAB01000060.1| GENE 14 21705 - 22739 882 344 aa, chain + ## HITS:1 COG:STM2525 KEGG:ns NR:ns ## COG: STM2525 COG0820 # Protein_GI_number: 16765845 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Salmonella typhimurium LT2 # 2 334 20 363 388 248 38.0 2e-65 MSKYPLLGMTLVELQSLTKRLGMPGFAAKQIASWLYEKKVASIDDMTNLSLKHRELLKQN YEVGAEAPVDEMRSVDGTVKYLYKVGENHFVESVYIPDDDRATLCVSSQVGCKMNCKFCM TGKQGYTANLTASQIMNQIHSLPERDKLTNVVMMGMGEPLDNLDEVLKALELLTATYGYA WSPKRITLSTVGLRKGLQRFIEENDCHLAISLHSPLTVQRAELMPAEKAFSITEMVELLK 
NYDFSKQRRLSFEYIVFKGLNDSQVYAKELLKLLRGLDCRVNLIRFHAIPGVDLEGADMD TMTRFRDYLTSHGLFTTIRSSRGEDIFAACGMLSTAKQEKNNES >gi|222159299|gb|ACAB01000060.1| GENE 15 22758 - 23798 909 346 aa, chain + ## HITS:1 COG:no KEGG:BT_4373 NR:ns ## KEGG: BT_4373 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 346 1 345 345 612 88.0 1e-174 MKKYSFILCIALVAFVVASCGLKGNHTSSGRAYELLVVVDHGVWDRAAGRALHDALDADM PGLPQSEPSFRIMYTSPKDYDSTLKLIRNIIIVDIQDIYTKASFKYAKDVYANPQMILTI QAPNEEEFEKFVEENKKTIVDFFTRAEMNRQITFLEGKHSNFISQKVDSLFGCDIWVDAE LANSKTGDDFFWASTNTGTADRNFVMYSYPYTDKDTFTKEYFVHKRDSVMKANIPGFKEG VYMSTDSLLTDVRPINVQNSYTMEARGLWRMKGDFMGGPYVSHTRLDEKNQRIITAEIFV YSPDKMKRNLVRQMEASLYTLKLPNEVQQNQIPLGEASKEAEQTNK >gi|222159299|gb|ACAB01000060.1| GENE 16 23803 - 24897 861 364 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 [Flavobacteriales bacterium ALC-1] # 2 344 3 343 346 336 47 3e-91 MEDNKIRIGITQGDINGVGYEVILKTFSDPTMLELCTPIIYGSPKVAAYHRKALDIQANF SIVNSATEAGYNRLSVVNCTDDEVKVEFSKPDPEAGKAALGALERAIEEYREGLIDVIVT APINKHTIQSEEFSFPGHTEYIEERLGNGDKSLMILMKNDFRVALVTTHIPVREIATTIT KELIQEKLMIFHRCLKQDFGIGAPRIAVLSLNPHAGDGGLLGMEEQEVIIPAMKEMEEKG ILCYGPYAADGFMGSGNYTHFDGILAMYHDQGLAPFKALAMEDGVNYTAGLPVVRTSPAH GTAYDIAGKGVACEDSFRQAIYVAIDVFRNRQREKVARANPLRKQYYEKRDDSDKLKLDT VDED >gi|222159299|gb|ACAB01000060.1| GENE 17 24907 - 26148 1040 413 aa, chain + ## HITS:1 COG:BMEI0866 KEGG:ns NR:ns ## COG: BMEI0866 COG2204 # Protein_GI_number: 17987149 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Brucella melitensis # 15 263 181 428 528 244 48.0 2e-64 MTKAEIQQVKQRFGIIGNTEALSRAIDVAIQVAPTDLSVLITGESGVGKESFPQIIHQYS RRKHGQYIAVNCGAIPEGTIDSELFGHEKGAFTGAIGERKGYFGEADGGTIFLDEVGELP MSTQARLLRVLESGEFMKVGSSKVQKTDVRVVAATNVNLTQAIAEGRFREDLYYRLNTVP IQIPPLRERGDDVLLLFRKFSADFAEKYRMPAIQLTEDAKKELLAYPWPGNVRQLKNITE 
QISIIETNREISAAILQNYLPAQNTQRLPALMGTRESKGFESEREILYSVLFDMRQEVAE LKKMVHNLMAERAGQVGQVGQMGTVVTTPVVTAHQPSVPAIIHTMQPTVCKDDDDIQDTE EYVEESPLSLDEVEKEMIRKALERHHGKRKSAAKDLNISERTLYRKIKEYELD >gi|222159299|gb|ACAB01000060.1| GENE 18 26135 - 26662 475 175 aa, chain + ## HITS:1 COG:no KEGG:BT_4376 NR:ns ## KEGG: BT_4376 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 175 1 175 175 317 95.0 1e-85 MNWIKKIIRPLLLFGLPLVVIACTVSYKFNGSSINYDKVKTISIADFPIKSEYVYAPLAT KFNEDLKDIFIRQTRLQLLKPSQNADLQIDGEITGYNQYNQAVSADGYSSETKLTITVNV RFVNNTNHAEDFEQQFSAFRTYDSSQLLTAVQDGLIAEMSKEITDQIFNATVANW >gi|222159299|gb|ACAB01000060.1| GENE 19 26668 - 27441 426 257 aa, chain + ## HITS:1 COG:no KEGG:BT_4377 NR:ns ## KEGG: BT_4377 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 257 1 264 264 367 79.0 1e-100 MTSVNFQQWIQHPETLNRDTLYELRNLLARYPYFQSLRLLYLKNLYILHDISFGGELRKA VLYIADRRQLFRLIEGDRYDVQARKKGVPLTEVLKDEPSVDRTLALIDAFLSTVPEEVTA RTSFDYSMDYTSYLLEETPATERPAEETPKLKGYELIDDFIEKSESDSPLCKKPLREEMP SSPTSSDELTEEETMKEEEEDDSCFTETLAKIYVKQQRYSKALEIIKKLSLKYPKKNAYF ADQIRFLEKLIINANSK >gi|222159299|gb|ACAB01000060.1| GENE 20 27448 - 27843 374 131 aa, chain + ## HITS:1 COG:no KEGG:BT_4378 NR:ns ## KEGG: BT_4378 # Name: secG # Def: preprotein translocase subunit SecG # Organism: B.thetaiotaomicron # Pathway: Protein export [PATH:bth03060]; Bacterial secretion system [PATH:bth03070] # 1 131 1 131 131 179 87.0 4e-44 MYLLFVILMVIAALLMCFIVLIQNSKGGGLASGFSSSNAIMGVRKTTDFLEKATWGLAIF MVVMSIATAYVVPRSAVAKDAVLEQAQKEQQTNPYNLPAGTAAPQTEAPATNAPATETPA PATETPAPAAE >gi|222159299|gb|ACAB01000060.1| GENE 21 28093 - 29481 1565 462 aa, chain + ## HITS:1 COG:no KEGG:BT_4379 NR:ns ## KEGG: BT_4379 # Name: not_defined # Def: putative oxalate:formate antiporter # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 462 1 462 462 832 97.0 0 MTEHLKQKLNDSAVLRWSVLALVAFTMLCGYFLTDVMSPLKPMLEKELLWDSLDYGFFTS 
AYGWFNVFLLMLIFGGIILDKMGVRFTGMGACLLMVFGCGLKYYAISTTFPEGAMLFGFK TQVTLAALGYAIFGVGVEIAGITVSKIIVKWFKGKEMALAMGLEMATARIGTTLAMVLTV PLADFFGSTDESGTFHTNIPAPILFCLIMLCVGTIAFFLYTFYDKKLDASLDAEGLEPEE PFRMKDIVYIITNKGFWLIALLCVLFYSAVFPFIKYAADLMVQKYNVDPKLAGTIPGLLP IGAIILTPLFGSLYDRIGKGATLMIIGSVMLIFVHTMFALPILNIWWFATIIMIILGFAF SLVPSAMWPSVPKIIPEKQLGTAYALIFWVQNWGLMGVPLLIGWVLNTYCKGPVVDGAQT YDYTLPMAIFALFGVLALIVALMLKAENKKKGYGLEEANIQK >gi|222159299|gb|ACAB01000060.1| GENE 22 29579 - 29929 422 116 aa, chain + ## HITS:1 COG:no KEGG:BT_4380 NR:ns ## KEGG: BT_4380 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 113 1 113 115 186 81.0 2e-46 MAAKEKINLLEVIPCRSEHITAEREGETIVLSFPRFKYPWMQRFLVPKGMSKELHVKLEE HGTAVWELIDGKRNVREIIEKLADHFQNEEGYESRVSAYLSQMQKDGFIKLVIPVV >gi|222159299|gb|ACAB01000060.1| GENE 23 29948 - 30232 361 94 aa, chain - ## HITS:1 COG:no KEGG:BT_4381 NR:ns ## KEGG: BT_4381 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 92 1 92 95 125 77.0 3e-28 MVGEGHAFDMIRRIKEGREASRLRRERANDKLKHLNRANEPYPLPNTTPEEMERIIHDSE KKKEKDSNYFVWGTLIIMGVLIAIAVILWAVFIK >gi|222159299|gb|ACAB01000060.1| GENE 24 30269 - 31336 1181 355 aa, chain - ## HITS:1 COG:no KEGG:BT_4382 NR:ns ## KEGG: BT_4382 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 355 1 354 354 574 87.0 1e-162 MKKLIIAAGLLMTTSAYAQTEVLTGVTRGKDYGVVYSLPKTQIELEIKANKVNYTPGEFS KYADRYLRLTNVSADPEEYWELASVKVKSVGVPNSETMYFVKLKDKTVAPLMELTEDGIV KTINVPYSNSSVGKKAAPAPAVLQKKANPREFLTEEILMASSTAKMAELVAKEIYNIRES KNALLRGQADNMPSDGAQLKIMLDNLNAQEEAMTQMFSGTCNKEERTFTVRLTPDKEFNN EVAFRFSKKLGVVANNDLAGTPFYISLKDLKSVKMPQEDGKKKKDLDGIAYNVPGQAMVT LTDGKKKLYEGELPITQFGVIEYLAPVLFNKNSTIKVYFDPNTGGLLKVDREEGK >gi|222159299|gb|ACAB01000060.1| GENE 25 31409 - 32926 599 505 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225093729|ref|YP_002662469.1| ribosomal protein S15 [gamma proteobacterium HTCC5015] # 1 496 4 491 497 235 30 7e-61 
IAMKIFPSSSIKKLDAYTIENEPIASIDLMERAATVLTKAITDRWNTETPVTIFAGPGNN GGDALAVARMMAEKGYKIEVFLFNTKGELSPDCQTNKELVEMMEEVKFHEISTQFVPPAL TPDHLVIDGLFGSGLNKPLSGGFAAVVKYINSSPAMVVAIDIPSGLMGEENTFNVKANII RADVTFSLQLPKLAFLFAENTEFVGEWEVLDIQLSEDGIEEMETNYEMLEIEQIRSLIKP RKQFAHKGNFGHALLIAGSKGMAGASILAARACLRSGVGLLTVHAPMCNNDILQTSIPEA IVETDINETCFAVPTDTDDYQAVGIGPGLGRNEETEAALLEQLEHCQTPVVVDADALNIL ANHRHALTHLPKGSILTPHPKELERLTGKCQDSYERLMKACELARAAHVHIILKGAYSAI ITPEGKCFFNPTGNPGMATGGSGDVLTGVILALLAQGYPTEEAAKIGTYIHGLAGDVAQK KQGMIGMITSDIITCLPTAWRLVSE >gi|222159299|gb|ACAB01000060.1| GENE 26 34172 - 34393 176 73 aa, chain + ## HITS:1 COG:no KEGG:BT_4384 NR:ns ## KEGG: BT_4384 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 73 2 74 74 114 82.0 1e-24 MEKTKIGLNAGKVWRILNEKGELSMFDLCRELSLTFEDVALAIGWLTREDKIFLRKREGM LFASIENVEFTFG >gi|222159299|gb|ACAB01000060.1| GENE 27 34587 - 36131 1521 514 aa, chain + ## HITS:1 COG:CAC0751 KEGG:ns NR:ns ## COG: CAC0751 COG3104 # Protein_GI_number: 15894038 # Func_class: E Amino acid transport and metabolism # Function: Dipeptide/tripeptide permease # Organism: Clostridium acetobutylicum # 7 510 24 520 521 277 36.0 5e-74 MSALKGHPKGLYLIFATSTAERFSYYGMRAIFILFLTQALLFDKEAAASIYGSYTGLVYL TPLIGGYIADKYWGIRRSVFWGAVMMAVGQFLMFMSASTLDNTNLAHWMMYGGLGFLILG NGCFKPTVSSLVGQLYEPGDRRLDAAYTIFYMGVNVGSFAAPLICGYLGDTGDPHDFKWG FLAAGIMTLFTVVLFETQKNKYLFSPSGEPIGIIPDAKREKKEDKADHISHPKMDKHTKV RNTLVITILTIALIAFFNYAFEGDWVSIGIFTACIVFPVLILLDGSLTKIERSRIFVIYI VAFFVIFFWAAYEQAGASLTLFASEQTNRDIFGWEMPASWFQSFNPLFVVVLAYIMPGVW GFLNKRKMEPASPTKQAIGLLLLSLGYLFICFGVKDAVPGVKVSMIWLTGLYFIHTMGEI ALSPIGLSMVNKLSPLRFASLMMGIWYLSTATANKFAGMLSGLYPEAGKVKSIFGYQIAT MYDFFMVFVVMSGVASLILFLLSKKLQKMMHGVE >gi|222159299|gb|ACAB01000060.1| GENE 28 36172 - 36708 749 178 aa, chain + ## HITS:1 COG:DR1376 KEGG:ns NR:ns ## COG: DR1376 COG0634 # Protein_GI_number: 15806393 # Func_class: F Nucleotide transport and metabolism # Function: Hypoxanthine-guanine phosphoribosyltransferase # Organism: 
Deinococcus radiodurans # 13 173 10 171 176 124 38.0 1e-28 MDTIQIKDKLFTVSIKEQEIQKEVIRVANEINRDLAGKNPLFLSVLNGSFMFTADLLKHI TIPCEISFVKLASYQGITSTGVIKEVIGLNEDIAGRTIVIVEDIVDTGLTMQRLLETLGT RNPEAIHIASLLVKPEKLKVNLNIEYVAMEIPNDFIVGYGLDYDGFGRNYPDIYTVVD >gi|222159299|gb|ACAB01000060.1| GENE 29 36768 - 37337 697 189 aa, chain + ## HITS:1 COG:CC1269 KEGG:ns NR:ns ## COG: CC1269 COG0563 # Protein_GI_number: 16125518 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate kinase and related kinases # Organism: Caulobacter vibrioides # 2 187 1 186 191 174 45.0 9e-44 MLNIVIFGAPGSGKGTQSERIVEKYGINHISTGDVLRAEIKNGTELGKTAKGYIDQGQLI PDELMIDILASVFDSFKDSKGVIFDGFPRTIAQAEALKKMLAERGQDVSVMVDLDVPEEE LMVRLIKRGKDSGRADDNEETIKKRLHVYHSQTAPLIDWYKNEKKYQHINGLGTMEGIFA EICEAVDKL >gi|222159299|gb|ACAB01000060.1| GENE 30 37427 - 38590 1269 387 aa, chain + ## HITS:1 COG:aq_2069 KEGG:ns NR:ns ## COG: aq_2069 COG0536 # Protein_GI_number: 15607036 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Aquifex aeolicus # 6 335 4 342 343 270 48.0 2e-72 MAESNFVDYVKIYCRSGKGGRGSTHMRREKYCPNGGPDGGDGGRGGHIILRGNRNYWTLL HLKYDRHAMAGHGESGSKGRSFGKDGADKIIEVPCGTVVYNAETGEYLCDVTEDGQEVIL LKGGRGGQGNWHFKTATRQAPRFAQPGEPMQEMTVIMELKLLADVGLVGFPNAGKSTLLS SISAAKPKIADYPFTTLEPNLGIVSYHGGKSFVMADIPGIIEGASQGKGLGLRFLRHIER NSLLLFMVPADSDDIRKEYEVLLNELRTFNPEMLDKQRVLAVTKSDMLDQELMDEIEPTL PEGIPHVFISSVSGLGISVLKDILWEELNKESNKIEDIVHRPKDVTRLQQELKDMGEDED IVYEYEEDVDDDDDIDYEYEEEDWEEK >gi|222159299|gb|ACAB01000060.1| GENE 31 38614 - 39399 440 261 aa, chain + ## HITS:1 COG:BS_ylmD KEGG:ns NR:ns ## COG: BS_ylmD COG1496 # Protein_GI_number: 16078601 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 12 261 29 277 278 124 29.0 2e-28 MKGYKSLEAYPEVAHFVTTRHEGISTGAYGSFNCSPYTNDSCMNVNRNQSWLFQCMNHQI KELFIPEQNHGCASLIINESFFKESLEMRRLLLRGMDALITNVPGYCVCVTTADCVPVLL YDKKQHVVAAVHAGWKGTVKHIVSNVMDHLNKMFGTQGEDVIACIGPSISLESFEVGDEV 
YDAFEESGFDMSLISMKKKETGKYHIDLWEANRIELLNLGVPAEQIEVAGICTYIHHDEF FSARRLGIDSGRTLSGIMIRK >gi|222159299|gb|ACAB01000060.1| GENE 32 39450 - 40115 653 221 aa, chain + ## HITS:1 COG:SSO0658 KEGG:ns NR:ns ## COG: SSO0658 COG3382 # Protein_GI_number: 15897568 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sulfolobus solfataricus # 48 206 46 208 224 84 37.0 2e-16 MFEIKVSKEIKDACPVFAGAAVYAVVKNTAYSEGLWQEIDAFTRQLTSTTQMEDIKHQPV IFATREAYKRCGKDPGRYRPSAEALRRRLMRGIPLYQIDTLVDLINLVSLRTGHSIGGFD ADKIQGTHLELGIGKAGEPFEGIGRGVLNIEGLPVYRDSFGGIGTPTSDHERTKMDLGTT HILAIVNGYNGKEGLQEAVEMIQTLLRNYTESDGGEIVFFE >gi|222159299|gb|ACAB01000060.1| GENE 33 40144 - 40782 518 212 aa, chain + ## HITS:1 COG:SMc00539 KEGG:ns NR:ns ## COG: SMc00539 COG0739 # Protein_GI_number: 15965497 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Sinorhizobium meliloti # 52 171 271 398 413 87 37.0 1e-17 MKNILILLTVLLLPLAADGQDKLSFSAREMADVRVATPGLFAKSNHIYLHLDSLKDHEYA FPLPGGKVISAYGTRGGHSGTDIKTCAKDTIRAAFDGVVRMSKPYYAYGNIVVIRHANGL ETLYSHNFKNLVKTGDVVKAGQPIGLTGRTGRATTEHVHFETRINGQHFNPNLIFDLKER TLRKECIKCTKNGSKIVVKTQIPDNRIAQNKK >gi|222159299|gb|ACAB01000060.1| GENE 34 40826 - 42007 773 393 aa, chain - ## HITS:1 COG:PA4204 KEGG:ns NR:ns ## COG: PA4204 COG2706 # Protein_GI_number: 15599399 # Func_class: G Carbohydrate transport and metabolism # Function: 3-carboxymuconate cyclase # Organism: Pseudomonas aeruginosa # 48 392 30 385 388 205 35.0 1e-52 MIKTYLSKTSIFKAFAAVCIFGISISGCTSKKKTPMETTDTTENELTMLVGTYTSGTSKG IYSYRFNEEDGTATPLSETEVENPSYLVPSADGKFVYAVSELNNTQAAANAFAFNKEKGT LQLLNSQQTGGEDPCYIIASGNNVITANYSGGSISVFPIAKDGSLLPASDIIQFKGTGVD KERQEKPHLHCVRITPDGKYLFADDLGTDQIHKFIINPAANVENKEAFLKKGSPAAFKVK AGSGPRHLTFSPNGHYAYLINELSGTVIAFEYKDGDLKEMQTIAADTVNAQGSADIHISP DGKFLYASNRLKADGIAIFSIHPDNGMLAKAGYQLTGIHPRNFIITPNGKYLLVACRDSN VIQVYERDADTGLLTDVHQDIKVDKPVCIKFVP >gi|222159299|gb|ACAB01000060.1| GENE 35 42786 - 43715 315 309 aa, chain - ## 
PROTEIN SUPPORTED ## NR: gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 [Haemophilus parasuis 29755] # 64 306 101 331 339 125 32 6e-28 MLSCDKVLRTPTIPNMLSLTDELNCLPITQEPLAQIQKIDKTISAMLQTSIDDIEKDFVC SPISKINWTSHVILLDSSLPLGNKYWYMKQSVENGWSSNVLKIQIETNLFKRQIETQKVN NFTKTLPAPQSDLANYLLKDPYIFDVAGTKELADERDIEKQLVEHVTRYLLEMGNGFAFV AKQKHFQVGDSDFYADLILYSIKLHAYIVVELKATPFKPEYAGQLNFYMNIVDDQLRGEN DNKTIGLLLCKGKDEVVAQYALTGYAQPIGVSDYQLSKAIPENLKSALPSIEEVEEELSQ LLDNKKDEK >gi|222159299|gb|ACAB01000060.1| GENE 36 43893 - 44024 171 43 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|295086997|emb|CBK68520.1| ## NR: gi|295086997|emb|CBK68520.1| Uncharacterized conserved protein [Bacteroides xylanisolvens XB1A] # 1 43 1 43 430 87 97.0 3e-16 MESSNHPIERKDFEALIHAIGFEIEHAQVKLIVAANAQMLFYY >gi|222159299|gb|ACAB01000060.1| GENE 37 44171 - 45598 677 475 aa, chain - ## HITS:1 COG:no KEGG:BT_4408 NR:ns ## KEGG: BT_4408 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 466 1 462 463 645 70.0 0 MKNTLFCLFIFCSVSTFAQNGITFKVEKLSKPEELLNLHSPNEIYESLILSDMDIEPYEI KEKNIKVPYHIIAKSESPDSLVSFGPNSFFNGMYQAYADHRPFVLSPDMIWLLISQGFAR HINANQESMRNELVDFSGKLSLIVREDKKLEDPTLSWEEIFPQFTEQISQYAGNHLTELL TCNFSTTTSLEKVASEITIMEAVKPYFEFIIIRIVCGIPEITLEGTPEDWEKLLYKARGL KEYKLDWWISELEPLLEEFVKASKGEVNKDFWRNMFKYHSQKRYGAPNIIDGWIVKFFPY DKEGKRNNLKQLEGRNCLPDEIVKVDVKYQEVYGDAVKETPLEFWAGFIGLEQNSKTFAL RPQIGWMVRKKDVNKEGLKSKLSADAQKDGWGSGINIRVKEFPAVLLELKEIKRLDIQFI DTIDIPDEISKIKIGSLNLHGKITKEGIERIKRLLPDTDIKINGSRSGSASFMTP >gi|222159299|gb|ACAB01000060.1| GENE 38 45846 - 46907 671 353 aa, chain + ## HITS:1 COG:SMa2355 KEGG:ns NR:ns ## COG: SMa2355 COG0389 # Protein_GI_number: 16263727 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Sinorhizobium meliloti # 1 330 34 363 379 342 54.0 5e-94 MDAFYASVEQRDNPELRGKPLAVGHAEERGVVAAASYEARRYGVRSAMASQKAKRLCPQL IFVPGRMDVYKSVSRQIHEIFHEYTDIIEPLSLDEAFLDVTENKKDISLAVDIAKEIKQK 
IREQLNLVASAGVSYNKFLAKIASDYRKPDGLCTIHPEQALDFIARLPIESFWGVGPVTA KKMHLLGIHNGLQLRKCSLEMLTAHFGKAGALYYECSRGIDERPVEAIRIRKSIGCERTL ERDISARSSVIIELYHVAVELIERLQRKDFKGNTLTLKIKFHDFSQITRSLTQSQELTTL DRVLPLAKELLKSVEFEQHPIRLIGLSVSNPKEEADEQHGVWEQLSFEFSDWD >gi|222159299|gb|ACAB01000060.1| GENE 39 47257 - 47403 106 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFHSIHGIFIHLSIAKITYLIKNEVSKSYEPDDKKVHTSKIVLLIAIF >gi|222159299|gb|ACAB01000060.1| GENE 40 47386 - 49320 1674 644 aa, chain + ## HITS:1 COG:no KEGG:BT_4410 NR:ns ## KEGG: BT_4410 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 644 1 644 644 1259 92.0 0 MNRMKHLFCVLLLAALGSFSFKANAYTERNMLQKAADETTLKNVLVMKQAWVPYPAYTDR VAWDSLMGPNKQRLIEAGEKLLDYKWQLIPATAYLEYERSGNRKIMEVPYDANRQALNAL MLAELAEGKGRFIDQLLNGAYMSCEMNSWVLSAHLPRQSSKRSLPDFREQIIDLGSGGYG ALMAWVHYFFRKPFDKINPVVSLQMRKAIKERILDPYMNDDEMWWMAFNWKPGEIINNWN PWCNSNALQCFLLMENNKDKLTKAVYRSMQSVDKFINFVKSDGACEEGTSYWGHAAGKLY DYLQILSDGTGGKLSLFNEPMIRRMGEYMSRSYVGNGWVVNFADASAQGGGDPLLIYRFG KAVNSDEMMHFAAYLLNGRKPYATMGNDAFRSLQSLLCCNELAKATPKHDMPDVTWYPET EFCYMKNKNGMFVAAKGGFNNESHNHNDVGTFSLYVNTIPVIIDAGVGTYTKQTFGKDRY TIWTMQSNYHNLPMINGVPQKFGQEYKATNTVCNEKKRMFSTDIATAYPAEAKVKSWVRS YALDDKKLIIGDIYTLDEAIAPNQMNFLTWGNVTFPSAGKIRIEVKGQKVEMNYPSQFKA ELETIKLDDPRLSNVWGKEIYRITLKTEEKKVTGKYGFVIQQVK >gi|222159299|gb|ACAB01000060.1| GENE 41 49380 - 50081 809 233 aa, chain + ## HITS:1 COG:no KEGG:BT_4411 NR:ns ## KEGG: BT_4411 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 233 1 233 233 378 90.0 1e-104 MKKVLFAGLLVLAGVSVSAQNLIKNEKFATEVKTKVTNANKATAGEWFIMNNEADGVTTI AWEETGDAKYSNAMKLDNSGAEKNLSWYKAFLGQRITDGLDKGIYVLTFYAKAKEAGTPV SVYIKQTNEEKNDNGKYNTTFFMRRDYDADAQPNASGAQYNFKIKDAGKWTKVVVYYDMG QVVNAISSKKANANLEVSDTDDDAAILKDCYVAILSQNKGGVVEISDVTLKKK >gi|222159299|gb|ACAB01000060.1| GENE 42 50242 - 51420 1174 392 aa, chain - ## HITS:1 COG:no KEGG:BVU_3875 NR:ns ## KEGG: BVU_3875 # Name: not_defined # Def: aminopeptidase C # Organism: 
B.vulgatus # Pathway: not_defined # 3 392 4 396 396 581 70.0 1e-164 MKKLFMSVALLILVVTSFAQTPGYEFTTVVSHQATPVKDQGSTGTCWCFATASFMESELL RMGKGEYDLSEMFIVRQKYMNQMEDNYLRRGKGSIGEGSLAHTFKNAYKKAGIVPEEVYT GLPGDKKDHNHGALSRYLKALVDANIESKKRTPQFDALINNLFDIYLGKVPEKFTYKGKE YTPQSFTESLGLNMDDYVELTSFTHKPYYETFSPEVPDNWENQPMYNLPLDELIGVIDYA LNKGYTVCWDGDVSEQGFSFKNGIAINPQVEDVKDYSTTDRARFEKMPKYQRMDEVFKFE HPYPEINVTPEIRQDGYEKFVTTDDHLMHITGIVKDQNGTKYYITKNSWGAESNKSGGYL NMSESYVRAKTICVMVHKDSLPKELKKKLGIQ >gi|222159299|gb|ACAB01000060.1| GENE 43 51514 - 51651 126 45 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKNANEKIMMLQYRIKRYQAMGNGAMCQTLNGKLQKLLSQQVAM >gi|222159299|gb|ACAB01000060.1| GENE 44 51976 - 54702 2168 908 aa, chain + ## HITS:1 COG:no KEGG:BT_4412 NR:ns ## KEGG: BT_4412 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 908 1 905 905 1568 82.0 0 MKTFTHLLCVLILSIVLFACNNAHFLKEESYRNQVVQDFEQKKQALPHGDLFAIFGDSAL SVYEREALMFLYAYMPIGDVTDYPGDYYLENVRLSKQTREEMPWGKEIPDEVFRHFVLPI RVNNENLDDSRRVFYDELKDRVKGLPMKDAILEVNHWCHEKVVYRPSDARTSSPLASVKT AYGRCGEESTFTVAALRAVGIPARQVYTPRWAHTDDNHAWVEAWADGHWYFFGACEPEPV LNLGWFNSPASRGMLMHTKVFGRYNGPEEIMLETPNYTEINVIDNYAPTAKAIVTVTDAD GQPVADAKVEFKIYNYAEFYTVATKYTDAEGKASLTAGKGDMLVWASRNGQFGYAKISFG KDDALQLSLNRKEGEAYSLPMDLVPPVEGANIPEVTPEQRAENDRRMAQEDSIRNTYVAT MMTEKQAKEWIDKLYGNTLQPEKKEKLVNFLVASRGNHQTLKDFLSAIRKEKDAVSWEEI RAIWILESLSAKDLRDVTLDVLNDHLLTNISDWEKIETDLFKRMYLNPPRIANEMLTPYK KVLREAIEKTVYQSVPDSMKRDPKVLIEWCRKEIKINNELNSQQIPISPMGVWKARVADE KSRDIFFVAAYRSMGWASAAWIDEVTGKVQILNEEFAKEDVNFDTAEAAQSRKGVLQATY TPIRSVEDPKYYSHFTLSKFKNGTFQLLNYDEGETDMGDGTTWRNLLKYGRELDEGYYMM VTGTRLASGAVLSNSTFFTIEPGKTTTVDLVMRESKDQVQVIGNFNSEATYRPVDSTEQR SILQTCGRGYFVVAVLGVGQEPTNHALRDIAALGNDFEQWGRKMVFLFPSEEQYKKFNAD EFKGLPSTITYGIDVDDSIRKEIVQAMNLNNSILPVFIIADTFNRVVFVSQGYTIGLGEQ LMKVVHGL >gi|222159299|gb|ACAB01000060.1| GENE 45 54805 - 55560 571 251 aa, chain - ## HITS:1 COG:PM0526 KEGG:ns NR:ns ## COG: PM0526 COG3142 # Protein_GI_number: 15602391 # Func_class: P Inorganic ion 
transport and metabolism # Function: Uncharacterized protein involved in copper resistance # Organism: Pasteurella multocida # 5 245 2 242 244 214 44.0 2e-55 MGKYQFEICTNSVESCLAAQKGGADRVELCAGIPEGGTTPSYGEIAIAREVLTHTRLHVI IRPRGGDFLYSDIEIRTMLKDIEIARKLGADGVVFGCLTAEGEVDLPAMQKLMEAAQGLS VTFHRAFDVCRNPQRALEQIIKLGCNRILTSGQQPTAEQGIPLLKELQKQAAGRITLLAG CGVNENNIARIAAETGINEFHFSARESIQSEMIFKNEAISMGGTVHINEYERNVTSVRRV KETIETLHKRI >gi|222159299|gb|ACAB01000060.1| GENE 46 55625 - 57160 1809 511 aa, chain - ## HITS:1 COG:CAC1816 KEGG:ns NR:ns ## COG: CAC1816 COG1418 # Protein_GI_number: 15895092 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Clostridium acetobutylicum # 34 511 44 514 514 396 48.0 1e-110 MIAIIATAIACFIVGGVLSYILFRYVLKSKYDNVLKEAETEAEVIKKNKLLEVKEKFLNK KADLEKEVALRNQKIQQAENKLKQREMVLSQRQEEIQRKKMEAEAVKENLEAQLVIVDKK KEELDKLQQQEIDKLEAISGLSAEEAKERLVESLKEEAKTQAQSFINDIMDDAKLTASKE AKRIVIQSIQRVATETAIENSVTVFHIESDEIKGRIIGREGRNIRALEAATGVEIVVDDT PEAIVLSAFDPVRREIARLALHQLVTDGRIHPARIEEVVAKVRKQVEEEIIETGKRTTID LGIHGLHPELIRIIGKMKYRSSYGQNLLQHARETANLCAVMASELGLNPKKAKRAGLLHD IGKVPDEEPELPHALLGMKLAEKYKEKPDICNAIGAHHDETEMTSLLAPIVQVCDAISGA RPGARREIVEAYIKRLNDLEQLAMAYPGVTKTYAIQAGRELRVIVGADKIDDKQTESLSG EIAKKIQDEMTYPGQVKITVIRETRAVSFAK >gi|222159299|gb|ACAB01000060.1| GENE 47 57285 - 57575 302 96 aa, chain - ## HITS:1 COG:no KEGG:BT_4418 NR:ns ## KEGG: BT_4418 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 96 1 96 96 131 92.0 9e-30 MNDKIKINLQIADSNYPLTINREEEQMVREAAKQVNIRLNKYREVYKNLEPEKIIAMVAY QFSLERLQLMQRNDTSPYVEKVKELTELLEDYFKEE >gi|222159299|gb|ACAB01000060.1| GENE 48 57587 - 57901 245 104 aa, chain - ## HITS:1 COG:no KEGG:BT_4419 NR:ns ## KEGG: BT_4419 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 104 1 97 97 127 93.0 9e-29 MYNFGVVMTEEEKKLLNSFETQLRHLIYLHDELKRENAELKKLLENEKLKNEKVQAQYDE LEVSYTNLKTATAISLNGSDVKETKLRLSKLVREVDKCIALLNE 
>gi|222159299|gb|ACAB01000060.1| GENE 49 58687 - 59598 561 303 aa, chain - ## HITS:1 COG:BH2219 KEGG:ns NR:ns ## COG: BH2219 COG1609 # Protein_GI_number: 15614782 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 3 297 39 328 335 172 36.0 1e-42 MIEAAKRLGYRPNPVAINLKYGHTNTVGVIVPEMVTPFASRVINGIQEVLHTKNIKVIIA ESGGDPEKEKENLQLMEGFMVDGIIICLCSYKRNKEEYARLQQAGMPMVFYDRIPYGLEV PQVIVDDYMKSFFLVESLIRSGRKQIVHIQGPDDIYNSIERVRGYKDALAKFGIPFDKNN MLIKTGMTFEEGKKAADILVERNIPFNAIFAFTETLAIGAMNRLRELGKKIPEEIAVASF SGTELSNIVYPKLTTVEPPLYQMGKKAAELILEKIKDPASPNHSIVLDAEIKMRASTPRL EVY >gi|222159299|gb|ACAB01000060.1| GENE 50 60036 - 61355 1078 439 aa, chain + ## HITS:1 COG:no KEGG:Ccel_0951 NR:ns ## KEGG: Ccel_0951 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 8 439 4 431 432 338 42.0 3e-91 MDKKVRSIEISKRVPIVGNYDVVVCGGGPAGFIAAIAAARSGAKTAVVEQYGFLGGMATM GLVTPLSVFTYNNEKVIGGIPWEFIERLEKMGGCIIEKPLGNVAFDPELYKLLCQQMMLE AGVDMYMHSYLSGCQAKDGKISCILFENKNGTEAISADMYIDCTGDGDLAAMAGVPMQTD DCKPLQPLSTYFILGGVDTDSPMIIDAMHHNKQGQNCHCIAVREKLLAMKEKLGIPEFGG PWFCTTLRPGEVTVNMTRTAGNAIDNRNFTAAECRLREDVFKIARIFKENFEEFKNSYVT TVAVHAGIRETRRIKGVHTITAEEYVNAYKYPDSISRGAHPIDIHVAAGAEQSVTFLKKA AYVPYRALIAEDFPNLLVAGRCISADKTSFASLRVQASCMGVGQAAGVAAAQCIKAGVTV QKADIHNLIEELKKLGAII >gi|222159299|gb|ACAB01000060.1| GENE 51 61370 - 63160 963 596 aa, chain + ## HITS:1 COG:VCA0667 KEGG:ns NR:ns ## COG: VCA0667 COG0591 # Protein_GI_number: 15601425 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Vibrio cholerae # 6 463 7 436 513 83 21.0 2e-15 MEQLDYIIILLYFLGLIAVSVVMSRKIKNSEDMFIAGRNSSWWLSGVSSYMTIFSASTFV VWGGVAYKSGLVAVVVALCLGVASFIVGKWISGKWRELRIKSPGEFLTIRFGHRTVSFYT ISGIIARAVHTAVSLYAVAVVMCALIKVDGGSIFASTGMMGDSPMGYLSIWWALLILGAI ALGYTIAGGFLAVLMTDVIQFGVLLAVVVFMIPLSFNAVGGVSAFIDKASEIPGFFSGTS PTYTWGWLLLWIFLNVSMIGGDWPFVQRYISVPTTRDAKKSTYLIGILYLVTPLIWYLPT 
MIYRVMEPGLALDLDATTMTFNGEHAYVNMSKLVLMKGMVGMMLAAMLSATLSNVSGILN VYANVYTYDIWGHKEKNRQADEKKRIKVGRLFTFVFGLVIIALSMLIPFAGGAEKVVVTL LTMVMCPLYIPSIWGLFSKRLTGNQLISAMILTWLVGIMARVIIPASVISPSLIESVAGC VLPVLILAIMEVWSARKKYEDNGYQAICEYTDPEADREPTLKEKKAVLIYSHLAVNCFCI TIGVVALLLIGLLIAGDPKTLAVKGIVIGSIVLMIAMILAYVIYRIIYARRLKMSS >gi|222159299|gb|ACAB01000060.1| GENE 52 63192 - 64556 999 454 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715716|ref|ZP_04546197.1| ## NR: gi|237715716|ref|ZP_04546197.1| predicted protein [Bacteroides sp. D1] # 1 454 1 454 454 903 100.0 0 MNLKMLLFASVSLVLFASCHNSDDPYMTLATKTLDVEAEGGLFTVDLSSNVYYRVNNDCQ TDGSDSHWAVVDSHETQGEITKFTIKVSENSSTSSRVGAIRFIGDDVTPLKLVITQKMIV PKGISPTTESIDATTTESSFKVFGDKEWKAVCTDADVTVSPANGVGECDIKLTFPENKTF AKRTIKVTVTIQDDTDYTYTLVQDAFSGILADWDLNSLTANTSGTFADDEAQSVFPGTNG KYIAPSAGSGKIEYWACDRTGYVEQKVVCSRAVGANGDPYVSGAIPGDYWYVYGDMKGTT IPAGTKIHFYFVTKLGTMCSNYWMIEFKDGEEWKPALPTSTLQESATETLSGAPINYSAT ITYNFAGMLLDEKNNGAYIPAEGIFTTTKDMDEIVLRFGQAGRLCLNGAKFAGKYIDCTH ASGQTRFSAQHPSNPETGEAIREYNEHVLLEIVE >gi|222159299|gb|ACAB01000060.1| GENE 53 64562 - 67648 2361 1028 aa, chain + ## HITS:1 COG:no KEGG:BT_2461 NR:ns ## KEGG: BT_2461 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 24 1028 132 1134 1134 794 42.0 0 MKKSILSLLLACFLQINLAFAQNVMKGVVSDANGPLVGVVIHDKDNAKGTTSDMSGKYAL NGAESGHTIEFRYLGYVTETVVWDGKSVVNIKLKEDAVQLEETVVIGYGSVKKKDLTGAV GVINSSLIEKQSTSQLSQSLQGLIPGLTVTRSSSMPGASATVQVRGVTTMSDSSPLILVD GMMVSSLDNIASEDVQQITVLKDAASASIYGARAAAGVILITTKEATEGQLSIGYNGEIS LSSPTEFPKFLTDPYHYMTMYNEWSWNDAGNPAGGEFANYSQDYIDNYATNNRYDPIQYP IYDWKDAILSNTAMRHKHNLTMTYGNKVIKSHTSATYENADAIYKGSNHERISIRSRNNL KISDKLSGSIDFSVRYATKNDPTSGSPIRAAYMYPSIYLGLYPDGRVGPGKDGSLSNTLA ALLEGGEKKTVSNTMTGKFSLSYKPIKDLTLTANLTPTVGTVSIKEMKKAIPVYDAYETD VMLGYVSGYTSNSLSEERRNIKSLEKQFIATYDKTFSKVHNFNAMVGYEDYSYTYETMSG STNDMSLSSFPYLDLANKNALAVAGNSYQNAYRSFFGRVMYNYDSRYYLQLNAREDGSSR FHKDHRWGFFPSASVGWVISNEKFMQNITPINYLKFRASIGTLGNERIGNYPYQTYISFN 
NAIMYDSAGSTPQSSMSAAQQDYAYENIHWEKTQSWDIGVDAAFFNNRLDFSADYYYKKT TDMLLSVAIPSFTGYSAPDRNVGKMHTRGWEVKLGWSDRIGDVSYAVSFNISDYKSIIDN LNGKQQFNSDGTIITEGAEYNSWYGYKTAGLFQTAEEVSESALLSASTKPGDVKYVDVSG PDGTPDGIINETYDRVVLGSSLPHYLYGGSISLGWKGLSFSLLFNGVGKQLSRLTESMIR PMQGQWLPAPSVLLNDNGSRNYWSVYNTAEQNAAASYPRLSHQGGEYNNYKMSDYWLKSS AYMRIKNINVGYTVPKKIVSKVGIKGLRVYVNIDDPYCFDSYLSGWDPEAGASTYITRTY TFGVDIKF >gi|222159299|gb|ACAB01000060.1| GENE 54 67673 - 69412 1464 579 aa, chain + ## HITS:1 COG:no KEGG:BT_2460 NR:ns ## KEGG: BT_2460 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 579 1 572 572 408 39.0 1e-112 MKKIHVLICMIALLSVTSCVNLDLNPPSAASSENWFSSPEEVRISLNDFYRSTFFVIEEG WTLDRNTDDWAQRTNIYTIAAGSLNASSTSNPNIKTVWSYTYKNISRANRILEALDKLEG KYSTTELNTLRAEARFFRAFAYSRLITLWGDVPFYVTSITPEEAFEMGRTDKAVVLKQIY EDYDYAAENLPAANNNSGATRVDKGTAYAYKARTALYQHDYGTAAKAAQDCMDLEVYDLA PDYGELFRDKTRGSKEVIFSVAHSSDLELDENGKPTTQAIGSFIARSAGGTHNAQPSWEL LAVYEMTNGKTIDEPGSGFDPHDPFANRDPRCLETFAAPGSRIYGIEWNPAPNALEVMDY TQNRMITNKDSKGGSDASNCAYNGCCLRKGAQESWRTTLYNDNPVILMRYADVLLMYAEA KIELGEIDATVLACINDVRARAYGTTRTQTNDYPSITTTDQTALRKVLRRERRVEFAWEN LRYFDLLRWHQFENAFGHNMYGFTRTANRAKEYFAAGNWFWPETPTFDKDGFPSFEAMAD GTYIVQHGERKYDEKIYLWPLPSDDVLIMNGKLVQNPGY >gi|222159299|gb|ACAB01000060.1| GENE 55 69434 - 70387 955 317 aa, chain + ## HITS:1 COG:no KEGG:SG0242 NR:ns ## KEGG: SG0242 # Name: not_defined # Def: hypothetical protein # Organism: S.glossinidius # Pathway: not_defined # 10 275 25 271 286 201 42.0 3e-50 MSIGEILLRSRYQPLKRVLSYDVFNTGFNGWMTLMPNFTEYPDFDVPKTLVNKDQWPPVM LSSATFRYPGTHGAMSGTYSLKLSTRPVAAPYTEIPAEGCLGHAIKRLSFSRPGCKYLQI ECWFTYTAEQDVVDGGDRPQPGLHESSIRDFGMGFDVQEGGKRYHVGIRYLNAVDGKLMQ KWQYEHSNEDITDRDWAYGLDGDWCKRGVDPWWFGRRYPNGDHDGFKDLKDGHQKLIYNE TDCKLNWQYMRLKLDTELREYVEFQCQDKIWDMRGIAADTVDGYGRIDNLINPLFWVGTD TNRRVFFYIDSVVVSQE >gi|222159299|gb|ACAB01000060.1| GENE 56 70405 - 70734 302 109 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715720|ref|ZP_04546201.1| ## NR: gi|237715720|ref|ZP_04546201.1| predicted 
protein [Bacteroides sp. D1] # 1 109 1 109 109 219 100.0 6e-56 MKMFNDSHLEVKKFFKGTFFTHPYEAGWADEAIFFVMVEKIEGDPVFEGRVQLSQDGIHW ADDGSDPVIFKGLGQHIIKVNSNFGNYIRLAVSIEGGEMFLNLHIACKG >gi|222159299|gb|ACAB01000060.1| GENE 57 70722 - 72098 937 458 aa, chain + ## HITS:1 COG:no KEGG:Dfer_0342 NR:ns ## KEGG: Dfer_0342 # Name: not_defined # Def: hypothetical protein # Organism: D.fermentans # Pathway: not_defined # 37 458 22 419 419 254 36.0 4e-66 MQRLIFIFSLFILAIVGCSSNESHSFLIRPDGGGEDAATEIEVGKTIPAWQEGVMDIHFI NTTTGESMFVIFPDGTQMLIDAASSSVTTNSNSNTTNTGIRSRWDPTLTSTRGSQIITDY IRKCMVWTGNSTIDYAVLTHFHNDHFGGYTSSLPKSSNSDTYSLIGFAEIFDNFKIGTLL DRGYPDYNYPFNMATMADNAPSCSNYINAVKWHVANQKFDAAIFKAGANDQIVQKYNPAK YPTAKVQNVAVNGEIWTGSGTTTKKTFPELSEISYENSKNITSSDNCPPENITSCVMKVS YGNFDFFAGGDLQYNGRSSHAWKDAELPCAKAVGQVELLKANHHGVTNTNQVDALKALNP QTIVVNSWVDCHPRTDILNSMETTLPACDMFITNFWQGDRPSGVDDRVTAEEAARVKGYD GHIVVRVTDGGNKYRVVTITDSDGAMTVKTISGPYTSR >gi|222159299|gb|ACAB01000060.1| GENE 58 72098 - 73372 746 424 aa, chain + ## HITS:1 COG:no KEGG:Ccel_0950 NR:ns ## KEGG: Ccel_0950 # Name: not_defined # Def: HI0933 family protein # Organism: C.cellulolyticum # Pathway: not_defined # 5 424 17 435 435 407 50.0 1e-112 MNTKYYDVIVAGAGPAGICAAVAAARQGARVALIERYGVIGGNLTAGYVGPILGSVSKNT MRDEVCAILGVKDNDWIGEHGNAHDFEEAKLTLAEFVAREKNIDVFLQCCVSDVIRDGKV VKGIKCASNEGTLCFEAAVTIDCTGDAIVSFLAGAKIEKGRADGLMQPVTLEYTIDGVDE SKGIICIGDVDNVQLNGECFLDWCKKKADEGKLPRMLAAVRLHPSVRPGCRQVNTTQVNR VDITSVSSIFTADLELRQQIRLLTQFLKENLPGYENCRVIGSGTTTGVRESRRVMGDYVI DADEMAEGCRFADVVVHKALFIVDIHNPDGAGQAEPTIQYCKPYDLPYRCFLPLGLEGLL VAGRCISGTHRAHASYRVMSICMAMGEAVGIAAAMSASQHCTPRALDVGELQKRLESLGV ELFD >gi|222159299|gb|ACAB01000060.1| GENE 59 73413 - 75650 1592 745 aa, chain + ## HITS:1 COG:no KEGG:PRU_0396 NR:ns ## KEGG: PRU_0396 # Name: not_defined # Def: histidine acid phosphatase family protein # Organism: P.ruminicola # Pathway: not_defined # 308 744 10 432 436 177 28.0 2e-42 MKKIYLSVVCLLISIPLIAQLYVEPEKEVECSVFLAKEGRGRAQQGLEIWDDYIFSCEDG 
GHVNIYDFKSADPKPVAGFELASSHPDNHVNNVCFGVETKRGASFPLLYITNGKVGSELE WLCFVESITRRGKRFSSEIAQTIELDGSKWAEKGYVPIFGAPSWLVDRERGFIWIFSARK RTVAKVTKHAWENQYVATKFRIPSLSEGAKVRLDENDILDQVVFPYEVWFTQAGCMHDGK IYFCFGVGKQDDSRPSCIRVYDTDRRTITARYNVQEQVIYEPEDIVVKDGVMYVNTNTNA KKTSDLPCIFKLSLPKEKPVAENPLDEIRRDPERAGGVYYVTDLSHPVTPAPKGYTPFYI NGYFRHGARQIDDEVTYPAIYGVLEKAHATNNLTDFGKALYERLEPFKKNVFYKEGDLTQ IGYRQTREIGRRMVQNYPEVFEGHPYLKTNATNVLRVAATMQSVNSGILSLRPGLEWAEI DNSRSFLTTLNPYGNVCPDRSPLDKYILGKENSWYKKYRSYIDEKLDVDAFFTRLFIDVT QVESEYDKYDLIHRFWLMASLMQCLDRQVPIWDIFTEEEILAWAEIENYKYFAQKGPEPV SHGRSWGLASRTLRHLLDESAEDLVRKRHGINLNFGHDGVLMAILTNLQAGTWAREAGNS KEALRSWKYWDIPMGANLQMIFYQSEGNPDVLVKFMLNEKDLRLPLEAVEASYYKWNEVY KFYIEHCDKVERSLAETLKLSYEDF >gi|222159299|gb|ACAB01000060.1| GENE 60 75637 - 76959 979 440 aa, chain + ## HITS:1 COG:PA2342 KEGG:ns NR:ns ## COG: PA2342 COG0246 # Protein_GI_number: 15597538 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Pseudomonas aeruginosa # 5 427 19 473 491 325 38.0 1e-88 MKTFNYNRAEIKAGIVHFGVGNFHRAHLEAYTNLLLEDPSQRCWGVFGAMIMPTDGVLFN ALKKDDGIYQLTTCSPSGEQDSTLIGSLVELAWGEIDSEPIIAKIASEEIKIISLTITEG GYKVDFNQSRSVFWYVAEGLKRRMEKDLPITILSCDNMQMNGNAAKCAFMSYFEAKYPEV AAWAKKKVTFPNSMVDRITPVTKPGKVTDVCCEDFIQWVIEDNFIAGRPAWEKVGVTFTH DVTPYEIMKLSLLNASHTLLSYPAYMEGFRKVDAVMADERYRAMIKLFMNRDVTPYVPVP EGVDLEAYKDQLIERFSNKAISDQVSRLCGDGIAKFAVYVVPILKQMLQDGKDISIEAFL IAVYCKYLIGARTESGENIAISEPHITPADRKLISGGSPAEFLKISPFVSLGLDKYPVFM EKYEQFYAMQVAEGLKVLLQ >gi|222159299|gb|ACAB01000060.1| GENE 61 76981 - 78066 680 361 aa, chain + ## HITS:1 COG:no KEGG:Phep_1387 NR:ns ## KEGG: Phep_1387 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 24 358 27 357 358 179 32.0 2e-43 MKKFFITVVLALSCVLSVSAQKQTEASTLNLIGKPFESTPNPYHRVDTLVYKGFNRTENR QLRCSAGMAVLFKTNTRNIQITTKWGYVYSSHSTMPISYKGYDLYIKNADGQWQYAASGS LKAYKGEKTETFMLIENMDGTMHECMMYMPMYSEVISCKIGIDDDAVIEPLKSDFRHRIA VYGSSFTQGVSTDRSGMSYPMQFMRNTGLQVVSLATSGRCLMQPYMLDVLAEVKADAFIF 
DTFSNPDAELIRERLMPFIDRLIAAHPATPLIFQRTIYRERRSFDTVLDAKERAKAATVE ELFAKIRTNPKYKDVYLITPDASDAHETSVDGTHPSSYGYALWAKSIETPVIEILSKYGI K >gi|222159299|gb|ACAB01000060.1| GENE 62 78063 - 79091 693 342 aa, chain + ## HITS:1 COG:YJR159w KEGG:ns NR:ns ## COG: YJR159w COG1063 # Protein_GI_number: 6322619 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Saccharomyces cerevisiae # 28 340 31 353 357 159 32.0 6e-39 MRQAVLVEPKHIEFREVEAPKASDLKEHQVLLNIKRIGICGSEIHSYHGCHPATFYPVVQ GHEYSAVVVATGAAVTICKAGDVVTARPQLACGKCKPCQRGEYNICEELRVQAFQANGAA QDFFVVDDDRVAVLPEGMSLDYGAMIEPVAVAAHATMRGGDLKGKNVVVSGAGTIGNLVA QFAKARGAKRVLITDVSDFRLEIARKCGIIDTLNVAKTPLKEGAKRLFGDEDFQAAFEVA GVESSIRSLMECIEKGSTIVVVAVFGKDPSLNMFYLGEHELKVNGTMMYRHEDYLTAIDQ VSSGAIRLEPLISNHFPFEQYDEAYKFIDEHSATSMKIIIKL >gi|222159299|gb|ACAB01000060.1| GENE 63 79096 - 79962 522 288 aa, chain + ## HITS:1 COG:MA1840 KEGG:ns NR:ns ## COG: MA1840 COG0524 # Protein_GI_number: 20090690 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Methanosarcina acetivorans str.C2A # 1 280 29 311 326 155 33.0 7e-38 MKIIGIGEVVWDCFPEGKRLGGAPINFCFFAKELGAESYPVTAIGDDELGDETFTVLKET GLDLGYISRNILPTGKVLVSLNEAGVPQYDIVENVAWDTIECSPATMKLVGDADAVCWGS LAQRSEKSRAAILRLIDAVPDTSLKVFDINIRQHFYSTDLIVESLQKANVLKLNEDELPL LISLLSLSTDFVEAIAELIARFSLKYVIFTQGAVRSGIYDASGEVSSINTPKVEVADTVG AGDSFTATFVVNILRGASVAESHRKAVDVSAYTCTQRGAINPLPDSKK >gi|222159299|gb|ACAB01000060.1| GENE 64 80048 - 80614 515 188 aa, chain + ## HITS:1 COG:no KEGG:BT_4451 NR:ns ## KEGG: BT_4451 # Name: not_defined # Def: putative MTA/SAH nucleosidase # Organism: B.thetaiotaomicron # Pathway: Cysteine and methionine metabolism [PATH:bth00270]; Metabolic pathways [PATH:bth01100] # 1 188 1 188 188 372 92.0 1e-102 MLKILVTYAVQGEFVELKWPDIEPYYVRTGIGKVKSAFHLAEAIRQVQPDLVLNIGSAGT VNHQVGDIFVCRKFVDRDMQKMKEFGLECEIDSSALLEEKGYCTHWTEHGICNTGDGFLT 
ELTHVSGDVVDMEAYAQAFVCRSKEIPFISVKYVTDIIGQNSVKHWEDKLADARQGLSHY FNVLKERI >gi|222159299|gb|ACAB01000060.1| GENE 65 80584 - 80943 246 119 aa, chain - ## HITS:1 COG:PA1439 KEGG:ns NR:ns ## COG: PA1439 COG2832 # Protein_GI_number: 15596636 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 8 117 21 129 135 90 44.0 6e-19 MKTLYIVLGSISLALGILGIFLPLLPTTPFLLLTAALYFKGSPRLYNWLLNHRHFGPYIR NFRENKAIPLRAKIISLVLMWGTMLYCIFFLIPFIWVKILLGLIAAGVTYHILSFKTLK >gi|222159299|gb|ACAB01000060.1| GENE 66 81073 - 81429 459 118 aa, chain + ## HITS:1 COG:aq_853 KEGG:ns NR:ns ## COG: aq_853 COG0720 # Protein_GI_number: 15606204 # Func_class: H Coenzyme transport and metabolism # Function: 6-pyruvoyl-tetrahydropterin synthase # Organism: Aquifex aeolicus # 5 118 7 114 114 66 34.0 2e-11 MFTVIKRMEISASHKLVLPYRSKCASLHGHNWIITVYCRSARLNSEGMVVDFTRIKEVVT EKLDHQNLNEVLPFNPTAENIARWVCKQIPQCYKVEVQESEGNIVIYEKDPSGAEGQE >gi|222159299|gb|ACAB01000060.1| GENE 67 81448 - 81996 217 182 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157803532|ref|YP_001492081.1| 50S ribosomal protein L35 [Rickettsia canadensis str. 
McKiel] # 4 181 19 224 225 88 33 1e-16 MMRKINEIFYSLQGEGYHTGTPAIFVRFSGCNLKCDFCDTQHEEGKMMTDDEIIAEVKKY PAVTVVLTGGEPSLWIDDELIDRLHQAGKYVTIETNGTRPLPAAIDWVTCSPKQGVKLAI DRMDEVKVVYEGQDISIFELLPAEHFFLQPCSCNNTASTVDCVMRHPKWRLSLQTHKLID IR >gi|222159299|gb|ACAB01000060.1| GENE 68 82048 - 82788 660 246 aa, chain + ## HITS:1 COG:BH1832 KEGG:ns NR:ns ## COG: BH1832 COG0247 # Protein_GI_number: 15614395 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Bacillus halodurans # 1 245 1 238 244 163 37.0 2e-40 MKVGLFIPCYINAIYPNVGVASYRLLKSLGVDVDYPLDQTCCGQPMANAGFEDESMKLAL RFDDLFREYDYIVGPSASCVAFVKENHPGILEKEGHVCQSAGKIYDICEFIHDVLKPSKI PARFPHKVSIHNSCHGVRELLISAPTELNIPYYNKLRDLLNLVEGIEVFEPSHIDECCGF GGMFAVEEQAVSVCMGRDKVKDHIATGAEYIVGADSSCLMHMQGVIKREHLPIQIIHIVE ILASQS >gi|222159299|gb|ACAB01000060.1| GENE 69 82785 - 84170 1128 461 aa, chain + ## HITS:1 COG:ECs0345 KEGG:ns NR:ns ## COG: ECs0345 COG1139 # Protein_GI_number: 15829599 # Func_class: C Energy production and conversion # Function: Uncharacterized conserved protein containing a ferredoxin-like domain # Organism: Escherichia coli O157:H7 # 29 458 34 473 475 311 38.0 2e-84 MSTKHSKAAEKFLQDSKMAAWHNETLWMVRAKRDKMSKEVPEWEELRNKACELKLYSNSH LEELLLEFEKNATANGAIVHWAKDADEYCAIVYEILNEHNVHHFIKSKSMLAEECGLNPF LMERGIDVVESDLGERILQLMHLEPSHIVLPAIHIKREQVGELFEKEMGTEKGNFDPTYL THAARKNLRHLFLNAEAAMTGANFAVASTGDIVVCTNEGNADMGTSYPKLNIAAFGMEKI VPDLDALGVFTRLLARSATGQPVTTYTSHYRRPREGGEYHIIIVDNGRSALLSKPDHIKT LNCIRCGACMNTCPVYRRSGGYSYTYFIPGPIGINLGMAHAPEKYYDNLSACSLCMSCSD VCPVKVDLAEQIYKWRQDLDGLGKANTGKKIMSGGMKFLMERPALFNAALWAAPVVNSLP RFMKYNDLDDWGKGRELPEFASESFNEMWKKNKVQGKEESK >gi|222159299|gb|ACAB01000060.1| GENE 70 84167 - 84748 555 193 aa, chain + ## HITS:1 COG:no KEGG:BT_4457 NR:ns ## KEGG: BT_4457 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 193 1 193 193 357 91.0 2e-97 MSSREDILASIRQHTQTRYDKPDIADMKRLSYPDRIEQFCAISRAVGGAAVVLGEGEDVN 
TVIRRTYPDAMRIASVLPDISCATFNPDNLDDPKELDGTDVAVVKGEIGVAENGAIWIPQ TVKYKAIYFISEKLVILLDRNKIVDTMYDAYRELDGQEYQFGTFISGPSKTADIEQALVM GAHGARDVLVILT >gi|222159299|gb|ACAB01000060.1| GENE 71 85083 - 85955 847 290 aa, chain + ## HITS:1 COG:CAC1622 KEGG:ns NR:ns ## COG: CAC1622 COG2240 # Protein_GI_number: 15894900 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal/pyridoxine/pyridoxamine kinase # Organism: Clostridium acetobutylicum # 6 290 5 290 290 322 50.0 6e-88 MYANKVKKIAAVHDLSGMGRVSLTVVIPILSSMGFQVCPLPTAVLSNHTQYPGFSFLDLT DEMPKIIAQWKRLEVEFDAIYTGYLGSPRQIQIVSDFIRDFRRPDSLIVADPVLGDNGRL YTNFDGEMIKEMRHLITKADVITPNLTELFYLLDKPYKADNTDEELKEYLRLLSDKGPQV VIITSVPVHDEPHKTSVYAYNRQGNRYWKVTCPYLPAHYPGTGDTFTSVITGSLMQGDSL PMALDRATQFILQGIRATFGYEYDNREGILLEKVLHNLDMPIQMASYELI >gi|222159299|gb|ACAB01000060.1| GENE 72 86073 - 87356 897 427 aa, chain - ## HITS:1 COG:no KEGG:BT_4460 NR:ns ## KEGG: BT_4460 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 427 1 447 449 363 48.0 7e-99 MKLLDYIRGLRKGKEAHRLERESMQDPFLADAMDGYSQVEGNHEQRIEKLRMQVSAHSAK KKNTRAITWSIAACLAIGFGISSYFLFLKKSMTDEVFIAKESVSSKLAEPAIPPTPAIPA TPTVPTTPQKEIALATAKVKTDSTPASEDSTPVSEITARQADKKDMIAKIQATSQPQQGT PPVATVPVMEEVSEETAALQEVVATMDTFESESDKKMKLAKVATILPQKNMIKGRVTDEK GEPLIGASVTYKGTNIGTITNMNGEFSLVKKDDKKRLTAEYIGYDPVEIRIDTSRTMLIA MNENKQALNEVVVVGYGAKKNKKSTTTGNVVTVKEQAKKEITPQPVIGKRSYQKYLKENL VRPTDDNCKDIKGEVVLSFFVDEEGKPQNITVIHGLCEFADKEAIRLVKEGPKWTSGKLP ARVTVRF >gi|222159299|gb|ACAB01000060.1| GENE 73 87353 - 87925 405 190 aa, chain - ## HITS:1 COG:mll4824 KEGG:ns NR:ns ## COG: mll4824 COG1595 # Protein_GI_number: 13474039 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 17 187 9 179 179 76 25.0 3e-14 MLFFKRNISKLSDEELLIHYTKSGDTEYFGELYNRYIPLLYGLCLKYLHDEDRAQEAVMQ LFEDLLPKLGNYEIKIFKPWLYRVAKNHCLQLLRKENKEIPLDYTVNIMESDEFLHLLSE EESSEEQLKALHHCLEKLPEEQRTSITRFFLEEMSYADIVEQTGFTLNNVKSYIQNGKRN 
LKICIKKQAL >gi|222159299|gb|ACAB01000060.1| GENE 74 87926 - 88582 413 218 aa, chain - ## HITS:1 COG:VC1937 KEGG:ns NR:ns ## COG: VC1937 COG0204 # Protein_GI_number: 15641939 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Vibrio cholerae # 22 216 22 219 223 125 35.0 6e-29 MQATAMQIIYKGVFQWFLKLIVGVQFTDCRFLKKEKQFIILANHNSHLDTLSLLASLPGE LLWKVKPVAAEDYFGKTRFQASISNFFINTLLIRRKGEKDSEHDPIRKMLEAIDAGYSLI LFPEGTRGKPEQMGKIKSGIARILSLRPEVKYIPVFMTGMGRSLPKGKMILLPYKASIYY GMPALVKSADTHEILDQITGDFERMKEKYQVVIDEEEE >gi|222159299|gb|ACAB01000060.1| GENE 75 88607 - 89563 888 318 aa, chain - ## HITS:1 COG:VC1936 KEGG:ns NR:ns ## COG: VC1936 COG4589 # Protein_GI_number: 15641938 # Func_class: R General function prediction only # Function: Predicted CDP-diglyceride synthetase/phosphatidate cytidylyltransferase # Organism: Vibrio cholerae # 22 311 14 303 310 260 47.0 3e-69 MKNLLDKIFPTLSDELIIVISLIIGLLVTASLILFLVKKISPKTNISELAARTRSWWIMA GMFIGAVFISYNISYFFLAFLSFIAFRELYSVLGFREADRGALFWGILSIPIQYYLAYLA WYGAFIIFIPVVMFLVLPLRLVLKGDTHGITKSMALLQWILMLSVFGISHLAYLLSLPEL PGFSSGGRGLLLFLVFLTEINDIMQFIWGKLLGRHKILPKVSPNKTWEGFLGGVISTTVI GYFLGFLTPLSAPNVILVSALLAIAGFSGDVVISAIKRDKGIKDMGNSIPGHGGVFDRID SLSYTAPVFFHLVYYIAY >gi|222159299|gb|ACAB01000060.1| GENE 76 89560 - 90216 453 218 aa, chain - ## HITS:1 COG:VC1935 KEGG:ns NR:ns ## COG: VC1935 COG0558 # Protein_GI_number: 15641937 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphate synthase # Organism: Vibrio cholerae # 2 206 27 233 252 166 45.0 4e-41 MKNEVDGRREIASRNTAWANNIARKLTRWGVTPNQISMMSVFFAMVGCLLLIGTVIYPGF NKYVAYILFIVCMQSRLLCNLFDGMVAIEGGKKSANGDLYNDMPDRFADALFIIPVGYVA GGFGVELGWLGALLAVMTAYFRWIGAYKTHQHFFNGPMAKQHRMALLTLTFVVATCTIHS GYDRMVCFIALIIINIGLIATLIHRLYLISHTTNTEIK >gi|222159299|gb|ACAB01000060.1| GENE 77 90406 - 92256 2052 616 aa, chain + ## HITS:1 COG:STM2315 KEGG:ns NR:ns ## COG: STM2315 COG2304 # Protein_GI_number: 16765642 # Func_class: R General function 
prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Salmonella typhimurium LT2 # 146 610 118 591 593 413 46.0 1e-115 MKTNQFRAMMLVLLMAVVSLGMVNAQAITVSGTVTDAKDGTPLVGCSVQIKGTTKGTVTN MNGRYTIQAKKGETLLFQYIGYKQERRVVKSATLDVKMKADELVLEECVVVGYGHETRAA KVMSTAYRAVCPTPGIMYDAANAEEYGEFQENGFKSVSDAPLSTFSIDVDAASYSNMRRF INKGELPPVDAIRTEELVNYFSYDYPKPTGSDPVKITMESGACPWNTNHRLVRIGLKAKE IPTDNLPASNLVFLIDVSGSMWGANRLDLVKSSLKLLVNNLRDKDKVAIVTYSGSAGVKL EATPGSDKQKIREAIDELTAGGSTAGGAGILLAYKIAKKNLISNGNNRIILCSDGDFNVG VSSAEGLEQLIEKERKSGVFLTVLGYGMGNYKDKKIQVLAEKGNGNHAYIDNLQEANRVL VGEFGATLHTVAKDVKLQVEFNPSQVQAYRLVGYESRLLKDEDFNNDAKDAGDMGAGHTV TAFYEVIPTGVKNEYVGKIDDLKYQKKEKVSVKPTGSNELLTVKLRYKAPDKDVSKKMEL PFVDNKGNNVSSDFRFASAVAMFGQLLRESDFKGNASYDKVIDLAKQGLNNDDKGYRREF IRLVEAAKGLERTNKN >gi|222159299|gb|ACAB01000060.1| GENE 78 92490 - 92633 80 47 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFGDSEVSTTFAAANTGVAQLVEHRSPKPGVGSSSLSSRAELNSCKR >gi|222159299|gb|ACAB01000060.1| GENE 79 92654 - 93007 324 117 aa, chain - ## HITS:1 COG:FN0052 KEGG:ns NR:ns ## COG: FN0052 COG1393 # Protein_GI_number: 19703404 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Fusobacterium nucleatum # 4 116 5 117 120 122 61.0 1e-28 MATLFLQYPACSTCQKAKKWLTENNIEFTNRLIVEENPTVEELKAWIPRSGLPLKKFFNT SGLVYKELKLSEKLPAMSEEEQIVLLATNGKLVKRPLVVTDSFVLVGFKPDEWEKLK >gi|222159299|gb|ACAB01000060.1| GENE 80 93140 - 93718 322 192 aa, chain - ## HITS:1 COG:NMB0698 KEGG:ns NR:ns ## COG: NMB0698 COG3663 # Protein_GI_number: 15676596 # Func_class: L Replication, recombination and repair # Function: G:T/U mismatch-specific DNA glycosylase # Organism: Neisseria meningitidis MC58 # 1 186 33 220 229 201 51.0 7e-52 MEIETHPLEPFLPAKSKLLMLGSFPPQKKRWSMDFYYPNLNNDMWRIYGILFFNDKNHFL NSTLKSFCREQIIDFLNEKGIALFDTASSIRRLQDNASDKFLEVVKATDVAALLRQLPEC KAIVTTGQKATDTLRQQFEIEEPKVGDYSEFVFEGRAMRLYRMPSSSRAYPLALDKKAAA YRIMFQDLQILR 
>gi|222159299|gb|ACAB01000060.1| GENE 81 93887 - 96958 2482 1023 aa, chain + ## HITS:1 COG:no KEGG:BT_4470 NR:ns ## KEGG: BT_4470 # Name: not_defined # Def: outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 1023 19 1035 1035 1708 80.0 0 MSLFFLEGGNCYASEQQLLRGTIETPSDNYTVRGRVIDVYGEPLIGATIREKGGANGTVT DIDGRFFLSVPDSAVLQVSFVGYKTLEVNVTGRTMLEIRLQEDAVMLDHVIVTALGIEKD EATLAYSAQKIKGEELNRIKEINMITALAGKAAGVQINKNSSGMGGSAKVSLRGIRSVSG DNQPLYVIDGVPMSNTSSEQAYSAIGGTANAGNRDGGDGISNLNPEDVESISILKGAPAA ALYGSMAANGVILITTKKGNSVGQRNINFSTGLTFEKAFSLPKMQNRYGVSDVVDSWGEK ENLTAHDNLNDFFRTGLTSMTSVSVSYGNETLQTYFSYANTTGKGIMDKNKLKKHNINLR ETATMFNKRLKLDGNVNVMKQTVENKPVSGGFYMNPLVGLYRFPRGENLSYYKDHFETYD EERNLNVQNWHTFTEDFEQNPYWIVNRIQSKETRTRIILSLSANFRINDWLTIQARGNMD YWADKLRQKFYASTATALCGANGRYIEMDYQETQMYGDVMAMFKKTWGDFTLDAAIGGSI NDRIRNSTRYDSKNASLKFANVFNIANIIMNSSASIDQKIDEHRQLQSIFGTAQIGYKEK LFLDLTARNDWASTLAYTEHEKAGFAYPSVGLSILLDKWVKLPEWISFAKLRGAYSKVGN DIPVFVTNSTSHIGAGGEIQANDAAPFKDMEPEMTHAMEFGTEWRFFQHRLGINLTYYRT NTYNQFFKLPALAGDKFAFRYVNAGNIQNQGWEVTLNGTPILTSDFTWKTSINFSANKNK IVKLHDELKELVYGPTSFSSSYAMKLVKGGSIGDIYGKAFVRDAAGNIVYETEGDYKGLP LVEGDGNTVKVGNANPVFMMGWDHTFSYKGFSLYFLLDWRYGGKVLSQTQAEMDLYGVSE ITADARDRGYVMLEGQQIDNVKGFYKNVVGGRAGVTEYYMYDATNLRLREVSLSYNFSKK WIQKTKVLKDVQLSFVARNLCFLYKKAPFDPDLVLSTGNDNQGIEVFGMPTTRSLGFTLK CEF >gi|222159299|gb|ACAB01000060.1| GENE 82 96974 - 98569 1279 531 aa, chain + ## HITS:1 COG:no KEGG:BT_4471 NR:ns ## KEGG: BT_4471 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 531 1 531 531 920 80.0 0 MKRKYVYLYLLGLLLFSVACTGNFCDYNTDLSGITDDDLIIDDNGYGIRLGIIQQGIYFN YDFGKGKNWPFQLIQNLNADMFGGYMHDGKPLNGGSHNSDYNMQDGWNSAMWTHMYSYIF PQIYQSENATRNTHPALFGITKILKVEAMHRVTDYYGPIIYKNFANAEKHYRPDKQKDVY YEFFNELDSAVVALTGYIEEKPEFNGFARFDILLDGKYPSWVKFANSLRLRLAMRIASVA PDKARAEIQKIKENDYGFFEAETGGAVVSTKSGYTNPLGELNRVWNETYMSANMESILVG YNDPRLGIYFELCTDETLKGQYRGIRQGTCFAHSHYSGLSKLFVKQSTDAPLMTASEVWF LRAEAALRGWTDEDEETCYQNGVTTSFHQWGIYGVEDYLQSERKASDFIDTYDEENNIEA 
RCKVSPKWNRLDDKETKLEKIITQKWIAMFPEGCGAWAEQRRTGYPRLFPVRFNHSRNGS IDTEIMVRRLNFPGTLQTEDPEQYSALVEALGGNDHGGTRLWWDTGNNNLE >gi|222159299|gb|ACAB01000060.1| GENE 83 98643 - 99350 683 235 aa, chain + ## HITS:1 COG:no KEGG:BT_4472 NR:ns ## KEGG: BT_4472 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 235 1 235 235 432 91.0 1e-120 MKQIISALFLCMLLSAVPGLQAQNIQLHYDFGRSLYDKDLQGRPLLTSTVEKFHPDTWGS TYFFVDMDYTSEGVAAAYWEIAREVKFWKGPFSAHLEYNGGLSKGMSYKNAYLAGATYTF NNASFSKGFTLTAMYKYIQKHSSPNNFQLTGTWYVNFCRNLLTFSGFADWWREETNYGKT IFLSEPQFWVNLNQIKGVNKNFNLSVGSEVELSNNFGGRDGFYVIPTLALKWPLN >gi|222159299|gb|ACAB01000060.1| GENE 84 99504 - 100817 346 437 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 9 433 7 418 447 137 25 2e-31 MKTDLIYGVEDRPPFKDALFAALQHLLAIFVAIITPPLIIASALKLDVEKTSFLVSMSLF ASGVSTFIQCRRFGPIGAKLLCIQGTSFSFIGPIIATGMVGGLPLIFGSCVAAAPIEMIV SRTFKYLRNIITPLVSGIVVLLIGLSLIKVGIVSCGGGYAAMDNGTFATWENLSIAALVL LSVLFFNRCGNKYLRMSSIVLGLCLGYGLAFVLGKVDMSALNVEMLMSFNIPQPFKYGVD FNVSSFIAIGLVYLITAIEATGDVTANSMISGLPIEGDSYLKRVSGGVMADGFNSFLAGI FNSFPNSIFAQNNGIIQLTGVASRYVGYYIAAMLVLLGLFPIVGAVFSLMPDPVLGGATL LMFGTVAAAGIRIISSQEIGRKETLVLAVSLSLGLGVELMPDVLQQAPEAIRSIFSSGIT TGGLTAIIANIVIRVKE Prediction of potential genes in microbial genomes Time: Wed May 18 02:40:49 2011 Seq name: gi|222159298|gb|ACAB01000061.1| Bacteroides sp. D1 cont1.61, whole genome shotgun sequence Length of sequence - 6837 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 5, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 8 - 52 1.2 1 1 Tu 1 . - CDS 111 - 2249 1327 ## COG1509 Lysine 2,3-aminomutase - Prom 2277 - 2336 4.2 2 2 Op 1 . - CDS 2391 - 2699 87 ## gi|295087043|emb|CBK68566.1| hypothetical protein 3 2 Op 2 . - CDS 2735 - 3367 552 ## BT_4475 hypothetical protein - Prom 3388 - 3447 11.0 4 3 Op 1 . 
- CDS 3479 - 4867 793 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 5 3 Op 2 . - CDS 4894 - 5541 541 ## BT_4477 putative ATP-dependent DNA helicase - Prom 5612 - 5671 6.8 + Prom 5576 - 5635 4.8 6 4 Tu 1 . + CDS 5689 - 5865 329 ## gi|167754220|ref|ZP_02426347.1| hypothetical protein ALIPUT_02513 + Term 5888 - 5937 11.3 - Term 5874 - 5926 12.8 7 5 Tu 1 . - CDS 5955 - 6836 789 ## BVU_3461 hypothetical protein Predicted protein(s) >gi|222159298|gb|ACAB01000061.1| GENE 1 111 - 2249 1327 712 aa, chain - ## HITS:1 COG:MJ0634 KEGG:ns NR:ns ## COG: MJ0634 COG1509 # Protein_GI_number: 15668815 # Func_class: E Amino acid transport and metabolism # Function: Lysine 2,3-aminomutase # Organism: Methanococcus jannaschii # 8 679 11 619 620 137 23.0 8e-32 MKQKKMLTLTFSQLKQIYGQEIPEIVEIADKSSTVEDFKAGILRLLETCRIENEAAEEAR EQIRLLLDYDGQNVHELSTGQDMSVQTIRLLYEFLTGTLENMEMPTDLFIEIFQMFKRLK GEVMPLPSPQRIKSRNDRWETGLDEEVREIRDENKERMLHLLIQKIENRKSKPSVRFHFE EGMSYEEKYRLVSEWWNDFRFHLAMAVKSPGELNRFLGNSLSSETMYLLYRARKKGMPFF ATPYYLSLLNITGYGYNDEAIRSYILYSPRLVETYGNIRAWEKEDIVEVGKPNAAGWLLP DGHNIHRRYPEVAILIPDTMGRACGGLCASCQRMYDFQSERLNFEFESLRPKESWDRKLR RLMTYFEEDTQLRDILITGGDALMSQNKTLQNILDAVYRMAVRKQKANLERPEGEKYAEL QRVRLGSRLLAYLPMRINDGLVDILREFKEKASAIGVKQFIIQTHFQTPLEVTPEAKEAI RKILSAGWIITNQLVYTVAASRRGHTTRLRQVLNSLGVVCYYTFSVKGFNENYAVFAPNS RSMQEQQEEKIYGRMTPEQAEELYKILETKVGTEEETKEDVAKQLRRFMRKHHLPFLATD RSVLNLPAIGKSMTFQLVGLTEEGKRILRFEHDGTRHHSPIIDQMGQIYIVENKSLAAYL RQLAKMGEDPEDYASIWNYTKGETEPRFSLYEYPDFPFHTTDKMSNLEISIK >gi|222159298|gb|ACAB01000061.1| GENE 2 2391 - 2699 87 102 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|295087043|emb|CBK68566.1| ## NR: gi|295087043|emb|CBK68566.1| hypothetical protein [Bacteroides xylanisolvens XB1A] # 1 83 1 83 83 135 100.0 1e-30 MIEKAGVLYFRFFCYLYVLFSFVLIQKKKYQKEKIKAASARLLRSHPRLKGRNSLRSNSL PFLTPGMKPPLDAVQTRPGGHAAWNAMQCRVLCGVIVYYVAL >gi|222159298|gb|ACAB01000061.1| GENE 3 2735 - 3367 552 210 aa, chain - ## HITS:1 COG:no KEGG:BT_4475 NR:ns ## KEGG: BT_4475 # Name: 
not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 210 1 210 210 395 96.0 1e-109 MGLFGLFKKKSDETKVGNVEDFISLTRVYFQSVIATNLGITNIRFLPDVANFKRLFKVPT QNGKLGLAEKSASRKMLMQDYGLNENFFKEIDASVKRNCRTQNDIQSYLFMYQGFSNDLM MLMGNLMQWKFRMPSIFKKALYGMTQKTVHDVCTKMVWKADDVHKTAAAIRQYKERLGYS EQWMTDYVYNIVLLAKKEPKRKDDDTETKK >gi|222159298|gb|ACAB01000061.1| GENE 4 3479 - 4867 793 462 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 2 434 5 440 456 310 39 3e-84 MTFINEINDILWTYILIIMLLGCAIWFSIQTRFVQFRMIREMIVLLSESAGKGKQGEKHV SSFQAFAISIASRVGTGNLAGVATAIAIGGPGAIFWMWVIALLGASSAFIESTLGQLYKI RGKDSFIGGPAYYMKKGLKQPWMGMLFAILISITFGFAFNSVQSNTICAAAEHAFGVNHI LLGGVLTLLTLVIIFGGIQRIARVSSIIVPVMALGYVGLALVIVILNITHLPGVIALIVS HAFGWEQALAGGVGMALMQGIKRGLFSNEAGMGSAPNAAATAHVSHPAKQGLIQTLAVFT DTLLICTCTAFIILFSGAPLDGSTNGVQLTQQALTNEIGSSGSVFVAVALFFFAFSSILG NYYYGEANIRFITHRKWVLHGYRILVGGMVLFGSLATLDMVWSLADVTMALMAICNLIAI LFLGKYAIRLLNDYRAQKKAGIQSPVFKKESMPDIEKDLECW >gi|222159298|gb|ACAB01000061.1| GENE 5 4894 - 5541 541 215 aa, chain - ## HITS:1 COG:no KEGG:BT_4477 NR:ns ## KEGG: BT_4477 # Name: not_defined # Def: putative ATP-dependent DNA helicase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 215 1 215 215 386 92.0 1e-106 MKLLTDTDYIHALIAEGEHQQQDFKFEISDARKIAKTLSAFANTDGGRLLIGVKDNGKIA GVRSEEEKYMIEAAAQLYCVPEVEYTLQTYIVEGRQVLVATIEETPHKPVYAKDETGKPL AYLRIKDENILATPIHLRVWQQSDSPRGELIRYTEREQLLLEQLEHGTLLSLNRYCRQTG LSRRAAEHLLAKFVRYDIVEPVFENHKFYFRIKNE >gi|222159298|gb|ACAB01000061.1| GENE 6 5689 - 5865 329 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754220|ref|ZP_02426347.1| ## NR: gi|167754220|ref|ZP_02426347.1| hypothetical protein ALIPUT_02513 [Alistipes putredinis DSM 17216] # 1 58 33 90 90 71 74.0 1e-11 MKELVEKVAALYADFSKDANAQIENGNKAAGTRARKASLEIEKAMKEFRKASLEASKK >gi|222159298|gb|ACAB01000061.1| GENE 7 5955 - 6836 789 293 aa, chain - ## HITS:1 COG:no KEGG:BVU_3461 NR:ns ## KEGG: BVU_3461 # Name: 
not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 293 291 582 590 427 72.0 1e-118 FSEKSYGGTTMYNSNMVLYFVDNYIRNGGYMPRNMVEENIRVDYNKLRMLIRKDKEFAHD ASTIQTLVQQGYITGELKTGFPAETIAEPDNFISLLFYFGMLTISGTKRGKTLLTIPNQV VREQLYSYLLDTYNEANLRFDNWEKGELASAMAYDGDWKAYFDYIAECLHRYSSQRDKQK GEAYVHGFTLAMTAQNRFYRPISEQENQEGYADIFMFPLLDIYKDMLHSYIIELKYAKGK DSDEKVEQLRQEAITQANRYAASETVQKAIGTTTLHKIIVVYQGMKMVVCEEV Prediction of potential genes in microbial genomes Time: Wed May 18 02:41:09 2011 Seq name: gi|222159297|gb|ACAB01000062.1| Bacteroides sp. D1 cont1.62, whole genome shotgun sequence Length of sequence - 1343 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 461 - 1297 747 ## BVU_3461 hypothetical protein Predicted protein(s) >gi|222159297|gb|ACAB01000062.1| GENE 1 461 - 1297 747 278 aa, chain - ## HITS:1 COG:no KEGG:BVU_3461 NR:ns ## KEGG: BVU_3461 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 278 306 582 590 405 72.0 1e-112 MVLYFVDNYIRNGGYMPRNMVEENIRVDYNKLRMLIRKDKEFAHDASTIQTLVQQGYVTG ELKTGFPAETVAEPDNFTSLLFYFGMLTISGTLEGETKLTIPNQVVREQLYSYLLDTYNE ADLRFDNWEKGKLASAMAYRGDWKAYFDYIAECLHRYSSQRDKQKGEAYVHGFTLAMTAQ NRFYRPISEQENQEGYADIFMFPLLDIYKDMLHSYIIELKYAKGKDSDEKVEQLRQEAIT QANRYAASETVQKAIGTTTLHKIIVVYQGMKMVVCEEV Prediction of potential genes in microbial genomes Time: Wed May 18 02:41:14 2011 Seq name: gi|222159296|gb|ACAB01000063.1| Bacteroides sp. D1 cont1.63, whole genome shotgun sequence Length of sequence - 1614 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 231 - 290 7.2 1 1 Tu 1 . 
+ CDS 359 - 1297 777 ## BT_4479 integrase protein Predicted protein(s) >gi|222159296|gb|ACAB01000063.1| GENE 1 359 - 1297 777 312 aa, chain + ## HITS:1 COG:no KEGG:BT_4479 NR:ns ## KEGG: BT_4479 # Name: not_defined # Def: integrase protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 310 1 303 305 501 80.0 1e-141 MLKVVTFMKQVAMGLQVEGNFGTAHVYRSSLNAIIAYCGEEDLLFSEVTSEWLKGFEVYL RSRGCSWNTVSTYLRTFRAVYNRAVDLGKAPYVPHLFRSVYTGTRADHKRALCDDDMKKV FAKLSRTSGVPFAVCQAQELFILMFSLRGMPFVDLAYLRKSDLRDNVITYRRRKTGRPLS VTLTPEAMILVKKYMNRDLSSPYLFPLLKSREETKEAYREYQLALRSFNQQLMLLGELLG LSDKLSSYTARHTWATTAYYCEIHPGIISEAMGHSSITVTETYLKPFRSKKIDEANKQVL DFVKRSVVDVNT Prediction of potential genes in microbial genomes Time: Wed May 18 02:41:27 2011 Seq name: gi|222159295|gb|ACAB01000064.1| Bacteroides sp. D1 cont1.64, whole genome shotgun sequence Length of sequence - 38259 bp Number of predicted genes - 37, with homology - 37 Number of transcription units - 24, operones - 8 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 2944 2536 ## BT_1826 hypothetical protein + Term 3177 - 3220 11.7 2 2 Op 1 . - CDS 3277 - 3726 392 ## COG3023 Negative regulator of beta-lactamase expression 3 2 Op 2 . - CDS 3730 - 4053 250 ## BT_1518 hypothetical protein - Prom 4089 - 4148 5.6 4 3 Tu 1 . - CDS 4311 - 4835 544 ## BT_1517 hypothetical protein - Prom 4941 - 5000 8.3 5 4 Op 1 . - CDS 5026 - 6387 1394 ## COG0305 Replicative DNA helicase 6 4 Op 2 . - CDS 6405 - 7025 635 ## BT_1515 hypothetical protein 7 5 Tu 1 . - CDS 7605 - 7880 259 ## BT_1514 hypothetical protein - Prom 8044 - 8103 5.5 8 6 Op 1 1/0.000 + CDS 8160 - 9422 1082 ## COG4277 Predicted DNA-binding protein with the Helix-hairpin-helix motif + Prom 9424 - 9483 5.1 9 6 Op 2 . + CDS 9609 - 10382 621 ## COG1573 Uracil-DNA glycosylase + Term 10409 - 10446 -0.9 - Term 10313 - 10347 3.6 10 7 Tu 1 . - CDS 10389 - 10703 380 ## BF2945 hypothetical protein - Prom 10875 - 10934 7.6 11 8 Tu 1 . 
- CDS 10941 - 11609 376 ## PROTEIN SUPPORTED gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 - Prom 11659 - 11718 5.7 + Prom 11623 - 11682 7.6 12 9 Tu 1 . + CDS 11785 - 12153 164 ## BT_4485 hypothetical protein + Term 12192 - 12240 2.2 13 10 Tu 1 . - CDS 12157 - 13026 579 ## COG1226 Kef-type K+ transport systems, predicted NAD-binding component 14 11 Tu 1 . - CDS 13158 - 14138 971 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) + Prom 14371 - 14430 7.5 15 12 Op 1 . + CDS 14450 - 15763 1314 ## COG1090 Predicted nucleoside-diphosphate sugar epimerase + Term 15816 - 15859 -0.5 + Prom 15861 - 15920 5.1 16 12 Op 2 . + CDS 15942 - 16217 406 ## COG2388 Predicted acetyltransferase + Term 16248 - 16295 7.3 17 13 Tu 1 . - CDS 16303 - 16569 402 ## BT_4495 hypothetical protein - Prom 16589 - 16648 3.9 18 14 Tu 1 . - CDS 16688 - 17197 500 ## COG0300 Short-chain dehydrogenases of various substrate specificities 19 15 Op 1 . - CDS 17315 - 17419 144 ## gi|298483425|ref|ZP_07001602.1| oxidoreductase, short chain dehydrogenase/reductase family 20 15 Op 2 . - CDS 17440 - 17739 240 ## BF2926 hypothetical protein - Prom 17762 - 17821 3.5 + Prom 17766 - 17825 5.2 21 16 Tu 1 . + CDS 17855 - 18127 241 ## BT_4498 hypothetical protein + Prom 18169 - 18228 2.6 22 17 Tu 1 . + CDS 18248 - 19621 1170 ## COG0534 Na+-driven multidrug efflux pump 23 18 Op 1 . - CDS 19718 - 20701 643 ## BT_4500 hypothetical protein 24 18 Op 2 . - CDS 20762 - 22609 1216 ## COG0564 Pseudouridylate synthases, 23S RNA-specific - Prom 22631 - 22690 1.8 + Prom 22623 - 22682 4.1 25 19 Tu 1 . + CDS 22780 - 23256 481 ## BT_4502 hypothetical protein + Prom 23268 - 23327 4.6 26 20 Tu 1 . + CDS 23377 - 23841 641 ## COG2030 Acyl dehydratase + Term 23980 - 24013 -0.4 - Term 23877 - 23925 -0.8 27 21 Tu 1 . - CDS 23957 - 24937 746 ## COG3049 Penicillin V acylase and related amidases - Prom 24957 - 25016 6.0 + Prom 25045 - 25104 5.0 28 22 Tu 1 . 
+ CDS 25138 - 25770 377 ## COG2095 Multiple antibiotic transporter 29 23 Op 1 . - CDS 25799 - 26326 413 ## BT_4505 hypothetical protein 30 23 Op 2 . - CDS 26348 - 27760 924 ## COG0346 Lactoylglutathione lyase and related lyases 31 23 Op 3 . - CDS 27832 - 28713 642 ## COG2367 Beta-lactamase class A - Prom 28922 - 28981 5.5 + Prom 28899 - 28958 7.7 32 24 Op 1 . + CDS 28999 - 30195 1034 ## BDI_0255 hypothetical protein 33 24 Op 2 . + CDS 30225 - 30776 613 ## BDI_0254 hypothetical protein 34 24 Op 3 . + CDS 30760 - 34428 2807 ## BDI_0253 putative DNA repair ATPase 35 24 Op 4 . + CDS 34425 - 35285 657 ## BDI_0252 hypothetical protein 36 24 Op 5 . + CDS 35295 - 37049 1198 ## BVU_2278 hypothetical protein + Prom 37051 - 37110 3.0 37 24 Op 6 . + CDS 37130 - 38023 599 ## COG1708 Predicted nucleotidyltransferases Predicted protein(s) >gi|222159295|gb|ACAB01000064.1| GENE 1 2 - 2944 2536 980 aa, chain + ## HITS:1 COG:no KEGG:BT_1826 NR:ns ## KEGG: BT_1826 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 980 13 1038 1038 421 36.0 1e-115 ALTVSTTVTSCKDYDDDIKNLQEQIDKVTSTNPVSTEDMKTAISSAIQTLQTQLQTAIDG KADAKAVQDLLKTVEALQTALENKADASTIKTLGDQITALSEQVNSIEGTLNKTKEDLEA KVADLTEKLAGAASSEDLEKLADELAEAKNELKAVKEMADNNAAAIVSIQANILELQKLD GRISALETFNQNAASKDDLADYVAHSELAGLVDDEVLELLKDNGSIAKYVNDAIESQVLA EASAINLAIKGVDGRLATLSTNFETYKQEQATAYQTVTGNITTLTTFKTTIETALTDGGY ENFAAVLTEITTIKTSYGYCATKEVFDGKVEAYLADYKKTVNDKFAALEKRLTALENQIQ SIVYIPEYEDGQVKFMSYFYDNKLVAETEPIRMKFRISPATAAANFAENYTPSFDGQEIK TRAAEIYHIEKTEVDEATGIVTFTVSTSTDKSFAVSLNLIAKDQSKNLTNISSNYFPVIS DYRAVTGVAVKSPNEAASSILYDKPLSVIDYATGAVLQITGTDRAGKNIDNEPMASSVNT EKFVVTYSVGGTDPESYTIENGVLKLTTYNDASFNGKKVTPEAKVEIIGTNFSKVTTFTE VTAKAASADPEVPSTIAAIEFDGTKEQVADVTVTYGTIGGSTDIGILQAAYEKLPQENFT FKPSATGGVTLRFKANTTKNELEIVVPKGTEAGTYVPEVKVKVSDVQNFTLKPSIEVKVT AATYTLVYDGNLTSGTMALKPNLIPIAKPTSLNFAMSIPTLFTNYETIVEKAEAVGATVQ FSLTEEITGVSIVNNVLTIDKTYSNPDPAKDIVVAAKVIAKNTAGEDVELVVAKETTFTL 
TDVSGTWAAPGSTEIKIGSDNLNNSFPLAKDFKWTASNSKVMWKDGAEVTAGTDWGVAPL GIFGFVAPTFELADADNAKYVILNKNTGAFSLSEAGKVLSVSKDIVVNIVAESRWGTITG YDAAKTVTVKIDLSKEAASI >gi|222159295|gb|ACAB01000064.1| GENE 2 3277 - 3726 392 149 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 46 149 2 105 116 94 44.0 7e-20 MRTINLIVVHCSATREDKSFTEHDLDVCHRRRGFNGVGYHFYIRKNGDIKSTRPLERIGA HSRGFNRESIGICYEGGLDCMGQPKDTRTCWQKHSLRVLILTLLKDFPGCRVCGHRDLSP DLDGDGEIEPEEWIKACPCFEASKEWDKE >gi|222159295|gb|ACAB01000064.1| GENE 3 3730 - 4053 250 107 aa, chain - ## HITS:1 COG:no KEGG:BT_1518 NR:ns ## KEGG: BT_1518 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 101 1 101 101 131 72.0 6e-30 MLDKIIDIITAILPFFGGRKKRQQMMQDVKEFSELVKEQYGFLMKQLEKVLKDYFDLSDR VKEMHSEIFSLKGKLSEAVTLQCVNKECIQRNNGSESASSSTSLIPA >gi|222159295|gb|ACAB01000064.1| GENE 4 4311 - 4835 544 174 aa, chain - ## HITS:1 COG:no KEGG:BT_1517 NR:ns ## KEGG: BT_1517 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 142 1 142 176 225 81.0 5e-58 MDVLVERYQRRKYVNQEDAPLLYYIRQKSGNVRVMDVDTMATAIESKSSLTAGDVKHTIE AFVEQLRLSLTQGDKVKIDGLGTFHITLTSDGTESMKDCTVRSIRRVNVRFVADKALKLM NSSHTSTRSENNVDFVLGGKGDGSDSGNGGSDDDSGSGSGGNKPGGGEAPDPAA >gi|222159295|gb|ACAB01000064.1| GENE 5 5026 - 6387 1394 453 aa, chain - ## HITS:1 COG:lin0047 KEGG:ns NR:ns ## COG: lin0047 COG0305 # Protein_GI_number: 16799126 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Listeria innocua # 3 427 11 427 450 262 38.0 1e-69 MQPHASELEEAIIGACLIEQEALPLVADKLRPEMFYDDHHQLLFAALIAMYQANKKIDIL TVKEELTRRGVLEKIGGPYTIVQLSSRVASSAHIEYHAQIVHQKYLAREAVVGFNKLLTC AMDETIDIDDTLIDAHNLLDRLEGESGHHDHIRCMDTLMTDTLKEAELRIAKSVNGVTGI PTGLTELDQKTGGLQDSDLIVIAARPSVGKTAFALHLARSAAMAGNAVAVYSLEMQGERL 
ADRWLAAASNINPYRWRNGIPTLQEMENAHTAASELSGLPIYVDDSTSVSMDHIRSSARL LKSRNQCDAIIIDYLQLCDMATKQANRNREQEVAQATRKAKLLAKELHIPVILLSQLNRE SENRPGGRPELAHLRESGAIEQDADVVMLLYRPAMQRIVTDRESGYPTEGLGVVIVAKQR NGETGNVYFGHNPSMTKIYDYVPPLEYLEKHAK >gi|222159295|gb|ACAB01000064.1| GENE 6 6405 - 7025 635 206 aa, chain - ## HITS:1 COG:no KEGG:BT_1515 NR:ns ## KEGG: BT_1515 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 204 1 210 212 176 47.0 4e-43 MKQLPNNLLHEGYILIPKALLKRQINDKAPGELEALLQVLIHANYSETTYKIQQIDIVCQ RGESVISQQHWSRLFQWSRSKTLRFFQKIQEEGIIKIIPHQKGIFHIHINNYDFWTGCIS PEAREEKKKEKSEVFDVFWDKYHETMQKPKQYVARARREWDKLTKEEQQTAIDHIEEVYY HTNDTRFIPLAATYLKDKAFLNEYID >gi|222159295|gb|ACAB01000064.1| GENE 7 7605 - 7880 259 91 aa, chain - ## HITS:1 COG:no KEGG:BT_1514 NR:ns ## KEGG: BT_1514 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 90 5 86 99 68 45.0 7e-11 MKTSKENEEEMEEREWVINGFKTFSAVAQEYFPEYSNADTASKRMRNEIELDKQLFDELK AAHYVHETTRLSPKQQQILFMTWGPEKIILR >gi|222159295|gb|ACAB01000064.1| GENE 8 8160 - 9422 1082 420 aa, chain + ## HITS:1 COG:CAC3343 KEGG:ns NR:ns ## COG: CAC3343 COG4277 # Protein_GI_number: 15896586 # Func_class: R General function prediction only # Function: Predicted DNA-binding protein with the Helix-hairpin-helix motif # Organism: Clostridium acetobutylicum # 4 406 2 401 440 445 53.0 1e-125 MNENVLAKLKILAESAKYDVSCSSSGTVRSNKPGTLGNTVGGWGICHSFAEDGRCISLLK IMLTNYCIYDCAYCINRRSNDLPRATFSVSELVELTMEFYRRNYIEGLFLSSGVVRNPDY TMERLVRVAKDLRQVYRFNGYIHLKSIPGASRELVNEAGLYADRLSVNVEIPKEENLKLL APEKDHKSVFAPMKYIQQGVLESKEERQKFRHAPRFAPAGQSTQVIVGATSESDKDILFL SSALYGRPTMKRVYYSGYVSVNTYDKRLPALKQPPLVRENRLYQADWLLRFYQFKVDEIV DDAYPDLDLEIDPKLSWALRHPEQFPVDINKADYEMLLRVPGVGVKSAKLIVASRRFSRL GFYELKKIGVVMKKAQYFITCKELPLQMLTVNELSPQRVRSLLLPKPKKKVDERQLRFDF >gi|222159295|gb|ACAB01000064.1| GENE 9 9609 - 10382 621 257 aa, chain + ## HITS:1 COG:CC2333 KEGG:ns NR:ns ## COG: CC2333 COG1573 # Protein_GI_number: 16126572 # Func_class: 
L Replication, recombination and repair # Function: Uracil-DNA glycosylase # Organism: Caulobacter vibrioides # 87 254 78 231 479 68 29.0 8e-12 MKVFVYDKSFDGLLTAVFDAYFRKTFPDDLLSEGDALPLFYDELHTVVTDEEKAGRVWRG LQKKVSVSALGCLTQSWLSELPEVGMLIFRYIRKAIDAPRSIETNFGDPDILRLAQIWKK VDGERVHLMQFVRFQKAADGTFFAAFEPQYNALPLTVHHFKDRFADQKWIIYDMKRRYGF YYDLQEVTTISFDDDSRESHLITGMLDESLMDKDEKLFQQLWKTYFKAICIKERMNPRKH RQDMPVRYWKYLTEKQK >gi|222159295|gb|ACAB01000064.1| GENE 10 10389 - 10703 380 104 aa, chain - ## HITS:1 COG:no KEGG:BF2945 NR:ns ## KEGG: BF2945 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 104 1 104 104 161 86.0 8e-39 MEKMKFYTEEEITDKHIGKKGTLARDKFEGDLQSFLIGEAIRKARQSKNLTQEELGNLIG VQRAQISRIENGKNLTLSTLSKVFKAMGISAKLEIGNLGKVALW >gi|222159295|gb|ACAB01000064.1| GENE 11 10941 - 11609 376 222 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 [Bacillus selenitireducens MLS10] # 8 216 15 229 236 149 39 2e-35 MMLNIDFVIRLLVAGILGAIIGLDREYRAKEAGYRTHFLVSLGSALIMIVSQYGFQEIIK ESSVTLDPSRVAAQVVSGIGFIGAGTIIFQKQIVRGLTTAAGIWATAGIGLAVGAGMYTI GIAAMVLTLIGLELLSYLFKSIGMKSSMVSFSTSNKDTLKQIADRFNSKDYLIVSYEMET LHKGEAEFYQVSMVIKSKRNNDEGHLLSLIQEFPEVTVQRIE >gi|222159295|gb|ACAB01000064.1| GENE 12 11785 - 12153 164 122 aa, chain + ## HITS:1 COG:no KEGG:BT_4485 NR:ns ## KEGG: BT_4485 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 121 1 121 121 157 81.0 1e-37 MKKRFFYIISFLLIFCIEVLIALYVRDNFIRPYVGDMLVVVLVYSFVRIFLPTGIPRMPF YVFLFACFVEVLQYFQLVETLGITNRVARIILGSTFDWADIACYAVGCVFIVLFERFFQH KS >gi|222159295|gb|ACAB01000064.1| GENE 13 12157 - 13026 579 289 aa, chain - ## HITS:1 COG:MA2034 KEGG:ns NR:ns ## COG: MA2034 COG1226 # Protein_GI_number: 20090882 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, predicted NAD-binding component # Organism: Methanosarcina acetivorans str.C2A # 17 279 19 279 279 273 51.0 4e-73 
MKLKERIHRFLHDEKLKRKLYVIIFESDTPAGKLFDVILIGCILVSVLLVIIESLKGLPS FLTTPFVIMEYLFTAFFTFEYLTRIYCSPRPRKYIFSFFGIVDLLATLPLYIGLLFPGAR YLLIIRAFRLIRVFRVFKLFNFLNEGERLLTALRESSKKIAVFFLFVVILVTSIGTLMYM IEGTQPNSQFNNIPNSIYWAIVTMTTVGYGDITPVTGFGKFLSACVMLIGYTIIAVPTGI VSASMMKDYKRRRDKECPNCHRSGHEDNAEFCKYCGHHLNPSETDLEKK >gi|222159295|gb|ACAB01000064.1| GENE 14 13158 - 14138 971 326 aa, chain - ## HITS:1 COG:aq_1420 KEGG:ns NR:ns ## COG: aq_1420 COG0741 # Protein_GI_number: 15606599 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Aquifex aeolicus # 93 287 81 268 299 112 36.0 1e-24 MKKQTKINYILTLILVFCIGASIPVLTGSSQVTEQHSAKSEVPYCVTSPTVPAQVTFDGE TIDLRRYDRRERMDREMMAFTYMHSTTMLLIKRANRYFPIIEPILKANGIPDDFKYLMVI ESNLNNIARSPAGAAGLWQFMPATGREFGLEVNDNVDERYHIEKATVAACKYFKQAYAKY GDWMAVSAAYNAGQGRISSQLDKQLASHAMDLWLVEETSRYMFRILAAKEIFNNPQRYGF LLKREHLYPPIPYKKVTVSTSINDLNDYAKSQGITYAQLRDTNPWLRDTSLRNKTGKTYT LYIPTQEGMYYDPKKTEAYNKQWVID >gi|222159295|gb|ACAB01000064.1| GENE 15 14450 - 15763 1314 437 aa, chain + ## HITS:1 COG:SA0724 KEGG:ns NR:ns ## COG: SA0724 COG1090 # Protein_GI_number: 15926446 # Func_class: R General function prediction only # Function: Predicted nucleoside-diphosphate sugar epimerase # Organism: Staphylococcus aureus N315 # 5 273 6 285 300 173 37.0 5e-43 MNIAMTGATGYIGKHLSNYLTEKGGHRIIPLGRSMFREGMSGHLIQTLTHCDVIINLAGA PINKRWTPEYKQELFNSRIVVTHRIIRALNAVKTKPKLMISASAVGYYPPEVEADEYTRT RGDGFLSDLCYAWEKEAKHCPQPTRLVITRFGVVLSPDGGAMQQMLRPLQATKVATAIGP GTQSFPWIAMHDLCRAMEFFIAHEETRGVYNLVAPQQISQYSFTREMGRAYQAWTTIIAP QRAFRIFYGEAASFLTAGQKVRPTRLTEAGFRFSIPTVERLFKGTDHTTVSSLDLNRYMG LWYEIARYENRFEHGLVDVTATYTLRPDGTIRVENRGCKRNSPYDICKTANGHAKIPDPA QPGKLKVSFFLNFYSDYYILELDEENYNYALVGSSTDKYLWILSRAPQLPEEIKKKLVTA AERRGYDTNRLQWIEQF >gi|222159295|gb|ACAB01000064.1| GENE 16 15942 - 16217 406 91 aa, chain + ## HITS:1 COG:PA1749 KEGG:ns NR:ns ## COG: PA1749 COG2388 # Protein_GI_number: 15596946 # Func_class: R 
General function prediction only # Function: Predicted acetyltransferase # Organism: Pseudomonas aeruginosa # 5 82 74 152 161 60 44.0 8e-10 MDYEIIHQPEQHLFKTEVDGRTAFVQYRLLGDSLDIIHTIVPRPIEGRGIAAALVKAAYD YAIANGMKPKATCSYAVKWLERHPELNGNSD >gi|222159295|gb|ACAB01000064.1| GENE 17 16303 - 16569 402 88 aa, chain - ## HITS:1 COG:no KEGG:BT_4495 NR:ns ## KEGG: BT_4495 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 88 1 88 89 144 87.0 8e-34 MEQKICQSCGMPIDDSTFGKEADGSKNQDYCHYCYADGHFTKECTMDEMIELNLNYLDEF NKDSEVKYTVEEARKTMKEFFPQLKRWK >gi|222159295|gb|ACAB01000064.1| GENE 18 16688 - 17197 500 169 aa, chain - ## HITS:1 COG:CAP0051 KEGG:ns NR:ns ## COG: CAP0051 COG0300 # Protein_GI_number: 15004755 # Func_class: R General function prediction only # Function: Short-chain dehydrogenases of various substrate specificities # Organism: Clostridium acetobutylicum # 1 157 76 234 240 74 31.0 7e-14 MDLFLLSSGIGFQNMDLNMEVELNTAHTNVAGFIRMVDTAFTYFKKNGGGHLAVISSIAG TKGLGVAPAYSATKRFQNTYIDALEQLSYLQKLHIRFTDIRPGFVATDLLNDGKHYPLLM DAAEVGRHISWSLKRKQRVAVIDWRYRILVFFWKMTPRWMWKRLPVKTN >gi|222159295|gb|ACAB01000064.1| GENE 19 17315 - 17419 144 34 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|298483425|ref|ZP_07001602.1| ## NR: gi|298483425|ref|ZP_07001602.1| oxidoreductase, short chain dehydrogenase/reductase family [Bacteroides sp. 
D22] # 1 34 1 34 243 69 100.0 6e-11 MKKAIIIGATSGIGQEVAKCLLLEGWKIGVAGRR >gi|222159295|gb|ACAB01000064.1| GENE 20 17440 - 17739 240 99 aa, chain - ## HITS:1 COG:no KEGG:BF2926 NR:ns ## KEGG: BF2926 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 97 1 97 98 135 74.0 7e-31 MKVSRIAHEKKTVELMIRLYCRKKEKNKILCTDCKELLRYAHARLDRCPFGEKKGACKEC TVHCYKPVLRERMRQVMRFSGPRMLFYAPWQTIRHLLNL >gi|222159295|gb|ACAB01000064.1| GENE 21 17855 - 18127 241 90 aa, chain + ## HITS:1 COG:no KEGG:BT_4498 NR:ns ## KEGG: BT_4498 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 90 6 80 81 132 81.0 5e-30 MNRLFETSSEELVDDGNNLYLCGMKKLLCPQCKIAAMYVKNEQGERLLVYVLENGEVVPK YPEDSMEGFDLTEVFCLGCSWHGSPKRLVK >gi|222159295|gb|ACAB01000064.1| GENE 22 18248 - 19621 1170 457 aa, chain + ## HITS:1 COG:SP1939 KEGG:ns NR:ns ## COG: SP1939 COG0534 # Protein_GI_number: 15901763 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Streptococcus pneumoniae TIGR4 # 6 426 8 428 456 249 35.0 8e-66 MATSKEMTAGPALPLILKFTLPLLLGNLLQQTYSLVDAAIVGKFLGINALASVGASTSVV FLILGFCNGCCGGFGIPVAQKFGARDYSTMRSYVAVSLKLAAGMSVVIALLTCILCEDIL RIMRTPENIFEGAYAYLLVTFIGVPCTFFYNLLSSIIRALGDSKTPFWFLLFAAVLNIIL DLFCILVLDWGVAGAAIATVFSQGLSAVLCYIYMYRKFEILQGTPKERRFQSKLAKTLLY IGVPMGLQFSITAIGSIMLQSANNALGTACVAAFTSAMRIKMFFICTFESLGIAMATYSG QNYGAGKPERIWLGIKASALMMIIYAAFTFLLLMVGAKYFALIFVDPSETEILLDTELFL HVSCMFFPMLGLLCILRYTIQGVGFTNLAMFSGVAEMIARILVSLYAVPVFGFIAVCYGD PMAWIAADLFLVPAFIYVYRRLKKQVFTNSQVTQTVA >gi|222159295|gb|ACAB01000064.1| GENE 23 19718 - 20701 643 327 aa, chain - ## HITS:1 COG:no KEGG:BT_4500 NR:ns ## KEGG: BT_4500 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 327 1 325 325 478 78.0 1e-133 MDIKFRITKYLAVSALAMLLLGACSKNNIYINYPDEENNGGSSGSGNDDNNGNPGKENAL ITFSASVEGRNITRAMSPMGKGLQSWLCAYTANTSNNITDAPVAQGNYVTSSPGVLTGNL 
GYKMYLSNAIYSFYAVSCNSTSPAPTFTNGLSEPLSNGVDYLWWHAVHQDIASSQINIPI TYQHAATQVVVAIAGGENITLNKILSATITPPKPGAIMDLSTGIITSEVSYDKAADMGIN DFTVQYIMLPVKSSSPMTLTLELMVNGESFSRTYTAPLTPPNNLLSAGNSYLFRAVINEN SISFANVSVKEWTEVDESGNPLYPVQD >gi|222159295|gb|ACAB01000064.1| GENE 24 20762 - 22609 1216 615 aa, chain - ## HITS:1 COG:all4080 KEGG:ns NR:ns ## COG: all4080 COG0564 # Protein_GI_number: 17231572 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S RNA-specific # Organism: Nostoc sp. PCC 7120 # 54 608 69 549 549 254 32.0 5e-67 MIHFFKKPVSHLALPEKFTYPFHYTPHPLCVLAAEEVKEYIASRKEWQEELASGKMFGVL IVQTDNGITNNEENQIGYLAAFSGNLGGKNLHPYFVPPVYDLLQPEGFFKIEEEQISAIN IRIRELENSSSYLDSKEKWKIETEQAKAVLNQAKAELKMAKEAREIRRQSSPELSEEEQA SLIRESQYQKAEYKRLEKEWKKRLEELETEVRHFDIEIERLKTERKERSAALQRKLFEQF RMLNAQGEVKDLYTIFEQTVQKVPPAGAGECALPKLLQYAYLHQLKPLAMAEFWWGDSPK NEIRHHGYYYPSCKGKCEPILQHMLQGLEVDENPLLNPVHEEEELEIVFEDEWLLVVNKP AGMLSVPGKAEDRDSVYHRLKKKYPEATGPMIVHRLDMATSGLLLVAKTKEVHQDLQAQF ANRSIKKRYVAVLDGAIIKTEKETKPIAEKAILLAKETVSTKKTAKAERTGNTGRIELPL CLNPLDRPRQMVSREHGKEAITEYQIISESERITSESENTFNESNRIDESERSINESRKY TRIIFYPLTGRTHQLRVHAAHPEGLGCPILGDELYGKKADRLYLHAEYIEFRHPIYGDIL CIQKEADFHKNMIKP >gi|222159295|gb|ACAB01000064.1| GENE 25 22780 - 23256 481 158 aa, chain + ## HITS:1 COG:no KEGG:BT_4502 NR:ns ## KEGG: BT_4502 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 155 15 169 176 271 81.0 5e-72 MAFWALDIYFPQWECLNYGAPGEGLAYVESFAVDTSGCQVVVQFGTNDIYQLNDENIDEY VERYVKAVLAVPSLKTYLFCIFPRNDYDDYSTAVNKFIQILNRKIHEKLQGTDIVYLDVF NRLLQDGRLNPELTLDDLHLNGKGYSILTEALKQASGL >gi|222159295|gb|ACAB01000064.1| GENE 26 23377 - 23841 641 154 aa, chain + ## HITS:1 COG:CC0942 KEGG:ns NR:ns ## COG: CC0942 COG2030 # Protein_GI_number: 16125194 # Func_class: I Lipid transport and metabolism # Function: Acyl dehydratase # Organism: Caulobacter vibrioides # 8 148 5 145 148 119 44.0 2e-27 MEKVIINSYEEFEKLVGQQIGVSDYVELSQERINLFADATLDHQWIHVDTERAKVDSPYH 
STIAHGYLTLSMLPYLWNQIIQVNNLKMMINYGMDKMKFGQAVLSGQSIRLVTTLHSLTN LRGVAKAEIKFAIEIKDQPKKALEGIAVFLYYFN >gi|222159295|gb|ACAB01000064.1| GENE 27 23957 - 24937 746 326 aa, chain - ## HITS:1 COG:mlr8141 KEGG:ns NR:ns ## COG: mlr8141 COG3049 # Protein_GI_number: 13476735 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Penicillin V acylase and related amidases # Organism: Mesorhizobium loti # 2 324 25 350 350 402 60.0 1e-112 MCTRVVYSGKNGMVATGRSMDWKTEMHSNLWVFPKGIERNGETGANSLKWTSKYGSVVTS AFEIASTDGMNEKGLVANLLWLPETEYPVRDQSKPGLAITAWVQYMLDNFATVDEAVAFI DENTFQVVSDLMPDGSRLATLHLSISDATGDCAIFEYTGGKLTVYHSKEYKVMTNSPTYN KQLALNEYWKSIGGLSFLPGTNRPSDRFARASFYINALPQTDDVRIAIASVFSVIRNTSV PYGISTPEFPEISTTQWRTVSDSKNLLYFFESSLTPNTFWVNLRETDLSEGAPVLKLSIA NDETYHGNATKEFKPAQPFRFMGVKG >gi|222159295|gb|ACAB01000064.1| GENE 28 25138 - 25770 377 210 aa, chain + ## HITS:1 COG:PAB0863 KEGG:ns NR:ns ## COG: PAB0863 COG2095 # Protein_GI_number: 14521504 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Multiple antibiotic transporter # Organism: Pyrococcus abyssi # 1 205 13 199 201 66 23.0 4e-11 MALFPVINPLGNGFVVNGFFTDLDPQQRKAAIQKLTLNFIMIGVGTLVIGHLFLLIFGLA IPVIQLGGGILICKTAMELLGDSGSSDKEEASKNVDGFRWKNIEQKIFYPITFPISIGPG SISVIFTLMASASVKGKLLQTGINYLVIALVIICMAAIFYVFLSQGQRFIQRLGPVGNQI INKLVAFFTFCIGIQISVTGISQIFHLNIL >gi|222159295|gb|ACAB01000064.1| GENE 29 25799 - 26326 413 175 aa, chain - ## HITS:1 COG:no KEGG:BT_4505 NR:ns ## KEGG: BT_4505 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 166 1 166 180 244 80.0 9e-64 MNLRARLSERVHIEDIREVLHFIQDDERLREEIYQLIFDEDAIVSYQALWVCTHFSKADV AWLSRKQEELIDAAMTCPHSGKRRMILNLICQQPAADPPRVDFLDFCMERMISREEPPGV QSLCMKLAYQLTRSIPELQQELRTILEIMEPDLLVPAIRSVRKNTLKAMKTKKNR >gi|222159295|gb|ACAB01000064.1| GENE 30 26348 - 27760 924 470 aa, chain - ## HITS:1 COG:lin0429 KEGG:ns NR:ns ## COG: lin0429 COG0346 # Protein_GI_number: 16799506 # Func_class: E Amino acid transport and metabolism # 
Function: Lactoylglutathione lyase and related lyases # Organism: Listeria innocua # 1 124 1 123 126 140 53.0 4e-33 MKLHHIAIWTFRLEELKEFYVRFFGGKSNEKYINPKKGFESYFISFGEGTDLELMSRTDV QNTPIEENRVGLTHFAFTFPSQEEVLRFTEQMRSEGYTIAGEPRTSGDGYFESVVLDPDG NRIECVYRKTANESKNKARQETDIENIPPVTLHTERLFLRPFEERDAEAFFACCQNPNLG NNAGWPPHRTLDESRRILHSTFINQEGIWAVILKDTKQLIGSVGIIPDPKRENPQVRMLG YWLDESHWGKGYMTEAVQGVLNYGFEELRLSLITATCYPHNKRSQKVLKKNGFIYEGTLH QAELTYNGNIYDHQCYYLPGISQPTPEDYDEILHVWEMSVRHTHNFLTEEHIQFYKPLVR KHYLPAVELFVIRNANGKMAAFMGLSDELIEMLFVHPDEQGKGYGKRLMEYARDKKHMDK VDVNEQNEKALQFYLHLGFQIIGRDETDSMGKPFPILHLQLPEADSANRD >gi|222159295|gb|ACAB01000064.1| GENE 31 27832 - 28713 642 293 aa, chain - ## HITS:1 COG:SMa1953 KEGG:ns NR:ns ## COG: SMa1953 COG2367 # Protein_GI_number: 16263522 # Func_class: V Defense mechanisms # Function: Beta-lactamase class A # Organism: Sinorhizobium meliloti # 3 268 10 311 334 75 26.0 1e-13 MRSFIVFLCLVPTLLFARQTQLETQLKEAIKGKKAEIGIAVIIDGKDTVTVNNDIHYPLM SVFKFHQALALADYMGKQRQSLETRLPIKKSDLKPDTYSPLRDKYPQGGIEMSIADLLKY TLQQSDNNACDILFDYQGGPNAVNKYIHSLGIRECAIAGTETAMHEDLNLCYENWTTPLA AAELVEIFRKKPLFPKVYKDFIFQTMIECQTGQDRLVAPLLDKKVTVGHKTGTGDLNAKG QQIGCNDIGFVLLPGGRTYSIAVFVKDSEENNQANSKIIADISRIVYEYVVQH >gi|222159295|gb|ACAB01000064.1| GENE 32 28999 - 30195 1034 398 aa, chain + ## HITS:1 COG:no KEGG:BDI_0255 NR:ns ## KEGG: BDI_0255 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 398 1 398 398 474 65.0 1e-132 MSTFRSIEELVKSLDREKELLKEMFAKRKSLSFRYDYALEMTEYKEERIRYLIDYGVIRD TGDFLEMEDIYLKFFEDVLEVNEEINVSFVQDYLTRLNENIDYYLKENNEQRKYNYQREV KRCLKNIALTTVRNVMDLKRNMDNTYKNEPNYKIKKTKLVRLDEKRNNIALLIRKSEELI DYGQPIFFRVAMDVQMRNVVSDVKLQLNDSYHNLIEIQKQIIHYLNLIDYQNRIFEKVKK LKYLKDQFLLEEHTSIRSVAAQKNPVWMEPQVGYRIKLSIDNLRTSDEAFQILKKLVARQ RNSPKGMKQLADAIPEGYLDGQSQMIDTVNLQEVHNSFMASSTHLFSFVMNYRYRKEVTR GEKLIFFCQLASQYADELRFTDTYEISDEVEYPLIYAK >gi|222159295|gb|ACAB01000064.1| GENE 33 30225 - 30776 613 183 aa, chain + ## HITS:1 COG:no KEGG:BDI_0254 NR:ns ## KEGG: BDI_0254 # Name: 
not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 183 1 183 183 230 67.0 2e-59 MKYTEEIFNILSKGGFISSNSVSTQVKRLYDAIEEDLPDYYDYYKGIGFYLEGGDGYYYF TRKESKVDLERKLEAIQKWIDYLSFLKTYHSAFGPGFLFRAADIEIQIGCDIELKEKATK LFSDKKKYDEVVGKLLKELESIGLIEKENELDGTYKVLSAFHYMEDLVDCITISEEVQDE IPE >gi|222159295|gb|ACAB01000064.1| GENE 34 30760 - 34428 2807 1222 aa, chain + ## HITS:1 COG:no KEGG:BDI_0253 NR:ns ## KEGG: BDI_0253 # Name: not_defined # Def: putative DNA repair ATPase # Organism: P.distasonis # Pathway: not_defined # 1 1218 1 1220 1221 1208 55.0 0 MRYLNKIIFLNSAHIPYAEVKLDGNVHFIGTQGVGKSTLLRAILFFYNADKLRLGIPKEK KSFDAFYFPYANSYIIYEVMRENGAYCVVAAKSQGRVFFRFIDAPFQQDWFIDEHNVVHS EWGRIREHIGSKIQITAQVTSYEMYRDIIFGNNRKHEMIPYRKFAIVESAKYQNIPRTIQ NVFLNSKLDADFIKDTIIRSMSDEDISVDLDFYRSQIKEFEQEYRDVMLWFTKNKNGEVP VRKMAEKVMNAYRDLIYTQKQIGEGRAELNFAEKQALHEIPLVKEEQAKAETERERLLRL MGELQQKYTNERDGLIRDIGIINDLLKKIREKRLHYEQMQIEEIIKRVSCEELLIQELEQ VRNMKSELTRAYEDVLSKYRLLFEKLEADFRTFENSQQARINARNAEIAGKQEELMQQLR VEEEKVRATQEEKVQTVDNRIRQLRDEQAQCNLKLQKVKYEHPHQKEMSDCEDGINELRK KDKELEHSIRQQQGEIKQLRQECEWKVKELKWEFQAKMETVRKERSAIEEQLQTLDALIE KRKGSLCEWLEKNKPDWQETIGKVADEELVLYNNELQPQLVNKEATLFGVSLNLTAIERS VRTPEEMKQERDRQQAARQLCTDRLTRLTEEEGEAVSSLEKKYSKQIHSILEEQHLMEAE RMQIPAKLKNLQADYASWKTKEEEWKRACVEELQAQLNDIGHRLYVAEGEKEKHLAEREK LLKACRKVYNDSRTELRKELEEFVAGIQQEIERMKQQTAERKKELKQAQENELNGKGADT VTIRKYDDRLAEINKELDYIGKSRPQVLYYERDKEELFDKEPATRSRKKEQDAKLVALDE RFALKKEKLQVQKKGADEHLDRINKELHLLEDGLNKVDMFRKDETFCPPSVTEIGEKPTR KNCDVIVEELKSLIVSTIRKTEEFKKAVTQFNGNFSSKNTFHFRTELVGEQDYYDFASNL CEFVDNDKISDYQKHISERYTDIIRRISQEVGGLTRNESEIHRTIKDINDDFVKRNFAGV IKEIALRPLQSSDKLMQLLLEIKRFSDENQYNMGKVDLFSQDSREDVNAVAVRHLLSFMK FLLDDPGRRRLALADTFKLEFRVKENDNDTGWVEKIANVGSDGTDILVKAMVNIMLINVF KEKASRKFGDFKIHCMMDEIGKLHPNNVKGILDFANCRNILLVNSSPTTYNVEDYRYTYL LSKDGRSNTQVVPLLTYNKIEK >gi|222159295|gb|ACAB01000064.1| GENE 35 34425 - 35285 657 286 aa, chain + ## HITS:1 COG:no KEGG:BDI_0252 NR:ns ## KEGG: BDI_0252 # Name: not_defined # Def: hypothetical protein # 
Organism: P.distasonis # Pathway: not_defined # 8 285 14 291 293 288 49.0 2e-76 MSGSLFTLAMARLVLKMLGGELIPYTKFDGLIANRLMDEGIITIVPRGSKRSFRMIDPEG CRIYISQNYTSGMELEDWIEMKNCPDEVSRSEQVAKAGDSKLRYTRTFKGFLLNCYTPIE ATFHGEPCVLSPLQGTSIFMQDYEYFRIPEDVVVVGIENGENFQHIRAQKYLFEGMKVLF VSRYPQSKDLCNWLKIIPNRYIHFGDIDLAGISIFLNEFYVKLGNRAEFFIPADVKKRLK DGNRQLYDNQYLRYRAMLVSDERLRPLVAMIHKYGKAYEQEGYIKY >gi|222159295|gb|ACAB01000064.1| GENE 36 35295 - 37049 1198 584 aa, chain + ## HITS:1 COG:no KEGG:BVU_2278 NR:ns ## KEGG: BVU_2278 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 584 3 594 594 822 72.0 0 MITLDNFENFVPYKILMRGEEYYDTDAVSELEETSPGEWTATVEGTDDYNVEISMNGKEV ESWYCDCPYDGEICKHVVAALLAIRDNNKRVSRSAFSKMRIVTKEEEVVQPDVDIKQLLS FISPQEISTFISEYASTNPEFKAALLKRFIFKESSSTSKGKDYRTEIQKIFNSFGDSKKS RYHNRYNDYSRDWETVFNRMDVFLKKADFFLSLGDMDSTIAIALQTLRSIGENYEDELLY IDDDDDFGTSLYCEHAGGLLMKVVGHPKTTQKQKTDILQELRQIAEISTYRNYGIYDIDE LMMQINLSIQPTEKALELIDGLLETRKDTHDLYQLVLRKVNLLLEQNEEQKANETIRQYL YLTEIREMEVEKLIVRCQYDEAIRLLDEGIEIAKEDIYSGTDSKWLEIKLKICETTNRTS EVVDTCRLLFVTGRDKLTYYNKLKTLIPKEQWKNFLDAMMKETEFSNYFSFGGSAEADIY VKEQDNERLFTLLSSTRYDQLEALMRYAHYLKDTHSEQLIAMYTSSLNDYAERKMGRRNY EFIAQVLPCIHKLKGGQTAVKNIVAEFRIKYKRRPAMMEVLKDF >gi|222159295|gb|ACAB01000064.1| GENE 37 37130 - 38023 599 297 aa, chain + ## HITS:1 COG:SMb20835 KEGG:ns NR:ns ## COG: SMb20835 COG1708 # Protein_GI_number: 16264326 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferases # Organism: Sinorhizobium meliloti # 3 296 27 327 331 150 31.0 3e-36 MIMKKSIKRLPKRTQEELTVLLDLVCKNIENCQMVILFGSYARGNYVLWDTNIEFGVHTS YQSDYDILLVVTGQTKYVERKLNRITNKYHDLFADRRHAFPQFVVEHINTVNRNLEISQY FFTDIVKEGIMLYNSGKCELAKPRKLSFREIRDIAQSEFDRLYPYACDFLGVVKEYFMPK EQYNLSAFMLHRTCEKLYYTILMVFTNYLPKTHKIKELSGMVKRFSQELTTVFPQNTDEE KECFDLLCRSYIEARYNKDFSISQEQLEYLIARIDILKDITERLCKEKIVEYDTMPE Prediction of potential genes in microbial genomes Time: Wed May 18 02:42:46 2011 Seq name: gi|222159294|gb|ACAB01000065.1| Bacteroides sp. 
D1 cont1.65, whole genome shotgun sequence Length of sequence - 1533 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 539 - 754 232 ## BVU_0157 hypothetical protein - Prom 825 - 884 2.7 2 2 Tu 1 . - CDS 923 - 1393 525 ## BT_0374 hypothetical protein - Prom 1438 - 1497 5.7 Predicted protein(s) >gi|222159294|gb|ACAB01000065.1| GENE 1 539 - 754 232 71 aa, chain - ## HITS:1 COG:no KEGG:BVU_0157 NR:ns ## KEGG: BVU_0157 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 51 214 264 388 86 68.0 2e-16 MWNEYVEICGVSEREIHENLEAELHEFAAARGITYDKLCEDLRECYDGYHFKIGINFSAE TRNIEKWIVES >gi|222159294|gb|ACAB01000065.1| GENE 2 923 - 1393 525 156 aa, chain - ## HITS:1 COG:no KEGG:BT_0374 NR:ns ## KEGG: BT_0374 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 156 1 156 516 260 82.0 1e-68 MSNKIYPIGIQNFEKIRNDGYFYIDKTALMYQMVKTGSYYFLSRPRRFGKSLLISTLEAY FQGKKELFEGLAVEKLEKDWIKHPILHLDLNIEKYDTLESLDKILNDNLEYWESQYGTRP SETSFSLRFAGIIQRACEKTGQRVVILVDEYDKPML Prediction of potential genes in microbial genomes Time: Wed May 18 02:43:00 2011 Seq name: gi|222159293|gb|ACAB01000066.1| Bacteroides sp. D1 cont1.66, whole genome shotgun sequence Length of sequence - 36493 bp Number of predicted genes - 24, with homology - 24 Number of transcription units - 12, operones - 7 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 246 - 274 -0.1 1 1 Tu 1 . - CDS 351 - 788 438 ## BT_4400 hypothetical protein - Prom 833 - 892 6.2 + Prom 805 - 864 4.5 2 2 Op 1 . + CDS 926 - 3022 1436 ## COG5545 Predicted P-loop ATPase and inactivated derivatives + Prom 3090 - 3149 5.4 3 2 Op 2 . + CDS 3268 - 3780 751 ## COG2193 Bacterioferritin (cytochrome b1) + Term 3797 - 3856 10.1 + Prom 3826 - 3885 4.9 4 3 Op 1 . 
+ CDS 3967 - 4920 848 ## COG0685 5,10-methylenetetrahydrofolate reductase 5 3 Op 2 1/0.000 + CDS 4922 - 6046 992 ## COG2812 DNA polymerase III, gamma/tau subunits 6 3 Op 3 . + CDS 6062 - 7561 1346 ## COG1774 Uncharacterized homolog of PSP1 7 3 Op 4 . + CDS 7539 - 8012 271 ## BT_3818 hypothetical protein 8 4 Op 1 19/0.000 - CDS 8015 - 9472 1259 ## COG0772 Bacterial cell division membrane protein 9 4 Op 2 . - CDS 9453 - 11315 1554 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 10 4 Op 3 . - CDS 11318 - 11815 298 ## BT_3815 hypothetical protein 11 4 Op 4 22/0.000 - CDS 11812 - 12657 771 ## COG1792 Cell shape-determining protein 12 4 Op 5 . - CDS 12665 - 13687 1062 ## COG1077 Actin-like ATPase involved in cell morphogenesis - Prom 13750 - 13809 4.1 - Term 13741 - 13791 -0.4 13 5 Tu 1 . - CDS 13814 - 15337 1799 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) - Prom 15379 - 15438 8.7 - Term 15407 - 15461 12.6 14 6 Op 1 . - CDS 15489 - 17525 2411 ## COG3590 Predicted metalloendopeptidase - Term 17625 - 17664 2.0 15 6 Op 2 . - CDS 17735 - 19690 1782 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 19738 - 19797 7.1 - Term 19764 - 19814 10.5 16 7 Tu 1 . - CDS 19833 - 20744 963 ## BT_3808 hypothetical protein - Prom 20792 - 20851 5.8 - Term 21023 - 21072 -0.6 17 8 Op 1 . - CDS 21269 - 23572 1864 ## COG0729 Outer membrane protein 18 8 Op 2 . - CDS 23614 - 28056 3895 ## BT_3806 hypothetical protein - Prom 28131 - 28190 3.6 + Prom 28331 - 28390 8.0 19 9 Tu 1 . + CDS 28507 - 29565 359 ## COG0236 Acyl carrier protein - Term 29622 - 29669 2.3 20 10 Op 1 . - CDS 29871 - 32186 1887 ## COG0642 Signal transduction histidine kinase 21 10 Op 2 . - CDS 32221 - 33579 1176 ## COG0534 Na+-driven multidrug efflux pump - Prom 33602 - 33661 4.0 22 11 Op 1 . - CDS 33665 - 34903 1196 ## COG0612 Predicted Zn-dependent peptidases 23 11 Op 2 . 
- CDS 34909 - 35748 1009 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family - Prom 35976 - 36035 4.5 + Prom 35696 - 35755 5.8 24 12 Tu 1 . + CDS 35836 - 36493 484 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains Predicted protein(s) >gi|222159293|gb|ACAB01000066.1| GENE 1 351 - 788 438 145 aa, chain - ## HITS:1 COG:no KEGG:BT_4400 NR:ns ## KEGG: BT_4400 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 47 144 1 98 100 137 67.0 8e-32 MKEPFDYSIVPYTFGLCAAEECPRATTCLRHIALEYVPAERVFLSIMNPNRLKAMKSACD YYRSNEKVRYVRGFMRTISALTVRVADTFRYRMIEYMGRKNYYLKRRGDMNLSPVEQRRI IAVAKELGVSLDEYFDGYVEDYNWG >gi|222159293|gb|ACAB01000066.1| GENE 2 926 - 3022 1436 698 aa, chain + ## HITS:1 COG:all8519 KEGG:ns NR:ns ## COG: all8519 COG5545 # Protein_GI_number: 17232892 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Nostoc sp. 
PCC 7120 # 460 679 421 633 836 68 26.0 4e-11 MKITQLRDKDKTTALTTMDMETWIGKTRTETKTQPVSAFREVLRYSLPDSRCYEADKLPK ILPAAEFRRTEGGKQMKSYNGIVELTVGPLSGGPEITLVKQLAWEQPQTHCVFTGSSGQT VKIWAKFTRPDNSLPQKREEAEIFHAHAYRLAVKCYQPQIPFSILPKEPSLEQYSRLSYD PELMYRPNSVPFYLSQPSGMPEELTYREAVRSEKSPLTRAVPGYDTERAIFMLFEAALRK THEEIYEAEDEGAPERGEDFQAMVTQLAVNCFHSGIPEEETVKRTIFHYYLRRQEVLIRQ LVKNVYEEQKGFGKKSSLGKEQYLSLQTEEFMNRRYEFRYNTQVGEVEYRERNSFHFYFN PINKRVLNSIALDAQAEGIPLWDRDISRYIYSNRIPVFNPLEDFLYHLPVWDGKDRIRGL AQTVPCENKHWVDLFHRWFLNMVMHWRGTDKKYANNVSPLLVGPQGCRKSTFCRSLIPPA MRAYYTDSIDFSRKTDAELYLNRFALINIDEFDQISATQQGYLKHILQKPIVNMRKPYGN AVLEMRRYASFIATSNQKDLLTDPTGSRRFICIEVTGTIDTNKAIDYEQLYAQAMYELDH GERYWFDQSEEQIMTRSNREFEQVSLEEQLFYRYFRPAKEKEDGEWLSPAEILEDIKKNS AIPLSNKRVSVFGRVLRKHEIPSKRVHRGTVYHVVRVL >gi|222159293|gb|ACAB01000066.1| GENE 3 3268 - 3780 751 170 aa, chain + ## HITS:1 COG:PA4880 KEGG:ns NR:ns ## COG: PA4880 COG2193 # Protein_GI_number: 15600073 # Func_class: P Inorganic ion transport and metabolism # Function: Bacterioferritin (cytochrome b1) # Organism: Pseudomonas aeruginosa # 14 161 32 177 177 78 35.0 5e-15 MARESVKILQGKLDVESLIAQLNAALAEEWLAYYQYWVGALVVEGAMRADVQGEFEEHAE EERRHAQLLADRIIELEGVPVLDPKQWFELARCKYDAPQGFDSVSLLKDNVASERCAILR YQEIADFTNGKDFTTCDIAKHILAEEEEHEQDLQDYLTDIARMKKSFLDK >gi|222159293|gb|ACAB01000066.1| GENE 4 3967 - 4920 848 317 aa, chain + ## HITS:1 COG:aq_1429 KEGG:ns NR:ns ## COG: aq_1429 COG0685 # Protein_GI_number: 15606607 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Aquifex aeolicus # 1 316 1 287 296 169 32.0 5e-42 MKVIDLIHSNKKTAFSFEILPPLKGTGIEKLYQTIDTLREFDPKYINITTHRSEYVYKDL GNGLFQRNRLRRRPGTVAVAAAIQNKYNITVVPHILCSGFTREETEYVLLDLQFLNITEL LVLRGDKAKHESVFTPEGDGYHHAIELQEQINNFNKGIFVDGSEMKVSSTPFSYGVACYP EKHEEAPNIETDLYWLKKKVENGAEYAVTQLFYDNRKYFEFVEQAKAADINIPIIPGIKP FKKLSQLSMIPKTFKVDLPEDLVKEALKCKNDAEAEQVGIEWCVAQCKELMAHGVPSIHF YSIGAVDSIKEVAKIIY >gi|222159293|gb|ACAB01000066.1| GENE 5 4922 - 6046 992 374 aa, chain + ## HITS:1 
COG:DR2410 KEGG:ns NR:ns ## COG: DR2410 COG2812 # Protein_GI_number: 15807400 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Deinococcus radiodurans # 3 214 13 180 615 99 31.0 9e-21 MFFRDVIGQEEIKQRLIQEVNEGRIPHAQLICGPEGVGKMPLAIAYARYISCTNRGETDA CGTCPSCVKFNKLVHPDVHFVFPIVKSAKGKKEVCDDYIADWRPFVINHPYFNLNHWLGE MDAENSQALIYAKESDEILKKLSLKSSEGGFKITIVWLPEKMHPVCANKLLKLLEEPPEK TVFLLVSEAPDMILPTILSRTQRMNVRKIDEASIDRVLQTKYNILPADSISIAHLANGNF IKALETIHLNEENQLFFDLFVSLMRLSYQRKIREMKMWSEQVAGMGRERQKNFLEYCQRM IRENFIFNLHQRNLTYMTINEQNFATRFAPFVNERNVMGIMDELSEAQLHIEQNVNAKMV FFDFSLKMIVLLKQ >gi|222159293|gb|ACAB01000066.1| GENE 6 6062 - 7561 1346 499 aa, chain + ## HITS:1 COG:BS_yaaT KEGG:ns NR:ns ## COG: BS_yaaT COG1774 # Protein_GI_number: 16077100 # Func_class: S Function unknown # Function: Uncharacterized homolog of PSP1 # Organism: Bacillus subtilis # 41 275 3 231 275 164 38.0 3e-40 MEYKLHNGSGGLCCKGCSRQDKKLNTYDWLADIPGNAEESDMVEVQFKNTRKGYFRNSNK IKLEKGDVVAVEAAPGHDIGVVTLTGRLVPLQMKKANFKADTEIKRVYRKAKPVDMEKFN EAKAKEHATMIRARQIALNLNLDMKIGDVEYQGDGNKAIFYYIADERVDFRQLIKVLAEA FRVRIEMKQIGARQEAGRIGGIGPCGRELCCATWMTSFVSVSTSAARFQDISLNPQKLAG QCAKLKCCLNYEVDCYVEAQKRLPSREIELETKDGTFYFFKADILSNQVSYSTDKNFPAN LVTISGKRAFEVISMNKKGMKPDSLLEEEKKPEPRKPVDLLEQESVTRFDRSRNNKEGGN NANRNNKKKKKGNNNNGNRPQQQAEGGNRPQQPQRENENRPQQSENGNRGERDNRPRNNN NNNRNRGQNQGRNNENRRPERGSNQERPQGQERPQQQDRQREQQGQERQERRPNHERPSR PERNQNQEKQSTNEKPTQE >gi|222159293|gb|ACAB01000066.1| GENE 7 7539 - 8012 271 157 aa, chain + ## HITS:1 COG:no KEGG:BT_3818 NR:ns ## KEGG: BT_3818 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 157 1 157 159 264 77.0 9e-70 MKSLLKNSILSLFSACLLTACNEHTVYHSYQSLPNKGWGKSDTLSFQIPITDSVPTTLRL FAEVRNSIEYPYHDLHLFISQNLQDSTVWRTDTIAFCLADSTGRWTGHGWGSIYQSETFI TSVRPLHPTNYTIKIMSGMKDEKLQGLSDVGIRIEKQ >gi|222159293|gb|ACAB01000066.1| GENE 8 8015 - 9472 1259 485 aa, chain - ## HITS:1 COG:TP0501 KEGG:ns NR:ns ## COG: TP0501 COG0772 # 
Protein_GI_number: 15639492 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Treponema pallidum # 52 479 46 430 433 168 32.0 2e-41 MVTRNDSLWKTLDWVTIFIYLLLIVGGWFSVCGASYDYGERDFLDFSTRAGKQFVWIICS FGLGFVLLMLEDRMYDMFAYIIYVGMILLLIVTIFIAPDTKGSRSWLVMGPVSLQPAEFA KFATALALAKYMNSYSFSIKKEKCAFILGFIILLPMLLIIGQRETGSALVYLAFFLVLYR EGMPGVVLFAGVCAVIYFVVGIRFDEVFIADTPTPLGEFIVLLLILLFAGGMVWVYRKKW SATRNIIGGSLAILLIAYLISEYWVHFSLVWVQWALCIVVIGYLIYLALSERQRTYFLIA LFTIGSVGFLYSSNYVFDNVLEPHQQVRIKVVLGLEEDLTGAGYNVNQSKIAIGSGGLTG KGFLNGTQTKLKYVPEQDTDFIFCTVGEEQGFVGSAAVLLAFLILILRLIFLSERQTSNF GRVYGYSVVSIFLFHLFINIGMVLGLTPVIGIPLPFFSYGGSSLWGFTILLFIFLRIDAG RGRRL >gi|222159293|gb|ACAB01000066.1| GENE 9 9453 - 11315 1554 620 aa, chain - ## HITS:1 COG:RSc0062 KEGG:ns NR:ns ## COG: RSc0062 COG0768 # Protein_GI_number: 17544781 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Ralstonia solanacearum # 24 600 31 633 801 281 33.0 3e-75 MAKDYILEKRKFVIGGIAISIVLIYLIRLFVLQITTDDYKKNADSNAFLNKIQYPSRGAI YDRTGKLLVFNQPAYDITIVPKEIENLDTLDLCQSLNITRAQFLKIMSDMKDRRRNPGYS RYTNQLFMSQLSAEECGVFQEKLFKFRGFYIQRRTIRQYSYNAAAHALGDIGEVSAKEME ADEEGYYIRGDYVGKLGVEKSYEKYLRGEKGIEILLRDAHGRIQGHYMDGEYDRPSVPGK NLTLSLDIDLQMLGERLLKNKIGSIVAIEPETGEILCLVSSPNYDPHLMIGRQRGKNHLA LQRDMTKPLLNRALMGVYPPGSTFKTAQGLTFLQEGIITEQSPAFPCSHGFHYGRLTVGC HAHGSPLPLIPAIATSCNSYFCWGLFRMFGDRKYGSPQNAITVWKDHMVSQGFGYKLGVD LPGEKRGLIPNAQFYDKAYRGHWNGLTVISISIGQGEILSTPLQIANLGATIANRGYFVT PHIVKEIQDNQLDSIYRVPRYTTIEKRHYESVVEGMRGAATGGTCRMLSVMVPDLEACGK TGTAQNRGHDHSVFMGFAPMNKPKIAIAVYVENGGWGATYGVPFGALMMEQYLKGKLSPE NELRAEEFSNRVILYGNEER >gi|222159293|gb|ACAB01000066.1| GENE 10 11318 - 11815 298 165 aa, chain - ## HITS:1 COG:no KEGG:BT_3815 NR:ns ## KEGG: BT_3815 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 165 1 165 165 249 90.0 2e-65 
MIITYIHRIGWFIGLVLLQVLILNSVHIAGYATPFLYIYFILKFSSGTSRNELMLWAFFF GLTIDIFSDTPGMNAAATVLLAFLRPSLLRLFTPRDNPDSFIPSFKTMGISPFLKYTTAS VFVHSLALLSIEFFSFTSIWLLLLRVLLCTILTVTCIIAIEGIKK >gi|222159293|gb|ACAB01000066.1| GENE 11 11812 - 12657 771 281 aa, chain - ## HITS:1 COG:lin1582 KEGG:ns NR:ns ## COG: lin1582 COG1792 # Protein_GI_number: 16800650 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell shape-determining protein # Organism: Listeria innocua # 46 268 62 278 295 89 30.0 5e-18 MRNLLNFLLKYNYWFLFILLEVASFVLLFRFNRYQQSAFFTSANTVVGAVYEVSGGISSY FHLKSVNEDLLDRNMVLEQQITNLEKALREQQLDSMAINSIRQVPQADYQLFKAHVIKNS LNLVDNYITLDKGSSSGIRSEMGVVDGNGIVGIVYETSPSYSVVISVLNSKSNISCKIIG SDYFGYLKWEHGDSRYAYLKDLPRHAEFNLGDTVVTSGFSTVFPEGIMVGTVDDMSDSND GLSYLLKIKLATDFGKLSDVRVVARTGQEEQKKLENKVMKE >gi|222159293|gb|ACAB01000066.1| GENE 12 12665 - 13687 1062 340 aa, chain - ## HITS:1 COG:CAC1242 KEGG:ns NR:ns ## COG: CAC1242 COG1077 # Protein_GI_number: 15894525 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Clostridium acetobutylicum # 1 335 1 332 335 284 45.0 2e-76 MGLFSFTQEIAMDLGTANTIIITNGKIVVDEPSVVALDRRTEKMIAVGEKAKLMHEKTHE NIRTIRPLRDGVIADFYACEQMMRGLIKQVNTRNHLFSPSLRMVIGVPSGSTEVELRAVR DSAEHAGGRDVYLIFEPMAAAIGIGIDVEAPEGNMIVDIGGGSTEIAVISLGGIVSNNSI RTAGDDLTEDIREYMSRQHNVKVSERMAERIKINVGAALTELGDDAPEDYIVHGPNRITA LPMEVPVCYQEVAHCLEKSISKIETAILSALENTPPELYADIVHNGIYLSGGGALLRGLD KRLTDKINIPFHIAEDPLHAVAKGTGVALKNVDRFSFLMR >gi|222159293|gb|ACAB01000066.1| GENE 13 13814 - 15337 1799 507 aa, chain - ## HITS:1 COG:aq_1963 KEGG:ns NR:ns ## COG: aq_1963 COG0138 # Protein_GI_number: 15606962 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Aquifex aeolicus # 10 507 3 506 506 436 47.0 1e-122 MSESKRIKTALVSVYHKEGLDEIITKLHEEGVEFLSTGGTRQFIESLGYPCKAVEDLTTY 
PSILGGRVKTLHPKIFGGILCRRGLEQDMQQIEKYEIPEIDLVIVDLYPFEATVASGASE ADIIEKIDIGGISLIRAAAKNYNDVIIVASQAQYKPLLDMLMEHGATSSLEERRWMAKEA FAVSSHYDSAIFNYFDAGEGSAFRCSVNSQKQLRYGENPHQKGYFYGNLEAMFDQIHGKE ISYNNLLDINAAVDLIDEFDDLTFAILKHNNACGLASRATVLEAWKDALAGDPVSAFGGV LITNGVIDKEAAEEINKIFFEVIIAPDYDVDALEILGQKKNRIILVRKEAKLPKKQFRAL LNGVLVQDKDTNIETVADLKTVTDKAPTPEEVEDMLFANKIVKNSKSNAIVLAKDKQLLA SGVGQTSRVDALKQAIEKAKSFGFDLNGAVMASDAFFPFPDCVEIADKEGITAVIQPGGS VKDDLSFAYCNEHGMAMVTTGIRHFKH >gi|222159293|gb|ACAB01000066.1| GENE 14 15489 - 17525 2411 678 aa, chain - ## HITS:1 COG:MA2001 KEGG:ns NR:ns ## COG: MA2001 COG3590 # Protein_GI_number: 20090849 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted metalloendopeptidase # Organism: Methanosarcina acetivorans str.C2A # 32 678 16 665 665 550 41.0 1e-156 MKVTKYLPILAVCLMTTGCNSKKEAVLTSGIDLANLDTTAMPGTSFYQYACGGWVKDHPL TDEYSRFGTFDMLRENSREQLKALIAELAAKKDNAPGSAAQKVGDLYNIAMDSVKLNQEG VAPIKAELAAIDALKDKGEIYAYIAESQKKGIRPYFTMFVSADDMNSSMNIVQTYQGGIG MGQRDYYLENDEQTKNIRNKYQEHIAKMFQLAGYDEATAQKAVKAVMNIETRLAKAARSQ VELRDPHANYNKMDRATLKKNFPTFDWDTYFTVSGLKDLEEVNVGQPAAMKEVADVINTV SLDDQKLYLQWGLIDAAASYLSDDFEAQNFDFYSRTMSGKKEMQPRWKRSVSTVDGVLGE VVGQMYVEKYFPAAAKERMVTLVKNLQTSLGERIKGLEWMSEPTKEKALEKLATFHVKIG YPDKWKDYSALEIKDDSYWANIERANEWDYNEMIAKAGKPVDKDEWLMTPQTVNAYYNPT TNEICFPAAILQPPFFDMNADDAMNYGAIGVVIGHEMTHGFDDQGRQYDKDGNLKDWWTE EDAKKFEERAQVMVNFFDSIEVAPGVHANGSLTLGENIADHGGLQVSFQAFKNATEAAPL EIVDGFTPEQRFFLAYANVWAGNIRPEEILRLTKLDPHSLGKWRVDGALPHIQNWYEAFK ITEQDSMFVPKEKRVSIW >gi|222159293|gb|ACAB01000066.1| GENE 15 17735 - 19690 1782 651 aa, chain - ## HITS:1 COG:all4183 KEGG:ns NR:ns ## COG: all4183 COG0488 # Protein_GI_number: 17231675 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Nostoc sp. 
PCC 7120 # 1 532 1 531 564 400 42.0 1e-111 MISVEGLSVEFNATPLFEDVSYVINKKDRIALVGKNGAGKSTMLKILAGLQSPTRGVVAT PKDVTIGYLPQVMILSDNRTVMGEAELAFEHIFELQAKLERMNQELAERTDYDSEEYHQL IDRFTHENDRFLMMGGTNFQAEIERTLLGLGFSREDFERPTSEFSGGWRMRIELAKLLLR RPDVLLLDEPTNHLDIESIQWLENFLSTRANAVVLVSHDRAFLNNVTTRTIEITCGQIYD YKVKYDEFVVLRKERREQQLRAYENQQKQIQDTEDFIERFRYKATKAVQVQSRIKQLEKI DRIEVDEEDNSALRLKFPPASRSGNYPVICEDVRKAYGSHVVFHDVNLTINRGEKVAFVG KNGEGKSTLVKCIMDEIDFEGKLTIGHNVQIGYFAQNQAQMLDENLTVFDTIDRVATGDI RLKIRDILGAFMFGGEASDKKVKVLSGGERTRLAMIKLLLEPVNLLILDEPTNHLDMRSK DVLKEAIREFDGTVILVSHDRDFLDGLATKVYEFGGGLVKEHLGGIYEFLQKKKIDSLNE LQKGAGLSASPTASAKGNEPETVQPSENKLSYEAQKELNKKIKKLERQVADCEASIEETE SAIAIVEAKMATPEGASDMQLYERHQKLKQQLDGIVEEWERVSMELEETKN >gi|222159293|gb|ACAB01000066.1| GENE 16 19833 - 20744 963 303 aa, chain - ## HITS:1 COG:no KEGG:BT_3808 NR:ns ## KEGG: BT_3808 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 303 1 303 303 608 97.0 1e-173 MKPTLFLLAAGMGSRYGGLKQLDGLGPNGETIMDYSIYDAINAGFGKLVFVIRKDFEQDF RDKIISKYEGHIPCELVFQSIDDLPEGFTCPADRTKPWGTNHAVMMGADVIKEPFAVINC DDFYGRDSFQVMGKFLSALPENSKNVYSMVGFRVGNTLSESGTVSRGICSTDAKGLLTSV VERTKIQRLDGEVKYIGDDGEWTATPDTTPVSMNFWGFTPDYFAYSQEFFKTFLSDPKNM ENLKSEFFIPLMVDKLINDGTATVEVLDTTSKWFGVTYPEDRQSVVDKIQALVDAGEYPA KLF >gi|222159293|gb|ACAB01000066.1| GENE 17 21269 - 23572 1864 767 aa, chain - ## HITS:1 COG:mll1662 KEGG:ns NR:ns ## COG: mll1662 COG0729 # Protein_GI_number: 13471632 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein # Organism: Mesorhizobium loti # 578 751 459 614 617 61 28.0 8e-09 MGIILFSGCSVTKHLPEGEVLYTGGKTVIQNKSTTPVGGTALTEIEAALDKTPSTKMLGG LLPIPFKMWMYNDFVKYKKGFGKWMFNRLAANPPVFISTVNPEVRIKVATNLLRDYGYFN GKVTYETLVDKKDSLKASILYTVDMKNPYFIDTVYYQRFTPQTLRIMERGRRMSYISPGE QFNVVDLDEERTRISTLLRNRGYFYFRPDYMTYQADTTLVPGGHISLRLIPVPGLPAAAQ RPYYVGDASVYLFGKNGEAPNDSMMYKNLNIHYYKKLQVRPNMLYRWLNYQQFVRNAQMR ASNRTRLYSQYRQEQVQEKLSQLGIFSYLDLQYAPKDTTAVCDTLNVTMQATFAKPLDAE 
LELNVVTKSNDQTGPGASFGVTRNNVFGGGESWNVKLKGSYEWQTGGGEKSSLMNSWEMG VSTSLTFPHVVFPHWGKREFDFPATTTFRLYIDQLNRARYYKLLSFGGNATYDFQPTRTS RHSITPFKLTFNVLQHQSEEFREIADANPALYISLKDQFIPAMEYTYTYDNASARGIKNP IWWQSTVTSAGNLTSVIYRAFGQSFSKEDKRLLNVPFAQFVKLNTEFRHLWNMDKNNKIA SRVALGALFAYGNATIAPYSEQFYVGGANSIRAFTVRSIGPGGYHPAESRYSYLDQTGTF RFEANVEYRFRIFKSIWGATFLDAGNVWLMRKDEARPNSQLELKTFPKQIALGTGVGIRY DMDILVFRLDFGIPLHLPYDTERSGYYNVTGSFMKNLGIHFAIGYPF >gi|222159293|gb|ACAB01000066.1| GENE 18 23614 - 28056 3895 1480 aa, chain - ## HITS:1 COG:no KEGG:BT_3806 NR:ns ## KEGG: BT_3806 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1480 1 1479 1479 2521 87.0 0 MVLLYVPPVQNLLRREVTAYASKATGMQIQVERIDLRFPLNLLVRGVEVIQQPDTLLSLE SLNVRVQAWPLIKGKVEVDEVTLSRVAVNSANLMEGMKIKGVLGRFFLQSHGVDLSNEIA IINQAELSDTHVQLLMNDTTTTPKDTTASAPVNWKVDLHQLKLKNVSFSMQLPVDSMRMA AHIGEAAIDDAQADLKNQFYGLKKFLLSGTSASYDTGTAQPIEGFDASHIAVRDVRIALD SLLYKGRDMNAVIREFTMNERSGLSVVSLTGRAYSNDSIISVPGLKLKTPHSEIDLSAHT YWELVNIPTTGRLSANLNAYIGKEDVMLFAGGLPDSFKEAYPFRPLVIRAGTDGNLKQMQ ISRFTVDLPGAFALEGGGMIENLADSLTRTGTIGLKMTTQNLNFLTALSGEAPNGTIVVP DSMNLVAKVDIKGPEYKAGLRLKEGKGAMDVNAALNTSTEVYKADLKIDNLQLHNFLPKD SIYELSLSAAANGRGLDVMSYHSFAKLNLSLDQLHYAKYHLSNLNLTGDLKGALVTAHLT SDNALLKMTTDAEYNLAHSYPDGKITVNVTQLDLHELGIMPQPMKRPLTFNLAAEARRDL VSAHFVSGDMKLDLSARSGVNPLIRQSTHFVDVLMKQIDEKALNHAELRKALPTAVFSFS AGQENPLAYFLATKKIAYHDASVKFGAAPNWGINGKAAIHALKVDTLQLDTIFFTVKQDT TLMKLRAGVINGPKNPQFSFSTTLTGEIRDRDAELLVDFKNGKGETGVLLGVNARPLFEG QGKGDGLAFTLIPEEPIIAFQKFHFNENHNWIYVHKNMRVYANVDMWDDEGMGFRVHSVR GDTVSLQNIDVEIRRISLAELSKVLPYFPEITGLFSAEAHYVQTEKDLQLSVESSIDELT YERQRIGDVTLGATWLPGEQGKQYLNAYLNHDNVEVMVADGKLVPTRTGKDSLEVNATLE HFPLRVANVFIPDQMVTLAGDMDGNLNITGSTEQPLINGELILDSVTVLSRQYGANFRFD NRPVQIKNNRLEFDKFAIYTTGKNPFTIDGSVDFRDMSRPMANLNLLAQNYTLLDAKRTR ESLVYGKVYADFRATVKGPLDGLNMRGNMSLLGNTDVSYILTDSPLTVQDRLGSLVTFTS FSDTTTVVQQEVPTVSLGGLDMVMMVHIDPSVRLKVDLDASNDNRVELEGGGDLSMKYTP QGDLTLTGRYTLSGGLMKYALPVIAAKEFAIDNGSYVEWTGNPMDPMLNFKATDRIRASV 
SEGENGGTRMVNFDVSIVVKNRLDNLSFAFDVSAPEDATIQNELTAMGAEERGKQALYIM VMKTYLGTGPIGGGGGGLGKLNMGSALNSVLSSQINSLMGNLKNASVSVGIEDHDLSDTG GKRTDYSFRYSQRLFNNRFQIVIGGKVSTGENATNDAESFIDNISLEYRLDRTGTRYVRL FYDKNYESVLEGEITETGVGLVLRKKLDKLSELFIFKKKK >gi|222159293|gb|ACAB01000066.1| GENE 19 28507 - 29565 359 352 aa, chain + ## HITS:1 COG:aq_1717a KEGG:ns NR:ns ## COG: aq_1717a COG0236 # Protein_GI_number: 15606797 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Aquifex aeolicus # 281 350 5 74 78 63 44.0 8e-10 MKKRIRTLTMGSILVLHLSSCQESLDYNFYDLKKIEPPVIKTLSSNENDIDEKLNTIIST ILDIIPSMYNDNTRFIEDIGADELKMQAFFNKVEETFNIKLFDSEKDNCKTVGNLKALLK KLCYPKTQQYYSNYGPVIFQTENDIIPNFDLSFNLNCTTKCMNQRSNIISEVTNIDATVT PFNFSSDQDWIITITYKSEKGNYIAKTLNTEITWKGTVTSTIAYTNGNKIIEKWEVIIHA IVSNERGLVLTTSNIEKKGKEIKPNKESDNNKDDLSHYTFEFLKKIISEQCNIDPAFIHK ESQLIHELNLDSLDFIELIMRIEEEYGIDISAENAERLNTAGDLYQYIIEHA >gi|222159293|gb|ACAB01000066.1| GENE 20 29871 - 32186 1887 771 aa, chain - ## HITS:1 COG:rcsC_1 KEGG:ns NR:ns ## COG: rcsC_1 COG0642 # Protein_GI_number: 16130155 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 243 492 424 675 700 149 33.0 2e-35 MASFRFTKLKITAGYTLLLAILLFSLVFVHREMEALSAADDQQNLRTDSLLTLLHEKDQN TIQMLRVLSEANDSLLSASEIEEIISEQDSVITQQRVQHRVITKRDSLITTPKKKGFFKR LAEVFSPSKQDSAVLINTSLEVATDTILQPTTSKDSLQQKIRMATEEKRLQRRRTIRRTS TKYQRMNTQLTARMDSLIKQYEGEMTLRARQDAELQQEVRMRSARIIGGIAVGAVLLSAF FLILIMRDISRSNRYRQQLEVANKRAEDLLVAREKLMLAITHDFKAPLGSIMGYTELLSR LTEDERQRFYLDNMKSSSEHLLKLVSDLLDFHRLDLNKAEVNRVTFNPSQLFDEIYVSFE PLTAAKGLALQCHVAPELNGRYISDPLRLRQIVNNLLSNAVKFTQKGEISLTAGYDSSKL TIAIADTGKGMALEDRERIFQEFTRLSGAQGEEGFGLGLSIVKKLVTLLEGTIDVQSTLG KGSCFTVTLPLYPVGKSIAESESTESENADITEESAVIPPMKVIRVLLIDDDKIQLNLTA AMLKQHGIDAVCCEQLEQLVEQLRSSVFDVLLTDIQMPAINGFDLVKLLRASNIPQAKTI PVIAVTARSEMDKAALHEHGFAGCLHKPFTVKELLMTVNEGQLSADEAHITEDMATAGIN 
FSALTAYSEDDPEAASSIIQTFIEETGKNIERMQQALNDKEVDGIAAMAHKLLPLFTMIG VDEAIPLLEWLEVQRGQDFSKKVKEKTDHVLQEILIVLTKAREYEQYLLQK >gi|222159293|gb|ACAB01000066.1| GENE 21 32221 - 33579 1176 452 aa, chain - ## HITS:1 COG:VC1540 KEGG:ns NR:ns ## COG: VC1540 COG0534 # Protein_GI_number: 15641548 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Vibrio cholerae # 1 444 1 449 461 185 29.0 1e-46 MRHIYSTYKEHYKALIFLGLPIVIGQIGVIVLGFADTLMIGHHSTIELGAASFVNNVFNL AIIFSTGFSYGLTPIVGGLYGTHQYAPAGQALRNSLLANLMVALLLTICMTVLYLNIEHL GQPEELIPLIKPYYLVLLASLVFVMLFNGFKQFTDGITDTQTAMWILLGGNVLNIIGNYI LIYGKLGLPELGLLGAGISTLFSRIVMVIVFIIIFMRSPRFVRYKIGFFRLGWSRAVFGR LNGLGWPIAFQMGMETASFSLSAIMIGWLGTIALASHQVMLAISQFTFMMYYGMGAAVAV RVSNFKGQNDIVNVRRSAYAGFHLMMTLGVVLSLIVFLCRNYLGSWFTDSQEVVAMVTSL IFPFLVYQFGDGLQITFANALRGISDVKLMMVIAFIAYFVISLPVGYFCGFVMGWGIVGV WMAFPFGLTSAGLMLWWRFHYMTKLPEPHPKT >gi|222159293|gb|ACAB01000066.1| GENE 22 33665 - 34903 1196 412 aa, chain - ## HITS:1 COG:CC3584 KEGG:ns NR:ns ## COG: CC3584 COG0612 # Protein_GI_number: 16127814 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Caulobacter vibrioides # 5 412 47 459 948 196 31.0 7e-50 MKINRHILENGLRLVHSQDESTQMVALNILYNVGARDEDPEHTGFAHLFEHLMFGGSVNI PDYDMPLQLAGGENNAWTNNDITNYYLTVPRQNVETGFWLESDRMLSLDFSERSLEVQRG VVMEEFKQRCLNQPYGDVGHLLRPLAYQTHPYQWPTIGKELSHIANATLEEVKAFFFRFY APNNAILAVTGNISFEEAVALTEKWFASIPRREVPLRNLPQEQEQTEERWLTVERNVPLD ALFMAYHMPDHRHPDYYAFDILSDVLSNGRSSRLNQRLVQQKQLFSSIDAYISGSVDAGL FHISGKPSAGVTLEQAEAAVREELELLQQELVDEQELEKVKNKFESTQIFGNINYLNVAT NLAWFELLGRAEDMEKEVERYRSVTAEQLRTVAQSAFRKENGVVLYYKKQQN >gi|222159293|gb|ACAB01000066.1| GENE 23 34909 - 35748 1009 279 aa, chain - ## HITS:1 COG:SPy0457 KEGG:ns NR:ns ## COG: SPy0457 COG0652 # Protein_GI_number: 15674576 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Streptococcus pyogenes M1 GAS # 31 278 
74 261 268 96 31.0 4e-20 MKKILLIILTISFCGLTACKTGTKKGGDMDKETLVKIETTVGDIEVKLYNETPKHRDNFI KLVKDGVYEGTLFHRVIKDFMIQAGDPDSKNAPKGKMLGTGDVGYTVPAEFVYPKYFHKK GALSAARQGDNVNPKKESSGCQFYIVTGKVFNDSTLLGMESQMNENKINVIFNTLAQKHM KEIYKMRKANDENGLYDLQEKLFAEAQEMAAKQPEFHFTPEQIEAYTTVGGTPHLDGEYT VFGEVVKGMDIVDKIQQVKTDRSDRPEEDVKITKVTILD >gi|222159293|gb|ACAB01000066.1| GENE 24 35836 - 36493 484 219 aa, chain + ## HITS:1 COG:PA4726 KEGG:ns NR:ns ## COG: PA4726 COG2204 # Protein_GI_number: 15599920 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Pseudomonas aeruginosa # 2 217 1 213 478 146 37.0 3e-35 MMSSILIVEDDITFGMMLKTWLGKKGFEVSSVSNIARARKHIESQNVDLILSDLRLPDHE GIDLLKWMNEQGMDIPLIIMTGYADIQSAVQAMKLGARDYIAKPVNPEELLKKISECLQS EKSPATHNVAKSSSKKGASTSSKDSTENHRAYLEGESDAAKQLYNYVGLVAPTNMSVLIN GSSGTGKEYVAHRIHQLSKRNDKPFIAVDCGSIPKELAA Prediction of potential genes in microbial genomes Time: Wed May 18 02:43:37 2011 Seq name: gi|222159292|gb|ACAB01000067.1| Bacteroides sp. D1 cont1.67, whole genome shotgun sequence Length of sequence - 31999 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 9, operones - 3 average op.length - 4.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 716 564 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains + Term 734 - 778 2.2 + Prom 841 - 900 3.3 2 2 Tu 1 . + CDS 932 - 2488 1331 ## COG3119 Arylsulfatase A and related enzymes + Prom 2621 - 2680 5.1 3 3 Tu 1 . + CDS 2763 - 3029 342 ## gi|160885634|ref|ZP_02066637.1| hypothetical protein BACOVA_03637 + Term 3140 - 3176 -0.7 - Term 2988 - 3020 -0.9 4 4 Op 1 . - CDS 3149 - 4720 1547 ## BT_3791 hypothetical protein 5 4 Op 2 . - CDS 4757 - 6685 1740 ## Phep_3405 RagB/SusD domain protein 6 4 Op 3 . - CDS 6699 - 9944 3561 ## BT_2894 hypothetical protein 7 4 Op 4 . 
- CDS 9981 - 11372 1328 ## COG4833 Predicted glycosyl hydrolase
8 4 Op 5 . - CDS 11387 - 12634 1278 ## Cpin_1591 hypothetical protein
9 4 Op 6 . - CDS 12653 - 14626 1591 ## BT_3791 hypothetical protein
- Prom 14652 - 14711 5.6
10 5 Tu 1 . - CDS 14815 - 18855 2991 ## COG0642 Signal transduction histidine kinase
- Prom 19013 - 19072 3.5
+ Prom 18889 - 18948 6.4
11 6 Op 1 . + CDS 19066 - 21348 2156 ## COG3537 Putative alpha-1,2-mannosidase
12 6 Op 2 . + CDS 21402 - 22352 902 ## COG3568 Metal-dependent hydrolase
13 6 Op 3 . + CDS 22405 - 23562 911 ## COG4833 Predicted glycosyl hydrolase
14 6 Op 4 . + CDS 23622 - 25067 1360 ## COG3538 Uncharacterized conserved protein
+ Term 25113 - 25173 9.1
+ Prom 25459 - 25518 5.0
15 7 Tu 1 . + CDS 25566 - 26726 881 ## COG2152 Predicted glycosylase
+ Term 26736 - 26797 3.2
- Term 26858 - 26917 3.1
16 8 Tu 1 . - CDS 27028 - 29298 1810 ## COG3537 Putative alpha-1,2-mannosidase
- Prom 29333 - 29392 2.5
- Term 29431 - 29473 3.3
17 9 Op 1 . - CDS 29518 - 30189 204 ## PROTEIN SUPPORTED gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family
18 9 Op 2 . - CDS 30199 - 30945 287 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17
19 9 Op 3 .
- CDS 30962 - 31552 475 ## BT_3770 transcriptional regulator + TRNA 31791 - 31866 71.3 # Met CAT 0 0 Predicted protein(s) >gi|222159292|gb|ACAB01000067.1| GENE 1 3 - 716 564 237 aa, chain + ## HITS:1 COG:STM4174 KEGG:ns NR:ns ## COG: STM4174 COG2204 # Protein_GI_number: 16767428 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Salmonella typhimurium LT2 # 1 230 209 439 441 202 45.0 3e-52 EFFGHVKGSFTGALTDKTGAFVAANGGTIFLDEIGNLSYEVQIQLLRALQERKIRPVGST QEISVDIRLVSATNENLEQAIEKGTFREDLYHRINEFTLRMPDLKERKEDILLFANFFLD QANKELDKHLIGFDAKASQALMNYHWPGNLRQMKNIVKRATLLAQSSFITLLELGTELLE TPASSNTSIALRNEETEKEHILEALRQTGNNKSKAAQLLNIDRKTLYNKLKLYNIDL >gi|222159292|gb|ACAB01000067.1| GENE 2 932 - 2488 1331 518 aa, chain + ## HITS:1 COG:ECs2103 KEGG:ns NR:ns ## COG: ECs2103 COG3119 # Protein_GI_number: 15831357 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Escherichia coli O157:H7 # 33 502 68 550 571 137 26.0 4e-32 MNKKTLLPLAFVPLAVTNLQAQSNMQIERADKRPNIILFMVDDMGWQDTSLPFWTQKTHY NELYETPNMERLAKQGMMFTQAYANSISSPTRCSLITGTNAARHRVTNWTLQKNTMTDRK DSILAVPDWNYNGVSQVSGTNHTFVGTSFMQLLKNSGYHTIHCGKAHFGAIDTPGEDPHH WGFEVNIAGHAAGGLASYLGEENYGHTKDGKAVSLMSVPGLEKYWGTETFVTEALTLEAI KALDKAKKYNQPFYLYMSQYAIHIPLNKDMRFYEKYKKKGMTDHEAAYATLIEGMDKSLG DLMNWLEKNGEANNTIIIFMSDNGGLASESGWRDGKLHTQNYPLNSGKGSTYEGGIREPM IVSWPGVVAPGSKCNNYLLIEDFYPTILEMAGVKNYQTVQPIDGISFIPLLKQTGNPAKG RSLFWNMPNNWGNDGPGINFNCAVRNGDWKLIYYYGTGKKELFNIPDDIGESNDLSAQHP DIVKKLSKELGNFLRKVDAQRPTFKATGKPCPWPDEIK >gi|222159292|gb|ACAB01000067.1| GENE 3 2763 - 3029 342 88 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160885634|ref|ZP_02066637.1| ## NR: gi|160885634|ref|ZP_02066637.1| hypothetical protein BACOVA_03637 [Bacteroides ovatus ATCC 8483] # 1 88 1 88 88 112 100.0 6e-24 MKKLVLVVAMFMFVCGGSFLVKAQSSAEAVTAPTEINATVVNDTVVKDTVTKEDAPAKAS LVALAANVNDTVVTDTTSKDKPAEPVKE >gi|222159292|gb|ACAB01000067.1| 
GENE 4 3149 - 4720 1547 523 aa, chain - ## HITS:1 COG:no KEGG:BT_3791 NR:ns ## KEGG: BT_3791 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 484 13 499 545 130 28.0 2e-28 MKRIYTYMCLLLLSVLSFAVTGCDDDDDDDVAKDLIELIVNKPVIRIGQNEEAKVNVVVG NGNYTVRSFNTAIATATVSGELITVKSGSQNGATTVEVMDGEGVVANISVNVGVFELEVN EPEVILEVAGEKQLIVSMGNFSSNDELSYEVEDETVVSMVNTDQFRPFYTLTGLKNGHTT VTFTDHKGKQAVVQVTVKPISIDVSNLTPRVGVNNKIMITVEKGNGGYSLTAENEEIVAI QQVDDTRFNLIGKKAGITTVFVRDEAEQELSLTVTVVQADKVANLGSGNYFKVPFEYNGT ADESLKILSTITFEARFNIESLNGNDNGNARINTVMGIEKKFLLRVDVHKGGSNDEERFL QLAADDKGSIRYEGSTKIETNKWYDVAVVLDNSKSGSERIALYVNGVRETLQLSNGTPDD LKEINLTSDFYIGQSDGKRRLNGAISYARIWTKALSDQQISEQSGKLLSEDKDGMVANWL FNNGNGNTKTFVSLAGKSFEAEAANIVSSWKTDPILETSTPTE >gi|222159292|gb|ACAB01000067.1| GENE 5 4757 - 6685 1740 642 aa, chain - ## HITS:1 COG:no KEGG:Phep_3405 NR:ns ## KEGG: Phep_3405 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 1 642 1 600 600 179 27.0 4e-43 MKKLYRFSAIALLVLPLVFTSCSDYLNRDDDDNITEADVFARYEKVNGLVSDVYAAAKKA DRPLVFFEHFSNSAITDECEGTNVEGNITNNFNNGAWNPNSLPGSVGQYWEALYEGIRKA NLIIENVQKYNTPDNPQQDGDLRNRIGEMYFMRGYFHMLLLRMYGEAPYIDRVINAGDNM DFKKESVHSMVEKIVTDAQTAYGMVPNKYVKTSENFGRVDKGACLGLISFVRWVAATPLW NGASQYGYNLRRVFENEYAYDATRWRKAKEAAKAVLDFEVGGTKRYSLYTKHDANDFKDP ADGNLNDSRVYARLWDMFYDMDAFANEYVFFMTKSKDQAWQGDIYPPSREGSSRQQPVQE QVDEYEYIVGDYGYPVYSAEARKGGYDDTNPYVKGTRDPRFYRDVIYHGAPYRNNKNESK TINTASGSDKIGATNATTTGYYLRKFQQESWNKSGNFSINAPAIWRLPEFIYIYAEACNE LGEDIDEAYKLVNTVRERSFMKPMPPEVKTNQQLMREYIQRERRVELFYEGKRPWTCRLY LEPTSKEELAKESLWKSSGSDNSKRTQKYWAANNGALPRCQRMINGMRPVQDENGAITVD GVKYRMERFCVEERTFSIQHYLFPIRQSELQKTPTIEQNPGW >gi|222159292|gb|ACAB01000067.1| GENE 6 6699 - 9944 3561 1081 aa, chain - ## HITS:1 COG:no KEGG:BT_2894 NR:ns ## KEGG: BT_2894 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 39 1081 25 1017 1018 573 35.0 1e-161 
MIHIKRNICLVAVSCTLLAGIPLQGVAQTGRTAKVQATQSNKITVSGTVLDKTTNDPLIG VSVVVKGVANAGTITDMDGKFTLKLPYAEAPLVFSYLGYQPQEIVPGAKKELTVLLQEDT KALQEVVVVGYTKQRKETMIGSVATITTKDLTQSPTANINNALAGRLPGLIVNQYAGGEP GVDQSELFIRGKATYGNQSAIVIVDGIERDMSYLAPDEIETFTILKDASATAAYGIRGAN GVIVITTKRGKAAEKATVNLKASIGINQPIGFPEYLGSADYATLYNEARLNDAKMTGADI SSLNLFSQQAIDNFRRAKGDNSDGLGYDWDYYDFAFKPGLQEDVSLSIRGGTDKVRYYVL ANYFSQGGNYKYSNAGEYDSQTKFTRYNFRSNIDININRYLSTRLDLGARITDRNAPGTT AGRLMTICATQPPYLPILVEENAHPQNEEYIQQNPRGMLYGDNIYRYNLLGELSRTGYLN EKNTYLNGSFAMNLDMEFLTKGLKAEVMFSYDASEGRWINRKLDTYKDGYREYPKYATFM PIEGSDAYMAGGHYTGAYKTGNKYDIDQTIGNGFSHNASDGRTYIQARLDYNRLFSNRHE VTAMLLANRGNRTVNNELAYHSQGITGRFAYYYNQKYLMEFNFGYNGSENFAPGKRYGFF PAGSIGWVVSEEEFMKKASWIDFLKVRASYGLVGSDNVSSRFPYLAFYGGGSGYDFGNNF GTNVGGTSEGNLANANLTWEKARKLNVGIDFTTLNQRLALTIDAFYEYRFDIITDMNSDG IMGYPDIVGKDAALQNLGEVSNRGVDIELSWNDKIGKDFRYYIRPNLTFSRNRLEYKAEV ARKNSWRKETGKRLYENFVYVFDHFVADQEEADRLNKIGYQPWGQLIPGDVVYKDLDRNG VIDDEDRTVMGNPRSPELMFGIPFGFQYKNFDFSVLLQGATKSSILLNGAAVFDFPQFEQ DKIGRVKKMHLDRWTPETAATAKYPALHYGTHDNNKNGNSSLFLYDASYLRLKNVEIGYN VSPKLLRKFHVQQARIYVQGLNLLTFDKLGDVDIDPETKSGDGASWYPIQKVFNFGIDIT F >gi|222159292|gb|ACAB01000067.1| GENE 7 9981 - 11372 1328 463 aa, chain - ## HITS:1 COG:lin0763 KEGG:ns NR:ns ## COG: lin0763 COG4833 # Protein_GI_number: 16799837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosyl hydrolase # Organism: Listeria innocua # 99 439 46 341 341 108 29.0 2e-23 MKNYIVSLSKRVLLLSGALAIFAACDDGDLGDIYKGNSSGEYVSDVNWTDAADKSTGTYI KYFYENASGRNTFCGSIYWEKPNTPETGEPTSTGGSGGWSQGHALDVITDAYIRHANNPE YQTYLYENIMKPFLPAFDDWNEHCGYGGKDFWNNFYDDMEWMALASLRVYELTGDPDYYS ALMKMWNHIKGAKNDYKGAGGMAWKTDAPASRMSCSNGPGCLLAMKLYQLTIKEAKDGWE EQAAYYLNFAKEVYNWMTAYLCDISTGQVYDNLSIKDDGTPGDPDKVALSYNQGTFMAAA LELYNATGEEEYLRNAVAFGSYQVNKKMDSNYPVFSGEGNSGDNLLFRGIFVRYFLDMVK QPTNSLYPEKTKNKFIAALRSCSDVLWTLAHPEGYYVWEYDWAKAPAFGNRDNREDRLTI SLNAEVPGATMIEIRARYEDWVQGKATEKANWVGPDFGKKAEE >gi|222159292|gb|ACAB01000067.1| GENE 8 11387 - 12634 1278 415 aa, chain - ## HITS:1 COG:no 
KEGG:Cpin_1591 NR:ns ## KEGG: Cpin_1591 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 14 405 15 376 381 117 24.0 9e-25 MRNNMITKKLIKFALLFFLIGLTACDNKLNPGFDPDDYEPIVPPEPVRTMDAMASGERVV LSGASLTDIVLKWTPTEKHGNTVYRYEVLIDTIGGDFTKPVETIFSDNNGLESQLTLTHY QANTIGKLAKFRCNTNGTLRWKVRAYCGLDQALSSLEGYFVIFMMDGIDDMPAENDPVYI TGVGTEDDGDEAEAQQMLRQNEGIYQTFTQLKANQPFVFMSTVEGRKCYYYVDDNGVLRE RNDGEEYTVTVPQSGIYRITINMGEQTISYDEIGAVYLYNQSGGYRQNFDYLGYGKWGVK NYTARKQKEDWAGSGETRHSFKMEINGTTYRWGHKERDKGQPNLDTDKSYYNLYQLALGT DAWDYSFRFCDELLQWGAIQGNVYYATVKTDVTLYFNAELGTYTHRWVASETNED >gi|222159292|gb|ACAB01000067.1| GENE 9 12653 - 14626 1591 657 aa, chain - ## HITS:1 COG:no KEGG:BT_3791 NR:ns ## KEGG: BT_3791 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 117 654 29 543 545 120 27.0 2e-25 MKRKLSFLFLSALTIPFFMVSCSDDKEEQEVKPATDLVVDNNSVTLLQGDEITVSITSGN GDYYVKPFDENVATATINGNKVTIKATENSQLEENNRTETTILIVDGRKKVARVLVRVAK LWNLTVDAPEEGFDLFIGEKRLVKILTGNGDYQISIPEGADKFLEVGELSGQVIPLTAKF ETGADPVNVTITDKKGKTITIPVVVNIVDLTLKSNEATFAEPDAESQYISIERGNGGYRF TYQVGDGEPTTDATIVEDTEKENLITLKPKARGDVKVIVTDQKGSVEEIAVKVNPYNLKL ENEATSLIIGGYEASTEIAITRGNGDYKLARLTDDNKTYLKTAELITEDGATKLKLTGKK LGTTTLELSDAAGQKLTLPVYVNPVCYRMEYDVCFKIDIKKYAESHSEVKSMNQLTFEVV FYPTYTRSMQSFIGLESVFLLRAEAKDVNPRFEIATKINGKSDPRFRSQQTIYCDDSEGG RKTPGKWYHIAIVYDGTKSSTKEAYKMYINGVRETLTPADNSYEDCAPNSSLNLTDVGGN DKALLIGRSGDSYRVGYCKVYQARMWKRALAESEIKANMCKILNAEEHSDLMGYWVFSKG VGGTTVFENWGNGGNGLDAQVCLQNISENKPAWGAELPATYNGDKSRFEPIECPHSY >gi|222159292|gb|ACAB01000067.1| GENE 10 14815 - 18855 2991 1346 aa, chain - ## HITS:1 COG:BS_resE_4 KEGG:ns NR:ns ## COG: BS_resE_4 COG0642 # Protein_GI_number: 16079368 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 827 1050 43 266 269 102 29.0 6e-21 MGKQSVYILLFFLNLLIHNAYAYNLKQVADKEYMSNSSITSLCQDERGLMWFGTCDGLNI YDGQEIEEFKTRDKEDYLSGNLIDNIVYTGEDVYWIQTYYGLNRLNRKTNTITHYNEFQK 
LFFMNKDSHGNLFIIQDSNCIYYYHKKEGVFKKINITGIPISDIVNFFIDGNNRMWVVMK GYNRCYDIQQEAASGDITLLPQKTNLIYQTSLIYCFNDEQSLYYIDKEFNFYAFHIPTQK NEFIANLGKEIQERGKISSIVHYHNSYFVGFLMDGILLLEKQKETDHYQIQSLPINSGVF CLKKDRFQDVVWIGTDGQGVYLYSNPLYSIKSMVLSNYTEKIERPVRAIYLDDECTFWVG TKGNGILKIYDYEFDKNISDCRAEVLTTSNSALSSNAVYCFAKSHRNLLWIGDEEGLTYY SYREKRIKSIPIRIGNEDFKYIHDIYETADSELWLASVGMGVVKARIAGTPDNPVIVDAQ RYIINDGELGSNYFFTIYAENEANLLFGNKGYGVFRYNETTNGLEPVSTHKYENMTLNNI LAISKDSSNNYLFGTSYGLIKYTSETSYQLFNAKNGFLNNTIHAILRNSSDDFWLSTNLG LINFDTKRNVFRSYGFGDGLKVVEFSDGAAFRDSQTGTLFFGGINGVVAIRADGRPEQLY MPPVYFDKLSIFGEQYNLGEFLTRKKETEVLNLQYDQNFFSVSFASVDYLNGNNCTYFYK LKGLSDQWVNNGSESGVSFTNMAPGEYTLLVKYYNSVFDKESDVYSLVIRIGDPWYASWW AYLIYALCLLLLAALLIRSFILRSKRKKQELLNEIEKRHQKNVFESKLRFFTNIAHEFCT PLTLIYGPCGRILSSKGLSKFVVDYVQMIQTNAERLNNLIHELIEFRRIETGNREVRVES LNVSSIVKGIAKTFVEMAKSRNITFLSKIPEQVMWNSDKGFLNTIIINLISNAFKYTPDG QSIKIEVDTSRENMLALRVANEGSTIKEKDFQYIFNRYAILDNFENQDEKNFSRNGLGLA ISYNMAKLLNGTLKVENTPDGWVMFTLTLPVVKLTTGVSETKRLTAEYIPKIDTQSILKL PQYEFDKMRPTLLVVDDEIEMLWFIGEIFSADFNVVTLQDPERLDQVMNEVYPNVIICDV MMPGMGGIELTRRIKSVKETAHIPIIVVSGRHEMEQQMEALSAGAEMYITKPFSAEYLRI SVCQVMERKEVLKNYFSSPISSFEKSDGKLTHKESKKFLQSVLKIINDNITNKDLTPRYI ADRLAISPRSLYRKMEEIGEDSPTDLIKECRLHIAKDLLLTTKKTIDEIVFDSGFSNKVT FFKVFREKYECTPKEFRMKHLEEVQQ >gi|222159292|gb|ACAB01000067.1| GENE 11 19066 - 21348 2156 760 aa, chain + ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 30 758 3 715 717 419 33.0 1e-117 MKTHFSFKHLLFLGGAVLYSLQSSAVKNPVDYVSTLVGTQSKFELSTGNTYPATALPWGM NFWTPQTGKMGDGWAYTYDADKIRGFKQTHQPSPWMNDYGQFAIMPITGGLVFDQDRRAS WFSHKAEVAKPYYYKVYLADHDVTTELAPTERAVMFRFTYPETKNAYVIVDAFDKGSYVK VIPKENKIIGYSTKNSGGVPENFKNYFVVQFDKPFTFVSTVSENNILPNEIEAKGNHTGA VIGFATKKGEIVHARVASSFISPEQAELNLKELGRNSFDQLVTNGRDVWNREMSKIEVED DNIDNLRTFYSCLYRSMLFPRSFYEIDAKGEIMHYSPYNGEVRPGYMFTDTGFWDTFRCL 
FPFLNLMYPSMNLKMQEGLVNAYKESGFLPEWASPGHRDCMVGNNSASVVADAYIKGLRG YNIETLWEALKHGANAHLRGTASGRLGYESYNQLGYVANNIGIGQNVARTLEYAYNDWAI YTLGKKLGKPESEIDIYKKHALNYKNVYHPERKLMVGKDNKGVFNPNFDAVDWSGEFCEG NSWHWSFCVFHDPQGLINLMGGKKEFNAMMDSVFVIPGKLGMESRGMIHEMREMQVMNMG QYAHGNQPIQHMVYLYNYSSEPWKAQYWIREIMNKLYTAGPDGYCGDEDNGQTSAWYVFS ALGFYPVCPGTDEYIIGTPLFKSAKLHLENGKTITIKADNNQLDNRYIKEMKVNGKSQTR NFLTHDQLIKGANIQFQMSPVPNKQRGTTEKDVPYSLSFE >gi|222159292|gb|ACAB01000067.1| GENE 12 21402 - 22352 902 316 aa, chain + ## HITS:1 COG:lin0348 KEGG:ns NR:ns ## COG: lin0348 COG3568 # Protein_GI_number: 16799425 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Listeria innocua # 29 310 3 256 257 160 33.0 2e-39 MKLKNLLLIALVAIVFCGCQSNYQPTSITVASYNLRNANGGDSINGNGWGQRYPVIAQIV QYHDFDIFGTQECFIHQLKDIKEALPGYDYIGVGRDDGKEKGEHSAIFYRTDKFDVIEKG DFWLSKTPDVPSKGWDAVLPRICSWGHFKCKDTGFEFLFFNLHMDHIGKKARVESAFLVQ DKMKELGKGKELPAILTGDFNVDQTHQSYDAFVSKGVLCDSYEKAGFRYAINGTFNDFDP NSFTESRIDHIFVSPSFQVKRYGVLTDTYRSIVGKGEKKQANDCPEEIDIKTYQARTPSD HFPVKVELEFDQRQQK >gi|222159292|gb|ACAB01000067.1| GENE 13 22405 - 23562 911 385 aa, chain + ## HITS:1 COG:lin0763 KEGG:ns NR:ns ## COG: lin0763 COG4833 # Protein_GI_number: 16799837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosyl hydrolase # Organism: Listeria innocua # 139 361 90 314 341 81 31.0 3e-15 MRNICFVACMLFCLTSAVGKTPGNTRYLSIADSILSNVLNLYQTNDGLLTETYPVNPDQK ITYLAGGTQQNGTLKASFLWPYSGMMSGCVALYKATGNKKYKKILEKRILPGMEQYWDNS RLPACYQSYPTKYGQHGRYYDDNIWIALDYCDYYQLTHKPASLEKAVALYQYIYSGWSDE IGGGIFWCEQQKEAKHTCSNAPSTVLGVKLYRLTKNAKYLKKAKETYAWTKKHLCDPTDH LYWDNINLKGKVSKEKYAYNSGQMIQAGVLLYEETGDEQYLHDAQQTAAGTDAFFRTKAD KKDPTVKVHKDMAWFNVILFRGLKALYKIDKNPAYVNAMVENALHAWENYRDENGLLGRD WSGHNKEQYKWLLDNACLIEFFAEI >gi|222159292|gb|ACAB01000067.1| GENE 14 23622 - 25067 1360 481 aa, chain + ## HITS:1 COG:XF0843 KEGG:ns NR:ns ## COG: XF0843 COG3538 # Protein_GI_number: 15837445 # Func_class: S Function unknown # Function: Uncharacterized conserved 
protein # Organism: Xylella fastidiosa 9a5c # 37 468 61 497 516 470 49.0 1e-132 MNITKAFCLSIALLGASNMQAITNSDFVIQQDNTKINNYQTNRPETSKRLFVSQAVEQQI AHIKQLLTNARLAWMFENCFPNTLDTTVHFDGKDDTFVYTGDIHAMWLRDSGAQVWPYVQ LANKDAELKKMLAGVIKRQFKCINIDPYANAFNMNSEGGEWMSDLTDMKPELHERKWEID SLCYPIRLAYHYWKTTGDASIFSDEWLTAIAKVLKTFKEQQRKEDPKGPYRFQRKTERAL DTMTNDGWGNPVKPVGLIASAFRPSDDATTFQFLVPSNFFAVTSLRKAAEILNTVNKKPD LAKECTTLSNEVEAALKKYAVYNHPKYGKIYAFEVDGFGNQLLMDDANVPSLIALPYLGD VKVNDPIYQNTRKFVWSEDNPYFFKGTAGEGIGGPHIGYDMIWPMSIMMKAFTSQNDAEI KTCIKMLMDTDAGTGFMHESFHKNDPKNFTRSWFAWQNTLFGELILKLVNEGKVDLLNSI Q >gi|222159292|gb|ACAB01000067.1| GENE 15 25566 - 26726 881 386 aa, chain + ## HITS:1 COG:PAB1622 KEGG:ns NR:ns ## COG: PAB1622 COG2152 # Protein_GI_number: 14521331 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Pyrococcus abyssi # 49 375 8 287 305 166 34.0 7e-41 MNNMKSTFLFLLTTTMMTCTAYGQSSNHKENKLPDWAFCGFERPKNVNPVISPIENTKFY CPLTKDSIAWESNDTFNPAATLYNGEIVVLYRAEDKSGVGIGHRTSRLGYATSTDGTHFQ REKTPVFYPDNDSQKELEWPGGCEDPRIAVTDDGLYVMMYTQWNRHVPRLAVATSRNLKD WTKHGPAFAKAFDGKFFNLGCKSGSILTEVVKGKQVIKKVNGKYFMYWGEEHVFAATSDD LIHWTPIVNIDGSLKKLFSPRDGYFDSHLTECGPPAIYTPKGIVLLYNGKNHSGRGDKRY TANVYAAGQALFDANDPTRFITRLDEPFFRPMDSFEKSGQYVDGTVFIEGMVYFKNKWYL YYGCADSKVGVAVYDPKRPAKADPLP >gi|222159292|gb|ACAB01000067.1| GENE 16 27028 - 29298 1810 756 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 32 748 46 777 790 572 41.0 1e-162 MKAFKLLALTSCVLLTAGNGVAQQNYSKSEGLLQYVDPYIGSGYHGHVFVGTSVPYGMVQ LGPTNIHKGWDWCSGYHYSDSILIGFSHTHLSGTGCTDLCDILIMPLNEIRTPRGNQDDI RDGYASRYSHANEIARPEYYSLLLDRYNIKAELTATDRVGFHRYTYPEGKPASILIDLRE GNGSNAYDSYIRKVDDYTVEGYRYVRGWSPSRKVYFVLKSDKKIEQFTAYDDNAPQPWDQ LKVASVKSVLTFGNVKEVKIKVALSSVSCDNAAMNLQSELTHWDFDKVVDMSADRWNKQL EKMTVETDDEASKRVFYTAHYHTMIAPTLFCDVNGEYRGMNDMIYTDPKKANYTTLSLWD 
TYRALNPLMTITQPEMVDHVVNSMISIYRQQDKLPIWPLMSGETDQMPGYSSVPVIADAY LKGFTGFDAEEALQAMIATATYEKQKGVPYVVKKGYIPADKVHEATSIAMEYAVDDWGIA AMARKMGKTEDAETFSKRAHYYKNYFDSSIHFIRPKLEDGSWRTPYDPARSIHTVGDFCE GNGWQYTFFAPQDPYGLIALFEGDKPFTTKLDGFFTNTDSMGEEASSDITGLIGQYAHGN EPSHHVAYLYAYAGEQWKTAEKVRFIMSDFYTDQPDGIIGNEDCGQMSAWYLLSSMGLYQ VNPSDGVFVFGSPCFKKVEVKVRGGNTFTVEAPNNSKENIYIQKVYLNGKPYDKSYITYQ DIINGSTLKFVMGKKPNKNFGKAPANRPVVLNKING >gi|222159292|gb|ACAB01000067.1| GENE 17 29518 - 30189 204 223 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family [Lactobacillus jensenii 269-3] # 3 210 83 279 287 83 29 2e-15 MTVVYEDNHIIVVNKTASEIVQADKTGDTPLSETVKQYLKEKYQKPGNVFLGVTHRLDRP VSGLVIFAKTSKALTRLNEMFRTSEVKKTYWAVVKNAPQEPEGELVHFLVRNEKQNKSYA YDKEVPNSKKAILHYRLIGHSENYYLLEVDLKTGRHHQIRCQLAKMGCPIKGDLKYGSPR SNPDGSICLHARRVRFIHPVSKELIELEAPLPEGNLWKGFAID >gi|222159292|gb|ACAB01000067.1| GENE 18 30199 - 30945 287 248 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 7 248 4 242 242 115 34 5e-25 MGLLDGKTAIVTGAARGIGKAIALKFAAEGANIAFTDLVIDENAENTAKELEAMGVKAKG YASNAANFEDTAKVVEEIHKDFGRIDILVNNAGITRDGLMMRMSEQQWDMVINVNLKSAF NFIHACTPIMMRQKAGSIINMASVVGVHGNAGQANYAASKAGMIALAKSIAQELGSRGIR ANAIAPGFILTDMTAALSDEVRAEWAKKIPLRRGGTPEDVANIATFLASDMSSYVSGQVI QVDGGMNM >gi|222159292|gb|ACAB01000067.1| GENE 19 30962 - 31552 475 196 aa, chain - ## HITS:1 COG:no KEGG:BT_3770 NR:ns ## KEGG: BT_3770 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 196 1 196 196 355 95.0 5e-97 MTVSKTKAKLVDVARQLFAKMGVENTTMNDIALASKKGRRTLYTYFKSKDEIYLAVVESE LDILSDMMKRVAEKNISPDEKLLEMIYTRLDAVKEVVYRNGTLRAYFFRDIWRVEKVRKK FDAKEVQLFKAVLLEGQAKGVFHIDDVEMTADLIHYCVKGIEVPYIRGHIGAHLDEDTRN RYVSNIVFGALHRTEI Prediction of potential genes in microbial genomes Time: Wed May 18 02:44:33 2011 Seq name: gi|222159291|gb|ACAB01000068.1| Bacteroides sp. 
D1 cont1.68, whole genome shotgun sequence
Length of sequence - 37849 bp
Number of predicted genes - 30, with homology - 28
Number of transcription units - 13, operones - 8 average op.length - 3.1
N Tu/Op Conserved S Start End Score pairs(N/Pv)
- Term 51 - 91 9.6
1 1 Op 1 . - CDS 118 - 915 565 ## BT_4180 acetyl xylan esterase A
2 1 Op 2 . - CDS 933 - 3608 2379 ## COG3250 Beta-galactosidase/beta-glucuronidase
3 1 Op 3 . - CDS 3628 - 5856 1758 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases
- Term 5884 - 5931 -0.7
4 2 Tu 1 . - CDS 5994 - 7787 1508 ## gi|262405086|ref|ZP_06081636.1| conserved hypothetical protein
5 3 Op 1 . - CDS 7844 - 8830 772 ## Phep_2142 hypothetical protein
6 3 Op 2 . - CDS 8892 - 10430 723 ## Acid_0712 hypothetical protein
- Prom 10468 - 10527 5.7
7 4 Op 1 . - CDS 10630 - 12171 1159 ## gi|237715245|ref|ZP_04545726.1| conserved hypothetical protein
8 4 Op 2 . - CDS 12199 - 12807 457 ## gi|237715246|ref|ZP_04545727.1| conserved hypothetical protein
9 4 Op 3 . - CDS 12821 - 14281 1480 ## Phep_0446 RagB/SusD domain protein
10 4 Op 4 . - CDS 14295 - 17432 2765 ## Phep_0445 TonB-dependent receptor plug
- Prom 17530 - 17589 8.1
+ Prom 17530 - 17589 9.7
11 5 Tu 1 . + CDS 17619 - 17801 100 ##
+ Term 17803 - 17848 4.5
- Term 17655 - 17685 0.0
12 6 Op 1 3/0.000 - CDS 17768 - 21793 2448 ## COG0642 Signal transduction histidine kinase
- Prom 21841 - 21900 6.0
13 6 Op 2 . - CDS 21902 - 22801 820 ## COG2207 AraC-type DNA-binding domain-containing proteins
- Prom 22858 - 22917 4.8
14 7 Tu 1 . - CDS 22977 - 24131 1484 ## COG1454 Alcohol dehydrogenase, class IV
- Prom 24164 - 24223 1.8
+ Prom 23860 - 23919 4.3
15 8 Tu 1 . + CDS 24130 - 24219 74 ##
16 9 Op 1 . - CDS 24228 - 25037 926 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases
- Prom 25057 - 25116 7.1
17 9 Op 2 . - CDS 25130 - 26149 986 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily
18 9 Op 3 6/0.000 - CDS 26153 - 27409 1560 ## COG4806 L-rhamnose isomerase
19 9 Op 4 . - CDS 27460 - 28917 1453 ## COG1070 Sugar (pentulose and hexulose) kinases
20 9 Op 5 . - CDS 28936 - 29121 102 ## gi|294644504|ref|ZP_06722261.1| hypothetical protein CW1_4369
- Prom 29141 - 29200 4.1
+ Prom 29073 - 29132 6.2
21 10 Op 1 . + CDS 29153 - 29626 585 ## COG1438 Arginine repressor
22 10 Op 2 . + CDS 29654 - 30232 530 ## BT_3761 hypothetical protein
23 10 Op 3 . + CDS 30246 - 31454 1668 ## COG0137 Argininosuccinate synthase
24 10 Op 4 . + CDS 31451 - 32419 827 ## COG0002 Acetylglutamate semialdehyde dehydrogenase
+ Prom 32421 - 32480 2.2
25 11 Op 1 . + CDS 32536 - 32829 142 ## Arnit_1086 hypothetical protein
26 11 Op 2 . + CDS 32868 - 33989 1094 ## COG4992 Ornithine/acetylornithine aminotransferase
+ Prom 34042 - 34101 4.1
27 12 Op 1 . + CDS 34126 - 34899 967 ## COG0345 Pyrroline-5-carboxylate reductase
28 12 Op 2 1/0.000 + CDS 34940 - 35494 589 ## COG1396 Predicted transcriptional regulators
+ Prom 35525 - 35584 3.3
29 12 Op 3 . + CDS 35625 - 37280 1752 ## COG0365 Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases
+ Term 37308 - 37356 5.1
+ Prom 37362 - 37421 6.0
30 13 Tu 1 .
+ CDS 37498 - 37698 158 ## BT_3747 hypothetical protein Predicted protein(s) >gi|222159291|gb|ACAB01000068.1| GENE 1 118 - 915 565 265 aa, chain - ## HITS:1 COG:no KEGG:BT_4180 NR:ns ## KEGG: BT_4180 # Name: not_defined # Def: acetyl xylan esterase A # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 263 1 266 267 330 62.0 4e-89 MLRSIYLLSFFLLQSLFAFAGNVKEDVPSTFDLYVCIGQSNMAGRATLIPEVMDTLRNVY LLNDKGNFEPAVNPLNRYSTVRKDLSMQRLGPAYGFAKEMARQTKRPVGLVVNARGGSSI NSWLKGSKDGYYEEALSRVRIAMKQGGVLKAILWHQGEADCSNPEAYKQKLISLVKDLRE DLGMPNLPVVVGQISQWNWTKREAGTVPFNQMIKKVSSFIPHSDWVSSKGLGWYKDEKDP HFNTEAQLLLGKRYAKKVLKFYKHQ >gi|222159291|gb|ACAB01000068.1| GENE 2 933 - 3608 2379 891 aa, chain - ## HITS:1 COG:SMb21655 KEGG:ns NR:ns ## COG: SMb21655 COG3250 # Protein_GI_number: 16263752 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sinorhizobium meliloti # 22 646 3 584 755 176 27.0 3e-43 MKKLFYIFILLFIVGWAQAQQRVVYTINDGWKFTKGSPFEAQLTGCDDSSWETVNIPHTW NDKDADDETPGFYRGPVWYRKQLFIDKSQEGRQAVIYFEGANQEVRFYLNGQFVGEHKGG YTRFCFDITSHLRYGQENLFAIYVNNVYNPNIPPLSADFTFFGGIYRDVYLQFMNPVHIA TNDYASSGVYIRTPEVNNSAASVEITTLLTNDMPQATEIRVENIICDADGKEVKKTQAEV KLAAGETKTDISKKIKIDSPRLWDIDDPYRYMVYTRILDKRKGTLLDEVVNPLGLRWFKF DSEKGFFLNGKGRKLIGTARHQDYFQKGNALRDELHVQDVLLLKEMGGNYLRVSHYPQDP VIMEMCDKLGIVTSVEIPVVNAVTETEEFLHNSVEMAKEMVRQDFNRPSVMIWGYMNEIF LRRPYTEGKQLEDYYRFTEKVARALEATIREEDPSRYTMMAYHNMPQYYEDAHLTEIPMI QGWNLYQGWYEPDINEFQRLLDRAHKVYKGKVLMVTEYGPGVDPRVHSYQPERFDFSQEY GLVYHKHYLNEMMKRPFVAGSSLWNLNDFYSESRVDAVPHVNNKGVVGLNREKKDVYWFY KTALSRRPILVIGNREWKSRGGVVNTAQKECIQSVPVFSNAEEVELFVNNKSLGKKKIED NYALFDVPFVGGENLLEAVAVTGGNKLRDMLRIQFQLVGSQLKDEAVPFTELNVMLGSPR YFEDRAANVAWIPEQEYKPGSWGFIGGTSYRRQTGFGTMLGSDIDIHGTDMNPIFQTQRV GIKSFKADVPNGEYSVYLHWAELESDKEREALVYNLGADSEQTFAGNRSFGISINGTTVS DDFNVARDYGYARAVIKKFVITVKDGKGVSVDFHKKEGEPILNAIRIYRNY >gi|222159291|gb|ACAB01000068.1| GENE 3 3628 - 5856 1758 742 aa, chain - ## HITS:1 COG:SSO3022 KEGG:ns NR:ns ## COG: SSO3022 COG1501 # 
Protein_GI_number: 15899728 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Sulfolobus solfataricus # 1 737 13 724 731 496 39.0 1e-140 MSEGCVRILSFPEGDTLVTKRLVVDSQKPAFKDYKWKDTGQQLIFSTRELSVAFDKEDAA FTFQEVSTGKLLLKEKEAKKARHFKRSVTGGEQCLEVTQRFVPTDDEAIYGLGQYQNGIM NYRGKSVLLLQANMDIVNPFLISTNGYGILWDNYSSTKFEDTKEGYSFTSEVGDASDYYF VYGKNMDEVVAGYRELTGDVPMFGKWVYGFWQSKERYKSFDELKAVVKEYRKRGIPLDNI VQDWEYWGDKPHWNSLTFHPANFDHPRQVIEELHQQDHVHFMLSVWPGFGPETAVYQSLD SIGALFSEPTWAGYKVFDAYNPAARDIFWQYLKKGLYDMGVDAWWMDATEPSFRDGFTQL KQEERTKAAGNTYLGSFHRYLNTYSLEMLKDFYQRLRAESDQKRIFILTRSAFASQQHYG TAVWSGDVSASWENMHKQLVAGLNLSMSGIPYWTSDTGGFFVTERDAKYPNGLKSNDYKE LYSRWFQFSAFTPVFRAHGTNVPREVWQFGEKGTPSYDNQVKYIQLRYRLLPYIYSMSHQ VTANNYTLLRGLAMDFTTDTRTFDIDNAYMFGTSLLVRPVFHPQSEEKNISVYLPEHNGE YWYDFWTGKAFEGGREQMQANTLDILPLYVKAGSILPLAEVKQYAMEYPDRELELRIYGG ANASFLWYEDEGDSYRYEDGVCSKVLMQWKDSERTLTIGLREGSYPGMPEQVKMRVKLYL PDGAALESKECVYTGREIKIKF >gi|222159291|gb|ACAB01000068.1| GENE 4 5994 - 7787 1508 597 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262405086|ref|ZP_06081636.1| ## NR: gi|262405086|ref|ZP_06081636.1| conserved hypothetical protein [Bacteroides sp. 
2_1_22] # 1 597 1 597 597 1238 100.0 0 MRKLLLFLLVSIALPALAQNGKKVYADFHGVRYTRQHDGKLGRWEMYANTEKSSTGRKSL CYNADLIDSEGRHEIAAVAYPQVGMQSNLDPDYIEYQILSAKAAKIDGFFIEWGFKPHEN DILLREMQKVAAKYDFEIGVNWCDGWLYYDWITKIYPEVNTREAKTEYMAKCYQYLVDSV FTGPTAPMVKGMPVFYHFGPGAKVDEYKKVLSLVKFPQGMKQPVALRRWADWGKLENDKY IPVTRSDDMDAWKKVGEIPTAWLPARVHTRDQAHAEWDNYATQDDVIEFMKPFRDSIWHS NNPAYTIKSGFAMPGMDNRGCAGWGRGHFYYIPRNNGETYQSMWKFCMAEKDSLDMMFIA SWSDYTEGHEIEPTIENGDRELRTTLKYAAEFKGEQADERGLTLPLMLFRLRKEARFLEK TKMDVSACQRSLDKAALLISQGRYPVAIGLLSQIENDVKTAKSALAVEMMRLRESDMKIQ GKRKSGGYNAEETLSISLPKELVSKLQMNNYVGYLYFEYLDKGNESLFIRSSTQREPKEP FKIVSRIRTDNTGEWKSAKVELYKDNIVNGFNMPTFYLKGNVVIRNLSLGYTIYTVK >gi|222159291|gb|ACAB01000068.1| GENE 5 7844 - 8830 772 328 aa, chain - ## HITS:1 COG:no KEGG:Phep_2142 NR:ns ## KEGG: Phep_2142 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 39 318 51 324 333 219 43.0 1e-55 MKKISLLALFLGFNIALGACSDDDSAPVVKNESFQLPAKAYVLAEQTRRAIVIRDAETQR NVWSWDPYTACVPSAHQDWFINPSEVKPVFNKRYILMTASGGAVALIRLSDHKLMFYANC GQNPHSAEILPDGNIVTAESKSGEINTFVVDTVKVLGMKANTLKLGNAHNVVWDRKKECI YATATIQAGVTALFRMKYNGDRNNPQLTNQTRIYTFDKESGGHDLFPVYGEEDKLWLTAA SAVYKFDISTDTPTCEKVYNTADIKSVCNGPDGILMLKPTEEWWAEGLVNEKGEELFKMD GAKIYKGRWMIDNTFSYPEKHDFVLGED >gi|222159291|gb|ACAB01000068.1| GENE 6 8892 - 10430 723 512 aa, chain - ## HITS:1 COG:no KEGG:Acid_0712 NR:ns ## KEGG: Acid_0712 # Name: not_defined # Def: hypothetical protein # Organism: S.usitatus # Pathway: not_defined # 71 511 42 461 462 223 33.0 1e-56 MERHIYRYFRFFYLSGLILLFATSCSADAGEPSEPETPTTPTDEYTYLNVEYRKWQNGTF QLWTTADSRETRIIDNMNRYAPSGDYTRTAWGGRNGLQPSSVTGKEGFFRVAHCGGRAYL LDPDNGAVILHGVQHVRPGESTAHQKAFSTKYGSEVRWSEETGKLLADNHINYISYGSNR IETFPVAIRANLLTPKTQKIAYAETLYLLRTFMWDMTKNLGYAFDDDKYNRLILLFEPTF AAYIDNLVREKSALFAGDKHFIGYYLDNELPFASYQNGDPLRGIDLKHFLSLPDRYKAAR TYAEKFMQERGIVSPAAITKADQEDFRGVVSDYYYQLTTTTVRRYDTEHLILGTRLHDWS KYNQKVVEACARYCDVVSVNYYGRWQPETDFLANLKAWCAVKPFLVSEFYTKAEDASYKG VEYANTEGGGWLVHAQKNRGEFHQNFCLRLLETRNCIGWIHFEYNDSYASDGSASNKGIV 
SLEYEPYESFLSYVRQLNLAVYPLIDYYDTRQ >gi|222159291|gb|ACAB01000068.1| GENE 7 10630 - 12171 1159 513 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237715245|ref|ZP_04545726.1| ## NR: gi|237715245|ref|ZP_04545726.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 513 1 513 513 925 100.0 0 MIRKLYTILLIGLCLNLVACGDDNENIDPNASAPVIKFPMEQLDVDLNKVDNLPVVAVIK SQAGLQSVTMKIQTVEGTVEYKTVTDFFNPNSYSLSENLEYNANYQAFIIEATDKLDHII TGTLPISVTDVVERPVITFDPEEIIYDEMDENPTIPRTTFKITSEAGLKTVEMYLVSASG QESKGIINLSGEKEYTFDEMIDYKEGDRGFKVKAEDTYGYITISTLPVTYKTIPGPSLTL TESTIFAGTDAKKGVPVQIESVRGVHEVVIYRIENGSEVEALRETKNGEHTLNYAPEIDF TEATSKLKVVVSDGREGKEAIGYMKAYVNMDVATLNVGSQPLANNAHVKYPDAFGMVSLN DLKTYSVDYAIANEVNAKNVDFKFYCFGASGSPRLYSMDNTGKDGEFSGSTGKLSAIKVK NLTRFAILSNFDYENATVASISSEILSSSIAQSLLDPIAVGNVIAFRTGGSSAAGGGRIG VMKVINITEPKELVSNNATARVMTVEIKFSKKK >gi|222159291|gb|ACAB01000068.1| GENE 8 12199 - 12807 457 202 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237715246|ref|ZP_04545727.1| ## NR: gi|237715246|ref|ZP_04545727.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 202 10 211 211 404 100.0 1e-111 MKKTIYRTIFFCLSVIALAGCDLELQKNYDYEASVDDPHVNVTAWEYFQDHQDIFSEFTT AIEYTGLKDYYTQTENKYTYLALNNTAMQSYRENVFPGIASITDCDKETVKNMLLYHIVD GEYSSYGQLQVEAMFVLTMLPGEKGLMTMSVWKNPWQAAVGKILVNETGSNGKSPQRNAK TSNILPTNGVIHIFEKYCYYQK >gi|222159291|gb|ACAB01000068.1| GENE 9 12821 - 14281 1480 486 aa, chain - ## HITS:1 COG:no KEGG:Phep_0446 NR:ns ## KEGG: Phep_0446 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 20 445 23 469 530 204 32.0 7e-51 MKNIKYFVGAVCLALSLNSCSDFLNEEPVSEIPAGDMWQTARDAKAGINEIYGLLRSTLR ENYFYWGEFRSDNVAPGAPVMADQARVINNLMSTDEKCAKWTTLYQMINQANLAIKYVPN ISMPDVADRNDYLGQAYALRALAYFYAIRVWGDVPLFIEPTEKYSEAIYKERTDKNYILE HVILPDLKKAEGLINRNKNYERKRISICGVWAIMADAYMWSKEYNLADQTIDKMADIKSS KGRLVDFEPSIQTWHTMFTEELNNKPSDDTPENDEYSTKEFIFLIHFNMDEVGTNGYSYM YQWFSGSGNRAAVLSDKLMSVFNEPDMQGDLRKAYTVKDYQNGNELRKYMAGDISNSLNK TCEVAYPIYRYTDMLLLQAEARARLGKWEEALDLVKKVRDRAGLVTPTALSFASEDEVVN YILRERQVELVGEGRRWFDLLRTGKWKEVMESINGMSQDGNELFPIHYSHILENPKITQN PYYGNN >gi|222159291|gb|ACAB01000068.1| GENE 10 14295 - 17432 2765 1045 aa, chain - ## HITS:1 COG:no KEGG:Phep_0445 NR:ns ## KEGG: Phep_0445 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 11 1045 3 1043 1043 714 40.0 0 MKNLQSRTKVRLLQSFLLISLSLFFLVNPAYAQGGFAVKGVIVDKTGFPLPGANVMEKGT SNGTITDLDGNFSLNVSKKGVTLTVSFMGYTPKDVVVKDKTMNITLEENSKLLDEVVVVG YGTMKKRDVTGAITSISSEAIEQKMATNVFEALQGTTAGVQVVSGSGQPGESSSIKIRGT STFSAEGVTPLYIVDGVPLESIDGINTNDITSMEILKDAASAAIYGSRSANGVIIITTKS GQEGKARIDIKYNHSWGTLSHKVPQANRKERLLYDQYRKEYFETYGGGNPDESSDILNDP LNSFFNVDNDYLDMITSTAQKDQVDISVGGGTKKLKYFINTGYYNEKGIISNTGFQRLNT RINSDYSPTDWMNMGSRISLTYSKKKGLNEGTLLSAVLTRRPYFNTYYPDGSLVGVFNGQ KNPIAQVNYTTDFTDSYKANFFQFFEIKFNKYLKFRANINANFYLDKRKKLEPSLITDEW QKQNKGYSYNYLNWNWMNEDILTYARKIKDHNFTAMVGVSAQQWRYENETFVGINSSTDF IYTMNAFAANLDLSSTGSTLSNHSMASIFARVTYDYKGRYLLNAIMRRDGSSRFAKENKW GNFPSVSVGWRFSDEKFMKFSKKFLEDGKIRASFGITGNEAIGNYDYIYSYSPNSIYDGV 
GGVIPTRIGKDNLKWEETKQFNLGLDLNFWNSRLTITADYYDKYTDGLLANYQLPKESGF AYMKTNVGEMSNRGFEIAVTGDIIRTKDWKWNASFNISRNVNRIEKLSEGKAYMEGDIWW MQEGGRVGDFYGFKSAGVFAYDESNAFTDKWEQLTPVFENGVFQYKYLLDGKEYAGNIRQ KTLPNGKPFRGGDYNWEEPEGTRDGVIDDNDRMVIGNAMPDVTGGLNTTVTWKNLSLYLG FYYSLGGQIYNAAEHNRNMFKYTGTTPSPEVIHNMWLHPGDQAIYPRPYNDDYNNARMGN SFYLEDASFIRLQNIRVAYDLPENWIKKLMLKNINIYAFVNNALTWTNYSGFDPEFSTNN PLQVGKDSYRYPRKREYGIGFSANF >gi|222159291|gb|ACAB01000068.1| GENE 11 17619 - 17801 100 60 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKVCADTPNQRKLKYKLALSNATKVSHLNILHPIELFVLSSLKIQMTYFNHFTEGSEEVL >gi|222159291|gb|ACAB01000068.1| GENE 12 17768 - 21793 2448 1341 aa, chain - ## HITS:1 COG:BS_phoR_3 KEGG:ns NR:ns ## COG: BS_phoR_3 COG0642 # Protein_GI_number: 16079962 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 804 1038 40 274 279 149 36.0 5e-35 MRGQKLSKIHFLILLSLLGFISPLQSQEYKFRTLGVEDGLSQITVSDICQDEKSRIWIAT LDGLNCFDGNHIKVFNHFHNDSISYGNLYVTQMVEDGQGSLFLLTSTGLFQFDLETEKYY ILPVPSPSTLAKGKLGVWIAEGGKLFLYDKNTRSVKPMYAEVHLPDTGPTMVEGSEGNLW VALKDGGVMRVDTCGSMSLYLPGIKVMKLIKSNDQNIWVGSQDHGVFCFSPQGAVIHHYD YNNKSVYTVRDDMARALCQDLEGNIWVGYRSGLSKIEVATGKIFHYQADPNRVGAMSNRS VTSLYTDKQGTVWVGTYWGGVNFFSPEYQHFVHYHASDTGLSFPVVGAMAEDKSGNIWIC TEGGGLDLYQPEQGTFKHFNAHTGYHFSTDYLKDVVFDEANNCLWIAADFTNKVNCFHLD NYRNDIYDLEPLGEESVGEALFALADTPRKLYVGTTSAVVSLDKQALKTEVLFHQKELFT HNYNTLLLDSKNRLWFASDDGCVAYVIDEGRFETYRISLKKQVRSQKELVNVIYEDRKGN IWVGTHGNGLFLLDKKERLFRLHTPESVLSGENIRVLGETPSGNLLIGTGHGLSVLEQKD GKVINFNSKTGFPLTLVNRKSMHVSRNHDIYMGGATGLVAIRESSLSYPPKIYDLELAHL YVNNKEITTGDQTGILNKSFAYTDRIKLNYLQNVFSIGFSTDNFLHIGGGEVEYRLIGSN DEWSENRLGNDITYTNISPGDYVFEIRLKNFPEVIRSLHITITPPFYATWWAYTIYACVI LTILFFVVREYRIRLFLKTSLDFELREKQYIEEMNQSKLRFFTNISHEIRTPITLILGQV DLLLNSGKLSTYAYSKLLNIHKNAGNLKSLITELLDFRKQEQGLLKLKVSQFDLYSLLKE HYVLFKELAANRNISFVLHADCEQCLVWGDRMQLQKVVNNLLSNAFKYTSDGGSISMELA DGADECMFSVSDNGAGISEEDYVKIFERFYQAENIGQYGGTGIGLALSQGIVKAHQGDIT VESQLGKGSCFKVTLKKGDAHFDSSVSRIEPEQDKEYIYYSEDKELLVKEVQSAQSESGT 
TDCKLLVIDDNEEIRNILVDIFSPLYTVETASDGEEGYEKVKMMQPDLVISDIMMPGMPG TELCAKIKNNIETCHIPVVLLTALSAPERELEGLRIGADIYVVKPFNMRRLVMQCNNLIN TRRLLQNKYAHQLDSKAEKIATNELDQRFIEQATQVVEDNMENPEFSVDVFSREMGVGRT VLFQKIKGITGSTPNNFIMNLRLKKAAYFLQNSPEMNISDIAYRLGFGNPQYFNKCFKEL FDIAPTQYRKAHNTSSEPSVK >gi|222159291|gb|ACAB01000068.1| GENE 13 21902 - 22801 820 299 aa, chain - ## HITS:1 COG:SMb21419 KEGG:ns NR:ns ## COG: SMb21419 COG2207 # Protein_GI_number: 16264994 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Sinorhizobium meliloti # 11 293 8 287 295 154 30.0 2e-37 MIEDNSLGLEFKYLIVNDMDRKFGLWVNTVGYQSIPPDSPYPLKEHPSGYFFNAEKGRVL REYQLVYITKGRGLFSSDSTSEKQVCKGRLMVLFPGQWHTYRPLRQTGWTEYYIGFEGPM IDAIVDDAFLSQEQQILEIGLNEELVSLFSRALAVAEADKISAQQYLSGIVLHMIGMILS VSKNKVFEMSDVDQKIEQAKIIMNENVSGNVDPEELAMRLNISYSWFRRVFKEYTGYAPA KYFQELKLRKAKQMLVGTSQSVKEISFFLGFQSTEYFFSFFKKRTGLTPLEYRSFGREE >gi|222159291|gb|ACAB01000068.1| GENE 14 22977 - 24131 1484 384 aa, chain - ## HITS:1 COG:STM2973 KEGG:ns NR:ns ## COG: STM2973 COG1454 # Protein_GI_number: 16766278 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Salmonella typhimurium LT2 # 2 384 3 382 382 412 56.0 1e-115 MNRIILNETSYFGAGCRSVIAVEAARRGFKKAFFVTDKDLIKFGVAAEIIKVFDENQIPY ELYSDVKANPTIANVQNGVAAYKASGADFIVALGGGSSIDTAKGIGIVVNNPDFADVKSL EGVADTKHKAVPTFALPTTAGTAAEVTINYVIIDEDARKKMVCVDPNDIPAVAIVDPELM YSMPKGLTAATGMDALTHAIESYITPGAWVMSDMFELKAIEMIAQNLKAAVDNGKDVAAR EAMSQAQYIAGMGFSNVGLGIVHSMAHPLGAFYDTPHGVANALLLPYVMEYNAESPAAPK YIHIAKAMGVDTTGMSEAEGVKAAVEAVKALSASINIPQKLHEINVKEEDIPALAVAAFN DVCTGGNPRPTSVADIEALYRKAF >gi|222159291|gb|ACAB01000068.1| GENE 15 24130 - 24219 74 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVIYLFKGLTAAPSRVENATVSKVIKIND >gi|222159291|gb|ACAB01000068.1| GENE 16 24228 - 25037 926 269 aa, chain - ## HITS:1 COG:rhaD KEGG:ns NR:ns ## COG: rhaD COG0235 # Protein_GI_number: 16131742 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 
4-epimerase and related epimerases and aldolases # Organism: Escherichia coli K12 # 26 267 21 265 274 124 32.0 2e-28 MKSILENRPALAKEVNKVAEVAGYLWQKGWAERNGGNITVNITEFVDDEIRQMKPISEVK SIGVTLPYLKGCYFYCKGTNKRMRDLARWPMENGSVIRILDDCASYVIIADEAVVPTSEL PSHLSVHNDLLSKNSPYKASVHTHPIELIAMTHCPKFLEKDVATNLLWSMIPETKAFCPR GLGIIPYKLPSSVELAEATIKELQDYDVVMWEKHGVFAVDCDAMQAFDQIDVLNKSALIY IAAKNMGFEPDGMSQEQMKEMSVAFNLPK >gi|222159291|gb|ACAB01000068.1| GENE 17 25130 - 26149 986 339 aa, chain - ## HITS:1 COG:YPO0334 KEGG:ns NR:ns ## COG: YPO0334 COG0697 # Protein_GI_number: 16120671 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Yersinia pestis # 3 332 5 334 344 166 32.0 6e-41 MEILIGLLIIAIGSFCQSSSYVPIKKVKEWSWESFWLIQGVFAWLVFPFLGSLLGVPQES SLFDLWGAGGAGMSIFYGILWGVGGLTFGLSMRYLGVALGQSISLGTCAGFGTLLPALFA GTNLFEGNGLILLLGVCITLAGIAIIGYAGSLRAQNMSEEEKRAAVKDFALTKGLLVALL AGIMSACFALGLDAGTPIKEAALAGGVEGLYAGLPVIFLVTLGGFLTNAAYCLQQNVVNK TMGDYAKGKVWGNNLVFCALAGVLWYMQFFGLEMGKSFLTESPVLLAFSWCILMALNVTF SNVWGIILREWKGVSNKTITVLIAGLIVLIFSLVFPNLF >gi|222159291|gb|ACAB01000068.1| GENE 18 26153 - 27409 1560 418 aa, chain - ## HITS:1 COG:STM4046 KEGG:ns NR:ns ## COG: STM4046 COG4806 # Protein_GI_number: 16767312 # Func_class: G Carbohydrate transport and metabolism # Function: L-rhamnose isomerase # Organism: Salmonella typhimurium LT2 # 7 418 5 418 419 495 55.0 1e-140 MKKEELIQKAYEIAVERYAAVGVDTEQVLKTMQDFHLSLHCWQADDVAGFEVQAGSLTGG IQATGNYPGKARNIDELRADILKAASYIPGTHRLNLHEIYGDFQGKVVDRDQVEPEHFKS WIEWGKEHNMKLDFNSTSFSHPKSGDLSLSNPDEGIRQFWIEHTKRCRAVAEEMGKAQGD PCIMNLWVHDGSKDITVNRMKYRALLKDSLDQIFATEYKNMKDCIESKVFGIGLESYTVG SNDFYIGYGASRNKMITLDTGHFHPTESVADKVSSLLLYVPELMLHVSRPVRWDSDHVTI MDDPTMELFSEIVRCGALDRVHYGLDYFDASINRIGAYVIGSRAAQKCMTRALLEPIAKL REYEANGQGFQRLALLEEEKALPWNAVWDMFCLKNNVPVGEDFIAEIEKYEAEVTSKR >gi|222159291|gb|ACAB01000068.1| GENE 19 27460 - 28917 1453 485 aa, chain - ## 
HITS:1 COG:BS_yulC KEGG:ns NR:ns ## COG: BS_yulC COG1070 # Protein_GI_number: 16080172 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Bacillus subtilis # 5 471 3 467 485 393 42.0 1e-109 MKQNFFAVDLGATSGRTILGTFIEGGLNLEEINRFPNHLIEVGGHFYWDIYALYRHIIDG LKLVAHRGESIASIGIDTWGVDFVCVGKDGNLLRQPYAYRDPHTVGAPEALFSRISRSEV YGKTGIQIMNFNSLFQLDTLRRNHDSALEAADKILFMPDALSYMLTGEMVTEYTIASTAQ LVNAQTRRLEPELLKAVGLSEKNFGRFVFPGEKVGVLTEEVQKITGLGAIPVIAVAGHDT GSAVAAVPALDRNFAYLSSGTWSLMGVETDAPVINAETEALNFTNEGGVEGTIRLLKNIC GMWLLERCRLNWGDTSYPELISEADACEPFRSLINPDDDCFANPADMEKTIAEYCRATGQ AVPEKRGQVVRCIFESLALRYRQVLENLRSLSPRPIDTLHVIGGGSRNDLLNQFTANAIG IPVVAGPSEATAIGNVMIQAMAAGEATDVAGMRQLINRSIPLKTYQPQDTEAWDAAYIHF KNCVR >gi|222159291|gb|ACAB01000068.1| GENE 20 28936 - 29121 102 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294644504|ref|ZP_06722261.1| ## NR: gi|294644504|ref|ZP_06722261.1| hypothetical protein CW1_4369 [Bacteroides ovatus SD CC 2a] # 1 61 1 61 61 108 100.0 8e-23 MHYFGENMHKTKVFLHKSEKYAKRCKPAFLFCVKMYTLSIKIHKEREEGIAYLCHVGKWI P >gi|222159291|gb|ACAB01000068.1| GENE 21 29153 - 29626 585 157 aa, chain + ## HITS:1 COG:BH2777 KEGG:ns NR:ns ## COG: BH2777 COG1438 # Protein_GI_number: 15615340 # Func_class: K Transcription # Function: Arginine repressor # Organism: Bacillus halodurans # 4 138 3 134 149 94 36.0 7e-20 MKKKANRLDAIKMIISSKEVGSQEELLQELNREGFELTQATLSRDLKQLKVAKAASMNGK YVYVLPNNIMYKRSTDQSAGEMLRNNGFISLQFSGNIAVIRTRPGYASSMAYDIDNNEFS EILGTIAGDDTIMLVLREGVAISKIRQLLSLIIPNIE >gi|222159291|gb|ACAB01000068.1| GENE 22 29654 - 30232 530 192 aa, chain + ## HITS:1 COG:no KEGG:BT_3761 NR:ns ## KEGG: BT_3761 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 192 1 192 192 397 98.0 1e-109 MDTQQIDVMVADASHEVYVDTILETIRNAAAVRGTGIAERTHEYVATKMKEGKAIIALCG DTFAGFTYIESWGNKQYVATSGLIVHPDFRGLGLAKRIKQASFQLARLRWPKAKIFSLTS GAAVMKMNTELGYVPVTFNELTDDEAFWKGCEGCINHDILVAKNRKFCICTAMLYDPTEP RNIKKEQERNNI 
>gi|222159291|gb|ACAB01000068.1| GENE 23 30246 - 31454 1668 402 aa, chain + ## HITS:1 COG:XF0999 KEGG:ns NR:ns ## COG: XF0999 COG0137 # Protein_GI_number: 15837601 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate synthase # Organism: Xylella fastidiosa 9a5c # 5 385 3 384 401 204 34.0 2e-52 MEEKKKKVVVAFSGGLDTSFTVMYLAKEKGYEVYAACANTGGFSEEQLKTNEENAYKLGA VKYVTLDVTQEYYEKSLKYMVFGNVLRNGTYPISVSSERIFQALAIARYANEIGADAIAH GSTGAGNDQIRFDMTFLVLAPNVEIITLTRDMALSRQEEIDYLNKHGFSADFTKLKYSYN VGLWGTSICGGEILDSAQGLPETAYLKHVEKEGSEQLRLTFEKGELKAVNDEKFDDPIKA IQKVEEIGAAYGIGRDMHVGDTIIGIKGRVGFEAAAPMLIIGAHRFLEKYTLSKWQQYWK DQVANWYGMFLHESQYLEPVMRDIEAMLQESQRNVNGTAILELRPLSFSTVGVESEDDLV KTKFGEYGEMQKGWTAEDAKGFIKVTSTPLRVYYNNHKDEEI >gi|222159291|gb|ACAB01000068.1| GENE 24 31451 - 32419 827 322 aa, chain + ## HITS:1 COG:AF2071 KEGG:ns NR:ns ## COG: AF2071 COG0002 # Protein_GI_number: 11499653 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate semialdehyde dehydrogenase # Organism: Archaeoglobus fulgidus # 2 319 1 329 332 215 38.0 1e-55 MIKAGIIGGAGYTAGELIRLLLNHPETEIVFINSSSNAGNRITDVHEGLYGETDLRFTDQ LPLDEIDVLFFCTAHGDTKKFMESHNIPEDLKIIDLSMDYRIMSDDHDFIYGLPELNRRA TCTAKHVANPGCFATCIQLGLLPLAKNLMLTDDVMVNAITGSTGAGVKPGATSHFSWRNN NMSVYKAFEHQHVPEIKQSLKQLQNSFDAEIDFIPYRGDFARGIFATMVVKTKVALEEIV RMYEEYYAKDSFVHIVDKNIDLKQVVNTNKCLIHLEKHGDKLLIISCIDNLLKGASGQAV HNMNLMFNLEETVGLRLKPSAF >gi|222159291|gb|ACAB01000068.1| GENE 25 32536 - 32829 142 97 aa, chain + ## HITS:1 COG:no KEGG:Arnit_1086 NR:ns ## KEGG: Arnit_1086 # Name: not_defined # Def: hypothetical protein # Organism: A.nitrofigilis # Pathway: not_defined # 2 90 27 115 118 104 68.0 1e-21 MKQEFILSKQILRSGTSIGASIRESEFAQSNADFINKLSISLKEANETDYWLNLLKDSDY IDSNAFNSMEIDCGELIALLVSSIKTAKNNQMNHEVT >gi|222159291|gb|ACAB01000068.1| GENE 26 32868 - 33989 1094 373 aa, chain + ## HITS:1 COG:BH2897 KEGG:ns NR:ns ## COG: BH2897 COG4992 # Protein_GI_number: 15615460 # Func_class: E Amino acid transport and metabolism # Function: 
Ornithine/acetylornithine aminotransferase # Organism: Bacillus halodurans # 3 373 4 377 384 251 39.0 1e-66 MKLFDVYPLYNINIVKGDGCKVWDENGTEYLDLYGGHAVISIGHAHPHYTAMISNQVAKL GFYSNSVINKLQQEVAERLGKISGYEDYSLFLINSGAEANENALKLASFYNGRTKIVSFN KAFHGRTSLAVEATHNPSIIAPINNNGHVTYLPLNDMEAMKQELSKGDTCAVIIEGIQGV GGIKIPTTEFMQELRKACSETGTILILDEIQSGYGRSGKFFAHQYNDIKPDLITVAKGIG NGFPMAGVLISPMFKPVYGQLGTTFGGNHLACSAALAVMDVIEQENLIENAKVVGNYLLE ELKKFPQIKEVRGRGLMIGLEFEEPIKELRSRLIYDEHVFTGASGTNVLRLLPPLCLTME EAEDFLARFKKVL >gi|222159291|gb|ACAB01000068.1| GENE 27 34126 - 34899 967 257 aa, chain + ## HITS:1 COG:lin0414 KEGG:ns NR:ns ## COG: lin0414 COG0345 # Protein_GI_number: 16799491 # Func_class: E Amino acid transport and metabolism # Function: Pyrroline-5-carboxylate reductase # Organism: Listeria innocua # 2 256 3 259 266 137 34.0 1e-32 MKIAIIGAGNMGGSIARGLAKGSLIDDSDIIVSNPSAGKLEKLKKEFPGISITNSNVEAA TGADIVILAVKPWFMEPVMRELKLKSKQILISVAAGISFEELAHYVVAPEMAMFRLIPNT AISELESMTLVAARNTNDEQDKFILRLFSEMGTVMLIPEDKIAAATALTSCGIAYVLKYV QAAMQAGIEMGLRPKDAMQMIAQSLKGAAALIQNNDTHPSVEIDKVTTPGGITIKGINEL EHNGFTSAIIKAMKASK >gi|222159291|gb|ACAB01000068.1| GENE 28 34940 - 35494 589 184 aa, chain + ## HITS:1 COG:MTH700 KEGG:ns NR:ns ## COG: MTH700 COG1396 # Protein_GI_number: 15678727 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Methanothermobacter thermautotrophicus # 1 184 1 182 182 166 49.0 2e-41 MNEQIKQIAERLRGLRDVLELTAEDIARDSDISAEEYRLAETGDYDISVSMLQKIARTYN IALDALMFGEEPKMSSYFVTRAGKGVSIERTKAYKYQSLASGFMNRTADPFIVTVEPKAN DEPIHYNQHNGQEFNLVIEGRMLINIEGKEIILNQGDSIYFNSKLPHGMKALDGKTVRFL AVIM >gi|222159291|gb|ACAB01000068.1| GENE 29 35625 - 37280 1752 551 aa, chain + ## HITS:1 COG:MA2912 KEGG:ns NR:ns ## COG: MA2912 COG0365 # Protein_GI_number: 20091733 # Func_class: I Lipid transport and metabolism # Function: Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases # Organism: Methanosarcina acetivorans str.C2A # 1 550 7 558 560 706 59.0 0 
MVERFLSQTSFSSQEDFIKNLKINVPENFNFGYDVVDAWAAEQPNKNALLWTNDKGESRQ FSFADMKRSTDMTASYFQSLGIGRGDMVMLILKRRYEFWYSTIALHKLGATVIPATHLLT KKDIIYRCNAADIKMIVAAGEGIILQHIKDALPECPTVEKLVSVGPEIPEGFEDFHQGIE NAAPFVRPRHANTNDDISLMYFTSGTTGEPKMVAHDFTYPLGHIVTGSFWHNLDENSLHL TIADTGWGKAVWGKLYGQWIAGANIFVYDHEKFTPAAILEKIQEYHVTSLCAPPTIFRFL IHEDLTKFDLSSLKYCTIAGEALNPAVFETFKKLTGIKLMEGFGQTETTLTIATMPWMEP KPGSMGLPNPQYDVDLIDHEGRSVEAGEQGQIVIRTSKGKPLGLFKEYYRDAERTHEAWH DGIYYTGDVAWKDEDGYLWFVGRADDVIKSSGYRIGPFEVESALMTHPAVVECAITGVPD EIRGQVVKATIVLSKDYKARAGEELIKELQNHVKKVTAPYKYPRVIEFVEELPKTISGKI RRVEIRENDKK >gi|222159291|gb|ACAB01000068.1| GENE 30 37498 - 37698 158 66 aa, chain + ## HITS:1 COG:no KEGG:BT_3747 NR:ns ## KEGG: BT_3747 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 66 1 66 66 105 90.0 4e-22 MITSRKISQINVFAGSPWEVASVKSLLNAAYIQVSMKDNGLNSILISVPCEYYTAAMRVI NNRKVS Prediction of potential genes in microbial genomes Time: Wed May 18 02:46:07 2011 Seq name: gi|222159290|gb|ACAB01000069.1| Bacteroides sp. D1 cont1.69, whole genome shotgun sequence Length of sequence - 3036 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 22 - 456 280 ## BT_3564 hypothetical protein - Prom 663 - 722 5.0 + Prom 1295 - 1354 2.6 2 2 Tu 1 . 
+ CDS 1475 - 3035 1110 ## BT_4162 hypothetical protein Predicted protein(s) >gi|222159290|gb|ACAB01000069.1| GENE 1 22 - 456 280 144 aa, chain - ## HITS:1 COG:no KEGG:BT_3564 NR:ns ## KEGG: BT_3564 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 110 1 110 146 65 35.0 6e-10 MSQWISIEAAAEKYRLEKEYIWLWVEMKKITVSYENDTVSIDDDSIQQFIKRTKLGITSE YIDELEQLCMEKNKTSRLYASLLNMRDQELMAIRGQSSRLDGLWKMVEEQYERLRSFEKN SMLDNAICSNCWIRKICRRLKRIL >gi|222159290|gb|ACAB01000069.1| GENE 2 1475 - 3035 1110 520 aa, chain + ## HITS:1 COG:no KEGG:BT_4162 NR:ns ## KEGG: BT_4162 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 520 1 519 692 844 77.0 0 MKITQIRENGDTEALSVTDIDLLIEKMKKETKLRPVTGLRQALHFVLPDEPCSLANKLPR VIPAAAFGRVNGVKRMKTYNGIVELTIGPLAGKTEVEIVKQKAAELPQTMLAFMGASGKS VKIWTCFTRPDGTLPQTTEEAEVFQAHAYRLAIKCYQPQLPFNILLKEPKLEQFSRLSYD PDLIYRPTPVPFYLSQPIGMPGEMTYHEKSSTEASPLNRALPGYDTEDTVALLYEAALRK TFEEMETDWHRNDHDLQTLVVPLAENCYYSGIPEEEVTRRTIMRYYKRKNPMLIREMIRN VYKECKGVPKGSCLTKEQRLSLQMDEFMNRRYEFRYNTQIGEVEYRERFSFQFYFHPIDK RAQNSIMLDAQSEGIGVWDRDIDRYLHSNRVPIYNPLEEFLFHLPHWDGKDRIHALANRV PCKNPHWELLFHRWFLNMVSHWRGVDKMHANNTSPILVGRQGTHKSTFCREMIPPALRAY YTDSIDFSHKRDAELYLNRFALINIDEFDQITLPQQGFLK Prediction of potential genes in microbial genomes Time: Wed May 18 02:46:26 2011 Seq name: gi|222159289|gb|ACAB01000070.1| Bacteroides sp. D1 cont1.70, whole genome shotgun sequence Length of sequence - 48383 bp Number of predicted genes - 37, with homology - 36 Number of transcription units - 13, operones - 7 average op.length - 4.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 523 376 ## BT_4162 hypothetical protein + Term 537 - 585 12.3 - Term 525 - 573 10.2 2 2 Tu 1 . - CDS 809 - 3121 1748 ## COG1472 Beta-glucosidase-related glycosidases - Prom 3149 - 3208 4.9 3 3 Op 1 . - CDS 3287 - 4501 977 ## BT_3986 putative patatin-like protein 4 3 Op 2 . 
- CDS 4513 - 5532 861 ## BT_4711 hypothetical protein
5 3 Op 3 . - CDS 5551 - 6564 794 ## BT_4709 glycosyl hydrolase
6 3 Op 4 . - CDS 6607 - 8208 1108 ## BT_4708 hypothetical protein
7 3 Op 5 . - CDS 8221 - 11583 2540 ## BT_4707 hypothetical protein
- Prom 11610 - 11669 3.0
8 4 Tu 1 . - CDS 11764 - 12735 748 ## COG3712 Fe2+-dicitrate sensor, membrane component
- Prom 12824 - 12883 8.5
- Term 12785 - 12823 3.8
9 5 Tu 1 . - CDS 12885 - 13478 396 ## BT_3748 RNA polymerase ECF-type sigma factor
- Prom 13631 - 13690 5.0
- Term 13659 - 13720 6.4
10 6 Op 1 . - CDS 13774 - 14913 847 ## BT_4407 hypothetical protein
11 6 Op 2 . - CDS 14949 - 16070 932 ## BT_4406 hypothetical protein
12 6 Op 3 . - CDS 16104 - 17645 1148 ## BT_4405 hypothetical protein
13 7 Tu 1 . - CDS 17760 - 21083 3008 ## BT_4404 hypothetical protein
- Prom 21103 - 21162 6.0
14 8 Op 1 6/0.000 - CDS 21245 - 22231 676 ## COG3712 Fe2+-dicitrate sensor, membrane component
15 8 Op 2 . - CDS 22302 - 22844 395 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
- Prom 22927 - 22986 11.3
- Term 23067 - 23109 11.2
16 9 Tu 1 . - CDS 23148 - 23351 61 ##
+ Prom 23007 - 23066 12.2
17 10 Op 1 . + CDS 23302 - 24336 486 ## gi|237715284|ref|ZP_04545765.1| conserved hypothetical protein
+ Prom 24388 - 24447 2.6
18 10 Op 2 . + CDS 24526 - 24702 181 ## gi|256839139|ref|ZP_05544649.1| conserved hypothetical protein
+ Term 24732 - 24786 8.7
+ Prom 24968 - 25027 6.6
19 11 Op 1 . + CDS 25184 - 25468 314 ## gi|262405128|ref|ZP_06081678.1| predicted protein
20 11 Op 2 . + CDS 25461 - 26846 603 ## PG0871 hypothetical protein
21 11 Op 3 . + CDS 26906 - 27955 554 ## PG0870 hypothetical protein
+ Prom 27960 - 28019 4.0
22 12 Op 1 . + CDS 28109 - 28456 235 ## PG0869 mobilization protein
23 12 Op 2 . + CDS 28453 - 29460 352 ## BDI_3505 mobilization protein BmgA
24 12 Op 3 . + CDS 29457 - 29909 104 ## gi|237715290|ref|ZP_04545771.1| predicted protein
+ Term 29916 - 29961 4.0
+ Prom 29953 - 30012 3.9
25 13 Op 1 . + CDS 30056 - 30292 186 ## BDI_2133 hypothetical protein
26 13 Op 2 . + CDS 30294 - 30764 205 ## gi|237715292|ref|ZP_04545773.1| predicted protein
27 13 Op 3 . + CDS 30768 - 31532 389 ## DhcVS_262 hypothetical protein
28 13 Op 4 . + CDS 31519 - 32076 418 ## gi|237715294|ref|ZP_04545775.1| predicted protein
29 13 Op 5 . + CDS 32088 - 35747 2144 ## DhcVS_261 hypothetical protein
30 13 Op 6 . + CDS 35765 - 39655 2695 ## COG1002 Type II restriction enzyme, methylase subunits
31 13 Op 7 . + CDS 39667 - 41016 764 ## COG3950 Predicted ATP-binding protein involved in virulence
32 13 Op 8 . + CDS 41013 - 41762 368 ## PCC8801_0627 hypothetical protein
33 13 Op 9 . + CDS 41773 - 44229 878 ## DhcVS_259 hypothetical protein
34 13 Op 10 . + CDS 44233 - 45570 758 ## COG4930 Predicted ATP-dependent Lon-type protease
35 13 Op 11 . + CDS 45467 - 46372 536 ## COG4930 Predicted ATP-dependent Lon-type protease
36 13 Op 12 . + CDS 46384 - 47496 374 ## BT_4745 hypothetical protein
37 13 Op 13 .
+ CDS 47484 - 48381 341 ## BT_4745 hypothetical protein Predicted protein(s) >gi|222159289|gb|ACAB01000070.1| GENE 1 2 - 523 376 173 aa, chain + ## HITS:1 COG:no KEGG:BT_4162 NR:ns ## KEGG: BT_4162 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 172 521 692 692 259 71.0 3e-68 ILQKPVVNLRKPHGRSVLELQRYASFIGTSNQKDLLTDPSGSRRFICIEVTGNIDTTQPI DYEQLYAQAMHEIHHGERYWFDSEDEQIMTENNREFEQTPAMLQLFYQYFKTAQTKEEGE FLTPVEILNYLKKKSGMSLSDNKVYHFGRLLQKCGIPSKHTYKGTVYQVIKLS >gi|222159289|gb|ACAB01000070.1| GENE 2 809 - 3121 1748 770 aa, chain - ## HITS:1 COG:TM0076 KEGG:ns NR:ns ## COG: TM0076 COG1472 # Protein_GI_number: 15642851 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 18 763 4 752 778 565 42.0 1e-160 MMTGVAYSFAQQLPAVTYKNPTLPVETRVADLLSRMTLEEKVGQLLCPLGWEMYEIKRND VFPSDKFKQLIKDRKAGMLWATYRADPWTKKTLSTGLNPALATKAGNALQRYVIENTRLG IPLFLAEEAPHGHMAIGATVFPTGIGMAATWSPQLIREVGKAIGKEIRLQGGHISYGPVL DLARDPRWSRVEETFGEDPVLTGEIGKAMVEGLGGGDLSHPYSTLATLKHFLAYGISESG QNGNPSFAGIRELHENFLPPFRQAIDAGALSVMTSYNSMDGVPCTANHSLLTELLRNEWK FRGIVVSDLYSIEGIHQSHFVAPTMEEAAILALSAGVDVDLGGDAYMNLMNAVNTGRISK TALDASVARVLRLKFEMGLFENPYVDPEKAKKEVRSEESVTLARRVAQASITLLKNEHSL LPLNKNRKVALIGPNADNRYNMLGDYTAPQEEENIKTVLDGIRAKLSSSQVEYVKGCSIR DTVTTDIEQAVAAAQRSEVIIAVVGGSSARDFKTSYKETGAAIADEKTISDMECGEGFDR ATLSLLGKQQELLKALKATGKPLIVVYIEGRPLDKNWASENADAVLTAYYPGQEGGIAIA DVLFGDFNPAGRLPFSVPRSVGQIPLYYNKKAPQSHDYVEMSASPLYPFGYGLSYTSFDY SDLHLSALMPRSFEISFKVRNTGKYDGEEVAQLYLRDEYASVVQPLKQLKHFARFYLKRG EEREVKFILSEEDFSLVDRNLKKIVEPGTFQIMIGAASNDIRLQTKVEIK >gi|222159289|gb|ACAB01000070.1| GENE 3 3287 - 4501 977 404 aa, chain - ## HITS:1 COG:no KEGG:BT_3986 NR:ns ## KEGG: BT_3986 # Name: not_defined # Def: putative patatin-like protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 397 1 376 384 137 30.0 6e-31 MKKYTLLSLLALLLCVSGCKTEPDYSDAVYMTGTLTSSNVKFLVEGQSTLGLTVTSTDKT 
DTDVKVGVQVAPQLLESFNASTGRNCQMPPEGSYSFEGGEVVIPAGQNQSTQIKVTADSE KLQEGVSYCLPVSITSVSNSDLKVMETSRTAYVMLTKVINIKAAYLARRGYFNIPSFGDQ EKSPVKALGQMTLEMKVLPVSFPVGSERNANGISSLCGCEENFLFRFGDGAGNPVNKLQF VKGSIGSASHPDKKDHYESWVEKEFPTGHWLHFAAVYDGQYLRLYLDGEQIHFVETKNGG TINLSMAYDGHTWEDTFSIGRSAGNARFFDGYISECRVWNVARTSAQIEDGVCYVDPTSE GLIAYWRFDGETQDDGSVLDMTGHGHNAVPGGTITYVDNQKCPF >gi|222159289|gb|ACAB01000070.1| GENE 4 4513 - 5532 861 339 aa, chain - ## HITS:1 COG:no KEGG:BT_4711 NR:ns ## KEGG: BT_4711 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 329 4 311 320 79 25.0 3e-13 MTMKNRKNLFYISMFAICVLVFTACGDDETYDVYGDPYNRVYVLDNSKEYKIVQTPISTV SNVDFTWEAKCSKKASGDIRVTVAVDNSLIDAYNEEHDTEFEALPVEAIVLENKEMTIPA GEMVVADAVHLKLTDDVNVLSTLKSEKGYIVPLRLVSAEGGNSQLSTNMLAPSFLTITVT EDNMNHEATQYTGTGTLVADQTGWTATTNGTVQSWYEPIEAIFDGNYETYCSISNRSGEL LLDIDMGKAYSFDSIKMTSSGYDYETWTEKEVGAFSAGMTVSTSDNGTDWKTQGEIERNA EDCVFYAPLTARYIRITVPNAGGWYGASLECGVFNVYAK >gi|222159289|gb|ACAB01000070.1| GENE 5 5551 - 6564 794 337 aa, chain - ## HITS:1 COG:no KEGG:BT_4709 NR:ns ## KEGG: BT_4709 # Name: not_defined # Def: glycosyl hydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 337 1 312 313 155 32.0 2e-36 MKTMKYLLVALATGSLMAACNTDIESLTIQRPLTYDDQYYQNLRDYKASDHEIAFGWFAQ YGAQNSAAVRFMGLPDSLDICSMWGGIPAKENTEIWDEIRFVQKVKGTKMLCVAITRIDA ETDEHDFKKAYNEAKAMPEGDERTVALNRSFEMYAEYFLDQVFLNDLDGFDADYEPEGDF LSGSNFEYFYKHMAKYMGPNPDITKEERLQLIEERYGKEIASQEGICDKMLNIDQTSTSM TSLVPYSNYCFLQAYSGGTGAGGWPDEKVVYCCNMGDGWQGDMQSMYNQARYKPANGKRK GGFGAFFIHRDYNVHEYNPEPYYRFRQCIQIQNPAVH >gi|222159289|gb|ACAB01000070.1| GENE 6 6607 - 8208 1108 533 aa, chain - ## HITS:1 COG:no KEGG:BT_4708 NR:ns ## KEGG: BT_4708 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 33 532 33 529 531 360 42.0 1e-97 MKQISKYIKVVNGTLLAGGLLLTACTSHFESWNINPNEVTPEQMERDNLKTGAYFTQMER GIFVVGKDKGGLFEETQMLTGDIFASYFAPVKTWDYAGTEDNDCYKLYRQWYNSPFNNAY 
TEVMQPWQSIVENTDEVSPARALATIVKVFGMSRITDKYGPIPYSKFGTGIHVAYDSQKD VYYRFFEELADAIDVLTGYNSRTSEPYMERYDYIYNGRVEKWIKFANTLRLRLAMRISYV DETKARTEIEAAIGHSIGLMTSVDDNAVLKQSASFTFINEWWEAFESFNDFRMSATMDCY LKGLQDPRLACYFKAAVKDGAYHGVRNGQTSRNQGTLSEAASSMNVEQNDNIEWMGAAET YFLLAEAKLRLNLGDKTVQEYYEEGVRISFSSKGATGADTYLADDTNVPATSFVDPTTER STDVSSMVSNLTVKWNDNATESKKLERIMVQKWIALFPDGQEAWSEMRRTGYPGIVTISS NRSGGEVPEGELISRLKFPTTEYSDNGENTQAAVSLLKGTDIAGTRLWWDVKR >gi|222159289|gb|ACAB01000070.1| GENE 7 8221 - 11583 2540 1120 aa, chain - ## HITS:1 COG:no KEGG:BT_4707 NR:ns ## KEGG: BT_4707 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 27 1120 1 1099 1099 979 50.0 0 MQNYCIVQFLGGLKSLRRTLKEISLAMKLLLMLLVCSVGLTYASDGYAQRNSISLNVKNS TIEEVLHTIERESGFGFFINSKNLNLNRRITVSVSDKNIFQVLKQVFEGTNIEYKILDNQ IVLTAKETEVAQQKVQNVKGTVVDKNGEPLIGVSIVEKGTTNGTVTDLDGNYTLTVQTAN PVLLFSYVGYQKKELPVVGSVLNVTLEDDSQVLNEVVVTALGIRKEAKALSYNVQEVSAS EIVGVKDANFVNSLSGKIAGVVINSSSSGIGGGAKVVMRGAKSLSGNNNALYVIDGIPMP SLETTQPDDFMTGMGQSGDGASMINPEDIETMSVLSGAAASALYGSDAANGVIMITTKKG TKDKLRVSYANNTSFFNPFVTPEFQNTYGATTGELKSWGQKMGQPSNYDPLDFYQTGWNE TNSLTISNGNEKNQTFLSMAATNAAGIVPNNTLDRYNFTIRNTTSMLNDRLRLDLSASYM NVREQNMVSQGQYFNPIVPVYLMSPSYSLDMLQQFEMYNETRNFKTQYWPWGNQGIALQN PYWIVNRDNFVNHKNRFLISGGLNFEITKGITLGARAKMDYTSALFEKKYSASTDNIFCQ DFGGYYKGDASTRQLYGDVMLNIDKYFGDFSLTATLGTSIQDVNYQYFDVGGSLNSVANS FTTLNLMQSTVKFQQDGYHDQTQSIFATAQLGWKSKLYLDVTGRVDWSSALAWTDAKSVA YPSVGLSAILTELLPIKNDVFTFLKVRGSYSEVGNAPTRYIAYQTYPYESASPTTATTYP NTDIKPERTKAWEVGLQSHFWNDKLELNVSLYKTSTYNQLFNPSLPSSSGYSSIYINGGQ VDNKGIEASLTLNQPLGPVKWNSTFTYTLNRNKIKKLLKPTTLSGGEVVSQDMMDLGGLE IVKSRLFEGGSIGDLYVTALRTDHHGYIDVDYVNNTVAIDDKAGERKDGWIYAGNSQAKY MMGWRNSFSWKGLTLSCLINARIGGVVVSQTQALMDAYGVSKATATARDLGYVLIDGHEV PAVQKYYSTVGSGVGSMYVYSATNVRLAELSLGYDVPVQKIIPWIQSMNVAFTGRNLFMF YCKAPYDPELTASTGTHFSGMDYFMLPSLRNLGFSVKLNF >gi|222159289|gb|ACAB01000070.1| GENE 8 11764 - 12735 748 323 aa, chain - ## HITS:1 COG:PA1364 KEGG:ns NR:ns ## COG: PA1364 COG3712 # Protein_GI_number: 15596561 # Func_class: P 
Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 107 295 74 255 280 70 27.0 3e-12 MDKELLLKYIAGKASQKEKEDVATWIDADAANLKEFISLRKSYDAFIWQDTDAFSKKTKK TISLHPVTQRILRIAAVFVVAFGLSYAMIQVLQKEDIEMQTVYVPAGQRTLVTLADGTTV WVNGKSTLTFPNCFSSRTRKVELDGEAYFDVRKDPEKQFIVSTAHQSAIKVLGTKFNVKA YKEADEVITTLVEGKVNFEFNNASQQPQYIVMAPGQKLVYYFQDGKTELYTTSGERELSW KDGKIIFRQTSLRDALDILADRYNAEFVVRENVPHDDSFSGTFTNRNLEQILNFISVSSK VRWRYLNNNGTAGKEKIKIEIFI >gi|222159289|gb|ACAB01000070.1| GENE 9 12885 - 13478 396 197 aa, chain - ## HITS:1 COG:no KEGG:BT_3748 NR:ns ## KEGG: BT_3748 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 197 1 180 180 297 85.0 1e-79 MSKMYFFAAFNTYGYTGMAQINFNSIYTAYYRKAFLFTLSYVHNDLVAEDIVSEAIIHLW ELSKEREIPSVEAILITYIRSKSLNYLKHIQAQENVFQTLLDKGQRELEIRISTLEACDP KEILSEELRAKVHALLESMPEKTRIAFIHDRLDGKSHKEIAEELGISVKGVEYHISRAVK MLRDNLKDYAPFLFFFI >gi|222159289|gb|ACAB01000070.1| GENE 10 13774 - 14913 847 379 aa, chain - ## HITS:1 COG:no KEGG:BT_4407 NR:ns ## KEGG: BT_4407 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 20 349 21 345 391 219 39.0 1e-55 MNKHIFYYLMVGSMALLLGACNDSMNDLLEPKVYFESKEYNFSVEDEMDVMTFDLVSRLS SATSSQVDVSYSVAEPSVVDEYNAKYGTNYEMLDVSQVKLSSTTSSISSGKLYADNVEVE LSGLEALKAGNSYVLPMRVHSSSVSTLSGTNIAYFFFSKPLKITKAGNFSNHYISVKFPV GTFFSSFTYEALINVDYFLDNNTIMGTEGVMILRIGDAGGGITPKDYLEVAGRQNYRVTK PLLTNRWYHVALTYDQPTGKTGIYVNGEKWAGSDWGIDGFDPNSDMGFYIGRIYGFKWGE RPFHGKMSEVRVWSVARTENQLKQNMLGVDPASEGLALYYKLDGSETQEGGVIKDATGRI NGTTNGITIKTLDAPIAIN >gi|222159289|gb|ACAB01000070.1| GENE 11 14949 - 16070 932 373 aa, chain - ## HITS:1 COG:no KEGG:BT_4406 NR:ns ## KEGG: BT_4406 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 372 4 388 388 286 43.0 1e-75 MKSLRKLLYIIPLTGMMVTSCMDVENIEIDHIGGYATMNNAESEAYYANLRAYKATAWNY 
NRPVAFGWYSNWAPAGAYRRGYLSAMPDSMDFVSMWSGAPGRYEITPEQKADKEFVQKVK GTKLLQVSLLSYLGKGATPNSVYLEVEKQAEEEGWSAAQLETAKKQARWKYWGFEGQFES ENHYACLAKFAKALCDSLYANEWDGYDVDWEIGSGVFDMDGTLSQNKHLIHLVKEMNNYI GPKSDPEGKGHKMICIDGNIYGLTHELDEYVDYWIIQSYGSSNPGFDGYGVDPKKIICTE NFEKYATNGGQLLKQAAAMPREGYKGGVGAYRFDNDYDNTPNYKWMRQAIQINQRVFNEW KAKQNEAENKPQE >gi|222159289|gb|ACAB01000070.1| GENE 12 16104 - 17645 1148 513 aa, chain - ## HITS:1 COG:no KEGG:BT_4405 NR:ns ## KEGG: BT_4405 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 512 34 538 542 629 64.0 1e-179 MSDGEGAMDGFEVGGLITAMQRTVIPVGTQADDTDVINEYQIAYHLSADNWSGFFGENNS NGWNAGSNNTTYYLLDNWIKATYTQSYTNALDPWKKLKIASEKNGTPEVFALAQILKISA WHKTLESFGPMPYSHAADATMNIPFDSEKEVYTAMFEELTAAIEELTEKAENGVNVMGAY DAVYAGDATKWVKYGNSLMLRLAMRVRFADAELAKKYATQAVNHSIGVMTAKDDAAQMSQ GAGMTFRNNIEWLAGNYNEARMGSSIFSYLMGYEDPRLSVYFLPMDGNASYGVEAFDGKT YQAVPAGHANAQNDIYKSCSKPNIQSGTPTYWLRASEVYFLRAEAALVWEGFGSANSWYK QGIDMSFQENGVTDPVDDYMNSNLSPRAYVFSHYQYGQTLSAPCETTAKFEGTTEQKLEK IMIQKWIALFPNGQEAWTEWRRTGYPKLNVIKTLKGAVQGATLEGGIRRMIYPTSFSQTN EGKAIYEAALKLFNNGAGGEDKSSTRLWWDCKR >gi|222159289|gb|ACAB01000070.1| GENE 13 17760 - 21083 3008 1107 aa, chain - ## HITS:1 COG:no KEGG:BT_4404 NR:ns ## KEGG: BT_4404 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1107 1 1106 1106 1800 82.0 0 MLQIYEKIAEKIAHKRVFLTFLSLILIQTFALAQNDSKITIQQKNITVIDALKTVEKQSK KSINYSDSELKGKVIAQLNLQNASLETALDTILKGTGFSFQIQGNYIIIAEKKPVVAQTL KNIKGKVTDESGEPLIGVNISVDGSSTGTITDFDGNFTIKAAEKSILKVSYIGYAAQIIP VSKKDFYPVVMKQDTEVLDEVVVTALGIKRAEKALSYNVQQVKGDALTTVKDANFVNSLN GKIAGVSINKSASGVGGATRVVMRGAKSIEGDNNALYVVDGIPLFNTNMGNTDSGIMGEG KAGTEGIADFNPEDIESLSVLSGPSAAALYGSSAANGVILITTKKGKEGKLSVQFSSSSE FSKAYMTPEFQNTYGNKKDMYESWGEKLLTPTSYDPKKDFFNTGTNFINSVTLTTGTKSN QTFASVSSTNSKGIVPNNEYDRLNFTIRNTATFLNDKLQLDLGASYVKQKDKNMVSQGQY WNPVMAAYLFPRGEDFNGIKSFEHFDESRQLPVQYWPVADPVYASQNPYWTAYRNVATNE KSRYMFNVGLTYNITDWLNATARFRMDDTHVLFERKIYASSDDKFAEGKKGLYGYNNYED 
RQEYADFMLNVNKHIADFSISANAGWNYSNYWALERGYKGTLLGVPNKFAASNIDPANGR ISEKGGDSRVRNHAVFANLELGWKSMLYLTLTGRNDWNSRLVNTDEESFFYPSIGLSGII SEMVKLPEFISYLKVRGSYTEVGAPVSRSGLTPGTVTTPIVGGALDPTGIYPFTDFKAER TKSYEFGLSLKLWNKLSAEVTYYHSNTYNQTFLGDLPEFTGYKQIYLQAGNVENRGWEAS LSYSDQFKFGLGISSTLTFSRNINEIKEMVENYHTDLMDEPINIPEVLKDGGRVILKEGG SIHDIYANTFLKKDHLGYVEVKSDGTFGMEKGEPVYLGKTSPDFNMGWSNMLTYKGFGLG FQINGRFGGVVTSSTEALLDRFGVSKRSAEAREAGGVLLKGQGLVDAKSYYQMTGTGNYE TSGYYVYSATNIRLQELTFSYTMPNKWFGNVLKDVTVSFIANNPWMLYCEAPFDPELTPS TSTYGQGNDYFMQPSVRSFGFGIKFKL >gi|222159289|gb|ACAB01000070.1| GENE 14 21245 - 22231 676 328 aa, chain - ## HITS:1 COG:AGl2289 KEGG:ns NR:ns ## COG: AGl2289 COG3712 # Protein_GI_number: 15891252 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 120 273 121 272 323 89 35.0 8e-18 MKNYIQQLVELFGHNSYSAGTQKKVQQWLADEEHVDEKSEALRELWKQAGEQKVPDGMQQ SIQRMRQNLGMQSITSRRNYQLLIWRAAAIFLLAVSSVSIYLMLEKERPEKDLVECYIPT AEIRELTLPDGTYVMLNSKSILLYPEKFTGETRSVYLIGEANFKVKPDKKHPFIVKANDY QVTALGTEFNVNAYPENSELMATLLEGSVKVEFNNLLSNIILKPNEQLVYDKHTKAHNLR MPEIDDVTAWQRGELVFSNMYLEDIFTSLERKFPYAFVYSLHSLKKNTYSFRFSKQANLE EVMKIISQVVGNVNYVIKGNKCYVTSKE >gi|222159289|gb|ACAB01000070.1| GENE 15 22302 - 22844 395 180 aa, chain - ## HITS:1 COG:PA2426 KEGG:ns NR:ns ## COG: PA2426 COG1595 # Protein_GI_number: 15597622 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 6 172 8 171 187 60 27.0 2e-09 MENNDRQIELKFQRFFTVNFPKVKNFAQMLLKSEADAEDVAQDVFCKLWLQPELWLDNDK ELDNYIFIMTRNIVLNIFKHQQVEQEYQSEVIEKTLLYELTEKEEILNNVYYKEMLMIIQ LTLEKMPKRRRLIFELSRFRGLSHKEIADKLDVSIRTIEHQVYLALIELKKVLLFFIFFL >gi|222159289|gb|ACAB01000070.1| GENE 16 23148 - 23351 61 67 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVPSDFVVLLNNVNPFITYHYLMWCKYIKIAPLNCNIAPQKCNLVQKINIYKYVYQCIRF YLRCMWV >gi|222159289|gb|ACAB01000070.1| GENE 17 23302 - 
24336 486 344 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715284|ref|ZP_04545765.1| ## NR: gi|237715284|ref|ZP_04545765.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 344 1 344 344 629 100.0 1e-178 MKGLTLFRRTTKSEGTIKLRFRLRDGRGVDLYHKSEIKADLKDLSKFEDDGTLRPKVSIY NHELKEKIDKEIKAIEAAYIELCGKMDKSLITADLFEETISRCLHPEDYIAITKEETLLE RYNRFIKEGFRDGTFAEKRVLQYNGLYKELKRYLTIHNQTNILPKHFTADDLMDLRIFLF EEHKYVDKYPSLYADVKQQCLPTKERAQNTVAIKLRMLQAFYTELEERDEIEVSPFRKLG RNRRKAVMKESYDAPIYLLQEELLKVMNTEVPEVLQEAKECFLLQCAFGSRIEDYKALSM DKVTVSTDGIPYLRYLPQKTLKSNEHKDEVEIPIVRYALDLIKK >gi|222159289|gb|ACAB01000070.1| GENE 18 24526 - 24702 181 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|256839139|ref|ZP_05544649.1| ## NR: gi|256839139|ref|ZP_05544649.1| conserved hypothetical protein [Parabacteroides sp. D13] # 1 56 406 461 461 70 53.0 3e-11 MLTKVQVNMYVSGHHKEGSNAVKHYSSLSLKDRFILMSAAYKQPLYKVDKELNIIEEV >gi|222159289|gb|ACAB01000070.1| GENE 19 25184 - 25468 314 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262405128|ref|ZP_06081678.1| ## NR: gi|262405128|ref|ZP_06081678.1| predicted protein [Bacteroides sp. 
2_1_22] # 1 94 25 118 118 155 100.0 8e-37 MMNEFNKSTQTLMIVSKDDLENVVQNAVNRLLEKRENKPEVYLSVEETARRLKVDRSTLW RWNKDGYLTTTKVGNKVRYKLSDVERIQKGEVYE >gi|222159289|gb|ACAB01000070.1| GENE 20 25461 - 26846 603 461 aa, chain + ## HITS:1 COG:no KEGG:PG0871 NR:ns ## KEGG: PG0871 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis # Pathway: not_defined # 1 454 1 455 458 383 43.0 1e-105 MNETRITAKGIMEDADKLINQVCKSQFPVEIFPTMIRKIIWKMKEALTYPVDYTASSMLV AIAVGIGNTHALKFKSEWTVKDILYMALVGRPGANKSHPLRTVLRPYFEFDMEQYRKFSE ELNRYNKIMEMNKKERLEQGYETAPTSPVRKRFLVSDITQEAMVKALAENRRGICLYMDE LQAWVKNFTRYSNGSEEQFYISLFNGSFYISDRKGDTNNNCIDDPFACIIGTIQPTILTE MLKGSKSNNGFAERILFAIPEEQEKSYWNDTDMDASYYDDWHNIIQKLIHLKISTDETGR VIPTTIEYEPEAKKLLLEWQRNWTDMINEEESDKKKGIYSKFEIFIHRFCIIIQVCKWLC NEGSADKVDFDTVTKAIKLVDYYKGTALLVNDMMNGIFLTAKQKELLDKLPEEFSRAEGL EIAIQQEWNPSTFDRFLKKASTKYLIHRHGHYQKALDKMPN >gi|222159289|gb|ACAB01000070.1| GENE 21 26906 - 27955 554 349 aa, chain + ## HITS:1 COG:no KEGG:PG0870 NR:ns ## KEGG: PG0870 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis # Pathway: not_defined # 1 343 1 336 352 388 51.0 1e-106 MTNYRFILEPYNRSKRNRYTCPSCGKKFQFSRYIDIESSIEFPDYVGKCNRENKCGYHYS PSMYFDKNPNEREKTREQSFSPPTTKDVKPFTPEAMDASFIDRRIMEQSLRHYERNNLYV YLCGQLGTKTAIQQMQMYHVGTAKKWGNSTVFWQVDTEGRVHTGKVMLYNPNTGKRVKEP YAKISWVHTILNLPAFTLNQCFFGEHLLAGNNKPIAIVESEKTAIIASVYIPNYIWLATG GKNGCFNEHHIDILCGRNVVLFPDIGMEDEWQKKACLMRRKGINVILSDYLGQHASAEEK ENGYDIADYLIKEKSGEAILQSMCQKNPSLKKLIELLNLELVDFHFESK >gi|222159289|gb|ACAB01000070.1| GENE 22 28109 - 28456 235 115 aa, chain + ## HITS:1 COG:no KEGG:PG0869 NR:ns ## KEGG: PG0869 # Name: not_defined # Def: mobilization protein # Organism: P.gingivalis # Pathway: not_defined # 1 114 3 116 117 80 41.0 2e-14 MNDKNKGGRPQLTRVEKRSEIFTIKCTPIEKHTIKGKAKMSGVKPAEYIRECAVNGSIKS RLTSEEIKYIRALAGMANNLNQLAKNANTYGYHHVATEADHLMSEVDNIIKKIRG >gi|222159289|gb|ACAB01000070.1| GENE 23 28453 - 29460 352 335 aa, chain + ## HITS:1 COG:no KEGG:BDI_3505 NR:ns ## KEGG: BDI_3505 # Name: not_defined # 
Def: mobilization protein BmgA # Organism: P.distasonis # Pathway: not_defined # 1 234 1 235 313 216 45.0 1e-54 MIAKIMKRSSFGGVVNYVFKDGKDAKLLASDGVRSNTLQNIIACFNDQASQNSKVRNIVG HTALSFSEKDKHQLNDERMVQIAHDYMEKMGIKNTQYIIVRHFDRDHPHIHIVYNRVDND GHTISDSNDQIRSAAICKQITLQYGLYMPKGKEKVKVHRLRGVNKEKFHLYATILDVLHD CNNWDNFQRRLERRGIAISFRRNENGNIHGICFTKNGHTYSGSKIDRSLGYNKLVGLLGQ INQSQVNQKEYYHYENTHVYQETLDDGRIIRIYEPEISSSDSGPSFGHNIANAAIEFVLQ PHDVPTSGGGGGSTSEDEDNEKENKNNKPRKFRRR >gi|222159289|gb|ACAB01000070.1| GENE 24 29457 - 29909 104 150 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715290|ref|ZP_04545771.1| ## NR: gi|237715290|ref|ZP_04545771.1| predicted protein [Bacteroides sp. D1] # 1 135 1 135 150 230 100.0 2e-59 MKKDNENAALLSMFDELVDEVKKNRQAIAQLQNTIQRHDIGKIVDESIYVALNEKFPNKS QLEKDEETKRYHQAMATWTTQREKNLFNQYIKPTLTPLNGIEKKIETATRSIVNEYRKER ESQKMAYYKTITIIFLCSLLLISILIICLL >gi|222159289|gb|ACAB01000070.1| GENE 25 30056 - 30292 186 78 aa, chain + ## HITS:1 COG:no KEGG:BDI_2133 NR:ns ## KEGG: BDI_2133 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 2 77 5 80 80 94 69.0 1e-18 MKDLNRIKVVLVEKKRTNKWLAEQLGKDPATISKWCTNSSQPDLATLARVAALLEVDVKD LLNSTQPKDNFIIIQQKS >gi|222159289|gb|ACAB01000070.1| GENE 26 30294 - 30764 205 156 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715292|ref|ZP_04545773.1| ## NR: gi|237715292|ref|ZP_04545773.1| predicted protein [Bacteroides sp. 
D1] # 1 156 1 156 156 295 100.0 9e-79 MTKLYIEQILKMRMAVYQAGIKVGIWKDINLLGASDMMNYIFTKSGQIAYYNLIVEYMRK EHDVITGGSYFLYKMPVQIEKEIMDFLKNGSIDFTSLIQDPEEYLQSMDTIVTDHCFTTI NIGSFTLNEIDSILRLCASHYRYSFQYGVKSYPYFD >gi|222159289|gb|ACAB01000070.1| GENE 27 30768 - 31532 389 254 aa, chain + ## HITS:1 COG:no KEGG:DhcVS_262 NR:ns ## KEGG: DhcVS_262 # Name: not_defined # Def: hypothetical protein # Organism: Dehalococcoides_VS # Pathway: not_defined # 3 251 8 262 266 108 25.0 2e-22 MRETTYNSNLAKGCGMIEETLTLLSLYDEGMTKATLVDYVHQNNSLSKCTEKRSKDIVNL VFYPRFMKCDSRIPSWLKAIRNRGVMLPQFKQLLMLYCARDNAIMYDYIVSQLNVLRASG NNHIKREDIRQYIDSVVAEGKASWGESICHKQTSYIKAVLMDFDLVDKRNNILPYEIANT TVLYLMHELHFRGFSDMAIWHHEDWVLFGLDKYQVQEKIMNLNLKGGYIAQCSGELLTIS WNYQTMEEFINGAI >gi|222159289|gb|ACAB01000070.1| GENE 28 31519 - 32076 418 185 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715294|ref|ZP_04545775.1| ## NR: gi|237715294|ref|ZP_04545775.1| predicted protein [Bacteroides sp. D1] # 1 185 1 185 185 363 100.0 4e-99 MGQYDDKIEKALQYLKNPSRITPTGYSPIVYVVYQPEDMLMVRSIIDTFLCAKADFYGFK AHIISMGELLDKYINNHDYREIWTDPSVNEDEMYNSIKQEIVSDGYLEKSILQIHDELLA DGNALLVIKDVEMLHPFYMMGVIENKIYNKIKVPILVFYPGETQGTARSFLGLYNQDGNY RSINF >gi|222159289|gb|ACAB01000070.1| GENE 29 32088 - 35747 2144 1219 aa, chain + ## HITS:1 COG:no KEGG:DhcVS_261 NR:ns ## KEGG: DhcVS_261 # Name: not_defined # Def: hypothetical protein # Organism: Dehalococcoides_VS # Pathway: not_defined # 4 1094 5 1120 1228 453 29.0 1e-125 MHKIQDLFDPTKQLNRTIESVVTFGANTEQDLSSEIREYVVTDKLHKNYEDVITDLQTAF DDSSKEVGIWVSGFYGSGKSSFAKYLGLSFDKSVMIDGVSFGDKLMSRIQDTAITAMHRT IISRHNPQVVMIDLSTQSVAGKIANVSDIVYYETLKLLGVTKCTDQKVMCFIDLLHSEGK YEEFCRLVETEKNKSWESVESNDLAANLIAANLAPRVLPAYFPDSASFKNINLSSALNEK ERFVRLHKLVKEKTGKDKIIYVLDEVGQYVASNVDLILNVQGMMQIFKDEFRGDVWVIAT AQQTLTEDNRQAQLNSNELFRLNDRFPVKVDIEANDIKEIITKRLLGKSKDGSDYLKRLF SQNEGLLKNSIHLTLQERSIYNQLLTDETFTNLYPFLPVHIDILLSLLQKLASRTGGVGL RSVIRLIRDILVDNHLADATIGQLAGPEHFYDVLHSDMERNSAKEIVISADKAISIFNGD KLAVRICKTIAVMQLLDDFNLSFDNLCALLYNNVGSTFDKSKVREAIERIMDSTSVTLQE 
IDGKYQFMTNAILSIREERNKIIPRDAEKADVLQEQLKDMLSPAPSVNVYSSKTITAGVE LTERNRPYNIHPTTSGLKLNIRFVDGSAFDETHQQLLTESTRQENNRTLYWLCTLAKDKE TILQEIVRSQNIKNRHQNETNKEIQAYLRAQTDFADEKKRELNKILREAQANSEVIFRGS PQQVNGETYKTVALKTIAEKVFEKYPLASSNMKGDCVNKLASYGDLTTIPDALNPFKIIK KTDGTIDVSHPAISEIKDYIATLNEVTGSEIMNRFERDPYGWSKDTIRYIVALMLKANII QIRVAGKNVTVFGDTAVEAMATNNSFNKISVLLNTEGALTTTELLKAAQSMTGLFNSSRV APVKDQIAKEAYKKIKLYLQKFNRLLPDFEVLKLDGESIVRQAINYAQRIIDSEGGEAAF LLGKDVDCQKAFKYVMDIIKCNEQASLINHLKHINHLANESHTLPEVEQLADYIKHVNDV RQLYHEYIANPDLHLIVSNITDLRNQFDAYLNQACIEFQTQTEKLLDESRTTMKSMPEYA KLDDKQRTQIDTQLDGLSIECTHPSISNLREMINKFVTYYLPSGSIKTIENRIRQYAADN APAVKVIPTPPTSPQKGATQEDDATKPNKVEEPTYQPKRLQVKRKITTRTELQQVIDQLT SLLGEISDNSPVEFNFNED >gi|222159289|gb|ACAB01000070.1| GENE 30 35765 - 39655 2695 1296 aa, chain + ## HITS:1 COG:STM4495 KEGG:ns NR:ns ## COG: STM4495 COG1002 # Protein_GI_number: 16767739 # Func_class: V Defense mechanisms # Function: Type II restriction enzyme, methylase subunits # Organism: Salmonella typhimurium LT2 # 58 1222 43 1212 1225 134 22.0 1e-30 MISQGGKKILERFVSTAKNLLMQNITELLQQHYGIWADGHTIAVEKLANQDTDIVHTARM LRERLKHLQAALPETEEDKDRLAVGQLIAEQAFTQLNRFCALRMCEERDLILESIRGGYD AVGFQSYDAIAQTIGVSKYNRYRWYLHSIFDELSIELPAVFDRYSPYGLVFPDESTLLKL IELINDSQLSDWYDEQNGTTVNFWKEDETLGWMFQYYNSLEERRKMRDESNKPRNSREMA VRNQFFTPEYVVRFLTDNSLGRIWYEMTGGQSRIGEEQCQYMVRRPNEEMKERAIKEPTE ILSLDPTCGSMHFGLYLYEVYEYIYTDAWDNYPELLHKYRELHTRESFLREVPRLILTHN IYGCEIDPRALQIAALSLWLRAQRSYGEMGIDATERPLITKSNLVLAEAMPGNKRLLKGL MEELDAPMQRLITKIWDKMKFVGEAGLLFKMEKEIEEDIEYLRKNWSKVNQSRNADLFAT EEERARIDAENEARKLLRQNKSAFFEEITDRLKEALEQLSAKLSEEEGYENALFSEDATR GFAFIELCQKRFDCIVMNPPFGEGSENTSSYLDNNYPAWCRNLVCAFFDRMQEMLDADGR LGAIFDRTVMIKSSYEDFRKRNLCGYITNCADIGWGVLDASVETSTLVLNKHSSDVEGVF MDILDVNAEEKNAQLIALIHAMCDGDEVKWSYIASSKDFANLPNAIIGYYFDVNILSVFK MQNMEERNLVSKKGHDLAANIYPRLYYEITNTLGFSLMYNGGAFSMFYFPYNDVVFWNEE IIRNNSLCNLRSLHLQKLGTVGYGKRGDILDAHVLKKGFIFTREGIGVPNLSISDGYSAL AYLNSIVAQYTINLYCAQHKGNGYVNILPMPDYDKSLSDIESIVNSIIDIKRHWFSLDET 
NLEYHGLIAQMSISDSIEDSINRMQEGITADYQHYQELVKENDNLWMDLANIDRESEFRQ TLNEYKNRRPYEELLSIDNASNQNIIDKKVIAQEIVQELVGMAFGRWDTAYAQKVKQIPE FGDVFDALPFMPTVSLSEIPSDYALQIPEDGILNNQADSPDSLVNRVRDVMQHIWGKHAD ELEYELCQLIGCKNLQVYFEAPTGFFDYHFKRYTKSRRKAPIYWPLSSEDGSYTYWVYYP KLSRNTLPSLLIKLRSEDEQLRSGINAAMAGHDKAQESSLRAKQAQVEGMMEEINQILAS GYEPNHDDGVPVTSAPMLKLVAHRGWNNECKDNLEKLTKGEYDWSHLAMSMFPARVTQKA KKDWCLALTHGLEHLCENKPKEKKARKKKDNNQTLF >gi|222159289|gb|ACAB01000070.1| GENE 31 39667 - 41016 764 449 aa, chain + ## HITS:1 COG:STM3753 KEGG:ns NR:ns ## COG: STM3753 COG3950 # Protein_GI_number: 16767037 # Func_class: R General function prediction only # Function: Predicted ATP-binding protein involved in virulence # Organism: Salmonella typhimurium LT2 # 1 358 1 369 396 136 28.0 8e-32 MRINTIKIKNFRGFEDKSFEFDSRMNVVLGNNTTGKTTLLHAVQIALGAYLQALTLIPGG KYFSRNFLKGDQVRKYSESTKSFLLDEQKPSIEVNADYYVGRIALGTKDYSEQIKNITWL REGSKNSRKNNGELMDEVYYMEQQRRNADATKVNSIFPLMLSFGATRLQNNYNGAEKTKA RASREEKAYKCALDEQVDFKSAFDWIYSYEKNILKELEFEGTDEAFIQAIKEAIPAIKQI FIDKKNDEFTAQIQMTGDLTPRWLTYDMMSDGFKSMINIVAEIAYRCIELNGFLGKDAVK KTPGIVMIDEVDLYLHPHWQQHILEDLQKAFPMFQFIVTTHSPFIVQSVDTNNIITLDAK VSPISPSNRGIEEIMVAEMGLDIDISNRSEKYRKKYDLAHRYYQLVKNGREGTEEANYVK SELNKLEAESKMFHDPAFEAALRLKRGDL >gi|222159289|gb|ACAB01000070.1| GENE 32 41013 - 41762 368 249 aa, chain + ## HITS:1 COG:no KEGG:PCC8801_0627 NR:ns ## KEGG: PCC8801_0627 # Name: not_defined # Def: hypothetical protein # Organism: Cyanothece_PCC8801 # Pathway: not_defined # 1 237 1 221 227 124 34.0 4e-27 MRPVTKKTIGESIMLGNGTTHVIQEEYKPYGNARPILLTNLGRYCSYCEQAFLCGSNLQT EHIQPKGLDVNGIKPYAELSTKWENFLLGCATCNGKGNKGDKNVVFEDIHLPHRNNTYLS LVYKEAGVVIPNPKLEGKSLANAEALISLVGLDKEDLDTDGRCGMRREVWEKAIMYLNDY ENGEIKLRHLIDYIKVSGCWSIWFTVFSGHDEVRKALIEEFPGTAKECFDANNHYEPVNR NPKNAIDPV >gi|222159289|gb|ACAB01000070.1| GENE 33 41773 - 44229 878 818 aa, chain + ## HITS:1 COG:no KEGG:DhcVS_259 NR:ns ## KEGG: DhcVS_259 # Name: not_defined # Def: hypothetical protein # Organism: Dehalococcoides_VS # Pathway: not_defined # 21 813 20 835 841 208 24.0 
7e-52 MDIQGYILDHWQAKMKPETPVIIIYDKEGVYYDLLPIAIEKGFKVIDTTKGALHARLSAS RFWCNDLSIDKNVQMIIYRQRPMPTNNRSWVEEPFSCFMKSACIFPYGPQDTYENICRTF LPSKQQDLDKLFANGSTSFNMINALLDGAAYPELEQLTGGKSISEITVGLLALDECTNMN WQKEWRILAEVQFPRLDCTGVSLNEVQNKLWTYLLFSEFVFDLPGALPENLKSVPMAPVE MKEKVYMICDKLRNQINLRETYVRIANKVSEQLKLSEVFVKAKHLGERVTFNFENSVEYN RFIDYIKAGDIQEATRLISKNESGVWYQEEQEVSNYWKLAQHIVDLMQCINNGIETDGTL ADLIEWYTNVGCVADNAFRKFHTDKLGMLHLPKQADELTNLINKHYREFTERGVKAYQQL IIDIKDYPKLRNQGYIQFVKPAIKNGKRVVFVMVDAFRYEMGKTFAKSIERNFMERVECL SRISYLPSITRFGMASHMGDISVRNIQGKLQPFIGEQMVSTPEDRITWLKADTNIEVQDF RLEEFNSAKVNDRTRLLVVRSVSIDSAGENDKLNGLATMEREQIRLAKLLDDCKRLKFDE AIIVADHGFMIQPSFNVGDLINKPKGSDISIEESRVIAGNLNDSNDTLSLTTTQLGNDME VMKLSYAKDYTVFTRGEIYYHEGLSLQENVVPIIRVKLQEEKKRQSFQVSLSYKGKNGGT IFSRRPLIDINTTFANLFADDVIMKLKVTGDNDCEIGEAEDRFYDSVTGLIRIPSGATSV RQLINIRDEYHGNTVTITALDTETNATLSTLRLNFEND >gi|222159289|gb|ACAB01000070.1| GENE 34 44233 - 45570 758 445 aa, chain + ## HITS:1 COG:STM4491 KEGG:ns NR:ns ## COG: STM4491 COG4930 # Protein_GI_number: 16767735 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent Lon-type protease # Organism: Salmonella typhimurium LT2 # 4 425 24 432 694 351 42.0 2e-96 MDALDIKLNEAFPGKVVRKDLLHEIKRAVNVPSFVLEFLLSRYCASEDPEEIEEGKKAVL QTIEKCYVRPDESNKAQAIVEQKKRHKFIDKIHVKYVEREKRYWAEMENFASKRIAINPN FYQENEKLLESGIWAEVTLAYNEIQEDDYAFFIEDLRPIQIAHFDEQKFFKGREHFTTDE WIDVLIRSIGINPDWLAERSRKLLGDKNGRRLKFHILSRFLPLVQSNYNSIELGGRSTGK SYFYSEFSPYSTLLSGGQASTATLLYNNQRKQVGAVGFWDNVAFDEVAKMKIKDADTVQI MKDYMANGRFSRGREVTGYASFSFVGNFDLNIPRIVNSYDHDLLVTLPEAFDLAVIDRIY NYIPGWEIPKIDDSAYNNNFGLITDYMSEALHYLFVHDSDYVGVVNSRLKKGDNIEGRDN KALQKNCFRFYQATLPYRPTHRRGV >gi|222159289|gb|ACAB01000070.1| GENE 35 45467 - 46372 536 301 aa, chain + ## HITS:1 COG:MA2364 KEGG:ns NR:ns ## COG: MA2364 COG4930 # Protein_GI_number: 20091197 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent Lon-type protease # Organism: Methanosarcina acetivorans 
str.C2A # 13 83 419 489 682 60 50.0 4e-09 MTTLKDEIIKLYKKTVSGFIKLLFPTGQPTDEEFDEIVEYAIEGRRRVKEQLNKRKPDEE YARINMSYINKDGNKVVVYCPESKYSEATQNPRKDLNVEVLTKPIVEEVKPVVDQSIVLP IHSAIAPIVDSSPKEKTIDIQYGDIGYGYDDLFAEYLKGASIVMLEEPYLGNGFQIVNLV RFIELLVKIGDCKAFRLITKPGETPEESASISDNLKRIKETLGEMDGNIMSFEYEFDENS HDRYIRTSNNWDITLGRGLHFYQNMNPNKDSRNFFQMGTYDLSLRPCLKARFTFMKRDAQ E >gi|222159289|gb|ACAB01000070.1| GENE 36 46384 - 47496 374 370 aa, chain + ## HITS:1 COG:no KEGG:BT_4745 NR:ns ## KEGG: BT_4745 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 364 23 379 1222 357 54.0 3e-97 MTIETLLPKKLDKTDDSYKSVLDLNEALSSAEQEHIRNIALTGPFGSGKSSVLITLMEDF AEGRNYLPISLATLQANEEEIDLHESEEESNTQTPKKSEKDDVDSDNKDVSRKEKQIENL NRKIEYSILQQLIYREETKTVPNSRFRRITHLSKWGLFKYPFCIVLSIVCFFIVFEPSFA KVDTLYNLFNWGYPWNLAFDLIATAWLLLMLFVAVRYVLKSYSNSKLNKLNLKDGEIEVV ENNSIFNRHLDEILYFFQVTQYNIVIIEDLDRFGTPNIFLKLRELNQLINESKIVNRHIT FIYAVKDDIFKDEERTKFFDYITTVIPIINPSNSKDKLKHALEEKGCGSDGITDDDLSEM AIFLFKTCES >gi|222159289|gb|ACAB01000070.1| GENE 37 47484 - 48381 341 299 aa, chain + ## HITS:1 COG:no KEGG:BT_4745 NR:ns ## KEGG: BT_4745 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 299 382 684 1222 192 39.0 9e-48 MRILTNIVNEYKQYRDKLCTSKGTKLNKTKLLAMIVYKNYYPQDFALLHRRQGKVYECIS KKRLFVEEALKIIDGRKEKLSENEKTYINTLHLNLKELRLLFLYEFRNNANKRLVTIKIN NAYRSLEDIADNDGFFEELLSLKTINYEYYYNYYNVTTSSQSVDLPTLMNELHYTERIEM LKYGAENFKKEYENIKKEIISIKSLKLKVLLNKYNLGMNQIYQQIHLSDMQDVFIRRGYI DEEYYDYISYFYPGMVSLEDRELLLNIKRQINQPYDYHIDKIENFVKELKEYMFESDAI Prediction of potential genes in microbial genomes Time: Wed May 18 02:49:14 2011 Seq name: gi|222159288|gb|ACAB01000071.1| Bacteroides sp. D1 cont1.71, whole genome shotgun sequence Length of sequence - 1517 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 24 - 1515 682 ## BT_4745 hypothetical protein Predicted protein(s) >gi|222159288|gb|ACAB01000071.1| GENE 1 24 - 1515 682 497 aa, chain + ## HITS:1 COG:no KEGG:BT_4745 NR:ns ## KEGG: BT_4745 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 497 707 1199 1222 169 26.0 2e-40 MFNHFMYRLERNDAPLQFLSQYYIEGKQQNKVFAHYIEFDNSWNSIINWNNNTEKNNLIE AYLKYSSKITTDIQKWINANYKFLVEHLEEITLEKTQTLVANSCFTDLCDGSDDLLNYIV KHRCFKVNLNNLVIVTKHLSKGLAIPSFNNLNYTRIKGTNNDSFIGYVEDNISTVILELK DENKDESPESLLYILNSPSITEDSKNKYLSGQNYHIDGFVEILDENMYDVAIETRIILPT WENVSYYYTYKKCLSDTLSGYINHYANELSLMKCSDSIENKNLLYASLLGGTELAISNYS LLTTSFDGLFPQIALLEHLEKERLIILIKAGKVPFNQETITIINKTLSFADYIIYYSHEF AKKLDYNYKFSRSNAIEILKYGNFSLDDKYNIIGILPYDILKGSQILANIAIEVFNHKHE VNVNDEVLVNLIGVSNNIALKVQLITRLIRKGYTNHDNITHLVSAIDENYIDVCDKGKKA KLPNNELNLQFLTALED Prediction of potential genes in microbial genomes Time: Wed May 18 02:49:37 2011 Seq name: gi|222159287|gb|ACAB01000072.1| Bacteroides sp. D1 cont1.72, whole genome shotgun sequence Length of sequence - 66541 bp Number of predicted genes - 57, with homology - 55 Number of transcription units - 26, operones - 13 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 231 - 317 61.5 # Leu TAA 0 0 - TRNA 334 - 406 84.5 # Gly GCC 0 0 1 1 Op 1 . - CDS 500 - 1840 1109 ## COG0165 Argininosuccinate lyase 2 1 Op 2 . - CDS 1851 - 2261 367 ## BT_3732 hypothetical protein - Prom 2296 - 2355 3.4 - Term 2271 - 2329 4.0 3 2 Op 1 . - CDS 2357 - 2995 743 ## COG0461 Orotate phosphoribosyltransferase 4 2 Op 2 . - CDS 3074 - 3556 566 ## BT_3730 putative regulatory protein 5 2 Op 3 . - CDS 3553 - 4389 354 ## PROTEIN SUPPORTED gi|225874212|ref|YP_002755671.1| ribosomal protein L11 methyltransferase - Prom 4411 - 4470 3.0 + Prom 4199 - 4258 5.4 6 3 Tu 1 . + CDS 4426 - 5481 537 ## COG0117 Pyrimidine deaminase + Prom 5488 - 5547 6.5 7 4 Op 1 . 
+ CDS 5624 - 7138 638 ## BF0506 hypothetical protein 8 4 Op 2 1/0.000 + CDS 7164 - 7898 401 ## COG0020 Undecaprenyl pyrophosphate synthase 9 4 Op 3 . + CDS 7929 - 10583 2533 ## COG4775 Outer membrane protein/protective antigen OMA87 10 4 Op 4 . + CDS 10607 - 11122 589 ## BT_3724 cationic outer membrane protein precursor 11 4 Op 5 . + CDS 11179 - 11694 630 ## BF0502 putative outer membrane protein OmpH + Term 11725 - 11775 15.0 12 5 Tu 1 . - CDS 11595 - 11804 62 ## - Prom 11885 - 11944 4.0 + Prom 11724 - 11783 3.3 13 6 Op 1 . + CDS 11803 - 12645 524 ## COG0796 Glutamate racemase 14 6 Op 2 . + CDS 12719 - 12952 242 ## BT_3721 hypothetical protein 15 7 Tu 1 . - CDS 12921 - 14069 996 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 14178 - 14237 2.9 16 8 Tu 1 . + CDS 14175 - 15257 934 ## COG0263 Glutamate 5-kinase + Term 15328 - 15369 -0.2 - Term 15316 - 15357 -0.2 17 9 Tu 1 . - CDS 15475 - 15624 94 ## - Prom 15683 - 15742 1.8 + Prom 15342 - 15401 6.7 18 10 Op 1 . + CDS 15548 - 16804 1068 ## COG0014 Gamma-glutamyl phosphate reductase 19 10 Op 2 . + CDS 16884 - 17840 889 ## COG0078 Ornithine carbamoyltransferase - Term 17968 - 18004 0.2 20 11 Op 1 . - CDS 18011 - 18403 273 ## COG0607 Rhodanese-related sulfurtransferase 21 11 Op 2 . - CDS 18418 - 19119 476 ## BT_3715 hypothetical protein 22 11 Op 3 . - CDS 19116 - 20264 1023 ## BT_3714 hypothetical protein 23 11 Op 4 . - CDS 20272 - 21246 1034 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes 24 11 Op 5 . - CDS 21243 - 22316 970 ## COG0564 Pseudouridylate synthases, 23S RNA-specific 25 11 Op 6 . - CDS 22343 - 22993 622 ## BT_3711 hypothetical protein - Prom 23106 - 23165 5.5 + Prom 22959 - 23018 5.0 26 12 Tu 1 . + CDS 23057 - 23296 131 ## gi|295088095|emb|CBK69618.1| hypothetical protein + Term 23308 - 23355 3.4 - Term 23289 - 23350 22.6 27 13 Tu 1 . 
- CDS 23365 - 23523 266 ## PROTEIN SUPPORTED gi|160885524|ref|ZP_02066527.1| hypothetical protein BACOVA_03524 - Prom 23543 - 23602 4.8 + Prom 23540 - 23599 7.2 28 14 Tu 1 . + CDS 23716 - 24282 756 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) + Term 24297 - 24350 11.2 - Term 24287 - 24336 12.5 29 15 Op 1 . - CDS 24352 - 25752 1503 ## COG1785 Alkaline phosphatase 30 15 Op 2 . - CDS 25826 - 27421 1423 ## BT_3705 regulatory protein SusR - Prom 27448 - 27507 7.2 31 16 Tu 1 . + CDS 27665 - 29518 1537 ## COG0366 Glycosidases + Prom 29627 - 29686 4.8 32 17 Op 1 . + CDS 29717 - 31927 2203 ## BT_3703 alpha-glucosidase SusB + Prom 31930 - 31989 3.9 33 17 Op 2 . + CDS 32015 - 35038 3235 ## BDI_1558 hypothetical protein 34 17 Op 3 . + CDS 35059 - 36657 1596 ## BT_3701 SusD, outer membrane protein 35 17 Op 4 . + CDS 36696 - 37844 1018 ## BT_3700 outer membrane protein SusE 36 17 Op 5 . + CDS 37868 - 39232 1038 ## PRU_2684 hypothetical protein 37 17 Op 6 . + CDS 39250 - 41631 1834 ## COG0296 1,4-alpha-glucan branching enzyme + Term 41679 - 41735 18.2 - Term 41667 - 41722 9.3 38 18 Op 1 . - CDS 41893 - 42657 594 ## COG2908 Uncharacterized protein conserved in bacteria 39 18 Op 2 . - CDS 42710 - 43021 465 ## COG2151 Predicted metal-sulfur cluster biosynthetic enzyme - Prom 43141 - 43200 4.8 40 19 Tu 1 . - CDS 43202 - 43897 449 ## COG2003 DNA repair proteins - Prom 43933 - 43992 3.4 41 20 Tu 1 . - CDS 44148 - 45137 799 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 45286 - 45345 6.2 - Term 45566 - 45611 3.0 42 21 Tu 1 . - CDS 45635 - 48331 2543 ## BVU_1478 hypothetical protein - Prom 48353 - 48412 4.9 43 22 Tu 1 . - CDS 48706 - 49641 490 ## BT_1503 integrase - Prom 49718 - 49777 4.9 - Term 49781 - 49822 7.3 44 23 Op 1 21/0.000 - CDS 49846 - 51045 1399 ## COG0282 Acetate kinase 45 23 Op 2 . 
- CDS 51086 - 52105 1209 ## COG0280 Phosphotransacetylase - Prom 52313 - 52372 7.0 + Prom 52090 - 52149 8.7 46 24 Op 1 23/0.000 + CDS 52302 - 52745 333 ## COG1380 Putative effector of murein hydrolase LrgA 47 24 Op 2 . + CDS 52742 - 53437 717 ## COG1346 Putative effector of murein hydrolase 48 24 Op 3 . + CDS 53494 - 54411 1050 ## COG4866 Uncharacterized conserved protein 49 24 Op 4 . + CDS 54428 - 55441 615 ## COG4552 Predicted acetyltransferase involved in intracellular survival and related acetyltransferases 50 25 Op 1 . - CDS 55527 - 56660 1105 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins 51 25 Op 2 . - CDS 56665 - 57960 876 ## BT_3686 hypothetical protein 52 25 Op 3 . - CDS 57975 - 58997 960 ## BT_3685 hypothetical protein 53 25 Op 4 . - CDS 59026 - 60282 928 ## COG4289 Uncharacterized protein conserved in bacteria - Prom 60321 - 60380 6.5 54 26 Op 1 1/0.000 - CDS 60406 - 62310 1128 ## COG2273 Beta-glucanase/Beta-glucan synthetase - Term 62326 - 62378 7.1 55 26 Op 2 . - CDS 62391 - 63866 1105 ## COG2273 Beta-glucanase/Beta-glucan synthetase 56 26 Op 3 . - CDS 63885 - 64778 709 ## BF2940 hypothetical protein 57 26 Op 4 . 
- CDS 64806 - 66443 1135 ## PRU_2229 putative lipoprotein - Prom 66470 - 66529 5.2 Predicted protein(s) >gi|222159287|gb|ACAB01000072.1| GENE 1 500 - 1840 1109 446 aa, chain - ## HITS:1 COG:XF1003 KEGG:ns NR:ns ## COG: XF1003 COG0165 # Protein_GI_number: 15837605 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate lyase # Organism: Xylella fastidiosa 9a5c # 1 416 6 424 445 251 34.0 2e-66 MAQKLWEKSVQVNKDIERFTVGRDREMDLYLAKHDVLGSMAHITMLESIGLLTKEELDQL LVELKSIYASAEKGEFVIEDGVEDVHSQVELMLTRRLGDIGKKIHSGRSRNDQVLLDLKL FTRTQIKEVAEAVEQLFHVLIRQSERYKNVLMPGYTHLQIAMPSSFGLWFGAYAESLMDD MLFLQAAFKMCNRNPLGSAAGYGSSFPLNRTMTTDLLGFDSMDYNVVYAQMGRGKMERNV AFALATIAGTISKLAFDACMFNSQNFGFVKLPDDCTTGSSIMPHKKNPDVFELTRAKCNK LQSLPQQIMMIANNLPSGYFRDLQIIKEVFLPAFQELKDCLQMTTYIMNEIKVNEHILDD DKYLLIFSVEEVNRLAREGMPFRDAYKKVGLDIEAGKFSHTKEVHHTHEGSIGNLCNAEI SALMQQVIDGFNFCGMEKAEKALLGR >gi|222159287|gb|ACAB01000072.1| GENE 2 1851 - 2261 367 136 aa, chain - ## HITS:1 COG:no KEGG:BT_3732 NR:ns ## KEGG: BT_3732 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 136 1 136 136 226 87.0 3e-58 MSNFESSVKVIPYSQERVYNKLSDLSNLEAVKDRLPKDKVQDLSFDSDTLCFSVSPIGQL TLQIVERDPCKCIKLATTNSPLPFNMWIQLVETAEEECKVKVTIGMDLNPFMKAMVQKPL QEGLEKMVDMLAVIEY >gi|222159287|gb|ACAB01000072.1| GENE 3 2357 - 2995 743 212 aa, chain - ## HITS:1 COG:lin1945 KEGG:ns NR:ns ## COG: lin1945 COG0461 # Protein_GI_number: 16801011 # Func_class: F Nucleotide transport and metabolism # Function: Orotate phosphoribosyltransferase # Organism: Listeria innocua # 3 208 2 207 209 225 53.0 6e-59 MKNLERLFAEKLLKIKAIKLQPANPFTWASGWKSPFYCDNRKTLSYPSLRNFVKIEITRL ILERFGQVDAIAGVATGAIPQGALVADALNLPFVYVRSTPKDHGLENLIEGELRPGMKVV VVEDLISTGGSSLKAVEAIRRDGCEVIGMVAAYTYGFPVAEEAFKNAKVTLVTLTNYEAV LDVALRTGYIEKEDIQTLNEWRKDPAHWDAGK >gi|222159287|gb|ACAB01000072.1| GENE 4 3074 - 3556 566 160 aa, chain - ## HITS:1 COG:no KEGG:BT_3730 NR:ns ## KEGG: BT_3730 # Name: not_defined # Def: putative regulatory protein # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 1 158 1 158 160 275 91.0 4e-73 MSAQLTDEEALNRVASYCSTAEHCRAEINEKLQRWGIAYDTIARILDRLESEKFIDDERF CRAFVNDKFRFAKWGKMKIAQGLYMKKIPSDVAWRYLNEIDEEEYLSILRDLLASKRKSI HAADDYELNGKLMRFAMSRGFELKDIKRCIDIPDEEEQID >gi|222159287|gb|ACAB01000072.1| GENE 5 3553 - 4389 354 278 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225874212|ref|YP_002755671.1| ribosomal protein L11 methyltransferase [Acidobacterium capsulatum ATCC 51196] # 1 278 16 290 294 140 37 1e-32 MNRITAYIRQSLQEIYPPEEVKALSMLICCDMLGLDALDIYMGKDIILSECKQRELENII FRLQKNEPIQYIRGVAEFCGRNFKVASGVLIPRPETAELVELIVEENPNARRLLDIGTGS GCIAISLDKKLPDAEVEAWDISEEALAIARKNNDALEARVRFLQRDVLADDWEKIPSFDV IVSNPPYVTETEKNEMDANVLDWEPGLALFVPDEDPLRFYNRIARLGSELLLPGGKLYFE INQAYGRETAHILEMNQYRDVRVIKDIFGKDRIVTANR >gi|222159287|gb|ACAB01000072.1| GENE 6 4426 - 5481 537 351 aa, chain + ## HITS:1 COG:BH1554_1 KEGG:ns NR:ns ## COG: BH1554_1 COG0117 # Protein_GI_number: 15614117 # Func_class: H Coenzyme transport and metabolism # Function: Pyrimidine deaminase # Organism: Bacillus halodurans # 7 149 1 141 143 161 52.0 2e-39 MAKSTKMEEEKYMRRCIELAKNGLCNVSPNPMVGAVIVCDGRIIGEGYHIRCGEAHAEVN AIHSVKDESLLKRSTIYVSLEPCSHYGKTPPCADLIIEKQIPRIVIGCQDPFSEVAGRGI QKLRDAGREVSVGVLEEECKSLIRRFITFNTLHRPFITLKWAESADHFIDIERTDGKPIV LSSPLTSMLVHKKRAEADAIMVGRRTALLDNPSLTVRNWYGHNPIRVVLDRTLSLPNDSQ IFDGNVPTLIFTEKQQPEKKNITYITINFSHNPLKQIMEALYQRKIQSLLVEGGRQLLQS FIDNELWDEAYIEKCPSRLHSGIKAPQMDDNFSYSIEEHFERQIWHYVRRL >gi|222159287|gb|ACAB01000072.1| GENE 7 5624 - 7138 638 504 aa, chain + ## HITS:1 COG:no KEGG:BF0506 NR:ns ## KEGG: BF0506 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 501 1 450 452 269 38.0 2e-70 MKIKFLSFIASFFMVSFVITSCLDDDNNIEYSPDATIHAFALDTAGLGSYKFTIDQLSRE IYNEDSLPVHADTIIDKILIKTLTTASGVVTMKDKSGNDSVLNINDSIDLRKELTIKVWS TEALAGISPNQTKEYKIKVNVHNYDPDSLRWKYMDKINNQIQITGEQKSIIFGSEVFTYS VVNNKLYVYKNSLTNFGNGAPQATVGLPEDKLPTSIIAFQFNRSKTMLYATSNGDGKVYE SADGENWNESTIFGKGVELLLATLTNNDVSRICYIKKGADEQRYFYYQTNDVPKETLDNA 
ENGGKVPSNFPTKNISYTVYKSSTNINSVLLVGDTETTTLADDSKLETTIVWAYDGNKWV EFSTTSSVAYCPKYTQPSIIYYNDLVYIFGQDFSSIYVSNQGLFWKKANAKFSFPHRDWS KGGTPDPDVDPEFRGKTNYSMVLDPDTQNLWIIFSKGSASFKEEVEKEESTKATTTETRT YEHDSEVWRGRLNQLWFDLANAGK >gi|222159287|gb|ACAB01000072.1| GENE 8 7164 - 7898 401 244 aa, chain + ## HITS:1 COG:SPy1965 KEGG:ns NR:ns ## COG: SPy1965 COG0020 # Protein_GI_number: 15675763 # Func_class: I Lipid transport and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Streptococcus pyogenes M1 GAS # 12 238 16 248 249 247 48.0 2e-65 MSYIEQIDKTRIPQHVAIIMDGNGRWAKQRGEERTYGHRAGAETVQNITEDAARLGIKYL TLYTFSTENWNRPQDEIAALMNLLLESIEEETLMKNNIRFNVIGDFKKLPVEVQKSLTSC IERTSKNSGMYMVLALSYSSRWEITEAVRQIATQVKTGEISPEQITDECISSHLDTNFMP DPDLLIRTGGEIRLSNYLLWQCAYSELYFCDTFWPDFDKEELYKAIWEYQQRERRFGKTS EQIS >gi|222159287|gb|ACAB01000072.1| GENE 9 7929 - 10583 2533 884 aa, chain + ## HITS:1 COG:RSc1412 KEGG:ns NR:ns ## COG: RSc1412 COG4775 # Protein_GI_number: 17546131 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein/protective antigen OMA87 # Organism: Ralstonia solanacearum # 46 884 30 765 765 144 22.0 9e-34 MHYRISFIFITFVCLCCFATTGVAQNANTDEDSKPVILYSGTPKKYEIADIKVEGVKNYE DYVLIGLSGLSVGQTITVPGDEITGAIKRYWRHGLFSNVQITAEKIEGNKIWLKISLTQR PRIADVRYHGVKKSERTDLEAKLGMVKGMQITPNTVDRAKTLIKRYFDDKGFKNAEVIIA QKDDPSSENQVIVDIDIDKKEKIKVHKITIAGNTAIKASKLKKVMKKTNEKGKLLNLFRT KKFVPENFEADKQLIIDKYNELGYRDAMIVKDSVSQYDEKTVDVYLDIDEGQKYYLRNVT WVGNTLYPSEQLNFLLRMKKGDVYNQKLLNERVSTDDDAIGNLYYNNGYLFYNLDPVEVN IVGDSIDLEMRIYEGRQATINKIKISGNDRLYENVVRRELRIRPGQLFSKEDLMRSLREI QQMGHFDPEKLQPDIQPDPMNGTVDIGLPLTSKANDQVEFSAGWGQTGIIGKLSLKFTNF SVANLLHPGENYRGILPQGDGQTLTISGQTNAKYYQSYSISFFDPWFGGKRPNSFSVSAF FSVQTDISSRYYNSSYFNNYYNSMYSGYGGYGMYNYGNYNNYENYYDPDKSIKMWGLSVG WGKRLKWPDDYFTLSAELAYQRYNLSDWQYFPVTNGKCNDLSISLTLARNSIDNPIFPRS GSDFSLSVQLTPPYSLMDGKDYKGYYSNPETGSITQDNMNKLHKWIEYHKWKFKGKTYTP LMDPIAHPKCLVLMTRTEFGLLGHYNQYKKSPFGTFDVGGDGMTGYSTYATESIALRGYE NSSLTPYGSEGYAYARLGIELRYPLMLETSTNIYVLGFLEAGNAWHDIKKFNPFELKRSA 
GVGVRIFLPMIGMMGIDWGYGFDKVFGSKQYGGSQFHFILGQEF >gi|222159287|gb|ACAB01000072.1| GENE 10 10607 - 11122 589 171 aa, chain + ## HITS:1 COG:no KEGG:BT_3724 NR:ns ## KEGG: BT_3724 # Name: not_defined # Def: cationic outer membrane protein precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 171 1 171 171 241 84.0 7e-63 MRKSVLLIMVLFAVSMAANAQKFALIDTEYIMKNIPAAQSANEQMQEATKKYQSEVEALA KEAQKMFQDYQAKSSTLSAAQKTKTEDAIVAKEKAAAELKRNYFGPEGELAKMRDKLITP IQDDIYEAVKAISQQHGYDLIIDRASATGIIFANPRIDISDEILRRLGYSN >gi|222159287|gb|ACAB01000072.1| GENE 11 11179 - 11694 630 171 aa, chain + ## HITS:1 COG:no KEGG:BF0502 NR:ns ## KEGG: BF0502 # Name: not_defined # Def: putative outer membrane protein OmpH # Organism: B.fragilis # Pathway: not_defined # 1 171 1 169 169 207 76.0 1e-52 MLKKIALVMLLALPMGVFAQNLKFGHINAQEIITVMPEFTKAQNDIQTLEKQLTAELQRT QEEFNKKYQEFQQAIAKDSLPPNIAERRQKELQDMMQRQEQFQQDAQQQMAKAQNDAMAP IYQKLDNAIKAVGAAEGVIYIFDLARTSIPYVNESQSINLTSKVKANLGIK >gi|222159287|gb|ACAB01000072.1| GENE 12 11595 - 11804 62 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTHLIYIRKCTPKKGKEKEVDPSFSFPFGISLFLVSSYLIPRFAFTLLVKLMLCDSFTYG IEVLARSKM >gi|222159287|gb|ACAB01000072.1| GENE 13 11803 - 12645 524 280 aa, chain + ## HITS:1 COG:lin1200 KEGG:ns NR:ns ## COG: lin1200 COG0796 # Protein_GI_number: 16800269 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glutamate racemase # Organism: Listeria innocua # 12 279 5 264 266 171 37.0 1e-42 MKQHLSHTPGPIGVFDSGYGGLTILDKIREVLPEYDYIYLGDNARAPYGTRSFEVVYEFT RQAVNKLFDMGCHLVILACNTASAKALRSIQMNDLPGIDPARRVLGVIRPTVECVGEISK NQHIGVLATAGTIKSESYPLEIHKLFPEIQVSGTACPMWVSLVENNESQDEGADYFIRKY IDQLLSKDPQIDTVILGCTHFPILLPKIRQYIPEHISVIAQGEYVAESLKDYLKRHPEMD AKCTKNGNCQFYTTEAEEKFSESASTFLKQQINVKHITLE >gi|222159287|gb|ACAB01000072.1| GENE 14 12719 - 12952 242 77 aa, chain + ## HITS:1 COG:no KEGG:BT_3721 NR:ns ## KEGG: BT_3721 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 77 1 77 77 134 96.0 8e-31 
MKEEDYSKAIEVFSGSPWEAEIIKGLLESNDIRCVIKDGIMGTLAPYIAPSVSVLVTEDQ YEAATALIRSRAEKDND >gi|222159287|gb|ACAB01000072.1| GENE 15 12921 - 14069 996 382 aa, chain - ## HITS:1 COG:MA0636 KEGG:ns NR:ns ## COG: MA0636 COG0436 # Protein_GI_number: 20089523 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Methanosarcina acetivorans str.C2A # 7 379 13 390 394 408 53.0 1e-114 MKHTNPQVEQMTSFIVMDVLERANELQKQGIDIIHLEVGEPDFDVPACVAEAAKAAYDRH LTHYTHSLGDPELRREIAAFYLREYGVTVDPDCIVVTSGSSPSILLALMLLCNPDSEVIL SNPGYACYRNFVLATQAKPVLVPLFKEYLQYDIEAIRKCVNPHTAAIFINSPMNPTGMLL DEKFLKDVAALGVPVISDEIYHGLVYEGRAHSILEYTDQAFVLNGFSKRFAMTGLRLGYL IAPKSCMRSLQKLQQNLFICASSVAQQAGIAALRQAEPDVERMKQVYDERRRYMIARLRE MGFEIKVEPQGAFYIFADARKFTTDSYRFAFDVLEHAHVGITPGVDFGTGGEGYVRFSYA NSLENIREGLDRISRYLSRPGS >gi|222159287|gb|ACAB01000072.1| GENE 16 14175 - 15257 934 360 aa, chain + ## HITS:1 COG:BS_proJ KEGG:ns NR:ns ## COG: BS_proJ COG0263 # Protein_GI_number: 16078908 # Func_class: E Amino acid transport and metabolism # Function: Glutamate 5-kinase # Organism: Bacillus subtilis # 7 341 9 342 371 197 33.0 3e-50 MKQEFTRIAVKVGSNVLTRRDGTLDVTRMSALVDQIAELHKSGVEIILISSGAVASGRSE VHPQKKLDSVDQRQLFSAVGQAKLINRYYELFREHSIPVGQVLTTKESFGTRRHYLNQKN CMTVMLENNVIPIVNENDTISVSELMFTDNDELSGLIASMMDAQALIILSNIDGIYNGSP ADPGSSVIREIDHGKDLSNYIQATKSSFGRGGMLTKTNIARKVADEGITVIIANGKRDNI LVDLLQHPKETLCTRFVPSNEPVSSVKKWIAHSEGFAKGEIHINECATEVLNSEKAVSIL PIGITHVEGEFEKDDIVRIMDFQGNQVGVGKVNCDSKQAQEAIGKHGKKPVVHYDYLYIE >gi|222159287|gb|ACAB01000072.1| GENE 17 15475 - 15624 94 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSSFSKANSRLATCTAAKVSFKFVVICSFFALNGKINYANIVNDVNKTC >gi|222159287|gb|ACAB01000072.1| GENE 18 15548 - 16804 1068 418 aa, chain + ## HITS:1 COG:SP0932 KEGG:ns NR:ns ## COG: SP0932 COG0014 # Protein_GI_number: 15900812 # Func_class: E Amino acid transport and metabolism # Function: Gamma-glutamyl phosphate reductase # Organism: Streptococcus pneumoniae TIGR4 # 7 416 6 419 420 339 46.0 7e-93 
MTTNLNETFAAVQVASRELALLNDDIINQILNAVADAAIAETPFILAENEKDLARMDKND PKYDRLKLTEERLKGIAADTRNVATLPSPLGKVLKESVRPNGMRLTKVSVPFGVIGIIYE ARPNVSFDVFSLCLKSGNACILKGGSDADCSNRAIISVIHEVLRKFNINPHIVELLPADR EATAALLNAVGYVDLIIPRGSSSLIHFVRENAKIPVIETGAGICHTYFDEFGDVNKGAAI IHNAKTRRVSVCNALDCAIIHEKRLTDLPMLCEKLKDSHVIIYADAQAYQALEGSYPTEL LEHAKAESFGTEFLDYKMAVKTVKSFEDALGHIQENSSKHSECIVTENKERAALFTKIVD AACVYTNVSTAFTDGAQFGLGAEIGISTQKLHARGPMGLEEITSYKWVIEGDGQTRRN >gi|222159287|gb|ACAB01000072.1| GENE 19 16884 - 17840 889 318 aa, chain + ## HITS:1 COG:XF0998 KEGG:ns NR:ns ## COG: XF0998 COG0078 # Protein_GI_number: 15837600 # Func_class: E Amino acid transport and metabolism # Function: Ornithine carbamoyltransferase # Organism: Xylella fastidiosa 9a5c # 28 302 27 322 336 189 37.0 6e-48 MNKFTCVQDIGDLKSALAEAFEIKKDRFKYVELGRNKTLMMIFFNSSLRTRLSTQKAAIN LGMNVMVLDINQGAWKLETERGVIMDGDKPEHILEAIPVMGCYCDLIGVRSFARFENRDF DYQETIINQFIQYSGRPVFSMEAATRHPLQSFADLITIEEYKKTARPKVVMTWAPHPRPL PQAVPNSFAEWMNATDYEFVITHPEGYELAPQFVGNAKVEYDQMKAFEGADFIYAKNWAA YSGDNYGQILSKDRDWTVSDRQMAVTNNAYFMHCLPVRRNMIVTDDVIESPQSIVIPEAA NREISATVVLKRLLEGLK >gi|222159287|gb|ACAB01000072.1| GENE 20 18011 - 18403 273 130 aa, chain - ## HITS:1 COG:MA0746 KEGG:ns NR:ns ## COG: MA0746 COG0607 # Protein_GI_number: 20089631 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Methanosarcina acetivorans str.C2A # 27 125 39 146 151 67 35.0 5e-12 MFKLNQLIVGIFLFLSSLFSCQQKGDFQSMNVEEFDSLIQNEDIQRLDVRTLAEYSEGHI TKTININVMDDSFASMADSLLQKDKPVAVYCRSGNRSKKAAAILSEKGYKVFELDKGFNS WQEAGKEIEK >gi|222159287|gb|ACAB01000072.1| GENE 21 18418 - 19119 476 233 aa, chain - ## HITS:1 COG:no KEGG:BT_3715 NR:ns ## KEGG: BT_3715 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 233 1 237 237 308 64.0 9e-83 MKRRKLHVIWLWGLLFLMTACGDDDYYYPSVKLEFVTVEAGEDGRIQTLIPDKGEALPVA EDRTGSTIAANTSRRVMSNYEVLPDGSAATIYSLQSLIVPVPKPEDDPVYKDGIKQDPVE 
VVSIWLGRDYLNMILNLKVSTGKGHTFGIVEDVSELKTNGIVNMLLYHDANSDEEYYNRR AYISVPLAQYIDEEHPGRTINIKFKYCTYDKDGSAVVSEKYCDPGFDYTPGQN >gi|222159287|gb|ACAB01000072.1| GENE 22 19116 - 20264 1023 382 aa, chain - ## HITS:1 COG:no KEGG:BT_3714 NR:ns ## KEGG: BT_3714 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 382 1 382 382 704 89.0 0 MDLTEFNEIRPYNDEELPQIFEELIADPAFQKAATGAIPNVPFELLAQKMRACKTKLDFQ EAFCYGILWKIAADHTAGLTLDHTAIPDKSKAYTYISNHRDIILDSGFLSILLIDQGMDT VEIAIGDNLLIYPWIKKLVRVNKSFIVQRALTMRQMLESSARMSRYMHYTISENKQSIWI AQREGRAKDSNDRTQDSVLKMLAMGGEGDLIDRLMEMNIAPLAISYEYDPCDFLKAQEFQ LKRDIEGYKKTTQDDLINMQTGLFGYKGRVHFQVAPCLNDDLKELDRSLPKPDLFARISA CIDRRIHRNYQIYPGNYVAYDWLNGTTEFVSNYTEEEKQQFMNYIEQQLAKIKIPNKDED FLREKLLLMYSNPLVNYLAACR >gi|222159287|gb|ACAB01000072.1| GENE 23 20272 - 21246 1034 324 aa, chain - ## HITS:1 COG:HI1140 KEGG:ns NR:ns ## COG: HI1140 COG1181 # Protein_GI_number: 16273066 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Haemophilus influenzae # 2 321 5 303 306 180 32.0 4e-45 MKRNIAIVAGGDTSEIVVSLRSAQGIYSFIDKEKYNLYIVEMEGRRWEVQLPDGNKVPVD RNDFSFTNGTEKVVFDFAYITIHGTPGEDGRLQGYFDMMRIPYSCCGVLAAAITYDKFTC NQYLKAFGVRIAESLLLRQGQSISDEEVVEKIGLPCFIKPSLGGSSFGVTKVKTKEQIQP AIVKAFGEAQEVLVEAFMDGTELTCGCYKTKEKTVIFPPTEVVTHNEFFDYDAKYNGQVD EITPARISDELTKRVQMLTSAIYDILGCSGIIRVDYIVTAGEKLNLLEVNTTPGMTTTSF IPQQVRAAGLDIKDVMTDIIENKF >gi|222159287|gb|ACAB01000072.1| GENE 24 21243 - 22316 970 357 aa, chain - ## HITS:1 COG:BH2542 KEGG:ns NR:ns ## COG: BH2542 COG0564 # Protein_GI_number: 15615105 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S RNA-specific # Organism: Bacillus halodurans # 26 346 1 300 305 236 43.0 6e-62 MIEEELPDELENDLDDIEPVGDESQLYEHFRVVVDKGQAMVRVDKYLFERIVNASRNRIQ KAAEGGFVMANGKPVKSSYKVKPLDVITVMMDRPRYENEIIPEDIPLTIVYEDPYVMVVN KPAGLVVHPGHGNYHGTLVNALAWHMKDIPDYDANDPHVGLVHRIDKDTSGLLVIAKTPD 
AKTNLGLQFFNKTTKRRYRALVWGVVEQDEGTIVGNIARNPKDRMQMAVMSDPTVGKHAV THYRVLERLGYVTLVECILETGRTHQIRVHMKHIGHVLFNDERYGGHEILKGTHFSKYKQ FVNNCFDTCPRQALHAMTLGFVHPVTGEEMYFTSELPDDMTRLIEKWRGYISNRELE >gi|222159287|gb|ACAB01000072.1| GENE 25 22343 - 22993 622 216 aa, chain - ## HITS:1 COG:no KEGG:BT_3711 NR:ns ## KEGG: BT_3711 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 216 1 216 216 313 70.0 4e-84 MTIKEFFSFKTNKFFWVNMIAMIVVVVVMIFGVLKWLDIHTHHGETVAVPDVKGMTVDEA AKMFRNHGLVYVISDTKYVKDKAAGIILELKPGAGEKVKEGRTVYLTVNTLDVPLRAIPD VADNSSLRQAQAKLLNAGFKLNQVQLVNGEKDWVYGVKYQGRQLAAGEKIPVGSSLTLMV GNGSGDTSEEDSADVSVDTDQPVTSESSSTQDDSWF >gi|222159287|gb|ACAB01000072.1| GENE 26 23057 - 23296 131 79 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295088095|emb|CBK69618.1| ## NR: gi|295088095|emb|CBK69618.1| hypothetical protein [Bacteroides xylanisolvens XB1A] # 1 79 1 79 79 124 100.0 2e-27 MKNVLIYYARTLIIARIAATIIPTPIDFNNIIIFMRGFKSPESALLLRMFVYESCSIPQR KAKSHKQEQTIALFNILFS >gi|222159287|gb|ACAB01000072.1| GENE 27 23365 - 23523 266 52 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160885524|ref|ZP_02066527.1| hypothetical protein BACOVA_03524 [Bacteroides ovatus ATCC 8483] # 1 52 1 52 52 107 100 2e-22 MKRTFQPSNRKRKNKHGFRERMASANGRRVLAARRAKGRKKLTVSDEYNGQK >gi|222159287|gb|ACAB01000072.1| GENE 28 23716 - 24282 756 188 aa, chain + ## HITS:1 COG:MT2609 KEGG:ns NR:ns ## COG: MT2609 COG0231 # Protein_GI_number: 15842068 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Mycobacterium tuberculosis CDC1551 # 1 188 1 187 187 181 44.0 6e-46 MINAQDIKNGTCIRMDGKLYFCIEFLHVKPGKGNTFMRTKLKDVVSGYVLERRFNIGEKL EDVRVERRPYQFLYKEGEDYIFMNQETFDQHPIAHDLINGVDFLLEGAVLEVVSDASTET VLYADMPIKVQMKVTYTEPGVKGDTATNTLKPATVESGATVRVPLFINEGETIEIDTRDG SYVGRVKA >gi|222159287|gb|ACAB01000072.1| GENE 29 24352 - 25752 1503 466 aa, chain - ## HITS:1 COG:TM0156 KEGG:ns NR:ns ## COG: TM0156 
COG1785 # Protein_GI_number: 15642930 # Func_class: P Inorganic ion transport and metabolism # Function: Alkaline phosphatase # Organism: Thermotoga maritima # 1 465 1 421 434 226 34.0 8e-59 MKKLIYTLFFVFISVVANGQAKYVFYFIGDGMGVNQVNGTEMYQAEIQNGRIGVEPLLFT QFPVATMATTFSAKNSVTDSAAAGTALATGKKTYNHAISVGEDKNAIQTVAEKAKKAGKK VGVTTSVSVDHATPAAFYAHQPDRNRYYEIALDLPKANFDFYAGGGFLKPTTSFDKKEAP SIFPIFEEAGYTVARGYNDYKAKAAKAEKMILIQEEGANPSCLPYAIDRKEGDLTLAQIT ESAIDFLTKDNKKGFFLMVEGGKIDWACHANDAATVFNEVKDMDNAIKVAYEFYKKHPKE TLIVVTADHETGGIVLGTGKYELNLKALQHQKHSADGLSQRISELRKSKGNKVTWEDMKT FLGEEMGFWKQFPISWEQEKKLRDEFEKSFVKNKVVFAESMYSKSEPMAARAKEVMDEIA MVGWVSGGHSAGYVPVFAIGAGSQLFGEKIDNTEIPKRIAKAAGYK >gi|222159287|gb|ACAB01000072.1| GENE 30 25826 - 27421 1423 531 aa, chain - ## HITS:1 COG:no KEGG:BT_3705 NR:ns ## KEGG: BT_3705 # Name: susR # Def: regulatory protein SusR # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 531 64 582 582 786 83.0 0 MRNIAYILAFLLVCPTLLFATQTDDNAAVLKRLDDIINKKETFQVQKEKAIDALKMQLAH SVAPADKYRLYGSLFDAYLHYQADSALYYINRRQQLLPQLTRPELADEIIIDRATVLGVM GMYIEAMKELESINSEKLDKQTLLSYYQTYRACYGWLADYTTNKEEKKKYLTKTDLYRDS IIGIMPPEINRTIVLAEKCIVTGKADTALVMLSDALKDAVDERQKVYIYYTLSEAYGMKG DMEKEVYYLILTAIADLESSVREYASLQKLAHLMYELGDVDRAYKYLSCSMEDAVACNAR LRFIEVTEFFPIIDKAYKLKEEKERVVSQAMLVSVSLLSFFLLIAVFYLYRWMKKLSAMR RDLSLANKQMQAVNKELEQTGKIKEVYIARYLDRCVNYLDKLETYRRSLAKLAMASRIED LFKAIKSEQFIRDERNEFYNEFDKSFLKLFPNFITAFNELLVEEGRIYPKSDELLTTELR IFALIRLGVIDSNKIAHFLGYSLATIYNYRSRMRNKAAGDKDRFEQDVMNL >gi|222159287|gb|ACAB01000072.1| GENE 31 27665 - 29518 1537 617 aa, chain + ## HITS:1 COG:lin2231 KEGG:ns NR:ns ## COG: lin2231 COG0366 # Protein_GI_number: 16801296 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Listeria innocua # 127 615 139 587 591 162 28.0 2e-39 MKKNFLFVILLVLLFGCSPVYAATVIKKVVPAFWWAGMKNPELQVLLYGEHIAFAEVSIS SEDITLQEVVKLENPNYLIVYLNISEAAPQTFNMILKQGKKQTVIPYELKERKPGSSQIE GFNSSDVLYLIMPDRFANGDSSNDIISGMLEARVDRNDSFARHGGDFKGIEKHLDYIADL 
GVTSIWLNPIQENDMKEGSYHGYAITDYYQADRRFGNNEEFRNLVDQAHAKGMKVVMDMI FNHCGSENYLYKDMPSKDWFNFKGNYVQTTFKTATQLDPYASDYEKKIAVDGWFSQVMPD FNQRNRHVATYLIQSSIWWIEYAGINGIRQDTHPYADFDMMAHWCKAVNEEYPKFNIVGE TWLGSNVLISYWQKDSKLTYPKNTYLPTVMDFPLMEHMNKAFDEETTDWNGGLCRLYEYL SQDIVFANPMNLLTFLDNHDTSRFYRSEADTKNLNRYKQALVFLLTTRGIPQIYYGTEIL MAADKANGDGLLRCDFPGGWQNDTHNYFDAANRTPLQNEAFSYLKKLLQWRKGNEVIAKG KLKHFAPSKGIYAYERKQGDKSVVVLLNGTDREQTISLDTYREILPDTSAYNVLEDKKVE LGKDLTLPSRGIYLLSF >gi|222159287|gb|ACAB01000072.1| GENE 32 29717 - 31927 2203 736 aa, chain + ## HITS:1 COG:no KEGG:BT_3703 NR:ns ## KEGG: BT_3703 # Name: susB # Def: alpha-glucosidase SusB # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 736 4 738 738 1446 94.0 0 MKKNLLLIAFFCISFIANAQQKLTSPDGNLVMTFQVNKEGAPTYDLIYKGKTVIKPSTLG LELKKEDNTRTDFDWVDRRDLTKLDSKTNLYNGFEVKDVKTSTFNETWQPVWGEEKEIHN HYNELAATLYQPMNDRSILIRFRLFNDGLGFRYEFPQQKSLNYFIIKEEHSQFAMAGDHI AYWIPGDYDTQEYDYTISRLSEIRGLMKEAITPNSSQTPFSQTGVQTALMMKTDDGLYIN LHEAALIDYSCMHLNLDDKNMVFESWLTPDAKGDKGYLQTPCNTPWRTVIVSDDARNILA SRITLNLNEPCKITDAASWVKPVKYIGVWWDMITGKGSWAYTDELTSVKLGETDYSKTKP NGKHSANTANVKRYIDFAAANGFDAVLVEGWNEGWEDWFGNSKDYVFDFVTPYPDFDVKE VHRYAASKGIKMMMHHETSASVRNYERHMDKAYQFMVDNGYNSVKSGYVGNIIPRGEHHY GQWMNNHYLYAVTKAADYKIMVNAHEATRPTGICRTYPNLIGNESARGTEYESFGGNKVY HTTILPFTRLVGGPMDYTPGIFETHCNKMNPANNSQVRSTIARQLALYVTMYSPLQMAAD IPENYERFMDAFQFIKDVAIDWDETNYLEAEPGEYITIARKAKGTGDWYVGCTAGENGHT SKLVFDFLTPGKQYIATVYADAKDADWKENPQAYTIKKGILTNKSKLNLRAANGGGYAIS IKEVKDKAEVKGLKKF >gi|222159287|gb|ACAB01000072.1| GENE 33 32015 - 35038 3235 1007 aa, chain + ## HITS:1 COG:no KEGG:BDI_1558 NR:ns ## KEGG: BDI_1558 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 8 1007 11 989 989 1156 59.0 0 MKKSLRLKALLTLLVGLFLSIGAFAQQIAVKGHVKDTTGEPVIGANVLVKGTTNGTITDF DGNFMLNVPKDAILSVSFVGYKSAEVKAASTVMVTLEDDSQVLDAVVVIGYGSVKKNDMT GSVTAIKPDKLNKGLITNAQDMMTGKIAGVSVISKGGAPGEGATIRIRGGSSLTAENDPL IVIDGLAMDNKGVKGLANPLSMVNPNDIESFTVLKDASATAIYGSRASNGVIIITTKKGQ 
AGARPTISYDGNVSVSTVKSTVDVMDGDQFRSFIKDIWGEDSEAYSKLGNANTDWQKEIF RPAVSTDHNLTISGGLKNMPYRVSFGYTNQNGIVKTSKFERYTASVSLAPSFFEDHLKVN ANLKGMIAKNRYADGSAVGSAVSFDPTQSVRSDDPYHQYYFDGYFQWNTDASSLNDDTWK RTFNSNAPGNPVALLEEKDDRAISKSLIGNLELDYKFHFLPDLHAHVNGGMDLSTGKQYT DVSPYSSTNNYYGSYGWEQKDKYNLSLNAYLQYSKDFTDKHRFDVMAGYEWQHFHDTSDQ EYWGLYPLSNNVVENRGQRYNNTSSGSATESYLVSFFGRVNYTLLDRYLFTATVRQDGSS RFHKNNRWGLFPSFALGWKLKEEAFLKDVDVLSDLKLRLGYGITGQQNINSGDYPYLAVY ETNKDGAYYPILGEGTTYRPNAYNPDLKWEKTTTYNVGLDFGFLNNRINGAVDYYYRKTT DLLNSVFVSAGTNFKNKVLSNVGSLENSGIEFSINSKPVVTTDWTWDLGFNITYNKNEIT KLTTGDSENYYVAAGDNIGGGRDMKAMAHAVGHPASSFYVYQQVYDENGKPIENEFVDRN GDGTINGDDRYFYKKPTADVLMGLTSRLSYKSWDFSFSLRASLNNYVYNSVEAGGSDCNP TSVYSFGALNNRPLMGVANNIQNLKDNTLLSDYFVQNASFMKCDNITLGYSFKKLFGAPI GGRVYAAVQNVFTITKYKGLDPEVEKGLDNNIYPRPLTTLIGLSLNF >gi|222159287|gb|ACAB01000072.1| GENE 34 35059 - 36657 1596 532 aa, chain + ## HITS:1 COG:no KEGG:BT_3701 NR:ns ## KEGG: BT_3701 # Name: susD # Def: SusD, outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 532 1 550 551 448 46.0 1e-124 MKFRYIKSILSAASLLLAVSVTSCIGDLDATPIDPNIVMTFDQASVFNKIYGTLGLTGQK GPDGSGDLDDIDEGTSSFYRMTWCANQLMTDEAIVNSWNDAGVATIADCSWSSSNEIVTG LYYRLTFDITLCNYFLEQTEGLTDDETTRQRAEVRFIRALNNYYLMDMFGNPPYCDKVST EKPQQIQRADLFAKIEEELKEIGDDATATLAQPLQTTYGRVDRVAAWLLLARMYLNAEVY TGTPQWGKAKEYAQKVIGSGYKLAPVYKHLFMADNDGSNVNKARQEVILPILQDGVRTKS WGGSLFLIAGTHKSDTGMNPWGSKQGWGGPHCRQAMVAKFFPNVNDAPSVLEDQMVVAAK DDRALMCGVQRSTSTGDNMSFTDGFACAKFSNIRADNGLTSDTDNPDMDIPLLRMAEAYL IVAEASIRANNGVSTQETIDAMHEIRKRANAAESDSYTLTDIRDEWAREFWFEGRRRIDL IRFGDFGGHTDYNWDWKGGEKQGTEIKEFRNIYPIPANDINANTNLDQNPEY >gi|222159287|gb|ACAB01000072.1| GENE 35 36696 - 37844 1018 382 aa, chain + ## HITS:1 COG:no KEGG:BT_3700 NR:ns ## KEGG: BT_3700 # Name: susE # Def: outer membrane protein SusE # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 347 1 325 387 142 33.0 2e-32 MKKINILTSILVATALLTACEDDRDSNPTIQEPTTFVLNTPANATYNVYDLNQSKNIELT CTQPDYGYPAVVTYTMQADLTDKWTDETETADASYLTLPFISTSAKVDANTQELNKAIVK 
LAGWTSENDYDGEPMSVFVRLYAHIGDKGYPIHSNSIELKVIPYYMDISDAVPATYYLLG DFIGEVPWGNPTMAAGTAYFPMSLVKGYAYDANTGKGEFTYTGYIPADKGFKVVATPGAW DDQWGNADSEGFTNLVNDKNSQNIKVNAAGWYTLHLNTLENKLTMKATTFETQPTEYDAV TLIYGSESVNMTKVTMEHSHVWYADITIAASCKAKFTSGDKTWGGEVFPFGSFVDGATIS CKPGDYTVLFNELDECYYFKAK >gi|222159287|gb|ACAB01000072.1| GENE 36 37868 - 39232 1038 454 aa, chain + ## HITS:1 COG:no KEGG:PRU_2684 NR:ns ## KEGG: PRU_2684 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 16 446 1 433 445 130 28.0 8e-29 MKKLSIYITVLLAAALTACNEDFNEGVASPQSYGQEEAAGKITFTATGVAPINIGNVEEE SVAVAVFTAPAVEEEATLSYKMKLDNKVTLIVDDKGYVATEDLQNAVAQIYGIRPVERTM NAVLTSYVAVGKTVYAAPAESYELKVTPEAPVIESAYYINGSLTWEQNVAFVNTSGDPYT NSVFTTTVPALVTDNTGAKDAYFLIKSNSGKSLGAVDADNDAPEGNLILSETANPIKISG GDYKSVRISINMMEGTYKIEKTMDAPHLWIPGNHQEWKPSQAPTLYKPEGNNAYWGMSEL NGGFKFTAQPYWPDGSNGGLDYGYDYFTSKEGMTNDGGNLSLPQGIYYIVVNLDDKYVSA TEITSMDIIGTATGGWENGKELTYDATEKCWTAKTEMTVGEFKLRVNSSWDSGVSFGGTL DTPSPFAGDNISMDATGNYTIKFYLNGRLVLIKE >gi|222159287|gb|ACAB01000072.1| GENE 37 39250 - 41631 1834 793 aa, chain + ## HITS:1 COG:all0875 KEGG:ns NR:ns ## COG: all0875 COG0296 # Protein_GI_number: 17228370 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Nostoc sp. 
PCC 7120 # 207 764 13 527 552 166 27.0 2e-40 MKKDIKHIFRMLLCLFTIVACSDENHEMLITGPEIPELDHEPSIEELVEGINNYPTKFKG DKQGKIWYKAAAGDDLYGYEGDVYVHIGINDWMYVPTEWGVNEDKYKAEKVADNIWCFTL APTVREWFGAENAANIQNVCILFRNDEALTDEKKTSDFFITVTGDKTFTPAPVEIAACPV EEAGIHPSADGTSVTFALYDLDTDNNYKDYACLIGDFNDWELSTEYQMKRDNDKHFWWYT VTGIDPAKEYGFQYYMGSEKDGNVRIGDPYCEKVLDGSNDKYLVEQGVYPASAIQYPNGK TTGIVSVFQTKPASYNWQVSDFKIDNPDNMVIYELLFRDFTQVGSELATGTIKEATKHLD YIKSLGVNAIELMPIQEFDGNNSWGYNPCYYFAMDKAYGTKEEYKQFIDECHKQGIAVLL DVVYNHATGSHPFAKLYWNSKESKTAKNNPWFNVDAPHPFSVFHDFNHESPLVREFVKRN LKFLLEEYKFDGFRFDLTKGFTQNKSNESTASNKDDSRIVILKDYYKTVNTTNPNAVMIL EHFCNLDEESELAKAGMKLWHNMNESYCQSGMGESSNSDFSYMRNSGMPAEGWVNFMESH DEERVAYKQTAFGNLQNAGLDIRMKQLGTNAAFFLTVPGPKMIWQFGELGYDYSIMYKYD GTMGTEKNTDAKPVKWDYLTDQYRKGLYDTYSTLLKLRNDNPDLFSDNAFKDWKVSVSDW DKGRYLRLESTTKKLVVVGNFKNEQINTGVYFGNTGDWYELNGETLNVTNSSEQPVVIPA NSFKLYTNFPVNN >gi|222159287|gb|ACAB01000072.1| GENE 38 41893 - 42657 594 254 aa, chain - ## HITS:1 COG:NMA0723 KEGG:ns NR:ns ## COG: NMA0723 COG2908 # Protein_GI_number: 15793700 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis Z2491 # 1 233 1 222 240 89 29.0 8e-18 MKNVYFLSDAHLGSRAIEHGRTQERRLVNFLDSIKHKASAVYLLGDMFDFWYEFRLVVPK GYTRFLGKISELTDMGVEVHFFIGNHDIWCGDYLTKECGVIMHREPLTTEIYGKEFYLAH GDGLGDPDKKFKLLRSMFHSKTLQTLFSAIHPRWSVELGLSWAKRSRQKRTDGKEPDYMG ENQEHLVLYTKEYLKSHPNINFFIYGHRHIELDLMLSVTSRVLILGDWINFFSYAVFDGE NLFLEEYIEGETQV >gi|222159287|gb|ACAB01000072.1| GENE 39 42710 - 43021 465 103 aa, chain - ## HITS:1 COG:CC1859 KEGG:ns NR:ns ## COG: CC1859 COG2151 # Protein_GI_number: 16126102 # Func_class: R General function prediction only # Function: Predicted metal-sulfur cluster biosynthetic enzyme # Organism: Caulobacter vibrioides # 6 100 21 115 118 97 50.0 5e-21 MEKIEIEEKIVAMLKTVYDPEIPVNVYDLGLIYKIDVSDTGEVVLDMTLTAPNCPAADFI MEDIRQKVESVEGVTAATINLVFEPEWDKDMMSEEAKLELGFL >gi|222159287|gb|ACAB01000072.1| GENE 40 43202 - 43897 449 231 aa, chain - ## HITS:1 COG:MA1979 KEGG:ns NR:ns 
## COG: MA1979 COG2003 # Protein_GI_number: 20090827 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Methanosarcina acetivorans str.C2A # 5 231 4 228 229 134 29.0 2e-31 MESKHKLSINQWALEDRPREKMMEKGAAALSDAELLAILIGSGNTEESAVELMRRLLLSC DNNLNSLAKWEVCDYSRFKGMGPAKSITVMAALELGKRRKLQNTKERPQITCSKDIYDIF QPLMCDLEQEEFWVLLLNQATKLIDKVRISTGGIDGTYTDVRTILREALLQRATQIAVVH NHPSGNIRPSQPDKTLTEHIRKAADTMNIHLIDHVVVCEDGFFSFADEGLL >gi|222159287|gb|ACAB01000072.1| GENE 41 44148 - 45137 799 329 aa, chain - ## HITS:1 COG:SA0248 KEGG:ns NR:ns ## COG: SA0248 COG0463 # Protein_GI_number: 15925961 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Staphylococcus aureus N315 # 1 109 2 106 573 77 38.0 4e-14 MRYSVIIPVYNRPDEVDELLQSLTMQHFKDFEVVVVEDGSFIPCKEVVERYADRLNIKYF SKPNSGPGQTRNYGAERSEGEYLIILDSDVILPEGYFDAVEKELLASPADAFGGPDRAHD SFTDIQKAINYSMTSFFTTGGIRGGKKKMDKFYPRSFNMGVRRAVYEALGGFSKMRFGED IDFSIRIFKNGYTCRLFPDAWVYHKRRTDLKKFFKQVHNSGIARINLYKKYPDSLKLVHL LPAVFTLGVALLLLGTPFCLFSFTLIILYALLVCMDSTIQNKSLTIGIYSIAAAFIQLIG YGTGFWRAWWQRCVRGKDEFEAFQKNFYK >gi|222159287|gb|ACAB01000072.1| GENE 42 45635 - 48331 2543 898 aa, chain - ## HITS:1 COG:no KEGG:BVU_1478 NR:ns ## KEGG: BVU_1478 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 893 1 947 952 256 30.0 3e-66 MKKKFVKVMLFGALALSTISYVGCKDYDDDIDGLQQQVDANKSKIEEVDAAMKAGKFIVS YTPVANGYELSLSDGSKLTITNGKDGENGQPGLNGETVIPKFKVSSDNYWQVSTDDGKTY EYVLDADGNKVNATGKKGEQGEPGKPGEPGQDASANVSINKDGYIVIGDVVTSLKTDTKV PSIVINEVDGLYVITLDGKEYKMLAEGSAYNGLQSIIYRRQAANDKDDFVKSIGLMSSND PDAELLATSASVATFKVWPTTLDLGKAVFAFTDTYKTRALVPALKYVEGSAKWVDNRQGI LSVNMISENIEVDKPYASSLDVTINGYTTASDYFNFKAEKWTPADLSFVHTKDAVEVPTD VLADATTESDFSSYVEYTFVYNKSYNLNDSVALGHDVEEDFISMADLGFSGITVEFKQTA DKAKGIFEIKDGVITAKSSEQASAINELCYVTATYKNEKGTEIISYDFAVKAVREQQVAP ALVDIDVETKDAAAADKLAKLQYSTSEQTIDLNVRAFLNSLGGRDYMSDNENVTPVKYGL 
YYLTNEDGKTYANKVDAYIEFTPGTSTDLDNLKLIFPAETVINGDQQLYSFETWPGGGNS KYEITNSEEYAWRKSTTTHRVNGSNKQYTLKLKDLVKCERKVNIKQNSAFVVNGKTTITG AWTEADKSFSMTADLTVLYAAYNTDNSTSSEVIEYYLAPQAKQSDAVKAVYSQISIAGNV ITVTPDVDVKTLGAIKIGAKIQGTDIEATILDINGKEVTYCEPVLRSPLDVLSYKSSIDW NIDGDQNKTINVGTKAEIKMSDKDINVTKKNVVIENGVAKAPWAAAYGIDAAGAITYAID GYSATVNEGTYTINPSTGVITCNNTAIVQSLDIYVKVTVNHNWGTETINKITVKASLK >gi|222159287|gb|ACAB01000072.1| GENE 43 48706 - 49641 490 311 aa, chain - ## HITS:1 COG:no KEGG:BT_1503 NR:ns ## KEGG: BT_1503 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 311 25 336 336 501 84.0 1e-140 MLKVVAFMRQVANGLQMGGNFGTAHVYRSSLNAIIVYCGKGDFTFDKVTPGWLKGFEIHL RSRGCSWNTVSTYLRTFRAVYNRAVDLRKAPYVPNLFHSVYTGTRADRKRALCEEDIKKV FAKLHSSVATPALRCTQELFTLMFLLRGLPFVDLAYLRKSDLHDNVITYRRRKTGRTLSV TLTPEAMLLVKKYMNCDPSSPYLFSLLKSREGTKEAYREYQLALRTFNRQLMLLGELLGL GDKLSSYTARHTWATTAYYCEIHPGIISEAMGHSSITVTETYLKPFQNKKIDEANKRVID FIRYSVRNAIC >gi|222159287|gb|ACAB01000072.1| GENE 44 49846 - 51045 1399 399 aa, chain - ## HITS:1 COG:TM0274 KEGG:ns NR:ns ## COG: TM0274 COG0282 # Protein_GI_number: 15643044 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Thermotoga maritima # 1 399 1 400 403 461 57.0 1e-129 MKILVLNCGSSSIKYKLFDMTSKEVIAQGGIEKIGLKGSFLKLTLPNGEKKILEKDIPEH TVGVEFILNTLIHPEYGAIKSLDEINAVGHRMVHGGERFSESVLLNKEVLEAFTACNDLA PLHNPANLKGVNAVSAILPNIPQVGVFDTAFHQTMPDYAYMYAIPYELYEKYGVRRYGFH GTSHRYVSKRVCEFLGVNPVGKKIITCHIGNGGSIAAIKDGKCIDTTMGLTPLEGLMMGT RSGDIDAGAVTFIMEKEGLNTTGVSNLLNKKSGVLGISGVSSDMRELLAACAAGNEKAIL AEKMYYYRIKKYIGAYAAALGGVDIILFTGGVGENQMECRREVCKDMEFMGIELDNDVNA KVRGEEAIISTPASKVKVVVIPTDEELLIASDTMDILKK >gi|222159287|gb|ACAB01000072.1| GENE 45 51086 - 52105 1209 339 aa, chain - ## HITS:1 COG:CAC1742 KEGG:ns NR:ns ## COG: CAC1742 COG0280 # Protein_GI_number: 15895019 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Clostridium acetobutylicum # 2 332 1 329 333 319 53.0 5e-87 
MLNLINQIVARAKANRQRIVLPEGTEERTLKAANMILTDEVADLILLGKPAEINELAAKW GLGNIGKATIIDPETSPKHEEYAQLLCELRKKKGMTIEEARKLTNDPLFFGCLMIKSGDA DGQLAGARNTTGNVLRPALQIIKTAPGITCVSGAMLLLTHAPEYGKNGILVMGDVAVTPV PDANQLAQIAVCTAQTAKAVAGIENPKVAMLSFSTKGSAKHEVVDKVVEATKIAKEMAPT LDLDGELQADAALVPEVGASKAPGSEVAGQANVLIVPSLEVGNISYKLVQRLGHADAIGP ILQGIACPVNDLSRGCSIEDVYRMIAITANQAIAAKANK >gi|222159287|gb|ACAB01000072.1| GENE 46 52302 - 52745 333 147 aa, chain + ## HITS:1 COG:NMA0437 KEGG:ns NR:ns ## COG: NMA0437 COG1380 # Protein_GI_number: 15793442 # Func_class: R General function prediction only # Function: Putative effector of murein hydrolase LrgA # Organism: Neisseria meningitidis Z2491 # 1 109 3 111 114 95 49.0 3e-20 MIRQCAILFGCLALGELIVYLTGIKLPSSIIGMLLLTLFLKLGWIKLHWVQGLSDFLVAN LGFFFVPPGVALMLYFDVIAAEFWPIVTATIISTALVLVVTGWVHQIVRKFRLARQIKLA RKLHLTDFHLSEKLHLKDKINLTNKDK >gi|222159287|gb|ACAB01000072.1| GENE 47 52742 - 53437 717 231 aa, chain + ## HITS:1 COG:NMA0436 KEGG:ns NR:ns ## COG: NMA0436 COG1346 # Protein_GI_number: 15793441 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative effector of murein hydrolase # Organism: Neisseria meningitidis Z2491 # 10 229 11 229 230 177 45.0 2e-44 MSFLENNFFLLAITFGIFFFAKLLQKKTGLVLLNPILLTIALLIIFLKMTNISYETYNKG GHLIEFWLRPAVVALGVPLYLQLEMIKKQLLPILLSQLAGCIVGVISVVLIAKFMGASQE VILSLAPKSVTTPIAMEVTKAIGGIPSLTAAVVVAVGLLGAICGFKTMKIMRVGSPIAQG LSMGTAAHAVGTSTAMDISSKYGAYASLGLTLNGIFTALLTPTILRLLGVL >gi|222159287|gb|ACAB01000072.1| GENE 48 53494 - 54411 1050 305 aa, chain + ## HITS:1 COG:jhp0277 KEGG:ns NR:ns ## COG: jhp0277 COG4866 # Protein_GI_number: 15611347 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Helicobacter pylori J99 # 4 293 2 286 290 160 36.0 3e-39 MIPFKDITLADKDTITSFTMKSDRRNCDLSFSNLCSWRFLYDTQFAVVDNFLVFKFWAGE QLAYMMPVGTGDLKAVLWELIEDARKENQHFCMLGVCSNMRADLEAILLEQFTFTEDRDY ADYIYLRSDLSTLKGKKFQAKRNHINRFRNTYPDYEYTPITPDRIQECLDLEAEWCKVNN CDQQEGTGNERRALIYALHNFEALGLTGGILHVNGKIVAFTFGMPINHETFGVHVEKADT 
SIEGAYAMINYEFANRIPEQYIYINREEDLGIEGLRKAKLSYQPATILEKYMACLKEHPM NMVKW >gi|222159287|gb|ACAB01000072.1| GENE 49 54428 - 55441 615 337 aa, chain + ## HITS:1 COG:BH1812 KEGG:ns NR:ns ## COG: BH1812 COG4552 # Protein_GI_number: 15614375 # Func_class: R General function prediction only # Function: Predicted acetyltransferase involved in intracellular survival and related acetyltransferases # Organism: Bacillus halodurans # 41 321 50 343 386 73 25.0 8e-13 MIKEQVKALWKICFDDSEEFVEMYFRLRYKTEVNVAIQSGDEVISALQMLPYPMTFGGET VQTSYISGACTHPDFRSKGVMRELLSQSFARMLRNGVHFSTLIPAEPWLFDYYARMGYAS VFKYSTKEIVLPEFIPAKEIAVSVVSEFQEEVYSYLNKKLSERACCIQHTSEDFQVIMTD LAISGGYLFVARQENEIKGITIIYKGDKHIIINELCAENKDVEYSLLYAIRQHTGYKCMV QILPPEEKQPQHPLGMARIINAKEVLQIYAAAFPEDEMQLELSDKQLSVNNGYYYLCKGK CMYSTERLPGTHIQMNISELTDRILQPLKPYMSLMMN >gi|222159287|gb|ACAB01000072.1| GENE 50 55527 - 56660 1105 377 aa, chain - ## HITS:1 COG:YPO0840 KEGG:ns NR:ns ## COG: YPO0840 COG4225 # Protein_GI_number: 16121148 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Yersinia pestis # 56 373 47 350 352 167 32.0 2e-41 MKKLSATLLSALFLGGAICAGCADKKDTSSEEVINIIHKVNNYWQANHPEHGRSFWDNAA YHTGNMEAYFLTKKPEYLDYSKAWAEHNEWKGAKSDNKADWKYSYGESDDYVLFGDYQIC FQTYADLYNLEPDTQKIARAREVMEYQMSTPNRDYWWWADGLYMVMPVMTKMYNITKNPL YLEKLHEYLAYADSIMYDEEAGLYYRDGKYVYPKHKSVNGKKDFWARGDGWVLAGLAKVL KDLPETDQYRQEYMDRFRTLAKSVAACQQPEGYWTRSMLDPQHAPGPETSGTAFFTYGLQ WGINNGFLNAAEYQPVVEKAWKYLSTVALQPDGKIGYVQPIGEKAIPGQVVDANSTSNFG VGAFLLAACERVRHLNQ >gi|222159287|gb|ACAB01000072.1| GENE 51 56665 - 57960 876 431 aa, chain - ## HITS:1 COG:no KEGG:BT_3686 NR:ns ## KEGG: BT_3686 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 431 14 431 431 766 85.0 0 MKRILLLLCGIILVPTLVCSQRLVEVGKGYSCTSVNTAVFRNNSLVTHGDEQYISYYDND GYLILGKRKLDSKQWTLHRTQYQGNVKDAHNVISMMVDGEGYIHVSFDHHGHKLNYCRSI 
APGSLKLGDKIPMTGVDEGNVTYPEFYSLSGGDLLFVYRSGSSGRGNLVMNRYSLKDRKW TRIQDILIDGENKRNAYWQMYVDEKGTIHLSWVWRESWHVETNHDICYARSFDNGVTWYK SSGEQYELPIRLSNAEYACRLPQNCELINQTSMSADAEGNPYIATYWRDSDSDVPQYRIV WNDGKVWHQRQITDRKTPFTLKGGGTKMIPIARPRIVVEGGEVFYIFRDEERGSRVSMAH ASDVGISKWTITDLTDFSVDAWEPSHDTELWKKQRKLHLFVQHTRQGDGERTAEIDPQMV YVLETDMNINK >gi|222159287|gb|ACAB01000072.1| GENE 52 57975 - 58997 960 340 aa, chain - ## HITS:1 COG:no KEGG:BT_3685 NR:ns ## KEGG: BT_3685 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 338 1 338 355 647 90.0 0 MTRLLLLFVACIGLLPISVQAQKNTDFTPGQVWKDTDGNPINAHGGGLLYHNGTYYWYGE YKKGKTILPDWATWECYRTDVTGVGCYSSKDLLNWKFEGIVLPAVKDDPNHDLHPSKVLE RPKVIFNKKTGKFVMWAHVESADYSKACAGVAVADSPVGPFVYQGSFRPNNAMSRDQTVF VDDDGRAYQFYSSENNETMYISLLTDDYLKPSGRFTRNFVKESREAPAVFKYNGKYCMLS SGCTGWDPNVAEIAVADSIMGTWKTIGDPCTGPDADKTFYAQSTYVQPVIGKKDAYIAMF DRWKKKDLEDSRYVWLPVLVKDGKITIPWHEKWTLSIFDK >gi|222159287|gb|ACAB01000072.1| GENE 53 59026 - 60282 928 418 aa, chain - ## HITS:1 COG:TM1061 KEGG:ns NR:ns ## COG: TM1061 COG4289 # Protein_GI_number: 15643819 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Thermotoga maritima # 37 417 8 385 387 285 40.0 9e-77 MNLNRHYVLLLLMVILMIPSQDLSAKKKKQVKEPTDRELWAEVLYRMAEPVLSNMSEGKL QQNMLVELSPTWDGRNKKVTYMECFGRLMAGLAPWISLPDDDTSEGIWRKQLREWALKSY AQAVDPESPDYLLWRKEGQTLVDAAYIAESFIRGYDALWLPLDSVTKQRYITEFTQLRRV DPPYTNWLLFSATVEAFLRKAGAPSDTYRIASALRKVEEWYVGDGWYSDGKDFAFDYYNS FVLHPMYIEPLEIMTNSGKNKVWNMPECDYNRAVKRMQRFGMILERFISPEGTLPVFGRS ITYRTGTLQPLALLAWREWLPKELPDGQVRAAMTAVIKRMFGDDRNFNEKGFLTLGFNGK QPNISDWYTNNGSLYMASLAFMPLGLPADHPFWTSPAEDWTSKKAWEGNDFPKDHAFH >gi|222159287|gb|ACAB01000072.1| GENE 54 60406 - 62310 1128 634 aa, chain - ## HITS:1 COG:TM0024 KEGG:ns NR:ns ## COG: TM0024 COG2273 # Protein_GI_number: 15642799 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucanase/Beta-glucan synthetase # Organism: Thermotoga maritima # 25 283 210 457 642 151 36.0 
5e-36 MKKNNVYILLLLLLTLPLLATARNDDWRLVWSDEFNTEGRLSPSVWNYEQGYVRNEEAQW YQTDNAFCKGGFLVIEARKEQGRRNPLYVCGSNDWRKKREFVEYTSSSVTTAGKKEFLYG RFEVRARIPVAKGAWPAIWTLGSNMEWPSCGEIDIMEYYQIKGTPHILANAAWGTDKQWG AKWNSKAIPYVYFTEKDPNWASKFHIWRMDWDEEAIKLYLDDELLNEIPLKDAVNGSIGR GTNPFMKPQYLLLNLAIGGINGGPIDESALPMKYEIDYVRVYQKEKNIVSGKVWRDTDGN VINAHGGGVLYHEGKYYWFGEHRPESGFVTEKGINCYSSIDLCKWKPEGIVLAVSGEEGA DIEKGCIMERPKVIYNKKTGKFVMWFHLEKKGNGYGSACAAVAVSDSPTGSYRFIRSGRV NKGVYPLNMDTKEREIEWDFSKYKEWWTPEWYSAIEKGLFLKRDMEGGQMSRDMTLFVDD DGKAYHIYSSEDNLTLQIAELSDDYLGHTGRYIRIFPGGHNEAPAIFKKDGTYWMITSGC TGWEPNKARLLTATSILGEWKQLPNPCVGEKADKTFGGQGTYSFPLQGKEDRFVFMADSW CPESLSDSRYIWLPIQFNEKGIPFIEWIDRWKIY >gi|222159287|gb|ACAB01000072.1| GENE 55 62391 - 63866 1105 491 aa, chain - ## HITS:1 COG:TM0024 KEGG:ns NR:ns ## COG: TM0024 COG2273 # Protein_GI_number: 15642799 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucanase/Beta-glucan synthetase # Organism: Thermotoga maritima # 205 488 196 457 642 107 29.0 6e-23 MKMNKIWMVACMATAFIGGVYGCDDDKDFSYEGQLDLNLLNLTQARDVWDGAECNVTTAL QQSEDETTVKNFTYKLNLSLYQGRKAEQEAKVNLVVNKDTLNKVLAKVPEGGIYAKYDGA ELLPESYYHLSSQTLTLQAGETKSDAVSVTVYSSELITACQEAERNLLYVLPLTIESSSS YGVNSKTNTLMLLFNVTYVKPEEPEDQEAYMPDKVGIPDDHELENGMKLLWHDEFNGTGE PNPDIWQFETGFVRNEEDQWYQKENAKMKDGALLIEGRIEAVKNPNYQKGSSDWKKNREY SEYTSSCILTKPGYVFKYGRMEVRAKIPVEQGAWPAIWSTGNWWEWPLGGEIDMLEFYKE KIHANVCWGGNKRWEGTWNSKNYPITNFTSKDKQWSDKYHLWVMDWDKDFIRIYLDGELL NETDLTLTVNKGDNGAGQGGYQNPYSNDYEGFGQRMMLNLAIGGINGRPVDNTAFPLKYH IDYIRIYQSKK >gi|222159287|gb|ACAB01000072.1| GENE 56 63885 - 64778 709 297 aa, chain - ## HITS:1 COG:no KEGG:BF2940 NR:ns ## KEGG: BF2940 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 14 293 16 294 300 141 30.0 3e-32 MKLNIKHMIAAFTFLGMVGCDKNFEEFDIVGGGAPAAVELSTVIPEALPGQIKLSWKAPE GDYAYMQIRYYDPLQKKNVCKIASKGTTEMLIENTRARFGDYTFYLQTFNAAHEAGTVQE LKARSGAMPASYTEKSRAQVSLTVDQLSCNYPDASEGYFDRLIDGKTADPSFFHTNWHSP QVDLPHYIQIDLKEEHENFAFEYYTRDTGNSDGFPTSAELQISTDGEHWETVSTLTGLPT 
TRQTKYASDFVMPGKKFKYFRFNVLTSSQNKKYFHIGEIAFFDADIEKYDPETVPLD >gi|222159287|gb|ACAB01000072.1| GENE 57 64806 - 66443 1135 545 aa, chain - ## HITS:1 COG:no KEGG:PRU_2229 NR:ns ## KEGG: PRU_2229 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 16 545 17 559 559 408 42.0 1e-112 MKLKNICLGIACIAVVSACSDKMDYHEYTSYDKGYVFSDFSRTAGFVNNIYSYLDSDLPS YQSMASACDEAEMAVTYSSVLDYTNGAWSALNPKSLWGYYSGIRAANYYLEESKELDFYD LRYAQDYEAQMNRFNRYQYEVRLLRAYYYFLLVRAYGDVPFTQNVLTEKEANSLSRTPSS EIFDFIISECEEVAPELPVSYSALDNDAAGGSNPEAGRVTQGTALALKARAALYRASKLF SGGEDRNLWREAAMANKAVIDYCTANGIRLGKYTDIWGTDNYQASEMIFVRRIGDTSSPE YTNFPVGMENANSGNCPTQTLVDAYESKANGDKDPRFSMTIACNGDKWPNTNPNPLETFI GGKNGLPLPYATPTGYYLKKYLDASTDISAESGSGGKRHNWVIFRLGEFYLNYAEAIFRY LGSAGTTDNEFKMSACAAVNKVRQRSDVQMPDFPDDISNTDFWERYKKERMVELAFEQHR FWDVRRWKEGGFINIGRMEITKNADGSFRYNRINKPLVWDDKMYFFPIPASEMRKNPNLN QNPGW Prediction of potential genes in microbial genomes Time: Wed May 18 02:51:38 2011 Seq name: gi|222159286|gb|ACAB01000073.1| Bacteroides sp. D1 cont1.73, whole genome shotgun sequence Length of sequence - 2011 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 2009 1480 ## PRU_2228 hypothetical protein Predicted protein(s) >gi|222159286|gb|ACAB01000073.1| GENE 1 2 - 2009 1480 669 aa, chain - ## HITS:1 COG:no KEGG:PRU_2228 NR:ns ## KEGG: PRU_2228 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 3 666 261 918 944 404 37.0 1e-111 DIATLSFRGGSTKMRYYTMMNLQNNRGFIKNFDTNADYSTQEKYSKANFRTNLDIDLSPK TKMQANIMGILNEFSRPGMGSDNLIAKLYQLPSAAFPIRTESGLWGGNTTWGENWNPVAL TEGRAYSKGHTRGLYADMSLRQDLSSLTKGLGASVRMGYDNLASYWENHTKGYKYGMVSV ASWENGLPVAGEEITGGKDTEMSGDSKLDWQYRAFNFQLNVDWQRQFGAHSLYSMLLYTY KYDNAKGINNTFYRQNAGWYTHYGFKNRYFADFTLMASASNLLAPDHRWNVSPTVGLAWL ISNEKCMQSQNVVNFLKLRASFGMLNTDNIPGNGYWNETVGGGNGYPINNNFGGDGGWHE GRLASVNGTTEKAYKYNVGVDATLFKGLTLTLDGFYERRSDIWVSSDGQNSAVLGATSPY VNAGIVDSWGTEIGANYCKKIGNVEFNLGGTFTYNRSKIIEMLEEPAAYDYTRSTGNPVG QIFGLQAIGYFVDQADIDNSLPQQFGPVKAGDIKYKDMNGDKVINSDDRVAMGYNSTCPE TYYSFSLGLEWKGLGFSAQFQGVGNYTAILSGTYYHPLVDNTTISNYVYRNRWTPETPNA RFPRLTTETVDNNLQTSSLWLADRSFLKLRNCEVYYKLPSSWLNKFWMKIAKVYVRGVDL LCFDSIDQL Prediction of potential genes in microbial genomes Time: Wed May 18 02:51:50 2011 Seq name: gi|222159285|gb|ACAB01000074.1| Bacteroides sp. D1 cont1.74, whole genome shotgun sequence Length of sequence - 18277 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 5, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 934 869 ## BT_3297 hypothetical protein 2 1 Op 2 . - CDS 968 - 2899 1343 ## PRU_2227 putative lipoprotein 3 1 Op 3 . - CDS 2915 - 6043 2523 ## PRU_2226 hypothetical protein - Prom 6081 - 6140 5.9 - Term 6110 - 6151 -0.9 4 2 Tu 1 . - CDS 6234 - 10322 2547 ## COG0642 Signal transduction histidine kinase - Prom 10410 - 10469 5.6 5 3 Op 1 . - CDS 11303 - 13252 1805 ## BT_3661 alpha-glucosidase 6 3 Op 2 . - CDS 13289 - 15754 2037 ## COG3534 Alpha-L-arabinofuranosidase - Prom 15796 - 15855 8.5 - Term 16047 - 16093 3.1 7 4 Tu 1 . 
- CDS 16132 - 16995 723 ## BT_3655 arabinosidase - Prom 17070 - 17129 2.6 8 5 Tu 1 . - CDS 17133 - 18275 1105 ## COG1874 Beta-galactosidase Predicted protein(s) >gi|222159285|gb|ACAB01000074.1| GENE 1 1 - 934 869 311 aa, chain - ## HITS:1 COG:no KEGG:BT_3297 NR:ns ## KEGG: BT_3297 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 311 1 313 1026 161 36.0 3e-38 MKKYKILALAMFACVTLNGWAQSEDNVTGRVLDKEGKPVAGALVSVEENPLVRVATDKNG RFEIIAVKGNRLKVQTGDDATKVVKVNGGSELTVVMDYSSEKVNYGFGLQQTNAESTGAV STVYAEDIDKSSAFSIGNSLYGNVLGLTTMQSTGVVWEQMPSMYIRGLKTLNGNNGILLV VDGLERDNNWQALKYITPEEVESVSVLRDAAALALYGYRGVNGVVNIVTKRGKYNTREIN FSYDHAFNYMTRKPEMADAYMYASALNEALTNDGKQVRYSQNELNAFKNGTSPYLYPNVN WWDEVFRDRGA >gi|222159285|gb|ACAB01000074.1| GENE 2 968 - 2899 1343 643 aa, chain - ## HITS:1 COG:no KEGG:PRU_2227 NR:ns ## KEGG: PRU_2227 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 1 642 3 676 676 399 37.0 1e-109 MMKHFSKWFIAAFVGVALFVSSCVDQVKFGDSFLEKAPGVAVTQDTIFGKATYARAFLWN TYSKLYYALPVYWNTVEGKMNTGIFEMMSDCWHSHTDWNGINRKYYSGSYKAGDEDSSDD TRFGYTKENCWEAIRAALLFVENVGRVPDMEDAEKKRLAAEAKVIVASRYFDLFRHFGGL PLIKETYDVQPSYELPRATVEETVKYMVDLLDEAAATPELPWDLGTDDTNWQGRFTKAAA MGLKCKILLFAASPLFNDDEPYCTEPPQDAVTNHQVWYGAYKPELWEQCWQACDDFFKEL QAKGYYELTQATEATAKGYRDAYNKAYFTRENNKELLISTHISRFSKFNSWDEWQYIFVN SGNGTVITGGLTPTLEFMEMFPMSKGEPFQLNSTTNPFYTDNDYNKPTRDPRLYETMLVN GTQFGDHAAELWIGGRDNINDTEKETGKYATGFGCYKFYKEGVNSLKDKYLQWPYLRLAE MYLIYAEALLKSKNDLTGAIEQVNKVRARVGLGDLAACNPDKNLTNDANALLEEILRERA CELGLEDVRLFDMTRYKRDDLFRKQLHGLKIYRNDGGGNTPWSGSTGNSSTYPKPTQFTY EAFPLVNPSRAWWSNFSPKWYLSAFPPSEVNKKYGLTQNPGWN >gi|222159285|gb|ACAB01000074.1| GENE 3 2915 - 6043 2523 1042 aa, chain - ## HITS:1 COG:no KEGG:PRU_2226 NR:ns ## KEGG: PRU_2226 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 32 1042 4 1046 1046 811 45.0 0 MERQKQFYEGQAGKKCRLLPIALSLFLFLLIPLKGYGDDRPMSEVVQQHSKVTGTVVDAN 
GEPVIGANVTLVGTTLGTITDIDGRFSLNANSGAKIKVSFIGYKEKVVTIKKGISLNIVL EEDAQSLGEVQVVAYGVQKKVSITGAISSMKGDDLLKTPAGSLSNVLSGQITGISSVQYS GEPGADAADIYVRGVATWNNAKPLIQVDGVERDFSQIDPNEIESVTVLKDASATAVFGVR GANGVILITTKRGAEGKAKVSFSTSAGVNVRTKDLEFANSYQYASYYNMMKVNDGGVATF SDEQLETFRNHSNPLLYPDINWIDYCMNKAAFQSQHNVNISGGTSNMKYFVSAGLFTQGG MFKQFNATDDFNFDYKRYNYRANLDFDVTKTTLLSVNIGGRIETKRTPESGEDQNQLFRK LYWAVPFAGAGIVNGERVVSNSEILPFTGVDGLSSYYGKGFQTKTTNVLNVDLALNQKLD FITKGLSVKLKGAYNSEYANTKKASSSKAYYTPVANADGSISLRKYGTDSQLSYGEPDNG FSKARNWYMELALNYARKFGDHNVTGLLLYNQSKKYYPSEYSDIPTGYVGLVGRVTYDWK TRYMAEFNAGYNGSENFRPGNRYGFFPAGSLGWVVSEESFFQPIKSVVNYLKLRASVGMV GNDVSQRFLYLPDSYQYGDGGYYFGQNVGNKMPGASEVSRSNPDAKWETAIKQNYGMDAT FLAERLTVSLDYFREDRSDILSKPDYLPGILGMSLPSVNVGETLNRGFEVQLKWNETLKN DFRYWANFNISFSRNKIVYKNEVEQNEPWMYETGRRIGSRSMYKFWGFYDETADMRYQEE FGHPIANHGITLVPGDCVYVDLNADGEINSNDATRDIGYTDMPEYTAGLNTGFSWKNFDF SMQWTGAWNVDRMLDEFRRPLGDTNEKGLLLYQYNTTWRSSADTFTAKFPRTSSLHASNN YAGSDLYLINASYLRLKSVEIGYNFNFPFMKKLKMDNCRLYVNGYNLITITGFKWGDPES RQSSRPNYPLTRVFNVGLKLGF >gi|222159285|gb|ACAB01000074.1| GENE 4 6234 - 10322 2547 1362 aa, chain - ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 840 1079 8 246 294 144 36.0 1e-33 MQPSTALLASGDNLVANLKFKRFPALNNLPSDEIQKIYQDKDGFVWLASRYGFYQYDGYE TTLYKSNLYTPGLLTNNNILCLADDYDHNLWIGTQEGLNILDKKTGEIKKILFPVIPNNV ISCLLVMRDHSIWLGTDAGLCKYDAENGTFVVYNRELTGGVLDYTAIKSLLEDSEGDLWI GTWSSGLYRYVPSTEKFYAYPTMNERNSAHVIYEDTNKNIWVGSWDCGLFKLNNPKNLRT VSYVNYRHKTGDSTSLSDDIVYDINEDLNTGTLWVGTRSGLSIMSKNIPGRFINYRSRSS SHYIFCDEINTILRDKAGMMWIGSIGGGVLAVDTNQPMFAFHSLDFADNDIPITSVRALF ADSERNIWMGIGTYGLACVESVTGKLKSHSQMPEFAGITVPTVYSVMQRRNSGEIWFGTY DGGIFIYQKGEKVRNLTVDNCKFLGNSCVFALYEDEHGNCWVGTRGSLGVLLANGKSFLF SNITFTDNTRLDWLYVRDIITDSENSVWIATSNYGIIHIQGDIQHPSTLKYSNYSFYNGA LTTNNVLCLYEDKAGCLWAGTEGGGLYLYDRKNDRFDGKNQEYNIPGDMVGSIEEDKTGN LWLGTNAGLVKLGAQPIGRDAVVRVYTEVDGLQGNFFISQSACSRDGELFFGGYRGYNCF FPENMEEKQREVSLAITDIKIFNRSITLLPLDIQRKISKFTPAFTQKIELPYQYNNFSIE FATLTYKNPELNRYAYQLEGFDKEWVYTNADRRFAYYNNLESGTYIFRLKATNENGIWSG YVRELTVVVLPPFWATWWAYILYVLLVAGAVFWLFQITRNRILLRNELRLREMEKMKAEE LNHAKLQFFTNITHELLTPLTIISATVDELKTQAPRHTDLYAVMQSNIHRLIRLLQQILE FRKAETGNLKLRVSPGDIAAFVKKEAESFQPLIKKSKIHFSVLCNPESITGYFDTDKLDK ILYNLLSNAAKYTVEGGFIQVTLSYAEDRDHVLLKVKDNGKGISKEKQTTLFQRFYEGDY RKFNTIGTGIGLSLTKNLVELHEGTISVESETGQGAEFIVCIPIDRSYFREDQIDDEAIV PIQKMMTYAEEDTQPMDDSEVEKKKHSVLVIEDNEELLQLMTRLLKREYNVFTAENGKEG ISVLENEDIDLIVSDVMMPEMDGIEFCKYVKSNLEISHIPVILLTAKNKEEDRAEAYEVG ADAFISKPFNLPVLYARIRNLLKYKEGVVRDFKHQLVFELKDLNYTSLDEDFLQRAIDCV NGHLEDAEFDQPQFADEMKTSKSTLYKKLKSLTGLNTSAFIRNVRLKSACRIMEEKGSNI RISELAYAVGFNDPKYFSSCFKKEFGMLPSEYIERFLAVPAK >gi|222159285|gb|ACAB01000074.1| GENE 5 11303 - 13252 1805 649 aa, chain - ## HITS:1 COG:no KEGG:BT_3661 NR:ns ## KEGG: BT_3661 # Name: not_defined # Def: alpha-glucosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 647 1 646 647 1204 86.0 0 MKKIISALMVAGCCLTSLMAQDVVVKGPDEKLQLVVFTSPAEKPSYSITYNGKTMLEKSP LGMNTNIGDFAKGMKLTGHTVTPIDTVYRQDRIKTSQVHYQANELNCHFENPKGQKIDVV FRVSNNDVAFRYALPRQDGKGSVTVTAEETGFRFPQQTTTFLCPQSDAMIGWKRTKPSYE EEYKADAPMSDRSQYGHGYTFPCLFRIGDDGWVLVSETGVDSRYCGSRLSDVSEGNLYTI AFPMAEENNGNGTSAPAFALPGATPWRTITVGETLKPIVETTVAWDVVRPLYETKYDYRF 
GRGTWSWILWQDGSINYDDQVRYIDFAAAMGYEYALIDNWWDTNIGRDRMKSLVEYACSK GVELFLWYSSSGYWNDIEQGPVNHMDNAIIRKREMKWLQSLGVKGIKVDFFGGDKQETMR LYEDILSDADDHGLMVIFHGCTIPRGWERMYPNYVGSEAVLASENMVFNQHFCDEEAFNA CLHPFIRNAVGSMEFGGCFLNKRLNRNNDGGTTRRTTDIFQLATTVLFQNPVQNFALAPN NLKDVSPVCMDYMKAVPTTWDETRFIDGYPGKYVVLARRHNDTWYLAAVNAGKETLKLKL DLEMFAGKTVTLYKDDKKGEPQLMPLKVKENGKVQLELLPQGGAVLVNK >gi|222159285|gb|ACAB01000074.1| GENE 6 13289 - 15754 2037 821 aa, chain - ## HITS:1 COG:CAC3436 KEGG:ns NR:ns ## COG: CAC3436 COG3534 # Protein_GI_number: 15896677 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Clostridium acetobutylicum # 206 715 51 547 835 415 43.0 1e-115 MRKLFLFLVVLFLSFQQVTLAAIKEMTSTPDSVYLFSFATSGDDGRSGLRFAWSMDKENW FEVGRNYGYLRCDYSRWGSQKKMLDPYLKQSPAGEWICTWKLNDRDGYGQATSKDLINWT SQKYPRTTSDFDGTRVKAVVAGEEQKGTINRVAWTLVDGLNKNYGWNQYRNSLHEERPVQ DGERFAGLKPVKATVTVQPERAKDISDVLLGAFFEDINYSADGGLYAELIQNRDFEYDPS DREGDKNWNSTHSWTLKGDKTTFTINTTDPIHANNPHYAVLNVEQPGAALENTGFDGIAL NVGEKYDFSIFARVPQGQSNKLQVRLVDGEGNICGETSLTVSSRQWKTYKTVITAKATAD TRLEIIPQSAGELNLDMISLFPQHTFKGRKNGLRKDLAQVLADIHPRFIRFPGGCVAHGD GLKNIYQWKNTVGPLEARKAQRNLWGYHQSMGLGYFEYFQFCEDIGAEPLPVLAAGVPCQ NSACHGDLRGGQQGGIPMSEMGAYIQDILDLIEWANGDAKKTKWGKVRAEAGHPKPFNLK YIGIGNEDLITDIFEERFTMIFNAIKEKYPEMIVVGTVGPFNEGTDYVEGWKLADKLGVP MVDEHYYQTPGWFLNNQDFYDKYDRSKKTKVYLGEYATHIPGRKANIETALTEALYLAAL ERNGDVVHMTSYAPLLAKEGHTQWNPDLIYFNNREVKPTTGYYVQKLYGQNAGNEYLPSK ITLDNKDDKVQKRFASSIVRDSASGDVIVKLVNLLPVEVNTHVDLSGIGVIQPSAKRTVL TGKPADTPLPVEDTVEVAEKFDCQLPAYSFTVIRIKKANEK >gi|222159285|gb|ACAB01000074.1| GENE 7 16132 - 16995 723 287 aa, chain - ## HITS:1 COG:no KEGG:BT_3655 NR:ns ## KEGG: BT_3655 # Name: not_defined # Def: arabinosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 285 37 321 323 525 89.0 1e-148 MSEEAVRMAVSLDGYNYKALNGNQPVLDSRVISSTGGVRDPHILRCEDGKTFYMVVTDMV SANGWSSNRAMVLLKSKDLVHWTSNIVNIQKKYPDQENLKRVWAPQTIYDKEAGKYMIYW SMQHGNGPDIIYYAYANKDFTDIEGEPKPLFLPKNEKSCIDGDIIYKDGIYHLFYKTEGN 
GNGIKKATSPSLISGQWTESDDYKQQTKDAVEGAGIFPLIGSDKYILMYDVYMKGKYQFT ESVDLEHFKVIDHAISMDFHPRHGTVIPITQKELQRLFKAYGKPEGF >gi|222159285|gb|ACAB01000074.1| GENE 8 17133 - 18275 1105 380 aa, chain - ## HITS:1 COG:XF0840 KEGG:ns NR:ns ## COG: XF0840 COG1874 # Protein_GI_number: 15837442 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Xylella fastidiosa 9a5c # 1 210 401 602 612 141 35.0 3e-33 FNQGWGTILYRTTLPEATPAGTVLKITEVHDWAQIYADGKLLARLDRRKGEFTTTLPALK KGTQLDILVEAMGRVNFAKSIHDRKGITEKVELILGNQAKELKNWTVYNFPVDYSFIKDK KYNDTKILPAMPAYYKSTFKLDKVGDTFLDMSTWGKGMVWVNGHAMGRFWEIGPQQTLFM PGCWLKEGENEILVLDLKGPAKASIKGLKKPILDVLREKAPETHRKDGEKLKLAGEKVTY EGAFTPGNGWQEVRFAAPVKGRYFCLEALSPQANDNIAAVAEFDVLGADGKPVSREHWKI RYADSEETRSGNRTADKIFDLQESTFWMTVDNVPYPHQLVIDLSKVETVTGFRYLPRAEK EYPGMIKEYRVYVKSADFNY Prediction of potential genes in microbial genomes Time: Wed May 18 02:52:40 2011 Seq name: gi|222159284|gb|ACAB01000075.1| Bacteroides sp. D1 cont1.75, whole genome shotgun sequence Length of sequence - 62460 bp Number of predicted genes - 42, with homology - 40 Number of transcription units - 18, operones - 10 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 130 - 996 845 ## COG0294 Dihydropteroate synthase and related enzymes - Prom 1025 - 1084 5.4 + Prom 866 - 925 6.2 2 2 Op 1 . + CDS 1114 - 3129 1499 ## COG0642 Signal transduction histidine kinase 3 2 Op 2 . + CDS 3207 - 4502 1021 ## COG0770 UDP-N-acetylmuramyl pentapeptide synthase + Term 4532 - 4598 19.3 - Term 4522 - 4583 16.1 4 3 Tu 1 . - CDS 4642 - 5034 382 ## BT_3643 hypothetical protein - Prom 5097 - 5156 5.8 + Prom 5035 - 5094 7.8 5 4 Op 1 . + CDS 5138 - 6508 1106 ## COG0733 Na+-dependent transporters of the SNF family 6 4 Op 2 . + CDS 6508 - 7428 405 ## COG1555 DNA uptake protein and related DNA-binding proteins + Term 7636 - 7667 -0.7 - Term 7521 - 7583 6.5 7 5 Op 1 . 
- CDS 7607 - 8263 375 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 8 5 Op 2 . - CDS 8272 - 8988 620 ## COG1179 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 9 5 Op 3 . - CDS 8988 - 11123 2000 ## COG0475 Kef-type K+ transport systems, membrane components - Prom 11211 - 11270 2.4 + Prom 11466 - 11525 6.2 10 6 Tu 1 . + CDS 11587 - 12594 1229 ## COG0136 Aspartate-semialdehyde dehydrogenase + Term 12629 - 12688 9.2 + Prom 12751 - 12810 5.5 11 7 Op 1 . + CDS 12830 - 15667 1344 ## BT_3633 hypothetical protein 12 7 Op 2 . + CDS 15672 - 16904 718 ## BT_3632 hypothetical protein 13 7 Op 3 . + CDS 16897 - 18405 641 ## BT_3631 hypothetical protein 14 7 Op 4 . + CDS 18445 - 19365 664 ## BT_3630 hypothetical protein + Prom 19440 - 19499 4.8 15 8 Op 1 . + CDS 19549 - 20136 644 ## BT_3629 hypothetical protein 16 8 Op 2 . + CDS 20195 - 20689 267 ## BT_3628 hypothetical protein 17 8 Op 3 . + CDS 20713 - 21600 228 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 18 8 Op 4 . + CDS 21671 - 22339 577 ## BT_3626 hypothetical protein 19 8 Op 5 . + CDS 22381 - 23415 1058 ## BT_3625 hypothetical protein + Term 23442 - 23509 6.1 + Prom 23486 - 23545 5.3 20 9 Op 1 . + CDS 23676 - 24959 806 ## BT_3624 hypothetical protein + Prom 24962 - 25021 4.0 21 9 Op 2 . + CDS 25052 - 26545 1058 ## COG0591 Na+/proline symporter 22 9 Op 3 . + CDS 26545 - 29010 1558 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 23 9 Op 4 . + CDS 29007 - 30365 1073 ## COG2385 Sporulation protein and related proteins 24 9 Op 5 . + CDS 30380 - 31663 1271 ## COG0477 Permeases of the major facilitator superfamily 25 9 Op 6 . + CDS 31677 - 32513 666 ## BT_3618 hypothetical protein 26 9 Op 7 . + CDS 32554 - 33360 874 ## COG2103 Predicted sugar phosphate isomerase 27 10 Tu 1 . - CDS 33335 - 33502 57 ## - Prom 33524 - 33583 4.8 + Prom 33461 - 33520 4.6 28 11 Op 1 . 
+ CDS 33540 - 36008 1821 ## Fjoh_3877 hypothetical protein + Term 36032 - 36073 1.3 29 11 Op 2 . + CDS 36081 - 38486 1106 ## COG3533 Uncharacterized protein conserved in bacteria + Term 38601 - 38647 16.1 30 12 Tu 1 . - CDS 38598 - 38714 57 ## - Prom 38805 - 38864 5.6 31 13 Op 1 . - CDS 38943 - 39398 430 ## Dfer_0298 putative esterase 32 13 Op 2 . - CDS 39331 - 39624 243 ## Plim_2731 beta-lactamase - Prom 39644 - 39703 6.0 - Term 39662 - 39706 0.1 33 14 Tu 1 . - CDS 39707 - 41584 1277 ## Csac_0206 protein of unknown function DUF303, acetylesterase putative - Prom 41778 - 41837 3.9 + Prom 41668 - 41727 6.2 34 15 Tu 1 . + CDS 41903 - 44338 1777 ## BVU_2979 glycoside hydrolase family protein + Term 44416 - 44458 4.1 + Prom 44400 - 44459 6.3 35 16 Tu 1 . + CDS 44497 - 48516 2473 ## COG0642 Signal transduction histidine kinase - Term 48478 - 48527 4.2 36 17 Op 1 . - CDS 48538 - 49926 1241 ## COG5498 Predicted glycosyl hydrolase 37 17 Op 2 . - CDS 49934 - 51850 1336 ## COG2382 Enterochelin esterase and related enzymes 38 17 Op 3 . - CDS 51877 - 54411 1532 ## BVU_0030 hypothetical protein - Prom 54445 - 54504 5.1 - Term 54437 - 54490 2.1 39 18 Op 1 . - CDS 54506 - 57325 1963 ## CJA_3286 endo-beta-galactosidase, putative, ebg98A (EC:3.2.1.-) 40 18 Op 2 . - CDS 57346 - 58992 960 ## COG5520 O-Glycosyl hydrolase 41 18 Op 3 . - CDS 59026 - 61257 1711 ## PRU_2739 endo-1,4-beta-xylanase (EC:3.2.1.8) 42 18 Op 4 . 
- CDS 61290 - 62324 755 ## Slin_2105 hypothetical protein - Prom 62348 - 62407 8.4 Predicted protein(s) >gi|222159284|gb|ACAB01000075.1| GENE 1 130 - 996 845 288 aa, chain - ## HITS:1 COG:ECs4056 KEGG:ns NR:ns ## COG: ECs4056 COG0294 # Protein_GI_number: 15833310 # Func_class: H Coenzyme transport and metabolism # Function: Dihydropteroate synthase and related enzymes # Organism: Escherichia coli O157:H7 # 13 278 21 285 297 225 45.0 7e-59 MMKPISPIYINVKGRLLDLATPQVMGILNVTPDSFYSGSRMQTEEDIAARARQILDEGAS IIDIGAYSSRSNAEHISAEEEMRRLRTGLEILNRNHPEAIISVDTFRADVAEECVKEYGV AIINDIAAGEMDHRMFQTVADLGVPYIMMHMQGTPQNMQKEPSYDNLIKDVFLYFARKVQ QLRDLGVKDIILDPGFGFGKTLEHNYELLAHLEEFHIFELPVLVGVSRKSMIYKLLGGTP QDSLNGTTVLDTVALMKGAHILRVHDVREAVEAVRITEKLKIESGYDK >gi|222159284|gb|ACAB01000075.1| GENE 2 1114 - 3129 1499 671 aa, chain + ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 429 668 64 310 328 134 33.0 9e-31 MSRLTLFRVGFLFLILFFTTTAKAQKEAETFNVDSTLYEYYQRCQEYLLEPVVLNMSDTL FRMAGERQDERMQAVAIATQLDYYYFQGTNEDSVIHYTNKVKEFAKATHQPKYYYFAWAN RLITYYLKTSKTNIALYEVQNMLKEAQEEDDKTGLSRCYNIMSQIYTIKRFDSMAFEWRL KEIELTEKYKLENYNISQTYAQIANYYINQKKQKEALAAVEKAIATANSSTQQISAKLEF VNYYSKFGDFQAAEKLLKECQAAFEQDKRLESIKKRLYNIECLYYQQTKQYQKALEAAEM QEKEERRLSESILSSSHYRTQGEIYQKMGNMNLAVKYLQMYINTDDSLKIANEQVASSEF ATLLNVEKLNAEKKELMLQAQQKELHNKTTLIISLIILLGILFIFLYRENFLKRKLKVSE AELKTRNEELMVSREELRKAKDIAEASSRMKTTFIQSMTHEIRTPLNSIVGFSQVLSDHY SNSPETQEFVNIIKSNSNDLLRLVTDVLTLSELDQYEQLPTDPETDLHAICQLASEVAKD NTQKDVEVLFEPERESLLIRSNSERISQVLNNLAHNATKFTTHGSIRIAYSVLEAEKKIE ISVTDTGTGIPKDQQEAVFERFYKMNSFTQGTGLGLPICRSIAEKLGGSLRIDTSYTEGC RMILTLPLIYA >gi|222159284|gb|ACAB01000075.1| GENE 3 3207 - 4502 1021 431 aa, chain + ## HITS:1 COG:BS_murF KEGG:ns NR:ns ## COG: BS_murF COG0770 # Protein_GI_number: 16077524 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl 
pentapeptide synthase # Organism: Bacillus subtilis # 14 431 28 451 457 218 36.0 2e-56 MKLSALYQISLDCQSVTTDSRNCPDGSLFIALKGESFNGNAFAKLALDSGCAFAIIDEIQ YAVEGDQRYILVDDCLQTMQQLANYHRRQLGTRVIGITGTNGKTTTKELISSVLCQAHNV LYTLGNLNNHIGVPTTLLRLKPEHDLAVIEMGANHPGEIKFLCEIAEPDYGIITNVGKAH LEGFGSFEGVIKTKGELYDFLRKKDATTFIHHDNPYLMNISQGLNLISYGTEDDLYVNGR ITGNSPYLAFEWKAGKDGEAHQVRTQLIGEYNFPNALAAITIGRFFGVEAKKIDKALAEY TPQNNRSQLKKTEDNTLIIDAYNANPTSMMAALQNFRNMTVPHKMLILGDMRELGADSPA EHQKIVDYIKESDFEKVWLVGEQFAASEHSFKTYANVQEVIKDLQEDKPKGYTILIKGSN GIKLSSTVEFL >gi|222159284|gb|ACAB01000075.1| GENE 4 4642 - 5034 382 130 aa, chain - ## HITS:1 COG:no KEGG:BT_3643 NR:ns ## KEGG: BT_3643 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 126 1 126 130 248 94.0 6e-65 MLSYIKKYPISLFIILTVIYLSFFKPPKTDLNEIPNLDKLVHVCMYFGMSGMLWLEFLRA HRRDRAPMWHAWVGAFLCPVLFSGCVELLQEYCTTYRGGDWLDFAANTTGAILASLVAYY GVRPRMKINK >gi|222159284|gb|ACAB01000075.1| GENE 5 5138 - 6508 1106 456 aa, chain + ## HITS:1 COG:BH1128 KEGG:ns NR:ns ## COG: BH1128 COG0733 # Protein_GI_number: 15613691 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Bacillus halodurans # 6 449 9 446 453 320 45.0 4e-87 MAKIDRANFGSKLGVILASAGSAVGLGNIWRFPYETGNHGGAAFILIYLGCIFLLGLPIM IAEFLIGRRSRANTARAYQTLAPGTHWRWVGRMGVLAGFLILSYYAVVAGWTLEYILEAA TNGFAGKNSGEFISSFQQFSSSPWRPVVWLVAFLLITHFIIVKGVEKGIEKSSKIMMPTL FIIILILVVCSVTLPGAGAGIEFLLKPDFSKVDGNVFLSAMGQAFFSLSLGMGCLCTYAS YFSKETNLTKTAFSVGIIDTFVAILAGFIIFPAAFSVGIQPDSGPSLIFITLPNVFQQAF SGVPVLAYIFSVMFYALLAMAALTSTISLHEVVTAYLHEEFNLSRGKAARLVTGGCVFLG IFCSLSLGVMKGFTVFGLGMFDLFDFVTAKIMLPLGGLCISLFTGWYLDKKIVWSEITND GSLKIPVYKLIIFILKYIAPIAISLIFINELGLIKL >gi|222159284|gb|ACAB01000075.1| GENE 6 6508 - 7428 405 306 aa, chain + ## HITS:1 COG:TM1052 KEGG:ns NR:ns ## COG: TM1052 COG1555 # Protein_GI_number: 15643810 # Func_class: L Replication, recombination and repair # Function: DNA uptake protein and related DNA-binding proteins # 
Organism: Thermotoga maritima # 180 284 47 161 181 67 33.0 2e-11 MWKDFLYYTKTERQGIIVLVVLILGVYAAPELFAFFTRAEDTGCTKNEKADQEYNDFVAS LRETKPHPKSGHSFPSTPQREIKLTMFDPNTADSTTFLSLGLPSWMIKNILHYRHKQGKF RHPEDFRKIYGLTEEQYQTLHPYIQITEDFSSKDKDTIRLLTVQSIQRDTLVKYQPGTVV SLNSADTTELKKIPGVGSSIARMIVNYRERLGGFCRIEQLQEIHLKAERLRPWFSIDTHQ IHRINLNKAGMERMMRHPYINYYQAKVIIEYRKKKGILKSLKQLSLYEEFTPIDLERIEP YICYNK >gi|222159284|gb|ACAB01000075.1| GENE 7 7607 - 8263 375 218 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 215 1 217 245 149 41 5e-35 MIKLEGITKSFGSLQVLKGIDLEINKGEIVSIVGPSGAGKTTLLQIMGTLDEPDAGTVQI DGTVVSRMKEKELSAFRNKNIGFVFQFHQLLPEFTALENVMIPALIAGVSSKEANDRATK ILDFMGLVDRASHKPNELSGGEKQRVAVARALINDPAVILADEPSGSLDTHNKEDLHQLF FDLRDRLGQTFVIVTHDEGLAKITDRTVHMVDGMIKKY >gi|222159284|gb|ACAB01000075.1| GENE 8 8272 - 8988 620 238 aa, chain - ## HITS:1 COG:FN0725 KEGG:ns NR:ns ## COG: FN0725 COG1179 # Protein_GI_number: 19704060 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 # Organism: Fusobacterium nucleatum # 8 236 4 229 234 167 40.0 1e-41 MEKYDWQQRTALLLGEEKMERIRNAHVLVVGLGGVGAYAAEMICRAGVGRMTIVDADTVQ PTNMNRQLPAMHSTLGKAKAEVLAARYKDINPDIELTVLPVYLKDENIPELLDANQYDFI VDAIDTISPKCFLIYEAMKRRIKIVSSMGAGAKSDITQVRFADLWETYHCGLSKAVRKRL QKMGMKRKLPVVFSTEQADPKAVLLTDDEQNKKSTCGTVSYMPAVFGCYLAEYVIKRL >gi|222159284|gb|ACAB01000075.1| GENE 9 8988 - 11123 2000 711 aa, chain - ## HITS:1 COG:all3567_1 KEGG:ns NR:ns ## COG: all3567_1 COG0475 # Protein_GI_number: 17231059 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, membrane components # Organism: Nostoc sp. 
PCC 7120 # 8 396 22 404 413 305 44.0 3e-82 MNLFDFNLTLPITDPTWVFFLVLIIILFAPMILGRLHIPHIIGMILAGVLIGEHGFHVLD RDSSFELFGKVGLYYIMFLAGLEMDMEDFKKNRTKSVVFGWLTFLIPMALGIWSSMSMLG YGFLTAVLLASMYASHTLIAYPIISRYGLSRLRSVNITIGGTAVTVTLALIILAVIGGMF KGTVDGWFWVFLVAKVAFLGFLIVFFFPRIGRWFFRKYDDSVMQFVFVLAMVFLGGGLME FVGMEGILGAFLAGLVLNRLIPHVSPLMNRLEFVGNALFIPYFLIGVGMIIDVRSLFTGG EALKVAIVMTVVATFSKWLAAWITQKIYRMQSNERSMIFGLSNAQAAATLAAVLIGHEII MENGERLLNDDVLNGTVVMILFTCVISSLVTERSARRFALDENVQAEEEAKQINKEQILI PVANPETIEDLVNLALVIKDAKQKNALVALNVINDNNSSEKKEQQGKRNLEKAAMIAAAA DVPVTMVSRYDLNIASGIIHTIKEYEATDIVIGLHRKANIVDSFFGHLAESLLKGTHREV MIAKFLMPVNTLRRINIAVPPKAEYESGFSKWVEHFCRMGSILGCRVHFFANERTLMRLQ QLVKKRHAGTPTEFSILEEWEDLLLLTGQVNYDHLLVVVSARRGSISYDTSFERLPAQLG KYFSNNSLIIIYPDQFGEPQEIVSFSDPRGHNESQHYEKVGKWFYKWLKKN >gi|222159284|gb|ACAB01000075.1| GENE 10 11587 - 12594 1229 335 aa, chain + ## HITS:1 COG:aq_1866 KEGG:ns NR:ns ## COG: aq_1866 COG0136 # Protein_GI_number: 15606903 # Func_class: E Amino acid transport and metabolism # Function: Aspartate-semialdehyde dehydrogenase # Organism: Aquifex aeolicus # 2 331 4 336 340 370 58.0 1e-102 MKVAIVGVSGAVGQEFLRVLDERNFPMDELVLFGSKRSAGTTYTFRGKQIEVKLLQHNDD FKGVDIAFTSAGAGTSKEFEKTITKYGAVMIDNSSAFRMDADVPLVVPEVNAEDALERPR GVIANPNCTTIQMVVALKAIEKLSHIKTVHVSTYQAASGAGAAAMDELYEQYRQVLANEP VTVEKFAYQLAFNLIPQIDIFTENGYTKEEMKMYNETRKIMHSDVKVSATCVRVPALRAH SESIWVETERPISVEEAREAFANGEGLVLQDNPAEKEYPMPLFLAGKDPVYVGRIRKDLT NENGLTFWIVGDQIKKGAALNAVQIAEYLIKVKNI >gi|222159284|gb|ACAB01000075.1| GENE 11 12830 - 15667 1344 945 aa, chain + ## HITS:1 COG:no KEGG:BT_3633 NR:ns ## KEGG: BT_3633 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 23 945 23 945 945 1489 80.0 0 MGYFRLNIAFFGLLFFIVFKATANPGTYTISGRITEEKTNNPLPGASIIIKGTYLWAVSN QNGDFTIQGVREGKCNLEVSFLGYVTTTIPVNIRSSIKDITIQLKENTLALDDVVITAQA PKSELNTTLIIGSNALEHLQISNVSDISALLPGGKTKVPDLTTNNVFSLRDGGSTVGNAT FGTAVAVDGIRIGNNASFGNMSGIDTRSIAVTNIASVEVITGVPSAEYGDLNSGMVNIRT 
KKGKTPWEVLLAINPRTEQFSFSKGLPLGNDKGTINISGEWTKATRKLSSPYTSYTRRGF AANYNNTFRNIFRFNIGFTGNIGGMNTKDDPDAYTGEYTKVRDNVFRANTSLSWLLNKSW ITNLKLDASLHYNDNKSHAHTPYTYASEQPAVHAEQEGYFLADKLPYSYFADQIIDSKEL DYAASLKYEWNRRLKKVNSNLIAGIQWKSTGNIGEGEYYLSPSLAPNGYRPRPYTTYPYM HNVSLYAEENLTVPLGSTTLKLMAGIRWEKIFISGTEYKNLNTFSPRFNAKWQLKRNISI RGGWGVAEKLPSYYILYPRQEYRDIQTFGVSYNNNESSYVYYSQPYVFLHNKDLRWQRNQ NAELGVDIEIAKNKISLVGYFNRTKNPYKYTNAYTPFSYDVLQLPDGFSMPANPQINVNN QTGMVYIRGNESEEWVPMDVKVTNRTFVNSVSPDNGPDINRRGVEMIVDFPEITPIRTQL RLDASYDYMKYIDNSLSYYYQTGWSHTGIPNRSYQYVGIYANGDNSSTTANGKRTHSLDA NITAITRIPKARLIISFRLEASLLKRSQNLSEYNGKEYAFNVSDNSNAETGGSIYNGNSY TAIYPVAYLDLNNEIRPFTEKEAQNSAFANLIRKSGNAYTFAADGYDPYFSANINITKEI GDHVSLALNAINFTNSRKYVTSYATGVSAIFTPDFYYGLTCRIKF >gi|222159284|gb|ACAB01000075.1| GENE 12 15672 - 16904 718 410 aa, chain + ## HITS:1 COG:no KEGG:BT_3632 NR:ns ## KEGG: BT_3632 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 410 3 411 411 654 74.0 0 MKQLAYILLLMTFLTTGSCTDIDKDNPYDNQLYTLQVNAVYPNEYSDYLRKGVTVEIEDI DRGNSYTSKTDKNGTVRFSLTKGIYRIQISDKAEQDIFNGLADKVKLVNGDLALNLPLVH SRSGDIVIKEIYCGGCAKLPFEGNYQSDKYMILHNNTSETQYLDGLCFGSLDPYNSQATN VWVTQDESTGATIFPDFLPVVQCVWQFGGTGQTFPLAPGEDAVIVICGAIDHAAQYTQSV NLNKPGYFVCYNPVYFWNTLYHPAPGDQITPDHYLNVVIKTGQANAYTFSVFSPATVLFK AKDTTIQDFVSQADNVIQKPGSIVDRIVKVPIDWVLDAVEIYYGGSSNNKKRMPPSVDAG YVTQSALYDGRTLYRHTDEEASREAGYEILEDTNNSSLDFYEREKQSLHE >gi|222159284|gb|ACAB01000075.1| GENE 13 16897 - 18405 641 502 aa, chain + ## HITS:1 COG:no KEGG:BT_3631 NR:ns ## KEGG: BT_3631 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 502 1 502 502 783 76.0 0 MSNMKRIYITCLFLYMASTVSAQSYDIVERRNPWNAGANVTGIMVDSITVSYAELYDRNN HGDFHNYYEADKLWNAGAIAKSITHLKNYSLTGSFSFDHTSGRNMSGSMFIHPGFYPVDI LEFTPGRKDLQTYAFMGGIAKDIAPCWRIGGKIDFTSSNYSKRKDLRHTNYRLDLKVAPS IMYHSDDYAIGFSYIFGKNSESVKAEEIGTAATSYYAFLDKGLMYGAYETWEGSGIHLNE SGINGFPIKELSHGAAVQFQWKAFYGEVEYSHSSGSAGEKESIWFKFPTNRVTSHLSYRF 
SKGNAAHFLRLNLTWSRQFNNENVLGQETSNGITTTHVYGSNRIFERNVFSVQPEYEFIN SRRELRFSANVSSFKSLTTQMFPYSVSQTMTCGRIYLASTFHTKLFDLKTSGVFSVGDYT EKSKTVKTESETGEPPYRLTDYYNLQNEYETAPRLTLEVGLRYKFYRRMYAEIQTEYTHG FNLKYIVGANRWSETIKLGYTF >gi|222159284|gb|ACAB01000075.1| GENE 14 18445 - 19365 664 306 aa, chain + ## HITS:1 COG:no KEGG:BT_3630 NR:ns ## KEGG: BT_3630 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 26 303 267 540 541 301 56.0 2e-80 MKKYLMFMLAAALFASCSDDDKLPKEDETPGDTNFEEHDGYFVYYGETYKTVKLANGTTW MAEPLRYVPEGYTPSSDPTADAHIWYPYQLTGVTDKVTAGGAEALTDEASIKKSGYLYDL YAALGGKEVTEENCYEFEGAQGICPEGWHIPTRLEFVGLCGLSNKAVGETGNMTKNDALF YDETYSGGNMNKYNAAGWNYVLSGVRMQNNFAATPTYQLTTFYSGNTTSETLEKYKGQPA LTYIMSSTCYAPLYLDKTDPTKLTNIQFFAQMTTFTKRYPEGRINVPYISIKSGQQLRCV KDQAAN >gi|222159284|gb|ACAB01000075.1| GENE 15 19549 - 20136 644 195 aa, chain + ## HITS:1 COG:no KEGG:BT_3629 NR:ns ## KEGG: BT_3629 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 195 1 195 195 313 90.0 2e-84 MKTKKLTIALAIVVMTLIGTSCGNKQQKSASEATTEQAASSALEIDSLLANAENLAGQEV TIEGVCTHACKHGAKKIFLMGSDDTQVIRVEAGTLGAFDPKCVNSIVRVTGTLKEQRIDE AYLQNWEAQLKAQAAEKHGTGEAGCDSEKKARGETANSPEARIADFRAKIADRKAETGKE YLSFYFMEANSYEVE >gi|222159284|gb|ACAB01000075.1| GENE 16 20195 - 20689 267 164 aa, chain + ## HITS:1 COG:no KEGG:BT_3628 NR:ns ## KEGG: BT_3628 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 164 25 188 188 287 87.0 9e-77 MVLIYAISGIVMNHRDTINPNFSITRKEYKIAEKLPDKAGMSKEKVLTLLEPLGETGNYT KFYFPKTDVMKVFLKGGSNLLINVKTGEAVYESVTRRPLIGAMSRLHYNPGQWWTYFADI FAIALIIITLSGIIMLKGNKGIIGRGGIELIVGILIPILFLFFF >gi|222159284|gb|ACAB01000075.1| GENE 17 20713 - 21600 228 295 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 21 291 26 304 309 92 25 5e-18 MEQIIECNNLTHYYGKRLIYENLSFTVPKGRILGLLGKNGTGKTTTINILSGYLKPRSGE 
CRIFGQEIQTMAPALRRHIGLLIEGHVQYQFMTITEIEKFYAAFYPGQWKKEAYYELMNK LKVATGQRISRMSCGQRSQVALGLILAQNPELLVLDDFSLGLDPGYRRLFVDYLRDYARS EGKTVFLTSHIIQDMERLVDDCIIMDYGKILIQKPIAELLEKGRRYTFTIPEGYELPASD DFYHPSIMRNQLETFSFLQPAETEAKLKSMSVPYTDLHCEQVNLEDAFIGLTGKY >gi|222159284|gb|ACAB01000075.1| GENE 18 21671 - 22339 577 222 aa, chain + ## HITS:1 COG:no KEGG:BT_3626 NR:ns ## KEGG: BT_3626 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 222 1 222 222 367 93.0 1e-100 MYAIFYKEWIKTRWYFLLAVLATLGFTGYCMLRINRVVEMKGAAHVWEVMLQRDVIFIDM LQYIPLIAGILMAIVQFVPEMQRKCLKLTLHLPYPELKMTGNMLLSGLIPLLVCFASNFL LMEVYLNGILAHELKNHILLTALTWYLAGISGYLLVAWICLEPAWKRRILNLIIAVLLLR IFFLLPTPEAYNKFLPYLVVYTLLTASFSWLSIVRFKAGKQD >gi|222159284|gb|ACAB01000075.1| GENE 19 22381 - 23415 1058 344 aa, chain + ## HITS:1 COG:no KEGG:BT_3625 NR:ns ## KEGG: BT_3625 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 344 1 344 344 627 89.0 1e-178 MRRFSTILLYVTIAFLLLWQLPWCYNFFVVKPEKTPFTLYSFVIGDFAQMGQEEGKGTVR RDLAGNIYSEAAFDSILPMFYFRQLMSDERFPDTIQGIAVTPKMVQTENFNFRSVPSDIN APSIGLYPLMESMSGRVDLKMPDDVFRITSKGIEFIDMATNSVKEDKSLQFTEAMTKKDF RFPATEIVGNPTVKKEYDEGYLLLGADRRLFHLKQVKGRPYVRAITLPEGLTLEHLYLTE FRNKKTLAFMTDVNKAFYVLQNRTYEIVKTGIPAFDPETDALTIIGNMFDWTVRVTTPAS DNYYALDANDYSLIKKLENESNVHSMPGITFTSYTDKYVMPRFE >gi|222159284|gb|ACAB01000075.1| GENE 20 23676 - 24959 806 427 aa, chain + ## HITS:1 COG:no KEGG:BT_3624 NR:ns ## KEGG: BT_3624 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 426 1 420 421 570 67.0 1e-161 MKVYQAFAFLSIFIILFYNCSLPADDEEKGELEGGEDTFTPELPWEGEMDKFTINSKEGI HLNDPQKDAGTAYVTIPSTSVKNTRWEFGVHLTFNPSANNYARFYLTSSSNILSDNLNGY YIQIGGAKDNVTLYRQNGEQSKLLASGRELMKGNSSPKLYIKVECDNNGYWTFWTRLESE NEYVKEKQIKDTDIQTSRYCGIYCIYTKTRCKGFTFHHIQLSNDVETNTSPDETPDNPDT DLPDNPDTPELPEDVRGMLLFNEIMYDNATDGAEYVEIYNPGEKTITLPTLYLYKMYESG TVYSTTILCNESSSTPLTILSKGYLCFSKYTSKVIRKHKVNGENLIEISKFPTLNNDGGY 
LALSCSEKPEKGQTFDTCRFRDEMHDSDNKKTTGISLEKKSPELSSLNKNWRSSKHATGG TPGIKNR >gi|222159284|gb|ACAB01000075.1| GENE 21 25052 - 26545 1058 497 aa, chain + ## HITS:1 COG:sll1087 KEGG:ns NR:ns ## COG: sll1087 COG0591 # Protein_GI_number: 16330938 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Synechocystis # 10 421 9 420 512 127 28.0 4e-29 MSPAVISITIVAYFIILFTISYIAGRKADNEGFFVGNRKSAWYIVAFAMIGSTISGVTFV SVPGMVQASSFSYLQMVLGFIVGQIIIAFVLVPLFYRMNLVSIYEYLENRFGSSSYKTGA WFFFISKMLGAAVRLFLVCLTLQLLIFEPFHLPFLLNVILTVFIVWLYTFRGGVKSLIWT DVLKTFCLVVSVVLCIYYIASSLHLNFSGLVSTISDSDFSKTFFFDDVNDKRYFFKQFLA GVFTVIAMNGLDQDMMQRNLSCKNFRDSQKNMITSGISQFFVILLFLMLGVLLYTFTAQQ GIGNPEKSDELFPMIATGNYFPGIVGILFIIGLIASAYSAAGSALTALTTSFTVDILHAQ KKGEAALSQIRKHVHIGMAVVMGAVIFVFNLLNNTSVIDAIYTLASYTYGPILGLFAFGI FTKKQVYDKYIPLVAIASPILCYILQRNSEAWLNGYQISYELLIINALFTFLGLCLFIKK QDKETSYTTHTTKQEVK >gi|222159284|gb|ACAB01000075.1| GENE 22 26545 - 29010 1558 821 aa, chain + ## HITS:1 COG:PH0430 KEGG:ns NR:ns ## COG: PH0430 COG0463 # Protein_GI_number: 14590346 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Pyrococcus horikoshii # 247 352 7 113 334 61 35.0 7e-09 MKTTINCFLPFSRLEETMQTVKELRASTLVDKIYLLASPITNVCIPGCELISVEKMQSTQ AMRAIAEHSDKTYTLLYTKHTALKIGMFALERMVQVMEMTQAGMVYADHYQQIDSVQKTA PVIDYQPGSLRDDFNFGSVLLFNANAFKEVIENTEEEYQYAGLYDLRLKISQKSKLVHIN EYLYTEVESDKRKSGEKQFDYVDPKNRQVQIEMEQACTNHLQEIGGYLYPIFRPVDFSSH TFEYEASVIIPVRNRIRTVKDAVRSALNQQTTFPFNVIVIDNHSTDGTSEVLRELSSDKR LIHVIPERDDLGIGGCWNIGIHHEKCGKFAVQLDSDDVYKDEHSLQIIIDTFYKQKCAMV IGTYMMTNFDMQEIAPGIIDHKEWTPDNGRNNALRINGLGAPRAFYTPILREIKVPNTSY GEDYALGLKISHDYQIGRIYDVIYLCRRWEGNSDAALPVEKINQNNLYKDRLRTWELEQR ILQQETTLEKIQQKIEQLFQKQTLSWELAKENYQALEQYRSRTKEISKWFGNNCLSAKLF FNPKRILSATAQTDTASIHSRPCFLCQTNRPQEQEFISYRNYQILVNPYPIFKHHFTIVD KEHKSQSIVGRFKDMIEFTDIMREYFLMYNGPECGASAPDHAHFQACSKEESMQGSYYDH 
IDLIDNDKVQISYEGFPYSFIRIQAKNKKTMSKTFHLIYDILAANNNGKEPMMNILAWYG LERTKEDFRERYEEEFESVAEHPYNCIIFLRSKHRPDCYYAKGDEQILISPAIAEMNGIF PIVREEDMEKLTPEKVYDIYREISISKEKLQKILERIKAVL >gi|222159284|gb|ACAB01000075.1| GENE 23 29007 - 30365 1073 452 aa, chain + ## HITS:1 COG:sll1283 KEGG:ns NR:ns ## COG: sll1283 COG2385 # Protein_GI_number: 16329811 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Sporulation protein and related proteins # Organism: Synechocystis # 88 449 127 388 391 109 26.0 1e-23 MTQPQITVGILSGKEIEFSFPVKFSSSVGTEISGTQKVIYQDGKIHWQGKEYDELSFIPP QNAHAFFELKDVTIGINFHWERKEVQKFKGELKIIVEGEQLTAINVISIEEYLTSVISSE MSATASLELLKAHAVISRSWLLEKNKELRMKNEEFKRTLQKDSAANSSFFIPNSSFIKWY DHEAHRNFDVCADDHCQRYQGITRASTPQAIEAVSATQGEVLMYKGAICDARFSKCCGGA FEEFQNCWENVKHPYLIRQRDSKTEKQLPDLNIEAEADKWIRTSPVAFCNTQDKKILSQV LNNYDQETADFYRWKISYSQQELSELIHQRSGIDFGQIIDLIPVERGTSGRLVRLKIVGT LRTLIIGKELEIRRTLSTSHLYSSAFVVDKEYEEKGHKEDKIPSRFILTGAGWGHGVGLC QIGAAVMGEQGYKYEEILSHYYPGSTLEKQYQ >gi|222159284|gb|ACAB01000075.1| GENE 24 30380 - 31663 1271 427 aa, chain + ## HITS:1 COG:YPO3162 KEGG:ns NR:ns ## COG: YPO3162 COG0477 # Protein_GI_number: 16123324 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 17 422 20 409 492 112 25.0 1e-24 MTTSTTKRNPWSWIPTLYFAEGLPYVAVMTIAVIMYKRLGLSNTEIALYTSWLYLPWTIK PLWSPFVDLVKTKRAWIIAMQGFIAAGFAGIAFFIPTAHYVQLTLAFFWLLAFSSATHDI AADGFYMLGLNNKEQSFFVGIRNTFYRLANIFGQGILVMLAGWLETSQNNIPLAWSITFY LLAGLFLALTIYHRLILPHPDSDIKRPGLTPGKLLGDFLLTFVTFFQKKNLGLMFFFLLT YRLGESQLAKIASPFLLDTTDKGGLGLSTAIVGMIYGTIGVIALLLGGIISGFLVSRDGF KKWILPMALAINIPDLLYVWMAAATPDNPIFIAICVAIEQLGYGFGFTAYMLYLIYIAEG EHKTAHYAIGTGFMALGMMIPGMPAGWIQEHLGYTNFFIWVCICTLPGIVASLMIRNRLE DSFGKKQ >gi|222159284|gb|ACAB01000075.1| GENE 25 31677 - 32513 666 278 aa, chain + ## HITS:1 COG:no KEGG:BT_3618 NR:ns ## KEGG: 
BT_3618 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 278 1 278 283 515 88.0 1e-145 MILIADSGSTKTDWCIVFNGTPIKRMGTKGINPFFQSEEEIQQELTHSLLPQLPEGTINS VFFYGAGCTPEKAPVLRRAIADSLPVIGNIKANSDMLAAARGLCGHEAGIACILGTGSNS CFYNGEEIVNNISPLGFILGDEGSGAVLGKLLVGDVLKNQLPPAIKEAFLKQFDLTAPEI IDRVYRQPFPNRFLASLSPFLAQHLEEPAIRSLVLNSFIAFLRRNVMQYDYKQYPVHFIG SVAHCYKEILQEAAQTTGIQIGKILQSPMEGLIQYHQQ >gi|222159284|gb|ACAB01000075.1| GENE 26 32554 - 33360 874 268 aa, chain + ## HITS:1 COG:STM2571 KEGG:ns NR:ns ## COG: STM2571 COG2103 # Protein_GI_number: 16765891 # Func_class: R General function prediction only # Function: Predicted sugar phosphate isomerase # Organism: Salmonella typhimurium LT2 # 4 252 10 257 297 236 50.0 3e-62 MQITEQPSLYDHLEKKSVREILEDINQEDQKVALAVQKAIPQIEELVNQIVPRMKQGGRI FYMGAGTSGRLGVLDASEIPPTFGMSPNWIIGLIAGGDTALRNPVEGAEDNENRGWEELV EHQINEKDTVIGIAASGTTPYVIGALREARRHGVLTGCITSNPDSPMAAEADVAIEMIVG PEYVTGSSRMKSGTGQKMILNMISTSVMIQLGRVKGNKMVNMQLTNHKLIERGTQMIMDE LGLDHDRAQKLLLLHGSVKKAIDAYQHS >gi|222159284|gb|ACAB01000075.1| GENE 27 33335 - 33502 57 55 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFTKVGHLKIECTKKMFLFHITGVTMLFCTFHIALEVLHLAYCEYIDTKNVDMHQ >gi|222159284|gb|ACAB01000075.1| GENE 28 33540 - 36008 1821 822 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_3877 NR:ns ## KEGG: Fjoh_3877 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 1 820 1 835 841 979 57.0 0 MKKVFLFLYFSIISLLISAADTFVIFTPADHHFPLIKNGKPCSILMDINEDKGVKMAIDN LQQDLLNVCGSQSEINTKASDKQCLLIGTLQTPLIQKLIASGKIDRKELEGKNEKYILQV VTAPLEGVDEALVITGSDKRGTIYGIYELSKQIGVSPWYWWADVPVKKQTDIYIKPGTYT DGEPKVEYRGIFLNDEWPCLGSWSQEKFGGFNSQFYKHVFELVLRLKGNFMWPAMWASAF YDDDPQNGELANTMGIVMGTSHHEPMALAQQDWKRRGKGEWDYNHNAKELRNFWTSGMER VKNWETVVTVGMRGDGDEPMSEGANISLLENIVKDQRKIIETVTKKKAKNTPQVWALYKE VQDYYDKGMRVPDDITLLLCDDNWGNVRKLPNLNEKPRSGGYGMYYHFDYVGAPRNSKWI NISPIPRVWEQMNLTYEYGVRKLWIVNVGDLKPMEYPITFFLDMAWNPEKFNAQNLQRHT 
EDFCAQQFGEKYAKEAARILSQYAKYNRRVTPELLNAQTYSFHYNEWERVVNEYNLLALD AHNLGFLMPASYKDAYDQLILFPVQACANLYNMYYAQAQNQRLAAEQKTEANRWADKVVA CYERDSLLTHYYHKVMSKGKWNHLMDQVHIGYTSWNNPPKQIMPKITRVPEQAGTYTFIE TNGYISIEAEHFTRAIAEGETTWNLIPDFGKTLSGVTTLPVTKTPDKMCLEYDIEIEKAG SVEVTLLLAPTLNFNDNKGLRYAICLDGDKEQIINFNGHYQGELGKWQANPIIESRSNHL LAKEGKHTLRIRPLDPGIVFEKIQINTGGLQPSYLGAPETLK >gi|222159284|gb|ACAB01000075.1| GENE 29 36081 - 38486 1106 801 aa, chain + ## HITS:1 COG:BH1877 KEGG:ns NR:ns ## COG: BH1877 COG3533 # Protein_GI_number: 15614440 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 33 791 4 756 758 515 37.0 1e-145 MIKQKFLLLSCALVLLLTPTNAQEKLYKNEFPIADVKLLDGVFKHARELNIEVLLKYDVD RLLAPYRKEAGLTERKKTYPNWDGLDGHVAGHYLSAMSMNYAATGNKECGRRMEYMISEL QLCLEANAINNTEWAIGYIGGFPNSKNLWSTFKKGDLRIYNSAWAPFYNLHKMYAGLRDA WLYCNNKQAKTLFLKFCDWAISITDDLNEEQMQTVLKMEYGGMNEILADAYQITGNKKYL VAAKRYSQNILLDPLSQGIDNLDNKHANTQIPKFIGFARIAELSGDTKYTNASRFSWETI TGNRSLAFGGNSRREHFPSVTSCSDYINDVDGPESCNSYNMLKLTEDLFRMQPSAHYADY YERTVFNHILSTQHPEHGGYVYFTSARPRHYRVYSAPNEAMWCCVGTGMENHSKYNQFIY THSDDSLFVNLFIASELNWKNKKISLRQETNFPYEERTKLTVTKASSPFKLMIRYPGWVD KGALKVSVNGKSMNYSALPSSYICIDRKWNKGDVVEVELPMRSTIEHLPNVPNYIAFMHG PILLGAKTGTEDLRGLIAGDGRWGQYPSGKLLPVDQAPILIVDDMENITSKLVPIKNEPL HFKANIKAANSIDIKLEPFANIHDARYMMYWLTLTNKGYQTYIDSLSTIEKEKIILEKLT VDFVAPGEQQPETDHKILQEKSRTGNANQQFFREASSEGYFSYEMKTNAETELSLMVRYW GAEWGGRKFDIYIDDEKLLTEDNTGRWNQSKFQDIRYKIPSSMVQGKNHIRVKFQSIPGN TAGAVYFIRLVKRVDDPAIKD >gi|222159284|gb|ACAB01000075.1| GENE 30 38598 - 38714 57 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAIDSQPGSYWLNGNLAVRLKALQLYIYWGALKGHPNK >gi|222159284|gb|ACAB01000075.1| GENE 31 38943 - 39398 430 151 aa, chain - ## HITS:1 COG:no KEGG:Dfer_0298 NR:ns ## KEGG: Dfer_0298 # Name: not_defined # Def: putative esterase # Organism: D.fermentans # Pathway: not_defined # 6 141 100 238 286 108 41.0 7e-23 MRGGGDPNIFQNGYFDMPGWAYETFFFTEFLPYVEKRFRVIGDKRNRAVAGLSMGGGGAT 
CYGQRHTELFCAVYAMSALMDIPEEGAARFNDPNGKLAVLTRSVIEKSCVKYVTGADEER KNELRSVAWFVDCGDDDFLLDRTVILNFIVQ >gi|222159284|gb|ACAB01000075.1| GENE 32 39331 - 39624 243 97 aa, chain - ## HITS:1 COG:no KEGG:Plim_2731 NR:ns ## KEGG: Plim_2731 # Name: not_defined # Def: beta-lactamase # Organism: P.limnophilus # Pathway: not_defined # 1 80 373 451 640 64 38.0 1e-09 MTDTLYSDVLKAKRAYTVYLPKSFEQDKKKMYPVLYLLHGMWEKNDVWMNRGHVKDVMDC LTASGEACEMIIVCPDAGGRRSEYISEWVFRYAGLGV >gi|222159284|gb|ACAB01000075.1| GENE 33 39707 - 41584 1277 625 aa, chain - ## HITS:1 COG:no KEGG:Csac_0206 NR:ns ## KEGG: Csac_0206 # Name: not_defined # Def: protein of unknown function DUF303, acetylesterase putative # Organism: C.saccharolyticus # Pathway: not_defined # 27 624 3 627 629 569 46.0 1e-160 MITVKNAFILLMVLAVTLIAQPLSAVIRLPRLVSDKMVLQRDTELKIWGWADSGEKVTVR FQGKHYDTEANQKGEWSVMLPPQKAGGPFVMEVNELIIRDILVGDVWLCSGQSNQETPIS RLTDMFPEINVSNNHMIRHYKVPTQNSVEDLKETIAGNAVWHSAIASEVMNWTALAYFFA QEAYQKYRVPVGMLVSSLGGSAIESWISQEHLKEFPQLLIDREALDSLRLVRRDKGAGKW MATDVDDSDWSTIQVPGNWRTNGMDVNGVVWYRKDFEVPASMVGRHAKLYMGTLIDSDSV FVNGCFVGSTAYMYPPRKYNIPAGVLREGRNNITVKLTANSANGGFVEEKPYKIVGDEAE INLVGTWKYRVGMNLNEAGKYVKRLANLKSAGSGLYNGMIYPIKDYKIKGTIWYQGETNA GHPQGYATLLESLITNWRELWEMPEMPFLLVQLPNFMKKQMQPSDGGWARLREAQLQIAM NVPHTTLAVTYDVGEWNDIHPLNKKAVAHRLFLGARKVAYGEKLVSSGPVYKEMKIEGDK IILTFTETGSGLTSKEGVLKHFAIAGEDRKFVWANAVIKGGRIIVSSKDVAKPVAVRYAW SDNPEEANLCNKEGLLASPFRTDNW >gi|222159284|gb|ACAB01000075.1| GENE 34 41903 - 44338 1777 811 aa, chain + ## HITS:1 COG:no KEGG:BVU_2979 NR:ns ## KEGG: BVU_2979 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: B.vulgatus # Pathway: not_defined # 1 804 1 815 819 1064 61.0 0 MNRNTLFLLLLLTSNLVSGQDLKLWYSQPAKNWSEALPIGNSRLGAMVYGGTEREELQLN EETFWAGSPYNNNNPNAVHVLPIVRKLIFEGRNKEAQRLIDANFLTRQHGMSYLTLGNLY LEFPGHKDADDFYRDLNLENATTTTRYQVNGINYTRTTFASFTDNVIIMHIKASQPNALN FNVSYNCPLKNEVNVQNDKLIITCQGKEQEGMKAALRAECQVQVKTDGIIHPAGNILQIN GGTEATLYISAATNYVNYQNVSADESRRTTDYLEEAILIPYEKALKEHIAFYKKQFDRVQ 
LHLPSSEASQIETPRRIENFGQGNDMAMAALLFQYGRYLLISSSQPGGQPANLQGIWNNS THAPWDSKYTININTEMNYWPAEVTNLSETHSPLFSMLKDLSVTGAETARTMYDCWGWVA HHNTDLWRICGVVDFAAAGMWPSGGAWLAQHIWQHYLFTGNKEFLKEYYPILKGTAQFYM DFLVEHPTYKWLVVSPSVSPEHGPITAGCTMDNQIAFDALHNTLLASYIAGEAPSFQDSL KQTLEKLPPMQIGKHNQLQEWLEDIDNPKDEHRHISHLYGLYPSNQISPYSNPELFQAAR NTLLQRGDKATGWSIGWKVNFWARMLDGNHAFQIIKNMIQLLPNDHLAKEYPNGRTYPNM LDAHPPFQIDGNFGYTAGVAEMLLQSHDGAVHLLPALPDAWEEGSVKGLVARGNFTVDMD WKNNVLNKAIIRSNIGGTLRIRSYVPLKGKGLKQVNGKECSNRLFATTPIKRPLVAKGIS AQSPKLQKVYEYDIETKAGKTYIVNTIEGKQ >gi|222159284|gb|ACAB01000075.1| GENE 35 44497 - 48516 2473 1339 aa, chain + ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 803 1071 8 269 294 145 34.0 4e-34 MKNVFTFLLLLFSLLCNGQSPKLFTTDKELSSSLINQIYQDRNGFIWIATEDGLNRYDGA KFTIYKHEPNNEHSLAHNFVRTVFEDSKGHLFIGTYMGIQMYDPATDNFTPLAIREDNGE IFESNIVSFIERRNGEIWVSGNMLCRLNIKDYLLTVQKIDIPMTSTGFMIEDKKQNVWIA KGEEGVYRLDSNNQLTLYLSKDNGYVKSICEDSRGNIYVGSMRKGLFVYNKRQDSFIPID FKEKRELPICFLYSGPQNELYIGTDGKGMSVYNNQTHEISEYNFDNNYFDSKNSKIHSIL KDNSGNLWLAAYQKGVMQIPARTNSFKYIGHKSMDKNIIGSSCITSFCKDNNGMLWVGTD NDGIYALTEKLEPAKHFSHTTHPHSVPSTVIKLYEDSEHNMWIGSFINGMGKLNKQTGLC DYQYKLVDKNNNYIQRVYDFAEDKNKRLWIATMGFGLFYYDLKTKEFTSVQSQTSLINEW IGCLHYSDDNKLYVGTYDGVNCIDLDSPDFQSHKILSQNVIYSIFEDADGVVWLGSSEGL SGWNKKTKELTTYTTADGLPSNTIYAIQGDGKDFLWISTNAGISQFQKKNDKFINYYVSD GLQGNEFYKNASFKDKQGIIWFGGMNGITYFNPQDIINPAKTWNIRITDFFLHNNPVRKG MKSGIHNIINCPVFNAKEFYLSHKDNVFSIEFATLELNAPEHINYLYSINDEKWISLPKG VNRISFSNLKPGTYNFKIRAKDNTVYSNIKEITIFISPAWYASWWAKVIYSLLLLTIIFI IILQIRHRYRMHQEMLQHIHAEQINEAKLQFFINISHEIRTPMSLIISPLQKLIKNEENN ERLKIYHIIYRNAERILNLVNQLMDIRKIDKGQMFLMFRETNIIPFIEDLCTTFGQQANT KNIQLQLHSTLRELNVWVDTGNFDKIILNILSNAFKFTPEKGNIDITIRTGEDNTLPDPL KQYAEIIIADTGTGIDEQEKEHIFERFYQIRNSQQNPKGGTGIGLHLTRSLVELHHGIIY VENNKEQPGCRFIIRLPLGNKHLRPEEVDNNEQKVTVTVPTVPVISPIIENEEEKKVRVK 
TKYRVLVVEDDEEIRNYIAKEFGDKFHIMESRNGKEALEQIFKKTPDLVISDIMMPEMDG LTLCRKIKQNVNLNHIPVILLTAKTREEDNLEGLNTGADAYIMKPFNIEILQKTVENLIN TRQQLRKVFTGQQNLENKVQKLEVKSPDEKLMERIMKVINENIGNPNLTIETITTEVGIS RVHLHRKLKELTNQTTRDFIRNIRLKEAARLLSEKQHTISEIAMLTGFTDPNNFSTTFKE LYGMPPSMYMKEQLSKKEE >gi|222159284|gb|ACAB01000075.1| GENE 36 48538 - 49926 1241 462 aa, chain - ## HITS:1 COG:BH0236 KEGG:ns NR:ns ## COG: BH0236 COG5498 # Protein_GI_number: 15612799 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted glycosyl hydrolase # Organism: Bacillus halodurans # 275 460 736 921 1020 84 32.0 5e-16 MKMKSTFIAATLLLVSVMTEGGLSAQNPIVQTCFTTDPAPMVHDGTLYVYTGHDEDKADF FWMQEWRVYSTKDMVNWTDHGSPLAIESFDWADDRAWAAQCVERNGKFYWYVCLHSKLSN TMAIGVAVGDSPVGPFKDAIGKPLHDGSWDYIDPTVFIDDDGRAYLYWGNPNIYYAELND DMISLKGEVGKLEQTVESFGAPNPEKRVKDVKYKDTYTEGPWLHKREGKYYLLYAAGGVP EHIAYSMSDGPLGPWKYMGEIMPLQDTGSFTNHCGVIDYKGNSYFFYHTGKLPGGGGFGR SAAVEQFKYNADGTFPIINATREGVSPVGTLNPYERVEAETIAFSEGVKSEPNAKTGIYI SEIHNGDYIKVREVDFGNKSPKRFTATVASALRGGTLEVRTDSISGPLIAELTIPSTGGW ECWKTLQADIVKPVTGIQDIYFVFKGRKGCKLFNFDWYKFNR >gi|222159284|gb|ACAB01000075.1| GENE 37 49934 - 51850 1336 638 aa, chain - ## HITS:1 COG:yieL KEGG:ns NR:ns ## COG: yieL COG2382 # Protein_GI_number: 16131587 # Func_class: P Inorganic ion transport and metabolism # Function: Enterochelin esterase and related enzymes # Organism: Escherichia coli K12 # 285 637 41 400 400 142 30.0 3e-33 MKKIIVLCCLLCAGIFAFAQDPNFHIYLCLGQSNMEGNAKIEAQDTCNVNERFLMMAAVD CPSLGRVKGQWYKAVPPLVRCHTGLTPADYFGRTLVERLPDNIKVGVINVAVGGCRIELF DEENCEEHIASQPEWLKNTAKAYGNNPYRRLKELAVEAQKAGVIKGILLHQGESNTGDKE WPQKVKRVYENLLRDLNLQAKDVPLLAGEVVHADQNGRCASMNEIINTLPQVILTAYVIP SSGCPAAEDNLHFTAEGYRKLGVRYAEKRLLLLEKEPNSGITTEPASTNIPGYDYPRVDK EGRAHFRFYAPQANRLQVDCCGKKYDMWKDAGGLWTATTNPLPVGFHYYFLIADGVSVTD PSSYTFFGCCRMASGIEIPEGEEGDYYRPQQVPHGQVRSCTYYSETQKEFRLCMVYTPAE YETHPKKRYPVLYLQHGMGEDETGWSTQGKMNHIMDNLIASGQCVPMLVVMDSGDVEAPF RPRPGKDVNEERALYGATFYDVILKDLIPMIDRTFRTKTDREHRAMAGLSWGGHQTFNTV 
LPHLDKFSYIGSFSGAIFGLDMKTCFNGVFADADKFNKKVNYFFLGCGTEEQMGTKKMVD SLRKLGIEVDYYESQGTAHEWLTWRRCLKEFVPHLFKH >gi|222159284|gb|ACAB01000075.1| GENE 38 51877 - 54411 1532 844 aa, chain - ## HITS:1 COG:no KEGG:BVU_0030 NR:ns ## KEGG: BVU_0030 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 841 1 858 864 922 52.0 0 MFLSRIIIIFLFLCVSVLNAAEPFVTFISADAKLPLVTPQNRSFSIWCDDAEHRGVLLAV RNLQTDFEKVTTVKPELSNIAGKEVRIIIGSFDKSPLIRQLVKTGKLDGKELRGKNEKYI ITTLHAPLEGLQGDVLLIAGSDKRGTIYGIYELSKQIGISPWYWWLDVPPVKHQAIYIKE GVYTDGEPAVKYRGIFINDEWPCMGGWTTEKFGGFNSKMYVHVYELLLRLKANFLWPAMW SAAFYADDPMNSPLADEMGIIIGTSHHEPMARNHQEYARNRKLYGEWNYQRNKEGVERFF REGIKRMNGKEEVITIAMRGDGDAPMGPDTDTHLLEEIVKGQRKIIADVTGKPACKTPQL WALYSEVLEYYDKGMKIPDDVMILLCDDNWGNVRRLPDLKDKRHPGGYGMYYHVDLHGAP RAYQWLNMTQIQHMWEQLYLTYSYGVDKMWILNVGDLKPNEFPVDFFLNMAWNPGHLTAD NLQEYTCSFCKQQFGDNYAVEAARILNLYCKYAARVTAEMLDSQTYNLSSGEFKAVTDEF LALEAHAYRQFMTLPEELKDTYKELILFPVQAMANLYEMYYAVAMNHKLASEGDPRANEW ADRVEYCFRYDAELCYDYNNNIADGKWNHLMDQTHIGYTSWDEPKGGNIMPEIIRVDVSA YKPGGYEYKEKGGVVVMEAERFAECTQGNKTEWTVIPDLGRTLSGLSLMPYTQPVTGASL TYRMKLNSEMKNIRVRLILDSTLPFIKGGHSYAISLDGGKEQIVNYNSEMTWANCYTKMY PAGAARLIESVVDFSNVNLKEGIHALIIRPLSPAIVLHKIIIDCGSDEVSRLNLQESPYR KVTH >gi|222159284|gb|ACAB01000075.1| GENE 39 54506 - 57325 1963 939 aa, chain - ## HITS:1 COG:no KEGG:CJA_3286 NR:ns ## KEGG: CJA_3286 # Name: ebg98 # Def: endo-beta-galactosidase, putative, ebg98A (EC:3.2.1.-) # Organism: C.japonicus # Pathway: not_defined # 227 938 306 1016 1016 746 51.0 0 MMNLLNTKSKLSYLLFLGIVCCACILGSCKDDDVIDPDAPSVPKPGTAVENINTNVKALR KLIEAKQQDLAVKTYNPVNNGASYTIELSDGTSFSMYAQIAALEGGGEDVVYSPKVGAKV EHDEYYWTLDDAWLTFENDEKVKVLDENNTVAPIVDINTDGYWTVKYGAKSRTLDKAVSG KLTSQFKQVSAIGDESVSFTFTDRTPVIELNLFKGDNPEIPPVTGALRRPISPEQPAWFV HIDSWNYADPQKIIDLIPADIRPFTIFNISLSVSHDEATGIYNVSEYGYEIAKSWLRTCA ENNVWAMVQPSSGGFSHFKDVSLYSQFESDDKVRVYDEFFREYPNFLGFNYCEQFWGYDD QFSVSWLQRVAHWNQLLKLTHKYGGYLVVSFCGNTWSANINPIALVKRNSDFAQTAKLYS ENFIMCEKYTTQSGFFNVEGICLGTWLSGFAGQYGIRFDQCGWTEEKGQNGDKDFPPAAG 
ALPIIEHVMLTGQTVIDGPELIWQQCFKETNAVSVGDGYQSRNWECFPQFVNINIDMFRK IIDKTIRIPSRKEVIDRTKVVILQDVYSGDDNAKYSSPKNLHEGLYLRDDDGNLWDNHCY FKKTGRYPTIPVAFELCDDVANSFQYKINQSTFEGSWSDVNTKVGKFNRWFPQEYTGELY AGRIENGWVVYNGLAGIRNAAIPFKYNTCDKMELAYSKYTVSVIKEYANKLTFYMNNYDP SGSSKTEVIKIYGCTSKPTHSVSSRANGTAQVSENWKEDVYTLTVTHNGPLDLTVNCSGK ATDRLTVSTAASIQIPASPQIYQGAYQYEAECFDFKNVTKRVTKGDSEPIRNYTAQGYIN FGASSAAAVRDAVTALEDGVYTIRIRYRAPSATVNTVDMYINNTKVGTPEFAQTDNDNTV WNTALMSVSLRKGANTFELKANSSGAGDLYLDNIVIERK >gi|222159284|gb|ACAB01000075.1| GENE 40 57346 - 58992 960 548 aa, chain - ## HITS:1 COG:BS_ynfF KEGG:ns NR:ns ## COG: BS_ynfF COG5520 # Protein_GI_number: 16078876 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: O-Glycosyl hydrolase # Organism: Bacillus subtilis # 138 542 36 417 422 201 35.0 4e-51 MKNITLLFCLFLANILLGACSGGEDEKKEMDEGKGAYALFLKKSITVSTGESQTDVVVEW AKTSWEITIGEGEIVKSVTPTSGGNSSGEKQYTKVRVSCGANSTMKKRTQTIHLFDKTNE TTVDLLVEQEPPFKSVTLTVDPMVKYQPVVGFGGMYNPKIWCGDNLISAPQLDKMYGAGG LGYSILRLMIYPNESDWSADVEAAKAAQANGAIIFACPWDCTDALADKITVNGKEMKHLK KENYEAYANHLIRYVTFMKEKGVNLYAISVQNEPDMEFTYWTPSEVVDFVKQYGARIRET GVKLMSPEACGMQPEYTDPIINNAEAFAQTDILAGHLYQGFTDLSSGYVKNRHDYICGVY SRIQGKTWWMTEHLFNDGENSDDSSKWEFLKWQYSLNHLGKEIHMCMEGYCSAYIYWYLK RFYGLMGDTDKRSPTSEGEITKNGYIMAHYAQYATETTRIKAVTNNEEVCATAYWDEKTG EVTIVLLNLNGASQWLEIPLAGIKKASAVETNETKNMEVIDTGLMESAEGITVLLSANSI TSVRLTFK >gi|222159284|gb|ACAB01000075.1| GENE 41 59026 - 61257 1711 743 aa, chain - ## HITS:1 COG:no KEGG:PRU_2739 NR:ns ## KEGG: PRU_2739 # Name: not_defined # Def: endo-1,4-beta-xylanase (EC:3.2.1.8) # Organism: P.ruminicola # Pathway: not_defined # 485 742 419 704 707 150 34.0 2e-34 MKYINNKIIAFTLIAASIMSCTDEYNCQLQVEKPQNAAINEYLAQFDLLKSYIDRSGTPF QLAVNVPGSEFVKKEIAFSTVFTNFDAVDMNGSYDPLNTLKEDGTYNFGGMQTAADVAAE AGVTLYGGVLCSNQGQRAAYYNKLIEPIDIPVEVQKGTTKLFNFENDAIGTTYPMTGNSS AIVEEDPAGESGHVLHVGTNDVKAAYSYPKFHVVLPAGRKLGDYVRLNVDLRFVGTDGIW GQGLRVLINGTEFNVGKNGNDFCDGGDKWKREGTINLKDATAPGLVVPESFGSLTEFDLA IGSASTSAQFYIDNISMDYEVSGKGTTVINFEGDELNKTYPMTNGATATVVQDPANESGK 
VLFIDHAAYSFPKFTVKLSEGKTLGDYSGMSMDMRLIAGKYGGGMSVVINGQTFPLKQNA LAYGCDDNNTWKRGGIYVTFVKEGTYPKKDEVPATIIEIPDAMKDLNEIEFSIGSSSTNW TAYIDNLIFTWEAKPQHIEKTPEEKKEIFTKEMEKWIGGMVYAGVNETKSVKAWNIIGNP LDKTVNDNTFNWGEYLGEVDYARTAVKIARDTVKNANVDLELFVSNTFGQYDEMGNMADE LISLVDAWEADNVTKIDGYNILLNAIYSKDVVFQEGNKTMITNLFDKLGKTGKLIRVSDL SMMVEELDGNFIAINKLTEEDRAAAASYMAFIMQEYRRLIPADKQYGISISGITETNTGY KLCPWTSDYNRNEMYEGIVDGLK >gi|222159284|gb|ACAB01000075.1| GENE 42 61290 - 62324 755 344 aa, chain - ## HITS:1 COG:no KEGG:Slin_2105 NR:ns ## KEGG: Slin_2105 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 17 333 4 321 331 286 46.0 9e-76 MKKNLAYIGLVLLILTWTSCESSDNEFPDFDYQTVYFANQYGLRTIELGESEFVDNTLDN QHKMVIKAAWGGGYTNRNNVVINFKVDESLCDNLYFKDTDQPLVPMPASYYTLASDRIAI PKGQIMAGVEVQLTDDFFADEKSISENYVIPLLMTNVQGADSILQGKPVVENPVLTNAGD WSILPQNFVLYAVKYVNPWHGEYLRRGIDHATVAGTSKDIIRHEQFVENDEVVNISTKSM KDNLLTLKTKDESGKDISYTVRLSFAEDGSCTVHSGSQNVVVSGSGKFVSKGEKNSLGGK DRNAIYLDYTVNLTDNNIQLATKDTLVLRTRNVYGGKSLEVVRK Prediction of potential genes in microbial genomes Time: Wed May 18 02:54:44 2011 Seq name: gi|222159283|gb|ACAB01000076.1| Bacteroides sp. D1 cont1.76, whole genome shotgun sequence Length of sequence - 11678 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 1671 1516 ## ZPR_0751 hypothetical protein 2 1 Op 2 . - CDS 1699 - 4518 2281 ## Slin_2103 TonB-dependent receptor plug 3 1 Op 3 . - CDS 4551 - 6422 1389 ## Slin_2102 RagB/SusD domain protein 4 1 Op 4 . - CDS 6439 - 9591 2604 ## Slin_2101 TonB-dependent receptor plug - Prom 9651 - 9710 10.8 5 2 Tu 1 . 
- CDS 9712 - 11445 1251 ## COG3507 Beta-xylosidase - Prom 11616 - 11675 6.5 Predicted protein(s) >gi|222159283|gb|ACAB01000076.1| GENE 1 3 - 1671 1516 556 aa, chain - ## HITS:1 COG:no KEGG:ZPR_0751 NR:ns ## KEGG: ZPR_0751 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 1 545 1 537 580 523 51.0 1e-147 MKKKIIFLVAAALMLNSCDDLFEPAIENFKDVEQMYDDAQYAQGFLVNVYRCVPGYYDNS EYATDDAVINQKNNAFLTMATGGWTSRLWTPINQWTNSFSSIQYINLFLENVDKVRWSDD AEKAQLFARRTKGEAYGLRGMFLYYLLRAHAGFGENGELLGVPRLTEYLTINSDLNLPRA SFADCVQQIYSDLEKAEELLPWEYNDVNEVPADFQSITQDKGKYNTVMGEKSRQLFNGLI ARAYRVRTALLAASPAFQDASNPASWADAANAAAAVLNYNGGLGGLDDKGVEYYSKEVVE NLQKGINPKEIIWRENVSNSEAANSQEANNFPPSLNGSGNMNPSQNLVDAFPMANGYPIS DIVNSNYDKNNPYNGRDPRLAKYIIYNGSIAGVGDVPIYTGRKSGTDDGIDVKEQRSTRT GYYMKKRLRMDVNRALGSVTNQQHYIPRIRYTEMYLAYAEAANEAWGPKGDNGNGYSAYD VIKAIRKRAGIGGTSDPYLEQCAGDRDKMANLIRNERRLELCFEGFRFWDIRRWKENLNE PVRGIDWDRDGHSFNE >gi|222159283|gb|ACAB01000076.1| GENE 2 1699 - 4518 2281 939 aa, chain - ## HITS:1 COG:no KEGG:Slin_2103 NR:ns ## KEGG: Slin_2103 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: S.linguale # Pathway: not_defined # 28 939 58 966 966 842 48.0 0 MNRKFIYIGCTVFAMSLLSMTGVQAQEESKDSLVNVAFGKVAQEDLTHAISTVNTSELTK KTANNNSLVGLESFVGGYNGSSLWGQGPLVLVDGVPRSASSVKASEVESISMLKDAAAVV LYGSRAAKGVILITTKRGKDSPMHIDVRANVGVNVPKAYPKYLDADCYMTMYNEACRNDG IAERYDAGTIYNTAMGTNPYRYPDIDFYSSDYLRKFYTNSDVTGEVYGGNEKTHYYLNFG MNYANSLLKYGESKKAYDMSFNVRGNVDMALASWLKATTNAAIIFDNSYQGRGDFWGAAS TLRPNWFAPLLPIDMMDPNVAYIQEYITNSNHLIDGKYLLGGTSADTTNPFSELLAAGYT KEKARRFMFDVAVMADLGGLLKGLSFKTAYSVDYTSYYSEAYAEKYAIYEPKWSNVNGKD MIIGLTKFNDDTKSTNEYVGKSTYDQTMTFSAQFDYNRTFKTFHNVSATLLGWGYQTQSS ADEGHESSDYHRTSNVNLGLRASYNYEHKYYADFSGALVHSAKLPEGNRNAFSPSVTLGW RLSKEKFMESVGFVDDLKITASYAKLHQDLDISDYYMYKGYFSDKGGWFQWYDGTVGGNT TGSKRGDNPNLDFITREEVRAGIDATLFNRLLRINANYFTQKTSGLLTQGASTIYPSFFD LGTDLSFLPWINYNEDKRTGFDFSLSANKKIGDFDVTLGFNGMVFSSKASIRDEVANESY LLRQGRALDATYGYVCEGFFQDQADIEKHAKQTFGTVRPGDLKYKDINGDGVINSDDQID 
LGAGGWSTSPFTFGLNLTLKYKNFTLFAMGSGQTGAVAFKNNSYYWNRGTSKFSEIVMGR WTEETKETATYPRLSTSNGDNNFRNSTFWMYKNNVFRLSNVQLTYDFPTDTFNGTFIRGL SLYCGGSNLLTIAKEREYMEMSLGAPQYRNFYLGFKAAF >gi|222159283|gb|ACAB01000076.1| GENE 3 4551 - 6422 1389 623 aa, chain - ## HITS:1 COG:no KEGG:Slin_2102 NR:ns ## KEGG: Slin_2102 # Name: not_defined # Def: RagB/SusD domain protein # Organism: S.linguale # Pathway: not_defined # 16 623 13 617 617 544 48.0 1e-153 MNKYINKIFLTSVITLAGMLGATSCTDYLDKSPYSDIEENDPYKNFKNFQGFTEELYNCI PVVSNSEYHTSFNFGEEDYWEPQELRLFARNIDYGDFWGWTTCYYSYPSSIRGGKANSQE RSDKGNLWKLCWYGIRKANIGIANLGNLVSATEEERNLIEGQLYFFRGWFHFMLMEWWGG MPYVDQVLPPDVAPALSRLTWQECAEKCVADFDHAIPLLPVDWDQTTAGRATLGLNDTRI NKIMALAYKGKTLLWAGSPLMNWASGGDKEYDAGRCKRAAEAFGEALKIVEDTKRYELAD MENYNDLFVVYNSGGKLNGVKEAIFRENLVQYDGRWNYNMNTDFRPRSHINAGIKCYPTA NYVNYFGMANGYPIKDMTKADAESGYDPKYPWKNRDPRFYKDIIFDGEWCGETGNGQYCQ LYTNGADSEEKDPQKGCFTGYMNSKLCLKIVNNANVRGAHYAVLSLMRLADVYLMYSEAT AVGYGSPQSKASSFYMTAEDAINEIRDRAGVAHVLDRFTGNTTDFLGEIRRERAVELAFE GFRFMDLRRWMLLDKSPYTLKTKIEFDRVPTASYNKEHPEENEIMNLKEVVLVERKYTDR HYWLPLPKADVYLYEGFGQNPGW >gi|222159283|gb|ACAB01000076.1| GENE 4 6439 - 9591 2604 1050 aa, chain - ## HITS:1 COG:no KEGG:Slin_2101 NR:ns ## KEGG: Slin_2101 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: S.linguale # Pathway: not_defined # 29 1050 25 1038 1038 1048 53.0 0 MVMKNVTKKIQKIPYLFLLMLFSFVTASAQKGITVRGTVLDSNGETIIGASVTLKGNNSV GTISDIDGNFVLTVPSEKSVLIVSYVGMKPQEVKVSSKGMIKVTLEDDTKQLEEVVVVGY GQQKKASVVGAITQTSGKTLERAGGVTSLGSALTGSLPGVITSASSGMPGAEDPQIIIRT QSSWNNSEPLILVDGIEREMSSVDISSVENISVLKDASATAVYGVKGANGVILITTKRGK EGKASVQIKANVTAKVASKLPEKYDAYDTFYLLNNSIEREACLNPNGWNDYTPTSIIDKY RHPANAEEWDRYPNTDWEKALFNSTAMSYNTSVNVSGGTKIVSYFAAADFVSEGDLFKVY DNRRGYDYGFGYNRINVRSNLDFQLTKTTKFSTNLFGSNAQRTVPWDYNDQDATFWAAAY KSAPDAMRPIYSNGMWGWYQPRDADVPNSVYKLAVAGSEKRTTTKMTSDFILEQDLHMLT KGLKFKANFSMDYTFVENKRGVNDMYHNSQRMWVRPDTGEIILEQPELGTGLDAIINPIY WEHQAGSVNTGATYRKLYYSMQMDYARNFGKHEVTALALFSRLKEASGSVFPIYREDWVF RLTYNYAMRYFFETNGAYNGSEKFGPDYRFAFFPSFSLGWMISEEKIVKNNLKFLDMLKL 
RASWGRVGDDAVVQPWQRFTDGRFLYKNQLDYGGNTLMGNIKPANTPYTYYTISRLGNPN ISWETVEKRNLGIDYAFLGGAVAGSVDIFNDTRTDILVKGSDRAIASYFGTDAPYANLGK VSSHGYELELRLNHTFNNGIRAWLNTSMTHAVNKVKFRDDAPLKPDYQKGAGHAINQVYS YIDHGNLATWDDVIGSPVWTTGNDAKLPGDYNIVDFNGDGVIDADDRAPYQYSTMPQNTY NASIGAEWKGFSIFAQFYGVNNVTREVNFPTFRSTAHVAYAEGDYWTPDGSATLPAPRWG TTIDQAASGTRYWYDGSYLRLKNVELSYTFQKSNWLKKMGIKNCRIYLNGDNLYMWTKMP DDRESNTGYSSSDGAYPTMRRFNLGIDITL >gi|222159283|gb|ACAB01000076.1| GENE 5 9712 - 11445 1251 577 aa, chain - ## HITS:1 COG:CC2802 KEGG:ns NR:ns ## COG: CC2802 COG3507 # Protein_GI_number: 16127034 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Caulobacter vibrioides # 43 574 12 543 548 373 38.0 1e-103 MKNTQVIQLMSIVRLSIFMLGITMMSCNSKKEQQLPAIGKSVALFDYFSYKGNDDFYISN PLSDEDYFYNPILPGWYSDPSVCTNGEGDYFLVTSTFTYFPGVPIFHSKDLVNWKQIGHV LNRASQLVNMEGQKVSGGIFAPAISYNPYNKTYYMVTTNVGAGNFFVKTQDPFGEWSEPV MLPEVGGIDPSFFFDEDGKAYIVNNDEAPDNKPEYSGHRTIRIQEFDVKTDKTIGPRKIL VNKGAQPADKPIWIEGPHLYKINGKYFLMSAEGGTGNWHSEVIFRGDSPMGKFLPWKNNP ILTQRHLNSGRPNSVTCAGHADLIQTKEGDWWAVFLACRPINNQFENLGRETFMMPVKWS EDGFPYMTQGDDLVPMIVKREGAKRDTTVTYGNFELIANFDSPVLDMTWMTLRASASDLY SLTETPGYLTLKCVDISATEKKTPAFVCRRLQHHKFECATRMLFNPSNDKETAGMLLFKD ETHQYFFCLNKVGENKNISLKQIGEKEQTLASDEIDADTNEVYLKLVSQGIGYDFYYSID GEKSWKLLCKDVDPSYLSTTTAGGFTGTTIGLYATCK Prediction of potential genes in microbial genomes Time: Wed May 18 02:55:39 2011 Seq name: gi|222159282|gb|ACAB01000077.1| Bacteroides sp. D1 cont1.77, whole genome shotgun sequence Length of sequence - 68669 bp Number of predicted genes - 45, with homology - 45 Number of transcription units - 16, operones - 8 average op.length - 4.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 1321 947 ## ZPR_1028 glycosyl hydrolase 2 1 Op 2 . - CDS 1341 - 3281 1245 ## Fjoh_3873 hypothetical protein 3 1 Op 3 . - CDS 3339 - 5783 1812 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases 4 1 Op 4 . 
- CDS 5809 - 7536 1135 ## COG3507 Beta-xylosidase - Term 7899 - 7948 -0.4 5 2 Op 1 . - CDS 8090 - 10675 2222 ## COG1472 Beta-glucosidase-related glycosidases - Prom 10706 - 10765 7.0 - Term 10708 - 10759 5.0 6 2 Op 2 . - CDS 10772 - 12370 1256 ## COG3507 Beta-xylosidase - Prom 12392 - 12451 6.4 - Term 12574 - 12605 2.5 7 3 Op 1 . - CDS 12667 - 13683 1122 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 8 3 Op 2 . - CDS 13709 - 15001 799 ## COG0738 Fucose permease 9 3 Op 3 . - CDS 15040 - 15963 748 ## BT_3615 hypothetical protein 10 3 Op 4 . - CDS 15979 - 16911 942 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) - Prom 17029 - 17088 3.1 + Prom 16992 - 17051 2.1 11 4 Tu 1 . + CDS 17091 - 18116 763 ## COG1609 Transcriptional regulators + Term 18233 - 18279 2.9 - Term 18218 - 18270 9.7 12 5 Op 1 . - CDS 18294 - 18854 456 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 13 5 Op 2 . - CDS 18866 - 20407 1724 ## COG0423 Glycyl-tRNA synthetase (class II) 14 6 Tu 1 . - CDS 20511 - 21017 483 ## BT_3610 hypothetical protein - Prom 21038 - 21097 2.7 + Prom 20997 - 21056 6.2 15 7 Tu 1 . + CDS 21250 - 22395 740 ## COG1609 Transcriptional regulators + Prom 22419 - 22478 4.4 16 8 Tu 1 . + CDS 22504 - 23691 757 ## BT_3608 hypothetical protein + Prom 23724 - 23783 9.2 17 9 Op 1 . + CDS 23860 - 27009 2189 ## BT_0452 hypothetical protein 18 9 Op 2 . + CDS 27022 - 28656 1144 ## BT_0451 hypothetical protein 19 9 Op 3 . + CDS 28676 - 30373 1104 ## BT_0450 hypothetical protein 20 9 Op 4 . + CDS 30415 - 31983 1054 ## COG4124 Beta-mannanase 21 9 Op 5 . + CDS 32007 - 34211 1521 ## COG1472 Beta-glucosidase-related glycosidases 22 9 Op 6 . + CDS 34218 - 35534 1062 ## COG0738 Fucose permease 23 9 Op 7 12/0.000 + CDS 35552 - 36739 864 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase 24 9 Op 8 . + CDS 36720 - 37511 533 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase 25 9 Op 9 . 
+ CDS 37518 - 38525 511 ## BT_3586 putative dehydrogenase 26 9 Op 10 . + CDS 38530 - 39939 733 ## BT_3585 putative oxidoreductase 27 9 Op 11 . + CDS 39987 - 40757 576 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis 28 9 Op 12 . + CDS 40747 - 42084 1133 ## COG0673 Predicted dehydrogenases and related proteins 29 9 Op 13 . + CDS 42088 - 42327 114 ## BT_3582 hypothetical protein + Prom 42332 - 42391 3.8 30 9 Op 14 . + CDS 42415 - 43758 971 ## COG0673 Predicted dehydrogenases and related proteins + Term 43782 - 43821 4.5 - Term 43766 - 43813 1.9 31 10 Tu 1 . - CDS 43858 - 44445 403 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 44586 - 44645 8.9 + Prom 44467 - 44526 5.8 32 11 Op 1 . + CDS 44747 - 48091 2186 ## BT_4606 hypothetical protein 33 11 Op 2 . + CDS 48104 - 50542 2110 ## BT_4606 hypothetical protein 34 11 Op 3 . + CDS 50555 - 51436 761 ## BT_4606 hypothetical protein 35 11 Op 4 . + CDS 51450 - 52388 829 ## COG0584 Glycerophosphoryl diester phosphodiesterase 36 11 Op 5 . + CDS 52428 - 53510 594 ## Phep_1387 hypothetical protein + Prom 53533 - 53592 2.9 37 12 Op 1 4/0.000 + CDS 53634 - 54620 534 ## COG3712 Fe2+-dicitrate sensor, membrane component 38 12 Op 2 . + CDS 54644 - 58186 2482 ## COG1629 Outer membrane receptor proteins, mostly Fe transport 39 12 Op 3 . + CDS 58209 - 59960 1535 ## Slin_4978 RagB/SusD domain protein + Term 59991 - 60028 6.3 + Prom 59971 - 60030 4.6 40 13 Tu 1 . + CDS 60063 - 61391 690 ## COG2271 Sugar phosphate permease + Term 61452 - 61508 4.2 + Prom 61405 - 61464 5.9 41 14 Op 1 . + CDS 61527 - 64172 2542 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit 42 14 Op 2 . + CDS 64172 - 65032 607 ## BT_3578 hypothetical protein 43 14 Op 3 . + CDS 65047 - 66141 551 ## COG0793 Periplasmic protease + Term 66249 - 66319 22.0 + TRNA 66215 - 66299 69.8 # Leu TAG 0 0 + Prom 66520 - 66579 3.3 44 15 Tu 1 . 
+ CDS 66775 - 67596 429 ## BT_4009 integrase + Term 67637 - 67675 -0.7 + Prom 67643 - 67702 5.5 45 16 Tu 1 . + CDS 67801 - 68668 415 ## BT_4010 hypothetical protein Predicted protein(s) >gi|222159282|gb|ACAB01000077.1| GENE 1 1 - 1321 947 440 aa, chain - ## HITS:1 COG:no KEGG:ZPR_1028 NR:ns ## KEGG: ZPR_1028 # Name: not_defined # Def: glycosyl hydrolase # Organism: Z.profunda # Pathway: not_defined # 8 440 7 434 437 565 64.0 1e-159 MKNLNFWGVSLLLVMMAVSGTAQNPIIQTKYTADPAPMVYNDTVFLYTTHDEDDAEGFKM LDWLLYTSIDMVNWTDHGAVASLKSFDWVKRDNGAWAEQVIERNGKFYMYCPIHGNGIGV LVSDSPYGPFKDPLNKPLVWQKEHWYDIDPTVFIDDDGQAYMYWGNPNVYYVKLNEDMIS YSGEIVQVENKPEHYQEGPWVYKRNGHYYMAFASTCCPEGIGYAMSDKATGPWSTKGYIM RPTERSRGNHPGIIDYKGSSYVFGLNYDLLHLETLDHKERRSVSVAKMHYNPDGTIKEVP YWQETKLEQIENFNPYRRVEAETMAWGYGLKAENHKNGGLYITDIDDNEYLCVRGADFGK KGAKKFSVSAACVEKGGMIEIRLDSTEGPVIGSVSISPTGGLDIYKQMSCRIKNAKGVHD LYFCFKGEKGNKLFNLDYWE >gi|222159282|gb|ACAB01000077.1| GENE 2 1341 - 3281 1245 646 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_3873 NR:ns ## KEGG: Fjoh_3873 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 1 644 1 645 647 956 68.0 0 MKFFIVMAMLLGSSVASAENKQIASPDGKLVVTVADMDGRPSYSVSYDNVLFLKPSPLGM IANIGDFSSGMSLEKNVSTNKIDETYELASIKKSKVRYVANEAVFSFTQQGKTIYDVIFR ISNNDVAFKYKIYPQGETLSCVVKQEVTGFVFPDGTTTFLCPQSKPMGGFARTSPSYETS YTADDVAGKNGWGEGYTFPCLFRNGDNGWTLVSETGVNGGYCASRLLGHKEGVYTIGFPQ EGEANGNGTVSPGIALPGETPWRTITVGKTLAPIVETTVPFDVVKPLYSAKGEYTYGRGS WSWIIGMDGSTNYKEQLRYIDFSAAMGYQSVLVDALWDKQIGHDKIEELAKYGKDKGVAL YLWYNSNGYWNDAPQTPRGIMDNAIARRKEMKWMQSIGIRGIKVDFFGGDKQMTMQLYED ILSDANEYGLLVIFHGCTLPRGWERMYPNFASSEAVLASENLHFSQGSCDNEAFNATLHP FIRNTVGSMDFGGSALNKYYNADNAPRGSRRVTSDVYALATAVLFQSPVQHFALAPNNLT DAPSWAIDFMKEVPTTWDEVRFIDGYPGKYVILARRHGDKWYIAGVNAQKETLKLKVNLP MFSNGEKVRLFSDDKALQGSVKQIEIGKNQELQLSIPCNGGVLITK >gi|222159282|gb|ACAB01000077.1| GENE 3 3339 - 5783 1812 814 aa, chain - ## HITS:1 COG:SSO3022 KEGG:ns NR:ns ## COG: SSO3022 COG1501 # Protein_GI_number: 15899728 # Func_class: G Carbohydrate transport 
and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Sulfolobus solfataricus # 35 813 3 729 731 504 37.0 1e-142 MKIHHLFWGICLCFSTNILFAQNYQKTSSGIKTTVNAVDIEVQFFAPAVARVIKSPEGVA YEKQSLSVIAKPEKVSFKADIQDNKIVLNTSELSVSVDTGTGIVSYFSKDGKSLLAEKSG MQFIDFDDAGTKTYQVYQPFVLDKEEAIYGLGQLQNGKMIQRNMTKNLIQGNVEDVSPFF QSTKGYGVFWDNYSPTLFTDNEVETSFRSEVGDCVDYYFMYGKNADGVIAQVRSLTGQAP MFPLWTYGYWQSKERYKSQEEVVDVVRKYRELGVPLDGIIQDWQYWGHNYLWNAMDFQNP TFNHPQKMIEDVHAMNAHMAISIWSSFGPMTKPYRELDKKGMLFNFTTWPQSGLESWPPN MEYPSGVRVYDAYNPEARDIYWQYLNDGIFKLGMDAWWMDSTEPDHLDWKPEDMDTKTYL GSFRKVRNAYPLMTVGGVYDHQRAMTSDKRVFILTRSGFLGQQRYGANVWSGDVASTWES FRNQIPAGLNFSLCGMPHWNSDIGGFFAGHYNKSWNDDSASKNPLYQELYVRWLQFGTFN PMMRSHGTDVYREIYKFGKKGEPVYDAIEKMIGLRYSLLPYIYSTSWEVSNRQSSFMRAL MMDFVDDRKVWDINDEYMFGKSLLVAPIAHAQYTPEAVVKVSEEEGWNRDGAKKTKTDVA VDFMETKSTNIYLPAGTLWYDFWTNEKHEGGKEITKETTLDVIPLYVKAGSIIPVGPQVQ YATEKPWDHLELKVYAGANGNFILYEDEFDNYNYEKGAYTEIPISWNNVSRKLTIGARKG TYEGMLKNRKFTVTLQDGTQKNVDYNGKAISVKF >gi|222159282|gb|ACAB01000077.1| GENE 4 5809 - 7536 1135 575 aa, chain - ## HITS:1 COG:CC2802 KEGG:ns NR:ns ## COG: CC2802 COG3507 # Protein_GI_number: 16127034 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Caulobacter vibrioides # 41 573 14 546 548 355 38.0 2e-97 MRTLKLILSGWQSVGLVVFFMLFFNMEVSGKGKITKNAPVFTQFIYQGEDAIYQNNPLKP GEFYNPILQGCYPDPSITRKGNDYYLVCSSFAMFPGVPIFHSNDLVNWKQIGHVLDRTSQ LKVEDCGISAGVYAPAIRYNPNNDTFYMITTQFSGGFGNMVVKTKNPENGWSDPVKLQFE GIDPSLFFDDNGKAYVVHNDAPAKANERYSGHRVIKIWDYDVENDKVVPGTDRIIVNGGV NIEEKPIWIEAPHIYKKDGRYYLMCAEGGTGGWHSEVIFVSDHPKGPYLPANNNPILTQR YFPANRANKVDWAGHADLVEGPDGKYYGVFLGIRPNEKNRVNTGRETFILPVDWSGTFPV FENGLIPMKPTLKMPSGVENQTGKNGYLPSGNFVFKDDFSDKTLDLRWIGLRGPREEFVD MTDKGLRIIPFTSNINEVKPTSTLFYRQQHNQFTAAATMEYKPKNEKDFAGITCYQNERY HYVFGITKKGKDYYLILQRTEKGQASVLGEVKIETEKPVTLQVTANGDDYRFNYSIDGKG FLNLGGTVSGDILSTNEAGGFTGAMIGLYATSVGN >gi|222159282|gb|ACAB01000077.1| GENE 5 8090 - 10675 2222 861 aa, chain - ## HITS:1 COG:XF0845 KEGG:ns NR:ns ## COG: 
XF0845 COG1472 # Protein_GI_number: 15837447 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Xylella fastidiosa 9a5c # 34 846 31 858 882 584 41.0 1e-166 MKRGWIPIMGVCLVLSFSACKQLLPYQDTSLTAEQRAEDLLPRLTLEEKVSLMQNASPAI PRLGIKEYEWWNEALHGVGRAGLATVFPQSIGMGASFNDSLLYEVFNATSDEARVKSRIF GESGVLKRYQGLTFWTPNVNIFRDPRWGRGQETYGEDPYLTGQMGMAVVRGLQGPEDARY DKLHACAKHFAVHSGPEWNRHSFDAENIDPRDLWETYLPAFKDLVQKAHVKEVMCAYNRF EGEPCCGSNRLLMQILRDEWGYKGIVVSDCGAISDFYRPGTHGTHPDKEHASAGAVRAGT DLECGSEYASLADAVKAGLIDEKEIDISLKRLLTARFELGEMDEQSAWSEIPTSVLNSKE HQALALRMARESLVLLQNKNNILPLNTHLKVAVMGPNANDSVMQWGNYNGIPAHTVTLLE AVRAKLPEGQIIYEPGCDRVDGKTLQSLFDECSINGKPGFLAEYWNNRDREGEVVATDQI STPFHFATTGATTFAPGVEITNFSARYESVFRPSQSGDVAFRFQLDGEVTLIIDSEQVAR KIYVKNPTNLYTLQAKAGKEYHIEILFKQRNERATLDFDLGKEVGIDLNLAVKKVMDADV ILFAGGISPSLEGEEMPVEVPGFKGGDRTDIELPDVQRDLLKALKKAGKKVVFINYSGSA IGLVPETTTCEAILQAWYPGQAGGTAIVDALWGEYNPGGRLPVTFYKDVNQLPDFEDYSM KGRTYRYMQQQPLFPFGHGLSYTTFTYGEAKLSKNTIAKGENVVLTIPVSNVGQRDGEEV VQVYLRRPGDKEGPRYTLRAFKRVHIPAGKTESVAISLTHESFEWFDEATNTMHPVADTY ELLYGGTSEQNQLKSVTVHVQ >gi|222159282|gb|ACAB01000077.1| GENE 6 10772 - 12370 1256 532 aa, chain - ## HITS:1 COG:CAP0114 KEGG:ns NR:ns ## COG: CAP0114 COG3507 # Protein_GI_number: 15004817 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Clostridium acetobutylicum # 27 526 42 529 531 310 37.0 5e-84 MLKTKINTILLCILFTLFFPLSAGAQYRNPILYADVPDMSVCRAGDYFYMVSTTMHLMPG APIMRSPDMKHWETISYVFPRIDDGPRYDLLEGTAYGQGQWASSIRYHDGKFYVWFTANG APGRGFVYTATDPAGPWKLLSRPPHFHDGSLLFDDDGRVYLFHSTGQLTELKPDLTDVLP GGINQQIFERDADEQGLLEGSSVIKHNGKYYLLMISMDWSIPGRLRREVCYRADKITGPY EKRVILETEFDGHGGVGQGCIVDGKNGEWYGLIFQDRGGVGRVPCLMPCTWTEDGWPMLG DKDGHIPNDTTLSYMSMDGICGSDDFSASGLSLYWQWNHNPVDQAWSLTDRPGFLRLKTS RVVDNLFVAPNTLTQRMVGPKCMGTVSLSLGGMKDGDRAGLSAFNGDSGVLTIEKNGNKL SLVMSEQKSVFEKTKRAISRVNMTEQARIPLNKELVYLRVEGDFTNGRDEARFSYSLDGK TWLPVGLPIKMKFDYTRMFMGSKFAIFNYATRSVGGYVDVDSFDYSFCDASM >gi|222159282|gb|ACAB01000077.1| GENE 7 
12667 - 13683 1122 338 aa, chain - ## HITS:1 COG:STM1542 KEGG:ns NR:ns ## COG: STM1542 COG1063 # Protein_GI_number: 16764887 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Salmonella typhimurium LT2 # 1 338 3 340 341 194 33.0 2e-49 MKAVQIVNPSEMKVVELEKPTVGAGEVLVRIKYVGFCGSDLNTFLGRNPMVKLPVIPGHE VGAVIEEIGPNVPAGFEKGMNVTLNPYTNCGKCASCRNGRVNACEHNETLGVQRNGVMCE YAVLPWTKIIPAGNISPRDCALIEPMSVGFHAVSRAQVIDNEYVMVIGCGMIGIGAIVRA ALRGATVIAVDLDDEKLELAKRVGASYVINSKTENVHERMQQVTEGFGADVVIEAVGSPV TYVMAVDEVGFTGRVVCIGYAKSEVAFQTKYFVQKELDIRGSRNALPADFRAVINYMKEG NCPVEELISRIAKPEGALEAMQEWTANPGKVFRILVEF >gi|222159282|gb|ACAB01000077.1| GENE 8 13709 - 15001 799 430 aa, chain - ## HITS:1 COG:fucP KEGG:ns NR:ns ## COG: fucP COG0738 # Protein_GI_number: 16130708 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Escherichia coli K12 # 2 415 20 422 438 253 38.0 7e-67 MKKNTYTIPLALVFCLFFLWAISSNLLPTMIRQLMKTCELNTFEASFTETAYWLAYFIFP IPIAMFMKRYSYKAGIIFGLVLAAIGGLLFFPAAILKEYWAYLCIFFIIATGMCFLETAA NPYVTVLGAPETAPRRLNLAQSFNGLGAFIAAMFLSKLILSGTHYTRETLPVDYPGGWQA YIQLETDAMKLPYLILAILLIAIAVVFIFSKLPKIGDEGETASSDKTTSSGKTKEGSQKE KLIDFGVLKHSHLRWGVIAQFFYNGGQTAINSLFLVYCCTYAGLPEDTATTFFGLYMLAF LLGRWIGTGLMVKFRPQDMLLVYALMNILLCGAVMIWGGMIGLYAMLAISFFMSIMYPTQ FSLALKGLGNQTKSGSAFLVMAIVGNACLPQLTAYFMHANEHIYYMAYCVPMICFVFCAY YGWKGYKVID >gi|222159282|gb|ACAB01000077.1| GENE 9 15040 - 15963 748 307 aa, chain - ## HITS:1 COG:no KEGG:BT_3615 NR:ns ## KEGG: BT_3615 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 307 1 307 307 566 89.0 1e-160 MDYTIIDAHAHLWLRQDTVVDGLPIRTLENGRSEFMGEIRQMVPPFMIDGVNSAEVFLSN MDYAQVAAAVITQEFIDGIQNDYLSEVVSHYPNRFFVCGMCEFRKPGFLEQAKELIAKGF KAIKIPAQRLLLKEGRVMLNNEEMMQMFHSMEERNVMLSIDLADGATQVPEVEEIIQECP RLKIAVGHFGMVTRPDWKEQIRLARHPNVMIESGGITWLFNDEFYPFKGAVKAIREAADL 
VGMEKLMWGSDYPRTITAITYKMSYDFVVKSSELTEEDKRLFLGENARNFYGFTDLPVLP YIKNMSE >gi|222159282|gb|ACAB01000077.1| GENE 10 15979 - 16911 942 310 aa, chain - ## HITS:1 COG:YMR041c KEGG:ns NR:ns ## COG: YMR041c COG0667 # Protein_GI_number: 6323684 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Saccharomyces cerevisiae # 13 259 14 266 335 124 33.0 2e-28 MQYHEIGKTGMKVSSLSFGASSLGGVFHDLKEKEGIQAVFTAVEAGMNFIDVSPYYGHYK AETVLGKALKDLPRDRYYLSTKVGRYGKDGVNLWDYSAKRATESVYESMERLNIDFIDLI NVHDVEFADLNQVVNETLPALVELREKGVVGHVGITDLQLENLKWVIDHSPSGTIESVLS FCHYCLCDDKLADFLDYFESKEIGVINASPLSMGLLSERGVPAWHPAPKPLVEACRKAME HCKAKNYPIEKLAMQFSVSNPRIATTLFSTTNPENVKKNIAFIEEPIDWELVREVQEIIG EQKRVSWANS >gi|222159282|gb|ACAB01000077.1| GENE 11 17091 - 18116 763 341 aa, chain + ## HITS:1 COG:YPO0108 KEGG:ns NR:ns ## COG: YPO0108 COG1609 # Protein_GI_number: 16120455 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Yersinia pestis # 1 335 1 337 342 177 31.0 3e-44 MENRKHTSLKDLAQALGVSIPTVSRALKDSPEISRELCAKAKALAKEMNYRPNPFAMSLR KNAPRIIGVIVPDIVTHFFASILNGIENMAIANGYFVIITTSYESYEHEKRNIENLVNMR VEGIIACLSQETTDFSHISALKDINMPLILFDRVCLTDQFSSVIADGAQSAQIATQHLLD NGSKRVAFIGGANHLDIVKRRKHGYLEALRENRIPIEKELVVCRKIDYEEGKIATETLLS LPQPPDAILAMNDTLAFAAMEVIKNHGLRIPNDVAIIGYTDEQHANYVEPKLSAVSHQTY KMGETACQLLIDQIKGDKTIKQVTIPTHLQIRESSIKKTKM >gi|222159282|gb|ACAB01000077.1| GENE 12 18294 - 18854 456 186 aa, chain - ## HITS:1 COG:STM4397 KEGG:ns NR:ns ## COG: STM4397 COG0545 # Protein_GI_number: 16767643 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Salmonella typhimurium LT2 # 80 184 123 220 220 62 34.0 7e-10 MSKKIYLFSLVLLALAFTACSETEETSRYDNWQARSVAFIDSIASVYNSPENQALANDDP EKLHAFPDPTNSQTIYVKKIKKGEGTESPKYTSTVSAHYRMSYFNGDVVQQTYTGTEPTE FDSPTNFTLNGVISGWSYTLMYMKVGDFWTLYIPYQSGYGSSTNDGNLQAYSALVYNVRL EKIVER >gi|222159282|gb|ACAB01000077.1| GENE 13 
18866 - 20407 1724 513 aa, chain - ## HITS:1 COG:SA1394 KEGG:ns NR:ns ## COG: SA1394 COG0423 # Protein_GI_number: 15927145 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase (class II) # Organism: Staphylococcus aureus N315 # 10 502 8 460 463 471 49.0 1e-132 MAQEDVFKKLVSHCKEYGFVFPSSDIYDGLGAVYDYGQMGVELKNNIKKYWWDSMVLLHE NIVGIDSAIFMHPTIWKASGHVDAFNDPLIDNKDSKKRYRADVLIEDQLAKYDDKINKEV AKAAKRFGESFDEAQFRSTNGRVLEHQAKRDALHTRFAKALNDGNLEELRQIIIDEEIVC PISGTKNWTEVRQFNLMFSTEMGSTSEGAMKIYLRPETAQGIFVNYLNVQKTGRMKVPFG IAQIGKAFRNEIVARQFIFRMREFEQMEMQFFVKPGTELDWFKKWKEIRLKWHKALGFGD ASYRYHDHDKLAHYANAATDIEFLMPFGFKEVEGIHSRTNFDLSQHEKFSGKSIKYFDPE LNESYTPYVIETSIGVDRMFLSIMSASYCEEQLENGESRVVLKLPAALAPVKLAVMPLVK KDGLPEKAREVIDSLKFHFHCQYDEKDSIGKRYRRQDAIGTPYCVTVDHQTLEDNCVTLR NRDTMQQERVAISELNNIIADRVSITSLLKTLQ >gi|222159282|gb|ACAB01000077.1| GENE 14 20511 - 21017 483 168 aa, chain - ## HITS:1 COG:no KEGG:BT_3610 NR:ns ## KEGG: BT_3610 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 153 1 155 156 211 71.0 6e-54 MAVPYVVRKKADLTSGERKELWYGVPKKLIDPIRNREFAEYMEKRSGFHRGQIDGILTEM VDAIRSLLSIGQPVTIEGLGTFHTSLTSPGFERPEQVTPGKVSVSRVYFVACPEFSREVK KMKCMRIPFNLYMPEEMLTKEMKKADREQEREEYDRMVAPEIEEEVQE >gi|222159282|gb|ACAB01000077.1| GENE 15 21250 - 22395 740 381 aa, chain + ## HITS:1 COG:YPO4034_1 KEGG:ns NR:ns ## COG: YPO4034_1 COG1609 # Protein_GI_number: 16124154 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Yersinia pestis # 3 245 7 255 265 82 23.0 2e-15 MIKILLLIDYSSEFDRKLLRGLVQYSKENGPWLFYRLPSYYSTMHGEQGILKWAKEWKAD AIIGQWNNDTIDLQKELNIPVVLQNYHHRSVTYSNLTGDYKGTGRMAAQFFAKRMFRNFA YFGVKGVVWSDERCEGYRQEVKRIGGEFFSFESDKQEDEIRMEVSQWLQQLPKPVALFCC DDAHALFISETCKMTNIPIPEEIALLGVDNDELMCNISDPPISSIELEVERGGYSIGRLI HQQIKKEHEGTFNIVINPIRIELRQSTEKHNIKDPYILEVVKYIDSHYSSDLTIESLLAN IPLSRRNFEVKFKNALNTSIYQYILNCRCNHLADLLLTTDRPLADLAMEVGFTDYNNIAR IFKKFKGCSPIEYRQKKTRQR >gi|222159282|gb|ACAB01000077.1| GENE 16 22504 - 
23691 757 395 aa, chain + ## HITS:1 COG:no KEGG:BT_3608 NR:ns ## KEGG: BT_3608 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 35 387 31 383 392 517 70.0 1e-145 MQRHTIRGKTRTIIYPIIICLGITSIMGCKRLEINQTKRNWEKLSAYFTPPPLYQEKFGT YRNPLLFYNGDTVKNADDWLKRRKEIKDKWLNLIGHWPAIITNQKLEIIKTTEREDFKQH LVRFYWTPLEQTYGYLLEPNKKGKHPAVITVFYEPETAIGWGGKANRDFAYQLTKRGFVT LSLGTRQTTKDKTYSLYYPTINNSTMQPLSVLAYAAANAWEVLARVESVDSTRIGIMGHS YGAKWAMFASCLYEKFACTAWSDPGIVFDETKDNYINYWEPWYLGYYPPPWKKIWSNNGN NSSTSVYARLCKEGHDLHELHSLLAPRPFLVSGGYSDNVDRWIPLNHSVAVNRLLGYHHR VAMTNRPKHDPTPESNETIYKFFEWFLKRKTPKED >gi|222159282|gb|ACAB01000077.1| GENE 17 23860 - 27009 2189 1049 aa, chain + ## HITS:1 COG:no KEGG:BT_0452 NR:ns ## KEGG: BT_0452 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 1049 7 1074 1074 824 43.0 0 MKKIRWNPRLVTSLFITLLMCTLDCLAQEVIKGFVKDGNDEPLPGVSVAVKGATNVGTIT NVDGKYSINAHKNDVLVFSYIGMVSQEVKVGNKTNINITLKEDVSSLDEIVVVGYGTQKR GSLTAAISTVSDKEILKAPTMGISNIIGARVAGISAVQASGQPGADNASLSIRGQSGIIY VIDGIRRSASDFNGLDPNEIESVSVLKDASAVAVYGLDANGAFIVTTKKGQTDKVTISYT GTVGISQNAEEQEWLDGPGYAYWYNKARLLQGDTEVFTVDMVRKMREGVDGWGNTNWYDK VYGTGVRQHHNISASGGSEKIRFFTSIGYLEEKGNIDKFKYRRMNLRSNIDAQLAKGLSL SLGVSGRIEKRDAPKFSADPDDWMNIPQQVCYALPYVQDTYEYNGKIYDVSTPTSGSPVA PIASIYDSGYNRSNQSYMQSNFSLKYDTPWLKGLSLKFQGAYDLVHGMTKQLTKPNEVMI MDLPNAATTTLTYHKDYSVLKDTPILSESASRAQEFTTQTSITYDNKFGDHSIGVLLLAE TRERNSNNLGVTGTGLDFIQLDELNQITGFTKEGKEQPAIPSGSSSQTRVAGFVGRLNYN YADKYYLEASLRHDGSYLFGGMNKRWVTLPGLSLAWRINNEKWFHATWVDNLKLRAGIGK TATSGIQPFQWRNTMSTSPNSVVIGGASQTAIYPSVLGNPNLTWAQCLNYNIGIDVTLWN GLLGMELDAFYKYEYDKLSSVTGSHTPSMGGYYFSTANVNKADYKGFDVTFTHQNRIGSF SYGAKLIWSYAYGRWLKYAGDSENTPEYRRLTGKQIGSKMGFIAQGLFQSEEEIANSATM PDRPAYPGYIKYMDRNGDGIITVNQDQGYVGKSSRPTHTGSFNLFGNWKGFDFDILASWG LGSDVALTGVYTATGSSGIQSATAFTRPFYQNGNAPVYLVANSWTPENTNAEFPCLEINP RSLNNGLASTFWYRNGNYLRIKTAQIGYNFPKKWLSPLGVEALRLYVEGYNLLTFSAVSK YNIDPESPAVNNGYYPQQRTYTLGAKITF >gi|222159282|gb|ACAB01000077.1| GENE 18 27022 - 
28656 1144 544 aa, chain + ## HITS:1 COG:no KEGG:BT_0451 NR:ns ## KEGG: BT_0451 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 544 1 553 553 382 41.0 1e-104 MKKLCINISLFFMLLTLSSCNDWLDEVKQTTTVSDEIIWQDETQVDKYVNSFYPLLHDYG QFGEAQFYGSFTESLTDAFKYGSYTLGHRAGHPNLYVLTPDAISPDNCLYSIWLRSTAYK QIRETNQFLSLQRKYSEFSADRNKLWEAQVRFFRAFVYFQLAKRHGGVILYDDLPVSTNK ARSSAEETWQFIADDLDFAANNLPKEWDAANKGRITKGAAYALKSRAMLYAKRWQDAYDA ANNVIALKLYGLTDKYEDAWKGNNKEAILEFDYDAANGPNHLFDRYYVPQCDGYDNGSTG TPTQEMVECYESKNGEKIDWTPWHGITDETPPYDQLEPRFAATVIYRGCTWKGKKMDCSL DGKNGVFMPYREQGTSYGKTTTGYFLRKLLDETLTDVKNGKSAQPWVEIRYAEVLLNKAE AAYRLNKIGEAQSAMNEVRARVKVNLPGKSSTGEAWFKDYRNERKVELAYEGHLFWDMRR WELAHIEYNNYRTHGFKIIGATNTYEYVDCDGQDRKFIKKLYVLPVPSEELKNNSLIEQY DEWK >gi|222159282|gb|ACAB01000077.1| GENE 19 28676 - 30373 1104 565 aa, chain + ## HITS:1 COG:no KEGG:BT_0450 NR:ns ## KEGG: BT_0450 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 564 1 564 565 211 30.0 5e-53 MKNKIISIIQALVLLWGVTSCHSPEELGSDIQREGINSITASFPNDDSSENSFKGEIDYT NHSITFVFPYNYPKLSDNVLPTSALKRVRLSASLANNVSVTPPLLYMDLTQDNYITVKDQ TSGTSTEFKVIGEIRKSNECSITKFDIPAIGLLGVINENEKTISLVTIDNVGEQIANIDI SHGATCSPNPETEALDYEQEQTIVVTAQNGIDKATYTVKKDIPQKTVAGIRQGSGKLLWS KRLSEISGILLPGKVTGLAVVDKYVVINERANDRAIYLNSQTGEIAGSMDISQFAGDNSN FHATADRGNNILFCSYTPSGGTFTVWKANGVNEKPQKYIEYKTGTNIRFGWKISIQGDLD ANALITTPVFQKDSKVQFARWRVINGTLQSQSPEFVIMSSSLLTSNWIKWADVIYADDTD TQSDYFLASHVTDTSAKRYFYWFKGTDNSIKAANTGAPGNTIINAVDYAVFNKVPYVIYN HVNSFNYAVTGSDAVRMYDLSSGSFDNQIVVCPDKIYGGLENSGQNTEGTGDVVFKVAQN GYYLYVYLVFSNGGIACYQYDCIDM >gi|222159282|gb|ACAB01000077.1| GENE 20 30415 - 31983 1054 522 aa, chain + ## HITS:1 COG:BMEII0724 KEGG:ns NR:ns ## COG: BMEII0724 COG4124 # Protein_GI_number: 17989069 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-mannanase # Organism: Brucella melitensis # 296 472 45 219 243 60 29.0 6e-09 MKLKNLWLIAVAILIYACNAEETRSWEVTFDPNKPKDPNEQSIINVAAGKFYKMNVVANS 
AYADKAPLDPWTSGTKLTDEETGTLSTKNRGVGWDAQTVEVIIDLGSLRSITEVSVHAIS DPTSQIVFPAQIEVSTSKDKSQWEKAASPITYSDSNGKSDAWGKTDFSNVACRYVKATLK SSASTSMMMLIDEIKVMGEFHNDMKYVPEKGCYHGAFPPLYGFDPEDREGSTDQCAVALF EKLVGKQLSMILWYQNMEPGRNFSEMQTVREKYWGKNYQGKYRFFLYGWLPVIPTQQMAN GELDDFHKSYFAEVAAQKVRDMGPIWFRPANEMNGSWTPYYGDPTNYVKAWRRMYNIAEQ LGVTAYNVFVWSPNSVSMPGTEANAMKNYYPGDMYVDWLGVSCYPPSLSATYPEDRRYPL TLMQGIKQVSADKPIMISEGGYSSTCDHQRWVREWFKLKDEEPRVKAVVWENHENAENGD RRLQSDPLALELYKELVQDPYWLDLIPDAVYSEIETRKNNSK >gi|222159282|gb|ACAB01000077.1| GENE 21 32007 - 34211 1521 734 aa, chain + ## HITS:1 COG:YPO2803 KEGG:ns NR:ns ## COG: YPO2803 COG1472 # Protein_GI_number: 16123001 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Yersinia pestis # 29 724 29 705 793 413 36.0 1e-115 MKRLTPYSWLILIIFSSCLTGNKVASTDKQIESLLSQMTLEEKIGMLHSNTMFSSTGVPR LGIPDLHYSDGPHGVRFEGVANGWESARWDNDACSYLPALSALASTWNRDLAQLYGEVLG AECKARGKHVSLAPGVNIHRSLLNGRNWEYFSEDPFFSGELAVPYIQGVQSQGVASCVKH FALNSQAYNQYKVSVEVDERTLHEIYLPAFEAAIQRGGAMAAMAAYNKVRGLWCTESPYL LDTLLRDELGFDGLVVSDWNAVHNTERTALCGMDVEMGTSIKENGKYAFNKYYLADPLLK KVRNGEIPEEAVNKKVRNILKLMIRLDLIGQAPYDTTGMAAKLAMPIHTKAARKIAEESL VLLKNSKDMLPLDPAQYKNVAVIGANATEVFAAGGGSTKLKAKYEVTPLEGLQNLLDGKA RIEYAPGYQLNKKAYKVGHWFTNEFDKSDEELYKKAINTAAQAELVIYIGGTSHEHGSDC EGYDKPNLKLPYQQDRLLKGILEVNPNTVVVLISGGPVEIGEWYNDATALLYCSFLGMEG GNALARTLFGEVNPSGKLTTTWCKCLEDMPDHVFGEYPGINDTVRFKEGLMVGYRYFDTY RVVPQFEFGYGLSYTTFTYSDIKMKPVWKESDTEFAVSFTITNTGKRYGQEIAQLYLHQN KCSVERPFKELKGFTKVGLKPGESKQVTIKLPRRALQYYDTESKHWKDEPGMFTVLIGAS SRDIKLQKNFELKK >gi|222159282|gb|ACAB01000077.1| GENE 22 34218 - 35534 1062 438 aa, chain + ## HITS:1 COG:XF1462 KEGG:ns NR:ns ## COG: XF1462 COG0738 # Protein_GI_number: 15838063 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Xylella fastidiosa 9a5c # 59 429 3 360 377 242 37.0 1e-63 MGNNNLISSKKTYYISIAILAGMFFIFGFVSWVNAILIPYFRISCELTHFESYFVAFAFY IAYFVMAIPSGVLLKKVGFKRGIMYGFMLTALGAFLFVPAALARQFEIFLAGLFSIGTGL 
AILQTAANPYVTIIGPIDSAARRISIMGICNKFAGIVSPLIFAALILNVTDKELFATIES GTLDIVTKNAMLDELIQRVIVPYAVLGVILLLTGIGIRYSILPEINTDEENSTEDTGSHH HTRKSIFDFPYLILGAVAIFLHVGTQVIAIDTIINYANSMGMDLLEAKTFPSYTLACTMI GYLLGILLIPKYVSQKNALIGCTIIGLLLSFGVVFADFEVTLFGHHANASIFFLNALGFP NALIYAGIWPLSIHELGKFTKTGSSLLIMGLCGNAILPLIYGHLADMYSLRFGYWVLIPC FLYLVFFAVKGHKIDSWK >gi|222159282|gb|ACAB01000077.1| GENE 23 35552 - 36739 864 395 aa, chain + ## HITS:1 COG:SA0656 KEGG:ns NR:ns ## COG: SA0656 COG1820 # Protein_GI_number: 15926378 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Staphylococcus aureus N315 # 1 391 1 389 393 219 35.0 6e-57 MERLIITNGKLILPTGIKMGQTLICENGKIEQIIPNGSYQPLVGDKVIDARQNYVSPGFI DMHIHGGGGHDFMDGTIEAFLGVAETHAKYGTTAMVPTTLTSTNEELMTTFTVYRKAKEM NINGSQFIGLHLEGPYFSPKQCGAQDPNFLKKPQAEEYNAILEASKDIIRWSVAPELEGA LALGQTLQQHHILPSIAHTDAIYEEVEKAFTAGYTHVTHLYSAMSSVTRKNAFRYAGVVE AAYLIEDMTVEIIADGIHLPKPLLQFVYKFKGVDKTALCTDAMRGAGMPDGESILGSLNN GQKVIIEDGVAKMPDRKAFAGSVATTDRLVRTMIYLAGVPLIDAVRMMTLTPARILHIDK EKGSLEIGKDADIVIFDNQININNTILKGHVIYTK >gi|222159282|gb|ACAB01000077.1| GENE 24 36720 - 37511 533 263 aa, chain + ## HITS:1 COG:all0727 KEGG:ns NR:ns ## COG: all0727 COG0363 # Protein_GI_number: 17228222 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Nostoc sp. 
PCC 7120 # 10 258 5 253 258 213 43.0 4e-55 MLSILNETKTKVFQQEQLTVRIFPSIQEMGSVAAKEVGDQICRLLESKPEINMIFAAAPS QNEFLSHLIHDKRIDWTRINAFHMDEYIGIHPEAPQSFGHFLRIRIFDKVPFKKVNYLNG LAENLEEECQRYADLLTKHPVDIVCLGIGENGHIAFNDPDVADFNDPKLVKVVELDPICR QQQVNEKCFKTLDLVPKEALTLTIPALLKAEWMFCIVPFKNKAQAVYQTVYGEVSEKCPA SILRRKENSSLYLDPESAERINL >gi|222159282|gb|ACAB01000077.1| GENE 25 37518 - 38525 511 335 aa, chain + ## HITS:1 COG:no KEGG:BT_3586 NR:ns ## KEGG: BT_3586 # Name: not_defined # Def: putative dehydrogenase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 333 1 333 334 566 81.0 1e-160 MKNRISLLLILLAISFSIHAQKVMRIGIIGLDTSHSTAFTELINSGSDETFSKGFRVVAA YPYGSKTIQSSYERIPGYIEKVKANGVEITSSIADLLEKVDCVLLETNDGRLHLEQAVEV FKSGKICYIDKPVGATLGDAIAIYEMAEKYNAPVFSSSALRFTPQNQKLRNGEFGKILGA DCYSPHKVEPTHPDFGFYGIHGVETLYTIMGTGCESVNRMSSDRGDVVVGRWKDGRIGTF RAIIKGPQIYGGTAYTSKGAVAAGGYQGYKALLEQILKYFQTGISPISREETIEIFTFMK ASNMSKEENGRIVTLEEAYQKGWKDARKLIKTYNK >gi|222159282|gb|ACAB01000077.1| GENE 26 38530 - 39939 733 469 aa, chain + ## HITS:1 COG:no KEGG:BT_3585 NR:ns ## KEGG: BT_3585 # Name: not_defined # Def: putative oxidoreductase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 467 1 467 468 679 69.0 0 MKFIYKGIGLISILVIVSSCSFSKKQTMNKAEANDSCKIEVVVLDPGHFHASLLQKDALA VINDTIRIYAPEGIGVNQYLESIDSYNHRPKSPTTWKKQVYTGEDYLQKMLSDHKGDVVI LAGNNQKKTRYIIESIKAGYNVLADKPLAINSQDFQLLTEAYQLAQQKGLLLYDLMTERY DILNIIEKELLHQTELFGELQKGSPDNPSVIMESVHHFFKKVSGKPLVRPAWYYDIAQQG EGIADVTTHLIDLINWQCFPDEAIHYQSDVKVLSAKHWPTPITLAEFSQSTQTDSFPIYL NQYIKNDVLEVMANGSLNYTVKGICMGMKVTWNYMPPVHGGDTFTSIKKGSKATLKIVQN EKNGFVKELYIQKKPNIDSHTFETQLQKTIEQLQESYPFLSVKNKSNWIYLIDIPQEYRL GHEEHFSKVAKAFLHYIRNKNIPEWENENTLTKYYITTTAVEMAKKENK >gi|222159282|gb|ACAB01000077.1| GENE 27 39987 - 40757 576 256 aa, chain + ## HITS:1 COG:CAC2766 KEGG:ns NR:ns ## COG: CAC2766 COG1477 # Protein_GI_number: 15896021 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Clostridium acetobutylicum # 35 251 
35 289 319 86 23.0 4e-17 MFHGFIPHIMGTRFDILLIHSDAERLNGLWCHIINELERLDKILNRFDPHSEVSGINKHA LQSYIQISKELEEILQLCQYYYENTFHLFDITLKDFSKIQIHDHQRISFASSTISLDFGG FAKGYALKKIKELIEQENINHAFVNFGNSSILGMGHHPYGDSWRVSFLNPYNLSLLNEFN LQNTALSTSGNTLQYTGHIMNPLTGLFNEQRKASSIISTDPLEAEILSTVWMIANKEQQQ LLTENFKNIQATLYDL >gi|222159282|gb|ACAB01000077.1| GENE 28 40747 - 42084 1133 445 aa, chain + ## HITS:1 COG:BH1248 KEGG:ns NR:ns ## COG: BH1248 COG0673 # Protein_GI_number: 15613811 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus halodurans # 57 250 5 194 340 102 32.0 2e-21 MIYNMKDNKKEGVDLGLRQFIKSLGYVAGGTALLATTPWLTSCTPEKLKEIKHEKARIAL IGTGSRGQYHIHNLKEIPHAQIVAVCDNYAPNLQQALELCPDAKSYTDYRKLLESKDIDG VIISTPLNWHAPIVLDALAAGKHVFCEKAMARTLDECKAIYDTYNQSEKVLYFCMQRMYD EKYIKGMQMIHSGLIGDVVGMRCHWFRNADWRRPVPSPELERKINWRLYKDSSGGLMTEL ACHQLEVCNWAAKRMPVSIMGMGDIVYWKDGREVYDSVNVTYRYSDGTKIAYESLISNKF NGMEDQILGHKGTMEMAKGIYYLEEDHSTSGIRQLIDQVKDKVFAAIPTAGPSWRPETKM EYTPHFIIDGDIHVNSGLSMIGADKDGSDIILSSFCQSCITGEKAQNVVEEAYCSTVLCL LGNQAMNEQRHILFPDEYKIPYMKF >gi|222159282|gb|ACAB01000077.1| GENE 29 42088 - 42327 114 79 aa, chain + ## HITS:1 COG:no KEGG:BT_3582 NR:ns ## KEGG: BT_3582 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 79 1 79 79 78 68.0 7e-14 MKDIIITSKKLKQERNIFLLSFLLAFIINVIAIIIYSRPWIEIISQIGYVIVINFFIYLI LWIPRGILIFLSHLFRRKK >gi|222159282|gb|ACAB01000077.1| GENE 30 42415 - 43758 971 447 aa, chain + ## HITS:1 COG:MK0248 KEGG:ns NR:ns ## COG: MK0248 COG0673 # Protein_GI_number: 20093688 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Methanopyrus kandleri AV19 # 42 189 3 144 317 83 33.0 1e-15 MTTRRDFIKKTVAGTAALSLGSILPGFGSSRYQDILGANEKIRIGVIGVNSRGKALAQGF AKLPDCEVTYICDVDSRALEKCQAAIHKITGRTPKGEKDIRKMLESNDFEAVVIATPDHW HAKAAIMAMQAGKHVYLEKPTSHNPAENEMLIRAALKYNRIVQVGNQRRSFPNVIKAMEE 
IKSGSIGKVRYAKSWYVNNRPSIGTGKVVPVPDYLDWDLWQGPAPRVADFKDNFIHYNWH WFWNWGTGEALNNGTHFVDILRWGLGVDYPTKVDSIGGRYRFQDDWQTPDTQLITFQFGD EASFSWEGRSCNTMPVDGYGVGTAFYGETGTLFIGGGNEYKIADIKGKTIKEVKSDLKFE TGNLLNPSEKLDAFHFRNWFDAIRKGTKLNSGIVDACISTQLVQLGNIAQRVGHSLQIDP GSGRILNDLEANKLWGREYEKGWEIRV >gi|222159282|gb|ACAB01000077.1| GENE 31 43858 - 44445 403 195 aa, chain - ## HITS:1 COG:mll5456 KEGG:ns NR:ns ## COG: mll5456 COG1595 # Protein_GI_number: 13474550 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 8 180 6 172 182 70 26.0 2e-12 MNKTDLSSATEEQLLLQRLREGDMGSYETLFHRYYPTFFAFAKGMLKDAGAAEDIIQNVF MKIWIHREALDETMSIKNYIYVLSKREVFNHLRAKYNTHVVLTEDMMTLERPSSIDEPTT DYRELREAVQSVINTMPPKRRSVFCLSRFKSLTNQEIADKLGISIRTVEKHIELALRTFK EQLGSFFALFVGWLL >gi|222159282|gb|ACAB01000077.1| GENE 32 44747 - 48091 2186 1114 aa, chain + ## HITS:1 COG:no KEGG:BT_4606 NR:ns ## KEGG: BT_4606 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 169 6 181 394 74 29.0 3e-11 MRNFLLLFLLLMPVIGSCTDDYDDSAAWKDIDGIYKDLDQLKEKLNSLQLQANALSQIVK GGAITSVTEAANGGYVISYKGSDNIEHSFTIATTDQMVSSPIIGIQEEAGTYYWTTTTKG QTTFLLDANKQKIPVSGSAPQIRVDENGYWIINGRQILDSNQKPIKAEGKTTSLITKVEM NDNGTASITLGNGETLSVNTFTLFNVEFKNADQTAISPIIIEEGTKNLTLNYNIIGKKAA QALMLITRNDDGLEARLNSSNKTLVVTFADDFEEGVTMIMLYDTEDNVLIKPMRFTLPII ENGGIATATDFKAFIDAVTSGSSLRKFKDTEGNVILLNDIDMKDITLTSGAGSNVTSNTT NANTKVVYTIGEQTFNDVFDGKGHSVINLTFTYNLEDGNIAHGLFNALGSSGVIRNLVIS GNATITGKAPQGAAIGGLVGYCEGSILACTNQINLSFEGTDAANVGVRMGGLAGVLYGNK IGDTTQANGCSNEGNLTCSNIVNTASGAYSAFNQGGIAGYIENDEAYIGYAINKGNISAP SGRGGGIAGTLQEGIIENSTNEGVIQDDVNGVFASTSKRYNVKRIGGLAGGINTDKYLKN CINNGNVYSQNGSRAGGFVGHNAGFVQSCTNNGIILSDATADGANKHGAGWACGYSGTKN GTDYITDCHIGGKVGDYSIYKNNPEDTPGATYSNAVRHGAFSKEANNFSNQDEAYYDWQV TEDRELASGIVYKHYSFTNFNQNIYAIEIDMNNPKVTFETVMADEICPNPNGNNNSNNGK VLRETLSETCTRRRDEGRNIIVGINTGFFNSHDGFPRGMHIEEGEPVFINNPYVRSILTN HVWGFTFFDNRTVSFEKRDFTGKLKVGTKEYEYYSVNDTIVRLSGKPSYDANLYTFRYVK 
EPHPGLTNPIGTKALFIIGKNNQPLKVNSGDFEATITKIIDGRGTTVEAPYVTDKNEWVL QVTGDKADELVQNLKTGDKVQISAELKIGSSTNPIKVHNSSMYRYVYNGVYSAPPKKEDA ETINPTTNLGMTQDKSKIVIFCVDGRTDSDRGLDFYEAYRVCKKLGLYDVIRFDGGGSTV MWTYENGIGKVINHVSDTKGERSCMNYLHVRVLE >gi|222159282|gb|ACAB01000077.1| GENE 33 48104 - 50542 2110 812 aa, chain + ## HITS:1 COG:no KEGG:BT_4606 NR:ns ## KEGG: BT_4606 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 24 206 19 214 394 71 28.0 2e-10 MLITEYMKKFISISLFISACFLGSCSSYDDERIWQDLNEIEKTIGDYEAQAETLTKQMTS LSKVIGSSFITLISQDADGNHVISYSDGGGETHTVTIATQKDAIKLPIITAKQDTDSKFY WAQTTDKGKTYTFILDGDGKKFPIGGTMPDVKINENGYWSVNGASTGVLANDLSNLLFKS AYIDDKTEEAVFILADGQELRMSLQEALGIRFNSPVYNAVTDYATPVSIPYEIYGTQSEN AYVDLFTAYNMEVKIDKASSTLIATMKEGATEGNILLLASAGNNTVLKPIYFTYGTAILD EPLYQGHVGPIQLKGTQMDIEMQISANIFYQVSTENEWITYKGTRALITTTHAFTILANE TGDERSGKIIFSNSLYNISSSIDVIQEAKEVEAKGGISTAADLVNFAKAVNNGTSTSRWQ NDAGEIVLLNDIDMSSVTSWTPIGDIDASNYTTAEPYVSIHPFTGTFNGQGHAIKNLNCS ADITNGGLAYGLFGSIENATVKNLVLGDASTTITWMMSGTAPKYTVIAPLVCFAKKSVIE GCTNYYNIDFTADNKSGEFNALSGLVGTIVNTTIGGESKAQGCSNKGFVRTGRISNTANG GTGMQTAGICAFMAKAEGGKLNYCTNYGDISCPSGRTGGIIATLMYGNIYNCDNRGTIED DKVGQHEGKEASVTYNYKRMGGIVGGTDDLKTKPEYTVESCTNYGNVMTHLSVRTGGIIG HSNIQIIGCVNKGAVLGDVFTEGNGTNRHGPGWLCGYSGASTATWTNCKACVCGGYVGDY SKYKDDPTSAPDATNQNAFCHANQNFDPSINF >gi|222159282|gb|ACAB01000077.1| GENE 34 50555 - 51436 761 293 aa, chain + ## HITS:1 COG:no KEGG:BT_4606 NR:ns ## KEGG: BT_4606 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 210 1 225 394 87 29.0 4e-16 MKRIYFVCSVLLSLLLTSCGDYDDSSIQNKLNDFKERIAALQTKADKLNEDISKLGYLTE GNVITSVSRNSDGQYVITYKDNNNEEKAVVVATQEDVIEAPILGVRLNDDDQLYYWTTTI GNETNWLTDDTEKKVPVCGYTPEMGVNADGYWTVNGEILKDNKGTPITATTDETAIFKNI TKTDEGYLKITLGNGETLTLEVFSSLNLRLKANAVTKITDLSSPLKIEYEVTGASAEEAL VTIAQAVNVKATIDKETHTLTVIFENNFDEGHVIITAYDLQHLVLRPLLFKKN >gi|222159282|gb|ACAB01000077.1| GENE 35 51450 - 52388 829 312 aa, chain + ## HITS:1 COG:AGl598 KEGG:ns NR:ns ## COG: AGl598 
COG0584 # Protein_GI_number: 15890416 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 48 306 36 294 306 117 29.0 3e-26 MNMKKRILFTGLLLLLTLGTFAVTPVAPDKTHAQKIVEAIHNPKTDYVVVVSHRGDWRDY PENSIPAIESIIRMGVDVMELDLKLTADSVLVLCHDGTIDRTTTGKGRVSDVTYDYIKSC FLKTGHNCPTKYKMPTLKEALAVCKDRIVVNIDQGYQYYDLVMAITEELGVTEQILIKGK KSVDFVDAQAKKYKHAMMYMPIIDINKPSGQDLFRQYMDKKIIPLAYEVCWQQETSEVKK CMKDILAQGSKIWVNTLWASLCGGEEAGMYDDYAFEHGAEVYQKVLDLGTSIIQTDRPEL LISYLKKIGRHN >gi|222159282|gb|ACAB01000077.1| GENE 36 52428 - 53510 594 360 aa, chain + ## HITS:1 COG:no KEGG:Phep_1387 NR:ns ## KEGG: Phep_1387 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 26 356 24 356 358 181 32.0 3e-44 MRNFRLLLLASLLAPTILWGQSIKDIRYVDASQLTLVGKALPTPHLYHRIDTVAFKGFSK SENQQARCSAGLAVVFRTNSPQIDLLPSYKWEYRKDNVTGIAAAGFDLYIRQNNEWIYAN SLAPAKRNEAFTLMYGMEPKEKECLLYLPMYSELESLKIGIQPGSSIETIPNPFGQKVVF FGSSFTQGIGASRPGMSYPLQIERNTNLHVCNLGFSGNAKLQSYFAEVIAATEADAFVFD VFSNPDAMQIKERLQAFVDIIVAKHPKTPLIFVQTIQRGNEAFNTLIRARESDKLEIVET LMKEIIRKYPNIYLIGNPLPSPENRDTCTDGTHPSDLGYYFWAKNLEKKIIEILNKNKCL >gi|222159282|gb|ACAB01000077.1| GENE 37 53634 - 54620 534 328 aa, chain + ## HITS:1 COG:AGl2289 KEGG:ns NR:ns ## COG: AGl2289 COG3712 # Protein_GI_number: 15891252 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 126 279 127 278 323 66 29.0 7e-11 MKITNDILKDYFYDKCSHEEEIEIQKWLTENDNSLVDQSFREIISEIQMENHELSQKAFE KFQKATQPQQVRTPRPGLQRSIRWMQRVAAVLFIPLLFLAGYLLLDKKANTHWNDITVPH GMHQTLTLSDGTTLHVNSGTRVIYPSDFTENKREIFVSGELFADVAKDPDKPFIISAGDV HVQVLGTKFNLRAYENIETVEVALVEGSVLFKTPTHPNEILKKGEMIQYNRTSQKIVRDT FLANLYKCPAKNEGFYFSNLPLNDIVKELEYYFDTRIIILDQKLGDSTYIAYFTNGETLD EILSNLNTDGQMSITRSQGVILITSAAP >gi|222159282|gb|ACAB01000077.1| GENE 38 54644 - 58186 2482 1180 aa, chain + ## 
HITS:1 COG:CC1623 KEGG:ns NR:ns ## COG: CC1623 COG1629 # Protein_GI_number: 16125870 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Caulobacter vibrioides # 196 330 50 202 970 61 37.0 1e-08 MKQTNIYKKYISTIWFACVMIFFSQPILAQDITLQLKNVTVKEAIEALHKTKNYSVVIKS AEINMSKKVSVNATNAPIKAVLDQIFVGQNVSYAINGHSIIISKKSDTSQQKPGEKKKQT ITGIVYDEEGNPVIGASVMNKETAQGAITDLDGKFSLEAFIPSTIEISYVGYEMATVNVK DNQTKTIRLIPSSLMIDEVVVVGYGSQRRSNLTGAVSTISSKDLNNRPVVSAANALQGAD PSVNLTFGTGSPESGYSLNIRGGISVNGGTPLVLCDGVEVPLNQVNANDIESISVLKDAS SCAIYGAKASAGVVLITTKSGSAATKGKAKISYNGRFGWTQNTTSTDFIRTGYDYVTFAN KFYHAYNGVNMYLYEDEELQKLYDRRNDMTENPERPWVEIGEDGKYYYYGNTDWYGHFYN RTRPQMEHNVSITGGGEKVNYYISGRYYQQYGMFNIDKDLYKDYSFRAKMDAQLNKWIKW STNIGLDNNNYRYNGTSNYAMTIARLESNISPSFVPFNPDGTIVQYTNQLYANSPLGAGD GGYLTSQRGHNTKSRTLLSVVNQIDITLLEGLTLTANYSYQQRKQLYRYRNNSFEYSRSQ GVTQTFTSGSIFNNYEEDESFPVTHMLNYYATFEHSWAKKHNLKVVAGSQYETYRNVNKD TSMTNLSNDNLDSFSAVTPESVLTVSQDISAYKTLGFFGRINYDYMGKYLLEVSCRADGS SRFAEGDRWGVFPSVSAGWRISEENFFKPVSDWWSSLKLRASVGSLGNQQVDYYAYLQTI TSDNQFSYTFDGEGKAYYAKISNPISSGLTWETVTTYNVGLDMSFLRNRLSMSADYYVRK TTDMLTTSLTLPDVYGASTPKANCADLRTNGWELSVSWNDSFKVANKPFRYGIQATLGDY QRTITKYNNPDKLISDHYVGKKMGEIWGYHVDGLFKTDKEAAEYQAKINDKAVNGRVYSS KVDGYLRAGDVRFADLNGDNVIGAGAGTVDDPGDKRIIGNTTPRYNYSFRLDASWNGFDV SAFFQGIGKRDWYPSSSSSSQGANSFWGPYSFPSTSFIEKSFPEDCWTEDNRNAFFPRIR GYQSYSGGSLGTVNDRYIQNIAYLRFKNLSIGYTLPINKRFFEKVRVYVSGENLYYWSPL KKHNKTIDPELAISSSTYSSNTGSGYAYPRVYTVGVDITF >gi|222159282|gb|ACAB01000077.1| GENE 39 58209 - 59960 1535 583 aa, chain + ## HITS:1 COG:no KEGG:Slin_4978 NR:ns ## KEGG: Slin_4978 # Name: not_defined # Def: RagB/SusD domain protein # Organism: S.linguale # Pathway: not_defined # 4 578 1 576 576 460 44.0 1e-128 MKAINKIIILGMVTTLFSSCDLTLLPENAVTPENYFQNKSDLELWTNQFYTLLDEPDASA GTNADDMIDKGMGQVIEGTRSAASETGWSWSKLRHINYFLQHSSNCDDETARSQYNGVAQ FFRAYFYFVKVRRYGDVPWYDQVLGSEDQELLAKARDSREFVMDRVLKDFEDAATSLPTK STDTRNTRVTKWAALAFASQAALYEGTYRKYHGLDNYEKYLEIAASTAKQFIDESGFSLY 
KEGTEPYRDMFCADNAKTTEVVLARAYNFEGLQLSHSVQFSIANLQMGFTRRFMNHYLMT DGTRFTDKQGYETMFYTEEVKDRDPRLQQTVLCPNYIQKGETTVTANDLTAYCGYRPIKF VGTKDHDGAAKSTSDWPLMRAAEVYLNYAEAKAELGTLKQEDLDISINKIRERAKMPDLI LTDANNNPDPYLGTCYPNVEQGANKGVILEIRRERTIELVMEGLRQWDLFRWKEGKQMFN QYIPYYGIYIPGVGTYDMDGDGKPDLEIYETTATSQCDNKKKLDKDIYLSNGTSGYIIGF PKVTYGNDWKEERDYLWPIPADQRVLTQGILTQNPGWEDGLSY >gi|222159282|gb|ACAB01000077.1| GENE 40 60063 - 61391 690 442 aa, chain + ## HITS:1 COG:VCA0707 KEGG:ns NR:ns ## COG: VCA0707 COG2271 # Protein_GI_number: 15601463 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Vibrio cholerae # 5 439 1 440 459 359 42.0 6e-99 MLKQLIKFYQISTPAPVLKENKKKLKKLQWATFFSATAGYGIYYVCRLSLNVVKKPIVEE GIFSETELGIIGAVLFFTYAIGKFTNGFLADRSNIRRFMSTGLLITALANLCLGFTHSFI LFAVLWGISGWFQSMGAASCVVGLSRWFEDKKRGSFYGFWSASHNIGEAMTFIIVASIVS VLGWRYGFIGAGSIGIIGVLIVWNFFHDSPESEGLSAVNHPQIKDETDDKADFNKAQKQA LMMPAIWILAIASALMYVSRYAVNSWGVFYLEAQKGYSTLDASFIISISSVCGIIGTMFS GVISDKFFNGRRNAPALIFGLMNVAALCLFLLVPGVHFWIDALSMVLFGTAIGVLLCFLG GLMAVDIAPRNASGAALGIVGIASYIGAGIQDIMSGVLIEGHKSIVDGKEIYDFSYINIF WIGAALLSVFFALLVWNVRSKD >gi|222159282|gb|ACAB01000077.1| GENE 41 61527 - 64172 2542 881 aa, chain + ## HITS:1 COG:BB0035 KEGG:ns NR:ns ## COG: BB0035 COG0188 # Protein_GI_number: 15594381 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Borrelia burgdorferi # 37 615 11 583 626 433 41.0 1e-121 MSDEINEITEEHSDYKPADARDESVKHQLTGMYQNWFLDYASYVILERAVPHINDGLKPV QRRILHSMKRLDDGRYNKVANIVGHTMQFHPHGDASIGDALVQLGQKDLLIDCQGNWGNI LTGDGAAAPRYIEARLSKFALDVVFNPKTTEWKLSYDGRNKEPVTLPVKFPLLLAQGVEG IAVGLSSKILPHNFNELCDASISYLHGEEFQLYPDFQTGGSIDVAKYNDGERGGAVKVRA KINKIDNKTLAITEIPYGKTTSTVIDSILKAVDKGKIKIRKVDDNTAANVEILVHLAPGT SSDKTIDALYAFTDCEVSISPNCCVIDDSKPHFLTVSKVLRKSADNTLDLLKQELEIKKN EILEALHFASLEKIFIEERIYKDKEFEQSKDMDAACAHIDERLTPYYPKFIREVTKEDIL KLMEIKMGRILKFNSDKADELIARMKEEVAEIDNHLAHIVDYTVNWYQMLKNKYGKNFPR 
RTELRNFDTIEAAKVVEANEKLYINREEGFIGTALKKDEFVACCSDIDDVIIFFRDGKYI VTPVADKKFVGKNVLYVNVFKKNDKRTIYNVAYRDGKEGTTYVKRFAVTSVVRDREYDVT QGTPESRITYFSANPNGEAEIIKVTLKPNPRVRRIIFEHDFSEVSIKGRQARGVILTRLP VHKISLKQKGGSTLGGRKVWFDRDILRLNYDGRGEYLGEFQSDDSILIVLNNGDFYTTNF DLSNHYEDNVSIVEKFDPNKIWTAALYDADQQNYPYLKRFCFEASNRKQNYLGENKNNRL ILLTDEYYPRLEVIFGGHDNFRDPLNIDADEFIAVKGFKAKGKRITTYAVETINELEPTR FPEPSQEQQEVPEEEPENLDPDSGKSEGDIIDEITGQMKLF >gi|222159282|gb|ACAB01000077.1| GENE 42 64172 - 65032 607 286 aa, chain + ## HITS:1 COG:no KEGG:BT_3578 NR:ns ## KEGG: BT_3578 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 286 1 286 286 503 91.0 1e-141 MKKKLIYLGLTGCILFALCTRLQAQIDSLQAHRYVTRATMYGIGFTNVFDTYLSPQEYKG IDFRVSRELIRMTKLFDGNVSVQNFFQADIGYTHNRADNNNTFSGLVNWNYGLHYQFRLT DNFKLLAGGLIDVNGGFVYNLRNTNNPASARAYVNLDASGMAIWHLKIKRYPMVLRYQVN LPVMGVMFSPHYGQSYYEIFSLGNSSGVIRFTSLHNQPSLRQMLSVDLPIGYTKMRFSYL ADLQQSNVNNIKTHTYSHVFMVGFVKDLYRIRNKKGTALPSSVRAY >gi|222159282|gb|ACAB01000077.1| GENE 43 65047 - 66141 551 364 aa, chain + ## HITS:1 COG:aq_797 KEGG:ns NR:ns ## COG: aq_797 COG0793 # Protein_GI_number: 15606169 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Aquifex aeolicus # 157 362 193 398 408 63 28.0 5e-10 MLKLQVRDQLMMRNKIRQLLWLLCCLPVLTGCIGEDDYANDPRGNFEQLWKIIDEQYCFL DTKGIDWDAVHDEYSKLIIPSMSNDDLFDILSQMLYILKDGHVNLSSASRISYYDAWYQG YPWNYREDILYNYYLGSTNSGYRTSAGLKYKIFDNNIGYIRYESFSAGVGDGNLDEVLYY LQICNGLIIDVRDNGGGNLTNSSRIAARFTDKLALTGYIQHKTGPGHNDFSEMEPIYLEP SNSIRWQKKVVILTNRRCYSATNDFVNIMRSLNNDNEDKRIIQLGDQTGGGSGLPFSSEL PNGWSVRFSASPHFDKYGKSLEDGIKPDVYVNMDKILEEGIEQGDIESQKKDPLIEKAFE ILSE >gi|222159282|gb|ACAB01000077.1| GENE 44 66775 - 67596 429 273 aa, chain + ## HITS:1 COG:no KEGG:BT_4009 NR:ns ## KEGG: BT_4009 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 271 100 368 370 293 56.0 6e-78 MFQSEDNEYSTIKTLFEYHMIMEGKKLSQSTCYQYGVTLKYLLSYVRIKHKVADYDISII 
DAAFVNEFFAYLQGYLRQDNMKRCDINGALKHMERFKKVMEMAFNNEWINRNPVKALKAH KEKTDINELDEEAVKRLSTVILPPNLGIVRDLFIFAVYTGVSYEDMTRLTQKNIVIGIDK SLWLHYKRVKTGVRVSLPLLEPALEIINRYDTYHKGGKKNQPLFPYVSNQIMNRYLKKVA KLAGVEDRVTYHVASHSVFSFFLKINSLQNIPV >gi|222159282|gb|ACAB01000077.1| GENE 45 67801 - 68668 415 289 aa, chain + ## HITS:1 COG:no KEGG:BT_4010 NR:ns ## KEGG: BT_4010 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 288 1 284 462 478 92.0 1e-134 MSVTMVVKTIKLTSLFVNTENYRFEPLSSQKEAIDKMVEDQGDKLYSLVDDIVTNGLSPV DLIIVTPNEDNNKYIVLEGNRRITSLKLLNNPTLIDDKYISLRKKFQKLQKENPNAISEL KNIACAVFENPTEADIWIKRKHSGELNGIGTVTWNAQQKQRFEEKTEGKSSIPLQIITLL KSQDNVSDTIKDSLSKLNITNLQRLMSDPYVREHLGLGINNGTLVSKVEVSEVVKGLIKV VTDILNPEFKVSEIYNREKRKQYIDNFDTNQKPDLSNEASEQWSVQDIV Prediction of potential genes in microbial genomes Time: Wed May 18 02:57:37 2011 Seq name: gi|222159281|gb|ACAB01000078.1| Bacteroides sp. D1 cont1.78, whole genome shotgun sequence Length of sequence - 22214 bp Number of predicted genes - 28, with homology - 25 Number of transcription units - 15, operones - 8 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 144 - 217 72.9 # Thr TGT 0 0 + Prom 238 - 297 7.4 1 1 Tu 1 . + CDS 327 - 914 549 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 961 - 1016 8.0 - Term 955 - 999 5.3 2 2 Tu 1 . - CDS 1060 - 1791 685 ## COG0500 SAM-dependent methyltransferases - Prom 1875 - 1934 4.2 + Prom 1752 - 1811 3.7 3 3 Op 1 . + CDS 1887 - 2072 235 ## 4 3 Op 2 . + CDS 2077 - 2544 65 ## BT_1581 hypothetical protein + Term 2716 - 2756 9.2 - Term 2700 - 2749 9.9 5 4 Op 1 . - CDS 2856 - 3233 194 ## PROTEIN SUPPORTED gi|148984704|ref|ZP_01817972.1| 50S ribosomal protein L20 6 4 Op 2 . - CDS 3248 - 4393 1001 ## BT_1579 hypothetical protein 7 4 Op 3 . - CDS 4407 - 5138 681 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold 8 4 Op 4 . - CDS 5205 - 5852 297 ## COG0259 Pyridoxamine-phosphate oxidase 9 4 Op 5 . 
- CDS 5896 - 6600 633 ## COG1741 Pirin-related protein 10 4 Op 6 . - CDS 6664 - 7665 924 ## COG1052 Lactate dehydrogenase and related dehydrogenases - Prom 7824 - 7883 5.3 + Prom 7654 - 7713 5.8 11 5 Tu 1 . + CDS 7876 - 9300 1254 ## COG2067 Long-chain fatty acid transport protein + Term 9321 - 9374 14.2 - Term 9316 - 9355 7.4 12 6 Tu 1 . - CDS 9378 - 10667 1150 ## BT_1573 hypothetical protein - Prom 10838 - 10897 4.8 + Prom 10597 - 10656 3.1 13 7 Op 1 . + CDS 10716 - 10814 62 ## 14 7 Op 2 . + CDS 10832 - 11338 451 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Term 11382 - 11433 12.0 15 8 Op 1 . - CDS 11431 - 11544 90 ## 16 8 Op 2 . - CDS 11567 - 12571 658 ## COG3943 Virulence protein - Prom 12593 - 12652 4.1 + Prom 12545 - 12604 8.8 17 9 Op 1 . + CDS 12806 - 13486 604 ## COG1738 Uncharacterized conserved protein 18 9 Op 2 . + CDS 13495 - 14154 594 ## COG0603 Predicted PP-loop superfamily ATPase + Term 14397 - 14444 0.3 + Prom 14656 - 14715 3.6 19 10 Op 1 . + CDS 14735 - 15190 353 ## PROTEIN SUPPORTED gi|148994988|ref|ZP_01823966.1| ribosomal protein L11 methyltransferase 20 10 Op 2 . + CDS 15246 - 15878 515 ## BT_1563 hypothetical protein + Term 15946 - 15980 3.2 - Term 15736 - 15773 0.6 21 11 Tu 1 . - CDS 15992 - 16465 402 ## COG1576 Uncharacterized conserved protein - Prom 16547 - 16606 5.7 + Prom 16460 - 16519 5.5 22 12 Op 1 . + CDS 16593 - 16985 356 ## BT_1561 hypothetical protein 23 12 Op 2 . + CDS 16978 - 17826 848 ## PROTEIN SUPPORTED gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 + Prom 17865 - 17924 4.2 24 13 Op 1 . + CDS 18053 - 18604 301 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 25 13 Op 2 . + CDS 18588 - 19637 788 ## BT_1558 hypothetical protein 26 13 Op 3 . + CDS 19667 - 20173 444 ## BT_1557 hypothetical protein + Prom 20306 - 20365 2.7 27 14 Tu 1 . 
+ CDS 20385 - 21110 612 ## BT_1556 hypothetical protein + Term 21199 - 21236 -1.0 + Prom 21129 - 21188 5.3 28 15 Tu 1 . + CDS 21279 - 22212 839 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase Predicted protein(s) >gi|222159281|gb|ACAB01000078.1| GENE 1 327 - 914 549 195 aa, chain + ## HITS:1 COG:DR0189 KEGG:ns NR:ns ## COG: DR0189 COG0526 # Protein_GI_number: 15805225 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Deinococcus radiodurans # 53 171 47 160 185 75 39.0 5e-14 MNVCGMAILAIGIWACSGQKKGTANVEVATDSVEVAADAISVQADSTGYIVRVGEMAPDF TITLTDGKQVSLSSLRGKVVMLQFTASWCGVCRKEMPFIEKDIWLKHKNNADFALIGIDR DEPLDKVLAFAKSTGVTYPLGLDPGADIFAKYALRESGITRNVLVDKEGRIVKLTRLYNE EEFASLVQKINEMLK >gi|222159281|gb|ACAB01000078.1| GENE 2 1060 - 1791 685 243 aa, chain - ## HITS:1 COG:Ta0580 KEGG:ns NR:ns ## COG: Ta0580 COG0500 # Protein_GI_number: 16081683 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Thermoplasma acidophilum # 10 226 5 210 227 93 29.0 3e-19 MATTTLSAEKDPMGAAISDYFNHHRADRLRVFSSQFEEDEIPVKELFRSIQSMPILERTA LQMATGRILDVGAGSGCHALALQEMEKEVCAIDISPLSVEVMQQRGVNDPRLINLFDETF TETFDTILMLMNGSGIIGRLNNMPGFFQRMKRILRPGGCILMDSSDLRYLFEEEDGSIVI DLAGDYYGEIDFQMQYKDVKGDTFDWLYIDFQTLSLYASECGFKAELVKEGKHYDYLVKL SIA >gi|222159281|gb|ACAB01000078.1| GENE 3 1887 - 2072 235 61 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSIEEAIMGGIVFKGKKDKPKEEEKVKTKAKKATYIRGQHGSGAAKMKADIRKKRASRHK K >gi|222159281|gb|ACAB01000078.1| GENE 4 2077 - 2544 65 155 aa, chain + ## HITS:1 COG:no KEGG:BT_1581 NR:ns ## KEGG: BT_1581 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 154 4 147 150 101 40.0 8e-21 MVINIIICVVSVVLMLRGAFLILKMGLLWMLHCRIIKSYDRKTHGKVVDIETQVKSDGMN 
ITDVCIPKIKYKLEGEDEKCCQFLPSNLKSGEENNIYLYPKQYHIDDKVTILYDKGKAND MFVLPRMQLMRLLVNQFVPGCFFILAAILGLCFVF >gi|222159281|gb|ACAB01000078.1| GENE 5 2856 - 3233 194 125 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148984704|ref|ZP_01817972.1| 50S ribosomal protein L20 [Streptococcus pneumoniae SP3-BS71] # 5 125 3 126 126 79 37 2e-14 MEIKSKFDHFNINVTNLERSIAFYEKALGLKEHHRKEASDGSFTLVYLTDNETGFLLELT WLKDHTAPYELGENESHLCFRVAGDYDAIRAYHKEMNCVCFENTAMGLYFINDPDDYWIE ILPQK >gi|222159281|gb|ACAB01000078.1| GENE 6 3248 - 4393 1001 381 aa, chain - ## HITS:1 COG:no KEGG:BT_1579 NR:ns ## KEGG: BT_1579 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 381 1 377 377 609 85.0 1e-173 MNKFTILFLTMFLALPMAMKADSAKEKKDDTRYLAGAVPEVEGKVVFSKEFQIPGMSQAQ IYDTMTKWMDERLKENKNIDSRIVFSDEAKGTIAGIGEEWIVFSSSALSLDRTLVNYQIT VTCKPGNCLVELEKIRFTYRETEKYKAEEWITDKYALNKAKTKLVRGLAKWRRKTVDFAD DIFMDVAVAFGAPDTRPKTEKKKKEEEQKTPSIVAAAGPIVIGAADKKTDIKVTTGEPAQ TTVPAATLTPATPAGKASADMPGYIEIDLKQIPGEVYALMGSGKLVISIGKDEFNMTNMT ANAGGALGYQSGKAVAYCTLSPDQAYDAIEKADSYTLKLYAPNQTTPSAVIECKKMPSQT TPQAGQPRTYVGEIVKLLMKK >gi|222159281|gb|ACAB01000078.1| GENE 7 4407 - 5138 681 243 aa, chain - ## HITS:1 COG:FN1387 KEGG:ns NR:ns ## COG: FN1387 COG2220 # Protein_GI_number: 19704722 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Fusobacterium nucleatum # 5 232 4 228 237 156 38.0 4e-38 MTLDYIYHSGFAIEMEGVTIIIDYYKDSSETEHNRGIVHDYLLQRPGKLYVLATHFHPDH FNREILTWKEQRPDIQYIFSKDILKSHRAKAEDAFYIKKGETYEDDTIRIDAFGSTDVGS SFLLHLQDWSIFHAGDLNNWHWSEESTEEEIRKANGDFLAEVKYLKEKVPNIDLVLFPVD RRMGKDYMKGAKQFIEQIKTTIFVPMHFSEDYEGGNALRSFAENAGCRFISITRRGESFE ITK >gi|222159281|gb|ACAB01000078.1| GENE 8 5205 - 5852 297 215 aa, chain - ## HITS:1 COG:sll1440 KEGG:ns NR:ns ## COG: sll1440 COG0259 # Protein_GI_number: 16330895 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxamine-phosphate oxidase # Organism: Synechocystis # 1 215 17 230 230 
204 48.0 1e-52 MAKLNIADIRQEYTKGGLRESELPGDPLSLFSRWLQEAIDAEVDEPTAVIVGTVSPEGRP STRTVLLKGLHDGKFIFYTNYESRKGKQLAQNPSISLSFIWHALERQIHIEGIATKVSPE ESDEYFRKRPYKSRIGARISPQSQPIASRMQLIRSFVREAARWIGKEVERPDNWGGYAVT PTRIEFWQGRPNRLHDRFLYTLQPDGEWKISRLAP >gi|222159281|gb|ACAB01000078.1| GENE 9 5896 - 6600 633 234 aa, chain - ## HITS:1 COG:sll1773 KEGG:ns NR:ns ## COG: sll1773 COG1741 # Protein_GI_number: 16330260 # Func_class: R General function prediction only # Function: Pirin-related protein # Organism: Synechocystis # 5 233 4 232 232 177 38.0 2e-44 MKKVIHKADTRGHSQYDWLDSYHTFSFDEYFDSDRINFGALRVLNDDKVAPGQGFQTHPH KNMEIISIPLKGHLQHGDSKKNSRIITVGEIQTMSAGTGIFHSEVNASPVEPVEFLQIWI MPRERNTRPVYQDFSIAELERPNELAVIVSPDGSTPASLLQDTWFSIGKVEAGKKLGYHM HQSHAGVYIFLIEGEIVVDGEVLKRRDGMGVYDTNSVELETLKDSHILLIEVPM >gi|222159281|gb|ACAB01000078.1| GENE 10 6664 - 7665 924 333 aa, chain - ## HITS:1 COG:FN0511 KEGG:ns NR:ns ## COG: FN0511 COG1052 # Protein_GI_number: 19703846 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Fusobacterium nucleatum # 5 333 6 333 335 355 57.0 5e-98 MAYTIAFFGTKPYDESSFNDKNKEFGFEIRYYKGHLNKNNVLLTQGVDAVCIFVNDVADA EVIRVMAANGVKLLALRCAGFNNVDLNAASAAGITVVRVPAYSPYAVAEYTVALMLSLNR KIPRASWRTKDGNFSLHGLMGFDMHGKTAGIIGTGKIAKILIHILRGFGMNVLAYDLYPD YNFARQEQIVYTSLDELYHNSDIISLHCPLTEETKYLINDYSISKMKDGVMIINTGRGQL IHTNALIEGLKNKKIGSAGLDVYEEESEYFYEDQSDRIIDDDVLARLLSFNNVIVTSHQA FFTHEAMENIAATTLQNIKDFINHKPLLNEVKK >gi|222159281|gb|ACAB01000078.1| GENE 11 7876 - 9300 1254 474 aa, chain + ## HITS:1 COG:FN1003 KEGG:ns NR:ns ## COG: FN1003 COG2067 # Protein_GI_number: 19704338 # Func_class: I Lipid transport and metabolism # Function: Long-chain fatty acid transport protein # Organism: Fusobacterium nucleatum # 232 430 7 237 273 64 26.0 5e-10 MRKISLIGLAMLIVSIPTFAGDYLTNTNQNAAFLRMIARGASIDVDGVYSNPAGLAFLPK DGLQVALTIQSAYQTRDIAATSPLWTMDGQTTVRNYEGKASAPVIPSIHAVYKKGDWAFS 
GSFAIVGGGGKASFDNGLPMFDAAAISLVNTIGQGMLAPNQYSISSAMEGRQYIYGLQLG ASYKINEHFSVFAGARMNYFTGGYKGFLDINLKEGVAEELGREIIKQLMAGGMTLEQAQQ AALQKSQQLNDAKLKLDCDQTGWGLTPIIGVDAKFGKLNLAAKYEFKANMNIENDTHEIV APDAAAAFVAPYQNGVNTPSDLPSMLSVAASYEFLPSLRASVEYHFFDDKNAGMADGKQK TLKHGTHEYLAGVEWDINKLFTVSGGYQKTDYGLSDAFQSDTSFSCDSYSVGFGGRINFT QALSLDVAYFWTTYSDYTKEKPRGLEATSMASLVDKDVYSRTNKVFGVSVNYKF >gi|222159281|gb|ACAB01000078.1| GENE 12 9378 - 10667 1150 429 aa, chain - ## HITS:1 COG:no KEGG:BT_1573 NR:ns ## KEGG: BT_1573 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 429 1 429 429 814 91.0 0 MKTSFKMVAMLLGIGIFPLCAHAQKKVIIEDEEPNSIMFVSKNKAGDEIIRIMNDRSQMR FHDPNAPRFLLTDQKGKFALGIGGYVRATAEYDFNGIVNDVDFYPALIPQRGSGNFAKNQ FQMDITTSTLFLKLVGRTKHLGDFVVYTAGNFRGDGKTFELQNAYAQFLGFTIGYSYGSF MDLSALPPTIDFAGPNGSAFYRTTQLSYMCDKLKNWKFGVSMEMPSVDGTTNNDLSINTQ RMPDFATSVQYNWNSNSHIKLGAIIRSMTYSSNVHEKAYSATGFGLQASTTFNITKKLQA YGQFNYGKGIGSYLNDLSNLNVDIVPDPDKEGKMQVLPMLGWYAGLQYNLCPSIFISGTY SLSRLYSENGYPSENPESYRKGQYLVANAFWNVSSNLQVGVEYLRGWRTDFSSATRHANR LNMLVQYSF >gi|222159281|gb|ACAB01000078.1| GENE 13 10716 - 10814 62 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVERGKGKIAKLIGRIAIYNDSKKADLCIILI >gi|222159281|gb|ACAB01000078.1| GENE 14 10832 - 11338 451 168 aa, chain + ## HITS:1 COG:mll3697 KEGG:ns NR:ns ## COG: mll3697 COG1595 # Protein_GI_number: 13473184 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 6 164 8 161 183 88 35.0 6e-18 MEKVDFTKGILAIQPDLHRFAYKLTADRESANDLVQDCLLQALDNQEKFTYSKNLKGWMY TLMRNIFVNNYRRTVREMNLIDDSYSINQQHLIEDEDADRFEFTYDMKQLYRVIHSIPEE MKVPFQMFVAGFKYREIAEKLGLPMGTVKSRLFFIRKRLKEELKDFSS >gi|222159281|gb|ACAB01000078.1| GENE 15 11431 - 11544 90 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNNPFQTTTVKEKAIEAAHTAQATVQKAKHRVQEILD >gi|222159281|gb|ACAB01000078.1| GENE 16 11567 - 12571 658 334 aa, chain - ## HITS:1 COG:NMA1039 KEGG:ns NR:ns ## COG: NMA1039 COG3943 # 
Protein_GI_number: 15793995 # Func_class: R General function prediction only # Function: Virulence protein # Organism: Neisseria meningitidis Z2491 # 8 326 2 320 336 330 54.0 3e-90 MAKENKSNIIIYQSEDGQTHIEVQMDEDTVWLSQQQMADLYQTSRTNVVEHIKHIYEDGE LVEESTCRKIRQVRQEGARMVEREIPHYNLDMIISLGYRINSIQATHFRQWATARLKEYI IKGFTMDDERLKQMGGGYYWKELLDRIRDIRSSEKVMYRQVLDIYATAVDYDPHAKQSIE FFKIVQNKLHFAAHGHTAAEVIYERADADQHMMGLTSFKGDHPTLRDAKIAKNYLSAEEL KVLNNLVSGYFDFAEVQAIKHRPMYMNDYIKHLDAILSSTGEVLLMDAGMISHEQAMDKA ETEYRKWEVRTLSPVEQAYLNSIKTLNHKTKKKG >gi|222159281|gb|ACAB01000078.1| GENE 17 12806 - 13486 604 226 aa, chain + ## HITS:1 COG:Cgl0234 KEGG:ns NR:ns ## COG: Cgl0234 COG1738 # Protein_GI_number: 19551484 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Corynebacterium glutamicum # 9 213 43 249 250 129 34.0 4e-30 MKEKVSVPFMLLGILFNVCLIAANLLETKVIQIGSLTVTAGLLVFPISYIINDCIAEVWG FKKARLIIWSGFAMNFFVVALGLIAVAIPAAPFWEGEEHFDFVFGMAPRIVAASLMAFLV GSFLNAYVMSKMKVASQGRNFSARAIWSTVVGETADSLIFFPVAFGGVIAWKELLIMMGI QIVLKSLYEVMILPVTIRVVKAIKKIDGSDVYDTDISYNVLKVKDI >gi|222159281|gb|ACAB01000078.1| GENE 18 13495 - 14154 594 219 aa, chain + ## HITS:1 COG:CAC3627 KEGG:ns NR:ns ## COG: CAC3627 COG0603 # Protein_GI_number: 15896861 # Func_class: R General function prediction only # Function: Predicted PP-loop superfamily ATPase # Organism: Clostridium acetobutylicum # 1 213 5 217 222 311 64.0 7e-85 MNREAALVVFSGGQDSTTCLFWAKRNFKKVYALSFLYGQKHQKEVEFAREIARKAEVEFD VMDVSFIGQLGHNSLTDTTMVMDQEKPADSVPNTFVPGRNLFFLSIAAVYARERGINHLV TGVSQTDFSGYPDCRDAFIKSLNVTLNLAMDEQFVIHTPLMWIDKAETWALADKLGVLEL IRTETLTCYNGVQGDGCGHCPACTLRREGLEKYLKSKNQ >gi|222159281|gb|ACAB01000078.1| GENE 19 14735 - 15190 353 151 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148994988|ref|ZP_01823966.1| ribosomal protein L11 methyltransferase [Streptococcus pneumoniae SP9-BS68] # 53 151 7 105 114 140 65 7e-33 MTELKDQLSLLGRKTEYKQDYAPEVLEAFDNKHPENDYWVRFNCPEFTSLCPITGQPDFA EMRISYIPDIKMVESKSLKLYLFSFRSHGAFHEDCVNIIMKDLIKLMNPKYIEVTGIFTP 
RGGISIYPYANYGRPGTKFEQMAEHRLMNQE >gi|222159281|gb|ACAB01000078.1| GENE 20 15246 - 15878 515 210 aa, chain + ## HITS:1 COG:no KEGG:BT_1563 NR:ns ## KEGG: BT_1563 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 208 1 208 212 329 81.0 5e-89 MTHHVPADLQSYVCQNIISQYANFDKAHQIDHVEKVIEESLKLAMHYEVDYSMVYIIAAY HDLGLYEGREFHHIASGKVLLADETLRRWFTEEQLLQMKEAIEDHRASNKQAPRTIYGMI VAEADRIIDPEITLRRTVQYGLSHYPEMDKEGQYARFRKHLTDKYAEGGYLKLWIPQSDN AGRLAELRNLIANEDELIKVFDKLYSDELG >gi|222159281|gb|ACAB01000078.1| GENE 21 15992 - 16465 402 157 aa, chain - ## HITS:1 COG:SA0023 KEGG:ns NR:ns ## COG: SA0023 COG1576 # Protein_GI_number: 15925729 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Staphylococcus aureus N315 # 1 155 1 158 159 103 37.0 2e-22 MKTTLLVVGRTVEQHYITAINDYIQRTKRYITFDMEVIPELKNTKSLSMEVQKEKEGELI LKALQPGDVVVLLDEHGKEMRSLEFAEYMKRKMNTVNKRLVFIIGGPYGFSEKVYQAAHE KISMSKMTFSHQMIRLIFVEQIYRAMTILNGGPYHHE >gi|222159281|gb|ACAB01000078.1| GENE 22 16593 - 16985 356 130 aa, chain + ## HITS:1 COG:no KEGG:BT_1561 NR:ns ## KEGG: BT_1561 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 130 1 130 130 197 88.0 1e-49 MKKRVLVGMTALLLSLSLLMAQEIPAGVITAFKRGSSQELSKYMGDKVNLVLQGRSTNVD KQKATAAMQEFFTENKVSGFNVNHQGKRDESSFVIGTLATTNGNFRVNCFLKKVQNQYLI HQIRIDKINE >gi|222159281|gb|ACAB01000078.1| GENE 23 16978 - 17826 848 282 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 [Kordia algicida OT-1] # 3 281 9 284 286 331 60 3e-90 MNKEKELIDKLIDLAFAEDIGDGDHTTLSCIPATAMGKSKLLIKEAGVLAGIEVAKEIFN RFDPTMKVEVFINDGTEVKPGDVAMVVEGKVQSLLQTERLMLNVMQRMSGIATMTRKYAK VLEGTNTRVLDTRKTTPGMRILEKMAVKIGGGVNHRIGLFDMILLKDNHVDFAGGIDKAI TRAKEYCKEKGKDLKIEIEVRSFDELQQVLDLGGVDRIMFDNFTPEMTKKAVEMVAGKYE TESSGGITFDTLRDYAECGVDFISVGALTHSVKGLDMSFKAC >gi|222159281|gb|ACAB01000078.1| GENE 24 18053 - 18604 301 183 aa, chain + ## HITS:1 COG:SMb20592 KEGG:ns NR:ns ## COG: 
SMb20592 COG1595 # Protein_GI_number: 16265252 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Sinorhizobium meliloti # 44 174 47 188 227 71 29.0 8e-13 MENEIELIKGCRAGKDSARKELYTLYSKQMLAVCFRYTGDMDAAHDVLHDGFIKIFTNFS FRGESSLCTWITRVMVTQSLDFLRREKRVSQLVVHEEQLPDIPDISDSGGGAGISEEQLM AFIAELPDGCRTVFNLYVFEEKSHKEIAKMLHIKEHSSTSQLHRAKYLLAKRIKEYRNHE ERK >gi|222159281|gb|ACAB01000078.1| GENE 25 18588 - 19637 788 349 aa, chain + ## HITS:1 COG:no KEGG:BT_1558 NR:ns ## KEGG: BT_1558 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 349 1 350 350 579 83.0 1e-164 MKKENDEITDLFRTRLADAGMSVRDGFWEELSQEIPVACQHRRRILLFRVAAAASVLLVL AASSATFLYFSPKEEMEEAFTKIAVTNGGQMDGDGIRVNQLPLPVEPVLPKPAPKSYGML SQYTEEEDSLSITFSMSFSFSATTSTGNGNRYGNQGNNGFWQATNGDTESSVAPEEQSNV NVAQPKAVKKHRWAMKVQVGTALPADNGTYKMPVSAGVTVERKLNESLGIETGLLYSNLR SAGQHLHYLGIPVKVNVTLVDTKKFDLYATVGGIADKCIAGAPDNSFKEEPIQLAVTAGI GINYKINDRLAVFAEPGVSHHFKTDSKLATVRTKRPTNFNLLCGLRMTY >gi|222159281|gb|ACAB01000078.1| GENE 26 19667 - 20173 444 168 aa, chain + ## HITS:1 COG:no KEGG:BT_1557 NR:ns ## KEGG: BT_1557 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 168 2 169 169 311 88.0 4e-84 MLWMAVCLIVSFTGCTSEEMDYNNPDVALFVKQLKSGTYKMKNDKGVVEVPHFTEEDIPE LLKYAEDLTIIPSFPSVYNMNNGKIRLGECMLWVIESIRQGTPPSLGCKMVLANAENYEA IYFLTDEEVLDAAACYRSWWEERQYPKTRWTIDPCYDEPLCGSGYRWW >gi|222159281|gb|ACAB01000078.1| GENE 27 20385 - 21110 612 241 aa, chain + ## HITS:1 COG:no KEGG:BT_1556 NR:ns ## KEGG: BT_1556 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 241 5 242 242 409 81.0 1e-113 MDRLKRFIATMVLCLTALFAYSQVWTAQDSLHLKKLLESDQELHLNMDAVKSIDFGTAVG TPRMSEEKSWMMPDESLPEALPKPKVVLSLMPYKANTRYNWDPIYQKKIKIDKNTWRGDP FYEIRHQRSYSNWARNPMAKGVRKSLDEIQASGVRFRQLSERANGMMVNTVVMDAPIPLF GGSGVYINGGTVGGLDLMAVFTKEFWNKKGRDNRARTLEVLRTYGDSTTVLINKPIEQIA R 
>gi|222159281|gb|ACAB01000078.1| GENE 28 21279 - 22212 839 311 aa, chain + ## HITS:1 COG:CAC3576 KEGG:ns NR:ns ## COG: CAC3576 COG2070 # Protein_GI_number: 15896810 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 7 307 9 304 310 224 43.0 1e-58 MNRITSLLGIQYPIIQGGMVWCSGWRLASAVSNAGGLGLIGAGSMHPDTLREHIRKCNAA TKFPFGVNIPLMYPQIEEIMNIVVEEGVKSVFTSAGNPKTWTGWLKERGITVVHVVSSSR FAMKCEEAGVDAVVAEGFEAGGHNGREETTTFCLIPAVHEATTLPLIAAGGIGTGEGILA AMVLGAEGVQIGTRFALTEESSASPVFKDYCLSLGEGDTKLLLKKLAPTRLVKNAFREAV EKAEDSGATSEELRTLLGRGRAKKGIFEGDLEEGELEIGQVSAIISRRQSVAEVMNELVE SYRQAVEKKYL Prediction of potential genes in microbial genomes Time: Wed May 18 02:58:22 2011 Seq name: gi|222159280|gb|ACAB01000079.1| Bacteroides sp. D1 cont1.79, whole genome shotgun sequence Length of sequence - 17730 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 7, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1108 889 ## COG0686 Alanine dehydrogenase - Term 1140 - 1180 6.1 2 2 Tu 1 . - CDS 1239 - 1922 554 ## Rmar_0751 KWG repeat protein - Term 2271 - 2311 2.0 3 3 Op 1 . - CDS 2363 - 3700 877 ## BT_1553 hypothetical protein 4 3 Op 2 . - CDS 3711 - 7058 2251 ## BT_1552 hypothetical protein 5 3 Op 3 . - CDS 7089 - 9722 2415 ## BT_1551 hypothetical protein - Prom 9765 - 9824 7.7 + Prom 9723 - 9782 10.0 6 4 Op 1 . + CDS 9921 - 11573 1293 ## COG0739 Membrane proteins related to metalloendopeptidases 7 4 Op 2 . + CDS 11602 - 13239 1540 ## COG4690 Dipeptidase + Prom 13332 - 13391 9.9 8 5 Tu 1 . + CDS 13418 - 15163 1980 ## COG1109 Phosphomannomutase + Term 15214 - 15261 14.1 9 6 Tu 1 . - CDS 15549 - 16724 953 ## COG2311 Predicted membrane protein - Prom 16752 - 16811 5.3 10 7 Tu 1 . 
- CDS 16831 - 17619 509 ## COG2816 NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding - Prom 17666 - 17725 5.8 Predicted protein(s) >gi|222159280|gb|ACAB01000079.1| GENE 1 2 - 1108 889 368 aa, chain - ## HITS:1 COG:BS_ald KEGG:ns NR:ns ## COG: BS_ald COG0686 # Protein_GI_number: 16080244 # Func_class: E Amino acid transport and metabolism # Function: Alanine dehydrogenase # Organism: Bacillus subtilis # 1 365 1 364 378 386 56.0 1e-107 MIIGVPKEIKNNENRVGMTPSGVAEVVKQGHRVFIQHTAGINSGFPDEAYQAVGAHILPT IEDIYATAEMIVKVKEPIITEYNLIRKGQLLFTYFHFASDKELTLAMLSNKSICLAYETV EEADHSLPLLIPMSEVAGRMSIQEGARFLEKPQGGKGILLGGVPGVKPAKVLILGGGVVG SNAAQMAAGMGADVTITDINLARLRYLSETLPKNVKTLYSSELRIRKELPDVDLVVGSVL IPGDKAPHLITKEMLSMMQPGTVLVDVAIDQGGCFETSHPTTHSAPTYIVDGIVHYAVAN IPGAVPYTSTLALTNATLPYVIALSNKGWKKACKDNPALALGLNIVEGKIVYKAVADVFG LKYEPVSL >gi|222159280|gb|ACAB01000079.1| GENE 2 1239 - 1922 554 227 aa, chain - ## HITS:1 COG:no KEGG:Rmar_0751 NR:ns ## KEGG: Rmar_0751 # Name: not_defined # Def: KWG repeat protein # Organism: R.marinus # Pathway: not_defined # 24 223 117 331 340 134 35.0 2e-30 MSSTITKFFASFLAYGVANKKKRFSAIGRFSEGLAPVKGKIQWGYINKGYDVVIPLMYER AFSFKEGLGMVVLNSQYGFIDHTGQIQIPFKYTAAHSFEQECARVCHDGLWGLIDRQGNY ILPPTYSQIEQFAEGLALVSLHNKIGFINKKGEVVIPLEYDNGRSFSEGLAAVCIESQSS KWGYINKDNEEVLPFKYDIAEPFYNNIARVGLYGKSMKINKQGSECL >gi|222159280|gb|ACAB01000079.1| GENE 3 2363 - 3700 877 445 aa, chain - ## HITS:1 COG:no KEGG:BT_1553 NR:ns ## KEGG: BT_1553 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 445 1 442 442 564 65.0 1e-159 MTKKIYYIFTLIGSLLLSSCDSYLDIQPVGQVIPNTLAEYRALFTTAYNTALNDRGICEI RTDIATILQSDATSKNSLGDVEKWNDVNPNASTRQFGWAAYYTNIYYANAIIDKKDEISE GSQEDINQLVGEAYLMRAYMHFILVNLYGQPYTAAGALETKAVPLKLNTDLEEIPSRNTV KEIYTSILSDIETARKLINKKEWEVQYSYRFSTLSVDAMESRVYLYMGEWGKSYESSERV LAGKSTLVNLNDEGDKLPNEFTSVEMITAYEIFPNSDYAGSLLLYPAFLQEYEEGKDLRP NKYYQANKNGDYTSIKSGESKFKCTFRTGELYLNSAEAAAHLNKLPEARTRLLKLIENRY 
TSEGYEQKKNEINAMSQEKLVTEILKERARELAFEGHRWFDLRRTTRPEIKKEIEDVTYT LVQDDSRYTLRIPQDAIDANPGLLN >gi|222159280|gb|ACAB01000079.1| GENE 4 3711 - 7058 2251 1115 aa, chain - ## HITS:1 COG:no KEGG:BT_1552 NR:ns ## KEGG: BT_1552 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1115 1 1114 1114 1790 80.0 0 MRKKYLCILICLFANALAFASAANRTITGVVISGEDNEPLIGASVYVNADDLKKAGASQT SLGTITDMDGKFSISIPEKVTRLHCSYIGFEEQDIVLQAGKDTYRIVLQASAHTLGDVVV TGYQELERRKLTAAIAKVDVTDGMVGAAKSIDQALAGQIAGVAVTTTSGAPGAPARIRIR GTASLNGTQDPLWVLDGIPLEGTDIPEINKDNDNDIVNMSQSSIAGLSPNDIESITILKD AAATAIYGARAANGVIVVTTKRGKTGKPVINFNTKLTYTPNLNTSRLNLLNSEEKVDLEL QLLKEARFDILWGLTDPIPVFPEKGKVAAIMKQYNLIDIYKEQGWNGLTPEAQNAINKLK TINTDWNDILFRDAFTQEYNFSISGGSEKVTYYNSLGYVKENGNVPGVSMSRFNLTSKTS YQINKILKIGMSIFANRRKNNTFMTDTYGLINPIYYSRIANPYFAPFDEQGNYLYDYDVV RSNETDEKQGFNIFEERANTNKESVTTAINSIFDVQLRFNDQWKVYSQIGVQWDQLSQEE YAGINSYNIRNIRETNKYWKNGVQTYLIPEGGMLKTTNSTTSQLTWKIQGEYKNTFGDIH DIQIMAGSEIRKNWVDNQASTGYGYDPKKLTFQNLIFKDEAQANDWNLKTKSYKENAFAS FFANGSYTLMNRYTLGGSVRMDGSDLFGVDKKYRFLPIYSVSGLWRLSNESFIRQYKWID NLALRLSYGLQGNIDKGTSPFLVGKYDNVNILPGYSEENIIINSAPNSKLRWEKTASYNL GMDFSVLNQAINLSVDYYYRKGTDLIGSKALALENGFTNMSINWASMENKGVEVNLQTRN ITTKNFSWYTTFNFAYNQNKVLKVLTDKSQVTPSLEGYPVGAIFALKTKGINPDTGQIYL ENKEGKAVTVEELFRMTSNEDGLGTYQIGPNSEEQRDFYSYVGTSDAPYTGGFLNTFNYR NWELNLNFSYNFGAHVKTTPSYNVSDLDPGRNMNRDILDRWTPENKTGKFPALATYNYNP ADYYLFSTRNDIYRSLDIWVKKLSFVRLQNIRLAYRVPSEWLHKLSIGGATVGLEARNLF VISSNYDNYMDPESMGNLYSTPVPKSITFNLSLNF >gi|222159280|gb|ACAB01000079.1| GENE 5 7089 - 9722 2415 877 aa, chain - ## HITS:1 COG:no KEGG:BT_1551 NR:ns ## KEGG: BT_1551 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 877 1 876 876 1678 92.0 0 MRMYVKVMAAVIACILLSGEALSAFATTPATENLSWFKKKKKKPEEKEEKSKSDYEKLVE DSKTTKGMFAVHQKKNDYYFEIPTSLLGRDLLVVNKLQRVPAELNDAGVNRGVNYENQMV CMEWDKATGKLMLRQQRPLPLAPQTDAIFRSVKDNFISPLIAAFKIEAVNADSTALVIKV NDIYDGTETSINNVFTNINLGTSAIKNLSRILSIKSFSNNVVATSELTTRVTEGTTTVYV 
TVEVSSSILLLPEKPMMGRFDNQKVGYFTNPLLSFSDAQQRTDKTQYITRWRMEPKPEDR EAYLKGQMVEPAKPIVFYIDNSTPYQWRSYIKKGIEDWQIAFEKAGFKNAIIAKEITDSM HVDMDDVNYSVLTYAASEKKNAMGPSLLDPRSGEILEADIMWWHNVLSMVREWITVQTGT VCPEARNVQLPDALMGDAIRFVACHEVGHSLGLRHNMMGSWAFPTDSLRSEAFTSRMNST ASSIMDYARFNYIAQPGDGVKVLSPHIGPYDMFAIEYGYRWYGKKTPEEEKDVLFDFLSK HTDRLYKYSEAQDVRDAVDPRAQNEDLGDDPVRSSLLGIENLKRIVPQILQWTTTGEKGQ TYEEASRLYYAVINQWNNYLYHVLANIGGIYIENTIVGDGVKTYTFVEKEKQQASLKFLM DEVLTYPKWLFDTEVGQYTYLLRNTPIGKQENAPTQILKNAQAYILWDLLGNTRLMRMIE NESVNGKKAFTVVELMDGLHKNIFGVTERGGIPNVMERSLQKNFLDALLTAAAEPEAVKI NKKIANEHFLLDHATPFCSCYAAEQRALRQEDRMGAPRVLNFYGSQLNRISDAISVKRGE LLRIKKLLQNRLGTSDTAARYHYEDMILRINTALGIK >gi|222159280|gb|ACAB01000079.1| GENE 6 9921 - 11573 1293 550 aa, chain + ## HITS:1 COG:TM1660 KEGG:ns NR:ns ## COG: TM1660 COG0739 # Protein_GI_number: 15644408 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Thermotoga maritima # 25 272 19 256 323 77 27.0 5e-14 MRRYITALMLACCIGGYGQEKKQVTFVPPFDFPLTLSGNFGEIRSNHFHGGLDFKTGGVI GKPVRALADGYISRIRVTNGSGYVLDVCYHNGYSTINRHLSGFVSPIAERVEKLQYEEES WEVEIVPEPGEYPVKGGQQIAWSGNTGYSFGPHLHLDVFETESGDYIDPMPFFQSKIKDT RAPKADGILFFPQLGKGVVDGKQENKTILPNSERPVEAWGVIGVGIKAYDYMDGVNNHYG VYSVVLTVDGNEIFRSTVDRFSQEENRMINSWTYGQYMKSFIDPGNTLRLLKASNDNRGL VTIDEERDYQFLYTLKDAFGNTSKYSFTVRGRKQPIEPLNHREKYYFTWNKTNYLQEPGL NLVVPKGMLYDDVLLNYQVKADSGAVAFTYQLNDKAVPLHAACELCIGLRRKPIADTTKY YVARITPKGGKYSVGGKYEDGYMKASIRELGTYTVAIDTIPPEIIPVNKNQWGRNGKIVY RLKDQGAGIASYRGTIDGKYALFGRPNIVKSYWECTLDPKRVKKGGKHTVEFTVTDYCGN ETVARESFVW >gi|222159280|gb|ACAB01000079.1| GENE 7 11602 - 13239 1540 545 aa, chain + ## HITS:1 COG:MA3377 KEGG:ns NR:ns ## COG: MA3377 COG4690 # Protein_GI_number: 20092191 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidase # Organism: Methanosarcina acetivorans str.C2A # 22 500 2 538 574 186 27.0 1e-46 MKRRIILCAAVLMAAVANTFACTNLIVGKNASADGSTIVSYSADSYGLFGELYHYPAATY PKGTMLKVYEWDTGKYLGEIEQARQTYNVVGNMNEYQVTIGETTFGGRPELADSTGIIDY 
GSLIYIGLQRSRTAREAIKIMTDLVQQYGYYSEGESFTIADPNEIWIMEMIGKGPGIRGA VWVAVRVPDDCISAHANQSRIHQFDMNDKENCMYSPDVVSFAREKGYFNGVNKDFSFSLA YAPLDFGARRFCEARVWSYFNKFTDNGKDYLPYIEGKTNTPMPLFVKPKHKLSVQDVKDM MRDHYEGTPLDISNDFGAGPYKTPYRLSPLNFKVDGQEYFNERPISTQQSGFVFVAQMRA HKPDPIGGVLWFGVDDANMAVFTPVYCCATKVPVCYTRVDGADYITFSWNSAFWIFNWVS NMVYPRYGLMIGDVREAQKEMETTFNNAQEGIEEMAAKLLAKDKNAAIDFLTNYTNMTAQ STFDTWKQLGTFLIVKYNDGVVKRVKDGKFERNSIGQPAGVMRPGYPKEFLQEYVKQTGD RYLVK >gi|222159280|gb|ACAB01000079.1| GENE 8 13418 - 15163 1980 581 aa, chain + ## HITS:1 COG:CAC2337 KEGG:ns NR:ns ## COG: CAC2337 COG1109 # Protein_GI_number: 15895604 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Clostridium acetobutylicum # 12 553 5 549 575 455 44.0 1e-127 MENQELIKQVTEKAEKWLTPAYDAETQAEVKRMLENDDKTELIEAFYKDLEFGTGGLRGI MGVGSNRMNIYTVGAATQGLSNYLKKNFKDLPQISVVVGHDCRNNSRLFAETSANIFSAN GIKVYLFDDMRPTPEMSFAIRHLGCQSGIILTASHNPKEYNGYKAYWDDGAQVLAPHDKG IIDEVNAIASAADIKFQGNPDLIQIIGEDIDKIYLDMVKTVSIDPAAIARHKDMKIVYTP IHGTGMMLIPRALKMWGFENVFTVPEQMIKDGNFPTVVSPNPENAEALSMAVNLAKEIDA DLVMASDPDADRVGIACKDDKGEWVLINGNQTCMMYLYYILTQYKQLGKIKGGEFCVKTI VTTELIKKIADKNNIEMLDCYTGFKWIAREIRLREGKQKYIGGGEESYGFLAEDFVRDKD AVSACCLIAEVAAWAKDNGKSLYQLLLDIYVEYGFSKEFTVNVVKPGKSGAEEIKAMMEN FRANPPKELGGSKVILSKDYKTLKQTDDKGNVTAIDMPEPSNVLQYFTEDGSKVSVRPSG TEPKIKFYMEVQGEMGCRNCYASAESAAMEKIEAVKKSLGI >gi|222159280|gb|ACAB01000079.1| GENE 9 15549 - 16724 953 391 aa, chain - ## HITS:1 COG:BS_yrkO KEGG:ns NR:ns ## COG: BS_yrkO COG2311 # Protein_GI_number: 16079697 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 9 386 17 385 405 89 24.0 1e-17 MELSTKTPRIEVVDALRGFAVMAILLVHNLEHFIFPVYPESSPEWLTILDAGVFNATFSL LAGKSYAIFALLFGFTFYIQSHNQQLKGKDFGYRFLWRMILLAGFATLNAAFFPAGDILL LFVVVSLVLFIVRKWSDKAILITAILFSLQPIEWFHYIMSLFNPAYTLPDLNVGAMYAEV AEYTKAGSFWDFLIGNVTLGQKASLFWAIGAGRFLQTAGLFLFGLYIGRKELFVTTESHL KFWMKALIIAAISFAPLYSLKEQIMQSDSSLIQQTVGTAFDMWQKFAFTIVLVASFVLLY 
QKDQFKKTVSNLRFYGKMSLTNYISQSILGAIIYFPFGFYLAPYCGYTLSLIIGIILFLA QVRFCKWWLSKHKQGPLETIWHKWTWIGTKK >gi|222159280|gb|ACAB01000079.1| GENE 10 16831 - 17619 509 262 aa, chain - ## HITS:1 COG:MA1439 KEGG:ns NR:ns ## COG: MA1439 COG2816 # Protein_GI_number: 20090298 # Func_class: L Replication, recombination and repair # Function: NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding # Organism: Methanosarcina acetivorans str.C2A # 3 258 24 279 285 182 38.0 5e-46 MNQTAQSWWFIFYKDQLLLEKKEDGIYAIPCGESSPIVIKEKTTVHNITTLEGRNCKAFS LSSPIEESEQWIMIGLRASYEYLPLSHYQTAGKAHEILHWDRNSRFCSACGTPMEQKESI MKRCPKCGREVYPSISTAILVLVRKKNSLLLVHARNFKGTFNSLVAGFLETGETLEECVA REVKEETGLNVKNITYFGNQPWPYPSGLMVGFIADYAGGEINLQDEELSSGDFYTRDNLP ELPRKLSLARKMIDWWIEHPNE Prediction of potential genes in microbial genomes Time: Wed May 18 02:59:04 2011 Seq name: gi|222159279|gb|ACAB01000080.1| Bacteroides sp. D1 cont1.80, whole genome shotgun sequence Length of sequence - 42114 bp Number of predicted genes - 31, with homology - 31 Number of transcription units - 13, operones - 7 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 83 - 1783 1246 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 2007 - 2066 3.5 - Term 2085 - 2126 1.6 2 2 Op 1 . - CDS 2160 - 3533 435 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 - Prom 3571 - 3630 2.2 3 2 Op 2 . - CDS 3633 - 4214 640 ## COG3059 Predicted membrane protein 4 2 Op 3 . - CDS 4303 - 5142 782 ## COG2207 AraC-type DNA-binding domain-containing proteins 5 2 Op 4 . - CDS 5145 - 5627 377 ## COG0295 Cytidine deaminase - Prom 5735 - 5794 5.1 + Prom 5543 - 5602 4.4 6 3 Op 1 . + CDS 5738 - 6643 871 ## COG1705 Muramidase (flagellum-specific) 7 3 Op 2 . + CDS 6725 - 8122 1279 ## COG1252 NADH dehydrogenase, FAD-containing subunit + Term 8180 - 8213 -0.1 + Prom 8124 - 8183 6.2 8 4 Op 1 24/0.000 + CDS 8356 - 9603 1211 ## COG0845 Membrane-fusion protein 9 4 Op 2 . 
+ CDS 9621 - 10286 292 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 10 4 Op 3 . + CDS 10354 - 10980 649 ## BT_1534 hypothetical protein 11 4 Op 4 . + CDS 11077 - 12366 1141 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 12 4 Op 5 . + CDS 12396 - 13649 904 ## BT_1532 ABC transporter permease + Prom 13654 - 13713 4.1 13 4 Op 6 . + CDS 13734 - 16343 1739 ## COG0642 Signal transduction histidine kinase + Term 16350 - 16394 5.3 + Prom 16360 - 16419 5.1 14 5 Tu 1 . + CDS 16473 - 17951 1360 ## BT_1530 putative outer membrane protein OprM precursor + Prom 18030 - 18089 5.0 15 6 Op 1 13/0.000 + CDS 18256 - 19605 1313 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 16 6 Op 2 . + CDS 19660 - 20952 1155 ## COG0642 Signal transduction histidine kinase 17 6 Op 3 . + CDS 21050 - 23602 2297 ## BT_1527 hypothetical protein + Prom 23623 - 23682 4.6 18 7 Tu 1 . + CDS 23711 - 25000 1517 ## COG1260 Myo-inositol-1-phosphate synthase + Term 25026 - 25083 2.4 + Prom 25011 - 25070 4.2 19 8 Op 1 . + CDS 25162 - 25641 431 ## COG1267 Phosphatidylglycerophosphatase A and related proteins 20 8 Op 2 . + CDS 25656 - 26123 317 ## BT_1524 hypothetical protein + Prom 26147 - 26206 4.0 21 9 Tu 1 . + CDS 26228 - 26878 784 ## COG0558 Phosphatidylglycerophosphate synthase + Prom 26920 - 26979 3.4 22 10 Tu 1 . + CDS 27003 - 27920 654 ## BT_1522 putative aureobasidin A resistance protein + Prom 27922 - 27981 9.1 23 11 Op 1 . + CDS 28029 - 29039 638 ## BT_1521 hypothetical protein 24 11 Op 2 . + CDS 29173 - 30345 1273 ## COG1979 Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family + Term 30366 - 30418 10.0 + Prom 30426 - 30485 10.2 25 12 Op 1 . + CDS 30565 - 33702 2549 ## BT_3271 hypothetical protein 26 12 Op 2 . + CDS 33711 - 35270 1429 ## BT_3272 putative outer membrane protein 27 12 Op 3 . + CDS 35283 - 36008 666 ## BT_3273 hypothetical protein 28 12 Op 4 . 
+ CDS 36018 - 37658 1266 ## BT_3274 hypothetical protein 29 12 Op 5 . + CDS 37714 - 38670 781 ## gi|237714997|ref|ZP_04545478.1| conserved hypothetical protein 30 12 Op 6 . + CDS 38713 - 41280 2749 ## Cpin_5142 hypothetical protein + Term 41313 - 41373 10.1 + Prom 41282 - 41341 4.1 31 13 Tu 1 . + CDS 41555 - 42113 303 ## COG1373 Predicted ATPase (AAA+ superfamily) Predicted protein(s) >gi|222159279|gb|ACAB01000080.1| GENE 1 83 - 1783 1246 566 aa, chain - ## HITS:1 COG:CC2587 KEGG:ns NR:ns ## COG: CC2587 COG0488 # Protein_GI_number: 16126825 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Caulobacter vibrioides # 3 565 4 525 535 263 31.0 1e-69 MSISIQQISYIHPDKEVLFSDLNFAISKGQKLGLVGNNGCGKSTLLQIIAGQLSPSSGVI VRPDDLYYIPQHFGQYDSLTIAQALQIECKQQALHAILTGDVSNENFTILNDDWNIEERS IAALDLWGLGQFTLSYPMNLLSGGEKTRVFLAGMDIHHPSVILMDEPTNHLDSSGRQRLY DWVEKYRSTLLVVSHDRTLLNLLPEICELEKHQINYYGGNYEFYKEQKTLMQEALQQRIE EKEKALRIARKVARETAERRDKQNVRGEKSNIKKGVPRIVLNALQGKSEKSTSKLTGVHQ EKAEKLTDERNQLRGSLSPTAALKTDFNSSSLHIGKILVTAKEINFSYHSNSINNNILTN DEINSSDTGYHPNPNSNDIQENSISKQQLWQAPVSFQLKSGDRLRIEGANGSGKTTLLKL ITGQLQPQEGTLTRTDFSYVYLNQEYSIIDDRNSILEQAYAFNSRNLPEHEIKIILNRYL FPASEWDKSCRKLSGGEKMRLAFCCLMISNNTPDMFILDEPTNNLDIQSIEIITATIKNY AGTVIAISHDNYFIQEIGVEQCILLS >gi|222159279|gb|ACAB01000080.1| GENE 2 2160 - 3533 435 457 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 1 451 1 444 458 172 26 4e-42 MKQYDAIIIGFGKAGKTLAAELSNRGWQVAIVERSNMMYGGSCPNVACIPTKTLVHEAEV SALLYHDDFPKQTNMYKQAIGRKNRLTSFLRNDNYERLSKRPNVTIYTGTGSFISANTIK VTLPEGYIELQGKEIFINTGSTPIIPAIDGIQQSQHVYTSSTLLDLSVLPHHLIIIGGGY IGLELASMYAGFGSKVTILEGGNKFMPREDRDIANSVKEVMEKKGIEIHLNARAQSIHDT NDGVTLTYSDVSDGTPYYVDGDAILIATGRKPMIEGLNLQAAGVGVDAHGAIIVNDQLRT TAPHIWAMGDVKGGSQFTYLSLDDFRIIRDQLFGDKKRDIGDCDPVQYAVFIDPPLAHIG ISEEEALKRGYSFKVSRLPASSIVRTRTLRQTDGMLKAIINNHNGKIMGCTLFCADASEI 
INIVAMAIKTGQTSTFLRDFIFTHPSMSEGLNQLFDV >gi|222159279|gb|ACAB01000080.1| GENE 3 3633 - 4214 640 193 aa, chain - ## HITS:1 COG:STM0566 KEGG:ns NR:ns ## COG: STM0566 COG3059 # Protein_GI_number: 16763943 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Salmonella typhimurium LT2 # 9 187 5 182 186 173 51.0 2e-43 MKEKLIALLTFTSSLKSFGMKFIRVAILVVFVWIGGLKYFHYEADGIVPFVANSPFMSFF YAKGAPEYKEHKNAEGAFVPENRAWHEANRTYTFSYGLGALIMGIGILVFLGIFFPKVGL AGDTLAIIMTLGTLSFLVTTPEVWVPDLGSGEFGFPLLSGAGRLVIKDIVILASAVVLLS DSSQRVLKTLKKD >gi|222159279|gb|ACAB01000080.1| GENE 4 4303 - 5142 782 279 aa, chain - ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 117 272 131 285 288 83 31.0 4e-16 MLQQYHTKLKGTLALTDSYLTEKTLQKEKGLYKFIWVRNGSITVEIDHQEMTLTQDEVIS LTHLQHLEFKSIDGEYLTLLFNSNFYCIYGNDHEVSCSGFLFNGSSHLIRFTLNEKERKE LDTITEALENEFTVSDSLQEEMLRILLKRFIIQCTRIARHRMNITREKESGFEIVRQYYN LVDEHYRTKKQVQDYADMLHKSPKTLSNIFSTCKLPSPLRVIHERVEAEAKRLLLYSNKS AKEIADILGFEDQASFSRFFKNMTGQSAVQFRNTQEGKN >gi|222159279|gb|ACAB01000080.1| GENE 5 5145 - 5627 377 160 aa, chain - ## HITS:1 COG:BH1366 KEGG:ns NR:ns ## COG: BH1366 COG0295 # Protein_GI_number: 15613929 # Func_class: F Nucleotide transport and metabolism # Function: Cytidine deaminase # Organism: Bacillus halodurans # 25 160 6 132 132 91 41.0 5e-19 MKDLTITAIIKVYQYDELNEGDRSLIKTAMEATARSYSPYSHFSVGAAALLGNGTVVTGT NQENAAYPSGLCAERTTLFYANSQYPDQPVVTLAIAARTEKDFIDHPIPPCGACRQVILE TEKRYKHPIRILLYGKECIYEVKSIGDLLPLSFDASAMED >gi|222159279|gb|ACAB01000080.1| GENE 6 5738 - 6643 871 301 aa, chain + ## HITS:1 COG:lin1064_1 KEGG:ns NR:ns ## COG: lin1064_1 COG1705 # Protein_GI_number: 16800133 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Muramidase (flagellum-specific) # Organism: Listeria innocua # 33 171 54 201 201 87 40.0 4e-17 
MENKLHRLIFLTVVFFFAVGVQAQKRNARYVEYINKYSDLAVEQMKLHKIPASITLAQGL LESGAGYSQLARKSNNHFGIKCGGSWRGRTVRHDDDARNECFRAYKHPRDSYEDHSDFLR RGARYAFLFKLDITDYKGWARGLKKAGYATDPSYANRLITIIEDYDLYKYDRKGVYSERK LKKNPWLMSPHQVYIANDIAYVVARNGDTFKDLGNEFDISWKKLVKYNDLQRDYTLMEGD IIYLKSKKKKASKPYTVYVVKDGDSMHGISQKYGIRLKNLYKMNRKDGEYVPEIGDRLRL R >gi|222159279|gb|ACAB01000080.1| GENE 7 6725 - 8122 1279 465 aa, chain + ## HITS:1 COG:all2964 KEGG:ns NR:ns ## COG: all2964 COG1252 # Protein_GI_number: 17230456 # Func_class: C Energy production and conversion # Function: NADH dehydrogenase, FAD-containing subunit # Organism: Nostoc sp. PCC 7120 # 11 423 5 424 442 265 35.0 1e-70 MSLNIAKSNKKRVVIVGGGFGGLKLANKLKKSGFQVVLIDKNNYHQFPPLIYQVASAGME PTSISFPFRKIFQHRKDFFFRMAEVRAIFPEKNMIQTSIGKAEYDYLVLAAGTTTNYFGN KHIEEEAMPMKNVSEAMGLRNALLANLERALTCSTKQEQQELLNIVIVGGGATGIEVAGI LSEMKKFVLPNDYPDMSSSLMHIYLIEAGPRLLAGMSEESSAHAEQFLREMGVNILLNKR VVDYRDHKVVLEDGTEIATRTFIWVSGVTGVTIGNLDASLIGRGGRIKVDSFNRVEGMNN VFAIGDQCIQLADENYPNGHPQLAQVAIQQGELLAKNLIRMEKGQEMKPFHYRNLGSMAT VGRNRAVAEFSKVKMQGWFAWVMWLVVHLRSILGVRNKVIVLLNWVWNYFTYDQSMRMIV YARKAKEIRDREKVEETTHWGKELIQEPKQHSPQEIQQASEQEKK >gi|222159279|gb|ACAB01000080.1| GENE 8 8356 - 9603 1211 415 aa, chain + ## HITS:1 COG:PA2521 KEGG:ns NR:ns ## COG: PA2521 COG0845 # Protein_GI_number: 15597717 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Pseudomonas aeruginosa # 105 401 136 466 484 61 24.0 4e-09 MDREIPKEIRQKERNKKIIRYSSIGVASIVVIGVLISVLRAGVEAKDLVFSTVDTGVIEV SVSASGKVVPAFEEIINSPINSRILEVYKKGGDSVDVGTPILKLDLQSTETEYKKLLDEE QMRRYKLDQLRVNNQTKLSDMAMQIKVSAMKLSRMKVELRNEHYLDSLGAGTTDKVRQAE LSYNVAQLEYEQLQQQYKNEKEVAAAELKVQELDFNIFRKSLSEKKRTLDDAQIRSPRKA ILTYINNQIGAQVSEGGQVAIISDLSHFKVEGEIADTYGDRVAAGGKAIVKIGSDKLEGT VSSVTPLSKNGVISFTVQLKEDNHRRLRSGLKTDVYVMNAVKEDVMRVANGSFYVGRGEY ELFVCNSDNELVKRKIQLGDSNFEYVEVLSGLQPGDKVVVSDMSAYKNKNKLKIK >gi|222159279|gb|ACAB01000080.1| GENE 9 9621 - 10286 292 221 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S 
ribosomal protein L34 [Vibrio campbellii AND4] # 1 214 1 211 245 117 34 1e-25 MITLNSLSKIYRTDEIETVALENVNLTVERGEFLSIMGPSGCGKSTLLNIMGLLDAPTMG IVEINGIRTEGMKDKELAIFRNKTLGFVFQSFHLINSLNVTDNVELPLLYRRVGSSERKR LAQEVLEKVGLSHRMRHFPTQLSGGQCQRVAIARAIIGNPEIILADEPTGNLDSRMGAEV MELLHRLNKEDGRTIVMVTHNEEQAKQTSRTIRFFDGRQVQ >gi|222159279|gb|ACAB01000080.1| GENE 10 10354 - 10980 649 208 aa, chain + ## HITS:1 COG:no KEGG:BT_1534 NR:ns ## KEGG: BT_1534 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 208 1 208 208 288 73.0 8e-77 MTKNIFIVAIAVLLSVAFCSLSAQNVSKDYNVGDFSAINLQSVGNIIFAQSAECTCRLEG PSEFVEKTRVTVKNGTLVIDYKEKNVKNVKNLIFYITAPDLSKVKIDGVGNFDAKEKLNL KNIAFELDGVGNCNVKNLHCDELKLDVDGVGNMKMNVDCGLIKAKVDGVGNITLSGKADT AFFKKDGVGKINHKNLKCKDITKKGWNF >gi|222159279|gb|ACAB01000080.1| GENE 11 11077 - 12366 1141 429 aa, chain + ## HITS:1 COG:YPO1365_2 KEGG:ns NR:ns ## COG: YPO1365_2 COG0577 # Protein_GI_number: 16121645 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Yersinia pestis # 118 426 113 392 395 73 24.0 1e-12 MIKLYFKQAFHLLGENKLLSSISIIGTALAIAMIMVIVITLRATIAPFAPETHRDRMLIF RFAGLQSKSNVNWQSNGPIGYNTAKACFKAMTIPEVVSITNIWQETMLAAKPAGEMESCS VLQTDDAFWKIFEFEFLSGKPYDNADFDAGAAKAVISEDMARRLFGTSEVVGKTFLLNHS AYIVCGVVRPVSKLAKYAYAQVWIPLSSTSAFTATWGDDNIMGMTAVYILAKSKDDFPAI RQEADRLRAIFMAGHPNFDLLYRGQPDTYFVAAQRYSANNPPAVKEAVRQYILTLLVLLI VPAVNLSGLTLSRMRKRISEIGVRKAFGAPRRELMMQVLSENMLYSLFGGILGLVLSYVA AFLLGGMLFSVDFVSNGVEDLRTMCVDLLFDPTVFLLAFLACFLLNLLSAVIPAWRVTRT NIVDAINER >gi|222159279|gb|ACAB01000080.1| GENE 12 12396 - 13649 904 417 aa, chain + ## HITS:1 COG:no KEGG:BT_1532 NR:ns ## KEGG: BT_1532 # Name: not_defined # Def: ABC transporter permease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 417 1 417 417 642 76.0 0 MQLLKQIWNERRSNGWLWSELLIVFVVLWYVVDWTYVTARTYYEPVGFDITDTYYLELSL KNDKSNSYLSKEQKSTSLGQDIIELTNRLRRLPEVEAVSISNNARPYIGSNSGSMLRIDT 
LVSNPLRRSVTPDFFQVFRYQSADGRGYQPLVQALRNGNVVVGENFWPKDYKGDRTLLGK EMVDVDDSTKVYKIGGVSKKVRYNDFWPNYSDRYVAIELTEKIMVELDDELYPSSVEVCL RVKPGTSRDFAEHLMKLSANQLSVGNLFILKVHDYEDLRNDFQQGSYNQVQVRFWMMGFL LLNILLGIVGTFWFRTQHRRAESALRIAVGSSRMQLWQRLNKEGLLLLTLAALPAAAICY NIGHLELTEGYMEWGVVRFLITFVITYFLMSLMILIGIWFPARQVIRIQPAEALREE >gi|222159279|gb|ACAB01000080.1| GENE 13 13734 - 16343 1739 869 aa, chain + ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 612 865 52 311 328 167 40.0 1e-40 MNRRYYCMFIVLFFAVLTQAAAAERTFNILFIQSYTSQTPWHSDLNQGLAKGFKESGLKV NITTEYLDADFWAFNSEKVIMRRFCQRARDRQTDLIVTASDEAFYTLFACGDSLPLQIPV VFFGIKYPDTKLITTHPNVCGFTANPDFDVILRQAQKIFPRRKEVVCVIDNSFLSNKGLE DFEEEWKIFQKDNPDYRMKIYNTQNHTTSHIIAAICYPRNSYERLVVAPKWSPFLSFVGK NSKAPVFSTQNVGLTNGVFSAYDADSYTSASLAAQRAASVLKGTSPRDIGVTEITQGFIF DYKQLDFFHIDPDKVSSSGTIVNEPYWEKYKYLFILLYPSILALLIASIVWLMRANRRES KRRIQAQTRLLVQNKLVEQRNEFDNVFHSIRDGVITYDTDLHIHFTNRSLLQMLHLPYEA GGRFYEGMMAGSIFKIYYNGQDILHKMLKQVASKGESVKIPQGAFMKEVHSDKYFPVSGE IVPIRSKDAITGMALSARNISDEEMQKRFFDMAVDESSIYPWQFDMETNCFIFPQGFLKR LGYDESVTTILRDEMDRTIHPDDLKEIRPLFNRALTGEDSNTRLNFRQRNVNGEYEWWEY RSSVITGLTQDSLYNILGVCQSIQRYKTAEQEMREARDKALQADKLKSAFLANMSHEIRT PLNAIVGFSDLLSDTSGFTSEEIAQFIGTINKNCGLLLALINDILDLSRIESGTMEFMFA EHNLPLLLKTVHDSQRLNMPPGVELVLRMPESDKKYLTTDNVRLQQVVNNLINNAAKFTS SGFITFGYEDDEVPGYTRIFVEDTGVGISEEGIRHIFERFYKVDNFTQGAGLGLSICQTI IERLNGTISVTSEVGKGTRFTVRLPNYCE >gi|222159279|gb|ACAB01000080.1| GENE 14 16473 - 17951 1360 492 aa, chain + ## HITS:1 COG:no KEGG:BT_1530 NR:ns ## KEGG: BT_1530 # Name: not_defined # Def: putative outer membrane protein OprM precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 492 1 492 492 819 88.0 0 MKKKSILFALAAVCLPLALSAQKEREITLNEAIAMARIQSVDAAVALNELKTAYWEFRTF RADLLPEVNLTGTLPNYNKSYSSYQNSDGSYGFVRNNTLGLTGDLSIDQNIWLTGGKLSL TSSLDYIKQLGAGGDRHFMSVPVTLQLTQPIFGVNNIKWNRRIEPVRYAEAKAAFITATE 
EVTMRAITYYFNLLLAKENLGTAKQNQTNADHLYEVALAKRKMGQISENELLQLKLSALN AKAALTEAESDLNAKMFQLRAFLGVGEDEVLNPVLPEAVDGPRMEYNQVLNKALERNSFA QNIRRRQLEADYEVATARGNLRSVDLFASVGYTGENRNFPAVYRNLQDNQIVQVGVKIPI LDWGKRRGKVRVAKSNRDVVLSKIRQEQINFNQDIFLLVEHFNNQAQQLDIAKEADAIAQ QRYKTSIETFLIGKINTLDLNDAQNSKDDARQKHISELYYYWYYYYQIRSLTLWDFRTNT ELEADFDEIIRR >gi|222159279|gb|ACAB01000080.1| GENE 15 18256 - 19605 1313 449 aa, chain + ## HITS:1 COG:atoC KEGG:ns NR:ns ## COG: atoC COG2204 # Protein_GI_number: 16130157 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 2 447 7 456 461 343 43.0 6e-94 MILIIDDDSAVRSSLSFMLKRAGYEAQTVPGPREAMEVVRSIAPDLILMDMNFTLSTTGE EGLTLLKQVKIFRPETPVILMTAWGSIQLAVQGMQAGAFDFITKPWNNAALLQRIETALQ LSGTPQEPTQEQSDSFDRSHIIGRSQGLMDVLNTIARIAKTNASVLITGESGTGKELIAE AIHINSQRAKHPFVKVNLGGISQSLFESEMFGHKKGAFTDASADRIGRFEMANKGTIFLD EIGDLDPSCQVKLLRVLQDQTFEVLGDSRPRKTDIRVVSATNADLRKMVGERTFREDLFY RINLITVKLPALRERREDIPLLARHFADRQAVTNGLPRTEFSADALQFLSRLPYPGNIRE LKNLVERTILVSGKPLLDASDFDAQYIRHDDARVTEGAALAGMTLDEIERQTILQALDRY KGNLSQVATALGISRAALYRRLEKYNITM >gi|222159279|gb|ACAB01000080.1| GENE 16 19660 - 20952 1155 430 aa, chain + ## HITS:1 COG:BH1920 KEGG:ns NR:ns ## COG: BH1920 COG0642 # Protein_GI_number: 15614483 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 104 430 188 537 548 102 27.0 2e-21 MRIKGFFYILVFLLLALGSVLLFLSSQLNTIFFYIGEGLILFILIYLTFFYRKIVKPLNT IGSGMELLREQDFSSRLSPVGQYEADRVVNIFNRMMEQLKNERLRLREQNNFLDLLIKAS PMGVIITSLDEDLSELNPMALKMLGVRLEDVQGKKMKEIDSPLAVELANLPKGEKVTVRL NDSNIYRCTHSSFIDRGFQHPFYLVETLTDEVMKAEKKAYEKVIRMIAHEVNNTTAGITS TLDTVEQALSSEEGMEDICDVMRVCTDRCFSMSRFITRFADVVKIPEPTLSSVNLNDLVF TCKRFMEGMCNDRRITLRMEIDESLKDVMLDAALFEQVLVNIIKNAAESIETDGEIIVRT LAPATVEVIDNGQGISKETEAKLFSPFFSTKPNGQGIGLIFIREVLMRHGCTFSLRTYAD GLTRFRITFP >gi|222159279|gb|ACAB01000080.1| GENE 17 21050 - 23602 2297 850 aa, chain + ## 
HITS:1 COG:no KEGG:BT_1527 NR:ns ## KEGG: BT_1527 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 850 1 850 850 1594 92.0 0 MRLMKKGTLTLFLLLAISIPLCAQYVVQGVVTDSLTKEPLPYASVRLKDTTEGTTTGSDG RFYFKTHRSEAVLVISVIGYNDYVRTIRPARNASYKVALAPTEYALGEVVVKPKREHYRK KDNPAVEFVRRMIESRDNYSPYEKDFWQRERYEKTTFALNNFDEEKQKKWLYRKFDFLTE YVDTSAVTGKPILTVSARELLATDYYRKSPRSEKQWVKGRKQAGVDEFLSKQGMQAAINE VFKDVDIYENNISLFTNKFVSPLSRIGTGFYKYYLMDTLQIAGEPCADLAFTPFNSESFG FNGHLYVTLDSTYFVKRAVFNFPKKINLNFVDYMLLEQEFKRAEDGTRLLDHESITVEFK LTEGQDGIFARRVADYSNYTFTPTAEADKAFTKPERIIEETEALSRPATFWAENRPQAAI SQQENSVDRLMTQLRSYPVYYWTEKVLSILFTGYIPTSKEAPLFYIGPMNATISGNTLEG PRIRAGGMTTAWLNPHLFGKGYVAYGFKDERVKGLAELEYSFKKKKEYANEFPIHSLKLR YESDVNQYGQNYLYTSKDNVFLALKREKDDRIGYFRQAEMTYTNEFYSGFSFQLTARTRQ DESSYLIPFLKKEGDTYTPVKDFSVSAAELKLRYAPNEKFFQTQWNRFPVSLDAPVFTLS HTIAGKGVLGSDYTYNHTEAGIQKRFWFSAFGYTDVILKAGKVWDKVPFPLLIMPNANLS YTIQPESYSLMNAMEFMNDEYFSWDVTYFLNGWLFNRVPLLKKLKWREIVSCRGLYGHLS DKNNPALSDGLFAFPIENTQTMGKTPYVEAGVGIENIFKVLRLDYIWRLTYRDSPGIDRS GLRISLHMTF >gi|222159279|gb|ACAB01000080.1| GENE 18 23711 - 25000 1517 429 aa, chain + ## HITS:1 COG:YJL153c KEGG:ns NR:ns ## COG: YJL153c COG1260 # Protein_GI_number: 6322308 # Func_class: I Lipid transport and metabolism # Function: Myo-inositol-1-phosphate synthase # Organism: Saccharomyces cerevisiae # 11 421 87 541 555 186 29.0 1e-46 MKQEIKPATGRLGVLVVGVGGAVATTMIVGTLASRKGLAKPIGSITQLATMRMENNEEKL IKDVVPLTNLEDIVFGGWDIFPDNAYEAAMYAEVLKEKDLNGVKEELEAIKPMPAAFDHN WAKRLNGTHVKKAATRWEMVEQLRQDIRDFKAANNCERVVVLWAASTEIYIPLSDEHMSL AALEKAMKENNTDVISPSMCYAYAAIAEGAPFVMGAPNLCVDTPAMWEFSKQKNVPISGK DFKSGQTLMKTVLAPMFKTRMLGVNGWFSTNILGNRDGEVLDDPDNFKTKEVSKLSVIDT IFEPEKYPDLYGDVYHKVRINYYPPRKDNKEAWDNIDIFGWMGYPMEIKVNFLCRDSILA APIALDLVLFSDLAMRAGMCGIQTWLSFFCKSPMHDFEHQPEHDLFTQWRMVKQTLRNMI GEKEPDYLA >gi|222159279|gb|ACAB01000080.1| GENE 19 25162 - 25641 431 159 aa, chain + ## HITS:1 COG:STM0420 KEGG:ns NR:ns ## COG: STM0420 COG1267 # Protein_GI_number: 16763800 # Func_class: I 
Lipid transport and metabolism # Function: Phosphatidylglycerophosphatase A and related proteins # Organism: Salmonella typhimurium LT2 # 10 159 23 169 171 80 39.0 2e-15 MKRPPFLPVFIGTGFGSGFSPFAPGTAGALLASIIWIALYFLLPFSWVLWLTAALVIVFI FAGIWAADKLETYWGEDPSRVVVDEMVGVWIPLLAVPNDDKWFWYVIAAFALFRIFDIAK PLGIRRMESLKGGVGVMMDDVLAGVYSFILLVGARWVIG >gi|222159279|gb|ACAB01000080.1| GENE 20 25656 - 26123 317 155 aa, chain + ## HITS:1 COG:no KEGG:BT_1524 NR:ns ## KEGG: BT_1524 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 155 1 154 154 258 93.0 7e-68 MISQKGGIFMFLRAQLSAQMATIADFLVTILLVRLFDVYYVYATLAGAIYGGIVNCVINY KWTFKSKGKKTHVAVKFILVWVCSVWLNTWGTYALTESLAKIPWVRNTLSLYFGDFFIIP KVVVAIIVALFWNYNMQRFFVYRNIDIRSLFGKRN >gi|222159279|gb|ACAB01000080.1| GENE 21 26228 - 26878 784 216 aa, chain + ## HITS:1 COG:MT2687 KEGG:ns NR:ns ## COG: MT2687 COG0558 # Protein_GI_number: 15842152 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphate synthase # Organism: Mycobacterium tuberculosis CDC1551 # 13 170 14 163 217 89 32.0 5e-18 MNYRDYLQQLIYKIINPLIRGMIKIGITPNFITTTGFILNVVAAGMFVYAGIYGGQNDLA IIGWAGGVILFAGLFDMMDGRVARLGNMSSKFGALYDSVLDRYSELMTFFGICYYLSMKD YFLYALIAFVALIGSLMVSYVRARAEGLGIECKVGFMQRPERVVLTSLGALFCGVFKDIT AFEPILIMIVPLAFVALFANITAFARVRHCYKAMKE >gi|222159279|gb|ACAB01000080.1| GENE 22 27003 - 27920 654 305 aa, chain + ## HITS:1 COG:no KEGG:BT_1522 NR:ns ## KEGG: BT_1522 # Name: not_defined # Def: putative aureobasidin A resistance protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 305 1 305 305 513 90.0 1e-144 MPSKREALTVTVIMALFLLLTSIFIGLRSEHLLMAVLYLVLFFAGLPTRKLAVALLPFAI FGISYDWMRICPNYEVNPIDVAGLYNLEKSLFGVMDNGLLVTPCEYFAAHNWPIADVFAG IFYLCWVPVPILFGLCLYFKKERKTYLRFALVFLFVNLIGFAGYYIHPAAPPWYAINYGF EPVLNTPGNVAGLGRFDAFFGVTIFDSIYGRNANVFAAVPSLHAAYMVVALVYAIIGKCR WYVVTLFSIIMVGIWGTAIYSCHHYIIDVLLGISCALIGWLVFEYILMRIPAFKRFFERY YTYIK >gi|222159279|gb|ACAB01000080.1| GENE 23 28029 - 29039 638 336 aa, chain + ## HITS:1 
COG:no KEGG:BT_1521 NR:ns ## KEGG: BT_1521 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 20 328 1 309 309 491 89.0 1e-137 MKNKYRNIFLAFGIIAVLIMIFTFDMDYQELWLNLKRAGVYLPLVLLLWLFVYLINTVSW YIIIRSGGKTGFSFVRLYKFTVTGFALNYVTPVGLMGGEPYRIMELKPYIGIERATSSVI LYVMMHIFSHFCFWLSSVLLYVCLYPVGWMMGIILGAITVFCLLIVILFMKGYRQGMAVA FVRMGGRIPFLKKKVFHFANAHKEKLENIDKQIALLHQQKKQTFYSALILEYTARVVSCL EIWLILNVLTTHVSFADCCLIAAFSSLLANLLFFLPMQLGGREGGFALAVGGLSLSGAYG VYAALITRVREMVWIVIGLVLMKVGNFKQSKERYSA >gi|222159279|gb|ACAB01000080.1| GENE 24 29173 - 30345 1273 390 aa, chain + ## HITS:1 COG:BH3344 KEGG:ns NR:ns ## COG: BH3344 COG1979 # Protein_GI_number: 15615906 # Func_class: C Energy production and conversion # Function: Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family # Organism: Bacillus halodurans # 1 390 1 387 387 310 40.0 4e-84 MNNFIFYSPTEFVFGRDTEAQTGVLVQKYGARKIMIVYGGGSVIRSGLLARVEKSLQEVG IPYCMLGGVQPNPIDTKVYEGIDLCRKENVDMMLAVGGGSVIDTAKAIAAGVPYNGDFWD FYIGKAIVTKALKVAVVLTIPAAGSEGSGNTVITKVDGLQKLSLRAPGVLRPVFAVMNPE LTYTLPPFQTACGIADMMAHIMERYFTNTKEVEIGDRLCEGTLLTIIKEATTVMKDPENY GARANLMWCGTIAHNGTCGVGCEEDWASHFLEHEISAIYNVTHGAGLSVIFPAWMTWMTE HNVDKIAQYAIRAWGVAESDDKKAVALEGISRLKSFFTSIGLPVTFKELGIENPDIDRLA DSLHRNKGELVGNYVKLAKQDSKEIYRLAL >gi|222159279|gb|ACAB01000080.1| GENE 25 30565 - 33702 2549 1045 aa, chain + ## HITS:1 COG:no KEGG:BT_3271 NR:ns ## KEGG: BT_3271 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 48 1045 112 1116 1116 843 44.0 0 MQKLNSGALNRILLFVYILSLSTNAIAQNKKNLKETYNLPPQGNYVYGRVIEKLSNEPMV GVTIRLDGHSTGVITDINGCYVLTLPEKGGLVIYSYIGFETRKIKVTSRQKVDVQMVEAT ESIQEVIVTGYNSIQKESFTGNTTKIEKEDLLKVNPNNLISAIQTFDPSFRIQENLAAGS DPNSLPQFVLRGQTGIGETTLGQTSTSSISREVLSGNSNLPIFILDGFEVDVEKIYDLDM NSIHSINILKDAAATAMYGSRAANGVIVIERRAPEAGKFRVQYSGVLSAELPDLSSYNLV NAREKLETERLAGLYDSNTPEIDPYTNGYYQRLNNVLTGVDTYWLSQGLRTALNHKHSVF IDGGENDVRWGVELGFRGTEGVMKHSSRKNANAAFYVDYRIGGLQIKNKVTYTYNKSTDV 
PFNSFSDYSHLLPYMRLYDENGDYVRRLEKFDGASGTQVNPLYEINFYNSFDHSGYDEVT DDLSLNWRITDGLRLRGQFSVLMRNSTGDLYKDPASASYSASTGNINGEKTESTQKRTVI DGSLSLMYNNTFKGHNLNICLSSNMRQTQSTASETRYRGFPGGDLVSSNYAAEVYGKPSS SDNTTRLVGALLTSNYTYNNIYLADLTGRIDGSSEFGSDKRWSMFWSTGAGINIHNYDFM KSNELFSMLKFRVSYGLTGKTNFSLYSAKDMYQLQTDSWYPTGYGVFLYQMGNPNLKWER KYTLDYGVEIGLWHDKIYLKASAYDERTIDLITDYTIPSSTGFTSYKENMGKVKNTGVEL ELRARLYSDRNWLFQLYGSFARNKNTIIEISQAMRDYNKRVEELFSGYNPESSSDSKYAK TYLKYYEGASLTSIYGMKSLGISPTNGKEIYLRRNGDVTDVWSADEWTIIGDTAPKGQGS FGYTLSYKQLSMFASFLYTFGGDAYNNTLVSYVENADIKNDNVDKRVLLDRWQKPGDITT MKDIRDRNVTTGASSRFVQKNNTLQWSSLTMSYNFRPEQLKKLHLSGLRLSFTMNDLFYW STIRQERGLDYPYSRSFNLTTNIIF >gi|222159279|gb|ACAB01000080.1| GENE 26 33711 - 35270 1429 519 aa, chain + ## HITS:1 COG:no KEGG:BT_3272 NR:ns ## KEGG: BT_3272 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 496 4 476 488 241 33.0 6e-62 MKKLISHILIALTGMLAVSCNAWLDVTPENAIADDDLFSTGFGYRNALNGIYTNLASDEL YGKQLSWGFLSAISQQYNQKAGTISPMYADASELIYNTVDTEPVVTAIWEKGYKVIANLN KLIENIRPTDISLFEYGEEEKNLIYAEALSLRAMMHFDLLRLFAPATATNPSGAYLPYRD KYEAAVVEKCTVTDFIEKVLKDLLEAEDILRKFDTEYHPEAMYASQMYEPTPEWNARYRF NSGSYIDDMGAFFWYRGIRFNYLALLGLKARVCIYAGPAYYKNAETAAKELYNTYYQQKR WIGFTEGENITCNLNSRYTKVSHDILFGLYKKQLATDYEQAVWGSSSSSSTTRLPLANIP SLFASDNTGVYTDYRLTYLIGTTNETQSKYYTLKYNPSVESVVEAMENPMIPVIRFSEIC HILAEISSYNGKITEGINYLETVRKARGAERTLSLTVSTREQLDAEILLDIRKEMIGEGG TFYTYKRMNLSTVPDSDEEGEINMTGSYVLPLPTSETTN >gi|222159279|gb|ACAB01000080.1| GENE 27 35283 - 36008 666 241 aa, chain + ## HITS:1 COG:no KEGG:BT_3273 NR:ns ## KEGG: BT_3273 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 184 1 186 236 68 28.0 2e-10 MKRILTYTIFCLLFVLTSCSEEQLDVYNGDNYVYFTYMSDKSPQKITFNFATDAPLLREG TVKVKMTLLGYLLEENATCDISAVGEKSTARSGIDYAPLTSGIFHSGLAEDTYEVTVYRN EDLLNTDYTLTLSLDAVENCLVGPAEYKHVTIQVTDRISQPVWWNQSSAANLGTYSDMKY RVFIIFMDGEILESLDKYTGIEFVNLIADFKAWWKDQWQQGNYQYYDTDGVTPLYETILD N >gi|222159279|gb|ACAB01000080.1| GENE 28 36018 - 
37658 1266 546 aa, chain + ## HITS:1 COG:no KEGG:BT_3274 NR:ns ## KEGG: BT_3274 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 504 3 476 534 73 23.0 1e-11 MKRKTILFISILQLLVITGCYDDLGNYDYTNSTKVSFVNSASSDYTFMVGDLFEMDAPIN FSTDIGNVDELFTVQWYLNRELIYTGYHLKYQFEKGGTYELILKVINKETNETYISNKYT LTGKNSFDWGWMILSNKGDGKSSLSFINPGLRATHDVENIIEDGLGTDPLGIYYYYVLGS ISGSYVSGLPKILINQGSGSVTLDGNSLQKDMWLAHEFENRKEPEGLKIMDFAFKEEYYV ICSEQGEVYIRAVGTDNKAIPYYGKYGAMPYEFEGGSRITCFAPFHNVTYWCADEERCIL YDEQNARFIGITHYPQWGAVYTPAIVYFKTYDQDLEVPSGVLRVNNMGAGTRCLAIGAYE KKDVASNGGLTFWSNYVSLIDVQGTGNYDLHEFAVKDMDNNSHLITGTDQYGFSGSSLLT PQSVIKMSSNFEKNPYFYFTDGDKNLYIYSMQMRSHMLAYTAGSRITGISGSPVVCEFYG YGGNSTDPNFRLALSQENGDIAIIDVNTSQMVRLFEGFAPDLELKTFSGFGDVKGMVWCT NYEGEY >gi|222159279|gb|ACAB01000080.1| GENE 29 37714 - 38670 781 318 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714997|ref|ZP_04545478.1| ## NR: gi|237714997|ref|ZP_04545478.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 318 1 318 318 574 100.0 1e-162 MKNRLFYFAACVALVLSSVSCSSDDGEEEDADIPVVGKIAVSGAYNVYSHGSQGWIEADK IGIYVLSDGKPQDNLPYAPSEVAVATTTELDGKTYIVYDRDSRVTEDITLNPSSELAAGF KGGDHTIYAYTPYNAASQDYKAVALPNLAVQEYYESEFMPHRNYSFAYASATTSSYSAAT VSLGEFKPLFSQITLPGAGCPDSFVGKKFTKVVVSCDHPIAYEDGATVDLSTGKISGTPI NSITYNLPNGGFEITAGYFGASLETCYMMIAVPFETGLTYTYNFTLTIDGQEYTISGKPN EKMSSINNLNMYGIAGIE >gi|222159279|gb|ACAB01000080.1| GENE 30 38713 - 41280 2749 855 aa, chain + ## HITS:1 COG:no KEGG:Cpin_5142 NR:ns ## KEGG: Cpin_5142 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 43 848 37 825 837 829 52.0 0 MKKLVSILMILAMIVPLAGAQSTDIFKKKKKKKSKTEAVDKAKADSIAKSKKDALQPYAK VITGKAKTMDGFFKVHYVDGKYFFEIADSLFGRDILIVNRVVKAPVDAQKRKVGYPGDYI SDEVIRFEKGRGDKLFVREISYLEHSADTLGMYQAVLNSNVQPIVATFPLKTVRKEGETT NYVIDMTDYIRKDNEMFSFTSRVKDNIGASSMVDDASYIDTLKAFPQNIEIRTVRTFQRK KGGGSGLEKLLAAFFATSTTPLTYELNSSMLLLPKEPMKPRLHDDRVGYFAVSYKDFDEN PQGVKYKANITRWRLEPKDEDREKYLRGELVEPKKPIIIYIDPVTPKKWVPYLIQGVNDW 
QAAFEKAGFKNAIFGKEAPTDDPTWSLEDARHSAIVYKPSDIPNASGPHVHDPRSGEILE THINWYHNVMSLLYNWYIVQAGAIDPGARKPMFDDELMGELVRFVSSHEVGHTLGLRHNF GSSNTVPVEKLRDKIWVEANGHTPSIMDYARFNYVAQPEDNVSRSGIFPRIGMYDKWAIE WGYRWMPEYETAEAEIPHLNKWIIEKLREDKRYTFGTELDRNDPRNQSEDLGDDAMLASS YGIKNLKRVMPEIMNWTYEPNEGYMKAVRLYQNVVGQFDLYMGHVATNVAGIYHNPISVE QTDMKAVEYVPKDIQKKAVDFLNKELFTTPTWLMDDKLSERTGINTFNSIYRVQSSTLKQ LLSSRTLDKMTDNELVNGAKAYTANDLFRDLKKSIWSDMQGGKKPDASQRSLQKTYVNAL IGMLDKPKNSSGSLGSLGGYSLVAFDFPSEAPTIARGQLTDLRRDLTNAANASSGIYRSH YLNLKALIDAAFDVK >gi|222159279|gb|ACAB01000080.1| GENE 31 41555 - 42113 303 186 aa, chain + ## HITS:1 COG:FN1101 KEGG:ns NR:ns ## COG: FN1101 COG1373 # Protein_GI_number: 19704436 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 3 164 24 187 470 106 40.0 2e-23 MFKRDILFHLERWKDDTHRKPLILRGARQVGKTTVVNEFGKQFDNYLYLNLEKREAASLF ELNVSLKDLMPLFFAHCGKIRNEGTILLFIDEIQNSAKAVALLRYFYEELPDIYVIAAGS LLENLIDVRVSFPVGRVQYMALRPCSFREFLGAVGEEPLLSVLDKPEITLAFHDRLMLLF NIYTLI Prediction of potential genes in microbial genomes Time: Wed May 18 03:00:30 2011 Seq name: gi|222159278|gb|ACAB01000081.1| Bacteroides sp. D1 cont1.81, whole genome shotgun sequence Length of sequence - 6830 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 6, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 5 - 796 516 ## COG1373 Predicted ATPase (AAA+ superfamily) + Term 839 - 897 1.8 - Term 827 - 883 1.4 2 2 Op 1 . - CDS 924 - 1373 419 ## COG3023 Negative regulator of beta-lactamase expression 3 2 Op 2 . - CDS 1446 - 1712 414 ## BT_1518 hypothetical protein - Prom 1732 - 1791 5.0 4 3 Tu 1 . - CDS 1861 - 2355 580 ## BT_1517 hypothetical protein - Prom 2460 - 2519 13.0 - Term 2474 - 2529 6.4 5 4 Op 1 . - CDS 2545 - 3924 1482 ## COG0305 Replicative DNA helicase 6 4 Op 2 . - CDS 3972 - 4595 616 ## BT_1515 hypothetical protein 7 5 Tu 1 . 
- CDS 5099 - 5368 266 ## BT_1514 hypothetical protein - Prom 5509 - 5568 9.9 + Prom 5444 - 5503 9.5 8 6 Tu 1 . + CDS 5571 - 6515 695 ## BT_1503 integrase Predicted protein(s) >gi|222159278|gb|ACAB01000081.1| GENE 1 5 - 796 516 263 aa, chain + ## HITS:1 COG:FN1101 KEGG:ns NR:ns ## COG: FN1101 COG1373 # Protein_GI_number: 19704436 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 1 253 216 457 470 64 22.0 2e-10 MPEVVQLYAERRDILSLESTYETLLQGYRDDVEKYALGKQLPEVIRFILKEGWHKAGQII TLGGFAGSSYNAREVGEAFRLLEKAMLLELVYPTTATEVPATPEIKRMPKLVWLDTGLVN YAAQVQKEVLGAKDIMDAWRGMIAEQIVAQELLTLTDKVSQKRCFWVRNKSGSNAEVDYV WVQDSMVYPIEVKSGHNAHLRSLHSFMNHSNQTVAVRIWSQPYAVDEVKTADGKEFKLIN LPFYLVGKLDSILRRFETTFDNA >gi|222159278|gb|ACAB01000081.1| GENE 2 924 - 1373 419 149 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 46 141 2 97 116 100 47.0 7e-22 MRTINLIVIHCSATREDKSFTEYDLDVCHRRRGFNGTGYHFYIRKNGDIKSTRPIERIGA HSRGFNKESIGICYEGGLDCKGQPKDTRTEWQKHSLRVLILALLKDYPNCRICGHRDLSP DLNGNGEIEPEEWIKACPCFNAETDWDKV >gi|222159278|gb|ACAB01000081.1| GENE 3 1446 - 1712 414 88 aa, chain - ## HITS:1 COG:no KEGG:BT_1518 NR:ns ## KEGG: BT_1518 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 88 10 96 101 122 85.0 4e-27 MTILPFLGSNRKKRKVMAQEVKEFSELVKDQYTFLMQQLEKVLKDYFDLSSKVKEMHTEI FSLRDQLAQAAALQCINKECAQRSAAEA >gi|222159278|gb|ACAB01000081.1| GENE 4 1861 - 2355 580 164 aa, chain - ## HITS:1 COG:no KEGG:BT_1517 NR:ns ## KEGG: BT_1517 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 144 1 144 176 234 85.0 6e-61 MNVLVERYQRRKYVNQPDSQMLYYVRQKSGTVRVMDINKLADAIEANSSLTAGDVKHSIE AFVEQLRLSLTQGDKVKIDGLGTFHITLSSEGAEKEKDCTVRNIRRVNVRFVADKALQLV 
NTSHATTRGENNVDFILAGKGDGEDADGGNSGSGGSGEAPDPAA >gi|222159278|gb|ACAB01000081.1| GENE 5 2545 - 3924 1482 459 aa, chain - ## HITS:1 COG:lin0047 KEGG:ns NR:ns ## COG: lin0047 COG0305 # Protein_GI_number: 16799126 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Listeria innocua # 2 433 4 427 450 266 38.0 8e-71 MNTENRVSPQAPEIEEAIIGACLIEQRAIPLIADKLRPEMFYVLRHQLIYAAILAMYHAG MKIDILTVKEELSHRGKLEEAGGAFGITQLSSKVATSAHLEYHAQIVHEKYLRREMTLGF NKLLACSLDETMDIDDSLMDAHNLLDRLEGEFGHNNHMRDMDELMTATMTEAEGRIANNI NGVTGIPTGLADLDRMTSGLQNGELVVIAARPGVGKTAFALHLARNAAMAGHAVAVYSLE MQGERLADRWLTAASEVSARHWRSGTVSPQELAEARTAAADLKRLPIHVDDSTSVNMEHV RSSARLLQSQHACDAIIIDYLQLCDMTTGQNNRNREQEVAQATRKAKLLAKELNVPVVLL SQLNRESENRPAGRPELAHLRESGAIEQDADVVILLYRPALARITTDRESGYPTEELGIA IIAKQRNGETGNVYFRHNSAMTKITEYVPPLEYMLKHAK >gi|222159278|gb|ACAB01000081.1| GENE 6 3972 - 4595 616 207 aa, chain - ## HITS:1 COG:no KEGG:BT_1515 NR:ns ## KEGG: BT_1515 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 207 11 210 212 239 60.0 5e-62 MNNFNFEAMMEHGFLIIPKALLQQQIEDPNIEAGEIEALLKILMKVNYSDTLYSDRQHKD YLCKRGESMFSYRDWSRIFRWSVGKTFRFIHALAILGIIEIVPHPNNSSLHIRVVEYDKW VGAPDSGKQKKKAVNEKFRLFWNEFHSITQLPKENIAKAQREWKKLSDKEQQLAIDKVED YYFHQTNINYLLHAASYLSNKAFLNEY >gi|222159278|gb|ACAB01000081.1| GENE 7 5099 - 5368 266 89 aa, chain - ## HITS:1 COG:no KEGG:BT_1514 NR:ns ## KEGG: BT_1514 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 88 12 89 99 76 50.0 3e-13 MNNNEPAAERYSINGFKYFSELAKEYFPDLANASSASKKMRKRIKANKTLNEQLVAAYYT SQTIDVSPEMQLIIYRHWGPPHIDLPTNV >gi|222159278|gb|ACAB01000081.1| GENE 8 5571 - 6515 695 314 aa, chain + ## HITS:1 COG:no KEGG:BT_1503 NR:ns ## KEGG: BT_1503 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 309 31 331 336 455 76.0 1e-126 MKKESFTLFMKQVAVDLQQSGNLGTAHVYRSTLNAILTFQGSDRLSFREITPEWLKHFEG 
SLRARGCSWNTVSTYLRTLRAVYNRAVDLRKASYVPHLFRSVYTGTRADRRRALDMEDMK KVFARLLQSDAITPAMKGAQELFILMFLLRGLPFVDLAYLRKSDLRGNVISYRRRKTGRP LSVTLTTEAMFLLQKYMNREEQSPYLFPILHSDEGSPKAYREYQLALRNFNYQLELLGKA LGLKDRLSSYTARHTWATTAYYCEIHPGIISEAMGHSSITVTETYLKPFRNKKIDEANMQ VLDFIKRSVVGVSA Prediction of potential genes in microbial genomes Time: Wed May 18 03:00:54 2011 Seq name: gi|222159277|gb|ACAB01000082.1| Bacteroides sp. D1 cont1.82, whole genome shotgun sequence Length of sequence - 40644 bp Number of predicted genes - 32, with homology - 32 Number of transcription units - 20, operones - 7 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 2611 2492 ## BT_2486 hypothetical protein + Term 2650 - 2687 0.8 2 2 Tu 1 . + CDS 3179 - 4735 1374 ## BT_0374 hypothetical protein + Term 4742 - 4791 11.0 - Term 4731 - 4777 7.1 3 3 Tu 1 . - CDS 4781 - 6355 1736 ## COG1530 Ribonucleases G and E - Prom 6504 - 6563 2.4 - Term 6429 - 6486 12.5 4 4 Tu 1 . - CDS 6639 - 6914 237 ## COG0776 Bacterial nucleoid DNA-binding protein - Prom 6989 - 7048 7.5 + Prom 6935 - 6994 6.1 5 5 Tu 1 . + CDS 7126 - 8166 842 ## COG1194 A/G-specific DNA glycosylase + Prom 8232 - 8291 10.7 6 6 Op 1 . + CDS 8348 - 8827 542 ## COG0629 Single-stranded DNA-binding protein + Term 8835 - 8890 11.4 7 6 Op 2 . + CDS 8905 - 10257 1289 ## COG1253 Hemolysins and related proteins containing CBS domains 8 6 Op 3 . + CDS 10301 - 10933 382 ## BT_1495 siderophore (surfactin) biosynthesis regulatory protein + Term 11006 - 11055 7.2 - Term 10990 - 11047 12.4 9 7 Op 1 . - CDS 11112 - 11336 238 ## BT_1494 hypothetical protein 10 7 Op 2 . - CDS 11345 - 11647 296 ## COG2388 Predicted acetyltransferase - Prom 11669 - 11728 4.1 11 8 Tu 1 . - CDS 11817 - 13190 1415 ## COG3033 Tryptophanase - Prom 13213 - 13272 4.8 - Term 13327 - 13369 9.1 12 9 Op 1 . - CDS 13427 - 14212 587 ## BT_1491 hypothetical protein 13 9 Op 2 . - CDS 14249 - 16225 1480 ## COG3391 Uncharacterized conserved protein 14 9 Op 3 . 
- CDS 16249 - 18291 1320 ## BT_1489 vitamin B12 receptor, outer membrane - Prom 18415 - 18474 3.5 15 10 Tu 1 . + CDS 18641 - 18865 80 ## gi|237715024|ref|ZP_04545505.1| predicted protein + Term 18928 - 18965 -1.0 16 11 Tu 1 . - CDS 18942 - 19886 842 ## BT_1485 hypothetical protein - Prom 19918 - 19977 3.7 17 12 Op 1 . + CDS 20010 - 22025 1266 ## BF2729 hypothetical protein + Prom 22043 - 22102 3.3 18 12 Op 2 . + CDS 22128 - 22952 657 ## BDI_2654 hypothetical protein + Term 22969 - 23020 18.1 + Prom 23434 - 23493 2.8 19 13 Op 1 . + CDS 23514 - 24749 748 ## BT_1451 hypothetical protein 20 13 Op 2 . + CDS 24756 - 25802 1027 ## BT_3148 hypothetical protein 21 13 Op 3 . + CDS 25847 - 27586 1135 ## gi|237715030|ref|ZP_04545511.1| predicted protein 22 14 Tu 1 . + CDS 27824 - 28741 834 ## gi|237715031|ref|ZP_04545512.1| predicted protein 23 15 Tu 1 . + CDS 28957 - 29424 422 ## PG0555 histone-like family DNA-binding protein + Prom 29628 - 29687 4.7 24 16 Tu 1 . + CDS 29729 - 30010 112 ## BT_4140 hypothetical protein + Term 30104 - 30162 2.4 - Term 29917 - 29951 1.1 25 17 Op 1 40/0.000 - CDS 30061 - 31842 1615 ## COG0642 Signal transduction histidine kinase 26 17 Op 2 . - CDS 31870 - 32559 829 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 32581 - 32640 6.9 + Prom 32558 - 32617 7.2 27 18 Op 1 . + CDS 32801 - 33994 882 ## BT_1481 hypothetical protein 28 18 Op 2 . + CDS 34047 - 34853 814 ## BT_1480 hypothetical protein 29 18 Op 3 . + CDS 34881 - 36749 1967 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 30 18 Op 4 . + CDS 36829 - 37668 592 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family + Term 37696 - 37755 15.8 - Term 37644 - 37692 1.6 31 19 Tu 1 . - CDS 37768 - 39003 834 ## COG2407 L-fucose isomerase and related proteins - Prom 39061 - 39120 5.4 32 20 Tu 1 . 
+ CDS 39185 - 40384 1024 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase Predicted protein(s) >gi|222159277|gb|ACAB01000082.1| GENE 1 2 - 2611 2492 869 aa, chain + ## HITS:1 COG:no KEGG:BT_2486 NR:ns ## KEGG: BT_2486 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 123 16 140 1016 103 44.0 2e-20 TLATVTYVGCKDYDDDIDNLQTQIDANAAGLAELQAKVNAGNWVTDIKSITGGFEITFNN GNKYSIVNGKDGSVVEIGENGNWFIDGVDTGKPARGEKGETGATGPVGPVGPEGPVGPVG PEGSVGPVGPEGPVGPVGPEGPIGPVGPEGPAGPAGKSPFIGDGTGEFEKDYWYFYDDAT DKWVKGDYSSATVYAVQNEGLPSFTLHVKDKTTGTELTTILPTAALISSLEGVTIENGKI TAGGTKELKLSYAQCKADFTFGMEDEKKEFKKNDLLITNSGVLNALINPVGPDFTDSKYQ IYLMNSQNEANFVISKIEQNKTAKPLTRATEAKVNRGVYDLTVTLKDGLNLETALPADEA YAFCTKDAWNNEIISAYDVKIKPEAVTSATKLVDAAVSAKVGEVQVLDDLAAAATTTPMD LSTVYAYYYKLAADAPEGVTLGTNEAGKQTITSTKGQEAKVEVCYITTNGIPFDGETHEG IDYSSVGGSSAPAKLTVTFKQIETKSLAAQTVVWNKSEKSDIAVSAANIKAIKDAITTAK LASSTATANDGDKVKFSGIASGDTKYDNLKLTVTAAYKGTMEAVLTVDVDATKQIVFKLP VTVDYKTPVFTKTSGMWTEDGKVSLLINETKNADKIQSISLEREMGSIFTTWNAITGGVT SGIYNSITYDVDKNGDDHVGSVANDKFTVDMAHAKDNMAFSVDVNCQPNDATEPASIANQ KIQFISLSELLKSEFGATDAKKGITTAYTAETIDSEIDVLGDIVWKDRKGKAMWPTADNN TYDTGTLTTADAVLALYGYSIKVELSDKTNFKFDSTGQKLILTDAGKALKGLVKDLVVTV TITPSISWSATSPAPVVKTVTFPTALFAD >gi|222159277|gb|ACAB01000082.1| GENE 2 3179 - 4735 1374 518 aa, chain + ## HITS:1 COG:no KEGG:BT_0374 NR:ns ## KEGG: BT_0374 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 516 1 516 516 820 77.0 0 MSHKIYPIGIQNFEKIRNDGYFYIDKTALMYQMVKTGSYYFLSRPRRFGKSLLVSTLEAY FQGKKELFEGLAVEKLEKDWIKYPILHLDLNIEKYDTSESLDNILDKSLTAWEKLYGAEP SERSFSLRFAGIIERACKLAGQRVVILVDEYDKPMLQAIGNEELQKQFRNTLKPFYGALK TMDGCIKFAFLTGVTKFGKVSVFSDLNNLDDISMWNEYVEICGISEREIHNNLETELHEF AAARGITYDKLCEELRECYDGYHFTHNSIGMYNPFSLLNAFKRKDFGSYWFETGTPTYLV KLLQKHHYDLERMTHEETDAQVLNSIDSESTNPIPVIYQSGYLTIKGYDEEFGMYRLGFP NREVEEGFVRFLLPYYANVNKVESPFEIQKFVREVRSGDYSSFFRRLQSFFADTTYEVIR DQELHYENVLFIVFKLVGFYAKVEYHTSEGRIDLVLQTDKFIYIMEFKLNGTAEEALQQI 
NDKHYALPFEMDERKLFKIGVNFSAETRNIEKWIVEEK >gi|222159277|gb|ACAB01000082.1| GENE 3 4781 - 6355 1736 524 aa, chain - ## HITS:1 COG:CT808 KEGG:ns NR:ns ## COG: CT808 COG1530 # Protein_GI_number: 15605542 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonucleases G and E # Organism: Chlamydia trachomatis # 1 517 1 512 512 277 32.0 4e-74 MTSELVVDVQPKEVSIALLEDKSLVELQSEGRNISFSVGNMYLGRIKKLMPGLNACFVDV GYEKDAFLHYLDLGPQFNSLEKFVKQTLSDKKKLTSISKATLLPDLDKDGTVSNTLKVGQ EVVVQIVKEPISTKGPRLTSEISFAGRYLVLIPFNDKVSVSQKIKSSEERARLKQLLMSI KPKNFGVIVRTVAEGKRVAELDGELKVLVKHWEDAMVKVQKATKYPTLIYEETSRAVGLL RDLFNPSFENIHVNDEAVYNEIKDYVSLIAPDRANIVKLYKGQLPIYDNFGITKQIKSSF GKTISYKSGAYLIIEHTEALHVVDVNSGNRTKNANGQEGNALEVNLGAADELARQLRLRD MGGIIVVDFIDMNEAENRQKLYERMCANMQKDRARHNILPLSKFGLMQITRQRVRPAMDV NTTETCPTCFGKGTIKSSILFTDTLESKIDYLVNKLKVKKFSLHVHPYVAAYINQGLVSL KRKWQMKYGFGIKIIPSQKLAFLQYVFYDTHGEEIDMKEEIEIK >gi|222159277|gb|ACAB01000082.1| GENE 4 6639 - 6914 237 91 aa, chain - ## HITS:1 COG:lin2048 KEGG:ns NR:ns ## COG: lin2048 COG0776 # Protein_GI_number: 16801114 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Listeria innocua # 3 90 4 91 91 58 38.0 3e-09 MTKADIVNEITKKTGIDKQTVLTTVEAFMDAVKDSLSNDENVYLRGFGSFVVKKRAQKTA RNISKNTTIIIPEHNIPAFKPAKTFTISVKK >gi|222159277|gb|ACAB01000082.1| GENE 5 7126 - 8166 842 346 aa, chain + ## HITS:1 COG:L0296 KEGG:ns NR:ns ## COG: L0296 COG1194 # Protein_GI_number: 15672823 # Func_class: L Replication, recombination and repair # Function: A/G-specific DNA glycosylase # Organism: Lactococcus lactis # 1 300 8 314 387 235 38.0 9e-62 MNEFTKTIVEWYKENKRELPWRESADPYLIWISEIILQQTRVAQGYDYFLRFIKRFPDVR TLAAAEEDEVMKYWQGLGYYSRARNLHAAAKSMNGVFPETYPEVLALKGVGEYTAAAICS FAYGMPYAVVDGNVYRVLSRYFGIDTPIDSTEGKKLFAALADEMLDKKHPAVYNQGIMDF GAIQCTPQSPDCLFCPLAGSCSALSKGWVTKLPVKQHKTKTTNRYFNYIYVRAGAYTFIN KRTGNDIWKNLFELPLIETPTALSEEDFLTLPEFRALFVPGEVPVVRSICREVKHVLSHR VIYANLYEVTLSENLTSFSDFLKIRVDELEQYAVSKLVQDLLQALE 
>gi|222159277|gb|ACAB01000082.1| GENE 6 8348 - 8827 542 159 aa, chain + ## HITS:1 COG:Zssb KEGG:ns NR:ns ## COG: Zssb COG0629 # Protein_GI_number: 15804651 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Escherichia coli O157:H7 EDL933 # 3 159 6 178 178 103 39.0 1e-22 MSVNRVILIGNVGQDPRVKYFDTGSAVATFPLATTDRGYTLANGTQIPERTEWHNIVASN RLAEIVDKYVHKGDKLYLEGKIRTRSYSDQSGAMRYITEIYVDNMEMLSPKGANPGAGAS ATGQPATGQQQQPVAGQSQQAQQSQAQPAQDNPTDDLPF >gi|222159277|gb|ACAB01000082.1| GENE 7 8905 - 10257 1289 450 aa, chain + ## HITS:1 COG:FN1486 KEGG:ns NR:ns ## COG: FN1486 COG1253 # Protein_GI_number: 19704818 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Fusobacterium nucleatum # 40 444 17 424 426 204 33.0 2e-52 MDSDGYLSQLADIFNGITVNTPSISAIIAIVLAGVLLLASGFASASEIAFFSLTPSDRND IDEQNHPSDEKISALLGDSERLLATILITNNFVNVTIIMLCNFFFMNVFVFHSPLAEFLI LTVILTFLLLLFGEIMPKIYSAQKTLAFCRFSAPGIYFLEKLFRPIATVLVRSTTFLNKH FVKKSHNISVDELSHALELTDKAELSEENNILEGIIRFGGETVKEVMTSRLDMVDLDIRT SFKEVMQCIIENAYSRIPIYSGSRDNIKGVLYIKDLLPHVNKGDNFRWQSLIRPAYFVPE TKMIDDLLRDFQANKIHIAIVVDEFGGTSGLVTMEDIIEEIVGEIHDEYDDEERTYVVLN DHTWIFEAKTQLTDFYKIAKVDEDEFEKVVGDADTLAGMLLEIKGEFPALHEKVTYHHYE FEVLEMDSRRILKVKFTILPKDMEGSEEKE >gi|222159277|gb|ACAB01000082.1| GENE 8 10301 - 10933 382 210 aa, chain + ## HITS:1 COG:no KEGG:BT_1495 NR:ns ## KEGG: BT_1495 # Name: not_defined # Def: siderophore (surfactin) biosynthesis regulatory protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 210 1 199 201 313 75.0 3e-84 MALFLQHKTDDIQWAVWKMEESLEVLLALLPDARRVFCEQELNRFVSERRKMEWLSVRVL LYAMLQEDKEIGYSPEGKPYLTDHSFFISISHTKGYVAVMLASFTPAGIDIEQYAQRVHK VSDRYIRSDEQTEPYQGDMTWGLLLHWSAKEAVFKRMENADADLRKLRLTHFIPQEQGTF QVQELATEQQELYSVGYRICPDFVLTWTLS >gi|222159277|gb|ACAB01000082.1| GENE 9 11112 - 11336 238 74 aa, chain - ## HITS:1 COG:no KEGG:BT_1494 NR:ns ## KEGG: BT_1494 # Name: not_defined # Def: hypothetical protein # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 1 74 1 74 74 134 91.0 9e-31 MEKKIEYTNGELTIIWKPELCQHAGICVKMLPNVYHPKERPWVQIENATTEELIAQISKC PSGALSYRLNKKEK >gi|222159277|gb|ACAB01000082.1| GENE 10 11345 - 11647 296 100 aa, chain - ## HITS:1 COG:DR1844 KEGG:ns NR:ns ## COG: DR1844 COG2388 # Protein_GI_number: 15806844 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Deinococcus radiodurans # 10 96 7 93 93 74 42.0 4e-14 MAEDYKLIDNEEKHRYEFQIDGKIAEIDYIKSNNGEIYLVHTEVPASLGGKGVGSQLAEK ALTDIERQGLRLVPLCPFVAGYIHKHPEWKRIVMRGIHIK >gi|222159277|gb|ACAB01000082.1| GENE 11 11817 - 13190 1415 457 aa, chain - ## HITS:1 COG:PM0811 KEGG:ns NR:ns ## COG: PM0811 COG3033 # Protein_GI_number: 15602676 # Func_class: E Amino acid transport and metabolism # Function: Tryptophanase # Organism: Pasteurella multocida # 6 455 6 455 458 472 50.0 1e-133 MELPFAESWKIKMVEPIRKSTREEREQWIKEAHYNVFQLKSEQVYIDLITDSGTGAMSDR QWAGMMLGDESYAGATSFFKLKEMITKLTGFEYIIPTHQGRAAENVLFSYLVHEGDIVPG NSHFDTTKGHIEGRHATALDCTIDAAKHTQLEIPFKGNVDPDKLQKALTEYAERIPFIIV TITNNTAGGQPVSMQNLHEVRAIADKYGKPVLFDSARFAENAYFIKMREEGYQDKTIKEI TREMFSLADGMTMSAKKDGIVNMGGFIATRRADWYEGAKGFCVQYEGYLTYGGMNGRDMN ALAIGLDENTEFDNLETRIKQVEYLAKKLDEYGIPYQRPAGGHAIFIDAPKVLTHVPKEE FPAQTLTIELYLEAGIRGCEIGYILADRDPVTHENRFNGLDLLRLAIPRRVYTDNHMNVI AAALKNVYERRESITHGVRIAWEAPLMRHFTVQLERL >gi|222159277|gb|ACAB01000082.1| GENE 12 13427 - 14212 587 261 aa, chain - ## HITS:1 COG:no KEGG:BT_1491 NR:ns ## KEGG: BT_1491 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 259 5 253 256 85 29.0 1e-15 MKKLSLLLLLSVFVFCGCSDDDDTKTYVLSLPNYETHTLDMGDQENPDDSWSVSSDWGTT NYKYNLLTDASGIFEFDCVSSTYGFYSDSFAFTNCTEKDCPDFATYDYRAITKKGVINNT YVIVGAAGYKIGKNSDKEAAIRFRDHDNSNKLESYRVKGLYLTNCVYAYNSMKEGSSIFE GKDKFGNTDSFKIIIYNMDKTQSVECTLGEGTQFVTTWKWVDLTSLGETEGLKFNIKTTK EDQWGAMTPTYFCLDGITIED >gi|222159277|gb|ACAB01000082.1| GENE 13 14249 - 16225 1480 658 aa, chain - ## HITS:1 COG:MA1904_1 KEGG:ns NR:ns ## 
COG: MA1904_1 COG3391 # Protein_GI_number: 20090753 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 95 331 109 326 361 69 29.0 2e-11 MKRKLIYIFSCWLSSIILFVACDDMEDKPFTSGIIGDPTETGTAELYVLCEGLFNQNNSS LARFSFGNQQMVRDYFKAVNRRGLGDTANDMAIYGSKIYVIVNISSTVEVIDFRTGSSLK QIQMLAENGSSRQPRYIAFHKEKAYVCSYDGTVARIDTTSLSIEAITSVGRNPDGICVQN EKLYISNSGGLDYSSGLVGVDNTVSVVDIATFKESSKLTVGPNPGKIVAGPDETVYVATR GEDVEAGDYNFVKIDCRTNKVTQSNEKVQNFAIDGEIAYLYNYNYNTQTSSIKMFNLKTE ETIRENFITDGTVIKTPYGININPYSNNVYITEARDYTTYGDLLCFNQQGQLMFRLNNIG LNPNTIAFSDKASQSDIDDNDDDKENPLAFANKVWEYRPAPGQFINTTTSAYKKGFTYDD ILEEATRRIQQKSLLTLGGFGGYIVLGFPQSIPNVTGEYDFKIKGNAYYNSKTGTGALGG SAEPGIVFVSKDVNGNGKPDDEWYELKGSEYGKDTETRGYEITYHRPNPANLKVFWKDNQ GNEGYIFRNSFHNQESYYPLWIESDEITFQGTRLKDNAVLENGLWVGYCYPWGYADNHPN SKEGSNFKIDWAVDSNGSPVDLDQIDFVKIMTAVNQDAGQMGEISTEVTTIENLHFKK >gi|222159277|gb|ACAB01000082.1| GENE 14 16249 - 18291 1320 680 aa, chain - ## HITS:1 COG:no KEGG:BT_1489 NR:ns ## KEGG: BT_1489 # Name: not_defined # Def: vitamin B12 receptor, outer membrane # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 680 1 681 681 1044 77.0 0 MKKRTFNKVIFVAFSCQFLSIPLFAQQQKVDTTHTYSIPEITVSDIYQTREVRSTAPLQV FSKDALKNLHALQVSDAVKHFAGVTVKDYGGIGGLKTVSIRSLGAQHTAVGYDGITLTDC QTGQIDIGRFSLDNVDRLSLNNGQSDNIFQPARFFASAGILNIQTLTPLFTKGKKTNIAG AFKTGSWGLINPSLLLEQQFNKTWSMSANGEWMSSDGHYPYTLHYGNAAEDLSSREKRKN TDVQTFRAEAGLYGNFSDKEQWRLKAYYFQSSRGLPKATTFYNDHSTQHLWDKNTFIQSQ YKKEFSRQWVFQTSAKWNWSYQRYLDPDTKNSLKKTENSYYQQEYYLSASVLYRLLSNLS FSLSTDGSINTMNADLANFVHPTRYAWLTAFAGKYVNNWLTISASALATVINEDVKKGGS AGNHRKLTPYVSAAFKPFQHEEFRIRFFYKDIFRLASFNDLYYEEVGNTQLKPEKAKQYN IGLTYNKNVCPFLPYLSVTVDAYYNKVTDKIIAYPTKNLAVWSMKNLGEVDIKGIDATGS LSLQPWDKIRINLSGNYTYQRALDVTPPDPNTYESTYKHQIAYTPRVSASGQAGVETPWI NLSYSFLFSGKRYALGQNIAENRLDSYSDHSISANRDFQIRKITTSFSVEVLNLMDKNYE IVKYFPMPGRSVRATIRIRY >gi|222159277|gb|ACAB01000082.1| GENE 15 18641 - 18865 80 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715024|ref|ZP_04545505.1| ## NR: 
gi|237715024|ref|ZP_04545505.1| predicted protein [Bacteroides sp. D1] # 21 74 1 54 54 97 100.0 2e-19 MGVANQCATKIGSLFVFSKGMFFFFRLDTKEKEPKRKDQGCVFSATPVLPSAKGQKLATL KQSALFDAEENTCA >gi|222159277|gb|ACAB01000082.1| GENE 16 18942 - 19886 842 314 aa, chain - ## HITS:1 COG:no KEGG:BT_1485 NR:ns ## KEGG: BT_1485 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 309 17 308 311 482 80.0 1e-135 MRTNINLIPVFCILLLCACKSGNASSQNSKEAPKDTITSFTLPAIPPMLTAPELRADFLV KHYWDNVNFADTNYIHHPEVTEQAWADYCDILNHVPLETAQQAIQKTIERTNVNKKVFTY ITDLADKYLYDPNSPMRNEEFYIPVLDAMLASPVLEEIEKVRPKARRELAQKNRIGTKAL NFNYTLASGAQGSLYQQKSDYTLLFINNPGCHACTETIEALKNAPIINQLLEQKRLTVLS IYPDEELDEWRKHLNEFPKEWVNGYDKTFAIKEQQLYDLKAIPTLYLLNKDKTVLLKDAP AQTIEEYLLMKGEQ >gi|222159277|gb|ACAB01000082.1| GENE 17 20010 - 22025 1266 671 aa, chain + ## HITS:1 COG:no KEGG:BF2729 NR:ns ## KEGG: BF2729 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 70 665 83 660 662 176 28.0 4e-42 MIKSTYIFFVFFMLTAVWSCNRPSDMQREHPAVSPTFYADSLRTLGRGIRLIKEEKESLA FPPLDSVFRLPVDDVLTTEELRRLSSESLRRLMIYFNLMMNFESGYQYFDSLERAKHPVV SRYCRRELWVAKAQMLMALDRHAEAVDYLNRAMALKDENNDPLSEIFCTATAGITYMGVD TISTRAESAFRRACRVAERSGLSNYWLYPQAIGRLADIYLQQGKYEESISLCREAIRLCE KSGSYHGKLVVAEILTEAYRLLGLYDEAFRYCAVGTGEPARAEVDNNLIGRFFIAKAEIH NNLNRPDSALLVLAQADSCFDRTKNDYYHLMLQIDRMYYLAAFPDSVNVALRGFAALEGK VPRHRLPYYDYYYGATLARVGKWLEAIPLLRKSIGELKDISELHPASEAAELLMEGYRHT GRAADILTVFPEYRVMRDSVTRKDKIRQLASANIRFETQKKEQENLLLTAEVRLQDTLLH VYFIAGVCTLLLVFFIAGWMVMRHRNLRLRWQLEAQEHERADERLRDQESRLHELIAARQ DLYERNRSLIRQLSDIQARHRNTCELDSVMESLQSHLLTRKEEEDFRNAFLSVYPSALLH LREACPAVTRSEELFCMLVLLKQSNEELARTLGISVASVSKTRYRIRVKLGLPEGSDTDA EIRHIMAGEDL >gi|222159277|gb|ACAB01000082.1| GENE 18 22128 - 22952 657 274 aa, chain + ## HITS:1 COG:no KEGG:BDI_2654 NR:ns ## KEGG: BDI_2654 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 274 1 305 305 349 58.0 7e-95 
MGRFINPFTDYGFKFLFGREVEKELLIDFLNDLLSGEHVITDIQFLNNEQQPEVKTERGI IYDIYCMTDTGERIIVEMQNREQPYFKDRALFYLSRAITQQAKSGPWDFRLDAVYGVFFM NFVIDKDMPAKIRTDVILSDRDTGQLFNNKFRQIFIELPNFDKEEDECSNDFERWIYVLK HMDTLDRMPFKARKAVFERLEKMASKANMTPEERAQYEKEWKVYNDYFNTLDFAEQKGML RGKESSARMMKSKGLAIDLISECTGLTAEEIEAL >gi|222159277|gb|ACAB01000082.1| GENE 19 23514 - 24749 748 411 aa, chain + ## HITS:1 COG:no KEGG:BT_1451 NR:ns ## KEGG: BT_1451 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 237 405 18 192 195 122 37.0 3e-26 MRMKNKQNVAWSTFRYFVLAFLLCLSVVPLRGQPDAAVLSSPSADSLIIFSYPSRIRMFK TEYADNRLSLRKLKKRLRTFRKRQDTLYVQSYSGSWDDEKSNLRAAYWRANNLKGYLIDH YGLRERHFRTLNHPLAHPLWGEVVVVGCRGFQDCQSEIPQNRQEPASAARRQEQQEALPV ALQGDKEPVDSSAVAATAVVPDAATAPVAPDAATALSTSVASSASVTPSSSTASTSTATS PSLPANGRYLGVKTNLAAWAGTIMNVAAEVQVGKHLSVELPILWCPWHISGKHAVKTFTL QPEARYWLSKPGSGHFFGLHAHVGWFNVKWNRDRYQDADRPLLGAGVSYGYLLPLGGHWA GEFTLGAGYANMKYDTYYNIDNGARIDTRTKNYWGITRMGLSIVYRFNLKK >gi|222159277|gb|ACAB01000082.1| GENE 20 24756 - 25802 1027 348 aa, chain + ## HITS:1 COG:no KEGG:BT_3148 NR:ns ## KEGG: BT_3148 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 112 337 199 422 430 137 38.0 6e-31 MKQKYAIGFITFMLATVLAACVHDYPAMTPDGEEGIDPTLVEVSTEVTLDLELLPLEIIT NKAHRGITKARAGEQTDYRRRFIIEAWRDGKPAARQVTVMDNADEAGNGKITLPVHLKLY AVEYTLAVWTDYVVAGTTTDLYYNTENLQQVTCTTPYTGSTGYRDCLYGSTTLDLRPYRD EWNVRVQAKVDMVRPLAKYRIIATDVQEFLAKTQRQRDAEGGNNTYTVTFSYGFYFPLGF NTATGKPMNSVQGVTFSTPLTIPDDGTEKCPLGSDFIFVNGTESFVPLNIELADANGKVV SRTRGLEVPYRRGHLTTLRGNFLTNEMQGGINIDTGYDDEIDIDLDSF >gi|222159277|gb|ACAB01000082.1| GENE 21 25847 - 27586 1135 579 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715030|ref|ZP_04545511.1| ## NR: gi|237715030|ref|ZP_04545511.1| predicted protein [Bacteroides sp. 
D1] # 1 579 1 579 579 1117 100.0 0 MNRSMKKNLFFGLMALAGLFTSCSQDDTDASPTADSNQVSMSIGMPADFVKTRASAPATF PAGHTLRCILEVWTKDGSTRRVRQEQLVTAGAANITFSFELADQGDYKAVLWADYIVSGA TASGDHYPDKYYKTDNADGLKKVTIITAAYTYADQLREAFAAVVPFTKGATAKNDLTATL VRPLTKVTIAEKNTEMIGKCKDMTATYTVPSEFNAFSEEVSPTATYDATYITTSMDGTDI TINGNNCKILFSDYVFTTADATLGGIKLTFTGTGSITMNDRDIPANIPLKRNNWVRAAGN LITVGNDPAVTLSVDMTTDWVSQDVTDISDIVKVGDFYYADGTWSTALDANKTCIGIVFQ TDPSRIGDKEKQVLAAKGVATPHGLVMSLKTVTKSLMSWGEDHDFSELTKCTDKVACNAD INGLLNYTTVIDYAAANNKELENFYPAFKAVKDYVVQAPEKTTGWYLPSIGQWYDFTANL GGLPSWDDAINEGNDLTPNLYRWSNQTELVSKINAYFEPLGTGNYDAIPNGSYQKFFSSS TYSDSGIWTWFVGKQANVVQCWHNVRYNSDSAVRPILAF >gi|222159277|gb|ACAB01000082.1| GENE 22 27824 - 28741 834 305 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715031|ref|ZP_04545512.1| ## NR: gi|237715031|ref|ZP_04545512.1| predicted protein [Bacteroides sp. D1] # 1 305 19 323 323 573 100.0 1e-162 MKKILVIINKNWETEPVLNALTNPKLRPAALPFPEVINTPCDGDNRMSQPRAVFSLPREG EEPLQVVVRCIEDLMATGVNTSSSLEKYKVLPQAIAADAADLIISVSTANYPDPAVTHNG TVVLGGNFFIHDGNPDSHTDPEHNLIDDRVGTFIASNVAPAVFSLAETVNARLHCYPGGA CKLVPPPNFPAPQLVCEGASTFTAVGSVNVTDYGSYDTIDAQALKEFAAAAPEGYTANSI ETTHGVVKISTSGEPILFLSPITDRLNHFNEDVTDTQNYVSAFNGGLALGELLCALAEYG GERQI >gi|222159277|gb|ACAB01000082.1| GENE 23 28957 - 29424 422 155 aa, chain + ## HITS:1 COG:no KEGG:PG0555 NR:ns ## KEGG: PG0555 # Name: not_defined # Def: histone-like family DNA-binding protein # Organism: P.gingivalis # Pathway: not_defined # 14 126 19 131 172 85 38.0 7e-16 MGFFKKVKQKINGMWYPQSITVGKPVTTDEVAKRLAIESTVSPADTFAVLKSLGSVLGSY MADGRTVKLDGVGTFYYTAVASGNGVDSPDKVTAKQITGVRVRFIPETSRSSNNQVTTRS LVDSNIFWEEWGGKSTTPSEGGGGEGGGEAPDPAA >gi|222159277|gb|ACAB01000082.1| GENE 24 29729 - 30010 112 93 aa, chain + ## HITS:1 COG:no KEGG:BT_4140 NR:ns ## KEGG: BT_4140 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 78 17 95 104 125 84.0 6e-28 MAPTAKTILYGLQARNEARSDSDIDLLILLDGEKMTLKDEESITLPLYELELKTGVSISP IVTLKKLWENRPFPPLFILTSPMKVSCYERNLG 
>gi|222159277|gb|ACAB01000082.1| GENE 25 30061 - 31842 1615 593 aa, chain - ## HITS:1 COG:CAC1701 KEGG:ns NR:ns ## COG: CAC1701 COG0642 # Protein_GI_number: 15894978 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 84 589 71 563 566 160 29.0 5e-39 MNLPVTQKHFLSFSRKLFLSVISLFLVFAICFIAYQYQREREYKIELLNTKLQDYNSRLY EQLEEQPLDSEIISGYINKHILEDLRVTLIDAEGNVVYDSYPNHNNQIENHLNRPEVQKA IKHGNGYDVRRTSETTGVPYFYSATRYKDYIVRSALPYNVSLINNLQADPHYLWFTIIVT LLLMIIFYKFTNKLGTSISQLREFAMRADRNEPIEMAMQSAFPHNELGEISQHIIQIYKR LHETKEALYIEREKLITHLQISHEGLGVFTKDKKEILVNNLFTQYSNLISDSNLETTEEV FAINELKEIIHFINKNQQERSRGKGEKRMSVTINKNGRTFIVECIIFQDASFEISINDVT QEEEQVRLKRQLTQNIAHELKTPVSSIQGYLETIVSNENIPREKINVFLERCYAQSNRLS RLLRDISVLTRMDEAASMIDMERVDISVLVGNIINEVSLELDEKHITVINSLKKSIQVKG NYSLLYSIFRNLMDNAIAYAGSNIQININCFREDENFYYFSFADTGIGVSPEHLNRLFER FYRVDKGRSRKLGGTGLGLAIVKNAVIIHGGNISAKNNQGGGLEFVFTLAKEK >gi|222159277|gb|ACAB01000082.1| GENE 26 31870 - 32559 829 229 aa, chain - ## HITS:1 COG:TM1655 KEGG:ns NR:ns ## COG: TM1655 COG0745 # Protein_GI_number: 15644403 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Thermotoga maritima # 1 226 9 240 247 162 42.0 7e-40 MNDYRILVVDDEEDLCEILKFNLENEGYEVDTANSAEEAMKMDISSYHLILLDVMMGEIS GFKMANILKKDKKTAKVPIIFITAKDTENDTVTGFNLGADDYISKPFSLREVIARVKAVL RRTVTTETERAPERLTYQSLVIDITKKKVSIDDEEVPLTKKEFEILLLLVQNKGRVFSRE DILARIWSDEVYVLDRTIDVNITRLRKKIGIYGKRIVTRLGYGYCFEAE >gi|222159277|gb|ACAB01000082.1| GENE 27 32801 - 33994 882 397 aa, chain + ## HITS:1 COG:no KEGG:BT_1481 NR:ns ## KEGG: BT_1481 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 397 1 397 397 729 93.0 0 MIKIDLKRHRKEIIGSVVVLLILLGGMSVFKYTSFNSGFEIVDDLGGNIFPSAILSVATT DAQVIVPSDSTSLGNPKSCIAVRLKSKTAYSRVRIEVAETPFFSRSVSEFVLNKPRTEYT 
IYPDIIWNYEALKNEVQAEPVSVAITVEMNGKDLGQRVRTFSVRSINECLLGYVANGTKF HDTSIFFAAYVNEENPMIDQLLREALNTRIVNRFLGYQSKAKGAVDKQVYALWNILQKRK FRYSSVSNTSLSSNVVFSQRVRTFDDALESSQINCVDGSVLFASLLRAINIDPILVRTPG HMFVGYYTDNSHTNKNFLETTMIGDVDLDDFFPDEQLDSTMVGKSQNEMSLLTFEKSKQY ANKKYKENEEGIHSGKLNYMFLEISKEVRRKIQPIGK >gi|222159277|gb|ACAB01000082.1| GENE 28 34047 - 34853 814 268 aa, chain + ## HITS:1 COG:no KEGG:BT_1480 NR:ns ## KEGG: BT_1480 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 268 1 268 268 461 85.0 1e-128 MQGKKFISPGAWFSMNYPSDWNEFEDGEGSFLFYNPEVWTGNFRISAFKGKAGYGKDVIR QELKENDSASLVKVGTWECAYSKEMFQEEGTYYTSHLWITGVDDIAFECSFTVPKGGVVK EAEDVIATLEVRKEGQKYPAELIPARLSEIYLINEAYEWVVSTVKQELKKDFQGIEEDLE KLQQVINSGKIGSKKKEEWLAIGITVCIILTNEVEGMEWKTLIDGNREAPVLQYKDRIID PLKLAWSKVKAGEPCDIIEEYKSVIINH >gi|222159277|gb|ACAB01000082.1| GENE 29 34881 - 36749 1967 622 aa, chain + ## HITS:1 COG:sll0912 KEGG:ns NR:ns ## COG: sll0912 COG0488 # Protein_GI_number: 16331003 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Synechocystis # 6 619 4 634 636 484 43.0 1e-136 MAVPYLQIDNLTKSFGDLVLFENISFGIAEGQRVGLIAKNGSGKTTLLNIIAGKEGYDSG NIVFRRDLRVDYLEQDPQYPEELTVLEACFHHGNSTVELIKEYERCMETEGHPGLENLLA RMDQEKAWEYEQKAKQILSQLKIRNFDQKVKQLSGGQLKRVALANALITEPDLLILDEPT NHLDLDMTEWLEDYLRRTNLSLLMVTHDRYFLDRVCSEIIEIDNQQIYQYKGNYSYYLEK RQERIEAKSVEIERANNLYRTELDWMRRMPQARGHKARYREDAFYELEKVAKQRFNNDNV KLEVKASYIGSKIFEADHLFKSFGDLKILDDFSYIFARYEKMGIVGNNGTGKSTFIKILM GQVAPDSGTVDVGETVRFGYYSQDGLQFDEQMKVIDVVQDIAEVIELGNGKKLTASQFLQ HFLFTPETQHSYVYKLSGGERRRLYLCTILMRNPNFLVLDEPTNDLDIITLNVLEEYLQN FKGCVIVVSHDRYFMDKVVDHLMVFNGQGDIRDFPGNYSDYRDWKDAKAQKEKEAEKPQE EKTARVRLNDKRKMSFKEKREFEQLEKEIAELETEKVQIEELLCSGTLSVDELTEKSKRL PEVNDLIDEKTMRWLELSEIEG >gi|222159277|gb|ACAB01000082.1| GENE 30 36829 - 37668 592 279 aa, chain + ## HITS:1 COG:SPCC1672.01 KEGG:ns NR:ns ## COG: SPCC1672.01 COG1387 # Protein_GI_number: 19075372 # Func_class: E 
Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Schizosaccharomyces pombe # 8 271 6 277 306 89 27.0 6e-18 MTNLTNYHSHCLYCDGRANMEDFIRFAISEGFTSYGISSHAPLPFSTAWTMEWDRMDDYL SEFSRLKKKYADKIELAIGLEIDYLNEESNPSLPCFQKLPLDYRIGSVHMLYSPEGKIVD IDTPADLFRQLVDKHFDGDLDSVVHLYYKNLLRMVELGGFDIVGHADKMHYNASCYRPGL LDEPWYDALVRKYFTAIAEYGYIVEINTKSYHDLGTFYPNKRYFSFLKELGIRVQVNSDA HYPERINNGRAEALAALKKAGFTSVAEWHNGKWEEREIE >gi|222159277|gb|ACAB01000082.1| GENE 31 37768 - 39003 834 411 aa, chain - ## HITS:1 COG:APE1887 KEGG:ns NR:ns ## COG: APE1887 COG2407 # Protein_GI_number: 14601699 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Aeropyrum pernix # 53 398 72 417 433 145 29.0 1e-34 MTIHLISFASILHKQVSLRSSHEAILSEIEKYYTVKLVDYQDMDKLSSDDFKIIFIATGG VERLVIQQFENLPRPAILLADGMQNSLAAALEISTWLRGRGMKSEILHGELPAIILRIHT LYNNFRAQRSLFGKRIGVIGTPSSWLVASNVDYLLAKRRWGIEYVDIPLERIYEQFQQIT DEQVGASCAAVASQALACREGTPEDLIKSMRLYRAIKKVCQEENLEALTLSCFKLIEQID TTGCVALSLLNDDGIIAGCEGDLQSVFTLLAVKALTGKDGFMANPSMINSRTNELILAHC TIGLKQTERYIIRNHFETEKGIAIQGLLPTGDVTIIKCGGECLDEYYLSTGTLTENTNYI NMCRTQVRIRMNTPAEYFLKNPLGNHHIMLHGNYEDTLNEFFQANACKRTE >gi|222159277|gb|ACAB01000082.1| GENE 32 39185 - 40384 1024 399 aa, chain + ## HITS:1 COG:CAC1001 KEGG:ns NR:ns ## COG: CAC1001 COG0436 # Protein_GI_number: 15894288 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Clostridium acetobutylicum # 5 394 4 393 395 357 45.0 2e-98 MPTISIRGNEMPASPIRKLAPLADAAKQRGVHVFHLNIGQPDLPTPQAAIDAIRNIDRKV LEYSPSAGYRSYREKLVGYYAKFNINLTADDIIITSGGSEAVLFSFLSCLNPGDEIIVPE PAYANYMAFAISAGAKIRTIATTIEEGFSLPKVEKFEELINERTKAILICNPNNPTGYLY TRREMNQIRDLVKKCDLFLFSDEVYREFIYTGSPYISACHLEGIENNVVLIDSVSKRYSE CGIRIGALITKNKEIRDAVMKFCQARLSPPLIGQIAAEASLDAPEEYSRETYDEYVERRK CLIDGLNRIPGVYSPIPMGAFYTVAKLPVDDSDKFCAWCLSDFEYEGQTVFMAPASGFYT 
TPGSGINEVRIAYVLKKEDLTRALFVLQKALEAYPGRTE Prediction of potential genes in microbial genomes Time: Wed May 18 03:03:23 2011 Seq name: gi|222159276|gb|ACAB01000083.1| Bacteroides sp. D1 cont1.83, whole genome shotgun sequence Length of sequence - 161465 bp Number of predicted genes - 146, with homology - 143 Number of transcription units - 79, operones - 35 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 1137 - 1196 4.3 2 2 Op 1 1/0.067 + CDS 1239 - 2705 2438 ## PROTEIN SUPPORTED gi|29346884|ref|NP_810387.1| ribosomal protein S6 modification protein-related protein 3 2 Op 2 . + CDS 2705 - 3163 223 ## PROTEIN SUPPORTED gi|116624156|ref|YP_826312.1| SSU ribosomal protein S18P alanine acetyltransferase - Term 3070 - 3109 8.6 4 3 Op 1 . - CDS 3166 - 4338 652 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 5 3 Op 2 . - CDS 4395 - 5420 999 ## COG0628 Predicted permease - Prom 5546 - 5605 4.0 6 4 Tu 1 . - CDS 5721 - 6347 606 ## BF4289 hypothetical protein - Prom 6369 - 6428 2.0 7 5 Op 1 . + CDS 6762 - 7514 745 ## COG2186 Transcriptional regulators 8 5 Op 2 . + CDS 7562 - 8578 437 ## COG3055 Uncharacterized protein conserved in bacteria 9 5 Op 3 . + CDS 8615 - 9526 722 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase + Prom 9528 - 9587 4.0 10 6 Op 1 . + CDS 9643 - 11283 936 ## COG4409 Neuraminidase (sialidase) 11 6 Op 2 . + CDS 11290 - 12522 1037 ## COG0477 Permeases of the major facilitator superfamily 12 6 Op 3 . + CDS 12548 - 15832 1891 ## BF3939 hypothetical protein 13 6 Op 4 . + CDS 15852 - 17306 992 ## BF3938 hypothetical protein 14 6 Op 5 . + CDS 17293 - 18453 540 ## COG4409 Neuraminidase (sialidase) 15 6 Op 6 . + CDS 18472 - 20043 865 ## COG3291 FOG: PKD repeat + Term 20048 - 20109 11.2 16 7 Tu 1 . + CDS 20111 - 22195 1107 ## COG1472 Beta-glucosidase-related glycosidases + Term 22263 - 22303 5.4 + Prom 22225 - 22284 5.7 17 8 Tu 1 . 
+ CDS 22326 - 22931 310 ## gi|295087822|emb|CBK69345.1| Protein of unknown function (DUF2500). + Term 23130 - 23176 2.4 + Prom 23037 - 23096 7.1 18 9 Op 1 . + CDS 23218 - 25695 1887 ## BT_1460 hypothetical protein 19 9 Op 2 . + CDS 25753 - 26835 566 ## BT_1459 two-component system sensor 20 9 Op 3 9/0.000 + CDS 26889 - 27941 503 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 21 9 Op 4 . + CDS 27938 - 28654 602 ## COG3279 Response regulator of the LytR/AlgR family + Term 28729 - 28764 0.5 - Term 28766 - 28812 13.6 22 10 Tu 1 . - CDS 28842 - 29339 555 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 29445 - 29504 4.7 + Prom 29669 - 29728 8.0 23 11 Tu 1 . + CDS 29855 - 31252 1240 ## COG1690 Uncharacterized conserved protein + Prom 31298 - 31357 2.9 24 12 Op 1 . + CDS 31384 - 32118 722 ## BF2709 hypothetical protein 25 12 Op 2 . + CDS 32093 - 33196 793 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase + Term 33244 - 33297 2.1 - Term 33379 - 33434 2.3 26 13 Tu 1 . - CDS 33521 - 35020 1443 ## COG1620 L-lactate permease - Prom 35112 - 35171 2.5 + Prom 35435 - 35494 7.5 27 14 Op 1 2/0.067 + CDS 35533 - 37077 1664 ## COG4799 Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) 28 14 Op 2 . + CDS 37135 - 38646 1762 ## COG0439 Biotin carboxylase 29 14 Op 3 . + CDS 38676 - 39200 621 ## COG1038 Pyruvate carboxylase + Term 39223 - 39283 13.2 + Prom 39202 - 39261 2.1 30 15 Op 1 . + CDS 39288 - 41645 1946 ## COG0642 Signal transduction histidine kinase 31 15 Op 2 . + CDS 41653 - 42093 220 ## gi|237715074|ref|ZP_04545555.1| predicted protein + Term 42106 - 42140 -0.5 + Prom 42100 - 42159 7.0 32 16 Tu 1 . + CDS 42223 - 42909 486 ## Fjoh_2065 hypothetical protein + Prom 42994 - 43053 3.6 33 17 Op 1 . + CDS 43126 - 43440 310 ## BT_4140 hypothetical protein 34 17 Op 2 . 
+ CDS 43437 - 43859 251 ## COG1895 Uncharacterized conserved protein related to C-terminal domain of eukaryotic chaperone, SACSIN + Term 43878 - 43906 -0.0 35 18 Tu 1 . + CDS 43936 - 45495 1334 ## BT_0374 hypothetical protein + Term 45587 - 45623 -0.8 - Term 45472 - 45523 7.1 36 19 Op 1 9/0.000 - CDS 45539 - 46186 630 ## COG0132 Dethiobiotin synthetase 37 19 Op 2 5/0.000 - CDS 46183 - 46950 420 ## COG0500 SAM-dependent methyltransferases 38 19 Op 3 5/0.000 - CDS 46961 - 47620 456 ## COG2830 Uncharacterized protein conserved in bacteria 39 19 Op 4 6/0.000 - CDS 47617 - 48771 760 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes - Prom 48791 - 48850 3.7 - Term 48899 - 48955 6.4 40 20 Op 1 . - CDS 48993 - 51407 1686 ## COG0161 Adenosylmethionine-8-amino-7-oxononanoate aminotransferase - Prom 51434 - 51493 2.3 41 20 Op 2 . - CDS 51499 - 52608 851 ## BT_1441 hypothetical protein - Prom 52711 - 52770 4.4 + Prom 52674 - 52733 4.0 42 21 Op 1 . + CDS 52766 - 55894 2945 ## BT_1440 hypothetical protein 43 21 Op 2 . + CDS 55909 - 57390 1406 ## BT_1439 hypothetical protein + Term 57430 - 57474 9.2 44 22 Tu 1 . - CDS 57523 - 58728 865 ## COG0477 Permeases of the major facilitator superfamily - Prom 58878 - 58937 7.5 + Prom 58801 - 58860 5.8 45 23 Op 1 . + CDS 58890 - 59135 336 ## BF4188 hypothetical protein 46 23 Op 2 . + CDS 59180 - 59467 406 ## BT_1435 hypothetical protein + Term 59479 - 59523 6.5 + Prom 59710 - 59769 6.9 47 24 Op 1 4/0.000 + CDS 59855 - 60928 836 ## COG1609 Transcriptional regulators 48 24 Op 2 3/0.000 + CDS 60950 - 61762 1047 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) + Prom 61804 - 61863 2.5 49 24 Op 3 . + CDS 61889 - 63058 1258 ## COG1312 D-mannonate dehydratase + Term 63083 - 63147 16.2 50 25 Tu 1 . - CDS 63133 - 63729 505 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases - Prom 63774 - 63833 6.1 51 26 Op 1 . 
+ CDS 63916 - 66276 1510 ## COG0642 Signal transduction histidine kinase 52 26 Op 2 . + CDS 66316 - 67794 1093 ## COG3303 Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit 53 26 Op 3 1/0.067 + CDS 67832 - 68656 341 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 68658 - 68717 2.4 54 26 Op 4 . + CDS 68738 - 69154 447 ## COG3871 Uncharacterized stress protein (general stress protein 26) + Term 69195 - 69253 10.2 - Term 69186 - 69238 12.5 55 27 Tu 1 . - CDS 69287 - 69562 430 ## BT_1428 hypothetical protein - Prom 69681 - 69740 7.2 + Prom 69637 - 69696 7.0 56 28 Tu 1 . + CDS 69894 - 70733 544 ## BT_1427 tetracycline resistance element mobilization regulatory protein RteC + Prom 70736 - 70795 2.5 57 29 Op 1 . + CDS 70842 - 71444 261 ## COG4332 Uncharacterized protein conserved in bacteria + Prom 71446 - 71505 2.1 58 29 Op 2 . + CDS 71529 - 72230 643 ## BT_1425 hypothetical protein + Term 72288 - 72325 5.2 + Prom 72341 - 72400 4.5 59 30 Tu 1 . + CDS 72448 - 73536 543 ## COG1162 Predicted GTPases + Term 73549 - 73590 10.4 - Term 73537 - 73578 6.6 60 31 Tu 1 . - CDS 73637 - 74209 214 ## PROTEIN SUPPORTED gi|52841322|ref|YP_095121.1| nucleotidyltransferase PLUS glutamate rich protein GrpB PLUS ribosomal protein alanine acetyltransferase - Prom 74293 - 74352 4.9 61 32 Tu 1 . - CDS 74389 - 74856 349 ## COG2110 Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 - Prom 74901 - 74960 4.3 62 33 Tu 1 . - CDS 74963 - 75553 183 ## gi|237715104|ref|ZP_04545585.1| predicted protein - Prom 75598 - 75657 2.7 63 34 Tu 1 . - CDS 75663 - 76382 618 ## COG4884 Uncharacterized protein conserved in bacteria - Prom 76414 - 76473 4.5 + Prom 77102 - 77161 5.5 64 35 Op 1 . + CDS 77320 - 77706 282 ## BT_2537 hypothetical protein 65 35 Op 2 . + CDS 77713 - 78258 284 ## gi|237715107|ref|ZP_04545588.1| predicted protein 66 36 Op 1 . - CDS 78241 - 78729 231 ## BT_2536 hypothetical protein 67 36 Op 2 . 
- CDS 78768 - 79331 269 ## HTH_0737 signal peptidase I 68 36 Op 3 . - CDS 79312 - 79650 354 ## gi|237715110|ref|ZP_04545591.1| predicted protein 69 36 Op 4 . - CDS 79670 - 80209 360 ## BT_0512 hypothetical protein 70 36 Op 5 . - CDS 80245 - 80733 523 ## BF1788 hypothetical protein - Prom 80969 - 81028 7.6 + Prom 80797 - 80856 4.7 71 37 Tu 1 . + CDS 80935 - 81156 189 ## gi|237715113|ref|ZP_04545594.1| predicted protein 72 38 Op 1 . - CDS 81337 - 81768 147 ## BT_0513 hypothetical protein 73 38 Op 2 . - CDS 81828 - 82724 390 ## gi|237715115|ref|ZP_04545596.1| predicted protein 74 38 Op 3 . - CDS 82802 - 83308 292 ## gi|237715116|ref|ZP_04545597.1| predicted protein - Prom 83329 - 83388 2.7 75 38 Op 4 . - CDS 83398 - 83700 200 ## gi|262408935|ref|ZP_06085480.1| predicted protein - Prom 83765 - 83824 3.3 76 39 Tu 1 . - CDS 83827 - 84003 195 ## gi|295087761|emb|CBK69284.1| hypothetical protein - Prom 84148 - 84207 8.1 77 40 Tu 1 . - CDS 84418 - 84531 82 ## - Prom 84677 - 84736 9.0 + Prom 84610 - 84669 8.1 78 41 Tu 1 . + CDS 84769 - 85005 211 ## BF3342 putative exported beta-lactamase protein + Prom 85128 - 85187 4.4 79 42 Tu 1 . + CDS 85376 - 87622 1972 ## Fjoh_4747 hypothetical protein + Term 87669 - 87714 4.1 + Prom 87730 - 87789 5.3 80 43 Op 1 . + CDS 87892 - 88509 532 ## BT_1424 hypothetical protein 81 43 Op 2 . + CDS 88587 - 90299 1710 ## BT_4445 hypothetical protein + Term 90324 - 90371 -0.2 + Prom 90340 - 90399 5.9 82 44 Tu 1 . + CDS 90571 - 90885 58 ## BT_1421 hypothetical protein + Term 90907 - 90948 6.4 + Prom 90894 - 90953 4.3 83 45 Op 1 . + CDS 90973 - 93132 1695 ## COG1629 Outer membrane receptor proteins, mostly Fe transport 84 45 Op 2 . + CDS 93153 - 93620 369 ## BT_1419 hypothetical protein + Term 93644 - 93684 0.2 + Prom 93664 - 93723 3.6 85 46 Op 1 1/0.067 + CDS 93774 - 94373 253 ## COG3005 Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit 86 46 Op 2 . 
+ CDS 94413 - 95894 1294 ## COG3303 Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit 87 46 Op 3 . + CDS 95918 - 97162 1044 ## BT_1416 hypothetical protein 88 46 Op 4 . + CDS 97159 - 97950 643 ## COG0755 ABC-type transport system involved in cytochrome c biogenesis, permease component 89 46 Op 5 . + CDS 97980 - 99260 1136 ## BT_1414 hypothetical protein + Term 99325 - 99373 3.1 - Term 99311 - 99359 3.1 90 47 Tu 1 . - CDS 99439 - 100128 367 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Prom 100196 - 100255 8.0 + Prom 100155 - 100214 7.9 91 48 Tu 1 . + CDS 100375 - 100914 496 ## COG0655 Multimeric flavodoxin WrbA + Term 101015 - 101066 10.4 + Prom 101088 - 101147 10.9 92 49 Op 1 . + CDS 101167 - 101988 566 ## COG1237 Metal-dependent hydrolases of the beta-lactamase superfamily II 93 49 Op 2 . + CDS 102006 - 105365 2869 ## COG0793 Periplasmic protease + Term 105396 - 105452 10.6 - Term 105382 - 105440 7.2 94 50 Tu 1 . - CDS 105522 - 106193 477 ## COG2364 Predicted membrane protein - Prom 106267 - 106326 4.3 + Prom 106062 - 106121 3.4 95 51 Tu 1 . + CDS 106349 - 107239 559 ## COG2207 AraC-type DNA-binding domain-containing proteins 96 52 Tu 1 . - CDS 107409 - 109166 1824 ## COG1154 Deoxyxylulose-5-phosphate synthase - Prom 109253 - 109312 7.8 + Prom 109146 - 109205 5.5 97 53 Tu 1 . + CDS 109345 - 110112 453 ## COG0500 SAM-dependent methyltransferases + Term 110304 - 110341 2.2 + Prom 110323 - 110382 6.5 98 54 Op 1 . + CDS 110524 - 111555 847 ## COG1073 Hydrolases of the alpha/beta superfamily 99 54 Op 2 . + CDS 111611 - 112474 654 ## BT_1399 hypothetical protein 100 54 Op 3 . + CDS 112478 - 113635 816 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities 101 54 Op 4 1/0.067 + CDS 113706 - 114632 795 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Prom 114634 - 114693 1.8 102 54 Op 5 . 
+ CDS 114718 - 115293 255 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 103 55 Op 1 . - CDS 115447 - 116358 701 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 104 55 Op 2 . - CDS 116381 - 116599 65 ## - Prom 116625 - 116684 2.0 + Prom 116341 - 116400 5.8 105 56 Op 1 2/0.067 + CDS 116544 - 116939 392 ## COG1359 Uncharacterized conserved protein + Prom 116966 - 117025 3.5 106 56 Op 2 . + CDS 117055 - 118260 1044 ## COG0599 Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit + Term 118493 - 118532 9.1 + Prom 118698 - 118757 8.7 107 57 Tu 1 . + CDS 118937 - 120523 1505 ## BDI_2898 hypothetical protein + Term 120579 - 120618 -0.7 + Prom 120690 - 120749 8.8 108 58 Tu 1 . + CDS 120926 - 122059 1135 ## BT_1391 hypothetical protein + Term 122083 - 122127 9.0 + Prom 122124 - 122183 5.3 109 59 Op 1 . + CDS 122275 - 123348 1241 ## COG3831 Uncharacterized conserved protein 110 59 Op 2 . + CDS 123352 - 124362 596 ## BF1940 hypothetical protein 111 59 Op 3 . + CDS 124349 - 125203 597 ## BF1879 putative membrane-associated metal-dependent hydrolase 112 59 Op 4 . + CDS 125200 - 126474 671 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases 113 59 Op 5 . + CDS 126480 - 128015 614 ## BF1943 hypothetical protein + Term 128179 - 128239 17.3 + Prom 128034 - 128093 9.4 114 60 Op 1 . + CDS 128276 - 129640 926 ## COG0534 Na+-driven multidrug efflux pump 115 60 Op 2 . + CDS 129681 - 130229 496 ## BT_1386 hypothetical protein + Term 130278 - 130330 -0.8 + Prom 130269 - 130328 2.5 116 61 Op 1 1/0.067 + CDS 130352 - 131230 412 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 131234 - 131293 3.9 117 61 Op 2 . + CDS 131322 - 131744 373 ## COG3871 Uncharacterized stress protein (general stress protein 26) + Term 131952 - 132024 14.1 118 62 Tu 1 . - CDS 131783 - 132022 105 ## - Prom 132117 - 132176 6.3 119 63 Tu 1 . 
- CDS 132321 - 132917 548 ## COG0693 Putative intracellular protease/amidase - Prom 132990 - 133049 7.8 + Prom 132939 - 132998 5.6 120 64 Tu 1 . + CDS 133132 - 133299 68 ## gi|295087721|emb|CBK69244.1| hypothetical protein + Term 133379 - 133446 17.1 - TRNA 133447 - 133520 85.5 # Asp GTC 0 0 - TRNA 133576 - 133649 85.5 # Asp GTC 0 0 121 65 Op 1 . + CDS 134093 - 134947 801 ## COG0788 Formyltetrahydrofolate hydrolase 122 65 Op 2 25/0.000 + CDS 134947 - 135537 518 ## COG0118 Glutamine amidotransferase + Prom 135549 - 135608 8.6 123 65 Op 3 23/0.000 + CDS 135629 - 136348 799 ## COG0106 Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase + Term 136368 - 136406 3.1 + Prom 136394 - 136453 6.9 124 66 Op 1 24/0.000 + CDS 136473 - 137228 743 ## COG0107 Imidazoleglycerol-phosphate synthase + Prom 137279 - 137338 4.2 125 66 Op 2 . + CDS 137358 - 137969 649 ## COG0139 Phosphoribosyl-AMP cyclohydrolase 126 66 Op 3 . + CDS 138012 - 138740 289 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 127 66 Op 4 1/0.067 + CDS 138754 - 140073 1131 ## COG0527 Aspartokinases + Term 140089 - 140131 6.3 128 67 Tu 1 . + CDS 140152 - 141312 1258 ## COG0019 Diaminopimelate decarboxylase + Prom 141326 - 141385 4.3 129 68 Tu 1 . + CDS 141455 - 141934 649 ## COG1528 Ferritin-like protein + Term 141972 - 142015 6.3 + Prom 141990 - 142049 4.5 130 69 Tu 1 . + CDS 142089 - 144353 1610 ## COG0642 Signal transduction histidine kinase 131 70 Tu 1 . - CDS 144387 - 144794 493 ## BT_1372 hypothetical protein - Prom 144825 - 144884 3.9 132 71 Tu 1 . - CDS 144967 - 146160 1350 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes - Prom 146253 - 146312 3.3 + Prom 146182 - 146241 5.0 133 72 Op 1 . + CDS 146297 - 147250 745 ## COG0451 Nucleoside-diphosphate-sugar epimerases 134 72 Op 2 . + CDS 147250 - 148098 745 ## BT_1369 hypothetical protein + Prom 148117 - 148176 4.4 135 73 Op 1 . 
+ CDS 148200 - 149192 950 ## COG0812 UDP-N-acetylmuramate dehydrogenase 136 73 Op 2 . + CDS 149203 - 149961 606 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I + Term 149969 - 150003 -0.8 + Prom 149965 - 150024 3.1 137 73 Op 3 . + CDS 150045 - 151619 789 ## BT_1366 hypothetical protein + Term 151835 - 151880 -0.8 138 74 Tu 1 . - CDS 151624 - 151812 86 ## gi|262408873|ref|ZP_06085418.1| predicted protein - Prom 151941 - 152000 9.7 - Term 152025 - 152071 3.3 139 75 Tu 1 . - CDS 152114 - 152488 296 ## BT_1365 hypothetical protein - Prom 152640 - 152699 4.1 + Prom 152471 - 152530 4.6 140 76 Tu 1 . + CDS 152643 - 153767 1134 ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog) + Prom 153801 - 153860 1.6 141 77 Op 1 . + CDS 153880 - 154659 811 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases 142 77 Op 2 . + CDS 154659 - 155867 1029 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase 143 77 Op 3 . + CDS 155894 - 157555 1825 ## COG0497 ATPase involved in DNA repair + Prom 157577 - 157636 7.5 144 78 Op 1 . + CDS 157658 - 158401 392 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 145 78 Op 2 . + CDS 158403 - 160109 1706 ## COG0457 FOG: TPR repeat + Term 160145 - 160192 10.9 + Prom 160215 - 160274 9.7 146 79 Tu 1 . 
+ CDS 160478 - 161422 444 ## BT_0595 integrase Predicted protein(s) >gi|222159276|gb|ACAB01000083.1| GENE 1 2 - 1108 868 368 aa, chain + ## HITS:1 COG:RSc1117 KEGG:ns NR:ns ## COG: RSc1117 COG4591 # Protein_GI_number: 17545836 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ABC-type transport system, involved in lipoprotein release, permease component # Organism: Ralstonia solanacearum # 3 368 53 416 416 100 23.0 4e-21 VIGFKSEVRNKVIGFGSHIQITNLDAVSSYETHPIVVGDSMMTALADYPEISHVQRFSTK PGMIKTDDAFQGMVLKGVGPEFDPHFIKEYLVEGEIPVFSDSVSTNQVLISKALATKMKL KLGDKIYTYYIQDDIRARRLTIAGIYQTNFSEYDNLFLLTDLNLVNRLNGWQPEQVTGVE LQVKDYDKLEDITYEIATDIDNRQDELGGVYYVRNIEQLNPQIFAWLDLLDLNVWVILIL MIGVAGFTMISGLLIIIIERTNMIGILKALGANNFTIRRTFLWFAVFLIGKGMLWGNAIG LAFCILQSQFGLFKLDPETYYVDTVPVSFNVLLFILINLGTLFASVLMLIGPSFLITKIN PASSMRYE >gi|222159276|gb|ACAB01000083.1| GENE 2 1239 - 2705 2438 488 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29346884|ref|NP_810387.1| ribosomal protein S6 modification protein-related protein [Bacteroides thetaiotaomicron VPI-5482] # 1 488 1 488 488 943 93 0.0 MNNVLILLDNLDDWKPYYETSSVLTVSDYLKNKPVEKDRKLVINLSDDYSYNSEGYYCSL LAQTRGQKVIPDVDIINKLETGTGVRMDRSLQALCYQWIQKNNVKDDIWYLNIYFGKCRE KGLERIARFIFENYPCPLLRVALNTHPRNQIESIQFLPLNRLNDEEQDFFANTLDNFNKK IWRAPKSAKASRYSLAVLVDPQEKFPPSNKGALHKLAEVAKKMNIHVEMITEDDAIRLLE FDALFIRTTTSLNHYTFHLSQLAAQNGMAVIDDPLSIIRCTNKVYLKELFEKEKISAPKS TLIFQSNDHSFEQISEQVGAPFILKIPDGSYSIGMKKVSNEEELQASLKMLFEKSAILLA QAFTPTEFDWRVGLLNGVPLYACKYYMAKGHWQIYCHYDSGRSRCGLVDTIPIYQVPRVV LDTAVKAANLIGKGLYGVDLKMVDDKAYVIEINDNPSIDHGLEDAIIGDEMYYRLLNHFE QALEAKHY >gi|222159276|gb|ACAB01000083.1| GENE 3 2705 - 3163 223 152 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116624156|ref|YP_826312.1| SSU ribosomal protein S18P alanine acetyltransferase [Solibacter usitatus Ellin6076] # 2 143 1 144 152 90 38 4e-17 MMETGLFVRKAQQSDIPAILEIEWECFREDSFSKEQFAYLISRSKGTFYVMMEGDRVIAY VSLLFHGGTHYLRIYSIAVHPDFRGKGLGQALMDQTIRTANECKAAKITLEVKVTNAAAI ALYMKNGFIPAGIKPCYYHDGSDAIYMQRLIP 
>gi|222159276|gb|ACAB01000083.1| GENE 4 3166 - 4338 652 390 aa, chain - ## HITS:1 COG:Ta1048 KEGG:ns NR:ns ## COG: Ta1048 COG0463 # Protein_GI_number: 16082079 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Thermoplasma acidophilum # 57 296 7 212 256 76 25.0 1e-13 METFTFNTVELILLSAAGILFIIQLIYYFGLYNRIHAHNKAVRKEEVHFSRELPPLSVIL CARNEAENLRKILPAILEQDYPQFEVIVINDASTDETEDILGMMEEKYPHLYHSFTPESA RYISHKKLALTLGIKASKHDWLVFTETNCMPASNQWLKLMARNFTPQTQIVLGYSGYDRT KGWLHKRTAFDTLFQSLRYLGFALAGKPYMGIGRNLAYRKELFFQQKGFSKYLNLQRGED DLFINQLATPSNTRVETDINATTRINPVYRYKEWKEEKISYMATARYYQGIQRYLLGFET FSRLLFYVSCIAGIASGVLNSHWLVAGIALLIWLLRFIMQIVVINQTAKEMGGNRKYYFS LPLFDLLQPIQSLNFKICRFFRGKGDFMRR >gi|222159276|gb|ACAB01000083.1| GENE 5 4395 - 5420 999 341 aa, chain - ## HITS:1 COG:VC0624 KEGG:ns NR:ns ## COG: VC0624 COG0628 # Protein_GI_number: 15640644 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Vibrio cholerae # 165 331 186 351 361 98 35.0 1e-20 MSTKEQYWKYSLIVIILFMGIIIFRQITPFLGGLLGALTIYILVRGQMRYLVEKRKLKRS LSALLITAETIFVFLIPLGLTVWMVVNKLQDINLDPQTYIAPIQQVAEFIKEKTGYDVLG KDTLTFIVSILPRIGQIIMESISSLAINLFVMIFVLYFMLIGGKKMEAYVNDILPFNETN TQEVIHEINMIVRSNAIGIPLLAIIQGGVATIGYLLFGAPNILLLGFLTCFATIIPMVGT ALVWFPVAAYLAISGDWFNAIGIAAYGAIVVSQSDNLIRFILQKKMADTHPLITIFGVVI GLPLFGFMGVIFGPLLLALFFLFVDMFKKEYLDLRNNLPSR >gi|222159276|gb|ACAB01000083.1| GENE 6 5721 - 6347 606 208 aa, chain - ## HITS:1 COG:no KEGG:BF4289 NR:ns ## KEGG: BF4289 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 197 3 198 206 157 45.0 3e-37 MAYYDFKKKPALTTKEGEKDVLYPSIVFNGTINTKQLLKQLVARTGYKPGVVEGTLMELV DLVGEYIGQGYRVEVGEFGYFSGKIKSRLVKDKKDLRSPSIQFNGVNFLASKTFKKKATG KLERAQKLFFQASSQLDDEELKRRLLEHVNRYGFITRTTYTELTGRLKNKALEDLKRFAK EGIICKVGRGNQMLFVKNQEDSQTEQNI >gi|222159276|gb|ACAB01000083.1| GENE 7 6762 - 7514 745 250 aa, chain + ## HITS:1 COG:AGl3126 KEGG:ns NR:ns ## COG: AGl3126 COG2186 # 
Protein_GI_number: 15891681 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 241 1 238 238 72 26.0 9e-13 MKIEINQTTLIDQVEDSLLTYFKKNDLRRGDSIPNENNLAAELGVARSVVREALSRLKMM GLIHARPRKGMVLTEPSILGGMKRVIDPRVLSEETILDLLDFRIALEIGISSDIFRKITP KDIEELSEIVKMGIVFENNEYALISESAFHTKLYKITGNKIISEFQEIIHPILVYVKEKF KDYLKPINIEMSKSGKIATHADLLDFIKKGDEKGYRDAIERHFEVYKIFKVNRSQELMAE KESSEKVEGI >gi|222159276|gb|ACAB01000083.1| GENE 8 7562 - 8578 437 338 aa, chain + ## HITS:1 COG:FN1470 KEGG:ns NR:ns ## COG: FN1470 COG3055 # Protein_GI_number: 19704802 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 17 336 24 367 372 103 24.0 3e-22 MNTSCLRKKEKRIEVQWENSLLLPGCAGMPENVGLAGAYSGIVEGKLLVLGGANFPDKYP WEGGTKTWWSTLYSYDLQTGKWTVYDDFLDRPLAYGVSISLPEGLLCIGGCDRTQCSDNV FLIKKEEDSFVIDSVSYPSLPVPLANATGAMGDNCIYIAGGQETMVNEQSTHHFYMLDLM HKERGWQEMPDWNGPSLSYAVGVAQGERFYLFSGRSYAPDEAMVEHTEGYVFEPGIGKWS KMIGSFPVMAGTGIPYGEDKILLFGGVEEILPTSSEHPGFSRKLRVVSTSTNSLVDSLEC PYRIPVTTNVVSVGNQVFIVSGEVQPGIRTPFILKGSF >gi|222159276|gb|ACAB01000083.1| GENE 9 8615 - 9526 722 303 aa, chain + ## HITS:1 COG:VC1776 KEGG:ns NR:ns ## COG: VC1776 COG0329 # Protein_GI_number: 15641779 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Vibrio cholerae # 5 297 2 288 298 219 39.0 6e-57 MNNYERLEGMVAATFTPMDENGDVNLSVIEKYADWIASTPIKGVFVCGTTGEFSSLTIDE RKLILEKWVASAGKRFKVIAHVGSNCQRSAMELACHAEKAGADAIASIAPSFFKPGTVDE LIDFFAPVCRSASELPFYYYNMPSITGVNLPVDRFLVEGKKKIPNLVGTKFTHNNLMEMG ACIDLEQHKFEVLHGFDEILIAGLSMGAVAGVGSTYNYIPNVYHAIFESMKQNDVETARA WQMKSIRTVEVIIKYGGGVRGGKAIMKLIGIDCGSCRLPIKPFSIEEYDKLKGDLDAINF FDF >gi|222159276|gb|ACAB01000083.1| GENE 10 9643 - 11283 936 546 aa, chain + ## HITS:1 COG:Cgl1519 KEGG:ns NR:ns ## COG: Cgl1519 COG4409 # Protein_GI_number: 19552769 # Func_class: G Carbohydrate 
transport and metabolism # Function: Neuraminidase (sialidase) # Organism: Corynebacterium glutamicum # 182 497 81 373 399 94 30.0 7e-19 MKVRREILIIFSMILLLVLPATSLGKEIKWVLERPVIPVLVKKPASPVLKVTLIRADNQP YAIQQIDLDLLGSTDVADVVSVAIYGTQENGLIDTSRLLYKALPAARKISFTDKVQVNQD SLSFWVAVTLKDTVSLDHRIQLNCNRIKTTKGNLKISEKGSKPLRVGVAVRQKGQDGCVS SRIPGLATSNQGTLLAIFDARYDYSRDLQGNIDIALHRSTDQGLTWQPVQTVLDMGEWGG LPQKYNGVSDACILVDKNTGDIYVAGLWMHGLLDKDGKWIEGLNESSTVWTHQWKGKGSQ PGTGLKETCQFMIAKSTDDGLSWSFPDNITAKTKHPEWWLFAPAPGQGITLKDGTLVFPT QGRDEKGLPFSNITYSKDHGKTWVTSNSAYQDVTECSVVQLNDGALMLNMRDNRNRGHKE VNGRRICTTTDLGASWKEHPTSRKALVEPTCMASLHRHEYIEEGKKKSMLLFVNPNDYGK RDKLTLKVSFDDGMTWPEEHWILFDQYRSAGYSCITSIDENSIGILYESSQSDLAFIKID LTEILK >gi|222159276|gb|ACAB01000083.1| GENE 11 11290 - 12522 1037 410 aa, chain + ## HITS:1 COG:CC2486 KEGG:ns NR:ns ## COG: CC2486 COG0477 # Protein_GI_number: 16126725 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Caulobacter vibrioides # 6 379 37 433 519 92 23.0 1e-18 MNNKTSNIYPWVVVGLLWGVALLNYMDRQMLSTMRIPMMEDVRELESAANFGRLMAVFLW VYGLMSPLSGIIGDRVNRKWLIVGSLCVWSGVTYLMGYATTFNQLYWLRGIMGISEALYL PAALSLIADFHKDKTRSLAVGIHMTGLYVGQAIGGFGATFAAIYSWHTTFHWFGIIGVGY GIILAFFLRDKERGNVSENQKMKKIPVLKSLGMLFSNVFFWVILFYFCVPGTPGWAAKNW LPTLFSDSLSIDISVAGPMSTISIALSSLFGVLAGGYISDRWVLKNVRGRVYTGALGLGL IIPSLLFIGYGHSIFALVMGAVLFGIGFGMFDANNMPILCQFVSARYRATAYGIMNMCGV FAGAAITSLLGESMDAGHLGRDFALLAILVLAMLVILVTCLRPKTIDMKD >gi|222159276|gb|ACAB01000083.1| GENE 12 12548 - 15832 1891 1094 aa, chain + ## HITS:1 COG:no KEGG:BF3939 NR:ns ## KEGG: BF3939 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 11 1094 10 1098 1098 844 44.0 0 MKNSIKHTNKILSVLFAFLFISLNIQADDISSKEIKVSGIVQDQTGVELPGVSISVKGID KGTITNASGEFSIMVSPDATLIVSFVGMETKEIAVKGRRNIVIVLKESSVLLDEVVAIGY 
GKQSRALITNSISKINKEEFQKAPGQNPLLQLQGKVPGLSLQISSGQPGADPQLFIRGGS STSPESDTPLIIIDGIISQGFRNISDMNPADIESIEVLKDAASTAIYGSRAANGIILVKT KSGQKGKPVVSLRYTYGVEQQPQRIPLLNARDYITLSRSNIAKFNQADLTYNGKEDQAKF LSGSFGMSTGNPRNSKNTLEFLDVYLQKYGQGYVSNLLENEGWQTMADPVTGKQLIFQDN DFQKATFTTGQKHEIDLSISGGTEAINYYVGLRYLNQDGILRGTNYKNYSVLFNGNYKLS EAWSLSTKASLQVRDAVGGGNTVNTISRSILTPPTYRLYYEDGTPAPGEGISSFRSRLHE IYYKTNYDDTNVYRTTFQLGAIWNILPGLILKPTAYYFGTEGIENYFQADNETTGNTIRP ASAKHNFDRHLQGDLVLSYDKKVKDHNIGVVAGASYTHDYSYRLSASGSGSSIDLIPTLN ATADSTQRASSTKTMEATLSYFGRVNYDYNGKYLFSISMRADGSSRFAEDNKWGFFPGVS AGWNMHRENFYKPLEAIVSRWKWRASWGRTGNNNLSVANSRGEYKITDTNYQGSVGILNT TLKNSQLRWETTESYDIGVDLGFFNNRLGLLIDYYNKLTFDRLYDEPLWSSTGFSSIKSN YGSVRNSGVEIELNATPIQTKDFSWDLGLTFAYNKGVVVDLPDNGEEKHRVGGNFVYDPA TGGTKKVGGIAEGERFGGRWAFHYLGTYQTEEEAAKAPNDPNAQGRKKHAGDAIFEDVNN DGQLDSKDMIFMGYVRPDKVGGIINTLKYKGLTVRIVMDWAMGHVIDNGFKGQIMGSSRN NNNAIKEAMTNSWQSANDGSKYPKYTVQSDYDYQYRNHMRWDNQIGSSASGSTNNSLYFS KGDYLAFREVSLSYMLPLSWIRKMRLSAVEVFAGAYNIGYIKKYDGMFPEIYTGVDYGIY PRPRQYNMGVKINF >gi|222159276|gb|ACAB01000083.1| GENE 13 15852 - 17306 992 484 aa, chain + ## HITS:1 COG:no KEGG:BF3938 NR:ns ## KEGG: BF3938 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 483 1 482 486 375 42.0 1e-102 MKQFKSILLGISLFLLHSCNLLDVDTVSSITGDGYWNTKGDVESYMIGIYTKLRDTSNST LHFEDRGDAFTTGLEGGPSNLWAQNLTSQNGYSWSSYYSVIQHCNMLLKYTPGIDFGVEA DKNRLLAEAYCIRGYMYFCIARIWGDAPLELEPTESSNKPKLAREPAEEVLARALSDVNT AIDLFPEESYANGKGRASKPACYALKADILLWKAKVMNGSEQDLKDVITYADLASKGLSL EDNFADIYGTKYGKEVIWTIHFEIYEKEAQYSQSLKPRDVFVEKAVNKDEIPYAKGGARS TYAPSPFLIGLFNANPADIRTKDSYITAKDADGNEIGTFDNKMKGTKTEGDRTYDSDIVI YRLAEMYLFKAEAYAALNQTPQAIIELNRVRDRAKIGTYNGSTNKIAVEKEILNERAREL YLERKRWPDLLRFHYGGTIDVYQEVPNLKKKVDDNIIIPLYLAIPLSDININPNLKQTQG YENL >gi|222159276|gb|ACAB01000083.1| GENE 14 17293 - 18453 540 386 aa, chain + ## HITS:1 COG:Cgl1519 KEGG:ns NR:ns ## COG: Cgl1519 COG4409 # Protein_GI_number: 19552769 # Func_class: G Carbohydrate transport and metabolism # Function: Neuraminidase (sialidase) # 
Organism: Corynebacterium glutamicum # 39 363 60 399 399 119 28.0 7e-27 MKTYKIITCVFLFTSMFTQACQKDDETQGNPTPVPPEEEVPSSEFNYIYNQGTDGFELYR IPAIVKSKSNTLLAFAEARKARSNGDSGDIDLVVKRSSDNGKTWSKQITIWNDGQNTCGN PVPIVDDRGRIHLLMTWNFQTDKWGAITNGTGEDSRRPYYTYSDDDGITWAQPVEITSSV KKEKWDWYATGPCHGIQIQKGIHKGRLVAPNYFTTRESGKVTSYSHIIYSDDYGKTWKPG EPTPVGGVGECSVAEIGEGTLMLNMRADEGFYRKSCTSIDGGLTWSSPQISIDQIDCKCQ GSILSIGGAVFLSNAASATERINMTIKKSTDNGKNWKGQYTVYEGNSGYSDIVELSDSQI AIIYEGGEKRYTDGLAFKVVSIKSIQ >gi|222159276|gb|ACAB01000083.1| GENE 15 18472 - 20043 865 523 aa, chain + ## HITS:1 COG:MA0851 KEGG:ns NR:ns ## COG: MA0851 COG3291 # Protein_GI_number: 20089735 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 30 295 2102 2366 2566 134 34.0 7e-31 MKRKTIYLLQCIVVILLLACSEDDLQSLQAAFESDLQEVTIGESITFKDISTGEPSKWNW RFEGGEPETSILFSPNVVYNKPGVYSVTLSVGRGEEANEMVKEQYITVNYPSQITVDFSA DKTTATNEDVISFKDLSKGYPNEWLWSFTPKEGGAVITSTEQNPQMTLSPGIYTIKLTAK NPKASSDKVREDYITVIDKNAIAADFGAQCRNTYAGGYINFLDKTLGTVEEWEWTFEGGT PASSVEQNPVVQYNNPGKYKVTLKAKNSVNSSTKEKEGYVYVVSAEKLVLYLPFDGDNKD AGPNQLNPEELTAGAGSSVYNSQARFSGESAECRFAAHFQGDKQNYSILSIPEEGLKNHY TDSEFTVAFWVKVSNMTAKNAVFHQGVGPGATYTDPVPRQSWFRLDTSGKTVVFCVEYKG KAGNWAEYEGKRMDDGEWHHYVCIYKKVDGKRDSYLYIDGQKVIEKKGVVDKVVDNWPYY IGCNYRFTNGEFAPENFLNGYLDDFILYNRILSEEEIQDLYNN >gi|222159276|gb|ACAB01000083.1| GENE 16 20111 - 22195 1107 694 aa, chain + ## HITS:1 COG:YPO2803 KEGG:ns NR:ns ## COG: YPO2803 COG1472 # Protein_GI_number: 16123001 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Yersinia pestis # 94 692 99 712 793 343 32.0 5e-94 MKKLFLLIFMLVNCIGLYAQQVMETSEAYKKADELLKKLTIEEKALMVRGYNKFFIKGFE EKGILPVYLSDATQGVNIRNNLPDPNVVKQLERSTAFPSPILLASTFSPELSYQYAKAIG EECRAGGIEVLLGPGLNIYRQSQCARNFEYFGEDPYLVSQMVSQYVTGLQSTGTAACLKH FYGNNTEFYRKRSNSIIGERAMNEIYLPGFKAGIDAGAMSVMTSYNQIDGEWAGQSSYVI KKILREKLGFKWLVMSDWNSVWDLEKVIKSGQNLEMPGSYNFGVSVLDLYHEKKITEKDL 
DDMIRPTLATCVAMGFYSRPKYDTTLLSKYPEHEQTARRVAEEGVVLLKNRNEILPLDPT KNRKILLTGKFVYEIPRGYGAAEVIGYNNVSLIDALQKVFGRTVYYIEKPTVAEIKEADV VLLSMGTRDKEAVERPFALPREDESFMRYITKNNPNTIAIINTGSAIDMSAWNEQLAALI YGWYGGQSGFEALTDIIIGKVNPSGKLPMTIERSFKDSPAWGYLPQGASLYNELKNEHLI NVYDVNYKEDVLVGYRWYDTKKIEPLYPFGYGLSYTTFALTKPRLSSNKMNDKQTIKCSV TLTNTGKCEGAEVVQLYIKENQPSVLRPEKELKRFEKVSLKPGENRILEFIITSKDLAFW DDQTHSWKTNTGQYTIFLGTSSRHINQTLSFIKE >gi|222159276|gb|ACAB01000083.1| GENE 17 22326 - 22931 310 201 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295087822|emb|CBK69345.1| ## NR: gi|295087822|emb|CBK69345.1| Protein of unknown function (DUF2500). [Bacteroides xylanisolvens XB1A] # 1 201 36 236 236 399 99.0 1e-110 MVELSYLFGQFCIFADIYYNIMKIIFINTLRILMILVIMISCGVAYVVYEDTLAAWWIPV GVALIIVIATIPFYKGWIWLTTMDNKVINCCCHLVCVGAISCVLFLGGNYWFADSASTHE EKVMVQKKYVETHKKTRRVGRHRYVSDGVRKEYYLQVAFENGNVETLHVSPSTYNKTKTG RPKILTLQKGLFGLPVITKGL >gi|222159276|gb|ACAB01000083.1| GENE 18 23218 - 25695 1887 825 aa, chain + ## HITS:1 COG:no KEGG:BT_1460 NR:ns ## KEGG: BT_1460 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 825 1 825 825 1501 88.0 0 MKMNKTRLLEKISIQSGVSLDECGTVLKTFEKVLSEELSRKIYRYGGWILLLTALFVSVT VFSQTTGRKGRPVQTIRGIVIDGDSKFPIPYATVKLSEKEGAGTITDSLGRFSIPQVPVG RHTVEAAFMGYEPGIFREILVTSAKEIYLEIPLKESVNELNEVIIRARTNKEEAMNKMAT TGARMLSVEEASRYAGGFDDPARLVSAFAGVAPSVSSNGISIHGNAPHLLQWRLEDVEIP NPNHFADIATLGGGILSSLSSQVLGNSDFFTGAFPAEYGNAVSGVFDMKLRNGNNQKNEN TIQVGIMGIDVASEGPLSKKHKASYIFNYRYSTTGLLNLEGGTMDYQDLNLKLNFPTQKA GTFSVWGTSLIDKFTSDFEKNTEKWDYWGDRSESRDKQYMAAGGVSHRYFFNNDASLKTT IAATYSQLDGGATLFNHSMESTPYMDLDSKYTSLIFTTTFNRKFSNRFTNKTGFTYTNMF YKMDLSIAPYEAEPLEIVSQGKGNTSLISAYNSSSVGLTERWTLNAGIYGQLLTLNNKWS VEPRVGLKWQATPKTTFALAYGMYSRMEKMDVYFVKTKSTGNQSVNKDLDFTKAQHIMLS FGYKISDRMNLKIEPYIQFLHDVPVMADSSYSVLNRSDFYVEDALVNKGRGRNVGIDITF ERFLEKGLYYMISGSWFDSRYRGGDGVWYNTKFNRNYVINGLIGKEWMLGRNKQNILSVN LKLTLQGGDRYSPIDLEATMNHPDKEVQYDETKAFSKQYSPMLIGNYTVSYRINKRKVSH EFAVKGLNFTGAKEHYGHEYNVKTGKIDVSDNSTILTNVSYKLEF 
>gi|222159276|gb|ACAB01000083.1| GENE 19 25753 - 26835 566 360 aa, chain + ## HITS:1 COG:no KEGG:BT_1459 NR:ns ## KEGG: BT_1459 # Name: not_defined # Def: two-component system sensor # Organism: B.thetaiotaomicron # Pathway: not_defined # 34 359 1 325 326 481 79.0 1e-134 MTTETLATGSESTFLYRFLVSPDLRWMRYLVLILVLGTISFNQVFIIFLDYKDILGGWIY TFTFLYLLTYVAVIYLNLFQLFPKYLLKRHYLTYLSLLSTAMIVALLIQMSIEYMAYSYW PELHARGSYFSMHMVVDYISSFMLSTLCMIGGTMTVLLKEWMINNQRVSQMEKAHVVSEV ERLKEQISPELLFKTLHQSGELTLSEPETASKMLMKLSQLLRYQLYDCNRAKVLLSSEIT FLTNYLTLEQTSRPQFYYEFTSEGEVNRMLVPPLLFIPFVQYIVKAIDEQQIQPPVSLKT HLKAEKGTIIFACACPEVNLLSSDKGLERIRQRLDILYGSRYRLSLAVGSIWLELKGGES >gi|222159276|gb|ACAB01000083.1| GENE 20 26889 - 27941 503 350 aa, chain + ## HITS:1 COG:RSc1351 KEGG:ns NR:ns ## COG: RSc1351 COG2972 # Protein_GI_number: 17546070 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Ralstonia solanacearum # 129 327 137 352 414 91 28.0 2e-18 MNDKSVTAFLLSPRYRIYRHLLLQLVVVLITINVLWYEPLQTVSFGRRLGGCLAYFASMN MVIYINLYVLVPYFLLKNRWGSYVLMAVITNIAVITFLSVTQGLLFEVILPGKDPNGFAT FINAFSGILTIGFVMAGSAAISLFMHWLRYNLRIDELESTTLQSELKFLKNQINPHFLFN MLNNANVLIKRNPEEASKVLFKLEDLLRYQINDSSRERVSLASDIRFLNDYLNLEKIRRD NFQFTMEEHGETDSIWIQPLLFIPFVENAVKHSFDSEHPSYVHVSFKVDNDRLEFRCENS TPKVAVSKGKVGGIGLVNIQRRLGLLYPGRYELKQIENENKYTVILSITL >gi|222159276|gb|ACAB01000083.1| GENE 21 27938 - 28654 602 238 aa, chain + ## HITS:1 COG:FN0219 KEGG:ns NR:ns ## COG: FN0219 COG3279 # Protein_GI_number: 19703564 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Fusobacterium nucleatum # 1 220 2 224 240 105 30.0 5e-23 MNCIIVDDEPLAREAMKLLIEESNNLQLIGSFNSASTASDFMEQHVTDLVFLDIQMPGIT GIEFARTISKKTLVIFTTAYTEYALDSYEVDAIDYLIKPVEAERFQKAVDKALSYHSLLL KEEKEAIETVVTADYFFVKAERRYFKVNFSDILFIEGLKDYVIIQLSDQRIITRMSLKAI FDLLPKSTFLRVNKSYIVNTGHIESFDNNDIFIKSYEIAIGNSYRDDFFEGFVMKQRV >gi|222159276|gb|ACAB01000083.1| GENE 22 28842 
- 29339 555 165 aa, chain - ## HITS:1 COG:BB0061 KEGG:ns NR:ns ## COG: BB0061 COG0526 # Protein_GI_number: 15594407 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Borrelia burgdorferi # 43 156 3 115 117 127 45.0 1e-29 MKQIKSLLSVFVLTLAATACAGNSGENKKSNEPTKEDNKMEVVALNKADFLKKVYNYEAN PNDWKFEGSRPAIVDFYATWCGPCKVIHPILEELSKEYSGKVDIYQIDVDKEQDLAAAFG IRSIPTLLMIPMKEEPRIMQGAMPKDQLKKAIDEFLLKQNNEAKQ >gi|222159276|gb|ACAB01000083.1| GENE 23 29855 - 31252 1240 465 aa, chain + ## HITS:1 COG:DR0430 KEGG:ns NR:ns ## COG: DR0430 COG1690 # Protein_GI_number: 15805457 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Deinococcus radiodurans # 42 463 40 464 470 392 51.0 1e-109 MGIRLKDLSKLGYRDNVARSLVVDIVSKHCKYNTKEQIEMTLSDILEHPESYKNNEIWNK LAERLSPTIIAKEFIAYDLLDEPLMYKTYGGKFIETIAKQQMNLAMRLPVTVAGALMPDA HAGYGLPIGGVLATDNAVIPYAVGVDIGCRMSLTVFDAGADFLKRYAYQMKEALKDFTHF GMDGGLGFEQEHEVLDREEFRLTPLLRDLHGKAVRQLGSSGGGNHFVEFGEITLQEKNVL NLPEGSYLALLSHSGSRGLGAAIAKHYSLLAREVCRLPREAQHFAWLDLNTEEGQEYWMS MNLAGDYARACHERIHLNLAKALGLKPLANVNNHHNFAWKEEITPGRMAIVHRKGATPAQ KGQAGLIPGSMATPGYLVCGKGVEDALNSASHGAGRAMSRQKAKDSFTQSALKKLLSQAG VTLIGGSVEEMPLAYKDIDRVMYTQETLVEVQGKFMPRIVRMNKE >gi|222159276|gb|ACAB01000083.1| GENE 24 31384 - 32118 722 244 aa, chain + ## HITS:1 COG:no KEGG:BF2709 NR:ns ## KEGG: BF2709 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 236 1 236 237 392 77.0 1e-108 MISEKYGRTYHYPFSPGTTSDDRINHTYWEDIQRIKTLVHTEKLDGENNCLSQWGVFARS HAAPTTSPWTRQLRERWELIKNDLGDIEIFGENLYAIHSIEYQRLETHFYIFAVRCMDQW LSWEEVKFYAALFDLPTVPELKIEPVSGLTPELLKQEIIDMSQDPSVFGSCDPWTKVACT REGVVSRNIEEYPVSEFAHHVFKYVRKGHVKTDEHWTRNWKRAPLVWELSNEKENNDELE IDRR >gi|222159276|gb|ACAB01000083.1| GENE 25 32093 - 33196 793 367 aa, chain + ## HITS:1 COG:CAC0753_1 KEGG:ns NR:ns ## COG: CAC0753_1 COG0617 # Protein_GI_number: 15894040 # Func_class: J 
Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Clostridium acetobutylicum # 16 214 19 223 228 119 33.0 9e-27 MNWKLIEDKSWCSLEQQFEWVREMNVVPQDTCYHAEGSVAEHTRMVLEALQQSSAYQTLC TLEKEIIWTSALLHDVEKRSTSVDEGEGRVKGHARKGEYTTRTILYRDCPAPFHIREQIA SLVRYHGLPVWLMEKSDSVKKLCEASLRVDTSLLKMLAEADVRGRICEDKNGLLEAVELF EIFCREQDCWSKPREFATDYARFHYFHAEGSYIDYIPHEQFKCEVTMLSGLPGMGKDYYI QSAGMDMPVVSLDAIRRKYKLSPTDKSANGRVVQMAKEEARTYLRKGQDFVWNATNITRQ MRAQLIDLFVDYGAKVKIVYLEQPYHTWRQQNKSREYALPESVLDKMLDKLEVPQLTEAH EVVYHVV >gi|222159276|gb|ACAB01000083.1| GENE 26 33521 - 35020 1443 499 aa, chain - ## HITS:1 COG:BB0604 KEGG:ns NR:ns ## COG: BB0604 COG1620 # Protein_GI_number: 15594949 # Func_class: C Energy production and conversion # Function: L-lactate permease # Organism: Borrelia burgdorferi # 3 497 6 499 500 296 40.0 8e-80 MTLILAIIPVLLLIILMAFFKMPGDKSSIISLIVTMLIAIFGFAFSVDNLFYSFLYGALK AVSPILIIILMAIFSYNVLLKTEKMEIIKQQFASISTDKSIQVLLLTWGFGGLLEAMAGF GTAVAIPAAILISLGFKPIFSATVSLIANSVATAFGAIGTPVLVLAKETNLDVLHLSTNV VLQLSVLMFLIPLVLLFLTDSKLKSLPKNIFLALLVGGVSLVSQYLAAKYMGAESPAIIG SILSIIVIVLYGKLTASKEEKARKSHLKTKDILNAWSIYLLILFLIILTSPLFPELRHTL ENNWITRISLPINASTVNYTISWLTHAGVLLFIGTFIGGLIQGAKVKDLFIVLWNTVKQL KKTFITVICLVGLSTIMDSAGMIAVIATALATATGSLYPLFAPVIGCLGTFITGSDTSSN ILFGKLQASVAGQIHVSPDWLSAANTVGATGGKIISPQSIAIATSAGNQQGKEGEILKAA IPYALVYVAITGIIVYIFS >gi|222159276|gb|ACAB01000083.1| GENE 27 35533 - 37077 1664 514 aa, chain + ## HITS:1 COG:BMEI0801 KEGG:ns NR:ns ## COG: BMEI0801 COG4799 # Protein_GI_number: 17987084 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) # Organism: Brucella melitensis # 1 514 1 510 510 669 64.0 0 MKELINNLEELNRKAEKGGGDARIEKQHSVGKLTARERIDLLLEKGSFIELDKLVTHRCT DFGMEKQKFAGDGVVTGYGMIGKRLVYVFAQDFTVFGGALSETHAKKICKVMDMAMQMGA PIIGLNDSGGARIQEGVRSLAGYAEIFLRNSMASGVIPQISAIMGPCAGGAVYSPALTDF ILMVKNSGYMFITGPDVVRSVTQEEVTKEELGGVGVHMTKSGVAHLSAENDIECINYIRE 
LISYLPGNNMEEPPFVATNDSPTRLTPELANLIPTNPNQPYNIKEMIEAVADDNSFFELQ AEYAANIVTGYIRLNGKTVGVVANQPLVLAGTLDINASVKAARFVRFCDAFNIPLLTLVD VPGFLPGIDQEYGGIIRNGAKLLYAYCEATVPKVTVITRKAYGGAYDVMSSKHIRGDVNL AFPTAEIAVMGPDGAVNILFRKDIDKAGNPEEKRKELQDDYRGKFANPYRAAELGYVDEV IDPAVTRLRLIRSFEMLANKRQSNPPKKHSNLPL >gi|222159276|gb|ACAB01000083.1| GENE 28 37135 - 38646 1762 503 aa, chain + ## HITS:1 COG:MA0675 KEGG:ns NR:ns ## COG: MA0675 COG0439 # Protein_GI_number: 20089560 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: Methanosarcina acetivorans str.C2A # 1 478 1 493 493 493 50.0 1e-139 MIKKILVANRGEIAMRIFRTCRVMNIATVAVYTHVDRGALHVRYAEEAYCISESPEDTSY LKPELILSIAKKTGAAIHPGYGFLSENADFARRCEEEGVIFIGPSADIISKMGIKTEARK IMREAGLPIVPGTETPVQGIDEVKKVANEVGYPIMLKALAGGGGKGMRLVRTEEEVETAL RLSQSEAGTSFGNDAVYIEKYIENPHHIEVQIMGDKYGNVVHLYERECSIQRRNQKVIEE SPSPFVKEETRKKMLKVAVEACKKIGYYSAGTLEFMMDKDQNFYFLEMNTRLQVEHPVTE ECTGVDLVRDMITVAAGNPLPYKQDDIQFSGAAIECRIYAEDPENNFIPSPGVITVREAP EGRNLRLDSAAYAGFEVSLHYDPMIAKLCCWGRTRASAISNMARALREYKILGIKTTIPF HQRVLKNAAFLKGEYDTTFIDTRFDKEDLKRRQNTDPTVAVIAAAVRHYEREKEAASRAT TLPVVGESLWKYYGKLQMTANNY >gi|222159276|gb|ACAB01000083.1| GENE 29 38676 - 39200 621 174 aa, chain + ## HITS:1 COG:YGL062w KEGG:ns NR:ns ## COG: YGL062w COG1038 # Protein_GI_number: 6321376 # Func_class: C Energy production and conversion # Function: Pyruvate carboxylase # Organism: Saccharomyces cerevisiae # 80 174 1066 1169 1178 69 42.0 3e-12 MGTTLATYYAKLQDMPDSEYKVEILEDGPIKKIAVNGKIYEVDYNMGGDSIHSIIIDHHS HGVQISPSSNNSYTIMNKGELYQIELQGEMEKIHNARTAAESVGRQVVQAPMPGVILKTY VKKGDSVKRGDPLCVLVAMKMENEIRSVTDGVVKEVFVEDGMKVGLNDRIMVIE >gi|222159276|gb|ACAB01000083.1| GENE 30 39288 - 41645 1946 785 aa, chain + ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 344 644 5 315 328 193 40.0 1e-48 
MNTEQNQKVLLEQLEALKKENEQLKKELSILRNENISNRPVSFKEKYAVRILDSLPDMLT VFNQNEVGIEVVSNEETNHVGISNKDFKGMYMREMVPPEAYQNIHSNMRQAVSTGAVSTA HHELDFNGEHHHYENRIFPLDEEYVLIMCRDITERVTTQRQLEVFKSVLDKVSDSILAVS EDGTLVYANKQFIEEYGVTQQMGIQKIYDLPVSMTTKEAWERRLQEIRDNDGTFAYRAAY MRKGEDKERMHQVSTFLIRENNEELTWFFTQDITDVIKKQDELRELNLLLDGILNNIPVY LFVKDPENDFRYLYWNKAFADHSGIPASKAIGHTDYEVFPSHGDAEKFRKDDLELLQTHK RIDMQETYLSVTGKARIVQTLKALVPMEGRKPLQIGISWDITNLQNIEQELIKARIKAEQ SDRLKSAFLANMSHEIRTPLNAIVGFSQLLPAAETAEEKKLYSGIINQNSDILLQLINDI LDLSKIEAGTLEYIKRPMNLGEVCRTIYAVHKERVKEGVTLVFDNEDENLFIEGDQNRIM QVITNFLTNASKFTYAGEIRLGFERTDKNIRVYVKDTGIGIEPEKVDHIFERFVKLNSFA QGTGLGLSICQMIIEKIGGEIGVTSELGKGSTFYFTIPYEEAGELGEIFKMSKTESKGNT VNRVQQIKKILVAEDVESNFILLKNLIGREYTLLWAKDGVEAIEMYKQYQPDLILMDIKM PRMDGLEATHIIRSYSKEVPIIALTAYAFETDKELALEMGCNDFVTKPVSERTLRKALDK YSTIV >gi|222159276|gb|ACAB01000083.1| GENE 31 41653 - 42093 220 146 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715074|ref|ZP_04545555.1| ## NR: gi|237715074|ref|ZP_04545555.1| predicted protein [Bacteroides sp. 
D1] # 42 146 1 105 105 185 100.0 7e-46 MRQKGKRKIIKTIPKKKYVQKPPHKGTFWSEYSFAIIAFFFMVFVSFLLYYSSLPENYDK ETVGNIVRHELKYDMTQGRMGGSVSYSFVVYYQYMVGGEHYSSKVSLGYTRQNIPFIKKI KDYGSMYPVTVTYDSHNPQISTIVVE >gi|222159276|gb|ACAB01000083.1| GENE 32 42223 - 42909 486 228 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_2065 NR:ns ## KEGG: Fjoh_2065 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 57 228 11 181 184 170 48.0 3e-41 MSMSQRIYNPLVFGIRIFNPLILSDRLKSFVSFDFNRTFILSNQISISMDTANRVLDELY FVTSTVVDWIDIFTRPKYKHIILESLAYCQEKKGLRIYAWVLMSNHLHMIVSSGTEASVS DILRDFKKFTSKRIMAELETDPQESRREWMLDRFRFAGANDKKISKYRFWQEGNHPELIY LHDFFLQKLNYIHNNPVKQEIVARQEDYLYSSAVSYAGDKGLLEVIVV >gi|222159276|gb|ACAB01000083.1| GENE 33 43126 - 43440 310 104 aa, chain + ## HITS:1 COG:no KEGG:BT_4140 NR:ns ## KEGG: BT_4140 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 104 1 104 104 150 71.0 1e-35 MKRPEIIEAIRETLKRVAPNAQAILYGSEARGDARHDSDVDLLILVEGDKMTLAKEEAIT LPLYELELKTGVSISPIVVLKKFWENRPFKTPFYVNVINEGIVL >gi|222159276|gb|ACAB01000083.1| GENE 34 43437 - 43859 251 140 aa, chain + ## HITS:1 COG:TM1000 KEGG:ns NR:ns ## COG: TM1000 COG1895 # Protein_GI_number: 15643760 # Func_class: S Function unknown # Function: Uncharacterized conserved protein related to C-terminal domain of eukaryotic chaperone, SACSIN # Organism: Thermotoga maritima # 10 134 4 128 132 85 35.0 2e-17 MKEILDKESKKALVAYRIQRAYETLREAEVMIRESFYNAAINRLYYACYYATVALLLKYD IQTQTHNGVKTMLGLHFISTGKLPVKVGKTFSTLFEKRHSGDYDDFVYCDEEMVNNLYPQ AETFINSIQELIRDKEWESI >gi|222159276|gb|ACAB01000083.1| GENE 35 43936 - 45495 1334 519 aa, chain + ## HITS:1 COG:no KEGG:BT_0374 NR:ns ## KEGG: BT_0374 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 516 1 516 516 820 77.0 0 MSNKIYPIGIQNFESLRQDGYFYIDKTALMYQMVKTGRYYFLSRPRRFGKSLLISTLEAY FQGKKELFTGLAVEKLEKDWIKYPILHLDLNIEKYDTPESLDKILNDNLEYWESQYGTRP 
SETSFSLRFAGIIQRACEKTGQRVVILVDEYDKPMLQAIGNEELQKQFRNTLKPFYGALK TKDGYIKFALLTGVTKFGKVSVFSDLNNLDDISMWNEYVEICGVSEREIHENLEAELHEF AAARGITYDKLCEDLRECYDGYHFTHNSIGMYNPFSLLNAFKRKEFGSYWFETGTPTYLV KLLKKHHYDLERMAHEETDVQVLNSIDSESTNPIPVIYQSGYLTIKGYDEEFGMYRLGFP NREVEEGFIRFLLPFYANVNKVESPFEIQKFVREVRSGDYNSFFRRLQSFFADTTYEVIR DQELHYENVLFIVFKLLGFYTKVEYHTSEGRIDLVLQTDKFIYIMEFKLNGTAEDALQQI NDKNYALPFEMDGRKLFKIGVNFSAETRNIEKWIVETKN >gi|222159276|gb|ACAB01000083.1| GENE 36 45539 - 46186 630 215 aa, chain - ## HITS:1 COG:NMA0943 KEGG:ns NR:ns ## COG: NMA0943 COG0132 # Protein_GI_number: 15793901 # Func_class: H Coenzyme transport and metabolism # Function: Dethiobiotin synthetase # Organism: Neisseria meningitidis Z2491 # 3 208 2 207 215 239 54.0 2e-63 MKQNVYFVSGIDTDAGKSYATGFLAREWNKNGQRTITQKFIQTGNIGHSEDIDLHRRIMG ISFTEEDKKGLTMPEIFSYPASPHLASQLDNRPIDFGKIKRATEELSERYDFVLLEGAGG LMVPLTTELLTIDYIAQENYPLIFVTSGKLGSINHTLLSLEAIQKHGIVLDTVLYNMYPT VKDKTIQNDTMNFIQNWLKKYFPDTKFILVPEIKE >gi|222159276|gb|ACAB01000083.1| GENE 37 46183 - 46950 420 255 aa, chain - ## HITS:1 COG:PM1903 KEGG:ns NR:ns ## COG: PM1903 COG0500 # Protein_GI_number: 15603768 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Pasteurella multocida # 1 251 4 249 251 176 40.0 5e-44 MDKQLIAERFSKAIATYPQEANVQRQIADKMIHLLTEHISFPCSKVIEFGCGTGIYSRML LQALRPEELLLNDLCPEMKYCCEDILRKEQVSFLPGDAETVPFPAESTLITSCSALQWFE SPENFFKRCNALLNSQGYFAFSTFGKENMKEIRELTGNGLPYRSREELVTALSSHFDILH SEEELISLSFDNPIKVLYHLKQTGVTGISGTSSQQLRTRRDLQLFSERYTQEFTQGTSVS LTYHPIYIIAKKKKV >gi|222159276|gb|ACAB01000083.1| GENE 38 46961 - 47620 456 219 aa, chain - ## HITS:1 COG:NMB0473 KEGG:ns NR:ns ## COG: NMB0473 COG2830 # Protein_GI_number: 15676384 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis MC58 # 13 206 12 201 215 133 38.0 2e-31 MKQYFIIKNNQKHLLLFFAGWGMDETPFLQIHPTDKDWMICYDYRSLEFDADILQEYSEI 
TLIAWSMGVWAASQIMKQYPSLPLSQSIAINGTLSPIHETKGITPTIFEGTLQGLNEQSL QKFQRRMCGSAADYKAFQTVAPQRPVEELKEELAAIQKQYLSLPPSDFAWQRAIIGKSDR IFLPDSQWLAWRNKVDSLEYTEAAHYQQDLFDNVIMQIN >gi|222159276|gb|ACAB01000083.1| GENE 39 47617 - 48771 760 384 aa, chain - ## HITS:1 COG:PM1901 KEGG:ns NR:ns ## COG: PM1901 COG0156 # Protein_GI_number: 15603766 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Pasteurella multocida # 3 379 2 379 387 431 54.0 1e-120 MTLEYINQELQTLKEKKNYRSLPPLIHEGRDVLLNGQRMLNLSSNDYLGLANDISLREEF LRTLTPETFLPTSSSSRLLTGNFSDYQKLEQQLATMFGTESALIFNSGYHANTGILPAIC NTHTLILADKLVHASLIDGIKLSSAKCVRYRHNDISQLQRLIAENHNAYEQIIIVTESIF SMDGDEADLPALIQLKKSYSNILLYIDEAHAFGVRGKKGLGCAEEQECINDIDFLIGTFG KAIASAGAYIVCRKLIREYLINKMRTFIFTTALPPINIQWTSWVLERLPALQHKRTHLLQ ISEKLKEALTTKGYNCPSVSHIVPMIVGASEDTILKAEELQRKGFYALPVRPPTVPEGTS RIRFSLTADITEHEIDQLIKLING >gi|222159276|gb|ACAB01000083.1| GENE 40 48993 - 51407 1686 804 aa, chain - ## HITS:1 COG:NMB0732 KEGG:ns NR:ns ## COG: NMB0732 COG0161 # Protein_GI_number: 15676630 # Func_class: H Coenzyme transport and metabolism # Function: Adenosylmethionine-8-amino-7-oxononanoate aminotransferase # Organism: Neisseria meningitidis MC58 # 387 803 14 430 433 598 64.0 1e-170 MKQQRHIQTTRTLLSRFRYWGRKNYAAFASMGREFQIGHLHINVVDVALRKQNAKITIPY HTFMTLQEIKDQVLAGIDISPDQAAWLANMADSEALYAAAHEITVARASHEFDMCSIINA KSGRCPENCKWCAQSSHYKTKADIYDLLPAEECLRQAKYNEAQDVNRFSLVTSGRKPSPK QITQLCDTVRQMRRHSSIQLCASLGLLNEEELRSLYEAGITRYHCNLETAPSYFSKLCTT HTQEQKRATLDAARRVGMDICCGGIIGMGETMEQRIEFAFTLKELNVQSIPINLLSPIPG TPLENEQPLSEEEILKTIAIFRFINPTAFLRFAGGRSQLSSEAMHKALYIGINSAIVGDL LTTLGSKVSEDKKMIQEEGYHFAASQFDREHIWHPYTSTTDPLPVYKVKRADGVTITLED GQTLIDGMSSWWCAVHGYNHPVLNQAAKEQLDKMSHVMFGGLTHDPAIELGKLLLPLVPS SMQKIFYADSGSVAVEVALKMAVQYWYAAGKPEKNNFVTIRSGYHGDTWNAMSVCDPVTG MHSLFGSALPVRYFVPSPTSRFDGEWNPEDILPLQEIIEKHSKELAALILEPVVQGAGGM WFYHPQYLREAEKLCRKHDILLIFDEIATGFGRTGKLFAWEHAGVEPDIMCIGKALTGGY MTLSAVLASNRIADTISNHAPGAFMHGPTFMGNPLACAVACASVRLLLESDWQENVKRIE 
TQLKEELAPAREFPEVADVRILGAIGVIEMKRPVNMAYMQRRFVEERIWVRPFGKLVYLM PPFIITSEQLSKLTSGLLKVIQKR >gi|222159276|gb|ACAB01000083.1| GENE 41 51499 - 52608 851 369 aa, chain - ## HITS:1 COG:no KEGG:BT_1441 NR:ns ## KEGG: BT_1441 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 368 17 367 370 592 86.0 1e-168 MGKRVILYLFLLFSLSLQAQSEKKTLSTNDSTRIDSTTVTQPSPFKRAIKKFMNFSDFDT LYISPNRYNYALMATHFSNFEYYSVTSEQPQPQKLSFSPNPHNKIGLYFGWRWIFLGWSV DVDDIYRKTNRKNRGTEFDLSLYSSKLGVDIFYRRTGNNYKIHKIRGFSEDIPSNYSEDF SGIKVDIKGLNLYYIFNNRKFSYPAAFSQSTNQRRNAGTFIAGFSISKHHLDFDYTELPD FLQQTMNPGMKVKDIKYTNANISFGYAYNWVFARNCLACLSLTPAIAYKASDVDAETHEG KAWYGKFNLDFLLRAGVVYNNGKYFVGTSFVGKNYNYHRNNFSVDNGFGTLQIYAGFNFN LRKEYRKKK >gi|222159276|gb|ACAB01000083.1| GENE 42 52766 - 55894 2945 1042 aa, chain + ## HITS:1 COG:no KEGG:BT_1440 NR:ns ## KEGG: BT_1440 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1042 1 1042 1042 1881 93.0 0 MRIYLRLLVVSLLLFAGNIVYAEAQQEKRVTGTVTSEGEPLPGVSVQLKGASSGTITDID GNYSIEVPATGTLVFRFVGMRSVEQPVNNRSVINVTLESESKELEEVMVVAYATAKKYSF TGAASTMKAGEIEKLQTSSVSRVLEGTVSGVQASAASGQPGTDAEIRIRGIGSINASSAP LYVVDGVPFDGSVNSINPDDISSMTVLKDAASAALYGSRGANGVIIITTKQGDQNTKATV KVKASLGGSNRAVRDYDRVSTDQYFELYWEALRNQYAKSADYTPATAAAQASKDLVTKLM GGGPNPYGPQYTQPVGTDGKLVAGARPLWNSDWSDAMEQQALRTELNLSVSGGGKANQYF FSAGYLNDKGIALESGYQRFNLRSNITSEMTSWLKGGVNLSFAHSMQNYPVSSDSKTSNV ITAGRTMPGFYPIYEMNTDGSYKLDENGDRIYDFGSYRPSGSMANWNLPATLPLDKSERM KDEVSGRTFLEATIIEGLKFKTSFNFDLINYNTLDYTNPQLGPAKENGGGVSRMNTRTFS WTWNNIATYDKTIGEHHFNVLAGVEAYSYRYDELTASRSKMAQPDMPELVVGSQLTGGSG YRIDYALVGYLTQALYDYQNKYFFSASYRRDGSSRFAPETRWGNFWSLGTSWRIDREEFM ASTSDWLSALTLKMSYGAQGNDNLGTYYASKGLYTIVSNLGENALVSDRMATPNLKWETN LNFNVGIDFSLFNNRFSGSFDFFTRRSKDLLYSRPIAPSLGYGSIDENVGALKNTGIEMV LNGTIINQNGWVWKLGMNLTHYKNKVTDLPLKDMPRSGVNKLQVGRSVYDFYMIEWAGVD PENGDPLWYKDEVDANKNPTGKRVTTNDYGSADYYYVNKSSLPKVYGGFNTSLSWKGFDL SAIFAYSIGGYIYNRDVTMILHNGSLEGRDWSTEILRRWTPDNRYTDVPALSTTSNNWNS 
ASTRFLQNNSYMRLKNLTLSYNLPKQWISKLSLSSVQVYVQGDNLFTIHRNQGLDPEQGI TGITYYRYPAMRTISGGINVSF >gi|222159276|gb|ACAB01000083.1| GENE 43 55909 - 57390 1406 493 aa, chain + ## HITS:1 COG:no KEGG:BT_1439 NR:ns ## KEGG: BT_1439 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 493 1 493 493 872 86.0 0 MRKIKIQSILAAVAGLFLATSCSSSFLDTDPTDAVSSDKVSVPENAEALVNGAWYNLFDY SSTYANIGYRALQCLDDMMASDIVSRPKYGFNSSYQFNDIAQSSNGRTEFAWYLIYKTID NCNTAISIKGDSEELRQAQGQALALRAFCYLHLVQHYQFTYLKDKDAPCVPIYTEPSNSS TVPKGKSTVAQVYQLIFDDLTLAKDYLKNYVRSGDNQKFKPNVAVVDGLLARAYLLTGQW EEAAKAAEAARAGYTLMTTTAEYEGFNNISNKEWIWGFPQIPSQSDASYNFYYLDATYVG AYSSFMADPHLKDTFTEGDIRLPLFQWMREGYLGYKKFHMRADDTADLVLMRAAEMYLIE AEAKVRDGVALAQAVAPLNTLRNARGVGDYEVTGKSQEDVINEILMERRRELWGEGFGIT DILRTQKAVERVALSDEMQKTEVDCWQEGGSFEKRNPLGHWFLNFPNGKPFTVNSTYYLY AIPQKEINANPNI >gi|222159276|gb|ACAB01000083.1| GENE 44 57523 - 58728 865 401 aa, chain - ## HITS:1 COG:ECs0532 KEGG:ns NR:ns ## COG: ECs0532 COG0477 # Protein_GI_number: 15829786 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 10 398 20 405 406 382 56.0 1e-106 MVQNQEQPVGKITFSILIALSLSHCLNDLLQSVVSAAYPLFKDDLGLSFAQIGLITLVYQ LSASVFQPITGIIFDKYPVAWSLPIGMSFTLIGLINLAFSDNLYWILASVFLIGIGSSVL HPEASRITFLASGGKRGLAQSLFQVGGNFGGSLGPLLVALLVAPYGRQHLIVFAFVALAA IGVMYPICKWYKSYLNRMKAQTVSVRKPVHLPLPMDKTALSIAILLILIFSKYIYMASLT SYYTFYLIHKFNVSVQDSQLYLFIFLVATAIGTLIGGPVGDRIGRKYVIWASILGAAPFS LLMPHANLLWTIILSFCVGLMLSSAFPAILLYAQELLPTKLGLISGLFFGFAFGVAGVAS AVLGNLADKTSIEYVYNICAYMPLLGLVTFFLPNLKKKKIE >gi|222159276|gb|ACAB01000083.1| GENE 45 58890 - 59135 336 81 aa, chain + ## HITS:1 COG:no KEGG:BF4188 NR:ns ## KEGG: BF4188 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 81 1 81 81 67 77.0 2e-10 
MGFIWYIIIGIVAGFLAGKVMRGGGFGLIINLLLGILGGVLGGWVFALFGLSASGLIGSL ITSTVGAILVLWIASLFSKSK >gi|222159276|gb|ACAB01000083.1| GENE 46 59180 - 59467 406 95 aa, chain + ## HITS:1 COG:no KEGG:BT_1435 NR:ns ## KEGG: BT_1435 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 95 1 95 95 131 89.0 6e-30 MECRHGNFWIGLGIGSILGAVAYRLSRTAKAKQLESEIYNAIHRIGRDAEIAAAHAERKA MDLGLKAVETGAEIADKVAAEADKVAGKAKDKWEK >gi|222159276|gb|ACAB01000083.1| GENE 47 59855 - 60928 836 357 aa, chain + ## HITS:1 COG:SP1999 KEGG:ns NR:ns ## COG: SP1999 COG1609 # Protein_GI_number: 15901822 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 8 186 7 176 336 79 32.0 1e-14 MDKEFSNIRIVDIAKMAGVSVGTVDRVIHNRGRVSEENRKKVQAILEMVHYQPNLMARSL AASKKQYHILAITPSFVQGEYWEAISEGIDKAAAEMESYNITITKLFFDQYNNKTFDDII RNLLNEKVDGVLIATLFTDSVIRLSQELDRNEIPYVYVDSNIGGQHQLAYFGTESYDAGV IAARLLMDRLPSSSDILMARIIHSGKNDSNQGKNRREGFCHYLTEIGFNGNLHEVELKIN DSVYNFMKLDEIFEANPNIKGAIIFNSTCYILGNYLKARGMQAVKLVGYDLIGRNTQLLS EGVITALIAQRPERQGYDGVKSLCNHLLFKQNSEKVNLMPIDILLKENLKYYLNNKL >gi|222159276|gb|ACAB01000083.1| GENE 48 60950 - 61762 1047 270 aa, chain + ## HITS:1 COG:BH1067 KEGG:ns NR:ns ## COG: BH1067 COG1028 # Protein_GI_number: 15613630 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Bacillus halodurans # 6 269 7 279 281 224 43.0 1e-58 MNELFNVKGKVVVITGGAGILGKGIAAYLAKEGAKVVVLDRSEEAGKALVDSIKAEGNEA MFLYTDVMDKEVLEGNKVEIMKAYGRIDVLLNAAGGNMAGATIAPDKTFFDLQIDAFKKV VDLNLFGTVLPTMVFAEIMVEQKKGSIVNFCSESALRPLTRVVGYGAAKAAIANFTKYMA GELALKFGNGLRVNAIAPGFFLTDQNRALLTNPDGSLTDRSKTILAHTPFNRFGEPEDLY GTIHYLISDASNFVTGTVAVIDGGFDAFSI >gi|222159276|gb|ACAB01000083.1| GENE 49 61889 - 63058 1258 389 aa, chain + ## HITS:1 COG:STM3135 KEGG:ns NR:ns ## COG: STM3135 COG1312 
# Protein_GI_number: 16766435 # Func_class: G Carbohydrate transport and metabolism # Function: D-mannonate dehydratase # Organism: Salmonella typhimurium LT2 # 5 389 2 392 394 536 62.0 1e-152 MYLCEQTWRWYGPNDPVSLWDIKQAGATGIVNALHHIPNGEVWTVEEIMKRKQMIEEVGL TWSVVESVPVHEHIKTQTGDFMKYIENYKESIRNLAKCGVMVVTYNFMPVLDWTRTDLAY TMPDGSKALRFEKAAFVAFDLFILKRPNAEKDYTPEEIAKAKARFEQMSEDDKKLLVRNM IAGLPGSEESFTVEQFQQALDRYNDIDAEKLRSNLIFFLKEIAPVADEVGVKLVIHPDDP PYTILGLPRILSTEEDFKKLIEAVPNESNGLCLCTGSFGVRADNDLAGMMERFGDRVNFV HLRSTQRDEEGNFYEANHLEGNVDMYGVMKALILLQQRRKCSIAMRPDHGHQMIDDLKKK TNPGYSCLGRLRGLAELRGLEMGIAKSIL >gi|222159276|gb|ACAB01000083.1| GENE 50 63133 - 63729 505 198 aa, chain - ## HITS:1 COG:CAC0738 KEGG:ns NR:ns ## COG: CAC0738 COG0847 # Protein_GI_number: 15894025 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Clostridium acetobutylicum # 4 163 3 160 306 127 42.0 1e-29 MRNFAAIDFETANGKRTSVCSVGVVIVKDGKIVNKIYRLIRPAPNYYTQWTTAIHGLTYD DTMEAEDFPDVWAEIKPLIDGLPLVAHNSPFDEGCLRAVHELYDMTYPNYKFYCTCRTSR KVFGKDLPNHQLHTVAERCGYNLENHHHALADAEACAQIALLIIPDPPKPKKAKKADKDT HVGDLFASLIPQQVKKNK >gi|222159276|gb|ACAB01000083.1| GENE 51 63916 - 66276 1510 786 aa, chain + ## HITS:1 COG:slr2098_3 KEGG:ns NR:ns ## COG: slr2098_3 COG0642 # Protein_GI_number: 16330584 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 524 786 9 278 280 137 32.0 8e-32 MNKTYFKYKYVLIFILSLLVFIGGLNYKHWYVTDNLMVKRISLIYSISKDDIGNRNIEKL LYQEFQKQGIEVMFDKFYFDCSKYDEKERIEHVREYLEFLESKSTDLILTVGDQATNSLL STRHRLLSSIPVVACNVHFPNEELIEEYDSQKVYVLRDSPDLKRNIDFIKTLYPHNDMEI IYNIDLTFLGHKSFDSLSRVVDRKNVRVLGYQKAFVQECDYKHLTEMIEYFNLTPGLIND NVNKNGLTISLCPFRYIKGSSLLVMLKQSKRRQQNQAFLLDKLDMLAIPIVTALNIPSFS CIREGFGENAKIVGGYMATEGITAKASASLAARLLKKEKIGMPKIRDLEKEYVLDWTYFS EYAGDISNVPQNVRIINYPFYDCYRKELYLLGGLFVFSFILVTISLLRTHRRSLMERKNL QMLEEAHKRLTLSMNGGKISLWNIQEGVLEFDDNYVRLVGMEQRRFTKEDIMRYTHPDDV 
QLLSSFYETLYQSPSMQIQRIRFCFGGKEADYQWYELRCSTLKDAQGEIMLAGIMQNIQN LVEHEQQLILAKQIAEKAELKQSFLNNMSHEIRTPLNAIVGFTNLLIGEGADEIEPEEKA AMIEIVNNNNELLLKLVNDVLEISRLDSGNLSFDIKEHNITKIIKEIYVTYQTLIQPSLC FILELDETVSLPVNIDCFRFTQVISNFLNNANKFTKDGTITLGCKIYKEHQEVCVYVKDT GKGIDDKELMMIFDRFYKTDEFEQGSGLGLSICKVIIERLAGRIEVHSEVGKGSCFSVIL SLANII >gi|222159276|gb|ACAB01000083.1| GENE 52 66316 - 67794 1093 492 aa, chain + ## HITS:1 COG:HI1069 KEGG:ns NR:ns ## COG: HI1069 COG3303 # Protein_GI_number: 16273000 # Func_class: P Inorganic ion transport and metabolism # Function: Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit # Organism: Haemophilus influenzae # 15 492 37 533 538 435 43.0 1e-121 MKDKFKPWQRWVLFALAMAVIFASGVIISFLMEHSAEVVNVDYKKKIKINGIEARSIIFA ENYPREYKTWVDTTSTDLHSRLNGRTSIDVLAQRPEMVILWAGYAFSKDYSTPRGHMYAL QDIVHSLRTGAPMGDADGPQFASCWVCKSSDVPRMIEAIGVDSFYNNKWAAWGAEIVNPI GCADCHEPKNMDLHISRPSLTEAFSRQGRDITHATPQEMRSLVCAQCHSEYYFKGNIKYP TFPWDKGFTVEDLEKYYDEIGFTDYIHKLSRAPILKAQHPDYEIFKMGIHAQRGVSCADC HMPYNDEGGIKYSDHHIQNPLAVTERTCQTCHRDNKETLCKNVYERQQKANELRTLLEKE LAKAHIEAKFTWDIGATENEMQEALLLIRQAQWRWDFGVSSHGGSFHAPQETMRILGHGL NKVFQARMLISKVLVAHGYTDNVPLPDITTKEKAQQYIGLDMAVEKADKDKFLKEIVPEW LQKAKANGRIVN >gi|222159276|gb|ACAB01000083.1| GENE 53 67832 - 68656 341 274 aa, chain + ## HITS:1 COG:mlr1196 KEGG:ns NR:ns ## COG: mlr1196 COG2207 # Protein_GI_number: 13471273 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Mesorhizobium loti # 88 256 107 273 276 79 28.0 7e-15 MYTEYQPSHLLAPYVDNYWELKGTPEYGMRIHILPDGCTDFIFTLGEVAEAVEKESLIMQ PYRSYFVGPMTKFSELVTYAESIHMFGIRFLPCGLSCFTKLPLHEFVNSRISTREMRAVF DDTFVERLCEQEHIEGRIQLVEGYLLAYLARHYQSADSHVAMAVNMINQSGGKRSVRSLM DEVCLCQRHFERKFKHYTGYTPKEYSRIIKFKNAVELLRNTAPSNLLTTAINAGYYDLAH FSKEVKSLSGSTPTSFLSLTVPEDTTLTYIEPGK >gi|222159276|gb|ACAB01000083.1| GENE 54 68738 - 69154 447 138 aa, chain + ## HITS:1 COG:DR1146 KEGG:ns NR:ns ## COG: DR1146 COG3871 # Protein_GI_number: 15806166 # Func_class: R General function prediction 
only # Function: Uncharacterized stress protein (general stress protein 26) # Organism: Deinococcus radiodurans # 1 130 30 161 193 68 30.0 4e-12 MSTKTMKEKATELLQRCEVVVLASVNKEGYPRPVPMSKIATEGISTIWMSTGSDSLKTID FLANPKAGLCFQDKGDSVALTGTVEVVTDEKMKQELWQDWFIEHFPGGPTDPGYVLLKFE SNHATYWIEGIFIHKKLD >gi|222159276|gb|ACAB01000083.1| GENE 55 69287 - 69562 430 91 aa, chain - ## HITS:1 COG:no KEGG:BT_1428 NR:ns ## KEGG: BT_1428 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 91 1 91 91 149 93.0 4e-35 MIRLNVFVRVSETNREKAIEAAKELTACSLKEEGCIAYDTFESSTRHDVFMICETWQNAE VLAAHEKSSHFSKYVGIIQELAEMKLEKFEF >gi|222159276|gb|ACAB01000083.1| GENE 56 69894 - 70733 544 279 aa, chain + ## HITS:1 COG:no KEGG:BT_1427 NR:ns ## KEGG: BT_1427 # Name: not_defined # Def: tetracycline resistance element mobilization regulatory protein RteC # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 279 1 279 279 425 80.0 1e-117 MENFVRNSKQNIEKEIERIERQPIAPLDKIKQIIDVIQMSLILLKSAVAEYQFPNPKEEI LFFKTWKPQISGLLMFYVRLYQIEKKRIGESPSSQCKYLKSELENLKKYFLNNSFYDYYR TGRTELDEQYFVRGNYDILADTRFGLLDRDSSFTTLHDSSVAEIIANNRLAEYLSAQIEI LSEELHLKFTSMVENRLLQWTDSKVALVEFIYALYAGKCFNNGNTSLKDIAFCCETLFNI EIGDFYRIFLEIRNRKKSRTQFLDKLKDKIIKMMDELDK >gi|222159276|gb|ACAB01000083.1| GENE 57 70842 - 71444 261 200 aa, chain + ## HITS:1 COG:CAC0055 KEGG:ns NR:ns ## COG: CAC0055 COG4332 # Protein_GI_number: 15893352 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 7 191 6 183 196 103 34.0 3e-22 METTFTWEIKVKNTPLLIKKCSHCESDRFYCSDKFRMNAQKKNIDVWLIYRCVKCDNTCN ITLLSRTKPDLIDKKLFHSFSMNDREVAWQYAFSAGMASKNKLQIDYDSVEYEVISNISF EDIMNMNNEIISIWIECDLELNLKLLSLIKRCFSLSTTRLKHLFEEGNISLLSGKTSPKC KVKNGDIILIDRKSLIDIWG >gi|222159276|gb|ACAB01000083.1| GENE 58 71529 - 72230 643 233 aa, chain + ## HITS:1 COG:no KEGG:BT_1425 NR:ns ## KEGG: BT_1425 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 
233 1 232 232 330 77.0 2e-89 MKKVFLLAFVLFAWSMVVNAQESDGKYIEVTGSSEIEVVPDEIHFLIQIKEYWQEEYTGK SNKEEDFRTKVPLAMIEKDLRRSLRKIGIADDAIRTQEIGDYWRQRGKEFLIGKQLDIRL TDFEQINSIIRSVNTWGIESMRIGELKHKDLLMYRKQGKIEALKAAREKASYLVEAMGQE LGEVIRIVEPVDNNISRYLPFQAQSNVSMGTAATEQYRVIKLRYEMTARFAIK >gi|222159276|gb|ACAB01000083.1| GENE 59 72448 - 73536 543 362 aa, chain + ## HITS:1 COG:MA3445 KEGG:ns NR:ns ## COG: MA3445 COG1162 # Protein_GI_number: 20092257 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Methanosarcina acetivorans str.C2A # 1 359 1 363 369 211 36.0 1e-54 MNNENQTIHKSNDNLTIYGWNEKLNQLKQESIYSTLAHGRVSIVHRTCYEVISGNGLFQC ELTGNMMYGKSDDELPCTGDWVIFQPFDEHKGIIVDSLPRERTLYRKKSGTVADKQVIAS YVDKAFIVQSLDDNFNVRRVERFMVQIMEENISSVLVLNKADLDFDRQSVEEQIKHISSQ IPVFFTSIHQLETIVRLRKSISEGETVVFVGSSGVGKSSLVNALCEKPVLLTSDISLSTG KGRHTSTRREMVLMDSSGVLIDTPGVREFGLAIDNPDSLAEVLEISDYAESCRFKDCKHI NEPGCAVLEAVNSGVLDYKVYESYLKLRREAWHFSASEHEKRKKEKSFTKLVEEVKNRKA NR >gi|222159276|gb|ACAB01000083.1| GENE 60 73637 - 74209 214 190 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|52841322|ref|YP_095121.1| nucleotidyltransferase PLUS glutamate rich protein GrpB PLUS ribosomal protein alanine acetyltransferase [Legionella pneumophila subsp. pneumophila str. 
Philadelphia 1] # 1 189 153 349 601 87 32 5e-16 MEQKKLTELSLDELWQLFPITLTAHQDYWADWYKEEAELLKERLPCIERISHIGSTAIKG IWAKPIIDILVEIPREEKVVDLKGTIERCGYICMAENDSRIDFNKGYTLQGFAERVFHLH LCYEGDNDELYFRDYLQANTAIAKEYERLKLDLWKQYEHDRNMYTSHKGDFVEQYTQKGK LLFTGRYDNK >gi|222159276|gb|ACAB01000083.1| GENE 61 74389 - 74856 349 155 aa, chain - ## HITS:1 COG:PA1746 KEGG:ns NR:ns ## COG: PA1746 COG2110 # Protein_GI_number: 15596943 # Func_class: R General function prediction only # Function: Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 # Organism: Pseudomonas aeruginosa # 1 150 1 158 168 70 30.0 1e-12 MIKYISGDILQSKDEYIAQGVAVGSQEGLGTGLAFKLSSQFPEIQKLFKKYTRNTKFQAG NVFIGEIKGFSPGIIYIATQPDMYHAELTYLNKGLKRLKKVCESRGIKTVSLPKIGAGLG KLDWNSEVKPLLESILSDCETVFNVYEDYKIEYEI >gi|222159276|gb|ACAB01000083.1| GENE 62 74963 - 75553 183 196 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237715104|ref|ZP_04545585.1| ## NR: gi|237715104|ref|ZP_04545585.1| predicted protein [Bacteroides sp. D1] # 1 196 1 196 196 398 100.0 1e-109 MDIERILKTRENAIYYGIGEYQKDCFGWVGGNAPACFDDKYLSDKDNLYFYLTFQNPLNP NKQISIFTPDFDIALEYNTYPDCKLLLVEHELSSQSKSDRYKHPEIDEIYSIYEMSTEKD MPEKNCAIKFGGNVMPIQWGLDNDGKVVKDGNSFIFQINEICMKDISVFMAGCIYVYGKI EGNKVTNPFVAYWEYS >gi|222159276|gb|ACAB01000083.1| GENE 63 75663 - 76382 618 239 aa, chain - ## HITS:1 COG:yfeS_2 KEGG:ns NR:ns ## COG: yfeS_2 COG4884 # Protein_GI_number: 16130346 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 73 235 1 169 172 166 51.0 2e-41 MKRAFIYKDEKSHKFWWIDYSGCDFAVNYDKFDSIGKFEIKDFDTEEECLKQAEKLIHSK KKKGYVEDANFDFMHRFYIDSEEYGLHPKTSHPRFTEHFTEELYYDCVDEEAPFGSDEGS DTLDSLEETIRKNPKLNFLDYPKYLIEHDWGMEYIPVESLDPEVVKKLASKKEMDMTQSD MVTYATAFGQIKITGRLSPKLQEQGVKAIKRLALLWGNGVTEIQSRMIDDLLSFPIDKD >gi|222159276|gb|ACAB01000083.1| GENE 64 77320 - 77706 282 128 aa, chain + ## HITS:1 COG:no KEGG:BT_2537 NR:ns ## KEGG: BT_2537 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 2 128 1 127 127 152 61.0 3e-36 MITSCIAIIFGFRFFGRGLHEISGEEMVIIAFFMLFLFVPCLWLVRYLKDDEIDSNRQDK LVEIHIALTIVLTVAGGVFSRMSVLVNKWIDEQPFIVAAYIALALSVFGVVIGKAIDLWV YYLKKRIF >gi|222159276|gb|ACAB01000083.1| GENE 65 77713 - 78258 284 181 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715107|ref|ZP_04545588.1| ## NR: gi|237715107|ref|ZP_04545588.1| predicted protein [Bacteroides sp. D1] # 1 181 1 181 181 357 100.0 1e-97 MIEYMNIPVLVCVSLLGVGAFVMIVGSLLKKQWATDGKKTSLSIYLLYLFAMSGSGWTWY NSIHKARALAEGNRYTLATYEKVSGVIKGGHKVRQFGFYVDGVKYNTTTTYPKFLGVSNP LRIIVRYAVSNPEYNKALGDEFVPTWVLSPPERGWKQYPPAIHWEGAVLDYVYMQSLQGN N >gi|222159276|gb|ACAB01000083.1| GENE 66 78241 - 78729 231 162 aa, chain - ## HITS:1 COG:no KEGG:BT_2536 NR:ns ## KEGG: BT_2536 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 162 1 162 162 204 72.0 8e-52 MNKRIISFVCAGLVATSCVSSLCFYDYFANDGTTSNLIYWIFDFFQLSILLGQAGIGVYT INVLLRQRKEAISSLGYFAAHVFLQNITPLLQAWSAAVITEYPQWLLMSQVGFSIVLFLT LLIIQPASLFKKSEYIPWLLMAAFYINCVYTALTVIGLIITL >gi|222159276|gb|ACAB01000083.1| GENE 67 78768 - 79331 269 187 aa, chain - ## HITS:1 COG:no KEGG:HTH_0737 NR:ns ## KEGG: HTH_0737 # Name: lepB # Def: signal peptidase I # Organism: H.thermophilus # Pathway: Protein export [PATH:hth03060] # 4 183 8 184 228 75 34.0 9e-13 MIFWMNEINLMKKLLYLLVMPLILAACNPKVELTSAGMYPNYQVGEIVNLIPVDSLTYGD VIAYHSYIPGFQERAFKRIVGLPGDTVRFQDQQCIVNGKKCEWVLIRKLFYEEDECEEYC ESLPNGMKVNICKSVVPIDSATATTTAVVVPAGSYFVAGDYRGGSIDSRSQGCVAADSII GKGVKRK >gi|222159276|gb|ACAB01000083.1| GENE 68 79312 - 79650 354 112 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237715110|ref|ZP_04545591.1| ## NR: gi|237715110|ref|ZP_04545591.1| predicted protein [Bacteroides sp. 
D1] # 1 112 1 112 112 196 100.0 4e-49 MKEFGTIGDLINAYKAIPCDAYIYCSKSVIETENLENGKYLVIESEEEDGYVETEDGEVP KQAYDLGMSSLVDVETFKDILSFQYKLNPKTLYKDCIKAIEYYLSNDDFLDE >gi|222159276|gb|ACAB01000083.1| GENE 69 79670 - 80209 360 179 aa, chain - ## HITS:1 COG:no KEGG:BT_0512 NR:ns ## KEGG: BT_0512 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 179 1 179 189 284 89.0 1e-75 MRDKDSGKKIFWIYPTNKFVWAGVVICSIFCAFMIYFLCVANLSENKVLFIFLLPISLSI LLICWALPSKILLCEDRIEARSLFGKRSIRVDEINSWGVVQMYKPYRCKEGIYSYIPAKS FKPSRKIEENMIFSYSLFLSNIPHYDGKKKNSSQTDKTIFVSYRKEVYIALEKYLKEKK >gi|222159276|gb|ACAB01000083.1| GENE 70 80245 - 80733 523 162 aa, chain - ## HITS:1 COG:no KEGG:BF1788 NR:ns ## KEGG: BF1788 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 162 7 168 168 263 76.0 1e-69 MGILASILSLLGCGYGNKRATQSESINPYIPVAAQITMDKLPGVLKNVKAGRTEYDFTGI CANGVDCIYFMQDNGKFYIDFEAMSKDQLPYLNTLKQFAKEHNYPIIETTYNNTPIDYDH VKFAPVLSLKVNADIDSIVHVGKLIEQTIFKNNDQTIYDIVP >gi|222159276|gb|ACAB01000083.1| GENE 71 80935 - 81156 189 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237715113|ref|ZP_04545594.1| ## NR: gi|237715113|ref|ZP_04545594.1| predicted protein [Bacteroides sp. D1] # 1 73 1 73 73 118 100.0 1e-25 MYDANKQSKNGIVVSYKHIVPENKIIKIGIRKNDRKLYSMLFLTLAKQFKNKVIAPVDNA KIPKIDQANAIPL >gi|222159276|gb|ACAB01000083.1| GENE 72 81337 - 81768 147 143 aa, chain - ## HITS:1 COG:no KEGG:BT_0513 NR:ns ## KEGG: BT_0513 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 142 1 142 143 240 89.0 1e-62 MKVYRIAINMIKQTKIFKFIIPIVVFILLYAVSTIRNNNVRKDGIYSIVTLVKYSSAYRG QSAKYEFVYNKTLYKGSFFISFAESKNTPIGTRYFVTFLAQAPDRHLILDSVPSWFTLKA PDKGWKTLPTQKQLRIMMKDSLN >gi|222159276|gb|ACAB01000083.1| GENE 73 81828 - 82724 390 298 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237715115|ref|ZP_04545596.1| ## NR: gi|237715115|ref|ZP_04545596.1| predicted protein [Bacteroides sp. 
D1] # 1 298 1 298 298 582 100.0 1e-165 MELIQTFKKSYFLKMSNDKHLIAQVNSSQINIFENETYKHLAQFKEVGNASVFFSNDSNL LLAKDNDRKLVVYDLATMSIRCKLKPKVGSSNGDGDACISHDNKYIINLGYDFPYGYISV YDIENGKETRFREDMREVYNQIRYIPSRQLYFIDGFRKPNEGTSEKNRYFYLWFDMDNRT FEQTFCDLDDANFLYSEHLEQVLYFTEEGNLAFRILPLNIELPIKHQQGCLDIQLSHNNK MIALYQDNNLKLCTFPNMEILAELSNIGYGNISFSPDDKEILIASTIKGLIYKLEKCS >gi|222159276|gb|ACAB01000083.1| GENE 74 82802 - 83308 292 168 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237715116|ref|ZP_04545597.1| ## NR: gi|237715116|ref|ZP_04545597.1| predicted protein [Bacteroides sp. D1] # 1 168 1 168 168 294 100.0 1e-78 MKLREEIEPKIIQIEKICPQISRLLRGYDSEKDNKCLNIIKKISELTHKVITKDILSEYM EDDSICMVALRLSIGTPPLLHIPLSCDELLEIIQRIHSKNYVEYKVKAFPEDELWWVLSH DYYVPLLEKNMELSEPSLIREMLYQETVFDSLRYKPEEVLEKILGVMK >gi|222159276|gb|ACAB01000083.1| GENE 75 83398 - 83700 200 100 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262408935|ref|ZP_06085480.1| ## NR: gi|262408935|ref|ZP_06085480.1| predicted protein [Bacteroides sp. 2_1_22] # 1 100 1 100 100 188 100.0 1e-46 MEDINIKDLEVRKEIACGDIIIKLYTPPVKQRYNRNITGENIDGEILWQIEDVRPNVDSP FMNIILYDEKKIEAYNWEGVFYYVHIYTGEVESIPNQRPW >gi|222159276|gb|ACAB01000083.1| GENE 76 83827 - 84003 195 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|295087761|emb|CBK69284.1| ## NR: gi|295087761|emb|CBK69284.1| hypothetical protein [Bacteroides xylanisolvens XB1A] # 1 58 112 169 169 116 100.0 4e-25 MEIFDICPVCMWENTNTGEDQYSAPNRSTLKEYRQAFLNNQKLEPNNLKYIQYELGSI >gi|222159276|gb|ACAB01000083.1| GENE 77 84418 - 84531 82 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTDEIKNIKTTAMYVHIANTYKSKIKSLLDDILEEEI >gi|222159276|gb|ACAB01000083.1| GENE 78 84769 - 85005 211 78 aa, chain + ## HITS:1 COG:no KEGG:BF3342 NR:ns ## KEGG: BF3342 # Name: not_defined # Def: putative exported beta-lactamase protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 2 47 406 451 537 77 78.0 1e-13 MAANNLLYTYESQFGVVETMARIEQTLKNMGIPVFAKFDHGKNAQDVMATEYGLDKESVI GKMQKLLEKIVIQSASVY >gi|222159276|gb|ACAB01000083.1| GENE 79 85376 - 87622 1972 748 aa, 
chain + ## HITS:1 COG:no KEGG:Fjoh_4747 NR:ns ## KEGG: Fjoh_4747 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 3 746 4 745 746 929 58.0 0 MKKDLLKVSIRQHAIYLPAIEGTEKREALTSTTVTLVAQLRKVGYSLSEELLHAVNQLYP AQQVEILQVMKEVLGVSLNWAPLVKGWDTPTGETRLDHWITWLANMFNSKKGVKLSCGHV IPDNTFPLERYNGCPFCGTPFETASTEYFGQASKLKVLELWQEKELNVFFGDLLESRTAL DATQADSLKILLAELPLPAVGIKMKETLMLVIDTLVEQDRAQETQIYFSAPNDILRYLWY KKTGFLQIIEPKTLIRKAGRNNAHLCNALDKSRSAAQAKREELKLKYTRRECKMVALWLN NLAMTPEKSCEMMHSKREMWVRMIRALRLAEYARKPGFENLKELMDVFYCQAYTVWQGEV ERSRLKADAAQTFALLKQRPGMFARSLFANMLWFGPEETLAAFKEVVHLLPARLVVTLGM YAESYFEQGHKRMVKPLGGNALLIEPHYLVSLYMEDQLKEMVKEVQDLCKEVVATRFANA GAGSGSASMYIDPMLFHIPLSIGDRSETVQDTSCALQGTRFPVEGDKVRLFMQWGKGLPA QHLDMDLSCHIALPSTTEVCSYFNLKAIGAKHSGDIRSIPDKKGTAEYIELDLNELDRVG AQYVAFTCNAYSNGAISPNLVVGWMNSAYPMKISERNGVAYDPSCVQHQVRVSQSVQKGL VFGVLKVKEREVVWLEIPFGGQTVLSLDTQTIEKYLDKLEAKTTVGELLAIKAQAQGLKL ADTPEADEVYTREWALNTAAVTKLLLGD >gi|222159276|gb|ACAB01000083.1| GENE 80 87892 - 88509 532 205 aa, chain + ## HITS:1 COG:no KEGG:BT_1424 NR:ns ## KEGG: BT_1424 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 205 1 205 205 313 77.0 2e-84 MSAYYDLYETPDVQNTGEKQPLHARIVPSGTYSQKEFIERVSRYQHFPQNMVDGVLGAVI DELGSLLARGYIVELGELGHLSVSLKCTQKVMTKKEIRSESICFDNVHLRTSKNFKLKVR REMRLERVPKSGRTVSKTEIPIEQRLQMLQEFLKKNGGITRIEYSRLTGVARLKAVDDLN TFIQEGKLRKRGAGRNVFYVWKQEE >gi|222159276|gb|ACAB01000083.1| GENE 81 88587 - 90299 1710 570 aa, chain + ## HITS:1 COG:no KEGG:BT_4445 NR:ns ## KEGG: BT_4445 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 570 1 570 571 1031 89.0 0 MARITIPYAVADFIDLRERGFYYVDKTDYIPKLEDYNAPVFLRPRRFGKSLLVSTLACYY DRTKAHRFEELFGGTWIGNHPTKEHNSYMIIRYDFSKMVMADTIEGLAQNFNDLNCGPVD VMVEHNRDLFGDFQFTTRGDASKMLEEVLNYARSHEFPKVYLLIDEYDNFTNQLLTAYND PLYEEVTTNDSFLRTFFKVIKAGIGEGSIRTCFCTGVLPVTMDDLTSGYNIAEILTLEPN FLNMLGFTYEETETYLRYVLDKYSTGQERFDEIWQLIVSNYDGYRFRPNGDRLFNATILT 
YFFKKFAANAGSIPDELVDENLRTDINWIRRLTLSLDNAKAMLDALIIDDELPYNVADLA SKFNKKKFFDKEFYPISLFYLGMTTLKDKFVTTLPNMTMRSVYMDYYNQLNKIEGNAQRY VPVYRYYDSNRSLEPLVQNYFEQYLGQFPAQVFDKINENFIRCSFYELVSRYLSSCYTFA IEQNNSVGRSDFEMTGIPGTDYYTDDRVVEFKYYRAKEAEKMLALTEPLPEHVEQVKRYG EDTKRKFPYYNVRTYVVYICANKGWKCWEV >gi|222159276|gb|ACAB01000083.1| GENE 82 90571 - 90885 58 104 aa, chain + ## HITS:1 COG:no KEGG:BT_1421 NR:ns ## KEGG: BT_1421 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 104 1 104 104 166 82.0 3e-40 MYHKRNIKVYIAWVLLMTFIPFFVVKTFHYHGSEDETSCSHAQHSHNPSDDCAICKYSLF LFTEPQPVEFHCTLTLVPYEPVIYQDKVVCKRTYSHHLRGPPVA >gi|222159276|gb|ACAB01000083.1| GENE 83 90973 - 93132 1695 719 aa, chain + ## HITS:1 COG:PA0781 KEGG:ns NR:ns ## COG: PA0781 COG1629 # Protein_GI_number: 15595978 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Pseudomonas aeruginosa # 31 714 39 687 687 121 23.0 5e-27 MRIGIISGVIGLFITLSVHAQKSDSIKSMLLPDVVVTETYQQRQAKKSALTVDVADQDFL RKHFTGNFMQAMENIPGVQAMDIGSGFSKPMIRGMGFNRIAVLENGIKQEGQQWGADHGL ELDAFNIGTVNVLKGPSSLLYGSDAMGGVIDVVPPAVPMDNRVFGDVTLLGKSVNGTIGG SLMLGIKKNAWYSHIRYSEQHFGDYRIPTDSIVYLTQRIPIYGRKLKNTAGIERNIGFFT QYQRRAYRANFSVSNVYQKTGFFPGAHGIPDASRVEDDGDSRNIELPYSKVNHLKVTTHQ QYAWEKLILSGDLGFQNNHREEWSAFHTHYGSQPAPEKDPDKELAFNLNTFSASVKARFI GSSSWEHTLGWDGQHQRNDISGYSFLLPEYRRSTTGMLWLTTYRPNNVFSVSGGARYDYG YINISSHEDVYLADYLQKQGYTQEQIDLYKWNSHQVKKHYGDYSLSLGLVWTPSDKHLVK VNIGRSFRLPGANELAANGVHHGTFRHEQGDANLKSEQGWQLDASYHLKYRRISFSVSPF VSWFSNYIFLRPTGEWSVLPHAGQIYRYTGAEALFAGTEATVDVDFLRNFNYRISAEYVY TYNCDEHIPLSFSPPPVMRNTLTWQKNRYMLYAEWQSIARQNRVDRNEDRTAGANLFHLG GSLNIPIGGNNEIEITLTARNIFNTRYYNHLSFYRKVEIPEPGRNFQILIKVPFKKLLK >gi|222159276|gb|ACAB01000083.1| GENE 84 93153 - 93620 369 155 aa, chain + ## HITS:1 COG:no KEGG:BT_1419 NR:ns ## KEGG: BT_1419 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 155 9 164 164 207 76.0 1e-52 
MVSLLATVTFMFSSCDNDDSSDTIKPLIELHEPEEGQALEIGNEHGVHFEMDLSDDVMLK SYKIEIHSNFDHHSHGGNSRAVQETVDFSFNRSYDVSGQKTAHIHHHDIVIPANATAGDY HLMVYCTDAAGNESYIARNIKLSNEVEDEDHHHNE >gi|222159276|gb|ACAB01000083.1| GENE 85 93774 - 94373 253 199 aa, chain + ## HITS:1 COG:Cj1358c KEGG:ns NR:ns ## COG: Cj1358c COG3005 # Protein_GI_number: 15792681 # Func_class: C Energy production and conversion # Function: Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit # Organism: Campylobacter jejuni # 31 165 30 167 171 83 33.0 3e-16 MMKFPIINRFFPSFKWKVAAVIIGGVIVGGGALFMYMLRAHTYLGDDPAACVNCHIMTPY YATWFHSSHARNATCNDCHVPHENAVKKWTFKGMDGMKHVAAFLTKSEPQVIQAHEASSE VIMNNCIRCHTQLNTEFVKTGKIDYMMSQVGEGKACWDCHRDVPHGGKNSLSGTPGAIVP LPESPVPEWLRKMVNQKDK >gi|222159276|gb|ACAB01000083.1| GENE 86 94413 - 95894 1294 493 aa, chain + ## HITS:1 COG:PM0023 KEGG:ns NR:ns ## COG: PM0023 COG3303 # Protein_GI_number: 15601888 # Func_class: P Inorganic ion transport and metabolism # Function: Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit # Organism: Pasteurella multocida # 52 490 68 506 510 446 45.0 1e-125 MEKKLKSWQGWLLFGGSMVVVFVLGLCVSALMERRAEVASIFNNRKNAIKGIEARNELFK DDFPREYQTWTETAKTDFESEFNGNIAVDALEKRPEMVILWAGYAFSKDYSTPRGHMHAI EDITASLRTGAPVNPTDGPQPSTCWTCKSPDVPRMMEALGVDSFYNNKWGAMGAEIVNPI GCSDCHDPETMNLHISRPALIEAFQRQGKDITKATPQEMRSLVCAQCHVEYFFKGDGKYL TFPWDKGFTVEDMEAYYDEVGFYDYIHKLSRTPILKAQHPDYEIAQMGIHGQRGVSCADC HMPYKSEGGVKFSDHHIQSPLAMIDRTCQVCHRESEETLRNNVYERQRKANEIRNRLEQE LAKAHIEAKFAWDNGATEAQMKDVLALIRQAQWRWDFGVASHGGSFHAPQEIQRILSHGL DRAMQARLAVSKVLAKNGYTGDVPMPDISTKAKAQEYIGLDMDAERAAKEKFLKTTVPAW LEKAKENGRLAQI >gi|222159276|gb|ACAB01000083.1| GENE 87 95918 - 97162 1044 414 aa, chain + ## HITS:1 COG:no KEGG:BT_1416 NR:ns ## KEGG: BT_1416 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 412 1 409 411 717 85.0 0 MWSKPWSYKEGLAIGAGLLIIGILLQMTVGAINWDLFACPVNVIVLLVYIVALIAMHLLR KRVYLFSWLSHYSAAVSALLWVVGMTVVMGLIRQAPSGHAPNASTDLLGFSQMIASWPFV 
LLYFWMVTALGLTILRASFPFKWRRLSFLLNHIGLFVALIAATLGNADMQRLKMTTRMGN AEWRATDDKGQLIELPLAIELKDFTIDEYPPKLMLIDNETGRTLPEKSPEHVLLEEGVIK GTLQDWQLTIEQSIPMAASVATEDTVKFIEFHSMGATYAVYLKAFNQENQTTKEGWVSCG SFLFPYKAIRLDSLTSLVMPEREPQRFASEVKIYTQEGTITEGTIEVNRPMEIEGWKIYQ LSYDETKGRWSDVSVFELVRDPWLPFVYAGIIMMMAGAVCLFVSAQKRKEEDKA >gi|222159276|gb|ACAB01000083.1| GENE 88 97159 - 97950 643 263 aa, chain + ## HITS:1 COG:all0936 KEGG:ns NR:ns ## COG: all0936 COG0755 # Protein_GI_number: 17228431 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome c biogenesis, permease component # Organism: Nostoc sp. PCC 7120 # 166 260 253 347 351 86 42.0 6e-17 MNWEYFILFAIAALVCWALGAFAAWKGTKPGWAYGFTFLGLAIFFSFIIGMWISLERPPM RTMGETRLWYSFFLPLAGLITYVRWKYKWILSFSCILSFVFICINIFKPEIHNKTLMPAL QSPWFAPHVIVYMFAYAMLGAATVIAVYLLWFKKKKIERKEMDLCDNLTYVGLAFMTLGM LTGAIWAKEAWGHYWAWDPKETWAAATWFAYLAYIHFRLGKPLKARPALVILLVSFVLLQ MCWYGINYLPSAQGVSVHTYNLN >gi|222159276|gb|ACAB01000083.1| GENE 89 97980 - 99260 1136 426 aa, chain + ## HITS:1 COG:no KEGG:BT_1414 NR:ns ## KEGG: BT_1414 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 426 1 428 428 699 81.0 0 MKTFYWSFLLMLLPSMAYTQNTEKENEFTMSMQIRPRAEYRNGALTPRDEGVAPTSFINN RARLSMDYKRSDLELKMSAQHVGVWGQDPQIEKNGRFMLNEAWAKMNFGEGFFAQLGRQS LIYDDERILGGLDWNVAGRYHDALKLGYANKNNEVHLILAFNQNNDNRTSGGTYYDSSTG QPYKNMQTVWYHYKADNVPFGASLLFMNLGLETGDKATDDSHTRYLQTMGTYLTYKNSNW NLDGAFYYQMGKNKTADKVSALMGSIQAAYTFDHTWGAVASFDYLSGDKGNGGKYKAFDP LYGTHHKFYGAMDYFYASTFANGYAPGLMDARIGGRFRASAKVDMELNYHYFSTAVKVQD LKKYLGSEVDYQINWSIMKDVKLSAGYSFMRGTKTMDAVKTGNHKSWQDWGWLSLNINPK ILFVKW >gi|222159276|gb|ACAB01000083.1| GENE 90 99439 - 100128 367 229 aa, chain - ## HITS:1 COG:CAC0884 KEGG:ns NR:ns ## COG: CAC0884 COG0664 # Protein_GI_number: 15894171 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # 
Organism: Clostridium acetobutylicum # 19 229 14 225 229 82 25.0 6e-16 MKKIQLTDSHKEKLFQIPLFRDLPLNIKESLLEKLDFVIYAANKKEIVVTQGTPCNKLFV LLEGKLRTDIIDGLGNEVMIEYIIAPRTFATPHLFNSNNTLPATFTALEDSVILMATKDS TFKVISQDPQVLHNFLCIAGNCNICTVSRLKPLSRKTVRERFIVYLYEHKKKDSLVVDIM HTQSQLAEYLNVSRPALSKEINKMMKEGLVIMEGKRIEILDKTTLEKYL >gi|222159276|gb|ACAB01000083.1| GENE 91 100375 - 100914 496 179 aa, chain + ## HITS:1 COG:MA0418 KEGG:ns NR:ns ## COG: MA0418 COG0655 # Protein_GI_number: 20089311 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 1 179 1 179 179 253 66.0 1e-67 MAKKVLIISSSPRKGGNSDLLCDEFMKGAIEAGNEVEKIFLKDKTVHPCTGCSVCSMYGK PCPQKDDAAEIVEKMIAADVIVMATPVYFYTMSGQMKIMIDRCCARYTEITNKEFYFIIA AAENDKAMMERTIDGFRGFLDCLEGPQEKGTVYGIGAWKVGEIKDTPYMQEAYNMGKMV >gi|222159276|gb|ACAB01000083.1| GENE 92 101167 - 101988 566 273 aa, chain + ## HITS:1 COG:MTH1101 KEGG:ns NR:ns ## COG: MTH1101 COG1237 # Protein_GI_number: 15679112 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily II # Organism: Methanothermobacter thermautotrophicus # 4 270 2 260 260 137 35.0 3e-32 MGYKITTLVENCVYGRKLQAEHGLSLYIETQEHRLLFDTGASDLFIRNARLLHIDLQKVD YLILSHGHSDHTGGLRYFLELNTQATVVCKCEVFSPKFKDERENGMMHTQNLDRSRFRFI TEQTELLPGVFLFPSIDIINQEDTHFERFWVQQEDGCKIPDTFQDELAMVLVEPEGFSVL SACSHRGITNILRTVQAAFPESPCKLLLGGFHIHNAEKQKYQVIADYLQEYLPRQIGVCH CTGVDKYAFFYKDFGDSTFYNYTGKLIQTDLSE >gi|222159276|gb|ACAB01000083.1| GENE 93 102006 - 105365 2869 1119 aa, chain + ## HITS:1 COG:VCA0045 KEGG:ns NR:ns ## COG: VCA0045 COG0793 # Protein_GI_number: 15600816 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Vibrio cholerae # 748 1102 23 379 394 285 42.0 3e-76 MKKLITSLALVLSALSSYAITPLWMRDARISPDGSEIVFCYKGDIYKVPAQGGTAVQLTT QTSYEANPVWSPDGKQIAFASDRNGNFDLFIMPADGGAARRLTYHSASEIPSTFTPDGKY VFFSASIQDPASSALFPTGAMTELYKVPVAGGRTEQVLGTPAELVCFDKTGKNFLYQDRK 
GFEDEWRKHHTSSITRDIWLYNTQTGKHTNLTNRGGEDRNPVYAPDGNAIYFLSERDGGS FNVYTFSLNTPQEVKAMTTFKTHPVRFLSVSDKGTLCYTYDGELYTQKSGARPEKVKVEL VRDDEQQLATLKFSQGATSASVSPDGKQIAFIVRGDVFVTSTDYATTKQITNTPAKEAAV SFAPDNRTLVYASERTGNWQLYTAKISRKEDPNFPNATLIEEEVLLPSKTVERTYPQYSP DGKELAFIEDRNRLMVLNLKTKKVRQVTDGSTWYNTGGGFDYEWSPDGKWFTLEFIGNRH DPYSDVGIVSAQGGAITNLTNSGYISGAPRWVLDGNAILFQTERYGMRAHASWGSQQDVM LVFLNQDAYDRYRLSKEDFELLKEFEKEQKKAKEKDDEKTKDAKKSKAEKADKGNVNKGK ADKSKADKEKSKVDSSKDSNQDESADDKADQKELLVELNGIEDRIVRLTPNSSDLGSAIL SKDGEDLYYFSAFEDGYDLWKMNLREKDTKRLHKLNTGWASLMLDKKGDVFLLGSRIMQK MDAKSDALKSISYQAEMKMDLAAERETMFDHVYKQHQKRFYNVNMHGVDWDAMTNAYRKF LPHIDNNYDFAELLSEWLGELNVSHTGGRYSPKGKGDVTSNLGLLFDWNYQGKGMQIAEV IEKGPFDHSRTKVKAGCIIEKINGEEITLDNDITCLLNNKAGKKTLISIYNPQSKERWEE VVMPVTNGQLNGLLYKRWVKQRAADVEKWSKGRLGYVHIQSMGDDSFRTVYSDILGKYNN CEGIVIDTRFNGGGRLHEDIEILFSGQKYFTQVVRGREACDMPSRRWNKPSIMLQCEANY SNAHGTPWVYKHQKIGKLVGMPVPGTMTSVSWETLQDPSLVFGIPIVGYRLPDGSYLENT QLEPDVKVANNPETVVKGEDTQLKVAVDELLKEIDSQKK >gi|222159276|gb|ACAB01000083.1| GENE 94 105522 - 106193 477 223 aa, chain - ## HITS:1 COG:CAC0198 KEGG:ns NR:ns ## COG: CAC0198 COG2364 # Protein_GI_number: 15893491 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 1 207 1 203 227 158 45.0 9e-39 MEMKEIFKRYLVFVIGLYFLAAGIVLIIRSTLGTTPISSINYVLSLNSPLSLGTCTFIIN MVLILGQFWLIRKNRTRQDLIEILLQLPFSFIFSAFIDFNMMLTSELHPANYGMSIALLL TGCMVQSIGVVLELKPRVAMMSAEAFVKYASRHYNKEFGKFKVYFDITLVTLAVILSLLL TQGIQGVREGSLIAACITGYIVSFLNQKIMTRKTLHRLLPVWK >gi|222159276|gb|ACAB01000083.1| GENE 95 106349 - 107239 559 296 aa, chain + ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 108 293 95 287 288 84 30.0 3e-16 MVKSRIFDNTQLLKEHPELPHYKEEVVCFRSEYKDRSYPANECYFNRELYMIFVLEGRSE ILLNGEFIAIEPNMLLIHGANYLTEHLYSSRDIKFITLALSESMRTDDSYLTQITAILLA 
TMRRNKQYTIQLTEYEAQIIHKELEVLMHLMNIEHHFLFRRIQAACNSLFLDIADFLSRK TVIKKEISRKDHVLQEFHALVTRNFREEHFVSFYADKLAISEQYLARIVRLGTGKTINSL INELLVMEARTLLTSTKFTVGEIATKLGFSDAAGFCKFFKRNAGQTPLNYRKGLLI >gi|222159276|gb|ACAB01000083.1| GENE 96 107409 - 109166 1824 585 aa, chain - ## HITS:1 COG:CAP0106 KEGG:ns NR:ns ## COG: CAP0106 COG1154 # Protein_GI_number: 15004809 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Clostridium acetobutylicum # 1 584 1 585 586 829 69.0 0 MYLENIYSPADVKKLSFKELNDLSNEIRASLLQKLSEHGGHFGPNFGMVEATIALHYVFN SPEDKIVFDVSHQSYVHKMLTGRKDAFLYPAEYDNVSGYSEPQESKHDFFVIGHTSTSVS LASGLAKGRDLIGGNENIIAVIGDGSLSGGEAFEGLDYMAELGTNMIIIVNDNQMSIAEN HGGLYKNLKELRDSNGQCECNFFKAMGLDYIYVNDGNDVQALIEAFSKVKDIQHPIVVHI NTLKGKGYAHAEQDKETYHWRTPFNPETGEAKVSYEEEDYSEVTAQYLLKKMKEDSRVVT ITSGTPAVLGFTPDRRKEAGKQFVDVGIAEEHAVALASGIAANGGKPVYGVYSTFIQRSY DQLSQDLCINNNPAVLLVFWGTLSGMNDVTHLCFFDIPLISNIPNMVYLAPTCKEEYLAM LEWSIRQNEHPVAIRVPATDVITCGEPVETDYSVLNRYKVTHRGSKVAILALGSFYGLGQ SVASLLKEKANIDATLINPRYITGVDNELMDELKADHELVITLEDGVLDGGFGEKIARYY GATNMKVLNFGAKKEFVDRYDIQEFLRANHLTDEQIVEDITAVIG >gi|222159276|gb|ACAB01000083.1| GENE 97 109345 - 110112 453 255 aa, chain + ## HITS:1 COG:BH3955 KEGG:ns NR:ns ## COG: BH3955 COG0500 # Protein_GI_number: 15616517 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Bacillus halodurans # 13 250 11 252 255 115 30.0 9e-26 MNSDYKVGDLIYDANIYDAMNTNLDDWYFYKRWLPENKDARILELCCGTGRLTLPIAKEG YDITGIDYTPSMLAQAKIKASEAGLEISFIEADIRTLNLPEKYDFIFIPFNSIHHLYKNE DLFKVFNVVKNHLKDGGLFLFDCFNPNIQYIVEGGKEQKEIAAYTTDDGREVLIKQTMRY ENKTQINRIEWHYFINGKFNSIQNLDMRMFFPQELDSYLEWNGFHINHKYGGFEEEAFDD NSAKQIFICQCDLVY >gi|222159276|gb|ACAB01000083.1| GENE 98 110524 - 111555 847 343 aa, chain + ## HITS:1 COG:RSc0206 KEGG:ns NR:ns ## COG: RSc0206 COG1073 # Protein_GI_number: 17544925 # Func_class: R General function 
prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Ralstonia solanacearum # 10 342 6 342 342 491 72.0 1e-139 MKRILLLTTVFMMMLGTSFSSAQTDADNFYKSDLVSVEKVSFSNQYKMKVAGNLFLPKNM KEGDKYPAIIVGHPMGAVKEQSANLYATKMAERGFVTLSIDLSFWGESEGEPRNAVLPEV YAEDFSAAVDFLGTRPFVDRNLIGVIGICGSGSFAISAAKIDPRLKAIATISMYNMGTAS RNGLKHSLTLEQRKQIMAEAAEQRYAEFLGGETQYTGGTVHQLTEKSSPIEREFYEFYRT QRGEFTPDGATPMTTTHPTLSSNVKFMNFYPFEDIETISPRPMFFITGENAHSREFSEDA YRLAAEPKELYIVPGAGHVDLYDRVSLIPFDKLESFFKEYLKK >gi|222159276|gb|ACAB01000083.1| GENE 99 111611 - 112474 654 287 aa, chain + ## HITS:1 COG:no KEGG:BT_1399 NR:ns ## KEGG: BT_1399 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 287 1 287 287 516 83.0 1e-145 MEQIRNKEYGGERPLFATHDLQLEDVTIHAGESALKECSNIIAINCRFEGKYPFWHTNGF IVKNCLFTEGARAALWYSQSLQMADTLVEAPKMFREMDGIKLENVQLPNALETFWYCRNI DLKNVQIDKADYLFIHSENIKIQNYAQNGNYSFQYCKNVEIRNAVINSKDAFWNTENVTV YDSELNGEYLGWHSKNLRLVNCKISGTQPLCYAHDLVMENCTMAEDADLAFENSSVKATI KSPVHSVKNPRTGSIVAESFGAIILDENLKVPGNCELKLWDDLTCFN >gi|222159276|gb|ACAB01000083.1| GENE 100 112478 - 113635 816 385 aa, chain + ## HITS:1 COG:CAC2970 KEGG:ns NR:ns ## COG: CAC2970 COG1168 # Protein_GI_number: 15896223 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Clostridium acetobutylicum # 3 385 2 382 384 396 46.0 1e-110 MKYNFDEIIPRRGTNSYKWDSAEDADVLPMWVADMDFRTAPPVVEALKKRVEHGIFGYVR VPDAYYEAVVNWFARRHAWRMEKEWIIYTTGVVPAISAVIKALTLPGDKVMVQTPVYNCF FSSIRNNGCEMIANPLVYRNRGYQIDLDDLERKTLDPKVKLLLLCNPHNPAGRVWSKQEL RRIGEICLRNNVFVVADEIHCELVFLGHEYTPFASISEEFLLNSVTFVSPSKAFNLAGLQ IANIISADADVRVRINKAINMNEVCDVNPFGVEALIAAYNEGEEWLEELKIYLFANYIYL KGYFETYLPEFPVMMLEGTYLVWVDCSVLHQASEEIVKDLLKKEKLWVNEGSLYGEAGEG FIRINIACPRQRLIDGLNRLKRALK >gi|222159276|gb|ACAB01000083.1| GENE 101 113706 - 114632 795 308 aa, chain + ## HITS:1 COG:PA4783 KEGG:ns NR:ns ## COG: PA4783 COG0697 # Protein_GI_number: 
15599977 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Pseudomonas aeruginosa # 17 301 13 284 296 104 32.0 2e-22 MESLNCSKGNTVGIVVAYFLIYVVWGSTYFFIGVALKDFSPFLLGALRFTAAGIILLGIC YLRGEQIIKKSLVKRSAVSGIVLLFVDMAVVMLAQRYLTSSLVAIIASSTALWILLLDVP MWRTNFRNPLTIMGGLIGFGGVVMLYAEQLNVRWLHSYSERGILLLIFGCISWALGTLYA KYRSSREEKVNAFAGSAWQMLFASGMFWLCAVINGDVREADLREVSVTSWLSLLYLVSFG SLLAYSAYVWLLKVRPAAEVGTHAYVNPFIAVLLGVFLGNEQVTFIQVSGLLVILLGVML ISRKRKAQ >gi|222159276|gb|ACAB01000083.1| GENE 102 114718 - 115293 255 191 aa, chain + ## HITS:1 COG:all1011 KEGG:ns NR:ns ## COG: all1011 COG0110 # Protein_GI_number: 17228506 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Nostoc sp. PCC 7120 # 2 189 9 191 192 187 51.0 7e-48 MKTEKEKMLAGEVYDCADNELLTRWHKAKQLQQEYNNTLTTDAGKISDLLDELIGSRGDN VWISAPFFVDYGENIYIGKNVEINMNCVFLDCNKIVIGDNSGIGPGVHIYTVFHSTKALE RTSENSTFWKSQTAPVIIGNNVWIGGGCIILPGVTIGDNTTIGAGSVVTKSIPANVLAVG NPCRILKQIED >gi|222159276|gb|ACAB01000083.1| GENE 103 115447 - 116358 701 303 aa, chain - ## HITS:1 COG:BS_yesN KEGG:ns NR:ns ## COG: BS_yesN COG4753 # Protein_GI_number: 16077763 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus subtilis # 215 302 269 361 368 66 37.0 6e-11 MEENYKDEMIVIDNVNRYNEIFGLETKHPLVSIIDLTKATTWPTRAWFRYEVYALFLKNV KCGDIKYGRQYYDYQDGTIVCFGPGQITDLELIQNIQPNAHGLLFHPDLIRGTSLGQEIK NYSFFSYETNEALHLSEEERQIVMDCLQKIVIELNHAIDRHSRRLICTNIRLLLDYCMRF YERQFETRNKVNNDIIVRFEHLLNEYFEGDAPQHLGLPSVKYFADKVFLSPNYFGDMIRK QTGKTVSEYIQDKMIELAKEQLLSSDKTTSQIAYEIGFQYPQHLSRMFKRIVGMTPNKFR TQT >gi|222159276|gb|ACAB01000083.1| GENE 104 116381 - 116599 65 72 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLAEIAAANNRKAISNFFILITYDSSNFSIKLPITKLLGIIHPGCRRFTDIYTPIMDFED 
GELILWIKELYL >gi|222159276|gb|ACAB01000083.1| GENE 105 116544 - 116939 392 131 aa, chain + ## HITS:1 COG:SMa0558 KEGG:ns NR:ns ## COG: SMa0558 COG1359 # Protein_GI_number: 16262744 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sinorhizobium meliloti # 1 121 43 160 170 94 43.0 6e-20 MKKLLIAFLLFAAAISASIIKPVEQFATENNKVRLSRITVDSARLEEYNAYLKEEIEASM RLEPGVLVLYAVAEKERPHHVTILEIYVDEAAYKSHIATSHFQKYKKGTLDMVQSLELVD TTPLIPGLKIK >gi|222159276|gb|ACAB01000083.1| GENE 106 117055 - 118260 1044 401 aa, chain + ## HITS:1 COG:MA0409 KEGG:ns NR:ns ## COG: MA0409 COG0599 # Protein_GI_number: 20089302 # Func_class: S Function unknown # Function: Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit # Organism: Methanosarcina acetivorans str.C2A # 173 400 22 250 250 322 67.0 7e-88 MRKLFFIIVTAIGMMQIPMQSINAQNMKKEEVPQNISAFPVGKANTGFEQYFTGRSWLAP LTGNKDLNVPMSNVTFEPGCRNNWHSHTGGQLLIAVGGVGYYQERGKAARRLLPGDVVEI APNIEHWHGAAPDSWFSHLAIGCNPQTNQNTWLERVDDQQYAEATKGNVAIGLQATDPEL DDIFSNFTKEVQEYGALDIKTRLMVTLASNIASQAQAEYRITLESALNEGITSIEIKEIL YQSVAYAGMAKVRDFIGITNDILLARGVRLPLEGQSVVSSETRFNKGLELQKSIFGERIE QMHKNAPDNQKHIQRYLSANCFGDYQTRSGLNVKTRELITFSILVSLGGCESQVKSHIQG NVNVGNNKDTLLAVVTQLLPFIGYPRTLNAIACLNEVIPEK >gi|222159276|gb|ACAB01000083.1| GENE 107 118937 - 120523 1505 528 aa, chain + ## HITS:1 COG:no KEGG:BDI_2898 NR:ns ## KEGG: BDI_2898 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 12 526 6 520 522 724 69.0 0 MDIMDRLKGPRRNLPIGIQSFERLRRDGYLYVDKTAFVYELASTGNPYFLSRPRRFGKSL LLSTLEAYFCGKKELFEGLAIEQLETEWNEHAVLHLDLNAEDYSEIEGLRNGLELQLRSW EKVYGTSEEGLSYSGRFMQVIKEAYRQTGRGVVVLIDEYDKPLLRSMHNPELQDKFREML TAFYTVLKSADPWLRFVFITGVTKFAQMGIFSTLNQLIDISFDPQYNALCGMTRPEIEAN FIPELERLAERNGLTKEDCMAYLTRMYDGYHFNYVRKEGMYNPFSILNVLRSGMFENYWF ASGTPTFLAEMLKKTHYDLRELDGLEVTAASLTDDRADVNNPVPMIYQSGYLTIKGYDTE LRLYKLGYPNDEVKYGFLNFITPFYTSLDESKAPFYIGQFVKELRAGDVEAFLTRLRAFF ADFPYELHDKTERHYQVVFYLVFKLLGQFINAEVQSALGRADAVVKTANAVYVFEFKLNG 
TAEEALAQIDNRGYLIPYTTNGCRIVKIGAEFSKEERNLSRWLVEEEG >gi|222159276|gb|ACAB01000083.1| GENE 108 120926 - 122059 1135 377 aa, chain + ## HITS:1 COG:no KEGG:BT_1391 NR:ns ## KEGG: BT_1391 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 377 1 375 375 633 86.0 1e-180 MIMRKSIILLALILGGVTANAQSVVEGTKVTDNWSVEVNAGAITPLTHSAFFKSMRPAFG VGVSKQLTPIFGLGFQGMGYINTTPSKTSFDASDVSVLGKVNLMNLFAGYKGEPRLFEVE AVAGMGWLHYYVNGDGDQNTWSTRLGLNFNFNLGESKAWTLGIKPAIVYDMQGTYPETKS RFNANNAGFELTAGLTYHFKTSNGTHHFAKVRVYNQAEIDGLNSSINALRSEVNNKDGQI SNANQRINGLQEELEACRTKVVPVETVVKTARVPESIITFRQGKSSVDASQLPNVERVAS YMKKYADSKVVIKGYASPEGSVEVNARIAAARAEAVKTILVNKYKISASRITAEGQGVGD MFTEPDWNRVSICTIED >gi|222159276|gb|ACAB01000083.1| GENE 109 122275 - 123348 1241 357 aa, chain + ## HITS:1 COG:ZmolR.A_1 KEGG:ns NR:ns ## COG: ZmolR.A_1 COG3831 # Protein_GI_number: 15802594 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 EDL933 # 3 69 2 68 94 82 58.0 1e-15 MKRVFVFQDFKSQKFWSIEVVGTDVTVNYGKLGTAGQTQVKNYATTEEAEKAANKLIAEK TKKGYVETAEETAREMKVEAKKYTLSYDEYENDVKLLDKILKDKHLSEYKQITIGCWDYE GDDCSALLQGLIENKDKFAQIEGLFWGDIEQEEQEISWIEQADLSPLLDSMPKLKDLKIK GTNNLRLGKTSRPELRSLEIISGGMPTEVVEDILASDFPNLEKLILYVGVEDYGFEGDIE IFRPLFSKERFPKLTYLGLVNSEEQDSIVEMFLESDILPQLETMDISAGTLKDEGAQLLL DNMDKIAHLKFINMRYNYLSKDMKKQLQNLPMKIDIAETEEADEYDGELWYYPMITE >gi|222159276|gb|ACAB01000083.1| GENE 110 123352 - 124362 596 336 aa, chain + ## HITS:1 COG:no KEGG:BF1940 NR:ns ## KEGG: BF1940 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 336 1 336 336 371 55.0 1e-101 MQIIVVSNSLSKRIEYFVEAGKHLQTEVCFMTYEELFNCLPLLRQAVIKLEPCVSDETDF LKYALLNQAYKETLQRLSEMSLPDDVCFLNTPDALLRALDKKETKEVLMDKGLKVTPMLP SPHSFDELRQLLADCGRGCFFKPRYGSGAGGIMAIRYQPNRNKWVVYTTLRQVDGVIHNT KRINRLSVEKEIIPLAEAVIQTEAILEEWIPKAQLQGENYDLRVVCRESEIDYIVVRCSK GSITNLHLNNKAHWWNELSLSEEVRQQIYFQCQEAVQSLDLQYAGVDVLIERGTDIPYII EVNGQGDHVYQDMFAHNSIYTQQIKNIKKKYNHANR 
>gi|222159276|gb|ACAB01000083.1| GENE 111 124349 - 125203 597 284 aa, chain + ## HITS:1 COG:no KEGG:BF1879 NR:ns ## KEGG: BF1879 # Name: not_defined # Def: putative membrane-associated metal-dependent hydrolase # Organism: B.fragilis # Pathway: not_defined # 1 283 1 283 283 498 81.0 1e-139 MQIDELPAGEQTRHPDLDMNQVVGTHDILMLCFDTLRYDVSKEEEAAGRTPVLNSHGGEW EKRHAPGNFTYPSHFAIFAGFLPSPAEPHSLRSRNWLFFPVQAGTGRIPPKGSYPFTEAT FVQSLAHTGYETICIGGVNFFSKRNELGRVFPGYFTKSYWLPSFGCTAPDSTEKQIDFAL KKLENYPDDKRIFMYINFSAIHYPNCHYVEGKMKDDKESHAAALQYIDSQLPCLFQAFQK RSNTLVIAFSDHGTCYGEDGYEYHCISHETVYTVPYKHFILTKQ >gi|222159276|gb|ACAB01000083.1| GENE 112 125200 - 126474 671 424 aa, chain + ## HITS:1 COG:STM4012 KEGG:ns NR:ns ## COG: STM4012 COG0635 # Protein_GI_number: 16767277 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Salmonella typhimurium LT2 # 8 415 7 405 413 275 38.0 1e-73 MNQPLPRYVDYMYSYPHKTAYRPFPSPVSLLPYLEQVEGQKASLYFHIPFCSHKCGYCNL FSLQTNRADYIATYLETLHKQAQQLSPLTTGLIFDSFAIGGGTPLLLTVSQLEYLLETAA LFGVHPSHAFTSIETSPEYADPARLTWLKQAGVARISIGIQSFLNEELTALKRRPRQDTI NQALETIRKMEFPFFNIDLIYGIKGQTVDSFLYSLEQALLFQPNELFIYPLYVRQGTAIT EREPDDVCFRMYCAARDLLKDRGFLQTSMRRFIHHPSTDAEISCGDEVMLSCGSGGRSYL GHLHYATRYTVSQHCIAGEIDDYMGTADFTVARNGFILSQEERRQRFIIKNLMYYMGLDK AEYERRFGESPDDIPLFRQLAERQWIESTDDRICLTSEGIGYSDYIGQLFITPEIRGLME TYSY >gi|222159276|gb|ACAB01000083.1| GENE 113 126480 - 128015 614 511 aa, chain + ## HITS:1 COG:no KEGG:BF1943 NR:ns ## KEGG: BF1943 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 11 507 3 498 500 553 51.0 1e-156 MEMKNEGDLLSIKRIYYRGKLNSCNYTCSYCPFGKKSHLTATTQDEQAWNRFITAIEQWQ GGPLQLFIIPYGEALIHRYYRKGIIHLAALPQVAGISCQTNLSFSADEWLDEFSATPTLI SKIKIWASFHPEMTSVESFVRRLHTLYNAGIQVCAGAVGNPMAKSVLSDLRNALLPDIYL FINAMQGLKSPLSVEDIRFFTQLDNLFEYDLKNASAQWDICSGGRSSCFIDWKGDIFGCP RSQVKIGNLYQNQILDPLLPCRRKVCDCYIAFSNLTNHPLHRIMRAGAFWRIPDKPFITS 
VFFDVDGTLTDAQGRVSESYAHALRYIAQFVPLYLATSLSMQQARRKLGKALFSLFEGGV FADGGLLVYAGQNRCMPVELLLDINEESAKITAHSYEGQVYKYSMLVYDKEERINILSRL KEKPYQVFYKPPLITVIHKDVDKRKGVLHICKALASPLDQVLVVGNSLKDWEMMSVVSHS CAVMNAEPLLKERARYTLNPDRLAAFFRFRE >gi|222159276|gb|ACAB01000083.1| GENE 114 128276 - 129640 926 454 aa, chain + ## HITS:1 COG:L170983 KEGG:ns NR:ns ## COG: L170983 COG0534 # Protein_GI_number: 15672149 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Lactococcus lactis # 5 396 3 396 446 161 30.0 3e-39 MKELNLTQGSVPKVLLQFAVPFLIANVLQALYGGADLFVVGQYDDSASVAAVAIGSQVMQ TITGIILGITTGTTVLIAIAIGAKDDRKVAFTIGSSVWLFSIVGVLLTLVMLAFHGQITE LMHTPAEAMADTKNYILVCSAGILFIIGYNVVCGILRGLGDSKTPLYFVGLACIINIVLD FILVGYFHWGATGAAIATVTAQGVSFGIALWFLYRHGFHFDFSRKDIRLNRNLSKKIMVL GAPIALQDALINVSFLIITVIVNQMGVIASASLGVVEKIIVFAMLPPMAISSAVATMTAQ NYGAGLIKRMNKCLASGIGIALVFGVSVCVYSQFLPETLTAFFTKDAAVVAMAAKYLRGY SIDCIVVSFVFCINSYFSGQGNSLFPMIHSLIATFLFRIPLSYWFSQMDSSSLFIMGFAP PISTVVSLLICIWYLRYTQRKLYLRGTLMPAMSN >gi|222159276|gb|ACAB01000083.1| GENE 115 129681 - 130229 496 182 aa, chain + ## HITS:1 COG:no KEGG:BT_1386 NR:ns ## KEGG: BT_1386 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 182 1 182 182 284 83.0 1e-75 MEAYLKKLCFEYQVDEKEVKELLARMEMVYLDKGETIASATMPEQSLYIIVSGILHTYTT HEGEDRTIRFLSEGDAVLCYNSSQYTVKALTKCAAYYISEEEIEELCATSISFANLVRQL MEYQFYFKEEENMNVRKLTVRERYLSLLAEIPDILYRVPLKYITHYLGADVTSLGYMAGS SK >gi|222159276|gb|ACAB01000083.1| GENE 116 130352 - 131230 412 292 aa, chain + ## HITS:1 COG:BMEII0641 KEGG:ns NR:ns ## COG: BMEII0641 COG2207 # Protein_GI_number: 17988986 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Brucella melitensis # 1 287 3 291 307 74 23.0 2e-13 MKNSIPRYTFYKNKYGSELLVDVVELKYVKKFLAESAIHTLNYYDVTFITEGKGTFSVDN QTYEVTPCDVLFSKPGEIRNWDTRHIINGYALIFEEEFLSSLFKDSLFVQHLSFFQLESF SSRLHLPDELYARILQILHHIKMEIDSYQQNDVHVLRALLYEVLMLLDRAYLKMTSIEEG 
RSREASNNHVSKFMNLVNIHSKEQHSVQYYADKLCITSNYLNEMVTSTMGFSAKQYIQNK VMDEAKRLLVYTNVPISDIAFELCFSTVSYFIRSFRQHTGETPLLYRKTHKP >gi|222159276|gb|ACAB01000083.1| GENE 117 131322 - 131744 373 140 aa, chain + ## HITS:1 COG:CAC3491 KEGG:ns NR:ns ## COG: CAC3491 COG3871 # Protein_GI_number: 15896728 # Func_class: R General function prediction only # Function: Uncharacterized stress protein (general stress protein 26) # Organism: Clostridium acetobutylicum # 12 140 13 140 145 119 43.0 1e-27 MMRDAEKTVGNMIDKLKTAFIGSIDREGFPNIKAMLQPRKREGIKTIYLTTNTSSMRVAQ YRENNHACIYFCDTRFFRGAMLRGTMEVLTDSASKEMIWQEGDTMYYPEGVTDPDYCVLK FTATSGRFYSNFKSESFIVE >gi|222159276|gb|ACAB01000083.1| GENE 118 131783 - 132022 105 79 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTNKFLGEEASDAHRGTPRSSPRNERFANKKRCIPYVVLCEIDLHFRILLHYFMPFIHWF IVFKVATLLYNLKNSTLRL >gi|222159276|gb|ACAB01000083.1| GENE 119 132321 - 132917 548 198 aa, chain - ## HITS:1 COG:BS_ydeA KEGG:ns NR:ns ## COG: BS_ydeA COG0693 # Protein_GI_number: 16077578 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Bacillus subtilis # 1 194 1 184 197 120 38.0 1e-27 MKEVIFVILEGFADWEGAYIATCLNQGVKPGNPISYKVKTLSITQEPVSSIGGFRVLPDY GLKDMAEDYAGLVLIGGMNWFSPEAELIVPLVEKAIKEKKLVAGICNASVFLGMHGFLNE VKHTSNTLNYLKQYAGDKYTGDSNYINEQAVRDENIVTANGTGQLEFCKEILYALEADTA DAIEESYLFYKNGFCPGA >gi|222159276|gb|ACAB01000083.1| GENE 120 133132 - 133299 68 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295087721|emb|CBK69244.1| ## NR: gi|295087721|emb|CBK69244.1| hypothetical protein [Bacteroides xylanisolvens XB1A] # 1 55 9 63 63 74 98.0 1e-12 MGECDTLGKEMTQLTEDKFLSVLGLSIHKQVNLSSQVIMIQTIITSMTKQIRMKS >gi|222159276|gb|ACAB01000083.1| GENE 121 134093 - 134947 801 284 aa, chain + ## HITS:1 COG:aq_1818 KEGG:ns NR:ns ## COG: aq_1818 COG0788 # Protein_GI_number: 15606867 # Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate hydrolase # Organism: Aquifex aeolicus # 1 283 1 282 283 322 54.0 4e-88 
MTTAKLLLHCPDKPGILAEVTDFITVNKGNIIYLDQYVDHVENIFFMRIEWELKDFLVPQ EKIEDYFRTLYGQKYEMDFRLYFSDVKPRMAIFVSKLSHCLFDILARYTAGEWNVEIPLI ISNHPDLQHVAERFGIPFYLFPITKETKEEQERKEMELLAKHNITFIVLARYMQVISEQM INAYPNKIINIHHSFLPAFVGAKPYHAAFQRGVKIIGATSHYVTTELDAGPIIEQDVVRI THKDAIEDLVNKGKDLEKIVLSRAVQKHIERKVLAYKNKTVIFS >gi|222159276|gb|ACAB01000083.1| GENE 122 134947 - 135537 518 196 aa, chain + ## HITS:1 COG:YPO1545 KEGG:ns NR:ns ## COG: YPO1545 COG0118 # Protein_GI_number: 16121818 # Func_class: E Amino acid transport and metabolism # Function: Glutamine amidotransferase # Organism: Yersinia pestis # 1 196 1 196 196 171 42.0 6e-43 MKVAVVKYNAGNIRSVDYALKRLGVEAVITADKEELQSADKVIFPGVGEAETTMNHLKAT GLDKLIKNLRQPVFGICLGMQLMCRHSEEGEVDCLNIFDVDVKRFVPQKHEDKVPHMGWN TIGKTNSKLFEGFTEEEFVYFVHSFYVPVCDFTAATTDYIHPFSAALHKDNFYATQFHPE KSGKTGEKILTNFLNL >gi|222159276|gb|ACAB01000083.1| GENE 123 135629 - 136348 799 239 aa, chain + ## HITS:1 COG:YPO1544 KEGG:ns NR:ns ## COG: YPO1544 COG0106 # Protein_GI_number: 16121817 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase # Organism: Yersinia pestis # 4 234 2 237 245 188 38.0 7e-48 MIEIIPAIDIIDGKCVRLSQGDYDSKKVYNENPVEVAKEFEANGVRRLHVVDLDGAASHH VVNYRVLERIAAHTSLVIDFGGGVKSDEDLKIAFESGAQMVTGGSIAVKDPELFCHWLEI YGSEKIILGADVKDHKIAVNGWKDESACELFPFLENYMDKGIRKVICTDISCDGMLSGPS IDLYKEMLAKFPDLYLMASGGVSKVDDIVALDEAGVPGVIFGKALYEGHITLKDLRIFL >gi|222159276|gb|ACAB01000083.1| GENE 124 136473 - 137228 743 251 aa, chain + ## HITS:1 COG:aq_181 KEGG:ns NR:ns ## COG: aq_181 COG0107 # Protein_GI_number: 15605750 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate synthase # Organism: Aquifex aeolicus # 1 250 1 250 253 296 58.0 2e-80 MLAKRIVPCLDIKDGQTVKGTNFVNLRQAGDPVELGRAYSEQGADELVFLDITASHEGRK TFTELVKRIAANINIPFTVGGGINELSDVDRLLNAGADKISINSSAIRNPQLIDEIAKNF GSQVCVLAVDAKQTENGWKCYLNGGRIETDKDLFEWTKEAQERGAGEILFTSMNHDGVKA 
GYANDALAALADQLSIPIIASGGAGCKEHFRDVFLQGKADAALAASVFHFGEIKIPELKS YLCGEGITTRG >gi|222159276|gb|ACAB01000083.1| GENE 125 137358 - 137969 649 203 aa, chain + ## HITS:1 COG:hisI_1 KEGG:ns NR:ns ## COG: hisI_1 COG0139 # Protein_GI_number: 16129967 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosyl-AMP cyclohydrolase # Organism: Escherichia coli K12 # 2 100 9 107 112 147 66.0 9e-36 MELDFDKMNGLVPAIIQDNETRKVLMLGFMNKEAYDKTVETGKVTFFSRTKNRLWTKGEE SGNFLHVVSIKADCDNDTLLIQVNPVGPVCHTGTDTCWGEKNEEPVMFLKALQDFIDKRH EEMPEGSYTTSLFESGINKIAQKVGEEAVETVIEATNGTDERLIYEGADLIYHMIVLLTS KGYRLEDLARELQERHSSTWKKH >gi|222159276|gb|ACAB01000083.1| GENE 126 138012 - 138740 289 242 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 6 222 1 219 245 115 29 1e-24 MDDEALIQYKNVEIHQQELCVLSDVNLELHKGEFVYLIGKVGSGKTSLLKTLYGELDVID GEAEVLGYNMRSIKRKHIPQLRRKLGIVFQDFQLLTDRTVYNNLEFVLRATGWKNKQEIQ DRIEEVLQLVGMSNKGYKLPNELSGGEQQRIVIARAVLNSPAIILADEPTGNLDVETGKA IVELLHNICESGSSVVMTTHNLQLLKEYPGRVYRCADHQIVDVTAEYMPRQRTIEIDLNI DN >gi|222159276|gb|ACAB01000083.1| GENE 127 138754 - 140073 1131 439 aa, chain + ## HITS:1 COG:VC0391 KEGG:ns NR:ns ## COG: VC0391 COG0527 # Protein_GI_number: 15640418 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Vibrio cholerae # 3 439 34 479 479 253 35.0 5e-67 MKVLKFGGTSVGSAQRMKEVAKLITDGEQKIVVLSAMSGTTNTLVEISDYLYKKNPEGAN EIINKLEAKYKQHIDELFATQEYKQKGLEVVKSHFDYIRSYTKDLFTLFEEKVVLAQGEL ISTAMVNFYLQECGVKSVLLPALEFMRTDKNAEPDPVYIKDKLRAQLDLYPDTEIYITQG FICRNAYGEIDNLQRGGSDYTASLIGAAVNASEIQIWTDIDGMHNNDPRIVDKTAPVRQL HFEEAAELAYFGAKILHPTCIQPAKYANIPVRLLNTMDPEAPGTLISNDTEKGKIKAVAA KENITAIKIKSSRMLLAHGFLRKVFEIFESYQTSIDMICTSEVGVSVTIDNTKHLNEILD DLKKYGTVTVDKEMCIICVVGDLEWENVGFEAKALDAMRNIPVRMISFGGSNYNISFLIR ECDKKVALQSLSDMLFNDK >gi|222159276|gb|ACAB01000083.1| GENE 128 140152 - 141312 1258 386 aa, chain + ## HITS:1 COG:mlr3508 KEGG:ns NR:ns ## COG: mlr3508 COG0019 # Protein_GI_number: 
13473029 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Mesorhizobium loti # 15 376 27 388 422 281 44.0 1e-75 MKGIFPIDKFRTLQTPFYYYDTKVLRDTLSAINQEVAKYPNYSVHYAVKANANPKVLTII RESGMGADCVSGGEIRAAIRAGFPADKVVFAGVGKADWEINLGLEYGIFCFNVESIPELE VINELAAAQNKVANVAFRINPDVGAHTHANITTGLAENKFGISMQDMDKVIDVAQEMKNV KFIGLHFHIGSQILDMGDFVALCNRVNELQDKLEARRILVEHINVGGGLGIDYGHPNRQA IPNFKDYFATYAGQLKLRPYQTLHFELGRAVVGQCGSLISKVLYVKQGTRKKFAILDAGM TDLIRPALYQAFHKMENITSEEPLEAYDVVGPICESSDVFGKAIDLNKVKRGDLIALRSA GAYGEIMASGYNCRELPKGYTSDEMV >gi|222159276|gb|ACAB01000083.1| GENE 129 141455 - 141934 649 159 aa, chain + ## HITS:1 COG:MTH158 KEGG:ns NR:ns ## COG: MTH158 COG1528 # Protein_GI_number: 15678186 # Func_class: P Inorganic ion transport and metabolism # Function: Ferritin-like protein # Organism: Methanothermobacter thermautotrophicus # 1 159 1 161 171 148 44.0 3e-36 MISEKLQNAINEQITAEMWSANLYLAMSFYFEKEGFSGFAHWMKKQSQEEMGHAYAMADY IIKRGGTAKVDKIDVVPNGWGTPLEVFEHVYKHECHVSQLVDKLVDVAAAEKDKATQDFL WGFVREQVEEEATAQGIVDKIKKAGDTGIFFVDSQLGQR >gi|222159276|gb|ACAB01000083.1| GENE 130 142089 - 144353 1610 754 aa, chain + ## HITS:1 COG:PA0928_1 KEGG:ns NR:ns ## COG: PA0928_1 COG0642 # Protein_GI_number: 15596125 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Pseudomonas aeruginosa # 508 748 259 507 509 153 38.0 1e-36 MNGVNRGIFRLLPIFLLLFLGVSSCSQKEERRILVIHSYEETYAAYPEFNRMIAEQFEKE KIDADIRTVYLDCESYWEEPELERMRFLVDSVSKDWRPEVILVNEDQATYSLMKCGVQLG KEVPVVFGGVNYPNWGLLKRHPNVTGFHDKIAFNENVSVAKELFGEYVRLFTMLDTTYID KQIRRDAKEQFKGHKVTGFIDNPELSPEEQIRLTQEEGYTRFMAIPLRNARNHSDATFMW VLNRSYRDQCYIQLKRDYTTINIGSICGSPSLTAINEAFGFGEKLLGGYITSLPIQVEEE VKAAVRILHGASPADMPIVESRKEYVVDWNTMTQIGLSKESIPAKYRIINIPFSDKYPLL WGISVASFILILIILFASLWWLYLREQMRKKQALIALADEKETLSLAIEGGMTYAWRLDN GCFVFEDAFWASQGLNSRQLSFKEFMSFIHPDHWEGVKFNWRNLKSAHKKIVQELCNFDG KGYQWWEFRYTTKQLPGGEYKTAGLLLNIQDIKDREEELEAARLLAEKAELKQSFLANMS 
HEIRTPLNSIVGFANILALEDGLSSVEREEYISTINKNSELLLKLINDILELSRIESGYM SFSFKRCKVRELIDDIYMTHQVLIAPHLEFLKEVDDTPLEINVDRERLIQVLTNFLNNAC KFTETGYIKLGYSYLPDEGNVQIYVEDSGRGIPREEQRMIFSRFYKQNEFSQGAGLGLSI CQVIIEKLGGKIELKSEVGKGSRFTVILPCRVVS >gi|222159276|gb|ACAB01000083.1| GENE 131 144387 - 144794 493 135 aa, chain - ## HITS:1 COG:no KEGG:BT_1372 NR:ns ## KEGG: BT_1372 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 135 23 157 157 193 94.0 2e-48 MEELTLTTPALLFSAVSLILLAYTNRFLSYAQLVRQLRDRYMENPSDITEAQIENLRKRL NLTRRMQGLGIASLFFCVVSMFLIYIGLQLFSAYVFGLALILLIASLGVSFREIQISTRS LEIYLGTMEKGKNKK >gi|222159276|gb|ACAB01000083.1| GENE 132 144967 - 146160 1350 397 aa, chain - ## HITS:1 COG:YPO0059 KEGG:ns NR:ns ## COG: YPO0059 COG0156 # Protein_GI_number: 16120412 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Yersinia pestis # 7 394 12 402 403 479 60.0 1e-135 MYGKMQEYLCQTLAEIKEAGLYKEERLIESAQQAAITVKGKEVLNFCANNYLGLSNHPRL IKASQEMMNNRGYGMSSVRFICGTQDIHKELEAAISEYFQTEDTILYAACFDANGGVFEP LFSEEDAIISDSLNHASIIDGVRLCKAKRYRYANADMKDLERCLQEAQAQRFRIVVTDGV FSMDGNVAPMDQICDLAEKYDALVMVDESHSAGVVGATGHGVSELYKTHGRVDIYTGTLG KAFGGALGGFTTGRKEIIDLLRQRSRPYLFSNSLAPGIIGASLEVFKMLKESNALHDKLV ENVNYFRDKMTAAGFDIKPTQSAICAVMLYDAKLSQIYAARMQEEGIYVTGFYYPVVPKD QARIRVQISAGHEKAHLDKCIAAFIKVGKELNVLKAE >gi|222159276|gb|ACAB01000083.1| GENE 133 146297 - 147250 745 317 aa, chain + ## HITS:1 COG:SA0511 KEGG:ns NR:ns ## COG: SA0511 COG0451 # Protein_GI_number: 15926231 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Staphylococcus aureus N315 # 1 315 1 314 321 340 53.0 2e-93 MEHILIIGATGQIGSELTMELRKRYGNANVVAGYIPGAEPKGELKESGPSAIADVTDGEA IASVVKEYHIDTIYNLAALLSVVAESKPKLAWKIGIDGLWNVLEVAREQGCAVFTPSSIG SFGASTPHTKTPQDTIQRPRTMYGVTKVTTELLSDYYFNKYGVDTRAVRFPGIISNVTPP 
GGGTTDYAVDIYYSAVKGEKFVCPIKQGTLMDMMYMPDALNAAITLMEADPTKLIHRNAF NIASMSFDPETIYQAIRKHVPQFEMIYDIDPLKQRIADSWPDSLDDTCAREEWGWKPAYD LESMTVDMLEKLREKLK >gi|222159276|gb|ACAB01000083.1| GENE 134 147250 - 148098 745 282 aa, chain + ## HITS:1 COG:no KEGG:BT_1369 NR:ns ## KEGG: BT_1369 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 282 1 282 282 503 87.0 1e-141 MKKLIVGVMLLGILGACGNKKTNIDPFASITKEVDSIRRVADSIHCDESPEDPQPIQADE SFDDFIYNFASDDVLQRQRVKFPLPYYNGDKKTNIEEHNWKHDDLFTKQHYYTLLFDREE DMDLVGDTSLTSVQVEWVFVKTRMVKKYYFERIKGAWMLEAINLRPIKQNDNENFVEFFG HFAADSLFQSKRVCEPLAFVTTDPDDDFSILETTLDLNQWFAFKPGLPADRLSNINYGQR NDDDSPTKILALKGIGNGFSNILYFRRKAGEWELYKFEDTSI >gi|222159276|gb|ACAB01000083.1| GENE 135 148200 - 149192 950 330 aa, chain + ## HITS:1 COG:PM1589 KEGG:ns NR:ns ## COG: PM1589 COG0812 # Protein_GI_number: 15603454 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate dehydrogenase # Organism: Pasteurella multocida # 1 330 1 329 341 277 43.0 3e-74 MYSLLPYNTFGIDVSAARFLEYTSVEELKKLIVQGAVTTPFLHIGGGSNLLFTKDYDGVI LHSRIEGIEVTEEDDHSVSVRVGAGVVWDDFVAYCVEHGWYGAENLSLIPGEVGASAVQN IGAYGVEVKDLITAVETVNIQAEERVYSVGECGYTYRNSIFKYPENKATFVTYVRFRLSK EEHYTLDYGTIRQELEKYPALTLSVVRKVIIAIRESKLPDPKVMGNAGSFFMNPIVPKEK LEALQQEYPRIPYYELADGRVKIPAGWMIDQCGWKGKALGPAAVHDKQALVLVNRGGAKG SDIIALSDAVRASVREKFGIDIHPEVNFIN >gi|222159276|gb|ACAB01000083.1| GENE 136 149203 - 149961 606 252 aa, chain + ## HITS:1 COG:BB0533 KEGG:ns NR:ns ## COG: BB0533 COG1235 # Protein_GI_number: 15594878 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Borrelia burgdorferi # 6 252 6 253 253 211 41.0 2e-54 MKVRIIGSGTSTGVPQIGCTCPVCTSSDPKDNRLRASAIVETDDARILIDCGPDFRTQVL HLPFERIDGVLITHEHYDHVGGLDDLRPFCRFGSVPIYAEDYVAQGLRLRMPYCFVDHRY PGVPDIPLQVISVGQSFSINHTEVLPLRVMHGRLPILGYRIGQLGYITDMLTMPEESYEQ LAGIDVLVMNALRIASHPTHQNLEEALAVARRIQAKKTYFIHMSHDMGLHAEVEKNLPEN IHLAFDGLDIYV 
>gi|222159276|gb|ACAB01000083.1| GENE 137 150045 - 151619 789 524 aa, chain + ## HITS:1 COG:no KEGG:BT_1366 NR:ns ## KEGG: BT_1366 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 524 1 525 525 821 76.0 0 MKLRTIVKIAITSSVVLLCSGFALYSFFRLSAAEGQKDFNLYELVPSTTSAVFVTDDVLE FVAEVDELTCSKNQQYLYVSKLFSYLKQSLYALSEDTPHGLSRQMNQMLISFHEPDNERN QVLYCRLGNGDKELVNRFVRKYISSLYPPKTFIYKGEEIIIYPMADGDFLACYLTSDFMA LSFQKKLIENVIDAYKSGKSLADDSTFTGIRAPKKSAAAATIYTRMQGMMGWTEFDMKMK DDFIYFSGITHDADTCFAFINQLRQQQSVKGFPGEVLPSTAFYFSRQGITDWVFLLSYGN AQGQSVPARTSEVQNRDKEFSRYLMENAGQDLVACLFQREDTLQGAAAVLSLSVADVTEA ERMLRALVNAASADEGRRNPRITFCYTVNKAYLVYRLPQTTLFEQLTSFAEPTLDVYAAF YGGRLLLAPDADALAHYIRQLDKGEVLNGAMAYQTGMDHLSDSYHFMLMADFDHIFQQSE NHVRFVPDFFLRNADFFRHFTLFVQFACADGMVYPNIVLKYKSE >gi|222159276|gb|ACAB01000083.1| GENE 138 151624 - 151812 86 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262408873|ref|ZP_06085418.1| ## NR: gi|262408873|ref|ZP_06085418.1| predicted protein [Bacteroides sp. 
2_1_22] # 1 62 1 62 62 80 100.0 3e-14 MQKEKKQKGVSPKAERTQKSMGQWSKKTINSILKMYGGSRQSLWIIQPKAMEVFRKIIAD KR >gi|222159276|gb|ACAB01000083.1| GENE 139 152114 - 152488 296 124 aa, chain - ## HITS:1 COG:no KEGG:BT_1365 NR:ns ## KEGG: BT_1365 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 124 1 124 124 207 92.0 7e-53 MEFTGKIIAILPPRGGVSKTSGNEWKSQEYVIENHDQYPRKMCFDVFGADKIEQFNIQMG EELTVSFDVDARQWNDRWFNSIRAWKVERVSTGAPMASGSPVPPPAPSAMPEFTPGDAKD DLPF >gi|222159276|gb|ACAB01000083.1| GENE 140 152643 - 153767 1134 374 aa, chain + ## HITS:1 COG:BMEI1942 KEGG:ns NR:ns ## COG: BMEI1942 COG0592 # Protein_GI_number: 17988225 # Func_class: L Replication, recombination and repair # Function: DNA polymerase sliding clamp subunit (PCNA homolog) # Organism: Brucella melitensis # 1 370 26 395 397 160 30.0 5e-39 MKFIVSSTALSSHLQAISRVINSKNALPILDCFLFELEDGTLSVTVSDSETTMVTTVEVN ESDTNGRFAVVAKTLLDALKEIPEQPLTFDINPDNYEITVQYQNGKYSLMGQNADEFPQS ATLGDNAVRVEMEASVLLGGINRSVFATADDELRPVMNGIYFDITTEDITMVASDGHKLV RCKTLAAKGNERAAFILPKKPATLLKNLLPKEQGAVTIEFDERNAVFMLESYRMVCRLIE GRYPNYNSVIPQNNPHKVTVDRQQLVGALRRVSIFSSQASSLIKLRMQENQIVISAQDID FSTSAEETQVCQYAGAAMSIGFKSTFLIDILNNISADEVVIELADPSRAGVIIPVEQEEN EDLLMLLMPMMLND >gi|222159276|gb|ACAB01000083.1| GENE 141 153880 - 154659 811 259 aa, chain + ## HITS:1 COG:CT261 KEGG:ns NR:ns ## COG: CT261 COG0847 # Protein_GI_number: 15604982 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Chlamydia trachomatis # 9 239 4 214 232 81 31.0 2e-15 MKLNLKNPIVFFDLETTGTNINTDRIVEICYLKVYPNGNEEAKTLRINPEMHIPEASSAI HGIYDADVADCPTFKEVAKNIARDIEGCDLAGFNSNRFDIPVLAEEFLRAGVDIDMTKRK FVDVQVIFHKMEQRTLSAAYKFYCDKNLEDAHTAEADTRATYEVLKAQLDRYSDLQNDIA FLADYSSFSKNVDFAGRMVYDDNGVEVFNFGKYKGMSVAEVLKKDPGYYSWILNSDFTLN TKAALTKIRLREMSNLITK >gi|222159276|gb|ACAB01000083.1| GENE 142 154659 - 155867 1029 402 aa, chain + ## HITS:1 COG:BH2510 KEGG:ns NR:ns ## COG: BH2510 COG0452 # 
Protein_GI_number: 15615073 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Bacillus halodurans # 1 391 1 390 404 340 46.0 3e-93 MLKGKKIILGITGSIAAYKACYIIRGLIKQGAEVQVVITPAGKEFITPITLSALTSKPVI SEFFAQRDGTWNSHVDLGLWADAVLIAPATASTIGKMANGIADNMLITTYLSAKAPVFVA PAMDLDMFAHPATQKNLDILRSYGNHIIEPGTGELASHLVGKGRMEEPENIIRVLDEFFA SSDELSGKKVMITAGPTYEKIDPVRFIGNYSSGKMGFALAEECARRGAQVTLITGPVQLK TQHSGIIRVDVESAEEMYKAAQAHFPDADAGILCAAVADYRPATVADKKIKREKEEELTL HLRATQDIAASLGAIKRKQQCLVGFALETNNEQQNAEGKLERKNFDFIVLNSLNDAGAGF RHDTNKISIIDRKGRTDYPLKSKTEVAQDIIDCLVATLSPNF >gi|222159276|gb|ACAB01000083.1| GENE 143 155894 - 157555 1825 553 aa, chain + ## HITS:1 COG:PA4763 KEGG:ns NR:ns ## COG: PA4763 COG0497 # Protein_GI_number: 15599957 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Pseudomonas aeruginosa # 1 551 1 553 558 338 40.0 2e-92 MLRSLYIQNYALIEKLDISFETGFSVITGETGAGKSIILGAIGLLLGQRADVKAIRRGAS KCIIEARFDIAAYGMRPFFEENELEYDEECILRREVQASGKSRAFINDTPASLVQMKELG EQLIDVHSQHQNLLLNKEGFQLNVLDILAHNDAALEKYHLRYGEWKQTERELAELMSLAE KSRSDEDYIRFQLEQLEEAHLVEGEQEELEQEAETLSHAEEIKAGLYRVEQSFVSDEGGL LSYLKDSLNTLNSLQRVYQPAKELTERMESAYIELKDISHEVSSQSDSVEFNPVRLEEVN ERLNLIYSLQQKHRAQTLDELIALTDEYRSKLSDITSYDERIAELTVRKEEQYKQVKQQA ELLTKARTKAAREVEKQLAARLIPLGMPNVRFQIEMGLKKEPGLQGEDTVNFLFSANKNG TLQNISSVASGGEIARVMLSIKALIAGAVKLPTIVFDEIDTGVSGEIADRMADMMQEMGD RNRQVISITHLPQIAARGRAHYKVYKKDSDTETNSHIRRLTDEERVEEIAHMLSGATLTE AALSNAKSLLDAK >gi|222159276|gb|ACAB01000083.1| GENE 144 157658 - 158401 392 247 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 6 244 7 246 255 155 35 1e-36 MLDKSEMIFGVRAVIEAIQAGKEIDKILVKKDIQSDLSKELFAALKGLMIPVQRVPVERI NRITRKNHQGVIAFISSVTYQKTEDLVPFLFEQGKNPFFVMLDGVTDVRNFGAIARTCEC AAVDAVIIPVRGSASVNADAVKTSAGALHTLPVCREQNLRSTLQYLKDSGFRIVAATEKG DYDYTKADYTGPLCIIMGAEDTGVSYENLALCDEWVKIPMLGTIESLNVSVAAGILVYEA VKQRNND 
>gi|222159276|gb|ACAB01000083.1| GENE 145 158403 - 160109 1706 568 aa, chain + ## HITS:1 COG:FN1787 KEGG:ns NR:ns ## COG: FN1787 COG0457 # Protein_GI_number: 19705092 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 231 563 59 336 628 75 24.0 4e-13 MKKILILPLLLFFLIQGSMAQTPKWVEKAKRAVFSVVTYDKNDKMLNTGNGFFVSEDGLA LSDYTLFKGAERAVVITSEGKQMPVSLILGANDMYDVIKFRVAITEKKVPALIVAKTAPA VGADAWMLPYSTQKSIACVTGKVKEVSKVAGEYHYYTLGMQMKDKMVSCPVMNAEGQVFG IAQKSSGIDTVTTCYAAGAAFAMSQKISALSLGDAALKKIGIRKGLPETEDQALVYLFMA SSSLSGDDYEKLLDDFIRQFPANADGYLRRANYYAAKGKDDQTWYDKAVADFNQALKVAQ KKDDVYYNIGKLIYAYQLSKPEKTYKDWTYDTALQNVRQAIGIDPLPIYIQMEGDILFAQ QDYAGALAAYEKVNASNIASPATFFSAAKTKELAKGDPKEVVALMDSCIARCPQPITADF APYLLERAQMNMNAGQPRNAMLDYDAYHTAVKGEVNDVFYYYREQAALKARQFQRALDDI VKAIEMNPTDLTYQAEHAVVNLRVGRYEEAIQILDNILKADPKYAEAYRLLGLCQIQLKK TDEACGNFKKAKELGDPNVDELITKYCK >gi|222159276|gb|ACAB01000083.1| GENE 146 160478 - 161422 444 314 aa, chain + ## HITS:1 COG:no KEGG:BT_0595 NR:ns ## KEGG: BT_0595 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 308 4 310 318 402 66.0 1e-110 MEKNRFTICANNYIDCLRQEGRYSTAHVYKHAIRSFSQFCGTQSITFSKINRKTLKRYSN YLMASRLKPNTISTYMRMLRSIYNRGVDMHQAPYVHGLFRDVFTGVDTRQKKAIPIGELH MLLNKDPQSEKLRRTQAIANLLFQFCGMPFSDLAHLEKSNLERGLLKYNRTKTGTPMSIE VLESAQNAIGGLYNKSDARSSGYPDYLFRILSGAYKRNEEGAYREYQSALRRFNNELKSL SRKLRLHSPVTSYTLRHSWATTAKYRGVPIEMISESLGHKSIKTTQIYLKGFELEERTKV NKLNYSYVCNFKML Prediction of potential genes in microbial genomes Time: Wed May 18 03:08:07 2011 Seq name: gi|222159275|gb|ACAB01000084.1| Bacteroides sp. D1 cont1.84, whole genome shotgun sequence Length of sequence - 10189 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 2, operones - 2 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 24 - 620 426 ## BT_0596 putative transcriptional regulator 2 1 Op 2 . 
+ CDS 598 - 3012 2246 ## COG1596 Periplasmic protein involved in polysaccharide export 3 1 Op 3 . + CDS 3025 - 4161 891 ## BT_1355 hypothetical protein 4 1 Op 4 . + CDS 4182 - 5591 531 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis + Term 5669 - 5709 6.1 + Prom 5698 - 5757 5.4 5 2 Op 1 5/0.000 + CDS 5835 - 6542 534 ## COG1208 Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) 6 2 Op 2 5/0.000 + CDS 6546 - 7625 667 ## COG0451 Nucleoside-diphosphate-sugar epimerases 7 2 Op 3 . + CDS 7622 - 8542 442 ## COG0451 Nucleoside-diphosphate-sugar epimerases 8 2 Op 4 1/0.000 + CDS 8529 - 9683 593 ## COG0562 UDP-galactopyranose mutase 9 2 Op 5 . + CDS 9746 - 10187 304 ## COG0463 Glycosyltransferases involved in cell wall biogenesis Predicted protein(s) >gi|222159275|gb|ACAB01000084.1| GENE 1 24 - 620 426 198 aa, chain + ## HITS:1 COG:no KEGG:BT_0596 NR:ns ## KEGG: BT_0596 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 186 1 186 192 326 84.0 3e-88 MILTKPKSVNAGPSSGTGEGVAHSKRWYVALVRMHHEKKVAERLSKMGIDSFVPVQQQIH QWSDRRKMVDTVLLPMMVFVHVNPKERMEVLSFSTVSRYMVMRGESTPAVIPDEQMARFR FMLDYSEEAVCMNDTPLARGEKVRVIKGPLSGLVGELVTVGGKSKIAVRLNMLGCACVDM PIGYVESTKITNDNTKKI >gi|222159275|gb|ACAB01000084.1| GENE 2 598 - 3012 2246 804 aa, chain + ## HITS:1 COG:aq_505 KEGG:ns NR:ns ## COG: aq_505 COG1596 # Protein_GI_number: 15605977 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Aquifex aeolicus # 142 439 90 381 725 138 34.0 5e-32 MTTRRKFNVFLFILLLGVFSPLMAQSMSDSQVLEYVKDGIRQGKEQKQLASELARRGVTK EQATRVKQLYEQQNNVNASNATGTDVNESRLREEMKENTSDMLEDHPSTQDLARGNQVFG RNIFNTRNLTFEPSVNIATPLNYRLGPGDEVIIDIWGASQNTIRQQISPDGTINIQKIGP VNLNGLTIAEANDYLKKTLNKIYNGLNNANDPTSDIRLTLGSIRTIQINVMGEVVQPGTY 
SLSSFATVFHALYRAGGVSDIGSLRNVQLVRNGKNIATIDVYQFIMKGNIQDDIRLQEGD VVIVPAYDVLVKIDGKVKRPMRFEMKKNESLSTLISYAGGFEADAYTRSLRVVRQNGQEY EVNTVKDLDYSVYKMRNGDVVTAEAILNRFINKLEIRGAVYRPGIYQLNGKLNTVRELVN EAQGLTGDAFLNRAVLYRQREDLTTEVVPVDIKAIMDGTSQNIILMKNDILYIPSIHDLE DRGNVVIHGEVAKPDSYPYADNMTLEDLIIQAGGLREAASVVRVDVSRRIKNPRSTVDND TIGQIYTFSLKEGFIVDGTPGFVLQPYDEVYVRRSPGYQAQQNVVVEGEILFGGSYAMTS REERLSDLINKAGGATNYAYLRGAKLTRVANASEKKRMGDVVRLMSRQLGEAMMDSLGVR VEDTFSVGIDLEKALANPGSTADIVLRVGDVISIPKNNNTVTINGAVMVPNTVSYMEGKN IDYYLNQAGGYSENAKKSKKFIVYMNGQVTKVKGSGKKQIEPGCEIIVPSKAKKRTNMGN ILGYATTFSTLGMMVASIANLIKK >gi|222159275|gb|ACAB01000084.1| GENE 3 3025 - 4161 891 378 aa, chain + ## HITS:1 COG:no KEGG:BT_1355 NR:ns ## KEGG: BT_1355 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 369 3 367 379 463 68.0 1e-129 MTTLIEQKSEKDIKQKRNNHTNEEIEIDLMGVLRRIMGIRKKIYKAAGIGLIIGVIIAIS IPRQYTVEVTLSPEMGSAKGGGLSGLAASFLGSDVTMGDGSDALNASLSADIVSSTPFLL ELSTMKIPVLKNEMMTLNAYLDEESSPWWSYVIGFPAMVIGGVKSLFIEGEDEFISSDKV SQGTIELSKKELGKIGVLKKMIVASVDKKTSMTSVAVTLQEPRVAAVVADSVVKKLQEYI IDYRTSKAKEDCIYLEKLFKERQQEYYAAQKKYADYLDSHDNIILQSVRAEQERLQNDMS LAYQIYSQVGSQLQVSRAKVQEEKPVFAVIEPAVVPLTPSGTDKKICVLAFVFLSVCIVI FWHLLGKDILNKFKEIRA >gi|222159275|gb|ACAB01000084.1| GENE 4 4182 - 5591 531 469 aa, chain + ## HITS:1 COG:XF2367 KEGG:ns NR:ns ## COG: XF2367 COG2148 # Protein_GI_number: 15838958 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Xylella fastidiosa 9a5c # 13 469 36 484 484 216 31.0 7e-56 MASTNQINKVVESIVVLGDLIILNLCLSLLLFFWKKAFINLPFSCTVSFMMASLTLCYLA CFGSRGRVWDSRGIRADQLVLRVLKNIIAFSIFWACIMTFSGISIFSPLFFVSYFFILFL VLSIYRIIIRHLLIVYCTKGKHRRYAVFIGGGNNMQMLYEEMESSLASSVYEVVGYFDIK SNDTISSQCPYLGSPDGFSDFMSVHTGIKHVFCSLSMEEGRYNFSIMNYCENHLLYFHGV PNVCKGFPRRIWHSMVGNMPILNLRYEPLGKMENRILKRIFDIALSGIFLVTVFPFVYLI VGSIIKFTSPGPILFKQMRTGLNGVDFVCYKFRSMKVNDEADSKQATADDPRKTRFGDFL 
RRSNIDELPQFINVFKGEMSIVGPRPHMLSHTEIYARLIDKYMVRHFIKPGVTGWAQIHG FRGETKELSQMGGRVKADIWYMEHWTIFLDLYIIYKTIANVVIGEKNAY >gi|222159275|gb|ACAB01000084.1| GENE 5 5835 - 6542 534 235 aa, chain + ## HITS:1 COG:alr2825 KEGG:ns NR:ns ## COG: alr2825 COG1208 # Protein_GI_number: 17230317 # Func_class: M Cell wall/membrane/envelope biogenesis; J Translation, ribosomal structure and biogenesis # Function: Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) # Organism: Nostoc sp. PCC 7120 # 1 235 25 257 257 296 58.0 2e-80 MVEIGGKPILWHIMKTYSHYGINEFIICCGYKQYVIKEYFANYFRHNSDMTVDLSNNCVE ILDNHSEDWRVTMVDTGLNTQTGGRLKRVQKYIGAERFVLTYGDGVADINIAESIKEHEL SECAISLTAYKPGGKFGALQIDLESNKVLSFQEKPDGDRNWINAGYFVCEPEILEYIPEN DDTVIFERKPLENLAKDGKMHAYRHTGFWKPMDTLRDNVELNEMWDKGLAPWKVW >gi|222159275|gb|ACAB01000084.1| GENE 6 6546 - 7625 667 359 aa, chain + ## HITS:1 COG:STM2091 KEGG:ns NR:ns ## COG: STM2091 COG0451 # Protein_GI_number: 16765421 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Salmonella typhimurium LT2 # 5 340 2 336 359 369 50.0 1e-102 MSIDIFNNFYKGKRVLVTGHTGFKGSWLSIWLHELGAEVVGVAKDPYSEKDNYVLSGIGK KIKADLRADICDSQRMKEIFQTYQPEIVFHLAAQPLVRLSYDIPVETYETNVMGTINILE AIRVTDSVKVGIMITTDKCYENKEQIWGYRENEPMGGYDPYSSSKGAAEIAIASWRRSFF NPDQYDKHGKSIASVRAGNVIGGGDWALDRIIPDCIKALESGKNIDIRNTKSVRPWQHVL EPLSGYMLLTAKIWEEPTKYCEGWNFGPRAESITSVWDVANDVVKNYGSGGLNDISIPNA PHEARFLMLDISKAKFQLGWEPRMNIHQCVALTVDWYKRYTFENVYSLCVDQIKQYLIK >gi|222159275|gb|ACAB01000084.1| GENE 7 7622 - 8542 442 306 aa, chain + ## HITS:1 COG:Cj1319 KEGG:ns NR:ns ## COG: Cj1319 COG0451 # Protein_GI_number: 15792642 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Campylobacter jejuni # 1 306 1 314 323 87 25.0 4e-17 
MKKVIVVGANGFLGRELCKKLASKGIAVVALVAKGGDYSFAKSLHNVNCIEFDLSRIVDW DGIEEIKGADTMFYMAWAGVSSTYKNQVEIQVSNILYGIQVMEFAHRNGISRVIFPGSAS EYACGNEVINGRSIPSPSDLYSASKVATKFLCQTYARQNGISLIWAVITSIYGPGRNDEN LITYCIKTLLKGEKPSFTGLEQQWDYLYIDDLIDALVAVGEKGIGGKTYPIGSGENKQIV EYVKIIRDMIDPTLPLGIGDIPYKNKTIDNQILDISELNEDTGFTARYSFESGIKLTIDY YKNAIK >gi|222159275|gb|ACAB01000084.1| GENE 8 8529 - 9683 593 384 aa, chain + ## HITS:1 COG:MTH344 KEGG:ns NR:ns ## COG: MTH344 COG0562 # Protein_GI_number: 15678372 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-galactopyranose mutase # Organism: Methanothermobacter thermautotrophicus # 6 383 2 374 380 271 39.0 2e-72 MRLSNFDYIIAGAGLSGSTFARLRAEKGSKILVIDRRSTVAGNLYDETNMYGILVQQYGP HIFHTNSEEVYSFITKYHKWNPFKLRCSVDMCGQQTPSPFNFKTIDQFYATDKAELIKEA LLLKYPNQSTVTIVELLNSEEPLIKQYAQMLFDNDYSLYTAKQWGIKASEIDINVLKRVP VRLDYKEMYFSDKYECMPEGGFTSFIKDLLDHDNIEYINNEDALKYISLNDQDHRIEFHD LEVSTNCKLVYTGAIDELFNYEFGKLPYRSLTFKYETLNEELFQATPVVAYPQVEGYTRI TEYKQLPKQNILGITTIAYEYPMMFEEGKADEPYYPVPTDDTAMLYSKYKAKAGNYDNLI LCGRLANYKYYNMDQAILAVLRLT >gi|222159275|gb|ACAB01000084.1| GENE 9 9746 - 10187 304 147 aa, chain + ## HITS:1 COG:STM2087 KEGG:ns NR:ns ## COG: STM2087 COG0463 # Protein_GI_number: 16765417 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Salmonella typhimurium LT2 # 4 120 3 115 333 61 33.0 5e-10 MFKLSICQPTYNRVYFIKDTTTKIIEQILSEHRENDVQLCFSDNGSTDGTKDIIEEMKRK YPTINFKINYFGENRGLDANHEQVMKMADGEWSILKGDDDYLMDGGLNKIFNLLDQNQDV DVIVSSPIGMSPELKPLQPIYFLREEV Prediction of potential genes in microbial genomes Time: Wed May 18 03:08:27 2011 Seq name: gi|222159274|gb|ACAB01000085.1| Bacteroides sp. D1 cont1.85, whole genome shotgun sequence Length of sequence - 44164 bp Number of predicted genes - 32, with homology - 32 Number of transcription units - 19, operones - 8 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 1290 627 ## COG3182 Uncharacterized iron-regulated membrane protein - Term 1304 - 1342 2.9 2 2 Op 1 . - CDS 1380 - 2768 1411 ## BDI_3402 hypothetical protein 3 2 Op 2 1/0.000 - CDS 2809 - 5121 1723 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 5352 - 5411 9.7 4 3 Op 1 3/0.000 - CDS 5415 - 6266 632 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 5 3 Op 2 . - CDS 6263 - 7486 1058 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family 6 3 Op 3 . - CDS 7538 - 8299 628 ## BT_2068 3-oxo-5-alpha-steroid 4-dehydrogenase - Prom 8341 - 8400 6.0 - Term 8407 - 8459 3.1 7 4 Op 1 4/0.000 - CDS 8494 - 9837 1547 ## COG0372 Citrate synthase 8 4 Op 2 1/0.000 - CDS 9858 - 11039 1244 ## COG0538 Isocitrate dehydrogenases 9 4 Op 3 . - CDS 11058 - 13301 1977 ## COG1048 Aconitase A + Prom 13284 - 13343 7.0 10 5 Tu 1 . + CDS 13368 - 15305 1268 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits - Term 15284 - 15319 5.3 11 6 Tu 1 . - CDS 15449 - 17416 1553 ## BF3757 hypothetical protein - Prom 17499 - 17558 8.3 - Term 17543 - 17579 4.0 12 7 Op 1 . - CDS 17608 - 18651 1286 ## COG0059 Ketol-acid reductoisomerase 13 7 Op 2 . - CDS 18719 - 19462 676 ## COG3884 Acyl-ACP thioesterase 14 7 Op 3 32/0.000 - CDS 19468 - 20025 487 ## COG0440 Acetolactate synthase, small (regulatory) subunit 15 7 Op 4 6/0.000 - CDS 20083 - 21777 1682 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] - Prom 21797 - 21856 4.5 16 7 Op 5 . - CDS 21902 - 23701 1776 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase - Prom 23917 - 23976 6.7 + Prom 23968 - 24027 5.3 17 8 Tu 1 . + CDS 24182 - 24760 610 ## COG1047 FKBP-type peptidyl-prolyl cis-trans isomerases 2 + Term 24773 - 24837 8.0 18 9 Tu 1 . 
- CDS 24888 - 26174 1288 ## COG3681 Uncharacterized conserved protein - Prom 26219 - 26278 5.3 + Prom 26196 - 26255 7.5 19 10 Op 1 . + CDS 26308 - 27393 872 ## BT_2081 hypothetical protein 20 10 Op 2 . + CDS 27515 - 28300 644 ## BT_2082 hypothetical protein + Term 28323 - 28390 14.0 - Term 28311 - 28377 9.3 21 11 Tu 1 . - CDS 28406 - 28972 648 ## BT_2083 hypothetical protein - Prom 29108 - 29167 4.9 + Prom 28945 - 29004 8.6 22 12 Op 1 . + CDS 29135 - 30211 1122 ## COG0082 Chorismate synthase 23 12 Op 2 . + CDS 30211 - 31575 1107 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases - Term 31607 - 31660 11.5 24 13 Tu 1 . - CDS 31689 - 32747 972 ## COG3049 Penicillin V acylase and related amidases - Prom 32797 - 32856 7.9 + Prom 32735 - 32794 6.1 25 14 Tu 1 . + CDS 32944 - 33072 124 ## gi|237720084|ref|ZP_04550565.1| conserved hypothetical protein - Term 32987 - 33035 10.7 26 15 Op 1 . - CDS 33100 - 35136 1353 ## BT_2246 hypothetical protein - Prom 35156 - 35215 4.6 27 15 Op 2 . - CDS 35246 - 37423 1947 ## COG0550 Topoisomerase IA - Prom 37479 - 37538 6.1 + Prom 37395 - 37454 4.8 28 16 Tu 1 . + CDS 37547 - 39109 1305 ## BDI_2898 hypothetical protein - Term 39051 - 39091 -1.0 29 17 Tu 1 . - CDS 39157 - 40875 1458 ## BT_4445 hypothetical protein - Prom 40930 - 40989 6.0 - Term 41112 - 41148 4.0 30 18 Tu 1 . - CDS 41161 - 41928 775 ## BT_0003 hypothetical protein - Prom 41992 - 42051 5.0 31 19 Op 1 . - CDS 42117 - 43658 931 ## BT_1284 putative endo-beta-N-acetylglucosaminidase F1 precursor (mannosyl-glycoprotein endo-beta-N-acetyl-glucosaminidase F1) 32 19 Op 2 . 
- CDS 43693 - 44163 334 ## BT_1283 hypothetical protein Predicted protein(s) >gi|222159274|gb|ACAB01000085.1| GENE 1 3 - 1290 627 429 aa, chain - ## HITS:1 COG:PA4513_1 KEGG:ns NR:ns ## COG: PA4513_1 COG3182 # Protein_GI_number: 15599709 # Func_class: S Function unknown # Function: Uncharacterized iron-regulated membrane protein # Organism: Pseudomonas aeruginosa # 1 423 1 364 395 124 24.0 3e-28 MKKIFRKIHLWLSVPFGLIITLVCFSGAILVFENEVNEWFRRDLYYVETVKESPLPMDKL LEKVATTLPDSVSVTGVSISSDPGRAYQVSLSKPRRASLYVDQYTGEVKGKSERSGFFMF MFRMHRWLLDSMNPGNEGIFWGKMIVGVSTLLLVFVLISGIVIWWPRTRKALKNSLKITA TKGWRRFWYDLHVAGGMYALIFLLAMALTGLTWSFPWYRTAFYKVFGVEVQQRAAQGHEQ KSDAQKRDTKLAAHREKKREGNEVRKGERSGRPEGRRNNREHAKNRNTGEYTGEQPDSKG RPENNHSDMYSVTSPFVYWQEIYDKLRRQNPEYKQISISSGTASVSFNRFGNQRASDRYS FNTDNGEFTETSLYQHQDKSGKIRGWIYSVHVGNWGGMFTRILTFIAALLGAALPLTGYY LWIKRLIKV >gi|222159274|gb|ACAB01000085.1| GENE 2 1380 - 2768 1411 462 aa, chain - ## HITS:1 COG:no KEGG:BDI_3402 NR:ns ## KEGG: BDI_3402 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 457 1 466 471 361 47.0 3e-98 MKKFLLQAFAVAFCLSSGVFVASCDDDNNPVDNPPIDGETAYVVAATTGEASYLVVANSL DEGTVSTQGNGTEVIGGTYWVYKGLDYVFALVYNKGGAGTGASYYLGADGKMKEKYTYTY NRITSYGTWGDKVVTVSTGDSKITDEDQNVAQALLFNYLDATDGSQEEGTLLAENYLGNG EKVSWAGLVEANNKIYTSVIPMGMSKYGIKKWPEAVTDQELVTKTDGGSGSGAYTAGVIP STQYPDNAYVAIYNGTNFNETPVIAKTDKIGYACGRMRSQYYQTIWAADNGDVYVFSPGY GRTAVSSSDLKKVTGQKPSGAMRIKAGATDFDPDYYVNFEEIGTKHPIFRCWHISEDYFL LQLYKKGAEDMINGGTSADVSELAVFKAEDQTIMPVTGLPADGKFGGEPYGEKGYAYMAV TVTTGEKPAFYKIDAKTGKAVKGLTVEADAITTVGKMEYLSK >gi|222159274|gb|ACAB01000085.1| GENE 3 2809 - 5121 1723 770 aa, chain - ## HITS:1 COG:FN1971 KEGG:ns NR:ns ## COG: FN1971 COG1629 # Protein_GI_number: 19705267 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Fusobacterium nucleatum # 73 245 25 204 657 75 35.0 4e-13 MSTEKEVVDFATVYLKGTNQGCYTDEKGIYHLKTTAGEYTLVVSAIGYQTVEKKVKLAKG 
ERIKVNVTIAPAVKELGEVLVTTSGVGRVNQSAFNAVAIDAKKLHNSTQTLAGALTKVPG VKLRESGGVGSDMQLYIDGFSGKHVKIFIDGIPQEGAGAAFDLNNVPINFADRIEVYKGV VPVGFGTDAIGGVVNIVTNKQPGKWFMDASYSYGSFNTHKSYVRFGQTFKNGFMYEVNAF QNFSDNDYYVDTYVREFEIKEDGSVRFPPLDKNKIYHLKRFNDQYHNEAVIGKIGLVGKK WADRLALSFNYSHFYKEIQTGVYQDVVFGEKFRKGHSLVPSLEYYKKNLLVKNLDLLLTA NYNHNITNNVDTASRAYNWRGDFYEKGSRGEQSYQNSESKNKNWNGTLKMNYHIGQAHTF TFSHVISDFERTSRSTIGASSKFTDFSIPKITRKNVSGLSYRLMPSDRWNVSAFAKYYRQ YNKGPVSQNTDGIGNYINLSNTASALGYGAAGTYFIWKDLQVKLSYEKAFRLPTTDELFG DEDLEAGKMNLKPEKSDNVNLSFSYNHQFGKHGLYAEAGLIYRDTKDYIKRGLDVLGGTS YGYYENHGHVRTKGYNLSLLYSFSRWFDIGGTFNSIDTRDYEKFLAGSSLQESMHYKVRM PNLPYRYANINANFYWNDLFVKGNVLSIGYDSYWQHDFPLYWENLGDKDSKNMVPEQFSH NLSLSYTMKNGRYNVSFECRNFTDAQLFDNFSLQKAGRAFYGKFRVFFGK >gi|222159274|gb|ACAB01000085.1| GENE 4 5415 - 6266 632 283 aa, chain - ## HITS:1 COG:VNG0479G KEGG:ns NR:ns ## COG: VNG0479G COG1028 # Protein_GI_number: 15789712 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Halobacterium sp. 
NRC-1 # 8 203 21 218 316 119 35.0 7e-27 MSEMKWAIITGADGGMGTEITRAVAKAGYRIIMACYHPKKAEVVRERLSKEIGNPDLEVI AIDLSSMQSVVAFASQILERNLPISLLMNNAGTMETGFHTTSEGFERTVSVNYMGPYLLT RKLIPLMVRGARIVNMVSCTYAIGKLDFPDFFHRGKTGTFWRIPVYSNTKLALLLFTFEL SEQLREKGITVNAADPGIVSTDIITMHKWFDPLTDIFFRPFIRKPKKGASTAIGLLLDEK EAGVTGQLYVNNHRKNLSDKYTNHVQKEQLWEVTERALASWLK >gi|222159274|gb|ACAB01000085.1| GENE 5 6263 - 7486 1058 407 aa, chain - ## HITS:1 COG:MT3467 KEGG:ns NR:ns ## COG: MT3467 COG1902 # Protein_GI_number: 15842955 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Mycobacterium tuberculosis CDC1551 # 5 379 11 385 396 286 41.0 4e-77 MESKLFSPVTFGPLTLRNRTIRSAAFESMCPENTPTQMLLDYHRSVAAGGVGMTTVAYAA VTQSGLSFDRQLWLRPSIIPRLHELTKAVHDEGAAVGIQIGHCGNMSHKNICGVTPISAS SGFNLYSPTFVRGMEKEELPEMAKAYGNAVNLARKAGFDAVEVHAGHGYLISQFLSPYTN HRKDEYGGSLENRMRFMDMVMEEVMKAAGSDMAVFVKMNMRDGFKGGMEIDESIQVAKRL LELGAHGLVLSGGFVSKAPMYVMRGAMPIRSMSYYMNCWWLKYGVRMFGKWMIPSVPFKE AYFLEDALKFRAALPDAPLIYVGGLVSRQKIDEVLDSGFDAVQMARALLNEPGFVNRMKQ EEQARCNCGHSNYCIGRMYTIEMACHQHLKEQLPSSLQKEIDKLEKK >gi|222159274|gb|ACAB01000085.1| GENE 6 7538 - 8299 628 253 aa, chain - ## HITS:1 COG:no KEGG:BT_2068 NR:ns ## KEGG: BT_2068 # Name: not_defined # Def: 3-oxo-5-alpha-steroid 4-dehydrogenase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 253 1 253 253 409 87.0 1e-113 MSIDTFNLFLGVMSLIALFVFIALYFVKAGYGIFRTASWGVAISNKLAWILMEAPVFVVM CVMWIYSERRFESVIFTFFLFFQIHYFQRAFIFPLLLKGKSKMPLAIMSMGVLFNLLNGY MQGKWIFYLAPETMYQSGWFTSPWFIIGTLLFFAGMLLNWQSDYIIRHLRKPGDTRHYLP QKGMYRYVTSANYFGEIVEWAGWAILTCSLSGLVFLWWTIANLVPRANAIWCRYREEFGD AVGERKRVFPFLY >gi|222159274|gb|ACAB01000085.1| GENE 7 8494 - 9837 1547 447 aa, chain - ## HITS:1 COG:L67186 KEGG:ns NR:ns ## COG: L67186 COG0372 # Protein_GI_number: 15672652 # Func_class: C Energy production and conversion # Function: Citrate synthase # Organism: Lactococcus lactis # 12 447 8 441 441 380 46.0 1e-105 
MKKEYLIYKLSEEMKEATRIDTELFSKFDVKRGLRNEDGTGVLVGLTRIGNVVGYERVPG GGLKPIPGKLFYRGYDVEDISHAIIKEKRFGFEEVAYLLLSGRLPDKEELLSFRELINDN MPLEQKTKMNIIELEGNNIMNILSRSVLEMYRFDANADDTSRDNLMRQSIELISKFPTII AYAYNMLRHATFGRSLHIRHPQEKLSIAENFLYMLKKDYTELDARTLDLLLILQAEHGGG NNSTFTVRVTSSTGTDTYSAIAAGIGSLKGPLHGGANIQVADMFQHLKENIKDWTSVDEI DTYYTRMLNKEVYNKTGLIYGIGHAVYTISDPRALLLKELARDLAREKGRESEFAFLELL EERAIATFGRVKNNGKTVSSNIDFYSGFVYEMIGLPQEIFTPLFAMARIVGWCAHRNEEL TFDGKRIIRPAYKNVLDDLAYIPIKKR >gi|222159274|gb|ACAB01000085.1| GENE 8 9858 - 11039 1244 393 aa, chain - ## HITS:1 COG:SA1517 KEGG:ns NR:ns ## COG: SA1517 COG0538 # Protein_GI_number: 15927272 # Func_class: C Energy production and conversion # Function: Isocitrate dehydrogenases # Organism: Staphylococcus aureus N315 # 3 390 7 422 422 507 59.0 1e-143 MQTDGTLLVPDVPTVPYITGDGVGAEVTPAMQAVVDAAIRKAYGGKRRIEWKEVLAGERA FNATGSWLPDETMETFQEYLVGIKGPLTTPVGGGIRSLNVALRQTLDLYVCLRPVRWYQG VQSPVKSPEKVNMCVFRENTEDIYAGIEWEAGTPEAEKFYQFLKDEMGVTKVRFPETSSF GVKPVSREGTERLVRAACQYAIDHHLPSVTLVHKGNIMKFTEGGFKKWGYELAQREFGDA LADGRLVIKDCIADAFLQNTLLIPEEYSVIATLNLNGDYVSDQLAAMVGGIGIAPGANIN YQTGHAIFEATHGTAPNIAGKDVVNPCSIILSAVMMLEYFDWKEAAALIEKALEQSFLDA RATHDLARFMPNGTSLSTSAFTREIVERIEKQK >gi|222159274|gb|ACAB01000085.1| GENE 9 11058 - 13301 1977 747 aa, chain - ## HITS:1 COG:SPAC24C9.06c KEGG:ns NR:ns ## COG: SPAC24C9.06c COG1048 # Protein_GI_number: 19114943 # Func_class: C Energy production and conversion # Function: Aconitase A # Organism: Schizosaccharomyces pombe # 12 745 41 769 778 917 58.0 0 MVYDVTMLEAFYAAYKGKVEHVRAILKRPLTLAEKILYAHLYDVADLKDYKRGEDYVNFR PDRVAMQDATAQMALLQFMNAGKDQVAVPSTVHCDHLIQAYKGAKADIATARLTNEEVYD FLRDVSSRYGIGFWKPGAGIIHQVVLENYAFPGGMMVGTDSHTPNAGGLGMVAIGVGGAD AVDVMTGMEWELKMPKIIGVRLTGKLSGWTSPKDVILKLAGILTVKGGTNAIIEYFGPGT ESLSATGKATICNMGAEVGATTSLFPFDGRMATYLRATGRDCVVDWAESVDADLRADDIV TDEPSNYYDRVIEIDLSELEPYINGPFTPDAATPISEFAEKVLLNGYPRKMEVGLIGSCT NSSYQDLSRAASLAKQVTEKNLSVASPLIVNPGSEQIRATAERDGMIEAFERLGATIMAN ACGPCIGQWKRETDDLTRKNSIVTSFNRNFAKRADGNPNTYAFVASPELTMALTIAGDLC 
FNPLKDRLVNHNGEKVKLSEPVGDELPLKGFEQGNEGYIAPHGAKTEIRVKPDSQRLQLL TPFPAWDGQDLLNMPLLIKAQGKCTTDHISMAGPWLRFRGHLENISDNMLMGAVNAFNGE TNRVWNRSTNTYGTVSGTAKMYKSEGIPSIVVAEENYGEGSSREHAAMEPRFLNVRVILA KSFARIHETNLKKQGMLALTFVDKADYDKIREHDLLSVSGLVHFAPGRNLTIVLHHEDGT KESFEVQHTYNEQQIAWFRAGSALNAR >gi|222159274|gb|ACAB01000085.1| GENE 10 13368 - 15305 1268 645 aa, chain + ## HITS:1 COG:MK0070 KEGG:ns NR:ns ## COG: MK0070 COG1112 # Protein_GI_number: 20093510 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Methanopyrus kandleri AV19 # 153 643 135 669 698 263 34.0 1e-69 MKYFFYPERRFVTFAPIMSNNPKSPIIDLQQQQLLLRMEYEYEKEEFKRQTETMGVARKV KRGLCWYPVSLGRSYYNSLNQLVIDITRTENKEIEHSFEFGRPVCFFHQSFEGKVKYMDF IATVSFADEERMVVVLPGAGALAELQTDGILGVQLYFDETSYRAMFEALEDTIRAKDNRL AELRDILLGTQKPGFRELYPVRFPWLNSTQETAVNKVLCTRDVSIVHGPPGTGKTTTLVE AIYETLHREPQVLVCAQSNTAVDWICEKLVDRGVPVLRIGNPTRVNDKMLSSTYERRFES HPAYPELWGIRKSIREMGSRMRRGSYSEREGMRNRMSRLRDRATELEIQINADLFDSARV IASTLVSSNHRLLNGRRFPTLFIDEAAQALEAACWIAIRKADRVILAGDHCQLPPTIKCI EASRGGLDHTLMEKVVQQKPSAVSLLKVQYRMHETIMQFPSDWFYHGELEAAPEVRYRGI LDFDTPMNWIDTSEMDFHEDFVGESFGRINKQEANLLLQELETYIERIGKERILDERIDF GLISPYKAQVQYLRGKIKGSSFLRPFRSLITVNTVDGFQGQERDVIFISLVRANEDGQIG FLNDLRRMNVAITRARMKLVILGDTSTLAKHPFYKRLMLFIKKED >gi|222159274|gb|ACAB01000085.1| GENE 11 15449 - 17416 1553 655 aa, chain - ## HITS:1 COG:no KEGG:BF3757 NR:ns ## KEGG: BF3757 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 22 654 20 635 639 342 38.0 2e-92 MMYFNLPKRMINITKLVCVIGVVSVAFSSCLQKDVYDPANSGDKEEEVTLDNYFDFATTK NIQLNIDYGKECPRAYFEVYAENPLSYVAEGGQIIKKAGVSHIATGFTDIQGRYIKPASF PTAVSEVYIYSPDFGVPTLYKTKVVGSDVSAKITFENALDVTPVDSSTRSAQTRSSLKFI TNVIPNVLGTWNVNTGRPNYLDTSKKINVDATLKSYITTYFPEGKNNVGTNLVSDDADIL IKEDANVVVNYFGGDTGAQSVFAYYCYSENASIDEIRQAAKHACVIFPNAHKSNLGNYSG VAVNLKYINETGSFPEEEPERIPAGTKIGFLIWNDGWRGVKANGNMFYSTKSLNSDKISH TAIFAAKNKAGDRVNVITMEDWKNGENDYNDVAFVISSNPIAAIEVPDVPNPGDRQGTEK 
YSGVLGFEDNWPEQGDYDLNDVVMKYQSSVDYNIDNKVLNIIDKFTLAWTGANYKNSFAY EVPFDLSKASKVTVNGSETSSYSGNVITLFKDAKAELGVSNVNAEDMINQNIQEKTYTVS IQFNNPTLDKSVVVAPYNPFIKVFNSATEVHLTDHKPTTGANNRFPSGADISRGDVDGTY FICKDGFPFAIHVDARLDASILNLDLKKENQRIDKTYPKFAEWAKTRDPQIKWWK >gi|222159274|gb|ACAB01000085.1| GENE 12 17608 - 18651 1286 347 aa, chain - ## HITS:1 COG:YLR355c KEGG:ns NR:ns ## COG: YLR355c COG0059 # Protein_GI_number: 6323387 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Ketol-acid reductoisomerase # Organism: Saccharomyces cerevisiae # 1 346 48 394 395 424 60.0 1e-118 MAQLNFGGTTETVVIRDEFPLEKAREVLKNETIAVIGYGVQGPGQALNLRDNGFNVIVGQ REGKTYDKAVADGWVPGETLFGIEEACEKGTIVMCLLSDAAVMSVWPTIKPYLTAGKALY FSHGFAITWNNRTGVVPPADIDVIMVAPKGSGTSLRTMFLEGRGLNSSYAIYQDATGKAY ERTIALGIGIGSGYLFETTFQREATSDLTGERGSLMGAIQGLLLAQYEVLRENGHSPSEA FNETVEELTQSLMPLFAKNGMDWMYANCSTTAQRGALDWMGPFHDAIKPVVEKLYNSVKT GNEAQISIDSNSQPDYREKLNEELRQLRESEMWQTAVTVRKLRPENN >gi|222159274|gb|ACAB01000085.1| GENE 13 18719 - 19462 676 247 aa, chain - ## HITS:1 COG:CAC3591 KEGG:ns NR:ns ## COG: CAC3591 COG3884 # Protein_GI_number: 15896825 # Func_class: I Lipid transport and metabolism # Function: Acyl-ACP thioesterase # Organism: Clostridium acetobutylicum # 17 215 15 211 248 94 27.0 2e-19 MNEENRIGTYQFVAEPFHVDFNGRLTMGVLGNHLLNCAGFHANDRGFGIATLNEDNYTWV LSRLAIELDEMPYQYENFSVQTWVENVYRLFTDRNFAIIDKDGKKIGYARSVWAMINLNT RKPADLLTLHGGSIVDYVCDEPCPIEKPSRIKVTSDQPIATLTAKYSDIDINGHVNSIRY IEHILDLFPIELYKTKRIRRFEMAYVAESYFGDELSFFCDEVNANEFHVEVKKNGNEVVC RSKVIFE >gi|222159274|gb|ACAB01000085.1| GENE 14 19468 - 20025 487 185 aa, chain - ## HITS:1 COG:MTH1443 KEGG:ns NR:ns ## COG: MTH1443 COG0440 # Protein_GI_number: 15679440 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Methanothermobacter thermautotrophicus # 14 161 16 162 168 75 31.0 7e-14 MSDKTLYTIIVHSENIAGLLNQVTAVFTRRQINIESLNVSASSIKGVHKYTITAWTDKDI 
IEKVVKQIEKKIDVIQAHYFTEDEIYFHEIALYKVSTPAFQENPEASKLIRRYNARIVEV NPVFSIVEKNGMSEDITSLYGELKALNCVLQFVRSGRVAITTSCFERVNEFLDGREAMYN QSKNQ >gi|222159274|gb|ACAB01000085.1| GENE 15 20083 - 21777 1682 564 aa, chain - ## HITS:1 COG:MA3792 KEGG:ns NR:ns ## COG: MA3792 COG0028 # Protein_GI_number: 20092588 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Methanosarcina acetivorans str.C2A # 5 561 9 559 564 503 48.0 1e-142 MKDLITGAEALMRSLEHQGVTTIFGYPGGSIMPVFDALYDHQNILNHILVRHEQGAAHAA QGYARVSGDVGVCLVTSGPGATNTVTGIADAMIDSTPIVVIAGQVGTGFLGTDAFQEVDL VGITQPITKWSYQIRRAEDVAWAVARAFYIARSGRPGPVVLDFAKNAQVEKTKYEPTKVD FIRSYVPVPDTDEESVQAAAELINNAERPLVLVGQGVELGNAQSELREFIEKADMPAGCT LLGLSALPTEHPLNKGMLGMHGNLGPNINTNKCDVLIAVGMRFDDRVTGNLATYAKQAKV IHFDIDPAEVNKNVKVDVAVLGDCKETLASVTKLLKKHTHTEWIDSFKEYEKVEEEKVIR PELHPATNSLSMGEVVRAVSDATHHEAVLVTDVGQNQMISARYFKYTKERSIITSGGLGT MGFGLPAAIGATFGAPERTVCVFMGDGGLQMNIQELGTIMEQKAPVKIICLNNNYLGNVR QWQAMFFNRRYSFTPMLNPDYMKVASAYDIPSKRAFTREELKEAIAEMLATDGPFLLEAC VVEEGNVLPMTPPGGSVNQMLLEC >gi|222159274|gb|ACAB01000085.1| GENE 16 21902 - 23701 1776 599 aa, chain - ## HITS:1 COG:NMB1150 KEGG:ns NR:ns ## COG: NMB1150 COG0129 # Protein_GI_number: 15677026 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Neisseria meningitidis MC58 # 4 597 3 612 619 797 65.0 0 MKKQLRSSFSTQGRRMAGARALWAANGMKKNQMGKPIIAIVNSFTQFVPGHVHLHEIGQL VKAEIEKLGCFAAEFNTIAIDDGIAMGHDGMLYSLPSRDIIADSVEYMVNAHKADAMVCI SNCDKITPGMLMAAMRLNIPAVFVSGGPMEAGEWNGQHLDLIDAMIKSADESVSDQEVAN IEQNACPTCGCCSGMFTANSMNCLNEAIGLALPGNGTIVATHENRTQLFKDAAELIVKNA KLYYEEGDESVLPRSIATRQAFLNAMTLDIAMGGSTNTVLHLLAVAHEAEVDFKMDDIDM LSRKAPCLCKVAPNTQKYHIQDVNRAGGIIAIMDELAKGGLIDTSVRRVDGMSLAEAINE 
YSITSPNVSEKAIKKYSSAAGNKFNLVLGSQGMYYKELDKDRANGCIRDLEHAYSKDGGL AVLKGNIAQDGCVVKTAGVDESIWKFTGPAKVFDSQEAACEGILGGRVVSGDVVVITHEG PKGGPGMQEMLYPTSYIKSRHLGKECALITDGRFSGGTSGLSIGHISPEAAAGGNIGKIV DGDIIEIDIPARKINVRLTDEELAARPMTPVTRDRYVPKSLKAYASMVSSADKGAVRII >gi|222159274|gb|ACAB01000085.1| GENE 17 24182 - 24760 610 192 aa, chain + ## HITS:1 COG:FN1875 KEGG:ns NR:ns ## COG: FN1875 COG1047 # Protein_GI_number: 19705180 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 2 # Organism: Fusobacterium nucleatum # 1 156 1 149 164 85 37.0 6e-17 METVENKYITVAYKLYTMEDGEKELFEEAKAEHPFQFISGLGTTLEDFENQITALSKGDK FDFTIPADKAYGQYDEQHVIDLPKNIFEIDGKFDSERIKEGNIVPLMTGDGQRVNASVVE IKPDVVVVDLNHPLAGADLIFEGEILESRPATNEEIQELVKMMSGEGGCGCGCDSCGDGC GDDCGCEGGHCH >gi|222159274|gb|ACAB01000085.1| GENE 18 24888 - 26174 1288 428 aa, chain - ## HITS:1 COG:ECs3990 KEGG:ns NR:ns ## COG: ECs3990 COG3681 # Protein_GI_number: 15833244 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 7 427 11 435 436 324 43.0 2e-88 MTESERRQIIELIKKEVIPAIGCTEPIAVALCVAKAAETLGAKPEKIEVLLSANILKNAM GVGIPGTGMVGLPIAVALGALIGKSDYQLEVLRDCTSEAVEQGKLFIAEKRICISLKEDI TEKLYIEVICKTEDKTAKAIIAGGHTTFIYIAKNEQTLLDKQQTVSEKEEEASPELNLRK VYDFALTAPLDEIRFILDTARLNKAAAEQAFKGNYGHSLGKMLRGTYEHKVMGDSVFSHI LSYTSAACDARMAGAMIPVMSNSGSGNQGISATLPVVVFAEENGKSEEELIRALMLSHLT VIYIKQSLGRLSALCGCVVAATGSSCGITWLMGGNYNQVAFAVQNMIANLTGMICDGAKP SCALKVTTGVSTAVLSAMMAMENRCVTSVEGIIDEDVDQSIHNLTRIGSQAMNETDKMVL DIMTHKGC >gi|222159274|gb|ACAB01000085.1| GENE 19 26308 - 27393 872 361 aa, chain + ## HITS:1 COG:no KEGG:BT_2081 NR:ns ## KEGG: BT_2081 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 361 1 361 361 625 86.0 1e-178 MKAKHVIVYLLLAIVSSSCIREEALNAEADILSCILPGVAMTTSPIINNNSITIFVGPGT DISELKPEFTLTPGATISPLSGTERNFNTPQEYTVTAADGIWKKAYTVSVIDTELATNYN 
FEDTLGGKKYYIFVERERDKVVMEWASGNAGYAMTGVAKTADDYPTFQITDGKAGKCLSL VTRSTGFFGQLAGMPIAAGNLFIGSFDVSNAMSNPLKATKFGLPFRHVPTYLAGYYKYKA GDQFTEGGKPVNGKRDICDIYAIMYETSESVPTLDGTNAFTSPNLISTARINNAKETNEW TYFKLPFITLPGKFIDKEKLRDGKYNIAIVFTSSLEGDHFNGAIGSTLLIDEAELIYHSE N >gi|222159274|gb|ACAB01000085.1| GENE 20 27515 - 28300 644 261 aa, chain + ## HITS:1 COG:no KEGG:BT_2082 NR:ns ## KEGG: BT_2082 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 261 1 261 261 509 94.0 1e-143 MKIYPYIFSLLACIGIALPGYSQVDRNETLIRSALHGLEYEIKAGFSIGGTAPLPLPVEI RSIDGYNPTLAISIGGEVTKWIAVQNKLGIIVGLRLENKAMTTEATVKNYNMEILGQGGE RISGVWTGGVKTKVHTAGLTIPLMATYKLSNRWNIKAGPYFSYLLSREFSGHVYEGYLRE DDPTGPKVEFTDGKIATYDFSDDLRHFQWGLQVGAGWRAFKHLNVYADLTWGLNDIFKSD FNTITFAMYPIYLNIGFGYAF >gi|222159274|gb|ACAB01000085.1| GENE 21 28406 - 28972 648 188 aa, chain - ## HITS:1 COG:no KEGG:BT_2083 NR:ns ## KEGG: BT_2083 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 186 1 185 187 254 72.0 1e-66 MKKNLLYLLALVCSLTFFAACSSDDDDSDNKNNGNPPEEEAAITAPDVVGTYWGNLDISM IPDGSDQEIVIGDGIEKFITLSQVSNTEVKIELKEFELFINQQILKFGDIVVDKCEVKKG EGVSTFTGQQDLTFEGNAAALGTCPVTVTGTVEDGNADMTINVKVSTLQQTVKVTYSGVK QVAESGGN >gi|222159274|gb|ACAB01000085.1| GENE 22 29135 - 30211 1122 358 aa, chain + ## HITS:1 COG:sll1747 KEGG:ns NR:ns ## COG: sll1747 COG0082 # Protein_GI_number: 16330007 # Func_class: E Amino acid transport and metabolism # Function: Chorismate synthase # Organism: Synechocystis # 1 351 1 353 362 357 53.0 2e-98 MFNSFGNIFRLTSFGESHGKGVGGVIDGFPAGIVIDEEFVQQELNRRRPGQSILTTARKE ADKVEFLSGIFEGKSTGCPIGFIVWNENQHSNDYNNLKNVYRPSHADYTYTVKYGIRDHR GGGRSSARETISRVVAGALAKLALRQLGISITAYTSQVGPIKLEGTYSDYDLDLIETNDV RCPDPEKAKEMADLIYKVKGEGDTIGGTLTCVIKGCPIGLGQPVFGKLHAALGNAMLSIN AAKAFEYGEGFKGLKMKGSEQNDVFYNNNGRIETHTNHSGGIQGGLSNGQDIYFRVVFKP IATLLMEQETVNIDGIDTTLKARGRHDACVLPRAVPIVEAMAAMTILDYYLLDKTTQL >gi|222159274|gb|ACAB01000085.1| GENE 23 30211 - 31575 1107 454 aa, chain + ## HITS:1 
COG:DR2025 KEGG:ns NR:ns ## COG: DR2025 COG0624 # Protein_GI_number: 15807020 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Deinococcus radiodurans # 18 441 18 440 459 411 48.0 1e-114 MNEIQKYIIAHEPEMMNDLFSLIRIPSISALPEHHDDMLACAERWAQLLLEAGADEALVM PSKGNPIVFAQKMVDPDAKTVLVYAHYDVMPAEPLELWKSQPFEPEIRNGYIWARGADDD KGQSFIQVKAFEYLVKNGLLKNNVKFIFEGEEEIGSPSLEAFCEEHKELLKADVILVSDT SMLGAELPSLTTGLRGLAYWEIEVTGPNRDLHSGHFGGAVANPINTLCQIISKVTDADGR ITVPGFYDDVEEVPQAEREMIAHIPFDEKKYKEAIGVKELFGEKGYSTLERNSCRPSFDV CGIWGGYTGEGSKTVLPSKAYAKVSCRLVPHQDHHKISQMFAEYISSIAPETVQVKVTPM HGGQGYVCPISLSAYQAAEKGFEIAFGKKPLAVRRGGSIPIISTFEQVLGIKTVLMGFGL ESDAIHSPNENFSLDIFRKGIEAVIEFHQEYARR >gi|222159274|gb|ACAB01000085.1| GENE 24 31689 - 32747 972 352 aa, chain - ## HITS:1 COG:AGl573 KEGG:ns NR:ns ## COG: AGl573 COG3049 # Protein_GI_number: 15890402 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Penicillin V acylase and related amidases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 24 346 28 353 355 378 56.0 1e-104 MKKRLVGVALVLAAVSLGSIQPVEACTRAVYIGPDQMVITGRTMDWKEDIMTNIYVFPRG IQRAGHNKDKTVNWTAKYGSVIATGYDIGTCDGMNEKGLVASLLFLPESIYSLPGDTRPA MGISIWTQYVLDNFATVREAVDELKKETFRIDAPRMPNGGPESTLHLAITDETGNTAVLE YLDGKLSIHEGKEYRVMTNSPRYDYQLAINDYWKEIGGLQMLPGTNRASDRFVRASFYIH AIPQTADAKIAVPSVLSVMRNVSVPFGINTPEKPYISSTRWRSVSDQKNKVYYFESTLTP NMFWLDLKKIDFSPKAGIKKLSLAKGEIYAGDAVKDLKDSQSFTFLFQTPVM >gi|222159274|gb|ACAB01000085.1| GENE 25 32944 - 33072 124 42 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237720084|ref|ZP_04550565.1| ## NR: gi|237720084|ref|ZP_04550565.1| conserved hypothetical protein [Bacteroides sp. 
2_2_4] # 2 42 246 286 288 87 97.0 3e-16 MHKVLRGGSYYTFEKYCKVTSRYGVTPQRWDIDYGLRLVVSL >gi|222159274|gb|ACAB01000085.1| GENE 26 33100 - 35136 1353 678 aa, chain - ## HITS:1 COG:no KEGG:BT_2246 NR:ns ## KEGG: BT_2246 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 630 5 628 682 208 26.0 7e-52 MNRFANYYKKIFSQEYIDRTISGGIKSQLTLLLVTIATALAIFFIIAMLFSIQLHGHEEW GERLWAVYNNFVDPGNQIEETAWPNRILVGLISISGSVLLGGVLISTISNIIERRVGVVY AGRMTYRNIKNHYVLIGFNELSINMIRELYDECPSARILLMSGMESATVRHRIQSALPVE VERQVLVYFGNIESIEELQRLNIESAIEVYVLGDEERYGRDAKNIAIVHLVSTLRGKCYD GKMMPVYVQFDSIPSYSNIQKMNLPPEVFCIEGKPNIFFRPFNLHENLARQLWSLYGADC ERRYDPLDYRPISITQQPDGSWSATSQDYVHLVIVGFNRVGRSLLLEALRICHYANYDDR LPADERIRTRITLVDREMEAQKDYFKAQFPYIESQIDDIEVEYCHDDICSTAMRTRLQQW AQNKHCMLTVAICVHDPDLSLSLGLNLPHEVYQYQCRVLIRQEFNNDLSSMVDDEKGRYR YVKVFGMVDRGIKKNILQDKLALYVNYLYDCCYADESLKQKEVLKKMYESYGNHSADFIL MNHQAQYLWNKLSEPLRWANRYQLDAYSVFCRTLGYGIRRSDHSPAHISESMFNEDLPAQ VLYLLVRMEKYRWNAERTVAGWRRAKVKDKVFLQHPLIMPFSELLQKYPEEVEKDADVIY NLPYILALGGYELYRLAD >gi|222159274|gb|ACAB01000085.1| GENE 27 35246 - 37423 1947 725 aa, chain - ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 2 721 4 650 709 446 39.0 1e-125 MIVCIAEKPSVARDIAEVLGAHTRKEGYIEGNGYQVTWTFGHLCTLKEPHEYTPNWKSWS LSSLPMIPPRFGIKLIGHDPGIEKQFHVIEWLMQNADEIINCGDAGQEGELIQRWVMQKA GARCPVKRLWISSLTEEAIREGFSKLKDQAEFQSLYEAGLSRAMGDWLLGMNATRLYTLK YGQNKQVLSIGRVQTPTLALIVNRQLEIANFEPKQYWELKTNYRNTTFSALIRKSDEEIA AEEEKNGGKKKIDNPGIDPIANREEGEALVERIKDLPFVVTNVGKKDGKEYAPRLFDLTS LQVECNKKFAYSADETLKLIQSLYEKKVATYPRVDTTFLSDDIYPKCPAILKGLRDYEVL TAPLAGTTLLKSKKVFDNSKVTDHHAIIPTGVYAQNLTDMERRVYDLIARRFIAVFYPDC KISTTTVMGEVDKIEFRVTGKQILEPGWRVVFAKEVKDPTEEKEEEDENVLPAFVKGESG PHIPDLNEKWTQPPRPYTEATLLRAMETAGKLVDNDELRDALKENGIGRPSTRAAIIETL FKRNYIRKERKNLIATPTGVELVQLIHEELLKSAELTGIWEKKLREIEKKTYDARQFLEE 
LKQMVSEIVMSVLSDNTNRRITIQDAVAAKAEEKEKKEPKKRERKPSTPKEKKPKAEKTT NEVKTDSPGSPAPVAMAASTPSTAGEVDAFVGQPCPLCGKGTIIKGKTAYGCSEWRNGCT FRKNF >gi|222159274|gb|ACAB01000085.1| GENE 28 37547 - 39109 1305 520 aa, chain + ## HITS:1 COG:no KEGG:BDI_2898 NR:ns ## KEGG: BDI_2898 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 520 1 520 522 629 60.0 1e-179 MIDLNRKLPIGIQTFEKIRKGNYLYVDKTALVGKMVSTSIPYFLSRPRRFGKSLLISTFE AYFLGRKDLFEGLAISQMEVDWQVYPVFHIDLNARKYDSPADLIAMLNQHLEKWEAIYGT EKQDRQPEERFAYIIERACVQTGKQVVVLVDEYDKPLLQALDNLLLFEEYRKMLKAFYGV LKSADRYLRFVFLTGVTKFSQVSVFSDLNQLNDISMKPPYATICGITKQELIDTFTPELD KLASYNRMTPEDTIHKMTSLYDGYHFCEYAEGMFNPFSVLNVFDGYKFENYWFQTGTPTF LVKMLMDSNYDLRTLIDGVEANAASFNEYRAESRNPIPLIYQSGYLTIKEYDPRFKTYQL AFPNDEVRYGFMNFLLPFYSNIPDNEQDFYIGKFVHELESGNINAFLTRLQAFFADIPYE LNDQTERHYQTVFYLIFKLMGQFTQAEVRSAKGRADAVVKTPKYIYVFEFKLNGTAEQAL QQIEDKGYLIPYQADEREVKKVGVEFSTDTRNVSRWLPEE >gi|222159274|gb|ACAB01000085.1| GENE 29 39157 - 40875 1458 572 aa, chain - ## HITS:1 COG:no KEGG:BT_4445 NR:ns ## KEGG: BT_4445 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 572 1 570 571 970 83.0 0 MARITIPYAVADFAEMRERGFYYVDKTNYIPGLEDYNAPIFLRPRRFGKSLLISMLAHYY DRTKANRFEELFGGTWIGEHPTEEHNQYLVIRYDFSAMVMADDMEGVVQNFNDLNCGPVE VTVEHNRDLFGDFQFTTRGNAVQMLEELLGYISSHGLPKAYILIDEYDNFTNQLLTSYND SLYEEVTTSDSFLLTFFKVIKAGIGEGTIRTCFCTGVLPVIMDDLTSGYNIAEILTFKPV FLNMLGFTYEETKTYLRYVLDKYAPGASEERFEEIWQLIVNNYDGYRFSPVGERLFNSTI LTYFFKKFAANAGSIPSELIDENLRTDINWIRRLTLSQNNAKEMLDALVIGGELSYNVRD LSGKFNKKKFFEKEFYPVSLFYLGMTTLKSAYRMVLPNMTMRSIYMDYYNQLNKIEGNAS RYVPVYELYDSNRSFEPLVQNYFEQYLGQFPAQVFDKINENFIRCSFYELVSRYLSHCYT FAIEQNNSVGRSDFEMIGIPGTDYYTDDRVVEFKYYRSKDADRMLALTEPLVEHVEQVKG YAADTKRKFPNYHVRSYIVYICANKGWKCWEV >gi|222159274|gb|ACAB01000085.1| GENE 30 41161 - 41928 775 255 aa, chain - ## HITS:1 COG:no KEGG:BT_0003 NR:ns ## KEGG: BT_0003 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 251 1 
256 258 278 59.0 1e-73 MKNVLKIWLVDNTVTVDNKDDKIGQLESSGNLSLQDILDEMHKEDTGLRPETIEHVVKLY NRVVSDLILSGYSVNTGLYHAVAQLRGVIDGGKWNPEKNSVYVSFTQDKELRETIAQTSI SILGERQSVMYVAGFSPATATAGRPFTVNGRMLKIAGTDSSVGITITDSKEQTTPVDLNM LAVNNPSQLTFIVPAGLADGEYTLTVTTQYAGSTLLKTPRTAIATFYVGTKPDAGGGSGS GGSEGGGGEAPDPAA >gi|222159274|gb|ACAB01000085.1| GENE 31 42117 - 43658 931 513 aa, chain - ## HITS:1 COG:no KEGG:BT_1284 NR:ns ## KEGG: BT_1284 # Name: not_defined # Def: putative endo-beta-N-acetylglucosaminidase F1 precursor (mannosyl-glycoprotein endo-beta-N-acetyl-glucosaminidase F1) # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 462 1 456 508 139 26.0 2e-31 MKRIIRNISLCVLLALVMGACEEDAALVAPNVAHDKSTVSPEEVGEAAFLQSASESGNKF IFLDFVEQGSDELYVELVEPAERDMTFTIGINTSLVLQDIPSITEQFYKVGANFVEDWEN KFTITNGGKVTVAAGERKSTRISFTVSNMELIGSYYLLPLLATDEDGNQYKLFYYICQVE MERQDRSNKPYTVVAYIDTEMMNPLIADQYTSKLQYMKSRRDRKTIYENVPVFDIVNLRK ALLKYDESSRRAILKFTPDIEHVLKNYAQYIQPLKRSGIKVCLSIEGGGTGIGFANLTDV QIADFVAQVQVAVKMYQLDGVHLRDEGAGYDKTGAPELDETSYPKLVKALREAMPDIMLT LADDGGTTAMMDKEQGGIVVGDYIDLAWNVVWDTAVNPWASGSERKPIAGITKERYGGIS FYIKPIMTNEEGLFFENLQDESRAIALNEGLGKVAVAENIPYRDYIQEVANVMRIMGLLS CFHDTNNGEYPRYNVEVTEALAASYYFAFRKDW >gi|222159274|gb|ACAB01000085.1| GENE 32 43693 - 44163 334 156 aa, chain - ## HITS:1 COG:no KEGG:BT_1283 NR:ns ## KEGG: BT_1283 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 156 283 439 440 165 50.0 6e-40 TATIAFYVNGQLQSISTSHGKSDLTSISLGDKLPDDEFGNGGDFNFYFGRSYGESHDISR QFDGEICEARIWNVARTQEQIWENMYDIPDPTEEPALCAYWKFDEGTGMEVKDRTGHGNN AKVVPYWKASDHVEVYSKTDAELWPSGIEVPKINQE Prediction of potential genes in microbial genomes Time: Wed May 18 03:09:52 2011 Seq name: gi|222159273|gb|ACAB01000086.1| Bacteroides sp. D1 cont1.86, whole genome shotgun sequence Length of sequence - 80308 bp Number of predicted genes - 57, with homology - 56 Number of transcription units - 24, operones - 15 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 2 - 824 458 ## BT_1283 hypothetical protein 2 1 Op 2 . - CDS 837 - 1739 940 ## BT_1282 hypothetical protein 3 1 Op 3 . - CDS 1765 - 3405 1847 ## BT_1281 hypothetical protein 4 1 Op 4 . - CDS 3411 - 6773 2837 ## BT_1280 hypothetical protein - Prom 6910 - 6969 6.1 5 2 Op 1 6/0.000 - CDS 6980 - 7987 787 ## COG3712 Fe2+-dicitrate sensor, membrane component - Term 8007 - 8045 -0.3 6 2 Op 2 . - CDS 8051 - 8602 490 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 8628 - 8687 5.2 - Term 8671 - 8720 12.3 7 3 Op 1 7/0.000 - CDS 8751 - 10898 2417 ## COG1884 Methylmalonyl-CoA mutase, N-terminal domain/subunit 8 3 Op 2 . - CDS 10900 - 12801 1979 ## COG1884 Methylmalonyl-CoA mutase, N-terminal domain/subunit - Prom 12872 - 12931 7.1 + Prom 12764 - 12823 6.0 9 4 Tu 1 . + CDS 13071 - 14735 1539 ## COG2985 Predicted permease + Term 14758 - 14810 8.2 - Term 14746 - 14797 10.2 10 5 Op 1 . - CDS 14830 - 15462 591 ## BT_2093 hypothetical protein 11 5 Op 2 . - CDS 15515 - 18271 1729 ## BF2098 hypothetical protein - Prom 18395 - 18454 7.3 - Term 18448 - 18501 6.8 12 6 Op 1 . - CDS 18524 - 19825 802 ## Fjoh_1410 PKD domain-containing protein 13 6 Op 2 . - CDS 19856 - 21172 910 ## Fjoh_1873 PKD domain-containing protein 14 6 Op 3 . - CDS 21193 - 22353 721 ## Cpin_3942 hypothetical protein 15 6 Op 4 . - CDS 22350 - 24356 1184 ## COG4206 Outer membrane cobalamin receptor protein - Prom 24515 - 24574 5.2 - Term 24525 - 24580 2.1 16 7 Tu 1 . - CDS 24726 - 24926 98 ## gi|294644167|ref|ZP_06721941.1| hypothetical protein CW1_1690 - Prom 24963 - 25022 2.6 + Prom 24764 - 24823 6.4 17 8 Op 1 33/0.000 + CDS 24904 - 26037 1042 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 18 8 Op 2 35/0.000 + CDS 26112 - 27050 695 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 19 8 Op 3 . 
+ CDS 27047 - 27802 217 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Term 28039 - 28100 0.3 - Term 27794 - 27856 10.1 20 9 Tu 1 . - CDS 27906 - 29075 927 ## COG0738 Fucose permease - Term 29094 - 29142 11.0 21 10 Op 1 . - CDS 29155 - 31527 2227 ## COG3537 Putative alpha-1,2-mannosidase 22 10 Op 2 . - CDS 31527 - 32618 1023 ## BF0331 putative hydrolase 23 11 Op 1 . - CDS 32730 - 33956 1152 ## ZPR_4380 glycosyl hydrolase family 76 24 11 Op 2 . - CDS 33969 - 35108 842 ## ZPR_4381 hypothetical protein 25 11 Op 3 . - CDS 35139 - 36743 1425 ## ZPR_4382 RagB/SusD family protein 26 11 Op 4 . - CDS 36755 - 39802 2821 ## ZPR_4383 TonB-dependent receptor Plug domain protein 27 11 Op 5 3/0.000 - CDS 39842 - 40942 918 ## COG1940 Transcriptional regulator/sugar kinase 28 11 Op 6 . - CDS 40976 - 42706 1585 ## COG1482 Phosphomannose isomerase - Prom 42743 - 42802 7.6 + Prom 42751 - 42810 9.6 29 12 Tu 1 . + CDS 42849 - 43577 538 ## COG2188 Transcriptional regulators + Term 43624 - 43673 0.1 30 13 Op 1 . - CDS 43599 - 44864 723 ## COG3550 Uncharacterized protein related to capsule biosynthesis enzymes 31 13 Op 2 . - CDS 44861 - 45190 343 ## BF1147 putative transcriptional regulator - Prom 45228 - 45287 4.6 - Term 45400 - 45438 2.0 32 14 Op 1 . - CDS 45447 - 45599 171 ## BT_2116 hypothetical protein 33 14 Op 2 . - CDS 45603 - 45755 96 ## + Prom 45842 - 45901 6.6 34 15 Tu 1 . + CDS 46042 - 46641 454 ## BT_2534 hypothetical protein + Term 46661 - 46717 1.1 - Term 46393 - 46435 -0.3 35 16 Op 1 . - CDS 46679 - 46882 295 ## gi|160885246|ref|ZP_02066249.1| hypothetical protein BACOVA_03245 36 16 Op 2 . - CDS 46892 - 47383 491 ## BF3041 hypothetical protein 37 16 Op 3 . - CDS 47380 - 47646 271 ## gi|237714767|ref|ZP_04545248.1| conserved hypothetical protein + Prom 47791 - 47850 6.1 38 17 Tu 1 . + CDS 47977 - 48456 466 ## BT_2538 hypothetical protein + Prom 48516 - 48575 9.1 39 18 Tu 1 . 
+ CDS 48599 - 48844 231 ## BT_1170 hypothetical protein + Prom 49011 - 49070 9.4 40 19 Op 1 . + CDS 49109 - 53110 2697 ## COG0642 Signal transduction histidine kinase 41 19 Op 2 . + CDS 53144 - 56011 1742 ## COG3250 Beta-galactosidase/beta-glucuronidase + Prom 56013 - 56072 2.6 42 20 Op 1 . + CDS 56107 - 56709 379 ## Slin_1080 protein of unknown function DUF303 acetylesterase putative 43 20 Op 2 . + CDS 56728 - 57678 469 ## Cpin_1583 sialate O-acetylesterase (EC:3.1.1.53) 44 20 Op 3 . + CDS 57728 - 60748 1666 ## Phep_1293 coagulation factor 5/8 type domain protein + Prom 60753 - 60812 8.0 45 21 Op 1 . + CDS 60863 - 63922 2156 ## Fjoh_2077 TonB-dependent receptor 46 21 Op 2 . + CDS 63939 - 65552 1336 ## Fjoh_2078 RagB/SusD domain-containing protein 47 21 Op 3 . + CDS 65597 - 67423 1361 ## gi|237714776|ref|ZP_04545257.1| conserved hypothetical protein 48 21 Op 4 . + CDS 67445 - 68992 1213 ## Fjoh_2081 hypothetical protein 49 21 Op 5 . + CDS 69037 - 71421 1270 ## COG4289 Uncharacterized protein conserved in bacteria 50 21 Op 6 . + CDS 71441 - 73243 1102 ## Cpin_1751 coagulation factor 5/8 type domain protein 51 21 Op 7 . + CDS 73250 - 75811 1363 ## COG4289 Uncharacterized protein conserved in bacteria 52 21 Op 8 . + CDS 75821 - 77044 778 ## COG4289 Uncharacterized protein conserved in bacteria 53 21 Op 9 . + CDS 77067 - 78176 640 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins + Term 78242 - 78277 -0.0 54 22 Tu 1 . - CDS 78213 - 78602 146 ## BT_2114 hypothetical protein - Prom 78634 - 78693 4.4 55 23 Op 1 . + CDS 78900 - 79193 363 ## COG1669 Predicted nucleotidyltransferases 56 23 Op 2 . + CDS 79210 - 79581 176 ## BT_2210 hypothetical protein + Prom 79608 - 79667 1.7 57 24 Tu 1 . 
+ CDS 79761 - 80195 236 ## BF3357 hypothetical protein Predicted protein(s) >gi|222159273|gb|ACAB01000086.1| GENE 1 2 - 824 458 274 aa, chain - ## HITS:1 COG:no KEGG:BT_1283 NR:ns ## KEGG: BT_1283 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 274 8 281 440 281 51.0 2e-74 MRIHKLYQCFAAAVLLFLPVACDNTDYSEKSPFDNSAYLNIAASKNTETFTFNRKVTSQT KLFTVKLSYPSGDDVKVSLKVDASLTSEYNAKNDTHYEVLPETYYQLLKTEVVIPAGKTT SEEVGIKFSKLDELEIDVTYLCPLSIGGAEGVGIMDGSRTMYYLVRRSSAITTAMNLKNV YVTVPGFDKGSPTADVVNNLTAVTMEAIIRVNNFQQEISSIMGIEQYFLMRIGDANFPNQ QLQTQTTFGKFPEISNQKLLLAGEWYHVALTWDI >gi|222159273|gb|ACAB01000086.1| GENE 2 837 - 1739 940 300 aa, chain - ## HITS:1 COG:no KEGG:BT_1282 NR:ns ## KEGG: BT_1282 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 297 6 324 327 157 33.0 4e-37 MKYKSIFKLCLLTTFLSVTTISCDDWTEMEIHDSQVNGFKEQNPQEYAAYTQNLRAYKAT KHAVVYARLDNAPEVSTGEKDFLRALPDSIDIVTMRNADRLSEYDREDMKLVREDYGTKV LYYIDCTAKDKLNTSITSAVEAVRTGTFDGLALGSEGSAVDVSALKSLLDALGQTPCLLV FEGTPSLLPEAQRSLFNYFVLDISGASDEYDIETSVFYATGYGKAAPDRLLLAVTPDGTL TDYNGVTRNAIAGAAYGALNMETPLGGIAIYNISADYYDTDIIYKQTRGGIQFLNPASAH >gi|222159273|gb|ACAB01000086.1| GENE 3 1765 - 3405 1847 546 aa, chain - ## HITS:1 COG:no KEGG:BT_1281 NR:ns ## KEGG: BT_1281 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 535 1 528 531 555 51.0 1e-156 MMDMKRIYNILLIMILSLFLLPLGGCFDSDINRSMYEADGDEMQRENHIVGATLKGMQGL VIPTREHLYQFMDAMAGGAYGGYLEGIVDTWVMKFSTFNPEQGWLKSPFADPIKDMYPQY RDMMNKTDDPVALAFGKILRVCIMHRVTDIYGPIPYSKMMDNDNSGEDLAVAYDSQEQVY TQMLKELEEADKVLEENKDLSSEAFRKLEDLYYGNISKWRKFVHSMQLRIAMRMSYVNPT EAQRIAQKAVEAGVIESNEDNAMLHVAENRSELLFNNWNDYRISADLVSIMKGYEDPRLD KMFVKGVQTVDQDGEKVDVYDYYGVRIGIFTQKKDDMINLYSKQVISSTDPYLWMNAAEV TFLRAEGALRGWNMGGDAQALYEKAIALSFEERGGATGADQYVKDAEKKPIDYVNPMDGA DIKYSHPAVSTITIAWEPGAEYFERNLERIITQKWIAIFPLGLEAWAEHRRTGYPKLLPA VENKDPNNSVNVTIGPRRLPFPADEYTGNPKYIDQAVEMLNGPDAAGTKLWWDKKDHSIE NSQSSN 
>gi|222159273|gb|ACAB01000086.1| GENE 4 3411 - 6773 2837 1120 aa, chain - ## HITS:1 COG:no KEGG:BT_1280 NR:ns ## KEGG: BT_1280 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1120 1 1110 1110 1605 72.0 0 MQQIYEFIANKQAISRVFMLFLCCLLLSAPTFAQSSNVRITIHKKNISVIEALKEVEKQT KLSVGYNESQLKNKPSINLALEKATLESSLKEILRNTGYTYQLKGKYVIIVPQEKQEAEA ASVKRITGRVLDDNGEPLIGVNVHVEGTSVGSITDIDGNFILEAPVGSVLSISYVGYTPQ TVKVTGQNTYSVQLASDTKLLSEVVVTALGIKREQKALSYNVQQVKSDELTQIKDANFVN SLNGKVAGVTINTSSSGVGGASRVVMRGTKSIEQSSNALYVIDGIPMYNFGGGGDTEFGS KGATEAIADINPEDIESISVLTGAAAAALYGSNAANGAIVITTKKGQVGKLQVGVSSGIE WLNAFKMPEFQDRYGTGSNGKTGNSNIYSWGPKLNSAAQTGYEPDDFFDTGAVYTNSVTL STGTDKNQTFFSAAAVNSAGMVPNNRYNRYNFTFRNTTSFLNDKMKLDVSASYILQNDRN MTNQGQYSNPLVSAYLFPRGDDFSIVKNFERWDEARKISVQFWPQGEGDLRMQNPYWIAY RNLRLNNKKRYMASAGLSYQILDWLNVAGRVRIDNTHSEYEGKLYASSSNTLTDGSSQGH YTVNNGQYSQTYADVLVNINKRIQDFTIVANIGASYSGVTSKELGYAGPIRETGIPNLFN VYDLDNAKKRATQVGWREATESIFASAEVGWKSMLYLTVTGRNDWASQLTHSPQASFFYP SVGLSAVITEMLKLPDWVDYLKVRGSFSSVGNPYPRFLTYPTYSYDANKQDWKSQTNYPI GKLYPERTDSWEVGLDATLFKDFKLSGSFYYANTYNQTFDPRLPVSSGYDKLYVQTGYVR NYGFEAMLSYGHRWGDFGWDSSFTFSANKNEIVELVKDYVHPETGKTYNVDKLELKTDEG RGFGKAKFILKEGGTLGDLYTHADLKRDINGNVLIDDSGNVTAIDNAGDIKLGSVLPKAN LAWNNSFSYKGINAGFLLTARLGGIVYSATQAYLDLYGVSETSAAARDAGGVWINGRSRV NPQSFYEVVASQSGLPTYYTYSATNLRLQEAHIGYTVPRRWLGNVCDINVSLVGRNLWMI YCKAPFDPEAVATTNNYYQGIDYFMMPSTRNIGFNIKINF >gi|222159273|gb|ACAB01000086.1| GENE 5 6980 - 7987 787 335 aa, chain - ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 16 288 27 289 331 76 25.0 7e-14 MKNYIRDVIDRYINHDYPKEVDQDFRTWLTDEERADEKDHELNQLWEATESATTSGYRHS LERMRELTGIGTRRRIHSLRTHLVVWRVAAALLIAFSSVSIYLALQNRQAPDLLQAYIPT AEMRNITLPDGTQVLINSQSTLLYPQQFKGDTRCVYLVGEAGFKVKRDEEHPFIVKSSDF 
QVTALGTEFNVTAYPDEEEVTATLISGKVLVEYNNQQGQEILKPNEQLAYNKRTRSGNVL HPDMQDVTAWQRGEIVFRSMTLEEIFTRLERKYPYTFVYSFRSLKEDRFNLTFGQNASME EVMDIIARVTGNLDYKIVGDKCYISGTGEGRLRHR >gi|222159273|gb|ACAB01000086.1| GENE 6 8051 - 8602 490 183 aa, chain - ## HITS:1 COG:all2193 KEGG:ns NR:ns ## COG: all2193 COG1595 # Protein_GI_number: 17229685 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Nostoc sp. PCC 7120 # 28 174 50 193 201 68 30.0 7e-12 MNETIAELDIRFEQFYAANFPRVKNFAKLLTKSEEDAEDIAQNIFLKLWTRPELWQDQET MTGYLYTVTRNEIFNLFKHQKVEQEYEDKVLKAQLLGELCDEDSSLLENLYYKEVLMLIR MTLSQLPDRRRRVFEMSRFEGLSHKEIADKLQIPLRTVEDHIYKTLTELRKVLMFVILFR LFP >gi|222159273|gb|ACAB01000086.1| GENE 7 8751 - 10898 2417 715 aa, chain - ## HITS:1 COG:BH2955_1 KEGG:ns NR:ns ## COG: BH2955_1 COG1884 # Protein_GI_number: 15615517 # Func_class: I Lipid transport and metabolism # Function: Methylmalonyl-CoA mutase, N-terminal domain/subunit # Organism: Bacillus halodurans # 24 587 19 582 582 847 72.0 0 MRKDFKNIDIYAAFQPANGAEWQKANGISADWKTPEHIEVKPVYTKEDLEGMEHLGYAAG LPPYLRGPYSVMYTLRPWTIRQYAGFSTAEESNAFYRRNLASGQKGLSVAFDLATHRGYD PDHERVVGDVGKAGVSICSLENMKVLFDGIPLNKMSVSMTMNGAVLPIMAFYINAGLEQG AKLEEMAGTIQNDILKEFMVRNTYIYPPAFSMKIISDIFEYTSQKMPKFNSISISGYHMQ EAGATADIELAYTLADGLEYLRAGTAAGIDIDAFAPRLSFFWAIGTNHFMEIAKMRAARM LWAKIVKQFNPKNPKSLALRTHSQTSGWSLTEQDPFNNVGRTCIEAMAAALGHTQSLHTN ALDEAIALPTDFSARIARNTQIYIQEETYICKNVDPWGGSYYVESLTNELAHKAWEHIQE IEKLGGMAKAIETGIPKMRIEEAAARTQARIDSGQQTIVGVNKYRLEKEAPIDILEIDNT AVRLEQIENLKRLKEGRNQAEVDKALAAITECVKTGKGNLLELAVEAARVRATLGEISYA CEQVVGRYKAIIRTISGVYSSESKNDSDFKRACELAEKFAKKEGRQPRIMVAKMGQDGHD RGAKVVATGYADCGFDVDMGPLFQTPAEAAREAVENDVHVVGVSSLAAGHKTLVPQIIEE LKKLGREDIVVIAGGVIPAQDYDFLYKAGVAAIFGPGTPVAKAACQILEILMDEE >gi|222159273|gb|ACAB01000086.1| GENE 8 10900 - 12801 1979 633 aa, chain - ## HITS:1 COG:BH2956_1 KEGG:ns NR:ns ## COG: BH2956_1 COG1884 # Protein_GI_number: 15615518 # Func_class: I Lipid transport and metabolism # Function: Methylmalonyl-CoA 
mutase, N-terminal domain/subunit # Organism: Bacillus halodurans # 8 470 9 468 525 217 30.0 5e-56 MADKKEKLFSDFSPVSTEQWMEKVTADLKGADFEKKLVWKTNEGFKVKPFYRMEDLEGLK TTDALPGEFPYLRGTKKDNNEWLVRQEIRVECPKEANAKALDILNKGVDSLSFHVKAKEL NAEYIETLLNDIQAECVELNFSTCQGHVVELANLLVAYFQKKDYDVKKLKGSINYDFFNK MLTRGKEKGDMVQTAKALIEAIQPLPFYRVLNVNAISLNNAGAYISQELGYALAWGNEYM NQLTDAGIPAAVVAKKIKFNFGISSNYFLEIAKFRAARLLWANIVASYKPECLRDCDNKG ANGECRCAAKMAVHAETSTFNLTLFDAHVNLLRTQTEAMSAALGGVDSMTVVPFDKTYGT PDELSERLARNQQLLLKEESHFDKVIDPAAGSYYIENLTVSIAKQAWELFLATEEAGGFY AALKAGTVQAAVNESNKARHKAVAQRREILLGTNQFPNFNEKAGDKKPVEGKCCCGGDSH TCEKDVDTLVFDRAASEFEALRLETEASGKRPKAFMLTIGNLAMRQARAQYSCNFLACAG YEVVDNLGFETVEAGVEAAMAAKADIVVICSSDDEYAEYAVPAFKALNGRAMFIVAGAPA CMDDLKAAGIENFIHVRVNVLDTLKEFNAKLLK >gi|222159273|gb|ACAB01000086.1| GENE 9 13071 - 14735 1539 554 aa, chain + ## HITS:1 COG:STM3807 KEGG:ns NR:ns ## COG: STM3807 COG2985 # Protein_GI_number: 16767092 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Salmonella typhimurium LT2 # 28 549 18 545 553 390 42.0 1e-108 MDWLQSLLWAPSSVAHIVLLYAFVVAAGVYLGKIKIFGVSLGVTFVLFAGILMGHFGFTA DTHILHFIREFGLILFVFCIGLQVGPSFFSSFKKGGMTLNLLAVGIVVLNIAVALGLYYL WNGRVELPMMVGILYGAVTNTPGLGAANEALNQLNYTGPQIALGYACAYPLGVVGIIGSI IAIRYIFRVNMAKEEESLKIQSGDAHHKPHMMSLEVRNESISGKTLIEIKEFLGRQFVCS RIRHEGHVSIPNHETIFNMGDQLFIVCSEEDAPAITVFIGKEVELDWEKQDLPMVSRRIL VTKPEINGKTLGSMHFRSMYGVNVTRINRSGMDLFADPNLVLQVGDRVMVVGQQDAVERV AGVLGNQLKRLDTPNIVTIFVGIFLGILLGSLPIAFPGMPTPLKLGLAGGPLVVAILIGR FGHKLHLVTYTTMSANLMLREIGIVLFLASVGIDAGANFVQTVVEGDGLLYVGCGFLITV IPLLIIGAIARLYYKVNYFTLMGLIAGSNTDPPALAYANQTTSGDAPAVGYSTVYPLSMF LRILTGQMILLTMM >gi|222159273|gb|ACAB01000086.1| GENE 10 14830 - 15462 591 210 aa, chain - ## HITS:1 COG:no KEGG:BT_2093 NR:ns ## KEGG: BT_2093 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 209 1 209 211 400 95.0 1e-110 MSGNYVINIGRQLGSGGKEIGEKLAARLGIDFYDKELINLASEESGLCKEFFEKADEKAS 
QGIIGGLFGMRFPFISEGAMPCNNCLSNDALFKVQSDVIRHLAAEKSCVFVGRCADYILR EHPRCANVFISASKEDRIARLCAMHHIDAEAAEEMIEKADKRRSEYYNYYSYKTWGAAAT YHLCVDSSSLGVEETVRFIEEFVVKKLQLI >gi|222159273|gb|ACAB01000086.1| GENE 11 15515 - 18271 1729 918 aa, chain - ## HITS:1 COG:no KEGG:BF2098 NR:ns ## KEGG: BF2098 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 5 914 4 918 922 717 42.0 0 MKTCRLCGLLLLLLCIHTILYAQQTRTVKGRVQTLEPGSNKRQPLPSASVVVLQKNDSTF IKGITSDKNGRFTLSYQPQKKKQYLLKVSFMGMQSVYHVLGDSALVNVGTLTLKDDDIQI GEVIVTGKLQEVVMEGDTTVINAAAYKTPDGAYLEDLVKRVPGLVYNKKDQSLTYNGQPI SEINVNGESFFSGDKKTALENLPADLISKLKVYDKKSKEEEFTGIGSGEKKYVLDLQTKD ELNKTWLTNATVGYGNNQKKDLEGQVNYFSKDGENLSFIGRSTNRYQNSTYKDNINNSVG MNMTHKFGKKFSLTGSMNYNLNRNGNISSMYQEQYLTAGNQYSASTNEGNSKSRSFNSNI MGSWEVDKRTRIHFNGGFSFSPNRNESNSQNASFDAPPNVNHESLFDDFESISQDIKVNR SENRSRSEGQSNRYNWMMGIMRRLNEKGTTLGLNIQSSDSWGDNESFSLSKTTYFRLKDK QGNDSLLYRNQYLKSPQKNNSWRVGINFTQPIGKKMHIRAAYNWNTHYERDNRDTYELSS LAKSDVFGELPPDYETGYVDSLSNRSHSRTNGHDLNVGFNYSDDTWMVNASLGITPQRRT IERKMGKLYADTTVHTIDFQPMIWLAWKKKEARITFNYDGRTRQPSLSDLMPLTDNSNPL YITRGNPDLKQMFAHSMRISFQHSKKGISANLGGQLEQNSVTQVMIYDAQTGGRETYPVN INGNWNVYGSANWWKRLGHFSLRLDMNGNHSNSVSMINEDRSLEPVKSTTRDTGLNCEAN VSYQPAWGGIDFSTSWNYQYSLNSVNDNNTYTRYYNFRLEGYVDLPLGIQLRTDGAYTFR NGTNIRKGEDDEMLWNASATWRFLKKKEAELSAYWVDILGKRKSYNRMATSDGFYEYRTQ EIKGYFIVTFKYNFRLMM >gi|222159273|gb|ACAB01000086.1| GENE 12 18524 - 19825 802 433 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_1410 NR:ns ## KEGG: Fjoh_1410 # Name: not_defined # Def: PKD domain-containing protein # Organism: F.johnsoniae # Pathway: not_defined # 6 294 10 298 439 66 24.0 2e-09 MRKFLVLATVALGFVFTACDDEDNLNSSVLIGNKSGSNVIVQLDTLYLDARAENLSGTME YLWTVDGKEVSTASTYKFSQPKTGEYVIGLTVSDGNGETLQTNITAKVEGRFGKGTFILN EGNMSNETGTLTFVDSKGTAMDSAYYRVNQTLLGNVCQDLFISDSKIYVLSQNGAKNGGE GLLTIANADNLEKERIYDNATLSWPTNLAVIKEALYIRDNKGVYMLNTSTDALTFVEGTG GALKNRMAVVGEKVFVMGSKKLFVIQNGTVIHTIPFESALSGIAKAYDENLWVSCTNPAS INKVNPLDYTVESHALDVSIGAGWGVAPAFSAKDDIVYFSNAGFNLYRHIFSQNKTEKVA 
NIKDYVEDAGTYYNSLGVDPVSGEVYFATLKGFSEYKINDIAIFDFTKTPALQADIKNKN SFPAGVFFTENFK >gi|222159273|gb|ACAB01000086.1| GENE 13 19856 - 21172 910 438 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_1873 NR:ns ## KEGG: Fjoh_1873 # Name: not_defined # Def: PKD domain-containing protein # Organism: F.johnsoniae # Pathway: not_defined # 41 429 171 518 519 295 44.0 4e-78 MDKKILRLTLVIAVSLFWGTAFTGCSDEEDTPVAYQLKKEDIRVSQPEGGFAVVIDQLLK VQVESESDEGISYVWLLDGTEIAQTKSLEYMFEEVGEYELTLRVSQGESRFDYPFTVTVT FENIEPAPEGATAYVTKVFDFVPAVGQFTNTLPVYKEGDTQEAMNEKVLAAIGNNKKGMI SLGGFGGYVVVGFDHTITNVTGKRDFRVLGNAFYSAANPDSGAPEGGSCEPGVIMVAYDK NQNGRPDDDEWYEIAGSAHEDVTLELWYDKAVAAGNDVKTYRNYEITYYRPEKEPTTAEE REMYIRWEDNQGKSGYKVKNTFHNQCYYPEWIKEDKVTFKGTCLPQNAVDESGQGSYFVL YKFRYGYADNELNSKDESAIDIDWAVNSKGQKVHLPGVDFIKIYTGVNQENGWLGECSTE ISGVEDLHVLGVDIDARK >gi|222159273|gb|ACAB01000086.1| GENE 14 21193 - 22353 721 386 aa, chain - ## HITS:1 COG:no KEGG:Cpin_3942 NR:ns ## KEGG: Cpin_3942 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 23 383 27 383 383 405 56.0 1e-111 MIVNNIFSRILFGILASCCLLACRGELPVLPSDEIEVGKDPLGGVHIKGMYLLNEGNMGS NKCTLDFYDSTTGKYHRNIYAERNPSVVKELGDVGNDLQIYGDKLYAVINCSNFIEVMDV ETAQHLGQIDVVNCRYITFSNGKAYVSSYAGPVGIDPNARPGKVVEVDTTNLAITREVVV GYQPEEMVIKDGKLYVANSGGYRFPDYDRTVSVIDLKTFQVIKTIDVAINLHHMKLDRYG RIYVSSRGDYYGTGSNVFVIDTQTDCVTGSLGIAASEMCLSGDSIYMTSVEWSYVTESNT ITYALYDVKQNKMVSRNFITDGTEAEIAIPYGVTVNPETKEIFVTDAKSYVVPGYLYCFS PEGKKRWKVRAGDIPAHFAFTTKTFQ >gi|222159273|gb|ACAB01000086.1| GENE 15 22350 - 24356 1184 668 aa, chain - ## HITS:1 COG:STM4130 KEGG:ns NR:ns ## COG: STM4130 COG4206 # Protein_GI_number: 16767394 # Func_class: H Coenzyme transport and metabolism # Function: Outer membrane cobalamin receptor protein # Organism: Salmonella typhimurium LT2 # 49 473 45 442 614 60 23.0 9e-09 MFRQIHFRVFWIGCFSSMFLLLSAQSKLDSLHHLHEVVITAKINKEVIPVQSLSGDKLEK LAVHSVADAIRYFSGVQIKDYGGIGGLKTVNIRSMGTNHVGVFYDGIELGNAQNGVIDLG RFSLDNMEAVTLYNGQKSAIFQPAKDFGSAGSIYLQSRTPVFQANKAYHVKAAFKTGSFG 
VVNPSLLWEQKLSENVSASLSTEYMYTTGKYKFTYAVAGGYDTTAVRRNGDVNALRTEGG LYGKIKGGYWRTKAYFYNSERGYPGAVVRNRFSHEDRQWDTNFFLQSSFKKDFGEAYSLL LNTKYAYDYLHYLADPRKEEATMYVNNTYRQQEVYLSMANRVTVLPFWDINLSVDYQWNK LNANLTDFPYPRRNTTLVAAATSLHFDRFKFQASVLGTFVHENVASDTTSAGNKMEFTPT AIASYKPFKNIDFNLRAFYKRIFRMPTLNDLYYTFIGNIKLKPEYTNQYNIGFTYQKLFT GTWLQALNVQLDAYYNEVENKIIATPTNNFFRWTMINLGFVEIKGVDVVLQGGWKLGDNW TFDSRLSYTYQKAQDFTDKLDEDTYGGQISYIPWHSGSAILNTEYKSWELNYSFIYIGER YGISANTPHNYYLPWYTSDVSLAKKFNWKKKDFKLALEVNNILNQQYEVVRAYPMPGTNF KFILNLTI >gi|222159273|gb|ACAB01000086.1| GENE 16 24726 - 24926 98 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294644167|ref|ZP_06721941.1| ## NR: gi|294644167|ref|ZP_06721941.1| hypothetical protein CW1_1690 [Bacteroides ovatus SD CC 2a] # 24 66 1 43 43 74 97.0 2e-12 MVKIGDFTFIQILIRKKISGCSRVAKIGLIISLFLFSLEFSSDLFLFDVSLPQKFWCHSS DATEEW >gi|222159273|gb|ACAB01000086.1| GENE 17 24904 - 26037 1042 377 aa, chain + ## HITS:1 COG:alr4031 KEGG:ns NR:ns ## COG: alr4031 COG0614 # Protein_GI_number: 17231523 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Nostoc sp. PCC 7120 # 55 375 97 422 426 213 33.0 5e-55 MKSPILTIALLCLFFLASCISRNKTSLERFNQDIYTPEYASGFKILGADNAASTLIQVSN PWQGAQDVKMSYFISRNGEQAPAGFNGPVIPAGARNIVCMSSSYIAMLDALGELKRIVGV SGINYVSNPYILAHKDSIKDMGAEMNYELLLGLKPDVVLLYGIGDAQTAVTDKLKELAIP YIYMGEYLEESPLGKAEWMVVLSELIDNREKGIEVFREIPKRYHALKALTDSISQHPTVM FNTPWNDSWVMPSTQSYMAQLIADAGADYIYKENNSNSSTPIGLETAYGLIQKADYWINV GSVTTLDELKAVNPKFADAKAVREKTVYNNNLRLTSTGGNDYWESAVVRPDVVLRDLIHI FHPELVSDSPYYYRHLE >gi|222159273|gb|ACAB01000086.1| GENE 18 26112 - 27050 695 312 aa, chain + ## HITS:1 COG:alr4032 KEGG:ns NR:ns ## COG: alr4032 COG0609 # Protein_GI_number: 17231524 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Nostoc sp. 
PCC 7120 # 1 304 40 351 362 217 42.0 2e-56 MATGDTFIPISKIWTVLTGGECDETTRNIILSIRFIRVIVAALIGIALSVSGLQMQTIFQ NPLADPYLLGVSSGAGLGVALFILGAPLLGWTDFPFLQSVGIVGSGWIGTSAILLGVAII SRKVKNILGVLIMGVMIGYVAGAIIQILQYLSSAEQLKMFTLWSLGSLSHITTGQLTIML PVICIGLLLSIACIKSLNLLLLGENYARTMGMNIKRSRTLIFISTALLTGTVTAFCGPVG FIGLAIPHITRILFDNANHRILMPGTMLTGLIGMLICDIIAKKFLLPVNCITALLGVPVI LWVIGKNLRIFK >gi|222159273|gb|ACAB01000086.1| GENE 19 27047 - 27802 217 251 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 211 1 212 245 88 27 1e-16 MIQLKDLTLGYEQRILLEKVSTHITGGQLVALLGRNGTGKSTLLRAIMGLETPKNGEIIL HGKNIASLKPEKLARKISFVTTDKVRIANLRCKDVVALGRAPYTNWIGQLQAEDEKRVVT AMHLVGMSDYAEKTMDKMSDGECQRIMIARALAQDTPIILLDEPTAFLDLPNRYELCLLL KKLAQTEGKCILFSTHDLDIALSLCDTIMLIDNPQLYSLPTNEMITSGHIERLFRNESVT FDAQAMKIIIK >gi|222159273|gb|ACAB01000086.1| GENE 20 27906 - 29075 927 389 aa, chain - ## HITS:1 COG:NMB0535 KEGG:ns NR:ns ## COG: NMB0535 COG0738 # Protein_GI_number: 15676441 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Neisseria meningitidis MC58 # 2 381 23 418 426 75 25.0 2e-13 MKTQNSIYAALPVLFGFFVMGFCDIVGISSDYVQRTFNWSPVMTGFVPSLVFIWFLFLGI PIGNQMNKWGRKNTVLLSMGITVVGMLLPLVVYNSATCMIAYALLGIGNAILQVSLNPLL SNVVTSQRLLTSSLTAGQVIKAVSSLVGPEIVLLAVAHFGDDKWYYCFPILGFITLLSAV WLMATPVKREDSSAATQQLSISDTFSLLKDKTILLLFLGIFFIVGVDVATNFISSKLMAE RFDWTTEQVKFAPQVYFLSRTVGALLGAFLLARIAEIKYFRVNIVACIFSLLILAFVKND MVNLICIGAVGFFASSVFSIIYSMALQARPEKANQISGLMITAVAGGGVVTPVIGFAIGT VGVIGGVFVTLACVFYLTYCAFGVKTAKA >gi|222159273|gb|ACAB01000086.1| GENE 21 29155 - 31527 2227 790 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 20 781 40 770 790 286 29.0 1e-76 MKLKWIGSLLILAGVVGCKTPDRPNLVEYVNPNIGTVHSRWFFYTPAAEPFGLAKLGAST 
NGTYGNNQGWEAVGYEDGHTSIDGFPCLHEFQIGGIALMPVTGEVKTNPGKLEDPDKGFR SRFDKKDEMARPGYYSVLLKDYQVKAELTATARVGFQRYTFLESENAHILFNIGNRQGES GAVRDAYIKQIDGNTIEGYVITEPEYVKKYQAGASVAMYFYAKLDRAPESVEVFYQDSTL TARNEIKGPGAIMCLNYKTKKDEVVNVKIGLSYTSIENAKVNLESEAKDLTFDEAMKATT DKWNESLSRILVSGGTEDSKIKFYTGLYHALLGRGLASDVNGAYPKNDGTIGQIPLNKDG KPEHNHYNTDAVWGAYWNLTSLWALAYPEYYNDFVNSQLLVYKDAGWLGDGIATSKYVSG VGTNMVSITLAGAYNSGIRNFDVETAYQAALKNELGWEGRIEGAGKMDVKQFVEKGYVPY ENSVHFGTHPEGSSFSVSHTLEYSFSAYAVAQWAKALGRTEDYKRLMELSAGWEKLFDDS LKMIRPRVPSGEFIDNFNPLESWRGFQEGNAMQYTFFVPQNPARLIEKVGKDEFNNRLDS IFTEARKSIFGGGKVVNAFSGLQSPYNHGNQPSLHISWLFNFSGKPYLTQKWTRLICDEF YGTNGEHGYGYGQDEDQGQLGAWYVMAAMGLFDVQGGSCERPTFQIGSPLFDKIEIKLSP MNATGKTFVIETTGNTPDAYYVQSATLNGKPLEQCWMYRDELYKGGTLKLTMGNQPNEKW GVENPPHCSE >gi|222159273|gb|ACAB01000086.1| GENE 22 31527 - 32618 1023 363 aa, chain - ## HITS:1 COG:no KEGG:BF0331 NR:ns ## KEGG: BF0331 # Name: not_defined # Def: putative hydrolase # Organism: B.fragilis # Pathway: not_defined # 1 350 33 386 388 248 40.0 3e-64 MFQRVWELYRVPKHGLFSEYYPSSHRPDLTYFNDSTRQAQEVSYLWPMSGVFSSAVVMAA IEPEKYTVYVDSMVMAMERYYDTTRVPFGYQAYPVQFGKVDRYYDDNGLVGIDYIDSYLV TKNPHYLEKAKQVLTFILSGWDENFEGAVSWLEGVKDQKPACSNGKAMVLALKLYEATKD EYYLEVGKKFYHWIDKYLKDPERGVVWNSWLTTTSAVCPDLYTYNTGTLLQAAVALYNYT GEQAYLDNAKFLAEGSYKVFFKYTEDGIPYIADLPWFNLVLFRGYHDLYNVTGDSKYVDT MIKGLDYAWEHARDQAGLMYHDWTGRTDEKRKPKWLLDASCVPEYYARVAMIKGEVTNRK INK >gi|222159273|gb|ACAB01000086.1| GENE 23 32730 - 33956 1152 408 aa, chain - ## HITS:1 COG:no KEGG:ZPR_4380 NR:ns ## KEGG: ZPR_4380 # Name: not_defined # Def: glycosyl hydrolase family 76 # Organism: Z.profunda # Pathway: not_defined # 36 403 32 391 391 234 36.0 5e-60 MKKIKTILAILPALLFSCAGDDVEKYIPPTPIAPSEPGEEVVYHKRAKEQFDLINQCYRN NSGATEGLYNENYPKKDGDNSASFLWPYDGLVSGAAALHALGYDVNYADMVDRFEVYYRT PSGTVGGYGSQTNGTTGSGTRFYDDNSIVGIELVEAFNLLNNQDYVTKAKRIVEFLQAGE DDTFGGGLWWNEDQKGQQGVGDSNKPTCANGYATLFLLEYYSVCPQEEKADVLALAKRLY AWTLTNLRDPEDGCYWNDKQADGSINKTKWTYNTGVMISNGVRLYKITGEQTYLDSAIAS SEGAYNYFVRPLNGLALAYPDHDPWFTTKLIRAFIDIEPYYKNAGNYIKTFINFLDYAYE 
NARLSNGLFYEDWTGATPKRAEQLLMQDAALESLGMIALYKGETVTEE >gi|222159273|gb|ACAB01000086.1| GENE 24 33969 - 35108 842 379 aa, chain - ## HITS:1 COG:no KEGG:ZPR_4381 NR:ns ## KEGG: ZPR_4381 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 1 360 5 395 412 159 31.0 1e-37 MKKILWICSVLLAMVSCQEDYELNTDFAVPTELSSPASIQLNVSSPTLVVLSWSGGGAAD GGIVLYEVLFDKADGDFSKPLATVKSDLGAATSLSITHAAINTIARNAGIYPLETGDIKW TVNASKGGVVKRTDKVATITVTRGEGIDNIPAELYLYGSATENSGQGGIPFRCVEEGIFQ IYTKLSAGNISFKSATTGETFSYYIDDSSKLREGEGETAVAASEEVTRLTVNFNTMAMTT EKIGSSVRCIWGATFGDIAVLEYAGDGKFVGEGDIRFLDPSKPETGAPDWLSWIEERYYF LAKVNGGDMCWGRGDDVSAERPVGGEPASFYALYEFQWSQWDHLWKMKGSLDYTHATITI DTNSDGLMIHTFTNVTPIN >gi|222159273|gb|ACAB01000086.1| GENE 25 35139 - 36743 1425 534 aa, chain - ## HITS:1 COG:no KEGG:ZPR_4382 NR:ns ## KEGG: ZPR_4382 # Name: not_defined # Def: RagB/SusD family protein # Organism: Z.profunda # Pathway: not_defined # 23 534 8 511 511 452 46.0 1e-125 MKTTIKNLILSAVLVVGGASCVDLDTAPYDRETDLTYWEEDPEAAVKALNTCYTYLGNMD EQLYCEAMTDNAYTKQPNDATQNIGNGSYSTADPYVKKVWDGRYTGIRMCNELLENIDRV PDLDPELKKRYIGEAKVLRAYNYYELYTKFGDVPYTTKVLSIKESMSIARTAKATVIANV LADLDEVINGNYLPTSYDADNKGRITRWAAMAIKAKIYLFEGNWTQVKNITSTIMTEGGF KLFGSYAGLFEIANEYNSEVILDAQYRPTSREHQMMYVFLPPTLGGYSQLSPLQELVDSY IMLDGKTIKETGTSFDESHPYANRDPRLKATVMYTGNSYTLADGTEVVINCEKGEGKDGY GVGSDCSATGYYIKKYWDNTYRATLYSGLNPILIRYADILLMNAEALAELGELDKTAWDA TIKPIRDRAGFTLASALEFPEGASKDKLIEIVRNERRSELALEGHRHKDIIRWRIADNVL NGWCHGLKTNDVVGTDDGYVRVENRTFNANKHYLWPIPQAERDLNGNLEQNPNW >gi|222159273|gb|ACAB01000086.1| GENE 26 36755 - 39802 2821 1015 aa, chain - ## HITS:1 COG:no KEGG:ZPR_4383 NR:ns ## KEGG: ZPR_4383 # Name: not_defined # Def: TonB-dependent receptor Plug domain protein # Organism: Z.profunda # Pathway: not_defined # 46 1015 22 1001 1001 927 50.0 0 MKQSNLLNAQFARRLFRTAFSSWQLLAVMLIVCMNVAVGSKLYAQSNTIAVKGKVMADGE PVIGATVLVKGVSTGTATDMDGNFSLNVASKAVLVVSSIGYETQEVPVNGRQFINVVLKS DVVTLKDVVVVGYGVQKKVNLTGAVSSVSTDELEGKPISNVLEAMQGTTPGLVIQQGSST 
PGSVPSINIRGLNTMNNNDPLVIIDGIEGSLGNLNPADIEQISILKDASSTAIYGSRASN GVVLVTTKKGKAGKVEISYDFMYGVQQPTSLPKIADSWVYAELYNEAAVNSGRAAKFTPE QIAQFRNGGPNVNWVKELYHRNSPQSSHNVSMTGGNDQLSYMASLGYMDQNSMFKGPDYG YKRYNARLNVSHKVTNNFTLNLTSQFARNDIKEHAYWTEWIIEQANRMPPIYPIKNEDGS YNYPAGSNSNGLQRLEEGGYRQNVNDELLGTIQAEWEVYKGLKLIGSAGGRVWNNNLHEN RKAFEGTGDSENKLTEQFYRSKNITTNLMVTYNTKIGKHSIGGLLGYAYEGFSEKQFSTS RLTEDSKYDIFVGDLSGDKVSNGGSASDWAIYSGFARATYNYDEKYLLEFNIRNDYSSYF AKGNRSGVFPSFSAGWRISEEKFWSVLKPYVPSLKIRGSWGLVGNNRIGAYQYMQTVSVK NGISFGDKLAQTAEFASTNPDLKWETTRMANIGFELGLLNNDLNITFDCFNNRTKDILVN LPVPGLFGNGAPIQNAGKVETRGWELSVSYRLKTGPVVHNFAGNISDSFNEVIDTRGTEI IGGSDVQTIIKEGYPLYSYYAYRSDGFFQNEEECQKGPHLEGITPKPGDIRYLDKNGDGV IKPDDDRFIVGNDFPRYTFGFTYGLEYKGFDFSMMWQGVGKRNKWMRGESVEAFHNNNEG PVMDFHQDRWTPNNPDATYPRLTMGAESANNAAKSDFWIQDAKYLRLKNAQIGYTFPQQW MKKLYVKNLRIFASVQNPLTFTKMKGGWDPEYTGDGSGRSYPVARVYSFGLNVKF >gi|222159273|gb|ACAB01000086.1| GENE 27 39842 - 40942 918 366 aa, chain - ## HITS:1 COG:slr0329 KEGG:ns NR:ns ## COG: slr0329 COG1940 # Protein_GI_number: 16331233 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Synechocystis # 7 286 30 288 327 79 24.0 8e-15 MYEHDERIVMTLDAGGTNFVFSAIQGCREIVEPICLPAASDDLERCLSVLVEGFLEVEKR LPKLPVAISFAFPGPADYEHGIIGDLPNFPAFRGGVALGPYLREQFGIPVFINNDGNLFA YGEALAGTLPEVNKRLKEAGSSKVYKNLLGITLGTGFGAGVVIDSRLLTGDNGCGGDVWI MRNKKYPEMIAEESVSIRAVRRVYQELTGKDASSLTPKDIYDIAEGTAEGDQQAAVRSFY ELGEMAGDAIIRALNIVDGLVVIGGGVAGAAKYILPGIMNEMNRQIGTFAGASFPCLQME VFNLSEKEAFDKFLEEKDKMVKIPFSEREVHYACHKKIGIAVSTLGASRAVALGAYSFAL SQLNIL >gi|222159273|gb|ACAB01000086.1| GENE 28 40976 - 42706 1585 576 aa, chain - ## HITS:1 COG:SA1945 KEGG:ns NR:ns ## COG: SA1945 COG1482 # Protein_GI_number: 15927717 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Staphylococcus aureus N315 # 227 459 2 217 312 83 26.0 1e-15 MRKANYDKFPSTKLTGMLVQGWDSIISMLKEKMDARKVLAVDLYTGVYEEEVLDAFSKEF 
SGRVMNVRDLMKPEKEIQALTERFMTEDVLFGYVTNLKLEDYLDADKVAAARKQISEAKE TIVIIGTGAAVVAPQDAMVVYADMARWEIQQRFRRHEVKALGIDNRNDAVSLQYKRGYFN DWRVCDRYKERLFDRVEFWIDTHVAGTPKMIDKDTFFKGVEATVNTPFRVVPFFDPAPWG GQWMKEVCDLDRERENFGWCFDCVPEENSLYFEVNGVRFELPSVDLVLLKSKELLGEPVE ARFGKDFPIRFDFLDTMGGGNLSLQVHPTTQFIRDSFGMYYTQDESYYMVDAEEDAVVYL GVKTGVDKEAMIGDLRKAQKGELVFDAEKYVNKIPTKKHDHFLIPGGTVHCSGANSMVLE ISSTPNLFTFKLWDWQRLGLDGKPRPINVERGKCVINWNRDTEYVNEHLRNQFKEVASGE GWVEERTGLHPNEFIETRRHRFSSPVLHHTNDSVNVLNLLEGEEAVVESPTHAFEPFVVH YAETFIIPAGVKEYTIAPYGKSAGKECVTIKAYVRF >gi|222159273|gb|ACAB01000086.1| GENE 29 42849 - 43577 538 242 aa, chain + ## HITS:1 COG:CAP0006 KEGG:ns NR:ns ## COG: CAP0006 COG2188 # Protein_GI_number: 15004711 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 1 239 1 235 237 101 27.0 2e-21 MKLDHNSDKPLHIQAEEILRRLIESEEYKNGKLFPNEVELSEQLHISRNTLRQAINKLVF EGLLVRKKGYGTKVVKKGIVGGVKNWLSFSQEMKMLGIEIRNFELHISLKRATEEIGTFF DLGSSPDTRCVVMERVRGKKEYPFVYFISYFNPNIPLTGEEDYTRPLYEMLETQYNIVVK TSKEEISARLAGEFIAEKLEIKSNDPILIRKRFVYDVNGVPIEYNVGYYRADSFTYTIEA ER >gi|222159273|gb|ACAB01000086.1| GENE 30 43599 - 44864 723 421 aa, chain - ## HITS:1 COG:CC2770 KEGG:ns NR:ns ## COG: CC2770 COG3550 # Protein_GI_number: 16127002 # Func_class: R General function prediction only # Function: Uncharacterized protein related to capsule biosynthesis enzymes # Organism: Caulobacter vibrioides # 71 404 68 420 435 95 27.0 2e-19 MKKVWVYADFDWLKDIELIGELSCELLRGSETYGFQFSNDWLKKYGELFLSEDLNNYPGS QYTRQGRDIFGCFSDALPDRWGRVLLNRREQLMAAEEKRPVNRLNSFDYLVGIDDFSRMG GFRFKENPEGEFINTSNKLRIPPLTAIRDLMYASQEIEKSEESNLLPDKKWLIQLIQPGT SLGGARPKASVTDERGILYIAKFPSRKDDYDVGLWEHFCHLLAAKAGIRVASTRVLVTEN KYHTLLSERFDRRNDGKRIHFASAMTLTGLTDGANAATGNGYLDIVDFIIQGCTNVEANL QELYRRVAFNICVGNTDDHFRNHGFLLTPKGWTLSPAYDMNPTLNNQQSLLISESSCESD LMILIDACENYMLSRDTAEKIISEVVDAMKDWRRIAIRLNISKREIDLFSQRFDTYSSFG C >gi|222159273|gb|ACAB01000086.1| GENE 31 44861 - 45190 343 109 aa, chain - ## HITS:1 COG:no KEGG:BF1147 NR:ns ## KEGG: BF1147 # Name: 
not_defined # Def: putative transcriptional regulator # Organism: B.fragilis # Pathway: not_defined # 1 105 1 105 106 149 88.0 4e-35 MTKSTMGTKLPRKLVQKMQIVGEQIKLARLRRNLSIAQVAERATCSELTVSRVEKGLPTV SIGIYLRVLYALQLDDDILLLAKEDSLGKALQDLNLKQRERASKKRDSL >gi|222159273|gb|ACAB01000086.1| GENE 32 45447 - 45599 171 50 aa, chain - ## HITS:1 COG:no KEGG:BT_2116 NR:ns ## KEGG: BT_2116 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 44 1 44 183 78 88.0 9e-14 MNRQLIEETYEMRKLNISEKIKSFDCGDTDLNDFILNESFLYREEVGKVC >gi|222159273|gb|ACAB01000086.1| GENE 33 45603 - 45755 96 50 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MARPIKETPVLFGEDARRFEERMKEKRSETPEQREKRLKDYELAMKIFKK >gi|222159273|gb|ACAB01000086.1| GENE 34 46042 - 46641 454 199 aa, chain + ## HITS:1 COG:no KEGG:BT_2534 NR:ns ## KEGG: BT_2534 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 198 1 199 202 232 57.0 7e-60 MSALYDLFENPPSPDGEKQPLHARIVSKGTVDKEEFLDRVHKFTGISRSLLAGAMEAFAN EARDLLADGWNVEMGNFGFFSTSLQCPPVKDKKEIRAASVQMKNINFRASRPFKKEVGDK MRLQRGESITRPKKNSISRETCRDRLNTYLENHLFISRTDYSHLTGRNKKVAIEELNSFI TDGIIGKEGVGKLTVYIKV >gi|222159273|gb|ACAB01000086.1| GENE 35 46679 - 46882 295 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160885246|ref|ZP_02066249.1| ## NR: gi|160885246|ref|ZP_02066249.1| hypothetical protein BACOVA_03245 [Bacteroides ovatus ATCC 8483] # 1 67 1 67 67 109 100.0 7e-23 MIDVLVWNKYTRVVMQLAERLNISPEKALHLFYNSKVYALLLNKQYPLITLSDAYITDEI ILELQQQ >gi|222159273|gb|ACAB01000086.1| GENE 36 46892 - 47383 491 163 aa, chain - ## HITS:1 COG:no KEGG:BF3041 NR:ns ## KEGG: BF3041 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 156 2 155 171 160 48.0 2e-38 MKVYHGSTQAVEYPLVGVGRENLDFGKGFYVTDIYAQAERWGAVMALRHPGSTSVVNIYE LDVEKINSSGYKWLHFDGYNNDWLEFIVSSRLGKQPWLEYDIVEGGIANDRVFDTIENFM ENQITKEVALGRLRFEQPNNQLCILNQRLIDECLSFKEYVTIK >gi|222159273|gb|ACAB01000086.1| GENE 37 47380 - 47646 271 88 aa, 
chain - ## HITS:1 COG:no KEGG:no NR:gi|237714767|ref|ZP_04545248.1| ## NR: gi|237714767|ref|ZP_04545248.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 88 1 88 88 168 100.0 1e-40 MLHSSQSAPVVDGKVMEFVIFAIESAAQKLGIPAPTLYNRLEKLNLIRQYLISGYDMLHT QSREYIADTLVEALENWEAYYKEKGEFV >gi|222159273|gb|ACAB01000086.1| GENE 38 47977 - 48456 466 159 aa, chain + ## HITS:1 COG:no KEGG:BT_2538 NR:ns ## KEGG: BT_2538 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 157 1 158 160 202 70.0 4e-51 MALRYVIKKRTFGFDKTKAEKYVAQNVITNTVDFRDLCEEITKVGMVPSGAVKFVLDALI DTLNLNLRKGISVQLGDFGCFRPGMNCESQDTEKEVDSDTIRRVKIIFTPGYKFKEMLSK VSVQKAVASDDGSISPEQPDPNPNPNPDDGKGEAPDPAA >gi|222159273|gb|ACAB01000086.1| GENE 39 48599 - 48844 231 81 aa, chain + ## HITS:1 COG:no KEGG:BT_1170 NR:ns ## KEGG: BT_1170 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 81 8 85 85 62 39.0 5e-09 MKELPSEEEEIPFRIKTYKKKELACMYNPNITPRCAIRIFTKWIKINKELLQLLTATGYH PRTRTFTPRQVQIITSILDIP >gi|222159273|gb|ACAB01000086.1| GENE 40 49109 - 53110 2697 1333 aa, chain + ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 802 1075 1 271 294 161 35.0 9e-39 MYKKYIPFLLLLISTLSFGQNIHFQAIGVENGISQPTVTSIYQDEFGIIWIGTKDGLNRY NGTDFHIFRPIENDKNSLYNNNIGTICGDKNGHIYIRCKYAVVEYDIRKNIFHTIRNNNI QAIDYGNSRLWVCTRDSLFTYNRKEDKLEYYYHLDSVRISCITEDHEGNLYVGTMNKGLY IIDSNKKWLNYLPEKDITCIYEDSKKNIWVGTKDDGLFRLDRNGVQINYKDGPYKNRLSS NYIRCIVEDNQGNYWIGTFKGLDKLDTTNNITHYSEDNKPYSLSNSSIICMMKDQQGTFW IGTYYGGINLFNPDYEIYTYYYPDESQKGKLTSPFAGRMKEDSKGGIWIATEGGGVNYLD RKTKSFIEYKHSNQSNSLASNTIQALYLDEKNQILWIGTLKGGLDKLNLKTQKFTNYRHV PSIKNTLINDIIRKIIPYKGNLLLATHNGIGLFNPETGECTKLLTDSKLNNRQIIDMHLD KHNNLWFSYYLGLVKYNIATHKRDEYFVPNTSEKVIGSNLINVLFEDKKGNIWAGSSGDG IFLYQPETNTFKSFNSQNSDLINDYILDINESLSGYLLIASNQGFSRFDMENERFYNYNK QNGFPMTALNPYGLFVAHDNEIFLSGPKMMISFFEKELNSYVKPYQLNFTSLEVNNKLIL PNDGSGILSESILYQPQITLNHNHSIITVNFSLSNYVSVLRNRIYYKLEGFDKEWMSAGY RKGITYTNLNPGKYKLIIKSSEEYSGRESICKEIDIVIKPPFYKSTWAYCLYGIIIIISI YIIVSFYSSKLKLRASLEYEKKEKKQIEELNQSKLRFFTNISHEFRTPLTLIVSQLEMLM ERNDIQPLVYGKLVGIHRNTLRMKRLITELLDFRKQEQGFEKFKYSKQDIYSFLDEIYLS FKEYARGKQIIFEYFNKDRSLDVWFDVVQLEKVIYNLLSNAFKYTPLGGTVSLSVQEYEN SVMILISDTGIGIAEENLNKIFDRFYQVDSLDNQKGTGIGLALAKSIIEAHKGKIGARSR EGKGTTFVVELPLGDSHISVSQKVETPDIDSYCISLLKMDDEKITEEIPEDENSDRTEEP SSKILIVEDNEELRELLVRLFSKVYSVYEAQDGEEGFEKTKEVQPDIVLSDIMMPKMSGI EMCRMIKSNFETSHIPVILLTAQTAEEFTIQGLKMGADDYITKPFNVKHLFMRCNNLVNG RKLLQKKYAKQMDNNVDILATNGADQQFMEQCVICIEQNIDNPNFDVNMFAQALNIGRTK LFLKLKGITGQTPNDFILNVRLKKAQMLLIQSDTKTISEIAYEVGFNSPSYFIKRFRELF GITPAQFQKGITE >gi|222159273|gb|ACAB01000086.1| GENE 41 53144 - 56011 1742 955 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 77 242 273 444 1087 67 29.0 1e-10 MRTKLLGAFCLLFPLLSVAADIQRITPITSDNSVKIEVTLSAEAGEHLLLDATIVGSQNK EKLCNFSKEYLFNHKTDTTVILATDQLTPKLWSPVSPVLYDLTLKAGKQTFQKRIGFRKF EMHNGVFYLNDKPIYLRGNAINPPERGIPEQLEQSKDFARDYVRYMKSLHINIIRIPDNQ NWMDVCDEEGMMIFAGRYGRPKHATKTAPPTDFELSLKTYKEIDLGPFTSHPSVVIYILS 
NEMPYEGKVGDLYRDFLTRMYQELKKWDSTRLYICNAGYGLGKSADIYDVHRYWGWYYNS FLTYLNMRDKAMWQNPGKVQPITFTECVGNYTGIDGRFNLCSRTKQPGSQKCWTGHLPDA KQAEAAMAYQAFVLKNATELFRRLRSQNDCLAGTMPFTIVFHHWDGVSSFAEMKPKPVAR QYQLSYQPILLSWENWQSQVYAGKKLSVVAHVVNDDDYGNGLSNARLHWWIEHEGKKVIS GENEFPFVPYYGTDKLPLTINIPQNLPTGDYLLKGEIYSKDKKVSYNESELFIAGKDWNN PADTETTVFVYDTTPEQQTLNCLQRKGYSVKTASSLTKLPMHSTFVIGKDSWDDNLDRQT EELKAYVNKGGRIICLEQNQTTFNSSWLPVKVKFLEHSNNDPVYLSPSLAYKDGMNINLE RPYHPIFSGLNPRMFRLWADYTSYDESKNGFPAIYPVNTGYELQSSSIEDVATLANYSRA LAGTALSELFMGKGSILLSGFDLIDHCEVDPVADKLLSNMILYMAVNKKHEQYVAINDSI IWGDYASERGIVNAPCNGLMVNTIPIIPKGQEELPKYKVQIDEYGYQYAGSYGGWNSKPG VQYVTYGRRPMAPFTFSRGGSPIVDTSSTVGEGYFYLALPRKAKTMVTVLENPVDEPLNI SLSVTDGTWEKYIIHPKQQLIIRTNISHLKNKMKVTLKGDRRTILLKTIIKKKQK >gi|222159273|gb|ACAB01000086.1| GENE 42 56107 - 56709 379 200 aa, chain + ## HITS:1 COG:no KEGG:Slin_1080 NR:ns ## KEGG: Slin_1080 # Name: not_defined # Def: protein of unknown function DUF303 acetylesterase putative # Organism: S.linguale # Pathway: not_defined # 9 199 10 201 652 162 44.0 7e-39 MKRILFSSFLSLITILSFATDKFTVADVFTDHMVLQRNAIIKIWGEAQNGSLVEVRFAGQ LRKVKAIQGKWQVTLKTGEAGGPYKLDIINGNNKVSFQDVLIGDVWLAGGQSNMEFALRR VKDAQKEISSADYPQIRYYKVPRKFYPEHEVSKASWRVCSPQTAPEFSAIAYYFSRNIHK ELNIPIGIIQTPVGGTTVEA >gi|222159273|gb|ACAB01000086.1| GENE 43 56728 - 57678 469 316 aa, chain + ## HITS:1 COG:no KEGG:Cpin_1583 NR:ns ## KEGG: Cpin_1583 # Name: not_defined # Def: sialate O-acetylesterase (EC:3.1.1.53) # Organism: C.pinensis # Pathway: not_defined # 44 300 225 482 486 199 43.0 1e-49 MSDKDFRPIVQHYDSIVNSYGSDGYEKLYNRYVSSLAEYNQLNAEQKKYIDKPVEPMGRK NFHRPIGLSETMLNTVIPYTLKGFLFYQGESNTARGAQYRKLFPAMINEWRTAWGQGDIP FLFIQLPRFETKTRYWYELREAQYLTSHHVKNTAMVVAFDQGNPKDIHPIVKDTVGWRLS QLALGKVYGKKVVCQGPEFKKMTKTADGSLLLDFTNAGTGLVSKDNAATLSGFAVAGKDG KFYPAEAIIVGKNQVKVKNNLVTNPVDVRYLWVNSGDMNLFNKEGFPAFPFRTDKYRLAT ESVYVNPEPKICRSSL >gi|222159273|gb|ACAB01000086.1| GENE 44 57728 - 60748 1666 1006 aa, chain + ## HITS:1 COG:no KEGG:Phep_1293 NR:ns ## KEGG: Phep_1293 # Name: not_defined # Def: 
coagulation factor 5/8 type domain protein # Organism: P.heparinus # Pathway: not_defined # 21 1006 18 1004 1009 1022 51.0 0 MYLRKCLVLVYGAMVIPFVGISADNVNVALHKPVIASSQQKEFPASNVTDGVISRTSSWT SAKGARTPHILDLKLQKYYDIDRIVIYTGIPEPEKTEQEKGQAPGFWAMKNFKIQYWDDA NWTDLPNTECTENRLDKIEFTFTPQLTTFQLRLISTDGEPVTINEFEVYGKEKKNMIVPV IQNELPDRVHKEFPKDMLVTVNKDVIGNTMKYVAYNQGYYMPGSNISGWLEYSNVNSLRV WTSMNDYVPEEAVNYQKNLNTLEEFEAYKHELRSSPEKNNFIKWKLILERCRKQQFSTNS MVFEYALLELKRLNIDVVLQMNSTDFDHTWSNKWKQWQRFYALAFYAAKTGDVTMFAMQN EPNHRHSGPMKITQYVDAMKIVSDAIYCAIQDVNKLYGKHLKSRFVSPVTAGSNANWWAE VVKSLRIDYRGFPSDRDLLDIFSTHSYNLPAAGYVSKVSDIRKIIVDNHPLKRSLPIVYT ETGRWMNAYLIDKYETMDSPSLFTEWAGEYTNNTLNQGHGMWAFKFANTTSNTYPRGIKS GHHFIWQGKRIVEDAYNNVALGKKVIDLTSSHPAQVKVITDGNKTDASMWTSVNTDEKKC LEIDLGKSYSLGAAVVYTGSEYGVYTGPDRVKNFRLQYWEGTGWKDIEETIENNARYTQS FFLFKTPVTSSRIRFIATDKGSIKVREIKLFDEESVKNIPESYDISGIQRTGQVVRLFAK GFKDERPLLETVKSVPDNEVDALTSYNPEEMRYYIWLVQRKLSPNTLTLNLKSLNLPTGT KVFAEEVSADAYGEVVWSKEISENGQLSFELPAQSVMLLTIPACQSTPYTLTATADATIK SGSNSEKNFGKIKKMSVEMNASQINHNQVSYIKFDLSKVDNINAALLQIYGNSSDKYSYR FHVYALDNCDWNEMSLNWNNAPNLDRKQMRITQVGNTAHVAGEIIVDKNASYHQLDVTKL IRKCKQKEITFVLIRELRQLGDDSDNGKSCYLDTKESNHKPILSIW >gi|222159273|gb|ACAB01000086.1| GENE 45 60863 - 63922 2156 1019 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_2077 NR:ns ## KEGG: Fjoh_2077 # Name: not_defined # Def: TonB-dependent receptor # Organism: F.johnsoniae # Pathway: not_defined # 1 1019 12 1008 1008 639 38.0 0 MKNNQLKLILPAIFLMIMVGIHAQNVRVTGVVSDAQSPLIGVNVHVKGGTTGVITDMNGK YSIEVPSNATLVFSYIGYADQEHKVGNRKIIDVTMSEDSKLLEEIVVVGYGYQKKSDIAT SVASVKTDEMKSFPAGNVGDMLRGRVAGVNVTSSSGRPGSAPTITIRGNRSISAQNTPLY VIDGSVSSSEEFSTLSAESIESIEILKDAASQAIYGARASDGVILVTTKRGMQGKMEVNY NGYVGIQSLWRNFDFYSPQEYMMLRREAMANDKGIIDAREISVAEALSDEIMSEVWANGE FVNWEDLMLKNALYQNHDLTVRGGTDKLRVSAGLNYFDQDGMVTTGSGYKKAAFRLNVDY KVNKWASFGVNTSYALSKSEREDGNFTEFITRTPLAKVYNADGSYTKYIDTANDVNPLYR AQNYAREITNNSYRVNIFLELKPFKGFNYRLNTSFYNRQQEDGEVKGVNYPGGGATAKLT NNEQRNYLIENIFTYEVPIKKQQHKLTLTAVQSVDHSQTKGLGYATSDLPVDMDWDFISN 
GQFSGTPTRTFSENNLVSFMGRASYIFMDRYIMNVAIRRDGSSRFGKNNKWGTFPSVALA WRANEESFLRNVSWIDNLKLRLSYGIVGNQNGIGNYTTLGLTDQERYEFGDNSYMGYLPG KELSNPNLKWEQSRTANIGLDFGFFNNRLSGTIEYYNTRTTDLIVKREINSVLGYEQMLD NLGETKSHGIDISINGDVFRTKNFTWSLGANFSQYANEIVKIDNQVDENGKPLSQPGNSW FVGKPIHVYYDYLPDGIYQYEDFDIKRNAYGKLEYTLKPTIDTDGDGIADKALTREDNVA PGSVKIKDVNGDGKINADDRTPISRDPDFTLSLNTTLKWKGFDFYMDWYGVSGRKIRNGY LSESNSGGSLQGKLNGVKVNYWTPFNPSNEFPRPSHNTNVTYHGSLAIQDASYIRLRTLQ LGYTFPTTWIKKLQLQKLRVYATATNLLTFTDFLSYSPELTPGAYPESKQYVFGINVSF >gi|222159273|gb|ACAB01000086.1| GENE 46 63939 - 65552 1336 537 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_2078 NR:ns ## KEGG: Fjoh_2078 # Name: not_defined # Def: RagB/SusD domain-containing protein # Organism: F.johnsoniae # Pathway: not_defined # 1 526 1 520 530 195 32.0 3e-48 MKNKLYIAALAVYSITLLSSCEDFLNLKSKTDITTDYLTTSPDGLYRAAIGLYSLDRELA KGDGSGGSNLYIVTMCDYSTDIMAFRAGTSTAIAKLQNFMPNNSDVESFWQHYYFIIGKA NEIITGAEQLGFDDPVTTRAWGEAKFFRGRAYFELWKRFERLYLNTIPTTVDNLERDFHP VSTEQIFTQIKTDLDDAVNALEWGIPNNDYGRVTKATAKHVRAQVAMWEKDYDKAIEECE DIFRDGTMYSMMSKTGDVFNSADMRSAEVLWSYQFSENMGGGGTGTPLMGHRAAIITTTR SQANPDCTFEAAQGGYGWGRVYPNTYLFSLYDKEKDNRYKDLFIHTFYYNDQSKPNYGQE IPKDLYGKAAGYMERLHPMSKKHFDQWTNATQPDRTTSFKDLIVYRLAETYLMCSEAYFH RDGGDSPKAIEYYNKTWERAGNTHENGPLTLNMLLDEYARELHFEGVRWSLLKRLGILGE RVRQHGGDLMLEDPYLDKDYAECRQNFVLGKHETWPIPQTQIDLMGTNVFPQSDPWK >gi|222159273|gb|ACAB01000086.1| GENE 47 65597 - 67423 1361 608 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714776|ref|ZP_04545257.1| ## NR: gi|237714776|ref|ZP_04545257.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 608 1 608 608 1136 100.0 0 MKRLTLISLILSMILTSCKEYGEVRIMPEFNNSGTEVELYKNEGSSKTVVISTTANEVTA DYNASWLSVDANKQRIIYTALTTNETGEVRSTTVKLNAGEFSMEVTVNQLAKDESEVKTL KVGQLTEDGLGMIFWVDPDNQEAGKAISLERWGGNPFEASIKLHNAFSTINGIENTALYT DAGNNDAAALCTNLGEGWYLPASEELGHLFDIYNGIARDNGFTNATPNQISDAEKASRAT FDKNLTDLGGAVINAAAENGNGESYWSSTENEDGQKARYVRFGKYGMDYGAKTGTSRFVR AMKIIGDYKFPEEPATLSVSPMQIELTSEKGATADVTVSTNKPSFAYVIEGNGNTWLSAE QNGDKIKFTALSKNNSDETRTAIVTITAGNGDAQATATVTIRQQKEQTEVAAFQIGDFVK MDGGTELAEGGIVFWVEGNNAKILSLKRSATAINWANEGFTDALGLTDQEDGEANTQKMR ESGIAANIPILEYCKDGWYLPARNEMEAVFNAYNGRPSQSIGLKPDAIKQEEKDARAAWD KILTDNGGDVMNVKADNTAGDSYFTSTEADDASKVFYVRFGQWNPGLTGAKYAKSPARYV RCIRKISK >gi|222159273|gb|ACAB01000086.1| GENE 48 67445 - 68992 1213 515 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_2081 NR:ns ## KEGG: Fjoh_2081 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 85 515 40 470 471 243 35.0 2e-62 MKVYSFYIKALLVCHAILASCTDYVVKDPNFMPPDVVINDDTGNNDIIEGLPTPGEMQPY SPSLLGKPYRPIKVKYSNEFPPVTKWTESNTRIVAYMGEYKPTIKSESDYKAITNKYGSF ISGTKQQATGRFYVKKVNGRWWIIDPEGYPHYERSVTSMRYGSSARNKEAWNKRFGTDAK WIATSQAELASIGFHGTGAFCTNTYGKIQTHNSSIPNSPLTLTPSFGFLGQFRSQNGHTY PGNTSDNELGLVLYDDWADFCKKYVNTSLAPYLHDANVLGFFSDNEINFSSQNSRILDRF LALTDKTDVAYLAAKKFMDEKGATGVTDNLNSEFAGRLAEIYYKGVKDAIKEADPDMMYL GTRLHGTPKYMKDVVAAAGKYCDIISINYYSRWSPELDSYVKNWGEWTDAPFLVTEFYTK GQDSDLNNLSGAGFTVPTQNDRAYAYQHFTLGLLEAKNCVGWHWFKYQDDDGTDNSGKPA NKGVYDNHYEMYPYLGKFMQEVNYNVYNLIEYFDK >gi|222159273|gb|ACAB01000086.1| GENE 49 69037 - 71421 1270 794 aa, chain + ## HITS:1 COG:TM1061 KEGG:ns NR:ns ## COG: TM1061 COG4289 # Protein_GI_number: 15643819 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Thermotoga maritima # 417 793 8 386 387 342 47.0 1e-93 MVMKNRKKILIISLTLLASFTCAAETVTKGMKKVIDDALDFSVKQSMSMFYEMKDQKGIL PRTAENGKMITCESAWWTSGFYPGTLWYCYEYSNDPQIRAAAEEMTSRVEKQKYTTSNHD VGFIINCSFGNGYRLTHNETYRKVIETAAKSLSTRFHPVTGCTRSWNSKKWQFSVIIDNM 
MNLELLNVASSLTGDNTYYNMAKSHADRTMINHFRPDGSSYHVVSYDTITGKVLNQVTHQ GVNNQSSWSRGQAWGLYGFTMMYRQTGKKEYLDHAIKIGKFIMNHPRLPKDKIPYWDFDA PNIPKEDRDASAGAIMASAYVELSTYVEGELSKQFLTIAEQQIKSLASPAYRAKKVGDNN HYIIKHCTGFMAKQYEIDAPLTYADYYFIEALIRYKKLLENRPVVETITAFSENSDRSLW LSSLHRISYPLLTNMAKGELRKNMPVESIAADMQKRKEVTHLEALGRLITGISTWLELGP DNTIEGKLRAEYIDLALKSISNGVNPQSPDYLNFNNGRQPLVDAAFLAHGLLRARTQLWD KLDKTTQERVIKELKSSRVIKPSETNWLFFSAMVEAALKEFTGEWEYERVKYACDRFAQW YKGDGWYGDGADFHLDYYNSFVIHPMMVEVLEIMKKNGIESSIPYDLELERYARYAEQQE RLISPEGTFPIVGRSLAYRFGAFHALSDVAYRKLLPERVKPAQVRSALSAIINRQVNAPG TFNPEGWLRVGFAGYQPHIGETYISTGSLYLCTAVFIALGLPESDEFWSSPSADWTCKKG WAGVDLNVDKALKK >gi|222159273|gb|ACAB01000086.1| GENE 50 71441 - 73243 1102 600 aa, chain + ## HITS:1 COG:no KEGG:Cpin_1751 NR:ns ## KEGG: Cpin_1751 # Name: not_defined # Def: coagulation factor 5/8 type domain protein # Organism: C.pinensis # Pathway: not_defined # 1 592 1 566 576 326 34.0 2e-87 MKNILSILFVAVVLFSCSENKSVKKQTTFCNPMNLDYGWGNFQTKEKKARSAADPVIVLF KNKYYLFTTMDIGGYRVSDDLITWKNIYFNPEVKTSALDVDHYVAPAVAADENYVYFVNF TRDRTKKTVDVIRSSDPENGKWEKCGEVRRMADPCLFIDNGRFFFYYGLGGTQSTTFFEV DPATFKEIEGSKKVLREYVTDINQCKSGYHFGRRELYDEIDASAWLGKFSKVPCPEGAWI VKNQNKYYLQYATPGTICNWYCDVVLESDSVNGGFVEQPYNPVSLKVGGFIGGAGHSCVF KDKYENWWQVTSMWVGNHDEFERRIGLFPVSFDDKGRMRTHTVLGDYPMSLPQKKFNPQD ISAFDWMLQSYHKKSTASSSLPGFEPEKAVDENVRTWWSATSGKAGEYFIMDLGKKIRMN SVQINFAEQDINPDAPKETDYHAYKLYVSDNGRDWKLIVDKSKNMVAIPHEYVEFPKPIE TSFVKIENVHTPKEGKFALLDLRVFGFGYSDKPEIVKELSVKRNKDDERYASLSWNKVSD ADGYLVRFGYQPDFLNQCIQVKDKETTDLQLHILTKGVQYHYRVDSYNDSGITEGIVISE >gi|222159273|gb|ACAB01000086.1| GENE 51 73250 - 75811 1363 853 aa, chain + ## HITS:1 COG:TM1061 KEGG:ns NR:ns ## COG: TM1061 COG4289 # Protein_GI_number: 15643819 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Thermotoga maritima # 470 852 3 386 387 331 43.0 4e-90 MDKKLLIYPLLFSAGLLQARQKPNVVIIFTDDQGYQDLGCYGSPLIQTPFIDRMAKEGIK LTDFYVSSSVSSASRAGLLTGRLNTRNGVKGVFFPESEGMPSEEITLAEALKEQGYTTGC 
FGKWHLGDLKGHLPTDQGFDYYYGIPYSNDMYIGPSQQFASNVTFREGYNLSKAKEDQEF VRTSSRADIKKRLNNASPLFEGDKIIEYPCDQSTTTRRYFDHAIDFIENNPEQPFFVYIT PSMPHVPLFASEQFKGKSKRGLYGDVVEEIDWNVGRLIDYLDKKKLAENTLVIFASDNGP WLSFKEDGGSAEPLRGGKFSYYEGGVRVPCIIRWKGSIPAGVTSDAIVASIDLFPTIMHY AGCQSFKQKIDGINISSFLKNPSLRLRDEYVYVKGGEVHGIRKGDWVYLPKTGNSKFKKG DVPELFNLKQDIGESNNLHLQYPNKVKELQEVMKKYQSTSTMPYSQIRDTLNNDRQYWIQ TLVKIADPVISNLSKDQLKKNIPVGRSSSALASSREFITHMEAVGRTIAGIAPWLELGPD NTPEGKLREKYIKMTCKALANSVNPESNDYFNSTATRQILVNSAFLIQGLLQAPTQLWGN LDDTTRKRLIEQWKSTRTMKPGNNNWLLFSAMVECGLKEFSGEWNFPTIEKALTSHREWY KGDGVYGDGPDFHLDYYNSYVIHPMLLQILKVTVKQRPSFQSFLNEEWIRFIRYAEIQER MIAPDGSYPVLGRSVSYRSAAFQVLGACALFGKLPESLKPGQVRGAMTAMLKRLFEQPGT FDKEGWLTIGVCGEQPELGDIYLSTPCVYLCSLGFLPLGLPANDPFWCNPVEPWTSIKAF GGIDFPIDKFIKP >gi|222159273|gb|ACAB01000086.1| GENE 52 75821 - 77044 778 407 aa, chain + ## HITS:1 COG:TM1061 KEGG:ns NR:ns ## COG: TM1061 COG4289 # Protein_GI_number: 15643819 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Thermotoga maritima # 28 406 8 386 387 323 43.0 4e-88 MRYSILIVAFLMFNLTGYSQTTSGQEDRIYWVNILSQIADPLLNSMSKGELKKNMPVETI SGVPNPSNARTTHLEALGRLFVGIAPWLELGPDNTQEGQIREKYIRLMIESINHGFNPQS PDYLNFTVTRQPLVDTAFFCQGLLRSPKQIWSKLSAETQKNILNALQQVSKIKPVESNWL LFSAMVEATLLELTGKCNMHPIEYAIMRFKEWYKGDSWYGDGINLHMDYYNSFVIHPMLL DILEIMQKHNKGETDFYKKEQLRFSRYAEQQERMISPDGAYPVIGRSIAYRFGTFHVLST AALKDLLPTTITKSQVRCGLTAVIKRHMAIKGNFDEHGWLTLGFAGHQPQIAERYISTGS LYLCSTVFTALGLPVTDEFWSAPYEEWTEKKIWSGNPNVKLDKAIKL >gi|222159273|gb|ACAB01000086.1| GENE 53 77067 - 78176 640 369 aa, chain + ## HITS:1 COG:YPO0840 KEGG:ns NR:ns ## COG: YPO0840 COG4225 # Protein_GI_number: 16121148 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Yersinia pestis # 99 365 90 352 352 214 39.0 3e-55 MKKILSLFLVSISCSLNAQQLFSSKMEKEDITKVCNAVSEWQITHHNEVKHNPLDWTNGA 
LYRGMTEWGKVSGNQSCYDFVRTIGEKHKWNMWDRVYHADDICVGQAFIEMYRRFDDKRM LQPVMERAYYVASHPSKATLQKTDAVGTTERWSWSDALFMAPPVYAALYTITGDKIYLNY MDSEYKECVDSLYDKEDHLFYRDNKRIPLREKNGSKQFWGRGNGWVFAGLPLIIDNLPLN CPSRNYYIRLFTEMAEAVRKTQCKDGDWRTSLLDPDSYKMPENSCSAFMCFGIAWGIRNG YLPQRTYKPVIEKGWQSLVKAVHSDGKLGYIQPVGAAPKAAGFDATDVYGVGAFLLAGSE LYKLSCTYH >gi|222159273|gb|ACAB01000086.1| GENE 54 78213 - 78602 146 129 aa, chain - ## HITS:1 COG:no KEGG:BT_2114 NR:ns ## KEGG: BT_2114 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 122 135 256 258 150 64.0 1e-35 MYNSILKGLKLKKDILATYRKRAIDARFLSLADNNKNINVGQEEWILQLPVCKVADTRLA NVLYNQDVRTVKDLLEIVSGRGWKSLLRIEGVGKTSYYHLLSKLQMIGVVDESLDRILAR HSVGQFDNK >gi|222159273|gb|ACAB01000086.1| GENE 55 78900 - 79193 363 97 aa, chain + ## HITS:1 COG:MJ1379 KEGG:ns NR:ns ## COG: MJ1379 COG1669 # Protein_GI_number: 15669569 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferases # Organism: Methanococcus jannaschii # 1 97 1 100 100 71 46.0 3e-13 MKTTAEILDILRDFKAHYAEKYGIITLGLFGSAARGEHDETSDIDICIKLQEPNYFTIQD IKEDLEKIFRTKVDIISLGAIMRNFFKKSLEEDAIYI >gi|222159273|gb|ACAB01000086.1| GENE 56 79210 - 79581 176 123 aa, chain + ## HITS:1 COG:no KEGG:BT_2210 NR:ns ## KEGG: BT_2210 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 114 16 124 125 89 40.0 3e-17 MLHFVEEQSSFIERTTKHLNSYHDFLISDSAMVLFNSTCMCLQTIGETVRQIDNLTEEAL LVENYKEIPWKRIIGLRNILSHEYAAIDPEAIFNTIKIGIPPLLATINSIIADIENGKHN SLF >gi|222159273|gb|ACAB01000086.1| GENE 57 79761 - 80195 236 144 aa, chain + ## HITS:1 COG:no KEGG:BF3357 NR:ns ## KEGG: BF3357 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 144 1 144 144 116 36.0 3e-25 MNQKIDFNDIPKNYLYCTHNKCPRRNECLRHQATLYIPQNVPDFRTVNPNHIIGNENNCR FFNPYCTSRFACGIDHILDNIPYSTAITIRKELYSLMGRSMFYRIRNKERMLHPDEQKQI TAVFLKHGIENKPEFDKYIDLFDW Prediction of potential genes in microbial genomes Time: Wed May 
18 03:13:10 2011 Seq name: gi|222159272|gb|ACAB01000087.1| Bacteroides sp. D1 cont1.87, whole genome shotgun sequence Length of sequence - 14736 bp Number of predicted genes - 13, with homology - 11 Number of transcription units - 8, operones - 5 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 53 - 802 330 ## gi|237714788|ref|ZP_04545269.1| predicted protein 2 1 Op 2 . - CDS 813 - 2174 1224 ## COG0534 Na+-driven multidrug efflux pump - Prom 2278 - 2337 4.4 + Prom 2196 - 2255 2.5 3 2 Op 1 . + CDS 2280 - 4007 2279 ## COG1190 Lysyl-tRNA synthetase (class II) 4 2 Op 2 . + CDS 4061 - 5029 795 ## COG0240 Glycerol-3-phosphate dehydrogenase + Prom 5062 - 5121 7.0 5 3 Op 1 . + CDS 5144 - 6481 1691 ## COG0166 Glucose-6-phosphate isomerase 6 3 Op 2 . + CDS 6517 - 6747 74 ## - Term 6575 - 6606 1.1 7 4 Tu 1 . - CDS 6661 - 7167 500 ## BT_2125 hypothetical protein - Prom 7218 - 7277 5.8 + Prom 7157 - 7216 7.5 8 5 Op 1 . + CDS 7415 - 8224 744 ## BT_2126 hypothetical protein 9 5 Op 2 . + CDS 8262 - 8945 675 ## COG0637 Predicted phosphatase/phosphohexomutase + Prom 8964 - 9023 4.3 10 6 Tu 1 . + CDS 9068 - 11920 2195 ## COG2605 Predicted kinase related to galactokinase and mevalonate kinase - Term 11962 - 12005 2.1 11 7 Tu 1 . - CDS 12036 - 12146 175 ## - Prom 12167 - 12226 6.3 - TRNA 12575 - 12646 52.0 # Arg CCG 0 0 + Prom 12722 - 12781 9.2 12 8 Op 1 . + CDS 12923 - 13960 917 ## COG2502 Asparagine synthetase A 13 8 Op 2 . + CDS 13967 - 14629 563 ## COG0692 Uracil DNA glycosylase + Term 14689 - 14734 4.2 Predicted protein(s) >gi|222159272|gb|ACAB01000087.1| GENE 1 53 - 802 330 249 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237714788|ref|ZP_04545269.1| ## NR: gi|237714788|ref|ZP_04545269.1| predicted protein [Bacteroides sp. 
D1] # 1 249 1 249 249 503 100.0 1e-141 MITSQMSYEELANEVAKDYMDVSIIMQKKMPDALKYFRRQSKFPMFLFSTVTSPRKNKWI LIFFAKSKRRLKQYVDSFLVCVRETDHGKYVYRYDLPAKEGSLPGVTFYPPHFFSRYALR MGLELTGEDLIKRYFKTNTAMHYNADHLFLSEEEMKDLLNPVWYTSPDGISLGSATMVSG MELFICKTFVPWNMCKKDQLITCGKEEMFRLQEDLALDTHEEDVVSQSENHKIVEEFARM IMELIEKAG >gi|222159272|gb|ACAB01000087.1| GENE 2 813 - 2174 1224 453 aa, chain - ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 7 430 4 428 448 322 40.0 9e-88 MTGQKTPTALGTEKIGKLLMQYAIPAIIAMTASSLYNMVDSIFIGHGVGAMAISGLALTF PLMNLAAAFGSLVGVGAATLVSVKLGQKDYDTAQRVLGNVLVLNIIIGLAFTVVTLLFLD PILYFFGGSDATVGYARDYMVVILLGNVVTHLYLGLNAVLRSAGHPQKAMYATIATVVIN TILDPVFIYGFGWGIQGAAIATITAQVIALAWQFKLFSNREELLHFHKGIFRLKKKIVFD SLAIGMAPFLMNLAACFIVILINKGLKQYGGDLAIGAFGIVNRLVFLFVMIVMGLNQGMQ PIAGYNYGAKQYPRVTKVLKITIYVATLITTVGFLMGMLIPDLAVSIFTTDEELVRISAK GLRIVVMFFPIVGFQMVTSNFFQSIGMASKAIFLSVSRQVLILIPCLLILPHFYGQLGVW ISMPISDLIASMIAGTMLWYQFRQFSLSSNLKH >gi|222159272|gb|ACAB01000087.1| GENE 3 2280 - 4007 2279 575 aa, chain + ## HITS:1 COG:CAC3197 KEGG:ns NR:ns ## COG: CAC3197 COG1190 # Protein_GI_number: 15896444 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Lysyl-tRNA synthetase (class II) # Organism: Clostridium acetobutylicum # 13 501 33 510 515 455 48.0 1e-127 MNILELSEQEIIRRNSLNELRAMGIDPYPAAEYVTNAFSTDIKAEFKDDEEPRQVSVAGR IMSRRVMGKASFIELQDSKGRIQVYITRDDICPGEDKELYNAVFKRLLDLGDFIGIEGFV FRTQMGEISIHAKKLTVLAKSIKPLPIVKYKDGVAYDSFEDPELRYRQRYVDLVVNDGIK ETFLKRATVVKTLRNVLDEAGYTEVETPILQSIAGGASARPFITHHNSLDIDLYLRIATE LYLKRLIVGGFEGVYEIGKNFRNEGMDKTHNPEFTCMELYVQYKDYNWMMNFTEKLLERI CIAVNGCTESVVDGKTISFKAPYRRLPILDAIKEKTGYDLNGKSEEEIRQVCKELKMEEI DETMGKGKLIDEIFGEFCEGTYIQPTFITDYPVEMSPLTKMHRSKPGLTERFELMVNGKE LANAYSELNDPLDQEERFKEQMRLADKGDDEAMIIDQDFLRALQYGMPPTSGIGIGIDRL VMLMTGQTTIQEVLFFPQMRPEKVVKKDAAAKYMELGIAEDWVPVIQKAGYNTVADMKDV NPQKLHQDICGINKKYKLELTNPSVNDVTEWIQKI 
>gi|222159272|gb|ACAB01000087.1| GENE 4 4061 - 5029 795 322 aa, chain + ## HITS:1 COG:TM0378 KEGG:ns NR:ns ## COG: TM0378 COG0240 # Protein_GI_number: 15644628 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Thermotoga maritima # 1 317 8 311 323 151 30.0 2e-36 MGGGSWATAIAKMCLAQEDSINWYMRRDDRIADFKRLGHNPAYLTGVKFDTRRITFSSNI NDVVKESDTLIFVTPSPYLKAHLKKLKTKIKDKFIITAIKGIVPDDNVIVSEYFTKEYGV PPENIAVLAGPCHAEEVALERLSYLTIACPDKDKARIFARRLGSSFIKTSVSDDVAGIEY SSVLKNVYAIAAGICSGLKYGDNFQAVLISNAIQEMNRFLNTVHPLNRNVDESVYLGDLL VTGYSNFSRNRTFGTMIGKGYSVKSAQIEMEMIAEGYYGTKCIKEINKHHHVNMPILDAV YNILYERISPMIEIKLLTDSFR >gi|222159272|gb|ACAB01000087.1| GENE 5 5144 - 6481 1691 445 aa, chain + ## HITS:1 COG:BH3343 KEGG:ns NR:ns ## COG: BH3343 COG0166 # Protein_GI_number: 15615905 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate isomerase # Organism: Bacillus halodurans # 2 445 5 449 450 484 54.0 1e-136 MISLNIEKTFGFISKEKVFAYEAEVKAAQEMLEKGTGKGNDFLGWLHLPSSITKEHLADL NATAKVLRDNCEVVIVAGIGGSYLGARAVIEALSNSFTWLQEKKTAPVMIYAGHNISEDY LYELTEYLKDKKFGVINISKSGTTTETALAFRLLKKQCEDQRGKETAKKVIVAVTDAKKG AARVTADKEGYKTFIIPDNVGGRFSVLTPVGLLPIAVAGFDIDKLVAGAADMEKVCGSDV AFAENPAAIYAATRNELYRNGKKIEILVNFCPKLHYVSEWWKQLYGESEGKDNKGIFPAS VDFSTDLHSMGQWIQEGERSIFETVISVEKVNHKLEVPSDEANLDGLNFLAGKRVDEVNK MAELGTQLAHVDGGVPNMRIVLPELSEYNIGGLLYFFEKACGISGYLLGVNPFNQPGVEA YKKNMFALLDKPGYEEESKAIRAKL >gi|222159272|gb|ACAB01000087.1| GENE 6 6517 - 6747 74 76 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNGEQLCYMHRFVHRSLFIEIQNGDIPKCPYLVTIYRLTLLVATTRQFSLIILRFRLQTD FFLCSFTFGFGRKIEL >gi|222159272|gb|ACAB01000087.1| GENE 7 6661 - 7167 500 168 aa, chain - ## HITS:1 COG:no KEGG:BT_2125 NR:ns ## KEGG: BT_2125 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 168 22 187 187 205 69.0 6e-52 MKKRVLLLLPLFLGACSDSGESPVIKIEKLYAKVESEDESSGSGEKDYANQRDELPALKV GDEVKALLLLDGNGAELKTFRLQNDDEVDTKLIFEKTEVSTEGNLTDVEKGQLRFKDGVS 
KAKIMVIATIKQVDKNGDVKLEFYLSSKAECEGAQEEIGLKTKAEDDK >gi|222159272|gb|ACAB01000087.1| GENE 8 7415 - 8224 744 269 aa, chain + ## HITS:1 COG:no KEGG:BT_2126 NR:ns ## KEGG: BT_2126 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 266 4 265 268 298 58.0 1e-79 MTGNLLKKSSGFILLCSISLLMLVMTGCQEAKLKTVIGVANKQCPIDMGAVGTITSIIYD GNNVVYTLNMNEEITNIKILKDNPESMKSSITMMFQNPAADVKEMLKLMAKCNSGLHMIF VGNKSGEQAVCELTAEELKEVINTNADPAQSGQTKLEAQLKMANLQFPMQASEEVIVEKI EVIGESVVYICSVDEELCPISQIEENAAEVKEGIVSTLASQTDPATQIFIKTCVDNNKSI AYRYIGKESGQQYDVIIPLSDLKKMLIEK >gi|222159272|gb|ACAB01000087.1| GENE 9 8262 - 8945 675 227 aa, chain + ## HITS:1 COG:MA0451 KEGG:ns NR:ns ## COG: MA0451 COG0637 # Protein_GI_number: 20089342 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Methanosarcina acetivorans str.C2A # 5 222 2 211 218 115 36.0 8e-26 MKKKLKAVLFDMDGVLFDSMPYHSEAWHTVMKSHGLTLSREEAYMHEGRTGASTINIVFQ RELGREATQEEIESIYQEKSVLFNSYPEAKRMPGAWELLQKVKKDGLIPMVVTGSGQLSL LERLEHNYPGMFRKELMVTAFDVKYGKPNPEPYLMALKKGGLKADEAIVIENAPLGVEAG YNAGIFTIAVNTGPLDGQVLLDAGADLLLPSMQALSDHWDTLFEKNT >gi|222159272|gb|ACAB01000087.1| GENE 10 9068 - 11920 2195 950 aa, chain + ## HITS:1 COG:CAC3055 KEGG:ns NR:ns ## COG: CAC3055 COG2605 # Protein_GI_number: 15896306 # Func_class: R General function prediction only # Function: Predicted kinase related to galactokinase and mevalonate kinase # Organism: Clostridium acetobutylicum # 585 887 2 275 364 90 26.0 1e-17 MQKLLSLPPNLIHCFHELEEVNHTDWFCTSDPIGSKLGSGGGTTWLLQACHQAFAPQKSF GNWIGDEKRILLHAGGQSRRLPSYGPSGKILTPIPIFSWERGQKLGQNLLSLQLPLYERI MNQAPAGLNTLIASGDVYIRSEKPLQDIPNADVVCYGLWVNPSLATHHGVFVSDRKKPEV LDFMLQKPSLEELEGLSKTHLFLMDIGIWILSDRAIEVLMKRSLKEGTNDINYYDLYSDY GLALGEHPKTEDEEINQLSVAILPLPGGEFYHYGTSHELISSTLAIQDKVRDQRRIMHRK VKPNPAIFIQNSITQVSLSADNANLWIENSHVGKGWKLGSRQIITGVPENQWNINLPDGV CIDIIPIGDNDFVARPYGLDDVFKGALDKSTTTYLNIPFTRWMEERGITWEDIKGRTDDL 
QSASIFPKVTSVEDLGILVRWMTSEPQLEEGKKRWLKAEKVSADEISAGANLKRLYEQRN AFRKENWKGLAANYEKSVFYQLNLLDAANEFVRFNLDTPDVLQEDAAPMLRIHNRMLRAR IMKLREDKDCAKEEQVAFQLLRDGLLGVMNERKSHPTLNVYSDQIVWSRSPVRIDVAGGW TDTPPYSLYSGGSVVNLAIELNGQPPLQVYVKPCKEYHITLRSIDMGAMEVIRNYEELQD YKKVGSPFSIPKAALTLAGFAPAFSTESYPSLAKQLEAFGSGIEITLLAAIPAGSGLGTS SILASTVLGAINDFCGLAWDKNDICSYTLVLEQLLTTGGGWQDQYGGVFSGIKLLQSEAG FEQHPLVRWLPDQLFIHPDYRDCHLLYYTGITRTAKSILAEIVSSMFLNSGPHLSMLAEM KAHAMDMSEAILRSNFDSFGRLVSKTWIQNQALDCGTNPPAVAAIIEKIKDYTLGYKLPG AGGGGYLYMVAKDPQAAGQIRRILTEQAPNPRARFVEMTLSDKGLQVSRS >gi|222159272|gb|ACAB01000087.1| GENE 11 12036 - 12146 175 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEEARKKKWASVALIIGAIGFIIIMIYFTVISSLNM >gi|222159272|gb|ACAB01000087.1| GENE 12 12923 - 13960 917 345 aa, chain + ## HITS:1 COG:FN0776 KEGG:ns NR:ns ## COG: FN0776 COG2502 # Protein_GI_number: 19704111 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthetase A # Organism: Fusobacterium nucleatum # 10 345 3 327 327 348 51.0 1e-95 MSYLIKPKNYNPLLDLKQTELGIKQIKEFFQLNLSSELRLRRVTAPLFVLKGMGINDDLN GIERPVSFPIKDLGDAQAEVVHSLAKWKRLTLADYHIEPGYGIYTDMNAIRSDEELGNLH SLYVDQWDWERVITNEDRNVNFLKEIVNRIYAAMIRTEYMVYEMYPQIKPCLPQKLHFIH SEELRQLYPDLEPKCREHAICQKYGAVFIIGIGCKLSDGKKHDGRAPDYDDYTTKGLNDL PGLNGDLLLWDNVLQRSIELSSMGIRVDKEALQRQLKEEKEEKRLELYFHKRLMNDTLPL SIGGGIGQSRLCMFYLRKAHIGEIQASIWPEDMRKECEELDIHLI >gi|222159272|gb|ACAB01000087.1| GENE 13 13967 - 14629 563 220 aa, chain + ## HITS:1 COG:PA0750 KEGG:ns NR:ns ## COG: PA0750 COG0692 # Protein_GI_number: 15595947 # Func_class: L Replication, recombination and repair # Function: Uracil DNA glycosylase # Organism: Pseudomonas aeruginosa # 3 220 8 226 231 278 60.0 8e-75 MNVQIEESWKAHLKPEFDKDYFRTLTDFVKSEYSQYQIFPPGKLIFNAFNLCPFDKVKVV IIGQDPYHGPGQAHGLCFSVNDGVPFPPSLVNIFKEIKADISSDAPTTGNLTRWAEQGVL LLNATLTVRAHQAGSHQNRGWETFTDAAIRALAEEKENLVFILWGSYAQKKGAFIDRNKH LVLTSAHPSPLSAYNGFFGNKHFSRTNDYLKTHGKTEIAW Prediction of potential genes in microbial genomes Time: Wed May 18 03:13:43 2011 Seq name: 
gi|222159271|gb|ACAB01000088.1| Bacteroides sp. D1 cont1.88, whole genome shotgun sequence Length of sequence - 35630 bp Number of predicted genes - 34, with homology - 33 Number of transcription units - 24, operones - 7 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 83 - 2797 2264 ## BT_2133 hypothetical protein - Prom 2821 - 2880 5.6 2 1 Op 2 . - CDS 2887 - 3672 389 ## COG1418 Predicted HD superfamily hydrolase + Prom 3409 - 3468 6.3 3 2 Tu 1 . + CDS 3535 - 4659 719 ## COG3876 Uncharacterized protein conserved in bacteria - Term 4702 - 4757 7.6 4 3 Op 1 . - CDS 4791 - 5654 479 ## BDI_1888 putative DNA repair ATPase 5 3 Op 2 . - CDS 5701 - 7785 1245 ## BT_3385 hypothetical protein - Prom 7830 - 7889 5.0 + Prom 8084 - 8143 3.9 6 4 Tu 1 . + CDS 8184 - 8792 526 ## BT_4161 hypothetical protein + Prom 8799 - 8858 3.3 7 5 Tu 1 . + CDS 8974 - 10317 550 ## COG1512 Beta-propeller domains of methanol dehydrogenase type + Prom 10368 - 10427 4.2 8 6 Tu 1 . + CDS 10460 - 10585 64 ## BDI_1939 hypothetical protein 9 7 Tu 1 . - CDS 10839 - 11324 330 ## gi|237714809|ref|ZP_04545290.1| conserved hypothetical protein - Prom 11367 - 11426 1.8 10 8 Tu 1 . - CDS 11430 - 12350 729 ## gi|237714810|ref|ZP_04545291.1| conserved hypothetical protein - Prom 12495 - 12554 4.5 11 9 Op 1 . - CDS 12562 - 12879 256 ## gi|294647585|ref|ZP_06725160.1| hypothetical protein CW1_4648 12 9 Op 2 . - CDS 12876 - 13883 549 ## BT_0514 hypothetical protein - Prom 13968 - 14027 6.7 + Prom 14312 - 14371 6.0 13 10 Tu 1 . + CDS 14405 - 14599 111 ## gi|224539589|ref|ZP_03680128.1| hypothetical protein BACCELL_04497 + Prom 14602 - 14661 4.1 14 11 Op 1 . + CDS 14725 - 15351 386 ## COG3340 Peptidase E 15 11 Op 2 . + CDS 15348 - 16280 441 ## COG2378 Predicted transcriptional regulator 16 11 Op 3 . + CDS 16312 - 16551 103 ## BF2068 hypothetical protein 17 11 Op 4 . 
+ CDS 16439 - 16924 248 ## BF3477 hypothetical protein + Term 16998 - 17040 -0.8 + Prom 16939 - 16998 1.6 18 12 Tu 1 . + CDS 17055 - 17387 328 ## BF3287 hypothetical protein + Term 17399 - 17440 5.4 - Term 17217 - 17259 5.7 19 13 Op 1 . - CDS 17440 - 18159 508 ## BF3286 hypothetical protein 20 13 Op 2 . - CDS 18164 - 19084 659 ## BF3285 mobilization protein 21 13 Op 3 . - CDS 19050 - 19433 277 ## BF3284 mobilization protein - Term 19492 - 19539 10.5 22 14 Tu 1 . - CDS 19579 - 20544 423 ## BDI_3503 DNA primase - Prom 20619 - 20678 4.3 23 15 Op 1 . - CDS 20707 - 21780 751 ## BF2791 hypothetical protein 24 15 Op 2 . - CDS 21786 - 22073 307 ## ZPR_1827 putative excisionase 25 16 Tu 1 . - CDS 22205 - 23128 565 ## gi|262406901|ref|ZP_06083450.1| conserved hypothetical protein - Prom 23158 - 23217 3.2 - Term 23186 - 23231 3.6 26 17 Tu 1 . - CDS 23240 - 24433 676 ## COG4974 Site-specific recombinase XerD - Prom 24509 - 24568 6.9 - Term 24572 - 24608 0.4 27 18 Tu 1 . - CDS 24612 - 25508 684 ## BT_2140 putative sodium-dependent transporter - Prom 25666 - 25725 5.7 + Prom 25354 - 25413 3.0 28 19 Op 1 . + CDS 25650 - 26909 821 ## COG0860 N-acetylmuramoyl-L-alanine amidase 29 19 Op 2 . + CDS 26919 - 27812 1026 ## BT_2142 hypothetical protein 30 20 Tu 1 . - CDS 27890 - 28036 73 ## - Prom 28105 - 28164 5.0 + Prom 27896 - 27955 6.0 31 21 Tu 1 . + CDS 28164 - 29576 1019 ## COG0593 ATPase involved in DNA replication initiation 32 22 Tu 1 . - CDS 29666 - 30409 654 ## COG0778 Nitroreductase - Prom 30529 - 30588 7.4 + Prom 30418 - 30477 7.7 33 23 Tu 1 . + CDS 30694 - 33237 2228 ## COG0209 Ribonucleotide reductase, alpha subunit + Term 33276 - 33315 5.3 + Prom 33255 - 33314 13.1 34 24 Tu 1 . 
+ CDS 33369 - 35628 1578 ## COG1640 4-alpha-glucanotransferase Predicted protein(s) >gi|222159271|gb|ACAB01000088.1| GENE 1 83 - 2797 2264 904 aa, chain - ## HITS:1 COG:no KEGG:BT_2133 NR:ns ## KEGG: BT_2133 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 904 1 902 902 1552 91.0 0 MAPLKAKILISFILLIVLLMLPDEATSQRRRRGMIASNTAKTDSLQGNDSLKADTVTVRI DSIAPVKKKQPLDAPVVYESNDSTVFTLGGAATLYGSGKVNYQNIELAAEVISMNLDSST VHAYGIKDTTGVEKGKPVFKEGDTSYDTETIRYNFKTKKAGITDIVTQQGEGYVTGSKAK KGANDEIFMEHGRYTTCDHHDHPHFYMQLTRAKVRPKKNVVTGPAYLVVEDVPLPLAVPF FFFPFSSSYSSGFIMPTYMDDSSRGFGLAEGGYYFAMSDIMDLKITGDIFTKGSWRLSGL TNYNKRYKYSGTLQADYQVTKTGDKGMPDYAVAKDFKVVWNHRQDAKASPNSTFSASVNF STSSYERSNINNLYNSQLLTQNTKTSSISYSRSFPDIGLTLSGTTNIAQTMRDSSIAVTL PDLNITLSRLFPFKRKKAAGAERWYEKISISYTGRLTNSIRTKDDRLFKAGLSEWENAMN HNIPISATFTLFKYLQVSPSVNYTERWYTRKINQQYNEVDHKLEALPGDTLNGFYRVSNY SASLSLSTKLYGMYKPLFAKKKEIQIRHVFTPQVSLSGAPGFSKYWEEYTDYNGNTQYYS PFTGQPYGVPSREGSGTVSFSISNNLEMKYYDAKKDTLKKVSLIDELGASMSYNMAAKER PWSDLSMNLRLKLTKNYTFNMNASFATYAYTFDKSGNVVTSNRTEWSYGRFGRFQGYGSS FNYTFNNDTWKKWFGPKEEDEKGKDKKSEDSDDEESDGTEGDGTTPKKVEKAQADPDGYQ VFKMPWSLSLSYSFNIREDRSKPINRYSMRYPYTYTHNINANGNIKISNNWSLSFNSGYD FQAKEITQTSCTISRDLHCFNLSASLSPFGRWKYYNVTIRANASILQDLKYEQRSQTQSN IQWY >gi|222159271|gb|ACAB01000088.1| GENE 2 2887 - 3672 389 261 aa, chain - ## HITS:1 COG:MJ0778 KEGG:ns NR:ns ## COG: MJ0778 COG1418 # Protein_GI_number: 15668959 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Methanococcus jannaschii # 104 244 18 149 169 71 35.0 2e-12 MNLLSSYNRSVIDHHSDAFSFQQRQILIHLFCSRDNLTVYTHWQQHNEQKYKNEGNSSLH IGNFFVILHKNINFSSREQKIKHMNPYEIIDKYYPENTQQRQILVIHSLAVSGKAMKMLD AHPELRLNRSFVKEAALLHDIGIFQTDAPTIQCFGTHPYIAHGYLGAEILRAEGFPQHAL VCERHTGAGLSLEDIIAQQLPVPHREMLPITLEEQLICFADKFFSKTHLDEEKTVEKARQ SIAKYGEEGLSRFDRWCSLFL >gi|222159271|gb|ACAB01000088.1| GENE 3 3535 - 4659 719 374 aa, chain + ## HITS:1 COG:BS_ybbC KEGG:ns NR:ns ## COG: BS_ybbC COG3876 # 
Protein_GI_number: 16077233 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 18 373 45 414 414 253 41.0 3e-67 MLLPMRVNSQIVTGAEQMDQYLPLLKGKRIGMVVNHTSVVGRKQVHLLDTLLKRDVRVVK VFAPEHGFRGNADAGETVKDGKDSRTGIPIVSLYGDNKKPSAAQLKDVDVIVFDIQDVGA RFYTYISTMYYVMEACAENKKEMVVLDRPNPCDYVDGPILKPAYRSFVGMLPLPVLHGCT IGELAQMINGEGWIANKKNPCSLKVIPMTGWKHGSPYSLPIKPSPNLPNDQSIRLYASLC PFEATRVSVGRGTTFPFQVLGAPNKKYGSFTFTPRSLPGFDKNPMHKGITCYGEDLRNVT DVNGFTLRYFLDFYRLSGENAAFFSRARWFDLLMGTNSVRKAILKGESEEAIRNSWQKEL QDYKEIRKKYLLYE >gi|222159271|gb|ACAB01000088.1| GENE 4 4791 - 5654 479 287 aa, chain - ## HITS:1 COG:no KEGG:BDI_1888 NR:ns ## KEGG: BDI_1888 # Name: not_defined # Def: putative DNA repair ATPase # Organism: P.distasonis # Pathway: not_defined # 1 272 1 272 281 343 65.0 4e-93 MTKKDAIKVFENKKIRAVWDDQKEEWYFSIVDVVSVLTESVDGRKYWNKLKQRLKAEGSE LVTNCHQLKLPSSDGKLYKTDVATTQQLFRLIQSIPSPKAEPFKMWMAQVAKERLDEMQD PELTINRAMMEYKSLGYSDNWINQRLKSIEVRKELTDEWKRNGMQEGVQFAALTDIIYQT WAEKSAKEYKQFKGLKKENLRDNMTNTELILNMLAETATTDLSKERNPSGFAENAEVAKD GGNVARVARKQLESQLKRPVISPLNAKSALQVGDEREKNETSGESVD >gi|222159271|gb|ACAB01000088.1| GENE 5 5701 - 7785 1245 694 aa, chain - ## HITS:1 COG:no KEGG:BT_3385 NR:ns ## KEGG: BT_3385 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 694 1 694 696 1118 77.0 0 MRITLVRDDGKVKTLRTLKMELLLEQMKTEVKAQPVSKMREVLRYTLPGNSIEEVRKVPK VMPAAAFVRKEGVMTLNEYNGIVMMEVNNLSGRAEADEIKELVKELPQTYLAFTGSSGKS VKIWVRFTYPDDRLPTLREQAELFHAHAYQTAVKYYQPQLPFDIELKEPSLEQYCRLTYD PELYFNPKAMPIYMKQPVSLPSETTFIKRTRETDSPLQRMAPGYENYEALSVLFSAAFNR ALEELDGYREGDDLQPLLVCLAEHCFRAGIPEEDTVRWTKAHYRLPSDELLIRETVKSVY RSAKGFGKKSSLTAEQLFAMQMDEFMKRRYEFRYNTLTTEVEYRERNSFNFYFRPVDKRV LASITMNAMYEGVKMWDRDVIRYLDSDHVPVYQPVEDFLYHLPHWDGKDRILELANRVPC DNPHWAPLFRRWFLNMVAHWRGMDKKHANSTSPLLIGPQAYRKSTFCRMLLPPALQAYYT DSIDFSRKRDAELYLNRFLLINMDEFDQISPTQQAFLKHILQKPVVNTRRPNASAVEELR RYASFIATSNHRDLLTDTSGSRRFIGIYMTGAIDVSRPIDYEQLYAQALELLYHNERYWF 
DSEEEAIMTENNREFEQSPAIEQLFMVYYRRAEEEEEGEWLLAIDILRRIQKASKMTFSA RQASYFGRILQRLGVKSKRKTYGTYYHVVPLEVE >gi|222159271|gb|ACAB01000088.1| GENE 6 8184 - 8792 526 202 aa, chain + ## HITS:1 COG:no KEGG:BT_4161 NR:ns ## KEGG: BT_4161 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 198 1 198 200 212 55.0 7e-54 MYAKYDFRKKPSSKEDEDEQPLYPRIVSNGTIDFQQIVKEIAQASSFTPADIEGVQLAIE NKISEYLVSGHHVQLGNLGYFSAKLKARPVMDAKEIHAQSIYFDNVNFRPSSSFRKKVRG FVEKAKSGFAHSAEVPVEERRRRLEKFLDERPMIRRKEYTQLTGLLKNKALNELNGWVKE GVLDTIGSGSHKIYVRVNTMKD >gi|222159271|gb|ACAB01000088.1| GENE 7 8974 - 10317 550 447 aa, chain + ## HITS:1 COG:BMEI0229 KEGG:ns NR:ns ## COG: BMEI0229 COG1512 # Protein_GI_number: 17986513 # Func_class: R General function prediction only # Function: Beta-propeller domains of methanol dehydrogenase type # Organism: Brucella melitensis # 65 212 34 180 253 85 31.0 3e-16 MKDKTQKSFRGIARTLLLVFCYAFGSHTFALALEHEQTVETYKVTDVPNPRNESSSNWVS NPNQILDESYVWEINNMLSQLEDSLSIEVAIVALSSIGEDIPAEFAHKLFEHWGIGKKAD DNGLLILLVLDQRKVTFATGYGLEGVLPDALCFRIQQNEMVPWFRKNDFDRGMTEGVRAV TLVLYGSDYEPVSQSTSDNYWKSASNTLWNFLANQSPMLWIFLILVNVLTYRMKVNKARP KDGSALAAIKVLTLYSPLGCLVLFFPVWPALIAASLWYKFYQKQRVILQSKTCDSCKAVA LQLLSNELATPLLSASEQMEHKLGSAIHRIYQCTSCGWILRYKSVVGSEYKMCNQCHTIA SKRISPWKTVKEATYSDAGLEVADYLCLMCGDKKQATQKIPRKTPPNSDSPSNSHSSSSS SSSSGSSSSSGSFGGGRSGGGGASSSF >gi|222159271|gb|ACAB01000088.1| GENE 8 10460 - 10585 64 41 aa, chain + ## HITS:1 COG:no KEGG:BDI_1939 NR:ns ## KEGG: BDI_1939 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 41 1 41 120 73 85.0 2e-12 MNIEEFREYCLSFKGVHEKMPFPNVPDKYSRDVLCFYVADK >gi|222159271|gb|ACAB01000088.1| GENE 9 10839 - 11324 330 161 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237714809|ref|ZP_04545290.1| ## NR: gi|237714809|ref|ZP_04545290.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 161 9 169 169 301 100.0 8e-81 MGKVMKVFKTLINELFGTKREYKVEGLGIFTCKVCDWWRDKCYLWSGAVQLPFYSVETLV LIEGDASAPFSRQLLELQVLLQNWVPMAEQLDSMLSSKSQQKHKEKIYGSWQDEFYPYTI VPAVLYSDSWEIVFYRNSGVNYNFTVFWKENKIQDLHLGGF >gi|222159271|gb|ACAB01000088.1| GENE 10 11430 - 12350 729 306 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237714810|ref|ZP_04545291.1| ## NR: gi|237714810|ref|ZP_04545291.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 306 45 350 350 580 100.0 1e-164 MGLNGKVKTLDIYYYRPENISSFFSTKFKKGKKWEEDGIHHYHFHFNIDGNLAQCNEYDK QENIIRKTDYTYETWGHTSTLSFPKTNSTIVSKYNTDGKLLEIVYSDSHGEINTYDDRGN LIRTKQIDGTIDKPAMTEYDYNEHDLLVEKREYYNIGELSGKTVYEYDTNGLLAQTQVYD LLSRDRPETHFIYEYSDSNRVVKKYRIDDEGNKELESWCKREYYPNGKLKTVVRNSIVQE KYDEQGRPIDERPGFEILYKDFQDYGCDANGNWIETKNFKDYVDFGTGLGRILKPYVERI FTYYEE >gi|222159271|gb|ACAB01000088.1| GENE 11 12562 - 12879 256 105 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294647585|ref|ZP_06725160.1| ## NR: gi|294647585|ref|ZP_06725160.1| hypothetical protein CW1_4648 [Bacteroides ovatus SD CC 2a] # 1 105 1 105 105 199 100.0 6e-50 MKDIIELLQNERTKTVDALKQGEQDKLSHLQQLDKALGWLKVVEDNELATVGSYKIHRLP DPRSGFSYYHLMIDNESGDPKDWTEYKPDNQSLELCFDDIIITRK >gi|222159271|gb|ACAB01000088.1| GENE 12 12876 - 13883 549 335 aa, chain - ## HITS:1 COG:no KEGG:BT_0514 NR:ns ## KEGG: BT_0514 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 333 3 324 334 475 71.0 1e-132 MKLRANVVEQKIFEIEDLKELEEFLQSQSEIEQLRERLFAEFLKYADYKNAGEWNKAVRL CESLAIIGWGNHEPVEALRGQFFNGNPATCFQNKFGETRFVDAIWSKRVNGFTMEQGRTS YCFSPDDPNQKQSVFWEYEIKEDIQDIRLESQRNWIPKNPVWIKRTIGNCYENSKVVIES VDKELKPELDRRMRPEIYGRAINRIIINCSYSYYDHDHCKTNYIIADEKLKLKQKDFYRT LLTMFTRQEIEKNGYFLRNRFEFGPFRADTGKIRIGLNLEKEFSELSHSEQRLKLSEYIL FALNHVTDKLKKKKLDYDFDLMLEDFNSILTEWKA >gi|222159271|gb|ACAB01000088.1| GENE 13 14405 - 14599 111 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|224539589|ref|ZP_03680128.1| ## NR: gi|224539589|ref|ZP_03680128.1| hypothetical protein BACCELL_04497 [Bacteroides cellulosilyticus DSM 
14838] # 1 62 1 62 281 127 95.0 2e-28 MEKIQVIQPTKLLAPYIKQYWLLRIDDVKQGFQRSIPAGCVALVFHKGNKIISSFHKETQ PHYK >gi|222159271|gb|ACAB01000088.1| GENE 14 14725 - 15351 386 208 aa, chain + ## HITS:1 COG:lin0382 KEGG:ns NR:ns ## COG: lin0382 COG3340 # Protein_GI_number: 16799459 # Func_class: E Amino acid transport and metabolism # Function: Peptidase E # Organism: Listeria innocua # 1 207 1 204 209 200 51.0 1e-51 MKRLFLCSSFADVANLFVDCAKEDLQGKIIAFIPTASLTEPIRFYVKKGKKALEEAGMIV EEVEITQLPKEEISSILHKCDYIYITGGNTFFLLQELKRKGVDKIISKQVKLGKLYIGES AGAIIASPDAEYMRSVNFDPIEKAPELKDCTSLDLVDFYTIPHYGNFPFKKKGEKIVQLY NEKLQLIPISNKQAVIIEDSNIQIKDAK >gi|222159271|gb|ACAB01000088.1| GENE 15 15348 - 16280 441 310 aa, chain + ## HITS:1 COG:STM0410 KEGG:ns NR:ns ## COG: STM0410 COG2378 # Protein_GI_number: 16763790 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Salmonella typhimurium LT2 # 3 229 4 223 230 103 29.0 5e-22 MNRIDRISAILIQLQSHSLVKAQQISERFNISIRTVYRDIRTLEEAGIPIIGNPGIGYSL AEGFKLSPLMFTQKEALSFLIAEKLVHELTDSNSNEHYKSGIEKIRSVMRFADKNMLETM EKCMSVLDTYKSSTYKPDILLLILQSIYQKRIIEISYLGSNALSVSERKIEAVGIFFSRT NWYLIGFYLPKEIYLTFRIDRIQKMHILNELQSREHPPLEKFIREFYDKEKLHEIVIRIE KNKTSIMNDDKYYYGLTSEKEIGDMFELHFLTFSISKFAHWYLSFADSATIIRPDSLKYE VKNIINNISI >gi|222159271|gb|ACAB01000088.1| GENE 16 16312 - 16551 103 79 aa, chain + ## HITS:1 COG:no KEGG:BF2068 NR:ns ## KEGG: BF2068 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 73 12 84 90 110 84.0 1e-23 MKKIFDQRFFRLLSECSQRKVSASEFAEAIEELATHVANFSINEQDYNVLLRYFSFGLHR LKSYRVRFEQEKNAPSASN >gi|222159271|gb|ACAB01000088.1| GENE 17 16439 - 16924 248 161 aa, chain + ## HITS:1 COG:no KEGG:BF3477 NR:ns ## KEGG: BF3477 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 31 158 1 128 139 230 88.0 2e-59 MSRITTFYYVIFPLVCTVLNRIVYGLSKKKMPHLHLIDEAIGLLNTEIRLIEWRIKYPEQ LQQRINKQPLSPLYLADKTTLINIMEMVSGLFLSKNIVYQNGKPAYLVDLAKAFEWLFNI KIGDCYQKHEDVIKRKPGKLTGFLNGLVELIKKEHDKKGYR 
>gi|222159271|gb|ACAB01000088.1| GENE 18 17055 - 17387 328 110 aa, chain + ## HITS:1 COG:no KEGG:BF3287 NR:ns ## KEGG: BF3287 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 108 10 117 117 145 75.0 3e-34 MYIDNDDFSVWMQKLYAKLEELCKDVRVLRNADKVLPEDDNLLDNQDLCLLFKVSIKTLQ RYRAIGALPYFTISGKVYYKASDVREFIKERFSVTTLRQFEKEHCTKKKK >gi|222159271|gb|ACAB01000088.1| GENE 19 17440 - 18159 508 239 aa, chain - ## HITS:1 COG:no KEGG:BF3286 NR:ns ## KEGG: BF3286 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 239 1 241 241 222 55.0 8e-57 MEDNLILEGLLSMVTELKEKQEAQATPASREETLERLGAIEQALSELHSNSAVPEEKLRV IQSQLDDIRSRMQGQQKNIEDTKKITLETYRCFKVMIDTLGSCKTDKEEATPLPFYQRIY NKVASWIRPGLFVFSAVLVVCSVSIFLNVRLAERMQQLQDNDMKYRYLLMQGQADGETFD MLENKFKWQRDNGFIRSLTDSVMDFEYRIQKQAEALERARLLNEQAEQLKKEADKLGKP >gi|222159271|gb|ACAB01000088.1| GENE 20 18164 - 19084 659 306 aa, chain - ## HITS:1 COG:no KEGG:BF3285 NR:ns ## KEGG: BF3285 # Name: bmgA # Def: mobilization protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 306 1 304 304 454 80.0 1e-126 MIGKIKKGSGFKGCVNYVLGKEQAVLLHADGVLTESRGDIIRSFCMQTGMNPDLKKPVGH IALSYSTVDAPKLTDGKMVQLAQEYMREMKITDTQYIIVRHQDREHPHVHIVFNRIDNNG KTISDRNDMYRNEQVCKKLKAKHGLYFAGGKEQVKQHRLKEPDKSKYEIYTAVKNEIGKS RNWQQLQERLAEKGITVRFKRKGQTDEIQGISFSKGEYTFKGSEIDRSFSFSKLDKCFGY AGMNVAGSQRQTTFAPVREQALAPGKADSPLITGSLGLFSASSPPVDEEPNFNLRKKKKK KKQLKL >gi|222159271|gb|ACAB01000088.1| GENE 21 19050 - 19433 277 127 aa, chain - ## HITS:1 COG:no KEGG:BF3284 NR:ns ## KEGG: BF3284 # Name: bmgB # Def: mobilization protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 126 1 126 127 188 84.0 6e-47 MTSIKDKPGGRPAKKRIEKQQRVVSTKLTELQYYAIRKRAGEAGLRVSEYVRQAVVSAEV IPRLNRQDADTIRKLAGEANNINQLAHRANAGGFALVAVELVKLKNRIIGIINHLSDDWK NKKGKRF >gi|222159271|gb|ACAB01000088.1| GENE 22 19579 - 20544 423 321 aa, chain - ## HITS:1 COG:no KEGG:BDI_3503 NR:ns ## KEGG: BDI_3503 # Name: not_defined # Def: DNA 
primase # Organism: P.distasonis # Pathway: not_defined # 1 291 1 291 312 469 74.0 1e-131 MNIEDVKQIPIADYLHSLGYSPVKQQGNGLWYKSPLREEHEPSFKVNTDRNLWYDFGAGK GGNIIALAKELYCSDSLPYLLNRIAEQTPHVRPVSFSFPQRRTEPSFQHLEVRDLIHPAL LRYLQGRGINVELAKRECKELHFTNNGRPFFAIGFPNMAGGYEVRNSFFKGCIAPKDITH IRQQGEPREKCLVFEGFMDYLSFLTLRMKNCPTMPDLDRQDYVILNSTVNVPKAIDVLYP YERIHCMLDNDKAGYEATRAIELEYSYRVRDFSHNYRGYSDLNDYLCGRKQEQKNEASQV QETKQETGQRAASRQKRGRGI >gi|222159271|gb|ACAB01000088.1| GENE 23 20707 - 21780 751 357 aa, chain - ## HITS:1 COG:no KEGG:BF2791 NR:ns ## KEGG: BF2791 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 352 1 356 368 387 53.0 1e-106 MENRKATEAGQDVTMQKEDFAALWKTIHLKVTDTYEVPPEILWVNGSTIGTLGNFSASTG KAKSKKTFNISAIVAAALKNDEVLKYSAYLPPNKRKILYVDTEQSKYHCHKVMERILRLA GLPTDKDRDDFVFIVLREQTPDKRKQIIGYMLENMPDVGLLIIDGIRDLMYDINSPSEST DLINLLMRWSSGYNLHIHTVLHLNKGDDNTRGHIGTELNNKAETVLQITKSQQDGNISEV KAMHIRDREFDPFAFRINDNALPEIVDDYVFQQPKQDRNFSLTELTEQQHREALENGFGK QVVQGYSNVIAALKQGYASIGYERGRNVLVSLNKFLVNKRMIVKEGKGYRYNPDFHY >gi|222159271|gb|ACAB01000088.1| GENE 24 21786 - 22073 307 95 aa, chain - ## HITS:1 COG:no KEGG:ZPR_1827 NR:ns ## KEGG: ZPR_1827 # Name: not_defined # Def: putative excisionase # Organism: Z.profunda # Pathway: not_defined # 5 91 6 92 93 78 43.0 6e-14 METSIEKRVAELENLVFLSKNVLSFDEASKFLNLSKSYLYKLTSGNLIPHYKPQGKMLYF EKAELEAWLRQNPVKTQAQIEQEAQKYILNRPLKI >gi|222159271|gb|ACAB01000088.1| GENE 25 22205 - 23128 565 307 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262406901|ref|ZP_06083450.1| ## NR: gi|262406901|ref|ZP_06083450.1| conserved hypothetical protein [Bacteroides sp. 
2_1_22] # 1 307 1 307 307 557 100.0 1e-157 MTEEELKIKFDHIQGIFNRCINHASQVMIDNIASKSLYFDEEQADKLEQQEYVRTADELV QLYIRYSVLNDIQYFYSVSDFFWESGFYESLKSDEKRKYMSFNPLSFDYSRYEQDNTVYD EELPYFSVIVKTVVLEKYSAYLRKKKESKVQAEMQPQQKEPQPIQDKCQEPKIISHIAET ENPFKSILNDRQIALLVDCINEVEIFNAPMTFEDLKAILSCKPKVIFRSNNNRQVAFLFS ELSNRGLITPNWQSVIAKNKLFVTKNIKKDKYLNQGDLATAANYVKGVEHEKDYVTISNY IKQLKKL >gi|222159271|gb|ACAB01000088.1| GENE 26 23240 - 24433 676 397 aa, chain - ## HITS:1 COG:lin2069 KEGG:ns NR:ns ## COG: lin2069 COG4974 # Protein_GI_number: 16801135 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Listeria innocua # 128 380 20 281 297 72 29.0 2e-12 MAKKKQEVKLKEPVRIRFKQLANGNQSIYLDYYTGDVIRKENYVGGKRQYEFLKLYLIPE KTREDKAKNEATLALAKAIQSKRIVELQNDAHGFQNTNKSKANVIDYLMNMRSQSKERGS LNYEKTVGNTIRELKLFRGDYIAFRDIDKDFLNSFVDFLKQAKKASKFGLLKAGGVLSNN SVIAYYGVLRTAINRAYKEGIITVNPTKEFDFASKVKAEVSRREYLTIEELKRLIGTECK YEIMKQAFLFSCLCGLRVSDIRKLKWNDLQKSGERIRIEIKMQKTKEPLYLPISDEALKW LPQQNEAKGDDLIFPLTHEGTINKILQKWAKDAGIIKHISFHVARHTHATMMLTLGADLY TVSKLLGHKNIATTQIYAKIVDKKKEEAISLIPNLTD >gi|222159271|gb|ACAB01000088.1| GENE 27 24612 - 25508 684 298 aa, chain - ## HITS:1 COG:no KEGG:BT_2140 NR:ns ## KEGG: BT_2140 # Name: not_defined # Def: putative sodium-dependent transporter # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 297 1 297 297 503 92.0 1e-141 MLKFLKNWTLPIAMLVGAIGYPLFISLSFLTPYLIFTMLLLTFCKVSPCDLKPKPLHMWL LLIQIGGALAAYLLLYRFDKIVAEGVMVCIICPTATAAAVITSKLGGSAASLTTYTLIAN IGAAIAVPILFPLVEVHPDVTFWEAFLVILGKVFPLLICPFLAAWLLSKCLPKVHQKLLG YHELAFYLWAVSLAIVTAQTLYSLLNDPADGFTEIMIAVGALVACCLQFFLGKTIGSIYN DRISGGQALGQKNTILAIWMAHTYLNPLSSVAPGSYVLWQNIINSCQLWKKRKNEIKY >gi|222159271|gb|ACAB01000088.1| GENE 28 25650 - 26909 821 419 aa, chain + ## HITS:1 COG:aq_1681 KEGG:ns NR:ns ## COG: aq_1681 COG0860 # Protein_GI_number: 15606778 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Aquifex aeolicus # 30 255 130 353 359 131 38.0 
2e-30 MKLNRPYILYIFICLWLLFLPSCTNHLWGKDFVVVIDAGHGGHDPGAIGKISKEKNINLN VALKVGNLIKRNCDDVKVIYTRSKDVFIPLNRRAEIANNAKADLFISIHTNALANNRTAK GASTWTLGLAKSDANLEVAKRENSVILYESDYKTRYAGFNPNSAESYIIFEFMQDKYMEQ SVHLASLMQKQFRQTCRRADRGVHQAGFLVLKASAMPSILIELGFISTPEEERYLNSEEG TGTMAKGIYRAFLNYKREHELRLTGVSKTIVPTEQEEDNAPAIAQKDTESVNTAPQQEKL LAEAKTKPAATAKTAPKRPIVVESATNDSEITFKIQILTSSKPLAKNDKRLKGLKEVDYY KEGGIYKYTYGASSDYNKVLRTKRTITAQFKDAFIIAFRNGEKMNVNEAIAEFKKRRNK >gi|222159271|gb|ACAB01000088.1| GENE 29 26919 - 27812 1026 297 aa, chain + ## HITS:1 COG:no KEGG:BT_2142 NR:ns ## KEGG: BT_2142 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 297 1 297 297 524 90.0 1e-147 MKYITKEVRIGIAGIVALCVLIYGINWLKGIHMFQPSSYFYAKFENVNGLTKSSPVFADG VRVGIVRDIYYDYAKPGNVIVEVELDTELRIPKGSSAELVSELMGGVRMNILLANNPREK YAVGDTIPGTLNNGMMESAAKLIPKVEEMLPKLDSILISLNNILGDKSIPATLHSIEKTT ANLAVVSSQVKGLMSNDIPQLTSKLNTIGDNFVVISGNLKEVDYAATFKKIDETLANVKI LTEKLNSKDNTIGLLFNDPTLYNNLNATTENAASLLEDLKEHPKRYVHFSLFGKKDK >gi|222159271|gb|ACAB01000088.1| GENE 30 27890 - 28036 73 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTYAFVVGHTLFTLSCSFISICMLMPYIYNFSLFSHFCTFVREYYFGL >gi|222159271|gb|ACAB01000088.1| GENE 31 28164 - 29576 1019 470 aa, chain + ## HITS:1 COG:CAC0001 KEGG:ns NR:ns ## COG: CAC0001 COG0593 # Protein_GI_number: 15893299 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication initiation # Organism: Clostridium acetobutylicum # 9 462 8 439 446 294 35.0 2e-79 MIESNHVVLWNRCLDVIKDNVPETTYNTWFAPIVPLKYEDKTLILQIPSQFFYEILEERF VDLIRKTLYKVIGEGTKLMYNVMVDKTSIPNQTVNLEASNRSTAVTPKSIIGGNKAPSFL QAPAVQDLDPHLNPNYNFENFIEGYSNKLSRSVAEAVAQKPGGTAFNPLFLYGASGVGKT HLANAIGTKIKEIYPEKRVLYVSAHLFQVQYTDSVRNNTTNDFINFYQTIDVLIIDDIQE FAGVTKTQNNFFHIFNHLHQNGKQLILTSDRAPVLLQGIEERLLTRFKWGMVAELEKPTV ELRKNILRNKIHRDGLQFPPEVIDYIAENVNESVRDLEGIVIAIMARSTIFNKEIDLDLA QHIVHGVVHNETKAVTIDDILKVVCKHFDLEPSAIHTKSRKREVVQARQIAMYLAKNHTD FSTSKIGKFIGNKDHATVLHACKTVKGQLEVDKSFSAEVQEIESLLKKRN 
>gi|222159271|gb|ACAB01000088.1| GENE 32 29666 - 30409 654 247 aa, chain - ## HITS:1 COG:BH1048 KEGG:ns NR:ns ## COG: BH1048 COG0778 # Protein_GI_number: 15613611 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Bacillus halodurans # 1 182 5 186 244 108 31.0 1e-23 MFETVKNRRTIRKYLPKDINPSLLNDLLETSFRASTMGGMQLYSVIVTRDAKMKEKLSPA HFNQPMVKNAPVVLTFCADFRRFSKWCEQRKAVPGYDNLMSFMNASMDTLLVAQTFCTLA EEAGLGICYLGTTTYNPQMIIDTLQLPELVFPLTTITVGYPDGIPAQVDRLPLEAAVHDE KYHDYTPEEIDKLYAYKESLPENKQFIEENKKETLAQVFTDVRYTKKDNEFMSENLLKVL RQQGFLK >gi|222159271|gb|ACAB01000088.1| GENE 33 30694 - 33237 2228 847 aa, chain + ## HITS:1 COG:AF1664 KEGG:ns NR:ns ## COG: AF1664 COG0209 # Protein_GI_number: 11499254 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, alpha subunit # Organism: Archaeoglobus fulgidus # 25 847 7 752 752 282 29.0 2e-75 MEKQIYSYDEAYEESLRYFQGDELAARVWVNKYAVKDSFGNIYEKSPEDMHWRIANEVAR VESKYPNALTAKELYDLLDHFKYIVPQGSPMTGIGNDYQVASLSNCFVIGVDGAADSYGA IIKIDEEQVQLMKRRGGVGHDLSHIRPKGSPVKNSALTSTGLVPFMERYSNSTREVAQDG RRGALMLSVSIKHPDSEAFIDAKMTEGKVTGANVSVKLDDAFMQAAVDEKPYVQQYPIDS AQPTFTKEIDASTLWKKIVHNAWKSAEPGVLFWDTIIRESVPDCYADLGYRTVSTNPCGE IPLCPYDSCRLLAINLYSYVVNPFKPDAYFDFDLFQKHVALAQRIMDDIIDLELEKIERI MTKIDEDPENEEVKHAERALWEKIYKKSGQGRRTGVGITAEGDMLAALGLRYGTEEATEF SEKVHKTVALGAYRSSVEMAKERGAFEIYNSEREQNNPFIQRLAAADPKLYEDMKKYGRR NIACLTIAPTGTTSLMTQTTSGIEPVFLPVYKRRRKVNPNDTNVHVDFVDETGDAFEEYI VFHHKFVTWMEANGYDPARRYTQEEIDELVAKSPYYKATSNDVDWLMKVKMQGRIQKWVD HSISVTINLPNDVDEDLVNRLYVEAWKSGCKGCTVYRDGSRSGVLISTKSDKDKKEGLPP CKPPTVVEVRPRILEADVVRFQNNKEKWVAFVGLLDGHPYEIFTGLQDDDEGILLPKSVT CGRIIKNVDEDGTKRYDFQFENKRGYKTTIEGLSEKFNKEYWNYAKLISGVLRYRMPIEQ VIKLVGSLQLNSESINTWKNGVERALKKYIQDGTEAKGKKCPNCGNETLVYQEGCLICTT CGASRCG >gi|222159271|gb|ACAB01000088.1| GENE 34 33369 - 35628 1578 753 aa, chain + ## HITS:1 COG:L94405 KEGG:ns NR:ns ## COG: L94405 COG1640 # Protein_GI_number: 15672678 # Func_class: G Carbohydrate transport and metabolism # Function: 
4-alpha-glucanotransferase # Organism: Lactococcus lactis # 399 749 3 348 489 320 46.0 6e-87 MILSFNIEYRTNWGEEVRISGLFPESIPLHTTDGIYWTAELELEVPQEGMTINYSYQIEQ NGIVIRKEWDSFSRSIFLSGSSRKIYRINDCWKNIPEQLYLYSSAFTEALLAHPEKENIP QRYKKGLVIKAYAPRINKDYCLAICGNQKSLGHWDPEKAVLMSDTNFPEWQIELDASKLK YPLEYKFILYNKQEKKADCWEKNPNRYLADPELKTNETLVIADRYVYFDIPAWKGAGIAI PVFSLKSEKSFGVGDFGDLKRMVDWAVNTRQKVIQILPVNDTTMTHAWTDSYPYNSISIY AFHPMYADIRQMGTLKDKEAVSKFSKKQKELNSLPAIDYEAVNQTKWEFFNLLFRQEGEK VLASKGFKDFFETNKEWLQPYAVFSYLRDAYKTPNFRQWPRHSVYQAEDIEKMCQPGTAD YPHISLYYYIQYHLHLQLLSATEYARQHGVVLKGDIPIGISRNSVEAWTEPHYFNLNGQA GAPPDDFSINGQNWGFPTYNWDVMEKDGYRWWMKRFQKMAEYFDAYRIDHILGFFRIWEI PMHAVHGLLGQFDPSLPMSREEIESYGLTFRDEYLLPFIHESFLGQLFGPHTHLVKQDFL ESVDDSGLYRMKPGFETQREVEQFFIGRNDEDSVWIREGLYSLISNVLFVADKKEEGKYH PRIGVQRDFVFRSLNEEEKNAFNRLYDQYYYHRHNEFWYQQAMKKLPQLTQSTRMLVCGE DLGMIPACVSSVMNDLRILSLEIQRMPKNPMYE Prediction of potential genes in microbial genomes Time: Wed May 18 03:15:39 2011 Seq name: gi|222159270|gb|ACAB01000089.1| Bacteroides sp. D1 cont1.89, whole genome shotgun sequence Length of sequence - 20366 bp Number of predicted genes - 23, with homology - 21 Number of transcription units - 11, operones - 7 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 55 - 420 221 ## COG1640 4-alpha-glucanotransferase + Prom 599 - 658 6.2 2 2 Op 1 . + CDS 722 - 1774 592 ## COG3594 Fucose 4-O-acetylase and related acetyltransferases 3 2 Op 2 . + CDS 1758 - 2129 165 ## PROTEIN SUPPORTED gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 - Term 2149 - 2213 16.8 4 3 Op 1 . - CDS 2262 - 2786 408 ## COG1803 Methylglyoxal synthase 5 3 Op 2 . - CDS 2790 - 3818 498 ## COG1216 Predicted glycosyltransferases 6 3 Op 3 . - CDS 3821 - 4744 772 ## COG1560 Lauroyl/myristoyl acyltransferase 7 3 Op 4 . - CDS 4748 - 6067 280 ## PROTEIN SUPPORTED gi|229245919|ref|ZP_04369978.1| SSU ribosomal protein S12P methylthiotransferase - Prom 6130 - 6189 5.8 + Prom 5975 - 6034 5.7 8 4 Op 1 . 
+ CDS 6178 - 7851 1294 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) 9 4 Op 2 . + CDS 7884 - 8072 87 ## gi|237714845|ref|ZP_04545326.1| predicted protein 10 4 Op 3 . + CDS 8011 - 8232 239 ## + Term 8256 - 8302 -0.7 - Term 8248 - 8286 4.3 11 5 Tu 1 . - CDS 8321 - 9229 795 ## COG1082 Sugar phosphate isomerases/epimerases - Prom 9280 - 9339 5.5 - Term 9326 - 9363 -0.3 12 6 Op 1 . - CDS 9379 - 10242 873 ## BT_2157 hypothetical protein 13 6 Op 2 9/0.000 - CDS 10287 - 11774 1360 ## COG0673 Predicted dehydrogenases and related proteins 14 6 Op 3 . - CDS 11787 - 12899 582 ## COG0673 Predicted dehydrogenases and related proteins - Prom 12920 - 12979 3.9 - Term 13035 - 13100 -0.8 15 7 Tu 1 . - CDS 13139 - 14620 919 ## BT_2160 putative regulatory protein - Prom 14810 - 14869 3.7 - Term 14815 - 14863 11.1 16 8 Op 1 27/0.000 - CDS 14885 - 15328 712 ## PROTEIN SUPPORTED gi|237714852|ref|ZP_04545333.1| 50S ribosomal protein L9 17 8 Op 2 11/0.000 - CDS 15343 - 15615 461 ## PROTEIN SUPPORTED gi|160885186|ref|ZP_02066189.1| hypothetical protein BACOVA_03184 18 8 Op 3 . - CDS 15618 - 15962 578 ## PROTEIN SUPPORTED gi|160885185|ref|ZP_02066188.1| hypothetical protein BACOVA_03183 - Prom 15987 - 16046 10.1 + Prom 15963 - 16022 9.6 19 9 Op 1 . + CDS 16123 - 16569 513 ## COG1846 Transcriptional regulators + Prom 16571 - 16630 5.5 20 9 Op 2 . + CDS 16710 - 16886 280 ## + Term 16897 - 16956 12.0 - Term 16887 - 16938 15.3 21 10 Op 1 40/0.000 - CDS 16964 - 17665 822 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 22 10 Op 2 . - CDS 17667 - 19223 1295 ## COG0642 Signal transduction histidine kinase - Prom 19330 - 19389 7.7 - Term 19414 - 19469 8.1 23 11 Tu 1 . 
- CDS 19555 - 20355 633 ## COG0480 Translation elongation factors (GTPases) Predicted protein(s) >gi|222159270|gb|ACAB01000089.1| GENE 1 55 - 420 221 121 aa, chain + ## HITS:1 COG:L94405 KEGG:ns NR:ns ## COG: L94405 COG1640 # Protein_GI_number: 15672678 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Lactococcus lactis # 2 117 372 487 489 114 43.0 3e-26 MSTLRGWWEEDYQQTQRYYNATLGHYGVAPTTATPELCEEIVRNHLNSNSILCILSFQDW LSIDGKWRNPNVAEERINVPSNPRNYWRYRMHLTLEQLMKAKTLNDKIGELIKYTGRDPN K >gi|222159270|gb|ACAB01000089.1| GENE 2 722 - 1774 592 350 aa, chain + ## HITS:1 COG:CAC3042 KEGG:ns NR:ns ## COG: CAC3042 COG3594 # Protein_GI_number: 15896293 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose 4-O-acetylase and related acetyltransferases # Organism: Clostridium acetobutylicum # 7 280 2 268 337 68 25.0 1e-11 MEGNQSRRIDFVDLTKGVCIILVVMAHVGGAFEQLDTNSMLSCFRMPLYFFISGVFFKSY EGLFGFILRKINKLIIPFLFFYISAFLMKYIVWKIAPGVFQLPVSWNELLVVFHGHDLIK FNPPIWFLLALFNCNILFYLIHFLREKHLPVMFAVTILIGCAGFYLGKLQIELPLYIDVS MTALPFYVAGFWIRRYNFFLYPSHRFDKLIPVFIVLALVVMYFTATTLGMRTNNYTGNIF QVYIAAFAGIFMIMLLCKKVKKIKVVSYLGRYSIITLSIHGPILHFLGPLVSRYIHNSWA QASALLLITLSICLLLTPVFLKVIPQMVAQKDLLKVKQDHTKKTVYEDKQ >gi|222159270|gb|ACAB01000089.1| GENE 3 1758 - 2129 165 123 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 [Streptococcus pneumoniae SP9-BS68] # 10 123 6 120 278 68 26 4e-11 MKINSSYILLKEIRCYAYHGVAPQENLIGNEYIIDLKLKVDISKAARTDEVTDTVNYAEV HQVIKNEMAVPSKLLEHVSGRIIQKLFDQFPCIEEIELRLSKRNPPMGADIESAGIELHC SKK >gi|222159270|gb|ACAB01000089.1| GENE 4 2262 - 2786 408 174 aa, chain - ## HITS:1 COG:TM1185 KEGG:ns NR:ns ## COG: TM1185 COG1803 # Protein_GI_number: 15643941 # Func_class: G Carbohydrate transport and metabolism # Function: Methylglyoxal synthase # Organism: Thermotoga maritima # 1 165 12 166 166 133 49.0 2e-31 MESKVRRGIGLVAHDAMKKDLIEWVLWNSELLMGNKFYCTGTTGTLILEALKEKHPDEEW 
DFTILKSGPLGGDQQMGSRIVDGQIDYLFFFTDPMTLQPHDTDVKALTRLAGVENIVFCC NRSTADHIISSPLFMDPDYERIHPDYSSYTKRFQDKPVVTEAVESVNRRKKKRK >gi|222159270|gb|ACAB01000089.1| GENE 5 2790 - 3818 498 342 aa, chain - ## HITS:1 COG:alr4493 KEGG:ns NR:ns ## COG: alr4493 COG1216 # Protein_GI_number: 17231985 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Nostoc sp. PCC 7120 # 3 270 9 276 295 128 28.0 1e-29 MKVSVVILNWNGCDMLRTFLPSVVRYSEGEGIEVCVADNGSTDASVTLLQQEFPSVRTIV LDQNYGFADGYNLALQQVDAEYVVLLNSDVEVTEHWLEPMIAYLDKHPEVAACQPKIRSQ RQKEYFEYAGAAGGFIDKYGYPFCRGRIMGVVEKDEGQYDTVIPVFWATGAALFIRRADY VNVGGLDGRFFAHMEEIDLCWRLRSRNREIVCVPQSIVYHVGGATLKKENPHKTFLNFRN NLVMLYKNLSQEELNKVMRIRTCLDYLAAFNFLLQGHWDNASAVMRARKEYKRLCPSFSL SREENMRKKTLNPIPERTKSSILWQFYARGCKRFSQLSDLKG >gi|222159270|gb|ACAB01000089.1| GENE 6 3821 - 4744 772 307 aa, chain - ## HITS:1 COG:NMA1630 KEGG:ns NR:ns ## COG: NMA1630 COG1560 # Protein_GI_number: 15794524 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lauroyl/myristoyl acyltransferase # Organism: Neisseria meningitidis Z2491 # 4 294 2 284 289 82 27.0 7e-16 MRSKLIYWLVYSGMWLFSALPFRVLYMLSDLNYLLMYRVGKYRRKVVRGNLLRSFPEKTD AERLQIERKFYRYLSDYMLEDLKLLHMSAEELCERMTYKNTEQYLELTEKYGGIIVMIPH YANYEWLIGMGAIMKPGDVPVQVYKPLRDKYLDELFKRIRSRFGGYNIPKHSTAREIIKL KHDGKKMVVGLITDQWPSGYDKYWTTFLGQETAFLNGAERIAKMMNFPVFYCELSKKRRG YCEAEFKLMTETPKETREGEITEMFAHRLEQTIRKEPAYWLWSHKRWKMTREEADRLEEK ELNKKKE >gi|222159270|gb|ACAB01000089.1| GENE 7 4748 - 6067 280 439 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229245919|ref|ZP_04369978.1| SSU ribosomal protein S12P methylthiotransferase [Catenulispora acidiphila DSM 44928] # 153 388 224 461 529 112 28 2e-24 MIDTTVFQNKTAVYYTLGCKLNFSETSTIGKILREAGVRTARKGEKADICVVNTCSVTEM ADKKCRQAIHRLVKQHPDAFVVVTGCYAQLKPGDVAKIDGVDVVLGAEQKGELLQYLGDL QKHEKGEAITTTTKDIRSFSPSCSRGDRTRFFLKVQDGCDYFCSYCTIPFARGRSRNGTI ASMVEQARQAAAEGGKEIVLTGVNIGDFGKTTGESFFDLVKALDQVEGIERYRISSIEPN LLTDEIIEFVSHSRSFMPHFHIPLQSGCDEVLQLMRRRYDTALFASKVKKIKEVMPDAFI 
GVDVIVGTRGETPEYFEQAYQFIDGLDVTQLHVFSYSERPGTQALKIEYVVSPEEKHQRS QRLLALSDQKTQAFYARHIGQTMPVLMEKSKAGAPMHGFTENYIRVEVESDDSLDNQVIN VRLGEFNEEMTALKGTILI >gi|222159270|gb|ACAB01000089.1| GENE 8 6178 - 7851 1294 557 aa, chain + ## HITS:1 COG:aq_999_1 KEGG:ns NR:ns ## COG: aq_999_1 COG1022 # Protein_GI_number: 15606303 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Aquifex aeolicus # 28 554 15 503 600 235 31.0 2e-61 MIKENFIKLYENSFRENWDLPCYTDYGEDTQYTYGEVAEKIARLHLLFKHCSLRRGDKIS VIGKNNAHWCIAYMATITYGAIIVPILQDFTPNDVHHIVNHSESVFLFTSDSIWENLEEE KLTGLRGVFSLTDFRCLYQRDGETIQKFLKNTDKEMHALYPKGFTREDVQYTTLSNDKVM LLNYTSGTTGFSKGVMLTGNNLAGNVTFGIRTELLKKGDKVLSFLPLAHAYGCAFDFLTA TAVGTHVTLLGKTPSPKIIMKAFEEVKPNLIITVPLVIEKIYKNIIQPLINKKGMKWALN IPLLDTQIYNQIRKRLIDALGGRFKEIIIGGAAMDKEVEEFFYKIKFPFTIGYGMTECGP LISYAPWDEFVLGSSGKILDIMEARIYKETPEAETGEIQVRGENVMVGYYKNQEATQEVF TQDGWLRTGDLGSMDSNGNIFIRGRLKTMILSSSGQNIFPEELETKLNNLPFILESLVIE RNKKLVALVYADYEALDSLGLNNPDNLKTIMDENLKNLNSNVAAYEKISKIQLYPTEFEK TPKRSIKRYLYNSIAVD >gi|222159270|gb|ACAB01000089.1| GENE 9 7884 - 8072 87 62 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714845|ref|ZP_04545326.1| ## NR: gi|237714845|ref|ZP_04545326.1| predicted protein [Bacteroides sp. 
D1] # 1 62 1 62 62 92 100.0 1e-17 MASYKKKYKKSWRFENNILKKYIPLHRFNKAIDCITFLKSKENEKISFDGRSYRSSIIRI MW >gi|222159270|gb|ACAB01000089.1| GENE 10 8011 - 8232 239 73 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKLVLMAVAIVAVSFASCGNKAADAAKATADSIRIADSIAAVEAAALEAEQAAAAAADS LNADSTATETVAE >gi|222159270|gb|ACAB01000089.1| GENE 11 8321 - 9229 795 302 aa, chain - ## HITS:1 COG:lin2265 KEGG:ns NR:ns ## COG: lin2265 COG1082 # Protein_GI_number: 16801329 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Listeria innocua # 29 292 2 241 246 117 31.0 2e-26 MKTRIYICALAALFMIPLAVTAQTKKKAKKEVAIQLYSVRDILNKVDNKNGKCDPTYTAL LKKLANMGYTGVEAANYNNGKFYDRTPQQFKKDVESAGLKVLSSHCTRQLSKEELASGDY SKSLEWWDQCIADHKAAGMKYIVAPWMDVPKTLKDLETYCAYYNEIGKRCKQQGLSFGYH NHAHEFQKVEDKVMYDYMLEHTNPEYVFFQMDLYWVVRGQNSPVDYFNKYPGRFQIFHVK DHREIGQSGMVGFDAIFKNAKTAGVNYLVAEIEKYSVPVEESVEVSLDYLLNAPFVKSSY AK >gi|222159270|gb|ACAB01000089.1| GENE 12 9379 - 10242 873 287 aa, chain - ## HITS:1 COG:no KEGG:BT_2157 NR:ns ## KEGG: BT_2157 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 287 1 290 290 536 90.0 1e-151 MKKVFYPLACCCLAAGVFASCGGQKKANAPEEQKVALTYSKSLKAAETDSLNLPVDENGY ITIFDGETFNGWRGYGKDRVPSKWTIEDGCIKFNGSGGGEAQDGDGGDLIFAHKFKNFEL EMEWKVSKGGNSGIFYLAQEVTSKDKDGNDVLEPIYISAPEYQVLDNANHPDAKLGKDNN RQSASLYDMIPAVPQNSKPFGEWNKAKIMVYKGTVVHGQNDENVLEYHLWTKQWTDMLQA SKFSEEKWPLAFELLNNCGGENHEGFIGLQDHGDDVWFRNIRVKVLD >gi|222159270|gb|ACAB01000089.1| GENE 13 10287 - 11774 1360 495 aa, chain - ## HITS:1 COG:lin2266 KEGG:ns NR:ns ## COG: lin2266 COG0673 # Protein_GI_number: 16801330 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Listeria innocua # 41 195 3 164 358 67 29.0 5e-11 MSNISRRKFLKTGAAALAGITIAPSTILGMSHGHVSPTDKLNLAAVGIGGMGHANINHVK GTENIVALCDVDWKYAKGVFDEFPKAKKYWDYRKMYDEMGKSIDGVIIATADHTHAIITA DAMTMGKHVYCQKPLTHSVYESRLLTKLAASTGVVTQMGNQGASDEGTDLVCEWIWNGEI 
GDVTKVECATDRPIWPQGLNVPEKVDKIPSTLNWDLFTGPAKMNPYNAIYHPWNWRGWWD YGTGALGDMACHILHQPFKALKLQYPTKVEGSSTLLLNACAPQAQHVKMIFPARENMPKV AMPEVEIHWYDGGMMPERPKGFPEGKQLMQSGGGLTIFHGTKDTLICGCYGQNPWLLSGR KPNAPKVCRRVPKAMNGGHEMDWVRACKEDKSNRIMTKSDFSEAGPMNEMVAMGVLAIRL QALNKTLEWDGANMCFTNIGDNETIRTVIKDGFKIHDGHPTFDKTWTDPINAKQFAAELV KHNYREGWRLPDMPR >gi|222159270|gb|ACAB01000089.1| GENE 14 11787 - 12899 582 370 aa, chain - ## HITS:1 COG:lin2932 KEGG:ns NR:ns ## COG: lin2932 COG0673 # Protein_GI_number: 16801991 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Listeria innocua # 10 192 4 184 333 77 26.0 4e-14 MNTQSSVDTRIKVGIIGFGRMGRFYWEAMTKSDRWNIAYICDTNPASRQLAKKLSPNSII LEDNQKIFEDESVQVVGLFTLADSRMEQIEKAIRYGKHIISEKPVADTMENEWKVVEMTE NTNLISTVNLYLRNSWYHNLMKEYIQQGEIGELAIIRICHMTPGLAPGEGHEYEGPAFHD CGMHYVDITRWYAGCDYRTWNAQGVNMWNYKDPWWVQCHGTFQNGVVFDITQGFVYGQLS KDQTHNSYVDIIGTKGVVRMTHDFNTAVVDLHGVNQTIRVEKPFGGKNIDVLCDLFADSV ETGKRSSRLPLMRDSAIASEYAWTFLKDTRKHDLPAIGNLRTLEQIRERRKNMKNGYGLL HGNLPKIINP >gi|222159270|gb|ACAB01000089.1| GENE 15 13139 - 14620 919 493 aa, chain - ## HITS:1 COG:no KEGG:BT_2160 NR:ns ## KEGG: BT_2160 # Name: not_defined # Def: putative regulatory protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 487 75 561 561 686 76.0 0 MLKSPGLTLEGEYRINLRLYNEYKKFHIDSAIHYVDRNIEISRRLNKPYFTNQSFLNLGL LYSMCGRFREAEIILKSIKTSELPRDLLINYYQTYSSFWGHYSISVANNLYGKQQAAYQD SLFALIDHTSWDYRMSQASYYIWRDTLKSKEIFKELLEIEEVGTPNYAMITHSYSRLCHH QKKYDEEKKYLMLSAIADTRNATRENASLQSLALIAYDEQNLADAFKFTQSAIDDVISSG IHFRAIEIYKFNSIINTAYQAEQARSRSHLTTFLISTSIILFLLVLLVLFIYIQMKKTLK IKQALAQSNEELLRLNNKLNSMNSQLNDTNNQLCEINSIKEYYIAEFFDVCFSYIHKMEK YQNMLYKIAINKYYDELIKKLKSSALIDDELSALYTRFDKVFLGLYPTFVSDFNALLKDE EKIILKPDALLNRELRIYALLRLGITDSGKIANFLRCSTSTVYNYRTKMRNKAAVDRDEF ENEIMKISSTQET >gi|222159270|gb|ACAB01000089.1| GENE 16 14885 - 15328 712 147 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237714852|ref|ZP_04545333.1| 50S ribosomal protein L9 [Bacteroides sp. 
D1] # 1 147 1 147 147 278 100 2e-74 MEIILKEDVVNLGYKNDIVNVKSGYGRNYLIPTGKAIIASPSAKKMLAEELKQRAHKLEK IKKDAEAMAAKLEGVSLTIATKVSSTGTIFGSVGNIQIAEALSKLGHEVDRKIIVVKDAV KEVGSYKAIVKLHKEVSVEIPFEVVAE >gi|222159270|gb|ACAB01000089.1| GENE 17 15343 - 15615 461 90 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160885186|ref|ZP_02066189.1| hypothetical protein BACOVA_03184 [Bacteroides ovatus ATCC 8483] # 1 90 1 90 90 182 100 2e-45 MAQQVQSEIRYLTPPSVDVKKKKYCRFKKSGIRYIDYKDPEFLKKFLNEQGKILPRRITG TSLKFQRRIAQAVKRARHLALLPYVTDMMK >gi|222159270|gb|ACAB01000089.1| GENE 18 15618 - 15962 578 114 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160885185|ref|ZP_02066188.1| hypothetical protein BACOVA_03183 [Bacteroides ovatus ATCC 8483] # 1 114 1 114 114 227 100 6e-59 MNQYETVFILTPVLSDVQMKEAVEKFKGVLQAEGAEIINEENWGLKKLAYPIQKKSTGFY QLIEFNADPTVIDKLEINFRRDERVIRFLTFKMDKYAAEYAAKRRSVKSNKKED >gi|222159270|gb|ACAB01000089.1| GENE 19 16123 - 16569 513 148 aa, chain + ## HITS:1 COG:FN2010 KEGG:ns NR:ns ## COG: FN2010 COG1846 # Protein_GI_number: 19705306 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Fusobacterium nucleatum # 25 148 17 141 160 64 31.0 7e-11 MIEQFNFDIRLIFAILNGKVSAAINRKLYRNFRQNGLEISPEQWTVLIFLWEKDGVTQQE LCNATFKDKPSMTRLIDNMERQHLVVRISDKKDRRTNLIHLTKDGKELEEKARVIAGQTL KEALHGITLDELSIGQEVLKKVFYNTKD >gi|222159270|gb|ACAB01000089.1| GENE 20 16710 - 16886 280 58 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKGLLKNLGLILILVGVVILLACSFTGNVNNNAILGTSVVLVVLGLISYIVINKKIAD >gi|222159270|gb|ACAB01000089.1| GENE 21 16964 - 17665 822 233 aa, chain - ## HITS:1 COG:lin2728 KEGG:ns NR:ns ## COG: lin2728 COG0745 # Protein_GI_number: 16801789 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Listeria innocua # 6 226 3 221 225 141 38.0 1e-33 MDEKLRILLCEDDENLGMLLREYLQAKGYSAELYPDGEAGYKAFLKNKYDLCVFDVMMPK KDGFTLAQDVRAANAEIPIIFLTAKTLKEDILEGFKIGADDYITKPFSMEELTFRIEAIL 
RRVRGKKNKESNIYKIGKFTFDTQKQILSSEGKQTKLTTKESELLGLLCAHANEILQRDF ALKTIWIDDNYFNARSMDVYITKLRKHLKEDDSIEIINIHGKGYKLITPEVES >gi|222159270|gb|ACAB01000089.1| GENE 22 17667 - 19223 1295 518 aa, chain - ## HITS:1 COG:CAC1701 KEGG:ns NR:ns ## COG: CAC1701 COG0642 # Protein_GI_number: 15894978 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 283 513 341 564 566 125 36.0 3e-28 MKKSTIWILGIVMGLSFLSLLYLQVSYIEEMMKTRKEQFDSAVRNSLDQVSKDVEYAETR RWLIEDISEAERKALIANNASIQQDNLIQQTQRFTVKSKDGKVYSDFELKVMTTKPSELP KAMISPYRGTKTIPETSRSLVEAIKNRYMYQRALLDEVAWQMIYRGSDKSIGDRVRFKEL DDYLKSSLYNNSIDLPYHFTVIDKDGREVYRCADYEAKGSEDAYQQALFKNDPPAKMSIL KVHFPGKKDYIFDSISFMIPSLIFTLVLLVTFIFTIYIVFRQKKLTEMKNDFINNMTHEF KTPISTISLAAQMLKDPAVGKSPQMFQHISGVINDETKRLRFQVEKVLQMSMFERQKATL KMKEIDANELISGVINTFALKVERYNGKITSNLEATDPVIFADEMHITNVIFNLMDNAVK YKKPEEDLELKVRTWNESGKLMISIQDNGIGIKKENLKKIFEKFYRVHTGNLHDVKGFGL GLAYVRKIILDHKGTIRAESDLNVGTKFIIALPLLKNN >gi|222159270|gb|ACAB01000089.1| GENE 23 19555 - 20355 633 266 aa, chain - ## HITS:1 COG:TM1651 KEGG:ns NR:ns ## COG: TM1651 COG0480 # Protein_GI_number: 15644399 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Thermotoga maritima # 4 266 447 681 683 189 42.0 4e-48 MKWRLENNEKLQVKYEEPKIPYRETITKAARADYRHKKQSGGAGQFGEVHLIVEPYKEGM PVPETYKFNGQEFKITVRGTEEIPLEWGGKLVFVNSIVGGSIDARFLPAIMKGIMSRMEQ GPLTGSYARDVRVIVYDGKMHPVDSNEISFMLAGRNAFSEAFKNAGPKILEPIYDVEVFV PSDRMGDVMGDLQGRRAMIMGMSSEKGFEKLVAKVPLKEMSSYSTALSSLTGGRASFIMK FSSYELVPTDVQDKLMKDFEAKQTEE Prediction of potential genes in microbial genomes Time: Wed May 18 03:16:07 2011 Seq name: gi|222159269|gb|ACAB01000090.1| Bacteroides sp. D1 cont1.90, whole genome shotgun sequence Length of sequence - 15901 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 10, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 1345 1336 ## COG0480 Translation elongation factors (GTPases) - Prom 1402 - 1461 4.7 + Prom 1813 - 1872 5.9 2 2 Op 1 . + CDS 1894 - 3024 1004 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases 3 2 Op 2 . + CDS 3044 - 3598 576 ## BT_2169 RNA polymerase ECF-type sigma factor 4 2 Op 3 . + CDS 3689 - 4123 254 ## BT_2170 hypothetical protein 5 2 Op 4 . + CDS 4150 - 5028 558 ## COG3712 Fe2+-dicitrate sensor, membrane component + Prom 5070 - 5129 2.5 6 3 Op 1 . + CDS 5249 - 7732 1628 ## BT_2172 hypothetical protein 7 3 Op 2 . + CDS 7740 - 8738 700 ## BT_2173 hypothetical protein + Prom 8837 - 8896 1.7 8 4 Op 1 . + CDS 8927 - 9208 175 ## BT_2176 hypothetical protein 9 4 Op 2 . + CDS 9205 - 9813 339 ## COG2431 Predicted membrane protein + Prom 9844 - 9903 6.8 10 5 Tu 1 . + CDS 10106 - 10462 441 ## BT_2178 hypothetical protein + Term 10487 - 10528 7.9 - Term 10477 - 10512 5.8 11 6 Tu 1 . - CDS 10526 - 11593 1032 ## BT_2179 putative DNA mismatch repair protein - Prom 11706 - 11765 5.7 + Prom 11562 - 11621 4.6 12 7 Tu 1 . + CDS 11777 - 12469 622 ## COG1011 Predicted hydrolase (HAD superfamily) - Term 12463 - 12521 8.0 13 8 Tu 1 . - CDS 12544 - 13323 515 ## BT_2181 transcriptional regulator - Prom 13401 - 13460 3.6 + Prom 13290 - 13349 5.1 14 9 Tu 1 . + CDS 13411 - 14076 635 ## COG3506 Uncharacterized conserved protein + Term 14090 - 14153 5.4 15 10 Op 1 . - CDS 14661 - 15713 724 ## BT_2183 hypothetical protein 16 10 Op 2 . 
- CDS 15752 - 15901 82 ## gi|294647177|ref|ZP_06724776.1| conserved hypothetical protein Predicted protein(s) >gi|222159269|gb|ACAB01000090.1| GENE 1 1 - 1345 1336 448 aa, chain - ## HITS:1 COG:FN1546 KEGG:ns NR:ns ## COG: FN1546 COG0480 # Protein_GI_number: 19704878 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Fusobacterium nucleatum # 1 447 3 452 690 287 36.0 3e-77 MKVYQTNEIKNIALLGSSGSGKTTLVEAMLFESGVIKRRGSVAAKNTVSDYFPVEQEYGY SVFSTVLHVEWNNKKLNIIDCPGSDDFVGSTVTALNVTDTAIILLNGQYGVEVGTQNHFR YTEKLNKPVIFLVNQLDNEKCDYDNILEQLKEAYGSKVVPIQYPIATGPGFNALIDVLLM KKYSWKPEGGAPTIEDIPAEEMDKAMEMHKALVEAAAENDENLMEKFFEQDSLSEDEMRE GIRKGLIARGMFPVFCVCGGKDMGVRRLMEFLGNVVPFVSEMPKVQNTEGKEVAPDSNGP ESLYFFKTSVEPHIGEVSYFKVMSGKVHEGDDLLNADRGSKERIAQIYVVAGGNRVKVEE LQAGDIGAAVKLKDVKTGNTLNGKDCDYKFNFIKYPNSKYTRAIKPVNEADVEKMMSILN RMREEDPTWVIEQSKELKQTLVHGQGEF >gi|222159269|gb|ACAB01000090.1| GENE 2 1894 - 3024 1004 376 aa, chain + ## HITS:1 COG:SPy1040 KEGG:ns NR:ns ## COG: SPy1040 COG0635 # Protein_GI_number: 15675037 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Streptococcus pyogenes M1 GAS # 5 368 9 369 376 237 33.0 3e-62 MAGIYLHIPFCKTRCIYCDFYSTTRSELKTRYVQTLCRELAMRKEYLKGEDIETIYFGGG TPSQLEKEDFEQIFDTIRKHYGLNHCQEITLEANPDDLSQEYLGMLSSLPFNRLSMGIQT FDDATLKLLRRRHNARTAIEAIDRCRKAGFQNISIDLIYGLPGETKERWENDLRQAISLN VEHISAYHLIYEEDTPIYNMLKQHQISEVDEDSSLEFFTLLIEHLQKAGFEHYEISNFCR PGKYSRHNTSYWKGIAYLGCGPSAHSFDGMTREWNVSSIDTYIKGIEEDCRAFETEYLDP TTRYNEFIITTIRTVWGTPIEKLKQMFGNEMWEYCQKMAAPYLKNGKLEEYNGALRLTRE GIFISDSIMSDLLWVD >gi|222159269|gb|ACAB01000090.1| GENE 3 3044 - 3598 576 184 aa, chain + ## HITS:1 COG:no KEGG:BT_2169 NR:ns ## KEGG: BT_2169 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 181 1 181 182 283 89.0 2e-75 MTESEVRKLLRQMKELDSQTAFRDFYNMTYDRFFRIAYYYVKQEEWSQEIVLDVFLKLWK 
QRSNLLDVRNIEDYCFILVKNASLNYLEKESRHTLIHPDSLPEPQEQSYSPEESLISEEL FALYVKALDRLPDRCREVFIRIREEKQSYAQVAEELGISMNTVDAQLQKAITRLKEMISR AEID >gi|222159269|gb|ACAB01000090.1| GENE 4 3689 - 4123 254 144 aa, chain + ## HITS:1 COG:no KEGG:BT_2170 NR:ns ## KEGG: BT_2170 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 143 1 143 144 196 71.0 2e-49 MKKLNDANMGIFVLLSVCLCLFSCNNDNDNLPKDYAGFEHSKETVECESDKPECELKIKI VATEKTKEDRTVVLATPPPVVGQAAVVQLTEKKVIIKAGKKSATTVIKIYPKQMILKKQN VTLSCTPQWKEGGISKLTILLKRK >gi|222159269|gb|ACAB01000090.1| GENE 5 4150 - 5028 558 292 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 74 253 115 290 331 85 32.0 9e-17 MKYTDTELEEILNKLVASTRSPRGRFSATTSYPELEKRLKSHTRRLTLIRTFSAAAAVTL LCLSVWTAYLYMQPVTIQTVSTLAETRTVCLPDGSTVTLNHYSSLSYPEKFKSDKREVKL NGEAYFEVNKNKKHPFIVQTETIDVQVLGTHFNVDAYQNNPAVKTTLLTGSVAVSNKSKS VRVILKPNEIAIYNKVEEKLTRKVLENVEDEISWRQGEFIFDDLPLQEIARELSNSFEAT IHIADTALQNYRITARFRNGEDLATILSVLHNAGYFDYSQNKKQIIITAKPD >gi|222159269|gb|ACAB01000090.1| GENE 6 5249 - 7732 1628 827 aa, chain + ## HITS:1 COG:no KEGG:BT_2172 NR:ns ## KEGG: BT_2172 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 827 74 897 897 1388 83.0 0 MQVKDKPLREILDLAFKNEQISYQFTGRHILLQKKKESKTVSRKFTISGYVTDGTSSETL IGTNIIESHQNQGTTTNPYGFYSITLPEGETELRFSYLGYATEAHHFTLSQDTLLNIRMQ GNTQLQEVVIVSDKTETGTIATQMGSIEIPMTQIKNTPSILGEADVMKAIQLMPGVQAGV DGSAGLYIRGGSPDQNLILLDGIPVYNVDHMFGFFSVFTPEAVKKVTLFKGSFPARFGGR LSSVIDVRTNDGDMQKYHGTLSIGLLTSKINLEGPIVKGKTSFNISARRSYVDLIAKPFM PDDEEYSYYFYDINAKINHKFSDRSRVYLSAYNGKDHFAANYDGNTDFKDGSKMNWGNTI LSARWNYVFNNRLFCNTTVSYSNYLFDINSYTNNQYFGSSGTSFTNRYSADYRSGINDWN YQIDFDYNPSPKHHLKFGTGYIYHRFRPEVMTSKISNKTGDKIDLDTTYHSIANNRIYGH 
ELSAYLEDNIKMNDRLRLNLGLHFSLFHVQKQSYFSLQPRVSARYQLGKDVTLKASYTQM SQYVHLLSSMPIAMPTDLWVPVTKKIKPMRSHQYSLGGYYTGIKGWEFSVEGYYKDMYNV LEYKEGVSFFGSSSGWENKVEMGKGRSAGIEFMAQKTLGRTTGWLSYTLSKSDRQFAKGG INNGERFPYKYDRRHNINLTINHKFSDRIDIGASWVFYTGGTSTIPEEKTAVIRPSDGTN NGFGGGYGYGGYFDSNITSPTIGEASYVEHRNNYRLPASHRLNVGVNFNKKTKHGMRTWN ISLYNAYNAMNPTFVYRSTSKNDPNKPIIKKYTILPLIPSFTYTYKF >gi|222159269|gb|ACAB01000090.1| GENE 7 7740 - 8738 700 332 aa, chain + ## HITS:1 COG:no KEGG:BT_2173 NR:ns ## KEGG: BT_2173 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 329 1 333 337 435 66.0 1e-120 MRTRICYPFILLIALLTTVSCENELPFSVKDNPPKLVMNALINADSLTNVLYLNFTGRGY ATHVENATVEVRVNGQLSESLRPLPPQTEGDMQCRFHISNKFTPGDVVRIDALTDDGQYH AWAEVTVPQRPHEIADIDTVTIPMTKYYYTQNFLRYKINIKDRSNEDNYYRLIMDKQMTV KDYNEETGEFVSRTIHRYHFISREDIVLTDGQPTNSDDEDNGMFDTVKNIYGVFDDSRFK NTSYTMTVYNQTDIDGFPEYGTNVKMDIIVRLLSITETEYYYLKALNLVDSDAYDETINE PIKYPSNVHGGIGMIGISTETSKIIHIEKPQR >gi|222159269|gb|ACAB01000090.1| GENE 8 8927 - 9208 175 93 aa, chain + ## HITS:1 COG:no KEGG:BT_2176 NR:ns ## KEGG: BT_2176 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 92 1 94 95 105 84.0 5e-22 MFIIIGIMLTGMLLGYLLRSKRLSWIHKIITLLIWILLFLLGIDVGGNESIIKGLYTLGL EAVVITVAAVIGSTLCAWGLWYLLYSKGKETKA >gi|222159269|gb|ACAB01000090.1| GENE 9 9205 - 9813 339 202 aa, chain + ## HITS:1 COG:FN1083 KEGG:ns NR:ns ## COG: FN1083 COG2431 # Protein_GI_number: 19704418 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 5 200 2 195 198 113 38.0 2e-25 MKGSLIIVSFFIIGTLCGVYHLIPYDFTDSKLSYYALCGLMFCVGISIGNDPNTLKSFRS LNPRLVFLPIMTILGTLAGCAIAGAFMSQRSPLDCMAVGAGFGYYSLSSIFITEYKGPEL GTIALLSNIMREIIALLCAPLLVKYFGKLAPISVGGATTMDTTLPIITRYSGKEFVIISI FHGFVVDFSVPFLVTFLCSISF >gi|222159269|gb|ACAB01000090.1| GENE 10 10106 - 10462 441 118 aa, chain + ## HITS:1 COG:no KEGG:BT_2178 NR:ns ## KEGG: BT_2178 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 118 1 118 118 183 93.0 2e-45 MKRLGLTLVAALCLAASTFAAGNQPTTAKWEGNINVSKLGKYLKLNSDQSEEVANICDYF STQMSRATTAKKDKEAKLRNAVYGNLKLMRKTLSAEQYAKYAALMNITLQNKGIELNK >gi|222159269|gb|ACAB01000090.1| GENE 11 10526 - 11593 1032 355 aa, chain - ## HITS:1 COG:no KEGG:BT_2179 NR:ns ## KEGG: BT_2179 # Name: not_defined # Def: putative DNA mismatch repair protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 355 1 354 354 629 89.0 1e-179 MKIGDKVRFLSEVGGGIVTGFQGKDFVLVEDADGFDIPMPIRECVVIETDDYNMRRKPGS SAPKPEEPVKPVKPEMPVIQRQPEVRGGDTLNVFLAYVPEDAKAMMTTPFETYLVNDSNY YLYYTYLSAEGKAWKNRSHGLVEPNTKLLLEEFTKDVLNDMERVAVQLIAFKDGKPAAIK PAVSVEIRIDTVKFYKLHTFSDSDFFEEPALIYDIVKDDMPAKQVYVSAEEIQEALLQKK SVDKPKSQPIVKPNHTTHGGKSGIIEIDLHIDSLLDDTQGMGNAEILNYQLDKFREVMET YKNKREQKIVFIHGKGDGVLRKAVIDELKRKYSNCRYQDASFQEYGFGATMVTIK >gi|222159269|gb|ACAB01000090.1| GENE 12 11777 - 12469 622 230 aa, chain + ## HITS:1 COG:mlr6523 KEGG:ns NR:ns ## COG: mlr6523 COG1011 # Protein_GI_number: 13475450 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Mesorhizobium loti # 5 226 7 227 238 192 43.0 3e-49 MKELIKVIAFDADDTLWSNEPFFQEIEKQYTDLLKPYGTSEDISAALFQTEMNNLKYLGY GAKAFTISMVETALHVSGQKISGTDIQHIIELGKSLLKMPIELLPGVKETLKVLKEKGKY KLVVATKGDLLDQENKLERSGLASYFDHIEVMSDKTEKEYQRMLNILQIAPSEFVMIGNS LKSDIQPVLSLGGYGIHIPFEVMWKHEVVDTFTHDHLKQVKRFDELLLLF >gi|222159269|gb|ACAB01000090.1| GENE 13 12544 - 13323 515 259 aa, chain - ## HITS:1 COG:no KEGG:BT_2181 NR:ns ## KEGG: BT_2181 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 257 1 257 257 459 89.0 1e-128 MDVLQKEIDEVYATQLIADEILDNGVVEQHQRFIHSLTEINGGCAVISDLSNRKSYIAVH PWAHFLGLTPEEAALSVIDSMDEDCIYRRIHPEDLVEKRLLEYQFFQKTFSMSSEERLKY RGRCRIRMMNEKGVYQYIDNLVQIMENTPSGSVWLIFCLYTLSADQRTEQGIYPTITHME RGEVETLFLSEEHRNILSEREKEILRCIRKGLSSKEIAAALYISVNTVNRHRQNILEKLS VGNSIEACRAAELMKLLDL >gi|222159269|gb|ACAB01000090.1| GENE 14 13411 - 14076 635 221 aa, 
chain + ## HITS:1 COG:all7165 KEGG:ns NR:ns ## COG: all7165 COG3506 # Protein_GI_number: 17233181 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. PCC 7120 # 28 204 1 176 183 160 42.0 2e-39 MKHLKIKMMSIAFMAVTTSSMAQSLNKMNWLNEPQQWEIKDGKTLVMDVPAKTDFWRISH YGFTVDDGPFYYATYGGEFEAKVKITGNYVTTFDQMGLMLRIDHENWIKAGVEYIDGKQN VSAVVTHRTSDWSVVQLPDAPRSIWIKAVRRLDAVEIFFSRDDKEYIMMRTCWLQDNCPV MVGLMGACPDGKGFTATFEEFKVTPLADQRRLEWAKRQINK >gi|222159269|gb|ACAB01000090.1| GENE 15 14661 - 15713 724 350 aa, chain - ## HITS:1 COG:no KEGG:BT_2183 NR:ns ## KEGG: BT_2183 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 348 1 348 351 578 82.0 1e-163 MKIPNIRLLNQQLLSPLFFQPKELVSWMGAMQAQNYSMVKWAVGMRLKSATIQTVEKALR DGEILRTHVMRPTWHLVAAEDIRWMLKLSAGRIISANESYAKGHDLEISEELYTKSHNLL EKILCGKKSLTRQEIAEHFNRSGIVADNHRMTRFMARAEQVGIVCSGEDKGSKCTYALLE ERVPPMPELTKDESLARLARSYFRSHAPAVLQDFVWWSGLPITDARQAIYLIDSELTAEE WNGQTWYIHEDCRTRGKVTGSLHLLPSYDEYLLGYKDRTDVLPKEHYSKAFTNNGLFYPI VLHEGQVIGNWDKSVKKRGSLIEHSWFRLDDCVDEGALDREKDKYIRFWR >gi|222159269|gb|ACAB01000090.1| GENE 16 15752 - 15901 82 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294647177|ref|ZP_06724776.1| ## NR: gi|294647177|ref|ZP_06724776.1| conserved hypothetical protein [Bacteroides ovatus SD CC 2a] # 1 49 372 420 420 93 100.0 4e-18 GNLLKIKDNYPKVVVSGEKMFENTYEGIEHIYIRDFLSSVLLHSITRVI Prediction of potential genes in microbial genomes Time: Wed May 18 03:16:56 2011 Seq name: gi|222159268|gb|ACAB01000091.1| Bacteroides sp. D1 cont1.91, whole genome shotgun sequence Length of sequence - 31336 bp Number of predicted genes - 33, with homology - 33 Number of transcription units - 14, operones - 8 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 1086 684 ## COG1373 Predicted ATPase (AAA+ superfamily) - Prom 1156 - 1215 4.0 + Prom 1087 - 1146 6.5 2 2 Op 1 . 
+ CDS 1295 - 2524 1262 ## COG0128 5-enolpyruvylshikimate-3-phosphate synthase 3 2 Op 2 . + CDS 2537 - 3106 388 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases 4 2 Op 3 . + CDS 3181 - 3897 567 ## COG0300 Short-chain dehydrogenases of various substrate specificities 5 3 Op 1 . + CDS 4526 - 7759 2632 ## BF4448 hypothetical protein 6 3 Op 2 . + CDS 7773 - 9785 1765 ## Coch_1140 RagB/SusD domain protein 7 4 Tu 1 . + CDS 10132 - 10881 360 ## BT_1587 hypothetical protein - Term 10874 - 10923 3.2 8 5 Tu 1 . - CDS 10938 - 11936 730 ## COG2234 Predicted aminopeptidases - Prom 12114 - 12173 6.2 + Prom 12004 - 12063 6.0 9 6 Op 1 . + CDS 12109 - 12552 456 ## BT_2205 hypothetical protein 10 6 Op 2 . + CDS 12552 - 13361 708 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 11 6 Op 3 . + CDS 13375 - 13788 493 ## COG0802 Predicted ATPase or kinase 12 6 Op 4 . + CDS 13785 - 14024 186 ## BT_2208 hypothetical protein 13 6 Op 5 6/0.000 + CDS 14085 - 14378 271 ## COG1669 Predicted nucleotidyltransferases 14 6 Op 6 . + CDS 14362 - 14739 248 ## COG2361 Uncharacterized conserved protein 15 7 Op 1 23/0.000 - CDS 14734 - 16044 913 ## COG1721 Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) - Prom 16064 - 16123 6.4 16 7 Op 2 . - CDS 16226 - 17200 1039 ## COG0714 MoxR-like ATPases 17 7 Op 3 . - CDS 17212 - 18483 679 ## BT_2213 hypothetical protein 18 7 Op 4 . - CDS 18480 - 19097 480 ## BT_2214 hypothetical protein 19 7 Op 5 . - CDS 19114 - 20052 802 ## BT_2215 hypothetical protein 20 7 Op 6 . - CDS 20033 - 20995 617 ## COG1300 Uncharacterized membrane protein - Prom 21034 - 21093 4.8 + Prom 21012 - 21071 2.6 21 8 Tu 1 . + CDS 21097 - 21822 473 ## COG1714 Predicted membrane protein/domain + Term 21859 - 21895 3.7 - Term 21957 - 22002 7.9 22 9 Op 1 . - CDS 22035 - 22370 521 ## BT_2228 hypothetical protein 23 9 Op 2 . 
- CDS 22403 - 22717 170 ## PROTEIN SUPPORTED gi|124485582|ref|YP_001030198.1| ribosomal protein L12E/L44/L45/RPP1/RPP2-like protein 24 9 Op 3 . - CDS 22777 - 26583 4094 ## COG0587 DNA polymerase III, alpha subunit - Prom 26701 - 26760 6.8 + Prom 26563 - 26622 3.4 25 10 Op 1 14/0.000 + CDS 26745 - 27431 556 ## COG0688 Phosphatidylserine decarboxylase 26 10 Op 2 . + CDS 27442 - 28149 424 ## COG1183 Phosphatidylserine synthase 27 11 Tu 1 . + CDS 28217 - 28507 235 ## BT_2233 hypothetical protein + Term 28620 - 28651 -0.8 - Term 28446 - 28482 0.1 28 12 Op 1 . - CDS 28522 - 28959 494 ## COG0590 Cytosine/adenosine deaminases 29 12 Op 2 . - CDS 28986 - 29156 156 ## BF0707 hypothetical protein - Prom 29250 - 29309 5.0 30 13 Tu 1 . + CDS 29287 - 29652 424 ## COG0792 Predicted endonuclease distantly related to archaeal Holliday junction resolvase + Prom 29681 - 29740 7.1 31 14 Op 1 . + CDS 29768 - 30124 447 ## COG2315 Uncharacterized protein conserved in bacteria 32 14 Op 2 . + CDS 30108 - 30866 462 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase 33 14 Op 3 . 
+ CDS 30933 - 31335 100 ## BF0631 hypothetical protein Predicted protein(s) >gi|222159268|gb|ACAB01000091.1| GENE 1 3 - 1086 684 361 aa, chain - ## HITS:1 COG:FN1382 KEGG:ns NR:ns ## COG: FN1382 COG1373 # Protein_GI_number: 19704717 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 7 361 1 363 402 250 39.0 3e-66 MKIPRIIKQRNTYVDRIRPFMRKTLVKVMVGHRRVGKSFILYQLMELIRKEESDANIIYI NKEDVNFIDIKTYKELNDYILSKSVDDRMNYIFVDEIQEIQDFRLAIRSLALDDNNDIYV TGSNSEIFSSDLANELGGRYVEFVVYSLSYIEFLDFHELENTDASLEKYIHYGGLPYLIH LPMEEPVVMEYLRSIYSTIILKDVIQRKNIRNTVFLEQLVSFLAGNIGNLFSSKSISDFL KSQKVDISPYVVSEYAMALSDAFIVHRVGRYDIAGKKLFERGEKYYFENMGIRNVITGYK PQDRAMRLENLVYNHLVYSGYDVKIGTLGTEEIDFVCKRNGEILYVQVSLELSKQETIER E >gi|222159268|gb|ACAB01000091.1| GENE 2 1295 - 2524 1262 409 aa, chain + ## HITS:1 COG:PM0839 KEGG:ns NR:ns ## COG: PM0839 COG0128 # Protein_GI_number: 15602704 # Func_class: E Amino acid transport and metabolism # Function: 5-enolpyruvylshikimate-3-phosphate synthase # Organism: Pasteurella multocida # 5 396 10 430 440 228 35.0 1e-59 MLYKLIPPSVVTATIQLPASKSISNRALIINALGKGIYPPENLSDCDDTQVMIKALTEGK ETIDIMAAGTAMRFLTAYLSVTPGERTITGTARMQQRPIQILVNALRELGAEIEYVHNEG YPPLCIKGAELKGNEITLKGNVSSQYISALLMIGPALKDGLTLHLSGEIISRPYINLTLQ LMQDFGAKAAWTSPNSISVAPQLYQSIPFKVESDWSAASYWYQIAALSPKAEIELLGLFP NSYQGDSRGAEVFSRLGITTEFTSQGVKLKKTGKAPERLEEDFIDIPDLAQTFVVTCALL NIPFRFTGLQSLKIKETDRIAALRAELKKLGYMIKEENDSILMWNGERCEPEETPVIETY EDHRMAMAFAPAIIRHPNLLIANPQVVTKSYPGYWEDLKQAGFQVINEG >gi|222159268|gb|ACAB01000091.1| GENE 3 2537 - 3106 388 189 aa, chain + ## HITS:1 COG:all4541 KEGG:ns NR:ns ## COG: all4541 COG0664 # Protein_GI_number: 17232033 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Nostoc sp. 
PCC 7120 # 31 189 34 190 193 101 37.0 6e-22 MEDFNKYTSHINYSPIVDFVLQQGKQTYYKKGDYFSRQGEVCKTMGFVSSGSFRYCCTNS VGENSIVGYTFDHSFVGNYPAFQLQDKSNVDIQALCDCSVYVINYQQMADFYDTNDAHQK LGRRIAETLLWEVYDRMISMYSLTPEERYLEIINRCPDLLKLITLKELASYLLIRPETLS RIRRKVVQK >gi|222159268|gb|ACAB01000091.1| GENE 4 3181 - 3897 567 238 aa, chain + ## HITS:1 COG:CAP0051 KEGG:ns NR:ns ## COG: CAP0051 COG0300 # Protein_GI_number: 15004755 # Func_class: R General function prediction only # Function: Short-chain dehydrogenases of various substrate specificities # Organism: Clostridium acetobutylicum # 1 233 1 236 240 169 41.0 4e-42 MKKIIIIGATSGIGRGLAEVYSQEDYLIGITGRRENLLEEVCAQDKDKLFYQVCNITDTQ ATISSLETLTQKMGGMDILIICAGTGELNPELSYQLEEPTLLTNVIGFTNIADWGFRYFE QQKSGHLVTISSVGGTRGSGIAPAYNASKAYQINYMEGLRQKATKSPYSIYTTDIRPGFV DTAMAKGEGLFWVTPVDKAVKQIKKAISKKKKVAFISKRWRYVTILFRLLPSAIYCRM >gi|222159268|gb|ACAB01000091.1| GENE 5 4526 - 7759 2632 1077 aa, chain + ## HITS:1 COG:no KEGG:BF4448 NR:ns ## KEGG: BF4448 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 3 1077 28 1092 1092 958 48.0 0 MLLFLLTMSVEAAYPETGESTLKVSVVQQGTTCKGVVKDAAGETIIGASVVAKGTTNGTI TGINGDFSLNNVKTGDIIVVSFVGYQTQEIKFTGQSLNIILKDNTQTLDEVVVVAFGTQK KVNVTGSVSTVGAKEISARPVNSTIEALQGMVPGMNISTGDGGGSLGSDKKFNIRGVGTI GAGSKVEPLVLIDGMEGDMNAINPQDIENISVLKDAAASSIYGSRAPGGVILITTKKGKS GKTVVNYNNNFRFVSPLNMPEMADSYNFALAVNDQLTNGGQTPMYSATKLQQILDFQAGK STQYMWPTDAGRWNSFDDPQRQDVMPTGNTDWLKTLFGNSFTHEHSLSVNGGTDKIQYYL SANYLDQGGLLKFGDDNKQRYSFTAKINADLTKWLKISYSMRFNRTDYEAPSFAGGDIKS NVFYFDVCRYWPVIPVVDPNGFYTVESKIYQLTEGGRYKSQKDVMAHQLAFIVEPIKDWK INVELNYRSNYNFNHTDYQTVYGYDVSKNPYIIANQTSSVTEYAYKSNFFNPNIFTEYGK SLESGHNFKVMLGFQSELFKQRDITASQDGIMSEVATLNTTQTNAQNRGGYSEWATAGFF GRVNYDYKGRYLAEVNMRYDGTSRFLRDNRWNLFPSFSLGWNMAREAFFEDLTDLISTFK IRGSWGELGNQNTDNWYPFYRTIDINKDQWGNYALGSWLVNGVKPNISKESALVSSLLTW EKTQTLDLGFDLSMLNNRLNVTFDYFQRKSKNMVGPAPELPNLLGIAVPKVNNLDMTSKG WEIQVNWRDQIRDFKYGVTLSLSDNQVVIDKYPNPSNTILDKDNNNTYYAGAHVGDIWGF 
QTIGIAKTDQEMKDHLAGMPNGAQDVLGSGWGAGDIMYADLNGDGQISRGNKTLADHGDL KKIGNSTPRYNFGLNLDAAWKGFDLKLFFQGTMKRDYMPGSGSTMFWGAVGYWQTNFFKP HLDYFRGEDTTNPLGANLGGYYPRPLENDRNRNPQTRYLQNAAYCRLKNVTLGYTLPKSL TEKFCVNNLRFFVSAENLFTITSLADTFDPETVGIGNWDGCTYPLSKTVSFGLSATF >gi|222159268|gb|ACAB01000091.1| GENE 6 7773 - 9785 1765 670 aa, chain + ## HITS:1 COG:no KEGG:Coch_1140 NR:ns ## KEGG: Coch_1140 # Name: not_defined # Def: RagB/SusD domain protein # Organism: C.ochracea # Pathway: not_defined # 1 663 1 645 650 535 47.0 1e-150 MKKHIKLLTIGTLLLGGLTGCNDFLDREPLDKVIPEKYFASESDLAAYTINAYPFETVTD AYGINFFGKDNDTDNQASGDSPAFWIPGQKKVPSGEGEWDWSKIRTCNYFFDNTLPKFEA GTITGNQDNVKHYIGEMYVIRAYNYYKLLVSLGDLPIITTALPDIEETLVESSKRQPRNK VARFILDDLQKATELLLDKSPGGKNRISKNVAHLLRARVALFEATWEKYHKGTAFVPGGK GWPGNPADVSGFNSDAEVAYFLDEAMKSSKVVGDYIVGKLADNTDTPEGMNASLVSINPY YTMFCDENMEGYDEILMWKQFKEGLVTSNLQMELARNGGGSGWTRGMVNSFLMRNGLPIY AAGSDYNPDWEKEGVNSTLQNRDSRIVIFTKKPGDANTENKGDVNYYGDDGTPSYCSIRF IYGDKGSLATTGFIIKKGKHYSSHMANDHSAGTSGGIVFRAAEAMLIYMEASYEKNGRID GTADGYWKALRRRAKVDEDYNKTIAATQMSEEAKGDFGAYSHGQLIDATLYNIRRERRNE LCAEALRWEDLKRWRACDQLISKPYRVEGMLYWGSNYETQLADLCKVDPAEGNMSSPDLS KYILPYEKITKNNLIAGQKGFLFTPAHYLNPIGMAVFRQTASDKNDFTSSVVYQNPGWKI EGDTGAQPVE >gi|222159268|gb|ACAB01000091.1| GENE 7 10132 - 10881 360 249 aa, chain + ## HITS:1 COG:no KEGG:BT_1587 NR:ns ## KEGG: BT_1587 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 249 1 250 250 454 84.0 1e-126 MNTDFTNLTTENLSNEHLCCIIRSKKSHPGIEAKRQWLSDRLNEGHVFRKLNAKATVFIE YAPLETAWVPVIGDNYYYLYCLWVLGSPRGNGYGKALIEYCIADAKEKGRSGICMLGAKK QKSWLSDQSFAKKFGFEVVDTTDNGYELLALSFDGTVPKFAPNAKNLKIESEELTIYYDM QCPYIYKYIEMIKQYCETNDVPVSFIQVDTLQKAKELPCVFNNFAVFYKGSFETVNLPPI DYLKRILKK >gi|222159268|gb|ACAB01000091.1| GENE 8 10938 - 11936 730 332 aa, chain - ## HITS:1 COG:CC2502 KEGG:ns NR:ns ## COG: CC2502 COG2234 # Protein_GI_number: 16126741 # Func_class: R General function prediction only # Function: Predicted aminopeptidases # Organism: Caulobacter vibrioides # 39 302 34 
272 309 118 31.0 2e-26 MKRDYLLLALLLVGNITFAQSPIERALNTINRSSAEATINFLASDELQGREAGFHGSRVT SEYIVSLLQWMGVSPLADSYFQPFDAYRKERQKKGRLEVHPDSIAKLKQEVHQKLTMRNV LGMIPGKNTKEYVIVGAHFDHLGIDPALDGDQIYNGADDNASGVSAVLQIARAFLASGQQ PERNVIFAFWDGEEKGLLGSKYFVQTCPFLSQIKGYLNFDMIGRNNKPQQPKQVVYFYTA AHPVFGDWLKEDIRKYGLQLEPDYRAWENPIGGSDNGSFAKVGIPIIWYHTDGHPDYHQP SDHADRLNWDKVVEITKASFLNMWKMANEKSF >gi|222159268|gb|ACAB01000091.1| GENE 9 12109 - 12552 456 147 aa, chain + ## HITS:1 COG:no KEGG:BT_2205 NR:ns ## KEGG: BT_2205 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 140 1 140 141 225 86.0 3e-58 MWILIISLVLLGVIALIAGIIRNKRLQKKIEKGELDRMPEVKEVDVECCGQHEVCERDSL LAAVSKKIEYYDDEELDQFIGRAGDAYTEEETEMFRDVLYTTLDVEVAGWIRSLQLRGIE LPDDLKDEVFLIIGERRNVEVKKADER >gi|222159268|gb|ACAB01000091.1| GENE 10 12552 - 13361 708 269 aa, chain + ## HITS:1 COG:MA0025 KEGG:ns NR:ns ## COG: MA0025 COG1108 # Protein_GI_number: 20088924 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Methanosarcina acetivorans str.C2A # 2 261 3 262 274 213 46.0 4e-55 MDLLQYTFFQHALLGSLLASIACGIIGTYIVTRRLVFISGGITHASFGGIGLGLFAGISP ILSAAVFSVLSAFGVEWLSRRKDMREDSAIAVFWTLGMALGIMFSFLSPGFAPDLSAYLF GNILTINQIDLWMLGILALILTGFFYLFIRPIVYIAFDREFARSQKIPVEIFEYVLMMFI ALTIVACLRMVGIVLAISLLTIPQMTANLFTYSFKKIIWVSIGIGFLGCLGGLFISYHWK VPSGASIIFFSILIYAVCKIGKSCCRKKS >gi|222159268|gb|ACAB01000091.1| GENE 11 13375 - 13788 493 137 aa, chain + ## HITS:1 COG:CAC2838 KEGG:ns NR:ns ## COG: CAC2838 COG0802 # Protein_GI_number: 15896093 # Func_class: R General function prediction only # Function: Predicted ATPase or kinase # Organism: Clostridium acetobutylicum # 1 130 1 127 152 91 35.0 5e-19 MEIKIQSLESIHEAAREFIAAMGDNTVFALYGKMGAGKTTFVKALCEELGVTDVITSPTF AIVNEYRSDETGELIYHFDFYRIKKLSEVYDMGYEDYFYSGALCFIEWPELVEELLPGNA VKVTIEELEDGNRVIRL >gi|222159268|gb|ACAB01000091.1| GENE 12 13785 - 14024 186 79 aa, chain + ## HITS:1 COG:no KEGG:BT_2208 NR:ns ## KEGG: 
BT_2208 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 73 1 73 74 111 91.0 8e-24 MTGQYIVQGIFALAGIVSLLASLLNWDWFFTTRNAQTIVRNVGRNRARLFYGILGIIIIG MAIFFFVETRKAIGRIICF >gi|222159268|gb|ACAB01000091.1| GENE 13 14085 - 14378 271 97 aa, chain + ## HITS:1 COG:MJ1215 KEGG:ns NR:ns ## COG: MJ1215 COG1669 # Protein_GI_number: 15669400 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferases # Organism: Methanococcus jannaschii # 1 76 5 81 86 61 45.0 4e-10 MKTTNEYLTKIRQFKQQFAEKYGIISIGIFGSVARGEQHEESDLDVFVELKDPDPFIMFD IKEELERICNCKIDLLRLRKNLRSLISQRIARDGIYA >gi|222159268|gb|ACAB01000091.1| GENE 14 14362 - 14739 248 125 aa, chain + ## HITS:1 COG:MJ0434_1 KEGG:ns NR:ns ## COG: MJ0434_1 COG2361 # Protein_GI_number: 15668610 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanococcus jannaschii # 5 124 1 112 119 60 33.0 9e-10 MEYTLKEEIQDKFLQISESISIIEERCKNIQNVDDFLLSPWGMTILDACIMRIQVIGETI KAVDDKTQKNFLKDYPQIPWAKVIGLRNIISHEYANIDYEIIWVVIQKHLPPLKETVEQI IKDLS >gi|222159268|gb|ACAB01000091.1| GENE 15 14734 - 16044 913 436 aa, chain - ## HITS:1 COG:PA4323 KEGG:ns NR:ns ## COG: PA4323 COG1721 # Protein_GI_number: 15599519 # Func_class: R General function prediction only # Function: Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) # Organism: Pseudomonas aeruginosa # 59 436 67 443 443 171 29.0 2e-42 MFLTRRFYIALVLVILLLGSGYVFAPFFVIGQWALFVLLLVVLADVYSLYRIRGIRAFRQ CADRFSNGDENEVSIRVESSYPHPVSLEIIDEIPFIFQKRDVDFQVKLGANEGKTVTYRL RPTHRGVYSFGHIRVFVTGKIGFISRRYTCAEPLDIKVYPSYLMLHQYELLAISDNLTEL GIKRIRRVGHHTEFEQIKEYVKGDDYRTINWKASARRHELMVNVYQDERSQQIYNVIDKG RVMQQAFCGMTLLDYAINASLVLSYVAMQKEDKAGLVTFDEHFDTFVPASKQSGYMQTLL ESLYSQQTTFGETDFSALCVHLNKHVSKRSLLVLYTNFSSIGGMNRQLSYLKQLNRQHRL LVVFFEDVDLKEYIAQPAKDTESYYRHVIAEKFAYEKRLIVSTLKQHGIYSLLTTPENLS IDVINKYLEMKSRQLL >gi|222159268|gb|ACAB01000091.1| GENE 16 16226 - 17200 1039 324 aa, chain 
- ## HITS:1 COG:BH0731 KEGG:ns NR:ns ## COG: BH0731 COG0714 # Protein_GI_number: 15613294 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Bacillus halodurans # 28 324 11 308 308 278 48.0 1e-74 MEENTEQRVDLTLFSEKIQELKDRIASVIVGQEQTVDLVLTAILANGHVLIEGVPGVAKT LLARLTARLIDADFSRVQFTPDLMPSDVLGTTVFNMKTNGFDFHQGPIFADIVLVDEINR APAKTQAALFEVMEERQISIDGTTHRMGDLYTILATQNPVEQEGTYKLPEAQLDRFLMKI TMDYPSLEEEVNILERHHTNAALVKLDDITPAITKEELLSLRAFMNQVFVDRTLLQYIAL IVQQTRTSRAVYLGASPRASVAMLQSSKAYALLQGRDFVTPEDIKFVAPYVLQHRLILTA EAEMEGYSPVKVTQRLIDKVEVPK >gi|222159268|gb|ACAB01000091.1| GENE 17 17212 - 18483 679 423 aa, chain - ## HITS:1 COG:no KEGG:BT_2213 NR:ns ## KEGG: BT_2213 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 423 1 424 424 588 68.0 1e-166 MKGSRWFIVFIVAFLFIMFAIEYHLPKNFVWKPTFGHHDEQPFGCAVFDTVLSSSLPQGY TFSRKSLYQLEQEDTTQRRGILVISDNLRLSDVDVNALLKMAERGDKIMLVSTLFGRYLE DTLQFRSYYSYFSPMALKKYATSFMKKDSLHWIGDSTVYPRQTFYFYPQLCSSYFWGDSL PERVLAQRVFESNEFRYETEADSLTTDSLVYMPVAMSRRWGKGEVILVSTPLIFTNYGML DGKNATYIFRLLSQMGKLPIVRTEGYMKETAQVQQSPFRYLLAHQPLRWALYLTMITIIL FMIFTAKRRQRVIPVIHEPANKSLEFTELIGTLYFQKKDHADLVRKKFSYLAEELRREIQ VDIEEVADDERSFHRIARKTGMDAGEIAKFIREVRPVIYGGRVIDAEQMKVYIDKMNEII NHI >gi|222159268|gb|ACAB01000091.1| GENE 18 18480 - 19097 480 205 aa, chain - ## HITS:1 COG:no KEGG:BT_2214 NR:ns ## KEGG: BT_2214 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 205 2 204 204 307 80.0 2e-82 MNLTSPADTLVCDTAQIALWQSDPAYNYNCELITPEMNVFEWISRQFGELLRKIFGSHFA EEYSGLILVCIAILLLLLIVWFVYRKRPELFMVSHKNALPYTVEEDTIYGVDFPGGIAEA LSRQNYREAVRLLYLQTLKQLSDAERIDWQLYKTPTQYINEVRLPAFRQLTNHFLRVRYG NFEATEELFRVMQTLQEEIEKGGVS >gi|222159268|gb|ACAB01000091.1| GENE 19 19114 - 20052 802 312 aa, chain - ## HITS:1 COG:no KEGG:BT_2215 NR:ns ## KEGG: BT_2215 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 312 1 310 310 310 59.0 5e-83 
MESQKPKIAMYVKRPFGEKLNASFDFIKENWKQLFKYSTYLILPICLIQAANFSGLMGSM TDITAMQASGGISDNPLAALGPSFALNYAGVIFFSCLGALLMTSLIYAMVRLYNEREERL NGIVFGDIKSLLLRNIKRLFLMGIACSFLFIFAVIFIVLLAVLTPFTLILTIPLLFAFMV PLALMAPIYLFEDISLGEAFAKTFRLGFATWGGVFLILIVMGIIASVLQGIVSIPWYVIY IVKMIFTMSDGGATSSSVGLNFAQYLFSILMLYGSYLSAIFGIVGLVYQYGHASEVVDSI TVESDIDNFDKL >gi|222159268|gb|ACAB01000091.1| GENE 20 20033 - 20995 617 320 aa, chain - ## HITS:1 COG:alr1808 KEGG:ns NR:ns ## COG: alr1808 COG1300 # Protein_GI_number: 17229300 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Nostoc sp. PCC 7120 # 31 315 39 317 318 132 31.0 8e-31 MKEVTFIRRNIEKWKETEKVVERAASLSPDQLADAYTDLTADLAFAQTHFPTSRITIYLN NLASALHNEIYRNKREKWTRIVTFWTQEVPRTMHDARRELLTSFLIFVASALIGVLSAAN DPDFVRLILGNGYVDMTLDNIANGEPMAVYNGSSEVPMFLGITLNNVMVSFNCFAMGLLT SFGTGYMLLSNGIMVGAFQTFFYQHDLLWESSLAIWLHGTLEIWAIIVAGAAGLALGNGW LFPGTYSRLESFRRGAKRGLKIVIGTVPVFIMAGFIEGFLTRHTELPDVLRLGVIFTSLA FIIFYYIYLPNRKKHGITET >gi|222159268|gb|ACAB01000091.1| GENE 21 21097 - 21822 473 241 aa, chain + ## HITS:1 COG:BH0734 KEGG:ns NR:ns ## COG: BH0734 COG1714 # Protein_GI_number: 15613297 # Func_class: S Function unknown # Function: Predicted membrane protein/domain # Organism: Bacillus halodurans # 3 149 8 164 266 69 30.0 4e-12 MAESTIITGQFVRISQTPASIGERLMALIIDYFLIGLYILSTITLLSKLSLPSGFSLFFF LCVVYLPILGYSFLCEMFNHGQSFGKKLINIRVVKVDGSTPSIGSYLLRWLLFPIDGPIT SGLGLLVVLLNKNNQRLGDLAAGTMVIKEKNYRKIHVSLDEFDYLTQNYHPVYPQSADLS LEQVNVITRTLESSEKDRARRITALAKKVQELLSVTPRESNQEKFLQTILRDYQYYALEE I >gi|222159268|gb|ACAB01000091.1| GENE 22 22035 - 22370 521 111 aa, chain - ## HITS:1 COG:no KEGG:BT_2228 NR:ns ## KEGG: BT_2228 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 111 1 111 111 152 94.0 4e-36 MGLEDDFLLADADDEKTIEFIKNYLPQELKEKFSDDELYYFLDLIDEYYSESGILDAQPD EDGYVNIDLEEVVAYIVKEAKKDEIGEYDPEEVLFVVQGEMEYGNSLGQVD >gi|222159268|gb|ACAB01000091.1| GENE 23 22403 - 22717 170 104 aa, chain - ## PROTEIN SUPPORTED ## NR: 
gi|124485582|ref|YP_001030198.1| ribosomal protein L12E/L44/L45/RPP1/RPP2-like protein [Methanocorpusculum labreanum Z] # 3 103 18 117 120 70 32 2e-11 MALEITDSNYKEVLAEGKPVVVDFWAPWCGPCKMVAPIIEELAAEFEGQVIIGKCDVDEN GDMAAEYGIRNIPTVLFFKNGEIVDKQVGAVGKPVFAEKVKKLL >gi|222159268|gb|ACAB01000091.1| GENE 24 22777 - 26583 4094 1268 aa, chain - ## HITS:1 COG:CAC0516 KEGG:ns NR:ns ## COG: CAC0516 COG0587 # Protein_GI_number: 15893807 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Clostridium acetobutylicum # 2 1230 9 1133 1167 710 35.0 0 MQDFVHLHVHTQYSLLDGQASVARLVDKAMQNGMKGIAVTDHGNMFGIKEFTNYVNKKNS GPKGEVKDLKKRIAGIEAGTIECADKEAEIAACKAKIVEAENKLFKPVIGCEMYVARRTM DLKEGKPDQSGYHLIVLAKNEKGYHNLIKLVSHAWTRGYYMRPRTDRSELEKYHEGLIVC SACLGGEVPKRITAGQFAEAEEAIQWYKNLFGDDYYLEMQRHKATVPRANHECYPLQVNV NKYLIEYAKKFNVKLICTNDVHFVDEENAEAHDRLICLSTGKDLDDPSRMLYTKQEWMKT REEMNELFADVPEALSNTLEILDKVEYYSIDHAPIMPTFAIPEDFGTEEGYRAKFTEKDL FDEFTQDEHGNVVLSEEDAKAKIKRLGGYDKLYRIKLEGDYLAKLAFDGAKRIYGEPLSE EVKERMNFELYIMKTMGFPGYFLIVQDFINAARKELGVSVGPGRGSAAGSAVAYCLGITK IDPIQYDLLFERFLNPDRISLPDIDVDFDDDGRGEVLRWVTNKYGQEKVAHIITYGTMAT KLAIKDVARVQKLPLSESDRLAKLVPDKIPDKKLNLRNAIEYVPELQAAEASSDPLVRDT IKYAKMLEGNVRGTGVHACGTIICRDDITDWVPVSTADDKETGEKMLVTQYEGSVIEDTG LIKMDFLGLKTLSIIKEAVENIRLSRNVEVDVDAIDICDPATYKLYSDGRTIGTFQFESA GMQKYLRELQPSTFEDLIAMNALYRPGPMDYIPDFIDRKHGRKPIEYDIPVMEKYLKDTY GITVYQEQVMLLSRLLADFTRGESDALRKAMGKKLRDKLDHMKPKFIEGGRKNGHDPKVL EKIWTDWEKFASYAFNKSHATCYSWVAYQTAYLKANYPPEYMAAVMSRSLSNITDITKLM DECKAMGIQTLGPDVNESNLKFTVNHDGNIRFGLGAVKGVGEAAVHSIMEERSKNGPFLG IFDFVQRVNLNACNKKNMECLALAGGFDSFPELKREQYFAVNSKGEVFLETLMRYGNRYQ ADKAAAFNSLFGGENVIDVATPEIPQGAERWSDLDRLNRERDLVGIYLSAHPLDEFSIVL EHVCNTRMADLEDKAALVGREITMGGIVTSVRRGVSKNGNPYGIAKIEDYSGSTEIPFWG NDWVTYQGYLNEGTFLYIKARCQAKQWRQDELEVKITSMELLPDVKEELVQKITIIIPLS VLNSALVTELATLTKEHPGNTELYFKVTDDADVSHMSIDLISRPIKLSVGRDLITYLKER PELGFHIN >gi|222159268|gb|ACAB01000091.1| GENE 25 26745 - 27431 556 228 aa, chain + ## HITS:1 COG:NMA1160 KEGG:ns NR:ns ## 
COG: NMA1160 COG0688 # Protein_GI_number: 15794106 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine decarboxylase # Organism: Neisseria meningitidis Z2491 # 12 227 10 213 265 150 41.0 2e-36 MGRLKKLKKIRIHREGTHILWASFLLLLLINAALYWGIDCKIPFYVVAVASIAVYLLMVN FFRCPIRLFGQDTEKIVVAPADGKIVVIEEVDENEYFHDRRLMVSIFMSIVNVHANWYPV DGTIKKVAHHNGNFMKAWLPKASTENERSTVVIETPEGVEILTRQIAGAVARRIVTYAEV GEECYIDEHMGFIKFGSRVDVYLPIGTEICVKMGQLTTGNQTVIAKLK >gi|222159268|gb|ACAB01000091.1| GENE 26 27442 - 28149 424 235 aa, chain + ## HITS:1 COG:SMc00552 KEGG:ns NR:ns ## COG: SMc00552 COG1183 # Protein_GI_number: 15964875 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine synthase # Organism: Sinorhizobium meliloti # 9 189 42 221 289 87 32.0 3e-17 MTNVIKNSIPNTVTCLNLFSGCIACVMAFEAKYELALLFIALSSIFDFFDGLLARALNAH SIIGKDLDSLADDVSFGVAPSLIVFSLFKEMYYPANMEFIAPYLPYLAFLISVFSALRLA KFNNDTRQTSSFVGLPVPANALFWGSLVAGAHDFLISDNCHPVYLLILVCLFSGLLVSEI PMFSLKFKNLSWNDNKISFIFLIICIPLLLLLGISSFAAIIVWYILLSLFTRKSK >gi|222159268|gb|ACAB01000091.1| GENE 27 28217 - 28507 235 96 aa, chain + ## HITS:1 COG:no KEGG:BT_2233 NR:ns ## KEGG: BT_2233 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 96 16 99 99 99 64.0 3e-20 MHILLFILIFIIAIFVFGLSIVGFILRTIFGLGRSSSSSRPKQTESGRTSQQSYNQTNRR SNDDEEEIYSENVPEKRHKKIFTQDDGEYVDFEEIK >gi|222159268|gb|ACAB01000091.1| GENE 28 28522 - 28959 494 145 aa, chain - ## HITS:1 COG:SA0516 KEGG:ns NR:ns ## COG: SA0516 COG0590 # Protein_GI_number: 15926236 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Staphylococcus aureus N315 # 1 144 1 149 156 131 47.0 4e-31 MLDDIYFMKQALIEAGKAAERGEVPVGAVVVCKERIIARAHNLTETLNDVTAHAEMQAIT AAANVLGGKYLNECTLYVTVEPCVMCAGAIAWAQTGKLVFGAEDEKRGYQKYAGSALHPK TVVVKGIMADECAALMKEFFAAKRK >gi|222159268|gb|ACAB01000091.1| GENE 29 28986 - 29156 156 56 aa, chain - ## HITS:1 COG:no 
KEGG:BF0707 NR:ns ## KEGG: BF0707 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 56 22 77 77 91 92.0 1e-17 MSILLSDEEQLIVDRYLEKYKITNKSRWLRETILMFIHKNMEEDYPTLFGEHDMRR >gi|222159268|gb|ACAB01000091.1| GENE 30 29287 - 29652 424 121 aa, chain + ## HITS:1 COG:CAC1763 KEGG:ns NR:ns ## COG: CAC1763 COG0792 # Protein_GI_number: 15895040 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease distantly related to archaeal Holliday junction resolvase # Organism: Clostridium acetobutylicum # 8 116 9 120 122 61 37.0 5e-10 MAEHNDLGKSGENAAVAYLEQKGYLIRDRNWRKGHFELDIVAAKDNELIVVEVKTRSDTL FAAPEDAVDLPKIKRTVRAADAYIRLFQIDTPVRFDIITVVGNDGNFKVEHIEEAFYPPL Y >gi|222159268|gb|ACAB01000091.1| GENE 31 29768 - 30124 447 118 aa, chain + ## HITS:1 COG:DR2400 KEGG:ns NR:ns ## COG: DR2400 COG2315 # Protein_GI_number: 15807390 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Deinococcus radiodurans # 3 113 8 125 132 76 41.0 1e-14 MNVETVREYCLNKKGVTESFPFDDVSLVVKVMNKMFALIDLEEANHIALKCDPEKAIELR EHYSGIEGAYHFNKKYWNSVRFDSDVDDKLMKELIDHSYDEVIKKFTKKLRAEYDALP >gi|222159268|gb|ACAB01000091.1| GENE 32 30108 - 30866 462 252 aa, chain + ## HITS:1 COG:lin2018_2 KEGG:ns NR:ns ## COG: lin2018_2 COG0340 # Protein_GI_number: 16801084 # Func_class: H Coenzyme transport and metabolism # Function: Biotin-(acetyl-CoA carboxylase) ligase # Organism: Listeria innocua # 36 234 35 234 253 87 31.0 2e-17 MMPSPDTFPVPLIHINETNSTNNYLQSLCSEQKVEELTVVVADFQTSGRGQRGNSWESDP GKNLLFSTVIFPEFLEARRQFLISQIISLAIKEELDTYTTDISIKWPNDIYWKEKKICGM LIENDLMGRNINQSIAGIGININQETFHSSAPNPVSLLQITEEEHDLFEILKNIMLRIQS YYSLLKKGDTTSIACQYEKSLFRREGMHRYKDANGEFLARIVCVEPEGKLILEDEKLIKR GYMFKEVEYLLK >gi|222159268|gb|ACAB01000091.1| GENE 33 30933 - 31335 100 134 aa, chain + ## HITS:1 COG:no KEGG:BF0631 NR:ns ## KEGG: BF0631 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 134 1 136 
436 107 45.0 9e-23 MKTKLIFLLPLLWILIGCDDSDSATLHISESKFDNISASGESLTIDITCSSSWTVTSNKQ WCIPNTQKGENDGKLILSINANLESKSRTATVTIISHKVSKTVQIIQNGSTNTAEEYHYE LPVIFHVLYKEDEN Prediction of potential genes in microbial genomes Time: Wed May 18 03:17:54 2011 Seq name: gi|222159267|gb|ACAB01000092.1| Bacteroides sp. D1 cont1.92, whole genome shotgun sequence Length of sequence - 30674 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 9, operones - 4 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 86 - 889 333 ## BF0702 hypothetical protein + Term 911 - 949 5.5 2 2 Tu 1 . - CDS 923 - 1606 549 ## BT_2240 TPR domain-containing protein - Prom 1657 - 1716 12.7 + Prom 1599 - 1658 6.3 3 3 Op 1 . + CDS 1719 - 3047 715 ## COG0534 Na+-driven multidrug efflux pump 4 3 Op 2 . + CDS 3102 - 3812 747 ## COG0528 Uridylate kinase + Term 3835 - 3880 2.3 5 4 Op 1 . - CDS 3985 - 5538 1185 ## Fjoh_2081 hypothetical protein 6 4 Op 2 . - CDS 5549 - 6607 928 ## Acid_0712 hypothetical protein 7 4 Op 3 . - CDS 6615 - 6962 229 ## gi|294647215|ref|ZP_06724814.1| conserved hypothetical protein 8 4 Op 4 . - CDS 6966 - 9629 1777 ## COG3250 Beta-galactosidase/beta-glucuronidase 9 4 Op 5 . - CDS 9641 - 11749 1469 ## gi|237714920|ref|ZP_04545401.1| conserved hypothetical protein 10 4 Op 6 . - CDS 11781 - 13394 1368 ## Fjoh_2078 RagB/SusD domain-containing protein 11 4 Op 7 . - CDS 13409 - 16468 2861 ## Fjoh_2077 TonB-dependent receptor - Prom 16519 - 16578 3.9 - Term 16490 - 16536 -0.9 12 5 Tu 1 . - CDS 16600 - 20634 2700 ## COG0642 Signal transduction histidine kinase 13 6 Op 1 . + CDS 21036 - 21947 618 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 14 6 Op 2 . + CDS 21999 - 22559 831 ## COG0233 Ribosome recycling factor + Prom 22561 - 22620 2.0 15 7 Tu 1 . 
+ CDS 22666 - 23598 961 ## COG1162 Predicted GTPases + Prom 23619 - 23678 3.2 16 8 Op 1 27/0.000 + CDS 23698 - 24786 1024 ## COG0845 Membrane-fusion protein + Prom 24794 - 24853 3.5 17 8 Op 2 . + CDS 24935 - 27967 2799 ## COG0841 Cation/multidrug efflux pump 18 8 Op 3 . + CDS 27964 - 29304 1161 ## BT_2253 putative outer membrane protein TolC + Term 29331 - 29377 11.3 + Prom 29321 - 29380 6.8 19 9 Tu 1 . + CDS 29456 - 30550 773 ## BT_2254 putative pectate lyase + Term 30630 - 30674 4.7 Predicted protein(s) >gi|222159267|gb|ACAB01000092.1| GENE 1 86 - 889 333 267 aa, chain + ## HITS:1 COG:no KEGG:BF0702 NR:ns ## KEGG: BF0702 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 267 86 356 356 339 63.0 5e-92 MNLTFTLATTDKNGETLPNPGVEYIQWPESYPIDCEAFMKDNSGKYVKYLWDPNSYINIM VYNFTTEPNSNSVTLGISHIPFSTTGKHYLEGLGETDYSHLTLANLQFPLSVSINSLYIN EESTPTEYSTADIAVTLAHELGHYLGLHHVFAETDNGTCEDTDYCKDTKSYNKQEYDSKC DYIYENEREKYTFKNLVKRTGCDGIEFISYNIMDYAISYSNQFTQNQRERIRHVLSYSPL IPGPKKGDIDTRALNEGPLDLPIRTIK >gi|222159267|gb|ACAB01000092.1| GENE 2 923 - 1606 549 227 aa, chain - ## HITS:1 COG:no KEGG:BT_2240 NR:ns ## KEGG: BT_2240 # Name: not_defined # Def: TPR domain-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 227 15 227 227 327 85.0 1e-88 MRTLTIFLISLFSLPLVLNAQSVDEMLQKVSAAIEAGQNGQAVSYFRQTIALNIDRTEMY YWTNVDKNSEISSKLATELALAYKKNRNYDKAYLFYKELLQKTPNNVDCLEACAEMQVCR GQEKDALRMYEKILQLEADNLAANIFLGNYYYLTAEQEKKKLEMDYKKLSSPTKMQYARY RDGLSKLFTTRYEKARNSLQKVILRFPSTEAQKTLDKILRIEKEVNR >gi|222159267|gb|ACAB01000092.1| GENE 3 1719 - 3047 715 442 aa, chain + ## HITS:1 COG:VC0090 KEGG:ns NR:ns ## COG: VC0090 COG0534 # Protein_GI_number: 15640122 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Vibrio cholerae # 5 424 12 431 454 285 39.0 1e-76 MTDKKKSSVNRRILQIAIPSIISNITVPLLGLIDVTIVGHLGSPAYIGAIAVGGMLFNII YWIFGFLRMGTSGMTSQAYGQHDLNEITRLLLRSVGVGLFIALCLLILQYPILKLAFTLI 
QTTPEVKQLATTYFYICIWGAPATLGLYGFAGWFIGMQNSRFPMYIAITQNIVNIVASLC FVYLLDMKVAGVATGTLIAQYAGFFMAILLYMRYYSVLKKRIVWKEIIQKQAMYRFFQVN RDIFFRTLCLVIVTMFFTSAGAAQGEIVLAVNTLLMQLFTLFSYIMDGFAYAGEALTGRY IGAKNQTGLRNTVHHLFYWGFGLSLVFTILYAAGGKEFLGLLTNDTSVISASDTYFYWAL IIPLAGFSAFLWDGIFIGATATRQMLYSMLVASASFFGVYYAFHPLLGNHALWLAFLVYL SLRGIVQTLLGRQIMKKVIVSR >gi|222159267|gb|ACAB01000092.1| GENE 4 3102 - 3812 747 236 aa, chain + ## HITS:1 COG:FN1622 KEGG:ns NR:ns ## COG: FN1622 COG0528 # Protein_GI_number: 19704943 # Func_class: F Nucleotide transport and metabolism # Function: Uridylate kinase # Organism: Fusobacterium nucleatum # 4 234 6 236 239 263 59.0 2e-70 MAKYKRILLKLSGESLMGEKQYGIDEKRLAEYAEQIKEIHQQGVQIGIVIGGGNIFRGLS GANKGFDRVKGDQMGMLATVINSLALSSALVATGVKARVLTAVRMEPIGEFYSKWKAIEC MENGEVVIMSAGTGNPFFTTDTGSSLRGIEIEADVMLKGTRVDGIYTADPEKDPTATKFD DITYDEVLKRGLKVMDLTATCMCKENNLPIVVFDMDTVGNLKKVISGEEIGTLVHN >gi|222159267|gb|ACAB01000092.1| GENE 5 3985 - 5538 1185 517 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_2081 NR:ns ## KEGG: Fjoh_2081 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 67 517 25 470 471 229 34.0 3e-58 MKKLFILGTFLFISSIPMVSCTDDDDKDPNFMPPDIVMGGGDVESEYPEDLPAPGASVMY TPSLNANMYRPISVKYSSAYPPISSWKTENTRIIAYMDGYKPAIKTLKAYQESVNKYGSS TTLPKQAATGRFYTKKIDGRWWLVDPEGCLHLERSATSLRKGTSSRNKTAWNSRFGTDEK WLSTTQRELSEIGFHGTGAFCTGTYSLIQTHNASNPSSPLTLAPSFAFLSQFKSAKSYNY PGGSDDNAAGLVFYNGWTEWCESYLAGSAFADYLRDPNVLGFFSDNEINFSSNSSRILDR FLAISNSSDPAYVAAKAFMDSKGTQSVTDDLNNEFAGIVAEKYYKAVKEAVKKVDDKLLY LGTRLHGTPKYMEGVVRAAGKYCDVISINYYSRWSPELTTAIADWANWADKPFLVSEFYT KGVEDSDLNNQSGAGYSVPTQNERAYAYQHFTLGLLEAKNCVGWHWFKYQDDDGTDNSSK PANKGLYDNSYQLFPYLSFFARELNFNAYDLIQYFDK >gi|222159267|gb|ACAB01000092.1| GENE 6 5549 - 6607 928 352 aa, chain - ## HITS:1 COG:no KEGG:Acid_0712 NR:ns ## KEGG: Acid_0712 # Name: not_defined # Def: hypothetical protein # Organism: S.usitatus # Pathway: not_defined # 3 345 118 459 462 304 44.0 4e-81 MLDKFGTEEKWIEGTARMIHSLGFSGAGSWSNEEAIASYNASHKEVLTRSIILNLMSGYG 
KKRGGTYQLPGNTGYPNQCIFVFDPEFETYCDEMAQKLVANKTDKNIIGYFSDNELPFGP KNLEGYLTLKNPNDPGRLYAESWLKQQGITLQQITDEHREEFAGVVAERYYKVVSEAIRK YDPNHLYLGSRLHGKPKFVRQIVEAAGRYCDVVAINYYGAWTPSEKTMKHWGEWAQKPFI ITEFYTKGMDSGLANTTGAGFTVQTQQERGYAYQHFVLGLLESGNCVGWHWFRYQDNDPT AKGVDPSNLDSNKGLVDNEYNLYKPLADAMKELNINAYRLADWFDQQSTNNK >gi|222159267|gb|ACAB01000092.1| GENE 7 6615 - 6962 229 115 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294647215|ref|ZP_06724814.1| ## NR: gi|294647215|ref|ZP_06724814.1| conserved hypothetical protein [Bacteroides ovatus SD CC 2a] # 1 96 2 97 471 203 100.0 3e-51 MKNSIVILFLLLLSQFGYAQGRTFKVTARPWVKGQKNLPWKEYDTRTIAQLDGFKPTGKV RVNKYGSDLDAPRYRATGFFRVERIRDRWWMIDPDGLSAFAESGGRSTFRYFRTQ >gi|222159267|gb|ACAB01000092.1| GENE 8 6966 - 9629 1777 887 aa, chain - ## HITS:1 COG:SSO3036 KEGG:ns NR:ns ## COG: SSO3036 COG3250 # Protein_GI_number: 15899743 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sulfolobus solfataricus # 56 596 40 554 570 192 27.0 3e-48 MYKVYRKIILACLLLIVAGTVCGQRVTQTINDGWKFSLFEGDAFDADFDVSGWTDVSIPH TWNAKDAEDEIPGYFRGKGWYRKAVIVEELIAGQRVYLCFEGANQETNVFVNGKLVGNHK GGYSAFTFDVTDYVHTGCNLVAVSVDNSYNPDIAPLSADFTFFGGLYRDVYLVYTSPVQL STTHYASSGVYLKIVGITDAQAEVCAKTFLSNALKSNQALILETEILDADGNRVALSAKK VNVKAGEKNVAFEALMTIAQPKRWDVDSPYLYKVYSRLKNKKGEVLDCVVNPLGIREYHF DAEKGFFLNGKYRKLIGTSRHQDYKGMGNALRDEMHIRDIQLSKDMGSNFLRVAHYPQDP VVMQMCDKLGLLTSVEIPIVNAITQSRAFMDNCVEQATEMVCQNYNYPSVIIWAYMNEVL LRPPFNPDNKKERAEYMKFLHQIASAVEAQIRSLDSERYTMLPCHSASQIYQEAGITELP MLLGFNLYNGWYGGNLGGFEEKLEELHKEFPHKPLLITEYGADVDTRIHSFSPVRFDFSC EFGSVYHEHYLPEILKRDYIVGAMVWNLNDFYSEARRNAMPHVNNKGLVSTDRERKDGYF LYQAYLKESPVLHIASKSWKNRAGASRDGKSCTQPLKVYTNADKVEVFLNGKSLGVYPVA DKVVSVDIPFVNGKNVVDAVIEKEGREYRDQYVCDFKCVNVKNGFTEINVLLGARRYFED RIAEMCWIPEQAYAEGSWGYIGGEVAPNKTRYGSLPASDTDILGTDQDPVFQTQRVGIEA FKADVPDGVYAVYLYWTELTSENKREALVYNLGNDVVREDYINRVFSVDINGVSVAKQLN IAEEYGSERAVIKKYIVPVSQGKGLVVRFGAVESVPILNAIRIVKEY >gi|222159267|gb|ACAB01000092.1| GENE 9 9641 - 11749 1469 702 aa, chain - ## 
HITS:1 COG:no KEGG:no NR:gi|237714920|ref|ZP_04545401.1| ## NR: gi|237714920|ref|ZP_04545401.1| conserved hypothetical protein [Bacteroides sp. D1] # 17 702 19 704 704 1207 100.0 0 MKRIIYLLLILNLGLLSCVDDASVRVMPEFNCKDTKVNLAKAAGSFVTSLLYTNVGQVVA QYQAEWLSVDVNAKSVIYTALTQNDGEDARSTVVKLTCGSYTVEVTVTQDSKEPDLSLKV GQSVDDGIGMIFWVDPSDKMVGKAVSVKRQGGNPFEASVMSHNALSTVNGYANTALFTAP AANDAVAYCQSLGEGWYLPARDELWELFDVYNGIGHADPDFASVVPDKLTEVEKAARAAF DKMLTDLQGDVINEAAGSGNGESYWSSTENAAGDKAYWVRFGKSGADAGNKTATNRFVRC MRTIGDYTYPEEPATLTVAPNPVTLEGANEAEANVTLTSNKSVFSVVLADDSWLSYTISG TTVTFKAKSKNTTGNIRTTIATITAGTGAAAKAVEVTVNQNVAAEGDASLELSTNTVTIT PDAVAKSEVITMISDETEFAINITDESWVKAYVDVTSKTLYFWTLSPNLNSSSRVTTATV IAGNGANAPKQEVTITQRGLLSSEFAVGQVIADNGSLKGGIVFWVDGTNRGKAKIMSLDR ENLAWSTAGSPASTGVDLSNDDGLANTTALAALPNAAEMPALKYCMDKGAGWYWPTRSDL EQMFETYNGTKVADATEDNPNAITDFEKANRAAWDLVVTNAGGTAMNTAAASSTGDSYWA SRETSSGTNAFYVRFGKPLAWDKANGKKTGARYIRAVRSISK >gi|222159267|gb|ACAB01000092.1| GENE 10 11781 - 13394 1368 537 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_2078 NR:ns ## KEGG: Fjoh_2078 # Name: not_defined # Def: RagB/SusD domain-containing protein # Organism: F.johnsoniae # Pathway: not_defined # 1 527 1 521 530 187 31.0 8e-46 MKKLIYTAFVICGMLTASCSDLLNLESKTDVTNNYLFTTPEGLNTAVTGLYSLARELPGG ADNNESNLYIVTMCDFNTDIAILRAGVSTSIGRLNTSFTPSTGDVNKFWKHHYGIIGKAN EIIVAAEALGLDDSDVLHAWSEAKFFRGRSYFELWKRFDRLYLNVTPTTVDNLKREYKPA SHEELMTLMTTDLDDAMKGLDWSLPQNNGNVLYGRVTRATAKHVRAQVAMWESDWDTAIE ECEDIFKQEGIYSMEKKAENVFNSADLRSPEVLWSFQYSQNLGGGGSGTPVAGHRISIQT TTRINKIAGCINTADQGGYGWGRIYPNTYLLSLYDQAKDTRYNELFVHRFKYNDPTSPKY GELIPLAKSSSYCETLHFMSKKYFDQWTMADNPDRTTGFKDLIVYRLAETYLMAAEAYMR RDGGMSTDALRCYNKTWERAGNDKFAGPLTQDILLDEYARELNFEGVRWPLLKRLGLLGE RVKAHYGETKAENPYLDKDYAECRTSFVVGKHECWPIPQEQIDLMGKENFPQNENWY >gi|222159267|gb|ACAB01000092.1| GENE 11 13409 - 16468 2861 1019 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_2077 NR:ns ## KEGG: Fjoh_2077 # Name: not_defined # Def: TonB-dependent receptor # Organism: F.johnsoniae # Pathway: not_defined # 16 1019 27 1008 1008 657 39.0 0 
MKNIVIKWAVFLTAFLISLEVSAQNVRVSGVVTDALGPIPGVNIMEEGTTNGTVTDVNGK YSISVSAKSTLVFSCIGYKEQKIRVGTKTVLNVNMVEESKMLDELVVVGYGVQRKSDVAT SVASVKADEMKTFPAGNVADMLRGRAAGVNVTSSSGRPGSTPSITIRGSRSISADNAPLY IIDGSPSSATEFSTLSADDIESVEILKDAASQAIYGARASDGVVLVTTKRGKAGKVEVNY NGYLGIQSLWRNFDFYSPEEYMQLRREAKAHDKGIIDAREISIAEALEDEVMQRVWASGK FIDWEKEMFRNAIYHNHDVSVRGGTEKIKVSAGANYFDQQGMVVTGSGYQKFSLRLNLDF EIAKWISFGINSSYAMTKQDREDGNFNDFITSSPLAEIYDADGKYTKYINSEGNYNPLYR AEHYGREVTRDNYRLNFFMDVKPFKGFNYRLNTSVYNQTSEDGSYKDSQYPGGGGTAVLD ESRTQNWLVENIVTYKVPIRNKKHQLTLTGVQSVDHNGSKSIGYSVENLPVDKDWNFISQ GEFTGKPRRQFNENNLVSFMARAQYSLLDRYLLNVAVRRDGSSRFGKENKWGTFPSAAFA WRVNQESFLRDVSWIDNLKLRVSYGIVGNQNGIGNYTTLGLADNKGYEFGNVFQMGYLPG KELSNPNLKWEQSATANFGVDFSFFNGRLNGTVEYYNTHTKDLLVKRSLNASLGYTTMLD NLGKTKSSGIDLSLNGDVVRTKEFTWTLGTNFSMYKNEIVRIDDTLDENGKPASQVAQGW IIGEPINVYYDYLIDGIFQYDDFDITRDGTGNLVYTLKNTYDSDNDGVADSPINYGGAIE PGMVKVRDNNGDGKITADDRVPIRKDPKFTLSLSSTWNWKGFDLFMDWYGVSGRKIKNGY LYEYNSGGSLRGKLNGVKVDYWTPFNPSNKFPRPSYSADPAYLSAIAIQDASYIRLRTLQ LGYTFPTRLLKNTPIHKLRLYATATNLLTFTEFKSYSPELTPGSYPESRQYVFGVNVSF >gi|222159267|gb|ACAB01000092.1| GENE 12 16600 - 20634 2700 1344 aa, chain - ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 808 1053 53 287 318 145 37.0 5e-34 MAKVKFLVVYCLFISGLCCMANVTKDINMRFKDVRQGLSHQTVNCFYQDEFGFLWIGTQD GLNRFDGRKFEVFKPDNANPYSININNIRQVCGNKAGLLFIRSLQSVTLYDMRLNRFRVL REGEVAGICYAHDALWIATGKEIYRYRNLDQAPELFFSFPSDEEDVLMNSLMVRQNQTVV VGTSSKGVYCIDQSAHITRRMDVGAVNSITEDRNGSIWIATRNRGLIRLDEDGTLTHFKH NKVAENTINHNNVRHVTQANDSLLYIGTYAGLQTLNLFTGEFTDYEYDLNVEAADIRSII SMHYDTSGTLWLGTFYQGIQYYNVANDAFHFYRSSTAVGGHLNSYIISSIAEDRSGRIWF ASEGSGLNYYDKHTKRFFPLKKVYSQELSFKIVKSLYYEREPDYLWVASLYQGINRINLS TGHIESISENIYTPEGKTVVDRAYNLVKMIEFAGHDSLLIAAKGGLLVLDKKHLRLHHFE HPSLAAKHLSQVWDMTFDKDGDLWLTTSFDLIRVNLKAGIAHSYPFSQIAQSTAQHHINH ILCDKKGRIWLGSTGSGIYLFDKKTDSFVGYGAKQGLENGFITGLVESPLDGSIYVATNG 
GFSKFNLATTTFENYNRQSDFPLNNVNDGGLYIASDHDIYVCGLAGIVSIAQEKLNKQSV DYDVFVKRVLVDNTEIQPLDSLGLIKETVLYEHRLVLPPRYSSVTFEIASNTLNNISNIG LEYKLEGFDNEYMKAGDNTMITYTNLHPGHYTFHVRGDQVGVHGQEAPSVSFELIVEAPV YQRAWFILLMVLAAILIAGYIIRMFWIRKTLRHSLLAEKREKEYIESVNQSKLRFFTNVS HEFRTPLTLISSQLEMLLMHKDIIPEVYNKILDIYKNSRRMNNLVDEVIDIRKQDQGYLK LKISKENIVAVIEEICGSFYSYAQLNKIDLRFSSTLKEADLYIDKTQIEKVFYNLLSNAF KYTKPGDWISVELSAEGDNDIVISVNNLGVGIDKSKIKHVFERFWQDDSATTTRTVKGSG IGLAMAKGIVELHQGTIGVESELNGVTSFTVTLHRDANMNMEASSSADEHVAEHYIIEPQ EISEMVKPDKTVKILIVEDNPEMRKVLTQIFEQIYEVYTAADGQEGLERASSLQPQMIVS DIMMPVMSGLEMCEKLKSNLQTSHIPVVLLTARNREEHTLEGLQTGADDYISKPFNIKIL VARCNNIIQTRKLLQQRFVRNDEPKVEDLPFNPIDKKMLMDATAIVESYIDNPDFDVATF AREMCMSRTLLFTKLKALTGQTPNDFILSLRLKKATEKLRNDPNALIADIAFDYGFSNPS YFIRCFKNAYDITPAAYRRKYANS >gi|222159267|gb|ACAB01000092.1| GENE 13 21036 - 21947 618 303 aa, chain + ## HITS:1 COG:PAB0040 KEGG:ns NR:ns ## COG: PAB0040 COG0697 # Protein_GI_number: 14520295 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Pyrococcus abyssi # 7 283 23 291 295 70 26.0 6e-12 MTNKTKGFIYGAIAAASYGMNPLFALPLYAAGMSVDTVLFYRYFFATIVLGILMKMQHQS FALHKADVLPLVIMGLLFSFSSLLLFMSYNYMDAGIASTILFVYPVMVAVIMGIFFKEKI SAITVFSILLALSGIALLYQGDGNKPLSTLGIIFVLLSSLSYAIYIVGVNRSTLKNLPTT KLTFYAILFGLSVYIVRLNFCTELQVIPSAWLWADVLSLAILPTAVSLVCTALAIHYIGS TPTAILGALEPVTALFFGVLLFHEKLTPRLMMGILMIITAVTLIIIGKSLIKKMGMLLQM NKK >gi|222159267|gb|ACAB01000092.1| GENE 14 21999 - 22559 831 186 aa, chain + ## HITS:1 COG:RSc1407 KEGG:ns NR:ns ## COG: RSc1407 COG0233 # Protein_GI_number: 17546126 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome recycling factor # Organism: Ralstonia solanacearum # 10 186 9 186 186 158 50.0 4e-39 MVDVKTCLDNAQEKMDMAIMYLEEALAHIRAGKASARLLDGIRVDSYGSMVPISNVAAIT TPDARSIVIKPWDKSMFRVIEKAIIDSDLGIMPENNGEMIRIGIPPLTEERRKQLAKQCK 
GEGETAKVSVRNARRDGIDALKKAVKDGLAEDEQKNAEAKLQKIHDKYIKQIDDMLAEKD KEIMTV >gi|222159267|gb|ACAB01000092.1| GENE 15 22666 - 23598 961 310 aa, chain + ## HITS:1 COG:TM1717 KEGG:ns NR:ns ## COG: TM1717 COG1162 # Protein_GI_number: 15644464 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Thermotoga maritima # 2 298 6 286 295 213 39.0 4e-55 MKGLVIKNTGSWYQVKTDDGQSIECKIKGNFRLKGIRSTNPVAVGDRVRIILNQEGTAFI SEIEDRKNYIIRRSSNLSKQSHILAANLDQCMLVVTINYPETSTIFIDRFLASAEAYRVP VKLVFNKVDAYDEDELRYLDALINLYTQIGYPCFKVSAKNGTGVAEIKKALEGKITLFSG HSGVGKSTLINSILPGIETKTGEISSYHNKGMHTTTFSEMFPVEGDGYIIDTPGIKGFGT FDMEEEEIGHYFPEIFKTSADCKYGNCTHRHEPGCAVRKAVEEHLISESRYTSYLNMLED KEEGKYRAAY >gi|222159267|gb|ACAB01000092.1| GENE 16 23698 - 24786 1024 362 aa, chain + ## HITS:1 COG:XF2384 KEGG:ns NR:ns ## COG: XF2384 COG0845 # Protein_GI_number: 15838975 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Xylella fastidiosa 9a5c # 24 355 20 374 411 127 26.0 3e-29 MNKKTKWGIIILVGAGIIGGGIYSQLPKKNDELAAADKVMGSNKKRGKQVLNVNAKVIKP QSLTDEFTTTGVLLPDEEVDLSFETSGKIVEINFEEGTAVKKGQLLAKVNDRQLQAQLQR LISQLKLAEDRVFRQDALLKRDAVSKEAYEQVKTDLATLNADIEIIKANIELTELRAPFD GVIGLRQVSIGTYASPTTVVAKLTKIAPLKVEFSVPERYAKQIKKGTNLNFSVEGTLDAF GAQVYAVESAIDPNLHQFTARALYPNVNRTLLPGRYASVLLKKDEIPNAIAIPTEAIVPE MGKDKVYLYKSGKAEPVDIITGIRTASEVQVIRGLHVGDTIITSGTLQLRTGLAVTLDNI EE >gi|222159267|gb|ACAB01000092.1| GENE 17 24935 - 27967 2799 1010 aa, chain + ## HITS:1 COG:VC0914 KEGG:ns NR:ns ## COG: VC0914 COG0841 # Protein_GI_number: 15640930 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Vibrio cholerae # 1 1001 1 1009 1036 650 37.0 0 MNISELSIRRPVLATVLTIIILLFGFIGYNYLGVREYPSVDNPIISVSCSYPGANADVIE NQITEPLEQNINGIPGIRSLSSVSQQGQSRITVEFELSVDLETAANDVRDKVSRAQRYLP RDCDPPTVSKADADAMPILMVALQSDKRSLLELSEIADLTVKEQLQTISDVSSVSIWGEK RYSMRLWLDPVKMAGYGITPVDVKNAVDNENVELPSGSIEGNTTELTIRTLGLMHTADEF NDLIVKEENNRIVRFSDIGRAELGPADIKSYMKMNGVPMVGVVVIPQPGANHIEIADAVY 
QRMEQMKKDLPEDVHYNYGFDNTKFIRASINEVKSTVYEAFVLVIIIIFLFLRDWRVTLV PCIVIPVSLIGAFFVMYLAGFSINVLSMLAIVLSVGLVVDDAIVMTENIYIRIEKGMTPK EAGIEGAKEIFFAVISTTITLVAVFFPIVFMDGMTGRLFREFSIVISGSVIISSFAALTF TPMLATKLLIKREKQSWFYAKTEPFFEGMNRLYSRSLAAFLSKRWIALPFTFITICLIGI LWNAVPAEMAPLEDRSQISINTRGAEGVTYEYIRDYTEDINQLVDSILPDAEAVTARVSS GSGNVRITLKDMNERNYTQMDVAEKISKAVQKKTMARSFVQQQSSFGGRRGSMPVQYVLQ ATNLEKLEEVLPKFMAKVYENPVFQMADVDLKFSKPEARIQINRDKASIMGVSTKNIAQT LQYGLSGQRMGYFYMNGKQYEILGEINRQQRNKPADLKAIYVRSSSGDMIQLDNLIELES GIAPPKLYRYNRFVSATISAGLADGKTIGQGLDEMDKIAKETLDDTFRTALSGDSKEYRE SSSSLMFAFILAILLIYLILAAQFESFKDPLIIMLTVPLAIAGALVFMYFGDITMNIFSQ IGIIMLIGLVAKNGILIVEFANQKQEAGEDKMSAIKDAALQRLRPILMTSASTVLGLIPL AFATGEGCNQRIAMGTAVVGGMVVSTLLTMYIVPAIYSYISTNRIKKLQE >gi|222159267|gb|ACAB01000092.1| GENE 18 27964 - 29304 1161 446 aa, chain + ## HITS:1 COG:no KEGG:BT_2253 NR:ns ## KEGG: BT_2253 # Name: not_defined # Def: putative outer membrane protein TolC # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 446 1 447 448 766 93.0 0 MKQKGICMKRIIYITIACAFLSVSFAKAQVLTLKECLEEGLQNNYSLRIIHNEELISKNN ATLGNAGYLPTLDFSAGYTGNLDNIETKARATGEITKHNGVYDQTVNVGLNLNWTIFDGF NISTTYKQLKELERQGETNTRIAIEDFIAALTSEYYNFIQQKIRLKNFHYAMSLSKERLR IAEASHLVGKFSGLDYQQAKVDFNADSAQYIKQQELLHSSRIQLNELMANNNVNQNIIIK DSTIDVHSDLQFDDLWNSTLATNASLLKADQNTVLSQLDYKKINSRNYPYLKLNTGYGYT FNKYDINTNIRRGELGFNAGITVGFNIFDGNRRREKRNASLAFKNRRLERQDLELALRSD LSNLWQAYRNNLQLLNLERQNLITAKDNHDIAMDRYIQGDLSGFEVREAQKSLLDAEERI LSAEYNTKLCEISLLQISGKITKYLE >gi|222159267|gb|ACAB01000092.1| GENE 19 29456 - 30550 773 364 aa, chain + ## HITS:1 COG:no KEGG:BT_2254 NR:ns ## KEGG: BT_2254 # Name: not_defined # Def: putative pectate lyase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 364 1 363 364 628 84.0 1e-178 MRKLSLFLLLLFILTNASAQKATDYRKQQNYKEWVHIAPKFDDDFFKTEEAQRIGDNVLL YQQTTGGWPKNIYMPAELTEQEYNAALKAKEDTNQSTIDNNATTTEIEYLSRLYLATQKE KYKEGVLNGIQYLLKSQYENGGWPQFYPRPKDYYVQITYNDNAMVRVMNQLRSIYEKKAP YTFLPDNICEQARNAFNKGIECILKTQVRQNGELTVWCAQHDRVTLEPCKARAYELPSLS 
GQESDNIVLLLMSLPHPSADVVKSIEGAIKWFQKSEIKGIQKEYFTNSDGKKDYRMVPCE DCPTLWARFYDLETNRPFFCDRDGIKKYDISEIGHERRNGYSWYNKDGSKVLKRYEKWKK EQNK Prediction of potential genes in microbial genomes Time: Wed May 18 03:19:10 2011 Seq name: gi|222159266|gb|ACAB01000093.1| Bacteroides sp. D1 cont1.93, whole genome shotgun sequence Length of sequence - 2004 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 763 409 ## Aave_1910 hypothetical protein - Prom 933 - 992 7.5 2 2 Tu 1 . - CDS 1396 - 2004 302 ## BDI_2962 hypothetical protein Predicted protein(s) >gi|222159266|gb|ACAB01000093.1| GENE 1 1 - 763 409 254 aa, chain - ## HITS:1 COG:no KEGG:Aave_1910 NR:ns ## KEGG: Aave_1910 # Name: not_defined # Def: hypothetical protein # Organism: A.avenae # Pathway: not_defined # 1 250 1 249 424 204 43.0 2e-51 MEKRTEGAWLIHHTKKLFDVRDTQDFEDIELAGKCGIFLSNLAADENTDLNKDKVDAIAK ASSIKKTEIETIKNKLVEAHLIDVAKNGGISVLGITTSTVLSHTANIFDSSNSNNYQKAA LELSERISDLPKPEKELKEYISDEYKLTSRDTVDLFSQCEEIGFIDYENLDNDSKFYFNG NLFRKENIQKTNAILNSLNSDDSRKIPEMNELLSKKGCITFEKAKAILGEVLLTKLQSIG MYDFNEVSNSHETK >gi|222159266|gb|ACAB01000093.1| GENE 2 1396 - 2004 302 202 aa, chain - ## HITS:1 COG:no KEGG:BDI_2962 NR:ns ## KEGG: BDI_2962 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 202 67 268 268 334 82.0 1e-90 EFEGFRKEAGLNAFTLSPQKWIETTCAIGLISKSGRYGGTFAHKDIAFKFASWISVEFEL YIVKEFQRLKEQEQAQIGWSAKRELSKINYHIHTDAIKQHLIPQEITPQQASMIYASEAD ILNVAMFGMTALEWRDTHPNLKGNIRDYASINELICLSNMENLNAVFINEGLTPKDRLIK LNQIAIQQMSILEDIKNKKLLK Prediction of potential genes in microbial genomes Time: Wed May 18 03:20:11 2011 Seq name: gi|222159265|gb|ACAB01000094.1| Bacteroides sp. 
D1 cont1.94, whole genome shotgun sequence Length of sequence - 197786 bp Number of predicted genes - 178, with homology - 172 Number of transcription units - 97, operones - 40 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 224 116 ## BDI_2962 hypothetical protein - Prom 289 - 348 8.6 2 2 Tu 1 . + CDS 570 - 1271 612 ## COG2755 Lysophospholipase L1 and related esterases + Term 1302 - 1341 3.5 - Term 1490 - 1536 0.6 3 3 Tu 1 . - CDS 1754 - 3784 2246 ## COG0556 Helicase subunit of the DNA excision repair complex - Prom 3833 - 3892 5.0 + Prom 3770 - 3829 6.3 4 4 Op 1 3/0.000 + CDS 3920 - 5218 1392 ## COG1541 Coenzyme F390 synthetase 5 4 Op 2 . + CDS 5342 - 5767 503 ## COG4747 ACT domain-containing protein + Term 5795 - 5855 11.2 + Prom 6052 - 6111 6.2 6 5 Op 1 . + CDS 6145 - 6948 750 ## COG4105 DNA uptake lipoprotein 7 5 Op 2 . + CDS 6963 - 7298 431 ## BT_0574 hypothetical protein + Term 7320 - 7389 12.2 + Prom 7328 - 7387 5.7 8 6 Tu 1 . + CDS 7407 - 7850 399 ## BT_0575 hypothetical protein + Term 7897 - 7938 6.0 - Term 7987 - 8041 11.8 9 7 Tu 1 . - CDS 8117 - 8683 345 ## BT_0576 hypothetical protein 10 8 Tu 1 . + CDS 8700 - 8882 106 ## + Term 8955 - 9000 1.1 11 9 Tu 1 . - CDS 8859 - 10625 1258 ## COG1388 FOG: LysM repeat - Prom 10747 - 10806 5.8 + Prom 10537 - 10596 4.3 12 10 Tu 1 . + CDS 10784 - 13609 2539 ## COG0178 Excinuclease ATPase subunit + Prom 13643 - 13702 6.7 13 11 Op 1 . + CDS 13799 - 14278 512 ## COG2606 Uncharacterized conserved protein + Prom 14291 - 14350 3.7 14 11 Op 2 . + CDS 14376 - 15935 1563 ## COG3104 Dipeptide/tripeptide permease + Term 15958 - 16014 14.3 - Term 15946 - 16002 14.3 15 12 Tu 1 . - CDS 16008 - 16505 174 ## PROTEIN SUPPORTED gi|229884790|ref|ZP_04504247.1| acetyltransferase, ribosomal protein N-acetylase - Prom 16554 - 16613 8.9 + Prom 16546 - 16605 7.1 16 13 Op 1 . + CDS 16671 - 16997 397 ## COG1695 Predicted transcriptional regulators 17 13 Op 2 . 
+ CDS 17019 - 18116 1031 ## BT_0583 hypothetical protein 18 13 Op 3 . + CDS 18167 - 19267 732 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold 19 13 Op 4 . + CDS 19286 - 19792 612 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family + Term 19822 - 19876 9.3 - Term 19815 - 19859 3.0 20 14 Op 1 . - CDS 19968 - 22070 2037 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases 21 14 Op 2 . - CDS 22119 - 23450 1111 ## COG0534 Na+-driven multidrug efflux pump - Term 23461 - 23508 5.5 22 15 Op 1 . - CDS 23535 - 25388 1809 ## COG0706 Preprotein translocase subunit YidC 23 15 Op 2 . - CDS 25446 - 27071 1792 ## COG0504 CTP synthase (UTP-ammonia lyase) - Prom 27127 - 27186 10.1 + Prom 27037 - 27096 4.8 24 16 Tu 1 . + CDS 27260 - 28768 1049 ## BT_0591 hypothetical protein + Prom 28832 - 28891 3.6 25 17 Tu 1 . + CDS 28917 - 29507 254 ## BT_1358 putative transcriptional regulator + Term 29658 - 29707 1.5 26 18 Op 1 1/0.087 + CDS 29834 - 31762 1554 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases 27 18 Op 2 2/0.000 + CDS 31810 - 32643 725 ## COG1596 Periplasmic protein involved in polysaccharide export 28 18 Op 3 1/0.087 + CDS 32654 - 35077 1890 ## COG0489 ATPases involved in chromosome partitioning + Prom 35432 - 35491 5.3 29 19 Op 1 7/0.000 + CDS 35518 - 36726 996 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases 30 19 Op 2 . + CDS 36783 - 37997 701 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis + Term 38040 - 38092 3.5 + Prom 38514 - 38573 5.8 31 20 Op 1 . + CDS 38598 - 39716 766 ## COG0381 UDP-N-acetylglucosamine 2-epimerase 32 20 Op 2 . + CDS 39706 - 40743 622 ## CLD_1855 hypothetical protein 33 20 Op 3 . + CDS 40740 - 41768 701 ## CLH_1453 hypothetical protein 34 20 Op 4 2/0.000 + CDS 41773 - 42777 682 ## COG2089 Sialic acid synthase 35 20 Op 5 . 
+ CDS 42782 - 43477 678 ## COG1083 CMP-N-acetylneuraminic acid synthetase 36 20 Op 6 . + CDS 43474 - 44508 832 ## COG1208 Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) 37 20 Op 7 . + CDS 44566 - 45660 294 ## Amet_0211 hypothetical protein 38 20 Op 8 . + CDS 45657 - 46865 220 ## BT_0386 putative F420H2-dehydrogenase + Term 46866 - 46927 3.3 + Prom 47553 - 47612 3.8 39 21 Tu 1 . + CDS 47687 - 48043 130 ## gi|237714527|ref|ZP_04545008.1| predicted protein 40 22 Tu 1 . - CDS 49005 - 49193 101 ## - Prom 49426 - 49485 5.4 + Prom 49015 - 49074 2.0 41 23 Op 1 . + CDS 49102 - 49458 133 ## gi|237714528|ref|ZP_04545009.1| flippase Wzx 42 23 Op 2 . + CDS 49452 - 50264 405 ## Huta_1119 hypothetical protein 43 23 Op 3 8/0.000 + CDS 50261 - 51475 245 ## COG0438 Glycosyltransferase 44 23 Op 4 . + CDS 51472 - 51993 219 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 45 23 Op 5 . + CDS 52006 - 53148 487 ## gi|237714532|ref|ZP_04545013.1| predicted protein 46 23 Op 6 . + CDS 53138 - 54274 654 ## BT_0385 hypothetical protein 47 23 Op 7 . + CDS 54276 - 55034 404 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 48 23 Op 8 . + CDS 55038 - 56213 258 ## BT_0386 putative F420H2-dehydrogenase 49 23 Op 9 . + CDS 56236 - 56448 126 ## + Term 56542 - 56580 1.0 50 24 Op 1 . + CDS 56877 - 57026 126 ## gi|237714536|ref|ZP_04545017.1| predicted protein 51 24 Op 2 . + CDS 57013 - 58011 741 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 52 24 Op 3 . + CDS 58017 - 58454 381 ## BT_0397 hypothetical protein + Term 58482 - 58512 -0.3 53 25 Op 1 . - CDS 58739 - 59167 274 ## COG3023 Negative regulator of beta-lactamase expression 54 25 Op 2 . - CDS 59172 - 59279 61 ## - Prom 59388 - 59447 2.1 - Term 59372 - 59409 -1.0 55 26 Tu 1 . 
- CDS 59456 - 59944 588 ## BT_1705 hypothetical protein - Prom 60126 - 60185 4.1 + Prom 59905 - 59964 6.3 56 27 Tu 1 . + CDS 60164 - 60382 192 ## BT_1704 hypothetical protein + Term 60390 - 60427 1.5 - Term 60502 - 60547 -0.1 57 28 Op 1 . - CDS 60564 - 62405 1336 ## BT_1703 hypothetical protein 58 28 Op 2 . - CDS 62444 - 63067 405 ## BT_1637 hypothetical protein - Prom 63195 - 63254 7.6 + Prom 63042 - 63101 10.1 59 29 Tu 1 . + CDS 63238 - 63378 138 ## gi|160884254|ref|ZP_02065257.1| hypothetical protein BACOVA_02232 + Term 63531 - 63581 5.1 + Prom 63544 - 63603 6.6 60 30 Op 1 . + CDS 63750 - 64175 118 ## BT_0616 hypothetical protein 61 30 Op 2 10/0.000 + CDS 64183 - 65136 1072 ## COG2878 Predicted NADH:ubiquinone oxidoreductase, subunit RnfB 62 30 Op 3 12/0.000 + CDS 65161 - 66498 1375 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC 63 30 Op 4 12/0.000 + CDS 66504 - 67511 1135 ## COG4658 Predicted NADH:ubiquinone oxidoreductase, subunit RnfD 64 30 Op 5 13/0.000 + CDS 67538 - 68197 893 ## COG4659 Predicted NADH:ubiquinone oxidoreductase, subunit RnfG 65 30 Op 6 3/0.000 + CDS 68212 - 68796 773 ## COG4660 Predicted NADH:ubiquinone oxidoreductase, subunit RnfE 66 30 Op 7 . + CDS 68819 - 69391 740 ## COG4657 Predicted NADH:ubiquinone oxidoreductase, subunit RnfA + Term 69432 - 69463 1.1 + Prom 69457 - 69516 7.5 67 31 Tu 1 . + CDS 69615 - 70649 896 ## COG1087 UDP-glucose 4-epimerase + Term 70682 - 70721 6.3 - Term 70750 - 70787 -0.9 68 32 Tu 1 . - CDS 70906 - 71730 696 ## COG1947 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase - Prom 71751 - 71810 4.4 + Prom 71877 - 71936 7.8 69 33 Tu 1 . + CDS 71965 - 73527 1478 ## COG0305 Replicative DNA helicase + Prom 73539 - 73598 2.0 70 34 Op 1 . + CDS 73640 - 76102 2896 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit 71 34 Op 2 . + CDS 76136 - 76873 1039 ## COG0217 Uncharacterized conserved protein 72 34 Op 3 . 
+ CDS 76883 - 77128 231 ## BT_0628 hypothetical protein + Term 77163 - 77207 9.0 + Prom 77204 - 77263 5.9 73 35 Op 1 . + CDS 77386 - 78639 1016 ## COG1914 Mn2+ and Fe2+ transporters of the NRAMP family 74 35 Op 2 . + CDS 78669 - 79430 827 ## COG0708 Exonuclease III + Term 79519 - 79569 2.4 75 36 Op 1 . - CDS 79425 - 79832 401 ## COG0432 Uncharacterized conserved protein 76 36 Op 2 . - CDS 79848 - 80114 252 ## gi|237714561|ref|ZP_04545042.1| conserved hypothetical protein - Prom 80162 - 80221 6.5 + Prom 80124 - 80183 10.0 77 37 Op 1 . + CDS 80207 - 80665 505 ## BT_0631 hypothetical protein + Term 80763 - 80808 0.5 + Prom 80710 - 80769 6.3 78 37 Op 2 . + CDS 80821 - 81021 338 ## BF2557 hypothetical protein + Term 81040 - 81113 8.1 + Prom 81028 - 81087 8.8 79 38 Tu 1 . + CDS 81150 - 82931 1879 ## COG0481 Membrane GTPase LepA + Term 82938 - 83003 15.3 + Prom 82984 - 83043 2.3 80 39 Op 1 . + CDS 83072 - 84250 781 ## BT_0633 putative Na+/H+ exchange protein 81 39 Op 2 . + CDS 84296 - 85597 977 ## COG3004 Na+/H+ antiporter 82 40 Op 1 . - CDS 85594 - 86829 1108 ## COG1322 Uncharacterized protein conserved in bacteria 83 40 Op 2 . - CDS 86829 - 87683 780 ## COG0024 Methionine aminopeptidase - Prom 87712 - 87771 2.9 84 41 Tu 1 . - CDS 87784 - 88407 196 ## BT_0639 hypothetical protein - Prom 88442 - 88501 6.0 85 42 Tu 1 . - CDS 88519 - 90159 950 ## COG1032 Fe-S oxidoreductase - Prom 90209 - 90268 7.4 + Prom 90187 - 90246 6.3 86 43 Tu 1 . + CDS 90271 - 90684 338 ## gi|237714572|ref|ZP_04545053.1| conserved hypothetical protein + Term 90706 - 90749 9.4 - Term 90698 - 90733 1.3 87 44 Tu 1 . - CDS 90756 - 91676 204 ## PROTEIN SUPPORTED gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family - Prom 91704 - 91763 6.5 - Term 91711 - 91764 2.6 88 45 Tu 1 . - CDS 91780 - 93150 1170 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase + Prom 93202 - 93261 6.5 89 46 Tu 1 . 
+ CDS 93398 - 96118 3025 ## COG0574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase + Term 96152 - 96208 15.3 - Term 96144 - 96191 9.1 90 47 Tu 1 . - CDS 96194 - 96790 253 ## COG4430 Uncharacterized protein conserved in bacteria + Prom 97084 - 97143 6.9 91 48 Tu 1 . + CDS 97197 - 97712 540 ## BT_0645 hypothetical protein + Term 97740 - 97788 8.3 + Prom 97735 - 97794 7.3 92 49 Tu 1 . + CDS 97909 - 98478 715 ## BT_0646 hypothetical protein + Term 98498 - 98551 12.1 - Term 98492 - 98531 4.1 93 50 Op 1 . - CDS 98622 - 99230 511 ## BT_0647 thiamine phosphate pyrophosphorylase 94 50 Op 2 . - CDS 99293 - 99985 822 ## COG0476 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 95 50 Op 3 . - CDS 99990 - 101114 987 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes - Prom 101135 - 101194 6.8 96 51 Op 1 . - CDS 101201 - 102898 1729 ## COG0422 Thiamine biosynthesis protein ThiC 97 51 Op 2 3/0.000 - CDS 102911 - 103684 930 ## COG2022 Uncharacterized enzyme of thiazole biosynthesis - Prom 103757 - 103816 4.1 98 51 Op 3 . - CDS 103866 - 104486 603 ## COG0352 Thiamine monophosphate synthase 99 51 Op 4 . - CDS 104491 - 104691 244 ## BT_0653 ThiS protein, involved in thiamine biosynthesis - Prom 104746 - 104805 1.5 100 52 Tu 1 . - CDS 104901 - 106883 1574 ## COG0642 Signal transduction histidine kinase - Prom 106908 - 106967 3.6 - Term 106962 - 106997 5.8 101 53 Op 1 . - CDS 107032 - 107649 432 ## PROTEIN SUPPORTED gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent 102 53 Op 2 . - CDS 107725 - 109464 971 ## BT_0656 hypothetical protein - Prom 109614 - 109673 7.7 + Prom 109437 - 109496 8.7 103 54 Op 1 . + CDS 109611 - 111989 2126 ## COG0210 Superfamily I DNA and RNA helicases 104 54 Op 2 . + CDS 112015 - 112707 629 ## BT_0658 hypothetical protein + Term 112759 - 112829 16.6 - Term 112927 - 112967 6.7 105 55 Tu 1 . 
- CDS 112979 - 113536 464 ## COG2249 Putative NADPH-quinone reductase (modulator of drug activity B) - Prom 113582 - 113641 6.5 + Prom 113524 - 113583 7.6 106 56 Tu 1 . + CDS 113713 - 114852 829 ## COG0019 Diaminopimelate decarboxylase + Term 114862 - 114904 -0.0 + TRNA 114971 - 115043 84.5 # Gly GCC 0 0 + TRNA 115078 - 115162 51.5 # Leu CAG 0 0 + TRNA 115183 - 115266 44.5 # Leu GAG 0 0 + TRNA 115286 - 115358 84.5 # Gly GCC 0 0 + TRNA 115391 - 115475 51.5 # Leu CAG 0 0 + TRNA 115510 - 115585 92.8 # Gly GCC 0 0 107 57 Op 1 1/0.087 + CDS 115830 - 116996 1349 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase 108 57 Op 2 . + CDS 117018 - 118190 1107 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase + Term 118264 - 118325 8.3 - Term 118252 - 118313 4.5 109 58 Tu 1 . - CDS 118450 - 118812 393 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 118903 - 118962 4.6 + Prom 118946 - 119005 7.4 110 59 Op 1 13/0.000 + CDS 119049 - 120314 1319 ## COG1538 Outer membrane protein 111 59 Op 2 11/0.000 + CDS 120337 - 121431 894 ## COG0845 Membrane-fusion protein 112 59 Op 3 . + CDS 121446 - 124571 2797 ## COG3696 Putative silver efflux pump + Term 124653 - 124686 3.1 113 60 Op 1 40/0.000 - CDS 124659 - 125990 912 ## COG0642 Signal transduction histidine kinase 114 60 Op 2 . - CDS 126011 - 126685 555 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 115 61 Tu 1 . - CDS 126827 - 128491 1414 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) - Prom 128546 - 128605 7.2 116 62 Tu 1 . - CDS 128774 - 130240 1483 ## COG3263 NhaP-type Na+/H+ and K+/H+ antiporters with a unique C-terminal domain - Prom 130308 - 130367 7.2 + Prom 130310 - 130369 5.7 117 63 Tu 1 . + CDS 130407 - 131591 923 ## COG2233 Xanthine/uracil permeases + Term 131694 - 131740 2.6 - Term 131682 - 131726 2.2 118 64 Tu 1 . 
- CDS 131799 - 133430 1690 ## COG1151 6Fe-6S prismane cluster-containing protein - Prom 133506 - 133565 6.3 119 65 Op 1 . - CDS 133581 - 134249 742 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases 120 65 Op 2 8/0.000 - CDS 134260 - 135378 737 ## COG5000 Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation - Prom 135503 - 135562 1.9 121 65 Op 3 . - CDS 135570 - 136910 901 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 122 65 Op 4 . - CDS 136986 - 137795 417 ## BT_0691 hypothetical protein 123 65 Op 5 . - CDS 137770 - 138573 670 ## BT_0692 calcineurin superfamily phosphohydrolase - Prom 138723 - 138782 5.5 124 66 Op 1 . + CDS 138875 - 139549 329 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 125 66 Op 2 . + CDS 139564 - 141882 1802 ## BT_0695 ABC transporter, permease protein 126 67 Tu 1 . - CDS 141922 - 143283 1211 ## COG0534 Na+-driven multidrug efflux pump - Prom 143307 - 143366 5.5 + Prom 143271 - 143330 4.8 127 68 Tu 1 . + CDS 143422 - 144069 620 ## COG0637 Predicted phosphatase/phosphohexomutase + Term 144165 - 144222 5.6 + Prom 144078 - 144137 9.3 128 69 Op 1 . + CDS 144335 - 145156 1049 ## COG0413 Ketopantoate hydroxymethyltransferase 129 69 Op 2 . + CDS 145165 - 146322 746 ## COG0477 Permeases of the major facilitator superfamily 130 70 Tu 1 . - CDS 146426 - 148642 2270 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases - Prom 148742 - 148801 3.4 + Prom 148668 - 148727 5.8 131 71 Tu 1 . + CDS 148749 - 150266 1828 ## BT_0701 hypothetical protein + Prom 150329 - 150388 8.2 132 72 Op 1 . + CDS 150420 - 151043 543 ## COG4845 Chloramphenicol O-acetyltransferase 133 72 Op 2 . + CDS 151078 - 152472 974 ## BT_0703 hypothetical protein 134 72 Op 3 . 
+ CDS 152518 - 153081 516 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases + Term 153093 - 153130 0.3 + Prom 153100 - 153159 2.2 135 72 Op 4 . + CDS 153179 - 154333 1172 ## COG1835 Predicted acyltransferases + Term 154378 - 154433 12.4 - Term 154368 - 154418 7.1 136 73 Op 1 . - CDS 154505 - 154975 381 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 137 73 Op 2 . - CDS 154996 - 156162 1430 ## COG0027 Formate-dependent phosphoribosylglycinamide formyltransferase (GAR transformylase) - Prom 156206 - 156265 7.6 + Prom 156556 - 156615 3.9 138 74 Op 1 42/0.000 + CDS 156798 - 158315 1797 ## COG0055 F0F1-type ATP synthase, beta subunit 139 74 Op 2 . + CDS 158328 - 158573 269 ## COG0355 F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) 140 74 Op 3 . + CDS 158657 - 159088 235 ## BT_0713 hypothetical protein 141 74 Op 4 . + CDS 159072 - 160211 1187 ## COG0356 F0F1-type ATP synthase, subunit a 142 75 Tu 1 . - CDS 160136 - 160306 60 ## - Prom 160330 - 160389 1.5 + Prom 160229 - 160288 2.9 143 76 Tu 1 . + CDS 160323 - 160580 461 ## BT_0715 ATP synthase C subunit + Prom 160621 - 160680 6.3 144 77 Op 1 38/0.000 + CDS 160702 - 161205 654 ## COG0711 F0F1-type ATP synthase, subunit b 145 77 Op 2 41/0.000 + CDS 161211 - 161771 471 ## COG0712 F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) 146 77 Op 3 42/0.000 + CDS 161771 - 163354 1926 ## COG0056 F0F1-type ATP synthase, alpha subunit + Term 163364 - 163426 -0.8 147 77 Op 4 . + CDS 163467 - 164363 875 ## COG0224 F0F1-type ATP synthase, gamma subunit + Term 164397 - 164448 12.2 + Prom 164369 - 164428 8.3 148 78 Tu 1 . + CDS 164517 - 166460 1701 ## BT_0720 hypothetical protein + Term 166507 - 166552 12.4 + Prom 166559 - 166618 4.6 149 79 Tu 1 . 
+ CDS 166646 - 169252 2008 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member - Term 169176 - 169217 8.7 150 80 Op 1 . - CDS 169348 - 169968 531 ## COG2431 Predicted membrane protein 151 80 Op 2 . - CDS 170034 - 170309 384 ## BT_0723 hypothetical protein - Term 170782 - 170824 1.1 152 81 Tu 1 . - CDS 170878 - 171810 558 ## COG1091 dTDP-4-dehydrorhamnose reductase - Prom 171993 - 172052 3.9 + Prom 171856 - 171915 3.6 153 82 Tu 1 . + CDS 172016 - 173185 1074 ## COG1690 Uncharacterized conserved protein + Term 173245 - 173276 1.8 - Term 173302 - 173342 1.2 154 83 Tu 1 . - CDS 173470 - 174744 878 ## BT_0727 hypothetical protein - Prom 174941 - 175000 6.4 - Term 174945 - 174974 1.4 155 84 Op 1 . - CDS 175166 - 176065 687 ## COG2207 AraC-type DNA-binding domain-containing proteins 156 84 Op 2 . - CDS 176096 - 176998 680 ## BT_0729 transcriptional regulator - Prom 177177 - 177236 10.9 + Prom 177025 - 177084 5.5 157 85 Tu 1 . + CDS 177204 - 178220 787 ## COG0451 Nucleoside-diphosphate-sugar epimerases 158 86 Tu 1 . - CDS 178326 - 178877 362 ## BT_0731 hypothetical protein - Prom 178986 - 179045 2.6 - Term 178984 - 179026 7.1 159 87 Op 1 . - CDS 179049 - 180692 1825 ## BT_0735 aspartate aminotransferase (EC:2.6.1.1) 160 87 Op 2 . - CDS 180715 - 182409 1964 ## COG2985 Predicted permease - Prom 182547 - 182606 7.5 + Prom 182490 - 182549 12.4 161 88 Tu 1 . + CDS 182579 - 184246 1833 ## COG2759 Formyltetrahydrofolate synthetase + Term 184307 - 184367 5.3 162 89 Tu 1 . - CDS 184052 - 184441 93 ## - Prom 184502 - 184561 7.6 - Term 184506 - 184545 7.5 163 90 Tu 1 . - CDS 184600 - 185880 1753 ## COG0112 Glycine/serine hydroxymethyltransferase - Prom 185914 - 185973 5.8 164 91 Op 1 . - CDS 186015 - 186755 653 ## BT_0739 hypothetical protein 165 91 Op 2 . 
- CDS 186772 - 187380 548 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family 166 91 Op 3 19/0.000 - CDS 187450 - 187911 535 ## COG1781 Aspartate carbamoyltransferase, regulatory subunit 167 91 Op 4 . - CDS 187908 - 188849 926 ## COG0540 Aspartate carbamoyltransferase, catalytic chain - Prom 188881 - 188940 2.7 - Term 188884 - 188934 6.2 168 92 Op 1 . - CDS 188962 - 189744 563 ## BT_0592 hypothetical protein 169 92 Op 2 . - CDS 189768 - 190112 387 ## BT_0593 hypothetical protein - Prom 190146 - 190205 5.7 170 93 Tu 1 . - CDS 190341 - 190751 403 ## gi|237714657|ref|ZP_04545138.1| conserved hypothetical protein - Prom 190849 - 190908 9.9 + Prom 190744 - 190803 4.1 171 94 Tu 1 . + CDS 190880 - 191833 535 ## BT_0595 integrase + Term 191871 - 191907 5.0 + Prom 191893 - 191952 5.7 172 95 Op 1 . + CDS 192187 - 192753 503 ## BT_0596 putative transcriptional regulator 173 95 Op 2 13/0.000 + CDS 192826 - 193713 810 ## COG1209 dTDP-glucose pyrophosphorylase 174 95 Op 3 9/0.000 + CDS 193763 - 194332 468 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes 175 95 Op 4 11/0.000 + CDS 194338 - 195204 723 ## COG1091 dTDP-4-dehydrorhamnose reductase 176 95 Op 5 . + CDS 195212 - 196285 769 ## COG1088 dTDP-D-glucose 4,6-dehydratase + Prom 196323 - 196382 5.3 177 96 Tu 1 . + CDS 196445 - 196615 127 ## BT_0467 hypothetical protein + Prom 196629 - 196688 5.6 178 97 Tu 1 . 
+ CDS 196845 - 197784 161 ## BT_0467 hypothetical protein Predicted protein(s) >gi|222159265|gb|ACAB01000094.1| GENE 1 2 - 224 116 74 aa, chain - ## HITS:1 COG:no KEGG:BDI_2962 NR:ns ## KEGG: BDI_2962 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 72 1 72 268 104 69.0 9e-22 MAKITVQTTDVTILKINDTDYISLTDIAKYKTTDANIVIANWLRNRMTIEYLGLWEILYN PHFKPLEFKPFRFQ >gi|222159265|gb|ACAB01000094.1| GENE 2 570 - 1271 612 233 aa, chain + ## HITS:1 COG:CAC3448 KEGG:ns NR:ns ## COG: CAC3448 COG2755 # Protein_GI_number: 15896689 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Clostridium acetobutylicum # 58 221 15 178 190 61 32.0 1e-09 MDKRRLSQWMLLAVVCLSFSIGESYAQKNDFGNLARYSKQNAALPKATKKDKRVIFMGNS ITEGWVRTHPDFFKSNGYIGRGISGQTSYQFLLRFREDVINLSPALVVINAGTNDVAENT QTYNEDYTFGNIVSMTELAKANKIKVILTSVLPAAAFKWRMEIKDVPQKIKSLNDRIEAY AKANKIPFVNYYKAMVDENQALNPQYTKDGVHPTGEGYDIMEPLIKAAIEKAL >gi|222159265|gb|ACAB01000094.1| GENE 3 1754 - 3784 2246 676 aa, chain - ## HITS:1 COG:BS_uvrB KEGG:ns NR:ns ## COG: BS_uvrB COG0556 # Protein_GI_number: 16080570 # Func_class: L Replication, recombination and repair # Function: Helicase subunit of the DNA excision repair complex # Organism: Bacillus subtilis # 3 668 5 658 661 728 55.0 0 MNYELTSAYKPTGDQPEAIAQLTEGVLQGVPAQTLLGVTGSGKTFTIANVIANINKPTLI LSHNKTLAAQLYSEFKGFFPNNAVEYYVSYYDYYQPEAYLPSSDTYIEKDLAINDEIDKL RLAATSALLSGRKDVVVVSSVSCIYGMGNPSDFYKNVIEIERGRMMDRNVFLRRLVDSLY VRNDIDLNRGNFRVKGDTVDIYLAYTDNLLRVTFWGDEIDGVEEVDPITGVTIAPFDAYK IYPANLFMTTKEATLRAIHEIEDDLTKQVAFFESIGKEYEAKRLYERVTYDMEMIRELGH CSGIENYSRYFDGRAAGTRPYCLLDFFPDDFLIVIDESHVSVPQIRAMYGGDRARKINLV EYGFRLPAAMDNRPLKFEEFQEMAKQVIYVSATPADYELIQSEGIVVEQVIRPTGLLDPV IEVRPSLNQIDDLMEEIQLRIEKEERVLVTTLTKRMAEELAEYLLNNNVRCNYIHSDVDT LERVKIMDDLRQGIYDVLIGVNLLREGLDLPEVSLVAILDADKEGFLRSHRSLTQTAGRA ARNVNGKVIMYADRMTDSMKLTIDETNRRREKQLAYNEANGITPQQIKKARNLSVFGNAK EADELLKERHAYVEPSTPNIAADPVVQYMSKAQMEKSIERTRKLMQEAAKKLEFIEAAQY 
RDELLKLEDLMKEKWG >gi|222159265|gb|ACAB01000094.1| GENE 4 3920 - 5218 1392 432 aa, chain + ## HITS:1 COG:MTH1855 KEGG:ns NR:ns ## COG: MTH1855 COG1541 # Protein_GI_number: 15679843 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Methanothermobacter thermautotrophicus # 1 431 1 431 433 506 56.0 1e-143 MIWNESIECMDRESLRKIQSIRLKKIVDYVYHNTPFYRKKMQEMGITPDDINSIDDIVKL PFTTKHDLRDNYPFGLCAVPMSQIVRIHASSGTTGKPTVVGYTRKDLASWAECISRAFTA YGAGRSDIFQVSYGYGLFTGGLGAHAGAENIGASVIPMSSGNTEKQITLMHDFGSTVLCC TPSYALYLADAIKDSGLPREEFQLKVGAFGAEPWTENMRHEIEEKLGIKAYDIYGLSEIA GPGVGYECECQNGTHLNEDHYFPEIIDPNTLQPVEPGQTGELVFTHLTKEGMPLLRYRTR DLTALHHDKCSCGRTLVRMDRILGRSDDMLIIRGVNVFPTQIESVILEMAEFEPHYLLIV GRENNTDTMELQVEVRPEFYSDEINKMLALKKKLGGRLQSVLGLGVNVKLVEPRSIERSV GKAKRVIDNRKI >gi|222159265|gb|ACAB01000094.1| GENE 5 5342 - 5767 503 141 aa, chain + ## HITS:1 COG:MTH1854 KEGG:ns NR:ns ## COG: MTH1854 COG4747 # Protein_GI_number: 15679842 # Func_class: R General function prediction only # Function: ACT domain-containing protein # Organism: Methanothermobacter thermautotrophicus # 1 141 1 143 143 110 41.0 6e-25 MVAKQLSIFLENKSGRLTEVTEVLAKENINLSALCIAENADFGILRGIVSDPDRAYKALK DNHFAVNVTDVVGISCPNIPGALAKVLGYLSDEGVFIEYMYSFANNNIANVVIRPSNLDK CIEVLKEKKVDLLAASDLYKL >gi|222159265|gb|ACAB01000094.1| GENE 6 6145 - 6948 750 267 aa, chain + ## HITS:1 COG:BMEI0587 KEGG:ns NR:ns ## COG: BMEI0587 COG4105 # Protein_GI_number: 17986870 # Func_class: R General function prediction only # Function: DNA uptake lipoprotein # Organism: Brucella melitensis # 8 258 40 282 309 59 22.0 7e-09 MKKNIIITLLAAATLTSCGEYNKLLKSTDYEYKYEAAKNYFAKGQYNRSATLLNELITIL KGTDKAEESLYMLGMSYYNQKDYQTAAQTFITYFNTYPRGTFTELARFHAGKALFLDTPE PRLDQSSTYQAIQQLQMFMEYFPNSTKKQEAQDMIFALQDKLVLKELYSARLYYNLGNYL GNNYESCVITAQNALKDYPYTDYREELSILILRARHEMAIYSVEDKKMDRYRETVDEYYA FKNEFPESKYLKEAEKIFNESQKVIKD >gi|222159265|gb|ACAB01000094.1| GENE 7 6963 - 7298 431 111 aa, chain + ## HITS:1 COG:no KEGG:BT_0574 NR:ns ## KEGG: BT_0574 # Name: 
not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 111 1 111 111 186 100.0 3e-46 MDYKKTNAPATTVTRDMMELCADTGNVYETVAIIGKRANQISVEIKNDLSKKLAEFASYN DNLEEVFENREQIEISRYYEKLPKPDLIATQEYIEGKIYYRNPAKEKEKLQ >gi|222159265|gb|ACAB01000094.1| GENE 8 7407 - 7850 399 147 aa, chain + ## HITS:1 COG:no KEGG:BT_0575 NR:ns ## KEGG: BT_0575 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 147 1 147 147 222 82.0 3e-57 MIQRIQTVYLLIVAGLLITAMCMPIGYFIDTMGEHPFKALGLEINDAFQSTWGLFGILML STIVAVATIFLYKNRMLQIRMTIFNSLLLVGYYIAALAFYFALKNDANMFRVGWALCLPL ISIILNILAVRAIGRDEVMVKAADRLR >gi|222159265|gb|ACAB01000094.1| GENE 9 8117 - 8683 345 188 aa, chain - ## HITS:1 COG:no KEGG:BT_0576 NR:ns ## KEGG: BT_0576 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 188 57 240 240 302 79.0 4e-81 MGITGGLTARYISEKYFAMICGAQVELNVSQRGWDQLFETVSLDDNGYEVTSKDPSKTYT RKMTYIDIPFLAHLAFGRDRGLQFFVHAGPQISFLISESETIKGIDMNTLSNTQKAVYGV KIQNKFDYGIAGGGGVELRTKKAGSFIVEGRYYFALSDFYSTTKKDYFARAAHGTITIKL TYLFDLKK >gi|222159265|gb|ACAB01000094.1| GENE 10 8700 - 8882 106 60 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLGVKETLFKLTPPLTPTDKLLRASPIWAVAGRANPVSRSVAIKNGFIFIIFVLFEVKFY >gi|222159265|gb|ACAB01000094.1| GENE 11 8859 - 10625 1258 588 aa, chain - ## HITS:1 COG:BB0625_1 KEGG:ns NR:ns ## COG: BB0625_1 COG1388 # Protein_GI_number: 15594970 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: FOG: LysM repeat # Organism: Borrelia burgdorferi # 32 187 165 312 350 62 26.0 2e-09 MKPINRIFLFLLFISVSYAISYAQENQSYFLHTIEKGQSLYSISKMYNVTTSDIIRLNPG CDEKIYAGQTIKIPTGKESQKGETFHTIQAGETLYKLTTMYNVSAKDICEANPGLSAENF RIGQVILIPQKKEEQTDAVVQTPTEQSTIQGPVVPRCKDMHKVKRKETIFSVSREYGISE QELIDANPELKKGMKKGQFLCIPYPAATTVQPTQKEDPYAVPPSNSELFRKNKETPQKLS TIKAALLLPFQEDKRMVEYYEGFLMAVDSLKRTGVSLDLYVYDSGKDISTLNTILAKNEM KSMNIIFGPMHQNQIKPLSDFAEKNDIRLVIPFSKKGEEVFNNPAVYQINTPQSYLYSEV 
YEHFTRQFPNAHVIFIESANSDKEKAEFISGLKQELKNKGISMKSVSESATKEILKTTLR NDKENIFIPTSGNNVILIKILPQLTLLVRENPVEKIHLFGYPEWQTYTKDYLESFFELDT YFYSSFYTNTLFPAAVQFTNNYHRWYSKDLVSEWPNYAMLGFDTGFFFLKGLSRYGSELE NNLTKMNLTPIQTGFKFQRVNNWGGFINKKVFFIRFTKNFELVKLDFE >gi|222159265|gb|ACAB01000094.1| GENE 12 10784 - 13609 2539 941 aa, chain + ## HITS:1 COG:BH3594 KEGG:ns NR:ns ## COG: BH3594 COG0178 # Protein_GI_number: 15616156 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Bacillus halodurans # 7 936 6 935 957 1023 56.0 0 MQETEYINVYGARVHNLKDIDAEIPRNSLTVITGLSGSGKSSLAFDTIFAEGQRRYIETF SAYARNFLGNLERPDVDKITGLSPVISIEQKTTNKNPRSTVGTTTEIYDYLRLLYARAGI AYSYLSGEEMVKYTEEQILDLILKDYKGKKIYLLAPLVRSRKGHYRELFEQIRKKGYLYV RVDGEVREITHGMKLDRYKNHDVEVVIDKLVVAEKDDRRLKQSVATAMRQGDGLMMILDA QSESIRHYSKRLMCPVTGLSYREPAPHNFSFNSPQGACPKCKGLGVVNQIDVDKVIPDRE LSIYEGAIAPLGKYKNAMIFWQIGALLEKYDVTLKTPVKELPDDAIDEVLYGSDERIKIK SSLIGTSSDYFVTYEGVVKYIQMLQEKDASATAQKWAEQFAKTTVCPECKGARLNKEALH FRIHDKNINELANMDINELYDWLMKVDEFLSDKQKKISVEILKEIRTRLKFLLDVGLDYL ALNRSSVSLSGGESQRIRLATQIGSQLVNVLYILDEPSIGLHQRDNLRLINSLKELRDMG NSVIVVEHDKDMMLAADYVIDMGPKAGRLGGEVVFAGTPQEMLKTDTMTSQYLNGKMKIE IPAKRRKGNGKSIWLKGAKGNNLKNVDVEFPLGKLICVTGVSGSGKSTLINETLQPILSQ KFYRSLQDPLEYDSIEGLENIDKVVNVDQSPLGRTPRSNPATYTGVFSDIRNLFVGLPEA KIRGYKPGRFSFNVSGGRCEACTGNGYKTIEMNFLPDVYVPCEVCHGKRYNRETLEVRFK GKSIADVLDMTINRAVEFFENVPQILNKIKVLQDVGLGYIKLGQSSTTLSGGESQRVKLA TELSKRDTGKTLYILDEPTTGLHFEDIRVLMGVLNKLVDKGNTVIVIEHNLDVIKMADYI IDMGPEGGKGGGELLSYGTPEEVAKSPKGYTPKFLREELGL >gi|222159265|gb|ACAB01000094.1| GENE 13 13799 - 14278 512 159 aa, chain + ## HITS:1 COG:lin0783 KEGG:ns NR:ns ## COG: lin0783 COG2606 # Protein_GI_number: 16799857 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 5 159 4 158 158 167 55.0 9e-42 MKINKTNAARLLDKAKIAYELIPYEVDENDLSAVHVAADLGENIEQVFKTLVLHGDKSGY FVCVIPGEHEVDLKLAAKASGNKKCDLIPVKELLPLTGYIRGGCSPIGMKKHFPTYIHET SREFPYIYVSAGVRGLQIKIAPEDLIRESKAEICRLFEE 
>gi|222159265|gb|ACAB01000094.1| GENE 14 14376 - 15935 1563 519 aa, chain + ## HITS:1 COG:CAC0751 KEGG:ns NR:ns ## COG: CAC0751 COG3104 # Protein_GI_number: 15894038 # Func_class: E Amino acid transport and metabolism # Function: Dipeptide/tripeptide permease # Organism: Clostridium acetobutylicum # 3 476 22 478 521 147 27.0 5e-35 MFSKHPKGLIAAALANMGERFGFYIMMAILTLFISAKFGLSETTTGYIYSAFYASIYILA LAGGVIADKTKNFKGTILVGLILMAIGYLIIAIPTPTPVPSMALYLGLTCFGLLVIAFGN GLFKGNLQALVGQMYDDPKYSNLRDAGFQIFYMFINIGAVFAPFIAIGVRNWWLKVNNFD YDATLPELCHQYLEKGNDMAPQAMENLTTLAKSVVLDGTPVTDMGMFVNNYLDVFNRGFQ YAFMAAIGAMLVSLIIYMANKKRFPDPATKLETSKGTAAVNKEEIQMSATEIKQRIYALF AVFGVVIFFWLSFHQNGYSLTYFARDYVDLSVINIDLGFTQIKGAEIFQSVNPFFVVFLT PFIMWMFGSMKKNGKEPSTPMKIAIGMGIAALAYVFLMVFSFTLPSKEVLGTMSAAEINA IRVTPWIMIGLYFILTVAELFISPLGLSFVSKVAPPHLQGLMQGCWLAATAVGNSLLFIG GILYTTVPIWACWLVFVGATGASMIVMLSMVKWLERVAK >gi|222159265|gb|ACAB01000094.1| GENE 15 16008 - 16505 174 165 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229884790|ref|ZP_04504247.1| acetyltransferase, ribosomal protein N-acetylase [Sebaldella termitidis ATCC 33386] # 13 161 15 160 169 71 30 2e-11 MFTIRKATVADCELIHMMAKEVFPATYKDILSPEQLDYMMDWMYAPSNVRKQMEEEGHVY SIAYKEDEPCGYVSVQQQGKDVFHLQKIYVLPRFQGAHCGSFLFKEAIKCIKGMHPEPCL MELNVNRNNKALQFYEYMGMRKLREGDFPIGNGYYMNDYIMGLDI >gi|222159265|gb|ACAB01000094.1| GENE 16 16671 - 16997 397 108 aa, chain + ## HITS:1 COG:BH0406 KEGG:ns NR:ns ## COG: BH0406 COG1695 # Protein_GI_number: 15612969 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 5 108 67 170 174 77 36.0 5e-15 MKVDNVKSQMRKGMLEYCIMLLLHKEPAYASDIIQKLKEAQLIVVEGTLYPLLTRLKNDD LLSYEWVESTQGPPRKYYKLTEQGETFLGELEISWKELNDTVNHIANR >gi|222159265|gb|ACAB01000094.1| GENE 17 17019 - 18116 1031 365 aa, chain + ## HITS:1 COG:no KEGG:BT_0583 NR:ns ## KEGG: BT_0583 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 364 1 363 364 580 82.0 1e-164 
MKKTLTINLGGIVYHIDDDAYRLLDNYLSNLKHYFRKQEGAEEIVDDIEMRIAELFAEKV TEGKQVITVSDVEEIIARVGKPEDFGIADEDTDSQKRTEQASSANQSNTQTSAQRRWFRD PDNKLLGGVAAGLAAYFGWDITLVRILMIILVFVPYCPMIILYIIGWLVIPEARTAAEKL SMRGEAVTIENIGKTVTDGFERVADGVNNYVNSGKPRTFLQKIGDVFVSIAAVLFKIFLV ALVILCCPVLFVLAVVLVALVFAAIAVAVSGGALLYELLPAIDWMPVASVSPMMTLLGTI AGVALIGIPLGAFLYTILRQLFHWSPMGTGLKWSLLILWILGAVIMVINLSALGWQLPLY GLHCC >gi|222159265|gb|ACAB01000094.1| GENE 18 18167 - 19267 732 366 aa, chain + ## HITS:1 COG:YPO1228 KEGG:ns NR:ns ## COG: YPO1228 COG2220 # Protein_GI_number: 16121515 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Yersinia pestis # 113 353 84 335 342 169 34.0 8e-42 MYPDSVSGVHFLFPLIHRKSRFFSKFVNTYSGNKMGEHICFRRNERLATVNPYWRGNPMV RGRFFNRQHRFRPGMGSVLKWRLSPNPQRKEKKTVKWDPKVCYLRSLDAMVGDSLIWLGH NSFFLQLAGKRIMFDPVFGSIPFVKRQSEFPANPDIFTEIDYLLVSHDHFDHLDKQSIAR LLKNNPQMKLFCGLGTGELIQGWFPEMKVIEAGWYQQIEDEGLKITFLPAQHWSKRSVRD GGQRLWGAFMLQGNGISLYYSGDTGYSSHFREIPDMFGAPDYALLGIGAYKPRWFMRPNH ISPYESLTAAEEMHAGLTIPMHYGTFDLSDEPLHDPPKVFAAEAKKRKIPVEIPYLGEIV KLKKQK >gi|222159265|gb|ACAB01000094.1| GENE 19 19286 - 19792 612 168 aa, chain + ## HITS:1 COG:FN0320 KEGG:ns NR:ns ## COG: FN0320 COG1853 # Protein_GI_number: 19703665 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Fusobacterium nucleatum # 2 140 4 139 180 77 36.0 1e-14 MKKLEVKDLKENFFEAIGKEWMLVTAGTKEKFNTMTASWGGIGWLWNKPVAFVFIRPERY TYEFVEENDYLTLSFLGEENKKIHAICGSKSGRNIDKVKETGLKPVFTENGNVLFEQARL SLECKKLYADGIKPECFLDKESLEKWYGGAHGGFHKMYIVEIENIYSE >gi|222159265|gb|ACAB01000094.1| GENE 20 19968 - 22070 2037 700 aa, chain - ## HITS:1 COG:XF2260 KEGG:ns NR:ns ## COG: XF2260 COG1506 # Protein_GI_number: 15838851 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Xylella fastidiosa 9a5c # 51 694 52 707 709 335 31.0 
2e-91 MRQANLFMMSAAMLLAACGGTKDAGKTDQVLIEKSDIKIEGKRMTPEALWAMGRIGEFAV SPDGKKIAYTVAYYSVPENKSNREVFVMNADGSENQQITHTPYQENEVTWIKGGTKLAFL SNDNGSSQLYEMNPDGSGRKQLTHYDGDIEGYSISPDGKKLLFISQVKTKESTADKYPDL PKATGIIVTDLMYKHWDEWVTTAPHPFVADFDGNGISNIVDILNGEPYESPMKPWGGIEQ LAWNTASDKVAYTCRKKTGLEYAISTNSDIYVYDLNTQKTENITEENKGYDTNPQYSPDG KYIAWQSMERDGYEADLNRLFIMNLETGEKRFVSKAFESNVDAFVWGADAKMIYFTGVWH GESQIYALDLANDSVKAITSGMYDYEGVALFGDKLIAKRHSMSMGDEIYTVALDGSTTQL TQENKQIYDQLEMGKVEGRWMKTTDGKQMLTWVIYPPQFDPNKKYPTLLFCEGGPQSPVS QFWSYRWNFQIMAANDYIIVAPNRRGLPGFGVEWNEQISGDYGGQCMKDYFTAIDEMAKE PYVDKDRLGCVGASFGGFSVYWLAGHHDKRFKAFIAHDGIFNMEMQYLETEEKWFANWDM GGAYWEKQNPMAQRTFANSPHLFVEKWDTPILCIHGEKDYRILANQAMAAFDAAVMRGVP AELLIYPDENHWVLKPQNGVLWQRTFFEWLDKWVKKAPSK >gi|222159265|gb|ACAB01000094.1| GENE 21 22119 - 23450 1111 443 aa, chain - ## HITS:1 COG:PAB0243 KEGG:ns NR:ns ## COG: PAB0243 COG0534 # Protein_GI_number: 14520582 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Pyrococcus abyssi # 8 443 6 463 463 124 25.0 5e-28 MKTKYSYKQIWTISYPILISLIMEQMIGMTDTAFLGRVGEIELGASAIAGVYYLAIFMMA FGFSIGAQILIARRNGEGNYKEIGPIFYQGIYFLLAMAVILFTFSIVFSPYILKNIISSP HIYDAAESYIHWRVYGFFFSFVMVMFRAFFVGTTQTKTLTLNSIVMVLSNVVFNYILIFG KFGFPQLGIAGAAIGSSLAEMVSVIFFIIYTWKRIDCKKYALNILPKFHGKTLKRILNVS VWTMIQNFVSLSTWFMFFLFVEHLGERSLAIANIIRNVSGIPFMIAMAFASTCGSLVSNL IGAGEQDCVRGTIKQHIRIGYIFVLPILVFFCLFPDLILRIYTDIPDLRAASVPSLWVLC SAYLVLVPANVYFQSVSGTGNTRTALAMELCVLAIYVTYSAYFIMYLRMDVAFAWTTECV YGTFILLFCYWYMKKGNWQKKKI >gi|222159265|gb|ACAB01000094.1| GENE 22 23535 - 25388 1809 617 aa, chain - ## HITS:1 COG:BB0442 KEGG:ns NR:ns ## COG: BB0442 COG0706 # Protein_GI_number: 15594787 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YidC # Organism: Borrelia burgdorferi # 86 561 72 533 544 148 27.0 3e-35 MDKNTITGLVLIGILLVGFSFLSRPSEEQIAAQKRYYDSIAVVQQQEEALKAKTEAALAN SQKEVASAADSSALFFNALHGTDSKISIQNNVAEITFATKGGRVYSAMLKDYMAQDKKTP 
IMLFDGDDASMNFNFYNKAGAIQTKDYFFEAVNKTDSSVTMRLAADSASYIDFIYTLKPD SYLMNFEIKATGMEDKLASTKYVDIDWSQRARQLEKGFTYENRLSELTYKVTGDNVDNLS AAKDDSQDLPGRIDWVAFKNQFFSSVFIAEQDFDKVSVKSKMEQQGSGYIKDYSAEMNTF FDPSGKEPTEMYFYFGPNHFKTLKALDKGRDEKWELHRLVYLGWPLIRWINQFITINVFD WLSGWGLSMGIVLLILTIMVKVLVYPATWKTYMSSAKMRVLKPKIDEINKKYPKQEDAMK KQQEVMSLYSQYGVSPMGGCLPMLLQFPILMALFMFVPSAIELRQQSFLWADDLSTYDAI ITFPFHIPFLGNHLSLFCLLMTLTNILNTKYTMSMQDTGAQPQMAAMKWMMYLMPIMFLF VLNDYPSGLNYYYFVSTLISVGTMILLRRTTDETKLLAILEAKKKDPKQMKKTGFAARLE AMQKQQEQLQQQRQNKK >gi|222159265|gb|ACAB01000094.1| GENE 23 25446 - 27071 1792 541 aa, chain - ## HITS:1 COG:BS_ctrA KEGG:ns NR:ns ## COG: BS_ctrA COG0504 # Protein_GI_number: 16080768 # Func_class: F Nucleotide transport and metabolism # Function: CTP synthase (UTP-ammonia lyase) # Organism: Bacillus subtilis # 4 533 2 530 535 596 53.0 1e-170 MGETKYIFVTGGVASSLGKGIISSSIGKLLQARGYNVTIQKFDPYINIDPGTLNPYEHGE CYVTVDGHEADLDLGHYERFLGIQTTKANNITTGRIYKSVIDKERRGDYLGKTIQVIPHI TDEIKRNVKLLGNKYKFDFVITEIGGTVGDIESLPYLESIRQLKWELGKNALCVHLTYVP YLAAAGELKTKPTQHSVKELQSVGIQPDVLVLRAEHPLSDGLRKKVAQFCNVDDKAVVQS IDAETIYEVPILMQAQGLDSTILEKMGLPVGETPGLGPWRKFLERRHAAETKEPINIALV GKYDLQDAYKSIREALSQAGTYNDRKVEVHFVNSEKLTDENVAEALKGMAGVMIGPGFGQ RGIDGKFVAIKYTRTHDIPTFGICLGMQCIAIEFARNVLGYADANSREMDEKTPHNVIDI MEEQKAITNMGGTMRLGAYECVLQKGSKAYQAYGQEHIQERHRHRYEFNNDYKDRYEAAG MKCVGINPESDLVEIVEIPALKWFVGTQFHPEYGSTVLNPHPLFVAFVKAAIENEKTTAK G >gi|222159265|gb|ACAB01000094.1| GENE 24 27260 - 28768 1049 502 aa, chain + ## HITS:1 COG:no KEGG:BT_0591 NR:ns ## KEGG: BT_0591 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 502 1 483 483 761 78.0 0 MMRITTLCSAFILSILSLSLFAQESVDPSPLPTDSLQPAPEAIKPAKTVKSSTVVAKARV VKADTLSAELQKYLMLKLNMSGPTPKLDTVSYLYNKYVAQLDYLNDLSVPPRYIPSDPDY FRLFTPLAYYYAPMAQYSKLEWKPMQWDTTPQLTAELLPYDTLAFTKTQRAEKIVNSALM DLYLERPNLVVTTEDRIMSRDVFRHDVKPKISPKANVIHLFQPENMDDNVGKANMKISRP NWWVTGGNGSLQISQNHLSDNWYKGGESNFSGLATFQIFANYNDNEKVLFENQLEAKVGM 
TSTPSDEYHKYLFNTDQFRLYNKLGLRALKNWYYTISSEFKTQFFNGYKANSEELVSSFM SPADLAVSVGMDYKLSKKKFNLSVFMAPLTYNLRYIGNSEVDETKFGLEKGKCSKNDFGS QLQSTLNWTIISAVTLESRLNYLTNYHWARVEWENTFNFVLNRYLSTKLYIHTRFDDSSK PTEGDSFFQLKELLSFGINYKW >gi|222159265|gb|ACAB01000094.1| GENE 25 28917 - 29507 254 196 aa, chain + ## HITS:1 COG:no KEGG:BT_1358 NR:ns ## KEGG: BT_1358 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 190 1 190 190 300 78.0 1e-80 MIIKKNDDRDLLSTDVIGSSVARSKRWLVAIVRICHEKKTSERLTKMGIENFLPIQQEVH QWSDRRKVVDRVLLPMMIFVHVDPQEQKEVLTLSAISRYMVLRGESTPAVVPDQQMLRFK FMLDYSDETISMSTSPLAPGERIRVIKGPLAGLEGELVHVNGKSKVAVRLTMLGCACVDI PAGCVEPVSENGDLRD >gi|222159265|gb|ACAB01000094.1| GENE 26 29834 - 31762 1554 642 aa, chain + ## HITS:1 COG:FN1696 KEGG:ns NR:ns ## COG: FN1696 COG1086 # Protein_GI_number: 19705017 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Fusobacterium nucleatum # 70 631 58 605 607 331 37.0 3e-90 MEKTLKDIVQYVRKNYFSYWVILGIDTAISVLCSLVAYAVIHYMAHVPMTDWMLCKFACV SLVASVAGSLLFHTYRNTIRFSQARELWRIMCAVLFKIACLVIISFGVIYETQLPYNYKI SYLLFDGLLTLVILTTFRVSLIIVYDFLLDWVNKKNTRILIYGTNEESVALKLRLRDSAH YKVTGFYVYGKNNSRRRLADLPVYYFENESDVDYIMRKRGIKGILFARYEDTRLEEKRLL EYCKNNELKTLIAPTISEADSDGNFHQWVRPIKIEDLLGRAEININLRQVAEEFRGKVVL VTGAAGSIGSELCRQLVQMGIQKLIMFDSAETPLHNVRLEFEKKYPTIDFVPVIGDVRVK ERVRMVFESYHPQIVFHAAAYKHVPLMEENPCEAVLVNVTGSRQVADMAVEYGAEKMIMV STDKAVNPTNVMGCSKRLAEIYVQSLGCAIREGKVKGHTKFITTRFGNVLGSNGSVIPRF KEQIENGGPVTVTHPDIIRFFMTIPEACRLVMEAATMGEGNEIFVFEMGKAVKIVDLATR MIELAGYRPGEDIKIEFTGLRPGEKLYEEVLSDKENTIPTENKKIMIAKVRRYEYADIFD TYAEFERLSRAVKIMDTVRLMKKIVPEFKSKNSPRFEVLDKE >gi|222159265|gb|ACAB01000094.1| GENE 27 31810 - 32643 725 277 aa, chain + ## HITS:1 COG:PM1016 KEGG:ns NR:ns ## COG: PM1016 COG1596 # Protein_GI_number: 15602881 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in 
polysaccharide export # Organism: Pasteurella multocida # 50 244 76 253 387 61 30.0 2e-09 MNKSLRTHLNVRSLYGSLFSCLIAVFLFASCQSYKKVPYLQDAGVVKDTNQQENLYDAKI MPKDLLTIVVSCTSPELAVPFNLTVATPANAATASTQLTTQPVLQPYLVDNEGKINFPVL GELKVGGLTKREAEQLIIDKLKPYIKETPIVTVRMVNYKISVLGEVARPGTFTINNEKVN LLEALAMAGDMTVWGVRDNVKLIREGADGKQEIVTLDLNKAETILSPYYWLQQNDIVYVT PNKAKARNSDIGNSTSLWFSATSILVSLASLLVTIFK >gi|222159265|gb|ACAB01000094.1| GENE 28 32654 - 35077 1890 807 aa, chain + ## HITS:1 COG:VC0937_2 KEGG:ns NR:ns ## COG: VC0937_2 COG0489 # Protein_GI_number: 15640953 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Vibrio cholerae # 493 792 1 299 302 119 28.0 2e-26 MIEEKRDRGEQPEEQVNIQEILFRYLIHWPWFVISIIICIACAWGYLRLATPVYDITATV LIKDDKKGGGASMSSELEKMGLDGFVSSSNNVDNEIEVLKSKSLAREVVNNLGLFVTYKD EDEFPNRELYRTSPVVVSLTPQEAEKLSAPMEVEMTLFPNGGMDALITVKDKEYRKQFDQ LPAVFPTDEGTVAFFESKDTLTTNQAKEESKERHIKAFINRPMSVAKGYIQSLSIAPTSK TTSVAVLSLKNSNTRRGKDFINKLLEMYNINANNDKNEVAQKTAEFIDERIGIISKELGS TEQDLENFKRSAGITDLSSEAQIALTGNAEYEKKRVENQTQINLVMDLQRYMQGNEYEVL PSNIGLQDAASAGAIDRYNEMLVERKRLLRTSTENNPTIINLDTSIRAMRSNVQATLDAT LKGLQITKEDLAREANRYSRRINDAPTQERQFVSIARQQEIKAGLYLMLLQKREENAITL AATANNAKIIDEALADDNPVSPKRMMIYLAALVLGVGFPVGIIYLIGLTKFKIEGRADVE KLTSLPVIGDIPLADEKSGSIAVFENQNNLMSETFRNVRTNLQFMLENGKNVILVTSTIS GEGKSFVSSNLAISLSLLGKKVVIVGLDIRKPGLNKVFNLSRKEQGITQFLINPAVNLMD LVQRSDINKNLFILPGGAVPPNPTELLARDSLEKAIETLKANFDYVILDTAPVGMVTDTL LIGRTADLSVYVCRADYTRKAEFTLINELMENNKLPNLCIAINGLDLQKKKYGYYYGYGK YGKYYGYGKRYGYGYGYGKENLDTTSK >gi|222159265|gb|ACAB01000094.1| GENE 29 35518 - 36726 996 402 aa, chain + ## HITS:1 COG:MJ1061 KEGG:ns NR:ns ## COG: MJ1061 COG1086 # Protein_GI_number: 15669250 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Methanococcus jannaschii # 36 264 10 233 333 110 33.0 5e-24 
MFNLEKFIADSVTSRPVSMFAADIEANKETLTAEIKGKKVCVIGGAGSIGSSFIKAVLRF EPASIVVVDLNENGLAELVRDVRSTESLYVPEEFRCYTLNFADQIFERIFCEEKGFDIVA NFSAHKHVRSEKDKYSVQALIENNDIKAKKLMDLLCVYPPKHFFCVSTDKAANPVNIMGA SKRIMEDLVMAYNTHFKVTTARFANVAFSNGSLPDGWIHRLQKKQPLTAPSDVKRYFVSP EESGQICMLACILGNGGEVFFPKLDEDQMLTFSAICDDFVKAEGFIKVECKNDPEAKRYA AQMSYESDTYPVFYFKSDTTGEKTYEEFYVLGEKVDMERFMSLGVVCESTRRSMNEVNEF FLELEGIFQKKDFTKAEVIGSIKKFIPNFEHEEKGKNLDQKM >gi|222159265|gb|ACAB01000094.1| GENE 30 36783 - 37997 701 404 aa, chain + ## HITS:1 COG:Cj1320 KEGG:ns NR:ns ## COG: Cj1320 COG0399 # Protein_GI_number: 15792643 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Campylobacter jejuni # 4 400 2 378 384 355 46.0 1e-97 MVDYKNTIDFIKSVYGNKEFTPLAVPVFAGNEKAYLNECIDTTFVSSVGKFVDRFEEDMA KYTGAKRAVVCVSGTNALHMSLILAGVKKDDEVLTQALTFIATCNALSYIGAHPVFLDVD FTTMGLSPDAMKEWLQANAEVRKNTRIDELPASLNFAFHEDELACYNKYTGRRIKACVPM HTFGHPVRIDEIAVLCKEWHIELVEDAAESIGSTYMGQHTGTFGRIGAISFNGNKTITTG GGGMMLFMDEELGAYAKHITTQAKIPHRWEFRHDHIGFNYRMPNINAALGCAQLENLDKY VASKRKVAVEYIEYFKNVDGIDFFVEPENTFSNYWLSAVVLKDKESQLDFLQQTNDNGIM TRPIWELMNRLPMFKNCQNDGLKNTIWFANRVVNIPSSVRPEDL >gi|222159265|gb|ACAB01000094.1| GENE 31 38598 - 39716 766 372 aa, chain + ## HITS:1 COG:Cj1328 KEGG:ns NR:ns ## COG: Cj1328 COG0381 # Protein_GI_number: 15792651 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine 2-epimerase # Organism: Campylobacter jejuni # 2 363 4 365 384 225 33.0 8e-59 MKKVCIITAARSEYGLLRWTIDSVQQNPNLELQLVVTGAHLMAEHGYTYKYIEEDGYPIS AKMDMGLSSDSKEAIVESMGRCSMGFAKTFAELQPDLAVVLGDRYELLPICSAALVMNIP IAHISGGDVTIGAIDNEVRNAVTMMSVLHFPGVEKSAQNIIRMRNSDANVWTVGEPGLDN FRRSVLMTRQELAENIGIPMSNKWILVTLHPETNESLEYNLQMAKNIIALTDSMTDASIV ISKANADFGGNQINDNWTELERLNPEKYHLYPSLGQMRYLSFMNECYAVLGNSSSGIIEA SCLGIPVINIGNRQTGRFICKNVRQVNNDLTEIQSAWSAIETDPERIKDSYYGDGCTAEK IVKHIEEYLYAK >gi|222159265|gb|ACAB01000094.1| GENE 32 39706 - 40743 622 345 aa, 
chain + ## HITS:1 COG:no KEGG:CLD_1855 NR:ns ## KEGG: CLD_1855 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B1 # Pathway: not_defined # 7 340 4 346 350 125 27.0 2e-27 MQNSHIIGGEFGIDLQGFQYAQSNNGFLEGIYKYSSGRSALYYILLDVQKRYGITTAYLP NYLCSSVVVAAEKSQVKVVFYNLNDQLEIDIEKFPLDDNEKAAVLLINYFGLKSLQLQVE LVRSISMEAIVIEDDVQAFYEFCKPELAADYKFTSLRKTFACPDGGLVKTAKQLTVVTEV NKFHQFKLAGCILKSLHKPKYYDDDVYLHLFEKGESHIDDEITMGMSQISQEVITKTDFE RIAYIRLRNTRQILSGLNTLGIRTILPVPENKTPLFVPIWLEDRNKVRKQMFQQQIFCPV HWPLEGMNVQKGAEMAEHEMSIIIDQRYTNKDMDFILNTLEKALK >gi|222159265|gb|ACAB01000094.1| GENE 33 40740 - 41768 701 342 aa, chain + ## HITS:1 COG:no KEGG:CLH_1453 NR:ns ## KEGG: CLH_1453 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_E3 # Pathway: not_defined # 5 341 9 339 340 223 41.0 8e-57 MIGFTEHWNEYLERIPAKLQDLYFREEYVRLYETESEKACCFVYQNGDSILLFPFLRRAF KYKGNTYFDFETAYGYGGPVSNDHNEVFMTAALQAMTDKAKAENYVCGFIRFHPLLENWD CFDKVGHLIQDRKTIAIDLTGGIEATWMNEIHTKNRNVIKKGEKNGLEFVVDNDFTYLKE FEQLYNSTMDRLEADGFYYFDPKYYDQLKDTIQNRFLGIVIHEDKVVAGAIFFYQLPYGH YHLAGSDKVSLKLSPNNFLLWGAARELINRGVEHFHLGGGTDGSEENSLYQYKCKFSKHE YQFILGKMIFNPSLYDEVCADWAAANPEKTETLKHILLKYKY >gi|222159265|gb|ACAB01000094.1| GENE 34 41773 - 42777 682 334 aa, chain + ## HITS:1 COG:Cj1327 KEGG:ns NR:ns ## COG: Cj1327 COG2089 # Protein_GI_number: 15792650 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sialic acid synthase # Organism: Campylobacter jejuni # 3 332 4 333 334 314 51.0 2e-85 MHTYIIAEAGVNHNGQLDLALQLCDAAKTAGVDCVKFQTWQTEKIVTCKAKKATYQSENT YDAEESQFEMLKKLELSYENFRIVQEHCKKIDIDFLSTPDEEYSLAFLMNELNLPLIKIG SGEVTNIPYLRQMASYSKPIILSTGMATLAQVATAYDTLLAAGAPSVTLLHCTTNYPCPK NEVNLRAMQTMKEAFKCQVGYSDHTMGTEIPIAAVAMGAEIIEKHFTLDRNMKGPDHKAS LEPQELKYMVDCIRNIEVALGDGIKRPNPSEVEISKVLLKSIVAKVPVKKGDILSANNIT IKRAGSGIPATHWDMVVDTKALHDFDIDEPIKLD >gi|222159265|gb|ACAB01000094.1| GENE 35 42782 - 43477 678 231 aa, chain + ## HITS:1 COG:Cj1143_2 KEGG:ns NR:ns ## COG: Cj1143_2 COG1083 # Protein_GI_number: 15792468 # Func_class: M Cell 
wall/membrane/envelope biogenesis # Function: CMP-N-acetylneuraminic acid synthetase # Organism: Campylobacter jejuni # 4 218 1 210 218 110 35.0 3e-24 MRPLVIIPARGGSKGIPHKNIKELGGKPLICYAIDAARKFTTDDNICVSTDDDAIIKVVE DYGLKVLFKRPEYLATDNAGTNGVLLHALDFYEQKGNHYDMVVLLQATSPFRRAEDVIEA AKLYDDSIDMITSVMLAKCNPYYDGFEDNAEGFLTISKGNGTIGRRQDAPHVWQQNGAVY VINPEQLKAKGLAGMTKIRKYVMDEIHSVDLDNMLDWKIAEIMLKDKLVEI >gi|222159265|gb|ACAB01000094.1| GENE 36 43474 - 44508 832 344 aa, chain + ## HITS:1 COG:Cj1329_2 KEGG:ns NR:ns ## COG: Cj1329_2 COG1208 # Protein_GI_number: 15792652 # Func_class: M Cell wall/membrane/envelope biogenesis; J Translation, ribosomal structure and biogenesis # Function: Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) # Organism: Campylobacter jejuni # 116 341 3 228 228 123 32.0 5e-28 MRKYIISETASVRDALVAINNITHDGELLIVVNAAQQMVGSLTDGDIRRGLIAGAELTDT INKIMHRDFKFIKQEDYDVAHLKSFRDRRIMFIPILDAENHVVDVVNLQKFKSKLPIDAV LMAGGKGERLRPLTEKIPKPLLEVGGKCIIDHNVDRLRSYGVQYVNVTVNYLGEQLEEHF STPRDGVQVRTFREPKFLGTIGSIKFVDTFYNDTVLVMNSDLFTNIDYEDFFLHFQMHDA EMSVAAVPYNISIELGILDLDGRNIKGLIEKPKYNYYANAGIYLIKKRALAEIPKDTFFH ATHLVEKLIAQDKKVIRYPLNGTWIDIGTLQEYERAKELVKHLK >gi|222159265|gb|ACAB01000094.1| GENE 37 44566 - 45660 294 364 aa, chain + ## HITS:1 COG:no KEGG:Amet_0211 NR:ns ## KEGG: Amet_0211 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 1 362 23 366 369 152 30.0 2e-35 MLQGFALQHYIRSLGQDVKCELINYRGVPPTNASKWKRLKTRISRTGYYLSHIKEIRTKS KYVSKFALRNTYFDEFLKNNTQTTDKLYRNKNQLMENPPVYDIYVTGSDQTWSPNVSGGY ELTPMFLDFAVRGAKKAAYAPSLGINSFSNEQERYLKTKLRDYSILSCREVGGATLLSKI IDNEVPAVVDPTLLIKKEEWRSLAHKPMISGKYIFCYFLGDRQYYRDFVQQLSKQTGLPI YYIPVNWREFSDADNLIWNAGPLDFVGLINGAEYICTDSFHGVVFSSNLNKNFYAFVKHS GSESAGDNSRLFDYLHRIGLEARLLSYYNGGLIDIRSIDYSVVNAKFESEREMSYSVINR IIEL >gi|222159265|gb|ACAB01000094.1| GENE 38 45657 - 46865 220 402 aa, chain + ## HITS:1 COG:no 
KEGG:BT_0386 NR:ns ## KEGG: BT_0386 # Name: not_defined # Def: putative F420H2-dehydrogenase # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 360 13 364 397 203 36.0 8e-51 MMKEICQHNECTGCSACVNVCGKNAIFYCEDKIGFRYPVVNLDLCIDCGLCQKVCPNNVE VDKSEPTLCFVGHATDRSEQETSSSGGIASLLSRTIIRQGGVVYGCTARDISHVQHIRVN NDDDLFLLKGSKYVQSDLGDCYKRIKVDLHSNLQVIFIGTPCQIAGLRTYLRKDYENLVT IDFVCHGVPSQIILNESLRTKTNENLRECTLSFRRKIKNGKKIDSKYGLFLDNKKGQSVY NGMHPKDMYIAGFLSALYYRESCYQCKYATSERVSDITVGDYADRDSEYSQLSGSDFLLS MITLNTQKGEKLFADLNETVEIAPIEYSKLVAVQGQLRMPMKRHPNRDKFSDLFQNADFE KDVKLLISNDLKRIAKITRKTRIREILFHIPFMKHYLNQRNK >gi|222159265|gb|ACAB01000094.1| GENE 39 47687 - 48043 130 118 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714527|ref|ZP_04545008.1| ## NR: gi|237714527|ref|ZP_04545008.1| predicted protein [Bacteroides sp. D1] # 1 118 275 392 392 199 100.0 6e-50 MVFNMPFMDLLIGMPAANPYDYLISGGFHSNEIIVKENSIFVSTFWLVLAKFGVIGLALF LSLYIRPLINSKEILTYIIMLFASMFFQSYSFGTGGFAFQIIFIYLFTSIYNRQNSLV >gi|222159265|gb|ACAB01000094.1| GENE 40 49005 - 49193 101 62 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MATTMTPYCFRNKTVSKIETIAPIKVETNCMIFFLSDENIEPRKPANAPRNTDIKRYGVT HQ >gi|222159265|gb|ACAB01000094.1| GENE 41 49102 - 49458 133 118 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714528|ref|ZP_04545009.1| ## NR: gi|237714528|ref|ZP_04545009.1| flippase Wzx [Bacteroides sp. 
D1] # 1 118 355 472 472 187 100.0 2e-46 MQFVSTLIGAIVSILLTVLFLKQYGVIVVAISTMIGYYIIWLIRRVAVNKYLNIGVSPIS TTMQVILLVVESVFVGKGAYLWAMLCFAILTVLNFSEIVSIVRFAIRELNHYIHRKLW >gi|222159265|gb|ACAB01000094.1| GENE 42 49452 - 50264 405 270 aa, chain + ## HITS:1 COG:no KEGG:Huta_1119 NR:ns ## KEGG: Huta_1119 # Name: not_defined # Def: hypothetical protein # Organism: H.utahensis # Pathway: not_defined # 2 233 1 227 265 66 27.0 1e-09 MVKLAIRDDDANFFTKVEDIEFVYKDFEGFPISFAVVPHVMDVSTKGTCSDTKGNTTPRD IDKNIELCLWFREKYVRKECDILLHGITHQYKVDGNIRYPEMIWSGMSGNKLIIKNIKLA KDYLEYALKCQISVFVAPSNMISKKCLNAVVANNMHFSGIVPINFQRNLTIQNIISYCKR WYYRVTIGFAYPGVLKYSDHLEINACALRSLTYLKALFDYCDKNNLPMVVNVHYWHLRDN PEYLEMLRSFVMDYAIPKGAQPTLLSNILK >gi|222159265|gb|ACAB01000094.1| GENE 43 50261 - 51475 245 404 aa, chain + ## HITS:1 COG:BMEI1404 KEGG:ns NR:ns ## COG: BMEI1404 COG0438 # Protein_GI_number: 17987687 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Brucella melitensis # 139 369 100 327 372 62 23.0 2e-09 MKVLFVTSHITHPSSPTFMRNQTGFGYMVFDMAKHISQTNQVDMFVANTFSLSVQLEGVN IIKRTWMSYFCGLTLNSLLNGFKYVKKYKLPLKRSFRELYMYTSLSQLHSNLMEYDVVHI HGCSEITDSMIRMCKIANVPFVVTLHGLNSFEDSIKLPFSLKRYERDFLKEASSNGYPVS FISTGNKKTAESFIGHDVANFYVICNGCDVVEKGSTQNIREKYHIYEDEFLFVFVGNISK NKNQIQAARAWLKMPEEYRNRSKILFVGRYFETDEVAKFIKIHNLQDKLILCGIQPKENI FNYYETCNATILTSLTEGFGLSIIEGYVYGKPNVTFKDLPAVEDLYVNDAMIAVEERTDE ALAKAMVEIMCKIIDCENIKSVSKGFSFQRMTQQYVELYKSVIK >gi|222159265|gb|ACAB01000094.1| GENE 44 51472 - 51993 219 173 aa, chain + ## HITS:1 COG:CAC2109 KEGG:ns NR:ns ## COG: CAC2109 COG0110 # Protein_GI_number: 15895378 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Clostridium acetobutylicum # 49 159 26 146 157 67 38.0 1e-11 MKRIIVFCVDKFRRLKHYVSTFILKHQLVAYGDHVGAARYIKISRQAKVSVGHNCGFNGM VISGFGSVKIGNYFHSGTNILIMLGSHDYEYGDKIPYGIHYKPKNVVIDDFVWIGSNVII SGNVHIAEGAIVAVGSVVVKDVPPCAIVGGNSAKIIKYRDIEQFNRLKAKGRF >gi|222159265|gb|ACAB01000094.1| GENE 45 52006 - 
53148 487 380 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714532|ref|ZP_04545013.1| ## NR: gi|237714532|ref|ZP_04545013.1| predicted protein [Bacteroides sp. D1] # 1 380 1 380 380 704 100.0 0 MNYIKFLNYLRSFDKELFEVTSKEQKEFLSTMKEPVDDIDRSFNQFRGRHFFYHTLKEIA AEIISFFALPVYVLYCLILYVFKPKQTQSYQAVGDFYKFEEIIPNTLRHEFDINNNVWNK GRMLSGSDVLLLIKIFFRHPVPCFVFKTAIKLSYYKYIIYCYSPKAIIVHNEASYAGSIL TKYCNDLGIEHVNVMHGEKLFYLGNVYFRYNRCYVWGEHYKQMFLSLKAEATQFRIEIPE SFNVDVENNFRKEYYSDYKYYLQIYDEAQLKSIIESLNFIVKNGQIYKFRPHPRFSNIKL LEELVDKSIIEYPKDVPILESIASTKHVIGSYTTVLNQAYYANIDVIFDDVTYKDTFDKL GDLNYIFAKEDTNKLSQYVK >gi|222159265|gb|ACAB01000094.1| GENE 46 53138 - 54274 654 378 aa, chain + ## HITS:1 COG:no KEGG:BT_0385 NR:ns ## KEGG: BT_0385 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 378 2 381 386 144 31.0 8e-33 MSSNKVLLTTVFSGYNYGSSLQALAGKTILKELGYDCQLVAMKSLVKGRDIRLKKLLTIL VRSLMLRGKNGSKSLSIYQNSYNKTMIGDSASLFIRFSDEYLQPNYLSWDGMKKAAKEAV ACFAGSDQIWNSSTMYVDPMYYLRFAPAEKRVALAPSFGRDFVADYNKEKIGKWISEFAY LSVREDSGVKLIKEMTGREAIQLVDPTLMVDGETWKNILGIDDKESNYILAYFLDKPSEL ARKAITELRAALKYEVIAIPYQFDDMSYCDKMVSSGTIEFLDLINNAKCVLTDSFHGTAF SINLHTPFYVFSRAYGTAHSQNSRVESILKKVKMQARFEPKDVLVQYDQIDFAYSESVLI EERKNAREYISNALKTIE >gi|222159265|gb|ACAB01000094.1| GENE 47 54276 - 55034 404 252 aa, chain + ## HITS:1 COG:BH3661 KEGG:ns NR:ns ## COG: BH3661 COG0463 # Protein_GI_number: 15616223 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus halodurans # 3 250 11 253 257 155 36.0 5e-38 MNVSIVMPYYNAAKYIKETVGAIIAQTYTDWELIIVDDCSLAPETVEVLKAVEAMDDRIR VLRAKVNGGAGAARNIAIKEAYGRYLAFCDSDDWWYPTKLEEQIRFMEENNYPFTCTWYE DADENLEPYYTMKQNEKQTYKSMITGCNIGTPGVMVDTQVLGKKQMPNLRRAEDWGLWMM YLRDTDYLVTYPKVLWKYRHVPGSETSNKWKQLKAVIRMYQEVLGFSAFKANMIVFLLFL PNNIWRKIQKRF >gi|222159265|gb|ACAB01000094.1| GENE 48 55038 - 56213 258 391 aa, chain + ## HITS:1 COG:no KEGG:BT_0386 NR:ns ## KEGG: BT_0386 # Name: not_defined # Def: putative 
F420H2-dehydrogenase # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 327 13 331 397 204 36.0 5e-51 MNVAELDIDKCSGCGLCASVCSKHSISIVPDDSGFLRPIVDKNTCVDCGLCVKRCVIVNP RKQTIPQKTYAAIRQDKDRIALSSSGGVFAAVAEYVLLKKTNWVVVGSTLDETVSANHII VDNVVDLKNLYGSKYVQSETTGIYKKIQILLDDSKSVLFSGTPCQVAAIQRYTNNHPNLW TIEVICHGVSNNKMFNSYLDMYKRNEIRMFYFRDKEQGWSFNNKIVYQNGKEKKINHRMS SYMTYFLKGETYRDCCYCCPYAKPERCADITIGDFWGILQTRPDLNNKIDIEKGVSCVLV NTDKGISMVGNAELELYDVEYDAIRKENGPVNEPSHHTVKRDLVLAEWGKKKDWTDVHTF WKKNDRKITFVLWSMIPVSLQHKIRVMLGKR >gi|222159265|gb|ACAB01000094.1| GENE 49 56236 - 56448 126 70 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTINKYNELLKDNLSFIVIEVNGKHLLNNDIPPFLKNVLEICRYYIEKKHLDCIKLPVKE NTVEWMWKCF >gi|222159265|gb|ACAB01000094.1| GENE 50 56877 - 57026 126 49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714536|ref|ZP_04545017.1| ## NR: gi|237714536|ref|ZP_04545017.1| predicted protein [Bacteroides sp. D1] # 8 49 1 42 42 68 100.0 9e-11 MGIRVSCMGKQNNNLSDRGLYTNGKYNESFVLIFILNIEVNTLKQYGIS >gi|222159265|gb|ACAB01000094.1| GENE 51 57013 - 58011 741 332 aa, chain + ## HITS:1 COG:RC1279 KEGG:ns NR:ns ## COG: RC1279 COG0472 # Protein_GI_number: 15893202 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Rickettsia conorii # 30 258 33 273 327 77 31.0 4e-14 MELVSTYGIIFVILLALELCYFKVADHFNIIDKPNERSSHSTIVLRGGGIIFLIGVWIWS VFFGFQYPWFLVGLTLVAGISFVDDIHSLPDSVRLVAQFVAAAMAFYQLGILHWSMWWVI LLALIVYVGATNVINFMDGINGITAGYSLAVLIPLALVNMDDIFVEQSLIISTILSSLVF CIFNFRPKGKAKCFAGDVGSIGIAFIILFLLGNVIIKTMDITWLIFLLVYGVDGCLTIVH RIMLHENLGEAHRKHAYQIMANELKVGHVKVALLYTVMQLVISLGFIYLCPDTVLAHWLY LVGVSAVLAIAYILFMKKYYHLHEEYLISLKQ >gi|222159265|gb|ACAB01000094.1| GENE 52 58017 - 58454 381 145 aa, chain + ## HITS:1 COG:no KEGG:BT_0397 NR:ns ## KEGG: BT_0397 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 143 1 138 148 184 68.0 1e-45 
MEFDKEFLDKLFEQAKENPRLRQNFDLRTSSADTSQRMLNALLPETKVPIHRHEDTTETV ICLCGKLDEVIYEEVVSYEKDTDDFQKGVNVQDVARKVEYREVQRIHLNPTETKYGCQIP KGAWHTVEVIEPSVIFEAKDGAYAR >gi|222159265|gb|ACAB01000094.1| GENE 53 58739 - 59167 274 142 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 47 129 2 97 116 81 42.0 6e-16 MRTITLIIIHCSATPEGKSLSAEACRQDHILHRGFRDIGYHFYITRDGEIHRERALEKVG AHCRNHNAHSVGVCYEGGLDANGKPKDTRTLEQKGALLALLRELKRQFPKALIVGHRDLN PMKGCPCFDAVKEYSQVISYQK >gi|222159265|gb|ACAB01000094.1| GENE 54 59172 - 59279 61 35 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSKSSSSRSVWSLVLKIVITVATAIGGVFGIQSCM >gi|222159265|gb|ACAB01000094.1| GENE 55 59456 - 59944 588 162 aa, chain - ## HITS:1 COG:no KEGG:BT_1705 NR:ns ## KEGG: BT_1705 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 160 1 164 166 254 88.0 6e-67 MIRYKIYQNQQKKGLNAGKWFARAVSDETFDLAKLAEHMSKHNSPYSGGVIKGVLSDMVD CIKELLLDGKCVKIDDLAIFGVGIRSKAADTLEEFSLEKNISGMRLKARATGNLSTNNLK LDSQLKQQAEYQKPTTAGGGSDSGDNPDPKPDGGGEAPDPAA >gi|222159265|gb|ACAB01000094.1| GENE 56 60164 - 60382 192 72 aa, chain + ## HITS:1 COG:no KEGG:BT_1704 NR:ns ## KEGG: BT_1704 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 72 1 72 72 114 80.0 1e-24 MSEFKIRAYGRMELAQLYSPQLTDIAAYRKMKKWISLCPGLLQRLYDLGYESKRRSFTPL EVRVIVDALGEP >gi|222159265|gb|ACAB01000094.1| GENE 57 60564 - 62405 1336 613 aa, chain - ## HITS:1 COG:no KEGG:BT_1703 NR:ns ## KEGG: BT_1703 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 611 1 611 612 807 65.0 0 MTDIESLCRLTEAVEAAGADIAPTYLEYVQLSFAIATDCGEAGRDFFHRLCRVSPKYQRE HAERVFSNALHTQRGEVHLGTAFHLAEATGVSICIEEMPKHPTGTKGTAGTARKFPPHTG AYNKVGNDNISEEREGEEELLPGSEPQHQLPTFPKNNWPEFLQRIIKAGSSPIQHDIMLL 
GALTALGACMSRHVRCLYGGKYHHPSLQCFVVAPSASGKGILSYIRLLVEPIHDEIRKEV AIQMKAYKKEKAEYDAMGKERTKKEAPQMPPNRMFLISGNNTGTGILQNIMDSDGTGLIC EAEADTLSTAIGSDHGHWSDTLRKAFDHDRLSYNRRTDQEYREVKKSYLSVLISGTPSQV QTFIPTAEDGSYSRQLYYYMCGISKWISQFMDNEIDWEEIFTAMGLEWKEKLSVIKAHGI HTLQLTDEQKEEIDTVFSDLFERSSVANGREMYSFVARLAVNLCRIMSEVAVLRALESPQ PYDFKTSPTSPFTPDKEIPADNRKDDIITRWDVSISQEDFHAVLSLAEPLYCHATHILSF LPNTEISHRPNADRDYLFQKLGDEFTRTQLLEEAVAMGIKENTALTWLKRLTKRGILVSI DGKGLYARACVYE >gi|222159265|gb|ACAB01000094.1| GENE 58 62444 - 63067 405 207 aa, chain - ## HITS:1 COG:no KEGG:BT_1637 NR:ns ## KEGG: BT_1637 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 205 1 206 208 268 65.0 7e-71 MSNSYRMSYFMPPIAPIKDKQGRLMTPPTLIPFCEVSIEQVFQMITCNENLKTLTAQVRN ATDIRAAKASLLPYVTPCGTFTRRSCKDFVSPSHLVIVDVDGLHSYQEAVEMRRMLYDDP LLQPVLTFISPSGLGVKAFVPCHYSPTINDAQNITDNMSWAMRYVETAYNTVTAVSSETK SKVDFSGKDLVRSCFLSYDPEALFRTK >gi|222159265|gb|ACAB01000094.1| GENE 59 63238 - 63378 138 46 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160884254|ref|ZP_02065257.1| ## NR: gi|160884254|ref|ZP_02065257.1| hypothetical protein BACOVA_02232 [Bacteroides ovatus ATCC 8483] # 1 46 1 46 46 79 100.0 8e-14 MKKKTVCCSDLGAYINELLKRAKLKNEYVCETLGMGHDVLNGIKKG >gi|222159265|gb|ACAB01000094.1| GENE 60 63750 - 64175 118 141 aa, chain + ## HITS:1 COG:no KEGG:BT_0616 NR:ns ## KEGG: BT_0616 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 141 1 141 141 234 87.0 9e-61 MTNNTIKHLGIVENIQGSHLSVRIVQTSACAACSAKGHCSSADSKDKIIDIIDTAASSYQ VGEKVMVVGETSMGMMAVVLAFVLPFVLLIFSLFLLMAWIENELYAALLSLAVLIPYYFV LWLNKTQLKQQFSFTIKPINN >gi|222159265|gb|ACAB01000094.1| GENE 61 64183 - 65136 1072 317 aa, chain + ## HITS:1 COG:MA0664 KEGG:ns NR:ns ## COG: MA0664 COG2878 # Protein_GI_number: 20089551 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfB # Organism: Methanosarcina acetivorans str.C2A # 1 265 1 261 264 173 41.0 3e-43 
MNLILIAVISLGAIALVLAAILYVASKKFAVYEDPRIAQVGEVLPQANCGGCGYPGCSGF ADACVKAGSLDGKFCPVGGQPVMAQIADILGLVAGEAEPMVAVVRCNGTCANRPRTNQYD GAKSCAIAASLYGGETGCSYGCLGCGDCVAACQFDAIHMNPETGLPEVDEAKCTACGACV KACPKAIIEIRPQGKKSRRVYVSCVNKDKGAVARKACTVSCIGCGKCVKTCPFEAITLEN NLAYIDPHKCKSCRKCVEVCPQNSIIELNFPPRKPKEEAPAAPKPAAVSKEAVETPAPAA KVETPAAKATEAPKVTE >gi|222159265|gb|ACAB01000094.1| GENE 62 65161 - 66498 1375 445 aa, chain + ## HITS:1 COG:TM0244 KEGG:ns NR:ns ## COG: TM0244 COG4656 # Protein_GI_number: 15643016 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Thermotoga maritima # 8 442 22 451 451 374 48.0 1e-103 MLKTFSIGGVHPHENKLSAHQPIITAEVPAKAVILLGQHIGAPAKPVVAKGDVVKVGTRI AEPAGFVSAAIHSSVSGKVAKIDTIVDASGYAKPAIFIDVEGDEWEETIDRSTTLVKECE LSAEEIVKKIADAGIVGLGGACFPTQVKLCPPPSFKAECVIINAVECEPYLTADHQLMLE HAEEIMVGVSILMKAVKVNKAFIGIENNKPDAIELMTKVASSYAGIEVVPLKVKYPQGGE KQLIDAITKRQVASGALPISTGAVVQNVGTAFAVYEAVQKNKPLFERVITVTGKSVAKPS NFLARIGTPMKQLIDACGGLPEDTGKVIGGGPMMGKALVNIEVPTAKGSSGILIMNQKEA KRGEAQTCIRCAKCVSACPMGLEPYLLGALSENGDFETMEKERIMDCIECGSCQFTCPAN RPLLDYCRLGKGKVGAMIRARQAKK >gi|222159265|gb|ACAB01000094.1| GENE 63 66504 - 67511 1135 335 aa, chain + ## HITS:1 COG:TM0245 KEGG:ns NR:ns ## COG: TM0245 COG4658 # Protein_GI_number: 15643017 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfD # Organism: Thermotoga maritima # 4 333 2 318 318 262 48.0 6e-70 MENKLIVSLSPHVHGGDSVQKNMYGVLIALIPAFLVSLYFFGLGALIVTATSVAACLFFE WAIGKYLMKKPTTTICDGSAIITGVLLAFNLPSNLPIWIIILGALFAIGVGKMSFGGLGC NPFNPALAGRVFLLLSFPVQMTSWPVVGQLTAYTDATTGATPLALMKQAIHAADKSAAMD ALNQIPDALSLLIGQNGGCLGEVSALALLIGLVYMLWKKIITWHIPVSILATVFIFAGIM HLADPEKYVSPVLQLLSGGLMLGAVFMATDYVTSPMSKKGMLIYGVCIGLLTVVIRLFGA YPEGMSFAILIMNAFTPLINTYCKPKRFGEVAKKK >gi|222159265|gb|ACAB01000094.1| GENE 64 67538 - 68197 893 219 aa, chain + ## HITS:1 COG:PA3493 KEGG:ns NR:ns ## COG: PA3493 COG4659 # Protein_GI_number: 15598689 # Func_class: C Energy 
production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfG # Organism: Pseudomonas aeruginosa # 3 196 14 199 214 72 28.0 5e-13 MLLVLTGVTAVSVALLAYVNELTKGPIAEANAKTLNEALKKVLPEFTNNPVAESDTIFSE KDGKKNVDFIVYPAKNGEELVGTAVEAKSMGFGGELKVLVGFNAEGKIYNYSLLAHAETP GLGSKADKWFGAYDPAKGEQAVSHEESKKSILGMNPGEAPLTVSKDGGAVDAITASTITS RAFLNAVNAAYQAYKATPNTDAATGATIKVELTDSVSAK >gi|222159265|gb|ACAB01000094.1| GENE 65 68212 - 68796 773 194 aa, chain + ## HITS:1 COG:FN1593 KEGG:ns NR:ns ## COG: FN1593 COG4660 # Protein_GI_number: 19704914 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfE # Organism: Fusobacterium nucleatum # 1 189 1 190 205 192 60.0 5e-49 MNNFKVMMNGIIKENPTFVLLLGMCPTLGTTSSAINGMGMGLATMFVLICSNVVISLIKN LIPDMVRIPSFIVVIASFVTLLQMVMQAYVPGLYATLGLFIPLIVVNCIVLGRAEAFAAK NNAVASMFDGIGMGLGFTIALTLLGAVREFLGTGKIFDLTIMPEEYGMLVFVLAPGAFIA LGYLIALINSFKKA >gi|222159265|gb|ACAB01000094.1| GENE 66 68819 - 69391 740 190 aa, chain + ## HITS:1 COG:FN1592 KEGG:ns NR:ns ## COG: FN1592 COG4657 # Protein_GI_number: 19704913 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfA # Organism: Fusobacterium nucleatum # 18 189 21 192 194 174 59.0 1e-43 MEYILIFISAIFVNNIVLSQFLGICPFLGVSKKVETAMGMSAAVAFVLTIATIVTFLIQK FVLDVFGLGYLQTITFILVIAGLVQMVEIILKKVSPALYQALGVFLPLITTNCCILGVAI LVIQKDYDLLTGVVYAFSTAIGFGLALVLFAGLREQMSLVKVPKGMQGTPIALITAGLLA MAFMGFSGVV >gi|222159265|gb|ACAB01000094.1| GENE 67 69615 - 70649 896 344 aa, chain + ## HITS:1 COG:SP1828 KEGG:ns NR:ns ## COG: SP1828 COG1087 # Protein_GI_number: 15901657 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Streptococcus pneumoniae TIGR4 # 5 337 3 330 336 343 51.0 3e-94 MKERILVTGGTGYIGSHTVVELQNSGYEVIIIDNLSNSSADVVDNIEKVSGIRPVFEKLD CLDFAGLDAVFTKYKGIKAIIHFAASKAVGESVQKPLLYYRNNLVSLINLLELMPKHGVE GIVFSSSCTVYGQPDELPVTEKAPIKKAESPYGNTKQINEEIIRDTVASGAPINAIMLRY 
FNPIGAHPTALLGELPNGVPQNLIPYLTQTAIGIREKLSVFGDDYDTPDGSCIRDFINVV DLAKAHVIAIRRILEKTQKEKVEVFNIGTGRGVSVLELINGFEKATGVKLNYQIVGRRAG DIEKVWANPDFANQELGWKAVETLEDTLRSAWNWQLKLRERGIQ >gi|222159265|gb|ACAB01000094.1| GENE 68 70906 - 71730 696 274 aa, chain - ## HITS:1 COG:NMA1092 KEGG:ns NR:ns ## COG: NMA1092 COG1947 # Protein_GI_number: 15794040 # Func_class: I Lipid transport and metabolism # Function: 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase # Organism: Neisseria meningitidis Z2491 # 3 249 9 245 281 142 39.0 7e-34 MLAFPNIKINLGLSITEKRPDGYHNLETVFYPVALEDALEIRTSSETEKKIILHQYGMEI AGNPEDNLVAKAYSLLDKEFHLPPVEIHLYKHIPSGAGLGGGSSDAAFMLKLLNDHFQLE LSEEQLEVYAATLGADCAFFIKNKPTYAEGIGNLFSPIELSLNGYQIMIIKPNVFVSTRE AFSNIHPHRPEYPVKEAILRPVAEWKDILINDFEASVFPQHPVIGEIKRELYHQGAIYAS MSGSGSSVFGLFAPGTSLPKTMGESDVFCFKGKL >gi|222159265|gb|ACAB01000094.1| GENE 69 71965 - 73527 1478 520 aa, chain + ## HITS:1 COG:lin0047 KEGG:ns NR:ns ## COG: lin0047 COG0305 # Protein_GI_number: 16799126 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Listeria innocua # 24 466 8 442 450 372 45.0 1e-103 MAEQKRNTRNTKSTKVQPVNDYGRIQPQAPELEEAVLGALMIEKDAYSLVSEILRPESFY EHRHQLIYSAITDLAVNQKPVDILTVKEQLSKRGELEEVGGPFYITQLSSKVASSAHIEY HARIIAQKSLARELITFTSNIQSKAFDETLDVDDLMQEAEGKLFEISQQNMKKDYTQINP VIDEAYKLIQKAAARTDGLSGLESGFTKLDKMTSGWQNSDLIIIAARPAMGKTAFVLSMA KNIAVDYRNPVALFSLEMSNVQLVNRLIANVCEIPSEKIKSGQLANYEWQQLDYKLKNLM DAPLYVDDTPSLSVFELRTKARRLVREHGVRIIIIDYLQLMNASGMAFGSRQEEVSTISR SLKGLAKELNIPIIALSQLNRGVESREGIDGKRPQLSDLRESGAIEQDADMVCFIHRPEY YKIYQDDRGNDLRGMAEIVIAKHRNGAVGEVLLRFKGEFTRFSNPEDDMVIPMPGEPAGA MLGSKMNTGDAGSMPPPPAPDFAPQTANPFGAPGDGPLPF >gi|222159265|gb|ACAB01000094.1| GENE 70 73640 - 76102 2896 820 aa, chain + ## HITS:1 COG:FN2122_2 KEGG:ns NR:ns ## COG: FN2122_2 COG0072 # Protein_GI_number: 19705412 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Fusobacterium nucleatum # 154 820 3 652 653 
374 33.0 1e-103 MNISYNWLKEYVNFDLTPDETAAALTSIGLETGGVEEVQTIKGGLEGLVIGEVLTCTEHP NSDHLHITTVNLGDGEPVQIVCGAPNVAAGQKVVVATLGTKLYDGDECFTIKKSKIRGVE STGMICAEDEIGIGTDHAGIIVLPETAVPGTLAKDYYNIKSDYVLEVDITPNRADACSHY GVARDLYAYLIQNGRQATLQRPSVDGFKVENHDLNIEVKVENSEACPHYAGVTVKGVTVK ESPEWLQNKLRLIGVRPINNVVDITNYIVHAFGQPLHCFDAGKIKGNEVIVKTMPEGTPF VTLDEVERKLNERDLMICNKEEAMCIAGVFGGLDSGSTEATTDVFIESAYFHPTWVRKTA RRHGLNTDASFRFERGIDPNGVIYCLKLAALMVKELAGGTISSEIKDVFTTPAQDFIVDL AYEKVHSLVGKVIPVETIKSIVTSLEMKITNETAEGLTLAVPPYRVDVQRDCDVIEDILR IYGYNNVEIPTTLNSSLTTKGEHDKSNKLQNLIAEQLVGCGFNEILNNSLTRAAYYDGLE AYPSNHLVMLLNPLSADLNAMRQTLLFGGLESIAHNANRKNADLKFFEFGNCYYFNADKK NEEKVLAPYSEDYHLGLWVTGKKVSNSWAHADENSSVYELKAYVENILKRLGLDLHNLVV GNLTDDVFAAALSVNTKGGKRLASFGIVTKKLLKAFDIDNEVYYADLNWKELMKAIRSVK ISYKEISKFPAVKRDLALLLDKNIQFAEIEKIAYETEKKLLKEVELFDVYEGKNLEAGKK SYAVSFLLQDENQTLNDKMIDKIMSKLVKNLEDKLGAKLR >gi|222159265|gb|ACAB01000094.1| GENE 71 76136 - 76873 1039 245 aa, chain + ## HITS:1 COG:Cj1172c KEGG:ns NR:ns ## COG: Cj1172c COG0217 # Protein_GI_number: 15792496 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Campylobacter jejuni # 1 238 1 234 235 177 45.0 1e-44 MGRAFEYRKAAKLKRWGHMAKTFTRLGKQIAIAVKAGGPEPENNPTLRSVIATCKRENMP KDNIERAIKNALGKDQSDYKSMTYEGYGPHGIAVFVDTLTDNTTRTVADVRSVFNKFGGN LGTMGSLAFLFDHKCVFTFKKKDGLDMEELILDLIDYDVEDEYEEDDEEGTITIYGNPKS YAAIQKHLEECGFEDVGGDFTYIPNDLKEVTPEQRETLDKMIERLEEFDDVQTVYTNMQP EEGEE >gi|222159265|gb|ACAB01000094.1| GENE 72 76883 - 77128 231 81 aa, chain + ## HITS:1 COG:no KEGG:BT_0628 NR:ns ## KEGG: BT_0628 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 81 1 81 81 150 92.0 1e-35 MEYVYKTQGTCSTNIELNVEDGVVKEVAFWGGCNGNLQGISRLVRGMKVEEVIAKLEGVR CGGRPTSCPDQLCRALHEMGY >gi|222159265|gb|ACAB01000094.1| GENE 73 77386 - 78639 1016 417 aa, chain + ## HITS:1 COG:CAC0628 KEGG:ns NR:ns ## COG: CAC0628 COG1914 # Protein_GI_number: 15893916 # Func_class: P Inorganic ion transport and metabolism # Function: Mn2+ and Fe2+ 
transporters of the NRAMP family # Organism: Clostridium acetobutylicum # 13 417 11 415 417 456 62.0 1e-128 MKNIFQDLRRKDHKRYLGGLDVFKYIGPGLLVTVGFIDPGNWASNFAAGSEFGYSLLWVV TLSTIMLIVLQHNVAHLGIVTGLCLSEAATKYTPKWVSRPILGTAVLASISTSLAEILGG AIALEMLLDIPIIWGAVLTTLFVSIMLFTNSYKKIERSIIAFVSVIGLSFIYELFLVEID WPAATAGWVTPSFPKGSMLIIMSVLGAVVMPHNLFLHSEVIQSHEYNKKDDASIKKVLKY ELFDTLFSMIVGWAINSAMILLAAATFFKSGIQVEELQQAKSLLEPLLGSNAAIVFALAL LMAGISSTITSGMAAGSIFAGIFGESYHIKDSHSQVGVLLSLGIALLLIFFIGDPFKGLI ISQMVLSIQLPFTVFLQVGLTSSRKVMGDYVNSRWSTFVLYSIAIIVSVLNIMLLFS >gi|222159265|gb|ACAB01000094.1| GENE 74 78669 - 79430 827 253 aa, chain + ## HITS:1 COG:MTH212 KEGG:ns NR:ns ## COG: MTH212 COG0708 # Protein_GI_number: 15678240 # Func_class: L Replication, recombination and repair # Function: Exonuclease III # Organism: Methanothermobacter thermautotrophicus # 1 252 4 255 257 246 45.0 4e-65 MKIITYNVNGLRAAVSKGLPEWLAQENPDILCLQETKLQPDQYPGEVFEALGYKSYLYSA QKKGYSGVAILTKREPDHVEYGMGMEAYDNEGRFIRADFGDLSVVSVYHPSGTSGDERQA FKMVWLEDFQKYVMELQKSRPNLILCGDYNICHEPIDIHDPVRNATNSGFLPEEREWMTR FLSAGYVDSFRTLCPEKQEYTWWSYRFNSRAKNKGWRIDYCMVSEPVRPLLKRAYILNEA VHSDHCPMALEIL >gi|222159265|gb|ACAB01000094.1| GENE 75 79425 - 79832 401 135 aa, chain - ## HITS:1 COG:DR2598 KEGG:ns NR:ns ## COG: DR2598 COG0432 # Protein_GI_number: 15807580 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Deinococcus radiodurans # 1 134 7 142 151 154 55.0 6e-38 MATTFDIQLPHYSRGFHLITRDIVSQLPPLPESGLLVVFIKHTSAGLTINENADPDVRHD FQTFFNKLVPDGAPYFIHTLEGPDDMSAHIKASLIGSSVTIPIKNHRLNLGTWQGVYLGE FRDGGDTRKLSITIL >gi|222159265|gb|ACAB01000094.1| GENE 76 79848 - 80114 252 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237714561|ref|ZP_04545042.1| ## NR: gi|237714561|ref|ZP_04545042.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 88 1 88 88 134 100.0 2e-30 MNKTISYKESTAIAIQAMMSAARKDEYADRKRKSNFPQNRKKRDVTLKDIREWNKRRNYK EDTGITINAMMESAVKDPYVDLNPPSKF >gi|222159265|gb|ACAB01000094.1| GENE 77 80207 - 80665 505 152 aa, chain + ## HITS:1 COG:no KEGG:BT_0631 NR:ns ## KEGG: BT_0631 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 152 1 152 152 273 94.0 2e-72 MEDRIQKAVELFKSGYNCSQSVVAAFADMYGFTQEQALRMSASFGGGIGRMRETCGAACG MFLVAGLETGATEATDREGKAANYAVVQELAAEFKKRNGSLICGELLGLKKKEPVSTIPE ERTAQYYSKRPCAKMVEEAARIWSEYLEKHPK >gi|222159265|gb|ACAB01000094.1| GENE 78 80821 - 81021 338 66 aa, chain + ## HITS:1 COG:no KEGG:BF2557 NR:ns ## KEGG: BF2557 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 65 1 65 66 92 83.0 5e-18 MLKEKAGVIAGTIWNALNETEGMTAKQLKKATKLVDKDLFLGLGWLLREDKVSVEEVEGE LFIKLI >gi|222159265|gb|ACAB01000094.1| GENE 79 81150 - 82931 1879 593 aa, chain + ## HITS:1 COG:BS_lepA KEGG:ns NR:ns ## COG: BS_lepA COG0481 # Protein_GI_number: 16079605 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane GTPase LepA # Organism: Bacillus subtilis # 4 593 14 606 612 724 57.0 0 MKNIRNFCIIAHIDHGKSTLADRLLEFTHTIQVTSGQMLDNMDLEKERGITIKSHAIQME YTYQGEKYILNLIDTPGHVDFSYEVSRSIAACEGALLIVDASQGVQAQTISNLYMAIEHD LEIIPVINKCDMASANPEEVEDEIVELLGCKREEVIRASGKTGMGVEEILAAVIERIPHP EGDEEAPLQALIFDSVFNSFRGIIAYFKIENGMIRKGDKVKFFNTGKEYDADEVGVLKMD MVPRNELRTGDVGYIISGIKTSKEVKVGDTITHIARPCEKAIAGFEEVKPMVFAGVYPIE AEDFEDLRASLEKLQLNDASLTFQPESSLALGFGFRCGFLGLLHMEIVQERLDREFDMNV ITTVPNVSYHIYDKQGNMKEVHNPGGMPDPTMIDHIEEPYIKASIITTTDYIGPIMTLCL GKRGELIKQEYISGNRVEIYYNMPLGEIVIDFYDKLKSISKGYASFDYHPNGFRTSKLVK LDILLNGEPVDALSTLTHIDNAYDMGRRMCEKLKELIPRQQFDIAIQAAIGAKIISRETI KAVRKDVTAKCYGGDVSRKRKLLEKQKRGKKRMKQIGNVEVPQKAFLAVLKLD >gi|222159265|gb|ACAB01000094.1| GENE 80 83072 - 84250 781 392 aa, chain + ## HITS:1 COG:no KEGG:BT_0633 NR:ns ## KEGG: BT_0633 # Name: not_defined # Def: putative Na+/H+ exchange protein # Organism: B.thetaiotaomicron # 
Pathway: not_defined # 12 392 1 381 381 651 93.0 0 MRKVLSFSGFLMLGLVVSQFLPMMAGDGYGAVKSISNVLLYVCLSFIMINVGREFVLDKT RWKSYAQDYFIAMATAALPWFMIAIYYVFVLLPPDFWNSWEAWKENLLLSRFAAPTSAGI LFTMLAAIGLKSSWIYKKIQVLAIFDDLDTILLMIPLQIMMIGLRWQLIIVVVIVFLLLS IGWQRLNKYDWRQDWKAILFYSIIIFLATQILYLGSKELYGEEGSIHIEVLLPAFVLGMI MKHKEHDTPVERKVSTGISFFFMFLVGMSMPHFIGVNFAETHAGAYSVTGSQEMMSWGMI MFHVVIVSLLSNIGKLCPMFFYRDRKLSERLALSIGMFTRGEVGAGIIFIALGYNLGGPA LVISVLTLVLNLILTGIFVLWVKNLALRSYTN >gi|222159265|gb|ACAB01000094.1| GENE 81 84296 - 85597 977 433 aa, chain + ## HITS:1 COG:jhp1447 KEGG:ns NR:ns ## COG: jhp1447 COG3004 # Protein_GI_number: 15612512 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter # Organism: Helicobacter pylori J99 # 6 429 13 427 438 270 40.0 3e-72 MTILRTMRNFSSMNITASILLFLAAISAAIIANSSVAPVYQEFLSHELHLQIGNFNLLSH GGENLRMIEFINDGLMTIFFLLVGLEIKRELLVGELSSFRKAALPFIAACGGMLLPVMIY SLICVPGSEGGHGLAIPMATDIAFSLGVLSLLGSRVPLSLKIFLTAFAVVDDIGGILVIA LFYSSHVAYGYLLVAILFYILLYFIGKYGTTNKVFFLVIGVIIWYLFLQSGIHSTISGVI LAFVIPAKPRLNVGKYIEKIRHTIADFPAMQSESIVLTNEQIAKLKEVESASDRVISPLQ SLEDNLHGTVNYLILPLFAFVNAGVVFSGGGELVGAVSIAVAAGLLLGKFIGIYFFTWLA IKIRLTPMPLGMTWKNLSGVALLGGIGFTVSLFIANLSFGVDYPVLLNQAKFGVLTGTVL SGLLGYVVLRISL >gi|222159265|gb|ACAB01000094.1| GENE 82 85594 - 86829 1108 411 aa, chain - ## HITS:1 COG:XF0413 KEGG:ns NR:ns ## COG: XF0413 COG1322 # Protein_GI_number: 15837015 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 2 411 4 449 456 298 41.0 1e-80 MELILLIVIAALVIVLLILSLTKGNNQTQAEQLQTALRQQMQENREELNRSIRELRLEMT QTLNQNMQQLQDVLHKNMLTNGELQRQKFDMMARQQESLIKSTEKRLDDMRTMVEEKLQK TLNERIGQSFEIVRSQLENVQKGLGEMKSLAQDVGGLKKVLSNVKMRGTFGEVQLGALLE QMMSPEQYDANVKTKKSGTEFVEFAIKLPGKDDANSTVYLPIDAKFPKDVYEQYYDAFEA GDTALIESSGKQLETTIKKMAKDIHDKYVDPPFTTDFAIMFLPFESIYAEVIRRTSLVET LQKDYKIVVTGPTTLGAILNSLQMGFRTLAIQKRTGEVWTVLGAVKTEFSKFGGLLEKVQ KNLQNAGDQLEEVMGKRTRAIERKLRQVEQLPHEESMKILPIDDGDDESTD >gi|222159265|gb|ACAB01000094.1| GENE 83 86829 
- 87683 780 284 aa, chain - ## HITS:1 COG:PA3657 KEGG:ns NR:ns ## COG: PA3657 COG0024 # Protein_GI_number: 15598853 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Pseudomonas aeruginosa # 39 283 5 248 261 270 52.0 2e-72 MKNFIKGFRFTPSNYPAEVEAKIQKYRKQGYKLPPRKVLRTPEQLEGIRESAKINTALLD YISENIREGMSTEEIDVLVYDFTTSHGAIPAPLNYEGFPKSVCTSINEVVCHGIPNKNEI LKSGDIINVDVSTIYKGYFSDASRMFMIGDVNPDMQRLVQVTKECMEIGIAAAQPWKQLG DVGAAIQEHAEKNGFSVVRDLCGHGVGMQFHEEPDVEHFGRRGTGMMIVPGMTFTIEPMI NMGTYEVFIDDADGWTVCTDDGLPSAQWENMILITENGNEILTY >gi|222159265|gb|ACAB01000094.1| GENE 84 87784 - 88407 196 207 aa, chain - ## HITS:1 COG:no KEGG:BT_0639 NR:ns ## KEGG: BT_0639 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 207 1 208 208 255 67.0 7e-67 MDIRKYQLNILCLLLFPLSAAGQEWSKQDSLRLQQMLESDQEIKINRKLIEKVEQKMYSR KPFVDFDPTLPTLKSSTIFSKPSIHTYKMFQKPGSTFLPTYSWLRINKNLILHSKSDFAE NSNHFHIQSQMEYKLSSRWSLDIYGSQNLDTRRYRGLPSEVEPTKLGSNVVFKINKNWKI KTGMQYQYNAIRKRWEWIPQVSVSYEW >gi|222159265|gb|ACAB01000094.1| GENE 85 88519 - 90159 950 546 aa, chain - ## HITS:1 COG:CAC1021 KEGG:ns NR:ns ## COG: CAC1021 COG1032 # Protein_GI_number: 15894308 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 2 460 3 463 548 168 28.0 3e-41 MKLLWLDLNSSYAHSSLALPALHAQIANNTDIEWCTVSATINENTGSVVNQIYRHQPDII AATNWLFNHEQLLHIVSRAKALLPHCCVILGGPEFLGDNEAFLYKNKFVSGVFRGEGEEV FPLWLKVWNQPRKEWKSITGLCYLNESGEYQDNGLARVMNFSELVPPEKSRFFNWSKPFV QLETTRGCFNTCAFCVSGGEKPVRTLSLEAIKERLDVIHEHGIKNVRVLDRTFNYNNKRA KELLNLFREYPDICFHLEIHPALLSDELKQELATLPKGLLHLEAGIQSLRENVLEQSRRI GKLSNALAGLHYLCSLENMETHADLIAGLPLYHLSEIFDDVRTLTEYGAGEIQLESLKLL PGTEMRRRADELGIQYSPLPPYEVLQTREITVDELQTAHYMSRLLDGFYNTPTWRNITRI LILENPHFIHDLLDHLVRTDVIDTPLSLERRGLILYDFCKNHYPDYLTQVSIAWIEAGMS LKKAPAEKVRTKRQLPPESWEVVYGAYRENLRLCFLPVDEEGHGYWFGFESEIQKIQPVF KARTLS >gi|222159265|gb|ACAB01000094.1| GENE 86 90271 - 90684 338 137 aa, chain + 
## HITS:1 COG:no KEGG:no NR:gi|237714572|ref|ZP_04545053.1| ## NR: gi|237714572|ref|ZP_04545053.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 137 1 137 137 251 100.0 8e-66 MEMIKRYAGILMMLTLLLGFTSCESEDETEFNLPGEWYTNEEIDFGAYTWGRGTLMTFNA RNQGTIGSAGDPNYLVFEWRWIDGGYNSMELYFYGDGTYAYIWGAEATDRTFSGTWYNNW RDFRDRIDGQPFYMRRQ >gi|222159265|gb|ACAB01000094.1| GENE 87 90756 - 91676 204 306 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family [Lactobacillus jensenii 269-3] # 85 299 77 284 287 83 29 8e-15 MKKRPRRTPAEKARAQYTNYAVKEPMELMEFLAAKMPDASRTKLKSLLSKRVVFVDNVIT TQFNFPLEAGMKVKISKQKGKKEFNNRLLKIVYEDAYIIVVEKMQGLLSVNTERQKERTA YTILNEYVQRSGRQFRVFIVHRLDRDTSGLMMFAKDEKTQRTLRDNWHEIVTDRRYVAVV EGSMEKDYDTVVSWLTDKTLYVSSSEYDDGGSKSITHYKTIKRANGYSLLELDLETGRKN QIRVHMQDLGHPIIGDGRYGREDSPNPIGRLALHAFKLCFYHPVTGDLMEFETPYPAEFK KLFLKK >gi|222159265|gb|ACAB01000094.1| GENE 88 91780 - 93150 1170 456 aa, chain - ## HITS:1 COG:BH0687 KEGG:ns NR:ns ## COG: BH0687 COG2265 # Protein_GI_number: 15613250 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Bacillus halodurans # 2 456 20 457 458 282 35.0 1e-75 MDVAAEGKAIAKVNDLVIFVPYVVPGDVVDLQIKRKKNKYAEAEAVKFHELSPNRAVPFC QHYGVCGGCKWQVLPYSEQIRYKQKQVEDNLRRIGKIELPEISPILGSDKTEFYRNKLEF TFSNKRWLTNEEVRQDVKYDQMNAVGFHIPGAFDKVLAIEKCWLQDDISNRIRNAVRDYA YEHDYSFINLRTQEGMLRNMIIRTSSTGELMVIVICKITEDHEMELFKQLLQFVADSFPE ITSLLYIINNKCNDTINDLDVHVFKGKDHIFEEMEGLRFKVGPKSFYQTNSEQAYNLYKI ARNFAGLTGNELVYDLYTGTGTIANFVSRQARQVIGIEYVPEAIEDAKVNAEINDIKNAI FYAGDMKDMLTQDFINQHGRPDVIITDPPRAGMHQDVIDVILFAEPKRIVYVSCNPATQA RDLQLLDEKYKVKAVQPVDMFPHTHHVENVVLLELR >gi|222159265|gb|ACAB01000094.1| GENE 89 93398 - 96118 3025 906 aa, chain + ## HITS:1 COG:mlr7532 KEGG:ns NR:ns ## COG: mlr7532 COG0574 # Protein_GI_number: 13476256 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate synthase/pyruvate phosphate 
dikinase # Organism: Mesorhizobium loti # 4 897 3 877 892 1007 57.0 0 MDKKRVYTFGNGQAEGKADMKNLLGGKGANLAEMNLIGIPVPPGFTITTDVCTEYNTLGR DKVVELLKDEVVKAIAHVEALMKSKFGDVENPLLVSVRSGARASMPGMMDTILNLGLNDE VVEGIIRKTGNARFAWDSYRRFVQMYGDVVLGMKPTNKDDIDPFEAIIEEVKKAKGVELD NELKVEDLQELVKKFKAAVKEQTGKDFPTCAYEQLWGAICAVFDSWMNERAILYRKMEGI PDEWGTAVNVQAMVFGNMGDTSATGVCFSRDAGTGEDLFNGEYLINAQGEDVVAGIRTPQ QITKIGSQRWAVLAGVTEDVRAAKFPSMEEAMPEIYKELDALQTKLENHYKDMQDMEFTV QEGKLWFLQTRNGKRTGAAMVKIAMDLLRQGMIDEKTALMRVEPNKLDELLHPVFDKDAL KKAKVLTRGLPASPGAATGQIVFFADDAAEWHAAGKRVVMVRIETSPEDLAGMAVAEGIL TARGGMTSHAAVVARGMGKCCVSGAGALNIDYKNRTVEIDGVLLKEGDYISLNGSTGVVY NGKVETKAAELFGDFAELMTLADKYTRLQVRTNADTPHDAEVARNFGAVGIGLCRTEHMF FEGEKIKAMREMILAENAEGRRKALAKILPYQQEDFKGIFKAMAGCPVTVRLLDPPLHEF VPHDLKGQQEMADTMGVSLQYIQQRVESLCEHNPMLGHRGCRLGNTYPEITQMQTRAILG AALELKKEGVETHPEIMVPLTGILYEFKEQENVIRSEAKKLFEEVGDSIDFKVGTMIEIP RAALTADRIASSAEFFSFGTNDLTQMTFGYSRDDIASFLPVYLEKKILKVDPFQVLDQNG VGQLVRMATEKGRAIRPDLKCGICGEHGGEPSSVKFCHRVGLNYVSCSPFRVPIARLAAA QAAIEG >gi|222159265|gb|ACAB01000094.1| GENE 90 96194 - 96790 253 198 aa, chain - ## HITS:1 COG:alr0739 KEGG:ns NR:ns ## COG: alr0739 COG4430 # Protein_GI_number: 17228234 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. 
PCC 7120 # 10 175 15 189 193 110 38.0 2e-24 MSIEIKYFENRKDWRKWLNDNFEIANEVWFVFPSKSSGEKSITYNDAVEEALCFEWIDST IKSLDKEHKIQHFTPRNPKSTYSQANKERLKWLLENKMIHPKFEDKIRNVLSDPFIFPND IIERLKEDKTIWENYLHFSDAYKRIRIAYIEAARKRPEEFEKRLNNFINKTKENKKIRGF GGIEKYYSSQLKINDNLK >gi|222159265|gb|ACAB01000094.1| GENE 91 97197 - 97712 540 171 aa, chain + ## HITS:1 COG:no KEGG:BT_0645 NR:ns ## KEGG: BT_0645 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 171 1 171 171 295 87.0 4e-79 MKKGLLLVVVMLATIAVKAQDIYVGGSLNVWRNSTGNTTSFKIAPEIGYNFNETWALGAE LDYSHNYDGGVTKNSVFVAPYIRWSYCETGAVRLFLDGTAAVGFVKVKDGDTSKAGQVGL RPGIAVKLNDHFSFIAKYGFLGYRRNINTLGDSFGLQLTSEDLSIGFHYAF >gi|222159265|gb|ACAB01000094.1| GENE 92 97909 - 98478 715 189 aa, chain + ## HITS:1 COG:no KEGG:BT_0646 NR:ns ## KEGG: BT_0646 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 189 1 190 190 305 84.0 8e-82 MKKSVLFVLFALISVAGFSQITGWNAKVGMNISNYTGDTDMNAKIGFKLGGGFEYAFNDT WSLQPSLYLTSKGAKKDELTINAYYLELPVMAAARFNVADNTNLVVNAGPYLACGIAGKS KMDMGNVEYKEDTFGDDALKRFDAGLGVGVALEFGKIIAGLEGQFGLVDVQKVGNPKNMN FSIVVGYKF >gi|222159265|gb|ACAB01000094.1| GENE 93 98622 - 99230 511 202 aa, chain - ## HITS:1 COG:no KEGG:BT_0647 NR:ns ## KEGG: BT_0647 # Name: not_defined # Def: thiamine phosphate pyrophosphorylase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 202 1 202 202 392 95.0 1e-108 MKLIVVTTPTFFVEEDKIITALFEEGLDVLHLRKPETPAMYSERLLTLIPDKYHRRIVTH EHFYLKEEFNLMGIHLNARNPKEPHDYYGHISCSCHSVEEVKNRKHFYDYVFMSPIYDSI SKVNYYSTYTAEELREAQRAKIIDSKVMALGGINEDNLLEIKDFGFGGAVVLGDLWNRFD ACQDQNYLAVIEHFKKLKKLSD >gi|222159265|gb|ACAB01000094.1| GENE 94 99293 - 99985 822 230 aa, chain - ## HITS:1 COG:all2906_1 KEGG:ns NR:ns ## COG: all2906_1 COG0476 # Protein_GI_number: 17230398 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 # Organism: Nostoc sp. 
PCC 7120 # 2 215 18 231 262 216 52.0 2e-56 MRYDRQIILPEVGEEGQKKLQEAKVLIVGMGGLGSPIALYLTGAGVGCLGLVDDDLVSIT NLQRQVLYSEKELGKPKAICAAERLSALNSEIEIHPYAARLTKDNAYDIIQEYDIVVDGC DNFATRYLINDICIEQKKPYVYGAICGFEGQVSVFNYGNQKKNYRDLYPDEEEMQRMPPP PKGVMGVTPAIVGSIEATEVLKIICGFGDVLAGELWTIDLRTLQSNKFSL >gi|222159265|gb|ACAB01000094.1| GENE 95 99990 - 101114 987 374 aa, chain - ## HITS:1 COG:VC0066 KEGG:ns NR:ns ## COG: VC0066 COG1060 # Protein_GI_number: 15640098 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Vibrio cholerae # 2 370 3 368 370 373 49.0 1e-103 MFSDELEKISWEETTKAIYSKTDADVRRALGKKEHLDVNDFMALISPAATPYLEVMARLS QKYTMERFGKTISMFVPLYLTNSCTNSCVYCGFHISNPMKRTILTEEEIVNEYKAIKRLA PFENLLLVTGENPAAAGVPYIARALDLAKPYFSNLQIEVMPLKAEEYKELTNHGLNGVIC FQETYHKANYKTYHPRGMKSKFEWRVNGFDRMGQAGVHKIGMGVLIGLEEWRTDVTMMAY HLRYLQKHYWKTKYSVNFPRMRPSENGGFQPNVVMNDRELAQLTFAMRIFDHDVDISYST RESAEIRNHMATLGVTTMSAESKTEPGGYFSYPQTLEQFHVSDERKAVEVERDLKKLGRE PVWKDWDQSFDFKR >gi|222159265|gb|ACAB01000094.1| GENE 96 101201 - 102898 1729 565 aa, chain - ## HITS:1 COG:PA4973 KEGG:ns NR:ns ## COG: PA4973 COG0422 # Protein_GI_number: 15600166 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine biosynthesis protein ThiC # Organism: Pseudomonas aeruginosa # 7 564 23 596 627 781 63.0 0 MEQKIKFPRSQKVYLPGKLYPNIRVAMRKVEQVPSVSFEGEEKIATPNPEIYVYDTSGPF SDAEMNIDLKKGLPRMREEWIVSRGDVEQLPEITSEYGQMRRDDKSLDHLRFEHIALPYR AKKGEAITQMAYAKRGIITPEMEYVAIRENMNCEELGIKTHITPEFVRQEIAEGRAVLPA NINHPEAEPMIIGRNFLVKINTNIGNSATTSSIDEEVEKALWSCKWGGDTLMDLSTGENI HETREWIIRNCPVPVGTVPIYQALEKVNGIVEDLTWEIYRDTLIEQCEQGVDYFTIHAGI RRHNVHLADNRLCGIVSRGGSIMSKWCLVHDQESFLYDHFDDICDILAQYDVAVSLGDGL RPGSIYDANDEAQFAELDTMGELVLRAWDKNVQAFIEGPGHVPMHKIKENMERQIEKCHD APFYTLGPLVTDIAPGYDHITSAIGAAQIGWLGTAMLCYVTPKEHLALPDKEDVRVGVIT YKIAAHAADLAKGHPGAQVRDNALSKARYEFRWKDQFDLSLDPERAQTYFRAGHHIDGEY CTMCGPNFCAMRLSRDLKKSTKSNK >gi|222159265|gb|ACAB01000094.1| GENE 97 
102911 - 103684 930 257 aa, chain - ## HITS:1 COG:YPO3742 KEGG:ns NR:ns ## COG: YPO3742 COG2022 # Protein_GI_number: 16123879 # Func_class: H Coenzyme transport and metabolism # Function: Uncharacterized enzyme of thiazole biosynthesis # Organism: Yersinia pestis # 2 256 62 324 333 316 61.0 2e-86 MEKLVIAGREFNSRLFLGTGKFNSNEVMEQAILASGTEMVTVAMKRIDMDNKEDDMMKHI IHPNIQLLPNTSGVRNAEEAVFAAQLAREAFGTNWLKLEIHPDPRYLLPDSIETLKATEE LVKLGFIVLPYCQADPVLCKRLEEAGAATVMPLGAPIGTNKGLQTKEFLQIIIEQAGIPV VVDAGIGAPSHAAEAMELGASAVLVNTAIAVAGNPVEMAKAFKAATEAGRQAYKAGLGLQ AVDFVAEASSPLTAFLD >gi|222159265|gb|ACAB01000094.1| GENE 98 103866 - 104486 603 206 aa, chain - ## HITS:1 COG:PAB1645 KEGG:ns NR:ns ## COG: PAB1645 COG0352 # Protein_GI_number: 14521295 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate synthase # Organism: Pyrococcus abyssi # 17 206 21 206 207 128 39.0 8e-30 MISLQFITHQTERYSYLESARMALEGGCKWIQLRMKDALLEEVEAVALQLKPLCKEHEAI LILDDHVELAKKLEVDGVHLGKKDMPIDQARQILGEAFIIGGTANTFEDVVQHYRAGADY LGIGPFRFTTTKKNLSPVLGLEGYSSILSQMKEANIEIPVVAIGGITFEDIPAILHTGVN GIALSGTILGADNPVEETRRIIESDL >gi|222159265|gb|ACAB01000094.1| GENE 99 104491 - 104691 244 66 aa, chain - ## HITS:1 COG:no KEGG:BT_0653 NR:ns ## KEGG: BT_0653 # Name: not_defined # Def: ThiS protein, involved in thiamine biosynthesis # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 66 1 66 66 107 90.0 2e-22 MKVQVNNKEVEMTPASTLTQLTAQLELPVQGIAIAVNNKMIPRIEWECFILHENDNLVII KAACGG >gi|222159265|gb|ACAB01000094.1| GENE 100 104901 - 106883 1574 660 aa, chain - ## HITS:1 COG:VC2453_1 KEGG:ns NR:ns ## COG: VC2453_1 COG0642 # Protein_GI_number: 15642449 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 379 649 238 513 516 166 34.0 1e-40 MVLYKRKEIRYIVSIAVLLFFPLFCNAAATEEPEEEILFITSYNSDTKYTYDNISTFIET YTQLGGRYSTIVENMNATDLTQAHQWKKTLTDILDKHPKAKLVILLGGETWSSFLHLEDE KYKQLPVFCAMASRNGIRIPEDSIDIRNYNPVSIDLTERMKEYNVKYCDTYEYNISKDIE 
MIQDFYPDTEHLVFVSDNTYNGLAELAWFKKNLQHFPQLSITYIDGRIHTLDMAANQLRN LPRNTAMLLGIWRIDSRGITYMNNSVYAFSKANPLLPVFSMTSTAIGYWAIGGYVPQYEG VGKNMGEYAYRFLDQKETGISSINILPNRYKFDAKKLKEWGFENKKLPVNSMVINQPVPF FVAYKTEVQFILIIFLVLVGSLMISLYYYYRTKILKNHLERTTQQLREDKEKLEESEIEL RDAKERAEEANQLKSAFVSNMSHEIRTPLNAIVGFSSLIIGSVEQNDELKEYADIVQTNS NLLLQLISDVLDISRLESGKLQFNYEWCELVNHCQNMITLTNRNKTVDVDIKLQMPKEPY MLYTDPLRLQQIIINLLNNALKFTPAGGSITLDYEVDEEKQCMLFSVTDTGTGIPEDKQE LVFQRFEKLNEFVQGTGLGLAICKLTIQYMGGDIWIDKSYKNGARFIFSHPIKKQESTEK >gi|222159265|gb|ACAB01000094.1| GENE 101 107032 - 107649 432 205 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent [Streptococcus pneumoniae TIGR4] # 13 203 1 195 201 171 43 3e-41 MNTILMSLIMMTMTYEMPKLPYANNALEPVISQQTIDFHYGKHLQTYVNNLNSLVPGTEY EGKTVEEIVATAPDGVIFNNAGQVLNHNLYFLQFAPKPSKKEPSGKLGEAIKRDFGGFEN FKKEFNAAAVGLFGSGWAWLSVGKDGKLKITKEANGSNPVRAGLKPLLGFDVWEHSYYLD YQNRRADHVNALWDIIDWDVVEKRM >gi|222159265|gb|ACAB01000094.1| GENE 102 107725 - 109464 971 579 aa, chain - ## HITS:1 COG:no KEGG:BT_0656 NR:ns ## KEGG: BT_0656 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 579 1 580 580 868 74.0 0 MKRSIFLSIILSLFLVACIPQAMAQKQSRLEKLLRYLNDNDADKWQKNRDKIDDETQTYY AEELALLDVLNELWNKQSEQAATNYFGCYEQAMKAYFPNICEEEKIQLSNVQDKAEQAVI SILEASKEQIPFSKTLMDSIQSSGYPADSALLQKVRDIRELALLEGMLKTPALNIYQTYI SEYPNGKFISQINTAENKRLYQIVKSNPTSANFKAFFDNAAMQKFFTDKDTRPFLSEVRT LYDDFLFQGIDSLREKGNATAIRQIIDDYKQSPYLTSTARTHLDDLEYLSEKADFELLKP AIVSSESLSMLQDFLCTHRYKEFRDQANALRAPFILQTIISTPTSVKYYNGGRLIKSAEN DSTGTTSTTYSYDDKGQLVSTLALTMKNGQPSNEIQTNRLYDPQGHCIFEVQTNPKTKTD LYRRTRRIGTDGSIESDSLKYADGRVVISSYNKQGLLTETKEYNKNGELQSYTANKYDDK GRLVSSQHQNLLFANSPDQVISQKDAYEYDKYGYLSQIVYQRILGNNQKTSGCLTCLYDK YGNRIDGNSYYEYDNTGQWICRTNRDHPKEVERIQYIYK >gi|222159265|gb|ACAB01000094.1| GENE 103 109611 - 111989 2126 792 aa, chain + ## HITS:1 COG:SPy1267 KEGG:ns NR:ns ## COG: SPy1267 COG0210 # Protein_GI_number: 15675225 # Func_class: L Replication, 
recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Streptococcus pyogenes M1 GAS # 6 786 5 767 772 517 39.0 1e-146 MNTNYIDELNESQCAAVTYNDGPSLVIAGAGSGKTRVLTYKIAYLLEQENGYNPWNILAL TFTNKAAREMKERIARQVGMERARYLWMGTFHSIFSRILRAEAQYIGFTSQFTIYDTADS KSLLRSIIKEMGLDEKTYKPGVVQARISNAKNHLVTPTGYAANKEAYEGDMAAKMPAIRD IYTRYWDRCRQAGAMDFDDLLVYTYILFRDFPDVLTRYQDQFHYVLVDEYQDTNYAQHSI ILQLTKENQRICVVGDDAQSIYSFRGADIDNILYFTKIYPNTKVFKLEQNYRSTQTIVCA ANSLIEKNERQIRKAVFSEKEKGEPIGVFQAYSDVEEGDIVANKIAELRREYHYGYAEFA ILYRTNAQSRIFEEALRKRSMPYKIYGGLSFYQRKEIKDVIAYFRLVVNPNDEEAFKRII NYPARGIGDTTVGKIISAATNHGVSLWAAVCEPLSYGLDINKGTHAKLQGFRELIEGFIA DQADKNAYEIGTDIIRQSGIINDVCQDTSPENLSRKENIEELVNGMNDFCALRQEEGNPN VSLTDFLSEIALLTDQDSDKADDGEKVTLMTVHSAKGLEFKNVFVVGLEENLFPSGMVGD SPRALEEERRLFYVAITRAEEHCYLSFAKTRFRYGKMEFGSPSRFLRDIDVDYLRLPHEA GVSRAVDEGAGRFRREIEGGFARSTSPSRAPFGSNSSEQRERPKAQIIAPSVPRNLKKVS AVGGSSSAQMASSGSASVAGVQVGQMIEHERFGLGEVLKVEGTGDNAKATIHFKNAGDKQ LLLRFARFKVVE >gi|222159265|gb|ACAB01000094.1| GENE 104 112015 - 112707 629 230 aa, chain + ## HITS:1 COG:no KEGG:BT_0658 NR:ns ## KEGG: BT_0658 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 230 1 231 231 319 80.0 6e-86 MKKQLTVILLSALLLSGCASGRMGNPGAIMAGASIGGSLGSSIGGLIGDNNHGWRGGYRG SAIGNIVGTIAGAAIGNALTAPRQEQIEEDAYIPEVREVRVQKYKKQPQVQPPISQLKLR KIRFIDDNRSHVIDAGENSKIIFEIMNEGRKPVYNVVPVVETVGKVKHLGISPSVMIEEI LPGEGIRYTASIHAGERLKDGEVTFRVAVADENGVICDSQEFTLPTQRGN >gi|222159265|gb|ACAB01000094.1| GENE 105 112979 - 113536 464 185 aa, chain - ## HITS:1 COG:CC0205 KEGG:ns NR:ns ## COG: CC0205 COG2249 # Protein_GI_number: 16124460 # Func_class: R General function prediction only # Function: Putative NADPH-quinone reductase (modulator of drug activity B) # Organism: Caulobacter vibrioides # 8 176 8 179 185 122 38.0 5e-28 MNKDLRKVVILLAHPNIKESQANKALVDAVSDMEGVAVFNLYELSQEIAFNIDEWSKIIS DASAVIYQFPFYWMSAPSLLKKWQDEVFTFLSKTPAVAGKPLTVVTTTGSEYEAYRSGGR NRFTTDELLRPYQVSAIHSGMSWQTPIVVYGMGTADAGKNIAEGANLYKQRVEMLIGSSN 
AGNNW >gi|222159265|gb|ACAB01000094.1| GENE 106 113713 - 114852 829 379 aa, chain + ## HITS:1 COG:sll0873 KEGG:ns NR:ns ## COG: sll0873 COG0019 # Protein_GI_number: 16330194 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Synechocystis # 4 378 14 386 387 431 50.0 1e-120 MIDFNQFPSPCYIMEEELLRKNLCLIKNVADRAGVEIILAFKSFAMWRSFPIFREYIAHS TASSVYEARLALEEFGSKAHTYSPAYTEQDFPEIMRCSSHITFNSMQQFERFYPMVVAEG SGISCGIRVNPEYSEVETELYNPCAPGTRFGITADLLPDVLPQGIEGFHCHCHCESSSYE LERTLEHLEAKFSRWFPQIKWLNLGGGHLMTRKDYDTEHLITLLQGLKARHPHLRIILEP GSAFIWQTGVLTSEVVDIVESRGIKTAILNVSFTCHMPDCLEMPYQPAVRGAEMGNEGKY IYRLGGNSCLSGDYMGLWSFDHPLQIGERIVFEDMIHYTMVKTNMFNGIHHPAIAIWTKE GKAEIYKQFSYEDYRGRMS >gi|222159265|gb|ACAB01000094.1| GENE 107 115830 - 116996 1349 388 aa, chain + ## HITS:1 COG:lin2213 KEGG:ns NR:ns ## COG: lin2213 COG1820 # Protein_GI_number: 16801278 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Listeria innocua # 43 379 45 378 380 204 34.0 2e-52 MLTQIINARILTPQGWLKDGSVLIRDNKILEVTNCDLAIIGAKLIDAKGMYIVPGGVEIH VHGGGGRDFMEGTEEAFRTAIKAHMQHGTTSIFPTLSSSTIPMIRAAAATTEKMMAEPNS PVLGLHLEGHYFNMDMAGGQIPENIKDPDPEEYIPLLEETRCIKRWDAAPELPGAMQFGK YITAKGVLASVGHTQAEFEDIQTAYEAGYTHATHFYNAMPGFHKRKEYKYEGTVESIYLI DDMTVEVVADGIHVPPTILRLVYKIKGVERTCLITDALACAASDSQVAFDPRVIIEDGVC KLADHSALAGSVATMDRLIRTMVQKAEIPLEDAVRMASETPARIMGVLDRKGTLERGKDA DIIALDRDLNVRAVWAMGELVEGTNKLF >gi|222159265|gb|ACAB01000094.1| GENE 108 117018 - 118190 1107 390 aa, chain + ## HITS:1 COG:lin2213 KEGG:ns NR:ns ## COG: lin2213 COG1820 # Protein_GI_number: 16801278 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Listeria innocua # 44 379 46 378 380 189 31.0 7e-48 MLTQIINGRILTPQGWLKDGSVLICDGKILEVTNSDLAVIGATVIDARGMTIVPGFVSMH AHGGGGHDYTEATEEAFRTATNAHLKHGATGIFPTLSSTSFERIYQAVDVCEHLMKEKDS PILGLHIEGPYLNPKMAGTQYDGFLKTPDENEYIPLLERTSCIRRWDISPELPGAHDFAK 
YTRSKGIMTAVTHTEAEYDEIKAAFAVGFSHAAHFYNAMPGFHKRREYKYEGTVESVYLT DGMTVEVIADGIHLPATILKLVYKLKGVENTCLVTDALAYAAYEGNEPIDPRYVIEDGVC KMADHSALAGSLATMDVLVRTMVKKANIPLEDAVRMASETPARLIGVSDRKGALAKGKDA DIVILDKELNVRCVWSMGKIVPGTDNLLHK >gi|222159265|gb|ACAB01000094.1| GENE 109 118450 - 118812 393 120 aa, chain - ## HITS:1 COG:BB0061 KEGG:ns NR:ns ## COG: BB0061 COG0526 # Protein_GI_number: 15594407 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Borrelia burgdorferi # 4 114 3 112 117 104 39.0 5e-23 MKIIDLTKDSFVEKIADYQSYPDSWNFKGNKPCLVDFHAPWCVYCKALSPILDQLAKEYE GKLDIYKVDVDQEPELESAFKIRTIPNLLLCPLNGKPAMKLGTMNKAQLKELIETSLLSE >gi|222159265|gb|ACAB01000094.1| GENE 110 119049 - 120314 1319 421 aa, chain + ## HITS:1 COG:PA2522 KEGG:ns NR:ns ## COG: PA2522 COG1538 # Protein_GI_number: 15597718 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Pseudomonas aeruginosa # 26 384 40 389 428 73 22.0 1e-12 MNRVFLISFFLLLTGGICAQQATTGTLTLKEAEQRFLERNLSLIAERYNIDMAQAQVLQA RLFENPVISLEQNVYNRLNGKYFDFGKEGETVVEVEQVIRLAGQRNKQIKLEKINKEIAE YQFEEVMRTLRQELNEKFVQVYFLSKSISIYEKEVNSLQELLAGMKLQQEKGNISLMEMS RLESMLFSLKKEKNERENELLTLRGELNVLLNLPGDTTVKLSLDEEVLKQLDLSQLSFAD LKAMVNERPDLKIARSTVSASRANLKLQKSMAFPEFSVKGNYDRAGNFINNYFAVGVSLS VPIFNRNQGNIKAVRFSIQQAGAEQENAANRADMELYTAYASLEKAVQLYQSTNMDLERN FEKLITGVNENFTKRNISLLEFIDYYDSYKETCIQLHEIKKDVFLAMENLNTTIGQNILN Y >gi|222159265|gb|ACAB01000094.1| GENE 111 120337 - 121431 894 364 aa, chain + ## HITS:1 COG:PA2521 KEGG:ns NR:ns ## COG: PA2521 COG0845 # Protein_GI_number: 15597717 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Pseudomonas aeruginosa # 43 287 158 403 484 122 27.0 1e-27 MNWNKFLPCILLTTVLGACSGKGEQPNVEPTALCLTDSLLRIVSVDTVHVQEVIDELTLN 
GRVTFNENQVAHVYPMFGGTVTELKAEIGDYVRKGDVLAVIRSGEVADYEKQLKEAEQQL LLARRNMDATQDMYNSGMASDKDVLQAKQELTSAEAEERRIKEVFSIYNFSGNAYYQLKS PVSGFIVEKQISRDMQLRPDQSEELFTISGLSDVWVMADVYESDISKISEGASVRISTLA YPDKMFAGTIDKVYHLLNNESKTMNVRIKLKNEEYLLKPGMFTNVSVKCKADETSMPRID SHALVFEGGKNYVVVVEPDQRLQVKEVDVYKQLSKECYIRSGLSEGDRVLNNNVLLLYNA LNAD >gi|222159265|gb|ACAB01000094.1| GENE 112 121446 - 124571 2797 1041 aa, chain + ## HITS:1 COG:RSp1040 KEGG:ns NR:ns ## COG: RSp1040 COG3696 # Protein_GI_number: 17549261 # Func_class: P Inorganic ion transport and metabolism # Function: Putative silver efflux pump # Organism: Ralstonia solanacearum # 5 1034 2 1025 1038 729 37.0 0 MHKFIDNIVAFSLKNKFFIFFCTAIAVIAGVVSFKHTPIDAFPDVTNTKVTIITQWPGRS AEEVEKFITIPVEIAMNPVQKKTDIRSTTLFGLSVINVMFEDRVDDFTARQQVYNLLNDA DLPDGVTPEVQPLYGPTGEIFRYTLRSDKRSVRELKTIQDWVIERNLRSVSGVADIVSFG GEVKTFEVSVNPHQLINYGITSLELYDAIAKSNINVGGDVITKSSQAYVVRGIGLINDLE ELRNIVVKNINGTPILVKNLADVHESCLPRLGQVGRMDEDDVVQGIVVMRKGENPGEVIA NLKDKIEDLNQNILPKDVKIVSFYDREDLVNLAVKTVTHNLIEGILLVTFIVLIFMADWR TTVIVAVVIPLALLFAFICLRVMGMSANLLSMGAIDFGIIIDGAVVMVEGVFVALDKKAR QVGMPAFNVMSKMGLIRNTAKDKAKAVFFSKLIIITALIPIFSFQKVEGKMFSPLAYTLG FALLGALIFTLTLVPVMSSMLLKKNVREKNNRFVRFINTKCTALFDLFYAHRKLTIGLAT VIAGVGLWLFSFLGTEFLPQLNEGSIYIRATLPQSISLDESVTLANKMRRKLLTFPEVRQ VLSQTGRPNDGTDATGFYNIEFHVDIYPEKEWESKLTKMELIDKMQEDLSIYPGIDFNFS QPITDNVEEAASGVKGSIAVKVFGKDLYESEKYAVQIEKILGTVQGIEDLGVIRNIGQPE LRIELNEGQLARYGVAKEDVQSIIEMAIGGKSASLLYEDERKFNIMVRYSEQFRQNEEEI GKILVPAMDGTMVPIKELADITTITGPLLIFRDNHARFCAVKFSVRGRDMGTAVAEAQKK VNASVHLPAGYSLKWTGDFENQQRATKRLAQVVPISIAIIFIILFILFSNARDAGLVLLN VPFAAVGGIVALLITRFNFSISAGIGFIALFGICIQNGVIMISDIKANLKLGSPLEEATK EGVRSRIRPVIMTAAMAAIGLLPAAMSHGIGSESQRPLAIVIIGGLIGATFFALFVFPLI VEVVYERMLYDKNGKLLQRRI >gi|222159265|gb|ACAB01000094.1| GENE 113 124659 - 125990 912 443 aa, chain - ## HITS:1 COG:RSp1043 KEGG:ns NR:ns ## COG: RSp1043 COG0642 # Protein_GI_number: 17549264 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum 
# 157 431 167 450 466 159 32.0 1e-38 MKIGSKIALFYTLISVLTTVIIIAVFYIFSTQYINKLYASYLREKAYLTAQKHWEKDEVD EQSYQIIQRKYDELLPEAHEILLNMDSLSEVRDTLNKYLTQHQQTLLLARQDSIPFSFKY KDQLGAALYYPDNEGNFIVLVMSRNAYGTEIKEHLLLLSIFLVLASSILIFFIGKIYSGR ILVPLQHILKELKRIRANSLNRRLKTTGNNDELEDMIKTLNSMLDRLDSAFKAEKSFVSH ASHELNNPITAIQGECEISLLKERSTGEYIESLQRISSESKRLSSLIRHLLFLSRQEEEL LKNNVEEIILSDILKELTGSNERIRLHLEATEQQAVVKANPYLLKIALKNIIDNACKYSD KEVNVALYREQQQVILEVEDQGIGIPQEEIEHIFQSFYRGSNTRDYAGQGIGLSLTLKII SAYHAKLDISSEIEKGTKVRVIF >gi|222159265|gb|ACAB01000094.1| GENE 114 126011 - 126685 555 224 aa, chain - ## HITS:1 COG:alr1194 KEGG:ns NR:ns ## COG: alr1194 COG0745 # Protein_GI_number: 17228689 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Nostoc sp. PCC 7120 # 4 222 5 222 228 194 46.0 1e-49 MAKILLVEDEINIASFIERGLKEFGHTVTVCHDGNAGWKILQDEPFDLLILDIIMPKING LELCRLYRQTFGYQSPVIMLTALGTTEDIVKGLDAGADDYLVKPFSFQELEARIKALLRR NKEVTSNLLTCDNLILDCNTRRAKRGNIDIDLTVKEYRLLEYFMTHQGVALSRITLLKDV WDKNFDTNTNIVDVYVNYLRVKIDRDFDKKLIHTVVGLGYIMNT >gi|222159265|gb|ACAB01000094.1| GENE 115 126827 - 128491 1414 554 aa, chain - ## HITS:1 COG:aq_999_1 KEGG:ns NR:ns ## COG: aq_999_1 COG1022 # Protein_GI_number: 15606303 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Aquifex aeolicus # 16 548 4 499 600 230 29.0 5e-60 MEQEHQFIDYIEQSIIKNWDKDALTDYKGITLQYKDVARKIAKFHIVLESAGIQPGDKIA VCGRNSAHWAVTFLATITYGAVIVPILHEFKADNIHNIVNHSEAKLLFVGDQAWENLNED AMPLLEGIALLTDFSPLVSRNEKLTYAFEHRNAIYGRQYPKNFRPEHICYRKDHPEELAI INYTSGTTGYSKGVMLPYRSLWSNVAYCFEMLPVKPGDHIVSMLPMGHVFGMVYDFLYGF SAGAHIYFLTRMPSPKIISQSFSEIKPRVISCVPLIVEKIIKKDILPKVDSTIGKLLLKV PIVNDKIKSLARQAAMEIFGGNFDEIIIGGAPFNAEVEAFLKKIGFPYTIAYGMTECGPI ICSSRWETLKLASCGKATSRMEVRIDSPDPETHAGEIVCRGMNMMLGYYKNPEATAQIID ANGWLHTGDLGTLDEEGYVTVRGRSKNLLLTSSGQNIYPEEIESKLNNMPYVAESLIVLQ 
HEKLVAMIYPDFDDAFAHGLQQTDIQKVMEQNRIELNQQLPNYSQISKIKIHFEEFEKTA KKSIKRFMYQEAKG >gi|222159265|gb|ACAB01000094.1| GENE 116 128774 - 130240 1483 488 aa, chain - ## HITS:1 COG:BH4038 KEGG:ns NR:ns ## COG: BH4038 COG3263 # Protein_GI_number: 15616600 # Func_class: P Inorganic ion transport and metabolism # Function: NhaP-type Na+/H+ and K+/H+ antiporters with a unique C-terminal domain # Organism: Bacillus halodurans # 1 484 1 480 490 298 37.0 2e-80 MIFTAENTLLIGSILLFVSIVVGKTGYRFGVPTLLLFLVVGMLFGSDGLGLQFHDAKDAQ FIGMVALSIILFSGGMDTKFREIKPILGPGIVLSTVGVLLTALFTGLFIWWISGMSWSNI YLPITTSLLLASTMSSTDSASVFAILRSQKMNLKHNLRPMLELESGSNDPMAYMLTIVLI QFIQSSGMGAGAIVGSFAIQFIVGAAAGYVLGKLAIRMLNKLNIDNQALYPILLLAFVFF TFSITDLLKGNGYLAVYIAGIMVGNNKIMHRKDIYTFMNGLTWLFQIIMFLCLGLLVNPH EMLEVAAVALLIGVFMIIIGRPLSVFLCLLPFRKITMKSRIFVSWVGLRGAVPIIFATYP VVAGVEGSNLIFNIVFFITIVSLVVQGTTISFVARILNLSKPLEKTGNDFGVELPEEIDS DLSDMTITKSMLEEADTLKDMNLPKGTLVMIVKRGDEFLIPNGTLKLHEGDKLLLISEKS KEEETDSD >gi|222159265|gb|ACAB01000094.1| GENE 117 130407 - 131591 923 394 aa, chain + ## HITS:1 COG:VC2171 KEGG:ns NR:ns ## COG: VC2171 COG2233 # Protein_GI_number: 15642170 # Func_class: F Nucleotide transport and metabolism # Function: Xanthine/uracil permeases # Organism: Vibrio cholerae # 9 393 1 399 417 349 50.0 4e-96 MDSNNLTPLRKGVVGVQFLFVAFGATVLVPLLVGLDPSTALFTAGIGTLIFHAVTRGKVP IFLGSSFAFIAPIIKATELYGLPGALSGMVGVALVYFVMSALVKWQGVRVIERLFPPVVI GPVIILIGLSLAGTGVNMAKENWVLALLSLVTAVVVSMKAKGLLKLIPIFCGIVVGYLAA WIFYGLDLSGVRDAAWIGLPQFVFPKFSWEPVLFMIPVAIAPVIEHIGDVYVVNTVTGKD FVKDPGLHRTLLGDGLACLCAGLLGGPPVTTYSEVTGAMSLTKITNPQVVRIAAISAILF SVIGKISALLRSIPSAVLGGIMLLLFGTIACAGIGNLVNNCIDLSRTRNIVIVSLTLTVG IGGAAFSWGDFSLSGIGLAALVGVVLNLILPKED >gi|222159265|gb|ACAB01000094.1| GENE 118 131799 - 133430 1690 543 aa, chain - ## HITS:1 COG:CAC2750 KEGG:ns NR:ns ## COG: CAC2750 COG1151 # Protein_GI_number: 15896007 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Clostridium acetobutylicum # 1 541 1 527 530 735 65.0 
0 MSMFCYQCQETAMGTGCTLKGVCGKTSEVANLQDLLLFVVRGIAVYNEHLRKEGQSSEQA DKFIYDALFITITNANFDKVAITEKIKEGLKLKKELAGKVKIENAPDECLWDGNEEEFEE KSKTVGVLRTPNEDIRSLKELVHYGLKGMAAYVEHAHNLGYQSPEIFAFMQHALSELTRN DITVEELVQLTLETGKHGVSAMAQLDKANTSSYGNPEISQVNLGVRNHPGILISGHDLKD LEELLEQTEGTGVDVYTHSEMLPAHYYPQLKKYKHLAGNYGNAWWKQKEEFESFNGPILF TSNCIVPPRANASYKDRIYITGACGLEGAHYIPERKDGKPKDFSALIAHAKQCQPPVAIE NGTLIGGFAHAQVTALADKVVDAVKSGAIRKFFVMAGCDGRMKSREYYTEFAQKLPGDTV ILTAGCAKYRYNKLALGDINGIPRVLDAGQCNDSYSLAVIALKLKEIFGLDDVNQLPIVY NIAWYEQKAVIVLLALLALGVKHIHLGPTLPAFLSPNVKNVLIEQFGIGGISTVDEDIVK FLS >gi|222159265|gb|ACAB01000094.1| GENE 119 133581 - 134249 742 222 aa, chain - ## HITS:1 COG:CAC0884 KEGG:ns NR:ns ## COG: CAC0884 COG0664 # Protein_GI_number: 15894171 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 10 222 14 226 229 121 30.0 8e-28 MIPTLVNNPLFRGITPERLLADLEEISFHTRSYKKGEILARQGDVCNRLVILTKGSVRGE MIDYSGRLIKVEDITAPRAIAPLFLFGEQNRYPVEVTANEPTEVIELPKPSVLSLFRKNE QFLENYMNLSANYARTLSDKLFFMSFKTIRQKLASYLLRLYKQQQQTHITLDRSQQELSD YFGVSRPSLARELAHMQEDGLLIADRKHITILQKEELVRLIQ >gi|222159265|gb|ACAB01000094.1| GENE 120 134260 - 135378 737 372 aa, chain - ## HITS:1 COG:CC1742 KEGG:ns NR:ns ## COG: CC1742 COG5000 # Protein_GI_number: 16125986 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation # Organism: Caulobacter vibrioides # 17 371 326 698 716 115 27.0 1e-25 MIDALENNDNTFHFTEENGTPESKEINRALNRVGHILYSVKAETAQQEKYYELILDFVST GLVVLNDNGAVYQKNKEALRILGLNVFTHIRQLSQVDAQLMEKIENCRPGDKLQVIFHNE RGTINLSIRVSEINVRKEHLRILALNDINTELDEKEIDSWIRLTRVLTHEIMNSVTPITS LCDTLLSISANKDEEINHGLQTISTTGKGLLAFVESYRQFTRIPTPEPSLFYVKAFIERM VELACHQHPCEHIRFHTEITPADLIVYADENLISQVVINLLKNAIQAIGNQPDGRIELKA SCNDMEEIWIEIKNNGPEIPAEIAEHIFIPFFTTKEGGSGIGLSISRQIMRLSGGSLTLL REKETTFILKFK >gi|222159265|gb|ACAB01000094.1| GENE 121 
135570 - 136910 901 446 aa, chain - ## HITS:1 COG:hydG KEGG:ns NR:ns ## COG: hydG COG2204 # Protein_GI_number: 16131834 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 7 443 8 441 441 307 40.0 3e-83 MSKSGTIIIVDDNKGVLTALQILLKNYFSKVVTLSSPVTLSSVIREEAPEVILLDMNFTS GINTGNEGLFWLHEIKKVRPELPVVLFTAYADIELAIRGIKEGATDFIVKPWDNQKLVET LQTAAASTHNNKKTDGKEETTHSPMYWGESKVMQQLRALIEKVAITDANILITGENGTGK EMLAREIHVLSNRKYKEMIAVDMGTITESLFESELFGHVKGSFTDAHTDRTGKFEAADNS TLFLDEIGNLPYHLQAKLLTVLQRRSIVRVGSNTPIPIHIRLICATNRNLQEMVVKGEFR EDLLYRINTIHVEIPPLRERKEDIIPLAERFMVRFCKQYDKALMKFTPDAKDKLKAHLWY GNIRELEHVIEKAVIINDSPLVPAELFQLSIPRTESQEKSISTLEEMEMQMIRKALDTCA GNLSAVAAQLGITRQTLYNKMKKFGL >gi|222159265|gb|ACAB01000094.1| GENE 122 136986 - 137795 417 269 aa, chain - ## HITS:1 COG:no KEGG:BT_0691 NR:ns ## KEGG: BT_0691 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 269 1 267 267 461 82.0 1e-129 MNMNWLNTNYILRAGVLSVLLLLPGLISATCKADTIINSYKDDSLRFIIKDDSIITFRTG DSIFTIIRADSVLPVTPKQVKHSRYDNRIHRFRSHWERIIPTHSKIQYAGNMGLLSFGTG WDYGKHNQWETDILLGFIPKYSSKKAKVTMTLKQNYMPWSINIGKGFSTEPLTCGLYVNT VFGDQFWVNEPERYPKGYYGFSSKVRFHVFMGQRLTYDIDPQRRFLAKSVTFFYEISTCD LYVISAVNNSYLRPRDYLSLSFGLKFQWL >gi|222159265|gb|ACAB01000094.1| GENE 123 137770 - 138573 670 267 aa, chain - ## HITS:1 COG:no KEGG:BT_0692 NR:ns ## KEGG: BT_0692 # Name: not_defined # Def: calcineurin superfamily phosphohydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 267 1 267 267 451 81.0 1e-125 MMRKNVYVFLSCFLLSGCGMIDYHPYDVRISGETEVNAHNIERIEANCQGKTTIRFVTMG DSQRWYDETEDFVKEINKRNDIDFVIHGGDMSDFGLTKEFLWQRDIMNGLNVPYVVLIGN HDCLGTGAETYKAVFGPTNFSFIAGNVKFICLNTNALEYDYSEPVPNFTFMEQELTNRQD EFKKTVISMHARPYTDVFNDNVAKVFQHYIRQYPGIQVCTAAHTHHYQDDVIFDDGIHYV TSDCMDYRTYLVFTITPEKYEYELVKY >gi|222159265|gb|ACAB01000094.1| GENE 124 138875 - 139549 329 224 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 219 1 218 245 131 35 3e-29 MIQIENISKVFRTTEVETVALNHVNLEVKEGEFVAIMGPSGCGKSTLLNILGLLDNPTEG SYLLMGEEVAGLKEKERTRVRKGKLGFVFQSFNLIDELNVYENVELPLTYLGLKSSERRR MVEDILKRMNINHRAKHFPQQLSGGQQQRVAIARAVVTNPQLILADEPTGNLDSKNGAEV MNLLTELNREGTTIIMVTHSQHDASFAHRTVHLFDGSIVASVKA >gi|222159265|gb|ACAB01000094.1| GENE 125 139564 - 141882 1802 772 aa, chain + ## HITS:1 COG:no KEGG:BT_0695 NR:ns ## KEGG: BT_0695 # Name: not_defined # Def: ABC transporter, permease protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 772 1 771 771 1240 80.0 0 MRQIYYTIRTLLRECGSNIIRIISLSLGLTIGILLFSQIAFELSYEKCYPEAERLALVRC QMTNLSTGETAGDDGEIGYDYTVFDVVAPTLAEEMPKEIEVASSVLSMGSANIYYEDKLL PDADYIFADTCFFQTFGIPVLEGNPKDMIMPGSVFVSEHFARETFGDESPVGKVLSVEKQ NTLTIRGIYKDVPENTMLTHDFVISVHQNGGYHAGAGWRGNDVFYAFLRLRHASDIDKVN ADIQRVIGKYTALEFDGWKIEFSAIPLVKRHLASPDVQKRLVIYGFLGFAIFFVAIMNYM LISIATLSRRAKGVGVHKCNGASSTHIFRMFMAETGILVILSVLLSFLLIINARGLIEDL LSVRLSSLFTWETLWVPLLTILVLFILAGGIPGRLFSRIPVTQVFRRYTDGKKGWKRSLL FVQFTGVSFVLGLLLVTLLQYSHLMSRDMGIVVPGLAQAQTWLPKESVEHIKDDLNRQPM VEGVTVAVNGVLGEYWTRGLMGNDGKRIATLNFNSCHYNYPEVMGIKIIEGTTLKKQNDL LVNEELVRLMKWTDGAVGKTVNDIQGTIVGVFRDIRNNSFYGSQSPIVLIGDENANHAFD VRLKEPYNENLKRLNEFVENTYPNISLRFILVDQMVKNIYKDVYRFRNSVWITSAFILLI VIMGLIGYVNDETQRRSKEIAIRKVNGAEASHILGLLTRDILYVSVISILVGTTVSYFAG QAWLDQFAEQIDLNPLLFAATALFVQLLIVICVVLKAWHIANENPVNSIKAE >gi|222159265|gb|ACAB01000094.1| GENE 126 141922 - 143283 1211 453 aa, chain - ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 4 430 2 429 448 335 42.0 1e-91 MSKNDPHILGKERIGKLLLQYSIPAIIGMTITSIYNIIDSIFIGHGVGPMAIAGLAISFP LMNLVVAFCTLVSAGGSTLASIRLGQKDMKGATEILSHTLMLCITNSLFFGILSFIFLDD ILTFFGASADTLPYARSFMQVILLGTPITYTMIGLNNVMRATGYPKKAMLTSMVTVVANI ILAPIFIFHFEWGMRGAATATVISQLIGMVWVVSHFTKKDSTVHFEGNIWKMKPRIVQSI 
FAIGMSPFLMNVCACAIVIIINNSLQNYGGDMAIGAYGIINRLLTLYVMIVLGLTMGMQP IVGYNFGAQKIDRVKQTLRLGIISGVVITSSGFVICEFFPHAVSALFTDSDELIDLAVGG IRLTVLMFPFVGAQIVIGNFFQSIGKAKVSIFLSLTRQLLYLLPCLLLFPNWWGLEGIWI SMPVSDALAFITAVISLMIYIKKVSKQHPVVAE >gi|222159265|gb|ACAB01000094.1| GENE 127 143422 - 144069 620 215 aa, chain + ## HITS:1 COG:all1058_2 KEGG:ns NR:ns ## COG: all1058_2 COG0637 # Protein_GI_number: 17228553 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Nostoc sp. PCC 7120 # 10 210 10 213 223 79 28.0 3e-15 MNTTKTIAALFDFDGVIMDTETQYTVFWDEQGRKYLNEEDFGRRIKGQTLLQIYEKHFAD KPEAQLEISAELNVYEKKMSYEYIPGVEAFIADLRRNGAKIAVVTSSNEEKMANVYNAHP EFKGMVDRILTGEMFARSKPAPDCFLLGMEIFEATPENSYVFEDSFHGLQAGMTSGATVI GLATTNSREAITGKAHYIIDDFTGMTYEKMISLHK >gi|222159265|gb|ACAB01000094.1| GENE 128 144335 - 145156 1049 273 aa, chain + ## HITS:1 COG:Cgl0115 KEGG:ns NR:ns ## COG: Cgl0115 COG0413 # Protein_GI_number: 19551365 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate hydroxymethyltransferase # Organism: Corynebacterium glutamicum # 8 273 5 269 269 249 50.0 4e-66 MAGYISDDTRKVTTHRLVEMKQRGEKISMLTAYDYTMAQIVDGAGMDVILVGDSASNVMA GNVTTLPITLDQMIYHAKSVVRGVKRAMVVVDMPFGSYQGNEMEGLASAIRIMKESHADA LKLEGGEEIIDTVKRIVCAGIPVMGHLGLMPQSINKYGTYTVRAKDEAEADKLIRDAHLL EEAGCFAIVLEKIPATLAERVASELTIPIIGIGAGGHVDGQVLVIQDMLGMNNGFRPRFL RRYADLYTVMTDAISRYVSDVKNCYFPNEKEQY >gi|222159265|gb|ACAB01000094.1| GENE 129 145165 - 146322 746 385 aa, chain + ## HITS:1 COG:BS_ywoG KEGG:ns NR:ns ## COG: BS_ywoG COG0477 # Protein_GI_number: 16080698 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus subtilis # 2 342 4 347 396 71 24.0 3e-12 MAVKLWTVHFMRICVANLLLFISLYVLFPVLSVEMADRLGVPAAQTGVIFLFFTLGMFLI GPFHAYLVDAYKRKYVCMFAAALMVVATIGYAFVTNFTELILLSTVQGLAFGIGTTAGIT 
LAIDITNSTLRSAGNVSFSWTARLGMLLGIILGVWLYQSHSFQNLLTVSVITGAVGILML SGVYVPFRAPIVTKLYSFDRFLLLRGWVPAINLILITFVPGLLIPMVHPFLNDFVLGNVG IPVPFFVGTTLGYIISLFIARLFFFKEKTLRLVIIGIGLEMVAMSLLNTDLSIGISSVLL GLGLGLTMPEFLMIFVKLSHHCQRGTANTTHLLASEVGISLGIATACYMELDTDKMLHTG QMVASIALLFFVLVTYPYYIKKKVR >gi|222159265|gb|ACAB01000094.1| GENE 130 146426 - 148642 2270 738 aa, chain - ## HITS:1 COG:lin1558 KEGG:ns NR:ns ## COG: lin1558 COG0317 # Protein_GI_number: 16800626 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Listeria innocua # 59 737 51 735 738 400 33.0 1e-111 MDDFFTLEEKKELFSLYRHLLQSAGDSIFWRDCQKLKKHLIKAAQCNGLQRNNFGMNPVI RDLQTAVIVAEEIGMKGSCLIGIMLHEIVKGHVLSIDEVNAEYGEDVASIIKGLVKTNEL YAKSPAIESENFRNLLLSFAEDMRVILIMIADRVNVMRKIKDTGNEEDRIKVANEAAYLY APLAHKLGLYKLKSELEDLSLKYTQKETYYFIKDKLNETKASRDKYIAAFIEPIQKKVAE AGLKFDIKGRTKSIHSIWNKIQKQKTPFEGIYDLFAIRIILDSEPDPAKEKQECWQVYSI VTDMYQPNPKRLRDWLSIPKSNGYESLHITVMGPEGKWVEVQIRTRRMDEIAERGLAAHW RYKGIKGETGLDEWLTSVREALENADNDSMKVMDQFKMDLYEDEVFVFTPKGDLFKLAKG ATVLDFAFHIHSKLGCKCIGAKVNGKNVQLKQKLNSGDQVEIMTSNTQTPKQDWLNIVTT SKARTKIRQALKEMVARQHAFAKETLERKFKNRKLEYDEATMMRLIKRLGFKNVTEFYQR IADGGLDVNEILDKYIEQQKRDSDTHDEIVYRSAEGYNLQTAQEETTSKEDVLVIDQNLK GLEFKLAKCCNPIYGDDVFGFVTVSGGIKIHRSDCPNANQMRERFGYRIVKARWAGKSEG TQYPITLRVVGHDDIGIVTNITSIISKENGITLRSIGIDSNDGLFSGTLTVMVGDTGRLE ALIKKLRTVKGVKQVSRN >gi|222159265|gb|ACAB01000094.1| GENE 131 148749 - 150266 1828 505 aa, chain + ## HITS:1 COG:no KEGG:BT_0701 NR:ns ## KEGG: BT_0701 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 505 1 505 505 953 92.0 0 MITTQDKELLAKKGITEEQIAEQLACFQTGFPFLKLDAAASIEKGILAPDAEEQKAYLAA WDAYTNTDKTVVKFVPASGAASRMFKNLFEFLSADYDQPTTKFEQAFFDGIKNFAFYDEL NVACQRISGKDIPGLMEEGNYKAVVSALLETAGLNYGALPKGLLKFHKYPEGSRTPMEEH LAEGALYAAGKSGKVNVHFTVSTEHRELFKKLVTEKVDDFAKRYGVDYYITFSEQKPSTD TIAADMDNQPFRDNGKLLFRPGGHGALIENLNDLDADIIFIKNIDNVVPDRLKADTVTYK 
KLIAGVLVTLQKHVFEYLTLLDSGKYTHDQMMEMLQFLQKKLFCKNPETKDLEDSVLAIY LKNKFNRPMRVCGMVKNVGEPGGGPFLAYNSDGTISLQILESSQIDMDDPEKKEMFEKGT HFNPVDLVCAVRDYKGHKFDLVKYVDKATGFISYKSKNGKDLKALELPGLWNGAMSDWNT VFVEVPLSTFNPVKTVNDLLREQHQ >gi|222159265|gb|ACAB01000094.1| GENE 132 150420 - 151043 543 207 aa, chain + ## HITS:1 COG:CAC0235 KEGG:ns NR:ns ## COG: CAC0235 COG4845 # Protein_GI_number: 15893527 # Func_class: V Defense mechanisms # Function: Chloramphenicol O-acetyltransferase # Organism: Clostridium acetobutylicum # 6 201 2 196 212 125 30.0 4e-29 MNQIEKIIDIATWNRREHYEHFSAFDDPFFGVTVNVDCTRAYQEAKDKGVSFSLLVLHRI VTAAAAVEEFRYRIEGNRVVCYDSLLPEATVGRADHTFSFAAFEYDPDELVFIRRAKAEM ERLQATTGLNKGGTFHPNAIHYSAVPWLSFTDMKHPTNMRSGDSVPKISTGKYFREGERL MMPVSVTCHHGLMDGYHVAQFIEKLHL >gi|222159265|gb|ACAB01000094.1| GENE 133 151078 - 152472 974 464 aa, chain + ## HITS:1 COG:no KEGG:BT_0703 NR:ns ## KEGG: BT_0703 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 85 464 33 407 407 655 83.0 0 MRVKIPRREIILFCSALLFPVASLCAQEVGQEVKEVRQDVEQLQQDVRELREEVRRLQEE LHGFRRNSFPQCGMDTVAPYVPHRFIHRLGIEARPQYVFPTNPFLQGENERWKPIQSSFA AHLKYSFKFRPNTCADRIYGGAYQGFGLAVTTFGDRKQLGDPVTFYVFQGARIARFNPRL SLNYEWNFGLSAGWKPYDNDYNSYNGAVGSRVNAYLNAGIYLNWSLSRYFDFIIGGDFTH FSNGNTKFPNAGVNTTGAKIGLVYNFNREESDLTKSLVKPYIPRFPRHISYDLVLFGSWR RKGVYVESGKQIASPGSYPVAGFNFAPMYNLNYKLRFGVSLDGVYDGSANVYTEDRIVEY DYDGGSGTSERRFLVPGIQHQLALGLSGRAEYVMPFFTINVGLGTNVLGRGDLRGLYQVF ALKIDVTRSSFLHIGYNLQNFQTPNYLMLGLGFRFNNKYPKVRH >gi|222159265|gb|ACAB01000094.1| GENE 134 152518 - 153081 516 187 aa, chain + ## HITS:1 COG:CAC3336 KEGG:ns NR:ns ## COG: CAC3336 COG0664 # Protein_GI_number: 15896579 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 14 186 21 194 199 67 30.0 1e-11 MENIIQKIREIYPVSDEALQALQANMELRYYPKDTYIVQSGVTDRLVYFIEEGIARSVFH 
HNGQDTTTWFTLEGDITFGMDSLYYQQPSVESIETLSDCKIYVIHIDKLNMLYETYIDIA NWGRILHQNVNKELSHMFVERLQLSPKERYEQFNRRYPGLINRVKLKYVAAFLGISIYTL SRVRAKK >gi|222159265|gb|ACAB01000094.1| GENE 135 153179 - 154333 1172 384 aa, chain + ## HITS:1 COG:CC1328 KEGG:ns NR:ns ## COG: CC1328 COG1835 # Protein_GI_number: 16125577 # Func_class: I Lipid transport and metabolism # Function: Predicted acyltransferases # Organism: Caulobacter vibrioides # 16 375 12 332 337 116 33.0 7e-26 MSNISSSAFADTKAHYDLLDGLRGVAALMVIWYHVFEGYAFAGGGNIETLNHGYLAVDFF FILSGFVIGYAYDDRWGKSLTMKDFFKRRLIRLHPMVVMGAVLGAITFCIQGCVQWDGTH VAISMIMLSLLCTIFFIPAMPGVGYEVRGNGEMFPLNGPCWSLFFEYIGNILYALFIRRL SNKALTVFVVLLGVALAAFAVFNVSTYGNIGVGWTLDGVNFLGGSLRMLFPFSLGMLMSR NFKPMKVRGAFWICTVALIALFAVPYLEGMEPLCMNGVYEAFCVIVAFPIILWIGASGTT TDVQSTKICKFLGDISYPVYVIHYPLMYLFYAWLIENKLYTLGETWHVAVCVFVLSIVLA YLCLKLYDEPIRKYLAKRFLSKKR >gi|222159265|gb|ACAB01000094.1| GENE 136 154505 - 154975 381 156 aa, chain - ## HITS:1 COG:alr3535 KEGG:ns NR:ns ## COG: alr3535 COG0454 # Protein_GI_number: 17231027 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Nostoc sp. PCC 7120 # 3 155 2 153 156 89 33.0 2e-18 MNRIRQVNLSDIPALQELYQHTVLTVNRKDYTAEEVANWASCGDDTSHWGELFEEQHYVV AENEEGVIVGFGSVNDDGYMHTLFVHKDFQHQGIATSLYKYLEAYARERGAKRLTSEVSI TAKPFFEKQGFQVDEEQKRKANQMCLTNYKMSKQLY >gi|222159265|gb|ACAB01000094.1| GENE 137 154996 - 156162 1430 388 aa, chain - ## HITS:1 COG:alr1299 KEGG:ns NR:ns ## COG: alr1299 COG0027 # Protein_GI_number: 17228794 # Func_class: F Nucleotide transport and metabolism # Function: Formate-dependent phosphoribosylglycinamide formyltransferase (GAR transformylase) # Organism: Nostoc sp. 
PCC 7120 # 2 387 9 388 391 421 57.0 1e-117 MKKILLLGSGELGKEFVISAQRKGQHIIACDSYAGAPAMQVADEFEVFDMLNGEELERVV KKHQPDIIVPEIEAIRTERLYDFEKEGIQVVPSARAVNFTMNRKAIRDLAAKELGLKTAK YFYAKTLDELKEAAAKIGFPCVVKPLMSSSGKGQSLVKSADELEHAWEYGCSGSRGDIRE LIIEEFIKFDSEITLLTVTQKNGPTLFCPPIGHVQKGGDYRESFQPAHIDPAHLKEAEEM AEKVTRALTGAGLWGVEFFLSHENGVYFSELSPRPHDTGMVTLAGTQNLNEFELHLRAVL GLPIPGIKQERIGASAVILSPIASQERPQYRGLEEVTKEEDTYLRIFGKPFTRVNRRMGV VLCYAPLDADLDALRDKAKRIAEKVEVY >gi|222159265|gb|ACAB01000094.1| GENE 138 156798 - 158315 1797 505 aa, chain + ## HITS:1 COG:SPAC222.12c KEGG:ns NR:ns ## COG: SPAC222.12c COG0055 # Protein_GI_number: 19114063 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: Schizosaccharomyces pombe # 6 502 55 522 525 605 64.0 1e-173 MSQIIGHISQVIGPVVDVYFEGTDAELVLPSIHDALEIKRPNGKILVVEVQQHIGENTVR TVAMDSTDGLQRGMKVYPTGGPITMPIGEQIKGRLMNVVGDSIDGMKGLNRDGAYSIHRD PPKFEDLTTVQEVLFTGIKVIDLLEPYAKGGKIGLFGGAGVGKTVLIQELINNIAKKHNG FSVFAGVGERTREGNDLLREMIESGVIRYGEAFKESMEKGHWDLSKVDYNELEKSQVSLI FGQMNEPPGARASVALSGLTVAESFRDAGKEGEKRDILFFIDNIFRFTQAGSEVSALLGR MPSAVGYQPTLATEMGAMQERITSTRKGSITSVQAVYVPADDLTDPAPATTFSHLDATTV LDRKITELGIYPAVDPLASTSRILDPHIVGQEHYDIAQRVKQILQRNKELQDIISILGME ELSEEDKMVVNRARRVQRFLSQPFAVAEQFTGVPGVMVGIEDTIKGFKMILDGEVDYLPE QAFLNVGTIEEAIEKGKKLLEQAKK >gi|222159265|gb|ACAB01000094.1| GENE 139 158328 - 158573 269 81 aa, chain + ## HITS:1 COG:HI0478 KEGG:ns NR:ns ## COG: HI0478 COG0355 # Protein_GI_number: 16272425 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) # Organism: Haemophilus influenzae # 1 77 1 77 142 66 37.0 1e-11 MKELHLSIVSPEKSIFDGDVKIVTLPGMIGSFSILPGHAPIVSSLKAGTLSYTTMEGEEH TMDIQGGFVEMSDGTVSACVS >gi|222159265|gb|ACAB01000094.1| GENE 140 158657 - 159088 235 143 aa, chain + ## HITS:1 COG:no KEGG:BT_0713 NR:ns ## KEGG: BT_0713 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 143 2 144 144 171 
84.0 8e-42 MNISKTKRSFICWHTLFAVISAALGAVILHFALPGHYFGGYPFIPVYFYFFGLASIYMFD ACRRHAPQRLLLLYLAMKMIKMILSLILVLIYCLAVREEARAFLLTFISFYLIYLIFETW FFFSFEMNQKRKKKNKKKHETVA >gi|222159265|gb|ACAB01000094.1| GENE 141 159072 - 160211 1187 379 aa, chain + ## HITS:1 COG:BMEI1546 KEGG:ns NR:ns ## COG: BMEI1546 COG0356 # Protein_GI_number: 17987829 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit a # Organism: Brucella melitensis # 195 366 101 269 277 90 35.0 4e-18 MKQLHNIVAPLILFCLMAAVNLPVLAQEQEGETVAQQTQDEITPKEEQENTVDVKEIVFG HIGDSYEWHITTWGNTHITIPLPIIVYSSTSGWHTFLSSRLEENGGTYEGLSIAPAGSKY EGKLVEYDAAGEQVRPWDISITKVTFALLFNSVLLLVIVLCVAHWYRKRPQGAKAPGGFI GFMEMFIMMVNDDIIKSCVGPNYRKFAPYLLTAFFFIFINNMMGLIPFFPGGANVTGNIA ITMVLAVCTFLAVNIFGTKHYWKDIFWPDVPWWLKVPVPMMPFIEFFGIFTKPFALMIRL FANMLAGHMAMLVLTCLIFISASMGPALNGTLTVASVLFNIFMNALELLVAFIQAYVFTM LSAVFIGLAQEGAKVKTEE >gi|222159265|gb|ACAB01000094.1| GENE 142 160136 - 160306 60 56 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVYIFSCFYYLIPHSARLFRIYSVTFSPRFIDYSSVFTLAPSCASPINTADNIVNT >gi|222159265|gb|ACAB01000094.1| GENE 143 160323 - 160580 461 85 aa, chain + ## HITS:1 COG:no KEGG:BT_0715 NR:ns ## KEGG: BT_0715 # Name: not_defined # Def: ATP synthase C subunit # Organism: B.thetaiotaomicron # Pathway: Oxidative phosphorylation [PATH:bth00190]; Metabolic pathways [PATH:bth01100] # 1 85 1 85 85 74 100.0 1e-12 MLLSVLLQATAAAVGVSKLGAAIGAGLAVIGAGLGIGKIGGSAMEAIARQPEASGDIRMN MIIAAALIEGVALLAVVVCLLVFFL >gi|222159265|gb|ACAB01000094.1| GENE 144 160702 - 161205 654 167 aa, chain + ## HITS:1 COG:VC2768 KEGG:ns NR:ns ## COG: VC2768 COG0711 # Protein_GI_number: 15642761 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit b # Organism: Vibrio cholerae # 16 157 12 152 156 63 30.0 2e-10 MSLLLPDSGLLFWMFLSFGIVFVILAKYGFPVIIKMVEGRKTYIDQSLEVAREANAQLSK LKQEGDALVAAANKEQGRILREAMEERDKIVHEARKQAEIAAQKELDAVKQQIQMEKDEA IRDIRRQVAVLSVDIAEKVLRKSLEDKDAQMGMIDRMLDEVLTPNKN >gi|222159265|gb|ACAB01000094.1| GENE 145 161211 
- 161771 471 186 aa, chain + ## HITS:1 COG:sll1325 KEGG:ns NR:ns ## COG: sll1325 COG0712 # Protein_GI_number: 16329328 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) # Organism: Synechocystis # 10 174 14 177 185 64 27.0 1e-10 MEVGILSMRYAKAIIEYAQEKGLEDRLYQEFLTLSHSFCEQPGLREALDNPVITTKEKLA LVCTAADGDGKSTREFVRFITLVLRNRREGYLQFISLMYLDLYRKLKHIGTGKLITAVPV DKETEDRIRSAAAHILHAQMELETVIDPSIEGGFIFDINDYRLDASVATQLKRVKQQFID KNRRIV >gi|222159265|gb|ACAB01000094.1| GENE 146 161771 - 163354 1926 527 aa, chain + ## HITS:1 COG:TM1612 KEGG:ns NR:ns ## COG: TM1612 COG0056 # Protein_GI_number: 15644360 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: Thermotoga maritima # 5 517 3 496 503 574 58.0 1e-163 MSENIRVSEVSDILRKQLEGINANVQLDEIGTVLQVSDGVVRIYGLRNAEANELLEFDNG IKAIVMNLEEDNVGAVLLGPTDKIKEGFVVKRTKRIASIRVGEGMLGRVIDPLGEPLDGK GLIGGELYDMPLERKAPGVIYRQPVNQPLQTGLKAVDAMIPIGRGQRELIIGDRQTGKTA IAIDTIINQRTNFLAGDPVYCIYVAIGQKGSTVASIVNTLREYGALDYTVVVAATAGDPA ALQYYAPFAGAAIGEYFRDTGRHALVVYDDLSKQAVAYREVSLILRRPSGREAYPGDIFY LHSRLLERAAKIISQEEVAREMNDLPESLKGIVKGGGSLTALPIIETQAGDVSAYIPTNV ISITDGQIFLETDLFNQGTRPAINVGISVSRVGGNAQIKAMKKVAGTLKIDQAQYRELEA FSKFSSDMDPITALTIDKGRKNGQLLIQPQYSPMPVEQQIAILYCGTHGLLHDVPLDKVQ DFERSFIESLQLNHQEDVLDILKTGVIDDNVIKAIEETAAMVAKQYL >gi|222159265|gb|ACAB01000094.1| GENE 147 163467 - 164363 875 298 aa, chain + ## HITS:1 COG:BH3755 KEGG:ns NR:ns ## COG: BH3755 COG0224 # Protein_GI_number: 15616317 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, gamma subunit # Organism: Bacillus halodurans # 1 295 1 280 285 175 34.0 1e-43 MASLKEVKTRINSVKSTRKITSAMKMVASAKLHKAQGAIENMLPYQKKLNKILTNFLSAD LPIESPYVQEREVKRVAIVVFSSNTSLCGAFNANVIKMMMQTLGEFRTLGQDNILIFPIG KKVDEAVKRMGFKPQETSPTLSDKPTYQEAAELAHRLMDMYVAGEVDRVEIIYHHFKSMG VQILLRETYLPIDMTNVVSEEDSMNKEEVEEHEIANDYIIEPNAEELIASLIPTVLSQKI 
FTAAVDSNASEHAARTLAMQVATDNANELIQDLTKQYNKSRQQAITNELLDIVGGTMK >gi|222159265|gb|ACAB01000094.1| GENE 148 164517 - 166460 1701 647 aa, chain + ## HITS:1 COG:no KEGG:BT_0720 NR:ns ## KEGG: BT_0720 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 645 16 650 655 757 62.0 0 MRIKHLLNFFVATVFIGVAVGCAEKDLYDPNYGKEPVKGPEEYFGFETCGDVSLDVNYAL PGFAALIEVYDIDPMEIVDNTPVKKQGVEALFKIYTDESGKFNGKMNIPTSVKSIYLYTE SWGLPRCMPLEIKDGMVSFDMSNIGASTVKAKNNTRSYGFQGTVPYVLNGNNKLYSLCKW GEGGSLDSKYMSFEENVGDETVGSLTQRLKNFFNPDGIANVDNMNLVTQSRTTNITVTQD GTALDVIFLNRDAMYNNSFGYYYYKTSKEPDMRGMSDMKKYIIFPNVSFSVYGGQLPILK CGSKVRLLYFDEQGNAKEEFPAGYTVGWFMYADGYNQSENEIDITKQVASGFSNLLASNQ VLGQQRQNFVSVKDETSGKVIIGVEDGANNSYCDLLFYVNASKTIEEPNDRPVITPDDGN EPEKPDVTETKTGTLAFEDIWPGGGDYDMNDVIVEYNRAVSFDKKNQVTKIVDTFTPVHD GAVFANAFAYQVDKGQFGKMTFSATTEGIHTESATSSIIVCPNVKQAIQKVYTITREFTG GSFNKKDLKSYNPYIIVKYAEGQKGRTEVHLPKHEATSLADLSLAGTQKDAYYIDKEGAY PFAIDIPILNFISVTEKKSIDTEYPNFKAWADSKGEKYTDWYNNHVN >gi|222159265|gb|ACAB01000094.1| GENE 149 166646 - 169252 2008 868 aa, chain + ## HITS:1 COG:SPBC887.14c KEGG:ns NR:ns ## COG: SPBC887.14c COG0507 # Protein_GI_number: 19113280 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Schizosaccharomyces pombe # 21 406 328 758 805 154 29.0 8e-37 MENNPELQLAWQFIENTGTHLFLTGKAGTGKTTFLRRLKEHTPKRMVVLAPTGIAAINAG GVTIHSFFQLSFAPFVPETTFNSSQTHYRYSKEKRNIIRSMDLLVIDEISMVRADLLDAV DATLRRYRDREKPFGGVQLLMIGDLQQLAPVVKDNEWELLRKHYETPYFFASHALKETVY MTIELKKVYRQSDTFFLSLLNKIRENKADDEVLNELNRRYQPGFQPQKEEGYIRLTTHNY QAQKVNDRELASLSGKAYHFRAEIEGDFPEYSYPADELLTIKEGAQIMFLKNDSSSEKRY YNGMIGEVVTVNDAGIIVRGKGNKSEFQLLPEEWGNYKYVLNEETKEITEVIEGTFRQYP IRLAWAITIHKSQGLTFERAIIDARNSFAHGQTYVALSRCKTLEGMVLESPLRREAIISD ATVDNFTKAVEQNKPGSQQLNDMQKAYFFDLLSDLFNFYSIDQAYKRLLRLIDEDLYKLF PKQLAEYKALEPRIKEKIVEVSQRFRNQYTRLIHESEDYATNQELQERIRSGAGYFHKEL APIRALYNKTNMPLDNKELRKLLAERMQALDDALWIKESLLEAVSARERFAITDYLKLKA 
KVMLSLEDDSSSSGSSKALKEKKERKERKERTRSGAEKVKVEVPTDILHPGLYRALAEWR TAKTREVNLPAYVIMQQKALMGIVNLLPDNPRALEAIPYFGAKGVEKYGLEILGIVRKYM AENQLERPEIMDMLISDNREAASRREELKQRKEEEKQKKEAEKQKKEAEKEKKKDTKLVS YEMFCQGMSIDEIAKARELVSGTIAGHLEYYVRLGKIKVEKVVKAENLAKIRKHLEEHEY MGIFAIKAALGDDVSYADIKFVLAVSGH >gi|222159265|gb|ACAB01000094.1| GENE 150 169348 - 169968 531 206 aa, chain - ## HITS:1 COG:FN1083 KEGG:ns NR:ns ## COG: FN1083 COG2431 # Protein_GI_number: 19704418 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 5 206 2 195 198 94 34.0 1e-19 MKGSLIVILFFCVGCVMGAFNKFQFDTHTVSMYILYALMLQVGISIGSNKNLKAIVSHLH PKMLLIPLGTIIGTLLFSALASLLLRQWSVFDCMAVGSGFAYYSLSSILITQFKEPSIGL QLATELGTIALLTNIFREMMALLGTPIIKKYFGKLAPISAAGVNSMDVLLPSISRYSGKE MIPIAILHGILIDISVPVFVSFFCNL >gi|222159265|gb|ACAB01000094.1| GENE 151 170034 - 170309 384 91 aa, chain - ## HITS:1 COG:no KEGG:BT_0723 NR:ns ## KEGG: BT_0723 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 91 1 91 92 115 94.0 6e-25 MFSIISTMFLGIGIGYVLRNWSILQKTEKTISLTIFLLLFILGVSIGSNSLIVNNLGKFG WQAIVLAVSGVLGSLIAARLVLQLFFRKGGE >gi|222159265|gb|ACAB01000094.1| GENE 152 170878 - 171810 558 310 aa, chain - ## HITS:1 COG:TVN0900 KEGG:ns NR:ns ## COG: TVN0900 COG1091 # Protein_GI_number: 13541731 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Thermoplasma volcanium # 1 308 1 271 280 136 31.0 5e-32 MKKILIIGANGFTGRQILNDLSVHTQYKVTGCSLHPDILPNDAGKYRFIETDIRNEADIK RLFEEVQPDVVINCSALSVPDYCETHHEEAYLTNVTAVSQLAVFCEEYKSRFIHLSTDFV FDGKMFVFDEKINEDAGLLYTEEDVPAPVNYYGYTKWKGEEKVAETCSSFAIIRVAIVYG RALPGQHGNIVQLVMNRLKAGQEIRVVSDQWRTPTYVGDVSDGVQRLIAHPTNGIFHICG DECMSIAEIAYQVADYMGLDRSLIHPVTTEEMNETTPRPRFSGMSIDKARTMLGYEPQKL KEVLANWEHL >gi|222159265|gb|ACAB01000094.1| GENE 153 172016 - 173185 1074 389 aa, chain + ## HITS:1 COG:STM3519 KEGG:ns NR:ns ## COG: STM3519 COG1690 # Protein_GI_number: 16766807 # Func_class: S Function unknown # 
Function: Uncharacterized conserved protein # Organism: Salmonella typhimurium LT2 # 11 389 13 405 405 266 40.0 5e-71 MEKVIMGTRLPAKLWLNEVEDSCMSQIMDLTSLPFAFKHIAIMPDAHTGKGMPIGGVLAT KGVIVPNAVGVDIGCGMCAIKTNRKAEDFSYTDLTSIMSKIRAAIPLGFDHHTKKQDQEL LPQGFDLEEMPILKNQYEACLKQIGILGGGNHFIEIQKDTETSDVWVMIHSGSRNIGLKV ANHYNKIAQYWNEKWYSEMVSGLAYLPMETQMAKDYFREMNYCVAFAFANRQLMMTRICE AIQAVKPETDFEPMINIAHNYAAWENHFDQDVIVHRKGATRAYEGEIGIIPGSMGTKSYI VEGLGNPESFKSCSHGAGRLMGRKDACRRLSLDEEKERMNQQGIIHGLRSQDELDEAPGA YKDIAQVIANERDLVKPLVELAPMAVIKG >gi|222159265|gb|ACAB01000094.1| GENE 154 173470 - 174744 878 424 aa, chain - ## HITS:1 COG:no KEGG:BT_0727 NR:ns ## KEGG: BT_0727 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 424 1 393 393 550 67.0 1e-155 METMQQYYRLMKSILFVLFLSFGMISCEKEDPVSTSPNTRQTDNTDPTDPDPTDPVVRSD NEQTVFMYLPWSTDLTSFFYQNIADLKSIIGQNILKNERVLVFICTTATKATLYELSYEK GAAVQKALKSYNYPTPSYTTAEGITSILNDIQTYSPAKRYAMIIGCHGMGWIPVSKTQSR SSLQTVKKHWEYGNAPMTRLFGGRESKYQTDITTLAEGISSAGLKMEYILFDDCYMSTVE VAYDLKNVTSHLIASTSEIMAYGMPYDKIGQYLIGNIDYEKICDGFYSFYSNYVTPCGTI SVTDCSEVDNLAAIMKEINQRYTFNEELISSLQSLDGYKPSIFFDCGDYVAKLCSDPDLL EQFNEQLKRTVPYKKNTEYYFTAISSYYGERKKINTFSGITISDPSTSAAALKKNETAWY VATH >gi|222159265|gb|ACAB01000094.1| GENE 155 175166 - 176065 687 299 aa, chain - ## HITS:1 COG:BS_ytdP KEGG:ns NR:ns ## COG: BS_ytdP COG2207 # Protein_GI_number: 16080067 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus subtilis # 189 296 666 766 772 71 36.0 2e-12 MGDKVLKIGTVHQCNCCLGSKTLHPLVSVIDLSKADLSPHTDIKFDFYTILLSECKCEAY VYGHQYYDFSDGTLICLAPGESISMKEKNKRFPSKGWILAFHPDLICGTPLGLNIDNYTF FSYQPEEALHISLREKQIILEFMDRINQELERCIDRHSKKIVSKYIELLLDYCVRFYERQ FITRNEVNKKIIKQFDKIINNHFETKPVPAVDVLSNEYCANVLHLSPEYFNDLLKHETGK SFKEYIEFKRFEIAKYWLLNTDKTVNQITQELGFQNPQYFSRLFKKVTGCSPNDFRVPN >gi|222159265|gb|ACAB01000094.1| GENE 156 176096 - 176998 680 300 aa, chain - ## HITS:1 COG:no KEGG:BT_0729 NR:ns ## KEGG: BT_0729 # Name: not_defined # Def: 
transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 300 1 300 300 530 89.0 1e-149 MDEITKIETVDQYDQLFGLETLHPLVNVIDFSKATRSVEYIRMNIGFYCLFLKDAKCGDL TYGRKNYDYQEGTVVCMAPGQVSGIDNRNRPAPRTKSIGVLFHPDLIRGTSLGQHIKNYT FFSYEVNEALHLSDQEREIVTDCIHKIRIELEHPIDKHSKQLIVRNIELLLDYCMRFYER QFITRNQANKDIIVKFEQLLDEYFQNQVAITEGLPSVKYFADKACLSPNYFGDLIKKETG KTAQEYIQCRIIELAKERILEGVQTVSQVAYELGFQYPQHFSRLFKKHVGYTPNEYKQRN >gi|222159265|gb|ACAB01000094.1| GENE 157 177204 - 178220 787 338 aa, chain + ## HITS:1 COG:alr4831 KEGG:ns NR:ns ## COG: alr4831 COG0451 # Protein_GI_number: 17232323 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Nostoc sp. PCC 7120 # 1 260 1 240 311 105 28.0 1e-22 MKALFIGGTGTISTDVVALAQQKGWEITLLNRGSKKMPEGIHSIIADINDEEAVAKAIVS EHYDVVAQFIGYTAEDVKRDIRLFQNKTRQYIFISSASAYQKPLADYHITESTPLVNPYW QYSRNKIEAEEVLMAAYRTNGFPVTIVRPSHTYNGTKPPVSVHGDKGNWQILKRILEGKP VIIPGDGSSLWTLTHSKDFAKGYVGLMANPHAIGNAFHITTDESMTWNQIYQTIADALGK PLNALHVASDFLAKHSDHYDFRGELLGDKAVTVVFDNSKIKRLVPDFICNTSMADGLRQA VHYMLSHPESQIPDPEFDSWCDRIANAISAADKAFELS >gi|222159265|gb|ACAB01000094.1| GENE 158 178326 - 178877 362 183 aa, chain - ## HITS:1 COG:no KEGG:BT_0731 NR:ns ## KEGG: BT_0731 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 183 14 178 178 271 82.0 7e-72 MKKLILFLSLILSVGFASAQSEIPSDSIRRAPSTNIKEFGGFLLDMGLMNVATPELPKFN LEMPNMTKDYNQLFRLNTDVTYSQGFTDSFSSSSFSSFSGFSGFGYGWGLSSSPQFMQMG SFKLKNGMRINTYGDYDKDGWRVPNRSAMPWERNNFRGAFELKSANGNFGIRIEVQQGRN APY >gi|222159265|gb|ACAB01000094.1| GENE 159 179049 - 180692 1825 547 aa, chain - ## HITS:1 COG:no KEGG:BT_0735 NR:ns ## KEGG: BT_0735 # Name: not_defined # Def: aspartate aminotransferase (EC:2.6.1.1) # Organism: B.thetaiotaomicron # Pathway: Alanine, aspartate and glutamate metabolism [PATH:bth00250]; Cysteine and methionine metabolism [PATH:bth00270]; Metabolic pathways [PATH:bth01100] # 1 540 16 
555 557 1040 92.0 0 MEKKTNNPAITKSYAKKMETISPFELKNKLIDMADESIKKIAHTMLNAGRGNPNWIATEP REAFFLLGQFGLCECRHAFSLEEGIAGIPQKAGIAARFEAFLKENEKAPGANLLKEGYNY MLMEHAADPDTLIHEWAESVIGDQYPVPDRILHFTELIVQDYLAQEMCDRRPPKGTFDLF ATEGGTAAMCYLFDSLQENFLLNQGDAIALMIPVFTPYIEIPELRRYQFDVTEISADQMT PDGLHTWQYKDEDIDKLKDPRIKALFITNPSNPPSYALSRETTERIINIVKNDNPNLMII TDDVYGTFIPHFRSLMAELPHNTLCVYSFSKYFGATGWRTAVIALHEDNIYDKMIARLSE EQKSILNKRYSSLSLQPEKMKFIDRMVADSRQIALNHTAGLSLPQQMQMSLFAIFSLLDK EDSYKAKMQEIIHRRLHALWDNTGFTLVEDPLRAGYYSEIDMLVWAKKFYGDDFVAYLQK TYNPLDVVFRLANETSLVLLNGGGFAGPKWSVRVSLANLNEADYVKIGQSIKCVLEEYAQ TWKASKE >gi|222159265|gb|ACAB01000094.1| GENE 160 180715 - 182409 1964 564 aa, chain - ## HITS:1 COG:STM0870 KEGG:ns NR:ns ## COG: STM0870 COG2985 # Protein_GI_number: 16764232 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Salmonella typhimurium LT2 # 14 563 15 556 561 251 30.0 3e-66 MEWIINQLRVHPELAIFLTLFAGFWLGRLKIGKFSLGTVTSVLLVGVLVGQLNITVDGPM KAVFFLLFLFAVGYKVGPQFFRGLKKDGLPQVGFAVLMCIVSLVAPWILAKIMGYHVGEA AGLLAGSQTISAVIGVASDTINQLGISDAQKATFINAIPVAYAVTYIFGTAGSAWILASL GPKMLGGLDKVKADCKELEAQMGTSEADEPGFSPALRPVVFRAYKITNEWFGKGKKVSEL ETYLCKNDKRLFVERIRQRGVVKDVDPNLILRKNDEVVLSGRREFVIGEEDWIGPEVIDA QLLDFPAETLPVMVTHRTFAGETVAKIRAQKFMHGVSIRNIKRAGINVPVLSKTVVDSGD ILELTGLKHEVEGAAKQMGYIDRPTNQTDMIFVGLGILLGGLFGALAIHLGGVPISLSTS GGALIAGLLFGWLRSKHPTFGGIPEPSLWVLNNVGLNMFIAVVGIAAGPSFIAGFKEVGV SLFIVGALATAIPLLAGLLMARYLFKFHPALSLGCTAGARTTTAALGAIQDAVESDTPAL GYTVTYAVGNTLLIIWGVVIVLLM >gi|222159265|gb|ACAB01000094.1| GENE 161 182579 - 184246 1833 555 aa, chain + ## HITS:1 COG:SP1229 KEGG:ns NR:ns ## COG: SP1229 COG2759 # Protein_GI_number: 15901091 # Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate synthetase # Organism: Streptococcus pneumoniae TIGR4 # 1 554 1 555 556 560 53.0 1e-159 MKSDIEIARSIELKKIKQVAESVGIPREEVENYGRYIAKIPEQLIDEEKVKKSNLILVTA ITATKAGIGKTTVSIGLALGLNKIGKNAIVALREPSLGPCFGMKGGAAGGGYAQVLPMDK INLHFTGDFHAITSAHNMISALLDNYLYQNQAKGFGLKEILWRRVLDVNDRSLRSIVVGL 
GPKSNGITQESGFDITPASEIMAILCLSKDVSDLRRRIENILLGFTYDDQPFTVKDLGVA GAITVLLKDAIHPNLVQTTEGTAAFVHGGPFANIAHGCNSILATKLAMSFGDYVITEAGF GADLGAEKFYNIKCRKSGLQPRLTVIVATAQGLKMHGGVSLDRIKEPNMEGLKEGLRNLD KHVRNLRSFGQTVIVAFNKFASDTDEEMELLREHCEQLGVGFAINNAFSEGGEGAVDMAR LVVDTIENNPSEPLRYTYKEEDNIQQKIEKVATNIYGASVITYSSIARNRIKLIEKMGIT HYPVCIAKTQYSFSADPKIYGAVNNFEFHIKDIVINNGAEMIVAIAGEILRMPGLPKEPQ ALHIDIVDGEIEGLS >gi|222159265|gb|ACAB01000094.1| GENE 162 184052 - 184441 93 129 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSYITHNKYEFNRTVCMKRKEKRTGLYIQSSPSFHIHFIGKDGDLFIFTDVAPEQVFFKC IFKCISAQSFYFSIHNIDMQCLRFFRQSRHTKNFSGNGYNHFSSVIDYNIFNMELEVVHR AINLRIGRE >gi|222159265|gb|ACAB01000094.1| GENE 163 184600 - 185880 1753 426 aa, chain - ## HITS:1 COG:aq_479 KEGG:ns NR:ns ## COG: aq_479 COG0112 # Protein_GI_number: 15605959 # Func_class: E Amino acid transport and metabolism # Function: Glycine/serine hydroxymethyltransferase # Organism: Aquifex aeolicus # 1 424 5 410 428 482 58.0 1e-136 MKRDDIIFDIIEKEHQRQLKGIELIASENFVSDQVMQAMGSCLTNKYAEGYPGKRYYGGC EVVDQSEQIAIDRLKEIFGAEWANVQPHSGAQANAAVFLAVLNPGDKFMGLNLAHGGHLS HGSLVNTSGIIYTPCEYNLNQETGRVDYDQMEEVALREKPKMIIGGGSAYSREWDYKRMR EIADKVGAILMIDMAHPAGLIAAGVLENPVKYAHIVTSTTHKTLRGPRGGVIMMGKDFPN PWGKKTPKGEIKMMSQLLDSAVFPGIQGGPLEHVIAAKAVAFGEILQPEFKEYAKQVQKN AAVLAQALIDRGFTIVSGGTDNHSMLVDLRSKYPDLTGKVAEKALVSADITVNKNMVPFD SRSAFQTSGIRLGTPAITTRGAKEDLMLEIAEMIETVLSNVENEEVIAQVRARVNETMKK YPLFAY >gi|222159265|gb|ACAB01000094.1| GENE 164 186015 - 186755 653 246 aa, chain - ## HITS:1 COG:no KEGG:BT_0739 NR:ns ## KEGG: BT_0739 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 246 1 245 245 412 85.0 1e-114 MDMLRKFIFLGLLVAAGSPVVGQNYNADRVDRPNTKKINIGIKAGFNSSMFMVSELKIKD VTIDEVQNNYKIGYFGALFMRINMKKHFIQPEVSYNVSKCEITFDKLGSQHPAIEPDYAS VQSVLHSVDFPVLYGYNVVKKGPYGMSIFAGPKLRYLWGKHNEITFKNFDQKGIHEKLYP FNVSAVIGVGVNISRIFFDFRYEQGIGNISKSIIYDNINSDGSTGVSNIIFRRRDSALSF SLGFIL >gi|222159265|gb|ACAB01000094.1| GENE 165 186772 - 187380 548 202 aa, chain - ## HITS:1 
COG:FN1468 KEGG:ns NR:ns ## COG: FN1468 COG1853 # Protein_GI_number: 19704800 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Fusobacterium nucleatum # 32 197 20 185 197 160 49.0 2e-39 MLICVICGKKMTTMKQDWKPGTMIYPLPAVLVSCGKEESEYNMFTVAWTGTICTNPPMCY ISVRPERHSYDIIKKNMEFVINLTTKDMAFATDWCGVRSGRDYHKFEEMKLTPGQCTVVS APLIEESPLCIECRVKEIISLGSHDMFIADVVNVRADERNLNPKTGKLELAEANPLVYVH GGYYNLGEKIGKFGWSVEKKKS >gi|222159265|gb|ACAB01000094.1| GENE 166 187450 - 187911 535 153 aa, chain - ## HITS:1 COG:PAB1499 KEGG:ns NR:ns ## COG: PAB1499 COG1781 # Protein_GI_number: 14521525 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, regulatory subunit # Organism: Pyrococcus abyssi # 8 150 4 148 152 140 47.0 7e-34 MSENKQALQVAALKNGTVIDHIPSEKLFTVVQLLGVEQMTSNITIGFNLDSKKLGKKGII KIADKFFCDEEINRISVVAPHVKLNIIRDYEVVEKKEVKMPDELRGIVKCANPKCITNNE PMSTIFHVIDKDNCIVKCHYCEKEQKREEITIL >gi|222159265|gb|ACAB01000094.1| GENE 167 187908 - 188849 926 313 aa, chain - ## HITS:1 COG:VC2510 KEGG:ns NR:ns ## COG: VC2510 COG0540 # Protein_GI_number: 15642506 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, catalytic chain # Organism: Vibrio cholerae # 4 304 29 330 330 312 53.0 5e-85 MENRSLVTIAEHSKEKILYMLEMAKQFEMNPNRRLLQGKVVATLFFEPSTRTRLSFETAA NRLGARVIGFSDPKATSSSKGETLKDTIMMVSNYADIIVMRHYLEGAARYASEVAPVPIV NAGDGANQHPSQTMLDLYSIYKTQGTLENLNIFLVGDLKYGRTVHSLLMAMRHFNPTFHF IAPEELKMPEEYKLYCKTHQIKYVEHTDFSEEIIADADILYMTRVQRERFTDLMEYERVK NVYILRNKMLENTRPNLRILHPLPRVNEIAYDVDDNPKAYYFQQAQNGLYAREAILCDVL GITLDDVKNDILL >gi|222159265|gb|ACAB01000094.1| GENE 168 188962 - 189744 563 260 aa, chain - ## HITS:1 COG:no KEGG:BT_0592 NR:ns ## KEGG: BT_0592 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 255 8 259 261 243 51.0 6e-63 MNMINKGIPYFPTPANFFDEEVMELLEAKFGVLASYMVMRLLCKIYKEGYYISWGKEQNL 
IFVRKVGGGIKEDMMEKIVDLLLEKGFFHKETYEKHGILTSEQIQRVWFEATTRRKIDFS QLPYLLETKQRKRIHKDELNKENANIFPTQEEVSSENADISRQTKLKETKLNTEEEEEIS DASFEIPGYAYNQATHNMNGLIESLERHKVTNLKERQTILRLSDYGRKGTQVWKLLSNTA WSKIGAPGKYIIAALASGRK >gi|222159265|gb|ACAB01000094.1| GENE 169 189768 - 190112 387 114 aa, chain - ## HITS:1 COG:no KEGG:BT_0593 NR:ns ## KEGG: BT_0593 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 114 23 121 121 96 45.0 3e-19 MRQKIKKGKLEACKLVWKKRITAEKGISDKCADRIVQECIKLIEHMLYGNAMIAFHKQDG TFCLEKGTLVGYEKFFHREFNITAQQESIIYWSEEQKGWRRFMIGNLMEWKAIV >gi|222159265|gb|ACAB01000094.1| GENE 170 190341 - 190751 403 136 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237714657|ref|ZP_04545138.1| ## NR: gi|237714657|ref|ZP_04545138.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 136 1 136 136 156 100.0 5e-37 MRSEKSKKGSKNNKAKRNVQTVEKAPGRQSKEEIISEEELENRIAISGDIRLYLTMHLRI FIDGYFHHPKKKKLINLAQYIYDQKVLYIHKHGGYKLMELSSIHAELAALKKSVEEEYMK EKKEKREQAEKLKSKY >gi|222159265|gb|ACAB01000094.1| GENE 171 190880 - 191833 535 317 aa, chain + ## HITS:1 COG:no KEGG:BT_0595 NR:ns ## KEGG: BT_0595 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 316 1 316 318 487 77.0 1e-136 MQMMNKNGFSRCGENYINRLRKEGRYSTAHVYKNALYSFGRYCGTLNVSFKQVTKERLRR YGQYLYECGLKPNTISTYMRMLRSIYNRGVEAGSAPYVPRLFHDVYTGVDVRQKKALSAG ELHKLLYEDPKSERLRRTQIIAALMFQFCGMSFADLAHLEKSALDQSVLRYNRIKTKTPM SVEVLDTARGMINQLWSNQEPIPDCPDYLFDILCSNKKRKDERAYREYQSALRNFNNRLK DLARVLRLKSPVSSYTLRHSWATTAKYRGVPIEMISESLGHKSIKTTQIYLKGFELKERT EVNKGNLSYIRNYRLGG >gi|222159265|gb|ACAB01000094.1| GENE 172 192187 - 192753 503 188 aa, chain + ## HITS:1 COG:no KEGG:BT_0596 NR:ns ## KEGG: BT_0596 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 186 1 187 192 329 85.0 4e-89 MILTKETLSAGPNIGTGEGVAHSKRWYVALVRMHHEKKVAERLDKMGIENFVPVQQEVHQ WSDRRKVVESVLLPMMVFVHADPKERKEVLSFSTVSRYMVMRGESSPTIIPDEQMARFRF 
MLDYSEEAICMNSAPLARGEKVCVIKGPLTGLVGELVTVDGRSKIAVRLNMLGCACADMP VGYVEPFK >gi|222159265|gb|ACAB01000094.1| GENE 173 192826 - 193713 810 295 aa, chain + ## HITS:1 COG:NMB0062 KEGG:ns NR:ns ## COG: NMB0062 COG1209 # Protein_GI_number: 15675999 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Neisseria meningitidis MC58 # 1 291 1 288 288 410 64.0 1e-114 MKGIVLAGGSGTRLYPITKGVSKQLLPVFDKPMIYYPISVLMLAGIREILIISTPYDLPG FKRLLGDGSDYGVRFEYAEQPSPDGLAQAFIIGEDFIGNDSVCLVLGDNIFYGQSFTRML QEAVRTVEEEQKATVFGYWVADPERYGVADFDKDGNVLSIDEKPENPKSNYAVVGLYFYP NKVVDVAKHIQPSPRGELEITTVNQEFLNDHQLKVQLLGRGFAWLDTGTHDSLSEASTFI EVIEKRQGLKVACLEGIALRHGWITADKMRELAKPMLKNQYGQYLLKVINELGLE >gi|222159265|gb|ACAB01000094.1| GENE 174 193763 - 194332 468 189 aa, chain + ## HITS:1 COG:MA3780 KEGG:ns NR:ns ## COG: MA3780 COG1898 # Protein_GI_number: 20092576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 1 178 1 179 183 204 55.0 8e-53 MEIIKTAIEGVVIIEPRLFKDDRGYFFESFSQREFTEKVRKVDFVQDNESKSSYGVLRGL HFQKPPYAQSKLVRVIKGSVLDVAVDIRKGSPTFGEHVAVELTEENHRQFFIPRGFAHGF VVLTEEVIFQYKCDNFYAPQCEGALAWDDPALKIDWKVPADKIILSEKDKHHERLEEASW LFDYNENLY >gi|222159265|gb|ACAB01000094.1| GENE 175 194338 - 195204 723 288 aa, chain + ## HITS:1 COG:CAC2315 KEGG:ns NR:ns ## COG: CAC2315 COG1091 # Protein_GI_number: 15895582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Clostridium acetobutylicum # 1 283 1 279 280 231 44.0 9e-61 MNILVTGANGQLGNEMRRVSSTSSNHYIFTDVAELDITSRDSIRKMVNDNQIHVIVNCAA YTNVDKAEDDFATADLLNNKAVENLAIVAKEADATLIHVSTDYVFQGDRNVPCREDWETN PLGVYGKTKLAGEHSIQGTGCRYLIFRTAWLYSPYGKNFVKTMRQLTSDKDTLKVVFDQV GTPTYAGDLASVIYQVIEENQLYKEGIYHFSNEGVCSWYDFAKEICDLSGNVCDIQPCHS DEFPSKVKRPHFSVLDKTKVKSTFGITVPYWKDSLQKCINELKQQLDY >gi|222159265|gb|ACAB01000094.1| GENE 176 195212 - 196285 769 357 aa, chain + ## HITS:1 
COG:ECs4721 KEGG:ns NR:ns ## COG: ECs4721 COG1088 # Protein_GI_number: 15833975 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Escherichia coli O157:H7 # 5 347 2 347 355 388 55.0 1e-108 MNFARNILITGGAGFIGSHVVRLFVNKYPEYHIVNLDKLTYAGNLANLKDVEDQPNYTFV KADICDFEKMLEIFKQYHIDGVIHLAAESHVDRSIKDPLTFAQTNVMGTLSLLQAAKLTW EILPECYEDKRFYHISTDEVYGALEFDGTFFTEETKYQPHSPYSASKAGSDHFVRAFHDT YGMPTIVTNCSNNYGPYQFPEKLIPLFINNIRQGKPLPVYGKGENVRDWLYVVDHARAID LIFHNGNTADTYNIGGFNEWTNIDLIKVIIKTVDRLLGNSEGTSDHLITYVTDRKGHDLR YAIDSNKLKNELGWEPSLQFEEGIEKTVRWYLDNQNWMDNVTTGDYQKDNDRDDKSL >gi|222159265|gb|ACAB01000094.1| GENE 177 196445 - 196615 127 56 aa, chain + ## HITS:1 COG:no KEGG:BT_0467 NR:ns ## KEGG: BT_0467 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 54 5 58 489 70 68.0 2e-11 MAISLYTSRVILEVLGINDFGIYNVVGGFVGMFALISATLTSSTQRFITYELGKKE >gi|222159265|gb|ACAB01000094.1| GENE 178 196845 - 197784 161 313 aa, chain + ## HITS:1 COG:no KEGG:BT_0467 NR:ns ## KEGG: BT_0467 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 301 138 438 489 275 52.0 1e-72 MAAFAYITLIESILKLLIVYLLILSTFDKLIIYGALMLLVSLAIRMIYGVYCSRNFIECK FKLVKERFYYKQILGFSGWNIIGSSSVVLTNYGINILLNIFFGVAVNAARGITTQVDNAL NQFVSNFVMALNPQITKSYASGNREYMMKLVMTGSRYSFYLLLIMVIPILFETEYILACW LKNTPKYTVIFVQLSLIYMLCQSLSNTLFTAMLATGNIRNYQIIVGGLALMAFPLSYGLF KMGFEPAYCYYATIFISILCLVVRLIMLHRIIGLSIRIFFKDVIVRVILVSIFSLISPYI IISFMTQGLERFI Prediction of potential genes in microbial genomes Time: Wed May 18 03:24:56 2011 Seq name: gi|222159264|gb|ACAB01000095.1| Bacteroides sp. D1 cont1.95, whole genome shotgun sequence Length of sequence - 4956 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 71 - 607 331 ## MmarC6_0586 polysaccharide pyruvyl transferase 2 1 Op 2 . 
+ CDS 582 - 908 204 ## MmarC6_0586 polysaccharide pyruvyl transferase + Prom 1000 - 1059 7.8 3 2 Op 1 . + CDS 1293 - 2234 491 ## gi|237714667|ref|ZP_04545148.1| predicted protein + Term 2235 - 2275 2.5 + Prom 2237 - 2296 3.0 4 2 Op 2 1/0.000 + CDS 2316 - 3422 687 ## COG3754 Lipopolysaccharide biosynthesis protein + Term 3453 - 3490 -0.9 + Prom 3844 - 3903 5.4 5 2 Op 3 . + CDS 3935 - 4525 293 ## COG0438 Glycosyltransferase Predicted protein(s) >gi|222159264|gb|ACAB01000095.1| GENE 1 71 - 607 331 178 aa, chain + ## HITS:1 COG:no KEGG:MmarC6_0586 NR:ns ## KEGG: MmarC6_0586 # Name: not_defined # Def: polysaccharide pyruvyl transferase # Organism: M.maripaludis_C6 # Pathway: not_defined # 11 165 107 265 395 84 33.0 2e-15 MIKQVEYVAAINGGDGFSDIYNTTSFLNRLPDINLAMKFNIPVIILPQTLGPFRESKNKA IADRILCYASQIFVRDNKYASDLEAMGLKYEQTRDLSYYMKPEPFNIEIKANAIGINISG LAYSNKFRTLSGQFSTYPYLINKLIVYFQQKNIAIYLIPHSYNYQKVEVANDDYGSCT >gi|222159264|gb|ACAB01000095.1| GENE 2 582 - 908 204 108 aa, chain + ## HITS:1 COG:no KEGG:MmarC6_0586 NR:ns ## KEGG: MmarC6_0586 # Name: not_defined # Def: polysaccharide pyruvyl transferase # Organism: M.maripaludis_C6 # Pathway: not_defined # 4 100 273 365 395 68 38.0 8e-11 MTIMEAARDVYSKLHDKSNIILIDQDLISPQIKFIISQMKFFIGTRMHANFAAMYTGVPL FGLAYSYKFQGAFEANGIYDSTAMINDISEKEADAIVERIVTKYKSLD >gi|222159264|gb|ACAB01000095.1| GENE 3 1293 - 2234 491 313 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714667|ref|ZP_04545148.1| ## NR: gi|237714667|ref|ZP_04545148.1| predicted protein [Bacteroides sp. 
D1] # 1 313 35 347 347 539 100.0 1e-151 MWNVSRYDPKIIRYSRNCMVIVIIVIVIYGIFLLTLNGLNPYVYFMAQINNAELREAQFG EQMARLIIKISSVFTHPMIFGLFLGLAMVYLYSLKDKIKPLFVYLLMFFIVVCIFLCGIR TPIGAMFLTVFFYLLMLRRIKPMIYVAVIGFIGYIIIENIPELSATIDSIFIKDSRQTNV EGSSIDMRMEQLNGCFREIQDCLIFGKGYEWCGYYMSIHDLHPVLLAFESLIFVVLCNSG IVGLCVWVITFVWLFRGVYRMNKNVNVTLFVITLAVYYIAYSAITGEYGYMKYFIIFYTL LLMESKIFIGRKH >gi|222159264|gb|ACAB01000095.1| GENE 4 2316 - 3422 687 368 aa, chain + ## HITS:1 COG:CC0633 KEGG:ns NR:ns ## COG: CC0633 COG3754 # Protein_GI_number: 16124886 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipopolysaccharide biosynthesis protein # Organism: Caulobacter vibrioides # 6 365 221 564 818 248 38.0 1e-65 MITKARVIAFYLPQFHPIKENDEWWGKGFTEWTNVGKAKPLFKGHYQPRVPADLGYYDLR MPEVREAQADMAREAGIEGFMYWHYWFGNGKQILERPFNEVLSSGKPDFPFCLGWANHSW TRRTWNSDAQKHKNLDLMIQEYPGDEDIVKHFNNVLPAFKDKRYIAVDGKPIFLIYDPEA LPDAKHFIDVWKKLAKQNGLAGIHFVGLQNAAVSRYQRIFDLGFDALAPSNLWHAEELCK GRWCKLLQHKIRQLFPNYTPLDKYKYKDIISNFYTSYDYREDVYPSIIPNWDRSPRAGRR AVIYTGSTPALFEEHIKKALEVILQKQDQHKILFLRSWNEWAEGNYVEPDLKFGHGYLDV LKSSILMK >gi|222159264|gb|ACAB01000095.1| GENE 5 3935 - 4525 293 196 aa, chain + ## HITS:1 COG:TM0622 KEGG:ns NR:ns ## COG: TM0622 COG0438 # Protein_GI_number: 15643387 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Thermotoga maritima # 4 172 179 346 388 75 30.0 7e-14 MRTVDYDKVSVELEKLKIHDDDKIFIHVARCAKVKNQELLVKTFNRFLNEGNHGILILIG ASYDSKENIHIINSAQKGIYWLGTKNNVVDYLLKADFFVLSSLAEGLPISLLEAMSCGVI PICTPVGGIPNVIDGEDKGYISKSSNADDFYYTLKKAFDNEEKINRMKLKEYFDNNFSIS HCAESYLRVFGYDEIK Prediction of potential genes in microbial genomes Time: Wed May 18 03:25:18 2011 Seq name: gi|222159263|gb|ACAB01000096.1| Bacteroides sp. 
D1 cont1.96, whole genome shotgun sequence Length of sequence - 18434 bp Number of predicted genes - 15, with homology - 14 Number of transcription units - 6, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 73 - 132 5.3 1 1 Op 1 8/0.000 + CDS 152 - 1186 527 ## COG0451 Nucleoside-diphosphate-sugar epimerases 2 1 Op 2 . + CDS 1212 - 2525 847 ## COG1004 Predicted UDP-glucose 6-dehydrogenase + Prom 2527 - 2586 6.8 3 2 Op 1 . + CDS 2632 - 3369 494 ## COG1922 Teichoic acid biosynthesis proteins 4 2 Op 2 1/0.000 + CDS 3383 - 4468 915 ## COG1089 GDP-D-mannose dehydratase + Prom 4512 - 4571 7.1 5 2 Op 3 2/0.000 + CDS 4594 - 6000 626 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 6 2 Op 4 2/0.000 + CDS 6045 - 6848 691 ## COG1596 Periplasmic protein involved in polysaccharide export 7 2 Op 5 . + CDS 6860 - 9292 1956 ## COG0489 ATPases involved in chromosome partitioning + Term 9514 - 9552 2.0 + Prom 9515 - 9574 5.5 8 3 Tu 1 . + CDS 9643 - 11202 1074 ## BT_1642 hypothetical protein + Term 11260 - 11309 3.1 + Prom 11282 - 11341 5.1 9 4 Op 1 . + CDS 11364 - 11894 523 ## BT_0615 hypothetical protein 10 4 Op 2 . + CDS 11963 - 12055 114 ## + Term 12095 - 12128 6.1 + Prom 12143 - 12202 8.1 11 5 Tu 1 . + CDS 12309 - 14636 2080 ## COG5009 Membrane carboxypeptidase/penicillin-binding protein + Prom 14640 - 14699 5.3 12 6 Op 1 . + CDS 14740 - 15126 177 ## BT_0744 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase 13 6 Op 2 . + CDS 15127 - 15879 690 ## COG1212 CMP-2-keto-3-deoxyoctulosonic acid synthetase 14 6 Op 3 . + CDS 15884 - 17167 1025 ## COG0612 Predicted Zn-dependent peptidases + Term 17172 - 17213 -0.4 + Prom 17174 - 17233 5.0 15 6 Op 4 . 
+ CDS 17253 - 18422 1024 ## COG4642 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|222159263|gb|ACAB01000096.1| GENE 1 152 - 1186 527 344 aa, chain + ## HITS:1 COG:BH3709 KEGG:ns NR:ns ## COG: BH3709 COG0451 # Protein_GI_number: 15616271 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Bacillus halodurans # 1 341 1 334 343 355 52.0 8e-98 MKILVTGAAGFIGSKLMGVLASRGDEVVGIDSINNYYDVRLKYGRLSEMGIMLNDEFVWN QPIQSSRYETCRFIRMSIDDRHAMEELFEREHFEKVVNLAAQAGVRYSITNPYAYLQSNL AGFLNVLECCRHYEVKHLVFASSSSVYGLNSKVPYSEEDKVDTPVSLYAATKKSNELMAH SYSKLYGLAVTGLRFFTVYGPWGRPDMAPMLFARAISNGEQIKVFNNGDMIRDFTYIDDI VEGTIRTLDHVPVTQKSSNGVAYKIYNIGCSHPVKLMDFIHEIESAMGHEAEKIFLPMQP GDVYQTNADTSMLKKEIGYEPMVTLHDGVAKFIQWYKSEKNPLK >gi|222159263|gb|ACAB01000096.1| GENE 2 1212 - 2525 847 437 aa, chain + ## HITS:1 COG:XF1606 KEGG:ns NR:ns ## COG: XF1606 COG1004 # Protein_GI_number: 15838207 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Xylella fastidiosa 9a5c # 1 437 1 444 450 500 55.0 1e-141 MNIAVVGTGYVGLVSGTCFAEMGVNVTCVDVNEEKIKSLLDGQIPIYEPGLDEMVLRNHR EGRLNFTTDLKTCLDNVDIVFSAVGTPPDEDGSADLTYVLEVARTVGRNINKYVLLVTKS TVPVGTAQKVKKAIHEELEKRGVDIEFDVASNPEFLKEGAAIQDFMRPDRVVVGIESEKA KELMSRLYRPVMLNNFRVIFTDIPSAEMIKYAANSMLATRISFMNDIANLCELVGADVNM VRKGIGADTRIGSKFLYPGCGYGGSCFPKDVKALVKTAENRGYEMRVLKAVEEVNENQKH IVFNKLCKHYNGELHGKVIAIWGLAFKPETDDMREATALITINQLIKAGCKVQVFDPVAM DECKRRVGEGVTYAKDMYDAVLNADALLLLTEWKQFRLPSWGVLKRTMNSPVIIDGRNIY DPVEMMEKGVTYYCIGR >gi|222159263|gb|ACAB01000096.1| GENE 3 2632 - 3369 494 245 aa, chain + ## HITS:1 COG:mlr6502 KEGG:ns NR:ns ## COG: mlr6502 COG1922 # Protein_GI_number: 13475436 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Teichoic acid biosynthesis proteins # Organism: Mesorhizobium loti # 27 237 45 250 262 123 32.0 3e-28 MKLCLKTIILLKNQKDLSSLPEGKLLINTINAHSYNTALKDPLFQEALLKGGALIPDGIS 
MVLAFKFLRGEKIERTAGWDLFQYEMNKLNQKGGVCFFLGSSKKTLSLVCEKVKTVYPNI QVKTYSPPYKSSFTEEENREMINAVNMANPDLLWIGMTAPKQEKWAYEHLNELNVHCHIG TIGAVFDFFAGTVQRAPQWWQKNGLEWAYRLLKEPKRMWRRYIIGNVLFLWNILKEKNGM IMDHN >gi|222159263|gb|ACAB01000096.1| GENE 4 3383 - 4468 915 361 aa, chain + ## HITS:1 COG:BMEI1413 KEGG:ns NR:ns ## COG: BMEI1413 COG1089 # Protein_GI_number: 17987696 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: GDP-D-mannose dehydratase # Organism: Brucella melitensis # 1 361 1 352 362 486 65.0 1e-137 MKKIALISGITGQDGSFLAEFLIEKGYEVHGILRRSSSFNTSRIEHLYLDEWVRDMKKDR LVNLHYGDMTDSSSLIRIIQQVQPDEIYNLAAQSHVKVSFDVPEYTADADAIGTLRMLEA VRILGMEKKTKIYQASTSELFGMVQEVPQKETTPFYPRSPYGVAKQYGFWITKNYRESYG MFAVNGILFNHESERRGETFVTRKITLAAARIAQGLQDKLYLGNLDSLRDWGYAKDYVEC MWLILQHDAPEDFVIATGEYHTVREFTTLAFKETGVNLCWEGKGVNEKGIDGATGKVLVE VDPKYFRPAEVEQLLGDPTKARTLLGWNPRKTSFEELIKIMVSHDLKFVKRLYMQESMNR E >gi|222159263|gb|ACAB01000096.1| GENE 5 4594 - 6000 626 468 aa, chain + ## HITS:1 COG:wcaJ KEGG:ns NR:ns ## COG: wcaJ COG2148 # Protein_GI_number: 16129987 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Escherichia coli K12 # 50 458 51 455 464 233 36.0 6e-61 MQEVQRFNKVLKSCVLLGDLILLNLLLWGFEQFLGNRFWCENCGSILQGMALITLCYLLC NMHSGVILHRSVVRPEQIMVRVLRNMVPFVLLSVCILLLFHFEFSHSRLFGLFYIVLILV IVSYRLAFRYFLELYRKQGGNVRKVILIGSHENMQELYHAMTDDPTSGYRVLGYFEDFPS DRYPSDVSYLGHPQEVNNFLKQNVGRVDQLYCSLPSARSAEIVPIINYCENHLVRFFSVP NVRNYLKRRMHFEMLGNVPVLSIRREPLELLENRIVKRTFDIVCSTLFLCTIFPFIYIIV GVAIKMSSPGPIFFKQKRSGEDGKEFWCYKFRSMRVNAQCDTLQATEHDPRKTRIGEIIR KTSIDELPQFINVFKGDMSIVGPRPHMLKHTQEYSLLINKFMVRHFVKPGITGWAQVTGY RGETKELWQMEGRVMRDIWYIEHWTFLLDLYIMYKTVYNAIHGEKEAY >gi|222159263|gb|ACAB01000096.1| GENE 6 6045 - 6848 691 267 aa, chain + ## HITS:1 COG:RSp1020 KEGG:ns NR:ns ## COG: RSp1020 COG1596 # Protein_GI_number: 17549241 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # 
Organism: Ralstonia solanacearum # 48 229 86 262 381 60 31.0 3e-09 MRRLNKKFGLLLLPFLLVACQSYKKVPYLQDVEVVEQVTQQENLYDAKIMPKDLLTIVVS CTSPELALPFNLTIASPSGIASSNSTFTTAQPMLQPYLVDNEGKINFPVLGELKLGGLTK KQAEQMIVDKLKLYITETPIVTVRMVNYKISVIGEVTRPGTFTISNEKVNLLEALAMAGD MTVYGLRDNVRLIREDSSGKQQIITLDLNKAETILSPYYWLQQNDIIYVTPNKAKARNSD IGNSTSLWFSATSILVSLASLLFNILK >gi|222159263|gb|ACAB01000096.1| GENE 7 6860 - 9292 1956 810 aa, chain + ## HITS:1 COG:alr2856_2 KEGG:ns NR:ns ## COG: alr2856_2 COG0489 # Protein_GI_number: 17230348 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Nostoc sp. PCC 7120 # 543 788 3 253 275 120 29.0 8e-27 MKEKKVYNVEQEANEEKIDFQEILFKYLIHWPWFVGTVVVFLLGAWLYLRMATPVYNISA TVLIKDDKKGGGAGMASELENLGLDGLISSSQNIDNEIEVLRSKTIAKEVVIDLNLYISY KDEDEFPAKDMYKTSPVQVNLIPQEAELLDDPMIVEMALQPQGSLDINVKVGDDEFQKHF DKLPAVFPTARGTLAFFMSPDSLISKGTDDVDLAKKVRNITATINNPLRVARWYCKSMTI EPTSKTTSVAVISLKNSSLRRGQDFINKLLEMYNINTNNDKNEIAQKTAEFIDERIGIIS KELGSTEESLETFKRNAGITDLTSEAQIALTGNAEYEKKRVENQTQINLVEDLRRYMRGN EYEVLPGNIGLQDAGLVAQIDRYNEMLVERKRLLRTSTENNPTIINLDTSIRAMKMNVDV TLDRTLQGLLITKADLDREASRFSRRINEAPGQERQFVSIARQQEIKSGLYLLLLQKREE NAITLAATANNAKIIDDAIADEIPVSPKGKIIYLVALVLGVGIPVGVIYLINLTKFRIEG RSDVEKLTSIPIVGDIPLTDEKQGAIAVFENQNNLMSETFRNIRTNLQFMLENDKKVILV TSTVSGEGKSFISANLAISLSLLGKKVIIVGLDIRKPGLNKVFNIPRKEVGITQYLANPE KNLMDLVQLSDVSKNLYILPGGTVPPNPTELLARDGLDKAIETFKKSFDYVILDTAPVGM VTDTLLIGRVADLSVYVCRADYTHKNEYTLINELAENNKLPKLCTVINGLDLKRRKYGYY YGYGKYGKYYGYGKRYGYGYGYGEQTHDKE >gi|222159263|gb|ACAB01000096.1| GENE 8 9643 - 11202 1074 519 aa, chain + ## HITS:1 COG:no KEGG:BT_1642 NR:ns ## KEGG: BT_1642 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 519 1 521 521 852 81.0 0 MENLRRYPIGIQTFSEIRKSNYLYIDKTEYVYWMVHFTKYVFLSRPRRFGKSLLTSTLHS YFSGQKELFQGLAIEKLEKEWTEYPVLHFDMSTAKHADCEQLLQELNMKLIRYEEVYGKM EGEVNPNQRLEGLIKRAYEQTGKQVVVLIDEYDAPLLDVVHEEERLGVLRNIMRNFYSPL 
KACDPYLRYVFLTGITKFSQLSIFSELNNIKNISMNESYAAICGITENEILVQMKDDVDA LAQKLEVTSEEVLAKLKENYDGYHFTYPSPDIYNPFSLLNAFADGKFNSYWFGSGTPTYL IKMLDKFGVAPSEIGRKTAVAEDFDAPTERMVSITPLLYQSGYITIKDYDKELDLYTLDI PNKEVRIGLMKSLLPNYVASKTPEANTMVAYLSRDIRNGDMDAALRRLQTFLSTIPQCDN TKYEGHYQQMFYIIFSLLGYYVDVEVRTASGRVDMVLRTKTTLYVMELKLDKSADRAMEQ IDLKNYPKRFALCGLPIVKVAVSFDSEQCTIGEWKILKV >gi|222159263|gb|ACAB01000096.1| GENE 9 11364 - 11894 523 176 aa, chain + ## HITS:1 COG:no KEGG:BT_0615 NR:ns ## KEGG: BT_0615 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 150 1 150 172 257 89.0 1e-67 MSVIYKVITRPTDPRVPNSPKRYYPHLITLGQSVNLKYIAQKMQDRSSLSVGDIKSVIQN FVEKMKEQLLEGKSVNIEGLGVFMLTARSKGAELAKDINAKSVESVRIFFQANKELRVTK TATRADEKLDLISLDEYLKKLNMTISPNDPEKPDDGGEEGGGNESGGSGEAPDPAA >gi|222159263|gb|ACAB01000096.1| GENE 10 11963 - 12055 114 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKTWSIILKVIIAVAGAIAGVVGVQAANL >gi|222159263|gb|ACAB01000096.1| GENE 11 12309 - 14636 2080 775 aa, chain + ## HITS:1 COG:aq_624 KEGG:ns NR:ns ## COG: aq_624 COG5009 # Protein_GI_number: 15606057 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase/penicillin-binding protein # Organism: Aquifex aeolicus # 1 748 1 679 726 278 30.0 3e-74 MIRKIIKALWIFLAVIVLAIVVIFVSISKGWIGYMPPVEELENPSYKFATEIFSEDEKVL GTWSYSKENRVYTAYKDLSPSIINALIATEDVRFVEHSGIDAKALFRAFVKRGLMFQKNA GGGSTLSQQLAKQLFTENVARNTLQRLFQKPIEWVIAVKLERYYTKEEILSMYLNKFDFL NNAVGIKTAAHTYFGCEPKDLKIEEAATLVGMCKNPSLYNPVRFNERSRGRRNVVLEQMR KAGYITDAECDSLQALPLKLKYNRVDHKEGLATYFREYLRGVMTAPKPVKSDYRGWQMQK FYEDSIAWETNPLYGWCAKNKKKDGTNYNIYTDGLKIYTTINSRMQQYAEDAVKEHLGDY LQPVFFKEKEGSKNAPYARSLPEKRVEELLTKAMKQTDRYRLMKEAGASEQQIRKAFDTP EEMTVFSWKGDKDTIMTPMDSIRYYKSFLRTGFMSMDPVSGHVKAYVGGPNYVYFQYDMA MVGRRQVGSTIKPYLYTLAMENGFSPCDQTRHVEQTLIDENGTPWTPRNANNKRYGEMVT LKWGLANSDNWISAYLMGKLNPYNLVRLIHSFGVRNKAIDPVVSLCLGPCEISVGEMVSA YTAFANKGIRVAPLFVTRIEDSDGNVLSTFAPQMEEVISISSAYKMLVMLRAVINEGTGG RVRRYGITADMGGKTGTTNDNSDAWFMGFTPSLVSGCWVGGDERDIHFGRMTYGQGAAAA 
LPIWALYMKKVYDDPTLGYDQQEKFKLPEGFDPCAGSETPDGEVIEEGGLDDLFN >gi|222159263|gb|ACAB01000096.1| GENE 12 14740 - 15126 177 128 aa, chain + ## HITS:1 COG:no KEGG:BT_0744 NR:ns ## KEGG: BT_0744 # Name: not_defined # Def: 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 118 48 165 173 147 62.0 1e-34 MHRYIVCIGSNYNRKENLTFARQELTESFSSICFAPELETEPLFFKNPALFSNQVVMFFS DKDEEVVRKMLKDIEQRSGRRPEDKKEEKVCLDIDMLLYDNKIVKPEDWQRGYIQQSLSA FHSSLFIK >gi|222159263|gb|ACAB01000096.1| GENE 13 15127 - 15879 690 250 aa, chain + ## HITS:1 COG:FN0807 KEGG:ns NR:ns ## COG: FN0807 COG1212 # Protein_GI_number: 19704142 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: CMP-2-keto-3-deoxyoctulosonic acid synthetase # Organism: Fusobacterium nucleatum # 1 247 1 239 245 194 45.0 1e-49 MKFLGIIPARYASTRFPAKPLAMLGGKTVIQRVYEQVAGVLDDAYVATDDERIEAAVKAF GGKVVMTSIHHKSGTDRCYEACTKIGGDFDVVVNIQGDEPFIQPSQLDAVKACFEDVTTQ IATLVKPFTADEPFAVLENVNSPKVVVNKNWNALYFSRSIIPYQRNAEKQDWLKGHTYYK HIGLYAYRTDVLKEITMLPQSSLELAESLEQLRWLENGYKIKVGISEVETIGIDTPQDLE RAEEFLKNRI >gi|222159263|gb|ACAB01000096.1| GENE 14 15884 - 17167 1025 427 aa, chain + ## HITS:1 COG:sll2009 KEGG:ns NR:ns ## COG: sll2009 COG0612 # Protein_GI_number: 16330306 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Synechocystis # 22 420 9 410 435 108 26.0 2e-23 MDRKIQPEIQTLKNFRILPPVRMTLPNGIPLTVINAGEQEVVRMDVLFAGARWQQSQKLQ ALFTNRMLREGTTKYTAATIAEKLDYYGSWLELSSSSEYAYITVYSLNKYLAKTLEVVES MIKEPLFPEKELHTILDTNIQQYLVNTSKVDFLAHRSLLKSLYGEQHPCGKIVVEEDYHA ITPEVLREFYERYYHSGNCSIFLSGKVTEDIISRVTDTFGTSFGQHQQPALKLSFPFTAV SEKRIFIEREDAMQSAVKMGYTTITRNHPDYLKLRVLMTLFGGYFGSRLMSNIREEKGYT YGISAGIMFYPDSGLLAISTETDNEYVEPLIQEVYHEIDRLHQEPVSMEELTIVRNYMLG EMCRSYESPFSLSDAWIFIATSGLDDDYFSRSLLAVNEVTPAEIQDLAQRYLCKETLKEV IAGKKLS >gi|222159263|gb|ACAB01000096.1| GENE 15 17253 - 18422 1024 389 aa, chain + ## HITS:1 COG:slr1485 KEGG:ns NR:ns ## COG: slr1485 COG4642 # 
Protein_GI_number: 16329198 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Synechocystis # 46 360 27 341 349 163 33.0 5e-40 MRKYLYTTLLLALLAQEGVMAQENEGKKGGFFGKIKDTFSTEIKIGNYTFKDGSVYTGEM KGRKPNGKGKTVFKNGDVFEGEYVKGKREGYGIYMFPDGEKYEGQWFQDQQHGKGIYYFM NNNRYDGMWYQDYQHGEGTMYYHNGDLYVGHWVNDKREGEGTYTWANGAKYTGHWKNDKK NGKGTMNWDDGSKYDGDWKDDVRHGKGVFEYTNGDKYDGDWADDIQHGKGTYYFHTGDRY EGSYLLGERTGPGVYYHANGDKYVGNFKNGMQDGKGTFTWANGAVYEGSWKNNKRDGKGV YKWSNGDVYDGDWKDNRPNGQGTLKTVAGMQYKGGFVDGLEEGQGVQIDKDGNRFDGFFK QGKKDGPFVETDKDGKVIKKGTYKFGRLQ Prediction of potential genes in microbial genomes Time: Wed May 18 03:25:35 2011 Seq name: gi|222159262|gb|ACAB01000097.1| Bacteroides sp. D1 cont1.97, whole genome shotgun sequence Length of sequence - 7799 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 54 - 105 15.6 1 1 Tu 1 . - CDS 135 - 1073 804 ## COG0462 Phosphoribosylpyrophosphate synthetase - Prom 1140 - 1199 7.2 + Prom 1212 - 1271 7.3 2 2 Op 1 . + CDS 1296 - 5543 3113 ## COG0642 Signal transduction histidine kinase 3 2 Op 2 . + CDS 5578 - 6306 1005 ## BF2233 two-component system response regulator 4 2 Op 3 . 
+ CDS 6354 - 7790 1139 ## COG2978 Putative p-aminobenzoyl-glutamate transporter Predicted protein(s) >gi|222159262|gb|ACAB01000097.1| GENE 1 135 - 1073 804 312 aa, chain - ## HITS:1 COG:Cj0918c KEGG:ns NR:ns ## COG: Cj0918c COG0462 # Protein_GI_number: 15792247 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Campylobacter jejuni # 7 311 4 309 309 313 51.0 2e-85 MSEKAPFMVFSGTNSRYLAEKICASLDCPLGNMNITHFADGEFAVSYEESIRGAHVFLVQ STFPNSDNLMELLLMIDAAKRASAKSVVAVIPYFGWARQDRKDKPRVSIGAKLVADLLSV AGIDRLITMDLHADQIQGFFNIPVDHLYASAVFLPYIQSLQLENLVIATPDVGGSKRAST FSKYLGVPLVLCNKSREKANEVASMQIIGDVEGKNVVLIDDIVDTAGTITKAANIMLEAG AQSVRAIASHCVMSDPASFRVQESALTEMVFTDSIPYAKKCPKVKQLSIADMFAETIKRV MNNESISSQYII >gi|222159262|gb|ACAB01000097.1| GENE 2 1296 - 5543 3113 1415 aa, chain + ## HITS:1 COG:all7583 KEGG:ns NR:ns ## COG: all7583 COG0642 # Protein_GI_number: 17158719 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 658 796 327 447 466 69 36.0 5e-11 MVPTKEVRLIDSLNGKAYTYRYRSLDSSYKYANEAYQQVNFYKSGKAEASNNLGFCAFMA MDFDRAEALHKEVYKLTKNELELLIADIGLMKICQRTAMNKEFYDYRNSALRRMKRIREE SDLFADRHEALRLDYAFTEFFIVSSIYYYYLQQRQEAIASLNQIPEDEVLADTNQLLYYH YIKGSASLVEATKPEDRKMREFDQLYITWRTAVQTNHPYFEGNGLQGLANLMVSSNNFEL FRTRRGYALDQFGFPVDSLLPLRMAQLALEKFREYNDLYQIAGAYVSIGKYMNEHGRYSE ALDTLTKALDCVNQHHMLYYHHAVDTLDKLRIYAEGDTTYTGVPWIMEEDVRTVPEWISR IREQLSVSYAGLGMKYASDYNRNIYLDILNYTRQDKELESRYLSLESDSRQMTLVLSLVI AGLVLVVILWWFFNKRSKIRNQVDVERLQRILALCRDITSSIPMNVPLIQQGIDQLFGKG RLQLEIPEEGKAALVPLHRLNRDEKALVHVLEPYIVWAADNEQMVEALSDERMQLEKQRY VYEQHIAGNKRQNLIKKACLAIVNGINPYIDRILNEVHKLTERGYIDHEKIKKEKYQYID ELVTTINEYNDILALWIKMKQGTLSLNIETFDLNELFELLGKGRRAFEMKNQKLEIEPTT VMVKADRALTLFMINTLAENARKYTPEGGTIKVYARTTDAYVEISVEDNGRGISEEDMAR IIGEKVYDSRVIGMKNAANPEVLKENKGSGFGLMNCKGIIEKYKKTNELFRGCVFDVESE LGKGSRFYFRLPSGVRKTMGVLLLCLLLPFGVSSCLHDPIPPMLQEGDSIVVVTDSAYED LLDAASDYANAAYFANVDENYEFALQYIDSAILLLNEHYEKYARPDRPHRYMKLVGEGTP AEISWWNELFDSDYHVILDIRNEAAVAFLALKQLDAYSYNNSAFTDLYKLQGEDQTLEAY CRQLERSNTNKTVGIILCFVLLIVSLVGYYFLYMRKRLQNRLNLEQVLEINQKVFAASLV RPQEQENAEALQREESTLKEIPQRIVDEAFGPVNELLTIDRMGIAVYNETTHRLEYASRP GQEMPEMVEQCFSSGEYLSEQHLQAIPLMVEAGGEHQCVGVLYLERREGTEQETDRLLFE LVARYVAIVVFNAVVKLATKYRDIESAHEETRRASWEDSMLHVQNMVLDNCLSTIKHETI YYPNKIKQIVGRLNTQKLSETEEREAVETMTELIEYYKGIFTILSSCASRQLEEVTFRRT VIPVQELLDAAGKYFKKSMKNRSERIELEIEPMEAKVIGDVNQLRFLLENLIDEALTVRE DGLIRLQARQDNEYIRFLFTDTRREKSVEELNQLFYPNLARMTSGEKGELRGTEYLICKQ IIRDHDEFAGRRGCRINAEPAEGGGFTVYFTIPRR >gi|222159262|gb|ACAB01000097.1| GENE 3 5578 - 6306 1005 242 aa, chain + ## HITS:1 COG:no KEGG:BF2233 NR:ns ## KEGG: BF2233 # Name: not_defined # Def: two-component system response regulator # Organism: B.fragilis # Pathway: not_defined # 1 242 1 242 242 440 95.0 1e-122 MEEQKFKVIIVEDVKLELKGTEEIFRHEIPNAEVIGTAMTESEFWPLMEAQLPDLVLLDL GLGGSTTIGVDICRNIFKRYKGVRVLIFTGEILNEKLWVDVLNAGADGIILKTGELLTKT DVQAVMDGKKLVFNYPILEKIVDRFKKSVANDAKRQEAVISYDIDEYDERFLRHLALGYT KEMIANLKGMPFGVKSLEKRQNDLIGRLFPNGERVGVNATRLAVRALELRIIDLDNLEPD 
EE >gi|222159262|gb|ACAB01000097.1| GENE 4 6354 - 7790 1139 478 aa, chain + ## HITS:1 COG:FN0470 KEGG:ns NR:ns ## COG: FN0470 COG2978 # Protein_GI_number: 19703805 # Func_class: H Coenzyme transport and metabolism # Function: Putative p-aminobenzoyl-glutamate transporter # Organism: Fusobacterium nucleatum # 1 478 23 503 512 300 37.0 3e-81 MPHPATMFLLLTMAVVFLSWICDIYGLKVTLPQTGEDIRVQSLLSPEGIRWWLRNAIKNF TGFAPLGMVIIAMFGLGVAQHSGFIDACIRMGVGNRQEKRKIILWVIVLGLLSNAIGDGG YIILLPIAAMLFQWVGLHPIAGIVTAYVSVACGYSANIVLSTMDPLLAHTTQEAALAQTG YQGNTEPLCNYFFMSASTVVITAIVYWITQKWLLPTLGKYEGSVKVVAYHPLSRKERRAI MISIVVAAVYVALILWLTFSSYGILRGVNGGLMHSPFIAGILFLLSLGAGITGMAYGFSS GRYRTDNDVIEGLTQPMKLLGVYFVIAFFAAQMFACFEYSHLDKCLAIMGADLLSSFEPA PLSALILFILFTALINLIMVSATSKWAFMSFIFIPMFAQMGIAPDVAQCAFRIGDSSTNA ITPFLFYMPLVLTYMRQYDKQITYGSLLKYTWRYSLGILVTWTLMFIVWYLLKIPMGL Prediction of potential genes in microbial genomes Time: Wed May 18 03:25:43 2011 Seq name: gi|222159261|gb|ACAB01000098.1| Bacteroides sp. D1 cont1.98, whole genome shotgun sequence Length of sequence - 13299 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 7, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 121 - 194 75.7 # Arg TCT 0 0 + Prom 250 - 309 7.5 1 1 Tu 1 . + CDS 481 - 1215 782 ## BT_0766 hypothetical protein + Term 1241 - 1282 5.3 + Prom 1250 - 1309 8.8 2 2 Op 1 9/0.000 + CDS 1343 - 2350 588 ## COG0147 Anthranilate/para-aminobenzoate synthases component I 3 2 Op 2 . + CDS 2334 - 2930 488 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase + Term 2987 - 3035 1.3 + Prom 2972 - 3031 6.6 4 3 Op 1 . + CDS 3217 - 3882 613 ## COG5587 Uncharacterized conserved protein 5 3 Op 2 . + CDS 3920 - 4366 631 ## COG2731 Beta-galactosidase, beta subunit 6 3 Op 3 . + CDS 4385 - 6397 2123 ## COG0296 1,4-alpha-glucan branching enzyme + Term 6417 - 6462 1.2 - Term 6406 - 6449 4.6 7 4 Tu 1 . 
- CDS 6485 - 6973 379 ## BT_0772 hypothetical protein - Prom 7036 - 7095 3.0 - Term 7036 - 7084 13.1 8 5 Op 1 . - CDS 7115 - 8812 1418 ## COG0366 Glycosidases 9 5 Op 2 . - CDS 8866 - 9672 603 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily - Prom 9817 - 9876 5.3 10 6 Tu 1 . - CDS 9913 - 11244 416 ## COG1145 Ferredoxin - Prom 11407 - 11466 3.9 - Term 11395 - 11439 4.1 11 7 Op 1 . - CDS 11468 - 12403 880 ## COG2006 Uncharacterized conserved protein 12 7 Op 2 . - CDS 12417 - 12926 404 ## BT_0777 hypothetical protein - Prom 13051 - 13110 5.6 Predicted protein(s) >gi|222159261|gb|ACAB01000098.1| GENE 1 481 - 1215 782 244 aa, chain + ## HITS:1 COG:no KEGG:BT_0766 NR:ns ## KEGG: BT_0766 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 244 24 267 267 378 88.0 1e-104 MKKLIFLFAFCIVVTNVFAQTAPEQLKKEGNDAFNAKNYPVAYAKFSEYLKQTNNQDSAT AYYCGIAADEVKKYAEAVTFFDIAIQKKFNIGNAYARKALALDAQKKTAEYVATLEEGLK VDPKNKTMVKNYGLHYLKAGLAAQKAGKAEEAEDCFKKVIPLDHKQYKTNALYSLGVLCY NDGANILKKAAPLANSDADKYAAEKEKADGRFKEALDYLEEAAKISPENENVKKMLPQVK AVMK >gi|222159261|gb|ACAB01000098.1| GENE 2 1343 - 2350 588 335 aa, chain + ## HITS:1 COG:PM1464 KEGG:ns NR:ns ## COG: PM1464 COG0147 # Protein_GI_number: 15603329 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Pasteurella multocida # 10 334 7 323 324 297 48.0 2e-80 MHLYNKEQAIKRMNQFGQHHRPFVFIINYLQDASYIEEVAAVDSTELLYNLNGFTNQATS TEKYTPSFSAKPANSIHWQPSAESFYSYQRSFNIVRQNILAGNSFLTNLTCRTPVDTNLT LKDIYYHSKAIYKLWVKDTFTVFSPEIFVRIHQGKISSYPMKGTIDASIPSAAQLLMNDP KEAAEHATIVDLIRNDLSMVANQVSVSRYRYIDTLQTNQGAILQTSSEIQGILPENYPEH LGELIFRLLPAGSITGAPKKKTMQIIREAETYDRGFYTGIMGYSDGINLDSAVMIRFVEQ EGEKMYFKSGGGITCQSDAESEYNEMKQKVYVPIY >gi|222159261|gb|ACAB01000098.1| GENE 3 2334 - 2930 488 198 aa, chain + ## HITS:1 COG:HI1169 KEGG:ns NR:ns ## COG: HI1169 COG0115 # Protein_GI_number: 16273093 # Func_class: E 
Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Haemophilus influenzae # 1 181 1 185 188 124 37.0 8e-29 MYPFIETIRIEDGQIYNLDYHTERFNRTRAVFWKDSVPLDLREYISPPVLNGIHKCRIVY GKEVEEVTYAPYQMRKVASLHLIESDTINYTYKSTHREELNALYAQRGMADDILIVKDGY LTDTSIANIALYDGYTWFTPAHPLLRGTKRAELLNKQFIVEKDIAQVHLNDYSHIMLFNA MIDWERIVLPINEEHFIL >gi|222159261|gb|ACAB01000098.1| GENE 4 3217 - 3882 613 221 aa, chain + ## HITS:1 COG:XF2023 KEGG:ns NR:ns ## COG: XF2023 COG5587 # Protein_GI_number: 15838617 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Xylella fastidiosa 9a5c # 7 218 19 234 237 105 28.0 6e-23 MNIPVIFQFLKDLSANNNREWFNEHKAEYETARAEFDNFLATVIARISLFDETIRGIQPK DCTYRIYRDTRFSADKTPYKIHFGGYINAKGKKSDHCGYYVHLQPDGSMLAGGSLCLPSN ILKAVRQSIYDNIEEFVAIVEDPEFKKYFPVIGEDFLKTAPKGFPKDFKYVDYLKCKEYV CSYNVPDDFFTRPDMLEQMDKAFRQFKRFADFINYTIDDFE >gi|222159261|gb|ACAB01000098.1| GENE 5 3920 - 4366 631 148 aa, chain + ## HITS:1 COG:CAC0836 KEGG:ns NR:ns ## COG: CAC0836 COG2731 # Protein_GI_number: 15894123 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Clostridium acetobutylicum # 1 147 1 150 152 98 34.0 4e-21 MVVDTLENLEKYASLNPLFAQAIDFLKSHDLQAMEIGKTELKGKDLFVNIAQTKPKTKEE AKLETHNEFIDIQIPLSGTEIMGYTAAKDCVPADAPYNAEKDITFFEGLAETYVTVKPGM FAIFFPQDGHAPGITPDGVKKVIVKVKA >gi|222159261|gb|ACAB01000098.1| GENE 6 4385 - 6397 2123 670 aa, chain + ## HITS:1 COG:YEL011w KEGG:ns NR:ns ## COG: YEL011w COG0296 # Protein_GI_number: 6320826 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Saccharomyces cerevisiae # 8 670 12 704 704 561 46.0 1e-159 MEKTLNLIKNDPWLEPFAGAITGRHQHVLDKEAELTNKGKQTLSDFASGYLYFGLHRTDK GWTFREWAPNATHIYMVGTFNNWEEKAAYKLKKLKNGNWEINLPADAIHHGDLYKLNVYW EGGQGERIPAWATRVVQDEQTKIFSAQVWAPEKPYKFKKKTFKPDTNPLLIYECHIGMAQ 
REEKVGTYNEFREKILPRIAEEGYNCIQIMAIQEHPYYGSFGYHVSSFFAASSRFGTPDE LKALIDAAHEMGIAVIMDIVHSHAVKNEVEGLGNFAGDPNQYFYPGVRREHPAWDSLCFD YGKNEVIHFLLSNCKYWLEEYKFDGFRFDGVTSMLYYSHGLGEAFCNYGDYFNGHQDDNA ICYLTLANEVIHQVNPKAITIAEEVSGMPGLAAKIEDGGYGFDYRMAMNIPDYWIKTIKE KIDEDWKPSSMFWEVTNRRQDEKTISYAESHDQALVGDKTIIFRLIDADMYWHMQKGDEN YVVNRGIALHKMIRLLTSSTINGGYLNFMGNEFGHPEWIDFPREGNGWSCKYARRQWDLV DNKNLTYHYMGDFDKDMLKVLKSVKDFQATPVQEIWHNDGDQVLAYGRKDLIFVFNFNPK QSFTDYGFLVTPGAYEVILNTDDVAFGGNGLADDSVVHFTIADPLYSKEKKEWLKLYIPA RTAVVLRKKK >gi|222159261|gb|ACAB01000098.1| GENE 7 6485 - 6973 379 162 aa, chain - ## HITS:1 COG:no KEGG:BT_0772 NR:ns ## KEGG: BT_0772 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 160 1 158 158 258 82.0 4e-68 MKKFMICAITLFLLLSTSVFGDTPPGNVQATFKKMYPKANGIAWSQDDGYYCANFVMNGF TKNVWFNVRGQWVMTQTDLVSLDRLSPAVYNAFVSGPYASWVVDNVTMVEFPKWQAIIVI KVGQDNVDIKYQLFYTPQGVLLKTRNVSDMYDILGPSTFLVN >gi|222159261|gb|ACAB01000098.1| GENE 8 7115 - 8812 1418 565 aa, chain - ## HITS:1 COG:TM1650 KEGG:ns NR:ns ## COG: TM1650 COG0366 # Protein_GI_number: 15644398 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Thermotoga maritima # 6 360 5 262 422 90 24.0 9e-18 MNDENKIIIYQVFTRLFGNNNNHCVYNGDITTNGCGKMADFTLKALGEIKKLGTTHIWYT GIIEHATQTDYRRYNISPDHPAIVKGKAGSPYAIKDYYDVDPDLANDVQGRMKEFENLVH RTHRTGLKVIIDFVPNHVARQYHSDAQPDGTTELGANDDPSQSFSPYNNFYYIPQAELRA QFDMKDGAAEPYREFPAKATGNNRFDATPNITDWYETVKLNYGVDYQNGGTCHFSPTPDT WIKMLDILLFWASKNIDGFRCDMVEMVPVEFWEWAIPQVKEVYPEILFIAEVYNPSEYRN YLFRGKFDYLYDKVGLYDTLRNVACGYESATAITHCWQSLNGIEKKMLNFLENHDEQRIA SDFFAGNPRKGIPALIVSACMNTNPMMIYFGQEFGELGMDSEGFSGRDGRTTIFDYWSVD TIRRWRNGGKFDGKMLTEEHKHLYSIYQKVLTLCNEEQAIKKGVFFDLMYANINGWRFNE HKQYTFMRKYKNEILFFVINFDSQLVDVAINVPSHAFDFLQIPQMESYQAIDLMTGAKEE ICLLPYKPTDVSVGGYNGKILKITF >gi|222159261|gb|ACAB01000098.1| GENE 9 8866 - 9672 603 268 aa, chain - ## HITS:1 COG:aq_1386 KEGG:ns NR:ns ## COG: aq_1386 COG1752 # Protein_GI_number: 15606577 # Func_class: R General function 
prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Aquifex aeolicus # 6 261 4 258 259 163 35.0 3e-40 MEVFTNNRNSRKYQIGYALSGGFIKGFAHLGVMQALLEHDIKPEIISGVSAGALAGVFYA DGNEPHKVLDYFSGHKFQDLTKLVIPKKGLFDLCEFIDFLRTNVKAKNLEDLQLPLIITA TDLDHGRMVHFHRGSIAERVAASCCMPVMFAPVNIDGTNYVDGGLMMNLPVSTLRRVCDK VVAVNVSPIMAQDYKMNIVSIAMRSFHFMFRANTFPEREKCDLLIEPYNLYGYSNTELEK AEEIFGQGYNTANEVLNQLLEEKGKIWK >gi|222159261|gb|ACAB01000098.1| GENE 10 9913 - 11244 416 443 aa, chain - ## HITS:1 COG:ECs3097 KEGG:ns NR:ns ## COG: ECs3097 COG1145 # Protein_GI_number: 15832351 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Escherichia coli O157:H7 # 258 419 17 164 164 79 28.0 2e-14 MGILQDVIARISKSTGKKKKRYSYSPAKNKLRWGILGLTVIGFLCGFTFIVGLLDPYSAY GRVVVHIFKPIYMLGNNLLESVFSRFDNYTFYQVDTSVLSISSLLIAIITLAVIFVMAWK HGRTWCNTVCPVGTVLGLLSRFSLFKIRIDTAKCNGCGLCATKCKASCINSKEHAIDYSR CVDCFDCLGACKQKALVYNPSLKKQQANVEAPVPSSPDTDSSKRCFLVAGLVTAGAAPKL LSQAKESVARLEGKKAYKKENPITPPGSISREHFQQQCTSCHLCVSKCPSHVLKPAFMEY GLAGIMQPTVSFEKGFCNFDCTVCGDVCPNGAILPISVEQKHLTQMGYVVFIEENCIVYT DGTSCGACSEHCPTQAVAMVPYKDGLTIPHVNKEICVGCGGCEYVCPARPFRAIYIEGNP VQKEAKPFKENEEHKVEIDDFGF >gi|222159261|gb|ACAB01000098.1| GENE 11 11468 - 12403 880 311 aa, chain - ## HITS:1 COG:MA1031_1 KEGG:ns NR:ns ## COG: MA1031_1 COG2006 # Protein_GI_number: 20089906 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 49 273 15 245 295 81 27.0 2e-15 MDRRDFLKTVAITGAALSIQHSEAMEILTQTINNTNGGNPDLVAVMGGEPEAMFRRAISE LGGMKQFVKPGQKVVVKPNIGWDKVPELAGNTNPQLIAEIVKQCFAAGAKEVTVFDHTCD DWQKCYKNSGIEAAAKAAGAKVMPAHLESYYKPVNLPNGQKMKKAKIHEAILNCDVWINV PILKNHGGANLTISMKNHMGIVWDRGFFHQNDLQQCIADICTLQKKAVLNVVDAYRIMKT NGPRGRSASDVVLAKGLFISPDIVAVDTAAAKFFNQVREMPLDTVGHLAKGEALKIGTMN IDKLNVKRIKM >gi|222159261|gb|ACAB01000098.1| GENE 12 12417 - 12926 404 169 aa, chain - ## HITS:1 COG:no KEGG:BT_0777 NR:ns ## KEGG: BT_0777 # Name: not_defined # Def: hypothetical 
protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 169 3 171 171 243 87.0 2e-63 METSAKLYSLNYSNVRTYLFALLFVTGNIALPQICHLVPYGGPTLLPIYFFTLIAAYKYG FRVGLLTAILSPVINHLLFAMPSGAALPIILIKSALLAGTSALAARTLKSVSLWAILGVV LSYQIIGTAFEWAFIGNFHAAVQDFRIGIPGMLIQWFGGYALLKAIAKL Prediction of potential genes in microbial genomes Time: Wed May 18 03:25:53 2011 Seq name: gi|222159260|gb|ACAB01000099.1| Bacteroides sp. D1 cont1.99, whole genome shotgun sequence Length of sequence - 545 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 27 - 443 272 ## gi|237714305|ref|ZP_04544786.1| predicted protein - Prom 471 - 530 7.8 Predicted protein(s) >gi|222159260|gb|ACAB01000099.1| GENE 1 27 - 443 272 138 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237714305|ref|ZP_04544786.1| ## NR: gi|237714305|ref|ZP_04544786.1| predicted protein [Bacteroides sp. D1] # 1 138 1 138 138 215 100.0 7e-55 MEENQFIEVMRKHSDEKLLEILNVKRKDYVADAITAAEEVLIERGVSFTKIQNEKFEEKD NRPFIEKIRASSEKRLFGHIIDIAFITIITFLVIIIAMLIESTSMSELEIRLSYSAFYFL YFFGLETTNGERLLARDC Prediction of potential genes in microbial genomes Time: Wed May 18 03:26:02 2011 Seq name: gi|222159259|gb|ACAB01000100.1| Bacteroides sp. D1 cont1.100, whole genome shotgun sequence Length of sequence - 9749 bp Number of predicted genes - 10, with homology - 9 Number of transcription units - 6, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 263 - 793 193 ## gi|237714306|ref|ZP_04544787.1| predicted protein 2 1 Op 2 . - CDS 795 - 1838 512 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 3 1 Op 3 . - CDS 1868 - 2449 341 ## gi|237714308|ref|ZP_04544789.1| predicted protein 4 1 Op 4 . - CDS 2502 - 4058 793 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain 5 2 Tu 1 . 
- CDS 4669 - 5256 157 ## COG0582 Integrase - Prom 5341 - 5400 2.9 - Term 5353 - 5412 7.1 6 3 Op 1 . - CDS 5414 - 5686 147 ## BVU_3126 hypothetical protein - Term 5708 - 5739 0.1 7 3 Op 2 . - CDS 5765 - 5890 94 ## - Prom 5911 - 5970 3.8 - Term 5910 - 5957 5.0 8 4 Tu 1 . - CDS 5975 - 6991 328 ## BT_0948 hypothetical protein - Prom 7145 - 7204 6.4 + Prom 7489 - 7548 5.3 9 5 Tu 1 . + CDS 7590 - 8711 358 ## gi|237714313|ref|ZP_04544794.1| predicted protein + Prom 8937 - 8996 9.3 10 6 Tu 1 . + CDS 9028 - 9612 461 ## Fjoh_3635 hypothetical protein Predicted protein(s) >gi|222159259|gb|ACAB01000100.1| GENE 1 263 - 793 193 176 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237714306|ref|ZP_04544787.1| ## NR: gi|237714306|ref|ZP_04544787.1| predicted protein [Bacteroides sp. D1] # 1 176 1 176 176 332 100.0 8e-90 MTKMSQCSTCIGDGVHCEHYRPTDDSPCPNYIFDTSSTEQGTIKYEVKKALYRYIVLIII VLILLSFIGAVTFWDYVSIIPYLSFLVSLYLVIAHFDRILQLVKEKFWNRKIKHHETRNI EFATRNDQKQANHSEISKIVDTVMNTSNNRTRDLVINTLKEIGCQPEVDDDDDILF >gi|222159259|gb|ACAB01000100.1| GENE 2 795 - 1838 512 347 aa, chain - ## HITS:1 COG:alr0718 KEGG:ns NR:ns ## COG: alr0718 COG0768 # Protein_GI_number: 17228213 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Nostoc sp. PCC 7120 # 30 341 255 590 609 93 24.0 6e-19 MNKNLIKVGICFHCFLLSCQPKQRTEIISTIDSTLQVNATSILEEKLSEINAQSGQVIVM EVQSGQIKALVGLTKKDSTNYQSCENFSVWQSTGLMRPISLLAALETGKVKLSDKVDTGD GIYQVHGRELKDHNWHRGGYGELTVQEGLAVGSNIATYKTMEKAFGDNPQTFFDLLANMS YGKPDSINGIASLQPYKITEKNITWNCIGYEQLISPIQVLTFYNAIANNGKMIQPQLYKD SVVVINPQIASRASIDSLKKALVFNVTDGLGQPAKSDMITVAGQQGTIVVSTDNGNTVYS VEFCGYFPANNPKYSIIVSINKTGLPASGGLMAGDVFKKIIVNLKQN >gi|222159259|gb|ACAB01000100.1| GENE 3 1868 - 2449 341 193 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237714308|ref|ZP_04544789.1| ## NR: gi|237714308|ref|ZP_04544789.1| predicted protein [Bacteroides sp. 
D1] # 1 193 1 193 193 347 100.0 2e-94 MKKLRLFIIVVILFIVKSTYAQNEITGFLGVKFGDTNYVAIDILKKKFPQLEYNFPYINI PQISFLGTTFDSLVITFKEGKLVEGTFSLSETTSALLNEYPFRTQEQTQATIKNKQEYIV NKYTQLFNDIGNTFVIKYGKPQFPSEGTAIWRDNKLNSIKINSSIESRNGSNIIYFDGKL AITYQVGMSTDEF >gi|222159259|gb|ACAB01000100.1| GENE 4 2502 - 4058 793 518 aa, chain - ## HITS:1 COG:alr0702 KEGG:ns NR:ns ## COG: alr0702 COG0265 # Protein_GI_number: 17228197 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Nostoc sp. PCC 7120 # 51 210 147 310 429 90 35.0 8e-18 MKRQLLILLFYLGNTNLFAQDLTIKQLITKAENAIFTVFAADEQGNTFSQGSGFFISAQG VGITNYHVLDGAHSGYIKDKNGNKYKIKSILDYNPNTDLVKFQVENTNLKSFNYLHISYR TQVKGEQIINISSPLGLEQTVSTGIISSIRTDEMHGSILQITAPISHGSSGSPVMDMKGN VVGIATFCREGGQSLNFAVNATQISKLSHKRNISVSQMNTNPLETKMVKSANDAYFMGDT NKALSLLDNELKTNPSNHLAFYMKGMIEFSIKNVESALSNLIKACEIGKDISFYYKQLGK CYFQLYIYTHDTSYAEYALNAYSKGLQLTTEDAELFYHIGILFYQYALKSTENPYSDNNK QLYLKAQEALDYSIKIYPTAEAYTARADVKKMIRNYGSTILDCDKAIELAPDYYRGYFTR GDIKIFDIGSYEGIVDLERALFFVLDPKEKADILGLRATAYERKAFQELGANAGDLAAKA ILDYEEAYKLSNQPMYQEFKNKLIDKIKEYVQQRGSFP >gi|222159259|gb|ACAB01000100.1| GENE 5 4669 - 5256 157 195 aa, chain - ## HITS:1 COG:CAC1595 KEGG:ns NR:ns ## COG: CAC1595 COG0582 # Protein_GI_number: 15894873 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Clostridium acetobutylicum # 32 195 25 186 186 58 33.0 6e-09 MSLKYSSTTADYLQWNEAMNLIRKLARDSNYKMSLLIALGCFTGLRISDILALRWNQILD AEEFTITEIKTGKQRTIRINMQLQQHIRDCYEHISPVGINAPVLISQKGTVYTVQRINVM LKEIKKKYRLQIGNFSCHSLRKTFGRQVYNMNNDNSELALVKLMELFNHSSVSITKRYLG LRQEELLNTYDCLSF >gi|222159259|gb|ACAB01000100.1| GENE 6 5414 - 5686 147 90 aa, chain - ## HITS:1 COG:no KEGG:BVU_3126 NR:ns ## KEGG: BVU_3126 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 90 1 90 90 147 80.0 1e-34 MVIEGSREYKAAQELERALNDYSWNPKKFAESTRYYHRTLQQELMKTIVEIIRMVGNKNY 
GTDLRNQASHELCKRIVDSGVLDEAHLPFI >gi|222159259|gb|ACAB01000100.1| GENE 7 5765 - 5890 94 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLQALLVVISWIILLLVMCAISPIVFFLMIIWTIYKIITMK >gi|222159259|gb|ACAB01000100.1| GENE 8 5975 - 6991 328 338 aa, chain - ## HITS:1 COG:no KEGG:BT_0948 NR:ns ## KEGG: BT_0948 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 338 1 338 338 504 70.0 1e-141 MESTLQIMPVQRTSRNFGEYAEEAVIIEEPVKPKKVNHFIEANSVEVTLDHLKNENVIPV FSKDNELTISHPQFIETVWEAANSFYSGEQIEQPDIRCSHVIKGRKPEAVNKPKNLLTEA DTTQYYERCAFAIDIPSIYEDVSGNRLNLSIVGVRALNRENLATKKSPELFRLAVSFKNT VCCNMCVFTDGYKDDIKVMSTKELFRATLELLNNFNAAKNIHLLQTLGDSYLNEHQFVTL LGRMRLYQCLPQGYQKEIPRMLFTDTQVNNVARAYINDENFGSLGNDLSMWKLYNLLTGA NKSSYIDSFLDRAYNATELATGICSALHGDDKYQWFLS >gi|222159259|gb|ACAB01000100.1| GENE 9 7590 - 8711 358 373 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714313|ref|ZP_04544794.1| ## NR: gi|237714313|ref|ZP_04544794.1| predicted protein [Bacteroides sp. D1] # 1 373 1 373 373 705 100.0 0 MKKSTGRQYIELFYSLQIINDSLSLISLGKKYHVIPIAGQLRAILIKDKQTPVPLYYAIQ KILEVKQYIYLSTIPEKIKISKDCECYFNVMNVSLERDKLHYQKEDIGKWLQYCIVETPQ KSFTIEEVIKIVANKNGGAHYNEEISNDAVLLYTATDEKHISIIDKIIVNIALIIKALGL LLIKKAFDFHYLANIAIKFDELSSHKNIISYHDEDYYLPVAILLTSKRQLILKITDPDRR LFIVPLKENIEKKGIYTICFSYEINSNFESELKIYSLFDQTTKYVLTTPIYVHNHFTSFP HQWWGDEHIEMGFYNLQLYTSVLPEIIIIKKMKDMEVDENTPMVILKGRNYAYVDKKNNL CFGSIKCSTFNDL >gi|222159259|gb|ACAB01000100.1| GENE 10 9028 - 9612 461 194 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_3635 NR:ns ## KEGG: Fjoh_3635 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 1 194 1 198 198 188 46.0 8e-47 MKRNLFLLLFAIVTLSTFAQTTSEHLTFKGVPIDGTLTEFVSKLKQKGLTHIGTEDGTAI LKGDFAAYKNCTVAAIALKQKNLVAKVGVMFPSLETWSSLSNNYFSLKEMLTKKYGEPEV CIEEFQTKIMQNDDNSKIHEVRMNRCKYVTGYTTEKGDIELQIKGSFTDGCYVTLIYFDK INGEVIEAEAMEDL Prediction of potential genes in microbial genomes Time: Wed May 18 03:26:52 2011 Seq name: gi|222159258|gb|ACAB01000101.1| Bacteroides sp. 
D1 cont1.101, whole genome shotgun sequence Length of sequence - 38075 bp Number of predicted genes - 25, with homology - 25 Number of transcription units - 14, operones - 5 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 2088 - 2135 10.6 2 2 Op 1 . - CDS 2287 - 2460 140 ## gi|237714316|ref|ZP_04544797.1| predicted protein - Prom 2492 - 2551 1.9 3 2 Op 2 . - CDS 2699 - 4492 1069 ## PG0838 integrase - Prom 4585 - 4644 3.6 + Prom 4443 - 4502 5.3 4 3 Tu 1 . + CDS 4668 - 6146 1907 ## COG0516 IMP dehydrogenase/GMP reductase + Term 6171 - 6233 2.4 5 4 Op 1 . + CDS 6286 - 7833 1009 ## BT_3846 peptidyl-prolyl cis-trans isomerase 6 4 Op 2 . + CDS 7888 - 8733 917 ## BT_3847 hypothetical protein 7 4 Op 3 . + CDS 8743 - 10125 1568 ## COG0760 Parvulin-like peptidyl-prolyl isomerase 8 4 Op 4 . + CDS 10148 - 11914 1506 ## BT_3849 hypothetical protein 9 4 Op 5 . + CDS 11916 - 12215 362 ## BF4070 hypothetical protein 10 4 Op 6 . + CDS 12233 - 14134 1799 ## COG0323 DNA mismatch repair enzyme (predicted ATPase) + Prom 14138 - 14197 3.0 11 4 Op 7 . + CDS 14238 - 15425 1388 ## BT_3852 major outer membrane protein OmpA + Term 15491 - 15549 13.2 - Term 15477 - 15537 14.4 12 5 Tu 1 . - CDS 15594 - 16007 235 ## BT_4231 hypothetical protein - Prom 16027 - 16086 5.9 13 6 Op 1 . - CDS 16171 - 16872 711 ## gi|237714327|ref|ZP_04544808.1| conserved hypothetical protein 14 6 Op 2 . - CDS 16910 - 17368 385 ## gi|237714328|ref|ZP_04544809.1| conserved hypothetical protein - Prom 17393 - 17452 4.2 + Prom 17440 - 17499 10.8 15 7 Tu 1 . + CDS 17650 - 20073 1398 ## COG5373 Predicted membrane protein + Term 20100 - 20141 6.1 16 8 Tu 1 . - CDS 20168 - 21262 1287 ## COG0180 Tryptophanyl-tRNA synthetase - Prom 21291 - 21350 3.2 + Prom 21594 - 21653 6.6 17 9 Tu 1 . + CDS 21689 - 24910 3567 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) + Term 24938 - 24991 11.2 + Prom 25063 - 25122 7.3 18 10 Op 1 . 
+ CDS 25155 - 25928 647 ## COG3022 Uncharacterized protein conserved in bacteria 19 10 Op 2 . + CDS 25932 - 26486 540 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Term 26487 - 26534 1.5 + TRNA 26799 - 26873 90.6 # Val TAC 0 0 + TRNA 26908 - 26982 90.6 # Val TAC 0 0 20 11 Op 1 . + CDS 27125 - 28471 1367 ## COG0015 Adenylosuccinate lyase 21 11 Op 2 . + CDS 28546 - 30033 1947 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases + Prom 30035 - 30094 1.7 22 11 Op 3 . + CDS 30139 - 31542 1509 ## COG0017 Aspartyl/asparaginyl-tRNA synthetases + Term 31560 - 31608 10.7 + Prom 31587 - 31646 6.2 23 12 Tu 1 . + CDS 31682 - 32173 458 ## BT_3874 hypothetical protein + Term 32224 - 32260 -0.6 + Prom 32211 - 32270 5.7 24 13 Tu 1 . + CDS 32431 - 33639 1125 ## BT_2913 unsaturated glucuronylhydrolase + Prom 33643 - 33702 2.6 25 14 Tu 1 . + CDS 33835 - 37932 2894 ## COG0642 Signal transduction histidine kinase Predicted protein(s) >gi|222159258|gb|ACAB01000101.1| GENE 1 205 - 1965 824 586 aa, chain - ## HITS:1 COG:no KEGG:BT_2983 NR:ns ## KEGG: BT_2983 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 28 497 24 524 536 189 30.0 4e-46 MKQEIINGGNARYLGELERFKDGIPFGIVNKTKTDVGGTYVAANCSSNYIIVCPFKDLVD SIAADKNNRYEVFKCYGGIREYQFRKYIKNNTTYKIAVTYDSLPKLIGWLSGTEGWKVLI DEYHLILEDMDFRYDAINGLMEEIQKFKYYSFLSATPIDLDFEIDFLKRLPHYKVQWNGV TKITPIRYKVTQLTKGLARFIQIFLDEGISLPDINGNVSKVEELYIFINSVTSIKQIADT LKLNPDDVKICCADRIRNNKLLGEYQIESVSSPNKKINFFTKKCFQGCNLFTNNGLIIVA SDAYRTQTLVDISTTMEQIAGRIRINDEYQNIFRNVIVHLFSTNKNVMSDEEFEMVMQDK EKEADKLLSGWSKLDKEERQTYIKRMNLDTELVSIINGRMVYNNLKKQSFIYKQALRKTY KDGISIRDSFMQSEKFELTNQNKWKDFNIKLAKAMTVSYEQLLKDYLDSPSESYEQEYPE FPLIKRYLKESEMNTLRWNREKMLKAVEDKKQVDKVFLAIYQPGFISNQDLKGKLKDEFG RLGIKLSPKATLIENCTLYNVEKASRKIDGKTVSGYELGKMVFTFE >gi|222159258|gb|ACAB01000101.1| GENE 2 2287 - 2460 140 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237714316|ref|ZP_04544797.1| ## NR: 
gi|237714316|ref|ZP_04544797.1| predicted protein [Bacteroides sp. D1] # 1 57 1 57 57 96 100.0 5e-19 MKELTYADIRKIALEHGIKDTRLHVGLWATDRYIKKRKMVQGKTYTIYLPHHKSEQE >gi|222159258|gb|ACAB01000101.1| GENE 3 2699 - 4492 1069 597 aa, chain - ## HITS:1 COG:no KEGG:PG0838 NR:ns ## KEGG: PG0838 # Name: not_defined # Def: integrase # Organism: P.gingivalis # Pathway: not_defined # 230 431 224 427 432 79 33.0 4e-13 MERQIFINEMQARFNLRKPRSEKPTNLYLVCRINNKQVKLSTGVKIYPDHWNEKRQEAYI SVRLSELDNINNTIVNKKITKLKEYFIEFKHYLCMHPDEIGESMKLLKQHIYKDRMKKEL QKPATFIMKQIIEAKTCAESSKKQYRSNIDKFERFLKENEIPNTWESMNLDTINRYQKQI IKENPLHPHNTLRNIIKGTIFNLLGIADKRLDIPFKWSDSNLNSFEFVKDKSNKELADNK KVSLTEEQLNKFYKHIITGTERQIKKYTEIRDLFILQCLVGQRIGDMQKFFNGDNEMDEE AGTISIIQQKTKARAIIPLLPLAKEIISKYENKELLYYKERKSIVNEALKEVAEQAGLDE PITYEENGIKQTQPLYKLLHTHTARHTFITILCRKGIPKETVIIATGHEDTKMIDKVYSH LNSKDKAKKVSNAFKSLNNGIFNMGKVETYSLNEAKPMNDVTNNITFDTLLDTQFFASKI NKAAELQSQVGHLKNGKLCSYDNEITSLISEIEKFSQSSTSDSDVAKQYVKQLSVWKQSD LYDSFREMIIKCVKIGISKDAIMQFINKALEIGLLDKERFTNIKEITTALLDKRNQG >gi|222159258|gb|ACAB01000101.1| GENE 4 4668 - 6146 1907 492 aa, chain + ## HITS:1 COG:BH0020_3 KEGG:ns NR:ns ## COG: BH0020_3 COG0516 # Protein_GI_number: 15612583 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Bacillus halodurans # 207 489 1 281 282 346 61.0 6e-95 MSFIADKIVMDGLTYDDVLLIPAYSEVLPRTVDLSTKFSKNIELKIPFVTAAMDTVTEAK MAIAIAREGGIGVIHKNMSIEEQARQVAIVKRAENGMIYDPVTIKRGSTVQDALDIMAEY KIGGIPVVDDEGYLVGIVTNRDLRFERDMAKHIDLVMTPKERLVTTNQSTDLESAAQILQ KHKIEKLPIVGMDGKLIGLVTYKDITKAKDKPMACKDAKGRLRVAAGVGVTADTLDRMQA LVDAGADAIVIDTAHGHSMFVIEKLKEAKQRFPNIDIVVGNIATGEAAKALVEAGADAVK VGIGPGSICTTRVVAGVGVPQLSAVYDVAKALKGTGIPLIADGGLRYSGDVVKALAAGGY CVMIGSLVAGTEESPGDTIIFNGRKFKSYRGMGSLEAMENGSKDRYFQSGTADVKKLVPE GIAARVPYKGTLFEVVYQLSGGLRAGMGYCGAANIDKLHDAKFTRITNAGVMESHPHDVT ITSESPNYSRPE >gi|222159258|gb|ACAB01000101.1| GENE 5 6286 - 7833 1009 515 aa, chain + ## HITS:1 COG:no KEGG:BT_3846 NR:ns ## KEGG: BT_3846 # Name: not_defined # Def: 
peptidyl-prolyl cis-trans isomerase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 515 1 514 514 700 70.0 0 MMKRSLLLGCISLFVVAVFAQEDPVLMRVNGREVLRSEFENAYRRYAERSNARLSPKEYA ALFAQSKLKVEAARAAGLDTTTVFRKQHEKCWTELVESYLIDKQVMDSCARVLYQKMGLK ARSGRVQVMQIFKRLPQTVTSRHLEEEKARMDSIYRVIQNQPDLNFNRLVEIYSDDKQSR WIEGLETTSEFEDVAFSLAKGTVSQPFFTPEGIHILKVIDREEAFAYEDVSGRLIERLRR KEILDKGTAAMLDRLKRSWQYTPNQAAMEELLTKGRTEQNLFTIDGQAYTGAMFTQFASS HPQAAKRQLEGFIAKSLLDYESRNIDKKHPEIRIALRESDENYLVKEITRQKIELPAIND RAGLATYFKFHSSDYRWESPRYRGVVLHCVDKKTAKQAKKMLKKVPEKEWVDQLRQTFNT SGTEKIQVEQGIFADGDNKYIDKLVFKKGGFEPVMSYPFTIVVGEKMKGPDDYREVIEQV RKDYRSYLDTCWARELREFGKVEINQEVLKTVNNN >gi|222159258|gb|ACAB01000101.1| GENE 6 7888 - 8733 917 281 aa, chain + ## HITS:1 COG:no KEGG:BT_3847 NR:ns ## KEGG: BT_3847 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 281 14 281 281 461 92.0 1e-128 MRILVLLLITLLGCGACKEQRDHKGKTPLVEVDGNFLYKEDLMSVLPVGLSEDDSILFTE HYIRSWAEEILLYEKAANNIPDNVDVDKLVENYRKALIMHTYQQELINQKLSNDISEQEI AEYYGKNKELFKLETPLIKGLFIKVPLTAPQLNNVRRWYKSEKQDAVESLEKYSLQNAVK YEYFYDKWVSVTDVLDMIPLKVEAPEEYVDKHRQVELKDTAYYYFLNVSDYRGIGEEKPY EFARSEVKDLLVNQKRVSFMEQVKNDLYQQAVSKKKIIYNY >gi|222159258|gb|ACAB01000101.1| GENE 7 8743 - 10125 1568 460 aa, chain + ## HITS:1 COG:STM0092 KEGG:ns NR:ns ## COG: STM0092 COG0760 # Protein_GI_number: 16763482 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Parvulin-like peptidyl-prolyl isomerase # Organism: Salmonella typhimurium LT2 # 13 338 9 339 428 76 27.0 1e-13 MKKFVNFRFVVTLVLAIFANVATYAQDNVVDEVVWVVGDEAILKSDVEEARMDALYNGRR FDGDPYCVIPEEIAVQKLFLHQAKLDSIEVSEAEIIQRVDMMTNMYIQQIGSREKMEEYF NKTSTQIRETLRDNARDGLTVQKMQQKLVGEIKVTPAEVRRYFKDLPQDSIPYIPTQVEV QIITLQPKIPISEIEDVKKTLRDYTDRVTKGEIDFSTLARLYSEDKASAIKGGECGFMGR GMMDPAYANVAFSLQDPKKVSKIVESEFGFHIIQLIEKRGDRVNTRHILLRPKVSEKELT EACARLDSIADDIRANKFTFDDAAAVISQDKDTRNNHGIMVNINENSGVTTSKFQMQDLP QDVAKVVDKMNVGEISKAFTMINEKDGKEVCAIVKLKAKINGHKATIAEDYQDLKEIVMD 
KRREEMLQKWILDKQKHTYVRINENWQKCDFKYPGWVKKD >gi|222159258|gb|ACAB01000101.1| GENE 8 10148 - 11914 1506 588 aa, chain + ## HITS:1 COG:no KEGG:BT_3849 NR:ns ## KEGG: BT_3849 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 588 1 576 576 1041 87.0 0 MLKTNKKNKQQDRHRWLLIGFLCLFGVCLVAQVRPTGQKPVTGDSIKSAADTSKSASKAK TPAGNKKQSDNKKQPENKKTKVYLLHADEGQADKLARPDVQVLIGNVKLRHDSMYMYCDS ALIFEKTNSVEAFSNVRMEQGDTLFIYGDYLYYDGMTQIAQLRENVKMINRNTTLLTDSL NYDRLYDLGYYFEGGTLMDEENVLTSDWGEYSPATKQSVFNHDVKLVNPKFVLTSDTLRY NTESKIAVILGPSNIVSDNNHIYSERGFYNTMTEQAELLDRSVLTNQGKKLVGDSLFYDR IIGYGEAFDNVKMTDSINKNMLTGDYCFYNELTDSAFATKRAVAIDYSQGPDSLFMHGDT LQLVSYNLNTDSVFRLMKAYHKVRMYRTDVQGVCDSLVYNSKDSCMTMYTDPILWNEGQQ LLGEQIKIYMNDSTIDWAHIINQALTVEMKDSIHYNQVSGKEMKAYFINGDMRHIEVIGN VLTAFYPEEKDSTMTGFNCLEGSVLHLYMKDKKMEKGLFIGKSNGTMYPMDQIPPDKLRL PTFAWFDYVRPLNKDDIFNWRGKRAGDTLKPTTDRRPKTEKRNLINMK >gi|222159258|gb|ACAB01000101.1| GENE 9 11916 - 12215 362 99 aa, chain + ## HITS:1 COG:no KEGG:BF4070 NR:ns ## KEGG: BF4070 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 99 1 98 98 168 83.0 5e-41 MGMFNSMRKPRGFNHQYIYVDERKEKLAKMEENAKRDLGMLPEKEFNPEDIRGKFIEGTT HLKRRKASGRKPVSFGIILIIIAFLLYLWHYLATGSWNF >gi|222159258|gb|ACAB01000101.1| GENE 10 12233 - 14134 1799 633 aa, chain + ## HITS:1 COG:BS_mutL KEGG:ns NR:ns ## COG: BS_mutL COG0323 # Protein_GI_number: 16078768 # Func_class: L Replication, recombination and repair # Function: DNA mismatch repair enzyme (predicted ATPase) # Organism: Bacillus subtilis # 1 633 1 624 627 303 32.0 9e-82 MSDIIHLLPDSVANQIAAGEVIQRPASVIKELVENAIDADAQNIHVLVTDAGKTSIQVID DGKGMSETDARLSFERHATSKIRQAADLFALRTMGFRGEALASIAAVAQVELKTRPESEE LGTRLVIAGSQVESQEAVSCSKGSNFSVKNLFFNVPARRKFLKANSTELSNILAEFERIA LVHPEVAFSLYSNDSELFNLPVSQLRQRILAIFGKKLNQQLLNIEVNTTMVKISGYVAKP ETARKKGAHQYFFVNGRYMRHPYFHKAVMEAYEQLIPTGEQISYFIYFDVDPANIDVNIH PTKTEIKFENEQAIWQILSASVKESLGKFSAIPSIDFDTEDMPDIPAFEEKMSSEPPKIH 
YNTDYNPFKVSAGGGGGGSYSRSKVEWEDLYGGLTKASKMNNPQPEPEMDWEDSSIGGQP TFVEEKMETVTSAASSTLYASEPVIEKGNQHLQFKGRFILTSVKSGLMLIDQHRAHIRVL FDRYMVQIQQKQGVSQGVLFPEILQLPASEAAVLQSIMDDLSAVGFDLSNLGGGSYAING IPSGIEGLNPVELVRNMLHTAMEKGSDVKEEIQSILALTLARAAAIVYGQVLSNEEMVSL VDNLFACPSPNYTPDGKTVLTTIKEEDIERLFK >gi|222159258|gb|ACAB01000101.1| GENE 11 14238 - 15425 1388 395 aa, chain + ## HITS:1 COG:no KEGG:BT_3852 NR:ns ## KEGG: BT_3852 # Name: not_defined # Def: major outer membrane protein OmpA # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 395 1 394 394 703 92.0 0 MKKILMLLAFAGVASVASAQQTMTVTEYEVIQVQDKYQVITNPFWSNWFFSVGGGAQVLY GNNDHIGKFRDRVAPTFNVSVGKWVTPGFGLRLQYSGLQAKGFTTSETANYVVGGPREDG SYKQRWDYMNLHGDLMINLNALFGGYNPNRVYEIIPYIGAGWAHAYSRPHTNSATFNAGI INRFRLSNAVDLNLELSATGLEGKFDGEHGGRPDYDGILGATLGVTYYFPTRGFQRPTPQ IISELELNQMRNQMNAMAAANMQLQQQLANAQQPVEVEDTEEVVITDTNIAPRTVFFKIG SDKLSPQEEMNLSYLASKIKESPNATYTINGYADSATGTPAFNQKLSLERAQVVKDLLVK KYGISADRLKVAAGGGVDKFGQPILNRVVLVESAQ >gi|222159258|gb|ACAB01000101.1| GENE 12 15594 - 16007 235 137 aa, chain - ## HITS:1 COG:no KEGG:BT_4231 NR:ns ## KEGG: BT_4231 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 135 89 209 212 173 72.0 2e-42 MLKAIGLQIRLNREQISADTPRRNSKVKLKAIQFRSDKKLKQSVGYIKIKQMKRVKHSAK LSEIEIDMRLKEYFSDHQIMQRSDFQGITGMVRSTAMIHIRRLRQEGKPQNIGIPSQPIY VPAPGFYGKSRDYQPVK >gi|222159258|gb|ACAB01000101.1| GENE 13 16171 - 16872 711 233 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237714327|ref|ZP_04544808.1| ## NR: gi|237714327|ref|ZP_04544808.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 233 1 233 233 457 100.0 1e-127 MSIYTFIPVIFIVLYLGYWFYMKNKNSQQAQVVNNTDFKAEFANAEQYKKLCLNSDLSFL KEAMGEEKIDAFNYASNEYGVASALKDGMKDKLKGMATLGTVRFNTVQTPKYLVLSGDNL HLFDTDTDGEIDNHFVFNQARLENSRLIAIPMEGQVQAQAQARGNNVKAYKLSLQTDEKP VELIIYSCLIFTNIPEIPTDPQETIQDIIIGNDFLKQLGDKYPNLKVSLPIFS >gi|222159258|gb|ACAB01000101.1| GENE 14 16910 - 17368 385 152 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237714328|ref|ZP_04544809.1| ## NR: gi|237714328|ref|ZP_04544809.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 152 1 152 152 303 100.0 2e-81 MKHLKFLFALGGILFLSCQTVSARGLKIPFGDREVLTKVADLPDTEEYQTDDGNYIDLAT FHQEFNIAYLLPLYIEKEPRLVGYCEKEDTYYELTEEQLATILKENNLDGEKLNKVSFYS RYGGKAVGLLIIALIIWGCIPGKKKEVKPVEV >gi|222159258|gb|ACAB01000101.1| GENE 15 17650 - 20073 1398 807 aa, chain + ## HITS:1 COG:RSc0786 KEGG:ns NR:ns ## COG: RSc0786 COG5373 # Protein_GI_number: 17545505 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Ralstonia solanacearum # 139 496 136 487 938 142 28.0 3e-33 MDDFNTVYALLVLVMLFILLSRLDNRFGKIEKELNEIKKRMEDYLKVQQNSVAEASKPKE EISEKETKDIPVMPQAVEHVKAVKTGQREEAVQEKHEPVVEAVVRETLEKTPEKTLETRM EEVCVEAVRKESAESSTVPVVPVTPATPVVPVAPTVPKQKKQVNYEKFIGENLFGKIGIL IFVIGVGFFVKYAIDKNWINETFRTVLGFLTGAVLLAVAERLQKKYRTFSSLLAGGAFAV FYLTVAIAFHYYHIFSQTMAFIILIGVTVFMSILSVVYNRRELAIISLVGGFLAPFIVSS GEGSYLVLFTYVSILNLGMFGLSIYKKWSELPMISFVFTCLIMGIFLLFNYTSSSTVISS HLFWFATLFYFIFLLPVFSILRGENMRTMSRGLVFVIITNNFIYLLSGALFLRNMGLSFK ASGLLSLFIALVNLGLVLWLWKNRKEYKFLVHTTLGLVLTFVSITIPIQLDGNYITLLWA SEMVLLLWLYVKSKLRVYEYAAKVLVGLTFVSYLMDVYSVMFEHHSLDTIFLNSSFATSL FVGLATGAFALLMEYYHSFFSTARRLKYSFWNPFMLIVSVIILYYTFMMEFNLYFEGATR SGAMFLFTAISISSVCYAFRKRFPITKHLTSYILTIGANVLVYIINIWGDQRIWTSPPVV LPWLTAVFVIANLYYVARMYYTSIGIKSRFTVYLNILATLLWLTMVRSFLWQVGVDDFSA GLSLSLSIAGFVQMGLGMRLHQKLLRMVSLATFGLVLLKLVFDDLWAMPTIGKIIVFIIL GLILLILSFLYQKLKDVLFKNDEEETN >gi|222159258|gb|ACAB01000101.1| GENE 16 20168 - 21262 1287 364 aa, chain - ## HITS:1 COG:L0358 KEGG:ns NR:ns ## COG: L0358 COG0180 # Protein_GI_number: 15672048 # Func_class: J 
Translation, ribosomal structure and biogenesis # Function: Tryptophanyl-tRNA synthetase # Organism: Lactococcus lactis # 7 348 6 340 341 416 61.0 1e-116 MGKEKIILTGDRPTGRLHIGHYVGSLKRRVDLQNAGDYSKMFIFIADSQALTDNIDNPEK VRQNVIEVALDYLACGIDPTKATIFIQSQIPELCELSFYYMDLVSVSRLQRNPTVKSEIQ MRNFEASIPVGFFTYPISQAADITAFRATTVPVGEDQEPMLEQAREIVRRFNYIYGETLV EPEILLPDNAACLRLPGTDGKAKMSKSLGNCIYLSEEPEEIQKKIMSMYTDPGHLRVQDP GKIEGNTVFTYLDAFCLPEHFERYLPDYPNLAELKAHYQRGGLGDVKVKRFLNSIMQETL EPIRNRRKEFSKDIPAIYEMLQQGCEVARAAAAETLADVKKAMKINYFDDKELIEEQVKR FSQE >gi|222159258|gb|ACAB01000101.1| GENE 17 21689 - 24910 3567 1073 aa, chain + ## HITS:1 COG:YJL130c_2 KEGG:ns NR:ns ## COG: YJL130c_2 COG0458 # Protein_GI_number: 6322331 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Saccharomyces cerevisiae # 4 1055 4 1051 1070 1150 54.0 0 MEKEIKKVLVLGSGALKIGQAGEFDYSGSQALKALKEEGISSVLVNPNIATIQTSEGIAD KVYFLPVNTYFVEEIIKKERPDGILLAFGGQTALNCGAELYTQGVLDKYGVKVLGTSVEA IMYTEDRDLFVKKLDEIEMKTPVSQAVENMEDAIAAARRIGYPVMVRSAYALGGLGSGIC ANEEEFLKLAESSFAFSKQILVEESLKGWKEIEFEVIRDANDHCFTVASMENFDPLGIHT GESIVVAPTCSLDDKELALLKELSTKCIRHLGIVGECNIQYAFNSETDDYRVIEVNARLS RSSALASKATGYPLAFVAAKVALGYTLDQIGEMGTPNSAYVAPQLDYYICKIPRWDLTKF AGVSREIGSSMKSVGEIMSIGRSFEEIIQKGLRMIGQGMHGFVGNDELHFDDLDKELSRP TDLRVFAIAQAMEEGYTIERIHDLTKIDPWFLGKLKNIVDYKAKLSTYNKVEDIPADVMR EAKVLGFSDFQIARFVLNPTGNMEKENLAVRAHRKSMGILPAVKRINTVASEHPELTNYL YMTYAVEGYDVNYYKNEKSVVVLGSGAYRIGSSVEFDWCSVNAVQTARKLGYKSIMINYN PETVSTDYDMCDRLYFDELSFERVLDVIDLEQPRGVIVSVGGQIPNNLAMKLYRQSVPVL GTSPISIDRAENRNKFSAMLDQLGIDQPAWMELTSLEEVKGFVEKVGYPVLVRPSYVLSG AAMNVCYDDEELENFLKMAAEVSKEYPVVVSQFLENTKEIEFDAVAQNGEVVEYAISEHV EFAGVHSGDATLVFPAQKIYFATARRIKKISRQIAKELNISGPFNIQFLARNNEVKVIEC NLRASRSFPFVSKVLKRNFIETATRIMLDAPYSRPDKSAFDIDWIGVKASQFSFSRLHKA DPVLGVDMSSTGEVGCIGDDFSEALLNSMIATGFKIPEKAVMFSSGAMKSKVDLLDASRM LFAKGYQIYATAGTAAFLNAHGVETTPVYWPDEKPGAENNVMKMIADHKFDLIVNIPKNH 
SKRELTNGYRIRRGAIDHNIPLITNARLASAFIEAFCELKLNDIQIKSWQEYK >gi|222159258|gb|ACAB01000101.1| GENE 18 25155 - 25928 647 257 aa, chain + ## HITS:1 COG:PA3539 KEGG:ns NR:ns ## COG: PA3539 COG3022 # Protein_GI_number: 15598735 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 1 251 1 252 259 151 34.0 1e-36 MLVLLSCAKTMSETSKVKVPLKTVPQFQKEASGIALQMSQFSVDELERLLRINARMAVEN YKRYQAFHAEDTSELPALLAYTGIVFKRLNAKDFSKEEFEYAQEHLRLTSFCYGLLRPLD VIRSYRLEGDVVLPEPGNQTMFSYWKSRLTDVFIEDIKKAGGILCNLASDEMKSLFDWKR VEREVRVVTPEFQVWKNGKLASIVIYIKMSRGEMTRFILKNRIENPEDLKSFSWEGFEFN ESLSDEKKFVFTNGKEI >gi|222159258|gb|ACAB01000101.1| GENE 19 25932 - 26486 540 184 aa, chain + ## HITS:1 COG:all1011 KEGG:ns NR:ns ## COG: all1011 COG0110 # Protein_GI_number: 17228506 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Nostoc sp. PCC 7120 # 2 180 10 189 192 162 46.0 4e-40 MTEVEKMRSSQLADMAAPELQVRFEHAKKLLARMRGMSTYDEGYRELLEELVPEIPETSI ICPPFHCDHGDGIKLGEHVFVNANCTFLDGGYITIGAHTLVGPCVQIYTPHHPMDYLERR GSKEYAYPVTIGEDCWIGGGAIICPGVTIGNRCVIGAGSVVTKDIPDDSVAVGNPARVVR KQVV >gi|222159258|gb|ACAB01000101.1| GENE 20 27125 - 28471 1367 448 aa, chain + ## HITS:1 COG:PA2629 KEGG:ns NR:ns ## COG: PA2629 COG0015 # Protein_GI_number: 15597825 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate lyase # Organism: Pseudomonas aeruginosa # 1 447 1 447 456 454 50.0 1e-127 MELDVLTAISPIDGRYRGKTKALAAYFSEFALIKYRVQVEVEYFITLCELPLPQLKGIDS SVFETLRNIYRNFSEADAQRIKDIESVTNHDVKAVEYFLKEEFDKMGGMDDYKEFIHFGL TSQDINNTSVPLSIKEALEQVYYPLIEELIAQLNTYATEWANIPMLAKTHGQPASPTRLG KEVMVFVYRLERQLAMLKACPLTAKFGGATGNYNAHHVAYPQYDWKQFGNRFVAEKLGLE REEYTTQISNYDNLSAVFDAMKRINTIMVDMNRDFWQYISMEYFKQKIKAGEVGSSAMPH KVNPIDFENAEGNLGIATSILEHLAVKLPVSRLQRDLTDSTVLRNVGVPFGHIVIAIQSS LKGLRKLLLNEPAIYRDLDNCWSVVAEAIQTILRREAYPHPYEALKALTRTNQAITESSI KEFIEELNVSEDIKKELRAITPHTYTGL >gi|222159258|gb|ACAB01000101.1| GENE 21 28546 - 30033 
1947 495 aa, chain + ## HITS:1 COG:SPy0369 KEGG:ns NR:ns ## COG: SPy0369 COG1187 # Protein_GI_number: 15674518 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Streptococcus pyogenes M1 GAS # 260 489 1 232 240 167 39.0 3e-41 MSTENEEWRENSFNEENTGAGRDGNRSYNREGGERPYRPSYNREGGDRPYRPRFNANNEG GERPQRSYGDRSYGDRPQRPSYNREGGDRPYRPRFNNNNEGGERPQRPYNREGGSYDRPQ RPSYNREGGDRPYRPRFNSGEGGERPSYGDRPQRPSYNREGGDRPYRPRFNNGEGGDRPQ RPSYNREGGDRPYRPRFNNGEGGGYRSNNGGGGYRPRYNNDRQGGYRPRPRTGDYDPNAK YSVKKQIEYKEQFVDPNEPIRLNKFLANAGVCSRREADEFITAGVVSVNGEVVTELGTKI KRSDVVKFHDEPVSIERKVYVLLNKPKDTVTTSDDPQERRTVMDLVKGACNERIYPVGRL DRNTTGVLLLTNDGDLASKLTHPKFLKKKIYHVHLDKNLTKADMEQIAAGIQLEDGEIHA DAISYTDDFKKDQVGIEIHSGKNRIVRRIFESLGYKVVKLDRVFFAGLTKKGLRRGDWRY LSEAEVNYLRMGSFE >gi|222159258|gb|ACAB01000101.1| GENE 22 30139 - 31542 1509 467 aa, chain + ## HITS:1 COG:sll0495 KEGG:ns NR:ns ## COG: sll0495 COG0017 # Protein_GI_number: 16332045 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl/asparaginyl-tRNA synthetases # Organism: Synechocystis # 4 467 52 513 513 557 55.0 1e-158 MEKIGRTKIVDLLKRTDIGAMVNVKGWVRTRRGSKQVNFIALNDGSTINNLQIVVDLANF DEEMLKLITTGACISVNGEMVESVGSGQKVEVQAREIEVLGTCDNTYPLQKKGHSMEFLR EIAHLRPRTNTFGAVFRIRHNMAIAIHKFFHEKGFFYFHTPIITASDCEGAGQMFQVTTM NLYDLKKDERGSISYEDDFFGKQASLTVSGQLEGELAATALGAIYTFGPTFRAENSNTPR HLAEFWMIEPEVAFNDITDNMDLAEEFIKYCVKWALDNCADDVKFLNDMFDKGLIERLQG VLKDDFVRLPYTDGIKILEEAVAKGHKFEFPVYWGVDLASEHERFLVEEHFKRPVILTDY PKEIKAFYMKQNEDGKTVRAMDVLFPKIGEIIGGSEREADYNKLMTRIEEMHIPMKDMWW YLDTRKFGTCPHSGFGLGFERLLLFVTGMANIRDVIPFPRTPRNADF >gi|222159258|gb|ACAB01000101.1| GENE 23 31682 - 32173 458 163 aa, chain + ## HITS:1 COG:no KEGG:BT_3874 NR:ns ## KEGG: BT_3874 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 163 1 163 163 319 96.0 3e-86 MKKLYFFTMLSIMLLAVTGATAQKKTKFKAADLKGIWQLCHYVSESPDVPGALKPSNTFK 
VLSDDGQIVNFTIIPGADAIITGYGTYKQLTDDSYKESIEKNIHLPMLDNQDNILEFEIK DNDYLHLKYFIKNDLNGNELNTWYYETWKRVEMPAKFPEDIVR >gi|222159258|gb|ACAB01000101.1| GENE 24 32431 - 33639 1125 402 aa, chain + ## HITS:1 COG:no KEGG:BT_2913 NR:ns ## KEGG: BT_2913 # Name: not_defined # Def: unsaturated glucuronylhydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 396 8 400 402 385 51.0 1e-105 MKNKLLLAVGSMALLTACDASKGNEMAWFDHAVKTSGHQLLYMAEQLKNEPDTACFPRSI KEGKYRLEHPTDWTSGFYPGSMWLAYELTGDEALAKEARKYTDRLQDMQYYTGNHDLGFM MFCSYGQGIRLKPEPTDSLILIHSSESLCSRFRPEVGLIRSWDFGEWSYPVIIDNMMNLE MLFWASEQTNNPKYREIAISHADKTLKNHFREDMTSYHVVSYLADSGEVESKGTFQGYAD SSAWARGQAWGVYGYTMCYRFTKQQSYLDAAHKIARFIIDHRPSENDYVPYWDYDAPNIP NEPRDASAAAVTASALLELSGYGDKKQGEEYFRYAEHILKQLSSDDYLAKEGENHGFILL HSVGSFPHDSEIDTPLNYADYYYLEALKRYKDLKEKSENPSY >gi|222159258|gb|ACAB01000101.1| GENE 25 33835 - 37932 2894 1365 aa, chain + ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 812 1236 274 668 676 153 27.0 2e-36 MKDRHLLFRYLFGLTVCILFFPLAVSAADSFIHFKRLSVDDGLSQNTVLALTQDHNNKLW VGTIDGLNWYEGSRFVSYYKAPDDTTSLANNHVYSLHTDSKGTVWVGTQVGLSRYNIVGN NFTNYSSPDNQPMQVLAIGEPEEGDRLLLATNIGLLVFDKKTGRMKVQPELAGKTIYSVC RMNDGFLLGTSEGVYFYYVRNENVTRLLLQLKGETISDMLYDDKTGNCWLASLTNGVYCV DNNFQIKHHYNKQNTPAYFLTNSVRTLSGDDKGRVWIGTMEGLLILEPETGTFRICRFSP EDPTTLGHNSIRSILKDNQGGMWIGTFYGGLNYYHPLAPSFGRLQHSAWRNSLSDNTVSC IAEDPDNGNLWIGTNDGGLNYYNRKTGVFSYYRTGTSTNALKSDNIKCIWTDKDGSVYIG THGGGMSRLYHRSGRIETYSFPHSTSLTNSCYSLLDGTDGTLWVGGMSGLYLFDKQTGEL SQHPLAKKHKKLENVLIYTLFRDSKGRIWIGTEESLFLYAGGKLEELHLSSSAYLHGLIQ AFCVQEDSRHEIWIGSSTGLYCYKEGAPTAWKHYTMKDGLPNDFIYNILEDERGRLWLTT NKGLACFNTEEGTFLNYTKQDGLPHDQFNYFGACKAHDGTFFLGSLGGVAYFKPYELGDN PYSPDAVVTGAVVLNQVITDMKSERVRYYQDEQGRMLGMSFPSDQKLFNIRFAVINYLAG KRNQFVYKLEGFETEWNYSRHVSFARASYSNLPPGEYVFKVKACNNNGKWSEATTEFFVH IIPMWYQTWWAKTLFIFFSVGLLVFVIYFFIARAKMKMQVRIEQIERNKIEEISQEKVRF 
YMNMSHELRTPLSLILAPLEELLGQSNLKGTPVQQKLDYVYKNGRKLLHIVNQLLDFRKA EAGAMPIHVAQVDVEELLQDAFALFKENAQKRAISFHIKSDLEGRLFPADRTYVETILMN LLSNAFKFTPDGGSISLSLWTEGDTYGFTVRDSGIGMSPEQLTHIFERFYQVDGQRKGTG IGLSLVKMLVEKHHGTITVASEPAQYTEFKVTLPADMAAFTEKERELPAHEVETSASLRE LPVADEYFSGDASAIVAEELSDGDQIEAGSEEERPTILLVDDNKEMVDYLKDNFRQNYVT LTAGNGEEALAIMKEHRVDIVLSDVMMPGIDGIKLCQLIKRNLQTCHIPVLLLSAKGSVD AQTEGIQAGADDYIAKPFSIHLLKGKIANQLKSRQRLKHYYSNTIDIDTAKMTSNNLDEE FMSKAIQVVEENISNEDFTSDELASQLCMSRSSLYLKMNSISGEPPANFIRRIRFNKACK LLLEGRYSISEISGMVGFGSSSYFSTSFKKYVGCLPSEYVKQHTK Prediction of potential genes in microbial genomes Time: Wed May 18 03:28:20 2011 Seq name: gi|222159257|gb|ACAB01000102.1| Bacteroides sp. D1 cont1.102, whole genome shotgun sequence Length of sequence - 64598 bp Number of predicted genes - 58, with homology - 57 Number of transcription units - 25, operones - 15 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 590 - 649 2.1 1 1 Op 1 . + CDS 673 - 3276 1967 ## BT_4652 hypothetical protein 2 1 Op 2 . + CDS 3303 - 4040 574 ## BDI_3526 hypothetical protein 3 1 Op 3 . + CDS 4078 - 5088 1141 ## BT_3148 hypothetical protein 4 1 Op 4 . + CDS 5137 - 6690 1364 ## gi|237714346|ref|ZP_04544827.1| predicted protein + Term 6727 - 6769 4.3 5 2 Op 1 . + CDS 6821 - 9367 1816 ## BT_4652 hypothetical protein 6 2 Op 2 . + CDS 9411 - 12623 2970 ## Phep_2829 TonB-dependent receptor plug 7 2 Op 3 . + CDS 12645 - 14339 1578 ## Phep_2828 RagB/SusD domain protein + Prom 14387 - 14446 5.0 8 3 Op 1 59/0.000 + CDS 14542 - 15003 794 ## PROTEIN SUPPORTED gi|237714350|ref|ZP_04544831.1| 50S ribosomal protein L13 9 3 Op 2 . + CDS 15010 - 15396 640 ## PROTEIN SUPPORTED gi|160883130|ref|ZP_02064133.1| hypothetical protein BACOVA_01099 10 4 Op 1 38/0.000 + CDS 15518 - 16354 1397 ## PROTEIN SUPPORTED gi|160883131|ref|ZP_02064134.1| hypothetical protein BACOVA_01100 + Term 16367 - 16440 7.1 + Prom 16397 - 16456 3.1 11 4 Op 2 . 
+ CDS 16476 - 17468 1409 ## COG0264 Translation elongation factor Ts + Term 17566 - 17606 7.5 - Term 17554 - 17594 4.5 12 5 Op 1 . - CDS 17673 - 18182 467 ## COG0526 Thiol-disulfide isomerase and thioredoxins 13 5 Op 2 . - CDS 18219 - 18569 356 ## COG0023 Translation initiation factor 1 (eIF-1/SUI1) and related proteins - Prom 18691 - 18750 5.2 + Prom 18493 - 18552 5.5 14 6 Op 1 . + CDS 18735 - 19385 673 ## COG2344 AT-rich DNA-binding protein 15 6 Op 2 . + CDS 19385 - 19999 648 ## COG0179 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) 16 6 Op 3 . + CDS 20038 - 20520 463 ## COG0245 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase 17 6 Op 4 . + CDS 20539 - 21546 906 ## COG1409 Predicted phosphohydrolases 18 7 Tu 1 . - CDS 21536 - 22399 379 ## BT_3886 hypothetical protein - Prom 22420 - 22479 4.4 + Prom 22371 - 22430 5.3 19 8 Op 1 . + CDS 22510 - 23595 967 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain 20 8 Op 2 . + CDS 23620 - 24975 933 ## COG1404 Subtilisin-like serine proteases 21 8 Op 3 . + CDS 24988 - 26220 891 ## COG1570 Exonuclease VII, large subunit 22 8 Op 4 . + CDS 26281 - 26490 243 ## BT_3891 hypothetical protein 23 8 Op 5 . + CDS 26542 - 27561 1231 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase 24 8 Op 6 . + CDS 27607 - 28230 607 ## COG5523 Predicted integral membrane protein + Term 28255 - 28305 11.2 + Prom 28260 - 28319 3.6 25 9 Op 1 . + CDS 28343 - 29101 724 ## COG0220 Predicted S-adenosylmethionine-dependent methyltransferase + Prom 29112 - 29171 5.3 26 9 Op 2 . + CDS 29196 - 30296 1142 ## COG0489 ATPases involved in chromosome partitioning + Term 30344 - 30402 19.3 - Term 30544 - 30588 8.4 27 10 Tu 1 . - CDS 30628 - 32541 1218 ## BT_3897 putative thiol:disulfide interchange protein DsbE - Prom 32575 - 32634 7.5 + Prom 32353 - 32412 4.3 28 11 Tu 1 . 
+ CDS 32549 - 32740 66 ## - Term 32655 - 32707 10.2 29 12 Op 1 . - CDS 32732 - 34549 1123 ## BT_3898 TonB 30 12 Op 2 . - CDS 34568 - 34933 318 ## BT_3899 transcriptional regulator - Prom 35064 - 35123 7.0 + Prom 35096 - 35155 10.0 31 13 Tu 1 . + CDS 35332 - 35901 521 ## BT_0646 hypothetical protein + Term 35930 - 35984 10.4 - Term 35978 - 36019 5.0 32 14 Op 1 22/0.000 - CDS 36049 - 37308 962 ## COG0842 ABC-type multidrug transport system, permease component 33 14 Op 2 9/0.000 - CDS 37309 - 38490 496 ## COG0842 ABC-type multidrug transport system, permease component 34 14 Op 3 13/0.000 - CDS 38493 - 39485 1176 ## COG0845 Membrane-fusion protein - Prom 39509 - 39568 2.2 35 14 Op 4 . - CDS 39592 - 41055 1379 ## COG1538 Outer membrane protein - Prom 41226 - 41285 4.7 - Term 41252 - 41299 9.1 36 15 Tu 1 . - CDS 41325 - 42266 1244 ## COG0039 Malate/lactate dehydrogenases - Prom 42354 - 42413 7.3 + Prom 43042 - 43101 6.4 37 16 Tu 1 . + CDS 43152 - 44024 949 ## BT_3912 hypothetical protein + Term 44216 - 44248 -0.8 38 17 Tu 1 . - CDS 44658 - 45563 474 ## COG0061 Predicted sugar kinase - Prom 45653 - 45712 6.8 + Prom 45471 - 45530 5.1 39 18 Op 1 . + CDS 45676 - 46392 651 ## COG0854 Pyridoxal phosphate biosynthesis protein 40 18 Op 2 . + CDS 46389 - 47105 902 ## COG0811 Biopolymer transport proteins 41 18 Op 3 . + CDS 47113 - 47529 304 ## BF3738 putative tansport related protein 42 18 Op 4 . + CDS 47536 - 48411 884 ## BT_3921 hypothetical protein 43 18 Op 5 . + CDS 48439 - 48990 733 ## COG0693 Putative intracellular protease/amidase 44 18 Op 6 . + CDS 48998 - 49657 296 ## PROTEIN SUPPORTED gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 45 18 Op 7 . + CDS 49658 - 51754 1487 ## COG1200 RecG-like helicase + Term 51782 - 51822 2.6 + Prom 51871 - 51930 6.5 46 19 Op 1 . + CDS 51978 - 52442 371 ## COG0105 Nucleoside diphosphate kinase 47 19 Op 2 . 
+ CDS 52463 - 53335 698 ## COG0739 Membrane proteins related to metalloendopeptidases + Term 53374 - 53421 11.0 + Prom 53342 - 53401 6.0 48 20 Op 1 . + CDS 53437 - 54000 585 ## BT_3927 hypothetical protein 49 20 Op 2 . + CDS 53990 - 55348 826 ## BF3957 hypothetical protein + Prom 55376 - 55435 7.6 50 21 Tu 1 . + CDS 55456 - 56211 1027 ## COG0149 Triosephosphate isomerase + Term 56235 - 56304 4.7 + Prom 56271 - 56330 4.9 51 22 Op 1 . + CDS 56366 - 56809 442 ## BT_3930 hypothetical protein 52 22 Op 2 . + CDS 56817 - 57407 688 ## COG0302 GTP cyclohydrolase I + Term 57557 - 57595 -0.8 - Term 57209 - 57246 0.2 53 23 Tu 1 . - CDS 57437 - 59518 1586 ## COG0358 DNA primase (bacterial type) - Prom 59545 - 59604 4.4 - Term 59556 - 59600 9.1 54 24 Op 1 . - CDS 59625 - 60398 778 ## BT_3933 chorismate mutase/prephenate dehydratase (TyrA) 55 24 Op 2 . - CDS 60456 - 61517 1149 ## COG2876 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase 56 24 Op 3 . - CDS 61605 - 62783 1224 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 57 24 Op 4 . - CDS 62758 - 63606 743 ## COG0077 Prephenate dehydratase - Prom 63739 - 63798 5.0 - Term 64021 - 64045 -1.0 58 25 Tu 1 . 
- CDS 64084 - 64530 164 ## CHU_1441 transposase Predicted protein(s) >gi|222159257|gb|ACAB01000102.1| GENE 1 673 - 3276 1967 867 aa, chain + ## HITS:1 COG:no KEGG:BT_4652 NR:ns ## KEGG: BT_4652 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 867 1 872 872 806 47.0 0 MKKVLLISFVILSVAAQMTHAQKQAVIKLTETTLMHEMRATPYPLDKAVVNDRAVSFQWP LRSDMNSQDSPLDGFEHKVKKVDKTKITYRLRYSQDAGLKSGVVQVETRWPFYNPEQPLA PGVWYWQFGYVENGQVTWGSTQQVTVEDRSGKFCPPSLKTVLAKLPADHPRVWIMKNEWK DFINHSKQKAERQWYLERADQVLQTPMKSVKDINVSQVKNLKNEMQINSYLTRESRRIID AEEGNTEALIRAWLLTQDTKYADEAIKRVFIMADWDKDKNVKGDFNASSLLSLCSMAYDS FYDRLNTSQKKALLEAIKNKGGEMYENFNNRMENHIADNHVWQMTLRILTMAAFSVYGDL PEADTWVDYCYNVWLARFPGLNKDGGWHNGDSYFTVNTRTLVEVPYYYSKLTGYDFFSDP WYQGNIMYTIFQQPPFSKSGGNGSSHQNVARPNSIRIGYLDALARLTGNTYAADFVRRTL KVEPDYMKKALLSKPGDLAWFRLQCDKPLPEGEGLTALPAGYVFPATGLASFQTNWDRVG GNAMWSFRSSPYGSTSHALANQNAFNTFYGGKPLFYSSGHHIEFTDVHSMLCHRATRAHN TILVNGMGQRIGTEGYGWIPRYYASEKIGYVLGDASNAYGKVISPLWLTRGEQSEVHYTP ENGWDENHVKTFRRHIVNLGKTGLIFIYDELVADEPVNWSYLLHTTENPMTVDQSNHRFV HIQATNRGGASDAYLFSTGTLQTDTTSRFFYPAVNWLRADDKGVFKKYPNHWHFTATSEK AQVYRFATVINTHALKYPAKDPEILSDGRIKVGGWLISVNLKSDGAPSFFIRSTQEKVNI TYKGEATVINEDGYETVMRDTVPELEI >gi|222159257|gb|ACAB01000102.1| GENE 2 3303 - 4040 574 245 aa, chain + ## HITS:1 COG:no KEGG:BDI_3526 NR:ns ## KEGG: BDI_3526 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 83 243 19 182 185 112 38.0 1e-23 MRHSTFNMRTICLLAALSGITVLSAQTPGKPRVQTSAAPEETTVYKPGGKKQETAVATSS KKEEDKRQSGKENAGKKAAAFDGQRYLALKTNVIYDACALLNLAVEMQVHKKITVELPLT CSLWDLGDKHGVRTVALQPEARWWIGNETGRGHFVGLHAHVAWFNVKWNDDRYQDTDRPL LGAGISYGYKLPLSRHWGAEFNLGVGYANMKYNTYYNVDNGAQLDTRVRHYWGITRVGAS LVYRF >gi|222159257|gb|ACAB01000102.1| GENE 3 4078 - 5088 1141 336 aa, chain + ## HITS:1 COG:no KEGG:BT_3148 NR:ns ## KEGG: BT_3148 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 105 327 202 424 430 81 29.0 5e-14 
MRRNLYILAQCTFLLFITLLVNGCSLHEEPEMTPEGELGVDPTAVTLNLNLAMNLSLAER APVTITRASETNYLRRFIVEAYLDRQVAARQTVYEEDFNRASLSVSMKLHARNYRILVWA DYVNAETPEQGLVYDAKNLAFILPAGKYIGNSRYKDVFAASAMADLTSFRNHWGAETSLD VELYRPVARYELVAKDVATFLNKLSTGGLKGESFTARVKYSDYLPTGYNLWDDVPKNSLM YMEYKVAFERPADGTKELKLGFDYVLTDAGETVSIPVELEILNEKNEVLARTAFRIPCER GKNTTVRGNFLTSDANGGIGIDPDYDGDLEVDLGEL >gi|222159257|gb|ACAB01000102.1| GENE 4 5137 - 6690 1364 517 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714346|ref|ZP_04544827.1| ## NR: gi|237714346|ref|ZP_04544827.1| predicted protein [Bacteroides sp. D1] # 1 517 1 517 517 987 100.0 0 MKKNLFYSLMALFVVLFASCGQEEIVSDNGTEINAPVTISVQAPVNNVFSRAVTIPDGYT MQCIMQLLNADGNKIGDQVTKPITDGKVSFTISVDEQKEVSKALFWAEYVPESGAANKVY NTADLRAVGYNTVSFDLTNDALMAASDAFCGKLETIGNASVTLKRPFANVSVKPKNPEVA AAANKLEITYNALSGYDILEGKCTATTPVTYTNASFASADGNWFANFFFAPSNVGKFTEE ITMALSGGYSKEIKIPANTLPLDANMQIMAKFEIGDGNFDIEVGVDPDYEALEMKVGSYI NAEGKVVRDAADAVGIVFKMEAIGDDVPANYPVALQSKTIVGYAVAIENVATGRQVLNNA AMSSLVTTDATVTNGTQATEALLTGIGEVAFMTTYNSWVNEHPLDGESLSSWYIPTVDQL GEFVGMLFKIGEVEPTGGQAFREMPEFAFENGVMFDRDPIESVYYASCTVNGSKDISGVI INVDKTTKQVLDAKASSLKVTGSSQKALCRPMITIFK >gi|222159257|gb|ACAB01000102.1| GENE 5 6821 - 9367 1816 848 aa, chain + ## HITS:1 COG:no KEGG:BT_4652 NR:ns ## KEGG: BT_4652 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 47 840 36 868 872 375 31.0 1e-102 MHTTFRLSVFLLFMVQLFCNCTSDSLPEPDGTQPETPAGIPDAYQDKIRTQPYPKADNEL YLNPASLIVPQAMKTGERLQFSLSRTEDFSSSETLLSEPQEWCMYNLHRRLEVGTWYWRF RSTNLNGTTPGEWSSTYRFEVKNDTPKFITPPFQTFLANAPRQHPRIYCFWDERIGEARN RVTSHPEYAELQSRASQALKAEYTGMTDLYSRAEELRQHATYLYQAYHLTQKEIYAEKLR QLLEALIVAPPADGQLFASNFTASNIAWCLVAAYDLLYNNLSASDRTAAEELMMRVARYY YKVNCGFQENHLFDNHFWQQNMRVLFQVALSLYDKPAYSFEVLPMMEYYYELWTARAPAS GFNRDGIWHNGTGYFSANILTLAYMPSLLSFISRYDFLSHPWYQNAGRSLVYTCPPGSKS NGFGDSSEKGSEPNRLIAAFADYLACETGNSYAGWYAGECRDLLRRDYELRLYRMCTDQD YNTTFPAGADKMVWYKDAGEVAMHSAPEDAGKDLALSFRSSTFGSGSHTTASQNAFNLLY KGVDVYRSTGYYQNFSDAHNLMSYRHTRAHNSLLVNGIGQPYSTEGYGSVMRAMGGQHIS 
YCLGDASHAYRGISNDPMWVGYFKQAGIEQTPENGFGATPLTKYRRHVLMLHPHTVIVYD ELEASEAVRWEWLLHSPTEFKIDTVKKTLSTDNKTKGWMAVTQLFGGHVFTLSQTDRFVV PPAITGAEYPNQWHLTARVDGCSATRFLAIIQVGDEAVSIINRDGDTFNVGDWTIKAVLD ASKAPELTVSHRTEQAVFSYGTDNPALNGNFYSRQYTGSSLLYDEIDGAYQVVEMTDRSP ISTRVVNQ >gi|222159257|gb|ACAB01000102.1| GENE 6 9411 - 12623 2970 1070 aa, chain + ## HITS:1 COG:no KEGG:Phep_2829 NR:ns ## KEGG: Phep_2829 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 3 1070 2 1053 1053 839 44.0 0 MKSRIICILLLLVGVSGIYAQSLTVTGKVVDNEGLEVIGGNVTVKGKQNTGTITDINGKY TITVSDPQKDVLVFSFIGLENMEVPVKGRKQIDVTMKAASVLLDEVVAIGYATVKRKDLT GSVASVRSDDLLKVPSSDVTQALAGRMAGVQIIQTDGQPGATMSVRVRGGISITQSNEPL YIIDGFPTEDGMSSLDPADIESIDVLKDASATAIYGARGANGVVVITTKSGAKSEGKATL TFDSYVGVRTLAKRLDVLSVEEFVLADYERTLGDATDPEESMRSWQNRYGGFVDLHENYG NRKGIDWLDRTMGRTTVTQNYRVGVNGGNDKLNYNMSYGYFKDEGAMVYSGSDKHNIALS VKSEVNKRLSVTGRINFDYLKVYGAGVAGNGTNEGGSNVDAKFNKMVQILQYRPTIGIRG NDSDLLAGEDPVLSDADGNVMQNPLIAAAEEKDNKETRTLQANGGLTFKIIKGLTFRNNT GMRYQLYRRELFYGDQSIMGRRSGIYGSIRNTETGSFQTSNVLTYDKRFQKKHKVVVQLG QEFVKRWTRVLESGVSGLPTDEFILGDMSLGTPSVASSDENYDDNLLSFFARLNYDFTDK YLFSATFRADGSSKFGKNNKWGYFPAVSAAWRVGEEDFIKKLNVFSDLKFRIGYGLAGNN RIGSYNSLALMSSIITAMGDQLTPGYASKQIPNPDLKWEANKTFNMGVDLGFLNQRITIS PEFYINRSSNLLLNAQLPYSSGYQSMLINAGETKNVGVELTVNTVNFSTKKFSWNTTLTL SHNKNSVKALTGEAVQLYEARFGFNQNTHRIAVGEPLGQFYGYITEGLYQVDDFNYDAST QTYTLKDGIPYHGDKSRIRPGMWKFKNLTGDDNVIDENDKTVIGNAQPKFYGGLNNSFTY KGFDLSIFLTFSYGNEVLNATKLVTSKVGSLNYNALDVMNSSNRWMTINSDGQKVTDPGE LAALNAGKTVAAYHDAQQGDNYIHSWAVEDASYLKLSNVTLGYTFPKNLIARVGLKNLRL YATGNNLLTWTKYSGFDPEVSTMKSGLTPGVDFGAYPLSRSFIFGLNVAF >gi|222159257|gb|ACAB01000102.1| GENE 7 12645 - 14339 1578 564 aa, chain + ## HITS:1 COG:no KEGG:Phep_2828 NR:ns ## KEGG: Phep_2828 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 1 557 10 544 549 303 35.0 2e-80 MKKIHLLYAVFALLATGFSSCEDLLTEEPNSKYDRDRYFDSEDKAEMAVMGIYSSLSDFN HYGWYEMASPASDDTYYTARTQSDNQVHDIAHYQLNSTNTWIESIWKLKYEGIDRANLTI 
DGICGMTGYAENTRLKALEAEARFLRAFLAFDLVRYWGDVPFKTSYSSSYESAFGERVDR EVIYDEIISDLTFAKNNLDWATASSSPERVTQGAARALLMRVYLQRAGYSLQSNGQLKRP EDSKRMEYFDAVIKEWEAFEKKGYHDFYDGGYEALFKSYSQGVLNNKESLWEIAFYHSQG RRNGGAWGIYNGPQVAEPTGISASESSQYMGRANGFFIVVPEWRNFFEASDKRRDVAICT YRYTWNGTKKEHVKEERSAGSWYVGKWRREWMPKESWNKNINYADVNYCPLRYADVVLMA AEAYNETGTDRQKAWDLLNSVRTRAEATSITEANYDEMMSARKKTHNLTFIDDSTPEGKF RTALYWERGFELAFEGQRKYDLIRWGVLGKALKLFGEISSVNQKENKPYPAYRNFMEGKH ELFPIPLKEIQSNPKLNGMNNNGY >gi|222159257|gb|ACAB01000102.1| GENE 8 14542 - 15003 794 153 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237714350|ref|ZP_04544831.1| 50S ribosomal protein L13 [Bacteroides sp. D1] # 1 153 1 153 153 310 99 1e-83 MDTLSYKTISANKATVTKEWVIVDATDQTLGRLGAKVAKLLRGKYKPNFTPHVDCGDNVI IINADKVKLTGNKWNDRVYLSYTGYPGGQREMTPARLIAKPNGEERLLKKVVKGMLPKNI LGAKLLNNLYVYAGSEHKQAAQNPKMIDINSYK >gi|222159257|gb|ACAB01000102.1| GENE 9 15010 - 15396 640 128 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883130|ref|ZP_02064133.1| hypothetical protein BACOVA_01099 [Bacteroides ovatus ATCC 8483] # 1 128 1 128 128 251 100 9e-66 MEVVNALGRRKRAIARIFVSEGTGKITINKRDLAEYFPSTILQYVVKQPLNKLGAAEKYD IKVNLCGGGFTGQSQALRLAIARALVKMNAEDKAALRAEGFMTRDPRSVERKKPGQPKAR RRFQFSKR >gi|222159257|gb|ACAB01000102.1| GENE 10 15518 - 16354 1397 278 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883131|ref|ZP_02064134.1| hypothetical protein BACOVA_01100 [Bacteroides ovatus ATCC 8483] # 1 278 1 278 278 542 100 1e-153 MSRTNFDTLLEAGCHFGHLKRKWNPAMAPYIFMERNGIHIIDLHKTVAKVDEAAEALKQI AKSGKKVLFVATKKQAKQVVAEKAQSVNMPYVIERWPGGMLTNFPTIRKAVKKMATIDKL TNDGTYSNLSKREVLQISRQRAKLDKTLGSIADLTRLPSALFVIDVMKENIAVREANRLG IPVFGIVDTNSDPSNVDFVIPANDDATKSVEVILDACCAAMIEGLEERKAEKIDMEAAGE APANKGKKKSVKARLDKSDEEAINAAKAAAFIKEDEEA >gi|222159257|gb|ACAB01000102.1| GENE 11 16476 - 17468 1409 330 aa, chain + ## HITS:1 COG:BS_tsf KEGG:ns NR:ns ## COG: BS_tsf COG0264 # Protein_GI_number: 16078713 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor Ts # Organism: Bacillus 
subtilis # 1 225 1 233 293 117 40.0 4e-26 MAVSMADITKLRKMTGAGMMDCKNALTEAEGDFDKAMEIIRKKGQAVAAKRSEREASEGC VLAKTTGDRAVIVALKCETDFVAQNADFVKLTQDILDLAVANKCATLDEVKALPMGNGTV QDAVTDRSGITGEKMELDGYMTVEGVCTAVYNHMNRNGLCTIVAFNKEVNEQLAKQIAMQ IAAMNPIAIDEDGVSEEVKQKEIEVAIEKTKAEQVQKAVEAALKKANINPAHVDSEEHMD SNMAKGWITAEDVAKAKEIIATVSAEKAAHLPEQMIQNIAKGRLGKFLKEVCLLNQEDIM DGKKTVREVLAAADPELKIVDFKRFTLKAE >gi|222159257|gb|ACAB01000102.1| GENE 12 17673 - 18182 467 169 aa, chain - ## HITS:1 COG:TP0100 KEGG:ns NR:ns ## COG: TP0100 COG0526 # Protein_GI_number: 15639094 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Treponema pallidum # 51 166 83 199 200 76 31.0 2e-14 MRKVTLFCAAALFSLLSFAQDNNADIVKVGDSMPAFTLHSTVNGTVNSEDLKGKVVLINI FATWCGPCQSELAEVQKTLWPKYKDNKDFCMLVIGREHTDNQLTEYNKRKGFTFPLYPDP KREVTGKFASQYIPRSYLIDKDGKVISATVGYKKEEMDKLMKEIDKALK >gi|222159257|gb|ACAB01000102.1| GENE 13 18219 - 18569 356 116 aa, chain - ## HITS:1 COG:alr3795 KEGG:ns NR:ns ## COG: alr3795 COG0023 # Protein_GI_number: 17231287 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 1 (eIF-1/SUI1) and related proteins # Organism: Nostoc sp. 
PCC 7120 # 33 116 33 115 115 67 48.0 5e-12 MKNNDWKDRLNVVYSTNPDFGYEMDNDEEQVTLDKNKQSLRVSIDKKNRGGKVVTLITGF IGTENDLKELGKLLKSKCGVGGSAKDGEIMVQGDFKTKIIELLIKEGYSKTKGIGG >gi|222159257|gb|ACAB01000102.1| GENE 14 18735 - 19385 673 216 aa, chain + ## HITS:1 COG:lin2178 KEGG:ns NR:ns ## COG: lin2178 COG2344 # Protein_GI_number: 16801243 # Func_class: R General function prediction only # Function: AT-rich DNA-binding protein # Organism: Listeria innocua # 7 207 3 203 215 140 35.0 3e-33 MSTSIRKEADKVPEPTLRRLPWYLSNIKLMKEKGEQYVSSTQISKEINIDASQIAKDLSY VNISGRTRVGYNIDALIEVLESFLGFTNMHKAFLFGVGSLGAALLRDSGLHHFGLEIVAA FDVNPELVGKDLNGIPIFHSDDFEAKMKEYDVNIGVLTVPINIAQEITDKMVDGGIKAVW NFTPFRIRVPENIVVQNTSLYAHLAVMFNRLNFNEK >gi|222159257|gb|ACAB01000102.1| GENE 15 19385 - 19999 648 204 aa, chain + ## HITS:1 COG:ycgM KEGG:ns NR:ns ## COG: ycgM COG0179 # Protein_GI_number: 16129143 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) # Organism: Escherichia coli K12 # 2 189 18 205 219 153 40.0 3e-37 MKIIAVGMNYAQHNKELGHTQVNTEPVIFMKPDSAILKDGKPFFIPDFSNEIHYETELVV RINRLGKNIAPRFANRYYDAVTVGIDFTARDLQRKFREQGNPWELCKGFDSSAAIGTFVP VEHYKDIQNLNFNLLIDSKEVQRGCTADMLFKIDDIIAYVSRFVTLKIGDLLFTGTPVGV GPVSIGQRLQGYLEEEKLLDFYIR >gi|222159257|gb|ACAB01000102.1| GENE 16 20038 - 20520 463 160 aa, chain + ## HITS:1 COG:RSc1644 KEGG:ns NR:ns ## COG: RSc1644 COG0245 # Protein_GI_number: 17546363 # Func_class: I Lipid transport and metabolism # Function: 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase # Organism: Ralstonia solanacearum # 1 158 4 161 168 177 55.0 8e-45 MKIRVGFGFDVHQLVEGRELWLGGILLEHTKGLLGHSDADVLLHAVCDALLGAANMRDIG YHFPDTAGEFKNIDSKILLKKTVELIATKGYKVGNIDATICAERPKLKAHIPLMQETMAT VMGIDADDISIKATTTEKLGFTGREEGISAYATVLIEKIS >gi|222159257|gb|ACAB01000102.1| GENE 17 20539 - 21546 906 335 aa, chain + ## HITS:1 COG:CAC2806 KEGG:ns NR:ns ## COG: CAC2806 COG1409 # Protein_GI_number: 15896061 # Func_class: R 
General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 4 302 2 303 317 164 37.0 2e-40 MIRLLKIYLTLTFLLVTTFGMAQKSELKFSKDGKFKIVQFTDVHFKYGNRASDIALERIN QVLDDERPDLVIFTGDVVYSAPADSGMLQVLEPVVKRKLPFVVTFGNHDNEQGMTREQLY DIIRKVPGNLLPDRGTVLSPDYVLTVKSSSNVKKDAALLYCMDSHSYSPLKDVKGYAWLT FDQINWYRQQSAAYKAQNGGQPLPALAFFHIPLPEYNEAARTENAILRGTRMEEACAPKL NTGMFAAMKEAGDVMGMFVGHDHDNDYAVMWKGILLAYGRFTGGNTEYNHLPNGARIIVL DEGARTFTSWIRQKDGVVDKISYPASFVKDDWTKR >gi|222159257|gb|ACAB01000102.1| GENE 18 21536 - 22399 379 287 aa, chain - ## HITS:1 COG:no KEGG:BT_3886 NR:ns ## KEGG: BT_3886 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 286 1 286 292 510 88.0 1e-143 MANPRLPGISENEEALLYAKLNEYNRGRASFKEAGVYLVVLPRPGKPNYSLWLYSPLPEK QSILYIHDLSPDINESLRMASTMFYYSRRCLILMDYNEKRMQSNGDDLIFFGKYRGHFLH EILKIDPAYLSWVAYKFTPKIPKQERFVQIAQAYHSIHLDIMIRKSREKRSSSRYLGELG EKLTDLKLKVTRVRLEDDPYKTRVNGTTPQFFVKQILTLTDASGNLVIISIPSKNPSAVS CTLSGIEHEYRLGDIIYIASAKVSRQYESYGSKYTRLSHVKFASLNV >gi|222159257|gb|ACAB01000102.1| GENE 19 22510 - 23595 967 361 aa, chain + ## HITS:1 COG:CAC2233 KEGG:ns NR:ns ## COG: CAC2233 COG0482 # Protein_GI_number: 15895501 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Clostridium acetobutylicum # 5 345 2 354 355 258 38.0 1e-68 MEERNKRVLVGMSGGIDSTATCLMLQEQGYEIVGVTMRVWGDEPQDARELAERMGIEHYV ADERIPFKETIVKNFIDEYKQGRTPNPCVMCNPLFKFRVLTEWADKLNCAWVATGHYSRL EEKNGNIYIVAGDDDKKDQSYFLWRLGQDVLKRCIFPLGDYTKVKVREYLAEKGYEAKSK EGESMEVCFIKGDYRDFLREQCPELDSEIGSGWFVNSEGVKLGKHKGAPYYTIGQRKGLE IALGKPAYVLKINPQKNTVMLGDADQLETEYMLAEQDKIVDERELFGCENLTVRIRYRSR PIPCRVKRLEDGRLLVRFLETASAIAPGQSAVFYDGRRVLGGAFIASQRGIGLVIIENEE L >gi|222159257|gb|ACAB01000102.1| GENE 20 23620 - 24975 933 451 aa, chain + ## HITS:1 COG:BS_aprX KEGG:ns NR:ns ## COG: BS_aprX COG1404 # Protein_GI_number: 16078789 # 
Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Bacillus subtilis # 152 438 129 431 442 88 27.0 3e-17 MKKFIVLILALNMYLGVSAQFTPGDTLKYRISLKDKAATDYSLQKPEKYLSKKSIERRKK QGLPIDSTDLPVCRTYVDAIRKTGVHVLVTGKWDNFVTVSCNDSTLISEIAQLPFVRSTE RVWKGITQKAFQRDSLINKPLRTDSLYGPAITQAAMSRVDLLHDAGFKGQGMTIAVIDAG FHNVDKIDAMKNIRILGVRDFVNPEADIYAESSHGMSVLSCMAMNQPHVMIGTAPEASYW LLRSEDEYSENLVEQDYWAAAIEFADSVGVDLVNTSLGYYSFDDPTKNYRYRDLNGHYAL MSREAAKAADKGMVVVCSAGNSGAGSWKKITPPGDAENIITVGAVTKRGELAPFSSVGNT ADGRVKPDVVAVGLNSDVMGTDGNLRRANGTSFASPIMCGMVACLWQACPKLTAKQIIDL VRQSGDRADFPDNIYGYGIPDLWKAYQSTQR >gi|222159257|gb|ACAB01000102.1| GENE 21 24988 - 26220 891 410 aa, chain + ## HITS:1 COG:DR0186 KEGG:ns NR:ns ## COG: DR0186 COG1570 # Protein_GI_number: 15805222 # Func_class: L Replication, recombination and repair # Function: Exonuclease VII, large subunit # Organism: Deinococcus radiodurans # 4 409 28 416 416 155 31.0 1e-37 MDSLSLLELNSLVRRSLEQCLPDEYWIQAELSDVRSNTTGHCYLEFVQKDPRSNNLVAKA RGMIWNNIYRLLKPYFEESTGQLFTSGIKVLVKVTVQFHELYGYSLTVLDIDPAYTLGDM ALRRREILLQLEEEGVLTLNKELEMPVLPQRVAVISSATAAGYGDFCHQLQHNPGGFYFY TELFPALMQGNQVEESVLAALDRINARINEFDVVVIIRGGGATSDLSGFDTYLLAAACAQ FPLPIITGIGHERDDTVLDSVAHTRVKTPTAAAELLIHQVTEVAEHLEELSVRLQQGAYM LLEQEQRRLEALQIRIPNLVHRKLADARFSLLAAKKDLSQVAKALLARQSHRLELLQQRI ADASPDKLLSRGYSITIKDGKAVTDASSLKPGDRLTTRLLKGEVQSVVEK >gi|222159257|gb|ACAB01000102.1| GENE 22 26281 - 26490 243 69 aa, chain + ## HITS:1 COG:no KEGG:BT_3891 NR:ns ## KEGG: BT_3891 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: Mismatch repair [PATH:bth03430] # 1 68 1 68 68 70 92.0 2e-11 MAAKKETYSQAMERLEKIVRQIDNNELDIDILSEKIKEANEIIAFCKDKLTKADREVEKL LQEKRLSEE >gi|222159257|gb|ACAB01000102.1| GENE 23 26542 - 27561 1231 339 aa, chain + ## HITS:1 COG:L0086 KEGG:ns NR:ns ## COG: L0086 COG0115 # Protein_GI_number: 15673270 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # 
Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Lactococcus lactis # 4 339 5 340 340 342 51.0 7e-94 MKEIDWANLSFGYMKTDYNVRINFRNGAWGELEVSSDEHLNLHMAATCLHYGQEAFEGLK AFRGKDGKVRIFRLEENAARLQSTCQGILMAELPTERFKEAILKVVKLNERFIPPYETGA SLYIRPLLIGTSAQVGVHPAEEYMFVVFVTPVGPYFKGGFSTNPYVIIREFDRAAPHGTG IYKVGGNYAASLRANKKAHDLGYSCEFYLDAKEKKYIDECGAANFFGIKDNTYITPKSSS ILPSITNKSLMQLAEDMGIKVERRPIPEEELETFEEAGACGTAAVISPIQRIDDLENGKS YVISKDGKPGPICTKLYNKLRGIQYGDEPDTHGWVTIVE >gi|222159257|gb|ACAB01000102.1| GENE 24 27607 - 28230 607 207 aa, chain + ## HITS:1 COG:lin0656 KEGG:ns NR:ns ## COG: lin0656 COG5523 # Protein_GI_number: 16799731 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Listeria innocua # 6 162 4 164 345 126 40.0 3e-29 MLKLNSELRAQAREALRGKWPMAAVAALIYSVIAGGLSAIPVIGGLCSLFVGLPVAYGFT IVMLGVCRGKDIDFGVLFEGFQDYGRIFVTMLLQTVYTVLWSLLLVIPGIIKSYSYAMTS FILKDEPEMKNNAAIEKSMAMMEGNKMKLFMLDLSFIGWAILCIFTFGIGFLFLQPYVAI SRAAFYEDLKAQQGGNVEVNVEVNVEI >gi|222159257|gb|ACAB01000102.1| GENE 25 28343 - 29101 724 252 aa, chain + ## HITS:1 COG:CAC2627 KEGG:ns NR:ns ## COG: CAC2627 COG0220 # Protein_GI_number: 15895885 # Func_class: R General function prediction only # Function: Predicted S-adenosylmethionine-dependent methyltransferase # Organism: Clostridium acetobutylicum # 30 215 23 206 211 118 36.0 8e-27 MGKNKLEKFADMASYPHVFEYPYSAVDNVPFDMKGKWHEEFFKNDHPIVLELGCGRGEYT VGLGKMFPEKNFIAVDIKGARMWTGATESLQAGMKNVAFLRTNIEIIERFFAEGEVSEIW LTFSDPQMKKATKRLTSTYFMERYRKFLQPNGIIHLKTDSNFMFTYTKYMIEANRLPVEF MTEDLYHSDLVDDILGIKTYYEQQWLDRGLAIKYIKFRLPQEGQLQEPDVEIELDPYRSY NRSKRSGLSTSK >gi|222159257|gb|ACAB01000102.1| GENE 26 29196 - 30296 1142 366 aa, chain + ## HITS:1 COG:alr0652 KEGG:ns NR:ns ## COG: alr0652 COG0489 # Protein_GI_number: 17228148 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Nostoc sp. 
PCC 7120 # 8 349 10 350 356 266 44.0 3e-71 MTLYPKLILDALATVRYPGTGKNLVEAEMVADNLRIDGMTVSFSLIFEKPTDPFMKSMLK AAETAIHTYVSPDVQVTITAESKQAARPEVGKLLPQVKNIVGISSGKGGVGKSTVSANLA VALAKLGYKVGLLDADIFGPSMPKMFQVEDARPYAERIDGRDMIIPVEKYGVKLLSIGFF VDPDQATLWRGGMASNALKQLIADAAWGELDYFLIDLPPGTSDIHLTVVQTLAMTGAIVV STPQAVALADARKGINMFTNDKVNVPILGLVENMAWFTPAELPENKYYIFGKEGAKKLAE EMNVPLLGQIPIVQSICEGGDNGTPVALDEDSVTGRAFLSLAASVVRQVDRRNVEMAPTQ IVEMHK >gi|222159257|gb|ACAB01000102.1| GENE 27 30628 - 32541 1218 637 aa, chain - ## HITS:1 COG:no KEGG:BT_3897 NR:ns ## KEGG: BT_3897 # Name: not_defined # Def: putative thiol:disulfide interchange protein DsbE # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 637 1 634 634 733 58.0 0 MKRFAWIIGLIFCTICTIQAKDRVIERPPFLAWSSNSIEIDKIVMSDTVTTVYIKAFYRP KYWIKIATGSFLKDNNGMLYPIRKGVGITLDKEFWMPESGEAEFQLLFPPIPENVTSLDF SEGDFDGAYKIWGIQLDKETSYKQKLPKEAVIHKINKKAILPTPQLAFGTATLKGKILDY QKEMMQQMKMHIESPALNIHNEQNIIKIKEDGTFQAEVKVASVTSVALELPFGWIECLIA PNEETSLIINTKELCRRQAHLQRKDKTYGEPVYFNGYLASLQQELASVDIDIVLKSVYYM DMYNDIAGKSADEYKAYVLERLPSIRKEIAQSPYSNACKELLNIQVDLAATGKIAMTERE LKSAYIAVNKLNKEQTDDYFYNTRIDIPTGYYDILKEFTSINTLKALYGKYYASTIYLIS FLPNSLDVLKETLGTGQGPLFDNIKFNKLYQSIKDFTPLTAEQNAELKTFSSPAYAEMLT QTNKEIIKKIELNKRKTGFTVNETGQVSNEDLFPSIISKFRGHTLLVDFWATWCGPCRTA NKAITPMKEELKDKDIIYLYITGETSPKGTWENMITDIHGEHFRVTNEQWSFLMSSFNIR GVPTYFVVDPEGNITFKQTGFPGVDTMKKELMKALNK >gi|222159257|gb|ACAB01000102.1| GENE 28 32549 - 32740 66 63 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFYYTNKYKDLFITILVKNGKRNDFFQHRGKIRPAPGGESSPPMSAPFLTDEERERYRGS YLL >gi|222159257|gb|ACAB01000102.1| GENE 29 32732 - 34549 1123 605 aa, chain - ## HITS:1 COG:no KEGG:BT_3898 NR:ns ## KEGG: BT_3898 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 605 1 609 609 890 71.0 0 MTPELAYFLKINVAIALFYAFYRLFFHKDTFFHWRRMALLCFFAISMLYPLLNIQGWIKA HEPMVAMADLYATILLPEQVVTPTQATVINWQEVITQFAKIIYWSGMLLLTARFFVQLGS IIRLHFQCSKSNIQGVRVHLLKKETGPFSFFHWIFIHPQSHTESEISEIITHEETHARQY 
HSVDVLFSEIMCIFCWFNPFIWLMKREVRGNLEYMADHRVLETGHDSKSYQYHLLGLAHH KAAANLSNSFNVLPLKNRIKMMNKRRTKEIGRTKYLMFLPLAGILMIVSNIEMVARTTEK FAKEMMGQVTEEVAMQAETTNIPELSTREIQEISLPQGTKEKEVTETQIKSVPDSVVFQV VEEMPDFPGGMKALMDYLSKNVKYPAEAHAIGAQGRVIVSFTVKKDGSIADTKVERSVNP YLDKEAMRVIAAMPKWQPGKQRGEAVNVKFTVPVAFRLSDPPTPKAEEIKQSDLDEVVVV GYGPQEDSTPGAVGVKGENAVQAFTVVETMPKFPEGQAGLMRYLARSIKYPVIAQKNKEQ GRVIIQMIIGTDGSLSNVKVLRSVSPSLDAEAIRVVGNMPKWEPGMQKGQAVPVKYTLPI TFRLQ >gi|222159257|gb|ACAB01000102.1| GENE 30 34568 - 34933 318 121 aa, chain - ## HITS:1 COG:no KEGG:BT_3899 NR:ns ## KEGG: BT_3899 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 120 1 120 121 205 92.0 4e-52 MEKLTIQEEEVMIYIWELQCCFVKDIVAKYTQPAPPYTTVASIVKNLERKGYVTPKRVGN TYQYTPAIRENEYKRHFMSGVVRNYFENSYKEMVSFFAKDQKISTDDLKDIIELIEKGKE N >gi|222159257|gb|ACAB01000102.1| GENE 31 35332 - 35901 521 189 aa, chain + ## HITS:1 COG:no KEGG:BT_0646 NR:ns ## KEGG: BT_0646 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 189 1 190 190 197 57.0 2e-49 MKKGLIFVLFALVSIVSYSQISWNAKVGMNMSNFTGDMDTDMRIGFNVGVGMEYQFSDMW SIQPSLMFTQKGAKQDEVKMNPMYLEIPVLAAARFAIADNQNIVVKAGPYFAFGIAGKCK IGDEKIDFFGDGDDQFGAKRFDAGLGVGVAYEINKFFIDLSGEFGLAKLADGDGAPKNMN FSIGVGYKF >gi|222159257|gb|ACAB01000102.1| GENE 32 36049 - 37308 962 419 aa, chain - ## HITS:1 COG:VC1609 KEGG:ns NR:ns ## COG: VC1609 COG0842 # Protein_GI_number: 15641617 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Vibrio cholerae # 17 381 29 388 408 154 28.0 5e-37 MKDIKLKDKIAQGINDLFYIWKREFQTTFRDQGVLIFFILVPLGYPLLYSFIYDNEVVRE VPAVVVDDSHSSLSREYLRKVDATPDIKIVSYCADMEEAKQMLKNRRAYGIIYIPSDFSD NIAKGKQTQVSIYCDMSGLLYYKSMLLANTAVSLNMNKDIKIARSGNTTDRQDEITAYPI EYEEISIFNPTAGFAAFLIPAVLVLIIQQTLLLGIGLAAGTARENNRFKDLVPINRHYNG TLRIVLGKGLSYFLVYILVAFYVLYVVPRLFSLNQIGQPSSLILFVVPYLAACIFFAMTA SIAIRNRETCMLIFVFTSVPLLFISGISWPGAAIPPFWKYVSYIFPSTFGINGFVKINNM 
GATLSEVAFEYKALWLQAGIYFLTTCWVYRWQILMSRKHAIERYKELKEKANLSKQISD >gi|222159257|gb|ACAB01000102.1| GENE 33 37309 - 38490 496 393 aa, chain - ## HITS:1 COG:VC1608 KEGG:ns NR:ns ## COG: VC1608 COG0842 # Protein_GI_number: 15641616 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Vibrio cholerae # 28 355 22 348 387 125 27.0 2e-28 MREREKKYIALWQVMQRECRRLVSRPLYLFCMVIAPLFCYIFFTTLMDSGLPKDLPAGVV DMDDSSTSRNIVRNLDAFSQTGVVAHYSNVTDARIAMQEGKIYGFFYLPKGLSAEAQSQR QPTISFYTNYSYLIAGSLLFRDMKMMGELTSGAAARTMLYAKGATEDQAMAYLQPIVIDT HPLNNPWLNYSVYLCNTLIPGVLMLLIFMVTVYSIGVEIKDRTAREWLRMSNNSIYIALA GKLLPHTVVFFTMGIFYNVYLYGFLHFPCNSGIFPMIFATLCLVLASQCCGIVMIGTLPT LRLGLSFASLWGVISFSISGFSFPVMAMHPVLQALSNLFPLRHYFLIYVDQALNGYSMAY SWTNYMALLIFMMLPFFVVHRLKEALVYYKYIP >gi|222159257|gb|ACAB01000102.1| GENE 34 38493 - 39485 1176 330 aa, chain - ## HITS:1 COG:VC1607 KEGG:ns NR:ns ## COG: VC1607 COG0845 # Protein_GI_number: 15641615 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 17 328 6 322 324 217 45.0 3e-56 MAPIKSQNSNMLLAFLTLLGVIAIVAVVGFFMLRKGPEIIQGQAEVTEYRVSSKVPGRIL EFRVKEGQSVNAGDTLAILEAPDVVAKMEQARAAEAAAQAQNAKAIKGAREEQIQAAYEM WQKAQAGVTIAEKSYQRVKNLYEQGVMPAQKLDEVTAQRDASIATEKAAKAQYTMAKNGA EREDKMAAEALVNRAKGAVAEVESYIKETYLIAPAAGEVSEIFPKVGELVGTGAPIMNIA ELNDMWVTFNVREDLLKNLTMGSEFEAVIPALDNKKIKLKVYYLKDLGTYAAWKATKTTG QFDLKTFEVKASPIEKVENLRPGMSVIIDK >gi|222159257|gb|ACAB01000102.1| GENE 35 39592 - 41055 1379 487 aa, chain - ## HITS:1 COG:HP1489 KEGG:ns NR:ns ## COG: HP1489 COG1538 # Protein_GI_number: 15646098 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Helicobacter pylori 26695 # 23 455 48 471 510 82 24.0 1e-15 MKKVFLLTILLSLTFIGKAQNFLSLDSCRALALANNKDLLISNEKINAAHYQRKAAFTNY LPNFSATGAYMRNQKEFSLLNNDQKAALSGLGTNLAGPIQQAATEIATAHPELKPLIASL 
SGKLGAALPALDQAGNSLVDALRTDTRNVYAGALTLTQPLYMGGKIRAYNKITKYAEELA QEQHHGGMQEVIMSTDQAYWQVISLVNKKKLAEGYLKLLQQLDGDVEKMINEGVATKADG LSVRVKVNEAEMTLTKVEDGLSLARMLLCQLCGIDLSSPITLADENMEDIPLLTTDPHFD LSTAYENRPEIRSLELATQIYKQKVNVTRAEHLPSIALMGNYMVTNPSVFNSFENKFKGM WNVGVMVQIPIWHWGEGIYKTKAAKAEARIAQYQLQDAREKIELQVNQAAFKVKEAGKKL VMSSKNMEKAEENLRYATLGFKEGVIATSNVLEAQTAWLSAHSEKIDAQIDVKLTEIYLK KSLGTLK >gi|222159257|gb|ACAB01000102.1| GENE 36 41325 - 42266 1244 313 aa, chain - ## HITS:1 COG:BH3158 KEGG:ns NR:ns ## COG: BH3158 COG0039 # Protein_GI_number: 15615720 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Bacillus halodurans # 3 307 7 311 314 266 46.0 3e-71 MSKVTVVGAGNVGATCANVLAFNEVADEVVMLDVKEGVSEGKAMDMMQTAQLLGFDTTLV GCTNDYAQTANSDVVVITSGIPRKPGMTREELIGVNAGIVKSVAENLLKYSPNAIIVVIS NPMDTMTYLALKALGLPKNRVIGMGGALDSSRFKYFLSQAIGCNANEVEGMVIGGHGDTT MIPLTRFATYKGMPVANFISAEKLEEVAAATMVGGATLTKLLGTSAWYAPGAAGAFVVES ILHDQKKMVPCSVLLEGEYGESDLCIGVPVILGKNGIEKIVELNLNEDEKAKFAASAKAV HGTNAALKEVGAL >gi|222159257|gb|ACAB01000102.1| GENE 37 43152 - 44024 949 290 aa, chain + ## HITS:1 COG:no KEGG:BT_3912 NR:ns ## KEGG: BT_3912 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 290 1 290 290 422 88.0 1e-116 MSKKSLLIAAVAILVIAIIGITYLLFTEKKANRELVQEFQLDKEDLENEYSRFVQEYDEL KFKVTNDSLGVLLEQEQLKTQRLLEELRTVKSSNATEIRRLKKELATLRKVMVGYINQID SLNKLAAQQKQVIAEVTQKYNQASQQISNLSEEKKNLDKKVTLAAQLDATNIRIEPRNKR GKVAKKVKDVVKLAISFTIVKNITAENGERTIYIRITKPDNDVLTKSASNTFPYENRTLV YSIKKYIEYNGEEQNINVFWDVEEFLYAGNYRVDIFEGGNLIGSQSFTLN >gi|222159257|gb|ACAB01000102.1| GENE 38 44658 - 45563 474 301 aa, chain - ## HITS:1 COG:PA3088 KEGG:ns NR:ns ## COG: PA3088 COG0061 # Protein_GI_number: 15598284 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar kinase # Organism: Pseudomonas aeruginosa # 75 297 64 289 295 165 38.0 1e-40 MPNLHPELMMLLMKFAIFGNTYQAKKSSHAVTLFKLLKKQGAEIGMCREFYQFLVSENMD 
IEADQLFDGDDFTADMVISIGGDGTFLKAARRVGRKGIPILGINTGRLGFLADISPEEME ETFDEIQNGRYSVEERSVLQLICNDKHLQDSPYALNEIAILKRDSSSMISIRTAINGAYL NTYQADGLVIATPTGSTAYSLSVGGPIIVPHSNTIAITPVAPHSLNVRPIVIRDDWEITL DVESRSHNFLVAIDGRSETCKETTQLTIRRADYSVKVVKRFNHIFFDTLRSKMMWGADGR R >gi|222159257|gb|ACAB01000102.1| GENE 39 45676 - 46392 651 238 aa, chain + ## HITS:1 COG:XF0060 KEGG:ns NR:ns ## COG: XF0060 COG0854 # Protein_GI_number: 15836665 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal phosphate biosynthesis protein # Organism: Xylella fastidiosa 9a5c # 2 238 5 252 260 207 47.0 1e-53 MTKLSVNINKIATLRNARGGNVPDVVKVALDCESFGADGITVHPRPDERHIRRSDVYDLR PLLRTEFNIEGYPSPEFIDLVLKVKPHQVTLVPDDPSQITSNSGWDTKANQEFLTEVLDQ FNSAGIRTSVFVAADPEMVEYAAKAGADRVELYTEPYATDYPKNPEAAIAPFIEAAKTAR KLGIGLNAGHDLSLVNLNYFYKNIPWVDEVSIGHALISDALYLGLERTIQEYKNCLRS >gi|222159257|gb|ACAB01000102.1| GENE 40 46389 - 47105 902 238 aa, chain + ## HITS:1 COG:FN1312 KEGG:ns NR:ns ## COG: FN1312 COG0811 # Protein_GI_number: 19704647 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Fusobacterium nucleatum # 35 223 1 190 202 90 31.0 2e-18 MNAMILLAQEAMNMADSLATANPVLTEVNAPEMNMLDMAVKGGWIMIVLGVLSVICFYIL FERNYMIRKAGKEDPMFMERIKDYIHSGEIKAAINYCRTMNTPSARMIEKGISRLGRPIN DVQVAIENVGNIEVAKLEKGLTVMATISGGAPMLGFLGTVTGMVRAFYEMANAGSGNIDI TLLSGGIYEAMITTVGGLIVGIIAMFAYNYLVMLVDRVVNKMESRTMEFMDLLNEPAK >gi|222159257|gb|ACAB01000102.1| GENE 41 47113 - 47529 304 138 aa, chain + ## HITS:1 COG:no KEGG:BF3738 NR:ns ## KEGG: BF3738 # Name: not_defined # Def: putative tansport related protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 138 6 143 145 234 87.0 9e-61 MGLKRRNRVSPNFSMASMTDVIFLLLIFFMITSTVVSPNAIKVLLPQGKQQTSAKPLTRV VIDKDLNFYAAFGNEKEQPVALNDLTSFLQSCAEKEPEMYVALYADESVPYREIVRVLNI ANENHFKMVLATRPPENK >gi|222159257|gb|ACAB01000102.1| GENE 42 47536 - 48411 884 291 aa, chain + ## HITS:1 COG:no KEGG:BT_3921 NR:ns ## KEGG: BT_3921 # Name: not_defined # Def: 
hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 291 3 292 292 373 87.0 1e-102 MDRRKKGEYIGALGALLVHVAVIALLILVSFTVPQPDEDAGGVPVMMGNVDAASGFDDPS LVDVDIMDEDAAAPPAETEPQLPSEQDLLTQTEEETVTLKPKTEEPKKETVKPKEMVKPK EPVKKPEKTEAEKAAEAKRLAEEKAERERKAAEEAARKRVSGAFGKGAQMTGNKGTAASG TGTEGSKEGNSSTGAKTGTGGYGTFDLGGRSLGTGSLPKPVYNVQEEGRVVVNITVNPAG QVISTSISPQTNTVNSALRKAAEDAAKKARFNTIDGVNNQTGTITYYFNLR >gi|222159257|gb|ACAB01000102.1| GENE 43 48439 - 48990 733 183 aa, chain + ## HITS:1 COG:CAC1629 KEGG:ns NR:ns ## COG: CAC1629 COG0693 # Protein_GI_number: 15894907 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Clostridium acetobutylicum # 7 182 6 180 188 132 42.0 4e-31 MGTVYAFFADGFEEIEAFTAIDTLRRAGLNVEIVSVTPDEIVVGAHDVSVLCDINFENCD FFDAELLLLPGGMPGAATLDKHEGLRKLILDFAAKGKPIAAICAAPMVLGKLGLLKGKKA TCYPSFEQYLDGAECVNAHVVRDGNIITGMGPGAAMEFALTIVDLLVGKEKVDELVEAMC VKR >gi|222159257|gb|ACAB01000102.1| GENE 44 48998 - 49657 296 219 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 [Bacillus selenitireducens MLS10] # 1 219 1 223 234 118 37 7e-26 MKKYVIIVAGGKGLRMGSDLPKQFLPMGDKPVLMHTLEVFRRYDEALQIILVLPQEQQSF WKQLCDEHHFTVKHVLAEGGETRFHSVKNGLALVQEPGLVGVHDGVRPFVSVEVIRRCYE LAEVQKAVIPVVDVVETLRHLTDAGSETVSRIDYKLVQTPQVFDVELLQQAYAQEFTPFF TDDASVVEAMGMPVYLAEGNRENIKITTPFDLKVGSALL >gi|222159257|gb|ACAB01000102.1| GENE 45 49658 - 51754 1487 698 aa, chain + ## HITS:1 COG:slr0020 KEGG:ns NR:ns ## COG: slr0020 COG1200 # Protein_GI_number: 16331409 # Func_class: L Replication, recombination and repair; K Transcription # Function: RecG-like helicase # Organism: Synechocystis # 18 670 146 805 831 496 43.0 1e-140 MFDLATRDIKFISGVGPQKAAVLNKELEIYSLHDLIYYFPYKYIDRSRIYYIHEIDGNMP YIQLKGEILGFETIGEGRQRRLTAHFSDGTGVVDLVWFQGIKYILGKYKLHEEYIIFGKP TVFNGRINVAHPDVDKPDDLKLSSVGLQPYYSTTEKMKRSFLNSHAIEKMMATVIQQIQE PLPETLSPKLLTEHHLMPLTEALRNIHFPTNPDVLRRAQYRLKFEELFYVQLNILRYAKD 
RQKRYRGYIFEKVGDVFNTFYAKNLPFQLTGAQKRVLKEIRNDVGSGRQMNRLLQGDVGS GKTLVALMSMLLALDNGYQACMMAPTEILANQHYETIKELLFGMDIRVELLTGSIKGKRR EAILAGLLTGDVQILIGTHAVIEDTVNFSSLGFVVIDEQHRFGVAQRARLWSKNVQPPHV LVMTATPIPRTLAMTLYGDLDVSVIDELPPGRKPITTIHQFDNRRESMYRSVRKQIDEGR QVYIVYPLIKESEKIDLKNLEEGYQHILEEFPKCTVCKVHGKMKPAEKDEQMQLFVSGKA QIMVATTVIEVGVNVPNASVMIIENAERFGLSQLHQLRGRVGRGAEQSYCILVTNYKLTE DTRKRLEIMVRTNDGFEIAEADLKLRGPGDLEGTQQSGIAFDLKIADIVRDGQLLQYVRA IAESIVEQDPAAQSPENEILWRQLKALRKTNVNWAAIS >gi|222159257|gb|ACAB01000102.1| GENE 46 51978 - 52442 371 154 aa, chain + ## HITS:1 COG:AF0767 KEGG:ns NR:ns ## COG: AF0767 COG0105 # Protein_GI_number: 11498373 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside diphosphate kinase # Organism: Archaeoglobus fulgidus # 2 149 1 148 151 145 45.0 2e-35 MLEKTLVILKPCTLQRGLVGEITHRFERKGLRLAGMKMMQLTDELLSEHYAHLSGKSFFQ RVKDSMMTAPVIVCCFEGVDAIQTVRTLAGPTNGRLAAPGTIRGDYSMSFQENIVHASDS PETAAIELKRFFKPEEIFDYKQATFNYLYANDEY >gi|222159257|gb|ACAB01000102.1| GENE 47 52463 - 53335 698 290 aa, chain + ## HITS:1 COG:HI0409 KEGG:ns NR:ns ## COG: HI0409 COG0739 # Protein_GI_number: 16272358 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Haemophilus influenzae # 98 206 328 443 475 102 44.0 7e-22 MNFNCIIKTGLVAVAAMVSLSSFSQDLIARQAPIDKKLKSVDSLALQKQIRAEQSEYPAL SLYPNWNNQYAHSYGNAIIPETYTIDLTGFRMPTPSTKITSPFGPRWRRMHNGLDLKVNI GDTIVSAFDGKVRIVKYERRGYGKYVVIRHDNGLETIYGHLSKQLVEENQLVKAGEPIGL GGNTGRSTGSHLHFETRFLGIAINPIYMFDFPKQDIVADTYTFRKTKGVKRAGSHDTQVA DGTIRYHKVKSGDTLSRIAKLRGVSVSTLCKLNRIKPTTTLRIGQVLRCS >gi|222159257|gb|ACAB01000102.1| GENE 48 53437 - 54000 585 187 aa, chain + ## HITS:1 COG:no KEGG:BT_3927 NR:ns ## KEGG: BT_3927 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 187 1 180 180 342 96.0 5e-93 MRTIWKKMKDTKQQFEHVIALCRDLFSKKLHDYGPAWRILRPASVTDQIFIKANRIRSIE TKGVTLIDEGIRAEFIAIVNYGIVGLIQLELGYAESADISNEEAMALYDKYAKEALDLML 
AKNHDYDEAWRSMRVSSYTDLILMKIYRTKQIESLAGNTLVSEGIDANYMDMINYSVFGL IKIEFEG >gi|222159257|gb|ACAB01000102.1| GENE 49 53990 - 55348 826 452 aa, chain + ## HITS:1 COG:no KEGG:BF3957 NR:ns ## KEGG: BF3957 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 440 1 442 443 695 74.0 0 MKDKNLHIIQEVVANTCRFLLAASFIFSGFVKAVDPLGFQYKIQDYLTAFGMASWFPSFF PLLGGIILSAVEFFIGISLFFATRRTLATSLALMLMIFMTPLTLYLAIFDPVSDCGCFGD AWVLTNWETFGKNIVLLFAAMMAFRHRRMLVRFISVKMEWLVSLYTLFFVFTLSFYCLDR LPVLDFRPYKIGKNILEGMTMPEGAKPSVYESIFILEKNGEKKEFTLDNYPDSTWTFVDT RTILKEKGYEPAIHDFSMIDLNTGEDITDDVLTDIGYTFLLVAHRIEEADDSNIDLINEI YDYSVEHGYKFYCLTSSPEEQIELWKDKTGAEYPFCQMDDITLKTMVRSNPGLILIKNGT ILNKWSDEDIPDEYVLTDKLENLPLGKQKVSSDTHTVGYVFLWFVIPLLLVLGVDVLVVR RRERKNAKRKQQEEEMKSKELKTENPKIEEQE >gi|222159257|gb|ACAB01000102.1| GENE 50 55456 - 56211 1027 251 aa, chain + ## HITS:1 COG:FN1366 KEGG:ns NR:ns ## COG: FN1366 COG0149 # Protein_GI_number: 19704701 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate isomerase # Organism: Fusobacterium nucleatum # 1 251 1 251 251 236 49.0 3e-62 MRKNIVAGNWKMNKTLQEGIALAKELNEALANEKPNCDVIICTPFIHLASVTPLVDAAKI GVGAENCADKASGAYTGEVSAEMVASTGAKYVILGHSERRAYYGETVAILEEKVKLALAN GLTPIFCIGEVLEEREANKQNEVVAAQMESVFSLSAEDFSKIVLAYEPVWAIGTGKTASP EQAQEIHAFIRSIVANKYGKEIADNTSILYGGSCKPSNAKELFANPDVDGGLIGGAALKV SDFKGIIDAFN >gi|222159257|gb|ACAB01000102.1| GENE 51 56366 - 56809 442 147 aa, chain + ## HITS:1 COG:no KEGG:BT_3930 NR:ns ## KEGG: BT_3930 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 147 6 149 149 251 83.0 6e-66 MKVLLTLLFVLTATFAQAQSIIKSLERNIPGQGKVTIHQDPRIEALIGMERPATGEQKVI KTSGFRIQAYAGNNTRQAKNDAYHVASRVKEYFPELTVYTSFNPPRWLCRVGDFRSIEEA DAMMRRLKATGVFKEVSIVRDQINIPL >gi|222159257|gb|ACAB01000102.1| GENE 52 56817 - 57407 688 196 aa, chain + ## HITS:1 COG:slr0426 KEGG:ns NR:ns ## COG: slr0426 COG0302 # Protein_GI_number: 16331608 # Func_class: H Coenzyme transport and metabolism # Function: 
GTP cyclohydrolase I # Organism: Synechocystis # 6 193 41 229 234 215 59.0 5e-56 MLEKEEIVSPNLEELKSHYRSIITLLGEDAEREGLLKTPERVAKAMLSLTKGYHMDPHEV LRSAKFQEEYSQMVIVKDIDFFSLCEHHMLPFYGKAHVAYIPNGYITGLSKIARVVDIFS HRLQVQERMTLQIKECIQETLNPLGVMVVVEAKHMCMQMRGVEKQNSITTTSDFTGAFNQ AKTREEFMNLIQHGRV >gi|222159257|gb|ACAB01000102.1| GENE 53 57437 - 59518 1586 693 aa, chain - ## HITS:1 COG:BH1375 KEGG:ns NR:ns ## COG: BH1375 COG0358 # Protein_GI_number: 15613938 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Bacillus halodurans # 2 450 5 453 599 280 36.0 8e-75 MIDQITIDRILDAAQIMDVVSDFVTLRKRGVNYVGLCPFHSDKTPSFYVSPAKGLCKCFA CGKGGNAVHFIMEHEQMSYPEALKYLAKKYNIEIKERELSDEEKFVQSERESLFIVNNFA RDYFQNILKNHVDGRSIGMAYFRNRGFRDDIIEKFQLGYCTEAHDAFAKEAIQKGYKKEY LVKTGLCYETDDHRLRDRFWGRVIFPVHTLSGKVVAFGGRVLASATKGVKVKYVNSPESE IYHKSNELYGIYFAKQAIVKQDRCFLVEGYTDVISMHQSGIENVVASSGTALTPGQIRMI HRFTNNMTVLYDGDAAGIKASIRGIDMLLEEGMNIKVCLLPDGDDPDSFARKHNSTEFQT FISEHETDFIRFKTNLLLEDAGKDPIKRAELIGNLVQSISVIPEAIVRDVYIKECAQLLH VEDKLLVSEVAKRRETQAEKRAEQTERERRMAERTAMMSQGSTSPEDVPIPNGDIPLPPE VDGGYTDVPPALQEDSYASFIPQEGKEGQEFYKFERLILQAVVRYGEKIMCNLTDEEGNE IPVTVIEYVVNDLKEDELAFHNPLHRQMLSEAAAHMHDSNFIAERYFLAHPDPVISKLSV DLINVRYQLSKYHSKSQKIVTDEERLYEMVPMLMINFKYAIVTEELKHMLYALQDPALAH DNEKCDSLMQRYNELRTVQSIMAKRLGDRVVLR >gi|222159257|gb|ACAB01000102.1| GENE 54 59625 - 60398 778 257 aa, chain - ## HITS:1 COG:no KEGG:BT_3933 NR:ns ## KEGG: BT_3933 # Name: not_defined # Def: chorismate mutase/prephenate dehydratase (TyrA) # Organism: B.thetaiotaomicron # Pathway: Phenylalanine, tyrosine and tryptophan biosynthesis [PATH:bth00400]; Novobiocin biosynthesis [PATH:bth00401]; Metabolic pathways [PATH:bth01100]; Biosynthesis of secondary metabolites [PATH:bth01110] # 1 257 1 257 257 488 98.0 1e-137 MRILILGAGKMGSFFTDILSFQHETAVFDVNPHQLRFVYNTYRFTTLEEIKEFEPELVIN AVTVKYTLDAFRKILPVLPKDCIISDIASVKTGLKKFYEESGFRYVSSHPMFGPTFASLS NLSSENAIIISEGDHLGKIFFKDLYQTLRLNIFEYTFDEHDETVAYSLSIPFVSTFVFAA 
VMKHQEAPGTTFKKHMAIAKGLLSEDDYLLQEILFNPRTPGQVTNIRTELKNLLEIIENK DAEGMKKYLTKIREKIK >gi|222159257|gb|ACAB01000102.1| GENE 55 60456 - 61517 1149 353 aa, chain - ## HITS:1 COG:DR1001_2 KEGG:ns NR:ns ## COG: DR1001_2 COG2876 # Protein_GI_number: 15806024 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Deinococcus radiodurans # 1 242 13 245 270 146 38.0 7e-35 MELESILLPGVEAKRPIVIAGPCSAETEEQVMDTAKQLAAKGQKIYRAGIWKPRTKPGGF EGIGVEGLAWLKEVKKETGMYVSTEVATAKHVYECLKAGIDILWVGARTTANPFAVQEIA DALKGVDIPVLVKNPVNPDLELWIGALERINNAGLKRLGAIHRGFSSYDKKIYRNLPQWH IPIELRRRIPNLPIFCDPSHIGGKRELVAPLCQQAMDLNFDGLIVESHCNPDCAWSDASQ QVTPDVLDYILNLLVIRTETQSTESLAQLRKQIDECDDNIIQELAKRMRVAREIGTYKKE HGITVLQAGRYNEILEKRGAQGEQCGMDSEFMKKIFEAIHEESVRQQMEIINK >gi|222159257|gb|ACAB01000102.1| GENE 56 61605 - 62783 1224 392 aa, chain - ## HITS:1 COG:aq_273 KEGG:ns NR:ns ## COG: aq_273 COG0436 # Protein_GI_number: 15605813 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Aquifex aeolicus # 13 392 5 385 387 301 42.0 2e-81 MQKESQTYKIAPADRLASVSEYYFSKKLKEVAQMNAEGKDVISLGIGSPDMPPSRETIET LCNNAHDPNGHGYQPYVGIPELRKGFANWYQRWYGVELNPNTEIQPLIGSKEGILHVTLA FVNPGEQVLVPNPGYPTYTSLSKILGAEVINYDLKEEDGWMPDFEALEKMDLSRVKLMWT NYPNMPTGANATPEIYERLVDFARRKNIVIVNDNPYSFILNDKPISILSVPGAKDCCIEF NSMSKSHNMPGWRIGMLASNAEFVQWILKVKSNIDSGMFRAMQLAAATALEAEADWYEGN NENYRNRRHLAGEIMKTLGCTYDEKQVGMFLWGKIPASCKDVEELTEKVLHEARVFITPG FIFGSNGARYIRISLCCKDNKLAEALERIKRI >gi|222159257|gb|ACAB01000102.1| GENE 57 62758 - 63606 743 282 aa, chain - ## HITS:1 COG:VC0705_2 KEGG:ns NR:ns ## COG: VC0705_2 COG0077 # Protein_GI_number: 15640724 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydratase # Organism: Vibrio cholerae # 11 275 3 265 278 138 34.0 1e-32 MKKIAIQGTLGSYHDIAAHKYFEGEEIELICCANFEDVFTSIRKDSQVIGMLAIENTIAG SLLHNNELLRQSGTQIIGEYKLRISHSFVCLPDESWEDLTEVNSHPIALMQCREFLNQHP 
QLKVVEGEDTARSAEIIKNENLKGHAAICSKAAAERYGMKILQEGIETNKHNFTRFLVVA DPWQVDELRQHHVNATNKASMVFTLPHTEGSLSQVLSILSFYNINLTKIQSLPIIGREWE YQFYVDVAFNDYLRYKQSIAAITPLTKELKLLGEYAEGKSNV >gi|222159257|gb|ACAB01000102.1| GENE 58 64084 - 64530 164 148 aa, chain - ## HITS:1 COG:no KEGG:CHU_1441 NR:ns ## KEGG: CHU_1441 # Name: not_defined # Def: transposase # Organism: C.hutchinsonii # Pathway: not_defined # 1 147 1 146 149 128 50.0 5e-29 MPQSLSKNYIHLTFSTKYRKDSINEDKLSEISKYISGILKNIDCPPIIVGGFTNHIHILC ILNKNIALSKMVEEVKRSSSKWIKSLGPSYHDFSWQEGYGAFSVSQSKVETVTRYILGQK EHHKKMTFQDELKMFLKEYQVEYNEEFI Prediction of potential genes in microbial genomes Time: Wed May 18 03:30:31 2011 Seq name: gi|222159256|gb|ACAB01000103.1| Bacteroides sp. D1 cont1.103, whole genome shotgun sequence Length of sequence - 19048 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 7, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 106 - 1068 1252 ## COG0457 FOG: TPR repeat 2 1 Op 2 . - CDS 1108 - 3012 1443 ## COG0514 Superfamily II DNA helicase 3 1 Op 3 . - CDS 3005 - 4723 1726 ## COG0608 Single-stranded DNA-specific exonuclease - Prom 4966 - 5025 4.1 + Prom 4874 - 4933 5.9 4 2 Tu 1 . + CDS 5004 - 5567 501 ## BT_2463 RNA polymerase ECF-type sigma factor + Term 5669 - 5705 -0.6 + Prom 5618 - 5677 6.2 5 3 Tu 1 . + CDS 5747 - 6682 697 ## COG3712 Fe2+-dicitrate sensor, membrane component + Prom 6820 - 6879 3.5 6 4 Op 1 . + CDS 6933 - 10229 3049 ## Phep_1362 TonB-dependent receptor 7 4 Op 2 . + CDS 10248 - 11513 1177 ## Phep_1361 RagB/SusD domain protein 8 4 Op 3 . + CDS 11551 - 12483 764 ## Phep_1360 exopolysaccharide biosynthesis protein 9 4 Op 4 . + CDS 12512 - 13939 1002 ## Fjoh_0602 two component regulator 10 4 Op 5 . + CDS 13989 - 14843 513 ## COG4632 Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase 11 4 Op 6 . 
+ CDS 14901 - 15113 175 ## gi|298480262|ref|ZP_06998460.1| alkaline phosphatase 12 4 Op 7 . + CDS 15210 - 16304 986 ## COG1785 Alkaline phosphatase + Term 16488 - 16539 2.6 - Term 16363 - 16400 -1.0 13 5 Tu 1 . - CDS 16529 - 16708 157 ## gi|237714413|ref|ZP_04544894.1| predicted protein - Prom 16740 - 16799 5.6 14 6 Tu 1 . - CDS 16966 - 17397 400 ## COG0824 Predicted thioesterase - Prom 17608 - 17667 4.8 + Prom 17530 - 17589 5.3 15 7 Op 1 . + CDS 17712 - 18275 506 ## COG0009 Putative translation factor (SUA5) 16 7 Op 2 . + CDS 18276 - 19047 514 ## COG0038 Chloride channel protein EriC Predicted protein(s) >gi|222159256|gb|ACAB01000103.1| GENE 1 106 - 1068 1252 320 aa, chain - ## HITS:1 COG:alr0622 KEGG:ns NR:ns ## COG: alr0622 COG0457 # Protein_GI_number: 17228118 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. PCC 7120 # 98 297 256 451 547 77 28.0 3e-14 MPNFFKSFFSGKSETPESEKQKNDQKNFEIFKYDGLRAQRMGRPDYAVKCFTEALAIQKE FETMGYLSQLYIQMGETAKARELLEKMAAMEPHLTSTFLTLANVCFIQEDYQAMEEAANK AIAIEEGNAVAHYLLGKARKGQNDDLMTIAHLTKAITLKDDFIEARLLRAEALLNLKQYK EMMEDIDAVLAQNPEEETAMLLRGKVKESNGQGEEAEEDYKLVTEINPFNEQAYLYLGQL YINQKKLTEAIGLFDEAIELNPNFAEAYKERGRAKLLNGDKDGSVEDMKKSLELNPKEEA GLNGEFKNLGPKPEALPGIF >gi|222159256|gb|ACAB01000103.1| GENE 2 1108 - 3012 1443 634 aa, chain - ## HITS:1 COG:CAC2687 KEGG:ns NR:ns ## COG: CAC2687 COG0514 # Protein_GI_number: 15895945 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Clostridium acetobutylicum # 7 424 8 420 714 296 39.0 7e-80 MNKYQEILKQYWGYDSFRDLQEEIITSIGEGKDTLGLMPTGGGKSITFQVPALAQEGICI VITPLIALMKDQVQNLRKRGIKALAVYSGMTRQEILTALENCIFGDYKFLYISPERLDTD IFRTKLRSMKVSMITVDESHCISQWGYDFRPAYLKIAEIRALLPGIPVLALTATATPEVV KDIQARLDFREENVFRMSFERKNLAYMVRQTDNKTQELLHILRKVPGSAIIYVRNRRRTK EITELLVNEDITADFYHAGLDNAVKDLRQKRWQSGEVRVMVATNAFGMGIDKPDVRIVLH LDLPDSLEAYFQEAGRAGRDGEKAYAVILYTKTDRTTLHRRVVDTFPDKEYILNVYEHLQ 
YYYQMAMGDGFQCVREFNLEEFCRKFKYFPVPVDSALKILTQAGYLEYTDEQDNASRILF TIRRDELYKLREMGTEAEALIQMILRSYTGVFTDYAYISEATLSVRTGLTREQIYNILVT LTKRRIVDYIPHKKTPYIIYTRERQELRFVHIPPAVYEERKARYEARIKAMEEYVISENV CRSRMLLRYFGEKNEHNCGQCDVCLSHRATDTLTGKSLEELKKKITELLAQKPHTPAEIA EKIEAEKERVSEVIQYLLEEGEWKMQDGMIHISK >gi|222159256|gb|ACAB01000103.1| GENE 3 3005 - 4723 1726 572 aa, chain - ## HITS:1 COG:lin1560_1 KEGG:ns NR:ns ## COG: lin1560_1 COG0608 # Protein_GI_number: 16800628 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-specific exonuclease # Organism: Listeria innocua # 5 568 8 561 562 361 37.0 2e-99 MNHKWNYRPITPEQAETSQTLAQELGISPILGQLLVQRGITKAADAKKFFRPQLPDLHDP FLMKDMDIAVERLNRAMGKKERILIYGDYDVDGTTAVALVYKFIQQFYSNLDYYIPDRYN EGYGISKKGVDYAAETGVGLIIVLDCGIKAVEEITYAKEKGIDFIICDHHVPDDILPPAV AILNAKRLDNTYPYTHLSGCGVGFKFMQAFAISNGIEFHHLIPLLDIVAVSIASDIVPIM GENRILAYHGLKQLNSNPSIGMKAIIDVCGLSEKEITVSDIVFKIGPRINASGRIQNGKE AVDLLTEKDFSAALEKAGQINQYNETRKDLDKSMTEEANNIVANLEGLADRRSIVLYNEE WHKGVIGIVASRLTEVYYRPAVVLTRTDDMATGSARSVSGFDVYKAIEHCRDLLENFGGH TYAAGLSMKVENVDAFTKRFEDYVSRHILPEQTSAVIEIDAEIDFRDISSKFFNDLKKFN PFGPDNTKPIFCTHHVYDYGTSKVVGRDQEHIKLELVDNKSNNVMNGIAFGQSSHVRYIK TKRSFDICYTIEENTHKRGEVQLQIEDIKPIE >gi|222159256|gb|ACAB01000103.1| GENE 4 5004 - 5567 501 187 aa, chain + ## HITS:1 COG:no KEGG:BT_2463 NR:ns ## KEGG: BT_2463 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 187 11 185 189 166 46.0 3e-40 MGRLDDTINTGMTFDNIYMEYYQRCFLFAKSYLHDEMLSKDIASEAMITLWTTMKTEDVK NIHAFLMTVVKNQALNHMRNEHLRMEARESILADELYELDFRIASLDSSDPNRLFSEEIT DIVNRTLNGLPEKTRKAFMMSRYENKSVKEIAEALNVTVKGADYHISKALQQLRKNLKDY LYTLLFF >gi|222159256|gb|ACAB01000103.1| GENE 5 5747 - 6682 697 311 aa, chain + ## HITS:1 COG:PA0150 KEGG:ns NR:ns ## COG: PA0150 COG3712 # Protein_GI_number: 15595348 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas 
aeruginosa # 89 297 120 318 331 67 28.0 3e-11 MDAKLLNKYIAGDALPEEKKEVIRWMKESEENREQLMQLHRVYNATIWNGNLQAEKTENK KPVMRYLWASIKIAAVVAMVAFIIHKEYQEYRIEHSAEMQIMTVPAGQRASLVLADGTIV WLNSNSTLKYPATGFHSKERKVILEGEGYFEVAHNEKHPFIVETEKYDIRVLGTTFNVSA YPNSGLFETSLIEGKVTVYQPDTQHEMTLKPHEKVEVKDGKLYKETFSSDNDFLWRMGIY SFKDEPLETVFRKLEQYYEVKIINKNEEIASRPCTGKFRQKEGIEHVMKVLQKYVKFNYI QDDEKNQIIIY >gi|222159256|gb|ACAB01000103.1| GENE 6 6933 - 10229 3049 1098 aa, chain + ## HITS:1 COG:no KEGG:Phep_1362 NR:ns ## KEGG: Phep_1362 # Name: not_defined # Def: TonB-dependent receptor # Organism: P.heparinus # Pathway: not_defined # 1 1098 39 1140 1140 1089 50.0 0 MRLTVFFLLFIILETYALNGYSQNQKVSMNKGTATLAEIIRQIEKQTDYLFIYNEREVSL KRKVSIFTQNGTVASLLLDALKGTEFSYTMEGKHIILTKKQAEALSPVQHKKRITGLVKE YSGEAIVGCNVSVKGTTIGTITDINGKYSLEVPDNATLVFSFLGYKSMEYPVKAQQTINV TLGEDSKALNEVVVVGYGTQRKALVTNAISSFKPSESNMRPVLTPSELLQGRVAGVTVST GSGNLGSSERMSIRGAASLSASNEPLYVVDGIPILNSNASLFNMGEDLSSMAVLNLTDIE SIEVLKDAASAAIYGSRATNGVVVITTKSGKEGRSDIRLNVSTGISKFANKGRIKYADSD LYVETYNDGVERYNRQNGYTVGSAGYVVPISNPFQGLPDTDWLDLITQVGHSYNVDLSFS GGSKKTKFYVGANYNYQEGIIKTNDITKINLKAKISHEMTSWLEVGANVSGNYLKNNRVP GANIGSTIVARAVEQRPFDRPYKPNGDYYLGGTDELARHNPLQILNEEVSYIDTYRYLGT FNADLKYKKFSLKNSVSTDIGYTYDYVYYNENHPYGAGGGRIVEYNRLVKNLLIENVFNY NDKFGDFEAGLMLGHSFQKMSTRTSSIDGRGFPSPVFDTVGTASEIYNASGGISEFAMES YFGRINLSYLDRYILNVTMRSDGSSRFAPSDRYGYFPSVSLGWNVSKESFWKFPQTDLKF RLSYGKTGNQDGIGNYAWQPLMSGGINYGNNSGMAVTSMGNNKLTWETADQYDFGFDLGF WNGKLNMIADIYLKNTNNLLYSMPLHGTSGFTSITSNIGSMRNYGVEFSINGHLNIGKVN WTSSFNISHNKNKLTKLLGDDLLPIGSNRALKVGEELGAFYLFQMDGLYQYDGEVPQPLY DLGVRAGDVKYHDADNNGIINDNDRVLTGSSNPDFFGGWNNTFKYKGFQLDVFFTYMYGN DVYAEWAVTATRPGYRMAITEDVAKNRWTAPGSSDKYPRAVNTLCGHNSKNSTRFLEDGS FIRLRSLTFSYTFPQVMLQKICLKGLRLYVQGDNLLLFSKYSGWDPEVSKNMDPQYFGVD LYGVPPSRSVNFGINLSF >gi|222159256|gb|ACAB01000103.1| GENE 7 10248 - 11513 1177 421 aa, chain + ## HITS:1 COG:no KEGG:Phep_1361 NR:ns ## KEGG: Phep_1361 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 16 421 18 424 426 441 
56.0 1e-122 MKKIITIFLGCLLLASCSGMLDIESHSAVSPGSVTPKDLSALRMGMYNKVQNSPARESYI TFDILGGDLTQSTGNARDLINSVLSSLNSIVANSWNGYYNALYQVNNVISIVEDLPESDL RNLIIGEAHYFRAYIYHSLVTRWGGVPIQRVNTMDKPFRDSEEAVWAFIEEELETALAFL GTSSSCYYVSRDAALALKARVMLERGKKTEAAALAEGLIKDGKYKLDSFDKIFRGKTNTE VIFSFQCLAEESNITISTLFYTYAHPNHGSYVYRPTNDVMNMYDDKDKRKEVSIINVGAE PCINKYPSGQTGTDPVVMSRLAEMYLISAEAQGLSKGLGRLNDLRKERGLDAVNPKDEDE FIEYILDERRKELLAEGFRYYDLIRTNKAKTMLGLKDYQLVLPIPGKEMISNPNLEPNPG Y >gi|222159256|gb|ACAB01000103.1| GENE 8 11551 - 12483 764 310 aa, chain + ## HITS:1 COG:no KEGG:Phep_1360 NR:ns ## KEGG: Phep_1360 # Name: not_defined # Def: exopolysaccharide biosynthesis protein # Organism: P.heparinus # Pathway: not_defined # 10 308 1 299 303 184 36.0 3e-45 MKKLIYDSALLLLGCLLWTGCNNDEDLTVYSTEGAKTELGQKIIVGSDGYVGQYFSDTTY TLAPGVKALEMEILSATGMAVKMFVLEVDLKDTHLTMKASSPKDEGKLKTKQQMTLQALA HDKQGSRVLAAVNGDFFATDGTPQGIYYRNGVCLKNTMTDNVCTFFAVTKGKKAVIGSYD EYDTYKDEIQEAVGGRVRLMTNGNVLPQTLTALEPRTAIGVTDNNVVYILVADGRNFWYS NGMRYAEMGAVMKALGAKDAINLDGGGSSTFIIRSKAGFEENRFAIRNWPYDNGGVERAV ANGLLVVTDN >gi|222159256|gb|ACAB01000103.1| GENE 9 12512 - 13939 1002 475 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_0602 NR:ns ## KEGG: Fjoh_0602 # Name: not_defined # Def: two component regulator # Organism: F.johnsoniae # Pathway: not_defined # 189 453 64 336 2491 79 26.0 3e-13 MIRNNRMYSCWQGLWGIICCLSLLLGMASCQDFTDDTGRTMPEPDLKFADGTLNLPLEEK EYTVDIESNLPWRVKTSATWIDLLSSNGMGTGSFRISVSKNANVAPREAEIAGWIIEGAE TKLKVVQEGVGIALKKRAVKVGAEGSAEEVIPFSTMVTYTYELSEGCDWIHVTDGAAITP GVINESELKLAIDPYTDIDEGRTASLYLKGSNGITDVLTITQDKKPLGDIDYLRMFYESA NGDNWTKKWNFDAPLETNPTNWFGLKFENKRVVEIDIQSPNNIEGDIIPLCNLSELRSIK FKHQKLAGIPEEIGQLSQLTTLWIIESAASGNLPESLGECELLTSFNISNHPTSTPAGFN NTFTGNLDMLINIPGMVTIKAYCNNLSGPLPVIPLDGNNKPTTWKSLKEFMIYTNGFSGS IPYGYGTVIEKSGSSGIFRVNDNQLSGQIPADIKAWSQYATRKAAWILQGNSLTE >gi|222159256|gb|ACAB01000103.1| GENE 10 13989 - 14843 513 284 aa, chain + ## HITS:1 COG:CAC2633 KEGG:ns NR:ns ## COG: CAC2633 COG4632 # Protein_GI_number: 15895891 # Func_class: G Carbohydrate transport and 
metabolism # Function: Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase # Organism: Clostridium acetobutylicum # 66 282 135 352 354 70 30.0 3e-12 MRKYFVILVLALCSVGTLKGQTASDSLAIVSAQWEIIHAQKGIIHKSASIPQLYQCPQVI NLIEIDPGKGMKAGIAISDGMKKTSRIASEHHALAAINGSYFDMKHGNSVCFLKTDRQVI DTTTINEFKLRVTGAVYERKGKMKLIPWNRQIEKKYKRKVGTILASGPLLLKDGRVCDWS LCGKDFVQNKHPRSAVCMTKDGKILLVTVNGRFPGRAEGVNIPELAHLLRILGGKDALNL DGGGSTTLWLSGAPENGVVNYPCDNKRFDHAGERGVPNIIYVYE >gi|222159256|gb|ACAB01000103.1| GENE 11 14901 - 15113 175 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|298480262|ref|ZP_06998460.1| ## NR: gi|298480262|ref|ZP_06998460.1| alkaline phosphatase [Bacteroides sp. D22] # 18 70 18 70 467 118 100.0 1e-25 MKKLLRCSLLGLLLLCSVVVNGAQVPKYIFLFIGDGMGFNHVEATQIYAEKVGTDTGECS LLFPTFPVMT >gi|222159256|gb|ACAB01000103.1| GENE 12 15210 - 16304 986 364 aa, chain + ## HITS:1 COG:PAB2366 KEGG:ns NR:ns ## COG: PAB2366 COG1785 # Protein_GI_number: 14521248 # Func_class: P Inorganic ion transport and metabolism # Function: Alkaline phosphatase # Organism: Pyrococcus abyssi # 20 362 120 423 495 135 30.0 1e-31 MDAEKNHGLKSLARQLKDKGYKIGIITSASIDHATPGGFYASQPDRFMYYEIGVDAANSG FDFFGGAGLLEPRSKRNLSAPCLYDLFNQKGYTMFRGMDAYNRAAAKDKILLFPTDTVSK SLKYAMDRSAKDLSLPDLTKACLANFQETAKKGFFMMVEGGKIDWAAHAHDGGAVVKETI DFDQCIRLAYDFYKKHPNETLILVTADHETGGLGLGNSDMNLNIDLLQYQKCSQEALTAA MREMKSGKMIPSWEDMKAFLKKNLGFWEQIKITPREELELLVCYEESFLKKKSKDVVSLY AKDEPLAVAAIALLDKKASLGWTTKTHTGAPVPLYAIGKQAVLYSGRRDNTDMANVLRKL FLIK >gi|222159256|gb|ACAB01000103.1| GENE 13 16529 - 16708 157 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237714413|ref|ZP_04544894.1| ## NR: gi|237714413|ref|ZP_04544894.1| predicted protein [Bacteroides sp. 
D1] # 1 59 1 59 59 96 100.0 6e-19 MFAVISQPHFKLLTSFTLFTLLRKEALVQTKTQVFSYVEERKKNMTGYRIETADIVSVD >gi|222159256|gb|ACAB01000103.1| GENE 14 16966 - 17397 400 143 aa, chain - ## HITS:1 COG:TVN0706 KEGG:ns NR:ns ## COG: TVN0706 COG0824 # Protein_GI_number: 13541537 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Thermoplasma volcanium # 12 102 11 101 133 72 35.0 2e-13 MEEIVFHHTLPIQLRFNDVDKFGHVNNTVYFSFYDLGKTEYFASVCPGVDWEKIGIVVVH IEADFVKQIFASDHIAVQTAVSKIGTKSFHLIQRVIDTETNEVKCICKSVMVTFDLERHE SMPLTEEWIEAICKYEERDLQKA >gi|222159256|gb|ACAB01000103.1| GENE 15 17712 - 18275 506 187 aa, chain + ## HITS:1 COG:AF0781 KEGG:ns NR:ns ## COG: AF0781 COG0009 # Protein_GI_number: 11498387 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Archaeoglobus fulgidus # 3 158 10 164 309 102 38.0 5e-22 MIEDIKKACQVMNEGGVILYPTDTVWGIGCDATNEEAVRRVYEIKKRADSKAMLVLVDSP VKVDFYVQDVPDVAWDLIEVADKPLTIIYSGARNLAPNLLAEDGSVGIRVTNEDFSRRLC QQFRKAIVSTSANVSGQPGAANFSEISDEIKSAVDYIVGFRQDDMNKPKPSSIIKLEKGG VIKIIRE >gi|222159256|gb|ACAB01000103.1| GENE 16 18276 - 19047 514 257 aa, chain + ## HITS:1 COG:TVN0094_1 KEGG:ns NR:ns ## COG: TVN0094_1 COG0038 # Protein_GI_number: 13540925 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Thermoplasma volcanium # 27 255 20 266 467 93 27.0 4e-19 MKGEKLSLLQRCIKWREANIKEKQFILILSFLVGIFTAFAALILKFFIHQIQNFLTNNFN ATEANYLYLVYPVVGIFLAGWFVRNIVKDDISHGVTKILYAISRRQGRIKRHNIWSSTIA SAITIGFGGSVGAEAPIVLTGSAIGSNLGSVFKMEHRTLMLLVGCGAAGAIAGIFKAPIA GLVFTLEVLMIDLTMSSLLPLLISAVTAATVSYIVTGTEAMFKFHLDQAFELERIPFVIL LGIFCGLISLYFTRAMN Prediction of potential genes in microbial genomes Time: Wed May 18 03:31:22 2011 Seq name: gi|222159255|gb|ACAB01000104.1| Bacteroides sp. 
D1 cont1.104, whole genome shotgun sequence Length of sequence - 56242 bp Number of predicted genes - 36, with homology - 36 Number of transcription units - 19, operones - 10 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 1021 857 ## COG0038 Chloride channel protein EriC 2 1 Op 2 1/0.000 + CDS 1056 - 2027 933 ## COG0223 Methionyl-tRNA formyltransferase 3 1 Op 3 . + CDS 2095 - 2745 806 ## COG0036 Pentose-5-phosphate-3-epimerase + Term 2791 - 2828 6.1 + Prom 2820 - 2879 5.7 4 2 Op 1 . + CDS 2991 - 4847 378 ## COG0658 Predicted membrane metal-binding protein 5 2 Op 2 . + CDS 4894 - 5931 909 ## COG0618 Exopolyphosphatase-related proteins + Prom 5969 - 6028 4.1 6 3 Op 1 . + CDS 6050 - 6745 606 ## BT_3949 hypothetical protein 7 3 Op 2 . + CDS 6783 - 8171 1884 ## COG1109 Phosphomannomutase + Term 8211 - 8270 5.3 + Prom 8800 - 8859 5.5 8 4 Tu 1 . + CDS 8987 - 12946 1906 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 13104 - 13160 3.6 + Prom 13021 - 13080 4.3 9 5 Op 1 . + CDS 13229 - 16366 2713 ## PRU_2708 putative receptor antigen RagA 10 5 Op 2 . + CDS 16394 - 18019 1378 ## PRU_2709 putative lipoprotein + Term 18054 - 18100 8.8 + Prom 18225 - 18284 10.7 11 6 Op 1 . + CDS 18327 - 20612 1918 ## COG3537 Putative alpha-1,2-mannosidase 12 6 Op 2 . + CDS 20646 - 22961 1826 ## COG3537 Putative alpha-1,2-mannosidase 13 6 Op 3 . + CDS 22975 - 23769 824 ## BT_3964 putative secretory protein 14 6 Op 4 . + CDS 23780 - 26137 1927 ## COG3537 Putative alpha-1,2-mannosidase + Prom 26150 - 26209 4.2 15 7 Tu 1 . + CDS 26241 - 28187 1587 ## BT_3294 putative alpha-glucosidase + Term 28237 - 28304 8.8 - Term 28431 - 28466 1.5 16 8 Op 1 9/0.000 - CDS 28595 - 29389 618 ## COG3279 Response regulator of the LytR/AlgR family 17 8 Op 2 . - CDS 29395 - 30462 464 ## COG3275 Putative regulator of cell autolysis 18 8 Op 3 11/0.000 - CDS 30533 - 31426 690 ## COG0845 Membrane-fusion protein 19 8 Op 4 . 
- CDS 31489 - 35814 3479 ## COG3696 Putative silver efflux pump + Prom 35946 - 36005 4.4 20 9 Tu 1 . + CDS 36025 - 36567 489 ## COG0386 Glutathione peroxidase + Term 36577 - 36621 -0.8 21 10 Op 1 . - CDS 36545 - 37063 428 ## COG1443 Isopentenyldiphosphate isomerase 22 10 Op 2 . - CDS 37108 - 38328 933 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) - TRNA 38386 - 38460 65.3 # Pro GGG 0 0 + Prom 38801 - 38860 8.3 23 11 Tu 1 . + CDS 38924 - 39793 376 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 39843 - 39890 7.3 - Term 39835 - 39872 4.0 24 12 Tu 1 . - CDS 39906 - 40967 1007 ## COG0337 3-dehydroquinate synthetase - Prom 41076 - 41135 3.5 + Prom 40954 - 41013 9.4 25 13 Op 1 . + CDS 41048 - 41452 252 ## BT_3976 hypothetical protein + Prom 41454 - 41513 1.8 26 13 Op 2 . + CDS 41536 - 44457 2281 ## BF0745 hypothetical protein + Term 44487 - 44539 14.5 + Prom 44504 - 44563 2.6 27 14 Tu 1 . + CDS 44590 - 46029 807 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes 28 15 Op 1 . - CDS 46054 - 46587 226 ## PROTEIN SUPPORTED gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 29 15 Op 2 . - CDS 46578 - 47138 428 ## BT_3980 hypothetical protein - Prom 47231 - 47290 2.6 - Term 47277 - 47312 5.4 30 16 Tu 1 . - CDS 47348 - 48034 908 ## BT_3981 hypothetical protein - Prom 48123 - 48182 8.2 + Prom 48029 - 48088 7.3 31 17 Tu 1 . + CDS 48130 - 49548 840 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member + Term 49569 - 49628 12.6 + Prom 49567 - 49626 6.6 32 18 Tu 1 . + CDS 49652 - 50788 589 ## COG1672 Predicted ATPase (AAA+ superfamily) + Term 50819 - 50877 8.0 - Term 50812 - 50859 7.4 33 19 Op 1 . - CDS 50945 - 52438 765 ## gi|237714450|ref|ZP_04544931.1| conserved hypothetical protein 34 19 Op 2 . - CDS 52476 - 53909 1037 ## BT_3987 endo-beta-N-acetylglucosaminidase F1 precursor 35 19 Op 3 . 
- CDS 53946 - 55118 984 ## BT_3986 putative patatin-like protein 36 19 Op 4 . - CDS 55129 - 56232 863 ## BT_3985 hypothetical protein Predicted protein(s) >gi|222159255|gb|ACAB01000104.1| GENE 1 2 - 1021 857 339 aa, chain + ## HITS:1 COG:RSp0020 KEGG:ns NR:ns ## COG: RSp0020 COG0038 # Protein_GI_number: 17548241 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Ralstonia solanacearum # 3 190 278 447 461 80 28.0 6e-15 VEGVFGKLNNPYKKLAFGGVMLSVLIFLFPPLYGEGYDTINLLLNGTSAEEWDTVMNNSM FYGYGNLLLVYLMLIILLKVFASSATNGGGGCGGIFAPSLYLGCIAGFVFSHFSNDFAFS AYLPEKNFALMGMAGVMSGVMHAPLTGVFLIAELTGGYDLFLPLMIVSVSSYLTIIAFEP HSIYSMRLAKKGQLLTHHKDKAVLTLMKMENVVEKDFVVVHPEMDLGELVKAIAASHRNV FPVTDKKTGELLGIVLLDDIRNIMFRQELYHRFTVNKLMISAPAKIFDTDGMEQVMQTFD DTKAWNLPVVDEEGRYQGFVSKSKIFNSYRQVLVHFSED >gi|222159255|gb|ACAB01000104.1| GENE 2 1056 - 2027 933 323 aa, chain + ## HITS:1 COG:BH2508 KEGG:ns NR:ns ## COG: BH2508 COG0223 # Protein_GI_number: 15615071 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA formyltransferase # Organism: Bacillus halodurans # 6 313 1 300 317 222 39.0 6e-58 MKKEDLRIVYMGTPDFAVEALRQLVEGGYNVVGVITMPDKPAGRGHKIQYSPVKQYALEQ NLPLLQPEKLKDEAFVEALREWKADLQIVVAFRMLPEVVWNMPRLGTFNLHASLLPQYRG AAPINWAVINGDTETGITTFFLKHEIDTGEVIQQVRVPIADTDNVEVVHDKLMMLGGKLV LETVDAILNGTVKPIPQEDMAVVGELRPAPKIFKETCRIDWNQPVKKIYDFIRGLSPYPA AWSELIASEKAESVVVKIFESEKVYESHQLVAGTIVTDGKKFMKVAVPDGFVSILSLQLP GKKRLKIDELLRGYHLEDGCKMK >gi|222159255|gb|ACAB01000104.1| GENE 3 2095 - 2745 806 216 aa, chain + ## HITS:1 COG:lin2808 KEGG:ns NR:ns ## COG: lin2808 COG0036 # Protein_GI_number: 16801869 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Listeria innocua # 5 215 4 214 214 208 47.0 5e-54 MKPIIAPSILSADFGYLAKDIEMVNRSEAEWVHIDIMDGVFVPNISFGFPVLKYVAKLSK KPLDVHLMIVNPEKFIPEVKALGAHTMNVHYEACPHLHRVIQQIREAGMQPAVTINPATP VALLQDIIRDVYMVLIMSVNPGFGGQKFIEHSVEKVRELRALIERTGSKALIEVDGGVNL 
ETGARLVEAGADALVAGNAVFGAQDPVEMIRQLHEL >gi|222159255|gb|ACAB01000104.1| GENE 4 2991 - 4847 378 618 aa, chain + ## HITS:1 COG:BS_comEC_1 KEGG:ns NR:ns ## COG: BS_comEC_1 COG0658 # Protein_GI_number: 16079611 # Func_class: R General function prediction only # Function: Predicted membrane metal-binding protein # Organism: Bacillus subtilis # 93 396 149 431 469 117 30.0 7e-26 MTGQLQQAVCSFPKEETVYRVLITDAAQPKEHTYLCQALLKERRDTAGIYPIGHTAVLYL QQDSSASRLKSGDELLISARISPPLNNRNFDEFDYARFLMRKGISGTGYVASGKWIKCDG MNNFDLKSVASSCRRNVVSLYQELGFNGDELAVLSALTIGDKTELSDSVRESYSVAGASH ILALSGLHIGLLYTMLFFILKPIARRGNIGRCVRSVFLLVLLWTFAFFTGLSPSVVRSVS MFSILAMADMVGREPLSLNTLAVAAWLMLFCNPAWLFDVGFQLSFLAVASILLIQKPIYH LITVKSRIGKHVWGLMSVSVAAQIGTAPLVLFYFSRFSVHFLLTNLVVIPLITIILYAAV VMLLLTPFSWLQIGVAGGVKKLLEGLNFFVRWVERLPCASIDGIWLYQSEVLGIYIVIAL LTYYFMNRRYRNLQICLFSILLMGTYHATLYWLDCPQTSLVFYNVRGCPAVHCIKSDGQS WINYMDTLSNEKRLKHMTANYWKRHHLLPPQEITADCRHAELNRQQQIISYHGCRVCVIN DNRWRNKSTVSPLYIQYLYLCKGYDGHLEELAQVFSFSYVILDASLSEYRKHLLESECKK SGLRFISLSDEGSVRFLL >gi|222159255|gb|ACAB01000104.1| GENE 5 4894 - 5931 909 345 aa, chain + ## HITS:1 COG:aq_1630 KEGG:ns NR:ns ## COG: aq_1630 COG0618 # Protein_GI_number: 15606737 # Func_class: R General function prediction only # Function: Exopolyphosphatase-related proteins # Organism: Aquifex aeolicus # 24 338 19 319 325 133 31.0 5e-31 MLTKVIEQAKIDHFTKWFERADKIVIVSHVSPDGDAIGSSLGLYHFLDSQEKTVNVIVPN AFPDFLRWMPGSKDILLYDRYKDFADKLIAEADVICCLDFNALKRIDDMADAVAASPARK IMIDHHLYPEDFCKIVMSYPKISSTSELIFRLICRMGYFSDITKEGAECIYTGMMTDTGG FTYNSNNREIYFIISELLSKGIDKDDIYRKVYNTYSESRLRLMGYVLSNMTVYPDCNSAL ITLTKAEQSKFNYIKGDSEGFVNIPLSIKNVCFSCFLREDTEKPMIKISLRSVGTFPCNQ LAAEFFNGGGHLNASGGEFFGTMEEAKAVFEKALEKYKPLLTAKS >gi|222159255|gb|ACAB01000104.1| GENE 6 6050 - 6745 606 231 aa, chain + ## HITS:1 COG:no KEGG:BT_3949 NR:ns ## KEGG: BT_3949 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 229 1 212 214 273 63.0 4e-72 MKKLVFLFLSLLTAGSLFQACDNSKTYAEMLEDEKNAVNKFIKDNDIRVISLEEFERDTV 
TASKEAGDGYDEYVAFSNGVYMQIVDRGGKEEGENGVEFINEVDTFANNNVICTRYVEKD MMTGEVTCFNVPLEEWMDAPDYYKFPLTFRYVQNASTVYGIVLSGSLEYDLLWNSNGYGT AIPSGWLIALPYLRNNAHVRLIVPSKMGHTTAQQYVNPYFYDIWKFEKAKS >gi|222159255|gb|ACAB01000104.1| GENE 7 6783 - 8171 1884 462 aa, chain + ## HITS:1 COG:PH0923 KEGG:ns NR:ns ## COG: PH0923 COG1109 # Protein_GI_number: 14590777 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Pyrococcus horikoshii # 12 448 6 441 455 249 37.0 8e-66 MTLIKSISGIRGTIGGGAGEGLNPLDIVKFTSAYATLIRKTCKAKSNKIVVGRDARISGE MVKNVVVGTLMGMGWDVVDIDLASTPTTELAVTMEGACGGIILTASHNPKQWNALKLLNE HGEFLNAAEGNEVLRIAEAEEFDYADVDHLGSYRKDLTYNQKHIDSVLALDLVDVEAIKK ADFRVAIDCVNSVGGIILPELLERLGVKHVEKLYCEPTGNFQHNPEPLEKNLGDIMNLMK GGKADVAFVVDPDVDRLAMICENGVMYGEEYTLVTVADYVLKHTPGNTVSNLSSTRALRD VTRKYGMEYSASAVGEVNVVTKMKATNAVIGGEGNGGVIYPASHYGRDALVGIALFLSHL AHEGKKVSELRATYPPYFIAKNRVDLTPEIDVDAILAKVKEIYKNEEINDIDGVKIDFAD KWVHLRKSNTEPIIRVYSEASTMEAAEEIGQKIMDVINELAK >gi|222159255|gb|ACAB01000104.1| GENE 8 8987 - 12946 1906 1319 aa, chain + ## HITS:1 COG:STM3163 KEGG:ns NR:ns ## COG: STM3163 COG2207 # Protein_GI_number: 16766463 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Salmonella typhimurium LT2 # 1183 1309 164 292 305 70 34.0 2e-11 MINTMKMRLIFPFVLLIYSLICQADGGKDSYIFRKVDYQQGLSNSAVLCLFQDNTGLMWF GTYDGVNCYDGRSMEVFRSDFSAPKALSNNVIHSIQQADNNCLWISTHLGINRLSQNSRQ VVGYYDFTDDYYLHSNSKGNTWVVSHGGIFYYNTSYKRFVQIKNLKVPVEDMDKRAFVTD DGVLWIFTQQTGELLQVSQDAFDCDTLSIHSTVSSTDFHAKPIEDVFYQNGVLCFIDSEH DLYVYDISRQSKIYIRNLSSLVQKNGTIAGIALFYEDIIIGFRTNGLVRLRTSQKYKEEV VDRNVRIYSIYRDPHQNVLWVASDGQGTIMYAKKYSIATNLMLNQLSSNLSRQVRSVMTD DSGGLWFGTKGDGLLHIPDYRENEEASAVTVYSPEGKQNVVSYIRWNKEFPVYKLVQSRY MDGFWIGSGDPGLFYYSFEDKALHSVENLPAQPTEIHGIYEENDSVLYVVTAGSGFHKLI LEKQAGTIRFKSQKSYHFFHGQREITMFYPMLPEGDSILWLGSREKGLVRFDKRTEEYKV ISLKEMLHKSVDDVLSLYRTKEGLLYVGTTSGLVCLNSNGKQMKATYIGREQGLLNDMIH GVLEDENGLLWLGTNRGLIKYNPINGSSHAYFYSAGVQIGEFSDDAYYMCPYTQELFFGG 
IDGLLYLDKEVQAAPEFYPDILLRKLTVGHTQVVQGDGDYYTDDGKALQLKGTEVSFALS FVVPDFLSGEDIEYSYQLEGYDKDWTSFSSINEASYTGVPAGDYIFKVRYKRDVFDTEYR HFSIPVYILSPWYRSVAAYFVYLIIFLLLLGYVIYLLRKNYLQERMMKTLMGTESCRKSE TVYTNRRMLEDFTLIYNYCDQLRAENLSYEQCLEKVSLIRETVMSALLNPDALFLEELKQ FFPDRFIVSARMSIQGVSQEVLRTLEEQGIDHSSISSMIPEHLTFPVYKNALYSILYCCY LRITEMKGTHGVIVEMSEQDGKMQLHFSSKDATAKALYEYLSDKASSVTEKDSDHVFVVH LLLGFVRSALERIHAVLRYDDDESGSRLTIIFEPAVLTVAGEQGKKTVLLLEDRDEMTWL ISNFLADEYVVHQVKSVQLAFDEIRRSAPALLLVDMTMYANAESTFMEYVSRNRTLLSKT AFIPLLTWKVGSAIQRELILWSDSYIVLPYDILFLREVVHNAIYGKREAKQIYMEELGDL AGQIVCTTTEQADFIRKLLKVIEENLDKEELGSTLIADRMAMSSRQFYRKFKEISNTAPG DLIKSYRMEKAARLLLDEELSIQDVIMEVGISSRSYFYKEFTRRFGMTPKDYREQRKVC >gi|222159255|gb|ACAB01000104.1| GENE 9 13229 - 16366 2713 1045 aa, chain + ## HITS:1 COG:no KEGG:PRU_2708 NR:ns ## KEGG: PRU_2708 # Name: not_defined # Def: putative receptor antigen RagA # Organism: P.ruminicola # Pathway: not_defined # 22 1045 46 1069 1069 1544 74.0 0 MNEKLKYLTLFVIGMMLSLGMNAQSVKSISGTVTDDQGEAVISGTVKVKNGSTGTITDIN GKYTLSVPSNATLVFSYIGYITQEFKVSELKSNVLNVVLQSDTKALDEVVVVGYGTMRKS DLTGSISTAKGKDMLKAQSFNALDGLKGKVAGVNIFSNTGQPGGESRVIIRGISTINASA SPLYVVDGVVMSNFELLNPNDIESIEVLKDASSAAIYGARGANGVIMVTTKRGNAGKGVH VSYDGSLSIGKMARKIDVMDANEWMSTFKQGLENANEFQGKNFTTDLSQIFTDERLFNAD GSPKYNTNWQDEASRTAISHNHQISIQRTGEGSSIGAFINYTDQQGILLNSYFKRINAKL AYDDKPTDWLSTSANLLVNHTWGNRTSDNPYGQGALRTMIEQLPFLPVKLDGEYTQTNII NTSSILNNQTDPNSGKQGFSPEGVGNPVELLERMQAMQYRTQIFGNAALTFHLMKGLDLK TQFGIDYHNNRDANYTPFTPRPMINQSSEGAASANNSNSLYWQEETYLTYVKDINKHHIN AMAGMSWQEYNYTKFEASDSKYIDDFYGYYNLGSGTNRPSVGNDYDKWAMNSYFLRLAYS YDNKYMATVTGRYDGSSKFGQNNKYAFFPSVGLGWMISNEDFLKDNSLISKLKLHTSYGL TGNSEIGTYKSLATVSQSNTIIGDALHVVSYLDNMPNPDLKWEKTGQWDLGFELGLFNNR LNFDISYYYKYTSDLLLDRPVPESTGYSSIMDNIGAVSNRGVDILVTAYPIQTHDFQWTS TLNLGFNKNRVEKLDESASVDPVTGKRQITTDGFVGYDMLIREGEELSSFYGYKRAGIYD GIPSNWDPETMNIPSTIGEKVTYKKREIIGNGLPDWMGSFINTFNYKGFDLTLDFQFTWG VDVMQEYYHSTVARFLTNGIDRLYKEAWHPTLNPSGKEQAIRLNNFGQGANNQADDDWVC NGSYLRCNMIQLGYTFNPTLIKKIGLSSLRLYANVNNAFLITSKDYNGYDPDNSSRLGDN 
KWGANRQFFTYPRPRTFTFGLNVAF >gi|222159255|gb|ACAB01000104.1| GENE 10 16394 - 18019 1378 541 aa, chain + ## HITS:1 COG:no KEGG:PRU_2709 NR:ns ## KEGG: PRU_2709 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 1 540 1 536 537 675 62.0 0 MKLLKVYISAFCMCSLSVLLNSCDDFLKEEPLDMKSSDQFWKTKADAESGVNALYFGGVP YLHNTDVGGGWTPKATMWGGIVSGLYVDKRKDRTFTTASEGCNFNIESFDDIAMKYWHEF YKGISRANFVIANIPTMTGVLDEATINNYVAQGKFFRAYGYFWLVKEFGGVPYISEPYTS TEGMYKERLSAEEVYKKIEADLLDIVNGDALPNKTFYDNGCYVTRAMAQTLLAQVYLQWA GAPLNGGTDYYGKAAQMALKVINDSPHELIHPNGTTDDLSSAYNVIKTTKSSAEIIYAKE YNYSDYSVGNSYACRSIGTDAFQWGVFHPGGDVLYNAYLPCDMLLNSYASNDIRGHEKQF FFKKYTDAAGKTYTLNNAGNWAWFDENALISGHDGDYNMPVFRFAEVLLIAAEGLARTNH EDNAVGGAKYYLNQVRKRAGLADETATGNDLIQSILTERLHELPLEFKIWDDIRRTRLYP EASTESGKLKWTALASAQIQNKPDGSTRAGAIPEYALLWPIPLDEIQANPALEGHQNPGW N >gi|222159255|gb|ACAB01000104.1| GENE 11 18327 - 20612 1918 761 aa, chain + ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 27 759 3 717 717 437 33.0 1e-122 MNKKKLLSFAIALLLGVSSTFAQKQPVDYVNPLMGTDSKISLSNGNTYPAIALPWGMNFW MPQTGKMGDGWAYTYASDKIRGFKQTHQPSPWINDYGQFSIMPMTKQLKIDQDSRASWFS HKAEKATPYYYSVYLSEYDMTTEIAPTERCAYFRFTFPETSDAYVVVDAFDRGSYVKVIP EENKIVGYTTRNSGGVPQNFKNYFVIEFDKPFTFNKVWADYHLVETHLELQSNHVGAAIG FSTKKGEQVHAKVASSFISPEQAELNLKEIGNKTFEQTKEAGRKAWNNVLGRIKVEDSDE NRMRTFYSCLYRSVLFPRMFYEVNGKGETIHYSPYNGEIRSGYMFTDTGFWDTFRCLFPF VNLIYPSMGEKMQEGLLNTYLESGFFPEWASPGHRGCMVGNNSASVVADAFMKNVTKADA EKMYEGLLKGANSVHPKVSTTGRRGYEYYNKLGYVPYDVKINENAARTLEYAYDDWCIYR MGEKLGRPAEELDVYKKRSQNYRNLFDPETKLMRGKNSDGTFQTPFNPFKWGDAFTEGNS WHYTWSVFHDVQGLVDLMGGKKMFVSMLDSVFNLPPVFDDSYYGGVIHEIREMEIANMGN YAHGNQPIQHMIYLYNYAGEPWKAQYWLRETMNRLYLPTPDGYCGDEDNGQTSAWYVFTA LGFYPVCPGSNEYVMGAPYFKKATITLENGKKLEISAPKNNDDNRYIRSLNYNGKNYTKN YLNHFDLLKGGRLVFDMDNKPNKGRGINESDFPYSFSRDAK >gi|222159255|gb|ACAB01000104.1| GENE 12 20646 - 22961 1826 771 
aa, chain + ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 19 766 49 781 790 546 41.0 1e-155 MMTMAAATLSAIAQQPVDYVNPIIGTNGMGHTFPGACTPFGWVQLSPDTDTIPHNVNGAY QKNAYEYCAGYQYRDKTIVGFSHTHLSGTGHSDLGDILLMPAVGDVKLNPGRADYPEEGY RSRFDHATEKAVPGYYEVILDDYGIKAQLTATQRTGIHKYTFPKGKDGHLILDLVHGIYN YDGKVLWANLRVENDTLLTGYRITNGWARTNYTYFAISLSQPIKDYGYKDKEKVLYNGFW RRFKMEKNFPEITGRKIVAYFNFDTANNSELVVKVALSAVSTEGAIKNLRAEASGKSFEQ LAEAARTDWNSELEHFEIEGTPDQKAMFYTSLYHTMINPSVYMDVDGSYRGLDHNIHRAE GFTNYTIFSLWDTYRAEHPFLNLVKPGRNADMVESMIKHEQQSVHGMLPIWSLMGNENWC MSGYHAVSVLADAITKGVFSNVDEALAAMVSTSTVPYYEGIADYMKLGYIPLDKSGTAAS STLEYAYDDWTIYQTALKAGNKEIAETYRKRALNYRTIYDTSIGFARPRYSDGSFKKEFD VLQTYGEGFIEGNSWNFSFHVPHDVFGMIDLMGGEGTFVQKLDELFSMHLPEKYYEHNED ITEDCLVGGYVHGNEPSHHVPYLYAWTSQPWKSQYWLREILNKMYKNDINGLGGNDDCGQ MSAWYLFSVMGFYPVCPGTDQYVLGAPYLPYLKMTLPNGKTLEIKAPGVSDKKRYVQSLK LDGKVYDKMYITHEDILKGGVLEFKMGASPNKRRGLSPEDKPYSLTNGINQ >gi|222159255|gb|ACAB01000104.1| GENE 13 22975 - 23769 824 264 aa, chain + ## HITS:1 COG:no KEGG:BT_3964 NR:ns ## KEGG: BT_3964 # Name: not_defined # Def: putative secretory protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 263 1 263 264 496 90.0 1e-139 MKKIHLIYAACLLVGMGACAASVQKQVKDNFDVWKEYNTGAILFEDKAPETLGSDIYHRI IPDAESYIKEQARTVLATLYNSPEDSIPAVHKIHYTLEDINGISAKGGGNGDVTIFYSTR HIEKSFAANDTAKLFFETRGVLLHELTHAYQLEPQGIGSYGTNRVFWAFIEGMADAVRVA NGGFDGPNARPKGGNYMDGYRTAGYFFVWLRDNKDPEFLRKFNRSTLEVVPWSFDGAIKH ILGNEYSIDELWHEYQVAVGDIQA >gi|222159255|gb|ACAB01000104.1| GENE 14 23780 - 26137 1927 785 aa, chain + ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 30 765 46 781 790 511 37.0 1e-144 MVSVMGKRGLFLVWLCLVNILPGMAQTEKLIDYVNPFVGTDGYGNVYPGAQIPFGGIQMS 
PDTDSKYYDAASGYKYNHSTLLGFSLTHLSGTGIPDLGDFLFIPGTGEMKLDPGTREEPE KGYRSRYSHEKEWASPNYYAVELSDYGVKAEMTSGVRSGMFRFTYPQSDKAFIMIDMNHT LWQSCEWANLRMENDSTITGYKLVKGWGPERHIYFTATFSKKLTGLRFMQNKKPVIYNTS RFRSSYEAWGKNLMACISFDTKAGEEVIVKTAISSVSTNGAKNNMAELAELAFDDLKAKG EALWEKELGKYTLTADRKTKRTFYTSAYHAALHPFIFQDADGQFRGLDKNIEKAEGFTNY TVFSLWDTYRALHPWFNLVQQEVNADIANSMLAHYDKSVEKMLPIWSFYGNETWCMIGYH AVSVLADMIVKGVKGFDYERAYEAMKTTAMNPNYDCLPEYRMMGYVPFDKEAESVSKTLE YAYDDYCMAQAAKALGKEDDYRYFLNRALSYQTLIDPETKYMRGKDSQGNWRTPFTPVAY QGPGSVNGWGDITEGFTVQYTWYVPQDVQGYMNEAGEDWFRNRLDELFTVELPDDIPGAH DIQGRIGAYWHGNEPCHHVAYLYNYLKEPWKCQKWVRTIAERFYGDTPDALSGNDDCGQM SAWYMFNCIGFYPVAPSSNMYNIGSPCVEAITVRMSNGKVIEMVADNWSPKNVYVKELYV NGKKYDKSYLKYEDIRDGVKLRFVMSNKPNYKRAVSDEAVAPSLSLPGKTMKYQANLMKT CISNQ >gi|222159255|gb|ACAB01000104.1| GENE 15 26241 - 28187 1587 648 aa, chain + ## HITS:1 COG:no KEGG:BT_3294 NR:ns ## KEGG: BT_3294 # Name: not_defined # Def: putative alpha-glucosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 648 1 650 650 1076 76.0 0 MRRYFLLVVCLCLASLLRAQNKLELISPNGELKVSLNLSDKIYYSIDYNGDVLLKDNSLQ LTLRNQVLGQNPKLRRQKRTSVDEQLTPIVPLKYAKVNNRYNQLLLTFKDYSVEFRAFDD GVAYRFITSQKGDVEVMNEECAINFPSDYLLHLQQPGGFHTAYEEPYTHVQSNAWKPEER IAVLPVLIDTQKDYKILISESDLADYPCMFLKGTGTNGAISVFPKAPLAFAENSDRSVKI TQEADYIAKTKGTRNYPWRYFVISKNDKQLIENTMTYKLAEKNQLQNVSWIKPGQVSWEW WNDASPYGPDVNFVSGYNLDTYKYYIDFASKFGIPYIIMDEGWAKSTRDPYTPNPKVDLH ELIRYGKEKNVEIVLWLTWLTVENNFDLFKTFNEWGVKGLKIDFMDRSDQWMVNFYERVA REAAKHHLFVDFHGSFKPAGLEYKYPNVLSYEGVRGMEQMGGCYPDNSLYLPFMRNAVGP MDYTPGAMISMQPNVYRSERPNAASIGTRAYQLALFVVFESGLQMLADNPTLYYRNEDCT RFITQVPVTWDETVALEAKAGEYVVVAKRKGDKWFIGGMTNNGEKEREFTIKLDFLNKDR SYQMISFEDGINAGRQAMDYRCKSAQVKAGEQLTIKMVRNGGFAAVIE >gi|222159255|gb|ACAB01000104.1| GENE 16 28595 - 29389 618 264 aa, chain - ## HITS:1 COG:SA0251 KEGG:ns NR:ns ## COG: SA0251 COG3279 # Protein_GI_number: 15925964 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Staphylococcus aureus N315 # 1 256 1 240 
246 89 28.0 8e-18 MKIVIIEDEKAAVRNLTSLLNEIKPNGEIVTILDSISSTIEWFSSHPMPELVFMDIHLAD GSAFEIFNHINITCPIIFTTAYDEYALRAFKVNSIDYLLKPIGKEDIEHAFDKLQGLQDA ADKHQFPFPQQENELLKLIHSLQKQENYKSHFLIPTKGDKLLPVSVDMIQLFYIKDCQVK VVLTDETEYCFSQTLDELTECLNPSLFFRANRQYLISREAIKDIDLWFNSRLSINLYYSG IKEKILVSKARVAEFKEWFSKRKR >gi|222159255|gb|ACAB01000104.1| GENE 17 29395 - 30462 464 355 aa, chain - ## HITS:1 COG:SA0250 KEGG:ns NR:ns ## COG: SA0250 COG3275 # Protein_GI_number: 15925963 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Staphylococcus aureus N315 # 148 331 358 557 584 101 33.0 2e-21 MIAKAIKKDKYILSTVIISLAVAVLIHFPESVSLFDRFESHSLFPGMKFIDVANEILFTF LSLLLLFAINTRLFHFNQASIKITGTKILLSFIVTWILSNLSGQFFVFLHRTFDIPAIDA MVHHYLHPLRDFIVACLVTSSCCIIHLIFKQQLVLIENEQLQAENLRNQYEVLKNQLNPH MLFNSLNTLRSLVRENQDKAQDYIQELSRVLRYTLQSNESQCVSLREEMEFVSAYIFLLK MRFENNLQFDIQISNAYEEYLLPPMAVQVLIENAVKHNEISNRKPLTIHITTDDNGNLSI SNDIQPKWTATSGTGIGLANLAKRYRLLFKRDIQITEDREFTVCIPLIEKSQLEQ >gi|222159255|gb|ACAB01000104.1| GENE 18 30533 - 31426 690 297 aa, chain - ## HITS:1 COG:RSp0529 KEGG:ns NR:ns ## COG: RSp0529 COG0845 # Protein_GI_number: 17548750 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Ralstonia solanacearum # 31 297 5 267 344 68 24.0 1e-11 MKKIVYPILLLVLTACKGNQQTTNNESAAIPVAETIADTIAQPEVDAITSATSKPNQVSF NGRIVLPPQRQATIALTMGGIVKKASLLPGQWVAANSVIATLENPEFITLQQTYLDSHAQ TEYLLAEYERQKNLSAEQAASQKKFQQSKADFLSMKSRQDAAAAQLSLLGVQTEALLKNG IQPLLEVKAPISGYAANVEMNIGKYINPGEALCELIDKSSPMLCLTTYEKDLADIQVGSP VQFRVNGMGTQTFHGIVISIGQKVDETNRSLEVYASIKEENVQFRPGMYVTAHIQKQ >gi|222159255|gb|ACAB01000104.1| GENE 19 31489 - 35814 3479 1441 aa, chain - ## HITS:1 COG:PA2520 KEGG:ns NR:ns ## COG: PA2520 COG3696 # Protein_GI_number: 15597716 # Func_class: P Inorganic ion transport and metabolism # Function: Putative silver efflux pump # Organism: Pseudomonas aeruginosa # 1 1031 1 1034 1051 803 43.0 0 MFTAIVRFSIKKKLFVGLTTLFLFIGGIYAMLTLPIDAVPDITNNQVQIVTVSPTLAPQE 
VEQLITMPIEIAMSNIMNVEDIRSVSRFGLSVVTVVFKESVPTLDARQLINEQIQSVTSE IPSELGTPEMMPITTGLGEIYQYILKVAPGYEKKYDAMELRTIQDWMVKRQLSGIPGIVE INSFGGYLKQYEVAVDPDALFSLNITIGEVFEALSSNNQNTGGSYIEKIKNAYYIRSEGM ITRIKDIEQIVVTNRNGIPVHISDVGVVRFGSPKRFGAMTKDGEGECVGGIAMMLKGANA NVVTQELEKRVEKIQQLLPEGVSIEPYLNRSELVNRNISTVVNNLIEGAVIVFLVLILFL GNLRAGLIVASVIPLAMLFAFIMMRVFNVTANLMSLGAIDFGIVVDGSIVILEGILAHIY GKQFRGRTLTRKEMDKEVEKGASGVARSATFAVLIILIVFFPILTLNGIEGKYFTPMAKT LVFCIIGALILSLTYVPMMASLFLKHTIVVKPTLADRFFEKLNKTYQHTLRACLRHKWRT VIVAFSALIGSLFLFTRLGAEFIPTLDEGDFAMQMTLPAGSSLTESIEVSRLAEKALMDK FPEIKHVVAKIGTAEVPTDPMAVEDADVMIIMKPFKEWTSAGSRAEMVDKMKDALAPLSE PAEFNFSQPIQLRFNELMTGAKADIAIKLYGEDTHELYERAKEAATFVEKVPGAADVIVE QTMGLPQLVVKYNRGKIARYGINIQELNTIIRTAYAGESTGVIFENERRFDLVVRLDQEK VADLNLDKLFVRTSEGIQIPVGEVASIDLVSGPLQINRDATKRRIVIGVNVRDADIQQVV SNIQETLNKNIKLKPGYYFEYGGQFENLQNAINTLLVVIPVALMLILLILFFAFKNITYT LMVFSTVPLSLIGGIAALWLRGLPFSISAGVGFIALFGVAVLNGILMVNHFNELRKQNTY SMTTNQIIKRGTPHLLRPVFLTGLVASLGFVPMAIATSAGAEVQRPLATVVIGGLIISTV LTLLIIPVFYKIVNSFATWRRPGIKLRRPFGGLCLLFLFLPAFVSAQQTQTVSLEQAIEI AKQNHPRLKIANNAIQQAKATRGEVIEATPTSFSYSWGQLNGENKKDKELTFEQSLGSLL TPFYKNALVNRQVKTNTYYRQMVEKEITAEVKRAWAYYQYATNLCSMYRDQDKMAEELQR IGELRYQQGEITLLEKNMMTTIAADLHNRWFQAKEEEKMALARFQWSCYSNDPIVPVDST LSLFYTGISDGNLSGAHTAYFQSQADEAKAMLRVEQSHFFPEISVGYTRQDILPLKNLNA WMIGVSFPIYFLPQKSKVKQARLTATSAQIQADANIRELRNKTLELEASLRRYNESLRYY TSSALKEADELTKAANLQLQQSETGIAEYIQSITTAREIRRGYIETVYQYNIAALEYELF K >gi|222159255|gb|ACAB01000104.1| GENE 20 36025 - 36567 489 180 aa, chain + ## HITS:1 COG:PA0838 KEGG:ns NR:ns ## COG: PA0838 COG0386 # Protein_GI_number: 15596035 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione peroxidase # Organism: Pseudomonas aeruginosa # 23 179 3 157 160 165 49.0 4e-41 MKTFVLMMVSLLFAISSEAQNKSFYDFTVKTIDGKDFPLSSLKGKKVLVVNVASKCGLTP QYAQLEKLYEKYKEKDFVVIGFPANNFMGQEPGSNEEIAKFCSLNYDVTFPMMAKIPVKG KDMAPLYQWLTEKKQNGKENAPVQWNFQKFMIDENGHWAGVVSPKESPFSEKIITWIEKE >gi|222159255|gb|ACAB01000104.1| GENE 21 36545 - 37063 428 172 aa, 
chain - ## HITS:1 COG:MT1787 KEGG:ns NR:ns ## COG: MT1787 COG1443 # Protein_GI_number: 15841209 # Func_class: I Lipid transport and metabolism # Function: Isopentenyldiphosphate isomerase # Organism: Mycobacterium tuberculosis CDC1551 # 8 155 12 166 203 62 25.0 4e-10 MPSDNNQEMFPIVDEQGNIIGAATRGECHSGSKLLHPVVHLHVFNAQGDIYLQKRPEWKD IQPGKWDTAVGGHIDLGESVEIALKREVREELGITDFTPELLTSYVFESSREKELVFVHK TVYEEEIHPSDELDGGRFWKIEEIKENLGKGIFTPNFEEELKKVSLIPSLSK >gi|222159255|gb|ACAB01000104.1| GENE 22 37108 - 38328 933 406 aa, chain - ## HITS:1 COG:HI0245 KEGG:ns NR:ns ## COG: HI0245 COG0809 # Protein_GI_number: 16272205 # Func_class: J Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Haemophilus influenzae # 8 404 1 353 363 226 33.0 9e-59 MKEDPRHIHISEYSYPLPDERIAKFPLPTRDQSKLLIYRRGEISEDVFTSLPEYLLQGSL MIFNNTKVIQARLHFRKETGALIEVFCLEPIQPNDYVLNFQQTAHAAWLCMIGNLKKWKD GQLKREMTVKGFPITLTATRGECKGTSHWVDFAWDNPEVTFADILEVFGELPIPPYLNRD TEESDKETYQTVYSKIKGSVAAPTAGLHFTPRVLDALQEKGIDLEELTLHVGAGTFKPVK SEEIEGHEMHTEYISVNRSTIKKLIDHDGCAIAVGTTSVRTLESLYHIGVTLAENPYATE EELRVKQWQPYEKYDQIPPVVALQKILGYLDRNGLETLHTSTQIIIAPGYNYKIVKAMVT NFHQPQSTLLLLVSAFVKGNWRTIYDYALAHDFRFLSYGDSSLLIP >gi|222159255|gb|ACAB01000104.1| GENE 23 38924 - 39793 376 289 aa, chain + ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 178 283 180 285 288 62 31.0 1e-09 MLDIKNRETLEIYVLGDDFKFYQNLKYLPLTSYPSNNQSALIIYCLGGKAKITVHEDVHW IQPEELIILLPGQFVSFSEPSEDFSTVTMVISTSLFSDALSGVPRFSPHFFFYMRTHYWY PQTERDIPRMYNYLGMIKDKVTSQDIYRRELIIHLLRYLYLELFNAYQKESTLMTARRDT RKEELANKFFGLIMKHFKENKDVAFYADKLCITSKYLTMVIKETSGKSAKDWIVEYIILE IKALLKNTSLNIQEIAIKTNFANQSSLGRFFRKHTGMSLSQYRMSNLEQ >gi|222159255|gb|ACAB01000104.1| GENE 24 39906 - 40967 1007 353 aa, chain - ## HITS:1 COG:FN0871_1 KEGG:ns NR:ns ## COG: FN0871_1 COG0337 # 
Protein_GI_number: 19704206 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate synthetase # Organism: Fusobacterium nucleatum # 26 348 26 349 350 182 35.0 1e-45 MSKQEVILCESLETSLGRAIELCPHDKLFVLTDEHTQRLCLPSLKESGLLKDAVEICIGA EDVHKTLETLASVWMALSTQGATRHSLLINLGGGMVTDLGGFAAATFKRGISYINIPTTL LAMVDASVGGKTGINFNGLKNEIGAFAPANSVLIETEFLRTLDTHNFFSGYAEMLKHGLI SNTAHWAELLNFDSSSIDYAALKQLVGQSVQVKEDIVEQDPFEHGIRKALNLGHTVGHAF ESMALAENRPVLHGYAVAWGIVCELYLSHLKVGFPKEKMRQTIQFIKDNYGVFTFDCKKY DQLYAFMTHDKKNTSGTINFTLLKDIGDICINQTADKDTIFEMLDFYRECMGI >gi|222159255|gb|ACAB01000104.1| GENE 25 41048 - 41452 252 134 aa, chain + ## HITS:1 COG:no KEGG:BT_3976 NR:ns ## KEGG: BT_3976 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 134 1 134 134 202 90.0 3e-51 MKALLPVYCRPLGYVVLLVALFIPFILVMQGVVTDTNLLFYKECTKLLMMAGCLLIIFAL SKNESRETEQIRNSAVRNAIFLTFLFVFGGMLWRVMQGDVINVDTSSFLTFLVFNVLCLE FGLKKALVDRFFKR >gi|222159255|gb|ACAB01000104.1| GENE 26 41536 - 44457 2281 973 aa, chain + ## HITS:1 COG:no KEGG:BF0745 NR:ns ## KEGG: BF0745 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 950 1 944 970 1238 70.0 0 MKRFSAGLVLMLLCTFSIFAQNKVITVSGRVVESDTKEPAAQATVQLLSLPDSAYAAGIA SSNQGWFTLPKVKAGKYVLKVSYIGFRTKLVPVQLSANATDKKLGTITLDPDAVMLKEAV ITAEAPQVVVKEDTLEYNSTAYRTPEGAMLEELVKKLPGAEIDDDGNVKINGKEVKKIMV DGKEFFGGDVKTGLKNLPVNMIDKLKTYDKKSDLARVTGIDDGEEETVLDLKVKKGMNQG WFGNASVAGGTEDRYGSNLMLNRFVDNSQFSLIGSANNVNDQGFSGGGGGPRFRNSNGLT ATKMLGANFATQTEKLELGGSARYNFSDRDATSTNYSERFLQNGNSYSNSNSKGRNKNTN FNADFRLEWKPDTLTNIIFRPNVSYGKSNSYSISESGTFNGDPFNLVSNPNEFLNKIFWG STDDPLEAIRVNASNSESKSEGQNLSANASLQVNRRLNNQGRNITFRGTFSYGDNDSEQF SESLTRYFDDNAKKEDDERKQYTTSPTKNYDYTAELTYSEPIARATFLQFRYKFQYKYSE SDRSTYSLIPDADKGQDWFWNFGDGLPVGYEENKDRDLSKYAQYKYYNHDINAGLNIIRE KYRLNFGVSLQPQNTRLDYKKAEVDTVVKRNVFNFAPNVDFRYRFSKVSQLRFTYRGRAS QPSMENLLDVTDDSNPLNIRKGNPGLKPSFSHSMRLFYNTYNADKQRGMMAHVNLNMTQN SITNATTYNQSTGGVTVKPENINGNWNAMGMFGFNTALRDKRFTINSFSRANYTNAVSYL 
FNDDTKINDKNTSTTLTFGENLNGTFRNDWFEFTLNGSINYNFERNQLRPENNQEPYTFG YGASTNISLPWSMTLSTNITNNARRGYRDASMNKNELIWNAQIAQNFLKGNAATISFEVY DILRQQSNISRSLTADMRSVSEYNGINSYCMLRFSYRLNVFGNKEARGNMRHGGFDGGGP RGPRGGFGGGRPH >gi|222159255|gb|ACAB01000104.1| GENE 27 44590 - 46029 807 479 aa, chain + ## HITS:1 COG:BS_ywnE KEGG:ns NR:ns ## COG: BS_ywnE COG1502 # Protein_GI_number: 16080712 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Bacillus subtilis # 11 479 3 482 482 353 37.0 4e-97 MIDWNYMLSQIATVAFDILYFGAIIGTIVIIILDNRNPVKTMAWILILLFLPIVGLVFYF FFGRSQRRERIIGQKSYDRLLKKPMAEYLAQDCSDVPYEYSRLIQLFQQTNQAFPFEGNR VAVYTEGYTKLQSLLRELQKAKQHIHMEYYIFEDDAIGRMVRDVLIEKASHGVEVRVIYD DVGCWHVPNRFFEEMRNAGIEVRSFLKVRFPLFTSRVNYRNHRKIVVIDGRVGFVGGMNL AERYMRGFSWGIWRDTHIMLEGKAVHGLQTAFLLDWYFVDRTLITASRYFPKIDSCGSSL VQIVTSEPIGPWKEIMQGLTVAITGAKKYFYMQTPYFLPTEPILAAMQTAALSGVDIRLM LPERADNWITHLGSRSYLADVMQAGVKVYFYKKGFLHSKLMVSDDMLSTVGSTNVDFRSF EHNFEVNAFMYDVETALEMKEIFLQDQRESTQIFLKNWGRRSWRQKAAESIVRLLAPLL >gi|222159255|gb|ACAB01000104.1| GENE 28 46054 - 46587 226 177 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 [Bacillus selenitireducens MLS10] # 1 177 13 193 199 91 34 9e-18 MRVISGIYKRRRFDVPRTFKARPTTDFAKENLFNVLNNYIDFEEGITALDLFAGTGSISI ELVSRGCDRVISIEKDPAHHSFICKIMKEVQTDKCLPIRGDVFKFIKNGREQFDFIFADP PYALKELETIPELIFQNNLLKEGGLLVLEHGKDNNFEENPHFLERRVYGSVNFSLFR >gi|222159255|gb|ACAB01000104.1| GENE 29 46578 - 47138 428 186 aa, chain - ## HITS:1 COG:no KEGG:BT_3980 NR:ns ## KEGG: BT_3980 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 186 71 256 256 311 84.0 5e-84 MMAGKRFTIAPLELFEEEQAELLFYHNHQKKENETVLYNILRKNNVAVIFGIDKSAQTFL NEQYPEARFYSQSTPFIDYFSVKSRLGNSKKMYASIRKDGIDIYCFERGHLLLANSFECT HTEDRIYYLLYAWKQLEFNQERDELHLTGILPEKDVLMSELKKFILQVFIMNPATNIDMQ ALLTCE >gi|222159255|gb|ACAB01000104.1| GENE 30 47348 - 48034 908 228 aa, chain - ## 
HITS:1 COG:no KEGG:BT_3981 NR:ns ## KEGG: BT_3981 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 228 1 228 228 410 89.0 1e-113 MKTVFNIVLVLCAAALIYICYTSIMGPINFENAKKEREKAVIARLIDIRKAQQEYRSLHR GMYAPKLDTLIDFVKNQKLPFVMKMGMLTDKQLEDGLTEKKAMAIIEKAKKTGRYDEVKK WGLENFKRDTMWVAVLDTIYPKGFNADSMKYIPHGNGAQFEMNVKNDTAKSGAPVYLFEV KAPYETYLSGLDKQEIINLKDLDSKLGKYSGLMVGSIDTPNNGAGNWE >gi|222159255|gb|ACAB01000104.1| GENE 31 48130 - 49548 840 472 aa, chain + ## HITS:1 COG:mll1421 KEGG:ns NR:ns ## COG: mll1421 COG0507 # Protein_GI_number: 13471448 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Mesorhizobium loti # 21 464 6 374 375 82 26.0 1e-15 MINNYLERQIKENFPYQPTLEQEIAVKSLSEFLLSTLADEVFILRGYAGTGKTSLVGALV KTMDQLLQKSVLLAPTGRAAKVFSAYAGHPAFTIHKKIYRQQSFSNELSNFSINDNLATN TLFIVDEASMISNEGLSGSMFGTGRLLDDLVQFVYSGQGCRLLLMGDTAQLPPVGEELSP ALFADALKGYGLEVREIDLTQVVRQVQESGILWNATQLRQLIAEDDCYSLPKIKITGFPD IKLVPGIELIEELTNCYDHDGMDETIVVCRSNKRANLYNNGIRAQILWREDELNTGDMLM IAKNNYYWTEKYKEMDFIANGEIAVVRRVRRTREIYGFRFAEVTLRFPDQNDFELDANLL LDTLHSDSPALPKEDNDRLFYTVLEDYIDIPNKRDRMKKMKADPHYNALQVKYAYAITCH KAQGGQWQNVFLDQGYMTDEYLTPDYFRWLYTAFTRATKTLYLVNYPKEQVE >gi|222159255|gb|ACAB01000104.1| GENE 32 49652 - 50788 589 378 aa, chain + ## HITS:1 COG:SSO2730 KEGG:ns NR:ns ## COG: SSO2730 COG1672 # Protein_GI_number: 15899446 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Sulfolobus solfataricus # 5 308 3 313 377 78 23.0 2e-14 METPFLYGRIAENENFTNRRNETEFLKKNFKGLINTVIISPRRWGKTSLVHKVAKLISEE EKNVIICQVDIFNCRTEEEFYTVFANSLLKSTTTVWEEFVGGVKKYLGRLAPIVSISDAT QTYELSFGIDFKDSRLSYDEILDLPQVIANDTKKRIVVCIDEFQNINEYEEPLAFQRKLR SHWQKHTSVCYCLYGSKRHMLLNIFNNYGMPFYKFGDILFLSKIEHAEWVTFISERFADT GKQISTELAGLIADKMKNHPYYTQQLSQQVWFRTPDNGCTKEIVEEAFNSLIAQLSLLFA NVVDSLTPKQINFLLAVADSVSNFSSKEVLTKYKLGTSANIKNLRKATLEKDLIDVLSGN IIEIQDPAFEYWLKYVYR 
>gi|222159255|gb|ACAB01000104.1| GENE 33 50945 - 52438 765 497 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237714450|ref|ZP_04544931.1| ## NR: gi|237714450|ref|ZP_04544931.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 497 1 497 497 927 100.0 0 MNLLKRISIVLSATLFLWGCSDSDEPVNEMNDSLSLSTNEIKINGLGGEVAVTVTSSDDW RLAGTYDWAHPSATSGKSGDEVTFTIDPNSLDEIRTATFKFFVGSTVVPLQVECSPVYAV DLLTEQKIDLPRKKSDVSIKFESNVSDLTITYSDDSKEWLTLEKQSEFIGKTTLLFSVAE NETYKIRSADVTLESPLLDEPIVVNITQACVPYFTITPSETSQKYDLSEQTISFTVESNL EYTPSVVSGAEWITNQTISKQQTDDRGITTSTLSYKLLAASNARVGSVRVENSLKNAEIS IIQADPNTPVASISNWVIAEYAETNGWLVRLAGNEYIITEDGMKATEFACTKTVNDLNGL EYFPNLTTIQIKGSSDLEKVDISKLHKVTSLTITNTATIYIKEFNFGDNPITSFDLLSAR YFGSGCEDLVFISDNLQAINMKVQRPGSDYVERIDLTQCPVLHTFKFNGEYVETLYLKPD QTIPNLELATKYKIERK >gi|222159255|gb|ACAB01000104.1| GENE 34 52476 - 53909 1037 477 aa, chain - ## HITS:1 COG:no KEGG:BT_3987 NR:ns ## KEGG: BT_3987 # Name: not_defined # Def: endo-beta-N-acetylglucosaminidase F1 precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 470 1 465 476 358 42.0 4e-97 MNMINHIKYAFLPLAVFTAGLFTSCDEELEFTADESAYLSATQINGMLLDAATNKNNSIV ELRNDEQSTDIVFRLSKQPKTGVDVTITVDESYVTTYNAIHNTDFEMFPAENVTITNNGA YLLAPDDTTIPGLKVTLTAFEGMEEDKTYIVPLSATSPTKSISFSEMSKHMVLLVQDCRH KANHDKGEDAVKTVIYFEVNDANPLNALEFKTASGKYFFDDVVLFAANINWDAKKQRVYI SNNDNVQFLLDHNDEYLQPLRKAGMRVIISILGNHDQAGVAQLSEMGAKEFAREIAAYCR GYNLDGVAFDDEYSNDPVLSNPWLAEQSVEAGSRLMYECKKAMPEKIVSLYSLGSLYADE LKIIDGVEPGQYCDYSVADYGRSEDPGIGMTLKQCAGMSIELFRDRGDTSTSTARSKKNK GYGYYMFFSPTPPKYISDEEKKDQLKYCENVCLGLYDESLIRPSYYYPKNSTVRTAR >gi|222159255|gb|ACAB01000104.1| GENE 35 53946 - 55118 984 390 aa, chain - ## HITS:1 COG:no KEGG:BT_3986 NR:ns ## KEGG: BT_3986 # Name: not_defined # Def: putative patatin-like protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 389 1 382 384 237 37.0 7e-61 MKLNKLIMMMFIAFSITLTGCQSNEDEQHYDNKLFLSTTAFTQEILFKAGDTNIVKGLSV EIAKPEAHDIKVTMAPAPELLDTYRSAYYDDAAILLPETHYVMEENKVTIKAGAVSSNIL PIQFINTGELDADLTYVLPVTIQSAEGIEVLQSAKNYYYVFHGASLINVVCNLNENRAYP 
DFNNDARFNNMKENTLELLFKASQLKNEISTLMGIEGKYLLRIGDAGVPSNQLQLATSNG NLTNSDLQFDGNKWYHVAVTFKEGEAKIYIDGIEKASKNFSRLKTVSLGTKHSDESGGQA RCFWIGYSYNEQRFFNGSVAEVRIWNRALTAEEIQAVNHFYTADPASEGLIAYWKFDEGE GQTVKDHSASGYNLTIENLPKWEFVTLPEK >gi|222159255|gb|ACAB01000104.1| GENE 36 55129 - 56232 863 367 aa, chain - ## HITS:1 COG:no KEGG:BT_3985 NR:ns ## KEGG: BT_3985 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 364 1 356 358 178 31.0 4e-43 MKKIFKHTLPALIACLMMLAACSDWTEPESITLRYPSIQEQNPALYTQYLKSLNDFKSSE HAIVIVSMNNLSTAPANQSQHLTALPDSIDYICLNNIFDVNEMHITEMDEVRKKGTKVLG LINFDAIESAWQAILEKEAEDASNTETSEKENELESENETPDEDPAIVNARRFIEYCKEE TARQLDAINALGTDGIVINFTGFDLNSLLQDEEKTAAETARQGAFFNLLTEWKAAHAGKE IVFKGSPQNVIGKEILVESKYIIVNAHSAKNFYEMSYLVLMASAKGVPTDRFIIGVTTPY LSNMGDQYGVVDGRSAIVAAAEWTLQEVPDYIKAGISVDGVEQDYFNPAKIYPNVREAIN ILSPTAK Prediction of potential genes in microbial genomes Time: Wed May 18 03:33:00 2011 Seq name: gi|222159254|gb|ACAB01000105.1| Bacteroides sp. D1 cont1.105, whole genome shotgun sequence Length of sequence - 46571 bp Number of predicted genes - 32, with homology - 31 Number of transcription units - 13, operones - 9 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 2 1 Op 2 . - CDS 1748 - 5140 1903 ## BT_3983 hypothetical protein - Prom 5367 - 5426 5.3 + Prom 5202 - 5261 8.4 3 2 Op 1 . + CDS 5440 - 6510 883 ## BT_3985 hypothetical protein 4 2 Op 2 . + CDS 6520 - 7680 854 ## BT_3986 putative patatin-like protein 5 2 Op 3 . + CDS 7707 - 9134 1021 ## BT_3987 endo-beta-N-acetylglucosaminidase F1 precursor 6 2 Op 4 . + CDS 9155 - 10459 1026 ## BT_3988 putative peptidoglycan bound protein + Term 10499 - 10542 6.8 7 3 Tu 1 . - CDS 10662 - 11150 252 ## COG3177 Uncharacterized conserved protein - Term 12026 - 12077 8.1 8 4 Tu 1 . - CDS 12165 - 14435 1910 ## COG3537 Putative alpha-1,2-mannosidase - Prom 14458 - 14517 5.4 9 5 Op 1 . 
- CDS 14540 - 16846 1544 ## COG3537 Putative alpha-1,2-mannosidase 10 5 Op 2 6/0.000 - CDS 16891 - 17883 877 ## COG3712 Fe2+-dicitrate sensor, membrane component 11 5 Op 3 . - CDS 17933 - 18490 367 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 12 5 Op 4 . - CDS 18518 - 20755 2011 ## COG3537 Putative alpha-1,2-mannosidase - Prom 20775 - 20834 10.7 - Term 21031 - 21075 2.2 13 6 Tu 1 . - CDS 21278 - 23896 2568 ## COG0013 Alanyl-tRNA synthetase - Prom 23921 - 23980 4.1 + Prom 23862 - 23921 4.5 14 7 Op 1 . + CDS 24068 - 25036 820 ## COG0739 Membrane proteins related to metalloendopeptidases 15 7 Op 2 . + CDS 25057 - 25401 394 ## COG0789 Predicted transcriptional regulators + Term 25404 - 25452 9.4 16 8 Op 1 . - CDS 25438 - 27678 2113 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases 17 8 Op 2 . - CDS 27751 - 29046 1075 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) 18 8 Op 3 . - CDS 29080 - 29967 669 ## BT_4000 hypothetical protein 19 8 Op 4 25/0.000 - CDS 29968 - 30858 1032 ## COG1475 Predicted transcriptional regulators 20 8 Op 5 . - CDS 30874 - 31638 734 ## COG1192 ATPases involved in chromosome partitioning + Prom 31778 - 31837 6.2 21 9 Op 1 . + CDS 31994 - 32761 662 ## COG0496 Predicted acid phosphatase 22 9 Op 2 . + CDS 32854 - 33990 1030 ## COG0763 Lipid A disaccharide synthetase + Term 34011 - 34053 2.6 + Prom 33993 - 34052 4.5 23 9 Op 3 . + CDS 34085 - 34876 861 ## BT_4005 hypothetical protein + Term 34897 - 34955 10.0 - Term 34894 - 34936 9.2 24 10 Op 1 . - CDS 34983 - 35831 797 ## COG0575 CDP-diglyceride synthetase 25 10 Op 2 . - CDS 35848 - 38001 1266 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 26 10 Op 3 . - CDS 38011 - 38370 296 ## COG0799 Uncharacterized homolog of plant Iojap protein + TRNA 38635 - 38708 49.9 # Gln CTG 0 0 - Term 38967 - 39019 14.2 27 11 Op 1 . 
- CDS 39046 - 40896 2176 ## BT_4041 hypothetical protein - Prom 40917 - 40976 3.8 28 11 Op 2 . - CDS 40985 - 42325 1433 ## COG2239 Mg/Co/Ni transporter MgtE (contains CBS domain) 29 11 Op 3 . - CDS 42387 - 43187 771 ## COG0030 Dimethyladenosine transferase (rRNA methylation) - Prom 43330 - 43389 5.0 + Prom 43186 - 43245 6.4 30 12 Op 1 . + CDS 43304 - 44290 576 ## BT_4044 putative dolichol-P-glucose synthetase + Prom 44327 - 44386 3.0 31 12 Op 2 . + CDS 44412 - 45869 1629 ## COG2195 Di- and tripeptidases + Term 45919 - 45971 13.3 32 13 Tu 1 . + CDS 46267 - 46359 82 ## + Term 46384 - 46432 8.9 Predicted protein(s) >gi|222159254|gb|ACAB01000105.1| GENE 1 145 - 1704 1096 519 aa, chain - ## HITS:1 COG:no KEGG:BT_3984 NR:ns ## KEGG: BT_3984 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 518 11 536 537 679 65.0 0 MGALTISFISCTSQYMDINSNPYQPGSLAADDYSLGSAMSNLAGCVVSSDVNTAQFTDCL LGGPCGGYFADSKSAWAFTISNFNPTDDWTRVFLKSDKVIPVLYSNLSTVKSVSESTNNP VPYAIAEIIKVAAMSRVTDTYGPIPYSKIGADGKITIPYDSQEEVYNKFFEELNHAIEIL TENSGSALVATADYVYGGNVPKWIKFANSLKLRLAIRISYANKKLAQEMAESAVKHEFGV IESNEDNAKWNYFGTIPNPLYTAVRYNEETSGGDSHVAADIVCYMNGYADARRAAYFEKS GWEDQEYVGLRRGINLEKAKDYFINYSKIKISASDPILWMNAAEVAFLRAEGKAIFKFDM GGEARDFYNQGIRLSFEQWSAGSPEEYLNDESKIPTTYTDPSGLNPYNSQLSTITIKWDE SSTDEVKQERIITQKWIANWMLGNEAWADYRRTGYPKLIPTTDEGNMSGGTVDNKKGARR MPYPSAEYIDNTVNVQYAVNNYLKRADNMATDLWWACKP >gi|222159254|gb|ACAB01000105.1| GENE 2 1748 - 5140 1903 1130 aa, chain - ## HITS:1 COG:no KEGG:BT_3983 NR:ns ## KEGG: BT_3983 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1130 9 1135 1135 1655 74.0 0 MESKHSLQKSKLFRALLILLLIAVPVQWAAAQLTLSTPRTTLGSVIKQIQSQSKYQFFYN DKLSTITVEPLKVKDVSLEEVLNTLLKNKDISYKIEENIIYLSEKGVPAPTQQPTGKERT ITGRVLDAKGEPLIGVSILIKGTTDGAITDLDGNYKVVTKNANPILVYSYIGYKTQEIPL KGQTAINITMLDDTQVIDEVVVTALGIKRSEKALSYNVQQVNTNEITSNKDANFVNSLSG KVAGVNINASSSGVGGVSKVVMRGTKSIMQSSNALYVIDGVPIFSGRSTKSGGTEFDSRG 
SSEPIADINPEDIESMSVLTGAAAAALYGSEAANGAIVITTKKGKEGRVSITVNSNTEFT SPFIMPRFQTRYGTGMEGVQNVSGARSWGRKLTETNYFGYSPQDDYFQTGVIATESVSFS TGTEKNQTYASAAAVNSKGIVPNNKYNRYNFTVRNTTSFLDDKMTLDFGASYILQNDRNM TNQGTYNNPLVGAYLFPRGNDWEDISMYERYDPARKIYTQYWPVGDEAMTMQNPYWVNYR NLRENKKDRYMLNANLSYQILDWLSVSGRVRLDNSNNDYTEKFYASTNTQLTEKSSRGLY GITRTNDKQLYADFLVNVNKMFGETISLQANVGGSFSDIRSDAMKVRGPIADGTDSFKGE TVGLTNFFAIQNLSPSKTERMQEGWREQTQSIFASAELGYKSTYYITLTGRNDWPSQLAG PNSKSKSFFYPSVGASVVLSELMPKLNKDYLSYMKLRGSWASVGSAFGRYLANPRYEWTA NSAQWSIMTQYPLYNLKPERTQSFEVGLTMRFLKNFDLDITYYNAKTMNQTFNPQLPVSG WSAMYIQTGAVRNQGIELSLGYKNTWNKFSWDTGFTFSMNRNKILTLASDAINPITGENF SIGTLNMGGLGEARFLLKEGGSMGDLYSFIDLKRDTNNKIYIDQAGKMTTETIKAPKKYI KLGSVLPKGNLAWRNNFSWNNFNAGFLVSARLGGVVFSRTQAMLDYYGVSEATAEARDHG YVLVNGNDCVNPETWYGIIGGGTSVPQYYTYSATNVRLQEVSLGYTFPRKMLHNICDIKI SMIGRNLWMIYNKAPFDPESIASTDNFYQGIDYFMMPSLRNIGFSLSFKF >gi|222159254|gb|ACAB01000105.1| GENE 3 5440 - 6510 883 356 aa, chain + ## HITS:1 COG:no KEGG:BT_3985 NR:ns ## KEGG: BT_3985 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 356 1 358 358 423 57.0 1e-117 MRRIVKNSWFAVILLTWGMICFSCEDWTDPESIDIHYPTFEEQNPQLYADYIRDLKNYKA GEHKLVFVSFDNPVGNPSNQVERLIAIPDSVDFVCLNNSENLSNETQVEMVKIREKSIRT LSIINYESLEQEWNKKAKENSELTEEDAQKYLNERTDAMLALCDSYGYDGILIDYTGLSM VGMQEAVLQQYKTRQQNFFSRVLDWRNKHANKTLVLYSYVQYLTPENMDMLDEYNHLILK TASSKNIDDLTLNVLMAIQAGIDAAGMDANPVPKDRFVACAQLPQQEDKDMIIGYWNTQD ANGNKTLAAQGTAQWIVQSSSTYTRAGIFITNIQKDYYNNTYAAVRETIHIMNPNK >gi|222159254|gb|ACAB01000105.1| GENE 4 6520 - 7680 854 386 aa, chain + ## HITS:1 COG:no KEGG:BT_3986 NR:ns ## KEGG: BT_3986 # Name: not_defined # Def: putative patatin-like protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 385 1 383 384 533 66.0 1e-150 MKLHIYQLAIMALTVFTMSSCKDTDINDEHHYDNKLYISSVPVTDDLLIKEDILSDSRKI TYRLAAPADDEVQISFDAKPALTAAYNLSFADNATALPEAYYDIPVKDATIKTGEISGDD IIVNFVNLNQLDDSRRYVLPVTITDVSGIGVLESARTTYFIIKGAALINVVANIKENYFP IKWNSDVSKMTTITVEALVRSDDWVAKRDNALSSVFGIEGNFLIRIGDGDRPRDQLQAFV 
PGGSFPPANHAPGIGLPVNEWVHIAVVYNTVNKNRIYYKNGVEVYKDQSANSTVNLTRNC YIGRSFDGTRWLPGEISEVRIWNVERTAEQIADNPYKVDPASEGLVAYWKFNEGSGKVVI DRTGNGNDITGNKEPVWIPVELPQMY >gi|222159254|gb|ACAB01000105.1| GENE 5 7707 - 9134 1021 475 aa, chain + ## HITS:1 COG:no KEGG:BT_3987 NR:ns ## KEGG: BT_3987 # Name: not_defined # Def: endo-beta-N-acetylglucosaminidase F1 precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 474 1 476 476 592 61.0 1e-167 MKQKYLITGLVVATFTSCLTYLTSCADDLVIDTTVDESAYTGIYENNGYLRDGKSNLASS VIELYKDTYTTSVKMGLSKAPGSTTSAQVIIDANYLDTYNKEHETDFELYPEHLVSFKND GMLTVDVQTKSARTDITIQADGTLKEDKTYALPIVLTHTSSDITIKDEKAGHCIYLIKDM RKLGDTYKGEDVVKSFVFFDGTNPLNALSFQLENGKLLWDVVSSFAANINWDAQAQRPYL KCNSYIQYLLDNNEVFLQPLRKRGAKIVLGVLSNGDITGVAQLSEQGAKDFARELAQYCK AYNLDGVCFDDEYEGAYDPNNPALTKPTEEAAARLCYETKQAMPDKIVAVYALRRMYSSK VTVVDGVTMKNWIDIVIGDYGRDPSSNPYGDLTSKECSGQSMEFVRGTGGDLQGQRLINQ GSGWFVGFSPKPENYSNVFRRLSDVKTLYGSPLMAPTVFYKDNDATPYQYPDDLQ >gi|222159254|gb|ACAB01000105.1| GENE 6 9155 - 10459 1026 434 aa, chain + ## HITS:1 COG:no KEGG:BT_3988 NR:ns ## KEGG: BT_3988 # Name: not_defined # Def: putative peptidoglycan bound protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 433 1 441 444 310 41.0 7e-83 MRILYRFFFLLAISSLTVLSGCREDEDLNLLTYPDNNFTLAADDEDGSEITVNATYNNDG ILEFDKPVTFTFRFNASPEETIVTFEPMGENVELSDTKLIIPAGYTDANVTLNVKDMEAF QSNYDEATCNLGVKATVQGYKMPSTTLEAKALIKKAAYVATGFLEGKADSKTTFEHIYFN GEFKKTERNLYTFSVQLDRPARKDVKVKLTTEGLDDEYLKDITITPSEIVIPAGEKTTGD ITWEITNDFLLRTSDDSSNTLNIMATFECEDPVFVQDEKRSSFTLNINKYIKGFEFVSYK RPSGWVEWSKQGWSVEVEDNGDIWQNDGGSALIDGTTNNYEGIASDGNISFTLDMGEERT INGVGFDYIYYSAPQNIQLSVSSDGNTWNYLGKVASADEDADYYKFIEPITARYVKCELL ASWGCELYEVYVFK >gi|222159254|gb|ACAB01000105.1| GENE 7 10662 - 11150 252 162 aa, chain - ## HITS:1 COG:mlr2757 KEGG:ns NR:ns ## COG: mlr2757 COG3177 # Protein_GI_number: 13472455 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 66 122 161 217 263 62 52.0 5e-10 
MRTTGGIIYSSAGDYDSSKGDWRKSSVHVGDRYFVNHQKVEREVTRLCKILNQRIKQVQA PDAIYQLAFDAHFYLVSIHPFADGNGRISRLLMNYILSYHQLPLATIFKEDKLEYYQALE ASCPQDDEAPDLCPIRDFMSAQQMKYLSMEIKKYKQAEKRGG >gi|222159254|gb|ACAB01000105.1| GENE 8 12165 - 14435 1910 756 aa, chain - ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 31 754 13 716 717 450 34.0 1e-126 MKKYLLLVCALSCIVFAKAKGWTQYVNPLMGSQSSFELSTGNTYPAIARPWGMNFWTPQT GKMGDGWQYTYTANKIRGFKQTHQPSPWINDYGQFSIMPVVGKPEFNEEKRASWFAHKGE TATPYYYKVYLAEHDVVTEMAPTERAVLFRFTFPENDHSYVVVDAFDKGSYIKVIPEENK IIGYTTRNSGGVPENFKNYFIIEFDKPFTYKATVVNGNLQENVTEQTTDHAEAIIGFQTH KGEQVHARIASSFISFEQAALNMKELGKDNIEQLAQKGKEAWNQVLGKIEVEGGNLDQYR TFYSCLYRSLLFPRKFYELDAAGQPVHYSPYNGQVLPGYMYTDTGFWDTFRCLFPLLNLM YPSVNKEMQEGLINTYKESGFFPEWASPGHRGCMVGNNSASVLVDAYMKGVKVDDIKTLY EGLIHGTENVHPQVSSTGRLGYEYYNKLGYVPYDVKINENAARTLEYAYNDWCIYQLAKE LKRPKKEINLFAKRAMNYKNLFDKESKLMRGRNEDGTFQSPFSPLKWGDAFTEGNSWHYT WSVFHDLQGLIDLMGGKEMFITMMDSVFAVPPIFDDSYYGQVIHEIREMTVMNMGNYAHG NQPIQHMIYLYDYAGQPWKAQYWLRQVMDRMYTPGPDGYCGDEDNGQTSAWYVFSALGFY PVCPGTDEYIIGAPLFKKATLHFENGNSLVIDAPNNSKKNFYVNSLNVNGTDYTKNYLRH EDLFKGGTIKVDMGNQPNKNRGTKEEDMPYSFSKEQ >gi|222159254|gb|ACAB01000105.1| GENE 9 14540 - 16846 1544 768 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 29 752 45 765 790 540 39.0 1e-153 MKLKTFLIVGCLGGLFTLSSCTAPTNVKDYSAYVNPFIGTGGHGHTFPGAVVPHGMIQPS PDTRIDGWDACSGYYYADSTINGFSHTHVSGTGCCDYGDVLLMPTVGKQQYRTTDPQSQR LAYASSFSHKNEIAEPGYYSVFLDTYQVKAEISSTKRGAIHRYTFPENAESGFIIDLDYS LQRQTNSEMEIEVISDTEICGHKKTTYWAFDQYINFYAKFSKPFSYTLVTDSITMDNGKR LPVCKALLHFNTSKDEQVLVKVGVSAVDIAGARKNVESEIPGWDFEKVRKDARTAWNEYL SKIDITTSDKEDKAIFYTALYHTGISPNLFTDADGRYLGMDLEVHQGDTLNPIYTVFSLW 
DTFRALHPLMTIIDPDLNNHFINSLIKKHQEGGIYPMWDLASNYTGTMIGYHAVPVIVDA YMKGYRNFDAKEAYKACLRAAEYDTTGIKCPDLVLPHLMPKAKYYKNAIGYIPCDRENES VAKALEYAYDDWCISIFAEAMSDFESKAKYERFAKAYEFYFDKSIRFMRGLDSKGEWRTP FNPRASTHRSDDYCEGTAWQWTWFVPHDVEGLVNLMGGEDAFVQKLDSLFSADSSLEGET TSSDISGLIGQYAHGNEPSHHVIHLYNYVNHPWRTQELVDSVYRSQYTNSIDGLSGNEDC GQMSAWYILNSMGFYQVCPGKPVYSIGRPAFDKAVINLPEGKTFSIVVKNNGKKNKYIES VLLNGKALNIPFFNHQDIANGGTMEIKMTDHPTKWGVLSPALSSKEEE >gi|222159254|gb|ACAB01000105.1| GENE 10 16891 - 17883 877 330 aa, chain - ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 18 317 29 313 331 76 25.0 8e-14 MSNLSEEIINRYLTGQCSEEELIEVNAWMKESEENARQLFRMEEIYHLGKFDQYTDEQRI LRAEKQLYKKLDEEKGKQSKILSMHRWMKYAAVIAVMLVMGGGVGYWFYQNGNNQQMMVA VASEGIVKEVVLPDGSKVWLNNAATLKYPREFSEKERNVYLEGEAYFEVTKNRHKPFTVQ SDAIRVRVLGTTFNLKSDKRCRIAEATLIEGEIEVKGNKEEGQIILTPGQRAELNKNNGR LTVKQVDAKLDAVWHDNLIPFQKADIFTITKALERFYDVKIILSPDIQTGKTYSGVLKRK SNIESVLKSLQNSIPIDYKIVGNNIFISPK >gi|222159254|gb|ACAB01000105.1| GENE 11 17933 - 18490 367 185 aa, chain - ## HITS:1 COG:CC0981 KEGG:ns NR:ns ## COG: CC0981 COG1595 # Protein_GI_number: 16125233 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Caulobacter vibrioides # 10 171 14 175 201 63 31.0 2e-10 MNENFDITYKALFRRYYPSLIFYATRLVGEEEAEDVVQDVFVELWKRKDHIEIGDQIQAF LYRAVYTRALNVLKHRSVEEGYCVAMEEINQRRAEFYQPDNNEVIRRIEDKELRKEIHDA INELPDKCKEVFKLSYLHDMKNKEIADILGVSLRTVEAHMYKALKYLRGRLDPLWTILFL FLWRL >gi|222159254|gb|ACAB01000105.1| GENE 12 18518 - 20755 2011 745 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 20 733 37 765 790 533 40.0 1e-151 
MRTPAYLCLLLTCMLISCTPTQEKHEIDYTSYVNPFIGTDFTGNTYPGAQAPFGMVQLSP DNGLPGWDRISGYFYPDSTIAGFSHTHLSGTGAGDLYDISFMPVTLPYKEAEAPLGIHSK FSHKDESAHAGYYQVRLTDYNINVELTATERCGIQRYTFPEAQSAIFLNLKKAMNWDFTN DSHIEVVDSVTIQGYRFSDGWARDQHIYFRTRFSKPFEKMELDTTAIIKDNKRIGTAVIA RFDFNTQKDEQILVNTAISGVSMEGAAKNLQAEVPENDFDKYLAETKANWNRQLGKIEIK GDNENDKVNFYTALYHSMIAPTIYSDVDGTYYGPDKKVHQANGWVNYSTFSLWDTYRAAH PLFTYTEPERVNDMVKSFIAFYEQNGRLPVWNFYGSETDMMIGYHAVPVIVDAYLKGIGD FDAKKALDACIATANLDNYRGIGLYKELGYIPYNVTDHYNAENWSLSKTLEYAFDDYCIA EMAKKMGKQDIADEFYKRSQNYKNVYNPATSFMQPRDDKGTFIKDFKADDYTPHICESNG WQYFWSVQHDINGLIDLGGGKNRFAEKLDSMFTYHPAADDELPIFSTGMIGQYAHGNEPS HHVIYLFNAIGHENRTQEYVAKVMNELYKNEPAGLCGNEDCGQMSAWYVFSAMGFYPVNP VSGKYEIGTPLFPEMQLHLANGKTFTVLAPKVNKENIYIQSIKIDGKTYDKTYLTHEQIM SGATVEFEMSNTPKAIGVIFEENNP >gi|222159254|gb|ACAB01000105.1| GENE 13 21278 - 23896 2568 872 aa, chain - ## HITS:1 COG:ZalaS KEGG:ns NR:ns ## COG: ZalaS COG0013 # Protein_GI_number: 15803211 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Escherichia coli O157:H7 EDL933 # 3 870 4 871 878 651 43.0 0 MLTAKEIRDSFKNFFESKGHHIVPSAPMVIKDDPTLMFTNAGMNQFKDIILGNHPAKYQR VADSQKCLRVSGKHNDLEEVGHDTYHHTMFEMLGNWSFGDYFKKEAISWAWEYLVEVLKL NPEHLYATVFEGSPEEGLSRDDEAASYWEQFLPKDHIINGNKHDNFWEMGDTGPCGPCSE IHIDLRPAEERAKISGRDLVNHDHPQVIEIWNLVFMQYNRKADGSLEPLPAKVIDTGMGF ERLCMALQGKTSNYDTDVFQPMLKAIAAMSGTEYGKDKQQDIAMRVIADHIRTIAFSITD GQLPSNAKAGYVIRRILRRAVRYGYTFLGQKQAFMYKLLPVLIDNMGEAYPELVAQKTLI EKVIKEEEESFLRTLETGIRLLDKTMEDTKANGKTEISGKDAFTLYDTFGFPLDLTELIL RENGMTVNIEEFNEEMQQQKQRARNAAAIETGDWIVLKEGTTEFVGYDYTEYEASILRYR QIKQKNQTLYQIVLDYTPFYAESGGQVGDTGVLVSEFETIEVVDTKKENNLPIHITKKLP EHPEAPMMACVDTDKRAACAANHSATHLLDAALREVLGEHIEQKGSLVTPDSLRFDFSHF QKVTDEEIRKVEHIVNARIRANIPLKEYRNIPIEEAKELGAIALFGEKYGDRVRVIQFGT SIEFCGGTHVAATGNIGMVKIISESSVAAGVRRIEAYTGARVEEMLDTIQDTLSDLKALF NNAPDLRIAIRKYLDENAGLKKQVEDFMKEKEAALKERLLKNVQEIHGIKVIKFCAPLPA EVVKNIAFQLRGEITENLFFVAGSTDNGKPMLTVMLSDNLVAGGLKAGNLVKEAAKLIQG GGGGQPHFATAGGKNADGLPAAVEKVLELAGI 
>gi|222159254|gb|ACAB01000105.1| GENE 14 24068 - 25036 820 322 aa, chain + ## HITS:1 COG:mll8577 KEGG:ns NR:ns ## COG: mll8577 COG0739 # Protein_GI_number: 13477076 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Mesorhizobium loti # 84 297 222 423 434 132 34.0 9e-31 MRKVYYIYNPQTQTYDRIYPTVRQRALSILRRLFYGMGLGAGCFIVLLLIFGSPSEKELR IENSRLLAQYNVLSRRLDDAMGVLQDIQQRDDNLYRVILQADPVSPAIRQAGYGGTNRYE ELMDLANAKLVVNTTQKLDVLSKRLYIQSKSFDDVIDMCKNHDEMLKCIPAIQPISNKDL RQTASGYGTRIDPIYGTTKFHAGMDFSAHPGTDVYATGNGTVVKVGWETGYGNTIEIDHG FGYLTRYAHLQGFNTKVGKKVVRGEIIGKVGSTGKSTGPHLHYEVYVKGQVVNPVNYYFM DLSAEDYEKMIQLAANHGKVFD >gi|222159254|gb|ACAB01000105.1| GENE 15 25057 - 25401 394 114 aa, chain + ## HITS:1 COG:AGc2183 KEGG:ns NR:ns ## COG: AGc2183 COG0789 # Protein_GI_number: 15888519 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 3 82 26 105 203 66 41.0 1e-11 MLNTDKELKLYYSIGEVADMFGVNPSLLRFWEKEFPQISPKTAGRGIRQYRKEDVETIGL IYHLVKEKGMTLPGARQRLKDNKEATVRNYEIVNKLKAIKEELLAIKRELDGRE >gi|222159254|gb|ACAB01000105.1| GENE 16 25438 - 27678 2113 746 aa, chain - ## HITS:1 COG:VC2710 KEGG:ns NR:ns ## COG: VC2710 COG0317 # Protein_GI_number: 15642704 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Vibrio cholerae # 34 709 19 667 705 458 38.0 1e-128 MDNLPPKEISDEEMINQAFHELLNDYLNTKHRKKVEIITKAFNFANQAHKGIKRRSGEPY IMHPIAVASIVCNEIGLGSTSICAALLHDVVEDTDYTVEDIENIFGPKIAQIVDGLTKIS GGIFGDRASAQAENFKKLLLTMSNDIRVILIKIADRLHNMRTLGSMLPNKQYKIAGETLY IYAPLANRLGLYKIKTELENLSFKYEHPEEYAEIEEKLNATAAERDKVFNDFTAPIRTQL DKMGLKYRILARVKSIYSIWNKMQTKHVPFEEIFDLLAVRIIFEPRNEEEELNDCFDIYV SISKIYKPHPDRLRDWVSHPKANGYQALHVTLMGNNGQWIEVQIRSERMNDVAEQGFAAH WKYKEGGGSEDEGELEKWLKTIKEILDDPQPDAIDFLDTIKLNLFASEIFVFTPKGELKT MPQNSTALDFAFSLHTDIGSHCIGAKVNHKLVPLSHKLQSGDQVEILTSKSQRVQPQWEV 
FATTARARAKIAAILRKERKANQKIGEEILNEFLKKEEIRPEETVIEKLRKLHNAKNEEE LLAAIGSKAIVLGEADKNELKEKQTSNWKKYLTFSFGNNKEKQEEKEPQEKEKINPKQVL KLTEESLQKKYIMAECCHPIPGDDVLGYVDENDRIIIHKRQCPVAAKLKSSYGNRILATE WDTHKELSFLVYIYIKGIDSMGLLNEVTQVISRQLNVNIRKLTIETEDGIFEGKIQLWVH DVDDVKTICNNLKKIQNIKQVSRVEE >gi|222159254|gb|ACAB01000105.1| GENE 17 27751 - 29046 1075 431 aa, chain - ## HITS:1 COG:PA1812 KEGG:ns NR:ns ## COG: PA1812 COG0741 # Protein_GI_number: 15597009 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Pseudomonas aeruginosa # 126 429 118 392 534 160 32.0 6e-39 MKKLVNYCSIIFLLLVATSQVKAQSVDVVIRENGTERKESIDLPKSMTYPLDSLLNDWKA KNYIDLGKDCSTAEINPLFSDSVYIDRLSRIPAIMEMPYNDIIRKFIDMYAGRLRNQVSF MLSACNFYMPIFEEALDAYGLPLELRYLPIIESALNPSAVSRAGASGLWQFMIGTGKIYG LESNSLVDERRDPIKATWAAARYLKEMYDIYGDWNLVIAAYNCGPGTINKAIRRANGETD YWKIYNYLPKETRGYVPAFIAANYVMTYYCDHNICPMETNIPASTDTVQVNKNLHFEQIA DLCNVPLDQIKSLNPQYKKQMIPGDSKPYTLRLPIDAISTFIDKQDTIYAHRADELFRNR KTVAVKEITPSTRKTTTAVAGKGKLTYYTIKSGDTLSTIAGKYGVTIKDIQRWNGMSNTK IAAGKRLKIYK >gi|222159254|gb|ACAB01000105.1| GENE 18 29080 - 29967 669 295 aa, chain - ## HITS:1 COG:no KEGG:BT_4000 NR:ns ## KEGG: BT_4000 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 295 1 295 295 489 85.0 1e-137 MTRKSKKYQLLVSALLLCFLQVAGIDVYAQSTVTPARKDTTIIQREAPKARARRHRDPII QDSVKKDSIQIIPSKELPAIDSLSAAKIQIADSLDAANKKELKKIEQPASIVVKTDTVPP PAQDINKKIFIPNPTKATWLAVVFPGGGQIYNRKYWKLPIIYGGFAGCAYALSWNGKMYK DYSQAYLDIMDSNPNTKSYEDLLPPNSTYNEEQLKNTLKRRKDMFRRYRDLSIFAFIGVY LISIIDAYVDAELSNFDITPDLSMKVEPAVIDNNNQFRSSSLKSKSVGLQCVLRF >gi|222159254|gb|ACAB01000105.1| GENE 19 29968 - 30858 1032 296 aa, chain - ## HITS:1 COG:ML2706 KEGG:ns NR:ns ## COG: ML2706 COG1475 # Protein_GI_number: 15828464 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Mycobacterium leprae # 32 295 64 334 335 196 43.0 5e-50 
MATQKRNALGRGLDALLSMDDVKTEGSSSINEIELAKITVNPNQPRREFDETALQELADS IAEIGIIQPITLRKLSDDEYQIIAGERRYRASQRAGLKTIPAYIRTADDENMMEMALIEN IQREDLNAVEIALAYQHLLDQYELTQERLSERIGKKRTTIANYLRLLKLPAPIQMALQNK QLDMGHARALISLGDPKLQVKIFEEIQEHGYSVRKVEEIVKSLSEGEAVKSGTRKITPKR GKLPEEFNLLKQQLSGFFNTKVQLTCSEKGKGKISIPFGNEEELERIMEIFDTLKK >gi|222159254|gb|ACAB01000105.1| GENE 20 30874 - 31638 734 254 aa, chain - ## HITS:1 COG:lin2923 KEGG:ns NR:ns ## COG: lin2923 COG1192 # Protein_GI_number: 16801982 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Listeria innocua # 1 250 1 250 253 277 57.0 1e-74 MGKIIALANQKGGVGKTTTTINLAASLATLEKKVLVVDADPQANASSGLGVDIKQSECTI YECIIDRANVQDAIFDTEIDSLKVISSHINLVGAEIEMLNLPNREKILKEVLTPLKKEYD YILIDCSPSLGLITINALTAADSVIIPVQAEYFALEGISKLLNTIKIIKSKLNPALEIEG FLLTMYDSRLRQANQIYDEVKRHFQELVFNTVIQRNVKLSEAPSYGVPTILYDAESTGAK NHLALAKEIINRNK >gi|222159254|gb|ACAB01000105.1| GENE 21 31994 - 32761 662 255 aa, chain + ## HITS:1 COG:alr4846 KEGG:ns NR:ns ## COG: alr4846 COG0496 # Protein_GI_number: 17232338 # Func_class: R General function prediction only # Function: Predicted acid phosphatase # Organism: Nostoc sp. 
PCC 7120 # 8 253 3 260 265 152 35.0 5e-37 MEKEKPLILVSNDDGVMAKGINELVKFLRPLGEIIVMAPDAPRSGSGCALTVTQPVHYQL VKKEVGLTVYKCSGTPTDCIKLARNTVLDRTPDLIVGGINHGDNSATNVHYSGTMGVVFE GCLNGIPSIGFSLCNHAPDADFEAAGPYIRSIAAMILEKGLPPLTCLNVNFPDTADIKGV KICEQAKGRWTNEWAACPRLNDPNYFWLTGEFTDHEPENEKNDHWALANGYVAITPTTVD VTAYHFMDELNKWFN >gi|222159254|gb|ACAB01000105.1| GENE 22 32854 - 33990 1030 378 aa, chain + ## HITS:1 COG:FN0597 KEGG:ns NR:ns ## COG: FN0597 COG0763 # Protein_GI_number: 19703932 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipid A disaccharide synthetase # Organism: Fusobacterium nucleatum # 1 341 1 323 356 152 30.0 9e-37 MKYYLIVGEASGDLHASHLMAALKAEDPQADFRFFGGDLMAAVGGTMVKHYKELAYMGFI PVLLHLRTIFANMKRCKEDIVSWEPDVVILVDYPGFNLDIAKFVHAKTQIPVYYYISPKI WAWKEYRIKNIKRDVDELFSILPFEVEFFEGKHQYPIHYVGNPTVDEVAAYQAAHPKNKE HFIAENQLEDKPIIALLAGSRKQEIKDNLPDMLKAASAFPDYQLVLAGAPAIAPEYYKQY VGEAKVKIIFDQTYSLLQHADVALVTSGTATLETALFRVPQVVCYYTPIGKVVSFLRRHI LTVRFISLVNLIADREVVKELVADTMTVKNMQNELRNIIENEAYRNEMLSGYEYVAERLG PAGAPRHAAREMLRLLKK >gi|222159254|gb|ACAB01000105.1| GENE 23 34085 - 34876 861 263 aa, chain + ## HITS:1 COG:no KEGG:BT_4005 NR:ns ## KEGG: BT_4005 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 263 1 258 258 390 79.0 1e-107 MKKIKWLLGILLLALVPMLQSCDDDGYSIGDFSWDWATVRATGGGGYYLEGDNWGVIDPV ATSIPWFKPVDGERVVAFFNPLYDMEGGKGVQVKMEGIQELLTKEVEDMSTEEEAEEFGN DPILIYQGDMWLGGKFLNVIFRQELPRSEKHRISLVQNKMEPGEPGEPSEPGTLNVDEDG YIHLELRYNTYEDVTDYWGWGRVSYNLEKFFPTPKDSWVAPKGFKVTINSREHGEGRVIV LDLDHPVGVPEAAKDVHSTSSIR >gi|222159254|gb|ACAB01000105.1| GENE 24 34983 - 35831 797 282 aa, chain - ## HITS:1 COG:HI0919 KEGG:ns NR:ns ## COG: HI0919 COG0575 # Protein_GI_number: 16272856 # Func_class: I Lipid transport and metabolism # Function: CDP-diglyceride synthetase # Organism: Haemophilus influenzae # 13 270 11 278 288 103 31.0 3e-22 MINNFIKRAITGVLFVAILVGCILYDAFSFGILFTAISALTIYEFGQLVNMRAEGVKINK TINMLGGAYLFLAIMGFCIDAADSKIFIPYVLLLLYMMISELYLKKENPVLNWAYFMLSQ 
LYIGLPFALLNVLAFHNDPGSEYSSVSYNPILPLSIFIFLWLNDTGAYCIGSLIGKHRLF ERISPKKSWEGSIGGGVVAIGVSFILAHYFPFMSMIEWAGLALVVVIFGTWGDLTESLLK RQLHVKDSGTILPGHGGMLDRFDSSLMAIPAAVVYLYALTWF >gi|222159254|gb|ACAB01000105.1| GENE 25 35848 - 38001 1266 717 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. McKiel] # 19 670 3 634 636 492 43 1e-138 MDNNNSNNNNSNKPNNKVNMPKFNLNWMYMIIALMLLGLWWGSDSKGAGSKAVTYSEFQD YVKKGYVSKVLGYEDKSIEAFLKPTAVGAVFGADSTKVGRNPVITSRTPSTDKLEEFLQA EKEAGHFDGSSDYPPKSDIFPAILIQVLPLVLLVALWIFFMRRMSGGGSGGPGGVFNVGK SKAQLFEKGGAIKITFKDVAGLAEAKQEVEEIVEFLKEPQKYTDLGGKIPKGALLVGPPG TGKTLLAKAVAGEANVPFFSLAGSDFVEMFVGVGASRVRDLFKQAKEKAPCIVFIDEIDA VGRARGKNPAMGGNDERENTLNQLLTEMDGFGSNSGVIILAATNRVDVLDKALLRAGRFD RQIHVDLPDLNERKEVFGVHLRPIKIDDSVDVDLLARQTPGFSGADIANVCNEAALIAAR HGKKFVSKQDFLDAVDRIIGGLEKKTKITTEAERRSIALHEAGHASISWLLEYANPLIKV TIVPRGRALGAAWYLPEERQITTKEQMLDEMCATLGGRAAEDLFLGRISTGAMNDLERVT KQAYGMIAYLGMSDKLPNLCYYNNDEYSFNRPYSEKTAELIDEEVKRMVNEQYDRAKRIL SENKEGHNELTQLLIDKEVIFAEDVERIFGKRPWASRSEEIMAAKESQDAARAERELAQK LKEEEKEIKEEEAENTAKEEQAPIDTKVAAEGKKVTVEGKVTVEGKSNGEEQANGSN >gi|222159254|gb|ACAB01000105.1| GENE 26 38011 - 38370 296 119 aa, chain - ## HITS:1 COG:slr1886 KEGG:ns NR:ns ## COG: slr1886 COG0799 # Protein_GI_number: 16330295 # Func_class: S Function unknown # Function: Uncharacterized homolog of plant Iojap protein # Organism: Synechocystis # 4 118 27 140 154 78 35.0 4e-15 MNDTKVLIEKIKEGIQEKKGKKIVVADLTSIEDTICKYFVICQGNSPSQVSAIVDSIKEF TRKGADSKPYAIDGLRNAEWVAMDYADVLVHVFLPETRAFYNLEHLWADAKLTQIPDLD >gi|222159254|gb|ACAB01000105.1| GENE 27 39046 - 40896 2176 616 aa, chain - ## HITS:1 COG:no KEGG:BT_4041 NR:ns ## KEGG: BT_4041 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 616 1 624 624 885 92.0 0 MDAHDTNQPLNQGELEEEKKTVEVSEAITETPAEDVTAEVQPEAAPKPATKEDVLNQLKE LAQDAENANKQEIDNLKQSFYKLHNAELEAAKVQFTDNGGNIEDFVAEEDPTEEEFKRLM 
GVIKEKRSKQVAELERQKEENLQVKLSIIEELKELVESGDDANKSYTEFKKLQQQWNDTK LVPQGKVNELWKNYQLYVEKFYDLLKLNNEFREYDFKKNLEIKTHLCEAAEKLADEEDVV SAFHQLQKLHQEFRDTGPVAKELRDEIWNRFKAASTAVNRRHQQHFESLKEAEQHNLDQK TVICEIVEAIEYDELKTFSAWENKTQEVIALQNKWKTIGFAPQKMNVKIFERFRHACDDF FKKKGEFFKSLKEGMNENLEKKKALCEKAEALKDSTDWKVTADALTKLQKEWKTIGPVAK KHSDAIWKRFITACDYFFEQKNKATSSQRSVETENMEKKKALIEKLSAIDESMDTEEAST LVRDLMKEWNSIGHVPFKEKDKLYKQYHGIIDQLFDRFNINASNKKLSNFRSNISNIQGS GTQSLYREREKLVRTYENMKNELQTYENNLGFLTSTSKKGSSLLTELNRKVDKLKADLEL VLQKIKVIDESIKAEE >gi|222159254|gb|ACAB01000105.1| GENE 28 40985 - 42325 1433 446 aa, chain - ## HITS:1 COG:BH0511 KEGG:ns NR:ns ## COG: BH0511 COG2239 # Protein_GI_number: 15613074 # Func_class: P Inorganic ion transport and metabolism # Function: Mg/Co/Ni transporter MgtE (contains CBS domain) # Organism: Bacillus halodurans # 9 441 14 441 452 228 32.0 2e-59 MNEEYIDNVKHLIEQKDADKVKGLLIDLHPADIAELCNDLNAEEAKFVYRLLDNEIAADV LVEMDEDARKELLEMLPSETIAKRFVDYMDTDDAVDLMRELDEDKQEEVLSHIEDIEQAG DIVDLLKYDENTAGGLMGTEMVLVNENWSMPECLKEMRQQAEELDEIYYVYVIDDDERLR GIFPLKKMITSPSVSKVKHVMQKDPISVHVDTPIDEVVQAIEKYDLVAIPVVDSIGRLVG QITVDDVMDEVREQSERDYQLASGLSQDVETDDNVLRQTTARLPWLLIGMIGGIGNSMIL GNFDSTFAAHPEMALYIPLIGGTGGNVGTQSSAIIVQGLANSSLDAKNTFKQITKESVVA LINATIISLLVYTYNFIRFGATATVTYSVSISLFAVVMFASIFGTLVPMTLEKFKIDPAI ATGPFIAITNDIIGMMLYMGITVLLS >gi|222159254|gb|ACAB01000105.1| GENE 29 42387 - 43187 771 266 aa, chain - ## HITS:1 COG:PA0592 KEGG:ns NR:ns ## COG: PA0592 COG0030 # Protein_GI_number: 15595789 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Dimethyladenosine transferase (rRNA methylation) # Organism: Pseudomonas aeruginosa # 5 257 8 263 268 166 37.0 4e-41 MKLVKPKKFLGQHFLKDLKVAQDIADTVDTFPELPILEVGPGMGVLTQFLVKKDRLVKVV EVDYESVAYLREAYPSLEDHIIEDDFLKMNLHRLFDGKPFVLTGNYPYNISSQIFFKMLD NKDIVPCCTGMIQKEVAERIAAGPGSKTYGILSVLIQAWYKVEYLFTVSEHVFNPPPKVK SAVIRMTRNETKELGCDEKLFKQVVKTTFNQRRKTLRNSIKPILGKDCPLTEDILFNKRP EQLSVAEFIDLTNKVEEALKTAGNAQ >gi|222159254|gb|ACAB01000105.1| GENE 30 43304 - 44290 576 328 aa, 
chain + ## HITS:1 COG:no KEGG:BT_4044 NR:ns ## KEGG: BT_4044 # Name: not_defined # Def: putative dolichol-P-glucose synthetase # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 327 14 327 328 502 92.0 1e-141 MKKLLKKTLKLILPVVLGGFILFWVYRDFDFTKAGDVLLHGTNWWWMLFSLLFGVFAQVF RGWRWRQTLEPLDAFPKKSDCVNAIFISYAASLVVPRIGEVSRCGVLAKYDNVSFAKSLG TVVTERLVDTLTILLITGITVLLQLPIFVTFLQQTGTKIPSLVHLLTSVWFYIILFCFIG VVILLYYLRKTLFFYERVKGFVLNIWEGIMSLKGVRNIPLFIFYTLAIWACYFLHFYFTF YCFAFTAHLGILAALVMFVGGTFAVIVPTPNGAGPWHFAIISMMMLYGVNVTDAGIFALI VHGIQTFLVVLLGVYGLAALPFTNRHRA >gi|222159254|gb|ACAB01000105.1| GENE 31 44412 - 45869 1629 485 aa, chain + ## HITS:1 COG:VC2279 KEGG:ns NR:ns ## COG: VC2279 COG2195 # Protein_GI_number: 15642277 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Vibrio cholerae # 2 485 50 533 534 461 47.0 1e-129 MSTILSLAPQNVWKHFYSLTQIPRPSGHMEKITAFLLGFGKELGLESFVDEAGNVIIRKP ATPGMENRKGVILQAHMDMVPQKNNDTVHDFEKDPIETYIDGDWVKAKGTTLGADNGLGV AAIMAVLEAKDLKHGPLEALITKDEETGMYGAFGLKPGTLNGEILLNLDSEDEGELYIGC AGGMDVTATLEYKEVAPEAGDIAVKVTLKGLRGGHSGLEINEGRANANKLLVRFVREAVA SYEARLASWEGGNMRNAIPREAHAVITIPAENEEELLGLVKYCEDLFNEEYSAIETPISF TAERVELPAGQVPEEIQDNLIDAIFACQNGVTRMIPTVPDTVETSSNLAIITIGEGKAAI KILARSSSDSMKEYLTTSLESCFSMAGMKVEMTGGYSGWQPDVNSPILHAMKASYKQQFG VEPAVKVIHAGLECGIIGAIIPGLDMISFGPTLRSPHSPDERALIPTVRKFYDFLVATLE QTPLK >gi|222159254|gb|ACAB01000105.1| GENE 32 46267 - 46359 82 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAKKMSFLVRKILSEVGRNEMLSWGYRIEK Prediction of potential genes in microbial genomes Time: Wed May 18 03:34:20 2011 Seq name: gi|222159253|gb|ACAB01000106.1| Bacteroides sp. D1 cont1.106, whole genome shotgun sequence Length of sequence - 66993 bp Number of predicted genes - 54, with homology - 51 Number of transcription units - 18, operones - 12 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 250 215 ## BF1358 conjugate transposon protein 2 1 Op 2 . 
+ CDS 257 - 430 123 ## BT_4770 conjugate transposon protein
3 1 Op 3 . + CDS 451 - 1083 630 ## BF1239 hypothetical protein
4 1 Op 4 . + CDS 1108 - 2127 974 ## BVU_2139 conjugate transposon protein
5 1 Op 5 . + CDS 2149 - 2256 76 ##
6 1 Op 6 . + CDS 2302 - 2772 168 ## BT_2292 conjugate transposon protein
7 1 Op 7 . + CDS 2785 - 3045 224 ## gi|237714163|ref|ZP_04544644.1| predicted protein
8 1 Op 8 . + CDS 3045 - 3233 262 ## gi|293368915|ref|ZP_06615516.1| hypothetical protein CUY_3180
9 1 Op 9 . + CDS 3327 - 4352 660 ## BT_0087 conjugate transposon protein
10 1 Op 10 . + CDS 4386 - 5285 739 ## BVU_2144 conjugate transposon protein
11 1 Op 11 . + CDS 5288 - 5872 385 ## BVU_2145 hypothetical protein
12 1 Op 12 . + CDS 5883 - 6371 173 ## gi|237714167|ref|ZP_04544648.1| predicted protein
13 1 Op 13 . + CDS 6423 - 6887 368 ## BVU_2146 conjugate transposon protein TraQ
14 1 Op 14 . + CDS 6899 - 7384 288 ## gi|237714169|ref|ZP_04544650.1| predicted protein
    + Term 7401 - 7451 12.5
    - Term 7389 - 7439 12.5
15 2 Op 1 . - CDS 7463 - 8236 487 ## COG3617 Prophage antirepressor
16 2 Op 2 . - CDS 8285 - 8971 217 ## gi|294808552|ref|ZP_06767298.1| hypothetical protein CW3_4495
17 3 Op 1 . - CDS 9351 - 9446 150 ##
18 3 Op 2 . - CDS 9450 - 9797 433 ## BVU_2108 hypothetical protein
19 3 Op 3 . - CDS 9810 - 10130 350 ## BT_0107 hypothetical protein
    - Prom 10345 - 10404 6.3
    + Prom 10528 - 10587 5.3
20 4 Tu 1 . + CDS 10773 - 11423 543 ## COG0546 Predicted phosphatases
21 5 Op 1 . - CDS 11447 - 11848 222 ## COG1895 Uncharacterized conserved protein related to C-terminal domain of eukaryotic chaperone, SACSIN
22 5 Op 2 . - CDS 11845 - 12168 252 ## PRU_1839 nucleotidyltransferase domain-containing protein
    - Prom 12207 - 12266 3.7
    - Term 12209 - 12266 2.3
23 6 Op 1 . - CDS 12318 - 12752 328 ## gi|237714177|ref|ZP_04544658.1| conserved hypothetical protein
24 6 Op 2 . - CDS 12799 - 14343 879 ## BT_4183 pectate lyase L precursor
    + Prom 14432 - 14491 5.6
25 7 Tu 1 . + CDS 14604 - 16574 1023 ## BVU_0152 polysaccharide lyase family protein 11, rhamnogalacturonan lyase
    + Term 16584 - 16628 5.2
    - Term 16570 - 16615 7.2
26 8 Op 1 . - CDS 16671 - 18260 1043 ## BT_1014 hypothetical protein
27 8 Op 2 . - CDS 18282 - 19274 566 ## gi|294647717|ref|ZP_06725278.1| hypothetical protein CW1_4628
28 8 Op 3 . - CDS 19317 - 22559 1974 ## BVU_1870 hypothetical protein
29 8 Op 4 . - CDS 22598 - 24217 1128 ## Dfer_1583 RagB/SusD domain protein
30 8 Op 5 . - CDS 24229 - 27369 2261 ## Slin_6567 TonB-dependent receptor plug
    + Prom 27545 - 27604 7.7
31 9 Op 1 . + CDS 27672 - 32030 2118 ## COG0642 Signal transduction histidine kinase
    + Term 32086 - 32123 -1.0
    + Prom 32076 - 32135 4.6
32 9 Op 2 . + CDS 32155 - 32934 419 ## gi|237714186|ref|ZP_04544667.1| conserved hypothetical protein
    + Term 32969 - 33009 5.0
    + Prom 33519 - 33578 3.3
33 10 Op 1 . + CDS 33598 - 34674 742 ## Fjoh_4231 hypothetical protein
34 10 Op 2 . + CDS 34714 - 35223 237 ## Fjoh_4224 hypothetical protein
35 10 Op 3 . + CDS 35189 - 38359 1494 ## BVU_1870 hypothetical protein
    + Term 38365 - 38408 11.3
    - Term 38347 - 38400 15.1
36 11 Tu 1 . - CDS 38428 - 39891 1072 ## COG5434 Endopolygalacturonase
    - Prom 39966 - 40025 7.6
    + Prom 40014 - 40073 5.9
37 12 Op 1 . + CDS 40148 - 41044 505 ## BVU_0152 polysaccharide lyase family protein 11, rhamnogalacturonan lyase
38 12 Op 2 . + CDS 40968 - 42017 625 ## BVU_0152 polysaccharide lyase family protein 11, rhamnogalacturonan lyase
39 12 Op 3 . + CDS 42032 - 42541 434 ## COG3250 Beta-galactosidase/beta-glucuronidase
40 12 Op 4 . + CDS 42520 - 44703 2040 ## COG3250 Beta-galactosidase/beta-glucuronidase
41 12 Op 5 . + CDS 44775 - 45023 315 ## gi|237714193|ref|ZP_04544674.1| conserved hypothetical protein
42 12 Op 6 . + CDS 45020 - 45544 336 ## BF3041 hypothetical protein
43 12 Op 7 . + CDS 45541 - 45747 148 ## gi|237714195|ref|ZP_04544676.1| conserved hypothetical protein
44 13 Tu 1 . - CDS 45818 - 50185 3522 ## COG0642 Signal transduction histidine kinase
    - Prom 50211 - 50270 4.0
    + Prom 50079 - 50138 3.5
45 14 Tu 1 . + CDS 50205 - 50417 75 ##
46 15 Op 1 . - CDS 50336 - 50650 442 ## COG3254 Uncharacterized conserved protein
47 15 Op 2 . - CDS 50670 - 52079 1505 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins
48 15 Op 3 . - CDS 52083 - 53924 1617 ## BT_4175 hypothetical protein
    - Prom 54031 - 54090 8.0
    + Prom 53931 - 53990 4.0
49 16 Op 1 . + CDS 54169 - 55308 992 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins
50 16 Op 2 . + CDS 55438 - 57072 1591 ## COG2755 Lysophospholipase L1 and related esterases
    + Term 57219 - 57281 13.4
    - Term 57207 - 57269 8.3
51 17 Tu 1 . - CDS 57391 - 60789 2372 ## BVU_0159 hypothetical protein
    - Prom 60842 - 60901 5.1
    - Term 60875 - 60925 12.0
52 18 Op 1 . - CDS 60960 - 63095 1366 ## Ndas_0923 cellulose-binding family II
53 18 Op 2 . - CDS 63173 - 64894 1666 ## PRU_2229 putative lipoprotein
54 18 Op 3 .
- CDS 64916 - 66961 1575 ## PRU_2228 hypothetical protein Predicted protein(s) >gi|222159253|gb|ACAB01000106.1| GENE 1 2 - 250 215 82 aa, chain + ## HITS:1 COG:no KEGG:BF1358 NR:ns ## KEGG: BF1358 # Name: not_defined # Def: conjugate transposon protein # Organism: B.fragilis # Pathway: not_defined # 1 82 692 773 837 140 79.0 2e-32 KTVRKYFGEAVVVTQELDDIVSSPIIKDTIINNADCKILLDQRKYINKFDSVQSLLGLTD KEKGQILSINQSNDPARKYKEV >gi|222159253|gb|ACAB01000106.1| GENE 2 257 - 430 123 57 aa, chain + ## HITS:1 COG:no KEGG:BT_4770 NR:ns ## KEGG: BT_4770 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 56 715 770 772 77 67.0 1e-13 MGGVRSAVYATETSVEEYLTYTTEETEKLEVTQMTEKLGGNMEQAIRMLASEKKKKK >gi|222159253|gb|ACAB01000106.1| GENE 3 451 - 1083 630 210 aa, chain + ## HITS:1 COG:no KEGG:BF1239 NR:ns ## KEGG: BF1239 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 210 1 209 209 211 51.0 1e-53 MRKRIISMLFVLSAFIVPASVQAQWVVSDPGNLVQGIVNSVNEIVETSETAQNALSTWKE TSKIFEQGREYYDKLRKVNDLISGSEKVKESVLMLGDISEIYVNNFGKMLTDKNFSQREL NAIAAGYNTIMKKSSRSITELKNIINPTGMSMNDKERIDLVNRVYREMTHYKELANYYTR KNLHVSYLRAKEKNEQQQVFDLYGKDERYW >gi|222159253|gb|ACAB01000106.1| GENE 4 1108 - 2127 974 339 aa, chain + ## HITS:1 COG:no KEGG:BVU_2139 NR:ns ## KEGG: BVU_2139 # Name: not_defined # Def: conjugate transposon protein # Organism: B.vulgatus # Pathway: not_defined # 1 311 1 308 337 352 55.0 2e-95 MVLLSISFENMHQILRNLYDTMTKLCHPMMDMAMALAALGALFYIAYRVWQSLSRAEPID VFSMLRPFVLGMCILFFDVMVLGTLNGIFSPIVQGTGMLLRDQTFDLQKYQQEKDKLRAD MMAKTMMTGRVFASNEELDAELDNMDWSEEEHTALQTMYEVSYAFSLQGIVQMVMRKLLE ILFQSASLVIDTIRTFFLIVLSILGPIAFALSVFDGLQNTLVQWLARYISVYLWLPVADL FGAMLAKIQTLILQEEMNLMADPMSVIDVDGSSAIYLIFMVIGIIGYFCVPTVSNWIVQA GGMSAYNRNVNNTTSKVTNVAGAAAGASTGNVGAVLLKK >gi|222159253|gb|ACAB01000106.1| GENE 5 2149 - 2256 76 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEFKSFKNIESSFRQIRLFTLVFLGICASLTTYAI >gi|222159253|gb|ACAB01000106.1| GENE 6 
2302 - 2772 168 156 aa, chain + ## HITS:1 COG:no KEGG:BT_2292 NR:ns ## KEGG: BT_2292 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 156 52 207 207 224 69.0 6e-58 MDQGKSLIVALSQDAALNRPVEAREHVRRMHELFFTLAPDKAAIESNINRAMYLADKSLY SYYRDWNEKGYYNRLISGNINQTVLVDSMQCDFDNYPYRITTFARQMIIRNSNITERSLV TRCFLQSTVRSDNNPQGFMAERFEILENRDIRTVER >gi|222159253|gb|ACAB01000106.1| GENE 7 2785 - 3045 224 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714163|ref|ZP_04544644.1| ## NR: gi|237714163|ref|ZP_04544644.1| predicted protein [Bacteroides sp. D1] # 1 86 1 86 86 161 100.0 1e-38 MKSRKIKTEISEKLKNVCEELSEKERKGVLVGMLVTSTVLCGITVARAFGRFLSHGTQKE LPFVHPVDSLHRADKDSIMYHPKTNE >gi|222159253|gb|ACAB01000106.1| GENE 8 3045 - 3233 262 62 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293368915|ref|ZP_06615516.1| ## NR: gi|293368915|ref|ZP_06615516.1| hypothetical protein CUY_3180 [Bacteroides ovatus SD CMC 3f] # 1 62 1 62 62 77 100.0 2e-13 MEDKDLEKPEAAEETAVQPENGRALKEEPEKEKKVLTPEEMEKRRKFIVIPAFVLVFLGV MY >gi|222159253|gb|ACAB01000106.1| GENE 9 3327 - 4352 660 341 aa, chain + ## HITS:1 COG:no KEGG:BT_0087 NR:ns ## KEGG: BT_0087 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 341 91 461 461 257 43.0 4e-67 MDKKTAAYEKQQAEVEEESRMKSLEQLAGTLMNEGKGASDTGEKEKDGDRLQQSVNTYEQ ISGQLNDFYETPKETANSDLERKVDELNKKLEEAEKEKNKTATREELMERSYKMAAKYLN PDKDTVQEKAVGKEPTETVPVQRAEYQTTSGLSQPVTDSAYIASLTVERNYGFNTAVGNS YQMGTNTIAACISENQIIEQGGRVKLRLLQPLQAGNITVPENSLVTGAAVIQGERLDILI SSIEYAGNIIPVQLATYDIDGQKGIFVPGSETRNAAKDAAGTVSESMGNSVSFARSAGQQ VVMDLTRGVMQGGTRLITGRVRAVKVTLKAGYKVLLVTKRQ >gi|222159253|gb|ACAB01000106.1| GENE 10 4386 - 5285 739 299 aa, chain + ## HITS:1 COG:no KEGG:BVU_2144 NR:ns ## KEGG: BVU_2144 # Name: not_defined # Def: conjugate transposon protein # Organism: B.vulgatus # Pathway: not_defined # 7 296 5 296 299 335 54.0 2e-90 MKIVIAFLMAVLASVSGFSQEERPVNPVRILTAGQHIIPYKIEVTFGKTVYILFPSEVRY 
VDLGSNNIIAGKADGVENVVRVKAAVKEFPGETNFSVITGDGSFFSFNVVYKEEPSTLNI NMDQWMNPDEGEKKGGSSIRVTELGEEDPTVIASVMYTIHRLDRRDVKHIGCRQLGMQAL LKGIYVHKDLIFFHVSLTNNSNVPFDVDFVRFKIVDKKIAKRTAQQETYIEPVRTLNALT RIEGKSTGRIVYAFPKIVIPDDKLLEVEIYEKGGGRHQRFYIENSDLVDARIVNELIGE >gi|222159253|gb|ACAB01000106.1| GENE 11 5288 - 5872 385 194 aa, chain + ## HITS:1 COG:no KEGG:BVU_2145 NR:ns ## KEGG: BVU_2145 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 194 1 194 194 194 48.0 2e-48 MKRVLFILLCICLPFVAGKSFAQRYLPGQKGIQLTLGGVDDFGSNVKHLHGNFQVGLALS RYNRNHSRWLFGADYVKKHYSYKDVAIPKAQFTGEVGYFVPFISDRGKNVFFSAGLSALA GYETTNWNCKLLYDGATLKNDGCFIYGFAPAFEMEAFLSDRLVFLFNVRQRIFFGSSVGN FHTVAAVGVKYIFN >gi|222159253|gb|ACAB01000106.1| GENE 12 5883 - 6371 173 162 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714167|ref|ZP_04544648.1| ## NR: gi|237714167|ref|ZP_04544648.1| predicted protein [Bacteroides sp. D1] # 1 162 60 221 221 321 100.0 8e-87 MEQGNWNVDEMLHWLDMKINREDRNIREQSKKMNENFLHFFEWNAESLYKSHFMSGCYKI LRQAVDGAKGMDTVWNIVEDNIAYCENKLLNGQVDCNSSSRTTNVAHFLKLECMQQLVRD YREFANILAQTPPEENLQQTANKTEKKREEPPERKIKTGIRR >gi|222159253|gb|ACAB01000106.1| GENE 13 6423 - 6887 368 154 aa, chain + ## HITS:1 COG:no KEGG:BVU_2146 NR:ns ## KEGG: BVU_2146 # Name: not_defined # Def: conjugate transposon protein TraQ # Organism: B.vulgatus # Pathway: not_defined # 3 153 1 153 153 153 50.0 2e-36 MKMKLIISGIMLTLLCLLGGCDDKLEVQQAYDFSLTSWYLQKTISPDETVEIRLTLNRSG NYEEAGYQIGYIQMEGSGEVYDKKKVYLVNREMQPLDSIAELDDSDPCRQVFTLFYHNRS SKNAEIKFVIADNFHQERELDISFQSETETDMEL >gi|222159253|gb|ACAB01000106.1| GENE 14 6899 - 7384 288 161 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714169|ref|ZP_04544650.1| ## NR: gi|237714169|ref|ZP_04544650.1| predicted protein [Bacteroides sp. 
D1] # 1 161 1 161 161 287 100.0 2e-76 MDKSDLRIEQLQQYLDKKKGVVESDIKEYNQQLGKNYLHFFDWHADDLYKACYMDKNYKA IQEAIDTAETPKDIEGYLKRCTLYIEEDLLNGPLVKKSTNPMSNMAHSLEMECKQELLKD LRYLNRLLQSETVSERIRPQEAPRQEIVPVKEKKKTGPRLR >gi|222159253|gb|ACAB01000106.1| GENE 15 7463 - 8236 487 257 aa, chain - ## HITS:1 COG:HI1418 KEGG:ns NR:ns ## COG: HI1418 COG3617 # Protein_GI_number: 16273324 # Func_class: K Transcription # Function: Prophage antirepressor # Organism: Haemophilus influenzae # 18 111 34 127 201 98 47.0 2e-20 MNKVSIFEHPEFGRIRTLEIDGKIWFCASDVAAALGYSNPRDAVVRHCKPMGVVVYDTPT RSAVQKIKYISEGNVYRLIAGSKLPSAEKFESWIFDELVPETLKNGGYLLKKNGETDNEL LARAILLAQNRIKERDSRISALEKENNYAILKLKLQAPKVQYYDKVLQSQSTYTTTQIAK ELGMTAGMLNKRLRWAGIQFRQSGQWLLKAPYQNQGYTATRTHVWESRTGETGTAMLTVW TEKGRLFIHYLFEAYLV >gi|222159253|gb|ACAB01000106.1| GENE 16 8285 - 8971 217 228 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294808552|ref|ZP_06767298.1| ## NR: gi|294808552|ref|ZP_06767298.1| hypothetical protein CW3_4495 [Bacteroides xylanisolvens SD CC 1b] # 1 228 103 330 330 450 100.0 1e-125 MLLKPFAVILNAKDIPSSRYLYRLVPSVHLTDGIEFGLLPTVTAQDYKRRGPGSRQQGLP EIIHGMLLPTPVATEIHHAERVRKWKSMNLSSPHAQIAGEKNPNGLTDFLDFYGILPEPI PDNTELENTDGNNLEESILQWLAEGQVMPTPTARDWKGAPSLENLKKRGKIPQKNSLPDF FARTGKSFQLNPLFVAEMMGFPPDWTVSPFLGEDRHPLKDTGTLSSPR >gi|222159253|gb|ACAB01000106.1| GENE 17 9351 - 9446 150 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKEEEQDKKQRKRLVHASLFSGFGAPDLAAE >gi|222159253|gb|ACAB01000106.1| GENE 18 9450 - 9797 433 115 aa, chain - ## HITS:1 COG:no KEGG:BVU_2108 NR:ns ## KEGG: BVU_2108 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 6 103 3 101 109 63 35.0 2e-09 MENQSKVIVIERNKFAALVKSHRKCLQMLNILTYIYTVKEVSLTLTLQEICEVLHMTPEE VEIQRQKGYIRFTTQKGMTVYEITDLLRLENMLEMGSIYRKIDKKVMNLEPLNNE >gi|222159253|gb|ACAB01000106.1| GENE 19 9810 - 10130 350 106 aa, chain - ## HITS:1 COG:no KEGG:BT_0107 NR:ns ## KEGG: BT_0107 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 1 93 1 90 92 62 36.0 6e-09 MEENRNDIEILKAMIKDNFEEQRKLIAKLETALEAVTSFNGKQMLDSRDMRLMLKVCDRT LIRWRNSGKLPFFKLSGKIYFWASDVYKFLREECLNEDFISDSINS >gi|222159253|gb|ACAB01000106.1| GENE 20 10773 - 11423 543 216 aa, chain + ## HITS:1 COG:TP0554 KEGG:ns NR:ns ## COG: TP0554 COG0546 # Protein_GI_number: 15639543 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Treponema pallidum # 1 198 4 203 222 133 36.0 3e-31 MKKLVIFDLDGTLLNTIADLAHSTNYALNKLGYPTHEIEKYNFMVGNGINKLFECALPEG EKTEENVLRVRNEFVPYYDIHNADDSRPYPGIPALLSYLQSAGIQIAVASNKYQAATEKL VAHYFPEIHFTAVFGQREGVNVKPDPTIVFDILKLANVRKEDVLYVGDSGVDMQTAANAG VTACGVTWGFRPRTELEEFNPAYMADAAEKIKKMVL >gi|222159253|gb|ACAB01000106.1| GENE 21 11447 - 11848 222 133 aa, chain - ## HITS:1 COG:TM1000 KEGG:ns NR:ns ## COG: TM1000 COG1895 # Protein_GI_number: 15643760 # Func_class: S Function unknown # Function: Uncharacterized conserved protein related to C-terminal domain of eukaryotic chaperone, SACSIN # Organism: Thermotoga maritima # 7 132 3 128 132 71 32.0 4e-13 MKLTDEERNSLVILQLEKAKVFLKQADEMFDMKYWDIASNRYYYACFHAVQALLIQNGLS CKTHDGLIACFGLNFIKTGKISARLGSFLARMEQLRQKGDYNCIYSISEDEISTIKAPAR ELIETIEVLLAES >gi|222159253|gb|ACAB01000106.1| GENE 22 11845 - 12168 252 107 aa, chain - ## HITS:1 COG:no KEGG:PRU_1839 NR:ns ## KEGG: PRU_1839 # Name: not_defined # Def: nucleotidyltransferase domain-containing protein # Organism: P.ruminicola # Pathway: not_defined # 5 107 3 105 105 133 61.0 2e-30 MKSHTNKILESIKQALTEHLPKGGKALLFGSQARGDARIDSDWDILIILDKEKLEPEDYD KVSFPLTMLGWDLGARINPIMYTMKEWAASCITPFYKNVEQEGIELV >gi|222159253|gb|ACAB01000106.1| GENE 23 12318 - 12752 328 144 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237714177|ref|ZP_04544658.1| ## NR: gi|237714177|ref|ZP_04544658.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 144 1 144 144 278 100.0 5e-74 MNRKTIINRPLDVFRKAMLFLLLCVLIASCSKEDVITEETGITGTPHFIWDIAVELEQPN NWCMTLATEKVAITNLTEGKQYILTWKGGLSTGRKAGGVLKTVIRGEQTKNTDLDLLEIK ESGNNTYEIFLRGGGRKGEIVFTK >gi|222159253|gb|ACAB01000106.1| GENE 24 12799 - 14343 879 514 aa, chain - ## HITS:1 COG:no KEGG:BT_4183 NR:ns ## KEGG: BT_4183 # Name: not_defined # Def: pectate lyase L precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 433 3 429 437 624 70.0 1e-177 MKRISLLLFFIGCFLFSLSATNYFVATNGSDSNSGTIDKPFATLGKAQSKVMAGDTVYIR QGTYRVTEAEIMEHYTAGSTTWSRVFKMSKSGTGPDKRICYSGYKNERPVFDLSAVKPEN ERVIVFYVSGSYLHFRNIEVVGTQVTIVGHTQSECFRNEGGSDNIYEHLSMHDGMAIGFY IVTGKNNLVLNCDAYNNYDPVSDGGKGGNVDGFGGHLTSPQYTGNVFRGCRAWYNSDDGF DLINCQAVFTIDNCWSFLNGYTKDGGKAGDGTGFKSGGYGMSDSPKAPSVIPMHIVQYCL AYMNKNKGFYANHHLGGIAWYNNTGYQNPSNFCMLNRKTASEAVDVPGYGHIIMNNLSHT PRSSGKHIIDVNQAECEIANNSFLPVDMAVTDDDFVSLDASQLTLPRKSDGSLPYVEFLR LKTNSKLYNAGMGCFLTGGGEDTSYDWLEDAAILVEGNVAKIVGHGAEAFVYFYINGKAV SFSDRQVDLSAYKGEIDLKATTDNGGVTKLKVIR >gi|222159253|gb|ACAB01000106.1| GENE 25 14604 - 16574 1023 656 aa, chain + ## HITS:1 COG:no KEGG:BVU_0152 NR:ns ## KEGG: BVU_0152 # Name: not_defined # Def: polysaccharide lyase family protein 11, rhamnogalacturonan lyase # Organism: B.vulgatus # Pathway: not_defined # 71 656 50 624 635 422 42.0 1e-116 MKNLFYTLTFFSFVTACSDNSVQNPTPPPVSDNGDNNNMQVVEFLRNRAMVASYNTIFPN KETERIERKGVLLSWRWLSTDPDDIGFDIYRKEGNYKFQKLNKTPIINSSNYKDLTANIN KILKYEVRQANTNNILCSCNFTPEMAQNFYRTVPLNNNNLPYPDLVYKASDAAIGDLDGD GDYELVLKREVSPLDNGSTGIGITPGSCLLEAYKLTTGTFLWRIDLGSNIRQGIHYTPFI VYDLNGDGKAEIAVRTSEGTVFGDGTKIGDVNQDGITDYVDRAPQSATYGRIITGPEFLS IIEGRTGKEVARTDYIYRGEKNKWVTYWGDNWANRMDRFLMGVGHFRSQKGIPSLLMCRG YYKNYQIVALDFTDNKITERWHFDTADNYSDYIGQGNHNLAVGDIDDDGKDEVLYGACVI DHNGKGLYSTKLGHGDAMHLGKFDPTQEGYQVVVCHEEPKEYGNIGTEFRDARTGRILHY IPGNGKDVGRCMVADVDPDSPGCEYWSSEPDGVMYSCKGNELTGKRAPIAKGGDTSYNMT IWWSGSLNRQMLDYLVIHSYTDGRLFNGSDWGVKTASGTKNNACFYGDIWGDWREEVIFV DENDTELRIFTTDFETDYRFHPLMDDHLYRLSATHQNIGYNQPTHPGYYIGSDLNK >gi|222159253|gb|ACAB01000106.1| GENE 26 16671 - 
18260 1043 529 aa, chain - ## HITS:1 COG:no KEGG:BT_1014 NR:ns ## KEGG: BT_1014 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 20 492 19 470 506 74 23.0 1e-11 MRYCKFIFVIYVLLGSFLMVACSQDEDAKLAVNSNILKIEVVDSSIQQNGNVSRAVTDVT YKTTFSDGDAIGIFAVNSDKEVFIKNVLATYNDGIWGIDGGRLSCTEDLETVTFYAYYPY KENITIDMTKEDPFETIVGNWTVDTDLSGDRYTNNDLMTGEASADGSTITFVMNHRMALM VAELPSVTYNFTNEVSPELPSYSVSLREVKFSIGEQVIIPYYDKETTTYRVLVNPTKKVE QIGGSFISSVDNGLKKYSIDATKLKAGEYIYCEIDGGLQTVDHELKVGDLIYSNGALASV DDNAPVSDDCVGVVYFVGNPMPSVLYPFTEDNEFTYSERQDALLRDHPGCTHGLVLGLKE NTNIVFGEKDEIRVWYRTEFAERNSYIDLSPMGWDGSASTGTLNGTSRDQRLGYNHTEVI KKYAEAKNKSLLVNSLNNYNLVAPSISSGWFIPSVGDLRAMVTNWETMNTQLGKITGSDG LKNDMYYWSSTERNNTSIFGVQLTSSGSTKVDGLKYDRPGVSSRYVFAF >gi|222159253|gb|ACAB01000106.1| GENE 27 18282 - 19274 566 330 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294647717|ref|ZP_06725278.1| ## NR: gi|294647717|ref|ZP_06725278.1| hypothetical protein CW1_4628 [Bacteroides ovatus SD CC 2a] # 1 330 1 330 330 637 100.0 0 MKRKMPFIAALSILCWGCSSYDYSGDDIVGVKAAISGTITEVVEKSRTVGTTWTDGDRIG VTCEDDVNISYKYTGNLSSFAAFDENRSIYFLGKQEHVLSAYYPFTETSVMVADCITVET TSDKQTQEKQMSIDFLYATTEAGRNNPDVNFAFSHQMSRIDFSFEGKDGLTLDDITYTLI GLKLKGTFNTIDGTTAVVEDAPALALSQKVMAGENMRTSLIVFPQKVSEVTFEIEMGGKS FIKKIGEMDLVPGHIHPYTVIISERDETIYVTVESGEVQGWVEGDRQEINTSDGNTITGV EPGDVQWDGGNTQTIVSQGGRSISMELYDN >gi|222159253|gb|ACAB01000106.1| GENE 28 19317 - 22559 1974 1080 aa, chain - ## HITS:1 COG:no KEGG:BVU_1870 NR:ns ## KEGG: BVU_1870 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 5 1080 9 1085 1085 1508 65.0 0 MNKSCLIGFLMCSAFSLVVQAATDGAHITKTISQIERKKGIKRTIQYRPDGNDFVCVNGK NRYTRALYGGITDFRIETSDRPVFAAYKPKDSKHICFRLQCGNQSIALDSLKFCEARYRA GRRTYRLTDPLLGKGELNISVLAFPDAEGGIWRFTMKDMPEGALLHCYISEIRAKNLSRN GDMGADRADSFDAPQLPKEQKHYKLSLDGTTAYIVVEDQELRLPALVEGTSLYDKSEKWR SEIESSLQISTPDPYFNPLGGALAAAADGIWDGKIWLHGAIGWRSQLNGWRAAYIGDFIG WHDRARTHFDSYAASQVTDVPATIPHPTQDTALHLARSVKKWGTPHYSNGYICRTPYRKD 
QMHHYDMNLCYIDELLWHFNWTGDLDYVRKMWLVLTSHLAWEKLNYDPDNDGLYDAYCCI WASDALYYNSGAVTHSSAYNYRANKLAAMLAEKIGEDATPYEKEADKILKAMNERLWIKD KGHWAEFQDFMGHKRLHESAGLWTIYHAIDSETADPFQAYQATRYVDTSIPHIPVFADGL DRDYETLSTTNWMPYVWSVNNVAFAEVMHTALAFFQAGRGDDGFNLLKGSILDGMYLGKS PGNFGQISLYDAVRRETYRDFADQIGITSRTLIQGLYGVSPDALNGKLVIRPGFPMAWDK ASMSMADLAYEFLRKGDMDIYNVTQRFKEPLALTLQVNAVKDKIRLVKVNGKAVKWKTEE AANGYPLIVVSVPAVTKAEIEIQWEGNVLQQIANDEIVTTLEGQVDLNAPQGISFLKVYD PQNVLVTKKVNTTKLSSKVNKDKKGHHTFFVYTRQGNMEWWQPVNVYVNVPQVVYNGFEN IETGKCRVVNMDKQFNSSVADIFKNEYLSPRSPYTTLQLPTQGIGEWCHPLLTAEIDDSG LRSLVKDNMYKTSLGIPFRVMKEGNNIAYTSLWDNYPDMVKVPLSGKASHAYLLLVGSTN HMQCRIANGIVRVYYTDGTSEVLELVNPDNWCPIEQDFYLDDFAFDAPRPRPYRFHLKTG IVSRDLGKALKLRGSADRRFEGGAGVMLDIPLDKNKTLKELTLETVANDVVIGLMSVTLQ >gi|222159253|gb|ACAB01000106.1| GENE 29 22598 - 24217 1128 539 aa, chain - ## HITS:1 COG:no KEGG:Dfer_1583 NR:ns ## KEGG: Dfer_1583 # Name: not_defined # Def: RagB/SusD domain protein # Organism: D.fermentans # Pathway: not_defined # 1 538 1 477 479 175 31.0 4e-42 MKLYQYILSGAMSMGLLFSSCSDWLDVTPVDTRTTENFYTTPSQMEQALIGVYNGLLPLS TYALLMSEVRSDNTWCGELSTAQRDYMDISSFNPNISTIATVRDAWNDLFEIVSRANLFL SKVDGVNYTIEGIKEQQIGEARFLRALAYFDLVRYYGRVPLTLVPQTISEAMSTPQSEAV EIYEKVIIPDLEYAVGSLTEEPKDYLGRKSASGRATLTAAKALLGRVYLTMAGFPLYDET KKELARELLEEVIDYADATGKFWAKNASEWQRMWINENDNKYHIFEIQYIVAKDYGNPMV FHSVPRLPSKYVTLEMSGNSIACAKGLDNLLKEEQDEEGHFLDVRCLATIDTTKFVNDDN PNSVTKYAGEDFFIKFLEHKMKREALGYDDINAQIVDRTYFPLNFPLIRLEDVMLMYAEI VGPTSKGVDMVDRIRRRAGIPVLTNAEKEPAAFRECVDKERRRELACEGIRWHDLVRHNN LQAVRDKFQEYAVDANGNVIRPTLLLYIRQIKDGTYLYPIPDSQMKVKEGLYVQNEAYR >gi|222159253|gb|ACAB01000106.1| GENE 30 24229 - 27369 2261 1046 aa, chain - ## HITS:1 COG:no KEGG:Slin_6567 NR:ns ## KEGG: Slin_6567 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: S.linguale # Pathway: not_defined # 33 1046 141 1155 1155 654 38.0 0 MKSTVCEISCWRRKYIFFFLMMISALHSVTVFAQKRTISGTVVDERNEPLIGATIVAEGK SGMTITNVDGKFTLAVPDKAKTIEVSYLGYASQTISIKGKSVFAIRMSENDVQMDEVVVI GYGTAKRGNVTGAIAKVDAAKLEDRPAPNLASSLQGQLAGVEVRSTNGAPGSELQIRVRG 
AASINADATPLYVVDGIPIDDLGSINPNDIQSIEVLKDASSSAIYGSRGANGVVLITTKM ANKDDKVRVQFSAAFSIQQLEKKIDVLSPEEWITFRTAYNNSRYLAALGGKGATADDDWD TRYALNGNRVNYEYMNDPRWTQPGYGGLRLIDWQDEFYRLAPMQNYQLSVSNGRGNTQYR LSLGYIDQQGIAIESAYKRLNLRANVETKLYDRITLGVNMAPSMDWKDGGRVDGKDQQAN QVLTMCPVAEPDAGIYSGAEPYASYLWSSTKISPIAYMEQVTNHIETARLNSSAFIRVNI IDGLRAEVKGAYDFTSRQTRTFTPSSVIKNWTDGEGYHSSANRIDIRSSKYLLQAVLNYD KKLGKHNIAAMAGYSMESSSGATTNLSAQQFPDNSLEVFNQSDEAIKVALATLSTPNRLL SYFGRAQYEYDNRYLLTASIRRDGSSRFGKNNRWGMFPAVSAAYRISNEKFWPEKFVVNQ MKIRGSWGMNGNNSISTNAAIGLMGSSNYSIGGSLTNGFAPSSIDNKELGWEKTHSWNVG LDLGFFDNRITLAADYYDKTTKDLLYQVSVPGTMGFTQAWGNIGSIKNKGFEIELTTQNL TGRFKWTTSLNIAYNKNKVLSLGEDNSTVFIGWDKSNTQVLMVGQPLRAYYMYDAVGVYQ YKEDLRKYPTMSNSIQGDVRYRDVNDDGKINDQDRTLVGKPDPDYTFGMTNTFKYKDFDL SVLLTGQTGGMIYSLLGRGMDRPGMGASINVLSRWENMWVSEEQPGDGKTPGINNSNTGS LYDTRWLYSTDFIKIKNVTLGYRIPIKNKKIINYARVYISGENLLMWDKYEGGYSPEVNN DGKNSDYDYGSYPQARTITLGVNVTF >gi|222159253|gb|ACAB01000106.1| GENE 31 27672 - 32030 2118 1452 aa, chain + ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 887 1131 49 305 318 132 31.0 4e-30 MRFFIKATIFFFICTCIIYPANASIEIHARNLTATDGLANNSIRHIYQDSKGFIWFSTLN GLSRYDGNSFVTFRPQSGEELSLADHRIRSVQEDSDGFLWITTSAEQISCYDLKKDCFVD FTGKGEMKDRYNEVTILPNKDIWLWGRVQGCRKITYKNNKFSSETYSTDNHKLKSDNIRF LSVFDSNSIWIGTGKGLYLLKDNTLECIDSTHYFLQSAVIDNTIYFITSEGFIWKYNQQH LTKVANTATSTDFLLTGNIVLRNEWLLFTKQGGILFNPKTNTTRIAPEELNIRNGLVVKD NRNNYWVHNQTGYINYIQKESGSVKKIDVRSSKLPYFIDYERYHVYHDSRDIIWITTYGN GLFAYNPTTKELEHFTATANHTNPIASNYLQYIIEDHSGNLWTSSEYSGISQIEIINKGA AKVYPEGELNIDRANAIRFISHMQDDEVWITTRAGGLYIYDNKLTKQKSKKYYDINIYSA CEDSKGNIWLGTRGKGLQVGDDQHYIHQATDTNSLAADPVFCILQDRKQRMWIGTFGGGL DLAVPKKDKYIFRHFFNKTYGQKEIRTICEDRNGWIWVGTSEGVFVFDPDRIIKDPNDFY QYNLDNHALKSNEIKSIIQDKKGRIWIAESGIGFCVANIKNDYKDISFTHYTVNDGLVHS VVQAFIEDDEGNIWVSTEYGISCFNPENKIFNNYFFSNDILGNVYTEGCAKLKDGRLAFG 
TNHGLIILNTKQIKNKEKILSVTFTDLKLNGISVRPADMDSPLTAALAYTDAISLKYYQS SFVIDFSTFDYPISTNTRFSYKLEGYDDDWSIPSTLNFAAYKNLPAGTYYLHVKACSVSG IWSDNEETLEIKVTPPFWATGWAFFVYILIAGIIMYFVYRTIRNINNLRNKIKVEKQLTE YKLVFFTNISHEFRTPLTLIQGALDRIHRTHNIPKEIRYSIKLMDKSTQRMLRLINQLLE FRKMQNNKLALSLEETDVIAFLYEIYLSFQDTAESKNMDFKFIPSVNSYKMYIDKGNIDK IAYNLLSNAFKYTPSGGKIEFSIYIDKQKQLLIMKVTDTGVGIPKEKRNELFKRFMQSSF SSDSIGVGLHLTHELVHVHKGNICYEENPSGGSIFIVTLPTDSSIYQSNDFLIPENAILK EEAQNHPSLSALNEENAHSESEEEIDKEVENIEKELKTELNASDQEGPLNKRKILIIEDD NDVREFLKEELTPYFEVAAEADGKNGLEYAHNNDIDLIISDVMMPGYNGFEITRKLKSDF STSHIPIILLTALNAAESHLEGVKSGADSYITKPFSTKLLLASIFKLIEQRDKLKEKFSN DLSAKRPVMCTSDKDKEFVENLTKIVEEQLTNPEFTADDFASMMSLGRTIFYRKVRGVTG YTPKEYLRIMRMKKAAELLSTKKYTVSEVTYMVGINDPFYFSRCFKAQFGISPSSYQKRY QEGIRETEINED >gi|222159253|gb|ACAB01000106.1| GENE 32 32155 - 32934 419 259 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714186|ref|ZP_04544667.1| ## NR: gi|237714186|ref|ZP_04544667.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 259 1 259 259 503 100.0 1e-141 MKRSVLILSLAAFVLSSNAQVLKSNLLNGYKQGDKLEKATYNKADAPIKIDTWCEAFSSQ TDKNAGSPITGQELSYEGYTEKGVSIKIGELPENVKGSRFSVYSISEKNDYCRGTLYLSF LANFSSVASSRLGDFIALSPVHTGGNLRARVYVGKIDDEHIRFGTNLLRVNAESFKSHAL NKTHLLVLKLDYKKNEVSLFVDPALTAEEPQPDVVAKGDEDNNKLKGGIRSISFRNRDDL TGNVGNFRFSSSWAGVIAQ >gi|222159253|gb|ACAB01000106.1| GENE 33 33598 - 34674 742 358 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_4231 NR:ns ## KEGG: Fjoh_4231 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 17 355 34 367 368 410 56.0 1e-113 MKHILFYALFFILKIASAQAQSLDQPKNYPADAISSFAERLEPMGRILEDDNYYVWCCAP IIDEKNKVHVFYSRWEKKYEMKGWLGHCEIAHAVADQPEGPYKYVSTVLSPRPGYFDGNT CHNPSIYKIDGRYWLFYMGATNGKFGTKRIGYAVANSIWGPWERCEKPLLMPGNTGEWDD YITTNPAFLKHPDGKYWLYYKSCNENEYVNQHINGISGNRKYGVAFADKITGPYRRYEKN PIVDFSGFGENRQVEDAYIWYENGIFKMLMRDMGFYDHTVGLYFESKDGLNWQNPQIGWF GAEHYIKQPPAPKHLKRYGRFERPQVLMKNGKPAYLFTASQGGKAETSSGFVFKIKPE >gi|222159253|gb|ACAB01000106.1| GENE 34 34714 - 35223 237 169 aa, chain + ## HITS:1 COG:no 
KEGG:Fjoh_4224 NR:ns ## KEGG: Fjoh_4224 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 32 162 31 160 1136 112 38.0 5e-24 MKKIILILFTLLQFPANAKDLPHSSYWHGEERTLRYKPEGEEFVITNGNKRFTRAIYGTN TGFRFETSDFPEFGLYMPNLGGSVYMAISTPSNITWIKDMEFIESRFKSGQRTYIVRDRR HLGNGSLTIDAVAMSDGDGLVVRYKAKDIPAGTKILWIYGGSQQSEIRT >gi|222159253|gb|ACAB01000106.1| GENE 35 35189 - 38359 1494 1056 aa, chain + ## HITS:1 COG:no KEGG:BVU_1870 NR:ns ## KEGG: BVU_1870 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 94 725 200 833 1085 668 50.0 0 MGGANNQKFGREGDLNSDPADCFFIKANNCSGNIYQINQNKFTIQYGHHTSNRASQHKVM SIAGTFPIGTDIKEVDGNLIDQLDELITSRKSNKPVIIAKYPVSTIPFYIELHNPKSHSE FQYNDLSNVFNKGVEFRTQIASRIKIKTPDPFLNTLGGTFSGAEDAVWQSPSYLHGAICF RELLLTGWRGAYLADLMGLEDRAKTHFNGYLNSQVIDVPVTLPHLQDTALNLARSAKIWG TPMYSNGYIGRVPNRRDVMHHYDMNLVFIDQLLWHLKWTGDLDYAREIYPALQRHFTWEK KLFDPDNDGLYDAYACIWASDALQYNGGKVTHSTAYNYRANKIMAEISEKLGKDPSIYKK EAELILSAINKELWIKDKGWWAEFKDNMGNRIRHDNAAIWTIYHAIDSDIHDPFKAYQAT RYIDTEIPHIPVMGKGLKDTANYVISTTNWQPYMWSINNVAFGEISHTALAYWQTGRYEE AFKMFKGAVLDAMYLGSGPGNVTQLSFYDAARRETYRDFADGIATGVRALVQGMYGIMPD LINNRLTIRPGFPDDWNFAEIETQNMAYTFERKGNIERYSITPNLLKKDVSLSMEIKAIR NKIKSIKVNGKDTPYTLLTTTILSPEIKFEAGIADKYDITIEWDGEKINRDMISVNVANG SQFRLNIPYKSGKIYDPQNVLRHANLKNDILTGTITAQNGHRTVFVNITNGNMNYWLPID INVNNPLDIVCDSESSSLIFTLKNNMDKVIKGDLYINGKKVNENINIEAHGKNNYEFDIP IASSGTNRIKVKSGKDTYSFRAINWNISVPEKSVYKTVDMKKIFNDKVSNIFAYGKYMFP RWKYTTLQVPTQGMGQWCHPQSISVIDDRGIRNKASRNNNRFIMPQGIPFSTPGEKEYNN IAFTTLWDNYPTSINIPLNGKASKAYFLIAASTYYMQSHIVNGEIKIEYTDGQKEVLKLI LPDNLIPLDQDIFVDGYAFNTKDPRPWRVRLKTGDVSKYHAGELGKTISNNPISIDGGMA TMLDLPLNPVKELKSLSLETTANEVVIGLMGVTLVK >gi|222159253|gb|ACAB01000106.1| GENE 36 38428 - 39891 1072 487 aa, chain - ## HITS:1 COG:TM0437 KEGG:ns NR:ns ## COG: TM0437 COG5434 # Protein_GI_number: 15643203 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Thermotoga maritima # 58 366 33 362 448 147 28.0 4e-35 
MKYLLSIVLLALIGFTTPKKTIPAYDWGSVSAEPDLSWADQVGAQRTPENTEWDAGKFGL RNDTSVFSTPAIQAAIDACHQQGGGTVVVAQGYYKIGALFIKSGVNLHLSKGTTLLASDN IQDYPEFPSRIAGIEMTWPSAVINIMDAENAALTGEGFIDCRGKVFWDKYWAMREEYEKK NLRWIVDYDCKRVRGVLVSNSKHITLKDFTLVRTGFWACQILYSDHCSVSGLTINNNVGG HGPSTDGIDIDSSTNILVENCDVDCNDDNICIKAGRDADGLRVNRPTENVVVRNCIARKG AGLLTCGSETSGSIRNVLAHDLIAYGTGSVLRLKSSMNRGGTVENIYVTGVEADSVSNVL EVDLNWHPKYSYSKLPVEYESREIPAHWKIMLTPVEPKEKGYPHFRNVYFSHVRAKRSKC FITASGWNDSLRIENFYLFNIRAQVKTAGKIVYGKKVRLSEIYLEVEDKSQVRQEYNIDS KIEISYQ >gi|222159253|gb|ACAB01000106.1| GENE 37 40148 - 41044 505 298 aa, chain + ## HITS:1 COG:no KEGG:BVU_0152 NR:ns ## KEGG: BVU_0152 # Name: not_defined # Def: polysaccharide lyase family protein 11, rhamnogalacturonan lyase # Organism: B.vulgatus # Pathway: not_defined # 10 279 22 293 635 397 68.0 1e-109 MLLPVVTMTAQPGYNYSKLQREKLNRGVVAIRENPSEVIVSWRYLSSDPIQTGFNVYRDG KNLTDTPITVSTLFRDKNNSQKTAVYEVRPVLKGKETHHIDGTYTFPENAPLGYLEIPLQ KPADGITPAGDTYTYSPNDASIGDVDGDGEYEIILKWDPSNSHDNAHEGYTGEVYIDCYR MNGEQLWRINLGKNIRAGAHYTQFMVYDLDGDGKAEVVMRTADGTVDGKGKVIGNADADY REAGTFDPSRNQMMKQGRILKGKEYLTVFSGDTGEALHTYHRLYPCPRQRCRLGRRKR >gi|222159253|gb|ACAB01000106.1| GENE 38 40968 - 42017 625 349 aa, chain + ## HITS:1 COG:no KEGG:BVU_0152 NR:ns ## KEGG: BVU_0152 # Name: not_defined # Def: polysaccharide lyase family protein 11, rhamnogalacturonan lyase # Organism: B.vulgatus # Pathway: not_defined # 5 349 291 633 635 597 78.0 1e-169 MKPYIHTIDYIPARGNVADWGDAKGNRSDRFLACVAYLDGVHPSVVMCRGYYTRTVLAAF DWNGKELKNRWVFDSNHPGCEQYAGQGNHNLRVGDVDGDGCDEIIYGSCAIDHNGKGLYS TRMGHGDAIHLTHFDPSRKGLQVWDCHENKRDGSTYRDAATGEVLLQIKSNTDVGRCMAA DIDPTHPGVEMWSGDSQGIRNVKGEIIAPKMRNMPTNMAVWWDGDLLRELLDRNMIIKYD WENKKFVPLVKFTGTLFNNGTKSNPCLQGDIIGDWREEVLVRSENNAALRLYVSTIPTEY RFHTFLEEPIYRISIATQNVGYNQPTQPGFYFGPDLIKMKGTFRGYQFK >gi|222159253|gb|ACAB01000106.1| GENE 39 42032 - 42541 434 169 aa, chain + ## HITS:1 COG:SPy1586 KEGG:ns NR:ns ## COG: SPy1586 COG3250 # Protein_GI_number: 15675473 # Func_class: G Carbohydrate transport and metabolism # Function: 
Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pyogenes M1 GAS # 24 149 51 181 1168 79 32.0 3e-15 MKHKLFLFTFLLAILSAINIQASERKKYNFNSEWKLRIGDFPKAKDTKFDDSKWKQVTLP HAFNEDEAFKLSIEQLTDTVVWYRKSFQIPELKSNQKVFVEFEGVRQRGDFYLNGHYLGR HENGVMAVGFDLTPHIKEGENVIAVRTDNDWMYREEGTNSKFQWNDRRR >gi|222159253|gb|ACAB01000106.1| GENE 40 42520 - 44703 2040 727 aa, chain + ## HITS:1 COG:SP0648_2 KEGG:ns NR:ns ## COG: SP0648_2 COG3250 # Protein_GI_number: 15900551 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pneumoniae TIGR4 # 140 524 438 871 871 110 26.0 1e-23 MERPQTVKASAILHNLHFWSWGYGYLYTVKTALKDDNNQIFDEVSTRTGFRKTRFAEGKI WLNDRVIQMKGYAQRTSNEWPAVGLSVPAWLSDFSNDLMVKGNANLVRWMHATPWKQDVE SCDRVGLIQAMPAGDAEKDREGRQWEQRVELMRDAIIYNRNNPSILFYECGNKAISREHM IEMKAVRDKYDPFGGRAIGSREMLDIREAEYGGEMLYINRSEHHPMWATEYCRDEGLKKY WDEYSYPFHKEGDGPLYKGQPATDYNRNQDELAITMIARWYDYWRERPGTGNRVSSGGTK IIFSDTNTHYRGAENYRRSGVTDAMRIEKDAFYAHQVMWDGWVDTEKDQTYIIGHWNYPD NTVKPVQVVSTGEEVELFLNGNSLGKGKRQYNFLFTFDNVAFKPGKLEAVSYNKAGKEIS RYAVNTAGEPASLKLTAIQNPEGLHADGADMTLIQVEVVDKDGQRCPLDNRTIQFTLKGQ AEWRGGIAQGKNNHILDTNLPVECGINRALIRSTTAAGKVTLTAQAKGLLSASLTLETVP VKVTGGLSTYLPQATLKGRLDRGETPSTPSYKDSKKGVRIVSAKAGSNNNDAEKSYDDIE LTEWKNDGKLSTAWITYTLERDAEIDDICIKLQGWRSRSYPLEVYAGNTLIWSGNTDKSL GYIHLNVEKPVRANTITIRLKGNTSDKDAFGQIIEVEAIAANTMELEKSSSKHQLRIIEV EFLETIK >gi|222159253|gb|ACAB01000106.1| GENE 41 44775 - 45023 315 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714193|ref|ZP_04544674.1| ## NR: gi|237714193|ref|ZP_04544674.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 82 1 82 82 140 100.0 2e-32 MCKQPDILESEKIHFAVMAIEAGAREMGISPMEMRRRLEKMNLIKRLLLDNYEVMHTQSL KHVGEDVVEALKNWEAQEGEKG >gi|222159253|gb|ACAB01000106.1| GENE 42 45020 - 45544 336 174 aa, chain + ## HITS:1 COG:no KEGG:BF3041 NR:ns ## KEGG: BF3041 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 3 168 2 165 171 163 48.0 3e-39 MKITLYHGSTLSIEHPLAKVGRADLDFGRGFYLTSLRSQAEQWAARVQLLRASTTAWINV YEFDMDAAIKAGFKLLRFDAYDQHWLNFIVASRNGKQPWKDYDIIEGGVANDRVIDTIED YLNDIITIEQALGQLVYAQPNHQICLLNQQLIDTYLHFDNSFPLDTMDRKGGNK >gi|222159253|gb|ACAB01000106.1| GENE 43 45541 - 45747 148 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714195|ref|ZP_04544676.1| ## NR: gi|237714195|ref|ZP_04544676.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 68 1 68 68 120 100.0 4e-26 MREQVLWRKISRIVLLLSARLEISPERALNLFYETNVCAMLHDSRYGLHLMSDTYIINDV LRELQDKQ >gi|222159253|gb|ACAB01000106.1| GENE 44 45818 - 50185 3522 1455 aa, chain - ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 924 1144 8 227 294 131 39.0 1e-29 MECFFYYLNYLDYFCPTFIINLGIYMTQRHLLVFTFLVSFVLNAYSAIELRSTQMRTSDG LPNNSIRYIYQDSKGFLWLATLNGLSRYDGNSFLTFRPEAGDKVSLADNRIYDLTEDKDG FLWISTTPELYSCYDLQRARFVDYTGCGELRQNYSTVFVTANGDVWLSHQGNGCRRMVHQ KNGEMTSTVFRTERGNLPDNRVKFVNEDASGRIWIGTQCGLVSVSNGQYRIEDRLIHFTS SLAYKDDMYFLTADGDIYYYHSATQKMQKLAALSTVAGQTSPTGNFLLKDKWMILTTTGV YTYDFTTGEVAADPRLNIKKGELIRDNHGDYWIYNHTGRLTYVIAATGESKDFQLIPQDK ISYIDFERYHIVHDSRGIIWISTYGNGLFVYNTAEDKLEHFVANINDQSHISSDFLQYVM EDRAGGIWVASEYSGLSRISVLNEGTSRIYPESRELFDRSNTIRMLTKMSNGDIWVGTRK GGLYTFDANLQSKMTNQYFHSNIYAIAEDRQGQMWTGTRGNGLKVGDTWYYNTPSDPTSL SDNNVFAIYRDRKDRMWVGTFGGGLELAEPTSDGKYKFRHFFQQTFGMRMVRVIEEDENG MVWVGTSEGICIFHPDSLIADGDNYHLFSYTNGKFCSNEIKCIYRDTKERMWIGTSGSGL NLCTPQDDYHSLKYEHYGTSEGLVNDVIQSVLGDRKGNLWVATEYGISKFTPSTRSFENY FFSSYTLGNVYSENSACMREDGKLLFGTNYGLIVIDPEKIQDSETFSPVVFTDLYVNGTQ MNPQMEDSPLKQSLAYSDEITLKYFQNSFLIDFSTFDYSDSGHTKYMYWLENYDQGWSAP SPLNFASFKYLNPGTYILHVKSSNGSGIWNDSETTLKIVIVPPFWKTTWAMLCYVLLLMV ALYFAFRIVRNFNGLRNRINVEKQLTEYKLVFFTNISHEFRTPLTLIQGALEKIQRVTDI PRELIYPLKTMDKSTQRMLRLINQLLEFRKMQNNKLALSLEETDVISFLYEIFLSFGDVA EQKNMNFRFLPSVPSYKMFIDKGNLDKVTYNLLSNAFKYTPSNGTIILSVNVDEGKQTLQ IQVSDTGVGIPKEKQNELFKRFMQSNFSGDSIGVGLHLSHELVQVHKGTIEYKDNEGGGS VFIVCIPTDKTVYSEKDFLVPGNVLLKEADGHAHHLLQLSEELPDPEKMAAPLNKRKVLI IEDDNDIREFLREEIGAYFEVEVAADGTSGFEKARTYDADLIICDVLMPGMTGFEVTKKL KTDFDTSHIPIILLTALNSPEKHLEGIEAGADAYIAKPFSVKLLLARVFRLIEQRDKLRE KFSSEPGIVRPAMCTTDRDKEFADRLAAILEQNLARPEFSIDEFAQLMKLGRTVFYRKLR GVTGYSPNEYLRVVRMKKAAELLLSEDNLTVAEVSYKVGISDPFYFSKCFKAQFGVAPSV YQRGVNNEGINEKNE >gi|222159253|gb|ACAB01000106.1| GENE 45 50205 - 50417 75 70 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLKDNYYHFLDADYMNDADFLFLNLCHPCNPRLKNYRLIELLKTQSMWNTSGKGMDTGEL SGFTSMISAI >gi|222159253|gb|ACAB01000106.1| GENE 46 50336 - 50650 442 104 aa, chain - ## HITS:1 COG:mll5702 KEGG:ns NR:ns ## COG: mll5702 COG3254 # Protein_GI_number: 13474745 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 2 103 3 104 105 111 
54.0 3e-25 MKREAFKMYLKPGCEAEYEKRHAAIWPELKALLSQNGVSDYSIYWDKETNILFAFQKTEG GAGSQDLGNTEIVQKWWDYMADIMEVNPDNSPVSIPLPEVFHMD >gi|222159253|gb|ACAB01000106.1| GENE 47 50670 - 52079 1505 469 aa, chain - ## HITS:1 COG:STM1911 KEGG:ns NR:ns ## COG: STM1911 COG4225 # Protein_GI_number: 16765253 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Salmonella typhimurium LT2 # 216 412 161 357 379 102 32.0 2e-21 MNIRIIAITLLLIALPVSAQKKKTVVNDSNTPLHLLQPAYQGTYGDLTPEQVKKEVDRVF AYIDKETPARVVDKNTGKVITDYTTMGEEAQLERGAFRLASYEWGVTYSALIAASEATGD IRYMDYVQNRFRFLAEVAPHFKRVYKEKGTTDPQLLQILTPHALDDAGAVCAAMVKVRLK DPSLPVDELICNYFDFIINKEYRLADGTFARNRPQHNTLWLDDMFMGIPAVAQMSYYDKE QKDKYLAEAVRQFLQFADRMFIPEKGLYRHGWVESSSDHPAFCWARANGWAMLTACELLD VLPEDYPQRAKVMDYFRAHVRGVTALQSGEGLWHQLLDRNDSYLETSATAIYVYCLAHAI NKGWIDAIAYGPVAHLGWHAVAGKINAEGQVEGTCVGTGMAFDPAFYYYRPVNVYAAHGY GPVLWAGAEMIRLLKNQYPQMNDSAVQYYQVKQKTTAPIFAVDTEEKKD >gi|222159253|gb|ACAB01000106.1| GENE 48 52083 - 53924 1617 613 aa, chain - ## HITS:1 COG:no KEGG:BT_4175 NR:ns ## KEGG: BT_4175 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 613 1 613 613 1149 85.0 0 MKCISAFLSLCLIAAFVVAQPNYDFSKLKREHLGRGVIAIRENPSTVAVSWRYLSSDPMD ESFDVYRNGEKVNKYPIRNATFFQDIYKGTESVLYTVKAIQSKTESNYQLPSDAPAGYLN IPLNRPENGTTPAGQSYFYAPNDASIGDVDGDGEYEIILKWDPSNAHDNSHDGYTGEVYF DCYKLNGQHLWRINLGRNIRAGAHYTQFMVFDFDGDGKAEVVMKTADGTVDGKGKVIGDA QADYRNEQGRILTGPEYLTVFNGLTGEAMQTIDYVPGRGNLMDWGDNRGNRSDRFLACVA YLDGIHPSVVMCRGYYTRTVLAAYDWNGKELKERWIFDSNHPGCEDYAGQGNHNLRVGDV DGDGCDEIIYGSCAIDHNGKGLYTTKMGHGDAIHLTHFDPSRKGLQVWDCHENKRDGSTY RDAATGEILFQIKDSTDVGRCMAADIDPTHPGVEMWSLASGGIRNIKGEVVKARVRGLSC NMAVWWDGDLLRELLDRNIVSKYNWKKGVCERIAIFEGALSNNGTKATPCLQGDIVGDWR EEVLLRTADNTALRLYVSTIPTDYRFHTFLEDPVYRISIATQNVAYNQPTQPGFYFGPDL QGTLFRGCKIPKK >gi|222159253|gb|ACAB01000106.1| GENE 49 54169 - 55308 992 379 aa, chain + ## HITS:1 COG:YPO0840 
KEGG:ns NR:ns ## COG: YPO0840 COG4225 # Protein_GI_number: 16121148 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Yersinia pestis # 66 378 47 351 352 182 34.0 1e-45 MKKSLLSFFTVTLLCLMTGKPVMAQELPAQKETLETIVKVNDYFMKKYADYTLPSFYGRV RPSNIWTRGVYYEGLMALYGIYPREDYYKYAYDWADFHKWGMRNGNTTRNADDHCCGQVY IDLYNMCPSDPNMIRNIKASIDMVVNTPQVNDWWWIDAIQMAMPIYAKFGKMTGEQKYYD KMWDMYSYTRNVHGEAGMYNPKDCLWWRDQDFDPPYKEPNGEDCYWSRGNGWVYAALVRV LDEIPANETHRQDYINDFLAMSKALKKCQREDGFWNVSLHDPTNFGGKETSGTALFVYGM AWGVRNGLLDRKEYLPVLLKAWNAMVKDAVHPNGFLGYVQGTGKEPKDGQPVTYKSVPDF EDYGVGCFLLAGTEVYKLK >gi|222159253|gb|ACAB01000106.1| GENE 50 55438 - 57072 1591 544 aa, chain + ## HITS:1 COG:BS_yesY KEGG:ns NR:ns ## COG: BS_yesY COG2755 # Protein_GI_number: 16077774 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Bacillus subtilis # 317 539 5 209 217 98 28.0 3e-20 MKNKLFPYICWLMAITFSLQLQAQNKVSAPMADVNQVIDNTLDSLNKARTSRPEAGSSRK GDNPVLFLVGNSTMRTGTLGNGNNGQWGWGYYAGDYFDSNRITVENHALGGTSSRTFYNR LWPDVIKGVRPGDWVIIELGHNDNGPYDSGRARASIPGIGKDTLNVTIKETGVKETVYTY GEYMRRFIQDVKAKGAHPILFSLTPRNAWEDKDSTIITRVNKTFGLWAKQVAEEQHVPFI DLNDISARKFEKFGKNKVKYMFYIDRIHTSAFGAKVNAESAADGIRAYEGLELANYLKPV EKDTVTDSSRKEGRPVLFTIGDSTVRNEDKDKNGMWGWGSVIADEFNLNKISVENRAMAG RSARTFLDEGRWDKVYNALQPGDFVLIQFGHNDAGDINKGKARAELRGSGDESKVFLMEK TGKYQVVYTFGWYLRKFIMDVQEKGAIPIVLSHTPRNKWKDGKIERNTESFGKWTREAAE ATGAYFIDLNKISADKLEKKGVKKAATYYNHDHTHSSLKGAHMNAKSIVEGLKKSDCPLK NYLK >gi|222159253|gb|ACAB01000106.1| GENE 51 57391 - 60789 2372 1132 aa, chain - ## HITS:1 COG:no KEGG:BVU_0159 NR:ns ## KEGG: BVU_0159 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 16 1131 23 1103 1106 1899 80.0 0 MKKLFILLFLFITLGTNAQNNKVSGLNARQFHKYWKVESESPDYKVTFQGDTAEILSPKG LTLWRKEKMSGRVTIEYDACVVVEKEGDRLSDLNCFWMASDPKHPDNIWKREKWRSGIFL 
NCYSLQLYYMGYGGNYNSTTRFRRYDGNEAGITDPKVRPAILKEYTDTEHLLKANHWYHI KITNENNRVSYYIDGKRLVDFRDAEPLTEGWFGFRTTLSRTRITNFRYECLPSQTSTVPL HWIGNTPEQDKAVSFGVPFDEGYLFPETSLRLKTDRNQEIPIDTWPLAYWPDGSVKWSGV AGVIPAGTERLTLEKAPRKAKTINKQPDASIAITETPENIQIETGVISVFIPQRGDFLID SLLYKGTKVGEKARLICNTQSEPIQENTSQISFTRYIGEIKSVTIERLGSVRALVKLEGI HRNRNKEIDTNHSEEEGNYANNSDMNKWNNREWLPFVVRLYFYGCSEQIKMVHSFVYDGD QKKDFIRSLGIRFDVPMREALYNRHIAFSCADGGVWSEPVQPLVGRRILTLNKTDNKKNS NEKKDAQQKSTDEPSLQQQQMEGKRIPPYESFDEKNRSLLDNWASWNDYRLSQLTADAFS IRKRANNDNPWIGTFSGTRSDGYTFVGDITGGLGLCMHDFWQSYPSSIEISDARTPVATL TAWLWSPESEPMDLRHYDRIAHDLNASYEDVQEGMNTPYGIARTTTFTLIPQSGYTGKKA FADYAKQFSSPSLLLPTPNYLHARQAFGIWSLPDRTTPFRTRVEDRLDAYIDFYQKAIEQ NKWYGFWNYGDVMHAYDPVRHTWRYDVGGFAWDNTELASNMWLWYNFLRTGRIDIWRMAE AMTRHTGEVDVYHIGPNAGLGSRHNVSHWGCGAKEARISQAAWNRFYYYLTTDERCGDLM TEVKDADHKLYELDPMRLAQPRSEYPCTAPARLRIGPDWLAYAGNWMTEWERTGNTTYRD KIIAGMKSIAALPNRLFTGPKALGFDPSTGIITTECDPKLETTNHLMTIMGGFEIANEMM RMIDIPEWKDAWLDHAARYKKKAWELSHSRFRVSRLMAYAAYHLRNTQMAEEAWKDLFTR LEHTPAPPFRITTILPPEVPSLLDECTSISTNDAALWSLDAIYMQEVIPIDN >gi|222159253|gb|ACAB01000106.1| GENE 52 60960 - 63095 1366 711 aa, chain - ## HITS:1 COG:no KEGG:Ndas_0923 NR:ns ## KEGG: Ndas_0923 # Name: not_defined # Def: cellulose-binding family II # Organism: N.dassonvillei # Pathway: not_defined # 65 697 177 732 746 340 34.0 9e-92 MKIHYIILGLFYLLAIVACSDNDPVVEVNNGNQTEELPPLPTEVITGSRAMWVSYDPISA SDPNNASGIASALISWRLLKTDPSNVAFDIYKSVDGETEVKLNEEPISNTTSWVDADIDV SKTNVYRVTLANQAETLCDYTFTSEMAEKFYHEIRLDMNVPDASITYSPDDIQLGDLDGD GELEIVVKREPYDGANMGVWFNGTTLLEAYKMDGTFLWRIDLGINIRSGSHYTSYILYDF DGDGLCEIAFRTSEGTKFADGKIITDANGKVNDYRNRQTDGKGWYSGAAIARDQNDPSTA TTCGLIMEGPEYISICRGYDGREITRIDNIPRGGEGSKVSRAKYWSEYWGDDFGNRMDRF FIGVAYLDGIPDETTGARVANPSLIISRGIYKNWQVWALDLKGNELVPRWKFDTADHSSK WLGMCSHCFRVADLDGDGRDEILYGSAAIDDNGSELWCSGNGHGDILHVGKFIKDRSGLQ IVASFEESKDYEGQGNGYACQVMNARDGSMITGHGRNLPADASDVGRCIVADVDPDSPDF EYWSSTQEGMFSCNGTGLVSTTYPTGIGSGVMYNVAIYWSGQSTREMLDRGCIVSYKANP DVNKSNKNRLISFDLYGSNQGNHASKYNPCYYGDFLGDYREEVILGSSDYKSIYIFSTNH 
PTTHRLPHLMTDHNYDMSQAMQNMGYNQGTNLGYYVGAETLKSSETEEPKE >gi|222159253|gb|ACAB01000106.1| GENE 53 63173 - 64894 1666 573 aa, chain - ## HITS:1 COG:no KEGG:PRU_2229 NR:ns ## KEGG: PRU_2229 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 16 573 17 559 559 408 43.0 1e-112 MKLKNIIYGMMCVVTLGSCSDKMEYHEYNNYDEDFVKLNFGNVGGLITNIYLSMDVDFGN YSGAILGSATDEAEYAYSGNQIEDFYNGSWSPSNAKSSMWTSCYEGIANCNLYLEKFTGL TFPELALNSDYAQQMFRYTNYQYEVRFLRAYFYFNLVRQYGDVPFSDHILTAEESNTLSR RPAQEIFDYIIAECDDIKDKIIVDYSKLGDMALPSSPAETGRANRKTVLALKARAALYAA SPLFNPTDNKDLWYRAAKANKEILDDSNGFAEKAKLLVEDYSSLWSKDNYNDATSELIFL RRATTTTNSFEGYNFPVGLENCKGGNCPTQTLVDAYEMKDTGLRPDELDNYDPTNPYYTN RDPRFYLTIAKNGDEKWPNWNTVPLQTYQGGLNAEPLSGGTPTGYYLKKYCQTAVDLRAG TASKTYHSWITFRFGEFYLNYAEAVYKYLGSPYATDSEFTTSAVDAIKVVRTRAGMPGFP QGMTNDAFWKKYQNERMVELAFEGHRFWDVRRWKEGDKHFKNIVEMKITKNGDDTYTYER KVKERSWDDKMYFFPIPQSEKSKNPNLEQNPGW >gi|222159253|gb|ACAB01000106.1| GENE 54 64916 - 66961 1575 681 aa, chain - ## HITS:1 COG:no KEGG:PRU_2228 NR:ns ## KEGG: PRU_2228 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 8 681 284 944 944 367 34.0 1e-99 MANLLTDKGFIKTPNANDGYSTQNMYSRANLRTNLDIDLTGTTKLKLNLLGTLSESRIPG ASVNLWDMIYSIPSAAFPVRTEDGSWGGSTTWAGTSNPVAQSQGAAYSKGHSRSLFTDLT LSQDLSGLLKGLGANFRLAYDNYSNILEDHSKTYTYAGYATSWTTNGPFYTAISGGESSE MGSGAGIDNWARQFNFAGSVDYNRSFQKWDVYSQLKWDYEYRDSYGLNTTIYRQNVSWYN HVGFSNRYYLDLALVGSASSLLAPGHKWAFSPTISAAWVLSEEKWLKDVSWVNFLKLRAS FGVINVDYLPKDGSTTVYDYWDQIYTTTGTQYKFNSSYDSEFGSTIIGRLATANSTHEKA YKYNVGVDAMLFNGLNVTAEGYYQRRKDIWVSSEGKYTDVLGVDAPFENAGIVDSYGMEL GLNYTKRLGDVVFNLGGNFAWNKNEIKEQLEEPRLYKNLVQTGNRLGQVYGMVAEGFFKD KEDIANSLTQNFTTVVPGDIKYKDVNGDGIIDANDKTAIGYSTTAPEIYYSFHLGAEWKG IGVDVMFQGTGNYSALLNTKSMFWPLINNTTLSTHYYENRWTPENQNAKYPRLSSQSNAN NYQANTVWLADRSFLKLRNAEVYYRFPKELMKKTKILGSAKLYVRGTDLFCSDHIDVADP ECYGVATPLNKSVVVGVTIGF Prediction of potential genes in microbial genomes Time: Wed May 18 03:38:24 2011 Seq name: gi|222159252|gb|ACAB01000107.1| Bacteroides sp. 
D1 cont1.107, whole genome shotgun sequence Length of sequence - 1977 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 973 800 ## BT_2894 hypothetical protein 2 1 Op 2 . - CDS 1021 - 1893 665 ## PRU_2227 putative lipoprotein Predicted protein(s) >gi|222159252|gb|ACAB01000107.1| GENE 1 1 - 973 800 324 aa, chain - ## HITS:1 COG:no KEGG:BT_2894 NR:ns ## KEGG: BT_2894 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 312 9 308 1018 172 36.0 1e-41 MRKNKILILALLVCLSIPVRAQNTDNNIIGIVVDKSGNPVYGASVNVEGGAVETRVETDK DGKFEIAVEKGQRLSVFSADKGSTTVVVKADGPMTIVMGYAAQTIDVGANRTFNRHESTA AVSTTYNEEFNKRSSRNISNSLYGYGLGLTTLQNAASTGLMADPTFYVRGLQSLSSSTPL VLVDGLERDMSLVSPEEVESVSILKDAAAVALYGYKGVNGAILITTKRGKYKTKEITFTY DHIINTQSRRPEFVDAATYASAVNEARGYEGLGARYTPEEVDAFRNGVGSGGASRMYPYL YPNVNWIDETFKDRGVSNKYTIEF >gi|222159252|gb|ACAB01000107.1| GENE 2 1021 - 1893 665 290 aa, chain - ## HITS:1 COG:no KEGG:PRU_2227 NR:ns ## KEGG: PRU_2227 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 39 290 426 676 676 153 40.0 6e-36 MFPWADGTPFDWKKAETEGKLDEMFLTGTFKNGEQLLSEIVLTRDPRLYEHCMVNGLPKM LDWSAGTMSGLPYELWVGGYDEGQNAALESGNYATGYKNMKYYLGTEYQRQYTHWVYLRL SDIYLTYAEALLQAKNDHRGAINQMNIVRARVGLKKDLAECVTDKNLLSDKAALLEELLC ERVRELGLEDSRYFDLVRYKRADRFEKRLHGLLAYRLDDTGNRMTGNTKWNEGDKNKGAL QPTRFEYERFELSKPVRRWWTYGFDSKWYLSPFPQTEINKGYGLIQNPGW Prediction of potential genes in microbial genomes Time: Wed May 18 03:38:37 2011 Seq name: gi|222159251|gb|ACAB01000108.1| Bacteroides sp. D1 cont1.108, whole genome shotgun sequence Length of sequence - 18805 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 5, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 1056 708 ## PRU_2227 putative lipoprotein 2 1 Op 2 . 
- CDS 1078 - 4257 2494 ## BT_2818 hypothetical protein 3 1 Op 3 . - CDS 4318 - 5943 1168 ## gi|237714210|ref|ZP_04544691.1| conserved hypothetical protein 4 1 Op 4 . - CDS 5998 - 7035 842 ## gi|237714211|ref|ZP_04544692.1| conserved hypothetical protein - Prom 7225 - 7284 4.9 + Prom 7636 - 7695 5.1 5 2 Tu 1 . + CDS 7753 - 8589 526 ## BVU_0168 tyrosine type site-specific recombinase + Prom 8953 - 9012 6.5 6 3 Op 1 . + CDS 9100 - 12231 2531 ## PRU_1591 putative receptor antigen RagA 7 3 Op 2 . + CDS 12256 - 14337 1729 ## BVU_0172 hypothetical protein - Term 14209 - 14245 -0.6 8 4 Tu 1 . - CDS 14430 - 15119 263 ## BVU_1770 hypothetical protein - Prom 15142 - 15201 3.6 9 5 Op 1 . - CDS 15208 - 16590 1122 ## COG5434 Endopolygalacturonase 10 5 Op 2 . - CDS 16653 - 18740 1924 ## COG1874 Beta-galactosidase Predicted protein(s) >gi|222159251|gb|ACAB01000108.1| GENE 1 3 - 1056 708 351 aa, chain - ## HITS:1 COG:no KEGG:PRU_2227 NR:ns ## KEGG: PRU_2227 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 1 336 1 345 676 207 37.0 5e-52 MKLQNKWFIGAMLGAAVCLVSTSCVDEIKFGNSFLDKAPGGNATIDTVFNSAEYTRQFLN TCYSRQYYGLPYNTDSNGNIPDSSSPYLGKKDALTDCWSLYFSDATVYQQYYLGSLNANY GTRGNIFPYTREMVWEVVRWCWLLMENIDRVPGMDNAEKTQLVAEAKCLIAARYFDMFRH YGGLPLLYASFTGTESGYEIPRATVEETVNYMIKMLDDAINSGGLVWAYADADLASKAGH WTKAGAMALKCKIWQFAASPLFNDVNGYAGGNSEAEQQRLVWYGGYHAELWDNCLKACEA FFKELESQGGYSLNKASEETPEQYRQAYRMGYIRQNSKEVLHSVRTNMSDA >gi|222159251|gb|ACAB01000108.1| GENE 2 1078 - 4257 2494 1059 aa, chain - ## HITS:1 COG:no KEGG:BT_2818 NR:ns ## KEGG: BT_2818 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 44 1059 31 1057 1057 709 42.0 0 MNKKNFIKGTTLPVALFFTIATFAPAWGLQTASAAIEVVQQQSTIKGTVVDSQGEPIIGA SVLAEGTSNGTITDIDGVFRINVRPGTKLKVSFIGYTDKTVVAKNDMKIILVEDVTALEE VEIVAYGTQKKVTMTGAIASVKGEELTRVSVGSVSNVLGGQMTGLTTVQYSGEPGADAAE IFIRGKATWENSSPLIQVDGVDRESMSDIDPNEIESISILKDASATAVFGIRGANGVILI 
TTKRGKEGKAKISFTTSASVLMPTKMVEQASSYDYARFHNQMMSGDGKSALFSDGVIQKF ADGSDPIRFPNIQWADYIMKSSTLQSQHNLNISGGTDKVRYFISAGAYTQGGLFSEFDLP YNLSYQYRRFNYRTNLDMDVTKTTTLSFNISGNVNNSDQPRTSQGSSGLIKNMYYATPFS SPGIIDGKLVNSSDETYSDGLKLPFTGGTGMGYYGNGFSQTSNNSLNVDVILDQKMDFLT KGLSFKVKGSYNSSFTVNKVGSGGSIATYTPVLMDDGSIAYRKFGENTDVKYSYTTGKAR NWYMEASFNYNRTFGDHTVTALFLYNQQKEYYPKVYSDIPHGYVGLVGRVTYDWKSRYMA EFNVGYNGSENFASDLRFSYFPAGSIGWVMSEEKFFRPLKSVVSFLKLRASVGLVGNDNI GGSRFMYMADPYNVGLGDLANRVTASGGATNAWGYGFGTDNGTVSLGAREVAKNNAAVTW EKALKRNYGVDINFLDDRLKTTFEYYKERRNDILLRDGTAPGMLGFITPYSNLGSVDSWG WELSLKWNDKIGDNFRYWAGINLSYNQNEILEKKEAPQNNLYQYQKGHRIGSRSQYVFWR FYDEDTPALYEQTFNRPFPTHESILQNGDAVYVDLNGDRKIDANDASYDYGFTDDPEYML GINLGFSWKNLEISTQWTGAWNVSRMISDVFRQPFYSSSNSEQGGLLAYHLDHTWTPENP SQSSEYPRATIDNAKNNYATSTLYEKDAKYLRLKTLQVAYNFHFPLMKKLGLNTCQLAFA GYNLWTITPYLWGDPEARATNAPSYPLSKTYTLSLKLGF >gi|222159251|gb|ACAB01000108.1| GENE 3 4318 - 5943 1168 541 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237714210|ref|ZP_04544691.1| ## NR: gi|237714210|ref|ZP_04544691.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 541 1 541 541 1088 100.0 0 MKIYNYISMLFLAGAAVVASCSQNEELTGETPDSLQGFQISVSDEGFMDESGKTRATENG YTTRFSNGDAIGIFAVRGETVVEDIKNRKFTLTDGYWELTDGGDPIEYKGSQFQRMTFYA YYPYNANVTFDPTKVDPFETYVNNWKIGSEQNEGNYTQYDLMTSTGSVQGDRLKGQIAFT MQHRMALAVVKMPNLTYSFTNGGIDDYLLPLAAGSFTVNNTQATPYYQESTDTYRFLVNP NKEFSIKGTYTGVSEMEYEAKGTLEGGTAKMYTIEDKSKINHTLQVGDYFCADGKIVSVD AEAVPENVIGIVCYVGNSQPSVTHPELYSAEIDALRRDFPACTHGIVLSTKNSLVKDDEQ NDIALQMFHSSKAGYYSDWFNSDEDWKDKFVGCNTERDVNKNDASKVFPALMGYNNTKIL TMCYEGMGSTSTCDYVYDYIMAYRKAEAVPSGVTPWYLPSVMCWDQVAKNMSGINSSLQK VNGDKMVTSDLPVNNKAAGHYWSSTQRSAVNQWTHSMGNGSFHVTCERASCAGYFRMMLA F >gi|222159251|gb|ACAB01000108.1| GENE 4 5998 - 7035 842 345 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237714211|ref|ZP_04544692.1| ## NR: gi|237714211|ref|ZP_04544692.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 345 1 345 345 654 100.0 0 MFNFKKHGMKKNYVTTAVALLLLAGCSNDIKETDIQFTGDTGAKVSFSAVINNHEASNLV TRATETNWEAEDSVGITCGTEQVNIHYKYKGNGLFAAKGGNAEEIWVLGSGEYDVVAYAP FSGTSGEEQPVLEVTTASENQATAEERAKIDFLFAKGKATSQTPNVTLAFNHVMSRIRLE FQAGEGVDLADITCYLIGLKNNGTFNPKTGETTVSEDPVTDKDDIYWDKIGEQDNYTVQA ILLPQKVQGKVTIQARMNGYLYEAEFANLTELKSGFSYNYIIKTNKYEDNNYVLTITEQT QIIGWEDENHDPITSDPSIAETGTEITNPSWNITEETVTPQPVVK >gi|222159251|gb|ACAB01000108.1| GENE 5 7753 - 8589 526 278 aa, chain + ## HITS:1 COG:no KEGG:BVU_0168 NR:ns ## KEGG: BVU_0168 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.vulgatus # Pathway: not_defined # 1 278 115 392 393 406 72.0 1e-112 MKGEEREKSYKLYKIASERLVKYMKGDFPLIQLTPIHIQGFAKVLHEENLADTTIRIYLT LIKVIVNYAGKMNYVSYSIHPFVLFKMPTSNVRELDLSIDELKRIRDVRLLKSSLSMVRD IFMLSYYLGGINLRDLLAYDFKNKNDMRYVRHKTRNSKKGENEIVFTLQPEAKVLIDKYM SRNGHLQFGKYSSYKQIYSLVFRHINKVTELSGVKKKVTYYSARKTFAQHGYDLGIQIEK IEYCIGHSMKNNRPIFNYIKIMQEHADKVFRAVLDQLL >gi|222159251|gb|ACAB01000108.1| GENE 6 9100 - 12231 2531 1043 aa, chain + ## HITS:1 COG:no KEGG:PRU_1591 NR:ns ## KEGG: PRU_1591 # Name: not_defined # Def: putative receptor antigen RagA # Organism: P.ruminicola # Pathway: not_defined # 20 1015 1 1009 1027 908 48.0 0 MKSNKKGNHLYAYKDTIRRLFLLTLFSFLIVESYAQNKTISGTVTDFTGEPVIGASVLVN GTTNGTITDLNGKFSLSNVPTKGTITITYIGYKKQEVSVAGNTNFKITLQEDTETLDEVV VVGYGVQKKSDVTGAMARVGEKELKAMPVRNALEGMQGKTAGVDITSSQRPGEVGNINIR GQRSINAEQGPLYVVDGMVIQNGGIENINPSDIEAIDILKDASATAIYGSRGANGVILVT TKKGKEGKVTLNYSGTVTFETLHDVTEMMSATEWLDYARLAKYNAGSYASATPTFEADKA AFGSVSASWKNIEQAWSNGNYDPSKVGSYDWASHGKQTGITHEHTLSASGGSDKFQGYAS FGYLDQKGTQPGQAYERYTLKTSFDVTPVDWFKMGSSINASYSTQDYGYSFSKSVTGSGD FYSALRSMIPWTVPYDENGEYVRYPSGDVNIINPIRELDYNTNQRRTFRASASMYSQIDF GKIWKPLEGLSYRLQFGPEFQFYTLGVANAADGINGDGNNGAQYKNEQKRAWTLDNLIYY NKTLGQHSLGMTLMQSASAYHYEMGDMRATNVASSDELWYNMGSAGTLNSFGTGLTETQM ASYMVRLNYGYKDKYLLTASMRWDGASQLAEGHKWASFPSAAIAWRMDQEDFMKDISWLN QLKLRVGMGVTGNAAIKAYATKGAITGLYYNWGQSDSSLGYVPSDPSQKEPAKMANPTLG 
WERTTQYNVGIDYGFFNNRLTGSIDAYKTKTNDLLLEMSIPSLTGYVSTYANVGKTSGYG IDLQVNAIPVQTKDFTWSTTLTWSMDRNRIDELSNGRTEDVNNKWFVGEEIGVYYDWVYD GIWKTEEAEEAAKYGRKPGQIKVKDLNNDGAIDANDDKQIVGHTRPRWTGGWSNTFSYKN FELSFFILSRWGFTVPQGAVTLDGRYMQRKVDYWVAGTNENAKYYSPGSNGEGADAFNSA MNYQDGSYIKVRNISLGYNFTPQQLKKLGINNLKLYVQAMNPFNIYKACDFLDTDLVNYD NNTKTFGSPTTLKSFVIGVNIGF >gi|222159251|gb|ACAB01000108.1| GENE 7 12256 - 14337 1729 693 aa, chain + ## HITS:1 COG:no KEGG:BVU_0172 NR:ns ## KEGG: BVU_0172 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 3 693 4 712 712 344 35.0 7e-93 MKLNRILFAALIASVTGSSITSCSESFLDENLTTQYSTDRFKTQEGLDELVTGAYQKLKF KFNYIWGIQCYNMGVDEFTDANNVIPAWNHYSQDLNSSENGANQPIWDNYYGLVEPANIL IQNIPQYYNQSSPTYNTRLGEAHFLRAYAYFELVKQFGGVPLKLAPSTSAETYFTRNSAE EIYTQVISDFGEAYRLLPNKGESIGRINKYAAAHFLAKAHLFRASELYSDWNSNYVASDL DAVIQYGSEVVDAHPLCSDYVELWDYEQPNGANEKVSEVILAAQFSNDESTWGRFGNQMH LYYPAVYQGNDIGGCKRDISGGREFSYVSATEYTMQVFDRVNDSRFWKSFITCYGANETK SAPTWTAEDMPYAPAGVKEGDKRFSGGELGMKYIVNDPGDNRYEKYPNAPAYTVLKDGKM CNTYTYVRYFKGQEHSWNVNEKTGNYYDIIPHKRSVALSKFRDGYRVSIASQFGTRDAII ARSADDVLMVAEAYIRKGEANYDKAIEWMNKLRERAGYKTGEDRSKNVDGGQAYKNNPYC SGKGGGHSSEGAIYWEENTYYESNNIEQETTASTKTTMKLNSVSDVYNSTVDVPIYNELG CTSNADKMMCFLLNERTRELCGELQRWEDLARTKTLDARWHKFNDGASRGLGEFKSEKHY YRPIPQAFLDGITNSNGSALSNEEKKAMQNPGY >gi|222159251|gb|ACAB01000108.1| GENE 8 14430 - 15119 263 229 aa, chain - ## HITS:1 COG:no KEGG:BVU_1770 NR:ns ## KEGG: BVU_1770 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 224 1 226 231 345 71.0 7e-94 MKKSCFLLLLTILCSSISFAQSLKSISILGDSYSTFEGYLQPDTNSIWYYVSPRQQTDVT SVKQTWWHKFIKENNYRLCVNNSFSGATICNTGYRNEDYSDRSFITRMDQLGCPDVIFVF GATNDCWAGSPLGEYQYVGWAKDDLYKFRPAMAYMLAHMIDRYPNVEIYFLLNSGLKEEF NESVRTICRHYNIDCIELHDIDKKSGHPSIKGMEQISEQIKEFMVKKTK >gi|222159251|gb|ACAB01000108.1| GENE 9 15208 - 16590 1122 460 aa, chain - ## HITS:1 COG:TM0437 KEGG:ns NR:ns ## COG: TM0437 COG5434 # Protein_GI_number: 15643203 # Func_class: M Cell wall/membrane/envelope biogenesis # 
Function: Endopolygalacturonase # Organism: Thermotoga maritima # 48 374 24 359 448 274 41.0 2e-73 MKTYLLKPMVVVCLCLLTASFTFAADNYKMVKVKAPFPMQPIKVFIYPDKDFLITDYGAK NGGKVNNTKAIAAAIEACHKSGGGRVVVPAGIWLTGPIHFKSNVNLYLEENAILNFTDNP SDYLPAVMTSWEGLECYNYSPLLYAFECENVAITGKGTLQPKMDTWKVWFKRPQPHLEAL KELYTKASTDVPVIERQMAVGENHLRPHLIHFNRCKNVLLDGFKIRESPFWTIHLYMCDG GLVRNLDVKAHGHNNDGIDFEMSRNFLVEDCSFDQGDDAVVIKAGRNQDAWRLNTPCENI VIRNCQILKGHTLLGIGSEISGGIRNIYMHDCTAPNSVMRLFFVKTNHRRGGFIENVYMK NVQAGMAQRVLEIDTEVLYQWKDLVPTYEERITRIDGIYMDKVTCESADAIYELKGDAKL PVKNVTIKNVKVGEVKKFVKKVNNVENVVEKNVTYEREVK >gi|222159251|gb|ACAB01000108.1| GENE 10 16653 - 18740 1924 695 aa, chain - ## HITS:1 COG:TM1195 KEGG:ns NR:ns ## COG: TM1195 COG1874 # Protein_GI_number: 15643951 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Thermotoga maritima # 37 693 1 644 649 349 31.0 1e-95 MKYMIKSKILAILCVSLLTLSTTAHPSSWFNDKDLTLTGVYYYPEHWDENQWERDFKKMH ELGFEFTHFAEFAWAQLEPEEGRYDFAWLDRAVALAAKYDLKVIMCTSTATPPVWMSRKY PEILLKNEDGTVLDHGARQHASFASPLYRELSYKMIEKLAQHYGNDPRIIGWQLDNEPAV QFDYNPKAELAFRDFLRAKYNNDIQLLNNAWGTAFWSESYSSFDEITLPKRVQMFMNHHQ ILDYRRFAAMQTNDFLNEQCLLIKKYAKNQWVTTNYIPNYDEGHIGGSPALDFQSYTRYM VYGDNEGIGRRGYRVGNPLRIAWANDFFRPIQGTYGVMELQPGQVNWGSINPQPLPGAVR LWMWSVFAGGSDFICTYRYRQPLYGTEQYHYGIVGTDGVTVTPGGKEYETFIKEIRELRK HYAPREAKPADYLARRTAILFNPENSWSIERQKQNRTWDTFVHIEKYYRTLKSFGAPVDF ISEAKNLSDYPVVIAPAYQLADKELVDKWISYVKNGGNLILTCRTAQKDRYGRLPEAPFG SMITPLTGNEMNFYDLLLPEDPGTVVMNGKEYAWNTWGEILNPPADAQIWATYKNEFYEG SPAVTFRKLGKGTITYVGVDSHNGALEKDILKKLYAQLNIPVMDLPYGVTMEYRNGLGIV LNYSDQPYAFNLPKGAKALIGTTNIPTAGVLVFSM Prediction of potential genes in microbial genomes Time: Wed May 18 03:39:55 2011 Seq name: gi|222159250|gb|ACAB01000109.1| Bacteroides sp. D1 cont1.109, whole genome shotgun sequence Length of sequence - 20988 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 6, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 2352 1439 ## BVU_0180 glycoside hydrolase family protein - Prom 2544 - 2603 6.0 + Prom 2512 - 2571 5.2 2 2 Tu 1 . + CDS 2596 - 5472 2348 ## COG3250 Beta-galactosidase/beta-glucuronidase + Prom 5519 - 5578 3.5 3 3 Op 1 . + CDS 5603 - 6841 951 ## COG2755 Lysophospholipase L1 and related esterases 4 3 Op 2 . + CDS 6890 - 8425 996 ## COG5434 Endopolygalacturonase 5 3 Op 3 . + CDS 8376 - 8978 369 ## BT_4147 hypothetical protein 6 3 Op 4 . + CDS 8975 - 10378 1103 ## COG5434 Endopolygalacturonase 7 3 Op 5 . + CDS 10378 - 13143 2221 ## BT_4145 hypothetical protein + Term 13229 - 13271 4.3 8 4 Tu 1 . - CDS 13460 - 14632 1115 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities - Prom 14657 - 14716 4.1 + Prom 14590 - 14649 3.9 9 5 Tu 1 . + CDS 14701 - 15486 831 ## COG0561 Predicted hydrolases of the HAD superfamily + Prom 15507 - 15566 1.8 10 6 Op 1 . + CDS 15592 - 17592 2001 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member 11 6 Op 2 . + CDS 17676 - 20201 2627 ## BT_4129 outer membrane assembly protein 12 6 Op 3 . 
+ CDS 20281 - 20895 515 ## COG0009 Putative translation factor (SUA5) + Term 20927 - 20972 6.2 Predicted protein(s) >gi|222159250|gb|ACAB01000109.1| GENE 1 3 - 2352 1439 783 aa, chain - ## HITS:1 COG:no KEGG:BVU_0180 NR:ns ## KEGG: BVU_0180 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: B.vulgatus # Pathway: not_defined # 1 783 1 784 784 1159 70.0 0 MKKVVVWMALSLCSVMTIFAGKTAYLFSYFINDSKDGLHLAYSYDGMNWMPLNGGRSYLT PSVGKDKLMRDPSICQAPDGTFHMVWTSSWTDRIIGHASSRDLIHWSEQQAIPVMMHEPT AYNCWAPELFYDEPSQTYYIFWATTIPGRHKEIPTSESEKGLNHRIYYVTTKDFYTFSKT KMFFNPDFSVIDAAIVKDPKLGDLIMIVKNENSNPPEKNLRVTRTKKIEKGFPTKVSAPI TGKYWAEGPAPLFVGDTLYVYFDKYRDHRYGAVRSLDHGETWEDVSDQVSFPRGIRHGTA FAVDASVVEALIANRTYNPLIPDNVADPSVSKFGDTYYLYGTTDLDYGLERAGTPVVWKS KDFVNWSFEGSHISGFDWSKGYEYTNDKGEKKKGYFRYWAPGRVIEHDGKFYLYVTFVKP GDKMGTYVLVADRPEGPFRFTEGKGLFVSGEEVDSPAIINDIDGEPFIDDDGTGYIFWRR RNAARLSADRLHLDGEPVTLKTVRQGYSEGPVMFKRKGIYYYIYTLSGHQNYANAYMMSR ESPLTGFVKPEGNDIFLFSSPENQVWGPGHGNVFYDEGTDEYIFLYLEYGDGGTTRQIYA NWMEFNDDGTIKTLVPDMRGVGCLATPQEKRTNLSLQSHFYASSEKEPRTSVVSIETQPN QPLPEKASVKNYTRTHTYQAAHVADESNGTRWMAADTDSSPFITVDLKGIRNVNECQFYF TRPTEGHTWRLEKSMDGKNWRTCAEQAKVEVRSPHLAKEIGEARYLRLHIRRGDAGLWEW KIY >gi|222159250|gb|ACAB01000109.1| GENE 2 2596 - 5472 2348 958 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 26 454 44 444 1087 88 23.0 8e-17 MKQLLVSIAIVFASILTVTGQNHSFSLSGKWNFQIDREDIGTKEQWFKKSLDDQINLPGS MPEKLKGDEVTARTRWTGSLYDSSYYFNPYMEKYRIEGQVKLPFFLTPDKHYVGVAWYQK KVTIPSNWKGERITLFLERPHIETTVWVNQQKLGMQNSLCVPHVYDLTSAITPGKTCLIT IRIDNRIKEINVGPDSHSITDQTQGNWNGIVGRIELQATPKVHFEDIQIYPDLNNQKALV RMNIQSASSTKGDITLSAESFNTDTQHKVAPVQQSFHIRPGDNAIEMELPMGKNFLTWDE FNPALYKLTAKLTNGKQSDIKQVQFGMRDFKIEEKWFYVNGRKTMLRGTVENCDFPLTGY APMDVASWERVFRICRNYGLNHMRFHSFCPPEAAFIAADLVGFYLQPEGPSWPNHGPRLG NGQPIDKYLMDETIALTKEYGNYASYCMLACGNEPSGRWVAWVSKFVDYWKATDPRRVYT 
GASVGNSWQWQPHNEYHVKAGARGLSWAGAQPESKSDYRARIDTVKQPYVSHETGQWCVF PNFNEIRKYTGVNKAKNFEIFRDILNDNHMGGQSYDFMMASGKLQALCYKHEIEKTLRTP DYAGFQLLALNDYSGQGTALVGVLDVFFEEKGYINAEQFRRFCSPTVPLARIPKFVYAND ETFHADIEVSHFGAAPLQGAKTSYTIKNKFGKVYAKGTVGTQTIPIGNLCALGSVDMNLK GIDTPQKLNLEVCIEGSDAVNDWDFWVYPAQVELTQGSVYTTDTLDEKAISVLKEGGNVL ITAAGKIQYGKEVKQYFTPVFWNTSWFKMRPPHTTGIFLNEYHPLFREFPTEFHSNLQWW ELLNKAQVMQFTDFPAEFQPTVQSIDTWFISRKIGMLFEANVLNGKLMMTSMDITSQPEK RIVARQMHKAILDYMNSDAFRPTEKISPELIQALFTKVAGDVKSYTKDSPDELKPKIN >gi|222159250|gb|ACAB01000109.1| GENE 3 5603 - 6841 951 412 aa, chain + ## HITS:1 COG:BS_yesT KEGG:ns NR:ns ## COG: BS_yesT COG2755 # Protein_GI_number: 16077769 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Bacillus subtilis # 162 379 6 223 232 180 43.0 4e-45 MKTTIIGLLLLATITVHAQEEVRTYQLSDAPRYSEETGYGYDLTATPEKGSKSPFFFSVR VPDGNYKVTVRLGSKKQAGVTTVRGESRRLFVENLPTKKGQFVDETFIINKRTPRISEKE YVRIKPREKAKLNWDDKLTLEFNGDAPVCQSIRIEPADPSVITVFLCGNSTVVDQDNEPW ASWGQMIPHFFGTDVCIANYAESGESANTFIGAGRLKKALSQMKKGDYLFMEFGHNDQKQ KGPGKGAYYSFMTSLKTFIDEARARGAYPVLVTPTQRRSFDSTGHIRDTHEDYPEAMRWL AAKENIPLIDLNEMTRTLYEALGTETSKRAFVHYPAGTYPGQTKAFEDNTHFNPYGAYQI AQCVIEGMKKAVPELAKHLKIDPAYNPAHPDDVNTFHWNESPFTEIEKPDGN >gi|222159250|gb|ACAB01000109.1| GENE 4 6890 - 8425 996 511 aa, chain + ## HITS:1 COG:TM0437 KEGG:ns NR:ns ## COG: TM0437 COG5434 # Protein_GI_number: 15643203 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Thermotoga maritima # 68 366 33 351 448 145 30.0 3e-34 MKTKSIFPLFLLAGTLLTAACVTPTAKQAGSNSFEWGQVPQQPDLSWADSVGSRQMPGNS LILSANSFGAVADSTVLSTEAIQKAIDSCSVSGGGTVTLQPGYYQTGALFVKSGVNLQLD KGVTLLASPHIHHYPEFRSRIAGIEMTWPAAVINIVNEKNAAVSGEGTLDCRGKVFWDKY WEMRKEYEKKGLRWIVDYDCKRVRGILVERSSDITLKGFTLMRTGFWGCQVLYSNYCTID GLVINNNLGGHGPSTDGIDIDSSTNILIENCDVDCNDDNICIKSGRDADGLRVNLPTENI VIRNCIARKGAGLITCGSETSGSIRNILGYNLQAVGTSAVLRLKSAMNRGGTIENIYMTD VKAENVRHVLAADLNWNPSYSYSVLPKEYEGKDIPEHWRVMLTPVTPPEKGYPHFRNVYV 
SKIKAENVDEFISASGWNDSLRLENFYLYAIEAQTNKPGKICYTKNFNLSEITLDMKDKN AIELKENEQSNIHFNYAETTPDHRTAGNLAH >gi|222159250|gb|ACAB01000109.1| GENE 5 8376 - 8978 369 200 aa, chain + ## HITS:1 COG:no KEGG:BT_4147 NR:ns ## KEGG: BT_4147 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 199 1 190 196 323 76.0 3e-87 MLKQLLTTVLLGILLIDAQAQSLTPPAGTFRLGISKGNESHWLKPKEKVEGIHFQWKALP DSHGFILEVEVTSTSKADKLFWSFGDCQPDSDINVFSVEGQAFTCYYGESMKLRTLQAVT PSDDIRLSNGHKDETPLMLYESGKRTDRPVLAGRCNLSPTLSQEKDTKPFRLYFCFYEQN EKADYNYYMLPDIFAKIDKK >gi|222159250|gb|ACAB01000109.1| GENE 6 8975 - 10378 1103 467 aa, chain + ## HITS:1 COG:CC0572 KEGG:ns NR:ns ## COG: CC0572 COG5434 # Protein_GI_number: 16124826 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Caulobacter vibrioides # 20 449 34 474 527 524 61.0 1e-148 MKRAYLLFTLLIGCIYIHAAIYNVKDFGAKADGKTIDSPAINRAIEAAAEAGGGTVYLPA GEYACYSIRLKSNIHLYLEQGACILAAFPGKEAGYDSAEPNEHNKYQDFGHSHWKNSLIW GIGLENITISGPGLIYGKGLTREESRLPGVGNKAISLKECRNVTLKDLSMLHCGHFALLA TGIDNLTILNLKVDTNRDGFDIDCCRNVRISQCTVNSPWDDAIVLKASYGLGRFQDTENV TISDCYVSGYDKGSVLDGTWQLDEPQAPDHGYRTGRIKLGTESSGGFRNIAITNCIFEHC RGLALETVDGGHLEDIVINNITMRNIVNAPIFLRLGARMRSPEGTPVGTMKRILISNINV WNADSRYASIISGVPGACIEDVTFRNIHLYYKGGYSEEDGKRIPPEQEKVYPEPWMFGTI PAKGFYIRHARNITFDGIRFHFEQPDGRPLFVTDDAENIEYYNTPKE >gi|222159250|gb|ACAB01000109.1| GENE 7 10378 - 13143 2221 921 aa, chain + ## HITS:1 COG:no KEGG:BT_4145 NR:ns ## KEGG: BT_4145 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 918 1 918 924 1763 88.0 0 MNKRQWIVLCLLATGGVMQAQQWPDTPVEARPGARWWWLGSAVDEKNLTYNLEEYARTGM GAVEITPIYGVQGNDANEIQFLSPRWMEMLKHTQAEGKRTGIEIDMNTGTGWPFGGPEVS IEDAATKAIFQTYEIEGGKEVEQDINVTDPKQQPFSVLSRVMAYDEKGKCINLTAHVKKD KLQWKAPAGKWKVIALYIGKTRQKVKRAAPGGEGYVMNHLSKKAVKNYLSRFDRAFKSSK TSYPHTFFNDSYEVYQADWTDDFLEQFARRRGYKLEEHFSEFLDENRPEVSRRIVSDYRE TISDLLLENFSHQWTDWAHKNGSITRNQAHGSPGNLIDIYASVDIPECEGFGLSQFHIEG 
LRQDSLTKKNDSDLSMLKYASSAAHIAGKTYTSSETFTWLTEHFRTSLSQCKPDMDLMFV SGVNHMFFHGTPYSPKEAEWPGWLFYASINMSPTNSIWRDAPSFFHYITRCQSFLQMGRP DNDFLIYLPVYDMWNEQPGRLLLFTIHHMDKLAPKFINAIHRINNSGYDGDYISDNFIRS TRFKNGQLVTSGGTGYKALVVPAAHLMPSDVLAHLYELAKQGATIVFLENYPTDVPGYGQ LEQKRKSYQQTLKNLPAVSFSETTVTPIGKGKIITGTDYARTLACCNIPAEEMKTKFGLQ TIRRVNDTGHHYFISSLQNKGVDGWITLGTNAETAALFNPMTGECGEAKVRQVDGKTQVY LQLKSGESIILQTYRQPLQASKPWKYVKEQPFSLRLDHGWKLHFAESTPEIQGTFDIDRP CSWTNIDHPAAQTNMGTGVYSLDIELPALKADNWILDLGDVRESARVRINGQEAGCVWAV PYQLKVGQFLKPGKNHIEIEVTNLPANRIAELDRQGVQWRKFKEINIVDLRYRPANYGHW APMPSGLNSEVRLIPVDFTQE >gi|222159250|gb|ACAB01000109.1| GENE 8 13460 - 14632 1115 390 aa, chain - ## HITS:1 COG:YPO3006 KEGG:ns NR:ns ## COG: YPO3006 COG1168 # Protein_GI_number: 16123185 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Yersinia pestis # 1 389 1 392 393 368 44.0 1e-101 MNYNFDEIINRNGTDSVKWDAVESRWGRNDLIPMWVADMDFRTAPFVIEALKKRLEHEVL GYTFACKEWSESIVNWVKERHGWTIREEMLTFTPGIVRGLAFVIQCFTQKGDKVMVMPPV YHPFFLVTQKNEREVVYSPLVLKDGQYYIDFDRFRRDVQGCKLLILSNPHNPGGRVWTKE ELSLIADICYESGTLVISDEIHADLTLPPYKHPTFALISEKARMNSLVFMSPSKAFNMPG LASSYAIIENDELRHRFQKYMEASEFSEGHLFAYLSVSAAYSHGTEWLDQVVSYIKGNVD FTETYLKEHIPAIKMIRPQASYLIFLDCRELGLNQKELNRLFVEDAHLALNDGTTFGKEG EGFMRLNIACPRAVLEQALKQLEQAVINLK >gi|222159250|gb|ACAB01000109.1| GENE 9 14701 - 15486 831 261 aa, chain + ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 261 1 256 256 141 32.0 1e-33 MTKALFFDIDGTLVSFETHRIPSSTIEALEAAHAKGLKIFIATGRPKAIINNLSELQDRN LIDGYITMNGAYCFVGEQVIYKSAIPQEEVKAMAAFCEKKGVPCIFVEEHHISVCQPDDM VKKIFYDFLHVNVIPTVSFEEATSKEIIQMTPFITEEEEKEIRPSIPTCEIGRWYPSFAD ITAKGDTKQKGIDEIIRYFGIKLEETMAFGDGGNDITMLRHAAIGVAMGQAKEDVKAAAD YVTAPIDEDGISKAMKHFGII >gi|222159250|gb|ACAB01000109.1| GENE 10 
15592 - 17592 2001 666 aa, chain + ## HITS:1 COG:Cj0945c KEGG:ns NR:ns ## COG: Cj0945c COG0507 # Protein_GI_number: 15792274 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Campylobacter jejuni # 23 417 13 406 447 168 33.0 3e-41 MSVDTNNAEFQDALNLIQYTRQSVFLTGKAGTGKSTFLRYVCEHTKKKHVVLAPTGIAAI NAGGSTMHSFFKLPFYPLLPDDPNLSLQRGRIHEFFKYTKPHRKLLEQIELVIIDEISMV RADIIDAIDRILRVYSHNLREPFGGKQLLLVGDVFQLEPVVKNDEREILNRFYPTPYFFS ARVFGQIDLVSIELQKVYRQTDPVFVSVLDHIRTNTAGAADLQLLNTRYGSQIEESEADM YITLATRRDTVDSINEKKLAELPGEPITFEGVIEGDFPESSLPTSQELVLKPGAQIIFIK NDFDRRWVNGTIGVIAGIDEEEETIYVITDDGKECDVKLESWRNIRYHYNEKTKEIEEEV LGSFTQYPIRLAWAITVHKSQGLTFSRVVIDFTGGVFAGGQAYVALSRCTSLDGIQLKKP VNRADVFVRPEIVNFAGRFNNRQAIDKALKQAQADVQYAAASRAFDKGDMEECLEQFFRA IHSRYDIEKPVPRRLIRRKLGIINTLKEQNKKLKEQMREQQERIRQYAHEYLLMGNECIT QAHDVRAALANYDKALSLDPNYIDAWIRKGITLFNNKEYFDAENCFNTAVSLHPANFKAV YNRGKLRLKTENTEGAIADLDKATSLKPEHAGAHELFGDALLKAGKEVEAALQWRIAEEL KKKKNP >gi|222159250|gb|ACAB01000109.1| GENE 11 17676 - 20201 2627 841 aa, chain + ## HITS:1 COG:no KEGG:BT_4129 NR:ns ## KEGG: BT_4129 # Name: not_defined # Def: outer membrane assembly protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 841 1 838 838 1406 90.0 0 MKKGLKIAAITVGVIIILMFLLPFAFQGKIADIVKTEGNKMLNAQFDFKNLNISLFRNFP QASVTLEDFWLKGTGEFANDTLVQAGEVTAAVNLFSLFGDSGYDISKIFIEDTKLHAIIL PDGRTNWDIMKPDTIATVETPAEEEASSPFKVKLQRFVIKNMNLIYDDQQGKMYADIRDF NAICAGDLGSERTTLKLEAETKSLTYKMNGIPFLANANISAKMDVDADLANNKYTLKDNT IRLNAIQAGIDGWVALKDPAIDMDLKLNTNDIGFKEILSLIPAIYATEFSSLKTDGTATL TATAKGILQGDTVPAFNIDMQVKDAMFRYPALPAGVDQINISANVQNPGGNIDLTTVNIN PFSFRLAGNPFSLTATVKTPISDPDFKAEAKGVLNLGMIKQVYPLGDMELNGTIDADMQM SGRLSSIEKEEYERIQASGTIGLTGMKLKMKDMPDVEIKKSLFTFTPKYLQLSETTVNIG KNDITADSRFENYIGYVLKGSTLKGNLNIRSNYFNLNDFITASADEASTSEAASTDSVAT AATGIVEVPRNIDFQMDANLKQVLFDKMSFNNMNGKLVVKDGKVDMKNLSMNTMGGNVVM NGYYSTANIKKPEMKAGFKLSNIGFAQAYKELDMVQKMAPIFENLKGNFSGSINVLTDLD 
AAMSPVLNTMQGDGSLSTRDLSLSGVKAIDQIADAVKQPSLKDMKVKDMTLDFTIKDGRV ETKPFDIKMGDYTLNLSGSTGLDQTIDYSGKVKLPASVGNISKLMTLDLKIGGSFTSPKV SVDTKSMANQAVEAVADEAISKLGQKLGLDSAATANKDSIKQKVTEKAAEKALDFLKKKL K >gi|222159250|gb|ACAB01000109.1| GENE 12 20281 - 20895 515 204 aa, chain + ## HITS:1 COG:YPO2212 KEGG:ns NR:ns ## COG: YPO2212 COG0009 # Protein_GI_number: 16122440 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Yersinia pestis # 5 199 7 200 206 161 41.0 1e-39 MLLKLYDKNNNPQDLQRIIDILNDGGLIIYPTDTMYAIGCHGLKERAIERICRIKEIDPR KNNLSIICYDLSSISEYAKVDNNVFKLMKHNLPGPFTFILNGTNRLPKIFRNRKEVGIRM PDNNISREIARLLDAPIMTTTLPYEEHEDLEYMTDPELIDEKFGDIVDLVIDGGIGGIEP STVVKCTDNELEIVRQGKGWLEEN Prediction of potential genes in microbial genomes Time: Wed May 18 03:40:57 2011 Seq name: gi|222159249|gb|ACAB01000110.1| Bacteroides sp. D1 cont1.110, whole genome shotgun sequence Length of sequence - 119987 bp Number of predicted genes - 74, with homology - 72 Number of transcription units - 30, operones - 19 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 70 - 882 923 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase 2 1 Op 2 . - CDS 934 - 2130 1225 ## COG0426 Uncharacterized flavoproteins 3 1 Op 3 . - CDS 2185 - 3165 839 ## COG1242 Predicted Fe-S oxidoreductase - Prom 3262 - 3321 7.4 - Term 3415 - 3462 7.1 4 2 Op 1 . - CDS 3603 - 4421 649 ## COG3279 Response regulator of the LytR/AlgR family 5 2 Op 2 . - CDS 4425 - 5219 462 ## BT_4117 hypothetical protein - Prom 5315 - 5374 1.6 6 3 Op 1 . - CDS 5393 - 6970 1003 ## BT_4115 hypothetical protein 7 3 Op 2 . - CDS 7012 - 8652 1101 ## BT_4115 hypothetical protein - Prom 8775 - 8834 9.9 + Prom 8711 - 8770 7.5 8 4 Op 1 . + CDS 8950 - 12171 2560 ## BT_4114 hypothetical protein 9 4 Op 2 . + CDS 12185 - 14212 1667 ## Phep_0771 RagB/SusD domain protein 10 4 Op 3 . 
+ CDS 14231 - 15826 1127 ## BT_4112 hypothetical protein + Term 15860 - 15909 9.4 - Term 15927 - 15961 2.3 11 5 Op 1 . - CDS 16057 - 20679 3847 ## COG0642 Signal transduction histidine kinase 12 5 Op 2 . - CDS 20722 - 22062 859 ## COG1373 Predicted ATPase (AAA+ superfamily) - Prom 22083 - 22142 5.3 13 6 Tu 1 . - CDS 22395 - 24143 1569 ## COG4677 Pectin methylesterase + Prom 23793 - 23852 2.4 14 7 Tu 1 . + CDS 24049 - 24285 114 ## + Term 24376 - 24410 0.6 - Term 24363 - 24397 1.6 15 8 Op 1 . - CDS 24503 - 25474 808 ## COG4677 Pectin methylesterase 16 8 Op 2 . - CDS 25484 - 26683 1400 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins 17 9 Op 1 . + CDS 26985 - 28391 1666 ## COG3775 Phosphotransferase system, galactitol-specific IIC component 18 9 Op 2 . + CDS 28456 - 29376 1015 ## COG3717 5-keto 4-deoxyuronate isomerase + Term 29416 - 29476 18.5 + Prom 29491 - 29550 5.2 19 10 Tu 1 . + CDS 29643 - 31160 1431 ## COG0477 Permeases of the major facilitator superfamily + Term 31183 - 31245 20.6 - Term 31394 - 31423 1.4 20 11 Op 1 . - CDS 31561 - 34947 2795 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits 21 11 Op 2 . - CDS 34910 - 35752 625 ## COG0805 Sec-independent protein secretion pathway component TatC 22 11 Op 3 . - CDS 35770 - 35991 343 ## BT_4102 putative sec-independent protein translocase - Prom 36044 - 36103 8.2 23 12 Op 1 . - CDS 36105 - 38645 2533 ## COG0787 Alanine racemase 24 12 Op 2 . - CDS 38621 - 39640 475 ## BT_4100 hypothetical protein - Prom 39734 - 39793 9.5 + Prom 39719 - 39778 6.7 25 13 Op 1 . + CDS 39808 - 41712 1788 ## COG1154 Deoxyxylulose-5-phosphate synthase 26 13 Op 2 17/0.000 + CDS 41714 - 43054 1557 ## COG0569 K+ transport systems, NAD-binding component 27 13 Op 3 . + CDS 43078 - 44529 827 ## COG0168 Trk-type K+ transport systems, membrane components + Term 44654 - 44702 2.1 - Term 44641 - 44689 2.1 28 14 Op 1 . 
- CDS 44743 - 46125 1195 ## BT_4096 hypothetical protein 29 14 Op 2 . - CDS 46196 - 47734 1092 ## COG3507 Beta-xylosidase 30 14 Op 3 . - CDS 47746 - 48879 1016 ## COG2152 Predicted glycosylase - Prom 48906 - 48965 3.4 31 15 Op 1 . - CDS 49056 - 51212 1848 ## COG3537 Putative alpha-1,2-mannosidase 32 15 Op 2 . - CDS 51243 - 52397 899 ## COG3537 Putative alpha-1,2-mannosidase 33 15 Op 3 . - CDS 52315 - 53517 981 ## COG3537 Putative alpha-1,2-mannosidase - Prom 53581 - 53640 4.8 34 16 Op 1 . - CDS 53793 - 55097 820 ## BT_4084 hypothetical protein 35 16 Op 2 . - CDS 55122 - 55190 67 ## - Prom 55424 - 55483 6.5 + Prom 55149 - 55208 8.8 36 17 Op 1 . + CDS 55455 - 57299 1153 ## BT_4085 hypothetical protein 37 17 Op 2 . + CDS 57315 - 57770 356 ## BT_4086 hypothetical protein 38 17 Op 3 . + CDS 57786 - 58853 747 ## BT_4087 hypothetical protein 39 17 Op 4 . + CDS 58875 - 61718 2336 ## BT_4088 hypothetical protein 40 17 Op 5 . + CDS 61724 - 63508 1652 ## BT_4089 hypothetical protein 41 17 Op 6 . + CDS 63515 - 66610 2694 ## BT_4090 hypothetical protein + Term 66634 - 66705 22.0 - Term 66760 - 66811 12.4 42 18 Tu 1 . - CDS 66867 - 67664 695 ## Fisuc_0264 KilA, N-terminal/APSES-type HTH DNA-binding domain protein - Term 68000 - 68036 -0.8 43 19 Tu 1 . - CDS 68049 - 70193 1555 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases - Prom 70222 - 70281 4.6 - Term 71091 - 71143 7.3 44 20 Op 1 . - CDS 71226 - 73898 1771 ## BT_4076 alpha-rhamnosidase 45 20 Op 2 . - CDS 73917 - 75323 950 ## BT_4075 hypothetical protein 46 20 Op 3 . - CDS 75369 - 78458 2263 ## COG3250 Beta-galactosidase/beta-glucuronidase 47 20 Op 4 . - CDS 78507 - 80846 1847 ## COG3537 Putative alpha-1,2-mannosidase 48 20 Op 5 . - CDS 80865 - 83429 1432 ## COG0383 Alpha-mannosidase 49 20 Op 6 . - CDS 83442 - 83672 208 ## gi|237714278|ref|ZP_04544759.1| conserved hypothetical protein 50 20 Op 7 . 
- CDS 83750 - 84274 405 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 84390 - 84449 6.5 - Term 84333 - 84368 4.3 51 21 Tu 1 . - CDS 84467 - 86071 1157 ## BT_4069 putative regulatory protein - Prom 86134 - 86193 5.9 + Prom 86243 - 86302 4.1 52 22 Tu 1 . + CDS 86332 - 86709 125 ## BT_4068 hypothetical protein + Term 86748 - 86798 3.1 + Prom 86711 - 86770 4.2 53 23 Op 1 30/0.000 + CDS 86828 - 87178 218 ## PROTEIN SUPPORTED gi|154175415|ref|YP_001407462.1| NADH dehydrogenase subunit A 54 23 Op 2 9/0.000 + CDS 87169 - 87771 435 ## PROTEIN SUPPORTED gi|154175216|ref|YP_001407461.1| NADH dehydrogenase subunit B + Term 87780 - 87834 -0.1 + Prom 87773 - 87832 3.1 55 23 Op 3 8/0.000 + CDS 87854 - 89446 1500 ## COG0649 NADH:ubiquinone oxidoreductase 49 kD subunit 7 56 23 Op 4 31/0.000 + CDS 89467 - 90543 1140 ## COG1005 NADH:ubiquinone oxidoreductase subunit 1 (chain H) 57 23 Op 5 28/0.000 + CDS 90556 - 91044 520 ## COG1143 Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) 58 23 Op 6 30/0.000 + CDS 91047 - 91559 486 ## COG0839 NADH:ubiquinone oxidoreductase subunit 6 (chain J) 59 23 Op 7 26/0.000 + CDS 91597 - 91905 373 ## COG0713 NADH:ubiquinone oxidoreductase subunit 11 or 4L (chain K) + Term 91906 - 91941 -0.9 + Prom 92016 - 92075 6.9 60 23 Op 8 30/0.000 + CDS 92097 - 94028 1532 ## COG1009 NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit 61 23 Op 9 22/0.000 + CDS 94067 - 95551 1368 ## COG1008 NADH:ubiquinone oxidoreductase subunit 4 (chain M) 62 23 Op 10 . + CDS 95589 - 97028 1170 ## COG1007 NADH:ubiquinone oxidoreductase subunit 2 (chain N) + Term 97052 - 97119 6.1 - Term 97044 - 97102 13.5 63 24 Tu 1 . - CDS 97169 - 99979 2581 ## COG0642 Signal transduction histidine kinase - Prom 100052 - 100111 8.3 - Term 100073 - 100136 13.1 64 25 Op 1 . - CDS 100172 - 101380 1254 ## BT_4056 hypothetical protein 65 25 Op 2 . 
- CDS 101424 - 104138 2752 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 104180 - 104239 3.3 + Prom 104302 - 104361 4.0 66 26 Tu 1 . + CDS 104595 - 106160 1089 ## BT_4046 hypothetical protein - Term 106132 - 106171 4.1 67 27 Op 1 2/0.200 - CDS 106187 - 107083 737 ## COG2207 AraC-type DNA-binding domain-containing proteins 68 27 Op 2 . - CDS 107118 - 108464 1274 ## COG0668 Small-conductance mechanosensitive channel 69 27 Op 3 . - CDS 108496 - 109524 815 ## BT_4052 putative ABC transporter ATP-binding protein 70 27 Op 4 . - CDS 109547 - 110410 986 ## COG3950 Predicted ATP-binding protein involved in virulence - Prom 110438 - 110497 5.5 + Prom 110448 - 110507 2.4 71 28 Tu 1 . + CDS 110533 - 111450 736 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 111451 - 111511 9.0 - Term 111442 - 111492 11.0 72 29 Op 1 . - CDS 111541 - 112638 870 ## COG1703 Putative periplasmic protein kinase ArgK and related GTPases of G3E family 73 29 Op 2 . - CDS 112644 - 113711 865 ## BT_4048 hypothetical protein - Prom 113731 - 113790 4.2 74 30 Tu 1 . 
- CDS 113937 - 119681 4975 ## COG2373 Large extracellular alpha-helical protein - Prom 119836 - 119895 8.0 Predicted protein(s) >gi|222159249|gb|ACAB01000110.1| GENE 1 70 - 882 923 270 aa, chain - ## HITS:1 COG:BB0152 KEGG:ns NR:ns ## COG: BB0152 COG0363 # Protein_GI_number: 15594497 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Borrelia burgdorferi # 1 264 1 264 268 370 69.0 1e-102 MRLIIQPDYQSVSLWAAHYVAAKIKAANPTPEKPFVLGCPTGSSPLGMYKALIDLNKKGI VSFQNVVTFNMDEYVGLPKEHPESYYSFMWNNFFGHIDIKPENTNILNGNAPDLDAECAR YEEKIKSYGGIDLFMGGIGPDGHIAFNEPGSSLTSRTRQKTLTMDTIIANSRFFDNDINK VPKTSLTVGVGTVLSAKEVMIIVNGHNKARALYHAVEGAITQMWTISALQMHEKGIIVCD DAATAELKVGTYRYFKDIESAHLDPESLIK >gi|222159249|gb|ACAB01000110.1| GENE 2 934 - 2130 1225 398 aa, chain - ## HITS:1 COG:FN0512 KEGG:ns NR:ns ## COG: FN0512 COG0426 # Protein_GI_number: 19703847 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Fusobacterium nucleatum # 5 398 5 402 403 364 45.0 1e-100 MEHKTRIKGNVHYVGVNDRNKHRFEGMWPLPYGVSYNSYLIDDEMVALVDTVDICYFEVY LRKIKQVIGERPINYLIINHMEPDHSGSIRLIKQHYPEIIIVGNKQTFGMIEGFYGVTGE QYLVKEGDFLALGHHKLRFYMTPMVHWPETMMTFDETDGVLFSGDGFGCFGTVDGGFLDT RINVDKYWGEMVRYYSNIVGKYGSPVQKALQKLGGLPISAICSTHGPVWTENIAKVIGIY DRLSRYDADEGVVIAYGSMYGNTEQMAEAIAEELSAQGVKNIVMHNVTKSHPSYILADIF RYKGLIIGSPTYSNQIFPEVEALLSKILLREVKGRYLGYFGSFTWAGAAVKRLAEFAEKS KFELVGDPVEMKQAMKDLTYTQCENLARAMADRLKKDR >gi|222159249|gb|ACAB01000110.1| GENE 3 2185 - 3165 839 326 aa, chain - ## HITS:1 COG:SA1581 KEGG:ns NR:ns ## COG: SA1581 COG1242 # Protein_GI_number: 15927337 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductase # Organism: Staphylococcus aureus N315 # 9 315 14 317 317 233 39.0 3e-61 MNMSTQLLYNDFPTFLRKYFPYKVQKISLNAGFTCPNRDGTKGWGGCTYCNNQTFNPDYC RTEKSITTQLEEGKCFFAHKYPEMKYLAYFQAYTNTYAELEGLKRKYEEALTVDGVVGLV IGTRPDCMPESLLRYLEELNKHTFLMVEYGIESTCDETLKRINRGHTYADTVAAVRRTAA 
CGILTGGHIILGLPGETHDTMVAQAEILSDLPLATLKIHQLQLIRGTRMAHEYDEAPDGF HLFNEVEEYIGLVIDYVEHLRPDIVVERFVSQSPKDLLIAPDWGLKNYEFVARLQKRMKE RGAYQGKKYRDSEKRIIFADDKLTTE >gi|222159249|gb|ACAB01000110.1| GENE 4 3603 - 4421 649 272 aa, chain - ## HITS:1 COG:SA0251 KEGG:ns NR:ns ## COG: SA0251 COG3279 # Protein_GI_number: 15925964 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Staphylococcus aureus N315 # 168 272 141 245 246 60 29.0 2e-09 MKAHPIIESPFRWIPMLVLALILVVFQVALVCGYTGNDYLPALVDGIVTIGWLAAIAYLA WFVVGLVSLFQTDVIMIIVGSLLWLAGSFMVCDIMVRIVGVSYVPFAQTIPFRLLFGLPV LIAITLWYRLIVTKEEVQNQEMEKELAVHQVNVAEQQKESAMEWVDRITVKDGSRIHLVK ADELIYIQACGDYVMLITPTGEYLKEQTMKYFETHLSPDTFVRVHRSTIVNVTQISRVEL FGKETYQLLLKNGVKLRVSLSGYRLLKERLGI >gi|222159249|gb|ACAB01000110.1| GENE 5 4425 - 5219 462 264 aa, chain - ## HITS:1 COG:no KEGG:BT_4117 NR:ns ## KEGG: BT_4117 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 264 1 264 264 426 80.0 1e-118 MEEKQKFPTATGFSGKTLIASLFILSGILLFARNMGWITSEVFDMIVSWHSLLIILGIYS MIRRHFIGGIILLLIGVYFLIGGLSWLPENSQAMVWPIALITAGVLFFFKSGKNRRGHWA HHHAMQHRQKWMKMHQGQPGMNFESEQQQSESVDGFLRAENVWGAARHVVLDELFKGAVI RTSFGGTTIDLRHTHIAPGETYIDLDCSWGGVEIYVPSDWKVVFKCNAFFGGCDDKRWQN GNINKESVLVIRGTLSFGGLEVKD >gi|222159249|gb|ACAB01000110.1| GENE 6 5393 - 6970 1003 525 aa, chain - ## HITS:1 COG:no KEGG:BT_4115 NR:ns ## KEGG: BT_4115 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 515 12 490 497 374 44.0 1e-102 MNLIFRCLTVLLTLIGNEAAFASSGIGNTLDFGELAAFPGAEGYGRMTTGGRGGEVYIVT SLADDGMPGTLRYGIEKLDGPRTIIFQVSGTIFLHKDLKIRKGNLTIAGQTAPGNGICIA GYPVTNYASNVILRFLRFRMGNKECVHPDGADALGGKGARNVIIDHCSISWCTDECASFY ENDDFTMQWCIISESLRLGGHTKGPHGYGGIWGGEHSSYHHNLLAHHDSRNPRFGLGAKV RKNGECDGDYVDFRNNVIYNWGMNSSYGGERMNINIVNNYYKPGPATVTGSKRGRIFAID ATENRNGGYLWGKYYIDGNVVDGGADDKNSQKATANNWEYGVYNQFSNNYKKVVTQKTKD SIRLDKPHEFASVTTHSAFNAYKQVLDYAGCSLHRDDVDARIVKETRTRTAGYKGLNVHN 
GEGGIWKSEGYPKPGLIDSQDDLLSLNTSENVSAWPVLLQRSTLIDSDNDGLPDAWERKF GLNPHDASDGNGKTIDKYGQYTNLEMYMNSLVHDIIEKQNSGGKK >gi|222159249|gb|ACAB01000110.1| GENE 7 7012 - 8652 1101 546 aa, chain - ## HITS:1 COG:no KEGG:BT_4115 NR:ns ## KEGG: BT_4115 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 46 542 38 496 497 380 44.0 1e-104 MKMDIILSKWCLAILFAFGILACTDGSDANELPKEPEGTGETQDEKIPAFPGAEGHGRYT TGGRGGDVYIVTSLEDKLQNGTLRYGIEKLSGKRTIIFQVSGTIHLNGDLKIKNGDLTIA GQTAPGDGICLAGYPVFLEADNVIVRFMRFRMGNKEDVSADGADAFGGRYHRNIMIDHCS MSWCTDECVSFYQNENFTLQWCLISESLRLGGHTKGPHGYGGIWGGMKASFHHNLLAHHD SRNPRLGPGVNSTKENEIVDMRNNVIYNWCGNSCYGGEAMHVNIVNNFYKPGPATPTGTS KRGRIIAIDKKVSDSDKKSYPAIFDTWGDFFIQGNVVDDGQINGAADYDRCMKATKDNWE YGVYNQFDKKYGTLDEGTKKALKRTTPVETGTVTTHDARTAFERVMDYAGCSLHRDRVDE RIVQETRTGTANYQGMNEHNGQGVVEGIDWKSVGYPKKGIIDSQDDVIPVGESSAWPELV QGVILKDSDNDGMPDEWEKKYGLNPNDASDRNGKTVDEEGTYTNLEMYMNSLVQKIVEEQ NKGGVQ >gi|222159249|gb|ACAB01000110.1| GENE 8 8950 - 12171 2560 1073 aa, chain + ## HITS:1 COG:no KEGG:BT_4114 NR:ns ## KEGG: BT_4114 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1073 1 1071 1071 860 46.0 0 MSNKVKNMRSWLLMLFAAISLGVSAQTITVKGNVKDTTGEPIIGASVVEKGNTTNGTITD LDGNYSIKVPSKATLTISYIGMKTLDIAVKGQSQINVTLSDDTQALDEVVVIGYGTVAKK DLTGSVSSVSAKQIAAIPVSSASEALQGKMAGVSITTTEGSPDADVKIRVRGGGSLSQDN SPLYIVDGFPVSSISDIAPTDIQSVDVLKDASSTAIYGARGANGVIIITTKSGQEGKVQV NFGASYGLRKVTKMVDVLNPYEFVMWQQELDQKSFNYGTFDDLEIYKSYTDTDYQGEIFG RTGNQQQYNISVSGGNKDTKFNISYSRNDEKSIMLNSGFAKDNINAKINTKINKWLTFDF NARLSYQKVKGTGSGADDNQSSATTSVNARSVVFAPVEKLISGSSDDDENVNNARYTPIE ILNATYKRVSRFQQNYNAGVNWEPVKGLKFRSEFGYGWKYNNTDQVWGVEATNNSKYGNP GMPQALLTKLINKNWRNANTVTYNNDKMFNLNHRMNVMLGHEVSSSYDHKTTMTSVGFPK NMSIKEVLANMGQGQALPTDTYIGIKDNLLSFFGRINYTINDKYLMTFTMRADGSSKFAS GNQWGYFPSLALAWRISDEAFMKNAQEWLSNLKLRLSIGTAGNNRIPSGSLETTYSLAGS GDKHIYFNNTSSVVLQRGKNLSNPNLKWETTLTRNIGLDFGFWNGRLSGSLDAYWNTTKD LLMQTTLPKGAGYEFQYQNFGQTSNKGIELALDAILVDQKDFGLNFNFNISYNQAKIDKL 
PTNKIWQSSGWGGSLIAGIDDFLIEEGGRLGEVYGYKLDGFYTTEDFTWDNGWKLKEGRV NPTESGMKVNNFGPGSPKLKANADGKIEKERLGNTIPKITGGFGINARYKSFDLSAFFNY SLGNKIINATKLASGFYSDTKKDWNLNNEFNLSKRYTWIDPANGMDMTSKSYINEFGSDY AINRLNEINAGASIFNPASITQIPLLDWAVEDGAFLRVNNISIGYTLPKVFVQKLQMQNV RIYFTGYNLYCFTKYSGADPEVDTCRKTPMTPGIDYSAYPKSRSFVGGINVTF >gi|222159249|gb|ACAB01000110.1| GENE 9 12185 - 14212 1667 675 aa, chain + ## HITS:1 COG:no KEGG:Phep_0771 NR:ns ## KEGG: Phep_0771 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 14 558 18 524 591 250 33.0 1e-64 MKSIIKYIIGGMGLLSITSCSDFLDQSSPSEIKGDMVFNSLTFTEQSLNKVYAGLTLDHT YGARIPLNFGFNTDIELVDADVNKPASFTESSERALGNYNGESISGSWSKLDDNWKNCFA IIEDANLVVEGIRSSELFQEGNPARNKMANFLGEALTIRAMVYFDLVKNYGDLPMKFETT KVDGSNLYVAKSDRDDIMEQLLSDLEEAANYLPWAGESGYTTEHATAGFAHGLFARIALA KAGYSIRESSKPGYETLAGSDGTYPTQRPGEAERKLLYERALKHLTAVIGSGVHKLNPSY ADQWFQVNQRKLDNTYKENLYEVAHGLNKSGEMGYTIGVRISGASSYYGAKGNSSGKVKL TAPFFWSFDHSDLRRDITCATYELKEENGHIKENMQKNAPFGIYVAKWDIRKMNDEWLNA VRASDAKIGYGINWIAMRYSDILLMYAEVMNELYGADAANPLGGTAMTARTALTEVHSRA FDNKANAQAYVAAISSGDDFFNAIVDERAWEFAGECVRKYDLIRWGLLSKKIDQFKEDYR QLTTIAPKYIFYKMKADDEYSIDMSSICWYEYPSFVNEINNELDVKNAIKNATDPNWKYV PGWGTFPNGKIEKDATTKQEVFKEDGSTSNDSNLSGLTDYVSTGLNKTVKNRHLIPLGSK TISESNGTLANSYGF >gi|222159249|gb|ACAB01000110.1| GENE 10 14231 - 15826 1127 531 aa, chain + ## HITS:1 COG:no KEGG:BT_4112 NR:ns ## KEGG: BT_4112 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 528 19 525 527 107 25.0 1e-21 MKISTNIHTLLSIICLTVISFSSCDQSNDWSSDERYNRLFRSNNLAVTPQTFEAEVIWNA TPNTDYYIIEVNKNEFTNDELQEGSIIYGEDKGITSSPYSITGLEEQTTYYLRLRSCSES AQPSKWCTLEDGTFTTKSEQIIESSSNVTSKNATIHWTAGVDVTHFLIESTEGETRRDIT ANEKNAGQATLTGLKAATTYKVSIYNNQKLRGNCKFETTEDFPEGYNIANLKEGDDIDAV LAEQQGDVVLVFPAGSTFERTEKLSIPASVNAIIFWGASGGAQPNFKPKEVTATENTTSI KFYNMNLYNNGKGSDYMINQDAMTTDVNISFDKCKVSKTRGILRVQGGGTGCSINNIEFT NTTFSEIGSYGVIHTKDMTGNLNSIHISKCTFNDVAATNGATFTLTAKNTTHSITFNIEQ 
CTFYTCVQASNKHLIDANKVELHDIRITKCLFANCGSSDAKNKLCSVKSIVKETSDNWYT TDCAWHSEAQAMCTEYGKTSAELFEDPENGNFKIKDALFKNRAGDPQWFSE >gi|222159249|gb|ACAB01000110.1| GENE 11 16057 - 20679 3847 1540 aa, chain - ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 960 1187 56 287 318 144 35.0 2e-33 MILQVFFYLYATEPSDMMKKTLISVILLLAIAFSTYAQQHCFFTHYSTEDGLSQNTVMNI LQDHKDNLWFATWDGINRFNGYTFKTYKARQGNYISLTNNRVDRMYEDRYGFLWLLTYDN RVHRFDPKTETFEQVPAAGEEGSAFNVHTIEVMPNGTVWLLTENDGAIRILTHPDENNRL TWDIYSSKTELFPSLHVFKVHQDKAGNDWLLTDNGLGMIHPGKKEPDSYFTETKGKFGGM NQAFYAVQERDKDICFASDHGRIWSYQKESGEFTLIELPTKGQITSIHPIAPDVSVITTD SDGFFTYNLRTKTNVHYSFLTCKALPAKPILSAYVDRSSEVWFEQEEPGVVAHFNPSTGV VTREQIPIEYSNPERSRPAFHIHEDVNGYLWVHPYGGGFSYFDPQKKCLVPFYNGLGSRD WRFSNKIHSAFSDKQGNLWLCTHSKGLEKVTYRNVPFAMMTPMPHEHESLHNEVRALCED KLGNLWVGLKDGILRMYDVGKTYKGYLTESGTISTTGTPMLGTVYFVIQDSKGIIWIATK GDGLVRAEPISSNGMSYKLTRYLHREDDMYSLSDNNVYCVYEDHYGRIWLATFAGGINYI TENEAGKTLFINHRNNLKGYPIDPCYKARFITSDNNGRLWVGTTTGAVAFDENFKKPEDV KFYHFSRMPNDTQSLSNNDVHWIISTKKKELYLATFGGGLNKLVSISKDGHGEFKSYSVL DGLPSDVLLSIREDSKQNLWISTENGICKFIPSEERFESYDERSITFPVRFSEAASALTA KGSMLFGASGGVFIFNPDSIRKSSYIPPIVFSKLTVANEDITPGGNSLLKVDIDDADKLV LSHKENIFSVHFAALDYTNPQNIQYAYILDGFEKQWTFADKQRSVTYTNLPKGEYVLRVR STNSDGVWVDNERILDIVILPSFWETPIAYVLYILFILIIILVAVYILFTIYRLKHEVSV EQQISDIKLRFFTNISHELRTPLTLIAGPVEQVLKNDKLPADAREQLVVVERNTSRMLRL VNQILDFRKIQNKKMKMQVQRVDIVPFVRKVMDNFEAVAEEHRIDFLFQTEKEHLYLWVD ADKLEKIVFNLLSNAFKYTPNGKMITMFIREDENTVSIGVQDQGIGIAENKKKSLFVRFE NLVDKNLFNQASTGIGLSLVKELVEMHKATISVDSRLGEGSCFKVDFLKGKEHYDKETEF ILEDAEAPVRMGQVVDIANSSIQSETLIPDDSEKIEAVYEEDTAKEENSKELMLLVEDNQ ELREFLRSIFTPMYRVVEAADGREGANKALKYLPDIIISDVMMPEKDGIEMTRELRADMT TSHIPIILLTAKTTIESKLEGLEYGADDYITKPFSAIYLQARVENLLMQRKKLQSFYRDS LMHINMSVTSGELLASTKAMAEEEKKIVSEREEEQTQLQSQQQPTIPDMSPNDRKFMDKL VELMEQNMDNGDLVVDDLVRELAVSRSVFFKKLKTLTGLAPIEFIKEIRIKRATQLIETG 
EFNMTQISYMVGINDPRYFSKCFKAQVGMTPTEYKEKIGR >gi|222159249|gb|ACAB01000110.1| GENE 12 20722 - 22062 859 446 aa, chain - ## HITS:1 COG:FN1101 KEGG:ns NR:ns ## COG: FN1101 COG1373 # Protein_GI_number: 19704436 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 3 434 24 450 470 178 29.0 2e-44 MFERRIQNQLHEWAEKNNRKPLILRGARQVGKTTIVEQFGKEFDNFLSLNLEKKEAKDLF ESTDDVKMLLPLLFLYDNKPQRPGRTLLFIDEIQSSPQAVALLRYFYEELPELYVIAAGS LLENMLDKHISLPVGRVEYMALHPCSFIEFLQAIGEGRFVDWILEARLPEAFHQQLMLLF NSYALIGGMPEIVARYAANRDVVSLSDTYNQLLNAYKNDVEKYAETNTQAAVIRYILEEG WGYAGQAVTLGGFAQSAYKARETGEAFRTLEKALLLELVYPTTHTIMPATTDLKRAPKLI WLDTGLVNYAAEIQKEYLLSKDLTDTWRGMAAEQIVAQELKALSNEVGKKQKFWVRAKRG SSAEVDFVYIYGGKIIPIEVKNGHNAHLKSIHQFMNETEHDLAVRIWSGNYAIDEVTTNE GKHFKLINLPFYMIAALPNILKKIET >gi|222159249|gb|ACAB01000110.1| GENE 13 22395 - 24143 1569 582 aa, chain - ## HITS:1 COG:CAC3373 KEGG:ns NR:ns ## COG: CAC3373 COG4677 # Protein_GI_number: 15896615 # Func_class: G Carbohydrate transport and metabolism # Function: Pectin methylesterase # Organism: Clostridium acetobutylicum # 281 571 2 321 321 239 42.0 1e-62 MKNMKTLRVKMVGLLATALILFSAFRADKPVITIFMIGDSTMANKKIDGGNPERGWGMVL PGFFSEDIRIDNHAANGRSSKSFISEGRWEKVISKVKKGDYVFIQFGHNDEKADSTRHTD PGSTFDENLRRYVNETRAKGGIPVLFNSIVRRNFVQPKDDAIAKDVRRTPGEKEQPKEGI VLFDTHGAYLDAPRNVAKELGVTFIDMNKITHDLVQELGPVESKKLFMFVEPNQVPAFPK GREDNTHLNVYGARTIAGLAVDAIGKEIPELAKYIRQFDYVVAQDGSGDFFTVQEAINAV PDFRKDVRTTILIRKGTYKEKLIIPESKINISLIGEDSAILTYDGFANKKNVFGENMGTS GSSSCYIYAPDFYAENITFENSSGPVGQAVACFVSADRVYFKNCRFLGFQDTLYTYSKQS RQYYEDCYIEGTVDFIFGWSTAVFNRCHIHSKRDGYVTAPSTDKGKKYGYVFYNCKLTAE PEATKVYLSRPWRPYAQAVFIRCELGKHILPVGWNNWGKKENEKTVFYAEYESRGEGAHP KARAAFSQQLKNLKGYEINAVLAGEDGWNPIENGNKLITVKR >gi|222159249|gb|ACAB01000110.1| GENE 14 24049 - 24285 114 78 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTGLSARNAEKRIRAVASSPTILTLSVFIFFIPSYEFYFLFLSATCNVLLVSYLQSTDCS VADTVADTVADVYLQHLI >gi|222159249|gb|ACAB01000110.1| GENE 15 24503 - 25474 808 323 aa, 
chain - ## HITS:1 COG:CAC3373 KEGG:ns NR:ns ## COG: CAC3373 COG4677 # Protein_GI_number: 15896615 # Func_class: G Carbohydrate transport and metabolism # Function: Pectin methylesterase # Organism: Clostridium acetobutylicum # 33 319 1 318 321 254 42.0 2e-67 MKRSIFKGMMCLLLLGVGATSVYAQQQQRKDTLVVARDGTGEYRNIQEAVEAVRAFMDYT VTIFIKNGIYKEKLVIPSWVKNVQLVGESAEKTIITYDDHANINKMGTFRTYTVKVEGND ITFKDLTIENNAAPLGQAVALHTEGDRLMFVNCRFLGNQDTIYTGTEGARLLFTNCYIEG TTDFIFGPSTALFEYCELHSKRDSYITAASTPQSEEFGYVFKNCKLTAAPGVKKVYLGRP WRPYAATVFINCEFGNHIRPEGWHNWKNPENEKTARYAEFGNTGAGADTSGRVAWAKQLT NKEAMKYTPQNIFKESSNWYPYK >gi|222159249|gb|ACAB01000110.1| GENE 16 25484 - 26683 1400 399 aa, chain - ## HITS:1 COG:CAC0359 KEGG:ns NR:ns ## COG: CAC0359 COG4225 # Protein_GI_number: 15893650 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Clostridium acetobutylicum # 59 397 23 361 361 334 46.0 2e-91 MRRTPFYNFFIVALLVTMTASVSAQQVDSKLPWSVRLTESEMIRYPESWQLDFQPKLKWD YCHGLELGAMLDVYDAYGDKKIRDYAIAYADTMVHEDGSITAYKLTDYSLDRINSGKILF RIYEQTKNPKYKKALDLLYSQFEGQPRNEDGGFWHKKIYPHQMWLDGIYMGAPFYAEYAF RNNLPQDYADVINQFVTCARHTYDPKNGLYRHACDVSRTERWADPVTGQSKHTWGRAMGW YAMALVDVLEFIPQHEAGRDSLLDILNNVAVQVKKLQDPKTGGWYQVMDRSGDKGNYVES SCSAMFIYSLFKAVRLGYIDKSYLDVALKGYKGFLDNFIEVDKNGLVTITKACAVAGLGG KVYRSGDYDYYINETIRNNDPKAVGPFIMASLEYERLQK >gi|222159249|gb|ACAB01000110.1| GENE 17 26985 - 28391 1666 468 aa, chain + ## HITS:1 COG:Z4877 KEGG:ns NR:ns ## COG: Z4877 COG3775 # Protein_GI_number: 15804015 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIC component # Organism: Escherichia coli O157:H7 EDL933 # 4 457 21 458 462 281 35.0 2e-75 MEEVFKYIIGLGAAVMMPIIFTILGVCIGIKLPKALKSGLLVGVGFVGLSVVTALLTSSL GPALSKMVEIYGLELGIFDMGWPSAAAVAYNTSVGAFIIPVCLGVNLLMLLTKTTRTVNI DLWNYWHFAFIGAIVYFASDNIYWGFFAAIICYIITLVMADMTAPAFQKFYDKMDGISIP 
QPFCQSFVPFAIVINKLLDKIPGFDKLNIDSEGLKKKFGLMGEPLFLGIVIGCGIGALGC GSWKEVVDSIPSILGLGIKMGAVMELIPRITSLFIEGLKPISDATRELIAKKYKSNTGLS IGMSPALVIGHPTTLVVSLLLIPVTIFLAVILPGNRFLPLASLAGMFYLFPMILPITKGN VVKSFIIGLVALIVGLYFVTELAGFFTLAAKDVYEATGDPTVNIPAGFEGGALDFASSLF CWGIFHLTYSLKIIGPAILVALALGMAIYNRIRMTRNDAKNALNSNKE >gi|222159249|gb|ACAB01000110.1| GENE 18 28456 - 29376 1015 306 aa, chain + ## HITS:1 COG:BH2166 KEGG:ns NR:ns ## COG: BH2166 COG3717 # Protein_GI_number: 15614729 # Func_class: G Carbohydrate transport and metabolism # Function: 5-keto 4-deoxyuronate isomerase # Organism: Bacillus halodurans # 28 306 6 276 276 288 49.0 1e-77 MKKLAIAMMLGIAAMSASAQVNYKMQVACSPQDVKTYDTNRLRSSFLMEKVMVPNEINVT YSMYDRLIFGGAVPATKELVLETIDPLKSKFFLERRELGVINIGGEGIVTVDGKEYTLKF KDALYVGRGKQKVTFKSKDSSNPAKFYINSATAHKEYKTQLITIDGRKGSLKANSFAAGK LEESNDRVINQLIVNNVLEEGPCQLQMGLTELKPGSVWNTMPAHTHSRRVEAYFYFNVPA GNAICHFMGEPQEERIVWMQNEQAIMSPEWSIHAAAGTSNYMFIWGMAGENLDYGDMDKI KYTEMR >gi|222159249|gb|ACAB01000110.1| GENE 19 29643 - 31160 1431 505 aa, chain + ## HITS:1 COG:CC1508 KEGG:ns NR:ns ## COG: CC1508 COG0477 # Protein_GI_number: 16125755 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Caulobacter vibrioides # 9 430 18 395 431 248 35.0 2e-65 MNAFQKTGEKMTNYRWTICAMLFFATTVNYLDRQVLSLTWDEFIKPEFHWDESHYGTITS VFSIVYAICMLFAGRFVDWMGTKKGFLWAIGVWSAGACLHAVCGIVTEAQVGLHSAAELA GATGDVVVTIATVSMYCFLAARCILALGEAGNFPAAIKVTAEYFPKKDRAYATSIFNAGA SIGALIAPLTIPILAKMFGWEMAFIVIGGLGFIWMGFWVFMYDAPSKSKHVNQAELDYIE QDNREAGSAPMTDEKDEKRMKFWQCFSYKQTWAFVFGKFMTDGVWWFFLFWTPSYLNTQF GIKTSDPLGMALIFTLYAVTMLSIYGGKLPTIFINRTGMNPYAARMKAMLIFAFFPLVVL LAQPLGTVSPWFPVILIGIGGAAHQSWSANIFSTVGDMFPRTAIASITGIGGMAGGVGSM ILQKVAGNLFVYASGTTMIDGKEVEMTKELLEQGAQFVHPAMTFMGFEGKPAGYFVIFCV CAVAYLLGWVIMKALVPKYKPIVLE >gi|222159249|gb|ACAB01000110.1| GENE 20 31561 - 34947 2795 1128 aa, chain - ## HITS:1 
COG:sll1582 KEGG:ns NR:ns ## COG: sll1582 COG1112 # Protein_GI_number: 16329815 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Synechocystis # 203 1128 203 1098 1118 292 28.0 3e-78 MKLEMKLPCPKNEAIESYEILLAVCRAEDAYLAVGYKQMRDLLERICRGQMQNESLQMTD LSARISFVSAKVGLSVTEQNRLHTFRLTSNAILNRQQEPNREQLLRDAKTLAFFIRKLLD EDIPLELYRLFPRADATYLVAPPARERVQRMRVCFQYADEQYLYVTPLDEVSEQPYLVRY NVPQINEEFAETCKLLWRHAQVNLLDVAIDEAGILTPSFIVLEPDYLLDISSLAECFRDY GHHPANYFLSRLQPIENARPLLLGNIANLFLDEWIHAKSEEIDYRTCMQKAFRRYPIELA ACSDLRDKEKERQFFEDCKLHFDHIRETVNDTFHAAGYELDKTDAVLEPSYICEALGLQG RLDYMQRDMSSFIEMKSGKADEYAIRGKVEPKENNKVQMLLYQAVLQYSMGMDHRKVKAY LLYTRYPLLYPSRPSWAMVRRVIDLRNRIVADEYGIQLRNSLEYTSQKLEEINASTLNER GLKGRFWETYLRPSIDNFQSKLKALSALEKNYFYAVYNFITKELYTSKSGDVDYEGRTGA ASLWLSTLAEKCEAGEIIYDLKIKENHAADEYKAGLTLTAGSEMLHAETFLPNFRQGDAI ILYERNCDTDNVTNKMVFKGNIEYLTENEIGIRLRATQQNPSVLPTESLYAIEHDTMDTT FRSMYQGLYAYLSARKERRDLLLSQRPPRFDKSLDSMIFCSEDDFTRVALKAKAAQDYFL LIGPPGTGKTSCALKKMVETFHADKDAQILLLSYTNRAVDEICKSLASIAPAVDFIRVGS ELSCDEAYRGHLIENELSSCNRRSEVYERIRNCRIIVGTVAAISGKPELFRLKHFDVAII DEATQILEPQLLGILCARGEDEKDAIDKFVLIGDHKQLPAVVQQNAEQAAIYDESLLSIG LSNLKDSLFERLYRNCTAACSSSAIHRSYDMLCRQGRMHPEVALFANRAFYGGRLIPVGL PHQIEDSDTICRLAFYPSVPEKAGASAKINYSEARIVADLAVRIYEHHQSDFDESRTLGI ITPYRSQIALIKKEIESVGIPVLNRILVDTVERFQGSERDVIIYSFCVNYPYQLKFLSNL TEEEGVLIDRKLNVALTRARKQMFITGVPELLERNPLYKSLLKLIEGS >gi|222159249|gb|ACAB01000110.1| GENE 21 34910 - 35752 625 280 aa, chain - ## HITS:1 COG:DR0806 KEGG:ns NR:ns ## COG: DR0806 COG0805 # Protein_GI_number: 15805832 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Sec-independent protein secretion pathway component TatC # Organism: Deinococcus radiodurans # 1 266 19 263 270 132 34.0 1e-30 MAEMTFWDHLDELRKVLFRVVGVWFVLAIGYFIAMPYLFDHVILAPCHNDFIFYDLLRHI GKTFDLTDDFFTQQFYVKLVNINLAAPFFIHMSTAFWMSVVTAIPYIFFEIWRFINPALY PNERKGVRKALTIGTVMFFIGVLLGYFMVYPLTLRFLSTYQLSSEVENILSLNSYIDNFM 
MLVLCMGLAFELPLVTWLLSLLGVVNKSFLRRYRRHAVVVIVIAAAIITPTGDPFTLSVV AIPLYLLYEMSILMIKDKKKTEEEEEVEDETGDEVALSEE >gi|222159249|gb|ACAB01000110.1| GENE 22 35770 - 35991 343 73 aa, chain - ## HITS:1 COG:no KEGG:BT_4102 NR:ns ## KEGG: BT_4102 # Name: not_defined # Def: putative sec-independent protein translocase # Organism: B.thetaiotaomicron # Pathway: Protein export [PATH:bth03060]; Bacterial secretion system [PATH:bth03070] # 1 72 1 72 73 99 91.0 3e-20 MTNLLLLGFLPSGSEWIIIALVILLLFGGKKIPELMRGLGKGVKSFKDGVNEAKDEINKA KDELDKPVDPSKN >gi|222159249|gb|ACAB01000110.1| GENE 23 36105 - 38645 2533 846 aa, chain - ## HITS:1 COG:CAC0492 KEGG:ns NR:ns ## COG: CAC0492 COG0787 # Protein_GI_number: 15893783 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Clostridium acetobutylicum # 488 846 11 376 386 207 36.0 8e-53 MSYAIESIAKSIGARRMGKHKATIDWLLTDSRSLSFPEETLFFALTTKRNSGVRYIPELY DRGVRNFVITEEDFKLIENGELKMEKSGQDDDAQPILNSQLSTLNFLIVPNPLKALQKLA EAHRDKFKIPVIGITGSNGKTIVKEWLHQLLSPDRCIVRSPRSYNSQIGVPLSVWQLSEE AELGIFEAGISEMGEMGALKRMIKPTIGILTNIGGAHQENFFSLQEKCMEKLTLFKDCDV VIYNGDNELISNCVAKSMLTAREIAWSRRDIERPLYISRVIKKEDHTVISYRYLEMDNTF CIPFIDDASIENVLNCLAACLYLMTPADQITERMARLEPIAMRLEVKEGKNNCVLINDSY NSDLASLDIALDFLVRRSEKKGLKRTLILSDILETGQSTATLYRRVAQLVQSKGINKLIG VGQEISSCSARFDDDLERYFFPNTEALLTSGILKSLHSEVILVKGSRVFNFDLVSEELEL KVHETILEVNLGAMVANLNHYRSMLRHPETKVICMVKASAYGAGSYEIAKSLQEHHVDYL AVAVADEGSELRKAGITASIIIMDPELTAFKTMFDYKLEPEVYNFHLLDALIKAAEKEGI TNFPIHVKLDTGMHRLGFAVEDIPLLIRRLKNQSAVIPRSVFSHFVGSDSPQFDAFTRRQ IELFEKGSQELQAAFSHKILRHICNTAGIERFPDAQFDMVRLGIGLYGVSPIDNSIIHNV STLKTTILQIRDVPQEDTVGYSRMGHLARPSRIAAIPIGYADGLNRHLGRGNAYCLVNGK KAPYVGNICMDVCMIDVTDIDCREGDQAIIFGDDLPITVLSDKLDTIPYEVLTSISTRVK RVYYQD >gi|222159249|gb|ACAB01000110.1| GENE 24 38621 - 39640 475 339 aa, chain - ## HITS:1 COG:no KEGG:BT_4100 NR:ns ## KEGG: BT_4100 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 339 1 329 329 518 76.0 1e-145 
MKRNEMMENRMNFQTSVELPAGMPPVSHADRILLMGSCFAENIGRQLMDAGFQLDLNPFG ILYNPLSVSSALREIMGHKEYTSQDLFTYKDLWHSPMHHGSFSAFTPEETLHTINARLHH AHQRLPELNWLMVTLGTAYVYKQKESGQVVANCHQLPENHFLRYRLTVEEIVEDYTALIT GMAACNPELKWLFTVSPIRHIRDGMHANQLSKSTLLLAIDRLQQLFPERVFYFPSYEIIL DELRDYRFYADDMLHPSPLAIRYLWERFSETFFSAETKQVIMAVQDIRWDLAHKPFHPES EAYQRFLGQIVLKIERLIGKYPYLDFQKETELCHMRLNP >gi|222159249|gb|ACAB01000110.1| GENE 25 39808 - 41712 1788 634 aa, chain + ## HITS:1 COG:PM0532 KEGG:ns NR:ns ## COG: PM0532 COG1154 # Protein_GI_number: 15602397 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Pasteurella multocida # 7 633 4 614 614 549 43.0 1e-156 MKNEPIYNLLNTINYPDDLRRLDVEQLPEVCSELRQDIIKELCCNPGHFAASLGTVELTV ALHYVYNTPYDRIVWDVGHQAYGHKILTGRREAFSTNRKLGGIRPFPSPEESEYDTFTCG HASNSISAALGMAVAAARKGDVKRHVVAIIGDGSMSGGLAFEGLNNASATSNNLLIILND NDMAIDRSVGGMKQYLFNLTTSNRYNQLRFKLSRMLFKLGILNEERRKALIRFGNSLKSM AAQQQNIFEGMNIRYFGPIDGHDVKNLARVLRDIKDMQGPKILHLHTIKGKGFGPAEKHA TEWHAPGKFDPVTGERFIANTEGMPPLFQDVFGNTLVELAEANPKIVGVTPAMPSGCSMN ILMEKMPKRAFDVGIAEGHAVTFSGGMAKDGLQPFCNIYSSFMQRAHDNIIHDVAIQNLP VVLCLDRAGLVGEDGPTHHGAFDMACLRPIPNLTISSPMDEHELRRLMYTAQLPDKGPFV IRYPRGRGVLVDWKCPLEEIPVGKGRKLKEGSDLAVITIGPIGNVAARAITRAEADFGLS IAHYDLRFLKPLDEELLHEVGRKFQRVLTIEDGIIKGGMGSAILEFMADNEYKPTVKRIG IPNLFVEHGSVAELYQLCGMDEEGILTKIKEFIN >gi|222159249|gb|ACAB01000110.1| GENE 26 41714 - 43054 1557 446 aa, chain + ## HITS:1 COG:PA0016 KEGG:ns NR:ns ## COG: PA0016 COG0569 # Protein_GI_number: 15595214 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Pseudomonas aeruginosa # 1 445 1 450 457 222 32.0 1e-57 MKIIIAGAGNVGTHLAKLLSREKQDIILMDDDEEKLSALSANFDLLTVAASPSSISGLKE VGVKEADLFIAVTPDESRNMTACMLATNLGAKKTVARIDNYEYLLPKNKEFFQKLGVDSL IYPEMLAAKEIVSSMRMSWVRQWWEFCGGSLVLIGTKMREKAEILNIPLHQLGGPNIPYH VVAIKRGTETIIPRGDDVIKLHDIVYFTTTRKYIPYIRKIAGKEDYADVRNVMIMGGSRI 
AVRTAQYVPDYMQVKIVDNDLNRCNRLTELLDDKTMIINGDGRDMDLLIEEGLKNTEAFV ALTGNSETNILACLAAKRMGVEKTVAEVENIDYIGMAESLDIGTVINKKMIAASHIYQMM LDADVSNVKCLTFANADVAEFTVPEGAKITKHLIKDLGLPKGTTIGGMIRNGEGILVTGD TQIRPGDHVVVFCLSMMIKKIEKFFN >gi|222159249|gb|ACAB01000110.1| GENE 27 43078 - 44529 827 483 aa, chain + ## HITS:1 COG:MA1483 KEGG:ns NR:ns ## COG: MA1483 COG0168 # Protein_GI_number: 20090342 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Methanosarcina acetivorans str.C2A # 2 482 1 476 476 303 38.0 5e-82 MINSKMIFRILGFLLLIETAMLLCCGAVSLFYKENDLQSFLISSAVTACIGFLLLAIGKD AVKSLNRRDGYVIVSAAWVTFSLFGMLPYYIGGYIPNVADAFFETMSGFSSTGATILNNI ESMPHGILFWRAMTQWIGGLGIVFFTIAVLPIFGVGGIQVFAAEASGPTHDKVHPRIGIT AKWIWGIYAGMTGTLIILLIGGGMSIFDSVCHAFATTSTGGFSTKQASIEYYHSPYIDYV ISIFMFLSGINFTLLLLMFNGKIKKFLHDAELKFYFICVSFFTLFIAIWLHQTSSMGVEE AFRKSLFQVVSLQTSTGFVTADYMLWPSILWGCLIIVMIIGACAGSTTGGIKCIRMVILF QVVKNEFKHILHPNAILPVRVNKQVISPSIQSTVLAFTFLYVVIVIISVLVMMGLGVGFL ESIGTVISSIGNMGPGLGRCGPAFSWSALPDAAKWLLSFLMLLGRLELFTVLLLFTSDFW KKN >gi|222159249|gb|ACAB01000110.1| GENE 28 44743 - 46125 1195 460 aa, chain - ## HITS:1 COG:no KEGG:BT_4096 NR:ns ## KEGG: BT_4096 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 460 1 460 460 808 83.0 0 MKKLFVLIAAACMTCTAAFAQTVKPFKEGERAVFLGNSITDGGHYHSYIWLYYMTRFPDM PIRVFNGGIGGDTAYDMNKRLDGDIFVMKPSVLMVTFGMNDSGYFEYNGDKPKEFGEQKY QESIKNYQQMEKRFKDLPDTRIVMVGTSPYDETVQLKENTPFKTKNETIKRLVEYQKESA AKNNWEFTDLNAPMTAINQQYQQKDPTFTLCGSDRIHPDNDGHMVMAYLFLKAQGFVGKE VADMEINANKKQVVKSENCTVSNIKKNGKDLSFDYLAEALPYPLDTIARGWGQKKSQAEV LKVVPFMEEMNRETLKVTGLKGNYKLLIDDEEIGIWSGDELAKGINLAAESKTPQYQQAL TVMHLNEYRWEIERTFREYAWCEFGFFQQKGLLYADDRKAIEVMDENLDKNVWLKGRRDM YSKMMFEAVRDARQQEMDVLINKIYEINKPVVRKILLRKV >gi|222159249|gb|ACAB01000110.1| GENE 29 46196 - 47734 1092 512 aa, chain - ## HITS:1 COG:CC0813 KEGG:ns NR:ns ## COG: CC0813 COG3507 # Protein_GI_number: 16125066 # Func_class: G Carbohydrate transport and 
metabolism # Function: Beta-xylosidase # Organism: Caulobacter vibrioides # 25 365 61 396 540 116 28.0 1e-25 MKKITIFLFSVCLALQLVAQEQHYFMNPVIRGDMPDPSIIRIDDTYYATGTSSEWAPFYP VFTSKDMVNWKQAGHIFSKQPSWTSNSFWAPELFYHNSKVYCYYTARQKSTGISYIGVAT SDSPLHEFTDHGPVITYGTEAIDAFIYDDNGQLYISWKAYGLDKRPIELLGCKLSADGLH LKGEPFSLLVDDEEIGMEGQYHFKQGDYYYIIYSAHGCCGPSSNYDVYVARSKNFCGPYE KYTGNPVLHGGEGDYLSCGHGTVVKTPDGKMFYMCHAYLKGEGFYAGRQPVLQEMEMTAD HWVRFTTGKLAIAKQPMPFKTTIQQPLTDFEDHFTENKLKVDWTWNYPYSDINIVLKKGK LSLSGTPKKDNQRGTALCLRPQSSHYSCETKVIHANESLQGLTLYGDDKNLITWGVKGKQ LQLKFIKENSESILYESGAVDKDIYLKIEVEQGCIFNFYKSKDGKIWNPILDTPLKGESL IRWDRVQRPGLLHCGAKEVPAEFAYFRMINLK >gi|222159249|gb|ACAB01000110.1| GENE 30 47746 - 48879 1016 377 aa, chain - ## HITS:1 COG:PH1107 KEGG:ns NR:ns ## COG: PH1107 COG2152 # Protein_GI_number: 14590938 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Pyrococcus horikoshii # 66 373 16 287 299 154 34.0 2e-37 MKRKLQNIAYLLMAAALVTSCGGKKQTSEFPDWAWADFQRPEGINPIISPDTTTFFYCPM RQDSVAWEASDTFNPAATVYNGKVVVLYRAEDNSATGIGSRTSRLGYASSDDGLHFKRMS VPVFYPADDSQKELENPGGCEDPRVAVTEDGLYVMHYTQWNRKQARLAVATSRDLQTWEK HGPAFAKAYNGRFIDEFSKSASIVTKLIDGKQVIAKIDGKYWMYWGEKFVNVSTSTDLVN WEPMLDENGEFLKVMTPRAGKFDSDLTECGPPAILTDKGILLFYNGKNKSGAEGDTLYTA NSYCTGQALFDAKNPTKLIERLDKPFYIPESDFEKSGQYPAGTVFIEGLVFYNQKWFLYY GCADSRVAVAVYDPLKK >gi|222159249|gb|ACAB01000110.1| GENE 31 49056 - 51212 1848 718 aa, chain - ## HITS:1 COG:Rv0584 KEGG:ns NR:ns ## COG: Rv0584 COG3537 # Protein_GI_number: 15607724 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Mycobacterium tuberculosis H37Rv # 32 717 44 768 877 393 35.0 1e-109 MKNKILTGLLLVLIGGWGSLSAQNVGSNYARQVNTLIGTKGVGLTSGYLYPGATYPYGMV QFTPSYFSKRSGFVINQLSGGGCEHMGNFPTFPVKGKLKMSPDNILNYRINISEEKGHAG YYEAKVQEDIHAKLTVTERTGMASYEYSADQQYGTVIIGGGISATPIDQAAIVITAPNKC EGYAEGGNFCGLRTPYKVYFVAEFDTDALESGTWKRNELKPNTTFAEGEYSGVYFTFDVN KKKNIQYKIGVSYVSVENARENLKAENTGWDFLQIQNQAESKWNHYLGKIEVEGSNPDRT 
TQFYTHLYRSFIHPNVCSDVNGEYMGADFKVHKSRSKHYTSFSNWDTYRTQIQLLSILDP EVASDIVISHQLFAEEAGGAFPRWVMANIETGVMQGDPTPILISNAYAFGARNYDPKPIF KTMRKGAEEPGAMSQEVEARPGLKQYLDKGYYNASIQLEYTSADFAIAQFALHAVGDEFA SWRYFHFARSWKNLYNPETGWLQSRNPDGSWKPLTEDFRESTYKNYFWMVPYDIAGLIEI IGGKAAAEKRLDEFFTRLDAGYNDAWFASGNEPSFHIPWIYNWVGTPYKAQEIINRVLNE QYSSKIDGLPGNDDLGTMGAWYVFACIGLYPEIPGVGGFTVNTPIFSSVKIHLKKGDMVI KGGSEKNIYIKSMKLNGKPYDSTWIDWDQLNNGATIEYTTSSKPDVKWGTKVTPPSFP >gi|222159249|gb|ACAB01000110.1| GENE 32 51243 - 52397 899 384 aa, chain - ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 10 383 345 716 717 262 37.0 1e-69 MPCLPHKNNADGYLPGQLPEKQGGMLGNHSISLLADAWAKGIRTFDPEKALKAYAHEAMN KGPWGGANGRGFWKEYFELGYVPYPESMGSSAQTMEYAYDDFCGYQLAKMTGNKHYQEVF ARQMYNYKNVFDPSIGFMRGKGVDGKWQEPFDPLEWGGPFCEGNAWHYTWSVFHDVEGLI DLFGSDQRFTTKMDSVFTLPCTIKPGTYGGVIHEMKEMELAGMGQYAHGNQPIQHMPYLY SYAGQPWKTQYWVRQIVERLYNSTERGYPGDEDQGGMSSWYILSSLGIYAVCPGTDEYVI GSPLFKKATITMENGNKFVIEADNNSKENLYIQSATLNGRVLDKNFIRYDDIADGGIIRF EMGSQPNKERCTSKYAAPFSLSKE >gi|222159249|gb|ACAB01000110.1| GENE 33 52315 - 53517 981 400 aa, chain - ## HITS:1 COG:SP2145 KEGG:ns NR:ns ## COG: SP2145 COG3537 # Protein_GI_number: 15901958 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Streptococcus pneumoniae TIGR4 # 21 394 1 350 694 143 26.0 5e-34 MNKLIYLVIFSFLSTSIYAQMKDLVQYVNTLQGTDSHFGLSYGNTYPTTGMPYAMHTWSA QTGKNGEGWKYQYAVDNIRGFCQSHQCSPWMSDYAVYSFMPMVGKLVVNQEMRATKFSHD NEIAKPHYYKVMLDNGITTEMAPTTRGVHLRFSYPSTEDAYLVIDGYTDMSEIKIDPVKR QISGWVNNQRFVNNSKTFRSYFVVQFDKAFEDYGMWENQKDEIFPKKLEGEGKGYGAYIK FKKGSKVQAKAASSYISAEQAVITLNDELGKDKNLEATKIRGHKTWNELLNRIQVEGGTD EQMKTFYSCLFRANLFSRKFYERKANGEPYYYSPYDGKVYDGYMYTDNGFWDTFRSQFPL TNILHPTMQGRYMNALLAAQEQCGWLPSWSAPGETRRNVG >gi|222159249|gb|ACAB01000110.1| GENE 34 53793 - 55097 820 434 aa, chain - ## HITS:1 COG:no KEGG:BT_4084 NR:ns 
## KEGG: BT_4084 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 34 430 45 442 443 444 54.0 1e-123 MKPIHLILSIALLAGGACGDQQEPYYGNSEVKMPQIDSEWNLELMPNRSGQYWKVFVYKD LKYNALFTRSLGWNGGDGVFTTGLPDGNIFWSFNDSFYGVINENRSRGNCSFPRNSIMVQ TPGEKDENLVWLADYVQTNDPNADRYYQVRTHIRHPKATLSDEKIQAGEIDQDYLYWAGD ATIYNNQMQMLWGAVDNTDPNNLMRRFGTCLATYSLEGKPGDATYMKLISRNDNFNDHTL GYGDTMWEDEDGHIYLYTTSNYKVAVARTATRDLGSQWEYYVADPQGHFSWTTQYPSTQD AENSTIIPLESACSMPWVFKKGDTYYMIGQSMWFGRDVLMFRSKHPYGPFVDQKTLFTLP EFLDKIGEQRYQHVYMVNIHPALSRTGELVISTNTDCSNFWDNFNAPGSADFYRPYFYRV FNWESLYDNDAPLE >gi|222159249|gb|ACAB01000110.1| GENE 35 55122 - 55190 67 22 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRRIFLSTFMACIIGFTGAETN >gi|222159249|gb|ACAB01000110.1| GENE 36 55455 - 57299 1153 614 aa, chain + ## HITS:1 COG:no KEGG:BT_4085 NR:ns ## KEGG: BT_4085 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 613 17 617 618 574 50.0 1e-162 MKLYMKSLTLLLFLGTVASCNDSFLDQTQTSDLNRETVFADSTYTANFLTGIYTDIGFDI NYNRWTYLLANGGGLQSACDEAEFYPSSTIWTNMMFATGTVNPVTVSDDAWSKCYTNIRR CNVFLANIDRSPMVQSRKAQYISEALFLRAWYYFILMKHYGGVPIIGNNVYDATSTMKTS RNTWAECVEYVSHQCDSIVQLNVLPVRRTGQENGRASSAAALGLKARVCLYAASPLFNGS DFAPTDTKELVGYPSYDKERWKTAMDAARAVINLGTYNLFIRSKDLNGKAEPGFGFYVVF QAGDTRNANLTDDDGKNNNGAFAGNILDRKYQRESGHEAAYYPPSGSHSGTRHGGYIYKE LADAFPMIDGKPINNSQYTFDPLNPAQNRDPRFNNTVIYDGALVPADDTYSATTVVNTWI GVGQTSDAVYQGTATGFYVRKGCDRLCKGSTWNPSRHNDPLIRYADILLMYAEAANEYHG PDFEETLGGQQLGCYPILKLIRKRAGIEPGADGMYGLKKNMTQKQMRQAIQEERRLEFAF EGHRFFDVRRWMIAPQTESKTMHGIEITRATDGTKTWKEFDVRTHVWRPAMYFFPIPYNE TVKSKDLLQNPYYN >gi|222159249|gb|ACAB01000110.1| GENE 37 57315 - 57770 356 151 aa, chain + ## HITS:1 COG:no KEGG:BT_4086 NR:ns ## KEGG: BT_4086 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 150 1 150 150 154 48.0 1e-36 MKHSIHLAAIALLTTLLVTSCDTFSAPADPDKNTEQVKDISGTWQLTTVSRNSNDITETM DFSQFQIDLKKDGKYTIENYLPFVVRHNGTWKVDDIYYPFRLYFTEDGATEQASTDILFP 
ITNGERNIVITHSPGCGSNTYTYVFKKISEN >gi|222159249|gb|ACAB01000110.1| GENE 38 57786 - 58853 747 355 aa, chain + ## HITS:1 COG:no KEGG:BT_4087 NR:ns ## KEGG: BT_4087 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 355 6 350 350 300 46.0 5e-80 MKKKLISLLQRKRHIVALSTILMTFIVMSCLFIDSVDITQMIDGKAVNYAKAGTTATFKM HGHIKVQGDPRNDKRLVFGFLAPKSWNLAQNARVSYTEDTFDPNIGEQNMTLIPLTEQPS NKPGLSWSAALMQEYGVGTNILEDMEWAAYWTRPYNGVADEIHFTIYVRVPVGNKNLRFK PSFFINSTDDNFSTSADAKKCEEAGCFEVVEGEGLVTDFCSEHFNKTTPLTALQNDFVTF SFIGGMDDENALVKADKIYFEGTAVASDGHRYTVNEKSDKTLMKRENQYTKTYNITFWPE GFFNVPEGTELVSIEYAFTNADGSISVTQSDDDFVMLNIPLPPQKEPFIYTFYCE >gi|222159249|gb|ACAB01000110.1| GENE 39 58875 - 61718 2336 947 aa, chain + ## HITS:1 COG:no KEGG:BT_4088 NR:ns ## KEGG: BT_4088 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 25 947 125 1044 1044 1039 57.0 0 MKQNMYKMITTGLLCIAAVALKAQDAPIDSVDHTKSNTLGASSTVYTNDLIKYQSATILT GLQGRLKGLNVSPFRGMQLLRTDANTKSDIVGAIPNVGGGIYGDNSEFLISARGQSPVAI VDGVERDLYSIDPEAIESVTIQKDALSNMFLGMRSSRGALIITTKNPDAKGGFHLSLTGK FGISSALKSGPNPLSAYQYAYLLNEALLNDGKSPLYTYDDFEAYRNGTSPYLHPDVNWKD AIMNNSTTSQAYNLNVTGGGRVAQYFVSLGYYSENGLFKTSDANSYNTNFKYNRYLITSK VNINVTDEFKVSMSLMGRIEEGNQPGGISGTGYSDLLSNVWQTPNNAYPVLNPNGTYGGN ASYTQNLYAQTTGSGYISSNTRDVVGTINLKYDFDKLVRGLSVGATGNISSQVRNAIVRT KQAQVFQYSITQQGNEAYDKYGDVSSQTNSYRSVSTYQYMYGKMYVDWERQFGMHGVKAS LWGDTRTILNNYDLPMIPSNIGQKVEYNYDNKYFAQAAVTESYYNRYDNGRRWGTFWAVG LGWDISKEKFMEASKIDQLKLRATYGHTGNGIDNAGYFSYLKRYNEDGGFWYSNGTSMSN GGSVSEISPLANTLLTWEKGRKVNVGLDLTLLKNRLTLSADYYNDYYYDILQSRGKSIEL LGIAYPAENIGKTRYYGLETQLSWQDHIGKVNYYVSANWSMEQNKRLFMDEQYVPYDYLK MTGQPTGTIYGLVATGFLTAKDIADGYPVMNGFNNIQAGDVKYKDMNGDGEINEFDRTVI GGDKPTCYFGIDLGFEWKGLEVTALIQGAYNRDLYNSDRTLLEGFQVIGQSYGQAYTNLL NRWTPETAETATYPRLTAGGNMYNYGNNWNSSLFVQNGNYIRLKNATVSYKLPENFCRNY LGGLRVKIFVQGQNLLTWSRTRLQDPEVTFTSYPLQRTITTGINLNF >gi|222159249|gb|ACAB01000110.1| GENE 40 61724 - 63508 1652 594 aa, chain + ## HITS:1 COG:no KEGG:BT_4089 NR:ns 
## KEGG: BT_4089 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 594 1 598 598 676 58.0 0 MMDMKQNKYGTWAVGAFMLLLSAAACTDSYESTPVDMFTEDYLFSKTDSNGTVVRKYLNR IYYYMRNGHNGVNGDYLDAASDDALSVISSESDVYKLAIGRYSASNFINSDMIWSDPYLV IRRVNILLSGIDVVPFNTTYTDALGNTRRLNDSMKAEARFLRAYFYFELVKRYGGVPIIG DKVYELNENIELPRSTFEQCIKYIVSELDDIKDDLRSLPLPDAAASAHVVNTQAAQALKI RVLLYAASPLFNEKPIESGNELIGYASYDRERWKIAADAARNFINTYGTGDGAIMGLDDN FKNVFTNWYSNEHKEVIFFRENAMDKTVETANGPLGLSGAKQGNGRTNPTQNLVDAFLMK DGHFINDKGKYEYNEQLPYAQRDPRLEYTIIHNGSSWVNSTMETWQGGANNPLGSSYSLT SYYMRKFMGDFEAANEYQDTQHNWVMFRYAEILLNFAEAENEYLSTPSQAVYDAIIALRR RAGIEPGEKNLYGLTTNEQTHNLTQAEMRKVIQNERRIELAFEEHRYWDIRRWRLAEQVY AQPIQGMYITKSQTSTTYVPQAVLNVQWDNKRYFYPIPYSEVIKNKNMVQNPNW >gi|222159249|gb|ACAB01000110.1| GENE 41 63515 - 66610 2694 1031 aa, chain + ## HITS:1 COG:no KEGG:BT_4090 NR:ns ## KEGG: BT_4090 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1031 1 1026 1026 1312 65.0 0 MKRVIITFLCLLCIVQAYSQSFTISGTIIDGESNEPLIGAGILVKGAGRGTITNIDGKYS LEVKRGETLVFSFVGYQNQEVAITNQRTLNIALRVSEANNLNEVVITGRGSAKRITLTGA VSGIQANELRRVPTSSLQNTLSGKLPGFFSQQRSGQPGKDASDFFIRGVSSLNEDGNKPL IMVDDVEYSYEQLSQINVNEIESISILKDASTTAIYGIKGANGVLVVKTRRGEEGKPVIH VRAEAGGQIPVRKPNFLDSYNTALLVNEARTNDGLTKMFTQHDLELFKDGSDPYGHPNVN WYDEVFKKSAMQSNINVDVSGGTKRLKYFVSGGYFSQGGLVRNFAKADDDVNTGYFYRRF DYRTNLDFTVTDNLTMRLDFSSRFMNINEPSSLNATGEIYNFTAMHPYSAPVLNPDGSYA YLSDVDGYGPTLNARLANEGYTRTRRNDNNILYGVNWKMDWLTKGLSANARIAYSTIDEV FRKVNRGKDAYPTYHYDPTTDQYNINPNRKYAYSQYALTAGTNQAVKNLDVQASINYARV FNGIHDVSATLLYSRQSRTVEKNPDDTGEKIPENFQGLTATVGYKYKDKYLVDFNVAYNG TDRFAEGHRYGIFPAIGVGWRISEESFFKDNISFIQLLKIRASYGIVGSDVAMGNRYLYN QIYTATDNAYNFGQSDVTSTAITEGDLGNSNVTWEEAKKFDIGLDFNAFDRFSFTFDWFY DKRYNQLVKRNDIPQILGVGTSPINVARTSNMGFDGQIGYQDRFGEFNFNTNFVFSYAKN KVEFNAEAQQRYDWLSATGRPIGQPFGYTWIGYYTPEEVDLIHAGAANAPAVPNTDVPIQ AGDLKYKDLNGDGVINDFDKGAIGKPNLPNTTLGWTIGGSWKGLSVSVLFQGSFNYSFSV NGTGIEPFKSQFQPLHQKRWTLERYLNGEAIEFPRLTSNPSTVNSAAAYMSDFWLIDAWY 
IRLKTIDVSYQVPTKVLPSWLTNLRVYLNAYNMFTWTSFDKYQQDPEIKSNSAGDAYMNQ RVFNLGVQLTF >gi|222159249|gb|ACAB01000110.1| GENE 42 66867 - 67664 695 265 aa, chain - ## HITS:1 COG:no KEGG:Fisuc_0264 NR:ns ## KEGG: Fisuc_0264 # Name: not_defined # Def: KilA, N-terminal/APSES-type HTH DNA-binding domain protein # Organism: F.succinogenes # Pathway: not_defined # 3 264 2 263 265 401 74.0 1e-110 MTKITVLTTDINVVSINNEDYICITDMLKAKDGDFFISDWLRNRNTLEYLGIWERLYNPG FNYGEFATIRNQSGLNSFKISVKEYVEKTHAIGIQAKAGRYGGTYAHKDIAFEFGMWISA EFKIYLVREFQRLKEQELAQLGWSAKRELSKINYRIHTDAIKQNLIPAEVTPQQASRIYA SEADVLNVAMFGLTALEWRDAHPDLKGNVRDYATVNELICLSNMENLNAVFIGEGLPQHE RLVRLNRIAIQQMQILEDVNKKYLK >gi|222159249|gb|ACAB01000110.1| GENE 43 68049 - 70193 1555 714 aa, chain - ## HITS:1 COG:lin0222 KEGG:ns NR:ns ## COG: lin0222 COG1501 # Protein_GI_number: 16799299 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Listeria innocua # 79 694 130 751 763 398 35.0 1e-110 MRKRISLLQMAVFFFTAQVAWAGNIKTEKVTLVGNGIVEFIPVGYDVSRTPSLILSHEPE KKGGVPENWKLVPQFTMTDGKANASLDVPAGTSLYGGGEVTGPLLRNGQKIKMWNTDNGM YRVDGGSRLYQTHPWVLGVRPDGTAFGVLFDSFWKAELITNSDKIEFNTEGAPFRTYIID RESPQAVLKGLAELTGTISMPPRWAIGYHQSRFSYVPEARVKEVANTFREKKIPCDVIWF DINYMDEFRVFTINNRDFPDPKRMNKYLHDNGFHSVYMIDPGVKVDDNYFVYKTGKEQNA FVCDIYRNEFHGKVWPGACAFPDFTRPETRTWWSGLYKDFMANGIDGIWNDMNEPSVFDG PGGTMPENNIHLGGGNLPIGSHLMYHNAYGRLMVEASYNGMMAANPSKRPFLLSRSNIIG GQRYAAMWTGDNEATYEQMKLSVPMSITLGLSGQPFNGPDIGGFAGNTTPDLWGNWLGFG AFFPFSRGHASCDTNNKEPWAFTKDIEKESRMALERRYRLLPYLYTAFHVAHKDGQPVMA PVFFADPKDESLRAEEQAFMLGTDLLIIPAFAKNPSLPKGIWENLSLVKGDTKGKYQAKL KVRGGSIIPVGKIIQNVNENSFDPLTLVVCPDEDGKAEGSLYWDKGDGWGFQKGDYKQLT FKAELVDGHHLIVKVTEDKGEDQIDFGMIKVEVLHAGHTYKSSGDIAKGITVKL >gi|222159249|gb|ACAB01000110.1| GENE 44 71226 - 73898 1771 890 aa, chain - ## HITS:1 COG:no KEGG:BT_4076 NR:ns ## KEGG: BT_4076 # Name: not_defined # Def: alpha-rhamnosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 890 5 892 892 1409 73.0 0 
MKRLLCLGLICYFCCLSMIVYGNEKTSPFYLAELKCENLIDPLGIDNVTPHFSWKLKGDG WKGGQTYYEIQVASDSILLVQDKADLWNTGKLKSKTSVMVPYRGKTLTSRSLCYWRVRVW DAKKQASSWSPVARFGVGILDQSQMKGEYIGASVEGGKICAPILRKKVKLTQGETSFLHV NTLGYHEIYINGRKVGEDVLTPAVSHLSKRSLIVTYDITPYLREGENDLLIWLGQGWYKT TTFGAAYEGPLVKAELDVLRNGKWEVVTKTDGLWYGCESGYSDTGTWRALQFGGERVDGR ILPRDLSTQALDKMKWTPVVKVNVPDHITSPQMCEINKIHQILPAVSVKKLGESLWLVDM GKVQTGWFEMQMPILPAGHEVIMEYSDNLTKDGEFDKQGESDIYISGGKQGEYFRNKFNH HAFRYVRISNLPQKPETGAMKSLQIYGDYKQTATFECSDADLNAIHQMIQYTMKCLTFSG YMVDCPHLERAGYGGDGNSSTMSLQTMYDVAPTFENWVQTWGDSMREGGSLPHVGPNPGA GGGGPYWCGFFVQAPWRTYVNYNDPRLIEKYYSQMKEWFKYVDKYTVDGLLKRWPDTKYR DWYLGDWLAPMGVDAGNQASVDLVSNCFISECLSTMYKTAITLGKREEAEEFAIRREKLN KLIHQTFYREDEGIYSTGSQLDMCYPMLVGVVPDSLYNKVKENVVAMTEEKYKGHIAVGL VGVPILTEWAVRYKQVDFFYQMMKKRDYPGYLYMIDHGATATWEYWSGERSRVHNCYNGI GTWFYQAVGGIRLDEANPGYRHFYVDPQIPNGVTWAKVTKESPYGTIAVNWKLKDDNQLN LQLTVPAGTTATVCIPNNAVSCKKNKKKVSVKEQTVDVEAGHYDFLFNLK >gi|222159249|gb|ACAB01000110.1| GENE 45 73917 - 75323 950 468 aa, chain - ## HITS:1 COG:no KEGG:BT_4075 NR:ns ## KEGG: BT_4075 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 467 1 466 467 750 76.0 0 MANRLKMRILLLLSLVYINCDYLQAQTTIPRFEEGERVVFVGNSITHGGHYHSFIWLYYM TRFPDKPITIMNAGIGGESAWDMKDRLDYDVFNRKPTYVTLTFGMNDTGYDIYMKDDAKE LSEQRIAKSLESYREIEERLLAKNKIKKVLIGGSPYDETSQFNNFILHNKNNAILKIIDA QRTSAKKNGWGFVDFNQPMREISRKEQEADSTFTFCRIDRIHPDNDGQMVMAYLFLKAQG LAGDEVSSVSIDAYHSSVITHKNCKISKLKKNGADLTFDYLAYALPYPLDSISRSGWGNK RSQRDAMQLVPFMEEFNQERFQVTNLEKGMYRLTIDNQFIDNLSSEKLANGVNLADYPNT PQYQQAAKIMYLNEERFEVEKRFREYLWTEYSFLKKEGLLFADDQKAIDKLKEYLPKDGF LRMSYDWYIKAMNPEIREVWSNYIKSLVETIYKINKPVTHKVRLARVE >gi|222159249|gb|ACAB01000110.1| GENE 46 75369 - 78458 2263 1029 aa, chain - ## HITS:1 COG:TM1624 KEGG:ns NR:ns ## COG: TM1624 COG3250 # Protein_GI_number: 15644372 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 67 729 22 667 785 147 25.0 2e-34 
MMNKLGLIIISVICALIQTGCAEQSTWMSLNSSNPRIIWEVKPQADLENITGEQISTPEF KMSDYVKGVVPGTVFTAYVEAGIVPDPNYADNIYKVDETFYNRPFWYRTEFELPSSYSEG KRVWLHFDNTNRFADFYFNGEKISGTKASTKDVSGHMLRSKFDVTNLIKKSGKNVIAVLI TDADQKKTRKAKDPYGVACSPSYLAGAGWDWMPYVPGRLAGITGNAYLVVTGDVVMEDPW VRSELPTLQKAELSVSTDIRNASSSPKEVVVSGVIQPGNISFSKDIRVEGGTTARLSINK DDIAQLVVDHPRLWWPNGYGDPNLYTCKLTCSVDGKVSDVKEMTFGIKKYEYKMVDNVVG YPVLTFFINGQKIYLKGGNWGMSEYLLRCQGKEYETKIRLHKEMNYNMIRLWTGCVTDDE FYDYCDKYGIMVWNDFWLYVAYNDVAQPEAFKANALDKVRRLRNHPSIAIWCGANETYPA PDLDNYLREMIAKEDNNDRMYKSCSNQDGLSGSGWWGNQPPRHHFETSGSNLAFNTPAYP YGIDHGYGMRTEIGTATFPTFESIKEFIPQKDWWPLPTDEQLKNDDDNVWNKHFFGKEAS NANPVNYKNSVNTQYGESSGLEEFCEKAQMLNLEVMKGMYEAWNDKMWNDAAGLLIWMSH PAYPSFVWQTYDYYYDPTGAYWGAKKACEPLHIQWNASNNNIKVINTTAKDLKGAIAKAA IYNLNGKEVPAYGQAKQVDVAASNIAEAFSLNFNPFNLAYGKKAVASSSTGASKSASMVT DGGAGSRWESAYSDPQWIYIDLGKEEKIEKAILKWEAACAKKYELQVSNDAQEWKTVYAN KDGRGGTEQIELEPVTARYVKLAGISRATQFGYSLFEFEIYGEKPKEIEELTPLHFIKLE LTDVKGNLISENFYWRNGVNDLDYTLLNTLPEADLSCRLVDKSMSDGKMKIAVKNNSGTV AFANRVRLVNKATQKRILPIIMSDNYATLMPGEEKVITMEATPELLKGGVSVLVKQYGKA EKNKLDIAD >gi|222159249|gb|ACAB01000110.1| GENE 47 78507 - 80846 1847 779 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 33 778 47 770 790 360 31.0 4e-99 MYKATANLLIVCFLCWMTGCSSTNTVSEDWVPTDYVNPFIGASTSVGAAGVYHGLGKTFP GATTPYGMVQVSPNTITGGDNGSGYSDEHKTIEGFAFTQMSGVGWFGDLGNFLVMPTTGE LYKVAGKENNDSIKGYRSAYNKATETAKAGYYSVELTDYHIKVESSATPHCGILHFTYPS SDQSRIQIDLARRVGGTSTSQYIKVVDDYTIQGWMKCTPDGGGWGNGEGKADYTVYYYAQ FSKPLTNYGFWSADIPEHWVRKRDEVVSIPYLTRVSQAPVIKDKKELTGKHLGFFTEFPT KEGETVEMKVGISFVDMEGAANNFEQEIASKNFGQVKQEATELWNKELNRVRISGGTDDE KTIFYTAMYHTMIDPRIYTDVDGRYVGGDYKIHTADSTFTKRTIFSGWDVFRSQFPLQTI INPRLVSDELNSLITMADQSGREYYERWELLNSYSGCMLGNPALSVLADAYIKGIRTYDA EKAYRYAVNTSRRFGNDSLGYAPEPLSISTTLEYAYTDWCISQLAKAMGKEDDAKCFYEK GQAYHHIFDKEKGWFRPRKADGTWVEWPENARLKEWYGCIEANSYQQGWFVPHDVTGMVN 
LMGGKEKVIADLTDFFNKTPSNMLWNEYYNHANEPVHFVPFLFNQLEVPWYTQKWTRYTC EKAYANKVEGIVGNEDVGQMSAWYVLAASGIHPSCPGNTRMEITSPVFDKVEFNLDPSYY TGKKFTVIAHNNSINNVYIQKALLNGQEYNKCYLDFADIAAGGTLELFMGDKPNVEWGL >gi|222159249|gb|ACAB01000110.1| GENE 48 80865 - 83429 1432 854 aa, chain - ## HITS:1 COG:all0848 KEGG:ns NR:ns ## COG: all0848 COG0383 # Protein_GI_number: 17228343 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Nostoc sp. PCC 7120 # 63 818 278 1015 1047 87 19.0 7e-17 MSKIYIILLTVFFYANVYSQQAYFVDGYHGGIYGHYPVKWKTQFIVDQLAMHPDWRICME IEPETWDTVRVQTPEAYLRFKEMATSNQVEFTNPTYAQPYCYNISGESIIRQFQYGIAKI NKHFPEVDFVTYSVEEPCFTSCLPQILKQFGFKYAVLKCPNTCWGGYTAAYGGELVNWVG PDGTAILTVPRYACEKLEPGSTWQTTAWGNSDAYLKDCRNAGIKHPVGMCFQDAGWKNGP WLGSGKNTKNNSIYMTWRDYIENVSIGKTDDNWSFSQEDIPVNLMWGSQVLQKIAQEVRV SENRIVMAEKMSVMAYLENKYICRQADMDEAWRTLMLAQHHDSWIVPYNGLNRKGTWADQ IKRWTDSTNHLADGIIEASMRSFNEKFIPQNNAQQQYIRVFNTLGMRRKEIVSVLLPTEF ENADLSVYDWKGKDIGCLVENGGKEIRLFFEAEIPPFGYSTYCIKKKEAGKKEASESRFV LEGNKVNKQEYVVENDMYKIVFDLSKGGTIKSLIAKKEGNKDFAGKTEKYALGELRGFFY EEGKFRSSIETPAKLTVVRDNVYEQKIKIEGEIASHPFTQVITLTKGTRRIDFDLTVDWK KNVGIGEYKEERWRDNRRAYCDDRFKLSVLFPTDLHAPRVYKNAPFDVCESKLTDTFFGS WDQIKHNIILHWVDLAEQEGDYALALLSDHTTSYSYGEDYPLGLTAQYSGGGLWGPDYKI THPLRMKYAIIPHRGKWDKASIADDSDCWNEPLLHSCYPVAKPESKSFIDLQNTGYQVSA LQMKDGKVLLRLFNSEGDEKLQKVTIDMPLSGVEEVDLNGQCIERKKIKTRAGKSEMMIS MPRFGIKTFVLSLT >gi|222159249|gb|ACAB01000110.1| GENE 49 83442 - 83672 208 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237714278|ref|ZP_04544759.1| ## NR: gi|237714278|ref|ZP_04544759.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 76 1 76 76 106 100.0 5e-22 MENCLNKYFADEFTSDEKTEFLIEVENNERLKEEFIENQTLLALVDWISPEYENNKEVVQ HKLYEFMRRMEQHKDK >gi|222159249|gb|ACAB01000110.1| GENE 50 83750 - 84274 405 174 aa, chain - ## HITS:1 COG:PA1300 KEGG:ns NR:ns ## COG: PA1300 COG1595 # Protein_GI_number: 15596497 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 2 168 8 169 175 66 28.0 2e-11 MSFSELYLIYYPKLVRFAKEFVMSEEDAENITQDVFTDLWEKRESMNHIENMNAYLFRLV RNKCLDYLKHKVFEQKYVENVKASFEIELNLKLQSLDRFDVYDISERNQMEKLIRDAINS LPKRCRDIFLLSRMEGLKYREISERLGISVNTVECQMGIALKKLRAKLNVTLAA >gi|222159249|gb|ACAB01000110.1| GENE 51 84467 - 86071 1157 534 aa, chain - ## HITS:1 COG:no KEGG:BT_4069 NR:ns ## KEGG: BT_4069 # Name: not_defined # Def: putative regulatory protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 534 1 534 534 798 83.0 0 MKRTLWFIFFLLNSLFYLYADSENDSLLKVLDKVISERLIYTQKKEATIKELKKKKVGLN SLEDIYNLNKEIIHQYETFVCDSAEQYIHENIDIAKTIGNKEYLLEEQLRLAFVYSLSGL FIQANDIFKSIKCADLPDHLKALYCWNRIRYYENLIKYTDDVRFSNEYISEKEAYRDTVM SILFDQSDEYRKEKAVKLQDKGSTKEALLILTEIYDKQEPASHGYAMMAMGLARAYRLIG NYALEEKFLMLAAMTDTKLAVKENEALLTLAVNLYHKGDIDRAYNYIKVALDDAIFYNSR FKNTVIARIHPIIENTYLIRLEKQKQNLRFYIFLTSLFVVALAITLYFTYKQTKIVSRAK RHLKAMNEELVGLNKNLDEANLIKEKYVGYFMNQCAVYINKLDEYRKNVNRKIKTGQIDD LYKSSSRPFEKELEELYHNFDKAFLNLYPNFVEKFNSLLKPEERYKLEKDQLNTELRIFA LIRLGITDVGQIAVFLHYSVQTIYNYKSKVKRMSTLDSNIFEEEVKKLGSLSQK >gi|222159249|gb|ACAB01000110.1| GENE 52 86332 - 86709 125 125 aa, chain + ## HITS:1 COG:no KEGG:BT_4068 NR:ns ## KEGG: BT_4068 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 125 1 125 125 159 68.0 3e-38 MKQSLKFGLLLVLALLIHCATNGAMEDMCKVSSPTYRQEKCYVSQDRPIQDALERLYNLY TTQTCDMSHTDVAHVPTDKSMFLLIAYFCDHYKHQSPPYPVSSHSPTYYYDPISYYIYGL RKIVI >gi|222159249|gb|ACAB01000110.1| GENE 53 86828 - 87178 218 116 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|154175415|ref|YP_001407462.1| NADH dehydrogenase subunit A 
[Campylobacter curvus 525.92] # 5 116 14 126 129 88 40 1e-16 MNFTFLVVVLLTALAFVGVVIALSRAISPRSYNLQKFEAYECGIPTRGKSWMQFRVGYYL FAILFLMFDVETAFLFPWAVVMRDMGPQGLISVLFFFIILVLGLAYAWRKGALEWK >gi|222159249|gb|ACAB01000110.1| GENE 54 87169 - 87771 435 200 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|154175216|ref|YP_001407461.1| NADH dehydrogenase subunit B [Campylobacter curvus 525.92] # 33 190 12 169 170 172 50 9e-42 MEITKKPKIKSIPYDEFIDNESLEKLVKELNAGGANVALGVLDDFINWGRSNSLWPLTFA TSCCGIEFMALGAARYDMARFGFEVARASPRQADMIMVCGTITNKMAPVLKRLYDQMPDP KYVVAVGGCAVSGGPFKKSYHVVNGVDKILPVDVYIPGCPPRPEAFYYGMMQLQRKVKIE KFFGGVNRKEKKPDYLRNEE >gi|222159249|gb|ACAB01000110.1| GENE 55 87854 - 89446 1500 530 aa, chain + ## HITS:1 COG:SMa1529 KEGG:ns NR:ns ## COG: SMa1529 COG0649 # Protein_GI_number: 16263284 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 49 kD subunit 7 # Organism: Sinorhizobium meliloti # 163 530 11 404 404 308 39.0 2e-83 MQEIQFIAPAALHDEMLRLRNEKQMDFLESLTGMDWGVADEKDAPEKLRGLGVVYHLEST ITGERIALKTATTNRELPEIPSVSDIWKIADFYEREVFDFYGITFVGHPDMRRLYLRNDW IGYPMRKDNDPEKDNPLCMTNEETFDTTQEIELNPDGTIKNKETKLFGEEEYVVNIGPQH PATHGVMRFRVSLEGEIIRKIDANCGYIHRGIEKMNESLTYPQTLALTDRLDYLGAHQNR HALCMCIEKAMGIEVSERVQYIRTIMDELQRIDSHLLFYSCLAMDLGALTAFFYGFRDRE KILDIFEETCGGRLIMNYNTIGGVQADLHPNFVKRVKEFIPYMRGIIHEYHDIFTGNIIA QSRMKGVGVLSREDAISFGCTGGTGRASGWACDVRKRMPYGVYDKVDFKEIVYTEGDCFA RYLVRMDEIMESLNIIEQLIDNIPEGPYQEKMKPIIRVPEGSYYAAVEGSRGEFGVFLES QGDKTPYRLHYRATGLPLVAAIDTICRGTKIADLIAIGGTIDYVVPDIDR >gi|222159249|gb|ACAB01000110.1| GENE 56 89467 - 90543 1140 358 aa, chain + ## HITS:1 COG:RP796 KEGG:ns NR:ns ## COG: RP796 COG1005 # Protein_GI_number: 15604628 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 1 (chain H) # Organism: Rickettsia prowazekii # 30 349 15 327 339 249 44.0 8e-66 MFDFSIVTNWIHELLLSIMPEGLAIFIECVAVGVCLVALYAILAIILIYMERKVCGFFQC RLGPNRVGKWGSIQVVCDVLKMLTKEIFMPKGADHFLYNLAPFMVIIASFLTFACIPFNK 
GAAILDFNVGVFFLLAASSIGVVGILLAGWGSNNKFSLIGAMRSGAQIISYELSVGMSIM TMVVLTGTMQFSEIVEGQADGWFIFKGHIPAVIAFIIYLIAGNAECNRGPFDLPEAESEL TAGYHTEYSGMGFGFFYLAEYLNLFIVASVAATIFLGGWMPLHIVGLDGFNTVMDYIPGF IWFFAKAFFVVFLLMWIKWTFPRLRIDQILNLEWKYLVPISMVNLLLMACCVAFGFHF >gi|222159249|gb|ACAB01000110.1| GENE 57 90556 - 91044 520 162 aa, chain + ## HITS:1 COG:SMa1519 KEGG:ns NR:ns ## COG: SMa1519 COG1143 # Protein_GI_number: 16263279 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) # Organism: Sinorhizobium meliloti # 19 146 18 140 188 94 38.0 1e-19 MEYKDKKYTYLGGLVHGISTLATGMKTSIKVYFRKKVTEQYPENRKELKMFDRFRGTLNM PHNENNEHRCVACGLCQMACPNDTIKVTSETVETEDGKKKKILATYEYDLGACMFCQLCV NACPHDAITFDQNFEHAVFDRTKLVLKLNHDGSKVIEKKKEV >gi|222159249|gb|ACAB01000110.1| GENE 58 91047 - 91559 486 170 aa, chain + ## HITS:1 COG:jhp1190 KEGG:ns NR:ns ## COG: jhp1190 COG0839 # Protein_GI_number: 15612255 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 6 (chain J) # Organism: Helicobacter pylori J99 # 5 164 2 162 182 66 30.0 2e-11 MGSTLETVVFYFLAAFIIAMSIMTVTTQRIVRSATYLLFVLFGTAGIYFLLGYTFLGSVQ IMVYAGGIVVLYVFSILLTSGEGDRAEKAKRSKVLAGLFTMIAGLAIILFITLKHDFMQT ANLAPQEISIHAIGNALLSSDKYGYVLPFEAVSILLLACIIGGIMIARKR >gi|222159249|gb|ACAB01000110.1| GENE 59 91597 - 91905 373 102 aa, chain + ## HITS:1 COG:VNG0643G KEGG:ns NR:ns ## COG: VNG0643G COG0713 # Protein_GI_number: 15789840 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 11 or 4L (chain K) # Organism: Halobacterium sp. 
NRC-1 # 1 102 1 100 100 75 43.0 2e-14 MIHMEYYLVVSTIMMFAGIYGFFTRRNTLAILISVELMLNATDINFAVFNRFLFPGGMEG YFFALFSIAISAAETAIAIAIMINIYRNLRSIQVRNLDELKW >gi|222159249|gb|ACAB01000110.1| GENE 60 92097 - 94028 1532 643 aa, chain + ## HITS:1 COG:slr0844 KEGG:ns NR:ns ## COG: slr0844 COG1009 # Protein_GI_number: 16331732 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit # Organism: Synechocystis # 2 643 6 680 681 385 39.0 1e-106 MELTILILLLPFFSFLILGIGGKWMSHRTAGTIGTLILGAVAVLSYVTAIQYFSAPRLED GTFATLMPYNFTWLPFTETLHFDLGILLDPISVMMLIVISTVSLMVHIYSFGYMKGEVGF QRYYAFLSLFTMSMLGLVVATNIFQMYLFWELVGVSSYLLIGFYYTKPAAIAASKKAFIV TRFADLGFLIGILIYGYYGGTFGFTPDTVSLISGGASMLPLALGLMFVGGAGKSAMFPLH IWLPDAMEGPTPVSALIHAATMVVAGVYLVARMFPLFITYAPNTLHMVAWVGAFTAFYAA SVACVQSDIKRVLAFSTISQIGFMMVALGVCTSMNPHEGGLGYMASMFHLFTHAMFKALL FLGAGSIIHAVHSNEMSAMGGLRKYMPITHWTFLIACLAIAGIPPFSGFFSKDEILAACF QYSPVMGWVMTVIAAMTAFYMFRLYYGIFWGSEPPRAGSHSEHNTPHGNLEAAPCRPHES PLAMTFPLMFLAVVTCGAGFIPFGHFISSNGESYSIHLDPSVAITSVVIAIISIAIATWM YKNAKQPVADSLAKRFKGLHKAAYNRFYIDDIYQFITHKIIFRCISTPIAWFDRHVVDGF FDFLAWATNTTSDEIRGLQSGQVQQYAYVFLCGALALILLLIL >gi|222159249|gb|ACAB01000110.1| GENE 61 94067 - 95551 1368 494 aa, chain + ## HITS:1 COG:slr1291 KEGG:ns NR:ns ## COG: slr1291 COG1008 # Protein_GI_number: 16329430 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 4 (chain M) # Organism: Synechocystis # 69 439 71 443 559 251 36.0 2e-66 MNFLSIFVLIPLLMLAGLWAARGIKAIRGVMVTGASALLIASVVLTFLYLGERSAGNTAE MLFRADTLWYAPLHISYSVGVDGISVAMLLLSAVIVFTGTFASWRLQPLTKEYFLWFTLL SMGVFGFFISVDLFTMFMSYEIALIPMYLLIGVWGSGRKEYAAMKLTLMLMGGSAFLLIG ILGIYFGSGATTMNLLEIAQLHNIPFAQQCIWFPLTFLGFGVLGALFPFHTWSPDGHASA PTAVSMLHAGVLMKLGGYGCFRIAMYLMPEAANELSWIFLILTGISVVYGAFSACVQTDL KYINAYSSVSHCGLVLFAILMLNQTAATGAILQMLSHGLMTALFFALIGMIYGRTHTRDV RELAGLMKIMPFLSVCYVIAGLANLGLPGLSGFIAEMTIFVGSFQNNDVFHRTLTIIACS 
SIVITAVYILRLVGKILYGTCTNKHHLELTDATWDERVAVICLIVCVAGLGMAPFWVSHM IGESVLPVVSQLIP >gi|222159249|gb|ACAB01000110.1| GENE 62 95589 - 97028 1170 479 aa, chain + ## HITS:1 COG:BMEI1145 KEGG:ns NR:ns ## COG: BMEI1145 COG1007 # Protein_GI_number: 17987428 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 2 (chain N) # Organism: Brucella melitensis # 68 431 65 430 478 239 39.0 8e-63 MDYSQFLNMREELSLIAVLLLLFLADLFMSPDAHKNSGKARLNTMLPVILMAIHTAINLI PGTATDAFGGMYHYVPMHTVVKSILNVGTLIVFLMAHEWMKREDTSFKQGEFYVLTLSTL FGMYLMISAGHFLMFFIGLETASIPMAALIAFDKYRHNSAEAGAKYILTALFSSALLLFG LSMIYGSAGTLYFDDLPAHIDGNPLQIMAFIFFFTGMAFKLSLVPFHLWTADVYEGAPST VTAYLSVISKGSAAFVLLAVLIKVFAPMINDWQEVLYWVTIASITIANIFAVRQQNLKRL MAFSSISQAGYIMLGVIGGTAQGMTALVYYVLVYAAANLGVFAVITIVALRSQKFTLEDY AGLYKTNPKIALLMTLSLFSLAGIPPFAGFFSKFFIFMAAFDAGFHLLVFIALVNTVISL YYYLLIVKAMYITPSDNPIPTFRSDRCTKWGLALCTLGIIGLGIASIVYQSIDKLSFGI >gi|222159249|gb|ACAB01000110.1| GENE 63 97169 - 99979 2581 936 aa, chain - ## HITS:1 COG:slr2098_3 KEGG:ns NR:ns ## COG: slr2098_3 COG0642 # Protein_GI_number: 16330584 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 662 920 1 267 280 189 40.0 2e-47 MQTDNERYKKMASLAQIGWWEVDLTAGCYLCSDYLSDLLGLDGDTISTSDFLNLIREDYR KQIAQEFRANSSIHKDFYEQTFPIHSKYGEVWLHTRLAFREKGTGVDGGDKAFGIIQRVE APKEEDQRDALRRVNDLLCRQNFVSQSLLRFLRDEAVESCIADILRDILNLYNGKGRVYI FEYDEIYAHHSCIYEVVSEGVSAEIDNLQDMPASESKWWSEQILSGKPIILNTLEQLLEE APDEYQILVVQGIKSLMVTPLMTGDRVWGYMGIDLVETYHDWSNEDFQWFSSLGNIINIC IELRKTKDKVIREQTFLNNLFHFMPMGYIRMSIIRDENNKPCDYRVTDANEVSSTFFGLP LESYIGSLASEKHPDYLQKLNFLEEILDSNSYREKDEYFPRTERYTHWVIYSPGKDEIVG LFLDSTGSVQANRALDRSEKLFQNIFANIPAGVEIYDKDGYLIDLNNKDLEIFGVVNKSD VIGVNFFDNPNVPQNIRDRVRNEDLVDFRLNYSFEQAGGYYETSRSNVIELYSKVSKLYD NEGNFSGYILISIDNTERIDAMNRIRDFENFFLMISDYAKVGYAKLNLLNRKGYAIKQWY KNLGEEEDIPLSEVVAVYSQMHPEDRKRFLDFYDGVRDGKRRHFQGEMRIRRPGTKNEWN WVSSNVMVTNYKPEENEIEIIGINYDITELKETEAELIQARDKAEMMDRLKSAFLANMSH 
EIRTPLNAIVGFSDLLVETEELAERQEYIKIVRENNELLLQLISDILDLSKIEAGTFEFT NGDVDVNLLCEDIVRSMGMKAKEEVELVFDDHLPVCHVISDRNRIHQVISNFVNNAMKFT SEGSIHVGYKLKDGELEFYVEDTGIGIEKEQLPHIFERFVKLNSFVHGTGLGLSICQSIV EQLGGRIGVDSEKGKGSRFWFTIPGVIVTEEMNCAR >gi|222159249|gb|ACAB01000110.1| GENE 64 100172 - 101380 1254 402 aa, chain - ## HITS:1 COG:no KEGG:BT_4056 NR:ns ## KEGG: BT_4056 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 402 1 403 403 571 85.0 1e-161 MKRLFVNFMTFAVMATALTLTACSSDSDGDGDGNGDGGNNGGTGSSIVVGENILSGTLTG EQTLEAKEYILNGTVVIENGGRLNIPAGTTIKAREGFSSYLLVAQGGKLYADGTADKPIV FTANSTTPTSGYWGGIIINGKAPISGSNANKSDTGLTEIDNNYKYGGNVDNDNSGSLTYV KICYAGARSTADIEHNGLTLNGVGNGTKIENIYILESADDAIEFFGGTVNVTNLLAVNPD DDMFDFTQGYSGKLKNCYGVWESGYTSTEADPRGIEADGNLDGIYPDHLRQSDFAVENMT IVNNAANTTDNADRMQDVIKIRRGAKTTITNALVKGSGGTIDLIDMNDSKGAGNAASSIS ITYTLNFKNKLNGTLNTFTEPTTNTGADASLFTWSGYNFSSL >gi|222159249|gb|ACAB01000110.1| GENE 65 101424 - 104138 2752 904 aa, chain - ## HITS:1 COG:CC0171 KEGG:ns NR:ns ## COG: CC0171 COG1629 # Protein_GI_number: 16124426 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Caulobacter vibrioides # 115 904 57 888 888 135 23.0 4e-31 MKKLFLIRFSVAAFFCLLCVLPALAANIKIKGAVKDKLSKEPLIGATIRLLGTQVGAVTD MEGNFELNSMGVLEGMYDIEIKYVGYKTEVRRKVRVENGKLVILNLELETDAQELADVVV VAKKNRENENMLLLEQQKAVIAVQSVGVRELSRKGVSDAEGAVTKVAGVSKQDGVKNVFV RGLGDRYNATTFNGFALPSEDPEYKNISLDFFGTDIIQSVGVNKAFNAGGSSDVGGATID IVSKELIGSGHLGFGISGGLNTQTVAADFLKQDGVNFMGFANRTEPADENSWNFRNKLDP SAQHLQINRSYSISGGKRFYVGKDKNPLSFFLTAGHTTDYQYTDEIIRNTTTGGTVYKDM NGKKYAENISQLALANVDFDMQNRHHISYNLMMIHANTQSVGDYNGKNSIFSDDYENLGF TRRQQTNDNMLIVNQLMSNWGLTKSLSLDAGASYNMVKGYEPDRRINNLTKAEDGYTLLR GNSQQRYFSTLDEDDLNVKAGLVYRLKDNIEEISNIRFGYTGRFVDDNFKATEYNLTVGH VSTLPSLDDFSLDDYYNQENFASDWFKIQKNLDEYTVKKNIHSAYAEATYQFTPRWIVNL GLKYDKVDIEVDYNVNRGGSKGNNTIQKDYFLPSLNLKCNLNDKNSLRLGASKTYTLPQA KEISPYRYVGVNFNSQGNLNLKPSDHYNLDLKWDFNPTPTELISLTAFYKLIKNPISRIE 
VASAGGYLSYENIADKATVAGVEVEIRKNLFVRPVSNAAHGMNKLSLGLNGSYIYTNAKM PLATVTTGSQLEGAAPWIVNFDLSHNFTKRERSFVNTLVLNYVSDKIYTIGTQGYQDMME QGVLTLDFVSQAKLNKHLSLNLKARNLLNPSYKLSRKVNENGEKVILNDYKKGINISLGV SCTF >gi|222159249|gb|ACAB01000110.1| GENE 66 104595 - 106160 1089 521 aa, chain + ## HITS:1 COG:no KEGG:BT_4046 NR:ns ## KEGG: BT_4046 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 518 1 517 518 897 86.0 0 MEENFRRYPIGIQNFEDLRNNDCVYVDKTELIYRLTHTNKVYFLSRPRRFGKSLLVSTLE AYFLGKKELFQGLAMETLEKDWTVYPVLHFDFSVSKYITADMLSAVINRQLLLWEKIYQR EEGETTFSLRFEGIIRRAYKQTGKQVVVLIDEYDSPMLDSNHNDEVQKEIRNIMRDFFSP LKAQGQYLRFLLLTGISKFSQMSIFSELNNLQNISMRDDYSAICGITEQELRTQLQIDIE QMAQANKETYEEACVHLKQQYDGYHFSKNCVDIYNPFSLFNAFAQKSYENFWFSTGTPTF LIDILQESDFDIRELDNTTATAEQFDAPSNRITDPLPVLYQSGYLTIKGYDPDFQLYTLA YPNKEVRKGFIESLMPAYVHLPARENTFYVVSFIKDLRIGHLEECLERLKSFFASIPNKL NNKEEKHYQTIFYLFFRLMGQYIDVEVDTAIGRADAVVKLQDTIYVFEFKVNGTPEEALA QINSKGYAIPYQADHRKIVKVGVNFDSATRTIGEWKIELGS >gi|222159249|gb|ACAB01000110.1| GENE 67 106187 - 107083 737 298 aa, chain - ## HITS:1 COG:AGl1135 KEGG:ns NR:ns ## COG: AGl1135 COG2207 # Protein_GI_number: 15890685 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 26 296 36 304 313 127 30.0 2e-29 MTTPLSNQIIREITPLSDKDCFYIAERYKTEFTYPIHNHSEFELNFTEKAAGVRRIVGDS SEVIGDYDLVLITGKDLEHVWEQNDCHSKEIREITIQFSSDLFFKSFINKNQFDSIRRML DKAQKGLCFPMSAILKIYPLLDTLASEKQGFYAVIKFMTILYELSLFEEEARTLSSSSFA KIDIHSDSRRVQKVQEYINTHYQEEIRLGQLADMVGMTDVSFSRFFKLRTGKNLSDYIID IRLGFASRLLVDSTMSIAEICYECGFNNLSNFNRIFKKKKSCSPKEFRENYRKKKKLI >gi|222159249|gb|ACAB01000110.1| GENE 68 107118 - 108464 1274 448 aa, chain - ## HITS:1 COG:VC0265 KEGG:ns NR:ns ## COG: VC0265 COG0668 # Protein_GI_number: 15640294 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Vibrio cholerae # 56 442 26 410 412 358 45.0 1e-98 
MLMKEVKEVVDTVSVALANGQPEEAGNVVMQEVNHALLSMGVDEAMADKIDNFIILLFII GIALLANLICRKIILRTVAKLVRQTKATWDDIVFNDKVMVNISRMVAPILIYISIPIAFP EHADSALLDFLRRLCMIYILAVFLRFVSALFTAVYLVYSAREQYKDKPLKGLLQTAQVIL FFIGAIIIISILIKQSPVVLLTGLGASAAVLMLVFKDSIMGFVSGIQLSANNMLKVGDWI TMPKYGADGTVIEVTLNTVKIRNFDNTITTIPPYLLISDSFQNWQGMQESGGRRVKRSIN IDMTSVHFCTPEMLAKYRKIQLLKDYVDETEKVVEEYNKEHHIDNSVLVNGRRQTNLGVF RAYLTNYLKNLPTVNQDLTCMVRQLQPTETGIPLELYFFSANKVWVAYEGIQADVFDHVL AIIPEFDLRVFQNPSGADLHQIGVKIEN >gi|222159249|gb|ACAB01000110.1| GENE 69 108496 - 109524 815 342 aa, chain - ## HITS:1 COG:no KEGG:BT_4052 NR:ns ## KEGG: BT_4052 # Name: not_defined # Def: putative ABC transporter ATP-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 333 297 629 629 622 92.0 1e-177 MATSLRDNLTSSYFNAAHKLYPKKARRRIIAYVESYDDIAFWRTLLEEFEDDEHYFQVML PSATSLAKGKKMVLMNTLNTAELGRSLIACVDSDYDFLLQGATNTSRKINRNRYIFQTYT YAIENYHCFAESLHEVCVQATLNDRSILDFNFYLKKYSEIVYPLFLWNVWFYRQRDTYTF PMYDFHTYTSLREINLRHPEKSLESLQQRVNQKLAELKRKFPHNINQVNGLRTEFKELGL VPETTYLYMQGHHVMDNVVMKLLIPVCTVLRREREQEIKRLAEHNEQFRNELTCYQNSQV NVEIMLKKNVAYKRLFHYDWLRQDISEYLEEGKNKEEKKQRS >gi|222159249|gb|ACAB01000110.1| GENE 70 109547 - 110410 986 287 aa, chain - ## HITS:1 COG:STM2746 KEGG:ns NR:ns ## COG: STM2746 COG3950 # Protein_GI_number: 16766058 # Func_class: R General function prediction only # Function: Predicted ATP-binding protein involved in virulence # Organism: Salmonella typhimurium LT2 # 156 260 302 410 427 63 33.0 6e-10 MEQQANYIRRIEIHGLWERFNIGWDLRPDVNILSGINGVGKTTILNRSVGYLEELSGEMK SDEKNGVRLFFDNPQATYIPYDVIRSYDRPLIMGDFTARMADKNVKSELDWQLYLLQRRY LDYQVNIGNKMIEMLSSTDEEERRKAATLSVAKRRFQDMIDELFSYTRKKIDRKRNDIAF YQDGELLFPYKLSSGEKQMLVILLTVLVQDNSHCVLFMDEPEASLHIEWQQKLIAMIREL NPNVQIILTTHSPAVIMEGWLDAVTEVSDISTTVGHKLDKDSPNCNL >gi|222159249|gb|ACAB01000110.1| GENE 71 110533 - 111450 736 305 aa, chain + ## HITS:1 COG:BH0390 KEGG:ns NR:ns ## COG: BH0390 COG0697 # Protein_GI_number: 15612953 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and 
metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus halodurans # 4 290 2 291 311 67 23.0 3e-11 MDKSKNLYGHLFALTANVMWGLMSPIGKSALQEFSPISVTTFRMVGAAAAFWILSMFCKQ EHVNHQDMLKIFFASLFALVFNQGVFIFGLSMTSPIDASIVTTTLPIVTMVVAAIYLKEP ITNKKVLGIFIGAMGALILIMSSQAGSNGNGSLIGDLLCLVAQISFSIYLTVFKGLSQRY SAVTINKWMFIYASMCYIPFSYYDISTIQWTSVSTVAILQVLYVVLGGSFLAYLCIMTAQ RLLRPTVVSMYNYMQPIVATIAAIAMGIGSFGWEKGIAIALVFLGVYIVTQSKSRADLEK AEKPH >gi|222159249|gb|ACAB01000110.1| GENE 72 111541 - 112638 870 365 aa, chain - ## HITS:1 COG:BH2954 KEGG:ns NR:ns ## COG: BH2954 COG1703 # Protein_GI_number: 15615516 # Func_class: E Amino acid transport and metabolism # Function: Putative periplasmic protein kinase ArgK and related GTPases of G3E family # Organism: Bacillus halodurans # 35 365 4 334 340 337 49.0 3e-92 MIMEHPENSEEYKGLVVNKGIEQPSSVNPYLKRKPKKRQLSVAEFVEGIVKGDVTILSQA VTLVESVKPEHQAVSQEIIEKCLPFSGNSIRIGISGVPGAGKSTSIDVFGLHVLEKGGKL AVLAIDPSSERSKGSILGDKTRMEQLSVHPKSFIRPSPSAGSLGGVARKTRETIILCEAA GFDKIFVETVGVGQSETAVHSMVDFFLLIQLSGTGDELQGIKRGIMEMADGIVINKADGD NLERAKLAATQFRNALHLFPAPESGWIPKVLTYSGFYNLGVKEVWDMIYEYIDFVKENGY FEYRRNEQSKYWMYESINEQLRDSFYHNPKIETMLLEKEQQVLKGNLTSFIAARSLLDTY FEDLK >gi|222159249|gb|ACAB01000110.1| GENE 73 112644 - 113711 865 355 aa, chain - ## HITS:1 COG:no KEGG:BT_4048 NR:ns ## KEGG: BT_4048 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 355 19 373 373 623 92.0 1e-177 MKRSLLSILSLLTVALAAVAQPRISSNKETHNFGQIEWKRPVTVEYTITNTGNQPLVLTN VTTSCACAVADWTKEPIAPGGKGVVKASFDAKALGHFEKSIGIYSNASPSLVYLKFTGEV VQEIKDYTKLLPYTIGNIRLDRDEFAFPDVYRGQQPSLTFDIANLSDRPYEPVLMHLPPY LKMEAEPKVLLKGKKGTIKLTLDASQLKDYGLTQTSVYLSRFSGDKVSEDNEIPVSAILL PDFSRMTEKDSLNAPAMHISETDIDLSIPLIKKNKVSHDILIANSGKTPLVISKLQVFNS SVGVSLKKTVLPPDGMTKLKVTIRKRDVGNKKHHLRILMITNDPLRPKVEINIKR >gi|222159249|gb|ACAB01000110.1| GENE 74 113937 - 119681 4975 1914 aa, chain - ## HITS:1 COG:TM0984 KEGG:ns NR:ns ## COG: TM0984 
COG2373 # Protein_GI_number: 15643744 # Func_class: R General function prediction only # Function: Large extracellular alpha-helical protein # Organism: Thermotoga maritima # 462 781 149 450 1536 66 20.0 5e-10 MKVKQICMMVLLWLGVIPAVQAQTFDKLWKEVEQAEKKSLPKTVIKLTDEIYQKGEKEKN SPQMLKAYAWRMKYREMLNPDSLYAGLKGLEQWVKQTDQPMDRAILHSLIAGIYANYAAN NQWQLRQRTEIVGQTPSADIREWTANMFVEKVRTNVKEALADSVLLLETSSRDYIPFVEL GKTSEYYHHDLYHLLASRGIDALLQVEKLGSGYTETNAVNPVKQDIIAIYGNMLSAYKAA GLKEGYVLTALNYLEWRRGAERYIRPLQAKGEALVLTDDTYLKALNTLKSKYASEPICAE VYLAQARYAIEKQQQVNALQLCDEAIRLYPGYDRINALKNLREEILAPYLNVYAADQAFP NEEIELRASHKNLDGFTVRIYQAKKLIKEQHYSVIRPEDYRTQDTVFTFKAPELGAYIMR IIPDIRAKRDSESKFDVTRFKVLTCRLPEKQYQVVTLDGQTGHPISNAKVTMYSNDEKVL QEFTTDKDGKVVFPWKSEYRYLRASKGTDTAMPKQGIYAGSYGYYGDEDKVAENMTLLTD RSLYRPGQTVYVKGIAYSQKSDTADVLPNKEYMVTLLDANNQEVGQKSVRTNEFGSFATD FALPSACLNGMFSLKAGRGRTSIRVEDYKRPTFDITFEKQQGSYKLGDEVQVKGKIESYS GVLLQDLPVKYKVMRSTYSLWRFAESVQIASGEVTANENGEFTIPVRLQESDSYKNDDKV YYRYSIEATVTNVAGETQSSTDVISAGNRSLVLQVELHDKTCKDKPFDTMFNVQNLNGQP VEVKGNYYLYPAKDQDFKQLEEKSIATGTFTSNEEITLDWKDLPSGPYVLKASVKDNQGK EVTADTNTILFSIEDKRPPVETTVWFYGENMEFDATHPAVFCFGTSKKDAYVMMNVFSGD KLLESKTLNLSDTIVRFEYPYQESYGDGVFVNFCMVRDGQVYQEHARLTRRVPDKTLTMK WEVFRDKLRPGQKEEWKLTIKTPQGQAANAEMLATMYDASLDKIWNRQQNFQIFYSQIVP YSNWMSGYSGNNSFNYWWNTKSLKVPGLAYDYFVMSSGVGNVYALSESLADGVVVRGLAV QRKVSMTGSVTSRSNAVEVKYVPALVSDASEDVEFESGWGQTGTLGKLDETSGNETLPEA PADLRTNLAETAFFYPQLRTNEQGEISFSFTMPESLTRWNFRGYSHTKGMLMGTLDGEAT TSKEFMLTPNLPRFVRVGDKTSLTASISNMTGKSQAGTVTLILFDPMTEKVISTQKQKFS LVAGKTIGVSFQFTVSDKYEILGCRMIADSGTFSDGEQQLLPVLSNKEHLVETLPMPVRG EETRTFSLDRLFNQQSKTATDRKLTVEFTGNPAWYAIQALPSLSLPTSNNAISWATAYYA NTLASFIMNSQPKIKAVFESWKLQGGTKETFLSNLQKNQEVKNIILSESPWVLEAQTEEQ QKERIATLFDLNNIRSNNIAALTRLQELQNANGAWSWYKGMTGSRYITTYIAELNARLAM LTGEKLTGPALSLQQNAFTYLHQSALDEYKEILKAQKEGVKFTGVSGSILQYLYLIAISG EQVPAANKAAYTYYLSKVGELLTSPSMDTKAIAAIVLDKAGRKKEAQEFVASLKEFLTKT DEQGMFFAFNENPYTWGGMQMQAHVDVMEALEAIGGNSDTVEEMKLWLLKQKQTQQWNSP VATADAVFALLMKGVNLLDNQGDVRIVIANEVLETVAPSKTTVPGLGYIKRSFTQKSVVD 
ARKIEVEKRNPGIAWGAVYAEFESPISDVKQQGGELNVEKQLYVERMVNNVPQLQPITAK TVLQVGDKVVSRLSIRVDRPMDFVQLKDQRGACFEPIGSISGYRWNNGLGYYVDIKDAST NFFFDHLGKGVYVLEYSYRVSRAGTYETGLATMQCAYAPEYASHSASMTVAVKE Prediction of potential genes in microbial genomes Time: Wed May 18 03:43:55 2011 Seq name: gi|222159248|gb|ACAB01000111.1| Bacteroides sp. D1 cont1.111, whole genome shotgun sequence Length of sequence - 42008 bp Number of predicted genes - 40, with homology - 40 Number of transcription units - 22, operones - 10 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 12 - 65 9.3 1 1 Tu 1 . - CDS 117 - 1097 1179 ## COG0205 6-phosphofructokinase - Prom 1125 - 1184 4.3 2 2 Op 1 2/0.000 - CDS 1299 - 2168 372 ## PROTEIN SUPPORTED gi|15895122|ref|NP_348471.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase 3 2 Op 2 . - CDS 2177 - 3010 199 ## PROTEIN SUPPORTED gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 - Prom 3056 - 3115 4.7 + Prom 3001 - 3060 3.6 4 3 Op 1 . + CDS 3080 - 3763 771 ## BT_2059 TonB + Term 3776 - 3822 6.0 5 3 Op 2 . + CDS 3833 - 4807 990 ## COG0142 Geranylgeranyl pyrophosphate synthase 6 3 Op 3 . + CDS 4823 - 5539 526 ## BT_2057 hypothetical protein 7 3 Op 4 . + CDS 5533 - 6309 600 ## COG0084 Mg-dependent DNase + TRNA 6430 - 6517 58.9 # Ser GGA 0 0 + Prom 6442 - 6501 80.4 8 4 Op 1 . + CDS 6592 - 7392 861 ## COG0811 Biopolymer transport proteins 9 4 Op 2 . + CDS 7399 - 7884 391 ## BT_2054 hypothetical protein 10 4 Op 3 . + CDS 7921 - 8514 659 ## BT_2053 hypothetical protein 11 4 Op 4 . + CDS 8518 - 8988 485 ## BT_2052 hypothetical protein + Term 9002 - 9066 26.1 - Term 9000 - 9045 14.6 12 5 Op 1 . - CDS 9101 - 9640 448 ## BT_2051 hypothetical protein - Prom 9666 - 9725 2.0 13 5 Op 2 . - CDS 9739 - 11016 810 ## BT_2050 hypothetical protein - Prom 11088 - 11147 6.0 + Prom 10864 - 10923 5.2 14 6 Tu 1 . 
+ CDS 11169 - 11648 383 ## COG1522 Transcriptional regulators
+ Term 11726 - 11779 2.0
- Term 11490 - 11541 0.2
15 7 Op 1 16/0.000 - CDS 11695 - 12189 290 ## COG0262 Dihydrofolate reductase
16 7 Op 2 . - CDS 12195 - 12989 580 ## COG0207 Thymidylate synthase
17 7 Op 3 . - CDS 13029 - 14300 1018 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes
- Prom 14425 - 14484 1.8
18 8 Op 1 . - CDS 14487 - 14816 356 ## BF3732 hypothetical protein
19 8 Op 2 . - CDS 14797 - 15348 462 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
20 8 Op 3 . - CDS 15345 - 16130 697 ## BT_2043 hypothetical protein
- Prom 16153 - 16212 4.6
+ Prom 16125 - 16184 9.0
21 9 Tu 1 . + CDS 16239 - 17771 931 ## COG0144 tRNA and rRNA cytosine-C5-methylases
- Term 17572 - 17616 8.1
22 10 Tu 1 . - CDS 17787 - 18614 509 ## BT_2041 hypothetical protein
- Prom 18784 - 18843 5.2
23 11 Tu 1 . - CDS 18873 - 20057 1163 ## BT_2040 hypothetical protein
- Prom 20078 - 20137 2.6
24 12 Op 1 11/0.000 - CDS 20215 - 23313 2923 ## COG3696 Putative silver efflux pump
25 12 Op 2 . - CDS 23457 - 24692 1111 ## COG0845 Membrane-fusion protein
- Prom 24716 - 24775 1.8
- Term 24732 - 24773 1.6
26 13 Op 1 . - CDS 24778 - 25134 172 ## BT_2037 hypothetical protein
27 13 Op 2 . - CDS 25206 - 25406 237 ## gi|298479689|ref|ZP_06997889.1| hypothetical protein HMPREF0106_00114
28 13 Op 3 . - CDS 25396 - 25977 519 ## COG4185 Uncharacterized protein conserved in bacteria
- Prom 26005 - 26064 6.6
29 13 Op 4 . - CDS 26066 - 27703 1733 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains
- Prom 27770 - 27829 7.2
+ Prom 27708 - 27767 6.1
30 14 Tu 1 . + CDS 27822 - 28220 334 ## COG0784 FOG: CheY-like receiver
+ Term 28399 - 28434 1.3
- Term 28387 - 28422 1.3
31 15 Op 1 . - CDS 28503 - 30116 1703 ## BDI_0503 hypothetical protein
32 15 Op 2 . - CDS 30128 - 33064 2823 ## BDI_0500 hypothetical protein
+ Prom 33210 - 33269 7.1
33 16 Tu 1 .
+ CDS 33440 - 33661 255 ## BVU_1841 hypothetical protein + Term 33804 - 33848 -0.6 - Term 34020 - 34080 18.3 34 17 Tu 1 . - CDS 34188 - 34889 702 ## COG0120 Ribose 5-phosphate isomerase - Prom 35044 - 35103 4.4 - Term 35029 - 35084 4.6 35 18 Tu 1 . - CDS 35252 - 35902 551 ## COG1272 Predicted membrane protein, hemolysin III homolog - Prom 36135 - 36194 6.8 - Term 36133 - 36172 8.1 36 19 Tu 1 . - CDS 36202 - 37101 1234 ## BT_1979 meso-diaminopimelate D-dehydrogenase - Prom 37148 - 37207 6.2 + Prom 37184 - 37243 9.9 37 20 Op 1 . + CDS 37289 - 37891 621 ## COG0632 Holliday junction resolvasome, DNA-binding subunit 38 20 Op 2 . + CDS 37914 - 38885 903 ## BT_1977 hypothetical protein - Term 39183 - 39243 11.6 39 21 Tu 1 . - CDS 39493 - 40329 973 ## BF3641 hypothetical protein - Prom 40486 - 40545 4.8 + Prom 40305 - 40364 4.7 40 22 Tu 1 . + CDS 40493 - 41926 1484 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase Predicted protein(s) >gi|222159248|gb|ACAB01000111.1| GENE 1 117 - 1097 1179 326 aa, chain - ## HITS:1 COG:BH3164 KEGG:ns NR:ns ## COG: BH3164 COG0205 # Protein_GI_number: 15615726 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Bacillus halodurans # 4 326 1 319 319 312 51.0 6e-85 MGTVKCIGILTSGGDAPGMNAAIRAVTRAAIYNGLQVKGIYRGYKGLVTGEIKEFKSQNV SNIIQLGGTILKTARCKEFTTPEGRQLAYDNMKKEGIDALVIIGGDGSLTGARIFAQEFD VPCIGLPGTIDNDLYGTDTTIGYDTALNTILDAVDKIRDTATSHERLFFVEVMGRDAGFL ALNGAIASGAEAAIIPEFSTEVDQLEEFIKNGFRKSKNSSIVLVAESELTGGAMHYAERV KNEYPQYDVRVTILGHLQRGGSPTAHDRILASRLGAAAIDAIMEDQRNVMIGIEHDEIVY VPFSKAIKNDKPIKRDLVTVLKELSI >gi|222159248|gb|ACAB01000111.1| GENE 2 1299 - 2168 372 289 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15895122|ref|NP_348471.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase [Clostridium acetobutylicum ATCC 824] # 1 278 1 274 642 147 32 8e-35 MIKVEIDEGSGFCFGVVTAIHKAEEELAKGETLYCLGDIVHNSREVDRLKTMGLITINRE EFKQLKNAKVLLRAHGEPPETYIIARENNIEIIDATCPVVLRLQKRIRQGYLADSDEEKQ 
IVIYGKSGHAEVLGLVGQTDGKAIVIEKAEEAKKLDLNKSIRLFSQTTKSLDEFQEIVEY FKQHISPEATFEYYDTICRQVANRMPKLREFAATHDLIFFVSGKKSSNGKMLFEECLKVN ANSHLIDNEKEIDPSLLQNVKSIGVCGATSTPKWLMEKIYNHIRTLIKE >gi|222159248|gb|ACAB01000111.1| GENE 3 2177 - 3010 199 277 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 [Treponema pallidum subsp. pallidum str. Nichols] # 50 263 32 274 863 81 28 9e-15 MYSFPSLPLFIQLFNFFNSFVLVLLLLTANLWLIFKFTVSLADKSDELNMKKITIAIDGF SSCGKSTMAKDLAREVGYIYIDSGAMYRAVTLYSIENGIFDGDVIDTERLKKEIKDIHIS FRLNKETGRPDTYLNGVNVENKIRSMEVSSKVSPISTLDFVREAMVAQQQAMGNEKGIVM DGRDIGTTVFPDAELKIFVTATPEIRAQRRYDELKAKGQEASFDEILENVKQRDYIDQNR EVSPLRKAEDALLLDNTDLSIEEQKKWLFEQFNKVSK >gi|222159248|gb|ACAB01000111.1| GENE 4 3080 - 3763 771 227 aa, chain + ## HITS:1 COG:no KEGG:BT_2059 NR:ns ## KEGG: BT_2059 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 227 1 227 227 362 96.0 4e-99 MEVKKSPKADLEGKKSTWLLVGYVVVLAFMFVAFEWTQRDVKIDTSQAVADVVFEEEIIP ITETPEQAAPPPPEAPKVAELLEIVDDKADIEETTTIINEDNQARVEVKYVPVQVVEEEP EEQTIFEVVENMPDFPGGQAALMQYLAKNIKYPTIAQENGTQGRVIVQFVVNRDGSIVDA KVVRSVDPYLDKEALRVINTMPKWKPGMQRGKPVRVKFTVPVMFRLQ >gi|222159248|gb|ACAB01000111.1| GENE 5 3833 - 4807 990 324 aa, chain + ## HITS:1 COG:MK0774 KEGG:ns NR:ns ## COG: MK0774 COG0142 # Protein_GI_number: 20094211 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Methanopyrus kandleri AV19 # 11 323 16 323 324 193 37.0 4e-49 MFTASQLLDKINHHISEIQFTRTPKGLYEPIEYILSLGGKRIRPVLMLMGYNLYREDVAS IYDPATAIEVYHNHTLLHDDLMDRSDVRRGKPTVHKVWNDNTAVLSGDAMLILAFRYMTG CPQEHLKEVMDLFSLTTLEICEGQQLDMEFESRCDVTEDEYIEMIRLKTAVLLAASLKIG AILAGATAEDAENLYHFGMHIGVAFQLQDDLLDVYGDPEVFGKKIGGDILCNKKTYMLIK ALNRADEKQREELNRWLNAETFQPAEKIEAVTEIYNQLNIRNVCENKMREYYTLAMESLA AVAVAEDRKKELKNLVKLLMYREM >gi|222159248|gb|ACAB01000111.1| GENE 6 4823 - 5539 526 238 aa, chain + ## HITS:1 COG:no KEGG:BT_2057 NR:ns ## KEGG: BT_2057 # Name: 
not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 238 1 238 238 399 85.0 1e-110 MPYRRLPNTDQARVRALKAAVEKGEMYNVRDLAITLKTLFEARNFLHRFEAAQIYYTQCY NNQSRASRKHQMNVKTARLYISHFIQVLNLAVLRDEIKVAHKELYGLPASNTVPDLLSEA ALVEWGKKIIEGEQQRTTQGGIPIYNPTIARVKVHYDIFLESYERQKNYQALTNRSLDEL ASMRGRADELILDIWNQVEAKYQDITPNDTRLEKCRDYGLIYYYRSSEKVKEEKEISC >gi|222159248|gb|ACAB01000111.1| GENE 7 5533 - 6309 600 258 aa, chain + ## HITS:1 COG:VC0103 KEGG:ns NR:ns ## COG: VC0103 COG0084 # Protein_GI_number: 15640135 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Vibrio cholerae # 2 258 1 255 255 228 42.0 8e-60 MLIDTHSHLFVEEFTEDLPQVMERARKAGVSYIFMPNIDSTTIDAMLSVCRDYPGFCYPM IGLHPTSVNESYEQELAVVHKYLSTSNEFVAIGEIGLDLYWDKTFLKEQILAFEKQIEWA LEYDLPIVIHSREAFEYIYKVMEPYKKTPLTGIFHSFTGTSEEAAKLLEFEGFMLGINGV VTFKKSTLPETLTTVPLERIVLETDSPYLAPVPNRGKRNESANVKDTLMKVAEIYQITPE HVAEVTSVNALKVFGIRK >gi|222159248|gb|ACAB01000111.1| GENE 8 6592 - 7392 861 266 aa, chain + ## HITS:1 COG:VC1547_2 KEGG:ns NR:ns ## COG: VC1547_2 COG0811 # Protein_GI_number: 15641555 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Vibrio cholerae # 113 254 46 184 205 73 33.0 3e-13 MKKLFAIVAVIGAFTFGSIQLAQAQDAPAAEQTEQQAAPAAEATTAAAPAAEEGGIHKEI KVKFIEGTASFMSLVAIALVIGLAFCIERIIYLSLAEINTKKFMASIEAALEKGDVEAAK DIARNTRGPVASIYYQGLMRIDQGIDVVEKSVVSYGGVQAGYLEKGCSWITLFIAMAPSL GFLGTVIGMVQAFDKIQQVGDISPTVVAGGMKVALITTIFGLIVALILQVFYNYVLAKIE ALTSEMEDSSISLLDMVIKYNLKYKK >gi|222159248|gb|ACAB01000111.1| GENE 9 7399 - 7884 391 161 aa, chain + ## HITS:1 COG:no KEGG:BT_2054 NR:ns ## KEGG: BT_2054 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 161 1 161 161 200 81.0 1e-50 MSKLTYRASYYALYAMFAIILIVLCLFFLGGDATGDAVIAGVDPEMWQPAQTDALLYLMY ALFGIAIAATILGAIFQFGAALKDNPANAIKSLLGLVLLVVVLVVAWSMGDGTPMQIQGY SGTDNVPFWLKITDMFLYSIYILLFVTVVAIIVSGIKKKLS 
>gi|222159248|gb|ACAB01000111.1| GENE 10 7921 - 8514 659 197 aa, chain + ## HITS:1 COG:no KEGG:BT_2053 NR:ns ## KEGG: BT_2053 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 197 1 197 197 341 93.0 8e-93 MARGKRKVPDINSSSTADIAFLLLIFFLITTSMDTDRGLARLLPPPPEDQDQDNTDKIKE RNVLQVYLNKDDALMCGNDYIGVDQLREKAKEFIANVANAEHMPEKTQKNVEFFGTYLVN DKHVISLQNDRGSSYQAYISVQNELVAAYNELRDELAEQKFGTTYAELNDDQQKAIREIY PQRISEAEPKKYGEKKK >gi|222159248|gb|ACAB01000111.1| GENE 11 8518 - 8988 485 156 aa, chain + ## HITS:1 COG:no KEGG:BT_2052 NR:ns ## KEGG: BT_2052 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 156 1 145 145 249 97.0 1e-65 MGKFNKTGKREMPALNTSSLPDLIFTLLFFFMIVTTMREVTLKVQFTLPQGTELEKLEKK SLVTFIYVGEPTQEYRAKMGTESRIQLNDSYAEVGEVQDFIFQERASMNEGDAAKMTVSL KVDQKTKMGIITDVKNALRKSYALKINYSATKRGEK >gi|222159248|gb|ACAB01000111.1| GENE 12 9101 - 9640 448 179 aa, chain - ## HITS:1 COG:no KEGG:BT_2051 NR:ns ## KEGG: BT_2051 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 178 1 178 178 303 83.0 3e-81 MIRLQPISTSDLQHYKFMEELLIDAFPPEEYRQLEQLREYTDRTGNFHNNIIFDDDLPVG FITYWDFDSFYYVEHFATNPALRNGGYGKRTLEYLCNYLKRPIVLEVERPVEEMAKRRIS FYQRQGFTLWEKDYCQPPYKPGDDFLPMYLMVHGELDCEKDFETIKKRIHKEVYGVKDN >gi|222159248|gb|ACAB01000111.1| GENE 13 9739 - 11016 810 425 aa, chain - ## HITS:1 COG:no KEGG:BT_2050 NR:ns ## KEGG: BT_2050 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 425 1 425 425 795 86.0 0 MKHICCIILCFYTSIGSYAQNFADYFQNKTMRVDYIFTGDATQQSIYLDELSQLPTWAGR QHHLSELPLEGNGQIIVKDLASKQCIYKTSFSSLFQEWLSTDEAKETAKGFENTFLLPYP KQPVEIEVTLYSPRKKTMATYKHIVRPDDILIHKRGVSHVTPHRYMLQSGNEKDCIDVAI LAEGYTEKEMDIFYQDAQRTCESLFSYEPFRSMKGKFNIVAVASPSTDSGVSVPRENLWK ETAVHSHFDTFYSDRYLTTSRVKSIHNALAGIPYEHIIILANTDVYGGGGIYNSYTLTTA HHPMFKPVVVHEFGHSFGGLADEYFYEDDVMTDTYPLDVEPWEQNISTQVNFASKWKDML 
PSDTPIPTPIAERKKYPVGVYEGGGYSAKGIYRPAYDCRMKTNGYPEFCPVCQRAIRRMI EFYVP >gi|222159248|gb|ACAB01000111.1| GENE 14 11169 - 11648 383 159 aa, chain + ## HITS:1 COG:HI0563 KEGG:ns NR:ns ## COG: HI0563 COG1522 # Protein_GI_number: 16272506 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Haemophilus influenzae # 4 156 2 149 150 119 40.0 3e-27 MGHHQLDALDEQILRLIAGNARIPFLEVARACNVSGAAIHQRIQKLTNLGILKGSEYVID PEKIGYETCAYIGIYLKDPESFDSVTKALEAIPEVVECHFTTGKYDMFIKIYAKNNHHLL SIIHDKLQPLGLARTETLISFHEAIKRQMPIMVDIEDED >gi|222159248|gb|ACAB01000111.1| GENE 15 11695 - 12189 290 164 aa, chain - ## HITS:1 COG:RSc0946 KEGG:ns NR:ns ## COG: RSc0946 COG0262 # Protein_GI_number: 17545665 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Ralstonia solanacearum # 1 163 1 161 167 125 44.0 5e-29 MSKVSIIAAVDRRMAIGFENKLLFWLPNDLKRFKALTTGNTILMGRKTFESLPKGALPNR RNIVLSSNPDTVCSGAEVFPSLETALRSCREDEHIYIIGGASIYQQALSFADELCLTEID STAPEADAYFPEVSSKVWQEKSREAHPADEKHLCSYAFVDYVRK >gi|222159248|gb|ACAB01000111.1| GENE 16 12195 - 12989 580 264 aa, chain - ## HITS:1 COG:BH3451 KEGG:ns NR:ns ## COG: BH3451 COG0207 # Protein_GI_number: 15616013 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate synthase # Organism: Bacillus halodurans # 1 264 1 264 264 433 73.0 1e-121 MKQYLDLLNRVLTEGTEKSDRTGTGTISIFGHQMRFNLDDGFPCLTTKKLHLKSIIYELL WFLQGDTNVKYLQEHGVRIWNEWADENGDLGHVYGYQWRSWPDYNGGFIDQISEVVETLK HNPDSRRIIVSAWNVADLNNMNLPPCHAFFQFYVADGRLSLQLYQRSADIFLGVPFNIAS YALLLQMMAQVTGLKAGDFVHTFGDAHIYLNHLEQVKLQLSREPRPLPQMKINPDVKNIF DFKFEDFELVNYDPHPHIAGAVAV >gi|222159248|gb|ACAB01000111.1| GENE 17 13029 - 14300 1018 423 aa, chain - ## HITS:1 COG:SA1891 KEGG:ns NR:ns ## COG: SA1891 COG1502 # Protein_GI_number: 15927663 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Staphylococcus aureus N315 # 52 423 127 494 494 248 36.0 2e-65 
MKLRIFILFLFLPLLHVQADVIDSLMTQPRDSIGLTSDSLVLHFLEESGIPISDNNKVKL LKSGREKFIDLFEAIREAKHHVHLEYFNFRNDSIANALFTLLAEKVKEGVEVRAMFDAFG NWSNNKPLKKKHLKKIREQGIEIVKFDPFTFPYINHAAHRDHRKIAVIDGKVAYTGGMNI ADYYINGLPKIGTWRDMHMRIEGDAVNDLQEIFLTIWNKETKQNVGGEAYFPQHEGQTDS TNIVVAIVDRTPKKNSRMLSHAYAMSIYSAQKNVHIVNPYFVPTSSIKKALNRTIDRGVD VTIMVSSASDIPFTPDAALYKLHKLMKRGATVYMYNGGFHHSKIMMVDDLFCTVGTANLN SRSLRYDYETNAFIFDKRITGELNNMFRNDIEHCTQLTPEFWKKRSPWKKFVGWFANLFT PFL >gi|222159248|gb|ACAB01000111.1| GENE 18 14487 - 14816 356 109 aa, chain - ## HITS:1 COG:no KEGG:BF3732 NR:ns ## KEGG: BF3732 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 109 1 109 109 121 58.0 6e-27 MTEIDNDKLLRDFFAENKQEIADNGFSRRVMHHLPGRSNHLARIWSAFVMTVAAVLFVWL GGLEAAWGTIREVFISMINHGTTGLDPKSIIIAAVVLLFMATRKVASMA >gi|222159248|gb|ACAB01000111.1| GENE 19 14797 - 15348 462 183 aa, chain - ## HITS:1 COG:RSc1055 KEGG:ns NR:ns ## COG: RSc1055 COG1595 # Protein_GI_number: 17545774 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Ralstonia solanacearum # 1 174 1 188 199 88 34.0 6e-18 MSQLNDISLVAQVVVFKNTRAFDQLVEKYQSPIRRFFLNLTCGDSELSDDLAQDTFIKAY TNIASFRNLSSFSTWLYRIAYNVFYDYIRSRKETADLDTKEIDAINSTEQENVGQKMDVY QSLKMLKEVERTCIMLFYMEDISIDKIAGIVGVPSGTVKSHLSRGKEKLATYLKQNGYDR NRQ >gi|222159248|gb|ACAB01000111.1| GENE 20 15345 - 16130 697 261 aa, chain - ## HITS:1 COG:no KEGG:BT_2043 NR:ns ## KEGG: BT_2043 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 211 1 219 250 270 72.0 4e-71 MKNFLLALMVVLTTCTLAIAQNQTTVVRDSVGKVKVTVTKDNKAKTNNTAVTVIGVDTAD TDSTEVDANSSNVSTGHGKASFTFESDDDDFPFHGFGINGGILVAIISIIAVFGFPVFIL FVIFFFRYKNRKARYRLAEQALAAGQPLPAEFIRENKTVDPRSQGIKNTFTGIGLFIFLW AITGEFGIGAIGLLVTFMGIGQWIIGSKQQAQGTDAPRTYTTHKDEKKNQDNVKNDSFEI IPSAPGEKDNGINEAKNDENK >gi|222159248|gb|ACAB01000111.1| GENE 21 16239 - 17771 931 510 aa, chain + ## HITS:1 COG:yebU_1 KEGG:ns NR:ns ## COG: yebU_1 COG0144 # 
Protein_GI_number: 16129788 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Escherichia coli K12 # 3 367 10 356 385 168 32.0 2e-41 MELPASFIDYTRSLLGDEEYDKLAVAIQQEPPVSIRLNDGKLKIENGKCRIVPQAGCITD FQLSAFNFEFNQVPWSSEGFYLDERLTFTFDPLFHAGCYYVQEASSMFVEQVLRQYVTGP VKMLDLCAAPGGKSTHARSVLPEGSLLVANEVIRNRSQILAENLMKWGHPDVVVTNNDPA DFAALPSFFDVILTDVPCSGEGMFRKDPVAVEEWSPENVEICWQRQRRIIGDIWDTLKPG GILIYSTCTFNTKEDEENARWIQQEYGAEPLTVQVQENWNITGDLLPDNCDGSKSSIPVY HFFPHKTKGEGFFLAAFRKPETGDGIPVSSFAKEKTSKKKDKKGGAVSSPVSKEQLNIAK SWLNDENSDKYILSAEGTSIRAFSKHYADELMAMKQCLKIVSAGVEIGEVKGKDLIPDHA LAMCSSLLCREAFATEEISYKQAITYLRKEAITLPVTAPRGYVLLTYRHIPLGFVKNIGN RANNLYPQEWRIRSGYLPENIRILSESTVD >gi|222159248|gb|ACAB01000111.1| GENE 22 17787 - 18614 509 275 aa, chain - ## HITS:1 COG:no KEGG:BT_2041 NR:ns ## KEGG: BT_2041 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 275 8 274 274 377 68.0 1e-103 MRPKQFYLFMLCCILTLFSGCKEDEENPLRFYNSEYEVPMGGRRYLGLESGNGNYSLEVK DTRIASAGTETGWTGVPAGRMIYVTGILTGQTYLTVTDNATQETCTLPIKVVDNYEDIKL LRSYLSNLPNGDANLLPGISDIFLINNHARDAYFFKQGEQTAFSSGLTLITRGSYKLEKE EGNNEKAILTLTFSEDAATPIPNHKFILWGNAYVFHRLDKSLNLNWNTPPIGETRTSPAP PPSYTLEEIAGGEPGAGKQISFTLSEQEIPAGILP >gi|222159248|gb|ACAB01000111.1| GENE 23 18873 - 20057 1163 394 aa, chain - ## HITS:1 COG:no KEGG:BT_2040 NR:ns ## KEGG: BT_2040 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 394 1 397 397 630 91.0 1e-179 MKQIMTISAALLFLTIGEVQAQNGIEQVLKNIETNNKELQANEQLITSQKLEAKTDNNLP DPTLSYAHLWGAKDKSETIGELVVSQSFDFPSLYATRNKLNRLKAGAFDSQSDVFRQEKL LQAKELCLDIIMLRQQKHILEERLRNAEELAKMYAKRLQTGDANALETNKINLELLNVRT EASLNETALRNKQQELNTLNGNIPVVFEENQYPAIPFPTDYQILKSEVMATDRTLMALGN ESLVARKQIAVNKSQWLPKLELGYRRNTETGVPFNGVVVGFSFPLFENRNKVKIAKAQAL NIDLQKDNATLQVESELAQLYREAKTLHASMEEYSKTFQSQQDLSLLKQALTGGQISMIE YFIEVSVIYQSHQNYLQLENQYQKAMARIYKSKL >gi|222159248|gb|ACAB01000111.1| GENE 24 20215 - 23313 2923 
1032 aa, chain - ## HITS:1 COG:all7618 KEGG:ns NR:ns ## COG: all7618 COG3696 # Protein_GI_number: 17158754 # Func_class: P Inorganic ion transport and metabolism # Function: Putative silver efflux pump # Organism: Nostoc sp. PCC 7120 # 1 1020 1 1019 1058 768 42.0 0 MLNKIIHYSLHNRLVVVCAAILLLIAGTYTAMHTEVDVFPDLNAPTVVIMTEANGMAAEE VEQLVTFPVETAVNGATGVRRVRSSSTNGFSVVWVEFDWGTDIYLARQIVSEKLAVVSES LPVNVGKPTLGPQSSILGEMLIVGLTADSTSMLDLRTIADWTIRPRLLSTGGVAQVAVLG GDIKEYQIQLDPERMRHYGISMGEVMAVTQDMNLNANGGVLYEFGNEYIVRGVLSTSKTE QLGKAVVKTVNNFPVTLEDIADVTIGPKAPKLGTASERGKSAVLMTITKQPATSTLELTD KLEASLKDLQKNLPPDVKVSTDIFRQSRFIESSIGNVKKSLFEGGIFVVIVLFLFLANVR TTLISLVTLPLSLLVSILTLHYMGLTINTMSLGGMAIAIGSLVDDAIVDVENVYKRLREN RQKAKAERLSTLEVVFNASKEVRMPILNSTLIIVVSFVPLFFLSGMEGRMLVPLGVAFIV ALFASTVVALTLTPVLCSYLLGSNKTNKKLKEAPLARWMKGIYEKALTWVLAHKRVTLGS TIALFVVALGVFFTLGRSFLPSFNEGSFTINISSLPGISLEESNKMGHRAEELLMTIPEI QTVARKTGRAELDEHALGVNVSEIEAPFELKDRSRSELVADVREKLGTITGANIEIGQPI SHRIDAMLSGTKANIAIKLFGDDLNKMFSLGNQIKGAISDIPGVADLNVEQQIERPQLKI QPKREMLAKFGITLPEFSEYVNVALAGKVISQVYEQGKSFDLIVKVKDDARDEIEKIRNL MVDTNDGRKVPLSYVAEVVSAMGPNTINRENVKRKIVISANVADRDLRSVVNDIQKRIDT SVQLPEGYHIEYGGQFESEQAASRTLALTSFISIVVIFLLLYNEFRSVKESGVILLNLPL ALIGGVFALVITTGEVCIPAIIGFISLFGIATRNGMLLISHYNHLQKEEGLNVYDSVIRG SLDRLNPILMTALSSALALIPLALGGDLPGNEIQSPMAKVILGGLLTSTFLNGFIVPIVY LMMHRKKAQESL >gi|222159248|gb|ACAB01000111.1| GENE 25 23457 - 24692 1111 411 aa, chain - ## HITS:1 COG:aq_468 KEGG:ns NR:ns ## COG: aq_468 COG0845 # Protein_GI_number: 15605952 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Aquifex aeolicus # 86 409 38 359 359 73 24.0 1e-12 MKKLIFVGILGLFVLGSCNNSTVTHTHNEHDHATEGHNHEAEGSDHSHEGECSGEHNHEA TDGHNESAEAHSDEIILPKAKAEAAGVKVSVIEPAPFQQVIKTSGQVLAAQGDESVAVAT VAGVVSFRGKVTEGMSVGNGTPLVTISSKNIADGDPVQRARIAYEVSKKEYERMKELVKN KIVSDKDFAQAEQSYENARLSYEALSKNHSAIGQSITAPIAGYVKSILVKEGDYVTIGQP LVSVTQNRRLFLRAEVSEKYYPYLRTISSANFQTPYNNQVYELKTLNGKLLSFGKAAGDN SFYVPVTFEFDNKGEVIPGSFVEVFLLSSTMENVLSLPRTALTEEQGIFFIYLQLDEEGY 
KKQEVTIGADNGKSVQILTGVKAGDRVVTEGAYQVRLASASNAIPAHSHEH >gi|222159248|gb|ACAB01000111.1| GENE 26 24778 - 25134 172 118 aa, chain - ## HITS:1 COG:no KEGG:BT_2037 NR:ns ## KEGG: BT_2037 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 118 19 136 136 197 85.0 1e-49 MLVAAVIPHHHHPNGMICMKQDLPVEQQCPTHHHHPGNDSCCSSECMTRFYSPTPSVHTD SGPDYVFIATLFTDVIIEHLLRPQERRIKNYYVYRDSLHGTDIHRATPLRAPPYSVFA >gi|222159248|gb|ACAB01000111.1| GENE 27 25206 - 25406 237 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|298479689|ref|ZP_06997889.1| ## NR: gi|298479689|ref|ZP_06997889.1| hypothetical protein HMPREF0106_00114 [Bacteroides sp. D22] # 1 66 18 83 83 89 100.0 5e-17 MSNNEMQELSDKLRRGLQLAEKRLLEKNARNGKLLSQGTPDGKVIYVSATELLERLQEKE KESIKK >gi|222159248|gb|ACAB01000111.1| GENE 28 25396 - 25977 519 193 aa, chain - ## HITS:1 COG:alr5363 KEGG:ns NR:ns ## COG: alr5363 COG4185 # Protein_GI_number: 17232855 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. PCC 7120 # 7 189 4 186 187 182 50.0 4e-46 MDETRQLYIISGCNGAGKTTASYTVLPEILLCKEFVNADEIAKGLSPFNPESMAIEAGRL MLKRINELLAAKVSFSIETTLATRSYTRLIQRAQNAGYKVSLIYFWLNSPELAVNRVLQR VNEGGHNVPMDVIYRRYQAGINNLFQIYMPRVDYWLLADNSISPRVIVAEGYQHGEDRIY ELELFKCIKNYVK >gi|222159248|gb|ACAB01000111.1| GENE 29 26066 - 27703 1733 545 aa, chain - ## HITS:1 COG:all4183 KEGG:ns NR:ns ## COG: all4183 COG0488 # Protein_GI_number: 17231675 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Nostoc sp. 
PCC 7120 # 1 533 1 531 564 395 40.0 1e-109 MISVDGLAVEFGGTTLFSDISFVINEKDRIALMGKNGAGKSTLLKILAGARQPTRGKVSA PKDCVVAYLPQHLMTEDGRTVFDETAQAFSHLHEMEAQIEKLNKELETRTDYESDSYMAL IEEVSALSEKFYSIDATNYEEDVEKALLGLGFTRSDFQRQTSDFSGGWRMRIELAKLLLQ KPDVLLLDEPTNHLDIESIQWLEDFLINNGKAVIVISHDRKFVDNITTRTIEVTMGRIYD YKVNYSQYLQLRKDRREQQQKAYDEQQKFIAETKEFIERFKGTYSKTLQVQSRVKMLEKL ELLEVDEEDTSALRLKFPPSPRSGSYPVIMEGVGKAYGEKVVFRNANLTIERGDKVAFVG KNGEGKSTLVKCIMKEIEHDGTLTLGHNVQIGYFAQNQASLMDENLTVFQTIDDVAKGDI RNKIRDLLGAFMFGGPEESMKKVKVLSGGERTRLAMIKLLLEPVNLLILDEPTNHLDMKT KDILKQALLDFDGTLIVVSHDRDFLDGLVTKVYEFGNQKVTEHLCGIYEFLEKKKMDSLQ ELEKK >gi|222159248|gb|ACAB01000111.1| GENE 30 27822 - 28220 334 132 aa, chain + ## HITS:1 COG:VC1445_2 KEGG:ns NR:ns ## COG: VC1445_2 COG0784 # Protein_GI_number: 15641456 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Vibrio cholerae # 13 127 1 115 117 78 38.0 3e-15 MESEQMNEFRPLILVAEDDDSNFKLIKAIIGKKCDIEWAKNGQEMVELFQQHQERTKAML MDIKMPVMNGLEATKIIRESNTEIPIIMQTAYAFSSDKENAMNAGATEVLVKPITLSILR TTLSKYLPDLQW >gi|222159248|gb|ACAB01000111.1| GENE 31 28503 - 30116 1703 537 aa, chain - ## HITS:1 COG:no KEGG:BDI_0503 NR:ns ## KEGG: BDI_0503 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 537 5 522 522 472 47.0 1e-131 MKKYKSIGKLLAVSFLTAAIASACTEDAMDKINENPNNPLDAPAKFLITDLGVSTAFSTV GGDFSLYSSVYIEHETGISNQLYRAEVRSGEPTTATTYNNAWINVYSNIKNAKIVIKKCE EDPSEKGNVVTEAIAKILLAYNGAVAADVFGNTPYSQTGILNPDGTPMYMQPKIDTQESI YQEVMQNLDDAITLLNNGTAKDAGLSGAVGSKDLIYGSNTSTQASMWLKTAYALKARYTM RLLNKSANKTTDLQNILTYVSKSFANASEECKLAVYDADSQLNPLWSFSYSRNSLAASAS LIEKFVERNDPRAPQAFLEPDPTGYIAYGYGGTQATDIAGIKAAPNGTPQELQNNYGMSM ISWAMSAPTLLISYHEVKFLEAEALCRLGGRLSEAKNALKAAITAGFENLENSIIDAADT WVYDGDSGLGADVAETYFTDEVEPLFDANPLQETMIQKYLAFFGASGESLEAYNDYRRLK GAGENFIVLKNPLNNNKFPLRFGYGADDVLANPEVKAAFGDGQYVYSEAVWWAGGNK >gi|222159248|gb|ACAB01000111.1| GENE 32 30128 - 33064 2823 978 aa, chain - ## HITS:1 COG:no KEGG:BDI_0500 NR:ns ## KEGG: BDI_0500 # Name: 
not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 978 78 1057 1057 1195 62.0 0 MISQDVSIKPGIIKVVLKSDAKALDEVVVTAMGISREKKALGYAVQDVKSDQLTQAANSN LAGALQGKVSGLDVKPSSGMPGASSQITIRGARSFSGDNTPLYVIDGMPVTSTPDVSTDI QNNGSVSGADFANRAVDIDPNDIESINILKGQAASALYGIRASNGVIIITTKSGKGLEKG KPQVSFSSNVSFDVVGRLPEFQKTYAQGSGGVFSTTSGTSWGPKISELPNDATYGGNTDN EYTQKFGKQQGKYYVPQRAAAGLDPWATPQAYDNAKDFFDTGITWSNSLNVAQMLDKSSY SISLGNTHQDGIISSTGMDRYNVKVSADTKLTNNWSSGFTANYITTSIDKAVTSGNGLLR TVYAAPPSYDLAGIPSHVDGNPYTQNSFRGSFDNAYWAMENNKFTEDTNRFFGNVYASYQ TDFGTTNHKLNAKYMVGVDAYTTHYVDSYGYGSNTGGGRGQIENYGWTNATYNSLLTINY DWRINEDWGLNAVLGNEIIQSNRKKYYEYGTNYNFPGWNHINNATTQQTEEETWKNRTVG FFGNVSASYKNMLYLTLTGRQDYVSNMPRNNRSFFYPSISAGFILTELDALKNNVVNHAK LRVSYAEVGQAGDFLENYYSTPTYGGGFYTLTPIMYPMKGTTAYTPYYTIFDPKLKPQNT RSYEVGTDVNFLDNLITFSYTYSRQNVKDQIFEVPLASSTGASKLLTNGGKIHTNTHEFT LGFNPIRTKNINWDFAFNWTKIDNYVDELAPGVENISLGGYVTPQVRASAGEKFPVIYGV GFKRDENGNRLVDENGLPIAGEAQVIGKVSPDFLMGFNTTLRLWKCTISAVLDWKQGGQM YSRTTGLADYYGVSKRTENREGTIIFDGYKTDGTKNDIAITGANAQQVFYSRLNDIDESS VYDNSFIKLREVAVNYKILQKRSIELSVNAFARNILIWAQLPDLDPEASQGNNNMAGAFE DYSMPQTASFGFGFNIKF >gi|222159248|gb|ACAB01000111.1| GENE 33 33440 - 33661 255 73 aa, chain + ## HITS:1 COG:no KEGG:BVU_1841 NR:ns ## KEGG: BVU_1841 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 72 1 72 1002 112 76.0 5e-24 MKRKLMFLMTFLFIGIGLMTAQTSRVTGLVTSEEDGQPVVGASILVNGTTLGTITDIDGK FTIANVPSSAKTY >gi|222159248|gb|ACAB01000111.1| GENE 34 34188 - 34889 702 233 aa, chain - ## HITS:1 COG:MTH608 KEGG:ns NR:ns ## COG: MTH608 COG0120 # Protein_GI_number: 15678636 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase # Organism: Methanothermobacter thermautotrophicus # 36 224 21 209 226 149 45.0 3e-36 MDWENHLIKHLQWSDSIINREAKEHVAREIAATAKDGDVIGAGSGSTVYLTLFELARRIR EEHLHIEVIPASQEISMTCIQLGIPQTILWNKRPDWTFDGADEVDPQRNLIKGRGGAMFK EKLLIRSSRKTFIIIGPSKRVNLLGNKFPIPVEVFPDSLTYVDHELQRLGASEIVLRPAH 
GKDGPVFTENGNFILDTRFNYIDSSLEEQLKTITGVIESGLFIGYDVEVIMAE >gi|222159248|gb|ACAB01000111.1| GENE 35 35252 - 35902 551 216 aa, chain - ## HITS:1 COG:BH2865 KEGG:ns NR:ns ## COG: BH2865 COG1272 # Protein_GI_number: 15615428 # Func_class: R General function prediction only # Function: Predicted membrane protein, hemolysin III homolog # Organism: Bacillus halodurans # 6 212 7 207 215 119 39.0 4e-27 MKNKRYNNIEEWANTLSHGIGILLGIVAGYLLLEKASENIEPQWAVACVSVYLTGMLSSY ISSTWYHGSRPGKLKELLRKFDHGAIYLHIAGTYTPFTLLVLRHAGGWGWGIFAFVWLSA IAGFILSFKKLKEHSNLETACYVGMGACILVAMKPLLDHLGELGATTAFWWLIGGGVSYI VGAVFYSLRKPYMHATFHLFCLGGSIGHIIAIWLIL >gi|222159248|gb|ACAB01000111.1| GENE 36 36202 - 37101 1234 299 aa, chain - ## HITS:1 COG:no KEGG:BT_1979 NR:ns ## KEGG: BT_1979 # Name: not_defined # Def: meso-diaminopimelate D-dehydrogenase # Organism: B.thetaiotaomicron # Pathway: Lysine biosynthesis [PATH:bth00300] # 1 299 1 299 299 585 97.0 1e-166 MKKVRAAIVGYGNIGHYVLEALQAAPDFEIAGVVRRAGAENKPEELANYVVVKDIKELGN VDVAILCTPTRSVEKYAKEYLAMGINTVDSFDIHTGIVDLRRTLDAAAKEHKAVSIISAG WDPGSDSIVRTMLEAIAPKGITYTNFGPGMSMGHTVAVKAIDGVKAALSMTIPTGTGIHR RMVYIELKDGYKFEEVAAAIKADPYFVNDETHVKLVPSVDALLDMGHGVNLTRKGVSGKT QNQLFEFNMRINNPALTAQVLVCVARASMKQQPGCYTMVEIPVIDLLPGDREEWIGHLV >gi|222159248|gb|ACAB01000111.1| GENE 37 37289 - 37891 621 200 aa, chain + ## HITS:1 COG:PA0966 KEGG:ns NR:ns ## COG: PA0966 COG0632 # Protein_GI_number: 15596163 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, DNA-binding subunit # Organism: Pseudomonas aeruginosa # 1 198 1 198 201 116 38.0 3e-26 MIEYIRGELAELSPATAVIDCNGVGYAANISLNTYSAIQGKKNCKLYIHEAIREDAYVLY GFAEKQEREIFLLLISVSGIGGNTARMILSALSPAELVNVISTENANMLKTVKGIGLKTA QRVIVDLKDKIKTMSGSAGGGASAGLLLQPANAEVQEEAVSALTMLGFAAAPSQKVVLAI LKEEPDAPVEKVIKLALKRL >gi|222159248|gb|ACAB01000111.1| GENE 38 37914 - 38885 903 323 aa, chain + ## HITS:1 COG:no KEGG:BT_1977 NR:ns ## KEGG: BT_1977 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 1 323 8 329 329 480 76.0 1e-134 MQAWLDAHGRAKVVNTDEWYLDFANQLLPLVAESYIYGGKEMEEDQKQVALTCALYLEDC VADGGNWRQFIHWHQANYGRCLPFYTLTDEYLPDEINREDVVFLLWAINSPVGDDFDGVE NPMDADLLEFADVLYNRLDAAFELAPISDYLATDWLLETELMQKKRMPLPIALPGEKMPA NVERFLEASKGEPLLYFNSYDALKFFFVQSLKWEDEEDSLLPDLREFDNFVVFANPKGLL IGPDVAPYFADKRNPLYNAELAEEEAYELFCEEGLCPFDLLKYGMEHELLPEAQFPFENG KELLQENWDFVARWFLGEYYEGE >gi|222159248|gb|ACAB01000111.1| GENE 39 39493 - 40329 973 278 aa, chain - ## HITS:1 COG:no KEGG:BF3641 NR:ns ## KEGG: BF3641 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 278 1 279 280 348 66.0 1e-94 MKKLIILFCGTLALAACGNGLEKKANEKLMVAKAAYERGDYEEAKLQIDSIKILYPKAFE ARKAGQALMLDVETKTQQKTLAYLDSAFQAKTKEFNAIKDKFTLEKDAEYQQVGNYLWPT QTIEKNMHRSYLRFQVNEQGIMSMTSIYCGAGNIHHNKVKVIAPDGSFAETPSSKDSYET TDMNEKIEKADYKLGEDGNVIEFLHLNKDKNIRVEYMGDRTYKTTMSPTDRLAAANVYEL AQILSAMEKINKEQEEANLKIGFINKKKERKAQEEITD >gi|222159248|gb|ACAB01000111.1| GENE 40 40493 - 41926 1484 477 aa, chain + ## HITS:1 COG:MT4026 KEGG:ns NR:ns ## COG: MT4026 COG0617 # Protein_GI_number: 15843539 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Mycobacterium tuberculosis CDC1551 # 27 438 34 445 480 229 35.0 7e-60 MIELTQEELKQHFSEPIFGQIAETADALGMECYVVGGYVRDIFLQRPSKDIDVVVVGSGI AMAEALGKRLGRGAHVSVFKNFGTAQVKYRGTEVEFVGARKESYQRDSRKPIVEDGTLED DQNRRDFTINALAVCLNKARFGELVDPFGGMEDMKEKTIRTPLDPDITFSDDPLRMMRCI RFATQLGFYIDDDTFESLCRNKERIEIISRERIADELNKIILSPIPSKGFIDLERSGLLP LIFPEFAALQGVETRNGRSHKDNFYHTLEVLDNISKKTDNLWLRWAALLHDIAKPVTKRW EPKAGWTFHNHNFIGEKMIPHIFRKMKLPMNEKMKYVQKLVSLHMRPIVIADDVVTDSAV RRLLFEAGDDIDDLMMLCEADITSKNMERKQRFLNNFQLVRQKLKDLEEKDRVRNFQPPV SGEEIMEIFGLEPCREVGVLKSAIKDAILDGVIPNEYEAAHAFMLERAKKMGLTPVC Prediction of potential genes in microbial genomes Time: Wed May 18 03:45:23 2011 Seq name: gi|222159247|gb|ACAB01000112.1| Bacteroides sp. 
D1 cont1.112, whole genome shotgun sequence
Length of sequence - 45488 bp
Number of predicted genes - 35, with homology - 33
Number of transcription units - 21, operones - 9 average op.length - 2.6
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Op 1 . - CDS 75 - 1622 1316 ## COG3119 Arylsulfatase A and related enzymes
- Term 1649 - 1689 6.1
2 1 Op 2 . - CDS 1705 - 2868 1219 ## COG0006 Xaa-Pro aminopeptidase
- Prom 2888 - 2947 7.7
- Term 3017 - 3065 13.0
3 2 Tu 1 . - CDS 3101 - 4438 1508 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase
- Prom 4527 - 4586 4.2
- Term 4568 - 4630 18.2
4 3 Tu 1 . - CDS 4655 - 6766 1962 ## COG1505 Serine proteases of the peptidase family S9A
- Prom 6808 - 6867 5.8
+ Prom 6742 - 6801 4.8
5 4 Tu 1 . + CDS 6938 - 9907 2881 ## BT_1972 phosphoenolpyruvate synthase/pyruvate phosphate dikinase
+ Term 10095 - 10165 14.5
6 5 Tu 1 . + CDS 10320 - 11309 912 ## BT_1798 hypothetical protein
+ Term 11331 - 11385 7.1
+ Prom 11406 - 11465 10.2
7 6 Tu 1 . + CDS 11514 - 12719 577 ## BVU_0249 hypothetical protein
+ Term 12765 - 12802 5.3
- Term 12742 - 12799 10.3
8 7 Op 1 . - CDS 12828 - 13613 547 ## BVU_0248 hypothetical protein
9 7 Op 2 . - CDS 13631 - 14770 809 ## BVU_0247 hypothetical protein
- Prom 14799 - 14858 3.6
10 8 Tu 1 . - CDS 14910 - 15086 60 ##
- Prom 15134 - 15193 7.6
- Term 15541 - 15585 9.4
11 9 Tu 1 . - CDS 15621 - 16955 1479 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase
- Prom 17075 - 17134 6.2
- Term 17134 - 17180 11.1
12 10 Op 1 . - CDS 17203 - 17379 132 ## BF3606 hypothetical protein
13 10 Op 2 . - CDS 17376 - 19667 2777 ## COG0281 Malic enzyme
14 10 Op 3 . - CDS 19696 - 19812 72 ##
- Prom 19851 - 19910 5.2
+ Prom 19738 - 19797 6.3
15 11 Op 1 6/0.000 + CDS 19937 - 20503 420 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
+ Prom 20512 - 20571 3.8
16 11 Op 2 . + CDS 20601 - 21596 622 ## COG3712 Fe2+-dicitrate sensor, membrane component
+ Prom 21746 - 21805 4.5
17 12 Op 1 .
+ CDS 21825 - 25067 2474 ## Coch_0443 TonB-dependent receptor plug
18 12 Op 2 . + CDS 25079 - 26554 1477 ## Dfer_5600 RagB/SusD domain protein
19 12 Op 3 . + CDS 26591 - 27970 1164 ## BT_3313 hypothetical protein
20 12 Op 4 . + CDS 28009 - 30519 1700 ## COG3525 N-acetyl-beta-hexosaminidase
21 12 Op 5 . + CDS 30541 - 32169 1241 ## COG3525 N-acetyl-beta-hexosaminidase
22 12 Op 6 . + CDS 32191 - 33306 668 ## COG4299 Uncharacterized conserved protein
+ Term 33436 - 33480 -0.7
23 13 Tu 1 . - CDS 33427 - 35040 1198 ## PRU_0502 hypothetical protein
- Prom 35060 - 35119 8.0
24 14 Op 1 . - CDS 35148 - 36182 1112 ## COG1566 Multidrug resistance efflux pump
25 14 Op 2 . - CDS 36255 - 37574 1341 ## PRU_0500 outer membrane efflux protein
- Prom 37764 - 37823 7.3
- Term 38056 - 38097 -0.4
26 15 Tu 1 . - CDS 38155 - 38538 120 ## Fisuc_2409 GCN5-related N-acetyltransferase
- Term 38584 - 38629 6.1
27 16 Tu 1 . - CDS 38676 - 38834 122 ## gi|237714038|ref|ZP_04544519.1| predicted protein
28 17 Op 1 . - CDS 38944 - 40380 1191 ## BDI_2528 hypothetical protein
29 17 Op 2 . - CDS 40384 - 40563 83 ## gi|293371128|ref|ZP_06617665.1| hypothetical protein CUY_2977
- Prom 40640 - 40699 3.8
30 18 Tu 1 . - CDS 40723 - 42432 397 ## BF3423 hypothetical protein
- Prom 42483 - 42542 7.4
+ Prom 42834 - 42893 7.0
31 19 Op 1 . + CDS 42915 - 43628 505 ## COG4804 Uncharacterized conserved protein
32 19 Op 2 . + CDS 43622 - 43894 313 ## BF3544 multidrug efflux pump channel protein
+ Term 43976 - 44029 8.6
- Term 43968 - 44013 4.3
33 20 Op 1 . - CDS 44021 - 44560 306 ## gi|294809844|ref|ZP_06768524.1| hypothetical protein CW3_2895
34 20 Op 2 . - CDS 44587 - 44781 192 ## gi|237714045|ref|ZP_04544526.1| predicted protein
- Prom 44828 - 44887 3.9
- Term 44825 - 44860 -0.7
35 21 Tu 1 .
- CDS 44930 - 45238 353 ## BT_2030 hypothetical protein - Prom 45373 - 45432 4.9 Predicted protein(s) >gi|222159247|gb|ACAB01000112.1| GENE 1 75 - 1622 1316 515 aa, chain - ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 1 494 1 467 497 180 28.0 6e-45 MKQPFILTLMSTVACAATATNKSPENTTPPQSNTPPNILFILCDDMGYGDLGCYGQPFIR TPHIDTMANEGMRFTQAYAGSPVSAPSRASFMTGQHSGHCEVRGNKEYWKNVPIVLYGQN QEYSIVGQHPYDSDHVILPEIMKDNGYTTGMFGKWAGGYEGSRSTPDKRGIDEYYGYICQ FQAHLHYPNFLNRYSKALGDTGVVRVVMDENIKYPMYGPDYQKRPQYSADMIHQKALEWL DQQDDQQPFFGILTYTLPHAELVQPEDSILNEYKAKFNPDKEFKGSEGSRYNAITHTHAQ FAGMITRLDYYVGEVLKKLKEKGLDENTLVIFSSDNGPHEEGGADPTFFGRDGKLRGLKR QCHEGGIRIPFIARWPGHIPAGKVNDHICAFYDLMPTFCEVIGIKDYEKKYRNKEKEVDY FDGISFAPTLLGKKKQKKHDFLYWEFDETDQIAVRMDDWKMVVKKGTPFLYNLKTDIHED TDIALQHPDVVEKMKAIIFEQHTPNPYFSVTLPKQ >gi|222159247|gb|ACAB01000112.1| GENE 2 1705 - 2868 1219 387 aa, chain - ## HITS:1 COG:MA4232 KEGG:ns NR:ns ## COG: MA4232 COG0006 # Protein_GI_number: 20093022 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Methanosarcina acetivorans str.C2A # 23 384 20 385 388 173 32.0 4e-43 MLQPELKFRRDKIRSLMVSQGIDAALITCNANLIYTYGCVVSGYLYLPLHSPALLFFKRP NNITGEHSFSIRKPEQIVDLLKEQGLPMPTKLMLEGDELPYTEYCRLASLFPEAEVVNGT PLIRQARSVKTPVEIEMFRRSGIAHAKAYEQIPSVYRPGMTDIEFSIEIERLMRLQGCLG IFRVFGRSMEIFMGSVLTGDNAGYPSPYDFALGGQGLDPALPGGANKTPLKEGQSVMVDL GGNFNGYMNDMSRVFSIGKLPEEAYTAHQVCLDIQTKIASIAKPGIACEVLYDIAVEIAK EAGFADKFMGTGQQAKFIGHGIGLEINEAPVLAPRMKQQLEPGMVFALEPKIVIPSVGPV GIENSWVVTNEGIEKLTNCNEEIIELS >gi|222159247|gb|ACAB01000112.1| GENE 3 3101 - 4438 1508 445 aa, chain - ## HITS:1 COG:PA4588 KEGG:ns NR:ns ## COG: PA4588 COG0334 # Protein_GI_number: 15599784 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Pseudomonas aeruginosa # 2 445 
4 445 445 543 59.0 1e-154 MNIQKIMSSLEAKHPGESEYLQAVKEVLLSIEDIYNQHPEFEKAKIIERLVEPDRIFTFR VTWVDDKGEVQTNLGYRVQFNNAIGPYKGGIRFHASVNLSILKFLGFEQTFKNALTTLPM GGGKGGSDFSPRGKSDAEIMRFCQAFMLELWRHLGPDMDVPAGDIGVGGREVGYMFGMYK KLTREFTGTFTGKGLEFGGSLIRPEATGFGGLYFVNQMLQAKGIDIKGKTVAISGFGNVA WGAATKATELGAKVITISGPDGYIYDPDGISGEKIDYMLELRSSGNDIVAPYAEQYPNAT FVEGKRPWEVKADIALPCATQNELNGEDAQNLIKNDVLCVGEISNMGCTPEAIDLFIEHK TMYAPGKAVNAGGVATSGLEMSQNAMHLSWSAAEVDEKLHAIMHGIHAQCVKYGTEPDGY INYVKGANIAGFMKVAHAMMGQGVI >gi|222159247|gb|ACAB01000112.1| GENE 4 4655 - 6766 1962 703 aa, chain - ## HITS:1 COG:all2533 KEGG:ns NR:ns ## COG: all2533 COG1505 # Protein_GI_number: 17230025 # Func_class: E Amino acid transport and metabolism # Function: Serine proteases of the peptidase family S9A # Organism: Nostoc sp. PCC 7120 # 20 700 5 688 689 699 51.0 0 MKKATLLMSGIMVMSCAPQQKKLVYPETAKVDTVDVYFGTQVPDPYRWLENDTSAATTAW VEAQNKVTNEYLSQIPFRENLLKRLTTLADYEKISAPIKKHGKYYFSKNDGLQNQSVFYV QDSLDGEPRVFLDPNKLSDDGTVALTGLYFSNDGKYTAYSISRSGSDWSEIYVMDTESGK LLEDHIEWAKFTGAAWQGDGFYYSAYDAPSKGKEFSNVNENHKIYYHKIGEPQSKDKLIY QNPAYPKRFYTSSTSEDERILFLTESGAGRGNNLFIRDLKKPNSPFIQLTSDLDYQYYPI EVIGDQIYIYTNYGAPKNRIMVADINRPKLEDWKELVPESEAVLSNAEVIGGKLFLTYDK DASNHAYVYGLDGKQIQEIQLPSLGSVGFSGNKDDKECFFGFTSFTIPGATYKYDMDQNT YELYRAPKVQFNSDDFVTEQVFFASKDGVKIPMFLTYKKDLKKDGKNPVFLYGYGGFGIS LNPGFSATRIPFLENGGIYAQVNLRGGSEYGEDWHVAGTKMQKQNVFDDFISAAEYLIDQ KYTNKDKIAIVGGSNGGLLVGACMTQRPDLFRVAIPQVGVMDMLRYHKFTIGWNWASDYG TSEDSKEMFEYLKGYSPLHNLKPGTKYPATLVTTADHDDRVVPAHSFKFAATLQADNDGT NPTLIRIDSKAGHGAGKPMAKVLEEQADIYGFIMYNLGMKPKF >gi|222159247|gb|ACAB01000112.1| GENE 5 6938 - 9907 2881 989 aa, chain + ## HITS:1 COG:no KEGG:BT_1972 NR:ns ## KEGG: BT_1972 # Name: not_defined # Def: phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 987 1 987 990 1948 96.0 0 MLSKYKLNQLYFKDTQFANLMTRRIFNVLLIANPYDAFMLEDDGRIDEKIFNEYTSLSLR YPPRFSQVSTEEEALAQLENMSFDLVICMPSTGDNDSFDIGRHIKEKYEHIPIVILTPFS 
HGITKRIINEDLSAFEYVFCWLGNTDLLVSIIKLIEDKMNLEHDVQEVGVQLILLVEDGI RFYSSILPNLYKFVLKQSQEFSTEALNAHQRTLRMRGRPKIVLARTYQEAMEIYHKYQNN ILGVITDVRFPKVERGEKDGLAGIKLCAEIRKNDPFVPLIIQSSESENSSYAAKYGASFI DKNSKKMDVDLRRIVSDNFGFGDFIFRNPDTGEEIARVRNLKELQNILFAVPAESFLYHI SRNHVSRWFYSRAMFPVAEFLKPITWNSLQDVDAHRKIIFEAIVKYRKMKNQGVVAVFKR DRFDRYSNFARIGDGSLGGKGRGLAFIDNMVKRHPEFEEFENARIAIPKTVVLCTDVFDE FMDTNNLYQIALSDADDATILKYFLKAKLPDRLVEDFFTFFDVVKSPIAIRSSSLLEDSH YQPFAGIYNTYMIPYLDDRYEMLRMLSDAIKGVYASVYFRDSKAYMQATSNVIDQEKMAV ILQEVVGNQYGDRYYPSMSGVARSLNYYPLGDEKAEEGTVNLALGLGKYIVDGGMTLRFS PYHPNQVLQTSEMEIALKETQTRFYALDLKNAGHDFSIDDGFNLLKLHVKEAENDGALRY IASTYDPYDQIIRDGLYPGGRKVITFANILQHDVFPLARILQLVLKYGEQEMRRPVEIEF AATLSREHDKSGTFYLLQIRPIVDSKEMLDEDLNEIPDEDVILRSYNSLGHGIMNDIYDV VYVKTDNYSASNNQAIAWEIEKINQQFLNEGKNYVLVGPGRWGSSDTWLGIPVKWPHISA ARVIVEAGLTNYRVDPSQGTHFFQNLTSFGVGYFTINAFMNDGVYNQDFLNAQPAVEETK YLRHVRFEKPMVVKMDGKKKQGVVLMPEA >gi|222159247|gb|ACAB01000112.1| GENE 6 10320 - 11309 912 329 aa, chain + ## HITS:1 COG:no KEGG:BT_1798 NR:ns ## KEGG: BT_1798 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 46 263 159 399 448 84 28.0 4e-15 MRKNLFYYLFTVLCTATLFISCSDDEDTPNPVNPLVGVYSLSDYETADFVVAEGDDPIKN FPKAGPLYAKWDAKKEDPFTVAASGMFRMMGATMLPQVLNTIEMSEDGNIIASYVDKPTL QGTDKLMNWAMAGLMGGMFPEKSEITSLAATSGFVNSPAGLATWSESNGKVVVKLNITNI LGAALGGQDASSLETIINQVLTGEPAVLKELLKSLFQIDLANVTDASIRQLQGWALNGIP MNKKEESGKTYLYLDKSAFDTFMILRDTGEKDDYGAPIMKNDLLYVWEALVSANLIPAEA AQAVILIQIISGYWSETKTFDIGLDLVKQ >gi|222159247|gb|ACAB01000112.1| GENE 7 11514 - 12719 577 401 aa, chain + ## HITS:1 COG:no KEGG:BVU_0249 NR:ns ## KEGG: BVU_0249 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 388 1 399 401 202 38.0 2e-50 MKKELFCYLTFLLCVAICFVSCSDKDETIVCPIKNFTFTDVSGLALTYSGKPMLGKKVEF IPDAADATKARLILSGALVDLSDILGSVAMNSRAGITDLLKFASPGVIPGEKTTELHVTL NIDGDMVTFSGEEEQNGGIIRYNGSATKSFLKLDLNVALPENGFADTSWNLVPQKSVEPV YYVWKTSANPIWGSILVRAIFTAKVVDNKSIPQLLSAILKEVRFLPDGNIQAVYKNGIAD 
AEWNDSDLNLAMYRMGEQENQIKLFLNLSQIMDTAMSVTTLSRASGISSVDIKELLKLIP MLSEGIPLYYEQDGNGGMKVYVGEDVLLPLLKAVKPLFEDEEFVNGLLEMLEAQAGDMAS LVKALKPVLKSLPKVIDGTTEVKFGIMFTALQPSAETNILK >gi|222159247|gb|ACAB01000112.1| GENE 8 12828 - 13613 547 261 aa, chain - ## HITS:1 COG:no KEGG:BVU_0248 NR:ns ## KEGG: BVU_0248 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 7 261 4 258 258 348 64.0 1e-94 MTQRNLYIAFIFTIISVYSVSAQTNRTKTLINAALHGWEYEVRAGFSIGGTSPIPLPAEI RSINSYNPSLAISLEGNMTKWIEPTKKWGIVTGIRLESRKMKTDATVKNYNMEIIASDGG RLKGNWTGKVSTNVNNSYLTLPVLAAYKINDRWHMKAGVYASYILEREFSGNVYDGYLRE ENPTGDKVSIEGDKSAKYDFSEELRHFQWGVQAGADWLAFKHLKVYADLTWGLNDIFKKD FTTISFAMYPIYLNIGFAYAF >gi|222159247|gb|ACAB01000112.1| GENE 9 13631 - 14770 809 379 aa, chain - ## HITS:1 COG:no KEGG:BVU_0247 NR:ns ## KEGG: BVU_0247 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 379 1 379 382 344 46.0 3e-93 MKFTNVLATTLLSLLSTACIQNEEPNTEADILKCILPKEIFEYDNTNYYEPYDESINAYP LYISVNSKATLTQLAPEFVLTEGATIEPASGSIQDFTHPVRYTVTSADGKWHRTYSVSIS DDLQIIPEQFHFDKLSSKSSKYHIFYEENKAKGEFMEWASANQGYMLVSGNSPATEYPTS QADGGIENSKCAKLVTKSTGGLGSLVGAPLAAGNLFIGTFNFNSSVSDPLKSTAFGRIFH KKPLRLKGYYKYKAGSTYMDGTNEIPEQKDNFIVYGIFYKTDETLRTMDGYLAINEFKDP RMIALALLPEKERKETDQWTEFNIPFDYNKYGKEVDMEALSRGEYKISIVLSASKDGDRF KGAIGSTLYVDDLELVCEE >gi|222159247|gb|ACAB01000112.1| GENE 10 14910 - 15086 60 58 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MERFSTEKADFKYFFHLFHVLYYSSTSTYFLQLLYFSQLIHLLQKTKYKPNTMKRYET >gi|222159247|gb|ACAB01000112.1| GENE 11 15621 - 16955 1479 444 aa, chain - ## HITS:1 COG:PA4588 KEGG:ns NR:ns ## COG: PA4588 COG0334 # Protein_GI_number: 15599784 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Pseudomonas aeruginosa # 7 444 9 445 445 582 62.0 1e-166 MNAAKVLDDLKRRFPNEPEYHQAVEEVLSTIEEEYNKHPEFDKVNLIERLCIPDRVYQFR VTWMDDKGNIQTNMGYRVQHNNAIGPYKGGIRFHASVNLSILKFLAFEQTFKNSLTTLPM 
GGGKGGSDFSPRGKSNAEVMRFVQAFMLELWRHIGPETDVPAGDIGVGGREVGFMFGMYK KLAHEFTGTFTGKGREFGGSLIRPEATGYGNIYFLMEMLKTKGTDLKGKVCLVSGSGNVA QYTIEKVIELGGKVVTCSDSDGYIYDPDGIDREKLDYIMELKNLYRGRIREYAEKYGCKY VEGAKPWGEKCDIALPSATQNELNGDHARQLVANGCIAVSEGANMPSTPEAIKVFQDAKI LYAPGKAANAGGVSVSGLEMTQNSIKLSWSAEEVDEKLKSIMKNIHEACVQYGTEADGYV NYVKGANVAGFMKVAKAMMAQGIV >gi|222159247|gb|ACAB01000112.1| GENE 12 17203 - 17379 132 58 aa, chain - ## HITS:1 COG:no KEGG:BF3606 NR:ns ## KEGG: BF3606 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 58 1 58 58 74 84.0 1e-12 MKRPLSKSQIVCISLLWLALCYLVLTKAERIDGMTIVMLVISAALVFIPVYKSIKKNK >gi|222159247|gb|ACAB01000112.1| GENE 13 17376 - 19667 2777 763 aa, chain - ## HITS:1 COG:STM2472_1 KEGG:ns NR:ns ## COG: STM2472_1 COG0281 # Protein_GI_number: 16765792 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Salmonella typhimurium LT2 # 1 429 1 429 434 516 62.0 1e-146 MAKITKEAALLYHSQGKPGKIEVVPTKPYSTQTDLSLAYSPGVAEPCLEIEKNPQDAYKY TAKGNLVAVISNGTAVLGLGDIGALSGKPVMEGKGLLFKIYAGIDVFDIEVDEKDPEKFI AAVKAIAPTFGGINLEDIKAPECFEIERRLKEELDIPVMHDDQHGTAIISSAGLVNALQV AGKKIEDVKIVVNGAGASAVSCTKLYVSLGARLENIVMLDSKGVISKTRTDLNEQKRYFA TDRTDIHTLEEAIKGADVFLGLSKGNVLSQDMVRSMAPMPIVFALANPTPEISYEDAMAA RPDVLMATGRSDYPNQINNVIGFPYIFRGALDTQAKAINEEMKIAAVHAIANLAKQPVPD VVNAAYHVNNLSFGAEYFIPKPVDPRLITEVSCAVAKAAMESGVARTEIKDWDAYCVHLR ELMGYESKLTRQLYDTARRSPQRVVFAEGIHPNMLKAAVEAKAEGICHPILLGNDEAIGK LAEEMELSLEGIEIVNLRHPDESDRRERYSRILAEKRAREGFTYEEANDKMFERNYFGMM MVETGDADAFITGLYTRYSNTIKVAKEVIGIQPGFKHFGTMHILNSKKGTYFLADTLINR HPDTETLIDIAKLSDKTVRFFNHTPVISMLSYSNFGADTAGSPVKVHEAVAHMQEEYPEL AIDGEMQVNFAMNRELRDTKYPFTRLKGKDVNTLIFPNLSSANAGYKLLQAMDPDTEFIG PIQMGLNKPIHFTDFESSVRDIVNITAVAVIDAIVDKKKRGVK >gi|222159247|gb|ACAB01000112.1| GENE 14 19696 - 19812 72 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHILCISYHKFSHFAIKLIEITNQQQQFLTFIPTFAAN >gi|222159247|gb|ACAB01000112.1| GENE 15 19937 - 20503 420 188 aa, chain + ## HITS:1 COG:BH0263 KEGG:ns NR:ns ## COG: BH0263 
COG1595 # Protein_GI_number: 15612826 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus halodurans # 2 174 4 184 187 73 26.0 3e-13 MITTQQLQLLKEGNKNAFEALYRAYNARIYSFVLSMIGDAGVAKDITQDIFLQIWEKRLN IDLEGNFDGYLFKISQNMVYHYVRRELLLQNYVDKLSNESADESVEIDEELDYLFLEEYI LKLLEELPPARREVFMLYWKSGLNYREIAEQLDISEKTVATQVHRSLDFLRDKLGTIAFS VSLFLHHI >gi|222159247|gb|ACAB01000112.1| GENE 16 20601 - 21596 622 331 aa, chain + ## HITS:1 COG:AGpAbx251 KEGG:ns NR:ns ## COG: AGpAbx251 COG3712 # Protein_GI_number: 16119537 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 122 295 111 278 311 70 30.0 5e-12 MEQHDYIVLLEKFFKKEASSEERRILIEWIRKPGIRDEFNSLCEQMWKEAAVEIDKTVEE EMWNHLQRNLDEPKILYPKKRRWQPIIYKVAATILLPVCLGLATYFGVGHINKVSQDPFM VAVDYGQKANLTLPDGTKVWLNSATHLSYDAEYNKSDRKIYLDGEAYFEVAKNKDKRFIV CCNDLEIEALGTTFDVKGYGDDLSVTTLLAEGSVRVSNKTNTTLLKPGEKVEYHKNKQTF TKSPISDLREIDFWRNNMLIFNSASLAEIATTLERMYGVKVVFDSEKLKNVPFSGTIRNS SLHNVFYIISLTYPLTYKLEGDTVKIGSSIN >gi|222159247|gb|ACAB01000112.1| GENE 17 21825 - 25067 2474 1080 aa, chain + ## HITS:1 COG:no KEGG:Coch_0443 NR:ns ## KEGG: Coch_0443 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: C.ochracea # Pathway: not_defined # 117 1080 29 989 989 904 49.0 0 MRKQLHIILRLVMFCYILLFVSQEVSAQITFSVKNQTIRQTIRVIEKKADYSFFYTDKLP GLDQKISLKVDNESIDSVLEKVFKGSPISYKIESGKQVVLTLKTEQNASQKGDKKTITGV VADRRGEPLIGVTVRVEGENIGTATDIDGRFSLEAPIGAKISFSYIGYVSQSHTVNKQNI YNVSLSEDSQVLEEVVVVGYGSMKRKDITTAVSVVSTADIEERPIMTAAQAIQGKAAGIQ VVQPSGMPGSGVTIRVRGATSVQASNEPLYVVDGLPSDDISNISPNDIESMQILKDASSA AIYGARAANGVVLITTKRGKIGAPQVKMSAYVGFSKLGKKIDALNTEQYKDLMKDLKAVS DVAPNIPENETRYVDWTDLFFGTGVNQNYQLSVANGTEKLQYFVSGGYSDEQGIVEKAHF NRYNFRANLDSEQTKWLKMALNFAYSHTGGQWVNESRSSLRAGSILSVVNTPPFMQKWNP YDPNEYDEQAYGARILNPLAANAADSNTNTDHINGSLGFTVDIYKGLKFKTTFGIELTNE 
HWDYYLDPISTSDGRGTKGRVEESFSRNFEWLFENLLTYDCSFNKHNLSILGGATQQRAQ YNASWMAGFDLAESYPDIHSISAANQLDKDACGSSASAWTLASFLGRIAYNYDSRYLLTV NFRADGSSRFAPGHRWGTFPSVSAGWRISSEKFMQPLQNIVTDLKLRAGWGMNGNQGGFG NYAYMASMSASKLPVSEGNLYPGLAIRPGSAANKELTWEKTSQWNLGLDLTMFDSRLSFS VDAYYKKTTDLLLTVSLPENVVPSSVTRNDGEMVNKGMEFTVSSQNLKGNFQWNTDFNIS FNRNKLTKLGLNKVYYYAEMYESKEKAVILKEGLPLGTFFGYISEGVDPETGNIIYRDLN GNGFIDPEDRTTLGNAQPKFIYGMTNTFSYAGFDLSVFLQGSQGNKIFNASRIDMEGMTD FRNQSVRVLDRWKRPGMITNVPRVGNTENNHNSSRFVEDGSYLRLKTVTLSYNFPKKWLN KIHLSRLQAYVTGQNLFTLTKYKGYDPEVNAFGGDSVAQGVDYGTYPQSRAVIFGLNVEF >gi|222159247|gb|ACAB01000112.1| GENE 18 25079 - 26554 1477 491 aa, chain + ## HITS:1 COG:no KEGG:Dfer_5600 NR:ns ## KEGG: Dfer_5600 # Name: not_defined # Def: RagB/SusD domain protein # Organism: D.fermentans # Pathway: not_defined # 6 491 3 488 488 316 40.0 1e-84 MKNIIKLAYVVLCVLVVSCSNFLDEKPQSDFMQEGTGTEDQESKYGSLADAQAELQGAYE SFKADIFQSENYTIGDVQSDNCYIGGDGVAEQEFDLLKLTSTNYKVELVWSQYYSMAGTA TSVIENTKMMDPASTVAEERNRVIAEAKFLRAWAYFDIVRLWGDAPMVLDLIPTITAENL DKWYPVMYPERSVADKIYDQILDDLNEENTIRYLVSKNKGVFQATKGAAYALRAKVLATR GEKSTRDYAKVVEACDKVIAEGYTLVGNFDELWQPDKKFSSESIFEVYYTSDAPNWAYWV LLKEDDGSVTWRRYCTPTHDLVAKFDKEKDTRYASSILWKSVPYDTYWPADSYPLSYKIR EKTSNIILMRLADILLLKAEALVELDRTPEAIRIVNGIRERAGFAPSSLDENMGQARGRL AVENERQLELYMEGQRWFDLVRNNRLLEVMQKHKDKDGRLLFAGLQAFRQLWPIPQGEKD KNTNLTQNEGY >gi|222159247|gb|ACAB01000112.1| GENE 19 26591 - 27970 1164 459 aa, chain + ## HITS:1 COG:no KEGG:BT_3313 NR:ns ## KEGG: BT_3313 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 51 457 44 438 667 84 24.0 1e-14 MKHNNLFLWLLGLTVCWLASCSEEDGRVKYPYSCPELSGLQFSTTDQTPAADSLYFSVKI HDPETPLSTLEVKLMVGETLVSSQSIRTKGTDVSIVEKGIYIPFEADLEENQEAKVVLTA INVEGSEVTQTFDFHIVRPQIPEVLYLHYNGEVVEMTRSAENPYLYLTEVSGSEEGYPME LTGKISTTESLDDAELIWGAAEKSNTAVLIEASGPSFAFDYRDWFVEQITFDVMTFKLGV VGFQKNLKIKETELIASEGFFRAQISFTQGEEFEMSGFEDVEHAYNRDFFSYNPDNGKFT FLRKSGTWEIYYSSKFNYIWVARMSDTAPTCFWLVGHGFTCAPVWNEAYNSGGWNLEDIS 
QLGYIVPIGEQKYQTTVYLSNTHEWESFEIEIYSDLQWNKDKGMELQAGSLSGDTDGIEI SASNGITSGDGFVPGYYRLTFDTSQGVGKETLHIERLSD >gi|222159247|gb|ACAB01000112.1| GENE 20 28009 - 30519 1700 836 aa, chain + ## HITS:1 COG:VC2217 KEGG:ns NR:ns ## COG: VC2217 COG3525 # Protein_GI_number: 15642215 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Vibrio cholerae # 63 809 81 858 883 309 29.0 2e-83 MRNILYMIGGSLCLLLFMTGCKSADGAKAPVSLTWEMGAAEVQPGYYENSFILKNISDVP LGKDWIIYYSQLPREILQDESASVKVEVVNANFFRMYPAENFQPLAPGDSLIVTFCCTNG LKKLSYAPEGTYWVSQAGGKQGTPLPVELTIQPLQGMETEDWYPAPDKIYASNLALETTA KLQQTDIFPSVKEAFPAIGKENVIIENKVRLTFHPDFANEAGLLKEKLETLYGLEVISEA PVTVHLDYLPQQETATNDEYYRMDTGNSLINISAPTSHGIFNGTQTLLSLLKGQEKPFRL EAVSIRDYPDLPYRGQMLDIARNFTTADQLKKLIDAISSYKLNVLHFHFSDDEGWRLQIP GLEELTSVGARRGHTTDELECLYPGYDGNYDPSAPTSGNGYYTREEFIDLLRYAAQRHVR VIPEIESPGHARAAIVSMKARYHKYINTDPEKAVEYLLSDAQDTSRYVSAQSYTDNVMNV ALPSTYRFMEKVIRELMVMYEEAEVPLTTIHLGGDEVPDGAWMGSPVCRAFMDEHGMTSA HELSEYYITKMADYLQQYHVQFSGWQEAALGHSEATDLHLNRLAAGVYCWNTVPEWEADE IPYQVANKGYPVILCNVNNFYLDLAYDAHPDERGLSWAGYVDESKGFSMLPYHIYRSSRT DMAGNPVDLGIAERGKTVLTASGKERIQGVQAQLFAETIRDFKWVEYYTFPKILGLVERG WNAFPAWSMLAREKEQQAFNKALALFYSKASEKEMPHWASRNINFRLPHPGLCLKEGKLY ANTPIRGGEIRYTTDGAEPTLDSALWEAPIACDASVVKAKLFYLNKESVTSTLKVN >gi|222159247|gb|ACAB01000112.1| GENE 21 30541 - 32169 1241 542 aa, chain + ## HITS:1 COG:VC0613 KEGG:ns NR:ns ## COG: VC0613 COG3525 # Protein_GI_number: 15640633 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Vibrio cholerae # 34 539 130 635 637 253 30.0 8e-67 MNYDKIKRSGILFLLGIGAITSLSCNDNDNGGYPERVPTRLSVMPLPERVDYKESVVTLP QNVTVSQNIPASTSQLLKSTLEEKLSLSASDASNDHAFIRVKQESDLAKEAYRLTVTKEG ACIYYSTETGLLWGIQTLRQALEQANFFTSGNSKYLPMVDIKDAPKYDWRGFHIDVVRHM FTVDYLKKVIDCLSFYKINKLHLHLTDDQGWRIEVKKYPLLTQEGSWRDFDEYDKRCVEL SQQDYNYEIDPRFVRNGSQYGGHYTQEEMKGLVSYALERGIDIVPEIDMPGHFSAAIKVY PELSCTGEAGWGEEFSYPICPSRPENYQFVQSIIDEMVEIFPSEYFHIGADEVEKDNWEQ 
CEVCQQLMQQEGYQKVDELQNRFVKIMTNYVKGKGKKVMGWDDAFLEKDPQDLIYTYWRD WLPDQPGKITQKGYPIVFMEWSRFYLSATPSDEGLSSLYNFEFEPQFPSIVKQNVLGFQA CVWTEMIPNERKFGQHVFPSLQAFSELAWGSDRNWIDFTNRLKWHVKWLNENGFYFTKPG FI >gi|222159247|gb|ACAB01000112.1| GENE 22 32191 - 33306 668 371 aa, chain + ## HITS:1 COG:all1887 KEGG:ns NR:ns ## COG: all1887 COG4299 # Protein_GI_number: 17229379 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. PCC 7120 # 5 371 2 375 375 214 37.0 3e-55 MKSERLLSLDVLRGITIVGMILVNNPGTWESVYAPLRHAEWNGLTPTDLVFPFFMFIMGV SMSFALSRFDHHFSRSFITKLVRRTVILFLLGLFLSWFSLVCAGVEQPFSQIRILGVLQR LALAYFFGSLLIMSVRRPANLAWISAIILIGYIVLLALGNGFELSEQNIIAVTDRTLFGE THLYREWLPDGGRIFFDPEGLLSTLPCIAQVIIGYFCGNILREKTEIHHRLLQISILGIV LLFAGWLLSYGCPLNKKVWSPTFVLVTCGFASLFLVFLTWLIDIRKKQKWAYPFHVFGTN PLFIYVVAGVLATLLEVITVSGISLQGKVYTSILLILPDAYLASLIYGLLFIGFNYLIVW VLYKKRIFIKI >gi|222159247|gb|ACAB01000112.1| GENE 23 33427 - 35040 1198 537 aa, chain - ## HITS:1 COG:no KEGG:PRU_0502 NR:ns ## KEGG: PRU_0502 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 10 520 3 517 526 360 39.0 9e-98 MSAPVLNGPFVMPMFRSFVPRKMQPWIYLFIAVTFQLSGGVYLGALNQMIGSMALMREDI LMCMYANLAGMAIYFPLLFRMKFRFTNKTLLTSAALGVLLCNLIAPHITFLPLLWLICFI EGMCKIQGTFECMSNIQLWMTPKRDFTVFFPWLHIVILGSIQLSDLITTYLMYHYHWTYM NLFIAGLMLIVLLIQFTCVKHFRFMRKFPLFGIDWLGGVLWAALLAEIAFLFNYGDWYDW WNSPVIRQLTIVIFITLAICIWRMMTIRHPFLEPKMWTYRHFWSLLGLVTLVEAFLATEH VLEEVFYEEVMKYEELVSSQLDWFAIIGIVCGCVFSYWWMHIKQYNYVRLIIVGFLGLIG YLIGFYLTLSTDIHISQLYLPTVCRGFAYAVLSATFMVCLEETMTFQHFFQGLSVFNMLH MVVGGVLGCAVYAQGLAYYVPDNLARYGSAIDHVSFSSSPFNIGHYMEEFINQMMEISIK QIYGWVAYACIFLFLLLLLYDFPVRRSLKSMPSWREVAREVKNTFWRTTHTPSGKEK >gi|222159247|gb|ACAB01000112.1| GENE 24 35148 - 36182 1112 344 aa, chain - ## HITS:1 COG:PA3136 KEGG:ns NR:ns ## COG: PA3136 COG1566 # Protein_GI_number: 15598332 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Pseudomonas aeruginosa # 35 343 44 351 355 160 31.0 4e-39 
MIARKTQKIIYNIVIICLLIGSVTYVCSRFIHLGNIEYTDNAQVRQHITPINTRVPGFIK KICFDEYQQVHKGDTLLIIEDTEFRLRMAQAEADLANATAGRQATTIGIATTQNNLSVSD ASIEEVRVQMENARRELNRFEKLLQEDAVTKQQYDNVHTAYEAAKARYEQVSRAKLSTSL VKSEQTHRLGQNEAGVRLAEAALELARLNLSYTVIIAPCDGTTGKKEILEGQLVQPGQTM VDIVDSSDLWVIANYRETQLPNIKEGAEVEITADAVPNITFKGVVESISDATGAAFSMIP QDNATGNFVKVEQRIPVRISLKGNKPEDLKRMRAGFNVECEVKY >gi|222159247|gb|ACAB01000112.1| GENE 25 36255 - 37574 1341 439 aa, chain - ## HITS:1 COG:no KEGG:PRU_0500 NR:ns ## KEGG: PRU_0500 # Name: not_defined # Def: outer membrane efflux protein # Organism: P.ruminicola # Pathway: not_defined # 1 439 1 460 460 414 49.0 1e-114 MSKKRWLIVLLYAVLCSQRLFAQTLDRRVLGINEMFRLADENSQSIQTYKTGEEAAKEAL KAAKSQRLPDIEASLSFSYLGDGYLWDRDFKNGQNIPMPHFGNNFALEAQQVIYAGGAIN SSITLAELGQQMAALDWQKNRQEIRFLLTGYYLDLYKLNNQLQVLQKNLDLTEQVIRNME SRRTQGTALKNDITRYELQKETLKLQLAKVQDACKIMNHQLVTTLHLPAGTEIVPDSTLL DEEVKALAENDWQMMAAQSNVGLQQAQLSVQMSEQKVKLERSELLPKIALVAGEHLDSPI TIEVPVLDNNFNYWYVGVGIKYNLSSLFKNNKKVRQAKLNARRAQEEYLLAQEQIENGVQ ANYVNFLTSFTDLRTQEKSVELADQNYNVISNRYKNDLALLTDMLDASNMKLSADLGLVN ARINLIYSYYKMKYITHTL >gi|222159247|gb|ACAB01000112.1| GENE 26 38155 - 38538 120 127 aa, chain - ## HITS:1 COG:no KEGG:Fisuc_2409 NR:ns ## KEGG: Fisuc_2409 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: F.succinogenes # Pathway: not_defined # 1 126 58 186 187 99 44.0 4e-20 MEDDDETIAFFSLLNDKISQTTIDKNYWRKLRKAFPHRKHLGSYPAVKIGRLGVATSHRG EGIGTDIISAVKQMLINRQSVSATRFLTVDAYLSAVPFYERNGFHPLINQPCEDTLPMYF DLIQLIE >gi|222159247|gb|ACAB01000112.1| GENE 27 38676 - 38834 122 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237714038|ref|ZP_04544519.1| ## NR: gi|237714038|ref|ZP_04544519.1| predicted protein [Bacteroides sp. 
D1] # 1 52 13 64 64 72 100.0 8e-12 MARPIKETPILFGEDAKRFLASMQNVKPASQQEKQRVKAAYEKLKKIATFMM >gi|222159247|gb|ACAB01000112.1| GENE 28 38944 - 40380 1191 478 aa, chain - ## HITS:1 COG:no KEGG:BDI_2528 NR:ns ## KEGG: BDI_2528 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 478 1 471 472 552 55.0 1e-155 MKKVYNRNMLFAAILCLQLNVVAAFGQTVVKGSVKDNTGKPVSGVVVTDGAHFKTTDAQG NYVLNTDPARYPMVYISTPADYELPSKEGVADGFYQYLDAKKSENQCNFVLTKRQKPADE FVYIAISDPQVRNEKQLNRFRTETVPDLKQTADSLKNFEIVGMGLGDLVWDAMNLYAPYR QAVSNLGITMFQLMGNHDFNLLYKSMTQTDHPADGYGEQNYYHSFGPANYSFNIGKIHVI AMKDIDYDGNKKYTERFTPEDLDWLRKDLSYVPKGSTVFLNVHAPVANNTVSAGGNARNA NALFQLLRPYQVHIFSGHTHFYENQQPAPTIYEHNIGAACGAWWAGHVNRCGAPNGYLVV EVKGDDVKWRYKATGCSPDYQFRLYKPGEFESQKDYVVANIWDWDRTYTINWYEDGVLKG VMQAFDDEDQDYINMVKGKKTGYHTRHLFRAQPAKGTKSVKVMVKNRFGEVFTKEVRL >gi|222159247|gb|ACAB01000112.1| GENE 29 40384 - 40563 83 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293371128|ref|ZP_06617665.1| ## NR: gi|293371128|ref|ZP_06617665.1| hypothetical protein CUY_2977 [Bacteroides ovatus SD CMC 3f] # 1 59 121 179 179 106 93.0 5e-22 MKKEEVSREVATSNMAHILDCANVMCFDDNSKSFEEAVKSADEPEEIIALFNQVTLRKL >gi|222159247|gb|ACAB01000112.1| GENE 30 40723 - 42432 397 569 aa, chain - ## HITS:1 COG:no KEGG:BF3423 NR:ns ## KEGG: BF3423 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 55 532 1 477 491 344 41.0 6e-93 MDTKNLFRLLFLVLGYCLTMTSCSDDSKTDWNLAPSLITRSVVTPDFDWENADWMPTPAG YARIPVPWIGAGSLSSFYETDILNDYKSFDGWEMLYNAFSSEASSIVNNPFFVLYNKYRG TMRIYLYITTPFVATSSYIQNGISVISAKETSILNFLGQPVLDAVAKDQKSYVQLQPAPI DGSMPLATNRWYMMEYELAYDQNLSSIPYNEIRFNWFLNYYSVSKIYLDGKMEGKLNAIL GQGAVDHSNKMGNILQSDELKSIGTGILAGIGLNSLERNKISDAKDGQSNNKLGLKNGIF NALYNGVSSALKGSTSGLPGLIVGFVSGIVGGSSNKGINPPVNYQLMTTISLSGTSTNSG AFPSMPISFWLPGTENVINATGYVPLYNKILGVVNFKGKPEIKSHEQYYEYQGVDPEYGD YINEVHRIVFDTSKDYSPYLEFNPEVKAIADIHITNQEIVLYYNKEVLCRSLKTSYNNEK YTYWEEGVYTNHWNTDWANFNKYAGDVCVRFSIKIQPKNGAPVSHIIKTFKLKENRTSEE 
LESMPNYLKPNYNDYNLEKDYHLILKPKK >gi|222159247|gb|ACAB01000112.1| GENE 31 42915 - 43628 505 237 aa, chain + ## HITS:1 COG:RC0367 KEGG:ns NR:ns ## COG: RC0367 COG4804 # Protein_GI_number: 15892290 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Rickettsia conorii # 9 154 9 141 159 99 33.0 7e-21 MEIQNSIFNYAALLKQVKARVALAQKKAIYSANEEMLSMYWDIGKLLCESQKQIGWGNNA LEQLANDLKNDYPKVKGFSPRNCRCMIQFHKEYNQELTIWQPVVAKLEMSNCVLPIKQLS WSHNITLIQRVKDLKARYWYMIQCLKSGWSRNFLIEAINQDYYHTYGALANNFDSTLPEI QAKQVKETLKDPYIFDMLTFTDEYNERDVELGLVKHIEKFLLEMLCKPLHKSSNKPC >gi|222159247|gb|ACAB01000112.1| GENE 32 43622 - 43894 313 90 aa, chain + ## HITS:1 COG:no KEGG:BF3544 NR:ns ## KEGG: BF3544 # Name: not_defined # Def: multidrug efflux pump channel protein # Organism: B.fragilis # Pathway: not_defined # 1 89 378 466 475 135 85.0 5e-31 MLNAGVEVNEALVQYQTAREKADYYDKQVASLQTAAKSTSLLMKHGNTTYLEVLTAQQTL LNAQLSQVANRFTEIQGVITLYQALGGGRM >gi|222159247|gb|ACAB01000112.1| GENE 33 44021 - 44560 306 179 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294809844|ref|ZP_06768524.1| ## NR: gi|294809844|ref|ZP_06768524.1| hypothetical protein CW3_2895 [Bacteroides xylanisolvens SD CC 1b] # 1 179 75 253 253 330 100.0 3e-89 MIFIQKASPNVGDAFDFAYIYKFYSNRKDICQRLKYIARVEVTNDIFAIKFYAARDKKLD NKYNRILKMHNYTETLRIFMTCASIVPLILQKFPLASFVINGAQSLDLESNKIEGRANNQ RFRLYKNMATQLFGKERFEHYEFIEISSYLMVNKKECSDIERKKDRIKETLLNLYDIDS >gi|222159247|gb|ACAB01000112.1| GENE 34 44587 - 44781 192 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237714045|ref|ZP_04544526.1| ## NR: gi|237714045|ref|ZP_04544526.1| predicted protein [Bacteroides sp. 
D1] # 1 64 1 64 64 89 100.0 6e-17
MGNSLKKQKEASLAANNTKKSKGERLGWKSISEYKTVSLKKIVGNGNRVNETNIPVNTSI
RLVP
>gi|222159247|gb|ACAB01000112.1| GENE 35 44930 - 45238 353 102 aa, chain - ## HITS:1 COG:no KEGG:BT_2030 NR:ns ## KEGG: BT_2030 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 102 187 303 304 117 58.0 2e-25
MSGNKILVRPILPFKARKAVFEKLEDIADVASMSPEDRERYDNSVKVYRDYLVTMDAAEQ
KGIKEGAQKAQLQIARNMKAKGIDNQSIAECTDLPLSMIEEL

Prediction of potential genes in microbial genomes
Time: Wed May 18 03:47:30 2011
Seq name: gi|222159246|gb|ACAB01000113.1| Bacteroides sp. D1 cont1.113, whole genome shotgun sequence
Length of sequence - 66477 bp
Number of predicted genes - 60, with homology - 58
Number of transcription units - 31, operones - 14 average op.length - 3.1

N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 2 - 368 158 ## CHU_3356 hypothetical protein
- Prom 463 - 522 5.4
- Term 423 - 463 -0.1
2 2 Op 1 . - CDS 686 - 1723 584 ## BT_2028 hypothetical protein
3 2 Op 2 . - CDS 1771 - 2214 475 ## COG3023 Negative regulator of beta-lactamase expression
4 2 Op 3 . - CDS 2211 - 2966 252 ## BT_2026 endonuclease
- Prom 2986 - 3045 6.2
- Term 3003 - 3044 -0.9
5 3 Op 1 . - CDS 3075 - 3278 149 ## gi|298479630|ref|ZP_06997830.1| hypothetical protein HMPREF0106_00054
6 3 Op 2 . - CDS 3296 - 3640 301 ## BF1826 hypothetical protein
7 4 Tu 1 . + CDS 3936 - 4403 352 ## gi|237714053|ref|ZP_04544534.1| conserved hypothetical protein
+ Term 4447 - 4485 5.5
8 5 Tu 1 . - CDS 4446 - 4670 135 ## BVU_1841 hypothetical protein
- Prom 4713 - 4772 13.5
+ Prom 5160 - 5219 7.8
9 6 Op 1 . + CDS 5303 - 8392 3527 ## BT_3240 hypothetical protein
10 6 Op 2 . + CDS 8430 - 9983 1348 ## BT_3241 hypothetical protein
11 6 Op 3 . + CDS 10004 - 10915 883 ## BT_3242 hypothetical protein
12 6 Op 4 . + CDS 10944 - 12248 1282 ## BT_3243 hypothetical protein
13 6 Op 5 . + CDS 12250 - 13425 1109 ## BT_3244 hypothetical protein
+ Term 13568 - 13605 -0.5
+ Prom 13604 - 13663 6.0
14 7 Op 1 . + CDS 13685 - 14881 1173 ## gi|237714061|ref|ZP_04544542.1| conserved hypothetical protein
+ Term 14893 - 14927 1.1
+ Prom 14900 - 14959 3.5
15 7 Op 2 . + CDS 14981 - 16780 1345 ## gi|237714062|ref|ZP_04544543.1| conserved hypothetical protein
16 7 Op 3 . + CDS 16801 - 17319 405 ## gi|237714063|ref|ZP_04544544.1| conserved hypothetical protein
+ Term 17328 - 17358 1.0
17 8 Tu 1 . - CDS 17538 - 17810 216 ## gi|237714064|ref|ZP_04544545.1| conserved hypothetical protein
- Prom 17848 - 17907 7.5
+ Prom 18069 - 18128 8.3
18 9 Op 1 . + CDS 18160 - 18672 529 ## BT_2021 putative non-specific DNA-binding protein
+ Prom 18686 - 18745 3.2
19 9 Op 2 . + CDS 18789 - 18878 113 ##
+ Term 19031 - 19072 7.1
+ Prom 19170 - 19229 6.2
20 10 Tu 1 . + CDS 19250 - 20110 620 ## BDI_2654 hypothetical protein
+ Term 20160 - 20205 -0.4
+ Prom 20340 - 20399 4.4
21 11 Op 1 12/0.000 + CDS 20598 - 23012 2354 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase
+ Term 23037 - 23094 8.7
+ Prom 23115 - 23174 5.8
22 11 Op 2 . + CDS 23207 - 23665 236 ## COG0602 Organic radical activating enzymes
+ Prom 23745 - 23804 2.9
23 12 Op 1 . + CDS 23901 - 25304 804 ## COG0477 Permeases of the major facilitator superfamily
24 12 Op 2 . + CDS 25350 - 25496 80 ## BF3697 hypothetical protein
+ Term 25526 - 25586 17.1
- Term 25514 - 25572 7.4
25 13 Op 1 17/0.000 - CDS 25605 - 26960 1363 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1
- Prom 26986 - 27045 4.2
26 13 Op 2 . - CDS 27085 - 28254 1368 ## COG0743 1-deoxy-D-xylulose 5-phosphate reductoisomerase
27 13 Op 3 . - CDS 28352 - 29212 708 ## COG0739 Membrane proteins related to metalloendopeptidases
- Prom 29236 - 29295 4.3
28 14 Op 1 . - CDS 29352 - 29903 582 ## BT_2004 16S rRNA-processing protein RimM
29 14 Op 2 . - CDS 29900 - 31204 1477 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase
- Prom 31226 - 31285 8.2
30 15 Op 1 . - CDS 31379 - 31990 775 ## BT_2006 hypothetical protein
31 15 Op 2 . - CDS 32020 - 32709 681 ## COG1214 Inactive homolog of metal-dependent proteases, putative molecular chaperone
- Prom 32852 - 32911 7.2
+ Prom 32714 - 32773 7.7
32 16 Op 1 8/0.000 + CDS 32799 - 33677 1131 ## COG1561 Uncharacterized stress-induced protein
+ Prom 33685 - 33744 4.2
33 16 Op 2 . + CDS 33850 - 34419 572 ## COG0194 Guanylate kinase
+ Prom 34472 - 34531 2.9
34 17 Tu 1 . + CDS 34628 - 35218 302 ## COG1057 Nicotinic acid mononucleotide adenylyltransferase
+ Term 35395 - 35433 4.1
35 18 Tu 1 . - CDS 35224 - 36513 1026 ## COG1373 Predicted ATPase (AAA+ superfamily)
- Prom 36562 - 36621 7.0
36 19 Tu 1 . - CDS 36710 - 37789 1025 ## COG1408 Predicted phosphohydrolases
- Prom 37964 - 38023 4.0
+ Prom 37779 - 37838 7.5
37 20 Tu 1 . + CDS 37974 - 38867 791 ## COG1575 1,4-dihydroxy-2-naphthoate octaprenyltransferase
+ Term 38988 - 39024 -0.4
- Term 38774 - 38829 14.1
38 21 Op 1 16/0.000 - CDS 38871 - 40007 1157 ## COG1088 dTDP-D-glucose 4,6-dehydratase
39 21 Op 2 . - CDS 40030 - 40899 942 ## COG1209 dTDP-glucose pyrophosphorylase
- Prom 40919 - 40978 3.7
40 21 Op 3 . - CDS 40999 - 41493 394 ## COG0622 Predicted phosphoesterase
- Prom 41530 - 41589 7.9
+ Prom 41458 - 41517 4.2
41 22 Tu 1 . + CDS 41549 - 43732 1859 ## COG0855 Polyphosphate kinase
+ Term 43940 - 44008 3.3
+ Prom 43966 - 44025 2.4
42 23 Tu 1 . + CDS 44130 - 46379 1804 ## BT_2020 putative phosphate/sulphate permeases
+ Term 46438 - 46498 5.2
+ Prom 46520 - 46579 4.8
43 24 Tu 1 . + CDS 46615 - 47121 355 ## YPTS_3418 hypothetical protein
+ Term 47223 - 47275 4.4
+ Prom 47143 - 47202 4.6
44 25 Tu 1 . + CDS 47405 - 48031 480 ## BT_4601 hypothetical protein
45 26 Tu 1 . + CDS 48148 - 48924 369 ## BT_1962 hypothetical protein
46 27 Tu 1 . + CDS 49354 - 50247 606 ## BF3036 tyrosine type site-specific recombinase
47 28 Op 1 . + CDS 51357 - 52274 793 ## BF3033 hypothetical protein
+ Term 52286 - 52324 4.0
48 28 Op 2 . + CDS 52339 - 54120 1429 ## BT_1956 putative cell surface protein
49 28 Op 3 . + CDS 54135 - 56225 1197 ## BT_1955 putative cell wall biogenesis protein
50 28 Op 4 . + CDS 56254 - 57354 727 ## BT_1954 putative surface layer protein
51 28 Op 5 . + CDS 57373 - 59451 1299 ## BT_1953 putative TonB-linked outer membrane receptor
52 28 Op 6 33/0.000 + CDS 59463 - 60602 955 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component
53 28 Op 7 35/0.000 + CDS 60603 - 61583 618 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component
54 28 Op 8 . + CDS 61580 - 62338 195 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34
55 28 Op 9 . + CDS 62335 - 62901 201 ## BF3025 hypothetical protein
+ Term 63019 - 63062 2.3
- Term 62680 - 62716 -0.5
56 29 Tu 1 . - CDS 62830 - 63102 122 ##
- Prom 63124 - 63183 3.9
57 30 Tu 1 . - CDS 63485 - 63739 159 ## BT_1948 hypothetical protein
- Prom 63799 - 63858 8.2
+ Prom 63737 - 63796 5.5
58 31 Op 1 . + CDS 63816 - 64232 224 ## BT_1947 hypothetical protein
59 31 Op 2 . + CDS 64262 - 64666 169 ## BT_1946 hypothetical protein
+ Term 64675 - 64703 -1.0
60 31 Op 3 . + CDS 64742 - 65530 524 ## BT_1945 conjugate transposon protein
+ Term 65684 - 65715 0.1

Predicted protein(s)
>gi|222159246|gb|ACAB01000113.1| GENE 1 2 - 368 158 122 aa, chain - ## HITS:1 COG:no KEGG:CHU_3356 NR:ns ## KEGG: CHU_3356 # Name: not_defined # Def: hypothetical protein # Organism: C.hutchinsonii # Pathway: not_defined # 5 106 2 102 110 93 50.0 2e-18
MEDKKKVIVYVDGFNFYYGLKSKKWKMCYWLDLVSFFNSFLKSYQELVEVNYFSARPTDA
GKHDRQDKLFQANKCNPKFNLILGKYLKKEIKCRYCGGIIHSFEEKETDVRIATKILSDA
YK
>gi|222159246|gb|ACAB01000113.1| GENE 2 686 - 1723 584 345 aa, chain - ## HITS:1 COG:no KEGG:BT_2028 NR:ns ## KEGG: BT_2028 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 132 1 132 335 244 89.0 3e-63
MARISKPGLDYFPLDVNFLQDRKVRRISCRHHAAGIAALTSLLCLIYKEKGYYISWNKDT
LFDIAQEACCEEEEMQAIIDDCLAVGLFDNLIYKEYGVLTSQAIQEQYHKIITDSRRKYK
LPLEHFWLIKEDFKNVTESHDFATTMPQTKEETDTDIKGKQEMDMENEIKSEIKKDKETE
RQGKCEKENERIMPPVPSNSVFQASPVVVEGLSLDKSIGREKTESTAETVSETTSGTAPG
TVLETAPKTISEGMQTGGGSEAILLLNMQTIGIRNEQTVKAIAALARRKELGGPGGILWK
ILSSQYRPTLLKKNEPGDYILWALNHPTEFENTYNGTLKKLIRGR
>gi|222159246|gb|ACAB01000113.1| GENE 3 1771 - 2214 475 147 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 43 141 1 98 116 104 48.0 4e-23
MRDTIIIHCSATRAGQDFTAADIDRWHRQRGFRSIGYHFVVRLDGTVEPGRDVALDGAHC
TGWNHRSIGICYIGGLDKNGRPADTRTEAQREALVRLVEDLRLVFPSLQQVIGHRDTSPD
LNGDGIISPNEYIKSCPCFDVKAEFIQ
>gi|222159246|gb|ACAB01000113.1| GENE 4 2211 - 2966 252 251 aa, chain - ## HITS:1 COG:no KEGG:BT_2026 NR:ns ## KEGG: BT_2026 # Name: not_defined # Def: endonuclease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 162 1 162 322 230 62.0 3e-59
MENWRFIEENPDYMISDHGRVLSFKGKSKLILCTKIIGTGYETVSLLNKGICTDYHVHRL
IAKAFIPNPKHLPQINHLDGNTMNNHVSNLEWCDAYTNTMHAIRTGLRPTGTGSRKTPCA
VTDAKGIILRAYPSMKEMGREERLKPSHQHWLILQLKHPDRVWLHNLRARQHRGNDFMKV PFANSSALSKLPVFLNSSAITDTAVPPIHAPFTPVHHSVTKQYYRQLSAEEAAAFGFTSP TIRKQERRISQ >gi|222159246|gb|ACAB01000113.1| GENE 5 3075 - 3278 149 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|298479630|ref|ZP_06997830.1| ## NR: gi|298479630|ref|ZP_06997830.1| hypothetical protein HMPREF0106_00054 [Bacteroides sp. D22] # 1 67 8 74 74 100 98.0 3e-20 MNQQERKEWEALCRQYNVDSLQEAIQGCREELEFLIHCAERLEPRTEQDEDSKKRRLTHT RNAAFPH >gi|222159246|gb|ACAB01000113.1| GENE 6 3296 - 3640 301 114 aa, chain - ## HITS:1 COG:no KEGG:BF1826 NR:ns ## KEGG: BF1826 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 37 110 37 110 115 65 41.0 4e-10 MTEIKKNHSLQRIRLMKQAAVIRQSNGMSRHDALVTAHRIGKLIRQMHHGNVCFRYTKQD GTIRRATGTLIGYEHSFRRPYTPKPENTFVVYYDIDAKGWRTFHAENFLDIEVE >gi|222159246|gb|ACAB01000113.1| GENE 7 3936 - 4403 352 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714053|ref|ZP_04544534.1| ## NR: gi|237714053|ref|ZP_04544534.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 12 155 12 155 155 273 100.0 3e-72 MKKESKKFKVKSRDKQSKTTGVRHSDDDVKKAVVDRIFKIEQLNNIPERYVANHSNCSRS SIGRMCKCKFDGQSPIPDWTTIHNYSACIIGKSEFIPGFPEVLCHVLNLIVDDSADIDCT VDNDCHIDIEIRFHTSKKLVKDPMEKEGDREKEEQ >gi|222159246|gb|ACAB01000113.1| GENE 8 4446 - 4670 135 74 aa, chain - ## HITS:1 COG:no KEGG:BVU_1841 NR:ns ## KEGG: BVU_1841 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 73 1 72 1002 97 64.0 1e-19 MKRKLTLLLTCLLASISLTIAQTSKKVTGVVFSEEDNQPVIGASVVIKGTSIGTTTDLDG KFVISNVPSSSRTW >gi|222159246|gb|ACAB01000113.1| GENE 9 5303 - 8392 3527 1029 aa, chain + ## HITS:1 COG:no KEGG:BT_3240 NR:ns ## KEGG: BT_3240 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1029 30 1058 1058 1796 85.0 0 MKPQDVPIKPGTIKVMMKSDSEVLQEVVVTGMQKMDKRLFTGATKQLDADNVKLDGLPDI SRGLEGRAAGVSVQNVSGTFGTAPKIRVRGATSIYGSSKPLWVVDGVILEDAIDVGPDDL SSGDAETLISSAIAGLNSDDIESFQILKDGSATSIYGARAMAGVIVVTTKKGRAGVSKIS YTGEFTTRAIPSYKEFNIMNSQEQMGIYKEMEQKGWLNSADLFRTKDTGVYGRMYRLIDE YNNSNGQRGLLNTPESRNAYLREAEMRNTDWFDVLFTNAISQNHSVSVTAGTEKSSFYAS LSAMLDPGWYKQSKVKRYTANVNTTYNLYKNLSINLISNASYRQQNAPGTMSSTLDIASG KMSRGFDINPYSYALNTSRTLDPSADYISNYAPFNILHELDNNYIEINMVDVKFQGELKW KVIQGLELSALGAVRYQTSSQEHNVLDDANQAVAYRTGMDDATIREQNGLLYKDPDNPYA LPITLLPEGGIYQRQDRRMLGLDFRGTLSWNHLFAEKHITNFFAGMEVNSLKKAYSSFQG WGMQYSMGEIPSYVYQFFKNGIESGSRYYSLGHSETRSIASFLNATYSYDGRYTLNGTFR YEGTNRMGRSRSSRWLPTWNLSGAWNVHEEAFFKALQPTLSNLTLKASYSLTADRGPAEV TNSQAIIRSFSPYRPFTDIQETGLYIEDIENSELTYEKKHELNIGIDLGLLENRINFSAD WYQRNNYDLIGVVPTQGVGGFINKYANVATMKSHGVEFTLSTRNIRTKDFSWNTDFIFSY AKNEVTQLKGLTSMMELVSGNGFAREGYPVRGLFSIPFAGLDADGIPQYLIDGKLTTTGV NFQEREKLDYLKYEGPTDPTVTGSFGNVFTYKGFKLNVFMTYSFGNVVRLDPVFNYKYSD LTAMPREFKNRWTVSGDEAKTNIPVIISDPQYQANTSLYLAYNAYNYSTERIAKGDFIRM KEISLMYEFPKSWIAPAKISNLSLKLQATNLFLIYADKKLNGQDPEFVNTGGVASPVPRQ FTLTLRLGL >gi|222159246|gb|ACAB01000113.1| GENE 10 8430 - 9983 1348 517 aa, chain + ## HITS:1 COG:no KEGG:BT_3241 NR:ns ## KEGG: BT_3241 # Name: not_defined # Def: 
hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 515 10 514 517 706 69.0 0 MKKKINHILPALVLTASLGLTTVSCNGFLDEMPDNRTELNTDQKIAKLLVSAYADVSPNE LFELYSDNSDDSGPTYGYYKLSEQECYHWQDTKEEYQDTPNSLWGGYYSAIAAANMALNA IEEKGNPASLDPQRGEALVCRAYCHFMLANIFCNAYDTHASQSLGVPYMEEVETTVSPRY ERGTLEEVYRKVERDLLEGIGLISDDSYSVPKYHFTRKAAYAFAARFYLYYMKPDFSNCD RVIDFATRVLGTNPADLLRNWEALGNLSVNGNVQADAYIASSNEANLLISSTISNWGAIG SDYLIGPRYFHNQTLATSETCLSSGLWGSSDYFYLKPFSTNGFVASILRKSGLYEGTYLY YMPVLFSTDETLLCRAEAYALKKMYSQSAADITSWQRAYTRNKSVLTPEQINAYYNKMSF YTPFTQQTPKKQLAPDFTIEEGMQENLIHCILHIRRITTLHEGMRWPDIKRYGILIYRRD MANQRVTDEMPPNDLRYAIQIPRSVIVAGIQPNPRRN >gi|222159246|gb|ACAB01000113.1| GENE 11 10004 - 10915 883 303 aa, chain + ## HITS:1 COG:no KEGG:BT_3242 NR:ns ## KEGG: BT_3242 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 303 1 304 304 459 74.0 1e-128 MKKQLIYFLLALTAVCLGACNNGDEIDTAHSIFSTEPMERNAFDDWLLDNYTYPYNIDFM YRMKDIESDHKYNLVPADYDKAVALSKIIKHVWMDAYVELAGIDFLRVYVPKTFHLIGSP AYESSGNMVLGTAEGGKKITLYNVNDLNVKKINIEKLNNYYFETMHHEFAHILHQKRNFD PSFNRISEGKYVGADWYYYMTAEGAMPRTNDIAWPDGFVTAYAMNQANEDFVENIAMYVT HTQTYWDNMLRAAGETGAATINKKFAIVYNYMRDTWGIDLNELRKIVLRRQQEITEIDLS TIQ >gi|222159246|gb|ACAB01000113.1| GENE 12 10944 - 12248 1282 434 aa, chain + ## HITS:1 COG:no KEGG:BT_3243 NR:ns ## KEGG: BT_3243 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 357 3 354 443 367 54.0 1e-100 MKNITIYLLLALACLGLQSCLFQEENYFDDSSANRATADVNRCSELLKAAPNGWVMEYYI GKDYSLGGITLLCRFDGQRVTMASQMSEADETVSSLYSVKSEQATMLSFDTYNYLVHYFG QPQGSMSDDPNGTLGGDYEFIISSASAERIELKGKKYGNRIVMKALTEDQTWKQYLTKIK KVEDDAFFYEYDLLMDGLYTGQMIRSNYTFIVTYYDEIGKVHQKTVPFMFTADGLRFHEP VTIDGQTMQNFVWKNELISFVCTDEGATGVKLAGVYTAGYQSYDYYPGTYQMDFYRLNDE TGQLEVASQEVRLLKNEDGKSYWLKGLEYDILVMYDKPRGGLSVSPQLLEKVQGGYVYLA MWDVMNDYVLRNKSIGLISYATTNGIYLVDNGVWMGEVSGFIFGVYNSQEEDAAFIGYTD AVASIRLVKKTIEE >gi|222159246|gb|ACAB01000113.1| GENE 13 12250 - 13425 1109 391 aa, chain + ## HITS:1 
COG:no KEGG:BT_3244 NR:ns ## KEGG: BT_3244 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 376 1 336 354 140 28.0 9e-32 MMNKLFMIVVSVCCLLMTACGTENDETPQGFRIVDTDVDLVAGGGTVEVALQAAGELTAV SGADWCRVIEVTNEKITLLARANYEYVSRATQVTVSDGSNEQKIVVTQAGNIFAPDTDQQ VLRLGNAPSQIVVRVNSSFDFKVSIQKGVDWVEVPTASKEEGFLLVLKENKTGNPRVCQL MVSTNEGRKFFYYVYQYEASDLLGAWRDSQIYVKWDTKKIDKAYRLSATEVTKDAEGDGY TINMSLVGTPVAPYLKDPSKATISLQATYEDGAFVVPIGAAQKDFVLTQEVDGTANNAMN RTADDTALEEDVVLQTLYGFSVAASSNIYAKGNFGFAPVLYQDGSIGMSYQDVAGAPESG CYLAIALFDGQQFSTAALKDYILFPDIALFK >gi|222159246|gb|ACAB01000113.1| GENE 14 13685 - 14881 1173 398 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714061|ref|ZP_04544542.1| ## NR: gi|237714061|ref|ZP_04544542.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 398 1 398 398 807 100.0 0 MDKKLMNLLGVWLFAVASVCALTACSDNDTENPEGEKTGGENGKPTPETVEFINSNLVYW GDEDGVGTDHFVLTLYTDMEVDAAGNPIGPGKIMAFSLNVPPFASGTTEFPLPEGTFDAA LNGYTFNEWTFNLGYMNQMDLPTGKVEVPAGSFYGDVKAHSTSVDADLLSGGKMTVKRSA DGEYTISGVLVGDLSLKRYFTYTGKLTTIDRHGSTEEIPNSTLNADLTLNEWAQARLQDK ADFYYLQDESCRVLELYLAEEGVGLAETWPTGNGRVLKVEFFVEWATDVTQGIPAGTYKV VARDEESKGIPREFLKPGGIASGYPNVFTYPAGTWYEKISNGTMKEYARIDAGTMTVVRD GDKHTLTIDFIDCDKAHPHHVRSTFSQDTPINVFGYHP >gi|222159246|gb|ACAB01000113.1| GENE 15 14981 - 16780 1345 599 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714062|ref|ZP_04544543.1| ## NR: gi|237714062|ref|ZP_04544543.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 599 1 599 599 1081 100.0 0 MKIRFISFFLFMLAMSCFMACSSDDPAPDPEPEPEPEPVETVNTYTYKDKTVQAKSVSCF EQEGIVYVCMSPLQSLETMEDFMNSGKEYIMLGVDKKLVGTPVTVGNSDNDNYAFYYMDA DGEAIVAVSPDGWEDVLTQGKITVTMVPGENETSAVKAVFDFTLKSGGKFTGEADCVYKA PEKLPSEYSFGDEVRPLKSVVATTLGGFQYVYLSPETGLTTVEDISDAEYLMLAITPEMI GQEIDITSSGDAEYAFYNMTNLGADDIDAVDPYGWSELCSAGKLKVEKTDESVKITFSFT LLSGDKFKGSYEGSYTDIKQSTTNILTLNGESTRDIKATFYEKTNEGVALYLTPSGISSA ADLNNVNTYYVRLFVPNTGLDGQEVDITTTNLAFEFTYYNPYDEERIEVSKGYLEDAAGT FSVSKSADNEYSLTLDLKYLGDNSLKISGNYNGTFKVYDTTIPNQYQLGAGGTPVTIQSV VVDKTDVDICVIYISRQAGITTVAGMSAADAVVRVPKSMMEGAVHGFSGSAENAKISIAY EGVTYNQANTTNGHLAAGGNASVALQGSEIEMTFNIFSIVQYDNSSLSGYYKGATTVIE >gi|222159246|gb|ACAB01000113.1| GENE 16 16801 - 17319 405 172 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714063|ref|ZP_04544544.1| ## NR: gi|237714063|ref|ZP_04544544.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 172 1 172 172 341 100.0 8e-93 MKKILLKYLNLFALSGLLLAFVGCSDDNDSVTPGPGVEEVPSIGTYSFRGVANNIVSGVA SVDGDYLTCVLSPEKMEEGKADTYFVFSLHLYWEGQVVDASSLYHNDQYVFIYEDPIYYY SQYKKVTGTFYVQRNSETNVTVKLNLRLHDGVRFKAEVTADLMKPSGEEPSE >gi|222159246|gb|ACAB01000113.1| GENE 17 17538 - 17810 216 90 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237714064|ref|ZP_04544545.1| ## NR: gi|237714064|ref|ZP_04544545.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 90 1 90 90 135 100.0 8e-31 MKTETTASSVSPKEMPETANSTRIQTYSKSQLATLYLPHIQPASARRTLRSWIAQNTALQ AALAQTGYSEKAILLTPAQVGLFFKFLGEP >gi|222159246|gb|ACAB01000113.1| GENE 18 18160 - 18672 529 170 aa, chain + ## HITS:1 COG:no KEGG:BT_2021 NR:ns ## KEGG: BT_2021 # Name: not_defined # Def: putative non-specific DNA-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 168 1 173 175 255 78.0 5e-67 MPLIYKPYQATLANKEGQKLYYPRLVKFGKMVNTQKMAELIAEKASLTAGDVHNVIRNLM SVMREQLLNSRTVRLEGLGTFTMIAKANGKGVELENKVSSSQITSLRYQFTPEYTRAANG SATRALTTGVEFVHIKDVAGSFVADDAIDKDPDDDNKPGGGSGEAPDPAA >gi|222159246|gb|ACAB01000113.1| GENE 19 18789 - 18878 113 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRLLLEKLLDILEAVLRFRHRRNGKAEKD >gi|222159246|gb|ACAB01000113.1| GENE 20 19250 - 20110 620 286 aa, chain + ## HITS:1 COG:no KEGG:BDI_2654 NR:ns ## KEGG: BDI_2654 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 286 1 305 305 367 63.0 1e-100 MGRFINPFTDFGFKFLFGREVEKELLIDFLNDLLVGEHVITDIRFLNNEQPPEVKTERGL IYDIYCVTDTGERIIVEMQNREQPYFKDRALFYLSRAITQQAKRGVWNFQLDAVYGVFFM NFVMDKDMPSKIRTDIVLSDRDTGKLFNSKFRQIFIELPNFNKEEDECENDFERWIYILK HMDTLDRMPFKARKAVFERLEKLASKANMTQEERAQYEEEWKVYNDYFNTLDFAEQKGLQ KGLQKGLQKGLQKGKEETARNLKELGVADDIIMKSTGLSKEEIEKL >gi|222159246|gb|ACAB01000113.1| GENE 21 20598 - 23012 2354 804 aa, chain + ## HITS:1 COG:CAC1209 KEGG:ns NR:ns ## COG: CAC1209 COG1328 # Protein_GI_number: 15894492 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Clostridium acetobutylicum # 15 804 5 690 699 414 35.0 1e-115 MLSGVKVMNYAEICIIKRDGKREDFSISKIKNAISKAFSASGIQDEQQLVADITMNVISQ FATPTITVEEIQDLVEKSLMKVRPEVAKKYIIYREWRNTERDKKTQMKQVMDGIVAIDKN DVNLSNANMSSHTPAGQMMTFASEVTKDYTYKYLLPKRFAEAHQLGDIHIHDLDYYPTKT TTCIQYDMDDLFERGFRTKNGSIRTPQSIQSYATLATIIFQTNQNEQHGGQAIPAFDFFM AKGVAKSFRKHLASFINFYVAMENGNQADEKSIRTLIKEYLPSIKSTEAERETLRIALVA LQIIIDKEHLARIVEKAYQQTRKDTHQAMEGFIHNLNTMHSRGGNQVVFSSINYGTDTSA 
EGRMVIEELLKATIEGLGTRGEVPVFPIQIFKVKDGVSYSEKDFEKAMKAENIEEAMTDS YEAPNFDLLLKACQTTAKALFPNFMFLDTPFNKNEKWKADDPQRYIYELATMGCRTRVFE NVAGEKSSLGRGNLSFTTLNMPRLAIEARIKAENLVEDERNKDAIEQKAKEIFIESVHQM SALVADQLYERYQYQRTALARQFPFMMGNNVWKGGGELNPNEQVGDALRSGTLGIGFIGG HNAMVALYGQGHGHSQKAWDTLYEAVMEMNKVVDEYKEKYNLNYSVLATPAEGLSGRFTK MDRRKYGKIPGVTDRDYYVNSFHVDVKEPISIVEKIKREAPFHAITRGGHITYVELDGEA QKNVRAIAKIVKVMHDEGIGYGSINHPVDTCHNCGYKGVIFDKCPVCQSESILRMRRITG YLTGDLSSWNSAKRAEEKDRVKHL >gi|222159246|gb|ACAB01000113.1| GENE 22 23207 - 23665 236 152 aa, chain + ## HITS:1 COG:CAC0481 KEGG:ns NR:ns ## COG: CAC0481 COG0602 # Protein_GI_number: 15893772 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Clostridium acetobutylicum # 1 152 3 153 153 129 44.0 3e-30 MNLLGTYSETIVDGEGIRYSIYLAGCSHHCPGCHNPESWNPGAGEELTEEKIQSIIREIK ANPLLDGVTFSGGDPFFHPEAFLLLLKRVKEETGMNVWCYTGYTYEEIKAQPRLSAALDY IDVLVDGRFEQALFSPYLEFRGSSNQRILKLK >gi|222159246|gb|ACAB01000113.1| GENE 23 23901 - 25304 804 467 aa, chain + ## HITS:1 COG:YPO1712 KEGG:ns NR:ns ## COG: YPO1712 COG0477 # Protein_GI_number: 16121972 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 14 462 6 454 455 481 58.0 1e-135 MVQSINPTQKNDETDGLPMPKRIWAVVSVGFALCMSVLDINIVNIVLPTLSHDFGTSPAV TTWIINGYQLAIVISLLSFSALGEIIGYRKVFLSGIGLFCVTSLICALSDSFWTLTVARI FQGFSASAITSVNTAQLRYIYPKSQIGRGMGINAMVVAISAAAGPSVASGILSIASWHWL FAINVPLGLTALLLGIKHLPRQEERTKRKFDYISAIANAVTFGLLIYTLDGFAHHEKMDF LFIQLIVLAVVGTYYVRRQLTQTTPLLPLDLLRIPIFRLSILTSICSFIAQMTAMVSLPF FLQNTLGHSEVMTGLLLTPWPLATLVTAPLAGYLVERIHPGILGSVGMALFAVGLFSLSS LTAESSDISIILRLMLCGAGFGLFQTPNNSTIISSAPTQRSGGASGMLGMARLLGQTTGT TLVALLFSFVAHDRSTAVCLMVGSGFAVVAAIVSSLRLSQPSTLKRK >gi|222159246|gb|ACAB01000113.1| GENE 24 25350 - 25496 80 48 aa, chain + ## HITS:1 COG:no 
KEGG:BF3697 NR:ns ## KEGG: BF3697 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 48 245 292 292 85 85.0 4e-16 MPKAELTPELEVYYNYTCQEGDYIKTCTVPSPKLDETDLEREKKMLKI >gi|222159246|gb|ACAB01000113.1| GENE 25 25605 - 26960 1363 451 aa, chain - ## HITS:1 COG:aq_1964 KEGG:ns NR:ns ## COG: aq_1964 COG0750 # Protein_GI_number: 15606963 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Aquifex aeolicus # 9 450 4 428 429 139 28.0 1e-32 METFLIRALQLIMSLSLLVIIHEGGHFLFARLFKVRVEKFCLFFDPWFTLFKFKPKKSDT EYAVGWLPLGGYVKIAGMIDESMDTEQMKQPEQPWEFRSKPAWQRLLIMVGGVLFNFLLA LFIYSMILFAWGDQYIKVQEAPLGMDFNETAKSVGFQDGDILLSADGVPFERYDGDMLSQ IADAREVSVIRNGAKASVYIPEDLMQRLLADSIRFASYRFPYVIDSVMVNSPAAQAGIQP GDSIIALNGTPISFSDFKEAMAERKKNAETLLKDSIDPRLITLTYVRGGVTDTLNMRVDS AYLMGVTACLVTDRLLPMVKKEYTFFESFPAGVSLGVKTLKGYVGNMKYLFSKEGAKQLG GFGTIGSIFPATWDWHQFWYMTAFLSIILAFMNILPIPALDGGHVLFLFYEMIARRKPSD KFMEYAQMTGMILLFGLLIWANFNDILRFFF >gi|222159246|gb|ACAB01000113.1| GENE 26 27085 - 28254 1368 389 aa, chain - ## HITS:1 COG:alr4351 KEGG:ns NR:ns ## COG: alr4351 COG0743 # Protein_GI_number: 17231843 # Func_class: I Lipid transport and metabolism # Function: 1-deoxy-D-xylulose 5-phosphate reductoisomerase # Organism: Nostoc sp. 
PCC 7120 # 10 383 3 380 399 357 48.0 2e-98 MNIETNKKKKQIAILGSTGSIGTQALQVIEEHPDLYEAYALTANNRVELLIAQARKFQPE VVVIANEEKYSELKEALSDLPIKVYAGTDAICQIVEAGPIDMVLTAMVGYAGLKPTINAI RAKKAIALANKETLVVAGELINQLAQQYRTPILPVDSEHSAVFQCLAGEVGNPIEKVILT ASGGPFRTYTLEQLKSVTKTQALKHPNWEMGAKITIDSASMMNKGFEVIEAKWLFGVQPS QIEVVVHPQSVIHSMVQFEDGAVKAQLGMPDMRLPIQYAFSYPDRISSSFDRLDFSKCTN LTFEQPDTKRFRNLALAYEAMYRGGNMPCIVNAANEVVVASFLKDGISFLGMSDVIEKTM ERATFIANPTYDDYVATDAEARKIAASLI >gi|222159246|gb|ACAB01000113.1| GENE 27 28352 - 29212 708 286 aa, chain - ## HITS:1 COG:NMB1483 KEGG:ns NR:ns ## COG: NMB1483 COG0739 # Protein_GI_number: 15677336 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Neisseria meningitidis MC58 # 163 286 295 415 415 90 41.0 3e-18 MPKRRRSKAFWKNFKFKYKLTIVNENTLEEIVGLRVSKLNGLSVLLCVLAVLFLIASCII TFTPLRNYLPGYMNSEVRTQIVDNALRVDSLQQVLNKQNLYIMNIQDIFSGKVPIDSVQT LDSLTAAREDTLMERTKREEEFRRQYEENEKYNLTSITSQPDVTGLILYRPTRGMVSDHF NAEKKHYGTDIAANPNESVLATMDGTVILSTYTAETGYLIGVQHNQDLISIYKHCGSLLK KEGERVKGGEAIALVGNSGTLSTGPHLHFELWYKGHPVNPEKYIVF >gi|222159246|gb|ACAB01000113.1| GENE 28 29352 - 29903 582 183 aa, chain - ## HITS:1 COG:no KEGG:BT_2004 NR:ns ## KEGG: BT_2004 # Name: rimM # Def: 16S rRNA-processing protein RimM # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 178 1 179 179 295 88.0 7e-79 MIKKEEVYKIGLFNKPHGIHGELQFTFTDDIFDRVDCDYLICLLDGIFVPFFIEEYRFRS DSTALVKLEGIDTAERARMFTNIEVYFPVKHAEEAEDGELSWNFFIGFQMEDIHHGPLGE VIDVDNTTVNTLFVVEREEEELLVPAQEEFIVGIDQKQKLITVELPEGLLNLEELDADDT APG >gi|222159246|gb|ACAB01000113.1| GENE 29 29900 - 31204 1477 434 aa, chain - ## HITS:1 COG:BB0472 KEGG:ns NR:ns ## COG: BB0472 COG0766 # Protein_GI_number: 15594817 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Borrelia burgdorferi # 1 434 16 439 442 379 45.0 1e-105 MASFVIEGGHRLCGEIHPQGAKNEVLQIICATLLTAEEVTVNNIPDILDVNNLIQLLREM 
GVTVAKKGIDSYSFKAENVDLAYLESDEFLKKCSSLRGSVMLIGPMVARFGKALISKPGG DKIGRRRLDTHFVGIQNLGADFRYDEGRGIYEITADRLQGSYMLLDEASVTGTANIVMAA VLAKGTTTIYNAACEPYVQQLCRLLNRMGAKISGIASNLLTIEGVEELHGAQHTVLPDMI EVGSFIGMAAMTKSEITIKNVSYENLGIIPESFRRLGIKLEQKGDDIYVPAQETYQIESF IDGSIMTIADATWPGLTPDLLSVMLVVATQAKGSVLIHQKMFESRLFFVDKLIDMGAQII LCDPHRAVVIGHNHGFKLRGARLTSPDIRAGIALLIAAMSAEGTSTISNIEQIDRGYQNI EGRLNAIGARITRI >gi|222159246|gb|ACAB01000113.1| GENE 30 31379 - 31990 775 203 aa, chain - ## HITS:1 COG:no KEGG:BT_2006 NR:ns ## KEGG: BT_2006 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 189 1 189 204 345 91.0 8e-94 MQYNTQQKRMPLPEYGRSIQNMVDYALTIQDRAERQRCANTIINIMGNMFPHLRDVPDFK HKLWDHLAIMSGFELDIDYPYEIIRKDNLVTRPDPIPYSTARMRYRHYGRTLEVLIKKAI EFPEGNEKRNLIALICNHMKKDYLAWNKDTVDDKKIAEDLYELSNGELQMTDDIVRLMAE RLNQNYRPKTNYTNNRQNNKRRY >gi|222159246|gb|ACAB01000113.1| GENE 31 32020 - 32709 681 229 aa, chain - ## HITS:1 COG:XF1533 KEGG:ns NR:ns ## COG: XF1533 COG1214 # Protein_GI_number: 15838134 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Inactive homolog of metal-dependent proteases, putative molecular chaperone # Organism: Xylella fastidiosa 9a5c # 4 134 3 127 229 79 37.0 5e-15 MSCILNIETSTSVCSVAASQDGQTIFVKEDLKGPSHAVSLGVFVDEALSFIDSHAIPLDA VAVSCGPGSYTGLRIGVSMAKGVCYGRNVPLIGIPTLEVLTVPVLLYHDLPEDALLCPMI DARRMEVYAAVYDRRLQVKRAVAADIVDENSYLEFLNEQPVYFFGNGADKCREQITHPNA HFIDNIHPLAKMMFPLAEKAIADEDYKDVAYFEPFYLKEFVASMPKKLL >gi|222159246|gb|ACAB01000113.1| GENE 32 32799 - 33677 1131 292 aa, chain + ## HITS:1 COG:CAC1716 KEGG:ns NR:ns ## COG: CAC1716 COG1561 # Protein_GI_number: 15894993 # Func_class: S Function unknown # Function: Uncharacterized stress-induced protein # Organism: Clostridium acetobutylicum # 1 291 1 291 292 132 32.0 1e-30 MIQSMTGYGKATAELPDKKINVEIKSLNSKAMDLSTRIAPAYREKEIEIRNEISKVLERG KVDFSLWIEKKETAENATPINQVLVEGYYKQILAISQNLGIPVPADWFQTLLRMPDVMTK TEIQELTDEEWKMVHATVLDAISHLVDFRKQEGAALEKKFREKIANISSLLEKIAPYEKE 
RVEKVKERITDALEKTLSVDYDKNRLEQELIYYIEKLDVNEEKQRLSNHLKYFISTMESG NGQGKKLGFIAQEMGREINTLGSKSNHAEMQKIVVQMKDELEQIKEQVLNVM >gi|222159246|gb|ACAB01000113.1| GENE 33 33850 - 34419 572 189 aa, chain + ## HITS:1 COG:RSc2155 KEGG:ns NR:ns ## COG: RSc2155 COG0194 # Protein_GI_number: 17546874 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Ralstonia solanacearum # 3 183 20 198 221 146 42.0 2e-35 MTGKLIIFSAPSGSGKSTIINYLLTQNLNLAFSISATSRPPRGTEQHGVEYFFLTPEEFR CRIENNEFLEYEEVYKDRYYGTLKAQVEKQLEAGQNVVFDVDVVGGCNIKKFYGDRALSV FIQPPSVEELRCRLEGRGTDAPEVIESRIAKAEYELGFAPQFDCVIVNDDLEAAKAEALK VIKEFLAEE >gi|222159246|gb|ACAB01000113.1| GENE 34 34628 - 35218 302 196 aa, chain + ## HITS:1 COG:BS_yqeJ KEGG:ns NR:ns ## COG: BS_yqeJ COG1057 # Protein_GI_number: 16079618 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid mononucleotide adenylyltransferase # Organism: Bacillus subtilis # 7 194 2 186 189 110 34.0 2e-24 MNQVKTRKTGIFSGSFNPIHIGHLALANYLCEYEGLDEIWFMVSPQNPLKTKAELWSDEL RLNLVELSISDYPRFRASDFEFHLPRPSYSVYTLEKLHEAYPDREFYFIIGSDNWERFGH WYQSERIIKENQLLIYPRPGFPVKEEELPETVRLVHSPVFEISSTFIREALSEGKDIRYF LHPRVWEAIKKDNIRL >gi|222159246|gb|ACAB01000113.1| GENE 35 35224 - 36513 1026 429 aa, chain - ## HITS:1 COG:FN1101 KEGG:ns NR:ns ## COG: FN1101 COG1373 # Protein_GI_number: 19704436 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 1 423 23 448 470 387 46.0 1e-107 MRRNAMQQLYDWKEKSTRKPLIIRGARQVGKTWLMKEFAATAYKQFAYINFEDNEVMKEV FQKDFDIERILMAIQLVTGKVVDTDTLILFDELQEAPRGLTAMKYFLEKAPQYHVIAAGS LLGIAMHQNDSFPVGKVDFIDLYPLSFSEFLEAIGQESFVSLLAKQDWNLISTFRSKFTD FLKQYYFVGGMPEVVNAFIEHKDYTEVRQLQQNILDSYDRDFSKHAPIAEVPRIRMVWRS VPAQLAKENRKFIYGVIKEGARAKDFELAIEWLIDAGLIYKVNRVKKGGIPLSAYEDFSA FKLFMLDTGLMGAMSGLPPQALLEGNVLFTDYKGAITEQYVLQQLKSVKGLNIYYWSSDT SKGELDFLLQKEICIIPVEVKAEENLQSKSLRSFVEKNPELHGIRFSMSDYREQEWLTNY PLYSAGHIL >gi|222159246|gb|ACAB01000113.1| GENE 36 36710 - 37789 1025 359 aa, 
chain - ## HITS:1 COG:BH1618 KEGG:ns NR:ns ## COG: BH1618 COG1408 # Protein_GI_number: 15614181 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Bacillus halodurans # 115 359 27 256 256 103 30.0 6e-22 MKKTIYPLTKKTIYPVILLALLLLVSCKSKKNMVASLPHPVLHTDSIYPDTTNAIAGLFA PDHSKLKALAVSKNKKQHTKKKETDTDADKSDRMLRGTPITSSSVDVSSVYTGVDRVVKY DFTHRDVPEAFEGFRIAFISDLHYKSLLKEKGLNDLVRLLIAQKADVLLMGGDYQEGCEY VEPLFSTLARVKTLMGTYGVMGNNDYERCHDEIVNTMKHYGMRPLEHEVDTLRKDGQQII IAGVRNPFDLGRNGVSPTLALSPKDFVILLVHTPDYIEDVSVANTDLALAGHTHGGQVRV FGVAPALNSHYGNRFITGLAYNTAKIPLIITNGIGTSKLPIRVGAPAEIIVITLHRLTE >gi|222159246|gb|ACAB01000113.1| GENE 37 37974 - 38867 791 297 aa, chain + ## HITS:1 COG:VNG1075G KEGG:ns NR:ns ## COG: VNG1075G COG1575 # Protein_GI_number: 15790173 # Func_class: H Coenzyme transport and metabolism # Function: 1,4-dihydroxy-2-naphthoate octaprenyltransferase # Organism: Halobacterium sp. NRC-1 # 10 296 11 311 311 177 38.0 2e-44 MEEVKRNSLQAWILAARPKTLTGAVTPVLIGTALAAMDGHFHWLPALICCLFASLMQIAA NFINDLFDFLKGTDREDRLGPERACAQGWISPQAMKTGIIITVALACLIGCTLLFFAGWE LIVVGVLCVLFAFLYTTGPYPLSYNGWGDVLVIVFFGFVPVGGTYYVQALTWTPDVTIAS LICGLLIDTLLVVNNYRDREADARSGKRTVIVRFGEKFGRYFYLMLGVTASLLCLCFLRE GHFLAALLPQLYLIPHFLTWKRMVKIYSGKKLNSILGETSRNMLFMGVLIAIGMLVS >gi|222159246|gb|ACAB01000113.1| GENE 38 38871 - 40007 1157 378 aa, chain - ## HITS:1 COG:FN1667 KEGG:ns NR:ns ## COG: FN1667 COG1088 # Protein_GI_number: 19704988 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Fusobacterium nucleatum # 1 377 1 397 399 496 60.0 1e-140 MKTYLVTGAAGFIGANYIKYILAKHNDIKVVILDALTYAGNLGTIAKDIDNERCVFIKGD ICSRDVVDGLFAEYRFDYVVNFAAESHVDRSIENPQLFLITNILGTQNLLDCARRAWVMG KDEQGYPTWRKGVRYHQVSTDEVYGSLGAEGYFTETTPLCPHSPYSASKTSADMVVMAYH DTYKMPVTITRCSNNYGPYHFPEKLIPLIIKNILEGKHLPVYGDGSNVRDWLYVEDHCKA IDLVVREGKDGEVYNVGGHNEKTNLEIVKLTISTIHRLMAENPEYRQVLKKKVKDENGNI SIDWINEDLITFVKDRLGHDQRYAIDPTKITNALGWYPETKFEVGIVKTIEWYLTNQAWV EEVTSGDYQGYYEKMYGK 
>gi|222159246|gb|ACAB01000113.1| GENE 39 40030 - 40899 942 289 aa, chain - ## HITS:1 COG:MTH1791 KEGG:ns NR:ns ## COG: MTH1791 COG1209 # Protein_GI_number: 15679779 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Methanothermobacter thermautotrophicus # 1 287 1 287 292 401 64.0 1e-111 MKGIILAGGSATRLYPLSKAISKQIMPVYDKPMIYYPLSTLMLAGIREVLIISTPRDLPM FRDLLGTGEELGMSFSYKIQEQPNGLAQAFVLGADFLNGEPGCLILGDNMFYGQGFSAML RRAANIEKGACIFGYYVKDPRAYGVVEFNEQGKVISLEEKPEVPKSNYAVPGLYFYDASV TEKAAALRPSARGEYEITDLNRLYLEEGTLKVELFGRGFAWLDTGNCDSLLEASNFVATI QNRQGFYVSCIEEIAWRQGWIPTEQLLLLGQQLEKTEYGKYLIELAKQS >gi|222159246|gb|ACAB01000113.1| GENE 40 40999 - 41493 394 164 aa, chain - ## HITS:1 COG:PA0351 KEGG:ns NR:ns ## COG: PA0351 COG0622 # Protein_GI_number: 15595548 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Pseudomonas aeruginosa # 3 137 9 134 157 60 33.0 2e-09 MTRIGLLSDTHAYWDEKYLEYFESCDEIWHAGDIGSLEVAEKLAAFRPFRAVYGNIDGQE IRKLYPQINRFTVDGAEVLIKHIGGYPGKYDPSIIGSLMTRPPKLFISGHSHILKVKYDK TLDMLHINPGAAGMSGFHKVRTLVRFVINQGAFQDLEVIELADK >gi|222159246|gb|ACAB01000113.1| GENE 41 41549 - 43732 1859 727 aa, chain + ## HITS:1 COG:ECs3363 KEGG:ns NR:ns ## COG: ECs3363 COG0855 # Protein_GI_number: 15832617 # Func_class: P Inorganic ion transport and metabolism # Function: Polyphosphate kinase # Organism: Escherichia coli O157:H7 # 7 722 7 681 688 405 35.0 1e-112 MESKYNYFKRDISWLSFNYRVLLEALDERLPLYERINFISIYSSNLEEFYKIRVADHKAV ASGATESDEETVQSARELVEEINKEVTRQLDDRVRIYEQKILPALRKNHIIFYQDNHVEP FHQQFIKDFFREEIFPYLQPVPVSKDKIVSFLRDNRLYLAIRVCPKKEEKNKGEETKENI EKYTGECINERNDGEGIKVGMEAARPNVTDLRQPLYFVMKQPYAKVPRFIELPSREKNHY LMFTEDIIKANLNLIFPGYDVDSSYCIKISRDADILIDDTASSADLVAQLKKKVKKRKIG DVCRFVYDRAMPSEFLDFLVDAFRIQRDELVPGDKHLNLEDLRHLPNPNKSLHSLEKPKP MKLTVLDEKESIFNYVAKKDLLLYYPYHSFEHFIHFLYEAVHNPETREIMVTQYRVAENS AVINTLIAAAQNGKKVTVFVELKARFDEENNLATAEMMQAAGIKIIYSIPGLKVHAKVAL IRRRGLNGEKIPSYAYISTGNFNEKTATLYADCGLFTCRKEIVADLYNLFRTLQGKEDPK 
FTTLLVARFNLIPELNRLIDREISLADEGKQGRIILKMNALQDPTMIDRLYEASEHGVQI DLIVRGICCLIPGQSYSRNIRVTRIVDSFLEHARIWYFGNDGTPKVFMGSPDWMRRNLYR RIEAITPVLAPDLRDSLIEMLNIQLADNQKACWVDDKLQNIFKKRTPGTPAVRAQYTFYD WLNKTNN >gi|222159246|gb|ACAB01000113.1| GENE 42 44130 - 46379 1804 749 aa, chain + ## HITS:1 COG:no KEGG:BT_2020 NR:ns ## KEGG: BT_2020 # Name: not_defined # Def: putative phosphate/sulphate permeases # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 749 1 750 750 1354 92.0 0 METIYLCIIIFLFVLAVFDLIVGVSNDAVNFLNSAVGAKAASFKTILFIAGVGIFIGASL SNGMMDIARHGIYQPEHFYFAEIMCILLAVMLTDVVLLDVFNSMGMPTSTTVSLVFELLG GTFALSLIKVHNSDTLGLGDLINTDKALSVIMAIFVSVAIAFFFGMIVQWLARVIFTFNY TKKMKYSIALFGGVAATAIIYFMLIKGLKDSSFMTPENKHWIQDNTLMLIAVFFVFFTVL MQILHWMKINVFKVIVLMGTFALALAFAGNDLVNFIGVPLAGFSSFMDYTANGGGNPNGF LMTSLLGPAKTPWYFLIGAGAVMVYALCTSKKAHAVIKTSVDLSRQDEGEETFGSTPIAR TVVRISMTLANGISRIMPSGSKEWFDSRFRKDEAIIADGAAFDLVRASVNLVLAGLLIAL GTSLKLPLSTTYVTFMVAMGTSLADRAWGRDSAVYRITGVLSVIGGWFITAGAAFTICFF VALVLHYGGNISIIALIGIAVFILIRSQVMYKKRKAKEQGSETLKQLMQATDSTEALQLM RKHTREELSKVLEYAETNFELTVTSFLHENLRGLRRAMGSTKFEKQLIKQMKRSGTVAMC RLDNNTVLEKGLYYYQGNDFASELVYSISRLCEPCLEHTDNNFNPLDAIQKGEFSDVAED ITYLIQQCRKKMENNEYNNLEEEIRRANDLNGQLSLLKRKELQRIQSQSGSIRVSMVYLT MVQEAQNVVTYTINLMKVSRKFQMETEMP >gi|222159246|gb|ACAB01000113.1| GENE 43 46615 - 47121 355 168 aa, chain + ## HITS:1 COG:no KEGG:YPTS_3418 NR:ns ## KEGG: YPTS_3418 # Name: not_defined # Def: hypothetical protein # Organism: Y.pseudotuberculosis_PB1 # Pathway: not_defined # 1 166 1 149 150 63 26.0 3e-09 MKYQLEITTLLVPVNVHQLFEKCEWPELNSFDKEMVEDYFSDLVNGIQTDEALDDWKLTI VLYIGTYLGANHISIRKHGITDTATKEKVLTIGIPLPCSKTVRWGVKKKERFTGKIPDEN YRRNNRLLPVNFAKYDTMGTYIEDNIRIALLNLFEVGFTLKGYKVKKR >gi|222159246|gb|ACAB01000113.1| GENE 44 47405 - 48031 480 208 aa, chain + ## HITS:1 COG:no KEGG:BT_4601 NR:ns ## KEGG: BT_4601 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 207 1 207 212 286 66.0 4e-76 MAMYVMEEMPDIHGTGEQVLYPRFAMMDQVSTEELIRQIASSSGFNVGDVEGVITQIGIE 
MAHQMAEGKSVKLDGIGTFSPSLALCKDKEREKAGEGETHRNARSIVVGNVNFRVDRKMI RRINGRCLLERAPWKSQRSSQKYTPAQRLALAVRYLEEHPFLTVYEYRKLTGLLRTAATN ELRQWAYTPGSGIGIDGRGTHRVYIKKT >gi|222159246|gb|ACAB01000113.1| GENE 45 48148 - 48924 369 258 aa, chain + ## HITS:1 COG:no KEGG:BT_1962 NR:ns ## KEGG: BT_1962 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 258 1 258 258 409 80.0 1e-113 MGNLISFMKEVAEGLRKSGNYGTAHVYRSSMNAIIAFNGSRNLSFKKVNQEFLKSFETYL REKDCSWNTVSTYMRTLRAVYNRAVDRRMASYIPHHFRYVYTGTRADRKRALEKEDMERL MKELPKQIHQGREELQRTRAYFFLMFMLRGMPFVDLAYLKKQDMVGNVLTYRRRKTGRLL TVTLLPETMKLIKKYMNTDSASPYLFPILTGGESTEATYREYQIALRNFNYQLLLLKQVL ALTSDLSSYAAKHHTISI >gi|222159246|gb|ACAB01000113.1| GENE 46 49354 - 50247 606 297 aa, chain + ## HITS:1 COG:no KEGG:BF3036 NR:ns ## KEGG: BF3036 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.fragilis # Pathway: not_defined # 1 297 1 297 297 606 99.0 1e-172 MGRITINGTQAGFSCKKEVSLALWDVKTNRAKGKSEEARTLNQELDNIKAQITRHYQYIC DHDSFVTAKKVYNRYVGFSEECHTLMNLFREQLEPYKKKIGIEKAESTYCGLVADYKSLL LFMKSKKNAEDIVIEELEKSFIEDYYNWMLGTCALANSTVFGRVNTLKWLMYIAQEKGWI RVHPFASFECMPEYKRRSFLSEEELQRIIHIEPRYKRQRAMRDMFLFMCFTGLSYVDLKA ITYDNIHTDSDGGTWLMGNRIKTGVAYVVKLLPIAIELIEKYRGTDEKKDSPNVSFR >gi|222159246|gb|ACAB01000113.1| GENE 47 51357 - 52274 793 305 aa, chain + ## HITS:1 COG:no KEGG:BF3033 NR:ns ## KEGG: BF3033 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 305 1 305 305 592 99.0 1e-168 MKRKLRFLAGACLFTATALFSGCSSDDDFLMDPVDSGTSQTRAVTNSDGTLTITFDDFDP GMLAGPTSAGENLYSYQGYPQVTTIYDNTPEEYLFLSMFNTVGGSTEYSSGGIALSNWNI RSNQSGNTGDWWYSYLNQCSVYNTAVEAEGQNKEAGHSGSNFGVVYGYVDAYNQAWMAKP EFYFNVPRKLVGLWICNTSYTYGVITYGNQFGSTGVATPLKEMKGYFQVNLECYDANGGL IRTYKRLLADYRNGQQQVDPITTWDYWEINAEGVQSVKFNFEGSDSGAYGLNTPAYICID DITIQ >gi|222159246|gb|ACAB01000113.1| GENE 48 52339 - 54120 1429 593 aa, chain + ## HITS:1 COG:no KEGG:BT_1956 NR:ns ## KEGG: BT_1956 # Name: not_defined # Def: putative cell surface 
protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 593 1 593 593 1164 100.0 0 MHRFHYFIISACMLFTSCNKDEVITEEVGGQPIIELDSETGIYTVKVDHELTIAPTYQNV EDALFAWTIDGTLVSSGPSLQRTWNECGDFYVKLRVDNAEGYAEEELKVEVKELTPPVIS LALPSQGLKVVRNTDYTFTPDIQHSDVEGFKIEWVREGKIVSTENTYTFNEKELGVYTVT INASNIDGTTTKDVSVEVVETMPYVVKFPTPSYLQTSTDRYTFADRPVFLRPLLEYFDNP RFEWSVDGQVMEGEVERMFKFTPSAPGEYTVSCTVSEDTPTEKISRNIDKGKTAVTATVK VVCVDKKEQDGFRASGSSKLWNKVYEYTPAPGQFINETSTIGGMTGNETSPEAAVAWATQ RLKDKLHVSLGSFGGYIIVGFDHSIPNSGNQYDFCVQGNAFDGSSEPGIVWVMQDINGNG LPDDEWYELKGSEAGKEETIQNFEVTYYRPEGKKMDVQWISSDGRNGWVDYLSAYHTQDY YYPAWISENSYTLTGTCLAARNTQDSQTGYWDNQSYDWGYVDNFGNDQIEGGSTVDGSGQ RNGFKISNAIHADGTEANLQYIDFIKIQCGVLAKSGWLGEVSTEVFSFEDLTK >gi|222159246|gb|ACAB01000113.1| GENE 49 54135 - 56225 1197 696 aa, chain + ## HITS:1 COG:no KEGG:BT_1955 NR:ns ## KEGG: BT_1955 # Name: not_defined # Def: putative cell wall biogenesis protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 696 1 696 696 1297 100.0 0 MKIYFLPMLLSLFFLGACDKNDEIIPEDADENFITSVVMTVDGKSYTADITDNTVTITVP YTVSLNNAEVEFKYTTSATIIPDPETVTDWDNERTFRVTSYNGDAREYTYKVVKSEIESD GDVELKTTEEVASFAATKTTVVKGNLIIGSDAEEAEKITDISALASLKEVTGNIVIRNSY NGADLTGLDNIVSAGGLQVGSTDVASKATELHMISMKALETLSGDISVYNDQVTYVLFEK LATIEGSVMFNASSLQSFEFPVLTTVGQDLNLQGLNEENTAAGSIASLEIPELTSVGGVL SVNNLAKLTSMSFLKLKETGGLDFHTVPVMLETINLPEIETVNGSIIMEANMEAPPTGSF VPQRNDVLQAFGGMDKLTTIKGQIKIKNFTALKQLPDWSKITTLGSITLDYLEDVSGTLL LPNARFETFGETAPQIEIINKVQLSKIETAEDLSNVNFVITSLTNNKFPEITFKNIKDFT CKPTTNNTDYTISTIQHVYGNLNVTGQMRSNAKFPDLEIIDGYGYIQIPMFASITMPVLK EVGGQFYLSGNFTSCNLPLLSKVCCSASPVYYKEGEGSLAISLQSKSLDIPELLHVGGEG LFVNKATGITCDKLQTIDGTLQIKSATSLSQETLSMEKLETLHGVVFDGLTKFTDYTFFG KFIENGMITGESWSVTKCGYNPTFQNMKDKQYTQQD >gi|222159246|gb|ACAB01000113.1| GENE 50 56254 - 57354 727 366 aa, chain + ## HITS:1 COG:no KEGG:BT_1954 NR:ns ## KEGG: BT_1954 # Name: not_defined # Def: putative surface layer protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 366 1 366 366 751 100.0 0 
MIRVLFFIRMTMSRTIQRICLFLFCLPVFGSCMKWDYGEMEDFSVSASGLFITNEGNFQY SNATLSYYDPATCEVENEVFYRANGFKLGDVAQSMVIRDGIGWIVVNNSHVIFAIDINTF KEVGRITGFTSPRYIHFLSDEKAYVTQIWDYRIFIINPKTYEITGYIECPDMDMESGSTE QMVQYGKYVYVNCWSYQNRILKIDTETDKVVDELTIGIQPTSLVMDKYNKMWTITDGGYE GSPYGYEAPSLYRIDAETFTVEKQFKFKLGDWPSEVQLNGTRDTLYWINNDIWRMPVEAD RVPVRPFLEFRDTKYYGLTVNPNNGEVYVADAIDYQQQGIVYRYSPQGKLIDEFYVGIIP GAFCWK >gi|222159246|gb|ACAB01000113.1| GENE 51 57373 - 59451 1299 692 aa, chain + ## HITS:1 COG:no KEGG:BT_1953 NR:ns ## KEGG: BT_1953 # Name: not_defined # Def: putative TonB-linked outer membrane receptor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 692 1 692 692 1378 99.0 0 MKRHLILLFVGVSLPFLLAAQQKNSVSITKRVLRIPEVTVVGKRPMKDIGVQRTRFDSIA MRENIALSMADVLTFNSSVFVKNYGRATLSTVAFRGTSPSHTQVTWNGMRINNPMLGMTD FSTIPSYFIDDASLLHGTSSVNETGGGLGGLVRLSTSPANHEGFGLQYVQGVGSFSTFDE FLRLTYGDKHWQSSTRVVYSSSPNDYKYRNRDKKENIYDEDKNIIGSYYPTERNRSGAYK DLHVLQEIYYNTGEGDKFGLNAWYINSNRELAMLSTDYGNDMDFENRQREQTFRGVLSWD RVREKWKVGVKGGYIHTWMAYDYKRDKGNGEMASMTRSRSKINTFYGSADGDYAPSEKWL FTAGVSVHQHLVESADKNIISQEGNKAVVGYDKGRVEFSGSVSAKWRPVDHFAASLVLRE DMFGTEWAPVIPAFFIDGVLSKKGNIVAKASISRNYRFPTLNDLYFLPGGNPDLKSEHGF TYDVGLSFSVGKENVYALSGGINWFDSHIDDWIIWLPTTKGFFSPRNLKKVHAYGAETNA HLDIMLGKDWKLDMNGTFSWTPSINESEPMSPADQSVGKQLPYVPEFSATVTGRLSWRTW SLLYKWCYYSQRYTMSSNDYTLTGYLPPYFMNNVTLEKQLSFRWADLSLKGSINNLFDEE YLSVLSRPMPGINFEIFIGITPKFGKNKNSKR >gi|222159246|gb|ACAB01000113.1| GENE 52 59463 - 60602 955 379 aa, chain + ## HITS:1 COG:alr4031 KEGG:ns NR:ns ## COG: alr4031 COG0614 # Protein_GI_number: 17231523 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Nostoc sp. 
PCC 7120 # 51 375 93 420 426 227 36.0 3e-59 MNALKNLSLILLLSLAFTGCHNKSSKINDFNLLLYAPEYASGFDIKGAGGKESVLITVRN PWQGADSVTTWLFIVRNGEEVPEGFAGQVLKGDAKRIVAMSSTHIAMLDAIGEVRCITGV SGIDYISNPDIQARRDSIGDVGYEGNINYELLLSLDPDLVLLYGVNGASAMESKLEELDI PFMYVGDYLEESPLGKAEWMVVLSEVTGKREKGEKAFATIPVRYNALKKKVADSTLDTPS VMLNVPYGDSWFMPSTQSYVARLITDAGGRYIYQKNTGNASIPIDLEEAYLLASDADMWL NVGMANSLDDLKASCPKFTDTRCFKNGEVYNNNARTNTAGGNDYYESAVVNPDIVLRDLV KIFHPELVQEECVYYKQLK >gi|222159246|gb|ACAB01000113.1| GENE 53 60603 - 61583 618 326 aa, chain + ## HITS:1 COG:alr4032 KEGG:ns NR:ns ## COG: alr4032 COG0609 # Protein_GI_number: 17231524 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Nostoc sp. PCC 7120 # 6 323 22 356 362 235 48.0 7e-62 MRSRSTILFSILITLTVGLFLLDLAVGVVNIPIRDVWAALTGGNCSRATEKIVLNIRLIK AIVALLAGAALSVSGLQMQTLFRNPLAGPYVLGISSGASLGVALVVLAGIGSSIGIAGAA WVGAAVVLLVITAVGQRIKDIMVILILGMMFSSGVGAVVQILQYLSKEESLKAFVIWTMG ALGDVTSGQLLILVPSVFAGLLLAVLTIKPLNLLLFGEEYAVTMGLNIRRSRSLLFLSTT LLAGTITAFCGPIGFIGLAMPHVTRMLFQNSDHHVLLPGTILSGASILLLCDIISKIFTL PINAITALLGIPIVVWVVLRNKSITA >gi|222159246|gb|ACAB01000113.1| GENE 54 61580 - 62338 195 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 216 1 217 245 79 25 4e-14 MIELQHFSIGYKENSLLHEVNATIKKGQLTALIGRNGTGKSTLLRAIAGLNRCYSGKIIL DGHDIACMKTEDMAKTLAVVTTERTRIANLRCKDVVAIGRAPYTNWIGRMQETDKEIVMQ SLISVGMEAYANRTMDKMSDGECQRVMIARALAQDTPIILLDEPTSFLDMPNRYELVALL RRLVHDEKKCIMFSTHELDIALSMCDSIALLDTPNLSCLTASEMQKSGYIDRLFQNENIR FDSLCGTMILKQ >gi|222159246|gb|ACAB01000113.1| GENE 55 62335 - 62901 201 188 aa, chain + ## HITS:1 COG:no KEGG:BF3025 NR:ns ## KEGG: BF3025 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 188 1 188 188 391 100.0 1e-108 MSIYTVENFTSDITVEGYIAEFRDEPHFLELCKQCTNYGKSWGCPPFDFDTESFLRQYKY AHLMATKIIPEDKDIPIEYTQKLILPERIRIESELLDMERKYGGRSFAYIGKCLHCSDNE 
CTRNCGTPCRHPEKVRPSLEAFGFDIAKTLSELFNIELLWGKDGKLPEYLVLVSGFFHNE YELCNIAY >gi|222159246|gb|ACAB01000113.1| GENE 56 62830 - 63102 122 90 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MWSAPADSKLCFRLTKTGRFNGEYTDKPLLISSSSVAFYISNYNSFEITKQIEPNVVSLS VYFMSESLVCYVAKFVFIMKKSAYENKIFR >gi|222159246|gb|ACAB01000113.1| GENE 57 63485 - 63739 159 84 aa, chain - ## HITS:1 COG:no KEGG:BT_1948 NR:ns ## KEGG: BT_1948 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 84 1 84 84 149 100.0 2e-35 MKPPVTYSHPKSVIVMIHMTGGTEMPFRSDGSSYAEATSEARSGVEKRAGFRFLVHRSFL ILMGRRESAHPSGYGFLNYRSMGV >gi|222159246|gb|ACAB01000113.1| GENE 58 63816 - 64232 224 138 aa, chain + ## HITS:1 COG:no KEGG:BT_1947 NR:ns ## KEGG: BT_1947 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 138 1 138 138 262 100.0 3e-69 MDSEKEKNRLSDIVLERVGLTGNLLSAPVSPSLEPVVEIPSHGSQVRAGKVTGPEEYKRR FLVPAPRAAEWKTAYIDGRLHRRIAMLVRAAGCGSISGFIIRLLELHMEEHREDIASLLG EVYRPWDEDGQPGGTPRR >gi|222159246|gb|ACAB01000113.1| GENE 59 64262 - 64666 169 134 aa, chain + ## HITS:1 COG:no KEGG:BT_1946 NR:ns ## KEGG: BT_1946 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 126 1 126 128 232 100.0 3e-60 MTGKNVKERIGLGGMDADRIREIMGEAPACPRKRRGTSGNDIIRPARKNGPMQPSVYGEE YLHGIAGVQRRSLHIPAALHRKLSILAGASRNGKVTLEGFINHLVSRHLEEYRETTDMIL EESLPGPVIKARPP >gi|222159246|gb|ACAB01000113.1| GENE 60 64742 - 65530 524 262 aa, chain + ## HITS:1 COG:no KEGG:BT_1945 NR:ns ## KEGG: BT_1945 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 247 1 247 260 499 100.0 1e-140 MMKKEKELLVAIASQKGGVGKSVFTVLLASVLHYRKDVRVAVVDCDSPQHSIALMRERDM ENVMKNDDLKVNLYRQYERIRKPAYPVIKSDPEKGVEDLRRYMDEKGETFDIVLFDLPGT LRSEGVVHTVAAMDYIFVPLKADNIVMQSSLQFTKVLEEELIAKGNCNLKGIRLFWNMVD RRGRKNLYDAWNRVIHRMGLRLLSSHIPNTLRYNKEADPVCKGVFRSTLFPPDPRQEKDS GLPELVEEICHAIGLEESDTER Prediction of potential genes in 
microbial genomes
Time: Wed May 18 03:50:47 2011
Seq name: gi|222159245|gb|ACAB01000114.1| Bacteroides sp. D1 cont1.114, whole genome shotgun sequence
Length of sequence - 15734 bp
Number of predicted genes - 10, with homology - 9
Number of transcription units - 4, operones - 2 average op.length - 4.0

  N   Tu/Op     Conserved     S          Start       End   Score
                pairs(N/Pv)
                              + Prom        15 -      74     5.0
  1   1 Op 1    .             + CDS        317 -    2197     923  ## BT_1940 hypothetical protein
  2   1 Op 2    .             + CDS       2187 -    2384      81  ## gi|160890435|ref|ZP_02071438.1| hypothetical protein BACUNI_02877
                              + Term      2539 -    2584     8.2
  3   2 Tu 1    .             - CDS       2916 -    3020     114  ##
                              - Prom      3066 -    3125     9.1
                              + Prom      3034 -    3093     9.3
  4   3 Op 1    .             + CDS       3117 -    5978    1844  ## BT_1939 putative outer membrane receptor
  5   3 Op 2    .             + CDS       5992 -    7263    1026  ## BT_1938 hypothetical protein
  6   3 Op 3    .             + CDS       7287 -    8849    1215  ## BT_1937 hypothetical protein
  7   3 Op 4    .             + CDS       8907 -   11054    1683  ## BT_1936 hypothetical protein
  8   3 Op 5    .             + CDS      11061 -   13262    1477  ## BT_1935 hypothetical protein
  9   3 Op 6    .             + CDS      13265 -   15457    1505  ## BT_1934 hypothetical protein
                              + Term     15626 -   15662    -0.1
 10   4 Tu 1    .
- CDS 15404 - 15676 231 ## COG3328 Transposase and inactivated derivatives Predicted protein(s) >gi|222159245|gb|ACAB01000114.1| GENE 1 317 - 2197 923 626 aa, chain + ## HITS:1 COG:no KEGG:BT_1940 NR:ns ## KEGG: BT_1940 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 212 626 122 536 536 848 99.0 0 MKRKVLLSTLFLILHFTVALAQQVTISGQVLDEKSEPLIGATINIEGTTNAVITDLEGKF TIKVLPSEKLVISYLGYKPKTITIGKNRRFDIILDPSVTEMDEVVVVGYGSQRKSDIATA VASVNIKDIVNSSSTQTLQALQGKISGVQIIPTDGSLSSGMTFRIRGVNSVTGGTQPLFV IDGVPMPTQQITNEDTETVNNPLLGLNPNDIESNLNYWNPRPIKKDNPYLTVQRKLSPQT VEDFANRLFIYQVGKNNHIGFPFRKPSQMEILNFEMRNYFAETNTNYKAFATGGDKAQSC WMANFVPFDKVTDIYLFESAIDAMSFYEINHYTKETTCAFISTGGYVTKSQIENISRIFP SDKVKWNCCYDNDASGNGFDITTAYYLKGEECKAFARTNTGDTYKTIYLSFPDGNTQTFK EDAFSSGEYLKQHGIDNVNIIKPSRYKDWNELLVYYKRFDLNLGPGMKFIPAIEKTISQL NLRGYEQLANSISSSTKELVDSLLEQANYCISAPLAESGAYTLMVDCNIFMGLDTMVPVP SNLYVIEKCTQKKISAHAINEFLKKEYINIFRDMSSSDFKNFLEKDILTYTKGAVEKNFE KVILTFGWSLKPSILKKKSFDLEHGI >gi|222159245|gb|ACAB01000114.1| GENE 2 2187 - 2384 81 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160890435|ref|ZP_02071438.1| ## NR: gi|160890435|ref|ZP_02071438.1| hypothetical protein BACUNI_02877 [Bacteroides uniformis ATCC 8492] # 31 65 1 35 35 70 100.0 4e-11 MVYSITIQHDTIIVAVSSRHVVTIPARFNGMILGLWNLIQIFVAYCNENPNWVSNVRSLW GDRFK >gi|222159245|gb|ACAB01000114.1| GENE 3 2916 - 3020 114 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLFLLVPLHNLTDLNRLLSNNIATALYLGKSNIY >gi|222159245|gb|ACAB01000114.1| GENE 4 3117 - 5978 1844 953 aa, chain + ## HITS:1 COG:no KEGG:BT_1939 NR:ns ## KEGG: BT_1939 # Name: not_defined # Def: putative outer membrane receptor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 953 1 953 953 1823 99.0 0 MKKGILGCNIIYLFILICIGIALPIKSQNNYKVKLTGIVYEYDHNNKRLPLEFAAVSIPE IALGTTSDENGRYILENVPTGKIRMQIQYLGKVSIDTLINVNKDLVLNFTMRNEDFKLKE VTVTATNSRSGKSTSSHISRSAMDHMQATSLYDVMSLMPGGISQNQDMSSAQQINIRQVS SSSGPEAPMNAMGTAIIRDGAPISNNANLSAMSPTVLSGTETPASLAGGASPAGGTDVRS 
ISTENIESIQIVRGIPSVEYGDLTSGAVIINTKAGREPLRVKAKANPNIYQVSMGTGFEL GKKKGALNVSADYAYNTNNPISSYQHYQRATTKLLYSNTFFNNKLRTNSSFDFIYGKDQR ERNPDDEQTKTASEGRDIGFTLNTNGTWNINKGWLKTLRYVLSGTYMDKDSYYETVYSSA TSPYSMTTTNGAVLSNFAGQHIYDANGNQITNFGPEDINHYAVYLPSSYLGHYEIDSREV NLFAKVTSSLFKASGHVNNRILIGADFRSDGNVGKGKTYDPSTPPYRSQYGHNSSFRPRN YKDIPFINQFGAYVEDNFKWSISGTHDLNIQAGVRYDHTSVVGGIFSPRVNASIDLIPNL LSLQGGYGIAAKMPSLLYLYPENAYFEYININELTNENIPESQRLFMTTTEVRQVDNSDL KIAQNHKAEVGFNLRVGKTNLNVIAYKERLKDGYVMSQTFNTFNTFIYNEYQRTENGIEL SSSLPVLSTYAKPTNNLNIETKGLEFDLNIGRIDAIRTAFQINGSWMRTKSWRQGYSFYD NSEDAASARKPVAIYSQEGNASYKQQFVTTLRATHNIPRIGFVVTMTAQAIWQQSNWNTF GNDSIPVGYLALEDASVNMFPKGKYTTTQQVKDAGYGYMLNNVSHNNAIKESYSPYFCFN LNVTKEISNMLRVSFFANNMFRSYPRRESKRNPGSYIQLNNRFFFGLELSLTL >gi|222159245|gb|ACAB01000114.1| GENE 5 5992 - 7263 1026 423 aa, chain + ## HITS:1 COG:no KEGG:BT_1938 NR:ns ## KEGG: BT_1938 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 25 423 1 399 399 795 100.0 0 MKKYLIYLFTLASTLLIGCDSFRDMSGTAEVNPITVDVYLDITVENISTLKDLTVKFDNY DEDLHYVKEVTDNSVKVDGIIPGIYSVTVSGTAIDTENNEYYINGNSVNAALFKHGSALN IEVQGLKVSPLIFKEIYYCGSRPEKGGVYFRDQFYEIYNNSADILYLDGIYFANLTPGTA TTKLPIWPEADGNNYAYGERVWKFPGNGTEYPLAPGESCIISQFAANHQLDIYNPQSPID GSSSEFEFNMNNPNFPDQAAYDMQHVFYQGKAEMGSIPQYLTSVFGGAYVIFRVPEGEAW DPVNDENMKTTDLSKPNSNVYYAKIPIKYVLDAVEAVNNESKMNAKRVPGVLDAGITWVG ATYCGLGIARKLSTDEEGNPIIREETGTYIYQDTNNSTDDFERGVVPVMRRNGAKMPSWN HTL >gi|222159245|gb|ACAB01000114.1| GENE 6 7287 - 8849 1215 520 aa, chain + ## HITS:1 COG:no KEGG:BT_1937 NR:ns ## KEGG: BT_1937 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 520 1 520 520 1001 100.0 0 MKKQYMNYMKSCLLVMIACLVSMVGAAQGFSPAAMEQLKTRRLWSHSQNAAGMPFDDIQN YSNVILGYDLQDGNYCRPQEGQKEAIVGVSSEGFINLKNAYVWGAFNFAQKNLTDAGYNA SIADPFRGMPYYVADQHLSKWRNQYYDLKFRAATPLLGNHWALGLEGNYVATLAAKQRDP RVDTRFYTLGLTPGITYKLNNSHKFGASFKYSSIKEDSRMSNVNSYVDQDYYILYGLGTA IKGIGSGVTSNYIGDRFGGALQYNFSMPSFNLLLEGSYDVKAETVQQSYTTPKKIAGVKD 
KTAHVSLTMIQEGKDYTNYMRTTYTNRNIDGIQYISQRDNSESQSGWVELYNNIRSTYKA QTASLNYALSRNRGNEYSWKAELNVNYTKQDDEYLMPNSVQNAENLSLGLGGKKNFVLGN SLNRRLLIDVHVAYNNNLGGEYVYGGSHADYPTVTELQQGLTNYYTCDYYRIGGSITYSQ QVRENRRMNLFAKVVFDRVNTSDYDYDGRTHLSISLGCNF >gi|222159245|gb|ACAB01000114.1| GENE 7 8907 - 11054 1683 715 aa, chain + ## HITS:1 COG:no KEGG:BT_1936 NR:ns ## KEGG: BT_1936 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 715 1 715 715 1444 100.0 0 MLCSILSLRAQTFVKPAVKVKDTSFAVITDKGTFQACEAELKAYQEILGMEGLPTFIVYN EWNKPEDVKKVIVKLYKKDKLEGVVFVGDIPIPMLRKAQHMTSAFKMDEKNNDWRDSSVP SDRFYDDFDLQFDFLKQDSVENNFFYYNLAIKSPQQIRCDIYSARVKAVDNGEEPHAQIS RYFKKVVAEHQINNKLDQFFSYTGDGSYSNSLTAWTPETFTIREQMPGVFDKEGRARFIR YNFSDYPKDDVINMLKRTDLDLSIFHEHGMPERQYLSGSPATNRWNAHVDAMKYYYRGLA RRKQNNKKSFDEMLDMMKNTYGLDTTWIAGYDDPKVIAEDSLLDLRTGIILSEVTEFKPN SRMVIFDACYNGDFREKDYIAGRYIMSEGKCVTTFANSVNVLQDKMANEMLGLLGMGARV GQWAKLTNILESHITGDPTLRFQSINEVDANALFKEPYSESRMLELLQSPYADIQNFALH NLYRNDYPGISDLLRKTFETSSFMMVRFTCLALLEKISDKNFREVLHLAITDSYEFIRRT SVRMMQHVGLNEYVYPQIKAYVEDNLSERVAFNVSLGLQVFDQAAVQAAIDKVMAETYVL QDKEEMRKVLENANNSRSMQKELLSKETSERWRILYCNSLKNHMAHACVDGLLALLTDSS ESEKLKTCLLEAFAWFTHSYRKPDILRVCDQLRKDKSLSENLREEADRTYYRLKN >gi|222159245|gb|ACAB01000114.1| GENE 8 11061 - 13262 1477 733 aa, chain + ## HITS:1 COG:no KEGG:BT_1935 NR:ns ## KEGG: BT_1935 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 733 1 733 733 1449 99.0 0 MIKKIIYLAFLLPLAGNAQTTVIKPLVKQPTAFAIITDNQTYANTKDAMHQYKTAVEDDG LATYLISGDWQNPDQVKQIIIKTYQECPSLEGLVLIGDVPVALVRNAQHMTTAFKMNEKA FPWDQSSVPTDRFYDDLNLKFEFIRQDSVNHQHFYYKLTEDSPQRLNPTFYSARIKYPEK KEGDKYAAIASYLKKAAAAKADKHNQLDRVFSFNGASYNSDCLIVWMDDEKAYMENFPLA FGRQMGFKHWNFRMKHPMKYKLFSELQRKDLDLFMFHEHGMPTGQLINDELACTDFNNRY KMLKSTLYNAVMSHVGKRDKDTLRIQMQEKRQVNEVFFKDLDNPKFWEADSLHYADERIV TEDLMKRNLSTNPKMIMFDACYNGSFHENDYIAGQYIFNDGQTLVAQGNTRNVLQDRWTI EMIGLLSHGVRAGQYNKLIVSLEGHLFGDPTFRFAPIEANTLSTDITIHKDDKAYWKNLL 
NSPYADVQSLAMRMLADADTQKELSPLLLKKYRESGFNTVRMEAIKLLSRYQDDNFIEAL REGLNDTYEMVARQSAIYAGFVGDDSLLPAIVEALVEHNERLRVQMSANKALSLYPKEKV EKTIEDFYAKVDRLNENEEKKRLLRSLERMFVQEAKVHQTLMDVAAPEAKRISAIRNVRN YTFHFHVDDYLNVIRDAGNPQEVRVVMAEALGWFTNSVQRPHILEEIKKMQQTANLPEDL KAELEQTIKRLSL >gi|222159245|gb|ACAB01000114.1| GENE 9 13265 - 15457 1505 730 aa, chain + ## HITS:1 COG:no KEGG:BT_1934 NR:ns ## KEGG: BT_1934 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 730 1 714 714 1436 100.0 0 MNKYILLAITALCLQDMQAQTVVHPSIKTKTTFAIVVDQKSYDEAKSEIDAYRTSIEKEG LGTYLLIDDWKRPEPIREQLVKLHENEKTPLEGCVFIGDIPIPMIRDAHHLSSAFKRSPK ANWQKSSVPSDRYYDDFGLKFDYIKQDSLIPDYHYMTLRADSKQYISPDIYSARIRPLHL EGENRYQMLRDYLKKAVAEKAKQNAFDQLTMARGHGYNSEDPLAWSGEQIALREQLPQIF KSGNTVKFYDFNMRYPMKPLYLNEIQREGLDVMLFHHHGGPTMQYINGYENGSGINLSIE NAKIFLRSKVPSYAKKHGREAAIKEYAKQYGVPESWCAEAFDEEKIKSDSIVNRNMDIYT EDIRLLTPNARFILFDACFNGSFHLDDNIVGSYIFNKGKTIATMGCTVNTIQDKWPDEFL GLLAAGMRIGQFTRFTCFLENHLIGDPTFHFTNNAGLDMDINQALVAQEGNVTFWKKQLN SPMADMQAMALRQLSMANYSGLVELLKKSYHESNYFVVRLEALRLLALNYPTEVADVLQT AMNDSYELIRRYAVEYVEKNCNPELLPAWIESYLLRGHENRHRFRIFSAINTFDHDMALN ELKKQAADWSFYDSSYVNELLEYLPRQKKGLERDFALIDSPESTTKQIQSEISRFRNKPI AKAIEPLLNIIKNESQEEELRILAAETLGWYNLYYNKADIIKELNTFRTSNQKLMNEVTK TINRLKSQNR >gi|222159245|gb|ACAB01000114.1| GENE 10 15404 - 15676 231 90 aa, chain - ## HITS:1 COG:SMa0384 KEGG:ns NR:ns ## COG: SMa0384 COG3328 # Protein_GI_number: 16262658 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Sinorhizobium meliloti # 14 71 106 163 400 61 44.0 4e-10 MSLYSVDNIKEKLVISLYAKGMSVSDIEEEMREIYEIELSTSAISIITNKVNQAAQEWQN RPLDPVYLIVWMILPILTFQPINCFGNFVH Prediction of potential genes in microbial genomes Time: Wed May 18 03:51:55 2011 Seq name: gi|222159244|gb|ACAB01000115.1| Bacteroides sp. 
D1 cont1.115, whole genome shotgun sequence
Length of sequence - 35229 bp
Number of predicted genes - 27, with homology - 27
Number of transcription units - 21, operones - 5
average op.length - 2.2
N Tu/Op Conserved pairs(N/Pv) S Start End Score
1 1 Tu 1 . - CDS 2 - 3518 1904 ## COG1002 Type II restriction enzyme, methylase subunits
- Prom 3664 - 3723 4.3
2 2 Tu 1 . + CDS 4396 - 4689 312 ## BT_1930 hypothetical protein
+ Term 4704 - 4764 14.7
3 3 Op 1 . - CDS 4754 - 5965 1012 ## BF3004 tyrosine type site-specific recombinase
4 3 Op 2 . - CDS 5985 - 7214 938 ## BT_1928 transposase
- Prom 7432 - 7491 7.7
+ Prom 8017 - 8076 8.3
5 4 Op 1 . + CDS 8100 - 10703 2645 ## BT_1927 hypothetical protein
6 4 Op 2 . + CDS 10730 - 11596 795 ## BT_1926 hypothetical protein
+ Term 11632 - 11677 7.0
+ Prom 11601 - 11660 2.0
7 5 Tu 1 . + CDS 11693 - 12223 327 ## BT_1925 hypothetical protein
8 6 Tu 1 . + CDS 12564 - 13484 495 ## gi|160883490|ref|ZP_02064493.1| hypothetical protein BACOVA_01459
+ Term 13581 - 13632 8.1
- Term 13564 - 13624 9.7
9 7 Tu 1 . - CDS 13691 - 14362 695 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold
- Prom 14531 - 14590 6.1
- Term 14553 - 14602 4.3
10 8 Tu 1 . - CDS 14630 - 15910 1339 ## COG2873 O-acetylhomoserine sulfhydrylase
- Prom 15998 - 16057 6.8
+ Prom 15880 - 15939 7.3
11 9 Tu 1 . + CDS 15996 - 17135 717 ## BF3546 putative N-acetylmuramoyl-L-alanine amidase
+ Term 17311 - 17354 3.6
- Term 17642 - 17700 12.1
12 10 Op 1 . - CDS 17742 - 19277 1611 ## COG4799 Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta)
13 10 Op 2 . - CDS 19297 - 19806 546 ## COG1038 Pyruvate carboxylase
14 10 Op 3 . - CDS 19830 - 21344 1566 ## COG0439 Biotin carboxylase
- Prom 21557 - 21616 7.7
+ Prom 21868 - 21927 6.8
15 11 Tu 1 . + CDS 21957 - 22280 366 ## COG0526 Thiol-disulfide isomerase and thioredoxins
+ Prom 22399 - 22458 8.7
16 12 Op 1 . + CDS 22519 - 22836 176 ## PRU_1322 putative addiction module killer protein
17 12 Op 2 . + CDS 22844 - 23158 396 ## PRU_1323 putative addiction module antidote protein, HigA family
+ Term 23180 - 23206 -1.0
18 13 Tu 1 . - CDS 23160 - 24326 875 ## COG1488 Nicotinic acid phosphoribosyltransferase
- Prom 24447 - 24506 7.7
+ Prom 24237 - 24296 6.8
19 14 Tu 1 . + CDS 24493 - 26382 1146 ## gi|237714149|ref|ZP_04544630.1| predicted protein
+ Term 26472 - 26520 -0.3
20 15 Tu 1 . - CDS 26553 - 27332 249 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17
21 16 Tu 1 . - CDS 27705 - 28643 613 ## BT_1910 hypothetical protein
- Prom 28736 - 28795 4.8
22 17 Tu 1 . + CDS 29081 - 29950 849 ## BT_1908 hypothetical protein
+ Term 29958 - 30017 11.2
+ Prom 30003 - 30062 7.7
23 18 Op 1 . + CDS 30217 - 31020 484 ## COG2207 AraC-type DNA-binding domain-containing proteins
+ Term 31063 - 31098 1.5
24 18 Op 2 . + CDS 31115 - 31486 318 ## COG3324 Predicted enzyme related to lactoylglutathione lyase
- Term 31540 - 31601 -0.9
25 19 Tu 1 . - CDS 31702 - 32559 592 ## COG2207 AraC-type DNA-binding domain-containing proteins
- Prom 32807 - 32866 6.5
+ Prom 32791 - 32850 7.5
26 20 Tu 1 . + CDS 32871 - 33095 268 ## BT_2368 hypothetical protein
+ Term 33208 - 33244 3.2
- Term 33189 - 33238 10.2
27 21 Tu 1 .
- CDS 33357 - 34460 1086 ## Cpin_1073 AAA ATPase - Prom 34671 - 34730 8.5 Predicted protein(s) >gi|222159244|gb|ACAB01000115.1| GENE 1 2 - 3518 1904 1172 aa, chain - ## HITS:1 COG:jhp1409 KEGG:ns NR:ns ## COG: jhp1409 COG1002 # Protein_GI_number: 15612474 # Func_class: V Defense mechanisms # Function: Type II restriction enzyme, methylase subunits # Organism: Helicobacter pylori J99 # 6 1172 7 1155 1252 615 36.0 1e-175 MGLLKPNQVLNKAYRQVAIETTDFDLFKNALRTLRDNIVDGQREHTQKEHLRNFLSETFY KPYYMAPEEDIDLAIRLDKTIKSNIGLLIEVKSTTNKGEMISNDNLNRKALQELLLYYLK ERVNKKNNDIKYLIATNIHEFFIFDAHEFERKFYQNKQLRREFQDFVDGRKTSNKTDFFY TEIATTYIEEVKDSLEYTYFNLQDYQHLLDRTDSSASRKLIELYKIFSDTHLLKLSFQND SNSLNRGFYTELLHIIGIEERKENNKTVIVRKAVERRDEASLLENTINQLDAEDCLRHIN GRLYGNDYEERLFNVAMELCITWMNRILFLKLLEAQMLKYHNGDAIYKFLSITKIHDYDD LNTLFFQVLARDMGSRTHSIMRDFAYVPYLNSSLFEVTDLESKTIKINSLSQRTVLPVLA SSVLRNKKRNLQVNALPTLQYLFAFLDAYNFASEGSEEVQEEAKTLINASVLGLIFEKIN GHKDGSVFTPGFITMFMCREAITKTVLQKFNGYYGWNCTTRIELYNHIDNIVEANELINS LRLCDPAVGSGHFLVSALNELILLKYELGILVDATGKRIRKADYQLAIENDELIVTDTEG NLFAYNPLNAESRRMQETLFKEKRQIIENCLFGVDINPNSVKICRLRLWIELLKNAYYTA ESNYTYLETLPNIDINIKCGNSLLHRFALTDSIQTVLRESSISISQYKEAVAKYKNAQSK SEKQDLETFITEIKSKLKTEINRRDARLVRLNKRRSELANLQAPQLFEPTKKEKKASDKR IADLKKEIATLENIFEEIRSNKIYLGAFEWRIEFPEVLDAEGNFLGFDCIIGNPPYIQLQ SMGKSADVLECMGYITYARTGDIYCLFYELGMNLLTPNGFLCYITSNKWMRAGYGEALRG YFASKTNPIMLVDFAGIKIFDAITVEANILLSQKAANIFNTQACLVQDSNGLNNLSDFVQ QQGVKCNFADSIPWVILSPIEQSIKQKIESVGIPLKDWNIQINYGIKTGFNDAFIISTEK RDEILANCQTEDERVRTAELIRPILRGRDIKRYEYEWADLWIIATFPSRHYDIESYPAVK NYLLSIGIERLEQTGETHIVNGKKIKARKKTSNEWFETQDSISYWEDFSKPKIVWKIIGN QMAFAYDANNYVMNNACYIMTGDHLDYLLAVL >gi|222159244|gb|ACAB01000115.1| GENE 2 4396 - 4689 312 97 aa, chain + ## HITS:1 COG:no KEGG:BT_1930 NR:ns ## KEGG: BT_1930 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 97 1 97 97 169 100.0 2e-41 MEGIIDKENERVRRFFALLDDMEKKVERLARDNRPPFNGERFLTDRELSGMLKISRRCLQ DYRDQGRIPYIQLGGKILYRQSDIERLLEENYHPALV 
>gi|222159244|gb|ACAB01000115.1| GENE 3 4754 - 5965 1012 403 aa, chain - ## HITS:1 COG:no KEGG:BF3004 NR:ns ## KEGG: BF3004 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.fragilis # Pathway: not_defined # 1 403 1 403 403 800 100.0 0 MRSTFKLLFYINRNKVKSDGTTAVLCRISIDGKKSAVTTGVYCKPGDWDSKKCEIKTARE NNRLAAFRSRLEEAYGNLLRNQGVVTAELLKTTVSGANSVPEYLLQAGEVERERLRVRSK EINSTSTYRQSKTTQLNLRQFIESRGMKDIAFSDITEEFAESFKVFLKKELGHRNGHVNH CLCWLNRLIYIAVDREILRANPIEDVAYERKETPKLRHISRSELKRMMETPLPDPMMELA RRTFIFSSLTGLAYADTRALHPRHIGTTSEGRRYIRIRRAKTDVEAFIPLHPIAGQILEL YNTTDDDRPVFPLPVRDVLWYEVHGMGVALGMKENLSYHMARHSFGTLTLTAGIPIESIA RMMGHTNIDSTQVYAQVTDRKISSDMNRLMERRKPAAGKEAAG >gi|222159244|gb|ACAB01000115.1| GENE 4 5985 - 7214 938 409 aa, chain - ## HITS:1 COG:no KEGG:BT_1928 NR:ns ## KEGG: BT_1928 # Name: not_defined # Def: transposase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 409 1 409 409 798 99.0 0 MKVEKFKVLLYLKKSEPDKTGKAPIMGRSTLNRTMAQFSCKLSCTPGLWNARESRLNGKS REAVETNEKIERLLLAVHSAFNSLMERKRDFDAAAVRDMFQGNAGMQMTLLKLLDRHNGE MKARVGVDRAPTTLSTYLFTYRTLSEFIKAKFKVPDLVFGQLNEQFIRDYQDFILLEKGY AVDTLRGYLAILKKICRIAYKEGHSEKYHFCHFKLPKQKETTPKAISRENFEKLRDLEIP EKRRSHVITRDLFLFACYTGTAYADAVSITRKNLFRDDEGSLWLKYQRKKTDYLGRVKLL PEAVALIEKYRDDTRETLFPPQDYHTLRANMKSLRLMAGLSQDLVYHMGRHSFASLVTLE EGVPIETICKMLGHSNIKTTQIYARVTPKKLFEDMDRFVEATRDLKLIL >gi|222159244|gb|ACAB01000115.1| GENE 5 8100 - 10703 2645 867 aa, chain + ## HITS:1 COG:no KEGG:BT_1927 NR:ns ## KEGG: BT_1927 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 556 1 524 928 143 32.0 3e-32 MRKWTYLVATLLMAGTTATFTGCIDTDEPEGIAELRGAKSDFIKAQAAVELVEAELRKAQ VAEQELVNAGLALQNKSAEIDLQLHELDIQLKQLLIEKEEAATAQAKAEAEAAIAKAEAD KTKWENEKALIVEQYKEKMLLAETATAKAQEAYKQAMEQIEASKLLLTDEEQLRLNVVEA KVAYAKIAMDKAMYGYSTTEVETEQRLVSKSTSSQTSVGEGAGTTTTTGTDEKYITYYVI TQNPDNANNGYAEGSLKKLQEQLADYPNWVADGNIEAKLKNAVEIAESNLNIAQKTADGY KAILDNEYTTLADWETEVKKLKDEMATLDVQKRQYELKKEELKVANPNLEKNKTTAEKKL 
ADATGALDKVNTTTNVASAYSKKADQSILKGLNDAFDGISGGVAGYNKSTGEFSYAKTVL VTKAQEQIDGWNKLIDAATQGIDLEDLAWSKLEVVKADAEATKANETYDADYKAWEKARD KYIEVEAIDINASKKAVETAIAAYNKLGDAARTQDANISKLATALAEYYQDALDKELKST KAEATEASTNTKKAISAWIIANPLANFKSIVGLELNLIEWTAGAVTKANKIEQADANKLL ELGASAGDVPYDSWKKASTKVFGDNFNRGNEPRRVAVSADMVLKEVDGNRLATELQNGDY GTLGISIYATAEAARLTAVINQADTYKALKAEFTAQKAKEQTTLDKKKTELAANVTKAET ALADAKTAYDKVFKEVEANIKKVSEDYDYKDAIKDKIESTISEYIASLDELGDDYKDMTL EEIKEAINAEYLKAVTGLLDKKAALATAERNLEKLAEGTYTDNDYIKDTFANIQEKIKVK QAEYDAAKADYDAASAQLKALLAIFLK >gi|222159244|gb|ACAB01000115.1| GENE 6 10730 - 11596 795 288 aa, chain + ## HITS:1 COG:no KEGG:BT_1926 NR:ns ## KEGG: BT_1926 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 288 16 288 288 526 94.0 1e-148 MIKKLCTLLLLTCMTAAPSMAQRYSSSDDASFAPKKGQWQVSVLLGSGKFFNENTSYLLP KFSNDGGVVGLPNGGKDNSGDLNRYLNIGSLNNNSLVNIAGIEGKYFVSDNWDVNFQFSM NVSLTPKKDYVEGDNSVPDMIIPAQSYINAQMTNNWYVSVGSNYYFKTRNERIHPYLGGA LGFQMARIETTEPYTGDTYKDSDDSEELPSQVYVSGSKAGQMYGFKVAAVAGIEYSIAKG FVFGFEMHPLAYRYDLIQICPKGFDKYNASHHNIKIFEMPVVKLGFRF >gi|222159244|gb|ACAB01000115.1| GENE 7 11693 - 12223 327 176 aa, chain + ## HITS:1 COG:no KEGG:BT_1925 NR:ns ## KEGG: BT_1925 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 46 176 1 131 131 189 69.0 4e-47 MNRYVKLLSFLPGICLIVACTGHQIKGYEINGSAPLPEFEGKMVYMKDVSSGQPVDSAEI IHGKFDFSDTVTIVSPVVKVLSIRAGKSGLEYRLPVVIENGSIQAYISDVVCTGGTMLNE RMQDFLMAVDEYSTACENKQTEQIKFGFADLLKKYIEINDDNAVGEYIRTAYRSSL >gi|222159244|gb|ACAB01000115.1| GENE 8 12564 - 13484 495 306 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160883490|ref|ZP_02064493.1| ## NR: gi|160883490|ref|ZP_02064493.1| hypothetical protein BACOVA_01459 [Bacteroides ovatus ATCC 8483] # 1 306 1 306 306 605 100.0 1e-172 MPGFNHTTENMKTKTLLGFFLLSVMAFCVSACQDDAEIILFSGSELVDETGTCTNTVSST VLYLNGWEAENIGIANGKGGYSAQSSDETIVTATVIDNRLWLSSHGKKGKVTVTVTDKDG NYVTLPVTVSYGVLTFMCVDQPEFQVSRPSEDMLSLDSESELQEKVNVAMAPYSFINKGD 
ICILRPDDVYHLSEEGTGGKFIYKTEDEQILVEGTYQIEWASLGGKQKKAFVFTYIDKNN VEQQHTFFNFSPYLGTQSSTRERGPITTAWLEEVSDSPYLEGVLPADRTVVYWVDTVISS LSAEAV >gi|222159244|gb|ACAB01000115.1| GENE 9 13691 - 14362 695 223 aa, chain - ## HITS:1 COG:MA0289 KEGG:ns NR:ns ## COG: MA0289 COG2220 # Protein_GI_number: 20089187 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Methanosarcina acetivorans str.C2A # 15 202 13 200 225 108 37.0 7e-24 MSNFEMDSFTTKNGKSLKITFFKHASLLLDYAGQKIFIDPVSDYADYIQQPKADFILITH EHGDHFDTKAIAAIETNQTKIIANPNCRKMLNRGQEMKNGDVLQLAKDIKLEAVPAYNTT PGRDKFHPKGRDNGYILTLGGTRIYIAGDTEDIPELNQVKDIDIAFLPVNQPYTMTPEQA IRAAKIIKPRVLYPYHYGETDIHKVKDGLKNETGTEVRIRALQ >gi|222159244|gb|ACAB01000115.1| GENE 10 14630 - 15910 1339 426 aa, chain - ## HITS:1 COG:PM0738 KEGG:ns NR:ns ## COG: PM0738 COG2873 # Protein_GI_number: 15602603 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Pasteurella multocida # 9 426 5 420 422 522 58.0 1e-148 MAKQFKPETLCVQAGWTPKKGEPRVLPIYQSTTFKYDTSEQMARLFDLEDSGYFYTRLQN PTNDAVAAKIAALEGGVAAMLTSSGQAANFYAIFNICQAGDHFVCSSAIYGGTFNLFGVT MKKLGIDVTFVNPDASEEEISAAFKPNTKALFGETISNPSLEVLDIEKFARIAHSHGVPL IVDNTFPTPINCRPFEWGADIVVHSTTKYMDGHATSVGGCIVDSGNFDWDAHAEKFPGLC TPDESYHGLTYTKAFGKGAYITKATAQLMRDLGSIQSPQNSFLLNLGLETLHLRMPQHCR NAQKVAEYLSKNEKVAWVNYCGLPDNKYYSLAQKYMPNGSCGVISFGLKGGRDVSIKFMD SLEFIAIVTHVADARSCVLHPASHTHRQLTDEQLMEAGVRPDLIRLSVGIENADDIIADI EQALNA >gi|222159244|gb|ACAB01000115.1| GENE 11 15996 - 17135 717 379 aa, chain + ## HITS:1 COG:no KEGG:BF3546 NR:ns ## KEGG: BF3546 # Name: not_defined # Def: putative N-acetylmuramoyl-L-alanine amidase # Organism: B.fragilis # Pathway: not_defined # 35 376 20 346 346 538 79.0 1e-151 MSEMITLAKEIVNVVMKNKLYILLFLAFLFSGTTLWAQQKATPKAGEGISSFLLRHNRSP KKYYDDFIELNKQKLGKNNVLKVGVTYVIPPVKKSTTTSAKTTPVKNTGAKNTTSESAGT KQPSSKAKSTKIGTTINEPLFGKQLANVKVTSNRLAGACFYVVSGHGGPDPGAIGKVGRY 
ELHEDEYAYDIALRLARNLMQEGAEVHIIIQDAKDGIRDDSYLSNSKRETCMGDAIPLNQ VQRLQQRCDKINALYRKDRKNHSYCRAIFIHIDSRSKGKQTDVFFYYSNKKGDSKRLANN MKDTFESKYDKHQPNRGFSGTVSGRNLYVLSHTTPASVFVELGNIQNTFDQRRLVINSNR QALAKWLMEGFLKDYKEKK >gi|222159244|gb|ACAB01000115.1| GENE 12 17742 - 19277 1611 511 aa, chain - ## HITS:1 COG:VNG1529G KEGG:ns NR:ns ## COG: VNG1529G COG4799 # Protein_GI_number: 15790513 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) # Organism: Halobacterium sp. NRC-1 # 4 511 3 516 516 624 59.0 1e-178 MEEMNKAYATFEERDRIASLGGGVDKIEKQHESGKMTARERIEMLLDKGTFVELDKLMVH RCTNYGMDKNKIPGDGIVSGYGKVDGRQVFVYAYDFTVYGGSLSASNAKKIVKVQQLALK NGAPIIALNDSGGARIQEGIESLSGYADIFYQNTMASGVIPQISAILGPCAGGACYSPAL TDFIFMVKEKSHMFVTGPDVVKTVIHEEVSKEELGGAMTHSSKSGVTHFMCNSEEELLMS IRELLSFLPQNNMDEARKQPCADETNREDAFLDTIVPADPNVPYDMKDIIERVIDNGYFF EVMPNFAKNIIIGFARMAGRSVGIVANQPAYLAGVLDIDASDKASRFIRFCDCFNIPLIT FEDVPGFLPGYTQENNGIIRHGAKIVYAFAEATVPKLTVITRKAYGGAYIVMNSKQTGAD VNFAYPSAEIAVMGADGAINILFRKADEATKAKELEAYKEKFATPYQAAELGYIDEIIYP RQTRKRLIQALEMTENKMQTNPPKKHGNMPL >gi|222159244|gb|ACAB01000115.1| GENE 13 19297 - 19806 546 169 aa, chain - ## HITS:1 COG:AGc4940 KEGG:ns NR:ns ## COG: AGc4940 COG1038 # Protein_GI_number: 15889978 # Func_class: C Energy production and conversion # Function: Pyruvate carboxylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 88 164 1095 1171 1174 62 40.0 4e-10 MEIHIGNRVAEVELVSKEDNKVVLTIDGKTFEADVVMAENGTCNILMDGRSSNAQLIRRD NGKSYKVNTHYSSFNVEIVDSQAKYLRMRKKGEEEQNDCIISPMPGKVVKIPVAVGQEMK AGDTAIVIEAMKMQSNYKVTSDCRIKEILVQEGDNITGDQTLITLEPIA >gi|222159244|gb|ACAB01000115.1| GENE 14 19830 - 21344 1566 504 aa, chain - ## HITS:1 COG:MA0675 KEGG:ns NR:ns ## COG: MA0675 COG0439 # Protein_GI_number: 20089560 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: Methanosarcina acetivorans str.C2A # 1 463 1 464 493 512 55.0 1e-145 MIKKILVANRGEIAVRVMRSCREMEITSIAIFSEADRTAKHVLYADEAYCVGPAASKESY 
LNIEKIIEVAKAAHADAIHPGYGFLSENATFARRCQEEGIIFIGPNPETMEAMGDKIAAR IKMIEAGVPVVPGTQDNLKSVEEAVELCNKIGYPVMLKASMGGGGKGMRLIHSAEEVEEA YTTAKSESLSSFGDDTVYLEKFVEEPHHIEFQILGDKHGNVIHLCERECSVQRRNQKIVE ETPSVFVTPELRKDMGEKAVAAAKAVNYIGAGTIEFLVDKHRNYYFLEMNTRLQVEHPIT EEVIGVDLVKEQIKVADGQALQLKQENIQQRGHAIECRICAEDTEMNFMPSPGIIKQITE PNSIGVRIDSYVYEGYEIPIYYDPMIGKLIVWATNREYAIERMRRVLHEYKLTGVKNNIS YLRAIMDTPDFVEGHYDTGFIAKNSEFLQQRITRTSEYSENIALIAAYIDYLMNLEENNS GMAADNRPISKWKEFGLHKGVLRI >gi|222159244|gb|ACAB01000115.1| GENE 15 21957 - 22280 366 107 aa, chain + ## HITS:1 COG:slr0233 KEGG:ns NR:ns ## COG: slr0233 COG0526 # Protein_GI_number: 16331440 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Synechocystis # 22 91 21 90 105 68 40.0 2e-12 MKEKKIAREERNQEKLANGDWVMAEFYATWCPHCQRMKPVVEEFKKLMVGTLEVVEVDID QEPALTDFYTVESTPTFILFRKGQQLWRQSGELPLERLEKAVKGFKF >gi|222159244|gb|ACAB01000115.1| GENE 16 22519 - 22836 176 105 aa, chain + ## HITS:1 COG:no KEGG:PRU_1322 NR:ns ## KEGG: PRU_1322 # Name: not_defined # Def: putative addiction module killer protein # Organism: P.ruminicola # Pathway: not_defined # 1 105 1 105 105 100 46.0 1e-20 MVIEFEKEYLSELYYEGKCNDKKHRFQPQVIRNYVKRIVTLAEALNVEALYPLNSLNYEV LTGSKKDISSIRIDKQYRLEFKISTTDSEPIITICSIIDITNHYK >gi|222159244|gb|ACAB01000115.1| GENE 17 22844 - 23158 396 104 aa, chain + ## HITS:1 COG:no KEGG:PRU_1323 NR:ns ## KEGG: PRU_1323 # Name: not_defined # Def: putative addiction module antidote protein, HigA family # Organism: P.ruminicola # Pathway: not_defined # 11 101 7 97 99 102 50.0 4e-21 MSNLGYSFIPVHPGEIIKEELQSRGISQKRFAEVVGVSYTMLNDILNGRRPVSTDFALLI EAATNINAEMLMNMQTRYNMQIARKDKNVIAHLENLRKICATLL >gi|222159244|gb|ACAB01000115.1| GENE 18 23160 - 24326 875 388 aa, chain - ## HITS:1 COG:MA2533 KEGG:ns NR:ns ## COG: MA2533 COG1488 # Protein_GI_number: 20091361 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid 
phosphoribosyltransferase # Organism: Methanosarcina acetivorans str.C2A # 2 379 1 391 404 277 40.0 2e-74 MIVKTLLDTDLYKFTTSYAYIKQFPYAMGTFSFNDRNETKYTEAFLETLKAEIKNLSQLR FTEEELEYMTKNCRFLPRVYWEWLSSFRFDPNKIDIHLDEACHLHIEVTDLLYKVTLYEV PLLAIVSEIKNRFFGHVADMNGILCKLSEKIELSNQHQLRFSEFGTRRRFSIDVQETVIK RLNETAQYCTGTSNCNFAMKYGMKMMGTHPHEWFMFHGAQFGYKHANYMALENWVNVYDG DLGIALSDTYTSGIFLSNLSRKQAKLFDGVRCDSGNEFEFIDRLVARYKELGIDATTKTI VFSNALDFTKALDIQEYCKNKIHCSFGIGTNLTNDTGFEPSNIVMKLTQCKMNVNQEWRE CIKLSDDEGKHTGSLEEVQACLHELRLN >gi|222159244|gb|ACAB01000115.1| GENE 19 24493 - 26382 1146 629 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714149|ref|ZP_04544630.1| ## NR: gi|237714149|ref|ZP_04544630.1| predicted protein [Bacteroides sp. D1] # 1 629 4 632 632 1239 100.0 0 MKNALVYSCVLLLLSCGGKTGGTSQSSSTNQETQENALSEKTTAISEAFSMDFYKGSDWI IPDTLSESLLKEPEASSLKVYDSDSICRKQVDGVERCFYKDSLMNGAYKHFYFRSTDSFV SEPYYSVVVYKDGIKNDTVRHYSISSHCMTSRIITLDSICELYQSYYSNGYPHIFYQATR GKLNGVRQRWYYDGKPDWFEHYKDGKLNGEEMRWHANGHLWQVNHYIDDQEVLPTECWYY SHSEYAYNDPDGVAEAPDWYEYKYQASEGTPFYYIQEIYALENGKKYLLKRKYFEKDSDE PIYFMDDKCDSVKRSIVTVGTQKRLDIRNYSGGRLRRFESYPYSEKGVPGVMTEIYSEEG NRIAKSMYDYHSGRMISLRMYSNTGADVKKLEFAEYTNMKADSCNKQLDSGKGRYTDNQN KLSVRWKKGELALLNSPAKEDEDGFIKWYYDGYQSDYGLHFFHYSGFESWGYFVMSDVTG EVYEYRSIDTPLFCGKSGLFLVVDENPYKEECYVRVYKMLPEGRLAEVAALNRGGGDYFE VDLDDFVWVGESSFIANKKTSEDELQCDFYGGCSDSCLEEYRKCGYLQIDEDGWGYVRVD LRPDALQTKQELPSSLEAINEMNAWVKSL >gi|222159244|gb|ACAB01000115.1| GENE 20 26553 - 27332 249 259 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 7 247 4 238 242 100 30 1e-20 MNRFENKVVVITGAAGGIGEATTRRIVSEGGKVVIADHSEKKAEQLANELTHAGADVRHV YFSATELQSCKELIDFSMNEYQRIDVLINNVGGTDPKRDLNIEKLDINYFDEAFHLNLCC TMYLSQQVIPIMTTNGGGNIVNVASISGLTADANGTLYGASKAGVINLTKYIATQMGKKN IRCNAVAPGLVLTPAALDNLNEDVRNIFLGQCATPYLGEPEDVAAAIAFLASNDARYITG QTIVVDGGLTAHNPTVALS >gi|222159244|gb|ACAB01000115.1| GENE 21 27705 - 28643 613 312 aa, chain - ## HITS:1 COG:no KEGG:BT_1910 NR:ns ## KEGG: 
BT_1910 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 312 1 312 312 552 81.0 1e-156 MIISASRRTDIPAFYSQWFFNRIKEGYVLVPNPYHPKMISRISLSPAVVDCFVFWTKNPA PMLNQLDKLQDYNYYFQFTLNPYGEKLENRLPSIDKRIDTFKKLADKIGREKMIWRYDPI LTNEEYNVSFHQEAFARIAHELKDHTSKCMLGFIDHYPHIRNSIQPFNINPLTKEEIEEM AVSFKKTIDIYPGIQLDTCTVKVDLRHLGIPSGLCIDKGLIEKITGYPILAQKDKNQRNI CNCIESIDIGTYESCLNGCVYCYAIKGNYNTAAYNMKKHDKNSPLLIGHLSEDDIVKERE MKSLRDNQYSLF >gi|222159244|gb|ACAB01000115.1| GENE 22 29081 - 29950 849 289 aa, chain + ## HITS:1 COG:no KEGG:BT_1908 NR:ns ## KEGG: BT_1908 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 289 1 289 289 447 81.0 1e-124 MKKNILFFGTLVGVLLLVSCSGGNKKQAASSVTSEELDNASKVINYYHTSLIVLRHVANA KDINAVLGYMEQTGKVPEVSPIAPPEVSARDTAELMDPGDYFNIEVRQNLKQSYRGLFSA RTQFYDNFNKFLSYKQAKETAKIGKLLDENYRLSVEMSEYKQVIFDILSPLTEQAEKELL ADEPLKDQIMAMRKMSGTVQSIMNLYSRKHALDGMRIDMKMAELKKELEAAKKLPAVTGY DEEQKNYYSFLTSVESFMKDMQKARDKGSYSDEDYNAMSEAYEYGLSVI >gi|222159244|gb|ACAB01000115.1| GENE 23 30217 - 31020 484 267 aa, chain + ## HITS:1 COG:CC2573 KEGG:ns NR:ns ## COG: CC2573 COG2207 # Protein_GI_number: 16126811 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Caulobacter vibrioides # 61 259 67 263 270 89 27.0 9e-18 MQSFQLIKPCLALAPYIRHYWILQDDSVTPVSERTLPIGCMQLVFHKGKQLLLLGESELQ PQSFISGQSVGFSDVQSTGRIEMITVVFQPYAVKALFHIPSHLFHGQNVDTDAMEDVELS DLVKQVTDTSDNAVSIRLIEQFFLRRLYTLPEYNLKRMSAVFHEINLQPQINIAHLSETA CLSSKQFGRIFADYVGTTPKEFIRIVRMQRALSMLQQDATIPFVQVAYECGFSDQSHMIK EFKLFSGYTPAEYLSVCAPYSDYFSEL >gi|222159244|gb|ACAB01000115.1| GENE 24 31115 - 31486 318 123 aa, chain + ## HITS:1 COG:CC1834 KEGG:ns NR:ns ## COG: CC1834 COG3324 # Protein_GI_number: 16126077 # Func_class: R General function prediction only # Function: Predicted enzyme related to lactoylglutathione lyase # Organism: Caulobacter vibrioides # 1 121 45 164 167 81 38.0 3e-16 
MKKLIAFFEIPASDFHRAVDFYETVLGMQLPTFECETEKMACFTEEGETVGAISYASNFD FLPSTHGVLIHFNCEDIEQTLEKVLLKGGKVVIPRTKIEADGKGWFAVFTDSEGNRIGIY AEK >gi|222159244|gb|ACAB01000115.1| GENE 25 31702 - 32559 592 285 aa, chain - ## HITS:1 COG:BH3506_1 KEGG:ns NR:ns ## COG: BH3506_1 COG2207 # Protein_GI_number: 15616068 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 10 115 2 107 130 91 37.0 2e-18 MQQKKTTKEEYQKCVNVVVEYINQHLGEDIDLKSLARISNFSPFYFHRIMKAFLGEPIGT FIVRTRTEAAARLLRYSDVPIADIAYRIGYSSPSSLSKVFKQFYGISPLEYRNNKNFVIM KPAIIRPDLELKREIKNVSERNVIYIRLTGDYKLNDYGGTWGRLFQFIKEQQLPMGDFSP LCIYHDDPKVTPAEKLRTDVCMVMPVKVTPKSDVGFKVIPAGRYAIFLYKGPYDNLQAVY DTIYGKCLPEMECTLRDEASAERYLNNPCDTAPEELLTEIYIPVE >gi|222159244|gb|ACAB01000115.1| GENE 26 32871 - 33095 268 74 aa, chain + ## HITS:1 COG:no KEGG:BT_2368 NR:ns ## KEGG: BT_2368 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 74 1 74 74 105 72.0 4e-22 MDKTIVGNNAGKVWCALKEIGEISIPELARRLNLSVESTALAAGWLARENKICIQRKNGL IALSDESTFPFSFG >gi|222159244|gb|ACAB01000115.1| GENE 27 33357 - 34460 1086 367 aa, chain - ## HITS:1 COG:no KEGG:Cpin_1073 NR:ns ## KEGG: Cpin_1073 # Name: not_defined # Def: AAA ATPase # Organism: C.pinensis # Pathway: not_defined # 23 367 50 377 381 160 30.0 8e-38 MSKEYTLADFTDIFDYSTGWFSDSSICSLFYQIFKRFPSMKMVKYKVNQDFLKEIKELYQ QDDAFDIFEHIYCNHFENEKEEEEEEEDTSTKELYDCIVICKKNLMIGYFDNCIKIAYSN IDKEEIDRINQICENHKKENEKLNNLFIVTYAHNYFSLKQSQINKPAIQIDRHYNNDFAP VAAEIENFLLEENKSGLIILHGKQGTGKTTYIRHLINLGKKRMIYMSGDLVDKLSDPSFI TFIRQQKNSIFIVEDCEELLSSRNGSNRMNAGLVNILNISDGLLSDELCIKFICTFNAPL KDIDEALLRKGRLAARYEFKDLTTDKVNQLIKEEGLDIPQQTQPMTLAEIYNYEGMDFSL GRKRVGF Prediction of potential genes in microbial genomes Time: Wed May 18 03:53:27 2011 Seq name: gi|222159243|gb|ACAB01000116.1| Bacteroides sp. 
D1 cont1.116, whole genome shotgun sequence
Length of sequence - 17210 bp
Number of predicted genes - 11, with homology - 11
Number of transcription units - 5, operones - 4
average op.length - 2.5
N Tu/Op Conserved pairs(N/Pv) S Start End Score
+ Prom 217 - 276 6.7
1 1 Op 1 . + CDS 358 - 537 80 ## gi|237713827|ref|ZP_04544308.1| predicted protein
2 1 Op 2 . + CDS 566 - 1774 981 ## COG5026 Hexokinase
+ Term 1874 - 1919 6.7
- Term 1865 - 1904 6.7
3 2 Tu 1 . - CDS 2011 - 4005 1690 ## BT_1871 putative alpha-glucosidase
- Prom 4057 - 4116 10.8
4 3 Op 1 . - CDS 4119 - 5216 725 ## Cpin_2851 hypothetical protein
5 3 Op 2 . - CDS 5276 - 7387 1698 ## COG1874 Beta-galactosidase
- Prom 7441 - 7500 4.9
6 4 Op 1 . - CDS 7521 - 9572 1727 ## BT_3313 hypothetical protein
7 4 Op 2 . - CDS 9599 - 10606 903 ## Fjoh_2023 hypothetical protein
8 4 Op 3 . - CDS 10630 - 12285 1308 ## Dfer_0810 RagB/SusD domain protein
9 4 Op 4 . - CDS 12296 - 15658 2706 ## Dfer_0811 TonB-dependent receptor
- Prom 15686 - 15745 4.5
10 5 Op 1 . - CDS 15928 - 16941 781 ## COG3712 Fe2+-dicitrate sensor, membrane component
11 5 Op 2 . - CDS 17013 - 17210 93 ## BT_1877 RNA polymerase ECF-type sigma factor
Predicted protein(s)
>gi|222159243|gb|ACAB01000116.1| GENE 1 358 - 537 80 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713827|ref|ZP_04544308.1|
## NR: gi|237713827|ref|ZP_04544308.1| predicted protein [Bacteroides sp.
D1] # 1 59 1 59 59 87 100.0 3e-16 MQVKNRRQEKVCVSNLNFAFFKKEQKGKEQKERDKSRNPAGAIYCIPTEQNSFSVVIHT >gi|222159243|gb|ACAB01000116.1| GENE 2 566 - 1774 981 402 aa, chain + ## HITS:1 COG:TP0505 KEGG:ns NR:ns ## COG: TP0505 COG5026 # Protein_GI_number: 15639496 # Func_class: G Carbohydrate transport and metabolism # Function: Hexokinase # Organism: Treponema pallidum # 6 287 20 306 444 125 31.0 1e-28 MEKNIFQLDNEQLKEIARSFKAKVEEGLNTENAEIQCIPTFITPKANGINGKSLVLDLGG TNYRVAIVDFSKMPPTIHPNNGWKKDMSIMKTPGYTREELFKEMADMITGIKREKEMPIG YCFSYPTESVPGGDAKLLRWTKGVDIKEMIGEVVGKPLLDYLNERNKIKFTNIKVLNDTV ASLFAGLTDSSYDAYIGLIVGTGTNMATFIPADKIKKLNPSHKVDGLIPVNLESGNFHPP FLTAVDNTVDGISGNPGKQRFEKAVSGMYLGDILKATFPLEEFEEKFDAQKLTSIMNYPD IYKEVYVEVAQWIYSRSAQLVAASLTGLIMLLKSYNKNIRKICLVAEGSLFWSKNRKDKN YNMIVMEKLRELFSLFGLEDVEVDIKSMNNANLIGTGIAALS >gi|222159243|gb|ACAB01000116.1| GENE 3 2011 - 4005 1690 664 aa, chain - ## HITS:1 COG:no KEGG:BT_1871 NR:ns ## KEGG: BT_1871 # Name: not_defined # Def: putative alpha-glucosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 664 1 662 662 1245 87.0 0 MKKLKILLFLCIVACTLTVQAQKQFTLSSPDGKLQTTITVGDKLTYDIRCNGRQILAPSP ISMTLDNGEIWGEKAKLSGTSGKKVDQMIPSPFYRASELRDHYNELTLRFKKDWNVEFRA YNDGIAYRFVSRAKKPFNVVDETVDYHFPSDMVASVPYVKTGKDGDYDSQYFNSFENTYT TDPLSKLNKKRLMFLPLVVDAGEGVKVCITESDLENYPGLYLSAIEGENRLTGRFAPYPK KMVQGGHNQLQMLVKEHEAYIAKVDKPRNFPWRMSIVTTSDKDLAASNLSYLLAAPSRLT DLSWIKPGKVAWDWWNAWNLDGVDFVTGVNNPTYKAYIDFASANGIEYVILDEGWAVNLQ ADLMQVVKEINLKELVDYAASKNVGIILWAGYYAFERDMENVCRHYADMGVKGFKVDFMD RDDQLMTAFNYRAAEMCAKYKLILDLHGTHKPAGLNRTYPNVLNFEGVNGLEQMKWSPAS VDQVKYDVMIPFTRQVSGPMDYTQGAMRNASKGNYYPCNSEPMSQGTRCRQLALYVVFES PFNMLCDTPSNYMREPESTGFIADVPTVWDESIVLDGKMGEYIVTARRSGNVWYIGGITD WTARDIEVDCSFLGGKTYDATLFKDGANAHRIGRDYKCESIRIKNDSKLKIHLAPGGGFA LKIK >gi|222159243|gb|ACAB01000116.1| GENE 4 4119 - 5216 725 365 aa, chain - ## HITS:1 COG:no KEGG:Cpin_2851 NR:ns ## KEGG: Cpin_2851 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 20 361 11 350 352 381 53.0 
1e-104 MKNKWCVNVVFLFFLVGICVCACTQDSNSSNNSRWTEEKIQEWYDNQPWLVGCNYIPATA INQIEMWSADTFDPEQIDKELSWAHELGFNTLRVFLSSVVWQNDAAGMKKRMDDFLNICG QYSIRPMFVFFDDCWNPESAYGKQPEPKTGVHNSGWVQDPSCSLRKDTLTLYPFLQEYVK DIVRTYANDDRILMWDLYNEPGNSKHEETSLSLLTNVFRWVRDCKPSQPITAGVWDYNSP RKNVLNAFVLNHSDIISYHNYDNEERHGECIKFLKMLNRPLICTEYMARRNDSRFCNVLP LMKKEKVGAINWGFVAGKTNTIFAWDDVIPSGEEPELWFHDIYRPTGVPYQQEEVDCIQS LTGKR >gi|222159243|gb|ACAB01000116.1| GENE 5 5276 - 7387 1698 703 aa, chain - ## HITS:1 COG:TM1195 KEGG:ns NR:ns ## COG: TM1195 COG1874 # Protein_GI_number: 15643951 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Thermotoga maritima # 34 696 2 646 649 342 32.0 2e-93 MKKSLLVLALLFSLASFSQSPVDKSFPPEELFALGSYYYPEQWDSSQWERDLKKMSEMGI KFTHFAEFAWAMMEPEEGKYDFEWLDRAVSLAEKYGLKVIMCTPTPTPPAWLSKKYPDIL IQRDNGVSIQHGRRQHASWSSDRYRRYVENIVSRLAMRYGNHPAVIGWQIDNEPGHYGVV DYSENAQLKFRVWLQKKYGTIDKLNDTWGTSFWSETYQNFDQVRLPSQQEVPDKPNPHAM LDLNRFMADELAGFVNMQADILRQHINKDQWITTNLIPVFNPVDPVRIDHTDFLTYTRYL VTGHNQGIGSQGFRMGIPEDLGFSNDQFRNRVGKTFGVMELQPGQVNWGVYNPQPLPGAV RMWVYHVFAGGGKFVCNYRFRQPLKGSEQYHYGMIMTDGVTLSPGGEEYVRITQEMKKLR AAYDKKSRMPKQLASRRIGLLFDMNNYWEMEFQRQTDQWWTMPHIHKYYNLLKSFAAPVD VISEKEDFSGYPFLIAPAYQLLDNNLVERWTEYVKNGGHLILTCRTGQKDRNAKLWEAPL AAPIHQLAGINSLYYDHLPHSLYGKVDFGGEEYAWNNWADVLTPSAGTDVWAVYADQFYK GAASVIHRRLGKGTATYIGTDTDDGKLEKEVVRRVYKEAGVPTEDLPYGVVKEWRDGFYI ALNYTSDIQEIAIPDEAEILIGSARLEPAGVVVWKEQSDDRHK >gi|222159243|gb|ACAB01000116.1| GENE 6 7521 - 9572 1727 683 aa, chain - ## HITS:1 COG:no KEGG:BT_3313 NR:ns ## KEGG: BT_3313 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 182 570 175 554 667 65 23.0 9e-09 MKYYIYTIFLLLLAASCSDDVQKWDNWPEWKLASPLSVGGNVLDEEIYSNFQGKKLHLEK GQEIEFSGTDGIESILSPDYFEYLSGNKARFKGETGDYSVLYDPVNELLYVEKAGATYPE GLWFCGANWGHPQAGVVTTSGWSMDGANNVLYCYKSADNVFQLTVYLANNFSFKFFKHRG WGEGDNEITTLPEDNITLTTPFLVAGKSGGDFIPGPLFQPGVYLITLDLNNNTCAFEAKD ENIQEQTFLVNGHEMGILEEASSYLGIALELHEGDEVTFGNFGDVRKMLQPDFFEDITKD 
KATFIGADGNYKLFYDPVNKLIYLENRSVNYPDGLWVCGSNFGHPQAGRVTVATWTFNLP SDAFQCVKISDNVFETTLYLVKDFQFKFYKQRPWGGELASTTVNPYPINLLGKGWFYSDP ATGGTGGGHFTGDFVAGPDFTPGVYRVRIDLNKNICMFIDKVDEGQLGEESYKINGTELT QSNDPNYIGVELNLTKGQTVDFEGFSYLDYMLQPEYFTNENGQYKFNAPDGKYKISYNKN RELIYVEKTTGAEFPETVWITGATFGHPRISGLLADDIGNWGWENPKDFICCVKTGDRIF ETNLFLNNDFMFRFYKKKGWNNEITSFDVTIVSEGDLIARGGYWNGDQWQETENFGPGAN FRAGIYHVKLDMNTNTCTFTKKY >gi|222159243|gb|ACAB01000116.1| GENE 7 9599 - 10606 903 335 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_2023 NR:ns ## KEGG: Fjoh_2023 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 38 335 42 342 342 212 40.0 2e-53 MKRMSYYIIGLLLSLMSWACSDDVETGREEIPVETDGGYLFAHMTNANYGKLYYAASRDG VNWETLNKGRIINSAYIGHPDICQGHDGAFYMIAVNPLALWRSEDLVTWTSAPLDEMIFN RSNAQGFYTTYYWGAPKMFYDKDSGQYIISWHACNDPDKDDWDSMRTLYVLTKDFETYTE PQKLFNFTGADENMAIIDAIIRKVNGVYYAILKDERDPAVAPETGKTVRIATSSNLTGPY TNPGAPVTPNDMMREAPIFIERPNHSGWFIYAESYAAKPYGYHLFQSTSMDGPWKERTFS GPNVKDGTDRPGARHGCIVKVNETVYQALLKAYKK >gi|222159243|gb|ACAB01000116.1| GENE 8 10630 - 12285 1308 551 aa, chain - ## HITS:1 COG:no KEGG:Dfer_0810 NR:ns ## KEGG: Dfer_0810 # Name: not_defined # Def: RagB/SusD domain protein # Organism: D.fermentans # Pathway: not_defined # 11 520 2 486 499 285 37.0 2e-75 MKSNKLCRMKKKIFLPMLVSLQLLFGSCENNFDPYIYGALLQGEYPSTEQEYVSYMMICY LPYTTVWTYDMGSGGLQHGIHIQSGGTVRMFDSTSDICASATTVGADWERFTKGDYSNCF YYWRGNVDDGSNLNHFPKTAQITRMTEIIGTLEKAPLTALTEEKKNNLLGEARLCRGLMM YFLLHVYGPVPVILDPEKVIDPEALSNTVRPSLDEMTQWITDDLEFAAKNAPEVAPDLGR YTRDYARFCLMKHCLNEGEHMEGYYQRAMDMYNELNTGKYDLFRTGSTPYVDLFKNANKF NKEIIMAVSCSPAADGNPKHGNANPFLMWALPSDVAKGDPFPMGGGWFQAFSMDKKYYDA FESNDGRLKTIVTSYKDKNGVIINKDNLGVRWNGYIMNKFPQETMTTFQGTDIPLARWAD VLLMYAEAEVRKTGTVPSVAAINAVNQVRNRAGLANLPAVATNTKDAFLDAILIERGHEL FYEGNRKIDLIRFNKYAQEMYKAKGVMPTHQYMPIPNYAVEQAVSYGKELKQTWERPGWA EDKSKAQQNIN >gi|222159243|gb|ACAB01000116.1| GENE 9 12296 - 15658 2706 1120 aa, chain - ## HITS:1 COG:no KEGG:Dfer_0811 NR:ns ## KEGG: Dfer_0811 # Name: not_defined # Def: TonB-dependent 
receptor # Organism: D.fermentans # Pathway: not_defined # 111 1120 14 1016 1016 773 43.0 0 MNYRKKAILMVVALFCLNIAMLAQAVSLKMNNVSVKEAMTQLKNKSGYSFVYKVGDLDTK KIVSVKAKQLNEAIDQILYGQDVVYEVKGKNIVVQKGQRQNTSKDTKKRKITGVVNDANG EPIIGATIKEKGTANGTASDLNAKFSLEVSPGAVLEISYIGYQTQEVKVGDRTSLSVTLA ENQQILDEVVVVGYGTTSRKNLTTSIATVKTEKISKAATSNISGMLLGRAAGLQATVSSP QPGGGINISIRGGGTPIYVVDGVVMPSGSFEVGTGSTSLPSSVNRAGLAGLNPNDIESIE ILKDASAAIYGIGAADGVILVTTKKGKEGKPTIVYEGSYSIQKHYPYLEVLSGPELMNMV NVFSKENYLYDKGQYPYGNTAYDDKWTPIFTPTQIANAPTTDWLDKVLKTGAVTNHNLTI SGGSEKFKYYLGVNYYKEDATVHNSDMERYSLRTNITSQLTNFLKLTTIVNLNQNNYTNS TVGGDVGNLGDQGAGALFGAIFYPSYLPIYDAEGKYNVFSRTPNPVSMHDINDKSEQSGY YMNFSLDVDIIKNMLSAKVLYGLNKENTSRDSYIPSDVYYALQRKSRGNLGYGKRQQSTL EGTLTFQHKFGELLDMNLMAGMGRYLDSGDGSDISYENANDHIQGSSVGMADGPFYPTSY KYKNEKRSQFVRGNFDLLGRYVVSASLRRDGTDKFFPSKKYALFPSVSLAWKMNEESFIK NISWINMLKLRASYGETGSDNLGTTLYGIVTTTREDVQFNNNSVTYIPYILSGANYEDVT WQKTVMKNIGLDFSVLRDRIWGSVDVFRNDVTHLLGTAPTELLGMHGTRPINGGHYKRTG VDVSLNSLNLQTHDFKWTSQITMSHYNAVWIERMPNYDYQKYQKRKNEPMNAFYYYKTTG IINVDKSNMPESQRSLGPAACMPGYPIVEDKNGDGIIDVNDSYMDNTLPKLYFGFGNTFT WKNFDLDIFMYGQLGVKKWNDAYGNSIDVGGLSRGVDAHNVGIYSYNIWNTQTNTNGRFP GIAISKSVALPENLGFDYTRENASYIRVRNITLGYNLGPKELSVFKGYIRGIRVFVDFQN PLTFTKYKGYDPEINTSSSNLTGGQYPQMRVYSIGAKLTF >gi|222159243|gb|ACAB01000116.1| GENE 10 15928 - 16941 781 337 aa, chain - ## HITS:1 COG:PA1301 KEGG:ns NR:ns ## COG: PA1301 COG3712 # Protein_GI_number: 15596498 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 129 303 122 292 327 68 27.0 2e-11 MEEEKKHIDELIATYLTEGLDKNALAELKAWIAASPENKNYFIQQREVWFSAVSREAASK YNKDKAFDTFKKRIGNRKEIEKTSHHGFRLSMLWRYAAIIAVILAVGCFSYWQGGVNVKD TFADISVEAPLGSRTKLYLPDGTLVWLNAGSRMTYSQGFGVDNRMIELEGEGYFEVRRNE KLPFFVKTKDLQLQVLGTKFNFRDYPEDHEVVVSLLEGKVELNNLLKKEKEAILAPDERA ILNKTNGRMTVETVTASNASQWTDGYLFFDEELLPDIVKELERSYNVTIRIANDSLNTFR FYGNFVRREQSIQEVLDALASTEKIQYKIEERNITIY >gi|222159243|gb|ACAB01000116.1| GENE 11 17013 - 
17210 93 65 aa, chain - ## HITS:1 COG:no KEGG:BT_1877 NR:ns ## KEGG: BT_1877 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 64 132 195 196 110 87.0 2e-23 AIDKLPNECRRVFDKSRFEGKSYEEISQELGISVNTVKYHIKNALASLQMNLSKYLITLL LFFFG Prediction of potential genes in microbial genomes Time: Wed May 18 03:54:22 2011 Seq name: gi|222159242|gb|ACAB01000117.1| Bacteroides sp. D1 cont1.117, whole genome shotgun sequence Length of sequence - 21668 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 5, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 397 234 ## BT_1877 RNA polymerase ECF-type sigma factor - Prom 490 - 549 8.4 + Prom 497 - 556 8.7 2 2 Tu 1 . + CDS 586 - 2100 1467 ## COG0471 Di- and tricarboxylate transporters + Term 2125 - 2167 6.2 3 3 Op 1 . - CDS 2162 - 3448 486 ## BT_4744 putative multiple inositol polyphosphate histidine phosphatase 1 4 3 Op 2 . - CDS 3519 - 7544 2875 ## COG5002 Signal transduction histidine kinase - Prom 7726 - 7785 11.7 - Term 7721 - 7784 15.1 5 4 Op 1 . - CDS 7787 - 10312 1688 ## COG3525 N-acetyl-beta-hexosaminidase 6 4 Op 2 . - CDS 10320 - 11393 734 ## Phep_0506 hypothetical protein 7 4 Op 3 . - CDS 11428 - 12906 1132 ## Coch_0957 hypothetical protein 8 4 Op 4 . - CDS 12953 - 14326 977 ## gi|237713846|ref|ZP_04544327.1| conserved hypothetical protein 9 4 Op 5 . - CDS 14368 - 16011 1680 ## Dfer_0810 RagB/SusD domain protein 10 4 Op 6 . - CDS 16044 - 19460 2434 ## Dfer_0811 TonB-dependent receptor - Prom 19656 - 19715 5.3 + Prom 19658 - 19717 6.2 11 5 Op 1 6/0.000 + CDS 19788 - 20375 359 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 12 5 Op 2 . 
+ CDS 20430 - 21476 617 ## COG3712 Fe2+-dicitrate sensor, membrane component Predicted protein(s) >gi|222159242|gb|ACAB01000117.1| GENE 1 1 - 397 234 132 aa, chain - ## HITS:1 COG:no KEGG:BT_1877 NR:ns ## KEGG: BT_1877 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 132 1 132 196 226 87.0 1e-58 MEHTETLIVEQLKIGNEDAYQYIYDHHYALLCHVASGYVKDQFLAETIVGDTIFHLWEIR ETLAISVSIRSYLVRAVRNRCINYLNSEWEKREIAFSSLMPDEITDDKMTISDSHPLGAL LERELEEEIYKA >gi|222159242|gb|ACAB01000117.1| GENE 2 586 - 2100 1467 504 aa, chain + ## HITS:1 COG:VC1314 KEGG:ns NR:ns ## COG: VC1314 COG0471 # Protein_GI_number: 15641326 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Vibrio cholerae # 33 498 22 482 487 427 52.0 1e-119 MYKIFHGFHLVEAYQDLKKAKRLAKNQTVARCIKLTVAITLSLILWLLPIDTFGIEGLTV IEQRLISIFIFATLMWVFEAVPAWTTSVLIVVLLLLTVSDSSLWFLTQGISTEELGQTVK YKSIMHCFADPIIMLFIGGFILAIAATKSGLDVLLARVMLRPFGTQSRYVLLGFILVTAV FSMFLSNTATAAMMLTFLTPVLKVLPADGKGKIGLAMAIPVAANVGGMGTPIGTPPNAIA LKYLNDPEGLNLNIGFGEWMSFMLPYTIIVLFIAWFILLRLFPFKQKNIELQIEGEAKKD WRSIVVYITFAITVILWMFDKVTGVNSNVVAMIPVAVFCITGVITKRDLEEISWSVLWMV AGGFALGVALQETGLAKHMIEAIPFNTWPPVLMIVGSGLICYAMANFISHTATAALLVPI LAIAGSSMRENLSSLGGVETLLIGVAIGSSLAMILPISTPPNALAHATGMIQQKDMEKVG IIMGIIGLILGYTMLIILGSNKLL >gi|222159242|gb|ACAB01000117.1| GENE 3 2162 - 3448 486 428 aa, chain - ## HITS:1 COG:no KEGG:BT_4744 NR:ns ## KEGG: BT_4744 # Name: not_defined # Def: putative multiple inositol polyphosphate histidine phosphatase 1 # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 424 7 424 425 534 59.0 1e-150 MRAILIFLAGLLFPITFLFGQSAIQKYAGTAMPYPLIKNLPVLNHDGMVPFYINHLGRHG ARFPTSGKALEKVRNVLILAEQENRLTVKGQELLATVLRLSEAFEGRWGELSAVGEQEQK GIAERMLLRYPEIFVDSARIEAIASYIPRCISSMDAFLSGMEKQDSSLVIKKSAGKQYNP LLRFFDLNKPYVYYKEKGDWISLYESFVQDKIVFTPVMKRIFLTSGQETEQEKREFVMAL FSIAAILPDTGLSFNMKGILNDKEWYGYWQTQNLRQYLTKSAAPVGNMLPVAIAWPLLSE 
FIQTTEQAINGQSDNRVNLRFAHAETIIPFVALMGIGKTDIQIVSPDSVSIYWQDYEIAP MAANVQWVFYRDKDCQVWVKILLNEQEATIPVVTSFFPYYRWEEVCHYLKQRIAISKEIL SRFSSKTD >gi|222159242|gb|ACAB01000117.1| GENE 4 3519 - 7544 2875 1341 aa, chain - ## HITS:1 COG:BH4026 KEGG:ns NR:ns ## COG: BH4026 COG5002 # Protein_GI_number: 15616588 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 802 1032 367 598 607 127 29.0 2e-28 MSNFVKMRGFFLFVFLLINCTVSKGQIYKYIGLEDGLNNQKIYHIQKDRRGYMWFLTQEG IDRYDGKHIKHYNFSDDNMTLDSRIALNWLYMDNRNVLWVIGQKGRIFRYDSQHDKFELA YVHPELIRNKSQAFLNYGYLDKNDRIWLCCKDSITWYDTHTGTVLNMSVPVNGEITTIEQ TDGNHFFIGTGSGLFRAGIEEGKLRLVPDEAVESIAPVHELYYHAVSKQLFVGNYKEGIL IYDMGGTGKIISCQFPNHVEVNQIAALNAHELLVATGGKGVYKLDVNTYMSEPYITADYS SYNGMNGNNINDVYVDEEDRIWLANYPTGITIRNNRYGSYDLIRHSLGNTRSLVNDQVHD VVEDSDGDLWFVTSNGISFYQTDTKEWRSFFSSFDPVPNDENHIFLALCEVSPGVMWAGG YTSGIYKIEKKKGFKVTYLSPAAIAGVRPDQYIYDIKKDSGGDIWSGGYYHLKRINLETK NMRLYPGVSSITTIQEKDDRLMWIGTRMGLYLLDKESGIYRYIDLPVESPYICALYQRED GILYIGTRGAGLLVYDINKKKFIHQYRTDNCALISDNIYTILPRQDESLLMGTETGITIY SPQKHSFRNWTREQGLMSVSFNAGSATTCNKSMLVFGGNDGAVKFPTDIQIPEPHYSRLL LRDFMIAYHPVYPGDDGSPLKKDIDETDRLELAYGQNTFSLDVASINYDYPSNILYSWKI DGYHKEWSRPSQDNRILVRNLPPGSYTLQIRAISNEEKYKTYETRNIQIVITPPVWASLW AMVGYAILLVLVMIIIFRIIMLQKQKKVSDEKTRFFINTAHDIRTPLTLIKAPLEEVIEN RMVTEQALPHMNMALKNVNTLLQLTTNLINFERIDVYSSTLYVSEYELNTFMNDVCAAFR KYAEMKHVRFVYESNFDYLNVWFDSDKMGSILKNILSNALKYTPEEGSVCISACEEGNTW SIEVKDTGIGIPSCEQKKLFRNCFRGSNVVNLKVTGSGIGLMLVYKLVKLHKGKIQIQSN EQQGTCVRVTFPKGNSHLHKAKFISPKLPDERPETIIPGSISDLPAMEISQINSSLQRIL IVEDNDDLRNYLVDMLKTSYNIQACPNGKDALIIIREFNPDLVISDIMMPEMSGDELCSA IKGDLEMSHIPVVLLTALGDEKNMLEGLEIGADAYITKPFSVGILKATIKNLLANRALLR QVYNSIEEEEQKFAVNCTNTLDWKFIASVKECIEKNMSDPDFNVEMLSSRHHMSRTSFFN KLKVLTGYAPADYIRMIRLQHAAQLLKQGEYTIAEITDRVGFSDAKYFREVFKKYYGVSP SKYGDSENTEPCPASNLKNEK >gi|222159242|gb|ACAB01000117.1| GENE 5 7787 - 10312 1688 841 aa, chain - ## HITS:1 COG:VC2217 KEGG:ns NR:ns ## COG: VC2217 COG3525 # Protein_GI_number: 15642215 # Func_class: 
G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Vibrio cholerae # 52 801 71 846 883 325 30.0 2e-88 MITMRNIWIMMLGICLFGCGAGKQPPSSQLSLTWKLEKDSVEARYFKNTFCLTNNGNKSL TNNWVIYFNQTPIYYQQPINAPLEIECLGSTYYKMYPTEHYQALPPGETITFTILSEGNV INVSSVPEGAYVITTDEKGKPLQPQNVPIEIELFKPDTQWVSSRDSFPYADGNYFYKQND DFSKPVDCDMLSLFPAPKKVEKTGGVSSFSQKVCLKFDDAFKEEALLLKSQLTSLLRCSV SDKDEETIIELKKMEVPITCQYPDEYYEIVIKNNRLTLKASDTHGIFNACQTLLALLDNM ELTSSPLPNLHITDYPDMGHRGIMLDVARNFTKKADLLKLIDILSFYKMNVLHLHLSDDE AWRVEIPGLEELTEIASRRGHTIDEQTCLYPAYAWGWNETDTTSLANGYYSRSDFMDILK YAKERHIRVIPEIDIPGHSRAAIKAMNARYQKYIDTDQSKAEEYLLTDFADTSQYLSAQN FTDNVINVAMPSTYHFLEKVIDEIVRMYQDAGVELTAFHVGGDEVPEGIWEGSSICRTFM QENGLTKIRDLKDYFLEQILEMLDKRNIQAVGWQDIVMNPDNTVNEHFKNSKVLNYCWNT IPEQGGDEVPYKLANAGYPIILCNVGNFYLDMAYCYHVEEPGLRWGGYVDEYVTFDMLPF DIYKSLRRNLKGEPVDVKAASNGKQPLTKEGYQNIKGLSGQIWSETIRSFEQVEYYLFPK VFGLAERAWNVQPSWALSPDGKVYMDAKRKYNAGIIDYELPRLAKRGINFRVSPPGIMTR DGLLLANTAIPNAVIRYTTDGSEPTESSIEWQTPVVCNAPLIKAKAFYLGKESVTTVLFN R >gi|222159242|gb|ACAB01000117.1| GENE 6 10320 - 11393 734 357 aa, chain - ## HITS:1 COG:no KEGG:Phep_0506 NR:ns ## KEGG: Phep_0506 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 3 356 5 360 362 481 64.0 1e-134 MKKILLSFFIGVMLIACNQKTVSVREVWTKEQANEWYKQWGWLRGCDFIPSTAINQMEMW QADTFDPVTIDRELGWAEEIGMNCMRVYLHHLVWETDKEGFKKRINEYLTIADKHHISTI FVFLDDCWNPVYQAGKQPEPQPGVHNSGWARDPGDLYYKGDTAVILPILESYVKDILTTF KDDKRIVLWDLYNEPGGSGGYRYGERSLPLLQKIFTWGRTVNPSQPLSAGVWDMSLTNLN KFQLENSDVITYHTYEGVNSHQQLIDTLKQYGRPMICTEYMARTQNSTFQDIMPMLKREN IGAINWGLVAGKTNTIFAWDTPLPDVVEPPLWFHDIFRSDGTPYSTEEVECIRSLTK >gi|222159242|gb|ACAB01000117.1| GENE 7 11428 - 12906 1132 492 aa, chain - ## HITS:1 COG:no KEGG:Coch_0957 NR:ns ## KEGG: Coch_0957 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 121 492 25 411 411 208 34.0 4e-52 MYISMKTYSKKFLVTFFFSMLPMLLMSCSMDPEKASFSISGNLEKLQIGNDGFTEVYSIK 
SNTEWKFVNETDQSWVTVSPAKGSGNGTVTITANANTGTGRTAVFRVVPNGVKTQEIEII QGNSYIPTTDGEFPIIAWTGVEADKSLEKFPVMKASGINIYLGWYDDLETTLKVLDAAQK TGVKMITSCKDLLSVATAEEVVKAMMNHPALYAYHLKDEPEVNDLPGLGELVKKIKTIDS HHPCYINLYPNWAWGKELYSENVKSFIEQVPVPFISFDNYPIVSINGAPSIVRPDWYRNL EEISAAAKENNKPFWAFALALSHKLDETHFYKIPTLPELRLQVFSDLAYGAQAIQYFTYR GLQHDEPTEVYDLVKTVNQEVQRLAGIFLGAQVISVSHTGSEIPEGTKALGSLPTPIKSL TTSDTGAVVSVLEKGGNQYLVVVNRDFRNVMNLSIDVDSSVNRVLKNGSTTTPDGSTIAV EPGDMVIFTWRK >gi|222159242|gb|ACAB01000117.1| GENE 8 12953 - 14326 977 457 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713846|ref|ZP_04544327.1| ## NR: gi|237713846|ref|ZP_04544327.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 457 15 471 471 899 100.0 0 MKKIYVLFMSSLLLLSCAKDDVSKPSEWPEWPTPSKPKIENAVLRGVNGETVVAAGDKVK FTAQVSDEYNDLVSFQLLVTMDGAEILNLSKGLSGRSAVIEEEATLPFVAGFQNGRPVVT IKAVNDLGGNESTLTLDEAASVAVTRPETPSQLYLVDDLGNVYEMDKESQNETDYSFRTS AADLAGIGDHFKVAEKIINNKPDYSGLVWGYSVDKIAIVTDDAATSIPTPKVGSYTLENI TFDMLSFLADKTLTYSIDIDKSRFFDVGGGYVQLDVQLVESAKINFIGFGSDVTNMLRPE FFKDVSGSKAKFDGPSVLYNLKYNTSNGFMYMERPRDVFYPEVMYILGTGVGFPREPYVA TLAWDFAYPHQWFFFKKVGASAFDAVVYIDQTMGFKFYRGYGWAQEEDTKNLYTVEPANL ITRNAPGDLIPGPDFKAGLYTIHIDKASEVIRLTPYN >gi|222159242|gb|ACAB01000117.1| GENE 9 14368 - 16011 1680 547 aa, chain - ## HITS:1 COG:no KEGG:Dfer_0810 NR:ns ## KEGG: Dfer_0810 # Name: not_defined # Def: RagB/SusD domain protein # Organism: D.fermentans # Pathway: not_defined # 20 518 19 486 499 288 37.0 6e-76 MNKNKLFKVFVMIAFIMCSCEDNFDPKMYGTLNVSNYPVTEAEYESFMMTCYMPFTTTWT YWIGAGTSGNQHGWYIPAGGVLKFFDYPTDEVAVWNNGWGGGYYFLSKADFSQCVYYASG TLSDENPNHFPKVSEISYFTNVIGTLEKASTEVVSEERKREFIAEARLCRGLMMYYLLHV YGPVPLIVDPNDLINPSKLENLVRPTLQQMTEWIMADFEYAYQYIADTQPEQGRYNKDYA RVCIMRHCLNEGYYMSGYYQKAIDMYNELKGRYSLFKKGDNPYIEQFKNANNFNSEVIMA VSCDETADGTNKSGNFNPLMMLATPDNAARVDDQGNPTPFYLQGQGWGQTFNVSPKFWDT YDPSDKRREVILTKYYTTAGTWIDRNTTTWDGFIINKFPVETATAFQGTDIPLARWADVL LMYAEAEVRKNNAAPSADAIAAVNEVRKRAGLGDLPSSATANVEAFLDALLTERGHEMLY EGMRKIDLIRFNQYAQRTAKIKGVAPTHQYVPIPNYAVQQASESYGKVLVQTFEREGWKA DLAAARN 
>gi|222159242|gb|ACAB01000117.1| GENE 10 16044 - 19460 2434 1138 aa, chain - ## HITS:1 COG:no KEGG:Dfer_0811 NR:ns ## KEGG: Dfer_0811 # Name: not_defined # Def: TonB-dependent receptor # Organism: D.fermentans # Pathway: not_defined # 130 1138 10 1016 1016 773 42.0 0 MKLISIRKMKKRINVPANVKMGRICAIWLFCIALQAICIPILAQSAKITLRLKNVTVEEV LTSIENQTEYRFLYNKDIVDVSRIVSITVKNERMTLVLDKLFKGEGVSYTIEKRQIVLNK VSSQPQDNKQPVKVTGKVVDESGEVLPGVTVMIEGATQGTITGIDGNYQLQVPEGSTLKF TSIGYTTYTQKITRPMTLNVTMKEDSKQLDEVVVVGYGTTTRKNLTTSIATVKTEKISRA ATSNMSQMLLGRAAGLEATLTSPQPGGAVDLSIRGAGTPIFIVDGVMMPSTSLEVGNGNQ VMPNSINRSGLAGLNPADIESIEVLKDASASIYGIGAANGVVLITTKKGTETRPQITYEG NYSIVKNYPYLEPLSGEEYMNVANIFNKENYLFTNGMYPYGDKPFDNKWVPQFSPQQIAA AQTTDWLDCVLKDGSINNHNITITGGSKLLKYYLSGNYYKQEGTVENSAMERYALRTNIS SQLLPFLKLTAIVNVNENEYTNSTSGGGGGGNGYDAIQSALTYPSYLPIRGEDGKPTLYS NYPNPAEMIKVSDRTKTSGYYLNFAADVDIIKDMLSFRLMYGINKENANRNLYIPSDIYF MDMYKSRGHLGYVERRNQTMEGTLTFKKQFGDLLRVDAVVGMGRYTNDSNGLELDYEQIN DHIAADKVEAAEGAFYPTSFRAADERRSQFARASVDVLDRYVVAATIRRDGTDKFFPGKK YAYFPSVSLAWKLSNEPFMKHISWIDQLKIRGSYGQTGNDNLGSSLYGTFSLAAQYIKFS NNSVTYVPYLLSGPDYPNVTWEKTTMKNIGIDFSVLKDRIWGSFDMFRNDVTNLLGYDSA SPLSMTSSVPMNYGHYVRYGWDATINSLNFEIPRVFKWTSQLTLSHHNAVWKERMPNYYY EEYRIRKNEPVNAYYYYETEGVINIDKSNMPESQKSLPADAQQPGYPIIKDANGDNKITI DDVKMRNTLPKIHIGFGNTFVYKDFDLDVFMYGQFGRTRYNYAYRWALVGDVYYTSPKNS NKYVYTIWNSQTNQNGNRRGIASTKAVALPGNVGFEEDYQNASFVRVRNITLGYNLSGKK LGRVGDYVSSIRVFVDCQNPFTFTKFVGVDPEIKTGGDGSKAEYPMTRTYSFGAKICF >gi|222159242|gb|ACAB01000117.1| GENE 11 19788 - 20375 359 195 aa, chain + ## HITS:1 COG:PA0149 KEGG:ns NR:ns ## COG: PA0149 COG1595 # Protein_GI_number: 15595347 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 48 175 35 161 181 64 29.0 1e-10 MPNIQQQTSDDRFLVIALKQDDKQAFTRLFHAYYKDLVLFGGTYIPEKSTCEDIVQNIFL KLWNDRKSLVIENSLKSYLLKAVRNYCLDELRHRRIIDEHIAYELKSDSIDIDTTENYIL YSDLCRQLKNALEQLPPQEREVFEMSRLENIKYQEIANRLNISVRTVEVRISKALKQLRI LLKDFYLLLFLFLFH 
>gi|222159242|gb|ACAB01000117.1| GENE 12 20430 - 21476 617 348 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 138 320 132 312 331 80 29.0 3e-15 MINKDSDIEIIISRYLSKEATPEEIKVLDDWISATPENYRSFLSQKNVWEVSHPAFNPEE IDVDNAHRKVMEQILHPNQPVSVRPKLSFLHYWQQVAAILLLPLLILSAYLYFKPASQIA ETYQELFTPYGTWSVVNLPDGSKVWLNAGSSLKYPTQFNDKQRVVSMQGEAYFEVESDKK HPFIVKTKQLTVEATGTAFNVNAYAPDHVAAVTLVKGKVAVTLDQKKTISLSPGEKIDYN LATSLYNVNKTNTYKWCSWKDGILIFRDDPLEYVFKRLGQTYNVEFILKDAELGKYSYKA TFEGESLNEILRLLEMSAPIRCKEVSNRNTNNEKFEKQRIEVSRIVRK Prediction of potential genes in microbial genomes Time: Wed May 18 03:55:20 2011 Seq name: gi|222159241|gb|ACAB01000118.1| Bacteroides sp. D1 cont1.118, whole genome shotgun sequence Length of sequence - 7538 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 72 - 2270 1983 ## COG3537 Putative alpha-1,2-mannosidase - Prom 2499 - 2558 6.8 + Prom 2348 - 2407 5.5 2 2 Op 1 . + CDS 2543 - 4321 1697 ## COG0616 Periplasmic serine proteases (ClpP class) 3 2 Op 2 . + CDS 4332 - 5462 693 ## COG1663 Tetraacyldisaccharide-1-P 4'-kinase 4 2 Op 3 . + CDS 5419 - 6228 956 ## COG0005 Purine nucleoside phosphorylase 5 2 Op 4 . 
+ CDS 6245 - 7282 1043 ## COG0611 Thiamine monophosphate kinase + Term 7322 - 7367 6.1 + TRNA 7444 - 7519 82.7 # Phe GAA 0 0 Predicted protein(s) >gi|222159241|gb|ACAB01000118.1| GENE 1 72 - 2270 1983 732 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 24 731 37 761 790 490 40.0 1e-138 MKRKTPFVAAALAAGLFFYSSCTPTEKAETKDYTQYVNTFIGAADNGHTFPGACYPFGMI QTSPVTGAVGWRYCSEYVYEDSLIWGFTQTHLNGTGCMDLGDILVMPVTGTRVRAWDAYR SHFPKDKEAATPGYYTAELSDPQVKAELTASIHAALHRYTYHKADSASILIDLQHGPAWR EEQYHSQVNSCEVNWEDAQTLTGHVNNTVWVNQDYFFVMKFNRPVVDSLYLPMGETEKGK RIIASFDMQPGDELMMKVALSTTSIDGAKKNLEAEIPAWDFEGVRATAHNEWNNYLSRIE IEGTDDEKTNFYTCFYHALIQPNQISDVDGMYRNAADSIVKAGTGAFYSTFSLWDTYRTA HPFYTLVIPERVDGFVNSLIEQGEVQGFLPIWGLWGKENFCMIGNHGVSVIAEAYRKGFR GFDAERAFNIIKKTQTVSHPLKSDWEVYTKYGYFPTDLIKAESVSSTLESVYDDYAAADM ARRMGKEEDAAYFAKRADYYKNLFDSETKFMRPRKADGTWKAPFNPSALAHSESVGGDYT EGNAWQYTWHVQHDVPGLIQLFGGEKPFLNKLDSLFTVKLEGESLADVTGLIGQYAHGNE PSHHVTYLYALAGRPERTQELVREIFDTQYKNKPDGLCGNDDCGQMSAWYMLSAMGFYPV DPVSAEYVFGAPQLPKMTLHLADGKTFTIIAENLSKEHKYVDSITLNGEPYTKKTISHED IVKGGTLVYKMK >gi|222159241|gb|ACAB01000118.1| GENE 2 2543 - 4321 1697 592 aa, chain + ## HITS:1 COG:all4590 KEGG:ns NR:ns ## COG: all4590 COG0616 # Protein_GI_number: 17232082 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Nostoc sp. 
PCC 7120 # 38 592 42 609 609 338 36.0 2e-92 MKDFLKFTLATVTGIILSSIVLFIISMVTLFGIMSASDTETIVKKNSVMMLDLNGTLVER TQEDPLGILSQLFGDESNTYGLDDILSSIQKAKENENIKGIYLQASSLGTSYASLQEIRN ALLDFKESGKFVIAYADSYTQGLYYLSSAADKVLLNPKGMIEWRGIASAPLFYKDLLQKI GVEMQVFKVGTYKSAVEPFIATEMSPANREQVTAFITSIWGQVTEGVSASRNIPVDSLNV YADRMLMFYPSEESVRCGFADTLIYRNDVRNYLKKLVEIDEDDNLPIVGLSDMMNVKKNV PKDKSGNIVAVYYASGEITDYPSSATSEDGIVGSKVIRDLRKLKDDNDVKAVVLRVNSPG GSAFASEQIWHAVKELKTKKPVIVSMGDYAASGGYYISCVADTIVAEPTTLTGSIGIFGI IPNVKGLTDKIGLSYDVVKTNKYADFGNIMRPFNEDERSLLQMMITEGYDTFVSRCAEGR HMTKEAIEKIAEGRVWTGETAKKLGLVDELGGIDKALDIAVAKAGIEGYTVVSYPAKQDF FSSLLDTKPTNYVESQLLKSKLGEFYQQFGLLKNLQEQSMIQARIPFELNIK >gi|222159241|gb|ACAB01000118.1| GENE 3 4332 - 5462 693 376 aa, chain + ## HITS:1 COG:aq_1656 KEGG:ns NR:ns ## COG: aq_1656 COG1663 # Protein_GI_number: 15606758 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Tetraacyldisaccharide-1-P 4'-kinase # Organism: Aquifex aeolicus # 12 325 6 285 315 122 31.0 1e-27 MDEHFIKIHKWLYPASWLYGAAVMMRNKLFDWEVLQSKSFDIPIISVGNLAVGGTGKTPH TEYLIKSLCNQYNVAVLSRGYKRHTKGYVLATPESTARSIGDEPYQMHQKFPSVTVAVDE KRCHGIEKLLALQKPSIDVILLDDAFQHRYVKPGLSILLTDYHRLFCDDTLLPAGRLREP ISGKNRAQIVIVTKCPQDIKPIDYNIITKRLNLYPYQQLFFSSFRYGNLQPVFPIMVPDT DTPSANNEIALSSLTNTDILLMTGIASPAPILERLKDCTQQIDLLSFDDHHNFSHRDIQL IKERFHKLKGEHRLIITTEKDATRLINHPALDKELKPFIYALPIEIEILQNQQDKFNQHI IDYVRENTRNRSLSER >gi|222159241|gb|ACAB01000118.1| GENE 4 5419 - 6228 956 269 aa, chain + ## HITS:1 COG:BH1532 KEGG:ns NR:ns ## COG: BH1532 COG0005 # Protein_GI_number: 15614095 # Func_class: F Nucleotide transport and metabolism # Function: Purine nucleoside phosphorylase # Organism: Bacillus halodurans # 3 266 6 270 275 292 55.0 5e-79 MLEKIQETAAYLKGKMHTSPETAIILGTGLGSLANEITEKYEIKYSDIPNFPVSTVEGHS GKLIFGKLGNKDIMAMQGRFHYYEGYSMKEVTFPVRVMRELGIKTLFVSNASGGTNADFE IGDLMIITDHINYFPEHPLRGKNIPYGPRFPDMSEAYSKELICKADEIAKEKGIKVQHGV YIGTQGPTFETPAEYKLFHILGADAVGMSTVPEVIVANHCGIKVFGISVITDLGVEGKIV EVTHEEVQKAADAAQPKMTTIMRELINRA >gi|222159241|gb|ACAB01000118.1| GENE 5 6245 - 
7282 1043 345 aa, chain + ## HITS:1 COG:MTH1396 KEGG:ns NR:ns ## COG: MTH1396 COG0611 # Protein_GI_number: 15679395 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate kinase # Organism: Methanothermobacter thermautotrophicus # 1 340 1 324 327 158 34.0 2e-38 MRTEIASLGEFGLIDRLTEGIKLENESSKYGVGDDAAVLSYPSEKQVLVTTDLLMEGVHF DLTYVPLKHLGYKSAVVNFSDIYAMNGTPRQITVSLGLSKRFSVEDMDELYSGIRLACQQ YNVDIVGGDTTSSLTGLAISITCIGDADKDKVVYRNGAKETDLVCVSGDLGAAYMGLQLL EREKVVLKGDKDIQPDFTGKEYLLERQLKPEARKDIIEKLAAANIVPTSMMDISDGLSSE LMHICKQSNAGCRVYEEHIPIDYQTAVMAEEFNMNLTTCALNGGEDYELLFTVSIADHEK VSQMEGVRLIGHITKPELGCALITRDGQEFELKAQGWNPLKENKQ Prediction of potential genes in microbial genomes Time: Wed May 18 03:55:25 2011 Seq name: gi|222159240|gb|ACAB01000119.1| Bacteroides sp. D1 cont1.119, whole genome shotgun sequence Length of sequence - 21965 bp Number of predicted genes - 21, with homology - 21 Number of transcription units - 10, operones - 5 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 209 168 ## gi|298482322|ref|ZP_07000509.1| hypothetical protein HMPREF0106_02786 2 1 Op 2 . - CDS 265 - 750 485 ## BT_1884 cold shock protein, putative DNA-binding protein 3 1 Op 3 . - CDS 810 - 1934 792 ## COG0513 Superfamily II DNA and RNA helicases - Prom 1964 - 2023 4.4 4 1 Op 4 . - CDS 2029 - 3102 763 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase - Prom 3202 - 3261 5.4 - Term 3400 - 3456 8.4 5 2 Tu 1 . - CDS 3459 - 3758 219 ## COG0724 RNA-binding proteins (RRM domain) - Prom 3915 - 3974 5.6 6 3 Tu 1 . + CDS 4120 - 4857 595 ## BT_1888 LuxR family transcriptional regulator 7 4 Tu 1 . - CDS 4897 - 5481 476 ## BVU_0638 hypothetical protein - Prom 5591 - 5650 4.0 + Prom 5460 - 5519 4.1 8 5 Op 1 . + CDS 5620 - 7683 1251 ## COG3973 Superfamily I DNA and RNA helicases + Prom 7687 - 7746 3.2 9 5 Op 2 . + CDS 7767 - 8078 206 ## LIC13454 hypothetical protein + Term 8242 - 8282 -0.9 - Term 8100 - 8139 7.5 10 6 Tu 1 . 
- CDS 8170 - 9726 1484 ## BF2880 hypothetical protein - Term 9744 - 9783 1.0 11 7 Op 1 . - CDS 9875 - 10795 409 ## COG3023 Negative regulator of beta-lactamase expression 12 7 Op 2 . - CDS 10826 - 11176 366 ## BT_4442 hypothetical protein - Term 11193 - 11227 4.0 13 7 Op 3 . - CDS 11262 - 11780 626 ## BF2226 hypothetical protein - Prom 11999 - 12058 8.4 - Term 12100 - 12129 0.5 14 8 Tu 1 . - CDS 12165 - 12983 245 ## BT_1485 hypothetical protein - Prom 13015 - 13074 4.9 15 9 Op 1 . - CDS 13126 - 13452 110 ## gi|294645785|ref|ZP_06723469.1| hypothetical protein CW1_3654 16 9 Op 2 . - CDS 13488 - 15110 809 ## gi|298482308|ref|ZP_07000495.1| hypothetical protein HMPREF0106_02771 17 9 Op 3 . - CDS 15125 - 17110 1051 ## gi|237713872|ref|ZP_04544353.1| predicted protein 18 9 Op 4 . - CDS 17133 - 18101 557 ## BF4432 hypothetical protein - Prom 18146 - 18205 6.4 19 10 Op 1 . - CDS 18238 - 19995 1371 ## gi|237713874|ref|ZP_04544355.1| predicted protein 20 10 Op 2 . - CDS 20038 - 21201 715 ## BF1974 hypothetical protein 21 10 Op 3 . - CDS 21210 - 21803 348 ## BF1622 hypothetical protein - Prom 21896 - 21955 6.8 Predicted protein(s) >gi|222159240|gb|ACAB01000119.1| GENE 1 3 - 209 168 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|298482322|ref|ZP_07000509.1| ## NR: gi|298482322|ref|ZP_07000509.1| hypothetical protein HMPREF0106_02786 [Bacteroides sp. 
D22] # 1 67 1 67 93 127 100.0 2e-28 MKKGNIMRDVISSSQFTANVIKEYKKGLGDVVLVAITPRTTIELPAHLTQAEIDARIEIY KKLHSSKV >gi|222159240|gb|ACAB01000119.1| GENE 2 265 - 750 485 161 aa, chain - ## HITS:1 COG:no KEGG:BT_1884 NR:ns ## KEGG: BT_1884 # Name: not_defined # Def: cold shock protein, putative DNA-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 149 1 148 152 198 84.0 5e-50 MAKSMTVGKRENEKKRIAKREEKLKKKENKKLSGSSNSFEDMIAYVDENGRITSTPPTEN VRKEEINPDEIVIATPKKEDEEPTILRGRVEFFNESRGFGFIKDLSGVEKYFFHVNNVVG NITENNIVTFDLERGVKGMNAVNISLENKSKPENLAQTESV >gi|222159240|gb|ACAB01000119.1| GENE 3 810 - 1934 792 374 aa, chain - ## HITS:1 COG:ECs0875 KEGG:ns NR:ns ## COG: ECs0875 COG0513 # Protein_GI_number: 15830129 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Escherichia coli O157:H7 # 1 369 1 370 455 397 54.0 1e-110 MTFKELNITEPILKAIGEKGYTVPTPIQEKAIPPALAKRDILGCAQTGTGKTASFAIPII QHLQLDKEAARRQGIKALILTPTRELALQISECIDDYSKHTRIRHGVIFGGVNQRPQVDL LRKGIDILVATPGRLLDLMSQGHIHLDTIQYFVLDEADRMLDMGFIHDIKRILPKLPKEK QTLFFSATMPDSIISLTKSLLKNPVKIYITPKSSTVDSIKQLVYFVEKKEKSQLLISILS KAEDQSVLIFSRTKHNADKIVKILGKAGIGSQAIHGNKSQAARQSALGNFKSGKTRVMVA TDIASRGIDINELPLVINYDLPDVPETYVHRIGRTGRAGNMGTALTFCSQEERKLVNDIQ KLTGKKLNKADFTI >gi|222159240|gb|ACAB01000119.1| GENE 4 2029 - 3102 763 357 aa, chain - ## HITS:1 COG:CAC3580 KEGG:ns NR:ns ## COG: CAC3580 COG2070 # Protein_GI_number: 15896814 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 6 345 8 347 355 286 44.0 3e-77 MKSFFIGNIEIKTPVIQGGMGVGISLSGLASAVANEGGVGVISCAGLGLLYPKGKGSYPE KCISGLREEIHKARTKTEGIIGVNVMVALSNYADMVRTAIEEKIDVVFSGAGLPLDLPSY LTPESVTKLVPIVSSSRAAKVICDKWQKNYNYLPDAIVVEGPKAGGHLGFKKEQIQDQHY ALEALIPEVVMIASSYKEQKQIPVIAAGGISTGEDIAHFMELGASGVQMGSIFVTTLECD ASETFKEVYIHSKSEDVLIIESPVGMPGRAIDGEFIHNVNNGLERPKSCSFHCIKTCDYT 
KSPYCIIKALYNAARGNMKKGYAFAGSNAFLAEKISSVKEVMATLEREFFLATHRLA >gi|222159240|gb|ACAB01000119.1| GENE 5 3459 - 3758 219 99 aa, chain - ## HITS:1 COG:all2777 KEGG:ns NR:ns ## COG: all2777 COG0724 # Protein_GI_number: 17230269 # Func_class: R General function prediction only # Function: RNA-binding proteins (RRM domain) # Organism: Nostoc sp. PCC 7120 # 1 99 1 99 99 76 43.0 1e-14 MNIYISGLSYGTTDADLTNLFAEFGEVSSAKVIFDRETGRSRGFAFVEMTNDGEGQKAID ELNGVEYDQKVISVSVARPRTERPSNGGGRGGYNNSRRY >gi|222159240|gb|ACAB01000119.1| GENE 6 4120 - 4857 595 245 aa, chain + ## HITS:1 COG:no KEGG:BT_1888 NR:ns ## KEGG: BT_1888 # Name: not_defined # Def: LuxR family transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 244 14 257 258 410 82.0 1e-113 MWAKQHLSAVDIDYAVWERDKSMLHQLSKMGRNCTFVVDVYKCKYAYASPNFVDLLGYDS HKIETLERQGDYLESRIHPDDRAQLAALQVILSQFIYSLPFEQRNNYSNIYSFRMLNAKQ QYIRVASRHQVLEQDRNGKAWLIIGNMDISPDQKASETVDCTVLNLKNGEIFSPSLSLMP STNLTNREIEVLRLIQKGLLSKEIADKLCISIHTVNIHRQNLLRKLGVQNSIEAIRLGQE SGLLG >gi|222159240|gb|ACAB01000119.1| GENE 7 4897 - 5481 476 194 aa, chain - ## HITS:1 COG:no KEGG:BVU_0638 NR:ns ## KEGG: BVU_0638 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 194 1 194 194 302 74.0 3e-81 MNILELAKRNQQRAWEIIEDTRIVRIWEGMGAKVNLVGSLRTGLLMKHRDIDFHIYTSPL DLSASFRAMAELAENTSVKKIEYTNLLHTAEACIEWHAWYQDMEGELWQMDMIHIQEGSR YDGYFERVAERISAVLTDEMRLAILKLKYETPDTEKIMGVEYYQAVIQDGVRSYPEFEEW RRLHPVVGVVEWMP >gi|222159240|gb|ACAB01000119.1| GENE 8 5620 - 7683 1251 687 aa, chain + ## HITS:1 COG:BS_yvgS KEGG:ns NR:ns ## COG: BS_yvgS COG3973 # Protein_GI_number: 16080398 # Func_class: R General function prediction only # Function: Superfamily I DNA and RNA helicases # Organism: Bacillus subtilis # 2 678 18 755 774 327 31.0 5e-89 MIFNQTEKQEKEYLQQVITTIHDTINTTSTSVKEHIDTLQEYKDYLWSNKDIDPHEIRSM RESILNHFALGENVIDKRRRLSKILDIPYFGRIDFQEKKDKSQVSPIYIGIHTFYDLNQK KNLIYDWRAPISSMFYDHELGEAFYSSPAGKINGTISLKRQYRIRKGKMEYMIESSVTVH 
DDILQKELCSNADNKMKNIVTTIQREQNQIIRNENASVLIIQGVAGSGKTSIALHRIAYL LYAQKGQISSKDILIISPNKVFADYISNVLPELGEETVPETNMGQILSDVIDNKYKHQNF FEQVTELLENPTPDFIERIQYKASFEFISLLDKFILYMENNYFKATDVKLTKYITVPAEF INEQFKRFHRYPIRQRFETMTDYILEMMQLQYNLTVSTPEKNQLKKEIKKMFAGNNALQI YKDFFEWIGKPEMFKTRKNRILEYADLAPLAYLHIALDGNNAQSYVKHLLIDEMQDYSPI QYKVIQKLYPCRKTILGDASQSVNPYGSSTAGMIQKAFVTGEIMKLCKSYRSTFEITSFA QKIQSNNELEPIMRHGEHPKILPFKSTKEEIQGIADLVNSFRNSHYTSLGIICKTESQAK ELVQKLQIYANDISLLSNQSSAYVKGIIITSAHMAKGLEFDEVIIPQTDDKNYHSNIDKS MLYVAVTRAMHKLTLTYSTTLPSCQFL >gi|222159240|gb|ACAB01000119.1| GENE 9 7767 - 8078 206 103 aa, chain + ## HITS:1 COG:no KEGG:LIC13454 NR:ns ## KEGG: LIC13454 # Name: not_defined # Def: hypothetical protein # Organism: L.interrogans_Copenhageni # Pathway: not_defined # 9 103 7 98 100 72 43.0 3e-12 MENKGMKYEFTTKVWNYSSTVGTGGWHIACLPKEMSKEIRENLGFLEEGWGRLKMTAKTG NTQWETAIWFDTKLDTYLLPLKAEIRKKEKITTDKEIEIMIWI >gi|222159240|gb|ACAB01000119.1| GENE 10 8170 - 9726 1484 518 aa, chain - ## HITS:1 COG:no KEGG:BF2880 NR:ns ## KEGG: BF2880 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 5 517 9 522 525 659 63.0 0 MDGILRKLPIGIQTFEKLREGDYLYVDKTALVWRIASTETPYFLSRPRRFGKSLLLSTFE AYFEGKKELFEGLAIADMEKEWKKYPVLHLDLNAEKYDSPQALVEILSRQLTQWELKYGK GEDEETLSGRFAGVIRRACEQSGRGVVVLVDEYDKPLLQALGDDALLDDYRKTLKAFYGV LKSSDRYLRFVFLTGVTKFAQVSVFSDLNQLNDISMKPQYATICGITRQELEDTFIPELN RLAETNELTYEETLNKMTALYDGYHFCEFAEGVFNPFSVLNVFDGYKFSNYWFQTGTPTF LVELLKKSEYDLRTLIDGVEANASSFMEYRVDANNPIPLIYQSGYLTIKGYDKRFGNYLL KFPNDEVRYGFMDFLVPYYTSVVDDERGFYLGKFVRELESGDVDAFLTRLQAFFADFPYE LNEKTERHYQVVFYLVFKLMGQFTGAEVLSARGRADAVVKTPKYIYVFEFKLNGTAEQAL QQIDEKGYLIPYQVDGRELVKVGVEFSAEKRNIDRWLY >gi|222159240|gb|ACAB01000119.1| GENE 11 9875 - 10795 409 306 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 45 148 1 104 116 110 50.0 4e-24 
MRNIDSIIVHCSATKAGQDFTAADIDCWHRERGFNGIGYHYVVRLDGKLEKGRDVSLAGA HCKGWNERSVGICYIGGLDANGRPADTRTNAQKRVLYQIIMDLQREYNILQVLGHRDTSP DLNGDGVIEPYEYVKVCPCFDVKEFLRSGRELLFVLLVALVVPVLLSGCRSKKEVINRES EVRVDSSLNSSSGKSLVKNKVASEKDSEVVVEHIEQVLFVFPVDTLRLKAGMVVKTVVDR KNVNEKQLLQEVNSQSVSLSDMSAKTEIHKTGVEKTTNRTTGGWYWYGIILLVLVIVCWI CKVKRK >gi|222159240|gb|ACAB01000119.1| GENE 12 10826 - 11176 366 116 aa, chain - ## HITS:1 COG:no KEGG:BT_4442 NR:ns ## KEGG: BT_4442 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 116 1 116 116 120 54.0 1e-26 MQITTILAFITAMGGLEAVKWLVRYITCRKTDARKEEASVNSMEEENRRKKVDWLEERLT QRDEKIDGLYIELRKEQEEKIDWIHKCHEVELIQKESEVKKCEIRGCVKRMPPSDY >gi|222159240|gb|ACAB01000119.1| GENE 13 11262 - 11780 626 172 aa, chain - ## HITS:1 COG:no KEGG:BF2226 NR:ns ## KEGG: BF2226 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 152 1 145 155 80 36.0 2e-14 MAIRFRRVSRLCDPTNKEAGKKVYPVISYQYDTSATLDEFAKEIASASGVSEGETISVLK DFRTLLRKTLLGGRSVNIAGLGYFYLSAQSKGTEKAEDFTIANISGLRICFRANSDIRLF TGTTTRSDGLKFKDLDHINDLGSVGDGSDDGGETPDPDGGSGGNSEAPDPAA >gi|222159240|gb|ACAB01000119.1| GENE 14 12165 - 12983 245 272 aa, chain - ## HITS:1 COG:no KEGG:BT_1485 NR:ns ## KEGG: BT_1485 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 266 50 308 311 177 34.0 4e-43 MAKQQKDSILIERFWDTYDFTDTIRLSSSATTERMLVDYLSQLSELTEERACENIRRLVL KTKVNKVVNRWFLQRLEHHLYEPDSSLRNDSYYISVLEEVLTSGHLDGMMRIRPLYQLKM LKKNRIGIKAADFTFILSTGEFQKLWDISAKYTLLLFYDPDCIYCRQSMMELSESSVINS FLQHSGLSLSQLALVTICTEGEMDAWKEYQQSLPSVWINGYDVRKVLTEKELYFLRSLPS IYLLGEDKQVLLKEPSSIGEVTSYLLNKYDDI >gi|222159240|gb|ACAB01000119.1| GENE 15 13126 - 13452 110 108 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294645785|ref|ZP_06723469.1| ## NR: gi|294645785|ref|ZP_06723469.1| hypothetical protein CW1_3654 [Bacteroides ovatus SD CC 2a] # 1 108 7 114 114 204 100.0 1e-51 MNILNDILEKEGDNRGSVYLYCLEDSWKAFGYSAFYLSLLYPGIKIVKYDNPDIICFCMC 
ISDDCLMEIFEMSRIYVGNECIEMKVPERICGGENEYVKWCNELCLNL >gi|222159240|gb|ACAB01000119.1| GENE 16 13488 - 15110 809 540 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|298482308|ref|ZP_07000495.1| ## NR: gi|298482308|ref|ZP_07000495.1| hypothetical protein HMPREF0106_02771 [Bacteroides sp. D22] # 1 540 4 543 543 1082 98.0 0 MIVKKKYIQLVLFVLLFSLGACNSDEMSDINDTTEVQEQMVLKLSVQQESVVYTRGTGTP ADEAKVESMDFLIFNHDGKIVYHQHPVLEWTGNEYKTTVNIPSATGEHTLYLIANYPMEA GTIHTLQDLESRICTSTKALVTPPFVMATRRITLSSLNMIAIRNAMESDGSNGFSLKRNV AKFSVGVADNNFELISINWLGCPVSASVLADVDYTSPSTLSVSSLAPSSNPIYLYQIRKM GLDAHKGFHIVVHGKYTAADGTEREGYYKLRLCTIDDATGEKVPLTSIVGNDYYKLNIRS VTGFGANSFENAEKNGFTNDMEAFMLLEYNGTHDYQENYLQNGYQMGFENSRWIIYSNEI LNSFVLGHFYRCIRDESLEGYTMFDPDYPNLQRGLITVKANGTEITGETVCNDTPDNPVE MDLCFTESSEVAGTEYSFTSIIQYGVLQKTVTVERHSSIGRSYTVLPMLYTNYGEVFDSC DWIGIAEQRHEGAIIYPQLDSANDYIFVHVKENGTGNAREGKVRLFGRNGYFELRIRQEG >gi|222159240|gb|ACAB01000119.1| GENE 17 15125 - 17110 1051 661 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713872|ref|ZP_04544353.1| ## NR: gi|237713872|ref|ZP_04544353.1| predicted protein [Bacteroides sp. 
D1] # 1 661 1 661 661 1316 100.0 0 MRLKLVYIYSFVVTLLYLLTACTDGIEASSDSNGQGEGKEIPVSIMPKLVAGDENEVSQL RVIIFSTRTSNPYGPKVLVSNQLIDVNEDYVAITYIGYNDIYVIGNEPVDLSGINSPEEL KNIQMNTESSLSASKFVFFRQLLNVNVRSKNEIYPEGASASVSKLEIKLQRVIAKLSVKF DLSTEICENGNPTGEFVNLESMELLRIPKYSYLASCKYRIEEGFLDNRVFSLENSSAEQN HFTWSSGDIYLPEYLPMDEIYRMVLRIKGTSRGITRIYTLPIGDAMNSIDSHSTEWNITR NRHYSLNIKAITGYGEESLEVKSNVSGWSEVNVPIEITGASFLAVNKKTVEVKSLRFYSY VRFACNAPVEVKLPDEVTIGNQLQMTIEYDDATKKSGRVGFRRGNWNTGNDIIYVTTLKS GTASVELNLQLYRSETIRVGNNLVTWAEAMGYWEEANWIRTGIYNDEFYKRATRETGCRA YYPEGINENDATRGKGCWRLATIRENWAINSAEAWCIEEHDTNGRKAYVSYGSGNLIDDY KENKKYRYNCTLDVRPPELSDFIVSSSDVTNVTKENASSICANLGSGWRLPTGKEMNYVF LNAGTNGLPNNFFSDSYWGKNEDGTFIVATMSDPDGSATTDELRNGRHTVRCVKLRYPPV G >gi|222159240|gb|ACAB01000119.1| GENE 18 17133 - 18101 557 322 aa, chain - ## HITS:1 COG:no KEGG:BF4432 NR:ns ## KEGG: BF4432 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 11 322 28 327 327 87 26.0 8e-16 MVAYISMLLWLITGCVNEDFSDCPQGSFQVAFEYVHHTDNICPDRFNIDVQQIDLYVFDA AGCFLKCITRKGTPFPKDFRIDPELSAGSYTLVAWGNLTDEVTLQPAFIAGQTTLEQALL SLNAAEDRSVNHRLTPVFHAMKQVEVNDVKEHTEVLSLIKNENHLHLNVKWFEKSGIPCI HRCADGVRVRVLDPKGATYKFDNSVVASGNELTYYPYQGVNNDAWNQFAGVFSLMRLVEG EKLTLLIERLMQDGTAKELYRSDLVELIRMHPYTRIQSELDRQDVYEVEISLIDDLDGDT DTYMQTAVTVSGWTIILQDTEM >gi|222159240|gb|ACAB01000119.1| GENE 19 18238 - 19995 1371 585 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713874|ref|ZP_04544355.1| ## NR: gi|237713874|ref|ZP_04544355.1| predicted protein [Bacteroides sp. 
D1] # 1 585 1 585 585 1114 100.0 0 MRKNYFLIGVFALATITFAACSSEDKLGEERSDDNGNTEVIEGEPTWAKFTFKLGKDGVV ARATGDTHEGTTTEKNIKNLRAYIFNESGSFEAKTDNATVDEHGTEHTAVLKLTSGKKKV YVVANMQDDWITSTTTTAQFEAVTLTHCTKAGRIAHTKGVETAVAVSRVGNFDKISGNGN AATNGFLMANVLKATEYTLYPGISAEDCSKPAYDSSAEMKKNNHLEVFISRAAAKLQATY ANADALILKDSESNTLGKLLTPTFTVRNLPVQSYIFLHNHVDGFLTPFYTMTNTSSTFAD FAKYYDEVNEPNLDLKASADGEAIESIYLPENSNQTPVRGNTTYVLVKGVFAPAANVMIN AVEVKNDGGNLSIENKLDGTDYTTGGKDVPPYFIVPEWENQIYTAPADFSGNLANYGLGM ILKTYLNKAGKQKWLNAVDEGSAVDEGDIENSVSVYYGSSFSTSEGIYSVVTIKKYSHPK NALPGYKESIEGTIRIFQYTGGTCYYRVNVEDNMDGKTEANNLFYSVMRNRFYNVNISKI TSIGYPDDEDVTVDPTDPINQKTYMQAHITVEPWTKVEQDAELGM >gi|222159240|gb|ACAB01000119.1| GENE 20 20038 - 21201 715 387 aa, chain - ## HITS:1 COG:no KEGG:BF1974 NR:ns ## KEGG: BF1974 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 382 1 391 394 174 30.0 5e-42 MNSFQIIILSFFMSITVMSVSAQNNDSIKINNFSIEKNAQGDSVSLRFHLMIAPGITKAN DELLLTPILIDIKGNHSFAFPVIRVNGKIREKIVRRQKVLHSSKAEKDLIYTVSNRQWNT ITCSTPMLANQLWIWQAKLVLRRQLGNCCMKKELDMVELFTWKRTDGYSIVKPEIKNKVE IPEMLPHKCATEVLAETEKFVEPIDNYSAGKRILAGEQEGAQIVYFKLAKAEIDSNYLDN ERVLEHIIDVIRRISEDPEVEIARVVLLGLSSPEGAFEFNKQLSGKRAEALKQYIADRIA LADSCFALVNGDEGWEELRYKVEHSDMEYRKEVLNIIDSVPIMKGREGQLQRLKRGVPYR YLEEHFFPQLRRAGYIKVYYRMKNGTI >gi|222159240|gb|ACAB01000119.1| GENE 21 21210 - 21803 348 197 aa, chain - ## HITS:1 COG:no KEGG:BF1622 NR:ns ## KEGG: BF1622 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 30 197 37 197 197 177 52.0 2e-43 MNDCIRNKAGILIGLFTLLGCFAVRAQDIAVKTNLLYGGLLQTPNIGVEWGISPKLTVDL WGAYNPFPLGKNGYGSNNRKMKHWLIQSELRYWLCERFNGHFFGGHLFYGQYNVSNIRYW GLGDFRRQGNLAGFGFSYGYQWILSPHWSIEGTLGIGYAHLHYSKYPCKECGKRLSVNNR NYLGPTRAAVSIIYLLK Prediction of potential genes in microbial genomes Time: Wed May 18 03:57:10 2011 Seq name: gi|222159239|gb|ACAB01000120.1| Bacteroides sp. 
D1 cont1.120, whole genome shotgun sequence Length of sequence - 11863 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 4, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 183 68 ## BVU_3207 hypothetical protein - Prom 308 - 367 3.1 2 2 Op 1 . - CDS 369 - 527 124 ## gi|294647412|ref|ZP_06725000.1| conserved domain protein 3 2 Op 2 . - CDS 602 - 5413 2952 ## COG0514 Superfamily II DNA helicase - Prom 5443 - 5502 4.2 + Prom 5405 - 5464 5.7 4 3 Op 1 . + CDS 5610 - 7094 1432 ## COG1508 DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog 5 3 Op 2 . + CDS 7141 - 7812 517 ## BF4362 hypothetical protein 6 3 Op 3 . + CDS 7842 - 8222 582 ## COG0509 Glycine cleavage system H protein (lipoate-binding) + Prom 8235 - 8294 1.9 7 4 Op 1 . + CDS 8334 - 8843 612 ## COG0041 Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase + Prom 8852 - 8911 3.0 8 4 Op 2 . + CDS 8983 - 10827 1545 ## COG0821 Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis + Term 11028 - 11075 11.4 Predicted protein(s) >gi|222159239|gb|ACAB01000120.1| GENE 1 3 - 183 68 60 aa, chain - ## HITS:1 COG:no KEGG:BVU_3207 NR:ns ## KEGG: BVU_3207 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 59 92 149 242 97 84.0 1e-19 MWISPRFQLLLVKEYQRLKEQEQTQVGWNAKRELSKINYRIHTDAIKQNLIPTEVTPKQI >gi|222159239|gb|ACAB01000120.1| GENE 2 369 - 527 124 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294647412|ref|ZP_06725000.1| ## NR: gi|294647412|ref|ZP_06725000.1| conserved domain protein [Bacteroides ovatus SD CC 2a] # 1 52 1 52 52 89 100.0 8e-17 MTKIKVQNTEIAVVSYHDDDYISLTDMARSQMQEHIIFRWLSLKSTLEYIGE >gi|222159239|gb|ACAB01000120.1| GENE 3 602 - 5413 2952 1603 aa, chain - ## HITS:1 COG:alr0205 KEGG:ns NR:ns ## COG: alr0205 COG0514 # Protein_GI_number: 17227701 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: 
Nostoc sp. PCC 7120 # 247 603 6 343 718 221 37.0 7e-57 MTNITFIDLEVNPVNRQISDMGAIRNDGASFHANSPAQFIQFITQTEYIGGHNILNHDLK YIIPLFQQTGYIQPKAIDTLYLSPLLFPAKPYHHLLKDDKLQTDSLNNPLNDSMKAHELF LAEVEAFGRLDEDLKYIYYSLLHPTDEFKSFFDFIAYTIPFGKYDNPETVIRRRFAKEVC EHAPLENYISRAPIELAYCLALINCRDRYSITPPWVLHNFPKVESIMYVLRNTPCLTGCN YCNKAFDIHRGLKHFFGFDAYRTYNNQPLQENAVQAAVEGKSILAVFPTGGGKSITFQVP ALMSGEQVKGLTVVISPLQSLMKDQVDNLRKNDITSAVTINGLLDPIERAKSFENVENGF ASILYISPESLRSRSIERLLLGRKVVRFVIDEAHCFSSWGQDFRVDYLYIGDYLKQLQEK KGLQQPIPVSCFTATAKQKVIEDISTYFKDKLNIRLELFRSDISRTNLQYQVFERNQENE KYNSLRDIIEQKSCPTIVYVARTRTATKLAARLKQDGFSVAAFHGKMDIKDKTENQNDFI AGNIQIMVATSAFGMGVDKKDVGLVVHYEISDSLENYVQEAGRAGRDEHINASCYILFDE EDLNKHFILLNQTKLHIGEIQQVWKSIKEITRLRPKVSQSVLEIARKAGWDENVTNIETR VTTAIAALEESGYLHRGQNSPRVFADSILAKTAQEAIDKINLSPRFNEKQKMQATRIIKK LISTKSRKHANSEVPESRIDYISDHLGISNEEVIQIITLLREEKILADSRDLTAYILRKE QCNRSLEIIRFYNKLENFLLTVFEEEQKIVNLKELREEAENAIQRNVSPDKLKALINFWA EKNWIKRSYRDDNRNHISVYLRPQTGIPFEKRCENRHCLSEFIIDYLFNKIESDTESVTR EEILVEFSILELKKAYENQNVLFQESVTLEEIEDTLFYLSRMGALQIEGGFLVIYNRMNI ERLGATNIRYKQDDYRKLEQFYTNKIQQIHIVGEYARKMVHDYQAALRLVDDYFSLNSTS FIQKYFPGSRKDEINRRITPGKYRQLFGELSKTQLEIVNDNQPGHIVVAAGPGSGKTRLL VHKLASLLLMEDVKHEQLLMLTFSRAAANEFRSRLHDLIGNAVGYIEIKTFHSYCFDLLG RQGSIEKSKSVIKDAVEKIRNSEVEISRITKSVLVIDEAQDMTEDEFALIAALIEKNESL KVIAVGDDDQNIYAFRNSDSRHMQSLITNYQARHYELCENFRSRSNLVAFANAFVNLIPH RMKHTPIQAHQLEKGSIRITQYQSDNLIVPLIKDVLESPLAGTTGILTKTNEDAVFIACL LQEQGMPVRLVQTNSGNFYLGDLDEIRYFNRALDLSQDTHLIDDERWETAKRCLKREFGH AQSWEICRNIILNFEQVYSRKYHSDWTNYLFESKLEDFYPVQGEAIVVSTIHKAKGKEFD NVFLLLTNIESLSDEKKREVYVAITRAKQNLSIHTNGHYLSNIANGLAEYRINSNTYPAP QSLCYTLTHRDLWLDFFIDSHCQAAINTLKSGMSLRLTEKGCTDFSNNEILRFSKKFKDI LNELIQKDYKLEKASVNFVVYWEKQEDKKEAKIVLPKVVFKKV >gi|222159239|gb|ACAB01000120.1| GENE 4 5610 - 7094 1432 494 aa, chain + ## HITS:1 COG:RSc0408 KEGG:ns NR:ns ## COG: RSc0408 COG1508 # Protein_GI_number: 17545127 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog # Organism: Ralstonia solanacearum # 
19 494 15 497 499 219 32.0 1e-56 MAQGSRQIQSQAQQQVQTLSPQQILVVKLLELPAVELEDRIHAELLENPALEEGKEENTA DEYADSDTTEDGMDSDANDYDSLSDYLTEDDIPDYKLQENNRSKDEQAEDIPFSDSTSFY EILKEQLRERNLTEHQRDLVEYLIGSLDDDGLLRKSLESICDELAIYAGIESTEEELEEA LCILQDFDPAGIGARSLQECLLIQICRKKDEEKKPNPILELEERIIRECYEEFTRKHWEK IIKKLDIDEETFQEALNEITKLNPRPGASLGEAIGRNLQQIVPDFIVETYDDGTINISLN NRNVPELRMSRDFTEMVEEHTKNRANQSKESKEAMMFLKQKMDAAQGFIDAVRQRHNTLM TTMQAIIDLQRPFFLEGDESLLKPMILKDVAERTGLDISTISRVSNSKYVQTNYGIYPLK FFFSDGYTTEDGEEMSVREIRKILKECIDGEDKKKPLTDDELADILKEKGYPIARRTVAK YRQQLNIPVARLRK >gi|222159239|gb|ACAB01000120.1| GENE 5 7141 - 7812 517 223 aa, chain + ## HITS:1 COG:no KEGG:BF4362 NR:ns ## KEGG: BF4362 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 223 5 226 226 304 79.0 2e-81 MDREREHIIADKTMLQVARATSMVFTPFSIPFLSFLVLFLFSYLRIMPIQYKLIVLGIVY CFTILTPTITIFLFRKINGFARQELSERKKRYVPILLTIISYVFCLLMMRRLNIPWYMTG IILVSLVVSIICIAVNLKWKLSEHMAGIGGVIGGLVSFSALFGYNPVGWLCLFILIAGIL GSARIVLGHHTLGEVLSGFAVGLICALLVLHPVSNILFRVFLF >gi|222159239|gb|ACAB01000120.1| GENE 6 7842 - 8222 582 126 aa, chain + ## HITS:1 COG:BH3484 KEGG:ns NR:ns ## COG: BH3484 COG0509 # Protein_GI_number: 15616046 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system H protein (lipoate-binding) # Organism: Bacillus halodurans # 2 125 3 126 128 128 52.0 3e-30 MNFLQNLKYTNEHEWIRVEGDIAYVGITDYAQEQLGDIVFVDIPTVGETLEAGETFGTIE VVKTISDLFLPVAGEVLEQNEALEENPELVNKEPYGEGWLIKVKPTDIKAVEDLLDAEAY KAVVNG >gi|222159239|gb|ACAB01000120.1| GENE 7 8334 - 8843 612 169 aa, chain + ## HITS:1 COG:PAB1077 KEGG:ns NR:ns ## COG: PAB1077 COG0041 # Protein_GI_number: 14521838 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase # Organism: Pyrococcus abyssi # 3 162 7 166 174 167 55.0 1e-41 MTPIVSIIMGSTSDLPVMEKAAQLLNDMHVPFEMNALSAHRTPEAVEEFAKNARNRGIKV IIAAAGMAAALPGVIAANTTLPVIGVPVKGSVLDGVDALYSIIQMPPGIPVATVAINGAM NAAILAIQMLALSDEKLAEAFAAYKEGLKKKIVKANEELKEVKFEYKTN 
>gi|222159239|gb|ACAB01000120.1| GENE 8 8983 - 10827 1545 614 aa, chain + ## HITS:1 COG:CPn0373 KEGG:ns NR:ns ## COG: CPn0373 COG0821 # Protein_GI_number: 15618288 # Func_class: I Lipid transport and metabolism # Function: Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis # Organism: Chlamydophila pneumoniae CWL029 # 6 606 9 598 613 391 39.0 1e-108 MVDLFNYFRRETTEVNIGAVPLGGPNSIRVQSMTNTSTQDTQACVEQAKRIVDAGGEYVR LTTQGIKEAENLMNINIGLRSQGYMVPLVADVHFNPKVADVAAQYAEKVRINPGNYVDAA RTFKKLEYTDEEYAQEIQKIHDRFVPFLNICKENHTAIRIGVNHGSLSDRIMSRYGDTPE GMVESCMEFLRICVAENFTDVVISIKASNTVVMVKTVRLLVDVMEKEGMTFPLHLGVTEA GDGEDGRIKSALGIGALLSDGLGDTIRVSLSEAPEAEIPVARKLVDYVLLRQDHPYIPGL EAPEFNYLSPERRKTKAVRNIGGEHVPVVIADRMDGKTEVNPQFTPDYIYAGRALPEQRE EGVEYILDADVWQGEAGTWPAFNHAQLPLMGECDAALKFLFIPYMAQTDEVIACLKYHPE VVIISQSNHPNRLGEHRALVHQLMTEGLQNPVVFFQHYAEDDAENLQIKSAADMGALIFD GLCDGIFLFNQGSLSHAVVDATAFGILQAGRTRTSKTEYISCPGCGRTLYDLEKTIARIK AATSHLKGLKIGIMGCIVNGPGEMADADYGYVGAGRGKISLYKGKVCVEKNIPEEEAVER LLEFIRTDREENQQ Prediction of potential genes in microbial genomes Time: Wed May 18 03:57:24 2011 Seq name: gi|222159238|gb|ACAB01000121.1| Bacteroides sp. D1 cont1.121, whole genome shotgun sequence Length of sequence - 12659 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 7, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 9 - 68 3.9 1 1 Tu 1 . + CDS 169 - 1335 865 ## COG0527 Aspartokinases - Term 1310 - 1350 2.0 2 2 Tu 1 . - CDS 1396 - 1713 276 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Term 1812 - 1852 -0.2 3 3 Tu 1 . - CDS 1876 - 2736 813 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains - Prom 2847 - 2906 5.6 + Prom 2737 - 2796 3.6 4 4 Op 1 . + CDS 2866 - 5391 2000 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 5 4 Op 2 . 
+ CDS 5423 - 5773 307 ## BT_2435 MarR family transcriptional regulator + Prom 5782 - 5841 2.3 6 5 Op 1 . + CDS 6001 - 6636 752 ## BT_2437 hypothetical protein 7 5 Op 2 . + CDS 6652 - 7278 598 ## BT_2438 hypothetical protein + Term 7322 - 7369 7.6 - Term 7306 - 7361 9.1 8 6 Op 1 . - CDS 7367 - 10378 2173 ## COG1472 Beta-glucosidase-related glycosidases 9 6 Op 2 1/0.000 - CDS 10375 - 11223 266 ## PROTEIN SUPPORTED gi|145635642|ref|ZP_01791339.1| 30S ribosomal protein S16 10 6 Op 3 . - CDS 11240 - 12025 731 ## COG0737 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases - Prom 12162 - 12221 8.1 + Prom 12092 - 12151 4.6 11 7 Tu 1 . + CDS 12181 - 12534 585 ## PROTEIN SUPPORTED gi|237713894|ref|ZP_04544375.1| 50S ribosomal protein L19 + Term 12563 - 12601 7.1 Predicted protein(s) >gi|222159238|gb|ACAB01000121.1| GENE 1 169 - 1335 865 388 aa, chain + ## HITS:1 COG:MK0109 KEGG:ns NR:ns ## COG: MK0109 COG0527 # Protein_GI_number: 20093549 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Methanopyrus kandleri AV19 # 119 357 130 383 467 72 27.0 2e-12 MKVCKFEKIKDEDEMKQVINCIQKEHPYVAVVPILAQLQEWLQAISSSWFHEEDEASHAT VNAIEAYCCTLANHLITDSHLNQEIKNRILECIKKIHILVEDKADLLIDKMIKAEIYGLS SDLFTYCLRQQGLRAQTLDTGKLIQINLERKPDIPYIQESIQQYIDENRNVDIFIAPLSI CRNVYGEIDFMSEQRNDYYATVLATLFKADEILLSTPINHIYANRNCLREQHSLTYIEAE QLINSGVHLLYADCITLAARSNIVIRLTDTHDLSTERLYISSHDTGNSVKAILSQDSATF VRFTSLNVLPGYLFMGKILEVINKYQINVISMASSNVSISMMLTASRDTLRIIERELHKY AEMVMDENMSVIHIIGSHYLCTAQIKTN >gi|222159238|gb|ACAB01000121.1| GENE 2 1396 - 1713 276 105 aa, chain - ## HITS:1 COG:BH3899 KEGG:ns NR:ns ## COG: BH3899 COG3829 # Protein_GI_number: 15616461 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Bacillus halodurans # 3 105 456 558 559 63 36.0 8e-11 MPPAFLEALEQQPWKGNIRELRNVIERSLIVCEGGRLDICDLPLEIQNTHYECSDDSTGR 
FELAAMERRHIARVLEYTKGNKTEAARLLKIGLTTLYRKIEEYKI >gi|222159238|gb|ACAB01000121.1| GENE 3 1876 - 2736 813 286 aa, chain - ## HITS:1 COG:atoC KEGG:ns NR:ns ## COG: atoC COG2204 # Protein_GI_number: 16130157 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 1 286 4 289 461 233 40.0 4e-61 MNKILIIDDEVQIRSLLARMLELEGYEVCQAGDCKTAIKQLEIQSPDVALCDVFLPDGNG VDLVLNIKKIAPNVEVILLTAHGNIPDGVQAIKNGAFDYITKGDDNNKIIPLISRAVEKA KMNVRLEKLEKKVGQMYSFDSILGESKVLKESVSLAQKVSVTDVPVLLTGETGTGKEVFA QAIHYNSKRSKQNFVAVNCSSFSKELLESEMFGHKAGSFTGALKDKKGLFEEANNGTIFL DEIGEMAFELQAKLLRILETGEYIKIGDTKPTRVNVRIIAATNRNL >gi|222159238|gb|ACAB01000121.1| GENE 4 2866 - 5391 2000 841 aa, chain + ## HITS:1 COG:YPO3001_1 KEGG:ns NR:ns ## COG: YPO3001_1 COG0446 # Protein_GI_number: 16123182 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Yersinia pestis # 18 456 20 458 459 421 49.0 1e-117 MKIIIIGGVAGGATTAARIRRVDETAEIILLEKGKYISYANCGLPYYIGGVIEERDKLFV QTPEAFSTRFRVDVRTENEAIFIDRKRKTVTIRQSSEDTYEESYDKLVISTGASPVRPPL PGIDLSGIFTLRNVTDTDRIKEYIKSHAPRKAVIVGAGFIGLEMAENLHTQGAKVSIVEM GNQVMAPIDFSMASLVHQHLMDKGVNLYLEQAVASFSREGKGLKVTFKNGQSISADIVIL SIGVRPETSLARAAELTIGPAGGIAVNDYLQTSDESIYAIGDAIEFRHPITGKPWLNYLA GPANRQGRIVADNVLGAKIPYEGSIGTSIAKVFDMTVASTGLPGKRLRQEEIDYMSSTIH PASHAGYYPDAMPMSIKITFDKKTGRLYGGQIVGYDGVDKRIDELALVIKHEGTIYDLMK VEQAYAPPFSSAKDPVALAGYVAEDIITGKTNPVYWRELRDIEMENKFLLDVRTPDEYSL GSLPGAVNIPLDELRDRLAELPKDKMIYTFCAVGLRGYLAYRILTQHEFDKVRNLSGGLK TYRAATAPIIIREENGNEIDESPEQQGGSPQVGQPAVAKVSDTTVTATAAVTADALATPA KTVRVDACGLQCPGPILKMKKTMDTLASGERVEITSTDPGFPRDAAAWCSSTGNQLISKD SSGGKSVVVIKKGEPKSCNIVTSCEGKGKTFIMFSDDLDKALATFVLANGAAATGQKVTI FFTFWGLNVIKKLHKPETEKDIFGKMFGMMLPSSSKKLKLSKMSMGGIGGKMMRYIMNKK GIDSLESLRQQALENGVEFIACQMSMDVMGVKQEELLDEVTIGGVATYMERADNANVNLF I >gi|222159238|gb|ACAB01000121.1| GENE 5 5423 - 5773 307 116 aa, chain + ## HITS:1 COG:no 
KEGG:BT_2435 NR:ns ## KEGG: BT_2435 # Name: not_defined # Def: MarR family transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 116 1 117 122 151 74.0 9e-36 MKTICAMRDVFKAMGNFETAFEKMYQITLNEAMILCALKEASDKVTATNLSKQTELSPSH TSKMLRILEEKGLIVRSLGSEDRRQMYFHLTQLGKQRVTELELDKVEIPDLLKPLF >gi|222159238|gb|ACAB01000121.1| GENE 6 6001 - 6636 752 211 aa, chain + ## HITS:1 COG:no KEGG:BT_2437 NR:ns ## KEGG: BT_2437 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 211 1 210 210 343 89.0 3e-93 MKKLIPILLAVFALASCEKDPDMGKLDDNYLVYTNYDKKADFKVPTFYLAPQILVISDSK EPEYLEGEGAEQILAAYTDNMEARGYEAAADQESADLGIQVSYIASTYYFTSYTQPEWWW GYPGYWGPGYWGGNWGGGWYYPYAVTYSYSTNSFLTEMVNLKAEQGDSKKLPVVWTSYLT GFETGSKAINRTLAVEAVNQSFTQSPYLTNK >gi|222159238|gb|ACAB01000121.1| GENE 7 6652 - 7278 598 208 aa, chain + ## HITS:1 COG:no KEGG:BT_2438 NR:ns ## KEGG: BT_2438 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 208 1 208 208 353 93.0 1e-96 MKTRKNIYFKVVALAAIAIAFAMPAKAQLSDNGYANIDWQFNAPLSNNFADKASGWGMNF EGGYFVTPNIGLGLFLNYHSNHEYVGRETFQMGAGEVTTDQQHTIFQLPFGAAARYQWNR GGAFQPYVSAKLGAEYAKIRSNFSMLEARENSWGFYASPEVGINVFPWVYGPGLHFALYY SYGTNKADVLHYSVDGLSNFGFRLGVSF >gi|222159238|gb|ACAB01000121.1| GENE 8 7367 - 10378 2173 1003 aa, chain - ## HITS:1 COG:BH0675 KEGG:ns NR:ns ## COG: BH0675 COG1472 # Protein_GI_number: 15613238 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Bacillus halodurans # 13 566 96 663 686 259 31.0 2e-68 MKIIPSIVTLLFLIAAGTVSSPAQNATTVEPLLVYKALQDKDCRHWVDSVMDKLSFKEKV GQLFIYTIAPVDTKRNLELLREVIDTYKVGGLLFSGGKMQNQVELTNRAQRQAKVPLMIT FDGEWGLAMRLRGMPVFPRNMVLGCIRDNKLLYEYGREVARQCRQIGVQVNFAPVADVNI NPENPVINTRSFGEDPIQVADKVIAYASGLESGGVLSVCKHFPGHGDTDVDSHKALPVLP FTRERLDSVELYPFKEAIRAGLGGMMVGHLQVPVIEPIGGLPSSLSRNVVYDLLTDELAF KGLIFTDALAMKGVAGNGNVSLQALKAGNDMVLSPRNLKEEIPAVLAAVEKGELSREEIE 
SKCRKVLTYKYVLGLKKKSYVQLSGLEQRINSPQTRDLVRRLNLAAITVLNNKNHILPLH TDKEQKIALLEVGNPGKTDVLAKQLSRYTSLARFCLRANQTEEENQRLRDSLSTYKRIIV AVSEQRLASYQPFFAKFAPQSPAIYLFFTPGKMMLQIQRAVSHASAVVLGHSYNVDVQRQ VADVLFAKASADGQLSASLGELFPTGAGVTITPKTPLHFVPEEYGLSSTHLKRIDSIALD GIRQGAYPGCQVVVLKNGHSMFDKSFGTYAGKGSPRVESTSIYDLASLSKTTGTLLAIMK LYDKGRFNLTDKISDHLPFLQRTDKKDITIQEILYHQSGLPSWIPFYQEAIDKDSYDGRL FSARKDAHHPLQLGTTSWANPKFKFKSEYVSPVKTGDYTVQICDSLWLNRSFRKVMEEKI MEAPLKQKRYVYSDVGFILLGMLVEQLAGMPMEAYLQREFYEPMGLEHTGYLPLRRFAKS EIVPSNKDRFLRKETLQGFVHDEASAFFGGLAGNAGLFSTARDVARVYQMLLNGGEIDGQ RYLSKETCQLFTTETSKISRRGLGFDKPDADDPKKGNCAPAAPAGVYGHTGFTGTCAWVD PVNELVYVFLSNRIYPDVTNRKLNQLHIRERIQGAIYDAMKKK >gi|222159238|gb|ACAB01000121.1| GENE 9 10375 - 11223 266 282 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145635642|ref|ZP_01791339.1| 30S ribosomal protein S16 [Haemophilus influenzae PittAA] # 1 243 1 252 603 107 29 6e-23 ILNMKRFQILFLLCLALSFTFSIFAQDKDTKEVIILQTSDVHSRIEPVNQKGDEYYNKGG FLRRAAFLEQFRKEHKNVLLFDCGDISQGTPYYNMFRGEVEVKLMNEMGYDAMTIGNHEF DFDVDNMERIFKMANFPVVCANYNLDATVLKDIVKPYVVLEKYGLRIGVFGLGTQPEGMI QANKCEGVVYEDPVRVSNEIAALLKDEEGCDLVVCLSHLGIQMDEHLVAGTRNIDVILGG HSHTFMKGPKTYLNMDGKEVPVMHTGKNGVRVGRLDLTLKHK >gi|222159238|gb|ACAB01000121.1| GENE 10 11240 - 12025 731 261 aa, chain - ## HITS:1 COG:STM4104 KEGG:ns NR:ns ## COG: STM4104 COG0737 # Protein_GI_number: 16767370 # Func_class: F Nucleotide transport and metabolism # Function: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases # Organism: Salmonella typhimurium LT2 # 27 241 283 498 518 84 27.0 2e-16 MKQTYAKYLMVAALAGGFLFTSCRTARETTAQYEVTKVEGSMITIDSVWDTHPNAKAAEI LKPYKEKVDAMMYEVIGTSAMKMDKGGPESLLSNLVAGVLQQAAVQVLGKPADMGLVNMG GLRNILPEGDITVGDVFEILPFENSLCVLTMKGTDLRRLFEAIASLHGEGVSGIRLEITK DGKLLNAFVGEKPLKDDQLYTVATIDYLADGNGRMDAFLQAEKRVCPEDATLRGLFLDYV RKQTAEGKAITSKLDGRITIK >gi|222159238|gb|ACAB01000121.1| GENE 11 12181 - 12534 585 117 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237713894|ref|ZP_04544375.1| 50S ribosomal protein L19 [Bacteroides sp. 
D1] # 1 117 1 117 117 229 100 6e-60 MDLIKIAEEAFATGKQHPSFKAGDTVTVAYRIIEGNKERVQLYRGVVIKIAGHGDKKRFT VRKMSGTIGVERIFPIESPAIDSIEVNKVGKVRRAKLYYLRALTGKKARIKEKRANG Prediction of potential genes in microbial genomes Time: Wed May 18 03:57:41 2011 Seq name: gi|222159237|gb|ACAB01000122.1| Bacteroides sp. D1 cont1.122, whole genome shotgun sequence Length of sequence - 28420 bp Number of predicted genes - 23, with homology - 22 Number of transcription units - 10, operones - 3 average op.length - 5.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 118 - 168 12.6 1 1 Tu 1 . - CDS 192 - 1172 499 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase - Prom 1196 - 1255 6.7 + Prom 1138 - 1197 3.8 2 2 Op 1 36/0.000 + CDS 1352 - 2068 362 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Prom 2073 - 2132 3.3 3 2 Op 2 10/0.000 + CDS 2152 - 3411 296 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 4 2 Op 3 13/0.000 + CDS 3416 - 4657 326 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 5 2 Op 4 13/0.000 + CDS 4714 - 5814 1132 ## COG0845 Membrane-fusion protein 6 2 Op 5 . + CDS 5834 - 7174 891 ## COG1538 Outer membrane protein 7 2 Op 6 . + CDS 7259 - 8287 808 ## COG0229 Conserved domain frequently associated with peptide methionine sulfoxide reductase + Term 8344 - 8386 1.2 - Term 8331 - 8373 4.2 8 3 Tu 1 . - CDS 8430 - 8945 524 ## BT_2500 hypothetical protein - Prom 9091 - 9150 3.5 9 4 Op 1 . - CDS 9208 - 10668 552 ## Phep_3369 kelch repeat protein 10 4 Op 2 . - CDS 10744 - 11634 740 ## Mboo_1713 hypothetical protein 11 4 Op 3 . - CDS 11640 - 12869 1251 ## GFO_3192 phosphate-selective porin O and P 12 4 Op 4 . - CDS 12902 - 14425 1070 ## COG2866 Predicted carboxypeptidase 13 4 Op 5 . - CDS 14433 - 15809 1037 ## GFO_3192 phosphate-selective porin O and P 14 4 Op 6 . - CDS 15822 - 16970 1062 ## CCC13826_0552 isoaspartyl dipeptidase (EC:3.4.19.5) 15 4 Op 7 . 
- CDS 16986 - 18365 1233 ## COG3069 C4-dicarboxylate transporter 16 4 Op 8 . - CDS 18383 - 19798 1387 ## gi|237713910|ref|ZP_04544391.1| predicted protein - Prom 19993 - 20052 7.3 + Prom 19833 - 19892 7.3 17 5 Tu 1 . + CDS 20020 - 21972 1458 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) + Term 22179 - 22210 -0.5 + Prom 22276 - 22335 3.1 18 6 Tu 1 . + CDS 22539 - 22730 67 ## 19 7 Tu 1 . - CDS 23322 - 24014 650 ## BT_2504 hypothetical protein - Prom 24042 - 24101 4.8 - Term 24052 - 24088 6.4 20 8 Tu 1 . - CDS 24130 - 24969 222 ## PROTEIN SUPPORTED gi|212640476|ref|YP_002316996.1| Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains - Prom 25087 - 25146 4.9 + Prom 24932 - 24991 6.4 21 9 Tu 1 . + CDS 25157 - 26158 1126 ## COG0039 Malate/lactate dehydrogenases + Term 26181 - 26240 14.7 - Term 26048 - 26087 -0.9 22 10 Op 1 . - CDS 26230 - 26670 285 ## BT_2511 putative transcription regulator 23 10 Op 2 . - CDS 26678 - 28408 1480 ## COG2217 Cation transport ATPase Predicted protein(s) >gi|222159237|gb|ACAB01000122.1| GENE 1 192 - 1172 499 326 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 10 317 5 316 319 196 38 1e-49 MNSNMEKPYVVGIDIGGTNTVFGIVDARGTIIASSSIKTSGYPTVEEYADEVCKSLLPLI IANGGVDKIRGIGIGAPNGNYYTGTIEFAPNLPWKGILPLAAMFEERLGIPTALTNDANA AAIGEMTYGAARGMKDFIMITLGTGVGSGIVINGQMVYGHDGFAGELGHVIARRDGRLCG CGRKGCLETYCSATGVARTAREFLAARTDASLLRNIPAENITSKDVYDAAVQGDKLAQEI FEFTGNILGEALADAIAFSSPEAIVLFGGLAKSGDYIMKPIQKAIDDNILNIYKGKTKLL VSELKDSDAAVLGASALAWELKDLKE >gi|222159237|gb|ACAB01000122.1| GENE 2 1352 - 2068 362 238 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 230 1 223 245 144 37 8e-34 MIQLTNVNKTYNNGASLHVLKGINLNIEQGEFVSIMGASGSGKSTLLNILGILDNYDTGE YYLNNVLIKNLSETKAAEYRNRMIGFIFQSFNLISFKDAVENVALPLFYQGISRKKRNAL ALEYLDRLGLKDWAHHMPNEMSGGQKQRVAIARALITQPQIILADEPTGALDSKTSVEVM 
QILKDLHKLGMTIVVVTHESGVANQTDKIIHIKDGVIERIEDNIDHDASPFGKDGFMK >gi|222159237|gb|ACAB01000122.1| GENE 3 2152 - 3411 296 419 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 11 417 15 411 413 118 25 4e-26 MIDIWQEIYSTIKRNKLRTFLTGFAVAWGIFMLIVLLGAGNGLIHAFEQSASERAMNSIK IFPGWTSKSYDGLKEGRRVQLDNKDVDATDHYFPDHVIKAGATVWQGGINLSFGQEYVSL NLSGVYPNHTEVEVVKLFKGRFINEIDIKERRKVIVLHKKTAEILFDKTHTEPIGQFVNA SNVVYQVVGLYNDKGDSGDSDAYIPFTTLQTIYNKGDKLNNLVMTTKNLETIEANEAFEA HYRKVLGANHRFDPTDHSAIWIWNRFTNYLQQQQGSNMLRIAIWVIGIFTLLSGIVGVSN IMLITVKERTREFGIRKALGAKPLSILWLIIVESVTITTIFGYIGMVAGIGVTEWMNNAF GNQTMDTGMWTETVFLNPTVDIRIAIQATLTLIIAGTLAGLFPARKAVSIRPIEALRAD >gi|222159237|gb|ACAB01000122.1| GENE 4 3416 - 4657 326 413 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 21 413 22 413 413 130 26 1e-29 MRIDMDTCEEILITITRNKTRSLLTAFGVFWGIFMLVALIGGGQGLEGMMKKNFEGFATN SGFLAAQKTGEAYKGFRKGRWWNLEAIDIERLRSKVKDVEVITPSIARWGSKAVYEDKKY DCSVKGLYPDYLHIESQEMAYGRFINEVDIQEARKVCVIGKRIYESLFKPGEDPCGKYIR VDGIYYQVIGMSSSEGDMNIQGRASEAVTLPFTTMQQTYNLGGRIDVICFTVKHGIKVSD VQPEMEQIIKAAHYIAPNDKQALMYLNAEAMFSMVDNLFTGINILVWMVGLGTLLAGAIG VSNIMMVTVKERTTEIGIRRAIGARPKDILQQILSESMVLTTIAGMCGISFAVMVLQLVE MGANADGGDVRFQVTFGLAIGTCALLIALGMLAGLAPAYRAMAIKPIEAIRDE >gi|222159237|gb|ACAB01000122.1| GENE 5 4714 - 5814 1132 366 aa, chain + ## HITS:1 COG:VC1563 KEGG:ns NR:ns ## COG: VC1563 COG0845 # Protein_GI_number: 15641571 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 45 316 52 335 338 127 26.0 4e-29 MKKYLKITLLVVIAIILVGTFVFLYQKSKPKVITYEIVKPEVTELQKTTVATGKVEPRDE ILIKPQISGIIDEVYKEAGQAVKKGEVIAKVKVIPELGQLNSAESRVRLAEINATQAETD FARVKKLYEDQLISREEYEKSEVALKQAREERQTAKDNLEIVKEGITKNSASFSSTMIRS TIDGLILDVPVKAGNSVIMSNTFNDGTTIATVANMNDMIFRGNIDETEVGRIHEQMPIKL TIGALQNLTFNAILEYISPKGVETNGANQFEIKAAITIPDSVQIRSGYSANAEIVLQRAN 
QVLAVPESTIEFNGDSTFVYIMTDSVPEQKFQRTQVTAGMSDGIKIEIKKGITAQDKIRG AEKKDK >gi|222159237|gb|ACAB01000122.1| GENE 6 5834 - 7174 891 446 aa, chain + ## HITS:1 COG:CC1318 KEGG:ns NR:ns ## COG: CC1318 COG1538 # Protein_GI_number: 16125567 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Caulobacter vibrioides # 104 419 96 406 483 81 22.0 3e-15 MRIKIAILFFAGAITCSAQETEVPLRDYTLRQCIDYAISHNITVRQSANNVEQSAVDIST AKWARLPNLNGSAGQNWRWGRNSVTETDKDGNQYQVYKNTYNYGTNASLSTEIPLFTGFE LPNQYALAKLNFKAAIADLEKAKEDIAIQVASAYLKVLLTQELKQVAINQVELSKEQSNR IARLFEVGKASPSEVAEAKARVAQDELSVTQSDNDYQLALLDLSQLLELSNPHLLSIQKA DTTFVTTLLTPPDEIYQYALTNKPEIQAAQYRLLGSEKSIRIAQSGFYPKLNLGGNWGTG FSSDLKNNFGTQMRENRNKSIAFSLSVPLFNRFATRNRVRTARLQHFNQNLKLDNAKKVL YKEIQQAWYNALAAESKYNSSEVAVKANEESFRLMSEKFNNGKATFVEYNEAKLNLTKSL SDKLQSKYEYLFRTKILDFYKGQVIE >gi|222159237|gb|ACAB01000122.1| GENE 7 7259 - 8287 808 342 aa, chain + ## HITS:1 COG:SP0660_2 KEGG:ns NR:ns ## COG: SP0660_2 COG0229 # Protein_GI_number: 15900561 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Conserved domain frequently associated with peptide methionine sulfoxide reductase # Organism: Streptococcus pneumoniae TIGR4 # 198 341 1 144 145 202 64.0 8e-52 MKYFIIIVIVVISNTFLGVAKPMKNQAEIYFAGGCFWGTEHFLKQIRGVESTQVGYANSN VANPSYEQVCSGKTNAAETVKVVYDPKTVDLNLLLDLYFKTIDPTSLNRQGNDRGTQYRT GIYYINKEDLPVINQAIKLLSTQYKTPIAIEVKPLTNFYPAETYHQDYLDKNPGGYCHIN PALFEMARKANAPKTKVYQKPDDSTLRKELSAEQYAVTQKNATEPAFHNEYWNEHRPGIY VDITTGEPLFVSTDKFDSGCGWPSFSQPIQKDIIAEKKDTSHGMIRTEVRSKTGDAHLGH VFTDGPKEKGGLRYCINSASLRFIPKEKMKEEGYGEYLPLVK >gi|222159237|gb|ACAB01000122.1| GENE 8 8430 - 8945 524 171 aa, chain - ## HITS:1 COG:no KEGG:BT_2500 NR:ns ## KEGG: BT_2500 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 171 1 172 172 258 81.0 5e-68 MKTTVLFRMIVLVAMVFAGISNATVKAQNNNFITNEETVGDLVVLKVIYRLDGSLYRHMK 
YDFTYDDQKRMATKEAFKWDASTDKWLPYFKIDYTYSSNEITLVYARWNDSHRAYDAAVE KSVYELNDANMPVAYMNYKWNDKWIEESSVNWAMNVATPATNEATLLTASR >gi|222159237|gb|ACAB01000122.1| GENE 9 9208 - 10668 552 486 aa, chain - ## HITS:1 COG:no KEGG:Phep_3369 NR:ns ## KEGG: Phep_3369 # Name: not_defined # Def: kelch repeat protein # Organism: P.heparinus # Pathway: not_defined # 251 485 600 837 842 195 40.0 4e-48 MFLSASYIQGFYSSGQKIILLIFCLLFTVSLRAVRSAEEIKSVKADQILDTKQFNYLDKG SVITLQVMLPGQDKVVFESLMRIESTDKELYNLILSRSPQAGTELLWVCNSEVEKVQVSS PEDSISTDTMSVGFSVYFPADSVYISANGHSYAFSSHGIRSDRGYRFHILGDKRSHLKLL ERHAELLSVGKEEQHLSAWYWILLIIVVDMIVFLGMHWRRRRQKRLKASNRVEILTSFGD EGEKVRLPRYNAVYLFGGLQVFDRNGEEVSQRFSPLLRELFVLLILRSAENGISSEELTT MLWFDKDDVSAKNNRAVNLSKLRVLVESVGGCTISKISGNWRVQFEKNAFVDYYECLPER IRPDSLTTERIKIIAALTAKGGLLNECDYLWLDAFKSRIADNLIESLLKYATLLDEHEAF DTKLLISDIIFRFDSVNEAALRLKCATYVFMRKQYMAKTTYEHFCREYKSLYDEPFEHSF NSIISQ >gi|222159237|gb|ACAB01000122.1| GENE 10 10744 - 11634 740 296 aa, chain - ## HITS:1 COG:no KEGG:Mboo_1713 NR:ns ## KEGG: Mboo_1713 # Name: not_defined # Def: hypothetical protein # Organism: M.boonei # Pathway: not_defined # 49 296 68 315 318 168 38.0 3e-40 MRVANLLILLIAVLMTSCFKDDESESLGETSKFLPGDGIVTYSGYAPLADRPVRIHFHIP ANGDMKKMPVLFVFPGLERNAADYLNAWRTEASNRNVMVFVFEFPTETYSTAQYIEGGMF QGNTLLDRSEWTFSLIESVFDTVRKDTGASRTKYDMWGHSAGAQFVHRYVTFMPDTRVDR AVSANAGWYTLPDVDVAYPYGLKNTDVATSSQVASLLARKLIVHLGTADTDRNGLNTSTG AEAQGANRYQRGKYYFSEAKRISVKGGYSLNWDKYEVTGVAHEYAKMAAAGAKILY >gi|222159237|gb|ACAB01000122.1| GENE 11 11640 - 12869 1251 409 aa, chain - ## HITS:1 COG:no KEGG:GFO_3192 NR:ns ## KEGG: GFO_3192 # Name: not_defined # Def: phosphate-selective porin O and P # Organism: G.forsetii # Pathway: not_defined # 1 386 1 382 401 98 26.0 4e-19 MKKIIIGVFSLILATGLYAQTIGNEAFPQLKDGLLDLNLRAPGQKLRFGGYLQGSGYYTD IKNAESEYGFDIDHAYLSLEGSFLNEKLGFFLQADFADSYPLLDAWVSYAPWKQLKISAG QKQTFTNTRQMMMLDQGLAFGEHSLMDRTFSRTGRELGLFVESRLSLGKLGFDLGAAVTS GDGRNSFGSSSTDPDLGGLKYGGRVTVYPLGFFTSGNELVFNDFIREKSPKLAIGVAYSY 
NVGASNMVGEGHNDFQMYDKEGAADYPDYRKLSADLLFKWNGFSLLAEYVNATANGLNGL YVEKDEGAKLKPRQIANYLALGNGLNVQAGYLFRKLWALDVAYSKVNPEWKETGQSVLRE ADNITFGASKYFNNNTFKLQLAGSYTSYNQLVGAGSKEFQVKLNLHILF >gi|222159237|gb|ACAB01000122.1| GENE 12 12902 - 14425 1070 507 aa, chain - ## HITS:1 COG:BH3831 KEGG:ns NR:ns ## COG: BH3831 COG2866 # Protein_GI_number: 15616393 # Func_class: E Amino acid transport and metabolism # Function: Predicted carboxypeptidase # Organism: Bacillus halodurans # 44 228 44 233 351 78 30.0 4e-14 MKRILLILCLLLSGLPFYAQQTGKVTAKFFPDPQVDIDTPAFAKKHGFTTYREMMTFLHD LATTYPERVKLQIVGRTQRGREIPMIKVSKGGNDKLRVLYTGCVHGNEPAGTEGLLYFMK QLTCDSQLSALLDKMDFYIMPSVNIDGSEQGERVTANGIDLNRDQTLLSTPEARTLQRVA LTVKPHLFIDFHEYKPLRASYEEVTDGLLVTNPNDFMFLWSSNPNVSPALRTVVDEFYVP EAIRMADAEGLTHHTYFTTKSNRGEIIFNIGGFSPRSSTNIMALRGAISMLMEVRGVGLG RTSYKRRVYTVYKLAESFARTTFEHEGQIRKAVDESAHYNGDVAVTFRSKAASGYPLTFI DMLACKEVTVPVEARIAPESEVVLTRQRPVAYYLDANQNRAVEILQQYGVELERLASPET IELECYTVTKAVESHDLVAGILPLNVVTNTSNRTITLPAGSYRISMSQPLATLVTVLLEP ESANGFVNYRVIDAAVNNTLGVYRKMK >gi|222159237|gb|ACAB01000122.1| GENE 13 14433 - 15809 1037 458 aa, chain - ## HITS:1 COG:no KEGG:GFO_3192 NR:ns ## KEGG: GFO_3192 # Name: not_defined # Def: phosphate-selective porin O and P # Organism: G.forsetii # Pathway: not_defined # 60 432 42 386 401 122 27.0 3e-26 MDMKKYLLFLIGLFLLLPVMAQTDEGSDDAEIVEASDESLSDIDNKVVLHRYKMGDGLRF TTQGGNKLTVSGMVQTSVESRRFEDVDQMYNRFRVRRARVRFDGSVYHDKLRFRLGLDLV KGSETDDDSGSLLMDAYAAYRPWGSKLVISFGQRSTPTDNYELQMSSHTLQFGERSKITS AFSTIRELGVFAESSFRLGSKGLLRPSIAITDGSGPISEGKRYGGLKYGGRLNYLPFGAF RSMGGSREGDMAYELTPKLSVGVAYSYADGTSDRRGGRSNGDILYMNDRDEIDLPDYAKL VADFAFKYRGFSMLGEYAKTWGYVPSSITKRVRNDGSTATTFDVNGEQNVEAYIKNRMML GQGFNIQAGYMLRSLWSFDLRYTYLKPDEYSYMNNNLYFNRHNFYDFSVSKYLTRNYAAK IQLTVGLARSNGENRTPDSTYTYNGNEWIGNLLFQFKF >gi|222159237|gb|ACAB01000122.1| GENE 14 15822 - 16970 1062 382 aa, chain - ## HITS:1 COG:no KEGG:CCC13826_0552 NR:ns ## KEGG: CCC13826_0552 # Name: iadA # Def: isoaspartyl dipeptidase (EC:3.4.19.5) # Organism: C.concisus # Pathway: not_defined # 5 
381 3 379 379 300 42.0 8e-80 MMFKLIQNIHLYAPEDRGMNDLLICGEKIACIAPHIDIRGLEVEVIDGAGMNAAPGCIDQ HVHIIGGGGQTGYFSLAPEVPLSRLLACGTTTVVGLLGTDGFVKQLPALYAKTKALCMDG ISAYMLTGYYGLPTPTLTDSIANDLIFIDKVIGCKIAISDDRSSYPTKTELMRIIQQVRL GGFTSAKGGVMHVHLGALPGGIELLLEIAREYPSLISYISPTHMGRTHDLFLQGIEFARM GGMIDISTGGTKYCEPYESVLEGLNAGVSIDRMTFSSDGNAGVRRKDPITGEDSYTVAPL HRNLEQVIRLIVDGGIAPSDAFRLITSNPARTMKLKGKGELREGWDADITLFDDNWKLQG VYARGAEMMHEGVVMRKGNFEM >gi|222159237|gb|ACAB01000122.1| GENE 15 16986 - 18365 1233 459 aa, chain - ## HITS:1 COG:PM0933 KEGG:ns NR:ns ## COG: PM0933 COG3069 # Protein_GI_number: 15602798 # Func_class: C Energy production and conversion # Function: C4-dicarboxylate transporter # Organism: Pasteurella multocida # 5 459 2 460 462 270 41.0 5e-72 MMIYIGAFIALQVIVLVAYWLMKKNNPQGVLMVAGILMLALSMLLGMHSLSLTETTGTPV FDLFRIIKETFSSNMLRVGLMIMTIGGYVAYMKKIKASDALVYVSMQPLAIFKKFPYIAA SIVIPIGQMLFICTPSATGLGLLLVASIFPILVGLGVSRLTAVSVISACTIFDMGPGSAN TARAAELVGQNNMLYFVEHQLPLAIPLTILLMIVYYLTNRYFDRKDRQSGRTQPEMMDTK DFKVDIPLIYAILPVLPLFLLIIFSKYVHLFDPPIELDTTTAMFVSLAVSMLFELIRLRS FKAVFMTMKSFWEGMGKVFTNVVTLIVAAEIFSKGLISLGFIDALVEGCTSLGFSGILVS IVIILILFLAAILMGSGNASFFSFGPLVPSIATKVGMTPVDMILPMQLASSMGRAASPIA GVIIAISEIAGVSATDLAKRNIIPLTITLIVMIVIHFFI >gi|222159237|gb|ACAB01000122.1| GENE 16 18383 - 19798 1387 471 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713910|ref|ZP_04544391.1| ## NR: gi|237713910|ref|ZP_04544391.1| predicted protein [Bacteroides sp. 
D1] # 1 471 1 471 471 806 100.0 0 MMRMNKPMTTAMFCMALLLGGLSSCTTNDVEDIQGGGEQDLYEIKDRALGEYLVYNCSRT DENKLPYEMAVAENGKFYLNTQKAATVENLYLVKNDAQIAKLENAGLATAAVKIVDMDGL QFFTALKTLKITSNSVERMDLTALTELETVEMNNNSVAALDLSQNTKLIRFRYGGNSAGD TSTKLSTISFANNNVIEHIYLKNQNLQENGFILPSNYAALKELDLSNNPAAPFAIPEDLM TQLTTAVGVVADSEGGGGEDGELFTIPDKAFGEYLYYLSTDAGKLPQGLVVKEGNEYQLD KTMAATVTAVNVAKTSAATSELVTAGLETAETPIASVDGLQFFTGLVEFTATSNKFTEAL PLTGLSNLEVLQVNTAGVGSLDLSGNPKLRILNCNGSTKSGYGTLSTINLSYTPNLETLN LKNNKLTEINVSNLVKLTELDLSGNPGANFKIPSAIFNNLTTTKNKGVEAE >gi|222159237|gb|ACAB01000122.1| GENE 17 20020 - 21972 1458 650 aa, chain + ## HITS:1 COG:all2981 KEGG:ns NR:ns ## COG: all2981 COG0744 # Protein_GI_number: 17230473 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Nostoc sp. PCC 7120 # 433 587 88 239 640 126 43.0 2e-28 METSSKKRIIILSATGILLTGLICLFISRNIILCHIVDKRTTRIEQAYGLQIHYQKLQME GCNKVVLQGLSIIPNQRDTLLTLQSVNVKLSFWKLLKGEVEVRNVHMDGLAINFIKYDSL ANYDFLFLKRQQETESKPAIESNYANRIDRVLNLIYGFLPENGQLNQIKIAERKDSNFVA VNIPSFVIKDNHFQSTIQIKEDTLPQQQWQASGELNRRDYTLKAALFAPEKKKISLPYIT RRFGAEVTFDTLSYSMTKDKRATAQLLLKGKARVNGLDVFHKALSPEIIHLDRGQLCYEM NINGHSFELDSTTIVDFNKLQFHPYLRAEKEKGNWHFTAAVNKSWFPADDLFSSLPKGLF SNLEGIKTSGELAYHFLLDIDFAQLDSLKLESELKEKDFHIISYGATSLSKMSGEFIYTA YENGVPVRTFPIGPSCNHFTPLDSISPILRMSVMQSEDGAFFYHRGFLPDALREALIYDL QVKRFARGGSTITMQLVKNVFLNRNKNFARKLEEALIVWLIENERLTSKERMYEVYLNIV EWGPLVYGIQEASAYYFKKRPSQLTTEESIFLASIIPKPKHFRSSFAEDGQLKENMEGYY KLIAKRLAQKGVISEIEADSIRPDIQVTGDARNSLAGEKPKSSSPTAEEQ >gi|222159237|gb|ACAB01000122.1| GENE 18 22539 - 22730 67 63 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCLDETFRDVLLMNPDGKNYPDVSVHYADEWKYSDGTGQNQSPNDDKSDILSDLSPGAVQ SNG >gi|222159237|gb|ACAB01000122.1| GENE 19 23322 - 24014 650 230 aa, chain - ## HITS:1 COG:no KEGG:BT_2504 NR:ns ## KEGG: BT_2504 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 230 1 226 226 316 67.0 3e-85 
MRKIIISFCILFAALSLQAQSVNGIRIDGGNTPILVYLGGNQISLPTTTCFIANLNPGHY TVEVYATRFTRPGERVWKGEKLYKDYVYFDGRGVKEIWVDGRDNMHPERPDRPGQGGHRP GQGEHRPGYGYNRVMNDQLFQTFYNEMKKEPFKDDRIKLLNAALAGSDFTSAQCLQLTKL YTFDDDRMEIMKIMYPRIVDKEAFFTVINTLTFTSSKDKMNDFVIGYGRR >gi|222159237|gb|ACAB01000122.1| GENE 20 24130 - 24969 222 279 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|212640476|ref|YP_002316996.1| Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains [Anoxybacillus flavithermus WK1] # 5 273 2 275 285 90 26 1e-17 MSIELGKFNQLEVVKEVDFGVYLDGGEEGEILLPTRYVPEDCKIGDFLNVFLYLDMDERL IATTLTPLVQVGQFACLEVSWVNQYGAFLNWGLMKDLFVPFREQKMKMQVGRKYVVHAHV DEESYRIVASAKVERYLSKEKPEYAPGAEVNILIWQKTDLGFKAIIDDMYSGLLYENEIF CPLETGMEMKAFVKQVREDGKVDLILQKPGFEKIDDFSKTLLDYIKEHGGRINLNDKSPA EDIYDTFGVSKKTFKKGVGDLYKKRLITLHEDGIVLADS >gi|222159237|gb|ACAB01000122.1| GENE 21 25157 - 26158 1126 333 aa, chain + ## HITS:1 COG:TVN1097 KEGG:ns NR:ns ## COG: TVN1097 COG0039 # Protein_GI_number: 13541928 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Thermoplasma volcanium # 4 270 1 267 325 107 29.0 3e-23 MEFLTNEKLTIVGAAGMIGSNMAQTAMMMKLTPNICLYDPFAPALEGVAEELYHCGFEGV NLTYTSDIKEALTGASYIVSSGGAARKAGMTREDLLKGNAEIAAQFGKDVRQYCPNVKHI VVIFNPADITGLITLLYSGLKPSQVSTLAALDSTRLQNELVKYFHIPASDIQNCRTYGGH GEQMAVFASTTKIKGEPLTDFIGTTRLPLTEWEALKVRVIQGGKHIIDLRGRSSFQSPAY LSIEMIAAAMGGQPFRWPAGTYVSNGKFDHIMMAMETSITKDSVTYKEIAGTPAEQEELE KSYEHLCKLRDEVIEMGIIPAIKDWHALNPNID >gi|222159237|gb|ACAB01000122.1| GENE 22 26230 - 26670 285 146 aa, chain - ## HITS:1 COG:no KEGG:BT_2511 NR:ns ## KEGG: BT_2511 # Name: not_defined # Def: putative transcription regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 139 1 139 139 232 83.0 4e-60 MEEDIYLDKLVRRDIKPTAIRLLVIKEMMQAERAVSLLDLETLLDTVDKSTISRTIALFL SHHLIHSIDDGSGSLKYAVCDNSCNCVVQDLHSHFYCEKCHRTFCLEGTHIPVIDLPEGF TLHSINYVLKGICQECSSVGMTLAMK >gi|222159237|gb|ACAB01000122.1| GENE 23 26678 - 28408 1480 576 aa, chain - ## HITS:1 
COG:PAB0626 KEGG:ns NR:ns ## COG: PAB0626 COG2217 # Protein_GI_number: 14521140 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Pyrococcus abyssi # 12 571 136 687 689 516 48.0 1e-146 MKEAWESIREKDYFSEFTLMIIATLGAFYIGEYPEGVAVMLFYTVGELFQGKAVDKAKRN IGALLDVRPEKALVLREGNLVTESPKKIKVGEIIEIKAGERVSLDGIMQNEVAAFNTAAL TGESVPRNIRKGEEVLAGMIVTDKVIRLEVTRPFDKSALARILELVQNAAERKAPAELFI RKFARVYTPIVIILAVLIVLSPFVYSLINPAFVFTFNDWLYRALVFLVISCPCALVVSIP LGYFGGIGAASRLGILFKGGNYLDAITKINTVVFDKTGTLTKGTFDVQACKSAGDISEEE LVKLIASVESDSTHPIAKAVVNYAKEQNIERVTVTDTKEYAGFGLEATVGGIPVLVGNCR LLSKFDISFPQELLKMTDTIVACAVGNRYAGYLLLADALKEDAKVAIDRLKALNIENIQI LSGDKQSIVTNFAEKLGISKAYGDLLPEGKVKHLEELRQDEANRIAFVGDGMNDAPVLAL SHVGIAMGGLGSDAAIETADVVIQTDQPSKVAEAIKVGKLTRRIIWQNVSLAFGVKLLVL ILGAGGIATLWEAVFADVGVALLAIMNAVRIQKMIK Prediction of potential genes in microbial genomes Time: Wed May 18 03:58:38 2011 Seq name: gi|222159236|gb|ACAB01000123.1| Bacteroides sp. D1 cont1.123, whole genome shotgun sequence Length of sequence - 1990 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 204 120 ## BT_2512 cation-transporting ATPase, P-type, putative zinc-transporting ATPase - Term 228 - 266 4.2 2 1 Op 2 . 
- CDS 271 - 906 737 ## PGN_1744 hypothetical protein - Prom 1135 - 1194 4.5 Predicted protein(s) >gi|222159236|gb|ACAB01000123.1| GENE 1 3 - 204 120 67 aa, chain - ## HITS:1 COG:no KEGG:BT_2512 NR:ns ## KEGG: BT_2512 # Name: not_defined # Def: cation-transporting ATPase, P-type, putative zinc-transporting ATPase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 67 3 71 652 95 66.0 4e-19 MGHCSCGAHSCAAEKKVDAKVSNFHEYGKVIFSLLLLAGGIIMNALDLAFFREGYVSLIW YIVAYLP >gi|222159236|gb|ACAB01000123.1| GENE 2 271 - 906 737 211 aa, chain - ## HITS:1 COG:no KEGG:PGN_1744 NR:ns ## KEGG: PGN_1744 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 46 211 75 227 227 91 34.0 2e-17 MKKALLLFALVAISVVSINAQDNLKWGVMAGMNVSKYTITGFDSRIGFHAGVKAEVGLSQ DANGSGAYMDFAALLTLKGAKVDGGSLASITFNPYYLEVPVRVGYKYAVNDDFSLFGSVG PYIAVGLFGKAKAKVDGDYFDFDEIGGNSASEDIFGDDGLKRFDLGLGLKAGVEFSKKYQ VAISYDFGLVEVIKEVGMKNRNLMISLGYMF Prediction of potential genes in microbial genomes Time: Wed May 18 03:58:45 2011 Seq name: gi|222159235|gb|ACAB01000124.1| Bacteroides sp. D1 cont1.124, whole genome shotgun sequence Length of sequence - 12492 bp Number of predicted genes - 11, with homology - 9 Number of transcription units - 7, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 215 - 274 4.0 1 1 Op 1 . + CDS 327 - 1571 677 ## COG0534 Na+-driven multidrug efflux pump 2 1 Op 2 . + CDS 1644 - 2321 455 ## BT_1899 hypothetical protein + Term 2420 - 2466 -0.7 - Term 2318 - 2361 -0.9 3 2 Tu 1 . - CDS 2560 - 2775 93 ## gi|262409170|ref|ZP_06085714.1| predicted protein - Prom 2795 - 2854 2.9 + Prom 2900 - 2959 8.8 4 3 Op 1 6/0.000 + CDS 2979 - 3431 233 ## COG3727 DNA G:T-mismatch repair endonuclease + Prom 3496 - 3555 2.9 5 3 Op 2 . + CDS 3590 - 4927 702 ## COG0270 Site-specific DNA methylase 6 3 Op 3 . + CDS 4881 - 6074 518 ## BF3682 type II restriction endonuclease + Term 6082 - 6125 10.1 - Term 6069 - 6111 6.1 7 4 Tu 1 . 
- CDS 6146 - 8608 1438 ## A1S_1281 TPR domain-containing protein - Prom 8811 - 8870 7.5 + Prom 8836 - 8895 6.9 8 5 Op 1 . + CDS 9119 - 9346 108 ## 9 5 Op 2 . + CDS 9366 - 9524 67 ## + Prom 9961 - 10020 6.1 10 6 Tu 1 . + CDS 10092 - 11291 863 ## BT_2890 transposase 11 7 Tu 1 . - CDS 11507 - 12391 349 ## BT_2889 AraC family transcription regulator - Prom 12432 - 12491 3.0 Predicted protein(s) >gi|222159235|gb|ACAB01000124.1| GENE 1 327 - 1571 677 414 aa, chain + ## HITS:1 COG:HI1612 KEGG:ns NR:ns ## COG: HI1612 COG0534 # Protein_GI_number: 16273502 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Haemophilus influenzae # 1 309 21 340 464 77 24.0 6e-14 MSILMEQLINITDAIFLGHVGEVELGASALAGIYYLAVYMLGFGFSIGLQVMIARRNGEQ HYKETGRTFFQGLYFLSGMAIILCLLLHLASPLILRQLITSDEIYQAVIHYLDWRSFGLL FSFPFLAFRSFLVGITTTKALSVAAIMAICINIPFNYLLIFKLNLGISGAAMASSLAEFG AFIVLLLYMWVRVDKVKYGLKVVYDEKVLIKLLRISVWSMLHSFISVAPWFLFFVAIEHL GRTELAISNITRSVSTVFFVIVNSFASTTGSLVSNLIGAGQGKELFPVCHKVLRLGYAVG VPLIVMTLWGNQWVIGFYTNNDNLVRLAFYPFIVMLLNYAFALPGYVYINAVTGTGKTKL AFIFQLITILVYLIYLYLLSECFHASLTVYMTAEYLFVILLGMQSIIYLKRKSN >gi|222159235|gb|ACAB01000124.1| GENE 2 1644 - 2321 455 225 aa, chain + ## HITS:1 COG:no KEGG:BT_1899 NR:ns ## KEGG: BT_1899 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 225 11 235 235 423 88.0 1e-117 MCAEMYHRADFIQSDPVQFPHRYTLKQDIEVSGLLTAIMSFGNRKQILKKVDELHGLMGD SPYQYVLSRQWERDFPSGATSSFYRMLSHADFYGYFQRLYIAYTQFESLEEALKTYPGIP MEKLCAFLEVSAKSPQKKLNMFLRWMIRRDSAVDLGIWESFDRRDLIVPLDTHVCRVAHY FKLTDTETFSLKNARQITAALAEVFPDDPCLGDFALFGLGVNGEI >gi|222159235|gb|ACAB01000124.1| GENE 3 2560 - 2775 93 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262409170|ref|ZP_06085714.1| ## NR: gi|262409170|ref|ZP_06085714.1| predicted protein [Bacteroides sp. 
2_1_22] # 1 71 1 71 71 112 100.0 8e-24 MDKKLNNISINYIVGENRNQVRMTTLEQLVAPNSRVRVIDLFVDILSINELELNMSNFNQ KGHYRIILQHY >gi|222159235|gb|ACAB01000124.1| GENE 4 2979 - 3431 233 150 aa, chain + ## HITS:1 COG:NMA0429 KEGG:ns NR:ns ## COG: NMA0429 COG3727 # Protein_GI_number: 15793434 # Func_class: L Replication, recombination and repair # Function: DNA G:T-mismatch repair endonuclease # Organism: Neisseria meningitidis Z2491 # 1 123 1 122 140 135 54.0 2e-32 MVDVLTKEQRRKCMSNIRSKNTRPEILVRKYLFSQGIRFRVNVKRLPGTPDIVLRKYRTV IFINGCFWHGHEGCKYYVLPKTNTEFWKEKIGKNKERDLKERMQLRNMGWHVVLLWECQL KSKVREQNLKGLLYTLNKIVLENYNAKHID >gi|222159235|gb|ACAB01000124.1| GENE 5 3590 - 4927 702 445 aa, chain + ## HITS:1 COG:NMA1500 KEGG:ns NR:ns ## COG: NMA1500 COG0270 # Protein_GI_number: 15794400 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Neisseria meningitidis Z2491 # 112 430 19 336 337 460 67.0 1e-129 MTAKQFIIQDIKLVDFPEEVDVQNGIKIYGETSWENQLAVLSHYLHNNDCIGNEQNKKEI LYTIREFLERKKSNKEETITEEYVMKVAENPVEYSLFSDFFRVPFPSPQSPQFTFIDLFA GMGGFRLAMQAQGGKCVFSSEWNKYAQKTYLANFGEMPFGDITKEVTKSYIPQYFDILCA GFPCQPFSIAGVSKKKSLGRETGFKDKTQGTLFFDVADIISRHRPKAFYLENVKNLMSHD KGNTFKIIKGTLEELNYSIHFQVMDGQAYVPQHRERIMIVGFNREIFHGEEQFCFPEQKQ STRGIREILESDIDDKYTLSDKLWSYLQNYAEKHRAKGNGFGFGMVDLNGISRTLSARYY KDGSEILVPQGNGKNPRRLSPRECARLMGYPDKYRLDQVSDVQAYRQCGNSVVVPLITAV SEQIIKTINNYEQHTGLCDTKCTAI >gi|222159235|gb|ACAB01000124.1| GENE 6 4881 - 6074 518 397 aa, chain + ## HITS:1 COG:no KEGG:BF3682 NR:ns ## KEGG: BF3682 # Name: not_defined # Def: type II restriction endonuclease # Organism: B.fragilis # Pathway: not_defined # 1 392 1 392 396 489 60.0 1e-137 MSNILDSAIQSVQQSKAAWCRFITGNDTGATGSHQAGFYIPKCASSLLFDEPGKKGENKE KTVQIKWQDDFTTESCMKYYGQGTRNEYRITRFGRGFPFFQDDNVGDLLIIAKYTEEDYL GYVLSADEDIDGFFAYFNLAPDETNQLIDVAGVIKPDEKIVQLFRDFVSRFDKFPDTRQM ALGARECYNKAYKIGENVFKTNPDDVLLNWVDTEFRLFKCMEEKFYADMITHSFDSIDTF IQTANEVLNRRKSRAGKSLEHHLSDIFVHNALVFEEQAVTEDNKKPDFLFPNGKCYRNIQ 
FPAGDLIVLGAKTTCKDRWRQVLTEADRVDVKFLFTLQQGISKNQLKEMHDSRLTLVVPQ KYISSFPRDYQDEIKDLLGFILMVKEKQEHLPKHYFL >gi|222159235|gb|ACAB01000124.1| GENE 7 6146 - 8608 1438 820 aa, chain - ## HITS:1 COG:no KEGG:A1S_1281 NR:ns ## KEGG: A1S_1281 # Name: not_defined # Def: TPR domain-containing protein # Organism: A.baumannii # Pathway: not_defined # 47 374 14 370 392 272 49.0 5e-71 MGAGCSITSGIKSGTQLINDWKKEIIEYADDYDTSITSDEYFEKQNWFDERNPYSSLFEK RYDLQRQRRAFVENEVANKNPSIGYAYLVKLIENNYFNAIFTTNFDDLLNEAFYRFSNVR PVVCAHDSAITSITVTSKRPKIIKLHGDYLFEDIKSTLRETESLEGNMKNKFIEFSKDYG LIVVGYAGNDRSIMDILSFLLKKEEFFKNGIYWCIRKGDENISDDLKKLLWKDKVFFIQI EGFDELMAELNKRLNKGVLPIDNAILSSEHQIKLIKSLTSNEYVEKSNSKIIQEDSKRLK KMVNHNIFEDFFKHVDVKKEKKDNTTRKDSSSKISEEEKKTLNKVRILFLEDETEQAYKL INEQKLEDSDNSSYKAELLEMKLALLHNMNLENNIEYKQTIDSLIAIKPNTEMYYIEAVN SFPEIKDKIEYIDKAIAIYPNDYSLYRKKGDFMADYYNSLINKEESDIAPNVITTCYLTS IKLDGSIFNKAWVNLCDWYSHLYKKENEKKKEEILSLLKKYKEQNEYHPNYLKVCHRYSN MINEKIDKLLINGYSFAKQTDDPTWVESTVLILLDYFEEKDQKNELTKFMEEFEKEYNPS YDYLLKKATIFAKKFDNIAESIKILEDIENGNYFRNNMLIKLYAIIQEKEKAQIIVDKYN KIDHEILILYNLEFEEYETVCNLIEKYWETHKKDVSTFVSYSYAKLKLKQYKEIYWDLKQ YYDNPQTCIPEIAINYFIADIKLNKLDDKKIKNKILNNKDLYDNDVIAAAYSLIGDIPNA LDYLSKAIKDDNLLKYSVRDWPAFENCKNDNRYQRIIGIA >gi|222159235|gb|ACAB01000124.1| GENE 8 9119 - 9346 108 75 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHRLYIHHPIFELIFQQKRIKLGFHFHLNFLFNNKDTVLTEVIVDFRVLALSWARKNKSD ISSLFLGDALRMRNG >gi|222159235|gb|ACAB01000124.1| GENE 9 9366 - 9524 67 52 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGVRWERWIGMGSLDGGLLCDRSSKLCPYRTYMYLARYGLMANVTVTMGFEF >gi|222159235|gb|ACAB01000124.1| GENE 10 10092 - 11291 863 399 aa, chain + ## HITS:1 COG:no KEGG:BT_2890 NR:ns ## KEGG: BT_2890 # Name: not_defined # Def: transposase # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 397 1 396 397 388 52.0 1e-106 MLNKSRILNNGSYPLVFQVIHNRRKKLLYTGYRMKEEVFDELEGKIMNGVSSTFTATEVV KMNCELRKMRNQIDTRIRQLERTREEFTVEDILAQNAFGTGKPQFYLLRYINTQIERKQE LKKVGMAAAYKSTRSSLAKFISRPDVRMSEVDLAFVRRYEDFLYSNGASGNTVSYYLRNL 
RSLYNQAVTDGYHPRGEYPFAKAQTRPAKTVKRALSRTDMQNLADLNLENEPELEFTRDL YLFSFYAQGMAFVDIVLLKKTDICNGVLTYSRHKSKQLIRIVVTPQMQGLIDKYKTENEY LFPIISGEYASGYKQYRLALGRINRHLKKIAVVADIEVPLTTYTARHTWATLARDYGAPI SVISAGLGHTSEEMTRVYLKDFDVFMLNQVNSIVTNLSK >gi|222159235|gb|ACAB01000124.1| GENE 11 11507 - 12391 349 294 aa, chain - ## HITS:1 COG:no KEGG:BT_2889 NR:ns ## KEGG: BT_2889 # Name: not_defined # Def: AraC family transcription regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 289 1 289 294 352 57.0 6e-96 METKHTKSELLYIEEHQSCQNYMTTIETGFKYLEFAKNAAFEEDNTTKNYLLFFLKGDFT ISCNQFHNKLFHAGEMILIPRSSRLKGFAGTDSSLLSMFFDIPAENCDKLILQSLSNVCN NIEYDFSPIKIRYPLTLFLEVLTHCIKNGMSCAHLHGLMQREFFFLLRGFYEKQEIAVLF HPIIGKEVDFKDFVMHNHTKVDNIEQLISLSNMGRSCFFTKFNEVFGMTAKQWMLKQKNQ RILEKMTEPGVCIKDIIEELGFDSQGNFNRYCKQHFGCTPKQLIEQCQATNQTS Prediction of potential genes in microbial genomes Time: Wed May 18 03:59:34 2011 Seq name: gi|222159234|gb|ACAB01000125.1| Bacteroides sp. D1 cont1.125, whole genome shotgun sequence Length of sequence - 29360 bp Number of predicted genes - 27, with homology - 25 Number of transcription units - 15, operones - 7 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 275 279 ## BVU_3207 hypothetical protein + Prom 311 - 370 3.5 2 2 Op 1 . + CDS 391 - 1764 1604 ## COG0006 Xaa-Pro aminopeptidase 3 2 Op 2 . + CDS 1781 - 3091 1126 ## COG3458 Acetyl esterase (deacetylase) + Term 3114 - 3150 -0.5 + Prom 3144 - 3203 1.8 4 3 Tu 1 . + CDS 3268 - 3441 68 ## 5 4 Tu 1 . - CDS 3718 - 5391 683 ## PROTEIN SUPPORTED gi|39938628|ref|NP_950394.1| ribosomal protein L13 - Prom 5460 - 5519 6.5 + Prom 5330 - 5389 5.2 6 5 Tu 1 . + CDS 5474 - 7171 1459 ## COG1283 Na+/phosphate symporter + Term 7190 - 7249 7.0 - Term 7250 - 7302 10.4 7 6 Tu 1 . - CDS 7330 - 7494 209 ## COG1773 Rubredoxin - Prom 7616 - 7675 6.2 + Prom 7539 - 7598 2.4 8 7 Op 1 1/0.000 + CDS 7631 - 8893 1032 ## COG0642 Signal transduction histidine kinase + Prom 8895 - 8954 9.8 9 7 Op 2 . 
+ CDS 8984 - 11689 2328 ## COG0474 Cation transport ATPase 10 7 Op 3 . + CDS 11679 - 12302 443 ## COG1011 Predicted hydrolase (HAD superfamily) 11 7 Op 4 . + CDS 12313 - 13293 416 ## PROTEIN SUPPORTED gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 12 7 Op 5 . + CDS 13337 - 13426 72 ## 13 7 Op 6 . + CDS 13433 - 14212 573 ## COG1266 Predicted metal-dependent membrane protease + Term 14218 - 14248 1.2 14 8 Op 1 . - CDS 14228 - 14653 379 ## COG2166 SufE protein probably involved in Fe-S center assembly 15 8 Op 2 . - CDS 14676 - 15671 1084 ## COG2234 Predicted aminopeptidases 16 8 Op 3 . - CDS 15742 - 16650 726 ## COG1619 Uncharacterized proteins, homologs of microcin C7 resistance protein MccF - Prom 16878 - 16937 6.4 + Prom 16768 - 16827 9.3 17 9 Op 1 3/0.000 + CDS 16869 - 17828 730 ## COG0280 Phosphotransacetylase 18 9 Op 2 . + CDS 17854 - 18915 1058 ## COG3426 Butyrate kinase 19 10 Op 1 . - CDS 19054 - 20811 1414 ## BT_2553 hypothetical protein 20 10 Op 2 . - CDS 20841 - 21314 514 ## BT_2554 hypothetical protein - Prom 21483 - 21542 7.7 + Prom 21286 - 21345 7.9 21 11 Tu 1 . + CDS 21534 - 21797 223 ## gi|237713949|ref|ZP_04544430.1| conserved hypothetical protein + Term 21877 - 21931 11.4 - Term 22222 - 22269 13.2 22 12 Tu 1 . - CDS 22295 - 24421 182 ## PROTEIN SUPPORTED gi|170017041|ref|YP_001727960.1| 40S ribosomal protein S1 - Prom 24577 - 24636 4.2 + Prom 24390 - 24449 10.1 23 13 Tu 1 . + CDS 24618 - 25769 1274 ## BT_2564 hypothetical protein + Term 25867 - 25922 12.3 + Prom 25801 - 25860 5.5 24 14 Op 1 . + CDS 25951 - 26415 611 ## COG0782 Transcription elongation factor + Term 26422 - 26460 4.2 25 14 Op 2 . + CDS 26467 - 26859 430 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases + Term 26882 - 26916 4.0 - Term 26941 - 26993 2.5 26 15 Op 1 8/0.000 - CDS 26996 - 27895 697 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 27 15 Op 2 . 
- CDS 27899 - 29215 739 ## COG0477 Permeases of the major facilitator superfamily - Prom 29244 - 29303 1.6 Predicted protein(s) >gi|222159234|gb|ACAB01000125.1| GENE 1 3 - 275 279 90 aa, chain + ## HITS:1 COG:no KEGG:BVU_3207 NR:ns ## KEGG: BVU_3207 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 90 152 242 242 154 89.0 8e-37 IIYADEADMLNVAMFGMTAKMWHEQNPELKGNIRDYASINELICLSNMENLNAVFINQSI PQGERLVKLNQIAIQQMKILEGDGKRKLLK >gi|222159234|gb|ACAB01000125.1| GENE 2 391 - 1764 1604 457 aa, chain + ## HITS:1 COG:FN1949 KEGG:ns NR:ns ## COG: FN1949 COG0006 # Protein_GI_number: 19705251 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Fusobacterium nucleatum # 1 454 1 454 462 412 47.0 1e-115 MFAKETYMQRRALLKKNLGSGVLLFLGNDECGLNYEDNTFRYRQDSTFLYYFGLSCAGLS AIIDIDEDKEIIFGDELSIDAIVWMGSQPTLHEKCERVGVKNLMPSAEITSYLHKYVQKG KAVHYLPPYRPEHKLKLMDWLGIPPTRQEASVPFIRAVIAQRNYKSAEEIVEIEKACDVT ADMHITAMKVLRPGMYEYEVVAEMNRVAESNNCQLSFATIATINGQTLHNHYHGNKVKPG DLFLIDAGAEVESGYAGDMSSTIPADKKFTARQREVYEIQNAMHLESVKALRPGIPYMEV YELSARVMVDGMKALGLMKGNTEDAVREGAHALFYPHGLGHMMGLDVHDMENLGEIWVGY NGQPKSTQFGRKSQRLAIPLEPGFVHTVEPGIYFIPELIDMWKGEKKFTDFINYDVVETY KDFGGIRNEEDYLITETGARRLGKKIPLTPEEVEALR >gi|222159234|gb|ACAB01000125.1| GENE 3 1781 - 3091 1126 436 aa, chain + ## HITS:1 COG:TM0077 KEGG:ns NR:ns ## COG: TM0077 COG3458 # Protein_GI_number: 15642852 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acetyl esterase (deacetylase) # Organism: Thermotoga maritima # 128 417 12 303 325 113 29.0 6e-25 MKRLLLSVWLLSLTLLSAVAENYPYKSDVLWVTVPNHADWLYKTGEKATVEVQFYKYGIP RDNVTVTYEIGGDMMPVADTKGSVTLKNGRGVIPVGTMKEPGFRDCRLKATVDGKTYSHH IKVGFSPEKLRPYTTMPSDFKEFWEKAKAEQKEFPLTYTKEHVEKYSTDKIDCYLVKLQL NKRGQCVYGYLFYPKKEGKFPVVLCPPGAGIKTIKEPLRHKYYAEQGCIRFEFEIHGLNP EMTDEEFKEISNAFNGRENGYLTNGLDSRDNYYMKRVYLACVRGIDFLTSLPEWDGKNVI AQGGSQGGALALITAGLDERVTACVANHPALSDMAGYKAGRAGGYPHFFRNSVDMDTPEK 
IKTMAYYDVVNFAKLIKADTYITWGFNDDVCPPTTSYIVYNVLNCPKEALITPINEHWTS NDTEYGHLLWIKKHLK >gi|222159234|gb|ACAB01000125.1| GENE 4 3268 - 3441 68 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNGSIIHHHENSNDKINAGIRVREKHSQYCFACIRNSISSQTSRYLYRVKKNQPNIK >gi|222159234|gb|ACAB01000125.1| GENE 5 3718 - 5391 683 557 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|39938628|ref|NP_950394.1| ribosomal protein L13 [Onion yellows phytoplasma OY-M] # 39 557 31 546 546 267 30 5e-71 MKQMLQICCKNNNISKEFPIGSSLLDIYYGFNLNFPYQVVSAKVNNRSEGLNFRVYNNKD VEFLDVRDQSGMRTYVRSLCFVLFKAVTELFPDGKLFVEHPVSKGYFCNLRIGRPIELED VTRIKQRMQEIIAENISYHRIECHTAEAVRVFSERGMNDKVRLLETSGSLYTYYYTLGDT IDYYYGNLLPSTGYLKLFDIVKYYDGLLLRIPSRENPNVLEDVVKQEKMLDVFKEYLNWS YIMGLNNAGDFNLACEEGHATDLINVAEALQEKKIAQIADTIFHRGENGNRVKLVLIAGP SSSGKTTFSKRLSIQLMTNGLKPFPISLDNYFVDREETPLDENGNYDYESLYALDLELFN QQLQALLRGEEVELPRFNFSLGKKEYKGDKLKIEDNTILILEGIHALNPELTPHIPAERK FKIYVSALTTISLDDHNWIPTTDNRLLRRIIRDFNYRGYSARETISRWPSVRAGEDKWIF PYQENADVMFNSALLFEFAVLRLHAEPILMGVPRNCPEYCEAYRLLKFIKYFVPVQDKEI PPTSLLREFLGGSSFKY >gi|222159234|gb|ACAB01000125.1| GENE 6 5474 - 7171 1459 565 aa, chain + ## HITS:1 COG:TP0771 KEGG:ns NR:ns ## COG: TP0771 COG1283 # Protein_GI_number: 15639758 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Treponema pallidum # 8 552 47 585 593 266 33.0 1e-70 MEYSFYDFLKLIGSLGLFLYGMKIMSEGLQKVAGDRLRSILTAMTTNRVTGVLTGVLITA LIQSSSATTVMVVSFVNAGLLTLAESISVIMGANIGTTVTAWIISIFGFKVDMAAFALPL LAIALPLIFSGKSNRKSIGEFIFGFSFLFMGLSYLKANAPDLNANPEMLAFVQNYTDMGF FSIILFLLIGTILTMIVQASAATMAITLIMCANGWISLELGAALVLGENIGTTITANLAA LTANTQAKRAALAHFVFNVFGVIWVLIVFHPFMELVNWVVDTFFQSNNPEVAISYKLSAF HSIFNICNVCILIWAVKLIERTVCALIHPKEEDEEPRLRFITGGMLSTAELSILQARKEI HLFAERTHRMFGMVQDLLHTEKDDVFNKLFSRIEKYENISDNMELEIANYLNQVSEGRLS SESKLQIRAMLREVTEIESIGDSCYNLARTINRKRQTNQDFTEKQYEHIHFMMKLTNDAL AQMIVVVEKPEHQSIDINKSFNIENEINNYRNQLKNQNILDVNNKEYDYQMGVYYMDIIA ECEKLGDYVVNVVEASSDVKEKKAS >gi|222159234|gb|ACAB01000125.1| GENE 7 7330 - 7494 209 54 aa, chain - ## HITS:1 
COG:CAC2778 KEGG:ns NR:ns ## COG: CAC2778 COG1773 # Protein_GI_number: 15896033 # Func_class: C Energy production and conversion # Function: Rubredoxin # Organism: Clostridium acetobutylicum # 1 53 1 53 54 78 73.0 4e-15 MKKYICTVCEYIYDPEQGDPESGIEPGTAFEDIPDDWTCPLCGVGKEDFEPYEG >gi|222159234|gb|ACAB01000125.1| GENE 8 7631 - 8893 1032 420 aa, chain + ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 165 418 55 315 328 178 38.0 2e-44 MLIALLSTLIVLCAVLLISFLWERRKCLRMQQRLFRESRKLERTSHIAGAVLKNVHAFIL LIDNDFKVLKTNYYQKTGTRKGTEEKRVGDLLQCRNALAAEGGCGTHSFCGSCPIRTAIR QAFEQRRNFTDLEATLSVVTSDNTTVECDAVISGSYFLLNEEENMVITVHDVTRRKQAEN ELRLAKEKAEKADISKSAFLANMSHEIRTPLNAITGFAEVLGSANTEEEKAQYQEIIKMN ADLLMQLVNDILDMSKIEAGTLEFVQTTVDVNLLLSDLQQLFQMRVNEAGENIQIIAEPS RSSCMIQTDRNRVAQVLSNFAGNAIKFTHEGSIRIGYEARDTELYFYVKDTGAGIPAEKL PDVFERFVKLNKDKKGAGLGLSISQTIVAKLGGQIGADSVEGEGSTFWFTIPYRTCGKPR >gi|222159234|gb|ACAB01000125.1| GENE 9 8984 - 11689 2328 901 aa, chain + ## HITS:1 COG:SPAPB2B4.04c KEGG:ns NR:ns ## COG: SPAPB2B4.04c COG0474 # Protein_GI_number: 19114802 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Schizosaccharomyces pombe # 26 889 213 1139 1292 452 34.0 1e-126 MSTNKNDYYHLGLTDDEVLQSREKNGINLLTPPKRPSLWKLYLEKFEDPVVRVLLVAAVF SLIISIIENEYAETIGIIAAILLATGIGFFFEYDASKKFDLLNAVNEETLVKVIRNGRVQ EIPRKDIVVGDIVILETGEEIPADGELLEAISLQVNESNLTGEPVINKTTVEADFDEEAT YASNLVMRGTTVVDGHGTMRVLHVGDATEIGKVARQSTEDNLEPTPLNIQLTKLANLIGK IGFTVAGLAFLIFFVKDVLLFFDFGSLNGWHEWLPVFERTLKYFMMAVTLIVVAVPEGLP MSVTLSLALNMRRMLSTNNLVRKMHACETMGAITVICTDKTGTLTQNLMQVHEPNFYGIK NGSVLSDDDISTLIAEGISANSTAFLEESTTGEKPKGVGNPTEVALLLWLNKQGKNYLEL REKAHILDQLTFSTERKFMATLVESPLIGKKILYIKGAPEIVLGKCKEVVLDGRRVDAVE YRSTVEAQLLGYQNMAMRTLGFAFKIVGENEPNDCTELVSANDLNFLGVVAISDPIRPDV PAAVAKCQSAGIGIKIVTGDTPGTATEIARQIGLWNPETDTERNRITGVAFAELSDEEAL 
DRVMDLKIMSRARPTDKQRLVQLLQQKGAVVAVTGDGTNDAPALNHAQVGLSMGTGTSVA KEASDITLLDDSFNSIGTAVMWGRSLYKNIQRFIVFQLTINFVALLIVLLGSVIGTELPL TVTQMLWVNLIMDTFAALALASIPPSETVMLEKPRRSTDFIISKAMQSNILGVGSIFLIV LLGMIYYFDHSTEGMDIHNLTIFFTFFVMLQFWNLFNARVFGTTDSAFKGLSKSYGMELI VLAILAGQFLIVQFGGAVFRTEPLNWQTWLLIIGVSSTVLWVGELIRLVQRIIHKKDRNE K >gi|222159234|gb|ACAB01000125.1| GENE 10 11679 - 12302 443 207 aa, chain + ## HITS:1 COG:L111950 KEGG:ns NR:ns ## COG: L111950 COG1011 # Protein_GI_number: 15672092 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Lactococcus lactis # 6 195 3 192 207 113 33.0 3e-25 MKSKGIKNLLIDLGGVLINLDRQRCIENFQKLGLRNVEDLLDIHNPNGLLMQQEKGLITP AEFRDGVRQMIGKAVSDIQIDAAWNSFLVDIPTHKLDLLLKLREKYVVYLLSNTNQIHWD WTCIHLFPYRTFKVEDYFEKTYLSFEMKMAKPEPEIFKAIIEDAGIEPQETLFIDDSEMN CKAAQNLGISTYTAKEGEDWSHLFNSK >gi|222159234|gb|ACAB01000125.1| GENE 11 12313 - 13293 416 326 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 [Bacillus selenitireducens MLS10] # 1 310 1 309 317 164 34 5e-40 MQIIRDTSAINPKPSVATIGFFDGVHAGHRYLIQQVKEIAAAKGLRSALVTFPVHPRKVM NTNYRPELLTTPEEKIRLLANIGVDYCLMLDFTPEISRLTAREFMTQLLKERYQVKYLVI GYDHRFGHNRSEGFEDYVRYGKEIGIEVIRAKAYTSNIEIENVPNVPVSSSLIRKLLHQG EVDLAADCLKYEYFLDGIVVGGYQVGRKIGFPTANLSVDDPDKLIPADGVYAVWVTFDKK TYMGMLNIGVRPTIDNGPNRTIEVNILHFHSDIYDKFIRLTFVKRTRPELKFSSIDELIV QLHKDAEETEAILLASKANSNSRETK >gi|222159234|gb|ACAB01000125.1| GENE 12 13337 - 13426 72 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQKSFMSISLVVIAISMLAMFIYSFFTTN >gi|222159234|gb|ACAB01000125.1| GENE 13 13433 - 14212 573 259 aa, chain + ## HITS:1 COG:RSc3402 KEGG:ns NR:ns ## COG: RSc3402 COG1266 # Protein_GI_number: 17548119 # Func_class: R General function prediction only # Function: Predicted metal-dependent membrane protease # Organism: Ralstonia solanacearum # 119 221 141 243 285 68 42.0 1e-11 MKTAIKLILIDLLIAQIVAPILIMIPCTIYLLVTTGDLDKVALTQMIMIPAQLAGQIMMG IYLWKAGYISKKKATWSPVSVPFLICSGLAILTAGFLVSALMGLLDWIPNIMEQSFDILQ 
SGWGGILAIAIIGPVLEEILFRGAITRALLQQYNPTKAILISALLFGVFHINPAQILPAF LIGILLAWTYYKTGSLIPCILMHVLNNSLSVYLSIKYPEAENMDDLINGSPYLVTLAGSI LLLICTILTMNNLTSRKQP >gi|222159234|gb|ACAB01000125.1| GENE 14 14228 - 14653 379 141 aa, chain - ## HITS:1 COG:XF0994 KEGG:ns NR:ns ## COG: XF0994 COG2166 # Protein_GI_number: 15837596 # Func_class: R General function prediction only # Function: SufE protein probably involved in Fe-S center assembly # Organism: Xylella fastidiosa 9a5c # 14 137 23 146 146 109 44.0 1e-24 MSINELQDEVIAEFSDFDDWMDRYQLLIDLGNEQEPLEEKYKTEQNLIEGCQSRVWLQAD DVDGKIVFKAESDALIVKGIIALLIKVLSGHTPDEILNTDLYFIDKIGLKEHLSPTRSNG LLSMVKQIRMYALAFKAKEGK >gi|222159234|gb|ACAB01000125.1| GENE 15 14676 - 15671 1084 331 aa, chain - ## HITS:1 COG:MK0503 KEGG:ns NR:ns ## COG: MK0503 COG2234 # Protein_GI_number: 20093941 # Func_class: R General function prediction only # Function: Predicted aminopeptidases # Organism: Methanopyrus kandleri AV19 # 1 323 1 283 337 61 24.0 2e-09 MKKKTMIVLMSAFLLLSAFSCGGGNKANSTSEQSEETAVNVPQFDADSAYLYVKNQVDFG PRVPNTKEHVACGNYLAGQLEAFGAQVTNQYADLIAYDGTLLKARNIIGSYKPESKKRIA LFAHWDTRPWADNDPDEKNHNTPILGANDGASGVGALLEIARLVNQQQPELGIDIILLDA EDYGAPQFYTGQHKEEFWCLGSQYWARNPHVQGYNARFGILLDMVGGEGSVFMKEGYSEE FAPDINKKVWKAAKKIGNGKTFMDGNGGFVTDDHLFINRLARIKTIDIIPYNQEGDFTPT WHTVNDNMEHIDKNTLKAVGQTVLEVIYNEK >gi|222159234|gb|ACAB01000125.1| GENE 16 15742 - 16650 726 302 aa, chain - ## HITS:1 COG:slr1534 KEGG:ns NR:ns ## COG: slr1534 COG1619 # Protein_GI_number: 16330906 # Func_class: V Defense mechanisms # Function: Uncharacterized proteins, homologs of microcin C7 resistance protein MccF # Organism: Synechocystis # 6 289 11 291 301 149 35.0 7e-36 MDIQFPPFLQKGDKVVIVSPSSKIDQQFLKGAKKRIESWGLKVAMGKYAGSSSGRYAGTI RQRLKDLQDAMDDPEVKAILCSRGGYGVVHLIDKIDFTAFHEHPKWLLGFSDITALHNLF QKNGYASLHSLMARHLTVEPEDDLCTNYLKDILFGNIPSYTCEKHKLNKQGTAQGVLHGG NMAVAYGLRGTPYDIPAEGTILFIEDVSERPHAIERMMYNLKLGGVLEKLSGLIIGQFTE YEEDCSLGKELYAALADLVKEYDYPVCFNFPVGHVTHNLPLINGAKVELVVGKKNVELKF IC >gi|222159234|gb|ACAB01000125.1| GENE 17 16869 - 
17828 730 319 aa, chain + ## HITS:1 COG:CAC3076 KEGG:ns NR:ns ## COG: CAC3076 COG0280 # Protein_GI_number: 15896327 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Clostridium acetobutylicum # 16 304 13 300 301 172 34.0 1e-42 MEPILNFAQLTAHLKKLNHRKRIAVVCANDPNTEYAISRALEEGIAEFLMIGDSTILKKY PTLKQYPEYIKTIHIENPDEAAREAVRIVREGGADILMKGIINTDNLLHAILDKEKGLLP KGKILTHLAVMEIPTYHKLLFFSDAAVIPRPTLQQRIEMIWYAICTCRHFGIEQPRVALI HCTEKVSAKFPHSLDYVNIVELAEAGEFGNVIIDGPLDVRTACEQASGDIKGIVSPINGQ ADVLIFPNIESGNAFYKSVSLFAKAEMAGLLQGPICPVVLPSRSDSGLSKYYSIAMACLQ VSGDCECRKQASQVTNSSF >gi|222159234|gb|ACAB01000125.1| GENE 18 17854 - 18915 1058 353 aa, chain + ## HITS:1 COG:CAC1660 KEGG:ns NR:ns ## COG: CAC1660 COG3426 # Protein_GI_number: 15894937 # Func_class: C Energy production and conversion # Function: Butyrate kinase # Organism: Clostridium acetobutylicum # 2 353 4 356 356 357 50.0 2e-98 MKILVINPGSTSTKIAVYENETPLFISNIKHSVEELSAFPEVIDQFEFRKNLVLQELENN KIPFSFDAIIGRGGLVKPIPGGVYEVNDAMKRDTVHAMRTHACNLGGLIASELAATLPDC PAFIADPGVVDELEDIARITGSPLMPKITIWHALNQKAIARRFAKEQGTQYEELDLIICH LGGGISVAVHHHGRAIDANNALDGEGPFSPERAGTLPAGQLIDLCFSGQHTKDELKKRIS GRAGLTAHLGTTDVPAIIQSIEEGDKKAELILDAMIYNVAKAIGASATVLCGKVDAILLT GGIAYSDYVVSRLKKRISFLAPIYVYPGENEMESLAFNAIGALKGELPIQVYK >gi|222159234|gb|ACAB01000125.1| GENE 19 19054 - 20811 1414 585 aa, chain - ## HITS:1 COG:no KEGG:BT_2553 NR:ns ## KEGG: BT_2553 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 582 1 582 583 1014 87.0 0 MMKQIPYGITDFSRIQKENYYYVDKTMFIEKIEMQPPYLFLIRPRRFGKSLTLAMLEAYY DVNHAEQFDELFGQLYIGQHPTKLHNQFLIMRFNFSEVSSNVNEVELSFKLHCCSKLKDF VFKYEDLLGKEIWDVLDEKIQEEPGAFLSAVNSYATRKKNIRIYLLIDEYDNFTNTILST YGTEFYRKATHGEGFIRGFFNVIKSATTGTGAALERLFITGVSPVTMDDVTSGFNIGTNI TNDSWFNDLVGFSENELREMLEYYKEQGVLQESIDEIVAMMKPNYDNYCFSEDTLEQCMF NSDMALYFLKSFVLHHKKPKEIVDPNIRTDFNKLAYLIKLDHGLGENFSVIKEIAEQGEI ITDIVTHFSALEMTDPSNFKSLLFYFGLLSIKGVDMVGRPILHVPNLVVREQLFNFLIQG 
YIKHDIFKIDMNKMSALFENMAFRGDWKPLFDFIANAVREQSRIREYIEGEAHIKGFLLA YLGMYRYYQLYPEYELNKGFADFFFKPSLSVPVLPPFTYLLEVKYAKAGASEKEIRALAD EAREQLLRYSEDELVAEAKAKGELKLITIVWRSWELVLLEEVTLS >gi|222159234|gb|ACAB01000125.1| GENE 20 20841 - 21314 514 157 aa, chain - ## HITS:1 COG:no KEGG:BT_2554 NR:ns ## KEGG: BT_2554 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 157 1 147 147 186 76.0 2e-46 METEELTIGRVHHGRNIRRTRIEKNMNQEGLSELVHLSQPAVSKYEKMKVIDDEMLQRFA RALNVPFDYLKTLEEDAQTVVFENNTVNNSEQSAGGANISMGIVKSDTEDSINDSRVNNF NPIDKITELYERLLKEKDEKYAALERRLQNIEKSLQK >gi|222159234|gb|ACAB01000125.1| GENE 21 21534 - 21797 223 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713949|ref|ZP_04544430.1| ## NR: gi|237713949|ref|ZP_04544430.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 87 1 87 87 140 100.0 2e-32 MKRILTSKEYRFIKGLKELMTKYNAVISTDAQGKIEIVVNEDDESFDFETESTIYLGECF SDNELNELLEKNLAHIIRIKEEYKTNL >gi|222159234|gb|ACAB01000125.1| GENE 22 22295 - 24421 182 708 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|170017041|ref|YP_001727960.1| 40S ribosomal protein S1 [Leuconostoc citreum KM20] # 590 708 251 361 396 74 32 6e-13 MINPIVKTIELPDGRTITLETGKLAKQADGSVMLRMGNTMLLATVCAAKDAVPGTDFMPL QVEYKEKFAAFGRFPGGFTKREGRASDYEILTCRLVDRALRPLFPDNYHAEVYVNIILFS ADGVDMPDALAGLAASAALAVSDIPFNGPISEVRVARIDGQFVINPTFEQLEKADMDLMV AATYENIMMVEGEMHEVSEAELLEAMKVAHEAIKIHCKAQMELTEEVGKTVKREYNHEVN DEDLRKAVREACYDKAYAVAASGNNNKHERFAAFEVIREEFKAQFSEEELDEKAALIDRY YHDVEKEAMRRSILDEGKRLDGRKTTEIRPIWCEVGPLPGPHGSAIFTRGETQSLTSVTL GTKLDEKIIDDVLEHGKERFLLHYNFPPFSTGEAKAQRGVGRREIGHGHLAWRALKGQIP ADYPYVVRVVSDILESNGSSSMATVCAGTLALMDAGVKIKKPVSGIAMGLIKNPGEEKYA VLSDILGDEDHLGDMDFKVTGTKDGITATQMDIKVDGLSYDILERALNQAKEGRMHILNK ITETIAEPRADLKEHAPRIETMTIPKEFIGAVIGPGGKIIQGMQEETGAVITIEEIDGMG RIEVSGTNKKCIDDAMRMIKAIVAVPEVGEVYKGKVRSIMPYGAFIEFLPGKDGLLHISE IDWKRLETVEEAGIKEGDEIEVKLIDIDPKTGKFKLSRKVLLPRPEKK >gi|222159234|gb|ACAB01000125.1| GENE 23 24618 - 25769 1274 383 aa, chain + ## HITS:1 COG:no KEGG:BT_2564 NR:ns ## KEGG: 
BT_2564 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 383 1 383 383 689 89.0 0 MKKVTFAALAALTITACSSGPKFQVNGDVSGADGKMLYLEASGLEGIVPLDSVKLKGEGT FSFKQPRPESPEFYRLRIDDKIINFSVDSIETIQIKAPYVDFSTTYTVEGSENSNKIKEL TLKQIRLQKEVDDLLAALRSNRMGHDVFEDSLATLLNNYKEDVKVNYIFAAPNTAAAYFA LFQKLNNYLIFDPLNNKDDVKCFAAVATSLNNAFPHAVRSKNLYNIVIKGMKNTRQPQAK ALEIPQEKIVETGIIDIALRDVKGNVRKLTDLKGKVVLLDFSVFQSPAGSPHNLMLRELY NEYAKQGLEIYQVSLDADEHYWKTAADNLPWVCVRDGNGVYSTNVAVYNVRQVPSIFLIN RNNELKLRGEDIKDLEAAVKSLL >gi|222159234|gb|ACAB01000125.1| GENE 24 25951 - 26415 611 154 aa, chain + ## HITS:1 COG:RC1332 KEGG:ns NR:ns ## COG: RC1332 COG0782 # Protein_GI_number: 15893255 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Rickettsia conorii # 4 152 51 199 206 116 46.0 2e-26 MAYMSEEGYKKLMAELKELETVERPKISAAIAEARDKGDLSENAEYDAAKEAQGMLEMRI NKLKATIADAKIIDESKLKTDSVQILNKVELKNVKNGMKMTYTIVSESEANLKEGKISVN TPIAQGLLGKKVGDVAEITVPQGKIALEVVNISI >gi|222159234|gb|ACAB01000125.1| GENE 25 26467 - 26859 430 130 aa, chain + ## HITS:1 COG:MA0811 KEGG:ns NR:ns ## COG: MA0811 COG0537 # Protein_GI_number: 20089695 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Methanosarcina acetivorans str.C2A # 4 101 17 119 150 88 43.0 3e-18 MATIFSRIIAGEIPCYKVAENEKFFAFLDINPLVKGHTLVVPKQEVDYIFDLSDEDLAAM HVFAKKVVRAIEKAFPCKKVGEAVIGLEVPHAHIHLIPIQKESDMLFSNPKLKLSDEEFK SIAQAINSSL >gi|222159234|gb|ACAB01000125.1| GENE 26 26996 - 27895 697 299 aa, chain - ## HITS:1 COG:BS_yyaM KEGG:ns NR:ns ## COG: BS_yyaM COG0697 # Protein_GI_number: 16081133 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus subtilis # 17 269 15 267 305 
77 25.0 3e-14 MKNKKLEANLSMAVSKVFSGLNMNALKYLLPLWMSPLTGATLRCTFAAAAFWVIGWFMPP EKSSAKDKWLLFLLGALGLYGFMFLYLAGLSKTTPVSSSIFTSLQPIWVFLMMIFFYKEK ATTKKIIGISIGLIGALVCILTQQSDDLASDAFTGNMLCLLSSVVYAVYLILSQRILTAI GAITMLRYTFSGAAVSAIIVTFITGFDAPVFSMPFHWTPFLILMFVLIFPTTISYMLLPV GLKYLKTTVVAIYGYLILIVATIASLALGQDRFSWTQTFAIIFICIGVYLVEVAESKER >gi|222159234|gb|ACAB01000125.1| GENE 27 27899 - 29215 739 438 aa, chain - ## HITS:1 COG:AGc4286 KEGG:ns NR:ns ## COG: AGc4286 COG0477 # Protein_GI_number: 15889635 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 21 178 15 174 400 70 30.0 8e-12 MKIQTGRGTIPLITLIAIWSISALTSLPGLAVSPILGDLTKIFPKATDLDIQMLTSLPSL LIIPFILLGGKLTEKVDYVRVLKVGLWLFAASGVLYLISNRMWQLIVVSALLGIGSGLII PLSTGLVSRYFVGTYRVKQFGLSSAITNFTLVIATAVTGYLAEVSWHLPFLVYLLPLVSI LLVGHLKEDRSGEAALTASSSSDTATTTAANVATANAATVTVPETSSADDSKAARQSAID IGGSKYGIHIKHLIELMLFYGVITYIVIVVIFNLPFLMEKHHFSSGNSGLMISLFFLAIT APGFCLDKIVGLLKERTKAYSLLSMALGLLLIWIAPIEWLIIPGCILVGLGYGIIQPMLY DKTTQTALPQKTTLALAFVMMMNYLAILLYPFIVDFFQWVFHTQSQEFPFIFNLLITVVT LFWAYRRRHTFLFNDQLK Prediction of potential genes in microbial genomes Time: Wed May 18 04:00:09 2011 Seq name: gi|222159233|gb|ACAB01000126.1| Bacteroides sp. D1 cont1.126, whole genome shotgun sequence Length of sequence - 18698 bp Number of predicted genes - 19, with homology - 18 Number of transcription units - 8, operones - 4 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 53 - 112 3.0 1 1 Op 1 . + CDS 196 - 1638 1600 ## COG0076 Glutamate decarboxylase and related PLP-dependent proteins 2 1 Op 2 . + CDS 1648 - 2613 1155 ## COG2066 Glutaminase + Term 2640 - 2691 10.8 3 2 Tu 1 . - CDS 3047 - 3826 697 ## BT_2572 putative potassium channel subunit - Prom 4012 - 4071 3.9 + Prom 3792 - 3851 9.1 4 3 Tu 1 . 
+ CDS 4043 - 5734 1471 ## COG0531 Amino acid transporters - Term 5791 - 5825 -0.4 5 4 Tu 1 . - CDS 5892 - 7154 928 ## COG0642 Signal transduction histidine kinase + TRNA 7354 - 7426 82.1 # Phe GAA 0 0 + TRNA 7440 - 7515 75.3 # Pro CGG 0 0 + Prom 7442 - 7501 80.0 6 5 Tu 1 . + CDS 7621 - 9258 1453 ## BT_2662 alpha-galactosidase precursor - Term 9296 - 9344 4.3 7 6 Op 1 . - CDS 9435 - 10031 319 ## BF3938 hypothetical protein 8 6 Op 2 . - CDS 10056 - 10238 214 ## gi|237717537|ref|ZP_04548018.1| conserved hypothetical protein 9 6 Op 3 . - CDS 10255 - 10623 296 ## COG1725 Predicted transcriptional regulators 10 6 Op 4 . - CDS 10635 - 11456 449 ## BF4127 hypothetical protein 11 6 Op 5 . - CDS 11463 - 12308 681 ## COG1131 ABC-type multidrug transport system, ATPase component - Prom 12346 - 12405 5.1 - Term 12409 - 12450 4.1 12 7 Op 1 . - CDS 12479 - 13945 1536 ## BT_2663 TPR repeat-containing protein 13 7 Op 2 . - CDS 13967 - 14908 816 ## COG0226 ABC-type phosphate transport system, periplasmic component 14 7 Op 3 . - CDS 14915 - 15730 866 ## BT_2665 TonB 15 7 Op 4 . - CDS 15757 - 16410 699 ## BT_2666 hypothetical protein 16 7 Op 5 . - CDS 16426 - 17031 494 ## BT_2667 hypothetical protein 17 7 Op 6 . - CDS 17065 - 17862 918 ## COG0811 Biopolymer transport proteins - Prom 17882 - 17941 5.6 18 8 Op 1 . - CDS 18016 - 18456 251 ## BT_2669 hypothetical protein 19 8 Op 2 . 
- CDS 18464 - 18607 106 ## - Prom 18630 - 18689 3.2 Predicted protein(s) >gi|222159233|gb|ACAB01000126.1| GENE 1 196 - 1638 1600 480 aa, chain + ## HITS:1 COG:sll1641 KEGG:ns NR:ns ## COG: sll1641 COG0076 # Protein_GI_number: 16329656 # Func_class: E Amino acid transport and metabolism # Function: Glutamate decarboxylase and related PLP-dependent proteins # Organism: Synechocystis # 29 443 35 448 467 437 49.0 1e-122 MEDLNFRKGDAKTDVFGSDRMLQPSPVEKIPDGPTTPEVAYQMVKDETFAQTQPRLNLAT FVTTYMDEYATKLMNEAININYIDETEYPRIAVMNGKCINIVANLWNSPEKDTWKTGALA IGSSEACMLGGVAAWLRWRKKRQAQGKPFDKPNFVISTGFQVVWEKFAQLWQIEMREVPL TLEKTTLDPEEALKMCDENTICIVPIQGVTWTGLNDDVEALDKALDAYNAKTGYDIPIHV DAASGGFILPFLYPEKKWDFRLKWVLSISVSGHKFGLVYPGLGWVCWKGKEYLPEEMSFS VNYLGANITQVGLNFSRPAAQILGQYYQFIRLGFQGYKEVQYNSLQIAKYIHSEIAKMVP FVNYSEDVVNPLFIWYLKPEYAKNAKWTLYDLQDKLSQHGWMVPAYTLPSKLEDYVVMRV VVRQGFSRDMADMLLGDINNAIAELEKLEYPTPTRMAQEKNLPVEAKMFNHGGRRKTVKK >gi|222159233|gb|ACAB01000126.1| GENE 2 1648 - 2613 1155 321 aa, chain + ## HITS:1 COG:ECs0538 KEGG:ns NR:ns ## COG: ECs0538 COG2066 # Protein_GI_number: 15829792 # Func_class: E Amino acid transport and metabolism # Function: Glutaminase # Organism: Escherichia coli O157:H7 # 9 312 6 308 310 271 45.0 2e-72 MDKKVTLAQLKEVVQEAYDQVKTNTGGKNADYIPYLANVNKDLFGISVCLLNGQTIHVGD TDYRFGIESVSKVHTAILALRQYGAKEILDKIGADATGLPFNSIIAILLENDHPSTPLVN AGAISACSMVQPIGDSAKKWDAIVGNVTDLCGSAPQLIDELYKSESDTNFNNRSIAWLLK NYNRIYDDPDMSLDLYTRQCSLGVTALQLSIAAGTIANGGVNPVTKKEVFDATLAPKITA MIAAVGFYEHTGDWMYTSGIPAKTGVGGGVMGVLPGQFGIAAFAPPLDGSGNSVKAQLAI QCIMNKLELNVFSNNHITVVD >gi|222159233|gb|ACAB01000126.1| GENE 3 3047 - 3826 697 259 aa, chain - ## HITS:1 COG:no KEGG:BT_2572 NR:ns ## KEGG: BT_2572 # Name: not_defined # Def: putative potassium channel subunit # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 244 1 243 243 397 90.0 1e-109 MKSAFSDFISGKKGIYGILHIIILIMSLFLVISISVDTFKGIPFYTQSSYMKVQLCICLW FLFDFVLEFFLAKHKGRYLRTHFIFLLVAIPYQNIIAYYGWTFSDEITYLLRFIPLLRGG 
YALAIVVGWLTYNRASSLFVSYLTMLLATVYFSSLAFFVLEHRVNPLVNDYGDALWWAFM DVTTVGSNIIAQTVTGRVLSVLLAALGMMMFPIFTVYITNLIQQSNKRRKQYYEEEEQQK KASAQKESAEKAVVQKVNT >gi|222159233|gb|ACAB01000126.1| GENE 4 4043 - 5734 1471 563 aa, chain + ## HITS:1 COG:BMEII0909 KEGG:ns NR:ns ## COG: BMEII0909 COG0531 # Protein_GI_number: 17989254 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Brucella melitensis # 9 486 23 501 510 508 58.0 1e-143 MANIKNAVKLGVFTLAIMNVTAVVSLRGLPAEAVYGMSSAFYYLFAAIVFLIPTSLVAAE LAAMFQDKQGGVFRWVGEAYGKKLGFLAIWVQWIESTIWYPTVLTFGAVSIAFIGMNDVH DMSLANNKYYTLVVVLIIYWLATFISLKGMSWVGKVAKIGGMVGTIIPAGLLIILGIIYL ATGGHSNMDFNSSFFPDFTNFDNVVLAASIFLFYAGMEMGGIHVKDVDNPSKNYPKAVFI GALITVLIFVLGTFALGVIIPAKDINLTQSLLVGFDNYFKYIHASWLSPIIAVALAFGVL AGVLTWVAGPSKGIFAVGKAGYMPPFFQKTNKLGVQKNILFVQGIAVTVLSLLFVVMPSV QSFYQILSQLTVILYLIMYLLMFSGAIALRYKMKKLNRPFRIGKSGNGLMWFVGGLGFCG SLLAFILSFIPPSQISTGSNTVWFSVLIIGAIIVVVAPFIIYASKKPSWVDPNSNFEPFH WEVQAQPATANVSASSVNAPRPANATSAHTGGTTGTSTATPGATASNAATSGTASSGSAS FGSSSASKASPGTGDKDKDAPKS >gi|222159233|gb|ACAB01000126.1| GENE 5 5892 - 7154 928 420 aa, chain - ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 137 409 24 312 328 159 33.0 1e-38 MSSLAIIIIVVLSATLIYVWAKCQSLQKEVHSKENKENELKKLALVLQNINAYFLLIDKD FVVCDTNYYSLNRLPVQVGGVTKRVGDLLHCRNAIAAGECGQHEQCRLCCIRASIGRAFY KKASFKNLEASMKLLSEDEMTVTPCDVSVSGTYLNIHGKDYMVLTVYDVTELKNAQRLLA IEKEHSISAEKLKSAFIANMSHEVRTPLNAIVGFSGLMVSASGEEERKMYADIIAENNER LLRLVNDIFDLSQIESGTVDFVYKEFDANDLLRELDGIFKAKLNNSPVELICEAHVQPIM MYSERERIIQVLSNLLHNAMKFTASGEIRFGCRLEGMEEVYFFVSDTGIGIPEEDQKKIF SRFIKLDREMQGTGLGLTLSQTIIQNLGGNLELDSKVNRGSTFSFVLPRVIKPELIKPQP >gi|222159233|gb|ACAB01000126.1| GENE 6 7621 - 9258 1453 545 aa, chain + ## HITS:1 COG:no KEGG:BT_2662 NR:ns ## KEGG: BT_2662 # Name: not_defined # Def: alpha-galactosidase precursor # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 39 545 1 507 507 955 87.0 0 MKNRIYLFTVAVASFMCISCTKTQTTLSENEKAVNPPIMGWSSWNAFRVDISEDIIKHQA DLMVEKGLKDVGYHYVNVDDGYFGKRDDNGIMLANEKRFPNGMKPVADHIHSLGMKAGLY TDAGNSTCGSMWDNDTAGIGAGIYGHEPQDAQLYFGDWGFDFIKIDYCGGDALGLNEKER YTSIRNSIDKVNKDASINICRWAFPGTWAKDAATSWRISGDINAHWGSLRYVVGKNLYLS AYAGNGHYNDMDMMVIGFRNDSKVGGQGLTPTEEEAHFGLWCIMSSPLLIGCNLENIPES SLELLKNKELIALNQDPLGLQAYVAQHENEGYVLVKDIEQKRGNVRAVALYNPSDTVCRF SVPFSSLEFGGNVKVRDLVKHNDLGSFSGTFEQTLPAHSAMFLRMEGETRLEPTLYEAEW AYLPLFNDLGKNPKGILYANDKEASGKMKVGFLGGQPENYAEWKEVYSEDGGRYNMTIHY SYGKGRQIELDVNGIITKIDSLGEDNNHNEITVPVELKAGYNTIRMGNSYNWAPDIDCFT LTKTL >gi|222159233|gb|ACAB01000126.1| GENE 7 9435 - 10031 319 198 aa, chain - ## HITS:1 COG:no KEGG:BF3938 NR:ns ## KEGG: BF3938 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 13 198 76 265 265 121 35.0 1e-26 MFVSGEMQIESPSALSGKEQISYPKGKYLNVIQKDDTLFMKLDFSANNIPDKFQHQDYIY STGFDVKLAVDSLASAITDTEGLKLNLKGIETDSLVVRGRYSVSLDSCQLRSLDIQGNVR EFHAKDSKIENFYLNLDGVWRWTFANTEVGTEYLTGSSHHSNDLQKGECKRVVWTPLTED ACLQMNIEEKAEITITPE >gi|222159233|gb|ACAB01000126.1| GENE 8 10056 - 10238 214 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237717537|ref|ZP_04548018.1| ## NR: gi|237717537|ref|ZP_04548018.1| conserved hypothetical protein [Bacteroides sp. 
2_2_4] # 1 60 1 60 268 102 83.0 6e-21 MKRTTYIFIGLLVSGLIVVVATIIFVSTSGKPYWENGAFLGDEQVTMDLNGVHVVKVFVS >gi|222159233|gb|ACAB01000126.1| GENE 9 10255 - 10623 296 122 aa, chain - ## HITS:1 COG:BH3492 KEGG:ns NR:ns ## COG: BH3492 COG1725 # Protein_GI_number: 15616054 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 3 120 5 118 129 65 30.0 2e-11 MNFKESKAIYLQIADRICDEILLGQYPEEERIPSVREYAAIVEVNANTVMRSFDYLQMQN IIYNKRGIGYFVTTGAKELIHSLRKDTFLKEELDYFFRQLYTLDIPIKEIETMYHEFIKK QK >gi|222159233|gb|ACAB01000126.1| GENE 10 10635 - 11456 449 273 aa, chain - ## HITS:1 COG:no KEGG:BF4127 NR:ns ## KEGG: BF4127 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 273 1 271 271 280 56.0 6e-74 MMKDTFFSLPRFMNLCRKEMVESWRSNVLRMVLMYGVMAVVLVWNGYFEYRGTYSHERDP MWIFLLIAFIWGLWGFGCLSASFTMEKMKSKTSRTSMLMVPATPFEKFFSRWFVFTVVYL VVFLISYKLADYTRLIIYSLAYPEKDFIAPVALSHLAGDKKYYTLCNTGLQFGALMSGYF FVQSLFVLGSSIWPKNSFLKTFAAGTVIVMVYFLVGILMSKILLENGQYYPGGIFESKDT IWWIIIVAGIFFALVNWTLAYFRFKESEIINRM >gi|222159233|gb|ACAB01000126.1| GENE 11 11463 - 12308 681 281 aa, chain - ## HITS:1 COG:BB0573 KEGG:ns NR:ns ## COG: BB0573 COG1131 # Protein_GI_number: 15594918 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Borrelia burgdorferi # 2 215 5 214 270 169 43.0 7e-42 MITVENLSFTYRKSKRAVLHDFSLSFESGRVYGLLGKNGAGKSTLLYLMSGLLTPKNGKV VFHDTDVRRRLPVTLQDMFLVPEEFELPSVSLVSYVELNSPFYPRFSKEEMIKYLHYFEM DIDIDLGSLSMGQKKKVFMSFALATNTSLLLMDEPTNGLDIPGKSQFRKFIASGMSDDKT IVISTHQVRDIDKVLDHVLIMDDSRVLLDESTSNICDKLFFVESDNRELAKNALFAIPTI QGNYLILPNEEQEESELNLELLFNATLAAPEEIARLFHTQK >gi|222159233|gb|ACAB01000126.1| GENE 12 12479 - 13945 1536 488 aa, chain - ## HITS:1 COG:no KEGG:BT_2663 NR:ns ## KEGG: BT_2663 # Name: not_defined # Def: TPR repeat-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 488 1 483 483 714 89.0 0 
MKRVQMFLAGAFIAVGSLCAQSTDAEWQAEVAKLKGTIQTNPAQAAEEAEHLIKGKNKKN VELLVAIGNAYLDADKLPEAQEYATLARKANGKSALASVLEGNIAMKQKNAGLASQKYEE AIYFDPKCTEAYLKYADVYKSANASLAIEKLNQLKALEPSNTAVDKKLAEIYYLKNDFSK AAEAYANFAMGPTATEEDLVKYAFALFLNHDFEKSLEVANMGLQKNAHHAAFNRLAMYNY TDLKRFDEAIKAADIFFNECEKADYSYLDYMYYGHLLESLKKYDDAVIQYEKAVKMDPTK TDLYKNISSAYEQKNDYKKAISAYQKYYSSLDKEKQTPDLQFQVGRLYYGAGTQPDSLTI TVEERKQALMSADSVFHAIAEAAPDSYLGNFWRARANSALDPETTQGLAKPFYEEVAALL ESKNDPHYNSALVECYSYLGYYYLLAIENPALKAEAKANKDKSIEYWNKILAIDPANATA KRALDGIK >gi|222159233|gb|ACAB01000126.1| GENE 13 13967 - 14908 816 313 aa, chain - ## HITS:1 COG:TM1264 KEGG:ns NR:ns ## COG: TM1264 COG0226 # Protein_GI_number: 15644020 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Thermotoga maritima # 37 301 23 271 274 86 25.0 7e-17 MKRQFRLIGLVALVVLSACNSKSKGPTDTYSSGVVSIAADESFEPIIQEEIEVFENLYPL AGIVPRYTTEVEAINLLLKDSVRLAITTRTLTKEEMNSFHSRKFFPREIKLATDGLALIV NRANPDSLLSVRDFRRILTGEVKNWKEVNPDSRLKGIQVVFDNKNSSTVRFAIDSICGGK PLAEGNVSALKTNQQVIDYVAKNPDAMGVIGVNWLGNRSDTTNLSFREEIRVMAVSAEDV ATPANSYKPYQAYLFYGNYPLARSIYALLNDPRSGLPWGFASFMISDKGQRIILKSGLVP ATQPVRIVHVKDE >gi|222159233|gb|ACAB01000126.1| GENE 14 14915 - 15730 866 271 aa, chain - ## HITS:1 COG:no KEGG:BT_2665 NR:ns ## KEGG: BT_2665 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 271 1 270 270 444 93.0 1e-123 MAKLDLASSEWCQLIFEGKNQAYGAYRMRANSTKRHNVAMLIVVIIAAVGFSIPTLLKLA TPEQKEVMTEVTTLSKLAEPEIKQEEMKRVEPVAPPPPALKSSIKFTAPVIKKDEEVHED DEIKSQEDLNATKVSISIADVKGNDEANGKDIADLKQVVTQAAPEPEKVFDMVEQMPTFP GGQQELMAYLGKNIKYPTIAQENGTQGRVIIQFVVERDGSITDVRVARGVDPYLDKEAVR VVKSMPKWLPGKQNGKAVRVKFTVPVMFRLQ >gi|222159233|gb|ACAB01000126.1| GENE 15 15757 - 16410 699 217 aa, chain - ## HITS:1 COG:no KEGG:BT_2666 NR:ns ## KEGG: BT_2666 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 216 1 216 217 340 93.0 2e-92 
MSAEVQESGGKRGKSKQKKITVRVDFTPMVDMNMLLITFFMLCTTLSKPQTMEISMPSND KDITENQKSMVKASQAITLLLGPDNKLYYYEGEPNYKDYTSLKETSYGANGLRAVLLQKN AVAVNKVRELKQQKLDLKITDDEFKKQVSEIKSGKDTPTVIIKATDDASYMNLIDALDEM QICNIGKYVITDIAEADEFLIKNFDAKGGLSQNLADN >gi|222159233|gb|ACAB01000126.1| GENE 16 16426 - 17031 494 201 aa, chain - ## HITS:1 COG:no KEGG:BT_2667 NR:ns ## KEGG: BT_2667 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 201 1 201 201 346 94.0 2e-94 MGRAQIKKKSTFIDMTAMSDVTVLLLTFFMLTSTFVKKEPVQVTTPASVSEIKIPETNVL QILVDPEGKIFMSLDKQQDMQAVLESMGEEYGIKFTPEQEKRFMLSSTFGVPIRSMQKYL DLPEDQRDKILKNEGIPCDSVDNQFKSWVRNARTANADLRIAIKADATTPYSVIKNVMNS LQDLRENRYNLITSLKAESEN >gi|222159233|gb|ACAB01000126.1| GENE 17 17065 - 17862 918 265 aa, chain - ## HITS:1 COG:FN1312 KEGG:ns NR:ns ## COG: FN1312 COG0811 # Protein_GI_number: 19704647 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Fusobacterium nucleatum # 52 261 1 200 202 87 28.0 2e-17 METTKKTQVVGIKNAGIVIICCLVIAVCIFQFLLGNPSNFMNNDPNNHPLNMLGTIYKGG IIVPIIQTLLLTVLALSIERYFALRSAFGKGSLSKFVANIKDALAAGDMKKAQEICDKQR GSVANVVTSTLRKYEEMEKNTSLPKEQKLLAIQKELEEATALEMPMMQQNLPIIATITTL GTLMGLLGTVIGMIRSFAALSAGGGADSMALSQGISEALINTAFGILTGALAVISYNYYT NKIDKLTYGLDEVGFSIVQTFAATH >gi|222159233|gb|ACAB01000126.1| GENE 18 18016 - 18456 251 146 aa, chain - ## HITS:1 COG:no KEGG:BT_2669 NR:ns ## KEGG: BT_2669 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 146 1 147 147 172 74.0 4e-42 MKLSPKFILAFVLLMLSFSMGGNAMNFSAYASNCEASCQLSESSSDVNRNSSVDNQSVSY DNSQSLTISDVELGFKLVTESNSNNYRLRRIIEINDSLKDVMHKFLVLRENSLVLDQSKS FYSDKDPHYSIMCSDYYVFALRHILI >gi|222159233|gb|ACAB01000126.1| GENE 19 18464 - 18607 106 47 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFPRTKHFRTCSYSRDKFLCLNDVNNDTRVHFIATFVAESVTNDLNN Prediction of potential genes in microbial genomes Time: Wed May 18 04:01:05 2011 Seq name: gi|222159232|gb|ACAB01000127.1| 
Bacteroides sp. D1 cont1.127, whole genome shotgun sequence Length of sequence - 42601 bp Number of predicted genes - 42, with homology - 40 Number of transcription units - 24, operones - 9 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 559 - 1575 706 ## BT_1668 hypothetical protein - Prom 1635 - 1694 5.1 + Prom 1591 - 1650 3.7 2 2 Tu 1 . + CDS 1696 - 2715 1252 ## COG0016 Phenylalanyl-tRNA synthetase alpha subunit + Prom 2740 - 2799 3.2 3 3 Op 1 1/0.000 + CDS 2871 - 4070 960 ## COG0477 Permeases of the major facilitator superfamily 4 3 Op 2 . + CDS 4067 - 4744 636 ## COG0177 Predicted EndoIII-related endonuclease + Prom 4759 - 4818 5.7 5 4 Op 1 . + CDS 4838 - 6097 1754 ## COG0126 3-phosphoglycerate kinase 6 4 Op 2 . + CDS 6134 - 7192 947 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components 7 4 Op 3 . + CDS 7217 - 8260 884 ## BT_1674 hypothetical protein + Prom 8305 - 8364 9.9 8 5 Op 1 . + CDS 8391 - 9575 825 ## COG2311 Predicted membrane protein 9 5 Op 2 . + CDS 9586 - 11790 1717 ## COG0457 FOG: TPR repeat + Term 11852 - 11893 6.2 - Term 11837 - 11884 9.2 10 6 Op 1 . - CDS 11934 - 12515 601 ## COG0424 Nucleotide-binding protein implicated in inhibition of septum formation 11 6 Op 2 . - CDS 12579 - 13100 539 ## COG1778 Low specificity phosphatase (HAD superfamily) 12 6 Op 3 . - CDS 13081 - 13875 607 ## BT_1678 hypothetical protein 13 6 Op 4 . - CDS 13883 - 14218 338 ## BT_1679 hypothetical protein 14 6 Op 5 . - CDS 14221 - 14775 526 ## COG0778 Nitroreductase + Prom 14856 - 14915 3.0 15 7 Tu 1 . + CDS 14938 - 15162 146 ## gi|237713696|ref|ZP_04544177.1| conserved hypothetical protein - Term 15154 - 15198 2.8 16 8 Tu 1 . - CDS 15251 - 15793 462 ## COG0288 Carbonic anhydrase - Prom 15833 - 15892 5.0 + Prom 15814 - 15873 5.3 17 9 Tu 1 . 
+ CDS 15944 - 16942 574 ## BT_1684 hypothetical protein + Term 17029 - 17064 -0.8 + Prom 16956 - 17015 5.4 18 10 Op 1 2/0.000 + CDS 17118 - 17522 526 ## COG0346 Lactoylglutathione lyase and related lyases 19 10 Op 2 . + CDS 17551 - 19104 1828 ## COG4799 Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) 20 10 Op 3 . + CDS 19136 - 20056 919 ## BT_1687 hypothetical protein 21 10 Op 4 . + CDS 20081 - 20512 554 ## COG1038 Pyruvate carboxylase 22 10 Op 5 . + CDS 20514 - 21674 1415 ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit + Term 21696 - 21745 14.6 - Term 21780 - 21826 8.3 23 11 Tu 1 . - CDS 21859 - 22863 1186 ## COG0191 Fructose/tagatose bisphosphate aldolase - Prom 22906 - 22965 4.2 + Prom 23167 - 23226 3.0 24 12 Tu 1 . + CDS 23297 - 23551 437 ## PROTEIN SUPPORTED gi|160886426|ref|ZP_02067429.1| hypothetical protein BACOVA_04437 + Term 23565 - 23629 17.3 - Term 23567 - 23605 4.1 25 13 Tu 1 . - CDS 23632 - 24846 1326 ## COG1373 Predicted ATPase (AAA+ superfamily) - Prom 24941 - 25000 6.8 + Prom 24903 - 24962 8.6 26 14 Tu 1 . + CDS 25040 - 25447 290 ## gi|237713708|ref|ZP_04544189.1| conserved hypothetical protein 27 15 Tu 1 . - CDS 25801 - 25914 67 ## - Prom 26147 - 26206 6.5 + Prom 26096 - 26155 6.8 28 16 Tu 1 . + CDS 26181 - 26612 295 ## BT_3564 hypothetical protein + Prom 26775 - 26834 6.5 29 17 Tu 1 . + CDS 26920 - 27285 252 ## gi|237713712|ref|ZP_04544193.1| predicted protein + Term 27377 - 27416 -0.1 + Prom 27371 - 27430 3.6 30 18 Op 1 . + CDS 27624 - 28928 865 ## BT_1173 hypothetical protein 31 18 Op 2 . + CDS 28957 - 30756 1280 ## BT_1172 DNA primase/helicase + Term 30974 - 31009 0.1 - Term 30618 - 30646 -1.0 32 19 Tu 1 . - CDS 30760 - 30954 272 ## gi|237713715|ref|ZP_04544196.1| predicted protein - Prom 31146 - 31205 7.6 + Prom 30927 - 30986 5.4 33 20 Tu 1 . 
+ CDS 31188 - 31697 647 ## BT_2021 putative non-specific DNA-binding protein + Term 31764 - 31804 1.3 + Prom 31793 - 31852 4.9 34 21 Op 1 27/0.000 + CDS 32032 - 33087 837 ## COG0845 Membrane-fusion protein + Term 33090 - 33138 -0.8 35 21 Op 2 9/0.000 + CDS 33154 - 36186 2833 ## COG0841 Cation/multidrug efflux pump 36 21 Op 3 . + CDS 36235 - 37509 1206 ## COG1538 Outer membrane protein - Term 37546 - 37600 14.3 37 22 Op 1 . - CDS 37618 - 37968 253 ## BVU_0489 hypothetical protein 38 22 Op 2 . - CDS 37949 - 38212 417 ## gi|160886440|ref|ZP_02067443.1| hypothetical protein BACOVA_04451 - Prom 38240 - 38299 5.0 39 23 Op 1 4/0.000 - CDS 38316 - 39551 1410 ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit 40 23 Op 2 . - CDS 39551 - 41386 2065 ## COG5016 Pyruvate/oxaloacetate carboxyltransferase 41 23 Op 3 . - CDS 41413 - 41673 390 ## BT_1698 hypothetical protein - Prom 41729 - 41788 8.8 42 24 Tu 1 . + CDS 42028 - 42162 129 ## + Term 42267 - 42305 1.0 Predicted protein(s) >gi|222159232|gb|ACAB01000127.1| GENE 1 559 - 1575 706 338 aa, chain - ## HITS:1 COG:no KEGG:BT_1668 NR:ns ## KEGG: BT_1668 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 338 1 338 350 516 78.0 1e-145 MKKIVFCLLLLTFSFRLAAQIDYLEPVKPFSSYTGELGEYYRSVFSLLNTGFQKQPYARF AAIPSFSPEYAMSVERKNGRYTLISNTLSRTYWQAEKGTVTVDTKSVVISASLYQSLGAI FRLVTEQVQDLDGSTAGLDGIVYYFSSTDAKGKERMGRKWSPEKGTLMERLVLVCQSAYM LSRGENISEQTLAEEAASLLKALQQRSKEEPDAYKQPMYVGIYPVGPRAKTLSGRQVEEP AHFSAMSLEEYIANEMVYPAGLLEKNVSGYALCEFTIDKEGVILRPHILRSTHPEFAEEA LRIVKGMPKWSPALAGGKPADSNYTLYIPFRPQLYRNK >gi|222159232|gb|ACAB01000127.1| GENE 2 1696 - 2715 1252 339 aa, chain + ## HITS:1 COG:lin1184 KEGG:ns NR:ns ## COG: lin1184 COG0016 # Protein_GI_number: 16800253 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase alpha subunit # Organism: Listeria innocua # 1 337 1 344 350 317 46.0 1e-86 
MIAKIEQLLKEVEALHASNAEELEALRIKYLSKKGAINDLMADFRNVAADQKKEVGMRLN ELKTKAQDKINALKELFESQDNDCDGLDLTRSAYPVELGTRHPLTIVKNEIIDIFARLGF SIAEGPEIEDDWHVFSALNFAEDHPARDMQDTFFIEAHPDVVLRTHTSSVQTRVMETSQP PIRIICPGRVYRNEAISYRAHCFFHQVEALYVDKNVSFTDLKQVLLLFAKEMFGADTKIR LRPSYFPFTEPSAEMDISCNICGGKGCPFCKHTGWVEILGCGMVDPNVLESNGIDSKIYS GYALGMGIERITNLKYQVKDLRMFSENDTRFLKEFESAY >gi|222159232|gb|ACAB01000127.1| GENE 3 2871 - 4070 960 399 aa, chain + ## HITS:1 COG:STM2280 KEGG:ns NR:ns ## COG: STM2280 COG0477 # Protein_GI_number: 16765607 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Salmonella typhimurium LT2 # 3 389 2 381 396 132 24.0 1e-30 MAKDRLITPSYCFILAANFLLYFGFWLLIPVLPFYLSEFFQTGNSTIGIVLSCYTVAALC IRPFSGYLLDTFARKPLYLFAYFIFMMMFGGYLIAGSLTLFIIFRIIHGVSFGMVTVGGN TVVIDIMPSSRRGEGLGYYGLTNNTAMSIGPMFGLFLHDAGVSFATIFCYAFGSCILGFL CASLVKTPYKPPVKREPISLDRFILMKGLPAGLSLLLLSIPYGMTTNYVAMYARQIGLNT QTGFFFTFMAVGMAISRIFSGKLVDRGKITQVIIAGLYLVVCSFFLLSACVYLIQWNDTV CNILFFGIALLMGVGFGIMFPAFNTLFVNLAPNSQRGTATSTYLTSWDVGIGIGMLTGGY IAEISTFDKAYLFGACLTVVSALYFKLKVTPHYHKNKLR >gi|222159232|gb|ACAB01000127.1| GENE 4 4067 - 4744 636 225 aa, chain + ## HITS:1 COG:RP746 KEGG:ns NR:ns ## COG: RP746 COG0177 # Protein_GI_number: 15604580 # Func_class: L Replication, recombination and repair # Function: Predicted EndoIII-related endonuclease # Organism: Rickettsia prowazekii # 9 213 8 210 212 197 47.0 2e-50 MRKKERYEKVIAWFQDNVPVAETELHYNNPYELLIAVILSAQCTDKRVNIITPPLYKDFP TPEALAATTPEVIFEYIRSVSYPNNKAKHLVGMAKMLVNDFNSQVPDNLEDLVKLPGVGR KTANVIQSVVFNKAAMAVDTHVFRVSHRIGLVPDSCTTPFSVEKELVKNIPEKLIPIAHH WLILHGRYVCQARTPKCDTCGLQMMCKYFCNTYKVTKEEPKAKNK >gi|222159232|gb|ACAB01000127.1| GENE 5 4838 - 6097 1754 419 aa, chain + ## HITS:1 COG:all4131 KEGG:ns NR:ns ## COG: all4131 COG0126 # Protein_GI_number: 17231623 # Func_class: G Carbohydrate transport and metabolism # 
Function: 3-phosphoglycerate kinase # Organism: Nostoc sp. PCC 7120 # 8 419 13 399 400 387 52.0 1e-107 MQTIDKFNFAGKKAFVRVDFNVPLDENFNITDDTRMRAALPTLKKILADGGSIIIGSHLG RPKGVADKFSLKHIIKHLSELLGVEVQFANDCMGEEAAVKAAALQPGEVLLLENLRFYAE EEGKPRGLAEDATDEEKAAAKKAVKESQKEFTKKLASYADCYVNDAFGTAHRAHASTALI AKYFDVNNKMFGYLMEKEVKAVDKVLNDIKRPFTAIMGGSKVSSKIEIIENLLSKVDNLI IAGGMTYTFTKAMGGKIGISICEDDKLDLALDLMKKAKEKGVNLVLAVDAKIADAFSNDA NTKFCAVDEIPDGWEGLDIGPKTEEIFANVIKESKTILWNGPTGVFEFDNFTHGSRAVGE AIVEATKNGAFSLVGGGDSVACVNKFGLASGVSYVSTGGGALLEAIEGKVLPGIAAIQE >gi|222159232|gb|ACAB01000127.1| GENE 6 6134 - 7192 947 352 aa, chain + ## HITS:1 COG:AF0088 KEGG:ns NR:ns ## COG: AF0088 COG0715 # Protein_GI_number: 11497708 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Archaeoglobus fulgidus # 61 332 23 297 300 132 30.0 1e-30 MIRGQFFVKRVALIYKKRSSSYLQKRGMKRLILLFWFILLFLYSCQSKKGDGSFLPLTEL QPLTLGMMPTLDGLPFHIAKTQGIYDSLGLNLTILSFNSANDRDAAFQTGKMDGMITDYP SAVVLQAIHHADLGFVLKNDGYFCFIVSKESNINQLEQLKEKNIAVSRNTVIEYATDQLL SKAGISLSEMNMPEIGQLPLRLQMLQYNQIDASFLPDPAASIAMNSQHRSLVSTQELGID FTATAFSRKALNEKRKEIELLITGYNLGVDYIKMHPQKEWEQVLIEIGVPENLTGLIALP SYQKAKRPSAEGIDKAIQWLKENHRIPETYSERNLIDTTYIPTVSTIIQYQP >gi|222159232|gb|ACAB01000127.1| GENE 7 7217 - 8260 884 347 aa, chain + ## HITS:1 COG:no KEGG:BT_1674 NR:ns ## KEGG: BT_1674 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 347 1 347 347 634 88.0 1e-180 MRKTLLQFLTFVLCITGMQNPLLAQEQTPLNQVVNTLKERISLSGYAQLGYTYDDAANPD NTFDIKRIIFMAHGKITKRWTCDFMYDFYNGGMLLEVYTDYQFLPGLTARIGEFKVPYTI ENELSPTTVELINCYSQSVCYLAGVSGSDKCYGMTSGRDIGMMLHGKLFRDFLQYKVAVM NGQGLNTKDKNSQKDVVGNLMVNPLKWLSVGGSFIRGTGHAIADSEYTGIKAGENYAKKR WSAGGVVTTSTFNLRTEYLAGKDRSVKSEGFYATGSVRFARNFDFIASFDYFNPNKAADF KQNNYIAGVQYWFYPRCRLQAQYTFCDKKGDGQKDSNLIQAQVQVRF >gi|222159232|gb|ACAB01000127.1| GENE 8 8391 - 9575 825 394 aa, chain + ## HITS:1 COG:BH3427 KEGG:ns NR:ns ## COG: 
BH3427 COG2311 # Protein_GI_number: 15615989 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus halodurans # 9 388 2 374 397 106 27.0 7e-23 MEHKIAETNARIDVADVLRGLAVMGIILLHSIEHFNFYSFPEEVPFEWMKFTDQAIWRGL FFTFSNKAYAVFALLFGFSFYIQDNNQQRRGKDFRLRFLWRLFILFMIGQFNAAFFTGEI LTMYAILGIILPIFCRMSDRTVVIFATLLILQPIDWVKLIYALCNPDYVAGKSLAGYYFS IAFDVQKNGTFLETVRMNMWEGQLANMTWALEHGRILQTPGLFLFGMLVGRRKYFLYSEQ NERLWLKALAVSLLCFFPIYGLNNMLPEFIERSAILVPLQLILSSFSSLSFMVLLVTGLL LTFYRVKDRSFFMRFTSYGKMSLTNYLGQSIFGSLLFYHWGFELGRYLGITYSFLFGILF VLLQMVFCSWWLRHHKHGPFEGLWKRLTWIGKNK >gi|222159232|gb|ACAB01000127.1| GENE 9 9586 - 11790 1717 734 aa, chain + ## HITS:1 COG:all3773_2 KEGG:ns NR:ns ## COG: all3773_2 COG0457 # Protein_GI_number: 17231265 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. PCC 7120 # 512 697 133 316 395 67 30.0 7e-11 MNEKTINEQYAYIRTLLEEKRLKEALMQLESLLWQCPDWDLRTRLEQLQTSYKYMLEYMK QGANDPDRWNLYQKMVADTWGIADQSRLLMLDNASSRYYHEVRRTPRSADLSNYGIKTIL HILESFNDDLAVSGLLSDEKMDEVLRRHEDTLKFMFIRTWTNSAWTPEDEEDAKAMLASE LLPGDDLCLFVSALTLSLMECFDIRKIMWLLNAYEHPNVNVSQRALVGTMIIFHIYRSRL SFYPELIKRVDLMEEIPSFREDVARIYRQMLLCQETEKIDKKMREEIIPEMLKNVSSMKN MRFGFEENDEENNDMNPDWEDAFEKSGLGDKLREMNELQLEGADVYMSTFAALKNYPFFR EVHNWFYPFSKQQSNVLKAMKQAGNQGSSLLDLILQSGFFSNSDKYSLFFTIHQLPQSQQ DMMLSQLNEQQVAELAEKSNVETMKRFNERPGTVSNQYLHDLYRFFKLSVRKSEFRDIFK EKLDLHHVPALDNILYWEDVLFPIADFYLAKERWDEAIEIYEELESIGGFEGESAEYYQK FGYALQKRKKYAEAIQAYLKADTLKPDNIWNNRHLAICYRLNRNYQAALTYYKKVEEAAP EDTNVTFYIGSCLAELGQYEEALNYFFKLDFIENNCIKAWRGIGWCSFISQKYEQAMKYY EKIIEQKPLAIDYMNAGHVAWVMGDIQKAAVFYGKAITASGNRERFLEMFHKDEESLLTQ GIREEDIPLMLDLL >gi|222159232|gb|ACAB01000127.1| GENE 10 11934 - 12515 601 193 aa, chain - ## HITS:1 COG:BS_maf KEGG:ns NR:ns ## COG: BS_maf COG0424 # Protein_GI_number: 16079857 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Nucleotide-binding protein implicated in inhibition of septum formation # Organism: Bacillus subtilis # 10 
191 5 183 189 135 42.0 4e-32 MLDNLKKYQIILASNSPRRKELMSGLGVDYVVRTLPDVDESYPDTLVGAEIPEYIAREKA GAYRTMMQPGELLITADTIVWMDGKVLGKPEGREGAIEMLRALSGKSHQVFTGVCLTTTE WQKSFTASSEVLFDVLSEDEIQYYVDLYQPMDKAGAYGVQEWIGYIGVKSISGSFYNIMG LPIQKLYGELKKL >gi|222159232|gb|ACAB01000127.1| GENE 11 12579 - 13100 539 173 aa, chain - ## HITS:1 COG:FN0213 KEGG:ns NR:ns ## COG: FN0213 COG1778 # Protein_GI_number: 19703558 # Func_class: R General function prediction only # Function: Low specificity phosphatase (HAD superfamily) # Organism: Fusobacterium nucleatum # 8 165 1 158 168 120 36.0 2e-27 MSTINYDLSRIKALAFDVDGVLSSTTIPLHPSGEPMRTVNIKDGYAIQLAVKKGLHIAII TGGRTEAVRIRFEGLGVKDLYMGSAVKIHDYRDFRDKYGLTDDEILYMGDDVPDIEVMRE CGLPCCPKDAVPEVKSVAKYISYADGGRGCGRDVVEQVLKAHGLWMAEDAFGW >gi|222159232|gb|ACAB01000127.1| GENE 12 13081 - 13875 607 264 aa, chain - ## HITS:1 COG:no KEGG:BT_1678 NR:ns ## KEGG: BT_1678 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 264 1 264 264 448 85.0 1e-125 MKRSIEDTPIVFIGAGNLATNLAKALYHKGFRIVQVYSRTMESARTLAEKVEAEYTTDLQ EISKDAKLYIVSLKDAALVDLLPQITEGKQSSLLVHTAGSIPMSIWEGHAERYGVFYPMQ TFSKQREVNFQEVPFFVEAKRPEDVEFLKAVAATLSEKVYEASSEQRKSLHLAAVFICNF TNHMYALAADLLEKYNLPFDVMLPLIDETARKVHELAPRDAQTGPAVRYDENVMNNHLAM LVDSPALQDIYKLMSKSIHEHHQL >gi|222159232|gb|ACAB01000127.1| GENE 13 13883 - 14218 338 111 aa, chain - ## HITS:1 COG:no KEGG:BT_1679 NR:ns ## KEGG: BT_1679 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 111 1 111 111 116 81.0 3e-25 MGAAKASNSKDTYLATAVTFIVIGALYLIDKLIHFSTIGLPWVMNKDNMLLYASICFLIF KRDKSIGFVLLGLWLVMNISLVISLLGSLSGYLLPLTLLIVGIILFWFAKR >gi|222159232|gb|ACAB01000127.1| GENE 14 14221 - 14775 526 184 aa, chain - ## HITS:1 COG:CAC3555 KEGG:ns NR:ns ## COG: CAC3555 COG0778 # Protein_GI_number: 15896791 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 12 181 3 172 174 145 39.0 5e-35 
MKPENNMENFSELIKNRRSMRKFTDEELTQDQVVTLMKAALMSPSSKRSNSWQFVVIDDK EVLKELSHCKEQASSFIADAALAIVVMADPLASDVWIEDASIASIMIQLQAEDLGLGSCW VQVRERFTATGMPSDEFVHGILGIPLQLQILSVIAIGHKGMERKPFNEEHLQWEKIHINK FGGK >gi|222159232|gb|ACAB01000127.1| GENE 15 14938 - 15162 146 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713696|ref|ZP_04544177.1| ## NR: gi|237713696|ref|ZP_04544177.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 74 1 74 74 112 100.0 6e-24 MKKRITQDDYIKANRKASREAEIEMYGHPICHKRVHQSKKVYNRRKIKAADKKLPYFFAF KIASLSSRHSDTIQ >gi|222159232|gb|ACAB01000127.1| GENE 16 15251 - 15793 462 180 aa, chain - ## HITS:1 COG:BS_ytiB KEGG:ns NR:ns ## COG: BS_ytiB COG0288 # Protein_GI_number: 16080121 # Func_class: P Inorganic ion transport and metabolism # Function: Carbonic anhydrase # Organism: Bacillus subtilis # 1 180 3 182 187 210 52.0 9e-55 MLEEILAYNKQFVENKGYESYITNKYPDKKIAILSCMDTRLTALLPAALGIKNGDVKMIK NAGGVISHPFGSVIRSLLVAIFELGVEEIMVIAHSDCGACHMHSEVMLEKMKARGINPDY IDMMRFCGVDFHSWLDGFEDTEKSVRGTVDFIVRHPLIPSDVTVYGFIIDSTTGELTRIV >gi|222159232|gb|ACAB01000127.1| GENE 17 15944 - 16942 574 332 aa, chain + ## HITS:1 COG:no KEGG:BT_1684 NR:ns ## KEGG: BT_1684 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 332 1 332 332 603 85.0 1e-171 MPLKLTTYYHGKDIPELPGKNTFHSKELFLIYEATPGYTPLLIVATEDGRPVARLLAAIR KAKKWLPSSLVKHCVVYSEGEFLDESLSTNKEKAEEVFGDMLEHLTQEASRSCVLIEFRN LNNSMFGYRVFRANDYFPVNWLRVRNSLHSMKKTEDRFSPSRIRQIKKGLKNGAKVEEAH TVEEIHDFSRMLHKVYSSRIRRYFPANDFFRHMNDMLIKGQQAKIFVVKYKEKIIGGSVC IYSGDDAYLWFSGGMRKTYALQYPGVLAVWKALEDARQRGFRHMEFMDVGLPFRKHGYRD FVLRFGGKQSSTRRWFRVNWNWLNKLLVKFYI >gi|222159232|gb|ACAB01000127.1| GENE 18 17118 - 17522 526 134 aa, chain + ## HITS:1 COG:PH0272 KEGG:ns NR:ns ## COG: PH0272 COG0346 # Protein_GI_number: 14590197 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Pyrococcus horikoshii # 6 133 8 133 136 121 55.0 4e-28 
MKISHIEHLGIAVKSIEEALPYYENVLGLKCYNIETVEDQKVRTAFLKVGETKIELLEPT CPESTIAKFIENKGAGVHHVAFAVEDGVANALAEAEGKEIRLIDKAPRKGAEGLNIAFLH PKSTLGVLTELCEH >gi|222159232|gb|ACAB01000127.1| GENE 19 17551 - 19104 1828 517 aa, chain + ## HITS:1 COG:RC0960 KEGG:ns NR:ns ## COG: RC0960 COG4799 # Protein_GI_number: 15892883 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) # Organism: Rickettsia conorii # 11 517 12 514 514 650 61.0 0 MSNQLEKIKELIERRAVARIGGGEKAIAKQHEKGKYTARERLAMLLDEGSFEEMDMFVEH RCTNFGMEKKHYPGDGVVTGCGTIEGRLVYVFAQDFTVSAGSLSETMSLKICKIMDQAMK MGAPCIGINDSGGARIQESINALAGYAEIFQRNILASGVIPQISGIFGPCAGGAVYSPAL TDFTLMMEGTSYMFLTGPKVVKTVTGEDVSQENLGGASVHSTKSGVTHFTAQTEEEGFEL IRKLLSYIPQNNLEEAPYVDCTDPIDRLEDSLNDIIPDSPTKPYDMYEVIGAIVDNGEFL EIQKDYAKNIIIGFARFNGQSVGIVANQPKYLAGVLDSNASRKGARFVRFCDAFNIPIVS LVDVPGFLPGTGQEYNGVILHGAKLLYAYGEATVPKVTITLRKSYGGSHIVMSCKQLRGD MNYAWPTAEIAVMGGAGAVEVLYAREAKDQENPAQFLAEKEAEYTKLFANPYNAAKYGYI DDVIEPRNTRFRVIRALQQLQTKKLSNPAKKHGNIPL >gi|222159232|gb|ACAB01000127.1| GENE 20 19136 - 20056 919 306 aa, chain + ## HITS:1 COG:no KEGG:BT_1687 NR:ns ## KEGG: BT_1687 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 306 1 306 306 559 93.0 1e-158 MNKTKIGIFLSLLLLIGLTSCGEQKSNNKLVLNEILIDNQSNFQDDYGLHSAWIEIFNKS FGSADLAACLLKVSSQPGDTVTYFIPKGDILTLVKPRQHALFWADGEPNRGTFHTSFKLN PETANWVGLFDSGKKLLDQIVVPAGALGPNQSYARVSDGAAEWEVKSGSGDKYVTPSTNN KTLDSNSKMEKFEEHDADGVGMSISAMSVVFCGLILLFIAFKIVGKVAVNLSKRNAMKSK GIDKHEAKELSQAPGEVYAAISMALHEMQDEVHDVEETVLTITRVKRSYSPWSSKIYTLR ETPPRK >gi|222159232|gb|ACAB01000127.1| GENE 21 20081 - 20512 554 143 aa, chain + ## HITS:1 COG:SA0963 KEGG:ns NR:ns ## COG: SA0963 COG1038 # Protein_GI_number: 15926699 # Func_class: C Energy production and conversion # Function: Pyruvate carboxylase # Organism: Staphylococcus aureus N315 # 65 143 1068 1146 1150 68 44.0 3e-12 MKEYKYKINGNSYKVTIGDIEDNIAHVEVNGTHYKVEMEKQPKPVAKPVTVRPMPNAPTA 
PTQVVKPTAPSTGKSGVKSPLPGVILDIKVNVGDTVKKGQTIIILEAMKMENNINADKDG KITAINVNKGDSVLEGNDLVIIE >gi|222159232|gb|ACAB01000127.1| GENE 22 20514 - 21674 1415 386 aa, chain + ## HITS:1 COG:TM0880 KEGG:ns NR:ns ## COG: TM0880 COG1883 # Protein_GI_number: 15643642 # Func_class: C Energy production and conversion # Function: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit # Organism: Thermotoga maritima # 12 385 17 383 384 348 53.0 8e-96 MGDFINFLGNNLADFWTYTGFANATVGHVVMILVGLVFIYLAIAKEFEPMLLIPIGFGIL IGNIPFNMDAGLKVGIYEEGSVLNILYQGVTSGWYPPLIFLGIGAMTDFSALISNPKLML IGAAAQFGIFGAYMIALEMGFDPMQAGAIGIIGGADGPTAIFLSSKLAPNLMGAIAVSAY SYMALVPVIQPPIMRLLTTKNERVIRMKPPRAVSHTEKVIFPIIGLLLTCFLVPSGLPLL GMLFFGNLLKESGVTRRLANTASGPLIDTITILLGLTVGASTQASEFLTLDSIKIFALGA LSFVIATASGVIFVKIFNIFLKKGNKINPLIGNAGVSAVPDSARISQVIGLEYDSTNYLL MHAMGPNVAGVIGSAVAAGILLGFLM >gi|222159232|gb|ACAB01000127.1| GENE 23 21859 - 22863 1186 334 aa, chain - ## HITS:1 COG:TP0662 KEGG:ns NR:ns ## COG: TP0662 COG0191 # Protein_GI_number: 15639649 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Treponema pallidum # 1 328 1 328 332 448 68.0 1e-126 MVNYKDLGLVNTREMFAKAIKGGYAIPAFNFNNMEQMQAIIKAAVETKSPVILQVSKGAR QYANATLLRYMAQGAVEYAKELGCAHPEIVLHLDHGDTFETCKSCIDSGFSSVMIDGSHL PYEENVALTKKVVEYAHQFDVTVEGELGVLAGVEDEVSSDHHTYTDPEEVIDFATRTGCD SLAISIGTSHGAYKFTPEQCHIDPKTGRMVPPPLAFEVLDAVMEKLPGFPIVLHGSSSVP EEEVETINKYGGALKAAIGIPEEELRKAAKSAVCKINIDSDSRLAMTAAVRKVFAEKPAE FDPRKYLGPARDNMEKLYKHKIINVLGSDNKLAQ >gi|222159232|gb|ACAB01000127.1| GENE 24 23297 - 23551 437 84 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160886426|ref|ZP_02067429.1| hypothetical protein BACOVA_04437 [Bacteroides ovatus ATCC 8483] # 1 84 1 84 84 172 100 2e-42 MKKGLHPESYRPVVFKDMSNGDMFLSRSTVATKETIEFEGETYPLLKIEISNTSHPFYTG KSTLVDTAGRVDKFMSRYGDRKKK >gi|222159232|gb|ACAB01000127.1| GENE 25 23632 - 24846 1326 404 aa, chain - ## HITS:1 COG:FN1382 KEGG:ns NR:ns ## COG: FN1382 COG1373 # Protein_GI_number: 19704717 
# Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 1 402 1 402 402 347 48.0 2e-95 MINREQYMEQVVPFIDKPFVKVITGIRRSGKSVVLRLIRDELLRRGVREERIIYLNFESF QWIDLKEAKALYAYIRGQAGDAGKYYILLDEIQEVDGWEKVVNSLLVDLDTDIYVTGSNS RMLSSELATYLTGRYVAFHVMTLSFREYLTFHDLQANDPTLNRKEEFQKYLRMGGFPAIH TADYGYEAIYKIVYDIYSSVILRDTVQRHNIRNVELLERVVKFVFDNIGNKLNAKNIADY FKSQQRRVDMNTIYNYLNALESAFIIQRIPRYDIKGKEILQTNEKYFVSDLSLIYSVMGY RDRLIAGMLENLVCLELKRRGYEVYVGKQDDKEVDFVAIRREEKIYVQVTYQLASQATVE REFAPLLAINDHYPKYVVSMDSLWQDNVEGVRHRHIADFLLDDA >gi|222159232|gb|ACAB01000127.1| GENE 26 25040 - 25447 290 135 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713708|ref|ZP_04544189.1| ## NR: gi|237713708|ref|ZP_04544189.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 135 1 135 135 238 100.0 9e-62 MEITDLKQMTKEEVFNFIRQRLSFSKELQEQFRHVNKDDLAKEHRRFEMSGNESKTGQCT IFNTAILNEFADLGIYDYTSYLFLDFHNGTPTVYLKYFSENENLEYTFTGYTTTEIIFAI LELTIFSGKPKRNRS >gi|222159232|gb|ACAB01000127.1| GENE 27 25801 - 25914 67 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLIYDYLRGVSVVSSLGVKKLLIFKFERQYVIYSIKE >gi|222159232|gb|ACAB01000127.1| GENE 28 26181 - 26612 295 143 aa, chain + ## HITS:1 COG:no KEGG:BT_3564 NR:ns ## KEGG: BT_3564 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 116 1 115 146 71 36.0 1e-11 MPKWISVEKAVIKYHIEKEAILLWVEMGQFPMLYIDNVPNVDEECILELFRRSKAGITAE YIDTLEQLCIDKTMVCEKYAHIIQLKEKEIQLQKEINTLINEIQAAMKRQNERIRDLKKA IGENNNVIRSDSWIKRLRKRFQW >gi|222159232|gb|ACAB01000127.1| GENE 29 26920 - 27285 252 121 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713712|ref|ZP_04544193.1| ## NR: gi|237713712|ref|ZP_04544193.1| predicted protein [Bacteroides sp. 
D1] # 1 121 1 121 121 248 100.0 7e-65 MKKTMKNDSCLTVSRRAAIIRQDAEENYGTLLTLGESMRRAYQVEELIDRMRWDNADFCL QRTDGTCLSITGTLTEYEHYFGRPYYENPSNRFLPYYDVHCRKWLTCRVAGIVFGSSVKK N >gi|222159232|gb|ACAB01000127.1| GENE 30 27624 - 28928 865 434 aa, chain + ## HITS:1 COG:no KEGG:BT_1173 NR:ns ## KEGG: BT_1173 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 121 432 6 326 328 82 23.0 3e-14 MTTSNSANTKQSNNASQKRKPIHNGYFNHPTSSSSNIPMSILIREQGLEIYGLYWVMLEE AHAQLKCCVNIQTMEIIANIFHAQPEHLELLYHHYFRRPGKGYNSHILYADFCEESAIRS YFPHPLLAYTDNELLRMIMQDGLKAYGLYWLVMEKLYQQPQHFLAPQTASFIQNLYDVSD ELMESVLYNYGLFYLDEKMNLHSKAIDDYREALDNMEDEKKRNTKPHVNNSLKANGNAED FNTREIKKTSNIQKNPSPIINKEINKKNSSSEESGKEKEEENGLKVTDLNAVTPDMRPNA WEENLQEAMNDTSWYEVVAIQSGIPRLMMEEKEWFFNYLREQIILRGNESSMNSLHEIKN YFANLTRQGSHVSSTTQVALKKFLKNRQEQQQCSPYETITNGIRTYDGHPIPAYAKPRPS AAHIWNPVTNEWTR >gi|222159232|gb|ACAB01000127.1| GENE 31 28957 - 30756 1280 599 aa, chain + ## HITS:1 COG:no KEGG:BT_1172 NR:ns ## KEGG: BT_1172 # Name: not_defined # Def: DNA primase/helicase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 581 1 575 599 481 44.0 1e-134 MDRFKELGIDTHGRTSGKMKTKCPWCHAQRTDKRDKSLSVNLDTKLYLCHYCGAHGSAAI YAGKGRKLGDPLVKFPTATGSNSSCPPTEKPHGDPEQGALTQKQIEWCRDVRHIPPEVLV EAGVAFASISMPISGKEKGWEKRDCLCFNFFENGELVNTKFRDSQKHFKLLQGARTIPYN IDAIRDTPECILVEGEFDALSYMAVGRTDVISVPNGANSQLDWLDELSESHFEQKQVIYL SVDTDRKGRELCRELSRRLGVDRCRIVTYGEAYKDANELLVAEGPDALLKALEDAPIPRL EGTFTAEDLREGLHQLFEEGYTSGVELGIPNLDEIMRLETGRVLTVTGIPGHGKSDFVDE IVLRLCTRQDWRAGYFSPENTPIEYHHAKLAEKLLGHRFRKDFSTEEEFARVVDYLSQRV WHILPDGDFTLGNVLSKARELVHRHGIRVFVIDPYNYINHQIPAGMTETGYIGSFMNSLA RFARLNSCLVILVAHPRKMNKQYGTQKTEVPTMYDINGSANFFNMTDYGIVVDRQDEMGI VYIHVEKTRFRNFGTKGNAAFCYDVTNGRYSPCTPPPEPGMQVQPWKGAKDLFSSEGWI >gi|222159232|gb|ACAB01000127.1| GENE 32 30760 - 30954 272 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713715|ref|ZP_04544196.1| ## NR: gi|237713715|ref|ZP_04544196.1| predicted protein [Bacteroides sp. 
D1] # 1 64 1 64 64 118 100.0 1e-25 MKKTELALLYMPYASADVARRHLNKCIRRNRELLEALEATGWAFWIHWLTPLQVELIEKY LGEP >gi|222159232|gb|ACAB01000127.1| GENE 33 31188 - 31697 647 169 aa, chain + ## HITS:1 COG:no KEGG:BT_2021 NR:ns ## KEGG: BT_2021 # Name: not_defined # Def: putative non-specific DNA-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 167 3 173 175 173 52.0 2e-42 MLIYKAVQSNIASKDKKKKWHPCLIKMGNVVDTQMIGETIAERSSLTAGDVHNVIRNLMA VMREQLLNSRTVKLEGLGTFTMVCQSRCKGVDAEADVSPAQITGLRCQFTPEYTRDAGGN ITRALIQGASYIHVNQLTKALGTGGGGNAGGSGDDKPGGGGQEAPDPAA >gi|222159232|gb|ACAB01000127.1| GENE 34 32032 - 33087 837 351 aa, chain + ## HITS:1 COG:VC1756 KEGG:ns NR:ns ## COG: VC1756 COG0845 # Protein_GI_number: 15641760 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 8 346 19 354 364 150 31.0 5e-36 MKKKFGFVLAAAILLAGCGQKKETTTTTARPVKTTIVESRSIIRKDFSGIVEAVEYVKLA FRVNGQIIQLPVIEGQKVKKGQLIAAIDPRDIALQYAATKSAYETASAQVERNKRLLSRQ AISVQEYEISLANFQKAKSEYELSANNMRDTKLTAPFDGSIEKRLVENYQRVNSGEGIVQ LVNTHNLRIKFTIPDAYLYLLRAKDPRFLVEFDTFKGHVFQAKLEEYLDISTDGTGIPVS ITIDDPSFDRDLYAVKPGFTCSIRFTADVGPLVQDSWTIVPLSAVFGESEGNKMYVWVVE DNKVHKREVTVNAPTGEAQALISEGLKPGEKIVIAGVYQLVEGESITTVDR >gi|222159232|gb|ACAB01000127.1| GENE 35 33154 - 36186 2833 1010 aa, chain + ## HITS:1 COG:VC1757 KEGG:ns NR:ns ## COG: VC1757 COG0841 # Protein_GI_number: 15641761 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Vibrio cholerae # 1 1009 1 1012 1016 596 33.0 1e-170 MSLAKYSLDNTKIIYFFLAVLLIGGITSFGKLGKKEDAPFVIKSAVIMTRYPGAEPAEVE RLITEPISREIQSMSGVYKIKSESMYGLSKITFELQPSLSASSIPQKWDELRRKVLNIQP QLPSGASAPTVSDDFGDVFGIYYGLTADDGYTYEEMRNWAERIKTQVVTADGVMKVALFG TQTEVVNIFISTNKLVGMGIDPKQLASLLQSQNQIINTGEIRAGEQQLRVTANGMYTTVD DIRNQVITTKAGQVKLGDIAVIEKGYMDPPSNIMHVNGKRAIGIGVSTDPQRDVVQTGKN VKAKLDELLPLMPVGLELQSLYLENEIANEANNGFIINLIESILIVIVIIMLVMGLRAGV LIGSSLIFSIGGTLLIMSFFGVGLNRTSLAGFIIAMGMLVDNAIVVTDNAQIAIARGVDR 
RKALIDGATGPQWGLLGATFIAICSFLPLYLAPSAVAEIVKPLFVVLAISLGLSWVLALT QTTVFGNFILKAKAKDGTKDPYDKPFYHKFASILRTLIRRKTLTLGSMVVLFVASLVIMG MMPQNFFPSLDKPYFRADVFYPDGYSINDVVKEMKSVEEHLAKQPEVKKVSITFGSTPLR YYLASTSVGPKPNFANVLVELTDSKYTKEYEEDFDAYMKANYPNAITRTSLFKLSPAVDA AIEIGFIGPNVDTLVALTNQALEIMHRNPDLINVRNSWGNKVPVWKPVYSPERAQPLGVS RQGMAQSIQIGTTGMTLGEYRQGDQVLPILLKDNTVDSFRINDLRTLPVFGTGNETTSLE QVVSEFDFQYRFSNVKDYNRQMVMMAQCDPRRGVNAIAAFNEVWPLVQKEIKVPEGYTMK YFGEQESQVESNEALAKNLPLTFFLMFVTLLFLFRTYRKPTVILLMLPLIFIGIVLGLVL LGKSFDFFSILGLLGLIGMNIKNAIVLVEQIDLEAKTGKKPLDAVVSATTSRIVPVAMAS GTTILGMLPLLFDAMFGGMAATIMGGLLVASALTLFVLPVAYCAIQRIKG >gi|222159232|gb|ACAB01000127.1| GENE 36 36235 - 37509 1206 424 aa, chain + ## HITS:1 COG:FN1273 KEGG:ns NR:ns ## COG: FN1273 COG1538 # Protein_GI_number: 19704608 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Fusobacterium nucleatum # 100 422 87 412 413 62 21.0 2e-09 MKTQLITLSLLLLGLTAGAQQPYYLSREAYRDKVEAYSQILKQQKLKTMASTEARKIAHT GFLPKIDVNADGTLNMSDLNAWNEPIGEYRNHTYQGVFIVSQPLYTGGALNAQHKIAKAD EKLNQLNEELTIDQIHYQSDAVYWNASASQAMLQAADKYQSIVKQQYDIIQDRFNDGMIS RTDLLMISTRLKEAELQYIKARQNYTLALQKLNILMGEEPNNPVDSLYTIDTASAPVQIL SLENVLQRRADYESTEVNIMKSQAQRKAALSQFNPQLNMYFSGGWATATPNLGYDVSFNP IVGINLNIPIFRWGARFKTNRQQKAYISIQKLQQSYVTDNINEELSAALTKLTETEYQVK TAKETMSLANENLDLVSFSYNEGKANMVDVLSAQLSWTQAHTNLINAYLSEKMAVAEYRK VISE >gi|222159232|gb|ACAB01000127.1| GENE 37 37618 - 37968 253 116 aa, chain - ## HITS:1 COG:no KEGG:BVU_0489 NR:ns ## KEGG: BVU_0489 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 4 116 1 113 113 108 54.0 5e-23 MPLMKYRISVTLEFLREMKHLSKRYKSLKEDLRNFGNDLLLNPEQGVSLGNNLRKIRIAI TSKNKGKSGGARVITYTIIWTEIDTEIKLLTIYDKSERANITDKEIEDILKQNGIL >gi|222159232|gb|ACAB01000127.1| GENE 38 37949 - 38212 417 87 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160886440|ref|ZP_02067443.1| ## NR: gi|160886440|ref|ZP_02067443.1| hypothetical protein 
BACOVA_04451 [Bacteroides ovatus ATCC 8483] # 1 87 10 96 96 126 100.0 5e-28 MSAMSLEAEKNELIRRILDVDDVAILRRVKSMLSCEEEQTNVVAEEAAPYQTKAEILASL DQACKELKLNLEGKLEFKSLDDALNEI >gi|222159232|gb|ACAB01000127.1| GENE 39 38316 - 39551 1410 411 aa, chain - ## HITS:1 COG:AF2084 KEGG:ns NR:ns ## COG: AF2084 COG1883 # Protein_GI_number: 11499666 # Func_class: C Energy production and conversion # Function: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit # Organism: Archaeoglobus fulgidus # 24 411 5 354 354 295 48.0 8e-80 MNEIFENLYDMTAFSNIIAEPQFLIMYAIAFVLLYLGIKKQYEPLLLVPIAFGVLLANFP GGDMGVIQADENGMVMVNGVLKNIWEMPLHDIAHELGIMNFIYYMLIKTGFLPPIIFMGV GALTDFGPMLRNLHLSIFGAAAQLGIFTVLLVAILMGFTPQEAASLGIIGGADGPTAIFT TIKLAPHLLGPIAIAAYSYMALVPVIIPLVVKLFCTKKELSINMKEQEKKYPSKTEIKNL RVLKIIFPIVVTTIVALFVPSAVPLVGMLMFGNLVKEIGTNTFRLFDAASNSIMNAATIF LGLSVGATMTAEAFLNWTTIGIVIGGFLAFALSITGGIFFVKLVNLFSKKKINPLIGATG LSAVPMASRVANEIALKYDPKNHVLQYCMASNISGVIGSAVAAGVLISFLA >gi|222159232|gb|ACAB01000127.1| GENE 40 39551 - 41386 2065 611 aa, chain - ## HITS:1 COG:MA0674_1 KEGG:ns NR:ns ## COG: MA0674_1 COG5016 # Protein_GI_number: 20089559 # Func_class: C Energy production and conversion # Function: Pyruvate/oxaloacetate carboxyltransferase # Organism: Methanosarcina acetivorans str.C2A # 10 478 9 449 467 184 29.0 7e-46 MKREVKFSLVFRDMWQSAGKYVPRVDQLVKVAPAIIEMGCFARVETNGGGFEQVNLLFGE NPNKAVREWTKPFHEAGIQTHMLDRALNGLRMSPVPADVRKLFYKVKKAQGTDITRTFCG LNDVRNIAPSITYAKEAGMISQCSLCITHSPIHTVEYYTNMALELIKLGADEICIKDMAG IGRPVSLGKIVANIKAAHPEIPVQYHSHAGPGFNMASILEVCEAGCDYIDVGMEPLSWGT GHADLLSVQAMLKDAGYQVPEINMEAYMKVRGMIQEFMDDFLGLYISPKNRLMNSLLIAP GLPGGMMGSLMADLESNLESINKYKAKHNLPFMTQDQLLIKLFDEVAYVWPRVGYPPLVT PFSQYVKNLAMMNVMAMEKGKERWGMIADDIWDMILGKAGRLPGKLAPEIIEKAEREGRK FFEGDPQNNYPDALDKYRKLMKENKWEVGQDDEELFEYAMHPAQYEAYKSGKAKEDFLED VAKRRAEKDKSPEEDAKPKTLTVQVDGQAYRVTVAYGDAELPAAPAGAATAPAGEGKEVL SPLEGKFFLVKGAQETPLQVGDTVKEGDVICYVEAMKTYNAIRAEFGGTVTAICVNPGDA VSEDDVLMKIG >gi|222159232|gb|ACAB01000127.1| GENE 41 41413 - 41673 390 86 aa, chain 
- ## HITS:1 COG:no KEGG:BT_1698 NR:ns ## KEGG: BT_1698 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 86 1 86 86 112 93.0 6e-24 MENIETAILLMVVGMATVFVILLIVIYLGKLLITLVNKYAPEEVVPVKREASQGPAPVPG NILAAITAAVNVVTQGKGKITKVEKL >gi|222159232|gb|ACAB01000127.1| GENE 42 42028 - 42162 129 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIDRFSPFLVSYSLNLLLNYQIKPALLNLSCAITAFILLISVAV Prediction of potential genes in microbial genomes Time: Wed May 18 04:02:31 2011 Seq name: gi|222159231|gb|ACAB01000128.1| Bacteroides sp. D1 cont1.128, whole genome shotgun sequence Length of sequence - 23568 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 9, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 20 - 862 670 ## BT_1727 putative transmembrane sensor 2 1 Op 2 . - CDS 871 - 1461 488 ## BT_1728 RNA polymerase ECF-type sigma factor 3 1 Op 3 . - CDS 1539 - 3113 1399 ## COG4108 Peptide chain release factor RF-3 - Prom 3133 - 3192 2.2 4 2 Op 1 . - CDS 3211 - 4077 700 ## COG1091 dTDP-4-dehydrorhamnose reductase 5 2 Op 2 . - CDS 4078 - 4623 612 ## BT_1731 hypothetical protein 6 2 Op 3 . - CDS 4635 - 5387 567 ## COG1280 Putative threonine efflux protein - Prom 5530 - 5589 2.9 + Prom 5320 - 5379 5.6 7 3 Tu 1 . + CDS 5547 - 9251 4122 ## COG0046 Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain + Term 9271 - 9312 9.1 + Prom 9374 - 9433 5.6 8 4 Tu 1 . + CDS 9520 - 9912 274 ## BT_1734 two-component system sensor histidine kinase/response regulator, hybrid ('one component system') + Prom 9917 - 9976 2.7 9 5 Op 1 . + CDS 10069 - 13530 1938 ## COG5002 Signal transduction histidine kinase 10 5 Op 2 7/0.000 + CDS 13527 - 14069 441 ## COG2059 Chromate transport protein ChrA + Prom 14078 - 14137 4.4 11 5 Op 3 . + CDS 14195 - 14743 622 ## COG2059 Chromate transport protein ChrA - Term 14578 - 14614 -0.2 12 6 Tu 1 . 
- CDS 14854 - 16362 1353 ## COG1649 Uncharacterized protein conserved in bacteria - Prom 16390 - 16449 4.9 + Prom 16332 - 16391 3.6 13 7 Tu 1 . + CDS 16457 - 19228 2466 ## COG0178 Excinuclease ATPase subunit + Term 19268 - 19309 3.6 - Term 19243 - 19305 6.0 14 8 Op 1 . - CDS 19325 - 19810 560 ## BT_1740 hypothetical protein 15 8 Op 2 . - CDS 19815 - 20045 248 ## BT_1741 hypothetical protein 16 8 Op 3 . - CDS 20042 - 21472 1091 ## COG1966 Carbon starvation protein, predicted membrane protein - Prom 21516 - 21575 2.9 + Prom 21455 - 21514 7.3 17 9 Tu 1 . + CDS 21581 - 23047 1068 ## COG1649 Uncharacterized protein conserved in bacteria + Term 23237 - 23273 5.9 Predicted protein(s) >gi|222159231|gb|ACAB01000128.1| GENE 1 20 - 862 670 280 aa, chain - ## HITS:1 COG:no KEGG:BT_1727 NR:ns ## KEGG: BT_1727 # Name: not_defined # Def: putative transmembrane sensor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 280 1 283 283 323 59.0 4e-87 MEKWTESLEQYLEEGKLPLVGEIFSRKKEIGDKLKLQREESHKITLSRKRKQSVGRRSFS FSWGVAAAVVLLLGVGSYFLAEEKVVTDNTAMNYELPDGSSVQVMENSRLTYNHITWLWE RKLQLLGKASFNVTKGKTFTVSTEAGDVTVLGTKFLVDQQGKKMFVNCEEGSVKVETAVG KRTLLAGESVRCDETKIVPVEKKADESEFPEVLGYEDDPLINVVADIEHIFEVTVVGHEK CEGLTYNGTVLTKDLNATLEKVFGSCGISYQIRGKEIILQ >gi|222159231|gb|ACAB01000128.1| GENE 2 871 - 1461 488 196 aa, chain - ## HITS:1 COG:no KEGG:BT_1728 NR:ns ## KEGG: BT_1728 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 196 1 196 196 289 79.0 3e-77 MKTLSWDKIQQGDEEAFRQLYEQYADLLYGYGMKIAGDDTLVTEAIQSLFVYIFEKRETC AAPQSIPAYLCVSLRHMIVNELKKENSGSLKSLDEVGTSEYQFDLEIDIETAIIRSELEK EQLEVLQKELNNLTKQQREVLYLKYYKKMSSEEIAQVMGLTSRTVYNTTHMAISSLRERM SKSFLLLVAANLWIFN >gi|222159231|gb|ACAB01000128.1| GENE 3 1539 - 3113 1399 524 aa, chain - ## HITS:1 COG:XF0174 KEGG:ns NR:ns ## COG: XF0174 COG4108 # Protein_GI_number: 15836779 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptide chain release factor RF-3 # Organism: 
Xylella fastidiosa 9a5c # 6 523 21 540 548 533 50.0 1e-151 MADNTEILRRRTFAIIAHPDAGKTSLTEKLLLFGGQIQVAGAVKSNKIKKTATSDWMEIE KQRGISVTTSVMEFDYRDYKINILDTPGHQDFAEDTYRTLTAVDSVIIVVDGAKGVETQT RKLMEVCRMRKTPVIIFVNKMDREGKDPFDLLDELEEELMIQVRPLSWPIEQGARFKGVY NIYEQKLDLYQPSKQMVTEKVAVDIHSEELDQQIGKSLADKLRGDLELIEGVYPEFDSES YLAGDCAPVFFGSALNNFGVQELLNCFVEIAPSPRPVQAEEREVKPDEPKFTGFIFKITA NIDPNHRSCVAFCKICSGKFVRNAPYTHVRHGKTMRFSSPTQFMAQRKTTIDEAYAGDII GLPDNGTFKIGDTLTEGEMLHFRGLPSFSPEMFKYIENADPMKQKQLAKGIDQLMDEGVA QLFVNQFNGRKIIGTVGQLQFEVIQYRLLNEYNASCRWEPVSLYKACWVESDDPAELEAF KKRKYQYMAKDREGRDVFLADSGYVLQMAQMDFKHIKFHFTSEF >gi|222159231|gb|ACAB01000128.1| GENE 4 3211 - 4077 700 288 aa, chain - ## HITS:1 COG:CAC2315 KEGG:ns NR:ns ## COG: CAC2315 COG1091 # Protein_GI_number: 15895582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Clostridium acetobutylicum # 1 284 1 280 280 245 49.0 8e-65 MRILVTGANGQLGNEMQVLAKENPQHTYYFTDVQELDICDKQAVWTYMAEKQIELVVNCA AYTAVDKAEDNQELAYKLNCEAPKQLASAAQANGAAMIQVSTDYVFDGTAHTPYTEDCNP CPDSVYGTTKLEGEKEVMNHCEQAVVIRTAWLYSIFGNNFVKTMLRLGKERDSLGVVFDQ IGTPTYANDLARAIYAIINKGIVRGIYHFSNEGVCSWYDFTVAIHRLAGITTCKVKPLHT AEYPAKANRPAYSVLDKTKIKTTFDIEIPHWEESLKQCLIKLGMKNEE >gi|222159231|gb|ACAB01000128.1| GENE 5 4078 - 4623 612 181 aa, chain - ## HITS:1 COG:no KEGG:BT_1731 NR:ns ## KEGG: BT_1731 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 181 1 181 181 291 91.0 1e-77 MNQIAQQLKEKNIAEYLIYMWQEEDLIRANHCEPEEMEANVIARYPEEQRPAMREWYTNL ITMMGEEGVREKGHLQINKNVIINLTELHNALSSSPKFPFYSAAYFKALPFIVELRNKNG KKDEPELETCFEALYGVLLLRLQKKPISEGTAKAVEAITSFLSMLANYYDKDRKGELKLD E >gi|222159231|gb|ACAB01000128.1| GENE 6 4635 - 5387 567 250 aa, chain - ## HITS:1 COG:BS_ycgF KEGG:ns NR:ns ## COG: BS_ycgF COG1280 # Protein_GI_number: 16077378 # Func_class: E Amino acid transport and metabolism # Function: Putative threonine efflux protein # Organism: Bacillus subtilis # 42 201 2 161 209 60 29.0 4e-09 
MYVLLIIAIYSDSCVEKTLIYYPHITKKCDICDMIQIETIFDILVKGFIIGVVVSAPLGP VGVLCIQRTLNKGRWYGFVTGLGASLSDIAYALLTGYGMSFVFDYINKNIFYLQLLGSVM LLLFGIYTFRSNPVQSIRPASSSKGSYFHNFITAFFVTLSNPLIIFLFIGLFARFAFVQP GVLVFEEITGYLAIAIGALTWWLGITYFVNKVRTKFNLRGIWILNRVVGSIVMLVSVAGL IYTLLGESLY >gi|222159231|gb|ACAB01000128.1| GENE 7 5547 - 9251 4122 1234 aa, chain + ## HITS:1 COG:HI0752_1 KEGG:ns NR:ns ## COG: HI0752_1 COG0046 # Protein_GI_number: 16272693 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain # Organism: Haemophilus influenzae # 15 890 65 970 1011 506 38.0 1e-142 MILFFRTPSKSVIAVESNHQLTPDESNKLCWLFGEAVMESEENLKGCFVGPRREMITPWS TNAVEITQNMGLEGISRIEEYFPVKDENADYDPMLQRMYKGLDQNVFTTNRQPEPIIYIE DLEVYNEKEGLALSKEEMDYLKKVEKDLGRKLTDSEVFGFAQINSEHCRHKIFGGTFIID GVEQESSLFQMIKKTTQENPNKIISAYKDNVAFAEGPVVEQFAPADHSKPDFFQVKDIKS VISLKAETHNFPTTVEPFNGASTGTGGEIRDRMGGGKGSWPIAGTAVYMTSYPRTEEGRE WEEILPVRKWLYQTPEQILIKASNGASDFGNKFGQPLICGSVLTFEHTENKEVYGYDKVI MLAGGVGYGTQRDCLKGTPEAGNKVVVIGGDNYRIGLGGGSVSSVDTGRYSSGIELNAVQ RANAEMQKRANNVVRALCEEEVNPVVSIHDHGSAGHVNCLSELVEECGGLIDMSKLPIGD KTLSAKEIIANESQERMGLLIKEEAIEHVRKIAERERAPMYVVGETTGDHRFAFQQADGV RPFDLAVEQMFGSSPKTYMVDKTVERHYEMPKYELSKLHEYLTNVLQLEAVACKDWLTNK VDRSVTGKVARQQCQGELQLPLSDCGVVALDYRGEKGIATSIGHAPQAALADPAAGSILS VSEALTNLVWAPMAEGMDSISLSANWMWPCRSQEGEDARLYTAVKALSDFCCALQINVPT GKDSLSMTQKYPNGEKVISPGTVIVSAGGEVSDVKKVVSPVLVNNEKTTLYHIDFSFDEL KLGGSAFAQSLGKVGDEVPCVQDAEYFRDAFLAVQELVNKGLILAGHDISAGGLITTLLE MCFSNVEGGMEISLDKMKEEDIVKILFAENPGIVIQINDKHKDEVKKILEDAGVGYIKLG KPTDERHILVSKDGATYQFGIDYMRDVWYSSSYLLDRKQSMNGCAKARFENYKMQPVEFA FMPEFKGKLSQYGITPDRRTPSGIRAAIIREKGTNGEREMAYSLYLAGFDVKDVTMTDLI SGRETLEDVNMIVYCGGFSNSDVLGSAKGWAGAFLFNPKAKEALDKFYAREDTLSLGVCN GCQLMMELNLINPELKKKGKMLHNNSHKFESRFLGLTIPTNRSVMFGSLSGSKLGIWVAH GEGKFSLPYDEDKYNVVAKYSYDEYPGNPNGSDYSIAALASADGRHLAIMPHLERSIFPW QNGCYPADRKNSDQVTPWIEAFVNARKWVEAKMK >gi|222159231|gb|ACAB01000128.1| GENE 8 9520 - 9912 274 130 aa, chain + ## HITS:1 COG:no KEGG:BT_1734 NR:ns ## KEGG: 
BT_1734 # Name: not_defined # Def: two-component system sensor histidine kinase/response regulator, hybrid ('one component system') # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 130 1 130 1346 196 82.0 2e-49 MKKWLFLILLFPLICVAQTYRYLGVEDGLSNRRVYCIQKDKTGYMWFLTHEGIDRYNGKE FKRYKLMDGDAEVNSLLNLNWLYIDQEGVLWEIGKKGKVFRYDQIHDCFSLVYKLPMESF SEQPDPVTYA >gi|222159231|gb|ACAB01000128.1| GENE 9 10069 - 13530 1938 1153 aa, chain + ## HITS:1 COG:BS_yycG KEGG:ns NR:ns ## COG: BS_yycG COG5002 # Protein_GI_number: 16081092 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 614 842 369 599 611 117 31.0 2e-25 MGIHYAQLENNALKLINCDKLESVKAQVSDLHFDKKIRKLFIGTFLRGIMVYDMNTKSVI RPDYNLKDISITRFKPLNDKELLIATDGGGVHKINMDTYQIVPEIVADYNSNCGMNGNSI NDIFVDDEERIWLANYPIGITIQNNRYTSYKWIKHSIGNKQSLINDQVNAIIEDREGDLW FATNNGISFFNSKTGQWRSVLSAFEESQGNKSHIFLTICEVAPGIIWAGGYSSGAYQIDK KTFSVSYFMPPLYTHTNKRPDKYIRDIRTDMQGYIWSGGFYNLKRINLKTQEVRFYDGLN SITAIVEKDEKSMWIGSATGLCLLDKESGKFERIKLPVESSYIYSLYQAKNGSLYIGTSG SGLLIYDINKKLFTHYHTENCALISNNIYTILSDADKDIIMSTESGLTSFYPNEKKFYNW TKDMGLMTTHFNALSGTLRKNNKFILGSSDGAVEFDKDMKLPRSYSSKMIFSDFKLFYQT IYPGDEDSPLKASINDTKVLKLKYNQNIFSLQVSSINYDYPSNILYSWRLEGFYDKWSKP GTENTIRYTNLAPGKYTLRVRAISNEDKRIMLEERSMDIIIAQPFWLTFWAMLVYTAILC LIAIVLLRILILRKQRKVSDEKIHFFINTAHDIRTPLTLIKAPLEELREKEELSKEGISN MNTALRNVNALLRLTTNLINFERADVYSSELYISEHELNTFMNEIFNAFQQYANIKHINF TYESNFRYMNVWFDKEKMESIFKNIISNALKYTPENGNVQVFVSETSDSWSVEVRDTGIG IPANEQKKLFKLHFRGSNAINSKVTGSGIGLMLVWKLVRLHKGKINLSSIENQGSVIKIT FPKDSKRFRKAHLATPSKQRIENTIDNVPPPSPEIYENAQKKENINHRRILIVEDNDELR NYLSQTLSEEYFVQVCSNGKEALTIIPEYKPELVISDIMMPEMRGDELCQAIKNNIETSH IPVILLTALNNEKDILSGLQIGADEYVVKPFNIGILKANVANLLANRALLRSKFANLDLN DEENDEDCINCSQDIDWKFIANVKKNVEDNIDNPALTVDVLCSLMGMSRTSFYNKLRALT DQAPGDYIRLIRLKRAVQLLKEDTHSITEIAEMTGFSDAKYFREVFKKHYNVSPSQYGKE KKAVSKEGGEKKE >gi|222159231|gb|ACAB01000128.1| GENE 10 13527 - 14069 441 180 aa, chain + ## HITS:1 COG:FN0712 KEGG:ns NR:ns ## COG: FN0712 COG2059 # 
Protein_GI_number: 19704047 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 2 169 4 171 186 156 51.0 2e-38 MNIYLEAFGIFFKIGAFTIGGGYAMVPLIENEIVTKRKWIAQEDFIDLLAISQSAPGILA VNISIFIGYKLRGIRGSIITALGTILPSFIIILAIALFFHSFKDNPIVERIFKGIRPAVV ALIAAPTFTMGRSAKINRYNLWIPVVSAILIWLLGFSPIWIIIAAGVGGFLWGKFRKVES >gi|222159231|gb|ACAB01000128.1| GENE 11 14195 - 14743 622 182 aa, chain + ## HITS:1 COG:FN0713 KEGG:ns NR:ns ## COG: FN0713 COG2059 # Protein_GI_number: 19704048 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 1 182 1 173 176 120 46.0 1e-27 MIYLQLFYTFFKIGLFGFGGGYAMLSMIQGEVVTRYGWVSSQEFTDIVAISQMTPGPIGI NAATYVGFTSTGSVWGSIIATFAVVLPSFILMLTISKFFLKYQKHPIVESIFNGLRPAVV GLLASAALVLMNAENFGSPTEDTYSFVISIIIFLIAFIGTRKYKANPILMIIACGIAGLL LY >gi|222159231|gb|ACAB01000128.1| GENE 12 14854 - 16362 1353 502 aa, chain - ## HITS:1 COG:BS_yngK KEGG:ns NR:ns ## COG: BS_yngK COG1649 # Protein_GI_number: 16078889 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 4 501 6 510 510 373 40.0 1e-103 MKLKNYLLLLALLLVVGVRAQVPSGNKYPKREFRGAWIQAVNGQFRGIPTERLKQILISQ LNSLQEAGINAIIFQVRPEADALYASQHEPWSRFLTGTQGQMPSPMWDPMQFMIEECQKR NMEFHAWINPYRVKTSLKNKLAPEHIYHQHPEWFVTYGDQLYFDPALPESREHICKIVTD IVSRYDVDAIHMDDYFYPYPVNGLDFPDDASFARYGGGFTNKADWRRSNVNVLIKKLHET IRGIKPWVKFGVSPFGIYRNQKSDPLGSNTNGLQNYDDLYADVLLWAREGWIDYNIPQIY WEIGHKAADYETLVKWWATHSENRPLFIGQSVPKTVQFADPQNPSINQLPRKMALQRAYQ TIGGSCQWYAAAVVENQGRYRDALISEYHKYPALVPVFDFMDDKAPDKVRKMKKVWTEDG YILFWTAPKADTEMDKAVRYVVYRFGSKEKVNLDDPSHIVAITRNPFYKLPYETGKTKYR YVVTALDRLHNESKSVSKKLKL >gi|222159231|gb|ACAB01000128.1| GENE 13 16457 - 19228 2466 923 aa, chain + ## HITS:1 COG:MTH443 KEGG:ns NR:ns ## COG: MTH443 COG0178 # Protein_GI_number: 15678471 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: 
Methanothermobacter thermautotrophicus # 7 923 11 948 962 844 47.0 0 MSENNYISIKGARVNNLKNIDVDIPRNKLVVITGLSGSGKSSLAFDTLYAEGQRRYVESL SSYARQFLGRMSKPECDFIKGIPPAIAIEQKVNSRNPRSTVGTSTEIYEYLRLLYSRVGK TYSPISGQEVKKHSTEDIVNCMLSYPEGTRYTILTPIRLREDRTLQQQLEIDLKQGFNRI EVNGEMKRIDEYTPVAGDEVYLLVDRMAVATSKDAISRLTDSAETAMYEGEGTCMLRFFL SDGTTKLHTFSTKFEADGIIFEEPNDQMFSFNSPIGACPACEGFGKVIGIDEHLVVPDRS LSVYEGAIVCWRGEKMGEWKEELIHNAEKFDFPIFTPYYELTDAQRRLLWEGNQYFHGIN DFFKMLEENQYKIQYRVMLARYRGKTLCPKCHGTRLKPEAGYVRVGGKNISELVDLPITE LKQFFDHLELDEHDSNVSRRILIEINSRIRFLIDVGLGYLTLNRLSNSLSGGESQRINLA TSLGSSLVGSLYILDEPSIGLHSRDTDRLLHVLHQLQQLGNTVVVVEHDEEIIRAADYII DIGPNAGRLGGEVVYQGDMKDLKKGSNSYTVRYLLGEDEIPVPDHRRPWNNYIELKGARE NNLKGVNVRIPLNVMTVVTGVSGSGKSTLVRDIFFRALKRELDECSDRPGEFTSIGGSLR DLRNVEFVDQNPIGKSSRSNPVTYIKAYDEIRKLWSEQPLAKQMGYTPGFFSFNSEGGRC EECKGEGTITVEMQFMADLVLECESCHGKRFKSDTLEVKFNDKSIYDVLEMTVNQAIEFF NEHGQKKIVKKLLPLQDVGLGYIKLGQASSTLSGGENQRVKLAFYLSQEKADPTMFIFDE PTTGLHFHDIRKLLDAFDALIRRGHSIVIIEHNMDVIKCADYVIDLGPEGGDKGGNIVAV GTPEEVAACGASYTGQFLKEKLG >gi|222159231|gb|ACAB01000128.1| GENE 14 19325 - 19810 560 161 aa, chain - ## HITS:1 COG:no KEGG:BT_1740 NR:ns ## KEGG: BT_1740 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 161 1 161 161 278 86.0 6e-74 MKKLALALCLLAVSFTAQAQFEKGTTIINPSLSGLDFSYSKNDKAKFGVGAQVGTFFAEG IALMVNAGADWSKPVDEYTLGTGVRFYFNKTGIYLGGGLDWNRFRWSGGKHQTDWGLGIE AGYAYFLSRTVTIEPAVYYKWRFNDGDMSRFGVKIGFGFYL >gi|222159231|gb|ACAB01000128.1| GENE 15 19815 - 20045 248 76 aa, chain - ## HITS:1 COG:no KEGG:BT_1741 NR:ns ## KEGG: BT_1741 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 76 1 76 76 86 89.0 2e-16 MKKIRKSTGVALAFLIYVSVTAAYLLPRNTEVSLTEKIITVAVSYVIVLLLWLVLRKKEQ MRERRKKDEQYIHLKK >gi|222159231|gb|ACAB01000128.1| GENE 16 20042 - 21472 1091 476 aa, chain - ## HITS:1 COG:MA1905 KEGG:ns NR:ns ## COG: MA1905 COG1966 # Protein_GI_number: 20090754 # Func_class: T Signal transduction mechanisms # Function: 
Carbon starvation protein, predicted membrane protein # Organism: Methanosarcina acetivorans str.C2A # 1 448 1 447 479 412 51.0 1e-114 MITFTLCLLALIVGYFTYGRLMERVFGPDDRKTPALTKADGVDYIPLPTWKIFMIQFLNI AGLGPIFGAIMGAKFGSSSYLWIVLGSIFAGAVHDYFAGMLSLRNGGESLPEIIGRYLGL TTKQVMRGFTVILMILVGSVFVAGPAGLLAKLTPESLDATFWIIVVFAYYILATLLPVDK IIGKIYPLFAVALLFMAVGILVMLYVNHPALPELWDGLQNTNPEASELPIFPIMFVSIAC GAISGFHATQSPLMARCMTSERHGRPVFYGAMITEGIVALIWAAAATYFFHENGMEESNA SVIVDAITKEWLGAIGGVLAILGVIAAPITSGDTAFRSARLIVADFLGMEQKSMRRRLYI CIPMFVLAIGLLLYSLRDANGFNMIWRYFAWANQTLAVFTLWAITVFLAVSKKPYIITLV PALFMTCVCSTYICIAPEGLGLSHTVSYGVGIACVMVATVWFYIWINKQKTRKLSE >gi|222159231|gb|ACAB01000128.1| GENE 17 21581 - 23047 1068 488 aa, chain + ## HITS:1 COG:BS_yngK KEGG:ns NR:ns ## COG: BS_yngK COG1649 # Protein_GI_number: 16078889 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 6 485 20 509 510 233 31.0 5e-61 MRYTIFILSLLFPFLSIVAQPKHEVRAAWVTAVYGLDWPRTRATTPQTIRKQKEELIDIL DKLKAANFNTILFQTRTRGDVLYPSAIEPFNSILTGKTGGNPGYDPLAFAVEECHKRGME CHAWMVTIPLGNKKHVASLGSQSVTKRMKEICVPYKREYFLNPGHPATKEYLMKLVREVV SGYDVDGVHFDYLRYPENAPLFPDKYDFRRYNKGRTLDQWRRDNISEIVRYIYKGVKAMK PWVKVSTCPVGKYRDTSRYPSRGWNAFFTVYQDPQGWMGEGIMDQIYPMMYFQGNNFYPF ALDWQEQSNGRQVIPGLGIYFLHPDEGKWTRDEIDRQMNFIRKQKMAGEGHYRVKYLMEN TQGIYDELSENFYAYPALQPPMPWLDNVPPTAPSALKVTHIDNGYTELTWQAATDNDQRN KPMYVIYASNDYPVDINRPENIIAQNIRETSYVYAPILPWNTKKHFAVTAIDRYGNESTA TQEQTVQQ Prediction of potential genes in microbial genomes Time: Wed May 18 04:03:16 2011 Seq name: gi|222159230|gb|ACAB01000129.1| Bacteroides sp. D1 cont1.129, whole genome shotgun sequence Length of sequence - 109193 bp Number of predicted genes - 81, with homology - 80 Number of transcription units - 45, operones - 21 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 40 - 99 2.5 1 1 Tu 1 . + CDS 129 - 1685 1422 ## BDI_2898 hypothetical protein - Term 1577 - 1611 -0.8 2 2 Tu 1 . 
- CDS 1696 - 2898 777 ## BT_1745 hypothetical protein - Prom 3020 - 3079 4.7 + Prom 2872 - 2931 9.4 3 3 Op 1 . + CDS 3061 - 4239 939 ## COG1373 Predicted ATPase (AAA+ superfamily) 4 3 Op 2 . + CDS 4272 - 7820 3546 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit + Term 7834 - 7888 7.3 - Term 7833 - 7868 5.3 5 4 Op 1 14/0.000 - CDS 7973 - 8839 872 ## COG2113 ABC-type proline/glycine betaine transport systems, periplasmic components 6 4 Op 2 16/0.000 - CDS 8860 - 9684 814 ## COG4176 ABC-type proline/glycine betaine transport system, permease component 7 4 Op 3 . - CDS 9681 - 10907 1401 ## COG4175 ABC-type proline/glycine betaine transport system, ATPase component - Prom 11064 - 11123 3.2 8 5 Tu 1 . - CDS 11207 - 11545 196 ## COG3695 Predicted methylated DNA-protein cysteine methyltransferase - Prom 11596 - 11655 8.9 + Prom 11486 - 11545 4.3 9 6 Tu 1 . + CDS 11628 - 13016 744 ## BF3314 hypothetical protein - Term 13011 - 13071 17.6 10 7 Tu 1 . - CDS 13097 - 15940 2250 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 16092 - 16151 8.2 - Term 16290 - 16335 7.2 11 8 Op 1 2/0.000 - CDS 16359 - 17246 1011 ## COG0524 Sugar kinases, ribokinase family 12 8 Op 2 . - CDS 17282 - 18448 1265 ## COG0738 Fucose permease - Prom 18475 - 18534 5.4 - Term 18616 - 18666 6.0 13 9 Tu 1 . - CDS 18790 - 20625 1770 ## COG1621 Beta-fructosidases (levanase/invertase) - Term 20646 - 20686 7.3 14 10 Op 1 . - CDS 20726 - 22294 1590 ## BT_1760 glycosylhydrolase 15 10 Op 2 . - CDS 22312 - 23697 1298 ## BT_1761 hypothetical protein 16 10 Op 3 . - CDS 23724 - 25436 1781 ## BT_1762 hypothetical protein 17 10 Op 4 . - CDS 25464 - 28589 3291 ## BT_1763 hypothetical protein - Prom 28613 - 28672 8.5 + Prom 29058 - 29117 4.3 18 11 Tu 1 . + CDS 29154 - 30026 585 ## COG2017 Galactose mutarotase and related enzymes - Term 30028 - 30072 5.2 19 12 Tu 1 . 
- CDS 30198 - 32078 1369 ## COG1621 Beta-fructosidases (levanase/invertase) - Prom 32106 - 32165 3.5 + Prom 32168 - 32227 5.2 20 13 Tu 1 . + CDS 32301 - 33323 712 ## BT_1767 hypothetical protein 21 14 Op 1 . - CDS 33411 - 34256 680 ## COG4667 Predicted esterase of the alpha-beta hydrolase superfamily 22 14 Op 2 . - CDS 34237 - 36459 1554 ## COG0475 Kef-type K+ transport systems, membrane components - Prom 36511 - 36570 5.5 + Prom 36529 - 36588 7.1 23 15 Tu 1 . + CDS 36644 - 37171 354 ## BT_1784 hypothetical protein - Term 37063 - 37093 1.3 24 16 Op 1 . - CDS 37135 - 37449 69 ## gi|298480764|ref|ZP_06998960.1| conserved hypothetical protein 25 16 Op 2 . - CDS 37436 - 37789 215 ## BF3187 hypothetical protein - Prom 37826 - 37885 3.8 26 17 Tu 1 . + CDS 37794 - 38978 1217 ## COG3579 Aminopeptidase C + Term 39007 - 39048 1.2 - Term 39520 - 39574 0.4 27 18 Tu 1 . - CDS 39686 - 40537 690 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 40753 - 40812 5.0 28 19 Op 1 . + CDS 40777 - 42135 1099 ## BT_1798 hypothetical protein 29 19 Op 2 . + CDS 42167 - 45013 2492 ## COG1629 Outer membrane receptor proteins, mostly Fe transport + Term 45042 - 45080 9.3 + Prom 45053 - 45112 3.3 30 20 Op 1 . + CDS 45149 - 45829 619 ## BT_1803 hypothetical protein 31 20 Op 2 . + CDS 45870 - 47744 1785 ## COG3669 Alpha-L-fucosidase + Term 47898 - 47955 12.7 + Prom 47766 - 47825 4.7 32 21 Tu 1 . + CDS 48052 - 48582 374 ## PROTEIN SUPPORTED gi|229873878|ref|ZP_04493445.1| acetyltransferase, ribosomal protein N-acetylase + Prom 48690 - 48749 3.1 33 22 Op 1 29/0.000 + CDS 48841 - 49713 1107 ## COG2086 Electron transfer flavoprotein, beta subunit 34 22 Op 2 3/0.000 + CDS 49716 - 50735 1154 ## COG2025 Electron transfer flavoprotein, alpha subunit 35 22 Op 3 . + CDS 50742 - 52448 1700 ## COG1960 Acyl-CoA dehydrogenases + Term 52475 - 52513 6.2 + Prom 52472 - 52531 6.9 36 23 Op 1 . 
+ CDS 52600 - 54354 1176 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) 37 23 Op 2 . + CDS 54362 - 54781 276 ## BT_1808 hypothetical protein 38 23 Op 3 . + CDS 54819 - 59879 4591 ## BT_1809 hypothetical protein 39 23 Op 4 . + CDS 59887 - 60357 334 ## PROTEIN SUPPORTED gi|15902812|ref|NP_358362.1| hypothetical protein spr0768 + Term 60388 - 60432 12.4 - Term 60420 - 60464 11.6 40 24 Op 1 . - CDS 60470 - 60784 325 ## BT_1811 hypothetical protein 41 24 Op 2 . - CDS 60799 - 61764 1132 ## COG2214 DnaJ-class molecular chaperone - Prom 61784 - 61843 4.4 42 25 Op 1 . - CDS 61859 - 63562 1150 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase 43 25 Op 2 . - CDS 63549 - 64742 811 ## BT_1814 hypothetical protein 44 25 Op 3 . - CDS 64838 - 65851 1066 ## COG2008 Threonine aldolase - Prom 65872 - 65931 4.2 - Term 65937 - 65993 14.5 45 26 Tu 1 . - CDS 66023 - 66241 249 ## BT_1819 hypothetical protein - Prom 66450 - 66509 6.8 + Prom 66231 - 66290 5.0 46 27 Tu 1 . + CDS 66477 - 68216 1698 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] + Term 68255 - 68311 12.2 + Prom 68275 - 68334 7.8 47 28 Op 1 . + CDS 68372 - 69289 633 ## COG4984 Predicted membrane protein 48 28 Op 2 . + CDS 69273 - 70247 320 ## BT_1824 putative permease 49 28 Op 3 . + CDS 70234 - 70731 354 ## COG4929 Uncharacterized membrane-anchored protein + Term 70742 - 70786 8.1 - Term 70910 - 70954 6.6 50 29 Tu 1 . - CDS 71018 - 73870 2383 ## BT_2486 hypothetical protein - Prom 73892 - 73951 4.9 51 30 Tu 1 . - CDS 74245 - 75183 842 ## BT_4479 integrase protein - Prom 75206 - 75265 5.8 - Term 75261 - 75313 13.2 52 31 Op 1 41/0.000 - CDS 75345 - 76982 1674 ## PROTEIN SUPPORTED gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 53 31 Op 2 . 
- CDS 77026 - 77298 432 ## COG0234 Co-chaperonin GroES (HSP10) - Prom 77456 - 77515 5.9 + Prom 77462 - 77521 6.2 54 32 Tu 1 . + CDS 77561 - 78592 859 ## BT_1831 hypothetical protein + Term 78665 - 78717 2.5 - Term 78898 - 78941 9.2 55 33 Op 1 . - CDS 79003 - 79710 482 ## gi|237713801|ref|ZP_04544282.1| conserved hypothetical protein 56 33 Op 2 . - CDS 79770 - 80318 461 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 57 33 Op 3 . - CDS 80389 - 80562 232 ## gi|237713803|ref|ZP_04544284.1| conserved hypothetical protein - Prom 80591 - 80650 4.5 + Prom 80430 - 80489 6.6 58 34 Op 1 . + CDS 80522 - 80722 78 ## 59 34 Op 2 . + CDS 80706 - 82175 1413 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain 60 34 Op 3 . + CDS 82156 - 83232 605 ## COG0502 Biotin synthase and related enzymes 61 34 Op 4 . + CDS 83285 - 84706 1376 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes 62 34 Op 5 . + CDS 84718 - 85908 933 ## COG1160 Predicted GTPases + Term 85935 - 85994 15.6 - Term 85891 - 85931 -0.1 63 35 Tu 1 . - CDS 86051 - 87415 1659 ## COG0124 Histidyl-tRNA synthetase - Prom 87474 - 87533 4.8 - Term 87552 - 87599 8.0 64 36 Tu 1 . - CDS 87656 - 88336 782 ## COG2738 Predicted Zn-dependent protease - Prom 88362 - 88421 4.0 65 37 Tu 1 . - CDS 88457 - 89755 1123 ## COG3669 Alpha-L-fucosidase - Prom 89781 - 89840 2.7 - Term 89797 - 89843 8.1 66 38 Op 1 . - CDS 89865 - 91136 1604 ## COG0104 Adenylosuccinate synthase 67 38 Op 2 . - CDS 91133 - 91621 248 ## COG0735 Fe2+/Zn2+ uptake regulation proteins - Prom 91687 - 91746 3.5 - Term 91774 - 91841 13.0 68 39 Op 1 . - CDS 91878 - 92543 716 ## BT_1845 hypothetical protein 69 39 Op 2 . - CDS 92550 - 94598 2227 ## BT_1846 putative dipeptidyl-peptidase III - Prom 94627 - 94686 4.6 - Term 94619 - 94668 5.4 70 40 Tu 1 . - CDS 94694 - 95158 499 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 95209 - 95268 6.3 + Prom 95216 - 95275 10.9 71 41 Tu 1 . 
+ CDS 95429 - 97237 1342 ## COG0514 Superfamily II DNA helicase 72 42 Op 1 . - CDS 97444 - 98262 953 ## COG0457 FOG: TPR repeat 73 42 Op 2 . - CDS 98290 - 99240 729 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 - Prom 99377 - 99436 9.4 74 43 Op 1 . + CDS 99372 - 101207 1238 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily + Prom 101219 - 101278 4.3 75 43 Op 2 . + CDS 101303 - 101992 481 ## COG0671 Membrane-associated phospholipid phosphatase + Term 102104 - 102150 14.1 - Term 102091 - 102137 14.1 76 44 Op 1 11/0.000 - CDS 102210 - 103271 1358 ## COG0473 Isocitrate/isopropylmalate dehydrogenase 77 44 Op 2 . - CDS 103332 - 104879 1796 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases 78 44 Op 3 30/0.000 - CDS 104861 - 105457 753 ## COG0066 3-isopropylmalate dehydratase small subunit 79 44 Op 4 6/0.000 - CDS 105499 - 106893 1616 ## COG0065 3-isopropylmalate dehydratase large subunit 80 44 Op 5 . - CDS 106982 - 108478 1599 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases - Prom 108500 - 108559 6.4 - Term 108803 - 108855 10.7 81 45 Tu 1 . 
- CDS 108881 - 109114 183 ## BT_1862 hypothetical protein Predicted protein(s) >gi|222159230|gb|ACAB01000129.1| GENE 1 129 - 1685 1422 518 aa, chain + ## HITS:1 COG:no KEGG:BDI_2898 NR:ns ## KEGG: BDI_2898 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 518 1 518 522 657 63.0 0 MNDLLRKLPIGIQTFEKLRKGDYLYVDKTELVWKIAYTSTPYFLSRPRRFGKSLLLSTFE AYFEGKKELFEGLAIEKLETKWEKHPVLHLDLNAEKYDSPERLYDILSRQLTLWEIQYGK GIDENTLSGRFSGVIRRAYEQTGSSVVVLVDEYDKPLLQALGNDVLLDEYRKTLKAFYGV LKSADRYLRFVFLTGVTKFSQVSVFSDLNQLQDITLWPDYGTLCGITLQELLDTFQPEIK ILAANNSISYDDAVQRMTRLYDGYHFCINSVGIFNPFSVLNVLKSKVFDNYWFQTGTPTF LVEMLQETEYDLRTLLDGIEAPSSMFSEYRVDSNNPIPLIYQSGYLTIKDFDREFGNYLL QFPNDEVRYGFINFLVPFYTGVRNSDQGFYIGKFVQELRSGDYDAFLTRLQAFFADFTYE LNEQTERHYQVVFYIVFKLMGQFTDAEVRSARGRADAVVKTPKYIYVFEFKLHDTAEAAL KQIDDKGYLIPYQADGREVIKIGVEFSAEKRNISRWLV >gi|222159230|gb|ACAB01000129.1| GENE 2 1696 - 2898 777 400 aa, chain - ## HITS:1 COG:no KEGG:BT_1745 NR:ns ## KEGG: BT_1745 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 396 1 395 398 543 66.0 1e-153 MTKKALYIFNPEHDLALASGETNYMAPASARRMASELALLPMWYVEEGSAVLASSAYNLG YVKKIQELLGLSVDLMTEPELAIEPNLDIRPWGWDVALRKRLSGLGVDEALLPSMEQLNG LREYSHRSKAVSLLPELQLNEYFCGESYYLKTQEEWKTFVEERECCLLKAPLSGSGKGLN WCKGIFTPFISGWCTRVAASQGGIIAEPIYNKVEDFAMEFYSDGTGEVTFMGYSLFHTGK SGMYEGNRLLSNEAIWKQLSQYVPSKVLTDLENCLKYRLSALVGTVYKGYLGVDMMICRF PENEKPVFRIHPCVEINLRMNMGVVARFLYDRYVRSGSTGRFVIDYHPSEGEALQEHERM SATYPLEIREGRVYSGYLPLVPVHRRSCYRAWIWVTPDNI >gi|222159230|gb|ACAB01000129.1| GENE 3 3061 - 4239 939 392 aa, chain + ## HITS:1 COG:MJ1637 KEGG:ns NR:ns ## COG: MJ1637 COG1373 # Protein_GI_number: 15669833 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Methanococcus jannaschii # 33 290 73 342 473 81 28.0 3e-15 MESFYRTHAYLVEHTNAPVRRDLMDEIDWSDRLIGIKGTRGVGKTTFLLQYAKEKFGNDR SCLFINMNNFYFSGHSIVDFANEFQKRGGKVLLIDQVFKHPEWSKELRMCYDRFPNLKIV 
FTGSSVMRLKEENLELRDIAKSYNLRGFSFREFLNLQTGMKFRAYSLEEILSTHEQIAKG VLSKVRPLDYFQDYLHHGFYPFFLEKRNFSENLLKTMNMMVEVDILLIKQIELKYLSKIK KLLYLLAVDGPKAPNVSQLASDIQTSRATVMNYIKYLADARLINLVYPKGEEFPKKPSKI MMHNSNLMYSIYPVKVEEQDVLDTFFANSLWKDHKVHKGDKNVSFMVDEVMPFKICQEGA KIKNNPNVTYALHKAEIGRGNQIPLWMFGFLY >gi|222159230|gb|ACAB01000129.1| GENE 4 4272 - 7820 3546 1182 aa, chain + ## HITS:1 COG:CAC2499_1 KEGG:ns NR:ns ## COG: CAC2499_1 COG0674 # Protein_GI_number: 15895764 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Clostridium acetobutylicum # 5 411 3 409 413 592 69.0 1e-168 MTKQKKFITCDGNQAAAHISYMFSEVAAIYPITPSSTMAEYVDEWAAAGRKNIFGETVLV QEMQSEGGAAGAVHGSLQAGALTTTYTASQGLLLMIPNMYKIAGEFLPCVFHVSARTLAS HALCIFGDHQDVMSARQTGFAMLAEGSVQEVMDLAGVAHLATIKSRVPFMNFFDGFRTSH EIQKIEMLENDDLAPLIDQEALAEFRARALNPMKPVARGMAENPDHFFQHRESCNNYYEA VPAIVEEYMNEISKITGRKYGLFDYYGAEDAERVIIAMGSVTEAAREAIDYLVANGEKVG MVAVHLYRPFSAKHFLAAVPKTAKTIAVLDRTKEPGANGEPLYLDVKDCFYGAENAPVIV GGRYGLGSKDTTPAQILSVFENLAMPMPKNHFTIGIVDDVTFTSLPQKEEIALGGEGMFE AKFYGLGADGTVGANKNSVKIIGDNTDKHCQAYFSYDSKKSGGFTCSHLRFGDNPIRSTY LVNTPNFVACHVQAYLHMYDVTRGLRKNGSFLLNTIWEGEELAKNLPNKVKKYFAQNNIS VYYINATQIALEIGLGNRTNTILQSAFFRITGVIPVEQAVEQMKKFIVKSYGKKGEDVVN KNYAAVDRGGEYKTLAVDPAWANLPDDAKAENNDPAFINEVVRPINAQDGDLLPVSAFKG IEDGTWYQGTAKYEKRGVAAFVPEWNAENCIQCNKCAYVCPHASIRPFVLDAEEQKGAKF EQLKAVGKAFDGMTFRIQVDVLDCLGCGNCADICPGNPKKGGKALTMKHLESQLAQAENW TYCAENVKTKQHLVDIKANVKNSQFATPLFEFSGACSGCGETPYVKLISQLFGDREMVAN ATGCSSIYSGSVPSTPYTTNENGHGPAWANSLFEDFCEFGLGMELANEKMRARLVKVMNE AIAADCTPAEVKELFAEWINNMLDADKTKELAAKIIPVVEANKDKCNHCKQIAELQQYLV KRSQWIIGGDGASYDIGYGGLDHVIASGKDVNILVLDTEVYSNTGGQSSKATPVGAIAKF AAAGKRVRKKDLGLMATTYGYVYVAQIAMGADQAQTLKAIREAEAYPGPSLIIAYAPCIN HGLKAGMGKSQEEEEKAVKCGYWHLWRYNPALEEEGKNPFQLDSKEPNWEDFQGFLKGEV RYASVMKQYPAEAEELFKAAEENAKWRYNSYKRLARENWGAE >gi|222159230|gb|ACAB01000129.1| GENE 5 7973 - 8839 872 288 aa, chain - ## HITS:1 COG:MA2147 KEGG:ns NR:ns ## COG: 
MA2147 COG2113 # Protein_GI_number: 20090990 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport systems, periplasmic components # Organism: Methanosarcina acetivorans str.C2A # 30 286 58 314 315 182 38.0 5e-46 MRIYKIVGIVLSAILLLASCTNSDLEKKKVKIAYANWLEGIAMSHLAKVVLEEHGYEVEL QNADLAPIFVSMSGKKSDVFLDAWLPITMKDYMDQYGDSIEFLGEVYGEARVGLVVPQYV TIQSISELEANKDRFSSEIVGIDAGAGIMKTTDKAIAAYGLDGYTLMTSSSSTMLASLKK AMDKGEWIVITGWTPHWMFDQFDLKFLDDPKKVYGDLEEIHAIAWKGFSEKDPFAAEFFG NIKLTTEELSSFMTAMKDARMDEEEIARKWRDEHRQLVDSWIPKSENK >gi|222159230|gb|ACAB01000129.1| GENE 6 8860 - 9684 814 274 aa, chain - ## HITS:1 COG:BMEII0549 KEGG:ns NR:ns ## COG: BMEII0549 COG4176 # Protein_GI_number: 17988894 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport system, permease component # Organism: Brucella melitensis # 3 265 6 268 301 256 56.0 3e-68 MINIGQYIEMVINWMMVHFSTFFDAVNAGIGSFIIGFQHVLFGIPFYITILVLAALAWMK AGRGTAIFTALGLLLIYGMGFWEATMQTLALVFSSTCLALIVGVPLGVWTANSPRAEKIL RPVLDLMQTMPAFVYLIPAVLFFGLGAVPGVFATIIFAMPPVVRLTGLGIRQVPKNVVEA SRSFGATRWQLLYKVQLPLALPTILTGVNQTIMMSLSMVVIAAMIAAGGLGEIVLKGITQ MKIGLGFEGGIAVVILAIILDRITQGMAGRKNKN >gi|222159230|gb|ACAB01000129.1| GENE 7 9681 - 10907 1401 408 aa, chain - ## HITS:1 COG:MA2145 KEGG:ns NR:ns ## COG: MA2145 COG4175 # Protein_GI_number: 20090988 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport system, ATPase component # Organism: Methanosarcina acetivorans str.C2A # 3 397 6 401 491 381 52.0 1e-105 MSKIEIKDLYLIFGHEKQKALKMLKKDKSKEEILKDTGCTVGVKDANLSINEGEFFVIMG LSGSGKSTLLRCINRLIRPTAGQVLVNGVDISKISEKELLQVRRKELAMVFQNFGLLPHR SVLSNIAFGLELQGVKKEEREKKAMESMKLVGLKGYENQMVGELSGGMQQRVGLARALAN NPEVLLMDEAFSALDPLIRVQMQDELLALQSKMKKTIVFITHDLSEAIKLGDRIAIMKDG EVVQVGTSEEILTEPANDYVARFVENVDRSKIITASSLMIDKPLVARLKKEGPEVLIRKM RAKNITVLPVIDADDKLVGEVRLNDLLKLRSRQEKSIDTIVRTEVHSVLEDTVLEDILPL MTKSNSPVWVIDETHEFLGTIPLSSLIIEVTGKDKEEINEIIQNAIDL 
>gi|222159230|gb|ACAB01000129.1| GENE 8 11207 - 11545 196 112 aa, chain - ## HITS:1 COG:lin0580 KEGG:ns NR:ns ## COG: lin0580 COG3695 # Protein_GI_number: 16799655 # Func_class: L Replication, recombination and repair # Function: Predicted methylated DNA-protein cysteine methyltransferase # Organism: Listeria innocua # 15 106 6 97 98 122 58.0 1e-28 MKDYKVDKASLSTSFCQEVYQVVREIPVGKVSTYGGIAALLGMPQCSRMVGRALKQVPDD LSTPCHRVVNASGRLVPGWTEQKQLLLEEGISFKQNGCVDLKKHLWNYSVSE >gi|222159230|gb|ACAB01000129.1| GENE 9 11628 - 13016 744 462 aa, chain + ## HITS:1 COG:no KEGG:BF3314 NR:ns ## KEGG: BF3314 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 61 456 30 427 430 130 26.0 1e-28 MQLYLLYLQIIPTFVPIKSNPIINKTSTAVMKKKTILFASCLFGVFAFSSCEKNLYDESK QPEKEIQVKDLDIPAGFQWKLTQVAAGTVAATTPTMVSFFLDEACSKEEKIADIPVDTEI SSLPLSIPTYVNTLYAQYKTSTNETKKVAIPVNADRSFSLNIANDAKSKSNTTRSITRGH DIEDDIQLSKGVIFHPKDGWGTIMFEDQFPSLGDYDFNDFVVNYKVQFQGIRKVDKKYTA QYIQIGLRLKAIGGIFPYSPYLRLKEIDSDEVESIEVYETKNVIPAIDGVELVPNKHLII DYSPLIKNLAKPAGSQYYNTEKNALVATSDLPEINILITLKKRKEVKEILEGDEFDLYLK RNDSGTEIHMNGIEPITYQYPFNDKNLLPIYTNGDEEDDNYYFSAGRLIWGLRVPGNAAH AIEKANFLEAYKGFAKWAQSSGKNEQNWYNQGNADKSLLIHN >gi|222159230|gb|ACAB01000129.1| GENE 10 13097 - 15940 2250 947 aa, chain - ## HITS:1 COG:SMb20671 KEGG:ns NR:ns ## COG: SMb20671 COG1879 # Protein_GI_number: 16265126 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 24 306 28 313 322 195 40.0 3e-49 MRMIRMMGYTKWIAVLLCLLGMTACRQDAPRFRIGVAQCSDDSWRHKMNDEILREAMFYD GVSVEIRSAADDNRKQAEDVHYFIDEGVDLLIISANEAAPMTPIVEEAYQKGIPVILVDR KILSDKYTAYIGADNYEIGRAVGNYIASSLKGKGNVVELTGLGGSTPAMERHQGFMAAIS NFPDIKLIDKADAAWEREPAEVEMDSMLRRHPKIDAVYAHNDRIAPGAYQAAKKVGREKE MIFVGIDALPGKGNGLEMVLDSVLNATFIYPTNGDKVMQLAMNILEKKPYPRETVMNTAV VDRTNAHVMQLQTTHISELDQKIETLNGRIGGYLSRVATQQVVMYGGLVILLLVAGLLLV VYKSLRAKNRLNKELSEQKKQLEEQRDKLEEQRDQLEEQRDKLEEQRDQLIQLSHQLEEA 
THAKLVFFTNISHDFRTPLTLVADPVEHLLADHTLSGDQHRMLMLIQRNVNILLRLVNQI LDFRKYENGKMEYTPVQVDVLSSFEGWNESFLAAARKKHIHFSFDNMPDTDYHTLADMEK LERIYFNLLSNAFKFTPENGKITVRLSSLTKEDTRWIRFTVANTGSMISAEYIRSIFDRF YKIDMHHAGSGIGLALVKAFVELHKGTISVESDEKQGTVFTVDLPVQTCETILAEDSLKS SISAVPLNPASPGSPASSNQNETLAVEEEELEKGYDSSKPSVLVIDDNADIRSYVRRLLH TDYTVIEAADGSEGIRKAMKYVPDLIISDVMMPGIDGIECCRRLKSELQTCHIPVILLTA CSLDEQRIQGYDGGADSYISKPFSSQLLLARVRNLIDSHRRLKQFFGDGQTLAKEDVCDM DKDFVEKFKALIEAKMGDSNLNVEDLGKDMGLSRVQLYRKIKSLTNYSPNELLRIARLKK AASLLASSDMTVAEIGYEVGFSSPSYFTKCYREQFGESPTDLLKRKG >gi|222159230|gb|ACAB01000129.1| GENE 11 16359 - 17246 1011 295 aa, chain - ## HITS:1 COG:MA1840 KEGG:ns NR:ns ## COG: MA1840 COG0524 # Protein_GI_number: 20090690 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Methanosarcina acetivorans str.C2A # 9 294 35 322 326 195 40.0 1e-49 MNNIIVGMGEALWDVLPEGKKIGGAPANFAYHVSQFGFDSRVVSAVGKDELGDEILNVFN EKKLKTQIEQVDYPTGTVQVTLDDEGVPCYEIKEGVAWDNIPFTDELKRLALSTRAVCFG SLAQRNEVSRATINRFLDTMPDIDGQLKIFDINLRQDFYTKEVLRESFRRCNILKINDEE LVTISRMFGYPGIDLQDKCWILLAKYNLKMLILTCGINGSYVFTPGVVSFQETPRVPVAD TVGAGDSFTAAFCASILNGKPVPEAHKLAVEVSAYVCTQSGAMPELPQVLKDRLM >gi|222159230|gb|ACAB01000129.1| GENE 12 17282 - 18448 1265 388 aa, chain - ## HITS:1 COG:NMB0535 KEGG:ns NR:ns ## COG: NMB0535 COG0738 # Protein_GI_number: 15676441 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Neisseria meningitidis MC58 # 4 382 24 417 426 95 26.0 1e-19 MENSKNSSLSKLIPVMLCFFAMGFVDLVGIASNYVKADLGLSDSQANIFPSLVFFWFLIF SVPTGMLMSRIGQKKTVLLSLLVTFASLLLPVFGDSYMLMLISFSLLGIGNALMQTSLNP LLSNIVRGDRLASSLTFGQFVKAIASFLAPYIAMWGATQAIPTFDLGWRILFPIYMIVAV IAILLLNVTQIKEEKEEGKPSTFGQCLALLGKPFILLSFIGIMCHVGIDVGTNTTAPKIL MERLGMGLDDAAFATSLYFIFRTAGCFLGSFILRKMSPKSFFGISVVMMLIAMAGLFIFH DKTMIYVCIGLIGFGNSNVFSVVFSQALLYLPGKKNEVSGLMIMGLFGGTVFPLAMGVAS DAMGQSGAVAVMTVGVLYLLFYTFRIKK >gi|222159230|gb|ACAB01000129.1| GENE 13 18790 - 20625 1770 611 aa, chain - ## HITS:1 COG:BS_sacC 
KEGG:ns NR:ns ## COG: BS_sacC COG1621 # Protein_GI_number: 16079757 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Bacillus subtilis # 136 610 22 510 677 455 49.0 1e-127 MNLRLSSKQIQTFAVLFCMIVMNISLSARADNSPLLIKDLGEGHCLVRVNTNQKYLLLPV EDASPDVRISMIVNNKEVKNFDVRLAIHKVDYFVPVDLSDYSGKLISFKFKMNSNDPVRV NLSPDNTACCKEMKLSDTFDTSNREKFRPTYHFSPLYGWMNDPNGMVYKDGEYHLFYQYN PYGSKWGNMNWGHAISKDLVNWEHRPVAIAPDAFGTIFSGSAVIDHHNTAGFGAGAIVAI YTQNGDRQVQSIAYSTDNGRTFTKYENNPVLVSEARDFRDPKVFWYEGAKRWIMVLAVGQ EMQFFSSPNLKDWTFESSFGKGHGAHGNVWECPDLFELPVEGTNEKKWVLLCSLGDGPFG DSATQYFVGSFNGKEFVNESPSKTKWMDWGKDHYATVTWSDAPDNRRIAIAWMSNWQYAN DVPTSQYRSPNSVPRDLSLFTIDGETYLQSAPSPELLALRDASKKRSFKVNGTRTIKEMI PGNDGAYEIELTIENQRADVIGFRLYNDKGEEVDMQYDMKEKKFSMDRRKSGEVSFNENF PMLTWTAIESGKDALKLRLFVDKSSVEAFGDGGRFVMTNQVFPSEPYNHIDFYSKGGAYK VDSFVVYKLKP >gi|222159230|gb|ACAB01000129.1| GENE 14 20726 - 22294 1590 522 aa, chain - ## HITS:1 COG:no KEGG:BT_1760 NR:ns ## KEGG: BT_1760 # Name: not_defined # Def: glycosylhydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 522 2 523 523 1016 94.0 0 MKNMILHIAITTLLASMSACSDEMDPVLTQRDWDGTATYFQSADEHGFSMYYKPQVGFVG DPMPFYDPVAKDFKVMYLQDYRPNPEATYHPIFGVATKDGATYESLGELIPCGGRDEQDA AIGTGGTIYNPVDKLYYTFYTGNKFKPSSDQNAQVVMVATSPDFKTWTKSRTFYLKGDTY GYDKNDFRDPFLFQTEDGVYHMLIATRKSGKGHIAEFTSTDLKEWASAGTFMTMMWDRFY ECPDVFKMGDWWYLVYSEQASFMRKVQYFKGRTLEELKATTANDAGIWPDNREGMLDSRA FYAGKTASDGTNRYIWGWCPTRAGNDNGNVGDVEPEWAGNLVAQRLIQHEDGTLTLGVPD AIDRKYISAQEVKVMAKDGNVTESGKTYTLAEGASVIFNRLKVHNKISFTVKASSNTDRF GISFVRGTDSKSWYSIHVNADEGKANFEKDGDNAKYLFDNKFNIPSDNEYRVTIYSDQSV CVTYINDQLSFTNRIYQMQKNPWALCCYKGEITVSDIQVSTY >gi|222159230|gb|ACAB01000129.1| GENE 15 22312 - 23697 1298 461 aa, chain - ## HITS:1 COG:no KEGG:BT_1761 NR:ns ## KEGG: BT_1761 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 461 1 461 461 827 90.0 0 MKSIIKQLYTILLVTVACLTVAGCSDDFKSNLRLDGDVWVNSIKLDEYAGTIDYQNKTIV 
VGVPYDYDVTRMAVSEINLSEGATASIAVGETIDFSLPVSLTVKNGDVQMNYTITVKRDE AKILTFKLNDTYVGKVDQLSKTISVVVPLTVDITQLKGTFTGSDGVTVTPASGSIQDFTN PVTYTATYRSAVTPYVVTVTQGNVIPTAFIGTASSVSQLTSPEEKAAAQWMMDNISMSEY ISFKDVVDGKVDLGKYTAIWWHFHADNGDNPPLPDDAKAAVEKFKVYYQNGGNLLLTRYA TFYIKDLSIAKDERVPNNSWGRSEDSPEVVDGAWSFPIVGNESHPLFQDLRWKDGDKTRV YTFDAGYATTNSTAQWHIGTDWGGYEDLNAWRSLTGGIDVACGDDGAVIIAEFEPRANSG RTICIGSGCYDWYGKGVDTSVDYYHYNVEQMTLNAINYLCK >gi|222159230|gb|ACAB01000129.1| GENE 16 23724 - 25436 1781 570 aa, chain - ## HITS:1 COG:no KEGG:BT_1762 NR:ns ## KEGG: BT_1762 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 570 1 570 570 1091 94.0 0 MKKILYIATIGFTLLTSSCNDFLDRQVPQGIVTGDQIASPEYVDNLVISAYAIWATGDDI NSSFSLWNYDVRSDDCYKGGSGTEDGGVFNALEISKGINTTDWNINDIWKRLYQCITRAN TALQSLDLMDEKTYPLKDQRIAEMRFLRGHAHFMLKELFKKIVIVNDENMDPDAYNKLSN TTYTNNEQWQKIADDFQFAYDNLPEVQIEKGRPTQAAAAAYLAKTYLFKAYRQDGVNNNL TGIDEEDLKQVVKYTDPLIMAKAGYGLENDYSMNFLPQYENGAESVWAIQYSINDGTYNG NLNWGMGLTTPQILGCCDFHKPSQNLVNAFKTDSQGKPLFNTYDNENYEVTTDNVDPRLF HTVGMPGFPYKYNEGYLIQKNDDWSRSKGLYGYYVSLKENVDPDCDCLKKGSYWASSLNH IVIRYADVLLMRAEALIQLNDGRIADAISLINDVRSRAAGSTMLIFNYKEEYGVNFKVTP YELKAYTQDEAMKMLKWERRIEFGMESSRFFDLVRWGEAKDVINAYYVTEAARCSIYKNA GFTENKNEYLPIPFEQISASNGNYTQNFGW >gi|222159230|gb|ACAB01000129.1| GENE 17 25464 - 28589 3291 1041 aa, chain - ## HITS:1 COG:no KEGG:BT_1763 NR:ns ## KEGG: BT_1763 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1041 1 1041 1041 1970 96.0 0 MHRIMKNKKLLCSVCFLFTFMSVLWGQSITVKGNVTSKTDGQPVIGASVVEATATANGTI TDLDGNFTLSVPANSTLKITYIGYKPVTVKSAAIVNVLLEEDTQMVDEVVVTGYTTQRKA DLTGAVSVVKVDEIQKQGENNPVKALQGRVPGMNITADGNPSGSATVRIRGIGTLNNNDP LYIIDGVPTKAGMHELNGNDIESIQVLKDAASASIYGSRAANGVIVITTKQGKKGQIKIN FDASVSASMYQSKMDVLNTEQYGRAMWQAYVNDGENPNGNALGYNYNWGYDANGNPVLHS MSLSKYLDSKNTMPVADTDWFDEITRTGVIQQYNLSVSNGSEKGSSFFSLGYYKNLGVIK DTDFDRFSARMNSDYKLIDDILTIGQHFTLNRTSEVQAPGGIIETALDIPSAIPVYASDG 
SWGGPVGGWPDRRNPRAVLEYNKDNRYTYWRMFGDAYVNLSLFKGFNVRSTFGLDYANKQ ARYFTYPYQEGTQTNNGKSAVEAKQEHWTKWMWNAIATYQLEIGKHRGDVMAGMELNRED DSHFSGYKEDFSILTPDYMWPDAGSGTAQAYGAGEGYSLVSFFGKMNYSYADKYLLSLTL RRDGSSRFGKNHRYATFPSVSLGWRITQESFMKELTWLDDLKLRASWGQTGNQEISNLAR YTIYAPNYGTTDSFGGQSYGTAYDITGSNGGGTLPSGFKRNQIGNDNIKWETTTQTNAGI DFSLFKQSLYGSLEYYYKKTTDILTEMAGVGVLGEGGNRWINSGAMKNQGFEFNLGYRNK TAFGLTYDLNGNISTYRNEILELPETVAANGKFGGNGVKSVVGHTYNSQVGYIADGIFKS QEEVDNHATQEGAAVGRIRYRDIDHNGVIDEKDQEWIYDPTPAFSYGLNIYLEYKNFDLT MFWQGVQGVDIISDVKKKSDFWSASNVGFLNKGTRLLNAWSPTNPNSTIPALTRSDTNNE QRVSTYFVENGSFLKLRNIQLGYTVPAVISKKLRMERLRFYCSAQNLLTIKSKDFTGEDP ENPNFSYPIPVNITFGLNIGF >gi|222159230|gb|ACAB01000129.1| GENE 18 29154 - 30026 585 290 aa, chain + ## HITS:1 COG:lin1322 KEGG:ns NR:ns ## COG: lin1322 COG2017 # Protein_GI_number: 16800390 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Listeria innocua # 1 288 1 289 290 203 39.0 3e-52 MKTISNKQLTIQVSPHGAELCSIVANGKEYLWQADPAFWKRHSPVLFPIVGSVWENEYRN EGIPYTLTQHGFARDMEFTLVSEKEDEVRYRLVSNEETLHKYPFPFCLEIGYRIQGKQIE VMWEVKNSGKKDIYFQIGAHPAFYWPEFDANSLERGFFGFDKENGLKYILISEKGCADPS TEYSLELTDGLLPLDTHTFDKDALILENEQVRKVTLYNKEKQAYLSLHFNAPVVGLWSPP AKNAPFVCIEPWYGRCDRAHYTGEYKDKDWIQHLQPEEIFQGGYTIEIDE >gi|222159230|gb|ACAB01000129.1| GENE 19 30198 - 32078 1369 626 aa, chain - ## HITS:1 COG:BS_sacC KEGG:ns NR:ns ## COG: BS_sacC COG1621 # Protein_GI_number: 16079757 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Bacillus subtilis # 127 624 23 509 677 383 42.0 1e-106 MKTNPWMKLCKGAILALAVSYGLTYCHTTKSKLTLEQKSDSLTVIHITNPTNYILLPIEE EAAESQVLLDTGEAADTDMDIRLAQTQVDYFVPFALPAGAKAATVRVRNKSKDALCWKEI KLSDTFDTANSEKFRPVYHHTPLYGWMNDANGLVYKDGEYHLYFQYNPYGSKWGNMHWGH SVSKDLVHWEHLEPAIARDTLGHIFSGSSIVDQENIAGYGAGSILAFYTSASDKNGQIQC LAFSKDNGRTFTKYEKNPILSPADGLKDFRDPKVFRYEPEDKWVMIVSADKEMRFYDSKN LKDWNYMSSFGEGYGVQPCQFECPDMVELPVDGDLNRKKWALIVNVNPGCYFGGSATQYF 
TGDFDGKKFSCDSQPNVTKWLDWGKDHYATVCFSNAGERTIAVPWMSNWQYCNIVPTKQF RSANALPRELSLYTQDNEIYLSAAPVPEIKTLRKEKKEIPAFTVANDYHIDSLLADNDGA YELALEITAGEAEIMGFSLFNDKGEKVDIYFNLPEKRLVMDRTKSGIVDFGKKSVPHEIE VHDRRKTTSINYIDDFALATWAPIKKENKYTLDVFVDKCSVEIFLNGGKVAMTNLIFPSE PYNRMCFYSKGGSFQVDSFNAYRLGL >gi|222159230|gb|ACAB01000129.1| GENE 20 32301 - 33323 712 340 aa, chain + ## HITS:1 COG:no KEGG:BT_1767 NR:ns ## KEGG: BT_1767 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 340 1 340 340 602 89.0 1e-171 MPANRNALIRYKTIDNCLRNPYRRWTLEDLVDACSDALYEYEGIDKGISKRTVQMDIQIM RSEKLGYNAPIVVYENKYYKYEDPEYSITQTPLNEQDLKTMSEAVEVLRQFKGFSYFTEM SDIINRLEDHVASARMKTTPVIDFEKNESLKGLDYLDTIYHAIVNEHPIQLKYRSFKARS ANSFIFYPYLLKEYRNRWFVYGVRGNGRILQNLALDRIQSLEVLPQEHYIKNTFFDPNTF FNDLVGVTKNSGSVAEKVGFKVAAAEAPYIITKPIHRSQQVVERLADGSVILEIKVVINH ELERVFFGYAEGIQVLYPKTLVELMGRKLKKAAEQYTHSK >gi|222159230|gb|ACAB01000129.1| GENE 21 33411 - 34256 680 281 aa, chain - ## HITS:1 COG:CAC2424 KEGG:ns NR:ns ## COG: CAC2424 COG4667 # Protein_GI_number: 15895690 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Clostridium acetobutylicum # 11 275 5 269 283 239 43.0 6e-63 MKDLKIDESTGLVLEGGGMRGVFTCGVLDYFMDHDIRFPYAIGVSAGACNGLSYASRQRG RAKYSNIDLLEKYDYIGLKHLLKKRNILDFDLLFNEFPEHILPYDYETYFASPERFVMVT TNCITGEANYFEEKKDSRRVIDIVRASSSLPFVCPITYVDGIPMLDGGIVDSIPLQRAIT DGCTRNVVVLTRNRGYRKDSKDIRIPSFVYRKYPKLREALSRRCAVYNEQLEMVERMEDE GQIIVIRPLKPVAVDRIEKDVQKLTEFYQEGYECARALLEE >gi|222159230|gb|ACAB01000129.1| GENE 22 34237 - 36459 1554 740 aa, chain - ## HITS:1 COG:slr1595 KEGG:ns NR:ns ## COG: slr1595 COG0475 # Protein_GI_number: 16329583 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, membrane components # Organism: Synechocystis # 64 458 6 401 410 282 43.0 2e-75 MRNRARKNYVIYVLMLLLFGGLIYVAIEEGDRFSHYVVNAQNIVQGNPFTMFLQFIQDNL HHPLTTLLIQIIAVLLMVRLFGYLFSLIGQPGVIGEIVAGIVLGPSVLGFFFPDAFHFLF 
PAHSLTNLELLSQVGLILFMFVIGMELDFSVLKNKINETLVISHAGILVPFFLGILSSYW VYEEYAAAQTPFLPFALFIGISMSITAFPVLARIIQERNMTKTPLGTLAIASAANDDVTA WCLLAVVIAISKAGSFASALYSVGLAVVYIAVMFLVVRPFLKKVGEVYANREAINKTFVA FILLILIISSCLTEIIGIHALFGAFMAGVVMPSNLGFRKVMMEKVEDISLVFFLPLFFAF TGLRTEIGLINSPELWMVCLLLVTVAVVGKLGGCAIAARLVGESWKDSLTIGTLMNTRGL MELVALNIGYEMGVLPPSIFVILVIMALVTTFMTTPLLHLVEHIFVRREEKLSLKHKLIF CFGRPESGRVLLSIYDLLFGKQLKKNHLIAAHYTVGTDLNPLNAEHYASESFALLNQQAA KLNIQVDNHYRVTDKLVQEIIHFVRKEHPDMLLLGAGSHYRSDMPGTPGAILWLTLFRDK IDEIMEQVKCPVAVFVNRQYREGTTVSFVLGGMIDLFLFSYLDKMLQNGHSVRLFLFDTD DEEFRGRIDDLQVRYPEQTMIVWFAGVEDLVTEEKDGLLIMSHLSYTKLSEDEAVMRELS SLLVIRRNKNTGDKNEGLEN >gi|222159230|gb|ACAB01000129.1| GENE 23 36644 - 37171 354 175 aa, chain + ## HITS:1 COG:no KEGG:BT_1784 NR:ns ## KEGG: BT_1784 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 170 1 170 173 295 92.0 7e-79 MRIQFEIKENLPEIIEEILHSDKWQTSVKEELSGRTTVVIRDQAYGSEATIEIYATSIEI KTAWSKYSYRIFVTNDIVWCEYNGAYRGLLEQVLLPTITPKENLLNSDVTESSLYGNEHK KLREYAEDNLKLKQFRRENFNEQKNGTAPFDHPKRVYDEFIKEDYVVAPKEEENK >gi|222159230|gb|ACAB01000129.1| GENE 24 37135 - 37449 69 104 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|298480764|ref|ZP_06998960.1| ## NR: gi|298480764|ref|ZP_06998960.1| conserved hypothetical protein [Bacteroides sp. 
D22] # 1 104 1 104 104 197 98.0 2e-49 MKRNKKVIGVSCFVLLLLVGIMYVYVHPVNRYRLEVTRVGGSGYGYKIYERERLIIVQPF LLAQSCHLTGTTVSIWWHWSANTMAQGKALSTDYLFSSSLGATT >gi|222159230|gb|ACAB01000129.1| GENE 25 37436 - 37789 215 117 aa, chain - ## HITS:1 COG:no KEGG:BF3187 NR:ns ## KEGG: BF3187 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 37 117 269 350 350 113 74.0 2e-24 MKLILNCFGYKYRKVSPQITQIYTDYFTTEDAEDTEEDKGKLRSGQTAGSSYRSNYWIYD PLTDLWDGEDLTDFEGSTRSKAVCFSTGKRGIIATGGSSTYYYDDTWELKPYEYEEK >gi|222159230|gb|ACAB01000129.1| GENE 26 37794 - 38978 1217 394 aa, chain + ## HITS:1 COG:SP0281 KEGG:ns NR:ns ## COG: SP0281 COG3579 # Protein_GI_number: 15900215 # Func_class: E Amino acid transport and metabolism # Function: Aminopeptidase C # Organism: Streptococcus pneumoniae TIGR4 # 41 386 60 421 444 65 24.0 2e-10 MKKTILIAALGLFSLNIMAQDAKPEEGFVFTTVKENPITSIKNQNRSSTCWSFSTLGFIE SELLRLGKGEYDLSEMFVVHKTMQDRGVNYVRYHGDSSFSPGGSFYDVMYCIKNYGIVPQ EVMPGIMYGDTLPVHNELDAVASGYINAIAKGKLSKLTPVWKNGLSAIYDTYLGACPEKF TYKGKEYTPKTFSESLGLNCDDYVSLTSYTHHPFYSQFAIEIQDNWRNGLSYNLPIEELM AVMDNAVKKGYTFAWGSDVSEQGFTRDGIAVMPDVNKESELSGSDMARWTGLTAANKRQM MTTKPYPEIDVTQEMRQVAFDNWETTDDHGMVIYGIAKDQNGKEYFMVKNSWGKSGKYNG IWYASKAFVAYKTMNILVHKDALPKEIAKKLGIK >gi|222159230|gb|ACAB01000129.1| GENE 27 39686 - 40537 690 283 aa, chain - ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 179 282 183 286 288 69 31.0 7e-12 MTTNESFPKFDLPVDFIADDSITGDILNYYGNFPCKIKAGVFVLCTEGMVRATINLLEIV IRKNDFIVVLPNSFIQIHEVSSDTRISFAGFSSDFMLSGNYVEILLDFMPMILNSPVISL QEDVAGLYRNVYLLLIRAYTLPHSLENKEIIRSVLAIFLQGTKELYKRYGKKLDEPLRRE QELYRQFIQLLMAHYTKEHEVAFYAEKCGVTAAHFSSAIRRASGYSPLAIITEIIIMNAK AQLKTTRLPVKEIAFSLGFNNHSFFNKYFRKKVKMTPLEYREK >gi|222159230|gb|ACAB01000129.1| GENE 28 40777 - 42135 1099 452 aa, chain + ## HITS:1 COG:no KEGG:BT_1798 NR:ns ## KEGG: BT_1798 # 
Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 448 1 443 448 404 48.0 1e-111 MRTLKRLFYVACTAFLLTSCEETYNDKLFWPGELCQEYGSYIKPATLNLTYSGEKLVGKT VDFKTEDSEKGTLTLNDIIPGEKQTPLPISLCEQEDSYTFSGKNVTMGGATVTYSGAITP KTMKLDLDVVIPQSKWKKSYGISNFTKGKKMTVTYSGGQYVWKETNEILTGGFYVHLDDV ELTKAGSTLFLRMKLIQNALCYFIPQLLQTITLQPDGNLVANYTTSPVYIGSVPINNIDP DKDVGTIATFVTKFMIGLLTEKDINNALTDRTWTASPINLITWTEESGRLKINLNLPAII SLATKDGETPIDSGLVSGIMEALAQSNPVQLKLLLGIVNSMIDNPLLGIITSMDTASFQQ VFYLLTEGIIFHIEEEDGHTHLYLTKESTTAFIQLLPGLQPIVEGMLPESMANNTVFKNL LGLLMGNDENGLPVLWNAANTIDLGLDLLPQE >gi|222159230|gb|ACAB01000129.1| GENE 29 42167 - 45013 2492 948 aa, chain + ## HITS:1 COG:PA1613 KEGG:ns NR:ns ## COG: PA1613 COG1629 # Protein_GI_number: 15596810 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Pseudomonas aeruginosa # 110 319 37 244 702 82 29.0 3e-15 MKKIYILTLLLCFTSLGSSFAQTLKGHIYDANTNEPLVGAAVTYKLQGNQGVVSDIHGAY EIKLPEGGVDLVFSYVGYEDVLMPIVIDRREVITKNVYMKESTKLLEEVVVSAGRFEQKL SNVTVSMDVVKAGDIARQAPTDISSTLRTLPGVDIVDKQPSIRGGSGWTYGVGARSQILV DGMSTLNPKTGEINWNTVPLENIEQIEVIKGASSVLYGSSALNGIINIRTARPGLAPKTR FSAYLGIYGDAENDEYQWSDKSFWKEDKYSVKPILRGSLLSGVRNPIYEGFDLSHARRIG NFDVSGSLNLFTDEGYRQQGYNKRFRMGGNLTYHQPDMGMKLLNYGFNIDFLSNQYGDFF IWRSPTEVYKPSPFTNMGREDNNFHIDPFVNYVNPENGTSHKIKGRFYYSADNIVRPTQG ASIMDILGNMGTDAQTIQNIAGGDYSSLYPALVGIGSGLINGNLEDAMNGVFTSLGNIFP NATTADYCDLISWVMDSGLPSDLGGLTNEQLPGDLIPWMSDVLNPSRNTPKTQTDKSTNY YLDYQFNKKWDSGAQITTGATYEHVRYNSAIMDEVYKSDNIAAFFQYDQRFWDRLSVSAG VRAEYYRVNNHHREAETKIFGTKVPFRPVFRAGLNYQLADYSFIRASFGQGYRNPSINEK YLRKDIGGVGVYPNLDIKPEKGFNAELGFKQGYKIGNFQGFVDVAGFYTQYKDMVEFQFG LFNNADYTMINSISDAIRMITGGQGFGIGAQFHNVSKAQIYGVEISTNGVYNFNKNTKLF YNLGYVYTEPRDADYKERNDAEDLYTDPLQMKEKSNTGKYLKYRPKHSFKATVDFQWKRI NLGANVAWKSKILAVDYLMMDERPKAQLDLMDYVRGIAFGYSKGETLATYWKKHNTDYAT VDMRLGVKASKEVAFQFMVNNLLNKEYSYRPMAVAAPRTFVVKMDITF >gi|222159230|gb|ACAB01000129.1| GENE 30 45149 - 45829 619 226 aa, chain + ## HITS:1 
COG:no KEGG:BT_1803 NR:ns ## KEGG: BT_1803 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 226 1 226 226 399 88.0 1e-110 MKKQILLIAILLCTAFAQAQEVFVTADFVSSYIWRGMDSGNASVQPSLGLNWKGLTVYAW GSTEFREKNNEIDLSLEYEYKNLTLYANNYFTQTEEEPFKYFNYSSHSTGHTFEVGAGYM LSEKFPLSVSWYTTFAGNDYRENGKRAWSSYCELSYPFSVKDVNMSVEAGFTPWESMYSD KFNVVNIGLSATKEIKITSNFSLPIFGKLIANPYEEQLYFVFGITL >gi|222159230|gb|ACAB01000129.1| GENE 31 45870 - 47744 1785 624 aa, chain + ## HITS:1 COG:SP2146 KEGG:ns NR:ns ## COG: SP2146 COG3669 # Protein_GI_number: 15901959 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Streptococcus pneumoniae TIGR4 # 29 480 9 448 559 283 36.0 6e-76 MKKHFLIFGAGLLLLSACNSVKAPEAILPIPEAKQVEWQKMETYAFVHFGLNTFNDREWG YGDSDPKTFNPTRLDCEQWVQTFVKSGMKGVILTAKHHDGFCLWPTQLTEYCIRNTPYKD GKGDIIRELSDACKKYGIKFAVYLSPWDRHQANYGSPEYVEYFYKQLNELLTNYGDVFEI WFDGANGGDGWYGGAKDSRTIDRKTYYDYPRAYKLIDELQPQAVIFSDGGPGCRWVGNEN GFAGATNWSFLRAGEVYPGYPKYRELQYGHADGNQWVAAECDVSIRPGWFYHPEEDDRVK TADALTDLYYRSVGHNATLLLNFPVDRDGLIHPTDSANAVNFHQNVQKQLAHNLLAGLSP KASDERGKTFSAKAVTDGEYDTYWATNDGVNSATIEFDLPQTEKINRMMLQEYIPLGQRV KSFVVEYNQEGEWLPVKLNEETTTIGYKRLLRFETVTTDKIRVRFTDSRACLCINNIEAY YAGETSDTYTAKAEELKSYPFTLTGVDAEEAQKCMDKNNQTTCFINGNTLLIDLGEERTI TSFHYLPDQSEYNKGVIAAYEISVGTDSNAVNQLVAKDEFSNIKNNPILQSVYFTPVNAR YVQLKATRMIHDGEPMGLAEIGIQ >gi|222159230|gb|ACAB01000129.1| GENE 32 48052 - 48582 374 176 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229873878|ref|ZP_04493445.1| acetyltransferase, ribosomal protein N-acetylase [Spirosoma linguale DSM 74] # 16 176 17 177 185 148 41 1e-34 MVQKTEIIVQKKNYTLRTWQTEDAASLAQYLNNKNIWNNCRDGLPYPYSQEDANVFLAMV QAKENIHDFCIEVNGEAVGNIGFIPATDVERFSAEVGYWIGEPFWNQGIVTDALKKAINY YFEHTDKIRVFAVVFEHNSPSMRVLEKVGFTKVGIMQKAIFKNDNFMNAHYYELIK >gi|222159230|gb|ACAB01000129.1| GENE 33 48841 - 49713 1107 290 aa, chain + ## HITS:1 COG:mll5862 KEGG:ns NR:ns ## COG: mll5862 COG2086 # Protein_GI_number: 13474882 # Func_class: C Energy production and 
conversion # Function: Electron transfer flavoprotein, beta subunit # Organism: Mesorhizobium loti # 3 286 1 265 283 181 41.0 1e-45 MSLKIVVLAKQVPDTRNVGKDAMKADGTINRAALPAIFNPEDLNALEQALRLKDAHPGST VTVLTMGPGRAADIIREGLFRGADNGYLLTDRAFAGADTLATSYALATAIRKIGEYDIII GGRQAIDGDTAQVGPQVAEKLGLTQITYAEEILEVGDGKIKVKRHIDGGVETVEGPLPIV ITVNGSAAPCRPRNAKLVQKYKHAKTVTEKQQGNLDYTDLYDKRDYLNLAEWSVADVNGD LAQCGLSGSPTKVKAIQNIVFQAKESKTISGSDRDVEDLIVELLANHTIG >gi|222159230|gb|ACAB01000129.1| GENE 34 49716 - 50735 1154 339 aa, chain + ## HITS:1 COG:CAC2709 KEGG:ns NR:ns ## COG: CAC2709 COG2025 # Protein_GI_number: 15895966 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Clostridium acetobutylicum # 4 335 9 332 336 263 45.0 3e-70 MNNLFVYCEIEEGNVLDVSLELLTKGRSLANQLNCQLEAVVAGTDLKDIEKQILPYGVDK LHVFDGEGLYPYTSLPHTAILVNLFKEEQPQICLMGATVIGRDLGPRVSSALTSGLTADC TSLEIGDHEDKKEGKTYKNLLYQIRPAFGGNIVATIVNPEHRPQMATVREGVMKKEILSP NYQGEVIRHDVKKYVADTDYVVKVIERHVEKAKNNLKGSPIIIAGGYGVGSKENFDLLFD LAKELHAEVGASRAAVDAGFADHDRQIGQTGVTVRPKLYIACGISGQIQHIAGMQESGII ISINNDPDAPINTIADYVINGTIEEVVPKMIKYYKQNSK >gi|222159230|gb|ACAB01000129.1| GENE 35 50742 - 52448 1700 568 aa, chain + ## HITS:1 COG:CC3393 KEGG:ns NR:ns ## COG: CC3393 COG1960 # Protein_GI_number: 16127623 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Caulobacter vibrioides # 45 445 47 459 603 206 33.0 8e-53 MANYYTDIPELKFHLNNPMMKRICELKERNYRDKDEFDYAPLDFEDAVDSYDKVLEITGE ITGEIIAPNAEGVDEEGPHCANGRVEYASGTKQNLDAMVKAGLNGMTMPRKFGGLNFPIT PYTMCAEIVAAADAGFGNIWSLQDCIETLYEFGNSDQHSRFIPRICQGETMSMDLTEPDA GSDLQSVMLKATYSEKDGCWLLNGVKRFITNGDADIHLVLARSEEGTRDGRGLSMFIYDK RQGGVNVRRIENKLGIHGSPTCELVYKNAKAELCGDRKLGLIKYVMALMNGARLGIAAQS VGLSQAAYNEGLAYAKDRKQFGKAIIEFPAVYDMLAIMKGKLDAGRALLYQTSRYVDIYK ALDDIARERKLTPEERQEQKKYAKLADSFTPLAKGMNSEYANQNAYDCIQIHGGSGFMME YACQRIYRDARITSIYEGTTQLQTVAAIRYVTNGSYIATIREFETIPCSPEMEPLMSRLK KMADKFEASTNAVKEVQDQELLDFTARKLVEMAADIIMCHLLIQDASKSSELFSKSAHVY 
LNYAEAEVEKHSNFIENFDKEDLAFYKK >gi|222159230|gb|ACAB01000129.1| GENE 36 52600 - 54354 1176 584 aa, chain + ## HITS:1 COG:TM0584 KEGG:ns NR:ns ## COG: TM0584 COG0705 # Protein_GI_number: 15643350 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Thermotoga maritima # 155 380 5 230 235 134 34.0 7e-31 MALKPSYNDKLHLPGLSKTEILILALEASQKLEWNIEEVTPEGARFEVPFNMYSHGEEIT FTIEPGNDGEVAVRSQSSSVQFVDYGKNRKNIQKLRETMEEIKTSLTPEELTQKAKDFEE ECNRPLTEEEKAYLEEEKKRNSFWSFFIPRKGFMATPILIDLNILVFIVMIASGVGIMSP STLSLLKWGADFGPLTLTGDWWRAVTCNFIHIGAFHLLMNMYAFMYVGLLLEGLIGSRRM FMSYLLTGLCSAVFSLYMHGETISAGASGAIFGLYGIFLAFLFFHRIAKEQRKALLTSIL IFVGYNLVYGMKAGIDNAAHIGGLLSGFLLGIIYVCSYKFEKADAQRTVSILGELGIFCI FLFSFLMLCKNVPPLYQDIRGEWESGIVEAYLNGELEEENKNGNQSGRETANNSSTSQYP PYVPVGNNDTWLSYYDAETNFSCQYPTNWRKITGAKGLTPSAEPPLLRLVNGANQLTVTA LTYDTQKEFEHIKKLSLTLPRNAQGEPAEDYKQSNVNINGLSMTRTTNPLHIGAPDEPGE DIQQIVLHYFQESKKRTFTIVMLVYDEEAETDLNAITSSIQITQ >gi|222159230|gb|ACAB01000129.1| GENE 37 54362 - 54781 276 139 aa, chain + ## HITS:1 COG:no KEGG:BT_1808 NR:ns ## KEGG: BT_1808 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 139 1 139 139 212 74.0 3e-54 MRVYVQVKQLGKRKCNIEKIPVDFPVPPVDVQGLIEAIVSWQVCEYNERLQQSEVLKYLT QEEVENKATSGKVGFAVNYNGKPAAEVEAITNALQSYEDGIFRIFIDDTETEDLSSPTGL KEESTITFIRLAMLSGRLW >gi|222159230|gb|ACAB01000129.1| GENE 38 54819 - 59879 4591 1686 aa, chain + ## HITS:1 COG:no KEGG:BT_1809 NR:ns ## KEGG: BT_1809 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1686 1 1676 1676 2556 75.0 0 MRLIYNDALNKRIAPYLERLTKKRTSLDKETMTLLDVFMQYFNMDTRYGTYSDKLEPCII HIIQEEKIKSVANLFDGKLNKVLRYLLGDEYANLFHTYLKLKARCPYTHGYSRRSQRSAN PLLHFSHIIDALTEFLKLRATGFSEQAILNGGNTPEEIETYKDAMNCQNWMAAQIAEGNQ TVIEHLNNVLTSENNANRLNQGHLQAIAVSGYRPLLELEGKLLLAAKLQEGLRQAIVETM DEGCPESYLHLFSVICDNGLQRFASVKRGIAVCTGIGEQDSSERITNKYVELIRRFLNDR 
EEARKALQSKDTVELYLALWSIGFYNTEEIQALVPGIIKDGAKYQVQTLLYFLRCTQYSG MNHRISKDAFEKWYNEPSVVAAILPLYLSGLYLSRYGGHKDAPSLHDYFDSKEEAIRHYD YLKNVYQSISAKEIYSPYVFPWESAELTRSEIVLKMAYITWMTNDSALKDDLCTCLPSLD TYMRAGYIGVVLNPPTSHLQEEYVLQSLGDRSQDVRDEAYKVLSEMTLSPEQNQKVEELL RFKYSEMRINAINLLMKQPKEQLSGSIRRLLTDKVAERRLAGLDMMKTIHNVEFLQDTYQ ELIPTVKEIQKPNAKEKVLIESLIGDGTKENTAQHYTKDNGFGLYDPALEVNLPEITQDK GFNVKKAFEFICFGRAKLVFKKLSKYIEIYKNEEFKNGYGEARLVGNSVLINWSNYGGLS GLGFPELWKAFYEEEIGSYDKLLMMSFMLASTGAPKDDDDYDEEDEEDIKADQKSSNTFE PLVNRMYAGITYRGLQKELRKMPYYEQMSDIIEALSYEYKDEAVYQRLAVNMLLQLLPLL NTKNIFRQYTNKHAWLRDKLEYGEKEIVYPIHNNKFVNFWLEMPQKPMSDDLFIRYFTVR YQLYKLTNYMEHTPELEETDSYLHATDFARAWMLGIIPTEEVYREMMGRISSPAQIKAIT TVLNDNVRFNKEKERYADIKNVDFSLFRSLAQKIVDRILEIELKRGDSETQVTSLAEELS YIYGADTFIHILQAFGKDTFIRDSYNWGSTKRGVLSSLLHACHPLPTDTSENLKKLAKQA EISDERLVEAAMFAPQWIELTEKAIGWKGLTSAAYYFHAHTNETCDDKKKAIIARYTPID VEDLREGAFDIDWFRDAFKTIGKRRFEVVYNAAKYISCSNSHTRARKFADATNGAVKAAD VKKEIVAKRNKDLLMSYGLIPLGRKPDKELLDRYQYLQKFLKESKEFGAQRQESEKKAVN IALQNLARNSGYGDVTRLTWSMETELIKELLPYLSPKEIDGVEVYVQINEEGKSEIKQIK DGKELNSMPAKLKKHPYIEELKAVHKKLKDQYTRSRVMLEQAMEDCTRFEESELRKLMQN PVIWPLLRHLVFICNGQTGFYTDGLLVTVNAVCLPLKPKDELRIAHPTDLYTSGDWHAYQ KFLFDKAIRQPFKQVFRELYVPTPEEVEATQSRRYAGNQIQPQKTVAVLKGRRWVADYED GLQKIYYKENIIATIYAMADWFSPADIEAPTLEYVCFHNRKDYKLMKISEIPPVIFSEVM RDVDLAVSIAHAGSVDPETSHSTIEMRSVLVELTMPLFHFKNVTIKGSFAHIEGKLGKYN IHLGSGVIHQEGGAQIAVLPVHSQNRGRLFLPFVDEDPKTAEILTKIIFFAEDDKIKDPS ILNQIK >gi|222159230|gb|ACAB01000129.1| GENE 39 59887 - 60357 334 156 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15902812|ref|NP_358362.1| hypothetical protein spr0768 [Streptococcus pneumoniae R6] # 12 150 6 145 165 133 47 4e-30 MAENLIIGTGNKEEKYRELLPQLHALVSTETDLIANLANVAAALKQTFGFFWVGFYLVKE DELVLGPFQGPIACTRIRLGKGVCGTAWKEAQTLIVPDVEQFPGHIACSSDSKSEIVVPI LQQGEVVGVLDIDSDELDSFDTIDARYLEEICTYIG >gi|222159230|gb|ACAB01000129.1| GENE 40 60470 - 60784 325 104 aa, chain - ## HITS:1 COG:no KEGG:BT_1811 NR:ns ## KEGG: BT_1811 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron 
# Pathway: not_defined # 1 104 4 107 107 162 86.0 5e-39 MQTELIIVSEYCHKCHIEPSFIEMLEEGGLINVHTEGGEHYLLLSELPNVERYSRMYYDL SINMEGIDAIHHLLEKMEDMRREMSSLRKQLLLYQEREIEDMDW >gi|222159230|gb|ACAB01000129.1| GENE 41 60799 - 61764 1132 321 aa, chain - ## HITS:1 COG:all1488 KEGG:ns NR:ns ## COG: all1488 COG2214 # Protein_GI_number: 17228981 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone # Organism: Nostoc sp. PCC 7120 # 3 319 8 300 315 174 35.0 2e-43 MAYIDYYKILGVDKSASQDDIKKAFRKLARKYHPDLNPNDPSAKDKFQEINEANEVLSDP EKRKKYDEYGEHWKHADEFEAQKRAQQQAGGFGGAGGFGGFGGAGQGFSDGNGTYWYSSD GEGFSGGNASGFSDFFESMFGHRGGRGQGSAGFRGQDFNAELHLSLRDAAQTHKQILTVN GKQVRITIPAGVADGQVIKLKGYGAEGVNGGPAGDLYITFVIAEDPVFKRLGDDLYIDVE VDLYSAVLGGEKVVDTLDGKVKLKIKPETQNGTKVRLKGKGFPVYKKEGQFGDLIVTYSV KIPTNLTDKQKELFRQLQSMN >gi|222159230|gb|ACAB01000129.1| GENE 42 61859 - 63562 1150 567 aa, chain - ## HITS:1 COG:RC0454 KEGG:ns NR:ns ## COG: RC0454 COG2194 # Protein_GI_number: 15892377 # Func_class: R General function prediction only # Function: Predicted membrane-associated, metal-dependent hydrolase # Organism: Rickettsia conorii # 175 525 171 518 522 156 31.0 1e-37 MKLFKNIKNWLENQEHLFYLFLFILIVPNLVLCLTEPLPLMAKVCNVLLPFGCYYLLMTL SRNCGKMLWILFLFLFFGAFQIVLLYLFGQSIIAVDMFLNLVTTNSSEALELLDNLTPAI IAVIILYVPALILGTISIIRKRKLTAEFIRRERKRASIVFGISLLSLAGAYMQDPGYELK SDLYPLNVCYNVGLAFQRTALTQNYHRTSKDFTFHALPTHPKEKREVYVMVVGETSRALN WQLYGYERETNPLLSRQPGLIAFPKVLTESNTTHKSVPMLMSDATACNYDSIYHQKGIIT AFKEAGFRTAFFSNQRYNHSFIDFFGMEADTYDFIKEDSVSSSYNPSDDELLKLVEQELA KGATKQFIVLHTYGSHFNYRERYPSESAFFTPDYPMEAERKYRDNLVNAYDNSIRYTDGF LSRLIHMLEKQQIDAAMLYTSDHGEDIFDDSRHLFLHASPVPSYYQLHVPFLIWMSDNYL ETYPEYWDTAIDNKDKNVSSSSSFFPTMLSLAGIETPYRDDSQSVTAPHYVLKPRVYLND HNEPRPLDDLGMKKQDFQMLEKRNIKY >gi|222159230|gb|ACAB01000129.1| GENE 43 63549 - 64742 811 397 aa, chain - ## HITS:1 COG:no KEGG:BT_1814 NR:ns ## KEGG: BT_1814 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 
1 397 2 398 398 694 88.0 0 MNTRQNRLIIFALLLLTQTPLGAQLHSTKIELPEDSISVTMEDSVPAKRSFFKKFLDYFN DANKEKKNKKFDFSVIGGPHYSSDTKFGLGLVAAGLYRTDRNDSLLPPSNVSLYGDVSTV GFYLLGIRGNHLFPQDKYRLNYNLYFYSFPSLYWGRGYDNGANSDNESDYKRFQAQVKVD FMFRLAKNFYIGPMAIFDYIDGRDFDKPELWEGMAARTTNTSLGLSLLYDSRDFLTNASH GYYLRIDQRFSPAFLGNKYAFSSTELTTSYYQPVWKGGVLAGQFHTLLTYGDTPWGLMAT LGSSYSMRGYYEGRYRDKGAMDAQIELRQHVWKRNGVAVWAGAGTIFPRLSEFTPKHILP NYGFGYRWEFKKRVNVRLDLGFGKHQTGFIFNINEAF >gi|222159230|gb|ACAB01000129.1| GENE 44 64838 - 65851 1066 337 aa, chain - ## HITS:1 COG:alr3296 KEGG:ns NR:ns ## COG: alr3296 COG2008 # Protein_GI_number: 17230788 # Func_class: E Amino acid transport and metabolism # Function: Threonine aldolase # Organism: Nostoc sp. PCC 7120 # 1 336 5 341 345 253 38.0 3e-67 MRSFASDNNSGVHPLVMEALNRANIDHSLGYGDDKWTEEAVAKIKETFTPNCVPLFVFNG TGSNVVALQLMTRPYHSIFCAETAHIYVDECGSPVKMTGCQIRPIATPDGKLTPELMQPY LHGFGDQHHSQPRALYISQCTELGTIYTPEELKRLTDFAHLNGMYVHMDGARIANACAAL NLSLKELTVDCGVDILSFGGTKNGLMMGECVIVFNKDLQPEARFIRKQSAQLASKMRYLS CQFTAYLTGDLWLKNASHANAMAAKLRAELEKLPEVTFTQKAESNQLFLTMPRPVIDRML ESYFFYFWNEEKDEIRLVTSFDTTEEDVDEFIGLLKR >gi|222159230|gb|ACAB01000129.1| GENE 45 66023 - 66241 249 72 aa, chain - ## HITS:1 COG:no KEGG:BT_1819 NR:ns ## KEGG: BT_1819 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 72 1 72 72 122 90.0 4e-27 MNRNEIGVNAGKVWQLLSNNEKWSYGLLKRKSGLKDKELGAALGWLSRENKIEFDQCDEE LYVYLCVNVYIG >gi|222159230|gb|ACAB01000129.1| GENE 46 66477 - 68216 1698 579 aa, chain + ## HITS:1 COG:YPO1358 KEGG:ns NR:ns ## COG: YPO1358 COG0028 # Protein_GI_number: 16121638 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Yersinia pestis # 1 572 1 571 573 552 48.0 1e-157 MAKKIAEQLIDTLVKSGVERIYAVTGDSLNEVNEAVRKNDQIKWIHVRHEETGAYAAAAE 
AQLTGRPGCCAGSSGPGHVHLINGLYDAHRSGAPVIAIASTIPTGEFGTEYFQETNTIKL FNDCSYYNEVATTPKQFPRMLQSAIQTAVTRKGVSVIGLPGDLAKASAVAVDSSIRNYPA PPEVCPSEEDLAQLADLLNKHTRITLFCGIGCRGAHEEVIALSEKLNAPVVYTFKGKMEV QYENPYEVGMTGLLGMPSGYYSMHEAEVLLMLGTDFPYSAFLPDDIKIAQIDIKPERLGR RAKVDIGLCGDVKLSIQSLLRMLNPKTDDSFLLKQLKRYEGVKKDLAAYTEDKGDVNKIH PEYVMSEIDKLSSDDAVFTVDTGMTCVWGARYLQATGKRHMLGSFNHGSMANALPQAIGA ALAYPDRQVVALCGDGGLSMTLGDLETVVQYKLPIKIIVFNNRSLGMVKLEMEVDGLPDW QTNMLNPDFAQVAEAMGMTGFNVSDPEEVLTTLLNAFELDGPVLVNIMTDPNALAMPPKI EFGQMVGFAQSMYKLLINGRSQEVIDTINSNFKHIREVF >gi|222159230|gb|ACAB01000129.1| GENE 47 68372 - 69289 633 305 aa, chain + ## HITS:1 COG:YPO2801 KEGG:ns NR:ns ## COG: YPO2801 COG4984 # Protein_GI_number: 16122999 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Yersinia pestis # 18 147 49 175 735 83 44.0 4e-16 MEKTDSSPLSRQALYADKKQWNQFLSVFLLAVGVGFTVAGIIFFFAYNWDELPKFAKLGT VEVLLIASVLLATFTRWNKLVKQILLTGATFLIGTLFAVFGQIYQTGADAYDLFLGWTLF TILWAVAIRFAPLWLTFIGLLCTTIWLYNIQIANTNSWEMTLLANAVTWICALTTIITEW MSAKGHLNRNNRWFVSLLSLATIIHTSFLLMMAICEESAILSVPLISTVLLFSAGLWYGW KVKSLFYLAIIPFAALMILLTTFISQSNLRDVQIFFYGGVIVITGTTLLIYIILHLKKQW YGTEA >gi|222159230|gb|ACAB01000129.1| GENE 48 69273 - 70247 320 324 aa, chain + ## HITS:1 COG:no KEGG:BT_1824 NR:ns ## KEGG: BT_1824 # Name: not_defined # Def: putative permease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 324 1 318 318 333 63.0 6e-90 MAQKHNLTIQVVSIIGGILTAIFFLGFLALARILRSDISCLIAGSILILTTLTISRMVVR SFLDAMNITLYIAGCVLIGFGINASINLLFITLMGISILTFLLSRGFILPFLSVILFNIS FFGEAAHVFSSFYPLQIAVVPILGVFLFTNIFENKLFECIGTENYLSKYKPFHFGLFVSG IVSLGGLSINYMISETNSWLVSCILSVCIWIGILIMVQRIMQVMKVDHPVNQVGIYILCI VICLPTVFAPYLSGSLLLILICFHYGYKAECAASLLLFIYAVSKYYYDLNLSLLTKSITL FFIGIACITAWYFFTQKRTRHEKV >gi|222159230|gb|ACAB01000129.1| GENE 49 70234 - 70731 354 165 aa, chain + ## HITS:1 COG:YPO2802 KEGG:ns NR:ns ## COG: YPO2802 COG4929 # Protein_GI_number: 16123000 # Func_class: S Function unknown # Function: Uncharacterized membrane-anchored protein # Organism: 
Yersinia pestis # 2 163 5 175 176 94 35.0 1e-19 MKKYSRILIIVNLILLLGYFNWSVYQKEQTLKDGQLVLLQLAPVDPRSLMQGDYMRLNYK EASSNLPDEQTDTRGYAILRTDSNQVGEIVRLQNTLEPVNDNELVIRYKIINRRLFLGAE SFFFEEGQDTLYQKAVYGGLKVDDKGQSLLVGLYDEDFHLIQSDK >gi|222159230|gb|ACAB01000129.1| GENE 50 71018 - 73870 2383 950 aa, chain - ## HITS:1 COG:no KEGG:BT_2486 NR:ns ## KEGG: BT_2486 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 119 1 142 1016 111 44.0 2e-22 MKRKFVKVMFFGALALSTVTYVGCKDYDDDIDNLQTQIDANKASIADLQKFVDAGNWVKS VEPVTGGFKVTFNDGKSYSIINGAKGDKGDKGDTGAPGVAGGQGTPGSKVTIDPETGEWL IDKKGTGWYAKATDGHSPYIADGTNGDKGYWYFWDDKANDGKGGFVKGDKAQGDRGEVTE GTNGKSPYIEDGYWYYYDDAAGKFVKGSQAAGTDGVSPFIDEDGYWCYWDINAVNEDGTK GKLVRGDYAQTSVYVVEAADRPAWELHVGALNEDGTYTDKTIILPTADKLSSMTVVSIAD GKISEGMSEVTMYYGILAKGTTVKFHNVTYGSTEKATTLVGANSSVMHALINPVSVDFSE YPVELINSHGEVCYTVLSKEKNLSENPLTTRADGEKKWNQGIYDLTVSLIADKAADNNLN NKAIAYALRAKDAWGNNIISNYDVKVIAKNEALDITNKEVDVVYTSSAKLDELFAAELSK VAAHYYSIEKADLEAIGAVFDEATNTISSPTKQGTIEVKVNYLKTDGTKVEGGNAKTFTV NFTYDTGADITIADAVVWNLTKGADATDPSNNTSDLVVNSTDDLFAILNKEFTGKKKPTV RLKSIGFAAGDKISVNGNEYAYESGSIDLVGAQPVTITDNKGKVTGYKIAFQFNPTMVAA VMHNAVLEIINPDYQPTLGDEVIKKVNVKVEVKNAGIFAFTPLSAYFTSSETAIAYGTPD KSNAVVNQDLYALFDEMSAEDKGHISFDEKAPNFGTDDNPEMGANWLVNKTNSVIAVPLY KSTKNDGVYSTRRMTIAYQPFGNKRLATISKEFNLTVRSAIKEGSHPDIAASKAKVLSIK ESEKLFKIKPADFTVKDVKDVAVTIAKTSRDSRVTDVTIELSSEIAKYAEIAGATSTPVA FEGELTVQLKATTLSIASKVEGYALVKITDTWGAVTEVKCPIVMKADGAN >gi|222159230|gb|ACAB01000129.1| GENE 51 74245 - 75183 842 312 aa, chain - ## HITS:1 COG:no KEGG:BT_4479 NR:ns ## KEGG: BT_4479 # Name: not_defined # Def: integrase protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 312 1 305 305 496 79.0 1e-139 MLKVVTFMKQVARGLQMEGNFGTAHVYRSSLNAIIAYCGKEDFTFYEVSPEWLKGFEVHL RSRGCSWNTVSTYLRTFRAVYNRAVDLRKAPYVPHLFRSVYTGTRADHKRALGEEDIKKV FARLSQAFVGSPSLRKAQELFILMFLLRGLPFVDLAYLRKSDLRDNVITYRRRKTGRTLS VTLTAEAMILVKRYMNRDSSSPYLFSLLKSREGTPEAYREYQLALRTFNRQLMLLGELLG 
LGDRLSSYTARHTWATTAYYCEIHPGIISEAMGHSSITVTETYLKPFRNKKIDEANRQVV DFVKRTISGLIA >gi|222159230|gb|ACAB01000129.1| GENE 52 75345 - 76982 1674 545 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 [Haemophilus parasuis 29755] # 2 545 3 547 547 649 60 0.0 MAKEILFNIDARDQLKKGVDALANAVKVTLGPKGRNVIIEKKFGAPHITKDGVTVAKEIE LADAYQNTGAQLVKEVASKTGDDAGDGTTTATVLAQAIVAEGLKNVTAGASPMDIKRGID KAVAKVVESIKDQAETVGDNYDKIEQVATVSANNDPVIGKLIADAMRKVSKDGVITIEEA KGTDTTIGVVEGMQFDRGYLSAYFVTNTEKMECEMEKPYILIYDKKISNLKDFLPILEPA VQTGRPLLVIAEDVDSEALTTLVVNRLRSQLKICAVKAPGFGDRRKEMLEDIAILTGGVV ISEEKGLKLEQATIEMLGTADKVTVTKDYTTVVNGAGNKDSIKERCEQIKAQIVATKSDY DREKLQERLAKLSGGVAVLYVGAASEVEMKEKKDRVDDALRATRAAIEEGIIPGGGVAYI RAIDAIDGMKGDNADETTGVEIIKRAIEEPLRQIVANAGKEGAVVVQKVREGKGDFGYNA RLDIYENLHAAGVVDPAKVARVALENAASIAGMFLTTECVIVEKKEDKPEMPMGAPGMGG MGGMM >gi|222159230|gb|ACAB01000129.1| GENE 53 77026 - 77298 432 90 aa, chain - ## HITS:1 COG:RC0969 KEGG:ns NR:ns ## COG: RC0969 COG0234 # Protein_GI_number: 15892892 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Co-chaperonin GroES (HSP10) # Organism: Rickettsia conorii # 1 89 5 98 99 99 56.0 2e-21 MNIKPLADRVLILPAPAEEKTIGGIIIPDTAKEKPLKGEVVAVGHGTKDEEMVLKVGDTV LYGKYAGTELDVEGTKYLIMRQSDVLAVLG >gi|222159230|gb|ACAB01000129.1| GENE 54 77561 - 78592 859 343 aa, chain + ## HITS:1 COG:no KEGG:BT_1831 NR:ns ## KEGG: BT_1831 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 343 7 348 348 592 83.0 1e-168 MNRIASGLIIASCVFYSCTSRTGEISPVHEQQQTDSLSQDTITQPEVKPVEKKLTAEQIQ ITKDLLYDQYTLEDTYPYKDTTRQFQWDKIKERLALLENIQLQPSTWAILQNYKNRNGEA PLVKNFKRNAYGRVADTLGIERYQSVPLYLLTDTLVPERYGQDGELTRFIEDGEKFIKAK PMFTGDEWMIPKKYVKVIGDTIVFNKAVFVDRHNQNIAALERSGEGQWVVRSMNPSTTGR HLPPYAQETPLGMFVLQEKKVKMVFLKDGSKETGGYAPYASRFTDGAYIHGVPVNAPRKT QIEYSPSLGTTPRSHMCVRNATSHAKFIYDWAPVNETIIFVLE >gi|222159230|gb|ACAB01000129.1| GENE 55 79003 - 79710 482 235 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|237713801|ref|ZP_04544282.1| ## NR: gi|237713801|ref|ZP_04544282.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 235 1 235 235 384 100.0 1e-105 MKTILWSILCLFLSGWGSMQTVSAQDLQEMEKNLSAINEDLNQKTKEYSWQLAAAYADYC EANNKYISWNDLPYLQTVVEYERPASLETYRLAHKASKDELDKFLNTYKEYKDLTKKQKE AVTKEEKDAVSTAFSAFWKKLRSEENPYKDLYYAERKAISKYRAEALRYVIAHYKEKKQE IPTSYIKYAERSYLLQKGSALELLQKEINALESVQRELVQNITRARYGLGKTEDK >gi|222159230|gb|ACAB01000129.1| GENE 56 79770 - 80318 461 182 aa, chain - ## HITS:1 COG:CC3650 KEGG:ns NR:ns ## COG: CC3650 COG0494 # Protein_GI_number: 16127880 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Caulobacter vibrioides # 8 180 6 177 187 97 31.0 1e-20 MEEKNRAWKTVSSKYLFRRPWLTVRCEDMLLPNGNHIPEYYILEYPDWVNTIAITKDGKF VFVRQYRPGIERTCYELCAGVCEKEDASPLVSAQRELWEETGYGKGNWQEYMVISANPST HTNLTYCFLATDVELIDHQHLEATEDISVHLLTLEEVRSLLDKNEIMQALNAAPLWKYIA NL >gi|222159230|gb|ACAB01000129.1| GENE 57 80389 - 80562 232 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713803|ref|ZP_04544284.1| ## NR: gi|237713803|ref|ZP_04544284.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 57 1 57 57 90 100.0 2e-17 MNSSIFEQRSRFAMIGALMVIISLLFLFYMGSSLVSSTKKYLEQIQGIEITCIDTDE >gi|222159230|gb|ACAB01000129.1| GENE 58 80522 - 80722 78 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAKRLRCSKILLFITLPPLSFFVHHKDNTRGLQKSYKNVSKNLNIEILFVSLHQLTTSLL QYGIYE >gi|222159230|gb|ACAB01000129.1| GENE 59 80706 - 82175 1413 489 aa, chain + ## HITS:1 COG:CAC3230 KEGG:ns NR:ns ## COG: CAC3230 COG4624 # Protein_GI_number: 15896476 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Clostridium acetobutylicum # 173 411 95 340 450 149 36.0 1e-35 MAFTNNIMIVRHKLLADLVKLWKNDELVEKIDRLPIELSPRKSKPLGRCCVHKERAVWRY KTFPLMGLDMTDEHDEVTPLSEYARLALSRPEPDKENIMCVIDEACSSCVQINYEITNLC RGCVARSCYMNCPKDAIRFKKNGQAMIDHDTCVSCGICHKSCPYHAIVYIPVPCEEACPV KAISKDEHGIEHIDESKCIYCGKCMNACPFGAIFEISQTFDVLQRIRKGEKMVAIVAPSI LGQFKTSIGQVYGAFKEIGFTDVIEVAEGAMATTSNEAHELLEKLEEGQKFMTTSCCPSY IELVEKHIPDMKPYVSTTGSPMYYAARIAKEKHPDAKVVFVGPCVAKRKEVRRDEAVDYI LTFEEIGSILDGLGIELEQVQEFSVLHTSVREAHGFAQAGGVMGAVKAYLKEEADKINAI QVSDINKKNIALLRACAKTGKAAGQFIEVMACEGGCITGPSTHNDIVSGRRQLAQELLKR KESYETMDR >gi|222159230|gb|ACAB01000129.1| GENE 60 82156 - 83232 605 358 aa, chain + ## HITS:1 COG:CAC1631 KEGG:ns NR:ns ## COG: CAC1631 COG0502 # Protein_GI_number: 15894909 # Func_class: H Coenzyme transport and metabolism # Function: Biotin synthase and related enzymes # Organism: Clostridium acetobutylicum # 5 341 8 339 350 250 38.0 2e-66 MKQWIDKLRQERTLSPEELRQLLTGCDAETLRYINKQAQEVALRHFGNKIYIRGLIEVSN CCRNNCYYCGIRKGNPNIERYRLNLESILDCCRQGYKLGFRTFVLQGGEDPALTDDRIEM TVARIRQNYPDCAITLSLGEKSRDAYERFFRAGANRYLLRHETHNESHYRQLHPAEMSGK QRLQCLADLKDIGYQTGTGIMVGSPGQTVEHIIEDILFIEKLQPEMIGIGPFLPHHDTPF AQYPSGTVERTILLLSIFRLMHPSTLIPATTALATLIPDGRERGILAGANVVMPNLSPRE ERRKYELYNDKASLGAESAEGLATLQKQLNTIGYEISTERGDFKCTTDYTDSHRFISN >gi|222159230|gb|ACAB01000129.1| GENE 61 83285 - 84706 1376 473 aa, chain + ## HITS:1 COG:CAC1356 KEGG:ns NR:ns ## COG: CAC1356 COG1060 # Protein_GI_number: 15894635 # Func_class: H 
Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Clostridium acetobutylicum # 2 473 1 472 472 720 76.0 0 MIYQKDSSKAEEFIHHEEILDTLEYAQNNKDNRVLIEQLIEKAALCKGLTHREAAILLEC NQPDLIERIFHLAKEIKQKFYGNRIVMFAPLYLSNHCVNSCVYCPYHAKNKTIARKKLTQ EEIRREVIALQDMGHKRLALEAGEHPSLNPIEYILESIQTIYSIKHKNGAIRRVNVNIAA TTVENYRRLKEAGIGTYILFQETYHKDNYEALHPTGPKSNYAYHTEAMDRAMEGGIDDVG IGVLFGLNTYRYDFIGLLMHAEHLEAKFGVGPHTISVPRICSADDIDAGDFPNAISDEIF SKIVAVIRIAVPYTGMIISTRESQESRKKVLELGISQISGGSRTSVGGYAETELPDHNSA QFDVSDTRTLDEVVNWLLELGYIPSFCTACYREGRTGDRFMSLVKSGQIANCCGPNALMT LKEYLEDYASEDTRQKGLELILKETERIPNPKIREIAIRNLKAIAAGQRDFRF >gi|222159230|gb|ACAB01000129.1| GENE 62 84718 - 85908 933 396 aa, chain + ## HITS:1 COG:CAC1651 KEGG:ns NR:ns ## COG: CAC1651 COG1160 # Protein_GI_number: 15894928 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Clostridium acetobutylicum # 3 391 4 391 411 331 46.0 1e-90 MSLTDTPNANRLHITLFGRRNSGKSSLINALTGQDTALVSNTPGTTTDLVSKAMEIQGIG PCLFIDTPGFDDEGELGELRISRTLKAIEKTDIALLLCEDTTFFHEKEILALLKEKNIPV IPVLNKIDIRENSDHLATYIEEQCKISPLLISAKEKIGIELIRQAILEKLPSDFDQQNIT GKLVAENDLVLLVMPQDIQAPKGRLILPQVQTIRELLDKKCLVMTCTTDKFSATLQALAR PPKLIITDSQVFKAIYEQKPAESELTSFSVLFAGYKGDIHYYVESAAVIERLTESSRVLI AEACTHAPLSEDIGRVKLPRLLRKRIGENLQIDMVAGTDFPQDLTPYSLVIHCGACMFNR KYVLSRIERAREQHIPMTNYGVAIAFLNGILDQIKY >gi|222159230|gb|ACAB01000129.1| GENE 63 86051 - 87415 1659 454 aa, chain - ## HITS:1 COG:HP1190 KEGG:ns NR:ns ## COG: HP1190 COG0124 # Protein_GI_number: 15645804 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Histidyl-tRNA synthetase # Organism: Helicobacter pylori 26695 # 5 440 4 422 442 248 34.0 3e-65 MAAKPSIPKGTRDFSPVEMAKRNYIFNTIRDVYHLYGFQQIETPSMEMLSTLMGKYGDEG DKLLFKIQNSGDYFSGITDEELLSRNAAKLASKFCEKGLRYDLTVPFARYVVMHRDEITF PFKRYQIQPVWRADRPQKGRYREFYQCDADVVGSDSLLNEVELMQIVDTVFSRFNIRVCI KINNRKILSGIAEIIGESDKIVDITVAIDKLDKIGLDNVNAELKEKGISDEAIAKLQPII 
LLSGTNAEKLATLKNVLSASEVGLKGVEESEFILNTLETMGLKNEIELDLTLARGLNYYT GAIFEVKALDVQIGSITGGGRYDNLTGVFGMAGVSGVGISFGADRIFDVLNQLELYPKEA VNGTELLFINFGEKEAAYSMGILAKVRAANIRAEIFPDAAKMKKQMSYANAKNIPFVAIV GENEMNEGKVMLKNMETGEQNLVSADELIAIVKK >gi|222159230|gb|ACAB01000129.1| GENE 64 87656 - 88336 782 226 aa, chain - ## HITS:1 COG:BH1677 KEGG:ns NR:ns ## COG: BH1677 COG2738 # Protein_GI_number: 15614240 # Func_class: R General function prediction only # Function: Predicted Zn-dependent protease # Organism: Bacillus halodurans # 2 226 1 223 224 180 43.0 2e-45 MMSYWVLFIGIAVVSWLVQMNLQNKFKKYSKIPTGNGMTGRDVALKMLHDNGIYDVQVTH TPGRLTDHYNPTNKTVNLSEGVYESNSIMAAAVAAHECGHAVQHARMYAPLKMRSALVPV VNFASSIMTWVLLGGILLINSFPQLLLAGIILFAMTTLFSFITLPVEINASKRALVWLSS SGITNSYNHAQAEDALRSAAYTYVVAALGSLATLVYYIMIFMGRRD >gi|222159230|gb|ACAB01000129.1| GENE 65 88457 - 89755 1123 432 aa, chain - ## HITS:1 COG:TM0306 KEGG:ns NR:ns ## COG: TM0306 COG3669 # Protein_GI_number: 15643075 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Thermotoga maritima # 24 350 8 358 449 134 30.0 3e-31 MKTRFITFLLLFVMNLGVFAQSSYQPTEENLKARQEFQDNKFGIFLHWGLYAMLATGEWT MTNNNLNYKEYAKLAGGFYPSKFDADKWVAAIKASGAKYICFTTRHHEGFSMFDTKYSDY NVVKATPFKRDIVKELAAACAKQGIKLHFYYSHLDWAREDYPWGRTGQGTGRSNSKGDWK SYYQFMNNQLTELLTNYGPIGAIWFDGWWDQPKTFNWELPEQYALIHKLQPGCLVGNNHH QTPFDGEDIQIFERDLPGENASGLSGQEVSRLPLETCETMNGMWGYKITDQNYKSTKTLI HYLVKAAGKNANLLMNIGPQPDGELPAVAVQRLAETGEWMKQYGETIYGTRSGAVAPHDW GVTTQKGNKLYVHILDLKDAALFLPLTDKKVKKAVLFKDQSPVRFTKTKAGVLLEFAEVP KDIDYVVELTID >gi|222159230|gb|ACAB01000129.1| GENE 66 89865 - 91136 1604 423 aa, chain - ## HITS:1 COG:PM0938 KEGG:ns NR:ns ## COG: PM0938 COG0104 # Protein_GI_number: 15602803 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate synthase # Organism: Pasteurella multocida # 5 417 6 424 432 382 48.0 1e-106 MKVDVLLGLQWGDEGKGKVVDVLTPKYDVVARFQGGPNAGHTLEFEGQKYVLRSIPSGIF QGNKVNIIGNGVVLDPALFKAEAEALEASGHPLKERLHISKKAHLILPTHRILDAAYEAA 
KGDAKVGTTGKGIGPTYTDKVSRNGVRVGDILHNFDQKYAAAKARHEQILKSLNYEYDLT ELEKAWLEGIEYLKQFHFVDSEHEVNNLLKDGKSVLCEGAQGTMLDIDFGSYPFVTSSNT VCAGACTGLGVAPNRIGEVYGIFKAYCTRVGAGPFPSELFDETGDKMCTLGHEFGSVTGR KRRCGWIDLVALKYSVMINGVTKLIMMKSDVLDTFETIKACVAYKVNGEEIDYFPYDITE GVEPVYAELPGWQTDMTKMQSEDEFPEEFNAYLTFLEEQLGVEIKIVSVGPDRAQTIERY TEE >gi|222159230|gb|ACAB01000129.1| GENE 67 91133 - 91621 248 162 aa, chain - ## HITS:1 COG:Cj0400 KEGG:ns NR:ns ## COG: Cj0400 COG0735 # Protein_GI_number: 15791767 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Campylobacter jejuni # 1 154 1 157 157 78 32.0 6e-15 METQNVKDTVRQIFTEYLTANGHRKTPERYAILDTIYSIDGHFDIDMLYSRMMDQENFRV SRATLYNTIILLINARLVIKHQFGTSSQYEKSYNRETHHHQICTQCGKVTEFQNEELQQA IENTKLSRFQLSHYSLYIYGVCSKCDRANKRKKVNNNNKKEK >gi|222159230|gb|ACAB01000129.1| GENE 68 91878 - 92543 716 221 aa, chain - ## HITS:1 COG:no KEGG:BT_1845 NR:ns ## KEGG: BT_1845 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 218 1 218 221 402 91.0 1e-111 MDIKEQLKDIKTQLRLSMNGAVSQSMREKGLVYKLNFGVELPRIKMIAEGYEKNHDLAQA LWKEEIRECKILAGMLQPIETFYPEIADIWVENIRNIEIAELTCMNLFQHLPYAPAKSFH WIADEQEYVQTCGFLTAARLLMKKGDMTERASGELLDQAICAVHSDSYHIRNAALLVIRK YMQHSEEHAFQVCRLVEGMADSTLEGEQMLYNMVKEELAVD >gi|222159230|gb|ACAB01000129.1| GENE 69 92550 - 94598 2227 682 aa, chain - ## HITS:1 COG:no KEGG:BT_1846 NR:ns ## KEGG: BT_1846 # Name: not_defined # Def: putative dipeptidyl-peptidase III # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 682 1 675 675 1233 89.0 0 MKKHLILMTVAATLLTSCGGSKTTTAEADKFDYTVEQFADLQILRYKVPGFEELTLKQKE LIYYLTEAALEGRDILFDQNGKYNLRIRRMLEAVYTNYQGDKTTPDFKNMEVYLKRVWFS NGIHHHYGTEKFVPNFSQEFLKQAVLGIDAKLLPLAKGQTAEQLCAELFPVIFDPTVMPK RVNQADGEDLVLTSACNYYDGVTQKEAESFYSALKDPKDETPVSYGLNSRLVKENGKLEE KVWKVGGLYTQAIEKIVYWLKKAEGVAENEAQKAVITKLIQFYETGNLKDFDEYAILWVK DLDSRIDFVNGFTESYGDPLGMKASWESLVNFKDLESTHRTEIISSNAQWFEDHSPVDKS FKKEKVKGVSAKVITAAILAGDLYPATAIGINLPNANWIRAHHGSKSVTIGNITDAYNKA 
AHGNGFNEEFVYSDAEIQLIDAYSDLTDELHTDLHECLGHGSGKLLPGVDPDALKAYGST IEEARADLFGLYYVADPKLVELGLLYSPDAYKAQYYTYLMNGLMTQLVRIEPGNTVEEAH MRNRQLIARWVFEKGAADKVVELTKKDGKTYVVINDYQKVRELFGELLAEIQRIKSTGDF EGARSLVENYAVKVDPALHAEVLERYKKLNLAPYKGFVNPKYELVTDENGNVTDVTVSYD EGYVEQMLRYSTDYSPLPSINN >gi|222159230|gb|ACAB01000129.1| GENE 70 94694 - 95158 499 154 aa, chain - ## HITS:1 COG:VCA0926 KEGG:ns NR:ns ## COG: VCA0926 COG2207 # Protein_GI_number: 15601680 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Vibrio cholerae # 56 152 272 365 365 63 35.0 1e-10 MSDLENKKTEETPKKRPYNLREKKEKKAAYRSLIRPELADELYDKILNIIVVQKKYKDPD YSAKDLAKELKTNTRYLSAVVNSRFGMNYSCLLNEYRVKDALHLLTDKRYADKNVEEISA MVGFANRQSFYAAFYKNVGETPNGYRKKHLENKK >gi|222159230|gb|ACAB01000129.1| GENE 71 95429 - 97237 1342 602 aa, chain + ## HITS:1 COG:ECs4752 KEGG:ns NR:ns ## COG: ECs4752 COG0514 # Protein_GI_number: 15834006 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Escherichia coli O157:H7 # 2 602 16 606 611 528 45.0 1e-150 MRETLKTYFGYDNFRPLQEEIIRHILNKQDALVLMPTGGGKSICYQLPALLCEGTAVVVS PLISLMKDQVEALLANGIAAGALNSSNDETENANLRRACIEGRLKLLYISPEKLLAEKDY LLRDMNISLFAIDEAHCISQWGHDFRPEYTQMGVLHQQFPQIPIVALTATADKITREDIV RQLHLNHPRTFISSFDRPNISLTVKRGFQAKEKNKAILEFIHRHGGESGIIYCMSRSKTE TVAQMLQKQGIRCGVYHAGLSTQHRDETQNDFINDRIQVVCATIAFGMGIDKSNVRWVIH YNLPKSIESFYQEIGRAGRDGLPSNTVLFYSLGDLILLTKFASESNQQNINLEKLQRMQQ YAEADICRRRILLSYFGETTTEDCGNCDVCKNPPQRFDGTVIVQKALSAIARTEQQISTG VLIDILRGNYSAEVTGKGYQELKTFGAGRDIPPRDWQDYLLQMLQLGYFEIAYNENNHLK ITSSGSDVLFGRTQAALVVIQHEEAVTRKGKKKKVVIAKELPFGAAGGESQDLFEALRGL RKQLADQEALPAYIVLSDKVLHLLCISRPTTIEEFGEISGIGEHKKKKYGKDFVNLIRQF VE >gi|222159230|gb|ACAB01000129.1| GENE 72 97444 - 98262 953 272 aa, chain - ## HITS:1 COG:FN1787 KEGG:ns NR:ns ## COG: FN1787 COG0457 # Protein_GI_number: 19705092 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 46 257 86 294 628 80 29.0 3e-15 
MVRIIIALFFCFPAVTFAQTYQQLSERAIECIEKDSLPQAEELLLQALKLEPKNAKNALL FSNLGLVQRRLGEFDKALESYSFALNFAPLAVPILLDRAAIYMEMGKTDRAYTDYCQVLD EDKQNKEALLMRAYIYVLRRDYPAARIDYNRLLELDPQSYSGRLGLATLEQKEGKFREAL EILNKMLAATPEDATLYIARADVEREMKHEDLALVDLEEAIRLDAASADAYLLRGNIYLA QKKKGLAKADFEKAISLGVPPADLHEQLRQCK >gi|222159230|gb|ACAB01000129.1| GENE 73 98290 - 99240 729 316 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 4 311 3 307 308 285 50 7e-76 MAKIAKKLTDLVGNTPLMELSGYSGKYGLEQNIIAKLEAFNPAGSVKDRVALSMIEDAEA RGALKSGATIIEPTSGNTGVGLAMVATIKGYHLILTMPETMSLERRNLLKALGAQIVLTD GLGGMAASIAKAQELRDSIPGSVILQQFENPANAAVHERTTGEEIWRDTDGEVAVFVAGV GTGGTVCGVARALKKHNPNIYIVAVEPASSPVLAGGEAASHRIQGIGANFIPKLYDASVV DEVIGVPDDEAIRAGRELAATEGLLAGISSGAAVYAARQLSQRPEFKNKKIVALLPDTGE RYLSTELFAFDAYPLD >gi|222159230|gb|ACAB01000129.1| GENE 74 99372 - 101207 1238 611 aa, chain + ## HITS:1 COG:VCA0802 KEGG:ns NR:ns ## COG: VCA0802 COG1368 # Protein_GI_number: 15601557 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Vibrio cholerae # 190 605 196 632 657 187 30.0 4e-47 MKKRIIQFLTTYFLFVLLFVLQKPIFMVYYHDLYTNASLGDYFRVMWHGLPLDLSLAGYL TAIPGILLIASAWTNSSILRRIRQGYFGVIAFVMACIFIIDLGLYGFWGFRLDATPIFYF FSSPKDAMASVSFWFVLLGILAILIYAAILYCIFYCVLIREKKPLKIPYRRQNVSLALLL LTAALFIPIRGGFSVSTMNLSKVYFSQDQRMNHAAINPAFSFMYSATHQNNFDKQYRFMD PKIADELFAEMVDKPVAATDSIPQLLNTQRPNIIFIILESFSTHLMETFGGQPNVAVNMD KFAKEGVLFSNFYGSSFRTDRGLASIISGYPGQPSTSIMKYPEKTDKLPSIPRSLKNAGY SLEYYYGGDADFTNMRSYLVSSGIEKIISDKDFPLSERTGKWGAQDHVLFQRLMKDLKEE KQEEPFLKLVQTSSSHEPFEVPFHRLDDKVLNSFAYADSCVGDFVKQYQETPLWKNTLFV LVPDHQGAYPYPIENPLDGQTIPLILIGGAIKQPLVVDTYASQIDIAATLLAQLGLPHDE FTFSKNILNPGSPHFAYFTRPDYFGMITADNQLVYNLDANTVQLDEGTAKGANLEKGKAF LQKLYDDLAKR >gi|222159230|gb|ACAB01000129.1| GENE 75 101303 - 101992 481 229 aa, chain + ## HITS:1 COG:MJ0374_2 KEGG:ns NR:ns ## COG: MJ0374_2 COG0671 # Protein_GI_number: 15668550 
# Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Methanococcus jannaschii # 10 191 1 162 168 59 28.0 4e-09 MIHSEIVQFLSEIDTNIFLSFNGIHSPFWDYFMSSFTGKIIWVPMYATILYILLRNFHWK VVVCYVAAIALTITFADQMCSSIIRPVVARLRPANPDNPIVDMVYIVNGYRGGSYGFPSC HAANSLGLAMFVVFLFRKRWLSIFIVTWAILNCYTRIYLGVHYPGDLLVGGIIGGFGGWL FCTIAHKIAIYLEPSTRTKRKDIKQWSVTIYVGLLTVLGIIIYSTIKSW >gi|222159230|gb|ACAB01000129.1| GENE 76 102210 - 103271 1358 353 aa, chain - ## HITS:1 COG:aq_244 KEGG:ns NR:ns ## COG: aq_244 COG0473 # Protein_GI_number: 15605790 # Func_class: C Energy production and conversion; E Amino acid transport and metabolism # Function: Isocitrate/isopropylmalate dehydrogenase # Organism: Aquifex aeolicus # 3 352 4 358 364 347 49.0 2e-95 MDFKIAVLAGDGIGPEISVQGVDVMSAVCEKFGHKVSYEYAICGADAIDKVGDPFPEATY QVCKDADAVLFSAVGDPKFDNDPTAKVRPEQGLLAMRKKLGLFANIRPVQTFKCLIHKSP LRAELVENADFICIRELTGGMYFGEKYQDNDKAYDTNYYTRPEIERILKVAFEYAMKRRK HLTVVDKANVLASSRLWRQIAQEMAPNYPEVTTDYMFVDNAAMKMIQEPAFFDVMVTENT FGDILTDEGSVISGSMGLLPSASTGESTPVFEPIHGSWPQAKGLNIANPLAQILSVAMLF EYFDCKEEGALIRKAVDASLDENVRTPEIQVADGAKYGTKEVGQWIVDYIKKA >gi|222159230|gb|ACAB01000129.1| GENE 77 103332 - 104879 1796 515 aa, chain - ## HITS:1 COG:MK0391 KEGG:ns NR:ns ## COG: MK0391 COG0119 # Protein_GI_number: 20093829 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Methanopyrus kandleri AV19 # 7 505 4 491 499 244 34.0 3e-64 MGKEGVKIEIMDTTLRDGEQTSGVSFVPHEKLMIARLLLEDLKVDRVEVASARVSDGEFE AVKMICDWAARRNLLHKVEVLGFVDGHTSVDWIQRTGCRVINLLCKGSLKHCTQQLKKTP EEHLADIINVVHYADEQDITVNVYLEDWSNGIKDSPEYVFQLMDGLQETSVKRYMLPDTL GILNPLQVIEYMRKMKKRYPNTHLDFHAHNDYDLAVSNVLAAVLSGVKGLHTTINGLGER AGNAPLASVQAILKDHFNAATNIDESRLNDVSRVVESYSGIMIPANKPIVGENVFTQVAG VHADGDNKNNLYCNDLLPERFGRKREYALGKTSGKANIRKNLEDLGLELDEDAMRKVTER IIELGDKKELVTQEDLPYIVSDVLKHGAIGEKVKLKSYFVNLAHGLKPMATLKIEINGKE YEESSGGDGQYDAFVRALRKIYKVTLGRKFPMLTNYAVTIPPGGRTDAFVQTVITWSYDE 
QVFRTRGLDADQTEAAIKATMKMLNLIEDEYEKSK >gi|222159230|gb|ACAB01000129.1| GENE 78 104861 - 105457 753 198 aa, chain - ## HITS:1 COG:NMB1034 KEGG:ns NR:ns ## COG: NMB1034 COG0066 # Protein_GI_number: 15676921 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase small subunit # Organism: Neisseria meningitidis MC58 # 6 197 4 202 213 169 46.0 3e-42 MAKTKFNIITSTCVPLPLENVDTDQIIPARFLKATTREEKFFGDNLFRDWRYNADGSLNK DFVLNNPTYSGQILVAGKNFGSGSSREHAAWAIAGYGFRVVVSSFFADIHKNNELNNFVL PVVVTEEFLQELFDSIEADPKMEVEVNLPEQTITNKATGKSEHFEINAYKKLCLMNGLDD IDFLLSNKDKIEEWEKKA >gi|222159230|gb|ACAB01000129.1| GENE 79 105499 - 106893 1616 464 aa, chain - ## HITS:1 COG:NMA1450 KEGG:ns NR:ns ## COG: NMA1450 COG0065 # Protein_GI_number: 15794355 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase large subunit # Organism: Neisseria meningitidis Z2491 # 3 461 5 466 469 511 56.0 1e-145 MNTLFDKIWDAHVVTTVEDGPTQLYIDRLYCHEVTSPQAFAGLRERGIGVLRPEKVFCMP DHNTPTHDQDKEIEDPISKTQVDTLTKNAKDFGLTHYGMMHPKNGIIHVVGPERGLTLPG MTIVCGDSHTSTHGAMGAIAFGIGTSEVEMVLASQCILQSRPKTMRITVDGELGKGVTAK DVALYMMSKMTTSGATGYFVEYAGSVIRNLTMEGRLTLCNLSIEMGARGGMVAPDEVTFE YIKGRESAPQGEAWDKALAYWKTLKSDDDAVFDKEVRFEAADIEPMITYGTNPGMGMGIT QHIPTMEGMSEAAQVSFKKSMDYMEFQPGESLLGKKIDYVFLGACTNGRIEDFRAFASIV KGRKKAENVIAWLVPGSWMVDAQIRKEGIDKILTEAGFAIRQPGCSACLAMNDDKIPAGK YSVSTSNRNFEGRQGPGARTLLASPLVAAAAAVTGVITDPRELM >gi|222159230|gb|ACAB01000129.1| GENE 80 106982 - 108478 1599 498 aa, chain - ## HITS:1 COG:VC2490 KEGG:ns NR:ns ## COG: VC2490 COG0119 # Protein_GI_number: 15642486 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Vibrio cholerae # 1 494 1 496 516 461 50.0 1e-129 MDNRLFIFDTTLRDGEQVPGCQLNTVEKIQVAKALEALGVDVIEAGFPISSPGDFNSVIE ISKAVTWPTICALTRAVQKDIDVAVDALKFAKHKRIHTGIGTSDSHIKYKFNSNREEIIE RAVAAVKYARRFVDDVEFYAEDAGRTDNEYLARVVEAVIKAGATVVNIPDTTGYCLPSEY GAKIKYLIDHVDGIDNAILSTHCHNDLGMATANTIAGVLNGARQVEVTINGIGERAGNTA 
LEEIAMIIKSHHEIDIQTNINTQKIYPTSRMVSSLMNMPVQPNKAIVGRNAFAHSSGIHQ DGVLKNVQTYEIIDPHDVGIDDNSIVLTARSGRAALKNRLSILGVDLDQEKLDKVYDEFL KLADKKKDINDDDILVLAGADRSQNHRIKLDYLQVTSGVGVRSVASLGLNIAGEKFEACA SGNGPVDAAIKALKKIVERHMTLKEFTIQAISKGSDDVGKVHMQVEYDNQIYYGFGANTD IIAASVEAYIDCINKFKS >gi|222159230|gb|ACAB01000129.1| GENE 81 108881 - 109114 183 77 aa, chain - ## HITS:1 COG:no KEGG:BT_1862 NR:ns ## KEGG: BT_1862 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 77 1 80 80 67 83.0 2e-10 MKTKLFLAAVAVTFSFAMMSCAGNKTTNAASEGEETTVETVEAVVESDSCCQAKDSCATA CDKKADCAEKKECCDKK Prediction of potential genes in microbial genomes Time: Wed May 18 04:05:58 2011 Seq name: gi|222159229|gb|ACAB01000130.1| Bacteroides sp. D1 cont1.130, whole genome shotgun sequence Length of sequence - 8783 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 8, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 3 - 62 3.2 1 1 Tu 1 . + CDS 84 - 656 332 ## BPSL0752 hypothetical protein + Term 870 - 913 3.1 - Term 858 - 901 5.3 2 2 Tu 1 . - CDS 903 - 1139 233 ## BF0521 hypothetical protein - Prom 1218 - 1277 10.6 + Prom 1210 - 1269 6.0 3 3 Op 1 . + CDS 1430 - 1735 183 ## gi|262406349|ref|ZP_06082898.1| predicted protein 4 3 Op 2 . + CDS 1809 - 3767 465 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily + Term 3940 - 3977 -0.9 5 4 Tu 1 . - CDS 3779 - 4366 355 ## BF0465 putative integrase/recombinase - Prom 4394 - 4453 3.6 6 5 Tu 1 . - CDS 4490 - 4795 230 ## BF0517 hypothetical protein - Prom 4869 - 4928 6.0 + Prom 4810 - 4869 4.1 7 6 Tu 1 . + CDS 4937 - 5179 126 ## gi|262406345|ref|ZP_06082894.1| conserved hypothetical protein + Prom 5359 - 5418 6.0 8 7 Tu 1 . + CDS 5541 - 6920 398 ## gi|237713558|ref|ZP_04544039.1| predicted protein 9 8 Tu 1 . 
- CDS 7017 - 8783 817 ## BBR47_35840 putative integrase Predicted protein(s) >gi|222159229|gb|ACAB01000130.1| GENE 1 84 - 656 332 190 aa, chain + ## HITS:1 COG:no KEGG:BPSL0752 NR:ns ## KEGG: BPSL0752 # Name: not_defined # Def: hypothetical protein # Organism: B.pseudomallei # Pathway: not_defined # 2 175 113 294 345 86 30.0 5e-16 MVLDGKFLNKIKSITNENERISEEDLLNELSKYKSYKTYNNKSWKYIYEFSKSRLLTELD YSGFNSSIMDLNYFITKYIADEQKRAISLRMVYVILAHTLIIMDFILKDIAFLEQKDRES KLSIGLKYGNLGKEGIDKIISMAMHISGVTSANTIMKSLDSIPVDILKDFFFKKMRMQKK HLVGLKNSAY >gi|222159229|gb|ACAB01000130.1| GENE 2 903 - 1139 233 78 aa, chain - ## HITS:1 COG:no KEGG:BF0521 NR:ns ## KEGG: BF0521 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 77 1 77 78 139 93.0 3e-32 MTKIEEFVIQKVKAMRMERKWSQMELADYMNMSASFIADIENPKRRAKYNLNHLNILAKV FGCSPRDFLPDTPVEEEI >gi|222159229|gb|ACAB01000130.1| GENE 3 1430 - 1735 183 101 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262406349|ref|ZP_06082898.1| ## NR: gi|262406349|ref|ZP_06082898.1| predicted protein [Bacteroides sp. 
2_1_22] # 1 101 57 157 157 190 100.0 2e-47 MFEFWVNDAISDFYIQGNIDYREELTTGKDVTKGCKMSIALVKLEEWYSSEKEYVSGTVT IEKWDLENFRVTLVFKDYKCKSGSKSIVLNGSVTFPTSINI >gi|222159229|gb|ACAB01000130.1| GENE 4 1809 - 3767 465 652 aa, chain + ## HITS:1 COG:PA3339_1 KEGG:ns NR:ns ## COG: PA3339_1 COG1752 # Protein_GI_number: 15598535 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Pseudomonas aeruginosa # 23 282 22 289 308 194 39.0 4e-49 MNKIALALISLILPFSLVYTSPRKKVGIVLSGGGAKGVAHIGVIKTLEELNIPIDYIAGT SIGAIIGGLYSIGYTSEQLETIVKQTDWIDLLTDKVSRKEIPFPFKSDDSKYLMSFPINN SGKNGGIIKGKNISQLLHQLTEGYQNVTNFDSLPIPFACIATDMVRNQKEVIRFGKLSEA MRASMAVPVVFSPIYSGQKVLIDGGFKDNLPIDVVKAMGADIIIGIDVQSELSDADNLHS IAGIANQLMLMICQSSLDESTKDIDAYIKVDVENYNAASFTKAAIDTLIIRGENAAKANY NTLLSIKNRVGIVNNKKNNKTKFPLFYPQYAPSNDNQLRIGLRFDSEDIAAVILNINLNK QKIGKAEIVARGGKQSFLRTQYHLPISKPQWITLSNQIGYNDIFLYSHGHKISNPTFIRN TSSLIYSIIPSRNLQMEVGASLDYHRYYRTLVSEDATLMKRKNLFLNYHTKLKYETLDKR YFPTKGLKTNISYTIYTNLHTNTAYSSLAAQITKIFPITLNTFIIPTIYGRFIFNNGIPF MYSNVIGGEGYNPHYEQQIPFSGLTHTESTLKLLGNAQIQVRHKFHEKQYLTLSGNYAIH QNQFSNFLQGKKIYGTSIGYGYDSPLGPIEGFICYSNRTNKLGFYLNIGFGF >gi|222159229|gb|ACAB01000130.1| GENE 5 3779 - 4366 355 195 aa, chain - ## HITS:1 COG:no KEGG:BF0465 NR:ns ## KEGG: BF0465 # Name: not_defined # Def: putative integrase/recombinase # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 195 1 195 195 333 89.0 2e-90 MSLKNSYVTSDYMEWDSMLSLVHKLYRDKEYRLSLLIGCGSFFGLRISDILALTWSMLLD DEKFVIIEKKTGKRREIKVNSNFQKHIADCFKALDIVDKDEKCFVSRKKTVYSTQRINVL FKSIKIKYGLKIEHFSTHSMRKTFGRKVVETAGENSEFALIKLSELFNHADIQTTRRYLG LRTEELLETYDMLSF >gi|222159229|gb|ACAB01000130.1| GENE 6 4490 - 4795 230 101 aa, chain - ## HITS:1 COG:no KEGG:BF0517 NR:ns ## KEGG: BF0517 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 101 1 100 100 102 62.0 5e-21 MKRYNKRQVMKDAHRLYNNDFQRRRRSWSECLRAAWSWERDAVKVLEEKAARLDAMIAAS WKAHNERKEAKTNENWYKGIDSETLSYAMGYRRGNNFYCGD 
>gi|222159229|gb|ACAB01000130.1| GENE 7 4937 - 5179 126 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262406345|ref|ZP_06082894.1| ## NR: gi|262406345|ref|ZP_06082894.1| conserved hypothetical protein [Bacteroides sp. 2_1_22] # 1 80 1 80 80 138 100.0 1e-31 MEEKEKKERTVIHLYIKENDTHHYYGSIANVFEYFSPNILGISYGSLRNYGLSNEKTFQN SKCIIRKGTLLSKSGNRGKK >gi|222159229|gb|ACAB01000130.1| GENE 8 5541 - 6920 398 459 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713558|ref|ZP_04544039.1| ## NR: gi|237713558|ref|ZP_04544039.1| predicted protein [Bacteroides sp. D1] # 1 459 1 459 459 813 100.0 0 MYKNEKFSFKLMISVNGYTEKPTSKDYNKISFEEKEITISGLEQFIKQGYIFSCTYIDKE IKSLFGYKKKERALHTNIIVFDIDNTLLSFDELLVTLKYQPSIAYTTFSHITKGNRYRLL YIFNTPLNQNQYQILYDNIITNIELDNHNRAIYQFYLGSYANCKLINNLIIYNYDDISIS NSIKEKPTPYNIQLQMEKRDKVVVEIKDHEYISNFWNLSHKDLIEKYREKYPFFENNLEE ANEDIPYITIPENYCCIKRYWFKEKTKLNNGNCVYSNRVCKIKDGNGRRKKLFLNAIVRK YMYPQIEFEHLLHCLANELYYYINNSKDPISKQELYDIAENAYNENIDKYSTMIKQQQTE KRKWIINPQYCIKYNLSAKTVRNMVKGKINNELIGLYYDCSKSLKENLSTFKALNLKVGK SKLYEWCKENNIPTNIKNNQSTKNTVGLYLYNPISAIAG >gi|222159229|gb|ACAB01000130.1| GENE 9 7017 - 8783 817 588 aa, chain - ## HITS:1 COG:no KEGG:BBR47_35840 NR:ns ## KEGG: BBR47_35840 # Name: not_defined # Def: putative integrase # Organism: B.brevis # Pathway: not_defined # 35 430 121 498 546 70 26.0 2e-10 VEQIKLFIDKGINLYFDKQKLWVRDSNKDVGSIILLHVLAVMSSYEIELFIERSLSGKIT KVQAGHGGGDERAYGYMHNKNKQIVINPIESKIVTRIFEMYVGGYSSIQISEILNAEKVP APYVRKLNEYKKNREAKGLEVKEYKFDTDNLKWRISTINRLMHNELYKGNRRITFYKPDP TNPLPLSERQDREIVYEYSEHVESLRIVSDELFQQAEDKLSKAHYNKNNAVRHENLLKHL MVCGECGANFSVGKSNETSKNYISGGRTYKCYGRVNRKDKPRTCTNGAELRQWKLDGLVL QLSIQMFAEINIQQTNILKIENLGKEIEELVQIKSSKNTELVEAENLYKKTLKRLIAIED DEVAKNLISDAKHKYDETKNVLNKTMDKLSREITTKRITIDNLKRLNANPLLINKMDEIR KNRDLVKTMVDEYIDKITIFRLHELWLLVVVSYKGGEEMWGTIKCARYKKEEMFYDELHC HYGVEFQGWLLNNTEHCFSYDKSTRIITYHGGSKIYVEFKSGEYDYDTFNQMIQETGWMG CFPFYAYEDSDKDLSVPFNEKFRKSLQENRVDWKAHNDKVLERLLSES Prediction of potential genes in microbial genomes Time: Wed May 18 04:06:41 2011 Seq name: 
gi|222159228|gb|ACAB01000131.1| Bacteroides sp. D1 cont1.131, whole genome shotgun sequence Length of sequence - 1342 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 273 120 ## gi|262406343|ref|ZP_06082892.1| predicted protein - Prom 467 - 526 4.7 + Prom 70 - 129 4.0 2 2 Tu 1 . + CDS 311 - 799 267 ## COG0013 Alanyl-tRNA synthetase + Prom 812 - 871 9.8 3 3 Tu 1 . + CDS 1071 - 1341 248 ## BF2625 small heat shock protein Predicted protein(s) >gi|222159228|gb|ACAB01000131.1| GENE 1 3 - 273 120 90 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262406343|ref|ZP_06082892.1| ## NR: gi|262406343|ref|ZP_06082892.1| predicted protein [Bacteroides sp. 2_1_22] # 1 90 1 90 679 173 100.0 3e-42 MNNMDKKVLIYCRVSTQQQTTDRQKEELLKFAVENHWNVEEEDIFIDVISGFKKGEFRPE YVKMLERVEYGDIDIILFSEFSRLARNATE >gi|222159228|gb|ACAB01000131.1| GENE 2 311 - 799 267 162 aa, chain + ## HITS:1 COG:aq_1293 KEGG:ns NR:ns ## COG: aq_1293 COG0013 # Protein_GI_number: 15606507 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Aquifex aeolicus # 23 142 559 679 867 70 33.0 2e-12 METNIEINKPSPKFQDVSIAPMHTAEHLLNATMVKTFGCPRSRNAHIERKKSKCDYILSS CPTAEQIQSIEETVNEAISQNLPVTIEYMTHEQAKDIVDLSKLPADASPTLRIVRIGDYD ACACIGLHVSNTSEMGTFKIISYDYDEERQTLRMRFKLIEKK >gi|222159228|gb|ACAB01000131.1| GENE 3 1071 - 1341 248 90 aa, chain + ## HITS:1 COG:no KEGG:BF2625 NR:ns ## KEGG: BF2625 # Name: not_defined # Def: small heat shock protein # Organism: B.fragilis # Pathway: not_defined # 1 90 1 90 142 127 85.0 1e-28 MMPVRRTQSWLPSIFNDFFDNDWMVKANATAPAINVLETEKEYKVELAAPGMTKDDFNVR IDEDNNLVISMEKKTENKEEKKDGRYLRRE Prediction of potential genes in microbial genomes Time: Wed May 18 04:07:04 2011 Seq name: gi|222159227|gb|ACAB01000132.1| Bacteroides sp. 
D1 cont1.132, whole genome shotgun sequence Length of sequence - 58484 bp Number of predicted genes - 44, with homology - 43 Number of transcription units - 25, operones - 8 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 26 - 157 64 ## BF2625 small heat shock protein - Term 189 - 246 10.8 2 2 Op 1 22/0.000 - CDS 320 - 1435 761 ## COG0842 ABC-type multidrug transport system, permease component - Prom 1455 - 1514 2.1 3 2 Op 2 45/0.000 - CDS 1524 - 2627 661 ## COG0842 ABC-type multidrug transport system, permease component 4 2 Op 3 10/0.000 - CDS 2630 - 4099 350 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 5 2 Op 4 . - CDS 4117 - 5052 909 ## COG0845 Membrane-fusion protein 6 2 Op 5 . - CDS 5045 - 6295 1182 ## BT_0560 outer membrane efflux protein - Prom 6425 - 6484 5.2 7 3 Tu 1 . + CDS 6466 - 7362 583 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 7216 - 7249 -1.0 8 4 Tu 1 . - CDS 7366 - 8421 876 ## COG0836 Mannose-1-phosphate guanylyltransferase - Prom 8445 - 8504 4.5 - Term 8537 - 8588 12.1 9 5 Op 1 24/0.000 - CDS 8616 - 11843 3294 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) - Term 11870 - 11905 1.1 10 5 Op 2 . - CDS 11917 - 13050 772 ## COG0505 Carbamoylphosphate synthase small subunit 11 5 Op 3 . - CDS 13078 - 14961 1860 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase - Prom 14994 - 15053 2.4 - Term 15015 - 15066 3.1 12 6 Tu 1 . - CDS 15094 - 16938 1614 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains - Prom 17050 - 17109 9.2 + Prom 17206 - 17265 3.1 13 7 Op 1 21/0.000 + CDS 17301 - 21896 4184 ## COG0069 Glutamate synthase domain 2 + Prom 21904 - 21963 3.3 14 7 Op 2 . + CDS 22011 - 23351 1249 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases + Prom 23462 - 23521 2.8 15 8 Tu 1 . 
+ CDS 23566 - 25233 1644 ## COG0367 Asparagine synthase (glutamine-hydrolyzing) + Term 25354 - 25408 5.2 + Prom 25355 - 25414 8.8 16 9 Tu 1 . + CDS 25458 - 26225 879 ## COG0584 Glycerophosphoryl diester phosphodiesterase + Term 26312 - 26362 13.1 - Term 26203 - 26248 -0.9 17 10 Tu 1 . - CDS 26471 - 26767 333 ## BT_0549 putative thioredoxin + Prom 26969 - 27028 9.0 18 11 Op 1 . + CDS 27182 - 27997 663 ## COG0253 Diaminopimelate epimerase 19 11 Op 2 . + CDS 28048 - 29280 1093 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase + Prom 29319 - 29378 2.5 20 12 Tu 1 . + CDS 29467 - 30228 812 ## BT_0546 hypothetical protein + Prom 30283 - 30342 2.5 21 13 Op 1 24/0.000 + CDS 30377 - 30733 486 ## COG0347 Nitrogen regulatory protein PII 22 13 Op 2 . + CDS 30750 - 32189 1100 ## COG0004 Ammonia permease + Prom 32274 - 32333 3.0 23 14 Tu 1 . + CDS 32366 - 34555 2141 ## COG3968 Uncharacterized protein related to glutamine synthetase + Term 34581 - 34632 0.3 - Term 34570 - 34626 3.4 24 15 Tu 1 . - CDS 34719 - 35297 607 ## BT_0542 hypothetical protein - Prom 35440 - 35499 7.8 + Prom 35755 - 35814 5.7 25 16 Tu 1 . + CDS 35883 - 36530 488 ## COG3153 Predicted acetyltransferase - Term 37224 - 37273 9.3 26 17 Op 1 . - CDS 37315 - 37710 313 ## gi|299149097|ref|ZP_07042158.1| hypothetical protein HMPREF9010_04748 27 17 Op 2 . - CDS 37688 - 38188 354 ## gi|237713587|ref|ZP_04544068.1| hypothetical protein BSAG_03814 - Prom 38300 - 38359 9.8 - Term 38478 - 38535 -0.9 28 18 Tu 1 . - CDS 38536 - 40395 1742 ## COG0471 Di- and tricarboxylate transporters - Prom 40489 - 40548 5.0 + Prom 40624 - 40683 6.3 29 19 Tu 1 . + CDS 40735 - 42360 1167 ## COG0642 Signal transduction histidine kinase + Term 42487 - 42536 7.0 - Term 42528 - 42561 -1.0 30 20 Tu 1 . - CDS 42592 - 43737 1191 ## COG1979 Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family 31 21 Tu 1 . - CDS 43817 - 43957 179 ## + Prom 44037 - 44096 6.6 32 22 Op 1 . 
+ CDS 44319 - 45506 1256 ## COG0133 Tryptophan synthase beta chain 33 22 Op 2 35/0.000 + CDS 45555 - 46961 1244 ## COG0147 Anthranilate/para-aminobenzoate synthases component I + Prom 47012 - 47071 3.6 34 22 Op 3 13/0.000 + CDS 47095 - 47661 553 ## COG0512 Anthranilate/para-aminobenzoate synthases component II 35 22 Op 4 21/0.000 + CDS 47666 - 48661 979 ## COG0547 Anthranilate phosphoribosyltransferase 36 22 Op 5 9/0.000 + CDS 48713 - 49495 876 ## COG0134 Indole-3-glycerol phosphate synthase 37 22 Op 6 . + CDS 49522 - 50151 555 ## COG0135 Phosphoribosylanthranilate isomerase 38 22 Op 7 . + CDS 50164 - 50934 333 ## PROTEIN SUPPORTED gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc + Term 50935 - 50974 1.1 + Prom 50960 - 51019 6.2 39 23 Tu 1 . + CDS 51156 - 52196 731 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D + Term 52217 - 52277 11.1 + Prom 52344 - 52403 3.9 40 24 Tu 1 . + CDS 52438 - 54822 1848 ## BT_0525 outer membrane protein, function unknown + Term 54843 - 54885 7.1 + Prom 54909 - 54968 4.8 41 25 Op 1 . + CDS 55057 - 55422 299 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 42 25 Op 2 . + CDS 55440 - 56597 718 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 43 25 Op 3 . + CDS 56609 - 57556 91 ## BT_0522 hypothetical protein 44 25 Op 4 . 
+ CDS 57508 - 58155 471 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Term 58284 - 58338 10.0 Predicted protein(s) >gi|222159227|gb|ACAB01000132.1| GENE 1 26 - 157 64 43 aa, chain + ## HITS:1 COG:no KEGG:BF2625 NR:ns ## KEGG: BF2625 # Name: not_defined # Def: small heat shock protein # Organism: B.fragilis # Pathway: not_defined # 1 43 100 142 142 62 79.0 7e-09 MILPDNVDKEKIAASVEHGVLNIELPKLSEEEVKKPNRQIEIK >gi|222159227|gb|ACAB01000132.1| GENE 2 320 - 1435 761 371 aa, chain - ## HITS:1 COG:SMb21204 KEGG:ns NR:ns ## COG: SMb21204 COG0842 # Protein_GI_number: 16264618 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Sinorhizobium meliloti # 2 369 6 367 370 172 30.0 7e-43 MIKFLIEKEFKQLLRNSFLPKLILVFPCMIMLLMPWAVNLEIKNIQLNIVDNDHSAISQR LVNKIAAATYFRLVEVPTSYEEGLRNIEIGTADIVMEIPRHLERDWMNGEDAHVLIAANA VNGTKGGLGSSYLSSIINDYAAELRSEHPEAATVSGTFPSIQVDTQGLFNPNLNYKLYMI PALMVMLLTLICGFLPALNVVSEKEVGTIEQINVTPVPKFVFILAKLLPYWLIGFLVLTV CFILAWLIYGIVPVGHFLLIYFFAVLFVLVMSGFGLVISNYSATMQQSMFVMWFCLLVVI LMSGLFTPISSMPEWAQIITIFNPLKYFMEVMRMIYLKGSGFFDLLPQFGILLLFAVVFN SWAVISYRKNN >gi|222159227|gb|ACAB01000132.1| GENE 3 1524 - 2627 661 367 aa, chain - ## HITS:1 COG:CAC3268 KEGG:ns NR:ns ## COG: CAC3268 COG0842 # Protein_GI_number: 15896513 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Clostridium acetobutylicum # 2 365 4 374 378 221 34.0 3e-57 MRQFIAFVKKEFYHIFRDRRTMLILLGMPVVQIILFGFAISTEVKNVRLAVLDPSNDVVT RKIIDRLDASEYFTVTARFHSPQEMEAAFLKNKVDMAIVFSERFVDDLYTGDAHVQLVVD ATDPNMSTSQVNYATGIVSMVGQEMMPPNMSAARLTSDIKLLYNPQMKSAYNFVPGVMGL ILMLICAMMTSISIVREKETGTMEVLLVSPVKPLFIILAKAVPYFVLSFVNLITILLLSV FVLDVPVVGSLFWLITVSLLFIFVSLALGLLISSVTRTQVAAMLVSGLMLMMPTMLLSGM IFPIESMPLILQWISDILPARWYIQAVRKLMIEGVPVVLVYKEIGILLLMATVLITISIK KFKYRLE >gi|222159227|gb|ACAB01000132.1| GENE 4 2630 - 4099 350 489 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter 
ATP-binding protein [Acinetobacter baumannii AYE] # 250 456 7 215 311 139 34 4e-32 MEKDVPIVVVKEVGKSYGTVEALKDVSFVVERGEIFGLIGPDGAGKSTLFRILTTLLLAD KGTATIGGLDVATDYKQIRTKVGYMPGRFSLYQDLSVEENLEFFATVFHTTVQENYDLIK DIYQQIEPFRKRRAGALSGGMKQKLALSCALIHKPDILFLDEPTTGVDPVSRKEFWQMLR NLREQGITIVVSTPIMDEARQCDRIAFINHGKIHGIDTPERILQQFASILCPPSLEREEV RHETAPVIEVEQLTKCFGNFTAVDHISFQVNRGEIFGFLGANGAGKTTAMRMLCGLSKPT SGIGKVAGYDIFREAEQVKKHIGYMSQKFSLYEDLKVWENIRLFAGIYGMKEQEIERKTE ELLDRLGLADERDTLVKSLPVGWKQKLAFSVSIFHEPRIVFLDEPTGGVDPATRRQFWEL IYQAADQGITIFVTTHYMDEAEYCNRISIMVDGQIKALDTPARLKQQFGAETMDDVFQQL ARGAVRKAD >gi|222159227|gb|ACAB01000132.1| GENE 5 4117 - 5052 909 311 aa, chain - ## HITS:1 COG:PA5232 KEGG:ns NR:ns ## COG: PA5232 COG0845 # Protein_GI_number: 15600425 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Pseudomonas aeruginosa # 37 288 41 314 357 82 28.0 1e-15 MINITSPDMKRITKIGMWGITLMMLSACGNGTPDYDATGTFEATEVIVSAEAAGKLLRLE VEEGTKLEAGEEIGLVDTVQLYLKKLQLEASMKSVENQRPDLAKQIAATKQQIVTAERER KRVENLLAAGAANQKQLDDWDAQVKLLERQLVAQESSLRNSTNSLTEQGNSVAIQVAQVE DQLAKCHVQSPIAGTVLAKYAEAGELAAIGKPLFKVGEVDRMYLRAYITSEQLSQVKLGD QVTVYADYGNSEQKAYPGEVTWISDRSEFTPKTILTKNERANLVYAVKIAVKNDGALKIG MYGGVKLKNED >gi|222159227|gb|ACAB01000132.1| GENE 6 5045 - 6295 1182 416 aa, chain - ## HITS:1 COG:no KEGG:BT_0560 NR:ns ## KEGG: BT_0560 # Name: not_defined # Def: outer membrane efflux protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 415 6 408 409 669 86.0 0 MIFSFSFLLYVAGAYAQITLEECQRKTQDNYPLVHQYGLVEKTKEYNLENAAKGYLPQFA LSAKASYQSDVTEIPVKLPGVDLKGLPKDQYQVMLELQQKIWDGGGIRMQKKQTIAEAEV EKEKLNVDMYALNDRVNDLYFGILLLDEQIKQNTLLQDELERNYRQITAYVENGIANQAD LDAVKVEQLNTKQKRIDLVSSRAAYLKMLSLLVGEALSPETVLEKPVPQSTVSAVSEIRR PELALFDAQGAGLQVQEKALNVRHLPQFGLFVQGAYGNPGLDMLKNEFSPYYMAGVRLSW NFGSLYTLKNDRRVIEKKRQQLDNNRDIFLFNTRLEMTQQDQAIHSLEKQMRDDDEIIRL RTNIRKAAEAKVANGTLTVTEMLRELTNESLARQSKAMHEIQRLMGIYQLKYTIND >gi|222159227|gb|ACAB01000132.1| GENE 7 6466 - 7362 583 298 aa, chain + ## HITS:1 
COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 159 296 151 286 288 84 31.0 3e-16 MGQKEPIELALKLLDNLKIVRHNEHRYISSDFGFVNSFVKMETTLFSLGQPYRIKEGRIA FVKQGSARILINLIEYTIQPGYIAVIAPNSIIQITEVSPDFDMQMIAADHNFLPISGKDD FFSYLLHHQKNIILLLSPQEQQQVEHYFTLTWGVLQEPPFRKEAIQHLLASLLYYLEYIA QNSIEQNPAQLTRQEEIFQRFISLVNTHSKKERNVNFYADKLCLTPRYLNTVIRQASQQT VMDWINQSIILEAKVLLKHSNLLVYQVSDELNFPNPSFFSKFFKRMTGMTPHEYQQTK >gi|222159227|gb|ACAB01000132.1| GENE 8 7366 - 8421 876 351 aa, chain - ## HITS:1 COG:CAC3058 KEGG:ns NR:ns ## COG: CAC3058 COG0836 # Protein_GI_number: 15896309 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannose-1-phosphate guanylyltransferase # Organism: Clostridium acetobutylicum # 9 345 5 340 350 286 43.0 5e-77 MIGMDNHVVIMAGGIGSRFWPMSTPECPKQFIDVMGCGRSLIQLTADRFDGVCPRENMWV VTSEKYIDIVREQLPEIPESNILAEPCARNTAPCIAFACWKIKKKHPRANIVVTPSDALV INTGEFRRVMEKALRFTDDSSAIVTIGIRPTRPETGYGYIAAANQLQTDKEIYTVDAFKE KPDKETAEKYLAEGNFFWNAGIFVWNVRTITSVMRVYAPGIAQIFDRIFPDFYTEKENET IKKLFPTCESISIDYAVMEKAEAIYVLPASFGWSDLGTWGALRGLLPQDKSGNATVGTDI RLYDSKNCIVHASEEKRVVVQGLDGYIIAEKDNTLLICKLEEEQRIKEFSK >gi|222159227|gb|ACAB01000132.1| GENE 9 8616 - 11843 3294 1075 aa, chain - ## HITS:1 COG:YJL130c_2 KEGG:ns NR:ns ## COG: YJL130c_2 COG0458 # Protein_GI_number: 6322331 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Saccharomyces cerevisiae # 6 1057 6 1051 1070 1223 58.0 0 MKENIKKVLLLGSGALKIGEAGEFDYSGSQALKALKEEGIETILINPNIATVQTSEGVAD QIYFLPVTPYFVEKVIQKEKPEGIMLAFGGQTALNCGVALYKEGILEKYNVKVLGTPVQA IMDTEDRELFVQKLNEINVKTIKSEAVENAEAARRAAKELGYPVIVRAAYALGGLGSGFC DNEEQLDILVEKAFSFSPQVLVEKSLRGWKEVEYEVVRDRFDNCITVCNMENFDPLGIHT GESIVIAPSQTLTNKEYHKLRELAIRIIRHIGIVGECNVQYAFDPESEDYRVIEVNARLS 
RSSALASKATGYPLAFVAAKLGLGYGLFDLKNSVTKTTSAFFEPALDYVVCKIPRWDLGK FHGVDKELGSSMKSVGEVMAIGRTFEEAIQKGLRMIGQGMHGFVENKELVISDIDKALRE PTDKRIFVISKAFRAGYTIDQVHELTKIDKWFLQKLMNIMQTSEELHSWGNNHKQIADLP NELLRKAKVQGFSDFQVARAIGYEGDMEDGILYVRKHRKEAGILPVVKQIDTLAAEYPAQ TNYLYLTYSGVANDVHYLGDHKSIVVLGSGAYRIGSSVEFDWCGVQALNTIRKEGWRSVM INYNPETVSTDYDMCDRLYFDELTFERVMDILELENPHGVIVSTGGQIPNNLALRLDAQN INILGTSAKSIDNAEDREKFSAMLDRIGVDQPRWRELTSMDDIQEFVEEVGFPVLVRPSY VLSGAAMNVCSNQEELERFLKLAANVSKKHPVVVSQFIEHAKEVEMDAVAQNGEIVAYAI SEHIEFAGVHSGDATIQFPPQKLYVETVRRIKRISREIAKALNISGPFNIQYLAKDNDIK VIECNLRASRSFPFVSKVLKINFIELATKVMLGLPVEKPEKNLFELDYVGIKASQFSFNR LQKADPVLGVDMASTGEVGCIGMDTSCAVLKAMLSVGYRIPKKNILLSTGTMKQKADMMD AARMLVNKGYKLFATGGTHKALAESGIESTHVYWPSEEGHPQALEMLHSKEIDMVVNIPK NLTAGELSNGYKIRRAAIDLNIPLITNARLASAFINAFCTMSVDDIAIKSWAEYK >gi|222159227|gb|ACAB01000132.1| GENE 10 11917 - 13050 772 377 aa, chain - ## HITS:1 COG:YJL130c_1 KEGG:ns NR:ns ## COG: YJL130c_1 COG0505 # Protein_GI_number: 6322331 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase small subunit # Organism: Saccharomyces cerevisiae # 2 375 20 405 433 370 46.0 1e-102 MRNVTLILDDGSRFSGKSFGYEKPVAGEVVFNTAMTGYPESLTDPSYAGQLMTLTYPLVG NYGVPPFTIEPNGLATFMESEKIHAEAIIVSDYSYEYSHWNAVESLGDWLKREKVPGITG IDTRELTKVLREHGVMMGKIVFENEELRMKNEEFPSYSDINYVDQVSCKEIIHYFPSGTS SHSAANSSFFIPHSSLKKVVLVDCGVKTNIIRCLLKRNVEVIRVPWDYDYNGFEFDGLFI SNGPGDPDTCDAAVQNIRKAMANEKLPIFGICMGNQLLSKAGGAKIYKLKYGHRSHNQPV RMVGTERCFITSQNHGYAVDNNTLSADWEPLFINMNDGSNEGIKHKRNPWFSAQFHPEAA SGPTDTEFLFDEFVKLL >gi|222159227|gb|ACAB01000132.1| GENE 11 13078 - 14961 1860 627 aa, chain - ## HITS:1 COG:HI1207 KEGG:ns NR:ns ## COG: HI1207 COG0034 # Protein_GI_number: 16273127 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Haemophilus influenzae # 36 519 17 429 505 130 25.0 1e-29 MEQLKHECGVAMIRLLKPLEYYEKKYGTWMYGLNKLYLLMEKQHNRGQEGAGLACVKLEA 
NPGEEYMFRERALGSGAITEIFENVQNNFKELTSEQLHDAAYAKRVLPFAGEVYMGHLRY STTGKSGISYVHPFLRRNNWRAKNLALCGNFNMTNVDEIFARITAIGQHPRKYADTYIML EQVGHRLDREVERVFNLAEAEGLTGMGITHYIEEHIDLANVLRTSSREWDGGYVICGLTG SGESFAIRDPWGIRPAFWYQDDEIAVLASERPVIQTALNVPFEEIKELQPGQALLISKEG KIRTSQINKPRENQACSFERIYFSRGSDVDIYKERKRLGEKLVPKILKAINNDIDHTVFS FIPNTAEVAFYGMLQGLDDYLNEEKVQQIAALGHNPNMEELEVILSRRIRSEKVAIKDIK LRTFIAEGNSRNDLAAHVYDITYGSLVPGVDNLVIIDDSIVRGTTLKQSIIGILDRLGPK KIVIVSSSPQVRYPDYYGIDMAKMSEFIAFRAAVELLKERDMKDVIAAAYRKSKEQVGLP KEQMVNYVKDIYAPFTDEEISAKMVELLTPKGTKAKVEIVYQPLEGLHEACPHHKGDWYF SGNYPTPGGVKMVNRAFIDYIEQMYQF >gi|222159227|gb|ACAB01000132.1| GENE 12 15094 - 16938 1614 614 aa, chain - ## HITS:1 COG:CAC0158 KEGG:ns NR:ns ## COG: CAC0158 COG0449 # Protein_GI_number: 15893453 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Clostridium acetobutylicum # 1 614 1 608 608 549 47.0 1e-156 MCGIVGYIGKRKAYPILIKGLKRLEYRGYDSAGVAIISDDRQLNVYKTKGKVSDLENFVT QKDISGTIGIAHTRWATHGEPCSANAHPHYSSSEKLALIHNGIIENYAVLKEKLQAKGYI FKSSTDTEVLVQLIEYMKVTNQVSLLTAVQLALGEVIGAYAIAILDKEHPDEIIAARKSS PLVVGIGADEFFLASDATPIVEYTDKVVYLEDGEIAVLHLGKELKVVNLSNVEMTPEIKK VELNLGQLEKGGYPHFMLKEIFEQPDCIHDCMRGRINVEADNVVLSAVIDHREKLLNAKR FIIVACGTSWHAGLIGKHLIESFCRIPVEVEYASEFRYRDPVIDGQDVVIAISQSGETAD TLAAVELAKSRGAFIYGICNAIGSSIPRATHTGSYIHVGPEIGVASTKAFTGQVTVLAML ALTLAKAKGTIDERHYLSIVQELNHIPEKMKEVLKLNDTLAELSKTFTYAHNFIYLGRGY SYPVALEGALKLKEISYIHAEGYPAAEMKHGPIALIDAEMPVVVIATQNGLYEKVLSNIQ EIKARKGKVIAFVTKGDTVISKIADCSIELPETIECLDPLITTVPLQLLAYHIAVCKGMD VDQPRNLAKSVTVE >gi|222159227|gb|ACAB01000132.1| GENE 13 17301 - 21896 4184 1531 aa, chain + ## HITS:1 COG:CAC1673_2 KEGG:ns NR:ns ## COG: CAC1673_2 COG0069 # Protein_GI_number: 15894950 # Func_class: E Amino acid transport and metabolism # Function: Glutamate synthase domain 2 # Organism: Clostridium acetobutylicum # 402 1216 1 804 804 884 54.0 0 MKKQELFNNVTEGSPYQRQPKQMGLYNAAYEHDACGVGMLVNIHGEKSHDIVESALKVLE 
NMRHRGAEGADNKTGDGAGIMLQIPHEFILLQGIPVPEKGRYGTGLLFLPKNEKDQAAIL SIIIEEIEKEGLTLMHLRNVPTCPEILGESALANEPDIKQVFITGFTETETADRKLYLIR KRIENKVRLSSIPTKDDFYVVSLSTKNIIYKGMLSSLQLRNYYPDLTNSYFTSGLALVHS RFSTNTFPTWGLAQPFRLLAHNGEINTIRGNRGWMEARESVLSTPTLGDIKELRPIIQPG MSDSASLDNVLEFLVMSGLSLPHAMAMLVPESFNEKNPISEDLKSFYEYHSILMEPWDGP AALLFSDGRFAGGMLDRNGLRPARYLITKNDMMVVASEVGVMDFEPGDIKEKGRLQPGKI LLVDTEKGEIYYDGELKKQLAEAKPYRTWLSTNRIELDELKSGRKVPHHVENYDCMLRTF GYSKEDIEKLIMPMASTGAEPIHSMGNDTPLAVLSDKPQLLYNYFRQQFAQVTNPPIDPL REELVMSLTEYIGAVGMNILTPSESHCKMVRLNHPILSNTQLDILCNIRYKGFKTVKLPM LFEAAKGKAGLQEALNSLCKMAEESVTEGVNYIVLTDRDVDITHAAIPSLLAVSAVHHHL ISVGKRVQTALIVESGEIREVMHAALLLGFGASALNPYMAFAVLDRLVKDRDIQLDYATA EKNYIKSICKGLFKIMSKMGISTIRSYRGAKIFEAVGLSEELSKAYFGGLGSPIGGIRLE EIARDAIAFHEEGFSEIKNGNEGTQSNSQSSTFNFPLLKNNGLYAFRKDGEKHAWNPETI STLQLATRLGSYKKFKEYTHLVDEKEKPIFLRDFLGFRRNPISIDQVEPVENILHRFVTG AMSFGSISKEAHEAMAIAMNKIHGRSNTGEGGEDAARFQPLPDGSSLRSAIKQVASGRFG VTAEYLVNADEIQIKIAQGAKPGEGGQLPGFKVNDVIAKTRHSIPGISLISPPPHHDIYS IEDLAQLIFDLKNVNPQAKISVKLVAESGVGTIAAGVAKAKADLIVISGAEGGTGASPAS SIRYAGISPELGLSETQQTLVLNGLRGQVVLQADGQLKTGRDIILMALMGAEEYGFATSA LIVLGCVMMRKCHQNTCPVGVATQNEELRKRFHGRSEYLVNFFTFLAQEVREHLAEMGFT RMDDIIGRTDLIERKSVANDPNPKHALIDFTKLLARIDNNAAIRHVIDQDHGVSTVKDVT LIDAAQEAIEHEKEISLEYTIANTDRAIGAMLSGVIAKKYGAKGLPEHTLNVKFKGSAGQ SFGAFLVPGVNFKLEGEANDYLGKGLSGGRISVLPPIRSNFEAEKNTIAGNTLLYGATSG EVYINGCVGERFAVRNSGAVAVVEGVGDHCCEYMTGGRVVVLGQTGRNFAAGMSGGVAYV WNKDGNFDYFCNMEMVELSLIEEASYRKELHELIRQHYLYTGSKLARTMLDNWNHYAEQF IQVVPIEYKKVLQEEQMRKLQQKIADMQRDY >gi|222159227|gb|ACAB01000132.1| GENE 14 22011 - 23351 1249 446 aa, chain + ## HITS:1 COG:VC2374 KEGG:ns NR:ns ## COG: VC2374 COG0493 # Protein_GI_number: 15642371 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Vibrio cholerae # 1 445 1 473 489 433 49.0 1e-121 MGDPKAFLNIPRQEAGYRPVNERIADYSQVEQTLNTNSRKLQASRCMDCGVPFCHWACPI 
GNKQPEWQDALYRGKWKEAYEVLSSTCDFPEFTGRICPALCEKSCVLKLSCDQPVTIREN EAAIVEAAFREGYIQVKTPRRNGKKVAVIGAGPAGLVVANQLNLKGYSVTLFDKDEAPGG LLRFGIPNFKLDKNVIDRRMKILAAEGILFEMGVEIDVNHLPEGFDAYCICTGTPTARDL SIPGRELKGIHFALEMLAQQNRLLAGQTFPKDKLVNAKGKKVLVIGGGDTGSDCIGTSVR QGAVSVTQIEIMPKPPVGYNPATPWPQWPAVFKTTSSHEEGCTRRWCLASNQFLGKNGKV TGVEVEEVEWIPAADGGRPTMKPTGKKEVIEADMVLLAMGFLKPEQPKFAKNVFLAGDAE TGASLVVRAMAGGRKAAAEIDAYLIQ >gi|222159227|gb|ACAB01000132.1| GENE 15 23566 - 25233 1644 555 aa, chain + ## HITS:1 COG:VC0991 KEGG:ns NR:ns ## COG: VC0991 COG0367 # Protein_GI_number: 15641006 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthase (glutamine-hydrolyzing) # Organism: Vibrio cholerae # 1 554 1 554 554 782 67.0 0 MCGIAGILNIKVQTKELRDKALKMAQKIRHRGPDWSGIYVGGSAILAHERLSIVDPQSGG QPLYSPDRKQVLAVNGEIYNHRDIRAKYARKYNFQTGSDCEVILALYKDKGIHFLEDISG IFAFVLYDEEKDEFLIARDPIGVIPLYIGKDKEGKIYFGSELKALEGFCDEYEVFLPGHY FYSKEGKMKRWYSRDWTEYETVKENDAQTEDVKVALEEAVHRQLMSDVPYGVLLSGGLDS SVISAIAKKYAAKRIETDGASDAWWPQLHSFAIGLKGAPDLIKAREVAEYIGTVHHEINY TVQEGLDAVRDVIYFIETYDVTTVRASTPMYLLARVIKSMGIKMVLSGEGADEVFGGYLY FHKAPTPQAFHEETVRKLSKLHMYDCLRANKSLSAWGVEGRVPFLDKEFLDVAMNLNPKA KMCPGKNIEKRIVREAFADMLPESVAWRQKEQFSDGVGYSWIDTLREITAAAVSDEQMEH AAERFPINTPQNKEEYYYRSIFEEHFPSESAARTVPSVPSVACSTAEALAWDIAFRNLNE PSGRAVKGIHEEAYT >gi|222159227|gb|ACAB01000132.1| GENE 16 25458 - 26225 879 255 aa, chain + ## HITS:1 COG:SA0220_2 KEGG:ns NR:ns ## COG: SA0220_2 COG0584 # Protein_GI_number: 15925931 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Staphylococcus aureus N315 # 28 252 1 228 242 76 27.0 4e-14 MNLKKMMMASALLMAACCMQAQTKVIAHRGFWKTPGSSQNSISSLLKADSIGCYGSEFDV WIAKDNKLVVNHDPVYKMRPMEYSKGDALTGLKLSNGENLPSLEQYLETGKNCKTQLILE LKAHSNKKRETKAVQGIVAMVKKMGLENRMEYITFSLHAMKEFIRLAPAGTPVYYLNGEL SPKELKELGAAGLDYHMGVIKKHPEWIKEAHDLGLKVNVWTVDEVEDMKWLIEQKVDFIT TNEPVILQEELKKLQ >gi|222159227|gb|ACAB01000132.1| GENE 17 26471 - 26767 333 98 aa, chain - ## HITS:1 COG:no KEGG:BT_0549 
NR:ns ## KEGG: BT_0549 # Name: not_defined # Def: putative thioredoxin # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 98 1 98 98 155 77.0 3e-37 MNIEKQLETAIKTNHLVLVVFYADWSPHYEWIGPVLRTYEKRTVELIRVNAEENRAIADA HNVETVPAFLLLHKGHELWRQVGELTVEELKEVLDDFQ >gi|222159227|gb|ACAB01000132.1| GENE 18 27182 - 27997 663 271 aa, chain + ## HITS:1 COG:slr1665 KEGG:ns NR:ns ## COG: slr1665 COG0253 # Protein_GI_number: 16332245 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Synechocystis # 5 264 3 276 279 224 43.0 1e-58 MTVRIKFTKMHGAGNDYIYVDTTRYPIADPEKKAIEWSKFHTGIGSDGLILIGVSDKADF SMRIFNADGSEAMMCGNGSRCVGKYVYEYGLTDKTEITLDTLSGIKILKLQVEGKTVSTV TVDMGSPQETGEIDWGKKYPFQSTKVSMGNPHLVTFIDDITRINLPEIGPELENHPLFPD RTNVEFAQIVGKDTIRMRVWERGSGITQACGTGACATAVAAFINGLAGRKSDVIMDGGTV TIEWDETSGHILMTGPATKVFDGEIVEEVES >gi|222159227|gb|ACAB01000132.1| GENE 19 28048 - 29280 1093 410 aa, chain + ## HITS:1 COG:MTH52 KEGG:ns NR:ns ## COG: MTH52 COG0436 # Protein_GI_number: 15678081 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Methanothermobacter thermautotrophicus # 1 406 1 405 410 543 60.0 1e-154 MALVNEHFLKLPGSYLFSDIAKKVNTFRITHPKQDIIRLGIGDVTQPLPPACIEAMHKAV EELAGKDTFRGYGPEQGYDFLIEAIIKNDFAPRGIHFSASEIFVSDGAKSDTGNIGDILR HDNSVGVTDPIYPVYIDSNVMCGRAGVLEEETGKWSNVTYMPCTSENNFIPEIPDKRIDI VYLCYPNNPTGTTLTKPELKKWVDYALANDTLILFDAAYEAYIQDENVPHSIYEIKGAKK CAIEFRSFSKTAGFTGVRCGYTVVPKELTAATLEGDRIPLNRLWNRRQCTKFNGTSYITQ RAAEAVYSAEGKAQIKKTIDYYMTNAKIMKEGLEATGLKVYGGVNAPYLWVKTPNGLSSW RFFEQMLYEANVVGTPGVGFGPSGEGYIRLTAFGERNDCIEAMRRIKNWL >gi|222159227|gb|ACAB01000132.1| GENE 20 29467 - 30228 812 253 aa, chain + ## HITS:1 COG:no KEGG:BT_0546 NR:ns ## KEGG: BT_0546 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 253 1 241 241 387 82.0 1e-106 MKTTINNKLGVITVALLLSGLVPSTMMAQDKVEASVGADLVSGYIWRGQNLGGVSVQPSL 
GISYKGLSLEAWGSVGIESKDAKEFDLTLGYSIGGFSVSITDYWFDKTYTGDVDEYGKEI YTTNKYFQYGAYSTAHVFEAQVGYDFGPLAVNWYTNFAGADGVKENGKRAYSSYLALSAP FKLGGLDWTVDLGMVPWETTFYNGYTSGFCVTDISLGASKEIKITDSFSVPTFAKVTVNP RTEGAYFAFGLSF >gi|222159227|gb|ACAB01000132.1| GENE 21 30377 - 30733 486 118 aa, chain + ## HITS:1 COG:aq_109 KEGG:ns NR:ns ## COG: aq_109 COG0347 # Protein_GI_number: 15605696 # Func_class: E Amino acid transport and metabolism # Function: Nitrogen regulatory protein PII # Organism: Aquifex aeolicus # 1 111 1 112 112 95 50.0 2e-20 MKKIEAIIRKTKFEDVKDALLEADIEWFSYYDVRGIGKARQGRIYRGVVYDTSTIERILI SIVVRDKNTEKTVQAIIKAAQTGEIGDGRIFVIPIEDAIRIRTAERGDIALYNAEQER >gi|222159227|gb|ACAB01000132.1| GENE 22 30750 - 32189 1100 479 aa, chain + ## HITS:1 COG:sll0108 KEGG:ns NR:ns ## COG: sll0108 COG0004 # Protein_GI_number: 16331833 # Func_class: P Inorganic ion transport and metabolism # Function: Ammonia permease # Organism: Synechocystis # 41 478 60 486 507 355 48.0 1e-97 MDTKYKSHSFVKLWIATTIFILCCFTTDMSAQNTAAADTVAAVTETITEVVTAPAEEASA APDTLGELALGLNTVWMLLAAMLVFFMQPGFALVEAGFTRVKNTANILMKNFVDFMFGSL LYWFIGFGLMFGAGGFIGMPHFFDLSFIDNGLPTEGFLIFQTVFCATAATIVSGAMAERT KFSMYIVYTIFISVLIYPISGHWTWGGGWLMNGEEGSFMMSLFGTTFHDFAGSTVVHSVG GWIALVGAAILGPRIGKYGKDGKSKAIPGHSLTIAALGVFILWFGWFGFNPGSQLAAATE ADAIAISHVFLTTNLSACAGGFFALLVSWMKYGKPSLSLTLNGILAGLVGITAGCDAVSP AGAALIGAICGVVMIFSVDFIDKVLKIDDPVGASSVHGVCGFLGTILTGLFSTSEGLFYG YGFGFLGAQIFGALVVGAWAAGMGFIIFKTLDKIHGLRVPARIEEEGLDIYEHGESAYN >gi|222159227|gb|ACAB01000132.1| GENE 23 32366 - 34555 2141 729 aa, chain + ## HITS:1 COG:slr0288 KEGG:ns NR:ns ## COG: slr0288 COG3968 # Protein_GI_number: 16331104 # Func_class: R General function prediction only # Function: Uncharacterized protein related to glutamine synthetase # Organism: Synechocystis # 26 729 27 724 724 601 44.0 1e-171 MSKLRFRVVETAFKKKAVEVATPAERPSEYFAKYVFNKEKMFKYLPSKVYNALIDAIDNG APLDRSIADEVAAGMKKWAIEMGVTHYTHWFAPLTEGTAEKHDAFVEHDGKGGMMEEFTG KLLVQQEPDASSFPNGGIRNTFEARGYSAWDPSSPAFIVDDTLCIPTVFIAYTGEALDYK 
APLLKALRAVDKAAVDVCRYFNPEVKKVVAYLGWEQEYFLVDEGLYAARPDLLMTGRTLM GHDSAKNQQLEDHYFGAIPTRVAAFMKDLEIEALKLGIPVKTRHNEVAPNQFELAPIFEE CNLANDHNLLIMSLMRKVSRRHGFRVLLHEKPFKGVNGSGKHNNWSLGTDTGILLMAPGK TPEDNLRFVTFVVNTLMAVYHHNGLLKASISSATNAHRLGANEAPPAIISSFLGKQLSQV LDHIENSTKDDLISLSGKQGMKLDIPQIPELLIDNTDRNRTSPFAFTGNRFEFRAVGSEA NCASAMIALNSAVADQLVKFKKDVDALIEKGEPKVSAILEIIRGYIKECKAIHFDGNGYS DEWKKEAARRGLDCETSVPVIFDNYLKPETIAMFEATGVMTKKELEARNEVKWETYTKKI QIEARVLGDLAMNHIIPVATQYQTDLINNVYKMQSLFPAEKAAKLSAKNLELIEEIADRT AFIKEHVDAMIEARKVANKIESEREKAIAYHDTIVPALEEIRYHIDKLELIVDNQMWTLP KYRELLFVR >gi|222159227|gb|ACAB01000132.1| GENE 24 34719 - 35297 607 192 aa, chain - ## HITS:1 COG:no KEGG:BT_0542 NR:ns ## KEGG: BT_0542 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 192 1 192 192 309 90.0 4e-83 MKTMNYSLIRILFALVIGLVLVIWPNAAASYIVITVGVAFLIPGVISLFGYFGRKRQEGE AAPRFPIEGIGSLLFGLWLIVMPEFFADVLMFLLGFILIMGGVQQIASLSMARRWMPVPG AFYLVPSLILIAGIIALFNPTGARNTAFIIIGVSSLVYSLTELINWFKFTRRRPKTPVVH DDDDIEDAKIIE >gi|222159227|gb|ACAB01000132.1| GENE 25 35883 - 36530 488 215 aa, chain + ## HITS:1 COG:MA1701 KEGG:ns NR:ns ## COG: MA1701 COG3153 # Protein_GI_number: 20090553 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Methanosarcina acetivorans str.C2A # 1 193 15 207 217 165 39.0 6e-41 MDIKLRIEQPSDYNETENVTREAFWNHFSPGCDEHYLLHIMRNHPKFVPELDIVAELNGK IVGNVVCLKSFIMADDGNQYEVLSLGPISVLPEYQQKGIGGKMIALTKERAFDMGFRAIL LCGDPDYYLRQGFIPAETLGIRTEDNMYATALHVCELYDDALANAKGRYIEDEIYQIDKS AANEFDKRFPPKEIIVGTPSQKRFDQIVVMRRKAI >gi|222159227|gb|ACAB01000132.1| GENE 26 37315 - 37710 313 131 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|299149097|ref|ZP_07042158.1| ## NR: gi|299149097|ref|ZP_07042158.1| hypothetical protein HMPREF9010_04748 [Bacteroides sp. 
3_1_23] # 1 131 651 781 781 255 98.0 7e-67 MLNTGNASTPLTDIYILEVNVKLKNTISVPSVTDGLDFFIAYQGIGQKRTEIHLTHFNSA TANGQLADNEVLEVIKAVNNTWALCVPDKFAYPTETTVITNAYSKFADWAHDQSSTTDWY KTVSSDKVIQY >gi|222159227|gb|ACAB01000132.1| GENE 27 37688 - 38188 354 166 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713587|ref|ZP_04544068.1| ## NR: gi|237713587|ref|ZP_04544068.1| hypothetical protein BSAG_03814 [Bacteroides sp. D1] # 1 166 1 166 166 332 100.0 7e-90 MKQNIAKVFTFSLLASSISFISCVDNEKNLFDADQLKQIYEETFPVKNIDLDGDWTVSRS VIACVSVNGDQGVDYKIQIFDADPLSPGSTAKLLAEGTVNQSTTLNVVMDCATALDKVFV ARIDEHKRYLVQPAAIENGTVTAHLGDAHYIWIYRSTTSHVEYREC >gi|222159227|gb|ACAB01000132.1| GENE 28 38536 - 40395 1742 619 aa, chain - ## HITS:1 COG:ECs3176 KEGG:ns NR:ns ## COG: ECs3176 COG0471 # Protein_GI_number: 15832430 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Escherichia coli O157:H7 # 1 619 5 610 610 372 37.0 1e-103 MLITIIILVLSAVFFVNGKVRSDIVALCALIALLVFQILTPDEALSGFSNSVVIMMIGLF VVGGAIFQTGLAKMISSRILKLAGTSEIRLFLLVMLVTSVIGAFVSNTGTVALMLPIVVS LAMSAGMNPSRLLMPLAFASSMGGMMTLIGTPPNLVIQNTLTSAGLEPLSFFSFLPVGIV CVIVGTLVLMPLSKWFLSKKGQKDDNKRSGKSLKQLVNEYGLSSNLFRLQVIKDSRLLGK TIIDLDIRRKYGLNIMEVRRGDASQHRFLKTITQKFAAPDTMLEVEDILYVTGDFDKVQL FAEDYLLEILGDHATEETKSTTNSLDFYDIGIAEIVLMPSSNLINQTIKEAGFRDKFNVN VLGVRRKKEYLLQDLGNERIHSGDVLLVQGTWNNIARLSKEDADWVVLGQPLAEAAKVTL DYKAPVAAAIMVLMVAMMVFDFIPVAPVTAVMIAGILMVLTGCFRNVEAAYKTINWESIV LIAAMLPMSLALEKTGASEYISNTLVNGLGSYGPIALMAGIYFTTSLMTMFISNTATAVL LAPIALQSAIQIGVSPVPFLFAVTVGASMCFASPFSTPPNALVMPAGQYTFMDYVKVGLP LQIIMGIVMIFVLPLIFPF >gi|222159227|gb|ACAB01000132.1| GENE 29 40735 - 42360 1167 541 aa, chain + ## HITS:1 COG:slr2098_3 KEGG:ns NR:ns ## COG: slr2098_3 COG0642 # Protein_GI_number: 16330584 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 304 533 33 270 280 181 40.0 5e-45 MANNILLIFWIGEVPWGYNYLLVIILLLIISILLYRIHKLHKTIKKTNHSYRFSFDILDN 
LPFPIFVKDIANDFRYYYWNKESAVQSGISSEEAIGHTDYEIYGEERGEKYRNIDKELVQ EGKVYRKEEKYITPDGITHDTIAVKSIISWEGEKRWLLATRWDITQLKNYERELVAAKEE LEKALKKQKLALKSIDFGLIYIDKNYRVQWEETRQIASLVKGRRYIPGKICYQTSALRNE PCGQCAFKKAIEQGKIIRHTIRVDDVDFEVTATPVFGDEKETEIIGGLLRFENITEKLKM NRMLQEAKEKAEESNRLKSAFLANMSHEIRTPLNAIVGFSEMVCQTEEEEEKQEFVKIIS SNNILLLQLIDDILDLSKIEAGTMEFTFAQTDINELMEGICRQMQEKNSSPDVQIVFTER ANQCIINTDRIRLSQVIINFTNNALKFTSKGSIEMGYRIEEASDEIYFYVKDTGIGIPAD KIDKVFERFVKLNSFIKGTGLGLAICRVIVERLGGVIGAESKEGEGSCFWFKIPRTENIE K >gi|222159227|gb|ACAB01000132.1| GENE 30 42592 - 43737 1191 381 aa, chain - ## HITS:1 COG:alr4566 KEGG:ns NR:ns ## COG: alr4566 COG1979 # Protein_GI_number: 17232058 # Func_class: C Energy production and conversion # Function: Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family # Organism: Nostoc sp. PCC 7120 # 1 377 1 380 384 426 55.0 1e-119 MENFIFQNPVKLIMGKGMIARLAKEIPSDKRIMITFGGGSVKKNGVYDQVKEALKDHFTV EFWGIEPNPAIETLRKAIALGKEEKVDYLLAVGGGSVIDGTKLISAGILYDGDAWDLVLA GRPVTHTVPLSTVLTLPATGSEMNNGAVISRRETKEKYPFYANYPIFSILDPEVTFTLPP HQVACGLADTYAHVMEQYMTTPRQSRVMDRWAEGILQTLVEIAPKIRENQHDYQLMADFM LSATMALNGFIAMGVSQDWATHMIGHEITALHGLTHGHTLAIVLPATLQVLHEEKGDKLL QYGERVWGITSGTREERIDEAICRTEEFFRSLGLTTRLHEENIGQETILEIERRFNERGA KYGENGNVTGAVARRILETAL >gi|222159227|gb|ACAB01000132.1| GENE 31 43817 - 43957 179 46 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKQRKVLVGIAIAIFIILLLYWLLVAEDMKPWLSAMVPSVAQQFFA >gi|222159227|gb|ACAB01000132.1| GENE 32 44319 - 45506 1256 395 aa, chain + ## HITS:1 COG:TM0138 KEGG:ns NR:ns ## COG: TM0138 COG0133 # Protein_GI_number: 15642912 # Func_class: E Amino acid transport and metabolism # Function: Tryptophan synthase beta chain # Organism: Thermotoga maritima # 10 390 3 380 389 450 60.0 1e-126 MKSFLVDQDGYYGEFGGAYVPEILHKCVEELTNKYLEVIESEDFKKEFDQLLRDYVGRPS PLYPAKRLSEKYGCKLYLKREDLNHTGAHKINNTIGQILLARRMGKKRIIAETGAGQHGV ATATVCALMDMECIVYMGKTDVERQHINVEKMKMLGATVIPVTSGNMTLKDATNEAIRDW CCHPADTYYIIGSTVGPHPYPDMVARLQSVISEEIKKQLQEKEGRDYPDYLIACVGGGSN 
AAGTIYHYINDERVGIILAEAGGKGIETGMTAATIQLGKMGIIHGARTYVIQNEDGQIEE PYSISAGLDYPGIGPIHANLAAQSRANVLAINDDEAIEAAYELTKLEGIIPALESAHALG ALKKLKFKPEDIVVLTVSGRGDKDIETYLSFNEQL >gi|222159227|gb|ACAB01000132.1| GENE 33 45555 - 46961 1244 468 aa, chain + ## HITS:1 COG:TM0142 KEGG:ns NR:ns ## COG: TM0142 COG0147 # Protein_GI_number: 15642916 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Thermotoga maritima # 7 467 4 457 461 274 37.0 2e-73 MKTFNYTTHNKQVLGDMHTPVSIYLKVRDMYPQSALMESSDYHAGENSLSFIALCPLASI GVNGGIVTANYPDNSRTEEPLTKTFNVEKAMNRFINQFQVTGDNKNVCGLYGYTTFNAVK YFEHIPVKESHDEQNDAPDLLYILYKYVIVFNHFKNELTLVEMLSEGEESGLPELEAAIE NRNYASYNFSVTGPVTSPITDEEHKANVRKGIAHCMRGDVFQIVLSRRFIQPYAGDDFKV YRALRSINPSPYLFYFDFGGYRIFGSSPETHCKIENGRAYIDPIAGTTRRTGDTVKDREL TEALLADPKENAEHVMLVDLARNDLSRNCHDVRVVFYKEPQYYSHVIHLVSRVSGMLNEG ADKIKTFIDTFPAGTLSGAPKVRAMQLISEIEPHNRGAYGGCIGFIGLNGELNQAITIRT FVSRNNELWFQAGGGIVARSQDEYELQEVNNKLGALKKAIDLAVKLKN >gi|222159227|gb|ACAB01000132.1| GENE 34 47095 - 47661 553 188 aa, chain + ## HITS:1 COG:PA0649 KEGG:ns NR:ns ## COG: PA0649 COG0512 # Protein_GI_number: 15595846 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component II # Organism: Pseudomonas aeruginosa # 3 186 2 190 201 188 48.0 4e-48 MKILLLDNYDSFTYNLLHVVKELGATDIEVVRNDQIELDEVDRFDKIILSPGPGIPEEAG LLLPIIKRYAPTKSILGVCLGHQAIGEAFGARLENLKEVYHGVQTPVSIIRQDLLFEGLG KEIPVGRYHSWVVSREGFPDCLEITAESQEGQIMAIRHKTYNVHGIQFHPESVLTPQGKE IIKNFLND >gi|222159227|gb|ACAB01000132.1| GENE 35 47666 - 48661 979 331 aa, chain + ## HITS:1 COG:MJ0234 KEGG:ns NR:ns ## COG: MJ0234 COG0547 # Protein_GI_number: 15668409 # Func_class: E Amino acid transport and metabolism # Function: Anthranilate phosphoribosyltransferase # Organism: Methanococcus jannaschii # 1 329 2 332 336 208 35.0 9e-54 MKQILYKLFEHQYLGRDEARTILQNIAQGKYNDVQVASLITVFLMRNISVEELCGFRDAL 
LEMRIPVDLSDFAPIDIVGTGGDGKNTFNISTASCFTVAGAGFPVVKHGNYGATSVSGAS NVMEQHGVKFTSDVDQLRRSMEKCNLAYLHAPLFNPALKAVAPVRKGLAVRTFFNMLGPL VNPVLPAYQLLGVYNLPLLRLYTYTYQESKTKFAVVHSLDGYDEISLTNEFKVATSDHEK IYTPESLGFSRYKDIDLDGGQTPEDAAKIFDHIMNNTATEAQKNVVIVNSAFAIHVIRPE KTIEECIALAKESLESGRALATLKKFIELNS >gi|222159227|gb|ACAB01000132.1| GENE 36 48713 - 49495 876 260 aa, chain + ## HITS:1 COG:XF0213 KEGG:ns NR:ns ## COG: XF0213 COG0134 # Protein_GI_number: 15836818 # Func_class: E Amino acid transport and metabolism # Function: Indole-3-glycerol phosphate synthase # Organism: Xylella fastidiosa 9a5c # 1 256 1 261 264 203 44.0 2e-52 MKDILSEIIANKRFEVDLQKQAISIEQLQEGISESPTPRSMKQALASSASGIIAEFKRRS PSKGWIQEEARPEEIVPSYAAAGASALSILTDEKFFGGSLKDIRAARPLVEIPILRKDFI IDEYQLYQAKIVGADAVLLIAAALEPEKCNELAEKAHELGMEVLLEIHSSEELAYINKGI DMVGINNRNLGTFFTDVENSFRLAGQLPQDSVLVSESGISDPEIVKRLRAAGFRGFLIGE TFMKTQRPGETLQNFLQAIQ >gi|222159227|gb|ACAB01000132.1| GENE 37 49522 - 50151 555 209 aa, chain + ## HITS:1 COG:TM0139 KEGG:ns NR:ns ## COG: TM0139 COG0135 # Protein_GI_number: 15642913 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylanthranilate isomerase # Organism: Thermotoga maritima # 7 206 4 203 205 118 36.0 7e-27 MINGKIIKVCGMREAENIQDVEAIEGIDMLGFIFYPKSPRCVYELPAYLPTHARRVGVFV NEDKQVVSMYADRFELNDVQLHGNESPEYCRSLHSTGLKIIKAFSVDRPKDLKKVYDYEK VCDLFLFDTKCEQYGGSGNQFDWSILHTYNGDVPFLLSGGINSYSANALKEFKHPRLAGY DLNSRFETKPGEKDPERIRTFLNELKSSL >gi|222159227|gb|ACAB01000132.1| GENE 38 50164 - 50934 333 256 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc [Roseobacter sp. 
AzwK-3b] # 1 240 1 242 263 132 35 3e-30 MNRINQLFNSNKKDILSIYFCAGTPTLDGTANVIRTLEKHGVSMIEIGIPFSDPMADGIV IQNAATQALRNGMSLKLLFEQLRDIRKDVKIPLVFMGYLNPIMQFGFENFCRKCVECGID GVIIPDLPFRDYQEHYRIIAERYNIKVIMLITPETSEERVREIDTHTDGFIYMVSSAATT GAQQNFNEQKQAYFKKIKDMHLNNPLMIGFGISNKATFQAACEHASGAIIGSKFVTLLEE EKDPEKAILKLKEAVK >gi|222159227|gb|ACAB01000132.1| GENE 39 51156 - 52196 731 346 aa, chain + ## HITS:1 COG:YPO2161 KEGG:ns NR:ns ## COG: YPO2161 COG0252 # Protein_GI_number: 16122393 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Yersinia pestis # 4 346 2 337 338 292 45.0 8e-79 MRAETPSVLLIYTGGTIGMIENPETGALENFNFDHLLKHVPELKRFNYRISSYQFDPPLD SSDMEPAYWAKLVKIINYNYDYFDGFVILHGTDTMAYTASALSFMLENLSKPVILTGSQL PIGTLRTDGKENLITAIEIAAAKNTDGTAIVPEVCIFFENHLMRGNRTTKINAENFNAFR SFNYPPLARVGIHIKYEPNLIRKPDPTKPLKPHYLFDTNVVILTLFPGIQESIVTSLLHV PGLKAVVMKTFGSGNAPQKKWFIRQLKEATDRGIIIVNITQCASGAVEMGRYETGMHLLE AGVISGYDSTPECAITKLMFLLGHGLPNKDIRYKMNSCLIGEITKS >gi|222159227|gb|ACAB01000132.1| GENE 40 52438 - 54822 1848 794 aa, chain + ## HITS:1 COG:no KEGG:BT_0525 NR:ns ## KEGG: BT_0525 # Name: not_defined # Def: outer membrane protein, function unknown # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 794 1 816 820 521 43.0 1e-146 MMKKTVQFVPIIAIAIAGTMFSCVDSGKDLYDPSYETPNPMGDGFAAPDDIDWNMITTKN VSVEVKDEEGGQFAYLIEIYAEDPLTNENASVLATRTANKENNFKFTAAVSLLPTQKGIY VKQTDPRGREQVYQFDVPENSDNITCKLYYAESAAQNRALMSRGVATRSLAFEKPDYSSI PSDAKEVTEMTGTTLLRNANYKITSDYNGIFKFDGYDGDIATRVYVDAQWTIPATFQFQN GIEIIVMNNAKINASGTMTFIRNSMLTIMEKGEVNADDVSFTNGAPAALRNWGTLAVTNT MILHSGATLYNKGTITSRDISINSNTKIVNDNKIELEGTLNLPSNFSLENNGEIYGKELI ANSNAVATNNNIMRFTTISLTNTTFNNACSLEATTSFYANGATFNFTQGYLKAPTMEFVN GTVNLSNGSMLDATTSIYMNTAHAKFYGKGENTSMIKSPVITGQGFTYDGNLVIECDNHV EKSPHWNNFHVQNGAYFTKMGESKVVIEVCTGTKNNGNEGEEPEDPKFPIIMDDTRNYAY LFEDQWPLYGDYDMNDLVLIIKERKISINKDNKAEEFTLSLDLSAAGATKSIGAAIMLDG 
VPASAITQPVVFSDNSLAKNFNVNSNKIENGQDYAVIPLFDDAHNALGRDRYEQINTIKD HSANTNPKNISFTIKFSNPISVDELNINKLNVFIFVEGNRNQRKEIHIVGYQPTKLANTD LFGGNNDDSSTSRKRYYISKDNLAWGIMVPTDFKWPLEYVNIKSAYSLFESWVTSGGTKN EEWWKTFDSSRVYK >gi|222159227|gb|ACAB01000132.1| GENE 41 55057 - 55422 299 121 aa, chain + ## HITS:1 COG:ECs0449 KEGG:ns NR:ns ## COG: ECs0449 COG0745 # Protein_GI_number: 15829703 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 120 1 120 229 77 37.0 7e-15 MKKKILLVDDKSTIGKVAGVYLGKEYDFTYLEDPIKAIEWLNEGNVPDLIISDIRMPLMM GDEFLRYMKNNELFKSIPIVMLSSEESTTERIRLLEEGAEDYILKPFNPLELKIRIKKII D >gi|222159227|gb|ACAB01000132.1| GENE 42 55440 - 56597 718 385 aa, chain + ## HITS:1 COG:all4420 KEGG:ns NR:ns ## COG: all4420 COG2148 # Protein_GI_number: 17231912 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Nostoc sp. 
PCC 7120 # 142 382 250 441 445 141 33.0 3e-33 MQYFVYIGRDSKTIELLSRLSIGVFYAASNCSKAVKVLEKVREKYDAALFFEQVNLSQDI ADIQYMRKKYPGLYMILVIDSLSKEEASEYLKAGINNTIKYETSQEALKDLSTFLKRRKD QKIKALQLKTQNINAFRLPLWKRTFDIFFSGMAILCLSPLLIFTALAIRLESKGPIIYKS KRVGSNYQIFDFLKFRSMYTDADKHLKDFNALNQYQQEDEDIWGEELEPDINEEADEEEI LLISDDFVISEEDYINKKSKEKSNAFVKLENDPRITKIGRIIRKYSIDELPQLINILKGD MSIVGNRPLPLYEAELLTSDEHIDRFMGPAGLTGLWQVEKRGEAGKLSAEERKQLDITYA KTFSFWLDIKIILKTVTAFIQKENV >gi|222159227|gb|ACAB01000132.1| GENE 43 56609 - 57556 91 315 aa, chain + ## HITS:1 COG:no KEGG:BT_0522 NR:ns ## KEGG: BT_0522 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 305 1 304 304 426 70.0 1e-118 MIDSYIYIIDDLIFFCTGLLLLYLFVMAVASHCKHITYPKAQKAYRCAILVPEGSLLPYI YKEESYEFITYSDLHQTIHSLDPEHYDLVLFLSHTASALSPQFLDKIYNAYDAGIQAVQL HTVIENHKGFRNHFCAIREEIKNSLCRAGNTQFGLSSYLLGTNMVIDLKWLQKNMKSSKT NIERKLFRQNIYIDYLPDVIVYCQSAPVCPYRKRIRKTTSYLLSSIFEGNWSFCNRIVQQ LTPSPLKLCIFVSVWASLITVYNWTLSFGWWIALFGLLITYSLAIPDYLVEDKKKKKHSI WRKKHLNNELKKTPA >gi|222159227|gb|ACAB01000132.1| GENE 44 57508 - 58155 471 215 aa, chain + ## HITS:1 COG:SMb21427 KEGG:ns NR:ns ## COG: SMb21427 COG0110 # Protein_GI_number: 16265003 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Sinorhizobium meliloti # 65 191 24 156 162 92 40.0 8e-19 MEKKTLKQRIKENPGLKQAVHRFIMHPVKTRPNWWIRLFDFIYLKRGKGSVIYRSVRKDL PPFNRFFLGKYSVVEDFSCLNNAVGDLTIGDYTRIGLRNTIIGPVHIGNHVNLAQNVTVT GLNHNYQDAEKMIDEQGVSTLPVVIEDDVWVGANSVILPGVTLGRHCVVAAGSVVSHSVP PYSICAGCPARIIKAYDFETKEWKKVEKTPATNHK Prediction of potential genes in microbial genomes Time: Wed May 18 04:08:15 2011 Seq name: gi|222159226|gb|ACAB01000133.1| Bacteroides sp. D1 cont1.133, whole genome shotgun sequence Length of sequence - 87474 bp Number of predicted genes - 74, with homology - 74 Number of transcription units - 33, operones - 20 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 424 - 1035 375 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 2 1 Op 2 . - CDS 1025 - 1732 514 ## BT_0519 hypothetical protein - Prom 1792 - 1851 5.4 + Prom 1457 - 1516 3.2 3 2 Op 1 . + CDS 1680 - 1883 102 ## gi|237719367|ref|ZP_04549848.1| conserved hypothetical protein 4 2 Op 2 . + CDS 1939 - 3114 1262 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) + Term 3142 - 3201 14.4 + Prom 3128 - 3187 3.6 5 3 Op 1 . + CDS 3249 - 3755 596 ## COG0716 Flavodoxins 6 3 Op 2 . + CDS 3789 - 4151 350 ## BT_0516 hypothetical protein + Prom 4216 - 4275 3.2 7 4 Tu 1 . + CDS 4306 - 5328 731 ## BT_0515 terminal quinol oxidase, subunit, putative (DoxD-like) + Term 5363 - 5414 1.2 + Prom 5356 - 5415 6.6 8 5 Op 1 . + CDS 5558 - 6859 1004 ## BDI_2604 hypothetical protein 9 5 Op 2 . + CDS 6878 - 8059 1149 ## BDI_2603 putative iron-regulated protein A precursor 10 5 Op 3 . + CDS 8072 - 9538 1380 ## COG3488 Predicted thiol oxidoreductase 11 5 Op 4 . + CDS 9556 - 10791 1080 ## BDI_2601 hypothetical protein 12 5 Op 5 . + CDS 10815 - 11522 481 ## BDI_2600 hypothetical protein 13 5 Op 6 . + CDS 11544 - 12698 810 ## COG0251 Putative translation initiation inhibitor, yjgF family 14 5 Op 7 . + CDS 12714 - 14819 1013 ## COG0755 ABC-type transport system involved in cytochrome c biogenesis, permease component - Term 14905 - 14950 -0.9 15 6 Op 1 . - CDS 15139 - 15747 498 ## COG1309 Transcriptional regulator 16 6 Op 2 35/0.000 - CDS 15762 - 17501 221 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 17 6 Op 3 . - CDS 17494 - 19266 1506 ## COG1132 ABC-type multidrug transport system, ATPase and permease components - Prom 19298 - 19357 5.3 - Term 19344 - 19387 9.6 18 7 Tu 1 . - CDS 19579 - 21885 2111 ## COG3525 N-acetyl-beta-hexosaminidase - Prom 22049 - 22108 7.1 + Prom 21891 - 21950 4.8 19 8 Op 1 . 
+ CDS 22075 - 22989 744 ## COG0053 Predicted Co/Zn/Cd cation transporters + Prom 22998 - 23057 2.9 20 8 Op 2 . + CDS 23079 - 24644 1541 ## BT_4046 hypothetical protein + Term 24690 - 24757 21.2 - Term 24685 - 24734 8.2 21 9 Tu 1 . - CDS 24925 - 25269 358 ## gi|237713626|ref|ZP_04544107.1| predicted protein - Prom 25292 - 25351 2.9 22 10 Op 1 . - CDS 25390 - 26205 921 ## BDI_2292 hypothetical protein 23 10 Op 2 . - CDS 26266 - 27462 821 ## COG5492 Bacterial surface proteins containing Ig-like domains - Prom 27513 - 27572 3.0 - Term 27560 - 27590 2.9 24 11 Tu 1 . - CDS 27681 - 28874 611 ## BDI_2291 putative transcriptional regulator 25 12 Tu 1 . - CDS 28979 - 31318 1944 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins - Prom 31344 - 31403 3.9 26 13 Tu 1 . - CDS 31428 - 31745 113 ## BT_0503 hypothetical protein - Prom 31823 - 31882 4.3 - Term 31822 - 31864 6.5 27 14 Op 1 . - CDS 31895 - 34222 1962 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins - Term 34257 - 34304 2.6 28 14 Op 2 . - CDS 34307 - 34687 110 ## BT_0501 hypothetical protein 29 14 Op 3 . - CDS 34734 - 36332 1673 ## COG2985 Predicted permease 30 14 Op 4 . - CDS 36381 - 37283 950 ## COG1230 Co/Zn/Cd efflux system component - Prom 37489 - 37548 6.0 - Term 37499 - 37544 -0.3 31 15 Op 1 8/0.000 - CDS 37560 - 38228 894 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase 32 15 Op 2 . - CDS 38298 - 39323 1226 ## COG0524 Sugar kinases, ribokinase family - Prom 39481 - 39540 6.7 + Prom 39289 - 39348 6.2 33 16 Op 1 . + CDS 39577 - 40641 1165 ## COG1879 ABC-type sugar transport system, periplasmic component 34 16 Op 2 . + CDS 40672 - 42162 1510 ## COG2721 Altronate dehydratase + Prom 42228 - 42287 6.1 35 17 Op 1 . + CDS 42442 - 42993 342 ## BT_1252 hypothetical protein 36 17 Op 2 . + CDS 42994 - 43290 243 ## COG1846 Transcriptional regulators 37 17 Op 3 . 
+ CDS 43317 - 44669 1096 ## COG4452 Inner membrane protein involved in colicin E2 resistance + Term 44875 - 44925 -0.5 - Term 44887 - 44921 0.8 38 18 Op 1 . - CDS 45088 - 46368 515 ## COG4694 Uncharacterized protein conserved in bacteria 39 18 Op 2 . - CDS 46355 - 47293 467 ## COG4694 Uncharacterized protein conserved in bacteria 40 18 Op 3 . - CDS 47290 - 47439 62 ## gi|237713645|ref|ZP_04544126.1| predicted protein - Term 47869 - 47909 6.0 41 19 Op 1 . - CDS 47970 - 48494 485 ## Slin_0877 hypothetical protein 42 19 Op 2 . - CDS 48556 - 49236 454 ## Psta_4074 RNA polymerase, sigma-24 subunit, ECF subfamily 43 19 Op 3 . - CDS 49246 - 49689 379 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 44 19 Op 4 . - CDS 49719 - 50069 337 ## COG1733 Predicted transcriptional regulators - Prom 50194 - 50253 6.1 + Prom 50153 - 50212 5.3 45 20 Tu 1 . + CDS 50236 - 50763 492 ## BT_1263 putative protease I + Prom 51021 - 51080 4.8 46 21 Tu 1 . + CDS 51262 - 52863 1071 ## COG0642 Signal transduction histidine kinase + Term 52893 - 52942 -0.3 + Prom 52865 - 52924 7.6 47 22 Op 1 . + CDS 53065 - 54366 1218 ## COG1757 Na+/H+ antiporter + Prom 54393 - 54452 3.0 48 22 Op 2 . + CDS 54473 - 55027 914 ## PROTEIN SUPPORTED gi|160885844|ref|ZP_02066847.1| hypothetical protein BACOVA_03848 + Term 55063 - 55114 18.3 + Prom 55075 - 55134 3.7 49 23 Op 1 . + CDS 55202 - 56197 739 ## COG1609 Transcriptional regulators 50 23 Op 2 . + CDS 56265 - 58040 2045 ## COG2407 L-fucose isomerase and related proteins + Prom 58115 - 58174 6.7 51 24 Op 1 3/0.000 + CDS 58200 - 58838 565 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 52 24 Op 2 . + CDS 58856 - 60286 1216 ## COG1070 Sugar (pentulose and hexulose) kinases 53 24 Op 3 . + CDS 60293 - 60685 381 ## BT_1276 hypothetical protein 54 24 Op 4 . + CDS 60704 - 62014 999 ## COG0738 Fucose permease + Term 62075 - 62119 12.4 + Prom 62430 - 62489 2.2 55 25 Tu 1 . 
+ CDS 62576 - 62743 257 ## gi|237713661|ref|ZP_04544142.1| predicted protein + Prom 62771 - 62830 6.1 56 26 Tu 1 . + CDS 62988 - 63860 856 ## BF3047 hypothetical protein + Term 63894 - 63932 6.2 + Prom 64441 - 64500 4.6 57 27 Tu 1 . + CDS 64525 - 65037 270 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 65403 - 65462 2.2 58 28 Tu 1 . + CDS 65593 - 67284 664 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Prom 67319 - 67378 5.8 59 29 Op 1 . + CDS 67460 - 68113 638 ## BT_1287 hypothetical protein + Term 68143 - 68183 1.1 + Prom 68126 - 68185 3.3 60 29 Op 2 . + CDS 68208 - 68900 557 ## COG4912 Predicted DNA alkylation repair enzyme + Term 68916 - 68963 8.2 - Term 68904 - 68950 4.2 61 30 Op 1 25/0.000 - CDS 68993 - 70324 1109 ## COG0687 Spermidine/putrescine-binding periplasmic protein - Prom 70344 - 70403 3.5 - Term 70330 - 70397 -0.1 62 30 Op 2 36/0.000 - CDS 70414 - 71205 692 ## COG1177 ABC-type spermidine/putrescine transport system, permease component II 63 30 Op 3 30/0.000 - CDS 71199 - 71999 596 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I 64 30 Op 4 . - CDS 72011 - 73396 1486 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components - Prom 73521 - 73580 10.2 + Prom 73236 - 73295 4.3 65 31 Tu 1 . + CDS 73541 - 74644 1243 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 74722 - 74767 7.6 - Term 74867 - 74911 5.4 66 32 Op 1 . - CDS 74923 - 77487 2581 ## COG0058 Glucan phosphorylase 67 32 Op 2 . 
- CDS 77524 - 79185 1586 ## COG0438 Glycosyltransferase - Prom 79255 - 79314 6.5 - Term 79303 - 79351 15.2 68 33 Op 1 16/0.000 - CDS 79388 - 79852 714 ## COG0636 F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K 69 33 Op 2 4/0.000 - CDS 79928 - 81745 1567 ## COG1269 Archaeal/vacuolar-type H+-ATPase subunit I 70 33 Op 3 16/0.000 - CDS 81742 - 82347 577 ## COG1394 Archaeal/vacuolar-type H+-ATPase subunit D - Prom 82391 - 82450 2.1 71 33 Op 4 16/0.000 - CDS 82456 - 83784 1585 ## COG1156 Archaeal/vacuolar-type H+-ATPase subunit B 72 33 Op 5 . - CDS 83816 - 85576 1935 ## COG1155 Archaeal/vacuolar-type H+-ATPase subunit A 73 33 Op 6 . - CDS 85594 - 86436 810 ## BT_1300 hypothetical protein 74 33 Op 7 . - CDS 86448 - 87038 632 ## COG1390 Archaeal/vacuolar-type H+-ATPase subunit E - Prom 87248 - 87307 7.2 Predicted protein(s) >gi|222159226|gb|ACAB01000133.1| GENE 1 424 - 1035 375 203 aa, chain - ## HITS:1 COG:BMEI1582 KEGG:ns NR:ns ## COG: BMEI1582 COG2197 # Protein_GI_number: 17987865 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Brucella melitensis # 126 194 140 208 213 63 50.0 3e-10 MKNNEVVRIAIAETSVIIRGGLTAALKRLSNVKVQPIELLSVEALHDCVRTQCPEMLIVN PAFGDYFDVAKFREEISGKRIRLIALVTSFVDASLLGKYDESISIFDDLETLSKKIAGLL NVVSEEEGMDNQDTLSQREKEIVICVVKGMTNKEIAEKLFLSIHTVITHRRNISKKLQIH SAAGLTIYAIVNKLVALSDVKDL >gi|222159226|gb|ACAB01000133.1| GENE 2 1025 - 1732 514 235 aa, chain - ## HITS:1 COG:no KEGG:BT_0519 NR:ns ## KEGG: BT_0519 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 235 1 234 234 449 96.0 1e-125 MMDNLQKYKPTDKMIDLISDNYSLLQVMSRFGLSLGFGDKTVKEVCELNGVDCRTFLIVV NFMAEGFSRLDGDKDDISIPALIDYLRQAHIYFLDFSLPAIRRKLIEAIDCSQDDVAFLI LKFFDEYTREVRKHMDYEEKTVFKYVDSLIKGNAPKNYQISTFSKHHDQVGEKLTELKNI IIKYCPAKANENLLNAALFDIYACEAGLESHCKVEDYIFVPAILNLERRIRENEK >gi|222159226|gb|ACAB01000133.1| GENE 3 
1680 - 1883 102 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237719367|ref|ZP_04549848.1| ## NR: gi|237719367|ref|ZP_04549848.1| conserved hypothetical protein [Bacteroides sp. 2_2_4] # 1 42 1 42 46 70 90.0 4e-11 MRSIILSVGLYFCKLSIIPMYIREYKETEYFAEMQSSFIGILNHHLLANTNYTLLFCGTT LGKLVFF >gi|222159226|gb|ACAB01000133.1| GENE 4 1939 - 3114 1262 391 aa, chain + ## HITS:1 COG:CAC2445 KEGG:ns NR:ns ## COG: CAC2445 COG0138 # Protein_GI_number: 15895710 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Clostridium acetobutylicum # 4 391 5 391 391 538 64.0 1e-152 MANELELKYGCNPNQKPARIFMKEGELPIEVLNGRPGYINLLDAFNSWQLVKELKEATGL PAAASFKHVSPAGAAVAVEMSDTLKKIYFVDDMKLSPLATAYARARGADRMSSYGDFIAL SDTCDEETARIINREVSDGVIAPDYTPEALEILKNKRKGTYNVIKIDPAYRPAPIEHKDV FGVTFEQGRNELKIDESLLKEMPTQNKEIPADAKRDLIIALITLKYTQSNSVCYAKDGQA IGIGAGQQSRIHCTRLAGNKADIWYLRQHPKVMNLPWIEKIRRADRDNTIDVYISEDYDD VLADGVWQQFFTEKPEVLTREEKRAWLNTMTGVALGSDAFFPFGDNIERAHKSGVSYIAQ PGGSVRDDHVIGTCDKYNMAMAFTGIRLFHH >gi|222159226|gb|ACAB01000133.1| GENE 5 3249 - 3755 596 168 aa, chain + ## HITS:1 COG:Cj1382c KEGG:ns NR:ns ## COG: Cj1382c COG0716 # Protein_GI_number: 15792705 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Campylobacter jejuni # 4 164 3 159 163 144 51.0 1e-34 MNKIGVFYGSTTGTTEDLARRIAEKLDVPSADVFDVSKLTEALVNEYDVLVLGSSTWGAG ELQDDWYDGVKVLKKCDLSHKSVALFGCGDSDSYSDTFCDAIGILYEDLKDTHCKFCGAT DTAGYTFDSSIAVVDGKFVGLPLDEVNEDSKTDERINAWAEQVKQEIS >gi|222159226|gb|ACAB01000133.1| GENE 6 3789 - 4151 350 120 aa, chain + ## HITS:1 COG:no KEGG:BT_0516 NR:ns ## KEGG: BT_0516 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 120 1 120 120 220 95.0 1e-56 MATERIIPGEIRIFLNHIYEFKKGVRNMVLYTMSKEHEEFAIRRLKNQKISYMIQEVGTN KINLFFGKAECMEAMRHIIIHPLNQLTAEEDFILGAMLGYDLCQQCKRYCSKKEGIKMAV >gi|222159226|gb|ACAB01000133.1| GENE 7 4306 - 5328 731 340 aa, chain + ## 
HITS:1 COG:no KEGG:BT_0515 NR:ns ## KEGG: BT_0515 # Name: not_defined # Def: terminal quinol oxidase, subunit, putative (DoxD-like) # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 339 2 340 341 566 84.0 1e-160 MKQQQEQLTAYTLAGIFTLSLRLIVGWTYFSAFWRRLVLENKLIPDSTGYIGEKFNHFLP NSIGIKPIIEYLVSTPDLLWWAMVIFTIIEGVVGLLYMLGFFTRLMSIGVFSLAFGILLG SGWLGTTCLDEWQIGILGVSAGFTIFLSGGGKYSLDYLLQPQLSKNKWLVWVTSGELPLS IKRFSKVAIGGAAILFILTLYTNQVFHNGVWGPLHNKSVKPKIEISDANVNNGALSFKVY RVEGADVYGSFLIGITLKDTSGKVILQKNGEDLALFPLTNIKNDYIAKVVPGKHSLIIPL GSKATLAIRDNSLQNLPEGQYELILTDISGITWNKNITIE >gi|222159226|gb|ACAB01000133.1| GENE 8 5558 - 6859 1004 433 aa, chain + ## HITS:1 COG:no KEGG:BDI_2604 NR:ns ## KEGG: BDI_2604 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 433 1 437 437 613 69.0 1e-174 MRKITAIIILCFLHSLTLPAQETKEKSKNRKTSSTEEYTKFRFGGYGEMAASYKDYNFNR FTPNGSGKLNRGEISIPRFVLAFDYKFSSKWVLSTEIEFEYGGTGSAREIEWFEENGEYE TEIEKGGEVALEQFHITRLINKHINLRFGHMIVPVGLTNNHHEPINFFGVYRPEGETTIL PSTWHETGIAVFGEIGRFDYELQLVNGLDPQGFRSEDWIKNGRQGAFEVNNFTSPAFVAR VNYHGVKGLRVGASFYYNQTAKNGTKPWRNTGYKFPVTIVTGDLQYKGFNNNLIVRGNAV YGNLGASGALTTINNSSSAASGYPNTTVAQNAVSYGGEAGYNIGSFFHSKAPRIYPFIRY EYYNPMEKTEANTGLLADKRFQVSAVTAGLNYYALPNLVIKADYTHRSIGGGKYNDENLV SVGIAYIGWFIKK >gi|222159226|gb|ACAB01000133.1| GENE 9 6878 - 8059 1149 393 aa, chain + ## HITS:1 COG:no KEGG:BDI_2603 NR:ns ## KEGG: BDI_2603 # Name: not_defined # Def: putative iron-regulated protein A precursor # Organism: P.distasonis # Pathway: not_defined # 3 390 1 401 403 333 47.0 8e-90 MKMKTWKFLPIMAMGVCMAMTSCSEDSKKEDVSPNPFAGFNTPDSGTATDDELKAAVATY VDDVVIPTYQDMYNKVTALNTAVQKLSTSSSDNDVANAADAWVAARKPWEKSEAFLYGPA DLNKLDPSLDSWPLDKGGIDAILQSGNWEDAVGGDVDDGDDAPADAPQNLRGFHTAEYLL FEDGEAKKITDLDANKIGYLKAVVGRMSKDTELLLKGWTEGKNLENGAYKDAMKAPNGTY SISNIKQSVAMILNSDNGAEGIANEVGATKIGDPVDKWNSGDKEGGVLAVESWYSWNSLD DYTDNIYSIRNAYYGSLDGTVASASLCSIMKKVNPTLNDMVVKQIQETITAINDIPHPFR NNLGATTEVKNAQNQCAFLCSGLALVRGKLAGE >gi|222159226|gb|ACAB01000133.1| GENE 10 8072 - 9538 
1380 488 aa, chain + ## HITS:1 COG:VC1265 KEGG:ns NR:ns ## COG: VC1265 COG3488 # Protein_GI_number: 15641278 # Func_class: C Energy production and conversion # Function: Predicted thiol oxidoreductase # Organism: Vibrio cholerae # 92 488 54 461 461 243 38.0 7e-64 MMTKFFLLGKPVIKALRGGLPRRLGRILFVAINIGFLQGCDHDAFDGIIYKPNGEIAKPE ELSAGISTIFSTVSGAYDTNADWVTGELLTRFNRGDGLYDNSRGVGTGSGNGLGPIFGGY SCGACHRNTGRTTPAWVTGGSGAGFSSMLVYITRKTGGYFPEYGRVLHDQSIYGVKAEGK IKITTTTEKFKFPDGEEYELITPHYEITDWYDYEIKPEDLIISVRAPLRHVGMGQMMALN HDELKQLAAQSNYPEYGISGRLNYVTEKGKYDIGISGNKANHQDLTVELGFSSDLGVTND RFPHEVGEGQSQMMGFANQGVEISTRDMEDVDLYMQTLGVPARRNVDDETIKHGEQMFYQ AKCHLCHTVTLHTRPRGSVLLNGTQLPWLGNQIIHPYSDFLLHDMGQGLADDYPSGLARG CEWRTTPLWGIGLQEIVNGHTYFLHDGRARNLTEAIMWHDGEGAASRSIFSRMSKEDRAA LLTFLNSL >gi|222159226|gb|ACAB01000133.1| GENE 11 9556 - 10791 1080 411 aa, chain + ## HITS:1 COG:no KEGG:BDI_2601 NR:ns ## KEGG: BDI_2601 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 411 1 411 411 623 75.0 1e-177 MKKKYLALALTAFLCSHSYAQDSNAKEIDDKYVENDVVSLAGKKGFSFSTKAGDFLFKPY ALVQTSLSFNRYDDQGLESLYARNWANSGFAIPNAILGFTGKAFGKVTFNLSLNAAKSGA ALLQQAWFDVALKESFRIRVGKFKTPFMHAYLTTLGETLFPVLPSSVAGGVLMPYDINAV KPSIATGFDLGVQIHGLINGKWNYQLGIFNGTGIDVNSATKGMCDDHKWLPQLLYSGRLV YMPKGEMPATQGNPNNLKEDKMQFGVSTSYNAEAEDHSSSDWRIGAEFAMVKNRFYFAAE GYYMNMHFTEIMHKDKDLNYWGAYTQAGYFVTPKLQAALRYDIFDRNGTDEGGLLNMPAI GANYYFVGSNLKLQMMYQYLGRTGHDTQTDRDNDGVGLSRHSVTAMLQFTF >gi|222159226|gb|ACAB01000133.1| GENE 12 10815 - 11522 481 235 aa, chain + ## HITS:1 COG:no KEGG:BDI_2600 NR:ns ## KEGG: BDI_2600 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 20 235 19 235 235 243 53.0 4e-63 MNKRLNYLLITCACLNLCSACDEGKIYPDDTIDSGRTATVTLHFTHLDAWPQKNNMSLIS LGEDGVTPILVKRILEPSSEAEAVTVTLNNLNGNTRSIAVAVITSGMSLVHAYQSFPVNA SDESLTLPETTIEVASFSRIQTQVFNAKCVMCHGGSTTTAGDLDLTAAKAYQSLINVKAP HSAKGKNYVTPGDLNNSYLLDVLEHDNHIDIFNGAEQREVRALIRTWIKAGAENN >gi|222159226|gb|ACAB01000133.1| GENE 13 11544 - 12698 810 
384 aa, chain + ## HITS:1 COG:PAB0825 KEGG:ns NR:ns ## COG: PAB0825 COG0251 # Protein_GI_number: 14521450 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Pyrococcus abyssi # 278 374 28 126 127 60 33.0 5e-09 MNSKKQQPLKSEKTLAEIFKYEVADGVSEYHLMIHAIQPESTYEEQLNAIIDTYYQLREK ELQGTVAIFKRYFLSDASNQADTLLALTAESSDCALSIVEQPPLNGTKVAVWIYLQTKVQ TQMLHNGLFEVKHGPYRHLWGGSSFNRAANSEYQTRLLLNDYVMQLIEQGCKLAANCIRT WFFVQNVDVNYAGVVKARNEVFVTQNLTEKTHYISSTGIGGRHADPKVLVQMDTYAVGGI KPEQIHFLYAPTHLNPTYEYGVSFERGTYVDYGDRRQVFISGTASINNKGEVVYPGEIRK QTERMWENVKALLKEAECTFDDMGQMIVYLRDIADYTIVKEMYDKRFPCTPKVFVHAPVC RPGWLIEMECMGVKTCENKEYAPY >gi|222159226|gb|ACAB01000133.1| GENE 14 12714 - 14819 1013 701 aa, chain + ## HITS:1 COG:Cj1013c_2 KEGG:ns NR:ns ## COG: Cj1013c_2 COG0755 # Protein_GI_number: 15792340 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome c biogenesis, permease component # Organism: Campylobacter jejuni # 467 699 22 252 287 184 43.0 8e-46 MPLKRLLIILYICLIGLLAAATFIEQAYGTEFVERNIYHTIWFCCLWGVIAAIGVVALIR RSLWRRLPVLLFHGSLLVILVGAMITFIYGEQGYMHLRPDTVKSSFHSPQGDKNIHLPFT MKLDSFRVEYYPGTKAPADYVSHISYSLPGQKDSVHNECISMNRIFTTQGFRFYQSSFDE DGQGSWLTVNYDPWGTRVTYSGYILLGISMILLLFCRQGEFRKLLNHPLLKKGGLFILFL FYLTGTMQAQQTPLRVLNKIQADSLAQQQVIYHDRVVPFNTLARDFVKKLTGKIYYKGLT PEQVISGWILYPDSWKNEPMIYIKSPELHHLLGLESSYARLTDLFDGPVYRLQKTWQQEQ GKSSKLAKAIQETDEKVGLILMVEKGTFIQPLPTDGSVQPLSELEVKAELLYNRIPFSKI LFMINLSLGVLSFMLLLHNSLQRNTPSPKAKTISRTAGTIFSVALYLAFIFHLAGYCLRW YIGGRVPLSNGYETMQFMALCILLIACLLHRRFSFILPFGFLLSGFALLVSYLGQMNPQI TPLMPVLVSPWLSIHVSLIMMSYALLAFIMLNGILALCLRKKESENNVSGNDAVQDNRIE QLTLVSRLLLYPATFFLGAGIFLGAVWANVSWGRYWAWDPKEVWALITFLVYGVAFHSQS LRIFRKPLFFHIYMILAFLTVLMTYFGVNYVLGGMHSYANS >gi|222159226|gb|ACAB01000133.1| GENE 15 15139 - 15747 498 202 aa, chain - ## HITS:1 COG:CAC0821 KEGG:ns NR:ns ## COG: CAC0821 COG1309 # Protein_GI_number: 15894108 # Func_class: K 
Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 139 1 136 200 70 31.0 2e-12 MQFLKGDIQERILKVAEEVFLEKGYKDASMREIASRAGVTVSNIYHYFTNKDEIFRTILK PVLNDLYAMIYNHDADQMTIDVFMDSDYQKTSVREYIRLVSEHRDRLRLLLFQAQGSVLE NFRSEYTDLMTRTISVFFQGMKQKYPHINIAITNFFIHLNTVWLFALLEELVLHPVKKEE MEKFIAEYIVFETAGWKELMNA >gi|222159226|gb|ACAB01000133.1| GENE 16 15762 - 17501 221 579 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 338 559 279 507 563 89 29 5e-17 MIEVIKRRFALSTKGAKDFCKGVFFTTLLDIVLMLPAVFVFLFLEEYLRPVFQPSVSVTH GILYYSILGIIFMIVMYIFAVLQYRSTYTTVYDESANRRISLAEKLRKLPLAFFGEKNLS DLTATIMDDCTDLEHTFSHAVPQLFASIISILLITVGMAFYNWQLTIALFWVVPLAAAIL LFSKKEIQKSNESNYLNKRMVTEHIQEGLDTIQEIKPYNQERDYLEKLDASIDTYEKVLT RNELVLGILVNGSQSVLKLGLASVIIVGANLLASGTIDLFTYLIFMVIGSRVYAPLSEVM NNIAALFYLDVRISRMNEMEALPVQHGTTDFTPKGYDIEFQQVDFAYEQGKQILKNLSFT ARQGEKTALVGPSGSGKSTAARLAARFWDIQSGKITLGGQDISRIDPETLLTNYSVVFQE VVLFNASIMDNIRIGKRDATDEEVRRVARLAQCDEFVTKMPQGYQTIIGENGETLSGGER QRISIARALLKDAPIVLLDEATASLDVENETKIQAGISELVRNKTVLIIAHRMRTVANAD KIVVLENGSVAEMGTPEELKKKNGIFARMVNRQVTNMNG >gi|222159226|gb|ACAB01000133.1| GENE 17 17494 - 19266 1506 590 aa, chain - ## HITS:1 COG:SP1434 KEGG:ns NR:ns ## COG: SP1434 COG1132 # Protein_GI_number: 15901286 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Streptococcus pneumoniae TIGR4 # 1 586 3 583 586 397 38.0 1e-110 MSTLKKLQNYMGKRKVLLPAAMLLSALSALAGMLPYILIWLIVRELLEYGEITSSGNVVT YAWWAAGMAVASIVLYFAALMSSHLAAFRVESNLRKEAMRQIVRMPLGFFDINTSGRIRK IIDDNAGVTHSFLAHQLPDLAATFLVPLVAVILIFMFDWILGLACIVPVIIAMLVMGFMM NAEGRQFMKSYMTSLEEMNTEAVEYVRGIPVVKVFQQTIYSFKNFHRCIMNYNKMVFGYT RMWEKPMSLYTVIINGFVFFLAPLAILLIGYSGNYASVLLNFFLFVLITPVFSQSIMKSM YLNQALGQASEAIGRLENLVAYEHLTVVAHPQPVKEFSIQFEKVSFSYPGANQKAVDDIS FTITQGRTVALVGASGGGKTTIARLVPRFWEATEGKVLIGGINVREIAPEELMKHISFVF QNTKLFKTSLLENIKYGNPNATMEEVERAVDMAQCREIINKLPLGLNTKIGTEGTYLSGG 
EQQRIVLARAILKNAPIIVLDEATAFADPENEHLIQQALKELTKGKTVLMIAHRLSSITD ADNILVIDKGKIAEQGTHAKLLEKQGIYYNMWNEYQQSVRWTIGKEVSND >gi|222159226|gb|ACAB01000133.1| GENE 18 19579 - 21885 2111 768 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 28 608 29 591 757 307 33.0 5e-83 MFKKLSSSLLIVSACVFSSCTPTVKQEIAILPTPVSLTEQSGSFVLKDGMKIGVSDQSLF PAVGYLQEILRNVISSSVEVTTDQNQVDMYFQLKDTGGKPGSYKLESTPESVRVEATDYS GFISAITTIRQLLPATIEVQGEKQTYSIPAVQIEDAPRFEWRGFMLDASRHFWNKDEVKH VLDLMSLYKLNKFHWHLSDDQGWRIEIEKYPLLTEKGAWRKFNTQDRTCMARAKEEDNTD FLIPEDKIRIVEGDTLYGGYYTHDDIKEIVAYAAQRGIDVIPEIDMPGHFLAAIGQYPEL ACDGLIGWGETFSSPICPGKDTTLEFCRNVFKEIFELFPYEYVHMGGDEVEKANWKKCPL CQKRIRTEKLGSVEELQAWFVRDMEKFFLANGKKLIGWDEVVTDGLSSDAAITWWRSWAK DALPTATAQKQKVIACPNEHFYFDYAQDQNSVKKILAYDPCADERLSPEEKKYIWGVQAN LWAEWIPTMKRIEYLIVPRMIALSEIAWTEPAAKPSLEEFYRQLVPQFKRMDVMRVNYRV PDLQGFYKVNAFIDETAVDLTCPLPGTEIRYTTDGSMPTKESALYDGALEVEETTDFAFR TFRPDGSPSDVVRTKYVKAPYAEAVTAPAALQSGLKAVWHDFRGNLCADIEAAPVKGEYV VESVSIPEEVKGNIGLVLTGYLEVPADGIYTFALLSDDGSTLMLDGELLGDNDGAHSPVE IIVQKALKAGLHPIEVRYFDCNGGVLQMELVNEKGEKEVLPSTWLKHE >gi|222159226|gb|ACAB01000133.1| GENE 19 22075 - 22989 744 304 aa, chain + ## HITS:1 COG:TM0876 KEGG:ns NR:ns ## COG: TM0876 COG0053 # Protein_GI_number: 15643638 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Thermotoga maritima # 1 301 1 302 306 227 37.0 2e-59 MSRENILIKTSWISTIGNAILSASKIFIGLWAGSLAVVGDGIDSATDVVISIVMIFTARL INRPPSKKYVFGYEKAEGIATKILSLVIFYAGMQMLISSIQSIFSDEVKEIPSAIAIYVT IFSIVGKLMLASYQYKQGKKIDSSMLTANAINMRNDVVISAGVLLGLIFTFIFKLPILDS ITGLIISLFIIKSSIGIFLDSNVELMDGVKDVNVYNKIFEAVEKVPGASNPHRVRSRMIG NRYIITLDIEVNPQITITQAHEIAGAVEKSIESSIDNVYDILVHVEPAGECQTDEKFGVD KDMV >gi|222159226|gb|ACAB01000133.1| GENE 20 23079 - 24644 1541 521 aa, chain + ## HITS:1 COG:no KEGG:BT_4046 NR:ns ## KEGG: BT_4046 # 
Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 519 5 517 518 776 73.0 0 MEEGTFRRYPIGIQDFEDLRNNDYIYVDKTALIYQLVNTNKIYFLSRPRRFGKSLLVSTL EAYFLGKKELFNGLDMEQLEKDWTVYPVLHIDFSGSKYMEAESLRASINVQLLLWENVYG RQEGEDTFSLRLEGIIRRAYEQTGQQVVVLVDEYDAPMLDSNNNHELQHEISGIMRDFFS PLKKSGKYLRFLFLTGISKFSQMSIFSELNNLQNVSMSDNYSTICGITEQELLTQMKIDV EQMAQANDETYEEACAHLKKQYDGYHFSKSCVDVYNPFSLINAFAQKSYENYWFSTGTPT FLIELLQQMNFDIRLLDRMDAKPEDFDKATDCLTDPIPVLYQSGYLTIKSYDAFFRTYTL GYPNEEVRIGFIESLIPSYLYQPTRESNFYVVSFVRDLMKGNLESCLERTRSFFASIPND LNNKEEKHYQTIFYLLFRLMGQYVDTEVKSAVGRADVVIKMQDAIYVLEFKMNGTAEEAL AQINSKGYAIPYEADHRKVVKVGINFDSTTRTIGDWEIETD >gi|222159226|gb|ACAB01000133.1| GENE 21 24925 - 25269 358 114 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713626|ref|ZP_04544107.1| ## NR: gi|237713626|ref|ZP_04544107.1| predicted protein [Bacteroides sp. D1] # 1 114 1 114 114 219 100.0 4e-56 MKQLRIFFLCLVAATLVTACGGDDKDNSTPVEKITLDKDKLTFVAQWKMRPSHFPEGGFT VVKGDGYTLTQLDLEKITSLEMDVTGSSVTSENCELVLDTAENAIKLRIKATAY >gi|222159226|gb|ACAB01000133.1| GENE 22 25390 - 26205 921 271 aa, chain - ## HITS:1 COG:no KEGG:BDI_2292 NR:ns ## KEGG: BDI_2292 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 5 241 3 237 253 254 58.0 3e-66 MATNEKITLNVDLYDNIMTEQKGDYTGRARITGSLHNKEIAARIIKERTEYRQETIENIL DLADQKKVEAIAEGKSVVDGVGQYIVTVRGSFLGENAQFDATKHSLGVSYTPGQLLRDQL KAVKVICNGLAQTGPVINSITDSVTKSISQVITSGGPAVISGSNIKILGDDPSVGIYLTK DEEGAVPLKVSVIVHNAPSQLTIMLPAIEAGKLYALSITTQYSGSNKALKTPRSYRFPIL LGDENAGGGGGGGEAPDPEIPGGGEAPDPAA >gi|222159226|gb|ACAB01000133.1| GENE 23 26266 - 27462 821 398 aa, chain - ## HITS:1 COG:CAC3274_1 KEGG:ns NR:ns ## COG: CAC3274_1 COG5492 # Protein_GI_number: 15896519 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Clostridium acetobutylicum # 29 115 395 480 480 67 44.0 3e-11 MKQLRIFFLCLMAATLATACGDDNNDDNSVPVESITLNHETLTLKSGATETLVPTIVPDR 
VTAESVVWSSDKTSVATVSKDGLVTAVAEGTATITATAREKSATCLVTVSNKTLVTTAAE LKTAIETADGTADAPTQIILGRSIEVAADAEHFALSIDGKHIAIDGGNSPIGGGNYYISR TASDKSLFELTNGASLKLTNLNIYGNAAAHSADVACIFVRASCKLTLGNGFELYSGDGND NDQLIGISVGDNATLIMEGDAEISKSIKGQEVLVAPTGILQLKGGKIKAREEGTYMSERS LCLQAAINGNQVTIPTVTVENELPADSDFKLDLYDYVLSSSTVRPGAETVVKGTDSYTLT DSDLMKFHLMTNTTGGMTYYDSYFELYLDGNAIKIRAK >gi|222159226|gb|ACAB01000133.1| GENE 24 27681 - 28874 611 397 aa, chain - ## HITS:1 COG:no KEGG:BDI_2291 NR:ns ## KEGG: BDI_2291 # Name: not_defined # Def: putative transcriptional regulator # Organism: P.distasonis # Pathway: not_defined # 6 393 6 390 393 219 35.0 1e-55 MDIPALHTLTNGIIIGICLSISFGLVCFKQMAHAINPKYRRACHFFIAASLIIAAGHLAE LLVDGFGVYRSLDLFSILVLVLASSQALMFTFMLILLFDSRYVTFANVMKHAAPSLVFIL LYVVSCCIEADVCVYSLAEWRACVVHNLPLAVRTLFGVTYTVQLFVYISLFFRKKHRYIT HLKELTGECISKLELRWTMRAFLYALSIGIGAWVLLVCPGPIPEFSFSFLIVVFYPAFAW LYTNYHYTYEMLRRFIVEQDGKLPQPPLEADMGDLICYLQSGDSNHLFEELEAYLQKEQP YLDYNFKRDDLIKALGTNERNLSAAVGRATGLTLQNYLLRLRIRHAMDLLLLPDYANHTI EDIAGASGFKAMRTFNRNFRELVKMTPTEFRQQKSVL >gi|222159226|gb|ACAB01000133.1| GENE 25 28979 - 31318 1944 779 aa, chain - ## HITS:1 COG:STM2199 KEGG:ns NR:ns ## COG: STM2199 COG4771 # Protein_GI_number: 16765529 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Salmonella typhimurium LT2 # 120 684 33 598 663 98 25.0 5e-20 MKKYLLVLAGLYCVLSYALAEHPDYPELKKSDANIIGHVLDKKTKEHLPYITIALKGTTI GTVTDATGHYFLKNLPEGNFILEVSSVGYKTVTRNVTLKKGKTLEEDFEIEEDAIALDGV VVSANRSETTRRLAPTLVNVVDLKLFETTNSSTLSQGLNFQPGVRVETNCQNCGFQQVRI NGLDGPYTQILIDSRPVFSALSGVYGLEQIPASMIERVEVMRGGGSALFGSSAIAGTINI ITKEPLRNSGQLSHTLTALGGGSSFDNNTSLNASLVTDDHRAGLYIFGQNRHRSGYDYDG DGFTELPKLKNQTVGFRSYLKTSTYSKLTFEYHHMQEFRRGGDMLDRPPHEAHIAEQLQH SIDGGSLKFDYFSPDEKNRLSIFASAANTDRDSYYGPGNDPLKAYGKTTDLTAMGGAQYV HTFDKCFFMPSDLTAGLEYNRDRLKDNMWGYDRHTDQTVNIYSAFLQNEWKNDRWGILIG GRLDKHNMVDNVIFSPRANLRFNPTQNINLRLSYSSGFRAPQAFDEDMHIENVGGTVAMI ERAKNLKEEKSQSFSASADMYHRFGVFQTNFLIEGFYTRLTDVFVLGEPYDRGDGILVKT 
RSNGPGAKVMGLTLEGKVAYLSILQIQAGLTLQRSRYDEPHKWHDDAPAEKKIFRTPDTY GYFTATYTPIKPLSIALSGTYTGRMLVQRMDITAENAELGAMPERKAEAIRTPRFFDLGV KLAYDFKLYKTVDLQLNGGVQNLFESYQKDFDRGANRDSGYIYGPSLPRSFFAGVKISY >gi|222159226|gb|ACAB01000133.1| GENE 26 31428 - 31745 113 105 aa, chain - ## HITS:1 COG:no KEGG:BT_0503 NR:ns ## KEGG: BT_0503 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 104 9 112 113 153 79.0 2e-36 MKWFLPVLFISYMAGITLFTHSHVVNGVTIVHSHPFKKGSEHSHTTVEFQLIHLLNHVLV TDSGLIPTFAVAALSLLCILFIRPQVEPYHRSCPGVISLRAPPVA >gi|222159226|gb|ACAB01000133.1| GENE 27 31895 - 34222 1962 775 aa, chain - ## HITS:1 COG:STM2199 KEGG:ns NR:ns ## COG: STM2199 COG4771 # Protein_GI_number: 16765529 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Salmonella typhimurium LT2 # 124 558 33 454 663 95 25.0 5e-19 MKKYIFSLVCLCCTLLPALEGQAHEYPNHPELRKSDANIVGHILDKNTKEHLPYITVALK GTTIGTVTDATGHYFLKNLPEGNFVLEVSSVGYKTVRRNVTLKKGRTLEEDFEIEEDAVA LDGVVVSANRNETTRRLAPTLVNVVDLKIFENTNSTTLAQGLSFQPGVRVESNCQNCGFQ QVRINGLDGPYTQILLDSRPIFSALSGVYGIEQIPASMIERVEVMRGGGSALFGSSAIAG TINIITKEPMRNSGMLSHTITGIGDGDVFDNSTALNASLVTDDQRAGLYIFGQNRHRSAY DHDGDGYSEIPKLHGQTIGFRSFLKTTTYSKLTFEYHHMEEFRRGGDLLNRPPHEANVAE QTEHSINGGGLKFDYFSPNENHRFNVFASAQHINRDSYYGPGDRDPLDAYGNTTDLNWMA GSQYVYSFGKCIFMPSDLTAGIEFNQDKLEDNMWGYNRTVDQKVNIGSAFLQNEWKNDHW GFLIGGRLDKHNLIDHVIFSPRANLRYNPTENINLRFSYSSGFRAPQAFDEDLHVENVGG NVAMVELADNLKEERSQSLSASADIYHRFGAFQVNFLVEGFYTKLSDVFALTDGEVVDGI LTRTRHNASGAQVLGLTLEGKMAYLNKIQVQAGVTLQQSHYSKPHIWNKEAPAVKKMMRT PNTYGYFTATYTPIKPLSIALSGTYTGSMLVPHEPVPGFLENPITVNTKDFFDIGLKAAY DFKLYKSMNLQLNAGIQNIFNAYQDDFDKGADRDSGYIYGPSLPRSFFAGVKISY >gi|222159226|gb|ACAB01000133.1| GENE 28 34307 - 34687 110 126 aa, chain - ## HITS:1 COG:no KEGG:BT_0501 NR:ns ## KEGG: BT_0501 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 126 2 110 110 151 84.0 8e-36 
MPPQRKNEVKRLLLNITRYFLPILFVSYLVSFTFFAHVHVVNGVTIVHSHPFKKGAAHKH STVELQLIHFLSHLTADGAAVVFALSLFIPFLLCLLPGRSQHTHYHCPYHGVVGLRAPPV IRFSVL >gi|222159226|gb|ACAB01000133.1| GENE 29 34734 - 36332 1673 532 aa, chain - ## HITS:1 COG:ECs4625 KEGG:ns NR:ns ## COG: ECs4625 COG2985 # Protein_GI_number: 15833879 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Escherichia coli O157:H7 # 13 528 19 554 561 270 33.0 7e-72 MFTDLLHSSYFSLFLIVALGFMLGRIKIKGLSLDVSAVIFIALLFGHFGVIIPKELGNFG LVLFIFTIGIQAGPGFFDSFRSKGKTLILITMLIICAACLTAVGLKYAFDIDTPSVVGLI AGALTSTPGLAVAIDSTNSPLASIAYGIAYPFGVIGVILFVKLLPKIMRVDLDKEARRLE IERRGQFPELTTCIYRVTNANVFDRSLVQINARGMTGAVISRLKHNDEISIPTAHTVLHE GDYIQAVGSEESLDQLAVLVGNREEGELPLDKTQEIESLLLTKKDMINKQLGDLNLQKNF GCTVTRVRRSGIDLSPSPDLALKFGDKLMVVGEKEGLKGVARLLGNNAKKLSDTDFFPIA MGIVLGVLFGKINISFSDSLSFSPGLTGGVLMVALVLSAIGKTGPIIWSMSGPANQLLRQ LGLLLFLAEVGTSAGKNLVATFQESGLLMFGVGAAITLVPMLVAAIVGHFVFKISLLDLL GTITGGMTSTPGLAAADSMVDSNIPSVAYATVYPIAMVFLILFIQVIASAVY >gi|222159226|gb|ACAB01000133.1| GENE 30 36381 - 37283 950 300 aa, chain - ## HITS:1 COG:CC0303 KEGG:ns NR:ns ## COG: CC0303 COG1230 # Protein_GI_number: 16124558 # Func_class: P Inorganic ion transport and metabolism # Function: Co/Zn/Cd efflux system component # Organism: Caulobacter vibrioides # 18 256 75 313 361 243 53.0 2e-64 MEPHHHHEHNHQLISLNKAFIIGITLNIAFVIVEFGVGFYYNSLGLLSDAGHNLGDVASL VLAMLAFRLQKVHPNSRYTYGYKKSTILVSLLNAVILLVAVGIIIAESIDKLFHPVSVDG SAIAWTAGVGVVINALTAWLFMKDKDKDLNVKGAYLHMAADALVSVGVVASGIIITYTGW SIIDPIIGLGIAVVIIVSTWGLLHDSLRLSLDGVPVGIDAQKIQQIIMEQPGVENCHHLH IWALSTTETALTAHIVIDNITQLEEVKQHIKEALEEAGIHHATLEFEDERTTCCKECCED >gi|222159226|gb|ACAB01000133.1| GENE 31 37560 - 38228 894 222 aa, chain - ## HITS:1 COG:CC1495 KEGG:ns NR:ns ## COG: CC1495 COG0800 # Protein_GI_number: 16125742 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Caulobacter vibrioides # 5 221 4 221 224 227 48.0 1e-59 
MAKFDKIAVLNKIGSTGMVPVFYHKDAEVAKKVVKACYDGGVRAFEFTNRGDFAQEVFAE IVKFAAKECPEMAIGIGSIVDPATAAMYLQLGANFVVGPLFNPEIAKVCNRRSVAYTPGC GSVSEVGFAQEVGCDLCKVFPGDVYGTNFVKGLMAPMPWSKLMVTGGVEPTKENLTAWIK AGVFCVGMGSKLFPKDKVAAEDWAYVTAKCEEALGYIAEARK >gi|222159226|gb|ACAB01000133.1| GENE 32 38298 - 39323 1226 341 aa, chain - ## HITS:1 COG:TM0067 KEGG:ns NR:ns ## COG: TM0067 COG0524 # Protein_GI_number: 15642842 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Thermotoga maritima # 4 341 2 339 339 365 54.0 1e-101 MGKKVVTLGEIMLRLSTPGNTRFVQSDSFDVVYGGGEANVAVSCANYGHDAYFVTKLPKH EIGQSAVNALRKYGVKTDFIARGGDRVGIYYLETGASMRPSKVIYDRAHSAIAEADAADF DFDAIMEGADWFHWSGITPAISDKAAELTRLACEAAKRHGVTVSVDLNFRKKLWTKEKAQ SIMKPLMQFVDVCIGNEEDAELCLGFKPDADVEAGHTDAEGYKGIFQQMMKEFGFKYVVS TLRESFSATHNGWKAMIYNGEEFYTSKRYDIDPIIDRVGGGDSFSGGIIHGLMTKPNQGA ALEFAVAASALKHTINGDFNLVSIEEVEALAGGDASGRVQR >gi|222159226|gb|ACAB01000133.1| GENE 33 39577 - 40641 1165 354 aa, chain + ## HITS:1 COG:mll7623 KEGG:ns NR:ns ## COG: mll7623 COG1879 # Protein_GI_number: 13476333 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 6 309 5 305 345 87 26.0 3e-17 MNKLPERIRIKDIARLADVSVGTVDRVIHGRSGVSEASKKRVEEILKQLDYQPNMYASAL ASNKKYTFICLLPEHLEGEYWTAVETGIHEAIATYSDFNTSVKINYYDPYDYHSFENASE AILALQPDGVMVAPTAPQYTKGFTDQLQALDIPYIYIDSNIKDVPPLAFFGQNSRQSGYF AARMMMLLARDEKEIVIFRKIHEGIVGSNQQENREIGFRQYMKEHHPSCTILELDLHAER NDEDNEMLDEFFRTYPTVKNGITFNSKAYIVGEYLQSRGKKDFNLIGYDLLERNVTCLKE GSISFLIAQQPELQGANGIKALCDHLIFKKEVTCINYMPIDLLTVETIDYYHSK >gi|222159226|gb|ACAB01000133.1| GENE 34 40672 - 42162 1510 496 aa, chain + ## HITS:1 COG:CAC0696 KEGG:ns NR:ns ## COG: CAC0696 COG2721 # Protein_GI_number: 15893984 # Func_class: G Carbohydrate transport and metabolism # Function: Altronate dehydratase # Organism: Clostridium acetobutylicum # 6 496 5 492 492 612 60.0 1e-175 METKYLRINPADNVAVAIVNLPAGEHLSVDGIEITLNEDIPAGHKFALKNFAEGENVIKY 
GYPIGHARMAKKQGDWMNETNIKTNLAGLLDYTYNPIQVSLDIPHKDLTFKGYRRKNGDV GVRNEIWIIPTVGCVNGIIGQLAEGLRRETEGKGVDAIVAFPHNYGCSQLGDDHENTKKI LRDMVLHPNAGAVLVVGLGCENNQPDVFREFLGEFDEDRVKFMVTQKVGDEYEEGMEILR DLYAKASKDERTDVPLSELRVGLKCGGSDGFSGITANPLLGMFSDFLIAQGGTSVLTEVP EMFGAETILMNRCKNEELFEQTVHLINDFKEYFLSHGEPVGENPSPGNKAGGISTLEEKA LGCTQKCGKSYVSGVMPYGERLQKKGLNLLSAPGNDLVAATTLAASGCHMVLFTTGRGTP FGTFVPTMKISTNSTLAKNKPGWIDFNAGVIVENEPMEKTCERFIDYIIKVASGEFVNNE KKGYREIAIFKTGVTL >gi|222159226|gb|ACAB01000133.1| GENE 35 42442 - 42993 342 183 aa, chain + ## HITS:1 COG:no KEGG:BT_1252 NR:ns ## KEGG: BT_1252 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 183 1 183 183 254 81.0 9e-67 MEKSSKFISLSGLAAIMAGIYALVGAYIATQVITPGTHLIVALELMAIIASLVLVAAAVT ACILSYYKSKKTGQKFFSRLTYRALWNFSLPMLTGGVLCISILMHEYYDILASVMLLFYG LALVNVSKFTYSSIIWLGYAFICLGVVDCFWEGHSLLFWTIGFGGFHILYGILFYLHYER KRS >gi|222159226|gb|ACAB01000133.1| GENE 36 42994 - 43290 243 98 aa, chain + ## HITS:1 COG:CC2206 KEGG:ns NR:ns ## COG: CC2206 COG1846 # Protein_GI_number: 16126445 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Caulobacter vibrioides # 8 96 10 98 103 81 46.0 3e-16 MIEAFQYINKAFESKVRLGIMAILMVNEEADFNFLKEQLSLTDGNLASHTRALEELGYIV CNKSFVGRKPRTVFQATPQGREAFKSHIEALEKFLKSK >gi|222159226|gb|ACAB01000133.1| GENE 37 43317 - 44669 1096 450 aa, chain + ## HITS:1 COG:PA0465 KEGG:ns NR:ns ## COG: PA0465 COG4452 # Protein_GI_number: 15595662 # Func_class: V Defense mechanisms # Function: Inner membrane protein involved in colicin E2 resistance # Organism: Pseudomonas aeruginosa # 47 448 27 438 452 224 34.0 3e-58 MDAFNENSNEQQEQQPMGCLHRFSKTIKVVIIGLLILLLMIPMFMIENLISERGRTQEEA IDEVSEKWSLAQTITGPYLNLQYPVVTENNGEKKVSIKDLILFPDELMVNGQLKTEILKR SIYEVNVYQSELTLKGSFSSEELKKSRIDMEQLQFDRAAICLNLTDMRGISEQISITLGD SVYIFEPGMDNRGIANTGVHAIANLSELKQNKKLPYEIKIKLKGSQSLNFIPLGKTTRVD LKANWNTPSFTGNYLPNNRNITEKEFSAQWQVLNLNRNYSQVMVDFNTSNIKDIESSSFG VNFKIPVEQYQQSMRSAKYAILIILLTFGVIFFTEIMNKTRIHALQYLLVGLALCLFYSL 
LLSFSEHVGFNPAYLLSSALTIILVGGYMFGITKKKKPSLIMSGLLGILYLYIFVLIQLE TFALLAGSLGLFIILAMVMYFSKKIDWFNE >gi|222159226|gb|ACAB01000133.1| GENE 38 45088 - 46368 515 426 aa, chain - ## HITS:1 COG:HP1142 KEGG:ns NR:ns ## COG: HP1142 COG4694 # Protein_GI_number: 15645756 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Helicobacter pylori 26695 # 14 426 350 759 759 145 27.0 2e-34 MIFTNIPIIHELEQTFALAFRAYKEAVNSNLIEIRKKYQNPSQIVELENTSTKLGVVNSI IQQANAIIVEFNKKVVGIDGELAIIKKKFWNRLRYDYDQSVTDYNNQKASTENKIRACET AKQHVQSLIDEQKAIIEAEQQSIVNIEESINHINTMLVDMGIIDFKIIKCREEGLYRIVR GEDRSSVFKTLSEGERTIISVLYFVETCQGILDRSKTQKKRIIVIDDPVSSLSTMYVFNI GRLLKNVFYPELIKDSTQETGFLMKRKFEQVFILTHSLYFFYEMTDMREPQRHAYQSLFR VSKSVAGSKIETMHYEHIQSDYHTYWMTIKDPDTHPALIANCMRNIIEYFFNFVEKRDLN NVFIQDKFKQPKYQAFQRYINRESHSLGQNINDFKDFDYNIFMDGLKMVFLEMGYSEHYK KMMKLK >gi|222159226|gb|ACAB01000133.1| GENE 39 46355 - 47293 467 312 aa, chain - ## HITS:1 COG:jhp1070 KEGG:ns NR:ns ## COG: jhp1070 COG4694 # Protein_GI_number: 15612135 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Helicobacter pylori J99 # 2 304 20 326 759 100 28.0 3e-21 MINKISIKGPASYKNMAVFETDKNINLIYGLNGSGKSTLSEFLRKRTDNEYAECSISPLL DEDTEEILVYNENYVNDVFYSSDTQKGIFSLSKENAGARKRIDAANAALQVANRDFQKQE LLQEKELEAWTSTKSIFANRFWQIKTQYTGGDRVLEYCFTGLKSSKELLLNHIVGLAKPS NKLVDSIDQLKEEIQRLNEAKGTQIPLIQEITFSAGDIEIDSLFKEVITGNANSRVAKLI DSLHNSDWVKVGLSFDTKDICPFCQRPYLDDDIIAELRSYFNEDYEKAVADIESKGKTYK DSIDLIPDIDFY >gi|222159226|gb|ACAB01000133.1| GENE 40 47290 - 47439 62 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713645|ref|ZP_04544126.1| ## NR: gi|237713645|ref|ZP_04544126.1| predicted protein [Bacteroides sp. 
D1] # 1 49 1 49 49 87 100.0 4e-16 MKHPNDALRIAIGIGGMRIVDKELDMMDGYEFMKVINGCSTDNDKISKV >gi|222159226|gb|ACAB01000133.1| GENE 41 47970 - 48494 485 174 aa, chain - ## HITS:1 COG:no KEGG:Slin_0877 NR:ns ## KEGG: Slin_0877 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 1 163 1 160 174 91 31.0 1e-17 MSKIDFLKEQIIEARTFVNRLMSELPEDLWYVIPEGTDSNFIWQVGHLLVSQNFHTLTTV TGVNEKVGRLVPIQKYNRIFNGMGTLHRSIEEGLIPVAELREQTEIIHQICIENLETLND DVLAECLQPFPFKHPVAETKYEALSWSFKHEMWHSAEMEAIKRELGYPIVWMEG >gi|222159226|gb|ACAB01000133.1| GENE 42 48556 - 49236 454 226 aa, chain - ## HITS:1 COG:no KEGG:Psta_4074 NR:ns ## KEGG: Psta_4074 # Name: not_defined # Def: RNA polymerase, sigma-24 subunit, ECF subfamily # Organism: P.staleyi # Pathway: not_defined # 5 180 29 211 293 82 28.0 1e-14 MEIDINRLVKEYGNMISMIAHRMILNKEIAREAAQEVWYELCKSITSFKGDSEISTWIYT IARRTIGRYAACEKQVKMSEIEYFRSLPEFEYSGGEEAKREWIKERCDWCITALNHCLNN DARLIFIFRENVGLPYRQIGEIMELKESNVRQIYNRSIQKITAFMNDTCPLYNPDGACKC RICKPVYSIDMDKEYTMVQRMMRLADLYRKFEKELPRKNYWEKFLQ >gi|222159226|gb|ACAB01000133.1| GENE 43 49246 - 49689 379 147 aa, chain - ## HITS:1 COG:CAC3445 KEGG:ns NR:ns ## COG: CAC3445 COG0454 # Protein_GI_number: 15896686 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 1 144 1 144 147 181 59.0 3e-46 MEIKLVTSDKKEFLDLLLLADEQESMIDRYLERGDMFVLYDDGLKALCVVTREGEGVYEI KNIATVPFFQRQGYGKRLIEFLFEYYQGKCSEMLVGTGDVPSSRSFYEHCGFAISHRIKN FFTDNYDHPMYEEGVQLVDMIYLKKTF >gi|222159226|gb|ACAB01000133.1| GENE 44 49719 - 50069 337 116 aa, chain - ## HITS:1 COG:AGc3635 KEGG:ns NR:ns ## COG: AGc3635 COG1733 # Protein_GI_number: 15889290 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 10 104 36 129 147 107 52.0 7e-24 MKNFHPTGNCPIRDVLSRLGDKWSMLVLITLNANGTMRFSDIHKTIEDVSQRMLTVTLRT 
LESDGLVERKVYAEVPPRVEYCLTDTGGTLIPHIEGLVGWALENMNIILDHRKMSR >gi|222159226|gb|ACAB01000133.1| GENE 45 50236 - 50763 492 175 aa, chain + ## HITS:1 COG:no KEGG:BT_1263 NR:ns ## KEGG: BT_1263 # Name: not_defined # Def: putative protease I # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 175 1 175 175 317 91.0 1e-85 MAKKVAVLAVNPVNGCGLFQYLEAFFENGISYKVFAVSDTKEIKTNSGINLTVDDVIANL KGHEDEFDALVFSCGDAVPVFQQNADKPYNVDLMQVIKTFGDKGKMMIGHCASAMMFDFT GITKGKKVAVHPLAKSAIQNGTATDEKSEIDGNFFTAQNENTIWTMMPQVTEALK >gi|222159226|gb|ACAB01000133.1| GENE 46 51262 - 52863 1071 533 aa, chain + ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 277 526 50 307 328 176 37.0 8e-44 MTREKENYNAFLLHFLERLNGCESIENKITESLTDICEYYGFKRGFIYQTDGFRYFYLKE TVGNSDNILRQRFEISEMTNQHAARINGKNKPFYACRNQNTSWNDVDIVDFYKVDSLLIR QIQDSEGKIIGFIGFGDREHAISFTDEELQVIHLILGSLSKEIAVREYKEREVRASKTLS SIMNNMGVDIYVNSFDSHDMLYVNESMAAPYGGIEHFEGKKCWQALYKDKTGECEFCPKK HLIDENGQPTKVYSWDYQRPFDKCWFRVFSAAFAWIDGQIAHVITSVDINHQKTIEEELR VAKEKAENLDRLKSAFLANMSHEIRTPLNAIVGFASLLVESDDKKERQDYVDIMQENTEL LLQLISDILDLSKIEAGTLDFTMDHLDIKSFCEDIMRNYDIKEDKPVPVLLAPDLPEYYI YTDKKRLMQVITNFINNALKFTNEGQIMLEYRHKAESNEIEFAVTDTGMGIAPDKVDQVF DRFVKLNSFSKGTGLGLSICRSIIEHLGGTIGAESEIGVGSRFWFTHPLRLNQ >gi|222159226|gb|ACAB01000133.1| GENE 47 53065 - 54366 1218 433 aa, chain + ## HITS:1 COG:SA2117 KEGG:ns NR:ns ## COG: SA2117 COG1757 # Protein_GI_number: 15927906 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Staphylococcus aureus N315 # 4 430 21 448 459 336 45.0 5e-92 MTEEKIHKDRPGNWWALSPLLVFLCLYLVTSILVNDFYKVPITVAFLVSSCYAIAITRGL KLDQRIYQFSVGAANKNILLMIWIFILAGAFAQSAKQMGAIDATVNLTLHILPDNLLLAG IFIAACFISLSIGTSVGTIVALTPVAVGLAEKTEIALPFMVAVVVGGSFFGDNLSFISDT TIASTKTQECVMRDKFRVNSMIVVPAAIIVLGIYIFQGLSITAPTQTQTIEWIKVIPYII VLGTAIAGMNVMLVLIIGILTSGIIGIATGSFGIFDWFGAMGTGITGMGELIIITLLAGG 
MLETIRYNGGIDFIIQKLTRHVNGKRGAELSIAALVSIANLCTANNTIAIITTGPIAKDI AVKFHLDRRKTASILDTFSCLIQGIIPYGAQMLIAAGLASISPISIIGNLYYPFTMGACA LLAILFRYPKRYS >gi|222159226|gb|ACAB01000133.1| GENE 48 54473 - 55027 914 184 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160885844|ref|ZP_02066847.1| hypothetical protein BACOVA_03848 [Bacteroides ovatus ATCC 8483] # 1 184 1 184 184 356 100 2e-97 MATRIRLQRHGRKSYAFYSIVIADSRAPRDGKFIEKIGTYNPNTNPATVDLNFDAALAWV LKGAQPSDTVRNILSREGVYMKKHLLGGVAKGAFGEAEAEAKFEAWKNNKQSGLATLKAK QDEAKKAEAKARLEAEKKINEVKAKALAEKKAAEAAEKAAAEAPAEEAAAAPAEEAPATE AAAE >gi|222159226|gb|ACAB01000133.1| GENE 49 55202 - 56197 739 331 aa, chain + ## HITS:1 COG:L0146 KEGG:ns NR:ns ## COG: L0146 COG1609 # Protein_GI_number: 15673482 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Lactococcus lactis # 19 313 8 318 345 60 22.0 4e-09 MKRTDKKATFGQQASKVTQLADALSQAISRKEFLEGDSLPSINQLSAQYGVSRDTVFKAF LDLRERGLIDSTPGKGYYVTSQVTNVLLLLDQYTPFKEALYNSFVRHLPINYKVDLLFHQ YNERLFNTILRESIGKYNKYIVMNFDNEKFSTVLNKINPSRLLLLDFGKFEKEKYSYICQ DFDDSFYQALNLLKERLRNYHQLVFLFPKSLKHPQSSKVYFTRFCQEQGFLCEIQEDIEN LIVRKGVAYIAIKQQDVVKVVKQGRLEGLKCGKDFGLLAYNDIPSYEVIDEGITSLSIDW EMMGNEAANFVLNNVPVQKFLPTEVRLRKSL >gi|222159226|gb|ACAB01000133.1| GENE 50 56265 - 58040 2045 591 aa, chain + ## HITS:1 COG:SP2158 KEGG:ns NR:ns ## COG: SP2158 COG2407 # Protein_GI_number: 15901968 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Streptococcus pneumoniae TIGR4 # 1 591 1 588 588 840 67.0 0 MKKYPKIGIRPTIDGRQGGVRESLEEKTMNLAKAVAELISSNLKNGDGSPVECVIADSTI GRVAESAACAEKFEREGVGSTITVTSCWCYGAETMDMNPHYPKAVWGFNGTERPGAVYLA AVLAGHAQKGLPAFGIYGRDVQDLDDNTIPEDVSEKILRFARAAQAVATMRGKSYLSMGS VSMGIAGSIVNPDFFQEYLGMRNESIDLTEIIRRMDEGIYDHEEYAKAMAWTEKNCKTNE GEDFKNRPEKRKTREQKDADWEFIVKMTIIMRDLMTGNPKLKEMGFKEEALGHNAIAAGF QGQRQWTDFYPNGDYPEALLNTSFDWNGIREAFVVATENDACNGVAMLFGHLLTNRAQIF SDVRTYWSPEAVKRVTGKELTGLAANGIIHLINSGATTLDGSGQSLDAAGSPVMKEPWNL 
TDADVENCLKATTWYPADRDYFRGGGFSSNFLSKGGMPVTMMRLNLIKGLGPVLQIAEGW TVEIDPEIHQKLNMRTDPTWPTTWFVPRLCEKPAFKDVYSVMNNWGANHGAISYGHIGQD LITLASMLRIPVCMHNVEDNEIFRPAAWNAFGMDKEGADYRACTTYGPIYK >gi|222159226|gb|ACAB01000133.1| GENE 51 58200 - 58838 565 212 aa, chain + ## HITS:1 COG:HI1012 KEGG:ns NR:ns ## COG: HI1012 COG0235 # Protein_GI_number: 16272947 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Haemophilus influenzae # 10 203 17 206 210 103 33.0 2e-22 MITNEHIEQFIEQAHRYGDAKLMLCSSGNLSWRIGEEALISGTGSWVPTLGKEKVSICHI ASGAPTNGVKPSMESTFHLGILRERPDVNVVLHFQSEYATAVSCMKNKPTNFNVTAEIPC HVGSEIPVIPYYRPGSPELAKAVVEAMLKHNSVLLTNHGQVVYGKDFDQVYERATFFEMA CRIIIQSGGDYSVLTPAEIEDLEIYVLGKKTN >gi|222159226|gb|ACAB01000133.1| GENE 52 58856 - 60286 1216 476 aa, chain + ## HITS:1 COG:BH1551 KEGG:ns NR:ns ## COG: BH1551 COG1070 # Protein_GI_number: 15614114 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Bacillus halodurans # 10 472 5 461 467 352 39.0 1e-96 MKDTIKHTYLAADFGGGSGRIIAGFLHHGKLELEEVYRFCNRQVKLGNHIYWDFPALFED MKTGLKLAAQKGYAVKSIGIDTWGVDFGLIDKHGNLLGNPVCYRDARTEGIPEEVFKLLD ERQHYADTGIQVMAINTLFQLYSMKQHQDAQLEVARQLLFMPDLFSYFLTGVANNEYCIA STSELLDAQSRNWSVDTIRALGLPEHLFGEIILPGTIRGTLKEDIARETGLGIVDVIAVG SHDTASAVAAVPAVENPIAFLSSGTWSLLGVEVDEPILTEEARKAQFTNEGGVDGKIRFL QNITGLWILQRLMSEWKACGEEQNYDIIIPQAAEAQIATIIPVDDATFMNPENMENALIH YCRHHALQVPKNKAETVRCVLQSLAFKYRQAVEQLNRCLPSPIRQLNIIGGGSQNQLLNQ LTADELGIPVYAGPVEATAMGNILTQAMAKGEIADLRELREIVTRSVTPQVYYPKK >gi|222159226|gb|ACAB01000133.1| GENE 53 60293 - 60685 381 130 aa, chain + ## HITS:1 COG:no KEGG:BT_1276 NR:ns ## KEGG: BT_1276 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 130 1 130 130 235 92.0 4e-61 METPKTGYQVQSYDVPVKRYCQTLDLRDSPELIAEYRKRHSQAEAWPEILAGIREVGILE MEIYILGTRLFMIVETPLDFDWDTAMERLNTLPRQQEWEEYMAIFQQAAPGMSSAEKWKP MERMFHLYNT 
>gi|222159226|gb|ACAB01000133.1| GENE 54 60704 - 62014 999 436 aa, chain + ## HITS:1 COG:HI0610 KEGG:ns NR:ns ## COG: HI0610 COG0738 # Protein_GI_number: 16272552 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Haemophilus influenzae # 16 433 10 425 428 392 51.0 1e-109 MKHPKQSILSKDGISYLIPFVLITSCFALWGFANDITNPMVKAFSKIFRMSATDGALVQV AFYGGYFAMAFPAAIFIRKYSYKAGVLLGLGMYAFGAFLFFPAKMTGEYYPFLIAYFILT CGLSFLETSCNPYILSMGTEETATRRLNLTQSFNPMGSLLGMYIAMQFIQAKLHPMCTED RALLNDSEFQAIKESDLSVLIAPYLIIGLIIVAMLLLIRFVKMPKNGDQNHRIDFFPTLK RIFTQTRYREGVIAQFFYVGAQIMCWTFIIQYGTRLFMSQGMDEKSAEVLSQQYNIIAMV IFCISRFICTFILRYLNAGKLLMILAIFGGIFTLGVIFLQNIFGMYCLVAVSACMSLMFP TIYGIALKGMGDDAKFGAAGLIMAILGGSVLPPLQASIIDLKQIAWLPAVNVSFILPFIC FLVIIGYGYRTVKRNW >gi|222159226|gb|ACAB01000133.1| GENE 55 62576 - 62743 257 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713661|ref|ZP_04544142.1| ## NR: gi|237713661|ref|ZP_04544142.1| predicted protein [Bacteroides sp. D1] # 1 55 41 95 95 99 100.0 7e-20 MYDVDPQTTGLKAYWKFNEGKGDIAKDYTENGNDAKAYTKAIWPEDIEVTQKNKE >gi|222159226|gb|ACAB01000133.1| GENE 56 62988 - 63860 856 290 aa, chain + ## HITS:1 COG:no KEGG:BF3047 NR:ns ## KEGG: BF3047 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 290 2 303 303 417 73.0 1e-115 MEQLDKYIRFDWAVKRLLRQKANFGVLEGFLTVLLGENVKIIEILESESNQQTIDDKFNR VDIKARNSKDEIIIVEIQNTRELYYLERILYGVAKAITEHISLGERYYAVKKIYSISILY FDIGKGEDYLYHGQNHFIGVHTGDRLEVTTKEKDAIVHKIPAEIFPEYFLIRVNEFDKVA VTPLEEWIEYLKTGCIRPDTTAPGLEEARQKLIYYNMSKEERHAYDEHLSAIMIQNDVLD GAKLEGRMEGRMEGKMEERLEIANNLKSLGVDITAIHKATGLSPEEIEKL >gi|222159226|gb|ACAB01000133.1| GENE 57 64525 - 65037 270 170 aa, chain + ## HITS:1 COG:BH2229 KEGG:ns NR:ns ## COG: BH2229 COG2207 # Protein_GI_number: 15614792 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 42 164 148 272 287 59 34.0 3e-09 MNQDYKRNASLADTFLYMPFNIENLKIEISVLIKNYRFLRKSFLQKIFGEKFLETDIEEM 
PEEDNYQLINKVKKFILDNLDNESLKISHIASETGMSRTKFFIKWKFLTGEAPIEFMTRI RMEKARELLESGKYRIQEIPEMVGLKDVKNFRNKYKKHFGIAPSETLRNI >gi|222159226|gb|ACAB01000133.1| GENE 58 65593 - 67284 664 563 aa, chain + ## HITS:1 COG:BH1123 KEGG:ns NR:ns ## COG: BH1123 COG4753 # Protein_GI_number: 15613686 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 460 562 425 526 526 67 35.0 7e-11 MKYLLQTSPLVYSAIAILILYQLNFTSILLAGGISIYWIITHIITIQKERNLLKEGSQYF IDSFESIRAPLTLVHTPLKIAYNDVSPENIRKELSLAIQNIDCLNIHLTRLMDLKQLFAN TEKLNSSEYELGAFIKNRITSLKTHAANNLVGLNIKTDFNYASVWFDQSKISPVIDKFIK YAIDHTEQKKTITLLITSNPEYWQIKITDFENKKILQCYKCKNWQLFKRKEELEYNFAKN IFCKRLIKMCNGKILINHSSHTIALKFPSKDSPENVHQHTAIQIESNPAEERTDTLFGKD SQKRNSTKPLVVLADSNKEFRLYLEERLSKDFIVKSFENGLDALECIKEEYPDLVICDIM LHGMYGNELSSRLKTSGETSVIPIILYGSHIDTGQRSKRESSLADIFLYIPFHVEDLKTE MNVLIRNNRFLRKSFLERIFGKKFLEVEEDKISDEENYGLINQVKEFILKNIDKENLTID EIASELYMSRTAFFTKWKALTGEAPKYLIYRIRMEKARELLESGKYSVNVLPEMIGLKSL KNFRHKYKEYFGITPSESIMKKQ >gi|222159226|gb|ACAB01000133.1| GENE 59 67460 - 68113 638 217 aa, chain + ## HITS:1 COG:no KEGG:BT_1287 NR:ns ## KEGG: BT_1287 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 217 1 212 212 196 52.0 5e-49 MKTLTKMKKVCCFNRFVLLAMTLSMVTSFTACSSDDDSDIPVYSLKDVDGNYSGTMLTES TPVSPQANAEEGEEPAGAEVTAEVKDNQIMIKKLPVDDLIKSIVGEENAGAIIENLGDIN YNIPYTPTFDEGKSNILLQLKPEPLEIKYTEPIQVQEEGQEPAENLITVTIEAEKNGVYA YDGKKLAFVIKATGVKVGDVNFPNFPVTTFTFSMVKK >gi|222159226|gb|ACAB01000133.1| GENE 60 68208 - 68900 557 230 aa, chain + ## HITS:1 COG:FN0805 KEGG:ns NR:ns ## COG: FN0805 COG4912 # Protein_GI_number: 19704140 # Func_class: L Replication, recombination and repair # Function: Predicted DNA alkylation repair enzyme # Organism: Fusobacterium nucleatum # 4 230 14 251 251 147 37.0 2e-35 MTTTENIRKELQDLVDSKYQEFHSALVPGIGNILGVRIPQLRILAKEIAKRDNWRMFVEA 
TDTKFYEETMLQGMVIGLAKMELDERMKHISMFVPRIDNWAVCDIFCGELKTAVKKGKET VWQFIQPYLISPKEFEIRFGIVMLFHYVDEEHIDALLAYADTFSHDAYYARMAMAWMISI CFIKFPEKTMEYLKQSKLDNWTYNKSLQKTIESLRVDKETKDILRGIKRR >gi|222159226|gb|ACAB01000133.1| GENE 61 68993 - 70324 1109 443 aa, chain - ## HITS:1 COG:lin0800 KEGG:ns NR:ns ## COG: lin0800 COG0687 # Protein_GI_number: 16799874 # Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Listeria innocua # 4 346 13 325 357 183 32.0 6e-46 MKRIIPAILLLLSLTGCYNSGEPRERVLKIYNWADYIGDGVLEDFQAYYKEQTGENIRIV YQTFDINEIMLTKIEKGHEDFDVVCPSEYIIERMLKKRLLLPIDTNFAHSPNYMKNVAPF IREQINKLSQPGEEASRYAVCYMWGTAGLLYNRAYVPDSVAASWDCLWNKKYAGKILMKD SYRDAYGTAIIYAHAKELEAGSVTVEELMNDYSPQAMELAEKYLKALKPNIAGWEADFGK EMMTKNKAWLNMTWSGDAIWAIEEADAVGVDLDYEVPKEGSNIWYDGWVIPKYARNPEAA SYFINFMCRPDIALRNMDFCGYVSSIATPEILEEKIDTTLTYFSDLSYFFGPGADSIQID KIQYPDRKVVERCAMIRDFGDKTKEVLDIWSRIKGDNLGVGITILIFVVVALMSGWMIYK RWQRYSRRKQQRRRSRRKKVKRN >gi|222159226|gb|ACAB01000133.1| GENE 62 70414 - 71205 692 263 aa, chain - ## HITS:1 COG:CAC0838 KEGG:ns NR:ns ## COG: CAC0838 COG1177 # Protein_GI_number: 15894125 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component II # Organism: Clostridium acetobutylicum # 1 257 1 252 260 180 42.0 2e-45 MVKKIFAQAYLWILLLLLYSPIVIIVIYSFTEAKVLGNWTGFSTKLYTSLFTTGTHHSLM NALINTITIALLAAAASTLLGSVAAIGIFNLKSRSRKAISFVNSIPILNGDIITGISLFL LFVSLGITQGYTTVVLAHITFCTPYVVLSVLPRLKQMNPNIYEAALDLGATPMQALWKVI VPEIRPGMISGFMLALTLSIDDFAVTVFTIGNQGLETLSTYIYADARKGGLTPELRPLSA IIFVVVLALLIVINHRAGKAKKK >gi|222159226|gb|ACAB01000133.1| GENE 63 71199 - 71999 596 266 aa, chain - ## HITS:1 COG:CAC0839 KEGG:ns NR:ns ## COG: CAC0839 COG1176 # Protein_GI_number: 15894126 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Clostridium acetobutylicum # 18 266 12 277 277 166 38.0 3e-41 
MNKKFIVFLSSRKSWTLPYIIFSAIFVIIPLFLIVVYAFTDDSGHLTLANFQKFFEHPEA INTFVYSIGIAIITTLVCILLGYPAAWILSNSKLNRSKTMVVLFILPMWVNILVRTLATV ALFDFFSVPLGEGALIFGMVYNFIPFMIYPIYNTLQKMDHSYIEAAQDLGANPLQVFLKA VLPLSMPGVMSGIMMVFMPTISTFAIAELLTMNNIKLFGTTIQENINNSMWNYGAALSLI MLLLIAATSLFSTDDKDNSNEGGGLW >gi|222159226|gb|ACAB01000133.1| GENE 64 72011 - 73396 1486 461 aa, chain - ## HITS:1 COG:BB0642 KEGG:ns NR:ns ## COG: BB0642 COG3842 # Protein_GI_number: 15594987 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Borrelia burgdorferi # 4 348 2 347 347 399 56.0 1e-111 MQEDKSIIEVSHVSKYFGDKTALDDVTLNVKKGEFVTILGPSGCGKTTLLRLIAGFQTAS EGEIRISGMEITQTPPHKRPVNTVFQKYALFPHLNVYDNIAFGLKLKKTPKQTIEKKVKA ALKMVGMTDYEYRDVDSLSGGQQQRVAIARAIVNEPEVLLLDEPLAALDLKMRKDMQMEL KEMHKSLGITFVYVTHDQEEALTLSDTIVVMSEGKIQQIGTPIDIYNEPINAFVADFIGE SNILNGTMIHDKLVRFCGTEFECVDEGFGENVPVDVVIRPEDLYIFPVSDMAQLVGVVET SIFKGVHYEMTVMCGGYEFLVQDYHHFEVGAEVGLLVKPFDIHIMKKERICNTFEGKLLD ATHVEFLGCNFECTPVEDIAADTNVKVEVDFEKVILQDNEEDGTLTGEVKFILYKGDHYH LTVLSDWDENVFVDTNDVWDDGDRVGITIPPDAIRIIKITD >gi|222159226|gb|ACAB01000133.1| GENE 65 73541 - 74644 1243 367 aa, chain + ## HITS:1 COG:BS_resA KEGG:ns NR:ns ## COG: BS_resA COG0526 # Protein_GI_number: 16079372 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus subtilis # 205 349 16 155 181 75 26.0 2e-13 MKKLTYLVIATAALGMVACTGGNKAGYVVTGTVEGASDGDTVYLQEATGRNLTKLDTAVI TKGTFTFEGTQDSVVSRYVTCEVNGEPLMIDFFLENGKINVALTKDNDAVTGTPNNDAYQ EIRAQINDISKKMNTIYEAMGDTSLSDEQKEAKQKEGAQLEEQYDKVIKEGVKKNIANPV GVFLFKQTFYNNSTDENEALLQQIPANFQNDETIVRIKEMTDKQKKTAVGTQFVDFEMQT PEGKTVKLSDYVGKGKVVLVDFWASWCGPCRREMPNLVETYAKYKGKNFEIVGVSLDQDG AAWKEAIKKMNMTWPQMSDLKFWQSEGAQLYAVNSIPHTVLIDGSGKIIARGLHGEELQA KIAEAVK >gi|222159226|gb|ACAB01000133.1| GENE 66 74923 - 77487 2581 854 aa, chain - ## HITS:1 COG:PH1512 KEGG:ns NR:ns ## COG: 
PH1512 COG0058 # Protein_GI_number: 14591294 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Pyrococcus horikoshii # 21 746 17 734 837 615 44.0 1e-175 MKIKVSNVNTPNWKEVTVKSRIPEELEKLSEIARNIWWAWNFEATELFRDLDPELWKECG QNPVLLLERMSYEKLEALAKDKVILRRMNEVYTKFRDYMDVKPDEQRPSIAYFSMEYGLS SVLKIYSGGLGVLAGDYLKEASDSNVDLCAVGFLYRYGYFTQTLSMDGQQIANYEAQNFG QLPIERVMDANGQPLIVDVPYLDYFVHANVWRVNVGRISLYLLDTDNEMNSEFDRPITHQ LYGGDWENRLKQEILLGIGGILTLKALGIKKDVYHCNEGHAALINVQRICDYVATGLTFD QAIELVRASSLYTVHTPVPAGHDYFDEGLFGKYMGGYPSRMGITWDDLMDLGRNNPGDKG ERFCMSVFACNTSQEVNGVSWLHGKVSQEMFSSIWKGYFPEESHVGYVTNGVHFPTWSAT EWKELYFKYFNENFWFDQSNPKIWEAIYNVPDEEIWKTRMTMKNKLVDYIRKSFRDTWLK NQGDPSRIVSLMDKINPNALLIGFGRRFATYKRAHLLFTDLDRLSKIVNNPDYPVQFLFT GKAHPHDGAGQGLIKRIIEISRRPEFLGKIIFLENYDMQLARRLVSGVDIWLNTPTRPLE ASGTSGEKALMNGVVNFSVLDGWWLEGYREGAGWALTEKRTYQNQEHQDQLDAATIYSIL ETEILPLYYARNKKGYSEGWIKVVKNSIAQIAPHYTMKRQLDDYYSKFYCKLAKRFQTLA ANDNAKAKEIAAWKEEVVAKWDAIEIVSCDKVEDLKNGDIESGKEYTITYVIDEKGLNDA VGLELVTTYTTADGKQHVYSVEPFSVIKKEGNLYTFQVKHSLSNAGSFKVSYRMFPKNPE LPHRQDFCYVRWFI >gi|222159226|gb|ACAB01000133.1| GENE 67 77524 - 79185 1586 553 aa, chain - ## HITS:1 COG:YLR258w KEGG:ns NR:ns ## COG: YLR258w COG0438 # Protein_GI_number: 6323287 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Saccharomyces cerevisiae # 11 552 10 622 705 211 28.0 4e-54 MVKDLLTPDYIFESSWEVCNKVGGIYTVLSTRANTLQEKFRDRIFFIGPDVWQGKENPLF IESDNLCAAWKKHALEKDELSVRIGRWNIPGEPIVILVDFQPFFEKKDDIYTEMWNRYQV DSLHAYGDYDEASMFSYAAGRVVESFYRYNLTETDKVVYQAHEWMTGMGALYVQEAVPEV ATIFTTHATSIGRSIAGNHKPLYDYLFAYNGDQMAQELNMQSKHSIEKQTAHYVDCFTTV SEITNNECKELLDKPADVVLMNGFEDDFVPKGSTFTGKRKRARALMLNVANKLLGTNLGD DTLIVGTSGRYEFKNKGIDVFLESLNRLNRDKNLHKNVLAFINVPGWVGDPREDLQGRLK SKEKFDTPLEVPFITHWLHNMTHDQVLDMLKYLGMGNRPEDKVKVIFVPCYLNGRDGIMN KEYYDILLGQDLSVYASYYEPWGYTPLESVAFHVPTITTDLAGFGLWVNSLKNQHGINDG VEVLHRSDYNYSEVADGIKDTITLFADKTEKEVKEIRKRAAEVAEQALWKHFIQYYYEAY DIALRNAMKRQLS >gi|222159226|gb|ACAB01000133.1| GENE 68 
79388 - 79852 714 154 aa, chain - ## HITS:1 COG:SPy0149 KEGG:ns NR:ns ## COG: SPy0149 COG0636 # Protein_GI_number: 15674359 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K # Organism: Streptococcus pyogenes M1 GAS # 5 148 14 154 159 75 38.0 3e-14 MEMNLFIAYIGIAIMVGLSGIGSAYGVTIAGNAAIGALKKNDSAFGNFLVLTALPGTQGL YGFAGYFMFQTIFGILTPEITAIQASAVLGAGIALGLVALFSAIRQGQVCANGIAAIGQG HNVFSNTLILAVFPELYAIVALAATFLIGSALVA >gi|222159226|gb|ACAB01000133.1| GENE 69 79928 - 81745 1567 605 aa, chain - ## HITS:1 COG:BB0091 KEGG:ns NR:ns ## COG: BB0091 COG1269 # Protein_GI_number: 15594437 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit I # Organism: Borrelia burgdorferi # 1 604 1 604 608 181 25.0 4e-45 MITKMKKLTFLVYHKEYEEFLNSLRELGVVHIVEKQQGAADNTELQDNIRLSNRLAATLK LLQNQKHEKNAVIATEGGTAARGMQVLDEVDALQTEHGKLSQQLQSYAKEKEALEAWGNF EPANVQKLKDAGYVIGFYSCSEGNYKEEWETEYNAMIVNRISSKVFFVTVTKGGEEVDLD VEQAKLPAYSLVHLETLYNTTEQAVEENEKRLVALSETDIPSLKAALKELQSQIEFSKVV LSSEQTAGDKLMLIEGWAPAFSQVEIEAYLNDAHVYYEITDPMPGDNVPIRLNNKGFFAW FEPICKLYMLPKYNELDLTPFFAPFFMVFFGLCLGDSGYGLFLFLGATAYRLMAKKVTPS MKSIISLIQVLAVSTFFCGLLTGTFFGANIYNLDWPIVQRLKHAVMMDNNDMFQLSLILG AIQILFGMVLKAVNQTIQFGFKYAVATIGWIILLVSMAVSALLPDVMPMGSTVHLIILGV SAAMIFLYNSPGKNIFLNIGLGLWDSYNMVTGLLGDVLSYVRLFALGLSGGILAGVFNSL AVGMSPDNVIAGPIVMVLIFVIGHAINIFMNVLGAMVHPMRLTFVEFFKNSGYEGGGKEY KPFRN >gi|222159226|gb|ACAB01000133.1| GENE 70 81742 - 82347 577 201 aa, chain - ## HITS:1 COG:TP0428 KEGG:ns NR:ns ## COG: TP0428 COG1394 # Protein_GI_number: 15639419 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit D # Organism: Treponema pallidum # 1 179 1 180 206 88 31.0 9e-18 MAIKFQYNKTSLQQLEKQLKVRVRTLPIIKNKESALRMEVKRCKTEAADLEDRLEQQIQA YEAMFALWNEFDASLIKVNDVHLGVKKIAGVRVPLLENVEFEIRPYSMFNAPKWYADGIH LLEELAHTAIEREFMLAKLNLLEHARKKTTQKVNLFEKVQIPGYQDALRKIKRFMEDEEN LSKSSQKIMKSHQEKRKEVEA 
>gi|222159226|gb|ACAB01000133.1| GENE 71 82456 - 83784 1585 442 aa, chain - ## HITS:1 COG:TP0427 KEGG:ns NR:ns ## COG: TP0427 COG1156 # Protein_GI_number: 15639418 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit B # Organism: Treponema pallidum # 8 438 3 429 430 399 47.0 1e-111 MATKAFQKIYTKITQITKATCSLKATGVGYDELATVDGKLAQVVKIAGDDVTLQVFEGTE GIPTNAEVVFLGKSPTLKVSEQLAGRFFNAFGDPIDGGPDIEGQEVEIGGPSVNPVRRKQ PSELIATGIAGIDLNNTLVSGQKIPFFADPDQPFNQVMANVALRAETDKIILGGMGMTND DYLYFKNVFSNAGALDRIVSFMNTTENPPVERLLIPDMALTAAEYFAVNNNEKVLVLLTD MTSYADALAIVSNRMDQIPSKDSMPGSLYSDLAKIYEKAVQFPSGGSITIIAVTTLSGGD ITHAVPDNTGYITEGQLFLRRDSDIGKVIVDPFRSLSRLKQLVTGKKTRKDHPQVMNAAV RLYADAANAKTKMENGFDLTNYDERTLAFAKDYSNQLLAIDVNLDTTEMLDVAWGLFGKY FRPEEVNIKKDLVDQYWPKQNN >gi|222159226|gb|ACAB01000133.1| GENE 72 83816 - 85576 1935 586 aa, chain - ## HITS:1 COG:TP0426 KEGG:ns NR:ns ## COG: TP0426 COG1155 # Protein_GI_number: 15639417 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit A # Organism: Treponema pallidum # 3 580 4 571 589 526 45.0 1e-149 MATKGTVSGVIANMVTLVVDGPVAQNEICYISTGGDKLMAEVIKVVGSHVYVQVFESTRG LKVGAEAEFTGHMLEVTLGPGMLSKNYDGLQNDLDKMDGVFLKRGQYTYPLDKERIWHFV PLVNVGDKVQASAWLGQVDENFQPLKIMAPFTMKGTATVKTIMPEGDYKIEDTIAILTDE EGNDIPVTMIQRWPVKRAMTNYKEKPRPFKLLETGVRVIDTLNPIVEGGTGFIPGPFGTG KTVLQHAISKQAEADIVIIAACGERANEVVEIFTEFPELVDPHTGRKLMERTIIIANTSN MPVAAREASVYTAMTLAEYYRSMGLKVLLMADSTSRWAQALREMSNRMEELPGPDAFPMD ISAIISNFYGRAGYVKLSNDETGSITFIGTVSPAGGNLKEPVTENTKKVARCFYALEQDR ADKKRYPAVNPIDSYSKYIEYPEFEEYIKGHINDEWIGKVNELKTRLQRGKEIAEQINIL GDDGVPVEYHVIFWKSELIDFVILQQDAFDEIDAVTPMERQEDILNMIIDICHTEFEFDN FNEVMDYFKKMINICKQMNYSKFKSEQYEGFQKQLKELIEERKLQS >gi|222159226|gb|ACAB01000133.1| GENE 73 85594 - 86436 810 280 aa, chain - ## HITS:1 COG:no KEGG:BT_1300 NR:ns ## KEGG: BT_1300 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 280 1 280 280 497 93.0 1e-139 
MSSKYYYLVAGLPELSLEDSKLSYTVADFKTEIYNGLSASDRKLIDLFYLKFDNANVLKL LKDKEAEIDKRGNYSAEELTEYISVLREGGEISPKEFPVYLSTFITDYLNTPAESTTLHE DHLAALYYEYAMKCGNKFVSAWFEFNLNINNILVAFTSRKFKWDIASNVVGNTEVCEALR TSSARDFGLSGEVDVFESLVKISEITELVEREKKLDALRWNWMEDAIFFDYFTIERIFAF LLKLEMIERWISLDKERGNQLFRSIIESLKNEVQIPAEFR >gi|222159226|gb|ACAB01000133.1| GENE 74 86448 - 87038 632 196 aa, chain - ## HITS:1 COG:TP0424 KEGG:ns NR:ns ## COG: TP0424 COG1390 # Protein_GI_number: 15639415 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit E # Organism: Treponema pallidum # 19 195 4 171 232 58 25.0 5e-09 MENKIQELTDKIYREGVEKGNEEAQRLIANAQEEAKKIIEDARKEAESIVNSSRKSADEL AENTKSELKLFAGQAVNALKSEVATMVTDKLITASVKDFAQDKDYLNAFIVALASKWSID EPIVISTADAESLKKYFAAHAKALLDKGVTIQQVNGIKTLFTVSPADGSYKVNFGEEEFM NYFKAFLRPQLVEMLF Prediction of potential genes in microbial genomes Time: Wed May 18 04:10:02 2011 Seq name: gi|222159225|gb|ACAB01000134.1| Bacteroides sp. D1 cont1.134, whole genome shotgun sequence Length of sequence - 38898 bp Number of predicted genes - 39, with homology - 38 Number of transcription units - 18, operones - 11 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 2042 1089 ## BF2797 hypothetical protein - Prom 2068 - 2127 2.2 - Term 2049 - 2095 9.0 2 2 Op 1 . - CDS 2141 - 2437 302 ## BF2796 hypothetical protein 3 2 Op 2 . - CDS 2450 - 2812 503 ## BF2795 conjugate transposon protein TraE 4 2 Op 3 . - CDS 2864 - 3196 186 ## BF2794 hypothetical protein 5 2 Op 4 . - CDS 3193 - 3645 562 ## BF2793 hypothetical protein - Prom 3665 - 3724 6.2 + Prom 3965 - 4024 2.6 6 3 Tu 1 . + CDS 4095 - 5663 1009 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen + Term 5703 - 5758 5.7 - Term 5693 - 5743 10.5 7 4 Op 1 . - CDS 5775 - 6662 612 ## BF2792 DNA primase 8 4 Op 2 . - CDS 6709 - 7776 453 ## BF2791 hypothetical protein 9 4 Op 3 . 
- CDS 7776 - 8222 354 ## gi|237713430|ref|ZP_04543911.1| predicted protein 10 4 Op 4 . - CDS 8219 - 8614 327 ## BF2790 putative excisionase - Prom 8684 - 8743 1.9 - Term 8628 - 8656 -1.0 11 5 Tu 1 . - CDS 8773 - 9549 420 ## gi|237722685|ref|ZP_04553166.1| predicted protein 12 6 Op 1 . - CDS 9607 - 11115 658 ## gi|237713433|ref|ZP_04543914.1| predicted protein 13 6 Op 2 . - CDS 11127 - 12287 954 ## COG4973 Site-specific recombinase XerC - Prom 12407 - 12466 8.6 + Prom 12200 - 12259 2.7 14 7 Tu 1 . + CDS 12398 - 12646 131 ## 15 8 Tu 1 . - CDS 12515 - 13912 1269 ## COG0486 Predicted GTPase - Prom 14029 - 14088 5.6 + Prom 13948 - 14007 4.0 16 9 Op 1 . + CDS 14048 - 14455 328 ## COG0784 FOG: CheY-like receiver + Term 14495 - 14531 2.1 + Prom 14458 - 14517 5.0 17 9 Op 2 . + CDS 14538 - 15197 809 ## COG2910 Putative NADH-flavin reductase + Term 15278 - 15314 0.5 - Term 15352 - 15401 17.0 18 10 Op 1 . - CDS 15483 - 16361 971 ## COG2820 Uridine phosphorylase - Prom 16449 - 16508 1.9 - Term 16397 - 16429 1.2 19 10 Op 2 . - CDS 16520 - 17521 1045 ## COG4864 Uncharacterized protein conserved in bacteria 20 10 Op 3 . - CDS 17544 - 18014 626 ## BT_4556 hypothetical protein 21 10 Op 4 . - CDS 18032 - 19453 1369 ## BT_4557 hypothetical protein - Prom 19563 - 19622 2.5 + Prom 19545 - 19604 1.5 22 11 Op 1 . + CDS 19637 - 20383 566 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control 23 11 Op 2 . + CDS 20399 - 20770 266 ## COG3169 Uncharacterized protein conserved in bacteria + Term 20787 - 20831 7.6 24 12 Op 1 . - CDS 20841 - 21854 895 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis 25 12 Op 2 . - CDS 21885 - 22466 518 ## COG1971 Predicted membrane protein 26 12 Op 3 . - CDS 22473 - 23276 644 ## BT_4562 hypothetical protein 27 12 Op 4 . - CDS 23206 - 23790 382 ## BT_4563 hypothetical protein 28 12 Op 5 . - CDS 23812 - 24762 850 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 29 12 Op 6 . 
- CDS 24795 - 25310 356 ## BT_4565 hypothetical protein - Prom 25372 - 25431 2.2 + TRNA 25661 - 25737 73.6 # Thr TGT 0 0 30 13 Tu 1 . - CDS 25904 - 26377 276 ## gi|237713452|ref|ZP_04543933.1| conserved hypothetical protein - Prom 26608 - 26667 3.9 - Term 26714 - 26753 5.0 31 14 Tu 1 . - CDS 26782 - 28062 1557 ## COG0148 Enolase - Prom 28119 - 28178 2.2 32 15 Op 1 . - CDS 28242 - 28904 413 ## COG1357 Uncharacterized low-complexity proteins 33 15 Op 2 . - CDS 28935 - 29312 333 ## COG0239 Integral membrane protein possibly involved in chromosome condensation - Prom 29399 - 29458 5.0 + Prom 29458 - 29517 7.8 34 16 Op 1 . + CDS 29664 - 31823 1964 ## BT_4581 alpha-glucosidase + Term 31878 - 31927 8.0 + Prom 31892 - 31951 6.0 35 16 Op 2 . + CDS 31988 - 33397 1399 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase + Term 33448 - 33491 12.1 + Prom 33468 - 33527 9.7 36 17 Op 1 . + CDS 33547 - 34770 1376 ## COG2195 Di- and tripeptidases 37 17 Op 2 . + CDS 34837 - 35922 1103 ## COG0404 Glycine cleavage system T protein (aminomethyltransferase) + Term 35945 - 35993 12.2 + Prom 35971 - 36030 3.0 38 18 Op 1 . + CDS 36073 - 38346 1635 ## COG0475 Kef-type K+ transport systems, membrane components 39 18 Op 2 . 
+ CDS 38396 - 38821 530 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases + Term 38840 - 38868 0.7 Predicted protein(s) >gi|222159225|gb|ACAB01000134.1| GENE 1 2 - 2042 1089 680 aa, chain - ## HITS:1 COG:no KEGG:BF2797 NR:ns ## KEGG: BF2797 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 680 1 680 903 1153 81.0 0 MIIYVFIVFIALCVGMAVSVYAFGTGGKRKRIFQDIYFSVEEADGIGVIYTKTGEYSAVL KLENPVQKYCANIDSYYEFTQLFTAIAQTLGEGYAIHKQDIFIRKNFTEESDENREFLST SYFRYFNGRPYTDSECYLTITQEVKKSRLFSFDGKKWRDFLVKIRKVHDQLRDAGVQVRF LNRQEVNEYVDRYFAMNFRDKVVSMTNFKVNDECISMGDKRCKVYSLVDVDNVSLPSVIH PYTNVEVNNSSMPMDLVSVVSGIPNTEAVVYNQMIFIPNQKRELALLDKKKNRHASMPNP GNQMAVEDIKRVQEVVARESKQLVYTHYNLVVAMSADTDLHKCTNHLENQFSRMGIHISK RAYNQLELFVNSFPGNCYGMNPDYDRFLTLGDAAACLMYKERILHSEKTPLKIYYTDRQG VPVAIDITGKEGAEKLTDNSNFFCLGPSGSGKSFHMNSVVRQLHEQGTDVVMVDTGNSYE GLCEYLGGKYISYTEECPITMNPFRINRQELNVEKTGFLKNLVLLIWKGSQGTVTKTEDR LIEQVITEYYDTYFNGFDGFTPPQREDLRKSLLIDERNKSGNRAESETELNARIETVIDE IERRRKELKVESLSFNTFYEFSVQRIPDICNENSITGIDISTYRYMMKDFYRGGNHDKTL NENMDSSLFDETFIVFEIDS >gi|222159225|gb|ACAB01000134.1| GENE 2 2141 - 2437 302 98 aa, chain - ## HITS:1 COG:no KEGG:BF2796 NR:ns ## KEGG: BF2796 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 8 98 7 97 97 124 68.0 8e-28 MNGKEEHYPDYPLFKGLQRPLEFMGIQGRYIYWAAATIGIAIVGFIIAYCMAGFVLALIV LSVTVTSGVGLILFKQRKGLHSKRLERGVFIYAYSKKM >gi|222159225|gb|ACAB01000134.1| GENE 3 2450 - 2812 503 120 aa, chain - ## HITS:1 COG:no KEGG:BF2795 NR:ns ## KEGG: BF2795 # Name: not_defined # Def: conjugate transposon protein TraE # Organism: B.fragilis # Pathway: not_defined # 1 120 1 124 125 132 68.0 5e-30 MFEKVKRFAKKIATSKKTQMTCLMVMCGVTALMAQTTAGDYSAGTAALTQVSEEIAKYVP IVVKLCYAIAGVVAVVGAISVYIAMNNEEQDVKKKIMMVVGACIFLIAAAQALPLFFGLS >gi|222159225|gb|ACAB01000134.1| GENE 4 2864 - 3196 186 110 aa, chain - ## HITS:1 COG:no KEGG:BF2794 NR:ns ## KEGG: BF2794 # Name: not_defined # Def: hypothetical protein # 
Organism: B.fragilis # Pathway: not_defined # 6 110 26 129 129 137 75.0 1e-31 MTRHNGKRILSALCTLLPCSVFAKSGGVNYSWGADALASMHDYVVTMMLYVQYIIYAVAG GFAVIAAFQIYIKMNTGEDGITKHILTLVGACLFIIGATIVFPAFFGYRI >gi|222159225|gb|ACAB01000134.1| GENE 5 3193 - 3645 562 150 aa, chain - ## HITS:1 COG:no KEGG:BF2793 NR:ns ## KEGG: BF2793 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 131 51 183 200 94 43.0 1e-18 MKAYFIFAIALTVAYIIYYAVMIARDLYGKNEEKIKSGEEEFDVSDFDEEESVSVVENDK GFNIGDNEYETHYIDETQNVSEETNPTDEKPRQDVIEALNNKVKANMEETQVTFSNPYNS AELYKLMIQNGIGELHPDGGVTTKNVIDEL >gi|222159225|gb|ACAB01000134.1| GENE 6 4095 - 5663 1009 522 aa, chain + ## HITS:1 COG:FN0830 KEGG:ns NR:ns ## COG: FN0830 COG2865 # Protein_GI_number: 19704165 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Fusobacterium nucleatum # 10 517 13 512 522 126 24.0 1e-28 MALAINIDDLLNKQRIESNRIEFKKGWNPVSIYHSVCAFANDFDDLGGGYIVVGVDTDDK TGVAIRPVEGIPIEEIDDILQDMVGYNNKIAPYYMPRTSTEEVDGKTVLVIWCPAGINRP YSVPENVTAKNGSKEYFYIRSGTSSIIAKGEVLDELRELASRVPFDERGNPDIRLEDIST LLLREYLVKVGSKLASEINIRPLQEILEQMDLFVGPKEKRMLRNVAAMMFCENPSKFFKR TQVEIVYFPEGRLNNPSNLYEGAVITGSVPQIIDRTLEYLKRMLVMQSIIKPENDYRSRK FYTYPYQALEESVTNSLYHRDYREWEPVVITVEPDSITIQNVGGPDRSISAADISRCEIL VSKRYRNRRLGEYLKELDMTEGRSTGIPTIQNVLENNGSPRATVVTDEDRTFFRITIPCH EAAGNIIADIANKDASVKASRRGALKTALQTALQTALQTAPKTALQILEQIQINPKVTMT DLANLTGYSRRWVAQTMKQLQELNVIKRVGSTKTGYWEIIAK >gi|222159225|gb|ACAB01000134.1| GENE 7 5775 - 6662 612 295 aa, chain - ## HITS:1 COG:no KEGG:BF2792 NR:ns ## KEGG: BF2792 # Name: not_defined # Def: DNA primase # Organism: B.fragilis # Pathway: not_defined # 1 295 1 295 295 355 58.0 1e-96 MNINEAKQIRLVEYLRIIGHSPVNARGCQYWYLSPLREERTPSFKVNDNLNEWYDFGLSA GGDIIELGKHLYRTGNVSVVLLRISENAIGVPFQQLQSRSVRPCPIEEEMENVEVRELSH HALLSYLRSRYISENIGRLYCKEIHYELRKRHYFAIAFENKTGGYEVRNPYYKGCIKGKD 
ISVIRYNKDIIQNHVCVFEGFMDFLSYQELHRKGDYHVCIDSPTDFLVMNSVSNLKKCLV ELERYTAIHCYLDNDLAGRKTVETIAGLCGNRVTDHSECYYNYKDLNDYLRGKRR >gi|222159225|gb|ACAB01000134.1| GENE 8 6709 - 7776 453 355 aa, chain - ## HITS:1 COG:no KEGG:BF2791 NR:ns ## KEGG: BF2791 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 10 350 12 356 368 354 51.0 3e-96 MQPDMIQPNNYRIDEAKYKSLLKYIRLSVTEKYEFPQEIVQIDGVTIATIGNFSASTGKP KSKKTFNVSAIVASALSGKEILKYKAELPSCKNRILYIDTEQSKCHCHKVLHRILKLAGL PTDQENDNIQFLVLREYTPDQRRDIIRWALHEEQNIGLVIIDGIRDLIHDINSPSESLDI INELMRWSSYYELHIHTVLHLNKGDDNTRGHIGTELNNKAETILQISKNNENGKISEVRA MHIRDREFTPFAFEIGEDSLPHLVKEHQFKKNKMDRLASYIDMTEQQHRTALEVTFEECA EYGYQSLLEALKKGYESIGYSRGRNTLVNLCKFLLENHAITKNGRTYSYNPSFHL >gi|222159225|gb|ACAB01000134.1| GENE 9 7776 - 8222 354 148 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713430|ref|ZP_04543911.1| ## NR: gi|237713430|ref|ZP_04543911.1| predicted protein [Bacteroides sp. D1] # 1 148 22 169 169 262 100.0 5e-69 MNRKEIDLLLSRIEGLKSFLENNSLENFQQEILNVENYLKRFGSLAELLSHLENVEKIAY AAKEFLNIDEVAAYLQVSKGYVYKLTMQKELTVYKPNGKNIFILRDDLNRWIKRNPCFSN TEIERQANIIAYTLGQKSKNKPTKGGKK >gi|222159225|gb|ACAB01000134.1| GENE 10 8219 - 8614 327 131 aa, chain - ## HITS:1 COG:no KEGG:BF2790 NR:ns ## KEGG: BF2790 # Name: not_defined # Def: putative excisionase # Organism: B.fragilis # Pathway: not_defined # 26 122 10 106 132 82 44.0 7e-15 MRHRRTTNENKMLSKMENMISMMATEKPFERLQSLEQKINEVDKIQTIENYIAQLKERIW AVKEVLTTAEASAYLGLSESYIYKLTSLKQIPHYKPNGKLVYFNRKELCEWAMRNQVQTI GQPAAITREEL >gi|222159225|gb|ACAB01000134.1| GENE 11 8773 - 9549 420 258 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237722685|ref|ZP_04553166.1| ## NR: gi|237722685|ref|ZP_04553166.1| predicted protein [Bacteroides sp. 
2_2_4] # 1 258 5 262 262 479 100.0 1e-134 MRGFQWTPAQTGKEGLFRLVEQAHKSYKDEVTTQQKSYRRYVLDFTHSHRFHDANDAWSK NEHLKNYHTVIELLGKHEILIRKHFDHDLLDENGIIKLLNEFYAIDLSESTTESDNSSPG NPIQTNQNTNIKPLLDRHTLDLIVQLANEVCLFKEVLDADNVAICYAADTLQTVTSRNNS KLVLLLDKLASHDVITYNWQSVIAQKHLVISSSRKKLLTQHDMSSTLNRIKDTTPNAIDK QFLAIIDKYITLIKNREV >gi|222159225|gb|ACAB01000134.1| GENE 12 9607 - 11115 658 502 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713433|ref|ZP_04543914.1| ## NR: gi|237713433|ref|ZP_04543914.1| predicted protein [Bacteroides sp. D1] # 1 502 3 504 504 1030 100.0 0 MTRQETVIKITKITRIVGEMKSQLDLDDEIEFEALDSSWMNIGKWAKEICLYMEQAPSPL LANLITNNEFTVPVVNYVQSHRLEIDSAYVKVIDCYANNMQALLSLCKRQEEEVKGEYKD LIEPLANEQVATLLQRAIRAGLLDEHYQPMPQTKPLQLKVIAYAVSTICKLPSTYILFEK QWKREYGKRFSTWRVPRYNTGLYETTKALYPEVDFTEFEPTHQTETFYTPQSEEDIAVLY RDLVKYGYIAPDTGLKTFVGIFNKKTFSKPVEWIKTQRQLSFFVYQAFYKFNKKDLWIKG ECCFSINGHTPHKACFVSGYSWIKRAGWLDRYDVRLKAICDKFKHIENTFNEETSDERLI HTSKVVFYSPNSEDEIHSMFSALLDGGYISSDTTFAAFKGIFDETVFEHPIVWMKTQTSL MYFVHLAFKQHNPYDVWVKCVNCFRLQRDKVPNRESMDSNFRFIVKKGLIDTYDIQLKTI ADNYLSTQNKNAINAKVANNNT >gi|222159225|gb|ACAB01000134.1| GENE 13 11127 - 12287 954 386 aa, chain - ## HITS:1 COG:HI0676 KEGG:ns NR:ns ## COG: HI0676 COG4973 # Protein_GI_number: 16272618 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerC # Organism: Haemophilus influenzae # 169 367 63 272 295 68 30.0 2e-11 MTKKTTIQKEPVKLREKKLANGNVSLYLDIYRNGKRQKEYLKLYLIDATTPLEREQNRQT LATAQAIKSKRLIEMQNGEYSFTRQFKEDTPFLEYYRKMVEERHKNPESDGNWGNWRSCL RYLEIYCDEKTTFKDITPEFINGFKDFLNNVEKDTHKRTGPRRERDVFQGLSQNSKVSYF NKLRACINQAYDERIIPVNPLRGIEGFKAAEVKRDYLTLEEVKQLAATPCRYPILKRAFL FSCLTGLRKSDIQKLTWSEVQKFGDYTRIVFKQKKTGGQEYLDISSQAEKYLGERGNPED PVFTGFTYGAWTSLELQRWSMAANINKNLTFHCGRHTFAVLMLDLGADIYTVSKLLGHKE LATTQIYAKVLDKNKQNAVSLIPDID >gi|222159225|gb|ACAB01000134.1| GENE 14 12398 - 12646 131 82 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKTGRKYGEKEKMQILIKSLIFRTLQIYIFSFYAFLFFVSLADAKMLKDILKHHVIRNLT SYVRNKENTLPYVLRKKISGNL 
>gi|222159225|gb|ACAB01000134.1| GENE 15 12515 - 13912 1269 465 aa, chain - ## HITS:1 COG:CAC3734 KEGG:ns NR:ns ## COG: CAC3734 COG0486 # Protein_GI_number: 15896965 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Clostridium acetobutylicum # 4 465 5 459 459 301 38.0 2e-81 MNQDTICAIATAQGGAIGSIRVSGPEAISITGSIFKPAKTGKLLSEQKPYTLTFGRIYDG DEIIDEVLISLFRAPHSYTGEDSTEITCHGSSYILQQVMQLLIKNGCRMAQPGEYTQRAF LNGKMDLSQAEAVADLIASSSAATHRLAMSQMRGGFSKELTELRNKLLNFTSMIELELDF SEEDVEFADRSALRKLADEIEQVISRLAHSFSVGNAIKNGVPVAIIGETNAGKSTLLNVL LNEDKAIVSDVHGTTRDVIEDTINIGGITFRFIDTAGIRETNDTIESLGIERTFQKLDQA EIVLWMVDAVNAASQIEQLSEKIIPRCEGKHLIVVFNKADLIEDKQKENLLSLLKDFPKE STESIFISAKQRENTSELQKMLIDAAHMPTVTQNDIIVTNVRHYEALNKALEAIHRVQNG LDSQISGDFLSQDIRECIFFISDIAGEVTNDMVLQNIFQHFCIGK >gi|222159225|gb|ACAB01000134.1| GENE 16 14048 - 14455 328 135 aa, chain + ## HITS:1 COG:STM2271_3 KEGG:ns NR:ns ## COG: STM2271_3 COG0784 # Protein_GI_number: 16765598 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Salmonella typhimurium LT2 # 44 133 26 115 124 72 34.0 2e-13 MEPAINLYSSNEKSELIRENYQEILSMLAAHQIALWEYDIITGKCSFSELGQEAINSFIR ESPDLILMDIRMPVMDGIQATEKIRTISLTVPIIAVTAYAFYTEQQQAIQAGCNAVISKL YSLERLRETIESYIG >gi|222159225|gb|ACAB01000134.1| GENE 17 14538 - 15197 809 219 aa, chain + ## HITS:1 COG:PA0741 KEGG:ns NR:ns ## COG: PA0741 COG2910 # Protein_GI_number: 15595938 # Func_class: R General function prediction only # Function: Putative NADH-flavin reductase # Organism: Pseudomonas aeruginosa # 6 219 2 213 213 199 50.0 4e-51 MEKVKKIVLIGASGFVGSALLNEALNRGFEVTAVVRHPEKIKIENENLKVVKADVAVLDE VADVCKGADAVVSAFNPGWNNPDIYDETIKVYLTIIDGVKKAGVNRFLMVGGAGSLFIAP GLRLMDSGEVPENILPGVKALGEFYLNFLKKEKEIDWVFFSPAADMRPGVRTGRYRLGKD DMIVDIVGNSHISVEDYAAAMIDELEYPKHHQERFTIGY >gi|222159225|gb|ACAB01000134.1| GENE 18 15483 - 16361 971 292 aa, chain - ## HITS:1 COG:VNG0893G KEGG:ns NR:ns ## COG: VNG0893G COG2820 # Protein_GI_number: 15790029 # Func_class: F Nucleotide 
transport and metabolism # Function: Uridine phosphorylase # Organism: Halobacterium sp. NRC-1 # 19 268 14 227 273 100 30.0 3e-21 MKKYFPSSELIINEDGSVFHLHVKPEWLADKVILVGDPGRVALVASHFENKECEVESREF KTITGTYKGKRITVVSTGIGCDNIDIVINELDALANIDFSTREEKDQFRQLELVRIGTCG GLQPNTPVGTFVCSQKSIGFDGLLNFYAGRNAVCDLPFERAFLNHMGWSGNMCAPAPYVI DASEELIDQIAKEDMVRGVTIAAGGFFGPQGRELRVPLADPKQNDKIESFEYKGYKITNF EMESSALAGLSRLMGHKAMTVCMVIANRLIKEANTGYKNTIDTLIKTVLDRI >gi|222159225|gb|ACAB01000134.1| GENE 19 16520 - 17521 1045 333 aa, chain - ## HITS:1 COG:BS_yqfA KEGG:ns NR:ns ## COG: BS_yqfA COG4864 # Protein_GI_number: 16079592 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 1 320 1 319 331 389 66.0 1e-108 MESSFYLPIFLIAGGIIFLIIFFHYVPFFLWLSAKVSGVNISLIQLFLMRIRNVPPYIIV PGMIEAHKAGLKNITRDELEAHYLAGGHVEKVVHALVSASKANIELPFQMATAIDLAGRD VFEAVQMSVNPKVIDTPPVTAVAKDGIQLIAKARVTVRANIRQLVGGAGEDTILARVGEG IVSSIGSSENHKSVLENPDSISKLVLRKGLDAGTAFEILSIDIADIDIGKNIGAALQIDQ ANADKNIAQAKAEERRAMAVASEQEMKAKAQEARAKVIEAEAEVPKAMAEAFRSGNLGIM DYYRMKNIEADTSMRENIAKPATGNAGNQPLSK >gi|222159225|gb|ACAB01000134.1| GENE 20 17544 - 18014 626 156 aa, chain - ## HITS:1 COG:no KEGG:BT_4556 NR:ns ## KEGG: BT_4556 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 155 1 155 156 253 95.0 1e-66 MDILIIAVLIIAAVILFLVELFVIPGISLAGISALVCILYANYYAFANLGMAGGFITLGI SAIACIGSLIWFMRSKMLDKLALKKDIDSKVDRSAEDSVKVGDTGISTTRLAQIGYAEIN GKIVEVKSIDGFLNEKTPIIVSRITDGTIMVEKHKE >gi|222159225|gb|ACAB01000134.1| GENE 21 18032 - 19453 1369 473 aa, chain - ## HITS:1 COG:no KEGG:BT_4557 NR:ns ## KEGG: BT_4557 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 473 1 473 475 813 87.0 0 MKGKYILFLICSLFLCEVSAQTLEQARTLFTKGDYEKAKPVFKKYAKSQPSNGNYNYWYG VCCLKTGEAEEAVKPLEIAVKKRVTSGQLYLGQAYNETYRFEDAVNCFEEYIADLSKRKK STEEAEKLLEKSKADLRMLKGVEDVCIIDSFVVDKATFLHAYKISEESGKLFTFNDFFKT 
EGDHPGTVYETEIGNKIYYSEKGENGNLDIFSKNKLLNEWSDGRPLPGSINASGNANYPF VLSDGVTVYYASDGEGLGGYDIFVTRYNTNTDTYLVPDNVGMPFNSPYNDYMYVIDEYNN LGWFASDRFQPEGKVCIYVFIPNTSKQTYNYEAMEQQQIIRLAQIHSLKETWKDKQAVTE ALQRLEAAISHKPKERRAMDFEFVIDDHTTYYLMSDFKSAKAKSLFQRYQQMEKDYYQQE EKLNSLRQQYAGANQQGKEKMAPAILDLEKRVLQMSEELDTLEVNVRNAEKTK >gi|222159225|gb|ACAB01000134.1| GENE 22 19637 - 20383 566 248 aa, chain + ## HITS:1 COG:NMA1465 KEGG:ns NR:ns ## COG: NMA1465 COG0037 # Protein_GI_number: 15794367 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Neisseria meningitidis Z2491 # 1 245 5 252 319 137 35.0 2e-32 MTQFTEEEKAIRRIERWFSKGVVQYGLIEEGDKILIGLSGGKDSLALVELLGKRARIYKP RFSVVAVHVVMKNIPYQSDTEYLKAHCEAYGVPFVQYETAFDPATDTRKSPCFLCSWNRR KALFTVAKEHGCNKIALGHHMDDILETLLMNITYQGAFSTMPPRLVMKKFDMTIIRPMCL VHEADLVELAALRHYQKQVKNCPYESQSSRSDMKGVLRQLEAMNPEARYSLWGSMTNVQE ELLPDKID >gi|222159225|gb|ACAB01000134.1| GENE 23 20399 - 20770 266 123 aa, chain + ## HITS:1 COG:XF0449 KEGG:ns NR:ns ## COG: XF0449 COG3169 # Protein_GI_number: 15837051 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 4 121 9 114 116 90 52.0 6e-19 MKGFYTILLLIVSNVFMTFAWYGHLKMKQEYSWFAALPLIGVIAFSWAIAFFEYSCQIPA NRIGFIGNGGPFSLMQLKVIQEVITLVIFTVFTTIFFKGEALHWNHLAAFICLIAAVYFV FMK >gi|222159225|gb|ACAB01000134.1| GENE 24 20841 - 21854 895 337 aa, chain - ## HITS:1 COG:YPO3234 KEGG:ns NR:ns ## COG: YPO3234 COG1477 # Protein_GI_number: 16123393 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Yersinia pestis # 32 334 27 336 340 201 37.0 2e-51 MKTKKSFLWLAFLILATIWILVRHNQQGGYYSVKGLVFGTVYKITYQHDGDLKPEIEAEL KRFDQSLSPFNDSSVVSRVNRNEELVTDSFFQKCFHRSMEISRETKGAFDITVAPLANAW GFGFKKGAFPDSLMIDSLLQITGYKKVKMDNGKVIKQDPRTMLSCSAVAKGYSVDVIAQL 
LDRKGIKNYMVDIGGEVVVKGKNPSKGLWRIGINKPIDDSLAVNQDLQTILEITDLGLAT SGNYRNYYYKDGKKYAHTIDPRTGYPVQHNILSSTVIAKDCMTADALATAFMVMGLEEAE AFCKADTTIDAYFIYSGENGEFKTYYTEGMKKYITTP >gi|222159225|gb|ACAB01000134.1| GENE 25 21885 - 22466 518 193 aa, chain - ## HITS:1 COG:Cj0167c KEGG:ns NR:ns ## COG: Cj0167c COG1971 # Protein_GI_number: 15791554 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Campylobacter jejuni # 8 192 8 186 187 119 47.0 3e-27 MTGLEIWLLAIGLAMDCFAVSIASGIILKRTRWKPMLIMAFAFGFFQAIMPFIGWMCAKT FSHLIESVDHWIAFAILAFLGGRMILESFKEEECCKLFNPANPKVVLTMAIATSIDALAI GISFAFLGVQDYTEILPPISIIGFVSFVMSLIGLIFGIQCGCGIARKLKAELWGGIILVI IGLKILIEHLFLQ >gi|222159225|gb|ACAB01000134.1| GENE 26 22473 - 23276 644 267 aa, chain - ## HITS:1 COG:no KEGG:BT_4562 NR:ns ## KEGG: BT_4562 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 80 267 1 188 188 345 89.0 1e-93 MKSRTCRYSIIIGQNAHLMKRLISLLLLFCIGLPLLAQQQRPAQTPKRDQKKKEVAGIDT IPLYNGTYVGVDLYGIGSKLLGGDFMSSEVSVAVNLKNKFIPTIEFGMGGTDTWSETGIH YKSKMAPFFRIGVDYNTMAKKKEKNSYLYAGLRYAFSSFKYDVSTMPADDPIWGDVIGNP SLEDGYWGGSVPFSHLGMKGSVQWLEIVLGVKVRIYKNFNMGWSVRMKYKTKASTGEYGD PWYVPGYGKFKSNNMGITYSLIYKLPL >gi|222159225|gb|ACAB01000134.1| GENE 27 23206 - 23790 382 194 aa, chain - ## HITS:1 COG:no KEGG:BT_4563 NR:ns ## KEGG: BT_4563 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 189 1 179 183 245 69.0 6e-64 MKNLIRIFLILTIIGGAVSLHSACSDENDCSLAGRPMMYCTIYTINPDNPTIALKDTLDS LTITALGTDSIILNNEKNVHTLMLPLRYTSDTTVFIFRYDPKRREKDFDTLYIVQQNTPY FQSMECGYMMKQNIISAKFGKPGRPGNPPPYGNGQPDQIDSLHIKNKEANTNEIENLQIF YNYRPERTPDETTN >gi|222159225|gb|ACAB01000134.1| GENE 28 23812 - 24762 850 316 aa, chain - ## HITS:1 COG:mlr7556 KEGG:ns NR:ns ## COG: mlr7556 COG0463 # Protein_GI_number: 13476277 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Mesorhizobium loti # 3 308 4 301 326 253 
42.0 5e-67 MDISVVIPLFNEEESLPELYAWIERVMQANGFSFEVIFVNDGSTDHSWEVIEKLKAQSEH VKGIKFRRNYGKSPALYCGFAEAEGNVVITMDADLQDSPDEIPELYRMITKDGYDLVSGW KQKRYDPISKTLPTKLFNATARKVSGIKNLHDFNCGLKAYRKEVVKHIEVYGEMHRYIPY LAKNAGFKKIGEKVVHHQARKYGTTKFGLNRFVNGYLDLLSLWFLSRFGIKPMHFFGLLG SLMFLIGMISVIIVGASKLYAMYNGLPYRLVTDSPYFYLSLTAMIIGTQLFLAGFLGELI SRNAPERNNYQIEKKI >gi|222159225|gb|ACAB01000134.1| GENE 29 24795 - 25310 356 171 aa, chain - ## HITS:1 COG:no KEGG:BT_4565 NR:ns ## KEGG: BT_4565 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 48 171 1 124 124 215 89.0 5e-55 MHFGTYMGIYWILKFILFPLGFHIPFLSLLFVILTLSVPFIGYHYAKMYRDKICGGSIQF SHAMLFTIFMYMFASLLVAVAHYAYFQFIDHGFIINSYIQLWDELMTNTPALIENKEVIK ETIDTARSLTSINITMQLLSWDVFWGSILAIPTALMVMKKAKSENDGVSQS >gi|222159225|gb|ACAB01000134.1| GENE 30 25904 - 26377 276 157 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713452|ref|ZP_04543933.1| ## NR: gi|237713452|ref|ZP_04543933.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 157 3 159 159 257 100.0 1e-67 MENSQLKDLQEEVSDATKQYILTTFNSENGMKTYYLQMSNIIRSAHINPPIDTEYNSLKK LSKKLKQYCTFIQTLGEHEWDKGIADIQKALGIYLMQNDIESKERKQTNQEIASQLQFIV FLSGNINIIKQLHGILQRHLSNVMLLLRSYPEHNIQE >gi|222159225|gb|ACAB01000134.1| GENE 31 26782 - 28062 1557 426 aa, chain - ## HITS:1 COG:SP1128 KEGG:ns NR:ns ## COG: SP1128 COG0148 # Protein_GI_number: 15900994 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Streptococcus pneumoniae TIGR4 # 3 420 4 423 434 614 73.0 1e-175 MKIEKIVAREILDSRGNPTVEVDVVLESGIMGRASVPSGASTGEHEALELRDGDKQRYGG KGVQKAVDNVNKIIAPKLIGMSSLNQRGIDYAMLALDGTKTKSNLGANAILGVSLAVAKA AANYLDLPLYRYIGGTNTYVMPVPMMNIINGGSHSDAPIAFQEFMIRPVGAPSFREGLRM GAEVFHALKKVLKDRGLSTAVGDEGGFAPNLEGTEDALNSILAAIKAAGYEPGKDVMIGM DCASSEFYHDGIYDYTKFEGEKGKKRTSAEQVDYLEELINKFPIDSIEDGMNENDWEGWK KLTERIGNRCQLVGDDLFVTNVDFLAMGIEKGCANSILIKVNQIGSLTETLNAIEMAHRH GYTTVTSHRSGETEDATIADIAVATNSGQIKTGSLSRSDRMAKYNQLLRIEEELGDLAVY GYKRIK >gi|222159225|gb|ACAB01000134.1| GENE 32 28242 - 28904 413 
220 aa, chain - ## HITS:1 COG:CAC1657 KEGG:ns NR:ns ## COG: CAC1657 COG1357 # Protein_GI_number: 15894934 # Func_class: S Function unknown # Function: Uncharacterized low-complexity proteins # Organism: Clostridium acetobutylicum # 56 219 53 216 216 118 39.0 8e-27 MIKRNPTKPVRVVPPMMEEQEVSTTTLQEWLDREETVSHLLFCKGKEEGIDKSYKSFKNC TFQHQTFSECKFRSSQLSDVRFENCDLSNISFAESSLYRVEFISCKLLGTNLSETTMNHV LLHDCNAGYINLAMSKMNQVRFAHSQLRNGSLNDCRFSSVAFESCDLVEADFSHAPLRGI DLRTSRISGITLNISDLKGAVITSLQAMDLLPLLGVIIKD >gi|222159225|gb|ACAB01000134.1| GENE 33 28935 - 29312 333 125 aa, chain - ## HITS:1 COG:AGc2712 KEGG:ns NR:ns ## COG: AGc2712 COG0239 # Protein_GI_number: 15888794 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Integral membrane protein possibly involved in chromosome condensation # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 11 125 10 124 125 74 43.0 5e-14 MSKEVIYIFIGGGTGSALRYFIQLLMHERIVPYHFPWATFTVNLLGSFLIGLFYALSERF HLPFEVRLFLTTGLCGGFTTFSTFSSDGVGLLKGEFYGAFVLYTLLSIGIGLAATLAGGY VGKQI >gi|222159225|gb|ACAB01000134.1| GENE 34 29664 - 31823 1964 719 aa, chain + ## HITS:1 COG:no KEGG:BT_4581 NR:ns ## KEGG: BT_4581 # Name: not_defined # Def: alpha-glucosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 719 1 719 719 1358 89.0 0 MKNMKIGTVIGLCLLLFFFFGTSAKAESITSPDGQLKLNFSVNAQGEPVYELSYKGKEVI KPSKLGLELKNDPGLMNGFTLIDTKTATFDETWQPVWGEVKSIRNHYNEMAVTLNQKAQG RNLIICFRLYNDGLGFRYEFPQQKNLNYFVIKEEHSQFAMAGDHTAFWIPGDYDTQEYDY TESKLSEIRGLLQGAITPNASQTPFSPTGVQTSLQMKTADGLYINLHEAALIDYSCMHLN LDDQNLVFESWLTPDAKGDKGYMQTPCRSPWRTVIVSDDARDILASKLTLNLNESCIYED VSWIKPVKYIGVWWEMITGKGSWSYTDDVYSVKLGQTDYSKTKPNGRHSANNDKVKRYID FASEHGFDQVLVEGWNEGWEDWFGNSKDYVFDFVTPYPDFDVKMLNAYAKEKGVKLMMHH ETSASVRNYERHLDKAYQFMIDNGYNAVKSGYVGNIIPRGEHHYGQWMNNHYLYAVEKAA EYKICVNAHEATRPTGLCRTYPNLIGNESARGTEYEAFAGNKPFHTTLLPFTRQIGGPMD YTPGIFDTKLSFLSGDHSFVRTTLAKQLALYVTMYSPLQMAADLPESYERHMDAFQFIKE VAVDWDDSKYLEAEPGDYITVARKAKGTDNWFVGGITDENPRTSAFTLDFLEPGKQYVAT 
LYADGKDADFEKNPTSYQIKKGLVTNKNKMSVKLARSGGFAVSLIEATPADLKTIRKWK >gi|222159225|gb|ACAB01000134.1| GENE 35 31988 - 33397 1399 469 aa, chain + ## HITS:1 COG:lin1880 KEGG:ns NR:ns ## COG: lin1880 COG0034 # Protein_GI_number: 16800946 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Listeria innocua # 3 454 13 438 475 181 31.0 2e-45 MGGFFGTVSKTSCVTDLFYGTDYNSHLGTKRGGLATYSEEKGFIRSIHNLESTYFRTKFE GELDKFKGNAGIGIISDTDAQPIIINSHLGRFAIVTVAKIANIQELEEELLSQNMHFAEL SSSNTNQTELIALLIIQGRTFAEGIENVFNHIKGSCSMLLLTEDGSIIAARDRWGRTPVV IGKKDGAYAATSESSSFPNLDYEIERYLGPGEIVRMYDDHIDQLRKPNEDMQICSFLWVY YGFPTSCYEGKNVEEVRFTSGLKMGQTDESEVDCACGIPDSGVGMALGYAEGKGVPYHRA ISKYTPTWPRSFTPSNQEMRSLVAKMKLIPNRAMLQNKRLLFCDDSIVRGTQLRDNVKIL YDYGAKEVHMRIACPPLIYACPFVGFSASKNALELITRRIIKELEGDENKNLEKYATTGS PEYEKMVSIIAERFGLSSLKFNTLETLIEAIGLPKCKVCTHCFDGSSHF >gi|222159225|gb|ACAB01000134.1| GENE 36 33547 - 34770 1376 407 aa, chain + ## HITS:1 COG:CAC0476 KEGG:ns NR:ns ## COG: CAC0476 COG2195 # Protein_GI_number: 15893767 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Clostridium acetobutylicum # 3 407 4 408 408 500 58.0 1e-141 MTLVDRFLKYVSFDTQSDESTGLTPSTPKQMIFAEYLKKELESLGLEDITLDEHGYLFAT LPANIDKEVPTIGFIAHMDTSPDMTGKDVTPRIVEKYDGSDIVLCAEENVILSPSQFPEL LDHKGEDLIVTNGKTLLGADDKAGIAEIVSAMVYLKEHPEIKHGKIRIGFNPDEEIGEGA HKFDVEKFGCEWGYTMDGGEVGELEFENFNAAAAKITFKGRNVHPGYAKDKMINSIYLAN QFITMLPSQERPEHTTGYEGFYHLIGIQGDVEQSTVSYIIRDHDRAKFENRKKEIERLVA QMNAEYGAGTATLELRDQYYNMREKIEPVMHIIDTAFAAMEAVGVKPNVKPIRGGTDGAQ LSFKGLPCPNIFAGGLNFHGRYEFVPIQNMEKAMKVIVKIAELVASK >gi|222159225|gb|ACAB01000134.1| GENE 37 34837 - 35922 1103 361 aa, chain + ## HITS:1 COG:BH2816 KEGG:ns NR:ns ## COG: BH2816 COG0404 # Protein_GI_number: 15615379 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system T protein (aminomethyltransferase) # Organism: Bacillus halodurans # 1 361 4 362 365 329 46.0 6e-90 
MKTTPFTEKHIILGAKMHEFAGYNMPIEYSGIIDEHLTVCQSVGVFDVSHMGEFWVKGPQ ALAFLQKITSNNVAALAPGKIQYTCFPNEEGGIVDDLLVYQYEPEKYMLVVNAANMEKDW NWCVSHNTEGAELENSSDNIAQLAVQGPKAVLALQKLTDIDLASIPYYTFKVGTFAGEEN VIISNTGYTGAGGFELYFYPSVADRIWKAVFEAGEEYGIKPIGLGARDTLRLEMGFCLYG NDLDDTTSPIEAGLGWITKFVEGKDFINRPLLEKQKTEGVTRKLVGFEMVDRGIPRHGYE LVNAEGEQVGVVTSGTMSPTRKIGIGMGYVKPEYSKVGTEICIDMRGRKLKAVVVKPPFR K >gi|222159225|gb|ACAB01000134.1| GENE 38 36073 - 38346 1635 757 aa, chain + ## HITS:1 COG:PA5529 KEGG:ns NR:ns ## COG: PA5529 COG0475 # Protein_GI_number: 15600722 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, membrane components # Organism: Pseudomonas aeruginosa # 7 443 6 440 585 352 44.0 1e-96 MSQLPTLIADLALILICAGVMTLLFKKLKQPLVLGYVVAGFLASPHMPYTPSVMDTANIK TWADIGVIFLLFALGLEFSFKKIVKVGGSAIIAACTIIFCMILLGIGVGMGFGWHRMDSL FLGGMIAMSSTTIIYKAFDDLGLRKKQFTGLVLSILILEDILAIVLMVMLSTMAVSHNFE GTEMLESIGKLLFFLILWFVVGIYLIPEFLKRCRKLMGEETLLIVSLALCFGMVVMAAHT GFSAAFGAFIMGSILAETIEAESIDRLVKPVKDLFGAIFFVSVGMMVDPAMIVEYAVPII VITLAVILGQSVFGTFGVILSGKPLKTAMQCGFSLTQIGEFAFIIASLGVSLHVTSDFLY PIVVAVSVITTFLTPYMIRLAEPASTFVDAHLPESWKKMMMRYSSGSQTALNHENLWKKL ILAMVRITVVYSIVSISIVALSFRFVVPFFKENLPHFWASLLGAVFIILCIAPFLRAIMV KKNHSVEFMTLWHDNRANRAPLLSTIVIRIMIAVLFVIFVISGLFKASIGLIIGVAVLVV LLMVWSRRLKKQSILIERRFFQNLRSRDVRAEYLGEKKPEYAGRLLSHDLHLADMEIPGE SCWAGKTLMELNLGKKFGVHVASILRGKRRINIPGGSVRLFPMDKIQVIGTDEQLNVFNE AMQNGAKIDWEVYEKSEMALKQFIIDSDSVFLGKTIRESGIRDKYHCMIAGVESEDGTLM VPDVNAPLEEGDVVWVVGEKEDVYQLVDQKNEKVQAG >gi|222159225|gb|ACAB01000134.1| GENE 39 38396 - 38821 530 141 aa, chain + ## HITS:1 COG:RSc0455 KEGG:ns NR:ns ## COG: RSc0455 COG0537 # Protein_GI_number: 17545174 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Ralstonia solanacearum # 30 106 27 103 147 62 35.0 3e-10 MKSDPKECLYCQNNETLHNLMIEIAQLSVSRVFLFKEQTYRGRCLVAYKDHVNDLNELSD 
EDRNAFMADVARVTRAMQKAFQPEKINYGAYSDKLSHLHFHLAPKYVDGPDYGGIFQMNP GKVYLTDAEYQELIDAVKANL Prediction of potential genes in microbial genomes Time: Wed May 18 04:11:48 2011 Seq name: gi|222159224|gb|ACAB01000135.1| Bacteroides sp. D1 cont1.135, whole genome shotgun sequence Length of sequence - 25434 bp Number of predicted genes - 27, with homology - 27 Number of transcription units - 15, operones - 7 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 8 - 430 600 ## COG1188 Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) - Prom 451 - 510 2.6 2 2 Op 1 22/0.000 - CDS 541 - 1200 634 ## COG0193 Peptidyl-tRNA hydrolase 3 2 Op 2 . - CDS 1244 - 1834 975 ## PROTEIN SUPPORTED gi|237713464|ref|ZP_04543945.1| 50S ribosomal protein L25/general stress protein Ctc - Prom 1892 - 1951 3.3 4 3 Tu 1 . - CDS 1972 - 2403 462 ## BT_4590 hypothetical protein - Prom 2565 - 2624 5.1 + Prom 2344 - 2403 10.5 5 4 Op 1 . + CDS 2542 - 3468 855 ## COG0781 Transcription termination factor 6 4 Op 2 . + CDS 3508 - 3831 440 ## COG1862 Preprotein translocase subunit YajC 7 4 Op 3 . + CDS 3847 - 4866 693 ## BT_4593 hypothetical protein 8 4 Op 4 . + CDS 4856 - 5470 387 ## COG0237 Dephospho-CoA kinase 9 4 Op 5 . + CDS 5495 - 5923 589 ## BT_4595 hypothetical protein + Term 5952 - 5998 9.0 + Prom 5930 - 5989 11.1 10 5 Tu 1 . + CDS 6167 - 6415 138 ## BT_4596 hypothetical protein + Term 6452 - 6503 11.4 - Term 6278 - 6312 -0.5 11 6 Tu 1 . - CDS 6545 - 9133 1852 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 - Prom 9192 - 9251 7.6 12 7 Tu 1 . - CDS 9367 - 10995 1044 ## BVU_1888 hypothetical protein - Prom 11154 - 11213 6.1 + Prom 11065 - 11124 7.0 13 8 Op 1 . + CDS 11154 - 11750 551 ## BT_4598 hypothetical protein 14 8 Op 2 . + CDS 11811 - 12707 819 ## COG0583 Transcriptional regulator + Term 12845 - 12879 1.0 15 9 Tu 1 . 
- CDS 12708 - 13367 496 ## BT_4600 hypothetical protein - Prom 13435 - 13494 5.3 + Prom 13745 - 13804 6.2 16 10 Op 1 . + CDS 13909 - 14334 223 ## COG5579 Uncharacterized conserved protein 17 10 Op 2 . + CDS 14351 - 14725 259 ## COG3304 Predicted membrane protein + Prom 14732 - 14791 4.0 18 11 Tu 1 . + CDS 14848 - 15840 1005 ## COG2855 Predicted membrane protein + Term 15876 - 15926 12.2 19 12 Op 1 . - CDS 15961 - 16995 1280 ## COG0468 RecA/RadA recombinase 20 12 Op 2 . - CDS 17022 - 17477 551 ## COG1225 Peroxiredoxin 21 12 Op 3 . - CDS 17506 - 18699 1277 ## COG1748 Saccharopine dehydrogenase and related proteins - Prom 18728 - 18787 2.2 22 12 Op 4 . - CDS 18790 - 20073 909 ## BT_4613 hypothetical protein - Prom 20133 - 20192 6.1 + Prom 20090 - 20149 7.0 23 13 Op 1 . + CDS 20219 - 21007 689 ## COG3187 Heat shock protein 24 13 Op 2 . + CDS 20937 - 21206 84 ## gi|237713487|ref|ZP_04543968.1| conserved hypothetical protein + Prom 21254 - 21313 5.8 25 14 Tu 1 . + CDS 21334 - 23247 2396 ## COG0443 Molecular chaperone + Term 23277 - 23327 9.1 + Prom 23275 - 23334 4.5 26 15 Op 1 . + CDS 23447 - 23998 357 ## BF1226 hypothetical protein 27 15 Op 2 . 
+ CDS 24007 - 25266 658 ## COG4973 Site-specific recombinase XerC Predicted protein(s) >gi|222159224|gb|ACAB01000135.1| GENE 1 8 - 430 600 140 aa, chain - ## HITS:1 COG:Cgl2072 KEGG:ns NR:ns ## COG: Cgl2072 COG1188 # Protein_GI_number: 19553322 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) # Organism: Corynebacterium glutamicum # 5 121 10 122 126 94 42.0 7e-20 MAEARIDKWMWAVRIFKTRTIAAEACKKGRITINGSLAKAARMIKPGDVIQVKKPPITYS FKVLQTIEKRVGAKLVSEMMENVTTPDQYELLEMSKISGFVDRARGTGRPTKKDRRELEE FTTPELMDDFDFDFDFDSEE >gi|222159224|gb|ACAB01000135.1| GENE 2 541 - 1200 634 219 aa, chain - ## HITS:1 COG:slr0922 KEGG:ns NR:ns ## COG: slr0922 COG0193 # Protein_GI_number: 16331675 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptidyl-tRNA hydrolase # Organism: Synechocystis # 31 219 1 190 194 139 41.0 4e-33 MQIKADVQSALICINLRFSFYINELFYGNTMIKYLVVGLGNIGPEYHETRHNIGFMTVEA LARINNAPPFMDGRYGFTTSFSIKGRQLILLKPSTFMNLSGLAVRYWMQKENIPLENVLI VVDDLALPFGTLRLKGKGSDAGHNGLKHIAAILGTQNYARLRFGIGNDFPRGGQVDYVLG HFTDEDWKTMDERLEMAGEIIKSFCLAGIDITMNQFNKK >gi|222159224|gb|ACAB01000135.1| GENE 3 1244 - 1834 975 196 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237713464|ref|ZP_04543945.1| 50S ribosomal protein L25/general stress protein Ctc [Bacteroides sp. 
D1] # 1 196 1 196 196 380 100 1e-105 MKSIEVKGTARTIAERSSEQARALKEIRKNGGVPCVLYGGNEVVHFTVTNEGLRNLVYTP HIYVVDLIIDGKKVNAILKDIQFHPVKDTILHVDFYQIDEAKPIVMEVPVQLEGLAEGVK AGGKLALQMRKIKVKALYNVIPEKLTVNVSHLGLGKTVKVGELSFEGLELISAKEAVVCA VKLTRAARGAAAAAGK >gi|222159224|gb|ACAB01000135.1| GENE 4 1972 - 2403 462 143 aa, chain - ## HITS:1 COG:no KEGG:BT_4590 NR:ns ## KEGG: BT_4590 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 143 1 134 134 161 81.0 6e-39 MKPIINGPKDMEDLKKKMSADMNDKEIVFSKSIKAGKRIYYLDVKKNRKDEMFLAITESK KVITGEGDDSQVSFEKHKIFLYREDFQKFMAGLEEAVNFIECSDANEYIARLNIEADEEN ERKAIEEARENKLESEIKIDIDF >gi|222159224|gb|ACAB01000135.1| GENE 5 2542 - 3468 855 308 aa, chain + ## HITS:1 COG:TM1765 KEGG:ns NR:ns ## COG: TM1765 COG0781 # Protein_GI_number: 15644510 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Thermotoga maritima # 197 298 32 133 142 63 36.0 6e-10 MINRVLIRLKIIQIVYAYYQNGSKNLDSAEKELFFSLSKAYDLYNYLLMLMIALTEYAQK RIDAAKAKLAPTTEELYPNRKFVDNKFIAQLEVNKQLTEFIANQKRTWTNDQDFIKELYE KIVETDIYKDYMASDDNSYEADRELWRKIYKTYIFNNDSLDQVLEDQSLYWNDDKEIVDT FVLKTIKRFDEKKGANQELLPEFKDDEDQEFARRLFRRTILNSDYYRHLVSENTKNWDLD RIAFMDVIIMQTALAEILSFPNIPVSVSLNEYVEIAKLYSTAKSGSFINGTLDGIVNQLK KEGKLTKN >gi|222159224|gb|ACAB01000135.1| GENE 6 3508 - 3831 440 107 aa, chain + ## HITS:1 COG:XF0224 KEGG:ns NR:ns ## COG: XF0224 COG1862 # Protein_GI_number: 15836829 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YajC # Organism: Xylella fastidiosa 9a5c # 10 95 21 107 120 75 43.0 2e-14 MNVLTVLLQAPAGAAGGGSMMWIMLIAMFVIMYFFMIRPQNKKQKEIANFRKSLQVNQNV ITAGGIHGTIKEITDDYIVLEIASNVKIKIDKNSIFADASAASNQSK >gi|222159224|gb|ACAB01000135.1| GENE 7 3847 - 4866 693 339 aa, chain + ## HITS:1 COG:no KEGG:BT_4593 NR:ns ## KEGG: BT_4593 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 338 1 336 337 548 86.0 1e-155 
MPMFDRRNIKYTYLKLSKKIKDFLLSDKSREFLIFLFFFLIAGGFWLLQTLNNDYEAEFS IPVRMKDLPNNVVLTSEPPSELRVRVKDKGTVLLNYMLGKSFFPVNLGFLDYKGKDNHVK IYASDFEKKILSQLNVSSKILSIKPDTLEYIYSEGKSKLVPVRFQGKVTAGLQYYVSDTI CKPDSVLVYAPEGILDTITTAYTQNITLENISDTTRRRIPLTSERGVKFVPASVEMTFPV DIYTEKTVEVPLHGVNFPADKVLRTFPSKVQITFQVGLKRFRSIKASDFIINISYEELLK LGSDKYTVKLKSFPSGINQIRIVPEQVDFLIEQITSNGD >gi|222159224|gb|ACAB01000135.1| GENE 8 4856 - 5470 387 204 aa, chain + ## HITS:1 COG:DR1892 KEGG:ns NR:ns ## COG: DR1892 COG0237 # Protein_GI_number: 15806892 # Func_class: H Coenzyme transport and metabolism # Function: Dephospho-CoA kinase # Organism: Deinococcus radiodurans # 4 181 15 190 207 97 36.0 2e-20 MAIKIGITGGIGSGKSVVSRLLEIMGIPVYISDIEAKRITQTDPVIRRGLCDLVGQDVFQ GGELNRSLLASYMFGHQEHVRKVNEIIHPQVKEDFRQWAARLKSELLVGMESAILVEAGF KDEVDFLVMVYAPLEVRVERAVKRDCSSRELVMKRIEAQMSDEVKRSHADFVIVNDDETP LIPQVLELISLLSKNNHYLCSAKK >gi|222159224|gb|ACAB01000135.1| GENE 9 5495 - 5923 589 142 aa, chain + ## HITS:1 COG:no KEGG:BT_4595 NR:ns ## KEGG: BT_4595 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 138 1 138 144 218 87.0 6e-56 MLKTILSISGKPGLYKLISQGKNMLIVESVNAEKKRFPAYGNEKIISLADIAMYTDDAEV PLYDVLESIKEKEKSAQASIDPKKATPEQLREYLAEVLPNFDRERVYVADIKKLVAWYNI LISNGITEFKPEGEIKEEEVAE >gi|222159224|gb|ACAB01000135.1| GENE 10 6167 - 6415 138 82 aa, chain + ## HITS:1 COG:no KEGG:BT_4596 NR:ns ## KEGG: BT_4596 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 82 1 82 82 140 82.0 1e-32 METITNKEVNVRLDNEKGTLCLTGLLAGTMVFLYDSQGELRGKHKFALPSLTMEIPQPGT YVLVMSHPNCQPEVRRITYLGI >gi|222159224|gb|ACAB01000135.1| GENE 11 6545 - 9133 1852 862 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 861 1 810 815 717 45 0.0 MNFNNFTIKSQEAVQEAVNLVQSRGQQAIEPVHIMQGVMKVGENVTNFIFQKLGMNGQQV ALVVDKQIDSLPKVSGGEPYLSRESNEVLQKATQYSKEMGDEFVSLEPIILALLTVKSTV 
STILKDAGMTEKELRNAISELRKGEKVTSQSSEDTYQSLEKYAINLNEAARSGKLDPVIG RDEEIRRVLQILSRRTKNNPILIGEPGTGKTAIVEGLAHRILRGDVPENLKNKQVYSLDM GALVAGAKYKGEFEERLKSVVNEVKKSEGDIILFIDEIHTLVGAGKGEGAMDAANILKPA LARGELRSIGATTLDEYQKYFEKDKALERRFQIVQVDEPDNLSTISILRGLKERYENHHH VRIKDDAIIAAVELSSRYITDRFLPDKAIDLMDEAAAKLRMEVDSVPEELDEISRKIKQL EIEREAIKRENDKPKLEIIGKELAELKEQEKSFKAKWQSEKTLMDKIQQNKVEIENLKFE AEKAEREGDYGKVAEIRYGKLQALDKEIEDTQKQLRDMQGDKAMIKEEVDAEDIADVVSR WTGIPVSKMLQSEKDKLLHLEEELHQRVIGQDEAIEAVADAVRRSRAGLQDPKRPIGSFI FLGTTGVGKTELAKALAEFLFDDETMMTRIDMSEYQEKHSVSRLVGAPPGYVGYDEGGQL TEAIRRKPYSVVLFDEIEKAHPDVFNILLQVLDDGRLTDNKGRVVNFKNTIIIMTSNMGS SYIQSQMEKLHGSNKEEVIEETKKEVMNMLKKTIRPEFLNRIDETIMFLPLNEKEIKQIV LLQIKGVQKMLAENGVELQLTEGALNFLSQVGYDPEFGARPVKRAIQRYLLNDLSKKLLS QEVDRSKAIIVDADGDGLVFRN >gi|222159224|gb|ACAB01000135.1| GENE 12 9367 - 10995 1044 542 aa, chain - ## HITS:1 COG:no KEGG:BVU_1888 NR:ns ## KEGG: BVU_1888 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 526 1 506 519 417 43.0 1e-115 MNKKKNQTSIDLDAIMTEVLNLFTPEEQQEVIDILSNNGLEKIYQQMEEAFYHYQHPDYS SEATPIAAYLQRLWDMQTCKDYPFDEFKKDCTTVPAAELEKDLRNFILHLFIHYQGPLEN AKEEDRPFKLWYIYWAMEHYRMESSLNIILEIMRQSPTFLASYCHLVQDEATIAIIYQLG QHQLPLLLSFMNEDGIAPFGKEDICSAVAQIAITNPERRLEIINWFCQLLNSYYDQFERG EDNNIAIIDYITNTLMNIRAIETLPILEKIYKRFKIPNTLTRKGIKEIKKEMPHAELKGL EVDSMEELMNLLNEFFAFNEDNEFDANEFDEEYDDEFDDDEYDNSFYIGEQTPKKLNIKI SLIDSDPEIYRIVEVPSNIRLESFADVINTAMGWEGYHMHLFQKGKTIYTTEESDDEFLF DPVKTVNSYSLSLGEILTRKGSHIKYEYDFGDSWMHRITLESQQAYKKDETQGIFLIDGA NACPPEDCGGIYGYQEMLEALKQPHSKAAKEYREWLGKNFNAHKFNAKKVERELRDFPIR IF >gi|222159224|gb|ACAB01000135.1| GENE 13 11154 - 11750 551 198 aa, chain + ## HITS:1 COG:no KEGG:BT_4598 NR:ns ## KEGG: BT_4598 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 198 1 194 194 332 88.0 7e-90 MTNDMNDKPQIKLSETVVLIDAAFLNFVITDMKRYFEETLQRSLQEIDLSMLTTYLTLDA GIAEGKNEVQFLFVYDKESGNLAHCRPSDLEKELNGVAFQSPYGEYSFASVPSEGMVSRE 
DLFLDLLSIVADSADVKKMILVSFNEEYGKKVTDALNEVQGKEIVQFRMNEPDTPVAYKW DMLAFPVMQALGIRADEL >gi|222159224|gb|ACAB01000135.1| GENE 14 11811 - 12707 819 298 aa, chain + ## HITS:1 COG:BS_ywfK KEGG:ns NR:ns ## COG: BS_ywfK COG0583 # Protein_GI_number: 16080817 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus subtilis # 6 275 6 272 299 142 30.0 8e-34 MSDFRLKVFQSVAKNLSFTKASQELFVSQPAITKHIQELEAYYQTRLFDRQGSKISLTKS GELLLKHSEKILDDYKQLEYEMHLLHNEYIGELKLGASTTIAQYVLPPLLADFIAKFPQI NLSLINGNSRGVEAALQEHRIDLGLVEGIFRLPNLKYTPFLQDELVAVVHTHSKLAVSDE IAPEDLPNIPLVLRERGSGTLDVFERSLLRHNLKLSSLNVLMYLGSTESIKLFLEHTDCM GIVSIRSVYKELVAGNFRVVEIKGMPMQREFNFVQLQGQEGGLSQAFMRFARHHSKSL >gi|222159224|gb|ACAB01000135.1| GENE 15 12708 - 13367 496 219 aa, chain - ## HITS:1 COG:no KEGG:BT_4600 NR:ns ## KEGG: BT_4600 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 219 345 563 563 422 91.0 1e-117 MTNLYQSFSADQITNPEIFPSLLFYYGMLTIIGTRGNLTILGIPNTNVRKQYYEYILEEY QNHHYINLIDIEILFNDMAFDGQWRPALEFISKAYKENTSVRSSIEGERNIQGFFTAYLS VNAYYLTMPEVELNHGFCDMFLMPDLQRYAEVAHSYILELKYLPKEKYDTQGTAQWQEAV EQIHGYAAGPKVRQLCQGTQLHCIVIQFCGWELVRMEEV >gi|222159224|gb|ACAB01000135.1| GENE 16 13909 - 14334 223 141 aa, chain + ## HITS:1 COG:SMc01703 KEGG:ns NR:ns ## COG: SMc01703 COG5579 # Protein_GI_number: 15964216 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sinorhizobium meliloti # 1 141 1 142 143 115 42.0 3e-26 MNMPNYNLQRFLDAQQSDYEQALTEVGNGRKYSHWVWYIFLQLKGLGMSYNSQYYGISGK EEAEAYLTYPVLGERLREITSVFLQLKNKTAQEVFGSLDAMKVLSCMTLFNEVASDDLFQ QVIDRYYQGKVDETTKRKLEK >gi|222159224|gb|ACAB01000135.1| GENE 17 14351 - 14725 259 124 aa, chain + ## HITS:1 COG:MT0892.1 KEGG:ns NR:ns ## COG: MT0892.1 COG3304 # Protein_GI_number: 15840283 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Mycobacterium tuberculosis CDC1551 # 1 124 1 123 129 88 43.0 3e-18 
MGCLMNLLWLLLGGIFTAVEYLISSILMMLTIIGIPFGMQTLKLAGLALWPFGKEVRSGN RSSGCLYILMNILWIFLGGIWICLSHLVFGAILCITIIGIPFGLQHFKLAALALSPLGKD IITV >gi|222159224|gb|ACAB01000135.1| GENE 18 14848 - 15840 1005 330 aa, chain + ## HITS:1 COG:SPy1056 KEGG:ns NR:ns ## COG: SPy1056 COG2855 # Protein_GI_number: 15675048 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pyogenes M1 GAS # 45 325 34 331 339 134 34.0 3e-31 MISTATKTLQANNKMIYVALLSTLTFFLFLDYIPGLEAWSAWVTPPVALFLGLIFALTCG QAHPKFNKKTSKYLLQYSVVGLGFGMNLHSALASGKEGMEFTVISVIGTLVIGWFIGRKL FKIDRNTAYLISSGTAICGGSAIAAVGPVLKAKDSEMSVALGTIFILNAIALFIFPAIGH ALNMDQQQFGTWAAIAIHDTSSVVGAGAAYGEEALKVATTIKLTRALWIIPMAFATSFIF KSKGQKISIPWFIFFFVLALVVNTYLLDGVPQLGAAINGIARKTLTITMFFIGASLSIDV LKAVGIKPLVQGILLWVIISLSTLAYIYFV >gi|222159224|gb|ACAB01000135.1| GENE 19 15961 - 16995 1280 344 aa, chain - ## HITS:1 COG:AGc3441 KEGG:ns NR:ns ## COG: AGc3441 COG0468 # Protein_GI_number: 15889174 # Func_class: L Replication, recombination and repair # Function: RecA/RadA recombinase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 17 334 66 384 416 426 66.0 1e-119 MAKKDELNFETDNKMASSEKLKALQAAMEKIEKSFGKGSIMKMGDDSVEQVEVIPTGSIA LNVALGVGGYPRGRIIEIYGPESSGKTTLAIHAIAEAQKAGGIAAFIDAEHAFDRFYASK LGVNIDDLYISQPDNGEQALEIAEQLIRSSAIDIIVIDSVAALTPKAEIEGDMGDNKVGL QARLMSQALRKLTAAVSKTRTTCIFINQLREKIGVMFGNPETTTGGNALKFYASVRLDIR GSQPIKDGEEILGKLTKVKVVKNKVAPPFRKAEFDIMFGEGISHSGEIIDLGADLGIIKK SGSWYSYNDTKLGQGRDAAKQCIMDNPELAEELEGLIFEELKKK >gi|222159224|gb|ACAB01000135.1| GENE 20 17022 - 17477 551 151 aa, chain - ## HITS:1 COG:HI0254 KEGG:ns NR:ns ## COG: HI0254 COG1225 # Protein_GI_number: 16272212 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Haemophilus influenzae # 2 150 4 150 155 143 45.0 1e-34 MINVGDKAPEVLGINEKGEEIRLSAYKGKKIVLYFYPKDSTSGCTAQACNLRDNYSELRK AGYEVIGVSVDNEKSHQKFIEKNNLPFTLIADTDKKLVEEFGVWGEKKLYGRAYMGTFRT TFLINEEGIVERIITPKEVKTKEHASQILNQ >gi|222159224|gb|ACAB01000135.1| 
GENE 21 17506 - 18699 1277 397 aa, chain - ## HITS:1 COG:slr0049 KEGG:ns NR:ns ## COG: slr0049 COG1748 # Protein_GI_number: 16331467 # Func_class: E Amino acid transport and metabolism # Function: Saccharopine dehydrogenase and related proteins # Organism: Synechocystis # 1 389 1 388 398 553 66.0 1e-157 MGRVLIIGAGGVGTVVAHKVAQNADVFTDIMIASRTKEKCDKIVEAIGNPNIKTAKVDAD NVEELVALFNDFKPEMVINVALPYQDLTIMEACLKAGVNYLDTANYEPKDEAHFEYSWQW AYHDRFKEAGLTAILGCGFDPGVSGIYTAYAAKHYFDEIQYLDIVDCNAGNHHKAFATNF NPEINIREITQNGRYYENGKWVTTGPLEIHKDLTYPNIGPRDSYLLYHEELESLVKHYPT IKRARFWMTFGQEYLTHLRVIQNIGMARIDEVDYNGMKIVPLQFLKAVLPNPQDLGENYE GETSIGCRIRGLKDGKERTYYVYNNCSHQEAYQETGMQGVSYTTGVPAMIGAMMFFKGEW NRPGVNNVEEFNPDPFMEQLNKQGLPWHEEFDKDLEL >gi|222159224|gb|ACAB01000135.1| GENE 22 18790 - 20073 909 427 aa, chain - ## HITS:1 COG:no KEGG:BT_4613 NR:ns ## KEGG: BT_4613 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 427 1 427 427 653 73.0 0 MRIIVLSLILFCCGTCPITAQSDYIVTTPSTQEIPVGEEEQFIKNNFPLQPLCKWTPGMK FMFVPSTRNMFLPTLSSYDTEKGIDNSLLKHKILTFTGTEEKAQNISTGTNYSTRFVFEC EGEKYYYDIKNMRLDEICEKAPRAGINGLVYLKDVDTAKELLIGKTVYIQSESARVDDAN NYSGYRDIAIPVNTEATITAIGVGSQAYPVKIVFKDTQGHSYYLEVALSRTNSGMDLNDF QGEKRMKYFSNAFSFTNKSLGTIESLKNKYLGMTVYPKKMLPAKRIVSFEDKQTESRVHL PRYTVLQIKEIKLSPPGSLATLSLTDRDGAIYELETDLKYDVIVKNDNYIEDFFGFEDIH KKYPGITESRWQIISRGDLEAGMSTVECRLSIGDPIEIELKKDSRFETWFYNGKTLEFEN GTLQRYK >gi|222159224|gb|ACAB01000135.1| GENE 23 20219 - 21007 689 262 aa, chain + ## HITS:1 COG:DR1940 KEGG:ns NR:ns ## COG: DR1940 COG3187 # Protein_GI_number: 15806938 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Heat shock protein # Organism: Deinococcus radiodurans # 30 222 182 365 403 70 29.0 3e-12 MKKVFVSICIAGAALAMSSCRSVEKAIPLSSINGEWNIIEVNGSKIAPGESRTLPFIAFD TATGRVSGNSGCNRMMGSFDVNAKPGSLELTGMASTRMMCPDMTTENNVLNAFAQVKGYK KAGKDKMYLCNSSNRPVVVLQKKEADVKLSVLNGEWKIKEVNGEAIPSGMEKQPFIAFDV KKKTIHGNAGCNLINGGFETSTSNAKSISFPGVASTMMACPDMETEGKVLKAINEVKSFD 
VLSGGGIGLYDANGALVIVLEK >gi|222159224|gb|ACAB01000135.1| GENE 24 20937 - 21206 84 89 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713487|ref|ZP_04543968.1| ## NR: gi|237713487|ref|ZP_04543968.1| conserved hypothetical protein [Bacteroides sp. D1] # 11 89 1 79 79 141 100.0 2e-32 MCCPVVALVCMTQTVHWSSCWKSNKEELYIIGYACVALDDAGIFFFLYRTRAACICLKKQ LLIGFEYDYVSSVYLLKLSFWQILCHAFV >gi|222159224|gb|ACAB01000135.1| GENE 25 21334 - 23247 2396 637 aa, chain + ## HITS:1 COG:TP0216 KEGG:ns NR:ns ## COG: TP0216 COG0443 # Protein_GI_number: 15639209 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Treponema pallidum # 1 637 1 630 635 682 61.0 0 MGKIIGIDLGTTNSCVAVFEGNEPVVIANSEGKRTTPSVVAFVDGGERKVGDPAKRQAIT NPTRTIFSIKRFMGENWDQVQKEVTRVPYKVVKGDNNTPRVDIDGRLYTPQEISAMILQK MKKTAEDYLGQEVTEAVITVPAYFSDSQRQATKEAGQIAGLEVKRIVNEPTAAALAYGLD KAHKDMKIAVFDLGGGTFDISILEFGGGVFEVLSTNGDTHLGGDDFDQVIINWLVQEFKN DEGADLTQDPMALQRLKEAAEKAKIELSSSTSTEINLPYIMPVGGVPKHLVKTLTRAKFE SLAHELIQACLEPCKKAMSDAGLNNADIDEVILVGGSSRIPAVQKLVEDFFGKAPSKGVN PDEVVAIGAAVQGAVLTDEIKGVVLLDVTPLSMGIETLGGVMTKLIDANTTIPARKSETF STAADNQSEVTIHVLQGERPMAAQNKSIGQFNLSGIAPARRGVPQIEVTFDIDANGILKV SAKDKATGKEQAIRIEASSGLSKEEIEKMKAEAEANAEADKKEREKIDKLNQADSVIFQT ENQLKELGDKLPADKKAPIEAALQKLKDAHKAQDLAAIDTAMAEINTAFQAASAEMYAQG GAQGGAQAGPDMNGDAGQQDNSKHGDNVQDADFEEVK >gi|222159224|gb|ACAB01000135.1| GENE 26 23447 - 23998 357 183 aa, chain + ## HITS:1 COG:no KEGG:BF1226 NR:ns ## KEGG: BF1226 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 7 183 7 183 183 172 48.0 4e-42 MGAAKFEITRKCKICGEPFVAKTITSWYCSPRCSKVAFKRRKDEKKRNERLDAIAKAIPK DQEYITVPEAYSMFGISKETLYRQIRKGVIPSVNVGQRQTRVSKETLMEMYPLRKSLLKK QPKPIPKLYSLEPKDCYTIQQACEKYHMNDSTLYLQIRKFSIPTRQIGNFVYVPKKEIDN LYK >gi|222159224|gb|ACAB01000135.1| GENE 27 24007 - 25266 658 419 aa, chain + ## HITS:1 COG:NMA0588 KEGG:ns NR:ns ## COG: NMA0588 COG4973 # Protein_GI_number: 15793579 # Func_class: L Replication, recombination 
and repair # Function: Site-specific recombinase XerC # Organism: Neisseria meningitidis Z2491 # 223 399 96 284 305 66 28.0 9e-11 MKKSLSHTRVSVKLRKAEFRKEWYLYIESYPVISKNGGKPQRVREYVNRTITTPIWDKSR TARTASDGSVTYKPKRDVNGVILCKSELDQESCIYADKVRAIRQKEYDTAGLYSESEASL LEQKEKSQCDFIIYFKSIIDIRHRNSSDSIIVNWNRVYALMLIFTDNQPMIFANIDTKLV EDFRLFLLDAPQGGNKKGTISQNTASTYFAIFKAGLKQAFVDGYFVSDIAAKIKGIKEKE ARREHLTLEELNRLVCTPCDNATIKRAALFSALTGLRHCDIMKMTWKELTKEGTHYRINF DQKKTKGVEYMPISEQAYDLCGEPGHPDQLVFEGLPAPSWISKPLARWIEASGITKHITF HCFRHTYATLQLSNGTDLFTVSKMLGHTNVRTTQRYTKVVDEKKENAADAIKIDLAAIK Prediction of potential genes in microbial genomes Time: Wed May 18 04:12:34 2011 Seq name: gi|222159223|gb|ACAB01000136.1| Bacteroides sp. D1 cont1.136, whole genome shotgun sequence Length of sequence - 32272 bp Number of predicted genes - 29, with homology - 29 Number of transcription units - 10, operones - 8 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 479 206 ## gi|262407277|ref|ZP_06083825.1| predicted protein 2 1 Op 2 . + CDS 482 - 880 144 ## gi|237713492|ref|ZP_04543973.1| predicted protein + Term 881 - 918 -0.3 + Prom 910 - 969 3.5 3 2 Op 1 . + CDS 1066 - 1425 396 ## BT_4618 hypothetical protein 4 2 Op 2 . + CDS 1412 - 2614 620 ## BDI_2140 hypothetical protein + Prom 2678 - 2737 1.5 5 3 Tu 1 . + CDS 2765 - 3631 579 ## BVU_1505 DNA primase + Term 3718 - 3763 4.1 6 4 Op 1 . + CDS 3840 - 4235 352 ## gi|237713496|ref|ZP_04543977.1| predicted protein 7 4 Op 2 . + CDS 4256 - 5437 647 ## PG0868 mobilization protein 8 4 Op 3 . + CDS 5463 - 6320 434 ## gi|237713498|ref|ZP_04543979.1| predicted protein + Term 6365 - 6399 -0.1 + Prom 6616 - 6675 4.4 9 5 Op 1 . + CDS 6703 - 7923 371 ## gi|237713499|ref|ZP_04543980.1| predicted protein 10 5 Op 2 . + CDS 7936 - 9006 622 ## HRM2_29130 hypothetical protein 11 5 Op 3 . + CDS 9003 - 13121 1416 ## COG0514 Superfamily II DNA helicase 12 5 Op 4 . + CDS 13118 - 13516 239 ## gi|237713502|ref|ZP_04543983.1| predicted protein 13 5 Op 5 . 
+ CDS 13541 - 14791 700 ## COG4804 Uncharacterized conserved protein + Term 14926 - 14969 8.1 - Term 14912 - 14956 4.1 14 6 Tu 1 . - CDS 15000 - 17246 1738 ## COG0475 Kef-type K+ transport systems, membrane components - Term 17279 - 17327 4.7 15 7 Op 1 21/0.000 - CDS 17374 - 18390 789 ## COG0306 Phosphate/sulphate permeases 16 7 Op 2 . - CDS 18406 - 19053 537 ## COG1392 Phosphate transport regulator (distant homolog of PhoU) 17 7 Op 3 . - CDS 19131 - 19766 317 ## COG0586 Uncharacterized membrane-associated protein - Prom 19793 - 19852 3.8 - Term 19854 - 19896 0.3 18 8 Op 1 . - CDS 19915 - 20346 150 ## BT_4640 hypothetical protein - Prom 20379 - 20438 8.2 19 8 Op 2 . - CDS 20446 - 21267 257 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Prom 21310 - 21369 6.7 20 9 Op 1 . + CDS 21435 - 22694 1003 ## BT_4642 hypothetical protein 21 9 Op 2 6/0.000 + CDS 22745 - 23296 316 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 22 9 Op 3 . + CDS 23296 - 24150 530 ## COG3712 Fe2+-dicitrate sensor, membrane component 23 9 Op 4 . + CDS 24172 - 25716 904 ## BT_4645 hypothetical protein 24 9 Op 5 . + CDS 25778 - 25954 253 ## BT_4646 hypothetical protein + Prom 26033 - 26092 2.6 25 10 Op 1 . + CDS 26259 - 26798 506 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 26 10 Op 2 . + CDS 26803 - 27147 365 ## BT_4648 hypothetical protein 27 10 Op 3 . + CDS 27181 - 27744 637 ## BT_4649 hypothetical protein 28 10 Op 4 . + CDS 27793 - 30546 2079 ## BT_3977 hypothetical protein + Prom 30550 - 30609 2.2 29 10 Op 5 . + CDS 30636 - 32207 960 ## COG0038 Chloride channel protein EriC Predicted protein(s) >gi|222159223|gb|ACAB01000136.1| GENE 1 3 - 479 206 158 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262407277|ref|ZP_06083825.1| ## NR: gi|262407277|ref|ZP_06083825.1| predicted protein [Bacteroides sp. 
2_1_22] # 1 158 51 208 208 287 100.0 2e-76 DDVEDLVRQDGRTLYECATDLKKEVDKFVSLDLSAWNALDFINMEQSHLKEYKERWDAAK DKATNLWREYQTESNRLDMMDFNSEDFNTLNAQCDNTKLAYDEAHKQGEILYSIYRQEQL KCGQVHYFEMQFLELLIRKISKLVDVILKNGEHLEKEV >gi|222159223|gb|ACAB01000136.1| GENE 2 482 - 880 144 132 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713492|ref|ZP_04543973.1| ## NR: gi|237713492|ref|ZP_04543973.1| predicted protein [Bacteroides sp. D1] # 1 132 1 132 132 261 100.0 8e-69 MIDNSLILKEIAQLRDIVNLGVCVGVYQTCNGKQFKHMPASDFINFLNLKLDKAKVHPLP RQKQRICYMLFAVSHTIALSDSPKHWIESMLELCDISMEYYDKHHKDFLSVGVSEKNKEY KEIIDESIKRSY >gi|222159223|gb|ACAB01000136.1| GENE 3 1066 - 1425 396 119 aa, chain + ## HITS:1 COG:no KEGG:BT_4618 NR:ns ## KEGG: BT_4618 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 118 7 117 117 80 35.0 1e-14 MQQENITFNDMPQMLAVMYAKLNELGDKVDKLIPPKKSEEQQWFNVADLIEFLPTHPAEQ TIYGWTSARKIPFHKKGKNIIFNKAEIEEWLKSGTYRKSEADLENEAMAFINNKRYGRK >gi|222159223|gb|ACAB01000136.1| GENE 4 1412 - 2614 620 400 aa, chain + ## HITS:1 COG:no KEGG:BDI_2140 NR:ns ## KEGG: BDI_2140 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 5 398 20 413 413 548 67.0 1e-154 MEGNKEEYIRVGTCLYKIAQQPLANGTCTLRRIPWSFGTIRQDYGKNNTPPIRKYDGFCT VPSHTNYHKEIGGFYNLYEPIDHIPSEGEFPDIMKLIHHIFGEQYELGMDYMQLLYAKPT QKLPILLLVSEERNTGKTTFLNFLKAVFEDNVTFNTNEDFRSQFNADWAGKLLIVVDEVL LCRREDSERLKNLSTAQTYKVEAKGKDRQEVNFFAKFVLCSNNELFPVIIDTGETRYWVR KIMPLESDDTNFLQKLKAQIPAFLYYLQHRVLYSTKESRMWFNPTLIHTDALERIMQCNR NHTEIDLVELLRSIMECQKVDKVSFIPQDLLPLLSINGVKVELWHIRKVVKELWRLKPAP NALSYTTYQYDYSKPTKFGAVSRVGRYYTVTKEFIESLNI >gi|222159223|gb|ACAB01000136.1| GENE 5 2765 - 3631 579 288 aa, chain + ## HITS:1 COG:no KEGG:BVU_1505 NR:ns ## KEGG: BVU_1505 # Name: not_defined # Def: DNA primase # Organism: B.vulgatus # Pathway: not_defined # 1 276 1 283 287 236 44.0 5e-61 MNSLDNKNIQIADFLAEKGYLPISKKGANWWYLSPLHNENTASFKVNVDKNVWYDFGLGK GGGLATLVNLLYHPNNFQDYLHHLSGIRTSFPSTPKTTREGSETFSNVEVKSLANSALLK 
YLGKRGIAQQVASQYCKEVHYQNRDKSYFAVGFPNRSGGYEIRNAYFKGCISPKDISVIS KGNKDCHVFEGFIDFLSYVVLHGDCDAIVLNSVINVPKSIDYLNRYDTVYCHLDNDKAGH DATEQIRILCKGNVIDASEEYGEAKDLNEFLCKRMNSQGQVLSCGFKR >gi|222159223|gb|ACAB01000136.1| GENE 6 3840 - 4235 352 131 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713496|ref|ZP_04543977.1| ## NR: gi|237713496|ref|ZP_04543977.1| predicted protein [Bacteroides sp. D1] # 1 131 1 131 131 234 100.0 1e-60 MAKHVINNPGGRKPLPEEEKYGKPIPVRFTPKQFVRIKRKAGEMPISTYIREGALHAIVR QPVSKELMKEIRDLNNLGTNINTLVKLAHQAGLMTIADKANEALDGVNAILHQARLKIKE KEDNDESENQS >gi|222159223|gb|ACAB01000136.1| GENE 7 4256 - 5437 647 393 aa, chain + ## HITS:1 COG:no KEGG:PG0868 NR:ns ## KEGG: PG0868 # Name: not_defined # Def: mobilization protein # Organism: P.gingivalis # Pathway: not_defined # 1 244 1 235 307 182 38.0 1e-44 MYGKILHGASFGGLINYINDPRKNATLVASSDGVNLTNNQTITDSFVMQAGLSARTKKPV GHFILSFSPHDELRINDRMLEQIVNDYLKHMGYDDNQFVAFRHFDKEHPHVHIMVNRVNF KGKCTKDSHEKDKNIKVCKELTEQYGLYIARGKEAIKERRLRSMDAIRYQMLHRVSESLQ VSRNWKEFENELSKAGIRLRFRYNTKTNGIEGISFTLAKENISSKMKHDISYSGKQLDSS LTLACICKKLGNPIAIVHEQARDMYDDARQEWYDTHNAYEVRDIDRVFPDFDLRFPWQAQ ARMYDAPQMCGNEQTLGQDFFIELYQAADDVSDIGKGVIQVGLASLGAILFPPYQPAISA GGGNSSSKLGWGDDDKYKKKYSSRSRSFGRGRR >gi|222159223|gb|ACAB01000136.1| GENE 8 5463 - 6320 434 285 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713498|ref|ZP_04543979.1| ## NR: gi|237713498|ref|ZP_04543979.1| predicted protein [Bacteroides sp. D1] # 1 285 43 327 327 534 100.0 1e-150 MAKSNEQLEDILDGLSVIKNNNAATEQRLSNIENLLAKLATKEEDVPITPVASTSATSIP YDEIKNAVHEEMDSYFNAMSDSAELLSDKTLKRLGTIFIELYVEEIEKYLEKDEKEIECK RNVYLQKRKAQGLMTIEQVAEWAPQYSLEIQRTIRYIGMKIIDENESVEKAHAILKIWGD ALQTITSPKSSPPPTLKSWWLYRWNSFKQRTDKWGLLQWYLVILGIIAYVLFSSLYQSRV MDLDRTNRIFYKKVIMDEKRKRNYQELDSLIHSDSFFKTYWGLNN >gi|222159223|gb|ACAB01000136.1| GENE 9 6703 - 7923 371 406 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713499|ref|ZP_04543980.1| ## NR: gi|237713499|ref|ZP_04543980.1| predicted protein [Bacteroides sp. 
D1] # 1 406 1 406 406 798 100.0 0 MGIFDFLNNKKKEKARQEQLRLQEEKRRAEEQRRLAERRKQEEQQRREESFLSNFEFDST CHQRYENGQPVRGLQVCPRYIKIKKNINGCSGYQLTPGDGYILTATNGDTGQPQFAPKPM RVVKFSDSEILLKGYCVSAQTPFGWQEIDLSDYGFSIILEKNVVKKCILFLYDRNVKLEY MVGSKTTENSANNTACRMVETESLVVEALKQLSIGNNGDETYHPLYKSWRSYKDNPEQLK NIKDFGHYGMGLMIFLSYGTISDIDDRQQLASLAYLFISKAIKQNSANANLFKNRLLLMI TNHEAFEYTVSSVVNKDQDFFSMNLMPFQARDAMFKMEYADLSFNRALLSIDILASKYQD LQTKINSGFFGKESTNESIISSGKSLHEQVLTYLEHKVLDEGDIDF >gi|222159223|gb|ACAB01000136.1| GENE 10 7936 - 9006 622 356 aa, chain + ## HITS:1 COG:no KEGG:HRM2_29130 NR:ns ## KEGG: HRM2_29130 # Name: not_defined # Def: hypothetical protein # Organism: D.autotrophicum # Pathway: not_defined # 4 340 5 326 339 308 46.0 1e-82 MATWAKQKLPRENGELVDMQVPILVSASRSTDIPAFYADWFFYRLDKAGYSAWTNPFNGV KSYVSYKDTRFIVFWSKNPKPLLRHLPILKQRGIGCYIQYSLNDYEEEGVEKNVPPIEQR IETFKQLTNALGKGHVIWRFDPLVLTDNINIDKLLDKVENIGNQLLGYTEKLVFSFADIL SYRKVQANLQKANINYIDWQENQMREFARRLSVLNQKWGYELATCGEKIDLTEFGVKKNH CVDDELIIKLAYDDKKLMDYLKVKFYPMPQKSLFGDSEPLPDSAIILPNGMYALHGDNKD KGQRAFCGCIKSKDIGEYNTCVHGCEYCYANASKQAAVMNYKCHKENPWSETITGK >gi|222159223|gb|ACAB01000136.1| GENE 11 9003 - 13121 1416 1372 aa, chain + ## HITS:1 COG:L0268 KEGG:ns NR:ns ## COG: L0268 COG0514 # Protein_GI_number: 15673790 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Lactococcus lactis # 514 938 7 341 592 170 30.0 2e-41 MSYIDDIHRIDNLTINEIIRISNKYVPNRYKLEPWRYMDGQGRTLEHGTAVLETEEQCCA YMSAYGPMHRHKLMRALDEKEFPYSDLTGGIEIYDWGCGQGIGTMTVIEKLRQHNMLRKL CKVVLEEPSNVARDRAVIHVKKALEDYNADVIAVSKYLPSDNGDNSNCITSINVEQPTAI HIFSNILDIEAVSLKGVSKMITSSGQKHIVLCIGPANLNESRILSFRNYFVENHIHIFTK FRETNFGLHPNRKAYGCLINSFSYSLSQTSEILHEYKYFAPIQYFASFTDTITKSSLKES AFEILAPFDITAHKNLCPVYALMSNLISRGCPTLASKKILDYVSNMPDTQKAKSLNAIAR IQKTFIEAIISERLDTNKLEWKILIIEDNTSVAKISLEDFSETYHNLIAMTKDYDDMVLP KLSVCIKDNASSGIEYDAIIDISIDQLCNAKEVTFSRFKVLNDCYFIVRSSEKVYEDRIL YTTERINYKPFVEKDSYGIYHKIEDNCSRLRYFLNLIFRKEDFRPGQLPILNRALQLKSV 
IGLLPTGGGKSLTYQLAAMLQPSVTLVVDPLKGLMKDQYDGLLNIGIDCISYINSDITKN YEEGRKRELALTGSQVQIMFLSPERLSIHRFRDVLRSMRESSVYFSYGVIDEVHCVSEWG HDFRLAYLHLGRNLFNYVLPKEVEGEDNHISLFGLTATASFDVLADVERELSGPNAYSLE DDATVRYENTNRLELQYNVYEVDAKDAPNAWAVGDIKEEKLLEVISDATQKIVDIQNNSA VKYIKERFVERENLIDSTIIDKIHHTNLGVDVENEWYNHTNTNTAGIVFCTRASDRANLS VPTVEASLRNHGLRSISTYKGGDDTACQDSFLKGRTNIMVATKAFGMGIDKSNVRMTFHL NYPGSLESFVQEAGRAGRDKKMALATIMYSPKKFWVKNVKTDKWQEFSADYTNNKFFYDS NFLGEEFELFVMELLMNNLQVKISNEEFYDVDKPCIANSQGIVKFIKRYEKGHILTYYIS YEEDEKVLDGYNQYLSTKNMPVFNTYNARNLKNSRGYSYIRDYGSAVYKDAIQKAIYRMC VIGLIDDFTEDYSKRTFRITTICQDESEYFEHLRLYYRKYYSTERVESMMGEVRKLANTE GVIMACLKHLTSFIYKSIADKRARGILDMEQFCNMAISSKKDWKETNEELKDFIYYYFNS KYAREGFVTYDSNLQQDVPFSLKDDTSHDIYSEDKITSFELVRKYMRVVDAEIVNNDSQM DNIKHLQGAVRLIRRAIAEMNPVLNLLNVFCILYLGQEANEMLEDELYNDYKAVYEQYMD EGKSALIDEFTQLLIKHAALKDEEYINKIQLAIQLEEHVKAFSNIKNKYTEK >gi|222159223|gb|ACAB01000136.1| GENE 12 13118 - 13516 239 132 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713502|ref|ZP_04543983.1| ## NR: gi|237713502|ref|ZP_04543983.1| predicted protein [Bacteroides sp. 
D1] # 1 132 1 132 132 209 100.0 5e-53 MIDQRINDSLKELEQGLKNIDSARKQVEMTINSYDGLHISTSEYVAQLGNLTTKVKELVT AIGTDYNQKATAFDKDRKVIVDSANSAIHELSNSTETFKNSLNNVENKLKYSLIVNITLF VILGIIVFLTSN >gi|222159223|gb|ACAB01000136.1| GENE 13 13541 - 14791 700 416 aa, chain + ## HITS:1 COG:STM3332 KEGG:ns NR:ns ## COG: STM3332 COG4804 # Protein_GI_number: 16766627 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Salmonella typhimurium LT2 # 30 393 27 356 367 127 26.0 3e-29 MADNNKPYFVTSKNFHGDNDYVEWLKEIKSRYQAIRNRVVMQSNYGALEFNWLLGRDIVQ KRAESRWGSGTVNQLCLDLKAAYPDVKGFSVRNLYYMKEWYEFYMADEEHKEILHQVGAK LQEAENQNPIKLHQLGAEIVSADKISAILAEGGMLPVFGIVPWKHHLTLISKCKSIKEAF YYMARVIDEGLSKRELEDVIDNDDFSKYGNALTNFSSQLSTSQSLLAMNVLKDPYRLDFV TLERGYDEHDLESAIAKDITRFLLELGSGFTYVGRQPEFVVGNEGYFPDLLFYHIRLRCY VVIELKVVDFKPEFAGKLNFYVTACNKLLRQPEDNPTIGLLLCKSRDQTKVEWAFDSIQN PMGVATYEGIKIKDKLPSVEDLKKRLDMVEQELREYKENEEASAKDGNYTSNQETK >gi|222159223|gb|ACAB01000136.1| GENE 14 15000 - 17246 1738 748 aa, chain - ## HITS:1 COG:PA5529 KEGG:ns NR:ns ## COG: PA5529 COG0475 # Protein_GI_number: 15600722 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, membrane components # Organism: Pseudomonas aeruginosa # 3 429 2 427 585 352 43.0 1e-96 MSHLPTLIADLALILISASVITLLFKWMKQPLVLGYIIAGLLAGPYINIFPTVGDIENIN IWAEIGVIFLLFALGLEFSFKKLMNVGSTAFITATTEVVSMLLIGFLVGQLLGWGTMNSI FLGGMLSMSSTTIIIKAFDDLGLRNQRFTGIVFGTLVVEDLIAILMMVLLSTMAVSQDFV GEDLLISVLKVVFFLILWFLIGIFVIPAFLKKAKKLMNNETLLIVSIGLCLGMVVLATYT GFSTALGAFIMGSILAETIEAEHIEHIIQPVKDLFGAIFFVSVGMLVNPAVLVEYAWPVI IITLVTIIGKAIFSSFGVLLSGEPLNTSIKSGFSLAQIGEFAFIIAGLGVSLKVLDPFIS PIIVAVSVITTFTTPYFIRLANPFAEWLYKILPTKVQETLDRYASGKKTMNHDSDWKKLL KNMIGRVIIYSVLLTAIWLLSIQIIYPAISEMFAPVTLWINLVMCLGTLLLMTPFLWALI SNKYNSSELFLKLWRDENYNHGRLVSLVLGRVSVALFFITSVVISYFKLNWGISVIIAIA VVALILILREDLTQYSRLETRFLANLNRREEAAKKRHPLKTSFNSEFNDKDLDLTSVVVS PYSDYIGKSLGELSFSQNFGVNVVAITRGDLNIYIPKSSEPIYPQDKLIVVGTDMQLQEF RNRIEDVKNTKDTDAVDKKMTLHSFTVDDEFRFLNKTIAQSHLGEKYDGIVVAIERNNEL 
IPLDKDTAFQLGDLVWIVGNREKIREIL >gi|222159223|gb|ACAB01000136.1| GENE 15 17374 - 18390 789 338 aa, chain - ## HITS:1 COG:RSc1313 KEGG:ns NR:ns ## COG: RSc1313 COG0306 # Protein_GI_number: 17546032 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate/sulphate permeases # Organism: Ralstonia solanacearum # 12 333 19 331 336 256 46.0 6e-68 MELLVTIIILALIFDYINGFHDAANSIATIVSTRVLTPFQAVLWAAFFNFVAFFIAKYII GGFGIANTVSKTVVEQYITLPIILAGVIAAITWNLVTWWKGIPSSSSHTLIGGFAGAAIM ANGFEAIQLNIILKIAAFIFLAPFIGMVIAFVFTLFVLYICRRAHPHTAEVWFKRLQLVS SALFSVGHGLNDSQKVMGIIAAAMIAAHSMGLGMGINSINDLPDWVAFSCFTAISLGTMS GGWKIVKTMGTKITKVTPLEGVIAETAGAFTLYITEMLKIPVSTTHTITGAIIGVGATKR LSAVRWGITKSLMTAWILTIPVSGLLAAAIYYIVSLFL >gi|222159223|gb|ACAB01000136.1| GENE 16 18406 - 19053 537 215 aa, chain - ## HITS:1 COG:CAC3094 KEGG:ns NR:ns ## COG: CAC3094 COG1392 # Protein_GI_number: 15896345 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate transport regulator (distant homolog of PhoU) # Organism: Clostridium acetobutylicum # 9 215 4 210 210 103 32.0 2e-22 MKNSFFSRFTPKEPKFFPLLKQLSEVLCEASVVLTESLQHDSPTERADYYKKIKELEREG DKLTHRIFDELGTTFITPFDREDIHDLASCMDDVIDGINSCAKRISIYNPRPISENGKEL SRLIQEEATYICKAMDELEIFRKKPTLLREYCSRLHEIENQADDVYEFFITRLFEEEKDC IELIKIKEIMHELEKTTDAAEHVGKILKNLIVKYA >gi|222159223|gb|ACAB01000136.1| GENE 17 19131 - 19766 317 211 aa, chain - ## HITS:1 COG:STM2367 KEGG:ns NR:ns ## COG: STM2367 COG0586 # Protein_GI_number: 16765694 # Func_class: S Function unknown # Function: Uncharacterized membrane-associated protein # Organism: Salmonella typhimurium LT2 # 3 208 6 211 219 263 64.0 2e-70 MDFLLDFILHIDQYMVMIVRDYHAWTYAILFFIIFCETGLVVTPFLPGDSLLFVAGAISA LPDMPISVHILVIILFAAAVLGDSCNYMIGHFFGRKLFNNPNSKIFKQSHLEKTHEFYKK YGGKTIILARFVPIVRTFAPFVAGMGKMNYYYFMMYNLAGGAAWVGIFCYAGYFFGDLPF VQENLKLLIVAIIFISILPAIIEVVRAKLKS >gi|222159223|gb|ACAB01000136.1| GENE 18 19915 - 20346 150 143 aa, chain - ## HITS:1 COG:no KEGG:BT_4640 NR:ns ## KEGG: BT_4640 # Name: not_defined # Def: hypothetical protein 
# Organism: B.thetaiotaomicron # Pathway: not_defined # 1 143 1 146 146 132 51.0 3e-30 MMKYIFCILLGVFLNWGFHLISDEAQEAYRLETGSTSIQTYTNSYTSKTSIIQLDKEQKL VKTKEYTGKQDYNNDNGVLNPFSFRHLSPLKILRFNIPSVTIRILSSLKIQLPQNQWTGF SYYTNLFKYSDRYHIYSLGHILI >gi|222159223|gb|ACAB01000136.1| GENE 19 20446 - 21267 257 273 aa, chain - ## HITS:1 COG:FN1295 KEGG:ns NR:ns ## COG: FN1295 COG0454 # Protein_GI_number: 19704630 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Fusobacterium nucleatum # 16 139 3 126 135 106 41.0 4e-23 MIRKLDKTEYALAASLALEVYIQCGAEDFDEEGLNSFKSFISSEQLMNELVIYGAFEDKN LVGIMGTKHEGKHLSLFFIRKKYQCKGIGKQLFCFAINDCPVDEMTVNSSTYAIRFYQSL GFEKTKEKLCTNGITYTPMIFKRTVRISSIAPCGMDCALCYAFQDVKKPCPGCRTQTGKI RESCQNCIIFSCDKKKYYCFECTNFPCKRLKALDARYQNKYKMSMIMNLTFIKEQGEENF LIWQNHKYTCPKCGKLRTVHYDYCIHCKQQKLT >gi|222159223|gb|ACAB01000136.1| GENE 20 21435 - 22694 1003 419 aa, chain + ## HITS:1 COG:no KEGG:BT_4642 NR:ns ## KEGG: BT_4642 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 419 1 418 418 628 74.0 1e-178 MKNVEKHLVGWLSVWLLLCSFPVWAQGDAGDYLTIVGMVKDKQNKKALENVNVSVHGSNI GTVTNAEGEFALKIKKTETPRELDISHIGYVNTHISLEKETAAMLTVWLTPHSNLLNEVV VFAENPRIIVEKAINKIPLNYSDKRDMLTGFYRETVQKGRRYIGISEAVLDVSKTAYTNR NINYDKVRVEKGRRLLSQKASDTLAVKVVGGPNLGVTLDVVKNKGALLDFEELNNYEFWM AESMLIDNRMQYVINFRPKVILMYALLYGKLYIDRERLSFTRIEMSLDMQNKSKATTAIL YKKPLGLRFKPQELSYLVTYKDVDGKTYLNYIRNTIRFKCDWKRKLFSTSYTVASEMVVT DRKEGVLENIPNKEAFSLNQIFYDKVDEYWNPDFWGNYNIIEPTESLEHAVDKLKKQSN >gi|222159223|gb|ACAB01000136.1| GENE 21 22745 - 23296 316 183 aa, chain + ## HITS:1 COG:PA0149 KEGG:ns NR:ns ## COG: PA0149 COG1595 # Protein_GI_number: 15595347 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 39 169 33 161 181 67 32.0 2e-11 MLDELLILKKIKEGDIKAFEELFRRYYFPLCCYAAGITGQMAVAEEIVEELFYVLWKERE 
RLQIFQSVKSYLYRATRNQSLQYCEHEEVRNRYREAVLNTSNPEQSTDPHQQMEYEELQK FINNTLEKLPVRRRQIFEMHRLEGRKYVEIATQLSLSVKTVEAEMTKALRTLREEVETYI HMK >gi|222159223|gb|ACAB01000136.1| GENE 22 23296 - 24150 530 284 aa, chain + ## HITS:1 COG:SMc04204 KEGG:ns NR:ns ## COG: SMc04204 COG3712 # Protein_GI_number: 15965785 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Sinorhizobium meliloti # 84 201 152 271 354 60 35.0 4e-09 MKTEIYKIKTDQAWNRLYNRLDNDRLLAKVDEGRSIRKHSQWIRYGAVAAVLIGVVWGAL YWMTGSEREPAQNFLTQENQGISTLATTLEDGSVILLAKETSLLYPKHFIADKREVSLQG NAFFDVAKKQGQPFWIDTEQAKIEVLGTAFSVQSDEHAPFRLSVQRGIVKVTLKKGNQEC YVKAGEAVVVQSQRLVVLDADKENEEWGSFFKHIRFKDESLANILKVMNLNSDSLQIQVA SPALEERRLTVEFSDESSEVVATLIASALGLQCVRQGDILLLSE >gi|222159223|gb|ACAB01000136.1| GENE 23 24172 - 25716 904 514 aa, chain + ## HITS:1 COG:no KEGG:BT_4645 NR:ns ## KEGG: BT_4645 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 513 1 512 513 741 71.0 0 MKQAYRYLSILFTLFFVVDTMRADGGDVLERIVRLPKTKGTVYSLLGNVSQQSGYMFIYD SKVIDNDVVVKIKGGERSIRQAVYDIVGDTGLEFQVIGTHILITLPSEKKIYIQQDSIPK QPMNFSITGTLLDKETGVPISSATVGVRGASIGSITNQNGDFKLSLPDSLKNDSITFSHI GYLSQDIEFALLIGRHNILSLEPKVVSLQEVVIRRSEPKKLLREMIERREQNYSHTPVYL TTFYREGVQLKNKFQNLSEAVFKVYKTSSHSAVPDQVKLLKMSRLSNVEAKDSLLVKVKS GIQACIQMDIIKDMPEFLIPNIENSIYTYTSEGVTFLEDRFVNVVHFEQKKGISEPLFCG ELFLDSETSALLQARLEVHPVYVKNAAGMFVERRARNIRMIPQKVVYTISYKPWQGTYYI HHIRGDLHFKVKRTKMLFGSRDLHIWFEMITCKVETGQVVAFPRTERLPTRTIFSDTYFK YDENFWKDFNVIPLEEEISKLIEKISLKIEEIGD >gi|222159223|gb|ACAB01000136.1| GENE 24 25778 - 25954 253 58 aa, chain + ## HITS:1 COG:no KEGG:BT_4646 NR:ns ## KEGG: BT_4646 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 58 1 58 58 100 86.0 1e-20 MELPKDPMMLFSVINMKLRDCYSSLDELCEDMNVNKDELVNQLKAAGFEYSAEHNKFW >gi|222159223|gb|ACAB01000136.1| GENE 25 26259 - 26798 506 179 aa, chain + ## HITS:1 COG:VC2467 KEGG:ns 
NR:ns ## COG: VC2467 COG1595 # Protein_GI_number: 15642463 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Vibrio cholerae # 9 176 12 186 190 81 31.0 7e-16 MINENKIREACASNRERGFKMLMDSFQMPIYNYIRRLVVSHEDAEDVLQEVFIRVFRHID QFREESSLSTWIYRIATNESLRLLNGRKDEGVVSAEDVQEELMGKLKASDYIDYENELAV KFQEAILSLPEKQRLVFNLRYYDELEYEEIARVLDSKVDTLKVNYHYAKEKIKEYILNR >gi|222159223|gb|ACAB01000136.1| GENE 26 26803 - 27147 365 114 aa, chain + ## HITS:1 COG:no KEGG:BT_4648 NR:ns ## KEGG: BT_4648 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 114 1 119 119 120 65.0 1e-26 MKKDFDFDDIGKRTPYRTPDGFFEDVQRKVMERAGVKQQRKSHTKLIISTVITIAAVWVG FLFVPSLRQADEVTTSSSKVLANGTEPVDKWIKELSDEELEELVSFSENDIFLN >gi|222159223|gb|ACAB01000136.1| GENE 27 27181 - 27744 637 187 aa, chain + ## HITS:1 COG:no KEGG:BT_4649 NR:ns ## KEGG: BT_4649 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 184 1 186 194 196 75.0 4e-49 MKTKFIYVILAVLLMGSQMTLSAQNTDNKQKKQRPTPEQMVQMQAKQIVNTLMLDDATAA KFTPVYEKYLKELRECRMMTHKARTEKTKAQGTDANAKKERPSMTDDEIATMLRNQFTQS RKMLDIREKYYNEFSKILSQKQILKIYQQEKMNANKFRKEFDRRKGQKPGQGHHQGQRPR APRQGQK >gi|222159223|gb|ACAB01000136.1| GENE 28 27793 - 30546 2079 917 aa, chain + ## HITS:1 COG:no KEGG:BT_3977 NR:ns ## KEGG: BT_3977 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 916 3 909 937 617 39.0 1e-175 MNIGSHIYSFQYRLITCYVLLLISLTVFAQSGKERFSGRVIDTETNQPVPFATVRLLALP DSTLLGGGATDAIGKFQLSVSVPTATKAKSLLLQVSYIGYKPVFHLISISRKSTSYELGN INLTPESYALDEAVVVGQAPMAVTEGDTTVFNASAYRTPEGSMLEELVKQLPGSEIDADG KLLIHGKEVKKILVDGKEFFSDDPKAALKNLPVEMVEKLKAYERQSDLARLTGIDDGEEE MILDLSVKKDMKLGWMENFMGGYGSKDRYELANTLNRFRENSQLTIIGNLNNTNNQGFSE LQNESSNATGNIRNRMGLTTSRSLGLNATHDWKRIKLRSNLQYVGTDRLEDSRTTVDNFL REDRSINLGTNNSRLQNHELVANAFLEWKMDSVTTLIFRPQYRFSANDRENSGFQEGWGN 
DVLLNERESSGTNHNSRYNLTMMLQLSRKLSRLGRNVALKVDYGTNASATDRKSLSTTRY FKNNTKKIQNQKIEDDVDGYNYRVQLVYVEPLPWRHFLQLRYSYQYRVNNSDRFVYDWDK ELEEFAPDFDEDSSNRFENQYSNHLVNLAIRTSQKKYNYNIGVDFEPQKSVSHSLLSDAP EDQLERSVMNFSPTVNFRYKFSKRTRLQIVYRGKGKQPSVRDLQPVTDRTNPLNIRVGNP SLKPSYTNTFTLNFNSYNTKHQRNMVATALFENTINNVTNQVTYDSETGVRTTSPVNMNG NWRAMGSFSLNTPFKNRSWIFRTYSYLQYRNQNGYTTLNKEEPVKTSVQHLTGRERLRLT YRTRQMELTGLVGLTYNNSYNDVREKRTETFDYQAGTQLQLYLPWGMELYNDLTYSLRTG YGYEGYAKENFMWNCQLSKAFLKKKQLLIRFKIYDILHQDISLIRTITATAIRDTDYNAL GSYFMVHAILRLNMMGR >gi|222159223|gb|ACAB01000136.1| GENE 29 30636 - 32207 960 523 aa, chain + ## HITS:1 COG:FN1727 KEGG:ns NR:ns ## COG: FN1727 COG0038 # Protein_GI_number: 19705048 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Fusobacterium nucleatum # 18 522 10 520 521 261 33.0 2e-69 MFRLIKKIKDNGRWRIFKLKLIDARLYFVSIFVGLLTGLVAVPYHYLLQFFFNLRHDFFD SHPKWYWYIPLFLLMWGILVFVSWLVKKMPLITGGGIPQTRGVINGRVDYKHPFLELVAK FVGGILALSTGLSLGREGPSVQIGSYVGYLVSKWGRVLSGERKQLLSAGAGAGLAAAFAA PLASSLLVIESIERFDAPKTAITTLLAGVVAGGVASWIFPINPYFHIDAIVPGMTFWGQV KLFLLLAAVVSVFGKFFSVTTLQMKRIYPAIKHPEYVKMLYLLFIAFLISMAEFNLTGGG EQFLLSQAMHPDTHILWIVGMMLLHFVFSTFSFSSGLPGGSFIPTLVTGGLLGQIVGLIM VQQGVIAYENISYIMLICMSAFLVAVIRTPLTAIVLITEITGHLEVFYPSIVVGGLTYYF TEMLQIKPFNVILYDDMIHSPAFKEEPRYTLSVEVMSGSYLDGKIVDELRLPERCIIINV HRDRKNWPPKGQKLMPGDQVQIEMDSQDIEKLYEPLVSMANIY Prediction of potential genes in microbial genomes Time: Wed May 18 04:14:16 2011 Seq name: gi|222159222|gb|ACAB01000137.1| Bacteroides sp. D1 cont1.137, whole genome shotgun sequence Length of sequence - 7547 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 1, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 836 729 ## COG0648 Endonuclease IV - Prom 856 - 915 3.0 2 1 Op 2 . - CDS 919 - 3540 2038 ## BT_4652 hypothetical protein 3 1 Op 3 1/0.000 - CDS 3582 - 4850 1120 ## COG0738 Fucose permease 4 1 Op 4 . 
- CDS 4847 - 5818 490 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase 5 1 Op 5 . - CDS 5860 - 7488 1474 ## BT_4655 hypothetical protein Predicted protein(s) >gi|222159222|gb|ACAB01000137.1| GENE 1 3 - 836 729 277 aa, chain - ## HITS:1 COG:STM2203 KEGG:ns NR:ns ## COG: STM2203 COG0648 # Protein_GI_number: 16765533 # Func_class: L Replication, recombination and repair # Function: Endonuclease IV # Organism: Salmonella typhimurium LT2 # 1 277 1 279 285 385 68.0 1e-107 MKYIGAHVSASGGVEFAPVNAHEIGANAFALFTKNQRQWVSKPLTEESISLFKESCEKYG FQPEYILPHDSYLINLGHPEEEGLQKSRAAFLDEMQRCELLGLKLLNFHPGSSLNKISIE DCLSLIAESINIALEKTKGVTAVIENTAGQGSNLGSEFWQLKYIIDRVNDKSRVGVCLDT CHTYTAGYDIVNEYDKVFDEFDKEVGFNYLRGMHLNDSKKALGTHVDRHDSIGEGLIGKA FFERLMQDSRFDNMPLILETPDESKWKEEIAWLKSME >gi|222159222|gb|ACAB01000137.1| GENE 2 919 - 3540 2038 873 aa, chain - ## HITS:1 COG:no KEGG:BT_4652 NR:ns ## KEGG: BT_4652 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 873 1 872 872 1642 89.0 0 MMKQRISIFLLFTILLSANGYAQKGIMRLTQQTLMHEVRETPSPLDGQHITVNPPRFMWP DKFPHLGAVLDGVEEEDYKPEVTYRIRIARDPEFKSEVITAERKWAFFNPFKLFEKGKWY WQYAYVDKDGKEEWSPVSHFYIDEHIRTFNPPSLQEVLAKLPKTHPRILLDAEDWDNIIE RNKNNPEAQAYIRKADKCLNHPLKHLEEEIDTTQVVKLTNIVQYRSALIRESRKIVDREE ANIEAMLRAYLLTKNKVYYKEGIKRLSEILSWKNSKYFAGDFNRSTILSMSTSAYDAWYN LLTPDEKKLLLRTIRENGKKFYHEYVNHLENRIADNHVWQMTFRILNMAAFATYGELPMA STWVDYCYNEWVSRLPGLNTDGGWHNGDSYFQVNLRTLIEVPAFYSRISGFDFFADPWYN NNAFYVIYQQPPFSKSAGQGNSHESKLKPNGTRVGYADALARECNNPWAAAYVRTILQKE PDIMEKTFLGKSGDLTWYRCITKKALPKEGPTLAELPMAKVFNETGIGTMNTSLGDIDKN AMLSFRSSSYGSTSHALANQNAFNTFYGGKAIFYSSGHRTGFTDDHCMYSYRNTRAHNSI LVNGMTQKIGTEGYGWIPRWYEGEKIAYMAGDASNAYGKVTSPLWLKRGELSGTQYTPEK GWDENKLDMFRRHIIQLGTTGVFVIYDELEGKEAVTWSYLLHTVELPMEIQELTDEVKVI GKNKAGGVSVAHLFSSAKTEQAMVDTFFCAPTNWKNVTNAQGKALKYPNHWHFSSTTVPC KTARFLTVMDTHGKNRPDMQVVRKGNTIQVGDWTITCNLTEKGKAAIYVSNKTEKVSLNY DAGKKEGATIVTDQVKGKQVTKVLIDYLPDFEI >gi|222159222|gb|ACAB01000137.1| GENE 3 3582 - 4850 1120 422 aa, chain - ## HITS:1 
COG:BMEII1053 KEGG:ns NR:ns ## COG: BMEII1053 COG0738 # Protein_GI_number: 17989398 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Brucella melitensis # 2 402 18 407 412 139 29.0 8e-33 MRKNLGMLALIMAFWFTISFITNILGPLIPDIIHNFNLSDLAMAGFIPTSFFLAYAIMSI PAGLLIDRFGEKPVLFGGFLMPFIGTILFACMHTYPMLLASSFIIGLGMAMLQTVLNPLQ RTVGGEENYAFIAELAQFMFGIASFLSPLVYTYLIRELDPATYTAGKGFFIDLLADITPR EMPWVSLYWVFTILLLVMLIAVGVSRFPKIELKEDEKSGSKDSYLALFKQKYVWLFFLGI FCYVSTEQGTSIFMSTFLEQYHGVNPQTDGAQAVSYFWGLMTAGCLVGMILLKLIDSKRL LQISGILTIILLLSALFGSKEVSMIAFPAVGFSISMMYSIVFSLALNSASQHHGSFAGIL CSAIVGGAGGPMIVSTLADATSLRTGMLFILVFVGYITFIGFWARPLINNKTVRLKDLFK QR >gi|222159222|gb|ACAB01000137.1| GENE 4 4847 - 5818 490 323 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 7 311 6 317 319 193 37 4e-49 MEKEYAIGIDLGGTSVKYALIDNEGVFYFQGKLPSKADVSAEAVIGQLVTAINEVKAFAQ EKGYKIDGIGIGTPGIVDGTNRIVLGGAENINGWENIHLADRIETETGLPALLGNDANLM GLGETMYGAGQGATHVVFLTVGTGIGGAVVIDGKLFNGYANRGTELGHVPLIANGEPCAC GSVGCLEHYASTSALVRRFSQRIIDAGISYPNEEINGELIVRLYKQGDPIAKISLEEHCD FLGHGIAGFINIFSPQKIVIGGGLSEAGDFYIQKVSEKARSYAIPDCAVNTQIIAAALGN KAGSIGAASLVFTQLSAPNLIKL >gi|222159222|gb|ACAB01000137.1| GENE 5 5860 - 7488 1474 542 aa, chain - ## HITS:1 COG:no KEGG:BT_4655 NR:ns ## KEGG: BT_4655 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 38 542 1 505 505 1046 97.0 0 MDRRKFLKNTGWSFLGLAASGSLLGSCAAGSKEAKKIMPSASNLKMYWGDLHNHCNITYG HGDMRDAFEAAKGQLDFVSVTPHAMWPDIPGADDPRLKWVIDYHTGAFKRLREGGYEKYV KMTNEYNKEGEFLTFVGYEAHSMEHGDHVALNYDLDAPLVECTSIEDWKEKAKGHKVFVT PHHMGYQGGYRGYNWKCFTEGDITPFVEMYSRHGLAESDQGDYPYLHDMGPRQWEGTIQY GLELGNKFGIMASTDQHSGYPGSYGDGRIGVLAPSLTRDAIWEALRTRHVCAATGDKIII DFRLNDAFMGDVVRGNSRRIYLNVTGESCIDYVDIVKNGQILARMNGPLTPVAPEGDTVR CKVKVDFGWNREEKYVHWQGKLSVDKGQIHSVTPCFRGAAFTSPQEGETEFHTHVNRIVS VGDKETELDMYSSKNPNTTTAAMQAVILDVEMPKDGKVIAEFNGKKFEHTLGELLEGSRS 
HFMIGWLSEAILFNRAMPESCFTLEHYMEDKEPQRDTDYYYVRVRQRDGQWAWSSPIWAE RV Prediction of potential genes in microbial genomes Time: Wed May 18 04:14:36 2011 Seq name: gi|222159221|gb|ACAB01000138.1| Bacteroides sp. D1 cont1.138, whole genome shotgun sequence Length of sequence - 12149 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 1615 1434 ## COG3119 Arylsulfatase A and related enzymes - Term 1631 - 1696 1.9 2 2 Op 1 . - CDS 1717 - 3717 1824 ## BT_4657 heparinase III protein 3 2 Op 2 . - CDS 3777 - 5072 1088 ## BT_4658 glucuronyl hydrolase - Prom 5098 - 5157 4.3 4 3 Op 1 . - CDS 5262 - 6935 1150 ## BT_4659 hypothetical protein 5 3 Op 2 . - CDS 6952 - 10125 1881 ## BT_4660 hypothetical protein 6 3 Op 3 . - CDS 10125 - 12089 1095 ## BT_4661 hypothetical protein Predicted protein(s) >gi|222159221|gb|ACAB01000138.1| GENE 1 1 - 1615 1434 538 aa, chain - ## HITS:1 COG:PA0031 KEGG:ns NR:ns ## COG: PA0031 COG3119 # Protein_GI_number: 15595229 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 33 497 5 426 503 130 24.0 9e-30 MENYYTPSALLLPLAALSLTSCGNQKKEETRQPNIIFMMTDDHTTQAMSCYGGNLIQTPN MDRIANEGIRFDNCYAVNALSGPSRACILTGKFSHENGFTDNASTFNGDQQTFPKLLQQA GYQTAIIGKWHLISEPQGFDHWSILSGQHEQGDYYDPDFWENGKHIVEKGYATDIITDKA IEFLEGRDKNKPFCMMYHQKAPHRNWMPAPRHLGIFNNTTFPEPANLFDDYEGRGRAARE QDMSIEHTLTNDWDLKLLTREEMLKDTTNRLYSVYKRMPVEVQDKWDSVYAGRIAEYRKG DLKGKSLISWKYQQYMRDYLATVLAVDENIGRLLNYLEKIGELDNTIIVYTSDQGFFLGE HGWFDKRFMYEECQRMPLIIRYPKAIKAGSTSNAISMNVDFAPTFLDFAGIEIPSDIQGA SLKPVLVNEGKTPADWRKAAYYHYYEYPAEHSVKRHYGIRTQDFKLIHFYNDIDEWEMYD MKADPREMNNVFGKPEYAEKQKELMQLLQETQKQYKDTDPDEKEKVLFKGDRRLMKNR >gi|222159221|gb|ACAB01000138.1| GENE 2 1717 - 3717 1824 666 aa, chain - ## HITS:1 COG:no KEGG:BT_4657 NR:ns ## KEGG: BT_4657 # Name: not_defined # Def: heparinase III protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 666 1 666 666 1285 93.0 0 MNKTLKYIVLLAIACFVGKASAQELKSEVFSILNLDYPGLEKVKALHQEGKDEDAAKALL DYYRARTNVKTPDINLNKVTISKEEQQWADDGLKHTFFVHKGYQPSYNYGEDINWQYWPV KDNELRWQLHRHKWFTPMGKAYRISGDEKYAKEWAHQYIDWIKKNPLVKMNKKEYELISD GKIKGEVENVRFAWRPLEVSNRLQDQTSQFQLFLPSPSFTPDFLTEFLVNYHKHAIHILA NYSDQGNHLLFEAQRMIYAGAFFPEFKDAPAWRKSGIDILNREIHVQVYEDGGQFELDPH YHLAAINIFCKALGIADANGFRKEFPQDYLDTIESMIMFYANISFPDYTNPCFSDAKLTT KKEMVKNYKAWSKLFPKNQAIKYFATEGKEGALPDYMSKGFLKSGFFVFRNSWGMDATQM VVKAGPKAFWHCQPDNGTFELWFNGKNLFPDSGSYVYAGEGEVMEQRNWHRQTCVHNTVT LNNKNLDTTESVTKLWQPEGAIQTLVTENPSYKNLKHRRSVFFVDNTYFVIVDEMTGSAK GSINLHYQMPKGEIANSREDMTFLTQFEDGSNMKLQCFGPDGMSMKKEPGWCSTAYRKRY KRMNVSFNVKKDGEEAVRYITVIYPVKKSADAPKFAAKFKNKAFDENGLEVEVKVNGKKQ SLKYKL >gi|222159221|gb|ACAB01000138.1| GENE 3 3777 - 5072 1088 431 aa, chain - ## HITS:1 COG:no KEGG:BT_4658 NR:ns ## KEGG: BT_4658 # Name: not_defined # Def: glucuronyl hydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 431 4 434 434 791 88.0 0 MKKKLVLFASVSIALASCQTTPKEDYSWIKKGLDVASAQLQLSAEEVSGTGMLPRSIRTG YDMDFLCRQLERDSSTFKDSLRAQPTADQLGKRRLCNVYDWTSGFFPGSLWYAYELTGND TLKTQAIQYTNLLNPVRYYKGTHDLGFMVNCSYGNAERLSPNDTIAAVMRETADNLCGRF NDSIGAIRSWDFGTWNFPVIIDNMMNLDLLFNVAKATGDNKYKDIAIKHAMTTMHNHFRP DYTCWHVISYNNDGTVESKQTFQGKNDDSSWARGQAWAIYGYTACFRETNDSIFLNFAKD IADMIMDRVKTEDAIPYWDYDAPVTKETPRDVSAASVTASAMIELSTMVPDGQKYLDYAE KILKSLSSDAYLAKVGDNQGFILMHSVGSLPNGSEIDTPLNYADYYYLEALKRFMELKKL RVENGELRVIQ >gi|222159221|gb|ACAB01000138.1| GENE 4 5262 - 6935 1150 557 aa, chain - ## HITS:1 COG:no KEGG:BT_4659 NR:ns ## KEGG: BT_4659 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 556 1 557 557 828 74.0 0 MKKILYYIAISVICALPFNSCTLDEETSTDVEKKNFMRDAKEAEIVLQGVYRSTIEDGQY GYHLSILFNLGTDMEQAEGNSTENYRIIPANAFPTTQSEVQTTWSSLYAGVYNANDFLER ISSKINTYSDTDRQLATLYIAEARALRAMFYFELVRRYGNIALMTNTEMSKQDPRTFVQV KPEKVYEFIEDDLLYASRILPYAQDDQYRIDNSYRFSRGAALGLLSKVYATWAGYPVYDE 
SKWEEAAKTARTLIESGKHALLSDYEQLWKNTCNGVWDPTESLIEISFYSPTVSGNSDPV GRIGKWNGVKTTSIAGQRGSCAGNVKVVHTFVTEWREHPQDLRCALSVANYKYDDDKVLW TKGKNDTDYSAKEKDKDPTKGQKEKQNYTPAKWDIEKYVTTSKLINNDKSNVNWYFLRYA DVLLLYAEALNEWKHGPTDEAYEAINMVRRRGFGNPSKTSICDLKDLNEEDFRKAVYQER AYELAFEGHRRMDLIRWGIYYETIQKTYNDLLNWWTAETEFNYVVYRHTVKGKHELFPIP QRDMDLMIKFNQNPNWE >gi|222159221|gb|ACAB01000138.1| GENE 5 6952 - 10125 1881 1057 aa, chain - ## HITS:1 COG:no KEGG:BT_4660 NR:ns ## KEGG: BT_4660 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 1057 3 1047 1047 1697 82.0 0 MSKTSLTMYNTITKSIKLRGYITLLALIFTISLQAQNITVKGVVVDETDTPLIGATVMVK GASTGAITDFDGNFTLTTSKGSIISFSYIGYKTQEIKYTGQSPMNVKMVPDNKTLDEVVV VGYGSMKRGDLTGSVASVASKDIEGYKSSSVMGALGGQIAGVQITQTDGTPGAGFNINIR GVGTLTGDASPLYIVDGFQVDNIDYLSNSDIESIEVLKDASSAAIYGARAANGVVMVSTK SGKVGRPVINYNGSASYRKISKMLDVLSPYEFVKLQGEIKTEYANSYYKSGNDDNGIPYL YQSAEDYIGMKGVDWQEETFNPTWSQDHNLSISGGSDNTKYTASFSRYIENGIFKNSGFD KTTGKIRFNQKITKNITFDTTVNYAQTNRKGVGTSADSGRFNMLAQILSARPTGGLKLTD EELLNSAIDPEMLETGESLAQVNPVMQTESVTNNKRGEMWSANASITWQIIKGLTFKTAG TYNTTDSRTDIFYKTGSKEAYRNGEKPYGRTQMGRDVRWTNYNNLTWKQKIKKHTYDVML GHEVSFKSSEYLLGEAMDFPFDNLGNDNLGIGATPSKVSSSYNDKMLLSFFARGNYNYDN RYLLTATIRADGSTVFSQKNKWGYFPSFSAAWRVSEEAFMKDIKWLSNFKVRFGWGTVGN DRISNYLSLDLYEASKYGVGNNTVTVLTPKQLKNANLKWEGSNTVNLGVDLGFFDSRLNI TADFFIKNTKDLLLAQSLAHITGFNSQMQNIGKIQNKGFELNVNSINIQTRDFTWNTNFN ISFIKNTLKSLASGVDAMYARSGFDSNFTAYDYIAKVGESLGLIYGYEFDGIYQSSDFYT KPGDPTLYLKDGVVNDPRYSTKEPLRPGVVKYKDQDGDGKITTKDRTVIGSAIPKWYGGI TNTLNYKGIDFSFMLQFNYGNDVYNATRLYSTQSRSGRRNMLAEVADRWSPTNASNKVPL YNGYITNDVYSRFVEDGSFLRLKNITLGYTLPKKWTSKFYVSRLRVYATGQNLFCVTGYS GYDPEVSTAGSNPMTPGLDWGAYPKSKVFTFGLDIQF >gi|222159221|gb|ACAB01000138.1| GENE 6 10125 - 12089 1095 654 aa, chain - ## HITS:1 COG:no KEGG:BT_4661 NR:ns ## KEGG: BT_4661 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 652 106 726 726 454 42.0 1e-126 MQVGLYKLSISCISGGNYYEFKDIVEINFLKAVPDGITVEPNKLQVKYNDIIDETSEVEL 
PTAQVKTDGDHVTITKYEIAKSDYSKYFDITKSGKISIIKGSTALLPGIYNISLKLTTGA SSEDEGIFENALEINVTSAPFGLEYTPNEDMLEAENDKSGKTSFQSNAPALKGSLEGIEY SIKNITPTTDKIKIDPTTGVLSVDKDHGLQSGNNYVISIHVKNNFGEEDFNNAFTLQVVE YIEPISGFEYETSIDKYQYSKFTISPKEGLKGDNIQFSLINEPDALKGQIEFDAQTGTIS VEKGNTIPQGNYPLTVRATNSKNAENPADATFTLNIIENPNYFTDIRYGNNIDVPEENNA NQFRITEDNEANADATLKSFTFPSPQTGLKGNVSVAWSIKNGNNCDNLTIDSNTGKISFN QEATWPADKNGVKANTIGFCYVTATAGTDKDSQISQTTLVFIHYDLKANNGVHIHYNPFV FQADPKNGGNSTVPLVTVNGITTTSNFALDYRRSFNYYPTEGTLVKGAPGAAGSFLNELW TTYYKAMDIKLSTGSRNPMSYYGSVYDMSHSKKLPNQSDRLSVALAYVVPNDLTIHISPN IWKNSNGEYANGIMVGEMTFLTNVTVDTTETGESLKNGKKIAPIIIWFDKKFIK Prediction of potential genes in microbial genomes Time: Wed May 18 04:15:20 2011 Seq name: gi|222159220|gb|ACAB01000139.1| Bacteroides sp. D1 cont1.139, whole genome shotgun sequence Length of sequence - 26891 bp Number of predicted genes - 18, with homology - 17 Number of transcription units - 10, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 88 - 126 -0.9 1 1 Tu 1 . - CDS 166 - 2355 1149 ## BT_4662 heparinase III protein, heparitin sulfate lyase - Prom 2380 - 2439 7.6 2 2 Tu 1 . + CDS 2876 - 3025 99 ## - Term 2743 - 2788 7.6 3 3 Op 1 . - CDS 2965 - 4086 688 ## COG1373 Predicted ATPase (AAA+ superfamily) - Prom 4106 - 4165 5.0 4 3 Op 2 . - CDS 4211 - 8272 3038 ## COG0642 Signal transduction histidine kinase - Prom 8370 - 8429 10.4 5 4 Tu 1 . - CDS 8481 - 9851 1629 ## COG1350 Predicted alternative tryptophan synthase beta-subunit (paralog of TrpB) - Prom 9937 - 9996 6.0 + Prom 9775 - 9834 4.8 6 5 Op 1 17/0.000 + CDS 9984 - 11813 1256 ## COG0168 Trk-type K+ transport systems, membrane components 7 5 Op 2 . + CDS 11818 - 12504 858 ## COG0569 K+ transport systems, NAD-binding component + Term 12575 - 12623 5.6 - Term 12563 - 12611 9.4 8 6 Tu 1 . - CDS 12637 - 15084 2248 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 15297 - 15356 4.9 + Prom 15214 - 15273 5.2 9 7 Tu 1 . 
+ CDS 15326 - 15877 508 ## BT_4674 hypothetical protein + Term 15935 - 15976 10.0 10 8 Tu 1 . + CDS 16001 - 17197 1075 ## BT_4675 heparin lyase I precursor + Term 17222 - 17270 7.7 + Prom 17239 - 17298 7.0 11 9 Tu 1 . + CDS 17378 - 18349 558 ## COG4974 Site-specific recombinase XerD + Prom 18472 - 18531 4.9 12 10 Op 1 . + CDS 18685 - 19281 401 ## BT_0596 putative transcriptional regulator 13 10 Op 2 . + CDS 19259 - 21673 2260 ## COG1596 Periplasmic protein involved in polysaccharide export 14 10 Op 3 . + CDS 21689 - 22822 851 ## BT_1355 hypothetical protein 15 10 Op 4 . + CDS 22841 - 24253 677 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 16 10 Op 5 . + CDS 24304 - 25023 213 ## PROTEIN SUPPORTED gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 17 10 Op 6 . + CDS 25031 - 26047 489 ## COG0451 Nucleoside-diphosphate-sugar epimerases 18 10 Op 7 . + CDS 26119 - 26890 304 ## COG3475 LPS biosynthesis protein Predicted protein(s) >gi|222159220|gb|ACAB01000139.1| GENE 1 166 - 2355 1149 729 aa, chain - ## HITS:1 COG:no KEGG:BT_4662 NR:ns ## KEGG: BT_4662 # Name: not_defined # Def: heparinase III protein, heparitin sulfate lyase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 728 1 702 702 525 43.0 1e-147 MKNIFFICMFGLLAFAGCSDNEEEILTGGNIDIDMLPPNQPNDVIDEKLFAVINLDYPGL EKAKEFYNSGKYYYAVNELLEYYRNRTSVINPNIQLMNTPITASELNIADQALENRFYVR GFAERVEDGKEIYYSFTKDNKIDWSFIPSKVTDQEFKYQIHRHQWMLPQAKAYRTSRDEK YAESWINVYSDWLKTFPYEEGTTFPEEGGKENDVDYQWKGLQVAERVISQIDIMTYFIQS KNFTPEWLSVFLTAFAKEVECIRLNYYKEGNILVTQAQAVAMAGILMPEFKNANEWLSEG SQKLGEQIDKQFLADGVHYEFDISYHVGAISDFYETYRVAQLNNKAGGFPAGYLEKLKLP AHFVMDITYPNYSVENFNDTRSSRLGKSVLIKNFKKYAEMFPDDQEIQWMASERQSGSTP TYLQKAYTNGGYYILRNKWDDQSMMMILKNNNNPNNKYHCQPDNGTFSLYKKGRNFFPDA GSYAYSGSDRETYRGTARHNTLTIMSKTIPDDSMKGKLLQLTTKTDEATSYDVLVTENPT YTVSGEKHLNLTHRRSVFFVNQKFFVIVDEGYDTTADHNSNINVNFHLCPIENASSDVVI DEGQKDSHIYGAHTAFSDNNNIMLRTFVETANDYSASVKTTNTSNDIGQKTERKGYQVTI RKPKIVNGAARFISIIYPINDGAEAANINMTATFTDIADDEAAGTFHANGTSVKVIIDNK EYNLSYTLN 
>gi|222159220|gb|ACAB01000139.1| GENE 2 2876 - 3025 99 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNLDQNLTEKRKVLSYTLGLFVMYKLSPCTYNRNSSILSGVITVKVASG >gi|222159220|gb|ACAB01000139.1| GENE 3 2965 - 4086 688 373 aa, chain - ## HITS:1 COG:MA3153 KEGG:ns NR:ns ## COG: MA3153 COG1373 # Protein_GI_number: 20091971 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Methanosarcina acetivorans str.C2A # 22 325 39 363 445 98 23.0 2e-20 MIARILEQTIQEKLGQGKAIIIMGPRQVGKTTLLKQMFTDSNDILWLNGDEQDTRSLFEN ISATRLKYIFGQKKIVIIDEAQRIENIGLRLKLITDQISDIQLIATGSSSFDLANKVNEP LTGRKWEYKMYPLSFKEMVNHHGLLDEKRLIPHRLVYGYYPDVVTHPGSESEILKQLSDS YLYKDILDWEHIKKADKIMKLLQALAFQIGSEVSYNELGQLCDLDPKTVERYVILLEQSY IIFRLSSFSRNLRNELKSSRKIYFYDNGIRNAVIANFSLAETRSDIGALFENFLVSERIK YLQYNRLWHNTWFWRTTAMQEIDLIEEGDGKLMTYEFKWNPKRSAKFPKAFSDNYPEATF TVITPDNIEEFLL >gi|222159220|gb|ACAB01000139.1| GENE 4 4211 - 8272 3038 1353 aa, chain - ## HITS:1 COG:CAC0323 KEGG:ns NR:ns ## COG: CAC0323 COG0642 # Protein_GI_number: 15893615 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 821 1051 374 616 654 140 36.0 2e-32 MNSLKKTTICLLFLCIGVNCVFSEVPEQINFSYISINEGLSQSTVFSIDQDQRGNMWFAT YDGVNKYDGYSFTVYQHNEEDPNSIANDISRIVKTDSQGRVWIGTRDGLSYYDEEKDKFR NFFYEKKGKRQQINGIAEISPEQLLISTQEGLTMFDIKESKFVDDSFSTAMHKLIASALY RQGDIIYIGTPVDGLYSYSIPQKKLEKITPITETKQIQAILQQSPTRIWVATEGTGLLLL NPKTKEVKAYHHSSSNPKSISSNYIRSLALDSQNRLWIGTFNDLNIYHEGSDSFVSYSSN PVENGSLSQRSVRSIFMDSQGGMWLGTYFGGLNYYHPIRNRFKNIQRIPYKNSLSDNVVS CITEDKDKNLWIGTNDGGLNLYNTKTQQFTHYILQENEREIGIGSNNIKAVYVDEQKSLV YIGTHAGGLTVLHRNSGKMESFNQLNSQLVDENVYAILPDKGGNLLLGTLSALVSFNPEK KSFTTIDKEKDGTPFTSQRITILFRDSHKRLWIGGEEGISVFQQEGLEIEKVPILPESSV TKTFTNCIYEADNGIIWVGTREGFYCFNEKEKKIKRYTTANGLPNNVVYGVLEDTFGRLW VSTNRGISCFNPETERFRNFTESDGLQSNQFNTSSFCRTSSGQMYFGGINGITTFRPELL LDNPYTPPVVITKLQLFNKTVRPDDETGILTKNINETESITLKSWQTAFTIEFVVSNYIS GQHNTFAYKLEGYDKEWYYLTDKRTVSYSNLPQGTYHFLVKAANSDGKWNTVPTMLEIIV 
LPIWYKTWWAIMIFLATFIGFITFVFRFFWMRKSMEAELELERRDKEHQEEINQMKMRFF INISHELRTPLTLILAPLQEIINKISDRWTRNQLEYIQRNANRLLHLVNQLMDYRRAELG VFELKVKKENAHQLIQDNFLFYDKLARHKKITYTLHSELEEKEELFDPNYLELIVNNLLS NAFKYTESGQSITVTLKEENNWLVLQVSDTGIGIPINKQGKIFERFYQIESEHVGSGIGL SLVQRLVELHHGRIELDSEEGKGSTFSVYLPQDINIYKPSELASNDAKNEEDQVYSTNSK EMYFIDTEKVKNEAIETGDKKRGTILIVEDNNEIRHYLSSGLSELFNTLDAGNGEEALEK LKENEVDIIVTDVMMPVMDGIKLCKNVKQNIRTCHIPVIILSAKSETKDQMEGLQMGADD YIPKPFSLAILTTKIQNMMRTRRRMLERYSKSLEVEPEKITFNAMDEALLKRAVAIVEKN MDNIEFSTDEFAREMNMSRSNLHLKLKAITGESTIDFIRKIRFNEAAKLLKDGRYTIAEV STMVGFNTPSYFATSFKKYFGCLPTEYIKKAKG >gi|222159220|gb|ACAB01000139.1| GENE 5 8481 - 9851 1629 456 aa, chain - ## HITS:1 COG:TM0539 KEGG:ns NR:ns ## COG: TM0539 COG1350 # Protein_GI_number: 15643305 # Func_class: R General function prediction only # Function: Predicted alternative tryptophan synthase beta-subunit (paralog of TrpB) # Organism: Thermotoga maritima # 10 426 7 421 422 487 58.0 1e-137 MSEKRKRYILPEEEIPHYWYNIQADMVNKPMPPLHPGTKQPLKAEDLYPIFAKELCHQEL NQTDAWIEIPEEVREMYKYYRSTPLVRAYGLEKALGTPAHIYFKNESVSPIGSHKLNSAL AQAYYCKEEGVTNITTETGAGQWGAALSYAAKVFGLEAAVYQVKISYEQKPYRRSIMQTF GAQVTPSPSMSTRAGKDILTKHPTYQGSLGTAISEAIELAQMTPNCKYTLGSVLSHVTLH QTIIGLEAEKQMEMAGEYPDIVIGCFGGGSNFGGISFPFMRHTIQEGKKTRFVAAEPASC PKLTRGKFQYDFGDEAGYTPLLPMFTLGHNFAPAHIHAGGLRYHGAGVIVSQLLKDNLME AVDIQQLESFEAGCLFAQSEGIIPAPESSHAIAAAIREANKCKETGEEKVILFNLSGHGL IDMASYDKYLAGDLVNYTLNDEDIQKNLDEIGDLAK >gi|222159220|gb|ACAB01000139.1| GENE 6 9984 - 11813 1256 609 aa, chain + ## HITS:1 COG:BH0598 KEGG:ns NR:ns ## COG: BH0598 COG0168 # Protein_GI_number: 15613161 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Bacillus halodurans # 154 606 18 445 448 222 35.0 2e-57 MKIYHKFLLYQNKLLKPYVRILLGLVEALTYLASLLLIVGVVYEHGFPLSIDEVANLQTL YKTVWIIFLIDVTLHISLEYRNTKKQYRRLAWILSGLLYLTLVPVIFHRPEEEGAILHIW EFLHGKFYHLLLLLVLSFLNLSNGLVRLLGRRTNPSLILAVSFMAIILIGAGLLMLPRCT 
VNGITWVDSLFTATSAVCVTGLVPVDVSTTFTTSGLVVIILLIQIGGLGVMTLTSFFAMF FMGNTSIYNQLVVRDMVSSNSLGSLLSTLLYILGFTLVIEGIGMVSIWFSIHGTLGMTLE GELGFAAFHSISAFCNAGFSTLSGNLGNPMVMTNHNWLFITVSLLIIFGGIGFPILVNFK DIVLYHLRRFWKLIRTRKLDRHKMQHLYNLNTKIVLIMTFLLLLIGTLAIAAFEWNGSFA GMPVADKWTQAFFNATCPRTAGFSSVDLASLSVQTLLVYLFLMWVGGGSQSTAGGVKVNA FAVVVLNLVAVLRGTERVEVFGRELSYDSIRRSNATVVMSLGVLFIFIFTLSILEPGVSI MALTFECVSALSTVGSSLNLTPHLCDASKLLVSLLMFIGRVGLITLMLGIVKQKKNTKYR YPSDNIIIN >gi|222159220|gb|ACAB01000139.1| GENE 7 11818 - 12504 858 228 aa, chain + ## HITS:1 COG:aq_1503 KEGG:ns NR:ns ## COG: aq_1503 COG0569 # Protein_GI_number: 15606658 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Aquifex aeolicus # 3 181 5 183 218 106 34.0 3e-23 MKYIIIGLGNYGHVLAEELSALGHEIIGADISESRVDSIKDKVATAFVIDATDEQSLSVL PLNSVDIVIVAIGENFGASIRVVALLKQKKVPRIFARAIDAVHKAVLEAFDLERILTPEE DAARSLVQLLDFGTNMEGFRIDQDYYVVKFTVPEKFVGYFVNELNLDEEFHLKMIGLKRA NKITNCLGISLTELHVKNELPGNEKVEEGDELVCYGRYRDFQAFWKAI >gi|222159220|gb|ACAB01000139.1| GENE 8 12637 - 15084 2248 815 aa, chain - ## HITS:1 COG:SP0648_2 KEGG:ns NR:ns ## COG: SP0648_2 COG3250 # Protein_GI_number: 15900551 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pneumoniae TIGR4 # 30 800 59 871 871 438 33.0 1e-122 MNMKHPKQLVTVLLMGAACTMQAQRSEALLEKNWKFSKGNFKEASQPEFNDTKWESVVIP HDWAIFGPFDMNNDLQNVAVTQNFEKKASLKTGRTGGLPYVGTGWYRTSFDAPADKEVTL LFDGAMSEARVYINGKEACFWPFGYNSFHCNVTSLLNKDGKNNTLAVRLENRPQSSRWYP GAGLYRNVHLIVTDKIHVPVWGTQITTPHVNKDFAAVRLQTKIDNAGEKTAIRVETEILS PEGKVVTRKENTSRINHGQPFEQNFIVNAPELWSPESPSLYKAVSKIYADDKLVDTYTTR FGIRSIEYIADKGFYLNGEHRKFQGVCNHHDLGPLGAAINVAALRHQLTLLKDMGCDAIR TSHNMPTPELVALCDEMGFMMMIEPFDEWDIAKCENGYHRYFDEWAERDMINMLHNYRNN PCVVMWSIGNEVPTQCSPVGYKVAKFLQDICHREDPTRPVTCGMDQVTCVLANGFAAMID IPGLNYRTQRYQESYDQLPQNLILGSETASTVSSRGVYKFPVEDKKGAKYEDHQCSSYDV EVCPWSNIPDEDFALADDHHWTIGQFVWTGFDYLGEPSPYDTDSWPNHSSMFGIIDLASL 
PKDRYYLYRSVWNKEAETLHILPHWTWPGREGEVTPVFVYTNYPTAELFINGKSYGKQSK NNSSLKNRYRLMWMDTKYEPGEVKVVAYDKNGKAVAEKVVRTAGKPHHIELVSNRNELTA DGKDLAYVTVKVVDKDGNLCPTDNRQINFSVKGTGKYRAAANGDPTNLEQFHLPKMHAFN GMLTAILQAGETAGEIVFTAKANGVKAGNIRIQTK >gi|222159220|gb|ACAB01000139.1| GENE 9 15326 - 15877 508 183 aa, chain + ## HITS:1 COG:no KEGG:BT_4674 NR:ns ## KEGG: BT_4674 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 182 19 182 183 273 82.0 2e-72 MRKLVYWLLLPLAVAVSACGGKKGTSDNQSLLARIDSIDAHGLQRMQTSKSETNFKFKGK DYHSFVSRIPDESLPHVTNEMGDTYVDNKIVLHLTRGNETVLNKTFTKNDFSSVVDAKFL SKSILEGIVYDKTTPQGIVYAASVCYPQTDLYMPLSITITADGKMSIQKVDMLEEDLNDE APN >gi|222159220|gb|ACAB01000139.1| GENE 10 16001 - 17197 1075 398 aa, chain + ## HITS:1 COG:no KEGG:BT_4675 NR:ns ## KEGG: BT_4675 # Name: not_defined # Def: heparin lyase I precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 31 398 9 376 376 669 86.0 0 MKKSIFIICVLASGCTAMAEQVEKPGPAIETQTLVPLTERVNVQADSACVNQIIDGCWVA VGAKKAHAIQRDFNLMFNGKPSYRFELKEDDNTLSGYAAGETKGRAEFSYCYATSDDFKG KPADTYKKAQIMKTVYHHGKGICPQGASRDYEFSVYIPSTLGSDVSTIFAQWHGMPDRTL VQTPQGEVKKLTADEFMELDKTTIFKKNMGYEKKPKLDKQGNPVKDKQGNPVYQAGKANG WLVEQGGYPPLAFGFSGGWFYIKANSDRKWLTDKDDRCNANPEKTPVMKPVTSTYKASTI AYKMPFADFPKDCWITFRIHIDWTVYGKEAETIVKPGMLDVQMDYQEKGKKVKKHIVDNE KIMIGRNDDDGYYFKFGIYRVGNSTKPVCYNLANYSER >gi|222159220|gb|ACAB01000139.1| GENE 11 17378 - 18349 558 323 aa, chain + ## HITS:1 COG:lin2069 KEGG:ns NR:ns ## COG: lin2069 COG4974 # Protein_GI_number: 16801135 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Listeria innocua # 27 290 21 278 297 59 26.0 1e-08 MQMMNKNGFNRCAEFYIGRLRKEGRYSTAHVYKNALFSFSKFCGTSNVSFRQITRERLRR YGQYLYECGLKPNTISTYMRMLRSIYNRGVEAGSAPYVPRLFHDVYTGVDVRQKKALPAT ELYKLLYEDPKSERLRRTQAIAALMFQFCGMSFADLAHLEKSALDQNVLRYNRIKTKTPM SVEVLNTAKEMINQLRSKEDSHPDCPDYLFDILRGDKKRTDERGYREYQSALRRFNNSLK DLARTLHLQSPVTSYTLRHSWATTAKYRGVSIEMISESLGHKSIKTTQIYLKGFELKERT EVNKGNLSYVRNCCTGSVKIVKC 
>gi|222159220|gb|ACAB01000139.1| GENE 12 18685 - 19281 401 198 aa, chain + ## HITS:1 COG:no KEGG:BT_0596 NR:ns ## KEGG: BT_0596 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 187 1 187 192 336 86.0 3e-91 MILTKPKSVNAGPISGTGEGVAHSKRWYVALVRMHHEKKVSERLSKMGIDSFVPVQQQIH QWSDRRKMVDTVLLPMMVFVHVNPKERMEVLSFSTVSRYMVMRGESTPAVIPDEQMARFR FMLDYSDETVCMNDTPLARGEKVRVIKGPLSGLVGELVTVGGKSKIAVRLNMLGCACVDM PIGYVEPTKISNDDTKKL >gi|222159220|gb|ACAB01000139.1| GENE 13 19259 - 21673 2260 804 aa, chain + ## HITS:1 COG:aq_505 KEGG:ns NR:ns ## COG: aq_505 COG1596 # Protein_GI_number: 15605977 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Aquifex aeolicus # 142 439 90 381 725 138 34.0 5e-32 MTTQKNFNVFLFILLLGVFSPLMAQNMSDSQVLEYVKEGIRQGKEQKQLASELARKGVTK EQAMRVKQLYEQQNNVNASNATGTDVNESRLREEMKENTSDMLEDHPSTQDLARGNQVFG RNIFNTRNLTFEPSVNIATPLNYRLGPGDEVIIDIWGASQNTIRQQISPDGTINIQKIGP VNLNGLTIAEANDYLKKTLNKIYNGLNNTNDPTSDIRLTLGSIRTIQINVMGEVVQPGTY SLSSFSTVFHALYRAGGVSDIGSLRNVQLVRNGKNIATIDVYQFIMKGNIQDDIRLQEGD VVIVPAYDVLVKIDGKVKRPMRFEMKKDENLSTLISYAGGFDADAYTRSLRVVRQNGQEY EVNTVKDLDYSVYKMRNGDVVTAEAILNRFTNRLEIRGAVYRPGIYQLNGKLNTVRELVN EAQGLTGDAFLNRAVLYRQREDLTTEVIPVDIKAIMDGTSQNIILMKNDILYIPSIHDLE DRGNVVIHGEVAKPDSYPYADNMTLEDLVIQAGGLREAASVVRVDVSRRIKNPRSTVGND TIGQIYTFSLKEGFVIDGTPGFVLQPYDEVYVRRSPGYQAQQNVAVEGEILFGGSYAMTS REERLSDLINKAGGATNYAYLRGAKLTRVATEGEKKRMGDVIRLMSRQLGEAMMDSLGVR VEDTFSVGIDLEKALANPGSTADIVLREGDVISIPKNNNTVTINGAVMVPNTVSYIEGQK VDYYLDQAGGYSENAKKSKKFIVYMNGQVTKVKGSGKKQIEPGCEIIVPSKAKKRTNIGN ILGYATTFSTLGMMVASIANLIKK >gi|222159220|gb|ACAB01000139.1| GENE 14 21689 - 22822 851 377 aa, chain + ## HITS:1 COG:no KEGG:BT_1355 NR:ns ## KEGG: BT_1355 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 366 1 365 379 467 66.0 1e-130 MSTLMEQTSEKETKQVYNNHNDEELEIDLMDLLRKVIGIRKKIYKAAGIGLIIGVIVAIS 
IPKQYTVEVTLSPEMGNNKGGGLSGLAASFLGSGVTMGDGTDALNASLSADIVSSTPFLL ELSAMEIPVTKNEVMTLNIYLDEETSPWWSYVIGFPGMVIGGVKSLFTEEDEDIYFDKTS QGVIELSKKETEKIETLKEKITASVDKKTSMTSVTATFQDSKVAAVVADSVVKKLQEYII DYRTSKSKEDCIYLEKLFKERQQEYYAAQKKYADYLDSHDNLILQSVRAEQERLQNDMSL AYQVYSQVASQLQIARAKVQEEKPVFAIVEPAVVPLYPSGTSRKVCVLAFIFLSVCIVIS WNLFGKDFLNKFKEVCA >gi|222159220|gb|ACAB01000139.1| GENE 15 22841 - 24253 677 470 aa, chain + ## HITS:1 COG:wcaJ KEGG:ns NR:ns ## COG: wcaJ COG2148 # Protein_GI_number: 16129987 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Escherichia coli K12 # 38 470 29 464 464 225 36.0 1e-58 MVSTTSQINKFIEFVAVLGDLVILNLVLFFLLFFWEEVFVSLPFSCRVSWMMTSLSLCYL ACSSSRGKVWDSRGIRPDQLVLRVLKNIIAFSIFWACIMTFSGISIYSPLFFVAYFSFLF VILSIYRIIIRHFLIVYCAKGKHRRYAVFIGGGNNMQVLYEEMENSLASSLYEVVGYFDI KPNEAFSSQCSYLGNPDGFSDFMSAHLGIKHVFCSLSMEEGRYNFSIMNHCENHLLYFHG VPNVCKGFPRRIWHSMVGNMPILNLRYEPLGKMENRILKRIFDIVISGLFLVTIFPFVYL IVGSIIKLTSPGPVFFKQMRTGLNGVDFVCYKFRSMKVNNEADSKQATADDPRKTRFGNF LRRSNIDELPQFINVFKGDMSIVGPRPHMLAHTETYARLIDKYMVRHFIKPGVTGWAQTH GFRGETRELSQMEERVKADIWYMEHWTMLLDLYIIYKTVANVIVGEKNAY >gi|222159220|gb|ACAB01000139.1| GENE 16 24304 - 25023 213 239 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 [Bacillus selenitireducens MLS10] # 4 232 6 225 234 86 27 1e-16 MNIALLIAGGSGNRMGQDIPKQFMHVDGAPIIIRTMQCFQLHPDISAIAVVCLKGWETVL QSYANQFMIDKLRWIFPGGDSGMESIHNGIYGLKEKGCDDEDLVLIHDSVRPLLSQEIIS SNIAICKAYGYAITGIQCREAILESEDGFTSTASIPRDTLIRTQTPQTFRLKNIIRAHEL ACKKGIVNSVASCTLLAELNENIEMHIVPGSEKNIKITTVEDLEILKALMHTSKDEWLK >gi|222159220|gb|ACAB01000139.1| GENE 17 25031 - 26047 489 338 aa, chain + ## HITS:1 COG:slr0809 KEGG:ns NR:ns ## COG: slr0809 COG0451 # Protein_GI_number: 16330703 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Synechocystis # 25 280 22 269 328 150 36.0 2e-36 
MAYQDDIKKVATLLLPWGKLSGCNVLVTGATGLIGGCLVDVLMAHENKDYNVYASGRNEE RAKNRFLRYVNDKTFHFFKYDVMQPLVCDVDFQYIIHAASGACPAEFGKHPVEIMKGNIN GVDNLMSYGIQHGMKRLLYISSGEVYGEGDGRVFTEEYSGYVDCASPRSCYPSSKRAAET LCISYGVEYNVDVVIARPCHVYGPEFTESDNRVYAQFIRNVLQGEDVVMKSTGEQFRSWC YVVDCVSGLLHVLLKGENGEVYNIADEDSNITIRQLAELIAGLVGRKVIMKCPDDFEKKG YNMVSKSVFATKKLEALGWNIEGCMRDKMLKTITKVNL >gi|222159220|gb|ACAB01000139.1| GENE 18 26119 - 26890 304 257 aa, chain + ## HITS:1 COG:SP1273 KEGG:ns NR:ns ## COG: SP1273 COG3475 # Protein_GI_number: 15901133 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: LPS biosynthesis protein # Organism: Streptococcus pneumoniae TIGR4 # 9 257 10 250 267 113 29.0 4e-25 MEEDKLKYKEILCKTMKSFINICKEHNLQYYACAGTCLGAIRHKGMIPWDDDIDVLMPRS DYDKFLALKQKLQGTGYEIVDSNNQFYNQWFAKFSDANTTIVEMTDFPIVFGVYVDIFPL DEVGNVDVAKKLHEEKSKYFDKYRRTFKKTFFRNCVNLFVHMHIKTFLKEIYYASIGKVF KEHYFFKYKHAEDLIQKQRGQKCMCYGGFYGFEKELCEKEWFGKGVAVPFEDFSIIVPSN YHAYLTRFYNDYMTPPP Prediction of potential genes in microbial genomes Time: Wed May 18 04:15:54 2011 Seq name: gi|222159219|gb|ACAB01000140.1| Bacteroides sp. D1 cont1.140, whole genome shotgun sequence Length of sequence - 27635 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 9, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 465 - 2930 2152 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 2958 - 3017 2.1 2 1 Op 2 . - CDS 3021 - 3623 584 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 3 1 Op 3 . - CDS 3640 - 5292 1150 ## COG2509 Uncharacterized FAD-dependent dehydrogenases - Term 5312 - 5359 10.6 4 1 Op 4 . - CDS 5371 - 6453 313 ## Ping_1180 hypothetical protein - Prom 6513 - 6572 5.0 5 2 Op 1 . - CDS 7080 - 8447 1313 ## COG1066 Predicted ATP-dependent serine protease 6 2 Op 2 . 
- CDS 8459 - 9658 931 ## COG1373 Predicted ATPase (AAA+ superfamily)
- Prom 9679 - 9738 3.8
- Term 9688 - 9750 2.4
7 3 Tu 1 . - CDS 9762 - 10799 864 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D
- Prom 10824 - 10883 6.2
+ Prom 11118 - 11177 5.1
8 4 Tu 1 . + CDS 11201 - 13636 2532 ## COG0527 Aspartokinases
9 5 Tu 1 . + CDS 13738 - 14946 1157 ## COG3635 Predicted phosphoglycerate mutase, AP superfamily
+ Prom 14955 - 15014 6.7
10 6 Tu 1 . + CDS 15074 - 16375 1417 ## COG0498 Threonine synthase
+ Term 16441 - 16496 -0.7
- Term 16432 - 16480 7.5
11 7 Op 1 2/0.000 - CDS 16548 - 17174 534 ## COG1564 Thiamine pyrophosphokinase
12 7 Op 2 . - CDS 17171 - 17779 585 ## COG3201 Nicotinamide mononucleotide transporter
13 7 Op 3 . - CDS 17792 - 20041 1960 ## BT_2390 hypothetical protein
- Prom 20140 - 20199 5.9
+ Prom 19980 - 20039 2.9
14 8 Tu 1 . + CDS 20183 - 20803 445 ## BT_3269 RNA polymerase ECF-type sigma factor
+ Prom 20826 - 20885 2.9
15 9 Op 1 . + CDS 20918 - 22084 756 ## COG3712 Fe2+-dicitrate sensor, membrane component
16 9 Op 2 . + CDS 22119 - 25496 2275 ## BT_3271 hypothetical protein
17 9 Op 3 . + CDS 25508 - 26980 948 ## BT_3272 putative outer membrane protein
18 9 Op 4 .
+ CDS 26999 - 27635 332 ## BT_3273 hypothetical protein Predicted protein(s) >gi|222159219|gb|ACAB01000140.1| GENE 1 465 - 2930 2152 821 aa, chain - ## HITS:1 COG:YPO1011 KEGG:ns NR:ns ## COG: YPO1011 COG1629 # Protein_GI_number: 16121312 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Yersinia pestis # 3 821 10 690 690 158 23.0 5e-38 MKTKNIMIALTLLTTGTAWAEDFPKDSLKIVDIEEVVVIATPKENRKLRELPVAATVLSQ DNMRANQVNSVKNLTGIVPNLFIPDYGSKLTTSIYIRGIGSRINTPSVGLYVDNIPYIDK SAFDFNYADIERIDILRGPQGTLYGRNTMGGLIKVHTKSPFTYQGTDIRMGAATYNNYNV SLTHYHRTSDRFAFSTGGFYEHTGGFFENSARNNEKVDKSNAGGGRFRGIYLPTSNLKID MALSYEYSDQGGYPYYYTGITPSAIAKAKENGKEMTEDRADYIGKISYNDRSSYRRGLLN SGVNIEYQANNFILSAVTGYQNLNDRMFLDQDFTERDIFNIEQKQRANTISEEIVLKAKP GKRWQWATGAFGFYQWLHTTGPVLFKEEGVKSVIENNANAAFEEVSAKPGAPTMGMTVHN PSLSVGGTFDTPTLSGALYHQSTFNNLFIEGLSVTAGLRLDYEKISMKYNSLSTPIDFGF DFHLAMGPNQINLSDQNMKAPASFVGKLSTDYVQLLPKFAIQYEWKNQNNVYATVTRGYR SGGYNIQMFSDLSQTELKNSMMNAIKESPTIGQDATWGKTIINMMNQMVPTKEIDVKAST TYKPEYSWNYEVGSHLTLWEGRLWADVAAFYMDTRDQQLSQFAESGLGRITINAGKSRSY GAEAALRASVTKELSLNVSYGYTYATFTDYVITEEQKDGTFKVTADYNGKYVPFVPKHTL NIGGEYAITCSPRSIFDRVVFQANYNAAGRIYWTEQNDVSQSFYGTLNWRTNLEIGDAMI SFWARNFLNKEYAAFYFETMNKGFMQKGRPMQFGVDLRCRF >gi|222159219|gb|ACAB01000140.1| GENE 2 3021 - 3623 584 200 aa, chain - ## HITS:1 COG:BMEI1582 KEGG:ns NR:ns ## COG: BMEI1582 COG2197 # Protein_GI_number: 17987865 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Brucella melitensis # 129 192 143 206 213 68 57.0 7e-12 MNYQPEIAIVEANTLTCLGLKGILEEMIPMATIRTFHHFSELMDDTPDMYAHYFISAQIY VEHNAFFLPRKRKTIVLASDSPQFQLSGVPVLNIHESEEELVKNILKLHQHAHHDGYPVK DMPPMPPMQPHQEILSAREIEVLVLITKGLINKEIADKLNISLTTVITHRKNITEKLGIK SVSGLTIYAVMNGYIEADRI >gi|222159219|gb|ACAB01000140.1| GENE 3 3640 - 5292 1150 550 aa, chain - ## HITS:1 COG:L195271 KEGG:ns NR:ns ## COG: L195271 COG2509 # 
Protein_GI_number: 15673161 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Lactococcus lactis # 1 548 1 528 535 353 39.0 6e-97 MIQEYQLRILPEIAANEQRLKEYLSKEKGLNLRDITATRILKRSIDARQRTIFVNLKVRA YIREMPKDDEYEHTIYNKVEGKPQVIVVGAGPGGLFAALRLIELGLRPVVIERGKDVRER KKDLAQISREHTVDPESNYSFGEGGAGAYSDGKLYTRSKKRGNVDKILNVFCQHGASTSI LVDAHPHIGTDKLPRVIENMRNTIIECGGEVHFQTRMDALIIENDEVKGIETNTRKTFLG PVILATGHSARDVYRWLAANHVAIEAKGIAVGVRLEHPAMLIDQIQYHNKNGRGKYLPAA EYSFVTQAEGRGVYSFCMCPGGFIVPAASGPEQVVVNGMSPANRGSRWSNSGMVVEIQPE DLGNEELKMRNKELAAQQDEQLMALNPNLKSSQLSEINSQLLSVLHFQEELERQCWLQGG RRQTAPAQRMLDFTRKKLSYDLPESSYSPGLISSPLHFWMPSFISKRLSLGFQQFGRSSH GFLTNEAVMIGVETRTSSPVRIIRDKDTLQHVTVRGLFPCGEGAGYAGGIVSAGVDGERC AEAAANYFNH >gi|222159219|gb|ACAB01000140.1| GENE 4 5371 - 6453 313 360 aa, chain - ## HITS:1 COG:no KEGG:Ping_1180 NR:ns ## KEGG: Ping_1180 # Name: not_defined # Def: hypothetical protein # Organism: P.ingrahamii # Pathway: not_defined # 1 360 1 360 362 138 34.0 3e-31 MTDEELQSKSIDFLRFPLIVGVVLIHAHFSNVIMNGVNVNILHEYSCPIYDTTSYFFSEL IGRIAVPLFFFISGFLFFYRSKEFSLSVYRHKLKNRGRSILLPYLFWNLMIISYSILIQT IPGLSSSTNQLMDMYSLTDWLDAFWSINGGCPICYQFWFLRDLIVMILFSPLIYILVKYF RWFSVFVLGVLWYFNWWFSLPGFSIAAFFFFSAGACFSVKNCNFVALLKPCLLPSAMFYS LLALVCIYFREQPWINFIHSLSILTGIVLAISLSAHFIERRMWTANTFLVGGAFFIYAFH GIPLGVILKYTLKYLPVSNDMIFLSIYFLSAGFIILTSLGIYSLLKKWLPRFLSVVTGGR >gi|222159219|gb|ACAB01000140.1| GENE 5 7080 - 8447 1313 455 aa, chain - ## HITS:1 COG:BS_sms KEGG:ns NR:ns ## COG: BS_sms COG1066 # Protein_GI_number: 16077155 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent serine protease # Organism: Bacillus subtilis # 1 455 1 457 458 483 51.0 1e-136 MAKEKTVYVCSNCGQDSPKWVGKCPSCGEWNTYVEEIVRKEPTNRRPVSGIETQKPKPLA LSDIEADDEPRINMHDDELNRVLGGGLVPGSLVLIGGEPGIGKSTLVMQTVLHMPEKKIL YVSGEESARQLKLRADRLSDTSSDCLIVCETSLEQIYVHIKNTNPDLVIIDSIQTISTES IESSPGSIAQVRECSASILRFAKETHTPVLLIGHINKEGSIAGPKVLEHIVDTVLQFEGD 
QHYMYRILRSIKNRFGSTAELGIYEMRQDGLRQVNNPSELLLSQDHEGMSGVAIASAIEG IRPFLIETQALVSSAVYGNPQRSATGFDLRRMNMLLAVLEKRVGFKLAQKDVFLNIAGGL KVNDPAIDLPVISAILSSNMDAAIEPEVCMAGEIGLSGEIRPVNRIEQRIGEAEKLGFKR FLLPKYNLQGIDTKKLKIELVPVRKVEEAFRALFG >gi|222159219|gb|ACAB01000140.1| GENE 6 8459 - 9658 931 399 aa, chain - ## HITS:1 COG:TM1265 KEGG:ns NR:ns ## COG: TM1265 COG1373 # Protein_GI_number: 15644021 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Thermotoga maritima # 31 399 37 387 387 111 26.0 2e-24 MDTLIRRYKRLLTATSTTYIRSLMNTINWDNRLIAIRGARGVGKTTLMLQYLKLHYANDS QSALYTSLDSLYFTQHSLSELAEQFYLKGGKCLFLDEVHKYPSWSKEIKNIYDEFPELKI VFTGSSLLQLLNAEADLSRRCISYNMQGLSYREYLNLYHQIDIRPYTLEEILDKSDGICN EVNSQCRPLAYFEDYLKHGYYPFYLEGNAEYYTRIENIANLILEIELPQQCGVDISNVRK LKSLLGILSSEVPFMVDITKLSAMAELSRTTILAYLQYLDRAKLIHLLYSDNDSIKKLQK PDKIYMENTNLLYALTFKDVNKGTLREVFMVNQLAYQHRVEYCTRSADYTIDSKYTIEVG GKSKDGKQIANSKQAFIAADDIEYSAGNKIPLWAFGFLY >gi|222159219|gb|ACAB01000140.1| GENE 7 9762 - 10799 864 345 aa, chain - ## HITS:1 COG:YPO2161 KEGG:ns NR:ns ## COG: YPO2161 COG0252 # Protein_GI_number: 16122393 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Yersinia pestis # 7 344 5 335 338 275 45.0 1e-73 MNPLNTSVLLIYTGGTIGMIENAATGALENFNFEQLQKYIPELQKFNFPIDTYQFDPPMD SSDMEPDMWRKLVRIIHENYDRYHGFVILHGTDTMAYTASALSFMLEGLDKPVILTGSQL PIGVLRTDGKENLMTSIEIAVAQNKEGRALVPEVCIFFENHLMRGNRTTKMNAENFNAFR SFNYPVLAEAGIHIKYNNVQIHVNGEERELKPHYLLDTNVVVLKLFPGIQEDVIAAILGI DGLKAVVLETYGSGNAPRKEWFIRRLCQASERGIVIVNVTQCSAGMVEMERYETGYQLLQ AGVVSGYDSTTESAVTKLMFLLGHGYTADEVRDRMNRSMAGEITL >gi|222159219|gb|ACAB01000140.1| GENE 8 11201 - 13636 2532 811 aa, chain + ## HITS:1 COG:MJ0571 KEGG:ns NR:ns ## COG: MJ0571 COG0527 # Protein_GI_number: 15668751 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Methanococcus jannaschii # 3 454 4 467 473 
270 38.0 8e-72 MKVMKFGGTSVGSVNSILSVKRIVESASEPVIVVVSALGGITDKLINTSKMAAAGDSAYE GEFREIVYRHVEMIKEVIPAGEKQVSLQRQIGELLNELKDIFQGIYLIRDLSAKTSDTIV SYGERLSSIIVTELIDGAKWFDSRTFIKTERKHSKHTLDTDLTNKLVKEAFQSIPKVSLV PGFISSDKTTGDVTNLGRGGSDYTAAIIAAALDAASLEIWTDVDGFMTADPRVISTAYTI TELSYVEATELCNFGAKVVYPPTIYPVCHKNIPIIIKNTFNPDGVGTVIKQEVSNPQSKA IKGISSINDTSLITVQGLGMVGVIGVNYRIFKALAKNGISVFLVSQASSENSTSIGVRNA DADLACEVLNEEFAKEIEMGEISPILAERDLATVAIVGENMKHTPGIAGKLFGTLGRNGI NVIACAQGASETNISFVVDSKSLRKSLNVIHDSFFLSEYQVLNLFICGVGTVGGSLVEQI RCQQQKLMMENGLKLHVVGIIDAAKAMFSREGFDLSNFRQELLEKGKDSSLQTIRDEIIG MNIFNSVFVDCTASADIASLYKDFLQHNISVVAANKIAASSAYENYRELKTIARQRGVKY LFETNVGAGLPIINTINDLIHSGDKILKIEAVLSGTLNYIFNKISADIPFSRTIKMAQEE RYSEPDPRIDLSGKDVIRKLVILAREAGYRLEQEDVEKNLFVPNDFFEGSLEDFWKRVPS LDADFEARRQVLEKENKHWRFVAKLENGKASVGLQEVGANHPFYGLEGSNNIILLTTERY KEYPMMIQGYGAGAGVTAAGVFADIMSIANV >gi|222159219|gb|ACAB01000140.1| GENE 9 13738 - 14946 1157 402 aa, chain + ## HITS:1 COG:MA0132 KEGG:ns NR:ns ## COG: MA0132 COG3635 # Protein_GI_number: 20089031 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted phosphoglycerate mutase, AP superfamily # Organism: Methanosarcina acetivorans str.C2A # 1 402 1 396 397 384 49.0 1e-106 MKHIIILGDGMADWPVKSLGDKTLLQYAKTPYMDQLARMGRNGRLITVAEGFHPGSEVAN MSVLGYNLPKVYEGRGPLEAASIGVDLKPGEMAMRCNLICVEGEILKNHSSGHISTEEAD VLIQYLQEKLGNDRVRFHTGVQYRHLLVIKGGNKELDCTPPHDVPLKPFRPLMVKPLTPE AQETADLINDLILKSQELLKNHPLNLKRIAEGKDPANSIWPWSPGYRPQMTTFSETFPQV KKGAVISAVDLINGIGYYAGLRRIVVEGATGLYNTNYENKVAAALEALKTDDFVYLHIEA SDEAGHEGDIDLKLLTIENLDKRAVGPIYEAVKDWDEPVAIAVLPDHPTPCELRTHTSDP IPFLIWYPGIEPDEVQTYDEVSACNGSYGVLKEDEFIKEFMK >gi|222159219|gb|ACAB01000140.1| GENE 10 15074 - 16375 1417 433 aa, chain + ## HITS:1 COG:PM0115 KEGG:ns NR:ns ## COG: PM0115 COG0498 # Protein_GI_number: 15601980 # Func_class: E Amino acid transport and metabolism # Function: Threonine synthase # Organism: Pasteurella multocida # 1 432 1 424 424 367 44.0 1e-101 MKYYSTNKQAPVASLQEAVVKGLAADKGLFMPMSIKPLPQEFYDTIDTLSFQEIAYRVAD 
AFFGEDIPADTLKQIVYDTLSFDVPLVKVADNIYSLELFHGPTLAFKDVGGRFMARLLGY FIKKEGQKNVNVLVATSGDTGSAVANGFLGVDGIHVYVLYPKGKVSEIQEKQFTTLGQNI TALEVDGTFDDCQALVKAAFMDKELNEHLSLTSANSINVARFLPQAFYYFYAYAQLKRVG KADNAVICVPSGNFGNITAGLFGKKMGLPVKRFIAANNRNDIFYQYLQTGKYNPRPSIAT IANAMDVGDPSNFARVLDLYNGSHAAISAEISGTTYTDEQIRETVKETWKEHHYLLDPHG ACGYRALVEGLKEGETGVFLETAHPAKFLETVESIIGESVEIPAKLQEFMKGEKKSLQMT KEFADFKSYLLSL >gi|222159219|gb|ACAB01000140.1| GENE 11 16548 - 17174 534 208 aa, chain - ## HITS:1 COG:jhp1211 KEGG:ns NR:ns ## COG: jhp1211 COG1564 # Protein_GI_number: 15612276 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine pyrophosphokinase # Organism: Helicobacter pylori J99 # 9 205 2 199 204 123 35.0 2e-28 MISEHYTPEAVILANGEYPTHVLPLKILEEAKFVVCCDGAANEYILRGHTPDIIIGDGDS LSPENKTRFSDIIHHIADQETNDQTKAVHFLQEKGYQRIAIVGATGKREDHTLGNISLLL DYMKSGMEVRTVTDYGVFIPASGTQTFVSHTGQQISIINFGAKGLKGEGLVYPLSDFTNW WQGTLNEATSNRFTIQCTGEYLVFLASM >gi|222159219|gb|ACAB01000140.1| GENE 12 17171 - 17779 585 202 aa, chain - ## HITS:1 COG:PA1958 KEGG:ns NR:ns ## COG: PA1958 COG3201 # Protein_GI_number: 15597154 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinamide mononucleotide transporter # Organism: Pseudomonas aeruginosa # 6 197 4 181 191 89 27.0 5e-18 MELNFLEIFGTIVGLVYLWLEYRASIYLWIAGIVMPAIYIFVYYKAGLYADFGINIYYLI AAIYGWFFWMWGHRKKKSQQSADASIGDKPKDLPIVHTPWKCYLPLFLVFIVAFIGIAWI LIEYTDSNVPWLDSFTTALSIVGMWMLARKYVEQWFAWILVDIVCCGLYIYKDLYFTSAL YGLYSIIAIFGYFKWKKLMSVQ >gi|222159219|gb|ACAB01000140.1| GENE 13 17792 - 20041 1960 749 aa, chain - ## HITS:1 COG:no KEGG:BT_2390 NR:ns ## KEGG: BT_2390 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 749 8 743 743 1258 87.0 0 MKKIVWMAIALSGVGITVHAQTSVKDSMRVVNLQEVQVVSTRATAKTPVAFTNIGIAELK KVNFGQDIPYLLSMTPSTLTTSDAGAGIGYTTLRVRGTDGTRINITVNGIPMNDAESHNL FWVNMPDFSSSVKDMQVQRGAGTSTNGAGAFGASVNMQTEGASMKPYAEFNGSYGSFNTH KETVKVGTGLLNNHWTFDARLSNIGTDGYIDRASVDLNSYYLQGGYFAENTSVKLIAFAG 
KEKTYHAWGYATKKEMEDFGRRYNPCGEMYTDANGNKHFYDDQTDNYLQKNYQLLFNHTF STAWNLNVALHYTKGDGYYEEYKDGRSLIEYGLKPFTIDGTEITKSDLVRQKKMDNKFGG GVFSLNYTVNRLNASLGGGLNQYRGNNFGRVPWVKNYVGTLSPDHEYYRNKSKKTDGNIY LKANYDLTRGLSAYADLQYRHIDYTIDGNNDKYDWSKNALRPLAVDKKFDFFNPKVGLNW NITSNHRVYASFSVAQKEPTRNNYTDGDPDSYPKAEKLLDYEAGYTFANQWLTAGANFYY MDYTDQLVLTGALNDIGEALTENVPDSYRMGVELMLGIKPCKWFQWDINATWSKNRIQDF VESLPGYHYNENGSSTSLPTVQIKHKDTHIAFSPDFLFNNRFSFNYKGFEAALQSQFVSK QYMTNAEVEELTLDKYFVSNLNLAYSFRPKKVLKEVTVGFTVYNLFNEKYENNGWASSDY TDTVENRGNYAGYAAQAGTNVMGHVSFRF >gi|222159219|gb|ACAB01000140.1| GENE 14 20183 - 20803 445 206 aa, chain + ## HITS:1 COG:no KEGG:BT_3269 NR:ns ## KEGG: BT_3269 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 38 206 11 187 190 77 29.0 4e-13 MFYWKNSITLSSHKIYLELITIPSFYYKPMIIQPIEWGKNKKNEENFELLFERYHLSLLV VASHFVPRTAAEDIIQDVFLKLWEKRKELDKIESVKDYLYSAVKYGCFNYIRSQAMEQRC IEKITDEAFEEVMLDEEVFIELSKAITELPESYRKVIELSLEGKHADEIAQIMGVTRNAV NAYKKRAKSSLKKKLSCRAYSLVVLI >gi|222159219|gb|ACAB01000140.1| GENE 15 20918 - 22084 756 388 aa, chain + ## HITS:1 COG:PA1301 KEGG:ns NR:ns ## COG: PA1301 COG3712 # Protein_GI_number: 15596498 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 190 336 128 273 327 65 30.0 2e-10 MGYNDIYRIAALIVKSIKGKESEEERNELDVWRNASDRNRRLYQKYQDAEYFRARSIVKE KEYESSERIWKSIKTGIAKKKKRALLVKVVSCAAVFALLLSAGALWIIQSDVHQFEYQDD LTAIRAGDSKAILSLVDGSEYYLNSRIRLALVGENYAIRDEAGIHILDNKKGTDTKRKNK LFVPRNGKYDITLCDGTKVWVNSDSELYYPSAFASDQRIVKASGEMYFEVSHHDSWPFII ETDYGRVEVKGTSFNLRSYREEKKLVTTLVQGKVLVSSHETEDIIELHPGETAVVTDNQE IDVVTANVQEVVSWKDDMFVFRSSTLEQIMEEAARWYDIRIVWVSEERRHTLFSGELPRS KDFVDFVRMIELTEKVSFSVKGRTVYID >gi|222159219|gb|ACAB01000140.1| GENE 16 22119 - 25496 2275 1125 aa, chain + ## HITS:1 COG:no KEGG:BT_3271 NR:ns ## KEGG: BT_3271 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 24 1125 12 1116 1116 995 46.0 0 MMKELSKTGYSLAVIQAIALRMAFACWLLFTTFPILAQEHSVSLHLNHVSLLEAINEIKR QSNLNLMYSDEDVKSVTDITVNVDNVTPEVALTACLSGTNLVAQKQNGVFVIHLRKKEKK IKLTGKVMDSKDGSPLAGATVFIMKNDKRTVGTAATLDGDFSLEVPADTKEVVASYIGYK NKSVKVVANKPLQIFLEENSNSLNEVVVNGYSSVKRTSFTGNAVTVTKDDLLKVSTRNMI DVLQVYDPSLRIAVNNEMGSDPNSLPEFYIRGRSGVGVMELDKDAVSQSALSNNPNTPIF IMDGFEVSVSKVYDFDPNRIERITILKDAAATAIYGSRAANGVIVIETKAPVPGRLRVNY NFSAELTAPDISDYNLMNASEKLEAERLAGLYESDSPSTALNLQKEYLAKQNNILKGIDT DWIAQPLRNGFNHKHSLYIEGGVNDIRFGAEMNYDIQNGVMKGSYRKRAGGGFYIDYRIN NLQIKNYVTFNSVKSENSPYGTFSDYTSRQPYDSFKDEEGNYVEYLTTWHTGTSNLRNPL YDATMLSSYDRSSYTDLTNNLSLNYHLADVFQLKAQFSVTKRDDESRQFTDPGSGKYVNY ADSDPKGELLSGRSQSIYWNANLLANYVQTVNRNHMNFSLGLNIRETKDESESALYRGFP SGKLDSPNYADEIATKPSFSDNHTRLMGTFLSTNYTYDNIYLMDASFRMDGSSEFGSKKK FAPFWSFGAGLNIHNYPFMKQQTLFSQFKIRGSYGQIGKVNFPAYAAKNTYQILSDKYGT GAGVLLYYMGNENLKWERTNTFDIGTDISLLNDAVLFRFSWYNKKTVDLITDVTLPSSTG FTVYRDNLGEVQNRGIELDIRADVFKNRDWTVTLFGNLAHNKNKILKISESLKAYNDRVD TYFADYDKNIGTMQDSKYSKPFMKYTEGGSLTSIWGMQSLGINPSDGQETFLKKDGTITS DWNSSDQVIIGDTEPTAQGAFGVNLRYKQFTMYATFKYECGGQLYNSTLVNKVENVNIYE VNADKRVLTDRWKQPGDVTMLKSIKDRYQTTRPTSRFVQDNNTLEFNSFTLAYEFSNKLL APINLSMLRVQFNMNNIARISSIHQERGLSYPYARTFTFSLNLGF >gi|222159219|gb|ACAB01000140.1| GENE 17 25508 - 26980 948 490 aa, chain + ## HITS:1 COG:no KEGG:BT_3272 NR:ns ## KEGG: BT_3272 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 461 4 467 488 275 36.0 3e-72 MKKYNFAIIGVTLLLSLSACNDWLDIDPSDQVSSEKLFESGDGFRNALNGIYIAMSSSEL YGRELSWGLASVLSQTYADNDIAKSKAYGSAMEYEYGETEIKAVLESVWSKGYNVIANCN KLISEAEAKDSTAFREGALERNLIIGEAKALRAMMHFDILRLFAPSVESGDESKRIPYFT VYPSKYEPNLSTKEILKKVEEDLLEARQLVAKHDTIVNTSTMAMPDYRMGGTNKATGGDF FGQRGIRMNYVAINALLARVYLYDNDWNNAAKYAKHVIDDFVTRRKWFYFTDPYDYAKAT DKYKYVKLYEDTFLAFYDRELLERIDNFKSANNSLLMTLKNYDALFELDADDYRTSLATS DGKTGKTSMKWINTNSTNQSVTIQYKLIPMIRLSEMYYILSEASSKTSLSEALSYLSVVR KARGAKRDISGTASKDYYNELVAEYKKEFLAEGQMFYFFKRWNMPVTTGTQVIQMDGRYV MPIPESEIIY >gi|222159219|gb|ACAB01000140.1| 
GENE 18 26999 - 27635 332 212 aa, chain + ## HITS:1 COG:no KEGG:BT_3273 NR:ns ## KEGG: BT_3273 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 206 1 213 236 85 30.0 9e-16 MKKIYSLLIATFFLLVACEKDEIRIWDSGNYVQFVEEYKDSLTYSFFFYGTQEEIKVPLN MRLIGLPLANDAYVSLRVNKELTTAEPDSYELENEMVFRKDVLEETGYFLLKKTALLDTK EVRLVIDIVENDVLKPGQNIYTRKVVRFSSLISQPLWWDDVVTESLLGKYTETKFRLFME VTGVGDLTELSESERWSLARQFKYYLIKKEDE Prediction of potential genes in microbial genomes Time: Wed May 18 04:16:47 2011 Seq name: gi|222159218|gb|ACAB01000141.1| Bacteroides sp. D1 cont1.141, whole genome shotgun sequence Length of sequence - 62349 bp Number of predicted genes - 52, with homology - 52 Number of transcription units - 24, operones - 16 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 55 - 1536 1116 ## BT_3274 hypothetical protein 2 1 Op 2 . + CDS 1549 - 4125 1533 ## BT_3275 hypothetical protein 3 1 Op 3 . + CDS 4130 - 6670 1383 ## BT_3276 hypothetical protein + Prom 6723 - 6782 10.0 4 2 Tu 1 . + CDS 6824 - 7729 1003 ## COG0668 Small-conductance mechanosensitive channel + Term 7745 - 7814 10.3 - Term 7746 - 7790 8.4 5 3 Op 1 . - CDS 7836 - 8075 295 ## BT_2388 hypothetical protein 6 3 Op 2 . - CDS 8144 - 9430 1370 ## COG2873 O-acetylhomoserine sulfhydrylase - Prom 9550 - 9609 3.6 + Prom 9404 - 9463 6.8 7 4 Op 1 . + CDS 9608 - 10078 449 ## COG1522 Transcriptional regulators 8 4 Op 2 . + CDS 10151 - 10777 343 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 9 4 Op 3 . + CDS 10789 - 11316 202 ## PROTEIN SUPPORTED gi|229878290|ref|ZP_04497790.1| acetyltransferase, ribosomal protein N-acetylase + Term 11359 - 11395 -0.4 + Prom 11319 - 11378 5.6 10 5 Op 1 . + CDS 11401 - 12135 367 ## BT_2379 hypothetical protein + Prom 12143 - 12202 4.2 11 5 Op 2 . + CDS 12229 - 13602 975 ## COG0534 Na+-driven multidrug efflux pump + Term 13608 - 13642 2.0 12 6 Tu 1 . 
+ CDS 13656 - 14804 578 ## COG3177 Uncharacterized conserved protein
- Term 14843 - 14900 1.7
13 7 Op 1 . - CDS 15021 - 15239 303 ## BT_2974 hypothetical protein
14 7 Op 2 . - CDS 15202 - 15390 95 ## gi|237713326|ref|ZP_04543807.1| predicted protein
- Prom 15410 - 15469 3.0
+ Prom 15354 - 15413 7.1
15 8 Tu 1 . + CDS 15489 - 16685 1056 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs
- Term 16574 - 16617 -0.8
16 9 Tu 1 . - CDS 16677 - 18161 865 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs
- Prom 18212 - 18271 6.0
17 10 Op 1 . - CDS 18688 - 19158 321 ## BT_1189 hypothetical protein
18 10 Op 2 . - CDS 19190 - 20320 675 ## COG2207 AraC-type DNA-binding domain-containing proteins
19 10 Op 3 . - CDS 20317 - 20733 365 ## BT_2376 hypothetical protein
- Prom 20872 - 20931 6.2
+ Prom 20834 - 20893 7.2
20 11 Tu 1 . + CDS 20932 - 21552 418 ## BT_2369 hypothetical protein
+ Term 21631 - 21688 -0.4
- Term 21529 - 21568 0.2
21 12 Op 1 . - CDS 21626 - 22471 684 ## COG2961 Protein involved in catabolism of external DNA
22 12 Op 2 . - CDS 22526 - 23002 407 ## BT_2370 hypothetical protein
- Prom 23043 - 23102 5.2
23 13 Tu 1 . - CDS 23106 - 25238 1485 ## COG0475 Kef-type K+ transport systems, membrane components
- Prom 25262 - 25321 9.0
24 14 Op 1 . - CDS 25498 - 25863 286 ## COG2315 Uncharacterized protein conserved in bacteria
25 14 Op 2 . - CDS 25938 - 26474 351 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases
- Prom 26556 - 26615 4.4
26 15 Op 1 . + CDS 26861 - 28096 977 ## COG0582 Integrase
27 15 Op 2 . + CDS 28108 - 28470 271 ## BF0151 hypothetical protein
+ Term 28493 - 28543 10.1
- Term 28480 - 28530 10.1
28 16 Op 1 . - CDS 28548 - 28853 334 ## BF0150 hypothetical protein
29 16 Op 2 .
- CDS 28885 - 29178 198 ## BF0149 hypothetical protein
- Prom 29385 - 29444 3.6
+ Prom 29120 - 29179 3.9
30 17 Op 1 . + CDS 29386 - 29748 282 ## BF0147 hypothetical protein
31 17 Op 2 . + CDS 29752 - 30102 400 ## BF0146 hypothetical protein
32 17 Op 3 . + CDS 30123 - 31694 1405 ## BF0145 hypothetical protein
33 17 Op 4 . + CDS 31755 - 33842 1906 ## COG0550 Topoisomerase IA
34 17 Op 5 . + CDS 33870 - 34457 465 ## BF0143 hypothetical protein
35 17 Op 6 . + CDS 34447 - 40263 3874 ## COG4646 DNA methylase
+ Prom 40718 - 40777 10.4
36 18 Op 1 . + CDS 40921 - 42846 700 ## COG0480 Translation elongation factors (GTPases)
37 18 Op 2 13/0.000 + CDS 42846 - 45164 1285 ## COG0642 Signal transduction histidine kinase
38 18 Op 3 . + CDS 45157 - 46479 947 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains
+ Term 46704 - 46738 2.0
+ Prom 46508 - 46567 2.6
39 19 Tu 1 . + CDS 46759 - 47181 152 ## BF0137 hypothetical protein
+ Term 47301 - 47349 6.8
+ Prom 47271 - 47330 8.7
40 20 Tu 1 . + CDS 47420 - 48025 451 ## BF0136 tetracycline resistance element mobilization regulatory protein RteC
+ Term 48044 - 48090 9.4
+ Prom 48226 - 48285 7.7
41 21 Op 1 . + CDS 48327 - 49457 903 ## COG1373 Predicted ATPase (AAA+ superfamily)
42 21 Op 2 . + CDS 49474 - 52728 1242 ## BF0134 hypothetical protein
+ Term 52852 - 52883 -0.8
43 22 Op 1 . - CDS 52860 - 54878 1278 ## COG3505 Type IV secretory pathway, VirD4 components
44 22 Op 2 . - CDS 54909 - 56156 885 ## BF0132 hypothetical protein
45 22 Op 3 . - CDS 56135 - 56563 256 ## BF0131 hypothetical protein
46 23 Op 1 . + CDS 57245 - 58000 707 ## BT_2303 conjugate transposon protein
47 23 Op 2 . + CDS 58006 - 58455 167 ## BT_2302 conjugate transposon protein
48 23 Op 3 . + CDS 58479 - 58841 153 ## gi|160889352|ref|ZP_02070355.1| hypothetical protein BACUNI_01775
49 23 Op 4 . + CDS 58838 - 59578 441 ## BF0126 hypothetical protein
50 24 Op 1 .
+ CDS 59780 - 60091 317 ## BF0125 hypothetical protein 51 24 Op 2 . + CDS 60102 - 60434 194 ## BF0124 hypothetical protein 52 24 Op 3 . + CDS 60431 - 62348 1150 ## BF0123 hypothetical protein Predicted protein(s) >gi|222159218|gb|ACAB01000141.1| GENE 1 55 - 1536 1116 493 aa, chain + ## HITS:1 COG:no KEGG:BT_3274 NR:ns ## KEGG: BT_3274 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 460 3 483 534 100 22.0 1e-19 MKARYLIYLLSGLMTFYSCVDDKGSYDYLQTPTVKITFSESIYDGVIGEETHIEPNLSFS EGDDVSNYEFEWELNGEVVCNEQILSFVPKEAITYYVLYSVIHKDSKIRTMKNLQLVAKS SYKTGWAILSDLNHKAILTQVRDDNDEYVDYLNIYEKVNGEELGSQPYKLVEHYSGKKTN PEILVVNQDAKGPLELEGNSMMKVLYTTQEFTEGVPEDFVVTDAAYLYYTDVLLTANGQI YIRLLKNPDNAGFHSAPYSSIPVHYNMGMKITRMVQANFYKTRHVLMYDEKNHRFLSLSS LYSYYTGAIDDVPINGLEGKKLIYADAYLPDPYTDYLCGYVLVLQDADGNYSVKTFDLEY DWQGKVIIKNETDFSFSDGGDYLTDKSKFYCSKDASGSYLFFSGGAINNELYYYEKSTQR VIPYTICNSEITDLQLNYESMTLGVGMKDGFVVYAVDESTIMGGEAKKLHEVNNIGEVVD VIYRYGSPSKMYR >gi|222159218|gb|ACAB01000141.1| GENE 2 1549 - 4125 1533 858 aa, chain + ## HITS:1 COG:no KEGG:BT_3275 NR:ns ## KEGG: BT_3275 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 43 858 30 860 860 879 52.0 0 MKIRLLTLVGLSLLVMFSNEGHAKFPWKKKKAKSEQVKKATPYEKLFRGKKCETVKGMIT LHKMDDRVYFELPLNLLEKDMLLGSTISETTDNRFGSVGEKPYAPLHITFIKTDSLISLC EVSDDYMSHDSNLQQRISESTRPAILKNFNIKAYSPDSTAVVFDMTDFLLANRDNMNPFS PFSPIMAAYTVSKEFRKEDSSIMSIKSFDDNLSIRSSLTYNVTVSSMGVTYIKQMPFTAV LTRSIILLPEKPMRPRYADSRINIFYTPKVKFSNESQRVENVYYALRWRLEPVDAEKYEK GELVEPKKPIIYYIDNAFPNFWKKYIRRGILRWNDAFEKIGFKNAICVKDFPLDDPEFDP DNIKYSCVRYSPSFIANAMGPSWSDPRTGEILNASVYLYHNLVGLVRNWRFVQTAQADTD VRRKQLPTDILGDCISYVVAHEVGHTFAFMHNMAASSSIPVDSLRSASFTNKYGTTYSIM DYARNNYVAQPGDKEKGVRLTPPDLGIYDYFSVKWLYSPIMSAKSSEEEIPILRQWISEK LPDPAYRYGRQQIRTRIDPSSFEEDLGDDPVKASSYGVMNLKYILSQMNEWIKDDVEGDF RKEIYNEIIGQYIRYINNVLFNIGGIYVNEHYEGDPFPSYVPVTKEKQRESLKFLWEQAL DCSWIDEPALQKILPLHSNLSSVIENLIIDYVLKRFPSVAICGDKMEQNPYTQLDYMDDL 
YYYMFMVPDKKNVLSASEKNMQLKFVSYLMNGIKTVKKGSAAGTSGRMLTDERIPLPEGI VAPTYLYDGPLNHFNDPKEVMGFEDTVSVTIPVESMEHHYYSMLRKIERLAKNRMSAGSD ENRTHYRLLVYKISQSLK >gi|222159218|gb|ACAB01000141.1| GENE 3 4130 - 6670 1383 846 aa, chain + ## HITS:1 COG:no KEGG:BT_3276 NR:ns ## KEGG: BT_3276 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 39 841 44 863 868 560 38.0 1e-157 MNRKIIYFSLLSLLLLFTVQDACARKKKEKSAKTEIKGYENHLKDCVASDSTGLMIMHRT KDKVLVEFPVRLLEKEMLLASSIINISDNGEGVVGQFTSPEVMFRFVRNGKKLEARMMTN QPLMREETGKEGVASSLLKSASGGVYKSFKIEAFAKDSSSVLVNLAPLFLEHSDYSNPFS SYAPNSMYGMVTRKFRCDTSRSFLKDIHSYPDNISVTCVLGYDVDYTAFGTITMQRNVPV TVTAARMLVLLPETPMDIRYANQKVNTAFLVRKQYVDEMSPLKSTHLVKRWRLEPSDKAA FKRGEKVEPKKPIVFYIDTLMPDSWKPYIKEGVEEWNKSFEKIGYLNAVQAKEWPKGTDF SSSNIRHSSICYAPDWMYMAQTSMHTDPRTGEILNASVYIHHNFLSLLYSGRCTQTMASD PTARTLTLSEKQMGELLKVGIAQQVGRCLGLTDNMGASYHYPVDSLRSAEFTRQHGLTAS VMDNIMCNYIAQPEDVEKGAVLVQPGIGPYDYFPIRYLYAPVVADKPEKELVTLNKWVED AYTAHEYHYGPRQEFYALYDPTALYWDLGDDPFKAADYQIQNLKISIANFMKWYAKEDYD ISRRAELYASLIKLFTNRAMELSFWIGGLYLDEGKEGISFPVSKEMQQKALNYLVKMSMD LDWLTNAEVKSSLELQDLIVDKTRKYIFQLLFDRIRYVALCSEKSDGEYSVKNYMDDIHS IVWKGVLQNRVLTNTEMLYQNAFIDYLVKNISKNMGGGTAKNAGRQQALRGDSFKMDIFN EANSFVMGWQAQDIVPVNSHQDASVYWDMLLKIRNLLSSRISSASPDMKEYYGFLIYKIN QILDDN >gi|222159218|gb|ACAB01000141.1| GENE 4 6824 - 7729 1003 301 aa, chain + ## HITS:1 COG:PA4394 KEGG:ns NR:ns ## COG: PA4394 COG0668 # Protein_GI_number: 15599590 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Pseudomonas aeruginosa # 33 297 6 269 278 213 44.0 5e-55 MLLLLQATQAADSVQVAADELIKEAIANADGLDKLSLITQQLIDFGIRAGERILIAVIVF IVGRFLISMLNKFVGRLMDKRKVDISIKTFVKSLVNITLTVLLIISVVGALGVETTSFAA LLASAGVAVGMALSGNLQNFAGGLIVLLFKPYKVGDWIESQGVSGTVKEIQIFHTILTTA DNKVIYVPNGAMSSGVVTNYSNQVTRRVEWIVGVDYGEDYEKVQKIVYDILAVDQRILKE PAPFVALHALDASSVNVVARVWVNSGDYWGVYFDINKAIYETFNEKGINFPFPQLTVHQG N >gi|222159218|gb|ACAB01000141.1| GENE 5 7836 - 8075 295 79 aa, chain - ## HITS:1 
COG:no KEGG:BT_2388 NR:ns ## KEGG: BT_2388 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 79 1 79 79 129 94.0 3e-29 MEKYLIHSNELHLIDAEKIHQAVEKMVESLDLAAGSTTNFDLYQVVENYFKDLEKRRKIN HVLGIKEDRYELAEDFGIK >gi|222159218|gb|ACAB01000141.1| GENE 6 8144 - 9430 1370 428 aa, chain - ## HITS:1 COG:L75975 KEGG:ns NR:ns ## COG: L75975 COG2873 # Protein_GI_number: 15672055 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Lactococcus lactis # 1 428 1 426 426 525 62.0 1e-149 MATKKLHFETLQVHVGQEQADPATDARAVPIYQTTSYVFHNSAHAAARFGLQDPGNIYGR LTNSTQGVFEQRVAALEGGVAGLAVASGAAAITYAFENITRAGDHIVAAKTIYGGSYNLL AHTLPSYGITTTFVDPSDLSNFEKAIQENTKAVFIETLGNPNSNIIDIEAVAEIAHRHKI PLIIDNTFGTPYLTRPIEHGADIVVHSATKFIGGHGSSLGGVIVDSGKFDWVASGKFPQL TEPDPCYHGVRFVDAAGPAAYAIRIRAILLRDTGATISPFNAFILLQGLETLSLRVERHV ENALKVVNFLNNHPKVKKVNHPSLSNHPDHALYQRYFPNGAGSIFTFEVKGGQEEAHRFI DSLEIFSLLANVADVKSLVIHPASTTHSQLNAQELAEQEIYPGTVRLSIGTEHIDDLIAD LDQALTKI >gi|222159218|gb|ACAB01000141.1| GENE 7 9608 - 10078 449 156 aa, chain + ## HITS:1 COG:NMB0573 KEGG:ns NR:ns ## COG: NMB0573 COG1522 # Protein_GI_number: 15676478 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Neisseria meningitidis MC58 # 7 150 33 175 187 114 38.0 5e-26 MDTFDKLDKVDLQILRTLQENARLTTKELAARVSLSSTPVFERLKRLENGGYIKKYIAVL DAEKLNQGFVVFCSVKLRRLNRDIAAEFTRIIQDIPEVTECYNISGSYDYLLKIHAPNMK YYQEFILNVLGTIDSLGSLESTFVMAEVKHQYGIHI >gi|222159218|gb|ACAB01000141.1| GENE 8 10151 - 10777 343 208 aa, chain + ## HITS:1 COG:CAC0777 KEGG:ns NR:ns ## COG: CAC0777 COG0110 # Protein_GI_number: 15894064 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Clostridium acetobutylicum # 3 206 7 210 210 281 66.0 8e-76 MNSNIYPRTNDFQTIYLNTVIKNPAIIVGDYTIYNDFVNDPVLFEKNNVLYHYPINKDRL IIGKFCSIACGAKFLFNSANHTLNSLSNYTFPIFFEEWNLDKGDVTSAWDNKGDIVIGND 
VWIGYEAVIMAGVHIGNGAVIASRAVVTKDVPPYTIVGGTPAREIRKRFDEQTIVRLQEL QWWDWSVEKISECIPYITGGKIEELMKR >gi|222159218|gb|ACAB01000141.1| GENE 9 10789 - 11316 202 175 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229878290|ref|ZP_04497790.1| acetyltransferase, ribosomal protein N-acetylase [Slackia heliotrinireducens DSM 20476] # 9 156 11 158 181 82 29 6e-15 MKFIIEPGTSADIDELEKLYNELNDYLAATINYPGWIKGIYPIREDAVAGVNDNMLYVAR IDGRIAGSVILNHQPEKAYENVKWKMELDYSCIFVIHTFVVHPSFLKKGVGHALMDYSLE LAQRSGIKSVRLDVYEKNLPAISLYEKCGFEYVDTVDLGLGQYGLDWFRLYERII >gi|222159218|gb|ACAB01000141.1| GENE 10 11401 - 12135 367 244 aa, chain + ## HITS:1 COG:no KEGG:BT_2379 NR:ns ## KEGG: BT_2379 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 244 17 244 244 408 85.0 1e-112 MKKLIIVFVLFLSTFSCFSQTEFTTCLFDGARNRVIPITVYQPQKVNSKTKVIIFNHGYD GNKNRKSNQTYSYLTRFLSQKGFYVISIQHELPNDPLLAMEGDFMQTRMPNWERGVANIY FTIQEFKKLKPQLDWDKLILIGHSNGGDMTMLFATKYPHLINKAISMDHRRMIMPRTEKP RLYTLRGCDYDADAGVLPTKQEQEQFHMKVVKLDGITHSNMGENGTEEQHRLINQSISGF LTQK >gi|222159218|gb|ACAB01000141.1| GENE 11 12229 - 13602 975 457 aa, chain + ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 9 430 7 428 448 268 37.0 2e-71 MINKYEARLGTERMLPLVFKMALPAVVAQFVNLLYSMVDRIYIGHIPGIGTDALAGVGVT TSLVILISSFSAIVGGGGAPLAAMALGQGDRLRAGKILGNGFVLLILFTLFTSFIAYTFM EPILLFTGASENTLEYAVDYLSIYLLGTVFVEISTGLNSFINAQGRPAIAMYSVLIGALL NIILDPIFIFWFDMGVKGAALATVLSQACSAAWVLSFLFSRRASLPLEKRNMRLSRKIVF AMLALGVSPFIMASTESLVGFVLNSSLKNFGDIYVSALTILQSAMQFASVPLTGFALGFV PIISYNYGHGDKQRVKDCFRIVMITMFSFNLLLMLLMILFPTIVASAFTSDEKLIETVRW TMPVFLGGMTIFGLQRACQNMFVALGQAKIPIFIALLRKAILLIPLALILPHYMGVTGVY AAEAISDATAAICCTLLFFWQFPKILGKIQGNILRTD >gi|222159218|gb|ACAB01000141.1| GENE 12 13656 - 14804 578 382 aa, chain + ## HITS:1 COG:RSc3413 KEGG:ns NR:ns ## COG: RSc3413 COG3177 # Protein_GI_number: 17548130 # Func_class: S 
Function unknown # Function: Uncharacterized conserved protein # Organism: Ralstonia solanacearum # 19 380 7 369 371 400 51.0 1e-111 MRRIIVIFTAKYTENMKSIYIWQQTDWPDFTWDDAKLSYKLGRVRGLQGRLVGRMSALGF DLKNSAMLDALTADITKSSEIEGEILNVDQVRSSVARHLGIEIEGLPEADRYIDGVVQVM IDATQNYMQPLTAERFFNWHAALFPTGRSGVYKITVADWRQGAEPMQVISGAMGKEKIHY QAPDSDHVPYQMKLFLEWVNGNQKIDPVLKAAIAHLWFVTIHPFDDGNGRISRTITDLFL ARADEMPHRFYSMSAEIRKQRKRYYEMLEKTQKGSLDITNWLEWFLDCLEAALLDTEKSI STILQKAAFWDKYRLVSMNERQIKMVNLFWDGFDGKLTSSKWAKITKCSPDTALRDIQDL ITKGVFRKTDEGGRSTNYELVL >gi|222159218|gb|ACAB01000141.1| GENE 13 15021 - 15239 303 72 aa, chain - ## HITS:1 COG:no KEGG:BT_2974 NR:ns ## KEGG: BT_2974 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 72 1 72 72 85 63.0 8e-16 MEEQILIFRTSVRNARDIKHIAALFTLCPQIYKWNVDIEDWEKVLRIECQGITPTEISKA LRAINIYAQELE >gi|222159218|gb|ACAB01000141.1| GENE 14 15202 - 15390 95 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713326|ref|ZP_04543807.1| ## NR: gi|237713326|ref|ZP_04543807.1| predicted protein [Bacteroides sp. 
D1] # 1 62 1 62 62 110 100.0 2e-23 MTDFSKDKYMKLISPITYFVMSNQGIRRIILYRTLFKPTFALKTKVIDSNYGRANTHFSN LG >gi|222159218|gb|ACAB01000141.1| GENE 15 15489 - 16685 1056 398 aa, chain + ## HITS:1 COG:PAB2227 KEGG:ns NR:ns ## COG: PAB2227 COG1167 # Protein_GI_number: 14520410 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Pyrococcus abyssi # 5 392 16 406 410 369 46.0 1e-102 MKLNFAKRMSYIKASEIREILKVTEQEDVISFAGGLPAPELFPIDEINEINQIVLKEAGT KALQYTTTEGYVPLREWIANRMNERLGTTFDKDNILITHGSQQGLDLSGKVFLDEGDIVL CESPTYLAAISAFKAYGCSFIELPTDGDGMMMDVLEEILSNTPHIKLIYAIPTFQNPTGK TWSLERRRKLAELSAKYGVAVIEDNPYGELRFEGESLPSIKSFDEAGNILCTGSFSKIFC PGFRIGWIAGDKEIIRKYVLVKQGTDLQCNTIAQMTIAEYLKRYDIDKHIAKIVEVYRKR RNVAIESIERYFPDTIKFTRPEGGLFTWIELPEGISARDVLVRSLEKKIAFVPGGSFYPN GNKENTLRINYSNMPEDKIEKGLKTLGEVVKEYISQSK >gi|222159218|gb|ACAB01000141.1| GENE 16 16677 - 18161 865 494 aa, chain - ## HITS:1 COG:PA4132 KEGG:ns NR:ns ## COG: PA4132 COG1167 # Protein_GI_number: 15599327 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Pseudomonas aeruginosa # 25 491 1 467 471 343 36.0 6e-94 MNYLQSNNNRYRHHFCIFAKYRINMKREILYQKIAGTIAWQIKTGIWKAGEKLPSLRTIS NEYGVSLNTAIQVYYELEKEGFIISRPKSGYIVNYKPLNLSAPATTQPSAQSPGKEEADL IMEVYHSIEDSSITRFSLGIPEDVLLPIAKLNKELTKAMYSLPGNGTRYEDPQGNIRLRN YIARFAYSWNGNLTEDDIVTTTGVTNSISLALSVIAQRGDTIAVESPVYFGILQLANSME LHVLELPTNPVTGIEPDALRKVLPQINLCLLISNFSNPLGSCMPDEHKKEIVQMLAEHNI PLIEDDLYGDVFFGHSRPKPCKAFDEKGLVLWCGSVSKTLAPGYRVGWIAPGKFKDAIIR QKHIHLISTPTLNQEAIANFMENGRYENHLRKLRHELHSNSLHLAQAITDYFPEDTKIIT PQGGFMLWVELNKKIDTTELYYKAMQHKISIAPGRMFTLHDQYRNCMRLSFGQQWSAFIE ERLQVLGNIIKESF >gi|222159218|gb|ACAB01000141.1| GENE 17 18688 - 19158 321 156 aa, chain - ## HITS:1 
COG:no KEGG:BT_1189 NR:ns ## KEGG: BT_1189 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 156 1 156 156 239 78.0 2e-62 MGKISEIMLLKQPKQPALAVEVQTDMNGMSEAIGKNFVKIDSLFKEQGEVTTDIPYVEYP NFESLTEHNIKMIIGLKSSKELQGEGDIRSITIPERKIVSCLHRGTYSELAVLYNEMMEW IKTNGYKLSGTSIEYYYSNPNVPEEEQVTRIEMPLL >gi|222159218|gb|ACAB01000141.1| GENE 18 19190 - 20320 675 376 aa, chain - ## HITS:1 COG:BH3506_1 KEGG:ns NR:ns ## COG: BH3506_1 COG2207 # Protein_GI_number: 15616068 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 109 206 7 104 130 66 36.0 8e-11 MIESDINKRYCQSCGMPLRFDIEKYLGTNSDGSRSDEYCYYCLKDGKYIVDIPMSEMINI WIKYTDKYNEYADTAYSPEELRRILNERLPKLNRWKQKLETSNIHHQKIQDIVVYINNHL FDSLDADILSTISGLSKYHFRRVFQTVAGENIGSYIQRLRLEHIAHLLVSTDFTLNQISE QTNYQTKFSLAKAFKKHFGVSTSQYREKYKPMYDEQHAVITPEIRSILPMKVFCIEVGEK YKDELRYKLIWDRLTNYARQHNEEKSNDKFVSLSMDDPSITPMDKCRFYLGVIIDNKEND SQPGVMEVPGGRYAVFRHIGDYSLLHKFYRTIYEEWFPESKYRPQSTFSFEMYMNRPAST LKTELITDIYIPVIKK >gi|222159218|gb|ACAB01000141.1| GENE 19 20317 - 20733 365 138 aa, chain - ## HITS:1 COG:no KEGG:BT_2376 NR:ns ## KEGG: BT_2376 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 138 38 175 178 207 76.0 7e-53 MKETLQIPTPETLEALIGKELYDIWTSLCQLIEQKYNMEQLWNHGGKKWIYEYKYRKGGK TLCALYAKEQTIGFMVILGKDERAKFESLREVFSNETQKIYDETTTFHDGKWLMFELKDT CLFNDMERLLSIKRKPSR >gi|222159218|gb|ACAB01000141.1| GENE 20 20932 - 21552 418 206 aa, chain + ## HITS:1 COG:no KEGG:BT_2369 NR:ns ## KEGG: BT_2369 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 203 1 203 209 354 83.0 1e-96 MKTILLAVGTLCLMAGCSSKVASSKGIVSDATMNTVTIVTDKNDTLSFSTIDANKDEVNG LLLNDTLEVFYTGKYTPGMPASKLVQYPQSPIVGGDRDEHGCIGSAGYVWCEVQKDCIRL FEKGIRTESVDDSNASAFIVFSPDSTQLELFFSNNQPNEILERRGLPSGGYAWNVEDDDT KNVRLIDGLWTISQRNKLIYSQKAGN >gi|222159218|gb|ACAB01000141.1| GENE 21 21626 - 22471 684 281 aa, chain - ## 
HITS:1 COG:NMB1061 KEGG:ns NR:ns ## COG: NMB1061 COG2961 # Protein_GI_number: 15676945 # Func_class: R General function prediction only # Function: Protein involved in catabolism of external DNA # Organism: Neisseria meningitidis MC58 # 6 244 9 246 281 61 24.0 2e-09 MATYTHFAKQPDVLKHLILCEVLRNEAPQVYVETNSACAIYSMTQTPEQQYGIYHFLEKA DDDKSLKASIYYQLESTEMAKGNYLGSPALAMNVLGKQAKRFVFFDLEKSALENVDIYAK QIGLSTSVEIHHTDSLKGTIELLPSLPASSFIHIDPYEIDKKGDSGVTYLDILIQATQLG MKCLLWYGFMTGDDKASLDNYIVSSLEKAGIKDYIGVELTMNSIRKDSILCNPGILGSGI LATNVSQTSKDIILDYGNKLVDIYKDAKYKDYDGSLYIQHL >gi|222159218|gb|ACAB01000141.1| GENE 22 22526 - 23002 407 158 aa, chain - ## HITS:1 COG:no KEGG:BT_2370 NR:ns ## KEGG: BT_2370 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 156 1 156 156 248 84.0 6e-65 MSNVDFSKITEGIYHVIETEEKILTGLQNDTITLKRNKQNRTIKQILGHLIDSASNNHQR MVRLQYSKDLLFFPDYRQDNDLWIALQDYQNEDWNNLIQLWKFFNLHIIQIIKSVDQTKL DNYWCDFEGTKVTLQDMIEGYLSHLHLHIHEIHELMNA >gi|222159218|gb|ACAB01000141.1| GENE 23 23106 - 25238 1485 710 aa, chain - ## HITS:1 COG:BH2844_1 KEGG:ns NR:ns ## COG: BH2844_1 COG0475 # Protein_GI_number: 15615407 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, membrane components # Organism: Bacillus halodurans # 11 400 5 388 388 306 44.0 8e-83 MDWIGLDLTFPITDPTWIFLLVLLIILFAPILLNKLRIPHIIGMILAGLAIGEHGFNILA RDSSFQLFGKVGLYYIMFLAGLEMNMGDFKETRNKALVLGLLAFIVPIGIGFVANVSYLK YGVITSVLLASMYASHTLVAYPIVTRFGISRHRSVSIAVGGTAVTDTLTLLVLAVIGGLF KGETGGLFWIWLVVKVIFLGALIIYFFPRIGRWFFHRYNDNVMQFIFVLAMVFLGAGLME LVGMEGILGAFLAGLVLNRLIPHVSPLMDHLEFVGNALFIPYFLIGVGMLINLRVIFGEG DALKVAAVMITMALTGKWIACWLTQKIYKMSVLERNLMYGLSNAQAAATLAAVLVGYNII LPTGERLLNDDVLNGTVLLILVTCVVSSLITERAARKMAMDDSQPENESSKETEKILISI ANPDTIEDMVNLSLVIRDPKLKDNLLALNVINDDNNSDNLRIRSKHYLEKAEMIATEANV PLKKVTRYDLNIASGIIHSVKENEITSIITGLHRKKNITDSYFGILAEHLLNGLNCEIII SKFLIPVHTIKRIVVAVPPKAEYESGFPHWMEHFCRMGSTLGCRVHFFANEKTTIRLQAW IKKRHKQTLTDFSLLEDWNDLLVLTGQVSYDHLLVIISARPGTLSYDSSFEKLPRQLGKY 
FSNNSFIVLYPNQSGEPLDTSFFSKLYTDTETRHYEKVGKWFYRWFKKEG >gi|222159218|gb|ACAB01000141.1| GENE 24 25498 - 25863 286 121 aa, chain - ## HITS:1 COG:YPO0323 KEGG:ns NR:ns ## COG: YPO0323 COG2315 # Protein_GI_number: 16120660 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Yersinia pestis # 1 116 1 114 119 68 36.0 2e-12 MNIEDYREYCLSVKGATECMPFDEHTLVFKVMDKMFTFATLRPKNGRFWADMKCNPDKSE ELIEQYEDIFWGPFSDKKHWITVYLEGDVPDKLIKELISHSVEEVLKKLPKKKQEEYKNM I >gi|222159218|gb|ACAB01000141.1| GENE 25 25938 - 26474 351 178 aa, chain - ## HITS:1 COG:all4541 KEGG:ns NR:ns ## COG: all4541 COG0664 # Protein_GI_number: 17232033 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Nostoc sp. PCC 7120 # 20 176 34 189 193 78 35.0 5e-15 MPEASADKVVKHLSKITYPKGYHILEAGKTETNIFFIEKGIARAYIPVDGKEVTFWIGKE GSTIVSLKSYVNNQQGYESMELMENSVLYLLKRKDLHELFKEDIHIANWGRKFAESEFLQ TEERLISLLFTTASERYMKLIQNNPELLQRIPLECLASYLGITPVSLSRIRAKLKRVL >gi|222159218|gb|ACAB01000141.1| GENE 26 26861 - 28096 977 411 aa, chain + ## HITS:1 COG:TM0967 KEGG:ns NR:ns ## COG: TM0967 COG0582 # Protein_GI_number: 15643727 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Thermotoga maritima # 228 387 100 245 253 59 27.0 1e-08 MKSTFSVIYYLKRQVVKKDGTVPVMGRITVDGSQTQFSCKLTVDPKLWDTKGGRVTGRST AALETNRMLDKMRVRINRHYQEIMERDNFVTAEKVKNAFLGLEHRYHTLMQVFRQHNEDY EKQVEAGMKAKGTLLKYRTVYKHMQEFLDIRYHVKDIALKELTPAFISDFEMFLRTDKHC CTNTVWLYVCPLRTMVFIAINNEWLTRDPFREYEIKKEETTRSFLTKDEIRLLMEGKLKN AKQELYRDLYLFCAFTGLSFADMRNLTEENIRTYFDEHEWININRQKTGVVSNIRLLDIA NRIIGKYRGLCGDGRIFPVPHYNTCLAGIRAVAKRCGITKHITWHQSRHTAATTIFLSNG VPIETVSSMLGHKSIKTTQIYAKITKEKLNQDMENLAARLNGVEEFAGCTI >gi|222159218|gb|ACAB01000141.1| GENE 27 28108 - 28470 271 120 aa, chain + ## HITS:1 COG:no KEGG:BF0151 NR:ns ## KEGG: BF0151 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # 
Pathway: not_defined # 1 120 1 120 120 241 100.0 6e-63 MKRDTIIIEDKAVSVTGNDVWMTATEIAGLFHTTVPAVNAAIRAVRKSDVLNDYEVCRYM QLENGLHADVYALEIIIPVAFRVNTYNTHLFRTWLVGKALSQEKRQTYVMFIQNGKAGYC >gi|222159218|gb|ACAB01000141.1| GENE 28 28548 - 28853 334 101 aa, chain - ## HITS:1 COG:no KEGG:BF0150 NR:ns ## KEGG: BF0150 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 101 1 101 101 190 100.0 1e-47 MMNENNDVFTMEDEPIASVVQDMRKGSKWLSAFLESYRPPLDGERYLTDGEVSELLRVSR RTLQEYRNNRVLPFILLGGKVLYPETGLRGVLEANYRKPLE >gi|222159218|gb|ACAB01000141.1| GENE 29 28885 - 29178 198 97 aa, chain - ## HITS:1 COG:no KEGG:BF0149 NR:ns ## KEGG: BF0149 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 97 17 113 113 181 100.0 7e-45 MEIVSIEKKTFEMMVAAFGALSEKVAALRRKSDTGRMERWLTGEEVCGQLRISPRTLQTL RDRRLIGYSQINRRFYYKPEEVKRLIPLVGTLYPHGR >gi|222159218|gb|ACAB01000141.1| GENE 30 29386 - 29748 282 120 aa, chain + ## HITS:1 COG:no KEGG:BF0147 NR:ns ## KEGG: BF0147 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 120 22 141 141 196 100.0 3e-49 MKVITMESSAYKEMMAQIANIAGYIREARDEKKRKRETEDKLLDTAQAAKMLNVSKRTMQ RMRTDHRIEYVVVRGSCRYRLSEILRLLEDNTVRNEEGTIDTLFHNHTLRTGGKPKGRRT >gi|222159218|gb|ACAB01000141.1| GENE 31 29752 - 30102 400 116 aa, chain + ## HITS:1 COG:no KEGG:BF0146 NR:ns ## KEGG: BF0146 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 116 1 116 116 213 100.0 1e-54 MELLTRNNFEGWMQKLMERLDRQDELLLAMKAEGKQPTITESIRLFDNQDLCMLLQISKR TLQRYRSVGALPYKTLGKKTYYSEEDVLTFLSNHIKDFKKEDIAFYKARIHNFFHK >gi|222159218|gb|ACAB01000141.1| GENE 32 30123 - 31694 1405 523 aa, chain + ## HITS:1 COG:no KEGG:BF0145 NR:ns ## KEGG: BF0145 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 523 1 523 523 854 100.0 0 MAKKKDEKDVLVVRDEKTGEISVVAGLNADGTPKRTPAKAENAQSFLQFDRHGDVLDNFF 
KNFFRQCKEPSRFGFYRIAADQAENLLEVMKQLLKDPEANKELLAPHKVDTSDYEKKVQE EMAAQQTEKQEPQKQENMEQRKEQQQDKSEQMQGKRGYQPIDESKINWQELEDRWGVKRD NLEKSGDLTKMLNYGKSDLVKVKPTFGGESFELDARLSFKKDGEGNISLVPHFIRKEQKL DEYKEHKFSDNDRKNLRETGNLGRVVDIVDRETGEIIPSYISIDRKTNEITDIPASRVRI PERIGKTEITTQERDMLRAGLPVRDKLIERNDGRKFVTTLQVNVEQRGVEFVPGTGKSPR TAQTQETKGDTSKSQAQGGENAAQTKKEQRRNTWTNEDGSIRPISKWSGVSFTDQQKADY VAGKAVKLENVTDKQGFHATMYIKFNPEKGRPYRYDTNPDNAQQVAPSNESRTQVAVNND GKTNEATKNLREPLQKGQTNPKDARQQQQQEKPQKKTGKGMKM >gi|222159218|gb|ACAB01000141.1| GENE 33 31755 - 33842 1906 695 aa, chain + ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 4 625 6 647 709 404 38.0 1e-112 MKTIIAEKPSVAREIARIVGATKREEGYFEGGGYAVTWAFGHLVQLAMPDGYGVRGFVRD NLPIIPDTFTLVPRQVRTEKGYKPDSGVVSQIKVIKRLFDTSEHIIVATDAGREGELIFR YLYHYTGCTTPFVRLWISSLTDKAIREGLRKLEDGSKYDNLYLAAKARSESDWLVGINGT QALSIAAGHGTYSVGRVQTPTLAMVCERYWENRRFTSEAFWQLHIATDGCDGEVVKFSSS EKWKEKEPAMELYNKVKAAGCATVTKAERKEKTEETPLLYDLTTLQKEANAKHGFTAEQT LEIAQKLYEKKLITYPRTGSRYIPEDVFAEIPKLLAFIGTQPEWKDKVRAKAAPTRRSVD DGKVTDHHALLVTGEKPLFLSKEDNTIYQMIAGRMVEAFSEKCVKDVTTVTAECAGVEFT VKGSVVKQTGWRAVYGEEKEEITIPGWQEGDTLTPKGSSITEGKTKPKPLHTEATLLSAM ETAGKEIEDDALRQAMKDCGIGTPATRASIIETLFKRGYMERCKKSLVPTEKGLALNSVV KTMRIADVAMTGEWEKELARIERGELSDDTFRKEIEAYTREITSELISCDKLFGSRDSGC ACPKCGTGRMRFYGKVVRCDNTECGLPVFRLKAGRTLSDDEIKDLLTEGHTKLLKGFKSK QGKSFDAVVAFDGEYNTTFVFPEAKKDKKFSGRKK >gi|222159218|gb|ACAB01000141.1| GENE 34 33870 - 34457 465 195 aa, chain + ## HITS:1 COG:no KEGG:BF0143 NR:ns ## KEGG: BF0143 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 195 1 195 195 345 100.0 5e-94 MNCSSIPDYTHTLDLVVALGGIPSAFFFLFPTDYPVNYQSAKSKVMNNKKKNEGQTDFSY YGLYLLDYLRTNKFEQADDTAFIRERADRAAETYERARLEGYPADGAQELAMDTLLRGLH YSRYAILREVVENEFADEVPEEKREAFVLKLLPLVGNVFSVYDLSDDNFALSSDYDLLYT ELTGATVLYLDEYGV >gi|222159218|gb|ACAB01000141.1| GENE 35 34447 
- 40263 3874 1938 aa, chain + ## HITS:1 COG:AGpT188_2 KEGG:ns NR:ns ## COG: AGpT188_2 COG4646 # Protein_GI_number: 16119916 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA methylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 614 1684 65 1140 1315 354 27.0 1e-96 MAFNRKQKLRDNIEAIRTAFILDRENRTATTEERAILQRYCGFGGLKCILNPAKELTDAV RWAKSDLELFAPTVELHRLIRENSKDETEYKRFVDSLKASVLTAFYTPKEITDTIADVLA DYSVRPARMLEPSAGVGVFVDSMLRHSPNADVMAFEKDLLTGTILRHLYPDQKMRTCGFE KIERPFNNYFDLAVSNIPFGDIAVFDAEFQRSDSFGRRSAQKTIHNYFFLKGLDAVRDGG IVAFITSQGVLNSTKTSVRNELFSQANLVSAIRLPNNLFTDNAGTEVGSDLIVLQKNLSK KEMSQDERLMTVIQTDTKTALTDNAYFIHHPERIVHTMAKLDTDPYGKPAMVYLHEGKAA GIAGDLRRMLDEDFHYRLAMRLYSGSIRQAGTEEKVAVQNKVERPAIKLETVSSAQTVET PTEKPQPADEKPEIEPRPQYSAGVQLTLLDLWGMTEEVSQPKTSKKKKTVKKAVTAKSTP PKPKVTVTPTAPTAKPAMENKEVKAENTAKPADPDDIYATLDWDTNPPINGFYEMMMGLT PERRKELRELARQHNEKQVAEKTEVKAVPETSREQPRQEETQPEAVAAPAVTDTPSEAVG TFLFPDIEAEKPKEEVVDLSPRAYHRTPEMHLREGSLVADRGRHNIGYLKDITPYGATFQ PLDLKGYQKEKALLYVSLRDAYERLYRYESLRREANVPWREHLNTCYDEFVMRYGNLNAK QNVKLVMMDAGGRDILSLERMENGKFVKADIFEHPVSFAVESHANVGSPEEALSASLNKY GTVNLDYMREITDSTAEDLLTALQGRIYYNPLVTGYEIKDRFIAGNVIEKAERIEAWMGD NPENERMPEVKQALEALKDAEPQRIAFEDLDFNFGERWIPTGVYAAYMSRLFDTEVKIAY SASMDEFSVVCGYRTMKITDEFLVKGYYRNYDGMHLLKHALHNTCPDMMKSIGKDEHGND IKMRDSEGIQLANAKIDEIRNGFSEWLEEQSPQFKERLVTMYNRKFNCFVRPRYDGSHQT FPDLNLKGLASRGIKSVYPSQMDCVWMLKQNGGGICDHEVGTGKTLIMCIAAHEMKRLNL AHKPMIIGLKANVAEIAATYQAAYPNARILYASEKDFSTANRVRFFNNIKNNDYDCVIMS HDQFGKIPQSPELQQRILQAELDTVEENLEVLRQQGKNVSRAMLKGLEKRKHNLEAKLEK VEHAIKSRTDDVVDFKQMGIDHIFIDESHQFKNLTFNTRHDRVAGLGNSEGSQKALNMLF AIRTIQERTGKDLGATFLSGTTISNSLTELYLLFKYLRPKELERQDIRCFDAWAAIFAKK TTDFEFNVTNNVVQKERFRYFIKVPELAAFYNEITDYRTAEDVGVDRPAKNEILHHIPPT PEQEDFIQKLMQFAKTGDATLLGRLPLSETEEKAKMLIATDYARKMALDMRMIDPNYEDH PDNKASHCAKMIAEYYQKYDAQKGTQFVFSDLGTYQPGDGWNVYSEIKRKLTEDYGIPPS EVRFIQECKTDKARKAVIDAMNAGTVRVLFGSTSMLGTGVNAQKRCVAIHHLDTPWRPSD LQQRDGRGVRAGNEIAKHFAGNNVDVIIYAVEKSLDSYKFNLLHCKQTFISQLKSGAMGA 
RTIDEGAMDEKSGMNFSEYMALLSGNTDLLDKAKLEKRIASLEGERKSFNKGKRDSEFKL ESKTGELRNNTAFIDAMTEDWNRFLSVVQTDKEGNRLNIIKVDGVDSADEKVIGKRLQEI AKNATTGGLYTQVGELYGFPIKVVSERILKEGLEFTDNRFVVEGNYKYTYNNGHLAMADP LAAARNFLNAMERIPSIIDQYKAKNEVLEMEIPQLQEIAGKVWKKEDELKQLKSELAALD RKIQLELAPPTPEVAEKENEGQQLKPEAEDVRNRQAQYPENAPPQIRSPADSIVANHVII GRPGLYAKEETRSKGLKI >gi|222159218|gb|ACAB01000141.1| GENE 36 40921 - 42846 700 641 aa, chain + ## HITS:1 COG:CAC1448 KEGG:ns NR:ns ## COG: CAC1448 COG0480 # Protein_GI_number: 15894727 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Clostridium acetobutylicum # 3 612 4 619 652 473 40.0 1e-133 MNIINLGILAHIDAGKTSVTENLLFASGATEKCGRVDNGDTITDSMDIEKRRGITVRAST TSIIWNGVKCNIIDTPGHMDFIAEVERTFKMLDGAVLILSAKEGIQAQTKLLFSTLQKLQ IPTIIFINKIDRAGVNLERLYMDIKTNLSQDVLFMQTVVDGSVYPVCSQTYIKEEYKEFV CNHDDDILERYLADSEISPADYWNTIIALVAKAKVYPVLHGSAMFNIGINELLDAISSFI LPPASVSNRLSAYLYKIEHDPKGHKRSFLKIIDGSLRLRDVVRINDSEKFIKIKNLKTIY QGREINVDEVGANDIAIVEDIEDFRIGDYLGAKPCLIQGLSHQHPALKSSVRPNKPEERS KVISALNTLWIEDPSLSFSINSYSDELEISLYGLTQKEIIQTLLEERFSVKVHFDEIKTI YKERPIKKVNKIIQIEVPPNPYWATIGLTLEPLPLGAGLQIESDISYGYLNHSFQNAVFE GIRMSCQSGLHGWEVTDLKVTFTQAEYYSPVSTPADFRQLTPYVFRLALQQSGVDILEPM LCFELQIPQVASSKAITDLQKLMSEIEDISCNNEWCHIKGKVPLNTSKDYASEVSSYTKG LGIFMVKPCGYQITKDGYSDNIRMNEKDKLLFMFQKSMSLK >gi|222159218|gb|ACAB01000141.1| GENE 37 42846 - 45164 1285 772 aa, chain + ## HITS:1 COG:CC3623_1 KEGG:ns NR:ns ## COG: CC3623_1 COG0642 # Protein_GI_number: 16127853 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Caulobacter vibrioides # 257 520 176 445 460 125 33.0 3e-28 MERSGNFYKAIRLGYILISILIGCMAYNSLYEWQEIEALELGNKKIDELRKEINNINIQM IKFSLLGETILEWNDKDIEHYHARRMAMDSMLCRFKATYPAERIDSVRSLLEDKERQMFQ IVRLMDEQQSINKKIANQIPVIVQKSVQEQSKKPKRKGFLGIFGKKKEVTPAVSTTILHS VNRNVISEQKVQDRQLSEQADSLAARNAELNRQLQELICQIEEKVQTELQSRENEIVAMR EKSFMQVGGLMGFVLLLLLISYIIIHRDAKSIKQYKHKTTDLIRQLEQSVQRNEALITSR 
KKAVHTITHELRTPLTAITGYAGLIRKEQCEDKSGQYIQNILQSSDRMRDMLNTLLDFFR LDNGKEQPRLSPCRISAITHTLETEFMPVAVNKGLSLSVKTGHDAIVLTDKERIIQIGNN LLSNAVKFTEEGGVSLITEYDNGVLTLVVEDTGTGMTEEEQKQAFGAFERLSNAAAKEGF GLGLAIMRNIVSMLGGTIRLDSKKGKGSRFTVEISMQEAEEQLGYTSNTPVYHNNKFHDV VAIDNDEVLLLMLKEMYSQEGIHCDTCTDAAALMEMIRQKEYSLLLTDLNMPDINGFELL ELLRSSNVGNSPTIPVVVATASGSCNKGELLAKGFAGCLFKPFSISELMEVSDRCAIKAT PDGKPDFSALLSYGNEAVMLEKLITETEKEMQAVRDAAKEKDLQKLDSLIHHLRSSWEVL RADQPLNVLYGLLRGDALPDGEALSHAVTAVLDKGVEIIRLAEEERRKYEDE >gi|222159218|gb|ACAB01000141.1| GENE 38 45157 - 46479 947 440 aa, chain + ## HITS:1 COG:hydG KEGG:ns NR:ns ## COG: hydG COG2204 # Protein_GI_number: 16131834 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 6 428 8 439 441 265 38.0 1e-70 MNKTKIIVVEDNIVYCEYVCNMLSREGYRNMKAYHLSTAKKHLQQATDNDIVVADLRLPD GSGIDLLCWMRKEGKMQPFIIMTDYAEVNTAVESMKLGSIDYIPKQLVEDKLVPLIRSIL KERQAGQRRMPIFAREGSAFQKIMHRIRLVAATDMSVMIFGENGTGKEHIAHLLHDKSKR AGKPFVAVDCGSLSKELAPSAFFGHVKGAFTGADNAKKGYFHEAEGGTLFLDEVGNLALE TQQMLLRAIQERRYRPVGDKADRNFNVRIIAATNEDLEVSVNEKRFRQDLLYRLHDFGIT VPPLRDCQEDIMPLAEFFRDMANRELECSVSGFSSEARKALLTHAWPGNVRELRQKVMGA VLQAQEGVVMKEHLELAVTKPTSTVSFALRNDAEDKERILRALKQANGNRSVAAELLGIG RTTLYSKLEEYGLKYKFKQS >gi|222159218|gb|ACAB01000141.1| GENE 39 46759 - 47181 152 140 aa, chain + ## HITS:1 COG:no KEGG:BF0137 NR:ns ## KEGG: BF0137 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 140 1 140 140 295 100.0 5e-79 MGKVQILAVLTMDGCLSSELYDKAHQDLCLDRCGLDEIRKKAFYRVTPDYSISMLHEWRK DCTNIRYLAEATPDTADYINGLLRMHAVDEIILYTVPFISGSGRHFFKSALPEQHWTLSS LKSYPNGVCRIIYILDKKAR >gi|222159218|gb|ACAB01000141.1| GENE 40 47420 - 48025 451 201 aa, chain + ## HITS:1 COG:no KEGG:BF0136 NR:ns ## KEGG: BF0136 # Name: not_defined # Def: tetracycline resistance element mobilization regulatory protein RteC # Organism: B.fragilis # Pathway: not_defined # 1 201 1 201 201 402 100.0 
1e-111 MNYFLLAETDFFRLINEAGDCNMETAYTAFATQVIELCNGGMDMNLTVIALAYIEIELQH HPVRNLSEEKREIAAYVSKALSFVRKMQKFLATPQVPPLISANNATETTASLLQWTGNAI DLVELIYGIDVMGCINNGNMPLKQLAPLLYKIFGVDSKDCYRFYTDIKRRKNESRTYFID RMQEKLNERMLRDEELERMRK >gi|222159218|gb|ACAB01000141.1| GENE 41 48327 - 49457 903 376 aa, chain + ## HITS:1 COG:RC1031 KEGG:ns NR:ns ## COG: RC1031 COG1373 # Protein_GI_number: 15892954 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Rickettsia conorii # 24 330 23 314 380 106 26.0 8e-23 METVNRILQEKITARIAPNKAVLIFGARRVGKTVMMRKIVDNYSGRTMMLNGEDYDTLAL LENRSIANYRHLLDGIDLLAIDEAQNIPQIGSILKLIVDEIPGISVLASGSSSFDLLNKT GEPLVGRSTQFLLTPFSQREIAQTETALETRQNLEARLIYGSYPEVVMMENYERKTDYLR DIVGAYLLKDILAIDGLKNSSKMRDLLRLIAFQLGSEVSYEELGKQLGMSKTTVEKYLDL LEKVFVIYRLGAYSRNLRKEVTKAGKWYFYDNGIRNAIIGAFSPLAIRQDVGALWENYII GERRKANFNEGLHREFYFWRTYDKQEIDLIEESADSLTALEFKWGNKMPAAPKAFQEAYP YAEFHVVNRENYLEFV >gi|222159218|gb|ACAB01000141.1| GENE 42 49474 - 52728 1242 1084 aa, chain + ## HITS:1 COG:no KEGG:BF0134 NR:ns ## KEGG: BF0134 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 1084 1 1084 1084 2181 100.0 0 METKLQSKQQYPRFIQNKPCGIDKFDGGSQERLAKTIARHFCQNDSLDEECTLPRIIGIE GIWGSGKSNVVKMLERELSDDYYFFEYDAWGHQEDLQRRSILELLTSKLIDDGILSGNAT IKVKGGGTKTVSWSEKLKYLLARKTETVTEKYPLISNGMVAAFLVAVLTPIFTFIAYAVK PTPTTWWFSLLSIIIAALPVLIALCVWKWAYSKDHKYGWSYMLAIYQDKVEKDVCYETLS EDEPTVYEFKTWMQDISDFIKEKGQRKLVLVFDNMDRLPAEKVKELWSSIHTFFADSGFE NVWAVIPFDETHLACAFGDETDEQTKQLTKYFINKTFPIVYRVAPPVITDYRSIFNKLFV EAFGETENEAKETINRIFRLVNPNANVREIISYINEMVALKQEWCNEILMINIALFCLKK TDILANPVEQILSGDYLNGIQTIINNDLQTQREIAALVYGVDVEDARQIPLKKYIEGCIN GEEDHDINQYAETNKQFDTVLEEVIQCMDNALIDKIIHCLHKLTRKSDVILRVWQRIAQL KLKESIEKQVFPVEYQELLLHLDTESQNHVIAQLYKKIVRFNDFNGGDYFKTLDAIDRFI AQNKLACDFTSLIEAKTVKPNTFIDYIQAANATDAAYRDNATTKAYKYYQVATNSEALDN YLANLLPDNFDHADIVKTLKDNSTYTFPTLLQAITNCIDEQNVNKDNIGAIFTTYRLLAS DEERPLPVTLDSTYINQLHSELETDGRNIKESGYYDLVAMQLAHGHSVSLIEGGDIKYVA 
ELMDYYVDHGDLLVNSVGWNIPLLNETLQYMVNHKLGYKLLLSDILPQFEDIKNRIGVTD EVFIEHLAEWNTDLDKYITKNNIKDVIPDASFYDLTTKISNVLTDHINKIAFEALSEISV DTLYAQRTAHTSYYWFVAIKHLLAKIKSLPDNLTEFGKKILMDIASGTQSLNPFPNCFKN IVERLDKRKIKSTVTDIRNDFCIGKKTINAIKFQFFETWLRSHGNLKSQAGDVIDKIVKP VISDGACRSLILQNKDFYMDLINTAGDDAYELKKSLRNLIQKDSDPQLVKFVNSIDSVPE VETA >gi|222159218|gb|ACAB01000141.1| GENE 43 52860 - 54878 1278 672 aa, chain - ## HITS:1 COG:alr7213 KEGG:ns NR:ns ## COG: alr7213 COG3505 # Protein_GI_number: 17233229 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Nostoc sp. PCC 7120 # 202 559 117 466 589 99 26.0 3e-20 MSQQEDDLRALAKIMDFLRAVSIILVVMNVYWFCYEAIRLWGVNIGVVDKILLNFDRTAG LFHSILYTKLFSVLLLALSCLGTKGVKGEKITWGRIWTAFAVGFVLFFLNWWLLPLPLPL EAVTGLYVLTIGTGYVCLLMGGLWMSRLLKHNLMEDVFNNENESFMQETRLIESEYSVNL PTRFYYRKRWNNGWINVVNPFRASIVLGTPGSGKSYAVVNNFIKQQIEKGFSQYIYDFKY PDLSTIAYNHLLNHPDGYKVKPKFYVINFDDPRRSHRCNPIHPDFMEDITDAYESAYTIM LNLNKTWVQKQGDFFVESPIILFASIIWYLKIYQNGKFCTFPHAIEFLNRRYEDIFPILT SYPELENYLSPFMDAWLGGAAEQLMGQIASAKIPLSRMISPQLYWVMSDSEFTLDINNPE EPKILCVGNNPDRQNIYGAALGLYNSRIVKLINKKGMLKSSVIIDELPTIYFKGLDNLIA TARSNKVAVCLGFQDFSQLVRDYGDKEAKVVMNTVGNIFSGQVVGETAKTLSERFGKVLQ KRQSISINRQDVSTSINTQMDALIPPSKISGLTQGMFVGSVSDNFNERIEQKIFHCEIVV DAEKVKREESAYKKIPVITNFTDEDGNDRMKETVQANYRRIKEEVKQIVQEELERIKNDP VLCKLLPDNETV >gi|222159218|gb|ACAB01000141.1| GENE 44 54909 - 56156 885 415 aa, chain - ## HITS:1 COG:no KEGG:BF0132 NR:ns ## KEGG: BF0132 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 415 1 415 415 799 100.0 0 MVAKISVGSSLYGAIAYNGEKINEAQGRLLTTNRIYNDGSGTVDIGKAMEGFLTFLPPQM KIEKPVVHISLNPHPEDVLTDIELQNIAREYLEKLGFGNQPYLVFKHEDIDRHHLHIVTV NVDENGKRLNRDFLYRRSDRIRRELEQKYGLHPAERKNQRLDNPLRKVAASAGDVKKQVG NTVKALNGQYRFQTMGEYRALLSLYNMTVEEARGNVRGREYHGLVYSVTDDKGNKVGNPF KSSLFGKSAGYEAVQKKFVRSKSEIKDRKLADMTKRTVLSVLQGTYDKDKFVSQLKEKGI DTVLRYTEEGRIYGATFIDHRTGCVLNGSRMGKELSANALQEHFTLPYAGQPPIPLSIPV 
DAADKAHGQTAYDSEDISGGMGLLTPEGPAVDAEEEAFIRAMKRKKKKKRKGLGM >gi|222159218|gb|ACAB01000141.1| GENE 45 56135 - 56563 256 142 aa, chain - ## HITS:1 COG:no KEGG:BF0131 NR:ns ## KEGG: BF0131 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 142 1 142 142 246 100.0 2e-64 MKEKRKSKSGRNPKLDPAVYRYTVRFNEEEHNRFLAMFGKSGVYARSVFLKAHFFGQPFK VLKVDKTLVDYYTKLSDFHAQFRAVGTNYNQVVKELRLHFSEKKAMALLYKLEQHTVELV KLSRRIVELSREMEAKWSQKSV >gi|222159218|gb|ACAB01000141.1| GENE 46 57245 - 58000 707 251 aa, chain + ## HITS:1 COG:no KEGG:BT_2303 NR:ns ## KEGG: BT_2303 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 250 3 251 252 328 61.0 9e-89 MKEATYVAISTQKGGAGKTTLTVLVASYLHYVKGYNVAVVDCDFPQYSIHDMRKRDMKAV MEDGHYKVLAYEQLKRLKKNPYTIRCSRAEDAVKTAENLVAAQPDLDFVFFDLPGTINNA DVVQTISQMDYIFTPIIADRVVMESSIKFATVINEQMISTGKSGIKGIYLVWNMVDGREK TELYKAYDKVCAEFALPILETHLPDSKRFRKETAAERKAVFRSTAFPADKILVRGSNLDK LVDEILGIIKN >gi|222159218|gb|ACAB01000141.1| GENE 47 58006 - 58455 167 149 aa, chain + ## HITS:1 COG:no KEGG:BT_2302 NR:ns ## KEGG: BT_2302 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 142 6 114 120 78 34.0 6e-14 MAKKTRIEDIDSEAVINSFRLDDTSIPPEARSTDGNAPVPPPKEAIQEAVSPPASHSKEE PERRRRNSKPEKLDYDTVFICGSNITARLGKQVYIRKEYHDRIQKMLHVIGGNEVTIAAF LDNVLTHHFTLFQDEIAESFKRHMESYNL >gi|222159218|gb|ACAB01000141.1| GENE 48 58479 - 58841 153 120 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160889352|ref|ZP_02070355.1| ## NR: gi|160889352|ref|ZP_02070355.1| hypothetical protein BACUNI_01775 [Bacteroides uniformis ATCC 8492] # 1 120 1 120 120 209 100.0 6e-53 MKRTNKNRQAEYLQTGKSERTAVRAAKCVTDYGQENTDVDFRKDKGRDKPLSTFLQASEI KTRQCIYISREIHEKVAIVTSRMGNGLSIGKFVDNILRDHFRQYGQQYMEQIENAQKVRL >gi|222159218|gb|ACAB01000141.1| GENE 49 58838 - 59578 441 246 aa, chain + ## HITS:1 COG:no KEGG:BF0126 NR:ns ## KEGG: BF0126 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis 
# Pathway: not_defined # 1 246 1 236 236 245 56.0 1e-63 MTKAFLIVSVACNVWFLFLLFYDRIMDTKLIRFFRHIAGLWRSLDSTVAEQGKDKEIPHA DTSDIIGKSRFKMASTRTTAAIPTQEAATSEKGIELSEEEATFDDGNTETVSRPTQVPED KLDETFTSIPPSELEYGEDEPEDEEPDKKQASGSSFEDIDMAVHTIRKESPSDEEMRHAG KVMTELDGTELFTRITERLTDEAMSERLAKAMAVFVDTVNMTVTQQKEKVFVIPDNIEEF DFRNYV >gi|222159218|gb|ACAB01000141.1| GENE 50 59780 - 60091 317 103 aa, chain + ## HITS:1 COG:no KEGG:BF0125 NR:ns ## KEGG: BF0125 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 103 1 105 105 152 91.0 4e-36 MNKNIRQKLIIAAALMIAATASAFAQGNGLAGINEATSMVSSYFDPGTKLIYAIGAVVGL IGGVKVYGKFSSGDPDTSKTAASWFGACIFLIVAATILRSFFL >gi|222159218|gb|ACAB01000141.1| GENE 51 60102 - 60434 194 110 aa, chain + ## HITS:1 COG:no KEGG:BF0124 NR:ns ## KEGG: BF0124 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 108 1 108 110 209 98.0 2e-53 MAEYPINKGIGRPVEFKGLKAQYLFIFCGGLLALFVLFVILYMVGIDQWICIGFGAASSS VLVWQTFALNARYGEHGLMKLGAARSHPRYLINRRRITRLLKRQRKEETT >gi|222159218|gb|ACAB01000141.1| GENE 52 60431 - 62348 1150 639 aa, chain + ## HITS:1 COG:no KEGG:BF0123 NR:ns ## KEGG: BF0123 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 639 1 639 834 1241 94.0 0 MRNTSKMTTLENKFPLLAVEHGCIISKDADITVAFEVELPELYTVTGAEYEAIHGCWCKA IKVLPYFCVVHKQDWFIKEKYMPELQKDDMSFLSRSFERHFNERPYLKHSCYLYLTKTTK ERNRMQSNFSTLCRGHIIPKELDKETAGKFMEAAEQFERIMNDSGFVRLRRLSTDEIVGT EKSAGLIERYFSLMPEGDTTLQDIDLSAKEMRIGDNRLCLHTLSDAEDMPGKVATDIRYE KLSTDRSDCRLSFASPVGLLLSCNHIYNQYVIIDNSEENLQKFEKSARNMQSLSRYSRSN SINREWIDQYLNEAHSYGLTSVRAHFNVMAWSDDAEELKHIKNDVGSQLASMECVPHHNT IDCPTLYWAAMPGNAADFPAEESFHTFIEQAVCLFTEETNYRSSLSPFGIKMVDRLTGKP LHLDISDLPMKRGITTNRNKFVLGPSGSGKSFFMNHLVRQYYEQGTHVVLVDTGNSYQGL CEMIRCKTNGADGVYFTYTEEKPISFNPFYTDDYVFDVEKKDSIKTLLLTLWKSEDDKVT KTESGELGSAVNAYIERIRADRSIVPSFNTFYEYMRDDYRRELAEREIKVEKSDFNIDNM LTTMRQYYRDGRYDFLLNSTENIDLLGKRFIVFEIDSIK Prediction of potential genes in microbial genomes Time: Wed May 18 
04:18:53 2011
Seq name: gi|222159217|gb|ACAB01000142.1| Bacteroides sp. D1 cont1.142, whole genome shotgun sequence
Length of sequence - 34490 bp
Number of predicted genes - 41, with homology - 41
Number of transcription units - 16, operones - 7 average op.length - 4.6

    N   Tu/Op      Conserved    S          Start      End   Score
                   pairs(N/Pv)
    1    1 Op  1   .            +  CDS         2 -    586     365  ## BF0123 hypothetical protein
    2    1 Op  2   .            +  CDS       624 -   1007     413  ## BF0122 hypothetical protein
    3    1 Op  3   .            +  CDS      1038 -   1667     492  ## BF0121 hypothetical protein
    4    1 Op  4   .            +  CDS      1671 -   2675     577  ## BF0120 hypothetical protein
    5    1 Op  5   .            +  CDS      2707 -   3330     499  ## BF0119 hypothetical protein
    6    1 Op  6   .            +  CDS      3337 -   3645     199  ## BF0118 hypothetical protein
    7    1 Op  7   .            +  CDS      3626 -   4981    1122  ## BF0117 hypothetical protein
    8    1 Op  8   .            +  CDS      5051 -   6037     748  ## BF0116 hypothetical protein
    9    1 Op  9   .            +  CDS      6040 -   6615     485  ## BF0115 TraO
   10    1 Op 10   .            +  CDS      6624 -   7523     350  ## BF0114 hypothetical protein
   11    1 Op 11   .            +  CDS      7523 -   8029     410  ## BF0113 hypothetical protein
   12    1 Op 12   .            +  CDS      8029 -   8553      63  ## BF0112 lysozyme
                                +  Term     8574 -   8622    13.3
                                -  Term     8562 -   8610    14.9
   13    2 Tu  1   .            -  CDS      8616 -   8837     157  ## BF1096 hypothetical protein
                                -  Term     8915 -   8952     6.0
   14    3 Op  1   .            -  CDS      8954 -   9262     304  ## BF0110 hypothetical protein
   15    3 Op  2   .            -  CDS      9316 -   9621     329  ## BF0109 hypothetical protein
   16    3 Op  3   .            -  CDS      9561 -   9779      62  ## HM1_0042 hypothetical protein
   17    3 Op  4   .            -  CDS      9811 -  10068     210  ## BF0108 hypothetical protein
   18    3 Op  5   .            -  CDS     10093 -  11358     791  ## PGN_0050 hypothetical protein
   19    3 Op  6   .            -  CDS     11355 -  11774     356  ## BF0106 hypothetical protein
   20    3 Op  7   .            -  CDS     11798 -  12019     200  ## BF0105 hypothetical protein
   21    3 Op  8   .            -  CDS     12031 -  12246     152  ## gi|293371479|ref|ZP_06617901.1| conserved hypothetical protein
   22    3 Op  9   .            -  CDS     12254 -  12589     222  ## BF2919 hypothetical protein
                                -  Prom    12620 -  12679     2.4
                                +  Prom    13444 -  13503     4.9
   23    4 Tu  1   .            +  CDS     13725 -  14045     358  ## COG2076 Membrane transporters of cations and cationic drugs
                                +  Prom    14207 -  14266     5.6
   24    5 Op  1   .            +  CDS     14345 -  14518     178  ## gi|298483182|ref|ZP_07001362.1| two-component system sensor histidine kinase
   25    5 Op  2   .            +  CDS     14543 -  16057    1202  ## COG0642 Signal transduction histidine kinase
                                +  Prom    16146 -  16205     5.9
   26    6 Tu  1   .            +  CDS     16226 -  17305     952  ## COG3049 Penicillin V acylase and related amidases
                                +  Term    17456 -  17512     7.1
   27    7 Tu  1   .            -  CDS     17519 -  18169     600  ## BT_4231 hypothetical protein
                                -  Prom    18220 -  18279     5.2
   28    8 Tu  1   .            +  CDS     18517 -  20610    1518  ## COG5545 Predicted P-loop ATPase and inactivated derivatives
   29    9 Op  1   1/0.000      +  CDS     21046 -  21576     342  ## COG0350 Methylated DNA-protein cysteine methyltransferase
   30    9 Op  2   .            +  CDS     21573 -  22427     594  ## COG0350 Methylated DNA-protein cysteine methyltransferase
                                +  Prom    22431 -  22490     8.8
   31    9 Op  3   .            +  CDS     22510 -  23088     578  ## COG0693 Putative intracellular protease/amidase
                                -  Term    23026 -  23075     0.7
   32   10 Tu  1   .            -  CDS     23143 -  23508     156  ## BT_2357 hypothetical protein
                                -  Prom    23563 -  23622     4.5
                                +  Prom    23633 -  23692     3.9
   33   11 Tu  1   .            +  CDS     23801 -  24355     328  ## Dhaf_2020 GCN5-related N-acetyltransferase
                                -  Term    24344 -  24391     6.1
   34   12 Tu  1   .            -  CDS     24446 -  24676     101  ## gi|237713404|ref|ZP_04543885.1| conserved hypothetical protein
                                -  Prom    24718 -  24777     1.8
                                -  Term    25111 -  25160     1.3
   35   13 Op  1   .            -  CDS     25176 -  27014    1215  ## BT_2363 hypothetical protein
   36   13 Op  2   .            -  CDS     27030 -  30116    2603  ## BT_2362 hypothetical protein
                                -  Prom    30336 -  30395     8.4
   37   14 Op  1   .            +  CDS     30449 -  30838     334  ## BT_2361 hypothetical protein
   38   14 Op  2   .            +  CDS     30884 -  31294     235  ## BT_2360 transcriptional regulator
                                +  Prom    31316 -  31375     7.3
   39   15 Tu  1   .            +  CDS     31429 -  31830     346  ## BT_2361 hypothetical protein
                                +  Term    31867 -  31917     8.1
   40   16 Op  1   .            -  CDS     31976 -  33682     403  ## COG0433 Predicted ATPase
   41   16 Op  2   .
- CDS 33663 - 34490 286 ## ZPR_2242 hypothetical protein Predicted protein(s) >gi|222159217|gb|ACAB01000142.1| GENE 1 2 - 586 365 194 aa, chain + ## HITS:1 COG:no KEGG:BF0123 NR:ns ## KEGG: BF0123 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 194 641 834 834 362 96.0 6e-99 NRELFPVVTIIIMEAFINKMRRLKGVRKQLIVEEAWKALSSANMAEYLRYMYKTVRKYYG EAIVVTQEVDDIISSPVVKESIINNSDCKILLDQRKYMNKFDAIQSLLGLTEKEKSQILS INMANNPSRLYKEVWIGLGGTQSAVYATEVSAEEYLAYTTEETEKVEVYRLAEQLGGDIE AVIRQLAERRRKKE >gi|222159217|gb|ACAB01000142.1| GENE 2 624 - 1007 413 127 aa, chain + ## HITS:1 COG:no KEGG:BF0122 NR:ns ## KEGG: BF0122 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 126 1 126 126 236 86.0 2e-61 MSISRTKMLQVSKCLIGLAVMVLQSCDITDNRRDLLCGNWESVEGKPDVLIYKEGEAYKV TVFKRSGIRRKLKPETYLLQEENGNLFMNTGFRIDVSYNEATDILTFSPNGDYVRVNPKP DHPIGEQ >gi|222159217|gb|ACAB01000142.1| GENE 3 1038 - 1667 492 209 aa, chain + ## HITS:1 COG:no KEGG:BF0121 NR:ns ## KEGG: BF0121 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 209 1 209 209 339 86.0 3e-92 MRTRITLIICLCLFAVGRASAQWAVIDPSNIAQSIVNTSKNVVHTSTTAQNMIKNFQETV KIYEQGKKYYDALKSVNNLVKDARKVQQTILMVGDITDTYVTSFQKMMRDDNFTVEELGA IAFGYTKLLEESNDVLTELKNVVNITTLSMTDKERMDVVERCYSKMKRYRNLVSYYTNKN ISVSYLRAKKKNDLDRIMGLYGNMNERYW >gi|222159217|gb|ACAB01000142.1| GENE 4 1671 - 2675 577 334 aa, chain + ## HITS:1 COG:no KEGG:BF0120 NR:ns ## KEGG: BF0120 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 334 1 334 334 611 95.0 1e-173 MNFDNLHQILRSLYEQMMPLCGDMAGVAKGIAGLGALFYVAYRVWQSLARAEPIDVFPML RPFAIGLCIMFFPTVVLGTINSIMSPVVQGTAKMLEAETLDMNRYREQKDKLEYEAMMRN PETAYLVSNEEFDKQLEELGWSPGDMVTMAGMYIERGMYNMKKGIRDFFREILELMFQAA ALVIDTIRTFFLVVLAILGPIAFAISVWDGFQSTLTQWICRYIQVYLWLPVSDMFSTILA KIQVLMLQSDIERMQADPNFSLDSSDGVYIVFMIIGIIGYFTIPTVAGWIIQAGGMGSYG RNVNQTAGRAGSMAGSVAGAAAGNVVGRVGKLLK >gi|222159217|gb|ACAB01000142.1| GENE 5 
2707 - 3330 499 207 aa, chain + ## HITS:1 COG:no KEGG:BF0119 NR:ns ## KEGG: BF0119 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 207 1 207 207 391 96.0 1e-108 MEFKSLKNIESSFRQIRLFGIVFLSLCAVITVWSVWSSYRFAERQREKIYVLDNGKSLML ALSQDLSQNRPAEAREHVRRFHELFFTLSPEKSAIEHNVKRALLLADKSVYNYYSDFAEK GYYNRIIAGNINQVLKVDSVVCDFNGYPYRAVTYATQKIIRQSNVTERSLVTTCRLLNSS RSDDNPNGFTIEGFTIIENKDLQTIKR >gi|222159217|gb|ACAB01000142.1| GENE 6 3337 - 3645 199 102 aa, chain + ## HITS:1 COG:no KEGG:BF0118 NR:ns ## KEGG: BF0118 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 102 1 102 102 149 70.0 4e-35 MKKIKKSFWRAYWKLHDKKKMLVARLKGYLDGLPPKTRRRIVLAMLAAFAVLALYTFGKA VYEIGRNDGSRMVTDHAGQVELSIKPDNNHNVIPYLYGTDEE >gi|222159217|gb|ACAB01000142.1| GENE 7 3626 - 4981 1122 451 aa, chain + ## HITS:1 COG:no KEGG:BF0117 NR:ns ## KEGG: BF0117 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 450 1 449 450 686 84.0 0 MEQTKNEPKNENKAAPDNGKPKKERKPLTEAQRLKRQKMIVLPAIVLVFIGAMWLIFAPS SDKEQQPGTGGYNIEMPDADKANRQIIGDKAKAYEQGAMEERQENRSRAMQELGDMFDRE VAETDGGRDFDLANPGNAEDATAKSSAPKTIQSSAAAYRDLNTTLGNFYEQPKNDNAEMD ELLERIASLESELETEKGKASSMDDQVALMEKSYELAAKYMGGQNGTQPPVGQAAEPYHV QKAGKNTAAPVRQVTHQVVSSLSQPMSNAEFVASFSQERNRSFNTAVGVTTVSDRNTISA CVYGAQSVTDGQAVRLRLLEPMAVADKIIPRNAVVVGTAKIQGERLDIEITSLEYAGTII PVELAVYDTDGQPGIFIPNSMEMNAVREVAANMGGSLGSSINISTNTGAQLASDLGKGLI QGTSQYIAKKMRTVKVHLKSGYKVMLYQDRD >gi|222159217|gb|ACAB01000142.1| GENE 8 5051 - 6037 748 328 aa, chain + ## HITS:1 COG:no KEGG:BF0116 NR:ns ## KEGG: BF0116 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 328 1 328 328 639 95.0 0 MKKVIIMFALAMGIVTANAQENVTVVTDNGSEQPTLTKEVYPQKEADGDLYHGLTKKLTF DRMVPPHGLEVTYDKTVHVIFPSEVRYVDLGSPDLIAGKADGAENVIRVKATVRNFPNET NMSVITEDGSFYTFNVKYAAEPLLLNVEMCDFIHDGEAVNRPNNAQEIYLKELGSESPML VRLIMKSIYKQNKREVKHIGCKRFGIQYLLKGIYTHNGLLYFHTEIRNQSNVPFDVDYIT 
WKIVDKKVAKRTAVQEQIILPLRAQNYATCVPGRKSERTVFTMAKFTIPDGKCLVVELNE KNGGRHQSFVIENEDLVRAGTINELQVR >gi|222159217|gb|ACAB01000142.1| GENE 9 6040 - 6615 485 191 aa, chain + ## HITS:1 COG:no KEGG:BF0115 NR:ns ## KEGG: BF0115 # Name: not_defined # Def: TraO # Organism: B.fragilis # Pathway: not_defined # 1 191 1 191 191 346 90.0 2e-94 MRKHIAIIIASLALFTGQAHAQRCLPKMQGIELRANLVDGMRLSGNNGGYSFGAALSTYT KNGNKWVFGGEYLLKNNPYKDTAIPVAQFTAEGGYYFKILSDARKIVFVYAGASALAGYE SVNWGEKVLHDGSTLHDRDAFIYGGALTLDVEFYVADRIALLANLRERLLWGGDTRKFHT QFGAGVKIIIN >gi|222159217|gb|ACAB01000142.1| GENE 10 6624 - 7523 350 299 aa, chain + ## HITS:1 COG:no KEGG:BF0114 NR:ns ## KEGG: BF0114 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 299 1 299 300 474 76.0 1e-132 MERTDIENMRQIPIADFLARLGHEPVRRSGNELWYRAPYRSERTPSFRVNVAKQLWYDFG LGKGGDIFTLVGEFTRSGDFMAQARFIAETARMPLAAAEKPLYLPEPSEPAFEGVEAVPL LRSPLTEYLKERGIPYAVASRHCCRLNYGVRGKRYFAVGFPNVSGGYETRSRRFKGCVPP KDVSLVKAGDIPADVCSVFEGFMDFLSAVTLGLATGDCLVLNSVANVEKALKHLDGYERI DCYLDRDEAGRRAMEALRTRYGGKVTDRSGIYQGCKDLNEYLQQVSRKQQKNNNLKIKE >gi|222159217|gb|ACAB01000142.1| GENE 11 7523 - 8029 410 168 aa, chain + ## HITS:1 COG:no KEGG:BF0113 NR:ns ## KEGG: BF0113 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 168 1 167 167 291 88.0 4e-78 MNILNNKNKRTSIFKALALCLFAAVSFTFVSCDDDMDIQQSYPFTVETMPVPNKVTKGQT VEIRCELKKTGDFANTLYTIRYFQFEGEGTLKMDNGITFLPNDRYLLENEKFRLYYTAQG DEAHNFIVVVEDNFGNSYELEFDFNNRNVKDDGVITVVPIGNFKPLTR >gi|222159217|gb|ACAB01000142.1| GENE 12 8029 - 8553 63 174 aa, chain + ## HITS:1 COG:no KEGG:BF0112 NR:ns ## KEGG: BF0112 # Name: not_defined # Def: lysozyme # Organism: B.fragilis # Pathway: not_defined # 1 174 1 174 174 285 81.0 5e-76 MRVFMAILCSLLAVCSVSARNRRHEGTDGQAAIYRLPPFERAVRCTKYFEGWHSEKHHPY VGYGHRLQPGERYSARTMTKRQADALLRKDLRKFCAMFQQFGKDSLLLATLAYNVGPYRL LGSGKIPKSTLIRKLEAGDRNIYREYIAFCNYKGKRHAMLLKRRKAEFALLYVP >gi|222159217|gb|ACAB01000142.1| GENE 13 8616 - 8837 157 73 aa, chain - ## HITS:1 
COG:no KEGG:BF1096 NR:ns ## KEGG: BF1096 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 73 2 74 74 73 43.0 2e-12 METIRQNGKTILYSNDGISIKMVFKNLTGRNFQGQEYTDYIRHIAIGSMGFSPGIIEHCR DGEVAGKGTIPNV >gi|222159217|gb|ACAB01000142.1| GENE 14 8954 - 9262 304 102 aa, chain - ## HITS:1 COG:no KEGG:BF0110 NR:ns ## KEGG: BF0110 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 102 38 139 139 171 88.0 6e-42 MIAKTILEQIGGRRFAAMTGSKDFTDMGNGLRMSLARNKTSANRLDIIYDGGADLYNMRF YRKTFSKKTFESRTKDIETHEGIYCDMLEEMFTMVTGLYTRF >gi|222159217|gb|ACAB01000142.1| GENE 15 9316 - 9621 329 101 aa, chain - ## HITS:1 COG:no KEGG:BF0109 NR:ns ## KEGG: BF0109 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 3 101 11 108 108 142 75.0 3e-33 MSSRTGQVANGKGKEGGEAMNEAGYQTLIVKFSEPITALDGIFDDAEAWGVDTLKGWIDS YESSRFTAIDSHTAVITSEYSMECLKEWLEKCTPITEKTEF >gi|222159217|gb|ACAB01000142.1| GENE 16 9561 - 9779 62 72 aa, chain - ## HITS:1 COG:no KEGG:HM1_0042 NR:ns ## KEGG: HM1_0042 # Name: not_defined # Def: hypothetical protein # Organism: H.modesticaldum # Pathway: not_defined # 13 69 10 64 64 67 59.0 2e-10 MTINGVSTCQSAGTENYEKFQTGIDRRKRTLVQYDYRHTDGELFSCVKPTLDECRAARDK WLTAKERKEEKR >gi|222159217|gb|ACAB01000142.1| GENE 17 9811 - 10068 210 85 aa, chain - ## HITS:1 COG:no KEGG:BF0108 NR:ns ## KEGG: BF0108 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 85 1 85 85 131 82.0 9e-30 MEVRFESMVFLWDDKIPTMFLEFMNLLTLCQSEEQLRASVKDFAEKHELDKFFLYGFGSH HFYLHQRYTSDPEMVMQHRVLSVHF >gi|222159217|gb|ACAB01000142.1| GENE 18 10093 - 11358 791 421 aa, chain - ## HITS:1 COG:no KEGG:PGN_0050 NR:ns ## KEGG: PGN_0050 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 1 420 1 422 424 542 63.0 1e-153 MKPRNKFEKAVLAQSKKLRPITPIQINWAFRNCVEHYAHRLPKGRTTCMDCGHSWVMTEQ 
TEHCTCPECGASLKVCLTYQRKVRQKQYFTTLTTSGEYQVLRMFLLVVGMEKGVNAKSYA LEIGQYWWNEQGRKAVVAIPRTLGCYIDTFSFASPFAIRNDNEAYRHISYSPIYPRYKVL PTLRRNGFNGNFHDIVPTKLIPALLSDSRAETLLKAGQYPMLRYYLYHSFNIGEYWASIK ICIRNGYTIEDGSMWRDTIDLLRHFGKDTNSPKYVCPADLKVEHDKLVAKRNLQRKHERT EQQRRKAIEDEKQYLKAKGIFFGLAFTDSLICVKVIESVEEMAEEGRTMHHCVGGYHKRK DSLILSATIDGKRIETIEVSLKTFEVVQCRGVCNENSEYHDRIIALVNKNANLIRQRMKA A >gi|222159217|gb|ACAB01000142.1| GENE 19 11355 - 11774 356 139 aa, chain - ## HITS:1 COG:no KEGG:BF0106 NR:ns ## KEGG: BF0106 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 139 1 138 138 217 81.0 1e-55 MKGTDHFKRTIQMYLEQRAEEDTLFAKNYRNPAKNIDDCVTYILNYVQKSGCNGFTDGEI YGQAVHYYDENEIEVGKPIQCQVAVNHIVELTAEEKAEARQQAVRRYQDEELRKLQNRTK PTKAKTETQVQPSLFDFGL >gi|222159217|gb|ACAB01000142.1| GENE 20 11798 - 12019 200 73 aa, chain - ## HITS:1 COG:no KEGG:BF0105 NR:ns ## KEGG: BF0105 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 73 59 131 131 120 78.0 2e-26 MAKRRSKTVEQQCRYYEVGNIFEYMVETYLNGNMSVFRGLYHELNKNARKDFIDFLLSEV EPIYWREILKHTI >gi|222159217|gb|ACAB01000142.1| GENE 21 12031 - 12246 152 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293371479|ref|ZP_06617901.1| ## NR: gi|293371479|ref|ZP_06617901.1| conserved hypothetical protein [Bacteroides ovatus SD CMC 3f] # 1 51 1 51 149 96 98.0 4e-19 MTATVNFRQMAQYIGLAICTLIMRTAFWVSGILWYIVREIINGVFRVAIGLIVAILSVIL FFGFILWLFTL >gi|222159217|gb|ACAB01000142.1| GENE 22 12254 - 12589 222 111 aa, chain - ## HITS:1 COG:no KEGG:BF2919 NR:ns ## KEGG: BF2919 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 3 109 2 108 109 140 63.0 2e-32 MRTAIATKFVAWEVPSLENLQGSKVYGLRTKLNNGEKLSREEKDWLTRNVNSNTYFKSAV PLQGWMFDFSDILRTYIVKQYGHWAEYKATDKTALRSFLYGRIDSIVELNK >gi|222159217|gb|ACAB01000142.1| GENE 23 13725 - 14045 358 106 aa, chain + ## HITS:1 COG:CC3443 KEGG:ns NR:ns ## COG: CC3443 COG2076 # Protein_GI_number: 16127673 # Func_class: P Inorganic ion transport 
and metabolism # Function: Membrane transporters of cations and cationic drugs # Organism: Caulobacter vibrioides # 1 105 1 102 106 70 50.0 8e-13 MNWIILIIAGLFEVGFTFCLGKIKGATGTDFYLWGAGFVISVTLSMFLLAKAAQTLPIGT AYPVWTGIGAVGTVLIGILFFHEPATLGRLFFMTTLIISIVGLKLI >gi|222159217|gb|ACAB01000142.1| GENE 24 14345 - 14518 178 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|298483182|ref|ZP_07001362.1| ## NR: gi|298483182|ref|ZP_07001362.1| two-component system sensor histidine kinase [Bacteroides sp. D22] # 1 57 1 57 579 108 100.0 7e-23 MKSTDSEPQGGGMELEQYRTSLEQMVEEKSKDLIAIQENLEATNRRQALFIKVLQIL >gi|222159217|gb|ACAB01000142.1| GENE 25 14543 - 16057 1202 504 aa, chain + ## HITS:1 COG:slr2098_3 KEGG:ns NR:ns ## COG: slr2098_3 COG0642 # Protein_GI_number: 16330584 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 248 495 11 267 280 178 41.0 2e-44 MNMALAEIGRYTGVDRLATWENHLDGVTYGCTNEWCNDGIEPAIDYLRSMTIEAGKPWFD MLEENHIICTSDIYSLDPFITQMLEVQGVKAIAVFPLSQLGVHFGFLSFNFCWNKQWDEK DVELMSQISQIVSTATKRWQVETSLQQSQRTMQKVLDNINANIFVSDYDTLKVIFANKPF REEAGEVPENAECWRMLNAGLENGCKHCPKPKLLDANRKFTGVHFWEDYNPVTKRWYTIQ SMAIKWLDGRWAIMELATDITTRKQVELELIQAKEKAEESDRLKSAFLANMSHEIRTPLN AIVGFSSLLAETDEAELRHVYMSLVQENNELLLNLISDILDISKIEAGMIDLVMGRVDVP QLCREVIATFSHKKRDSAVELRFDENSPQIVIDADKNRIVQVLSNFLTNALKFTTKGSIT LSYLLEDESQVRFCVTDTGKGIPDEQKHEIFNRFVKLDSFVQGAGLGLSICQSLVKRMGG KIGVESREGEGSCFWFTHPYVPGA >gi|222159217|gb|ACAB01000142.1| GENE 26 16226 - 17305 952 359 aa, chain + ## HITS:1 COG:BMEI0543 KEGG:ns NR:ns ## COG: BMEI0543 COG3049 # Protein_GI_number: 17986826 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Penicillin V acylase and related amidases # Organism: Brucella melitensis # 22 351 34 358 367 214 35.0 2e-55 MMKMTKNLLLGVVAVCGSTFQAVACTGISLTSRDGSYVQARTIEWARGVLQSEYVIIPRG QQLTSFTPTGVNGLTFTAKYGVVGLAVVQKEFIAEGINEAGLSAGLFFFPHYGGYETYDA AQNQRTLADLQVTEWLLSQFSTIDEVKAALSSVRVVGLEKTAVVHWRIGEPSGRQVVLEI 
VGGVPHFYENEVGVLTNAPGFEWQLTNLNNYVNLHPGDASVQKLSGITLQPTGGNSGFLG IPGDATPPSRFVRAAFYRGTAPQRATGFDTVQQCFHLLNNFDIPIGIEHSQGDIPDIPSA TQWTSAIDLTNRKVYYKTAYNNSIRCIDLKAIDFSKVKYQSQPLDKIQEQPVEMVKIPR >gi|222159217|gb|ACAB01000142.1| GENE 27 17519 - 18169 600 216 aa, chain - ## HITS:1 COG:no KEGG:BT_4231 NR:ns ## KEGG: BT_4231 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 210 1 207 212 251 58.0 9e-66 MAILFDWYEDPKPSNKQQGENGNTLHPRIKYNGSTGTDVLRRRIQERCSLTETDVTAVLD ALSHIMGQELAEGRQVHLDGIGYFHPCLTTTEPVTVDTKRKVTKVKLKAIQFRADQTLKN EFGVLKVKNLKGGLDFTQLTNEEIDLQLTKYFRTHPFMRRHDFQYLCGMARSTAMRHIRR LRDEGKLKNMGGAMQPIYVPGSGYYGNNKEVSVKTE >gi|222159217|gb|ACAB01000142.1| GENE 28 18517 - 20610 1518 697 aa, chain + ## HITS:1 COG:all8519 KEGG:ns NR:ns ## COG: all8519 COG5545 # Protein_GI_number: 17232892 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Nostoc sp. PCC 7120 # 397 647 357 608 836 71 26.0 7e-12 MKITLIRQDNGSGKETLSVCEAGTLFDKMKTETKAGHITALRGIIPLLEGTHARYEHIDK LPYIYSAVEYTRTKEGERKMKQYNGLVQLEVSRLASGSEVEFVKRQAALLPQTFATFGGS SGRSVKIWVRFALPDDGGLPTKEAEAELFHVHAYWLAVKCYQPMLPFDIDLKEPVLAQRC RMTLDESPYYNPDAVPFCLEQPLMMPGEETFRQRKLGEKNPLLRLQPGYESAQTFTKIYE AALNRALQEMEDWKRGDDLQPLLVRLAEHCFKAGLPEEEAIRQTMIHYYREEEEQVIRSI LHNLYQECKGFGKKSSISKEQETAFLLEEFMKRRYEFRYNTVQDDLEYRQRDSVHFCFKP VDKRVRNSIAINALKEGISAWDRDVDRFLNSECVPLYNPVEEYLYETGRWDGKDRIRALA GLVPCDNPHWQELFYRWFLSMVAHWRGVDRQHGNNTSPLLVGSQGYRKSTFCRIILPPEL RFGYTDSIDFKSKQEAERYLGRFFLINIDEFDQINVSQQGFLKHLLQKPVANLRKPYGNT IREMRRYASFIGTSNQKDLLTDPSGSRRFICIEVTAPINTNVTINYKQLYAQAMEAIYKG ERYWLNDEDENILKQTNREFEQASPLEQLFHCYLNPVEEEMEGEWLTAMQILSYLQTKTR DKLAINKVGQFGRALQKLNIPCRKSRKGTLYHLMKVK >gi|222159217|gb|ACAB01000142.1| GENE 29 21046 - 21576 342 176 aa, chain + ## HITS:1 COG:PA0995 KEGG:ns NR:ns ## COG: PA0995 COG0350 # Protein_GI_number: 15596192 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase 
# Organism: Pseudomonas aeruginosa # 11 173 6 162 173 137 47.0 1e-32 MEDKTNVIKIQRYHSPCGDLMLGSFEDKLCLCDWAAESHRDIVDRRLRKVLKAGYEKSTS DVILEAMSQLDEYFNGERTVFEVPLLFVGTEFQKSVWYKLLEIPYGSTVSYGELAKQLDL PKAVRAVAAANGANAISIFAPCHRVIGSNHSLVGYGGGLPAKKRLLDLELNGKPLL >gi|222159217|gb|ACAB01000142.1| GENE 30 21573 - 22427 594 284 aa, chain + ## HITS:1 COG:SMc02841_2 KEGG:ns NR:ns ## COG: SMc02841_2 COG0350 # Protein_GI_number: 15963919 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Sinorhizobium meliloti # 115 278 3 169 169 154 43.0 2e-37 MNEQKTLDYARIAQAIGYIRENFKRQPGLDEIAGEVALSTAHFQRMFTEWAGISPKKFLQ YTSIEYAKRILNETHASLFDAAQETGLSGTGRLYDLFVNIEGMTPGEYKNGGESLSINYS FAKSPFGEIFIASTDKGICCMEFADEHDAAFNSLRKKFPNAKFTSIVDEVQQNALFIFTQ DWSKLKEIKLHLKGTDFQLKVWETLLKIPVGGLTTYGDIAVGINNPRACRAVGTAVGENP VAFLIPCHRVIRASGELGNYHWGEIRKTAMIGWEAAKGEVKRGL >gi|222159217|gb|ACAB01000142.1| GENE 31 22510 - 23088 578 192 aa, chain + ## HITS:1 COG:CAC3350 KEGG:ns NR:ns ## COG: CAC3350 COG0693 # Protein_GI_number: 15896593 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Clostridium acetobutylicum # 2 192 3 194 195 181 48.0 8e-46 MKLLVFLAKGFETIEFSAFIDVMGWAKTDFDCKIDVVTCGLNQKVISSFNVPVLVDKVMD EVSADEYDALAIPGGFEEFGFYEEAYNEQLLELIRLFDSQKKWIATVCVGALPVGKSGVL NGRKATTYHLRGAHKQKVLQGFGATIVNSPIVVDDNIITSYCPQTSYGVALLLLEKLTSH KEMVLVKDAMGF >gi|222159217|gb|ACAB01000142.1| GENE 32 23143 - 23508 156 121 aa, chain - ## HITS:1 COG:no KEGG:BT_2357 NR:ns ## KEGG: BT_2357 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 119 1 119 119 186 82.0 1e-46 MKLWKYSGTLLTATGVIHTIYALFLGKEAFTEMLNNGLIDSIGENHNLGFAFWFLICGII LILWGETLQYYIRKEQKPAPLFLGYFILLFTIIGCIVEPISGFWLFLPQALIIIYSNMKK Q >gi|222159217|gb|ACAB01000142.1| GENE 33 23801 - 24355 328 184 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_2020 NR:ns ## KEGG: Dhaf_2020 # Name: not_defined # Def: GCN5-related 
N-acetyltransferase # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 24 171 8 146 159 100 38.0 2e-20 MEFIKEAGVSPQKSMDIKNDKITIKPLRDKDVDIFSKWLAKEYIYKWFCPDGEEHKMAWL DEVNNRNTQYHYMKHFIVYYNDKAIGFCLYLDCYFEKEYIPEHYGKTVDEKETVFEIGYL IGEEEYLGKGIGKIIVKKLIGEIAEIGGKEILADPDEANVLSIRTLLSNGFVKVKDCDYR YCLK >gi|222159217|gb|ACAB01000142.1| GENE 34 24446 - 24676 101 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713404|ref|ZP_04543885.1| ## NR: gi|237713404|ref|ZP_04543885.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 76 86 161 161 150 100.0 3e-35 MNEEQEAEEMNKAFPEFLQRLSIPKAILGGEFQFDKMNFIERFLTKKIAKVNSSVSKLRY DAINEFTSQIKDNHLW >gi|222159217|gb|ACAB01000142.1| GENE 35 25176 - 27014 1215 612 aa, chain - ## HITS:1 COG:no KEGG:BT_2363 NR:ns ## KEGG: BT_2363 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 612 1 610 610 1045 87.0 0 MKRIYIKFIFCSVLFSALSLHSCTLEEYNPGAFTKEALATSIDGYETLINQCYFAMERFY YGSADWMSLTEGDTDLWTYKANESTSYTQWFWFFAGTSPNTTYTNNWWNGTYDGVGACNE AIALGDKPPYTTEEERNAKIAEARFLRAVYYFNAVEQFGAVTMLTEPITAETLTYSPTRT DPMTIYQEVILPDLRFASEWLPTGTHATTTTPTKKAALGFLAKACLQTYEYGSTEYLQEA LDTAKKLITDCETGGGKYNTYMYPSYSEVFKESNNWENKEALWKHRWYAGADGHGSSNGN YKLNRNDEYFLCNINKFGAREDNQETRLTWEGSITGIFMPTQHLLSLYVQKDGTLDPRFH ESFTTEWNANKNYTWDESAVHMYDKEETVIGKALNKGDLAIKFIMPQDIDYATEKQKEHI SDYLLIDYHHVYSDDNNNVNMNYAYTNVTGNYKNDGTNENQFRYYYPSLNKHNSSNYYVA NASKQRNGNLNATFIMRMAEVYLIAAEADIYLNGGANAAGYINKVRERAGANPLTGSITV RDILDERGRELCGEYCRFYDLKRTGMFKNSEYLENTHPDLAQFFHPNYALRPISTTFTAT ITNGSEYQNPGY >gi|222159217|gb|ACAB01000142.1| GENE 36 27030 - 30116 2603 1028 aa, chain - ## HITS:1 COG:no KEGG:BT_2362 NR:ns ## KEGG: BT_2362 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1028 1 1028 1028 1791 88.0 0 MKSFSDKTNIGENNPLKTVYRMYFYLLFLLIGSLSILGSTRAYCQQSRTISGIVKSSQGE TIIGANIVEKGTSNGTITNVDGLFTLHVSPNAVLKVSYMGYTEQEVNTRNKNKLEIILTE DARLIDEVVVVGYGSVKKRDLTGAVTSVKSAEVLAAPTNNVMEALQGKIPGMDITKTSGQ 
VGGDVTILLRGTRSIYGSNEPLFIIDGIPGSYSQVSPSDIESVDVLKDASSTAIYGSAGA NGVVIITTKRGKEGKVTVNFDAYYGFSGSPNYRHGMTRDEWVTYQQEAYKYKNGDYPTDM SALLGKQDFIDAYNDGKWIDWIDEVSGNTATTQKYSLSVSSGTEKTKLFASTSYNREEGL LNNENLNRYSLRLNLDQQIFSWAKVGFTSNLVYRDLNSGVKNTFTKSLSAIPLGDASTEK GEINHEYITGQYSPMSDFIENQYVDNTRSTYLNMSGYVELTPVKDLTFTSRVNGTLNHSR RGQYWGEKCNANRPSYAGSPHASITNNNAWNYTWENILAYNTTIAKDHNLGGSLITSWNK NQSESSMAAASGQMVDQWSFWRLTSGTSQHVESDFAQTQKMSFAFRLNYSYKGKYLFNFS TRWDGVSQFSTGHKWDAFPAGALAWRISDEAFMEKTRSWLDNLKLRVSYGITGNSGGTTA YSTTTQAYVYTASGISINGKIVPFTQYSGTYGSSDLGWEKSYNWNVGLDFGILNGRIDGS VEWFKTTTKGLLFKRTLPITSGLTGWGSPLAIWQNIAQTSNQGMEATITSHNIRHKDFTW NTTLSVTWNKEQIDDLPDGDLIAENLFVGEPIKAIYGYKYAGIWGTDTPQETLDAYGVKP GFIKIETLDQKGDGGVHKYSTEDRQILGHSNPDWIIGFSNSFTYKNFDLSVFAMARYGQT INSDLLGYYTAEQSVTKNQLAGVDYWTEDNQGAYFPRPGTGDEQKTVYPSLRVRDGSFIK IKNITLGYTLPVNISRKVLMEKCRIYATAYNPFIFVKDKQLKDTDPETNGSDAFPTYRQF VFGVNLTF >gi|222159217|gb|ACAB01000142.1| GENE 37 30449 - 30838 334 129 aa, chain + ## HITS:1 COG:no KEGG:BT_2361 NR:ns ## KEGG: BT_2361 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 129 1 129 129 173 84.0 1e-42 MDNEVKNITNRRHIGRNIQRIRVYLGMKQEALAADLGVSQNIISKIEKEPEIEEGLLNQL ASALGISAEVIKDFDVEKAIYNINNIRDNTFEQGSTSIAQQFNPVDKIIELYERLLQSER EKIELLKNK >gi|222159217|gb|ACAB01000142.1| GENE 38 30884 - 31294 235 136 aa, chain + ## HITS:1 COG:no KEGG:BT_2360 NR:ns ## KEGG: BT_2360 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 136 1 136 136 175 74.0 5e-43 MENEINTNATHIGRKIERIRRLRGMTQTDLGDLLGITKQAVSKMEQSEKIEDERIKRVAD ALGVTEEGLKKFTEETVLYYTNNFYENSNATATNIGTISNLENINHFSMEQAVKLFEELL KIEREKYNKENSKDDK >gi|222159217|gb|ACAB01000142.1| GENE 39 31429 - 31830 346 133 aa, chain + ## HITS:1 COG:no KEGG:BT_2361 NR:ns ## KEGG: BT_2361 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 133 1 129 129 150 73.0 1e-35 MDNDVKDIMNRRHIGRNIQKIRVYLGMKQEALAADLGVSQNVISKIEKESEIEEGLLNKI 
ASVLGISAEVIKDFDVERAIYNINSYKDATISPGAIAAVHVTSQQINPVDKIIELYERLL QSEREKIELLKNK >gi|222159217|gb|ACAB01000142.1| GENE 40 31976 - 33682 403 568 aa, chain - ## HITS:1 COG:mlr1445 KEGG:ns NR:ns ## COG: mlr1445 COG0433 # Protein_GI_number: 13471466 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Mesorhizobium loti # 109 562 116 536 610 87 23.0 5e-17 MEPFNHNKFLGYVSEVTPQFVKIQFPTANLLQSFYHDGNIYAGGNVGCFVVIEGAEYGFL GRIIELNLPQGERTEITEKTIHETNTDFHPIGKIELLVVFNVYNPLKIEKTISKYPAIGS KIYSCSDEQISNYIATFGSKSDNQGSEPFSQIGKLTSNNAICNISLNSLFGRHCAVVGTT GGGKSWTVAKLLEMMNEHTSNNIILIDATGEYANTSDEVTSVELGDGEYIFDYTRLTIDE LYYLLKPSSKTQVPKLMDAIRSLKIAKLSSGNNLEGIIENGMVRKSGNNKRPYEAFYYRN ITAIEDKYCDFNFELLPRQITSECIWDTDKNDVNKWGGRNETDVSNCISLISRVNNLLCT EEYNKIFGFRKSALNAPKDLIHVIDDYINNKKGILRFDFSNVSFDYQVREILVNAIGKYL LNNAREGKFKNDPVVIFIDEAHQFLNKSIKDEYFDAKPLDSFELISKEARKYGLFLCIAT QMPRDIPLGTLSQMGTFIVHRLINDQDKKAIESAASSANKNILSFLPILGEGEALLVGVD FPMPLLIKIDAPNTKPNSQTPRFTIKDK >gi|222159217|gb|ACAB01000142.1| GENE 41 33663 - 34490 286 275 aa, chain - ## HITS:1 COG:no KEGG:ZPR_2242 NR:ns ## KEGG: ZPR_2242 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 13 261 164 409 422 192 42.0 2e-47 VSDVIKDSTGNSLKESIEKKIANACNLELDKSNKHHGEFLNKLVARKPSDPRVQLFTTNY DTLFEQAAKKRGFTIIDGFSFSFPRYFAGKNFDYDIVYREKTRLKQEESFVPNIFHLYKL HGSIDWEKDENGIIQQTEKTQKPCIIYPASEKYESSYEQPYFEMMARFQQTLRQESTLLI VIGFGFQDKHIQNVIKEAVFQNPNFHLVIINYNKNNNSETGIMPNLLPDYIDLNMSTESN VSVIFSTFKDFVEAFPINQSYYKSERLEDNGTIQP Prediction of potential genes in microbial genomes Time: Wed May 18 04:20:47 2011 Seq name: gi|222159216|gb|ACAB01000143.1| Bacteroides sp. D1 cont1.143, whole genome shotgun sequence Length of sequence - 7795 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 4, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 74 - 529 297 ## gi|262406784|ref|ZP_06083333.1| conserved hypothetical protein 2 1 Op 2 . 
- CDS 544 - 1635 408 ## PROTEIN SUPPORTED gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 - Prom 1847 - 1906 4.0 - TRNA 1938 - 2010 70.0 # Lys TTT 0 0 - Term 1868 - 1915 9.5 3 2 Op 1 . - CDS 2105 - 3238 941 ## COG0628 Predicted permease 4 2 Op 2 . - CDS 3260 - 3859 478 ## COG1435 Thymidine kinase - Prom 3935 - 3994 6.0 + Prom 3883 - 3942 3.6 5 3 Op 1 . + CDS 4000 - 4662 446 ## BT_2274 hypothetical protein 6 3 Op 2 . + CDS 4675 - 5349 507 ## COG0313 Predicted methyltransferases 7 3 Op 3 . + CDS 5383 - 6240 914 ## BT_2272 hypothetical protein + Term 6263 - 6319 12.0 + Prom 6242 - 6301 3.9 8 4 Op 1 . + CDS 6338 - 7030 525 ## COG1011 Predicted hydrolase (HAD superfamily) 9 4 Op 2 . + CDS 7092 - 7739 766 ## COG2095 Multiple antibiotic transporter Predicted protein(s) >gi|222159216|gb|ACAB01000143.1| GENE 1 74 - 529 297 151 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262406784|ref|ZP_06083333.1| ## NR: gi|262406784|ref|ZP_06083333.1| conserved hypothetical protein [Bacteroides sp. 2_1_22] # 1 151 1 151 406 292 100.0 4e-78 MKYLYISRNQSISFNEDYWYINGIEQKRPIEETRTTEELAKNCTKQYMKNVLTNLFSHFK HIAILTAAGTSMDNGEHRGKTRDGLWRECRDDIKAIIRELHKKQACSNKMKAIVREKNIE DFLSYLILFEKISDVIKDSTGNSLKESIEKK >gi|222159216|gb|ACAB01000143.1| GENE 2 544 - 1635 408 363 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 [Haemophilus parasuis 29755] # 16 358 8 327 339 161 27 1e-39 MKEYPINAIDNEYIEWLKAIKQQIRSSKIRMIKTVNTELIHFYWRLGQMISQKLKEQNWG DKVINKLSADLRNEFPDMQGFSRQNLYYSKNFYEFYAEKIHISSNNKIVPQVEGQLQVVD NYRIIFDIPWGHQKVIISKAQNIEEALFYAHQTLSNSWSRSVLENQLKQQFYEHYRQGQT NFSQTLPALTADMAQEVVKDPYWFDFVSVSQKARERDIEKQLVTHITQFLLELGKGFAFV GEQYCLNLNNKEYFCDLLFYHIPLRAYVVIELKNGNFKPEHLGQLNFYQNLINNTLRGEF DNPTIGMLLCRDKDRIEVEYALQNISSPIGVSEFNVRELLPENLKSKLPTVEEVEEELNK FVK >gi|222159216|gb|ACAB01000143.1| GENE 3 2105 - 3238 941 377 aa, chain - ## HITS:1 COG:RSc2624 KEGG:ns NR:ns ## COG: RSc2624 COG0628 # Protein_GI_number: 17547343 # Func_class: R General function prediction only # 
Function: Predicted permease # Organism: Ralstonia solanacearum # 37 345 36 338 356 102 25.0 2e-21 MERKKITFDSFIRGSIGCVLVVGILMLVERLSGVLLPFFIAWLIAYMVYPLVKFFQYKLR LKSRIISIFCSLFLITVIGVTLFYLLVPPMISEIGRMNDLLVTYLTNGAGNNVPKSLSEF IHENIDLQALNRVLSEENILAAIKDTVPRVWTLLAESLNILFSILASFIILLYVIFILLD YEAIAEGWLHLLPNKYRTFASNLVHDVQDGMNRYFRGQALVAFCVGILFSIGFLIIDFPM AIALGLFIGALNMVPYLQIIGFLPTILLAILKAADTGQNFWIIIACALAVFAIVQIIQDT FLVPKIMGKITGLNPAIILLSLSIWGSLMGMLGMIIALPLTTLMLSYYQRFIINKEKIKY DEAETTDNQETSHNEEK >gi|222159216|gb|ACAB01000143.1| GENE 4 3260 - 3859 478 199 aa, chain - ## HITS:1 COG:BH3779 KEGG:ns NR:ns ## COG: BH3779 COG1435 # Protein_GI_number: 15616341 # Func_class: F Nucleotide transport and metabolism # Function: Thymidine kinase # Organism: Bacillus halodurans # 9 188 1 187 204 181 47.0 6e-46 MVLFSEDHIQETRRRGRIEVICGSMFSGKTEELIRRMKRAKFAKQRVEIFKPAIDTRYSE EDVVSHDSHSIASTPIDSSASILLFTSEIDVVGIDEAQFFDSGLIDVCNQLANNGIRVII AGLDMDFKGNPFGPMPQLCAIADEVSKVHAICVKCGQLASFSHRTVKNEKQVLLGETAEY EPLCRECYLRARGEDEQKV >gi|222159216|gb|ACAB01000143.1| GENE 5 4000 - 4662 446 220 aa, chain + ## HITS:1 COG:no KEGG:BT_2274 NR:ns ## KEGG: BT_2274 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 220 3 241 241 183 59.0 5e-45 MKQKLLTDIELDVHELKYLMDSFTKEPTPTLSELLKRSITRMQGRLDELQQEVDAVQVIS PSAVEETAGEEDEVAGSEVDSPSIIQSLESVVVKEDEEEEIIVAEETPVVIAEEEPAIVA EPVVETVVKEEEPKSAVLGESLKLSAGLRHAISLNDSFRFSRELFGGDTDLMNRVIEQIS VMSSYKTAVAFLSSKVELNEEKEAVNDFLELLKKFFNQSA >gi|222159216|gb|ACAB01000143.1| GENE 6 4675 - 5349 507 224 aa, chain + ## HITS:1 COG:all4680 KEGG:ns NR:ns ## COG: all4680 COG0313 # Protein_GI_number: 17232172 # Func_class: R General function prediction only # Function: Predicted methyltransferases # Organism: Nostoc sp. 
PCC 7120 # 2 221 8 228 285 229 49.0 4e-60 MGKLYVVPTPVGNLEDMTFRAIKVLKEVDLILAEDTRTSGILLKHFEIKNAMQSHHKFNE HKTVESVVNRIKAGETVALISDAGTPGISDPGFLVVRECVRNGIEVQCLPGATAFVPALV ASGLPNEKFCFEGFLPQKKGRQTRLKTLAEEHRTMVFYESPHRLLKTLTQFAEYFGAERQ ATVSREISKLHEETVRGSLAELIEHFTATEPRGEIVIVLAGIDD >gi|222159216|gb|ACAB01000143.1| GENE 7 5383 - 6240 914 285 aa, chain + ## HITS:1 COG:no KEGG:BT_2272 NR:ns ## KEGG: BT_2272 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 285 1 284 284 429 96.0 1e-119 MKKLAVLFVCVAMLASCDGFKGGSKDLKAENDSLLMELNQRNAELDDMMGTFNEVQEGFR KINAAESRVDLQRGTITENSASAKQQIASDIEFISKQMEENKAQIAKLEAQLKNSKYNSA QMKKAVAALTAELNAKQQRIEELQTELASKNIRIQELDAAVSDLSVAKETLAAENEAKAK TVAEQEKSLNAAWFVFGTKSELKAQKILQSGDVLKSADFNKDYFTQIDIRTTKEIKLYSK RAELLTTHPTGSYELVKDDKGQLTLKITNPTEFWSVSRYLVIQVK >gi|222159216|gb|ACAB01000143.1| GENE 8 6338 - 7030 525 230 aa, chain + ## HITS:1 COG:YPO2295 KEGG:ns NR:ns ## COG: YPO2295 COG1011 # Protein_GI_number: 16122519 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Yersinia pestis # 1 230 1 222 224 113 31.0 3e-25 MKYKNLFFDLDDTIWAFSQNARDTFGEVYQKYSFDRYFDSFDHYYTLYQQRNTELWIEYG EGKITKDELNRQRFFYPLQAVGVEDEALAEQFSKDFFAIIPTKGTLMPHAKEVLEYLAPK YNLYILSNGFRELQSCKMRSAGVDRYFKKVILSEDLGVLKPWPAIFNFALSATQSELCES LMIGDSWEADITGAHGVGMHQAFYNVTERTTFPFLPTYHIHSLKELMDLL >gi|222159216|gb|ACAB01000143.1| GENE 9 7092 - 7739 766 215 aa, chain + ## HITS:1 COG:BS_yvbG KEGG:ns NR:ns ## COG: BS_yvbG COG2095 # Protein_GI_number: 16080438 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Multiple antibiotic transporter # Organism: Bacillus subtilis # 5 203 2 203 211 129 39.0 6e-30 MDSTLLPFALLCFTSFFTLTNPLGTMPVFLTMTHGMTDKERQSIVRRATIVSFITLMVFV FAGQFLFKFFGISTNGFRIAGGVIIFKIGFDMLQARYTPMKLKDEEIKTYADDISITPLG IPMLCGPGAIANAIVLMQDAHSYEMKGILIGTIALIYLLTFFILRASTKLVNVLGETGNN VMMRLMGLILMVIAVECFVSGLKPILVDIVREGMG Prediction of potential genes in microbial genomes Time: 
Wed May 18 04:21:06 2011 Seq name: gi|222159215|gb|ACAB01000144.1| Bacteroides sp. D1 cont1.144, whole genome shotgun sequence Length of sequence - 19228 bp Number of predicted genes - 17, with homology - 15 Number of transcription units - 9, operones - 5 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 180 - 239 3.2 1 1 Tu 1 . + CDS 301 - 459 69 ## + Term 472 - 523 16.4 - Term 220 - 269 4.3 2 2 Op 1 . - CDS 399 - 500 81 ## 3 2 Op 2 . - CDS 564 - 1457 814 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 4 2 Op 3 . - CDS 1517 - 3517 1870 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II) - Prom 3598 - 3657 8.0 + Prom 3413 - 3472 7.3 5 3 Tu 1 . + CDS 3593 - 4270 681 ## COG0336 tRNA-(guanine-N1)-methyltransferase + Term 4284 - 4340 10.5 - Term 4272 - 4328 14.3 6 4 Op 1 13/0.000 - CDS 4487 - 5398 741 ## COG0167 Dihydroorotate dehydrogenase 7 4 Op 2 . - CDS 5386 - 6162 543 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 8 4 Op 3 . - CDS 6239 - 6568 189 ## BT_0890 hypothetical protein - Term 7841 - 7894 4.0 9 5 Op 1 . - CDS 7923 - 8942 743 ## COG1466 DNA polymerase III, delta subunit 10 5 Op 2 . - CDS 8967 - 9743 830 ## COG0775 Nucleoside phosphorylase - Prom 9793 - 9852 5.3 + Prom 9694 - 9753 6.2 11 6 Tu 1 . + CDS 9807 - 10256 248 ## BT_0887 hypothetical protein + Term 10308 - 10352 10.0 - Term 10292 - 10345 13.1 12 7 Op 1 27/0.000 - CDS 10422 - 13577 2868 ## COG0841 Cation/multidrug efflux pump - Prom 13629 - 13688 5.7 13 7 Op 2 13/0.000 - CDS 13719 - 14852 973 ## COG0845 Membrane-fusion protein 14 7 Op 3 . - CDS 14887 - 16284 513 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 - Prom 16304 - 16363 4.5 - Term 16415 - 16462 7.1 15 8 Op 1 3/0.000 - CDS 16505 - 17335 976 ## COG1579 Zn-ribbon protein, possibly nucleic acid-binding 16 8 Op 2 . 
- CDS 17341 - 18435 908 ## COG3323 Uncharacterized protein conserved in bacteria - Prom 18670 - 18729 9.3 + Prom 18415 - 18474 6.5 17 9 Tu 1 . + CDS 18716 - 19219 566 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog Predicted protein(s) >gi|222159215|gb|ACAB01000144.1| GENE 1 301 - 459 69 52 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNDTYDRCFFHYVAFNSNLHRNESIFKDVIALNQIPPIFRLSHKKVKRNKKS >gi|222159215|gb|ACAB01000144.1| GENE 2 399 - 500 81 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIEFNHWDIYFKSAYDFLFLFTFLWLSLKIGGI >gi|222159215|gb|ACAB01000144.1| GENE 3 564 - 1457 814 297 aa, chain - ## HITS:1 COG:BH1742 KEGG:ns NR:ns ## COG: BH1742 COG0329 # Protein_GI_number: 15614305 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Bacillus halodurans # 12 277 9 273 295 255 47.0 8e-68 MIQTKLKGMGVALITPFKEDESVDYDALMRMVDYLLQNNADFLCVLGTTAETPTLTEEEK KTIKKMVIDRVNGRIPILLGVGGNNTRAIVETLKNDDFTGVDAILSVVPYYNKPSQEGIY QHYKAIAEATELPIVLYNVPGRTGVNMTAETTLRIARDFNNVVAIKEASGNITQMDDIIK NKPENFNVISGDDGITFPLITLGAVGVISVIGNAFPREFSRMTRLALQGDFANALTIHHR FTELFNLLFVDGNPAGVKSMLNAMGMIENKLRLPLVPTRITTFEAIRKVLNELNIKC >gi|222159215|gb|ACAB01000144.1| GENE 4 1517 - 3517 1870 666 aa, chain - ## HITS:1 COG:BH0649 KEGG:ns NR:ns ## COG: BH0649 COG0272 # Protein_GI_number: 15613212 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA ligase (contains BRCT domain type II) # Organism: Bacillus halodurans # 2 666 5 668 669 561 45.0 1e-159 MDIKEKIEELRAELHRHNYNYYVLNAPEISDKEFDDKMRELQDLEQAHPEYKDENSPTMR VGSDLNKNFTQVAHKYPMLSLANTYSEAEVTDFYDRVRKALNEDFEICCEMKYDGTSISL TYEDGKLVRAVTRGDGEKGDDVTDNVKTIRSIPLVLHGDNYPASFEIRGEILMPWEVFEE LNREKEAREEPLFANPRNAASGTLKLQNSSIVASRKLDAYLYYLLGDNLPCDGHYENLQE AAKWGFKISDLTRKCQTLEEVFEFINYWDVERKNLPVATDGIVLKVNSLRQQKNLGFTAK SPRWAIAYKFQAERALTCLNKVTYQVGRTGAVTPVANLDPVQLSGTVVKRASLHNADIIE 
GLDLHIGDMVYVEKGGEIIPKITGVDKDARSFMLGEKVRFIVNCPECGSKLVRYEGEAAH YCPNETACPPQIKGKIEHFISRKAMNIDGLGPETVDMFYRLGLIENTADLYKLTVDDIKG LDRMGEKSAENIVTGIAQSKTVPFERVIFALGIRFVGETVAKKIAKSFENIDDLQQADLE KLVSIDEIGEKIAQSILAYFANESNRELVNRLKEAGLQLYRTEEDLSGYTDKLAGQSIVI SGVFIHHSRDEYKELIEKNGGKNVGSISAKTSFILAGDNMGPAKLEKAKKLGITILSEDE FLKLIS >gi|222159215|gb|ACAB01000144.1| GENE 5 3593 - 4270 681 225 aa, chain + ## HITS:1 COG:BH2479 KEGG:ns NR:ns ## COG: BH2479 COG0336 # Protein_GI_number: 15615042 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-(guanine-N1)-methyltransferase # Organism: Bacillus halodurans # 1 223 1 224 246 247 50.0 1e-65 MRIDIITVLPEMIEGFFNCSIMKRAQDKGLAEIHIHNLRDYTEDKYRRVDDYPFGGFAGM VMKIEPIERCINALKAERDYDEVIFTTPDGEQFNQPMANSLSLAQNLIILCGHFKGIDYR IREHLITKEISIGDYVLTGGELAAAVMADAIVRIIPGVISDEQSALSDSFQDNLLAAPVY TRPADYKGWKVPDILLSGHEAKIKEWELQQSLERTRKLRPDLLGE >gi|222159215|gb|ACAB01000144.1| GENE 6 4487 - 5398 741 303 aa, chain - ## HITS:1 COG:aq_046 KEGG:ns NR:ns ## COG: aq_046 COG0167 # Protein_GI_number: 15605646 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Aquifex aeolicus # 3 299 2 300 306 300 49.0 2e-81 MADLSVNIGELQMKNPVMTASGTFGYGEEFSDFIDITRIGGIIVKGTTLHKREGNPYPRM AETPSGMLNAVGLQNKGVDYFVEHIYPRIKDIQTNMIVNVSGSAIEDYVKTAEIINELDK IPAIELNISCPNVKQGGMAFGVSAKGASEVVKAVRSAYKKTLIVKLSPNVTDITEIARAA EESGADSVSLINTLLGMAIDAERKRPILSTITGGMSGAAVKPIALRMVWQVAKAVNIPVI GLGGIMDWRDAVEFMLAGATAIQIGTANFIDPAVTIKVEDGINNYLERHGCKSVKEIIGA LEV >gi|222159215|gb|ACAB01000144.1| GENE 7 5386 - 6162 543 258 aa, chain - ## HITS:1 COG:BH2535 KEGG:ns NR:ns ## COG: BH2535 COG0543 # Protein_GI_number: 15615098 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Bacillus halodurans # 8 255 7 258 259 167 35.0 2e-41 MKKFILDLTVTENIRLNANYVLLKLTSQSLLPEMLPGQFAEIRVDGSPTTFLRRPISINF 
VDKQRNEVWFLIQLVGDGTRRLAEANSGDTINVVLPLGNAYTMPQEASDKLLLVGGGVGT APMLYLGEQLAKKGHKPTFLLGARSDKDLLQLEEFAKYGEVYTTTEDGSHGEKGYVTQHS ILNKVRFEQIYTCGPKPMMVAVAKYAKSNQIECEVSLENTMACGIGACLCCVENTTEGHL CVCKEGPVFNINKLLWQI >gi|222159215|gb|ACAB01000144.1| GENE 8 6239 - 6568 189 109 aa, chain - ## HITS:1 COG:no KEGG:BT_0890 NR:ns ## KEGG: BT_0890 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 109 48 156 156 156 80.0 3e-37 MKVHQKYDYVNLEWLLYGEGEMMVSEEETSFSSSNSDYLPSLFDENLINSSKEPAIPENR KEMPLRNAENAPKEIVKQGIRYIEKPARKITEIRIFFDDNTYETFRPEK >gi|222159215|gb|ACAB01000144.1| GENE 9 7923 - 8942 743 339 aa, chain - ## HITS:1 COG:BS_yqeN KEGG:ns NR:ns ## COG: BS_yqeN COG1466 # Protein_GI_number: 16079610 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, delta subunit # Organism: Bacillus subtilis # 10 317 4 319 347 83 24.0 7e-16 MAKQELTCDDILKELRAKQYRPVYYLMGEESYYIDLIADYITDNVLSETEKEFNLTVVYG ADVDVATIINAAKRYPMMSEHQVVVVKEAQAVRNMEELSYYLQKPLLSTILVICHKHGTL DRRKKLAAEVEKTGVLFESKKIKDAQLPGFIASYMKRKGVDMEPKATSMLADFVGSDLSR LTGELEKLIITLPAGQKRVTPEQIEKNIGISKDYNNFELRSALVEKDILKANKIIKYFEE NPKTNPIQMTLSLLFSFYSNLMLTYYAPDKSEQGIANMLGLRTPWQARDYMAAMRKYSGV KTMQIVGEIRYADAKSKGVQNSSMTDGDILRELVFKILH >gi|222159215|gb|ACAB01000144.1| GENE 10 8967 - 9743 830 258 aa, chain - ## HITS:1 COG:CPn0894 KEGG:ns NR:ns ## COG: CPn0894 COG0775 # Protein_GI_number: 15618803 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside phosphorylase # Organism: Chlamydophila pneumoniae CWL029 # 3 255 6 263 293 206 39.0 5e-53 MKTKEEIVANWLPRYTKRNLEDFGEYILLTNFNKYVEIFANQFNVPILGRDANMISASAE GITMINFGMGSPNAAIIMDLLGAIRPKACLFLGKCGGIDKKNQLGDLILPIAAIRGEGTS NDYFPPEVPALPAFMLQRAVSSSIRDKGRDYWTGTVYTTNRRIWEHDDAFKEYLTKTRAM AVDMETATLFSCGFANHIPTGALLLVSDQPMTPDGVKTDKSDNLVTRNYVEEHVEIGIAS LRMIIDEKKTVKHLKFDW >gi|222159215|gb|ACAB01000144.1| GENE 11 9807 - 10256 248 149 aa, chain + ## HITS:1 COG:no KEGG:BT_0887 NR:ns ## KEGG: BT_0887 # Name: not_defined # Def: 
hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 149 1 149 149 287 95.0 7e-77 MLSLNLPVFDTKINVRNGKNVIFDVIRKRYVALTPEEWVRQHFVHFLVAHKGYPTTLLAN EVMVKLNGTTKRCDTVLYRRDLSARMIVEYKAPHIEITQAVFDQITRYNMVLKVDYLIVS NGMQHYCCRMDYEHQSYTFLQDIPDYHSL >gi|222159215|gb|ACAB01000144.1| GENE 12 10422 - 13577 2868 1051 aa, chain - ## HITS:1 COG:BMEI1629 KEGG:ns NR:ns ## COG: BMEI1629 COG0841 # Protein_GI_number: 17987912 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Brucella melitensis # 3 1029 2 1027 1051 871 45.0 0 MFSKFFINRPIFATVLALLIVVAGLVTLNILPVAQFPEITPPTVQVSAVYPGANAETVAQ TVGIPIEQQVNGVDGMLYMSSNSSSSGAYSLTITFAVGTDIDMATVQVQNRVSIAQSSLP EPVVVQGVTVQKQSSNIVMFLTMTSQDSVYNSLYLTNYAKLNLVDQLTRVPGVGAVNVMG AGDYSMRIWLDPEAMRIRNISPQQVYQSIQSQNVEVSAGYIGQPIGQDNNNAFQYTLNVQ GRLKSPEQFGDIIIRREQNGAMLRLKDIARIDLGSASYSVVSRLNGKPTAAIAIYQQPGS NSLDVSKGVKAKMEELAASFPKGVSYNVTLDTTDVIHASIDEVMVTFFETTLLVILVIFL FLQNWRAVIIPCITIPVSLIGTFAVMAAFGFSINTLTLFGLILAVAIVVDDAIVVVENAS RLLETGQYSPREAVTKAMGEITGPIVGVVLVLLAVFIPTMMISGISGQLYKQFALTIAAS TVLSGFNSLTLTPALCALFLEKSKPSNFFIYKGFNKAYDKTQGVYDKIVKWLLQRPGMAL VSYGALTVIAVLLFMHWPSTFIPDEDDGYFIAVVQLPPAASLERTQAVGDKVNAILDSYP EVKNYIGISGFSIMGGGEQSNSATYFVVLKNWDERKGKEHTAAAVVNRFNGEAYMTIQAA EVFAMVPPAIPGLGASGGLQLQLEDRRNLGPTEMQQAINALLASYHSKPALGSVSSQYQA NVPQYFLNIDRDKVQFMGIALNDVFSTLGYYMGAAYVNDFVEFGRIYQVKIEARDQAQRV IDDVLKLSVPNAAGEMVPFSSFTKVEEQLGQDQINRYNMYSTASLTCNVAPGSSTGQAIQ EVEALFKEQLGDEFGYEWTSVAYQETQAGNTTTIVLVMALIVAFLVLAAQYESWTSPVAA VIGLPVALLGAMIGCLIMGTPVSIYTQIGIILLIALSAKNGILIVEFARDFRAEGNSIRE AAFEAGHVRLRPILMTSFAFVLGVMPLLFATGAGAQSRIALGAAVVFGMAMNTLLATIYI PNFYELMQKLQERFSKKKADDGGGKDAATQK >gi|222159215|gb|ACAB01000144.1| GENE 13 13719 - 14852 973 377 aa, chain - ## HITS:1 COG:mll7356 KEGG:ns NR:ns ## COG: mll7356 COG0845 # Protein_GI_number: 13476125 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Mesorhizobium loti # 29 369 49 379 384 163 33.0 5e-40 
MKKLMYIFLALPILTGCKEKKSTGAMGGMPTPEISVTKPIVEDITLTKDYPGYLTTEKTV NLVARVNGTLQSTSYVPGGRVKQGQLLFVIEPTLYKDKVEQAEAELKTAQAQLEYARNNY SRMKEAVKSDAVSQIQVLQAESSVAEGVASVSNAEAALSTARTNLGYCYVRAPFDGTISK ATVDVGSYVGGSLQPVTLATIYKDNQMYAYFNVADNQWLTMAMDNQQLPANLPQKIMVQL GKEGTETYPAALDYLSPNVDLNTGTLTVRANFDNPKGILKSGLYVSITLPYGEAKNAVLV KEGSIGTDQLGKYLYVVNDSNIVHYRHIEIGQLVDGTLRQVVGGLSPQEQYVTEALMKVR DGMKIKPLPDSLPKREK >gi|222159215|gb|ACAB01000144.1| GENE 14 14887 - 16284 513 465 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 49 464 41 457 460 202 29 2e-51 MSIRVWGTSLLMFLSTVTAVGQTANRYLKAPLPNDWEESGEVFQQILPVDDHWWKSFQDA KLDSLIALAVDRNYSVAMAINRIAAARANLWIERSNFLPSIGLNAGWTRQETSGNTSSLP QTTDHYYDASLSMSWELDIFGSIRKRVKAQKENFAASKEEYTGVMVSLAAEVASAYINLR ELQQELEVVKKNSASQEEVLKITEVRYNTGLVAKLDVAQAKSVLYNTKASIPQLEAGINQ YITTLAVLLGMYPQDIRPVLESGGTLPDYMEPIGVGLPVDLLLRRPDVRGAELSVNAQAA LLGASKADWLPKVFLKGSFGYAARDLKDLTKSKSMTYEIAPSLSWTIFNGGQLVNATRLA KAQLDEAINQFNQTVLTAVQETDNAMNGYRNSIKQIVALREVRNQGIETLKLSLELYKQG LSPFQNVLDAQRSLLSYENQLVQAQGSSLLQLIALYKALGGGWRE >gi|222159215|gb|ACAB01000144.1| GENE 15 16505 - 17335 976 276 aa, chain - ## HITS:1 COG:TP0494 KEGG:ns NR:ns ## COG: TP0494 COG1579 # Protein_GI_number: 15639485 # Func_class: R General function prediction only # Function: Zn-ribbon protein, possibly nucleic acid-binding # Organism: Treponema pallidum # 18 244 7 232 273 62 22.0 8e-10 MAREAKKDPNELTVEQKLKTLFQLQTMLSKIDEIKTLRGELPLEVQDLEDEIAGLSTRID KIKAEVDELKSAIAGKRVEIETAKASVEKYKSQQDNVRNNREYDFLTKEIEFQTLEIELC EKRIKEYSADKEEKEAEVTKNDQILNERMKDLEQKKSELDEIISETKQEEEKLRDKAKDL ETKIEPRLLQSFKRIRKNSRNGLGIVYVQRDACGGCFNKIPPQRQLDIRSRKKVIVCEYC GRIMIDPELAGVQIEHKVEEAPATTTKRAIRRKTAE >gi|222159215|gb|ACAB01000144.1| GENE 16 17341 - 18435 908 364 aa, chain - ## HITS:1 COG:BH1380_2 KEGG:ns NR:ns ## COG: BH1380_2 COG3323 # Protein_GI_number: 15613943 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 131 
234 6 109 113 117 51.0 2e-26 MKIKQIVSALERFAPLPLQDGFDNAGLQIGLTEAEATGALLCLDVTEAVLDEAIALGYNL VISHHPLIFKGYKSITGKDYVERCILKAIKNDIVIYSAHTDLDNAQGGVNYKIAEKIGLK NLKVLEPKENNLVKLVTFVPYAQADAVREALFAAGCGNIGDYDSCSYNLKGEGTFRAKEG THPFCGTIGELHHEEEVRIETILPSFKKAETIKALLAAHPYEEPAFDIYPLLNDWSQAGS GIVGELDESETELEFLKRIKKTFEVGCLRHNKLKGREIQRVALCGGAGAFLLPQAIRSGA DVFITGEIKYHDYFGHEEDILMAEIGHYESEQYTKEIFYSIIRDLFPNFALQLSKINTNP IKYL >gi|222159215|gb|ACAB01000144.1| GENE 17 18716 - 19219 566 167 aa, chain + ## HITS:1 COG:mll3697 KEGG:ns NR:ns ## COG: mll3697 COG1595 # Protein_GI_number: 13473184 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 3 164 5 161 183 99 36.0 2e-21 MKSLSFRKDLVGVQEELLRFAYKLTTDREEANDLLQETSLKALDNEEKYTPDTNFKGWMY TIMRNIFINNYRKVVRDQTFIDHTDNLYHLNLPQDAGFESTEKTYDLKEMHRVVNALPKE YRVPFAMHVSGFKYREIAEKLNLPLGTVKSRIFFTRQKLQEELKDFR Prediction of potential genes in microbial genomes Time: Wed May 18 04:21:28 2011 Seq name: gi|222159214|gb|ACAB01000145.1| Bacteroides sp. D1 cont1.145, whole genome shotgun sequence Length of sequence - 29478 bp Number of predicted genes - 28, with homology - 28 Number of transcription units - 12, operones - 7 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 109 - 168 8.0 1 1 Tu 1 . + CDS 226 - 639 324 ## BT_0881 hypothetical protein + Term 824 - 854 -0.4 2 2 Op 1 . - CDS 693 - 971 181 ## BT_0880 hypothetical protein 3 2 Op 2 . - CDS 968 - 1762 460 ## COG0390 ABC-type uncharacterized transport system, permease component 4 2 Op 3 . - CDS 1814 - 2434 225 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Prom 2458 - 2517 9.3 + Prom 2289 - 2348 4.9 5 3 Op 1 . + CDS 2590 - 3501 606 ## FP0551 transcriptional regulator 6 3 Op 2 . + CDS 3524 - 3763 308 ## gi|237713214|ref|ZP_04543695.1| predicted protein + Term 3878 - 3912 1.4 7 4 Op 1 . 
+ CDS 4133 - 4306 103 ## gi|262406646|ref|ZP_06083195.1| predicted protein 8 4 Op 2 . + CDS 4299 - 4793 225 ## gi|294643415|ref|ZP_06721233.1| hypothetical protein CW1_3321 9 4 Op 3 . + CDS 4808 - 5659 421 ## gi|237713215|ref|ZP_04543696.1| predicted protein 10 4 Op 4 . + CDS 5693 - 6358 251 ## gi|237713216|ref|ZP_04543697.1| predicted protein 11 5 Tu 1 . + CDS 6494 - 7246 531 ## COG1787 Predicted endonuclease distantly related to archaeal Holliday junction resolvase and Mrr-like restriction enzymes + Prom 7377 - 7436 4.5 12 6 Op 1 . + CDS 7479 - 8021 560 ## COG4739 Uncharacterized protein containing a ferredoxin domain + Prom 8024 - 8083 3.4 13 6 Op 2 5/0.000 + CDS 8109 - 9224 953 ## COG2957 Peptidylarginine deiminase and related enzymes 14 6 Op 3 . + CDS 9254 - 10138 800 ## COG0388 Predicted amidohydrolase + Term 10252 - 10298 9.0 - Term 10352 - 10411 9.0 15 7 Op 1 . - CDS 10436 - 10780 301 ## BT_0874 hypothetical protein 16 7 Op 2 . - CDS 10822 - 12585 2134 ## COG0173 Aspartyl-tRNA synthetase - Prom 12609 - 12668 3.3 17 8 Tu 1 . - CDS 12694 - 13719 911 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase - Prom 13760 - 13819 7.1 + Prom 13829 - 13888 7.0 18 9 Tu 1 . + CDS 13940 - 15124 1175 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes + Term 15152 - 15225 7.2 - Term 15154 - 15193 5.1 19 10 Op 1 . - CDS 15206 - 17527 800 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 20 10 Op 2 . - CDS 17543 - 19885 1340 ## BT_0861 putative ABC transporter permease 21 10 Op 3 . - CDS 19913 - 20587 337 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 22 10 Op 4 . - CDS 20614 - 20763 90 ## gi|298484034|ref|ZP_07002203.1| efflux ABC transporter, permease protein 23 10 Op 5 . - CDS 20760 - 23138 1090 ## BVU_3171 hypothetical protein 24 10 Op 6 13/0.000 - CDS 23164 - 24414 1135 ## COG0845 Membrane-fusion protein 25 10 Op 7 . 
- CDS 24490 - 25962 984 ## COG1538 Outer membrane protein - Prom 26023 - 26082 2.0 + Prom 25972 - 26031 4.1 26 11 Op 1 8/0.000 + CDS 26153 - 27529 1152 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains + Prom 27567 - 27626 3.0 27 11 Op 2 . + CDS 27775 - 28815 821 ## COG5000 Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation + Term 28859 - 28910 17.0 - Term 28762 - 28811 5.3 28 12 Tu 1 . - CDS 28879 - 29085 252 ## BT_0854 hypothetical protein - Prom 29250 - 29309 7.5 Predicted protein(s) >gi|222159214|gb|ACAB01000145.1| GENE 1 226 - 639 324 137 aa, chain + ## HITS:1 COG:no KEGG:BT_0881 NR:ns ## KEGG: BT_0881 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 130 1 130 138 170 61.0 2e-41 MKTILTLLLLIVVTASCQLKKSDNTETAVAESTEIDNAEDCTADTVKATAIFWIDKAETK HCKEYGFRTIKAHVLIREDGKVDLMSFVKKQSPAVETYIRHHLSKFKVSEKMFEGGYVQP GEQFVQLRCLWGMLKGK >gi|222159214|gb|ACAB01000145.1| GENE 2 693 - 971 181 92 aa, chain - ## HITS:1 COG:no KEGG:BT_0880 NR:ns ## KEGG: BT_0880 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 89 1 89 92 135 75.0 6e-31 MTITDQIFRKVAETSIPHFFITVEFSASGTEMPEHIESFLREKHKVILRGASGRKFIYKE GEWRLIFTFFPTDRVVDERYALKNKVQMKSER >gi|222159214|gb|ACAB01000145.1| GENE 3 968 - 1762 460 264 aa, chain - ## HITS:1 COG:STM0503 KEGG:ns NR:ns ## COG: STM0503 COG0390 # Protein_GI_number: 16763883 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Salmonella typhimurium LT2 # 1 256 1 252 259 97 31.0 2e-20 MGTIDISYYNLFIGLLLLAIPFFYLWKFKTGLLKPALIGTVRMIIQLFFIGMYLKYLFLW NNPWINFLWVIIMIFVAGQTALVRTQLKHRILLIPITVGFLCSVVLVGLYFIGIVLQLDN IFSAQYFIPIFGILMGNMLSSNVIALNTYYSGLKREQQLYRYLLGNGATRQEAQAPFIRQ AIIKSFSPLIANISVMGLVALPGTMIGQILGGSSPNVAIKYQMMIMVITFTASMLSLMIT ISLASRRSFDAYGKLLEVTKEPRK >gi|222159214|gb|ACAB01000145.1| GENE 4 
1814 - 2434 225 206 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 197 1 202 245 91 32 6e-18 MLHIENICIAFGTEVLVSGFNMKLERGETACIVGQSGCGKTSLLNAIMGFVPLKEGIIRI GGMVLDKSTIDTVRRQIAWIPQELALPFEWVKEMVALPFGLKVNRSVPFSEERLFACFDE LGLEHELYTKRVNEVSGGQRQRIMLAVAAMLNKPLIIIDEPTSALDADSTGRVLSFFRRQ AERGTAVLAVSHDKDFASGCHYLIEL >gi|222159214|gb|ACAB01000145.1| GENE 5 2590 - 3501 606 303 aa, chain + ## HITS:1 COG:no KEGG:FP0551 NR:ns ## KEGG: FP0551 # Name: not_defined # Def: transcriptional regulator # Organism: F.psychrophilum # Pathway: not_defined # 51 302 50 300 306 133 36.0 7e-30 MGKKREGMQRMLVIINKLKGPQQYVPRKELENYVTHRMEERDGTPVDIRTLQRDFKDIED LFGIRICFDKKQNGYYINEEDDLKKEQYERLLLNFDLLNALDKISNLHTYVLAEHHRPND SECLPALIKAIKFFHPVEFSYIYVREGDKVRKKKVLPYYLKEDQQRWYLLAYDNNILKTF NIDCIRNLQILYEETFKRNMNIDANNLFKDSYGIWNQTDIPVENIELSYSALDGSFLKSI PLHHSQEIITDNEMEFRIRLRLRITNDFVMALLSRSSSLTVIEPLHLRERIRRIYEEAIK RHL >gi|222159214|gb|ACAB01000145.1| GENE 6 3524 - 3763 308 79 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713214|ref|ZP_04543695.1| ## NR: gi|237713214|ref|ZP_04543695.1| predicted protein [Bacteroides sp. D1] # 1 79 1 79 79 117 100.0 3e-25 MNLFKALFGSSFFQQEMKRNQQYRDNYACCDDIDVNIDLAEEIENETQNEKLVLSDYIPE SYLYDNENREMDDEIDLIE >gi|222159214|gb|ACAB01000145.1| GENE 7 4133 - 4306 103 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262406646|ref|ZP_06083195.1| ## NR: gi|262406646|ref|ZP_06083195.1| predicted protein [Bacteroides sp. 
2_1_22] # 1 57 18 74 74 102 100.0 7e-21 MEKISAIPKPEIGVRILLSQEVGVKKHNEDFLILTVLFCLFMLCWIFAFLNFIGIDV >gi|222159214|gb|ACAB01000145.1| GENE 8 4299 - 4793 225 164 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294643415|ref|ZP_06721233.1| ## NR: gi|294643415|ref|ZP_06721233.1| hypothetical protein CW1_3321 [Bacteroides ovatus SD CC 2a] # 1 164 1 164 164 295 100.0 7e-79 MFNNINFDYMENNNEIKKIDSSMVECPFCKNVIEAPSKPDVIFKCPKCYKELITYDKSNE PTLNYSQNSTTLNCKNMSLIYCKDCGKQISDSASNCPFCGCSQNNIPSNNNEISLSYLFV SFLIPLIGLILCFTSWNRHEGKTKSALIGTLVGIIFTAILYSII >gi|222159214|gb|ACAB01000145.1| GENE 9 4808 - 5659 421 283 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713215|ref|ZP_04543696.1| ## NR: gi|237713215|ref|ZP_04543696.1| predicted protein [Bacteroides sp. D1] # 1 283 1 283 283 526 100.0 1e-148 MNNSDKFILFSIIVIIGLLIIWGVSFQKDYLPEFNPNASEEQKTNWIKAKTKEYIKNNLY NNEEYIALKWEGSGYTDDLGHTTTYSTFHIPSVEWNTLRLSHTFKIINKKGCESHYCKHI EFDKKGNITNFVNRIDGEDYTGIEQITGMIQGPLRNSNNEELIIWGEDIVRKHISENLNN SQKYIPIEWTLYTKLTLERKVFDALDFTFSSIKQVADFHPNWIHAALCIEHKYIIEGRDG YKQTKHSVFIISPKGKVKEVSYGYFSISKSAEEQYKLLFTNFY >gi|222159214|gb|ACAB01000145.1| GENE 10 5693 - 6358 251 221 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713216|ref|ZP_04543697.1| ## NR: gi|237713216|ref|ZP_04543697.1| predicted protein [Bacteroides sp. 
D1] # 1 221 1 221 221 424 100.0 1e-117 MRKYIFLFLLLCFTTNVLGQNTESSQTEQHQLVEWSKNCIKEFFNAVSLSDTLIHEPIEW ILIVQDARYPIKLKMEEQKTDTLRGCIKLLDSRGVFDQYFEISEAPDAFIAVSLVHNIKS KRTGLKVESGEITFVLDNKGKVLEIKDYVRPMAIRQMTTMLVLDKTKQLLAENLGKDILK LEWKFQSDTQNWVLENDNKIIYRLTPLEIIYFAQKFMRKRN >gi|222159214|gb|ACAB01000145.1| GENE 11 6494 - 7246 531 250 aa, chain + ## HITS:1 COG:jhp0345 KEGG:ns NR:ns ## COG: jhp0345 COG1787 # Protein_GI_number: 15611413 # Func_class: V Defense mechanisms # Function: Predicted endonuclease distantly related to archaeal Holliday junction resolvase and Mrr-like restriction enzymes # Organism: Helicobacter pylori J99 # 1 99 79 175 189 63 36.0 3e-10 MKPREYEEYIRELFNKQGYTTELTPQSGDYGIDIFATKGREKIAIQVKMYGNSRKVNRSM IMELHGAKDYFGCNKAILVTNGEIMPDAIEVANRLQIDLQIIDYPDKTTKTAEGTLYSSA SPKKVQDIESKSSNIFFTNEEYTFDSIWETYIMPLKGKTLYNNRGKENIITNVDWAGITR ITSQGNPGKIEIEHFKNAIKILLKKGEITRDYINQNCVKRASSGIVLILSQVPFFKLDEN PLRLTFKLKE >gi|222159214|gb|ACAB01000145.1| GENE 12 7479 - 8021 560 180 aa, chain + ## HITS:1 COG:AF2201 KEGG:ns NR:ns ## COG: AF2201 COG4739 # Protein_GI_number: 11499783 # Func_class: S Function unknown # Function: Uncharacterized protein containing a ferredoxin domain # Organism: Archaeoglobus fulgidus # 1 176 1 184 184 135 41.0 3e-32 MILNERDARHEHVLQVARQMMTAARTAPKGKGIDIIEVALITDEEIKQLSDTMIAMVEEH GMKFFLRDADNILSAECVVLIGTREQTQGLNCGHCGFTTCDGRTEGVPCALNSIDVGIAI GSACATAADLRVDTRVMFSAGLAAQRLNWLKDCKMVMAIPVSASSKNPFFDRKPKQETNS >gi|222159214|gb|ACAB01000145.1| GENE 13 8109 - 9224 953 371 aa, chain + ## HITS:1 COG:HP0049 KEGG:ns NR:ns ## COG: HP0049 COG2957 # Protein_GI_number: 15644680 # Func_class: E Amino acid transport and metabolism # Function: Peptidylarginine deiminase and related enzymes # Organism: Helicobacter pylori 26695 # 38 364 6 328 330 305 47.0 1e-82 MGIMVGLPSPSGSEKDLQLNFGKNMTVQVEMRAPHLPAEWHTQSGIQLTWPHTGTDWAYM LAEVQECFINIAREIAKRELLLIVTPEPEEVKKQIAATVNMNNVRFLECATNDTWARDHG AITMIDTGTPSLLDFTFNGWGLKFASELDNQITKQAVEAGALKGQYIDRLDFVLEGGSIE 
SDGMGTLLTTSECLLSPQRNGRLNQVEIEEYLKSTFHLQKVLWLDHGYLAGDDTDSHIDT LARFCSTDTIAYVKCDDKEDEHYQALLAMEEQLKTFRTLAGEPYRLLALPMADKIEEDGE RLPATYANFLIMNEVILYPTYHQPANDQKAREVLQQAFPGHQIIGIDCRALIKQHGSLHC VTMQYPLGVIK >gi|222159214|gb|ACAB01000145.1| GENE 14 9254 - 10138 800 294 aa, chain + ## HITS:1 COG:XF2443 KEGG:ns NR:ns ## COG: XF2443 COG0388 # Protein_GI_number: 15839034 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Xylella fastidiosa 9a5c # 4 294 6 295 295 406 64.0 1e-113 MKKIKVGIIQQSNTADIRVNLMNLAKSIEACAAHGAQLIVLQELHNSLYFCQTENTNLFD LAEPIPGPSTGFYSELAAANKVVLVTSLFEKRAPGLYHNTAVVFDRDGSIAGKYRKMHIP DDPAYYEKFYFTPGDIGFEPIQTSLGKLGVLVCWDQWYPEAARLMALKGAELLIYPTAIG WESSDTDDEKARQLNAWIISQCAHAVANGLPVISVNRVGHEPDPSGQTNGILFWGNSFVA GPQGEFLAQAGNDHPENMVVEIDMERSENVRCWWPFLRDRRIDEYEGLTKRFLD >gi|222159214|gb|ACAB01000145.1| GENE 15 10436 - 10780 301 114 aa, chain - ## HITS:1 COG:no KEGG:BT_0874 NR:ns ## KEGG: BT_0874 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 114 1 114 114 178 87.0 4e-44 MNALIMALVVWLMMKEMSFDGDYMVANITAYLIAQIHNFLWCKYWIFPVENKKNSIWKQI LLFCSAFAVAYTAQFLFLILLVEGLDMNEYLAQFLGLFIYGGANFLANKKITFQ >gi|222159214|gb|ACAB01000145.1| GENE 16 10822 - 12585 2134 587 aa, chain - ## HITS:1 COG:BH1252 KEGG:ns NR:ns ## COG: BH1252 COG0173 # Protein_GI_number: 15613815 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: Bacillus halodurans # 3 581 4 585 595 598 51.0 1e-171 MFRTHTCGELRISDVNKQVTLSGWVQRSRKMGGMTFIDLRDRYGITQLVFNEEINAELCE RANKLGREFVIQVTGTVNERFSKNANIPTGDIEIIVSELNVLNTAMTPPFTIEDNTDGGD DIRMKYRYLDLRRNAVRSNLELRHKMTIEVRKYLDSLGFIEVETPVLIGSTPEGARDFVV PSRMNPGQFYALPQSPQTLKQLLMVSGFDRYFQIAKCFRDEDLRADRQPEFTQIDCEMSF VEQEDIITTFEGMAKHLFKTLRGVELNEPFQRMSWADAMKYYGSDKPDLRFGMKFVELMD IMKGHGFSVFDNAAYVGGICAEGAATYTRKQLDALTEFVKKPQIGAKGMVYARVEADGTV KSSVDKFYTQEVLQQMKDAFGAKPGDLILILSGDDVMKTRKQLCELRLEMGAQLGLRDKN 
KFVCLWVIDFPMFEWSEEEGRLMAMHHPFTHPKEEDIPMLDTDPAAVRADAYDMVINGVE VGGGSIRIHDAKLQAKMFEILGFTPEKAEAQFGFLMNAFKYGAPPHGGLAYGLDRWVSLF AGLDSIRDCIAFPKNNSGRDVMLDAPSAIDQSQLDELNLIVDLKEGE >gi|222159214|gb|ACAB01000145.1| GENE 17 12694 - 13719 911 341 aa, chain - ## HITS:1 COG:lin0768 KEGG:ns NR:ns ## COG: lin0768 COG1597 # Protein_GI_number: 16799842 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Listeria innocua # 5 299 1 299 309 116 28.0 8e-26 MNERMKKIKFVVNPISGTQSKELILNLLDEKIDKARYSWEVVYTERAGHAVEIAAKAAEE KTDIVVAIGGDGTINEIARSLVHTDTALGIIPCGSGNGLARHLHIPMEPKRALEVLNEGC MDVIDYGKINGTDFFCTCGVGFDAFVSLKFAHAGKRGLLTYLEKTLQESLKYEPETYELE TENGVSKYKAFLIACGNASQYGNNAYIAPQATLTDGLLDVTILEPFTVLDVPSLAFQLFN KTIDQNSRIKTFRCKQLCIRRTTPGVVHFDGDPMETDANVNIQLIQRGLRVVVPRASEKD AANVLQRAQEYMNGIKLMNEAIVDNITDRNKKILKKLTKKV >gi|222159214|gb|ACAB01000145.1| GENE 18 13940 - 15124 1175 394 aa, chain + ## HITS:1 COG:BS_kbl KEGG:ns NR:ns ## COG: BS_kbl COG0156 # Protein_GI_number: 16078763 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Bacillus subtilis # 27 394 25 392 392 295 39.0 8e-80 MGLLQEKLAKYDLPQKFMAQGVYPYFREIEGKQGTEVEMGGHEVLMFGSNAYTGLTGDER VIEAGINAMHKYGSGCAGSRFLNGTLDLHVQLEKELAAFVGKDEALCFSTGFTVNSGVIP ALTDRNDYIICDDRDHASIVDGRRLSFSQQLKYKHNDMADLEKQLQKCNPDSVKLIIVDG VFSMEGDLANLPEIVRLKHKYNATIMVDEAHGLGVFGKQGRGVCDHFGLTHEVDLIMGTF SKSLASIGGFIAADSSIINWLRHNARTYIFSASNTPAATASALEALHIIQNEPERLEALW EATNYALKRFREAGFEIGATESPIIPLYVRDTEKTFMVTKLAFDEGVFINPVIPPACAPQ DTLVRVALMATHTKDQIDRAVEKLVKAFKALDLL >gi|222159214|gb|ACAB01000145.1| GENE 19 15206 - 17527 800 773 aa, chain - ## HITS:1 COG:NMB0549_2 KEGG:ns NR:ns ## COG: NMB0549_2 COG0577 # Protein_GI_number: 15676455 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Neisseria meningitidis MC58 # 511 773 
120 395 395 65 24.0 3e-10 MIKHYLKVAFRNLIKYKTQSLVSIIGLAVGFTCFALSVLWIRYEMTYDNFHEGADRIYLA GSSFRLYGDGFTYNSSSFLADYLAKNCPEIEKVCRIFYDWNEKKIKNEDVEFVVRRIEVD SNFISMFNIKVLDGDNHLQLKKDEIAITENTAKRIFGKESPIGKHLILEESNEEKIIVAV VKSWEGHSLFSFDILLPFHDTNPNWGNQRCQTLFRIYPNCDIEALKQRLSEYEVQQDGHK YPNSTLIAPLSTLRSSHPREDVNVKLNHICLFACISGLVIICGICNYLTMLITRIRMRKR ELALRKVNGSSNGGLLTLLLTELVLLLILSSGIGLVLIELILPTFKRLSQINEGASFFYI EIFVYILSLIAVTVGFASLLIRYISRRTLLSNINKKSNLHLSGWFYKSSILFQLFIGIGF VFCTLVMMMQLHFLLNTRELGIERHNVGAVVYCSENIPFKEIVSQIPEVSECLNGFHTPI PKMFYSIYRVKEWDGKVADSEQYIELEDETINQDYADFFGVEVLNGSMLDEKDGKDMVVI NEAAMKAFGWTQPIGKKMGKLGKQCIVKGVIKNISYNAPIHPVAPAMFHLPDSRDRGGII FKVKEGTWNIVSEKIKAEVNKVNPNAELMLSNIEEVYDAYMKSERTLCKLLSVVSAICIL IAVFGIFSLVTLSCQQRRKEIAIRKVNGANIGIILNLFFKEYLFLLVLSSFFAFPLGYAM MKHWLENYIKQTPMEWWLYAVIFIGMGLVIFLSIIWCVWKAARQNPAEVLKSE >gi|222159214|gb|ACAB01000145.1| GENE 20 17543 - 19885 1340 780 aa, chain - ## HITS:1 COG:no KEGG:BT_0861 NR:ns ## KEGG: BT_0861 # Name: not_defined # Def: putative ABC transporter permease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 780 1 781 781 1175 76.0 0 MIKHYLKVAFRNLMKYKMQSFVSIIGLAVGFTCFALSALWIRYEMTYDNFHKGSDRIYMV RAHYANEPGTSKFTPYPLMEYLQDKMPEIEAVTAFTPHHVKFRLEETEQEVGMIDADSVF MNFFDIQLLKGTVNFLKSDGQEVAITKEFAKRLFSKEEDALGKEVEVGRRVCKIGAIVSG WSNHSNIPYNILAPARHSPRWGSSNEQLFIRVRERTDIDAFRTKMSTVRINDLEKESELS DLLITRMSALRYSDYVDKADVVISFSYIFYFSLAGGLVIICSLFNYLTLFISRLCMRNRE MALRKVNGASNKALSVQFAIELLLLLCIALFSGLLLVEMSMSRFLNFTQIEPSSYYGEIL VYLLAVIILSFLFTLMPLSYFRRRTLQETIKGNVATARPYLFRRIGIIVQLMVSLSFIFC TVVIMKQLHFLKNTDLGMERHNVANVALWNGDIRQWTEKIKALPMVTETLPPAYFPIIPT GPMMYAEITNWDGLPKVTEKTLLVGIMPAKEEFFKFYDLKLLEGEFISEKSQQNEVVVDE STCLKFGWKHALGKTFGNNTNSRQDITYKVVGVVKNFSYRSPTSKPGLIAFQRPEAQEYL LNRAGILFKFKEGTWNECREAIEKLHKEEFPNAYLRLFSEEEEYGKYLRSEDALMKLLSF VSLVCVLISVFGIFSLVTLSCEQRQKEIAIRKVNGAQIRHILQMFFREYLLLLVIAAVIA FPMGYVVMRQWLETYVRQTAINGWVYVGIFVVVAVIILLCIIWRIWKAARQNPAEVIKNE >gi|222159214|gb|ACAB01000145.1| GENE 21 19913 - 20587 337 224 aa, chain - ## PROTEIN SUPPORTED ## NR: 
gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 199 4 199 223 134 37 7e-31 MIKTINLQKIFKTEEVETWALNNVSIEVKQGEFVAIMGPSGCGKSTLLNILGLLDNPTGG EYYLNGTEVSRYTESQRTSFRKGVIGFVFQSFNLIDELNVYENIELPLLYMGISASERKK RVEAAMERMAISHRSKHFPQQLSGGQQQRVAIARAVVANPKLILADEPTGNLDSKNGKEV MGLLSELNNEGTTIVMVTHSQHDAGYADRIINLFDGQVVTEVSM >gi|222159214|gb|ACAB01000145.1| GENE 22 20614 - 20763 90 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|298484034|ref|ZP_07002203.1| ## NR: gi|298484034|ref|ZP_07002203.1| efflux ABC transporter, permease protein [Bacteroides sp. D22] # 6 49 755 798 798 83 93.0 3e-15 MNLRYLFFNDGILYWGSIFGGVTLLTVITIILKLLKIARINPAEVIKNE >gi|222159214|gb|ACAB01000145.1| GENE 23 20760 - 23138 1090 792 aa, chain - ## HITS:1 COG:no KEGG:BVU_3171 NR:ns ## KEGG: BVU_3171 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 792 1 789 789 794 52.0 0 MKTLNYAIRFLLRTKSYTIINLLGLAFSLACCIILFRYIHRELTVDIHCIDRNRVYGVQT TFEGNRVLSVAEIGDRDSVYIDNSGMVTRSRIVLLENDYLTYQSNRIPVHAMAADSAYFE LFPYHVLQGSASLEDPASVLLMESFAEKLFGKQNPIGKILTYSNGKEIRVTGILEEPVNK RMFNFDLVLSSKLSSLWERMPLDFIRFTSETEVMKANKAGSYPRFINQDSRSGDSRKYTF SLIPVSDMYWDQALIGRSGPDMLVSGNRSQLFILGGICLLVLLAGVMNFINLYLVLMMKR GKVYSLRKVFGADRKALFKQIFIENFLLIAASMIVAWLIVEVTNISISSMFGSQLMYTAF DGILSFSILLFLPLLVSIYAFVQCQRSLLAISIQKVGTDNRSVRSRMVFLFLQYMITFLL VVLSIYFGKQLNFMLHTDPGFRVESVIQANMIYESRDFSAYTMETIKQRQKRITEIDQLM KSCPDIQYWTTGHSSILGAYYSTNFQNVKGETVALLQSYVTPEFFKVFNLAFVEGSLPEM DEDSRNRVAVVNRAALKALGYTQCEGAMLVDERMKRNVPDFPAQPIVAVIDDYYDGHIST GIHPMVFMVGSYLDGDLYQIYCHSGKEQAVIDYLKSIQKKVYGTEDFKYSLLKDDVAELY KNDRQIASVYALFACIAIVIVCLGLFGISLFDIRQRSREVAIRKVNGASLKDLYLLLGRK YLIILSGAFAVALPLSWYLIYEYTKDFVVKAPISIGIFFIALLLVGGISLGTLFWQINKI AHIDPAKIMKTE >gi|222159214|gb|ACAB01000145.1| GENE 24 23164 - 24414 1135 416 aa, chain - ## HITS:1 COG:YPO1498 KEGG:ns NR:ns ## COG: YPO1498 COG0845 # Protein_GI_number: 16121771 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
Membrane-fusion protein # Organism: Yersinia pestis # 56 406 55 410 420 87 23.0 4e-17 MDIKLEKKPWYIRYRYYLIGGILFVAFLIYVITLSLGPRKLRIDAEDIQIAEVKVSNFME YVDVEGLIQPILTIKINTREAGSVERIVGEEGSLLQQGDTILVLSNPDLLRSIEDQRDEW EKQMITYQEQEIEMEQKSLNLKQQALTNNYELERLKKSIALDREEFQMGVKSKAQLQVAE DEYGYKLKNAALQQESLRHDSAVTMIRKELIRNDRERERKKYERTCERLNSLVITAPLKG QLSFVKVTPGQQVSSGESIAEIKVLDQYKIHTSLSEYYIDRITTGLPATVNYQGNKYPLK ITKVVPEVKDRMFDVDLVFTGDMPDNVRVGKSFRVQIELGQPEQALIIPRGNFYQSTGGQ WIYKVNTSKTKAVRVPLSIGRQNPQQYEIVEGLQPGDWVITTGYDTFGDAEELILK >gi|222159214|gb|ACAB01000145.1| GENE 25 24490 - 25962 984 490 aa, chain - ## HITS:1 COG:VC1565 KEGG:ns NR:ns ## COG: VC1565 COG1538 # Protein_GI_number: 15641573 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Vibrio cholerae # 188 471 128 403 419 60 23.0 8e-09 MNLKLIFIYLGLLSTELLIAQNQITELSLKQVIEMARMESPDAQTARHSFRSAYWNYKYY RANYLPSLSLTSDPNLNRAINKITMGDGSVKFVEQNLLNTDLTLNLSQNLSWTGGSFFLE TSAQRMDLFSEHKYSWQTSPIMIGYRQSLFGYNSLKWDRRIEPVCYQEAKKSYVETLELV SANAIHKFFALATAQSDYDIASFNYANADTLYRYAQGRYNIGTITENEMLQLELNRLTEE TNRMNARIEMDNCMQELRSYLGIHEDRELRVLVSPQVPDFSVNLNEALVLAYENSPDIQT MERRKLESESAVAKARANAGLKADIYLRFGLTQTADKLPEAYRNLLDQQYVSIGISLPIL DWGRGKGQVRVARSNRDLVYTQVEQSRTDFELNVRKLVKQFNLQTQRVRIAARTDETAQR RSEVARKLYLLGKSTILDLNASISEKDSARRNYISALYNYWSLYYTLRSMTLYDFERNTM LTEDYHLLIE >gi|222159214|gb|ACAB01000145.1| GENE 26 26153 - 27529 1152 458 aa, chain + ## HITS:1 COG:STM4174 KEGG:ns NR:ns ## COG: STM4174 COG2204 # Protein_GI_number: 16767428 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Salmonella typhimurium LT2 # 10 455 8 441 441 299 38.0 8e-81 MEEVNKLGKILIVDDNEDVLFALNLLLEPYTEKIKVATTPDRIEYFMTTFHPDLILLDMN FSRDAISGQEGFESLKQILQIDPQAIVIFMTAYADTDKAVRAIKAGATDFIPKPWEKDKL LATLTSGMRLRQSQQEVSILKEQVEVLSGQNTSENDIIGESSVMQEVFTTINKLSNTDAN 
ILILGENGTGKDVIARLIYRCSPRYGKPFVTIDLGSIPEQLFESELFGFEKGAFTDAKKS KAGRMEVATSGTLFLDEIGNLSLPMQSKLLTAIEKRQISRLGSTQTVPIDVRLICATNAD IRQMVEDGNFRQDLLYRINTIEIHIPPLRERGNDIILLADHFLDRYTRKYKKKIHGLTRE AKNKLLKYAWPGNVRELQHTIERAVILGDGSMLKPENFLFHTTSKQKKEEEVILNLEQLE RQAIEKALRISNGNISRAAEYLGITRFALYRKLEKLGL >gi|222159214|gb|ACAB01000145.1| GENE 27 27775 - 28815 821 346 aa, chain + ## HITS:1 COG:NMA0160 KEGG:ns NR:ns ## COG: NMA0160 COG5000 # Protein_GI_number: 15793187 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation # Organism: Neisseria meningitidis Z2491 # 27 346 362 700 706 96 25.0 8e-20 MDEWAQELSNALKDFRGKLLAEEIKHQYYENLLNKVDTAVLVTDVTGHIEWMNQAAVTHL GQLSQLPEVLQEASTSNDISIIRIEQNGIVLEMAISCTTFVTQGAFATQSKKQRLISLKN IHSVLERNEMEAWQKLIRVLTHEIMNSITPIISLSETLSERGIPESLGEKEYSIMLQAMQ TIHRRSKGLLGFVENYRRLTRIPTPVRTKVSVAELFTDLKKLFPEEYIHFEMPASDLYLY ADRAQIEQVLINLLKNARETCERKADKEIQIKFFSKGNPTLAISDNGEGILPTVLDKIFV PFFTTKTSGSGIGLSLCKQIMALHDGSINVKSELGKGSCFVLTFPK >gi|222159214|gb|ACAB01000145.1| GENE 28 28879 - 29085 252 68 aa, chain - ## HITS:1 COG:no KEGG:BT_0854 NR:ns ## KEGG: BT_0854 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 68 1 64 64 80 90.0 2e-14 MKSLMKTRKQKETGEKEVHLLEQKGYEKMVNEIVPVQQAEVYRKPTHKAVKEAVKELNPD TNSLGSRG Prediction of potential genes in microbial genomes Time: Wed May 18 04:22:39 2011 Seq name: gi|222159213|gb|ACAB01000146.1| Bacteroides sp. D1 cont1.146, whole genome shotgun sequence Length of sequence - 2153 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 203 - 262 6.1 1 1 Tu 1 . + CDS 496 - 1248 565 ## PROTEIN SUPPORTED gi|239830964|ref|ZP_04679293.1| Ribosomal protein L11 methyltransferase + TRNA 1363 - 1440 82.4 # Pro TGG 0 0 + TRNA 1472 - 1549 82.4 # Pro TGG 0 0 2 2 Tu 1 . 
+ CDS 1677 - 2039 168 ## BT_0846 hypothetical protein Predicted protein(s) >gi|222159213|gb|ACAB01000146.1| GENE 1 496 - 1248 565 250 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|239830964|ref|ZP_04679293.1| Ribosomal protein L11 methyltransferase [Ochrobactrum intermedium LMG 3301] # 1 245 1 244 245 222 43 2e-58 MKENKYDNSDFFSQYSQMSRSVEGLKGAGEWHVLQKMLPDFAGKRVLDLGCGFGWHCVYA IEHGATHVTGIDISEKMLEEAQKRNPSPLIEYQCMAIEDFDFQPDTYDIVISSLTFHYLE SFTDICRKINNCLTPGGAFVFSVEHPVFTAYGNQEWHYDQDGKPIHWPVDRYFTEGKRTA IFLGEEVVKYHKTLTTYINGLLQTGFEICELIEPQPDERLLDTIPGMKDELRRPMMLLIS AKKKSKDAKL >gi|222159213|gb|ACAB01000146.1| GENE 2 1677 - 2039 168 120 aa, chain + ## HITS:1 COG:no KEGG:BT_0846 NR:ns ## KEGG: BT_0846 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 117 1 117 119 169 70.0 2e-41 MKNILFTLFAILILTSCSKDDDNWIELNENNIVGNWSTGIKGSHKFLNFGDDDKGSFGIY SNADPISFQTFKYKVEDSKIYIYDVFPKGKSPYYLDCKISNNKLKIENGEESGTYEKLEY Prediction of potential genes in microbial genomes Time: Wed May 18 04:22:52 2011 Seq name: gi|222159212|gb|ACAB01000147.1| Bacteroides sp. D1 cont1.147, whole genome shotgun sequence Length of sequence - 44593 bp Number of predicted genes - 41, with homology - 41 Number of transcription units - 25, operones - 10 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 55 - 114 3.5 1 1 Tu 1 . + CDS 135 - 503 299 ## gi|237713237|ref|ZP_04543718.1| predicted protein + Term 510 - 571 9.6 - Term 498 - 558 11.0 2 2 Tu 1 . - CDS 570 - 1280 470 ## COG4123 Predicted O-methyltransferase - Prom 1309 - 1368 6.7 + Prom 1329 - 1388 7.8 3 3 Op 1 . + CDS 1496 - 3961 2640 ## COG0466 ATP-dependent Lon protease, bacterial type 4 3 Op 2 1/0.000 + CDS 3958 - 5088 1034 ## COG0343 Queuine/archaeosine tRNA-ribosyltransferase 5 3 Op 3 . + CDS 5092 - 6189 969 ## COG0795 Predicted permeases + Term 6301 - 6356 10.6 - Term 6292 - 6341 7.6 6 4 Tu 1 . 
- CDS 6440 - 6943 337 ## BT_0833 hypothetical protein - Prom 7128 - 7187 6.0 + Prom 7090 - 7149 4.6 7 5 Op 1 . + CDS 7210 - 8439 1488 ## COG0560 Phosphoserine phosphatase 8 5 Op 2 . + CDS 8486 - 9748 1134 ## COG0513 Superfamily II DNA and RNA helicases - Term 9649 - 9689 -0.0 9 6 Op 1 . - CDS 9831 - 10625 638 ## BT_0830 hypothetical protein 10 6 Op 2 . - CDS 10664 - 11977 1274 ## COG1004 Predicted UDP-glucose 6-dehydrogenase 11 6 Op 3 . - CDS 12004 - 12552 582 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes - Prom 12797 - 12856 4.8 + Prom 12564 - 12623 4.8 12 7 Op 1 . + CDS 12643 - 13746 818 ## BT_0827 hypothetical protein 13 7 Op 2 . + CDS 13765 - 14292 437 ## BT_0826 hypothetical protein 14 8 Op 1 . - CDS 14295 - 14546 190 ## gi|262406610|ref|ZP_06083159.1| conserved hypothetical protein 15 8 Op 2 . - CDS 14521 - 15039 351 ## BF3041 hypothetical protein 16 8 Op 3 . - CDS 15009 - 15248 164 ## BF3042 hypothetical protein - Prom 15408 - 15467 9.4 - Term 15398 - 15459 9.2 17 9 Op 1 . - CDS 15505 - 16152 272 ## gi|237713252|ref|ZP_04543733.1| predicted protein 18 9 Op 2 . - CDS 16164 - 19202 2154 ## COG3291 FOG: PKD repeat - Prom 19340 - 19399 8.7 - Term 19948 - 19991 0.4 19 10 Tu 1 . - CDS 19992 - 20477 351 ## BT_3062 hypothetical protein - Term 20842 - 20900 12.5 20 11 Op 1 1/0.000 - CDS 20931 - 22370 1529 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases 21 11 Op 2 . - CDS 22397 - 23464 1004 ## COG1609 Transcriptional regulators - Prom 23494 - 23553 6.6 + Prom 23575 - 23634 6.1 22 12 Tu 1 . + CDS 23663 - 25069 1298 ## COG1904 Glucuronate isomerase + Term 25098 - 25147 9.6 - TRNA 25246 - 25321 81.9 # Lys CTT 0 0 - TRNA 25351 - 25426 84.1 # Lys CTT 0 0 - Term 25383 - 25417 -0.5 23 13 Op 1 . - CDS 25516 - 26319 637 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I - Prom 26340 - 26399 2.1 - Term 26336 - 26378 6.1 24 13 Op 2 . 
- CDS 26402 - 27790 1125 ## BT_0821 putative permease - Prom 27883 - 27942 5.1 + Prom 27760 - 27819 6.8 25 14 Tu 1 . + CDS 27887 - 28309 455 ## BT_0820 hypothetical protein + Term 28353 - 28399 7.1 - Term 28340 - 28389 3.3 26 15 Tu 1 . - CDS 28434 - 28700 182 ## BT_0819 hypothetical protein - Prom 28748 - 28807 4.8 + Prom 28744 - 28803 7.8 27 16 Op 1 . + CDS 28844 - 29293 324 ## COG2259 Predicted membrane protein + Prom 29308 - 29367 8.7 28 16 Op 2 . + CDS 29387 - 29938 619 ## BT_0815 hypothetical protein + Term 29958 - 30002 7.6 + Prom 29947 - 30006 2.4 29 17 Tu 1 . + CDS 30030 - 30482 204 ## gi|237713264|ref|ZP_04543745.1| conserved hypothetical protein 30 18 Tu 1 . - CDS 30436 - 32745 1907 ## BT_0814 putative outer membrane protein - Prom 32909 - 32968 5.6 + Prom 32745 - 32804 5.3 31 19 Tu 1 . + CDS 32888 - 33652 663 ## COG0566 rRNA methylases + Term 33767 - 33817 13.6 - Term 33519 - 33558 -0.9 32 20 Op 1 . - CDS 33603 - 33989 139 ## BT_0810 hypothetical protein 33 20 Op 2 . - CDS 33995 - 35119 889 ## BT_0809 hypothetical protein 34 20 Op 3 . - CDS 35061 - 35717 549 ## BF2277 lipoprotein signal peptidase 35 20 Op 4 . - CDS 35815 - 36195 514 ## BT_0807 DnaK suppressor protein, putative 36 20 Op 5 . - CDS 36231 - 39719 3586 ## COG0060 Isoleucyl-tRNA synthetase - Prom 39739 - 39798 10.6 + Prom 39702 - 39761 6.6 37 21 Tu 1 . + CDS 39930 - 40979 708 ## BT_0805 hypothetical protein + Term 41022 - 41062 5.0 - Term 40988 - 41050 -1.0 38 22 Tu 1 . - CDS 41056 - 42594 1472 ## COG0388 Predicted amidohydrolase - Prom 42665 - 42724 6.9 39 23 Tu 1 . - CDS 42727 - 43017 135 ## gi|237713274|ref|ZP_04543755.1| predicted protein - Prom 43098 - 43157 9.9 - Term 43252 - 43283 1.1 40 24 Tu 1 . - CDS 43396 - 43590 120 ## gi|260174876|ref|ZP_05761288.1| hypothetical protein BacD2_23669 - Prom 43661 - 43720 3.3 + Prom 43625 - 43684 5.6 41 25 Tu 1 . 
+ CDS 43737 - 44592 391 ## COG2207 AraC-type DNA-binding domain-containing proteins Predicted protein(s) >gi|222159212|gb|ACAB01000147.1| GENE 1 135 - 503 299 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713237|ref|ZP_04543718.1| ## NR: gi|237713237|ref|ZP_04543718.1| predicted protein [Bacteroides sp. D1] # 1 122 1 122 122 227 100.0 2e-58 MEGLIQFTGIVMIAFGILQIILFFKIWGMTNNVKRIWKKIDNKDFLSDACVSYIKGNLEE TERLANEAFLQEVALLSKSSESYEDWIDNYIKIKEKYTRIFKKIDKPAPDFNKYEEPKMY LL >gi|222159212|gb|ACAB01000147.1| GENE 2 570 - 1280 470 236 aa, chain - ## HITS:1 COG:YPO2709 KEGG:ns NR:ns ## COG: YPO2709 COG4123 # Protein_GI_number: 16122913 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Yersinia pestis # 6 235 20 251 252 205 44.0 5e-53 MSNPYFQFKQFTVWHDQCAMKVGTDGVLLGAWASVERARRILDIGTGTGLVALMLAQRSL PDAKIVALEIDEAAAEQARENVARSPWQERIEVVQADFKKYRSSDKFDVIVSNPPYFVDS LECPDRQRAAARHNDSLTYEELLEGVNRLLAADGLFTVVIPTDVVDRVKAIASMNKLYAI RQLNVITKPGGIPKRTLIAFSFSNRECVIEELLTELARHQYSEEYIALTREYYLNM >gi|222159212|gb|ACAB01000147.1| GENE 3 1496 - 3961 2640 821 aa, chain + ## HITS:1 COG:ECs0493 KEGG:ns NR:ns ## COG: ECs0493 COG0466 # Protein_GI_number: 15829747 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Escherichia coli O157:H7 # 39 805 11 774 784 662 45.0 0 MKERYLLEDGSDNGFSLITDYDGNDEQAFDVNVKSGEILPVLPLRNMVLFPGVFLPITVG RKSSLKLIRDADKKHKDIAVVCQRSAHTEDPKLEDLHNIGTVGRIVRILEMPDQTTTVIL QGMKRLNLINIIETHPYLKGEIELLEEDIPSKDDKEFQALVETCKDLTMRYIKSSDVMHQ DSAFAIKNINSPMFLVNFICSNLPFKKDEKMDLLSIHSLRERTYHLLEILNREVQLAEIK ASIQMRAREDIDQQQREYFLQQQIKTIQDELGGGGQEQEIEEMRQKAERMRWNAEVRETF MKELAKLERTHPQSPDYSVQLNYLQTMLNLPWGTYTTDNLNLKNAEKTLNKDHYGLEKVK ERILEHLAVLKLKGDMKSPIICLYGPPGVGKTSLGKSIASALKRKYVRMSLGGVHDEAEI RGHRKTYIGAMPGRIIKSLIKAGASNPVFILDEIDKVSADRQGDPSSALLEVLDPEQNTA FHDNFLDVDYDLSKVLFIATANNLNTIPGPLLDRMELIEVSGYITEEKVEIARKHLLPKE LEANGLKKTDIKLPKETLEAIIESYTRESGVRELEKKIGKILRKSARQYATDGYFAKTEI 
KPSDLYDFLGAPEYTRDKYQGNDYAGVVTGLAWTAVGGEILFVETSLSRGKGGRLTLTGN LGDVMKESAMLALEYIKAHASILNLDEEIFDNWNIHIHVPEGAIPKDGPSAGITMATSLA SALTQRKVKANIAMTGEITLRGKVLPVGGIKEKILAAKRAGIKEIIMSAENKKNIDEIQE IYLKGLTFHYVNDIKEVFAIALTNEKVSDAIDLSVKKPSQE >gi|222159212|gb|ACAB01000147.1| GENE 4 3958 - 5088 1034 376 aa, chain + ## HITS:1 COG:aq_1308 KEGG:ns NR:ns ## COG: aq_1308 COG0343 # Protein_GI_number: 15606515 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Queuine/archaeosine tRNA-ribosyltransferase # Organism: Aquifex aeolicus # 3 376 2 378 378 371 50.0 1e-102 MTFELQYTDTKSNARAGLITTDHGQIQTPIFMPVGTLGTVKGVHLTELKEDIQAQIILGN TYHLYLRPGLDVIEKAGGLHRFNGFDRPMLTDSGGFQVFSLAGIRKLREEGAEFRSHIDG SKHVFTPEKVMDIERTIGADIMMAFDECPPGDSDYEYAKKSLGLTHRWLDRCIQRFNETE PKYGYNQALFPIVQGCVYPDLRKQSAEFIASKGADGNAIGGLAVGEPVDKMYEMIEIVNE ILPKDKPRYLMGVGTPVNILEGIERGVDMFDCVMPTRNGRNGMLFTKDGIINMRNKKWET DFSPIEADGASSVDTLYSKAYLRHLFHAQELLAMQIASIHNLAFYLWLAGEARKHIIAGD FSTWKPMMVKRVSTRL >gi|222159212|gb|ACAB01000147.1| GENE 5 5092 - 6189 969 365 aa, chain + ## HITS:1 COG:FN1030 KEGG:ns NR:ns ## COG: FN1030 COG0795 # Protein_GI_number: 19704365 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 7 363 2 361 363 100 27.0 3e-21 MKSNRFIKRLDWYIIKKFLGTYVFAIALIISIAVVFDFNEKMDKLMEHEAPWDKIIFEYY MNFIPYFSNLFSPLFVFIAVIFFTSKLAENSEIIAMFSTGMSFKRMMRPYMISAAIIALT TFTLSSYVIPKGSVTRLNFEDRYIKPKKQNSVSNVQLEVDSGVIAYIDNYNNAMKTGNRF SLDKFVDKKLVSHLTARRITYDTATVHKWTIHDYMVRELDGLKEKITKGDRIDSIINMEP SDFLIMKNQQEMLTSPELSDYIEKQKRRGFANIKEFEIEYHKRIAMSFASFILTIIGVSL SSRKTKGGMGLHLGIGLGLSFSYILFQTITSTFAVNGNVPPAIAVWIPNILYAGIAFFLY QKAPK >gi|222159212|gb|ACAB01000147.1| GENE 6 6440 - 6943 337 167 aa, chain - ## HITS:1 COG:no KEGG:BT_0833 NR:ns ## KEGG: BT_0833 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 167 4 168 168 312 95.0 3e-84 MKHVKWIFVVLLICSLTAFVEKDKPTGGLSEGDVAPDFKIESTSNEQPAFKLGNLKGKYV 
LLSFWASYDAQSRMQNVSLSNALRSSHNVEMVSVSFDEYQSIFKETVRKDQIVTPTCFVE TEGEDSGLFKKYRLNRGFTNYLLDGNGVIIAKNISAADLSAYVKEIG >gi|222159212|gb|ACAB01000147.1| GENE 7 7210 - 8439 1488 409 aa, chain + ## HITS:1 COG:PA4960_2 KEGG:ns NR:ns ## COG: PA4960_2 COG0560 # Protein_GI_number: 15600153 # Func_class: E Amino acid transport and metabolism # Function: Phosphoserine phosphatase # Organism: Pseudomonas aeruginosa # 193 404 1 212 217 258 64.0 1e-68 MQPSNTELILIRVTGEDRPGLTASVTEILAKYDATILDIGQADIHNTLSLGILFKSEERH SGFIMKELLFKASSLGVTIRFEPITTEQYDNWVGMQGKNRYILTVLGRKLSARQISAATS ILAEQGMNIDAIKRLTGRIPLDECQTDTRTRACIEFSVRGTPKDRIAMQEKLMKLASELE MDFSFQQDNMYRRMRRLICFDMDSTLIETEVIDELAIRAGVGDEVKAITESAMRGEIDFT ESFTRRVALLKGLDESVMQEIAENLPITEGVERLMYVLKKYGYKIAILSGGFTYFGQYLQ KKYGIDYVYANELEIIDGKLTGRYLGDVVDGKRKAELLRLIAQVEKVDIAQTIAVGDGAN DLPMLGVAGLGIAFHAKPKVVANAKQSINTIGLDGVLYFLGFKDSYLNM >gi|222159212|gb|ACAB01000147.1| GENE 8 8486 - 9748 1134 420 aa, chain + ## HITS:1 COG:CC0835 KEGG:ns NR:ns ## COG: CC0835 COG0513 # Protein_GI_number: 16125088 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Caulobacter vibrioides # 2 367 3 368 476 290 43.0 4e-78 MKFSELQLNANVLEALDAMRFDECTPIQEQAIPIILEGKDLIAVAQTGTGKTAAFLLPVL NKLSEGKHPEDAINCVIMSPTRELAQQIDQQMEGFSYFMPVSSVAVYGGNDGILFEQQKK GLTLGADVVIATPGRLIAHLSLGYVDLSKVSYFILDEADRMLDMGFYEDIMQIAKYLPKE RQTIMFSATMPAKIQQLANTILNEPSEIKLAVSKPAEKIIQAAYVCYENQKLGIIRSLFM DEVPERVIVFASSKIKVKEVAKALKSMKLNVGEMHSDLEQAQRETVMHEFKAGRINILVA TDIVARGIDIDDIRLVINFDVPHDSEDYVHRIGRTARANNDGVALTFINEKEQSNFKSIE NFLEKEIYKIPIPEGLGEAPEYKPRSFNKNRNGNFSKRKDFRGKRNNGGKKSNYPAPRQK >gi|222159212|gb|ACAB01000147.1| GENE 9 9831 - 10625 638 264 aa, chain - ## HITS:1 COG:no KEGG:BT_0830 NR:ns ## KEGG: BT_0830 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 264 3 265 265 428 80.0 1e-118 MKNVAVSLMAILFAACSGNKSPHSLQQEKEDLSAKELLQGIWLDDETESPLMRVEGDTIY 
YADAQSTPIAFKIMRDILYTYGNDTTYYKIDKQGEHIFWFHSITDNVIKLHKSEDLNDSI YFVRQELVVPTYTEVTKRDSVVTYNGTRYRAYVYINPSKMRVIKTTYSEDGISMDNVYYD NVMHICVYEGKKSLFASDITKQMFDKVVPADFLAQSILSDTKFVKVNRNGFHYQAVLAIP ETSIYSVVNMEVSFKGDLEITSSK >gi|222159212|gb|ACAB01000147.1| GENE 10 10664 - 11977 1274 437 aa, chain - ## HITS:1 COG:XF1606 KEGG:ns NR:ns ## COG: XF1606 COG1004 # Protein_GI_number: 15838207 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Xylella fastidiosa 9a5c # 1 437 1 444 450 505 56.0 1e-143 MKIAIVGTGYVGLVSGTCFAEIGVNVTCVDTNKEKIESLQKGNIPIYENGLEEMVLRNMK AKRLKFTTSLESCLDDVEVIFSAVGTPPDEDGSADLKYVLEVARTIGRNMKQYKLVVTKS TVPVGTASKVRAVIQEELDKRGVKVDFDVASNPEFLKEGNAISDFMSPDRVVVGVESARA EKLMSKLYKPFLLNNFRVIFMDIPSAEMTKYAANSMLATRISFMNDIANLCEIVGADVNM VRSGIGSDTRIGRKFLYPGIGYGGSCFPKDVKALIKTAEQNGYSMRVLSAVEEVNEQQKS VLFEKLKKQFNGDLQGKTVALWGLAFKPETDDMREAPALVLIDKLLEAGCRVRAYDPAAV QECKRRIGEKIYYACDMYDAVLDADVLMLVTEWKEFRLPSWAVIKKTMAQQIVLDGRNIY DKKEMEELGFVYHCIGK >gi|222159212|gb|ACAB01000147.1| GENE 11 12004 - 12552 582 182 aa, chain - ## HITS:1 COG:MA3780 KEGG:ns NR:ns ## COG: MA3780 COG1898 # Protein_GI_number: 20092576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 1 169 1 170 183 206 62.0 2e-53 MNYIQTEIDGVWLIEPKVFSDERGYFMEAYKKEEFDANIGPVDFIQDNESKSSFGVLRGL HYQKGEYSQAKLVRVLKGKVLDVAVDLRKSSPTFGKHVCVLLSEENKRQFFIPRGFAHGF AVLSEEAVFTYKVDNKYAPQAEASILYNDETLGIDWPLADSQMVLSAKDREGTAFKDAAY FE >gi|222159212|gb|ACAB01000147.1| GENE 12 12643 - 13746 818 367 aa, chain + ## HITS:1 COG:no KEGG:BT_0827 NR:ns ## KEGG: BT_0827 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 367 1 365 365 493 72.0 1e-138 MIELAQHIETLLLENDCVIVPGFGGFVAHYSPATRVKEENIFLPPTRTIGFNPQLKLNDG VLVQSYMSAYDTSFADASRIVEKEVNEFIGLLHEEGKAHLDNIGEIHYNIYGNYEFVPYD YKITTPSLYGLDSFEMHELSVLQQKEKVWIPAHPEKEKKTFEISINRAYLRNAAAMIAAI 
VLFFAFSTPVENTDVQKNNYAQLLPSELFEQIEKQSVVVTPVYVKSEAMQQAKKLSATAS TSSSTKTSSTKKNTADKTSKPIAVREMKVAKQETAATASATTPAAVKSPESANHPFHIIV AGGISLKDAEAIATQLKSKGFANAKALNMDGKVRVSISSFDNRNEATKQLLELRKNETYK NAWLLAK >gi|222159212|gb|ACAB01000147.1| GENE 13 13765 - 14292 437 175 aa, chain + ## HITS:1 COG:no KEGG:BT_0826 NR:ns ## KEGG: BT_0826 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 175 1 175 177 314 95.0 1e-84 MKRVLCPKCENYLFFDETKYSEGQSLVFECEHCGKQFSIRLGKSKVKALRKEENLEEEAE SHKEEFGYITVIENVFGYKQILPLQEGDNVIGRRCVGTFINTPIESGDMSMDRRHCIINI KRNKQGKLVYTLRDAPSLTGTFLMNEILGDKDRVCIEDGAIVTIGATTFILHAAE >gi|222159212|gb|ACAB01000147.1| GENE 14 14295 - 14546 190 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262406610|ref|ZP_06083159.1| ## NR: gi|262406610|ref|ZP_06083159.1| conserved hypothetical protein [Bacteroides sp. 2_1_22] # 1 83 1 83 83 161 100.0 1e-38 MMEQNWQNDPVKSPEIQEIILSNRIGVIAAELSRRLEIAPVRALQLFYESKTCADLHDKE TGLYLYGNLYIADEFMREYQNKL >gi|222159212|gb|ACAB01000147.1| GENE 15 14521 - 15039 351 172 aa, chain - ## HITS:1 COG:no KEGG:BF3041 NR:ns ## KEGG: BF3041 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 12 170 2 160 171 189 56.0 2e-47 MSDNLGGKTMKITVYHGGTERVDVPICRLGRENLDFGRGFYVTDIKEQACRWAIATAKRR NTQAIINIYQLDRELVLEKARCKVFKAYDSEWLEFIVASRRGLNPAANYDYIEGGVANDR VIDTVNLYMSGLMSADVALQRLAQHQPNNQICLLNQNITDKYLIYDGTELAE >gi|222159212|gb|ACAB01000147.1| GENE 16 15009 - 15248 164 79 aa, chain - ## HITS:1 COG:no KEGG:BF3042 NR:ns ## KEGG: BF3042 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 77 1 77 102 85 51.0 4e-16 MENLIELSHTEVTLAFVASCIESTARRLGKSYQEVFTRMKRVGMIENYILPCYDVLHTES REHVTDNMIECLTTWEAKR >gi|222159212|gb|ACAB01000147.1| GENE 17 15505 - 16152 272 215 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713252|ref|ZP_04543733.1| ## NR: gi|237713252|ref|ZP_04543733.1| predicted protein [Bacteroides sp. 
D1] # 1 215 1 215 215 425 100.0 1e-118 MNNSGSNPCDQICKKQVAYDKVVEAYQFHVQRYHTWVNYYAIFVGALFVAFYTIIPQSEN VQCAPCSGTIFSFLDSIWLPLLIITLGWFASMCWLASIIGHYEWMKSWIRIVKKREVDFF ETTSSQPSNISPVSSEIFVYGKVMREPDRSVPNNMLPKFISTTKSNSMVCMECYFNVDSK LCFALEICRLLLVDSLFRLFPSYGYNWILSLWESS >gi|222159212|gb|ACAB01000147.1| GENE 18 16164 - 19202 2154 1012 aa, chain - ## HITS:1 COG:MA4289 KEGG:ns NR:ns ## COG: MA4289 COG3291 # Protein_GI_number: 20093078 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 678 909 1035 1261 1734 100 35.0 2e-20 MMKKTVFFFLMIAALSTSFYSCKTNTDDLWDSIHQLDGRVTLLEELCKQMNGNIGALQTL LQALQSNVSITKVNPVTEGGKTVGYTISFSKGDPITIYHGEKGDKGDTGIAGKDGVTPVV GVRRDTDNVYYWTLNGEWLIDDAGNKVKAVGEDGKDGENGIDGKPGMDGITPQLKIENGR WWLSMNNGSTWSNIGQATGDTGKDGVDGDDMFSDIDYTSSQDFVTFTLSNGTVIKLPTWY AFGELQSLCEQMNTDISSIQSIIEALQQKDGITNVVPLNDNGKVIGYKLYFEKREPITIY HGKDGADGNDGQDGSDGKDGVTPVVGVKQDTDGIYYWTLNSDWMKDDNGNKIKAQGVDGS DGQDGTDGTNGVTPQLKIEEDGYWYISYDNGSNWTKLGKATGADGTSGDAFFKNVTEDDD YVYLEMQAGNIISVPKHKKLSITFNETEDIRVLANQTYPIQYAITGATDKTVLKALAQDG FRAVVKSLGNASGVIEVTTPGTILSSEVLVFVSDGEERTIMRSINFVEGVINITDRSYTV PYTGGTVSVQLSTNIDYTIEIPEADKTWISIAPTSRAIMRDETITFNVQHNNNTQLRYSI IKLVDKLGVTSETIQITQRGGSSQDIYVATAGTLEQQINSEDAKILEELKITGHLNTFDY EFLKTMPNLKTVDLSELSDTSIPASAFNNSLVPTVLLPLNLKTISDRAFYNSSITSIYIP QTVESIGQYAFANTLKLTGNIVIPSKTATIGERAFEYSAFNGTLILEEGVQSIGTAAFAG CSKVTGDLVIPNSVTSMKGSAFQSSTFTGSLSIGNALTDIPAYAFHGCSSFKGKITIGGN VKSIDDHAFEGCAGLTGNLVIPNKVETIGIYAFYACIGFDGYLSIGNSMKSIGKCAFVGE EKIGKGTVPSTVGPDKEYARCSWTPIFNKVYCKAIESPTLGIENLFDGYKNVSVYIVFGT SHTVKYGGGPMYIPEPTMCNENYPQYLFIPINASGYGKNGWGGFEIKSEVEF >gi|222159212|gb|ACAB01000147.1| GENE 19 19992 - 20477 351 161 aa, chain - ## HITS:1 COG:no KEGG:BT_3062 NR:ns ## KEGG: BT_3062 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 159 1 163 165 112 42.0 5e-24 MAIRYKKKKITLCFDKENKVEKYVAANVLVGSINYNKLCDEVNQRTGVHRAMVDVVLKGV 
QDTMVSFLEEGFSVKLGEFGSFRPAINAESQDKAEDVSVNTIIRRKIVFTPGGAFKEMLG SASIELFGDNVTNPNGDNGGEPSEGGGSSGGDSGEAPDPAA >gi|222159212|gb|ACAB01000147.1| GENE 20 20931 - 22370 1529 479 aa, chain - ## HITS:1 COG:CAC0695 KEGG:ns NR:ns ## COG: CAC0695 COG0246 # Protein_GI_number: 15893983 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Clostridium acetobutylicum # 15 479 15 482 482 498 53.0 1e-140 MKALNKETAPKTQRPERIIQFGEGNFLRAFVDWIIYNMNQKTDFNSSVVVVQPIDKGMVD MLNAQDDLYHVNLQGLDKGEVVNSLTMIDVISRALNPYSQNDEFMKLAEQPEMRFVISNT TEAGIAFDPSCKLEDAPASSYPGKLTQLLYHRFKTFNGDKTKGLIIFPCELIFLNGHKLK ETIYQYIELWNLGNEFKTWFEEACGVYATLVDRIVPGFPRKDIAAIKEKLQYDDNLVVQA EIFHLWVIEAPQEVAKEFPADKAGLNVLFVPSEAPYHERKVTLLNGPHTVLSPVAYLSGV NIVREACEHEVIGKYIHKVMFDELMETLNLPKDELEKFANDVLERFNNPFVDHAVTSIML NSFPKYETRDLPGLKTYLERKGELPKGLVLGLAAIITYYKGGVRADGAEIVPNDAPEIMN LLKELWATGCTKKVTEGVLAADFIWGEDLNKIPGLAEAVKANLDSIQEKGMLETVKGIL >gi|222159212|gb|ACAB01000147.1| GENE 21 22397 - 23464 1004 355 aa, chain - ## HITS:1 COG:lin2880 KEGG:ns NR:ns ## COG: lin2880 COG1609 # Protein_GI_number: 16801940 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Listeria innocua # 8 193 4 194 318 72 28.0 1e-12 MEDQNYTIKDIARMAGVSAGTVDRVLHNRGDVSPKSKAKVQKVLDEIHYQPNVFAIGLAA KKKYSFLCLIPYYIEHDYWHSVVGGIERARQELRPFNVGIDYLCYHHGDEKSYQEACLAI KEKNVDAVLISPNFREETLALTAYLQENKIAYAFVDFNMEEARALTYIGQDSYKSGYIAA KILMRNYSAGEGQELVLFLSNNKDNPAEIQMQRRLDGFMSYIAEEYNNLVIHEVILNKSD QESNQQTLDEFFRAHPKAALGVVFNSRVYQLGEYLRHAGRSMKGLIGYDLLKANVELLKS GDVHYLIGQRPGLQGYCGVKALCDHVVFKKSVEPVKYMPIDILIKENIDFYFEFV >gi|222159212|gb|ACAB01000147.1| GENE 22 23663 - 25069 1298 468 aa, chain + ## HITS:1 COG:uxaC KEGG:ns NR:ns ## COG: uxaC COG1904 # Protein_GI_number: 16130987 # Func_class: G Carbohydrate transport and metabolism # Function: Glucuronate isomerase # Organism: Escherichia coli K12 # 1 466 1 465 470 550 56.0 1e-156 MKNFMDENFLLQTETAQKLYHEHAAKMPIIDYHCHLIPQMVADDYKFKSLTEIWLGGDHY 
KWRAMRTNGVDERFCTGKDTTDWEKFEKWAETVPYTFRNPLYHWTHLELKTAFGINKILN PQTAREIYDECNEKLSQPEYSARGMMRRYHVEVVCTTDDPIDSLEYHIKTRESGFEIKML PTWRPDKAMAVEVPADFRSYVEKLAEVSDVTISNFDDMIAALRKRHDFFAEQGCRLSDHG IEEFYAEDYTDAEIKAIFNKVYGGTELTKEEILKFKSAMLVIFGEMDWEKGWTQQFHYGA IRNNNTKMFKLLGADTGFDSIGEFTTAKAMAKFLDRLNTNGKLTKTILYNLNPCANEVIA TMLGNFQDGSIPGKIQFGSGWWFLDQKDGMEKQMNALSVLGLLSRFVGMLTDSRSFLSYP RHEYFRRTLCNLVGRDVENGEIPASEMERVNQMIEDISYNNAKNFFKF >gi|222159212|gb|ACAB01000147.1| GENE 23 25516 - 26319 637 267 aa, chain - ## HITS:1 COG:CAC3538 KEGG:ns NR:ns ## COG: CAC3538 COG1235 # Protein_GI_number: 15896774 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Clostridium acetobutylicum # 3 266 1 261 261 176 38.0 6e-44 MKVKFISLASGSSGNCYYLGTETYGILIDAGIGIRTIKKTLKDYNILMDSIRAVFITHDH ADHIKAVGNLGEKLNIPVYTTARIHAGINRSYCMTEKLSSSVRYLEKQEPMTLEDFHIES FEVPHDGTDNVGYCIEIDGKVFSFLTDLGEITPTAAHYISKAHYLILEANYDEEMLKMGP YPQYLKERIASKTGHMSNSDTAEFLAENITEHLRYIWLCHLSKDNNHPELAYKTVEWKLK NKGILVGKDVQLLALKRNTPSELYVFE >gi|222159212|gb|ACAB01000147.1| GENE 24 26402 - 27790 1125 462 aa, chain - ## HITS:1 COG:no KEGG:BT_0821 NR:ns ## KEGG: BT_0821 # Name: not_defined # Def: putative permease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 462 1 462 462 790 96.0 0 MNNSPQRAAKGFTRAFWVSNTVELFERMAYYAVFIVLTIYLSSILGFNDFEASMISGLFS GGLYLLPIFSGAYADKIGFRKSMIIAFSLLSIGYLGLGVFPTLLEAAGLVSYGVTTQFNG LPDSYTRWIIVPVLFVLMVGGSFIKSIISASVAKETTEATRARGYSIFYMMVNVGAFTGK TIIDPLRNVIGEQAYIYINYFSGAMTIIALLAVILLYKSTHTAGEGKSLREIGQGFMRII TNWRLLILILIVTGFWMVQQQLYATMPKYVIRLAGETAKPGWIANVNPFVVVCCVSFITR LMAKRSAITSMNVGMFLIPFSALLMACGNLLGNDLITGMSNITLMMIAGIVVQALAECFI SPRYLEYFSLQAPKGEEGMYLGFSHLHSFLSSIFGFGLAGILLTKYCPDPALFETRAAWE AASANAHYIWYYFAAIGLIAAVALLLFAKITESIDKKKESSR >gi|222159212|gb|ACAB01000147.1| GENE 25 27887 - 28309 455 140 aa, chain + ## HITS:1 COG:no KEGG:BT_0820 NR:ns ## KEGG: BT_0820 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 1 140 1 139 139 237 82.0 1e-61 MIIAVDFDGTIVEHRYPRIGEEIPFAVDTLKLLQQEKHRLILWSVREGALLDEAVEWCKA RGLEFYAVNKDYPEEQKGHQGFSRKLKADMFIDDRNLGGLPDWGVIYAMIKEKKTFADVY GQNGEEEKTPSKKKKRWLPF >gi|222159212|gb|ACAB01000147.1| GENE 26 28434 - 28700 182 88 aa, chain - ## HITS:1 COG:no KEGG:BT_0819 NR:ns ## KEGG: BT_0819 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 87 1 81 82 120 80.0 1e-26 MENSKDMAQHTYDNEAVQELLNWAKKMIETRNYPTERYQVNQCTTIIDGKSYLESLIAMI SRNWENPTFYPTIEQLWEFREKWENKES >gi|222159212|gb|ACAB01000147.1| GENE 27 28844 - 29293 324 149 aa, chain + ## HITS:1 COG:RSc0240 KEGG:ns NR:ns ## COG: RSc0240 COG2259 # Protein_GI_number: 17544959 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Ralstonia solanacearum # 8 131 19 141 145 58 33.0 4e-09 MIYDFLFPTKSNTTKVSLLLLAVRIIFGILLMNHGIQKWSSFQELSTVFPDPLGIGSPLS LGLAIFGELVCSMAFIVGFLYRLAMIPMIFTMIVAFFVVHANDVFAVKELAFIYLVVFIL MYIAGPGKFSIDHIIGNELSRRKSRAYKN >gi|222159212|gb|ACAB01000147.1| GENE 28 29387 - 29938 619 183 aa, chain + ## HITS:1 COG:no KEGG:BT_0815 NR:ns ## KEGG: BT_0815 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 183 1 183 183 316 96.0 2e-85 MKRGKLTLVAVVLSGSLLFSSCVGSFSLFNRLSSWNQSVGNKFVNELVFLAFNIVPVYGV AYLADALVINSIEFWSGSNPMANVGDVKKVKGENGNYMVKTLENGYSITKEGETASMDLI YNKEANTWNVVANGESAELVKMNNDGTADLFLPNGEKMNVTLDAQGMLAARQATMSNLMF AAR >gi|222159212|gb|ACAB01000147.1| GENE 29 30030 - 30482 204 150 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713264|ref|ZP_04543745.1| ## NR: gi|237713264|ref|ZP_04543745.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 150 1 150 150 280 100.0 2e-74 MNKLYTLIYSVLIFTCLSCQQQTPQTQIEQTAIDFCEAFYNFNYPAAQAWSTPSSLSYLS FLASNVRQTHLDLLKTRGAAKVSVISSEIDANSEEATVVCQIKNTFVINPVEGKIEYMSS LQDTLQLVRENNKWLVRKDIPQQNGKQSHD >gi|222159212|gb|ACAB01000147.1| GENE 30 30436 - 32745 1907 769 aa, chain - ## HITS:1 COG:no KEGG:BT_0814 NR:ns ## KEGG: BT_0814 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 769 1 772 772 1328 87.0 0 MRRNFLYLLASAAVLSLASCTTTKFVPDGSYLLDEVKIRTDQKNIRPSSLRMYVRQNPNA KWFSLIKTQLYVYNLSGRDSTKWGNKFLRRIGDAPVIYSEDEAQRSEEEITKAVHNMGYM AATVKRSTKVKKKKIKVYYDVTAGKPYVVQSIKYDIYDPKIAALLKQDSARSLLKEGMYF DVNVLDADRQRITNKLLRNGYYKFNKDYIGYTADTVRNTYNVDLTQHLQMYKAHANDSAR AHRQYWINKINFITDYDVLQSSAMSSVDINDSVHYKGYPIYYKDKLYLRPKVLTDNLRFA SGDLYNERDVQQTYSSFGRLSALKYTNIRFIETQIGDSTMLDCYVMLTKSKHKSVAFEVE GTNSAGDLGAAASVSFQNRNLFRGSETFMIKFRGAYEVISGLQAGYSNNNYTEYGVETSI NFPNFLFPFISSDFKRKIRATTEFGLQYNYQLRPEFLRTMASANWSYKWTQRQKIQHRID LINIAFLYLPRISDRFKEDYINKGQNHIFQYNYQNRLIVNMGYSYNYNSVGGSIINNTIA SNSYSIRFNFESAGNIMYALSKATNIRKNSNGEYAILGIPYAQYLKGEFDFAKNIRIDHR NSFAFHAGVGIAVPYGNAKTIPFEKQYFSGGANSVRGWAVRDLGPGSFAGNGNLLDQSGD IKLDASIEYRSKLFWKFQGAAFIDAGNIWTIRSYANQPGGVFKFDKFYKQIAVAYGLGLR LDLDFFILRFDGGMKAVNPAYETKKEHFPIIHPKFSRDFAFHFAVGYPF >gi|222159212|gb|ACAB01000147.1| GENE 31 32888 - 33652 663 254 aa, chain + ## HITS:1 COG:VC0803 KEGG:ns NR:ns ## COG: VC0803 COG0566 # Protein_GI_number: 15640821 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Vibrio cholerae # 4 250 15 255 257 176 40.0 4e-44 MASLSKNKIKFIHSLELKKIRKEERVFLAEGPKLVGDLLGHFPCRFLAATPSWFQEHPGI DASELVEVSQEDLSRASLLKTPQQVLAIFEQPQYTLAPEAVRSSLCLALDDVQDPGNLGT IIRLADWFGIEHIICSQNTVDVYNPKTIQATMGGIARVKVHYTSLPDFIRSLGDTPVFGT FLDGKNMYEQPLTANGLIVMGNEGNGIGKEVATLINRKLYIPNYPAGQETSESLNVAIAT AVICAEFRRQAAWK >gi|222159212|gb|ACAB01000147.1| GENE 32 33603 - 33989 139 128 aa, chain - ## HITS:1 COG:no KEGG:BT_0810 NR:ns ## KEGG: BT_0810 # Name: not_defined # Def: hypothetical protein # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 1 128 1 125 125 180 71.0 2e-44 MKRFASHYLLIPAVGYMKQQVVEITDEGVVRGVFSLTEEIESVEWMPGVIVLLSESQMEE IKNTGMILKNIPVFQINLPSVSNQSSQCFNEYLQEAEKKGEALFPYLFYPFDFTSMQPVD GTRHRLLR >gi|222159212|gb|ACAB01000147.1| GENE 33 33995 - 35119 889 374 aa, chain - ## HITS:1 COG:no KEGG:BT_0809 NR:ns ## KEGG: BT_0809 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 358 1 343 360 499 73.0 1e-140 MSLIIRWIKIKRKLRIMKNKFRFHLCLICMFVFAVAGCKVKRPSDVISESKMENLLYDYH VAKSMGDNLPYSENYKKALYIDAVFKKYGTTQAAFDSSMVWYTRNTEILSKIYDKVKKRL KDEQELVGDLIAKRDKKPKMTKQGDSIDVWPWQRMVRLTGEMMDNQYVFTLPTDSNYKDR DTLVWEVRYRFLEPMLADSLRGVTMAMQVIYEKDTINHWKTVTEPGVQQIRLFADTLGSM KEIKGFIYYPMDSQEKGGAVLADRFMMTRYHCTDTLAFAVRDSLNKIEALKADSLKKIAT KENADSLHKVIDKNKDDIQRLTPEEMNRRRTGTHREKKPEQIQVEQHIQKERIEQRKERQ MNQRRKQQQRRSNN >gi|222159212|gb|ACAB01000147.1| GENE 34 35061 - 35717 549 218 aa, chain - ## HITS:1 COG:no KEGG:BF2277 NR:ns ## KEGG: BF2277 # Name: not_defined # Def: lipoprotein signal peptidase # Organism: B.fragilis # Pathway: Protein export [PATH:bfr03060] # 1 204 1 204 210 362 92.0 7e-99 MKKLFTKGRIALLVIFSVLIIDQIIKVWIKTHMYWHESIRVTDWFYIYFTENNGMAFGME IFGKLFLTTFRIVAVALIGWYLYKIIKKGFKTGYIVCVALILTGALGNIIDSVFYGVIFN ESTHSQIASFMPEGGGYSTWFYGKVVDMFYFPIIDTNWPTWMPFVGGEHFIFFSPIFNFA DAAISCGIIALLLFYSKYLNESYHSLDKDKKEATDHEK >gi|222159212|gb|ACAB01000147.1| GENE 35 35815 - 36195 514 126 aa, chain - ## HITS:1 COG:no KEGG:BT_0807 NR:ns ## KEGG: BT_0807 # Name: not_defined # Def: DnaK suppressor protein, putative # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 126 1 126 126 209 98.0 2e-53 MAEKTRYSDAELEEFRAIIMEKLELAQRDYEQLKKSLMGLDGNDTDDTSPTYKVLEEGAN TLSKEETTRLAQRQLKFIQGLQAALVRIENKTYGICRETGKLIPAERLRAVPHATLSIEA KNSGKK >gi|222159212|gb|ACAB01000147.1| GENE 36 36231 - 39719 3586 1162 aa, chain - ## HITS:1 COG:CAC3038 KEGG:ns NR:ns ## COG: CAC3038 COG0060 # Protein_GI_number: 15896289 # Func_class: J Translation, ribosomal structure and biogenesis 
# Function: Isoleucyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 8 1162 2 1034 1035 836 40.0 0 MGKRFTEYSQFDLSQVNKDVLKKWDENQVFAKSMTERDGCPSFVFFEGPPSANGMPGIHH VMARTIKDIFCRYKTMKGYQVKRKAGWDTHGLPVELSVEKALGITKEDIGKKISVADYNA ACRKDVMKYTKEWEDLTHRMGYWVDMKHPYITYDNRYIETLWWLLKQLHKKGLLYKGYTI QPYSPAAGTGLSSHELNQPGCYRDVKDTTAVAQFKMKNPKPEMTEWGTPYFIAWTTTPWT LPSNTALCVGPKIDYVAVQSYNAYTGEPITVVLAKALLNVHFNAKAADLKLEDYKAGDKL VPFKVIAEYKGTDLVGMEYEQLIPWVKPVEVSENGTWKASDKAFRVIPGDYVTTEDGTGI VHIAPTFGADDANVARAAGIPSLFMINKKGETRPMVDLTGKFYLLNELDENFVKECVDVD KYKEYQGAWVKNAYDPQFMVDGKYDEKAAQAAESLDIVIAMMMKADNKAFKIEKHVHNYP HCWRTDKPVLYYPLDSWFIRSTACKERMMELNKTINWKPESTGTGRFGKWLENLNDWNLS RSRYWGTPLPIWRTEDNTDEICIESVEELYNEIEKSVAAGFMKSNPYKDKGFVPGQYSEE NYDKIDLHRPYVDDVILVSKDGKPMKRESDLIDVWFDSGAMPYAQIHYPFENKELLDNRQ VYPADFIAEGVDQTRGWFFTLHAIATMVFDSVSYKAVISNGLVLDKNGNKMSKRLNNAVD PFTTIEKYGSDPLRWYMITNSSPWDNLKFDVEGVEEVRRKFFGTLYNTYSFFALYANVDG FEYKEADVPMAERPEIDRWILSVLNTLIKEVDTCYNEYEPTKAGRLISDFVNDNLSNWYV RLNRKRFWGGEFTQDKLSAYQTLYTCLETVAKLMAPVSPFYADRLYTDLITATGRDNVVS VHLAEFPKYQEEMIDKELEARMQMAQDVTSMVLALRRKVNIKVRQPLQCIMVPVVDEEQK AHIEAVKNLIMNEVNVKEVRFVDGAAGVLVKKVKCDFKKLGPKFGKQMKAVAAAVAEMSQ EAIGELEKNGKYTLNLDGAEAVIEASDVEIFSEDIPGWLVANEGKLTVALEVTITEELRR EGIARELVNRIQNIRKSSGFEITDKIKITISKNTQTDDAVNEYNTYICNQVLGTSLELVD EVKDGTMLEFDDFSLFVNVIKD >gi|222159212|gb|ACAB01000147.1| GENE 37 39930 - 40979 708 349 aa, chain + ## HITS:1 COG:no KEGG:BT_0805 NR:ns ## KEGG: BT_0805 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 349 1 349 349 567 89.0 1e-160 MNYGISILFRAIPLAMAIFCFGYGAFIYGYGDDGSRVVAGPVVFSLGMICIALFCTAATI IRQIIHTYNKSAKYILPVIGYLAAIITIIGGICIFSNATSTSAFVAGHVITGVGFITTCV ATAATSSTRFSLIPRNSKATSNEVPEGAFSLNQRRALVIVAIIVSLIAWIWAFVLLGNSH SHPAYFVAGHVMVGLACICTSLIALVATIARQIRNDYSEKERNKWPKLVLLMGSISFVWG LFVILADSGSANGTTGYIMLGLGLVCYSISSKVILLAKIWRQEFKLANRIPMIPVFTALA CLFLAAFVFELATIHADYFIPARVLVGLGAICFTLFSIVSILESGTSSK >gi|222159212|gb|ACAB01000147.1| GENE 38 41056 - 42594 1472 512 aa, chain - ## HITS:1 
COG:BH1089_2 KEGG:ns NR:ns ## COG: BH1089_2 COG0388 # Protein_GI_number: 15613652 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Bacillus halodurans # 202 512 3 313 313 326 49.0 5e-89 MEQHPIKINKVQIRNLQIEDYVQLSQSFTRVYSDGSDVFWTREQIQKLIKIFPEGQIVTV VDDKIVGCALSIIVDYDKVKNDHTYAQVTGKETFNTHSSKGNILYGIEVFIHPGYRGLRL ARRMYEYRKELCETLNLKAIMFGGRIPNYHKYADKMRPKEYIERVRQREIYDPVLTFQLS NDFHVRKVMTNYLPNDEESKHYACLLQWDNIYYQPPTQEYISPKTTVRVGLVQWQMRSYK TLDDLFEQVEFFVDAVSDYKSDFVLFPEYFNAPLMSKYNDKGESQAIRGLAKYTDEIRER FMNLAISYNINIITGSMPYVKEDGLLYNVGFLCRRDGTYEMYEKLHVTPDEIKSWGLNGG KLLNTFDTDCAKIGVLICYDVEFPELSRLMADQGMQILFVPFLTDTQNAYSRVRVCAQAR AIENECFVVIAGSVGNLPRVHNMDIQYAQSGVFTPCDFAFPTDGKRAEATPNTEMILVSD VDLDLLNELHTYGSVRNLKDRRNDLYEVRYKK >gi|222159212|gb|ACAB01000147.1| GENE 39 42727 - 43017 135 96 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713274|ref|ZP_04543755.1| ## NR: gi|237713274|ref|ZP_04543755.1| predicted protein [Bacteroides sp. D1] # 1 96 10 105 105 177 100.0 3e-43 MIRIIASIRLYKDGRRTPFYSGYRPLFDFIEETKTSGQITLLDREAFYPGDEGVVEIAFL IRRALGDNFSEGTKFTFGEGRKHVGEGEVKEILELE >gi|222159212|gb|ACAB01000147.1| GENE 40 43396 - 43590 120 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174876|ref|ZP_05761288.1| ## NR: gi|260174876|ref|ZP_05761288.1| hypothetical protein BacD2_23669 [Bacteroides sp. 
D2] # 1 64 1 64 64 105 100.0 1e-21 MIITMISHIRFWLKKYFFGKSNCFIFEIFQIREKWIEEVESNKSSYFPQLKQCFSLLKAP LSFS >gi|222159212|gb|ACAB01000147.1| GENE 41 43737 - 44592 391 285 aa, chain + ## HITS:1 COG:PA0780 KEGG:ns NR:ns ## COG: PA0780 COG2207 # Protein_GI_number: 15595977 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 37 122 164 249 250 70 34.0 4e-12 MNSNKFSYQGQIETILHYLNSMIQKNWTGKTADEFNRLMSIPELAKIACMSARNLQLMFK AYTSETIHQYIIRTRMEYAQQLLKDNKKSIAEIYEYIGFANQSALNNTLQKKYNLTPREL QKKLLETSHTYPSCISPYRIVESETIPVLFLSYIGNYDTCSTVAFETYTWDCLYKYAKEN SLLPDKEDYWGIAYDDTDITSLEKCRFYACIAIQKRVGSNPPLTNPIKHMDLPQGTYAVY IHQGDYALLDAFYEIILKQLPQSYCLGETPILEHYLNSPADTDVK Prediction of potential genes in microbial genomes Time: Wed May 18 04:24:31 2011 Seq name: gi|222159211|gb|ACAB01000148.1| Bacteroides sp. D1 cont1.148, whole genome shotgun sequence Length of sequence - 1911 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 720 291 ## TDE0221 hypothetical protein - Prom 744 - 803 4.0 2 1 Op 2 . - CDS 805 - 1275 332 ## Alvin_2303 5 nucleotidase deoxy cytosolic type C - Prom 1399 - 1458 8.2 3 2 Tu 1 . 
- CDS 1462 - 1911 286 ## CCC13826_0816 acid membrane antigen A Predicted protein(s) >gi|222159211|gb|ACAB01000148.1| GENE 1 3 - 720 291 239 aa, chain - ## HITS:1 COG:no KEGG:TDE0221 NR:ns ## KEGG: TDE0221 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 4 152 8 162 216 85 40.0 1e-15 MDTLKKEIAILIQCSDFKEIEKQLQTINKLIVTNYMFELSNGLRIYPIEVEAYFKHPKFN DGFVHGNELQKNNYGKFYVHRTGMTKNSKIKGGTRGGIDLCLSDSTDIYYGILIRSAQFD DGTIKFGPNNVLKFIVEDKNLDYNTLEEEFVLKEAVEDCRDRENKSIILHSTRVGLGRNQ SDDFRDSQLRTIAGPLLSSYAYKEKENVFKHYIINENISKEEAEKISIDILGYCPKSLI >gi|222159211|gb|ACAB01000148.1| GENE 2 805 - 1275 332 156 aa, chain - ## HITS:1 COG:no KEGG:Alvin_2303 NR:ns ## KEGG: Alvin_2303 # Name: not_defined # Def: 5 nucleotidase deoxy cytosolic type C # Organism: A.vinosum # Pathway: not_defined # 9 152 2 145 149 197 63.0 7e-50 MISALSTFKPILYIDMDNVLVDFQSGINKLSEYEKKEYEGRYDEVPNIFAKMYPYKGAID AFHRLVRFYDVYILSTAPWNNPSAWSDKLVWVKKWLGTYSYKRLILSHHKNLNKGDFLID DRLKNGAENFSGELILFGSEQYPNWDSVVDYLISSK >gi|222159211|gb|ACAB01000148.1| GENE 3 1462 - 1911 286 149 aa, chain - ## HITS:1 COG:no KEGG:CCC13826_0816 NR:ns ## KEGG: CCC13826_0816 # Name: amaA # Def: acid membrane antigen A # Organism: C.concisus # Pathway: not_defined # 4 144 115 258 264 69 28.0 3e-11 EIDQSEVLAILKENFYGADIVFLPWDKYDICGHTDGIIHNIGDGKILVNLKVYPPEIERE MRRRLSDDFAVIDLKLSKYDENSWAYINMLQTRDVIIIPGLGLPTDGEALSQIKELHPSY DGRIYQINIAPIIKKWGGALNCLSWTVTK Prediction of potential genes in microbial genomes Time: Wed May 18 04:24:44 2011 Seq name: gi|222159210|gb|ACAB01000149.1| Bacteroides sp. D1 cont1.149, whole genome shotgun sequence Length of sequence - 15444 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 11, operones - 4 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 586 - 635 12.4 1 1 Tu 1 . 
- CDS 698 - 2152 1296 ## COG0477 Permeases of the major facilitator superfamily - Prom 2198 - 2257 4.8 2 2 Op 1 11/0.000 - CDS 2291 - 3607 1636 ## COG2115 Xylose isomerase 3 2 Op 2 . - CDS 3649 - 5172 1439 ## COG1070 Sugar (pentulose and hexulose) kinases - Prom 5192 - 5251 2.7 4 3 Tu 1 . - CDS 5291 - 6022 775 ## COG1051 ADP-ribose pyrophosphatase - Prom 6120 - 6179 6.6 + Prom 6084 - 6143 8.9 5 4 Op 1 . + CDS 6364 - 7200 642 ## COG0351 Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase 6 4 Op 2 . + CDS 7251 - 8138 1072 ## COG0331 (acyl-carrier-protein) S-malonyltransferase + Term 8175 - 8216 7.1 7 5 Tu 1 . - CDS 8123 - 8320 73 ## gi|237713286|ref|ZP_04543767.1| predicted protein - Prom 8523 - 8582 2.4 + Prom 8259 - 8318 8.8 8 6 Op 1 39/0.000 + CDS 8340 - 9470 966 ## COG0045 Succinyl-CoA synthetase, beta subunit 9 6 Op 2 . + CDS 9503 - 10363 916 ## COG0074 Succinyl-CoA synthetase, alpha subunit + Term 10410 - 10472 -0.2 - Term 10396 - 10462 17.5 10 7 Tu 1 . - CDS 10469 - 11212 606 ## BT_0786 putative integral membrane protein - Prom 11331 - 11390 6.3 + Prom 11270 - 11329 5.5 11 8 Tu 1 . + CDS 11349 - 11522 273 ## gi|160884539|ref|ZP_02065542.1| hypothetical protein BACOVA_02524 + Term 11528 - 11573 8.1 - Term 11514 - 11561 9.4 12 9 Tu 1 . - CDS 11585 - 11830 301 ## COG0724 RNA-binding proteins (RRM domain) - Prom 11991 - 12050 5.8 - Term 12169 - 12217 11.0 13 10 Op 1 . - CDS 12243 - 13745 1188 ## COG0174 Glutamine synthetase 14 10 Op 2 . - CDS 13813 - 14619 619 ## BT_0971 hypothetical protein - Prom 14688 - 14747 7.5 + Prom 14647 - 14706 7.2 15 11 Tu 1 . 
+ CDS 14744 - 15443 533 ## BT_0781 hypothetical protein Predicted protein(s) >gi|222159210|gb|ACAB01000149.1| GENE 1 698 - 2152 1296 484 aa, chain - ## HITS:1 COG:ECs5014 KEGG:ns NR:ns ## COG: ECs5014 COG0477 # Protein_GI_number: 15834268 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 12 481 9 479 491 422 48.0 1e-118 MINTTNEGSKLYLYSITSVAILGGLLFGYDTAVISGAEKGLEAFFLSASDFQYNKVMHGI TSSSALIGCVLGGAISGIFASRLGRRNSLRLAAVLFFLSALGSYYPEVLFFEYGKPNMDL LIAFNLYRVLGGIGVGLASAVCPMYIAEIAPSNIRGTLVSCNQFAIIFGMLVVYFVNFLI MGDHQNPIILKDAAGVLSVSAESDMWTVYEGWRYMFGSEAFPAAFFGLLLFFVPKTPRYL VLIQQDEKAYSILEKINGKTKAQEILNDIKATAHEKTEKIFTYGVAVIVIGILLSVFQQA IGINAVLYYAPRIFENAGAEGGGMMQTVIMGIVNIVFTLVAIFTVDRFGRKPLLIIGSIG MAVGAFAVAMCDSMAIKGVLPVLSVIVYAAFFMMSWGPICWVLISEIFPNTIRGKAVAIA VAFQWIFNYIVSSTFPALYDFSPMFAYSLYGIICVAAAIFVWRWVPETKGKTLEDMSKLW KKNK >gi|222159210|gb|ACAB01000149.1| GENE 2 2291 - 3607 1636 438 aa, chain - ## HITS:1 COG:SMc03163 KEGG:ns NR:ns ## COG: SMc03163 COG2115 # Protein_GI_number: 15966647 # Func_class: G Carbohydrate transport and metabolism # Function: Xylose isomerase # Organism: Sinorhizobium meliloti # 6 437 5 435 436 462 53.0 1e-130 MATKEFFPGIEKIKFEGKDSKNPMAFRYYDAEKVINGKKMKDWLRFAMAWWHTLCAEGGD QFGGGTKQFPWNSNADAIQAAKDKMDAGFEFMQKMGIEYYCFHDVDLVSEGASIEEYEAN LKAIVAYAKQKQAETGIKLLWGTANVFGHARYMNGAATNPDFDVVARAAVQIKNAIDATI ELGGQNYVFWGGREGYMSLLNTDQKREKEHLAKMLTIARDYARARGFKGTFLIEPKPMEP TKHQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELAVAVDNGMLGSIDAN RGDYQNGWDTDQFPIDNYELTQAMMQIIRNGGLGNGGTNFDAKTRRNSTDLEDIFIAHIA GMDAMARALESAAALLNESPYKKMLADRYASFDGGKGKEFEDGKLTLEDVVAYAKANGEP KQTSGKQELYEAILNMYC >gi|222159210|gb|ACAB01000149.1| GENE 3 3649 - 5172 1439 507 aa, chain - ## HITS:1 COG:CAC2612 KEGG:ns NR:ns ## COG: CAC2612 COG1070 # Protein_GI_number: 15895870 # Func_class: G Carbohydrate transport 
and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Clostridium acetobutylicum # 2 459 3 462 500 207 32.0 4e-53 MFLLGYDIGSSSVKASLVNAETGKCVSSAFFPKTEANIIAVNPGWAEQDPESWWENLKLS TQAIMTESGVSAAEIKAIGISYQMHGLVCVDKNQHVLRPAIIWCDSRAVPYGQKAFETIG EERCLSHLLNSPGNFTASKLAWIKENEPAIYEQIDKIMLPGDYIAMKLSGEICTTVSGLS EGMFWDFKNNRVADFLMDYYGFDSSLIADIKPTFAEQGRVNAIAAKELGLKEGTPITYRA GDQPNNALSLNVFNPGEIASTAGTSGVVYGVNGEVNYDPQSRVNTFAHVNHTIDQTRLGV LLCINGTGILNSWVKRNIAPEGISYNEMNVLASKAPIGSAGISILPFGNGAERMLNNKEI GCSIRGLDFNTHGKHHIIRAAQEGIVFSFKYGIDIMEQMGIPVKMIHAGHANMFLSSIFR DTLAGVTGATIELYDTDGSVGAAKGAGIGAGIYKDNNEAFATLDKLDVIEPNVAKRQEYA DACAKWKYRLEKSMTGNIPAPVLSSDK >gi|222159210|gb|ACAB01000149.1| GENE 4 5291 - 6022 775 243 aa, chain - ## HITS:1 COG:alr2484 KEGG:ns NR:ns ## COG: alr2484 COG1051 # Protein_GI_number: 17229976 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Nostoc sp. PCC 7120 # 9 242 17 241 248 89 29.0 4e-18 MQNLQKNTPLANNHISVDCVVIGFDGEQLKVLLVKRAGEDNGEVYHDMKLPGSLIYMDEA LDEAAQRVLYELTGLKNVNLMQFKAFGSKNRTSNPKDVRWLERAMQSKVERIVTIAYLSM VKIDRTLDKNLDDHQACWVALKDVKTLAFDHNLIIKEAMTYIRQFVEFNPSMLFELLPRK FTAAQLRTLFELVYDKAVDVRNFHKKIAMMEYVVPLEEKQQGVAHRAARYYKFDKKIYNK VRR >gi|222159210|gb|ACAB01000149.1| GENE 5 6364 - 7200 642 278 aa, chain + ## HITS:1 COG:CAC3095 KEGG:ns NR:ns ## COG: CAC3095 COG0351 # Protein_GI_number: 15896346 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase # Organism: Clostridium acetobutylicum # 7 262 7 258 265 239 52.0 3e-63 MERHPVILSIAGSDCSGGAGIQADIKTISALGGYAASAITAITVQNTLGVRAVQSISPDM VRGQIEAVMDDLQPVAIKIGMINDIQIVRVISDCLQKYSPAYVVYDPVMVSTSGKKLMTD EAIEEIKKELLPLVTLITPNIDEAKVLTGKSIHNIQDMQTAAKMLTDDYQTNILLKGGHL EGDNMCDLLHTSEFIYHIYEEKKIESHNLHGTGCTLSSAIATYLAKGYPMRESIQHAKTY ITQAIIAGKELNIGHGNGPLWHFPDTVAQMCTFCAVVS >gi|222159210|gb|ACAB01000149.1| GENE 6 7251 - 8138 1072 295 aa, chain + ## HITS:1 COG:CAC3575 KEGG:ns NR:ns ## COG: CAC3575 COG0331 # 
Protein_GI_number: 15896809 # Func_class: I Lipid transport and metabolism # Function: (acyl-carrier-protein) S-malonyltransferase # Organism: Clostridium acetobutylicum # 3 286 5 289 308 245 46.0 9e-65 MKAFVFPGQGAQFVGMGKDLYENSALAKELFEKANDILGYRITDIMFDGTDEDLRQTKVT QPAVFLHSVISALCMGDDFKPEMTAGHSLGEFSALVAAGALSFEDGLKLVYARAMAMQKA CEATPSTMAAIIALPDEKVEEICAAVNAEGEVCVPANYNCPGQIVISGSVPGIEKACELM KAAGAKRALPLKVGGAFHSPLMDPAKIELEAAIKATEIHTPKCPVYQNVDALPHTDPAEI KKNLVAQLTASVRWTQSVKNMVADGATDFTECGPGAVLQGLIKKIDGTVNAHGIA >gi|222159210|gb|ACAB01000149.1| GENE 7 8123 - 8320 73 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713286|ref|ZP_04543767.1| ## NR: gi|237713286|ref|ZP_04543767.1| predicted protein [Bacteroides sp. D1] # 1 65 1 65 65 123 100.0 5e-27 MIPIIKTKERFGKFAGKEKYQGVLQHVRLYSIYGFHIKKSVLELLPTRSISFKDEVCLTD IYAIP >gi|222159210|gb|ACAB01000149.1| GENE 8 8340 - 9470 966 376 aa, chain + ## HITS:1 COG:RC0599 KEGG:ns NR:ns ## COG: RC0599 COG0045 # Protein_GI_number: 15892522 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, beta subunit # Organism: Rickettsia conorii # 1 356 1 363 386 333 46.0 4e-91 MKIHEYQAKEIFSKYGIPVERHTLCRTAAGVLAAYRRMGTDRVVIKAQVLTGGRGKAGGV KLVNNTEDAYQEAMNILGMSIKGLPVNQVLVSEAVDIAAEYYVSYTIDRNTRSVVLMMSA SGGMDIEEVARQTPEKIIRYSINPFIGLPDYLARRFAFTLFPQMEQAGKMAAILQELYKI FVENDASLVEVNPLALTKKGTLMAIDAKIVFDDNALYRHPEIHALFDPTEEEKVEADAKD KGFSYVHMDGNIGCMVNGAGLAMATMDMTKLYGGQPANFLDIGGSSNPVKVIEAMKLLLQ DEKVKVVLINIFGGITRCDDVAMGLLQAFEQINSNVPVIVRLTGTNEHIGRELLRNYSRF QIATTMKEAALMALKA >gi|222159210|gb|ACAB01000149.1| GENE 9 9503 - 10363 916 286 aa, chain + ## HITS:1 COG:BS_sucD KEGG:ns NR:ns ## COG: BS_sucD COG0074 # Protein_GI_number: 16078673 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, alpha subunit # Organism: Bacillus subtilis # 1 286 1 286 300 333 59.0 2e-91 MSILIDKSTRLIVQGITGRDGLFHAKKMKEYGTKVVGGTSPGKGGTDVDGIPVFNTMYDA VEQTKANTSIIFVPARFAADAIMEAADAGIRVIVCIAEGIPTLDVIKAHQFAEQKGAMLI 
GPNCPGLISPGKSMVGILPGQVFLEGNVGVISRSGTLTYEIVYHLTANGMGQSTAIGIGG DPVVGLHFRQLLEMFQNDPETEAIVLIGEIGGNAEEQAAEYIRNNVTKPVVAFIAGQSAP PGKQMGHAGAIISGSSGSAKEKIESLEAAGIRVAQEPSDIPKLLKK >gi|222159210|gb|ACAB01000149.1| GENE 10 10469 - 11212 606 247 aa, chain - ## HITS:1 COG:no KEGG:BT_0786 NR:ns ## KEGG: BT_0786 # Name: not_defined # Def: putative integral membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 247 1 240 240 386 85.0 1e-106 MLNLIAVFLVCSGIGIAVHIAIDLTHRPQSMKIMNAVWILSALWGSYLALWAYNKFGRSS PMKMEDDGMKGMDMSGMKDMNMPDMKGMDMDMSMGEMRRPYWQSVALSALHCGAGCTLAD IIGEWFTNYVPVTVAGSQLVGNWVLDFVLALIIGVYFQFYAIREMEKISVGNALTRAFKA DFFSLLSWQVGMYGWMAIVYFVLFINEPLPKDTWIFWFMMQLAMLFGFFCAYPMNALLIK LGIKKGM >gi|222159210|gb|ACAB01000149.1| GENE 11 11349 - 11522 273 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160884539|ref|ZP_02065542.1| ## NR: gi|160884539|ref|ZP_02065542.1| hypothetical protein BACOVA_02524 [Bacteroides ovatus ATCC 8483] # 1 57 68 124 124 96 98.0 4e-19 MSRRRQLEHEVSVAQERIKKAAKDTPKNILKLWEQELVDLELELNNMVDDEEDNNED >gi|222159210|gb|ACAB01000149.1| GENE 12 11585 - 11830 301 81 aa, chain - ## HITS:1 COG:alr2311 KEGG:ns NR:ns ## COG: alr2311 COG0724 # Protein_GI_number: 17229803 # Func_class: R General function prediction only # Function: RNA-binding proteins (RRM domain) # Organism: Nostoc sp. 
PCC 7120 # 1 80 1 80 105 85 55.0 2e-17 MNLYIGNLSYNVKESDLRNVMEEYGTVASVKLITDRETRRSKGFAFIEMPDDAEANNAIK QLNGAEYVGRSMVVKEALPKN >gi|222159210|gb|ACAB01000149.1| GENE 13 12243 - 13745 1188 500 aa, chain - ## HITS:1 COG:MA3382 KEGG:ns NR:ns ## COG: MA3382 COG0174 # Protein_GI_number: 20092196 # Func_class: E Amino acid transport and metabolism # Function: Glutamine synthetase # Organism: Methanosarcina acetivorans str.C2A # 4 498 5 504 506 568 55.0 1e-162 MNQDLSMNANQLVAFLQKPTSEFTKADIISFIQQKDIRMVNFMYPAGDGRLKTLNFVINN QAYLEAILTCGERVDGSSLFSFIEAGSSDLYVIPRFCTAFVDPFAEIPTLSMLCSFFNKD GEPLESAPEHTLYKASKTFTEVTGMEFQAMGELEYYVIAPDTGMFQATDQRGYHESGPYA KFNEFRTQCMSYIAQTGGQIKYGHFEVGNFTLDGYIYEQNEIEFLPVPVSQAADQLMIAK WVIRNLGYRYGYNVTFAPKITAGKAGSGLHVHMRIMKDGKNQMLKDGVLSETARKAIAGM MVLAPSITAFGNTNPTSYFRLVPHQEAPTNVCWGDRNRSVLVRVPLGWAAKTDMCTLANP LEAESHFDTSQKQTVEMRSPDGSADLYQLLAGLAVACRHGFEIENALDIAEKTYVNVNIH QKENEDKLNSLAQLPDSCEASADCLQQQRAIFEQYNVFSPAMIDGIIRKLRSYEDKTLRA DLEGKPMEMLDLVHKYFHCG >gi|222159210|gb|ACAB01000149.1| GENE 14 13813 - 14619 619 268 aa, chain - ## HITS:1 COG:no KEGG:BT_0971 NR:ns ## KEGG: BT_0971 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 266 1 265 265 427 75.0 1e-118 MKYLLIYIGLLCTSLQLWGQETVATRIAPPTGYVREACADRSFTGYLRNLPLMPKGSKVM LYNGKEKSNQSAAYAVIDMEIGNRDLQQCADAVMRLRAEYLWKHKRYGEIKFNFTNGFPA GYKKWAEGNRIKVSGNQVQWYAAGKGVDYSYKTFRNYLDMVFMYAGTASLSRELQAVSYT SLQPGDVFIKGGSPGHAVIVVDVAVHPTTKKKVFLLAQSYMPAQQIHILVNPVSRSLSPW YELAETDAGKLYTPEWIFSRKDLKRFKE >gi|222159210|gb|ACAB01000149.1| GENE 15 14744 - 15443 533 233 aa, chain + ## HITS:1 COG:no KEGG:BT_0781 NR:ns ## KEGG: BT_0781 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 233 1 233 240 385 77.0 1e-106 MDIQTFIQNFKEAFGENAELPLVFWYSDILEGTTEKINGCFFKGMKTVREGGTISLNAEN IGCGGGKFYTGFTEMPERVPTFVSLKERYKQTPEMVIEFIRQLGVPMAEKKYLHFARIDK VASLEQMEGVMFIANPDMLSGLTTWAYYDNNAEDGVVSLFGSGCSSIVTQATLENRKGGK RTFIGFFDPSVRPYFEADKLSYTIPMSRFREMCGTMRQSCLFDTHAWRKIRER 
Prediction of potential genes in microbial genomes Time: Wed May 18 04:25:25 2011 Seq name: gi|222159209|gb|ACAB01000150.1| Bacteroides sp. D1 cont1.150, whole genome shotgun sequence Length of sequence - 84399 bp Number of predicted genes - 90, with homology - 87 Number of transcription units - 28, operones - 16 average op.length - 4.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 517 414 ## BF2797 hypothetical protein 2 1 Op 2 . + CDS 514 - 3039 1802 ## BF2799 hypothetical protein 3 1 Op 3 . + CDS 3036 - 3767 442 ## BF2800 hypothetical protein 4 1 Op 4 . + CDS 3788 - 4543 586 ## BF2802 hypothetical protein 5 1 Op 5 . + CDS 4562 - 5695 1040 ## BF2803 hypothetical protein 6 1 Op 6 . + CDS 5723 - 6337 478 ## BF2805 conjugate transposon protein TraK 7 1 Op 7 . + CDS 6349 - 6714 329 ## BF2806 hypothetical protein 8 1 Op 8 . + CDS 6711 - 7871 1209 ## BF2808 conjugate transposon protein TraM 9 1 Op 9 . + CDS 7896 - 8741 637 ## BF2811 conjugate transposon protein TraN 10 1 Op 10 . + CDS 8741 - 9247 210 ## BF2812 hypothetical protein 11 1 Op 11 . + CDS 9254 - 9925 376 ## BF2813 hypothetical protein 12 1 Op 12 . + CDS 9967 - 12147 1548 ## BF2815 putative mobilization protein - Term 11992 - 12025 -0.5 13 2 Tu 1 . - CDS 12207 - 15611 849 ## gi|237713116|ref|ZP_04543597.1| predicted protein - Prom 15721 - 15780 5.7 + Prom 16116 - 16175 5.2 14 3 Op 1 . + CDS 16365 - 16607 253 ## BF1159 hypothetical protein 15 3 Op 2 . + CDS 16620 - 18452 953 ## COG2189 Adenine specific DNA methylase Mod 16 3 Op 3 . + CDS 18449 - 19468 611 ## COG3943 Virulence protein 17 3 Op 4 . + CDS 19479 - 21839 1917 ## BF1156 hypothetical protein 18 3 Op 5 . + CDS 21845 - 22063 304 ## BDI_2136 hypothetical protein 19 3 Op 6 . + CDS 22076 - 24076 1692 ## COG0419 ATPase involved in DNA repair 20 3 Op 7 . + CDS 24084 - 25670 1130 ## BF1153 hypothetical protein 21 3 Op 8 . + CDS 25684 - 28932 771 ## COG1061 DNA or RNA helicases of superfamily II 22 3 Op 9 . 
+ CDS 29001 - 29357 227 ## gi|237722745|ref|ZP_04553226.1| predicted protein 23 3 Op 10 . + CDS 29359 - 29886 380 ## gi|237713126|ref|ZP_04543607.1| predicted protein + Term 29901 - 29926 -0.8 - Term 29841 - 29881 0.3 24 4 Tu 1 . - CDS 29888 - 30562 271 ## COG0739 Membrane proteins related to metalloendopeptidases - Prom 30730 - 30789 7.6 + Prom 30652 - 30711 7.9 25 5 Op 1 . + CDS 30783 - 31541 544 ## BF2874 hypothetical protein 26 5 Op 2 . + CDS 31563 - 31787 353 ## BF2875 hypothetical protein 27 5 Op 3 . + CDS 31824 - 33416 1340 ## BF2876 hypothetical protein 28 5 Op 4 . + CDS 33437 - 34609 656 ## BF2877 hypothetical protein 29 5 Op 5 . + CDS 34615 - 35100 326 ## BF2878 hypothetical protein 30 5 Op 6 . + CDS 35108 - 35770 667 ## BF2879 hypothetical protein 31 5 Op 7 . + CDS 35791 - 36471 428 ## BF2880 hypothetical protein 32 5 Op 8 . + CDS 36500 - 37156 543 ## BF2881 hypothetical protein 33 5 Op 9 . + CDS 37186 - 38016 702 ## COG0739 Membrane proteins related to metalloendopeptidases 34 5 Op 10 . + CDS 38085 - 38321 189 ## BF2883 hypothetical protein 35 5 Op 11 . + CDS 38325 - 40157 1488 ## BF2884 hypothetical protein 36 5 Op 12 . + CDS 40171 - 42171 1719 ## BF2885 putative DNA primase 37 5 Op 13 . + CDS 42184 - 43722 1254 ## COG1705 Muramidase (flagellum-specific) 38 5 Op 14 . + CDS 43722 - 43895 163 ## gi|237713141|ref|ZP_04543622.1| predicted protein + Prom 43897 - 43956 3.4 39 6 Op 1 . + CDS 43979 - 44590 614 ## BF2888 hypothetical protein 40 6 Op 2 . + CDS 44596 - 46308 1322 ## COG4227 Antirestriction protein + Term 46330 - 46383 13.5 - Term 46116 - 46154 5.4 41 7 Tu 1 . - CDS 46392 - 46556 115 ## - Prom 46796 - 46855 2.5 + Prom 46428 - 46487 2.9 42 8 Op 1 . + CDS 46509 - 47231 363 ## BF1093 hypothetical protein 43 8 Op 2 . + CDS 47228 - 48301 767 ## PRU_0377 hypothetical protein 44 8 Op 3 . + CDS 48351 - 49181 367 ## BF2872 hypothetical protein 45 9 Op 1 . - CDS 49219 - 49845 429 ## gi|237713147|ref|ZP_04543628.1| conserved hypothetical protein 46 9 Op 2 . 
- CDS 49854 - 50081 131 ## BF1096 hypothetical protein 47 9 Op 3 . - CDS 50075 - 50326 120 ## BF2891 hypothetical protein 48 9 Op 4 . - CDS 50338 - 50700 397 ## gi|237713150|ref|ZP_04543631.1| predicted protein 49 9 Op 5 . - CDS 50749 - 51081 221 ## BF2912 hypothetical protein 50 9 Op 6 . - CDS 51087 - 51398 344 ## gi|237713152|ref|ZP_04543633.1| conserved hypothetical protein + Prom 51406 - 51465 5.8 51 10 Tu 1 . + CDS 51649 - 51882 89 ## gi|237721693|ref|ZP_04552174.1| predicted protein + Term 52103 - 52138 -0.6 52 11 Tu 1 . - CDS 52236 - 52484 195 ## alr0590 transposase 53 12 Tu 1 . - CDS 52591 - 53127 251 ## BDI_3256 putative transposase - Prom 53179 - 53238 6.5 + Prom 53138 - 53197 4.3 54 13 Tu 1 . + CDS 53234 - 54034 18 ## COG0030 Dimethyladenosine transferase (rRNA methylation) - Term 54158 - 54204 9.0 55 14 Op 1 . - CDS 54226 - 55491 549 ## PGN_0050 hypothetical protein 56 14 Op 2 . - CDS 55488 - 55928 420 ## PGN_0048 hypothetical protein - Prom 55969 - 56028 2.1 57 15 Tu 1 . - CDS 56134 - 56619 449 ## BF2893 hypothetical protein - Term 57057 - 57118 6.0 58 16 Tu 1 . - CDS 57123 - 57407 324 ## BF2909 hypothetical protein - Prom 57434 - 57493 1.7 + Prom 57680 - 57739 5.3 59 17 Op 1 . + CDS 57917 - 58645 293 ## BF1093 hypothetical protein 60 17 Op 2 . + CDS 58642 - 59706 649 ## PRU_0377 hypothetical protein - Term 59716 - 59752 1.7 61 18 Op 1 . - CDS 59813 - 61309 707 ## BF2899 putative outer membrane protein 62 18 Op 2 . - CDS 61364 - 62827 872 ## BF2900 hypothetical protein - Prom 62918 - 62977 2.4 - Term 62964 - 63001 0.1 63 19 Op 1 . - CDS 63055 - 63903 745 ## BF2901 hypothetical protein - Prom 63926 - 63985 3.0 64 19 Op 2 . - CDS 63990 - 64178 117 ## gi|262409443|ref|ZP_06085985.1| LOW QUALITY PROTEIN: conserved hypothetical protein - Prom 64427 - 64486 3.3 65 20 Op 1 . - CDS 64505 - 64843 175 ## BF2903 hypothetical protein 66 20 Op 2 . - CDS 64882 - 65328 364 ## BF2904 hypothetical protein 67 20 Op 3 . 
- CDS 65358 - 66452 577 ## BF2905 hypothetical protein - Prom 66474 - 66533 10.3 + Prom 66507 - 66566 9.0 68 21 Op 1 . + CDS 66693 - 67319 506 ## BF2906 serine type site-specific recombinase 69 21 Op 2 . + CDS 67322 - 68080 232 ## gi|237713170|ref|ZP_04543651.1| predicted protein 70 21 Op 3 6/0.000 + CDS 68126 - 69430 600 ## COG1479 Uncharacterized conserved protein 71 21 Op 4 . + CDS 69432 - 71321 1031 ## COG1479 Uncharacterized conserved protein 72 21 Op 5 . + CDS 71332 - 71874 411 ## PROTEIN SUPPORTED gi|229885015|ref|ZP_04504470.1| acetyltransferase, ribosomal protein N-acetylase 73 21 Op 6 . + CDS 71876 - 72274 392 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 + Prom 72317 - 72376 2.8 74 22 Tu 1 . + CDS 72401 - 72505 112 ## - Term 72860 - 72913 3.2 75 23 Tu 1 . - CDS 72928 - 73308 448 ## BF2915 putative single strand binding protein - Prom 73346 - 73405 6.0 76 24 Tu 1 . - CDS 73416 - 73610 68 ## gi|237721665|ref|ZP_04552146.1| predicted protein - Term 74047 - 74077 2.7 77 25 Op 1 . - CDS 74100 - 74387 292 ## gi|237713176|ref|ZP_04543657.1| predicted protein 78 25 Op 2 . - CDS 74398 - 74733 170 ## BF2919 hypothetical protein 79 25 Op 3 . - CDS 74748 - 74927 255 ## gi|237713178|ref|ZP_04543659.1| conserved hypothetical protein 80 25 Op 4 . - CDS 74948 - 75241 295 ## BF2920 hypothetical protein 81 26 Op 1 . + CDS 75645 - 76166 -97 ## gi|237721659|ref|ZP_04552140.1| predicted protein 82 26 Op 2 . + CDS 76193 - 76996 650 ## COG0566 rRNA methylases 83 27 Op 1 . - CDS 77050 - 77526 187 ## BVU_3777 arsenate reductase 84 27 Op 2 . - CDS 77547 - 78587 790 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases - Prom 78781 - 78840 6.8 + Prom 78726 - 78785 7.6 85 28 Op 1 . + CDS 78837 - 81614 1519 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits 86 28 Op 2 . + CDS 81616 - 81795 87 ## 87 28 Op 3 . + CDS 81843 - 82073 140 ## gi|260171374|ref|ZP_05757786.1| hypothetical protein BacD2_05860 88 28 Op 4 . 
+ CDS 82080 - 83156 373 ## PROTEIN SUPPORTED gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 89 28 Op 5 . + CDS 83228 - 83473 61 ## gi|237713186|ref|ZP_04543667.1| predicted protein 90 28 Op 6 . + CDS 83470 - 84397 286 ## gi|262409419|ref|ZP_06085961.1| ATPase Predicted protein(s) >gi|222159209|gb|ACAB01000150.1| GENE 1 2 - 517 414 171 aa, chain + ## HITS:1 COG:no KEGG:BF2797 NR:ns ## KEGG: BF2797 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 163 735 897 903 304 92.0 9e-82 KTARKFWASVGVVTQEIQDIIGSEIVKEAIINNSDVVMLLDQSKFRERFDTIKAILGLTD VDCKKIFTINRLENKEGRSFFREVFIRRGTTSGVYGVEEPHECYMTYTTERAEKEALKLY KAELQCSHKEAIERYCADWDASGISKSLPFAQKVNAAGKVLNLHQPKTKQQ >gi|222159209|gb|ACAB01000150.1| GENE 2 514 - 3039 1802 841 aa, chain + ## HITS:1 COG:no KEGG:BF2799 NR:ns ## KEGG: BF2799 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 836 1 836 843 1258 72.0 0 MKRFALIIMTIATIFAQQVHAQYYSVNFDAKTVAAMVAAYGTGTVAEAYYDEQVQNILKH YKASEVATAGIFASKFLERKALTDLGIWSSATENYYYRRIYHLVAHKIIPKTWIVAKLML RSPQTALHWCSYLMKVCDDTKSLCMQFESVVTNSTLSFSDIAFLEINEEIASILKLSEIG NVDWQRLFDDISKIPGNFTKENLKADLDKLYRMGAGLATSGITNIGNALLQQNSFHELMN GKVSKVIDIYEHYNGLFEQLENNAGQTILGIVGSEENVAGLFKFSDYNLTSWITDYLDET AGNYYTQRWYIARRDRGSVSLCDYYPPTDDKSILEGGEWTRFKTSYPNFYPNASQKEQAL SNSERYAGWSRNRVQQLNNANDGFSYTINYYQSSYIIKKGKKQTKKAYAYEIHVTKSWNN EEIVYEDVFDSYSMDLNTFKAQLNAHLSEFNDNEDGYVYYIASDSRNYYQATNEEKLKGC ETVTISVTCSDGATLGQGTTQYKCRKCGSTLNAHSKECAMKTTAESEELDLSELDAMQRE ADNKVYALQSQISALETENESLIKQIANASVEDAVPLRQKYNENKAEIDRLKPELAAWQQ KQEEIAQAKEEAKSDNDVQTDDYYRIPAIMQDCKTAYNITWQDNGSWNGYTYVRKGTIPN INGIITFKATLKIARKPKYFLGIKIHRAILQISWDLATEYSDTHVVEVITLDPDMSDKEK TKLVNDRIAEIAKEYPKCKITTEYARTAPQEDTKTNDMYHLLWSSDRLEIAREVDSRLTK IYADLVSLEKMMHYRRSIIDVLKDVLPPVDDDQGRRLSLVEECHDRWMENARNRKTKEGK R >gi|222159209|gb|ACAB01000150.1| GENE 3 3036 - 3767 442 243 aa, chain + ## HITS:1 COG:no KEGG:BF2800 NR:ns ## KEGG: BF2800 # Name: not_defined # Def: hypothetical protein # Organism: 
B.fragilis # Pathway: not_defined # 18 243 16 241 241 326 74.0 4e-88 MKRGRTYILGLLIMLLAPLSANAQWSFDVGTVEAYINDHKQQRSLLLARSTLEHSNKLLH EYSRKEMVGYKELNVDLDKYTRAFDVIDVMYQSLRTALNVHSTYKGVSERISDYKAMLED FSEKVIKRKHIELADTLILSINAGAIRMIAQEGEYLYKSVSDLVLYATGAAACSTSDLLM VLNSINCSLDNIEKHLNRAYFETWRYIQVRIGYWKEKVYRTKTKRELIDDAFGRWRQSGK QDY >gi|222159209|gb|ACAB01000150.1| GENE 4 3788 - 4543 586 251 aa, chain + ## HITS:1 COG:no KEGG:BF2802 NR:ns ## KEGG: BF2802 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 4 236 6 238 256 296 65.0 4e-79 MKWIFTTMLFVATIAANAQYVTYNHDSPKMNQITVEETGAGALKPELYYTLLHNKYKKTA AVKNKLTFRTAAGVASYQQVDEAEAIDSALTSRAKIEALNVADRQVDLAWLAEKDKIESQ MRQFQKNIDRIMMTGGSPKEKERWNDYYRVYQCAIKATRDAYMPNAQRKKEYLQIYADVS RQNDVLVKYLVQLSNRNATKNLLSATENRQIDKRSIISNAMSRWNESRSAVRGSQNDDGS EDGDGNESVNR >gi|222159209|gb|ACAB01000150.1| GENE 5 4562 - 5695 1040 377 aa, chain + ## HITS:1 COG:no KEGG:BF2803 NR:ns ## KEGG: BF2803 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 377 1 377 377 687 91.0 0 MASGDILSDFGINLLEEEIDDVIFQTNEFLTDATFTGSQGPFWWILQMCMALAALFAIVM AAGMAYKMMVKNEPLDIMKLFKPLAISIILCWWYPPADTGMVNSGSNWCVLDFLSYIPNC IGSYTHDLYQAEATQIEDKFQEVQQLIYARDTMYTSLQAQADVAHSGTSDPNLIEATMEQ TGVDEVTNMEKDAAKLWFTSLTSGIVVGIDKIVMLIALIVFRIGWWATIYCQQILLGMLT IFGPIQWAFSLLPKWEGAWAKWLTRYLTVHFYGAMLYFVGFYVLLLFDIVLCIQVENLTA ITVSEQTMAAYLQNSFFSAGYLMAASIVALKCLNLVPDLAAWMIPEGDTAFSTRNFGEGV AQQAKMTATGAIGTMMR >gi|222159209|gb|ACAB01000150.1| GENE 6 5723 - 6337 478 204 aa, chain + ## HITS:1 COG:no KEGG:BF2805 NR:ns ## KEGG: BF2805 # Name: not_defined # Def: conjugate transposon protein TraK # Organism: B.fragilis # Pathway: not_defined # 1 204 1 204 204 381 91.0 1e-105 MVIRNLENKIKLVGIICSFFLVGCIIISVSSIWTARTMVTDAQKKVYVLDGNVPILVNRT TMDETLDVEAKSHVEMFHHFFFTLPPDDKYIKYTMEKAMYLVDETGLAQYNTLKEKGFYS NILGTSAVFSIYCDSVRFDKSNMEFTYYGRQRIERRSNILMRELVTAGQLKRVPRTENNP HGLLIVNWRTLLNKDIEQKTKATY >gi|222159209|gb|ACAB01000150.1| GENE 7 6349 - 6714 329 121 aa, chain + 
## HITS:1 COG:no KEGG:BF2806 NR:ns ## KEGG: BF2806 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 106 1 106 123 147 70.0 1e-34 MNTKGFRRMLFGEKMPDKNDPQYKERYERDVNAGRKFAQATRLDKLAAKIQGFANRNKIL FLVMVFGFVIGTFTFNIYRLTKAYRHGQEIRSATEMQDSLLRERHKEPVAPIVEPTQTRQ Q >gi|222159209|gb|ACAB01000150.1| GENE 8 6711 - 7871 1209 386 aa, chain + ## HITS:1 COG:no KEGG:BF2808 NR:ns ## KEGG: BF2808 # Name: not_defined # Def: conjugate transposon protein TraM # Organism: B.fragilis # Pathway: not_defined # 3 384 11 390 390 480 69.0 1e-134 MKRIFEKINFRQPKYMLPAILYIPLLGASYFIFDLFNAEKAEIADKTLQTTEFLNPDLPD ATIKGGDGIGSKYENMAKSWGKIADYTAVDNIERDEPDDDKEKYESQYSQDEVAFLGELE QQKLQAEDAADAKAREQEALKELERALAEARLKAQAQVEPVEPLAEDSLEQRTETGVSAS GTINEDSRAVNAPSENEEPGEMIKKVKVTSDYFNTIAQNEHEPNLIKAIIDEDVKAVNGS RVRLRLLDDVEINETVVKSGSYLYAIMSGFSSQRVKGTINSVLVNDEIIKVSLSLYDTDG LEGLYVPSSSFRETAKDVASGAFNSNMNMNTGSNGNSLTQWGMQAIQDAYQKTSNAISKN IKKNKVNLKYGTFVYIINSKEKRNAK >gi|222159209|gb|ACAB01000150.1| GENE 9 7896 - 8741 637 281 aa, chain + ## HITS:1 COG:no KEGG:BF2811 NR:ns ## KEGG: BF2811 # Name: not_defined # Def: conjugate transposon protein TraN # Organism: B.fragilis # Pathway: not_defined # 9 281 8 281 281 418 81.0 1e-115 MKIIKTICIGVTAALSSLPTFAQQTYEELEQLTVNEQVTTVITASEPIRFVDISTDKVVG DQPINNTIRLKPKETGYEDGEILAIVTIVTERYRTQYALLYTTRMQEAVTDKEIEFRERN AYNNPAVSMSQADMTQFARRIWNSPAKYRNVRSREHRMTMRLNNIYAVGDYFFIDFSIEN KTNIRFDIDEIRVKLADKKLSKATNAQIIELAPALVLEQAKSFRHGYRNVIVLKKMTFPN DKVLTIEMTEKQISGRNIHLNIDYEDVLSADSFNRVLLEEE >gi|222159209|gb|ACAB01000150.1| GENE 10 8741 - 9247 210 168 aa, chain + ## HITS:1 COG:no KEGG:BF2812 NR:ns ## KEGG: BF2812 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 167 3 169 170 250 71.0 2e-65 MKKILLVLVTVMTFAMNGKAQSNSDRLSLGVGCLYQNGLDITLSYEHEGKYHHTWEFFAN GYLKWDECASCGHICPESFWRNYRSYGFGVAYKPCVARGRNHYGNVRIGASAGSDTKRFL GGIHLGYEHNYVLRGGWRLFCQVKTDLMIKGEDLFRTGIVLGVKLPLN >gi|222159209|gb|ACAB01000150.1| GENE 11 9254 - 9925 376 223 aa, 
chain + ## HITS:1 COG:no KEGG:BF2813 NR:ns ## KEGG: BF2813 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 26 223 21 218 218 281 65.0 2e-74 MKKKLYINACRLFSLSAIVMLFVACDAHRDFPDTAMKPCHILCTDGKVLSVSDFKQSEKQ PIAVVFHVNHDEAIEGNGYAVYLWDLAPEAFADSIGVNQRTSTDITALDGNENTFAIYDT RETTSPMAEAVFALWRYGQSSYIPSVAQMRMLYNAKSQINPIIRMCGGDELPDAADDCWY WTSTEVAGQEKAKAWLYSMGSGSMQETPKTQAHKVRAIITLRD >gi|222159209|gb|ACAB01000150.1| GENE 12 9967 - 12147 1548 726 aa, chain + ## HITS:1 COG:no KEGG:BF2815 NR:ns ## KEGG: BF2815 # Name: not_defined # Def: putative mobilization protein # Organism: B.fragilis # Pathway: not_defined # 1 724 1 725 728 1275 87.0 0 MEESKELQGFYKIFRSIIYVSILLEFFEYAIDPAMLDHWGGILCDIHGRIKRWVIYNDGN LAYSKLATFLLICITCIGTRNKKKLEFNARKQVLYPIIIGMGLVVLSVWLFGYPMETRLY TLRLNIWLYMLASIIGVVLVHIALDNISKFLKEGLLKDRFNFENESFEQCRELQENKYSV NIPMRYYYKGKFRKGWVNISNPFRGTWVVGTPGSGKTFSIIEPFIRQHSAKGFALVVYDY KFPTLATKLYYHYKKNQKLGNLPKGCKFNIINFVDVEYSRRVNPIQLKYINNLAAASETA ETLLESLQKGKKEGGGGSDQFFQTSAVNFLAACIYFFCNWGKEPYDKDGNMLTAEKVQDK QTKRMIPTGRVFNSAGEEVEPAYWLGKYSDMPHILSFLNESYQTIFEVLETDNEVTPLLG PFQTALKNKAMEQLEGMIGTLRVYTSRLATKESYWIFHKDGDDFDLKVSDPKSPSYLLIA NDPEMESIIGALNALILNRLVTRVNTGQGKNIPVSIIVDELPTLYFHKIDRLIGTARSNK VSVALGFQELPQLEADYGKVGMQKIITTVGNVVSGSARSKETLEWLSSDIFGKVVQLKKG VTIDRDKTSINLNENMDSLVPASKISDMPTGWICGQTARDFVKTKTGSGGSMNIQESEEF KTSKFYCKTDFDMKEIKKEEAGYVPLPKFYTFKSRDERERILYKNFVQVGEDVKEMIQEI QKYKVK >gi|222159209|gb|ACAB01000150.1| GENE 13 12207 - 15611 849 1134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713116|ref|ZP_04543597.1| ## NR: gi|237713116|ref|ZP_04543597.1| predicted protein [Bacteroides sp. 
D1] # 1 1134 163 1296 1296 2174 99.0 0 MTFLLEYDEKKGGCLNVGELFEYMIVRSIQTAIDVYPNTTNNESVKVLIQRVLEKVAFIM EISRKDQISKDELYTVLDEVEGNMTQMLIANFDLLFFESRILKETNGILQFENTELQEYL AAKELCRQDNIESVLYDVAVHKELKRIYINWYDVIPHISYAEDKIHTFVNVLKLIVAYES NLENEPFETLLKYIDSSLLSLQQKEELFFIIFDHYQRVPVYINWRGAICGLLQECYTSSC NNKIILAVERLNKIQITNICTILEVIAEDDKLSKEVFDHWNNAANILMQTGDEEKQLVAL SLYNALKNEDELVKLSGSYSRFTERVKEKYCEVTGYGKFTNNNVVDCWLDSCYASDSYAI NAVLFIMDGPTIIYAYNKITEANKLGDFFNHNRTLHIHYEWHLTKQFHIAWDIDLESKKL MTKVIVSFINNHLYMSYTEINPIIKQILLEKTTGSLFIDSFDRVWDLEDLFRHFDAEIVD AELISSLDKLLHEIKIGEWHIDNILTALISKIRNDEVKSISISEYIKRYAETFERWDENS KKKVESQTNSHEYQRLIDSYQILSKLNVHEFYKYEAAFEVSKNVEFIQQQEFIQPFVDVI AKFFTELNLDEMTLKKTSENSFNLSTSLIKIPYFVNAIYRLGFRKVLEENRTVLIKTLPV VCCTLNAGANEIKEIYKSIICSISDDEKKQLVEWWKSREDDFMNISTEDVLTCITDYGID ALSYKVEEYIEQYKANQDFSHSIAASKALALVSEGYCNWNINNYRALFDSLADDSIGSIK MQCNDIMIEKYQDPEAMTWRIEYLKNHVVKSLCRKTGYVYAVSKEESEMTHPQMFKCFMS IKNNERLIEQMFDLFDFGLSLCITSETQEYSSYLLNQICFFFINTNNIFYILELRKKVAY FNDTNASFLVNNIMDNAEIMYLKKEKITIGKAIKQYNKCVKESHLEIRNDNDLRRYFNYI YSEVQKEIQDQGIYSLVRQEALSEDFIQRELKNTVINKCCQMGLEAVRVDREVTLQDNKR TDLLIRYGLCNPIMVELKLLHNKELKNKKERQEYKKKFIQYTNATGACLSVFWIFDVHKK GGNCSNFDDLKNEYTDLPHTLVLLTDCKCSSCIDTGVSKNGTRPKIKKTQNRKK >gi|222159209|gb|ACAB01000150.1| GENE 14 16365 - 16607 253 80 aa, chain + ## HITS:1 COG:no KEGG:BF1159 NR:ns ## KEGG: BF1159 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 80 1 80 80 120 95.0 2e-26 MEETKDINRIKVVLVEKKRTNKWLAEQLGKDPATVSKWCTNTSQPGLETLLQIANVLNVD VKELLNSSLEESIQTVRLVK >gi|222159209|gb|ACAB01000150.1| GENE 15 16620 - 18452 953 610 aa, chain + ## HITS:1 COG:XF1968 KEGG:ns NR:ns ## COG: XF1968 COG2189 # Protein_GI_number: 15838562 # Func_class: L Replication, recombination and repair # Function: Adenine specific DNA methylase Mod # Organism: Xylella fastidiosa 9a5c # 12 464 4 441 534 273 37.0 1e-72 MNKTELYNKIVQLDGLTNEEKSELLGLLRKQKKYGLVWEDKPEDVEERLRDELPVLIEDA MKAIISDDANAPNHILIEGDNLEALTSLAYTHEGKIDVIYIDPPYNTGNKDFVYNDCIVD 
KEDSYRHSKWLSFMNKRLRIAKQLLSDRGVIFISIDDNEQAQLKLLCDEVFGNNNFITNI IWQSTAGSNTGNEIVTTTEYVLVYTANRSKAHFDGQPPSDDTFKLTDEHIKERGHYTLDK LDRRRVGGHYSEALNFPIKMPDGTLRYPGGGQQQNEGWNYLWSKTKVQWGLDNDFIVFKK NKNEWAVYCKRYQKVDNSNRQVDRTTPYRNLITSDLFNTAQGTAEIANIFAIRPFAFPKP STFVKFLLSSAVVTSPNAMVLDFFAGSGTTLHATMQLNAEDGGHRKCILVTNNENNICEE VTYERNKRVINGYTTPKGEEVTGLKNNTLRYYRTSFVGRSRSMKNMRQLMNLSTDMLCIK EDLYTEQTKFGEQPTYKNVFRYFDNGKKRMMIIYKEEAVPLLADLIRKTDYEGKMRVYVF SPSEDPWEGEFEEVQDKVQLCALPQAIYNAYKRILPKKKDITLEQENDAASNDDKAFNGM LNFDEKEDEQ >gi|222159209|gb|ACAB01000150.1| GENE 16 18449 - 19468 611 339 aa, chain + ## HITS:1 COG:NMA1039 KEGG:ns NR:ns ## COG: NMA1039 COG3943 # Protein_GI_number: 15793995 # Func_class: R General function prediction only # Function: Virulence protein # Organism: Neisseria meningitidis Z2491 # 9 335 2 322 336 298 49.0 1e-80 MNEQQDKGNIILYQTPDGQSKIEVTLSGDTVWLTADQMAELFQRNKSTISRHIKNVFEEG ELQADMVVAFFATTTQHGAIIGKQQTHQTAFYNLDMIISIGYRVKSHRGVQFRIWATNVL REYIVKGFALNDELLKRAGGGNYFDELLARIRDIRSSEKVFYRKVLAIYALSIDYDPRAE ATQLFFKTVQNKMHFSAHGHTAAEVIYERADAEKDFMGLTSWTGAMPKRTDAEVAKNYLS ADELDTLNRIVSLYLDFAELQAKSHQPMYMKDWIQKLDDFLKLSGKELLTHAGRISAEIA KQKADMEYDKFKERTMYELSPVEIHFLENFEKEQKKLKE >gi|222159209|gb|ACAB01000150.1| GENE 17 19479 - 21839 1917 786 aa, chain + ## HITS:1 COG:no KEGG:BF1156 NR:ns ## KEGG: BF1156 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 786 1 787 787 1383 88.0 0 MKNLAYQQKAVTELVDKTIRLLNLGGQRNKLVFEATTGAGKTVMACQMLAGLMDELHDRG DSRYQEVAFIWFAPRKLHIQSYEKLKGAFEETRTLRPVMFDELDQNEGIRPGEILFVNWE SVNKENNVMVREGDCSLSLYEITDRTKEEFGLPIVAIIDEEHMFWSKTADKSAAVLDRIN PTVEIRISATPKTANPKEKVTVYRQDVIAAEMIKKEVVLNPEIELNFSDELELNANLIKA ALDKRNQIAEAYKAVGSNVNPLLLIQLPNDTKESMTAEDTAIADQVKKYLEVMCGITTDN HRLAVWLAGEKENLTDLEKPDNLTEVLLFKEAIALGWDCPRAAVLLIFRKLQSDQFTIQT VGRIMRMPEQKHYQKEILNSGYVFTDIAKDKIQIVTADAGYILNNTITAHRRENLKNVNL PSSYTERPNVERNYLGPDFRKVLHEEARRFWDFVEGNLLFSLEELAKLDNDEESNTLPDS DDLQINENRKKVANSLRLDVKNINIEIPQDVHFQNEEQVLEVDTVKYARKATEIDRVFMA 
YIATKGHQFESKGRTDKIASYLLEILADFFGIYETEAKKVVLYHSNRPKFDRIIDSALER YARIRDKARKESAAKRVFRKYGWEVPEERTYDNETSHIEETGNHALLPFVQLNQASNPEK DFVAYLEQNSQYIDWWYKNGDKGKQHYAIEYTTGDEQAKSLFYVDFVIRMKNGHIYLFDT KSIGSDVFASDKHNALLQYIKENTTEEQPLYGGVILRKDENWLFSPLPIKNTTDTLNWNC FYPQNA >gi|222159209|gb|ACAB01000150.1| GENE 18 21845 - 22063 304 72 aa, chain + ## HITS:1 COG:no KEGG:BDI_2136 NR:ns ## KEGG: BDI_2136 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 72 1 72 72 116 93.0 2e-25 MAKGIDPKFVHYGIRKDDLAMIEAICEKEEIDFDWLSEDILKAYHAKKVDVIEMSDNDTE EIIRNAIQKIRQ >gi|222159209|gb|ACAB01000150.1| GENE 19 22076 - 24076 1692 666 aa, chain + ## HITS:1 COG:alr4919 KEGG:ns NR:ns ## COG: alr4919 COG0419 # Protein_GI_number: 17232411 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Nostoc sp. PCC 7120 # 24 660 28 660 662 95 22.0 3e-19 MLIQKIKISNYKTYLALDLDLTVDDDRPIILIGGSNGGGKTTLFEAISGALYGLKIESKE HFIELLNQGAINTAKPEISLQITFVGKVLGQQQKYILKRTYVLNPQGRPLESVSLNMNGN MYVYGTMTAAKDRLRAEQEINKIIKANLPQELSQYFLFDAMQSSELLKKNVFQQTIRDNF ENVLGFKKYLQLKHASEKLQQEWAQQRLEAEKEAKEYNALCGQKEQLISELNECLAEQDR LYKYMTSVEEEYQRAKEGAQQTANLNKRIQDIDGKIKNITDEAANYAEVLKSFVENIEIN VFLPKLASNLSQEINNLLRVKEQLQKQNTGAYPIETLRDVTSKIIEYLKELSLCSHAVDE ENVVAHMVALQNATNTEDPYDYLDDSEVTALRELPGKGGHNQFVSIDRKRQELEVQLANI DNLKSEKRTLEQTRSGGNSMLIANYEEAKHNIEKQKSLEGNLKGEIQRLEKRIHQYDVQI QQEPDLKYDTLMKLRPFFDKVADMLLKKKKAQIETEMQEQLNKLLVSYKGHVGRVELSDS MEKFNIKLYHTAGNEISLNQLNAASKQIFIQVLLKVLRNLGDYNPPVMIDTVMGVLDNES RDALMEEYFPQLAEQTILLCTTSEIRTDSDYIKLEPFISKTYTLHRNVEEQRTTIDEGYF GITLNQ >gi|222159209|gb|ACAB01000150.1| GENE 20 24084 - 25670 1130 528 aa, chain + ## HITS:1 COG:no KEGG:BF1153 NR:ns ## KEGG: BF1153 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 528 1 528 528 939 90.0 0 MSIINFSNTRQMEGLLDEISPVKDLMENKINLSRIVDREKRVSLTLNKVMRICFIVGLSI PAERKEEEFKDIQLSTSSARIVPSFFTMHDLSTLYAALLKLRYSSLNIDWTQNATLSRII 
AAEMLRGRDYLIKENNLDTYLYSMGSKVAMTKDIPVLNLLVGNYGDEEMEANLDINSRNI TNSQIIIAGATGSGKTNLLAVLMQQFRALSSESQYPVNFLLFDYKGEFSDIQNNHWLSHF DVDRSCILDPIEHPLPFTPFKDFTGRPINEINLYSSEMASALCSIDRVSASANMNNRLSE AIVEAYKKTSGSPITFTMMLKEYQAKMTDPAKDDSISSVLKQLVRANIFEEEDKADLIGD CYIIKMDGYPKDGPIAKAIVYFLISKLNNIYEQLEKQATNDDVVQIRHFSIIDEAHYMLD FDNRPLRNLIAVGRNKGLSIILATQNMSSFKSKGFDFYANAQYPLIMKQQTIDDKVIKDL FGVSGNELQEIRTAIAGLQKGELIIKDQMAFALSMGKKYKKINVTHLI >gi|222159209|gb|ACAB01000150.1| GENE 21 25684 - 28932 771 1082 aa, chain + ## HITS:1 COG:SPy0669 KEGG:ns NR:ns ## COG: SPy0669 COG1061 # Protein_GI_number: 15674735 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA or RNA helicases of superfamily II # Organism: Streptococcus pyogenes M1 GAS # 148 448 24 322 527 67 21.0 2e-10 MIKLPFEIIQLLWSCGNNRVEHLNGFATEFSEATFDIVSTSPFVIRANNTEILLTRDEND HSIDGYQYVVLTNKIPRKSEFLRGNILSIRWIKHPKIDTLRPEEITASWRGKFLYKKENI QESILGLRAPQLGAIYAFMSKAQIHHGRNIIVMPTGTGKTETMLSILIANQCAKVLVTVP SDALRDQLANKFFTLGVLHKFGIVTSDCSYPNVGIVKEGMDYNGWYTLIEKSNVIITTMP LITNCESEIINLLKENISHLFVDEAHHSEAKTWSEFIDSFDDEKVTLFTATPFRNDSRRL KGDFIYNFSLKDAQSQGYYKPIKFIPIREYDLIQADKHIAEVAVKQLRSDMENGYDHILM ARCSTKQRAKEVMEIYQQYSDLNPVMVFSSMPNQKSILNNIKDKKHKIIVCVNMLGEGFD MPELKIAAIHDAKQSLPITLQFIGRFTRTSYDSNLGNASFITNIAYPPILEELEELYAKD ADWNSLLPLLNDGTTEQEKDFNQFIQSFNNLELSKIPFQSINIPLSAVIYRTSNTWNPKA WENILSPEIYDYRFGSSTTDGDTLVVIGGSIENVDWGNVECVQNLLWNILVIHRYCTPKY NHAYVYSTLSETDEIVKAIFGEDDTCKISGHCVFRVFHDIRRFAIVNFGGRKARMGNVSF KSYYGKDVQEGISMTEQKQLTANNLFGNGYRRGERVSVGCSVKGKIWSHTRGNLLEYTKW ARMIGKLVENEKIDPNSFINNTLRVQAISELPQIAPIAIDWDPELYRDYPEHGIFLSHNL VDYQLWYATIELADYEIKSTITFNILIEDSKFTYIIEYSEQDSRPIYRVKQIRGNKLKMR YGTRLYDDICDYFNNDDAAPVIYFADGSCLYANNLVRVNSDIIPFPQEKLIGIDWADTKI ENESQHVVPYETDSIQYYFSRYISEKFDLIYDDDGSGEIADLIGFKNENNAIHMHLFHLK YAHGGKISNMISNFYEVCGQAQKCLKWNDRDKSRQLFNRLFARKIKKYQGRECSRILKGT EEELEQLSSQVNWKKDLIMHINIVQPGLSCSNPSPDILNLLGCVSSYIKDVSNIDLNVYC NL >gi|222159209|gb|ACAB01000150.1| GENE 22 29001 - 29357 227 118 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|237722745|ref|ZP_04553226.1| ## NR: gi|237722745|ref|ZP_04553226.1| predicted protein [Bacteroides sp. 2_2_4] # 1 118 15 132 132 240 100.0 3e-62 MIDHVKSIQTLGCAYWKKSRNKKLEVGDICYLYLIGKGHYQVRYRLEVVDTSCVREDSSC WITPFQADENCYKLIPTFTMYEGKELSLDTLEEIGINRHTQFKELNEFQEDFLNKYFE >gi|222159209|gb|ACAB01000150.1| GENE 23 29359 - 29886 380 175 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713126|ref|ZP_04543607.1| ## NR: gi|237713126|ref|ZP_04543607.1| predicted protein [Bacteroides sp. D1] # 1 175 4 178 178 347 100.0 2e-94 MSMPNDTRPQIINVTRKPSKCPVCGSEVVDIVYGTGDMTEMDFMLEYRKTAIMGGDNIPL RPPIWCCSCGCKRFRKVNEDGTDAPVKVKMLKNIRKAPVSKIIWTSQMTERALENDCISV IHQYQLEITTELDEHETLKVSAVSGSDAEDLAMELVTKGMIGLKGRKCVKIDTHV >gi|222159209|gb|ACAB01000150.1| GENE 24 29888 - 30562 271 224 aa, chain - ## HITS:1 COG:mll8577 KEGG:ns NR:ns ## COG: mll8577 COG0739 # Protein_GI_number: 13477076 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Mesorhizobium loti # 67 203 284 420 434 102 42.0 4e-22 MSNTRIVIALLLLLSGNHVKAQFNTVSCPGNHYKVAVETPSRTSNGTEATPESITSENEK SVPDKVALSSADSKKKEQVARYLSVCYPLSYVKINSPYGYRKDPFTGKRKFHNGIDLHAR SAKVFAMMQGRVLKVGQDKVSGKYVTLQHGNFTVSYCHLSQVSVSKGKIVKAGEVVGITG STGRSTGEHLHLTIRHKGDYINPCIVLDYIQSQSNKWDAVPSFG >gi|222159209|gb|ACAB01000150.1| GENE 25 30783 - 31541 544 252 aa, chain + ## HITS:1 COG:no KEGG:BF2874 NR:ns ## KEGG: BF2874 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 252 1 252 252 357 67.0 2e-97 MKQVRFEFEELPYETLSQFGLTQEMIEDLPMRILEELSHGRHSPVLPIKVNDDNGNTVTD HSRFAFVRKENGEADVVFYPVLKKSSLEKYSEEQQKQLRSGKVILADTVTADGRKTKAFV QIDTETNQVMSVPTQVIARNLQVMAEELKLSNAEIKVMQQGEPLTFVMQDEPVTVGISLH EKDGIRFCQGDAKKWNEQTKREWDKYTFGCYGCWMMDDDGNLDYIPEEQYTEELWNEQKK SAQRNMGAGLHK >gi|222159209|gb|ACAB01000150.1| GENE 26 31563 - 31787 353 74 aa, chain + ## HITS:1 COG:no KEGG:BF2875 NR:ns ## KEGG: BF2875 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: 
not_defined # 1 74 1 74 74 82 63.0 4e-15 MAQNTYYPEEVLIEKMERGEYGWLDYVNHFSAEWQEELVEYCKAHSLTIDDTAAEQFVHY KSEQLEAAMESGEA >gi|222159209|gb|ACAB01000150.1| GENE 27 31824 - 33416 1340 530 aa, chain + ## HITS:1 COG:no KEGG:BF2876 NR:ns ## KEGG: BF2876 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 4 529 13 521 522 702 69.0 0 MRPDLQPEMRALLMRNGMQAHVVPVGDSYQLVVQGHDSPLLRYPISRQQMQALTDWGTNH ANKKAYETFTDIVKEDFDMPRNFVHARNANGRVAMGLHGYRIGIGEYGRVGRYGMPPPFL GWTPRQQEGYHLRRVGGRLFYAGTPMVAERPDGRMKPGELQSGGYGFYYKGHRQPQEQVQ PFQQDVLQDLQAVITPMVNKPRSEEPAKPYKELITSPVYFSNEKWQECLSSHGIIVDTEA GTLTIQSEKVSADMVYDLTEEELSALTSNSIEEHPVEKRLEILNNVIQNDYSDKVTLESL NSKERISIALHPEVEQDLAMRNRQEQDLLLPLESDNSLAPHSTEQTVLQEDEHIIREPQE GALIDGRDLSYINENKGWYREGEHGREVEVEAIAVQPAETEGKYKMTAIINGEPITHEIS QKQYDKFLAVDDYHRMKLFSKIFNEVDMKTRPEMSSGLGTKIFAALAAGAVVLSDVAHGL HHPCPEFYGERFSSAPRPYFKPGVDSPMDVAARNFEAQMNKEVAEMRRGF >gi|222159209|gb|ACAB01000150.1| GENE 28 33437 - 34609 656 390 aa, chain + ## HITS:1 COG:no KEGG:BF2877 NR:ns ## KEGG: BF2877 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 317 1 313 563 424 62.0 1e-117 MANGELTYDDFLQRLDIQDILMDAGYHLNKRDGLRYPSYIRTDSNGTRIRGDKFIVTPNG KCCFQPPQQKLYNIISFIKAFPEKFAEHRNGVSPDRLVNLVCNRLLNQPINDRPLRIIQP RQENTPFRLDDYDIHRFDVNDRETHKRFYPYFKNRGIDIFTQRAFADHFFLATRHRSDGL AYANLAFPLVLPKEPDKIAGLEERGRPKMDGSGSYKGKAEGSNSSEGLWIANFSGEPLQK AGGVAWFESAYDAMAFYQIHRNGFRDNPDLSKKSVFVSTGGTPTDMQIRGMLSVTPDINH YLCFDNDSAGREFVKKFQAIAESMHINSDRIKVFPLMPCYKDWNDALLGKTSEEYLDSIK DAIIPLGAALGTTGCATDKEEEHRQPNIHR >gi|222159209|gb|ACAB01000150.1| GENE 29 34615 - 35100 326 161 aa, chain + ## HITS:1 COG:no KEGG:BF2878 NR:ns ## KEGG: BF2878 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 161 1 161 161 165 60.0 5e-40 MQQPIYREISLRPQTAQFFIDELPLLLLCPVGLVYGGMENAPLASIATLLAVLLSLILIY RLIYLKRIRYHIGSEQLTAEHGVFQRSIGYIELYRVVDFHEQQSLLQQIFGLKTVTVLSM DRTTPKLELTGLPKRINIVDIIRERVEFNKRRKGIYEITNH 
>gi|222159209|gb|ACAB01000150.1| GENE 30 35108 - 35770 667 220 aa, chain + ## HITS:1 COG:no KEGG:BF2879 NR:ns ## KEGG: BF2879 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 7 219 17 229 231 306 73.0 4e-82 MLLFGIILSVTAKAQIVTTNPLEYVALAEGNELILGKVKDQMDGQKKTALLQNTIAAEFE QMRQWEKKYNSYLKTASGFASSLKACTHLYNDGVRIFISLCKLKKAISDNPQGIVAAMSM NNLYIETATELVTVFSLLRDAVAKGGKENMLTGAERSQTLWALNDQLSAFQKKLNLLYLS IRTYTMTDVWNNVTAGMLDRNNGEVARMAMNRWRRAATVR >gi|222159209|gb|ACAB01000150.1| GENE 31 35791 - 36471 428 226 aa, chain + ## HITS:1 COG:no KEGG:BF2880 NR:ns ## KEGG: BF2880 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 222 1 224 228 300 67.0 3e-80 MRRAIFIIITLLCIGVKVHAQIDPTLAGMVLIYTEKSKKTLKNQEKIMLLQTTGHIWTKE EVEAVTDLQREFNDYLDSFRSVISYAAQIYGFYHEITHLTENMGEFTGQLRKSPANAVAV ALSTNRNKIYRELIYNSMEIVNDIRTVCLSDNKMTEKQRMEIVFGIRPKLQLMNKKLRRL TMAVKYTSMGDVWAEITEREQPKANKAEIARSAMKRWKRSGKAGAF >gi|222159209|gb|ACAB01000150.1| GENE 32 36500 - 37156 543 218 aa, chain + ## HITS:1 COG:no KEGG:BF2881 NR:ns ## KEGG: BF2881 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 218 1 214 214 198 58.0 1e-49 MGFWSSLLKTGGKAAGATGKAVGGAVLHPSQTLRGAGSALKTATVGAAAGYVGWEKLTTD KSVVRIVSDAVIGEDTTNAISGTAKAATETVNKLTGKAEQTFDSVSQASSSLSSTLDGAS NFLSGVSNGNAGNMLGNFFSNLAHGKVSGLSIVGLIAGAFLVFGRTGWLGKIAGIFLTMM MIGNNTQRQQEASVTGNQRQARPQQEQEEQTHSGGMRR >gi|222159209|gb|ACAB01000150.1| GENE 33 37186 - 38016 702 276 aa, chain + ## HITS:1 COG:CC1872 KEGG:ns NR:ns ## COG: CC1872 COG0739 # Protein_GI_number: 16126115 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Caulobacter vibrioides # 6 135 236 373 383 70 34.0 3e-12 MKYTEEMILQSPSGYCMPFEEEKNKEVTLSKGYGEQKDAVTRETSFHHGINFHASHRPLA AVASGVVSSIGTDKEHGVYIVIRYGKYEVTYAHLANIFIRFGQKVKAGQTVAISGNDLHM EVAFDGEELNPIEFLTMLYGNIQALGKSGHSAAHEFTPFDGEIKTRYDRDKEEIEELMLR 
FLPVYMEDLFRGEYIVPEYTEQSLRNVFTVGAAKNYFFESMPSMANPLGIGSRALPLAGK VQNLLIADFLNYLALRHNVYLSTMSEEVKKNFNPRQ >gi|222159209|gb|ACAB01000150.1| GENE 34 38085 - 38321 189 78 aa, chain + ## HITS:1 COG:no KEGG:BF2883 NR:ns ## KEGG: BF2883 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 29 73 1 45 50 64 82.0 1e-09 MVSVYPDRAGVRWWTKAWFNNREEGEHSVEIDRMAAIRFIQDRIEKDEWLEEFFPKQMEI YHNAIEQTKEQLLNQLNY >gi|222159209|gb|ACAB01000150.1| GENE 35 38325 - 40157 1488 610 aa, chain + ## HITS:1 COG:no KEGG:BF2884 NR:ns ## KEGG: BF2884 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 610 1 596 596 603 53.0 1e-171 MGEIKHTARFAEIERIFGDYFNSHLAPVMRDTQNYLNRKQTEEMAAYSTSTAGILGSMAA AGNPMMDPYQTLRVTGEWNSKTTEDYLEMCREKIANNKEIQADLALLANEWRTTVVDEIG RERYDELSGQLGCDLAYAYTEYRIGELMMGKMVKDNMPKSSAEYIMRKAAQNTLFGLSYT LNQSPLAAEIERCGEAAYKPSKLEKGTGKVIGASIDAISLGGAGSWASFAKFVGADLAFS ALTSSKDKTGNTASVEDCISKGVFGSERNVFDRFRKEAKEIRANDNPYIIATNERLTKKI PVQSFDFRDWMSHNAQAKMPFDFIAGQKIKDERYKDIPMVIAPEYMETYLQDMEKKEEVK GQSALASNPGQSGNTHTAQSVQQANETETTEESREETAIESGQNRQAGQQTANANENGWG DFFSTLGLSGMGDISKNLGYVLAMLPDILVGMFTGRTKSLDLGDNVLPIASIVAGIFIKN PILKMMMIGLGGANLLNKAGHEALGWKRNEDNGLNTGNRSVRYRVYADEPLNPRISGPIL QGNNLIATIDHVPCTIQLPEKVVSAYNAGALPLNTLANAILVKSDHTQQLMSQNYEENDR ETIVRTRGIQ >gi|222159209|gb|ACAB01000150.1| GENE 36 40171 - 42171 1719 666 aa, chain + ## HITS:1 COG:no KEGG:BF2885 NR:ns ## KEGG: BF2885 # Name: not_defined # Def: putative DNA primase # Organism: B.fragilis # Pathway: not_defined # 18 636 1 619 732 863 69.0 0 MKEKSQFEKNAADKQVELLSGALNGAVSAGGHWLNATGKGYPKFYPAGVAVSPFNALFMA LHSDKNGCKSNLFTLYSEAKARGESVKEHEKGVPFLFYNWNKYVNRNNPNEVISRDDYQK LSPEEKNRYKGVHNREIRTLFNIDQTLLPHVDRKAYDDALAKYGNSVEQGFGEKELRELR PKFNAFVQSISRNLAPVRTDGSGVAHYDSQKDAIYIPRQKDFAHYIDYAQETLRQIIAAT GHQQRLAREGMVMKNGIPPTEDAVKQERLIAEIASGIKMLEMGLPARLSDESVKLVDYWN RELKENPCLIDAIESDVNNALEVIRKAERGEKIEYATYRNYRQTEQMREQMPKHFFVADE IKKHPDKENKTIVLVIDKQGKSADVILPAGASLEVNNEIKGMSKQRIQSALEKSGIEKVR 
FYNPDGALGYRPDDRYFAEKQIEVARLKNWALETLSTIDAHPAVKHADKPWFEKIQMVQD DKNRWALFIKPEGKAGYSVYPEKDDLNRFFITLKQSLDNIDKVRGELARKYYALAESKPD LKVDLFHTADESIDLNRIQRVAVFKTKNGAILCAPTIDNQKPQPRSVTPQQWQRMWVAED RNSFKQHLAATLFADVLNKSQAQEQTSSEKQESEVETKISRDEEIKHDQPKEEVEEEEEV SKGMHR >gi|222159209|gb|ACAB01000150.1| GENE 37 42184 - 43722 1254 512 aa, chain + ## HITS:1 COG:SPy0710_1 KEGG:ns NR:ns ## COG: SPy0710_1 COG1705 # Protein_GI_number: 15674768 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Muramidase (flagellum-specific) # Organism: Streptococcus pyogenes M1 GAS # 7 162 3 159 174 73 33.0 8e-13 MSKNKEYVEKYAEFAMEQMRKYGIPASVTLAQGILESSNGQSHMALNENNHFGIKATPGW IAQGGKYGIYTDDKPNEKFCSYDSVGDSYEHHSKFLVENKRYAECFDLSPDDYKGWTKGL AKAGYASGGNYAQSLQKIIEANGLQQYDRMVMAEMKAKGLETETGQAVTTNTYSFPVERE EFLFVTSPFGMRTDPMNADKLQMHKGIDIRCKGDAVLATENNGKVVAVNQNAKTAGGKSV TIEYERADGSKIQNTYMHLSSVDVKVGDMVQAGQRLGVSGNTGTRTTGEHLHFGVAQITA DGQKRDIDPAIYLADIAQKGNIKLQALHNGNDLLAKYKTEVTATPQETKSQSPEEWMKKL LSSEDSGVGLSGTNDPIMDMVVKAFSSLMMLAVVIDNKNEEEQKEAISEMAGSRKVELTS LLPHMKSCALVIGENNKAVLQADNGIVQVSRELSTVELSRLSATLNNPGLTEESKRMRVA GMVNAIVLSQQASQNFEQGMSEQQGQNEQIRR >gi|222159209|gb|ACAB01000150.1| GENE 38 43722 - 43895 163 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713141|ref|ZP_04543622.1| ## NR: gi|237713141|ref|ZP_04543622.1| predicted protein [Bacteroides sp. 
D1] # 1 57 1 57 57 72 100.0 9e-12 MKSVMMEIGLLFVGLSLAGLHTLVCYRLFGLVVTLVIVGVQVVIASGIVWIKIRAPD >gi|222159209|gb|ACAB01000150.1| GENE 39 43979 - 44590 614 203 aa, chain + ## HITS:1 COG:no KEGG:BF2888 NR:ns ## KEGG: BF2888 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 202 1 202 203 205 49.0 8e-52 MIKCNVTVCGTISKAAVVRTNNENKAFTAFAVNCVIPARNGIDKTVEISVAKDGEEYNRA DYTVGKRVEISGVLTFKKRGDNLYFNLSASSVNLTVASNEDSIKGEMLFRGKTGKNIEEK TDKKGKNFLQFSAFSAEKVNDGFEYLWVRFLGFDRKREEWLQPQTGIEAKGELELSVYND KLNITCKVSEMSEYVKQPYNPNN >gi|222159209|gb|ACAB01000150.1| GENE 40 44596 - 46308 1322 570 aa, chain + ## HITS:1 COG:XF2061_1 KEGG:ns NR:ns ## COG: XF2061_1 COG4227 # Protein_GI_number: 15838653 # Func_class: L Replication, recombination and repair # Function: Antirestriction protein # Organism: Xylella fastidiosa 9a5c # 23 350 224 521 522 92 26.0 3e-18 MAGYRKNSNDGPSAEDKALDLFAEMMIERIETISKDWTKPWITEGSLGWPKNLSGREYNG MNALMLLLHCENEGYKIPRFCTFDCVQRMNKPSEKQAKEGVELPRVSVNKGEKSFPVMLT TFTCIHKETKEKIKYDDFKKLSDEEKKMYNVYPKMQVFRVFNVAQTNLQEARPELWSKLA NGDAVKLDESEKMSFEPMDVMIRDNRWICPIKPMHQDKAYFSISKNEIVVPEKSQFKDGE SYYGTLWHEMTHSTGIEGQLDRIKPSGFGSDEYAREELVAELGSALVAQRYGMSKALKEE SCAYLKSWLDHLKESPQFIKTTLLDVKRATSLVTQNVDKIVQELEQKSKVQHDNGQRNEI KQSVPEGKVFYSSVAYLQSTDDTSRLDALREKGDYEGLLRLAKEYYDGNGINEQYTFVSP RQNKGDDLLIEDKDFAVIYNNSVGGTYEVMLKYSEQEIRDHITRYGVRLASSDIKEVAKD MAAEQFSAMTRQRTPVLEMPNGDILHIGYNRDADTLDVGTATNAGIAVSHSFPYDHDNSL DSNLQSVNEKLNEMEQYQKEEVEYSGGMHR >gi|222159209|gb|ACAB01000150.1| GENE 41 46392 - 46556 115 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNFSIVKYECCRLVAHVIIIPFFDCKGNKFSALIQNSADILRLIKVKNLSFLCK >gi|222159209|gb|ACAB01000150.1| GENE 42 46509 - 47231 363 240 aa, chain + ## HITS:1 COG:no KEGG:BF1093 NR:ns ## KEGG: BF1093 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 239 1 239 240 373 78.0 1e-102 MSNETTTFILDYAEVHHDFSLGDLFAYLQEKTGIKKSSLSWYLFKLVNENVLVRTGRGTY 
AKVMKQVFSPQPIEEVKEVYGLLQSDFPFAKFCVYQGDIIAPLQHHLSSNRIIYVETDRD SAETVFNFLKDKNHNVYLRPDKNMIYRYVDMDSRVIFVKNLVSEAPLQEVSGVPMPTLEK LLVDILRDTDFFYLQGSESDRIIENAFNLYTINRNRLFRYADRRKVKKELSSILENLNIQ >gi|222159209|gb|ACAB01000150.1| GENE 43 47228 - 48301 767 357 aa, chain + ## HITS:1 COG:no KEGG:PRU_0377 NR:ns ## KEGG: PRU_0377 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 39 355 1 316 319 335 50.0 1e-90 MIKKECFTTEWIEQVASELHYNDKNLIEKVIRALSLLEMLVKAGCPLVFKGGTALMLILG KSAHRLSIDIDVICPPGTNIEDYLKSFADFGFINLELVERKQRDDADIPKSHSKFFYQIA YRNDTDAQSYILLDVLYEDIHYFQTRQIAIDCPFIRLEGEPLMVTVPSAEDILGDKLTAF APNTTGIPYYKNGRSCSMEIAKQLYDVGRLFENVSGLQITAEAFRKIAVVELSYRSLGTD IGQVFNDIRQTALCISTRGKFGEGDFNLIQDGIIRVKSFMYKQRYLIDNAIIDAARAAYL ATLIEKGVTEVERYSNNPVDIKDLVIRPSLTNKLNKLKSNLPEAFYYWAKTSELLEV >gi|222159209|gb|ACAB01000150.1| GENE 44 48351 - 49181 367 276 aa, chain + ## HITS:1 COG:no KEGG:BF2872 NR:ns ## KEGG: BF2872 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 17 269 17 269 278 348 70.0 1e-94 MHERLVFVILLGVISVLWYIVGYWRRKAGETAFLAAQKTEKEDERDRLCRLAVRAGHRDA CRMFYCLHPDLFKEQPPLKPFKLHGTRVVFYGQYYPSRYKPFLNDDQQAFCHSIYEFKEG KIHGIEFFKSCMNALQMDDHHFHIMFMPCSDEFKYIQRFKRLNWYVCTHCPNLTSGLYDV DVFEPRESLHEAKGYENRILERNYRITGDIKGKDVIIVDDVLTTGQSMTDYKEEIERCGG KVVAAIFYGKTITMPHPLIVKLEVWGDYITRFSKSN >gi|222159209|gb|ACAB01000150.1| GENE 45 49219 - 49845 429 208 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713147|ref|ZP_04543628.1| ## NR: gi|237713147|ref|ZP_04543628.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 208 1 208 208 423 100.0 1e-117 MGTTTIHPAESYLRNETNPSSLYVRIAGKRRRLFINRDRGVIGIIAPGKRTKDYAFTDWS SIEQIYYPSQEQEANTDRKLILKYQKLARLATHTNAWLRDIANADLDKSLYENHITTGTR IDGKCIGLATIEKYCGTASMQRFREAMKNKESFSTCRFDFCGYDGTLWCEPRDNGDMSAG FSKEYRNCGNGYYYLLINDEYVIGYDID >gi|222159209|gb|ACAB01000150.1| GENE 46 49854 - 50081 131 75 aa, chain - ## HITS:1 COG:no KEGG:BF1096 NR:ns ## KEGG: BF1096 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 71 1 71 74 90 56.0 2e-17 MLDEIRQNGKTVLSGEDGFSIPMFFNNLCGKKISGEKYRDYIRYIAFGEMGFKPGEIMLY RNGTLVRTGKINIEK >gi|222159209|gb|ACAB01000150.1| GENE 47 50075 - 50326 120 83 aa, chain - ## HITS:1 COG:no KEGG:BF2891 NR:ns ## KEGG: BF2891 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 12 80 21 89 94 112 75.0 3e-24 MTLSDKSYYRRLCRNILADRFNWRKYCTPSLYFGREICVTPLHCSYGQIGYTINFPYTNA PEVEYDWEMNKLTIDDENWKLVC >gi|222159209|gb|ACAB01000150.1| GENE 48 50338 - 50700 397 120 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713150|ref|ZP_04543631.1| ## NR: gi|237713150|ref|ZP_04543631.1| predicted protein [Bacteroides sp. D1] # 1 120 1 120 120 205 100.0 6e-52 MTTQEQIRKVTDIAQEKGWSISVEDENKTSIQFEFQRYTKYGQDFNFNADMQDEDIDTLI AGMKRYYEDFDPDYEAYLWIGDDGHGKNGAPYHIKDIVADMEEVEEKIYELLQALEVEFI >gi|222159209|gb|ACAB01000150.1| GENE 49 50749 - 51081 221 110 aa, chain - ## HITS:1 COG:no KEGG:BF2912 NR:ns ## KEGG: BF2912 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 17 110 10 104 104 95 57.0 6e-19 MQALQIAIPRWNITELNGYEPITTFWQDFSIADKFGNNAIADTYRRAKSEWKDNYKYWTE LCLVLNHKIWQWHERDNQRAILYDKLWREADAMTAEWSDEEQEYYFEITD >gi|222159209|gb|ACAB01000150.1| GENE 50 51087 - 51398 344 103 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713152|ref|ZP_04543633.1| ## NR: gi|237713152|ref|ZP_04543633.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 103 1 103 103 172 100.0 9e-42 MEYNNQLSENDKRFADEFSNYVNGKMASPRKVGKALADDHRYLVNEKAKLMFYFMEQLAE NWHKGRYDQRNEWACRLAAEAIDHLAENDLYHLPEEYYENHKQ >gi|222159209|gb|ACAB01000150.1| GENE 51 51649 - 51882 89 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237721693|ref|ZP_04552174.1| ## NR: gi|237721693|ref|ZP_04552174.1| predicted protein [Bacteroides sp. 2_2_4] # 1 77 1 77 77 137 100.0 2e-31 MPCGVMAGSKENPPMTIMGDYPIKVLGYKFGNSFYLLPTLYKRIYMQYIITVKCMSSSGF YALTVRLGTKIAKRKKL >gi|222159209|gb|ACAB01000150.1| GENE 52 52236 - 52484 195 82 aa, chain - ## HITS:1 COG:no KEGG:alr0590 NR:ns ## KEGG: alr0590 # Name: not_defined # Def: transposase # Organism: Anabaena # Pathway: not_defined # 1 66 123 188 201 96 65.0 3e-19 MKGQIMTLSDKILLRKRAIIETINDELKNIAQIEHSRHRSVTGFTVNLMAGLAAYSFFPK KPMIAVERVESEENGLIQLSLF >gi|222159209|gb|ACAB01000150.1| GENE 53 52591 - 53127 251 178 aa, chain - ## HITS:1 COG:no KEGG:BDI_3256 NR:ns ## KEGG: BDI_3256 # Name: not_defined # Def: putative transposase # Organism: P.distasonis # Pathway: not_defined # 1 172 1 169 301 192 54.0 5e-48 MISRDKITEIFCMADDFCKLYDRFIKANGLAPKRDKSKRKYHRDSRMSSAEIITIMILFH LSGYKCFKHFYINEVKEHMTDMFPKTVAYNRFTELERTVVIPFILFVKRCCMGKCTGISF VDSTLLRVCRNQRIHMHKVFKGIAQRGLCSMGWFYGFKLHLICNEMGELLSFKEQDFR >gi|222159209|gb|ACAB01000150.1| GENE 54 53234 - 54034 18 266 aa, chain + ## HITS:1 COG:BH0380 KEGG:ns NR:ns ## COG: BH0380 COG0030 # Protein_GI_number: 15612943 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Dimethyladenosine transferase (rRNA methylation) # Organism: Bacillus halodurans # 7 265 18 277 284 148 32.0 1e-35 MTKKKLPVRFTGQHFTIDKVLIKDAIKQANITQQDTVLDIGAGKGFITVHLLKNVNKVVA IENDLVLYKHLCKRFNNAQNVQVVGCDFRKFTVPLLPFKVVSNIPYGITSDIFRILMFDN VELFLGGSIVLQSEPAKKLVSSKVYNPYTVFYHTFYDLEFLYEISPGSFLPPPTVKSALL KIKRKQSSIDFELKVKYLAFISCLLQKPDLSVRTALKSIFRKSQVRSISEKFGINLNAQI VCLSPSQWKNCFLEMLEVVPEKFHPS >gi|222159209|gb|ACAB01000150.1| GENE 55 54226 - 55491 549 421 aa, chain - ## HITS:1 COG:no KEGG:PGN_0050 NR:ns ## KEGG: PGN_0050 # Name: 
not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 1 421 1 422 424 404 51.0 1e-111 MRAKTRLQHKVVTANGRLLPQTKKQELWAFRQCISHYAYRTKNGGTTCMDCGYQWNESNK KVCRCPHCGARLEILDTKCRTFKDKAYYSTLATQDGLQVQRVFLMNANFRKGKKAEYYSM EIARYWVDDNGKTEITALKRTLGHYADTFVLDGCLELRRDNYVYRRIADCKVYPYYSATP KLRRNGLQGSLADIEPTRLIEALLTDSRAETMFKAGRKTDLNYFLQHSMYFDLYWNTYKI VLRNGYHIADISLWTDYIRLLERCGKDIHNAHYVCPLDLKAEHDRYQERARIIQEREERE KQRKKAKENEERFKELKSKFFGLSFTDGLIVVSVLESVDDYYKEGNALHHCVGQCEYYLK PQSLVFSVRIENQRIETIELSLETFKVLQSRGLCNKATEYHDRIIRLVQKNARQIRKRMT A >gi|222159209|gb|ACAB01000150.1| GENE 56 55488 - 55928 420 146 aa, chain - ## HITS:1 COG:no KEGG:PGN_0048 NR:ns ## KEGG: PGN_0048 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 1 144 1 138 138 141 55.0 8e-33 MKGTEHFKQAIKAYLDERAKTDELFAVSYAKENKNLDDCVTFILNQAMAICKEGGCGMTD DEVYSLGVHYYDEDDIEVGKAVNCGVIVNHRVELSPEEQAEARENALKAYQNEELRKIQQ RNSKPKPTPKAVRQEPEQPTLFDLGL >gi|222159209|gb|ACAB01000150.1| GENE 57 56134 - 56619 449 161 aa, chain - ## HITS:1 COG:no KEGG:BF2893 NR:ns ## KEGG: BF2893 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 161 1 161 161 197 58.0 9e-50 MHSKIFQITETRVDRDNYLNEDTLEQGDGHYYDYCSEIDEEERKFHIANLIEKALPKGMF TLVAENTIRYNGGADKWKKEFVTAIQEKAQAVTVENCMMWIGAVYQLEKFLKNPLDLGYQ FYMDEYGVNGYAEQSYSFLQTVSQFEPGKLLYIGGVIDYHF >gi|222159209|gb|ACAB01000150.1| GENE 58 57123 - 57407 324 94 aa, chain - ## HITS:1 COG:no KEGG:BF2909 NR:ns ## KEGG: BF2909 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 14 91 3 80 84 94 56.0 1e-18 MVNMETNSDERKIIDRVLHRDMVFRYQLLGRLQADCEYYLNYGNRNIKRLWAGNVNLQIK LMAELYNSFKEDEKPQWLTMDEIIAYGKAMAEEE >gi|222159209|gb|ACAB01000150.1| GENE 59 57917 - 58645 293 242 aa, chain + ## HITS:1 COG:no KEGG:BF1093 NR:ns ## KEGG: BF1093 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 17 242 18 239 240 127 35.0 
4e-28 MTAVNQIIDIARKQDGQFTRKDLLNAVRSGMKNISEGSLVVLLNRMLAENKIIRVSYGKY KLNEDLKYDFLYEPSEFMLSLNRHIKEKFPFIDYCIWQPSIFASMMLHIPAVRTTLVDVE RVAMESVFMSLQNMESAIPVFLNPNQEDADRYITNRDMIIVRPLVKEAPIDIIDGCPVPT LEKMLVDAISDKELQHLQGNELYTIYSNAFSDYEIKMPRLLRYAARRNRKQKVEQIINTI NI >gi|222159209|gb|ACAB01000150.1| GENE 60 58642 - 59706 649 354 aa, chain + ## HITS:1 COG:no KEGG:PRU_0377 NR:ns ## KEGG: PRU_0377 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 40 354 2 316 319 268 43.0 3e-70 MIHPDSRTLSWIEQVAKDNKIKDIALVEKTIRAFSLLEALARSGCPFLFKGGSSLMLHLD TGKRLSIDIDIICPPGTRIEDYLEKYSEEYGFGEVKLVERISRTDIPKQHAKFFYQVAYP AGGRQDKILLDVLFEEIHYANVVELPIKSRLLKTEGDPVMVKLPSKEDLLGDKLTAFAPH TTGIPFFKGEKNCSMEICKQLFDVASLFDIVGDLSVTAETFNKFAMVELQYRGENPDAIQ KVLDDIYDTAKCIVMRGQDNQEEFDLIQDGIKKVRGFIHSEVYTLESAITNASKAAYLAK LISNGINEIKHYNPSEANLLVNVTIHSPLSTKLNKLKKTNAEAFFYWSEIQKLL >gi|222159209|gb|ACAB01000150.1| GENE 61 59813 - 61309 707 498 aa, chain - ## HITS:1 COG:no KEGG:BF2899 NR:ns ## KEGG: BF2899 # Name: not_defined # Def: putative outer membrane protein # Organism: B.fragilis # Pathway: not_defined # 1 496 1 475 480 537 52.0 1e-151 MTKKIFLILAVAGMGCSLHMKAAVPPVTKDTVQLESYTDIEQMLRPLDPTYHKGVFVTSP WNGNWFVSLQGGASAFIGKPVGCADLFDRIKPTIFASLGKWFTPQIGARIGYGGWQFKDC ELVTNDYHHFHADLMWNVLGYRYGKVENPRWGIIPYIGFGLMHNPQNGHNPFAISYGIQG QYRICKRLSALLEIGNVSTFQDFDGYGKSNRFGDNMLSVSAGLSFTIGKTGWKRAVDASP YIRQNEWLIGYASQLSESNRRYTGQHDRAMRTINELKKILEIEGLLDKYSKMFDDRESLG NVYPRNDYSGLNSLRARLKNRHWDGKSPLSNDSLFSKSEFVNAIPNNDGSVVNGFYVTGA DSLLSPDMNNDSISIYSHSDYLSFIGSGNECIGSPVYFFFELGTAQLTDKSQLVNLDELA RVTKKYGLSVTVAGAADAATGTADINNKLSASRADYIATELRRRGVTVENITKVCQGGIS NYVPTEANRHTKVMLYMK >gi|222159209|gb|ACAB01000150.1| GENE 62 61364 - 62827 872 487 aa, chain - ## HITS:1 COG:no KEGG:BF2900 NR:ns ## KEGG: BF2900 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 7 487 6 485 485 516 59.0 1e-145 MAEKSWQVMHMNVMKGITTAQSNEHQRNWTESGWKHAIDKGTYDRQRERLNFEIVKGGKV 
RPINKDKSIPERMAENLAGRGIKDPNSDLAEPKFRTVVDFILGGSKFQMRQLAFGDQAVV YGSGNNTANASLQRKPEIEEWAKDMYRFMSERFGEENIVGCYVHLDETTPHMHLTLLPIQ DGKFAFKKMFAGKDKYEFSARMKMLHDELSQVNEKWQLERGRNVSETGARHRSTEEYRRY LSEECTNIEEQVVQHKKALADLKVEISLAERRVKGLTSMVDNLRKAKAEKESQLSALERT LQSHQGDTATIIAERERLEKELASIQTKLEDKQDKLRTADQQLEALKRDMDAIGERTEEL KGEAYKYSREIHSNVDVLLKDVMLETLVGEHSARVAEMGAAEQSVFDGSLLQSLTEQGAE VMHCATLLFLGMVNDATTFAETHGGGGSKSNLKWGKDDDEDNRAWARRCMMMASRMMRPA SGKKQKR >gi|222159209|gb|ACAB01000150.1| GENE 63 63055 - 63903 745 282 aa, chain - ## HITS:1 COG:no KEGG:BF2901 NR:ns ## KEGG: BF2901 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 276 3 281 288 230 47.0 4e-59 MKDKNRAANQVLADMMFFDYLKEKVGERKTKTGAYYDLLEKATAGFIAPFLKNHEYVLQA DQCHVTISDLAVEWHWHRATVRAFLDKLEEMGYIRRTRFAKSVVITVTFAPFSSGETPAS YTGHTGAGLADDLDKALSEWINGSLSDEDMGEICGQYHEAGLKRLADESGKDHSGDKPAI KAGAETELAQEIVERVAVAGLKRAIRNSRFDDPTDFLEFFHGELGGDWTSLLEASKIIAK MILDAGNDGAPNSSSETDMLATLRQPFRALWAKYQERDSTLL >gi|222159209|gb|ACAB01000150.1| GENE 64 63990 - 64178 117 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262409443|ref|ZP_06085985.1| ## NR: gi|262409443|ref|ZP_06085985.1| LOW QUALITY PROTEIN: conserved hypothetical protein [Bacteroides sp. 
2_1_22] # 1 62 101 162 162 124 100.0 2e-27 MAAFITLVICAVPHLHTTAFTLAFLLLAAMFYPSCRVLAEWKDDDTRKHLKENPKTMSEH YC >gi|222159209|gb|ACAB01000150.1| GENE 65 64505 - 64843 175 112 aa, chain - ## HITS:1 COG:no KEGG:BF2903 NR:ns ## KEGG: BF2903 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 108 8 114 123 82 42.0 5e-15 MYRSWIVVDDGYGYRQSVVFRRGRPVSCVCRMLRLRSTYKHNGYERGTKWSISEYELDKA LARYRQKNHELKARLKKGASYLEAKDVEAIIEIATYGLVKLELTELPLFKQY >gi|222159209|gb|ACAB01000150.1| GENE 66 64882 - 65328 364 148 aa, chain - ## HITS:1 COG:no KEGG:BF2904 NR:ns ## KEGG: BF2904 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 148 3 150 150 196 68.0 2e-49 MLQLFYDDFLSFVPLQLPQLLDVTTMEQPRFYDDYVLLSFPLADPYDLEEVMDIFEDDME LITLYHHIPSSATTFGSSTCAYSNPAFGQMFKMNARVSDTGKVDRIDVTIYESLEFMCSD ICLDLKLHEKTGHFKYRKTKEELLAEFI >gi|222159209|gb|ACAB01000150.1| GENE 67 65358 - 66452 577 364 aa, chain - ## HITS:1 COG:no KEGG:BF2905 NR:ns ## KEGG: BF2905 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 19 364 4 349 349 267 38.0 4e-70 MYKGLFKVESMQKPKRGNLKSYSKKCASYAKSFAVCFAIVTLASCGEKTGNSMFQTVKNA ADAYRSYLSEVRKEANLPTDRLIEKVNDWQSLRDSVSACVAKDTANRIHANYESEIRGLH DSLRIEFTRLALEKPRTFADVLLLREQTSQYRQDTELMQSAAEAEPFFKPLDSVPTYKGS ANAVVEKYRMFLSKTLKSGISGKSEMLAFIKEEDRLFRSFLDHLTELADADLSVITCDTE KCCLSIFQSAENGRLSYRDALVYTARRTNRRIILNALACRDDINQDNVKTEAQARAYVWM LLQPYVALDGFSVAILSDMERTTLHAVADRTPQMIAKLNKTAGTDNDQWRVLPGVLIKIM LTSI >gi|222159209|gb|ACAB01000150.1| GENE 68 66693 - 67319 506 208 aa, chain + ## HITS:1 COG:no KEGG:BF2906 NR:ns ## KEGG: BF2906 # Name: not_defined # Def: serine type site-specific recombinase # Organism: B.fragilis # Pathway: not_defined # 1 208 1 207 214 275 62.0 1e-72 MAKVGYIFKAAGYDGFDTDKKWMEQYGCVQVIEEENGHEKLRPQWKQLMASLERGDELVV SKFSNALRGSRELATFIEFCRVKVVRIVSIHDKIDSRGDLFPETKASDVLEMFGSLPEEC AMLRKASAHIIHLKQNINQPTKEKNISRAEREKTIVAMYNNGHSIDDIWKVSGFNSRSSV FRILNKYGVSLNRGKFSGPLGKRKNKEN 
>gi|222159209|gb|ACAB01000150.1| GENE 69 67322 - 68080 232 252 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713170|ref|ZP_04543651.1| ## NR: gi|237713170|ref|ZP_04543651.1| predicted protein [Bacteroides sp. D1] # 1 252 1 252 252 499 100.0 1e-140 MGHSKIYNNDSKYVIANVMDDVENVNVREYLIDFHAKSIYPAIEAMVLKEQNLDTICKDP IAIMLASKKIAEKEIARATSLEIPENIKLLFQEELSKKEQISLLRGISIKPEQLATIFLY ANDKGYKFSNYRFEDTPKKYIGADLPSFIYLCDENTIEHYGETSLTDGQMKEIITVSQFV LARILNNGKHWHCFYQTRRGLLGNEPGEYGNKSHIHYISDSFSISLKDVIKGFKAGICPH SKVHITLDESKE >gi|222159209|gb|ACAB01000150.1| GENE 70 68126 - 69430 600 434 aa, chain + ## HITS:1 COG:PM0594 KEGG:ns NR:ns ## COG: PM0594 COG1479 # Protein_GI_number: 15602459 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Pasteurella multocida # 7 402 9 408 433 160 31.0 3e-39 MNSKPHIWSIDKLLGKNLIIPDYQRPYKWTDKNITELILDTQKSIEESRKYANFKYRIGT VILHRNENNEYEIVDGQQRILSFLLLKLHLDSDFTCNLLKNKFLNKITQKNLHDNYTTIR EWFSSVDDTVKAIFNDALKNILEVVVITVNRIDEAFQLFDSQNTRGRALYPHDLLKAYHL REIHDKYEMQNAVVKWESKDPKAIRELFDLYLFPIWNWAKCRKCGNFTSADIDIYKGIEE KYGYTYARRANKAMPYFLLTEPFISGSDFFEMVDHYMKMLHNIKEEIVTNPAFHDIELII TNGKDVDSAEELDKSFKSSSVGLNHARNLFFCALLCYYDRFHNFDVMAVKKLFTWAMMIR IDMQHLGFDTINRYAIGIEDDKYSNVKPVISMIAYARRHTEISGMRLDMRNDDKAASENW NNLYAQLRQLNGYN >gi|222159209|gb|ACAB01000150.1| GENE 71 69432 - 71321 1031 629 aa, chain + ## HITS:1 COG:PM0595 KEGG:ns NR:ns ## COG: PM0595 COG1479 # Protein_GI_number: 15602460 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Pasteurella multocida # 1 621 5 623 633 254 34.0 4e-67 MVEELKILDEVTIFDTNTHYVIPRYQRAYAWEDKEIAQLIDDINDIDSSENYYIGSLVVS KVQDKPKTYEVVDGQQRLTTLFLLLQYLVSEGALEGELGQTLSFDCRSKSKYTLSNIQQI LADKKPSADYEENIDQSILNGIKAIRQKFTSDKGLNKDEFVSRLQHVILYKIEVPEHTDL NRYFEIMNTRGEQLEQHDILKARLMRFLNNRKEQELFSRIWNACSDMTGYVQMHFSRTER ENIFGGEWNGWPANEWSEYSDCLEITEDSENNATINEIIKPRFKVDNVDGILDDDTHIRF ESIIEFPYFLLHALRVFVDFYDVSLEKDLGELLDDKKLIADFENVIKYGTLDNEPIKNNK ENFARMIIIHLLQTRYLFDSFIIKREYTGEDKEGEWSLKELHTSGQASKKKAYYSNTKLK 
YEKEWETTYAPRNKECLMIQSALRVSYTSPKVMHWITELVLWLFENDEVALLTNEAEDIA ALATKENFLDGKDYKLGVTTPHIVFNFLDYLLWKNNKSKYSDFEFEFRNSVEHWYPQNPS DGSFDSWDDKDTFGNLCIISRSVNSKFSNLSPESKMKSYGKMVQKGSLKLRIMGDIIEKS SNDKWIKVQCHEHGAEMIDILETACRTLD >gi|222159209|gb|ACAB01000150.1| GENE 72 71332 - 71874 411 180 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229885015|ref|ZP_04504470.1| acetyltransferase, ribosomal protein N-acetylase [Sebaldella termitidis ATCC 33386] # 11 178 4 171 174 162 45 4e-39 MSKCNYIQADIIETERLLVRKLTNDDFDTLLTIMGKPEVMYAWEHGFSEDDVQDWIERQL VRYAKDGIGYFAVLQKESGQLIGQAGLMKTTMNGNEVVEIGYIFDNTYWHNGYATEAAES LIAYAFDSLELPAVYCSIRPENKASIRVAKRLGMESCSSHIVVYRGKEMPHIIYKLKKPK >gi|222159209|gb|ACAB01000150.1| GENE 73 71876 - 72274 392 132 aa, chain + ## HITS:1 COG:CC3636 KEGG:ns NR:ns ## COG: CC3636 COG0545 # Protein_GI_number: 16127866 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Caulobacter vibrioides # 4 131 39 167 177 107 42.0 6e-24 MSKKEYIQANKEWLEAKAKEEGVKPLPKGIYYKVISKGKNDGKHPAPRSIVTAHYTGWTI NGKKFDSSRGGTPIAFRLNELIEGWIIAMQQMCIGDKWEIYIPAEMGYGKFSQPGIPGGS TLIFEIELFGIA >gi|222159209|gb|ACAB01000150.1| GENE 74 72401 - 72505 112 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGDVIITVLGLWVFIKIVKVIFGGWSKSDYQRDR >gi|222159209|gb|ACAB01000150.1| GENE 75 72928 - 73308 448 126 aa, chain - ## HITS:1 COG:no KEGG:BF2915 NR:ns ## KEGG: BF2915 # Name: not_defined # Def: putative single strand binding protein # Organism: B.fragilis # Pathway: not_defined # 1 114 1 114 126 156 71.0 2e-37 MKQIENNFAVSGFVGKDAEIRQFANASVARFSLAVSRQEKSGEETKRVSAFINVEAWRNN ANTDSLAQITKGTLLTVEGYFKPEEWIDKDGVKHNRIVMVANKFYQTPDKEETPAEPEKK TKKGKK >gi|222159209|gb|ACAB01000150.1| GENE 76 73416 - 73610 68 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237721665|ref|ZP_04552146.1| ## NR: gi|237721665|ref|ZP_04552146.1| predicted protein [Bacteroides sp. 
2_2_4] # 1 64 27 90 90 119 100.0 5e-26 MLSFVCEPCSLVYSPIHIVTAFGRVIPHFSVAKLVCSRVRLKALVNGVASQIFLYLAGGF DGLR >gi|222159209|gb|ACAB01000150.1| GENE 77 74100 - 74387 292 95 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713176|ref|ZP_04543657.1| ## NR: gi|237713176|ref|ZP_04543657.1| predicted protein [Bacteroides sp. D1] # 1 95 1 95 95 200 100.0 3e-50 MLYDYRKESKSPMLLTLANGKTIEGEFIDLRIATETLPKGKLWYHIRHTDDDWSEPASLK NGCVVVNFCGTFICDPIIDFPCGEELEITEWSYLN >gi|222159209|gb|ACAB01000150.1| GENE 78 74398 - 74733 170 111 aa, chain - ## HITS:1 COG:no KEGG:BF2919 NR:ns ## KEGG: BF2919 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 109 1 108 109 156 67.0 2e-37 MMTPIATRFVDWDIPELSTLQDSKVYQLRVQLNEGTPMSRQQKNWLTNSVNSNTYFKRAI PLMGYRFDFSDVLKRYFVKQYGHIAEYYAVDKTALRTFLCGRIDEIIEITD >gi|222159209|gb|ACAB01000150.1| GENE 79 74748 - 74927 255 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713178|ref|ZP_04543659.1| ## NR: gi|237713178|ref|ZP_04543659.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 59 1 59 59 102 100.0 1e-20 MYYEDEQDTDIIGCLCTLLTPYKGHTEGEVVGDYGNEIVVRLTNGKEVVEYRDEVLIYD >gi|222159209|gb|ACAB01000150.1| GENE 80 74948 - 75241 295 97 aa, chain - ## HITS:1 COG:no KEGG:BF2920 NR:ns ## KEGG: BF2920 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 97 1 96 100 105 58.0 4e-22 MANYATNIFYASTENEKDLDKVENFLEDTFCDCYLDRSEDSIEGEFSSKWDYPEKEINKM ISLLEDKDKIYIRVLTHELCNEYVNFQVFSNGEWSYR >gi|222159209|gb|ACAB01000150.1| GENE 81 75645 - 76166 -97 173 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237721659|ref|ZP_04552140.1| ## NR: gi|237721659|ref|ZP_04552140.1| predicted protein [Bacteroides sp. 
2_2_4] # 1 173 1 173 173 317 98.0 2e-85 MIRYTKTKLNLFNHYFTHVIGNCIKPNFRPRDKRGTNNNTTRDKLCPLHFSMLFREWSYC FFLLVCKTMAFGVSEIIQNLTLHPRDEERQSKIIARDKPCLLQFNTFFWEVVLLFFFLLV CEILSPVCPPKIFFKNLHIDIQCFIQHKYRLRGQINILRRQVKSRCNHWYNLI >gi|222159209|gb|ACAB01000150.1| GENE 82 76193 - 76996 650 267 aa, chain + ## HITS:1 COG:Rv0881 KEGG:ns NR:ns ## COG: Rv0881 COG0566 # Protein_GI_number: 15608021 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Mycobacterium tuberculosis H37Rv # 11 262 23 276 288 163 39.0 3e-40 MPIIEIHSLTDPRIEIFSTLTEAQLRNRIESDKGLFIAESPKVINVALNAGYEPLALLCE QKHITGDAADIIERYDDIPVYTGARELLARLTGYILTRGVLCAMHRPTPKSVEEICKGAR RIAVIEGVVDTTNIGAIFRSAAALGIDAILLTRNSCDPLNRRAVRVSMGSVFLIPWAWLD GSPNELHKLGFRTVAMALTEKSISIDSPILATEPKLAIVMGTEGDGLRNETITEADYVVR IPMANGVDSLNVAAASAIAFWQLRVKD >gi|222159209|gb|ACAB01000150.1| GENE 83 77050 - 77526 187 158 aa, chain - ## HITS:1 COG:no KEGG:BVU_3777 NR:ns ## KEGG: BVU_3777 # Name: not_defined # Def: arsenate reductase # Organism: B.vulgatus # Pathway: not_defined # 3 158 2 157 157 192 60.0 3e-48 MKNILIVSNSDTCRSRMAQEILNSFGRGMKISTAGVLVGNSVPDVVCQVMEQNGYDFSRR KPCDVATYAQQTWDYVITLCPEAEEVQKEMQGVVRKYVSFNFADPFRGGIHADDEQEERV VALYDAMHKELYEFFRNELMEKLLPRCSCGANTYCRCE >gi|222159209|gb|ACAB01000150.1| GENE 84 77547 - 78587 790 346 aa, chain - ## HITS:1 COG:RSc0194 KEGG:ns NR:ns ## COG: RSc0194 COG1063 # Protein_GI_number: 17544913 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Ralstonia solanacearum # 1 332 1 332 345 219 38.0 7e-57 MLAYTYIEHGKFELQEKARPEIKDSRDAIVRVTLGSICTSDLHIKHGSVPRAVPGITVGH EMVGVVEQVGPNVTSVKPGDRVTVNVETFCGECFFCKHGYVNNCTDPNGGWALGCRIDGG QAEYVRVPYADQGLSRIPDTVSDEQALFVGDVLATGFWAARISEITEEDTVLIIGAGPTG VCTLLCAMLKHPKRIIVCEKSPERIRFVREHYPDVLVTEPENCKDFVIQNSDHGGADVVL EVAGSDDSFHLAWACARPNAVVTIVALYEQPQLLPLPDMYGKNLIFKTGGVDGCDCAEIL KLIEEGKIDTTSLITHRFPLNEIEEAYRVFENKLDGVIKVAISGNK 
>gi|222159209|gb|ACAB01000150.1| GENE 85 78837 - 81614 1519 925 aa, chain + ## HITS:1 COG:FN1445 KEGG:ns NR:ns ## COG: FN1445 COG1112 # Protein_GI_number: 19704777 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Fusobacterium nucleatum # 62 915 2 842 849 751 51.0 0 MIDTSENLIIIKGQIKTPEIESCQNYDDSYKIVFRNVPSTYTYKEENVLWLTDPDKPDPN NYQIISKGRTLSNIKALSIFKAGSHSYWHIRFLNGKEYDYREKDLEIIESCLGESRSKSI FEYLKKVADANELKADDGTKLLAKQYEKIHFIANNRAIAVYLNPQKYKMQTQTASTLIFP FGCNASQQKAVQAAFENQISVVQGPPGTGKTQTILNIIANILVRGKTVQVVSNNNSAIVN VLEKLSKYDMGFIVALLGSTANKEKFIETQEEEKQYPEDFESWHDTDADQPQFLNQIHHQ TEELKSIFSKQERLAMARQEIQALKIEWQHYLQEFGTKEITLKQRKSSSSADLLNLWNEC QQFAEKEQSSSLRGIAAFIQRLKWLFFKFRSKTICKIPDKGFYKREMSLIIADFQILFYQ TKYAELEVEIDTLEKELANKDAAEMARQMADTSMKYLKNQLFHTYGNNHDKPIFTLPDLK NNWREVQKEYPIILSTTFSSLSSLQRDAVYDYIIMDEASQVSVETGALALSCAKNAIIVG DTMQLPNVITEENKKTLNFIANACLIKPEYDCANMSFLQSVCKVIPNVPQTLLREHYRCH PRIINFCNQKFYGGDLVIMTRDKGEEDVICAIRTAKGNHSRSHMNQREIDVIKEEVLPNL SYETDEIGVIAPYNKQVDVVKSALEEDIDVATVHKFQGREKDAIIMTTVNDVITSFSDDP NLLNVAVSRAKSQFYLVVSGNEQPKDCNISDLIAYIEYNNGTVSTSKIHSIFDYLYEQYT DARIAYLKKHKKISEYDSENLTFALLEDILKENINMRHLNIICHLPLYMLIQDYSLLNEE ESKYAANINTHIDFLIYNRVSKQPVLAIETDGYAFHKSGTSQSERDIKKDRILELYGIPL VRLSTIGSNEKKIIEDKLSEVLNLH >gi|222159209|gb|ACAB01000150.1| GENE 86 81616 - 81795 87 59 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFIANHFFVRIGVSAYSNNQKNKFYSNYNNHINNNAYIYFSFKVQQHYGNYFLLLHQPH >gi|222159209|gb|ACAB01000150.1| GENE 87 81843 - 82073 140 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260171374|ref|ZP_05757786.1| ## NR: gi|260171374|ref|ZP_05757786.1| hypothetical protein BacD2_05860 [Bacteroides sp. 
D2] # 1 76 1 76 76 110 100.0 2e-23 MTKNYLIVNTLIKMRIRYNTYNNKNHLFYKNRQINKSYFDQILRNEQESIRLTILRDTLL PKLMSGELKINATEVL >gi|222159209|gb|ACAB01000150.1| GENE 88 82080 - 83156 373 358 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 [Haemophilus parasuis 29755] # 5 331 1 313 339 148 32 1e-34 MNKSINILDKEYLQWVKDLCGRYRQSQIKAAVKVNVEQLKFNWLLGRDIVELHVEERWGE SVITQLSKDLREAIPNAAGLSKSNIYYCRKFYLLYKDALKTFHQLGGKNETELFHQVGGI FEVPWRHHCLIMDKVKDDIDKALFYVHQTVENGWSRSMLLNFISTDLYERQGKALTNFTK TLPDAMSDLAQELTKDPYNFAFTGITGRYNERLLKNALLNNITRFLIELGTGFAYVGKEY RLEIGETENFIDLLFYNLSLSCYVVVEVKIGKFAFADIGQLGGYVVACNHLLRKEGRDNP TIGLLICKEKDRIQAQYALESSSQPLGISEYDLEKFYPEKVEGTMPTIEEIEAKLRDN >gi|222159209|gb|ACAB01000150.1| GENE 89 83228 - 83473 61 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713186|ref|ZP_04543667.1| ## NR: gi|237713186|ref|ZP_04543667.1| predicted protein [Bacteroides sp. D1] # 1 81 1 81 81 129 100.0 6e-29 MRRMIKRLLKKYKYPPEEAENALETVIHQCEQWTDNDSDNYNSLLNETRKYDFHNETYPP MAADERVEYYTPKNNNKNTLL >gi|222159209|gb|ACAB01000150.1| GENE 90 83470 - 84397 286 309 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262409419|ref|ZP_06085961.1| ## NR: gi|262409419|ref|ZP_06085961.1| ATPase [Bacteroides sp. 2_1_22] # 1 309 1 309 675 611 100.0 1e-173 MNIDINKLEKEIKDYKTNFFSSWNDEKYKWEAVSWFQSHWDIKSPDFTQMLKTSLSKTQN LLGAQHYFPRRMIKNFAMVAPEDVRKMFIDLYNEHIPLSDRIYKFIKESDFILEKYKSTW RNHFQDYRTISTYLWLRYPERYYIFKPREFSRVSQILNTSYTFKKGATPNTVLQAYELYN EIKWILQQDTELKAMLSDVLTRTPNCDPDLELTTTTVDFLYFLDKNNQKSQKTFQIAGKK QEKDIPPLTPPTSKLHYWWLNANPQMWSLSNWSIGEIQSYTLYNDNGNKRRVFQNFLDAE AGDIAICYE Prediction of potential genes in microbial genomes Time: Wed May 18 04:31:05 2011 Seq name: gi|222159208|gb|ACAB01000151.1| Bacteroides sp. D1 cont1.151, whole genome shotgun sequence Length of sequence - 6318 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 4, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
+ CDS 1 - 342 96 ## gi|262409419|ref|ZP_06085961.1| ATPase 2 1 Op 2 5/0.000 + CDS 317 - 1060 368 ## COG1401 GTPase subunit of restriction endonuclease 3 1 Op 3 . + CDS 1102 - 2115 412 ## COG4268 McrBC 5-methylcytosine restriction system component + Prom 2294 - 2353 4.6 4 2 Op 1 . + CDS 2379 - 2786 441 ## BT_2361 hypothetical protein 5 2 Op 2 . + CDS 2822 - 3220 329 ## BT_2360 transcriptional regulator + Term 3239 - 3301 2.1 + Prom 3368 - 3427 3.8 6 3 Tu 1 . + CDS 3547 - 3981 346 ## BT_4511 hypothetical protein + Term 4052 - 4100 9.5 7 4 Op 1 . - CDS 4082 - 5188 725 ## COG5545 Predicted P-loop ATPase and inactivated derivatives 8 4 Op 2 . - CDS 5241 - 6173 467 ## BT_4510 putative helicase - Prom 6228 - 6287 5.5 Predicted protein(s) >gi|222159208|gb|ACAB01000151.1| GENE 1 1 - 342 96 113 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262409419|ref|ZP_06085961.1| ## NR: gi|262409419|ref|ZP_06085961.1| ATPase [Bacteroides sp. 2_1_22] # 1 111 324 434 675 204 95.0 2e-51 KKNDGKHIYFQKTESLTYPIDYSILKNCEELNNMEFFANPNGSLFKLTQNEYDFIMDIIR DTNPIKRTNENIGRYTDEDFLNDVFLDEQELKTLKSILKYKKNNHIAGSPGSR >gi|222159208|gb|ACAB01000151.1| GENE 2 317 - 1060 368 247 aa, chain + ## HITS:1 COG:mcrB KEGG:ns NR:ns ## COG: mcrB COG1401 # Protein_GI_number: 16132167 # Func_class: V Defense mechanisms # Function: GTPase subunit of restriction endonuclease # Organism: Escherichia coli K12 # 1 243 205 461 465 242 50.0 4e-64 MQGAPGVGKTYSAKRLAYTIMGEKDDSRISIVQFHQNYSYEDFVMGYKPQEEKFELKKGI FYKSCITAGNDPEHDYFFIIDEINRGNMSKIFGELLMLIEKDYRNVKIALAHNGELFSVP NNLHIIGMMNTADRSLAMIDYALRRRFCFYNMKPGFDSIGFQKYQNELHNTTFDKLIETI KRLNDEISRDESLGDGFQIGHSYLCGLKSCDKVKEIIFYDIIPMLHEYWFDDKQKVDFWE NELKSLF >gi|222159208|gb|ACAB01000151.1| GENE 3 1102 - 2115 412 337 aa, chain + ## HITS:1 COG:mcrC KEGG:ns NR:ns ## COG: mcrC COG4268 # Protein_GI_number: 16132166 # Func_class: V Defense mechanisms # Function: McrBC 5-methylcytosine restriction system component # Organism: Escherichia coli K12 # 1 296 14 310 348 137 27.0 3e-32 
MLAYAFQVLKRNNYASIASEKFEHIEDLFAEILSRGISYQLKQGLYREYVPRTESLPTMR GKIDITKTIKHRIQCQQILSCEFDELSENNIFNQILKTTISILLQGKIVAKERKNKLKKV LPFFVNINTIEPSIVKWNTLYFQRNNQTYKMLMNICYFILEGLLQTTEDGKYHMATFSDE YMHRLYEKFVLEYYKTEHPELKTSTSRIKWNIDYQSDNKALELLPCMQSDIMLEYNGRTL IIDTKYYSQTMQKQYNKQTLHSNNLYQIFTYVKNYDIQNSSNVAGMLLYAKTNEEITPDL ETSICGNRIYVKTLDLYTDFKNIASQLDEIISYIKIH >gi|222159208|gb|ACAB01000151.1| GENE 4 2379 - 2786 441 135 aa, chain + ## HITS:1 COG:no KEGG:BT_2361 NR:ns ## KEGG: BT_2361 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 132 1 128 129 142 75.0 4e-33 MDVEIKEKENRRHVGRNLQRIRVYLGMKQEALAADLGVNQQVISKIEKQEEIEEGFLKRI AEVLGISEEVIKDFDVEKTIFNINHHNYKDANISEGATTYAIVQQINPLEKIVELYERLL KSEQDKIEILKKHMK >gi|222159208|gb|ACAB01000151.1| GENE 5 2822 - 3220 329 132 aa, chain + ## HITS:1 COG:no KEGG:BT_2360 NR:ns ## KEGG: BT_2360 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 131 7 133 136 125 56.0 4e-28 METIENIKSNHLHLGKKIERVRRLRGMTQTELGQLLGITKQAVSKMEQTEKIDDERLEKI ASALGVTTDGLKEYNEETVLYNTNNFYENCGVKNAIGNNQTFNNFPIEQTIELFEKLLEK QKEQFESLKKEK >gi|222159208|gb|ACAB01000151.1| GENE 6 3547 - 3981 346 144 aa, chain + ## HITS:1 COG:no KEGG:BT_4511 NR:ns ## KEGG: BT_4511 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 144 1 144 144 265 88.0 3e-70 MKEEPDSLSVPYNFARCFNAQCPQASKCLRHIATQLDTTDYLYISIVNPARYPADGNQCE CFKTTVKVHVAWGLKQLLNRIPYEDAVSIRIQLVGHYGKTGYYRFYRGERGLMPKDQAYI RQLFRNKGIKEEPTYQRYTEEYIW >gi|222159208|gb|ACAB01000151.1| GENE 7 4082 - 5188 725 368 aa, chain - ## HITS:1 COG:L109011 KEGG:ns NR:ns ## COG: L109011 COG5545 # Protein_GI_number: 15672499 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Lactococcus lactis # 1 311 73 389 480 66 22.0 9e-11 MKRRYEFRHNTQIGEVEYRERLSFRFRFNPLDKRALNSIALDAQMEGIPLWDRDISRYIY 
SNRVPVFNPLEDFLYRLPAWDGKDRIRALAATVPCKNPYWMDLFHRWFLNMVSHWKGSNK KYANSVSPLLVGPQGTRKSTFCRSIMPPSERSYYTDSIDFSRKKDAELYLNRFALINIDE FDQVSSTQQGFLKHILQKPVLNVKKPHGSAVLEMRRYASFIATSNQKDLLTDPSGSRRFI CIEVTGVIDTNRPIDYEQLYAQAMYELEHGERYWFDQEEEKIMVENNREFEQVPPEEQLF FRYFRAAQPEEGEWLSPAEIMEDIQKGSSIPMSVKRVNSFGRILKKQEIPSKHTRSGTLY HVVRLITR >gi|222159208|gb|ACAB01000151.1| GENE 8 5241 - 6173 467 310 aa, chain - ## HITS:1 COG:no KEGG:BT_4510 NR:ns ## KEGG: BT_4510 # Name: not_defined # Def: putative helicase # Organism: B.thetaiotaomicron # Pathway: not_defined # 47 309 2 264 651 489 90.0 1e-137 MKITQFRKNEDTIALSVMDLDILVNKIKTEIKSRPVSTFREHLRYTLSDERCMFANKLPQ IIPAAEFRKVNGQKQMKNYNGIVELTIGPLSNKSEIALVKQKACEQPQTRCVFMGSSGKT VKIWTTFTRPDNSLPKTREEAELFHAHAYRLAVKCYQPQIPFDILPKEPTLEQYSRLSHD PDIIYRPNSVQFYLSQPSSMPEETTFREAVQAEKSPLTRAVPGYDAENAFLMLFEAAFRK AYADLREAGLELSEDTWHPLVVQLAKNCFASGLPQEEVVRRTVFHFYMYRQEELIRQMIG NIYTECKGFR Prediction of potential genes in microbial genomes Time: Wed May 18 04:31:22 2011 Seq name: gi|222159207|gb|ACAB01000152.1| Bacteroides sp. D1 cont1.152, whole genome shotgun sequence Length of sequence - 1032 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 141 - 200 6.7 1 1 Tu 1 . + CDS 353 - 931 203 ## gi|262407339|ref|ZP_06083887.1| predicted protein Predicted protein(s) >gi|222159207|gb|ACAB01000152.1| GENE 1 353 - 931 203 192 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262407339|ref|ZP_06083887.1| ## NR: gi|262407339|ref|ZP_06083887.1| predicted protein [Bacteroides sp. 2_1_22] # 1 192 230 421 421 350 100.0 3e-95 MGIFTPLLTIIIGGLSDLGSAFNYFIYRLLAGGDIYWNAYPNDIINSLDYNRSGLLNMSY MLWGPFRHLFGLEFDEAQMMTTVGADLFEITTGFYPSGGAPNSPFTVTFWAYYGWYSLIG VAILALGSSSLCYHGQNSMPNTLPNTCFKAFLFRCGTSAFLDIYLFFNNLFSLFLFYCIY KSLKFLSVNLSK Prediction of potential genes in microbial genomes Time: Wed May 18 04:31:32 2011 Seq name: gi|222159206|gb|ACAB01000153.1| Bacteroides sp. 
D1 cont1.153, whole genome shotgun sequence Length of sequence - 1537 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 789 - 848 2.1 1 1 Tu 1 . + CDS 894 - 1535 252 ## PRU_0443 putative WbuN protein Predicted protein(s) >gi|222159206|gb|ACAB01000153.1| GENE 1 894 - 1535 252 213 aa, chain + ## HITS:1 COG:no KEGG:PRU_0443 NR:ns ## KEGG: PRU_0443 # Name: not_defined # Def: putative WbuN protein # Organism: P.ruminicola # Pathway: not_defined # 6 199 5 196 220 88 35.0 2e-16 MTKRRLVIFDICGTLFFSNTSFDFLDLIVQAKSYSLFRKVSKTIFARFINKVSVLLLKKD LIRSIAVLYIKGMSKAELQEKANIFYDQYLLPLKIDNSFDKIEFYRKDSNTTIILASATF DFIADVVANRLAIPICFGTELAYDNQGLFKGKILKDRLGHKYQALNDMGLKSPFYKVITD NITDMDIINHSKTVDLIIYPWTEKKMVKEKEFN Prediction of potential genes in microbial genomes Time: Wed May 18 04:31:54 2011 Seq name: gi|222159205|gb|ACAB01000154.1| Bacteroides sp. D1 cont1.154, whole genome shotgun sequence Length of sequence - 77774 bp Number of predicted genes - 67, with homology - 65 Number of transcription units - 35, operones - 17 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 24 - 893 87 ## PRU_0444 putative WbuO protein + Term 1074 - 1120 1.1 + Prom 1064 - 1123 5.4 2 2 Op 1 1/0.000 + CDS 1202 - 2014 539 ## COG1216 Predicted glycosyltransferases 3 2 Op 2 . + CDS 2026 - 3039 733 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Term 2906 - 2940 -0.9 4 3 Tu 1 . - CDS 3155 - 3376 149 ## - Prom 3474 - 3533 4.5 5 4 Tu 1 . + CDS 4580 - 4936 227 ## BT_0405 hypothetical protein + Term 5045 - 5080 1.1 + Prom 5110 - 5169 5.3 6 5 Op 1 . + CDS 5283 - 5729 555 ## BT_4676 putative periplasmic protein 7 5 Op 2 . + CDS 5767 - 6267 524 ## BT_4677 hypothetical protein + Term 6287 - 6333 11.2 - Term 6274 - 6321 3.8 8 6 Op 1 . - CDS 6332 - 7540 969 ## COG1760 L-serine deaminase 9 6 Op 2 . 
- CDS 7560 - 8612 894 ## COG0598 Mg2+ and Co2+ transporters 10 6 Op 3 . - CDS 8699 - 11200 2287 ## COG1193 Mismatch repair ATPase (MutS family) - Prom 11365 - 11424 80.3 + TRNA 11334 - 11421 48.9 # Ser TGA 0 0 + Prom 11663 - 11722 8.7 11 7 Tu 1 . + CDS 11897 - 12622 334 ## BF1199 beta-lactamase + Term 12628 - 12668 6.1 - Term 12661 - 12709 7.2 12 8 Tu 1 . - CDS 12726 - 14129 641 ## BVU_1439 mobilization protein - Term 14258 - 14311 9.2 13 9 Tu 1 . - CDS 14354 - 15310 493 ## BVU_1440 DNA primase - Prom 15397 - 15456 3.8 - Term 15422 - 15467 1.1 14 10 Op 1 . - CDS 15525 - 16601 552 ## BVU_2466 hypothetical protein 15 10 Op 2 . - CDS 16616 - 16924 275 ## BVU_2467 hypothetical protein - Prom 17023 - 17082 6.1 - Term 17069 - 17108 6.2 16 11 Op 1 . - CDS 17148 - 18185 754 ## BVU_2468 hypothetical protein 17 11 Op 2 . - CDS 18191 - 19489 294 ## PROTEIN SUPPORTED gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 18 11 Op 3 . - CDS 19557 - 20840 965 ## BVU_2469 tyrosine type site-specific recombinase - Prom 20932 - 20991 5.1 + Prom 21400 - 21459 4.8 19 12 Op 1 . + CDS 21530 - 22657 463 ## COG2207 AraC-type DNA-binding domain-containing proteins 20 12 Op 2 . + CDS 22654 - 23355 300 ## COG1600 Uncharacterized Fe-S protein 21 12 Op 3 . + CDS 23352 - 23651 216 ## BT_0588 BexA, multidrug efflux pump 22 13 Tu 1 . - CDS 23564 - 23779 202 ## gi|294809361|ref|ZP_06768071.1| hypothetical protein CW3_2603 - Prom 23871 - 23930 4.5 + Prom 23789 - 23848 5.0 23 14 Op 1 . + CDS 23950 - 24885 289 ## COG3129 Predicted SAM-dependent methyltransferase 24 14 Op 2 . + CDS 24936 - 25703 617 ## BT_4696 hypothetical protein + Term 25737 - 25784 4.0 + Prom 25738 - 25797 5.0 25 15 Op 1 . + CDS 25858 - 27615 1104 ## BT_4697 transcriptional regulator 26 15 Op 2 . + CDS 27681 - 28043 390 ## COG3189 Uncharacterized conserved protein 27 15 Op 3 . 
+ CDS 28128 - 29360 1173 ## COG0561 Predicted hydrolases of the HAD superfamily 28 15 Op 4 10/0.000 + CDS 29357 - 30472 716 ## COG1169 Isochorismate synthase 29 15 Op 5 1/0.000 + CDS 30492 - 32159 1299 ## COG1165 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase 30 15 Op 6 . + CDS 32163 - 32987 1164 ## COG0447 Dihydroxynaphthoic acid synthase + Prom 33081 - 33140 3.9 31 16 Op 1 4/0.000 + CDS 33160 - 34185 670 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 32 16 Op 2 . + CDS 34217 - 35356 643 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II + Term 35487 - 35540 4.7 33 17 Tu 1 . - CDS 35924 - 36487 396 ## BT_4705 RNA polymerase ECF-type sigma factor - Prom 36521 - 36580 4.3 34 18 Tu 1 . - CDS 36594 - 36824 73 ## gi|294809374|ref|ZP_06768084.1| hypothetical protein CW3_2616 - Prom 36904 - 36963 3.9 + Prom 36639 - 36698 7.4 35 19 Tu 1 . + CDS 36940 - 38160 855 ## gi|237713073|ref|ZP_04543554.1| predicted protein + Term 38191 - 38247 7.5 + Prom 38250 - 38309 4.4 36 20 Tu 1 . + CDS 38334 - 39254 797 ## COG3712 Fe2+-dicitrate sensor, membrane component + Prom 39304 - 39363 2.3 37 21 Op 1 . + CDS 39508 - 42828 2933 ## BT_4707 hypothetical protein 38 21 Op 2 . + CDS 42845 - 44491 1417 ## Cpin_6734 hypothetical protein 39 21 Op 3 . + CDS 44522 - 45592 722 ## BT_3753 endo-beta-N-acetylglucosaminidase F2 precursor (mannosyl-glycoprotein endo-beta-N-acetyl-glucosaminidase F2) 40 21 Op 4 . + CDS 45614 - 46813 788 ## BT_4710 hypothetical protein + Term 46850 - 46895 8.1 - Term 46841 - 46880 3.5 41 22 Tu 1 . - CDS 46909 - 47385 550 ## COG0783 DNA-binding ferritin-like protein (oxidative damage protectant) - Prom 47405 - 47464 3.5 + Prom 47361 - 47420 12.0 42 23 Tu 1 . + CDS 47523 - 48449 828 ## COG0583 Transcriptional regulator 43 24 Tu 1 . - CDS 48474 - 48644 174 ## BT_4717 integral membrane protein, putative permease + Prom 48637 - 48696 5.4 44 25 Tu 1 . 
+ CDS 48762 - 49451 738 ## COG0580 Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) + Term 49495 - 49533 4.2 - Term 49481 - 49522 9.5 45 26 Tu 1 . - CDS 49590 - 50312 711 ## BT_4719 hypothetical protein - Prom 50374 - 50433 3.8 + Prom 50282 - 50341 7.0 46 27 Op 1 . + CDS 50489 - 51037 505 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 47 27 Op 2 . + CDS 51044 - 52342 997 ## BT_4721 hypothetical protein + Term 52368 - 52428 2.2 + Prom 52438 - 52497 3.7 48 28 Op 1 6/0.000 + CDS 52555 - 53151 429 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 49 28 Op 2 . + CDS 53219 - 54199 945 ## COG3712 Fe2+-dicitrate sensor, membrane component + Term 54218 - 54264 4.1 + Prom 54240 - 54299 2.5 50 29 Op 1 . + CDS 54362 - 57781 3057 ## BT_4724 hypothetical protein 51 29 Op 2 . + CDS 57794 - 59140 991 ## BT_4725 hypothetical protein 52 29 Op 3 . + CDS 59162 - 61693 1981 ## COG0584 Glycerophosphoryl diester phosphodiesterase 53 29 Op 4 6/0.000 + CDS 61718 - 62644 681 ## COG0584 Glycerophosphoryl diester phosphodiesterase 54 29 Op 5 . + CDS 62661 - 64016 694 ## COG2271 Sugar phosphate permease + Term 64068 - 64125 9.2 55 30 Op 1 . - CDS 64123 - 65688 1031 ## BF2880 hypothetical protein 56 30 Op 2 . - CDS 65699 - 65872 70 ## gi|294646425|ref|ZP_06724067.1| hypothetical protein CW1_3805 - Prom 65922 - 65981 8.1 57 31 Op 1 6/0.000 + CDS 65921 - 66541 532 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Term 66574 - 66619 7.5 + Prom 66581 - 66640 4.8 58 31 Op 2 . + CDS 66669 - 67664 590 ## COG3712 Fe2+-dicitrate sensor, membrane component 59 31 Op 3 . + CDS 67710 - 68570 670 ## BT_0190 hypothetical protein 60 31 Op 4 . + CDS 68567 - 71107 1655 ## BT_0190 hypothetical protein 61 31 Op 5 . + CDS 71123 - 72505 822 ## Slin_1275 RagB/SusD domain protein + Prom 72516 - 72575 4.7 62 32 Tu 1 . 
+ CDS 72611 - 74386 693 ## COG0584 Glycerophosphoryl diester phosphodiesterase + Prom 74500 - 74559 4.6 63 33 Op 1 6/0.000 + CDS 74697 - 75482 594 ## COG0584 Glycerophosphoryl diester phosphodiesterase 64 33 Op 2 . + CDS 75496 - 76851 1004 ## COG2271 Sugar phosphate permease + Term 76903 - 76958 16.4 + Prom 76874 - 76933 3.5 65 34 Op 1 . + CDS 76978 - 77112 235 ## 66 34 Op 2 . + CDS 77121 - 77366 158 ## BT_4730 hypothetical protein + Term 77382 - 77422 1.7 67 35 Tu 1 . + CDS 77423 - 77620 162 ## gi|160887564|ref|ZP_02068567.1| hypothetical protein BACOVA_05584 + Term 77638 - 77678 -0.8 Predicted protein(s) >gi|222159205|gb|ACAB01000154.1| GENE 1 24 - 893 87 289 aa, chain + ## HITS:1 COG:no KEGG:PRU_0444 NR:ns ## KEGG: PRU_0444 # Name: not_defined # Def: putative WbuO protein # Organism: P.ruminicola # Pathway: not_defined # 20 181 5 158 261 64 32.0 4e-09 MKGDITINSGQAKYACAFIPNYVFYIPFAYYGVVRMGTLSKFIGFSIYYILPTLYYFLYC FGYSLYGIPIYLVSLLLLYNLYEVGYIQNDTETVQKDRQPSLRLYGYNLKFYYEHRCSIY LTRIFISITLSFLLIFISNENKGAFLYIVLCFIELGVFLIYNLIRSNLSLVIFVLLQTIK YVIYAFYFYPKVNFIVIGMLILVYPVPNIIERFSYKRYNLSLFMNCLPNKNCFTLFRALY FGILSFGFILLNIRGILSFMCYAVFLVIAIFRISLLLLVRIYNFKNYLT >gi|222159205|gb|ACAB01000154.1| GENE 2 1202 - 2014 539 270 aa, chain + ## HITS:1 COG:RSc0688 KEGG:ns NR:ns ## COG: RSc0688 COG1216 # Protein_GI_number: 17545407 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Ralstonia solanacearum # 2 263 4 265 275 191 36.0 2e-48 MITVSIVTYKTNLEELSKCLQSLTSSLVFQIYMVDNSNERYIADFCQQYPNVVYIGSENV GYGAGHNQALRQVLNLGGKYHLILNSDVYFEPSVLEFLATYMDAHADVAQVQPNVVYPDG EQQYTCRLLPTPANLIFRRFLPKRMVEKMNIRYQLKFDDHKKEMNVPYHQGSFMFFRLEC FKKVGLFDERFFMYPEDIDITRRMHKWYRTMFVPSVTIVHAHRAASYKSKKMLKIHMVNM IKYFNKWGWIVDQERSAWNRKLLEELGYKK >gi|222159205|gb|ACAB01000154.1| GENE 3 2026 - 3039 733 337 aa, chain + ## HITS:1 COG:ECs2847 KEGG:ns NR:ns ## COG: ECs2847 COG0451 # Protein_GI_number: 15832101 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate 
transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Escherichia coli O157:H7 # 2 324 4 310 331 131 30.0 2e-30 MNYFITGGTGFIGTHLANLLHEVHPEAKIYNLDIIEPGTPLPTVKKYKPALRKGEEHAAT FIYCDVRQPIKLEKVNITPDDVIFNFAAVHRTPGHPDPAYFETNIRGAENVCAFAERQGI RKIVFTSSIAPYGAAEELKEETTLPMPNTPYGISKLVAEKIHQTWQAGGEGRQLTIVRPG VVFGRGENGNFTRLYWGIRGHKFMYPGRKDTIKACIYVKELVRFMLYRLEHHDSGVELYN CCYEPAYTIEHIVESMKRVTNMKVKVPFMPGWLIMTAAGIVGAIGSPMGICPARVRKLQI STNICGKKLSASGYRFRYTFEEALADWYKDNGNLYLQ >gi|222159205|gb|ACAB01000154.1| GENE 4 3155 - 3376 149 73 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEFENNATSEIAAAVQFYYGIDVVFPKAVFHAKIIDLLSYSAMRFKLFTTLLIKEVGGRE KRKLAPMAKPFNG >gi|222159205|gb|ACAB01000154.1| GENE 5 4580 - 4936 227 118 aa, chain + ## HITS:1 COG:no KEGG:BT_0405 NR:ns ## KEGG: BT_0405 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 117 1 117 119 134 50.0 1e-30 MAKEKQKPYEFLSNLVLALMGTDRIFSNSFFSSEFAISPNTLSEIRRGEDMCIYQYVRVI RCMMKYLHLIVRMDMLLKELRAVLASNCDLVVATVPHRFHGTYQPKEWVVVMHWDGIK >gi|222159205|gb|ACAB01000154.1| GENE 6 5283 - 5729 555 148 aa, chain + ## HITS:1 COG:no KEGG:BT_4676 NR:ns ## KEGG: BT_4676 # Name: not_defined # Def: putative periplasmic protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 148 1 147 147 238 83.0 5e-62 MKKFLSLLVLALVAVQFSFAKDVITKDMNQLPLPARNFINSNFTKPQVAHIKIDKDMMES TKYEVLLMDGTEIDFDSKGNWEEVSAKKGQTVPVSIVPGFAVNYLKAHNFVNEGVTKVER DRKGYEIELSTGLSFKFDKKGKFIKADD >gi|222159205|gb|ACAB01000154.1| GENE 7 5767 - 6267 524 166 aa, chain + ## HITS:1 COG:no KEGG:BT_4677 NR:ns ## KEGG: BT_4677 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 166 1 166 166 254 74.0 7e-67 MWRFKFSAWAVVLLMTALSFGACDNDDDDTFVPPSNITEALKQVYPAAQNIEWEMKGAYY VADCWVSNDELEVWFDANANWVMTENELNSIDQLVPAVYTAFIDSKYNAWVVTDVYVLTF PQNPMESVIQVKQGSQRYSLYFTQDGGLIHEKDISNGDDTNWPPTE >gi|222159205|gb|ACAB01000154.1| GENE 8 6332 - 7540 969 402 aa, chain - ## HITS:1 COG:FN1106 KEGG:ns 
NR:ns ## COG: FN1106 COG1760 # Protein_GI_number: 19704441 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Fusobacterium nucleatum # 1 402 1 402 408 427 52.0 1e-119 MKSIKELYRIGTGPSSSHTMGPRKAAEMFLERHPDAASFKVTLYGSLAATGKGHMTDVAI IDTLQPTAPVEIVWQPKVFLPFHPNGMTFAALDSNDKVQENWTVYSIGGGALAENNDNPT IESPDVYGMENMTEILQWCEDTGKSYWEYVKECEEEDIWDYLAEVWATMKDAIHRGLEAE GVLPGPLNLRRKASTYYIRATGYKQSLQSRGLVFSYALAVSEENASGGKIVTAPTCGSCG VMPAVLYHLQKSRDFSDMRILRALATAGLFGNIVKFNASISGAEVGCQGEVGVACAMASA AANQLFGGSPAQIEYAAEMGLEHHLGMTCDPVCGLVQIPCIERNAYAAARALDANLYSAF TDGMHRVSFDKVVQVMKQTGHDLPSLYKETSEGGLAKDYKQM >gi|222159205|gb|ACAB01000154.1| GENE 9 7560 - 8612 894 350 aa, chain - ## HITS:1 COG:MA1721 KEGG:ns NR:ns ## COG: MA1721 COG0598 # Protein_GI_number: 20090573 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Methanosarcina acetivorans str.C2A # 27 349 38 355 356 224 38.0 2e-58 MKNNLLSEKLIYTGASLTPTHLHLCTYNATEMQESANDTFQAIKGTLNSDRINWLQIHGM KDTETIREICSHFEIDFLVLQDILNADHPTKIEEHDKYIVLILKIFYPNEHKEEDELDEL LQQQVCLILGNNYVLTFLEKETDFFDDVSSALRNDVLKIRSRQTDYLLSVLLNSIMGNYI STISSIDDALEDLEEELLTITSGDDIGIQIQALRRQYMLMKKSILPLKEQYIKLLRAENL LIHKVNRAFFNDVNDHLQFVLQTIEICRETLSSLVDLYISNNDLRMNDIMKRLTIVSTIF IPLTFLVGVWGMNYKWMPELEWQYGYLFAWIVMAIIGIIVYLYFRKKKWY >gi|222159205|gb|ACAB01000154.1| GENE 10 8699 - 11200 2287 833 aa, chain - ## HITS:1 COG:BH3106 KEGG:ns NR:ns ## COG: BH3106 COG1193 # Protein_GI_number: 15615668 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Bacillus halodurans # 13 832 10 784 785 361 31.0 3e-99 MIYPQNFEQKIGFDQIRQLLKDKCLSTLGEERVTDMTFSEQHEEVEEKLNQVTEFVRIIQ EKDGFPDQFFFDVRPSLKRVRIEGMYLDEQELFDLRRSLETIRDIVRFLHRNEDEEENDT PYPSLKRLAGDIAVFPQLIGKIDGILNKYGKIKDNASTELARIRRELASTMGNISRSLNS ILRSAQSEGYVDKDVAPTMRDGRLVIPVAPGLKRKIKGIVHDESASGKTVFIEPAEVVEA NNRIRELEGDERREIIRILTEFSNILRPSIPEILQSYEFLAEIDFIRAKSYFAIQTNSLK 
PAVENEQLLDWTMAVHPLLQLSLAKHGKKVVPLDIELNQKQRILIISGPNAGGKSVCLKT VGLLQYMLQCGMLIPLHERSHAGIFSSIFIDIGDEQSIEDDLSTYSSHLTNMKIMMKSCN ERSLILIDEFGGGTEPQIGGAIAEAVLKRFNQKGTFGVITTHYQNLKHFAEDHEGVVNGA MLYDRHLMQALFQLQIGNPGSSFAVEIARKIGLPEDVIADASEIVGSEYINADKYLQDIV RDKRYWEGKRQTIRQREKHMEETIARYQTEMEELQKSRKEIIRQAKEEAERMLQESNARI ENTIRTIKEAQAEKEKTRQARQELTDFRTSLDALASKEHEEKIAQKMKKLKEKQERKKNK KNEPKAAVSSPSSAPKVVPIAVGENVKIKGQTSVGQVMEINGKNATVAFGSIKTTVKIDR LERSNNAPKMEGIAKSTFVSSQTHDQMYEKKLSFKQDIDVRGMRGDEALQAVTYFIDDAI LVGMDRVRILHGTGTGILRTLIRQYLATVPGVSHYADEHVQFGGAGITVVDFD >gi|222159205|gb|ACAB01000154.1| GENE 11 11897 - 12622 334 241 aa, chain + ## HITS:1 COG:no KEGG:BF1199 NR:ns ## KEGG: BF1199 # Name: cepA # Def: beta-lactamase # Organism: B.fragilis_NCTC9343 # Pathway: Biosynthesis of secondary metabolites [PATH:bfs01110]; Two-component system [PATH:bfs02020] # 1 236 64 297 300 169 38.0 8e-41 MMSVFKVHQALALCNDFDNKGISLDTLVKIDRNRLDSKTWSPMMKDYSELVISLTVRDLL RYTIAQSDNNASNLMFKDMVNVAQTDSFIATLIPRSSFQIAYTEEEMSADHDRAYFNYTS PLGAAMLMNRLFTESIVSGEKQSFIKNTLKECVTGTDRIVAPLLDKERVSIAHKTGSGDV NENGILAAHNDVAYICLPNNVCYTLAIFVKDFKGNESQASQYVAHISEVVYSLLIQNSAI P >gi|222159205|gb|ACAB01000154.1| GENE 12 12726 - 14129 641 467 aa, chain - ## HITS:1 COG:no KEGG:BVU_1439 NR:ns ## KEGG: BVU_1439 # Name: not_defined # Def: mobilization protein # Organism: B.vulgatus # Pathway: not_defined # 1 467 1 467 467 718 79.0 0 MATKSSIHIKPCRVTSSGAHNRRTAEYMRNIGESRIYIVPELTSNNEQWINPGFGSPDLQ AHYDSIKRMVKEKTGRAMQEKERERKGKNGKIIKVAGCSPIREGVLLIRPDTTLADVRKF GEECQRRWGITPLQIFLHKDEGHWLSGQPKTGDRESFQVGEKWFKPNYHAHIVFDWMNHD TGKSQKLNDDDMMEMQTLASDILSMQRGQSKSETGKEHLERNDFIIEKQRAELQRMDAAK RHKEEQIGLAEQELKQVKSEIRTDKLKKTATNAATAIASGVGSLFGSGKMKELERTNEDL HQEIAKRDKGIDTLKIQMQEMQERHGKQIRNLQSIHNQELEAKDREISRLNTLLEKAFKW FPMLREMLRMEKLCAVIGFTKDMIDCLLTRKEAIQCNGKIYSEEHRRKFEIKNNIFKVEK NPTDDSKLVLTINRQPISEWFKEQWEKLRQGLRQPTEEPRKSRGFKL >gi|222159205|gb|ACAB01000154.1| GENE 13 14354 - 15310 493 318 aa, chain - ## HITS:1 COG:no KEGG:BVU_1440 NR:ns ## KEGG: BVU_1440 # Name: not_defined # 
Def: DNA primase # Organism: B.vulgatus # Pathway: not_defined # 1 318 1 318 318 551 83.0 1e-155 MDIQTAKQIKIADYLHSLGFSPVKQQGINLWYKSPLREETEASFKVNTERNQWYDFALGK GGNIIALAQELYCSDHVPYLLQKIEEQTPRIRPVSFSFGKQSSSEPSFQQLEIVPLSSPA LLAYLQERGINIAMAKRECSEAHFTHNGKRYFAIAFPNVSGGYEIRNQYFKGCIAPKEIS HIKQPGTARETCYVFEGFMDYLSFLTLRLENCPKYPELDRQDYMVLNSVANVSKAIYPLG SYERIHCFFDNDRAGMEAMQQIYKEYGRDLYIRDASQTYSGCKDLNEYLQKQTERNRQVQ SAKGVRSQPPKKKNGFRL >gi|222159205|gb|ACAB01000154.1| GENE 14 15525 - 16601 552 358 aa, chain - ## HITS:1 COG:no KEGG:BVU_2466 NR:ns ## KEGG: BVU_2466 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 358 1 357 357 516 74.0 1e-145 MDYMKDIKEISAEEAVILWQASRLSLSKIYEKAPEILKVQGSVIGTLGNFSASIGKAKSK KTFNVSAIVAAALKNGTVLRYVAELPEEKRKILYVDTEQSPYHCLKVMKRILRMAGLPDD RDNEHLEFLALRKYTPEQRIRIVEQAIYNTPDIGLVIIDGIRDMVYDINSPGESTRIISK LMQWTDDRQIHIHTILHQNKGDENARGHIGTELNNKAETVLLVEKDKSNGDISNVSAMHI RAMDFEPFSFRINDNAIPELLEGYKPETKKPGRPEEEKFDPYRHITEQQHRIALEAVFGL KEEYGYKELEDTLIKTYVSVGVKLNHKKAVSLITMLRNKRMIVQENGRKYTFMPDFHY >gi|222159205|gb|ACAB01000154.1| GENE 15 16616 - 16924 275 102 aa, chain - ## HITS:1 COG:no KEGG:BVU_2467 NR:ns ## KEGG: BVU_2467 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 4 102 3 102 102 148 74.0 5e-35 MTNLQELLLKPVWQMTGEEFIFLSKYASNQTEAQPQPVTDTERKYVYGILGIAKLFGCSL PTANRIKKSGKIDKAITQIGRKIIVDAELALELAGKKTGGRK >gi|222159205|gb|ACAB01000154.1| GENE 16 17148 - 18185 754 345 aa, chain - ## HITS:1 COG:no KEGG:BVU_2468 NR:ns ## KEGG: BVU_2468 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 344 1 337 338 277 46.0 6e-73 MAGEINKLIPQYGELNRIYNDWVTNYAFSFDKQKFITDFYRQHNDTKAFEAAILELVLDK QKEQYTLILNSLKIEIEKNIRAYETKPLNDDVIKRVCFHYADRHNSAIKDQLEITTKLHE PLNDAYHRYDFIGFREHTDEEEIQAEKEYERCKAEYDKEKKELDKLYELQKQDRKEAFQY IENLSGNVYRLSLLFMEVLKKYLPNDREEKQQDEPVKQNEQEEVQNSPEGQHEYFDMKQL SPIHETCVGEQFEAITIADFYANINLYPCKNKLKIKAREKIRVCYLIFLMSEKLSKQYRD 
EWRSQILKLLDIDESYYRSKYKEPVSDFPSDSNQKFAKEMESIFG >gi|222159205|gb|ACAB01000154.1| GENE 17 18191 - 19489 294 432 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 [Haemophilus parasuis 29755] # 208 420 122 327 339 117 32 1e-25 MENNREYHIGRTDFDAFVHAVGSEIEQAQVRLIAAANAQMLFHYWKMGNYILYHQNLQGW GSKIIKQLAKAIRFNYPEKKGYSERNLTYMCQFARSYPLNVLRSFIDTDTRLSVPSIQNV TDEVLKLNNGQFTQELTAQIQSADCQSLEITQEVPAQIQDVEKTVSAIYRMEIREIEKVF LKSPVAKINWASQMVILNGSLPLGIGYWYMKQAVEMGWSSNVLKMQIENNLYNRQINNNK VNNFTATLPAPQSDLANYLLKDPYIFDLAGAKEKADERDIEEQLVKHVTRYLLEMGNGFA FVARQKHFQIGNSDFFADLILYSIPLHAYIVVELKATPFKPEYAGQLNFYINVVDDKLRG KNDNKTIGLLLCKGKDEVVAQYALTGYDQPIGISDYQLSKAIPENLKSALPSVEEVEEEL ASFLDKDNNPQN >gi|222159205|gb|ACAB01000154.1| GENE 18 19557 - 20840 965 427 aa, chain - ## HITS:1 COG:no KEGG:BVU_2469 NR:ns ## KEGG: BVU_2469 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.vulgatus # Pathway: not_defined # 1 427 1 430 430 710 82.0 0 MNIKRNIIFALESRKKNGVPIVENVPIRMRVIYASQRIEFTTGYRIDVAKWDADKQRVKN GCTNKLKQSASEINADLLKYYAEIQNVFKEFEVQETIPTTQQLKDAFNLRMKDTSEEQQE ETQISFWEVFDEFVKECGNQNNWTASTYEKFSTVKNHLKEFKEDVTFEYFNEFGLNEYVN FLRDKKDMRNSTIGKQMGFLKWFLRWSFKKGHHQNIAYDTFKPKLKTTSKKVIFLTWDEL NRLKDYRIPKDKQYLERVRDVFLFCCFTSLRYSDVRNLKRSDVKPDHIEVTTVKTADSLI IELNDHSKTILEKYKEVHFENHMALPVISNQKMNDYLKELGELAEINEPVRETYYKGNER IDEVTPKYALLTTHAGRRTFICNALALGIPAQVVMKWTGHSDYKAMKPYIDIADDIKANA MNKFNQL >gi|222159205|gb|ACAB01000154.1| GENE 19 21530 - 22657 463 375 aa, chain + ## HITS:1 COG:lin1814_1 KEGG:ns NR:ns ## COG: lin1814_1 COG2207 # Protein_GI_number: 16800881 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 100 205 11 117 142 74 32.0 3e-13 MNINTETKYCQTCGIPLDIDYASLEEGQNEEYCDYCLKNGVKSYDFSMDYLIYLWGLFPE EYYREVGISYSSSELREIMSKRLPEIKRWKQKINTAHVQYELIIKVQEYINCHLFDDLDS DRLSQVAGISKFHFRRLFKAICGDSLGNYIFRLRLEYIAFKLISTDTSVPELLSQINYQN KHTLSRAFKSYFNCSIPEFRKKHSNARPEGKNPIQVELSIKKVHNMRIAYLKLERTNNIS 
HSFSLLWKQLLQFSENYELPSKGCQYVSLTLDYPLITPEEHSRFMVGVTLPQSFETPKGF GMYEISSGEYAIFRFKGLYHELNRVYRHIYLDWLPTSDYTLREPFTFETYINTPEKTPAS ELITAIYIPVTRKET >gi|222159205|gb|ACAB01000154.1| GENE 20 22654 - 23355 300 233 aa, chain + ## HITS:1 COG:MA3660 KEGG:ns NR:ns ## COG: MA3660 COG1600 # Protein_GI_number: 20092460 # Func_class: C Energy production and conversion # Function: Uncharacterized Fe-S protein # Organism: Methanosarcina acetivorans str.C2A # 1 227 7 244 248 104 30.0 1e-22 MKYLNQEIENELRNQGAELIRFVDISHLSETQNRQFPHAIVFALPLTAGYIKEVCDTPDY VQARIDDNYNFDDDEYSITENRTHKLADNMAAYIAEKGYNAFSQSDENLIAEGKFDAAHK ESLLPNKTIALLAGLGWIGKNNLLITPEYGAAQCLGTVLTDAPLETFLCETLHPHCGKCT ACVNICERKVLKGKVWSTSVSRDEIVDVYGCSTCLKCLVHCPYTRRYCKQLTK >gi|222159205|gb|ACAB01000154.1| GENE 21 23352 - 23651 216 99 aa, chain + ## HITS:1 COG:no KEGG:BT_0588 NR:ns ## KEGG: BT_0588 # Name: not_defined # Def: BexA, multidrug efflux pump # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 74 4 70 443 99 82.0 3e-20 MRRNNNIKYTYKQIGGITFPILISLLMEMMIGVTDTAFLGRVGEIELGASAISGVFYLII FMVAFGFSVGAQILAISSFVFKLLISHFGNKLFQNDISF >gi|222159205|gb|ACAB01000154.1| GENE 22 23564 - 23779 202 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294809361|ref|ZP_06768071.1| ## NR: gi|294809361|ref|ZP_06768071.1| hypothetical protein CW3_2603 [Bacteroides xylanisolvens SD CC 1b] # 10 71 1 62 62 120 98.0 3e-26 MRGSEKDTLLAPTYHWTLFRARYGEACLINPIYTQWVNMLEKSQKEISFWNNLLPKCEIN NLKTKEEIARI >gi|222159205|gb|ACAB01000154.1| GENE 23 23950 - 24885 289 311 aa, chain + ## HITS:1 COG:VC1614 KEGG:ns NR:ns ## COG: VC1614 COG3129 # Protein_GI_number: 15641622 # Func_class: R General function prediction only # Function: Predicted SAM-dependent methyltransferase # Organism: Vibrio cholerae # 2 309 44 350 362 307 48.0 1e-83 MAQKGELHIRNKHNGQYDFLLLIENYPPLKRFVSLNPLGVQTINFFNPQAVKALNKALLI SYYGIRYWDIPKQYLCPPIPGRADYIHYIADLIQPDRVANDLQTEEEDANEQKTKCRCLD VGVGANCIYPIIGHTEYGWTFVGSDIDPVSIENARKIVTCNPVLAHKIDLRLQKDSRKIF DGIIMPGEYFDVTICNPPFHSSKEEAEDGTLRKLSSLKGTKVKKVQLNFGGSANELWCEG 
GEIRFILNMISESQKYQKNCGWFTSLVSKEKNLEKLCAKLKSVNVSEYKIIRMQQGTKSS RILAWRFSNNS >gi|222159205|gb|ACAB01000154.1| GENE 24 24936 - 25703 617 255 aa, chain + ## HITS:1 COG:no KEGG:BT_4696 NR:ns ## KEGG: BT_4696 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 255 1 255 255 464 82.0 1e-129 METNYITLTKENIADEHICCAFSDKKCKDSYELKKAWLKKEFDNGYVFRRIDERAKVFIE YGPAEKAWVPVSAPNYLMINCFWVSGKYKGCGHGKALLQSAIDDAKLQGKDGLVTVVGTS KFHFMSDTKWLLRQGFQTIEKLPYGFSLLALKINPAVPDPSFNDSVLTGECPDRDGIVVY YTHRCPFTEFHVRGSLVNAAKSKDLPLKIIELETMEQAQNAPTPATIFTLFYNGKFVTTD LSACIEARLGKALGL >gi|222159205|gb|ACAB01000154.1| GENE 25 25858 - 27615 1104 585 aa, chain + ## HITS:1 COG:no KEGG:BT_4697 NR:ns ## KEGG: BT_4697 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 583 4 588 589 674 59.0 0 MRFFVCILVLLFVHNQFSRADKTLTDDSLYTEKYIRNIYIPEPRRALQLLDEAEDRKTIP LRVVNELRSLSYSNMYMNKLAFMYARKAYLLDSIYQKDPKHMLKMTVHLAEFSAMMSKYN ESMRYALDGIRQAQELGNREAKARLFFCMGENNWRLSFKDKAYDYFNRTIELLRGSKEMR EMMLLSYYYGAEMSFLMNDSRIDEALKVGFEREKQIKRLKEVPQISEDYVDGQYSYLYAK LAYIYCMEKKYEKAEQYYQKYLSKKESHTPDGKMYSVPYLALSGQYEKVIDNCRGFKELM RTQQDTLNEQYLTVLRQEVKAYLGMHKYKEAAEIRETILTITDSINTRDRNNAALELNAM YGASEKEEYIAEQAFQLKIRNITLCFLACIVVLTLFIVWRLWHFNHIVEYKNRMLARLIN EKFANRKDGNQLSEVYEQLAVVSEVEPERISPEELEELANDTDKESGEEEENKKIFQELN DIVIQKQLYLSSELSREDLAQIVHLNNARFARMIRECTGTNFNGYINELRINYAIELMKK HPNYTIRAIADEAGFNSTPILYNLFKKKTGMTPYEFKKAQDSLQN >gi|222159205|gb|ACAB01000154.1| GENE 26 27681 - 28043 390 120 aa, chain + ## HITS:1 COG:BMEII0787 KEGG:ns NR:ns ## COG: BMEII0787 COG3189 # Protein_GI_number: 17989132 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Brucella melitensis # 1 115 1 114 116 99 45.0 2e-21 MIQVRIKRVYDDFSETDGYRVLVDKLWPRGMKKEWLKYDYWAKDITPSSTLRRWFHEDIP GHWEDFVVQYQKELDASSSTADFLTLIKPHPVITLLYASKEPVYNHARILRDYLEMRLKD >gi|222159205|gb|ACAB01000154.1| GENE 27 28128 - 29360 1173 410 aa, chain + ## HITS:1 COG:SP0923 KEGG:ns 
NR:ns ## COG: SP0923 COG0561 # Protein_GI_number: 15900803 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Streptococcus pneumoniae TIGR4 # 4 269 3 269 269 162 37.0 1e-39 MKYKLLVLDVDGTLLNDAKEISKRTLAALLKVQQMGVRIVLASGRPTYGLMPLAKMLELG NYGGFILSYNGCQIINAQNGEILFERRINPEMLPYLEKKARKNGFALFTYHDDTIITDSP ENEHIQNEARLNNLQIIKEEEFSAAVDFAPCKCMLVSDDEEALVGLEDHWKRRLNGALDV FRSEPYFLEVVPCAIDKANSLGALLEVLDVKREEVIAVGDGVCDVTMIQLAGLGVAMGHS QDSVKACADYVTASNEEDGVAVAVEKAIIAEVRAAEIPLDQLNAQARHALMGNLGIQYTY ADEDRVEATMPVDHRTRQPFGILHGGATLALAETVAGLGSMILCQPDEIVVGMQVSGNHI SSAHEGDTVRAVGTVVHKGRSSHVWNVDVFTSTNKLVSSIRVVNSVMKKR >gi|222159205|gb|ACAB01000154.1| GENE 28 29357 - 30472 716 371 aa, chain + ## HITS:1 COG:L0168 KEGG:ns NR:ns ## COG: L0168 COG1169 # Protein_GI_number: 15672714 # Func_class: H Coenzyme transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Isochorismate synthase # Organism: Lactococcus lactis # 106 360 143 400 404 112 32.0 9e-25 MTDEEISNLTTIDAFIQRKQPFAVYRIPGEKVPRLLTQAEGVVRLIYDLKELNGQRGFVI APFQVSESCPVVLIQPDQWGQPLPVDDDTEEDREIALRLQGQESFLTSSTEEYTACFHTF INALRDRTFDKLVLSRHLTVDKVADFSPSSIFRAACKRYIHSYIYLCYTPQTGIWLGSTP EIILSGEKDEWHTVALAGTQPLQDGRLPQVWDEKNRKEQDYVASYIRRQLLSLDIHSTEN GPYPAYAGALSHLKTDFRFSLKDNKGLGDLLKVLHPTPAVCGLPKEEAYQFILQNEGYDR CYYSGFIGWLDPGGRTDLYVNLRCMHIEDKQLTLYAGGGLLASSELNDEWLETEKKLQTI KRLIAVPPLKS >gi|222159205|gb|ACAB01000154.1| GENE 29 30492 - 32159 1299 555 aa, chain + ## HITS:1 COG:BS_menD KEGG:ns NR:ns ## COG: BS_menD COG1165 # Protein_GI_number: 16080134 # Func_class: H Coenzyme transport and metabolism # Function: 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase # Organism: Bacillus subtilis # 19 473 21 494 580 184 29.0 5e-46 MYTDKKNILQLVALLEAHGITKVVLCPGSRNTPIVHTLSNHPNFTCYPVTDERSAGYFAI GLALNGGKPAAVCCTSGTALLNLHPAVAEAFYQNVPLVIISADRPAAWIGQMDGQTVPQP GVFQTLVKKSVNLPEIHTEEDEWYCNRLVNEALLETNHHGKGPVHINIPISEPLFQFTVD 
SLPEVRVITRYQGLNVYDRDYNDLVDRMNKYQKRMIIIGQMNLIYLFEKRYIKLLYKHFA WLTEHIGNQTVPGIPVKNFDAALYAMPEEKVGQMVPELLITYGGHVVSKRLKKFLRQHPP KEHWHVSPDGEVVDLYGSLTTVIEMDPFEFLEKIASLLDNRTPEYPRVWENYCKIIPEPE FAYSEMSAIGTLLKALPESCALHLANSSVVRYAQLYSIPSTIEVCCNRGTNGIEGSLSTA VGYAAASDKLNFIAIGDLSFFYDMNALWNVNVRSNLRVLLLNNGGGEIFHTLPGLDMSGT SHKFIAAVHKTSAKGWAEERGFLYLQAQNDEELAEAMKTFTQPEAMEQPVLLEVFTNKNK DARMLKNYYHQLKQK >gi|222159205|gb|ACAB01000154.1| GENE 30 32163 - 32987 1164 274 aa, chain + ## HITS:1 COG:BS_menB KEGG:ns NR:ns ## COG: BS_menB COG0447 # Protein_GI_number: 16080132 # Func_class: H Coenzyme transport and metabolism # Function: Dihydroxynaphthoic acid synthase # Organism: Bacillus subtilis # 6 274 3 271 271 407 68.0 1e-113 MSTQREWTTIREYDDILFDYYNGIARITINRERYRNAFTPTTTAEMSDALRICREEADID VIVITGAGDKAFCSGGDQNVKGRGGYIGKDGVPRLSVLDVQKQIRSIPKPVIAAVNGFAI GGGHVLHVVCDLSIASENAIFGQTGPRVGSFDAGFGASYLARVVGQKKAREIWFLCRKYN AQEALDMGLVNKVVPLEQLEDEYVQWAEEMMLLSPLALRMIKAGLNAELDGQAGIQELAG DATLLYYLTDEAQEGKNAFLEKRKPNFKQYPKFP >gi|222159205|gb|ACAB01000154.1| GENE 31 33160 - 34185 670 341 aa, chain + ## HITS:1 COG:AGpA707 KEGG:ns NR:ns ## COG: AGpA707 COG4948 # Protein_GI_number: 16119707 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 96 245 34 186 299 86 31.0 8e-17 MNCKIEIIPRLLHFKQPAGTSRGTYTTRKVWYLHFTSPDFPGRVGIGECAPLPALSCDDL PDYEDILKRFCRQVEKEQGMWDKDVLCQYPSILFGLETAIWHFFAGSWALSDTAFSRGEV GIQINGLIWMGDFDHMLSQIEKKMEAGFRCVKLKIGAIDFEKELALLRHIRTHFSSKEIE LRVDANGAFSPGDAMEKLKRLSEFDLHSIEQPIRAGQWEEMARLTSESPLPIALDEELIG CNTLDEKKKLLATIRPQYIIIKPSLHGGICGGDEWIMEAEKQHIGWWITSALESNIGLNA IAHWCATFNNPLPQGLGTGMLFTDNIEVPLEIRKDCLWFCK >gi|222159205|gb|ACAB01000154.1| GENE 32 34217 - 35356 643 379 aa, chain + ## HITS:1 COG:Cgl0445 KEGG:ns NR:ns ## COG: Cgl0445 COG0318 # Protein_GI_number: 19551695 # Func_class: I Lipid transport and metabolism; Q Secondary 
metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Corynebacterium glutamicum # 51 350 60 369 376 95 27.0 1e-19 MIFDRQQQRLLLEGKEYTFEDISQLIAGGAEAHSSAYWNLYLFLNEWFNDSPVITVHTSG STGIPKELMVRKDQMMQSARLTCEFLNLKKGESALLCMNLRYIGAMMVVVRSLVVGLNLI VRPASGHPLADIKEPLRFVAMVPLQVYNTLQVPEEKERLKQTDILIIGGGAVDEALETEI KYLPTAVYSTYGMTETLSHIALRRLNGASASEHYYPFPSVKLSLSVENTLVIDAPLVCDE ILQTNDIARIYPDGSFMILGRKDNVINSGGIKIQAEEMEKLLRPFIPASFVITSVPDQRL GQAVTLLIVGQLDTRELENKLQTTLEPYYRPKHIFITESIPQTENGKIDRVGCRTLAASY LSHSQQFPSIELHDDAKCS >gi|222159205|gb|ACAB01000154.1| GENE 33 35924 - 36487 396 187 aa, chain - ## HITS:1 COG:no KEGG:BT_4705 NR:ns ## KEGG: BT_4705 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 176 1 176 189 289 93.0 3e-77 MKISFSKQTKERAFKQLYEEYYAPFCLYAKRFVDDKETREDIVSDVFTSLWDKLDTDSFD LQSNTALAYIKMCVKNNCLNFLKHQEYEWSYAENIKKKAPVYETEPDSVYTLDELYKMLY ETLDKLPENYRTVFMKSFFEGKTHVEIAEEMNLSVKSINRYKQKTMELLRNELKDYLPLL LLLFYRT >gi|222159205|gb|ACAB01000154.1| GENE 34 36594 - 36824 73 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294809374|ref|ZP_06768084.1| ## NR: gi|294809374|ref|ZP_06768084.1| hypothetical protein CW3_2616 [Bacteroides xylanisolvens SD CC 1b] # 26 76 1 51 51 79 98.0 5e-14 MINSLSIPGSVCETPDYESKNKRRRLLIKVRSVIQLKYIYSIGYYFKFNNINNTNRMPKK DREIKLFFIFFRIQLK >gi|222159205|gb|ACAB01000154.1| GENE 35 36940 - 38160 855 406 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713073|ref|ZP_04543554.1| ## NR: gi|237713073|ref|ZP_04543554.1| predicted protein [Bacteroides sp. 
D1] # 1 406 26 431 431 803 100.0 0 MKRLSFLLMAVVCVIFFSCGEDEDKQGHSITAFAGYGGAIATADKEIAVAGEAVTVTATP ADGFLFKEWKVRVGNTIVENVQANPSTFTMPMEDVVIVATFMIRNDVLERITDPALKAYC QSRMDTEQEIDGVTYPKWDTNGNGVLSPDEASAVKAIDITGGVNGVKIKSVDELVEFAGL EVLKVSGNELTTLNVAWPKLAQLDCSHNKLSNLSVGKSENLKELYCNNNHLSSLKLKAML YEDGFMLHCGNQTTIDGEARTVEVLLSEEQIAFWESNLKKLNENVNVEVQTMPNTDVYLT MTDAYKYSYGSLTLILSDDDSNRIQLSLKLSELQPGEYSKAQINSAYVTVTGGGSYRSLD SDDPGSFIVKYDAVSDIYTIEGVLNLRADASYPSVNIVGFEYTGPL >gi|222159205|gb|ACAB01000154.1| GENE 36 38334 - 39254 797 306 aa, chain + ## HITS:1 COG:SMc04204 KEGG:ns NR:ns ## COG: SMc04204 COG3712 # Protein_GI_number: 15965785 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Sinorhizobium meliloti # 63 242 129 300 354 82 32.0 1e-15 MNNKELEASFRKAKALGDDIREMESIDALAAYRQSQMKIRGLKRMRLHNQLIRCAAFLAI PLLLASLILGYLYFAESEEEIRYAEVISTMGTVVRYELPDHSVVWLNSGSKLRYPTVFKK DNRNVELTGEAYFRVEAEPERPFYVNTPNGLSVYVYGTQFNVTAYEDDNYIETVLEKGKV NVVTPGQETITLIPGEQLVYDKQTRQVQKNKVDVYGKVAWKDGKLIFRNASLEEIIKRLE RHFNVDIEFNNHLGKEYSYRATFRTETLTQILDYLAKSANLKWKILTPEQREDDTFTKTK IIVDLY >gi|222159205|gb|ACAB01000154.1| GENE 37 39508 - 42828 2933 1106 aa, chain + ## HITS:1 COG:no KEGG:BT_4707 NR:ns ## KEGG: BT_4707 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1103 1 1096 1099 932 48.0 0 MRISLTLLFAVVLQLSAANGYAQRIRGVISMNNVSIEQVLNKIEESSDYVFLYNDKAIQK NRIVSVNNNSGKIQDILDEIFRGTSISYTVVDKQIILSAKPFVQKTMEDATIPVKGIVKD VNGEPLIGVNVKVKGSNIGTITDMDGNFTIRTKKGDVLEFSYVGYTVQNVTITGNATLNI VLQEDNAVLDEVVVTALGIKRSDKALSYNVQKLGDAQFMSVKEASVPNALNGKVAGVTIS QGASGIGGSTKVVMRGSKSVNAEGNNNALYVLDGIPLPDLFPRRGATSNAQYGGVDEGDG ISNINMEDIAEVTVLTGPSAAALYGGQAANGVIMLTSKQGGLKDGKPRISFSHYSDFYQP LLLPKLQNTYGSRTDEFSSWGPKLETPANYDPADFFDTGMSFSNSIGAQLGTENHKSYIS LSSYNARGVVHNNNLDRYNVSYNGTYTITDKLSLGATFMYTNKKAQNMLSQGDYYNPIVP IYLFPRGDNIDKYKSYERYSADRGFPVQFWPYGDQKRSIQNPYWITHRNFNINHNERFII 
GGSLKYTFNEWFNISARFRTDRDNKKNEIKNYASTLLLLASSSDKGSYLLATEKNIQDYG DVIANFQKTISDWGFVVNAGASFKNASFDYTANSNKLEFGAPLASVANKFTLNNLTPEPQ YQEGKTMKSQAVFFSASVDWKRKIFLDVTGRNEWTSLLANTKNKSFFYPSVGMSFVLTEF FKVPEKILTFSKLRLSYAKVGNVPMSLAGITIPSYPISPGGGVSINPDLPIDNLKFEKTQ SFEIGLNARFLNNKLGLDITYYNANTYDQLFKFRAPESSGYANFYVNAGKVNNKGLEASI DADLTFGEFNYSPKLTVTYNKNKIVELVRNVPNPFTGANMDVSHFDMGGLSYSFRNYIDE GGSIGDIYVNSVKKDGNGYVYVNPNTLSMMVDNENLIYAGNTNPDCSLGFNNTFSYKGFT LSFLIDARIGGKVVSATQAVLDSFGVSEQSAKARDNGGVLVNGVLMDAETYYTQMGAGTG VLSNYVYDATNIRLREASLSYRFKPLFKNFVKDLTLSVTGRNLFMIYNKAPFDPQLTSSV GTFYQGIDYFLMPSLRSFGVSLKFTL >gi|222159205|gb|ACAB01000154.1| GENE 38 42845 - 44491 1417 548 aa, chain + ## HITS:1 COG:no KEGG:Cpin_6734 NR:ns ## KEGG: Cpin_6734 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 8 539 3 541 543 358 39.0 5e-97 MKMILNNRYFGKSLLLVSACSLWMLGACTGDFMETNTDPNAVTEEELGYDNLGTGSMITQ MQYQMYPCITKDQNIDVNNYQKMFSLAGDIFSGHQGASNMFDSNGRNNTTYDMYPDWCAA AYTVAYQNYMTPWYTLYTKRNINPSTFAVGQILKVFGMHRITDMYGPVPYKDFAPASNVP FTSQQVIYEQFFSELDEAIGILKEFIKDNPGAKPLVSYDKIYGGDFEKWVKFANSLKLRL ALRIVYVNETMAQQKAEEAIRDGVFIDNEDNAMLKVDGSATVNPLYMICYTYNDSRLGAT LESYLKGYEDPRLDIWFAKSEVSGFKEYNGIRCGSMFNGNDYKVFSNLNVSSSTPIQFMN AAEVYFLRAEAALRKWNAGGDAETLYENGIATAFSQPLGATQAKAGDASAYIKRTTLPKG YVDPKNSSYSDSSTGQVSVNWADATGFDEKLEKIITQKWIALFPDGQEAWSEFRRTGCPR VIRVARNNSGGLIDSKKQVARLPYPTNLSKDYPDLYQEAVNNSELLGGADNGGTKLWWDK RNDKPYQN >gi|222159205|gb|ACAB01000154.1| GENE 39 44522 - 45592 722 356 aa, chain + ## HITS:1 COG:no KEGG:BT_3753 NR:ns ## KEGG: BT_3753 # Name: not_defined # Def: endo-beta-N-acetylglucosaminidase F2 precursor (mannosyl-glycoprotein endo-beta-N-acetyl-glucosaminidase F2) # Organism: B.thetaiotaomicron # Pathway: not_defined # 22 339 18 365 367 129 29.0 2e-28 MDMKNDRIIKFLCCLFMVTQIIGCSDWIEAEPTILPEMKYKELTPEEEAALIAYKNSKHK IFFAWMNYSPATSSMQTRLRGIPDSLDIVSFFTGYVNNKQNREDVKFLQERRGTKVLLTM WPDKYFATTGEGREDLDSMIVYAENLVDSIYTWGLDGFDLDYEPSFGGDSYTTEMMRTFI DVMSKYLGPKCDEQYKVNGKHKLLVVDGQWNDAEYADRFDYFIGQAYNASSESSLNGRCQ 
NGYQDYGKGFPNEKRIFCEWVSQVGNAFGQGGVRYRYNNEYIPSLWGMAHYAAESPKNVA GCGAYVLQFGYAEGNHLNLPVPPNNYYYARQAIQIMNPAGKTVEDETDGEEVEVEE >gi|222159205|gb|ACAB01000154.1| GENE 40 45614 - 46813 788 399 aa, chain + ## HITS:1 COG:no KEGG:BT_4710 NR:ns ## KEGG: BT_4710 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 35 380 36 363 402 118 25.0 4e-25 MKNLFTKNRLAVACLSCLTFIGMACNESIDVEAVYITAAEKTPITTFSATNNNDELGITI SSSLIANSNIEGELMIDNSLVERYNSDHGESYKTLPEGACVLSSNSVLVESGKYRSEAVK LVVKDIDAIQKGVNFLLPVKLVSKNKDYPSLPGSDVLYVIVNRKLLVNVPKLNGQNYFKI NFKDNDVSRFQNMSSFTIETRVSLWEFPKYYGGNLMGIIGFPGGENNAKGAWLFVDGTPD RVGGEGDVPVFMFSTKGWSVYAGKLGYTLEKNAWYHVAGVFENNTLSLYIDGILFVQAEY ANPISFSEQFYIGAAPGVQNGFYLKGSVSEARFWTRALTASELKNPLHRCFVEVDSEGLE GYWKLDDMSDECKDYTGHGHTAVKEGSGAIEWMTEVPCP >gi|222159205|gb|ACAB01000154.1| GENE 41 46909 - 47385 550 158 aa, chain - ## HITS:1 COG:PM0817 KEGG:ns NR:ns ## COG: PM0817 COG0783 # Protein_GI_number: 15602682 # Func_class: P Inorganic ion transport and metabolism # Function: DNA-binding ferritin-like protein (oxidative damage protectant) # Organism: Pasteurella multocida # 7 154 5 152 159 155 52.0 3e-38 MKTLEFIKLNESGANNVVASLQQLLADFQVYYTNLRGFHWNIKGHDFFVLHSQFEKMYDD TAEKVDEIAERILMLGGTPANKFSDYLKVANISEVDKVSNGEQALNNILQSISYLIGEER KILSIASQAGDEVTVSMMSDYLKEQEKLVWMLTAYNSK >gi|222159205|gb|ACAB01000154.1| GENE 42 47523 - 48449 828 308 aa, chain + ## HITS:1 COG:STM4125 KEGG:ns NR:ns ## COG: STM4125 COG0583 # Protein_GI_number: 16767389 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Salmonella typhimurium LT2 # 1 271 1 270 305 194 40.0 2e-49 MTIQQLEYILAVDQFRHFARAAEYCRVTQPTLSAMIQKLEDELGVKLFDRTVQPVCPTPI GQKVIDQARVILAQAAQVKEIISEDKQSLSGVFRLGVLPTIAPYLLPRFFPQLMEKYPEL DIRVTEMKTQDIQQALHAGDLDAGIIASKLEDTFLTEETLFYEQFYAYVSRKEPSFKHDV IRTSDITGEHLWLLDEGHCFRDQLVRFCQMEAVKVSQMAYRLGSMETFMRMVESGKGITF IPELAVAQLTEEQRQLVRPFAIPRPTRQVVLATNRDFIRHSLLCVLKEEIKAAVPKEMLT LQPIQCLL >gi|222159205|gb|ACAB01000154.1| GENE 43 48474 - 48644 174 56 aa, 
chain - ## HITS:1 COG:no KEGG:BT_4717 NR:ns ## KEGG: BT_4717 # Name: not_defined # Def: integral membrane protein, putative permease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 56 1 56 310 82 76.0 4e-15 MKKGLFYAILASTLWATVNSFIKQGLSNDFTPINFSETRFATVGIILFAYTRYKGI >gi|222159205|gb|ACAB01000154.1| GENE 44 48762 - 49451 738 229 aa, chain + ## HITS:1 COG:slr2057 KEGG:ns NR:ns ## COG: slr2057 COG0580 # Protein_GI_number: 16330455 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) # Organism: Synechocystis # 1 220 1 233 247 160 49.0 1e-39 MKKYIAEMIGTMVLVLMGCGSAVFAGSVTGTVGAGVGTVGVALAFGLSVVAMAYAIGGIS GCHINPAITLGVFLTGRMNGKDAGMYMIFQVIGAIIGSAILFALVSTGAHDGPTATGSNG FGDGEMLQAFIAEAVFTFIFVLVVLGSTDSKKGAGNLAGLAIGLTLVLVHIVCIPITGTS VNPARSIAPALFQGGEALSQLWLFIIAPFVGAALSAVVWNYLGDKEEKK >gi|222159205|gb|ACAB01000154.1| GENE 45 49590 - 50312 711 240 aa, chain - ## HITS:1 COG:no KEGG:BT_4719 NR:ns ## KEGG: BT_4719 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 240 1 237 240 325 75.0 1e-87 MKKFKFIAFIFAIMTTMPILQSCLDDDDNPSDLLVVSTINMISQDSKEFYFTLDDGKKMY PSNAQGWSNKDWVEGQRAFVIFNELEEPVNGYDLNIQVRGINPILTKDIITMGEDDNDEE KVGDDKINTTYMWINKDNKYLTIEFQYYGTHSEDKKHFLNLVINDKEESAPTADEDSAED EYINLEFRHNSEGDYPQSLGEGYVSFKLDKIKNRMEGKKGLRIRVNTLYGGPKTYEVKFP >gi|222159205|gb|ACAB01000154.1| GENE 46 50489 - 51037 505 182 aa, chain + ## HITS:1 COG:PA0762 KEGG:ns NR:ns ## COG: PA0762 COG1595 # Protein_GI_number: 15595959 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 2 175 6 187 193 67 25.0 1e-11 MEELELSEQCRLGNNQARKELYEQYAGRMLGICLRYTGDRDTAQDLLHDGFLKIFDSFDK FTWRGEGSLRAWMERVMVNTALQYLRKNDVINQSAPLEELPEEYEEPDASDVEAIPQKVL MQFIEELPAGYRTVFNLYTFEDKSHKEIAQELGINEKSSASQLFRAKSVLAKRVKEWIMH NG >gi|222159205|gb|ACAB01000154.1| GENE 47 51044 - 52342 997 432 aa, chain + ## 
HITS:1 COG:no KEGG:BT_4721 NR:ns ## KEGG: BT_4721 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 432 1 418 418 595 73.0 1e-169 MEEKELWMNKLKEKLADYSEPVPASGWEQLEKELMPPVEKKIYPYRKWMMAAAAVILLAV VSSVSLYFLGTPAADEIRHIKTPALASTPDALPGVQQPDVQGTSVDPVLRPVAREDRLAK IDKNLTEQKTNAGQSAIENNNEPVTGNENNPVIEKDQTLKGETEQTKNEARQVDSENVGQ ATQSQDTKRPNNRPRRPSGRDKLHIPAEKRASQKGTWSMGLSVGNSGGASTEVGAGSHAY MSRVNMLSVSNGLMEIPNDQTLVFEDGVPYLRQAKQVVDIKHHQPISFGLSVRKGLAKGF SLETGLTYTLLSSDAKLAGEDQQIEQKLHYVGIPLRANWNFLDKKLITLYVSGGGMVEKC VYGKLGSEKETVKPLQFSVSGAVGAQVNATKRLGIYVEPGVAYYFDDGSDIQTIRKENPF NFNIQAGIRLTY >gi|222159205|gb|ACAB01000154.1| GENE 48 52555 - 53151 429 198 aa, chain + ## HITS:1 COG:fecI KEGG:ns NR:ns ## COG: fecI COG1595 # Protein_GI_number: 16132114 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Escherichia coli K12 # 22 173 13 162 173 65 32.0 4e-11 MVNFTDEKHLLIDLKDGSFQAFERLYNMYSGKLYNFIMRLSSGNQYMAEEVVQSTFIRIW EVREKVDTNASFISFLCTIAKNLLMNMYQRQTVEYVYNEYLLKSSVDRDSQTEENIDLHF LNDYIDSLAEELPAQRKKIFILSKRQNYTNKEIAEMMGISESTVATQLSLAVKFMREQLM KHYDKVIALLIAFFVNEM >gi|222159205|gb|ACAB01000154.1| GENE 49 53219 - 54199 945 326 aa, chain + ## HITS:1 COG:PA1301 KEGG:ns NR:ns ## COG: PA1301 COG3712 # Protein_GI_number: 15596498 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 16 294 33 294 327 82 25.0 2e-15 MDKIYYKELIEKYFDGNITDAEIKELSDWIKNDRHLQNWWEEEFSKSDAGINPVLRDKLF ARIKEQTQGKEETQGKEKPRTIRMNLWKWAAAIVLPICIAFFTYYLIDSSQTVGAPFIVK ADKGDKATIELPDGTNVVLNSASQLSYLNNFGENVRRVQLNGEAYFKVAHDEKHAFIVQV GDLEVKVLGTSFNVSAYEDAKDVTVVLLEGKVGVYAQKMSHIMKPGDKIEYNKATHKITA TQVHPSDYIEWTKGNIYFEKESLENIMKTLSRIYDVEIRFDSNKLPNEYFTGTIPGGGIQ NALNILMLTSPFYYEMDGSVIVLKEK >gi|222159205|gb|ACAB01000154.1| GENE 50 54362 - 57781 3057 1139 aa, chain + ## HITS:1 COG:no KEGG:BT_4724 NR:ns ## KEGG: 
BT_4724 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1139 1 1139 1139 2037 92.0 0 MKHKTTFEKLPQANTGRSKNWKTLRAICLLLFLSVSFTAYSQITVNVKDISLRASLKKIE QVSNYKFFYNESLPELNRKVSLNVKDATIEQTMQQLLGGMDLAYKKEQDNVIVLIRKAQD KSITKKITGTVVDEKGEPVIGASIVIKGESHGTITDFDGKFTLADVPEKGILTVSYIGYK TVDLSAAGQTFVKVVLQEDSKMIDEVVVVGYGVQSQKLVTTSISKVKMENIDQGNDYNPI KMLQGRVAGVNISSASGTPGEAPNVTVRGIGSVSGGSSPLYVVDGIPSEKYPNLNPNDIE SMEVLKDASAAAIYGSRANAGVVLITTKSGQQGKTKIEVSGRYGFAYLASDIEMANSMEY MNTMQAAIDNYNVQMGANLQLYVPSQIQETNWVKEISRKNSKTGTGSISISGGNEKTTFF ASLGANTQEGYLNKSSYDQYNMRAKFTHKINNLFKLNMNLAGSASRSDLLEETSTSLKVL RTAREEQPWYSPYKEDGTSYKVNGTDILRHNPVMLINEEDWVAKKYQLSGVFSIDVTPFK GFKYTPTVSVYGILDNTSKKLSDKHDARKNSSGWGALAEQKDQSFRYVIDNIFSYNNEWN KLIYSVMLGHSFEKYTYEQFGAKSDNYANGAFPSSNFDLINAGPNIYAGNISYTSYALES YFGRIALNWDNKYILNASLRSDGSSRFAKNKRYGYFPSASFAWRASNEEFFPKNKYVNDA KLRLSWGMTGSMAGVSNYAPLSLISAGGASYNGSAGFQISQDARALTWEKASQFNIGFDI EMFQSRLTLNIDMFYQKTTDLLFKKPVNATTGYTTLQSNIGSLENKGLELALNGKIFTGK FKWDLGGNISFVKNKLLSLIEGNDMYIVPSSGSNLLGGSMHALINGQPISTFYMLKMEGI YQRDDEVPAKLYAKGVRAGDVKYFDYNEDGDISDADRMNVGKAIPDFYGGITSNFSYKGF DLSLFGQFSVGGKVMSAWRGVNGSEGTDHLGLALSNVKVNDRGESVEQYFNVSKEVANGY WRGEGTSNTIPRPVRIGVHTGYDYDYNVQTSTRYLEDASYFKLKTVTLGYTLPESVVKKL RVNSLRVYVSADNLLTLTKYSGYDPETSFSGSPGDSNYGVDFGLQPVLRTFIFGLNLNF >gi|222159205|gb|ACAB01000154.1| GENE 51 57794 - 59140 991 448 aa, chain + ## HITS:1 COG:no KEGG:BT_4725 NR:ns ## KEGG: BT_4725 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 448 1 447 447 823 88.0 0 MKRYYTIINYLLCSICLLWMTGCSGILDDMRPKDQIPQDMLSESDLEKLLNGVYAEMEEL VFKFYMDGDVKGENFKAGPGFSMNDPMLMAPSSKEVLSQWQKCFTALKQVNFLVETYEAS SNKESQVVKQTGGTGYYFRALIYYHLVTRWGGVPILRKRTYDVVPISPETEVWSFIKEDL AKAEGLLSDFTDRFYVSLSACDALHAKVCLSLKDYTNAAAYADKVITKSNFALSTTSVEY AQAFVSNSSSKELVFALANKRSTSLLLFYQTVNDIDPTWDYSPSTECYSHLYDDTAVKSK DIRAKAVFGSDNSRIIKFPNGSTGQFVTNEQPSQTPIVVTRVAEMYLIKAEALGAVNGLA TLKEFMNKRYATVLLPSNMTDIEFQNQILDERHRELYAEGQRWYDLKRTNRLDLFSSLNG 
RNYLMYYPVPQSERDLAGEENYPQNEGY >gi|222159205|gb|ACAB01000154.1| GENE 52 59162 - 61693 1981 843 aa, chain + ## HITS:1 COG:CC3172 KEGG:ns NR:ns ## COG: CC3172 COG0584 # Protein_GI_number: 16127402 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Caulobacter vibrioides # 595 843 9 270 295 92 27.0 3e-18 MKNIKIVTTFFICLLLLSACSDDWKENALTAKFTFDKTTYYVGEEVSITNETVGGEGNYT CQWDLGNGETSTEVAPKVTYETNGAYTVTLHVKDSKGNYAMAHKLLTIEAEPLPEVGNVK LKWVGAHVLGEIRSTAPAVSDDNNVYMTSNDHYLRKFSAATGEQLWEFDLWTSADGDSPS GNTHTTPSIDTDGTVYVGTGDTSGKIGRVYAINPDGTKKWVVAGDAEKGFWNKGQASKPR INYLTCAIGENHIYMGNGGSTGSVLAVDKNTGYRVGYVANADNSGGPSGGVSAGIALANN TLIWAGGKNGLFGASASALNTGGNVMWAWQIYSSGNDKPSENMNASVAVDATGTIYGIAT FPSIGSSAFAIGSDGVEKWRTSLGNVGTLDQGGVVIGLDGSIIVTVKRAPGEATGGIVAL SPNGVVQWHYGVPEDVSGCAAIDQAGNIHFGTQSGNYYIIKPESSEEQLILKKDLAALIS ESDSPLKDNWEAGIGKIWSSPTIGPDGTIYIGVTHTVDPSKSVLVALEDEGITGCAASAW PMKGKDSRHTSAQLGGSGENPGGEPGGQLPITGNLKTDLKNLFDDSSYKVWLCAHRANTQ KGIADGIPENSLTSIEYAINAGVEMIELDARPTSDGILVLMHDNTIDRTTNGSGAVGDYS YQQLQQLYLKDAAGNLTNERIPTLEDALKKGKGKVYFNLDIVNKNVAVATMVALLKKLDM ENEVLLYVSNNRNYAYDLKAANSTLLLHPMAKASDDITYFASSYTDNVQMMQLSTSDALA GAMTEDIKSKGWLLFSNIVGANDTNMLSDNYSGLVGMINKRINVVQTDYAELAAKYLKSK KYR >gi|222159205|gb|ACAB01000154.1| GENE 53 61718 - 62644 681 308 aa, chain + ## HITS:1 COG:CC3172 KEGG:ns NR:ns ## COG: CC3172 COG0584 # Protein_GI_number: 16127402 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Caulobacter vibrioides # 52 300 21 271 295 130 35.0 3e-30 MNMKNYLIIGIILLFPFRLFADEPVKAIYAKITNPDNKEVTVVAHRADWRFAPENSLAAI ESSIRLGADVVELDVQKTKDGQLILMHDKTLDRTTTGKGKVAEWTLDSIRTLRLKNGAAL RTKHRVPTLEEALLTAKGRVMVNLDKAYPIFDEIFPILEKTGTVEQIIMKGSKPVAEVKK DLGKYLDRIIYMPIVHLDRPGAMQQIDDFMKELHPVAFELLFESDTCQLPQQVRTKLEGK SKIWYNTLWDTMAGGHDDDKSLENPDEGYGYLIDKLGATIIQTDRTAYLLEYLQVRKKRN EHKIDVFH >gi|222159205|gb|ACAB01000154.1| GENE 54 62661 - 64016 694 451 aa, chain + ## HITS:1 COG:VCA0707 KEGG:ns NR:ns ## COG: 
VCA0707 COG2271 # Protein_GI_number: 15601463 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Vibrio cholerae # 5 446 1 440 459 357 42.0 2e-98 MSFKILDFYKISSPLAGGETDSEDSSRFKRIRWATFLSATTGYGIYYVCRLSMNVVRKPI VEEGVFSETQLGIIGSCLFFVYAVGKLTNGFLADRSNVRRFMSTGLLCSALINLCLGFTS SFYAFALLWGLNGWFQSMGAASGVVSLTRWYSSKERGTFYGFWSASHNLGEALTFISIAL LVSWMGWRYGMIGAGIIGILGFFMMLVFMRDTPQSQGFLLNKNSSSVDSHSLSGKQTEDF NKAQKAVLKNPAIWILALSSAFMYISRYAVNSWGVFYLEAQKGYSTLDASFIISISSICG IIGTVFSGIISDKMFSGSRNIPALVFGLMNVIALCLFLLVPGVHFWIDVLAMVLFGLGIG VLICFLGGLMAVDIAPRNASGAALGVVGIASYIGAGLQDVMSGILIEGQKTVQNGVDVYD FTYINWFWIGAALLSVLFALLVWNAKSKEVD >gi|222159205|gb|ACAB01000154.1| GENE 55 64123 - 65688 1031 521 aa, chain - ## HITS:1 COG:no KEGG:BF2880 NR:ns ## KEGG: BF2880 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 5 518 9 523 525 658 62.0 0 MYDLMRKLPIGIQTFEDIRRKNYLYVDKTALVWRMANMGKPYFLSRPRRFGKSLLLSTFE SYFLGKKELFEGLAIEKMETEWKEYPVLHIDLNAEKYDKPEKLNEILSNHLTQWELKYGK GIDERTLSSRFGGVIRRACEQTGQQVVVLVDEYDKPLLQAINNLELLDDYRQTLKAFYGV LKSADRYLRFVFLTGVTKFSQVSVFSDLNQLNDISMKPPYATVCGITRQELVDTFTPELK NLSEANQLTFEETLQKMAATYDGYHFCEFAEGVFNPFSVLNSFDGCKFDSYWFQTGTPTF LVELLQKSEYDLRTLLSGIEVPVSSFAEYKMDVNNPIPLIYQSGYLTIKDYDREFQNYLL DFPNDEVRYGFMNFLVPFYTPMKNNDQGFYIGKFIQELRTGDYEAFLTRLEAFFAGIPYE LNDQTERHYQVIFYLVFKLMGQFTEAEVRSSRGRADAVVKTPKYIYVFEFKLNGTAEEAL KQIDEKGYLIPYKVDHREVVKIGVEFSAESRNLSHWLVEIK >gi|222159205|gb|ACAB01000154.1| GENE 56 65699 - 65872 70 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294646425|ref|ZP_06724067.1| ## NR: gi|294646425|ref|ZP_06724067.1| hypothetical protein CW1_3805 [Bacteroides ovatus SD CC 2a] # 1 57 1 57 57 78 100.0 1e-13 MTILLIFFHLSAFWGIFLLRKQQIYKEKDDIYQIYPVFKVFIHHYFEIMITFDVYNS >gi|222159205|gb|ACAB01000154.1| GENE 57 65921 - 66541 532 206 aa, chain + ## HITS:1 COG:PA0149 KEGG:ns NR:ns ## COG: PA0149 COG1595 # Protein_GI_number: 15595347 # Func_class: K Transcription # Function: DNA-directed RNA polymerase 
specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 32 178 23 161 181 69 31.0 3e-12 MIEFSDEKFLLDEIKKGNNQAFEYLFKTYYPRLRGYAIRFVEDEETARDIIQECFLRFWE KRALLSAVSVTSLLFAMVRNGCLNYLKHLSIVEKHQIEYLASVDGEERLYYSDFALDAEH KLLYDELQEQINIVIGQLPERSREIFLMSRFKGLKNREIADKLQISTTAVEKHIAKALQC FSKHFKDKYPVDIYIIIIAWVIMNNK >gi|222159205|gb|ACAB01000154.1| GENE 58 66669 - 67664 590 331 aa, chain + ## HITS:1 COG:PA1364 KEGG:ns NR:ns ## COG: PA1364 COG3712 # Protein_GI_number: 15596561 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 126 322 71 265 280 74 28.0 2e-13 MEEKKNIRELAIQYFEGRILRQDEKLLFAYVGQSEENRATFRQWEKEWIASGISDSGTIG EWEILQRKIRTQEAIIPMLTVSKFRSSVWRQVAAVAAIVILTIGGTIGIWQISSSLKPET YFVCETPYGEKSKVVLSDGTVVWLNAGSTLKYSNKFNTANRKVELNGEGYFEVTKKEGAT FIVQTHGYDVVVKGTKFDVSAYADDPFISTTLLEGSVELNYKGTPVMMSPGESIRLNVET GKFVRTQVNASQSKAWAENRIEFDHITLKELVAKLSRQYDVNIRLESESIGDKTFRISLR NRETIGEVMTALQEIIPITIERIDKDIYIRE >gi|222159205|gb|ACAB01000154.1| GENE 59 67710 - 68570 670 286 aa, chain + ## HITS:1 COG:no KEGG:BT_0190 NR:ns ## KEGG: BT_0190 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 20 263 33 276 1146 197 41.0 5e-49 MNNRKNLLRLLLFVSLSFVAVILNGQTVSKTFKNEALKTVLKEVEKQTGLSVIYKTDEVN ENKMITATFKNASINDVLDKILDEGLIYKLQNKMIVISKSNQQKQSKSGEKKKISGTVVD ENGTPIIGASVQIKGEAQGTITDFDGKFVLSDVPEKSLLTISYIGYLTIDIAVTDEKLSK ITMIEDSKLIDEVVVVGYGSMRKQDVTTSIARVGGKDLKDMPVTGFDQAIVGKMAGVQVT QTSGKPNSGATIRVRGTGLLQQGQSRYMWWMEYHWKELQVLWKLLI >gi|222159205|gb|ACAB01000154.1| GENE 60 68567 - 71107 1655 846 aa, chain + ## HITS:1 COG:no KEGG:BT_0190 NR:ns ## KEGG: BT_0190 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 846 298 1146 1146 980 57.0 0 MNDVESIEILKDASSAAIYGSRGANGVVMITTKKGASGKAKVSYNGSVGFQTLSKKIDML DAYEFAAFARDGHNGSYLTSYPDASPDDPNEVRKKSYDKIPPELFPYLEGQKGLTNTDWQ 
DALYRTAPITKHSISISGGGEKTKFFISGSYLNQSGIIVNSGYERFGARLNFTYKSDKVE LGVNFSPSYSIEDRVDRDNNKGVVINALMMPPVWPVYNEDGSYNYMGNGFWKIGTDYQHN AVLNPVAMANLTKDQVTHANLLGNFYFQWEIIKGLKYKFSAATNYNYFYNEYYRSSELPL QGEKYFQSPSNPVAKSSGTYYLNWLIENTLNYQKNFNGHNLNAIIGFTAQKDMMKKHSVQ ATDFVNDLVQNVAGGIVSTGGADSQAWALASFLARAQYDYKGKYLLSAAIRSDGSSRFGK NNRWGYFPSVSVGWRIISEDFMRNLPWVSNLKLRASYGISGNFSIGNYEHIAMLKYNQYV LGTGEGSLVSGLRPSQISNDDLGWEKTRMYNVGLDIGLFDERLTFEFDMYQSNTYDLLLD VPIPQITGFSDMRKNIGEVRNRGVEFTIGTHHNWSGFRWDASFNIAANRNKVLKLGPEDA PIITSAGVSHAYFKTEVGQPIGNYFLLVQDGVFKNQAELDAYPHFSNAKVGDFKFVDVDG NGEMDLDDDRAIVGNYMPKFTYGFMSSFAYKGIDLSFNLQGIQGNKILNLQRRYIANMEG NVNSMVIALDRFQSVENPGNGQVNRANRKSTGNNSRTSTWHLEDGSYLRMQNITLGYTLP KNFVQKFGLTSLRLYFSGQNLFTITNYSGYNPEVSNYNSNGSLTPGVDYGSYPLSKTYSF GLNVSF >gi|222159205|gb|ACAB01000154.1| GENE 61 71123 - 72505 822 460 aa, chain + ## HITS:1 COG:no KEGG:Slin_1275 NR:ns ## KEGG: Slin_1275 # Name: not_defined # Def: RagB/SusD domain protein # Organism: S.linguale # Pathway: not_defined # 4 460 3 460 460 335 43.0 2e-90 MKNKLLIVCAFFCLSSCEGFLDLSPESQPNAKEFYVTENDFNSAVMACYQSLRNYPTIVL DVLEYRSDNMFMPSFTSGSQDKYEINHFQDNKTNSLIADIWEKSYSAISCCNVMIDHIQN ATINADKKLQFEAEARFIRAFHYFNLVRLFGGVPLVVHELSDGEALKMPRESVGKVYEQI ESDLIFALQLPDTYDKSDLGRVTSNAAEALLAKTYLTNKKYNDAKTHLNNIIITEKYDLL TDIKDVFDVNNEMNKEILFAIKYSKTLIDGGHGMWLSLSDVTLGHFSDVLKGAYNSDDAR AGLLEYKKSGSVYLPMKYYDIQDVSTKDVGNDFIVLRYADILLMYAEVLNEESFDTDINS KNSGFYYLNKVRDRAGLPNLTSLEVPSQSDFKKAVLNERYLEFPLEGNRWFDLIRTGTAK EVIKAAYDIDIPDYRLIFPIPSAEVEKVNNPNILPQNPQY >gi|222159205|gb|ACAB01000154.1| GENE 62 72611 - 74386 693 591 aa, chain + ## HITS:1 COG:CC3172 KEGG:ns NR:ns ## COG: CC3172 COG0584 # Protein_GI_number: 16127402 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Caulobacter vibrioides # 34 290 12 270 295 145 37.0 3e-34 MSFCILSISSFAQTRLDSIRNKLFAPENKNVLVASHRGDWRNACENSIEAIDNAVKMGVD IVEVDLARTKDGHLILMHDSKLDRTTTGKGLVADHTLAEIKALQLRNGCHIKTIYKVPTL EEALLFAKGRVMLNLDKAFDYFDQVYTLLEKTGTTDMVIMKSDAPADYVKKNYGKYLKKV 
VFMPKINLDDKNAMQRLDDYLQIINPVAVEFKFASDLNRLPYDVKNAMKGRARIWYNTLW NTHAGGHDDDCSLVDPDEGYGYLIDSLGASILQTDRPAYLINYLKKKELKKKWECIENWD YLSVENEWTMQTSPNFDVEEVFLKGKHTPATNEDGIIVTPYFAAVIDGATAKSELEIDGK KTGRIAMELVIEAIHDFPKDIDANEALKRITEKIHSFYVQHRLLEELEKTPGSRFTANGV IYSYEKNEIWQIGDCQCLFGNTYSSNEKEIDAIMANARAVVNEIALLNGATPDDLLSNDP GRNFIYRFLQQQAILQNNPDKNQPYSFPVFDGFPINMHQVRIFSIGNHTQIVLSSDGYPC LFPTLRESECYLMNILENDPLCMRQYKSTKGIKKGNCSFDDRAYLKIRINR >gi|222159205|gb|ACAB01000154.1| GENE 63 74697 - 75482 594 261 aa, chain + ## HITS:1 COG:CC3172 KEGG:ns NR:ns ## COG: CC3172 COG0584 # Protein_GI_number: 16127402 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Caulobacter vibrioides # 1 261 1 271 295 147 34.0 2e-35 MVVSHRGDWRNAPENSLQAIQNCIDMGVDMVEVDLKKTKDGHLIVMHDQTIDRTTTGKGK PENYTLEELRRFRLKNGAAHKTTHLIPTLEEVMLLCKGKILVNIDKGYDYFKEAYCILEK TGTVDQCVIKAGLPYEQVKVENGEVLDKVIFMPVINLNKEDAEKIIDSYQKHLKPIAYEL VFDNDEKETLRLIQKVRDSGARLFVNSLWPKLCGGHDDDRAVELHQPDESWGWIIAQGAK LIQTDRPALLLEYLRNKKLHD >gi|222159205|gb|ACAB01000154.1| GENE 64 75496 - 76851 1004 451 aa, chain + ## HITS:1 COG:VCA0707 KEGG:ns NR:ns ## COG: VCA0707 COG2271 # Protein_GI_number: 15601463 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Vibrio cholerae # 5 446 1 440 459 376 44.0 1e-104 MLKQLINFYKVSSPRPCNGETLSSSDARRLKFLKWSTFLSATFGYGMYYVCRLSLNVVKK PIVDEGIFSETELGIIGSVLFFTYAIGKFTNGFLADRSNINRFMTTGLLVTALVNLCLGF TNSFILFAILWGISGWFQSMGAASCVVGLSRWFTDKERGSYYGFWSASHNIGEALTFLII ASIVSVLGWRYGFFGAGIVGLIGALIVWKFFHDTPESMGFPSVNVPKQKKEMSETETADF NKAQRQVLLMPAIWILALSSAFMYISRYAINSWGVFYLEAQKGYSTLDASFIISICPVCG IIGTMFSGVISDKLFGGRRNVPALIFGLMNVFALCLFLLVPGAHFGIDVLAMVLFGLGIG VLICFLGGLMAVDIAPRNASGAALGVVGIASYIGAGLQDVMSGILIEGQKTVQNGVDVYD FTYINWFWIGAALLSVLFALLVWNAKSKEMD >gi|222159205|gb|ACAB01000154.1| GENE 65 76978 - 77112 235 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MENSLNDELLCSLKDELREVTAQINVHSEKLMDLLDQKKESSVN >gi|222159205|gb|ACAB01000154.1| GENE 66 77121 - 
77366 158 81 aa, chain + ## HITS:1 COG:no KEGG:BT_4730 NR:ns ## KEGG: BT_4730 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 80 1 77 84 123 74.0 2e-27 MSHLTPVIIEYRGNPKQYVSVVLDAINLGRLTYDGVANCEQTFRALASVVDVISPKNGKT LSVETLVSYEKKKRAGEFEEK >gi|222159205|gb|ACAB01000154.1| GENE 67 77423 - 77620 162 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160887564|ref|ZP_02068567.1| ## NR: gi|160887564|ref|ZP_02068567.1| hypothetical protein BACOVA_05584 [Bacteroides ovatus ATCC 8483] # 1 65 1 65 117 108 100.0 2e-22 MEKKRKIRTYGGYFEAFMETLTEKEQDKIQYGLLLLKTQERLSTKFVKFVQDGVFELRTE YNGNI Prediction of potential genes in microbial genomes Time: Wed May 18 04:35:01 2011 Seq name: gi|222159204|gb|ACAB01000155.1| Bacteroides sp. D1 cont1.155, whole genome shotgun sequence Length of sequence - 22140 bp Number of predicted genes - 28, with homology - 28 Number of transcription units - 16, operones - 8 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 250 - 309 5.6 1 1 Op 1 . + CDS 340 - 1434 1085 ## BF3849 hypothetical protein 2 1 Op 2 . + CDS 1451 - 2338 919 ## BF3848 hypothetical protein + Prom 2355 - 2414 4.0 3 2 Op 1 . + CDS 2437 - 3324 852 ## gi|237712949|ref|ZP_04543430.1| conserved hypothetical protein 4 2 Op 2 . + CDS 3359 - 4921 1228 ## BF3838 hypothetical protein 5 2 Op 3 . + CDS 4944 - 6650 1000 ## DSY3857 hypothetical protein 6 2 Op 4 . + CDS 6727 - 6870 110 ## gi|256842614|ref|ZP_05548115.1| conserved hypothetical protein + Prom 7301 - 7360 2.7 7 3 Op 1 . + CDS 7417 - 7800 184 ## BVU_3680 hypothetical protein 8 3 Op 2 . + CDS 7862 - 9547 1121 ## COG3436 Transposase and inactivated derivatives - Term 9453 - 9500 -0.9 9 4 Op 1 . - CDS 9671 - 9877 101 ## PG0019 ISPg4 transposase 10 4 Op 2 . - CDS 9931 - 11208 821 ## COG3550 Uncharacterized protein related to capsule biosynthesis enzymes - Prom 11308 - 11367 1.9 - Term 11315 - 11358 6.1 11 5 Tu 1 . 
- CDS 11423 - 11839 133 ## BVU_1598 transposase - Prom 11888 - 11947 2.5 12 6 Tu 1 . - CDS 12182 - 12583 125 ## BVU_1596 transposase - Prom 12757 - 12816 4.0 13 7 Tu 1 . + CDS 12664 - 12837 89 ## gi|237723853|ref|ZP_04554334.1| conserved hypothetical protein + Term 12988 - 13032 0.0 + Prom 12899 - 12958 4.1 14 8 Tu 1 . + CDS 13041 - 14249 1074 ## gi|198277709|ref|ZP_03210240.1| hypothetical protein BACPLE_03932 + Term 14307 - 14345 -0.8 + Prom 14943 - 15002 1.8 15 9 Tu 1 . + CDS 15131 - 15757 347 ## COG2003 DNA repair proteins + Prom 15824 - 15883 4.6 16 10 Tu 1 . + CDS 15967 - 16344 125 ## BDI_3861 hypothetical protein + Prom 16633 - 16692 1.9 17 11 Tu 1 . + CDS 16717 - 17091 200 ## gi|189461004|ref|ZP_03009789.1| hypothetical protein BACCOP_01651 + Term 17107 - 17135 0.6 - Term 17093 - 17123 1.0 18 12 Op 1 . - CDS 17204 - 17800 547 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 19 12 Op 2 . - CDS 17842 - 18171 256 ## gi|189461002|ref|ZP_03009787.1| hypothetical protein BACCOP_01649 - Prom 18194 - 18253 2.4 20 13 Tu 1 . - CDS 18378 - 19040 341 ## COG3943 Virulence protein - Prom 19135 - 19194 8.0 21 14 Op 1 . + CDS 19450 - 19698 316 ## gi|237712968|ref|ZP_04543449.1| predicted protein 22 14 Op 2 . + CDS 19692 - 20012 136 ## COG2337 Growth inhibitor 23 14 Op 3 . + CDS 20012 - 20236 186 ## gi|237712970|ref|ZP_04543451.1| predicted protein + Prom 20255 - 20314 2.9 24 15 Op 1 . + CDS 20336 - 20620 235 ## BVU_3428 DNA-binding protein HU 25 15 Op 2 . + CDS 20617 - 20889 136 ## BT_1941 hypothetical protein - Term 20802 - 20837 3.5 26 16 Op 1 . - CDS 20951 - 21328 216 ## Slin_6854 hypothetical protein 27 16 Op 2 . - CDS 21343 - 21669 93 ## gi|237712974|ref|ZP_04543455.1| predicted protein 28 16 Op 3 . 
- CDS 21715 - 21969 232 ## gi|189460990|ref|ZP_03009775.1| hypothetical protein BACCOP_01637 - Prom 21998 - 22057 3.8 Predicted protein(s) >gi|222159204|gb|ACAB01000155.1| GENE 1 340 - 1434 1085 364 aa, chain + ## HITS:1 COG:no KEGG:BF3849 NR:ns ## KEGG: BF3849 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 27 364 125 493 493 272 44.0 2e-71 MSKRILFFILFSWLASAAFPAFAQQKADTTYTFRFVPQKDMFYVPWNGNDTELARLLECI ENNKTTILDGKLPLLVDGYCNSLGSEAENLATAKIRANRVKSELIIRAEIKEENFITRNH ATEGDFVTVRLTVPVKGTAATDAEARRKAETERLEAEKRAEQERLAEEQRKAEEARLAAE KAEAEKAAQQNTLADTPSETKITTDYHLSLRANLLRWATLTPDLGVEWRICPSWGIAVNG SWTSWTWSDKDHRYALWEVAPEVRYYMGEKKAWYLGAMFKAGQFNYKLSETGKQGDLMGG GITAGYQLRLNKALALDFNLGLGYLNADFEKYEVIDGVRVRRGNETKDWCGPINAGVTLV WKLF >gi|222159204|gb|ACAB01000155.1| GENE 2 1451 - 2338 919 295 aa, chain + ## HITS:1 COG:no KEGG:BF3848 NR:ns ## KEGG: BF3848 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 294 1 304 304 123 32.0 7e-27 MKARQYINMMGMAAAVLLSSCVKDTLYDTPHPDYGKIAVTADWSARGEGIDIPATWTVTM GDYTGTETSATHTPDHLFAPGSYTLAVWNPAEGITVNGTTATIAAATGTRAGTDTFVNNA PGWFFTYTEQVTIEKDKDYPLTAAMKQQVRELTLVVEPTGDAAGRITEIVAHLTGAAGTL DFATDTYGAASNVVLLFTKITEGDDAGKWKATVRLLGVTGTEQLLTGEIRYADGNPTPTT LKSDLTEALKEFNTGKGESLTLGGTLVETPEGMEVDGAEINGWEEVKGDDVNADL >gi|222159204|gb|ACAB01000155.1| GENE 3 2437 - 3324 852 295 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237712949|ref|ZP_04543430.1| ## NR: gi|237712949|ref|ZP_04543430.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 295 1 295 295 527 100.0 1e-148 MKTRFFALATLALALAACNNDNENLNGAPVAARFTADIAPATRAGGTTWDNGDRIGITDI GNDTQYGNVPFILKNGKFEAEGKVIYIEDTKTHTFRAYYPYYAAGGILAATTDATAQQNQ PAIDFLFASGATGDKNNPVVSFTDKTAKGGEDNSFHHRMSQITLTFEAGDGVNFSVVKPE RYTLDGLLLTGTFNTADGIATADNGAQTGELAMNLADGVLTSSIILFPQTVASLPLVVNY KGQEYHATLTVPEGALLAGNNYTYTVKVRNKVLKVSEATIAKWNDIDGGDVGADL >gi|222159204|gb|ACAB01000155.1| GENE 4 3359 - 4921 1228 520 aa, chain + ## HITS:1 COG:no KEGG:BF3838 NR:ns ## KEGG: BF3838 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 12 175 19 182 337 73 32.0 1e-11 MRHRLFIPAATALLFALAACTQDELAGDNRLPEGEYPVVIRATGLSVEATPLAAPSTRAA VDGDWQGVTSVALKMGDAVKEYTVTASTDFKSATLSRENDPYYWTSRDPITVSAWWPFNN ADITQMPAVKVAEDQSKLADFQNSDFISAENRKVEFNNPKLEFTHRTARVTIELKPGTGF TSVAGATVSLVSLSADNGNPTAIKSYNASGNTYEALTAPQTVAAGKPFVKVELGGGTFYF RPQNNVVLEAGSRYKYTVKVNATGLTLEGCTIGSWVDGGGESGEAKDLGYIYDSNTNTYT VYNADGLLAWNKAIQKDESINCTLTADIDLTGKKWTRLDTWPGYSGVFNGQGHRITGLNF SAASFGLFYFLNVSGVIKNLQLIDVNLDGSSGGAAGMVDRNHGQIIACSVTGKLTVHSGG IANANYGDIIACWFNGTLKDESGCGTIVRFNYKNITSCYWGGNTGQGVLRNEGGTVVATK VDGATVKWQTAVDGMNTALTGNDYQWALGTGGLPVLQKKQ >gi|222159204|gb|ACAB01000155.1| GENE 5 4944 - 6650 1000 568 aa, chain + ## HITS:1 COG:no KEGG:DSY3857 NR:ns ## KEGG: DSY3857 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 266 498 1530 1766 3013 98 35.0 7e-19 MKLTTIHIFAAIALLLGLAACTQDEAGFLPEGEEGTPIVFTATGLNPAATATAGTRAPVD GNWTGVQSVAVLMDGTVKTYNVTPSTADPTSATLTSTDPYYWTNHNNITVTAWWPYTAGE TTPPAVKVKANQSTQKDFEGSDLIVADGQTVTYGSPTLRFTHRTARVTIVLTDYTEGLAS VRLTGLSTENDNPAEITPYDKGSNTYIALIAPQSVAAGTTFITCTFTNGKTFVYKMKNAT DWQAGGEYTYTVSLAAAKDLGYTIESNGSYTVTSADGLMNIAELVNGGKSDINITLDTDI DLTGKDWTPIGTDYDNSYKGTFDGGGHTITGLTFTTNDEYAGLFGWLNRAGTVKNVVMEG VQITSNQIYGGSIGGVVGYSWGTIENCSVSGSVSGTVYVGGVVGAQIGGSITGCSSSATV KGTVDVGGVAGQTNSSATLTACYATGNVTIEINPAKNIAGGSLVGMNAGSSLLACYATGN VTSTGSSTGKVHIGGFLGNNYTTVTAGYWKNNHEQGIGYNRESTGATKVDGSVVTWQKAV DAMNTALQNAGSRWRYELKGALPTLRKQ >gi|222159204|gb|ACAB01000155.1| GENE 6 
6727 - 6870 110 47 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|256842614|ref|ZP_05548115.1| ## NR: gi|256842614|ref|ZP_05548115.1| conserved hypothetical protein [Parabacteroides sp. D13] # 1 47 2 48 48 79 100.0 7e-14 MEHTENPPYWFWHQGIVNNNIDSSVNMEAVSRTLSRLVLIKDNKSSN >gi|222159204|gb|ACAB01000155.1| GENE 7 7417 - 7800 184 127 aa, chain + ## HITS:1 COG:no KEGG:BVU_3680 NR:ns ## KEGG: BVU_3680 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 119 1 118 122 152 61.0 5e-36 MFCLNDSTRYFLCPGSTDMRKGMFSLSGLIREKMNGDVRNGDVYIFINRLKNRIKLLHAE TGGLVMYEKLLEEGTFKIPAYDPETHSYPMTWSDLVIMVEGINEDKRKGRQRRLSDQKRH WQISVNK >gi|222159204|gb|ACAB01000155.1| GENE 8 7862 - 9547 1121 561 aa, chain + ## HITS:1 COG:SMb20541 KEGG:ns NR:ns ## COG: SMb20541 COG3436 # Protein_GI_number: 16264268 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Sinorhizobium meliloti # 18 509 25 525 550 212 31.0 2e-54 MTTKEDMPKRVTDPMLEQLALLMETNRKQSEIIESQAKTIEELRATIVELNASLAWLKRK VFGKMSEKCNPINNGDPMLPFDYGDLGQIEAEIEAARNKAAQIITPKPQVAGKTPRRNRI IMDDLPVVTVVIEPEDLDLGKYVKIGEEHTRTLEMKPGYLYVKDTVRPTYAIKDETEAVE NGKRMVVTAPLPLMPIYKGMPGASMLAEILLQKYEYHVPFYRQVKQPEHLGVKLSRNTLN GWFKPVCELLRPLYLELKKKVLSSDYIQVDETTLPVIDHDRHKAAKEYIWIVRAAVPRLL FFHYANGSRSQKVAVDLLKTFKGYLQCDGYSAYDAFENRKDVRLCGCLAHIRRHIESCRE ENREYAMQGLKFIQDLYNVEYMADERQLSYEERAALRQRLAGPLLDAFELWLQNTYPKVL KRSLMGKAIAYAYPLIPRMRHYLYDGRIFIDNNRAENALRPMVLTRKNMLFCGNQQAAEN TAVICSLLGSCKECGVNPREWLNDVISKLPYYLTPKSEKKLTELLPDRWGGYRQSHDTLP TVTGMSMDNPRTVHRQTVHAT >gi|222159204|gb|ACAB01000155.1| GENE 9 9671 - 9877 101 68 aa, chain - ## HITS:1 COG:no KEGG:PG0019 NR:ns ## KEGG: PG0019 # Name: not_defined # Def: ISPg4 transposase # Organism: P.gingivalis # Pathway: not_defined # 1 65 45 109 385 104 73.0 1e-21 MIFCQFVDYVPLREISNGLHSANGNLNHLGIPCAPSKSNLSYQNEKRSCEFFCDCYYALL NYFGQLPL >gi|222159204|gb|ACAB01000155.1| GENE 10 9931 - 11208 821 425 aa, chain - ## HITS:1 COG:CC2770 KEGG:ns NR:ns ## COG: 
CC2770 COG3550 # Protein_GI_number: 16127002 # Func_class: R General function prediction only # Function: Uncharacterized protein related to capsule biosynthesis enzymes # Organism: Caulobacter vibrioides # 7 407 4 425 435 214 33.0 3e-55 MKKEPIIVNVYLWGTCIGKLNWDFEKHCSVFQFTDEYRKQDYDICPSTHPKRTPLFASFY GNKDKLYQGLPEFLADALPDKWGASLFDQWLTDNNIKVTESLPLLKLSYIGKRAMGALEF EPEFNDDAIHESVNMSSLATLASKIYNDRDAAVISPEDSLTMKKLIYLGTSAGGMRPKAV VAYNLETEEFRSGQEDLPENFKQYIIKFKEADDSPTTEIEMVYSEMAKAAGINMVSCFLK EIDGRNHFVTERFDRKDGDKILSQPLAAIMPGADDYMKLCWLAETLKLPQEDKDQIFIRM VFNYVAGISDDYNKNISFIMDKTGRWRLSPAYDVMFTANTWENSSAHIHSMGVMGKRSAL TTSDFVNFAEDFVENPEKKILQVFDAVSKFQSLCVTYGIDKAISDKIQHVLDGLVTDDLN LLQLT >gi|222159204|gb|ACAB01000155.1| GENE 11 11423 - 11839 133 138 aa, chain - ## HITS:1 COG:no KEGG:BVU_1598 NR:ns ## KEGG: BVU_1598 # Name: not_defined # Def: transposase # Organism: B.vulgatus # Pathway: not_defined # 2 138 293 429 429 221 75.0 7e-57 MDGVQDLWEGEYTYRCILTNDYESSVREIVEFYNLRGGKERIFDDMNNGFGWDRLPKSFM AENTVFLLLTALIRNFYKAIIQRLDVKRFGLNATSRIKAFVFRFISVPAKWIRTSRRYVL NIYTCNNAYADIFQTDFG >gi|222159204|gb|ACAB01000155.1| GENE 12 12182 - 12583 125 133 aa, chain - ## HITS:1 COG:no KEGG:BVU_1596 NR:ns ## KEGG: BVU_1596 # Name: not_defined # Def: transposase # Organism: B.vulgatus # Pathway: not_defined # 1 123 55 192 401 179 64.0 3e-44 MSIYFCGGSCIEDVTTHLMNHLSLHPTLHTCSSDTILRAIKELTQENISYTSDTGKNYDF NTADTLNTLLLNCMFASDQLKEGEKYDAKPTYKKFSGYRPGVAVIGDLIVGIENSDGNTK VPFHGTFSRHGRA >gi|222159204|gb|ACAB01000155.1| GENE 13 12664 - 12837 89 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237723853|ref|ZP_04554334.1| ## NR: gi|237723853|ref|ZP_04554334.1| conserved hypothetical protein [Bacteroides sp. 
D4] # 7 57 243 293 308 102 98.0 6e-21 MWSQIAHQVGRKDFEEFGRRIGLGNKLIKRELDEFVKEKPLVQVLIDRSFLSEELKK >gi|222159204|gb|ACAB01000155.1| GENE 14 13041 - 14249 1074 402 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|198277709|ref|ZP_03210240.1| ## NR: gi|198277709|ref|ZP_03210240.1| hypothetical protein BACPLE_03932 [Bacteroides plebeius DSM 17135] # 1 402 1 402 402 795 100.0 0 MAKELIRVPRVLGESYNSVLIERLYEAGGNVMLDLIIYLGSYHLKDLFGTSWFSVEDFCR KMGYDRTNLQRKLDSRQLAAMFGKNMQPEYVCTDTAGQRISHPIETVFEAALYKLGLENL CYPTIGEDGRTSYNFVQILKRFDIMTDFKTKKSTKRLYSAVVSPEIKNFMFSLYNLLELQ DYRSLPSRYRYFYLELSKMVYLIKYKTTKNEAPFYVLTVDQLAKKLGIEIAEPKDRKKKV ASILKKMNTYLKYTNFNFSFVKGDHERWAYTVLFSFPRHTLHYFDEGQYAVVVKKFYKNL LGLYVEIAYPDTDMAVRRKKIKEVEEDAGLYKEFLLWANSPESVEKKKQIYISDFVAVFG RFPEGWAQEELESANGTEVAPAPEEFISPETDGAVNMDDPAV >gi|222159204|gb|ACAB01000155.1| GENE 15 15131 - 15757 347 208 aa, chain + ## HITS:1 COG:SP1088 KEGG:ns NR:ns ## COG: SP1088 COG2003 # Protein_GI_number: 15900956 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Streptococcus pneumoniae TIGR4 # 66 205 85 225 226 91 39.0 1e-18 MKTKRTVFEVNDVYRVMENSELIYTLTNSEKLKREEYKYEQMEIEGLCLSDVLETLTPSR KKVAYAAIELYKRLKEGQVESPALLSSNNVYKMMHPVLCDIATEEMWVLLLNSSSKLIRK VRISCGGINTAPVDIRVIMKQALYYNAVSFIMVHNHPSGARKPSSADDRLTEAVKKAAET LDIRLVDHVIVAGNNYYSYADEGRLQNR >gi|222159204|gb|ACAB01000155.1| GENE 16 15967 - 16344 125 125 aa, chain + ## HITS:1 COG:no KEGG:BDI_3861 NR:ns ## KEGG: BDI_3861 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 123 12 134 136 164 82.0 7e-40 MVALLLGIEFLRSDIRTILYCKRNDPRILKSCLYTLLILLTSLYSIVAGAMLHPEPVPVY ELIPGFQYGVYGFSFSFILLAVACLYRKTLPFVLLGYAMIYVIGVLIPLLRYVVSMMDYF TRTRV >gi|222159204|gb|ACAB01000155.1| GENE 17 16717 - 17091 200 124 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|189461004|ref|ZP_03009789.1| ## NR: gi|189461004|ref|ZP_03009789.1| hypothetical protein BACCOP_01651 [Bacteroides coprocola DSM 17136] # 1 124 9 132 132 233 100.0 2e-60 
MENLITQINKERLVNSDTALMMKELYYYVPCEYWYDKQDRLRTDIEGRNTPMYMCECPTL AACIQWMIQTREYTFQTEQNVAVWHVVVRAGDYVLYDSESNADAFCCLEEALEKAVQECM ELLY >gi|222159204|gb|ACAB01000155.1| GENE 18 17204 - 17800 547 198 aa, chain - ## HITS:1 COG:XFa0019 KEGG:ns NR:ns ## COG: XFa0019 COG1961 # Protein_GI_number: 10956730 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Xylella fastidiosa 9a5c # 1 183 4 182 188 148 44.0 6e-36 MKIGYARVSTKEQHLDMQLEALKAAGCEKIFSEKMTGRQQNRPELDACLSFLREGDTLVV YKLDRLGRSLKNILTLLEDFKNRGIQFTSLQDNISTEGAIGQLMNNILGAFAQFERDLIY ERTQEGRRIAKEKGVKFGRKTLINKNNIAKQKSCIQLYESGTPVREIQKILNIGSSGTVY RLLRRNGIALKSENKNNK >gi|222159204|gb|ACAB01000155.1| GENE 19 17842 - 18171 256 109 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|189461002|ref|ZP_03009787.1| ## NR: gi|189461002|ref|ZP_03009787.1| hypothetical protein BACCOP_01649 [Bacteroides coprocola DSM 17136] # 1 109 1 109 109 198 100.0 8e-50 MREAEENIKFKMEIDVLVPIPRTVTRDFTSLKHLRQWQKRNDIDGSLYCFAHREYLLNEK GEWEQFTVIGKQVVTIGELERLLLAMKQKGFNQYSREEYEELMSSYLKK >gi|222159204|gb|ACAB01000155.1| GENE 20 18378 - 19040 341 220 aa, chain - ## HITS:1 COG:STM3755 KEGG:ns NR:ns ## COG: STM3755 COG3943 # Protein_GI_number: 16767039 # Func_class: R General function prediction only # Function: Virulence protein # Organism: Salmonella typhimurium LT2 # 5 123 12 138 345 95 40.0 9e-20 MEKQGEIILYQPDEAVRLEVRLEDETVWLTQAQIAELFQRDRTVITKHINNVFKEKELEE KSNVHFLHIANSDKPVKFFSLDVIISVGYRVKSVRGTQFRQWANKILKEYLLKGYSINQR LNDMEYRMNNRFFQIEKTIAEHDAKIDFFVRTSLPPVEGIFFDGQIFDAYKFATDLIKSA KCSLVLIDNYVDESVLLMLSKRNSGVSATIYTQNKRTAPT >gi|222159204|gb|ACAB01000155.1| GENE 21 19450 - 19698 316 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237712968|ref|ZP_04543449.1| ## NR: gi|237712968|ref|ZP_04543449.1| predicted protein [Bacteroides sp. 
D1] # 1 82 1 82 82 154 100.0 2e-36 MVTNIIQIGNSKGIILPSEILKQLRLSLKSTVSVSLDGNNIVIKAQPRQGWAEAAKRAHE NGDDELLIPDVFEDEKFEDWTW >gi|222159204|gb|ACAB01000155.1| GENE 22 19692 - 20012 136 106 aa, chain + ## HITS:1 COG:NMB2038 KEGG:ns NR:ns ## COG: NMB2038 COG2337 # Protein_GI_number: 15677861 # Func_class: T Signal transduction mechanisms # Function: Growth inhibitor # Organism: Neisseria meningitidis MC58 # 1 104 3 106 107 97 46.0 7e-21 MVEQYGVYWVELDPTRGGEMAKTRPCVVVTPSDLNMYLTTVVIVPITSTIRNYPYRVLCS VAGREGEIATDQIRTVDKSRIKRKIGELNNREIEELKEVFRQMFCE >gi|222159204|gb|ACAB01000155.1| GENE 23 20012 - 20236 186 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237712970|ref|ZP_04543451.1| ## NR: gi|237712970|ref|ZP_04543451.1| predicted protein [Bacteroides sp. D1] # 1 74 2 75 75 134 100.0 1e-30 MEEKENKPTCKDFAEYLKKCAEAYAKHGSQELTNSDGEYQDIFDDLCQLHGDFDHGANRI ISLKDMFIEYFNVK >gi|222159204|gb|ACAB01000155.1| GENE 24 20336 - 20620 235 94 aa, chain + ## HITS:1 COG:no KEGG:BVU_3428 NR:ns ## KEGG: BVU_3428 # Name: not_defined # Def: DNA-binding protein HU # Organism: B.vulgatus # Pathway: not_defined # 1 89 1 89 94 118 71.0 7e-26 MTKSEIVKEISARTGIDGKAVLAVVEGFMNEVKTSLGKEENVYLRGFGSFIVKKRAQKTA RNISKNTTMIIPAHNIPAFKPSDEFLTMVKMGKQ >gi|222159204|gb|ACAB01000155.1| GENE 25 20617 - 20889 136 90 aa, chain + ## HITS:1 COG:no KEGG:BT_1941 NR:ns ## KEGG: BT_1941 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 81 1 82 276 92 54.0 5e-18 MKDQLRLLRDCINNDRPAVVFQGDDFCAPEILEAAKEIYRKHGCSEEFLFDWQLLINEVK AYQLESPATVKLPKLSPTETELVREEMTKR >gi|222159204|gb|ACAB01000155.1| GENE 26 20951 - 21328 216 125 aa, chain - ## HITS:1 COG:no KEGG:Slin_6854 NR:ns ## KEGG: Slin_6854 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 1 118 1 150 151 68 36.0 9e-11 MEHPNSKCRIAQAEYLSRLPEEERENKARDIRIGNASYIYHQQAVPIQENRLIMYYKEWL EDLPPNISRHMRMLGFEACKTMIPFTRYVNERNDIGMRDWMQEHLSPGDFNYWQELSKKA GSPTF >gi|222159204|gb|ACAB01000155.1| GENE 27 21343 - 
21669 93 108 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237712974|ref|ZP_04543455.1| ## NR: gi|237712974|ref|ZP_04543455.1| predicted protein [Bacteroides sp. D1] # 1 108 1 108 108 215 100.0 6e-55 MVQNYTPVMWDDKAFAFVPYEAFGDLPHYPKEKCEQICKELNSLIRLCTYRPKKEDIYFH PVSYVCRSGGFIVTDNQASFEECPYPACADRHSCQKICDLMNRIIEES >gi|222159204|gb|ACAB01000155.1| GENE 28 21715 - 21969 232 84 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|189460990|ref|ZP_03009775.1| ## NR: gi|189460990|ref|ZP_03009775.1| hypothetical protein BACCOP_01637 [Bacteroides coprocola DSM 17136] # 1 84 1 84 84 151 100.0 1e-35 MEKKKITIEVEPATAVATVGLLRGIFPSIIEQLERQAATNGSPLKFNKVENMQEVLDEIY EKCIAETNLREFAQAHLNSDGLPN Prediction of potential genes in microbial genomes Time: Wed May 18 04:36:51 2011 Seq name: gi|222159203|gb|ACAB01000156.1| Bacteroides sp. D1 cont1.156, whole genome shotgun sequence Length of sequence - 2469 bp Number of predicted genes - 6, with homology - 5 Number of transcription units - 2, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 37 - 516 336 ## gi|237712976|ref|ZP_04543457.1| predicted protein 2 1 Op 2 . + CDS 573 - 836 203 ## gi|237712977|ref|ZP_04543458.1| predicted protein 3 1 Op 3 . + CDS 852 - 1577 217 ## BF2311 putative type I restriction modification system related protein 4 1 Op 4 . + CDS 1607 - 1930 218 ## gi|237712979|ref|ZP_04543460.1| predicted protein 5 1 Op 5 . + CDS 1930 - 2283 218 ## gi|189460984|ref|ZP_03009769.1| hypothetical protein BACCOP_01631 + Term 2290 - 2334 7.0 6 2 Tu 1 . - CDS 2307 - 2468 57 ## Predicted protein(s) >gi|222159203|gb|ACAB01000156.1| GENE 1 37 - 516 336 159 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237712976|ref|ZP_04543457.1| ## NR: gi|237712976|ref|ZP_04543457.1| predicted protein [Bacteroides sp. 
D1] # 1 159 1 159 159 323 100.0 2e-87 MSRVHYLEGDYEQLVINETIDGLFSSYRIDRNSLPKGFFLYEIRWDDSLSSLAEICPSVV VNHAGSFITKSPLEFDANNSIRITYANFIEFCQFGEWAYEKLAVLDCNSGNVAVISPDRR LQTAEEIEIFLSEHCGYHLSEINWMVMKGDVVFLNENDF >gi|222159203|gb|ACAB01000156.1| GENE 2 573 - 836 203 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237712977|ref|ZP_04543458.1| ## NR: gi|237712977|ref|ZP_04543458.1| predicted protein [Bacteroides sp. D1] # 1 87 1 87 87 159 100.0 7e-38 MKWIDKMVERITRKETALNDRFCVNRHTVVCQSGTTDYVSVTIDNTDGFDFDFWTKQLCF EKDCKYRSEIKAAFDKIYGTRNIECCE >gi|222159203|gb|ACAB01000156.1| GENE 3 852 - 1577 217 241 aa, chain + ## HITS:1 COG:no KEGG:BF2311 NR:ns ## KEGG: BF2311 # Name: not_defined # Def: putative type I restriction modification system related protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 11 206 14 209 248 114 38.0 2e-24 MIPKEIMKLYEKLAFSRGYEEAFRDFLDVCLYYLSVGMLAEDYRRVEKRYKPYEMELFVQ MFYRVSEYSEGFCDVLGDMFMECVSHGNNGQFFTPIHVADLMACMGENRLKPKQSVCDSC CGSGRMLLSAVKKCAEENDGGRLFCYGSDIDLICVKMTVVNLMMNSVPGEVAWMNTLTMQ HWRSYHIDLQLIAGVWLPILKITEAGDTSFVRKLENAMEDNSELKRSIQSNARATQLTFD F >gi|222159203|gb|ACAB01000156.1| GENE 4 1607 - 1930 218 107 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237712979|ref|ZP_04543460.1| ## NR: gi|237712979|ref|ZP_04543460.1| predicted protein [Bacteroides sp. 
D1] # 1 107 1 107 107 200 100.0 2e-50 MKAIRTKEVNENVKAANSCLRKAKRQAKELAVFTGEVVRLIDKEGHSLQGFFNRVEFVLV DGTIKFRVVINPILECENCIMASCHEDYLYEVVSVEKFTYKNYRYHL >gi|222159203|gb|ACAB01000156.1| GENE 5 1930 - 2283 218 117 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|189460984|ref|ZP_03009769.1| ## NR: gi|189460984|ref|ZP_03009769.1| hypothetical protein BACCOP_01631 [Bacteroides coprocola DSM 17136] # 1 117 1 117 117 224 100.0 1e-57 MILYFNDYKLIYIITMMGKINNIPVLNEDVIRLVVLNEEIIGYIYPRRTSYVVALEIKPG RTTSLATSVNKQCMVLLSDKVRLATEQDFDDFHICFDGFRNNTMYCYNQGTEPIYLQ >gi|222159203|gb|ACAB01000156.1| GENE 6 2307 - 2468 57 53 aa, chain - ## HITS:0 COG:no KEGG:no NR:no KPGMRTAIPMKGRVGKNLRERKLMGLPVTPSVFCYFFWGACQKVADSEMLSLP Prediction of potential genes in microbial genomes Time: Wed May 18 04:37:24 2011 Seq name: gi|222159202|gb|ACAB01000157.1| Bacteroides sp. D1 cont1.157, whole genome shotgun sequence Length of sequence - 3332 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 244 - 303 3.6 1 1 Op 1 . + CDS 361 - 879 330 ## BF2868 putative ribose phosphate pyrophosphokinase 2 1 Op 2 . + CDS 881 - 1495 303 ## BF2867 hypothetical protein + Term 1519 - 1570 3.0 3 2 Tu 1 . - CDS 1802 - 2173 102 ## gi|256842673|ref|ZP_05548174.1| predicted protein - Prom 2355 - 2414 6.1 4 3 Tu 1 . + CDS 2533 - 2931 482 ## BF2915 putative single strand binding protein + Term 2934 - 2966 2.3 - Term 2867 - 2904 0.4 5 4 Tu 1 . 
- CDS 3074 - 3331 98 ## gi|262409560|ref|ZP_06086101.1| conserved hypothetical protein Predicted protein(s) >gi|222159202|gb|ACAB01000157.1| GENE 1 361 - 879 330 172 aa, chain + ## HITS:1 COG:no KEGG:BF2868 NR:ns ## KEGG: BF2868 # Name: not_defined # Def: putative ribose phosphate pyrophosphokinase # Organism: B.fragilis # Pathway: not_defined # 5 167 20 183 188 170 49.0 2e-41 MMFSFFEYLPMQYRQATEREWQIRKMIWSFKDGKSYLNVAWMIANKLHQVFGDDVKNIVF ACVPASSADKNELRYKGFASAVCKFSGAINAYEHIRASGDRLAIHEKFDSKSLQKVQVIE FDKDFFRGKKILVFDDILTKGFSYARFACQLEKIGGEVLGGFFLGKTVVRML >gi|222159202|gb|ACAB01000157.1| GENE 2 881 - 1495 303 204 aa, chain + ## HITS:1 COG:no KEGG:BF2867 NR:ns ## KEGG: BF2867 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 48 204 5 162 167 125 43.0 9e-28 MKEKESYIEKQKGIFGDTTWFTYRYEVNGMVYETSAGSLDICRKARDKWMKMMSVAFTGH RTIRTNKYALSVSLNEEVRFCYENGIRFFYIGCAVGFDMMAAHTILEQRKQYPDMVLVAV VPYVGQDVYFNKEDKQRYADILRQADKVVVLSEYYYAQCYAHRNDYMISHACRLIAYWDG KSAGGTSYTFNKAQKKKLVIHNLF >gi|222159202|gb|ACAB01000157.1| GENE 3 1802 - 2173 102 123 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|256842673|ref|ZP_05548174.1| ## NR: gi|256842673|ref|ZP_05548174.1| predicted protein [Parabacteroides sp. 
D13] # 1 123 4 126 126 206 100.0 3e-52 MQKNNRSDRKHGTFRKYGLPPIDTPEETSAIPFKKMKPKNEAYNIPRIGRSGSSIKQKPT TEEQTYGFCGENITIQAHSHYNCFRKKKIILPEKQKGIHSEKSKRLKRTANLVAKYPTTY LQP >gi|222159202|gb|ACAB01000157.1| GENE 4 2533 - 2931 482 132 aa, chain + ## HITS:1 COG:no KEGG:BF2915 NR:ns ## KEGG: BF2915 # Name: not_defined # Def: putative single strand binding protein # Organism: B.fragilis # Pathway: not_defined # 1 113 1 112 126 140 68.0 2e-32 MKKIENSFAVTGFIGKDAEIRSFETASVARFSIAVSRADKTGEETSYVSAFMGIEAWRKN EALDSFDVLKKGELITVEGYFKPEEWTDSKSGEKRNRIVMVATKFYPAPEKEEEKPDQPK QKAVSKKKKASK >gi|222159202|gb|ACAB01000157.1| GENE 5 3074 - 3331 98 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262409560|ref|ZP_06086101.1| ## NR: gi|262409560|ref|ZP_06086101.1| conserved hypothetical protein [Bacteroides sp. 2_1_22] # 20 85 1 66 66 134 98.0 1e-30 REAENKEKGSTESKTNHTVLLAKPYSYLPATSGRGERVSNGLPGQTVKGRDCNTKEGGCF GNAEAPDDDHCNPAHVWNFSFLVYP Prediction of potential genes in microbial genomes Time: Wed May 18 04:37:45 2011 Seq name: gi|222159201|gb|ACAB01000158.1| Bacteroides sp. D1 cont1.158, whole genome shotgun sequence Length of sequence - 5659 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 3, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 38 - 229 150 ## gi|237712985|ref|ZP_04543466.1| predicted protein 2 1 Op 2 . + CDS 226 - 636 335 ## BF0106 hypothetical protein 3 1 Op 3 . + CDS 638 - 1885 386 ## BF0107 hypothetical protein + Term 1893 - 1928 5.1 - Term 1897 - 1950 11.4 4 2 Op 1 . - CDS 1985 - 2470 361 ## gi|237712988|ref|ZP_04543469.1| predicted protein 5 2 Op 2 . - CDS 2524 - 3549 735 ## Slin_6836 hypothetical protein 6 2 Op 3 . - CDS 3561 - 4949 619 ## COG1672 Predicted ATPase (AAA+ superfamily) - Prom 4990 - 5049 4.5 7 3 Op 1 . - CDS 5057 - 5476 292 ## gi|237712991|ref|ZP_04543472.1| predicted protein 8 3 Op 2 . 
- CDS 5499 - 5657 147 ## gi|255016360|ref|ZP_05288486.1| TraG family mobilization protein Predicted protein(s) >gi|222159201|gb|ACAB01000158.1| GENE 1 38 - 229 150 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237712985|ref|ZP_04543466.1| ## NR: gi|237712985|ref|ZP_04543466.1| predicted protein [Bacteroides sp. D1] # 1 63 1 63 63 66 100.0 6e-10 MQIIQFVFSVIIALCAVAMFVGAVITLSPIKVVSVTLLALICLFSLFLVKLAYCEMKSKN DIL >gi|222159201|gb|ACAB01000158.1| GENE 2 226 - 636 335 136 aa, chain + ## HITS:1 COG:no KEGG:BF0106 NR:ns ## KEGG: BF0106 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 136 1 134 138 135 54.0 4e-31 MKVSEQFKSTIKAYLDNMAAVDSLFAPVYQKPTKNIDNCITYILNQVKKSGCCGFSDDEI FGMALHYYQEDNIEVGSPLKCNVVVNHHVELSEEEKKAAREAAIKKLQEEELAKLKKRQQ AKRENNTEVQTQLTLF >gi|222159201|gb|ACAB01000158.1| GENE 3 638 - 1885 386 415 aa, chain + ## HITS:1 COG:no KEGG:BF0107 NR:ns ## KEGG: BF0107 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 413 1 440 443 327 40.0 6e-88 MKPKTKLQMEIVNGSRKLAPVSEAQKRYAYKHCFVHYFKRDAKGNCFCLDCGHTWRDKED KKNCKCPHCGMNLKLENSRKRTAIYKEYFCVITTYKQYQVIRFFMVDCRLKKGSPANYFI IEAVQCWMNKEGKTETLSLLRGMSIFYYDAWIYGSSLELRKRNVHHDRIYDICPAVIYPR MKVIPELTRNGFKGVFYDICPSSFFMTLLTDNRMEILYKAGQMNLFLRFLERKYGMDKYW TYVKICLRHDYVIHDADLWLDYVDMLIENKLDARNPHYLCPLNVEEAHDWVMGKCKKKYS EKDEKDYIAAKSRFFNLSFADGNIMVRVLESLSDFYKEGKLLHHCVFSNAYYKREDSLIM SATVDGRRMETVEFSLSRMEVCQCRGKSNQLSAYHDRILNLVRDNIPLIRERMVV >gi|222159201|gb|ACAB01000158.1| GENE 4 1985 - 2470 361 161 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237712988|ref|ZP_04543469.1| ## NR: gi|237712988|ref|ZP_04543469.1| predicted protein [Bacteroides sp. 
D1] # 1 161 1 161 161 307 100.0 1e-82 MNIKHLFFCMLLITFSSCSKPGYEKAIAEWVQTDSHGTWTDLKFELLEVLETEDVTVSDS LRYLNNKSAQLSAVIQKAESPRALFKPSFSAYMEAEKSLKATDAMKAMYLHRDSTEVIGK ILKCRYAIVQPHSGVQQKKTASFLLSPDMEKCIGKLKPASK >gi|222159201|gb|ACAB01000158.1| GENE 5 2524 - 3549 735 341 aa, chain - ## HITS:1 COG:no KEGG:Slin_6836 NR:ns ## KEGG: Slin_6836 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 2 281 167 453 632 67 26.0 5e-10 MKVDFNQIKTTISLPDFLLELGWKIVEGSSNSCPKMSNGTHTIVIKRNSQNQYTYWDVHS DSVRGRSIMDLMQEHLFETTGKMPTLREVGEILQNYINTNRITTPEKSRYEVGNTSMRAD ELQFYLSQLQPYKGNYLQKRGILKESIESRFFKDTLFIREVKNKGSVYRNVCIKMYNENG VQAISQRNETFKGIIGGKFDCLATSNHDKSRPIDILYIGESFIDCISHYQLRHSGSDLNL VYVSTEGTFTEGQMRLLRLILDKNQVKELRSIFDNDKQGHKYTLWLHRYFHGDTTDVESL SNDELRNKVRELKNVELSENKDWNDDLKISCGICSSTEDGQ >gi|222159201|gb|ACAB01000158.1| GENE 6 3561 - 4949 619 462 aa, chain - ## HITS:1 COG:FN0123 KEGG:ns NR:ns ## COG: FN0123 COG1672 # Protein_GI_number: 19703471 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 6 441 3 431 454 182 31.0 2e-45 MKFNQFIDRQDSLKRLRNALLRTTPQFLVIYGRRRIGKSTLIKEIMQDKDVYFLSDQTNE ANQRALFAKAIAYSIPRFDKVVYPDWETLLLELNDRLSERITVCLDEFPYMVKSCASLPS VIQKLLNGKTLKYDLIICGSSQQLMQGYVLDKREPLYGLADEIIKLPPIPVSYIGQALNV NTIAAVEEYSIWGGIPRYWELRADYPDMDTAIRELALDTKGILAEEPQRLLRDDLRDTIQ TSTILSIIGNGVNRITEIASRAGKEATQISEPLSKLRELGYIRREIPFGESEKKSKKGIY RINDNMLQFYYRFIAPHRSILELRRIDTVMHLINAQFTQHVGDCWEHLCRQYISGNEIDG IVYNIASRWWGKIFPEGNKEGTMIELDVVAESFDKKHILIGECKWTNKEDAQRLARTLSE KAAHLPFIKDGQEVHLVLFLKQEPEHQESIRYFLPEDIIETI >gi|222159201|gb|ACAB01000158.1| GENE 7 5057 - 5476 292 139 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237712991|ref|ZP_04543472.1| ## NR: gi|237712991|ref|ZP_04543472.1| predicted protein [Bacteroides sp. 
D1] # 1 139 1 139 139 231 100.0 9e-60 MRKIHSLLVAACLLAASCNREEIQPETDKAGMYHITVAVTGTSPKGSVHIYNLDGVKFRN ERTGTSAFSVDETFTDKVEYSTEKPVSQITVQGILYSKENATITLQIKKDGESVFNQSKQ LEAVPGIDTTVDLVYSTIK >gi|222159201|gb|ACAB01000158.1| GENE 8 5499 - 5657 147 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|255016360|ref|ZP_05288486.1| ## NR: gi|255016360|ref|ZP_05288486.1| TraG family mobilization protein [Bacteroides sp. 2_1_7] # 11 52 607 648 648 84 100.0 2e-15 MELEILEEIIQQNYIKIIEDVNAILKKIEDKLKEKSAVPAAGTHKTEQKIIR Prediction of potential genes in microbial genomes Time: Wed May 18 04:38:23 2011 Seq name: gi|222159200|gb|ACAB01000159.1| Bacteroides sp. D1 cont1.159, whole genome shotgun sequence Length of sequence - 24366 bp Number of predicted genes - 29, with homology - 28 Number of transcription units - 8, operones - 4 average op.length - 6.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 492 332 ## gi|237717765|ref|ZP_04548246.1| conserved hypothetical protein 2 1 Op 2 . - CDS 504 - 845 385 ## gi|237712993|ref|ZP_04543474.1| predicted protein 3 1 Op 3 . - CDS 865 - 2058 856 ## COG4227 Antirestriction protein 4 1 Op 4 . - CDS 2062 - 2247 308 ## gi|237712995|ref|ZP_04543476.1| predicted protein 5 1 Op 5 . - CDS 2266 - 3738 1564 ## pBF9343.28c putative integral membrane protein 6 1 Op 6 . - CDS 3735 - 4847 883 ## BT_p548216 TraN-like protein 7 1 Op 7 . - CDS 4860 - 5045 90 ## gi|189460952|ref|ZP_03009737.1| hypothetical protein BACCOP_01599 8 1 Op 8 . - CDS 5042 - 6325 1169 ## BT_0087 conjugate transposon protein 9 1 Op 9 . - CDS 6312 - 7010 576 ## gi|237713000|ref|ZP_04543481.1| predicted protein 10 1 Op 10 . - CDS 7015 - 8070 1053 ## gi|255016369|ref|ZP_05288495.1| hypothetical protein B2_20891 11 1 Op 11 . - CDS 8076 - 8672 481 ## gi|237723810|ref|ZP_04554291.1| predicted protein 12 1 Op 12 . - CDS 8678 - 8899 246 ## gi|237713003|ref|ZP_04543484.1| predicted protein 13 1 Op 13 . 
- CDS 8892 - 11498 2147 ## Slin_6919 type IV secretory pathway VirB4 protein-like protein 14 1 Op 14 . - CDS 11495 - 11806 263 ## gi|198277775|ref|ZP_03210306.1| hypothetical protein BACPLE_03998 15 1 Op 15 . - CDS 11819 - 12163 483 ## gi|189460944|ref|ZP_03009729.1| hypothetical protein BACCOP_01591 16 1 Op 16 . - CDS 12182 - 12526 184 ## gi|189460943|ref|ZP_03009728.1| hypothetical protein BACCOP_01590 17 1 Op 17 . - CDS 12523 - 12735 240 ## gi|189460942|ref|ZP_03009727.1| hypothetical protein BACCOP_01589 18 1 Op 18 . - CDS 12754 - 13332 572 ## gi|189460941|ref|ZP_03009726.1| hypothetical protein BACCOP_01588 - Prom 13460 - 13519 7.4 - Term 13534 - 13562 -1.0 19 2 Tu 1 . - CDS 13642 - 14235 348 ## BDI_2950 hypothetical protein - Prom 14363 - 14422 2.7 - Term 14403 - 14443 2.1 20 3 Tu 1 . - CDS 14480 - 15610 1016 ## gi|189460937|ref|ZP_03009722.1| hypothetical protein BACCOP_01584 - Prom 15632 - 15691 2.8 - Term 16234 - 16283 9.4 21 4 Op 1 . - CDS 16305 - 16787 306 ## gi|237723800|ref|ZP_04554281.1| predicted protein 22 4 Op 2 . - CDS 16792 - 17547 610 ## BF0129 hypothetical protein - Prom 17578 - 17637 8.6 + Prom 17507 - 17566 12.9 23 5 Tu 1 . + CDS 17690 - 17776 57 ## + Term 17824 - 17860 0.0 + Prom 17798 - 17857 3.6 24 6 Op 1 . + CDS 17924 - 18232 141 ## gi|237713014|ref|ZP_04543495.1| predicted protein 25 6 Op 2 . + CDS 18229 - 19629 1092 ## PGN_0075 hypothetical protein 26 6 Op 3 . + CDS 19658 - 20641 521 ## gi|237713016|ref|ZP_04543497.1| predicted protein + Term 20654 - 20691 2.6 27 7 Op 1 8/0.000 - CDS 21450 - 23039 1149 ## COG3550 Uncharacterized protein related to capsule biosynthesis enzymes 28 7 Op 2 . - CDS 23032 - 23346 306 ## COG1396 Predicted transcriptional regulators - Prom 23386 - 23445 8.0 + Prom 23709 - 23768 5.6 29 8 Tu 1 . 
+ CDS 23798 - 24365 215 ## COG1373 Predicted ATPase (AAA+ superfamily) Predicted protein(s) >gi|222159200|gb|ACAB01000159.1| GENE 1 3 - 492 332 163 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237717765|ref|ZP_04548246.1| ## NR: gi|237717765|ref|ZP_04548246.1| conserved hypothetical protein [Bacteroides sp. 2_2_4] # 1 163 1 163 648 334 100.0 1e-90 MEREKKDLSFILILLSTLIGTAVLFQWAIVTGLYAPLRNPAMWERLMEKDALFRFLYVIL IGGLAFLFPVGKVKDENKKWVYTSMTLVSASMLVIGFSRLSAWYNLFVFPTVFVAYTLLV IKTLPYFIRRHAQSDDSIFGLSREESAFYFRFETTSGPLVIHK >gi|222159200|gb|ACAB01000159.1| GENE 2 504 - 845 385 113 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237712993|ref|ZP_04543474.1| ## NR: gi|237712993|ref|ZP_04543474.1| predicted protein [Bacteroides sp. D1] # 1 113 15 127 127 224 100.0 2e-57 MFTGQIIGYVGADARQNKNGKGFNFHVSTKYKNVNNEEQTLWVCCFVNYESKVFEYLKKG TQVYVCGDVYADVFQKDDGSHIPSVSLTVSRIELLGKAEKKENGDGTATGQPQ >gi|222159200|gb|ACAB01000159.1| GENE 3 865 - 2058 856 397 aa, chain - ## HITS:1 COG:XF2061_1 KEGG:ns NR:ns ## COG: XF2061_1 COG4227 # Protein_GI_number: 15838653 # Func_class: L Replication, recombination and repair # Function: Antirestriction protein # Organism: Xylella fastidiosa 9a5c # 10 319 221 509 522 123 31.0 6e-28 MDTASDKALQRFAELMIEKIKQVEDNWQKPWFGIKGGGLPQNIEGRTYNGVNSFMLFLLS EKEQYSLPVYMTFMQAKDSGLNILKGEKSFPVIYWNFSVRDKNGKKIPFDVYKNLDKNEQ QEYKVTPFLKTYNVFNVQQTNLQETKPEKWEALKEQFKIPAIKDEQGMLTVPLIDAMVRE QQWICPIYSKEGNSAYHARGEDNHIVVPLKGQFKDGENFYSTLLHEMAHSTGEPEHLNRE KGVIFGDKQYAKEELVAELTSATVGQSLGISTYIREENAMYLKNWLGALKEDPKFIYNIL ADVGKASNMIQEHASRMEQYLTPEERFTLAVLQDNRPVLEQMKNEGFIPSSRQLENLAAN HPTASNLETLYGTFGISLPVMEAEPAMKNTNEPQLGL >gi|222159200|gb|ACAB01000159.1| GENE 4 2062 - 2247 308 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237712995|ref|ZP_04543476.1| ## NR: gi|237712995|ref|ZP_04543476.1| predicted protein [Bacteroides sp. 
D1] # 1 61 1 61 61 79 100.0 1e-13 MDYDELVQKNIAGEICDLEFLLAQEELAQAYQEEMAAKQQEINNQTAREWLLDYENRNLY Q >gi|222159200|gb|ACAB01000159.1| GENE 5 2266 - 3738 1564 490 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.28c NR:ns ## KEGG: pBF9343.28c # Name: not_defined # Def: putative integral membrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 10 146 12 157 519 90 39.0 2e-16 MKLREALDTYMFDNVGQFGGIAESLGYQTEYRDGYFRFRKDGEELSMSVSEIREKAGAKY DESLRDQSKERVSALFNKERANDPKYVAELEKEGISIKRWENLKGHDKDGFTVIDHRLKV CYTGQSLYEYAYKQGNILDGKGTKLEKGIMSDLMEIHGKPGKLRLNDDGISVFYRKEALV IPDKVLGKKLGKKEKEQLLAGDIVPITVNKKDILLQVDRDLNSVILRTNQEIKIPDIIGQ TSEYGGYKLTKADKYLLANGHALENKLLHSPEGYFIADIQLTDDRKGVMIQNIQSITPGK AQELIKAMTPKLEAHAANIEAAKEQKHEEGRNMEAEFKEAVGKHDFEKIGKLKEEGYKPS EEFIKGIGKEHGLDERQTHEVIQLFGTKPEEQGEHERQAARLLDAAQTDNFHVIQEIQKE GYRLTQQDLTRMRETGVQANTLIAVQKIFGMEGSTKTLGDVKLASTPKPDNSKEMARPIA STINRAFNDL >gi|222159200|gb|ACAB01000159.1| GENE 6 3735 - 4847 883 370 aa, chain - ## HITS:1 COG:no KEGG:BT_p548216 NR:ns ## KEGG: BT_p548216 # Name: not_defined # Def: TraN-like protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 365 4 365 365 132 26.0 3e-29 MKKYIYLLFYCFVCSLPLFAQENGTPRQQAISQKDIFISFDCVKHLVFPVQVSDIAIGEQ ELVMASRVEEAPHIVRLSAQAEGFTEETNLTVVCIDGSVYTYHIRYLPESGTDSHPNIYE DNGKWQHHDYQAEVSDLHLAEFFFPEDIAYGTPGNEVSFTLAAYNNQLKVSTAKDAVAYS NLFVVDKAMNTYHITIKRGNTSVFTYNFDDQRKYTAHVDVNSEEMERCIQELRTKKRNIY SLGVIENKFELSMANLYVHEDFMFFIFDLKNKSYIDYDIEFVKCFQRDQKKSKNAIQQET TIEPIYQKDFDTKIKGKSRNRLILGFDKFTIPDDKVFEIEIYERNGGRHIKLAVLNEYIL SAEPLYKPQP >gi|222159200|gb|ACAB01000159.1| GENE 7 4860 - 5045 90 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|189460952|ref|ZP_03009737.1| ## NR: gi|189460952|ref|ZP_03009737.1| hypothetical protein BACCOP_01599 [Bacteroides coprocola DSM 17136] # 1 61 1 61 61 93 100.0 5e-18 MIDILNYLLALSAVFFGYRIYVSRRNGRAFKEIRPLTVSFGITLLLLWGITVFTAFFSRT V >gi|222159200|gb|ACAB01000159.1| GENE 8 5042 - 6325 1169 427 aa, chain - ## HITS:1 COG:no KEGG:BT_0087 NR:ns ## KEGG: BT_0087 # Name: not_defined # 
Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 234 388 306 461 461 93 33.0 2e-17 MEPNKLIKDKNKLLMAIPVILGILFLYVFFIREDGNVPSGSDRQTASQAFLEPQSEMADK VNDKMEAYKQDKLEEQEKARILKESQVKGSDFYFELQNQQERYDRATLERIRRMQNDPYS EVMAEYGGGRNGRFSHQLESQLDGLEDEQALNEIIREAKKNSRIRKELEESDKYRKRMYE RIVNYDQEGKTEKAAANPSGHPSADSLMQKGPIYRAENGKRVRRQPTSTPGNSNLFRACI HGDQTVVTGSTVRMRLLQDVTLSGMKIPANTLFYGVATLGANRLDVVVSNLKVGDNLNPV SVVVFDNDAMEGLNLPNNLKASAAKRMEQGLVQNIDMPLSSIGTMASEVTSVVNATTQVA KQILNMSLSQVKVHLKSNYEMYIQEESQESKLRRQAVQAELQKLYEQMEQEKTNKKNHPL QTLIDKL >gi|222159200|gb|ACAB01000159.1| GENE 9 6312 - 7010 576 232 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713000|ref|ZP_04543481.1| ## NR: gi|237713000|ref|ZP_04543481.1| predicted protein [Bacteroides sp. D1] # 1 232 1 232 232 437 100.0 1e-121 MNLMKFKGADDAVKLTIRLSAGYCIILTIAFASSIIFLINKIEKAYGQALVIDRQGEVYE ASSLSASDMRQYEYMNHVKTFVTHWYAFDESSYEKNISLALNLIGNKGKELLNEYNDVNM LNSLIQKNIRYGVRIKDVEINTQTIPVSGKIIFTQTGYRARGSISRNIEAEFTIYDMLSR SEENAHGAKIEDWIVRYSEPVESNGANTSQAASPGKNENQSETSKPTDHGTE >gi|222159200|gb|ACAB01000159.1| GENE 10 7015 - 8070 1053 351 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|255016369|ref|ZP_05288495.1| ## NR: gi|255016369|ref|ZP_05288495.1| hypothetical protein B2_20891 [Bacteroides sp. 2_1_7] # 1 351 1 351 351 650 98.0 0 MNQDNFLWGLQITDGLTMELVSQMTVIAFGVGVICFICNLAYNYLYHGASQLLSPNEDKF PDMMEIARCLVLFFCLSMYTPIAQTIVGTMEVINEATSLTSERAQEFAQFMAQSATEQGE MIVEYDKHSLQSEVAAGEDNTGAMQHELDKKMEEDEMTGVRSSVEKIVQLLNPANLATLV LHGISALLVGIIQVVILGIGVVIVKLLVILGPFVFAVSMLPVFQKQLSIWFGTLCSACMV FTVINILNQIMWQTFKAIYTESADMVDAATQQIQYLGMDLALIGAYCSCFWLSSKIVGHS DAGKIISKTVSIVTTAATIALMGGAAAGGKLTNVGGAASIGASFINDNNKK >gi|222159200|gb|ACAB01000159.1| GENE 11 8076 - 8672 481 198 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237723810|ref|ZP_04554291.1| ## NR: gi|237723810|ref|ZP_04554291.1| predicted protein [Bacteroides sp. 
D4] # 1 198 35 232 232 384 100.0 1e-105 MKRIILILGLAASMGLTGTGSVHAQFVVHDPGHTLLNSTEWIANVKKWVTQINEMIDAQE LRLGLQKIDQLKELKSLKELADLLDDVACLSSDYSFYLNVGSNYHCLKFLNFQRVTVNLS LSTDLLFKVATVTSYFSMNSEGRMSFIEQVKESVEKAAEEMREFNESVRSTVIYKSMKGH NRKTYYQGRLAAFTRHTN >gi|222159200|gb|ACAB01000159.1| GENE 12 8678 - 8899 246 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237713003|ref|ZP_04543484.1| ## NR: gi|237713003|ref|ZP_04543484.1| predicted protein [Bacteroides sp. D1] # 1 73 1 73 73 108 100.0 1e-22 MDDATQGLTALLSWSTDFNGSAYNLAGSIAAALLGVALIFVVWALATKKENAKSYLTAWL VCAIFTLLFITNK >gi|222159200|gb|ACAB01000159.1| GENE 13 8892 - 11498 2147 868 aa, chain - ## HITS:1 COG:no KEGG:Slin_6919 NR:ns ## KEGG: Slin_6919 # Name: not_defined # Def: type IV secretory pathway VirB4 protein-like protein # Organism: S.linguale # Pathway: not_defined # 56 863 4 787 792 270 25.0 2e-70 MIGIIILISLLICALCLLIALLYIRQTEIEVKDKSVDLESVLPIQTITENAVINGNGDIT VGYRLLLPEVFTLSESEAQYIHERLEALLKMLPAGTVIHQQNFYYTGRYHHAEYSSNALI AENNRHFNGKEILNSYTNLYVTFTNGSRNGKIRKSASGTSLMRKLHYPFKQPYKEYQQRL TEMEAFLMNFENGLSSIQQFEIRKMDDTELNNAIYDYVNLSYETPENDATQKSVNPMAVS ESGSMKIGQQHVSILSLTNEGEHLQELAVPHTGKSKAYGGNIEIPDSIRSKCSMLYPVGL GLPFNHIVNIVIEITDPDATVTAIGAEKDALNYITNFYPPAAEKQREQAAFCDEITQFDY QTAYTAFNVVLNDTDRTSLMRKTALVQQGFSFMNQSSCYVENAELCNLFFCNIPGNARAN YRGFVNTTKQAICYLQKEGMYLSDEKGHIYHDRFGTPAKINLWDYPALNNKNRIVIGPSG SGKSFWLNNYILQSYELGRDMMIIDIGGSYRSMIALNRGKYFDSTEQKKFAFNPFLCDRD KNGKYLYIDTTDAESADDQIKTIVAIISYIWKVREPMLPAENAILRKSVIGFYDYVNNSS IGEKHERIFPTLITYRAYLKEVFSKRMTEFEKQKFEIEELLLLLEPYTDGELSFLLNATE NVDIVHDRLIAFDMEDASKKEYFPLVAIITLQMIVDKIKKRQGFAKELIIDEALDFLQDE KFGDFIAYLYRTFRKKEGSITLAAQNILFLKNMPSSIKDSIIINCATKIILDHSEHRQNL PEVKAVLSITDEEAYMIESLQRTERWREFFIKMSNDAFIFRNEVSDFAAVAFDSRQATVV RLKQLFNESGSTYTAINRYLEERRKKYG >gi|222159200|gb|ACAB01000159.1| GENE 14 11495 - 11806 263 103 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|198277775|ref|ZP_03210306.1| ## NR: gi|198277775|ref|ZP_03210306.1| hypothetical protein BACPLE_03998 [Bacteroides plebeius DSM 17135] # 1 103 1 103 103 155 
100.0 1e-36 MRTMRKINKPIKFFGLSSGQFAIFMLLTAIIIIVSIFKQLHPILIIGIISAILFLSGLLF QTLKKEHKAGNPDYLTGLRIKNATPRQITDKRQIFKFILKQQP >gi|222159200|gb|ACAB01000159.1| GENE 15 11819 - 12163 483 114 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|189460944|ref|ZP_03009729.1| ## NR: gi|189460944|ref|ZP_03009729.1| hypothetical protein BACCOP_01591 [Bacteroides coprocola DSM 17136] # 1 114 1 114 114 182 100.0 6e-45 MKAIENVREKANQVINRYGKVIFTFLIFFTLLGTAQVAEAQSGLKINSLSEVTDKAKEGA DTILDVAKYILAAVLGIALVFVIYSLATNNPHAKEYLLGWIIAVVVIMVAFLII >gi|222159200|gb|ACAB01000159.1| GENE 16 12182 - 12526 184 114 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|189460943|ref|ZP_03009728.1| ## NR: gi|189460943|ref|ZP_03009728.1| hypothetical protein BACCOP_01590 [Bacteroides coprocola DSM 17136] # 1 114 1 114 114 192 100.0 4e-48 MNVVDLILVAILACVTLHFYFRKKKKALSSGKEKEKPIREAVRKEGQERACPVICVVFQP ELEEEELAAEIADMYLHGRAYYYSGKPLPEQPKEEPISIKYKDVQWERLKTIHV >gi|222159200|gb|ACAB01000159.1| GENE 17 12523 - 12735 240 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|189460942|ref|ZP_03009727.1| ## NR: gi|189460942|ref|ZP_03009727.1| hypothetical protein BACCOP_01589 [Bacteroides coprocola DSM 17136] # 1 70 1 70 70 95 100.0 8e-19 MLIIILLADTNVSGLISLYREIGAMLIGVGFLCAGLAVLKKLISNHERTKEAIITYITAL VTWLIIWQLL >gi|222159200|gb|ACAB01000159.1| GENE 18 12754 - 13332 572 192 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|189460941|ref|ZP_03009726.1| ## NR: gi|189460941|ref|ZP_03009726.1| hypothetical protein BACCOP_01588 [Bacteroides coprocola DSM 17136] # 1 192 1 192 192 389 100.0 1e-107 MEHIRLTAISLSLLLCTVSAFAQRHYENIPALELNYGTNLFGNAGNYYNLSFSRYINRTS YWKAGFSYFEKPYEYTANLPQVSTLNPEETIPETIHDKGKDFFVDGGYYRTLACNLKSVY WGIGLGGFIGTEYVRHPDKEYNFIIGPKLETELEVFILPRTALLARIQQHWNPLSIDKWN TVWNVGIKILLY >gi|222159200|gb|ACAB01000159.1| GENE 19 13642 - 14235 348 197 aa, chain - ## HITS:1 COG:no KEGG:BDI_2950 NR:ns ## KEGG: BDI_2950 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 197 1 202 203 134 41.0 2e-30 
MVQTIATLIDETLMVMGAHLHVRMTDEEKAYLDAHFELIHIKKGEHLVTRGEEVCYLYYL ESGIVRLWFPDKENREISARFIQAKEFINFFLSHEEHHAHYNIKALEPCKIWRLPKKDLH QLYELSINFNKLARIHLVRSINRKIIREEEFYTLDAEARYKALLANEKWLLRSIPLKDIA SYIGITPQALSNIRKRI >gi|222159200|gb|ACAB01000159.1| GENE 20 14480 - 15610 1016 376 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|189460937|ref|ZP_03009722.1| ## NR: gi|189460937|ref|ZP_03009722.1| hypothetical protein BACCOP_01584 [Bacteroides coprocola DSM 17136] # 1 376 1 376 376 708 100.0 0 MKESNTQKELSRVPRTLSESSNTKVLEVLFDAGGTVMSDIIIYVASSQMKDLFGDCWFSI NDFCEVMGYERTKLQRKLTEKQLDFLFSNQRPVYITEQNGQKIEHPIENTFEAALYRLGT SNLSVAYAMNGKTQYKFIQILDRFEIKDNFGTKKRTKRNYNVHLSKDLMNTLLTEYNLLE LKDYRNLPNRKGYRKFYLNLAKMIYLIKYKIDQGQAPYFTVTVDQLAKEFDVTVKDNHDR KKKVTSILNGINKKLERTKFQYQYIKGKGEKWPYTVQFFFDQETLEYFDEKIKAILTSQY HDALKSCFLLNKKGIPVSRHYQYKDFFKLGTGEYYHEFTAWLYSEEDKEIKENIYRDIYI KVVGIRPEDLAVNLNP >gi|222159200|gb|ACAB01000159.1| GENE 21 16305 - 16787 306 160 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237723800|ref|ZP_04554281.1| ## NR: gi|237723800|ref|ZP_04554281.1| predicted protein [Bacteroides sp. 
D4] # 1 160 1 160 160 209 98.0 5e-53 MATNKERTSSAAMRKLLLQSASKKEQEIQQIGSVAESTIPQPAIEQPIHEEVKESAPVTE TVQEPQPEQMEKPESKPSVQSSPKTEESKVCGKKFSEYLEERKLKNTEVIRISSETHRKL KQIAMATGLGMHNIANNILEDVLTSHNKEIQAILKKYMSI >gi|222159200|gb|ACAB01000159.1| GENE 22 16792 - 17547 610 251 aa, chain - ## HITS:1 COG:no KEGG:BF0129 NR:ns ## KEGG: BF0129 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 19 158 93 246 349 79 33.0 2e-13 MQLFNLLFIYLRTYLKTYIMGQILPGVVVAFENPKGGVGKSTLTALFAGYIHSQSNEEGG LSIAVVDIDDMQNTIGKLREDEAEDGIMKKEEEYEVINISSSEFINQLDFLQDNYDIILV DFPGNLKQNGVVETLHFVDVIIIPFEPNQTDLRPTLTFYYNIYEGIIESRRKIGKKTTMR GVMNRVLPNVLEFKEILKNKKTLPFELMQNYIKDSRVDYQRNLSTLSKAYHHPCDAFCEE VLELICNHIGE >gi|222159200|gb|ACAB01000159.1| GENE 23 17690 - 17776 57 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYFVTNGFSFACYNVRSCFVTKCAVNQD >gi|222159200|gb|ACAB01000159.1| GENE 24 17924 - 18232 141 102 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713014|ref|ZP_04543495.1| ## NR: gi|237713014|ref|ZP_04543495.1| predicted protein [Bacteroides sp. 
D1] # 1 102 30 131 131 169 100.0 6e-41 MKENRDISIKIRLTEAEKEQLLQRSKEEGKSLSSFIRESALKGRSMSKTDVQMIYELRKI GANINQLAKHINTLPSDENILYSLDRINQYLSDVETINRKLL >gi|222159200|gb|ACAB01000159.1| GENE 25 18229 - 19629 1092 466 aa, chain + ## HITS:1 COG:no KEGG:PGN_0075 NR:ns ## KEGG: PGN_0075 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 14 404 10 353 411 87 26.0 1e-15 MIAEIQPSFSSHLSMSRKIEYQLSKVKEGNGKVLCTSTGDELAAMPAYMELISQLNDRVK LPYAEFILSLYPGESLSDEQWLSLAGEYIERMGYGKSCYAVVLNTDKAHSHVHVLLTTID EEGKSIPSGNNYSRSEKISRELEQKYGLLPLEREGGKRTTLGEAQYRNYYFDAALKKAMR SYNYKDKVSAVLEQSDTYWSLDKPLQEIKLANEEWRVLLGDESYDNLFALLEKGGFFNPL FKDELLQQLDRIYSFSESTSDFRRNLEQEGLYMRLVTKKDKSYYVYGIKDSGFYLKDVSL PQKYRFGNIRFDGHGMSADEQKHYLYDHVFKALNASSGYEDFKKNLAEESIRVTEHVNGK GAYGISLYMENVENAHLFKGADLSRKLTYQNIQKLFDGVAVGLSSHINRVGEFRERVDRE AFYMQGRGVTAIPDLDITGSGKKSQEDDLIPSKKKRKRKSGPDFSL >gi|222159200|gb|ACAB01000159.1| GENE 26 19658 - 20641 521 327 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713016|ref|ZP_04543497.1| ## NR: gi|237713016|ref|ZP_04543497.1| predicted protein [Bacteroides sp. 
D1] # 1 327 1 327 327 682 100.0 0 MNYQTVLQNYLPVEQGDFMLKYEIDDRGYAIYSPEKGSFSCIELHGFSELTPWQLAFLLS LDMQQMKEQDEFSLSVCCKREKLLSYLFDVEESETTLKTKHVSGWQGYLMMDIHKPDRVR NVFQFHPETKKARLVFDNRLCVASLREKEKGKIIHLCWSPSLFAAIDRGGERTAPAYLLA SNAALLHGYAMKQIAECFAGTPAEERVIGIHVGDNVYEALSFVCYYARNVQDEYLVIPER KDGMMILETPKWNPIRQANFVASLNKMAVDQAKKRYPEMEVPNERPFTCLSFSRKSFVYF PDLKVYQEVFLKMYLGLVRLQEVHLLG >gi|222159200|gb|ACAB01000159.1| GENE 27 21450 - 23039 1149 529 aa, chain - ## HITS:1 COG:CC2770 KEGG:ns NR:ns ## COG: CC2770 COG3550 # Protein_GI_number: 16127002 # Func_class: R General function prediction only # Function: Uncharacterized protein related to capsule biosynthesis enzymes # Organism: Caulobacter vibrioides # 5 431 4 431 435 379 45.0 1e-105 MNDLIVDVKLWGESVGSLYWEKESNAALFDYERKFIRSGLDISPIIMPISQYRNTPYRFL ENRTDCFKGLPGLFADSLPDTFGNQIINEWFASKGLSGEEITPLDRLCYVGKRGMGALEF EPSSPINGMNESSVLHIEELTELAKSVFTDRMAFQVQLRQEGRNILDILKVGTSAGGAKP KAIIAYNDITGEVRSGQVKAPEGFGYWLLKFDGGKYSEHTQITDNPQGIGNIEYAYHRMA KACGIDMMECRLLQEKESYHFMTRRFDRMKDGEKIHVQTLAGLAHYDRDQRHSYEEIFRI MRQMNLPYPDQEELYRRMVFNVMSRNHDDHSKNFSFLMDRQGKWKLAPAYDLCYSYTPGG KWTNRHQLSLNGKQDNFTMEDLQKVGENMGIREHKQIIEKVQETVSYWHETAKDCGVKPE HADFIGKNLLLFGKQLHTIQMPDIANEQEQAFMKAMRNDDFNAILELKMRGYQPSENVLK SLQPDISATNFIAAAKIFQMEGMLKSLQDIKPVQSPIIGGNKRSMELGD >gi|222159200|gb|ACAB01000159.1| GENE 28 23032 - 23346 306 104 aa, chain - ## HITS:1 COG:CC2771 KEGG:ns NR:ns ## COG: CC2771 COG1396 # Protein_GI_number: 16127003 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Caulobacter vibrioides # 5 99 7 101 119 67 41.0 8e-12 MWNEMSNPAVLMRIGQRIKETRIRQHITQEELATASGVSPLTVANIEKGKSVSLLLFISV LRSLGLLENLEQLVPEIRVSPIELKKLQGKKRYRVRHLKQDNHE >gi|222159200|gb|ACAB01000159.1| GENE 29 23798 - 24365 215 189 aa, chain + ## HITS:1 COG:FN1101 KEGG:ns NR:ns ## COG: FN1101 COG1373 # Protein_GI_number: 19704436 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 1 158 23 180 470 173 
54.0 2e-43 MKRFVLQQLIEWKNREDRKPLILNGARQVGKTWLLHEFAKLEYKKEAYVVCRKNNLARQL FSQDFNVDRILRGLRAMTSVDITPGDTLIILDEIQDIPEALESLKYFKEEVPEYHIAVAG SLLGISLHQDVSYPVGKVNVINIFPMNFEEFLVAKGEEEACKLLMSGDFETISLLHDKYT DLLRQYYYV Prediction of potential genes in microbial genomes Time: Wed May 18 04:41:11 2011 Seq name: gi|222159199|gb|ACAB01000160.1| Bacteroides sp. D1 cont1.160, whole genome shotgun sequence Length of sequence - 13460 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 11, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 5 - 712 387 ## COG1373 Predicted ATPase (AAA+ superfamily) + Term 902 - 944 6.4 2 2 Tu 1 . - CDS 1018 - 2076 327 ## gi|256842633|ref|ZP_05548134.1| conserved hypothetical protein - Prom 2096 - 2155 6.5 + Prom 2073 - 2132 17.8 3 3 Tu 1 . + CDS 2172 - 2378 76 ## + Term 2612 - 2645 0.4 4 4 Op 1 . + CDS 2762 - 3604 463 ## BDI_0727 hypothetical protein 5 4 Op 2 . + CDS 3643 - 4521 569 ## BT_p548228 hypothetical protein + Prom 4524 - 4583 1.6 6 4 Op 3 . + CDS 4603 - 5748 832 ## BT_p548229 hypothetical protein + Term 5772 - 5830 3.2 - Term 5719 - 5760 1.2 7 5 Op 1 . - CDS 5937 - 6194 224 ## BT_p548230 hypothetical protein 8 5 Op 2 . - CDS 6199 - 6957 728 ## COG1192 ATPases involved in chromosome partitioning - Prom 7190 - 7249 11.8 + Prom 7167 - 7226 8.5 9 6 Op 1 . + CDS 7246 - 7575 463 ## BT_p548232 hypothetical protein 10 6 Op 2 . + CDS 7579 - 8415 405 ## COG3943 Virulence protein + Term 8455 - 8485 1.9 + Prom 8539 - 8598 6.9 11 7 Tu 1 . + CDS 8657 - 9295 216 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Prom 9344 - 9403 3.8 12 8 Op 1 . + CDS 9423 - 10931 623 ## COG3177 Uncharacterized conserved protein 13 8 Op 2 . + CDS 10954 - 11352 257 ## BT_p548236 hypothetical protein + Term 11476 - 11524 -0.7 14 9 Tu 1 . - CDS 11826 - 12083 76 ## BT_p548237 hypothetical protein + Prom 11936 - 11995 1.9 15 10 Tu 1 . 
+ CDS 12151 - 12357 132 ## gi|262409508|ref|ZP_06086049.1| conserved hypothetical protein + Prom 12408 - 12467 4.9 16 11 Tu 1 . + CDS 12595 - 13459 447 ## COG4227 Antirestriction protein Predicted protein(s) >gi|222159199|gb|ACAB01000160.1| GENE 1 5 - 712 387 235 aa, chain + ## HITS:1 COG:FN1101 KEGG:ns NR:ns ## COG: FN1101 COG1373 # Protein_GI_number: 19704436 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 1 235 216 450 470 189 41.0 4e-48 MPEVVLKYVETDSLLEVRRIQSEILQGYDLDFSKHAPKEQVPRVRMVWNSIPSQLFKENK KFIYGALRKGARANDFEMAIQWLVNAGLLYKVPRCTKPELPLDIYEDLSAFKLYMVDLGL MGAMVKTDPAQVLIKNDIFKEYKGGMTEQYVLQQMKSKGVSPIYYHNTDNSRLELDFVIQ RNAQMVPIEVKAEGNVRANSLTALLGKRPELHAERFSMLPYKVQGNLTNFPLYAI >gi|222159199|gb|ACAB01000160.1| GENE 2 1018 - 2076 327 352 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|256842633|ref|ZP_05548134.1| ## NR: gi|256842633|ref|ZP_05548134.1| conserved hypothetical protein [Parabacteroides sp. D13] # 1 352 1 352 352 546 99.0 1e-154 MKFSTKAATFLSSIKTQTYDKKESEMIITYQQKRVFHLSLLMLALCAPIYIYSVPFPNEQ FYYINSVLFLFIIMCTLAYFKKRVNLTTTFSIILIAIHIEIFIEIIYCSICSGYEYSYQR ALIMSNITISLLFTMLSICAYMSNISILLSSLIIASYTICTLITDEPFLYSYLPLVIIIY TMIPLLGRSLHSNISSLLKSSNLLKEEEEMLLKRLQMKREELFAFAELLSENNPEEKTNS LLDIIGEQSKENLFTALAAHQKKEKSKLDTIRRIYPYLSPSELNICRLILQDKTVSQICE LLHRSSGNITSQRANIRAKLGLKKSDNLKEALQERMRLYEEEHRQEDFSAMR >gi|222159199|gb|ACAB01000160.1| GENE 3 2172 - 2378 76 68 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MMLSAFNCCVSVIGCCGKCCSCCVVVFLSIYKFKMLVFLLWRLVYKYISLRSKFKGCKSL GLDCPINK >gi|222159199|gb|ACAB01000160.1| GENE 4 2762 - 3604 463 280 aa, chain + ## HITS:1 COG:no KEGG:BDI_0727 NR:ns ## KEGG: BDI_0727 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 280 133 422 422 397 70.0 1e-109 MVRADVNVPEREDVLRLLEKIVRLENQDSPYLGGELKRLKGGRPYSYLYKFHFPKLRSSM VKICYDSDPINPVRDTVYIHTRDTLCIRDTVTVIAPVKKRPFCMAVKTNLLYDAVLIPDI 
GVEFCLGKNWSVAGNWMYAWWKSDRKHNYWRIYGGDVELRRWFGRRAVEKPFSGHHVGLY GQIVTYDFELGGKGYLGDKWSYGGGVAYGYSLPVGHRFNVDFTLGIGYLGGLYKEYIPLD GHYVWQTTKKRRWFGPTKAGISLVWLIGRGNYSRKKGGRQ >gi|222159199|gb|ACAB01000160.1| GENE 5 3643 - 4521 569 292 aa, chain + ## HITS:1 COG:no KEGG:BT_p548228 NR:ns ## KEGG: BT_p548228 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 276 31 306 322 562 96.0 1e-159 MGLTACEHKDLCYDHPHFATVRVVFDWTKISNHDKPEGMRVVFYPTDDESNTWIFDFPGG EDGEVELPENDYRVICFNYDTDGMVWKENGSYTLFTADTRDVQSPDNRTMAVTPPWLCGD HIDGVILKDIPGGSAEIVRLTPVNMVCHYTYEVNGLRGLDRVADLRAALSGMSGSLNMSG DSLPADLSESLLFDGMVSRNQIIGGFYTFGHSALEGEPNVFRLYLKNRSGSMSVLEQDVS DQVHDVPVAGHIGDVHLVLNFDYEVPSEPGNGGPGFDVDVDDWDDVNVDIVL >gi|222159199|gb|ACAB01000160.1| GENE 6 4603 - 5748 832 381 aa, chain + ## HITS:1 COG:no KEGG:BT_p548229 NR:ns ## KEGG: BT_p548229 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 381 1 373 373 641 88.0 0 MYQFDLFLFMKKSTVMLWAIFGALLMGCSDEEIANVETSSRNAIGFNVLSNAAETRATPT TPSNLTSTDFDVFAFTADGTAFMGKVDTDFEHDGVKIVYKDGKWDYDDANDLRYWPSEAL DFYAFNPGTVSDDMLTFYMWEASGTVQKISYTCIDEYGAGTHANYDVMYAMAKDQTKDMN NGVVKFNFKHILSQVVFKAKTQYDNMQVDIDMIKIHNFKFAGAFTLPATAEETGSWSSSD LAFPHAFTVVKNANITVNSNTAATDISTNTPMLNIPQTLTAWTVSAPNKSKLEADNAKQC YLEISCKIRQSGAYLLGSASEYKTIYVPFGDTWVAGKRHIYTLIFGGGYDDQGEAVLNPI RFDAETTGWVDADNKDVNVQP >gi|222159199|gb|ACAB01000160.1| GENE 7 5937 - 6194 224 85 aa, chain - ## HITS:1 COG:no KEGG:BT_p548230 NR:ns ## KEGG: BT_p548230 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 85 1 85 85 138 100.0 5e-32 MAKKNDLKNSMSAGLTGGLDSLIQSTAGQKEAQKPKKAKTVHCNFVMDETYHQNLKLIAI RKGDSLKSVLQEAISDYLDKNSSLL >gi|222159199|gb|ACAB01000160.1| GENE 8 6199 - 6957 728 252 aa, chain - ## HITS:1 COG:Rv1708 KEGG:ns NR:ns ## COG: Rv1708 COG1192 # Protein_GI_number: 15608846 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in 
chromosome partitioning # Organism: Mycobacterium tuberculosis H37Rv # 4 248 64 313 318 201 44.0 1e-51 MSKAKVISVLNHKGGVGKTTTTINLGGALRQKGYKVLLIDLDGQANLTESLGFSAELPQT IYGAMKGEYDLPIYEHKDGLSVVPSCLDLSAVETELINEAGRELILAHLIKGQKEKFDYI LIDCPPSLSLLTLNALTASDRLIIPVQAQFLAMRGMAKLMQVVHKVQQRLNSDLSIAGVL ITQYDGRKNLNKSVSELVQETFQGKVFSTHIRNAITLAEAPTQGQDIFHYAPKSAGAEDY EKVCNELLTEIK >gi|222159199|gb|ACAB01000160.1| GENE 9 7246 - 7575 463 109 aa, chain + ## HITS:1 COG:no KEGG:BT_p548232 NR:ns ## KEGG: BT_p548232 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 109 1 109 109 218 100.0 5e-56 MSTYSYKNPKFINSPKGVVEVVEVIYDGKDDPAYSLAIIKWENTYKLGIRWNIAYSEWDD YRKQNGQDECIGNPQSRGIPTWFVLPDDMMFGEKFSGAMQRLDELRKGK >gi|222159199|gb|ACAB01000160.1| GENE 10 7579 - 8415 405 278 aa, chain + ## HITS:1 COG:STM3755 KEGG:ns NR:ns ## COG: STM3755 COG3943 # Protein_GI_number: 16767039 # Func_class: R General function prediction only # Function: Virulence protein # Organism: Salmonella typhimurium LT2 # 4 123 12 132 345 115 43.0 1e-25 MEQGEIILYQPDEAVKLEVRLEDETVWLTQAQIVDLFQSSKANISEHIKNIYDQKELEES STVRDFRTVRQEGKRQVMRNLTYYNLDAIISIGFRVNSKRGILFRQWANKVLKDYLLKGY SINKRLSELERTVAQHTEKIDFFVRTALPPVEGIFYNGQIFDAYKFATDLVKSARRSIVL IDNYVDETVLLMLSKRSVGVSATIYTQRITQQLQLDLDRHNSQYPPIDIRTYRDSHDRFL IVDETDVYHIGASLKDLGKKMFAFSKLDIPAAVITDLL >gi|222159199|gb|ACAB01000160.1| GENE 11 8657 - 9295 216 212 aa, chain + ## HITS:1 COG:TVN1478 KEGG:ns NR:ns ## COG: TVN1478 COG1961 # Protein_GI_number: 13542309 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Thermoplasma volcanium # 6 193 7 184 201 58 30.0 7e-09 MNVVIYSRVSSQSARQSTERQVVDLERFAAGRGYEVTAVFEEKISGRKANIERPVLSRCL EYCTDPQNRVDMLLLTEISRLGRSTLEILKSLDTLHTHKICVYIQNLNLETLRPDKTVNP LSSLITTLLGELAAIERQGIIDRLNSGRELYIQKGGRLGRKPGSRKTAEQRKEEYREAIA LLKKGYSIRNVAKLTGKAVSTIQQVKKDFISS >gi|222159199|gb|ACAB01000160.1| GENE 12 9423 - 10931 623 502 aa, chain + ## HITS:1 COG:aq_aa38 
KEGG:ns NR:ns ## COG: aq_aa38 COG3177 # Protein_GI_number: 10957070 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Aquifex aeolicus # 286 490 15 198 320 62 28.0 2e-09 MATMAEKMAASLEELRKLQEKDRCVVLQGTAEIGRTHLTRLLDNGWLQEVMKGWYIAARP GTEGDTTVWYTSFWYFIAKYAAVRLGEQWCLTADQSLDLYSGKTTVPVQVIIKSPKGHNN TQKLMYDTSLLVFQSEIPDQVYKEPEYGLNLYPLAEALVYATPRYFQVERIAACTCLAMI RDAADILKVLARNGASLRAGRIAGAFRNIGNSEIADSIVSTMRGFGYDVREEDPFEDQPR TPLVYEVSPYVTRLRLMWENMRDKVVELFPEAPGKIDDVEGYLRSVDEKYSEDAYHSLSI EGYRVSPELIEKVRVGNWKPEEEDKEHKNALVARGYYQAFQAVRGTISDILKGKNAGEAV RADHPLWYMQMWMPFVTVGILQREDLVGYRTGQVYIRGSQHIPLNPKAVRDAIPVLFDLL KNEPHPAVRAVLGHFFFVYIHPYMDGNGRMGRFVLNAMLASGGYNWTVVPVERRKEYMKA LEKASVEGDISEFAKVIASLVK >gi|222159199|gb|ACAB01000160.1| GENE 13 10954 - 11352 257 132 aa, chain + ## HITS:1 COG:no KEGG:BT_p548236 NR:ns ## KEGG: BT_p548236 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 132 1 132 132 259 100.0 1e-68 MKSTEYIEWDKLEQIPFCLCRIAEDEENQEIDVYYLDKRVCHDYDHVGHYFRTAIIMFRR IRNITADWVNLKNLWLLRDCIRENFNHGLEVDDLIFGETFDGEDPETIKPLTKERLFKIK KVIQEKDPYATV >gi|222159199|gb|ACAB01000160.1| GENE 14 11826 - 12083 76 85 aa, chain - ## HITS:1 COG:no KEGG:BT_p548237 NR:ns ## KEGG: BT_p548237 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 54 1 54 127 97 94.0 2e-19 MRLNLGYGKSRLTFADKTPGQGKRKAATPEKLKYLSPLKGGLKPGSTIGYRSSRQGMFLA PPTSACAHQPGKDHSGYSFPDCADQ >gi|222159199|gb|ACAB01000160.1| GENE 15 12151 - 12357 132 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262409508|ref|ZP_06086049.1| ## NR: gi|262409508|ref|ZP_06086049.1| conserved hypothetical protein [Bacteroides sp. 
2_1_22] # 1 68 1 68 68 117 100.0 2e-25 MEKNVWGVLIQKPLGTQGKRKTTTPTAPKNQNEVVLIQKHRNLRHPQGLVGEGARWGVYY RKKSGSKY >gi|222159199|gb|ACAB01000160.1| GENE 16 12595 - 13459 447 288 aa, chain + ## HITS:1 COG:XF2061_1 KEGG:ns NR:ns ## COG: XF2061_1 COG4227 # Protein_GI_number: 15838653 # Func_class: L Replication, recombination and repair # Function: Antirestriction protein # Organism: Xylella fastidiosa 9a5c # 1 269 227 482 522 126 33.0 4e-29 MIEKIQQVEDNWQKPWITIAANTRNFFPQNLTGRRYTGGNAFLLLFLCEKFQYQTPVFMT FNQAKEAGISVLKGSKSFPVYYFLFYVYHKETRKKITFEEYKALSREQQQEYNVIPTYKY YSVFNLDQTNFSDVRPEEWEALREKFRGGQAEQPEIDAMLEAKSWFCPIREQQGDRAFYS PLADYIVVPLRSQFVDMQSFYETLLHEMGHSTGHPTRLNRDLAHPFGSEEYGKEELTAEF AAALAGMFFGIAEHIRTENAAYLKSWIYVGCSQNHPYCGEWLPNQVFR Prediction of potential genes in microbial genomes Time: Wed May 18 04:42:07 2011 Seq name: gi|222159198|gb|ACAB01000161.1| Bacteroides sp. D1 cont1.161, whole genome shotgun sequence Length of sequence - 55823 bp Number of predicted genes - 39, with homology - 39 Number of transcription units - 24, operones - 12 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 6 - 65 3.0 1 1 Tu 1 . + CDS 91 - 303 382 ## BT_1667 hypothetical protein - Term 275 - 314 2.7 2 2 Tu 1 . - CDS 324 - 689 137 ## gi|237712892|ref|ZP_04543373.1| predicted protein - TRNA 992 - 1065 66.4 # Ala GGC 0 0 + Prom 1084 - 1143 4.9 3 3 Tu 1 . + CDS 1183 - 1485 289 ## BT_1665 hypothetical protein 4 4 Op 1 . + CDS 1590 - 2048 388 ## COG0817 Holliday junction resolvasome, endonuclease subunit 5 4 Op 2 . + CDS 2105 - 4111 1371 ## COG1523 Type II secretory pathway, pullulanase PulA and related glycosidases + Term 4204 - 4237 4.5 + Prom 4351 - 4410 5.6 6 5 Op 1 . + CDS 4558 - 6204 1192 ## BVU_0028 sialic acid-specific 9-O-acetylesterase 7 5 Op 2 . + CDS 6226 - 7710 701 ## PROTEIN SUPPORTED gi|90020673|ref|YP_526500.1| ribosomal protein L9 8 5 Op 3 . + CDS 7740 - 8876 1167 ## COG3693 Beta-1,4-xylanase + Prom 9006 - 9065 6.4 9 6 Tu 1 . 
+ CDS 9090 - 10067 1014 ## BVU_0040 beta-xylosidase/alpha-L-arabinofuranosidase + Term 10311 - 10356 1.1 + Prom 10131 - 10190 3.2 10 7 Tu 1 . + CDS 10385 - 12412 1802 ## PROTEIN SUPPORTED gi|90020672|ref|YP_526499.1| ribosomal protein S18 + Prom 12450 - 12509 6.6 11 8 Tu 1 . + CDS 12543 - 13997 203 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 12 9 Tu 1 . - CDS 14046 - 15911 1395 ## COG0642 Signal transduction histidine kinase + Prom 15907 - 15966 7.3 13 10 Op 1 . + CDS 16213 - 16959 793 ## COG0588 Phosphoglycerate mutase 1 14 10 Op 2 . + CDS 16987 - 18039 1064 ## COG1830 DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes 15 10 Op 3 . + CDS 18107 - 18763 818 ## COG0176 Transaldolase 16 11 Tu 1 . - CDS 18917 - 22867 2423 ## COG0642 Signal transduction histidine kinase - Prom 22892 - 22951 6.3 17 12 Op 1 . + CDS 22951 - 24411 1269 ## COG1660 Predicted P-loop-containing kinase 18 12 Op 2 . + CDS 24447 - 25196 592 ## COG1208 Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) + Term 25204 - 25239 6.0 + Prom 25205 - 25264 2.3 19 13 Op 1 . + CDS 25305 - 25517 252 ## gi|298482966|ref|ZP_07001148.1| two-component system sensor histidine kinase 20 13 Op 2 . + CDS 25545 - 27287 1248 ## COG0642 Signal transduction histidine kinase + Term 27385 - 27427 3.0 + Prom 27752 - 27811 4.9 21 14 Tu 1 . + CDS 27947 - 28996 875 ## BVU_2957 hypothetical protein + Term 29046 - 29095 7.5 + Prom 29069 - 29128 7.3 22 15 Op 1 . + CDS 29283 - 32498 2577 ## BVU_0540 hypothetical protein 23 15 Op 2 . + CDS 32509 - 34335 1257 ## BVU_0539 hypothetical protein + Term 34364 - 34415 6.6 + Prom 34386 - 34445 5.5 24 16 Op 1 . + CDS 34607 - 36190 1409 ## COG3119 Arylsulfatase A and related enzymes 25 16 Op 2 1/0.200 + CDS 36209 - 38533 1790 ## COG3525 N-acetyl-beta-hexosaminidase 26 16 Op 3 . 
+ CDS 38610 - 41693 2946 ## COG3250 Beta-galactosidase/beta-glucuronidase + Prom 41759 - 41818 5.3 27 17 Tu 1 . + CDS 41838 - 43655 1560 ## COG3669 Alpha-L-fucosidase + Term 43698 - 43754 12.9 + Prom 43696 - 43755 1.5 28 18 Op 1 . + CDS 43796 - 44971 934 ## COG0668 Small-conductance mechanosensitive channel + Prom 44983 - 45042 6.0 29 18 Op 2 . + CDS 45068 - 46678 1207 ## BVU_1280 hypothetical protein + Term 46903 - 46943 1.1 30 19 Op 1 . - CDS 46757 - 47974 769 ## BT_1616 hypothetical protein 31 19 Op 2 . - CDS 47955 - 49415 1426 ## COG2195 Di- and tripeptidases - Prom 49495 - 49554 2.0 + Prom 49382 - 49441 9.4 32 20 Op 1 . + CDS 49549 - 49782 220 ## BT_1614 hypothetical protein 33 20 Op 2 . + CDS 49819 - 50412 583 ## BT_1613 hypothetical protein + Term 50487 - 50528 0.4 34 21 Op 1 . - CDS 50522 - 50863 482 ## BT_1612 hypothetical protein 35 21 Op 2 . - CDS 50867 - 51163 349 ## BF3072 putative septum formation initiator-related protein - Prom 51194 - 51253 3.9 + Prom 51129 - 51188 3.5 36 22 Tu 1 . + CDS 51398 - 53263 1461 ## COG2812 DNA polymerase III, gamma/tau subunits - Term 53155 - 53181 -1.0 37 23 Tu 1 . - CDS 53356 - 53589 242 ## COG1983 Putative stress-responsive transcriptional regulator - Prom 53768 - 53827 10.2 + Prom 53746 - 53805 7.6 38 24 Op 1 . + CDS 53895 - 54428 93 ## BT_1545 hypothetical protein 39 24 Op 2 . + CDS 54494 - 55675 526 ## COG3274 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|222159198|gb|ACAB01000161.1| GENE 1 91 - 303 382 70 aa, chain + ## HITS:1 COG:no KEGG:BT_1667 NR:ns ## KEGG: BT_1667 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 70 1 70 70 80 90.0 1e-14 MDDMINRHDSIAEENIEPNGRPAKDQFEEWSGEVADRADDVFKNDKKDGPIKDREKRIKE MDEVIKKDLE >gi|222159198|gb|ACAB01000161.1| GENE 2 324 - 689 137 121 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237712892|ref|ZP_04543373.1| ## NR: gi|237712892|ref|ZP_04543373.1| predicted protein [Bacteroides sp. 
D1] # 1 121 1 121 121 209 100.0 6e-53 MRSLLEKELENSPDFLSDIAVKFASMDKVKSTDNKGYKYLISFTCSSLQKTGKYNISFRI ITALDEEEASNLIDNQKYYIQGKFISLSEKESINIRLDVFDDKTIEIGSIFIKEPIVTPA N >gi|222159198|gb|ACAB01000161.1| GENE 3 1183 - 1485 289 100 aa, chain + ## HITS:1 COG:no KEGG:BT_1665 NR:ns ## KEGG: BT_1665 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 100 1 100 100 190 89.0 2e-47 MLIYNTTFQVDDDVHDNFMIWIKESYIPEVQKHGALKAPRICRILSHREEGSAYSLQWEV ESSGLLHRWHLEQGVRLNDELVKIFKDKVIGFPTLMEVIE >gi|222159198|gb|ACAB01000161.1| GENE 4 1590 - 2048 388 152 aa, chain + ## HITS:1 COG:VC1847 KEGG:ns NR:ns ## COG: VC1847 COG0817 # Protein_GI_number: 15641849 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, endonuclease subunit # Organism: Vibrio cholerae # 14 124 43 150 173 87 42.0 7e-18 MGIIDLRKFANHYLKLRHIHERVLSIIESYLPDELAIEAPFFGKNVQSMLKLGRAQGVAM AVALSRDIPITEYAPLKIKMAITGNGQASKEQVADMLQRMLRFSKDDMPTFMDATDGLAA AYCHFLQMGRPAMEKGYSSWKDFIAKNPDKVK >gi|222159198|gb|ACAB01000161.1| GENE 5 2105 - 4111 1371 668 aa, chain + ## HITS:1 COG:TM1845 KEGG:ns NR:ns ## COG: TM1845 COG1523 # Protein_GI_number: 15644588 # Func_class: G Carbohydrate transport and metabolism # Function: Type II secretory pathway, pullulanase PulA and related glycosidases # Organism: Thermotoga maritima # 47 668 229 842 843 510 43.0 1e-144 MKIGNNYLAILGVTTVTTVMSCSPTKKEYTSFELYPVRTGSLTEMEYTPLATKFFLWSPT AEEVRLMLYDAGEGGHAYETVKMELGENGTWTTSVDKDLLGKYYAFNVKINDKWQGDTPG INAHAVGVNGKRAAIIDWKTTNPEGWESDRRPSLKSPADMIIYEMHHRDFSVDSTSGIKN KGKYLALTEHGTVNSGNQPTGIDHLVELGVTHVHLLPSSDYASIDETKLDENHYNWGYDP ANYNVPDGSYSTDPYQPAVRVKEFKQMVQALHRAGIRVIMDVVYNHTFNTLESNFERTVP GYFYRQKEDGTLANGSGCGNETASERPMMRKFMIESVLYWIKEYHIDGFRFDLMGVHDIE TMNEIRKAVNKVDPTICIYGEGWAAEAPQYPADSLAMKGNISHMPGIAVFSDELRDGLCG PVWEKDKGAFLAGVPGGEMSVKFGIVGAIKHPQVRCDSVNYSQKPWAEQPTQMVSYVSCH DGLCLVDRLKASMPGATPEQLVRLDKLAQTVVLTSQGIPFIHAGEEVMRDKQGIDNSYKS PDAINAIDWRRKTTNGDIFTYYKRLIDLRKSHPAFRMGNAEMVRKHLEFLPVEGQNLIAF 
RLKEHANGDSWEDIIVAFNSRTTPARLEIPVGKYTVVCKDGVIDVRGLGTQTGPEVIVPG QSALIMYK >gi|222159198|gb|ACAB01000161.1| GENE 6 4558 - 6204 1192 548 aa, chain + ## HITS:1 COG:no KEGG:BVU_0028 NR:ns ## KEGG: BVU_0028 # Name: not_defined # Def: sialic acid-specific 9-O-acetylesterase # Organism: B.vulgatus # Pathway: not_defined # 7 542 3 531 645 564 50.0 1e-159 MNKYWFYKVGLVVVFLCFALLGEAKVKLPTLVSDGMVLQRGEPVNIWGTADPDETVSITF QKKKYKTVADAQGNWKVILPALKAGGPYTMIINDIELKDILVGDVWVCSGQSNMELPVSR VTDRFRDEISADSDYPMVRYIKTPLLYNFHAPQTDIPGIFWKAMTPENVMSFSALVYFFT KDYFQKTKVPVGIINSSVGGSPVEAWISEEGLKPFPYYLNEKRIYESDDLVESMKKEESK KSRAWNVALYQGDKGMHETIPWYAAGYDDSDWTPTDLFASGWATNGLNTINGSHWFRKDF QVSGQQAGEKATLRLGCIVDADSVYVNGTFVGTVSYQYPPRIYTIPAGLLKAGKNTITIR LFSYGGFPHFVKEKPYKIFFGKGQPEKGESEISLEGNWKYRLGAPMPAAPGQTAFHYKPV GLYNAMIAPLLNYTVSGIIWYQGESNVSRRNEYKDLLTAMIADWRQHWNRPDMPFYVIEL ADFLSPEDKGGRAAWAEFRKVQAEVANTNKNVTLIKNGDLGEWNDIHPLDKKTLGQRVSQ AVFQQRVK >gi|222159198|gb|ACAB01000161.1| GENE 7 6226 - 7710 701 494 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020673|ref|YP_526500.1| ribosomal protein L9 [Saccharophagus degradans 2-40] # 6 490 7 521 522 274 33 7e-73 MKTQTQKVSMAEKIGYSLGDGSANLIFQMMMMFQLFFYTDVFGIKATAAGMILLVARIFD AFVDPVVGILSDRTNTRWGKYRPWLLWTAIPFAVFFILAFTTPDLSERGKIIYAGITYTL LMSIYSFNNTPYASLGGVMTSDIKERTSISSVRFVTATIATFVVQGLTLPLVSKFGQGDD QRGWFLTITLFAIIGVVLLVITFFSAKERITPPVGQKTSVKQDFKDIVSSRPWKAMFILT LFLFTTLAMWGSSMSYYFNYFVDKTALFDFLQNFGLVRIEGETYGMWHTFLDAFGLIAQP DHSNVFAVGFSLFNMIGQVITLAGVILLSGFLSNIFGKRNVFLICLALTAFFTALFFVVD STNISMIFIINCLKSLAYAPTIPLLWAMMGDVADHSEWVNHRRATGFVFAGVVFALKAGL GIGGAICGAIVDSFGFVSNTVQTESAIFGIRLTSSVIPAITFFVGVIALFFYPISKKLNE HIQDDLAKRRLENN >gi|222159198|gb|ACAB01000161.1| GENE 8 7740 - 8876 1167 378 aa, chain + ## HITS:1 COG:TM0061_2 KEGG:ns NR:ns ## COG: TM0061_2 COG3693 # Protein_GI_number: 15642836 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-1,4-xylanase # Organism: Thermotoga maritima # 33 365 2 315 691 231 38.0 1e-60 MKTSTRTTVVSLLAAVALITSTSIAFAQETKTLKEALKDKFLIGTAVNTRQASGQDKAGV 
RVIQEQFNAIVAENCMKSQEMHPKENRYNFTQADEFVAFGEKNHLAITGHTLIWHSQLSP WFCVDENGKNVSPEVLKKRMKDHITTIVKRYKGRIKGWDVVNEAIEDNGAYRKTKFYEIL GEEYIPLAFQYAHEADPDAELYYNDYSMAQPGRRAAVVKIVKDLKKRGIRIDAVGMQGHI GMDYPKISEFEESMLAFAGAGVKVMITELDLTVLPSPDPKVGAEVSASFEYKKEMNPYSD GLPEEVSKAWTERMNDFFRLFLKHQDIITRVTVWGVADQDSWRNDWPMRGRTDYPLLFDR NHQPKPVVDLIIKEAMQK >gi|222159198|gb|ACAB01000161.1| GENE 9 9090 - 10067 1014 325 aa, chain + ## HITS:1 COG:no KEGG:BVU_0040 NR:ns ## KEGG: BVU_0040 # Name: not_defined # Def: beta-xylosidase/alpha-L-arabinofuranosidase # Organism: B.vulgatus # Pathway: not_defined # 1 323 1 323 323 560 81.0 1e-158 MKTEKRYLVPGDYMADPAVHVFDGKLYIYPSHDWESGIAENDNGDHFNMKDYHVYSMDDV MNGEITDHGVVLSTGDIPWAGRQLWDCDVAFKDGKYYMYFPLKDQNDIFRIGVAVSDKPY GPFVPEANPMKGSYSIDPAVWNDGDGNYYMYFGGLWGGQLQRYRNNKALESAILPEGKEE ALPSRVVRLSDDMMEFAEEPRPLVILDENGKPLTAGDTERRFFEASWVHKYNGKYYFSYS TGDTHLLCYAIGDNPYGPFTYQGVILTPVVGWTTHHAIVEFKGKWYLFHHDCVPSEGKTW LRSLKVCELQYDVDGKIITIEGLDE >gi|222159198|gb|ACAB01000161.1| GENE 10 10385 - 12412 1802 675 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020672|ref|YP_526499.1| ribosomal protein S18 [Saccharophagus degradans 2-40] # 36 654 140 739 754 698 53 0.0 MNKKQSPTLNIAVSELRNFWQGGIPITLEIQKNKELRALGNDGYIIRASKDGNHLTITSF GEQGILYGTYHLLRLQATGQLSESTLKSLNISEKPDYRIRILNHWDNLDGTIERGYAGHS LWKWDELPSVVSPRYEAYARANASIGINATVINNVNASPKILSDDYLQKVKVLADIFRPY GLKIYLSINFSSPAALGGLSTSDPLDKEVIAWWKKKAKDIYSLIPDFGGFLVKANSEGQP GPCDYGRTHAEGANMLADVLKPYHGIVMWRAFVYSPTDSDRAKQAFLEFEPLDGKFRDNV IVQIKNGPIDFQPREPFSPLFGAMKKTAVMPEFQITQEYLGFSNHLAFLAPMWKECLDSD TYMQGEGSTIARVTDGSLFPHSLTAIAGVTNIGDDINWCGHPFAQANWYAFGRLAWKHSL SSEQIGEEWLRQTFLPVALQPYNDSVNEISPKERQQLHSQLSLLNSQLLQESREAVVDYM MPLGLHHIFAWGHHYGPEPWCDIPGARPDWMPSYYHRADSLGIGFDRSSTGSNATGQYHS PLCEELDNVDTCPENLLLWFHHVSWNHQMKSGRTLWAEMCYAYDRGVKEVRNFQKVWDSM KPYIDSERFQEVQHRLKIQARDAVWWRDACLLYFGQFNKQPIPYELERPVHELKDMMEYQ LDITNFECPPYGFTK >gi|222159198|gb|ACAB01000161.1| GENE 11 12543 - 13997 203 484 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 
[marine gamma proteobacterium HTCC2080] # 282 469 21 198 305 82 32 4e-15 MSQNTFCMAGGVARNPLVRLAKPITATIGANEHIAIVGPNGGGKSLFVDTLIGKYPLREG TVQYDFSPSATQTIYDNVKYIAFRDTYGAADANYYYQQRWNAHDQDEAPDVREMLGEIKD EQLQHELFELFRIEPLLDKKIILLSSGELRKFQLTKTLLTAPRVLIMDNPFIGLDAPTRE LLFSLLERLTKMSSVQIILVLSMLDDIPSFITHVIPVDKMEVFPKMEREAYLNAFRSRDV TTSFDDLQKRIIDLPYDGNNYDSDEVVKLNKVSIRYGDRTILKELDWTVSRGEKWALSGE NGAGKSTLLSLVCADNPQSYACDISLFGRKRGTGESIWEIKKHIGYVSPEMHRAYLKNLP AIEIVASGLHDSIGLYKRPQPEQMAICEWWMDIFGIADLKDKPFLQLSSGEQRLALLARA FVKDPELLILDEPLHGLDTYNRRRVKKVIEAFCQRRDKTMIMVTHYESELPNTITDRIFL KRNR >gi|222159198|gb|ACAB01000161.1| GENE 12 14046 - 15911 1395 621 aa, chain - ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 336 617 20 307 328 171 35.0 5e-42 MKISEQNKGAFLHKLMSKANIGWWEADLKTETYKCSELISELLGLEEDGVISFEDFNQRI LREDQSYTTFPSFDELQQTTEEVYLFDTPKGSIWIRSKVCFEETDEDGNTKVYGIAETQE GIRIASAHQALQNSERILHNIYKNLPVGIELYDREGRLIDLNKKELEMFHIRTKEDILGI NLFDNPVLPDEIKQKLRDYENVDFTFRYDFSKVDSYYSPEKSTGFLDLVTKATTLYDHNH EPINYLLINVDKTEDTIAYNKIQEFESFFDLVGDYAKVGYAHFDALSRDGYALRSWYRNV GEEEGTPLPEIIGIHSHFHPEDQAVMIDFLDKVVKGESSKLCRDVRIRRANGNYTWTRVN VLVRNYQPQDNIIEMLCINFDITQLKETERMLIGAKEKAEEADRLKSAFLANMSHEIRTP LNAIVGFSSLLEEAEDAEEKHQYVTIIEENNKLLLQLISDILDLSKIEAGTFDVIPEQVN AQQLCNELFQAMQMKATPQVKILLAPELPELIFTSNKNRLYQVLLNFVTNALKFTSEGSI IIDYRINGNEVRFSVKDTGIGIAPEKQEAIFTRFVKLNSFIPGTGLGLPICQSIITQLGG KIGVESEPGKGSCFWFTHPLH >gi|222159198|gb|ACAB01000161.1| GENE 13 16213 - 16959 793 248 aa, chain + ## HITS:1 COG:STM0772 KEGG:ns NR:ns ## COG: STM0772 COG0588 # Protein_GI_number: 16764136 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglycerate mutase 1 # Organism: Salmonella typhimurium LT2 # 1 248 3 250 250 288 58.0 1e-77 MKKIVLLRHGESAWNKENRFTGWTDVDLTEKGVVEAEKAGETLKEYGFNFDKAYTSYLKR AVKTLNCVLDKMNLDWIPVEKNWRLNEKHYGELQGLNKAETAEKYGEEQVLVWRRSYDIA 
PNPLSESDLRNPRFDYRYHEVPDAELPRTESLKDTIERIMPYWESDIFPSLKTAHTLLVV AHGNSLRGIIKHLKNISDEDIIKLNLPTAVPYVFEFDENLNVANDYFLGNPEEIKKLMEA VANQGKKK >gi|222159198|gb|ACAB01000161.1| GENE 14 16987 - 18039 1064 350 aa, chain + ## HITS:1 COG:ECs2900 KEGG:ns NR:ns ## COG: ECs2900 COG1830 # Protein_GI_number: 15832154 # Func_class: G Carbohydrate transport and metabolism # Function: DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes # Organism: Escherichia coli O157:H7 # 1 350 25 374 374 480 65.0 1e-135 MSKVVELLRDKACYYLDHTCETIDKSLIHVPSPDTIDKVWIDSDRNIRVLNSLQTLLGHG RLANTGYVSILPVDQDIEHTAGASFAPNPIYFDPENIVKLAIEGGCNGVASTFGILGSVA RKYAHKIPFIVKLNHNELLSYPNTFDQVLFGTVKEAWNMGAVAVGATIYFGSEQSRRQLV EIANAFEYAHELGMATILWCYLRNNDFKKGAVDYHAAADLTGQADRLGVTIKADIVKQKL PTNNGGFKAIGFGKVDERMYTELASEHPIDLCRYQVANGYMGRVGLINSGGESHGESDLR DAVITAVVNKRAGGMGLISGRKAFQKPMNEGVELLNTIQDVYLDSSITIA >gi|222159198|gb|ACAB01000161.1| GENE 15 18107 - 18763 818 218 aa, chain + ## HITS:1 COG:TM0295 KEGG:ns NR:ns ## COG: TM0295 COG0176 # Protein_GI_number: 15643064 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Thermotoga maritima # 1 215 1 211 218 248 52.0 6e-66 MKFFIDTANLEQIQEAYDLGVLDGVTTNPSLMAKEGIKGTQNQREHYIKICEIVNGDVSA EVIATDYEGMIREGEELAALNPHIVVKVPCIADGIKAIKHFTEKGIRTNCTLVFSVGQAL LAAKAGATYVSPFVGRLDDICEDGVRLVGDIVRMYRTYDYKTQVLAASIRNTKHIIECVE VGADVATCPLSAIKGLLNHPLTDSGLKKFLEDYKKVNG >gi|222159198|gb|ACAB01000161.1| GENE 16 18917 - 22867 2423 1316 aa, chain - ## HITS:1 COG:slr1393_3 KEGG:ns NR:ns ## COG: slr1393_3 COG0642 # Protein_GI_number: 16329802 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 793 1020 45 290 301 127 34.0 1e-28 MRTKTFLLLCIAFITLFIGVSTLQAGHYYYKQISLKDGLPSTVRCILTDKQGFVWIGTRS GLGRYDGHELKKYIHQANNPHSIPHNLIYQMIEDEQNNIWILTDKGVTRYQRQSDDFFIP TDETGNNIIAYSACLTDNGVLFGSKNKIFFYNYQDTSLRLLQTFDLEPNYNVTLLNLWDK HTLLCCSRWQGLLLLDLNTGKCRRPPFDCGKEIMNVLIDSQKRIWVAPYNGGLNCLTRDG 
KLLATYTTHNSSLSNNVILSLAEREGEIWIGTDGGGINILDPETRRISLLEHMPGSDHYS LPANSILCLYNDCNNNMWAGSIRNGLISIREVSMKTYTDVPPGNDRGLSNNTVLSLFQES QDQIWVGTDGGGINSFNPLTEKFIHYPSTWEDKVASICQFTPGKLLISLFSQGVFIFNPA TGEKQPFTIVDAKTTALLCNRGKAVNLLRNTHHNILFLGEHIYIYDLNTRTFSTVIEQEE QEIIGALLPIDNYKNITYLNDTKHIYELDNHTNRLKALYRCWKDTTINSVARDEEGNFWI GNNYGLVHYNPATKIQTPIPTSLFSEITLIACDQQGKVWIGADSMLFAWLIKEKKFVLFG ESDGVIQNEYLSKPRLLSMRGDIYMGGVKGLLHINSKIPLTTSELPQLQLSDVIINGESV NNELTNHPAGISAPWNCNITIRIMSKEKDIFRQKVYRYQIEGLNDQQIESYNPELAIRSL PPGSYKIMASCTAKDGNWIPSQEVLELTILPPWYRTWWFILCCAVFIAAAIIEAFRRAWK RKEEKLKWAMKEHEQQVYEEKVRFLINISHELRTPLTLIYAPLNRMLKSLSTEDTQYLPI KAIYRQAQRMKNLINMVLDVRKMEVGESKLLIQPYALNQWIEHVSQDFISEGEAKNVHIR YQLDPHIDTVSFDKDKCEIILSNLLINALKHSPQDAEITIMSELLSEKNRVRISIIDRGN GLKQVNTQKLFTRFYQGTGEQSGTGIGLSYSKILVELHGGSIGAQDNQDAGATFFFELPL RQQSEEIICQPKAYLNELMSDDSKEQQPEEDNFDTSPYSVLVVDDNPDLTDFLKKSLGEY FKRVVIASDGVEALQLTKSHAPDIIISDVMMPRMNGYELCKNIKEDITISHIPIVLLTAR DDKQSQLSGYKNGADAYLTKPFEIEMLMEIIRNRLKNRESIKKRYLNTGLVPAPEESTFS QADETFLLKLNKIILEHLDSSHLDVTFICKEIGMSRASLYNKLKALTDMGANDYINKFRM EKAITLITSTDLSFTEIAEKVGFTTSRYFSTAFKQYTGETPTQYKEKQKQERKKEQ >gi|222159198|gb|ACAB01000161.1| GENE 17 22951 - 24411 1269 486 aa, chain + ## HITS:1 COG:YPO3586 KEGG:ns NR:ns ## COG: YPO3586 COG1660 # Protein_GI_number: 16123728 # Func_class: R General function prediction only # Function: Predicted P-loop-containing kinase # Organism: Yersinia pestis # 347 474 156 277 284 89 38.0 1e-17 MLIFARKYTTMITEELQKLYQSYTGVPAENITELPSSGSNRRYFRLTGIETLIGVYGASI DENEAFLYMAGHFRKCGLPVPEVRIASEDKTYYLQEDLGDTLLFHAIEKGRATSVFSEEE KELLRKTIRLLPAIQFAGADGFDFSRCYPQPEFNQRSILWDLNYFKYCFLKATGMEFQED KLEDDFQKMSDVLLRSSSATFMYRDFQSRNVMIKDGEPWFIDFQGGRKGPFYYDIASFLW QAKAKYPDSLRKELLQEYMEALRKYQPIDESYFYSQLRHFVLFRTLQVLGAYGFRGYFEK KPHFIQSVPYAIENLRELLKEEYPEYPYLCNVLRELTGLKQFTDDLKKRQLTVKVMSFAY KKGIPDDSTGNGGGYVFDCRAVNNPGKYERYKPFTGLDEPVITFLEEDGEILRFLDHVYA LVDASVKRYMERGFSNLSVCFGCTGGQHRSVYSAQHLAEHLNQKFGVKVELVHREQNIEH MFEATI >gi|222159198|gb|ACAB01000161.1| GENE 18 24447 - 25196 592 249 aa, chain + ## HITS:1 
COG:CAC2981_1 KEGG:ns NR:ns ## COG: CAC2981_1 COG1208 # Protein_GI_number: 15896233 # Func_class: M Cell wall/membrane/envelope biogenesis; J Translation, ribosomal structure and biogenesis # Function: Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) # Organism: Clostridium acetobutylicum # 1 193 1 182 382 108 33.0 1e-23 MKAMIFAAGLGSRLKPLTDTMPKALVPIAGRPMLEHVILKLKAAGFTEIVINIHHFGEQI LDFLKANENFGLIIHISDERDLLLDTGGGVKKARSFFENSDEPFLIHNVDILSDVNLKDL YDYHLQSGAVATLLASQRKTSRYLLFDTGKRLCGWINKDTGQVKPEGFQYDPSLYQEYAF SGIHVLSPAIFQWMTSPCWEGKFSIMDFYLATCRQVNYSGYLTEKLHLIDIGKPETLAKA TDFLYQNAK >gi|222159198|gb|ACAB01000161.1| GENE 19 25305 - 25517 252 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|298482966|ref|ZP_07001148.1| ## NR: gi|298482966|ref|ZP_07001148.1| two-component system sensor histidine kinase [Bacteroides sp. D22] # 1 70 1 70 660 137 100.0 2e-31 MEQTLENGIADNLLHNIFNDLSVGLELYDKDGLMIDVNYSRLRSMGIKDKKDILGYNLFN YTSFSDEIKE >gi|222159198|gb|ACAB01000161.1| GENE 20 25545 - 27287 1248 580 aa, chain + ## HITS:1 COG:mlr3786_1 KEGG:ns NR:ns ## COG: mlr3786_1 COG0642 # Protein_GI_number: 13473249 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 28 428 43 462 478 174 31.0 5e-43 MAKYNFDDVRHLFPTNLSGIKFFEITVSFVHNEKSEITNYMVISQDVTERVLWQNKYDNL YEEAVRSKKELLESEQRMIQLIRQNELVLNNINSGLAYIANDYIVQWENISLCSKSLSYE AYKKGELCYLTAHNRTTPCENCVMQRARRSGQVESILFNLDNKHVIEVFATPIFNEQGDV DGVVIRVDDVTERQHMIGELEKARSRAEQSDKLKSAFLANMSHEIRTPLNAIVGFSDLLM VTEDQEEKEEFIQIINANNELLLKLINDILDLSKIEAGSVELKYEDFDLAVYFNELAASM HRRVVNPQVRLVPVNPYETCTVRLDKNRLAQILTNFVTNAIKYTSKGTIEMGYEKIDGDI RLYVRDTGIGIPEDKKDKVFHRFEKLDEFAQGTGLGLSICKAIVEACRGEIGFESEFDKG SLFWAVLPCQFDSVDSEPTSSRRNNEKDADNESILDSEETKKVPKRVLVVEDIQSNFFLV SSILKNKCQLLHAPNGLEAVEIVRTQPVDLVLMDMKMPVMDGRTATSEIRKFNAEIPIIA LTAHAFDADRVAALKAGCDDYLVKPINGAKLMQTLKEYGC 
>gi|222159198|gb|ACAB01000161.1| GENE 21 27947 - 28996 875 349 aa, chain + ## HITS:1 COG:no KEGG:BVU_2957 NR:ns ## KEGG: BVU_2957 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 349 3 320 320 399 61.0 1e-110 MSKYVNPFTDIGFKIIFGQPASKNLLITLLNELLAGEHHITELTFLDKEDHADNVSDKGI IYDLYCRTASGEYIIVEMQNRWHSNFLDRTLYYVCRAVSRQIESPSSKEVPVPEDPMTAR EPLVSYGKQYRLPTIYGIFLTNFKEENLEAKFRTDTVLSDRDTGKIVNPHLRQIYLQFPY FTKDLSDCHTLYDKLIYALKNMSNWNRMPDALKEQVFEHLARLAAVADLSEENRIAYDKA LDRYRVNQIVEEDERRKNEEMRRKAAEEGLKEGMKAGLEKGVKKGRLEGIKEGMKEGMKE GMKEGLEKGLEKGEQKKQIEIARKMREDGISIDIIIKYTGLQSSDIENL >gi|222159198|gb|ACAB01000161.1| GENE 22 29283 - 32498 2577 1071 aa, chain + ## HITS:1 COG:no KEGG:BVU_0540 NR:ns ## KEGG: BVU_0540 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 5 1071 3 1073 1073 1447 70.0 0 MTSMKNLSKVFLLCFITLFAFVAQSMGQGKNITGQVVDALNESMPGVNVQVKGTTNGTIT NIDGKFSISVPNSKSVLVFTFIGYSKQEIVVGNQTKINVQLKEDAQNLDEVVVVGYGTAK KSDLTGATASLRPDANDASKAVSIDGLLQGKIAGLNVTASMSTPGAASSVTIRGANSLRG DNQPLYVIDNVPQASTGEFAESAVGGGDFQIAQDPLSAINPNDIEDITVLKDASATAIYG SRGANGVILITTKKGKAGKAKVNVTANFTIANARNLHDMLNLEEYADYTNSKLDQPRYYL QPNGEMRYVFSGNESAYQADPNNPELYKVITYRNWQKEAYTSAFSQVYSASVSGGTAGMK YYISGNFKDINGIVENTGIKQGDLRANLTAELSKKVTLNLILSGSLKQNDMMTGGNTTGG VAGSLARTVLDTAPYIVPDDDPGALESKMNVFSWVNDYDDITNNKTFNGSLNLNWNINKY FSYSLRAGGNIIVNDRKRWYGIQLPPGENVKGLLAISNVNKSNYTVENLLNFNMDLTKDF RLSATAGITYDEYHFLTENVSGNTFSIYDLRTKGLSNAGKVDVKQPIQKDYQLLSYLGRV NLSAYDKYLLTASLRADGSSKFKKGNRWAYFPSFSLAWRMEQEAYMKSIDWLSQLKVRVG YGETGNQGIDPYQSLSIYENKVDYADGDGNKLTSMQIKSLQNEGLKWERTSSWNVGLDFG FFNGRLNGAFDYYRKNTKDLLVERALPYSSGFATVMINEGSLLNKGFEFSLNADIIRSNG WKWNVGGNIGFNKTTIENLGLEVSDIGSLKGVRGYLGKTIGDHFGPANVFIEGEAPGLFF GYKTQGIIQEEDVVDGKVGYISSDGSTKYYTSSVGNDMAAGNIKCVDVNEDGIVDPNDMV VIGDPNPDFTYGFQTRLEWKNLSLSASFTGVYGNDILNTNIRYEQTPSRQAGNLRKDAYY NAWTPENRSNLYPSSTSNVKGVVYDRYVEDGSYLRCSDITLNYILPKAWMTKIGFQNTSI FASVKNAFVITNYSGYDPEVNSFAFDGLRRGIDMNAFPNMRSFVFGLNVTF >gi|222159198|gb|ACAB01000161.1| GENE 
23 32509 - 34335 1257 608 aa, chain + ## HITS:1 COG:no KEGG:BVU_0539 NR:ns ## KEGG: BVU_0539 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 5 608 1 609 611 647 56.0 0 MIMKMKKFTYIFLLASCLLLASCESWLDEDPQYTLNTQIQFSTSEKARAALYGCYGYFAL TSGGYGQHWQEIPVRYSGFGWAQTNGGTNDMCGWLQCDYTDDLLTNAWYVMYKVINETNA FIANVESTSLEDKDEMAAEAHFLRALSYYNLAICWGDVPLKTTPSSHDGVAVPRSSRLEV LDVVRSDLEFASKYLPETNSDGFPTKWTAEAYLGKLYHTLGCLGDATAWEKAKTYFEGVK GKYDLESKFSNLFVDKIKGSKESIFQLGFQVNGELSSFNRGSWVFNPAASSYGKSFHRVK CTKAFYDFFAGTYWGDPRIETTFMAKWRKCTNNIGPVAQVEPVPSARDSTYEYPYFTYDS GIKPAGWKNNKLLVGRIPYELLTNKENPSADEILAITNDETKFDEKYRKGLKDFLNVCTA AGRTGERAAWPHFAKAFDMNMAGIASHKNLILYRYADLLLMLADVYNELDRKDDAINTVD LVLARARKSGAKVSTQPAKWESTLTKDQVREKIFFERLFEGAGETEMYQMMRLRGTEYLK KALTVNNNHSITKKWQENNPNATGVWGERIYNNGNLNDETFLKKNLLFPIPRDEMNTNIA ITKNNFGY >gi|222159198|gb|ACAB01000161.1| GENE 24 34607 - 36190 1409 527 aa, chain + ## HITS:1 COG:ECs4619 KEGG:ns NR:ns ## COG: ECs4619 COG3119 # Protein_GI_number: 15833873 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Escherichia coli O157:H7 # 37 519 5 443 497 131 25.0 4e-30 MEKLQSNLFFPLAGIAAVASLASCSNKQRPAEQLPLNIVYIMTDDHTAQMMSCYDTRYME TPNLDRIAADGVRFTQSFVANSLSGPSRACMITGKHSCANKFYDNTTCVFDSSQQTFPKL LQKIGYQTALVGKWHLESLPSGFDYWQIVPGQGDYYNPDFITQNNDTIQKHGYITNLITD DAIDWIENKRNPEKPFCLLIHHKAIHRNWLADTCNLALYEDKTFPLPDNFFDDYEGRPAA AAQEMNIVKDMDMIYDLKMLRPDKNTRLKSLYEKYVGRMDEGQRAAWDKFYTPVIDDFYR QNLQGKELANWKFQRYMRDYMKTVKSLDDNVGRVLDYLKEKGLLDNTLVVYTSDQGFYMG EHGWFDKRFMYEESMRTPLIMRLPKGFDRRGDITEMVQNIDYAPTFLELAGAEIPSDIQG ISLVPLLKGEHPKDWRKALYYHFYEYPAEHMVKRHYGVRTDRYKLIHFYNDINWWELYDL QADPSEMHNLYGQPEYESVVKELKEEMLKLQEQYNDPVRFSPERDKE >gi|222159198|gb|ACAB01000161.1| GENE 25 36209 - 38533 1790 774 aa, chain + ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter 
vibrioides # 21 613 26 593 757 357 34.0 4e-98 MKRYILIILILLSPLTVFLQADNLTSPLISIVPRPTQIVPGSGNFTFSAKTAFAVENQEQ AVIARNFIDLFTRAAGITPALNVGSKEGQIRFVTDSSFKPEAYLLEITPQQILIKASDTK GFFYALQSIRQLLPAAIESEQPVQNIAWSVPAMTVQDEPRFGFRGLLLDPVRCFIPKENV LRIIDCMAMLKINKLHFHLTDDNGWRIEIKKYPRLTEVGAWRVDRTDVPFHSRRNPERGE LTPIGGFYTQEEIREIVAYAADRQIEVIPEIDVPAHSNAALAAYPQFACPVVKDFIGVLP GLGGRNSEIIYCAGNDSVFTFLQDVFDEILALFPSRYIHVGGDEARKTNWEKCPLCQKRM RKQHLANEEDLQGYFMKRISDYLRKKGREVIGWDELTNSSFLPEESIILGWQGMGTAALK AAEKGHRFIMTPARIMYLIRYQGPQWFEPVTYFGNNTLKDVFDYEPVQKDWKPEYESLLM GIQACMWTEFCNKPEDVDYLLFPRLAALAEVAWTAAGTKDWTGFLKRMDVYNSHIAEKGI VYARSMYNIQQTVTPVDGHLEVSLECIRPDVEIYYTLNGSNPAMSSHRYDGPICVTKTQM VKAATFMNGKQMGETLELPLTWNKATAKPLLDNKKNEMLLVNGLRGSLKYTDSEWCNWSR NDSISFTIDLQSQEKLNKLAIGCITNYGMGAHKPKMIRVEVSDDNRTYHAIGELNFSPEE IYQEGTFRNDYSIDMGGVSARYVRVTAEGAGICPDEHVRPGQEARVYFDEVIIE >gi|222159198|gb|ACAB01000161.1| GENE 26 38610 - 41693 2946 1027 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 32 1023 7 981 1087 668 39.0 0 MNISFKKRTSLILSAVLAATTFTLAQQQPLPEWQSQYAVGLNKLAPHTYVWPYANASDIE KPGGYEQSPFYMSLNGKWKFHWVKNPDNRPKDFYQPSYYTGGWADINVPGNWECQGYGTA IYVNETYEFDDKMFNFKKNPPLVPHAENEVGSYRRTFKVPADWKGRRIVLCCEGVISFYY VWVNGKLLGYNQGSKTAAEWDITDVLNEGENVVALEVYRWSAGAYLECQDMWRLSGIERD VYLYSTPKQYIADYKLNASLDKEKYKDGIFGLEVTVEGPSITLGATSIAYTLKDAEGKAV LQDAIRIKSRGLSNLITFDEKNLPDVKAWSAEHPHLYTLILELKDEQGKVTELTGCEVGF RTSEIKDGRFCINGVPVLVKGTNRHEHSQLGRTVSKELMELDIKLMKQHNINMVRNSHYP THPYWYQLCDRYGLYMIDEANIESHGMGYGSASLAKDTTWLTAHMDRTHRMYERSKNHPA IVIWSLGNEAGNGINFERTYDWLKSVENTRPVQYERAELNYNTDIYCRMYRSVDDIKAYL AKKDIYRPFILCEYLHAMGNSCGGLKEYWDVFENNPMAQGGCVWDWVDQSFREIDKNGKW YWTYGGDYGPEGIPSFGNFCCNGLVGANREPHPHLLEVKKVYQNIKATLADQKNLTIRVK NWYDFSNLNEYVLNWNVTADNGKILAEGTKTIDCAPHATVDVALGAVKLPNTIREAYLNI SWTRREASSMIDKDWEVAYDQFVLAGNKNYTGYRPQKAGETTFTVDKQTGALTSLNLNGK 
ELLAAPLTLSLFRPATDNDNRDKNGARLWRNAGLDNLTQKVVSLKEGKTSTTARVEILNA KAQKIGTADFVYSLDKNGALKVLTTFQPDTTIVKSMARLGLTFRVSNTYDQVSYLGRGEN ETYIDRNQSGKIGVYQTTPERMFHYYVAPQSTGNRTDVRWAKLANTSGEGLFVESNRAFQ FSMIPFSDVLLEKVRHINELERDGLLTVHLDAEQAGVGTATCGPGVLPQYLVPLKKQSFE FTLYPVK >gi|222159198|gb|ACAB01000161.1| GENE 27 41838 - 43655 1560 605 aa, chain + ## HITS:1 COG:SP2146 KEGG:ns NR:ns ## COG: SP2146 COG3669 # Protein_GI_number: 15901959 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Streptococcus pneumoniae TIGR4 # 50 487 10 460 559 320 39.0 6e-87 MNKLLISLLLSSTLIGGVQAQKKENYYVKHVEFPQDATLEQKVDMAARLVPTPQQYAWQQ MELTAFLHFGINTFTGREWGDGKEDPALFNPSELDAGQWVKSLKNAGFKMVILTAKHHDG FCLWPTATTKHSVASSPWKNGQGDVVKELRKACDKYDMKFGVYLSPWDRNAECYGDSPRY NEFFIRQLTELLTNYGEVHEVWFDGANGEGPNGKKQIYDWDAFYKTIQRLQPKAVMAIMG DDVRWVGNERGLGRETEWNATVLTPGIYARSTENNKRLGVFSKAEDLGSRKMLEKATELF WYPSEVDVSIRPGWFYHAEEDAKVKSLKHLSDIYFQSVGYNSVLLLNIPPDRKGLINEAD VNRLEEFAAYREQIFADNRVKKGRNYWNAISGSEAVYSLEPGSEINLVMLQEDITKGQRV ESFVVEALTDNGWKEVGKGTTIGYKRMLRFPVVKASQLRVKIDECRLTAHINQVAAYYAA PLQEVVQGEDWNNLPRAGWKQVADSPLTIDLGKSVTLASFTYAPSKAEAKPTMAFRYKFF VSMDGKHWKEVPANGEFSNIMHNPLPQTVTFGQKVQARYIKLEATTPTATTAKVGMDEIG VITTP >gi|222159198|gb|ACAB01000161.1| GENE 28 43796 - 44971 934 391 aa, chain + ## HITS:1 COG:VC0265 KEGG:ns NR:ns ## COG: VC0265 COG0668 # Protein_GI_number: 15640294 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Vibrio cholerae # 35 388 28 400 412 243 35.0 3e-64 MLVLDLGKWMNKILIDWGIDPTVADRFDETIIAVLMIAIAIGLNYLCQAILIGGMKRYTR RKPHLWNTLLMKRKVFHNLIHTVPAFLVYSLLPMAFIRGKDLLLISQKACAVYIIFSLLL AINGILLMIMDIYDGKESMKNRPMKGFIQVLQVLLFFVGGIIIIAIIVNKSPASLFAGLG ASAAILMLVFKDSILGFVAGIQLSANDMVRPGDWITLPSGAANGTVQEITLNTVKIQNFD NTISTVPPYTLVSSPFQNWRGMVQSGGRRVMKNITLDLTTLQFCTPEMLDRYRKEIPLMA DYQPEEGVVPTNSQVYRVYIERYLCSLPVVNQDLDLIISQKEATMYGVPIQVYFFSRNKV WKEYERIQSDIFDHLLAMVPKFDLKVYQYSD >gi|222159198|gb|ACAB01000161.1| GENE 29 45068 - 46678 1207 536 aa, 
chain + ## HITS:1 COG:no KEGG:BVU_1280 NR:ns ## KEGG: BVU_1280 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 533 1 533 536 865 78.0 0 MKYPIGIQSFDRLREDGFVYVDKTALVYNLVQTGSIYFLSRPRRFGKSLLVSTLACYFQG RKELFDGLAIADLEKDWLQYPIFRIDFNGGPYTQPGVLEATIEGYLGNWEDIYGKNLNYT TTGDRFKELLRRAYERTGQRAVVLIDEYDKPILDVLDTGTYTCNHVGEKLLLEDHHREIL KSFYSTFKGADEYLKFVLLTGVTKFSQVSVFSGFNQPKDISMDERYESLCGITQEELEKY FAEPISRLATKYECTVEEMKDWLKRQYDGYHFSTNMAAIYNPFSILNAFDMNEIRDYWFA TGTPTYLIRLLQHSREQMNELTGKFYDPTMFIDYKADVEQPLPMIFQSGYLTIKEYNKKM GTYLLDFPNNEVRKGFLSVLAANYMKPKSKEVTSWITDAVMDLERGDTDAFRRSLTSFLA SIPYDSHGSLKDIDITEKHFQYTFYLLLRLIGVYCTAIHCEDRQSYGRVDCTLEMEDYVY IFEFKMDGTAQEALEQIEKTGYAKPYLADKRKVICIGVNFSSVTRTVEDWEEVSIV >gi|222159198|gb|ACAB01000161.1| GENE 30 46757 - 47974 769 405 aa, chain - ## HITS:1 COG:no KEGG:BT_1616 NR:ns ## KEGG: BT_1616 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 26 405 1 373 373 551 70.0 1e-155 MFLPRNNRQISQHKEEYGNSEKRVAVAMLFFAMNLIAVSFFAPSYAIAQDSHEGQGYHDT QVPSQQERIPFRVVSWNIENLFDTRHDSLKNDHEFLPNAMRHWNYSRYKKKLVDVARVIT AIGEWSPPALVGLCEVENDTVLRDLTRRSPLKELSYRYVMTNSPDLRGIDVALLYQRDLF KLLSSRSISIPPLKQYRPTRDLLHVSGLLLTGDTLDVFVCHLPSRSGGAKESEPYRLYAA HILRMEVDSIMNIRSLPQVIIMGDFNDYPTNQSIQKILEAEAPPVTTNNLASTTGSTNFS TVAPSSLKLYHLLARKAKSKDFGSYKYRGEWGLLDHLIVSGTLLNQSNHFFTSEEKANVC LLPFLLKDDEKYGDKEPFRTYKGMKYQGGVSDHLPIYADFELILY >gi|222159198|gb|ACAB01000161.1| GENE 31 47955 - 49415 1426 486 aa, chain - ## HITS:1 COG:VC2279 KEGG:ns NR:ns ## COG: VC2279 COG2195 # Protein_GI_number: 15642277 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Vibrio cholerae # 2 485 51 533 534 467 47.0 1e-131 MEKKDLKPAGVFKYFEEICQVPRPSKKEEKMIVYLKAFGAKHNLETKVDEAGNVLIKKPA TPGKENLQTVVLQSHIDMVCEKNNDVQHDFLTDPIETEIDGEWLKAKGTTLGADNGIGVA TELAILADDSIEHGPLECLFTVDEETGLTGAFALQEGFMSGDILLNLDSEDEGEIFIGCA GGIDSVAEFTYKEVEIPAGYFFFKVEVKGLKGGHSGGDIHLGRGNANKILNRFLSRMAAR 
HDLYLCEINGGNLRNAIPREAYTICAVPEDAKHDVRTELNIFTSEMEDELSVVEPDLKLV LESETPRKTAIDQDTTTRLLKALYAAPHGVYAMSQDIPGLVETSTNLASVKMKPNHVIRI ETSQRSSILSARNDMANTVRAVFQLAGANVTFGEGYPGWKPNPHSAILEVAVESYKRLFG VDAKVKAIHAGLECGLFLDKYPTLDMISFGPTLTGVHSPDERMLIPTVEKFWKHLLDILA HVPAKK >gi|222159198|gb|ACAB01000161.1| GENE 32 49549 - 49782 220 77 aa, chain + ## HITS:1 COG:no KEGG:BT_1614 NR:ns ## KEGG: BT_1614 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 77 1 77 77 137 94.0 9e-32 MVETILITLLIVAISLVLLGVKVFFTKGGKFPNGHVSGNKTLRQKGIGCAQSQDREAQKK PRFSIDELEKALNDSMN >gi|222159198|gb|ACAB01000161.1| GENE 33 49819 - 50412 583 197 aa, chain + ## HITS:1 COG:no KEGG:BT_1613 NR:ns ## KEGG: BT_1613 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 197 1 193 193 290 89.0 2e-77 MNYLMNGLAALAFIVLFSQCAGKTDNQTTSTPAQANAELSGMKIAYVEIDTLLAKYNFCI DLNEAMVKKSENVRMTLNQKATSLNKEKQDFQKKVENNAFLSQDRAQQEYNRLVKLEQDL QELSNKLQNGLMEENNKNSLQFRDSINAFLKEYNKTHGYSLIFSNTGFDNLLYADSAFNI TKEIVDGLNARYSPVKK >gi|222159198|gb|ACAB01000161.1| GENE 34 50522 - 50863 482 113 aa, chain - ## HITS:1 COG:no KEGG:BT_1612 NR:ns ## KEGG: BT_1612 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 111 1 111 113 179 90.0 2e-44 MKQLVPALFVVGAIMALTGAAVFITGWIYAPYIYTIGAGLFALAQVNTPVKGKGKTLKRL RVQQIFGALALILTGAFMFTTRGNEWIACLTIAAILELYTAFRIPQEEEKERA >gi|222159198|gb|ACAB01000161.1| GENE 35 50867 - 51163 349 98 aa, chain - ## HITS:1 COG:no KEGG:BF3072 NR:ns ## KEGG: BF3072 # Name: not_defined # Def: putative septum formation initiator-related protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 98 1 98 100 142 78.0 5e-33 MGKLISIWSFIRRRKYLITVVAFAVIIGFLDENSLVRRFGYEREISQLKEEIEKYRADYE ENTKRLNEITTNPDAIEQIAREKYLMKKPNEDIYVFED >gi|222159198|gb|ACAB01000161.1| GENE 36 51398 - 53263 1461 621 aa, chain + ## HITS:1 COG:BH0034 KEGG:ns NR:ns ## COG: BH0034 COG2812 # Protein_GI_number: 15612597 # Func_class: 
L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Bacillus halodurans # 3 430 2 413 564 287 39.0 4e-77 MENYIVSARKYRPSTFESVVGQRALTTTLKNAIATQKLAHAYLFCGPRGVGKTTCARIFA KTINCMTPTADGEACNQCESCVAFNEQRSYNIHELDAASNNSVDDIRQLVEQVRIPPQIG KYKVYIIDEVHMLSASAFNAFLKTLEEPPRHAIFILATTEKHKILPTILSRCQIYDFNRI SVEDTVNHLSYVASKEGITAEPEALNVIAMKADGGMRDALSIFDQVVSFTGGNITYKSVI DNLNVLDYEYYFRLTDCFLENKVSDALLLFNDILNKGFDGSHFITGLSSHFRDLLVGKDP VTLPLLEVGASIRQRYQEQAQKCPLPFLYRAMKLCNECDLNYRISKNKRLLVELTLIQVA QLTTEGDDVSGGRGPTKTIKPIFTQPAAAQQPQVTSATQVQQASLHTSPSSVTTQAVNGT TARHPQASAAVQPGASASSGAASSAPSQGAGVAPTVKEERKIPVMKMSSLGVSIKNPQRD QTTQNTVTTHVPRVQQPEEDFIFNDRDLNYYWQEYAGQLPKEQDALTKRMQMLRPVLLNN STTFEVVVDNEFAAKDFTALIPELQSYLRGRLKNSKVVMTVRVSEATETIRPVGRVEKFQ MMAQKNQALMQLKDEFGLELY >gi|222159198|gb|ACAB01000161.1| GENE 37 53356 - 53589 242 77 aa, chain - ## HITS:1 COG:MA4346 KEGG:ns NR:ns ## COG: MA4346 COG1983 # Protein_GI_number: 20093134 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Putative stress-responsive transcriptional regulator # Organism: Methanosarcina acetivorans str.C2A # 4 62 112 170 183 65 46.0 3e-11 MENEKKLTRSSNRMIAGVCAGIAEYFGWDPTLLRIVYILATFFTAFAGVIIYIILWIVMP GKRPSDGYEDRMNQRLH >gi|222159198|gb|ACAB01000161.1| GENE 38 53895 - 54428 93 177 aa, chain + ## HITS:1 COG:no KEGG:BT_1545 NR:ns ## KEGG: BT_1545 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 175 14 186 187 178 51.0 6e-44 MSNSSINELFAHSSLLELPKKHILTSSFQNDKNFYFIEKGIARSYCVINDKELTSWFSTE GDIVFSTNNFYGNQQGYEYEVVQLLENTVLYAVPIKDLEKLYQTNIEIANWSRILHQEAF IMNEKRLISRLYKSAEERYIELLQTRPDLFQRVNLGYIASFLGISQVTLCHLRNKIK >gi|222159198|gb|ACAB01000161.1| GENE 39 54494 - 55675 526 393 aa, chain + ## HITS:1 COG:RSc3292 KEGG:ns NR:ns ## COG: RSc3292 COG3274 # Protein_GI_number: 17548009 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Ralstonia solanacearum # 23 370 1 316 336 76 22.0 
9e-14 MSYHCPNFAIEKIEYQMKNVNHIVWADLIRVIAILMVICIHCSDPFNVSPEARLNPEYNL WGSIYGAFLRPCVPLFVMLTGMLLLPVEQGIGQFYKKRMLRVIVPFIVWSLLYNLFPWFT GILGLNNSVLSEMFAYAGESPSQTWDSCLHNILMIPFNFNTYTVPLWYIYMLIGLYLYMP FFSAWLKQATEKQMKIFLFVWFITLFLPYCQEYISSYIGGTCAWNDYGTLYYFLGFSGYL LLGYYLKDRNKLSVKKTVVLSLILFSVGYAITYYGFRNMTSQPDITERQMELFFLYCSPN VFLMTVAWYLLIQKVKVTSPLIISALKNITKCGLGIYMVHYFAVGIGYLAIDRIDLPIFM RIPATALFVFIVSWCIVALFYKVLPKAAKWIMG Prediction of potential genes in microbial genomes Time: Wed May 18 04:43:27 2011 Seq name: gi|222159197|gb|ACAB01000162.1| Bacteroides sp. D1 cont1.162, whole genome shotgun sequence Length of sequence - 16663 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 10, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 73 - 151 70.9 # His GTG 0 0 - Term 24 - 70 8.9 1 1 Op 1 . - CDS 216 - 1349 747 ## COG2843 Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) 2 1 Op 2 . - CDS 1367 - 2248 955 ## COG0190 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase - Prom 2276 - 2335 3.2 - Term 2387 - 2420 3.1 3 2 Tu 1 . - CDS 2456 - 3778 1796 ## COG0541 Signal recognition particle GTPase - Prom 3801 - 3860 3.4 4 3 Tu 1 . - CDS 3883 - 5205 1146 ## COG0534 Na+-driven multidrug efflux pump - Prom 5320 - 5379 2.4 - Term 5420 - 5474 -0.5 5 4 Tu 1 . - CDS 5479 - 6945 1192 ## COG3119 Arylsulfatase A and related enzymes - Prom 6965 - 7024 6.0 - Term 7150 - 7209 17.5 6 5 Tu 1 . - CDS 7232 - 9490 2003 ## COG1158 Transcription termination factor - Prom 9717 - 9776 7.1 + Prom 9542 - 9601 6.6 7 6 Tu 1 . + CDS 9727 - 10563 574 ## BT_1594 putative ferredoxin + Term 10568 - 10617 -0.6 - Term 10287 - 10330 -0.2 8 7 Tu 1 . - CDS 10548 - 11828 863 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control - Prom 11906 - 11965 4.8 + Prom 11801 - 11860 8.4 9 8 Op 1 . + CDS 11974 - 14460 2107 ## COG0370 Fe2+ transport system protein B 10 8 Op 2 . 
+ CDS 14457 - 14672 79 ## BF1354 hypothetical protein + Term 14742 - 14809 30.2 + TRNA 14724 - 14798 52.4 # Cys GCA 0 0 - Term 14801 - 14839 7.1 11 9 Tu 1 . - CDS 15069 - 15830 410 ## gi|294646784|ref|ZP_06724406.1| conserved domain protein + Prom 15660 - 15719 4.6 12 10 Tu 1 . + CDS 15859 - 16663 477 ## BVU_3167 hypothetical protein Predicted protein(s) >gi|222159197|gb|ACAB01000162.1| GENE 1 216 - 1349 747 377 aa, chain - ## HITS:1 COG:SPy0818 KEGG:ns NR:ns ## COG: SPy0818 COG2843 # Protein_GI_number: 15674859 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) # Organism: Streptococcus pyogenes M1 GAS # 26 292 69 338 430 138 34.0 2e-32 MILLLSLSCTSRSQAKRDSIIDTLSDSLSATDSISPTDTLRLLFVGDLMQHQGQINAART STGYDYSTCFTYVKEEIGRADLAIANLEVTLGGKPYKGYPAFSAPDEFLTAIHDAGFNVL VTANNHSLDRGKSGLERTIQLIDSLKIPHAGTYINTEERDKKYPLLLEKNGFRIALLNYT YGTNGIPVTPPNIVNYIDTTVIAKDIEESKAMKPDAIIACMHWGIEYQSLPDKEQKFLAD WLIQKGVNHVIGSHPHVVQPIEVRTDSVTNDKHLVVYSLGNYISNMSARRTDGGLMVRME LVKDSTVRLNNCEYSLVWTARPIQSGKKNHQLLPVNLPADSIPLQARNSLKIFTNDARTL FNKHNQGIKEYTFYKKK >gi|222159197|gb|ACAB01000162.1| GENE 2 1367 - 2248 955 293 aa, chain - ## HITS:1 COG:lin1397 KEGG:ns NR:ns ## COG: lin1397 COG0190 # Protein_GI_number: 16800465 # Func_class: H Coenzyme transport and metabolism # Function: 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase # Organism: Listeria innocua # 3 289 4 279 284 272 52.0 7e-73 MTLIDGKAISEQVKQEIAAEVAEIVARGGKRPHLAAILVGHDGGSETYVAAKVKACEVCG FKSSLIRYESDVTEEELLAKVRELNEDDDVDGFIVQLPLPKHISEQKVIETIDYRKDVDG FHPINVGRMSIGLPCYVSATPNGILELLKRYEIETSGKKCVVLGRSNIVGKPMAALMMQK AYPGDATVTVCHSRSKDLIKECQEADIIIAALGQPNFVKAEMVKEGAVVIDVGTTRVPDA TKKSGFKLTGDVKFDEVAPKCSFITPVPGGVGPMTIVSLMKNTLLAGKKAIYQ >gi|222159197|gb|ACAB01000162.1| GENE 3 2456 - 3778 1796 440 aa, chain - ## HITS:1 COG:BS_ffh KEGG:ns NR:ns ## COG: BS_ffh COG0541 # Protein_GI_number: 16078661 # Func_class: U Intracellular trafficking, 
secretion, and vesicular transport # Function: Signal recognition particle GTPase # Organism: Bacillus subtilis # 2 433 3 437 446 443 55.0 1e-124 MFDNLSERLERSFKILKGEGKITEINVAETLKDVRKALLDADVNYKVAKNFTDTVKEKAL GQNVLTAVKPSQLMVKIVHDELTQLMGGETAEINIDARPAVILMSGLQGSGKTTFSGKLA RMLKTKKNRKPLLVACDVYRPAAIEQLRVLAEQIEVPMYCELDSKNPVEIAQHAIQEAKA KGYDLVIVDTAGRLAVDEQMMNEIAAIKEAINPNEILFVVDSMTGQDAVNTAKEFNERLD FNGVVLTKLDGDTRGGAALSIRSVVNKPIKFVGTGEKLEAIDQFHPARMADRILGMGDIV SLVERAQEQYDEEEAKRLQKKIAKNQFDFNDFLSQIAQIKKMGNLKDLASMIPGVGKAIK DIDIDDNAFKSIEAIIYSMTPAERSNPEILNGSRRTRIAKGSGTTIQEVNRLLKQFDQTR KMMKMVTSSKMGKMMPKMKR >gi|222159197|gb|ACAB01000162.1| GENE 4 3883 - 5205 1146 440 aa, chain - ## HITS:1 COG:PAB0243 KEGG:ns NR:ns ## COG: PAB0243 COG0534 # Protein_GI_number: 14520582 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Pyrococcus abyssi # 5 393 6 397 463 120 25.0 6e-27 MYTNKQIWSVSYPILLSLLAQNVINVTDTAFLGHVSEVALGASAMGGLFYICVFTIAFGF STGSQIVIARRNGEGRYSDVGPVMIQGVMFLFVMALLLFGFTKAFGGNIMRLLVSSESIY EGTMEFLDWRIYGFFFSFINVMFRALYIGITRTKVLTINAIVMALTNVVLDYALIFGKFG LPEMGIKGAAIASVLAEASSILFFVIYTYATVDLKKYGMNRLRTFDPALLMRILSISCFT MLQYFLSMATWFVFFVAVERLGQRELAIANIVRSIYVVLLIPVNALATTTNSLVSNAIGA GGIQHVMPLINKIARFSFFIMLGLVAVSALFPQFLLSIYTSEAALITESVPSVYVICFAM LIASVANVVFNGISGTGNTQAALLLETITIIIYGSYIIFIGMWLKAPIEICFTIEIVYYS LLLITSYIYLKKAKWQNKKI >gi|222159197|gb|ACAB01000162.1| GENE 5 5479 - 6945 1192 488 aa, chain - ## HITS:1 COG:yidJ KEGG:ns NR:ns ## COG: yidJ COG3119 # Protein_GI_number: 16131548 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Escherichia coli K12 # 21 471 2 458 497 120 25.0 5e-27 MKTIYPFMGLALCGVAAQAQEKPNFLIIQCDHLTQRVVGAYGQTQGCTLPIDEVASRGVI FSNAYVGCPLSQPSRAALWSGMTPHQTNVRSNSSEPINPRIPENVPTLGSLFSENGYEAV HFGKTHDMGSLRGFKHKEPVAKPFTDPEFPVNNDSFLDVGTCEDAVAYLSNPPKEPFICI ADFQNPHNICGFVGANEGVHTDRPISGTLPELPANFDVEDWNNIPKPVQYICCSHRRMTQ ASHWNEENYRHYIAAFQHYTKMVSKQVDSVLKALYSTPAGRNTIVVIMADHGDGMASHRM 
VTKHISFYDEMTNVPFIFAGPGIKQQKKPVKQLLTQPTLDLLPTLCDLAGISVPVEKPGI SLAPTLKGEKQKKTHPYVVSEWHSEYEYVVTPGRMVRGPRYKYTHYLEGNGEELYDMKKD PGERKNLAKDPKYSKVLAEHRAMLDDYITRTKDDYRSLKVDADPRCRNHTPGYPNHEGPG VREILKRK >gi|222159197|gb|ACAB01000162.1| GENE 6 7232 - 9490 2003 752 aa, chain - ## HITS:1 COG:BMEI0003 KEGG:ns NR:ns ## COG: BMEI0003 COG1158 # Protein_GI_number: 17986287 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Brucella melitensis # 357 752 27 421 421 467 61.0 1e-131 MYNIIQLNDKNLSELQVIAKELGIKKADSYKKEDLVYKILDEQAIVGATKKVAADKLKEE RKNEEQKKKRSRVAPTKKEDKVVSTPKSGEVNKTKEATPVKAPQPSKKEESTNKEKEAPV VEAKAENATTAPKRKVGRPRKSADTEEKKEVENVTPAAPKVVETKPVVAEKTTETKEKAA PIQQPTAEKKAKSKPAAETNKPAAEANKPVAEKKVIDKPQKKAAPVIDEESNILSSVDDD DFIPIEDLPSEKIELPTELFGKFEATKTEPVQTAPEQPSHPQQQQQSQQQQAQQQRPRIV RPRDNNNGNNNVNNNSNNANNNNNNFQHNNNNQNQVQNQNQQRLPMPRATQQNHANENLP AQQQQQQERKVIEREKPYEFDDILNGVGVLEIMQDGYGFLRSSDYNYLSSPDDIYVSQSQ IKLFGLKTGDVVEGVIRPPKEGEKYFPLVKVSKINGRDAAFVRDRVPFEHLTPLFPDEKF KLCKGGYSDSMSARVVDLFAPIGKGQRALIVAQPKTGKTILMKDIANAIAANHPEVYMIM LLIDERPEEVTDMARSVNAEVIASTFDEPAERHVKIAGIVLEKAKRLVECGHDVVIFLDS ITRLARAYNTVSPASGKVLSGGVDANALHKPKRFFGAARNIENGGSLTIIATALIDTGSK MDEVIFEEFKGTGNMELQLDRNLSNKRIFPAVNITASSTRRDDLLLDKTTLDRMWILRKY LADMNPIEAMDFVKDRLEKTRDNDEFLMSMNS >gi|222159197|gb|ACAB01000162.1| GENE 7 9727 - 10563 574 278 aa, chain + ## HITS:1 COG:no KEGG:BT_1594 NR:ns ## KEGG: BT_1594 # Name: not_defined # Def: putative ferredoxin # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 278 42 319 319 493 84.0 1e-138 MTVNEAHLIYFSPTHTSKQVAEAIVHGTGIKNILPINVTQQIADEIVIPASALAIIVVPV YGGHVAPLAMERLQYIRGVDTPTVLVVVYGNRAYEKALMELDAFAIPHGLKVIAGATFIG EHSYSTDKHPIAVGRPDESDLAFATEFGKKIMEKIQAADSMDTLYPVDVRAIKRPSQPFF PLFRFLRKVIKLRKSGTPLPRTPWVEDENLCTHCGLCVVRCPAGAITKGDELHTDEAKCI KCCACVKACVKKARVYDTPFAALLSDCFKKQKLPQTIL >gi|222159197|gb|ACAB01000162.1| GENE 8 10548 - 11828 863 426 aa, chain - ## HITS:1 COG:CAC3204 KEGG:ns NR:ns ## COG: CAC3204 COG0037 # Protein_GI_number: 15896451 # Func_class: D Cell 
cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Clostridium acetobutylicum # 6 425 5 457 461 196 30.0 9e-50 MIQQRVTKYIEKEHLFSLDAKVLIALSGGADSVALLCILHTAGYHCEAAHCNFHLRGEES DRDELFVRQLCKRTGIHLHTIDFNTTQYAKEKQISIEMAARELRYDWFEKTRKECQADVI AVAHHQDDSVETILLNLIRGTGITGLLGIRPRNGAIVRPLLCVNREDIIHYLESIGQDYV TDSTNLEDEYTRNKIRLNLLPLMQEINPSVKKNLIDTSNYLNDVATIYNKCIEETKNNIV TTEGIRISDLVKEPAPEAILFEILHPLGFNSAQIKDIAFSLHSQPGKQFCSKKWRVIKDR KFLLIEAIESGNEILPPFQIIKEEKEYTPDFLIPRDKGIACFDADKLNGDIHYRKWQTGD TFIPFGMKGKKKISDYLTDRKFSISQKERQWVLCCGEHIAWLIGERTDNRFRIDETTKRV VIYKIV >gi|222159197|gb|ACAB01000162.1| GENE 9 11974 - 14460 2107 828 aa, chain + ## HITS:1 COG:MA3477 KEGG:ns NR:ns ## COG: MA3477 COG0370 # Protein_GI_number: 20092288 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Methanosarcina acetivorans str.C2A # 112 824 11 665 670 560 41.0 1e-159 MRLSELKTGEKGVIVKVLGHGGFRKRIVEMGFIKGKTVEVLLNAPLKDPIKYKVLGYEIS LRRQEAEMIEVISEEEAKELAEKTVYHEGLPEDLSVKEEDMKRLALGKRRTINVALVGNP NSGKTSLFNLASGAHEHVGNYSGVTVDAKEGYFDFEGYHFRIVDLPGTYSLSAYTPEEIY VRRHIIDETPDVIINVVDSSNLERNLYLTTQLIDMNVRMVVALNIYDELEASGNTLDYHL LSKLFGVPMLPTASKKNRGLDTLFHVVINLYEGVDFFDKQGNMNPEVLKDLTEWHDSLED RKNHEEEHLEDYVREHKKTGRVFRHIHINHGPDIEKAIEAVKSEVSKNEFIRHKYSTRFL SIKLLENDPDIERIVRTLPNADEIFHVRDKMSKRVQETMNEDCESAITDAKYGFISGALK ETFTDNHLEQAQTTKVLDSIVTHRVWGFPIFFLFMYLMFEGTFVIGEYPMMGIEWMVEQL GELLRNNMSEGPFKDLLIDGIIGGVGAVIVFLPNILILYFCISIMEDSGYMARAAFIMDK IMHKMGLHGKSFIPLIMGFGCNVPAIIASRTIENRKSRLITMLVNPLMSCSARLPIYLLL VGAFFPNNASLVLLSIYVIGIVLAVVMARLFSKFLIKGDDTPFVMELPPYRMPTAKSIFR HTWEKGAQYLKKMGGIIMIASIIIWFLGYYPNHDAYETVAEQQENSYIGQLGRGIEPVIK PLGFDWKLGIGLLSGVGAKELVVSTLGVLYADDPDADSVSLAERIPITPLVAFCYMLFVL IYFPCIAAIAAIKQESGSWKWALFAACYTTALAWLVSFAVYQIGGLFV >gi|222159197|gb|ACAB01000162.1| GENE 10 14457 - 14672 79 71 aa, chain + ## HITS:1 COG:no KEGG:BF1354 NR:ns ## KEGG: BF1354 # Name: not_defined # Def: 
hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 3 57 2 56 70 90 76.0 1e-17 MNNWQEWVVGVLVVLCIARVVYGIFLFFRRTRENQNPCDSCVSGCELKDMMEKKRRECDV KKKSTKKNCCG >gi|222159197|gb|ACAB01000162.1| GENE 11 15069 - 15830 410 253 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294646784|ref|ZP_06724406.1| ## NR: gi|294646784|ref|ZP_06724406.1| conserved domain protein [Bacteroides ovatus SD CC 2a] # 1 97 1 97 97 160 100.0 7e-38 MGTKIINISLISPLISHYQLIDLTPKSVQEDKSDTFPPYKYNLLIILNNCISKNILFDNF SMTITTFNYFVPKNIFFNGCSIAFAIFDNGIPKDIFFNRLSITFIVSNDSISKNILFDSL SIALIITNDGISKNILFDSLSITFIITNGDIPKNIFFNSLSIALIITNGNISKDILFNGL SIAFIITNSDIPKNILFNGLPIAFIITNDDISKDILFDSLSPLINSRILIIVFNKICTIA LYYIRSQNKLGSK >gi|222159197|gb|ACAB01000162.1| GENE 12 15859 - 16663 477 268 aa, chain + ## HITS:1 COG:no KEGG:BVU_3167 NR:ns ## KEGG: BVU_3167 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 268 1 269 295 292 66.0 8e-78 MNRIKGILYAAVSSSTFGLAPFFSLTLLLAGFSAFEVLSYRWGVATIALTLFGWCSGCSF RLEKKDFLVVLLLSLLRAVTSFSLLIAYQNIATGVASTIHFMYPLAVSLVMMYFFQEKKS LWVMFAVSMSLLGAALLSSGELEAKNGDTIVGLVAACVSVFSYAGYIVGVRMTRAVRINS TVLTCYVMGLGTVLYFIGALTTSGLQLVADGYTWLIILGLALPATAISNITLVRAIKYAG PTLTSILGAMEPLTAVVIGVFVFKELFT Prediction of potential genes in microbial genomes Time: Wed May 18 04:43:47 2011 Seq name: gi|222159196|gb|ACAB01000163.1| Bacteroides sp. D1 cont1.163, whole genome shotgun sequence Length of sequence - 3391 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 14 - 73 2.0 1 1 Op 1 . + CDS 110 - 544 257 ## BT_2347 transposase 2 1 Op 2 . + CDS 528 - 893 264 ## BVU_3680 hypothetical protein + Prom 897 - 956 2.5 3 2 Op 1 5/0.000 + CDS 976 - 1920 458 ## COG3436 Transposase and inactivated derivatives + Term 1949 - 2003 -0.6 4 2 Op 2 . + CDS 2064 - 2561 224 ## COG3436 Transposase and inactivated derivatives - Term 2727 - 2763 1.3 5 3 Tu 1 . 
- CDS 2836 - 3078 157 ## gi|237712680|ref|ZP_04543161.1| predicted protein Predicted protein(s) >gi|222159196|gb|ACAB01000163.1| GENE 1 110 - 544 257 144 aa, chain + ## HITS:1 COG:no KEGG:BT_2347 NR:ns ## KEGG: BT_2347 # Name: not_defined # Def: transposase # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 144 4 121 121 84 38.0 2e-15 MMQRRKHMMSREKFISVLFRQQQSGLSIADFCENEGYSRSRFYLWKQKYGITERELLAEA SRLGVKDSFVPIVINGDTPSPGISHEAPLPSPQPSVPFPLKGKDSSEISLELPNGLKLKF KGSSGCEAALNLISKIYNANVLPK >gi|222159196|gb|ACAB01000163.1| GENE 2 528 - 893 264 121 aa, chain + ## HITS:1 COG:no KEGG:BVU_3680 NR:ns ## KEGG: BVU_3680 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 119 1 118 122 162 65.0 3e-39 MFCLNDSMRYFLCPGYTDMRKGMFTLCGLVHERMGGDIRSGEVFIFYNRFRTKIKLLHAE PGGLVLYEKLLEEGTFKIPAYNPATRSYPMTWSDLVVMVEGINEDVSKGRQRRLSNLKKH W >gi|222159196|gb|ACAB01000163.1| GENE 3 976 - 1920 458 314 aa, chain + ## HITS:1 COG:SMb21085 KEGG:ns NR:ns ## COG: SMb21085 COG3436 # Protein_GI_number: 16264412 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Sinorhizobium meliloti # 23 289 36 298 550 95 26.0 9e-20 MAKEDATGQAVEPSMEQLLDINRKQSEIISAQARTIEELRGTIVELNASLAWLKRKVFGK MSEKCKPVDNGDPKLPFDYGDLEQIEAEIEDARSRAAEQITVPKSRAANKTPRRNRVVMD NLPVVTVVIEPENVDLSRYVKIGEEHTRTLEMKPGYLYVKDTVRPTYALKNEMEAVDNGE KAVITAPMPLMPIYKGMPGASMLAEVLLQKYEYHVPFYRQVKQLEHLGVKLSKNTLDGWF RPVCELLRPLYLELRKKVLAADYFQVDETTLPVINHDRHKAVKEYIWIVRAAVPRLPSSI TTTDRVHKKLRSGF >gi|222159196|gb|ACAB01000163.1| GENE 4 2064 - 2561 224 165 aa, chain + ## HITS:1 COG:ECs3866 KEGG:ns NR:ns ## COG: ECs3866 COG3436 # Protein_GI_number: 15833120 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 2 164 303 457 463 102 37.0 3e-22 MQGLKFIQDMYNVEYMADKQELPYEGRAELRQRLSKPMLDSFELWLKNTYPKVLKRSLMG KAIAYAYSLLPRMKPYLHDGRIFIDNNRCENALRPLVISRKNMLFCGNHEAAENTAIICS 
LLGSCKERGVNPREWLNDVISKLPYYLAPKSDRDLKELLPDVWRK >gi|222159196|gb|ACAB01000163.1| GENE 5 2836 - 3078 157 80 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237712680|ref|ZP_04543161.1| ## NR: gi|237712680|ref|ZP_04543161.1| predicted protein [Bacteroides sp. D1] # 1 80 1 80 80 146 96.0 5e-34 MMDGILSNAFRKLRAAATGSAIASMQPMSYKDTDTTGEPVEHVTEPDKQIAHSALGDYGP FSVKYATKDFLSYGNKKFRI Prediction of potential genes in microbial genomes Time: Wed May 18 04:44:01 2011 Seq name: gi|222159195|gb|ACAB01000164.1| Bacteroides sp. D1 cont1.164, whole genome shotgun sequence Length of sequence - 13605 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 5, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 319 - 378 3.4 1 1 Op 1 . + CDS 483 - 1688 632 ## gi|237712847|ref|ZP_04543328.1| predicted protein 2 1 Op 2 . + CDS 1722 - 1991 98 ## gi|237712848|ref|ZP_04543329.1| predicted protein 3 1 Op 3 2/0.000 + CDS 2033 - 2233 87 ## COG3666 Transposase and inactivated derivatives + Term 2254 - 2287 0.5 + Prom 2252 - 2311 2.3 4 1 Op 4 . + CDS 2352 - 3338 287 ## COG3666 Transposase and inactivated derivatives 5 2 Op 1 . - CDS 3340 - 5499 1309 ## gi|237712850|ref|ZP_04543331.1| predicted protein - Prom 5522 - 5581 4.6 6 2 Op 2 . - CDS 5583 - 6473 307 ## gi|237718917|ref|ZP_04549398.1| predicted protein 7 2 Op 3 . - CDS 6528 - 7034 296 ## gi|237718917|ref|ZP_04549398.1| predicted protein 8 2 Op 4 . - CDS 7035 - 7943 517 ## gi|262409680|ref|ZP_06086219.1| predicted protein 9 2 Op 5 . - CDS 7991 - 8914 435 ## gi|237712853|ref|ZP_04543334.1| predicted protein - Prom 8953 - 9012 3.8 - Term 9075 - 9118 -0.8 10 3 Tu 1 . - CDS 9153 - 10319 1050 ## gi|237712854|ref|ZP_04543335.1| predicted protein - Prom 10340 - 10399 10.0 - Term 10332 - 10386 2.1 11 4 Op 1 . - CDS 10422 - 11180 460 ## BF4231 hypothetical protein 12 4 Op 2 . - CDS 11131 - 12147 491 ## BF4231 hypothetical protein 13 4 Op 3 . 
- CDS 12189 - 12755 379 ## BF4435 hypothetical protein 14 5 Tu 1 . - CDS 13337 - 13603 228 ## BT_1894 TPR repeat-containing protein Predicted protein(s) >gi|222159195|gb|ACAB01000164.1| GENE 1 483 - 1688 632 401 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237712847|ref|ZP_04543328.1| ## NR: gi|237712847|ref|ZP_04543328.1| predicted protein [Bacteroides sp. D1] # 1 401 1 401 401 793 100.0 0 MKKIHLLHYLLLCYMVLAVATSCGDDEDDYAPDNVNDPEKLTDPLPTTDQLGVTHNNTLY YLSFWSKEDHMISENMINRYKEAVPYAPGKKMEVGDAFFFTRQDIDRIQSDTDLLADIKD MFHKRSIVMMMEGGTNEDFNTVCTLLDCYNPYAKEDESYSDELPLWVFSGPLPSASGFYS KLNALSTATGADGKESVQTINEYGQGFHCDMTSQSLQRALEPKAPSNGSSDLKNLISAYI VIGQNSYTAPPVHGHEGKGTTDNFQVEAKIWTAYSKDRKEHFYLVNLGFIAYVEPSFYGE WHTKVSTLKMKGYGFCLTDVNLEFAPVHPNGAIIHAHSPQTTERQSSYTSSVSFELGGSV SVTGPEISGGISISNSHTETINDIEVINRTNPAQGSPQLRW >gi|222159195|gb|ACAB01000164.1| GENE 2 1722 - 1991 98 89 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237712848|ref|ZP_04543329.1| ## NR: gi|237712848|ref|ZP_04543329.1| predicted protein [Bacteroides sp. 
D1] # 1 89 1 89 89 172 100.0 7e-42 MFCYASSAVNGGAKAGITTCSLSTDFIISVPDGADNEWKVSMYPTLKTYFFSDLFGFHDQ YETLFDDAAESWIGIEKVFNLPTVDLKDE >gi|222159195|gb|ACAB01000164.1| GENE 3 2033 - 2233 87 66 aa, chain + ## HITS:1 COG:PA1368 KEGG:ns NR:ns ## COG: PA1368 COG3666 # Protein_GI_number: 15596565 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Pseudomonas aeruginosa # 10 63 31 84 474 60 51.0 7e-10 MRKTFLVMSRLIDLFVDILPIDELGFKHVKLQSEGRPPYNPATLLKLYLYGYKHSIRSSR KLEHFL >gi|222159195|gb|ACAB01000164.1| GENE 4 2352 - 3338 287 328 aa, chain + ## HITS:1 COG:CC0273 KEGG:ns NR:ns ## COG: CC0273 COG3666 # Protein_GI_number: 16124528 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Caulobacter vibrioides # 1 275 130 412 481 125 28.0 1e-28 MLKDMGLIEGETIGIDSFKIFAQNSLRNNYTQKKIDRHLEYIDNRIEEFEVALDKTDKEE EKELLKSKIKLQQDRRKKYETLDTELKNSNDTQISLTDKDSRALMLTNNVSGVEYAVQAA FDSKHKLLVHSHIGASTDKRELSTAALTVQELLQLDSFNTLSDAGYTSGDQLQACKYSGI CTYSSPMPSTSPNSNSIPLAEFHYINDGDYYICPCGEQMTTTGKWRIRPNYRSKVYKTSA CVNCSIREKCTQNQNGRVIERSEYQDVIDENNARVMNHLLQGRAADIRNVAQLPPLHRPD GNCHSPPKTVQTKRLKQLIVSWCCLFNG >gi|222159195|gb|ACAB01000164.1| GENE 5 3340 - 5499 1309 719 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237712850|ref|ZP_04543331.1| ## NR: gi|237712850|ref|ZP_04543331.1| predicted protein [Bacteroides sp. 
D1] # 1 719 1 719 719 1465 100.0 0 MNIGFQTSGLSGFTRVAYLWRRHASAVLWIAMAALLPTACSDDAEQSVPQANGHTAVIYL NFKTGASKQAAEMNTSQSSETRAPLTRAIDENGINTVDVLSFKVDPADPTNIKKGTFFYR AQGTYDIGTQTVRVQLMGAPEHQTLVVLANVREQVNTLAAAFGEQKEGVMNRLAFDVGTD AAPDFTNGIPMWGELPNQAVGEGFSPSGAPQEVTMIRAVAKFTLINPLESDLKGFFYYYN DLRLYNYRAKGRIAPDNYNVTGKQVNTPTVPVGARQTRGTFTSLNSDVSPGNIGIFNGSQ KTFYLCEVDNKNRPSGGNALDDLCLVMHVKPYLTGSKPSVDGYYRLDFKDYGSGVPMDIL RNHEYKVQVESVDGLPAQTPEEAFKGNHTLKCRIVPWNEVQEEVIVRGNKRLTVDKRVFA FDGDIQIGVASLPVNITTENTNWQITGKPSWISLSQTSGTAGVPATVTISANSKNFTQNM RSAVLSIRTGNSANELEYRLHVSQQPACGFGNSMKLMRIGGSNYQTTRMGNGGVCWMIDA SREGGPNMAARPRSYSLAPAMVGPIYYGSIPQRACPDGWRLPTAWEVINLIQSITGSYPV VFTRLIESGAGIAKSFPAISAPDGGYEEGVLAIWTQTPMIYSSPQRHSPASVLHTGIYWN PDIGAGKTAPTYRVNRNRWALMHGAGNDIHSPAGLPAGFNEYYGYHYLLGGSVRCVHDQ >gi|222159195|gb|ACAB01000164.1| GENE 6 5583 - 6473 307 296 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237718917|ref|ZP_04549398.1| ## NR: gi|237718917|ref|ZP_04549398.1| predicted protein [Bacteroides sp. 2_2_4] # 1 296 188 483 483 568 99.0 1e-160 MDYVAEYNLAGGSIYNSPFISSVPPGISPTAAQTDPNLHWASSHSNDQSGYYNWYVLTGE NNDTYNPNAKKLFDDVFFKLGHPGYGYHLPSRWELTGVFSYSGNTQYDSPTNTSNVNEAI EFGGIKKTFANDYFSSGNGVCYALRFKQGTGNPIDDSSLSDFPLATDNNMVCAYRYTRVG SFANHDFTSLLKVDCVYLGSAFTGNISTINNDSWWDSHTSEAVVRIFPAAGYISFPTFIS SGLLEARGEYGRYWSSTEFPSLLGNAWNVSFYSYSAFANYRDVKHHGFSVRLFADK >gi|222159195|gb|ACAB01000164.1| GENE 7 6528 - 7034 296 168 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237718917|ref|ZP_04549398.1| ## NR: gi|237718917|ref|ZP_04549398.1| predicted protein [Bacteroides sp. 2_2_4] # 1 168 1 168 483 325 100.0 5e-88 MDYTWDGIKGGGTSPYEKGKYMPILRNHRYIFTIKEVKGPGFATLNEAVTSPDNFTNHNI VVVPIVIDAFTDITFNESGHFLAVTRTAMTLQGKHDATSTQNKFSVRTNYPSGWKVGAYN ADGTTISVSNSWLKPSQNSGAAGATLTNPVTEELQAITNGKGFKDGYL >gi|222159195|gb|ACAB01000164.1| GENE 8 7035 - 7943 517 302 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262409680|ref|ZP_06086219.1| ## NR: gi|262409680|ref|ZP_06086219.1| predicted protein [Bacteroides sp. 
2_1_22] # 1 302 9 310 310 543 100.0 1e-153 MPSSSIIIPFRNLLSVLLLLPLVLSCRQDELILPVPPVSGKGETAEVTLSVRIPDFRAAN TRGVDEKGITEITVLMFADEGGTEKVKVKYDIPGSSLHTLSGSSDTKYFSVPVIAGRYKR IALIANAQTELANITAGSTYDALKQVEVVGRFGQEGTGTYIPMYGEHAPVGGFELKAGVS QTIAQEIPLIRMLAKVDIINPTTSGATTAAGKVYFVNSVGNGRVWVDLATYNTTASQSGY MTPTLPATSQPAVSGGHPLEGTANTASPNVITYYLNEQSAISGGSRPCIVINLAYQGREC YY >gi|222159195|gb|ACAB01000164.1| GENE 9 7991 - 8914 435 307 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237712853|ref|ZP_04543334.1| ## NR: gi|237712853|ref|ZP_04543334.1| predicted protein [Bacteroides sp. D1] # 1 307 1 307 307 620 100.0 1e-176 MKLYKAIRFLMLSGILTLVLSRCIRESLEGCPLDTLLRFTYKPRGSAADLFGERVQRVRL CVYRPDGSIEHMQTLEKSELDRLQGVEMYLPQGNYTVVLWANASDAQTKLGGFSEGETIR NLSASHPHAETSASIPTLDRLLFAVTPLTVSEHETGDHTVNFSPGTIRVSLHLDGVSVQP VVHITGMESALRPVFDETSGTWRLQSVSRNKTFQPAVAYDVTNRHANATTDIPRFKADTP GMVEIIDPTTGNHIVPPISLAGLIARYHIPMEEDIEVTIPIEITFTNGHAKITIRNWENQ NVKPGGV >gi|222159195|gb|ACAB01000164.1| GENE 10 9153 - 10319 1050 388 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237712854|ref|ZP_04543335.1| ## NR: gi|237712854|ref|ZP_04543335.1| predicted protein [Bacteroides sp. 
D1] # 1 388 1 388 388 660 100.0 0 MIMKKSLLSVFTMVAMAAAFTACSNDDIEVPDVKNGEARSVFMKLELPTLTRSTEPPVAS GTVATVSNLHVYFHDGTSILKYVNATTTTPITIANLTTGTQITDVPAAATTVTVCGNIPA GTSLPTGGTVAALKAVQLEITSQSAVADVVLSGDDKPLQTWTTGSPALPYAPGITDGDKY AEVEIGPAVARVEIEGLATTASSAVDGFTLEGIYVNNFFEKFNLAGTVVGTKVQYGATPA AYAQGQGLYTPANAGKLFDQSAVAATGIPKEVIPPTAGQRWAYQVVPNGNSTDANEQLQL VFKLSNLAAKAGSSVNFGTGDQFITVRGFKDGSGNIVELEKGKIYTISKADFTFDESNLS TIPNTSAVSVWLKVTVKAWTVVPVKPNL >gi|222159195|gb|ACAB01000164.1| GENE 11 10422 - 11180 460 252 aa, chain - ## HITS:1 COG:no KEGG:BF4231 NR:ns ## KEGG: BF4231 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 249 329 568 571 204 46.0 2e-51 MDISRHRYFYDRIAENEMNDRNRDEIRRRMIPFPYIDSVMVRQNSDSVSGHDYIYNYVYS LPVTDGMKKLRVRLESIVEATDRSTWRPAASDTLLFIVASLSDLVDRSALDQYVIASAET DSLAASGPVYTPQGEEYAEALRLLSERQYRQALPILEKRPDYNTALCLTQLGYHKEASAL LDQLPVDSRKEYLHAVVSARQGDDYLAVEHMLAACRMNPNLVLRIPLDPELSDLIPKFFG LRMELDRIAEGK >gi|222159195|gb|ACAB01000164.1| GENE 12 11131 - 12147 491 338 aa, chain - ## HITS:1 COG:no KEGG:BF4231 NR:ns ## KEGG: BF4231 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 19 308 27 314 571 250 44.0 6e-65 MNKYLKSHITAAAFCAGVLLAVSCSTSREPRTMRLFMQSGRPNIFLPHEESAPSGVLAEQ VGFTTSSDTLPGEFEEVSDSTDDVWKTIHLGGVDIVAARTVVKQVTMREGSVRLEFNIHV PSLLIDSCWRVTLTPMLCTSDSSGFPLPPVVLSGSSFTRMQEADYKAYRLFLNGIVDPSA YDSVYLDRKGINKDIACRQRFFYELYGKERDRQLAYEKWKRLALERQNFWNTRTQANRAT LRHRLERKRIEESVRRYVAGRDTLGLWASYDRKYHRTASFWPMYRLERELTADRVPSKFR DLYNGGAQPVQHTQLCPHRHRFDGHLPPPLFLRPDCRK >gi|222159195|gb|ACAB01000164.1| GENE 13 12189 - 12755 379 188 aa, chain - ## HITS:1 COG:no KEGG:BF4435 NR:ns ## KEGG: BF4435 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 5 188 4 186 186 186 48.0 4e-46 MNLNRKKYFLCFACVLACLVYSDKVSAQFYSVRTNLVGLGTGNLNVEGSMAFSTHWSAHL PVQYNPFRLWDDGKLKNFTVQPGVRYWMRETYGRGCFIGLHGVFSIYNAGGLFGHRYRYE GTAFGGGLSAGLVRPLSRRWNLEFELGLGVVWADWERYRCVNCGKRTGKGSGVYVTPTRT 
AVNLVYLF >gi|222159195|gb|ACAB01000164.1| GENE 14 13337 - 13603 228 88 aa, chain - ## HITS:1 COG:no KEGG:BT_1894 NR:ns ## KEGG: BT_1894 # Name: not_defined # Def: TPR repeat-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 88 464 551 551 151 85.0 8e-36 NNGILQELEKIVNTCHDNAMYKLKEDFPTMKASDTRLLCYIFVGFSPQVISLFMKDTVAN VYARKSRLKSRIKSTETANKELFLSLLG Prediction of potential genes in microbial genomes Time: Wed May 18 04:45:57 2011 Seq name: gi|222159194|gb|ACAB01000165.1| Bacteroides sp. D1 cont1.165, whole genome shotgun sequence Length of sequence - 13333 bp Number of predicted genes - 13, with homology - 10 Number of transcription units - 9, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 250 - 1488 497 ## BT_0727 hypothetical protein - Prom 1512 - 1571 6.4 - Term 1748 - 1802 6.4 2 2 Op 1 . - CDS 1837 - 2853 1044 ## BF3435 hypothetical protein 3 2 Op 2 . - CDS 2889 - 5228 849 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 5435 - 5494 2.8 4 3 Tu 1 . - CDS 6041 - 6142 62 ## - Prom 6204 - 6263 4.5 5 4 Op 1 . - CDS 6303 - 6395 61 ## 6 4 Op 2 . - CDS 6457 - 7776 581 ## FP2425 lipoprotein precursor - Prom 7797 - 7856 4.6 7 4 Op 3 . - CDS 7860 - 8297 334 ## COG3015 Uncharacterized lipoprotein NlpE involved in copper resistance - Prom 8324 - 8383 8.5 8 5 Tu 1 . - CDS 8449 - 9033 503 ## gi|237712864|ref|ZP_04543345.1| predicted protein - Prom 9255 - 9314 11.4 - Term 9361 - 9399 -1.0 9 6 Tu 1 . - CDS 9623 - 9814 94 ## - Prom 9898 - 9957 2.2 + Prom 10667 - 10726 5.8 10 7 Tu 1 . + CDS 10778 - 10951 115 ## gi|293372897|ref|ZP_06619270.1| conserved domain protein - Term 11391 - 11439 1.8 11 8 Op 1 . - CDS 11592 - 11765 100 ## gi|262409697|ref|ZP_06086236.1| predicted protein 12 8 Op 2 . - CDS 11824 - 12765 448 ## gi|237712866|ref|ZP_04543347.1| predicted protein - Prom 12794 - 12853 7.8 13 9 Tu 1 . 
- CDS 13065 - 13331 178 ## BT_1894 TPR repeat-containing protein Predicted protein(s) >gi|222159194|gb|ACAB01000165.1| GENE 1 250 - 1488 497 412 aa, chain - ## HITS:1 COG:no KEGG:BT_0727 NR:ns ## KEGG: BT_0727 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 411 1 393 393 465 57.0 1e-129 MEGYNKKVIMSYWTICRMKTTKKIYVKVSIIFLLGGILLFSSCEKEENQKIEANRKTLFM YLPWSSNLTNYFYNNISDLEKCITKIGLNNEKVIVFISTGSTEAMMFEIVSSHGRCKREI LKKYKSPPFTTIDGITAILNDVKAFAPASVYTMIIGCHGMGWLPVYEMKVRSAPHMKMHW EYQGVPLTRYFGGLTAEYQTDIKSLASGIANAGMKMEYILFDDCYMSSIEVAYELKDVTK YLIGSTSEMMAYGMPYAAIGEYLLGNPDYQSGCEEFYNFYSTYEIMPCGTLAVTDCSELE NMAAIIKSINSKYSFDKSLRGTIQRLDGYTPVIFYDFADYITSLCNDPILLNQFREQLNH LVPYKTHTKNFYTMAKGIIPINTFSGITTSDPSDNPMTVLKENTLWYKAAHN >gi|222159194|gb|ACAB01000165.1| GENE 2 1837 - 2853 1044 338 aa, chain - ## HITS:1 COG:no KEGG:BF3435 NR:ns ## KEGG: BF3435 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 338 1 338 338 575 79.0 1e-163 MKTNVCKIALMFLIGTALSLLFNSCSKDPVIPENETDNKLHEDPSKMTIRLVECHLHADW NEIQKVGGPHQNPESPAKHMKRIQEITYELKAGKGWRLAEGSQSKFYVQKNGDYYTYGKY TPAPVYLMFIYYYNAKGDLMNSQFIENGQDNIHQHFFTPENVKPTFDGQPEADDNEPQKL VDYLYVDTTPWDKTKHSKEAEITGDSNPIGLKGVIRFLKDRKEFDLKIRLYHGYKSKGNP ETGTFDPFYKPSGILIQRGTWDINLNIPVVVFWSREETVGVDEDTNPEGVEEDGLDEKSN RAIHSIMGTFNLTWKEALEEFIIYTYKSGDVEAGAIWL >gi|222159194|gb|ACAB01000165.1| GENE 3 2889 - 5228 849 779 aa, chain - ## HITS:1 COG:XF0384 KEGG:ns NR:ns ## COG: XF0384 COG1629 # Protein_GI_number: 15836986 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Xylella fastidiosa 9a5c # 439 722 357 630 681 63 25.0 2e-09 MTFHAYFQYVCLFIVWLCAISPTRMYAQKYRQQSDTHRTLLVVDKETGRPIEGAYILLEN QPLASSTGGRIIIPWKTNSKDTVLVQSLGYKSQYIYLSEAFKKVNIQTVYLYPEIKILEE VIITGECSGVTRNTVSNNLTSFAINRALGTSLTSLLEQVSGVSSISTGTTVAKPVIQGMY GNRILIINNGSRQTGQQWGVDHAPEVDMNGNASIRVIKGSDAVRYGSEALGGIIVMEQAP 
LPFRKKALKGKTSILYGSNGHRYVLTGQLEGTFPFLRDLAWRVQGTYSNSGDRSTANYLL NNTGAREHHTSALLGYDRGRLRIEGFYSHFYNRTGVMFSAQMGSEDLLAERIRLGRPLYT DPFTRSIAYPYQKVTHQTAIGKMKFSMRNAGNLYWQSSWQKDDRQENRIRRLGSDIPAVS LHLNSLQNSLCWKLNYNSWQTEVGGQIMFIDNHSQAGTGIVPVIPNYTETQMGIYGIGKY NYSKGGIEAGIRFDGQETRASGYDWTGSLYGGTRKFNNVSYSLGGHHHFSDQWKLTTNFG LAWRAPHVYELYSNGNELGSGMFVKGDSTMHSERSYKWISSISYSNKVFSARMDGYLQWI SGYIYDEPKKETITVISGVYPVFQYKQTPAFFRGMDFDFHFMPINSWDYHLIASFIRANE QTTGNYLPYIPSSRFSHDLSWLHETKSHSKLRLSIRHRFVAKQTRFDSNTDLIPYTPPAY HLLGFAASFECPVKYGYKLQFMIAADNILNKEYKEYTNRSRYYAHDMGSDVRCVVNWSF >gi|222159194|gb|ACAB01000165.1| GENE 4 6041 - 6142 62 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVTGTFERDFDINPFSYALGMSRTLRPRNINVN >gi|222159194|gb|ACAB01000165.1| GENE 5 6303 - 6395 61 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MISKTKIFLFFLQELYLLMHNNKKKRGSKK >gi|222159194|gb|ACAB01000165.1| GENE 6 6457 - 7776 581 439 aa, chain - ## HITS:1 COG:no KEGG:FP2425 NR:ns ## KEGG: FP2425 # Name: not_defined # Def: lipoprotein precursor # Organism: F.psychrophilum # Pathway: not_defined # 111 397 49 337 371 87 25.0 2e-15 MLLVACGENDPIESRNRHPRQQEESHSEQSGSVPAVAKEDVAAYFRFDMSRDINVALGLV EAQKETLQINGKMIKITEAVVKRKDYQKGSFVLYVTGTVNGSPFKNEFTFTGFVSKPDDY QMVNGAQAEWKANADYYAKLDFDSFYRLHKVNMFTIENLGRLVDFYSINVGGEKYLFTLE DLEKTVLEDIKYEQNQLSFYLRYNNSRSKKRISLLFDKNKYYEGKVTVNTDFIKQKYMRG IYENPSLFNGHIFSYDESAYAVEISTTLKEKSDTGNMLTLHLSMYAKGNGGLLLAEFDKI FTGFKPLSELKNELIAYSTPDLHEFVKNKLKNSNYGDVKNKFSNSIDRWIQYVEFVIKDH DSLQWEKNKWSKNVLNGLFSSVKDIYLDNVRFELNSAQLKEIGGKRFLYIVFQMIGANDL VLTSDAILFNMSIHIPSSL >gi|222159194|gb|ACAB01000165.1| GENE 7 7860 - 8297 334 145 aa, chain - ## HITS:1 COG:YPO1067 KEGG:ns NR:ns ## COG: YPO1067 COG3015 # Protein_GI_number: 16121368 # Func_class: M Cell wall/membrane/envelope biogenesis; P Inorganic ion transport and metabolism # Function: Uncharacterized lipoprotein NlpE involved in copper resistance # Organism: Yersinia pestis # 48 125 52 129 242 61 36.0 6e-10 MKLIKLHLLVAAMIIVTSGCVGSGKKYKQQENSAAGVVDSKNKDCYGTYEGILPCADCGG 
IKTTLKINSDATYDLRSEYLGEENGIFEESGVYNMIGENIIELVTPSSGAKTYYKVLDDA VVLSDSTEIFNNSGRLAAQYILKRQ >gi|222159194|gb|ACAB01000165.1| GENE 8 8449 - 9033 503 194 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237712864|ref|ZP_04543345.1| ## NR: gi|237712864|ref|ZP_04543345.1| predicted protein [Bacteroides sp. D1] # 1 194 1 194 194 259 100.0 6e-68 MKTKPFLKSMALVICCLTSFVLVSCDKDDDKPNLKISPNKVEVAVGKTVTVSVSNGTAPY TTKSSDAKIAVAKSDKNAIVITGVKDGSAMITITDRKGVTGKVAVTVRTVKETPKTPKGL DFDKQSITVSVGKDGIVTVKGGAQPYSAVAKDVNIATVSVKENKVNIRGVKAGKTTITVT DKDKKTGTINVTVK >gi|222159194|gb|ACAB01000165.1| GENE 9 9623 - 9814 94 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSYFKIILVTTTDFCMVPHFNILKFVSIPSENVLSWTCEYLSLAVQEILCYRRFIVYGLN TRF >gi|222159194|gb|ACAB01000165.1| GENE 10 10778 - 10951 115 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293372897|ref|ZP_06619270.1| ## NR: gi|293372897|ref|ZP_06619270.1| conserved domain protein [Bacteroides ovatus SD CMC 3f] # 7 57 1 51 51 87 100.0 2e-16 MNTVFQMYYFVKSRKGDMFYKGDFIYLYTVDFASELNGFVFSCLNNSLEGLNIKEIC >gi|222159194|gb|ACAB01000165.1| GENE 11 11592 - 11765 100 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262409697|ref|ZP_06086236.1| ## NR: gi|262409697|ref|ZP_06086236.1| predicted protein [Bacteroides sp. 2_1_22] # 1 57 1 57 57 75 100.0 1e-12 MLIWFLTIIPVGIPLCAIIGLMYGIKNKDTIFTKCSAVALFVGIACIVYTLLVIKST >gi|222159194|gb|ACAB01000165.1| GENE 12 11824 - 12765 448 313 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237712866|ref|ZP_04543347.1| ## NR: gi|237712866|ref|ZP_04543347.1| predicted protein [Bacteroides sp. 
D1] # 1 313 1 313 313 617 100.0 1e-175 MQKKNLFGILILIAYFASCNQIHKTHYENGWYHILSQQKDSIAKESIVTVKDFVSLRMDS DENGTCVIVGQISKHKLKKWAKETEKAIGKHIAFVLDDTVITNPKVNARIENGVFQISLP HGYDLKNIYNKIRKEKIDSIESIFKGWEKDSIYYSSKEKADSIVFAIDYWEASEWVDMSV NPEDHYWWGDLDTATYSKLESALCEEIAKPNFSSRAEDYMKLDTYKRYKAYLCENADYIN LMFQGFLFTEPATGLCGFLVDDIVRSRYPTAPSIRVMTLKIDNQDDEMFAKLKYQKAIWR LMNKERDAKSNKQ >gi|222159194|gb|ACAB01000165.1| GENE 13 13065 - 13331 178 88 aa, chain - ## HITS:1 COG:no KEGG:BT_1894 NR:ns ## KEGG: BT_1894 # Name: not_defined # Def: TPR repeat-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 88 464 551 551 149 82.0 3e-35 SNEILQELEKIVNTCHDNAMYKLKEDFPTMKTSDIRLLCYIFVGFSPQVISLFMKDTVAN VYARKSRLKSRIKSAKIVNKELFLNLLG Prediction of potential genes in microbial genomes Time: Wed May 18 04:47:00 2011 Seq name: gi|222159193|gb|ACAB01000166.1| Bacteroides sp. D1 cont1.166, whole genome shotgun sequence Length of sequence - 27960 bp Number of predicted genes - 28, with homology - 28 Number of transcription units - 12, operones - 8 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 324 - 383 9.9 1 1 Op 1 . + CDS 472 - 726 227 ## BT_1391 hypothetical protein 2 1 Op 2 . + CDS 683 - 1600 704 ## BT_1391 hypothetical protein + Term 1727 - 1763 -0.3 3 2 Tu 1 . - CDS 1783 - 1944 117 ## gi|237712868|ref|ZP_04543349.1| predicted protein - Prom 2124 - 2183 7.5 + Prom 2118 - 2177 3.7 4 3 Op 1 . + CDS 2204 - 2830 275 ## BT_0507 TetR/AcrR family transcriptional regulator 5 3 Op 2 . + CDS 2827 - 3060 68 ## gi|237718938|ref|ZP_04549419.1| predicted protein 6 3 Op 3 . + CDS 3117 - 3347 218 ## gi|237712870|ref|ZP_04543351.1| predicted protein + Term 3370 - 3408 7.3 + Prom 3375 - 3434 1.6 7 4 Tu 1 . + CDS 3462 - 4277 520 ## COG0428 Predicted divalent heavy-metal cations transporter + Prom 4412 - 4471 2.0 8 5 Op 1 . 
+ CDS 4491 - 6857 1743 ## COG1033 Predicted exporters of the RND superfamily 9 5 Op 2 35/0.000 + CDS 6859 - 8613 206 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 10 5 Op 3 . + CDS 8632 - 10374 202 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 11 5 Op 4 . + CDS 10414 - 11199 636 ## PG1179 hypothetical protein 12 5 Op 5 . + CDS 11199 - 12458 900 ## PGN_0948 hypothetical protein + Prom 12586 - 12645 3.1 13 6 Op 1 . + CDS 12674 - 13129 382 ## PG0222 histone-like family DNA-binding protein 14 6 Op 2 . + CDS 13148 - 15490 1643 ## COG4206 Outer membrane cobalamin receptor protein 15 6 Op 3 . + CDS 15523 - 16761 821 ## BF0627 hypothetical protein + Term 16867 - 16909 9.6 - Term 16932 - 16965 -0.2 16 7 Tu 1 . - CDS 17078 - 17332 85 ## gi|237718950|ref|ZP_04549431.1| Tnp167B - Prom 17388 - 17447 1.8 17 8 Op 1 . + CDS 18017 - 18190 123 ## BT_2450 hypothetical protein + Prom 18195 - 18254 3.0 18 8 Op 2 . + CDS 18343 - 18855 200 ## BT_2450 hypothetical protein + Term 19085 - 19143 12.6 19 9 Tu 1 . - CDS 20806 - 21348 383 ## COG0622 Predicted phosphoesterase + Prom 21448 - 21507 6.1 20 10 Op 1 . + CDS 21631 - 22566 555 ## BF3018 hypothetical protein 21 10 Op 2 . + CDS 22563 - 22790 234 ## BVU_0970 hypothetical protein + Term 22815 - 22852 7.1 + Prom 22931 - 22990 4.5 22 11 Op 1 . + CDS 23041 - 23820 399 ## BF3019 conjugate transposon protein TraA 23 11 Op 2 . + CDS 23832 - 24245 339 ## BT_2467 hypothetical protein 24 11 Op 3 . + CDS 24250 - 25002 448 ## gi|237712888|ref|ZP_04543369.1| predicted protein 25 12 Op 1 . + CDS 25523 - 25720 176 ## Fjoh_3006 hypothetical protein 26 12 Op 2 . + CDS 25732 - 26082 268 ## BF0124 hypothetical protein 27 12 Op 3 . + CDS 26045 - 26215 119 ## PGN_0065 conserved protein found in conjugate transposon TraG 28 12 Op 4 . 
+ CDS 26312 - 27959 1135 ## PGN_0065 conserved protein found in conjugate transposon TraG Predicted protein(s) >gi|222159193|gb|ACAB01000166.1| GENE 1 472 - 726 227 84 aa, chain + ## HITS:1 COG:no KEGG:BT_1391 NR:ns ## KEGG: BT_1391 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 77 1 77 375 107 68.0 1e-22 MKKIAILLVFTLGTFIVNAQSVVEGTKFMDNWSVGVNAGAVTPLTHASFFKGMRPGFGVG ISKQLTPSFRFGISGNGIYQYNSE >gi|222159193|gb|ACAB01000166.1| GENE 2 683 - 1600 704 305 aa, chain + ## HITS:1 COG:no KEGG:BT_1391 NR:ns ## KEGG: BT_1391 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 305 73 375 375 513 84.0 1e-144 MEFQGMGYINTTPSKTAFDVLDVSLLSKINLINLFASYRGKPRLFEIEAIAGMGWLHYYV NGKGDDNSWSTRLGLNLNFNLGETKAWTLGIKPAIVYDMQGDFNQAKSRFNANNATFELT AGLTYHFKMSTGNHYFTKVKVYNQSEIDDLNVAINALREQVGSRDRELNNANQRISGLHK ELEECRTKVVPIETVVKTARVPESIITFRQGRSSVDASQLPNVERVASYMKKYPDSKVII KGYASPEGNVEINAKIATARAEAVKTILVNKYKINTSRITAEDQGVGDMFTEPDWNRVSV CTIED >gi|222159193|gb|ACAB01000166.1| GENE 3 1783 - 1944 117 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237712868|ref|ZP_04543349.1| ## NR: gi|237712868|ref|ZP_04543349.1| predicted protein [Bacteroides sp. 
D1] # 1 53 1 53 53 78 100.0 1e-13 MSKKIKDQEVLLVKDQKDDSLRAVVGADGKSGLKLHSFKVSFESVVASVHVYQ >gi|222159193|gb|ACAB01000166.1| GENE 4 2204 - 2830 275 208 aa, chain + ## HITS:1 COG:no KEGG:BT_0507 NR:ns ## KEGG: BT_0507 # Name: not_defined # Def: TetR/AcrR family transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 204 1 202 202 139 34.0 8e-32 MQTTKDYIRKLLLETAQKAFFEKGFKSVSMREISKLSGIGLSNIYNYYPCKDDLLVDVLR PLLAAMYRMLEDHNRPENFSLDIFISDEYHRASLQELMGIITRYRSELNLLFFSTQHSRL KDYLEEWIEKSATIGMEYMEKMRRLHPELHTDISPFFMHFTCSWWINMMKEVVQHKELSC EEIECFISEYIRFSTGGWKKLMNVKNER >gi|222159193|gb|ACAB01000166.1| GENE 5 2827 - 3060 68 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237718938|ref|ZP_04549419.1| ## NR: gi|237718938|ref|ZP_04549419.1| predicted protein [Bacteroides sp. 2_2_4] # 1 77 1 77 77 136 100.0 5e-31 MKAELLKHPVCQEITNQNIRFSSCLQCKPCGNDCTIVNNRISFVFDSETIGISYLLLDST PIACMDRCCNLVRQRFV >gi|222159193|gb|ACAB01000166.1| GENE 6 3117 - 3347 218 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237712870|ref|ZP_04543351.1| ## NR: gi|237712870|ref|ZP_04543351.1| predicted protein [Bacteroides sp. 
D1] # 1 76 31 106 106 148 100.0 1e-34 MSIINIISNDELTLVQGVNGLFFCEESWVTQSRAVSRIGKVLPKEIIGTKDQSTNIVHPQ TDRTKLSCQKHETSYF >gi|222159193|gb|ACAB01000166.1| GENE 7 3462 - 4277 520 271 aa, chain + ## HITS:1 COG:Cgl1400 KEGG:ns NR:ns ## COG: Cgl1400 COG0428 # Protein_GI_number: 19552650 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted divalent heavy-metal cations transporter # Organism: Corynebacterium glutamicum # 2 270 3 267 268 243 51.0 3e-64 MSQPVLIAFFLTSFAGISTGIGSAIAFFAKRTNTSFLSLSLGFSTAVMIYMSFANLFASS LQTLANIHGKDDGTLHSTLSFFGGIVLILLIDKLIPHYENPHEMHRVEEMSLKEERKTEI KPQLLRVDVVTALVLAIHNFPEGMVTFLAALKDINIAIPIACAIAIHNIPEGISVSVPIY YATGNRKRAFWLSFLSGLAEPVGAVIGYLILAPFLNDHVFGVIFGMIAGIMVFISLDELL PAAEEYGKHHHTIYGLVAGMAVMALNLLMLC >gi|222159193|gb|ACAB01000166.1| GENE 8 4491 - 6857 1743 788 aa, chain + ## HITS:1 COG:BB0252 KEGG:ns NR:ns ## COG: BB0252 COG1033 # Protein_GI_number: 15594597 # Func_class: R General function prediction only # Function: Predicted exporters of the RND superfamily # Organism: Borrelia burgdorferi # 58 760 41 730 767 114 21.0 6e-25 MKIERINDWFARQGRWMVQKRWLVLSLFILVFAIGFTGLRYFKVSASWENYFLEDDPMLV KTKEFKDIFGNDNFAAVLTQCDNSFMKENLELIRELTNEMMDSISYADKITSLTDIEFMV GNEEGMTIEQIVPDLIPTDAASLETIRRKAYMKPHIAERLVSKDGTLSWIILKLRTFPKD SVWNKGKNAVSPEVLTGNELEHIITKDKYQKLHPKGTGLPYVTAMKMKWIGKEMPRVMGI AALVSILILLLITRSLRGVVVPLITAAGSIVIVYGLLGYVGMTIDSGMMMIPMLLAFAVS IAYNIHIFSYFKRQFLLHGERRRAVEETVGEMGWPVLFSALTTFAALLSFLAIPMQPMRF IGIATSSCVMLAFFIAITLMPVLLSFGKNGKPHPKVQETGGRWLDHQLGRLGESVLRHGT LILWIAGLLTAALIYQFTKIETAFDIERTMGRKIAYVNNLLEVGESELGSIYTYDVMIDL PEDGLTKSPAMLVRLDSLAQKAEGYKLTKRTTTVLNILKDLNQTLHEGDAAYYRIPTNPE EVAQLLLLYENAGGSEAEYWIDYDYRRLRLMVEISSFDSGEVERELNDIAANAARLFPEA SVTTVGSIPQFTVMMQYVARGQMVSFAISLLIIGILMMLVFGSVRIGLIGLIPNITPALV VGGLMGWLGYPLDMMTATIMPMILGLAVDDTIHFINHGHLEFDRRGNYRDAILRSFRTIG TPIILTSVVICANFAIYMTSEGLSFIHMGLLSVAGIVSALVADLCVTPVLFQKFRLFGKE IETNETIN >gi|222159193|gb|ACAB01000166.1| GENE 9 6859 - 8613 206 584 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 340 562 279 507 563 84 29 1e-15 MNTLKKLQFYMGKRKVLMPIALTLSGLSGLLSLIPFIFIWLIVRSLLLTGSVASGIPINV YAWWAVGTAVAGLVIYFAGLMLLHLAAFRVETNMRRTAMRKIMQMPLGFFDRNTSGQMRK IIDENASETHTFVAHLLPDLAGSFIAPLSVIILIFVFNWQLGLACLIPLTAAVFIMTRMN TTAQKAFQNEYLGAQEHMSSEAVEYVRGISVVKVFQQTIFSFKRFYDSIIAYRDLVTKYT LGWQKPMSLYTVAINSFAFLLVPVVILLIGNKSENIAPIITDMFLYVLITPVIATNVMKV MYLQQDMFLADQAISRVENLTSSEPLPIAENPEKITACDVTFENVSFAYPNAGQNAVDGI SFHLPEGKTFALVGQSGGGKTTIAQLIPRFWDVSAGSVTIGGINVKNIAKDNLMNHIAFV FQNTKLFKTSLLENIKYGNPAASDEAVQRAIDLSQSREIIDRLPNGLNTKIGVDGTFLSG GEQQRIVLARAILKDAPIVVLDEATAFADPENEHLIQKALHELRKGKTVLMIAHHLTSVQ DADKILVIAQGKIAEEGTHSELIARNGIYNSMWNEYQRTVTWTV >gi|222159193|gb|ACAB01000166.1| GENE 10 8632 - 10374 202 580 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 354 561 15 226 245 82 27 3e-15 MNISKYFQHRFTLSEDGAKTFIKGVIASLFLNLALMLPVIFIFLFITDYLHPVIQAGNST THGLWYYVLTALCFMTVLFVIAMIQYKSTYTKIYTESANRRIGIAEKLRKLPLAFFGEKN LSDLTATMMEDCNMLETLFSHTVPQLFASVISIILIATGMFFYNWQLALALFGVVPVATT VIFLGKRLMDKANKEYYHVRRDTTEQIQEGLEAAQEIKSYNGEAAYCDRLDKQLEIYEAT LQRGEFLGAGLVGSAQSILKLGLASVIIAGAYLLGLGSIDMFTYLVFLLVVASIYNPIMD VFNHMATLILLDVRIDRIREMNDMPAQQGDEDCTVEGYDICFDKVDFAYETSKQVLRNVS FTAKQGKVTALVGPSGGGKSTSAKLAARFWDIHGGKITLGGRDISKIDPEMLLKNYAVVF QDVLLFNASVMDNIKIGKKDATEEEVKAVARLARCDEFIARLPNGYDTLIGENGESLSGG ERQRISIARALLKDAPVILLDEATASLDVENETLIQAGISELIKNKTVVIIAHRMRTVAN AHHIVVLKDGTVAEQGAPDELLARNGEFARMVARQKETGI >gi|222159193|gb|ACAB01000166.1| GENE 11 10414 - 11199 636 261 aa, chain + ## HITS:1 COG:no KEGG:PG1179 NR:ns ## KEGG: PG1179 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis # Pathway: not_defined # 1 259 1 261 263 380 74.0 1e-104 MKRSIAVVAMLVVGFAATTCAQTGREIAQKVKDRPDGDTRRSEMVMTLINKRGAVRERKL ISYSIDVGKEKKDRKSIMFFQYPGDVKGTGFLTWDYDELNKDDDKWLYLPAIKKTRRISG SSAKQDYFMGSDFTYDDMGSRNVDEDTHTLLGEETVDGQKCWKLESLPKDKRDIYSRKTA 
LIRQDCLIPVRVEYYDKMGKLHRRLEMSDIAKVEGFWVARKMHMTNVQTEHQTVLEIKNP TYNIPMEESKFNVTTLEKGRF >gi|222159193|gb|ACAB01000166.1| GENE 12 11199 - 12458 900 419 aa, chain + ## HITS:1 COG:no KEGG:PGN_0948 NR:ns ## KEGG: PGN_0948 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 24 419 27 425 425 526 62.0 1e-148 MKAFNRLSAGLCLSVFLFCPVHGVWGQGAGESSWQLKGLVDTYHAFRSEKPNDWMSSRTR LRGEVGKNFAGSSLFVSFNATYNALLKERTGFELREAYLDHRQEHWGFRLGRQLVIWGAA DGVRITDLVSPMDMTEFLAQDYDDIRMPVNALRFFVFNDKIKLELLAVPTFEGYKLPTDA ANPWSVLPKETPRSLVWDAEGSRPELRLSNVEYGGRLSFALPGVDFSLAALHTWNKMPVI EYKPSGSQLTVSPRYYRMGFVGGDVSKPLGQFVLRGEAAFNLGKHFSYIQQAASTPQKGF NTINWLVGADWYAPHEWTVMAQFSSESIFKYESYVAQPRHNSLLTLCVSKKLLDSNLQLS DFTYFDLNHKGWFSRFTADYALNDHIHLLAGYDWFGGSEGMFGPYKHNSEVWAKAKYCF >gi|222159193|gb|ACAB01000166.1| GENE 13 12674 - 13129 382 151 aa, chain + ## HITS:1 COG:no KEGG:PG0222 NR:ns ## KEGG: PG0222 # Name: not_defined # Def: histone-like family DNA-binding protein # Organism: P.gingivalis # Pathway: not_defined # 1 139 1 139 155 84 38.0 1e-15 MLLYNVKKEEMRIGKHKGKTMYYASPIAQDKITTKQLEDRIVNATALSRADVRSAITALA KIVREEMLSGRTVDLANLGSFKVVSNGKRVETEKAVTAETLKTPRIQFFPKLEMRNQAKN VQHVVIRESEAGSAKPSPNPGNLPEAPDSAL >gi|222159193|gb|ACAB01000166.1| GENE 14 13148 - 15490 1643 780 aa, chain + ## HITS:1 COG:BMEI0657 KEGG:ns NR:ns ## COG: BMEI0657 COG4206 # Protein_GI_number: 17986940 # Func_class: H Coenzyme transport and metabolism # Function: Outer membrane cobalamin receptor protein # Organism: Brucella melitensis # 104 235 8 141 599 72 30.0 4e-12 MKQQVLFFFFTLMLTGISVSAQTGGGRIAGVVIDNAGEEPLPGATIFIEELKKGIVTDGH GEFLLSDVPAATYTLTVRFIGYHTQTRKLTVGKERGKKIIIRLKAEAKSLDEVVVMGKSE ARKLREQAMPISVISMNQIQGTVNNVQDILAKTAGITVRATGGTGSTSRISVRGLEGKRI GLFIDGNPMNDNSDFIDINDIPVEMIDRIEIYKGVVPAKFGGSAVGGAVNIVIKEYPPKY LDVNYSYGSFNTHNASVVSKMNIAPKGIEFGLGGFYTYADNDYKMKSPFQEGLTITRDHD KFKKTVIGGSFKARKWWFDLVEFEPVFIHTYKDIQGIESNIRHAHSHSNAFIFANKMEKE NFLLDGLDLDWQLGYIYTDYHFADTASHRTHWDGTHYPAVSEFGGEIGKWASLLTNEKHQ 
LQHKLNLNYLVNENHSVNFNSLLKYAHANPRDGMKDKVIGYRTDFPSNMFSWVAGLNYDY RTSNDKFLNSFNVKYYYYSMKTRMASVLVKTAEDIDTHKNDFGISNALRYRITPSLMAKA SFGYDVRLPSEEELLGDGYVIAPAGNLTPERNISVNIGMLFDLTGKASPNLQIELNGYYM HLKDMIRFTGGFLQSQYQNFGEMRTLGMEAEVKADMTRWLYGYVNATYQDLRDVRKYEQN TTVANPTKGSRMPNIPYLMANAGLEFHKENLFGGSGMNTRIFTDASFVEEYLYDFEQSQF QQHRIPRALSCNIGFEQSFGNGRYFIMGKINNLTDTKMISEFNRPLPGRSFTIRFRYVFK >gi|222159193|gb|ACAB01000166.1| GENE 15 15523 - 16761 821 412 aa, chain + ## HITS:1 COG:no KEGG:BF0627 NR:ns ## KEGG: BF0627 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 410 1 405 407 451 56.0 1e-125 MKKNVFFVAAIFAALCLNSCDKNDPDNPIAGGSGKILLTTALPNATGMDGTVYMQLIDEP VLGKTMATNNNNGINVPFGSSYPMIIGQEVYVFPSYHLTMDKNELIKYRRTNGILQREGS LQLPANNSANNLVKLSSTKAYLSLAGLGLIYIFNPETMQKTGEINLTSLGIQDNNPDIGI MIERDGYVFAGLSQMVGGWTSPENYKQADVAVIDTKTDKLVKMISEKTSGFSQATRPIDP KSLFMDEKGDIYISCLGNFGMVAGHKAGILRIKKGETDFDPTYHWTITGASIEGEEKVAG FAASICYAANGKAYGYIDIPGYYKPGETGHGAIASRAVVFDLYNQKMKKIEGLDLSNGYG VLVSKYKDGLAIANASTTTKGIYYLNPQTDKINPIPMITTIGNPMAIEWFGN >gi|222159193|gb|ACAB01000166.1| GENE 16 17078 - 17332 85 84 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237718950|ref|ZP_04549431.1| ## NR: gi|237718950|ref|ZP_04549431.1| Tnp167B [Bacteroides sp. 
2_2_4] # 37 84 1 48 48 77 97.0 2e-13 MAMGYGDVRCKHLTGSIVKADIKLFGDIPKKIKCKRVHFILNKYMTCSIKHLISVITLTD SYFILIKVSNTNTSDIGKGLKSII >gi|222159193|gb|ACAB01000166.1| GENE 17 18017 - 18190 123 57 aa, chain + ## HITS:1 COG:no KEGG:BT_2450 NR:ns ## KEGG: BT_2450 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 45 1 45 279 92 100.0 5e-18 MRKVFILLFISLALLSCQKEENNRKTEYQVLTSKDASKPFSCLGEKRIITITIIKKH >gi|222159193|gb|ACAB01000166.1| GENE 18 18343 - 18855 200 170 aa, chain + ## HITS:1 COG:no KEGG:BT_2450 NR:ns ## KEGG: BT_2450 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 170 110 279 279 339 98.0 2e-92 MNADLQISYSTINGIKVEKIPLIIDKGKLTFVYKIHSEQNPFILPAEGGRFESPFTCKKQ TYLNGQFIEETYSSLNGLRFKTISSGNVWFLTVRKDGEKIGFYKFSFVGEGPYNQKTDPE CYFNIYTHDADLITDNPTEIFRQDFIQPQTPGEDYYKPSRSSYKHGTFDF >gi|222159193|gb|ACAB01000166.1| GENE 19 20806 - 21348 383 180 aa, chain - ## HITS:1 COG:STM2347 KEGG:ns NR:ns ## COG: STM2347 COG0622 # Protein_GI_number: 16765674 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Salmonella typhimurium LT2 # 2 171 1 171 183 164 47.0 7e-41 MMKYLIVSDIHGSLPALEQVLAFYREQQCGMLCILGDILNYGPRNGIPQGLDPKGIAERL NAMAGEIVAIRGNCDSEVDQMLLDFPILSDYTLLVDNGKRFFLTHGHIYNEDRLPKGRFD CLFYGHTHRWKLERTEHTAVCNTGSITFPKDGNMPTFAIYCDGTVSVHRLDGSRLKELSL >gi|222159193|gb|ACAB01000166.1| GENE 20 21631 - 22566 555 311 aa, chain + ## HITS:1 COG:no KEGG:BF3018 NR:ns ## KEGG: BF3018 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 310 1 308 309 419 66.0 1e-116 MNEQVRNILEQSTTKTSKIEQLLRLGLTRREIADLVTRGNYGFVYNVEKKMLEREGGVLL NRAATTLMDYTFTHKFGIEIEAYNCNMERLARELREAGIHVAVEGYNHTTRDHWKLVTDS SLQGNNTFELVSPILVGENGLKELETVCWVLDICNAKVNDSCGFHVHMDAAGFNLDTWKN LTLTYKHLEHLIDAFMPRTRRNNTYCKTLSGVSDERIKSVRTIDGLREVFNNDRYHKVNF EAYSRHRTVEFRQHSGTTNFTKMENWIRFLNGLITFAKRSSLPSRMTLEELPFLDGKQKL FFKLRTKKLAV >gi|222159193|gb|ACAB01000166.1| GENE 21 
22563 - 22790 234 75 aa, chain + ## HITS:1 COG:no KEGG:BVU_0970 NR:ns ## KEGG: BVU_0970 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 3 75 7 79 81 76 53.0 3e-13 MIQTYHLLDGGQITAASPEEFVRLLREGSRFDYDCTDQEYMENFARRYGELHGVSVATDT PEHFLTDLQAAGYVR >gi|222159193|gb|ACAB01000166.1| GENE 22 23041 - 23820 399 259 aa, chain + ## HITS:1 COG:no KEGG:BF3019 NR:ns ## KEGG: BF3019 # Name: not_defined # Def: conjugate transposon protein TraA # Organism: B.fragilis # Pathway: not_defined # 1 258 2 260 262 362 67.0 6e-99 MKKEPLLIAFASQKGGVGKSAFTVLVASILHYQKGLKVGVVDCDSPQHSISRMRDRDIES VQESDFLKVVLYRQHEQIRKRSYPVIKSNPEKAIEDLYRYIEEQDAVFDVVLFDLPGTLR SEGVVHTISAIDYIFIPLKADNVVMQSSLQFAEVVEEELIARHNCNLKGIYLFWNMVDKR ERTESYESWNRVIQKAELHLLESRIPDTKRYNKELSSLKNSIFRSTLFPPDNRQIKGSGL CELIEELCAVTHLDASHTL >gi|222159193|gb|ACAB01000166.1| GENE 23 23832 - 24245 339 137 aa, chain + ## HITS:1 COG:no KEGG:BT_2467 NR:ns ## KEGG: BT_2467 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 136 1 135 136 81 38.0 9e-15 MDSGKKKFDPRSIDEKAILDIVARKGTIRPDSPATSVSPKPEDTAEEIAGSSEAPVFTTG EIDAYRASFLNTVRTKSRKSLHIDANLHRRISSLVWAVGRGEVTVAGFVNQVLAHHFEEN GGLINAVLEKYYQSLKS >gi|222159193|gb|ACAB01000166.1| GENE 24 24250 - 25002 448 250 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237712888|ref|ZP_04543369.1| ## NR: gi|237712888|ref|ZP_04543369.1| predicted protein [Bacteroides sp. 
D1] # 1 250 1 250 250 440 100.0 1e-122 MEYLFIILFIYASYLSIEKYVHTGSIFTWRGRKDKKERRTAPAAIHPAIAPSQAGIIGKS RYQMRHSLTIDDNSGQPESGIEKSDTFTPEKKTEPVLRVHEPLDKNTKFPEVPALFQKLN YPEIDYMQEPSPSSGEEKEEIPERYEITGYVQRSNPHRATGVTFDDMDMIEKAMASDELT EAEQEQARQTFQKLEGTNMERVLQASVLGSSDKLKRYMRLYVDRGEPHLLTGKNRDDILR EFDITEHVPQ >gi|222159193|gb|ACAB01000166.1| GENE 25 25523 - 25720 176 65 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_3006 NR:ns ## KEGG: Fjoh_3006 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 1 65 63 127 127 105 81.0 4e-22 MLTGYFDPATKLIYAAGAIVGLIGAIKVYSKFSSGDPDTGKTAGSWFGACVFLIVAATVL RSFFL >gi|222159193|gb|ACAB01000166.1| GENE 26 25732 - 26082 268 116 aa, chain + ## HITS:1 COG:no KEGG:BF0124 NR:ns ## KEGG: BF0124 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 107 3 108 110 133 61.0 2e-30 MEYNINKGIGRNVEFNGLQAQYLYFFVGGLLAIFLLFVIFYMIGIDRWFCIGFGVLSAFS LIFGVFHLNKKYGPNGLMKLAAVKYHPSYIINRKRISSLFKKRIHEKYIKSRNAGK >gi|222159193|gb|ACAB01000166.1| GENE 27 26045 - 26215 119 56 aa, chain + ## HITS:1 COG:no KEGG:PGN_0065 NR:ns ## KEGG: PGN_0065 # Name: traG # Def: conserved protein found in conjugate transposon TraG # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 1 56 1 56 833 90 67.0 2e-17 MRNILKAATLESKFPILSVEEGCILSKDADVTIGFKVFLPELFTVTSADYASMHGT >gi|222159193|gb|ACAB01000166.1| GENE 28 26312 - 27959 1135 549 aa, chain + ## HITS:1 COG:no KEGG:PGN_0065 NR:ns ## KEGG: PGN_0065 # Name: traG # Def: conserved protein found in conjugate transposon TraG # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 1 549 90 637 833 862 72.0 0 MSFLDRSFELHFNERPFLNHFSYLFITKTTKERSRSQSNFSILCRGNIIPKEVRDKETVS LFLESVDQFERILNDSAFIRLERLTTDEIVGTPQQAGLIEKYFSLSQQDTTTLKDIQMSA SEMRIGDDILCVHTLSDVDDLPGSVQTDTRYEKYSTDRSDCRLSFAAPVGLMLSCNHIYN QFLFIDDSAEILQRFEKTARNMHSLSKYSRANQLNKQWIDAYLDEAHSFGLTAIRCHCNV MAWSDSREKLKVIKNDTGSALALMGCKPRYNTIDAPALYWAGIPGGEGDFPSEESFFTFV EQGVCFFTEETNYADSLSPFGIKMADRTNGKPLHLDISDEPMRRGITTNRNKLVIGPSGS 
GKSFFMNHLVRQYWEQNTHIILIDIGNSYKGLCDLIHQRTNGEDGIYYTYSEKHPIAFNP FFTEDKVFDLEKKESIKTLIMSLWKRDTEVITRAEEVALSMAVNLFLEKIKEDNELVPSF NTFYEFVQGEFRGILEEKHYREKDFDLTNFLNVLAPYYRGGEYDYLLNSDEQLDLLNKRF VVFEIDEIK Prediction of potential genes in microbial genomes Time: Wed May 18 04:48:36 2011 Seq name: gi|222159192|gb|ACAB01000167.1| Bacteroides sp. D1 cont1.167, whole genome shotgun sequence Length of sequence - 53358 bp Number of predicted genes - 53, with homology - 52 Number of transcription units - 29, operones - 15 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 11 - 244 131 ## gi|294643851|ref|ZP_06721643.1| polysaccharide biosynthesis protein 2 1 Op 2 . + CDS 244 - 1116 455 ## Gmet_1328 glycosyl transferase family protein 3 1 Op 3 . + CDS 1118 - 2020 218 ## Ping_1180 hypothetical protein + Term 2141 - 2197 0.5 + Prom 2030 - 2089 5.9 4 2 Tu 1 . + CDS 2215 - 2625 278 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Prom 2740 - 2799 3.9 5 3 Op 1 2/0.000 + CDS 2823 - 3905 439 ## COG3754 Lipopolysaccharide biosynthesis protein 6 3 Op 2 . + CDS 3980 - 4720 313 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 7 3 Op 3 . + CDS 4733 - 5863 482 ## LSL_0977 glycosyltransferase (EC:2.4.1.-) 8 4 Tu 1 . - CDS 6311 - 6466 60 ## - Prom 6614 - 6673 5.8 9 5 Tu 1 . + CDS 6450 - 7121 164 ## gi|262408848|ref|ZP_06085393.1| conserved hypothetical protein + Prom 7495 - 7554 6.1 10 6 Tu 1 . + CDS 7653 - 8339 259 ## COG0438 Glycosyltransferase + Prom 8356 - 8415 3.8 11 7 Tu 1 . + CDS 8476 - 8640 59 ## gi|237712795|ref|ZP_04543276.1| predicted protein + Term 8833 - 8874 3.6 + Prom 8744 - 8803 6.9 12 8 Tu 1 . + CDS 8909 - 9712 347 ## COG1216 Predicted glycosyltransferases + Term 9833 - 9864 1.1 13 9 Tu 1 . + CDS 10076 - 10309 138 ## BF2772 hypothetical protein + Term 10322 - 10387 14.1 - Term 10317 - 10368 12.1 14 10 Op 1 . 
- CDS 10372 - 10815 186 ## COG3023 Negative regulator of beta-lactamase expression - Prom 10939 - 10998 3.9 15 10 Op 2 . - CDS 11001 - 11489 592 ## BT_0403 hypothetical protein - Prom 11625 - 11684 2.6 + Prom 11468 - 11527 4.2 16 11 Tu 1 . + CDS 11547 - 11753 59 ## gi|237712800|ref|ZP_04543281.1| conserved hypothetical protein + Term 11818 - 11867 -0.3 - Term 11634 - 11673 2.3 17 12 Tu 1 . - CDS 11754 - 14081 1420 ## BT_0404 hypothetical protein - Prom 14200 - 14259 8.5 + Prom 14036 - 14095 7.1 18 13 Op 1 . + CDS 14234 - 14593 251 ## BT_0405 hypothetical protein + Prom 14603 - 14662 10.1 19 13 Op 2 . + CDS 14731 - 15237 627 ## COG3152 Predicted membrane protein + Term 15258 - 15307 12.4 + Prom 15272 - 15331 6.3 20 14 Op 1 . + CDS 15364 - 17622 1410 ## gi|237712806|ref|ZP_04543287.1| conserved hypothetical protein 21 14 Op 2 . + CDS 17627 - 18199 452 ## gi|160882630|ref|ZP_02063633.1| hypothetical protein BACOVA_00583 22 14 Op 3 . + CDS 18204 - 19271 1020 ## PRU_0332 RyR domain-containing protein 23 14 Op 4 . + CDS 19294 - 19587 368 ## BT_2247 putative ryanodine receptor - Term 19322 - 19355 4.7 24 15 Op 1 . - CDS 19562 - 22843 2900 ## COG4995 Uncharacterized protein conserved in bacteria 25 15 Op 2 . - CDS 22864 - 23532 309 ## gi|237712811|ref|ZP_04543292.1| conserved hypothetical protein 26 15 Op 3 . - CDS 23554 - 25170 972 ## COG1262 Uncharacterized conserved protein - Prom 25200 - 25259 7.3 + Prom 25175 - 25234 6.6 27 16 Tu 1 . + CDS 25266 - 25640 405 ## COG0251 Putative translation initiation inhibitor, yjgF family + Term 25663 - 25710 13.5 - Term 25444 - 25491 1.5 28 17 Tu 1 . - CDS 25608 - 27098 1140 ## COG0285 Folylpolyglutamate synthase - Prom 27322 - 27381 5.1 + Prom 27056 - 27115 6.2 29 18 Tu 1 . + CDS 27230 - 28555 1447 ## COG1875 Predicted ATPase related to phosphate starvation-inducible protein PhoH + Term 28578 - 28640 -0.6 - Term 28564 - 28626 3.2 30 19 Op 1 . - CDS 28651 - 29628 1039 ## COG0167 Dihydroorotate dehydrogenase 31 19 Op 2 . 
- CDS 29653 - 30321 482 ## COG0325 Predicted enzyme with a TIM-barrel fold 32 19 Op 3 . - CDS 30328 - 30807 663 ## BT_1331 hypothetical protein 33 19 Op 4 . - CDS 30891 - 31454 529 ## COG3247 Uncharacterized conserved protein - Prom 31565 - 31624 5.1 + Prom 31414 - 31473 7.1 34 20 Op 1 . + CDS 31592 - 32092 617 ## COG2077 Peroxiredoxin + Prom 32175 - 32234 2.4 35 20 Op 2 . + CDS 32283 - 33167 933 ## BT_1328 hypothetical protein + Term 33178 - 33246 24.8 36 21 Op 1 . - CDS 33236 - 33886 307 ## PROTEIN SUPPORTED gi|154175107|ref|YP_001408238.1| ribosomal protein L22 37 21 Op 2 . - CDS 33918 - 35369 1655 ## COG0457 FOG: TPR repeat - Prom 35405 - 35464 4.2 38 22 Op 1 . - CDS 35469 - 37208 1876 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases 39 22 Op 2 . - CDS 37257 - 38069 918 ## COG0226 ABC-type phosphate transport system, periplasmic component - Prom 38228 - 38287 6.7 + Prom 38066 - 38125 5.4 40 23 Op 1 38/0.000 + CDS 38249 - 39445 1090 ## COG0573 ABC-type phosphate transport system, permease component 41 23 Op 2 41/0.000 + CDS 39447 - 40322 852 ## COG0581 ABC-type phosphate transport system, permease component 42 23 Op 3 32/0.000 + CDS 40331 - 41089 215 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 43 23 Op 4 . + CDS 41165 - 41854 940 ## COG0704 Phosphate uptake regulator + Term 42097 - 42142 6.2 + Prom 42092 - 42151 9.2 44 24 Op 1 . + CDS 42245 - 42847 719 ## COG0307 Riboflavin synthase alpha chain 45 24 Op 2 . + CDS 42906 - 43442 554 ## COG0778 Nitroreductase + Term 43493 - 43553 17.1 - Term 43599 - 43650 8.2 46 25 Tu 1 . - CDS 43811 - 45130 997 ## COG1295 Predicted membrane protein - Prom 45156 - 45215 4.2 + Prom 45494 - 45553 3.8 47 26 Op 1 3/0.000 + CDS 45615 - 46595 759 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) + Prom 46629 - 46688 4.4 48 26 Op 2 . 
+ CDS 46709 - 47860 1128 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily + Prom 47968 - 48027 7.1 49 27 Op 1 1/0.000 + CDS 48080 - 49615 1783 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain + Term 49661 - 49713 5.7 + Prom 49699 - 49758 7.5 50 27 Op 2 . + CDS 49799 - 50659 751 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) + Term 50692 - 50753 10.6 + Prom 50710 - 50769 5.7 51 28 Op 1 . + CDS 50790 - 51413 331 ## BT_1310 hypothetical protein 52 28 Op 2 . + CDS 51491 - 51892 288 ## BT_1309 hypothetical protein + Term 51912 - 51965 7.6 + Prom 51905 - 51964 6.5 53 29 Tu 1 . + CDS 51984 - 53198 635 ## BF2740 clostripain-related protein Predicted protein(s) >gi|222159192|gb|ACAB01000167.1| GENE 1 11 - 244 131 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294643851|ref|ZP_06721643.1| ## NR: gi|294643851|ref|ZP_06721643.1| polysaccharide biosynthesis protein [Bacteroides ovatus SD CC 2a] # 1 77 443 519 519 142 100.0 5e-33 MPMLLVGIITLCAVYCVRIILKDESFTRLILSTVTSFVIFSISAYFIGIDDSERKIVRAA WVKRFGKYRNNINNFTV >gi|222159192|gb|ACAB01000167.1| GENE 2 244 - 1116 455 290 aa, chain + ## HITS:1 COG:no KEGG:Gmet_1328 NR:ns ## KEGG: Gmet_1328 # Name: not_defined # Def: glycosyl transferase family protein # Organism: G.metallireducens # Pathway: not_defined # 1 288 1 293 294 175 34.0 2e-42 MDVVVIFNGLGNQMSQYAYYLAKKKVNPNTKVIFDIMSKHNHYGYDLERAFGIEVNKTLL IKVLQIIYVLSRKFRLFKSVGVRTIYEPLNYDYTPLLMQKGPWGINYYVGGWHSEKNFMN VPDEVKKAFMFREQPNEDRFNEWLQVIRGDNSSVSVHIRRGDYMNIEPTGYYQLNGVATL DYYHEAIDYIRQYVDTPHFYVFSNDLDWCKEQFGVENFFYIECNQGVNSWRDMYLMSECH YHINANSTFSWWGAWLCKFEDSITVCPERFIRNVVTKDFYPERWHKIKSC >gi|222159192|gb|ACAB01000167.1| GENE 3 1118 - 2020 218 300 aa, chain + ## HITS:1 COG:no KEGG:Ping_1180 NR:ns ## KEGG: Ping_1180 # Name: not_defined # Def: hypothetical protein # Organism: P.ingrahamii # Pathway: not_defined # 1 193 3 199 362 78 31.0 3e-13 
MDNITSRTIERLRFPMTVLVLFIHVLPFAASQGGIEYTFPFNTDSWSHRFIYFFNWIFPR VAVPMFCIISGYLFFQGLNTVDCFKEKYKLKLKSRVKTLLIPYLLWNIIFEIWRLFMISS DKLELLPKSFFLLCKEIFNSFWGMYAPSDAPLWYVRDLMAVMVLAPIIFFVINKCGKYVV IVLTFIWLLGLWPRLPIVPGIDIILFVTWGAYFAMKGIEPFRKMASLTGYWPVYIISLLW AVWEYDELAYNFPIRLSILLGLSLVSALFLYIDKHEMNVPKILTASSFLHIVFIGFMLLL >gi|222159192|gb|ACAB01000167.1| GENE 4 2215 - 2625 278 136 aa, chain + ## HITS:1 COG:MA2174 KEGG:ns NR:ns ## COG: MA2174 COG0110 # Protein_GI_number: 20091016 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Methanosarcina acetivorans str.C2A # 25 136 53 162 184 68 36.0 3e-12 MLKNIIKKLIAYIFRPEYKFAKRGDKCFIGSNCIINSASMIELGNHVSIGPNAVLYCIYK KIIFGNNVLLGPNVTMVNGDHNMRKIGVPIIDNNEKEASDDADIIIEDDVWIGANVTILK GVTVGRGAVIAAGAVV >gi|222159192|gb|ACAB01000167.1| GENE 5 2823 - 3905 439 360 aa, chain + ## HITS:1 COG:CC0633 KEGG:ns NR:ns ## COG: CC0633 COG3754 # Protein_GI_number: 16124886 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipopolysaccharide biosynthesis protein # Organism: Caulobacter vibrioides # 4 356 220 564 818 236 37.0 5e-62 MSKPRLIAFYLPQYYPTKENDEWWGKGFTEWTNVGRAKPLFKGHYQPRVPADLGYYDLRV PEVRIQQAEMAKNAGIEGFCYWHYWFAGKRMLDRVFGEVVETGKPDFPFCLCWANHSWYA KTWDPNKPSKLLIEQTYPGIEDYEEHFYAMLPAFKDRRYIRINEKILFAVYDPLNIPEPQ LFIETWNRLARENGLEGFYFMGFTIKDSLKQEILKVGFDSVLVDFVFGASEKSGTLPNIY IKKILRKLLRKPITIEYSQYSQYLLNNYIVNENVYPSICPNYDHSPRSKFRGTIIVNSTP QKWKKLCHEMFSKVSVRSAEDNLVFIKAWNEWGEGNYLEPDLKYGTQFLDVIRDVLEKVK >gi|222159192|gb|ACAB01000167.1| GENE 6 3980 - 4720 313 246 aa, chain + ## HITS:1 COG:YPO3098 KEGG:ns NR:ns ## COG: YPO3098 COG0463 # Protein_GI_number: 16123272 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Yersinia pestis # 5 203 3 206 247 112 37.0 7e-25 MKKIISIIIATYNAEKTLKRCLNSIVSQKKDQLELLIIDGCSTDRTMDIVREFAESIDVI VSEVDKGIYDAWNKGIRLATGEWIMFLGADDYLLEGAMNVYWNYLKKQTLDGIDIITAQS 
KLIDAKGKYKRVFGNPYNIKEFRCCMKISHGSTLHNRKLFDELGNFDISFKICADYEFLL RKKLNARYIETPTIAMQVGGVSNTIRGLWESFKVKRYCKSIPLLLNVYYLTKGVIGYYMR KIKIIN >gi|222159192|gb|ACAB01000167.1| GENE 7 4733 - 5863 482 376 aa, chain + ## HITS:1 COG:no KEGG:LSL_0977 NR:ns ## KEGG: LSL_0977 # Name: rfaG # Def: glycosyltransferase (EC:2.4.1.-) # Organism: L.salivarius # Pathway: Fructose and mannose metabolism [PATH:lsl00051] # 6 358 3 355 379 242 38.0 1e-62 MRNKIIGYVSAEDPFRDKKAWSGTKYKIREALQNAGYEVVWISCKPSKICFIFLKILLKM LFGKKAIVEHNRYYFKLCAKHINMNKVRLCDYIFFPGGAQISAFVNFEKPIIYYTDATFH IMIDYYWHGLSSWLITQGEHYEKEAIKNAFINIRSSQWAAASVINDYDGCLQRNYVLEFG ANFDDKDLLHATPYKSGALNILFSGVDWIRKGGEVAVKTVQLLNKRGINSCLFIVGLSEI PLKYANLPFVRIMGFLDKNNPEHYRKYVDIVKNSHLLLLPTNAECSAIVFSESSAFGLPI FTYDTGGTGNYVIDGVNGYKLPLAAREQEFADAIEKSLESNEFLQLHKGGITLFNERLNW STWSKRFRKIMEENNL >gi|222159192|gb|ACAB01000167.1| GENE 8 6311 - 6466 60 51 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTRDFIRLYFINPGAKKTYAAAGIEKAVKDPGSINIACNLPVFCVIVPSNV >gi|222159192|gb|ACAB01000167.1| GENE 9 6450 - 7121 164 223 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262408848|ref|ZP_06085393.1| ## NR: gi|262408848|ref|ZP_06085393.1| conserved hypothetical protein [Bacteroides sp. 2_1_22] # 1 223 183 405 405 400 100.0 1e-110 MKSLVIGIALILTLTTSMIVAVVIILFLKFYYYFKYLRIGLVVCFIVGVCWCINNRDILS SSEYFVNPQLRAIQEKITQTLSMVEYAEPEDFEHLNTSSYVILTNYWIAFNAPCRILGTG LGTHAQNYERMYKSDFGGYGLNKDDAYSMFARLYSEFGVLGLCLYAFFLIRYYNKDNIIS LCLIVFFISYLIKGGHYMLYGTAFFHIVYYFISPYKINCFKKI >gi|222159192|gb|ACAB01000167.1| GENE 10 7653 - 8339 259 228 aa, chain + ## HITS:1 COG:all4830 KEGG:ns NR:ns ## COG: all4830 COG0438 # Protein_GI_number: 17232322 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Nostoc sp. 
PCC 7120 # 4 188 151 336 379 71 31.0 1e-12 MAIQNHVMAKNKKIYFISISALTRKICMQLMNDSRKIYTIADPVEINENQLPNKLLNNDV YLFMGRLSPEKGIGLFCQAVTDLNRKGIVLGEGYLRSELMEKYPNIQFVGWVAGKDKQAY FKKAKALVFTSLCYETFGLSVAEARSYGIPCIVPDSCAASEQVIDGETGYLFKTGNLDSL KEVLVKYENADIQVMQNNIVNNFIADDWSLISHIKNLVQVYNEIITES >gi|222159192|gb|ACAB01000167.1| GENE 11 8476 - 8640 59 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237712795|ref|ZP_04543276.1| ## NR: gi|237712795|ref|ZP_04543276.1| predicted protein [Bacteroides sp. D1] # 1 54 1 54 54 89 100.0 1e-16 MNRFYQLLLYFHILVVLLQNVWLNIVFYDAVADYGRKYMEYGIENGTVGKAWAW >gi|222159192|gb|ACAB01000167.1| GENE 12 8909 - 9712 347 267 aa, chain + ## HITS:1 COG:RSc0688 KEGG:ns NR:ns ## COG: RSc0688 COG1216 # Protein_GI_number: 17545407 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Ralstonia solanacearum # 2 263 4 265 275 186 35.0 3e-47 MITVSIVTYKTDLVELAKCLTSLLSPLISKVYIIDNSSQQYIADFCKEYANVEYIGSRNV GYGAGHNQALRKVLKSKSKYHLILNSDVYFDPSVLEQLADYMDLNEDVAQVQPNIIYPDG RMQYTCRLLPTPVDLIFRRFLPKKMIEKRNNRYILKFNDHTQAMNIPHHQGSFMFFRIEC FNKTGLFDERFFMYLEDIDLTRRIHKYYRTMFWPGVTIVHAHRAASYKSKKMLIIHIRNA IKYFNKWGWVFDSERKKWNKRLLKELL >gi|222159192|gb|ACAB01000167.1| GENE 13 10076 - 10309 138 77 aa, chain + ## HITS:1 COG:no KEGG:BF2772 NR:ns ## KEGG: BF2772 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 4 74 12 82 82 62 42.0 5e-09 MEHSDFVVKCYNKQELAQMYFPDLTIRASVNKLRRWMRRCKPLMNEILSTDFHPKTKAFS VREVRLITYYLGKPGEL >gi|222159192|gb|ACAB01000167.1| GENE 14 10372 - 10815 186 147 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 55 139 2 97 116 66 38.0 2e-11 MENSEENYLPREIKLLVIHCSATRCNVSFTVEQLRQCHLQRGFKDIGYHFYITRNGELHH CRPVSEPGAHVRGFNRHSIGICYEGGLDEEGRPADTRTQAQRFALLDLLTILKHQYPDAQ ILGHYQLSASIHKACPCFDSRKEYMNI >gi|222159192|gb|ACAB01000167.1| 
GENE 15 11001 - 11489 592 162 aa, chain - ## HITS:1 COG:no KEGG:BT_0403 NR:ns ## KEGG: BT_0403 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 160 1 163 165 248 80.0 5e-65 MAQNYTLMARKNLLKPSETPKFYAVARSGRKVTVKEVCKRITERSSYSKGELEGCIGEFL LEIVNVLDEGNIVQMGDLGNFRMSIKTGTPTDTAKEFKASCIDKGKVLFYPGSDLRKLCK TLDYTLYKSDSSTDSDKDPLPDDGGDDNQGGSGSGEAPDPAA >gi|222159192|gb|ACAB01000167.1| GENE 16 11547 - 11753 59 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237712800|ref|ZP_04543281.1| ## NR: gi|237712800|ref|ZP_04543281.1| conserved hypothetical protein [Bacteroides sp. D1] # 11 68 1 58 58 97 100.0 3e-19 MTKVLHRESGMTHNDAHRRTDKKKKHREICIETSQCLISIIEMFFSNLPMFFRTLQKDKR TKGQKGQM >gi|222159192|gb|ACAB01000167.1| GENE 17 11754 - 14081 1420 775 aa, chain - ## HITS:1 COG:no KEGG:BT_0404 NR:ns ## KEGG: BT_0404 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 773 1 775 775 1088 68.0 0 MKVLEEIKVSVYENVYSKKPRVMSFLEVIIMCIHPIYASIINAIRRYYAEGDHAAAQKLK NQLPCFTPAGTFDGAHAIKNFLLPSHIVGLDYDHVKDRLQVIQRCAADPHTVAAIESPTD GVKVFAYVEGIENRHREGQQLVSHYYNQLLGLESDPACKDESRLCYFSYSPNGYVAALYQ AFVLEPLIKEETNTFSENEVLPPFPLQNNTPENVSEEEIAQFISSYIFFHPLTAGQRHSN VFKLACEACRRHYPQKSILRELTVFFEHTDFRPEELTKVLSSGYKQVNEHAPASSPASAP SFQKDIRTKRQYSTAENSDTDDEAYWLGEEFRKGTPLFPRSLYNNLPDLLNDCIIEDGSE REQDVALLSDLTALSAALPQTFGIYNHKKYSTHIFSVILSPAASGKSIAQTGRYLLEEIH SEILSTSESMMKNYQTVHNNWQSEYQKQKKKGEACSEEPQRPPFKMLFIPATTSYTRMQM QMQDNGPQGSIIFDTEAQTLSTANHLDCGNFDDMLRKAFEHENIDSSYKANGLIPIYIRH PKLALLLTGTPGQIDGLLSSYENGLPSRTLIYTFREAPHWKEMGDDCVSLEDSFKPIAHR VSELYHFCLAHPVLFHFNRLQWNRLNEIFSRMLSEVALEGNDDLQAVVKRYAFLVMRISM IQTRIRQFEVTDLSPEIYCTDADFERSLQIVLCCYEHSRLLHSSMPSPSVRPLKNPDTIR NFVQELPDSFTTDEAIQIGAKYDFNHRKVTRLLKSLNGVKINKISHGSYTKMNEQ >gi|222159192|gb|ACAB01000167.1| GENE 18 14234 - 14593 251 119 aa, chain + ## HITS:1 COG:no KEGG:BT_0405 NR:ns ## KEGG: BT_0405 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined 
# 1 117 1 117 119 130 50.0 2e-29 MAKEKQEPYEFLSNLVLALMDMDRIFSNSFFISEFAISPKTLGEIRRGEDMCIYQYVRVI RCMTKYLHLIIQLDMLLKELRIVLSSHCDLVVATVPHRSCGTCQPTEWVAVMHWDGVKL >gi|222159192|gb|ACAB01000167.1| GENE 19 14731 - 15237 627 168 aa, chain + ## HITS:1 COG:mlr4844 KEGG:ns NR:ns ## COG: mlr4844 COG3152 # Protein_GI_number: 13474057 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Mesorhizobium loti # 7 119 9 118 120 65 41.0 3e-11 MFKAPFSFDGRIRRIEYFLSGIIGGIVSSIAWALGVGTFILGAASGSAGGSVFGLLIGLA AMIASIWFSLAQGVKRLHDLNKSGWLILLCCVPIIGWIFALYMLFADGTVGPNPYGADPK NRMPYQAQPASVNVTVNVSREEVKVDKPVEDAPAPAEAPAEENKEKAE >gi|222159192|gb|ACAB01000167.1| GENE 20 15364 - 17622 1410 752 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237712806|ref|ZP_04543287.1| ## NR: gi|237712806|ref|ZP_04543287.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 752 1 752 752 1478 100.0 0 MYMLSTVISQTGECNAQHATDPCFLNQWGYSTLLLIIIGIGLVYLLFKKRKFLVDNLKWI AGTVFIAGFLIYWYAFNEGGSDSNSIALAFRSALSSMEMFASHSDLLEVPEDLHHDPFYM TIFSVIHFLAVIVSAVFIIKLLGFRFISWVRLCMANLSRKKKCRLFIFWGVNDNAILLAN SIRKKAAEPKGKNEEGWENCKFIFVRLLSANESSSHGRFTFSHFFNSSHDGTEKFIEKIE ELDGILVNSKFGITGRVIDKVKSEFDLYKYLGLRRLGNLIKKYPQATFYFLSPNEELNLE AVSVFKEIAKCKEDRVHNQVQIYCHARKNNQNQKLEICDGLKHQIHIIDSSNLAVLQLKK NVRNHPVNFVDVDTSKACVKKPFTSMIIGFGETGRDAFRFLYEFGALIDVNGNRNPQKIY VVDEHMDELKGDFLMKAPALKERKNELEWCEEMSIHSERFWEKLSEIIHDLNYIVIAIGN DNEGMALAIDLYEYAYRYRKDCFNDFRIYLRVNGSCNTIQLKQIKEYFNIYGNTRDVIIT FGAQEEIFSYDVVSTDVLEVLAKEFYYAYQKIMIDAMPETNEKEIEEKKKAKESLKQTAE EEWNARREALQDKHSLDAQIKLAYQEEQDRANVWHIDTKKFLAGAMGEDGKDNKERLKEM VELTQRDAHTLNYSKVCDVVSSTLFDNLSKCEHLRWNACMELQGFVTCDGDKDFQQKKHK CIVDNDILRSKYPETIPYDQCVVELSFRLKKN >gi|222159192|gb|ACAB01000167.1| GENE 21 17627 - 18199 452 190 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160882630|ref|ZP_02063633.1| ## NR: gi|160882630|ref|ZP_02063633.1| hypothetical protein BACOVA_00583 [Bacteroides ovatus ATCC 8483] # 1 190 1 190 190 325 100.0 6e-88 MTDQEIVDGLINRDEKITDWFFNIKYRPLFINVIKLIFDYQVDYDECISELYYHLMKNDA 
AVLRNFEGRSTIGTWIKIVAIRFFCSRKKREQMIEDESKEPLYEQNHEEEIDDSESKIAA KIDLERLFDLMSNKRYVMVIRELVLKEVEPEFLALSMGITVANLYNIKKRALAALAHLAM NDKKKYENKR >gi|222159192|gb|ACAB01000167.1| GENE 22 18204 - 19271 1020 355 aa, chain + ## HITS:1 COG:no KEGG:PRU_0332 NR:ns ## KEGG: PRU_0332 # Name: not_defined # Def: RyR domain-containing protein # Organism: P.ruminicola # Pathway: not_defined # 2 306 22 352 354 209 40.0 2e-52 MITEELLAAFEEGKTNAEETALVLEYLATDESLQEEFILSQQLDAMMGADDEETDFLPMA QMAAKSEGNLCDFQCEQFILKRRKIEYNSDELSEEARNNSWLRERGTPLHSVGRLLEQRG LIVMRSYGSSIDSVIRALKAGHDAIVVVNSCRLPENSEEEIAYHAAVVLDVNEEEVTLYD PATGEESTAYPKDHFIAAWNDAKAYLARVKVPDLDYNPRPIDLEDVELSTDLIELREAIA ENAHEIWADQRQEEGWTYGPQRDDEKKETPDMVPYSMLPYSEKEYDRRMAFDTIKLMKKL GYSIIKQGDTALHNELMRKLKNESDAKVCECGASIFMDQIYCSHCGKKIDWKLFR >gi|222159192|gb|ACAB01000167.1| GENE 23 19294 - 19587 368 97 aa, chain + ## HITS:1 COG:no KEGG:BT_2247 NR:ns ## KEGG: BT_2247 # Name: not_defined # Def: putative ryanodine receptor # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 96 7 100 100 112 59.0 4e-24 MKEYIPNPVDTGHIQLPKELEYLVEEMAKNVHEVWSKTRIEQGWTYGKKRDDVLKQHPCL VPYEELPEEEKVYDRNSSVETLKLIMKLGFKISKDEE >gi|222159192|gb|ACAB01000167.1| GENE 24 19562 - 22843 2900 1093 aa, chain - ## HITS:1 COG:slr1968_2 KEGG:ns NR:ns ## COG: slr1968_2 COG4995 # Protein_GI_number: 16330786 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Synechocystis # 701 1079 3 393 426 88 25.0 6e-17 MRQFILSLLACLCLTAYSQTTSQADLLVEEAQKLESKQDYPTAITKLKQADELYVKTGKT QSAERATCLHILGRCYLNLERPEGLTYTQMAADMRKTILGETNIKYISSLNNTGLYYLTV AKDYPKAAEIHGKTWELCSRIQPRPEQAFMFHINLARCYIALGEMDKASAIVEEEIAISK KMYGEKSLSVARQLQQIGSLYYLSGRKDVGVTYYEQAFNIFPDDSKEYEQLLDWISSIYV ELNNQPKVLEYMKLVEAHNKKELEKPCDEPDCLTERAQYFASIGNNDKARAHFLDALKKC DVNTDKEIVFKVRHNYAQFLSGIQDHASAGEYYELAADILRNNPAEQAKFALESYLGGLN YSIAAKYEASNRMLNQALSVYAQMLPDKLDKYVDASVALCRNYGFTKEYGKALEILTKAE KLLPTGDKSDKMGEILRSRGSILYRQKQYAEAAANYQQAADIYKILPGSDVKYQDALSSL 
NRCHTMMGNETAARQTEQDAERQRMAVLNRLLKENLEQLDAYRLQWGEDGLMYVSALGTI ADIYYTQGQTDKALAYMEPFLSSEITALRNLFRLSKADERLAFWKDIRSSLDSIPLRAAN IAATGTPEQKQRFARLGYDALLFSKGIMLNSSIELESLIRASGDKSLLDQYNKATLMAEQ ILSMQSELPNATNQTEARKNIIRQKEEYEQLQLDLMRKSTDFGDYTRYLSVKWQDVQKHL HGNSIAIEFALIDDELLAPDKHLTAFVLRPGDVSPTAIKLMSQKLLSKEMQSPTAFTTTE NGAHFWRALDEYISKADTIYFSPDGILHQLPVEYLPYGAGNLPLAFQKAVYRLSSTKEIA LDRASLNYSSAALFGGLDYEMASTKVRNIVSTDHNSGKFRNGQNGYHELPYTLNEVNNVN SLLKEKKIKTNLFVGENGTEKQFRALDGKAVSLIHIATHGDYKELKQREADDAMKHCFLI MSGANATEENNDGLLRADEISTLNLRGCRMVVLSACNTGQGTLGADGLFGLQRGFKNAGV QTILMTLSAVDDQASMLLMTKFYENIISGLSERQALIKAQQYLRENGYADSKYWAPFILL DAGKSPIPHPSKS >gi|222159192|gb|ACAB01000167.1| GENE 25 22864 - 23532 309 222 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237712811|ref|ZP_04543292.1| ## NR: gi|237712811|ref|ZP_04543292.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 222 1 222 222 431 100.0 1e-119 MSRKTIFRVFILLFFSIGGQMSANAQTTTPFETVFDACKKACEALDGGFASGEQLLAVSK TLRDAAPIPLTVKQTKGDVLSLKGHLVFDYEFIQACVDNETIYEIADKYAAEARMRGDKD SANKVRLDTKMVAAGQTCTFEIPSCSGTSQIGCVAEVNRSFSWKIKTIGYQSKSEREYKC NDSVRKGLPFRKEKVSSDERYKIIVSITNKSKRDGSFALILF >gi|222159192|gb|ACAB01000167.1| GENE 26 23554 - 25170 972 538 aa, chain - ## HITS:1 COG:MA4278 KEGG:ns NR:ns ## COG: MA4278 COG1262 # Protein_GI_number: 20093067 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 321 533 47 267 270 120 33.0 5e-27 MSKHNYDIFISYRKRCSGDKPEMLQLMLEESGFRKRVSFDKDNLNGRFDVELIRRIDECK DFIMFMVPETFTTIRPLNEEAVETGEKATWDMEEVAFYERMASLTYEEFETEIKQISHTG EIDFVRIELGRALHRRSRNPKQINIIPIAPQESESYDFATLQLPPDISGLKDFQAVFYSN SRVARFKDIKGDLLKQMLSKPSYVSAKWLVMTFIALSLIVAGSKTYTSIQRTAEQKLEFK DCRTYDDYSSFIKKHPDSPLKSTCDSILHEFNALRNDGRASVNNTGNRDIKDREKEWVDV KWNPTITLPQLRSLVDMMNNMLLIPAKNKEFIMGKTMGKGYDSPQHTVVLSSDYYMCKYE VTRSLWYAIMNDSIVTEEGMLPMTHITWNDAEAFTKKLNKLTGLPFSLPTEAQWEYAAAG GESYPYAGSDNIRDVAYYASNANERLHPVGEKRENGFDLYDMSGNAAEWCTDWMSRYENT RVTDPQGPAENPGHHKKIVRGGSYLANERDMDIRHRSVQTYDTSEPHIGFRVVLNPIQ 
>gi|222159192|gb|ACAB01000167.1| GENE 27 25266 - 25640 405 124 aa, chain + ## HITS:1 COG:PH0854 KEGG:ns NR:ns ## COG: PH0854 COG0251 # Protein_GI_number: 14590714 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Pyrococcus horikoshii # 1 124 12 136 137 132 55.0 1e-31 MKKVICSEKAPGAIGPYSQAIEANGMVFVSGQLPIDAATGQMAEGIEGQARQSLENIKHI LEEAGLTMGNIAKTTVFLQDMSLFAGMNGVYATYFDGAFPTRSAVAVKALPKDALVEIEC IAVR >gi|222159192|gb|ACAB01000167.1| GENE 28 25608 - 27098 1140 496 aa, chain - ## HITS:1 COG:TVN0757 KEGG:ns NR:ns ## COG: TVN0757 COG0285 # Protein_GI_number: 13541588 # Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Thermoplasma volcanium # 26 494 17 419 428 200 31.0 5e-51 MDYQNTLKYLYESAPMFQQIGGKAYKPGLETTHKLDEHFGHPHQQFKTIHIAGTNGKGSC SHTIAAVLQCAGYRVGLFTSPHLIDFRERIRINGEMIPEEYVVNFVEEHRSFFEPLHPSF FELTTAMAFRYFADQKVDVAVIEVGMGGRLDCTNIIHPDLCVITNIGLDHTQYLGDTLTK IAKEKAGIIKEGVPVVIGRAQGAVKRVFTMKAKEKNAPIEYARENARYWDMEIVPYSKLQ EIRPMMDNTIQSMHEMIEAMDEQSEEEANQMRQALLMLDLSDSLRTLDQILDKRKDAIKI HNEMFPFGLFTELSGAYQFENMFTILKALATLTRLNYNIRSQDYRAGLANVCQLTGLMGR WQKVHSYPDIICDTGHNVDGIEYIHVQLNAIHKTFDQEIHIVFGMVNDKDIRGVLRALPK YATYYFTKASVKRALPENELLALAEEAGLKGTTYPTVVEAVQAAKKNCPPKDLIFVGGSS FIVADLLANRDTLNLY >gi|222159192|gb|ACAB01000167.1| GENE 29 27230 - 28555 1447 441 aa, chain + ## HITS:1 COG:BH2629 KEGG:ns NR:ns ## COG: BH2629 COG1875 # Protein_GI_number: 15615192 # Func_class: T Signal transduction mechanisms # Function: Predicted ATPase related to phosphate starvation-inducible protein PhoH # Organism: Bacillus halodurans # 4 441 2 442 442 286 40.0 8e-77 MGTKKNFVLDTNVILHDYNCLKNFQENDIYLPLVVLEELDKFKKGNEQINFNAREFVREL DVLTSDELFSDGVKLGEGLGRLFVVTSNVPAAKVWESFPIKKPDHLILAATEYLTDKYPK MKSILVTKDVNLRMKARSIGLLCEDYITDKVVNVDVFEKSNEIFENVDPALIDRIYSSKE GIDLSEFDFKDLIHPNECFVLKSDRNSVLARYNPFTHSIIRVMKGKNYGIEPRNAEQSFA FEILNDPNIKLVALTGKAGTGKTLLALAAALGKLTDYKQILLARPVVALSNKDIGFLPGD 
AQEKVAPYMQPLFDNLNVIKRQFAANSTEVKRIEDMQKSEQLVIEALAFIRGRSLSEMYC IIDEAQNLTPNEIKTIITRAGEGTKMVFTGDIQQIDQPYLDSQSNGLVYMIDRMKDQNIF AHVNLLKGERSELSELASNLL >gi|222159192|gb|ACAB01000167.1| GENE 30 28651 - 29628 1039 325 aa, chain - ## HITS:1 COG:alr1912 KEGG:ns NR:ns ## COG: alr1912 COG0167 # Protein_GI_number: 17229404 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Nostoc sp. PCC 7120 # 3 319 2 323 343 207 36.0 2e-53 MTDLKTTFAGLSLRNPIIISSSGLTNSVGKNKKLAEDGAGAIVLKSLFEEQIMLEADQLK DPAFYPEASDYLEEYIREHKLSEYLTLIKESKKVCPIPIIASINCYTDSEWIDFAKKIEE AGADALEINILALQSELQYTYGSFEQRHIDILRRIKQTVNIPVIMKLGDNLTNPVVLIDQ LYANGAAAVVLFNRFYQPDINIEKMEHISGEIFSNASDLANPLRWIGIASAVVDKIDYAA SGGVANAESVVKAILAGASAVEVCSAVYLNTNAFIGEANRFLSAWMERKGFENIAQFKGK LNIKDIKGVNTFERTQFLKYFGKKE >gi|222159192|gb|ACAB01000167.1| GENE 31 29653 - 30321 482 222 aa, chain - ## HITS:1 COG:CAC2121 KEGG:ns NR:ns ## COG: CAC2121 COG0325 # Protein_GI_number: 15895390 # Func_class: R General function prediction only # Function: Predicted enzyme with a TIM-barrel fold # Organism: Clostridium acetobutylicum # 1 222 1 218 221 166 41.0 3e-41 MSIADNLKQVLAELPQGVRLVAVSKFHPNEAIEEAYQAGQRIFGESKVQEMTAKYESLPK DIEWHFIGHLQTNKIKYMIPYVAMIHGIDSYKLLAEVNKQAVKAGRTVNCLLQIHVAQEE TKFGFSPEECKEMLNVGEWKELTHVRICGLMGMASNTDCIEQINREFCSLNRLFNEIKTT WFTHSDTFCELSMGMSHDYHEAIAAGSTLVRVGSKIFGERNY >gi|222159192|gb|ACAB01000167.1| GENE 32 30328 - 30807 663 159 aa, chain - ## HITS:1 COG:no KEGG:BT_1331 NR:ns ## KEGG: BT_1331 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 159 1 159 159 293 98.0 1e-78 MAMHTWFECKIRYEKVMENGMQKKVTEPYLVDALSFTEAEARIIEEMTPFISGEFTVSDI KRANYSELFPSDEESADRWFKCKLIFITLDEKSGAEKKTSTQVLVQAADLRDAVKKLDEG MKGTMADYQIGMVSETPLMDVYPYSAEPNDKPEFDPSKA >gi|222159192|gb|ACAB01000167.1| GENE 33 30891 - 31454 529 187 aa, chain - ## HITS:1 COG:RSp0426 KEGG:ns NR:ns ## COG: RSp0426 COG3247 # Protein_GI_number: 17548647 # Func_class: S Function unknown # Function: 
Uncharacterized conserved protein # Organism: Ralstonia solanacearum # 13 170 7 163 186 65 28.0 8e-11 METVFNEIQHSVKNWWTSLLLGIVYIIVALWLMFSPVSTYVALSIIFSVSMLISGILEII FAFSNRKGVPSWGWYIVGGLIDLVLGIYLIAYPMVSMEVIPFIIAFWLMFRGFSSTGYSI DLKRYGTRDWGWYMAFGILAILCSLLILWQPAIGALYAVYMISFTFLIIGLFRVMLSFEL KNLHKRK >gi|222159192|gb|ACAB01000167.1| GENE 34 31592 - 32092 617 166 aa, chain + ## HITS:1 COG:Cgl1062 KEGG:ns NR:ns ## COG: Cgl1062 COG2077 # Protein_GI_number: 19552312 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Corynebacterium glutamicum # 1 165 4 167 168 186 58.0 1e-47 MATTNFKGQPVKLIGEFIQVGKVAPDFELVKTDLSSFSLKDLNGKNVILNIFPSLDTSVC ATSVRKFNKMAAGLKDTVVLAISKDLPFAHGRFCTTEGIENVIPLSDFRFSDFDESYGVR MADGPLAGLLARAVVVIGKDGKIAYTELVPEITQEPDYDKALAAVK >gi|222159192|gb|ACAB01000167.1| GENE 35 32283 - 33167 933 294 aa, chain + ## HITS:1 COG:no KEGG:BT_1328 NR:ns ## KEGG: BT_1328 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 294 1 294 294 519 88.0 1e-146 MNKMKTLFITLLCMGAGTLSAQTADSTQTSPWTKEGFAGLKLTQVSLTNWAAGGDNSVAF DLQGTYQINYKKGKHLWNNRIELAYGLNKTGDDGTRKANDKIYLNTNYGYAIAKSWYASA FATFQTQFSPGYDYSVNKDVAISEFMAPAYLTTGLGFTYDPGKIFTVVLSPAAWRGTFVL NDRLSDEGAYGVDPGKHLLSSFGANLKGEAKYEFLKNMTVYSRLDLYSDYLHKPLNIDVN WEVQINMIINKWFSTTLTTNLMYDDDVKIVQKDGTKGSRVQFKEILGVGVQFNF >gi|222159192|gb|ACAB01000167.1| GENE 36 33236 - 33886 307 216 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|154175107|ref|YP_001408238.1| ribosomal protein L22 [Campylobacter curvus 525.92] # 1 214 1 198 199 122 35 3e-27 MESVAELLKWVLENLNYWVVTIFMAIESSFIPFPSEAVVPPAAWKAMADDSMNIFLVVLF ATIGADIGALVNYYLARWLGRPIVYKFANSRLGHMCLIDEEKVHHAEEYFRKHGAASTFF GRLIPAVRQLISIPAGLAGMKLGPFLLYTTLGAAIWNSILALLGYLIYRFTDLKTTNDVY VMATTYSHEIGYVIIAVVVLVIGFLAYKGLKKKKKK >gi|222159192|gb|ACAB01000167.1| GENE 37 33918 - 35369 1655 483 aa, chain - ## HITS:1 COG:MA1362 KEGG:ns NR:ns ## COG: MA1362 COG0457 # Protein_GI_number: 20090223 # Func_class: R General 
function prediction only # Function: FOG: TPR repeat # Organism: Methanosarcina acetivorans str.C2A # 176 414 165 395 400 73 27.0 1e-12 MGRKNSPSAKKQLATLITNYEKAKAENRQLYLDADQLADIADWYASERKFEEAQEVITYG LKIHPGNTALLIEQAYLYLDTQKLQKAKKVADSITEDFDSEVKLLKAELLLNGGKLEEAQ WLLSTIADADELETIIDVVFLYLDMGYPDAAKEWLDRGKSRYAEDEEYMALTADYLASTH QVESAIIYYNKLIDKSPFNPSYWMGLVKCYFVQEQIDKAIEACDFALAADDQCGEAYAYK AHCFFYLNNSDDAIENYQKAIELKSIPPELGYMFMGISYGNKEEWQKADDYYDKVIARFE EDGDRQSVLLIDTYTSKAFALSHLERYEEAHQLCEKAKEINPNEGLIYLTEGKLYLAEEL EDEAALSFEKAIEINSNIEMWYMIASAYSESDYLIEAKEYFEKAYQMNPKYEDVTEKLSV LCLMHGEIDNFFKYNKECEHPLEEDMILDLLNSPEHREEDERTLKEVWERMKKENKKKKK GKK >gi|222159192|gb|ACAB01000167.1| GENE 38 35469 - 37208 1876 579 aa, chain - ## HITS:1 COG:RSc0791 KEGG:ns NR:ns ## COG: RSc0791 COG0008 # Protein_GI_number: 17545510 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Ralstonia solanacearum # 16 576 16 580 580 584 50.0 1e-166 MTDIKNEEAGEKKSLNFIEQAVEKDLKEGKNGGKVQTRFPPEPNGYLHIGHAKAICLDFG IAEKHGGVCNLRFDDTNPTKEDVEYVEAIKEDIQWLGYQWGNEYYASDYFQQLWDFAIRL IQEGKAYIDEQSSELIAQQKGTPTQPGVESPYRNRPIEESLELFKKMNSGEIEEGAMVLR AKIDMANPNMHFRDPIIYRVVKHPHHRTGTTWKAYPMYDFAHGQSDFFEGVTHSLCTLEF VVHRPLYDLFIDWLKEGKDLNDNRPRQTEFNKLNLSYTLMSKRNLLTLVKEGLVNGWDDP RMPTICGFRRRGYSPESIHKFIDKIGYTTYDALNDIALLESSVRDDLNSRATRVSAVINP VKLIITNYPEGQVEELEAINNPEDPEAGSHLIEFSRELWMEREDFMEDAPKKYFRMTPGQ EVRLKNAYIVKCTGCKKDENGVITEVYCEYDANTRSGMPDANRKVKGTLHWVSCNHCLQA EVRLYDRLWKVENPRDELAAIREAKNCEALEAMKEIINPDSLKVLPNCYIEKFAATLPTL SYLQFQRIGYFNIDKESTPEKLIFNRTVGLKDTWGKINK >gi|222159192|gb|ACAB01000167.1| GENE 39 37257 - 38069 918 270 aa, chain - ## HITS:1 COG:MA0887 KEGG:ns NR:ns ## COG: MA0887 COG0226 # Protein_GI_number: 20089771 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Methanosarcina acetivorans str.C2A # 23 269 70 315 317 189 46.0 3e-48 
MKVRRNLLIALSLLSLGANAQRIKGSDTVLPVAQQTAERFMNQHPDARVTVTGGGTGVGI SALMDHTTDIAMASRPIKFSEKMKIKGAGEEVDEVIVAYDALAVVVHPSNPVKQLTRQQL EDIFRGKINNWKQVGGDDRKIVVYSRETSSGTYEFFKESVLKNKNYMASSLSMPATGAII QSVSQTKGAIGYVGLAYVSPRVKTLSVSYDDTHYATPTVENATNKSYPIVRPLYYYYNVK NKEQVSPLIQFILSSDGQDIIKKSGYIPVK >gi|222159192|gb|ACAB01000167.1| GENE 40 38249 - 39445 1090 398 aa, chain + ## HITS:1 COG:MA0888 KEGG:ns NR:ns ## COG: MA0888 COG0573 # Protein_GI_number: 20089772 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Methanosarcina acetivorans str.C2A # 149 392 45 290 296 262 55.0 1e-69 MKKVFEKIIEGMLTCSGFVTSITILLIVLFLFTEAFGLFKSKVIEEGYVLALNKSNKVSV LSPAQIKNVFDEEITNWKELGGEDLPIRVFRLEDITQYYTEEELGPAYEYAGDKITELVE KTPGIVAFVPQKFIVHPDAVHFIEDNTISVKDVFAGAEWFPTATPAAQFGFLPLITGTLW VSLFAILFALPFGLSVSIYMSEVANPKVRNWLKPIIELLSGIPSVVYGFFGLIVIVPLIQ KLFDLPVGESGLAGSIVLAIMALPTIITVTEDAMRNCPRAMREASLALGASQWQTIYKVV IPYSISGITSGVVLGIGRAIGETMAVLMVTGNAAVIPITILEPLRTIPATIAAELGEAPA GGPHYQALFLLGVVLFFITLIINFSVEYISSKGLKRSK >gi|222159192|gb|ACAB01000167.1| GENE 41 39447 - 40322 852 291 aa, chain + ## HITS:1 COG:MA0889 KEGG:ns NR:ns ## COG: MA0889 COG0581 # Protein_GI_number: 20089773 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Methanosarcina acetivorans str.C2A # 13 289 30 306 307 271 51.0 9e-73 MEIRSNNKAKHRSQKIAFGIFRLLSLCIVLILFAILGFIVYKGIGAISWNFITSAPTDGM TGGGIWPAIVGTFYLMVGSALFAFPIGVMSGIYMNEYAPKGKLVRFIRVMTNNLSGIPSI VFGLFGMALFVNYMGFGDSILAGSLTLGLLCVPLVIRTTEEALKAIPDSMREGSRALGAT KLQTIWHVILPMGMPNIITGLILALGRVSGETAPILFTCAAYFLPQLPTGIFDQCMALPY HLYVISTSGTDMEAQLPLAYGTALVLIMIILFVNLLANALRKYFEKRVKTN >gi|222159192|gb|ACAB01000167.1| GENE 42 40331 - 41089 215 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 1 241 7 237 318 87 26 2e-16 MDTVKIDARDVNFWYGDFHALKGISMQIEEKSVVAFIGPSGCGKSTFLRLFNRMNDLIPA 
TRLEGEIRIDGHNIYAKGVEVDELRKNVGMVFQRPNPFPKSIFENVAYGLRVNGIKDNAF IRHRVEETLKGAALWDEVKDKLKESAYALSGGQQQRLCIARAMAVSPSVLLMDEPASALD PISTAKVEELIHELKKDYTIVIVTHNMQQAARVSDKTAFFYMGEMVEFDQTKRIFTNPEK EATQNYITGRFG >gi|222159192|gb|ACAB01000167.1| GENE 43 41165 - 41854 940 229 aa, chain + ## HITS:1 COG:RSc1533 KEGG:ns NR:ns ## COG: RSc1533 COG0704 # Protein_GI_number: 17546252 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate uptake regulator # Organism: Ralstonia solanacearum # 6 221 11 225 235 102 34.0 4e-22 MVKFIESELILLKKEIDEMWTLVYNQLDRAGEAVLTLDKELAQQVMVRERRVNAFELKID SDVEDIIALYNPVAIDLRFVLAMLKINTNLERLGDFAEGIARFVLRCKEPVLDAELLNRL RLAEMQAEVLSMLELAKRALNEESNDLAAGVFAKDNLLDEINADATGILSDYIIEHPEAV HTCVDLVSVFRKLERSGDHITNIAEEIVFFIDAKVLKHRGKTDENYPEK >gi|222159192|gb|ACAB01000167.1| GENE 44 42245 - 42847 719 200 aa, chain + ## HITS:1 COG:L0164 KEGG:ns NR:ns ## COG: L0164 COG0307 # Protein_GI_number: 15672976 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase alpha chain # Organism: Lactococcus lactis # 1 196 1 192 216 150 40.0 1e-36 MFSGIVEEYATLVALVKDQENIHFTFKCSFVNELKIDQSISHNGVCLTVVTLTDDTYTVT AMKETLERSNLGLLKVGDKVNVERSMMMNGRLDGHIVQGHVDQTATCIDVKDAEGSWYFT FQYAFDKEMAKRGYITVDKGSVTVNGVSLTVCNPTDDTFQVAIIPYTYEHTNFHTFEIGS VVNIEFDIIGKYISRMIQYK >gi|222159192|gb|ACAB01000167.1| GENE 45 42906 - 43442 554 178 aa, chain + ## HITS:1 COG:CAC2311 KEGG:ns NR:ns ## COG: CAC2311 COG0778 # Protein_GI_number: 15895578 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 1 143 1 139 187 79 32.0 4e-15 MDFLQLVQARQSDRSYDKQRPVEPEKLERILEAARLAPSACNAQPWKFVVITDKELAQKA GKAAAGLGMNKFAKDAPVHILVVEESANITSLLGGKVKGKHFPLIDIGIVAAHIALAAEA EGLGSCILGWFDEKEMKQLAGIPASKRLLLDIVIGYPAKEKRKKMRKPKEKVISYNRY >gi|222159192|gb|ACAB01000167.1| GENE 46 43811 - 45130 997 439 aa, chain - ## HITS:1 COG:FN1154 KEGG:ns NR:ns ## COG: FN1154 COG1295 # Protein_GI_number: 19704489 # Func_class: S Function unknown # Function: Predicted membrane 
protein # Organism: Fusobacterium nucleatum # 55 341 19 300 396 162 34.0 1e-39 MKKKITDIWKFLTYDIWRITEDEVTRTKFSLYNIIKTIYLCINRFTKDRMANKASALTYS TLLAIVPILAILFAVARGFGFDNLMEHQFRNGFGGNIETTEAILSFVNSYLSQTKGGIFI GVGLVMLLWTVINLVSNIEITFNRIWEVKKARSMYRKITDYFSMFLLMPILIVVSGGLSL FMSTILKQMDDFVLLAPIMKFMIRLIPFVLTWLMFTGLYIFMPNTKVKFKHALIAGILAG SAYQAFQFLYINSQLWVSKYNAIYGSFAALPLFLLWLQISWTICLFGAELTYAGQNIRSF SFDQDTRNISRRYRDFISILIMSLIAKRFEKNEPPYTAAEISEEHQIPIRLTNQVLYQLQ EIELIHEVFTDEKSEEIGYQPSMDINQLNVAILLDRLDTYGSENFKIDKDEEFNDEWKVL TESREEYYKKASKVLLKDL >gi|222159192|gb|ACAB01000167.1| GENE 47 45615 - 46595 759 326 aa, chain + ## HITS:1 COG:BH3007 KEGG:ns NR:ns ## COG: BH3007 COG0791 # Protein_GI_number: 15615569 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Bacillus halodurans # 53 276 84 306 336 95 32.0 2e-19 MKKNILLFSCFLVIAAVSLKAQEIRPMPADSAYGVVHISVCNLREEGKFTSGMSTQALLG MPVKVLQYTGWYEIQTPDDYTGWVHRMVITPMSKERYDEWNRAEKIVVTSHYGFAYEKPD ESSQPVSDVVAGNRLKWEGSKGHFYKVSYPDGRKAYISKSISQPETKWRASLKQDVESII ATAYSMMGVPYLWAGTSSKGVDCSGLVRTVLFMHDIIIPRDASQQAYVGEHIEIAPDFAN VQRGDLVFFGRKATAERKEGISHVGIYLGNKQFIHALGDVHISSMNPVDKNYDEFNTKRL LFAVRFLPYINKEKGMNTTDNNPYYK >gi|222159192|gb|ACAB01000167.1| GENE 48 46709 - 47860 1128 383 aa, chain + ## HITS:1 COG:all3532 KEGG:ns NR:ns ## COG: all3532 COG4948 # Protein_GI_number: 17231024 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Nostoc sp. 
PCC 7120 # 46 380 1 343 350 207 36.0 4e-53 MQNRRDFLKTATLAALGSGLVVRQTLAGESSLSNVYINKLGLGGKMKMSFFPYELKLKHV FTVATYSRTTTPDVQVEIEYEGITGYGEASMPPYLGETVESVMSFLGKVNLEQFSDPFQL DDILSYVDSLSPKDTAAKAAVDIALHDLVGKLLGAPWYKIWGLNKEKTPSTTFTIGIDTP DVVREKTKECADQFNILKVKLGRDNDKEMIETIRSVTNLPIAIDANQGWKDRQYALDMIH WLKEKGIVMIEQPMPKEKLDDIAWITQQSPLPIFADESLQRLGDVAALKDAFTGINIKLM KCTGMREAWKMVTLAHALGMRVMVGCMTETSCAISAASQFSPLVDFADLDGNLLISNDRF KGVEVVNGKITLNDLPGIGVMKI >gi|222159192|gb|ACAB01000167.1| GENE 49 48080 - 49615 1783 511 aa, chain + ## HITS:1 COG:YPO3566 KEGG:ns NR:ns ## COG: YPO3566 COG0265 # Protein_GI_number: 16123710 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Yersinia pestis # 78 481 56 436 457 243 38.0 9e-64 MKQTTKNILGVGAIILLSSGVAGLTTYKLLQSNEAAKETSFNEMFKQNPNVKLAAFDAVN AQPVDLTQAAENSLHAVVHIRSTQEAKTRTVQQAPDIFDFFFGDGRGQQRQVQSQPRVGF GSGVIISKDGYIVTNNHVIEGADEISVKLNDNREFKGRVIGTDPSTDLALVKIEGDDFPT IPVGDSEALKVGEWVLAVGNPFNLNSTVTAGIVSAKARSLGVYNGGIESFIQTDAAINQG NSGGALVNAKGELVGINSVLSSPTGAYAGYGFAIPTSIMTKVIADLKQYGTVQRALLGIR GGSIGSSLMDDRQPIDKSGKTLADKAKELGVVEGVWVSEIVENGSAAGADIKVDDVIIGV DNKKVSNMADLQEALAKHRPGDKVKVKLMRDKKEKTVEVTLKNEQGTTKIVKDAGMEILG AAFKELPDDLKKQLNLGYGLQVTGVSSGKMSDAGVRKGFIILKANDQPMRKVSDLEEVMK AAVKSPNQVLFLTGVFPSGKRGYFAVDLTQE >gi|222159192|gb|ACAB01000167.1| GENE 50 49799 - 50659 751 286 aa, chain + ## HITS:1 COG:lin1491 KEGG:ns NR:ns ## COG: lin1491 COG0568 # Protein_GI_number: 16800559 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Listeria innocua # 21 285 108 373 374 221 45.0 1e-57 MRQLKITKSITNRESASLDKYLQEIGREDLITVEEEVELAQRIRKGDRVALEKLTRANLR FVVSVAKQYQNQGLSLPDLINEGNLGLIKAAEKFDETRGFKFISYAVWWIRQSILQALAE QSRIVRLPLNQVGSLNKISKAFSKFEQENERRPSPEELADELEIPVDKISDTLKVSGRHI SVDAPFVEGEDNSLLDVLVNDDSPMADRSLVNESLAREIDRALSTLTDREKEIIQMFFGI GQQEMTLEEIGDKFGLTRERVRQIKEKAIRRLRQSNRSKLLKSYLG 
>gi|222159192|gb|ACAB01000167.1| GENE 51 50790 - 51413 331 207 aa, chain + ## HITS:1 COG:no KEGG:BT_1310 NR:ns ## KEGG: BT_1310 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 207 1 218 218 327 74.0 1e-88 MKKSVLISFVLVVLGAGVMAYARYATLSSDTVPAKLEQRLREKAQAGKAYCDKNGYNTNY CFLVDFSIHSGKRRFFVWDFKGDSVKYASLCAHGYGKNSTLSKPVFSNVEGSYCSSLGKY KVGIRSYSKWGINVHYKLHGLEATNNNAFKRYIVLHSYTPMPETEVYPLHLPLGISQGCP VISDEVMRKVDRLLKAEKKPVLLWIYD >gi|222159192|gb|ACAB01000167.1| GENE 52 51491 - 51892 288 133 aa, chain + ## HITS:1 COG:no KEGG:BT_1309 NR:ns ## KEGG: BT_1309 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 133 1 133 133 206 85.0 2e-52 MKYVKILFAIALVFTMCSAFSLKKDHSKPVYAFGISASFTDTIVYFTDIQILDSAKVSKE GFLAHRELYSYQLKNYLEDNQLQQNSTCMIYFSENRKKLEKEATKILNKYKKNNRMTVSR IDSDKFRFTKPEE >gi|222159192|gb|ACAB01000167.1| GENE 53 51984 - 53198 635 404 aa, chain + ## HITS:1 COG:no KEGG:BF2740 NR:ns ## KEGG: BF2740 # Name: not_defined # Def: clostripain-related protein # Organism: B.fragilis # Pathway: not_defined # 3 404 2 390 393 313 43.0 5e-84 MKKIKILSLLFCVVMLVAACHDDEEGPIIPQPREQVGRTVLVYIVGDNGVNELSDLFKTN FEDMKEGMKEVDYSKCNLVVYSEMVNDVPRLVSLKKQNGKVVADTLFTYSEQNPLAKEVM SSVISQTVSYFPADSYGFVFLSHSSSWVPATNDANSRSIGYYRRTQMNIPDFHDVLLSSF PRPLKFILFDSCSMQAVEVAYELRDCAEYFIGSPTEIPGPGAPYSVVVPEMFTENNLAIN IASAYFNYYEKFYTGKVPSVNTNWTGGVATSVINSAALDNLAMVVKAIIPKYIQDARVVQ CNDIQLYDFSSDEANYDFDNLIQNLTGGKDNADYQSWRQAFDEAVIYRKTTPKNYSGITY SMFSMEKAEGLSTYIPRGSFDSKINNFYRTLQWYSAAGWGETGW Prediction of potential genes in microbial genomes Time: Wed May 18 04:50:40 2011 Seq name: gi|222159191|gb|ACAB01000168.1| Bacteroides sp. D1 cont1.168, whole genome shotgun sequence Length of sequence - 13237 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 6, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 144 - 659 441 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases
2 1 Op 2 . - CDS 699 - 1643 328 ## BT_2411 hypothetical protein
- Prom 1693 - 1752 5.5
- Term 1790 - 1822 0.8
3 2 Op 1 . - CDS 1889 - 3718 1335 ## COG0826 Collagenase and related proteases
4 2 Op 2 . - CDS 3715 - 4632 891 ## COG1897 Homoserine trans-succinylase
- Prom 4676 - 4735 5.7
+ Prom 4559 - 4618 3.7
5 3 Tu 1 . + CDS 4813 - 4983 210 ## BT_2414 ferredoxin
+ Term 5006 - 5043 6.2
- Term 5118 - 5162 6.2
6 4 Op 1 . - CDS 5239 - 6432 1272 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase
- Prom 6453 - 6512 3.4
7 4 Op 2 . - CDS 6515 - 7732 1316 ## COG0807 GTP cyclohydrolase II
8 4 Op 3 . - CDS 7738 - 9588 1419 ## COG0795 Predicted permeases
- Prom 9627 - 9686 5.5
- Term 9632 - 9677 1.6
9 4 Op 4 . - CDS 9713 - 10099 395 ## BT_2418 hypothetical protein
- Prom 10341 - 10400 77.0
+ TRNA 10324 - 10397 84.7 # Met CAT 0 0
- Term 10539 - 10583 8.5
10 5 Tu 1 . - CDS 10607 - 11914 1385 ## COG0519 GMP synthase, PP-ATPase domain/subunit
- Prom 12054 - 12113 9.5
+ Prom 12039 - 12098 6.5
11 6 Op 1 . + CDS 12200 - 12781 286 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
+ Prom 12785 - 12844 1.7
12 6 Op 2 .
+ CDS 12864 - 13236 112 ## BT_3278 putative anti-sigma factor Predicted protein(s) >gi|222159191|gb|ACAB01000168.1| GENE 1 144 - 659 441 171 aa, chain - ## HITS:1 COG:CAC2751 KEGG:ns NR:ns ## COG: CAC2751 COG0454 # Protein_GI_number: 15896008 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 1 159 1 164 167 99 31.0 2e-21 MEIRSTEIKDLPLVMEIYDYARAFMRATGNTTQWIDGYPSEVLIRQEIEDGHSFVCIDEQ GEILGTFCFILGDDPTYQQIYEGTWLNNEPYGVIHRLATNGKQKGVSETCLNWCFERWPN LRVDTHRDNKVMQHILTKYGFQRCGIIYVKNGTERIAYQMTRVSDGTEKLA >gi|222159191|gb|ACAB01000168.1| GENE 2 699 - 1643 328 314 aa, chain - ## HITS:1 COG:no KEGG:BT_2411 NR:ns ## KEGG: BT_2411 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 314 31 344 344 553 87.0 1e-156 MNLLLCLLLLSSCYYKAPALDSEELSKKTKDSLTYLYERHYTWNTNLEVVDDSISLECLP IKDTFIQLNKGDRVVVAEFAVHPADSVDSVWVKLAHTQDEQGWIREKELKKSFVPTDSIS QAIHLFSDTHASYFIIIFALFVGVYLFRAFRRKQLQMVYFNDIDSVYPLFLCLLMAFSAT IYESMQVFVPETWEHFYFNPTLSPFKVPFILSVFLLSIWLFLIVALAVLDDLFRQLTPAA AIFYLLGLMSCCIFCYFFFILMTHIYIGYLFLAFFMLVFAKKVHRNIGYKYRCGRCGEKL KQKGVCPHCGAINE >gi|222159191|gb|ACAB01000168.1| GENE 3 1889 - 3718 1335 609 aa, chain - ## HITS:1 COG:ECs2039 KEGG:ns NR:ns ## COG: ECs2039 COG0826 # Protein_GI_number: 15831293 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Escherichia coli O157:H7 # 2 605 17 632 667 525 45.0 1e-148 MIKQRKIELLAPAKNLECGIEAINHGADAVYIGAPKFGARAAAVNSLEDIEALVQHAHLY HARIYVTVNTILKEEELKETEEMIHALYRIGVDALIVQDMGITKLNLPPIPLHASTQMDN RTPEKVKFLWEAGFRQVVLARELSLREIKKIHENCPEVPLEVFVHGALCVSYSGQCYVSQ ACFGRSANRGECAQFCRLPFSLVDADGKVIVKDKHLLSLKDMNQSDELEQLLDAGASSFK IEGRLKDVSYVKNVTAAYRQKLDAIFARRPEYVRASSGTCNFEFKPQLDKSFSRGFTHYF LHGRDKEIFSFDTPKSLGEEMGTVKEIRGNYLTVAGLKSFNNGDGVCYIDEQGRLQGFRI NRVDSNKLYPQEMPRIKPRTTLYRNFDQEFERVLSRKSAERKIAVRMLLADHHSGFSLTL 
TDEDDNSVTITLPREKEPARTPQTDNLKTQLSKLGNTPFEAKEIEISFTDNWFLPASVLA DFRRQAIDRLITARRINYRQELSVWKSTNHAFPQTTLTYLGNVMNTRAASFYQEHGVQQV AAAYEKEAVEDAVLMFCKHCLRYSMGWCPIHQRVRSPYKEPYYLVSNDGKRFRLEFDCKN CQMKVKAAQ >gi|222159191|gb|ACAB01000168.1| GENE 4 3715 - 4632 891 305 aa, chain - ## HITS:1 COG:CAC1825 KEGG:ns NR:ns ## COG: CAC1825 COG1897 # Protein_GI_number: 15895101 # Func_class: E Amino acid transport and metabolism # Function: Homoserine trans-succinylase # Organism: Clostridium acetobutylicum # 1 301 1 301 301 386 58.0 1e-107 MPLNLPDKLPAIELLKEENIFVIDTSRATQQDIRPLRIVILNLMPLKITTETDLVRLLSN TPLQVEISFMKIKSHTSKNTPIEHMKTFYTDFDQMRHEKYDGMIITGAPVEQMDFEEVTY WDEITEIFDWARTHVTSTLYICWAAQAGLYHHYGVPKYPLKEKMFGIFEHRVLEPFHSIF RGFDDCFYVPHSRHTEVRREDILKVPELTLLSESEDAGVYMAMARGGREFFVTGHSEYSP LTLDTEYRRDLDKGLPIEMPRNYYIDNDPEKGPLVRWRAHANLLFSNWLNYFVYQETPYN INDIQ >gi|222159191|gb|ACAB01000168.1| GENE 5 4813 - 4983 210 56 aa, chain + ## HITS:1 COG:no KEGG:BT_2414 NR:ns ## KEGG: BT_2414 # Name: not_defined # Def: ferredoxin # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 56 21 76 76 88 100.0 6e-17 MAYVISDDCIACGTCIDECPVEAISEGDIYSINPDVCTDCGTCADVCPSEAIHPAE >gi|222159191|gb|ACAB01000168.1| GENE 6 5239 - 6432 1272 397 aa, chain - ## HITS:1 COG:BMEI0516 KEGG:ns NR:ns ## COG: BMEI0516 COG0436 # Protein_GI_number: 17986799 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Brucella melitensis # 1 397 22 421 421 381 50.0 1e-105 MNQLSDRLNSLSPSATLAMSQKSNELKAQGVDVINLSVGEPDFNTPDHIKEAAKKAIDDN FSRYSPVPGYPALRNAIVEKLKKENGLEYTAAQISCANGAKQSVCNTILVLVNPGDEVIV PAPYWVSYPEMVKMAEGTPVIVSAGIEQDFKITPEQLEAAITPKTKALILCSPSNPTGSV YTKEELAGLSAVLAKHPQVIVIADEIYEHINYIGAHQSIAQFPEMKERTVIVNGVSKAYA MTGWRIGFIAGPEWIVKACNKLQGQYTSGPCSVSQKAAEVAYTGTQEPVKEMQKAFQRRR DLIVKLAKEVPGFEVNVPQGAFYLFPKCDAFFGKSNGERKIADSDDLAMYLLEEAHVACV GGASFGAPECIRMSYATSDENIVEAIRRIKEALAKLK >gi|222159191|gb|ACAB01000168.1| GENE 7 6515 - 7732 1316 405 aa, chain - ## HITS:1 COG:BH1556_2 KEGG:ns NR:ns ## COG: 
BH1556_2 COG0807 # Protein_GI_number: 15614119 # Func_class: H Coenzyme transport and metabolism # Function: GTP cyclohydrolase II # Organism: Bacillus halodurans # 210 403 1 194 197 236 59.0 6e-62 MKETIKMDRIEDAIADFKEGKFVIVVDDEDRENEGDLIIAAEKITPEKVNFMLKHARGVL CAPVTVSRCKELDLPHQVSDNTSVLGTPFTVTIDKLEGCTTGVSASDRAATIQALADPTS TPATFGRPGHINPLYAQEKGVLRRAGHTEATIDMARLAGLYPAGALMEIMSEDGTMARLP ELRQMADEHGLKLISIHDLIVYRLKQESIVEKGVEVNMPTEHGKFRLIPFRQKSNGLEHM AIFKGTWSEDEPILVRVHSSCATGDILGSQRCDCGEQLHKAMEMIEKEGKGVVVYLNQEG RGIGLMEKMKAYKLQEDGMDTVDANICLGHLADERDYGVGAQILRELGVHKMRLLTNNPV KRVGLEAYGLEIVENVPVETVPNPYNERYLRTKKERMGHTLHFNK >gi|222159191|gb|ACAB01000168.1| GENE 8 7738 - 9588 1419 616 aa, chain - ## HITS:1 COG:alr4069 KEGG:ns NR:ns ## COG: alr4069 COG0795 # Protein_GI_number: 17231561 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Nostoc sp. PCC 7120 # 19 153 30 164 371 63 27.0 9e-10 MLFIGTFFICLFIFMMQFLWRYVDELVGKGLEMSVMAQFFFYSALTLVPVSLPLAVLLAS LITFGNFGERYELLAMKAAGISLLKIMRPLAFFVCGLVGVSFYFQNVVGPIAQAKLGTLI LSMKQKSPELDIPEGVFYSEIKDYNLKVAKKNRKTGMLYDVLIYSMKDGFEKARIIYADS GRLEMTADKQHLWLHLYSGDLFENLKAQSMKSENVPYRREEFREKHTIIEFNSDFNMVDG EIMGKQSSAKDMAQLQSSIDSMTVVGDSIGRQYYREVAEGNFRPSYGLTKEDTVKIEKAD IHEYNVDSLYEVASLTQKQKVISSAVSRAENVANDLGFKKFTMENNDYSIRKHKTEWHKK ITISLSCLLFFFIGAPLGGIIRKGGLGMPVIVSVLVFIIYYIIDNTGYKMARDGKWIVWM GMWTSSAVLAPLGIFLTYKSNKDSVVLNADAYINWFKKIVGIRSVRHIFKKEVIIHDPDY VRLTGDLEQLSAECKAYAARKRLEKAPNYFKLWMASEDDNEVMAINEKLEALVEEMSNTK SATLIGALNNYPVISVSAHVRPFHIYWLNLVAGVIFPIGLFFYFRIWAFRVHLAKDMERI IKNNEQIQFIIQKINK >gi|222159191|gb|ACAB01000168.1| GENE 9 9713 - 10099 395 128 aa, chain - ## HITS:1 COG:no KEGG:BT_2418 NR:ns ## KEGG: BT_2418 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 128 1 128 128 226 96.0 2e-58 MKKEKIHLEYLLNATSKNILWGAISTPTGLEDWFADKVISDDKIVEFHWGKTELRKAEII AIRSFSFIRFRWQDDENERDYFEIKMTYNELTSDYVLEITDFAEPDEVADMKELWESQVA KLRRTCGF >gi|222159191|gb|ACAB01000168.1| GENE 10 10607 - 11914 1385 
435 aa, chain - ## HITS:1 COG:BH0607_2 KEGG:ns NR:ns ## COG: BH0607_2 COG0519 # Protein_GI_number: 15613170 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Bacillus halodurans # 121 435 1 315 315 416 62.0 1e-116 MKQDMIVILDLGSHENTVLARAIRALGVYSEIYPHDITVEELKALPNVKGIIINGGPNNV IDGVAIDVNPAIYTLGIPVMAAGHDKATCEVKLAEFADDIEAIKAAVKSFVFDVCKAEAN WNMKNFVNDQIELIKRQVGDKKVLLALSGGVDSSVVAALLLKAIGNNLVCVHVNHGLMRK GESEDVVEVFSNQLKANLIYVDVTDRFLNKLAGVEDPEQKRKIIGGEFIRVFEEEARKLN GIDFLGQGTIYPDIVESGTKTAKMVKSHHNVGGLPEDLKFELVEPLRQLFKDEVRACGLE LGLPYEMVYRQPFPGPGLGVRCLGAITRDRLEAVRESDAILREEFQIAGLDKKVWQYFTV VPDFKSVGVRDNARSFDWPVIIRAVNTVDAMTATIEPIEWPILMKITDRILKEVKNVNRV CYDMSPKPNATIEWE >gi|222159191|gb|ACAB01000168.1| GENE 11 12200 - 12781 286 193 aa, chain + ## HITS:1 COG:TP0092 KEGG:ns NR:ns ## COG: TP0092 COG1595 # Protein_GI_number: 15639086 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Treponema pallidum # 23 174 1 155 162 62 32.0 5e-10 MNEEKSSFIKGINEQHPAAYHQLYNEYYKALVLYAINFLSSQQAAEDIVQDLFATMWEKK MRFLSLPSFRTYLYNSIRNASLNYLKHQNVESLYLERLASTYREITEEEDTNEEEVYRLL FRAIDKLPTRCREIFLLHMDGKKNEEIATALGISIETVKTQKKRAIQSIKEQMGTCYFLL PLCDILYSSKFFS >gi|222159191|gb|ACAB01000168.1| GENE 12 12864 - 13236 112 124 aa, chain + ## HITS:1 COG:no KEGG:BT_3278 NR:ns ## KEGG: BT_3278 # Name: not_defined # Def: putative anti-sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 94 2 97 394 63 33.0 2e-09 MKIKSGTEELIAKYLFGDITEHEKKQLDNWIKTSPQHEEFFNRLRTSASFRKRYEAYTQI NSHQAWKHFKKKYCQVSVTSILLKYAAILILPIIIAAGGWYFYIASEKQISDNLALGDAI QPGI Prediction of potential genes in microbial genomes Time: Wed May 18 04:50:58 2011 Seq name: gi|222159190|gb|ACAB01000169.1| Bacteroides sp. 
D1 cont1.169, whole genome shotgun sequence Length of sequence - 30252 bp Number of predicted genes - 19, with homology - 17 Number of transcription units - 6, operones - 5 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 4.8 1 1 Op 1 . + CDS 92 - 3421 1362 ## Cpin_1097 TonB-dependent receptor plug 2 1 Op 2 . + CDS 3433 - 4935 837 ## Cpin_1098 hypothetical protein 3 1 Op 3 . + CDS 4947 - 5807 371 ## gi|237712771|ref|ZP_04543252.1| predicted protein 4 1 Op 4 . + CDS 5829 - 7508 1044 ## gi|262409737|ref|ZP_06086275.1| predicted protein 5 1 Op 5 . + CDS 7542 - 10127 1605 ## BT_3275 hypothetical protein 6 1 Op 6 . + CDS 10142 - 12685 1960 ## BT_3275 hypothetical protein + Prom 12687 - 12746 6.1 7 2 Op 1 . + CDS 12766 - 13185 253 ## BT_3983 hypothetical protein 8 2 Op 2 . + CDS 13225 - 16008 896 ## CHU_2270 hypothetical protein + Term 16049 - 16088 -0.8 + Prom 16105 - 16164 4.1 9 3 Op 1 24/0.000 + CDS 16194 - 17129 682 ## COG1131 ABC-type multidrug transport system, ATPase component 10 3 Op 2 2/0.000 + CDS 17131 - 19452 1355 ## COG1277 ABC-type transport system involved in multi-copper enzyme maturation, permease component 11 3 Op 3 . + CDS 19465 - 21792 1365 ## COG1277 ABC-type transport system involved in multi-copper enzyme maturation, permease component - Term 21797 - 21845 6.8 12 4 Op 1 . - CDS 21872 - 23311 1586 ## COG0642 Signal transduction histidine kinase 13 4 Op 2 . - CDS 23331 - 24461 895 ## COG2205 Osmosensitive K+ channel histidine kinase - Prom 24481 - 24540 1.6 - Term 24475 - 24524 9.1 14 5 Tu 1 . - CDS 24545 - 25309 697 ## BT_2422 hypothetical protein 15 6 Op 1 18/0.000 - CDS 25431 - 26003 704 ## COG2156 K+-transporting ATPase, c chain 16 6 Op 2 20/0.000 - CDS 26020 - 28053 2117 ## COG2216 High-affinity K+ transport system, ATPase chain B 17 6 Op 3 . - CDS 28072 - 29778 1779 ## COG2060 K+-transporting ATPase, A chain 18 6 Op 4 . - CDS 29810 - 29890 76 ## 19 6 Op 5 . 
- CDS 29910 - 30005 58 ## - Prom 30184 - 30243 10.1 Predicted protein(s) >gi|222159190|gb|ACAB01000169.1| GENE 1 92 - 3421 1362 1109 aa, chain + ## HITS:1 COG:no KEGG:Cpin_1097 NR:ns ## KEGG: Cpin_1097 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: C.pinensis # Pathway: not_defined # 119 1106 150 1145 1148 662 37.0 0 MRKTKSRERIRILSLVVVLLLLPTMMLAIPESGKKVTLNLESVTVKDFFDALRQQTGLSF VYNTEQTKSLKPITIHVKDETVDNVLRTVLNGTGLTYSMERDIVTISKVEQQGSKRIATG IVSDEEGYPLPGVNVVISDLQRFAITDNNGKYNIEIPANTACTITFSYIGMSTQQVMINS GRNDVRKNITLKSDTKLDEVIVTGIYTRKAESFTGAATTISSKDLMRVGNQNVFQSLKNL DPTIYIADNFDMGSNPNSVPDMSMRGTSSFPTTESSSLRSNYQNQPNQPLFILDGFETTA ETIMDMDMNRIESITILKDASAKALYGSKAANGVIVIETKRLTGNQQRITYNGSISLEMP DLTSYDLCNAFEKLEAERLDGVYTSSSANTQIQLDQLYNGRRKLALEGLDTYWLSKPLHT GIGHKHNLNIELGDSQNLRAILDLTYNQITGVMKGSDRRNISGDINLSYRHNKLLLKNIL SIISNKSNESPYGEFSEYSRMNPYWQATDENGNIVQWVEQIGSTSSKVANPMYNSIIGTS FTSSYLQFTNNFYAEWQVDKNWKATARIGVSEKRNDMDDFYPASHSIFANVNDILKRGKY IMENGKSSSISGDLNINYNNQFGKHTIFGNAGAFISGEKSSAYRHTAEGFPNNQKADISF AKQYAENSTPTGYSTINREASFLLAASYDYDNRYLADATVRESASSLYGSDNRWANSWSF GIGWNLHNEAILKGVGWIKQLKLRASIGLTGNQNFDTNAAIATYNYYTGVVYGGLAGKFT GAYLASMPNSKLKWEQKKDHNIGIDMRVAGLSLSVDYYSADTKNMLTDVTIPTSTGFAIV KDNLGLVRNSGVEAKANYTIWQGKEGFVNVYGTFAYNRNKIIRLSESMRAYNEKMMKQAE DNNTSAPVLMYQDGLSMKTIWAVPSAGIDPQTGQEIYIKKDGTYTYTYSANDMVAAGNSD PKYRGTGGFTAEYKGIGLSATISYLAGCQMYNSTLVSRVENANIAYNVDRRLLIGRWTTP GQVTPYKKFNSETTTRATTRFVQDRRELSLSSISAYYEFPSSIYRKLSMQRLRLSFYLND IATFSSIKIERGLNYPFARNMSFSLTATL >gi|222159190|gb|ACAB01000169.1| GENE 2 3433 - 4935 837 500 aa, chain + ## HITS:1 COG:no KEGG:Cpin_1098 NR:ns ## KEGG: Cpin_1098 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 1 497 1 486 488 201 30.0 6e-50 MKKYILILLTSVVTLSSCSDWFDVTSGSEIREKDHYSTLAGFQQSLIGCYISMTDNALYG KELSWYAIEILGHQFNPVTSNSSNSREMQEYEKFNYDHTQIFSDIENIWAKAYSVIANAN EALTNMEGQESSLNEVYYHVIKGELLAIRAYIHFDLLRLYGYGDWKNRSAELNSKKTIPY VTTLSSIPTPQRTGKETLQMIINDLEAAEKLLKEYDPITKKHPASYYTSIDADGFFKDRT 
LRLNYYAVKALEARVYLWEGSKESIDKALNATEEIITAIGDNGIVMDDMYTYSYLLPEIS QSNRSLASEALFSLNVSDVTSKITSYIIPNFVNTDYTAMYISPTDVENIYEGINSDVRFT RMLDQTQSDSRGYVPMKIHQASLDEFNKNRLSLIRLPEIFYIAAECYATSATPNLDMALQ RLNKIRENRGISTPLENLNADQIIKEIQKEYHKEFISEGVMFYYYKRTGCKSIPNYSEEM TDTQYLLPYPTFEIQSGRVQ >gi|222159190|gb|ACAB01000169.1| GENE 3 4947 - 5807 371 286 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237712771|ref|ZP_04543252.1| ## NR: gi|237712771|ref|ZP_04543252.1| predicted protein [Bacteroides sp. D1] # 1 286 1 286 286 538 100.0 1e-151 MKKIIIKSVLAISCFSLCLISCGKEEIPYFDSQYNAVRFNSTNEYDAETDIFKGNYSFLE NPFDEYGEYELPLVLVGNVSTEDRTVNYVINAEETTAPENSYEITAAVIPANSLKGTIKI RLFNTDEIQNGASYKLYIQLKESSTLGLGPKEYITATVSWNNNIIAPPATERYVWMTYNS LIKSSLAPTSYSTTAYSSNALKTIVTALDWDDWDDMTAHPDQPQRPTYFTYKYLANYRLF VTDKSYEAYAAKLADYLKKYQKEHPDTPLVHNEGNLKGQIIEARTY >gi|222159190|gb|ACAB01000169.1| GENE 4 5829 - 7508 1044 559 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262409737|ref|ZP_06086275.1| ## NR: gi|262409737|ref|ZP_06086275.1| predicted protein [Bacteroides sp. 
2_1_22] # 1 559 1 559 559 1022 100.0 0 MKKNNITQLIILLLPNILWFTSCYEDKSSFVTNLIPEVQISINDKNQENSIYVGYQSPVD IVPSITQDGFDGSNLRYEWAVTEEPSTNNPVYEIIGTEKDFHGIINRPISNGAYTLKLTV TDVANDNLQYIYSWELYVQSSFLDGLLIADSENGTTTDFTLINNSQITNQYTKEERIFRH ILETANGAAYDELLTSLTYEVMGNTAILGSSHLNQIWAISSTGKSIRFNCKDYSINGTWE DEKIFLYCPTDFQVKSYIRSSQLFIAYTNNGLYSFLNVAGNKFSMPNSVFNGFEINNNVY AANSSYSIGDNHLVWLDKPKGAFYSLNGTSYNTCTPYISNPDFDPNDMGSQTAIAATSSQ DGSLATFLLKDDNSGNYAIYTLSQYKAEEGYYEDPDNWEGWIVTSPEQPAAAKNKYIIPT TGKTLLDKAVSIFFGHTNNVLYVVTDAAIYSFTYGMGNEVSVSTTSQFTPSNGEKITKAK LYQQGQYTNQINVMTGNPPTIAPNAWNNKALIIVTQSAEFNGKVSIVPMKQAGAGTLDIS KALIYDGFGKILDVTTTGY >gi|222159190|gb|ACAB01000169.1| GENE 5 7542 - 10127 1605 861 aa, chain + ## HITS:1 COG:no KEGG:BT_3275 NR:ns ## KEGG: BT_3275 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 48 861 30 860 860 697 44.0 0 MKKKILLLAAMVVASSTTIDAQRKFPFFWKKKKAKTEQTTPAKKESEYDKLFKKKHEIAK GLITLHLLDGKVYFELPVDLINKDMLIGSTVTSISDNGNAVVGSKPTDLLHVVFTRNKTH VQLRQVNTDYITGNTQIDEALRKSTLGAILSNQKIQAYNNDSTAIVFDMSSVFLGDNKKM SPFDKNSIYGMYNRTENYQSDCSYISQIKAFKDNVSIKSCLSYTFSVSNSQGTSLIKDRP FTAEMTRSIMLLKEKPYRPRMADYRIGVFFTGREQLGEGAKTTVPVYYANRWDIQPSDTA AYLRGEKVKPTKQIVFYIDNTFPEKWKPYLREGVTQWNELFEQIGFKDVVAAKDFPTDDP EFDPDNIKYSCVRYAPSSIENAMGPSWVDPRSGEILNASVYLYHNVIKLISNWLFVQTAQ ADKDVRTVNIPDEMVGDALRYVLSHEIGHCLGFMHNMGASSTFPVDSLRSPEFTQKYGTT PSIMDYARFNYVAQPGDKERGVKLTPPRFGEYDKYLIKWTYTPVFNVNSAEEEAIITGKW ISDAIKENPVYRYGKQQVYGVVDPRSQTEDIGDNSMKATRYGIKNLKYIMNNLESWISEG DDTYEYREDLFIGIVEQLAMYVTHVAGNVGGYFVNEVKEGDTMPRFAQIPKAQQKEALNY LFEIYNDLNWLDNKNLLTKFPISGSPKQTIQNFMLRYILPVPFQVSQYEGLEKDSFTAAE AFNMIYNFVWKPTISGCTLTESQMNLQKQYIYMMMQTAGFTIKGAGKALAGEKPLDINHR QFGYTCCQGHAIKEDVIHNPVAGFEWRPLNRFSMTAKVTQADVYAYIAKAKQLMKQKAAS ASGKTKAHYELLLKMLDINLK >gi|222159190|gb|ACAB01000169.1| GENE 6 10142 - 12685 1960 847 aa, chain + ## HITS:1 COG:no KEGG:BT_3275 NR:ns ## KEGG: BT_3275 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 49 739 30 738 860 307 
30.0 1e-81 MKILFLWLMPILLIIGSTNEANARSKKKKGKAKTEQQTDSVQKKKKSVYDKLFKDKKKHT VNKGTITVHQYEDKIYLELPIELMGRDFLVNSAITTASDISLAGTKAAQSRYLIIDKTDS LILFRDPKYNVRLNEQDDNQEAAFALSRSNAIYKAFPIEGYTSDSTAVVFNATSYFSCSN KDVLNLSGRSYGGMLTIVSASPQSKTSFVDSADAFDNCISITQNCTAKLSISIMGFVSKE QPELTMSVQTTLALLSKEKMNTREANPRVGTGYISYTDYRNEKRFKKGYYVTRRNITTQQ PVVFYIDTLIQDSWVKAIQKSADEWNIIFEDLGIGKPIIIKPYEKDSTFRANNPMINTIA FLNNNNSEVTAYNVTDLRTGEILSTKIGVPRDLAVSVRRNGVYQMAEIDPRFRTYYIADE VICENLTARMLKAFGLSLGLATNLAGSAAYSPEELRSPEFTQKYGITASVMDNVLYNYLA QPGDKEKGVVLIVDKPGVCDAFTLKYLYAATSENESDMLKKWAMEHDGDPRYFYGKRSPA YATDPRCQNYDLGNDPIASLDAQIAHVKYVVKNSPAWFHDDNIPNDYRELFPDFVIIELI NKTLSPVSSYIGGIYINEANEKSNVPSYQPVSADMQKKVLQKIFSTFYDLSWLDSNKDFL RLGGVNPDMSAWIYNNGYPMMSLMFRLMRMGLSVEKSTRPYTQEAYLNDIEKQLFKETLN GKPLSAPMIAQLSVYISSLKGMCPTLKAIDKAVSTRVTSIALNEQTNHKLQSLGLLTTFA SISATEEQSGMEPMTAVNFYVGTDIEAICYDKLKSTRRYLIQARSLASNDIERGKCDYLI AMIDRVI >gi|222159190|gb|ACAB01000169.1| GENE 7 12766 - 13185 253 139 aa, chain + ## HITS:1 COG:no KEGG:BT_3983 NR:ns ## KEGG: BT_3983 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 57 139 126 206 1135 75 45.0 6e-13 MKQIILFIVFIFTIMMCSFAQEAPSSPIIKARDNKSTEFFKNKKLQSPISPMVLFSQKIT GVVYTEGGKEVLIGASVIEVGSTNGTITDIDGEYSIGISKADKKILQASYLGYETKQERI GGRRVVNFFLKESSQALDK >gi|222159190|gb|ACAB01000169.1| GENE 8 13225 - 16008 896 927 aa, chain + ## HITS:1 COG:no KEGG:CHU_2270 NR:ns ## KEGG: CHU_2270 # Name: not_defined # Def: hypothetical protein # Organism: C.hutchinsonii # Pathway: not_defined # 37 907 29 929 929 138 21.0 8e-31 MKKFISIFLLLYITLPINAQDKTNELSGFIIHKQENGSNGSLDGANIFLIYAKDTLKTTS INGVFRFKPIKTGKAKLIITAIGCRKVEKELNIIAGKNSNLYIEILDESIQLEEVTIKGR IPIVTQNGDTLIFNPKAVNIQEGDVAMNIVEQLPGTETDDHSVKIMGKQVTKTYIDGRLI FGSDPMAALKNLSATDVLKIKAYDEYENTKTKKLMYRGDLTRVLNIETKSKLISSWKAHL LASTGSNMDSKNDRGKFRKGLGLTTNFFSEKFLLTSNVFHNNINRKSNNIKNVLSISDPS STYNETTYADLATERSWDTENGTYGTFRAFYTFGYNRDNTNTRSEQHYFPGNNYQQRLYE EQNSNSDRKQNHYSEFSFSNSNDKWGEFRWNQLITYNNNKDWQSLYIYNEENNLQTSKSL 
MHYDNLNKNFHVKEDVSYVNSITDRIGYTLESSVDINKGKNHSLRIDTLESSTTQTYLTI PSKLRNTEWNGNAQLIYILNPENDTQISFDYSVKYENGWKHQFAWNMLSPTNPQTDEANT YSYKINNLTQKQEVSFHFFPFQKTSCDISAGLKETTLKRKEKEMDNYRKTFISPTVAISF VHTDITKSWSAAYRLHNFAPNIVQLRPQIDNSNPYMLRSGNPNLKQSYLHSFLFNCNRML GKHNHTIGVIINASIRQHSPVAKTTYYNAETYLPGLQYTAPAHSSLISFENVEGYWDIKG KLIWQAPIRSIKSKYALSTGFNYEHNPYYIGENKTTTRTYDPSLEHFLLCNLTKRLKITI SANTHYVHSINTENYTSKTFYQTAGATFDISQICKYFYLSSHYNFIFSRDYGINKEINRN HTLNLNIGCKVLNRKGDISIAAYDLLNSHRTFNSQMYSNYIQNTWTNYYGRFFTINFAYR FGKVKSNYEGTTNDGSIREYRPMSDKM >gi|222159190|gb|ACAB01000169.1| GENE 9 16194 - 17129 682 311 aa, chain + ## HITS:1 COG:sll0489 KEGG:ns NR:ns ## COG: sll0489 COG1131 # Protein_GI_number: 16331772 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Synechocystis # 5 303 1 309 342 225 39.0 7e-59 MSNPIVEVKHLSHRYSVDWAIRDINFSIEKTGILGLLGSNGAGKSTTMNIICGVLNQTEG DVFINGIDLRKNPVEAKKHIGFLPQKPPLHPDLTVDEYLIHCATLRRMEKSKIREAVEIA KERCAIAHFSKRLIKNLSGGYQQRVGIAQAIVHNPPFVVLDEPTNGLDPNQIVEIRNLIK EIAEDHSVLLSTHILSEVQATCNDIRMIEHGKVVFSGSMKDFDNYVVPSSFTVTFALPPS IEELAKIEHVLNIEELYPGTFRIRFDDDENITERVVALSIQNGWRLKEITMERCSLDIIF AQLSGKLKNNI >gi|222159190|gb|ACAB01000169.1| GENE 10 17131 - 19452 1355 773 aa, chain + ## HITS:1 COG:BB0753 KEGG:ns NR:ns ## COG: BB0753 COG1277 # Protein_GI_number: 15595098 # Func_class: R General function prediction only # Function: ABC-type transport system involved in multi-copper enzyme maturation, permease component # Organism: Borrelia burgdorferi # 3 251 4 239 244 88 27.0 4e-17 MTDLKIIMRIARTDLAILFYSPIAWFILIVFSFLTTASFTSLMENIVTDYDLSGGKEVSL SGICFLGSYGFLSSVVSNIYIYIPLLTMGLISRETASGSIKLAYSSPVTSGQIVLGKYLA AIGFGCCLMLVPIASAIYGSLVIPSFDWAPVLVALLGLYLLICAYCAIGLFMSSLTTYQV VAAVGTLIILAILNFVGSIGQEYDLIRELTYWLSINGRTIDMLNGVIRSEDVIYFVVVVT LFLTFTTFKLTSDRCTISRFRQAISYLGFFVAAMAIGYFTSRPGMIKVWDTTRTRLNSLT ENSQHVLAKLTGPVTITNYVNLLDNKSYRYLPIMKKANETIFEPYCLAKPDLQVKYVYYY DFAPNGVANNPKFQGKTVDEMRDYMTMIYNLNPHLFKSPAEIRQIIDLREEQNTFVRIME 
TQDGKRTFIRDFEDMDATPSEAEITAAIKKMISTPPTVAFIKGDGEREVSKSGDRDYSNF SIEKYSRAALINQGFDVCEIDISHGDTIPSLINIMVLAEMRTPLTEKGENQLEAYLARGG NLFILTDTGRQEVMNPFLSKLGIKMEEYQLAQSSADFSPNLILAKATRESGKLTFGFKDD FPKYDLRVSMPGCVALTCSDNDYGFQYTPILETNAKGVWIEKEQTDLQESPVECNASAGE KEQTYITAYALSRQLKDKEQRIIISGDADCISNTELTLSREGYRSGNFNLIIESFRWLSG GEFPIDIRRPHCTDNKLSIGVKDIGTMKTIFIIIIPAILLLIGVGIWFFRRRN >gi|222159190|gb|ACAB01000169.1| GENE 11 19465 - 21792 1365 775 aa, chain + ## HITS:1 COG:BB0753 KEGG:ns NR:ns ## COG: BB0753 COG1277 # Protein_GI_number: 15595098 # Func_class: R General function prediction only # Function: ABC-type transport system involved in multi-copper enzyme maturation, permease component # Organism: Borrelia burgdorferi # 1 255 3 244 244 99 28.0 3e-20 MNLRLILRIARTELAVLFYSPVAWLLLVAFTCQVGFDFMNILTEIVKIKALGNTITFSVT AGFVLGLKGIYEVIQETIYLYIPLLTMNLMSREYSSGSIKLLYSSPVSSVQIITGKFVSM VVFALIFVIILALPTIVMFISVPHVDITLILAGLLSMFLLILTYCSIGLFMTTLTSYQVV AAVATLSALAFLNYVGGIGQESIFFREITYWLSIKGRASEMVGGLICSDDVIYFLAVILL FLWLSVIKLNNEKTRRSLFSKTMRYALAVCTIIVIGFVSSRPAMMGFYDATRSKQRTLSE ESQKVMEQLSGPMTITTYVNIFDKEFDVASPREQKEDMARFKMYTRFKPEIKMEYVYYYS TPKDSTLYRQYPNKNIREIAYEVAKKKNFNPKKLKSAEELKEKIDLAKENYRFVRVVERG SGEQARLRLFDDMEYHPSETEISAALKKMLVTPVKVGAITGHQERSTTKKGDQDYSLFAT HGRFRYSMINQGFDLVELNLKDMNDIPSNINILLIAEMRSSMSSKEQEIIDRFLERGGNM MIMGDVGRQEVMNPLLRKVGLKLLPGIIAQPSDVNPGDLVLAKATQIAADSIGGFYKRMV DRQKHSAVTMPSAVALEVVDTTKFHPIVLLQSNAQQTWIEYQTKDFLNDSLSLDSLQGEK LGAYPTAIALTRKIKAKDKKQRIIVLGDADCFSNAELQKSSRPGIYSFNFNMIPGSFRWL CYNKFPVSSSRAPYLDKDISLTPMDLSTIKIIYCYGIPFIIGLCGIWICWRRRKR >gi|222159190|gb|ACAB01000169.1| GENE 12 21872 - 23311 1586 479 aa, chain - ## HITS:1 COG:SA1322 KEGG:ns NR:ns ## COG: SA1322 COG0642 # Protein_GI_number: 15927072 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Staphylococcus aureus N315 # 66 477 195 584 588 164 28.0 3e-40 MNIKTKLILGIGMLAGMIILLVTLSVVNLQTLTATEPDSPAAMPALERALLWISITGGIC ILTGLILLYWLPRSISKPIKELKEGILEIANHNYEKRLDMSDNEEFREVADSFNRMAERL 
TEYRASTLSDILSAKKFIEAIVNSINDPIIGLNTEREVLFINDEALSILNMKRENVIRKS AEELSLKNDLLRRLIRELVTPSDQKEALKIYADNKESYFKVSYVPIINTEAEKGEPRKLG DVILLKNITEFKELDSAKTTFISTISHELKTPIAAIMMSLQLLEDKRVGALNDEQEQLSK SIKENSERLLSITGELLNMTQVEAGKLQLMPKITKPIELIEYAIKANQVQADKFNIQIEV EYPEEKIGKLFVDSEKIAWVLTNLLSNAIRYSKENGHVVIGAKQDENWIELYVQDFGKGI DPRYHKSIFDRYFRVPGTKVQGSGLGLSISKDFVEAHGGTLTVESELGKGSRFVMRLKA >gi|222159190|gb|ACAB01000169.1| GENE 13 23331 - 24461 895 376 aa, chain - ## HITS:1 COG:AGl2094 KEGG:ns NR:ns ## COG: AGl2094 COG2205 # Protein_GI_number: 15891164 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 11 344 17 349 900 254 38.0 2e-67 MDDREQSVQHFLDLIKRSQRGKFKVYIGMIAGVGKSYRMLQEAHELLENGVDVKIGYIET HGRVGTEGMLQGLPVIPRRKIFYKGKELEEMDLDSIIRLHPEIVIVDELAHTNVEGSLNE KRWQDVMTLLDEGINVISAINIQHIESVNEEVQEITGIEVKERVPDSVLQEADEVVNIDL TAEELIARLKAGKIYRPEKIQTALDNFFRTENILQLRELALKEVALRVEKKVENEVMMGV AVGLRHEKFMACISSHEKTPRRIIRKAAKLATRYNTTFIALYVQTPRESMDRIDLASQRY LLNHFKLVAELGGEVVQVQSKDILGSIVKVCKEKQISTVCMGTPNLRLPYAICSILGYRK FLNNLSQANVDLIILA >gi|222159190|gb|ACAB01000169.1| GENE 14 24545 - 25309 697 254 aa, chain - ## HITS:1 COG:no KEGG:BT_2422 NR:ns ## KEGG: BT_2422 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 254 1 248 248 345 68.0 9e-94 MKSFFSKKSILGTMAVAAVCMLGTGNAQAQEFTIQGDLVSSYVWRGIYQGGAASFQPTLG VSVGNFSLTAWGSTSLSESNKEIDLTAAYKFGEAGPTLSVASLWWNGQADVANGELTNDY FHFKSGDTGHHFEAGLAYTLPVEKFPLSIAWYTMFAGADRKMTDKGEEKQAYSSYVELNY PFSVKGVDLNATCGVVPYETPQYNVNGFAVTNLALKATKAINFNDKFSLPIFVQAIWNPR LEDAHLVFGVTLRP >gi|222159190|gb|ACAB01000169.1| GENE 15 25431 - 26003 704 190 aa, chain - ## HITS:1 COG:AGl2092 KEGG:ns NR:ns ## COG: AGl2092 COG2156 # Protein_GI_number: 15891163 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, c chain # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 7 188 8 186 188 157 46.0 2e-38 
MKTLLKSLKITLAFCVFFSVFYILILWLFAQVAGPNKGNAEVATLDGKVVGAANVGQMFT KDIYFWGRPSSAGDGYDATSSAGSNKGPTNQEYLDEVKARIDTFLVHHPYLDRADVPAEM VTASASGLDPDITPQCAYVQVKRVAQARGLTEEQVRAIVDKSIEKPFLGLLGTEKVNVLK LNIALEESNK >gi|222159190|gb|ACAB01000169.1| GENE 16 26020 - 28053 2117 677 aa, chain - ## HITS:1 COG:DRB0083 KEGG:ns NR:ns ## COG: DRB0083 COG2216 # Protein_GI_number: 10957402 # Func_class: P Inorganic ion transport and metabolism # Function: High-affinity K+ transport system, ATPase chain B # Organism: Deinococcus radiodurans # 8 674 10 671 675 772 63.0 0 MKDNKSASLFPKEQVIESLKQSFVKLNPRMMIKNPIMFTVEVATVVMLLVTLYSIVNSSQ GSFAYNIAVFIILFVTLLFANFAEAIAEARGKAQADSLRKTREETPAKKVEGNKIVTVSS SLLKKGDVFVCEAGDVIPSDGEIIEGLASIDESAITGESAPVIREAGGDKSSVTGGTKVL SDHIKVLVTAQPGESFLDKMIALVEGASRQKTPNEIALTILLAGFTLVFVIVCVTLKPFA DYSHTVITIASLISLFVCLIPTTIGGLLSAIGIAGMDRALRANVITKSGKAVETAGDIDT LLLDKTGTITIGNRKATHFHTAPGVNLHDFVETCLLSSLSDETPEGKSIVELGRESGIRM RNLNTTGARMIKFTAETKCSGVDLADGTQIRKGAFDAIRKMVEGAGNEFPKEVEEVISSI SSNGGTPLVVCVNKKVTGVIELQDIIKPGIQERFERLRKMGVKTVMVTGDNPLTAKYIAE KAGVDDFIAEAKPEDKMEYIKKEQQAGKLVAMMGDGTNDAPALAQANVGVAMNSGTQAAK EAGNMVDLDNDPTKLIEIVEIGKQLLMTRGTLTTFSIANDVAKYFAIVPALFMIAIPELA ALNIMHLHSPESAILSAVIFNAIIIPILIPLALRGVQYKPIGASALLRRNLLIYGLGGVI VPFVGIKLIDLLVGLFF >gi|222159190|gb|ACAB01000169.1| GENE 17 28072 - 29778 1779 568 aa, chain - ## HITS:1 COG:pli0052 KEGG:ns NR:ns ## COG: pli0052 COG2060 # Protein_GI_number: 18450334 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, A chain # Organism: Listeria innocua # 9 568 1 571 573 452 45.0 1e-126 MNIEILGVVVQIALMVILAYPLGKYIAKVYRGEKTWSDFMAPIERVIYKVCGIDPNEEMN WKQFLKALLILNAFWFFWGMVLLVSQGWLPLNPDGNGPQTPDQAFNTCISFMVNCNLQHY SGESGLTYFTQLFVIMLFQFITAATGMAAMAGIMKSIAAKTTKTIGNFWQFLVISCTRIL LPLSLIVGFILILQGTPMGFDGKMKVTTMEGQEQMVSQGPTAAIVPIKQLGTNGGGYFGV NSSHPLENPTYLTNMVECWSILIIPMAIVLALGFYTRRKKLAYSIFGVMLFAFLVGVCIN VSQEMGGNPRIDELGIAQDNGAMEGKEVRLGAGATALWSIVTTVTSNGSVNGMHDSTMPL SGMMEMLNMQINTWFGGVGVGWMNYYTFIIITVFISGLMVGRTPEFLGKKVEAREMKIAT 
IVALLHPFVILVFTALSSYIYVYHPDFVESEGGWLNNLGFHGLSEQLYEYTSCAANNGSG FEGLGDNTYFWNYTCGIVLILSRFIPIIGQVAIAGLLAQKKFIPEGAGTLKTDTLTFGVM TFVVIFIIAALSFFPVHALSTIAEHLSL >gi|222159190|gb|ACAB01000169.1| GENE 18 29810 - 29890 76 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYTALFVLGIAIFGYLMYVLVKPEKF >gi|222159190|gb|ACAB01000169.1| GENE 19 29910 - 30005 58 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGITLSFIFCLLSGFACFWLFWKCVDWFENI Prediction of potential genes in microbial genomes Time: Wed May 18 04:52:28 2011 Seq name: gi|222159189|gb|ACAB01000170.1| Bacteroides sp. D1 cont1.170, whole genome shotgun sequence Length of sequence - 16422 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 5, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 2 - 51 11.3 1 1 Op 1 . - CDS 90 - 995 580 ## gi|237712748|ref|ZP_04543229.1| predicted protein 2 1 Op 2 . - CDS 1007 - 2674 1251 ## Slin_3992 hypothetical protein 3 1 Op 3 . - CDS 2689 - 5871 1931 ## Cpin_7043 TonB-dependent receptor plug - Prom 5949 - 6008 12.2 + Prom 6218 - 6277 3.9 4 2 Tu 1 . + CDS 6323 - 7513 700 ## BT_2267 integrase protein + Term 7522 - 7572 -0.4 + Prom 8282 - 8341 3.4 5 3 Op 1 . + CDS 8401 - 11277 1839 ## BT_2260 outer membrane protein Omp121 6 3 Op 2 . + CDS 11299 - 12771 945 ## BT_2259 hypothetical protein + Term 12794 - 12840 10.7 + Prom 12788 - 12847 6.5 7 4 Tu 1 . + CDS 12873 - 14132 1363 ## COG2262 GTPases 8 5 Tu 1 . + CDS 14252 - 16255 1660 ## BT_2257 hypothetical protein Predicted protein(s) >gi|222159189|gb|ACAB01000170.1| GENE 1 90 - 995 580 301 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237712748|ref|ZP_04543229.1| ## NR: gi|237712748|ref|ZP_04543229.1| predicted protein [Bacteroides sp. 
D1] # 1 301 2 302 302 596 100.0 1e-169 MKIKTNMRYIQLFMATALWALLTVSCDRDLPYPIDQVKKGVVIDIVRAPDSDGVLSAGVT DGDYKVSISIPKQQGDYSMLDYAQLLCVFTDVDGKTTSKVIVDNIKEFPKDISIDMANVY QQFGKEAPTLGEIVYFTTNAILKDGYIVYGWTEYSDFNNKLFTGWEVDGRAYSYNVRYPV ACQLDLEEFVGPVVLNDFYAESYRAEIVKTSDTELEIHGVAEGEEPDGILKLTIDTSTHT VKVAKQIVAHGSVAWVGGYNNVAYQATGTIDACKGTITLNGPLTVDEGSFGDYQMIISTN Y >gi|222159189|gb|ACAB01000170.1| GENE 2 1007 - 2674 1251 555 aa, chain - ## HITS:1 COG:no KEGG:Slin_3992 NR:ns ## KEGG: Slin_3992 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 9 555 11 532 532 214 30.0 6e-54 MKIIKLLYTLTTTLIVCTSCSKYLDINDNPNKPTSAELNKVLTGAEYDIAMSFATGNYIG SSLPSYVFHLSSREVDNYGITNATSTLGNTWLQGYAYGLKNTNAVIKAAEEGNNMIYAGI GKLMKAYAFTNLVDLWGDIPYTEFDIEGNYAPKLDRSQDIYNSLLGLIDEAIGNLQDPNA ANLLKPANDDLIYKGNIEKWVRMGNTLKLKLLVQSRKAKNEITDWNTKLSNLLSENKFMN DGEDFEFKHTAKDTPDERHQAYVDEYLGGQSTYYISPWIYETMSGKNLNVTNNPFVGITD PRIPYYWVNQIKAGDKAQNATDYRDGAFVSIFFASNASGASNDQRETSTFIGIYPCGGKY DNGNGGKCDAKVGNGVAPEKMLQAYSVPFLLAELYLTGNANGDAKEALKEGIKRSINHVN SVAKASDANVPAISNESIEAFIEKVLAKYDTATSNEEKLRIVMTQKWIANFFNPVEAYTD IRRTGYPTLLPQNITYAQSPYKTSQDPELGPVNIPLKGINAFPRAMWYPSSEVTRNPNVT NEGRNLSNPILFWDK >gi|222159189|gb|ACAB01000170.1| GENE 3 2689 - 5871 1931 1060 aa, chain - ## HITS:1 COG:no KEGG:Cpin_7043 NR:ns ## KEGG: Cpin_7043 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: C.pinensis # Pathway: not_defined # 19 1060 127 1189 1189 709 40.0 0 MKRKLMLLLACLFVGIGLVTAQTQKITGVVISEEDGQPVVGASVLVKGTTQGTITDVDGN FNLSNVPSSAKTLQISYIGMQTQEVVIKPNLKVVLKSDSQKLDEIVVTAVGIQRAERSLG YSVSKVDADEAIQKAEPDLIRSLDGKIPGVSINSPSGAAGSATRMTIRGNSSFLGNNQPL YIVDGVPYSNTEVASSNQATDAGGAYGSGISTLDPNDIESMNVLKGAAAAALYGSRAANG VVLITTKTGSKSKKKAGKGMEITLNASYTIEQIAGLPEYQNSFGTGNDFIPGGANGSWGA AFTDVTEVPLSIYAANAYGKAYPNLPKTIPYKAYKNNVKDLFDLGGIYDLSLNFNRYTET GNFTATLSKMDQDSYIPYADFSRYSISIGGNQTLANGVRIGGNVAFSRNEQNGPMFGNNQ SSGIGASSFARALILGRNWDMSLPYETPDHKSLFFVGDQADNPIWSWKYNTINTQMDRTI ANVNMGYDITKWLSVDYRVGINDYKMDRKEVLNLGSRALGGKGRILVSTYDTQEIESTLL 
LRFNIPVHKDLGLKATLGHNVNQFTANETLTKGMNIMSPGVYNINNTESQTSEEEYTRTR LWALFGDVTLDYKNYAFLNITGRNDFSSTLPRNHRSFFYPSIAGSFVFTDAFHINEDIIK FGKVRLSWAKVGNDAGAYYKNGTFTLGQPFNGQPILSLPSSMFDPDLKPEFTSEVEFGAE LQFLKSRVNVDFTWYNRNSTNQIAPLSLPYSTGYGSYYTNFGKMNNHGIEIGLNVIPILN KDFKWDMYFTYTQNRSEVKELAEGVERVSLNTGFATPKAVLEVGKPYGMLVGQVFARDEE GNYLVDPNSGAYLIADEEGDLGDPAPDCKLSLNNTFTYKGFSLSFMFDAQIGGCVWTSYI PDLLGRGVTKDTEDRYGSRILPGYLADPSTKKPLLDGNGNKIPNNVQMKELDLWFSPGTS VSTFATNGVDEATVYDATTFRLRELSIGYQLPKSWLAKTFIGSATVSFVARNLWYFAPNV PKYSNYDPTASSYGGGNVQGIDYTSAPNTRRYGFNLKLTF >gi|222159189|gb|ACAB01000170.1| GENE 4 6323 - 7513 700 396 aa, chain + ## HITS:1 COG:no KEGG:BT_2267 NR:ns ## KEGG: BT_2267 # Name: not_defined # Def: integrase protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 396 1 396 396 613 81.0 1e-174 MEKISYNLVFNRKKRLNKKGMALVQVEAYLNRKKMYFSTKIYLKPDQWDAKRKMVKNHPN ANVLNRMLYENIAAIEHTELGLWQQGKPISLDLLKNSIDKPLSNGSSFLTFFKEEVANSL LKESTRQNHLSTLELLQEFKKEILFTDLTFEFVSSFDNHLQSKGYHLNTIAKHMKHLKRY INVAINKEYMDIQKYAFRKYKIKSIEGSHTHLAPEELYRFENLQLTGRYTRLQKTKDAFL FCCYAGLRYSDFTSLTSANIVEFHQETWIIYKSVKTGIEVRLPLYLLFEGKGIEILQRYK DDLDSFFKLKDNSNINKELNLLAGLAKIDKRVSFHTARHTNATLLLYSGANITTVQKLLG HKSVKTTQVYANIMDITVVRDLEKTASSKNNNRYKS >gi|222159189|gb|ACAB01000170.1| GENE 5 8401 - 11277 1839 958 aa, chain + ## HITS:1 COG:no KEGG:BT_2260 NR:ns ## KEGG: BT_2260 # Name: not_defined # Def: outer membrane protein Omp121 # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 958 1 959 959 1577 83.0 0 MQTQEVVIKPNLKVVLKSDSQKLDEVVVTAMGIKRSEKSLGYSVTSVKGDEITKARESNV LNSLSGKIAGVQIGQSSGTAGGSSSIQIRGASSIGSVSSPLFIVDGLPIDNGAFNPDRTN GIVDVGNRAGDISSDDIESINVLKGAAATALYGARAKDGAIVITTKKGSRNSQVSVTVNS STRFENVLKLPDFQNDYAQGSYGKYNVKMLNGWGPKISSVQDQTFADFKGEQVTLQAYPD NVKDFYETGMSYINNVAIAGGTERSDFRISVSTTNQTGVVPGSDYNKYAFAVNGGMNFTK NFTGRVSAQYIRSDSEGRPAQGSNDNNLLIPLINGMPRTVNINDVKNNWIDEYGKQISLD PEGKSNNPYWIINKNKFTNQLDRLIGNIVLTYKPIEGLTISNNAGTDFYTETRCKVYSKG TIGNLEGQFQTWDLYKRILNNDLMISYEKTFAEDYGVKVMLGHNIYQEDWRNNNVLAQNL 
VVDELYAYTNAKTTSPVNYTSKKRLVGLYGDISFSYKNMVFLDVTGRNDWSSTLPVNNRS YFYPSISGSFIFTELMKNKDVLSYGKLRMSYANVGSDEQPYQLAFQYTPVNSYFLQYNLT NVFPHNGLVGFTGPRVLPNENLKPQNQASFEVGTDLRFFQNRIRLDLTYYKSVTKNQIVS IDVPLSTGYFANNINAGKVSNKGIEVALGVTPVQTKDFKWDLEATFAKNTQKVDELAEGL DEYSLTSGLSGLQIKAEVGGSFGLYGTGWKRDDSGNYVINEKTGLREIVNNVRLGDVYPD FTMGINNTFSYKGFTLSFLIDIRQGGSLYSETVGTLRTTGLAAETAAHREDASFIEPGVI LQSDGTYRANDVPVKSMQDYWGHVAKSSNNEGNIFDASYVKLRELNFSYSFPRKWFRSFF VKSLDLGFEARNLWIIKDHVPHIDPEANFFGPAQIGGGVEFNSIPTTRSFGFNIRLTL >gi|222159189|gb|ACAB01000170.1| GENE 6 11299 - 12771 945 490 aa, chain + ## HITS:1 COG:no KEGG:BT_2259 NR:ns ## KEGG: BT_2259 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 490 3 488 488 701 68.0 0 MKNKRSIYTIAIIFACAILSSCSDWLDVNHNPNSAEKVDPGYLFNYAVVNWAGSRTGGDA YIPLSQSIQCQADGGDDYGGWAEGYYVIDPYSLGNTWKHYYSVGGNNLQLAIKNAQEATP VNHNGIAQCKIILAQHIYETTMIWGDIPFTEAWVEGVKYPKFDSQEVVLNGVVSLLDEAL NEINLDDPLAITDYDIFYKGDMQKWIRLAKSLKFRTLMTMVDKDPTKAEQIGKLISDGGM ISSADDNLQFPYLQTAGNENPKYKILEKYTNGINIMFFANNNVLKPMQERNDSRISRYFE PGADGLYRGLDTRQAAEETDDENADLLSSVISKYLFRKEAPELIYSYQEQLFFEAEAYVR GLGVTSNLVKANELYKKAIQTACDYYEADPVKTTEFIDALDDLNTLEPDRALYEIHIQQW IDLMDRPLEEFVQWRRSGSNGNEVPVLTVPTEATSRELIRRWEYSPEEMTANPNAPKESP KIWEKMWFDL >gi|222159189|gb|ACAB01000170.1| GENE 7 12873 - 14132 1363 419 aa, chain + ## HITS:1 COG:PA4943 KEGG:ns NR:ns ## COG: PA4943 COG2262 # Protein_GI_number: 15600136 # Func_class: R General function prediction only # Function: GTPases # Organism: Pseudomonas aeruginosa # 12 343 10 327 433 256 46.0 5e-68 MKEFVISEAKVETAVLVGLITQTQDERKTNEYLDELAFLAETAGAEVVKRFTQKLPTAHS VTYVGKGKLEEIRQYIRNEEEEEREVGMVIFDDELSAKQIRNIEAELKVKILDRTSLILD IFAMRAQTANAKTQVELAQYKYMLPRLQRLWTHLERQGGGSGAGGGKGSVGLRGPGETQL EMDRRIILNRMSLLKERLAEIDKQKATQRKNRGRMIRVALVGYTNVGKSTMMNLLSKSEV FAENKLFATLDTTVRKVIIDNLPFLLSDTVGFIRKLPTDLVESFKSTLDEVREADLLVHV VDISHPGFEEQIEVVNKTLAEIGGSGKPMILVFNKIDAYTYVEKAPDDLTPRTKENLTLE ELMKTWMAKMEDNCLFISARERINIDELKNVVYQRVKELHVQKYPYNDFLYQTYEEEEE 
>gi|222159189|gb|ACAB01000170.1| GENE 8 14252 - 16255 1660 667 aa, chain + ## HITS:1 COG:no KEGG:BT_2257 NR:ns ## KEGG: BT_2257 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 663 1 663 663 1310 94.0 0 MKDYRRLTEDEVLQLKSQSCLADDWGNVSVAEGFNCEYVHHTRFSGEVKLGVFESEFTLP GGIKKHSGLRHVTLHNVSVGDNCCIENIQNYIANYEIGSDTFIENVDIILVDGLSTFGNG VEVAVLNETGGREVLMNDKLSAHQAYILALYRHRPELINRMKSIADYYSNKHASAVGSIG NHVMILNTGSIKNVRIGDYCHICGTCRLSNGSVNSNVTAPVHIGHGVICDDFIISSGSKV DDGTMLTRCFVGQSCKLGHNYSASDSLFFSNCQGENGEACAIFAGPFTVTHHKSTLLIAG MFSFMNAGSGSNQSNHMYKLGPIHQGTMERGAKTTSDSYILWPARVGAFSLVMGRHVNHA DTSNLPFSYLIEQRNTTYLVPGVNLRSVGTIRDAQKWPKRDNRKDPNRLDYINYNLLSPY TIQKMFKGRSILKELKRVSGETSEIYSYQSAKIKNSSLNNGIRFYEIAIHKFLGNSIIKR LEGINFQSNEEIRQRLKPDTEIGIGEWVDVSGLIAPKSEIDRLLDGIEKGAVNRLKSINA SFAEMHENYYTYEWTWAYHKIQEFYGLDPETITAQDIIGIVKAWQQAVVGLDKMVYEDAK KEFSLSSMTGFGADGSHDEMKQDFEQVRGDFESNTFVTAVLKHIEDKTALGNELIERIKM IDKTEIV Prediction of potential genes in microbial genomes Time: Wed May 18 04:53:27 2011 Seq name: gi|222159188|gb|ACAB01000171.1| Bacteroides sp. D1 cont1.171, whole genome shotgun sequence Length of sequence - 1879 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 41 - 100 5.4 1 1 Tu 1 . 
+ CDS 188 - 1825 492 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase Predicted protein(s) >gi|222159188|gb|ACAB01000171.1| GENE 1 188 - 1825 492 545 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 76 544 38 496 508 194 31 6e-50 MATPLFKYQPMFEKGKDTTEYYLLTKDYVSVSEFEGNPILKIEKEGLTAMANAAFRDVSF MLRRSHNEQVAKILSDPEASENDKYVALTFLRNAEVASKGVLPFCQDTGTAIIHGEKGQQ VWTGYSDEEALSLGVYKTYTEENLRYSQNAPLNMYDEVNTKCNLPAQIDIEATEGMEYEF LCVTKGGGSANKTYLYQETKAILNPGTLVPFLVEKMKTLGTAACPPYHIAFVIGGTSAEK NLLTVKLASTHFYDNLPTTGNEYGRAFRDVELEKEVLAEAHKIGLGAQFGGKYLAHDVRI IRLPRHGASCPVGLGVSCSADRNIKCKINKEGIWIEKLDSNPGELIPAELRQAGEGDVVK IDLNRPMPEILKELTKYPVATRLSLNGTIIVGRDIAHAKLKERLDRGEDLPQYIKDHPIY YAGPAKTPEGMACGSMGPTTAGRMDSYVELFQSHGGSMVMLAKGNRSQQVTDACQKYGGF YLGSIGGPAAILAQNNIKSIECVEYPELGMEAIWKIEVENFPAFILVDDKGNDFFKQLKP WNCKK Prediction of potential genes in microbial genomes Time: Wed May 18 04:53:32 2011 Seq name: gi|222159187|gb|ACAB01000172.1| Bacteroides sp. D1 cont1.172, whole genome shotgun sequence Length of sequence - 17112 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 9, operones - 6 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 64 - 363 198 ## RB2501_06175 hypothetical protein 2 1 Op 2 . + CDS 366 - 1022 413 ## Cpin_0029 hypothetical protein + Term 1196 - 1245 10.8 + Prom 1204 - 1263 8.2 3 2 Op 1 . + CDS 1283 - 2233 548 ## BF3457 hypothetical protein + Term 2239 - 2291 1.8 + Prom 2258 - 2317 3.6 4 2 Op 2 . + CDS 2395 - 3282 474 ## BF3459 hypothetical protein + Term 3300 - 3338 6.1 - Term 3485 - 3523 4.8 5 3 Tu 1 . 
- CDS 3550 - 5730 1923 ## COG0514 Superfamily II DNA helicase - Prom 5759 - 5818 2.3 - Term 5782 - 5825 0.1 6 4 Op 1 24/0.000 - CDS 5898 - 7142 247 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 7 4 Op 2 29/0.000 - CDS 7145 - 7807 833 ## COG0740 Protease subunit of ATP-dependent Clp proteases - Prom 7827 - 7886 5.6 - Term 7866 - 7924 6.4 8 4 Op 3 . - CDS 7951 - 9306 1764 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) - Prom 9437 - 9496 5.5 + Prom 9835 - 9894 6.7 9 5 Tu 1 . + CDS 9916 - 10161 312 ## COG0724 RNA-binding proteins (RRM domain) + Term 10193 - 10236 7.9 - Term 10185 - 10220 4.4 10 6 Op 1 23/0.000 - CDS 10229 - 10999 316 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 11 6 Op 2 . - CDS 10996 - 11739 608 ## COG0767 ABC-type transport system involved in resistance to organic solvents, permease component - Prom 11794 - 11853 3.7 + Prom 11687 - 11746 2.7 12 7 Tu 1 . + CDS 11815 - 12576 274 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Term 12623 - 12677 2.4 - Term 12611 - 12665 6.2 13 8 Op 1 . - CDS 12715 - 14028 1288 ## COG1160 Predicted GTPases 14 8 Op 2 . - CDS 14077 - 14958 991 ## COG1159 GTPase 15 8 Op 3 . - CDS 15045 - 16052 904 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III - Prom 16072 - 16131 3.7 - Term 16080 - 16118 3.7 16 9 Op 1 . - CDS 16141 - 16326 338 ## PROTEIN SUPPORTED gi|160882088|ref|ZP_02063091.1| hypothetical protein BACOVA_00026 17 9 Op 2 . 
- CDS 16339 - 16917 627 ## BT_3832 hypothetical protein - Prom 17028 - 17087 4.0 Predicted protein(s) >gi|222159187|gb|ACAB01000172.1| GENE 1 64 - 363 198 99 aa, chain + ## HITS:1 COG:no KEGG:RB2501_06175 NR:ns ## KEGG: RB2501_06175 # Name: not_defined # Def: hypothetical protein # Organism: R.biformata # Pathway: not_defined # 3 98 4 99 322 102 52.0 5e-21 MKWWIKFGCFLTGWNSSILSQCSEASFKHLKKYTAALLILIILWGFTGYCFAERYVEAPW WGCIISSIIFVVIVIQIERQIILTVGTHKWNTFFRFLLL >gi|222159187|gb|ACAB01000172.1| GENE 2 366 - 1022 413 218 aa, chain + ## HITS:1 COG:no KEGG:Cpin_0029 NR:ns ## KEGG: Cpin_0029 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 13 213 115 322 331 84 28.0 2e-15 MAFLGSSIIDQIIFGADINRKMVEITDRQVVEQLPLRLKVIDVKLSELQTNIDSLDKANI ILNDEIVREPVIKTVSTTTTYTKERQDDGSYKDIPQTTVSTTPMANPRVKQVEANNMTLE GLRKQQDEYTQKKMNVEKDLRTELSSNTGFLSELRAMIEILSTRIEALIFYIIIFAFLIS LELFVVTSKLGDKKCDYDMIIEHQLNVKTQALEELVKV >gi|222159187|gb|ACAB01000172.1| GENE 3 1283 - 2233 548 316 aa, chain + ## HITS:1 COG:no KEGG:BF3457 NR:ns ## KEGG: BF3457 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 71 315 57 282 285 81 30.0 5e-14 MEDIKPIDKFTLDKYVKEEDDLLEYVEMLGCKCHPDYDFIRVKKEIEDKAETNEAAAEEE EETKYVFMGTTVTQEFKKRYNAKVLNAHYTFEKYIQYKDIQNTLEALNIDREKFWYLLLF VSDYIYGSCLEGIKVKETSRVLVEKLMQQLGKNIGNSGCILSFIKPMTLTLKLQEKHRSI EIDDPISLAYIYLVYEAGKDYFSNDKPTRFDTQEIDRKGKDTEPNTVLVAMFYHLLKSFF KLLPKTNTSKPAKVYSTVSLNKTLLISRLVYLTRLSTDKRYTGVDEKKDKQCPNFIKDQI KSYKDYKILRANKFYK >gi|222159187|gb|ACAB01000172.1| GENE 4 2395 - 3282 474 295 aa, chain + ## HITS:1 COG:no KEGG:BF3459 NR:ns ## KEGG: BF3459 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 283 4 285 300 130 31.0 5e-29 MIDFYFYESNRDIVTENVVKCYSMIRKYGFQPSMGKIKMVEGDDLKGEKLYKVSVIPKRN DKPLSITNFMVETEEVTDEQERKKAKSPVDGQHRLIAMGILESEGKFTFDESSMVEIVKL PEEMSLPLFTASINNGKPWNYKDFNGSGLSTGNAQIDYIERIIGDYDLKPEFVYGLYTMG 
QPELKANVVKDLKVGIDRLPKNLRLSEDTQSMGDKLLQAFKASKMSEKTFNNGRLAKGLK QFFKDKNPTIEQMVVIIEAMNKDVWHANYPKPKGSPEAKNYAENFAEYYEDILSK >gi|222159187|gb|ACAB01000172.1| GENE 5 3550 - 5730 1923 726 aa, chain - ## HITS:1 COG:alr0205 KEGG:ns NR:ns ## COG: alr0205 COG0514 # Protein_GI_number: 17227701 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Nostoc sp. PCC 7120 # 1 720 1 712 718 492 39.0 1e-138 MARKINLTDELKKCFGFNKFKGNQEAIIYNLLDGKDTFVLMPTGGGKSLCYQLPSLLMEG TAIVISPLIALMKNQVDAMRNFSEEDGVAHFINSSLNKGAIDQVRSDILAGKTKLLYVAP ESLTKEENVEFLRSVKISFYAVDEAHCISEWGHDFRPEYRRIRPIINEIGKAPLIALTAT ATPKVQHDIQKNLGMVDAQVFKSSFNRPNLYYEVRAKTANIDRDIIKFIKNNPEKSGIIY CLSRKKVEELAEILQANGINARPYHAGMDSLARTKNQDDFLMEKVDVIVATIAFGMGIDK PDVRFVIHYDIPKSLEGYYQETGRAGRDGGEGQCITFYTNKDLQKLEKFMQGKPVAEQEI GKQLLLETAAYAESSVCRRKTLLHYFGEEYTEENCGNCDNCLNPKKQVEAQELLCAVIEA IIAVKENFKADYIIDILQGKETSEVQAHLHEDLEVFGSGMGEEDKTWNAVIRQALIAGYL SKDVEHYGLLKVTEEGHKFLKKPKSFKITEDNDFEETEEEVPARGGGSCAVDPALYSMLK DLRKKLSKKLEVPPYVIFQDPSLEAMATIYPVTLDELQNIPGVGAGKAKRYGEEFCKLIK RHCEENEIERPEDLRVRTVANKSKMKVAIIQAIDRKVALDDIALSKGIEFGELLDEVEAI VYSGTKLNIDYFLEEIMDEDHMLDIYDYFKESTTDKIDDALDELGDDFTEEEVRLVRIKF ISEMAN >gi|222159187|gb|ACAB01000172.1| GENE 6 5898 - 7142 247 414 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 154 404 248 453 466 99 32 1e-20 MADSKTKKRCSFCGRSENEVGFLITGMNGYICDSCATQAYEITQEALGEGKKSAGATKLN LKELPKPVEIKKFLDQYVIGQDDAKRFLSVSVYNHYKRLLQKDSGDDVEIEKSNIIMVGS TGTGKTLLARTIAKLLHVPFTIVDATVLTEAGYVGEDIESILTRLLQVADYNVPEAEQGI VFIDEIDKIARKGDNPSITRDVSGEGVQQGLLKLLEGSVVNVPPQGGRKHPDQKMIPVNT KNILFICGGAFDGIEKKIAQRLNTHVVGYTASQKTATVDKNNMMQYIAPQDLKSFGLIPE IIGRLPVLTYLNPLDRDALRAILTEPKNSIIKQYIKLFEMDGIKLTFEDAVFEYIVDKAV EYKLGARGLRSIVETIMMDVMFEIPSEDKKEYKVTLDYAKMQLEKANMARLQTA >gi|222159187|gb|ACAB01000172.1| GENE 7 7145 - 7807 833 220 aa, chain - ## HITS:1 COG:sll0534 KEGG:ns NR:ns ## COG: sll0534 COG0740 # Protein_GI_number: 16332068 # Func_class: O 
Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Synechocystis # 32 217 25 210 226 227 58.0 2e-59 MDDFRKYATKHLGMNGMVLDDVIKSQAGYLNPYILEERQLNVTQLDVFSRLMMDRIIFLG TQVDDYTANTLQAQLLYLDSVDPGKDISIYINSPGGSVYAGLGIYDTMQFISSDVATICT GMAASMAAVLLVAGAEGKRSALPHSRVMIHQPMGGAQGQASDIEITAREIQKLKKELYTI IADHSHTDFDKVWADSDRDYWMTAQEAKEYGMIDEVLIKK >gi|222159187|gb|ACAB01000172.1| GENE 8 7951 - 9306 1764 451 aa, chain - ## HITS:1 COG:PA1800 KEGG:ns NR:ns ## COG: PA1800 COG0544 # Protein_GI_number: 15596997 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Pseudomonas aeruginosa # 1 445 1 425 436 67 21.0 4e-11 MNVSLQNIDKVSAELTVKLEKADYQEKVDKELKSLRQKAQIPGFRKGMVPTSLIKKMYGK SVIAEVVNKALQEAVYNYIKDNKVNMLGEPLPNEEKQQNIDFDTMEEFDFVFDIALAPEF KAEVSAKDKVDYYSIEVSEEMIDNQVKMYTQRTGKYDKVDAYEDNDMLKGLLAQLDEEGN TKEGGIQVEAAVLMPAYMKNDDQKAIFANAKVNDVLVFNPNVAYDGHAAELGSLLKIDKE IAKDVKSNFSFQVEEITRFVPGELTQEVFDQAFGEGVVKTEEEFRAKIKEEIAARFVADS DYKFLIDIRKVMMEKVGKLEFSDALLKRIMLLNNEEKGEEYVAENYDKSIEELTWHLIKE QLVEANDIKVEQEDVLKMARETTKAQFAQYGMLSIPDDVLDNYAQEMLKKKETINNLVSR VVEVKLAAALKAQVTLENKNVSIEEFNKMFE >gi|222159187|gb|ACAB01000172.1| GENE 9 9916 - 10161 312 81 aa, chain + ## HITS:1 COG:asl4022 KEGG:ns NR:ns ## COG: asl4022 COG0724 # Protein_GI_number: 17231514 # Func_class: R General function prediction only # Function: RNA-binding proteins (RRM domain) # Organism: Nostoc sp. 
PCC 7120 # 1 80 1 80 94 86 56.0 1e-17 MNIYVGNLNYRVKEGDLQQVMEDYGAVSSVKVVMDRETGKSKGFAFIEMEDDAAAAKAIA ELNGAEYMGRTMVVKEARPRA >gi|222159187|gb|ACAB01000172.1| GENE 10 10229 - 10999 316 256 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 237 1 239 245 126 31 1e-28 MIEIKGLYKSFDDKTVLSDINASFENGKTNLIIGQSGSGKTVLMKCIVGLLTPEKGEVLY DGRNLVLMGKKEKKMLRKEMGMIFQSAALFDSMTVLDNVMFPLNMFSNDTLRERTKRAMF CLDRVNLGEAKDKFPGEISGGMQKRVAIARAIALNPQYLFCDEPNSGLDPKTSLVIDDLI HDITQEYNMTTIINTHDMNSVLGIGEKVIYIYEGHKEWEGTKDDIFTSTNERLNNFIFAS DLLRKVKDVEVQGMEG >gi|222159187|gb|ACAB01000172.1| GENE 11 10996 - 11739 608 247 aa, chain - ## HITS:1 COG:aq_355 KEGG:ns NR:ns ## COG: aq_355 COG0767 # Protein_GI_number: 15605864 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: ABC-type transport system involved in resistance to organic solvents, permease component # Organism: Aquifex aeolicus # 1 245 1 244 245 116 34.0 4e-26 MIKALRTVGRYFMLMGRTFSRPERMRMFFRQYLNELEQLGVNSIGIVLLISFFIGAVITI QIKLNIESPWMPRWTVGYVTREIMLLEFSSSIMCLILAGKVGSNIASELGTMRVTQQIDA LEIMGINSANYLILPKITAMVTVIPILVTFSIFAGIIGAFCTCWFAGVMNAVDLEYGLQY MFVEWFIWAGIIKSLFFAFIIASVSAFFGYTVDGGSIAVGKASTDAVVSSSVLILFADLI LTKLLMG >gi|222159187|gb|ACAB01000172.1| GENE 12 11815 - 12576 274 253 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 7 230 1 229 245 110 25 9e-24 MEESKMVLRTEDLVKKYGKRTVVSHVSINVKQGEIVGLLGPNGAGKTTSFYMTVGLITPN EGRIFLDDLEITKYPVYKRAQTGIGYLAQEASVFRQMSVEDNIAAVLEMTNKPKEYQKEK LESLIAEFRLQKVRKNKGNQLSGGERRRTEIARCLAIDPKFIMLDEPFAGVDPIAVEDIQ QIVWKLKDKNIGILITDHNVQETLSITDRAYLLFEGKILFQGTPEELSENQIVREKYLSN SFVLRRKDFQLEK >gi|222159187|gb|ACAB01000172.1| GENE 13 12715 - 14028 1288 437 aa, chain - ## HITS:1 COG:SPy0341 KEGG:ns NR:ns ## COG: SPy0341 COG1160 # Protein_GI_number: 15674498 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Streptococcus 
pyogenes M1 GAS # 5 437 6 435 436 371 45.0 1e-102 MGNLVAIVGRPNVGKSTLFNRLTKTRQAIVNEEAGTTRDRQYGKSEWLGREFSVVDTGGW VVNSDDVFEEEIRKQVLLAVDEADVILFVVDVMNGVTDLDMQVAAILRRANSPVIMVANK TDNHDLQYNAPEFYKLGLGDPYCVSAMTGSGTGDLMDLIVSNFKKESSEILDDDIPRFAV VGRPNAGKSSIVNAFIGEERNIVTEIAGTTRDSIYTRYNKFGFDFYLVDTAGIRKKNKVN EDLEYYSVIRSIRAIEGSDVCILMLDATRGVESQDLNIFSLIQKNQKGLVVVINKWDLVE DKSVKVQKTFEEAVRSRFAPFVDFPIIFASALTKQRILKVLEEARNVYENRTTKIPTARL NEEMLPLIEAYPPPSNKGKYIKIKYITQLPNTQVPSFVYFANLPQYVKDPYKRFLENKMR EKWNLTGTPVNIYIRQK >gi|222159187|gb|ACAB01000172.1| GENE 14 14077 - 14958 991 293 aa, chain - ## HITS:1 COG:lin1499 KEGG:ns NR:ns ## COG: lin1499 COG1159 # Protein_GI_number: 16800567 # Func_class: R General function prediction only # Function: GTPase # Organism: Listeria innocua # 3 290 6 296 301 256 46.0 4e-68 MHKAGFVNIVGNPNVGKSTLMNVLVGERISIATFKAQTTRHRIMGIYNTDEMQIVFSDTP GVLKPNYKLQESMLNFSTSALTDADILLYVTDVVETPDKNNEFMEKVRQMTVPVLLLINK IDLTDQEKLVKLVEDWKELLPQAEIIPISATSKFNVDYVMKRIKELLPDSPPYFGKDQWT DKPARFFVNEIIREKILLYYDKEIPYSVEVVVEEFKEEPKKIHIRAVINVERDSQKGIII GKQGKALKKVATEARRELERFFGKTIFLETYVKVDKDWRSSDKELRNFGYQLD >gi|222159187|gb|ACAB01000172.1| GENE 15 15045 - 16052 904 335 aa, chain - ## HITS:1 COG:lin2305 KEGG:ns NR:ns ## COG: lin2305 COG0332 # Protein_GI_number: 16801369 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Listeria innocua # 4 328 1 311 312 281 42.0 9e-76 MEKINAVITGVGGYVPDYILTNDEISKMVDTNDEWIMTRIGVKERHILNEEGLGSSYMAR KAAKQLMKKTGTNPDDIDLVVVATTTPDYHFPSTASILCDKLGLKNAFAFDLQAACSGFL YLMETAANFIRSGRYKKIIIVGADKMSSMVNYTDRATCPIFGDGAAAFMVEPTTEDYGIM DSILRTDGKGLPFLHMKAGGSVCPPSYFTVDNKMHYLHQEGRTVFKYAVSNMSDVSAAIA EKNGLTKDNINWIVPHQANVRIIEAVAHRMEVPMDKVLVNIEHYGNTSAATLPLCIWDYE DKLKKGDNIIFTAFGAGFTWGAVYVKWGYDGKKES >gi|222159187|gb|ACAB01000172.1| GENE 16 16141 - 16326 338 61 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160882088|ref|ZP_02063091.1| hypothetical protein BACOVA_00026 [Bacteroides ovatus ATCC 8483] # 1 61 1 61 61 134 100 3e-31 
MAHPKRRQSSTRQAKRRTHDKAVAPTLAICPNCGEWHVYHTVCGACGYYRGKLAIEKEAA V >gi|222159187|gb|ACAB01000172.1| GENE 17 16339 - 16917 627 192 aa, chain - ## HITS:1 COG:no KEGG:BT_3832 NR:ns ## KEGG: BT_3832 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 192 1 179 179 336 93.0 3e-91 MGKFDKYKIDLKGMQADSAKYEFVLDNLYFAHIDGPEVQKGKVNVTLTVKRTSRAFELSF QTDGMVWVPCDRCLDDMELPISSSDKLMVKFGHEYAEEGDNLIVIPEEEGEINVAWFMYE FVALSIPMKHVHAPGKCNKAVTSKLNKHLKTNANEDSDDTFDTGGDDIVIEEEVEEQIDP RWNELKKILDNN Prediction of potential genes in microbial genomes Time: Wed May 18 04:53:55 2011 Seq name: gi|222159186|gb|ACAB01000173.1| Bacteroides sp. D1 cont1.173, whole genome shotgun sequence Length of sequence - 1573 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + SSU_RRNA 35 - 1525 99.0 # EF403590 [D:1..1492] # 16S ribosomal RNA # uncultured bacterium # Bacteria; environmental samples. Prediction of potential genes in microbial genomes Time: Wed May 18 04:53:55 2011 Seq name: gi|222159185|gb|ACAB01000174.1| Bacteroides sp. D1 cont1.174, whole genome shotgun sequence Length of sequence - 1215 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + LSU_RRNA 20 - 1215 96.0 # FJ410383 [D:701..3455] # 23S ribosomal RNA # Bacteroides ovatus # Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Bacteroidaceae; Bacteroides. Prediction of potential genes in microbial genomes Time: Wed May 18 04:53:56 2011 Seq name: gi|222159184|gb|ACAB01000175.1| Bacteroides sp. 
D1 cont1.175, whole genome shotgun sequence Length of sequence - 1545 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + LSU_RRNA 1 - 1179 99.0 # FJ410383 [D:701..3455] # 23S ribosomal RNA # Bacteroides ovatus # Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Bacteroidaceae; Bacteroides. + 5S_RRNA 1433 - 1532 98.0 # CP000140 [D:147281..147431] # 5S ribosomal RNA # Parabacteroides distasonis ATCC 8503 # Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Porphyromonadaceae; Parabacteroides. Prediction of potential genes in microbial genomes Time: Wed May 18 04:53:57 2011 Seq name: gi|222159183|gb|ACAB01000176.1| Bacteroides sp. D1 cont1.176, whole genome shotgun sequence Length of sequence - 1627 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 296 - 355 3.3 1 1 Tu 1 . + CDS 428 - 598 71 ## + Prom 711 - 770 6.5 2 2 Tu 1 . 
+ CDS 794 - 1625 349 ## BT_3738 two-component system sensor histidine kinase/response regulator, hybrid ('one-component system') Predicted protein(s) >gi|222159183|gb|ACAB01000176.1| GENE 1 428 - 598 71 56 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRVRVLTNNSARFIGGCQTSTGTCTTRGYKFGPEVWMVFRTPDQRTLADSPEPGKP >gi|222159183|gb|ACAB01000176.1| GENE 2 794 - 1625 349 277 aa, chain + ## HITS:1 COG:no KEGG:BT_3738 NR:ns ## KEGG: BT_3738 # Name: not_defined # Def: two-component system sensor histidine kinase/response regulator, hybrid ('one-component system') # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 277 8 285 1330 283 50.0 3e-75 MTRVRNMGRILSAFLMLGISSMCGAQIYKYLGIEDGLSNRRIYRIQKDGRGYMWFLTQEG MDRYDGKRIRHYTVLDGNLKVAPQVNLNWLYTDTENTLWVVGRKGRIFHYDTLHDRFRMV YRIPGLQDDFATGMLCYAYMDRGDRIWLCQGDHIIRYDTRTGIAQRLVSRLRGDITAISE TDGTNLFIGTVNGLFPVRERDGVLEALADTDSIRTPVSELYYHPGSKKLFVGTFRKGILV YGVSAGSTLRNVAVNRITPLNDRELLIATGGRGVYRM Prediction of potential genes in microbial genomes Time: Wed May 18 04:54:06 2011 Seq name: gi|222159182|gb|ACAB01000177.1| Bacteroides sp. D1 cont1.177, whole genome shotgun sequence Length of sequence - 3079 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 3079 1286 ## COG0642 Signal transduction histidine kinase Predicted protein(s) >gi|222159182|gb|ACAB01000177.1| GENE 1 1 - 3079 1286 1026 aa, chain + ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 507 728 5 223 294 124 36.0 1e-27 ADYASHNGMNGDNINDIYVDGGDRIWLANYPAGVTIRNNRYQSYEWFRHSPGNSRSLVND QVHDVIEDSEGDLWFATSNGISLLQPAVGRWRSFLSRSDGIQDGGNHIFLTLCEVSPGVI CAGGYASGLYRIEKKTGRVEYFPPSFAAEGRPDQYINDIGKDSGGCIWTGGCHNLKRFDP HDGTVRLYPVPGPITAILEKAPEWMWIGTGMGLYLLDGHGGTCRHIAFPVEAVHVYALYQ APDGLLYIGTGGAGLLVYDSVEDRFVRQYQTENCALISNNIHTIVPRADGTLLLGTENSV ALFRRESGTFRNWTAEQGLASVCLNAGVSTFRHGESFVFGSNTGAVMFPSDMRIPAPHFS RMLLRDFMISYRPVYPGDKGSPLREDIDNTVRLELAYDQNTFSLEAVSINYDYPSNILYS WKLEGLYEGWSHPVQSGRIQFTSLPPGNYTLRIRAVSNEEKYKVYEERSLGLSVARPLWA GTWAIAGYASLCVLAGVVSFRVAMLRRQKRISDEKTCFFIHTAHDVRTPLTLIKAPLEEV VEKDMVKAEGMDNVRMALKSVDGLLGLVTSLIAFESTDNYTLRLHVSEYELNSYLETTCE AFRTYASIRDIDITRESGFPYLNVRFDKDKMDSILKNILSNALKYTPRGGSIQVRAFADR HVWGVEVEDTGIGIPPEERKKLFRNHFRGSNAVNLQVAGNGVGLMMVHRLVRLHGGRVRV TSTEGKGTMVCVIFPLRSRRLDKACPVASPRKQDTGETRMGPDCGPMREILPTMTGGDRQ RILIVEDNDDLRTYLEGLLKEEYLVQTCSNGRDALLVAREYNPDLILSDVMMPEMGGDEL CASVKSDIETSHIPVMLLTALGDEKDMLEGLENGADAYITKPFSINVLRANIRNILANRA LLRRAYAGLEDGVGQVPPDCHNTRDWKFMASVRECVMKNIDNPGFCVDMLCGMQNMSRTG FFNKLKALTGHAPADYIRSMRLQYAAQLLREKDCSITEISDDSGFSDVRYFREVFRKYYG MSPSEY Prediction of potential genes in microbial genomes Time: Wed May 18 04:54:08 2011 Seq name: gi|222159181|gb|ACAB01000178.1| Bacteroides sp. D1 cont1.178, whole genome shotgun sequence Length of sequence - 4148 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 4, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 417 172 ## gi|255692735|ref|ZP_05416410.1| conserved hypothetical protein 2 1 Op 2 . + CDS 437 - 667 299 ## gi|237707884|ref|ZP_04538365.1| predicted protein + Term 668 - 715 10.1 - Term 659 - 700 5.5 3 2 Tu 1 . - CDS 704 - 1729 566 ## CFPG_P2-1 replication protein A - Prom 1760 - 1819 4.0 + Prom 2063 - 2122 3.2 4 3 Op 1 . + CDS 2142 - 2393 185 ## gi|189462399|ref|ZP_03011184.1| hypothetical protein BACCOP_03085 5 3 Op 2 . 
+ CDS 2393 - 2665 191 ## Clim_0726 addiction module toxin, Txe/YoeB family 6 3 Op 3 . + CDS 2676 - 2873 273 ## gi|189464126|ref|ZP_03012911.1| hypothetical protein BACINT_00461 + Term 2991 - 3035 6.1 + Prom 2975 - 3034 2.9 7 4 Op 1 . + CDS 3066 - 3365 70 ## gi|189464125|ref|ZP_03012910.1| hypothetical protein BACINT_00460 8 4 Op 2 . + CDS 3362 - 4148 390 ## BT_2995 hypothetical protein Predicted protein(s) >gi|222159181|gb|ACAB01000178.1| GENE 1 1 - 417 172 138 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|255692735|ref|ZP_05416410.1| ## NR: gi|255692735|ref|ZP_05416410.1| conserved hypothetical protein [Bacteroides finegoldii DSM 17565] # 2 138 14 150 150 240 91.0 3e-62 NVKQGQYSPRKRKDTTPRVNPQLAVELVYQELKRVEVYTKRIEDATARKVQIDGKSLESA ENRLKNVLADFERQGYRMKNGGYVDKRISFYSILCAVISLLFACFMCYLWTDAAKDRDNY KQYYEYYQEQAREQKGNK >gi|222159181|gb|ACAB01000178.1| GENE 2 437 - 667 299 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237707884|ref|ZP_04538365.1| ## NR: gi|237707884|ref|ZP_04538365.1| predicted protein [Bacteroides sp. 
9_1_42FAA] # 1 76 35 110 110 145 100.0 7e-34 MGTTEHEEPRFFFILNKGAKSGGEITHAVLNGSIVSKPAGWDAFHGLALAREKLSSEEIQ QQMKELGVEMEIVPLI >gi|222159181|gb|ACAB01000178.1| GENE 3 704 - 1729 566 341 aa, chain - ## HITS:1 COG:no KEGG:CFPG_P2-1 NR:ns ## KEGG: CFPG_P2-1 # Name: not_defined # Def: replication protein A # Organism: A.pseudotrichonymphae # Pathway: not_defined # 10 339 24 349 532 150 34.0 6e-35 MKKKLPITKNKDVVVSWVYTWSKQQDMSIHEQRIVLRILEACQAELKGVKLKDYAGTKRK FEHGLWDVDAQMHVSDVIFSGRDYNEIIAALDSLAGRFFTYEDDEEWWKCGFISNPKYKK RTGIITFRVSNDLWDVFTKFAKGYREFELNKALALPTGYSLRFYMLMSGQVYPLDISLEN LKDRLGIPADKYKDKNGKDRIDHFEERVLKPAKAALDESCPYTFNYVKVRENPNNKRSKV TGFRFYPVYQPQFRDEELEGKELQAKVTARYQIDSHVYEYLRYSCGFTSEEINRNKETFI TAQEKITDLIGELALLNGKSREKNNPKGWIINALKGKIKDK >gi|222159181|gb|ACAB01000178.1| GENE 4 2142 - 2393 185 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|189462399|ref|ZP_03011184.1| ## NR: gi|189462399|ref|ZP_03011184.1| hypothetical protein BACCOP_03085 [Bacteroides coprocola DSM 17136] # 1 83 1 83 83 142 100.0 5e-33 MEALSVREYRNNLAASFTKADNGEQVLIRRKNEIYALVKVGREDLMITPELQARIDKARE EIKSGKCVTLKSSEDIDAYFDSL >gi|222159181|gb|ACAB01000178.1| GENE 5 2393 - 2665 191 90 aa, chain + ## HITS:1 COG:no KEGG:Clim_0726 NR:ns ## KEGG: Clim_0726 # Name: not_defined # Def: addiction module toxin, Txe/YoeB family # Organism: C.limicola # Pathway: not_defined # 2 90 4 93 93 71 44.0 7e-12 MYTIRVSDGVDKVIAKWKKSNPNLFKKYKKIYKELLEHPKTGLGHPEALRGGGDITWSRH ITAHDRIIYDIYEEVVEVYILEVEGHYNDK >gi|222159181|gb|ACAB01000178.1| GENE 6 2676 - 2873 273 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|189464126|ref|ZP_03012911.1| ## NR: gi|189464126|ref|ZP_03012911.1| hypothetical protein BACINT_00461 [Bacteroides intestinalis DSM 17393] # 1 65 1 65 65 113 100.0 5e-24 MVEYCVYWLENGEPMHEVFSSLAAAEMYSCAIRGKENIEWVEVSEEETIDLDELEDMFPD DFCGV >gi|222159181|gb|ACAB01000178.1| GENE 7 3066 - 3365 70 99 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|189464125|ref|ZP_03012910.1| ## NR: gi|189464125|ref|ZP_03012910.1| hypothetical protein BACINT_00460 
[Bacteroides intestinalis DSM 17393] # 1 99 1 99 99 148 100.0 1e-34 MELRRNEKITFRCTELEKDALAEQAARCSLSVSEYCRSLSLGGRPRERYTEEERQLLRDI AQLKGTLQRLNNYFGGRQYREVFEENRALITELKKILSR >gi|222159181|gb|ACAB01000178.1| GENE 8 3362 - 4148 390 262 aa, chain + ## HITS:1 COG:no KEGG:BT_2995 NR:ns ## KEGG: BT_2995 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 242 1 261 403 105 29.0 2e-21 MIGKGKSISHGVAALEYDLAKEINGQAVATEIARHELYGCTGAEMVQEMKPYHIDFPNVK NNCLRFEVSPSIEESATFTDADWAELGNDFMQRMGLANHQYIIIRHSGTESKKEQAHLHI LANRVSLSGELYRDNWIGKKATEAANAIAKERNFVQSQDIGKVNKAEIKEAMDGVLKKMQ GFDFTKFKEELGKRGFKVREARASTGKLNGYYVTARSGTEYKASEIGKGYTLAHIERTQS KLKCNSMNISHGNKLTPGSGSF Prediction of potential genes in microbial genomes Time: Wed May 18 04:54:44 2011 Seq name: gi|222159180|gb|ACAB01000179.1| Bacteroides sp. D1 cont1.179, whole genome shotgun sequence Length of sequence - 1140 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 630 - 1139 369 ## COG0477 Permeases of the major facilitator superfamily Predicted protein(s) >gi|222159180|gb|ACAB01000179.1| GENE 1 630 - 1139 369 169 aa, chain - ## HITS:1 COG:AGl1300 KEGG:ns NR:ns ## COG: AGl1300 COG0477 # Protein_GI_number: 15890776 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 144 226 369 394 135 47.0 4e-32 VPAALWVIFGEDRFRWSATMIGLSLAVFGILHALAQAFVTGPATKRFGEKQAIIAGMAAD ALGYVLLAFATRGWMAFPIMILLASGGIGMPALQAMLSRQVDDDHQGQLQGSLAALTSLT SIIGPLIVTAIYAASASTWNGLAWIVGAALYLVCLPALRRGAWSRATST Prediction of potential genes in microbial genomes Time: Wed May 18 04:54:45 2011 Seq name: gi|222159179|gb|ACAB01000180.1| Bacteroides sp. 
D1 cont1.180, whole genome shotgun sequence Length of sequence - 833 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 654 253 ## COG0477 Permeases of the major facilitator superfamily - Prom 691 - 750 3.8 Predicted protein(s) >gi|222159179|gb|ACAB01000180.1| GENE 1 3 - 654 253 217 aa, chain - ## HITS:1 COG:AGl1300 KEGG:ns NR:ns ## COG: AGl1300 COG0477 # Protein_GI_number: 15890776 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 205 2 203 394 174 50.0 8e-44 MKSNNALIVILGTVTLDAVGIGLVMPVLPGLLRDIVHSDSIASHYGVLLALYALMQFLCA PVLGALSDRFGRRPVLLASLLGATIDYAIMATTPVLWILYAGRIVAGITGATGAVAGAYI ADITDGEDRARHFGLMSACFGVGMVAGPVAGGLLGAISLHAPFLAAAVLNGLNLLLGCFL MQESHKGERRPMPLRAFNPVSSFRWARGMTIVAALMT Prediction of potential genes in microbial genomes Time: Wed May 18 04:54:46 2011 Seq name: gi|222159178|gb|ACAB01000181.1| Bacteroides sp. D1 cont1.181, whole genome shotgun sequence Length of sequence - 2493 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 110 - 1921 1100 ## COG3344 Retron-type reverse transcriptase Predicted protein(s) >gi|222159178|gb|ACAB01000181.1| GENE 1 110 - 1921 1100 603 aa, chain - ## HITS:1 COG:Q0050 KEGG:ns NR:ns ## COG: Q0050 COG3344 # Protein_GI_number: 6226520 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Saccharomyces cerevisiae # 28 596 255 821 834 275 33.0 3e-73 MRSSEKVLNSLAKHSQQPDYKFERLYRLLFNEQIYYVAYQRIYAKEGNMTKGADNATIDH MSISRIEKLITSLKNETYQPIPSRRVYIPKKNGKKRPLGIPTFNDKLLQEVVRMILEAIY EGSFDANSHGFRPKRSCHTALRQIQQTFNGASWFIEGDIKGFFDNINHDVMISILRKRIA DERFIRLIRKFLNAGYIEDWIFNKAYSGTPQGGIISPILANIYLDQFDRYMREYISQFDK GKERKDNPERIKFEYGKRLAVLKLKKVTSMKERKLIIKEIKRFDRERTMISCGVEMDYDF RRLKYVRYADDFLCAVIGTKDEAKVIKQDIKRFLEEKLSLELSEDKTLITHGKKSAKFLG YEIYVRKSAQTKRNKAGKLTRPYNNKIYLKMPTEVVRKKLLDYDALQIKVHNGKERYKPK HRTYLINNDDLEILERYNAEIRGLYNFYSLANNCHTLHTFKYIMEYSMYKTFAAKYKSTV VKICKKYKKDKVFTVSYKNKNGKTLTRQFYHDGFKRKRQDYAQCCDKEPVRYFYTATSLI DRLKAKRCELCGKENVKLDMHHVRKLKNLQEKEDWERHMIARKRKTIALCRSCHKKVDGG WKD Prediction of potential genes in microbial genomes Time: Wed May 18 04:54:47 2011 Seq name: gi|222159177|gb|ACAB01000182.1| Bacteroides sp. D1 cont1.182, whole genome shotgun sequence Length of sequence - 2491 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 574 - 2385 804 ## COG3344 Retron-type reverse transcriptase Predicted protein(s) >gi|222159177|gb|ACAB01000182.1| GENE 1 574 - 2385 804 603 aa, chain + ## HITS:1 COG:Q0055 KEGG:ns NR:ns ## COG: Q0055 COG3344 # Protein_GI_number: 6226521 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Saccharomyces cerevisiae # 8 601 267 848 854 301 32.0 3e-81 MRSPERVLNSLSEHSKDASYKFERLYRILFNEEMFYVAYQRIYAKEGNMTKGSDGQTIDN MSLKRIEKLIDTLKDETYQPQPSKRVYIPKKNGKKRPLGVPTFNDKLIQEVVRMVLEAIY EGNFEYTSHGFRPNRSCHTALTHIQKEFNGAKWFVEGDIKGFFDNINHDVLINILLERIA DERFIRLIRKFMKAGYIEDWQFHNTYSGTPQGGIISPILANIYLDKLDKYIKEYTAKFDK GKKRKFSRESMDFGNARKRIVRRLKFVKDERQRTKLILELKAIEQGRAKYPNGEEMDADY RRMKYARYADDFLVGIIGSKQDAQQIKEDIKNFLADRLALELSDEKTLVTHTERPAKFLG YEITVRKSNDQRRDKRGRLRRTYGKRVCLNVSTETVRKKLFDWGVLELTNRNGKEIWKPK CKSGLIFNDDLEILDSYNREIVGFYNYYSIANNCAHALNNFKYIMEYSMYKTFAGKYKCR TRKVNKKYRKNGRFIVTHMTKTGVKERYFYDGGFKRKKPTYKSECDIMPRTIYTAGRTSL VERLKARECELCGATDDLVMHHVRKLKNLQGKESWERHMIARKRKTIAVCRSCHKKIHDG KID Prediction of potential genes in microbial genomes Time: Wed May 18 04:54:48 2011 Seq name: gi|222159176|gb|ACAB01000183.1| Bacteroides sp. D1 cont1.183, whole genome shotgun sequence Length of sequence - 2244 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 1585 848 ## COG3436 Transposase and inactivated derivatives 2 1 Op 2 . 
- CDS 1655 - 2164 41 ## BVU_0035 hypothetical protein Predicted protein(s) >gi|222159176|gb|ACAB01000183.1| GENE 1 1 - 1585 848 528 aa, chain - ## HITS:1 COG:ECs3866 KEGG:ns NR:ns ## COG: ECs3866 COG3436 # Protein_GI_number: 15833120 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 101 513 5 439 463 181 31.0 4e-45 MKEPVITLTLEEYEELRKERERLEKEHAELQRKYESSLREYSRQVEEISACTAVIADLRW KLADLTRRLWGKSSEKRHLPEDAGQLSICFESPSDVNDPVAEEQKTAEKSAKSENGYNRF RKSFTKKITPHARKPIDPSLPREEIIIPMPEGLSLEGAAKLGEEVSEQYAVSPARFYVRR IIRPKYRLADGRIITAPMPVMAHPHSNASESVLSHIATAKYYDHLPLYRQLDIFEREGIH LSPSTVSNWMMAAAQRLEPIYNELRELVKDSYYVMADETPHPVLESDRPGALHRGYMWNF YLPRFHTPFFEYHKGRGSSGIDTLLAGQVRVVQSDGFAVYDEFDTLPGKLHLCCWAHVRR KFVEAEGNDPPRARHALEQIGRLYAVEEKIRIGHLEGGAVVRLRREESYPIIKGLEKWCR QEYEHTVEKSPIAKAIFYMYTRFEQLSGYVNDAQFCIDNNPVERSIRPLTLNRKNTLFSG SHEAAHAAAIFFSLMGCCRENMVNPKLWMQDVLIRVQENEREKKNDYA >gi|222159176|gb|ACAB01000183.1| GENE 2 1655 - 2164 41 169 aa, chain - ## HITS:1 COG:no KEGG:BVU_0035 NR:ns ## KEGG: BVU_0035 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 169 1 169 169 295 95.0 4e-79 MCSVLVFSSESDKLRSLFSESNHFDSVLFIFDGGRRYKVSASSLGIHSPQDLLLPLGLGL FRSSPFWSYYLYPQGCNFHKGIDGLCGEVIRHTSSCVSEQSCHIFLDRSRSRLHILYRCD DEYRLECRRLNRGSFLLKKDERKRDFLQISWNRLNELLTVKKYRKTVEK Prediction of potential genes in microbial genomes Time: Wed May 18 04:54:51 2011 Seq name: gi|222159175|gb|ACAB01000184.1| Bacteroides sp. D1 cont1.184, whole genome shotgun sequence Length of sequence - 1583 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 84 - 145 1.5 1 1 Tu 1 . - CDS 214 - 561 265 ## gi|237712702|ref|ZP_04543183.1| conserved hypothetical protein - Prom 597 - 656 5.0 2 2 Tu 1 . 
+ CDS 505 - 663 77 ## gi|160884971|ref|ZP_02065974.1| hypothetical protein BACOVA_02963 Predicted protein(s) >gi|222159175|gb|ACAB01000184.1| GENE 1 214 - 561 265 115 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237712702|ref|ZP_04543183.1| ## NR: gi|237712702|ref|ZP_04543183.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 115 1 115 115 222 100.0 5e-57 MYGDFNRIVVQLVQHPVMHKPLSDLTYTECELAYALIRELIDLSTEGDYTLLDYIQMARL EYYLGELSCKINCSREETALHYAGALHLLEKGGFDLNIKKWVELVSLRIENSKKE >gi|222159175|gb|ACAB01000184.1| GENE 2 505 - 663 77 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160884971|ref|ZP_02065974.1| ## NR: gi|160884971|ref|ZP_02065974.1| hypothetical protein BACOVA_02963 [Bacteroides ovatus ATCC 8483] # 1 52 1 52 52 93 100.0 5e-18 MHDRMLNQLDNNAIKVSVHRLRYLPHQFPMVRGFIKKKERGNYLFMPYIGIR Prediction of potential genes in microbial genomes Time: Wed May 18 04:55:02 2011 Seq name: gi|222159174|gb|ACAB01000185.1| Bacteroides sp. D1 cont1.185, whole genome shotgun sequence Length of sequence - 1574 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 8 - 1300 595 ## COG5433 Transposase - Prom 1380 - 1439 5.0 Predicted protein(s) >gi|222159174|gb|ACAB01000185.1| GENE 1 8 - 1300 595 430 aa, chain - ## HITS:1 COG:ydcC KEGG:ns NR:ns ## COG: ydcC COG5433 # Protein_GI_number: 16129419 # Func_class: L Replication, recombination and repair # Function: Transposase # Organism: Escherichia coli K12 # 29 423 15 372 378 162 32.0 8e-40 MDKIAKDFLINDELARHMASMSEAIDTIDPREKNKVTYSGKLIMLVTLSGVFCDCQSWND IADFARYKKDFLRRFIPDLETTPSHDTLRRFFCIIKTERLESCYREWACNMRGDSPSIED CDWSKVQIGEGNDLYTNRHIAIDGKTICGAINADKLVQESAGKITKEQAASAKLHIVSAF LSDMSLSLGQERVSIKENEIVAIPKLLDDIDIRQGDVVTIDALGTQKKIVEKITEKQADY LLEVKDNHLKLRENIENDAEYLLISGRENDFIKRAEETTEGHGFMVTRTCISCSEPSRLG FCYRDWKNLRTYGIIKTEKINIATGEIQNEKHCFISSLVNNPELILKYKRKHWAVENGLH WQLDVTFNEDDGRKMMNSAQNFSTLTKMALTILKNYQDEDKKTSVNRKRKKAGWSDEYLA NLINNFIKAF Prediction of potential genes in microbial genomes Time: Wed May 18 04:55:02 2011 Seq name: gi|222159173|gb|ACAB01000186.1| Bacteroides sp. D1 cont1.186, whole genome shotgun sequence Length of sequence - 1460 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 32 - 712 230 ## PROTEIN SUPPORTED gi|149371717|ref|ZP_01891133.1| hypothetical protein SCB49_09940 - Prom 799 - 858 2.8 2 2 Tu 1 . 
- CDS 1011 - 1373 366 ## BVU_0853 hypothetical protein - Prom 1395 - 1454 2.2 Predicted protein(s) >gi|222159173|gb|ACAB01000186.1| GENE 1 32 - 712 230 226 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149371717|ref|ZP_01891133.1| hypothetical protein SCB49_09940 [unidentified eubacterium SCB49] # 135 221 5 91 96 93 54 1e-19 MIHLSKRKTVKEVTLDLSPTMMRIVRTGFPNATMTNDRFHVQKLFYEAIDELRITYRWMA RDLENDEIQRCKEQNIEYVPFRYTNGDTRKQLLARAKHVLVKHYSKWTESQRIRAEIIFD HYPELKSVYDLAIELTNIYNKHYDKDVARGKLALWYNKIEKLGYGCFRTVTNTMQNYYET ILNYFVSRETNAFAESFNAKIKAFRAQFRGVGDIPFFMFRLCKLTV >gi|222159173|gb|ACAB01000186.1| GENE 2 1011 - 1373 366 120 aa, chain - ## HITS:1 COG:no KEGG:BVU_0853 NR:ns ## KEGG: BVU_0853 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 8 120 4 116 116 106 45.0 2e-22 MNSNRNLDKLLRVIFPEVLMEYFDISGWHDGSDKIEVWLDEKHYLERSDYKSGTVISHGF TDEKVIQDFPLRGKPVYLHVRRRRWYDKATGETFSYTYDDLTAEGTKLTPEFVAFLKEED Prediction of potential genes in microbial genomes Time: Wed May 18 04:55:06 2011 Seq name: gi|222159172|gb|ACAB01000187.1| Bacteroides sp. D1 cont1.187, whole genome shotgun sequence Length of sequence - 1428 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 33 - 92 7.5 1 1 Tu 1 . 
+ CDS 112 - 1419 659 ## COG3039 Transposase and inactivated derivatives, IS5 family Predicted protein(s) >gi|222159172|gb|ACAB01000187.1| GENE 1 112 - 1419 659 435 aa, chain + ## HITS:1 COG:RSc0843 KEGG:ns NR:ns ## COG: RSc0843 COG3039 # Protein_GI_number: 17545562 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives, IS5 family # Organism: Ralstonia solanacearum # 15 430 17 439 440 248 36.0 2e-65 MINKTTPQPSLFSSLTDQLNPRHPLYLLANKIDWRRFEEAFSPLYCAVNGRPAKPIRLMC GLLILKHVRNLSDESVVEQWSENAYYQYFCGMLEFTPSYPCNASELVHFRKRVGENGMEL ILSESIRVNQEDDGPDHHCTAFIDSTVQEKNVTYPTDAKLHKKIVRKVLSVVKSLGLPLR QSYTFVLKKIYRDQRFRNHPKNRGKALKADKRLRTIAGRLVRELRRNLKENHGYDSLLDL FERVLSQKRNSPGKIYSLHEPEVQCISKGKEHKKYEFGNKVSIVRSITGVILGAKSFRNE YDGHTIEESLRQVERITGKKIRKLAGDRGYRGKKEVGGTGILIPDVPNRKDSYYTRKKKH KLFCKRAGIEPTIGHLKSDFRLGRNFYKGVFGDVVNLLLAAAAYNFKRAMRVLWLLVEKI CGTLFSCNIPQMSTF Prediction of potential genes in microbial genomes Time: Wed May 18 04:55:06 2011 Seq name: gi|222159171|gb|ACAB01000188.1| Bacteroides sp. D1 cont1.188, whole genome shotgun sequence Length of sequence - 1370 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 374 224 ## BT_1894 TPR repeat-containing protein 2 1 Op 2 . 
+ CDS 443 - 1370 472 ## BT_1894 TPR repeat-containing protein Predicted protein(s) >gi|222159171|gb|ACAB01000188.1| GENE 1 3 - 374 224 123 aa, chain + ## HITS:1 COG:no KEGG:BT_1894 NR:ns ## KEGG: BT_1894 # Name: not_defined # Def: TPR repeat-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 121 6 126 551 197 80.0 7e-50 IIGLSAILYFTGCYNREQTPRLSEAEKLMQNNPDSALAILQKLKPEGNRAEQARYALLYS EALEKKQMKVTDDSLIRQAWQYYKHYPKDLRHQCKTLYYWGRIKLRTGDKPGALRLFLKI ERN >gi|222159171|gb|ACAB01000188.1| GENE 2 443 - 1370 472 309 aa, chain + ## HITS:1 COG:no KEGG:BT_1894 NR:ns ## KEGG: BT_1894 # Name: not_defined # Def: TPR repeat-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 309 153 462 551 417 75.0 1e-115 MNYSRAYHYFHEARNNFRQSGDIQEETKVTLDMAAATFHSKDIEKAIRLYSAALDLADEH NNSNLIEVSLTNLASLYVISKRHISNDLLQRIELSARQDTVYGYHTLTDVSLLKNHIDSA RYYLELAKAHTTDICDMAELQYTAYHIEAQAKNFEKATDNVHRYIYLNDSIMRSNMQFSA GMVERDYFKERTKFAQYRMKNRTVWEIAIAAATFFIIGIAWYIVRQRLRMQRDRTNHYLL LTEKANSEYKALTERVKKQQTTESYLRGLAASRFDIVDKLGKTYYERENTTSQQSVIFNE VKQIITDFA Prediction of potential genes in microbial genomes Time: Wed May 18 04:55:13 2011 Seq name: gi|222159170|gb|ACAB01000189.1| Bacteroides sp. D1 cont1.189, whole genome shotgun sequence Length of sequence - 1355 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 719 570 ## COG3537 Putative alpha-1,2-mannosidase 2 1 Op 2 . 
+ CDS 726 - 1353 264 ## Phep_3364 metallophosphoesterase Predicted protein(s) >gi|222159170|gb|ACAB01000189.1| GENE 1 3 - 719 570 238 aa, chain + ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 4 234 552 780 790 162 37.0 7e-40 FIEPFDYATAGGLGAREAYGENNGWIYRWDVPHNIADLIELMGGKEAFRNNLETMYNTPL GEAKYVFYAQLPDHTGNVGQFSMANEPSMHIPYLYNYIGEPWRTQKRVRTLLDEWFRNDL MGLPGDEDGGGMSAFVVFSMLGFYPITPGLPIYVIGTPMFERAVIETGAGKSFEVIAHNY SPTNKYIQSAKLNGKDWNQSWFEHKELMNGGKLEFTMGNTPNKNWAADSVPPSFEMNK >gi|222159170|gb|ACAB01000189.1| GENE 2 726 - 1353 264 209 aa, chain + ## HITS:1 COG:no KEGG:Phep_3364 NR:ns ## KEGG: Phep_3364 # Name: not_defined # Def: metallophosphoesterase # Organism: P.heparinus # Pathway: not_defined # 28 206 39 213 379 108 37.0 2e-22 MNKILFVLLSLLTSLQSYSQEQNEKEVSFLLLGDIHYDLLEDHDMEWLSTKPDDLRQVTK EYSVFTKNTWPEFSRIISGQVQKHQPSIKAVLQMGDLSEGLAGSPQKAIQMANSAFKAVN KMNLKVPFIMTKGNHDITGPGAKEAFEKVYLPNMARLAGHPSLQSANYTTTLDDVLFVCY DPWDRNPEGLQQLEKSLAGSKATYKFVML Prediction of potential genes in microbial genomes Time: Wed May 18 04:55:17 2011 Seq name: gi|222159169|gb|ACAB01000190.1| Bacteroides sp. D1 cont1.190, whole genome shotgun sequence Length of sequence - 1328 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 39 - 1202 700 ## COG3385 FOG: Transposase and inactivated derivatives - Prom 1254 - 1313 3.1 Predicted protein(s) >gi|222159169|gb|ACAB01000190.1| GENE 1 39 - 1202 700 387 aa, chain - ## HITS:1 COG:SMc00721 KEGG:ns NR:ns ## COG: SMc00721 COG3385 # Protein_GI_number: 15966404 # Func_class: L Replication, recombination and repair # Function: FOG: Transposase and inactivated derivatives # Organism: Sinorhizobium meliloti # 7 383 7 373 387 190 32.0 3e-48 MNQDKYVFAQLVEFLNNDKFRRLVDKYDGNRYVKHFTCWSQLLAMMFGQLSNRESLRDLI VALEAHQGKRYHLGLGREPIAKTTLASANQNRDYRIFEDFAFYMMKEACEKRSTHILDIP GRKYAFDSTTIPLCLATFPWAKFRKKKGGVKAHVLYDIEAQLPAFYTVTTASRHDSTEMS AINYEPNAYYIFDRAYDSFKELYRIHLTGSFFVVRAKSNLKCKFCKWKRRMPKNILSDAE VKLIGYTSEKKYPESFRVIRFYDKEDDREFTFLTNAKHISALDIANLYKKRWFVELFFKW LKQHLKIKRFWGTTENAVRIQISVAIITYCLVAIVQYDMQLNRSTYEVLQILSISLTDKT PLQELFNKTNFNDVKEQFNPLIPGLFD Prediction of potential genes in microbial genomes Time: Wed May 18 04:55:18 2011 Seq name: gi|222159168|gb|ACAB01000191.1| Bacteroides sp. D1 cont1.191, whole genome shotgun sequence Length of sequence - 1294 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 1294 749 ## FP0487 TraG family mobilization protein Predicted protein(s) >gi|222159168|gb|ACAB01000191.1| GENE 1 1 - 1294 749 431 aa, chain + ## HITS:1 COG:no KEGG:FP0487 NR:ns ## KEGG: FP0487 # Name: not_defined # Def: TraG family mobilization protein # Organism: F.psychrophilum # Pathway: not_defined # 2 403 210 603 645 215 34.0 3e-54 QQNVYIDGGPGSGKSESWIKGIIYQCAERNYAGFVYDWEGDPTKDKSPILSRIAYGSIEY FRAKGVETPRFAYINFVDMSRTVRVNVLSPKYMSKGNESLFIRNIIITLMKNLEASWKEK TDFWANNAINYVYSIAYKCFKERELGICTLPHVIALALSDSNLVFHWLSEDPEIALNMSS MLTAWKLGAQQQTAGAVSSAQTPLVLLNNKYIFWVLSPLPEEEFSLDITNKEHPTLLCVG NAPTIKEAVSPAISCIGSVLMSQMNNPGKATSVFMVDEFPTILLQGIDTFIGTARKHNVA TILAVQDFNQAVRDYGEKSANILKASCGTQAYGMTGNEKTAKDIENLLGEKKEAQESYSH QTSGSGSVTESLQKEKVLKARDIAGQAAGHFIGKIAGGKPPFFNVQMDMCRFEEKEIPRF SLPVRLGNGKE Prediction of potential genes in microbial genomes Time: Wed May 18 04:55:23 2011 Seq name: gi|222159167|gb|ACAB01000192.1| Bacteroides sp. D1 cont1.192, whole genome shotgun sequence Length of sequence - 1263 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 989 432 ## gi|237712691|ref|ZP_04543172.1| conserved hypothetical protein - Prom 1019 - 1078 3.5 Predicted protein(s) >gi|222159167|gb|ACAB01000192.1| GENE 1 2 - 989 432 329 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237712691|ref|ZP_04543172.1| ## NR: gi|237712691|ref|ZP_04543172.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 329 1 329 330 626 100.0 1e-178 MEPSIYSYSLCIALPLMLFFGFYFLLAPTPEKAIFNNYLRSRRIMGVAILLLAANYSVHF FFGIRFKNTDAAILMNLSTYFLCYWLFSSALTTLLDRFYITKCRLRTHICLWILFSILSG IVLLLLPKGGLQTTVMFALAAWLVIYGLFLTRRLLRAYHRVIRIFDDTRADDIGAYIKWL SIFTYWAVTFGVGCGLLTFLPDEYVYIWILSSVPFYIYLFLCYLNYLLFYEQVENAMEDG MTSEEEDLCDTTNREQAQRQDTPFFHAEIAKKIKGWIDADGYIRPGLTIKELSDVLHTNR TYLSGYIKTTYDMSFRDWIIGLRIEYAKR Prediction of potential genes in microbial genomes Time: Wed May 18 04:55:37 2011 Seq name: gi|222159166|gb|ACAB01000193.1| Bacteroides sp. D1 cont1.193, whole genome shotgun sequence Length of sequence - 1213 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 14 - 1165 659 ## COG5433 Transposase Predicted protein(s) >gi|222159166|gb|ACAB01000193.1| GENE 1 14 - 1165 659 383 aa, chain - ## HITS:1 COG:ydcC KEGG:ns NR:ns ## COG: ydcC COG5433 # Protein_GI_number: 16129419 # Func_class: L Replication, recombination and repair # Function: Transposase # Organism: Escherichia coli K12 # 12 378 13 372 378 206 37.0 5e-53 MKIGIIDLCKQIEDPRMNRKKVHKMETIIYISIAAVICGAQSWNEIEEFGNAKIAFFKSR IPDLEFIPSHDTFNRFFSIIKPEYFELIFRNWVKQVCQEVKGVVAIDGKLMRGPSQCDGE HTTGKEGFKLWMVSAWSAVNGISLGQVKVDDKSNEITAIPLLINSLELSGCIVTIDAMGC QKDITQTIIERDANYIIAIKENKKKNYQLAKQIIDDYQDRDEIINRVTRHVSENTGHGRV EKRTCTVVSYGSIMEKMFKKKLVGLKSIVGIKSERTIVATGEYTQEVRYYVTSLDNTKPE EIASAIRQHWSIENNLHWQLDVTFREDYSKKVKNAARNFSVATKMALTILKKDKTTKGSM NLKRLKAGWDEKYLSQLLQNNNF Prediction of potential genes in microbial genomes Time: Wed May 18 04:55:37 2011 Seq name: gi|222159165|gb|ACAB01000194.1| Bacteroides sp. D1 cont1.194, whole genome shotgun sequence Length of sequence - 1151 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 21 - 75 2.5 1 1 Op 1 . 
- CDS 170 - 625 271 ## PROTEIN SUPPORTED gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 2 1 Op 2 . - CDS 622 - 1149 461 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit Predicted protein(s) >gi|222159165|gb|ACAB01000194.1| GENE 1 170 - 625 271 151 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 [Bacillus selenitireducens MLS10] # 2 145 4 148 164 108 36 2e-24 MRKAIFPGTFDPFTIGHYSVVERALTFMDEIVIGIGINENKNTYFPIEKREEMIRELYKD EPRIKVMSYDCLTIDFAQEVGARFIVRGIRTVKDFEYEETIADINRKLAGIETILLFTEP ELTCVSSTIVRELLTYNKDISQFIPKGMRIS >gi|222159165|gb|ACAB01000194.1| GENE 2 622 - 1149 461 175 aa, chain - ## HITS:1 COG:CT661 KEGG:ns NR:ns ## COG: CT661 COG0187 # Protein_GI_number: 15605394 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Chlamydia trachomatis # 1 168 431 602 605 189 57.0 2e-48 NSFGLTKKVVYENEEFNLLQAALNIEDGLDGLRYNKVIVATDADVDGMHIRLLLITFFLQ FFPDLIKKGHVYILQTPLFRVRNKKKTIYCYSEEERVNAINELSPNPEITRFKGLGEISP DEFKHFIGKDMRLEQVSLRKTDAVKELLEFYMGKNTMERQNFIIDNLVIEEDLAS Prediction of potential genes in microbial genomes Time: Wed May 18 04:55:38 2011 Seq name: gi|222159164|gb|ACAB01000195.1| Bacteroides sp. D1 cont1.195, whole genome shotgun sequence Length of sequence - 1016 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 289 - 356 23.0 1 1 Tu 1 . 
- CDS 370 - 1014 622 ## COG1624 Uncharacterized conserved protein Predicted protein(s) >gi|222159164|gb|ACAB01000195.1| GENE 1 370 - 1014 622 214 aa, chain - ## HITS:1 COG:BS_ybbP KEGG:ns NR:ns ## COG: BS_ybbP COG1624 # Protein_GI_number: 16077243 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 213 45 252 273 148 38.0 8e-36 VFILIWLVVTQVLEMKLLGSIFDTLMNVGVIALIVLFQDEIRRFLLTLGSHRHVSALARL FNGSKKESLKHDDIMPVVMACLNMGKQKVGALIVIEHNVPLDEVVRTGELIDAAINQRLI ENIFFKNSPLHDGAMVISKKRIKAAGCILPVSHDLNIPKELGLRHRAAMGISQQSDAHAI IVSEETGGISVAYRGQFYLRLNAEELESLLTKEN Prediction of potential genes in microbial genomes Time: Wed May 18 04:55:39 2011 Seq name: gi|222159163|gb|ACAB01000196.1| Bacteroides sp. D1 cont1.196, whole genome shotgun sequence Length of sequence - 925 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 923 229 ## BF2776 putative flippase Predicted protein(s) >gi|222159163|gb|ACAB01000196.1| GENE 1 2 - 923 229 307 aa, chain - ## HITS:1 COG:no KEGG:BF2776 NR:ns ## KEGG: BF2776 # Name: not_defined # Def: putative flippase # Organism: B.fragilis # Pathway: not_defined # 1 304 160 456 511 241 44.0 2e-62 SLIGIFECLLRLVIAFVCVYTNSDKLIVYGVLTAMIPLITLTIMKIYCHRKYDECVFSIR KYWDSNLLKRITSFFSWNFLTAISSLVSYYGSGLVLNHFFGTILSAAQGIANQVNGQLSN FSLNLMKAVNPVIVKRAGSHDMEAMNRVTLASGKFSTLLIVLFAVPFILEMHYVLRIWLK DVPEWTAMFCCMQLIVTVICQLTNSAATAIYAEGNIKRYAIYKSIMNVLPVCFTYLAYTL GGAPYWLYIFMIAIWSMGGDMVIIYYSNKQCGLKISDFLIQVVCPVLCISVCMFIGGFAI KSLFEES Prediction of potential genes in microbial genomes Time: Wed May 18 04:55:43 2011 Seq name: gi|222159162|gb|ACAB01000197.1| Bacteroides sp. 
D1 cont1.197, whole genome shotgun sequence Length of sequence - 853 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 75 - 851 354 ## COG3436 Transposase and inactivated derivatives Predicted protein(s) >gi|222159162|gb|ACAB01000197.1| GENE 1 75 - 851 354 258 aa, chain - ## HITS:1 COG:ECs3869 KEGG:ns NR:ns ## COG: ECs3869 COG3436 # Protein_GI_number: 15833123 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 17 253 217 461 467 127 33.0 2e-29 RVYNQQKGKYEYRKQYIWGIKNPDRKIAYYLYDNGSRSMKVAQKFFAGFRGSVTTDGYNV YKMFEREDSSITRYGCMAHVRRKFVDALQTDHRSADIINLISNLYWVEADCRINLLSEDE RRMERQKRAIPILAEIWQMIKPVFDQTRGDTANLFLKAVRYAVNEWEAVSRYVQNGKAEI DNNTAERMMKPICMGRKNYLFCGSELGAKNASMLYSIIETCKMNGLRPVKYIAEILTKLT AGETNYMSLLPINNNKEY Prediction of potential genes in microbial genomes Time: Wed May 18 04:55:44 2011 Seq name: gi|222159161|gb|ACAB01000198.1| Bacteroides sp. D1 cont1.198, whole genome shotgun sequence Length of sequence - 786 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 477 127 ## gi|294645685|ref|ZP_06723372.1| glycosyltransferase, group 1 family protein - Prom 669 - 728 7.2 Predicted protein(s) >gi|222159161|gb|ACAB01000198.1| GENE 1 3 - 477 127 158 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294645685|ref|ZP_06723372.1| ## NR: gi|294645685|ref|ZP_06723372.1| glycosyltransferase, group 1 family protein [Bacteroides ovatus SD CC 2a] # 1 158 201 358 375 297 100.0 2e-79 MKNTDIVINTFILLAKEYNNVYLEIIGNSESQFYTDYIKSKIKESNLESRINILPACNHK ELKKHLVDKSFYIFPSKEPREGHSNALTEAMAWGLIPITTSQGFNRSVIENDSLIVEKLD VKCFVNKIKYIIDNDLIEQFSYNSYNRVLSNYTDEIVL Prediction of potential genes in microbial genomes Time: Wed May 18 04:55:52 2011 Seq name: gi|222159160|gb|ACAB01000199.1| Bacteroides sp. D1 cont1.199, whole genome shotgun sequence Length of sequence - 755 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 754 279 ## COG3712 Fe2+-dicitrate sensor, membrane component Predicted protein(s) >gi|222159160|gb|ACAB01000199.1| GENE 1 1 - 754 279 251 aa, chain - ## HITS:1 COG:SMc04204 KEGG:ns NR:ns ## COG: SMc04204 COG3712 # Protein_GI_number: 15965785 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Sinorhizobium meliloti # 57 242 139 325 354 77 30.0 2e-14 KATLILAGNDKQSLTPTYPTPVKVNHSTTAIAQNGALIYPSTPNINIDISQKQQPEIVEK NTLTTEQGNEFRVTFEDGTTVHLNYNTEIRYPVKFSKTKRTVYLKGEAYFKIAKDARPFY VVTDQGIIKQYGTEFNVNTFTSGRTEVALVKGSISIIPQGSTQEQLIEPGQLAHIEQKGN NISIYNVDLTPYIAWNEGRLIFENRTLENIVEILEHWYNVNISFGTSELKQLRFTGNMDR YATISPILKAI Prediction of potential genes in microbial genomes Time: Wed May 18 04:55:52 2011 Seq name: gi|222159159|gb|ACAB01000200.1| Bacteroides sp. 
D1 cont1.200, whole genome shotgun sequence Length of sequence - 741 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 28 - 87 4.5 1 1 Tu 1 . + CDS 331 - 492 73 ## + Term 660 - 697 0.5 Predicted protein(s) >gi|222159159|gb|ACAB01000200.1| GENE 1 331 - 492 73 53 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHWGRRFSQASAVFSFIPCRLPYLCAGETCSQFAKEEILINIHQMANFCKGKE Prediction of potential genes in microbial genomes Time: Wed May 18 04:55:57 2011 Seq name: gi|222159158|gb|ACAB01000201.1| Bacteroides sp. D1 cont1.201, whole genome shotgun sequence Length of sequence - 713 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 651 310 ## COG3464 Transposase and inactivated derivatives Predicted protein(s) >gi|222159158|gb|ACAB01000201.1| GENE 1 3 - 651 310 216 aa, chain - ## HITS:1 COG:AGpA139 KEGG:ns NR:ns ## COG: AGpA139 COG3464 # Protein_GI_number: 16119324 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 2 89 59 146 404 66 35.0 4e-11 MSVVVDQMTHMPIALLEDRSGEALDNWLARNPQIQYITRDRGRCFTEAINRIIPGVTQIC DRFHLTKNMTDTMIPEIEKMIRQTKQKLKYEYPDRDTASSLILQDIFNMGDVRHREKLKI YRESLNLKMQGMTIEQTAAHLGKKSRYIYKLIHNRRIGAYLNEQQKTALKYVSELATIIS AGCITRNILAQKMGSKISGALIGRITSSLRKMYQQK Prediction of potential genes in microbial genomes Time: Wed May 18 04:55:57 2011 Seq name: gi|222159157|gb|ACAB01000202.1| Bacteroides sp. D1 cont1.202, whole genome shotgun sequence Length of sequence - 693 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 314 - 556 147 ## gi|237712680|ref|ZP_04543161.1| predicted protein + Term 630 - 666 1.3 Predicted protein(s) >gi|222159157|gb|ACAB01000202.1| GENE 1 314 - 556 147 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237712680|ref|ZP_04543161.1| ## NR: gi|237712680|ref|ZP_04543161.1| predicted protein [Bacteroides sp. D1] # 1 80 1 80 80 152 100.0 7e-36 MMDGILSYAFRKLRAAATGSAIANMQPMSYKDTDTTGEPVEHVTEPDKQIAHSALGDYGP FSVKYATKEFLSYGNKKFRI Prediction of potential genes in microbial genomes Time: Wed May 18 04:56:03 2011 Seq name: gi|222159156|gb|ACAB01000203.1| Bacteroides sp. D1 cont1.203, whole genome shotgun sequence Length of sequence - 672 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 670 102 ## BT_0034 hypothetical protein Predicted protein(s) >gi|222159156|gb|ACAB01000203.1| GENE 1 2 - 670 102 222 aa, chain + ## HITS:1 COG:no KEGG:BT_0034 NR:ns ## KEGG: BT_0034 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 222 189 403 403 392 96.0 1e-108 LLDQLELLHSSYRNRLTLSSDYQRRFSVIQTVLEQGKNLFAGKKVSNRIVSIDRHYLRPI IRGKETKSVEFGAKVNNIQIDGISFIEHISFKAFNEGVRLKDCIHLQQQLTGVRVKALAA DSIYANNANRKFCTKYHISTSFKRKGRAAKDEPLRKILRSELSRERATRLEGSFGTQKQH YSLARIKARNRKTEVLWIFFGIHTANAVCMIEKVEKKKRKAA Prediction of potential genes in microbial genomes Time: Wed May 18 04:56:06 2011 Seq name: gi|222159155|gb|ACAB01000204.1| Bacteroides sp. D1 cont1.204, whole genome shotgun sequence Length of sequence - 640 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Wed May 18 04:56:07 2011 Seq name: gi|222159154|gb|ACAB01000205.1| Bacteroides sp. 
D1 cont1.205, whole genome shotgun sequence Length of sequence - 590 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 56 - 463 253 ## gi|260174678|ref|ZP_05761090.1| iron-sulfur cluster-binding protein/coenzyme F420-reducing hydrogenase, beta subunit, putative + Term 549 - 583 0.4 Predicted protein(s) >gi|222159154|gb|ACAB01000205.1| GENE 1 56 - 463 253 135 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174678|ref|ZP_05761090.1| ## NR: gi|260174678|ref|ZP_05761090.1| iron-sulfur cluster-binding protein/coenzyme F420-reducing hydrogenase, beta subunit, putative [Bacteroides sp. D2] # 1 135 188 322 322 266 100.0 4e-70 MHQSKLFRPKRCLMCKEDISYKADVSLADPWLGKYKISDKIGHTMFLINTEKGLAFIEEM KNKSLLRLIDSSVEDYIEAQGHTIMAKDKASLEKKFNNILSKMGNNILYKKIMTLSPFML RFHMLIIRIVYKIVK Prediction of potential genes in microbial genomes Time: Wed May 18 04:56:13 2011 Seq name: gi|222159153|gb|ACAB01000206.1| Bacteroides sp. D1 cont1.206, whole genome shotgun sequence Length of sequence - 577 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 369 111 ## gi|262408137|ref|ZP_06084684.1| predicted protein - Prom 389 - 448 5.6 + Prom 185 - 244 7.2 2 2 Tu 1 . + CDS 437 - 571 58 ## Predicted protein(s) >gi|222159153|gb|ACAB01000206.1| GENE 1 3 - 369 111 122 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262408137|ref|ZP_06084684.1| ## NR: gi|262408137|ref|ZP_06084684.1| predicted protein [Bacteroides sp. 
2_1_22] # 1 122 1 122 151 191 100.0 2e-47 MEAILAIIITAILFCFFIRKGKNEISEQTTEDVIDSAYTEAYKLVIKQKKDVNEVDHILE SKGLDESQRNSIIVSLLQITNDKKAKRKQRIANAPSDILYGGIWCIGGITLTCLSLLLSD SI >gi|222159153|gb|ACAB01000206.1| GENE 2 437 - 571 58 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNSKCCLSYYSLSCLALKPFIMIRPFELCIVGFSFKEFSLSTKY Prediction of potential genes in microbial genomes Time: Wed May 18 04:56:23 2011 Seq name: gi|222159152|gb|ACAB01000207.1| Bacteroides sp. D1 cont1.207, whole genome shotgun sequence Length of sequence - 539 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 405 207 ## BT_3440 hypothetical protein - Prom 429 - 488 2.9 Predicted protein(s) >gi|222159152|gb|ACAB01000207.1| GENE 1 3 - 405 207 134 aa, chain - ## HITS:1 COG:no KEGG:BT_3440 NR:ns ## KEGG: BT_3440 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 134 170 302 443 178 65.0 6e-44 MSSNSTTGAMDSWDIIPATNNPGTYVLQNELYLGQGSSGSWWDVFNYVVDIQANNVGFSQ YTQRAQQEFTITPDAVFTLKSLEFINPYSATVTQRENVSIKRAITNSSSQFIRENLVFEK AIQEDSYFKEEKGI Prediction of potential genes in microbial genomes Time: Wed May 18 04:56:26 2011 Seq name: gi|222159151|gb|ACAB01000208.1| Bacteroides sp. D1 cont1.208, whole genome shotgun sequence Length of sequence - 532 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 8 - 54 1.0 1 1 Tu 1 . - CDS 81 - 530 133 ## gi|262408135|ref|ZP_06084682.1| predicted protein Predicted protein(s) >gi|222159151|gb|ACAB01000208.1| GENE 1 81 - 530 133 149 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262408135|ref|ZP_06084682.1| ## NR: gi|262408135|ref|ZP_06084682.1| predicted protein [Bacteroides sp. 
2_1_22] # 1 149 9 157 157 284 100.0 1e-75 NSILADEYKANLEAKGILCYIQKEEVGIGAYAGGSQPQIAVFVEDEHYQEAIDLINGLKL ERDSNLPWCPKCGSENTAHTIVQHKHGPKWMVLVSFVIVVFCIIFTILNKCTIVYIPTII PIILLFIWLKGYKEDIYHCNQCGNDFKRT Prediction of potential genes in microbial genomes Time: Wed May 18 04:56:34 2011 Seq name: gi|222159150|gb|ACAB01000209.1| Bacteroides sp. D1 plasmid unnamed cont1.209, whole genome shotgun sequence Length of sequence - 2752 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 385 - 444 4.1 1 1 Tu 1 . + CDS 532 - 1122 415 ## ECA2908 putative plasmid replication protein + Prom 1579 - 1638 2.2 2 2 Tu 1 . + CDS 1742 - 2750 495 ## BVU_1439 mobilization protein Predicted protein(s) >gi|222159150|gb|ACAB01000209.1| GENE 1 532 - 1122 415 196 aa, chain + ## HITS:1 COG:no KEGG:ECA2908 NR:ns ## KEGG: ECA2908 # Name: not_defined # Def: putative plasmid replication protein # Organism: E.carotovora # Pathway: not_defined # 45 182 34 178 254 75 31.0 1e-12 MENKKAVKLTDFQKNEENPFMKQAIEGIENHVVKKYKSNSGGDKRAVVALADTETGEVFK TSFIRQIEVDEEQFTKLYLSNFAAFFDLSQAAIRVFGYFMTCMKPKNDLIIFNRKKCLEY TKYKTDKAVYKGLAELVKAEIIARGPADNLWFINPLIVFNGDRVTFAKTYVRKKTLAAQK KEEAEKRQLSLGFDEQ >gi|222159150|gb|ACAB01000209.1| GENE 2 1742 - 2750 495 336 aa, chain + ## HITS:1 COG:no KEGG:BVU_1439 NR:ns ## KEGG: BVU_1439 # Name: not_defined # Def: mobilization protein # Organism: B.vulgatus # Pathway: not_defined # 4 319 5 349 467 105 28.0 3e-21 MGATSIHVQAVKPGSEIHNFREKELDYVRPELSHLNESWVGDSISHRLESAKQRYLDTVG QKMQAKAAPIREGVIVIKQETTMQELQQFATVCKERFGIEAFQIHIHKDEGYMNAKQWTP NLHAHVVFDWTQPNGKSVRLSRDDMAELQTIASETLGMERGVSSDRKHLSAMQYKTECAK EQLQELSNDISSALDKHKDVQNQLLQLQKELRSIETKKNVQKLISKASEKFYGLIGKTVN DREKDALKAKVKALEGENEQLSDRLGKAILEKEQNGTKAFKAENDKEYYRQQMDNARTTS NRLRTENQELKTETKELKKELGKMKDLFNSEQLEAL