Prediction of potential genes in microbial genomes
Time: Thu Jul 14 02:23:22 2011
Seq name: gi|261748848|gb|ADAD01000001.1| Leptotrichia goodfellowii F0264 contig00032, whole genome shotgun sequence
Length of sequence - 12615 bp
Number of predicted genes - 11, with homology - 10
Number of transcription units - 6, operones - 4 average op.length - 2.2

   N  Tu/Op     Conserved S       Start      End  Score
                pairs(N/Pv)
   1  1 Tu  1  .         - CDS       2 -   2528   3811  ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains
                         + Prom   2548 -   2607   11.4
   2  2 Op  1  8/0.000   + CDS    2850 -   3941   1391  ## COG0524 Sugar kinases, ribokinase family
   3  2 Op  2  .         + CDS    3942 -   4862   1359  ## COG2313 Uncharacterized enzyme involved in pigment biosynthesis
                         + Term   4896 -   4949    6.2
                         - Term   4709 -   4742   -1.0
   4  3 Tu  1  .         - CDS    4934 -   6496   1810  ## COG4166 ABC-type oligopeptide transport system, periplasmic component
   5  4 Op  1  .         - CDS    6564 -   7715   1809  ## COG0192 S-adenosylmethionine synthetase
   6  4 Op  2  .         - CDS    7708 -   7872    196  ## gi|262037127|ref|ZP_06010620.1| hypothetical protein HMPREF0554_0415
                         - Prom   7892 -   7951    9.9
                         - Term   7904 -   7951    3.1
   7  5 Op  1  .         - CDS    7984 -   8106     59  ##
   8  5 Op  2  1/0.000   - CDS    8120 -   9307   1156  ## COG0477 Permeases of the major facilitator superfamily
   9  5 Op  3  .         - CDS    9353 -   9940    947  ## COG0353 Recombinational DNA repair protein (RecF pathway)
                         - Prom   9975 -  10034    1.9
                         - Term   9962 -  10022    2.4
  10  6 Op  1  12/0.000  - CDS   10098 -  11525   2312  ## COG0469 Pyruvate kinase
  11  6 Op  2  .
- CDS 11558 - 12526 1814 ## COG0205 6-phosphofructokinase Predicted protein(s) >gi|261748848|gb|ADAD01000001.1| GENE 1 2 - 2528 3811 842 aa, chain - ## HITS:1 COG:FN0705_2 KEGG:ns NR:ns ## COG: FN0705_2 COG0749 # Protein_GI_number: 19704040 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Fusobacterium nucleatum # 404 842 2 441 501 416 52.0 1e-115 MKKAVILDTSAVMYRTHFALMGMRNSKGMATGATYGFVNTLEGIIKEFNPDYLVACLDIK RSELKRSEELETYKAHRESMPDDLVAQQDMIMSVLDGYKIPKYKINGYEADDVIATLATK FSEDKEEEIEVYIVTGDKDLAQLVNGKINIALLGKGDKKSLFRYIRNDNDVIEYLGVTPD KIPDLFGLMGDSSDGIPGVSGIGPKTGVELILKYGNLESLYENIDEIKGKRKEKLIEDKE KAFLSRKLATVHRNIEMEYDKDKLKMESKDTEKLLSIYRTMEFKKFATTIENEMKKQGKA VNDESNGKISLFGGIVNHPDASDNKISNENRQTEDKTTREKIEWSDAENVLNGMKEEVSI FGNEFGFVICDGERNIVLLNHENTDYGSINKVYEKLKEKSVIGYNIKEYLKKGMGYRQYF DIMLAWYVLGTESSQDLENIIFSELGVNLEKFEEQFKKRKPEEISEEEKTQFLWERAFYV KELKEILENRLKSEDLTNVFENLENKLVPVLASMEMYGIKINKEYFENYKKELQENIEKL EGEIYSLLGEAFNIGSPKQLAEILFDKMGIEPVKKTKTGYSTDVEVLEELALRGIEIAEK LLEYRGFTKLLSTYVEPLPKLADEKDRIHTTFNQNGTSTGRLSSTNPNIQNIPVRTDEGI KIRKGFISEEGWSLVSFDYSQIELRVLAELSKDENLVLAYKKDKDLHDLTARKIFFKADE EQVTREERSIAKVINFSILYGKTPFGLSKELKIPVADASLYIKTYFEQYPKVKRFLENIL ENAKLNGFVETLYGTRRYIKGINSSNKNLQAQANRMAVNTVVQGTAANIIKKVMIKLYDE FK >gi|261748848|gb|ADAD01000001.1| GENE 2 2850 - 3941 1391 363 aa, chain + ## HITS:1 COG:ECs3058 KEGG:ns NR:ns ## COG: ECs3058 COG0524 # Protein_GI_number: 15832312 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli O157:H7 # 54 362 1 310 313 192 33.0 1e-48 MTEREKQIINLIKENPLITQEEIAATLNCARTSIAVHISNLMKKGIILGKKYIINEDPYI LTIGGTNVDIQGKSYSTVVRYDSNPGSVTISFGGVGKNIAENINKLGINSKFVTALGDDI YGKNIKEYLNNQNLDISDSLFLKNQQTSMYVSILNDDKEMEMAVSSTDICKNITPEFLET IRKKITNAKLLVLDTNLEEEALRYIAFLRRKPNLILDTVSTKKSLKVKEFIGRFHTIKPN KLEAEILSGITIYSNDDLERAGEYFLHKGVKKVLISLGAKGVFYMTQDKQGIIKIPRINT VSSTGAGDAFVAGLAYGEYNDYDIEEASKFGLGAAILTSLNEKTVSDHISVKNVENIIKE MEI 
>gi|261748848|gb|ADAD01000001.1| GENE 3 3942 - 4862 1359 306 aa, chain + ## HITS:1 COG:ECs3057 KEGG:ns NR:ns ## COG: ECs3057 COG2313 # Protein_GI_number: 15832311 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized enzyme involved in pigment biosynthesis # Organism: Escherichia coli O157:H7 # 6 301 11 306 312 391 67.0 1e-109 MFEKYLEISEEVQNALKNNQPVVALESTIISHGMPYPQNAETALKVESIVRENGGIPATI AIIGGKLKVGLSPQEIELLGKEGEKVIKVSRRDIPYIVANKLNGATTVASTMIIANMAGI KIFATGGIGGVHRGAEHTMDISADLQELSNTNVAVICAGAKSILDLGLTLEYLETNGVPV LGYKTKELPAFYTRNSGFNLDYAIDTPKEFAEILNTKWKLGLKGGVVIANPIPEEYSMDN NVISKVINDAVEEAEKLGIKGKASTPFLLDKIQKLTGGSSLEANIKLVFNNTKLATEIAK ELCNLK >gi|261748848|gb|ADAD01000001.1| GENE 4 4934 - 6496 1810 520 aa, chain - ## HITS:1 COG:SA0849 KEGG:ns NR:ns ## COG: SA0849 COG4166 # Protein_GI_number: 15926579 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Staphylococcus aureus N315 # 26 520 39 540 551 220 28.0 4e-57 MIFSVLFYYFTNKKKIDTQTPIVFNLGSDYNTLDPHLFNEMIAVQVDSTIYEGLLILDEK GNYTGGVAESFTENGNKMIFKIRNNAKWSDGSKITANDFVFAFKRVLNPETAAQFSEMLF PVKNAEKYYEGKVSENELGIKAINDSILEIELEHPTPYFKYILTLPVAVPLKEEFYKSRK DKYAIKLEDFLFNGPYRISSLKKEEILLEKNPYYWNVENIKISRIKYVITKDFKTVDDLI KNKELDMSRVENYNLEQYKKNGTLDTFLNGRIWYLDFNLDNVYLKNKKIRQAISFVINRD KYVKEIKKDGSVVAKSIISSIISGYNENYRKNYPDTDYFKDGDIEKAKKLYKEGLKELGA EKIKLRLLSGNSDPERLEIQFLQEELRINLGLETEVTIVPFKERITKTREGNYDIVLNTW SPKYDDSLSYLERWKKEDGKNEDIWSKRKYNLLVNEISSMGYGKERDKKINEAERILIDE AVISPLYFSVENHYRTSTIKKIIRRPITGIATFNYGYIVK >gi|261748848|gb|ADAD01000001.1| GENE 5 6564 - 7715 1809 383 aa, chain - ## HITS:1 COG:FN0355 KEGG:ns NR:ns ## COG: FN0355 COG0192 # Protein_GI_number: 19703697 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylmethionine synthetase # Organism: Fusobacterium nucleatum # 1 381 1 379 383 599 74.0 1e-171 MSKKIYFTSEFVSPGHPDKICDQISDAVLDACLKEDPDARVACETFATTGLVIVGGEITT 
TTYVDIQNIVRSKIEEIGYRPGMGFDSDCGVMNTVHSQSPDISMGVDTGGAGDQGIMFGG AVNETEELMPLAMTLAREIIIQLTKLTRNKTLTWARPDAKAQVTLAYDETGKKVLNMDTV VLSVQHDEDITREKIEKDLKKLVIKPVLEKYNLNPDEVEKYHINPTGRFVIGGPHGDSGL TGRKIIIDTYGGYFRHGGGAFSGKDPSKVDRSAAYAARWIAKNIVASGIADKCEVQLSYA IGVIEPVSVRVETFGTSKIEESKIEEIVEKIFDLTPRGIEKELELRNPKFRYQDLAAFGH IGRTDIDLPWERLNKVEEIKKYL >gi|261748848|gb|ADAD01000001.1| GENE 6 7708 - 7872 196 54 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037127|ref|ZP_06010620.1| ## NR: gi|262037127|ref|ZP_06010620.1| hypothetical protein HMPREF0554_0415 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0415 [Leptotrichia goodfellowii F0264] # 1 54 1 54 54 81 100.0 2e-14 MVSIAGIAGALCGIGLAVYIAIIVRRKTEEAVKKRLKNREKSEDENKNQGGKNV >gi|261748848|gb|ADAD01000001.1| GENE 7 7984 - 8106 59 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYYNKLTIKYATYQERYERRGSLLVPGNLYKNTRCQFPAY >gi|261748848|gb|ADAD01000001.1| GENE 8 8120 - 9307 1156 395 aa, chain - ## HITS:1 COG:FN1497 KEGG:ns NR:ns ## COG: FN1497 COG0477 # Protein_GI_number: 19704829 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Fusobacterium nucleatum # 33 362 10 338 374 106 27.0 8e-23 MNKEKDYKKINLMYSFIIICYVICEGFFGYGFTSFIFKKGINLAKIGLLMGLTNFAIMLF DYPSGNIADKYGRKKICSSGFIIYGIGLIIFGFSNNFALFLISGIVRAFGSSLISGTPEA WYLGELSKINKFSYKDKFLPIIRGIGLFFGSVSGIMAGKVSEINISLPIYIGGIIMIVSG VIIGILFVENYGNREGNLIKTINKNSVNFFRNSKMRILSVFEVLKTIMFTIFILLWQIFT TKVIGLSHSKLGYFYTAMLLLMSLSSFFSRYLMKKLNKITVTILGLFFISFGLIIFIYSK NIYLFILGFVIFEFSLGIANTSYFTWVYDYIPEEVRATYSSALNSVRAFSGFIMSVFLGK VIQDLGYNTGWIIALVSTLISIIWLLKIHKNDKNF >gi|261748848|gb|ADAD01000001.1| GENE 9 9353 - 9940 947 195 aa, chain - ## HITS:1 COG:FN0412 KEGG:ns NR:ns ## COG: FN0412 COG0353 # Protein_GI_number: 19703754 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF 
pathway) # Organism: Fusobacterium nucleatum # 2 194 4 197 197 195 48.0 5e-50 MKRLDDLVNVFAKLPGIGRKSAMRIAFDILEKDEKEIDDILYTIKDSYDNIKHCSVCGNL SEKDVCEICIDEKRNKNVICVVEGVRDIIAFEKSETYNGLYHVLGGKIDPLNGVTIDDLN IEKLMERLDGTVSEIILALNPDLEGETTNLYLTKILKGKNIKISKIASGIPMGGNIEYTD MATLGKSLEGRVEVE >gi|261748848|gb|ADAD01000001.1| GENE 10 10098 - 11525 2312 475 aa, chain - ## HITS:1 COG:STM1378 KEGG:ns NR:ns ## COG: STM1378 COG0469 # Protein_GI_number: 16764728 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: Salmonella typhimurium LT2 # 3 468 1 467 470 548 63.0 1e-156 MKIKMTKVVCTIGPKTESVEMLTKLVESGMNVMRLNFSHGDFEEHGTRIKRIREVMEKTG KNIGILLDTKGPEIRTGKLEGGKDILLEAGNTIAITTDYSHVGNKDKISVSYPGIVDDLK PGNTVLLDDGLVGLEVAEIKGNEIICKVINTGELGETKGVNLPGVSVGLPALSEKDIADL KFGCEQGVDFVAASFIRKASDVAEVRKVLDENGGANIKIIPKIENQEGVDNFDEILELSD GIMVARGDLGVEIPAEEVPFVQKMMIRRCNAAGKPVITATQMLDSMIRNPRPTRAEAGDV ANAILDGTDAVMLSGESAKGKYPVEAVQMMAGISKRTDEFKKFKNIVVPQVGSVTVTEAI SLGAVESSQLLDAKMIICWTKTGRAARMLRKYGPTVPIIALTDSEQTARQLALVRGVRAY VEKNLDKTDDFFKKAKEVASKHEEVKRGELVVLVTGISETGTTNTFKVARIGEDY >gi|261748848|gb|ADAD01000001.1| GENE 11 11558 - 12526 1814 322 aa, chain - ## HITS:1 COG:FN0410 KEGG:ns NR:ns ## COG: FN0410 COG0205 # Protein_GI_number: 19703752 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Fusobacterium nucleatum # 1 321 8 328 329 365 56.0 1e-101 MMRRIAVLTSGGDSQGMNTAVRAVAKTAMTKGIDVFGIKRGYKGMLEDQIFEMSPLSVSG IADQGGTILLSARLPEFKDPAVRAVAADNLKKRGIDGLVVIGGDGSFHGAHYLYEEHGIK TIGIPGTIDNDVAGTDYTIGYDTALNIILDAISRIKDTAISHERTYLLEVMGRNCGDLAL YSAIAGGASGVLIPEVENSIDDIAEVIKYRRSEGKLYDIIVVAEGVGNVMEIQKELAKKV DTSIRVTILGHVQRGGAPTAFDRILATRLGVKAVELLVEGKGGLMVGLQSEKVTTHPLSY AWENYHTKTSFNDYAIANMLSL Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:23:42 2011 Seq name: gi|261748816|gb|ADAD01000002.1| Leptotrichia goodfellowii F0264 contig00127, whole genome shotgun sequence Length of sequence - 31981 bp Number of predicted genes - 31, with 
homology - 31
Number of transcription units - 4, operones - 4 average op.length - 7.8

   N  Tu/Op     Conserved S       Start      End  Score
                pairs(N/Pv)
                         + Prom     14 -     73    4.6
   1  1 Op  1  1/0.000   + CDS      95 -    814    813  ## COG0463 Glycosyltransferases involved in cell wall biogenesis
   2  1 Op  2  .         + CDS     811 -   1710    860  ## COG0673 Predicted dehydrogenases and related proteins
   3  1 Op  3  .         + CDS    1716 -   2156    380  ## LKI_07260 teichoic acid glycosylation protein
   4  1 Op  4  .         + CDS    2184 -   3317   1167  ## bpr_I0518 FAD dependent oxidoreductase
   5  1 Op  5  8/0.000   + CDS    3342 -   4250    773  ## COG0463 Glycosyltransferases involved in cell wall biogenesis
   6  1 Op  6  .         + CDS    4287 -   5123    721  ## COG1216 Predicted glycosyltransferases
   7  1 Op  7  .         + CDS    5199 -   6380   1762  ## COG2273 Beta-glucanase/Beta-glucan synthetase
   8  1 Op  8  11/0.000  + CDS    6457 -   7665   1617  ## COG1088 dTDP-D-glucose 4,6-dehydratase
   9  1 Op  9  .         + CDS    7711 -   8283    881  ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes
                         + Term   8298 -   8343    5.0
                         + Prom   8338 -   8397   12.2
  10  2 Op  1  .         + CDS    8496 -   8711    450  ## gi|262037159|ref|ZP_06010651.1| hypothetical protein HMPREF0554_1969
                         + Prom   8762 -   8821   10.5
  11  2 Op  2  .         + CDS    8892 -   9620    677  ## gi|262037135|ref|ZP_06010627.1| putative membrane protein
                         + Prom   9676 -   9735    8.6
  12  3 Op  1  .         + CDS    9824 -  10597    500  ## gi|262037157|ref|ZP_06010649.1| putative ATP synthase delta chain
                         + Prom  10600 -  10659    5.9
  13  3 Op  2  .         + CDS   10693 -  11598    945  ## COG1216 Predicted glycosyltransferases
  14  3 Op  3  3/0.000   + CDS   11609 -  12625   1320  ## COG0859 ADP-heptose:LPS heptosyltransferase
  15  3 Op  4  11/0.000  + CDS   12629 -  13657   1287  ## COG0463 Glycosyltransferases involved in cell wall biogenesis
  16  3 Op  5  .         + CDS   13687 -  14655   1177  ## COG0463 Glycosyltransferases involved in cell wall biogenesis
  17  3 Op  6  .         + CDS   14670 -  15683    754  ## D11S_0428 capsular polysaccharide synthesis
  18  3 Op  7  .         + CDS   15752 -  16603    844  ## D11S_0427 hypothetical protein
  19  3 Op  8  1/0.000   + CDS   16654 -  17418    745  ## COG3774 Mannosyltransferase OCH1 and related enzymes
  20  3 Op  9  11/0.000  + CDS   17439 -  18203    706  ## COG0463 Glycosyltransferases involved in cell wall biogenesis
  21  3 Op 10  .         + CDS   18124 -  18999    935  ## COG0463 Glycosyltransferases involved in cell wall biogenesis
  22  3 Op 11  .         + CDS   18986 -  19717    896  ## Lebu_2266 glycosyl transferase family 2
  23  3 Op 12  .         + CDS   19764 -  20507    924  ## Lebu_2266 glycosyl transferase family 2
  24  3 Op 13  .         + CDS   20534 -  21559   1172  ## COG0726 Predicted xylanase/chitin deacetylase
  25  3 Op 14  .         + CDS   21594 -  23342    270  ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9
                         + Prom  23351 -  23410    7.0
  26  4 Op  1  .         + CDS   23468 -  23899    631  ## gi|262037142|ref|ZP_06010634.1| hypothetical protein HMPREF0554_1985
  27  4 Op  2  .         + CDS   23907 -  24908    998  ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis
  28  4 Op  3  12/0.000  + CDS   24949 -  26274   1640  ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC
  29  4 Op  4  .         + CDS   26274 -  27368   1114  ## COG4658 Predicted NADH:ubiquinone oxidoreductase, subunit RnfD
  30  4 Op  5  .         + CDS   27428 -  30487   3822  ## COG1197 Transcription-repair coupling factor (superfamily II helicase)
                         + Prom  30489 -  30548    1.6
  31  4 Op  6  .
+ CDS 30581 - 31819 1750 ## COG4099 Predicted peptidase Predicted protein(s) >gi|261748816|gb|ADAD01000002.1| GENE 1 95 - 814 813 239 aa, chain + ## HITS:1 COG:AF0581_1 KEGG:ns NR:ns ## COG: AF0581_1 COG0463 # Protein_GI_number: 11498189 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Archaeoglobus fulgidus # 1 214 54 265 280 119 33.0 6e-27 MDLEITIPVFNEEETIKEKIPEMILYVKNNIKDIEISFIIVDNGSTDNTEKYSLELTKEY NNLKYIKLLEKGVGLALRTSWSQSQADYVGYMDLDIATDLEALETVVTEMKNGVKIINGS RLLKNSKVINRSFIREITSRVFNLLLKIILKVRFTDGMCGFKFLNRQTAQELIGTGIDTK GWFFSTEIMVKGYWKEIEIKEIPIKWTDDRKSKVKIFSLSWNYLKSIVKLKEEEKEFKK >gi|261748816|gb|ADAD01000002.1| GENE 2 811 - 1710 860 299 aa, chain + ## HITS:1 COG:BH2703 KEGG:ns NR:ns ## COG: BH2703 COG0673 # Protein_GI_number: 15615266 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus halodurans # 42 142 55 155 329 68 33.0 1e-11 MKCVLIGYGYWGKILKKYIESSKFFELFGIYDPSFEKSIDMDSILKDKSIECAFICNPID MHYLSVKKLLENGKHVFCEKPLSKSLIETEELFNLAKSKKLCLFIDYIYTNSPSINVIKE EINSLGKILYCEGNIKQFGKFYKDDDVFEVLGVHMISAISYILDSEINIIKVISKKENSK GIIQTGSLEFESNEGIKGIINSSLLDANKERKIIFRCENGNISFNMLGSTTVSIVKHIET ERGYEEKIILDEKYDETNNLINVLQEFKKNIENKIYNDQISLEVAKALEKIKELNKERE >gi|261748816|gb|ADAD01000002.1| GENE 3 1716 - 2156 380 146 aa, chain + ## HITS:1 COG:no KEGG:LKI_07260 NR:ns ## KEGG: LKI_07260 # Name: not_defined # Def: teichoic acid glycosylation protein # Organism: L.kimchii # Pathway: not_defined # 15 146 15 137 140 66 39.0 4e-10 MKKVFLKKTINKETILYLFFGVLATLLNIILFYFFVTQLKMSTFWGNLLDTVICILFQYF TNRIWVFKSENKGKNALKEFFQFLFTRSITAIIDQIFVVVGVDFFVRKFIIQTQQSIWSL GIKILSNIIVIVLNYIFSKFFVFNKK >gi|261748816|gb|ADAD01000002.1| GENE 4 2184 - 3317 1167 377 aa, chain + ## HITS:1 COG:no KEGG:bpr_I0518 NR:ns ## KEGG: bpr_I0518 # Name: not_defined # Def: FAD dependent oxidoreductase # Organism: B.proteoclasticus # Pathway: not_defined # 6 377 3 378 378 344 46.0 4e-93 
MNIETQKYDKIIIGAGIYGMYAARRILQENLNTKVLIIETEKTYFNRGSYVNQARLHNGY HYPRSYSTASKSVKYFDRFYNDFKEGINDSFEKIYAIASDYSWANGEQFQKFCDNLNVLC EEIPKKKYFNEYTIDKAFLTKEFSFDAKIIGDKLYNDLISLNCEIHFNTKIISIEKKENK YIIKTKEGKIYETPFILNATYAGINKIHDLLGFEYLPIKYEFCEVILCEVSENIKNVGLT VMDGPFFSLMPFGLTGYHSITTVSRTPHFTSYEDISPYDCGGNGKLQQSEEHKKGCIHCG IYPETAFLEMVQTAKKYLNEDISIKYVKSLFTIKPIMVASEIDDSRPTIIKQYSENPDFY TVFSGKINTMYDLDEIL >gi|261748816|gb|ADAD01000002.1| GENE 5 3342 - 4250 773 302 aa, chain + ## HITS:1 COG:BS_ykcC KEGG:ns NR:ns ## COG: BS_ykcC COG0463 # Protein_GI_number: 16078354 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 9 295 4 304 323 66 27.0 6e-11 MKKNLEQNYISIVLVINEYKDEIGKKIDKIDKVLKNYFKNSEIVIVDNALRNKNIKKLIN EDIKYTLIKLPIKHGSQQALNAGTAIAIGDYIVEVEDISVEINFEMIIEMYKKSQEGYDF VFLTPKKIKATSKMFYNILNKNFKNIFNTDISSSIMTLSSRRGQNKVSEVGKKVINRNVA YVLSGLKSSSIAIDINYRNRRTFSDNLMLMFDTLIYYTDSIMVMSQRIAFFFFLLFGIGV FYSLIMRFFSNSVEGWASTFILISLGFGGIFLILSIIIRYLHHILRNSINTKDYIYRSVD RK >gi|261748816|gb|ADAD01000002.1| GENE 6 4287 - 5123 721 278 aa, chain + ## HITS:1 COG:PA1130 KEGG:ns NR:ns ## COG: PA1130 COG1216 # Protein_GI_number: 15596327 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Pseudomonas aeruginosa # 7 275 10 297 325 89 24.0 8e-18 MKIAGTVIWYNPGEEHVENIKSYIDYIDELYIIDNSDKDNKLLAERLNDSKIEYFYNGRN LGIAKALNLGCEKAFRNDYTWILTMDQDSSFTFENIKNYFEDFDAIQNNSIGIISPYHVL KNDIIKTDEKESFTEIDNVMTSGNLLNLRIWEKVGKFDESLFIDEVDSDICYKIIEKGYK IIQLNKIKMFHELGKLEKRNFFMKKISVFNHNYIRKYYIIRNKCYMWKRYKKYRKRYSYY ILNDFFKVVFYEKDKLRKLKYMFKGIKDFFKNKMGEVN >gi|261748816|gb|ADAD01000002.1| GENE 7 5199 - 6380 1762 393 aa, chain + ## HITS:1 COG:TM0024 KEGG:ns NR:ns ## COG: TM0024 COG2273 # Protein_GI_number: 15642799 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucanase/Beta-glucan synthetase # Organism: Thermotoga maritima # 114 391 193 457 642 134 30.0 2e-31 
MKKILLLTAISILLTGSLGYAAAKKSNTGIKKSTTKSTTKVNTTKPNAAKTNTTKTSTTK TKPNAIKPNTTETDTTKINPIPEGFVKEERTVGPDTRKLVTEDQNQTQTQENAVNSEKNV TTNKKTKERKAEWQVVWSDEFDGNSLDQSKWSYWENGTPWKSGNYLDENGNLVDEYGFKA KQYYLRDNVKVEDGNLVITVKKEDNRTVKINGKDRRILYSSGGIHTKDKYTVQEGKIEMR AAMPEGIGVWPAFWTWPADYSQAVGTPAKEEIDIFEIYGDNLKRVTGTAHALKADNTYES FTGSNLRIRKSEDLTKFNTYAVEWNDKEIKWLFNGRVYKKVSMKKVAKFSENTFKLPHFL MINVAVQEKAGEDGNVKFPTELKVDYVRVYKKQ >gi|261748816|gb|ADAD01000002.1| GENE 8 6457 - 7665 1617 402 aa, chain + ## HITS:1 COG:FN1667 KEGG:ns NR:ns ## COG: FN1667 COG1088 # Protein_GI_number: 19704988 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Fusobacterium nucleatum # 1 402 1 399 399 681 86.0 0 MKTYLVTGAAGFIGANYLKCILKKYEGKKDIKVIVVDSLTYAGNLGTIKEELEDKRVKFE KVDIRDRKEIERVFAENNVDYVVNFAAESHVDRSIENPQIFLETNILGAQNLLENAKKAW TVGKDENGYPVYKDGVKYLQVATDEVYGSLSKDYDDAVDLVIEDEDVKKVVKNRTNLKTY GDKFFTEKTPLDPRSPYSASKAGADHIVIAYGETYKMPISITRCSNNYGPYHFPEKLIPL MIKNVLEGKKLPVYGKGDNVRDWLYVEDHCKGIDLVLREGKEGEIYNIGGFNEEKNINIV KLVIDILKEEIESNEEYKKVLKTDINNINYDLITYVQDRLGHDMRYAIDPSKIAKDLGWY PETDFETGIRKTIKWYLENQNWVNEVVSGDYQKYYDKMYGGK >gi|261748816|gb|ADAD01000002.1| GENE 9 7711 - 8283 881 190 aa, chain + ## HITS:1 COG:CAC2331 KEGG:ns NR:ns ## COG: CAC2331 COG1898 # Protein_GI_number: 15895598 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Clostridium acetobutylicum # 1 178 1 174 185 217 62.0 9e-57 MNNFTIKKTPIKDLVIIEPKVFGDSRGFFMETYNQKSFEELGLTMNFVQDNHSKSKKGVL RGLHFQTKHTQGKLVRVIKGRVFDVAVDLRKESETYGQWYGVELSEENKLMFYVPERFAH GFLTLDEDTEFVYRCTDLYAPEYDSGILWNDKTLNIDWKFEEFGINPEELTISEKDQKQQ NFQPEKNYFE >gi|261748816|gb|ADAD01000002.1| GENE 10 8496 - 8711 450 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037159|ref|ZP_06010651.1| ## NR: gi|262037159|ref|ZP_06010651.1| hypothetical protein HMPREF0554_1969 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1969 [Leptotrichia goodfellowii F0264] # 
1 71 1 71 71 147 100.0 3e-34 MTFSEAFTAIREDGGKAMRLPKWNNDVRVKVQNLDGTLCNTHPYLYVESRFGRVPWIPNY VEMFSKNWEIV >gi|261748816|gb|ADAD01000002.1| GENE 11 8892 - 9620 677 242 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037135|ref|ZP_06010627.1| ## NR: gi|262037135|ref|ZP_06010627.1| putative membrane protein [Leptotrichia goodfellowii F0264] putative membrane protein [Leptotrichia goodfellowii F0264] # 1 242 1 242 242 336 100.0 1e-90 MASKYQDIGVYLSWALKVTKKIFREEIKWIFYLFVLILVSSLSNYIPITMEQKIFSWIKS GIIAVIFNTSFILFYRKVIFKIEGKENIELKKVFIKAVILGIIETAIYSLTLNSYIEVDS GIISLFLMVAALFFIFGFLYFKPLYISRNIGFEEAINYNFRLSKGNKMRMFGPIFVTRLL FGIITLVISEVIENLLEIGIISIIVIMLISALVDVLFKVFSIVLESIIYLNVEYLYKEED IN >gi|261748816|gb|ADAD01000002.1| GENE 12 9824 - 10597 500 257 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037157|ref|ZP_06010649.1| ## NR: gi|262037157|ref|ZP_06010649.1| putative ATP synthase delta chain [Leptotrichia goodfellowii F0264] putative ATP synthase delta chain [Leptotrichia goodfellowii F0264] # 1 257 15 271 271 421 100.0 1e-116 MRNLKKILESRYLNIREYYGLALKLTEKIFKENRIWIFYFLVLAFAASVDDVFTLPVGLK NTEWIISMSLVLILSVSYILFYRKVVYKIEGKEKSEIKKVFFKAVIWGIVEVSAINLYTN DYFKNKIPDIIPFLAGIMCVVIYFSFLYFKVLYISRNIRLKDTIEYSFYLGKGNRMRMFF PLFLSEILFLQIYLWLDFLLKASVENKILILLGAFILTVFQTIFKILNVALEDVIYLNVE YMDRKKMSNIEDVRGQQ >gi|261748816|gb|ADAD01000002.1| GENE 13 10693 - 11598 945 301 aa, chain + ## HITS:1 COG:RSc0688 KEGG:ns NR:ns ## COG: RSc0688 COG1216 # Protein_GI_number: 17545407 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Ralstonia solanacearum # 1 292 1 263 275 194 35.0 2e-49 MYDITAGIVVYNTKREDLKKAVSSFLNTGLNVKLRISDNSPTDKFLQELEDFLKECEKDY GFKNIKDKVEYIFNNNNGGYGWGHNKIIEKLTDNPEKSAESRKGKMISKYHLILNPDIYF EKGTLEKLFDYMENNPETGQIMPKVIYPDGELQYLCKLIPTPVNLVFRRFLPIKKIKEKL DYDYEMKWSGYDKIMEVPILSGCFMFFRTEKLLQLKGFDERYFMYLEDYDLSRRMNEISK TVFYPYAEIVHNHAKESYKSRKMTVIHMKSAIKYFNKWGWIFDRKRKETNKKIKEKFFEN K >gi|261748816|gb|ADAD01000002.1| GENE 14 11609 - 12625 1320 338 aa, chain + ## HITS:1 
COG:FN0544 KEGG:ns NR:ns ## COG: FN0544 COG0859 # Protein_GI_number: 19703879 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Fusobacterium nucleatum # 1 329 1 326 342 254 45.0 1e-67 MKILIIHTAFIGDIVLSTPLIQKIKDKYPDSQIDYMTLSGNKSIISNNPNLHDIIIYDKK DKNKGIKGFFRILKAVKNNRYDLAIIPHRFIRSILLAKMAGIKKTVGFDVATGSFLLHEK KHYDMSKHEVERLLDLIDYKGERVPLGIYPSKEDMDKIDGLLKGKYYKNLITIAPGSQRP EKIWPINKYDELIKRLSENKENLIVVTGGKAEKTLDLKSVSTENVTDLRGEISLLEFAAL LSKSDIIISNDSAPIHIGSAFEKPFVIGIFGPGKKSLGFFPWTEKSNVIEDNTFFENNIA TKYKGKYEYSKDYFKGIPEITVDRVYEEVVKRLEEKEK >gi|261748816|gb|ADAD01000002.1| GENE 15 12629 - 13657 1287 342 aa, chain + ## HITS:1 COG:BS_yveT KEGG:ns NR:ns ## COG: BS_yveT COG0463 # Protein_GI_number: 16080481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 3 223 5 222 344 117 35.0 5e-26 MKLSIIVPMYNVEKYLRQCLDSIYKLDKIEKEVILVNDGSKDETLNIAKEYGEKYKEITT VIDQPNSGVSITRNKGLEKAVGDYIYFFDGDDFLNVDIFQEEVLKLFSENGETDILYGNR ILYFNEKRLEMTYEIPQEMENKVWTGPEYMMRALKEKFWNVQVGTAIYKRELLIRNNIDF PEKRIHEDELFTIKVLNAAKKVRAVNKIFFYYMQREGSIMKSPSIEHSTDVFKNAKDLIE IFKDEKDEKLKEVMFNRIKKYYLETMKYALRQKNKEVYKEIHKEFKKDCKNYFFKMKKTS KDIELYLICYFDKIYYLVRNGTKEYRDKKHYERERKRLENSK >gi|261748816|gb|ADAD01000002.1| GENE 16 13687 - 14655 1177 322 aa, chain + ## HITS:1 COG:YPO0187 KEGG:ns NR:ns ## COG: YPO0187 COG0463 # Protein_GI_number: 16120528 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Yersinia pestis # 3 230 6 227 329 130 32.0 4e-30 MKLSIIVPVYNVEKYIEKCLNSLCNPEIEKEIIIVNDGSTDSSLKIVEEFKKNHDKENII IISQENKGLPEARNTGLRKAKGEYVSFIDSDDFVDKKLYEIMVKEVIKDRVDYGIGKYSY FYENKDKKEYNDSEIKDIIDSFQSNSVKKGKELLKILKKEDKYGPEVWDDIYRREFLVEN NLFFKPDRLHEDEIFTLEVFLKAEKVKYYAVPFYCYLQRENSIMKTQKVKNFTDMQKNIF DMEKLLTEEKNDKDIQKILIEGIHRFYKIIIKKSKMYPDENKKFIEDYKIFSKKYKENKI KNMFLRLKKSIKKKLKKFSKEK 
>gi|261748816|gb|ADAD01000002.1| GENE 17 14670 - 15683 754 337 aa, chain + ## HITS:1 COG:no KEGG:D11S_0428 NR:ns ## KEGG: D11S_0428 # Name: not_defined # Def: capsular polysaccharide synthesis # Organism: A.actinomycetemcomitans # Pathway: not_defined # 12 336 11 334 334 312 48.0 1e-83 MEEKRKIQGKKLFSSNSDYFNFIAKMKKKPWKCVYTSVIPKSLRKKIEKEADFSQQEYIN ECWKREIDEYFNGNTEKFVLKPKKDLSEKKIIWQYWGQGWEYDKLPDIVKLCFQSVKKYK GDYTIIRLDNENISEYIDFPEFIKEKTDKGKIKYVFLSDLLRLALLDVYGGVWLDATILL TDYFPERFEKADYFVFQRSEEVQDRKKWEKFNNGYFSWDKRQKVKMLNSIIFAKKGNKVI HTLLGLMLIYWEKYEEVYHYFFFQILYDIIMSSYLSGYKCEIVDDTLPHLLIEKFENRFS QKELDKIFEKSGLHKLTYNKVYKPDSFYSFFKEKFKM >gi|261748816|gb|ADAD01000002.1| GENE 18 15752 - 16603 844 283 aa, chain + ## HITS:1 COG:no KEGG:D11S_0427 NR:ns ## KEGG: D11S_0427 # Name: not_defined # Def: hypothetical protein # Organism: A.actinomycetemcomitans # Pathway: not_defined # 3 283 8 287 287 272 50.0 2e-71 MTESISIVTAFFDIGRGDIPKDKGYPIYTHRTTETYFEYFSNLAKLENEMIIFTSEEYKD KILKIRKEKPTKIIIIDLKKKFHRQPGKIKEVLGNDKFRSRVNKDMLLNIEYWSSEYVLV NNLKTFFVNYVIRKNLISNDIISWIDFGYVRDVDTLNNVKNWYYPFDKDKIHFFTIRKNY PLKKIKDVYDMIFNNKVFVIGGAIAGEKEEWKKFYKLQKQCQNDLLKQNISDDDQGVYFM CLFKNPNLFRLNYLGKDKWFSLFRKYDRTSKISLTEKLRDIFV >gi|261748816|gb|ADAD01000002.1| GENE 19 16654 - 17418 745 254 aa, chain + ## HITS:1 COG:FN1241 KEGG:ns NR:ns ## COG: FN1241 COG3774 # Protein_GI_number: 19704576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannosyltransferase OCH1 and related enzymes # Organism: Fusobacterium nucleatum # 4 254 2 241 243 202 45.0 6e-52 MGKIPKKLHYVWIGNGEKPEVFYKCLKSWQEKMPDYEIVEINESNFDLNFHLGKNRFLSE CYNRKLWAYVADYARVHYLYKTGGIYMDIDMEIVKDFSELTQNENIDFFTGFESDDGIGM GLFGVSPQSKFLKEIMDFYEDEIWKSSLFTIPQVTKHILKSKFSCDLSKKEVKDDKNGIY IYPKESFYPFLPHEKFSESMITDKTYAIHWWNHSWKGSKPFLFLKTKHLKGIKKYLKKLG IYFQIIRDDLRNMK >gi|261748816|gb|ADAD01000002.1| GENE 20 17439 - 18203 706 254 aa, chain + ## HITS:1 COG:FN0542 KEGG:ns NR:ns ## COG: FN0542 COG0463 # Protein_GI_number: 19703877 # Func_class: M Cell wall/membrane/envelope 
biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Fusobacterium nucleatum # 1 221 5 222 263 181 43.0 1e-45 MKLSVGIITFNEENRIGKTLDSVSDIADEIIIVDSESTDRTADIAESKGAKVFVEKWKGY GPQKNSVLEKCNGKWILLIDADEVISKKLKEKINQIINDTSENIPDIYKIKLRNICFGKE IRHGGWDDYVIRLWKKGKVKISDREVHEQYMTDGHKVEKINELIIHYTYDNIEQFLEKLN RYTSQSAEQYMKEHKKAGILKIYFKMLYRFIKMYILQLGFFRRLRRIFVGKIQLCLYDDK IHKTERKVFSEFGK >gi|261748816|gb|ADAD01000002.1| GENE 21 18124 - 18999 935 291 aa, chain + ## HITS:1 COG:Cj1135 KEGG:ns NR:ns ## COG: Cj1135 COG0463 # Protein_GI_number: 15792460 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Campylobacter jejuni # 9 278 240 511 515 206 44.0 5e-53 MAKYSSVYTMTKYTKLREKYFQNLGNNTSLIVTTYNWPKALEVCMESVLRQTVIPKEIII ADDGSKQETADLIKKMQKSNPNIKIIHSWQEDKGFRLSMSRNKAISKATGEYIIIIDGDL LLERHFIQDHIENKEKGYFIQGSRVIMSEDKSKEIFKGKLPEMPLALIEKGFKNKANMIR NTLLSKMFVKKDKKLSGIRGCNMSFFKEDLVKVNGFEEEIQGWGREDSEIAVRLFNNGIS KKRLKFKALTYHIYHNENDRSKLKENDEFLEKVIKSGKKRAEKGLNSHERS >gi|261748816|gb|ADAD01000002.1| GENE 22 18986 - 19717 896 243 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2266 NR:ns ## KEGG: Lebu_2266 # Name: not_defined # Def: glycosyl transferase family 2 # Organism: L.buccalis # Pathway: not_defined # 1 242 1 244 244 296 66.0 7e-79 MKGVSLVITSCGRFDLLEKTLDSFFEENTYPIKKVIITEDSTEGNKLKKLVSKYKNQNFD LIINETRLGQLKCIDKAYKKVDTEYVFHCEDDWLFLKEGFIEKSLEVIEDNPKIAIVGLR PREDCTEIPLCDEPYFSKSGVEFFEIRDHVFTYNPGLRRKDVCDLFGSHEKLEGTLWEDE LCKFYKERGYRMVSFAERYVEHIGNKRHVHFSKRGKNSVLDFKIDRMIKKIRYNLLKMFG KIK >gi|261748816|gb|ADAD01000002.1| GENE 23 19764 - 20507 924 247 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2266 NR:ns ## KEGG: Lebu_2266 # Name: not_defined # Def: glycosyl transferase family 2 # Organism: L.buccalis # Pathway: not_defined # 1 246 1 244 244 288 64.0 2e-76 MKEVSLVITSCGRFDLLKRTLDSFFEKNTYPIKEIIITEDSTEGKKLRKLIAEYETGENK PNFNLIINEVREGQLKCIDKAYNEVKTEYVFHCEDDWVFLKGGFIEKSLSLLEENPHLLL 
VGLRAKEDFREDFFKDKEYVAGNGETYYEVKDEIFTYNPGLRKKEVCDLFGSHERLKDQL YEMALSEFYKKKDYRTVYLKEKYVEHIGDKRHVHFRRKGENSILNFKMDRMIKKVRYTVL KLLGKVK >gi|261748816|gb|ADAD01000002.1| GENE 24 20534 - 21559 1172 341 aa, chain + ## HITS:1 COG:FN0541 KEGG:ns NR:ns ## COG: FN0541 COG0726 # Protein_GI_number: 19703876 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Fusobacterium nucleatum # 2 310 5 321 351 134 31.0 4e-31 MFKSKSLVCMMYHNVFLEKTEGIICKDEFEKHMSYVKDKKSFKMEEVEKLNFRLPKNSML VTFDDGYKNTYTVAYPILKKYNIKATIFLNTKYINNDDAYLTWDEIREMYNSGLVDFQMH THSHCPVIRKLQVKGFFDENVNEFVRRESLSIYKNGLIGKEREKEIYFKDFDFEGLPIFK VRSQIAIKGMKLKEGFIEKYREIEKNKEFQNMSVNERKKYLNNIFSKRKNEFFEEYTQEE FDKRVEFEIKENKKQIEENLDKKVSFLAYPWGHRYTGDIKKLEDLGVKNFVLTTETVNNR NMNNKKICRINGDDFKEYDKFLKEMKLSENYYLALLVSKFR >gi|261748816|gb|ADAD01000002.1| GENE 25 21594 - 23342 270 582 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 337 565 15 249 329 108 34 4e-23 MKLLKNEDIETVKKLKKYIKKYYIIIIFNMLLSMVSATVSASPLVLVKRLVDKGILGSSE KDILYAAGGMIILAIIGGVLIYWNGILSVVISSSIYKNIIDDLYVKIQELDMEYFSRTKI GELMTKVLNDPSNINSLIIESFNLFSEAFTAIICLGVAIYIDWKLTLGVLIIAPILLVTV KKYSKKLKSSGKARQEATGTLNSKLQETLSGIRVIRAFATEKEETRDFKKKSLELKKVAL KSAKYTSKSSSISVALNYIMVAILLMFGGYRVLRGNHFTTGDFITIVGAISSMYTPVRRA ISRYNEISTNISSIGRVFEILDEEPEITDKPNCIRFEEFTKDISFENVGFHYKDSDEKIL KNINLIAKKGETVALVGNSGGGKSTLVNLIPRFFDVSEGELKIDGINIKDYQIMSLRKKI GIVPQETFLFGGTVFENIKYGNQDAAKEEVIEAAKKANAHDFIENLENGYETEIGERGVK LSGGQKQRISIARAILKNPKILILDEATSALDNESEQLVQDALEKLMKGKTTFVIAHRLS TIINSDKIVVIQQGEIKEIGTHDELIEKAGIYESLYKKSFKN >gi|261748816|gb|ADAD01000002.1| GENE 26 23468 - 23899 631 143 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037142|ref|ZP_06010634.1| ## NR: gi|262037142|ref|ZP_06010634.1| hypothetical protein HMPREF0554_1985 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1985 [Leptotrichia goodfellowii F0264] # 1 143 1 143 143 114 100.0 3e-24 
MRNLKKIILFLGAVSLVAFGATNFKKTEKKVKAENETTTTVITNTEENKKEETPSSSENS ESKENEDKNQNGTKVIKKSDTKKEEKAVKKSDKKTEGKQTVQNETKTVKEEVKKSETEET QKKPAVNKTNTEKKEKERASSSK >gi|261748816|gb|ADAD01000002.1| GENE 27 23907 - 24908 998 333 aa, chain + ## HITS:1 COG:FN2031 KEGG:ns NR:ns ## COG: FN2031 COG1477 # Protein_GI_number: 19705322 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Fusobacterium nucleatum # 8 290 13 279 320 122 29.0 1e-27 MEKLYKVQTRFLFHSNIKIKIPDKYEDNIFDELFSVLEKVNKNYNSYSQGSFVDRINKNA GNFVKVDDETIKMLEKIIYFSDILRGEYDITIMPLIRLWGFYKENPETVPDKKEIEKIKQ LVNYKKIEIDKINNKVKIEAGQEIVTGSFIKSYAIDSLGKKMREKGITDAVINAGGSSIL SINDKENQSLGIIVENPENEKEIEKDKNGYPVKITDRKYKGKDEYNDLFDIEISNEAYST SNQINTYIEINGEKYGHIISPKTGYPSKNKQIGIITEDAFIGDIISTGLFNQTPEIFSEI IKELSKEMKIEGYLITFDGKIHYSEKFLNYMDI >gi|261748816|gb|ADAD01000002.1| GENE 28 24949 - 26274 1640 441 aa, chain + ## HITS:1 COG:FN1596 KEGG:ns NR:ns ## COG: FN1596 COG4656 # Protein_GI_number: 19704917 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Fusobacterium nucleatum # 18 432 25 437 441 305 42.0 1e-82 MKNIVNEKMKIDSMKHVTKKMNLAIMEDPEYFYVPLSQHIGQISSEIVKKGDYVKQYQKI GEIQGNVSAAVHSPISGVVEEIIENPVVNGNKVKTVKIRNDFKYNSENMERRAMESLSEY TKREILDVIKESGIVGEGGAQFPTYVKYDIGDKVVDTFILNGSECEPYLTADYTIMNEWT EQFFDGIKIVEKLLNPKKTVIGIEEENRELVERFSEVSRKKNMTNIEICVLPTAYPQGSE LQLIKTITGKEIKKGNLPVNYGVVVSNVGTVKSVYDAFIEGKPLIERVVTISGEKTEEKG NYLIKTGTPLGHIIEKINPEKDAKIIFGGPMMGEEVKDMNVPVIKGTSGILFLSGNIDNI ERNNCISCGYCVDACPMGLMPMKFEEMYRKGKYKKLVKLNLDMCIECGACEYSCPSRVPL IKSIKEGKGMLREMRAEGGIK >gi|261748816|gb|ADAD01000002.1| GENE 29 26274 - 27368 1114 364 aa, chain + ## HITS:1 COG:FN1595 KEGG:ns NR:ns ## COG: FN1595 COG4658 # Protein_GI_number: 19704916 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfD # Organism: Fusobacterium nucleatum # 12 308 11 308 314 158 34.0 2e-38 
MEKVLNLKTYTPHIRVNDSVTKVMSDVVLALLPSIVVSYIVYGITPVLVILTSVMSALLS EWIFSAVFFKKHNSVNDVSGIVTGILLGLTLAPFTPLYVVAFGASMAVIFGKLIYGGLGR NVFNPALVGREFMTVFFPTVMSSGAIWFNESALKISEIKIFKYFGDTPLFNYLDKTILNS SGAIGEFSIFFLVIGGIYLLMKDRISWHIPVSMFVVIFSGLYLMSVIGVDVNLSIGGLML GGIFMATDMPSSPSNNAGKIYYGAMIGIAVIICWSQNIKFEYLSYSILTLNAFAEPVNYI FRPKVFGGNGVLGGRILKGTGLTLIIVFTVFLLTVLHNAGFIPYLLFIYIFYTVIKLILS KEIK >gi|261748816|gb|ADAD01000002.1| GENE 30 27428 - 30487 3822 1019 aa, chain + ## HITS:1 COG:FN0019 KEGG:ns NR:ns ## COG: FN0019 COG1197 # Protein_GI_number: 19703371 # Func_class: L Replication, recombination and repair; K Transcription # Function: Transcription-repair coupling factor (superfamily II helicase) # Organism: Fusobacterium nucleatum # 20 950 3 926 981 735 47.0 0 MNNLKFLKEIDGNLLKSENKIYRGAIPWILLSSEDKKIIYVSTSNRNLENYYDMLENYKT DGKNISMFENISSSKEDLTGINIELLDILRNRDNFILFLNLQITLDVFFEKVNLLNFKIG NTYKFSDITEFLTENGYTQSYLIEKKGEYSKRGDIIDIFPPNAENPVRLEFFDDELESIR EFDINSQRSVDKKEEVKIFGNVLSGNNYELVELIEELKKDDVIIVMENEELLNYKMEEFI LIDREKENTYRKRYENLKRKSMILETVNFTEEQLETFKERDKLGKLSEKKSVQIYTKNYE KKLREYKDFEKIDIIGYELFEGFITKDTFVLTDREIDGYIYEKKRKNNKAVKYKKVNQIL IDDYVIHVQYGVGIYKGIETMEERDYLKIKYADEDILYIPVEKLDRLEKYISYGEEPKLY KLGTRGFKKKRQKLAEDIEKFAAELIKIQAQRQSQNGFVYGKDTVWQEEFEAQFPFEETE DQKKAINDVKKDMEGPYIMDRIVCGDVGYGKTEVAMRAAFKAIENSRQVALIAPTTVLAE QHYKRFMQRYENYPVTIENLSRLTQSKSKEILKNMKNGVTDLVIGTHRLLSEDVEFNNLG LLIIDEEQKFGVKAKETIKKKRQKVDVLTLTATPIPRTLNMALLGIREISVIDTPPTNRL PIITEVEDWEEEVVKKAILKELSRDGQVFYIYNDVKSMKYKIEELRKILPDFVKIEFIHG QLLPKEIKDKIRRFEQGEFDILLASTIIENGIDIPNANTILIENFNALGLSQVYQLRGRV GRSNRQGYCYLLKTRTATKKGQKKEESMQKVEGIKSGGFQISMEDLKIRGAGEILGEKQH GTIETFGYDLYLKMLNEEIKKQKGEYREKTENVEIILKEKGFIPENYIEKEERLNIYKRF AILETFEELDELVNEIKDRFGKIPSEMKKFILNIKFKIFAETNGIQIIEEKIDSYRLSFV EDTSENIISDLEQEFEMKEVIQIPIFEDSNKIKGKEVTAQEMFIIKEVDKKELLKYLNK >gi|261748816|gb|ADAD01000002.1| GENE 31 30581 - 31819 1750 412 aa, chain + ## HITS:1 COG:YPO0986 KEGG:ns NR:ns ## COG: YPO0986 COG4099 # Protein_GI_number: 16121290 # Func_class: R General function prediction only 
# Function: Predicted peptidase # Organism: Yersinia pestis # 5 410 16 458 458 306 37.0 4e-83 MLCILVLSVVSFAETAVNPTKFPKEIKNITAVTEVFGTGQKLTGVIIEYNTAVKNDKLTK ETFEVTDRTITNVYANNKAEKTKSGKNGRFVIIELNPSDEKAQLFIKANNIQVKQAKDIS TVNGKIAKAYSETLTNNKIQNLVVDNFKQFEYKDPKTGTTVKYNLYIPKNYNKNKKYPMV MFIHDAGPLSENTETTLLQGNGATVWATPEEQAKHEAFVLAPQYSQKVVDDNGNYTTDLE ATVNLIRDYLVKNYSIDTNRLYTTGQSMGGMMSIVMNFKYPDLFAGSYFVACQWDASLTA PMAKNNMWTVVSTGDSKAFPGWNAIVEVLAQNGGIVAKDAWRGDYTAEQFKEGVSKVLAE NPKANIKYTTLEKGTLPALQAGNPGSEHMATWKTVYNIEGIRDWLFKQKKNK Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:24:47 2011 Seq name: gi|261748814|gb|ADAD01000003.1| Leptotrichia goodfellowii F0264 contig00090, whole genome shotgun sequence Length of sequence - 461 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 460 515 ## gi|262038222|ref|ZP_06011613.1| hypothetical protein HMPREF0554_0421 Predicted protein(s) >gi|261748814|gb|ADAD01000003.1| GENE 1 1 - 460 515 153 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038222|ref|ZP_06011613.1| ## NR: gi|262038222|ref|ZP_06011613.1| hypothetical protein HMPREF0554_0421 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0421 [Leptotrichia goodfellowii F0264] # 1 153 34 186 186 249 94.0 5e-65 NIVLNAKRYLKMLSTVDTEFKERTTSHKSRKSFGRSKTTTEHWVEDNVYANNVDLTTDGN VLMNFKGVDEKTGKYISTNNQGVIAQGVNFHAKGAIIGFSEGNIYVEGTKDKLNSVYNSH TTKKWFGVKYGKASDYVNDTTEKYRLSQLYNGS Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:24:56 2011 Seq name: gi|261748809|gb|ADAD01000004.1| Leptotrichia goodfellowii F0264 contig00071, whole genome shotgun sequence Length of sequence - 1469 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 43 - 516 585 ## FN2115 hypothetical protein 2 1 Op 2 . 
- CDS 513 - 926 602 ## gi|262037164|ref|ZP_06010654.1| hypothetical protein HMPREF0554_1182 3 1 Op 3 . - CDS 950 - 1108 77 ## gi|262037165|ref|ZP_06010655.1| hypothetical protein HMPREF0554_1183 4 1 Op 4 . - CDS 1109 - 1468 350 ## FN2115 hypothetical protein Predicted protein(s) >gi|261748809|gb|ADAD01000004.1| GENE 1 43 - 516 585 157 aa, chain - ## HITS:1 COG:no KEGG:FN2115 NR:ns ## KEGG: FN2115 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 38 155 33 150 151 82 38.0 4e-15 MMKRMVMIIMSIIMLSSCYYADQVFGDIRNENFNSLGRKKNGGGAYKDDKYKSGVYEAIK DVAKRPLNNKVQYEGITLVLPQNTSMNQEAGNIVDLKTGYGLPIGFTSYDGCSEVFYYKK IRGDLYYRLTYNEMIPGVEEIAQKIIRVNGFTKTCNK >gi|261748809|gb|ADAD01000004.1| GENE 2 513 - 926 602 137 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037164|ref|ZP_06010654.1| ## NR: gi|262037164|ref|ZP_06010654.1| hypothetical protein HMPREF0554_1182 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1182 [Leptotrichia goodfellowii F0264] # 1 137 1 137 137 234 100.0 1e-60 MKKWLNDGHGAYTYYSAQLAKDIDGLLRKYQNTGDETAKKKIQEDIENKYIENQERLIEL FTQGPALMNDSKGLLSGLGEYNRVENERAYREGKYQGKNLPQNYITNSLNESIREGKLGV MDINEYIESLRNGAGLR >gi|261748809|gb|ADAD01000004.1| GENE 3 950 - 1108 77 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037165|ref|ZP_06010655.1| ## NR: gi|262037165|ref|ZP_06010655.1| hypothetical protein HMPREF0554_1183 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1183 [Leptotrichia goodfellowii F0264] # 1 52 1 52 52 71 100.0 2e-11 MRIKSNIQISLRKNTGVSDEKSLSEQSEFMIFQTNKVFLRVEKFELEVLSTF >gi|261748809|gb|ADAD01000004.1| GENE 4 1109 - 1468 350 119 aa, chain - ## HITS:1 COG:no KEGG:FN2115 NR:ns ## KEGG: FN2115 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 6 118 33 150 151 80 38.0 2e-14 VNSQGRKKNGAGAYKEDRYKSGVYGAINDIVKRPIDKKVQFEGIALIIPENTEINSKTWN LVDTKTGYGIPISFYDQNGCIQKKIGDKIYSITYNDYISGVKQIGEKLMKINGFKNTCN Prediction of potential genes in microbial genomes Time: Thu 
Jul 14 02:25:15 2011 Seq name: gi|261748806|gb|ADAD01000005.1| Leptotrichia goodfellowii F0264 contig00118, whole genome shotgun sequence Length of sequence - 613 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 346 548 ## gi|262037169|ref|ZP_06010658.1| hypothetical protein HMPREF0554_1939 2 1 Op 2 . + CDS 356 - 611 331 ## gi|262037168|ref|ZP_06010657.1| glucagon isoform 2 Predicted protein(s) >gi|261748806|gb|ADAD01000005.1| GENE 1 2 - 346 548 114 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037169|ref|ZP_06010658.1| ## NR: gi|262037169|ref|ZP_06010658.1| hypothetical protein HMPREF0554_1939 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1939 [Leptotrichia goodfellowii F0264] # 1 114 1 114 114 97 100.0 2e-19 YRQQKEASRPVREEKVKEKKVLRRETAEGNVQEVTLDVEEEVVEEVKPKTPMEKLEYNAA KAKDRVDFYERVVRSVEREEKELNGYNEVIGKKKKVKKVVEKKVREPKKVNKKK >gi|261748806|gb|ADAD01000005.1| GENE 2 356 - 611 331 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037168|ref|ZP_06010657.1| ## NR: gi|262037168|ref|ZP_06010657.1| glucagon isoform 2 [Leptotrichia goodfellowii F0264] glucagon isoform 2 [Leptotrichia goodfellowii F0264] # 1 85 1 85 86 75 100.0 2e-12 MKVRKAIGLLAGMVLLMTQISYSDPAVDRLLREARKRQAEEAKQEKVEEVTVEETVPVTT ETPNSIQKRAAEIKRESQSKKSQEM Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:25:28 2011 Seq name: gi|261748804|gb|ADAD01000006.1| Leptotrichia goodfellowii F0264 contig00206, whole genome shotgun sequence Length of sequence - 444 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 443 651 ## Lebu_0671 autotransporter beta-domain protein Predicted protein(s) >gi|261748804|gb|ADAD01000006.1| GENE 1 2 - 443 651 147 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0671 NR:ns ## KEGG: Lebu_0671 # Name: not_defined # Def: autotransporter beta-domain protein # Organism: L.buccalis # Pathway: not_defined # 1 147 1294 1442 1550 194 67.0 1e-48 SNAYGVAYIHEDETIKLGNSSGWYAGAVNNRFRFKDIGRSKENTTMLKLGLFKSKAFDDN GSLNWTISGEGYIARSNMHRKFLVVDEIFEGKSDYTSYGAALKNEISKEFRTSERTSIKP YGSLKLEYGRFNTIKEKTGEVRLEVKG Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:25:40 2011 Seq name: gi|261748771|gb|ADAD01000007.1| Leptotrichia goodfellowii F0264 contig00102, whole genome shotgun sequence Length of sequence - 29021 bp Number of predicted genes - 32, with homology - 31 Number of transcription units - 13, operones - 8 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 447 145 ## Smon_0558 hypothetical protein + Term 512 - 538 -1.0 - Term 496 - 530 1.2 2 2 Op 1 . - CDS 660 - 860 360 ## Lebu_1802 preprotein translocase, SecG subunit 3 2 Op 2 . - CDS 904 - 1665 1265 ## COG0149 Triosephosphate isomerase - Prom 1703 - 1762 10.7 4 3 Op 1 . - CDS 1818 - 2633 1324 ## COG0561 Predicted hydrolases of the HAD superfamily 5 3 Op 2 . - CDS 2655 - 3176 492 ## CbC4_4170 hypothetical protein - Prom 3421 - 3480 9.9 + Prom 3424 - 3483 12.5 6 4 Tu 1 . + CDS 3547 - 4047 332 ## gi|262037197|ref|ZP_06010684.1| conserved hypothetical protein - Term 4330 - 4367 -0.8 7 5 Op 1 4/0.000 - CDS 4453 - 6036 2125 ## COG4468 Galactose-1-phosphate uridyltransferase 8 5 Op 2 1/0.000 - CDS 6054 - 7094 1539 ## COG1087 UDP-glucose 4-epimerase 9 5 Op 3 6/0.000 - CDS 7095 - 8147 442 ## PROTEIN SUPPORTED gi|15900011|ref|NP_344615.1| aldose 1-epimerase 10 5 Op 4 . 
- CDS 8144 - 9301 1677 ## COG0153 Galactokinase 11 5 Op 5 10/0.000 - CDS 9360 - 10388 1643 ## COG4211 ABC-type glucose/galactose transport system, permease component 12 5 Op 6 16/0.000 - CDS 10397 - 11911 192 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Prom 11933 - 11992 2.3 13 5 Op 7 1/0.000 - CDS 11996 - 13021 1867 ## COG1879 ABC-type sugar transport system, periplasmic component 14 5 Op 8 . - CDS 13067 - 14263 319 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase - Prom 14382 - 14441 10.5 15 6 Tu 1 . - CDS 14791 - 15522 586 ## FN0953 hypothetical protein - Prom 15548 - 15607 2.9 16 7 Tu 1 . - CDS 15624 - 16862 1208 ## COG4277 Predicted DNA-binding protein with the Helix-hairpin-helix motif + Prom 16954 - 17013 14.8 17 8 Op 1 6/0.000 + CDS 17180 - 17863 813 ## COG3819 Predicted membrane protein 18 8 Op 2 3/0.000 + CDS 17863 - 18804 1281 ## COG3817 Predicted membrane protein 19 8 Op 3 . + CDS 18824 - 19465 868 ## COG2039 Pyrrolidone-carboxylate peptidase (N-terminal pyroglutamyl peptidase) 20 8 Op 4 . + CDS 19494 - 19928 476 ## gi|262037200|ref|ZP_06010687.1| BtrG family protein + Term 19936 - 19972 3.2 - Term 19666 - 19701 -0.6 21 9 Op 1 5/0.000 - CDS 19887 - 20546 483 ## COG0534 Na+-driven multidrug efflux pump 22 9 Op 2 . - CDS 20592 - 21008 575 ## COG0534 Na+-driven multidrug efflux pump - Prom 21093 - 21152 11.9 + Prom 20959 - 21018 8.6 23 10 Op 1 1/0.000 + CDS 21250 - 21606 322 ## COG0735 Fe2+/Zn2+ uptake regulation proteins 24 10 Op 2 25/0.000 + CDS 21672 - 22628 1484 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin 25 10 Op 3 42/0.000 + CDS 22655 - 23347 244 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 26 10 Op 4 . + CDS 23368 - 24201 674 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components + Prom 24205 - 24264 5.7 27 11 Tu 1 . 
+ CDS 24284 - 24721 757 ## COG0783 DNA-binding ferritin-like protein (oxidative damage protectant) + Term 24748 - 24792 -0.7 - Term 24694 - 24740 1.1 28 12 Op 1 . - CDS 24846 - 25853 1366 ## COG1304 L-lactate dehydrogenase (FMN-dependent) and related alpha-hydroxy acid dehydrogenases 29 12 Op 2 . - CDS 25919 - 26851 601 ## COG4823 Abortive infection bacteriophage resistance protein - Prom 26890 - 26949 7.7 30 13 Op 1 5/0.000 - CDS 26994 - 28010 1420 ## COG1577 Mevalonate kinase 31 13 Op 2 . - CDS 27994 - 28935 1046 ## COG3407 Mevalonate pyrophosphate decarboxylase 32 13 Op 3 . - CDS 28917 - 29021 124 ## Predicted protein(s) >gi|261748771|gb|ADAD01000007.1| GENE 1 1 - 447 145 148 aa, chain + ## HITS:1 COG:no KEGG:Smon_0558 NR:ns ## KEGG: Smon_0558 # Name: not_defined # Def: hypothetical protein # Organism: S.moniliformis # Pathway: not_defined # 4 148 258 406 407 113 42.0 3e-24 RSGDVNNTKGIIKYLGRYLARSPIAEYKITDITDNEVTFFYNDLANDKQKTFITMPIQKF ISQILIHVPPKNFKMVNRYGLYARHISNKLKRAVIPFKKNIVPNKFSFYQRQTFKTFGIN PFYCPICNIRMIVWEFYHYLYPKPKRYY >gi|261748771|gb|ADAD01000007.1| GENE 2 660 - 860 360 66 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1802 NR:ns ## KEGG: Lebu_1802 # Name: not_defined # Def: preprotein translocase, SecG subunit # Organism: L.buccalis # Pathway: Protein export [PATH:lba03060]; Bacterial secretion system [PATH:lba03070] # 1 66 2 68 69 68 73.0 7e-11 MENLLVVALVALAVIMIVVILLQPDRSQGLAKNSNVLDQEKEGIEKFTEYIAAAFLIVAV LFQIIR >gi|261748771|gb|ADAD01000007.1| GENE 3 904 - 1665 1265 253 aa, chain - ## HITS:1 COG:FN1366 KEGG:ns NR:ns ## COG: FN1366 COG0149 # Protein_GI_number: 19704701 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate isomerase # Organism: Fusobacterium nucleatum # 1 251 1 251 251 298 61.0 9e-81 MRRVIVAGNWKMNKTAKEAAKFISELKPLVADVKKTGIVIGTPFTALESAVKEAAGSNVK IAAQNMNPNDNGAYTGEISPLMLKDLGVEYVILGHSERREYYGETDKFINEKVKAALKHG LKPILCVGEKLEERENGTTEKVVKEQTVGGLEGISAADMANVVIAYEPVWAIGTGKTASP 
AQAQEVHAFIRKLLTDLYGAEVAENVTVQYGGSMKADNAFELISQKDIDGGLVGGASLEA GSFAEIIKAGDSI >gi|261748771|gb|ADAD01000007.1| GENE 4 1818 - 2633 1324 271 aa, chain - ## HITS:1 COG:CAC0629 KEGG:ns NR:ns ## COG: CAC0629 COG0561 # Protein_GI_number: 15893917 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 3 270 2 266 268 151 35.0 1e-36 MDYKLIATDMDGTLLNDRHLISEGNVQAIKEVQKKGVKFVLASGRPSFAMLNYAKKLEMD KNEGYVLAFNGGQLINMSDGKVMFHEGLNKEDIEKVYNASKEIGLPMVLYAGDTVYANGN SEYVQFEVNQCEMKFVEFKSLEELYGYGIKETTKCMIIGNGESVKKAEKYMKSKYEKDYF IAISAPIFLEIANKNINKGKTLKKLGEITGIDTSEMIAVGDSYNDAPLLEVVGMPVAVEN AVPKIKEMSKFESTSNNNDALKTVIEEFFMK >gi|261748771|gb|ADAD01000007.1| GENE 5 2655 - 3176 492 173 aa, chain - ## HITS:1 COG:no KEGG:CbC4_4170 NR:ns ## KEGG: CbC4_4170 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_BKT015925 # Pathway: not_defined # 16 167 6 148 172 87 41.0 2e-16 MSKKNRLYFIRLTYEYLMYLKKIETRISEKIKRPLIGIVLEVDNKKYFAPLSSPKEKHKL MKENLDIVKIKKGELGIINLNNMIPVIDDKKYRTVIDLNILKKSKSEKERKYFRLLQEQL NYCEKNREFIERKAEKIYEISQKDYKELTKTEKKILSRSNNFKILEKAIEKII >gi|261748771|gb|ADAD01000007.1| GENE 6 3547 - 4047 332 166 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037197|ref|ZP_06010684.1| ## NR: gi|262037197|ref|ZP_06010684.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 166 1 166 166 210 100.0 2e-53 MKKIILLIFITLAIISFPATQNEIENDKKIIKSQTEFIIELFKSDSLNNYLKSYMNLQND DGSLTKEDIEKLSDFYSDYINSNKYEIKDVKITGKKATVYMSISIIDIDKYSSEIIGDIK KLATQKKTSDKRNSDEFSSEEFVKNFFNNFKENKCKKKNKRLKNRI >gi|261748771|gb|ADAD01000007.1| GENE 7 4453 - 6036 2125 527 aa, chain - ## HITS:1 COG:BH1109 KEGG:ns NR:ns ## COG: BH1109 COG4468 # Protein_GI_number: 15613672 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose-1-phosphate uridyltransferase # Organism: Bacillus halodurans # 3 527 6 508 508 476 48.0 1e-134 
MDIYAEIEKLVEYSVKNGFVNEEDRILITNLIAGAIELDTYREFSEEEKENIKKETENVA YPSEILDNIVKWAGVNGKLKSDTVTFQDLMNSKIMGQIVPRTSEVRKEFWDKYDNYGIDK ATEFFYGLSKKSNYIRMDRISKNIHWNYENNYGNLEITVNLSKPEKDPREIAMAKDKVSS NYPKGLLDKENEGYMGRIDHPGRQNLRTVRLNLTNEYWYFQYSPYTYYNEHSIVFSHEHR PMKIDKPTFQRLLEFVTIFPHYFIGSNADLPIVGGSILSHDHFQAGRHVFPMEKARIKEK IVFNGFEDVEAGILNWPLSVIRINGEKKRLVELADKILKKWIDYNEEKLDIYSHTKGERH NTITPIARFKDGKYELDLVLRNNLTNKSHPLGIFHPHDEHHNIKKENIGLIEVMGLAILP GRLKEETAQIKEIITKVRNSEEFKNTGNYEKCYEEMDKNESLKKHTKWLKHYLEGHKIKH LIEESPDVFIENAIGETFSRVLEDCGVFKNNEIGEKGFFKFIRKVNE >gi|261748771|gb|ADAD01000007.1| GENE 8 6054 - 7094 1539 346 aa, chain - ## HITS:1 COG:all4713 KEGG:ns NR:ns ## COG: all4713 COG1087 # Protein_GI_number: 17232205 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Nostoc sp. PCC 7120 # 23 344 7 330 332 360 53.0 3e-99 MENKEKNVKFYNIELEKGADEMNILVIGGAGYIGSHTVNLLKKSGYNPIIYDNLSKGYEQ VAEILGVKLIKGDLGDKKKLKEVFGKEKIDAVMHFAAFIEVGESVQKPSEYYDNNVAKVL KLLDQMVESGVKKFIFSSTAATFGEPKKEKIDETHIQFPINPYGKTKLTVEKILEDYDTA YGLKSTVLRYFNASGSDKDGLIGESHIPETHLIPLILQAASGKRESIKIFGNDYNTKDGT CIRDFVHVYDLGKAHILGMEKMFKENRSLNYNLGSGEGYSVKEVIEKVKEVTGKDFKVDE VKKRAGDPAVLVADSTKAEKELNWKPEYDLEEIIKSAWKWEMNRKY >gi|261748771|gb|ADAD01000007.1| GENE 9 7095 - 8147 442 350 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900011|ref|NP_344615.1| aldose 1-epimerase [Streptococcus pneumoniae TIGR4] # 6 345 6 341 345 174 29 4e-43 MINIEKKEFGMYKEEKIYKYELKGENGFQVNVLNLGGIITEIFVKNKEGKVKNVVLGYDN IEKYIDNKAYPGSVIGRTAGRTANASFYIDGTEYRLDKNSGKNSIHGGNQGFHTKIFDVK ELENGIELSYTSPDGEEGYPGNLEFKIFYLLDKNRLTLKYEAISDKKTYVNPTNHSYFNL AGNTERNGDEQMLKIEADNVCELDVDSVPTGRFINVENTAFDLRKGSVIKKGIEKGHPQF EITRAYDHPFVLNSVEKGKPQITLFSEYSGIEMKVYTIENTVVIYTGNYLDDMPLFAGEN DGNKNNRYLGVAVETQDFPNGINEKNFKTDPLNAGEKYHSKTVYEFNIAR >gi|261748771|gb|ADAD01000007.1| GENE 10 8144 - 9301 1677 385 aa, chain - ## HITS:1 COG:FN2107 KEGG:ns NR:ns ## COG: FN2107 COG0153 # Protein_GI_number: 19705397 # Func_class: G Carbohydrate transport and metabolism # 
Function: Galactokinase # Organism: Fusobacterium nucleatum # 3 381 4 384 389 485 67.0 1e-137 MLNLKEKFFEVFGENSEKEFFSPGRVNLIGEHTDYNGGNVFPCAIDRGTYALIKTRNDNR FRMYSENFEKIGIIEFSLDKLENEKAHDWANYPKGVIKMFIDAGYKINKGFDILFYGNIP NGAGLSSSASIEILTAVILKNIFGLNIDMIEMVKLGQKTENLFIGVNSGIMDQFAVGMGK KDYAVLLDCNTLKYEYVPVILKDEVIVISNTNKRRGLADSKYNERRGECETALKDLQEKL KIKALGELSIEEFEENKGLIKNEVNRKRAKHAVYENQRTIKAQKELSAGNLEEFGKLMNQ SHESLRDDYEVTGKELDTLVELAWKQDGVIGSRMTGAGFGGCTVSIVKKDKVDDFIKNVG KGYKEKIGYDADFYIVEVSEGPREL >gi|261748771|gb|ADAD01000007.1| GENE 11 9360 - 10388 1643 342 aa, chain - ## HITS:1 COG:VC1328 KEGG:ns NR:ns ## COG: VC1328 COG4211 # Protein_GI_number: 15641340 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type glucose/galactose transport system, permease component # Organism: Vibrio cholerae # 8 342 5 337 337 386 68.0 1e-107 MAERMDKGVKSFDLKELILNGGIYFVLILLLILIVIKDPTFLRLINIQNILTQSSVRIII ALGVAGIIVTQGTDLSVGRQVGMAALLSATLLQAVTNPNKVFKTLGVLPIPVVILIIVVL GAIFGLVNGLIVAKLDVVPFVTTMGTMVIVYGINSLYFDYVGASPVAGFDKRYSEVAQGA LFQFGQLRLSYLIIYAVIAIVFMWILWNKTVFGKNLFAVGGNPEAARVSGVNVAKTLILV YVLSGIMYALGGFLEAARIGSATNNLGFMYEMDAIAACVVGGVSFNGGVGKISGVVAGVI IFTLINYGLTYISVSPYWQYIIKGIIIITAVAIDVLKYRKNK >gi|261748771|gb|ADAD01000007.1| GENE 12 10397 - 11911 192 504 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 277 482 17 217 245 78 26 4e-14 MENSDKKHEYVLRMEGISKEFPGVKALDNVNLRIRPNSVHTLMGENGAGKSTLMKCLFGI YKKNAGEIYLEDKKIEFKNSKEALEKGVSMVHQELNQVIQRNVMDNIWLGRYPKKGLFID EEKMYKDTKEIFERLEIKIDPRTKVSNLSVSQMQMLEIAKAVSYDSKVLILDEPTSSLTE NEVKHLFHIIKKLQGSGIGIVYISHKMEEITEICDEITILRDGQWVTTEKVKDLTTDQII NLMVGRDLTNRFPDKTNKPSDVIMEVENLTAKRKNSIEDVSFTLHKGEILGIAGLVGSKR TDIVETIFGVMERKSGTIKIHGKTVNIRNPKEATANGLALITEERRSTGIFSMLDIKFNS IISNTRNYKSKIGLLNDKRIQKDTGWVIESMRVKTPSQRTHIGSLSGGNQQKVIIGRWLL TEPEILLMDEPTRGIDVGAKFEIYQLMVNLAKKDKGIIMISSEMPELLGVTDRILVMSNG KVAGIVKTSETTQEEILKLTAKYL >gi|261748771|gb|ADAD01000007.1| GENE 13 11996 - 13021 1867 341 aa, chain - ## 
HITS:1 COG:ECs3042 KEGG:ns NR:ns ## COG: ECs3042 COG1879 # Protein_GI_number: 15832296 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 38 341 27 331 332 402 69.0 1e-112 MKKILLVMLAMFMLLACGGEKKEEAKGGEPAKDGKKIKIGVTIYKYDDNFMATLRKDLEA FAKEDANVELIMNDSQNNQATQNEQVDTMIAKGVDVLAINLVDPAAGQTIVDKAKAANLP VVFFNKDPGEATLSSYDKAYYVGTKPEESGIIQGQLIEKNWKADAKEDLNGDGVIQYVLL KGEPGHPDAEARTTFVIKELNDKGIKTEKLQEDTGMWDAAQAKDKMDAWLSGPNASKIEV VISNNDGMALGALEALKAHQKNLPVYGVDALAEALTLIESGELKGTVLNDGKNQAKAVLE LARNLGNGKEPLDGTSWKFEGKSVRVPYIGVDKDNLAEFKK >gi|261748771|gb|ADAD01000007.1| GENE 14 13067 - 14263 319 398 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 82 385 3 317 319 127 26 8e-29 MQKIESKKIRKLNRTDYRILEKVMKDSEVSRTDLAKEMELTSAAISKAIKKLMLQNIIME HATLESTGGRPRKSLIVNKEYKKIIGVNLGAGFINIVASHLNGEIIQIRQRKFVYKTQEK VLDLLYEGISDMVEKFGEESIIGIGLATHGLVDRKKGTVIFSPHFKWKKLDIRRELERKF NLKVIVENDVRAMLTAEHMYGCAGNMKNFMLLYIRNGVGAAIFLNGKIFEGSNYSAGEVG HFIVNEKSTIQCRCGKFGCLETECSEQALINKVVWELEKKDKNESKEKITIDKIYSRSKH KEEPYYSIVKEAAYETGKVIGNILNILDIDNVVVAGDIIMTEKLYMNNFRKGVDRMILED FNKKVKIVSSALDDMIGIYGAISLVTSNLFVGEKLMKG >gi|261748771|gb|ADAD01000007.1| GENE 15 14791 - 15522 586 243 aa, chain - ## HITS:1 COG:no KEGG:FN0953 NR:ns ## KEGG: FN0953 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 242 1 243 243 322 68.0 1e-86 MPNYYYDGSFDGLLTVIFKAYKNRKEVIRVVAETEQLTFRTEDIHVLTDHYEARRVERTI CKRLSENFFKSIRLCFLSFENNKDTVIANTVYKALDSDEKIIHSADEHAFLMNKFVKRVL RERHRYLGVLRFREMKDGTLFSTIEPKNNVLPILLSHFEKRLGKEKFAIFDKKRKMIAYY NSEKFELFFVKTPEIEWSDEEQEYSSLWSTFHKSISIKERKNKKLQQSNLPKYYWKYLVE EMG >gi|261748771|gb|ADAD01000007.1| GENE 16 15624 - 16862 1208 412 aa, chain - ## HITS:1 COG:FN0954 KEGG:ns NR:ns ## COG: FN0954 COG4277 # Protein_GI_number: 19704289 # Func_class: R General function prediction only # Function: Predicted DNA-binding protein 
with the Helix-hairpin-helix motif # Organism: Fusobacterium nucleatum # 1 412 1 413 415 657 78.0 0 MNKPVEDKLRILSAAAKYDVSCSSSGSKRANKNNGLGNAAFSGICHSWSADGRCVSLLKI LMTNHCMYDCKYCVNRRSNDIERAILTPEEIVKLTVNFYRRNYIEGLFLSSGIIKSADYT MEQMIIVAKKLRLEENFNGYIHMKVIPGTSKELIREMGLYVDRVSVNIELAESKALKLLA PDKKSIDISTSMGLVHKNEVENKEEKKIFKSAPLYIPAGQTTQMIIGASGESDYKILNKS ENLYKNFDLKRVYYSAYVPVNKSGILANVDAAPMIREHRLYQADWLLRFYKFKAGEILNE QNPFFDPLLDPKANWALRNWYLFPMEINRASYKELLRIPGIGINSARRIVMSRKYGIIKY EHLQKLGVVIKRAKYFITVNGEFLGLKKENPELIRNILIEREKIGMYQMRLF >gi|261748771|gb|ADAD01000007.1| GENE 17 17180 - 17863 813 227 aa, chain + ## HITS:1 COG:SPy0508 KEGG:ns NR:ns ## COG: SPy0508 COG3819 # Protein_GI_number: 15674613 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pyogenes M1 GAS # 4 226 3 226 227 191 49.0 1e-48 MNLWILIGIVIIVVGFSLKLDVLAVVITAGVATGLAAKIDFFKILEIMGKAFVENRLMSI FLISFPVIAILERYGLKERSAELIGNLKNATAGKILGFYMIIRSVASALSIRIGGHIQFI RPLILPMSEAAAEASKGEKLSEDETERLKSLSGAVENYGNFFAQNIFVGASGLLLIQSTM GENGYNVSLKNLAFYSIPIGIIAIIFTIIQVSLYDKKLKKDSQGGNK >gi|261748771|gb|ADAD01000007.1| GENE 18 17863 - 18804 1281 313 aa, chain + ## HITS:1 COG:SP0859 KEGG:ns NR:ns ## COG: SP0859 COG3817 # Protein_GI_number: 15900743 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pneumoniae TIGR4 # 10 309 8 305 307 204 42.0 2e-52 MDLKTFFNYLSEIIYVLCGLVSISTAIRGLKNEKAKIGTFLFWFLLGLIFILGKYIPYVY TGGMLIVLGLITVTNQLKMGKFADISAEFKIAQSRKFGNLIFIPAALIGISAFLILQFKI GTTAIPPAVGIGGGAIIALIVALLIIKPNLKETNEDSTKLLMQTGAAALLPQLLAALGVV FKEAGVGNVIAQSISSVVPSGNIFLGIIIYAIGMAVFTMIMGNAFAAFSVITAGIGIPFI LKHGGNPAVIGALGMTAGYCGTLMTPMAANFNIVPASILEIKDKNAMIKTQIPMALALFV VHIILMLLLFGIK >gi|261748771|gb|ADAD01000007.1| GENE 19 18824 - 19465 868 213 aa, chain + ## HITS:1 COG:FN1728 KEGG:ns NR:ns ## COG: FN1728 COG2039 # Protein_GI_number: 19705049 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyrrolidone-carboxylate peptidase (N-terminal 
pyroglutamyl peptidase) # Organism: Fusobacterium nucleatum # 2 213 3 214 214 281 67.0 5e-76 MKILVTGFDPFGGELVNPALEAVKKLPDNIAGAEIIKIEIPTVRIKSLEKIEKAVEEHNP DVILSIGQAGGRFDITVERIGINIDDFRIPDNEGNQVIDEPVFSDGESAYFSNLPVKAIV ENIRNHEIPASISNTAGTFVCNHVLYGVRYLIEKKYKGKKSGFIHIPFLPEQVISKPNMP SMAVDTVVEGLTAAIEAIVKNDEDIKKTGGTVC >gi|261748771|gb|ADAD01000007.1| GENE 20 19494 - 19928 476 144 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037200|ref|ZP_06010687.1| ## NR: gi|262037200|ref|ZP_06010687.1| BtrG family protein [Leptotrichia goodfellowii F0264] BtrG family protein [Leptotrichia goodfellowii F0264] # 1 144 1 144 144 228 100.0 8e-59 MIKLFVYGSLRKGCFNHYYIDKAQFKGIFYVRGNIFTIKNKNYPALVLSEQDENKNPENF TTGELYEFPDNTENLTEILENLDKMENYYGKNHPENEYNKLYLDIYDRNFNKADKAYVYV FNINNPELKNSLGEKIENNDFLNI >gi|261748771|gb|ADAD01000007.1| GENE 21 19887 - 20546 483 219 aa, chain - ## HITS:1 COG:BH1150 KEGG:ns NR:ns ## COG: BH1150 COG0534 # Protein_GI_number: 15613713 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Bacillus halodurans # 1 151 160 302 451 99 37.0 4e-21 MGRKQIRECLILQLVTNVVNIVLDLFFVKSLNMNVAGVAYATLISQIMTTLLSIYIIFRG KKGEDFKIIEAVKKIDLKKIFDKEAAKQIVGVNFDLVIRTVCLLAVTNLFMEEASGEGEI VLAANSILFQIQYLMAYFFDGFANASSVFDGLLFHYPFMGESWALVFVYSAFGSKIDITA FIRKKNEKKSVFGNESVRICSPHTFCLNIKKIIVFNFFS >gi|261748771|gb|ADAD01000007.1| GENE 22 20592 - 21008 575 138 aa, chain - ## HITS:1 COG:BH1150 KEGG:ns NR:ns ## COG: BH1150 COG0534 # Protein_GI_number: 15613713 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Bacillus halodurans # 10 138 15 143 451 109 43.0 2e-24 MTKTEKIKLNNGYLKMAIGFTLSTLTTPLLNSVDTAVVGRLPNPVYIGGVALGGTIFNTI YWILGFLRVSTSGYSAKAFSMKDEKEEVLALIRPMIISVITGLLFFVFQNGILNLALRFY KADEQILHYMSVYYKILI >gi|261748771|gb|ADAD01000007.1| GENE 23 21250 - 21606 322 118 aa, chain + ## HITS:1 COG:TM0122 KEGG:ns NR:ns ## COG: TM0122 COG0735 # Protein_GI_number: 15642897 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ 
uptake regulation proteins # Organism: Thermotoga maritima # 1 117 1 115 121 71 37.0 3e-13 MKLTKNRQKVLDLIKISDIPVNVKFLKTKVDFDLSTIYRALDFLEKNKLVFSFDFENEKY YFKEENANFFICDTCKHIETVSDFSDSEIEKEKSTLEKKGFSLLSHLSIFKGKCNDCD >gi|261748771|gb|ADAD01000007.1| GENE 24 21672 - 22628 1484 318 aa, chain + ## HITS:1 COG:slr2043 KEGG:ns NR:ns ## COG: slr2043 COG0803 # Protein_GI_number: 16329702 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Synechocystis # 40 291 51 305 338 150 34.0 4e-36 MKKIVTLILLGLLVFSCGKTENPEKKENGTKTETAKKEKIVTSIPPLKWIVQKIAGSDFE VTSIVQPNMNHELFEPKPEDLKTLEDTKVFFTYDVLPFEETIEKSLNSSDKITNVLSGVD PALFLKGHHHHHDEDEDHDHDHDKNEKHEEEKEHHDHDEDARDPHVWFSLEMMPQVAKNV KNKLAQMYPDKKDTFEKNYNDFLAELTKFKEEISKKMSGKTKKLFMIYHPALNYFLKDYD IKEIEIESEGKEPSAQQIKEIIDEAKEHGITTILVQPQFPKQSAEAISKEIPNSKVAEFN ADLENVFENLNRFVDYLD >gi|261748771|gb|ADAD01000007.1| GENE 25 22655 - 23347 244 230 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 1 218 1 225 309 98 28 4e-20 MSNSEEKLISVKDLNFKYHNEIILENISFDIIKGQNVAILGRNGGGKSTLIKVLLGFLKK KSGEIKFFIDKKKIGYLPQIREFDASFPINIFDLVISGLTDKKNLFRRFNEEEKKKTENL LKEFGISHLKDKLISEVSGGQLQRALIARALVSSPEIIFLDEPESFLDKEFEFKLFEKIK KLSNSTLVIISHEMEKIYNYVDSIFIVEGTIKTYKNKKDFYDSEDNIHRH >gi|261748771|gb|ADAD01000007.1| GENE 26 23368 - 24201 674 277 aa, chain + ## HITS:1 COG:MA0025 KEGG:ns NR:ns ## COG: MA0025 COG1108 # Protein_GI_number: 20088924 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Methanosarcina acetivorans str.C2A # 4 261 2 252 274 145 39.0 8e-35 MDFLNILGYSFMRNALIVGFLSSICCGIIGTYIVNKKMVFISSSISHASYGGIGIGIYLI YFFRLPMKDPLIFGLIFSILSGILILILKDFFNVEGDLGIGMVMSLGMAVGIIFSFMTPG YQSDMSTYLFGNILLSNNSNIISLLILDIITVIFFIIFYKGIVYTSFDEKLYRLYGVPVY FINYFMVMLISSAIIINIKTIGIILIISILTIPQAAASMVAKKYNIIILLSVFFSFLGIL 
FGLYFSYTLNIPSGPSIIVSLIILIVFVKLFSFLRKK >gi|261748771|gb|ADAD01000007.1| GENE 27 24284 - 24721 757 145 aa, chain + ## HITS:1 COG:TP1038 KEGG:ns NR:ns ## COG: TP1038 COG0783 # Protein_GI_number: 15640022 # Func_class: P Inorganic ion transport and metabolism # Function: DNA-binding ferritin-like protein (oxidative damage protectant) # Organism: Treponema pallidum # 6 145 37 176 177 102 37.0 2e-22 MSKTSEKLNLYLVNLNVLYRKVQNYHWNVVGKGFFTIHAKLEEFYDKINEQVDDVAERIL SIGARPYGTLKDYLELTTIKEAENKEISVHDVLISVKADFESMLTLVKEIKVTADDENDY GTSAMLDEYISEYEKNLWMLNAYLK >gi|261748771|gb|ADAD01000007.1| GENE 28 24846 - 25853 1366 335 aa, chain - ## HITS:1 COG:SPy0879 KEGG:ns NR:ns ## COG: SPy0879 COG1304 # Protein_GI_number: 15674904 # Func_class: C Energy production and conversion # Function: L-lactate dehydrogenase (FMN-dependent) and related alpha-hydroxy acid dehydrogenases # Organism: Streptococcus pyogenes M1 GAS # 2 328 3 327 329 408 58.0 1e-114 MNRKDEHIRYALKYESPYNSFDDMELIHQSVPKFNIDEIDISTRFASNDFECPFFINAMT GGSEKGKEINRKLAKVAEECGILFVTGSYSAALKNSDDNSFKIVKEENKKLLLGTNIGAD KDYTAGLKAIEDLKPLFLQIHVNVMQELIMPEGSKNFKDWRKNIEGFVKNIKIPLILKEV GFGMSEETVKIGMESGIKTFDISGRGGTSFAYIENMRRKNSLSYLDEWGQTTVTSLLSVK KYADNIEIIASGGVRNPLDIIKSLVLGAKGVGISGTVLRLAEKNTVEEMIEIVNSWKEEC KMIMCALNAQNLEELKKVKYILYGKVKEFSEGYFK >gi|261748771|gb|ADAD01000007.1| GENE 29 25919 - 26851 601 310 aa, chain - ## HITS:1 COG:PM1540 KEGG:ns NR:ns ## COG: PM1540 COG4823 # Protein_GI_number: 15603405 # Func_class: V Defense mechanisms # Function: Abortive infection bacteriophage resistance protein # Organism: Pasteurella multocida # 7 249 10 242 309 72 25.0 2e-12 MDQPFKTFEEQRNILFNRMIVDNETISILMRYNYYSIVNFYKKPFILGKNSKGEDTYILN ISFKHLKALHDFDKKLRILFFEYLTELEKFFKTVIAYYFSEVYGQIEKNPYLLPKNYNTG HENKNIHYISYLIKTLNNIKKDNEKIIIKHYNTTKDNIPFWIIVHYLTFGDISKLYFCLK TEIHDKISSHFQNLYKVEYNNLVAISPSFIKSFLSTASIFRNVTAHNERLYNYSSKRTIN IKKSPVLTNNKRQKLFTIYEGLKFFLSRKEYDKLTENLKNIIIVLEEDLENVININSILI EMDFPVNWHK >gi|261748771|gb|ADAD01000007.1| GENE 30 26994 - 28010 1420 338 aa, chain - ## 
HITS:1 COG:SP0383 KEGG:ns NR:ns ## COG: SP0383 COG1577 # Protein_GI_number: 15900306 # Func_class: I Lipid transport and metabolism # Function: Mevalonate kinase # Organism: Streptococcus pneumoniae TIGR4 # 14 338 5 329 335 296 48.0 4e-80 MKITVNNTDFVEEKACGKLYIAGEYAILTPGLTAIVKNVNIYMNAQIRFSEKYKIYSDMY NYSVSLEHDENYSLIQETVNVVNKYLHIKSIDTKPFELNITGKMEKEGKKYGIGSSGSVV ILTIKAMVRLHKYCISKDTIFKLAAYVLLKRGDNGSMGDLACIAYEELVAYISFDREKIK EKIENETFEKVINSDWGYKIEVLKCKHDYEFLVGWTGKPAISKDMINNVKKSINKDFLEK SDENVKNIIKGIKSGNKELIKEAVIKSGDLLKNLDSSIYSNELTELVNAAKELDMCAKSS GAGGGDCGIAISFNKNDTKTVIEKWEKKGIVLLYRGRL >gi|261748771|gb|ADAD01000007.1| GENE 31 27994 - 28935 1046 313 aa, chain - ## HITS:1 COG:SP0382 KEGG:ns NR:ns ## COG: SP0382 COG3407 # Protein_GI_number: 15900305 # Func_class: I Lipid transport and metabolism # Function: Mevalonate pyrophosphate decarboxylase # Organism: Streptococcus pneumoniae TIGR4 # 5 311 7 315 317 353 62.0 3e-97 MDTKSVRSYANIAIIKYWGKKDAKNMIPATSSISLTLENMYTDTEISFIESETDVFYLNG VLQDSKQTEKISKVVDLFRENKEQKVLIKSENNMPTEAGLSSSSSGLSALIKACNKLFRK NMTRTELARISKYGSGSSARSFFGPIGAWDKDTGEIYEIKTDLKLAMIMLVLNEEKKIIS SREGMKLCGETSTIFDKWIKNSEIEYEEMKKALAENNFEKVGELTEKNALAMHETTLYAN PPFSYLTDKSREAMEFVKKLRKSGEKCYFTMDAGPNVKVLCLEKDFEKLKYVLGKKYKII ASKTKVITDENNG >gi|261748771|gb|ADAD01000007.1| GENE 32 28917 - 29021 124 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no GGCVIALTDSEEKAKETAKILVEKGALNTWIQNL Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:26:18 2011 Seq name: gi|261748768|gb|ADAD01000008.1| Leptotrichia goodfellowii F0264 contig00200, whole genome shotgun sequence Length of sequence - 839 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 665 527 ## gi|262037206|ref|ZP_06010692.1| conserved hypothetical protein 2 1 Op 2 . 
- CDS 620 - 838 162 ## gi|262037207|ref|ZP_06010693.1| hypothetical protein HMPREF0554_2399 Predicted protein(s) >gi|261748768|gb|ADAD01000008.1| GENE 1 2 - 665 527 221 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037206|ref|ZP_06010692.1| ## NR: gi|262037206|ref|ZP_06010692.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 221 1 221 221 323 100.0 5e-87 MGDERGACRKNGRTLINCRENTERASNRKLTPKEQKINDRIVDMIIDYGVPYSKFRVIPE AIIGRNLITGEKVTLLDSALSVLGVSKSLTSTAMDGFEKLKTKTNKKPINKELHEKINNR KPGHYYEKQLEKRGKSIDKANLNEKVVSKETPKIKPKAIEAKQKGTKSSPKTEEKNNKQS DGNKNNFKEEQKRTGFEEKNKEKGFNDYASDDSDKNKRYTA >gi|261748768|gb|ADAD01000008.1| GENE 2 620 - 838 162 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037207|ref|ZP_06010693.1| ## NR: gi|262037207|ref|ZP_06010693.1| hypothetical protein HMPREF0554_2399 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2399 [Leptotrichia goodfellowii F0264] # 1 72 1 72 72 90 100.0 5e-17 GKVEGRQRKVPEGSEEKGLESTGRATNDFFKRRIRREIYRYRQSQMEKITAKLILVSMWG MKEELVAKTEEH Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:26:36 2011 Seq name: gi|261748762|gb|ADAD01000009.1| Leptotrichia goodfellowii F0264 contig00018, whole genome shotgun sequence Length of sequence - 3924 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 145 218 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase - Prom 177 - 236 7.4 - Term 216 - 265 6.0 2 2 Op 1 36/0.000 - CDS 292 - 1509 329 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 3 2 Op 2 24/0.000 - CDS 1510 - 2202 325 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 4 2 Op 3 . - CDS 2231 - 3367 1394 ## COG0845 Membrane-fusion protein 5 2 Op 4 . 
- CDS 3400 - 3924 647 ## Lebu_0816 hypothetical protein Predicted protein(s) >gi|261748762|gb|ADAD01000009.1| GENE 1 1 - 145 218 48 aa, chain - ## HITS:1 COG:ECs3054 KEGG:ns NR:ns ## COG: ECs3054 COG1957 # Protein_GI_number: 15832308 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Escherichia coli O157:H7 # 2 48 3 49 313 70 72.0 6e-13 MKKKIILDCDPGHDDAIAIMVAGLHEDFELLGITTVAGNQTIEKTTNN >gi|261748762|gb|ADAD01000009.1| GENE 2 292 - 1509 329 405 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 5 405 7 413 413 131 28 1e-30 MDFFESLKLSVSNLFSYKVRSFLTMLGIIIGISAVVMMSSLGAGVKENIVGDLNKLGVGN FQVFIDTSPGQTYKSEDLFTTKDIANLKAIEGVEAVSPSSDAYARINIGENQSAMVVGSG VTQDTFKITNYTIVKGRKFLPDEYRKDGRYIIIDSTSADMLFPGQNPVGQKLALNFRKNM QIVTIVGVYKNPFANLGGGGGPDLPVFALFPNEYLNHINGNAQNKFTSLDMKAGNAKELE IVMERVREFLKSRGSSPKTYSVRNSAQGLDEFNNILNMISLCISGVAAISLFVGGIGVMN IMLVSVTERIREVGLRKAIGAKTKDILLQFLIEAVILTCFGGIIGVFLGYGGALLVGIFI KTTPILSPVVVIVSLVVSTMTGLIFGVYPAKKAAALDPIEALRVD >gi|261748762|gb|ADAD01000009.1| GENE 3 1510 - 2202 325 230 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 6 218 3 217 245 129 34 3e-30 MIDVKKIVKIYKNGNMSLEVLKGLNLFVAEKEYVALMGPSGSGKSTFMNILGCLDRLTSG KYILDNVDVSTMKGDELSKVRNEKIGFVFQSFNLLPKLSAMENVALPALYAGVKREERME KAKKALESVGLGERIHHKPGEMSGGQRQRVAIARAIINDPKILLADEPTGNLDSKSGEEV LGIFKRLNDNGTTIVMVTHEEDVAEHCKRIVRLKDGVIESDYIVEDRRGV >gi|261748762|gb|ADAD01000009.1| GENE 4 2231 - 3367 1394 378 aa, chain - ## HITS:1 COG:FN0826 KEGG:ns NR:ns ## COG: FN0826 COG0845 # Protein_GI_number: 19704161 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Fusobacterium nucleatum # 42 347 3 309 338 143 29.0 7e-34 MLKEKFKPFCNNKAICLLIMAVLFLVACGKKNVEKEYEVTTVERGDISLSIEKTGQVVSE NEVAVYTTANQRVNKVFFKAGDNVKKGDVVLTFYPVDRNELQRKIQIKSLEVQQKQRDLR 
NASELKKIGGASAVSVDDARIALQTAQLELSSLREDYALLVDNIKSPVDGVITAMTADEN YRVNTETTLFKVSDAKNMKVEVSLTDSEVQNIAVGQRVEISSDALPKGQQIEGTVSQISG VATKNANLDESNTTVSIKLSDPKGLRPGATINAIIYYKESKNILKVPYSAVINESGKYYV FTVGKDNKVTKKEVTVGETDNLYYEISTGLSSGEKIITVVDEALQNGQKIKIADPGKPQK GTKGKGDVKQESFEAPPR >gi|261748762|gb|ADAD01000009.1| GENE 5 3400 - 3924 647 174 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0816 NR:ns ## KEGG: Lebu_0816 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 174 249 422 422 186 68.0 5e-46 VELKKEDFYNLKLSEAEKIKLNETLNEEKMRKEKFDYNIPKITADAGYSFEKKSFTVGVG VSKTFKLYNDTIEDLKNEAEKLRLEYEQKKNEILSNAGQEMLNYTTYQTNELISKKALDI AKEDYAIYAKKYELGSDTFANYVEKRNTYEKAIMDYEIAKNELAAFTKKIKYYK Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:26:42 2011 Seq name: gi|261748755|gb|ADAD01000010.1| Leptotrichia goodfellowii F0264 contig00016, whole genome shotgun sequence Length of sequence - 6452 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 22/0.000 + CDS 3 - 230 288 ## COG2011 ABC-type metal ion transport system, permease component 2 1 Op 2 3/0.000 + CDS 251 - 1051 1138 ## COG1464 ABC-type metal ion transport system, periplasmic component/surface antigen 3 1 Op 3 . + CDS 1106 - 1936 1328 ## COG1464 ABC-type metal ion transport system, periplasmic component/surface antigen + Term 1981 - 2025 -0.9 + Prom 1940 - 1999 10.2 4 2 Op 1 . + CDS 2244 - 3785 2177 ## Lebu_0504 hypothetical protein 5 2 Op 2 . + CDS 3813 - 5540 2201 ## Lebu_0505 ErfK/YbiS/YcfS/YnhG family protein + Term 5581 - 5622 -0.9 + Prom 5782 - 5841 9.2 6 3 Tu 1 . 
+ CDS 5998 - 6192 317 ## gi|262037216|ref|ZP_06010700.1| transcriptional regulator, MarR family + Term 6394 - 6430 -0.8 Predicted protein(s) >gi|261748755|gb|ADAD01000010.1| GENE 1 3 - 230 288 75 aa, chain + ## HITS:1 COG:BH3480 KEGG:ns NR:ns ## COG: BH3480 COG2011 # Protein_GI_number: 15616042 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, permease component # Organism: Bacillus halodurans # 1 72 147 218 218 72 55.0 2e-13 LIHGLTLTIISLIGYSAIAGSIGGGGLGNSAVVDGYTRSNPEIMWQATIVIIILVQIIQF IGDFFVKIISKKRTK >gi|261748755|gb|ADAD01000010.1| GENE 2 251 - 1051 1138 266 aa, chain + ## HITS:1 COG:DR1358 KEGG:ns NR:ns ## COG: DR1358 COG1464 # Protein_GI_number: 15806375 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface antigen # Organism: Deinococcus radiodurans # 1 266 1 256 256 219 46.0 6e-57 MKKIILFLTVALIFAVSCGEKKESKKENEKLTVTATPILAGELVTLVKDDLKKEGIDLEV VIFNDYVQPNKALQDKSVDANLFQHTPYMNNFGKKNNFEMAAVGKVYLPTMALYSDKVKS LEELKDGATIFIPNDPTNLTRALLLLDKKGLIKLKDNTKLDSKLSDIVENRKNLKFVELS AEQIAPRYKEVDAAFITGSYALDSGLNPKDNGILSEDKDSPYVNVLATLKGRENEEKIQK LLKALQSEKVKKYIEEKYKDVIIPAF >gi|261748755|gb|ADAD01000010.1| GENE 3 1106 - 1936 1328 276 aa, chain + ## HITS:1 COG:FN0658 KEGG:ns NR:ns ## COG: FN0658 COG1464 # Protein_GI_number: 19703993 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface antigen # Organism: Fusobacterium nucleatum # 41 276 25 261 261 257 59.0 2e-68 MKKLLLLGVLTLLFVLSCGGKKEEAKDQKGSAEEPKKTEKLVVGATPVPHQELLELVKDD LKNEGIDLEIVQFNDYVQPNKGLADKSLDANFFQHIPYMEEFAKKNNMELVSVGKIHLEP MAIYSKKIKNINDLKEGDTILIPNDPTNGGRALILLDKAGVLKLKDNTKLDSTVADIVEN NKKIKIEQLAPEQLAPRLSEVTAAIINSNFALDAKLSFKDDTIFIEDKDSPYVNIVTVLK GRENEEKIQKLVKALQSEKVKKYLEEKYSGSVIPSF >gi|261748755|gb|ADAD01000010.1| GENE 4 2244 - 3785 2177 513 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0504 NR:ns ## KEGG: Lebu_0504 # Name: 
not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 24 513 48 555 555 484 50.0 1e-135 MMRKVMLLVGFVLFCNFNIFSMDKLQIPAEFSENITVTGFDSDIFEYDFNKDGIKEKVLV NSKKDEFSIKTIISIYIMEDGNYRFAYQIPLKEDVNIFSVKRAENMLKKVKEHYEDYSKD LEKGEVRYVRLLDDNTNEQIVFDMKFDKHSPKDLDNFIFVKKTTVFNQSPQHHSGVAYTA HYKDKPRILLEFLSEKDNKHTVWYFTEMQEGKDKSKVKGYVSEVSGTIQRRGFYWDEMHS KIEKVNYFINNAMKDKSDLYIITQYRPLANDIYSEKDKFGNRRNQSITGYINPDKKGEII NIPDQTIFKIIGKENDMLKIETPLYGGPYYIADNPEIMKKWNLETQVNKFIAIDSNNQTE AVLQRIKDTNNFSIISYSFVTTGKDDGYSSYETPHGAFLIAFTRPYMLFTGRQREGDTRK SAGKEGLVIAGEAQYAVRFSGGAYMHGIPVSYGASASTKAYTASKIGTYKESHKCVRHYD DQIEFIVNWINGNSTTKERDNTIPDEPVIAVVL >gi|261748755|gb|ADAD01000010.1| GENE 5 3813 - 5540 2201 575 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0505 NR:ns ## KEGG: Lebu_0505 # Name: not_defined # Def: ErfK/YbiS/YcfS/YnhG family protein # Organism: L.buccalis # Pathway: not_defined # 2 575 6 594 594 562 54.0 1e-158 MKLRKLQLLGMIILMSSLMYAAPLGTAPANTGTLKNWKTVEMTPDLDKDGKKDKLVIEYS EQNDKIYTKFTPYVNDDSNKAVKGQTVEKIFERANFQKDFNNFSREFVKNYPKKVKTASK PVNTTVTSVQPNTQNKQPVKNDKPLIPPTQNVGKEVEDKAAVPQDLKETGKNPKEVIPDE QTDKEIQKVDGKPVDKKEELIKGPYPYVTYYLKERPKNLTFDYKYAKNSPRDMDEFIFIK TATNIRKEPNANAASIKKASYGHKYKVVGKVKTNAKGGTAEWYEVYFDGKLGYVLNSVAV KREFDWQDMMKKVEKTNKFVNEAVSKGQTIYVLDDYVPLGGGSGSSKDKFGNRENQSERG YLGADFKEFINLPDRTMMTILEETDKYLKVKVDAYDNGTYYLKKSKKSLLKDSKITGEIT RFIYVDRHSQNEMIIEKNTNANTWNVVTTSFVTTGKDAGNSYATPYGTFLIAYSKPVMQY TGSDNKTVVGDANNAVRFSGGGYMHSIPSLFEPKETRKSRKAVTARKIGTFPESHKCIRH YDDQIKFIYDWLGNSSPGNKLGRRVPSVPTVMLVK >gi|261748755|gb|ADAD01000010.1| GENE 6 5998 - 6192 317 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037216|ref|ZP_06010700.1| ## NR: gi|262037216|ref|ZP_06010700.1| transcriptional regulator, MarR family [Leptotrichia goodfellowii F0264] transcriptional regulator, MarR family [Leptotrichia goodfellowii F0264] # 1 64 1 64 64 80 100.0 4e-14 MENNEKRGGIRKGAGRKPTGRERDKTISFKVTEDEREYIYKVLDKIGGKRTESILKLLKK YEKE Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:27:05 2011 Seq 
name: gi|261748741|gb|ADAD01000011.1| Leptotrichia goodfellowii F0264 contig00044, whole genome shotgun sequence Length of sequence - 8875 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 8, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 325 241 ## PROTEIN SUPPORTED gi|229547905|ref|ZP_04436630.1| acetyltransferase including N-acetylase of ribosomal protein family protein 2 1 Op 2 . - CDS 367 - 840 419 ## Lebu_0538 hypothetical protein - Prom 875 - 934 9.4 3 2 Tu 1 . - CDS 950 - 1165 265 ## Lebu_2069 heavy metal transport/detoxification protein - Prom 1231 - 1290 7.2 - Term 1177 - 1224 0.1 4 3 Tu 1 . - CDS 1292 - 1906 823 ## Lebu_0557 hypothetical protein - Prom 1945 - 2004 9.2 + Prom 1909 - 1968 10.8 5 4 Op 1 . + CDS 2081 - 2686 600 ## COG2813 16S RNA G1207 methylase RsmC + Prom 2694 - 2753 6.8 6 4 Op 2 . + CDS 2789 - 3283 1035 ## COG0716 Flavodoxins + Term 3489 - 3536 5.0 - Term 3745 - 3775 1.2 7 5 Op 1 . - CDS 3875 - 4921 1221 ## PROTEIN SUPPORTED gi|15900201|ref|NP_344805.1| hypothetical protein SP_0267 8 5 Op 2 . - CDS 4939 - 5088 230 ## gi|262037226|ref|ZP_06010709.1| ThiM2 - Prom 5108 - 5167 2.2 9 6 Op 1 . - CDS 5256 - 5528 431 ## CAR_c22460 hypothetical protein 10 6 Op 2 3/0.000 - CDS 5582 - 6352 951 ## COG0346 Lactoylglutathione lyase and related lyases 11 6 Op 3 . - CDS 6374 - 6562 79 ## COG0346 Lactoylglutathione lyase and related lyases - Prom 6626 - 6685 7.8 + Prom 6530 - 6589 9.8 12 7 Tu 1 . + CDS 6665 - 7021 424 ## COG1733 Predicted transcriptional regulators + Term 7049 - 7106 2.2 - Term 7109 - 7146 2.1 13 8 Tu 1 . 
- CDS 7149 - 8873 2590 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit Predicted protein(s) >gi|261748741|gb|ADAD01000011.1| GENE 1 2 - 325 241 108 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229547905|ref|ZP_04436630.1| acetyltransferase including N-acetylase of ribosomal protein family protein [Enterococcus faecalis ATCC 29200] # 9 108 21 115 204 97 48 3e-20 MNLKNTPNLETERLILRKFTKNDLKSIFEIYSDEEVNTFLPWFSLKSIKEAEMFFAERYE KNYKNPLGYNYAICLKTDNIPIGYINVNMDSNYDLGYGLRKEFWNKGI >gi|261748741|gb|ADAD01000011.1| GENE 2 367 - 840 419 157 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0538 NR:ns ## KEGG: Lebu_0538 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 46 157 162 269 398 73 39.0 2e-12 MKSKTVLVLMSIFLVLTFTSGFAAKSKVSKGTSVNSSTNTNKLLHTSWRLRSVSEKALKK VVIEGREENVMITLNFSSDVINGSGGVNNYTAGYKIKGNNISLTQISSTLRAGFTELMKA EQNYFQILQNVKTFELKGDILTLKSDNGSLVFSKMPN >gi|261748741|gb|ADAD01000011.1| GENE 3 950 - 1165 265 71 aa, chain - ## HITS:1 COG:no KEGG:Lebu_2069 NR:ns ## KEGG: Lebu_2069 # Name: not_defined # Def: heavy metal transport/detoxification protein # Organism: L.buccalis # Pathway: not_defined # 1 63 2 64 74 65 53.0 8e-10 MKKIIEIEGMNCSHCTQKVENALYGLPETEEVNINLENKCSEVDFSSDVDDKLISDLIKS IGYLVTNIKNI >gi|261748741|gb|ADAD01000011.1| GENE 4 1292 - 1906 823 204 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0557 NR:ns ## KEGG: Lebu_0557 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 202 1 202 204 108 36.0 1e-22 MKKYLFLFLLAPVLVNASTTEIRFGGNLLSKGKFKNEFSGNTFSKKDIVKNGFGASIEAR DSIDDFEIGLGIGLKYNSLKSIENSKSKNIVSVPIYFVGKGNLFNNFTDGKVAPYIRYEL GYAFRNGNLKWENQKNSGEIKFGGGVYASAGLGIQYKNFTTDLSYNWEGIRVKRNYKTAA YNYSDKFTLHQGFVTLNVGYAFEK >gi|261748741|gb|ADAD01000011.1| GENE 5 2081 - 2686 600 201 aa, chain + ## HITS:1 COG:SA0499 KEGG:ns NR:ns ## COG: SA0499 COG2813 # Protein_GI_number: 15926219 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S 
RNA G1207 methylase RsmC # Organism: Staphylococcus aureus N315 # 1 200 1 201 202 173 41.0 2e-43 MNHYFSEKPDIKSQREKIKYTIENKEMEFITDNGVFSKSKIDFGTDLMLKTFIKNFKTEK GFDVLDIGCGYGIVSVVLKTFYPLSSVTLSDVNERALELSEENLKNHNINDYKIVKSFAF DNISDKYDIIMSNPPIRAGKETIFKIYEGAYEHLNNKGEFYCVIQTKHGAKSTQKKLEEI FGNCETLTIDGGYRIYKAEKN >gi|261748741|gb|ADAD01000011.1| GENE 6 2789 - 3283 1035 164 aa, chain + ## HITS:1 COG:sll0248 KEGG:ns NR:ns ## COG: sll0248 COG0716 # Protein_GI_number: 16330539 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Synechocystis # 1 164 1 168 170 146 46.0 2e-35 MSKIGIFYGTTTGVTEDAANKIADKLDGADVFNIAGNEDKLGDYDVLILGTSTWGFGDLQ DDWQTAVDELAKLDLKGKKVAYFGCGDQMTFSETYVDGIGILNEEIEKTGAEVIGQTSTE GYDFSESRAIKNSKFLGLAIDEINQPDLTDERIDMWVEELKKVL >gi|261748741|gb|ADAD01000011.1| GENE 7 3875 - 4921 1221 348 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900201|ref|NP_344805.1| hypothetical protein SP_0267 [Streptococcus pneumoniae TIGR4] # 1 348 1 348 349 474 63 1e-134 MIELGISSFGETTPIEKTGEVLSHDKRIRNLIEEIELSDKVGLDIYAIGEHHRSDFAVSA PEIVLAAGAVNTKNIKLSSATTNISSNDPVRVFQNFATIDAISNGRAEIMLGRGSFTEAF PLFGYKLENYEELFNEKLDMMLELKKNEIINWKGNFTQSIDGRGVYPRPVQEDFPIWVAT GGHVESSLRIAQKGLPIVYAIIGGNPLAFKQLIDIYKKFGVENGHSPEKLKVAAHSWGFV WDNTEEAIEKYFHPTKILTDQIATERPHWRGLSKEHYLQSVGPDGAMFVGSPEHVADKLI EMIENLGLDRFMLHLPIGSMPHEEVLKSIELYGTKVAPIVKDYFNKKK >gi|261748741|gb|ADAD01000011.1| GENE 8 4939 - 5088 230 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037226|ref|ZP_06010709.1| ## NR: gi|262037226|ref|ZP_06010709.1| ThiM2 [Leptotrichia goodfellowii F0264] ThiM2 [Leptotrichia goodfellowii F0264] # 1 49 1 49 49 77 100.0 3e-13 MAPAGDVMKMKKELENIGYDKFDVILLDGGHEINDQEIEKLKKYYKNKF >gi|261748741|gb|ADAD01000011.1| GENE 9 5256 - 5528 431 90 aa, chain - ## HITS:1 COG:no KEGG:CAR_c22460 NR:ns ## KEGG: CAR_c22460 # Name: not_defined # Def: hypothetical protein # Organism: Carnobacterium_17-4 # Pathway: not_defined # 1 89 1 89 191 84 51.0 2e-15 MEYMYIKGTDEMFVLFHGTGGNENSLLFLTGELDPYASVLSFSGDTGVRIKRRFFAPLIG 
KREPDRKDLAERVEKFLTQWDNLELTKGKK >gi|261748741|gb|ADAD01000011.1| GENE 10 5582 - 6352 951 256 aa, chain - ## HITS:1 COG:BH2175 KEGG:ns NR:ns ## COG: BH2175 COG0346 # Protein_GI_number: 15614738 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Bacillus halodurans # 10 251 76 319 327 131 35.0 1e-30 MKGMKENRFGTNAIERTMFLVRSEKSLYFWEKRFDDFNVCHYGIETYNGQKILRFEDEDG QRLGLVYRKGELEEMEPFVAEDIPEEHAVLGIGDIHLRVGYTEPTQKILENNYGFEKYDE ITVNNLKVSLFRFKNSPFKHEIHIIEDKDSEIQRLGVGGIHHIAFGVESVEDLKVLQKEL EDKNVQNSGIIDREFIVSSYFREPNFNLFETATPLNKEKESFPEQGKQFHEIPLFLPEFL ENRREEIERNVNYELK >gi|261748741|gb|ADAD01000011.1| GENE 11 6374 - 6562 79 62 aa, chain - ## HITS:1 COG:BS_ykcA KEGG:ns NR:ns ## COG: BS_ykcA COG0346 # Protein_GI_number: 16078352 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Bacillus subtilis # 13 60 6 53 316 63 54.0 8e-11 MKIKKKRGIYYELHHVSVLSSNAERAFYFYHHILRLKLILKTVNQDDPNMYHLFFGDETG RG >gi|261748741|gb|ADAD01000011.1| GENE 12 6665 - 7021 424 118 aa, chain + ## HITS:1 COG:AGc1831 KEGG:ns NR:ns ## COG: AGc1831 COG1733 # Protein_GI_number: 15888340 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 12 106 15 109 129 103 47.0 7e-23 MSEDIKIEPNLCRTDDGLDTIIGKWKLSILLHMMKNGTMRFSDFMRAMPEITQKMLTKNL RELEEDDLISRFSYPTIPPKVEYSLTEHGKELEPIIDQLHKWGIRHKEHIQKKWQNMI >gi|261748741|gb|ADAD01000011.1| GENE 13 7149 - 8873 2590 574 aa, chain - ## HITS:1 COG:FN1170_3 KEGG:ns NR:ns ## COG: FN1170_3 COG1013 # Protein_GI_number: 19704505 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Fusobacterium nucleatum # 200 574 1 375 379 507 64.0 1e-143 DKGTEGLEEVFVDPAWADLEVDEEIIDSAKPEYVRKIADPINAIKGDQLPVSAFVGYEDG TLEHGTANYEKRGIAVEVPEWQPDMCIQCNQCAYVCPHAAIRPFLIDEKEMAAAPQGMPT 
IKALGRGFDNLQYKIQVSPLDCTGCTACVDVCPAPKGKAIVMKLIESQIERHEVEYSDYL YNEVSYKDHIMGKNTVKGSQFAKPLFEFSGACAGCGETTYIKLVTQLFGERMLIANATGC SSIYGASAPSTPYTKNSCGEGPAWASSLFEDNAEYGYGMFQATETIRHRMAKMMIECENE VSSELAALFLEWRENISDGEKTTELRNEIVPLLEKETGKTAKELLELKQYLVKKSVWMFG GDGWAYDIGFGGIDHVLASGDDVNMLILDTEVYSNTGGQSSKASPAGALAKFASSGKPVK KKDLAAILMTYGNIYVARVSMGANQNQTLKAIREAESYPGPSIIIAYSPCIAHGIKEGMG RGQHEEKLATEVGYWPILRYDPRLAEKGKNPLQLDSKDPAWDKYENFLKGESRYSALLSE FPERAKELFELNLKNAKETWNYYKRMASMDYSAE Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:27:30 2011 Seq name: gi|261748707|gb|ADAD01000012.1| Leptotrichia goodfellowii F0264 contig00055, whole genome shotgun sequence Length of sequence - 28309 bp Number of predicted genes - 32, with homology - 31 Number of transcription units - 13, operones - 9 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 30 - 89 14.1 1 1 Op 1 . + CDS 125 - 601 458 ## Lebu_0175 hypothetical protein 2 1 Op 2 . + CDS 641 - 1225 555 ## COG0241 Histidinol phosphatase and related phosphatases 3 1 Op 3 28/0.000 + CDS 1269 - 2648 1613 ## COG0770 UDP-N-acetylmuramyl pentapeptide synthase 4 1 Op 4 28/0.000 + CDS 2668 - 3756 1567 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 5 1 Op 5 25/0.000 + CDS 3776 - 5143 1984 ## COG0771 UDP-N-acetylmuramoylalanine-D-glutamate ligase 6 1 Op 6 31/0.000 + CDS 5171 - 6274 1246 ## COG0772 Bacterial cell division membrane protein 7 1 Op 7 26/0.000 + CDS 6320 - 7384 1594 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase 8 1 Op 8 11/0.000 + CDS 7399 - 8742 1772 ## COG0773 UDP-N-acetylmuramate-alanine ligase 9 1 Op 9 . + CDS 8779 - 9642 1055 ## COG0812 UDP-N-acetylmuramate dehydrogenase + Prom 9649 - 9708 14.7 10 2 Op 1 . + CDS 9729 - 10394 680 ## Lebu_0208 polypeptide-transport-associated domain protein FtsQ-type 11 2 Op 2 . 
+ CDS 10409 - 11506 1562 ## COG0206 Cell division GTPase + Prom 11508 - 11567 8.2 12 3 Op 1 24/0.000 + CDS 11600 - 11884 395 ## PROTEIN SUPPORTED gi|229212296|ref|ZP_04338678.1| SSU ribosomal protein S6P 13 3 Op 2 21/0.000 + CDS 11951 - 12364 514 ## COG0629 Single-stranded DNA-binding protein 14 3 Op 3 . + CDS 12390 - 12623 356 ## PROTEIN SUPPORTED gi|229860468|ref|ZP_04480112.1| SSU ribosomal protein S18P + Term 12633 - 12679 4.4 + Prom 12694 - 12753 10.3 15 4 Op 1 . + CDS 12825 - 13379 770 ## Lebu_0215 hypothetical protein 16 4 Op 2 . + CDS 13436 - 14026 1084 ## COG5403 Uncharacterized conserved protein + Term 14073 - 14107 1.1 + Prom 14052 - 14111 3.9 17 5 Tu 1 . + CDS 14131 - 14643 692 ## gi|262037259|ref|ZP_06010741.1| hypothetical protein HMPREF0554_0752 - Term 14649 - 14695 -0.3 18 6 Tu 1 . - CDS 14708 - 15640 881 ## COG2855 Predicted membrane protein - Prom 15668 - 15727 8.7 + Prom 15684 - 15743 11.3 19 7 Op 1 . + CDS 15772 - 16026 371 ## BHWA1_01309 hypothetical protein 20 7 Op 2 . + CDS 16073 - 16306 355 ## Thebr_0090 hypothetical protein + Term 16323 - 16375 -0.2 + Prom 16371 - 16430 7.1 21 8 Op 1 . + CDS 16468 - 17355 247 ## PROTEIN SUPPORTED gi|90020671|ref|YP_526498.1| ribosomal protein S6 22 8 Op 2 . + CDS 17385 - 17864 729 ## COG4894 Uncharacterized conserved protein + Term 17877 - 17921 8.4 - Term 17865 - 17909 4.6 23 9 Tu 1 . - CDS 17950 - 18543 592 ## Lebu_0580 hypothetical protein 24 10 Op 1 . + CDS 18832 - 20772 2457 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 25 10 Op 2 . + CDS 20799 - 21476 939 ## Lebu_1466 hypothetical protein 26 10 Op 3 . + CDS 21495 - 22460 1114 ## gi|262037240|ref|ZP_06010722.1| hypothetical protein HMPREF0554_0761 + Prom 22877 - 22936 8.1 27 11 Tu 1 . + CDS 23142 - 23705 444 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II + Prom 23733 - 23792 6.5 28 12 Op 1 . + CDS 23943 - 24488 552 ## COG0221 Inorganic pyrophosphatase 29 12 Op 2 . 
+ CDS 24490 - 24603 86 ## 30 13 Op 1 56/0.000 + CDS 25229 - 25633 620 ## PROTEIN SUPPORTED gi|229212520|ref|ZP_04338896.1| SSU ribosomal protein S12P 31 13 Op 2 51/0.000 + CDS 25663 - 26133 749 ## PROTEIN SUPPORTED gi|229212521|ref|ZP_04338897.1| SSU ribosomal protein S7P 32 13 Op 3 . + CDS 26198 - 28276 3259 ## COG0480 Translation elongation factors (GTPases) Predicted protein(s) >gi|261748707|gb|ADAD01000012.1| GENE 1 125 - 601 458 158 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0175 NR:ns ## KEGG: Lebu_0175 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 11 149 9 147 148 121 43.0 1e-26 MQENIKIDMGIMSIFKMMKNVKKLYFKSQSLSNSCMDWNYMGIGTVELTLDYNKLYFSEE IILDNNAKYIDKKMWYFHDSFIEFYHYRNEKYEKIFEFSIRNNKFVLKEKYECQPDVYYG ALSVLEDKIFFTMNIRGMRKDELLEYIYFPDDNMEQDS >gi|261748707|gb|ADAD01000012.1| GENE 2 641 - 1225 555 194 aa, chain + ## HITS:1 COG:CAC3053 KEGG:ns NR:ns ## COG: CAC3053 COG0241 # Protein_GI_number: 15896304 # Func_class: E Amino acid transport and metabolism # Function: Histidinol phosphatase and related phosphatases # Organism: Clostridium acetobutylicum # 4 184 2 180 181 167 50.0 1e-41 MGKNRFVLLDRDGVINVEKSYLYKIKDFEYESGVIEALKKLADSGYRFAVITNQSGIGQG YYSEEDFLKLEKYIEKDLCKKGIKIEKTYFCPHHPEGKGNYRKNCDCRKPGTGNFLKAIE EFNIDIKNSYMIGDRITDLIPAHKLGIKTVLVRTGYGKKNEEKVKESGLDSIIVNNISDF ADYIENCIEKTFSK >gi|261748707|gb|ADAD01000012.1| GENE 3 1269 - 2648 1613 459 aa, chain + ## HITS:1 COG:FN1461_2 KEGG:ns NR:ns ## COG: FN1461_2 COG0770 # Protein_GI_number: 19704793 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide synthase # Organism: Fusobacterium nucleatum # 23 459 6 414 416 319 46.0 8e-87 MDKNEIFWSIIDNENIDNDLTDINKISMNSKECGKNDLFVAIRGGNNYINEALEKGAYAV YDNAHADIKEEYKNKTFFVNDSVEFLQKFAKKWREALDVKVIGITGSNGKTTVKDITYQL LSSKYKGKKTEGNYNNHIGLPFTLLRLEKDDKFIILEMGMSGFGEIELLGKISNPDISII TNIGESHLEFLHTKENVFKAKTELLPYTKEVLIINGDDDYLKNINESSLKTIKVLRKDSK NIEKSNFYYGNINFDEKGSSFSLEYSEKENEKFQKKLFKTNILGEHNILNLTMAIAVAKQ 
FGIEDKDIEETVKNIKLTDMRFQITEKGNTIYINDAYNASPASMKKSLETFSEIYNDRMK IAVLGDMLELGENELELHSDIYETLRNIKLNKLYLFGERMKSLYDRVKKESQEKNSDENS EVEHFSDKNKIKEKLKEITDEKVILIKGSRGMKLEEIME >gi|261748707|gb|ADAD01000012.1| GENE 4 2668 - 3756 1567 362 aa, chain + ## HITS:1 COG:FN1459 KEGG:ns NR:ns ## COG: FN1459 COG0472 # Protein_GI_number: 19704791 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Fusobacterium nucleatum # 7 362 3 358 361 296 50.0 5e-80 MLYLLQYLFIRNWGVLRIFKSITIRASIAFGIAFLFMLIFGKPFIAWLKKKKYGDTARED GPQSHQTKSGTPTMGGLLIIAAILFATLISGNFMNKFIVFLFIITILFTCIGFYDDYLKL TRHKSGLSGKKKILGQLIITGLTFAFIYKYGIINKTLDFSIVNPIIKNSSLYITPALFFV FMMFVIIGSSNAVNLTDGLDGLVSGPIIIVCVTLLIITYLTGHYEYAKYLNLYHIKEAGE ITVYLAAAVGALIGFLWFNFYPAQVFMGDTGSLTLGGILGIIVIFLKQELLLPIAGFIFI VEAFSVMIQVWHFKTFKKRVFKMAPIHHHFEMLGIPETKVTIRFWIVTIIACLLTFVMLK LR >gi|261748707|gb|ADAD01000012.1| GENE 5 3776 - 5143 1984 455 aa, chain + ## HITS:1 COG:FN1458 KEGG:ns NR:ns ## COG: FN1458 COG0771 # Protein_GI_number: 19704790 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramoylalanine-D-glutamate ligase # Organism: Fusobacterium nucleatum # 1 455 23 454 454 314 44.0 3e-85 MKKALVFGAGLSGLGAKELLEKYDYEVFLIDDKQGIPSSEGIKILDSENIEFVVKSPGIP WKAELLKKAEEKNIKVISEIDLAYKYMDKKIKVISFTGTNGKTTTCTKMYELLEFAGYNV KLAGNAGYSFAKLVGDEEPLDYIVLELSSYQLENDPQIHSYIAGIINLTPDHLTRYNSVE EYYITKFDIFQKQNKDDFAVINLDDAEFERLCEREEIQNKIKAEKVYLSTEKKGTVFVFE DNIYIMKNLKDRIDCYADEETEKVAEKLISVKDLSLKGKHNLENMLFLISGAKILNIPDE KIKEFLKTTKALEHRLENFFVKENTVFINDSKGTNVESTLKAIDSFEKPIILICGGDDKK ISNDELIKKIKEKVEFVYLIGDNAPLLEECMDKFGYSNYKNMETVENILNYLNNNFDFTR DAVILFSPATSSFCQFKNFEHRGHVFKELTQKILG >gi|261748707|gb|ADAD01000012.1| GENE 6 5171 - 6274 1246 367 aa, chain + ## HITS:1 COG:BH3276 KEGG:ns NR:ns ## COG: BH3276 COG0772 # Protein_GI_number: 15615838 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell 
division membrane protein # Organism: Bacillus halodurans # 11 355 13 361 381 195 37.0 1e-49 MKNRKILGTTLIIIMIILAGLSIAMIASVSFPRGLKEYNSHYYFLKRQLMWLGLGSISFL FTANFNYKKYKQARGILYAVQFLFLIGVLVIGKEANGAKRWIKMGMFSIQPSEFAKLVII IYLAGLIDFLKKKREKSLGILFMTMIPLMLYAFMILLEKSFSSTVQVTLIGLTMIFISGV KMEHFISVLLMLVTLGAGSILSMPYRLKRLLGHLENSDEVYQLKQSLIAIGSGKLLGKFY GNGLQKYFYLPEIHTDYIFSGYAEETGFIGSILLILLYVALLAVILITVIRIKDMYAKYL LIGILSMFSLQIIGNLSVVLGLVPSTGIPLPILSYGGSTTIVTMAALGIVYNIIRALYKQ EIEEEMQ >gi|261748707|gb|ADAD01000012.1| GENE 7 6320 - 7384 1594 354 aa, chain + ## HITS:1 COG:FN1457 KEGG:ns NR:ns ## COG: FN1457 COG0707 # Protein_GI_number: 19704789 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: Fusobacterium nucleatum # 1 349 4 354 357 311 48.0 8e-85 MEKVVLTTGGTGGHIYPALSIAKKMREQNIETLFIGTKHRMEKELVPKEGFRFEGLDVIP LKSVKGIIKTAYATFKAFKILKKESPSQIIGFGNYITIPVLLAARLLRIPYYLQEQNCTM GLANKYFYKKAKKVFIAFENTLNSIPEKYKKKFIVTGNPLREEFYYKNKNEERKKLDIEK DKKAVLVMGGSLGAKNINEAILKVWEEIIKDKNVKLFWATGKENFEEAVFRMKNQGNSVI MPYFENTADIMSAADLVICRAGASTISELIQLEKPSILIPYDFVGQKENADVLEYVNGAK IYSNEEAEKAVKEALVLVKHDEMLNFMKNNIKKLKKENAANLIVETMKIDSYSK >gi|261748707|gb|ADAD01000012.1| GENE 8 7399 - 8742 1772 447 aa, chain + ## HITS:1 COG:FN1456 KEGG:ns NR:ns ## COG: FN1456 COG0773 # Protein_GI_number: 19704788 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate-alanine ligase # Organism: Fusobacterium nucleatum # 8 447 12 464 468 444 53.0 1e-124 MLSDARNVYFSGINGIGMSGLAKILVKEGYNVAGSDLERKAITREMEEMGIKIYIGQVEE NVKDKGIDLFVYSTAIKECNPEYKYIVENGIKKIKRGELLAQLMNKFEGIAVAGTHGKTT TSSMMSVAFLEKDPTIVVGGIIPEIQSNSKIGNSEYFIAEADESDNSFLFIKPKYSVVTN VEPDHLEHHGTYENIKKSFEKFIDSTEKLAILCKDCEDMSTLNIKNKNIIWYSIKKEDVH IFATNITVKDGCTHYEVIKNGKNLGEFTLCIPGEHNVSNSLPVIYLADEAGCNMDTVKER LAQFKGANRRYQVIYDKNLRIIDDYAHHPTEIKVTIEAAKATEKGNVTVIFQPHRYSRTK FFFDDFVTSLKKADELILLPIYSAGEDNIYDVSSEKLAEKIGNGVKVYSEEEIQKLVKDN ENSNKSYVFMGAGSVSKLAHEIKNTLK 
>gi|261748707|gb|ADAD01000012.1| GENE 9 8779 - 9642 1055 287 aa, chain + ## HITS:1 COG:FN1455 KEGG:ns NR:ns ## COG: FN1455 COG0812 # Protein_GI_number: 19704787 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate dehydrogenase # Organism: Fusobacterium nucleatum # 1 283 1 281 281 260 51.0 2e-69 MKIYENIEMKEYSHMKVGGIAKELIFIEDKKELKEVLNTRKNIFLLGNGTNTLLHDGKLD ISFISLKNFKKIAIEEKHEDYDLVRVEAGLDFDELIEFMEENNYTGLENIAGVPGSVGGL VNMNGGAYGTEIFDCIEEVEVCKNDGEITKLNKYQLDFKYRNTEIKQNKWIVISVLLKFK KGFDKECVADKRNQRKNKHPLEYPNLGSTFKNPEGTFAAQLISDAGLKEYRVGNAMVSAK HPNFIINLGDAKFSDIISIIEHVKKVVFEKFNTKLETEIIILKTEEE >gi|261748707|gb|ADAD01000012.1| GENE 10 9729 - 10394 680 221 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0208 NR:ns ## KEGG: Lebu_0208 # Name: not_defined # Def: polypeptide-transport-associated domain protein FtsQ-type # Organism: L.buccalis # Pathway: not_defined # 1 221 1 221 221 213 55.0 3e-54 MRKFIRTVLVISLLSGLIYFSKQFIETDYFKINEITVTGKNNLLKDDIISKIENLKGENI VYINTGRMEEILGKDVRVKKISIRKVYPSKLIVEFEEREPYVYVKKGNDIFLADKELNLF GHISEIESKNIPVIIYTDEDSLKDIKIILSKIKNKDLYDMISEIRKNNKTYELILKNGVK FITDSFVSSEKYDSRYKLYEKIKDEQTINYMDIRFKDVNVK >gi|261748707|gb|ADAD01000012.1| GENE 11 10409 - 11506 1562 365 aa, chain + ## HITS:1 COG:FN1451 KEGG:ns NR:ns ## COG: FN1451 COG0206 # Protein_GI_number: 19704783 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division GTPase # Organism: Fusobacterium nucleatum # 22 310 22 314 360 273 54.0 4e-73 MGESSNNAARLKVVGVGGAGGNAINDMIESDITDVDFVAVNTDAQDLARSKAETKVLLGE GLGAGADPEKARVAAKESEDKIREMLKNTDMLFITAGMGGGTGTGASPIVAEIAKNMNIL TVAVVTKPFEFEGPLKKKNAELGIENLKQNVDTLIAIPNEKLFELPNVSITLMTAFKEAN SVLRVGIKGISDLITKQGYVNLDFADVKTTMNNSGIAMLGFGEATGDGKAKTATEQALNS PLLENSIEGARKVLLNITAGPDIGLHEIKEVSETVSHKTGNAGASLIWGVIIDPELEGTI RVSIIATDFQGKYNKSFEGTVFSGFGENVEIKSDKNDVEDGKLLKDDAENQPTQFVVPSF FSKDE >gi|261748707|gb|ADAD01000012.1| GENE 12 11600 - 11884 395 94 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212296|ref|ZP_04338678.1| SSU 
ribosomal protein S6P [Leptotrichia buccalis DSM 1135] # 1 94 1 94 94 156 79 1e-37 MKNYEIMFILSTQLSDEEKKAGIALVEDTLTKAGAAELKTEVWGDRKLAYPIKKKENGYY VLTLFQIDGTKLPEVEAKLNISESILKYMIVRND >gi|261748707|gb|ADAD01000012.1| GENE 13 11951 - 12364 514 137 aa, chain + ## HITS:1 COG:FN1304 KEGG:ns NR:ns ## COG: FN1304 COG0629 # Protein_GI_number: 19704639 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Fusobacterium nucleatum # 1 137 1 154 154 109 45.0 1e-24 MNQVLLIGRLTKDPELKYSQSGKAFCRFSIAVTKEFNRNETDFFDCVAWNKTAEIIAEYM RKGKKIAIQGRLETGSYEKEGRNIKTYSIIVDKFEFVDSAGGQGQQQSSSYSQGTQPKET FADNDNDEIMDDDDFPF >gi|261748707|gb|ADAD01000012.1| GENE 14 12390 - 12623 356 77 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229860468|ref|ZP_04480112.1| SSU ribosomal protein S18P [Streptobacillus moniliformis DSM 12112] # 1 77 1 77 77 141 88 4e-33 MKPATEFKRRKRRPKVKFKVEDINYKNVDLLKNFMNDKGKISPARVTGLEAKIQRKIAKA IKRSRQIALMPYTRIEK >gi|261748707|gb|ADAD01000012.1| GENE 15 12825 - 13379 770 184 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0215 NR:ns ## KEGG: Lebu_0215 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 32 183 35 186 220 133 50.0 3e-30 MINKKVWLILLLPILLFVACQNKEKEETKNKIQDKAQEFKVFSDKSVEKVIEEFEKNSRR NKISIEKFEKLTFENKNYYHSKINIKKNSAYTINYDGIEVTSLVVKIGSVTGTDLGTIED LVVNLIEVSDENIKDEEARKLYAEILAGMKEGSLSNELSYKNTIKYAITISKDTGELIFI AQQI >gi|261748707|gb|ADAD01000012.1| GENE 16 13436 - 14026 1084 196 aa, chain + ## HITS:1 COG:DR0201 KEGG:ns NR:ns ## COG: DR0201 COG5403 # Protein_GI_number: 15805237 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Deinococcus radiodurans # 12 135 12 133 272 64 33.0 9e-11 MNLEALLGLLGGQDLGALTSQIGGSENQVKNGLEAALPAMLAALNKNTNSEKGAESLNKA LETKHDGSILNNLQGYLSNPDLKDGEGILNHLFGNQTSNVANAVSQSSGLDANGSMKMLQ MLAPVVLGALGQQKKENNLDAGGLNALTSMLSGSLGGNEKASGMMGLVTNMLDANKDGNV MDDIMGMVGKFLGGKK >gi|261748707|gb|ADAD01000012.1| GENE 17 14131 - 14643 692 170 aa, chain + 
## HITS:1 COG:no KEGG:no NR:gi|262037259|ref|ZP_06010741.1| ## NR: gi|262037259|ref|ZP_06010741.1| hypothetical protein HMPREF0554_0752 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0752 [Leptotrichia goodfellowii F0264] # 1 170 1 170 170 278 100.0 1e-73 MLIIKKLADKYGMKTVTEKSRKYRFGLSLLENMMDNIIFIAYGIVDGYYFSVSEEEVKKQ ITLKINIDNETDKSELRKLFNSLKERKYVTEIKIEKSYIQIILDKREMKGKNGELEITQE IIDEITDYLKKHNFGSKCAYTGKDDGEIAFVNIKSAYNTLNGNYKIKRMV >gi|261748707|gb|ADAD01000012.1| GENE 18 14708 - 15640 881 310 aa, chain - ## HITS:1 COG:SPy1056 KEGG:ns NR:ns ## COG: SPy1056 COG2855 # Protein_GI_number: 15675048 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pyogenes M1 GAS # 31 297 37 323 339 131 34.0 2e-30 MSKKIVFYAGCIFALILPLIKSVHAFASGIALLTGIILASFYLISTPKNLGKIRKFTLNS AVVLFGFGLNINKVIAVGSKGILQTAVSLVFVIIIGLILAKLFKMEKKLSQLIIFGTAIC GGSAIAATSPVIEASDEDIALSTGIVFILNTVALFLFAFLISYFKLNAEQTGIWTALSIH DTSSVVSAAAFHSTEALKIATIMKLTRTLWIIPIVIILSFFNKSDSKNIKFPIFILFFIL ASVIASFVNLPSFYSLLTQAGKMLLALALYLIGTSLNIKTIKKMTGKNMAFGVTLWIFSI ISGYLIMMFL >gi|261748707|gb|ADAD01000012.1| GENE 19 15772 - 16026 371 84 aa, chain + ## HITS:1 COG:no KEGG:BHWA1_01309 NR:ns ## KEGG: BHWA1_01309 # Name: not_defined # Def: hypothetical protein # Organism: B.hyodysenteriae # Pathway: not_defined # 1 84 1 84 84 117 61.0 2e-25 MPIIAKFYGIIIKMYFQQSEHNPPHFHCIYGEYAGIVEIKTLEMLEGDMPKKALEMVKEW GRKHQNELLEMWETQEFVKLPPLK >gi|261748707|gb|ADAD01000012.1| GENE 20 16073 - 16306 355 77 aa, chain + ## HITS:1 COG:no KEGG:Thebr_0090 NR:ns ## KEGG: Thebr_0090 # Name: not_defined # Def: hypothetical protein # Organism: T.brockii # Pathway: not_defined # 3 76 12 83 84 66 47.0 3e-10 MTDYNLLVQFQNGVEKIYNLNPLFEKWEDFKDLMNIRGLFEQVKVDKGGYGISWNDEIDL SCNELWNNGEIVKEKIE >gi|261748707|gb|ADAD01000012.1| GENE 21 16468 - 17355 247 295 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020671|ref|YP_526498.1| ribosomal protein S6 [Saccharophagus degradans 2-40] # 14 284 9 275 293 99 25 2e-20 
MNRQGKTEEANARIRLIVAMTIFGTIGIFVKHIPLPSSIIALARGIIGIGFLLIFTKIKK IKISFSEIKNNFPILSLSGMLIGIHWIFLFEAYHHTTVAVATLCYYLAPVFIIIASPFVL KEKLSLKKIICVTVALIGMIFVSGIFKEGGTENLQIKGILFGVGAAIIYATVILLNKHLK NISSYGMTIMQIGIAAVILLPYTAVTQNFGNLSFDFLTIVLLLIVGILNTGITYSLYFSS IKELKAQTIAIFSYIDPIVAIFLSTFLLKEKPDIYTVIGGILILGATFVSELQKD >gi|261748707|gb|ADAD01000012.1| GENE 22 17385 - 17864 729 159 aa, chain + ## HITS:1 COG:BS_yxjI KEGG:ns NR:ns ## COG: BS_yxjI COG4894 # Protein_GI_number: 16080945 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 151 2 150 162 114 37.0 1e-25 MKLYMKQKVFSLNDKFTIKDEEGNDKYKVEGEIFTLGKKLHVYNMEEREIAFIEQKLMTF MPKFFVYVNGEKIIEIVKKFTFLKPKYEIIGKDWITTGDIWGHEYTISDANSRFQIANIK KEWMTWGDSYLIDIADDQDETAVMAVILGIDAAVASQNN >gi|261748707|gb|ADAD01000012.1| GENE 23 17950 - 18543 592 197 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0580 NR:ns ## KEGG: Lebu_0580 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 27 193 2 172 174 121 42.0 2e-26 MDIFIYLIYNTCVIKIKFKKEGNAMKKKIILIFGLLLIFVMSCGGKHPAVKDFESTMKIY QSGDFTKMSQNSENATINPEVMKTFSEAYKKITYKINKTTVNKDEAIINVTMKAPDLSGV MKEAIVKIMQDPKLQAEGSDKIMTDLIKEKLNDTNNLKYNEKTFDIVYKKAQDKWFPDPY ANKEYFQMITFGILNTQ >gi|261748707|gb|ADAD01000012.1| GENE 24 18832 - 20772 2457 646 aa, chain + ## HITS:1 COG:FN2102 KEGG:ns NR:ns ## COG: FN2102 COG0488 # Protein_GI_number: 19705392 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Fusobacterium nucleatum # 1 643 1 631 631 553 49.0 1e-157 MSLVQFNRVYKQFAGEFILRDINFSIEEKDKIGLVGLNGVGKSTIIRMILGKERVDGAEN NPNEIGEIIKSPSMKIGYLSQNHEFSDEKNTIYEEMMSVFAEERKIWDELQKVNMLLGTA TGEELESLINRSAELSSLYEAKNGYEIEYKIKQILTGLELTEEYYNLYLKDLSGGERTRV SLTKLLLLEPDLLILDEPTNHLDLISIEWLEDYLKRYNKAFLLVSHDRIFLDNVCNRIYE IENKKLYKYDGNFSSFILQKEMILKGEIKRYEKEQEKIRKMEEYIDRFRAGIKARQAKGR QKILDRIERMDDPVFNPQRMKLKFETDGISGDNVLKVKNVEKSFGDKKVLNNISFNLYKG 
ERVGIIGKNGIGKSTLLKIIVDKLKKDTGEIEFGSRVKTGYYDQDHQDLSNANNILQEIN VSLNLTEEYLRTLAGAFLFSGDDVLKKVEKLSGGEKVRVSFLKLYMERANFLILDEPTNH LDIYSIEVLEDALEDFDGTMLVVSHNRHFLDSVCNTIYYLDENGLTKFKGNYEDYKESLK SAKMTSVSGTDTEIKEEQKLSYQEQKELNKKISKLKRDVAKLEDEMEKITLKREELNREY EVAGKQNDVGKLMEIQEQFDKLEQEEMEKMEEWDEKSEELKKIEKN >gi|261748707|gb|ADAD01000012.1| GENE 25 20799 - 21476 939 225 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1466 NR:ns ## KEGG: Lebu_1466 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 77 225 51 195 195 68 31.0 1e-10 MRKNILLKVLILGVMFGITVFLNGTAKTGKTPNKKEKQEDCYIPKGELTCVTDINTVKGS IKLAEKEAEKDKLFKDAEKYFKKGRFGNENVGFIDYPEGWMMFVDSDSGPSTMQISREGF DIYTLDVLYIVDESINLTKITEEIAKQNYDDLLKAGHTKDNLEIKDVIINGYKGKSVKVK RVSGKSWITYYIPNKKKMHIIVVEGLPQHTDKMQKFVERSWNPYK >gi|261748707|gb|ADAD01000012.1| GENE 26 21495 - 22460 1114 321 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037240|ref|ZP_06010722.1| ## NR: gi|262037240|ref|ZP_06010722.1| hypothetical protein HMPREF0554_0761 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0761 [Leptotrichia goodfellowii F0264] # 1 321 1 321 321 536 100.0 1e-150 MKVSKNEVYFAFFIGFILIAMSMYNIVLGLNRDSKAKEIYASEIKKLKEKYNLIIDKDKY TVKYKGQSCGGCIAKGYDFYIMKKKEKEKYKSKYFSKEKEEKSNTYMITPSNDYILVHSW SEVNSVFENKEHLLEEPGSGFLLEMIETFGFKPYIFNEFLYDKTKGNDFQEIEKIFDNYK EGKIHYYSQYPVWDCGTVKNNVRQGSMLFVEDEECGISDEKDKKGRYSLRKLKEYGNRFS QYFSQKREFEKIDWYEFKKFNNLHPVIIFEVYNAERKELEKMRDEIKKYYNDKELTIVLV GYPEKKFRSGKDKEKELVAVW >gi|261748707|gb|ADAD01000012.1| GENE 27 23142 - 23705 444 187 aa, chain + ## HITS:1 COG:CT776 KEGG:ns NR:ns ## COG: CT776 COG0318 # Protein_GI_number: 15605509 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Chlamydia trachomatis # 5 183 253 437 537 69 27.0 4e-12 MKNTIVLMKNTDHPRKILNKIERYRVTHMAFTPFYLELINMCNNLKINFNSLRKICFRGS VLTLENYLESKKIFPKTEFIQTYGQIEAGPRITGKKIEKEYNPKNVGKAIKKTKIKILKK 
EKLSNKIGEIGEIVVKIPCIIKKYFKIRRNILFEKKWLKTGDVGYFNEKKDLILLGRKNN IIKNRGF >gi|261748707|gb|ADAD01000012.1| GENE 28 23943 - 24488 552 181 aa, chain + ## HITS:1 COG:FN0099 KEGG:ns NR:ns ## COG: FN0099 COG0221 # Protein_GI_number: 19703447 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase # Organism: Fusobacterium nucleatum # 71 180 8 117 117 137 67.0 8e-33 MKKSILKILKKNKIKDDEENIIVDSLEFIRLIADLEESYKIKFDDEDLIFENFSSINRII EIIKKRKLLNYKNYLNQKIKVKVDRKLGDKHPEYGYIYSLNYGYIPNTESEDGEEIDVYI LGEFDPLEEFEGVCRAIIYRIDDIENKLIVTAEDKKYSIDQIKALVEFQERFFKTEIIME K >gi|261748707|gb|ADAD01000012.1| GENE 29 24490 - 24603 86 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEIIIFLTVSPESHRTAEENYRNKKTKEFFEDKRRIF >gi|261748707|gb|ADAD01000012.1| GENE 30 25229 - 25633 620 134 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212520|ref|ZP_04338896.1| SSU ribosomal protein S12P [Leptotrichia buccalis DSM 1135] # 1 127 1 127 128 243 96 1e-63 MLKGGNMPTINQLVRFGRSTSEKKKKSPALKGNPQKRGVCVRVYTTTPKKPNSALRKVAR VKLVNGIEVTAYIPGIGHNLQEHSIVLLRGGRTKDLPGVRYKIIRGALDTAGVVNRKQGR SRYGAKKPAAASSN >gi|261748707|gb|ADAD01000012.1| GENE 31 25663 - 26133 749 156 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212521|ref|ZP_04338897.1| SSU ribosomal protein S7P [Leptotrichia buccalis DSM 1135] # 1 156 1 156 156 293 92 1e-78 MSRRRRAEKRDVLPDSQFNDKVVTKFINGLMKDGKKSLAEKIFYTALKEIAEETQEEGIE VFRRAMENVRPQLEVRSRRIGGATYQVPVEVRKERQQALAIRWIVKYTRERKEYGMVNKL KKELIAAANNEGGSVKKKDDTYKMAEANRAFAHYKW >gi|261748707|gb|ADAD01000012.1| GENE 32 26198 - 28276 3259 692 aa, chain + ## HITS:1 COG:FN1556 KEGG:ns NR:ns ## COG: FN1556 COG0480 # Protein_GI_number: 19704888 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Fusobacterium nucleatum # 1 689 1 691 693 1037 75.0 0 MARKVALKDTRNIGIMAHIDAGKTTTTERILFYTGVNHKIGEVHEGAATMDYMEQEQERG ITITSAATTAFWNGNRINIIDTPGHVDFTVEVERSLRVLDGAVAVFSAVDGVQPQSETVW RQADKYNVPRMAFLNKMDRVGADFNMCVSDIKQKLGGNGVPIQLPIGAEDAFEGIIDLIT 
MKEYLFKDDTMGADYDIADVRPELLEEAQIARERMLESVVETDEALMEKYFGGEEISEDE IRKAIRIATIGGIVVPVLCGTAFKNKGIQPLLDAVVYYMPSPLDIGAVKGIDPKTESPME RQPSDEEPFAALAFKILTDPFVGRLSFFRVYSGILNKGSYVLNSTKGKKERMGRLLQMHA NKREELDIVYSGDIAAAVGLKETTTGDTLCDESKPIILEKMEFPDTVIQIAVEPKTKADQ EKMGTALAKLAEEDPTFKVTSNQETGQTLIAGMGELHLEIIVDRMKREFKVEANVGKPQV AYRETITGNSDVEEKYAKQSGGRGQYGHVKIRVESNPDKGYEFINEVTGGAIPREYIPAV DKGIQEALEAGVVAGYPVQDVKVTLYDGTYHEVDSSEMAFKIAGSMAIKKAMRAANPVLL EPIFKVEVTTPEEYMGDVIGDLNSRRGQVSGMTDRNNAKIINAQVPLSQMFGYATDLRSK TQGRASYSMEFEKYVQVPNNIAQQVIAERQGK Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:28:25 2011 Seq name: gi|261748705|gb|ADAD01000013.1| Leptotrichia goodfellowii F0264 contig00084, whole genome shotgun sequence Length of sequence - 1158 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 30 - 89 7.5 1 1 Tu 1 . + CDS 115 - 1156 1030 ## COG5421 Transposase Predicted protein(s) >gi|261748705|gb|ADAD01000013.1| GENE 1 115 - 1156 1030 347 aa, chain + ## HITS:1 COG:MA3502 KEGG:ns NR:ns ## COG: MA3502 COG5421 # Protein_GI_number: 20092312 # Func_class: L Replication, recombination and repair # Function: Transposase # Organism: Methanosarcina acetivorans str.C2A # 16 338 15 333 517 105 30.0 1e-22 MASLVFLYNKKYNKTYVYESINYWDKSEKKSKSKRKLIGIKDPLTGQIVPTSTQKKKLEE NKAQNDKRKFYGANLLLNLIAKKLGLTSNLKECFPDLYKEILSVAQYLILEKDSPISRYE KWSKIHKTFNRSELTSQRISEMFSEINESGKNNFFKLQAEQLKEDEYWAYDTTSISSYSK AINQMRYGYNKENDTLAQINLAILYGEKSRLPFYYRVLPGNIVDVSTVRRLIKDVQYIGV KKPKLVMDRGFYSKNNIDELMNNKFKFIVGTKSSSKIIKERIEKVKDIKKFTNYIAEYNI YGKKEILMWDDTDKENKKYLYLYVFFDDEKALKEDKDFTEYLIKLKT Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:28:26 2011 Seq name: gi|261748699|gb|ADAD01000014.1| Leptotrichia goodfellowii F0264 contig00027, whole genome shotgun sequence Length of sequence - 4098 bp Number of predicted genes - 7, with homology - 5 Number of transcription units - 3, 
operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 1222 1632 ## COG1269 Archaeal/vacuolar-type H+-ATPase subunit I 2 1 Op 2 . - CDS 1226 - 1537 479 ## gi|262037272|ref|ZP_06010752.1| V-type sodium ATP synthase subunit G - Prom 1567 - 1626 3.3 3 2 Op 1 . - CDS 1632 - 1754 58 ## 4 2 Op 2 . - CDS 1824 - 2243 574 ## Lebu_0751 hypothetical protein 5 2 Op 3 29/0.000 - CDS 2247 - 3170 1011 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis 6 2 Op 4 . - CDS 3198 - 3623 598 ## COG2001 Uncharacterized protein conserved in bacteria 7 3 Tu 1 . - CDS 4011 - 4097 111 ## Predicted protein(s) >gi|261748699|gb|ADAD01000014.1| GENE 1 1 - 1222 1632 407 aa, chain - ## HITS:1 COG:FN1741 KEGG:ns NR:ns ## COG: FN1741 COG1269 # Protein_GI_number: 19705062 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit I # Organism: Fusobacterium nucleatum # 1 404 1 400 638 188 33.0 1e-47 MAIVKMSKFNLIAFDSQRAKILKGLQKFKEVSFIDINEDDNEEETLNRVFDNEELTKIEE RLYSVDYSIKLLKRYQVIKKDIKLMMKGNDNYTFEELAKKAAVYDWKEICQKLKKLGSNL DEVRSKISKKYGELESVSLWKRLDVNPEELKNLKTVDTYLGTVPFKLKNDFISKISELDK TYYEELRVTKDDVYYLVISDKSEAEKEKLAETFRDTSFAVTDIKLDAVPEDQMNVLKKEI QELKDEKHALKDEIKSYNEELSSLEAVYEYLKNKKLRITETEKMAKTENTCILRGWIPTG RKEEFEKTVKEITHGNYYTEFEEADLEDEKVPIKLKNGKIVSAFENLTEMYALPKYNEID PTPLFTPFYIVFFGMMGADVGYGLILLLGTLFTLKFVNLNKKMKLMV >gi|261748699|gb|ADAD01000014.1| GENE 2 1226 - 1537 479 103 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037272|ref|ZP_06010752.1| ## NR: gi|262037272|ref|ZP_06010752.1| V-type sodium ATP synthase subunit G [Leptotrichia goodfellowii F0264] V-type sodium ATP synthase subunit G [Leptotrichia goodfellowii F0264] # 1 103 1 103 103 94 100.0 3e-18 MAKEVLEKIKSTEKEADEIIIKANENAKNILKDIDKKIKDDSEKIIFDAQQEIKEQEEKT VEEVNKDVEFLLNKERESLKSILDIDEGKMNEIVNLLAERIVK >gi|261748699|gb|ADAD01000014.1| GENE 3 1632 - 1754 58 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MKTEIFYILAIQKSNSKTKKPLTRENKSDRIKNDLRVILK >gi|261748699|gb|ADAD01000014.1| GENE 4 1824 - 2243 574 139 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0751 NR:ns ## KEGG: Lebu_0751 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 3 138 9 138 139 67 30.0 2e-10 MNKKKLQMIEFPSISHTIKKYDNFQNEVIITVPRRVQPKVKKRTVVKVKGCSTPAVVAAF IYVVLIVSMLMGRMWMTYTVSNLGIEKTNVENKLNELKKEVDTLENTYISNFDLKQVEEK SKELGFVPNNDMKYVKINN >gi|261748699|gb|ADAD01000014.1| GENE 5 2247 - 3170 1011 307 aa, chain - ## HITS:1 COG:FN1711 KEGG:ns NR:ns ## COG: FN1711 COG0275 # Protein_GI_number: 19705032 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Fusobacterium nucleatum # 2 306 7 311 314 321 55.0 9e-88 MEYHKPVLFDEVMENIIKKEKAIYVDCTLGGGGHTEGILKSSSDDSKVIAIDQDIEAINF AKKRLENYGNKMEIFQDNFRNIDTVVYFAGFNKVDRILMDIGVSSNQLDNIERGFSYKYE AKLDMRMNKELSVSAYDVVNKFAEKEIADIIYKYGEEPKSRKIAKNIVEYRKNKNIETTV ELSDIVIKSIGKSMKKHPSKRTFQAIRIFVNKELEVLEEALDKAVNLLNKGGKLLVITFH SLEDRMVKEKFRKYENPCVCPPELPVCVCKKESLGKIVTKKPITAKNEELEINNRAHSAK LRIFERS >gi|261748699|gb|ADAD01000014.1| GENE 6 3198 - 3623 598 141 aa, chain - ## HITS:1 COG:BS_yllB KEGG:ns NR:ns ## COG: BS_yllB COG2001 # Protein_GI_number: 16078577 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 1 139 1 138 143 131 44.0 5e-31 MFMGEFSCKVDNKGRLMLPVKFREQLGEGEFVITRGLDNCIDLFPIEEWEHRMEKLKQLK TTNSNHRAYQRFILSAATKLTLDSQGRLNLPSSLIGHAEISKNATVMGSDDHIEIWSEEK WNDYINQKSGIIEDIVDGMDF >gi|261748699|gb|ADAD01000014.1| GENE 7 4011 - 4097 111 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no IDFALEIVKELVGEEKAEKIASQIVYKS Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:28:45 2011 Seq name: gi|261748697|gb|ADAD01000015.1| Leptotrichia goodfellowii F0264 contig00233, whole genome shotgun sequence Length of sequence - 483 bp Number of predicted genes - 1, with 
homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 37 - 483 116 ## Smon_0058 hypothetical protein Predicted protein(s) >gi|261748697|gb|ADAD01000015.1| GENE 1 37 - 483 116 148 aa, chain - ## HITS:1 COG:no KEGG:Smon_0058 NR:ns ## KEGG: Smon_0058 # Name: not_defined # Def: hypothetical protein # Organism: S.moniliformis # Pathway: not_defined # 4 148 258 406 407 119 44.0 3e-26 GSGDVNNIKGIIKYLGRYLARSPIAEYKITDITDNEVTFFYNDLANDKQKTFITMPIQKF VSQILIHVPPKNFKMVNRYGLYARRISNKLKSAVIPFKKNIVPNKLSFYQRQTFKTFGIN PFYCPVCNIRMIVWEFYHYLYPELKRYY Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:28:48 2011 Seq name: gi|261748694|gb|ADAD01000016.1| Leptotrichia goodfellowii F0264 contig00161, whole genome shotgun sequence Length of sequence - 1085 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 400 491 ## gi|262037281|ref|ZP_06010759.1| putative parallel beta-helix repeat-containing protein 2 1 Op 2 . 
+ CDS 486 - 992 731 ## gi|262037320|ref|ZP_06010790.1| hypothetical protein HMPREF0554_2163 Predicted protein(s) >gi|261748694|gb|ADAD01000016.1| GENE 1 2 - 400 491 132 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037281|ref|ZP_06010759.1| ## NR: gi|262037281|ref|ZP_06010759.1| putative parallel beta-helix repeat-containing protein [Leptotrichia goodfellowii F0264] putative parallel beta-helix repeat-containing protein [Leptotrichia goodfellowii F0264] # 17 132 1 116 116 181 99.0 1e-44 KKKISDDDSKCFFNSCLKLDDENVIYSKGNLLIKRNLKTKKEKILYKAHNEIKGIKSLNK ENTLISFGVSNNASNISLILGYINKWKWDHDVVYDLKENKIYMLETGNDNLVDKEEQLRL ISEIIDYGLEKD >gi|261748694|gb|ADAD01000016.1| GENE 2 486 - 992 731 168 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037320|ref|ZP_06010790.1| ## NR: gi|262037320|ref|ZP_06010790.1| hypothetical protein HMPREF0554_2163 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2163 [Leptotrichia goodfellowii F0264] # 26 161 67 202 1046 251 97.0 2e-65 MSLVSEKRGKRVLGRFISLSIMLLVTVTPSISKQKSESEGVYYSKNEDITKGNSHYNNTG NVKTVGVAVRTGSISGTVNGPHETISVQDSVNSSSRGYSISLGIGIGPHTEKRKAVNGPY ISSATVGYNKGDVNQKITRNVAEFTAGSGMLDVKGKIVQVGCCGIIRL Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:29:06 2011 Seq name: gi|261748688|gb|ADAD01000017.1| Leptotrichia goodfellowii F0264 contig00057, whole genome shotgun sequence Length of sequence - 3718 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 25 - 489 483 ## gi|262037287|ref|ZP_06010764.1| hypothetical protein HMPREF0554_0791 - Term 560 - 601 0.3 2 2 Tu 1 . - CDS 623 - 1015 438 ## PROTEIN SUPPORTED gi|229212814|ref|ZP_04339169.1| SSU ribosomal protein S30P - Prom 1140 - 1199 13.6 + Prom 1115 - 1174 8.5 3 3 Tu 1 . + CDS 1210 - 1572 625 ## COG0251 Putative translation initiation inhibitor, yjgF family + Term 1598 - 1650 7.4 + Prom 1619 - 1678 10.2 4 4 Op 1 . 
+ CDS 1713 - 3236 1819 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes 5 4 Op 2 . + CDS 3262 - 3718 623 ## COG0130 Pseudouridine synthase Predicted protein(s) >gi|261748688|gb|ADAD01000017.1| GENE 1 25 - 489 483 154 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037287|ref|ZP_06010764.1| ## NR: gi|262037287|ref|ZP_06010764.1| hypothetical protein HMPREF0554_0791 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0791 [Leptotrichia goodfellowii F0264] # 1 154 1 154 154 243 100.0 2e-63 MQRIKIIIKIVSVLLMILLCFMIADHLNIKKFSIENIKISDKKFNEKEWKNHIKRNMMLK DLFKNYKSALENEESRKKILGNEENLGNGNIISLLEEYEYRTQGYKIFEKGNLLKKKVFL VFYRKKCVIVEEEYIFSMMFPLPNNTVDCESVIK >gi|261748688|gb|ADAD01000017.1| GENE 2 623 - 1015 438 130 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229212814|ref|ZP_04339169.1| SSU ribosomal protein S30P [Leptotrichia buccalis DSM 1135] # 10 112 1 103 106 173 81 2e-43 MRIMRGCGSMKIIISGKQLKITDAIKSYTEEKINKLSKYTDAITEVDIVLAVEKKKTEGD VHKADGLVYASGTKIKVEARNDDLYAAIDELSERLERQVRKYKEKQKDHNKKVAINNKFS NIKVTLTLIV >gi|261748688|gb|ADAD01000017.1| GENE 3 1210 - 1572 625 120 aa, chain + ## HITS:1 COG:Cj1388 KEGG:ns NR:ns ## COG: Cj1388 COG0251 # Protein_GI_number: 15792711 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Campylobacter jejuni # 1 117 1 117 120 122 56.0 2e-28 MKKIPEAVGPYSAFRIAGDFLYISGQLGINPETQNIDSDTVEGQAKQALENMKAILENNG YSMKNVVKTTVFLDKISDFVAVNNIYAGYFEEPYPTRSAFEVAKLPKGALVEIEAIAYLK >gi|261748688|gb|ADAD01000017.1| GENE 4 1713 - 3236 1819 507 aa, chain + ## HITS:1 COG:lin2646 KEGG:ns NR:ns ## COG: lin2646 COG1502 # Protein_GI_number: 16801708 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes # Organism: Listeria innocua # 28 507 21 482 482 306 35.0 8e-83 MFNHLGNLLIWIISIEHIHLALYIGFVIAILVSEKTPESMVAWIFTITVFPIFGFILYVV 
LGVNWRKSRVTVDKKGKRRTTLLNSFSGEFMKYDPTQLFESKELRVKNLEKSMGNANINE EEKEIVKLLYESERTYLTNNGSYELFYDGKEAFDSIIKDLEEAKEVIYMEYFIWRSDELG EKIKNLLIKKAKEGLKIKLLFDGMGSMGTISKKYRKELAEVGVEFRYFLDIKYKISKLNY RNHRKMTIIDNKILHTGGMNLGEEYITGGKRFETWRDTNIRITGELVIHYLAIFATDWLN SGGKEDFEREKIDKIIKGNKVGHPDAYLMQVSSSGPDTVWASLKYMYSKMIAVAKEEVLI QSPYFIPDTSLISQLQIASLSGVKIKIMITGVPDKKIPYWIAETYFGELLAAGVEIYRYK AGFLHCKDIVTDGKISTMGTCNFDMRSFEINYEVNSVFYNEEISQNIRNQFYKDLEQCEQ IKEENLKKIVFWRKIRNSLFRVVSPIM >gi|261748688|gb|ADAD01000017.1| GENE 5 3262 - 3718 623 152 aa, chain + ## HITS:1 COG:FN0635 KEGG:ns NR:ns ## COG: FN0635 COG0130 # Protein_GI_number: 19703970 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridine synthase # Organism: Fusobacterium nucleatum # 10 151 2 141 287 128 52.0 3e-30 MTEKNKFIEDGMILLNKRKGISSFKAINELKRAIYAEKIGHAGTLDPMAEGLLIVMVNGA TKFSDDLMKKDKEYYVELELGYETDSYDTEGAVTLKYEGEIEISKEKISEIIFGFVGDIE QIPPMYSAIKVNGQKLYELARKGIETERKARK Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:29:15 2011 Seq name: gi|261748686|gb|ADAD01000018.1| Leptotrichia goodfellowii F0264 contig00095, whole genome shotgun sequence Length of sequence - 639 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 639 758 ## gi|262037289|ref|ZP_06010765.1| hypothetical protein HMPREF0554_1475 Predicted protein(s) >gi|261748686|gb|ADAD01000018.1| GENE 1 3 - 639 758 212 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037289|ref|ZP_06010765.1| ## NR: gi|262037289|ref|ZP_06010765.1| hypothetical protein HMPREF0554_1475 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1475 [Leptotrichia goodfellowii F0264] # 16 212 1 197 198 294 100.0 3e-78 GLFIKGYGLTKEYAEMNKEEKVGMVIEGTAQGAGGVIRAGAGASAVGIGLGTCVETAGGG CLAAGAGAANVSFGGNEAFMGGQKILQGFTNQGYTIKGTPGEEYAKLKAGKEANVNRSFK GIINPIKSTMTTIGFGEDDYDVLNMMTGQGMQYASQYVQTYRPMYEAKKAAVVANSKNQE TGSGGSQGKREINKELHAKINNREKEHFANLQ Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:29:27 2011 Seq name: gi|261748684|gb|ADAD01000019.1| Leptotrichia goodfellowii F0264 contig00194, whole genome shotgun sequence Length of sequence - 506 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 59 - 505 280 ## gi|262037291|ref|ZP_06010766.1| hypothetical protein HMPREF0554_2389 Predicted protein(s) >gi|261748684|gb|ADAD01000019.1| GENE 1 59 - 505 280 148 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037291|ref|ZP_06010766.1| ## NR: gi|262037291|ref|ZP_06010766.1| hypothetical protein HMPREF0554_2389 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2389 [Leptotrichia goodfellowii F0264] # 2 148 1 147 147 186 99.0 5e-46 GLCFKIYNISETNFEINKFILEILKKEYKIKKNKYEKLENELGLLINNEQNSKELLEYIG RNFLVLTNLSVISFLLKSYYDSIFKDIDRIEKIKEIFKQELSPLVIILGVIIIIMMFSYM FKNPKNKQERINFLHRISYYLDELEREE Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:29:39 2011 Seq name: gi|261748675|gb|ADAD01000020.1| Leptotrichia goodfellowii F0264 contig00020, whole genome shotgun sequence Length of sequence - 10110 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 6, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 642 340 ## Clole_1308 hypothetical protein - Prom 691 - 750 9.3 2 2 Op 1 . - CDS 850 - 1557 783 ## COG2188 Transcriptional regulators 3 2 Op 2 . - CDS 1584 - 3074 1900 ## COG1640 4-alpha-glucanotransferase - Prom 3158 - 3217 15.8 + Prom 3041 - 3100 4.9 4 3 Tu 1 . + CDS 3227 - 5542 2540 ## COG0058 Glucan phosphorylase + Term 5565 - 5602 3.3 + Prom 5599 - 5658 8.4 5 4 Op 1 1/0.667 + CDS 5680 - 6471 630 ## COG3568 Metal-dependent hydrolase + Prom 6515 - 6574 7.8 6 4 Op 2 . + CDS 6620 - 8200 2107 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific + Term 8214 - 8269 7.6 - Term 8200 - 8257 4.2 7 5 Tu 1 . - CDS 8307 - 9014 576 ## COG2188 Transcriptional regulators - Prom 9103 - 9162 18.2 + Prom 8994 - 9053 15.2 8 6 Tu 1 . 
+ CDS 9226 - 10108 920 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases Predicted protein(s) >gi|261748675|gb|ADAD01000020.1| GENE 1 3 - 642 340 213 aa, chain - ## HITS:1 COG:no KEGG:Clole_1308 NR:ns ## KEGG: Clole_1308 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 2 213 3 214 234 219 55.0 7e-56 MDLEGSIVKINQVSEKQLKEMYFLMAEFYNNITEKNFLKDFLEKDYCIILKDRENRIKGF STQKILTLNYKGKNIHGVFSGDTVIHKESWGSFSLFQVFSKFFFTYGEQYDNFYWFLIVK GHKTYKILPTFLKKFYPNFKEPTPPEIKSLMDFFGCTLYPEEYDPVTGIIEYNDIKDSLK EGIADITNKELRDNHVKFFIQKNPDYEKGNDLV >gi|261748675|gb|ADAD01000020.1| GENE 2 850 - 1557 783 235 aa, chain - ## HITS:1 COG:BH0873 KEGG:ns NR:ns ## COG: BH0873 COG2188 # Protein_GI_number: 15613436 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 1 235 2 235 237 157 40.0 2e-38 MSKYKDVYNDIKNRIRDGILKPGEFLDSESELANEYSYSKDTIRKALSILELEGYIQKIK GKNSLILERGYLKNVSLSSLQTSQELNKAENLNIKTNLISLYIIQDDKKLMDIFHATKDN DFYKVVRTRSLDGEKLEYDVSYFDRRIVPFLSKEISQRSIYEYLENELNLKISHSRREIR FRYATSEEKEHMDLKNYEMVVVIETYAYLSNGSLFEYGTISYRPDKFTFSIVAKR >gi|261748675|gb|ADAD01000020.1| GENE 3 1584 - 3074 1900 496 aa, chain - ## HITS:1 COG:alr3871 KEGG:ns NR:ns ## COG: alr3871 COG1640 # Protein_GI_number: 17231363 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Nostoc sp. 
PCC 7120 # 2 496 3 497 502 533 53.0 1e-151 MFERSSGILFHPTSLPGKYGIGTLGKEAYAFIDFLKKSKQKLWQIFPLGPTGYGDSPYQS FSSFAGNPYLIDFDLLIEAHLLSEEDLRDIFFGDNEEYIDYGAIYNQKYPLLRKAYENFK SSDNNEMKGALENFKRENSSWLNDYSLYISLKNHFNGLPWNEWAQDIKNREDGAMHHYRS ELVDDIEYHNFIQFLFFKQWGDVKRYANENGIKIIGDIPIFVAADSSDAWANPEIFLFDE ERKPVKVAGVPPDYFSATGQLWGNPLYNWEKLKETNYSWWVERVRANLSTCDIIRIDHFR GFEAYWAVPYGDDTAINGQWEPGPGIDLFNAIKSQLGELPIIAEDLGLMTQGVIDLREAT SFPGMKILGFAFDSGEENDYLPHTYTKNCVVYTGTHDNDTLVGWFQKAKEEDRQFARDYL NSRSDDEIHWDAIRGAWSSVACMAISPVQDFLGLGSEARINTPGVASGNWQWRLKQGVLT NELAERIAKLTKIYSR >gi|261748675|gb|ADAD01000020.1| GENE 4 3227 - 5542 2540 771 aa, chain + ## HITS:1 COG:SPy1291 KEGG:ns NR:ns ## COG: SPy1291 COG0058 # Protein_GI_number: 15675244 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Streptococcus pyogenes M1 GAS # 19 768 4 754 754 1063 72.0 0 MNRKFIMKGIKIRTMNHNFTEFLKRGGNKELKEMTNQEIYYKLLEYVKEEAEKKSENKSK KKIYYISAEFLIGKLLSNNLINLGIYKEIREELKNAGKKLSDIEELETEPSLGNGGLGRL ASCFVDSMSTLGINGEGIGLNYHCGLFKQVFKKNEQNAEPNYWIEDESWLRDTNIGYKVK FKNFTLNSKLKRIDILGYEKDTKNYLNLFDIESVDYNLIEEGITFDEEIIEKNLTLFLYP DDSTKKGELLRIYQQYFMVSNAARLIIAKAVEKGSNIHDLADYAFVQINDTHPSMVIPEL IRIMTEEYKISFEESVEIVTAMTGYTNHTILAEALEKWPLEYLEEVVPDIVEIIKKLDKI VKIKYNNENVQIIDKEDRVHMANMDIHFSSSVNGVAYLHTEILKNSELKDFYEIYPEKFN NKTNGITFRRWLESCNEDLADYIKELIGTGYLTDAEKLEELLKFSDDKEVYRKLENIKKE NKLKLKEYLQHTQGIVIDENSIIDTQIKRFHEYKRQQMNALYVIKKYLDIKSGKLPERKI TVLFGGKAAPAYIIAQDIIHLILCLSEIINNDPEVNKYLNVYLVENYNVGLAEKIIPATD ISEQISLASKEASGTGNMKFMLNGALTLGTMDGANVEICELVGNENIYIFGKHSDEIIEL YEKEGYVSKDYYKQEGIKEVVDFITSKELVKVGNSERLERLHNELVNKDWFMTLIDFKEY YEVKEKMLSDYENKELWYKKVINNIAKAGFFSSDRTIGQYENEIWKTKDKK >gi|261748675|gb|ADAD01000020.1| GENE 5 5680 - 6471 630 263 aa, chain + ## HITS:1 COG:SPy1985 KEGG:ns NR:ns ## COG: SPy1985 COG3568 # Protein_GI_number: 15675775 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Streptococcus pyogenes M1 GAS # 2 261 3 269 272 117 30.0 2e-26 
MKLLTINVHAWIEENQDEKMEILAKVIAENDYDVIAMQEVNQSMNSLVLFRAIRQDNYGW VLLDKISKYTDRTYYYHWSNSHIGYDKYDEGLAIITKHKLLDVDEFYCTRAQSVNTITSR RINSATIEYKGQIMEFYTCHMNLPTNKEEKMSDNIQRILKRSQTDNLKILMGDFNTDAIN SPEDYKMILSQGLYDTYTAAIEKDSGITVGGNIDGWSKNKEDKRIDYIFSNKKIKVLSSK VIFNGKNHPVVSDHYGLEVILDM >gi|261748675|gb|ADAD01000020.1| GENE 6 6620 - 8200 2107 526 aa, chain + ## HITS:1 COG:CAC0532_1 KEGG:ns NR:ns ## COG: CAC0532_1 COG1263 # Protein_GI_number: 15893822 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Clostridium acetobutylicum # 1 447 1 449 453 546 64.0 1e-155 MMKKIQRFGGAMMAPVLLFAFTGIVVGLASVFTNAQIMGKIAEPGTLWHNFWYVVAEGGW TVFRQMPILFAIGLPISLATKTNARACMETFALYMTFNYFVSAILKAFYGIDAAKQIADG VTGYSAIGGVPTLDTNLFGGILIAALVVYLHNKYFDKKLPEFLGVFQGSVFVYIIGFIVM IPCAFLTVLIWPKVQMGIGAMQGFMKASGIFGVWVYTFLERILIPTGLHHFVYTPFVFGP AAVPDGIQVHWVNNITEFARSTQPLKTLFPAGGFALHGNSKIFGAAGIALAMYSTAKPEK KKTVAALLIPVVFTAIVSGITEPLEFTFLFIAPVLFAVHAFLAATMAATMYAFGVVGNMG GGLLDFLFQNWLPMFKNHSVTVITQIVIGLIFTVIYFFVFRFLIQKMNLKTPGREDEDEE MKLYTKADYKAKHGESGAAASGSSNDQYMDQAVIILEALGGKDNIEELNNCATRLRVSVK DEAKLAKDAAFKAGGAHGVVRKGKAVQVIIGLTVPQVRERIENLVK >gi|261748675|gb|ADAD01000020.1| GENE 7 8307 - 9014 576 235 aa, chain - ## HITS:1 COG:BS_treR KEGG:ns NR:ns ## COG: BS_treR COG2188 # Protein_GI_number: 16077849 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 1 235 3 236 238 134 33.0 2e-31 MNKYEKVYNDIKEKIINNTIKTGEFLKKEDDLAKDYNFSKLTVRKALSMLEAEGYIQKIK GKKSVVMEKKNLENLSLTSIQTKQEINKMQNINIKTNLISLYIVQGVEKLMKEFNVSEDA DFYKVVRINSLDDEVLSYSTSFFDRKIVPFLNEDIAKNSIYEYLENDLNLKIAYSRRDIK FRKITAEEQEYFKLKDINMVVVIETHAYLSNGALFQYETITHHPEKFTFTAIAKR >gi|261748675|gb|ADAD01000020.1| GENE 8 9226 - 10108 920 294 aa, chain + ## HITS:1 COG:CAC0533 KEGG:ns NR:ns ## COG: CAC0533 COG1486 # Protein_GI_number: 15893823 # Func_class: G Carbohydrate transport and metabolism # Function: 
Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Clostridium acetobutylicum # 1 294 1 296 441 513 80.0 1e-145 MKKFSIVVAGGGSTFTPGIVLMLLDNLDKFPIRQIKFYDNDAERQEIIAKACDVIIKEKA PDINFVYTTDPETAFTDVDFVMAHIRVGKYAMREKDEKIPLKHGVLGQETCGPGGIAYGM RSIGGVLELVDFMEKYSPNAWMLNYSNPAAIVAEATRRLRPNSKILNICDMPIGIELRMA EMLGLESRKDMVIRYFGLNHFGWWTDIRDKEGNDLMPALKEKVAKVGYNVLIEGENEASW SETFTKAKDVFAVDPTTLPNTYLKYYLFPDYVVEHSNPNHTRANEVMEGREKFV Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:29:46 2011 Seq name: gi|261748668|gb|ADAD01000021.1| Leptotrichia goodfellowii F0264 contig00109, whole genome shotgun sequence Length of sequence - 6960 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 38/0.000 - CDS 76 - 960 1017 ## COG0395 ABC-type sugar transport system, permease component 2 1 Op 2 35/0.000 - CDS 970 - 1863 786 ## COG1175 ABC-type sugar transport systems, permease components - Prom 1923 - 1982 7.3 3 1 Op 3 . - CDS 2059 - 3390 2113 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 3427 - 3486 5.6 - Term 3479 - 3517 2.1 4 2 Op 1 . - CDS 3571 - 4497 1204 ## COG1105 Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) 5 2 Op 2 4/0.000 - CDS 4577 - 6031 2023 ## COG3119 Arylsulfatase A and related enzymes 6 2 Op 3 . 
- CDS 6044 - 6958 1281 ## COG4146 Predicted symporter Predicted protein(s) >gi|261748668|gb|ADAD01000021.1| GENE 1 76 - 960 1017 294 aa, chain - ## HITS:1 COG:mlr2246 KEGG:ns NR:ns ## COG: mlr2246 COG0395 # Protein_GI_number: 13472069 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Mesorhizobium loti # 24 293 25 285 285 123 29.0 5e-28 MKKENKSIGRLYKIFIYTALVSFAISIIVPVSWVFIASFKQDSEFTGSPWTLPKGIYIQN FIDAFQKAKMGESLWNSVFVTAVALILLIVIALPASYVLARFEFRGRKFWNTFMKAGLFI NVNYIVIPIFLMLLKGDKILRQSEIVKGNFFLDNLFILAVIYMTTALPFTIYLLSNFFQS LPTTFEEAAAIDGAGYFTTMIKVMVPMARPSIITVILFNFLAFWNEYIIALTLLPGPKKT LPVGLMSLMAASKGAAHYGILYSGMVIVMLPTLILYILVQQKLTQGMTLGGSKE >gi|261748668|gb|ADAD01000021.1| GENE 2 970 - 1863 786 297 aa, chain - ## HITS:1 COG:BS_yurN KEGG:ns NR:ns ## COG: BS_yurN COG1175 # Protein_GI_number: 16080312 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 13 283 10 279 292 164 35.0 2e-40 MKTNRYSKNLFIFFSLAPAMILYIIFRIIPTFQVFRMSFFKTSALSKKEKFVGFNNFISL FQDKAFIRSFQNTILLIVVVTIVTLVVAVFFAAILTTEKIKGSNFFRIIFYIPNILSIVV VSGIFSAIYDPGQGLIDSILQMIGLKGPKAGWLGDQKIVIYSIAMALIWQAIGYYMVMYM AGMANIPESLYEAAELDGAGKISRFFNVTLPLVWLNIRATLSFFIISTINLSFLLVIPLT GGGPDGATEVFLSYMYKQAYTNQSYGYGMAIGVAVFIFSFALSGIISAVTNREILEY >gi|261748668|gb|ADAD01000021.1| GENE 3 2059 - 3390 2113 443 aa, chain - ## HITS:1 COG:BH3680 KEGG:ns NR:ns ## COG: BH3680 COG1653 # Protein_GI_number: 15616242 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 8 198 12 195 438 63 25.0 1e-09 MKVKLCLLSLLSILLLMACGEKKENESKENSGSKETVLKIATLESGYGDKMWPEIIEEYK KVNPNVKIELTQSKDIESSLPGQFQAENYPDVIMLGIGRKAGITENFVKEKELADLTPVL EMKVPGEEKAVKDKLVEGFTGNTITNPYSDGKVYLMPMFYSPTGLFYNKTLFEKNGWQVP KTWDEMFALGDKAKEKKIALLTYPTTGYLDSFLPPILAGRGGEQFFKDVMNYKQGIWTSP 
EMNDVFKTLGKLAKYVEPTTVANATDEGFKKNQQLVLDDKALFMPNGSWIVGEMAATTPK DFKWGMTAYPSFTKDGAQYAWSFFEQIWVPKAAKNIEEAQKFIAFLYSDKAAEIFVKYNA VQPINSYPYDKLSEENKVFYDVYKNGAKALVGGYATTKPVEGIDFRGTLYDSFNSVVNGT KTPEEWQKEVNEMMEKLRNNLIN >gi|261748668|gb|ADAD01000021.1| GENE 4 3571 - 4497 1204 308 aa, chain - ## HITS:1 COG:CAC2951 KEGG:ns NR:ns ## COG: CAC2951 COG1105 # Protein_GI_number: 15896204 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) # Organism: Clostridium acetobutylicum # 1 306 1 308 308 228 44.0 8e-60 MITTITLNPAIDTRYFIDDFQEGKLFRADKIVKSPGGKGLNVTKVLNQLGADVTATGIVG GKNGEWIREKLKERNIKEKFYICSKETRVCIAVLAKNSETEILEASEEVEENDIKGFEKV FSELLEKSDIITISGSLLKGVEKDYYKKLIEMINSKNRKVILDTSGATLLEGIKAKPYLI KPNFDELEYITGEKISDEHKLKEAVVKLKETGAENILVSMGKKGAMYFGERNLKITIPEI EVCNTVGSGDSTVAGFAKGLHDNIDLEETLKLSMACGMSNAQKTETGLVDIEDVKSFMEQ IKVEEVKL >gi|261748668|gb|ADAD01000021.1| GENE 5 4577 - 6031 2023 484 aa, chain - ## HITS:1 COG:PM0598 KEGG:ns NR:ns ## COG: PM0598 COG3119 # Protein_GI_number: 15602463 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pasteurella multocida # 16 484 1 467 467 800 78.0 0 MNNIIYILLDQVRKDMLGTYGHKIVKTPNIDRLSQEGVTFNNAFTPASVCGPARTSLFTG LMPTSHGIVKNGEKGGVGEISKENPNIMANLKDYNSYVVGKWHVGTSSIPKDYDIKGHNF DGYGYPGSGVYKNLVFNQPPTRLSNRYREWLEEKGYEIPTVSRAYFGDNPHLRVQELCGL LSGTKEETIPYFIIDEAKKYISESKDSGKPFFTWINFWGPHTPCIVPEPYYSMYNPDDVE LDKSFFKPLEGKPGHYKTISKMWGMWEASEERWKEVISKFWGYITLIDDAIGELFEFLKE NNLYDNSFIVVTADHGDAMGAHRMIEKGEFMFDTTYNIPMIIKDPQSNRVNERDDNFVYL HDLTSTVYDVASQEIPEAFEGESVLNIVRNGSKNDRKGLLCQLAGHFVYFEQRMWHRDDY KLVFNASDICELYDVKNDSEEMNNLFYNKDYKKIKDEMLKELYDEMIKINDPMANWLYRI IYEI >gi|261748668|gb|ADAD01000021.1| GENE 6 6044 - 6958 1281 304 aa, chain - ## HITS:1 COG:PM0597 KEGG:ns NR:ns ## COG: PM0597 COG4146 # Protein_GI_number: 15602462 # Func_class: R General function prediction only # Function: Predicted symporter # Organism: Pasteurella multocida # 1 
300 260 558 559 343 66.0 3e-94 TNQAIVQRTLGAKDLKSGQKGVLIAALFLLLLPILLNLPGLLTYHIKGEGLKPIDLAYPT LVNTVLPKPLMGFFVAAMFGAVLSTFNSFLNSAATIFCNDLLPVISKKQRSDTEVIKLAK VISTVMAILTMFIAPMLIDASSGIFLFTKRFAGFFNIPIVALFVVGIFNKTVSGKAARLA VMLHVILYYSLVWVFKVKVNFVHVMGALFVFDVIVMLLIGSFMKRDTPYKESKENKANVD LTDWKYVDMVIATLFFSLLYLHGLLSPLGLASKTGNPLLVTAAYIVVEIIVIAYYSIKNK KMEK Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:29:47 2011 Seq name: gi|261748665|gb|ADAD01000022.1| Leptotrichia goodfellowii F0264 contig00163, whole genome shotgun sequence Length of sequence - 1700 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 58 - 420 413 ## FN2115 hypothetical protein 2 1 Op 2 . + CDS 462 - 1698 1581 ## gi|262037310|ref|ZP_06010782.1| hypothetical protein HMPREF0554_2212 Predicted protein(s) >gi|261748665|gb|ADAD01000022.1| GENE 1 58 - 420 413 120 aa, chain + ## HITS:1 COG:no KEGG:FN2115 NR:ns ## KEGG: FN2115 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 3 119 33 150 151 74 40.0 1e-12 MEKKNAGGAIGNKNYESSVLEVIEDISRRPINKHAQFGGITLLIPENTIINQKVGNIVDE KTGYGIPVSFDEVKRCTSIFYRKKVNDQTFIRILYNEKDPKISNISQKIIRTNGFTKTCN >gi|261748665|gb|ADAD01000022.1| GENE 2 462 - 1698 1581 412 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037310|ref|ZP_06010782.1| ## NR: gi|262037310|ref|ZP_06010782.1| hypothetical protein HMPREF0554_2212 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2212 [Leptotrichia goodfellowii F0264] # 16 412 1 397 398 736 100.0 0 MRESNEKSLFKQSEFMIFQTSEVFLRVEKFELEVLSTFSKVEIFRYVKEVPNRVDETYGV GVEGHKREKVTRATIENGVINSGRAIEVGVNRDITRAEEMLKDVNVQKTEFIFKSEPNSW GDFNKIMSSNAGIIGNFLDDMNEHTGNKVRTNYEDKFRTKTNEIISKVEKPLDKVNRFIS ILPTSGTHGGILEQIVRTVRYDKTPIVKIGIEKNKEDGTVMVGMEEIRKISDYKSKDGKP VKVNTNGIIELKENAVRNTIFKNMTKEDMDRYNRGERVEMLMVYNPTRGAVADIIESALG 
KMFDGSWSSLGLSIGVNRGAAVAYASRDKNQSYDFSFYSQGNIIGLGAFNILKNNGIKLG NGAENFNVRMYGTPIAIKSYQNFEGVLGINVLGAAVNEPDFVGSNGKWAGLI Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:30:10 2011 Seq name: gi|261748659|gb|ADAD01000023.1| Leptotrichia goodfellowii F0264 contig00089, whole genome shotgun sequence Length of sequence - 6421 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 8 - 2080 2554 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases 2 1 Op 2 . + CDS 2124 - 3284 1804 ## COG0343 Queuine/archaeosine tRNA-ribosyltransferase 3 1 Op 3 8/0.000 + CDS 3299 - 3628 535 ## COG2739 Uncharacterized protein conserved in bacteria 4 1 Op 4 . + CDS 3628 - 4965 1927 ## COG0541 Signal recognition particle GTPase + Term 4971 - 5012 1.3 5 2 Tu 1 . + CDS 5608 - 6387 733 ## Lebu_1637 hypothetical protein Predicted protein(s) >gi|261748659|gb|ADAD01000023.1| GENE 1 8 - 2080 2554 690 aa, chain + ## HITS:1 COG:FN1482 KEGG:ns NR:ns ## COG: FN1482 COG0317 # Protein_GI_number: 19704814 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Fusobacterium nucleatum # 2 689 31 725 725 670 50.0 0 MAAESHGGQKRRSGEDYILHPVEVAEILVDMRMDTDTVVAGILHDVVEDTLITLPDIEYS FGKDVSKLVDGVTKLRNLPKTDSKKIENIRKMVVAMSEDIRVVIIKLADRLHNMRTLKYM SPEKQQIKSKETIEIYAPIAHRIGMAKIKWELEDISFRFLYPEDYREISDLVNFKRKERE NYTLEIIRKIEEELKKHNVKSEVTGRPKHLYSIYRKMYEKEKKFADLNDLIAIRIIVDKE EECYNVLGIIHNLFIPVSGRFKDYIAVPKSNGYQSIHTTVKGPNDQNVEIQIRTFDMHRI AEDGVAAHWKYKEKKSKAKNEEYYAAVKKMIETNSENPEKFAQTITGNVLNQTIFVFTPK GDVMELPNGSTALDFAFQVHTQIGYRTIGAKVNDRIVQLDQVLENADKVEVLTSRNTKGP GKDWINMVNNHSSKVKIRKWFKDKEFEEKTKEGEQLLEKEFEKLGIKLKDLEEDERVFLY MKKFNITTMDLLFYKFAMGDLSLDGFLKKFEVKEEKNLKQVLEEETEKGNRRKEKSQGGV RISGTENTMYRFAKCCNPLPGDEIKGYVTRGRGIAIHRADCDNFHSLMEHEPDREIEVSW 
DEETANSSNAKYQFNFTVKVLERNGVLLDIIRILNEYKMELINVNTNYVRENMNRYVLLH FGIMIKNREDFERLANNLKSMKDVVDIIRK >gi|261748659|gb|ADAD01000023.1| GENE 2 2124 - 3284 1804 386 aa, chain + ## HITS:1 COG:FN1481 KEGG:ns NR:ns ## COG: FN1481 COG0343 # Protein_GI_number: 19704813 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Queuine/archaeosine tRNA-ribosyltransferase # Organism: Fusobacterium nucleatum # 7 372 4 370 373 617 79.0 1e-176 MSKFDNPVKYRLETTDGNARAGVIETPHGLIKTPVFMPVGTQATVKAMTREELEEINSQI ILGNTYHLYLRPGDELINDFGGLHKFMNWDRPILTDSGGFQVFSLGDLRKIKEEGVHFSS HIDGSKHFLSPEKSISIQNNLSSDIMMVLDECPPGLSTREYMIPSVERTTRWAKRCIEAN RNKDRQGLFAIVQGGIYEDLRDKSLNELTEMDEHFAGYAIGGLAVGEPREDMYRILKYIT PKAPDNKPRYLMGVGEPLDMLEAVEHGIDMMDCVQPTRIGRHGTVFTKYGRLVIKNAAYS RDNRPLDECDCYVCKNYTRAYIRHLFKADEILGQRLATYHNLYFLLKLMEDARNAILEKR FKEFKEEFIKNYTMGKDSDWIKPFSI >gi|261748659|gb|ADAD01000023.1| GENE 3 3299 - 3628 535 109 aa, chain + ## HITS:1 COG:FN1394 KEGG:ns NR:ns ## COG: FN1394 COG2739 # Protein_GI_number: 19704726 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 4 103 3 102 103 73 50.0 9e-14 MDKLDDFLRYSTLFLYYGELFSQRQKQYLELYLEENESLSEIAEKYEITRQAVFDNIKRG FKQLDEYEKKLGMFEREKELKKKLENLKNDFTKENLEKIIEDFDYNEVL >gi|261748659|gb|ADAD01000023.1| GENE 4 3628 - 4965 1927 445 aa, chain + ## HITS:1 COG:FN1393 KEGG:ns NR:ns ## COG: FN1393 COG0541 # Protein_GI_number: 19704725 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal recognition particle GTPase # Organism: Fusobacterium nucleatum # 1 437 1 436 444 518 63.0 1e-147 MFNSLGDRFKDIFKKVSGQGKLTEANIKEALREVRLALLEADVNYGVAKNFVAKIRDKAL GEEVIKGVNPTQQFIKIVHDELVEVLGGTNVLIAKSPKAPTIVMLSGLQGAGKTTFAGKL SKHLKSKGESPFLIGADVYRPAAKKQLKVLAEQVKVPSFTIDESTDVIEICKKGIEEAKN AHATYVIIDTAGRLHVDEQLMEELQNIKSTFNPHEILLVVDGMTGQDAVNVAKTFNDRLD ITGVVLTKLDGDTRGGAALSVKEVAGKPIKFISEGEKLDDIAPFHPDRLASRILGMGDVV SLVEKAQEAIDENEAKKMEEKFRKNQFDFEDFLKQFKMIRKMGSLAGIMKMIPGVDTGAI 
DMAMAEKEMKRVEGIIFSMTVQERRDPKILNGSRKIRIAKGSGVEVNDVNRLIKQFEQMK QMMKMFNSGAIPGLGGMKPKSKRKR >gi|261748659|gb|ADAD01000023.1| GENE 5 5608 - 6387 733 259 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1637 NR:ns ## KEGG: Lebu_1637 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 259 1 282 282 306 62.0 4e-82 MFENKYPVFRKGNVVDKESLELLRDNPSEILHLMYFNKKDGIIKGFDLITDEENKEVIVT KGIVKYQNEIYWMYEDYKFKMPETENRYVLKLRLISNIEERKYYKRKGEFVLETLDDSGT DGIEITRFITREGAELRNDYMNFQDLRRDFNLLEIINSKYSSNHKFGTLHPKITELWGSE AAKKENLDIFDINFYVNCLQGPVEREVIISYINAKLNLHKSDYTNEELYMNLLKILDELG KERKNVEKRRVIPQKITIE Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:30:16 2011 Seq name: gi|261748652|gb|ADAD01000024.1| Leptotrichia goodfellowii F0264 contig00151, whole genome shotgun sequence Length of sequence - 6398 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 135 169 ## gi|262037323|ref|ZP_06010793.1| VIP protein 2 1 Op 2 . - CDS 157 - 816 1170 ## Sterm_1092 hypothetical protein - Prom 855 - 914 11.1 3 2 Tu 1 . - CDS 931 - 1320 559 ## Lebu_1466 hypothetical protein - Prom 1446 - 1505 10.6 - Term 1496 - 1547 2.1 4 3 Op 1 . - CDS 1591 - 2415 1095 ## gi|262037318|ref|ZP_06010788.1| putative liporotein 5 3 Op 2 . - CDS 2424 - 3254 1055 ## gi|262037322|ref|ZP_06010792.1| putative liporotein 6 3 Op 3 . 
- CDS 3258 - 6398 4304 ## gi|262037320|ref|ZP_06010790.1| hypothetical protein HMPREF0554_2163 Predicted protein(s) >gi|261748652|gb|ADAD01000024.1| GENE 1 3 - 135 169 44 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037323|ref|ZP_06010793.1| ## NR: gi|262037323|ref|ZP_06010793.1| VIP protein [Leptotrichia goodfellowii F0264] VIP protein [Leptotrichia goodfellowii F0264] # 1 44 1 44 45 72 100.0 9e-12 MKRILLVLSLILTVNSISYSEGKILKNELPKNQREFFEGGRLID >gi|261748652|gb|ADAD01000024.1| GENE 2 157 - 816 1170 219 aa, chain - ## HITS:1 COG:no KEGG:Sterm_1092 NR:ns ## KEGG: Sterm_1092 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 3 219 2 210 210 142 45.0 1e-32 MKKHILTVLGLLALSTASFAENNIIEFKAGLSPVSKFDVTPSKKAKFSYELGAEYRYLVT NNTEIGVGLSYQNHGKLKKFTDVEDNNLKVEVSDTKLYDSVPLYLTAKYNFRNDSDIVPY VKADLGYSFNINGKNSSQYKTYSKATGAVLDEGKLKDFKAENGVYYSVGAGVVYKGFTTG LSYQVNTAKIEGTRYDGLKDKGSANFRRFTLSFGYQFGL >gi|261748652|gb|ADAD01000024.1| GENE 3 931 - 1320 559 129 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1466 NR:ns ## KEGG: Lebu_1466 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 129 65 195 195 110 41.0 1e-23 MGFVDMPNDWFEFEDPDAMENAVQLAVTPYDIITLNIYSVNNQRTAEEWRKILYKKYINQ GISNDNITQKDVKINGYNAKQIAIKIPDGRELTMNLIDYEGKVYYIALEGMPDKKSELEK VVNTWKPNE >gi|261748652|gb|ADAD01000024.1| GENE 4 1591 - 2415 1095 274 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037318|ref|ZP_06010788.1| ## NR: gi|262037318|ref|ZP_06010788.1| putative liporotein [Leptotrichia goodfellowii F0264] putative liporotein [Leptotrichia goodfellowii F0264] # 1 274 1 274 274 444 100.0 1e-123 MKRKTLNKLLVILLIGLIICGCNKAKRKIEGEKEFSQLVETIKENFKVILDKEKYIVRDS ETPQGRIISSPFYEIVEKEPVKYKSKYFVKEEGAKVVITQQGEENFVLEYVPFFSDKESR VFIDIMIKYGFKPYVLNELIYDKSKGNDFSEIERILGKYEDKKIEASVVDRWQCYPNYES ASIMFVLDECMIHDYKNGTAKFSYEKILKYGSRLKEYFSKMRKFEEINWYEFMKYNSIHP VIYINIKDISKEELEKVRNEVKKYYNSDEVTISL >gi|261748652|gb|ADAD01000024.1| GENE 5 2424 - 3254 1055 276 aa, chain - ## HITS:1 
COG:no KEGG:no NR:gi|262037322|ref|ZP_06010792.1| ## NR: gi|262037322|ref|ZP_06010792.1| putative liporotein [Leptotrichia goodfellowii F0264] putative liporotein [Leptotrichia goodfellowii F0264] # 1 276 1 276 276 470 100.0 1e-131 MKRKTLNKLLVILLIGLIICGCNKAKRKIEGEKEFSQLVETIKENFKVILDKEKYIVRDS ETPQGRIISRPFYEIAEKKPVKYKSKYFIKEEGAKVVITQQGEENFVLEYVPFFSDKESR VFIDIMIKYGFKPYVLNELIYDKAKGNDFSEIERILGRYDDKKIDVLIIDGWWCYPNYES AGIMFVLDECMIHDYKNGTAKFSYEKILKYGSRLKEYFSKMRKFEEINWHEFMKYNSIHP VIYINIKDISKEELEKVQNEIKKYYNSENLTFLLSR >gi|261748652|gb|ADAD01000024.1| GENE 6 3258 - 6398 4304 1046 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037320|ref|ZP_06010790.1| ## NR: gi|262037320|ref|ZP_06010790.1| hypothetical protein HMPREF0554_2163 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2163 [Leptotrichia goodfellowii F0264] # 1 1046 1 1046 1046 1930 100.0 0 GESLHTRGLNILNEGDTTYNITGQILKEAGVSTIKENSSNTGFGLSVAKGMVDSNGQINA FKDKNATITPSISNQKSESEGVYYSKNEDITKGNSHYNNTGNVKTVGVAVRTGSISGTVN GPHETISVQDSVNSSSRGYSVSLGIGIGPHTEKRKAVNGPYISSATVGYNKGDVNQKITR NVAEFTAGSGMLDVKGKIVQVGSLIDGGFTLNGQGYEKQDLHDIDKSRKVGVNVTVYPNV TYTKRDEKGNAIYIDGRKENEQGAVYKVGVNYAEMDKARDVLSTVGSNVQINQDITGVNR DTNRQAGEFEGREINPINVDLGTEYWLTRAGRGKAKDIFEDAGRSVEGIKRILTTRDADG NLQILKSIEAETAVQKMMRIGFIETKGKTQQQVKKELEERFGSLTKKGVKVHFYGTEDID TSKMDDDTLAKLVSNGFAITKDGTVWINKEYVDSGKVIDFNKTTQHEISHLIFGEDSEYQ AQYLTRAYGEFLEGIRDNGYLKDGQGIIDYKFSMLTDEDRLRLDGYTDEEMQFFLHGFMQ GIGLGKQYEDFKGKVKGALIKKENELRNIAAKSYLPKSVKNGAAGAASMVRKVSKGLTAV SKADKKFYDREGKNLGIGVKKGKISTTNFEKSLKYVGQSFSVVKKENKNKGAVNAIKKVN KFEKYDDLDAYEKESIIDNILLQAEENERKGGIKYSEKLFTEKRIGDFINKPEQRATKIE YQYEGMLWDDAVMAGIDAYSGGPHIYNKYVGESIPNTNNKEHYKRLWDTEVGKIYKAPSG REFLVIDTKSTISGMNAITVKDVVSGRSQIVFQGSVPGIGEPLKNPISYVQDWGNNSVNA EAIRRNNINMNFKFMGTSVNITPRSLIIRQNDNITNAFYLKRRNNSIFKGDEIGRNQQYE DAYNYTKKIMKDPRVMGTLTKTKGHSLGGGPSEETGSRLDLESIVIDPAPVNNPGKYIDD GRTLVLIPNHGKAMLNKKIIDNKGRATYKFAPLGLPNLGVGRDVRGVAIYDNMGADGDNI HEPNYDDIRKNEAEMKSFFMPVGPVR Prediction of potential genes in microbial genomes Time: Thu Jul 14 
02:31:31 2011 Seq name: gi|261748650|gb|ADAD01000025.1| Leptotrichia goodfellowii F0264 contig00148, whole genome shotgun sequence Length of sequence - 619 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 617 530 ## gi|262037325|ref|ZP_06010794.1| conserved hypothetical protein Predicted protein(s) >gi|261748650|gb|ADAD01000025.1| GENE 1 2 - 617 530 205 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037325|ref|ZP_06010794.1| ## NR: gi|262037325|ref|ZP_06010794.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 194 1 194 206 298 100.0 1e-79 YHRNNNGLEDSEDYASFYGRQFERYNRKKFEDKDANFVTEQLRYTKEQLGNDWENAVKLV GFNFTFSKGIKIFGGGNVTVGAAFGDNGKKKFFLTVGGMLGAGFGGGFAAEGFTSIISGT DNPDDIAGTAYGLEAGIGVKDIGGKTIGANTNFKTTGAGSLGISGGPSNQEYEAHIGVTV SYTFVFGNIDAMVNKVKLELKKKKV Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:31:43 2011 Seq name: gi|261748648|gb|ADAD01000026.1| Leptotrichia goodfellowii F0264 contig00175, whole genome shotgun sequence Length of sequence - 304 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 303 307 ## COG3210 Large exoproteins involved in heme utilization or adhesion Predicted protein(s) >gi|261748648|gb|ADAD01000026.1| GENE 1 3 - 303 307 100 aa, chain - ## HITS:1 COG:FN0132 KEGG:ns NR:ns ## COG: FN0132 COG3210 # Protein_GI_number: 19703477 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 4 99 2033 2128 2462 87 52.0 5e-18 KANKTTRDDHSSTNVYVESQTIDYALHPAKFKEDVGIAVLEGSATVEGALKKIDNILRGD DNSDISQSEKRRYEEIKENIIRVKTAPDMKLIAEGDLSDP Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:31:44 2011 Seq name: gi|261748643|gb|ADAD01000027.1| Leptotrichia goodfellowii F0264 contig00031, whole genome shotgun sequence Length of sequence - 2409 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 455 535 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 2 1 Op 2 . - CDS 452 - 1033 581 ## Bsph_2619 hypothetical protein 3 1 Op 3 . - CDS 1038 - 1619 751 ## COG4420 Predicted membrane protein - Prom 1647 - 1706 11.7 4 2 Tu 1 . 
- CDS 1790 - 2407 839 ## COG0466 ATP-dependent Lon protease, bacterial type Predicted protein(s) >gi|261748643|gb|ADAD01000027.1| GENE 1 2 - 455 535 151 aa, chain - ## HITS:1 COG:SP1168 KEGG:ns NR:ns ## COG: SP1168 COG0494 # Protein_GI_number: 15901033 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Streptococcus pneumoniae TIGR4 # 2 151 3 152 154 148 49.0 4e-36 MKIATICYIKKDGYTLMLKRTKRKNDIHEGKWVGVGGKMEMGESPEDCIRREVFEETGLT LKNLKLKGFLSFPGFEDEEDWYSFVFESTDFEGKIIDSPEGELAWIKDDKIKDLNMWEGD IDFLEWMKKDKIFSGKITYDKGKFVKSEVIF >gi|261748643|gb|ADAD01000027.1| GENE 2 452 - 1033 581 193 aa, chain - ## HITS:1 COG:no KEGG:Bsph_2619 NR:ns ## KEGG: Bsph_2619 # Name: not_defined # Def: hypothetical protein # Organism: L.sphaericus # Pathway: not_defined # 7 190 1 184 185 178 49.0 1e-43 MNKTRFLKKYDKKISLILCLIIVLFLIMFFMNEDFFNWAFKRHQNILSWYIRPLFLIPFC YFSYKRSFSGIAMTVLGVLTSMFWFSEPVTVDAQVSNFLSMEKKYLMTDWTVSKILITSA VPLSLIILSIAFWKRNILLGIINLVIIAVGKIIWSIGFGGDEGKKIILPAFTGLVLCIIL IYAGIKAKGRNDL >gi|261748643|gb|ADAD01000027.1| GENE 3 1038 - 1619 751 193 aa, chain - ## HITS:1 COG:mlr4962 KEGG:ns NR:ns ## COG: mlr4962 COG4420 # Protein_GI_number: 13474144 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Mesorhizobium loti # 7 156 105 267 307 107 37.0 2e-23 MTDLKDLSEEKRGKLVKNLKEEIENEKKMHMLLEQKVSEITEKKEKLKFSQKISDAIGNF VGSWKFIILLGLFLGFWIILNVHFLMRKTFDPYPVLLLNLILSCVAAIQAPLIMMSQKRK RDIDRQQKENDYKVNLKTEIVMEDIHYKLDELLEKQEEIVERIIYLEGRKPDPEQRKYKF MKDIFEKDKKVGE >gi|261748643|gb|ADAD01000027.1| GENE 4 1790 - 2407 839 205 aa, chain - ## HITS:1 COG:FN2014 KEGG:ns NR:ns ## COG: FN2014 COG0466 # Protein_GI_number: 19705310 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Fusobacterium nucleatum # 1 178 590 768 768 221 66.0 7e-58 
GVVNGLAWTAVGGTMLEVQAVKMEGKGNLQLTGKLGSVMQESAKVAYSYVRHIKNELGIK NKFNEETDIHLHFPEGAVPKDGPSAGITITTAIISVLADKEVRQDIAMTGEITITGEVLA IGGVKEKVIAAHRVGIREVVLPIDNKIDAEELPEEITKDMTFHYAETYDDVKKVVFVKGK EKSKKKETEKTKVTKKSKSAKTSKK Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:31:50 2011 Seq name: gi|261748636|gb|ADAD01000028.1| Leptotrichia goodfellowii F0264 contig00035, whole genome shotgun sequence Length of sequence - 6407 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 1, operones - 1 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 + CDS 10 - 303 464 ## COG1940 Transcriptional regulator/sugar kinase 2 1 Op 2 . + CDS 350 - 1873 1620 ## COG0477 Permeases of the major facilitator superfamily 3 1 Op 3 . + CDS 1895 - 2941 884 ## COG1459 Type II secretory pathway, component PulF 4 1 Op 4 . + CDS 3005 - 3667 778 ## COG4122 Predicted O-methyltransferase 5 1 Op 5 . + CDS 3672 - 4421 688 ## COG3568 Metal-dependent hydrolase 6 1 Op 6 . + CDS 4443 - 4709 325 ## 7 1 Op 7 . 
+ CDS 4736 - 5728 1499 ## COG1649 Uncharacterized protein conserved in bacteria + Term 5739 - 5783 9.2 Predicted protein(s) >gi|261748636|gb|ADAD01000028.1| GENE 1 10 - 303 464 97 aa, chain + ## HITS:1 COG:lin0809 KEGG:ns NR:ns ## COG: lin0809 COG1940 # Protein_GI_number: 16799883 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Listeria innocua # 2 95 200 289 290 108 55.0 3e-24 MDKDEVWELEAYYLAQALVNYILILSPQRIIMGGGVMKQQQLFPLIRKNVQKFLNNYVYK KEILEEIEKYIVYPGLGDDAGFIGSIALGKIALDSLK >gi|261748636|gb|ADAD01000028.1| GENE 2 350 - 1873 1620 507 aa, chain + ## HITS:1 COG:TM0563 KEGG:ns NR:ns ## COG: TM0563 COG0477 # Protein_GI_number: 15643329 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Thermotoga maritima # 21 409 10 396 459 132 23.0 1e-30 MNTKHKIIRQKINNTKKYSIFEAVFYNGFYVGMQGFVMLSLAIYFNFSPFFISIVSVLPT AGFFLQIFTKSVNRFLGHRRKTLMMSTVISRLAVCVLPFAVLLDIRQPVLYFMIMFIYAL FSPFVNNVWTATMVEIIDKKDRGKYFGIRNFFSSLSTVIYILFYGYLLSMEDKKTAMFLL TSSMAVSAIVSTIFMHFHYIPRLGEKIQKVSIKSALKNKNFVIYLKFAAVWLFTWELLKP LFEYYRIKVLGVDIIFISQMGVLTAILSSGLYVLYGKLSDKYGNKTMLRMGIFFTTYYVL IYFSMTDDNKISMLFTAAIVDAIGFTAITLSLLNLLMEIAEEPADAYVGAYAIVSGLVAV SAGIFGGVLGTYVNNGVIYIFGETFHTIRIAFMIGFILRLYCLLQLTRVDSFEKTFVYNG GIPIKNIFSKRIFSVGSNYIKTFRDKMNGHDHEEDKKSEDNLYNDENNEINDNNDVSDKN ENTVNNEQNIKENKEETQNKENKEKTV >gi|261748636|gb|ADAD01000028.1| GENE 3 1895 - 2941 884 348 aa, chain + ## HITS:1 COG:CAC2104 KEGG:ns NR:ns ## COG: CAC2104 COG1459 # Protein_GI_number: 15895374 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulF # Organism: Clostridium acetobutylicum # 5 344 45 392 396 137 27.0 3e-32 MTGFLSKKKLHKVKDKELLSFTKNMYYLLKGKVNLIDSLEIISMNYEGEFREKIIQTKKL 
VEKGKPLNTAFEKVVKDREFLELIKIGEQTGNLETVFKNLSEKYEFKEKIRKDITGLSVY PLTVIVTAVVIVVILLKFVVPKFVMIYSDTGQKLPLMTRITVKISALFDKYGLFLLILSI FFIWTAVYLKKKNEEKFEQLILNTFLLGNLYREMCILNFTRSMYSLTSADISFLESLKMC TNSNSLLLNGEVKKIITKVEKGENIKKSFYNLRFFNREYITFLNIGEKTGVISDAFHNLY EIYYEKVREKIQLFLKIFEPLSIIFIAMIIGIIMLSVMLPIFKMGEAL >gi|261748636|gb|ADAD01000028.1| GENE 4 3005 - 3667 778 220 aa, chain + ## HITS:1 COG:FN0314 KEGG:ns NR:ns ## COG: FN0314 COG4122 # Protein_GI_number: 19703659 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Fusobacterium nucleatum # 1 220 1 213 215 161 50.0 1e-39 MLENFIESSKYSQNLFQTEDEKIVGIEKESLENNVPIITKEVLNFMIFTAKGIKARNILE IGTATGYSGIFLGKIANENGGTFTSIEIDEKRYNKAVENFEKVGILDKNRLILGDALEKI PEIAQIEENSKRETFDFIFIDAAKGQYMKFFEMCYDLLNENGIIFIDNIMFRGLVTSEEK DIPKRYKTIVNRLKQFIEKLNYEYDFVLLPFGDGVGLVRK >gi|261748636|gb|ADAD01000028.1| GENE 5 3672 - 4421 688 249 aa, chain + ## HITS:1 COG:PA5488 KEGG:ns NR:ns ## COG: PA5488 COG3568 # Protein_GI_number: 15600681 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Pseudomonas aeruginosa # 1 242 35 273 292 106 30.0 4e-23 MKFLLYNIRYGTGKYLKQPLKHIRGYLGHSLSHVSRIGEFIKGYNPDIVGLVEVDLGSFR TEEKNQAELLGKITGNHCVFQYKYQKNSRYMKVPMVRKQGNALLSKTEIKSKKFHYLNNG MKKLVMEIETDSVTVFLVHLALGGKTRLKQIVQLRKLIQDCRKPFIVAGDFNVLWGNEEI ELFLEASGLHNVNVHREPTFPSWAPKKELDFILCSKEIKVKEFQVIKTLLSDHLPVIIDF EIKKDGYKN >gi|261748636|gb|ADAD01000028.1| GENE 6 4443 - 4709 325 88 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKAISKILLIIMLITGALNLYGREGVSDYNPENDDEILKIIDSYSNDDFSVSNTKTKKTA AKKIKKNLEEYGSQVLLILTGLLKKDSV >gi|261748636|gb|ADAD01000028.1| GENE 7 4736 - 5728 1499 330 aa, chain + ## HITS:1 COG:BS_yngK KEGG:ns NR:ns ## COG: BS_yngK COG1649 # Protein_GI_number: 16078889 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 1 329 67 401 510 385 57.0 1e-107 MLENVKKWNMNAVFVQIKPVGDAFYPSKYAPWSEYLTGVQGENPGYDPLKFMIEEAHKRN 
IEFHAWFNPYRLTMGGGREKLSRDNIGNKRPEWTVMYGGKLYLNPGIPEVNDYVVDSIVE VVKKYDVDGVHMDDYFYPYKVKGQEYPDSQQYRKYGGKFSNIGDWRRNNINKLIEKLHNS IKKENKNVSFGISPFGVWRNASTDPVRGSQTQAGVQNYDDLYADILYWMDKHWIDYVAPQ IYWVRGFKVADYSTLINWWSKYAGKTNTDLYIGHAAYKVNDWSNPNELVEQVKLNRKYPE IKGSIFFSYKSLVPNPKNVTNNLLNGPYSN Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:32:07 2011 Seq name: gi|261748597|gb|ADAD01000029.1| Leptotrichia goodfellowii F0264 contig00011, whole genome shotgun sequence Length of sequence - 35939 bp Number of predicted genes - 37, with homology - 36 Number of transcription units - 11, operones - 9 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 18/0.000 - CDS 1 - 1771 2592 ## COG0466 ATP-dependent Lon protease, bacterial type - Prom 1797 - 1856 2.9 - Term 1795 - 1857 1.5 2 1 Op 2 24/0.000 - CDS 1861 - 3090 272 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 3 1 Op 3 29/0.000 - CDS 3111 - 3680 918 ## COG0740 Protease subunit of ATP-dependent Clp proteases 4 1 Op 4 1/0.000 - CDS 3716 - 4999 2193 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) 5 1 Op 5 1/0.000 - CDS 4987 - 6693 1844 ## COG0608 Single-stranded DNA-specific exonuclease 6 1 Op 6 32/0.000 - CDS 6697 - 7059 591 ## COG0858 Ribosome-binding factor A 7 1 Op 7 15/0.000 - CDS 7099 - 10065 4285 ## COG0532 Translation initiation factor 2 (IF-2; GTPase) 8 1 Op 8 22/0.000 - CDS 10043 - 10594 315 ## PROTEIN SUPPORTED gi|237742963|ref|ZP_04573444.1| ribosomal protein L7Ae 9 1 Op 9 32/0.000 - CDS 10617 - 11714 624 ## PROTEIN SUPPORTED gi|17988250|ref|NP_540884.1| transcription elongation factor NusA 10 1 Op 10 . - CDS 11714 - 12250 572 ## COG0779 Uncharacterized protein conserved in bacteria - Prom 12312 - 12371 8.5 - Term 12255 - 12293 5.2 11 2 Op 1 . - CDS 12391 - 13035 890 ## Lebu_0803 hypothetical protein 12 2 Op 2 . 
- CDS 13055 - 13807 1102 ## COG0560 Phosphoserine phosphatase - Prom 13832 - 13891 11.8 - Term 13923 - 13964 -0.6 13 3 Op 1 . - CDS 13982 - 14425 376 ## Xcel_1303 hypothetical protein 14 3 Op 2 . - CDS 14488 - 14637 276 ## gi|262037367|ref|ZP_06010832.1| putative CesF - Prom 14805 - 14864 8.9 - Term 14919 - 14969 0.3 15 4 Op 1 . - CDS 15149 - 15319 207 ## gi|262037372|ref|ZP_06010837.1| hypothetical protein HMPREF0554_0135 16 4 Op 2 . - CDS 15335 - 17152 3197 ## COG1217 Predicted membrane GTPase involved in stress response - Prom 17174 - 17233 8.2 17 5 Op 1 . - CDS 17381 - 20056 3938 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) 18 5 Op 2 . - CDS 20064 - 20159 70 ## 19 5 Op 3 . - CDS 20220 - 20621 409 ## Lebu_0791 hypothetical protein 20 5 Op 4 . - CDS 20694 - 21605 1467 ## COG0648 Endonuclease IV 21 5 Op 5 . - CDS 21593 - 22372 910 ## COG0566 rRNA methylases 22 5 Op 6 1/0.000 - CDS 22386 - 22727 701 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases 23 5 Op 7 . - CDS 22729 - 23163 735 ## COG0698 Ribose 5-phosphate isomerase RpiB - Prom 23250 - 23309 7.1 24 6 Op 1 . - CDS 23332 - 23718 530 ## COG0824 Predicted thioesterase 25 6 Op 2 . - CDS 23729 - 23971 357 ## Lebu_1300 hypothetical protein 26 6 Op 3 . - CDS 24024 - 24263 328 ## Lebu_1302 hypothetical protein 27 6 Op 4 1/0.000 - CDS 24268 - 25104 986 ## COG0030 Dimethyladenosine transferase (rRNA methylation) 28 6 Op 5 . - CDS 25116 - 25670 926 ## COG0634 Hypoxanthine-guanine phosphoribosyltransferase - Prom 25738 - 25797 6.8 29 7 Op 1 . - CDS 25868 - 27007 1496 ## COG0626 Cystathionine beta-lyases/cystathionine gamma-synthases 30 7 Op 2 . - CDS 27022 - 28401 1995 ## COG0531 Amino acid transporters - Prom 28537 - 28596 9.5 - Term 28569 - 28613 2.5 31 8 Tu 1 . - CDS 28636 - 29085 541 ## COG3187 Heat shock protein - Prom 29110 - 29169 7.4 32 9 Tu 1 . 
- CDS 29345 - 30334 994 ## COG3177 Uncharacterized conserved protein - Prom 30441 - 30500 6.6 - Term 30368 - 30406 3.1 33 10 Op 1 . - CDS 30639 - 31328 660 ## gi|262037353|ref|ZP_06010818.1| hypothetical protein HMPREF0554_0153 34 10 Op 2 . - CDS 31247 - 31546 171 ## gi|262037357|ref|ZP_06010822.1| hypothetical protein HMPREF0554_0154 35 10 Op 3 . - CDS 31577 - 34642 3601 ## COG3587 Restriction endonuclease - Prom 34671 - 34730 6.0 36 11 Op 1 . - CDS 34738 - 35517 368 ## gi|262037366|ref|ZP_06010831.1| hypothetical protein HMPREF0554_0156 37 11 Op 2 . - CDS 35510 - 35938 467 ## Rahaq_5126 hypothetical protein Predicted protein(s) >gi|261748597|gb|ADAD01000029.1| GENE 1 1 - 1771 2592 590 aa, chain - ## HITS:1 COG:FN2014 KEGG:ns NR:ns ## COG: FN2014 COG0466 # Protein_GI_number: 19705310 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Fusobacterium nucleatum # 1 590 1 588 768 591 51.0 1e-168 MSRKPFIATRELVVFPGVVTPIFIGRQSSLASLEEALDKFENKLILSTQKDANVEDPKLP EDVYETGVLVHVIQTVKMPNGTVKVLVEAKHRVLIGEFSERNGVQFTEYQEIFPKPIEES KAEALKRKVIDEFSNYAKTTQKILPDVIYNIKEIRNIDKVFDLICTNLMIATTVKQELLE ILDVEERAYRILSILEKEVEIFTIEKDIESKVRDQMSEVQKNYYLREKIKAMREEMGEDS DSDEELEELDQKVHESKVPQELKDKLFKELSRLKKMPDFSAESSVIRSYVETVLELPWDE STKDEIDIEKAEKILNEDHYGLTEVKERILEFLAVKKLNNTLKGSIICLVGPPGVGKTSL ANSVARSMNRKFTRISLGGLRDEAEIRGHRRTYIGSMPGRLINSLKQVGVNNPVMLFDEI DKMASDFRGDPASAMLEVLDPAQNHTFEDHYIDYPFDLSKVFFICTANDLGGIPGPLRDR MEIISIESYTEFEKLNIAKKYLIPQTQEENGLKDYKIPFSDASIFKIINEYTREAGVRNL RREISKLFRKMAKEALTSKSRKLSVTEAKIKDYLGNPKFREDKIKEKTGK >gi|261748597|gb|ADAD01000029.1| GENE 2 1861 - 3090 272 409 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 153 405 258 457 466 109 32 3e-23 MAKSKHICSFCGRGEDEVENLIGSPDNHSFICDRCIDDCLDLIDNYDGVEKKGESEIVLL KPAEIKAKLDDYIIGQETAKKVLSVAVYNHFKRIKHKMKNIDDVELQKSNVLLVGPTGSG 
KTLLAQTLAKILNVPLAIADATTLTEAGYVGDDVENVLLKLIKAADYDIETAEHGIIYID EIDKIARKSENMSITRDVSGEGVQQALLKIIEGTVASVPPQGGRKHPNQEMIEINTKDIL FIVGGAFEGLESKLKNRINEKRVGFGLENNGQKLDEMTLFENVLPEDLIKFGLIPELIGR LPIITALHGLDEEAMIKILTEPKNSLVKQYKKYFEMEDIKLSFEDDAIVEIAKLALKRKI GARGLRSIIESVMLDLMYEIPSMENVTKVIITKEAVEDKSKVIIKKGKK >gi|261748597|gb|ADAD01000029.1| GENE 3 3111 - 3680 918 189 aa, chain - ## HITS:1 COG:STM0448 KEGG:ns NR:ns ## COG: STM0448 COG0740 # Protein_GI_number: 16763829 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Salmonella typhimurium LT2 # 4 189 18 203 207 262 66.0 2e-70 MYSPVVIENDGRGERSYDIYSRLLKDRIIFLGGEVEDNIANSIIAQLLFLDAQDKEKDIV MYINSPGGVVTAGLAIYDTMRHIKADVSTVCIGQAASMGAVLLASGAKGKRYSLPNSRVM IHQPSGGARGQATDIQIQAKEIEKIKKTLNEILSEATGKPVEEIYRDTERDNFMSSEEAL AYGIIDKIL >gi|261748597|gb|ADAD01000029.1| GENE 4 3716 - 4999 2193 427 aa, chain - ## HITS:1 COG:FN2017 KEGG:ns NR:ns ## COG: FN2017 COG0544 # Protein_GI_number: 19705313 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Fusobacterium nucleatum # 4 427 5 429 429 302 44.0 9e-82 MATVKKLNETTYEVSVSRTGDELKHMKEHVLTHFKDAKVAGFRPGHVPKDVLERTFKKEI EDEILNHIISEEYSKAVSENDLKPIDNIKLEKYENDNDKVELVFTIPVLPEIKLGEYKGV KVEKEELKVTDEKVNEEIERLRVGAAKLREVEEGAAAELNDVTNIDFEGFIDGVAFDGGK AEGFDLTLGSKNFIDTFEDQIVGHKKGDEFDVNVTFPENYGAENLAGKPAVFKVKVNSIK RREETELNDDLAKELGFDSMEDLKNKTKENITKREESRIENDFRNKVVEKVAEGTEVEVP EGLIQKEIQYKVNEFAQQLQMQGMSLNQYFEMTGQDLEKMRETLREPAKKSLKTQLILDE IAKAENIAASDEDVNAEVEKMAEMYGVDKDVILEDVKKSGNYARFVENTKYQIVNRKTVD FLVKEAK >gi|261748597|gb|ADAD01000029.1| GENE 5 4987 - 6693 1844 568 aa, chain - ## HITS:1 COG:FN2018 KEGG:ns NR:ns ## COG: FN2018 COG0608 # Protein_GI_number: 19705314 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-specific 
exonuclease # Organism: Fusobacterium nucleatum # 19 561 6 556 556 401 43.0 1e-111 MNWEMKNYDRDYLAFKSGEFGESELITRLLLNRGISSKEEVEKFLNSSENDLLSPFLIEN MDEVTERILKAKKNREKVIIYGDYDVDGISAAAYLVIVFRKLGMNVDYYIPNRAHEGVGI NSNLIKYLIKRNAGLFITVDTNIGSYEEVMMLKRNNIDIIITDHHRQTEEIKNFNIPTIN PKIGNNYPNKNLSGSGVAFKLADAVYEKCGVNKKILYDYLDIVMIGTVADVVPMTDENRF IIKKGLYNLRKTKIKGMKYILNYLRINPNNITTSDVGFYIAPIFNALGRIDNSKMVVEFF IEEDDFKLFSIIEEMKKANKIRRYLEIEIYNEIEEKIQRLNKPKYIFMKSRKWHSGVIGV VCSRISIKYNIPVILVSIKNGYGKASCRSIEGLNIFDMLKETSDKFERFGGHDLAAGFLV SEKYLKEIEIYLKKRLVTANKSNIEKVLYIDSVLNIEQINKSKVLDINRLSPFGLDNQEP NFVDKGIKFISFTKFGVNNRHFKGFIKKNDRIISVIGYNLGHKLKFKNINKKYEIVYTPS FKSVRTDLFIELKLKDFRENRGGQKWQL >gi|261748597|gb|ADAD01000029.1| GENE 6 6697 - 7059 591 120 aa, chain - ## HITS:1 COG:FN2019 KEGG:ns NR:ns ## COG: FN2019 COG0858 # Protein_GI_number: 19705315 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-binding factor A # Organism: Fusobacterium nucleatum # 1 115 1 119 120 80 42.0 9e-16 MNDRRKKGLEKEISRIVGITLLTEIKNEKIKNLVSIHKVELTKDGRYLDLTFSILDLKNN VNKEKIVEDLEKLKGFFRKTIGSQLSVRFVPEIRIHIDDSVEYGVKISSILNEIKQKDSE >gi|261748597|gb|ADAD01000029.1| GENE 7 7099 - 10065 4285 988 aa, chain - ## HITS:1 COG:FN2020 KEGG:ns NR:ns ## COG: FN2020 COG0532 # Protein_GI_number: 19705316 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 2 (IF-2; GTPase) # Organism: Fusobacterium nucleatum # 400 988 147 734 737 674 62.0 0 MKVHELAKFMGYKTADFIEKLHKIGIQGKTHPNNKLENDEVNSIKEKLSNENRSNNEDSH KKNGNNESKTTEHKENTNKGHVIKEKLVFSTKNRKPENRNNVKEPVKNKETVNSDNAGKS VAEVNAKPEKEVNKQENQPKEIKIEKESVVKETIAENRDDSKNNFNRNEKKFNNKHNDRN QDNNRHNNDNRNRDKNFGKSFEKSDKDNHYGKENRYQSGDRNRDRQGDRYNKNTGDNQGN RDGQNNRNFDRNDKGNYQNKGNFRNRDGQGNNQGNRDNRDRDNKRFDKDKKDFKKRDNRQ SFDGKKDDYKNDRNDRSFGNKNKKEVPKSDSAGAAPIAEKPKVNTKGKAKFDKKKYEKEK KQKEEEKKLRELRSDFRKDDKKKKSKKKQEKVMKDEIIRIEGESIGMITIGEEIVIKDLA EKLGINVSDIIKKFFMQGKMLTANAILSFEEAEEVALDYEVIVEKEEIEEISYGEKYHLE 
IEDREQDLVTRAPVITIMGHVDHGKTSLLDALRHTNVIEGEAGGITQRIGAYQVNWKGQK ITFIDTPGHEAFTEMRARGANITDISILIVAADDGVKPQTVEAISHAKEAGVPIIVAINK IDKPGADPMKVRTELTEYGLMSPDWGGNTEFVEISAKQKINLEELLETILITAELLELKA NPKKRVKAVVVESRLDPKMGAVADILIQEGELKIGDIFVAGESHGRVRSMVDDRGNKIEK GLLSQPVEITGFSDVPNAGDVLYGVNNDKQAKKIVEDFIKERKVNEQNKKKHISLESLSQ ELEEQQLKELKCIIRADSKGSVEALKESLQKLSNDKVMINIIQASAGAVTEGDVKLAEAS NAIVIAFNVRPTTPARIEAEKTGVEVRNYNVIYHVTEEIEKAMKGMLDPEFKEIYFGRIE VKQVFKISNVGNIAGAIVVDGKVSRTSKIRIIRDGIIIFDGELESLKRFKDDAKEVVMGQ ECGIGIKDFNDIKEGDIIESYILEEIPR >gi|261748597|gb|ADAD01000029.1| GENE 8 10043 - 10594 315 183 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237742963|ref|ZP_04573444.1| ribosomal protein L7Ae [Fusobacterium sp. 4_1_13] # 2 172 6 176 176 125 37 3e-28 MIPERTCVCCRSKKEKNKFFRISSVNNKYVFDEDNKIQSRGAYICKSSECVQKLSKHRKY NIEIEELLKMLKKIEKNKKNIIDILRPMKNSDYFVFGIDENIEGIKKDKVKLLIIPEDIN RKYIKDFKRLKEKFTFKVIKIEKKSQLEEIFSRDVNVIGIVDKKVVNGILNKVEVTNEST RIG >gi|261748597|gb|ADAD01000029.1| GENE 9 10617 - 11714 624 365 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|17988250|ref|NP_540884.1| transcription elongation factor NusA [Brucella melitensis 16M] # 3 360 4 358 537 244 39 4e-64 MKSKDQRVFLEALDELEKEKGIIKEELLEAVETALLAAYKKNYGDKDNAEISINRETGEV KVFSRKTIVQTVEKPEEEISLEDAIALKKKSKLGGTIDFEINAENFKRNAIQNAKQIIVQ KVRECEKKNIFNRFKQIEKSIVSAVVRKTDEKGNLYIDINGLEAIVPEKELSEADKFIQG DRLKVYVGAVEESTKYTKCFLSRKVEELMRGLLNLEIPEIEDGTIQIKQIAREAGSRTKI AVYSDDPNLDVKGACIGKSGMRIQSIIDELKGEKVDVVLWDEDIRYFVKNALNPAEVISV EIIEEDGEQIARVEVDEEQLSLAIGKKGQNSRLAARLCGIKIDIHTVKSEENEEIDEIGE EDEEE >gi|261748597|gb|ADAD01000029.1| GENE 10 11714 - 12250 572 178 aa, chain - ## HITS:1 COG:FN2023 KEGG:ns NR:ns ## COG: FN2023 COG0779 # Protein_GI_number: 19705319 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 19 168 6 153 156 104 41.0 9e-23 MKRVVLKQEIVYEVIILEEILSKFEKEIQDSLDSLDLELSDLEYIQEGGYNYLRVYVEKK DGTTSLDDCIELSSKIDGLADNLINEKFYLEVSTPGVERKLKKEKDFIRFSGEKIKLHSK 
SQIEGKKTFEGKLEKFENDTIFLNDQNIGKTVEIPLSKLKKANLIYELPNDTLEKEEE >gi|261748597|gb|ADAD01000029.1| GENE 11 12391 - 13035 890 214 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0803 NR:ns ## KEGG: Lebu_0803 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 214 1 214 214 317 79.0 3e-85 MLSNTFSGIKLEFEGLYYGGNYDIEIFSDGRFYYSYIENDAVEIKKGSFQITQKEVMIFE EMMDYFEFLKNQRNYMINEYQFGNGILVIQKKNGREEKIRLGKEMMFEYSKILIDKYIPK SKRIFLLDTYLSGTKYIRNFGLKIKDAHILDLYREFNGVSLNAVAAFNANKEKVGYVPKE QNEIIARLIDAGKKIVAIPIPFDDEVALKIYMSD >gi|261748597|gb|ADAD01000029.1| GENE 12 13055 - 13807 1102 250 aa, chain - ## HITS:1 COG:CAC0263 KEGG:ns NR:ns ## COG: CAC0263 COG0560 # Protein_GI_number: 15893555 # Func_class: E Amino acid transport and metabolism # Function: Phosphoserine phosphatase # Organism: Clostridium acetobutylicum # 11 243 2 234 247 259 55.0 4e-69 MKGMEGDLMAKNIAAFFDIDGTIHRNSLLIEHFKMLVKYEYIDIMSWEGKVKEKFSKWET RTGDYDDYLEELVETYVEALKNFKKEDMDFIAKRVIDLRGNRVYKYTRERLEYHINKGHK VIIISGSPDFLVAKMAERYGVKDYRGSEYLVNDKGIFTGEVIPMWDAKSKKKAIADFCGK YDIDLGKSYAYGDTTGDLTMFKKAGNAVAINPAKRLLEKLKKDKELKEKVTIIVERKDVI YKLKPNVEII >gi|261748597|gb|ADAD01000029.1| GENE 13 13982 - 14425 376 147 aa, chain - ## HITS:1 COG:no KEGG:Xcel_1303 NR:ns ## KEGG: Xcel_1303 # Name: not_defined # Def: hypothetical protein # Organism: X.cellulosilytica # Pathway: not_defined # 1 143 8 155 163 72 33.0 4e-12 MSGVGKTTILSMLQDENTICIDLDETDYIEVDIETKERLISIDKLLSYINSISDKNLILA VCEANQGQIYSLMSAVIVLTASLEVMKERINKRQNNNYGKDKEEWEQIVRNKKEIESLLI RRADYVIQTDGEIHEVFNKVKMIIDEV >gi|261748597|gb|ADAD01000029.1| GENE 14 14488 - 14637 276 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037367|ref|ZP_06010832.1| ## NR: gi|262037367|ref|ZP_06010832.1| putative CesF [Leptotrichia goodfellowii F0264] putative CesF [Leptotrichia goodfellowii F0264] # 1 49 1 49 49 87 100.0 3e-16 MNKLDELHEKLPVISFETRDELHDWLLENENRMRPSGYFALGLKNPKHL >gi|261748597|gb|ADAD01000029.1| GENE 15 15149 - 15319 207 56 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|262037372|ref|ZP_06010837.1| ## NR: gi|262037372|ref|ZP_06010837.1| hypothetical protein HMPREF0554_0135 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0135 [Leptotrichia goodfellowii F0264] # 1 56 1 56 56 77 100.0 3e-13 MKDEKLAELERKLRSELSEEKLKFIDKYRLELNSRRRWICRKNQTPERVYFHINLF >gi|261748597|gb|ADAD01000029.1| GENE 16 15335 - 17152 3197 605 aa, chain - ## HITS:1 COG:FN0634 KEGG:ns NR:ns ## COG: FN0634 COG1217 # Protein_GI_number: 19703969 # Func_class: T Signal transduction mechanisms # Function: Predicted membrane GTPase involved in stress response # Organism: Fusobacterium nucleatum # 3 604 1 604 605 858 70.0 0 MSIKIKNIAIIAHVDHGKTTLVDALLKQSGTFGEHEKVDERVMDSNDLEKERGITIFSKN ASIHYEGYKINIVDTPGHADFGGEVQRILKMVDSVLLLVDAFEGVMPQTKYVLKQALEHG LRPIVVINKIDRPNSDPDGVVDSVFDLFVDLGANDLQLDFPVVYASAKNGFAKINIEDED KDMKPLYDMILKHVEDPDGDENEPLQMLVTNTEYDEYVGKLGTGRIYNGKIKKNQEIILI KRDGELVNSKVSRIYGYDGLKKIEMDEAVAGDIITIAGIDKIDIGETVTDKENPRALPLI DIDEPTLAMTFLVNDSPFAGKDGKYVTSRNLLERLTKEVNHNVSMRLEMTDSPDAFIVKG RGELQLSILLENMRREGYEVAVSKPEVIYKEENGVKMEPVELAIIDVADEFVGVVIEKLG LRKGEMINMAQGSDGYTRLEFKVPSRGLIGFSNEFLTETRGTGILNHSFYEYEPYKGDVT GRRKGVLIALEPGTSIGYSLNNLQPRGILFIGPGVEVYEGMIVGEHSRENDLIVNVCKGK KLTNMRAAGSDDALKLAPPKEFTLELALEYIADDELVEITPNNIRLRKKYLNGNERKKYE NSKNS >gi|261748597|gb|ADAD01000029.1| GENE 17 17381 - 20056 3938 891 aa, chain - ## HITS:1 COG:FN1718 KEGG:ns NR:ns ## COG: FN1718 COG0653 # Protein_GI_number: 19705039 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Fusobacterium nucleatum # 1 834 1 840 869 924 59.0 0 MLKSLGKKIFGTADEREIKRMRRLVDHINNIEPEFEKLTDDELKNKTVEFRERIEKGETL DDLLVEAFATVRETAKRLTGMRIYDVQLIGGMILHNGRIAEMKTGEGKTLMSTLPIYLNA LTGKGVHVVTVNDYLAKRDRDTMGEIYDFLGLTSGVIIANITNEQRKRAYNADITYGTNN EFGFDYLRDNMVGELEDKVQRGHNYAIVDEIDSILIDEARTPLIISGAAEETTEWYNTFA DVATRLKRSYKTEEIKDKKNTVIPDEDWEDYEVDEKSHTVTITDKGIKNVEKILKIDNLY 
SPENVELTHFLTQALKAKELFKLDRDYIINADGEVIIVDEFTGRLMEGRRYSDGLHQAIE AKEKLEVAGENQTLATITLQNYFRMYTKLSGMTGTAKTEEDEFKQIYKLRVIEVPTNKPV IRQDLADVIYMTKAAKYRAIARKIKELYTKGQPVLVGTASIQHSEDVSALLKKEKIPHEI LNAKHHEREAEIVAQAGRYKTVTIATNMAGRGTDIKLGGDPESFALKVAEKGTDEYKEAY SAYARDCEENKKKVIEAGGLFILGTERHESRRIDNQLRGRAGRQGDPGTSEFYLSAEDDL MRLFGGDKLKSMMKFLKLDEDEEIRHKQITKVVENAQKRIESRNFSSRKSLIEYDDVNNT QREVIYTQRDQVLKNEELKELILSMMRETVEETVNSALVSEGHEEGEMNLLQDKFREIFD YELPENLRNEDKETIIDKVYGDLSERYNMKEQLIGEETFRRIERYIMLEVLDQKWRQHLK DLTELREGIRLRSYGQRNPIHDYKIVGFDIYNEMIDAIKRETSSFILKLRLREEEETANL KREEVKNVKYEHENIDENDGEFTETDRQEAVQQEVPLSRRERRELERKNKK >gi|261748597|gb|ADAD01000029.1| GENE 18 20064 - 20159 70 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFQDSIDCSVEAEKNKVHTGKSLRIIETVFN >gi|261748597|gb|ADAD01000029.1| GENE 19 20220 - 20621 409 133 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0791 NR:ns ## KEGG: Lebu_0791 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 81 6 85 132 68 53.0 8e-11 MKKNKRIIFIITCFVFVMSYSSLSGTAGFNRQLNVLVDENNLSYEDIKQLISEVNRIESI KYDKQLKKELKRIERQKEKEKKSAENKFFIGNRDMISNFSVSFIGETINNVLKTDVLSIG NLKREQNNLRLIE >gi|261748597|gb|ADAD01000029.1| GENE 20 20694 - 21605 1467 303 aa, chain - ## HITS:1 COG:lin1487 KEGG:ns NR:ns ## COG: lin1487 COG0648 # Protein_GI_number: 16800555 # Func_class: L Replication, recombination and repair # Function: Endonuclease IV # Organism: Listeria innocua # 1 300 1 295 297 414 67.0 1e-115 MFKIGSHVGMSGKKMFLGSVEEALSYGANTFMIYTGAPQNTRRKPIEELNIEAGLKLMKE NNISVDDIVVHAPYIINLGNAVKPETFEIAVQFLRTEIERTDAIGAPRIVLHPGAHVGEG AEVGIAKIVEGLNEVLTKEQKTTVALETMAGKGSECGRSFEEIAKIIDGVKLKDKLSVCF DTCHVNDAGYDIVNDFEGVIKEFDRIVGIDRISVIHLNDSKNPRGAMKDRHENLGFGTIG FEALNKIAHYEKFAHLPKILETPYVGLTEEKGAKTVPPYKFEIEMLKKEKFDKDVLEKIR KQG >gi|261748597|gb|ADAD01000029.1| GENE 21 21593 - 22372 910 259 aa, chain - ## HITS:1 COG:FN0875 KEGG:ns NR:ns ## COG: FN0875 COG0566 # Protein_GI_number: 19704210 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA 
methylases # Organism: Fusobacterium nucleatum # 4 251 8 256 261 199 52.0 4e-51 MKNIVTSPDNKFYKLLKKLDKKKYRDENNIFKTEGEKFLNENINFNKIIIKESKYDYFNE KYNIEEYNNVTVLKDDLFEEVSTQENSQGIIFLFSKDLNSINDIEEDVVVLDDVQDPGNL GTIMRTMEAAGFKNLILTKGSVDVYNPKTVRATMGGIFNLNIIYETPENILKFLKEKEYL IITTALHEDAVSYGKIELKKKNAYIFGNEGGGVSDYFIENSDIKSIIPIYGNIESLNVSI AVGIFLYKMREKQEGKCLK >gi|261748597|gb|ADAD01000029.1| GENE 22 22386 - 22727 701 113 aa, chain - ## HITS:1 COG:FN1873 KEGG:ns NR:ns ## COG: FN1873 COG0537 # Protein_GI_number: 19705178 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Fusobacterium nucleatum # 1 112 1 112 112 133 58.0 7e-32 MSTIFKKIIDKEIPANIVYEDDEFLAFHDINPAAKVHVLVIPKKEIKNLDTATEEDLELL GKLQLTVAKVARILGINEEGYRVVTNINENGGQQVYHIHYHILGGEPIGMNVR >gi|261748597|gb|ADAD01000029.1| GENE 23 22729 - 23163 735 144 aa, chain - ## HITS:1 COG:aq_1138 KEGG:ns NR:ns ## COG: aq_1138 COG0698 # Protein_GI_number: 15606395 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Aquifex aeolicus # 1 137 1 137 154 163 61.0 1e-40 MKIAIGNDHAGVDLKHKIVEFLRKKGHEVTNVGTDTLDSVDYPDIAKEVAEKVLSGEAKY GVLICGTGIGISISANKVKGIRAALVHNEFTARLSRLHNDANVIALGARVLGDELAAMCV DTFINTEFEGGRHARRVGKIEEEK >gi|261748597|gb|ADAD01000029.1| GENE 24 23332 - 23718 530 128 aa, chain - ## HITS:1 COG:aq_1494 KEGG:ns NR:ns ## COG: aq_1494 COG0824 # Protein_GI_number: 15606651 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Aquifex aeolicus # 5 128 5 127 128 81 36.0 4e-16 MKKSFKIRTYYYDTDHMGVVYHANYLKWMEIARTEYFRDKLPYKEIEDMGIMMPVKSLSI EYINSVGYDEDVEIFIELKELTSIKIKFSYEIYGKDNVLKATAETLNVMMDKSGKLKRLP KEILDILK >gi|261748597|gb|ADAD01000029.1| GENE 25 23729 - 23971 357 80 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1300 NR:ns ## KEGG: Lebu_1300 # Name: not_defined # Def: hypothetical 
protein # Organism: L.buccalis # Pathway: not_defined # 1 80 1 80 80 104 76.0 9e-22 MNKYFETVNFWLENLLESNGNYTIESNEKGKVVDITINVIKDDIGKIIGKNGRVISSLRV LISSIAKKEKKIVKIEVKEM >gi|261748597|gb|ADAD01000029.1| GENE 26 24024 - 24263 328 79 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1302 NR:ns ## KEGG: Lebu_1302 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 79 8 86 86 109 74.0 4e-23 MKSWEYIIQMKREHIDFINKIVEAYEGVGNVRTLDNRNGLIKIITNSFFLDDIDNIIERL RKKDIFIEIIEKREWLGIL >gi|261748597|gb|ADAD01000029.1| GENE 27 24268 - 25104 986 278 aa, chain - ## HITS:1 COG:FN0287 KEGG:ns NR:ns ## COG: FN0287 COG0030 # Protein_GI_number: 19703632 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Dimethyladenosine transferase (rRNA methylation) # Organism: Fusobacterium nucleatum # 15 273 6 262 264 233 53.0 4e-61 MGKNKKYGNEKNIAKKKYGQNFLTDGELSDRILEEADIDKNTEVVEIGPGLGFLTEKLIE KSKHLTAFEIDDDLIPVLNKKFKDKDNFLLVHKDFLEINLGEYLEDRENIKVVANIPYYI TSPIINRLIEFRDNISEIYLMVQKEVAERISSKPRSKNMSILTHAVQFFAETEYLFTVPK EKFDPVPKVDSAFLKIVLLKDKRYESQISEEKYFKYLKKAFSNKRKSISNNLSELGFGKE KISGVLQKVGKTSLARTEEFSVQEFIDFIKILENESGE >gi|261748597|gb|ADAD01000029.1| GENE 28 25116 - 25670 926 184 aa, chain - ## HITS:1 COG:YPO3408 KEGG:ns NR:ns ## COG: YPO3408 COG0634 # Protein_GI_number: 16123557 # Func_class: F Nucleotide transport and metabolism # Function: Hypoxanthine-guanine phosphoribosyltransferase # Organism: Yersinia pestis # 13 176 8 171 178 193 54.0 2e-49 MKKDWEFGIARQLITKEAIASRIRELGKEITKDFKDDDAPLVVVGLLKGSIIFMADLVRE IKLPLVMDFMEVSSYGNGFETTREVKILKDLENSIQGKNILVIEDIIDSGLTLKKVLKLL GGRGPKKVSLCTLLDKKERREVEIDVQYVGFEIPNEFVLGYGLDFKEEYRNIPYVALAAE KLYK >gi|261748597|gb|ADAD01000029.1| GENE 29 25868 - 27007 1496 379 aa, chain - ## HITS:1 COG:BS_yjcJ KEGG:ns NR:ns ## COG: BS_yjcJ COG0626 # Protein_GI_number: 16078253 # Func_class: E Amino acid transport and metabolism # Function: Cystathionine beta-lyases/cystathionine gamma-synthases # Organism: Bacillus subtilis # 5 379 10 384 
390 432 56.0 1e-121 MKFGTKIIHGYNMIDESTGASSISICQASTFHQSNIDDDQKYTYSRFGNPTREALEEAIA CLEKGRYGLAFSSGMAAITAVLLSFSHGDHIVMCKDVYGGAFQLANEVFPRFGIEVTFVD ETNLDEWEKAIKENTKAFYMETPSNPLLKITDIKGVVNIAKKHKLLTIIDNTFMTPQYQN PIPLGVDIVIHSATKFLNGHSDVVLGMIVTDAEDIYKELKKQQIVLGALPGVEECWLLMR GMKTMEIRMKKSSKTALKIAEYLEKHPKVEKVYYPGLKSHEGYETNKKQSLSGGAVLSFD LGCCEKVKKFFKEVKYPIVAVSLGGVESILSYPVKMSHACVPEEERLKMGVTSGLVRLSV GIEDAEDLIEDIEKALQSI >gi|261748597|gb|ADAD01000029.1| GENE 30 27022 - 28401 1995 459 aa, chain - ## HITS:1 COG:FN0504 KEGG:ns NR:ns ## COG: FN0504 COG0531 # Protein_GI_number: 19703839 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Fusobacterium nucleatum # 7 458 1 452 453 481 69.0 1e-135 MEGNKKMQFWSIVLLTINSIIGTGIFLSPGSVAKLSGSLAPWIYLCAAVFAAVLAITFAS AAKYVTKNGAGYAYAKAAFGENVGLYVGITRFVAASIAWGVMATGVVKTTLSIFNIDSGN AVNITIGFLVLMAILLIINLVGTQVLTFISDLSTLGKLAALGITIIAGVVILVKTGQNHI AEIDLLKDAAGEKLVPALTTTGFVTAVIAAFYAFTGFESVASGASDMENPEKNLPRAIPL AIGIIAAIYFGIVLVSMFIDPVSLVTSKEVVVLASVFKNKIISNIIIYGALVSMFGINVA ASFHTPRVFEAMAREKQVPTYFDKRTKSGLPMRAFFVTAALAIIIPMAFNYNMMGIMIIS SISRFVQFIIVPLGVISFFYGKNKEEVLNANKNYITDVVFSVLSLILTIFLLVKFNWIAQ FSLKNDMGEQMTNWYAVSAMIIGYIILPALLFIYKKNKD >gi|261748597|gb|ADAD01000029.1| GENE 31 28636 - 29085 541 149 aa, chain - ## HITS:1 COG:FN0916 KEGG:ns NR:ns ## COG: FN0916 COG3187 # Protein_GI_number: 19704251 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Heat shock protein # Organism: Fusobacterium nucleatum # 57 144 62 149 149 58 36.0 5e-09 MKTKIILILVSIFLVFNISFSASVRTRKRAVTSNIKQKIVNTQWKLIKISDDDVSKKGIT LNISKDELSGKSGVNNYFGGYKVNGKKISVSKLAVTAMAGSNEKMDLEQDYLDILENVKR IELQNDKLIMKTDLGEILTFTREKDFIPY >gi|261748597|gb|ADAD01000029.1| GENE 32 29345 - 30334 994 329 aa, chain - ## HITS:1 COG:FN0971 KEGG:ns NR:ns ## COG: FN0971 COG3177 # Protein_GI_number: 19704306 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 325 5 327 330 263 42.0 5e-70 
MKPPFTVTNSMLNKVVEISKTIGNLEFQIEKDLKLRKENRIKSIHSSLAIEQNSLTIEQI TAIIDGKRVLGNPKEIREVKNAYDVYEKILTLNPYNELDFLKAHSLLTADIVNESGKYRS KDVDIYDEKGNVVHIGARPQFIAGLMNDLFHWGQIDDTPEIIKSCVFHYEIEMIHPFEDG NGRMGRLWQTVILANWHPIFTWLPIETIIYEHQQTYYDVLGQADKENSSNLFIEFMLDII LETFIAYKTGDIYSDKVMNLPEGLTSTESKVYLLVKKYLNTHVSVTANVVSKLISKSSPT ARKYLSLFVSLGLMEAHGSNKNRTYTLIE >gi|261748597|gb|ADAD01000029.1| GENE 33 30639 - 31328 660 229 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037353|ref|ZP_06010818.1| ## NR: gi|262037353|ref|ZP_06010818.1| hypothetical protein HMPREF0554_0153 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0153 [Leptotrichia goodfellowii F0264] # 39 229 1 191 191 322 100.0 1e-86 MERSLYECEKKEKTLLYNDYTYIAKNVIITSSKKWNKKMYNGKIYFVLSGNVKIMEKNGY RYSASEVLIETFLGEEGDNIIIPGEVQIFDNGQIKKLRGVELSDFFYIHRKNKKFNNFQI LSNDILDFYKRALVFDCKAVSQEELNKKDSSSKNENTGNRRSYNSYGGGYDSKKYDKEIK KTQKEFEDAGFSSDYDPQYSDDPEYEREILEDYDSIDSGETPYDGDDGY >gi|261748597|gb|ADAD01000029.1| GENE 34 31247 - 31546 171 99 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037357|ref|ZP_06010822.1| ## NR: gi|262037357|ref|ZP_06010822.1| hypothetical protein HMPREF0554_0154 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0154 [Leptotrichia goodfellowii F0264] # 1 99 1 99 99 114 100.0 2e-24 MKIIDILKEIINNKVKRRYILLFLLLLLLFLTGIVTTFVVVYKNRDVEFQKNIKRFVKKR DVISPIKIFMKRDWNVLFMSVKKKRKHYYIMITHILLKM >gi|261748597|gb|ADAD01000029.1| GENE 35 31577 - 34642 3601 1021 aa, chain - ## HITS:1 COG:FN0417 KEGG:ns NR:ns ## COG: FN0417 COG3587 # Protein_GI_number: 19703759 # Func_class: V Defense mechanisms # Function: Restriction endonuclease # Organism: Fusobacterium nucleatum # 5 1000 1 971 997 399 35.0 1e-110 MSNKINFRFDSELEHQIKAINSTVDLFEGLPKQNNSIYNYNKSEHSLINPNRNKDIKLGE KLLNNLKKVQLKNELYLSEEIFDKNFTVEMETGTGKTYVYLRTILELYKKYDFKKFLIVV PSIPIRKGIEKSIEQLGEHFQRLYNIDLKKYCFIYDSGNTTNVRNFIGTTDLSIMITNIQ SFNKSVNKIQNEDEYGTVLWDEISHLKTIIIIDEPQKIEGQKNKKSESLKAIEKLEPLFT LRYSATHKNLYNQIYKLDSYDAYKKELVKKIRVKTVNGIISKDFPYIRYTHFTSNAEARI 
EMFTQKQGDSVRIKTFNVSGNESLEELSGGLSQYKNMFIAEAPHKEKPIRISTNGEDIYL NLGDSNSNISEKDIAKIQIRLAIENHFEKQFEIIENKNLKDKIKALTLFFVDSVEKVRSE SEDGRGEYLKIFDEEYENYINRNKEKFEKYKQYFPNYENTKLVREGYFAKDKKGKDVEIN YDIDNLNKKTPEEITRGISLILEKKDELISFEEPLAFIFSHSALREGWDNPNVFTLCTLK HGSSEIAKKQEIGRGLRLPVDVKGNRCLDSEINELTVIANDSYDNFSRALQEDFNKDFNK NEVSVDILKNTLEKAGISIEKISQELINDFKNELINKEITDEKNILKKDVEKITEIFENI EFENETLKEHTTKIKEEFVKLMQEKGSRRIEIKNGDNEPIVNKMRNFVSEAEFKEIYNRL SETLKKRTFYKCNIDKEEFIQKCADTINEAFKNYKFNQSYLSSTFKGVYDETKKMKTEIE TAEHEEEYNIQTPKAEKSDFEIIDTIMNNTMLPRLAVYRILKKLSKKSRKALNEQENLDE VTKEIKRELDNKKAESIKKYEIIEGYELEESKIFEIDNISEDDIFDDIDKWKIFKSNDKK KRTLNEYYKMDSKGEKEFAKCLEDDEKTVLFTKLKKGGFIIDTPYGNYSPDWAIVYKSKN EYNKMNLYFIVETKAGKNEENLTDVEKNKIKCGKKHFEVVSDTVKFGWVNSYEKFRKELV K >gi|261748597|gb|ADAD01000029.1| GENE 36 34738 - 35517 368 259 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037366|ref|ZP_06010831.1| ## NR: gi|262037366|ref|ZP_06010831.1| hypothetical protein HMPREF0554_0156 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0156 [Leptotrichia goodfellowii F0264] # 1 259 1 259 259 378 100.0 1e-103 MNKSIYELQNEKDIQKCIIIQKNYYTKAKAFNNFIFIFNVVIIVLFSILSLLSETMKAIS ILISMIIPYINNIFEKEKSKIQKEASDVQQYIDTYLYSAVLNEESKYRNQKEWNNTLKKS EMDKLVSKCDEMKENWYSDYSQKKDVEQIYLSQKENINWDNNLRNNYLKLNYVSFFILLI AIFIVVILFNLTFVKSMLYFAWLLVIINYFQKNISMLKNDIKNQEIMKEIAIIIENNFEK ENFKIEDILELEIKLQDKI >gi|261748597|gb|ADAD01000029.1| GENE 37 35510 - 35938 467 142 aa, chain - ## HITS:1 COG:no KEGG:Rahaq_5126 NR:ns ## KEGG: Rahaq_5126 # Name: not_defined # Def: hypothetical protein # Organism: Rahnella_Y9602 # Pathway: not_defined # 2 141 187 330 331 102 38.0 6e-21 KTDPRKDREKIKEINKKHNGKVLELIRLVKKWNNKKIPSYLLETLCIYYFENKNELESIN YIEFVKILPYVSFCIQYPVKDIKEIQEDINTLDDEKIRIIVDKITNEICIATEALSIEKK GDMKKSIELWKKIFGEEFPDYE Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:33:24 2011 Seq name: gi|261748541|gb|ADAD01000030.1| Leptotrichia goodfellowii F0264 contig00078, whole genome shotgun sequence Length of sequence - 56694 bp Number of 
predicted genes - 56, with homology - 55
Number of transcription units - 26, operones - 14 average op.length - 3.1
N Tu/Op Conserved S Start End Score pairs(N/Pv)
- Term 25 - 66 0.0
1 1 Tu 1 . - CDS 92 - 1888 2295 ## COG2374 Predicted extracellular nuclease
- Prom 2080 - 2139 10.2
2 2 Tu 1 . + CDS 2256 - 3344 2314 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases
+ Prom 3346 - 3405 7.4
3 3 Op 1 . + CDS 3529 - 4461 1080 ## COG0523 Putative GTPases (G3E family)
4 3 Op 2 . + CDS 4531 - 5265 383 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11
5 3 Op 3 . + CDS 5320 - 5790 590 ## EAT1b_1430 GCN5-related N-acetyltransferase
6 3 Op 4 . + CDS 5838 - 6299 525 ## COG0716 Flavodoxins
7 3 Op 5 . + CDS 6333 - 6962 741 ## Lebu_1767 hypothetical protein
+ Prom 6980 - 7039 7.1
8 4 Op 1 . + CDS 7065 - 9749 3749 ## COG0495 Leucyl-tRNA synthetase
9 4 Op 2 . + CDS 9775 - 10212 683 ## VP1847 hypothetical protein
+ Term 10274 - 10310 -1.0
+ Prom 10235 - 10294 4.0
10 5 Op 1 . + CDS 10338 - 11333 787 ## gi|262037406|ref|ZP_06010870.1| putative membrane protein
11 5 Op 2 . + CDS 11393 - 12487 1374 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain
+ Prom 12509 - 12568 9.9
12 6 Tu 1 . + CDS 12595 - 13530 1413 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase
13 7 Op 1 . + CDS 13920 - 15137 1605 ## COG2081 Predicted flavoproteins
14 7 Op 2 . + CDS 15160 - 16770 2038 ## COG2509 Uncharacterized FAD-dependent dehydrogenases
15 7 Op 3 . + CDS 16786 - 17865 1709 ## COG0006 Xaa-Pro aminopeptidase
+ Prom 17950 - 18009 7.4
16 8 Tu 1 . + CDS 18223 - 19047 797 ## gi|262037422|ref|ZP_06010886.1| hypothetical protein HMPREF0554_1260
+ Prom 19098 - 19157 8.8
17 9 Op 1 41/0.000 + CDS 19182 - 19445 596 ## COG0234 Co-chaperonin GroES (HSP10)
18 9 Op 2 . + CDS 19482 - 21107 1624 ## PROTEIN SUPPORTED gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28
+ Term 21291 - 21333 8.8
+ Prom 21113 - 21172 6.7
19 10 Tu 1 . + CDS 21338 - 21826 428 ## gi|262037393|ref|ZP_06010857.1| conserved hypothetical protein
- Term 21820 - 21851 -0.7
20 11 Tu 1 . - CDS 21894 - 22823 615 ## COG2378 Predicted transcriptional regulator
- Prom 22867 - 22926 14.4
+ Prom 22826 - 22885 9.1
21 12 Tu 1 . + CDS 22916 - 23545 733 ## COG3340 Peptidase E
- Term 23605 - 23650 1.1
22 13 Op 1 1/0.167 - CDS 23729 - 24289 928 ## COG0431 Predicted flavoprotein
23 13 Op 2 . - CDS 24305 - 24739 286 ## COG1846 Transcriptional regulators
- Prom 24827 - 24886 9.0
24 14 Tu 1 . + CDS 24810 - 24956 59 ##
- Term 24749 - 24797 5.4
25 15 Op 1 2/0.000 - CDS 24892 - 26322 1616 ## COG1257 Hydroxymethylglutaryl-CoA reductase
26 15 Op 2 . - CDS 26336 - 27505 1103 ## COG3425 3-hydroxy-3-methylglutaryl CoA synthase
- Prom 27567 - 27626 9.9
+ Prom 27604 - 27663 7.9
27 16 Op 1 . + CDS 27689 - 29776 2507 ## COG3590 Predicted metalloendopeptidase
28 16 Op 2 . + CDS 29800 - 31824 2660 ## COG3590 Predicted metalloendopeptidase
+ Prom 31961 - 32020 6.9
29 17 Op 1 . + CDS 32044 - 32406 381 ## Lebu_1349 propeptide PepSY amd peptidase M4
30 17 Op 2 . + CDS 32422 - 33354 1011 ## gi|262037429|ref|ZP_06010893.1| hypothetical protein HMPREF0554_1273
+ Prom 33356 - 33415 6.9
31 17 Op 3 . + CDS 33447 - 33923 795 ## gi|262037385|ref|ZP_06010849.1| hypothetical protein HMPREF0554_1274
+ Prom 33952 - 34011 8.8
32 18 Op 1 . + CDS 34034 - 34627 895 ## gi|262037430|ref|ZP_06010894.1| hypothetical protein HMPREF0554_1275
33 18 Op 2 . + CDS 34655 - 35098 654 ## gi|262037394|ref|ZP_06010858.1| hypothetical protein HMPREF0554_1276
34 18 Op 3 . + CDS 35132 - 35644 491 ## gi|262037395|ref|ZP_06010859.1| hypothetical protein HMPREF0554_1277
+ Prom 35682 - 35741 9.6
35 19 Tu 1 . + CDS 35806 - 36486 834 ## COG3382 Uncharacterized conserved protein
+ Prom 36534 - 36593 6.5
36 20 Op 1 . + CDS 36613 - 36870 457 ## Lebu_1178 prevent-host-death family protein
37 20 Op 2 . + CDS 36872 - 37126 410 ## COG4115 Uncharacterized protein conserved in bacteria
+ Term 37146 - 37195 1.5
- Term 37134 - 37183 4.5
38 21 Op 1 2/0.000 - CDS 37194 - 40331 3307 ## COG3250 Beta-galactosidase/beta-glucuronidase
39 21 Op 2 38/0.000 - CDS 40335 - 41165 753 ## COG0395 ABC-type sugar transport system, permease component
40 21 Op 3 35/0.000 - CDS 41169 - 42047 656 ## COG1175 ABC-type sugar transport systems, permease components
41 21 Op 4 . - CDS 42073 - 43344 1640 ## COG1653 ABC-type sugar transport system, periplasmic component
- Prom 43453 - 43512 8.9
- Term 43454 - 43505 1.4
42 22 Tu 1 . - CDS 43564 - 45558 2596 ## COG0480 Translation elongation factors (GTPases)
- Prom 45710 - 45769 10.4
+ Prom 45639 - 45698 10.5
43 23 Tu 1 . + CDS 45847 - 46776 1051 ## COG1792 Cell shape-determining protein
+ Term 46801 - 46841 -0.3
+ Prom 46781 - 46840 6.2
44 24 Op 1 . + CDS 46883 - 47416 582 ## Lebu_0855 hypothetical protein
45 24 Op 2 1/0.167 + CDS 47429 - 48103 744 ## COG1381 Recombinational DNA repair protein (RecF pathway)
46 24 Op 3 1/0.167 + CDS 48109 - 48576 766 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type)
47 24 Op 4 . + CDS 48602 - 49066 509 ## COG1327 Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains
48 24 Op 5 . + CDS 49086 - 49361 417 ## gi|262037432|ref|ZP_06010896.1| putative septum formation initiator
49 25 Tu 1 . + CDS 49472 - 49966 607 ## Lebu_0860 hypothetical protein
+ Prom 50001 - 50060 5.5
50 26 Op 1 . + CDS 50088 - 51497 2036 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain
51 26 Op 2 23/0.000 + CDS 51506 - 52669 1431 ## COG4591 ABC-type transport system, involved in lipoprotein release, permease component
52 26 Op 3 .
+ CDS 52662 - 53345 324 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 53 26 Op 4 . + CDS 53361 - 54422 1199 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair 54 26 Op 5 . + CDS 54448 - 54597 294 ## gi|262037387|ref|ZP_06010851.1| alpha/beta fold family hydrolase 55 26 Op 6 . + CDS 54641 - 56131 2051 ## COG2195 Di- and tripeptidases 56 26 Op 7 . + CDS 56147 - 56693 706 ## COG1658 Small primase-like proteins (Toprim domain) Predicted protein(s) >gi|261748541|gb|ADAD01000030.1| GENE 1 92 - 1888 2295 598 aa, chain - ## HITS:1 COG:SPy0747 KEGG:ns NR:ns ## COG: SPy0747 COG2374 # Protein_GI_number: 15674798 # Func_class: R General function prediction only # Function: Predicted extracellular nuclease # Organism: Streptococcus pyogenes M1 GAS # 23 595 292 850 910 253 32.0 1e-66 MKRKIIFLTLFTILAGVSLGETISEIQGDNMYSTMENKKVTKVEGVVTAVKKSKDNNGFF IQSRKPDKDDRTSEGIYIENKTDVEVKRGDLVRLEGEVKEIYFGKIDKSQPATTSIQADK IKVLKENVKVTPVILTGKEIPKTVRNGNNPVLDIKNNAMDYYESLEGTLVKIKDPVITGF KEKYGDITVVPSNGMYAETRSINGGVVYNNYEKEQTQRITINTTSWNLVENGKFKNNLTP NPGDKFNGDIEGVIYFENSEYRLYPVSAFPGITDGKTARETNKYKYDKEMLNVVSYNIEN FSHVDGPERVKELANQVATVLQTPDILGLIEVGDDDGQKKSEVVTAKNNVEAIVNAIKEK TGIDYGYMTVNPTDGKDGGWPEMHIRNVILYRKDRVKPVKFNQGNSQKDTEVIKKGNKVQ LTYNPGRIGNNDEIWTDVRKPVIAQFEFNGQNLFVLANHLKSKRFDEKIYGTSHPVIRKS EEVRNPQGKQINNFVKNILNNDPKATVIVLGDMNDFEFSQTIKNINGDELIDTISELPVN ERYSYVYQGASQTLDNIMINKKYKGQVNVDVIRINSEFTIDQGSFSDHDPVFIQFKVQ >gi|261748541|gb|ADAD01000030.1| GENE 2 2256 - 3344 2314 362 aa, chain + ## HITS:1 COG:MPN564 KEGG:ns NR:ns ## COG: MPN564 COG1063 # Protein_GI_number: 13508303 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Mycoplasma pneumoniae # 5 353 1 349 351 500 69.0 1e-141 MAEKMKAFVMKKIGEVGWIEKDRPACGPLDAIIKPLALAPCTSDIHTVYEGGVGERFDMV LGHEGCGEVVEVGELVKDFKPGDRVLVTAITPDWNSVEAQAGYSMHSGGMLAGWKFSNVK 
DGMFGEYFHVNDADGNLALIPKGMDYGVACMLSDMVPTGFHAAELADVQFGDVVAVFGLG PVGLMGVAGAALKGAGRIFAVDSREQVIPIAKAYGATDIIDFKKAPTEDQIMELTDGKGV DKVIIAGGDQRLFETAMKILKPGGRLGNVNYLGSGEYIQVPRVEWGAGMGHKFIHGGLMP GGRLRLEKLARLIEFGKLDPGLMITHRFDGFENVEKALELMRTKADGVIKPVVYMTSQRV NE >gi|261748541|gb|ADAD01000030.1| GENE 3 3529 - 4461 1080 310 aa, chain + ## HITS:1 COG:FN0779 KEGG:ns NR:ns ## COG: FN0779 COG0523 # Protein_GI_number: 19704114 # Func_class: R General function prediction only # Function: Putative GTPases (G3E family) # Organism: Fusobacterium nucleatum # 1 294 1 287 294 171 33.0 1e-42 MKVLVISGFLGAGKTTFIKQMIKATNREYVIFENEFGDVSIDGEILKKSVNQDKTENEEI NVWELSSGCACCSTKADFMSSLLVIDNTLNPDFLIVEPSGIALLSNILNNVKKIGYERIE LLAPITVIDAQTYFKYKSKYAEVFLDQIRVSTHIQLSKTENLSDDEMELIADDIKTINNT ANIYIADYHKADTDYWNSLFSGELVKPDSDEIKLVKSKMKNVTFKDANIRDMRAALFFLD KIIFNYYGEINRGKGVFKTPDYNIHFDLVDKNYSMYFSQEEAENSVVFIGPEIDKELLIK ELEEMGEIII >gi|261748541|gb|ADAD01000030.1| GENE 4 4531 - 5265 383 244 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 16 242 9 242 255 152 34 5e-36 MNYNKEKAGRNVMEKIIGINPVTEVLKSDKNIEKIEVYKGIKKDTIKEILNLASKRNIKI FYTDRRTENSQGVVALVSEFDYYLELTEFFEKVLAKDKSIVVILDQIQDPRNFGAIIRSA ECFGVDGIVIQDRNNVKVTETVVKSSTGAIEHVDIIKVTNISDTIDKFKKYGYKVYGAEA DGENLYYEESYPKKVCLVLGSEGNGMRKKVREHCDKTLKIHLKGQTNSLNVSVAGGIILA EMSK >gi|261748541|gb|ADAD01000030.1| GENE 5 5320 - 5790 590 156 aa, chain + ## HITS:1 COG:no KEGG:EAT1b_1430 NR:ns ## KEGG: EAT1b_1430 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: Exiguobacterium_AT1b # Pathway: not_defined # 1 154 1 151 159 120 41.0 2e-26 MKFEFEKLTQKNAIIIADKWKYEGEYSFYNMTEDIEDYEEFIDEESRNKNECFQAILDNE LSGYFCLTRNGEIIETGLGLKPDLCGKNKGLGKEFVMQLVNFVYKNFEFDKLILNVAFFN KRAIKVYHSCGFRDVEIFNQKSNGGIYPFLRMEKIK >gi|261748541|gb|ADAD01000030.1| GENE 6 5838 - 6299 525 153 aa, chain + ## HITS:1 COG:CAC3664 KEGG:ns NR:ns ## COG: CAC3664 COG0716 # Protein_GI_number: 15896897 # Func_class: C Energy production and 
conversion # Function: Flavodoxins # Organism: Clostridium acetobutylicum # 3 150 4 151 156 152 52.0 2e-37 MKISIVYYSYSGITRKLAEDIALITDGKLIELKPEKLYSFSYNTAVKEVRKEIEKGYCPS LLEIKETFDDIDTVFIGSPNWFKTFAPPILSFLRKTDLSGKTITPFCTHGGGGLGHMVED FKRECSSSDVKRGIALKGNYTFDELKDYIEKNL >gi|261748541|gb|ADAD01000030.1| GENE 7 6333 - 6962 741 209 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1767 NR:ns ## KEGG: Lebu_1767 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 193 1 197 234 69 30.0 7e-11 MKKIIILMCLFSIFSFGEQYKITKNPNVKLEKSEMNEESLKLKKAINDFRKKQDEEKDRI MMRYNQNVNPEVKQKVAELSAQTADLNKKIRAKKILEIKDVKFLTNTKAEVFYNVKEPDI GEYLGNIKFSKKIEEKITKKLGYKLDEKNMKKLTRAQIDELDRWFVSEFKSEVEKMLSSK NIYYLTTEYKIIFIKNKGNWEVEDFEELD >gi|261748541|gb|ADAD01000030.1| GENE 8 7065 - 9749 3749 894 aa, chain + ## HITS:1 COG:FN1517 KEGG:ns NR:ns ## COG: FN1517 COG0495 # Protein_GI_number: 19704849 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 2 894 1 857 859 1049 59.0 0 MIKEYKPEEIEKKWQEKWQEKNLFKSENRVDGKENYYVLEMFAYPSGKLHVGHLRNYAIG DAVARYKKMRGFNILHPFGWDSFGLPAENAAIDNGAHPGKWTKANIDNMRRQMKLMGLSY DWDREISTYTPEYYKWNQLFFIEMYKKGLVYKKKSYVNWCPDCNTVLANEQVENGKCWRH SKTDVIQKELSQWYLKITDYAEELLTGHEELRGHWPEQVLAMQKNWIGKSTGSEVDFVLD YKFDVEKQDKESNLNIGKNGEILLPVFTTRADTLYGVTYAVVAPEHPVVEEIVLKENPSL KEQIDKMINEDKINRTAEDKEKEGMFTGLYVINPVNNEKIPLWIGNYVLMDYGTGAVMAV PAHDERDFLFAKKYDLPIRIVINPLDKDGNVEKLSTEGMEKAFTFEGILVNSGEFDGIKN KEAKVKITEKLEKEGKGKKTVNYRLHDWLISRQRYWGTPIPVIYDEDGNIYLEEKENLPV KLPTDIEFSGKGNPLETSEEFKNVILPNGKKGRRETDTMDTFVDSSWYYLRYLDSHNTEK PFTKENADSWTPVNQYIGGIEHAVMHLLYARFFHKALRDMGYVSTNEPFKRLLTQGMVLG PSYYSQNERKYYFPKDVEVKNGKAFSKETGEELVTKVEKMSKSKNNGVDPEEIVKDYGVD PARVFTLFAAPPEKELEWNMNGLAGAYRFINRLYLLVSDSFEFADKNAEKENNYSIDLSK RNEKDEEIQKKLHQTIKKVTESIEDDFHFNTAIAAVMELLNDMTAYKQEVIDKNNISSES KKIWKEVLDKVILLISPFAPHIADELWEITGNKTFTFEEEWPEFNEELTKENKINLVIQI NGKVRDMVPVTIGLSKEESEKLAFASEKTKKFTEGKEIVKVIVVPNKLVNIVVK >gi|261748541|gb|ADAD01000030.1| 
GENE 9 9775 - 10212 683 145 aa, chain + ## HITS:1 COG:no KEGG:VP1847 NR:ns ## KEGG: VP1847 # Name: not_defined # Def: hypothetical protein # Organism: V.parahaemolyticus # Pathway: not_defined # 1 143 21 152 152 69 32.0 3e-11 MKKVFLLFLLFSLMAFAGYKEDLEKRMKIVEKGINKDTESTAEAVDAQVKTYEAWDKEMN IVYKKLINKIEEIEKKYGEEYKGFKASLIKTQKSWVEFRDNEADFSQRPFGRGSMGRVIR AGTRAEMTKDRALELAEYYDEVSEF >gi|261748541|gb|ADAD01000030.1| GENE 10 10338 - 11333 787 331 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037406|ref|ZP_06010870.1| ## NR: gi|262037406|ref|ZP_06010870.1| putative membrane protein [Leptotrichia goodfellowii F0264] putative membrane protein [Leptotrichia goodfellowii F0264] # 1 331 1 331 331 373 100.0 1e-101 MTGIIFYNQSKGNKIVLEYVEKMKEKFSYKYLIKEILKNLNYNLVKDTHSLTLLVISINL SLLVLNDNLKVIDLEKNVLGVLIRLFIISNVILKLLKKVIIFLREFISNLNNELLKIIKA RKDKKMLEVYKKILKNMNFKVLMFIFYLCFLVISNIVIIIKQKNILFKNSSMALFLILIM WTGYIFLFNINISYDKEKNILMKVLKSNNIKEFSSEIKEILKQNIKVEIKKIKITKIPYI LISIIISIGVIILDKLKELFSLNFKVNYENLRINGQVVKKMLVIFLKNFPPLICLIFLIS SISFIILVRYFFMKELIILYRMNVVVEEYNI >gi|261748541|gb|ADAD01000030.1| GENE 11 11393 - 12487 1374 364 aa, chain + ## HITS:1 COG:FN1920 KEGG:ns NR:ns ## COG: FN1920 COG0482 # Protein_GI_number: 19705225 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Fusobacterium nucleatum # 8 362 2 343 343 342 50.0 7e-94 MTSEKDNKKVIVGMSGGIDSSVAALLLKNEGYEVIGVTLKHLPDELSENPGKTCCSLDDI SDAKMTCYTLGIPHYTINVVEEFQKEVMDYFIKMYNEGKTPSPCIICDEKVKIKKLVDFA DKTGVKYIATGHYSKKSMNNLLLWDENNPKDQSYMLYRLDKSVIERFLFPLSEYEKPTVR QIAKENGIHTHAKPDSQGICFAPDGYISFLKKMLGDSVKKGNFIDKNGKIIGEHIGYQFY TVGQRRGLGLNLGRPFFISEIRPETNEIVIGDFEELLIDEIEIINYKFHYEIADLIGKEL TARPRFSSKGLKGKLLLKIEKNDELQKQRLYFKFEKKTHENSEGQHIVFYLDNELIGGGE IKKI >gi|261748541|gb|ADAD01000030.1| GENE 12 12595 - 13530 1413 311 aa, chain + ## HITS:1 COG:PAE3297 KEGG:ns NR:ns ## COG: PAE3297 COG0115 # Protein_GI_number: 18313971 # 
Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Pyrobaculum aerophilum # 5 308 4 299 303 190 34.0 3e-48 MKNEFMKYAYFKGEIVEFEKATVSIAAHSLQYGTTCFGGIRGYYRDGKISVFRLKDHYIR LMNASKMLGFEFFVSWEEFKKTVGELIKKNDIKEDFYMRPFVFCSQPRISPKKAGLDFEL AIYLLPLADYVSTEKGGIRFMSSTYRKYSDAAIPTKAKAGGSYINSFLATSDAQRNGYDE ALMFDNNGNVVEASVANIILIYRDKVVVPDVGSDALEGITVRSMLELLEYNGYEVHREKL DRSMILTADELLVTGTAMKIVYAESLDGRPIGTEDYSARGQAGKYYKLLKSEFEKVISGE HELSKKWLEIF >gi|261748541|gb|ADAD01000030.1| GENE 13 13920 - 15137 1605 405 aa, chain + ## HITS:1 COG:CAC3590 KEGG:ns NR:ns ## COG: CAC3590 COG2081 # Protein_GI_number: 15896824 # Func_class: R General function prediction only # Function: Predicted flavoproteins # Organism: Clostridium acetobutylicum # 4 401 2 403 405 289 41.0 7e-78 MTTEVAIIGGGAAGFVAAITAAKSGKKVVILERKERVLKKVLVTGNGRCNMTNVKADITH YFGKNIDSIENILDSFTPKDTMEFFNELGIICNEEERGKVYPLSGQAASIVDALRFEAEK LGIVTETEFYVRKVEKDGFKFKIYSEDKRKIEANRVILATGGQSYHELGSNGSGYEIAKE LGHSITKLSPSIVQLKTEKYQVKGLQGIKLDTAVTAYGDNEKICTYNGELLFTDYGISGN VIFNISFVMPLYKNVEFEIDFMKKFDYNELYDILKKRKKILGHLTMEQYFNGMINKKLGQ FLAKMSGIEKLSKPVNTLSDDKIRNLCDVLKKYRINVLETTGFKNAQVTAGGVPLDEVNT ETLESKKVKGLYFAGEVLDVYGECGGYNLQWAWASGHRAGESAGN >gi|261748541|gb|ADAD01000030.1| GENE 14 15160 - 16770 2038 536 aa, chain + ## HITS:1 COG:L195271 KEGG:ns NR:ns ## COG: L195271 COG2509 # Protein_GI_number: 15673161 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Lactococcus lactis # 1 536 1 533 535 499 49.0 1e-141 MLRINNIKMPIKHNTENLKKITAKLLKTKEEEFKTFEITGQAIDARNKNNIIYVYSVDIT VNNEEKYINLPNVRKIEKQKYVTEKVQLNDKKRPIVVGSGPSGLFAALILAEAGLKPIIL EQGKKVEERQKDVYNFFKGGKFDKYSNVQFGEGGAGTFSDGKLTTNTNNFRMQKVYEEFI LAGAEKKIAYMSKPHVGTDKLIGIMKNIRKKIENLGGEYRFQNKLVSIKYENERISKAVV EDVSENSDKENKIYEIDTDIIVLAIGHSSRDTFYMLDEKNVKMERKIFSVGVRIEHKQSM INHSQYGKFADRLPAAEYKLSVKSDNGRGVYTFCMCPGGVVVPAASEENRLVVNGMSYSG 
RNMENANSAILVNVYPEDFGEGGVLAGVEFQRKLEEKAFELGGSNYKAPVQLFGDFVKKI VSTKLGDVKPSYAKGYKFADLNECFPDYINESLKEGIKAMDKKIKGFGNYDAVLSAVESR SSSPVKIPRNEKFFSNIEGLMPCGEGAGYAGGIMSAAVDGIKCAEFVIEYYKNKLL >gi|261748541|gb|ADAD01000030.1| GENE 15 16786 - 17865 1709 359 aa, chain + ## HITS:1 COG:BS_yqhT KEGG:ns NR:ns ## COG: BS_yqhT COG0006 # Protein_GI_number: 16079502 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Bacillus subtilis # 6 357 4 353 353 299 44.0 4e-81 MEKRSEKLLKLLNELNVDGIFLTDLYNLRYFAGFTGTTGVALATKKGNFFYSDFRYRSQA EAQVSKMGFEFKEVSRGSLKYVGEHAEELGLKKIGFEDNNVTFATYQKLKEIFKAELVPV GEKIMYERMVKSDEEILMIKKAIEISDIAFSEALKVIKEGVSERELSAYMEYVQKKNGAE DKSFNTILASGVRSAMPHGVASDKKIQKEEFITMDFGAYYNGYVSDMTRTVYYGNNITER HKEIYNLVLEAQILGINTIKEGMMSDDVDKVVRNFLTEKGYGEYFGHGLGHGIGVEIHEL PYLSSVSHIELKENMVVTSEPGLYFDGWGGVRIEDDVVVKKDGREVLNKSNKKLIIIEA >gi|261748541|gb|ADAD01000030.1| GENE 16 18223 - 19047 797 274 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037422|ref|ZP_06010886.1| ## NR: gi|262037422|ref|ZP_06010886.1| hypothetical protein HMPREF0554_1260 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1260 [Leptotrichia goodfellowii F0264] # 1 274 1 274 274 483 100.0 1e-135 MKKFKVKTRPSEDTGMVIGGEIDEKFVNEKLKKLVEEAYELFVYDLNGKLSVCTPYCVSE ENVKKLIETPVRKLSRELIREYLDAVNYDETGLEIKHFLPKILEFITKRAEIRLDTSLIL DKCHFEKDVWNEKELDFMHRFSKEFIIDALGTDPGKMRIENFSVYITMFNIGGLKTEHLL DIDMWKSESTKISVLKHFEKMMYYYTRDYTYYNNSFSENAEFNNQINLWISSKEVAEIFM PVIEKYYFENPDMEYEEKWCLDQLYSVLEKNLKK >gi|261748541|gb|ADAD01000030.1| GENE 17 19182 - 19445 596 87 aa, chain + ## HITS:1 COG:Cj1220 KEGG:ns NR:ns ## COG: Cj1220 COG0234 # Protein_GI_number: 15792544 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Co-chaperonin GroES (HSP10) # Organism: Campylobacter jejuni # 1 87 1 86 86 84 50.0 6e-17 MKIKPLGKRVLVKQVEQEEVTKSGIVLPGTASKEKPITGEVLAVGKDVEDVKAGDKVIYE KYSGTEVKDGEDTYLILDIDNVLGTVE >gi|261748541|gb|ADAD01000030.1| GENE 18 19482 - 21107 1624 541 aa, chain + ## 
PROTEIN SUPPORTED ## NR: gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 [Haemophilus parasuis 29755] # 2 541 3 547 547 630 59 1e-180 MAKVIKFNEDARKSLEVGVDTLADAVKVTLGPKGRNVILDKGYGIPTITNDGVTIAKEIE LKDPFENLGAQIVKEVATKSNDVAGDGTTTATVLAQALIKEGLKMVASGANPVFIRKGME KASKKVIEELVKRAKKIESNDEIAQVGAISASDEEIGKLIAQAMEKVGESGVITVEEAKS LDTTLEVVEGMQFDNGYLSPYMVSDSERMVVELDNPFILITDKKISSMKELLPILEKTVE TGRPMLIVAEDVEGEALATLVVNKLRGTLNIAAVKAPAFGDRRKAMLEDIAILTGGEVIS EEKGVKLENADTSFLGQAKKVRVTKDNTVIVDGMGQSKDIQARVAQIKNTIETTTSDYDR EKLQERLAKLSGGVAVIKVGAATETEMKERKLRIEDALNATKAAVEEGIVPGGGTILVQI AKAIEDYKLEGEEGLGVEIVKRALYSPMKQIVINAGLDAGVVVEKVKNSEVGIGFDAAKE EYVDMVKAGIIDPAKVTRSAIQNAISVSSVLLTTEVAVANEKEDTPMSGGMPGGMGMPGM M >gi|261748541|gb|ADAD01000030.1| GENE 19 21338 - 21826 428 162 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037393|ref|ZP_06010857.1| ## NR: gi|262037393|ref|ZP_06010857.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 162 1 162 162 288 100.0 8e-77 MKIYGVYFAKKEFYQKIKEVGGEWNDTKERPVVCLFKIDNSDIYWAIPMGNFNHRDEKAK RRLKNFLECDKKDIRSCFYHIGKTTVESIFFISDVIPIIDKYIEKKYLIYNSEHYIIKNR KLIEELERKLRRILFYESINPNFFRQHITDIKNKLIEEIDKI >gi|261748541|gb|ADAD01000030.1| GENE 20 21894 - 22823 615 309 aa, chain - ## HITS:1 COG:lin0383 KEGG:ns NR:ns ## COG: lin0383 COG2378 # Protein_GI_number: 16799460 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Listeria innocua # 1 309 1 313 315 362 60.0 1e-100 MNKSERINDMMIFLNDKNSFNLTDLINKYSISKSTALRDIETLERIGMPLYSVKGRNGCY KILNNRLLSPIVFNIEEVFALYFSMLTLGAYETTPFHLSVEKLKKKFENCLPSKKIEMLH KAEKVFSLASIQHNNQCPFLKEILQYASEEKVSEIIYKEYKYYIQFFNISSAYGQWYGTG YNFEKNKIQVFRCDKISEIKESNKYQSKSLSEFLKSSDLLYKNENAVDFEIEITEKGVDL FFKEHYPSMKLHSENNKQYIRGFYNPDETDFISNYFINFGENILAIKPLSLKKLIIDKLN SRLKYCLKL >gi|261748541|gb|ADAD01000030.1| GENE 21 22916 - 23545 733 209 aa, chain + ## HITS:1 COG:lin0382 KEGG:ns NR:ns ## COG: lin0382 COG3340 # Protein_GI_number: 16799459 # Func_class: E 
Amino acid transport and metabolism # Function: Peptidase E # Organism: Listeria innocua # 1 207 1 207 209 235 58.0 6e-62 MKKMFLVSSFKDVAAILPNFEKNLKGKTVIFIPTASIPEKIKFYVNAGKKALEKLGMTVD TLEISALEITEISKRIKENDFIYVTGGNTFFLLSELKRTGADKVIIEEINNGKLYIGESA GAMITSKNIEYVKLMDNAEKAPNLKNFDALGLIEFYIVPHYKNFPFQKISQKIIDTYSDD LELQPISNNEAVLVEDDKINIEKANQKKI >gi|261748541|gb|ADAD01000030.1| GENE 22 23729 - 24289 928 186 aa, chain - ## HITS:1 COG:SPy1959 KEGG:ns NR:ns ## COG: SPy1959 COG0431 # Protein_GI_number: 15675757 # Func_class: R General function prediction only # Function: Predicted flavoprotein # Organism: Streptococcus pyogenes M1 GAS # 4 185 2 180 180 118 40.0 8e-27 MSKKTVLLAVGSFRKNSFNQVVSDYIAEVLKEEGANVEFLDYRNVPLLEQDTEFPTPKEV ISAKDQMRKADALWVVSPEYNGSYPAVVKNLLDWLSRPEKLLDYTTPTVIAGKPATVSGI AGSTGAKFVRENLSTLLGYIRMNPMPGTGVGLSVPAEAWQTGVLTLSDEQKNALKKQVKE FLEYIK >gi|261748541|gb|ADAD01000030.1| GENE 23 24305 - 24739 286 144 aa, chain - ## HITS:1 COG:SPy1960 KEGG:ns NR:ns ## COG: SPy1960 COG1846 # Protein_GI_number: 15675758 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 6 143 5 142 142 123 44.0 9e-29 MNTEWEKNSSIHLQIILQRVVKTINSKVGKDFRDKGLTVSQFSVLDVLYTKGEMRICELI KKVLSTSGNMTVVIKNMEQRDWLYRNISETDKRAFIVGLTEKGKKLFESVLPEHRAEIEK TYSIITSEEKKQLIDILRKFKNIE >gi|261748541|gb|ADAD01000030.1| GENE 24 24810 - 24956 59 48 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIFVWANKKINKIIVLFISDNVNSLKYIILFQLVHALLKLIFLTQHSQ >gi|261748541|gb|ADAD01000030.1| GENE 25 24892 - 26322 1616 476 aa, chain - ## HITS:1 COG:SP1726 KEGG:ns NR:ns ## COG: SP1726 COG1257 # Protein_GI_number: 15901559 # Func_class: I Lipid transport and metabolism # Function: Hydroxymethylglutaryl-CoA reductase # Organism: Streptococcus pneumoniae TIGR4 # 8 441 5 423 424 454 52.0 1e-127 MKNNKNIWLGFHNKDKNTRINILKENDFLNEEYSEILKSNVTLPCEIAGQMSENNLGTFA LPFGIAPNFLINEKEFAVPMVTEEPSVVAACSYAAKIISRSGGFTAEVINRKMTGQVVLY DVNDINTAIENIEKNKDLILKIANEAHPSIVVRGGGAEDVKVQAFSEKNITNNSSITADS 
DITADFLTVYLIADVKEAMGANIINNMLEGIKPLLEDITKGTALMSILSNYATESLITAS CEIDIKYLSNDISYAANVAKKIGLAGKYAKIDMYRASTHNKGIFNGIDAVLIATGNDWRA IEAGGHAYAVKNGRYEGLTDWTFNPETNKIKGKLTLPMPIASVGGSIGLNPTVKAAFNIL GNPDAKTLSQIIVSVGLAQNFAALKALVTTGIQKGHMKLQAKSLALLAGADPEQIDTVVE KLLEARHMSLEKAKEIIKVLKEPFYSEANINRLEKSIENVESGKSALKEHELIETE >gi|261748541|gb|ADAD01000030.1| GENE 26 26336 - 27505 1103 389 aa, chain - ## HITS:1 COG:SPy0881 KEGG:ns NR:ns ## COG: SPy0881 COG3425 # Protein_GI_number: 15674906 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxy-3-methylglutaryl CoA synthase # Organism: Streptococcus pyogenes M1 GAS # 1 389 1 390 391 451 58.0 1e-126 MNIGIDKFGLAIPEYFLDIEDLAVARNENANKFSKGLLQLEMSVAPVTQDIVTLGATAAH EFLTEEDKKKIDMIIIGTETGVDQSKSASVFIHNLLGIQPFARSVEIKEACYGATAGLDF AKNHILRNPDSYVLLIASDIAKYGIGTSGESTQGAGSCAMLIKKDPNILILNDDNVCQTR DIMDFWRPNYSPYPYVDGHFSTKQYLECLETTWEEYRKRNNKDLKDFEAFCFHLPFPKLG LKGLQSIVPKGIDKNLKEKLLENLHTSIIYSRRIGNIYTGSLYLGLLSLLENSTTLKAGD NIALFSYGSGAVCEIFSGTLAENFKNHLRNNRNEDFEKRKRLSVEQYEKMFFEEISPDEN GNVEFKNDDSLFSLKKIEEHKRIYKYNPK >gi|261748541|gb|ADAD01000030.1| GENE 27 27689 - 29776 2507 695 aa, chain + ## HITS:1 COG:CC3504 KEGG:ns NR:ns ## COG: CC3504 COG3590 # Protein_GI_number: 16127734 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted metalloendopeptidase # Organism: Caulobacter vibrioides # 54 695 70 706 706 428 36.0 1e-119 MKKLLAVSLFLLCLGVNYGVEKQKNIKNIKSNVSEIQNKNSAVPEGDAELTKDLSKNIKP QEDFYKFVNENWEKRTKIPSTKPAWGSFTELAEKNQDFLKNLVSELKNKKLAENSDEKKI ITLYNSYINMNRRNKEGIAPLKNDLERINAVKNIKDFEKYTIEVTKEGGGTLYGWGVGTD LNSSKDNAVYLGSAGLGLSRDYYQKETEENKEILKEYTKYVSDMLSNLGEKNTEEKARKI VEFEKRIAKVLLTNEEENDISKKNNPRKVDELSKIVKNVNLKNYLKESGVNTDKVIISQI KYYENLDKFLNDSNIEVIKDYLKFHLISSASGLLSDKLAQRRFEFYGKYMNGQKEREKLE KRALYFIDGTLGEMIGKVYVEKNFPPEAKENAKEMVDYILKAFRNRMENLTWMSEATKKK ALEKLGKVTVKIGYPDKWRNFDEIVLNEKDSLYKQMEDISKWEYKEELKKVGKPVDKTEW FMSANTVNAYYSPSGNEIVFPAGILQFPFYDYKKSGTGSNFGGIGAVIGHEITHGFDVSG 
ASFDGDGNVKDWWTKEDKKKFDSVTEKLANQFSEYAVAPNVYVNGKFTLTENIADLGGVN IAYDALQLYLKDHPEKNVKVDGYTQDELFFMNFARIWRQKATEEYLKNLVKSDPHSPNYF RVNGLLINIDAFHNTFKTKKGEKLYKEPKDRIKIW >gi|261748541|gb|ADAD01000030.1| GENE 28 29800 - 31824 2660 674 aa, chain + ## HITS:1 COG:CC3504 KEGG:ns NR:ns ## COG: CC3504 COG3590 # Protein_GI_number: 16127734 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted metalloendopeptidase # Organism: Caulobacter vibrioides # 31 674 67 706 706 400 35.0 1e-111 MKKALILLLLATAVTFGKSAVQEKYSDKSLINDLSKTERPQDDFFKFVNGNWDKQTKIPS TRSGWGVTDELIEKNQGFLKDLISEIKNKKLSENSDEKKILTLYNSYYNVAERNKTGLKP LKNDLDIINSIKNLNDFQKYAIKKSKDGTRLLYDWSVSTDLNEARNYGIFLDTPKLGLSR SYFKSDTEEDKEVLDEYTEYVSDMLSYLGEKNTEEKAKKIVEFEKEIGKILLTDEELDDI NKYNNPVKVSELGKKIKNINISDFLKQVGVNTGDINTPEIKYYENLDKFINNSNLEVIKD YLKFHLISDSAELLDEKTSNRRFEFYGKYLSGQKERDNLEKRALDFVDSELGEIVGKVYV EKNFSEEAKKSTEEMVKYIKEAFRNRIEKLTWMSSQTKKAALEKLGKINSKIGYPNKWRN FETLKISENTSLYDKISEIKKWNYNNDLKKVGKPVDKDEWGMNAHEVNAYYSPTENEIVF PAGILQLPFYSYTSQPGVNFGGIGAVIGHESTHGFDVAGASFDGDGNAKEWWTEQDRKKF DAETKRLADQFSKYTVAKGVHLNGVFTLTENIADLGGTNIAYDALKLYLKDHPGKNVKFE GYTQEELFFISYARSWREKETEERLKNMIKSDPHAPAYYRVNGVLENMDSFHEIFKTKKG DKLYKEPKDRIKIW >gi|261748541|gb|ADAD01000030.1| GENE 29 32044 - 32406 381 120 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1349 NR:ns ## KEGG: Lebu_1349 # Name: not_defined # Def: propeptide PepSY amd peptidase M4 # Organism: L.buccalis # Pathway: not_defined # 5 105 12 108 203 75 39.0 5e-13 MREKIFRIVFIGILTIGSGILSFSEEETGVKISNINVSISTQKAKEIVFSHSKINKSAAR ITKLMLYKENRKYFYDIEFFTAAKKYIYRVDANTGNIMEHRQQERTKKQNEGFKISVFGL >gi|261748541|gb|ADAD01000030.1| GENE 30 32422 - 33354 1011 310 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037429|ref|ZP_06010893.1| ## NR: gi|262037429|ref|ZP_06010893.1| hypothetical protein HMPREF0554_1273 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1273 [Leptotrichia goodfellowii F0264] # 1 310 1 310 310 467 100.0 1e-130 
MKKGILKKILFLFSMLSVISMSDITTIRNAEINQKSEENRSDKEKLSKIKNLIGNKKYDE GFEMFYKYYSDITDFYDDQSIDLFVNYIKNENEKIKNTNPKLYQENIKKFKELEISPVIM TDAELKKNNKEIEIAYLSSISKEKFLKGNIDIQEEYFLNDLSTPDFIITVGFDLGRALAE FDTAKRKIEKIYYETPDKAYVTIEIRKLATSIFLKNEKEEEKIKSEIAERFKKKTGKTVE EVKTNMKTKDDMRRAINDYSKVTETVFKEYFSRLTKKDYENENNYYSDIIFGREIGKQNG KWKWNVSAFD >gi|261748541|gb|ADAD01000030.1| GENE 31 33447 - 33923 795 158 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037385|ref|ZP_06010849.1| ## NR: gi|262037385|ref|ZP_06010849.1| hypothetical protein HMPREF0554_1274 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1274 [Leptotrichia goodfellowii F0264] # 1 158 1 158 158 304 100.0 1e-81 MRKLLLIGMVMLGALVFGEKITTGNMTWAKLQAPDMNGEYSGTWMSKEEIEKVRFLFTVD ENNNHIWVLNQFYSDRRSAGGDDVSEDQSDDGTVMGYTGSYIIHPSGKKGIYYVNNFIDL KGKKYKRLYFGFDEKLKKVVITDKDGNISEKLELYIAG >gi|261748541|gb|ADAD01000030.1| GENE 32 34034 - 34627 895 197 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037430|ref|ZP_06010894.1| ## NR: gi|262037430|ref|ZP_06010894.1| hypothetical protein HMPREF0554_1275 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1275 [Leptotrichia goodfellowii F0264] # 1 197 1 197 197 318 100.0 2e-85 MKKVLLMGVLAVISVGVINVNASAATTKKSVKVASSESEKKAAVAQAEKFINGYLKSQDY SSDKWMEKAGTTKKFREEYKKYAQYIEISSKILETQDSKELAKLKQAQKYLENYSGTEYD PIFSAAIMNVGEKPVFKVISYNEKTGIAVLTNTNVLSTDPKVIVDGKSEDLNVEIPVKVV KQNGKWLVDGAGNVNLK >gi|261748541|gb|ADAD01000030.1| GENE 33 34655 - 35098 654 147 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037394|ref|ZP_06010858.1| ## NR: gi|262037394|ref|ZP_06010858.1| hypothetical protein HMPREF0554_1276 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1276 [Leptotrichia goodfellowii F0264] # 1 147 1 147 147 260 100.0 2e-68 MRKLILTGMLVLGMITFAAKITTNNTTWKKLWDAETDRYSSLWMDENETYKFLIYDNDKH IWKLYQVYENKENTGGEIVETDKVNLYGSYVIHPSGKKGIYYINNYVDSKGKKYKRLYFG FDEKLKTVVITDKDGNIIKKLDRVMGN >gi|261748541|gb|ADAD01000030.1| GENE 34 35132 - 35644 491 170 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|262037395|ref|ZP_06010859.1| ## NR: gi|262037395|ref|ZP_06010859.1| hypothetical protein HMPREF0554_1277 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1277 [Leptotrichia goodfellowii F0264] # 1 170 1 170 170 293 100.0 3e-78 MKKLILSLMLIFSLFIFGDEYEKSYGEKITDTIKFGMTKQEFGKVINKQPLSNSHDEGDY AVYYFSDVKDPMGVERQFNSFNFVNGRLVSVIFDSQTSDEEHEKIIKMYKNNQSKLSKSK MTKIEKKGRILFYNDKKTIEIERVLDHTFIIVQTARAEAVKNKIERMNAE >gi|261748541|gb|ADAD01000030.1| GENE 35 35806 - 36486 834 226 aa, chain + ## HITS:1 COG:APE1797 KEGG:ns NR:ns ## COG: APE1797 COG3382 # Protein_GI_number: 14601632 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Aeropyrum pernix # 1 209 4 213 229 61 24.0 1e-09 MKIYVDKSMKELGITDVVIGIAKKLNPDAELTEKFKQKFHEKEQWTLNADLSEIEKNPVV EGYKDIIVKVSRSVKKNPPTISALINNIKRRGNVPHVNSIVDIYNAESLNSLLAIGAHDF NKIDFPITFTICGKEDTFYPISSNPKHVAETDFVYRDSKGIMAWLDVRDSELYKLDENTE NVLFVIQGNANTSVDMRIAALERIRDDLKGCMPESEFEIKVIHVEE >gi|261748541|gb|ADAD01000030.1| GENE 36 36613 - 36870 457 85 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1178 NR:ns ## KEGG: Lebu_1178 # Name: not_defined # Def: prevent-host-death family protein # Organism: L.buccalis # Pathway: not_defined # 1 85 1 85 85 100 80.0 2e-20 MIAVNYSSARNNFKDYCDKATDDFETIIITRKENKNVVLMSEDEYNNLMENLYIMSNKKY YNELLKRKAEVEAGLVEEHDIIEVE >gi|261748541|gb|ADAD01000030.1| GENE 37 36872 - 37126 410 84 aa, chain + ## HITS:1 COG:AGc3658 KEGG:ns NR:ns ## COG: AGc3658 COG4115 # Protein_GI_number: 15889305 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 84 1 88 89 87 45.0 4e-18 MKISWNNKAWEEYVKWQSEDKKVVKKINELIKDIERNGNEGIGKPEPLKHEFSGFWSRRI TDKHRFIYRITEEEIIIIACANHY >gi|261748541|gb|ADAD01000030.1| GENE 38 37194 - 40331 3307 1045 aa, chain - ## HITS:1 COG:ECs3958 KEGG:ns NR:ns ## COG: ECs3958 COG3250 # Protein_GI_number: 15833212 # Func_class: G Carbohydrate transport and metabolism # Function: 
Beta-galactosidase/beta-glucuronidase # Organism: Escherichia coli O157:H7 # 3 1044 13 1013 1042 685 35.0 0 MNINNLENLNILGVNRIKPRSTFFGYDSLEKAESYNRIFSKGFKLLNGNWKCLYSEYPDN FPDNFFTPEFDDTDWDTIYVPSNWQMEGYDKPWYTNVQYPFPVNPPHVPSLNPSMVYRRK FYMSALDLKNVQYLKFEGVDSAFHVWINGKYIGYSQGSRIPSEFKINDYIIQGENLICVL VHKWNVYSYLEDQDMWWLSGIFRDVYITSEPEIHIYDIFAKTSLDETYKNGILDLEVDII NNKNSKDLSIEILLKDKSTDYFLIKENSDLDSSVKSNESGIKKYNFHYEIKNILKWSAET PDLYELYITLKEKDNVLQIVPLKTGFKKIEIKDGIMLFNGKYIMLKGVNRHESHPVYGRS VRIEHMIQDLKLMKEHNINAIRTSHYPDDPKFYDLCDEYGFYVIDEADLETHGFEFVNKR NYLNNLEEWKAAFKDRAERMVERDKNHPSIIMWSLGNESGYGKNHAAMTDYIKDRDDSRL VHYEGETREIFESTDNGLPDRDPESSDVHSTMYTPIEILEKVAKLNFLKKPHIMCENLHA MGNGPGGIKDLWELLYREKRLQGGFVWEWCDHGILRTLENGEKYYAYGGDFGDTPSDSNF VIDGLVNPDRTPSPALAEYKKAIEPIKMFALNSEKTKYEVENRYDFINLNDFICNYEIKT NGKTVQTGYLENININPSKKAQFKINIGNSILNKDGDYYITFSWKIKKNIGLLPAGTELA FHQEKLSINKNTEKTSCDDEIYLKNAPDIFNIVENKNELEISLFNKKIVFDKNTGKISAY YHNGINIFKDGISLNLWRAPIDNDILEIEEFGAKKALNHWKEKCVNLVQHSLRNFKISEK TDDYIVIDVQAEVAPAVLDWGYNVHYKYKIDRKGTVTLNITGKTYGNVPETLPRIGVVMN LDKLFTDVHWLGLGPQESYIDSCSSVKFDLWHKKIEEMHTPYIFPQENGNRHNVKSFSLD SEKNIGIYFESLDNKFDFSVSEYTIENIEKAKHTYDLKKENFLQLKIDMEQYGLGSASCG EETLEKYRLYCKDFEFSFKFCTFSK >gi|261748541|gb|ADAD01000030.1| GENE 39 40335 - 41165 753 276 aa, chain - ## HITS:1 COG:BH1926 KEGG:ns NR:ns ## COG: BH1926 COG0395 # Protein_GI_number: 15614489 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 7 275 16 284 285 194 40.0 2e-49 MRKDKKLKITLMYIVLGIICLFSVYPFIWLIISSTLSDTDIFKLPPKLFFGNNFFKNFEN LNINQPIWKAFKNSMFISVTYTILTLYLSSLAGYTFAKFDFKFKNILLGLVLMTMMLPVQ VTLIPLFKISIALGWQNKAAAIVIPALANAFGVFFMKQNMASIPAELLESARIDGASEFS IFHKIILPTSLPPLAALGILSFIQQWGNFLWPLIILQDKESTTLPVLLSMMVQRGQVVHY GEVLVGTMFSILPVLVIFLLLQKYFIAGIYSGSVKG >gi|261748541|gb|ADAD01000030.1| GENE 40 41169 - 42047 656 292 aa, chain - ## HITS:1 COG:AGl3320 KEGG:ns NR:ns ## COG: AGl3320 COG1175 # Protein_GI_number: 15891779 # Func_class: G Carbohydrate 
transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 6 285 27 306 313 225 45.0 9e-59 MKIKNVLHSKSLTPYLFLAPFFIIFSIFMLYPILNSLFLSFTSAQGNTYNFIGLSNFKRV LADKIFWKSIMNVFLILAVQVPMMLFLGTVLANILNSNFLKFKGLFRLMIFLPVLVDLVT YSIVFSILFNENFGLINYALNTILHLSPIPWFSSPFWARILIIIALTWRWTGYNTVILLA GLQSIDNSLYESADIDGANAVTKLLHITLPLLTPILLFCGIISTIGTIQLFTEPQLLTVG GPDNATLTPIMYIYQYGFQSFKFGYASAVSYVITTIVFIISLIQIKLTNGGK >gi|261748541|gb|ADAD01000030.1| GENE 41 42073 - 43344 1640 423 aa, chain - ## HITS:1 COG:mlr6435 KEGG:ns NR:ns ## COG: mlr6435 COG1653 # Protein_GI_number: 13475383 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 36 359 26 356 423 228 38.0 2e-59 MKRNFLLLILGLILLISCGGEKKEAAANGKDEKVTLKIGSWNDAGDALKKVAAAFEKTHP NVKIEIVESDSDYTKLTPALVSNQGAPDIFQAQARDFQSFLVKFPKQFYDLTEKMQKDGL VDKFLPASIENVKENGKIYAMPWDIGPAAVYYRKDMFEKANVDPKAIETWDDFIEAGKKV QAANLGVTMTGYSEDNDFFHMLFNQLGGTFVKDNKIAINSPEAVKALELLKKMQDANILI NIKDWNGRIIALNNNKIATIIFPVWYSGTMVNAVADQKGKWGIVELPAFEKGGNRQANLG GSVLIISEQSKNKEIAYEFLKFALATNEGEDIMMEFGLFPAYTPYYSAPSFKVNNEYFGM DINSFFAKLTNTIPETNYGAIMLDAQAPLTGLTTSVLGGKNINTALKETSAAISQSTQLE EAK >gi|261748541|gb|ADAD01000030.1| GENE 42 43564 - 45558 2596 664 aa, chain - ## HITS:1 COG:FN1546 KEGG:ns NR:ns ## COG: FN1546 COG0480 # Protein_GI_number: 19704878 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Fusobacterium nucleatum # 1 662 3 690 690 618 46.0 1e-177 MRIYESDSIRNVAILGHSGAGKSNMTEALEFTAGLTTRISNPNDNVKISSSTTLHAIEYQ SLKFNFLDIPGYTDFFGELESGLAAADGAIIIVDGTTDLSVGTETSLELTDSRNIPRFIF VNKIDSEKADYEKILSQLREKYGKRIAPFHVPWGKAESFKGHINVVDMYAREYDGKECHN ASVPDDMNERIQPVREMLLEAVAETSEELMDKYFNGEEFTTAEIHKGLREGVLNGNVIPV ICGSTFKNIGLHTAFDMVRDYLPAPRDNKKVKSEKKEFVCQVFKTVIDSFLGRVSYAKVL SGELKPESEIYNINKRTKERVGKISTMVMDKLTDVPKAVYGDIIIFTKLSGTQTSDTLSN 
NEKEPSVPNISFPKAQMLVAIEPLNKADDEKISTGLQKLIEEDASFMWDRNMETGQTVLG VQGDLHTSVLMEKLKNKFGVEVKTEELKVPYRETIKGQSDVQGKYKKQSGGHGQYGDVKI KFMPSEEHFVFEETVVGGSVPKSYIPAVEKGLKESIAHGVLAGYPVTNIKAVLYDGSYHD VDSSEMSFKIAANLAFKKGMLEAKPVLLEPIMELSIIVPEEYIGDIMGDINKKRGRVSGM ESHKGTKQKITAEAPMAETFKYANDLKAMTQGRGYFEMKLLRYEEVPYDLAKKIIDETNA EKEN >gi|261748541|gb|ADAD01000030.1| GENE 43 45847 - 46776 1051 309 aa, chain + ## HITS:1 COG:FN1496 KEGG:ns NR:ns ## COG: FN1496 COG1792 # Protein_GI_number: 19704828 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell shape-determining protein # Organism: Fusobacterium nucleatum # 96 304 4 207 210 81 32.0 2e-15 MAAKKGKNSDNKRKTGRNIIIIVIMCIALFIFKDKVAGVFGFLDGMAQTVNFKMVKVKSI IYKQTLKFKSRIHDLNYIDSYVDNNKERDFELQKNKVQNMEMANIKSENEKLRELLNMRS KNPAEYIAADVALVENLNFSERIFIDKGKSQGVALNLPVMYNGYLIGKISKVGNNYSEVT LLTSKNSRISVVINGADLQILRGNGNGTFSIFNYNENVTDKSLFNIETSGTSDIFPKGLS IGSFYIKDLNSFKQTKEVKFRPSYKVYDIQSVLVYKWSMNDQVNKEIQDQIDEEIEQEFK KNKGTTQTN >gi|261748541|gb|ADAD01000030.1| GENE 44 46883 - 47416 582 177 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0855 NR:ns ## KEGG: Lebu_0855 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 8 177 10 179 179 191 63.0 1e-47 MKNKITGKIILFFMMIAAFSFSKDFWEKGRFTVNVTEELKMNKSTKTKRYIMTYSGGTLK IQITYPSVNKGEIYTFSGNTKTIYYPSLNQTVTQNLQKDESNILSVFNKLNKLTSKKTQT KNGDTFTFENSWLTSISSKGYRANFSDYVKAGEYTYPTKIWVTDGNSQIIYRLSNFK >gi|261748541|gb|ADAD01000030.1| GENE 45 47429 - 48103 744 224 aa, chain + ## HITS:1 COG:FN1492 KEGG:ns NR:ns ## COG: FN1492 COG1381 # Protein_GI_number: 19704824 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Fusobacterium nucleatum # 1 221 1 227 233 80 30.0 3e-15 MNILKTKCLVLKNKKINESDLTATIFTREYGKINATAYGIRKSKSRNPITLNPLSMADMT IHKKNGFYTVNESELIKKFENITKNIEKLEISLYILDSVNKIYDINYEDRIFFDRLIEIL EFINTREKILKGYKYYLILAFLRRIMLEQGIYHADELEKILDFELLLKYKEVSLISKKSS NEENIQKKFEKYSDYLKKIAVAFEKYINESLQVKMEMKKILTEE 
>gi|261748541|gb|ADAD01000030.1| GENE 46 48109 - 48576 766 155 aa, chain + ## HITS:1 COG:FN1491 KEGG:ns NR:ns ## COG: FN1491 COG1762 # Protein_GI_number: 19704823 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Fusobacterium nucleatum # 1 155 6 162 162 136 47.0 2e-32 MLSGKKVAEYIKAETVELDLKSKNKNAVIKELFENIKKSGQVRNEELALEDLYARENMGS TGIGKNVAVPHAKTDAVDKLTVTIGISRNGVEYEAIDDENVNIFFMFLCPKDQAQEYLKI LARISRLVKEDKFRESLMKAKTQEEIVEIIRKEEN >gi|261748541|gb|ADAD01000030.1| GENE 47 48602 - 49066 509 154 aa, chain + ## HITS:1 COG:MT2791 KEGG:ns NR:ns ## COG: MT2791 COG1327 # Protein_GI_number: 15842256 # Func_class: K Transcription # Function: Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains # Organism: Mycobacterium tuberculosis CDC1551 # 1 151 1 148 154 135 45.0 2e-32 MKCPFCGNENTRVIDSRAYSDGNSIKRRRICENCGKRFTTHEKVVDLALYVIKKNGEKQP YSRKKVSNGITRAVEKRNVDPEKIEEVVDKIERVILTEYSGEIKSSDLGNIIISYLLDLD EIAYVRFASVYKQFDSLDSFIREIEKIRNEKINL >gi|261748541|gb|ADAD01000030.1| GENE 48 49086 - 49361 417 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037432|ref|ZP_06010896.1| ## NR: gi|262037432|ref|ZP_06010896.1| putative septum formation initiator [Leptotrichia goodfellowii F0264] putative septum formation initiator [Leptotrichia goodfellowii F0264] # 1 91 1 91 91 93 100.0 6e-18 MGRGKIFFLFNIFFILALFGISYQTFETYLVKKEVSKEIKNTEKEVEEYAEKKKKLESNI KNFSEEEKIERVARDKLNLKKEGEVTYKIVE >gi|261748541|gb|ADAD01000030.1| GENE 49 49472 - 49966 607 164 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0860 NR:ns ## KEGG: Lebu_0860 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 2 163 3 164 164 122 53.0 4e-27 MDIYKLENLKIRLEEYKIKDEKDTYIFRVNKKYAASSSIFFIMISFIALYSLYKELTGAE KFSLFRTILGIALLGYSMFAAWLLFGYKIVIKKNVIIIGKISINMEEIESATVKIDRISA FKYDRFLDIVTKDKKRVKLRLNIGNDVLFLKLIQNYTGDKLVIS >gi|261748541|gb|ADAD01000030.1| GENE 50 50088 - 
51497 2036 469 aa, chain + ## HITS:1 COG:alr2758 KEGG:ns NR:ns ## COG: alr2758 COG0265 # Protein_GI_number: 17230250 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Nostoc sp. PCC 7120 # 46 320 127 399 407 91 30.0 4e-18 MYLLLGMAIFANDAAIKKALVKVYASHQLFNYEAPWQYGQSFNSTATGFIVEGNKIITNA HAVLNAKFLQVRKEGDSKKYKASVKFISEDYDLAYVEVEDKTFFSGTSSLKLGTLPQIQD NVTVYGYPLGGDKLSTTQGIVSRMEHNAYTLTNKRFLIGQTDAAINSGNSGGPVISKNKV AGVAFAGLSSADNIGYFIPVTILEHFLDDVKDGNYDGAPVLGVEWSKLESPSHRKMLGLE NNSQGVLIKKIFKNSPFEGVLKPNDVLLKLDNSPIEYDGTVEFRKNEKTDFGYVNQQKKY GDSLSYEIVRDKKKQNGQVKLNSKNVKYSVVKNVTLETAPSYLVYGGLLFEPLTNNYMGV TQGALNSVYEKEESFKDYSELAVLVRVLPFDVNLGYSDMGNLIITKVNGQKYKDFKDFVK KVQAVNSEFIVFETDRGEEIVLDVKQVQAEKAELMRNYNITSDMSDDVK >gi|261748541|gb|ADAD01000030.1| GENE 51 51506 - 52669 1431 387 aa, chain + ## HITS:1 COG:FN0581 KEGG:ns NR:ns ## COG: FN0581 COG4591 # Protein_GI_number: 19703916 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ABC-type transport system, involved in lipoprotein release, permease component # Organism: Fusobacterium nucleatum # 1 387 1 389 389 234 38.0 3e-61 MVELFIAVRHILERKFQSIFSVLGVAIAVTVFVVSLTVSNGLNKNMVNSLLTLSPHILIK NAKDSYFESYQDILEKTKGMKDVKAVIPEIRSQSIIKYNELAKGVLADGISAENVKNDLK LKIIDGKNDISEGNSVLVGKVLAEEMGIKVGEELSLVSAENKEIKLIVRGIFKTGGLPYD SNLVIVPLKTMQIMFERGEAATEVGILVENPQKVENTVGTVMSKFPESDYKVQSWKVVNE GLLSAVRFEKFVLIAILSLLLMIACFAVSVILNMIVREKIKDIGILKSIGYTNKNIRKIF TIEGLIIGVSGMVLASILSPFILIALQKLFKIYMKDSYYYLDELPLYISVAELSAVYIIT FIVVFISTIYPAVRASRMNPVEALKHE >gi|261748541|gb|ADAD01000030.1| GENE 52 52662 - 53345 324 227 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 4 222 3 217 223 129 37 4e-29 MNNKIIELKNVNKIYKTKVEEIHILKDVNLSFSKGDFASIQGKSGSGKTTLLNILGLLDV PTNGELKIDGNSVDYKNEKIKNRIRNEKIGFVFQFHYLLNEFTALENVMIPALINKKRDK 
KEIREKAEELLGLVGLQDRITHKPLELSGGEKQRVAIARSMINNPEIILADEPTGNLDSE TSDVINSLFKKINKEKDQSIIIVTHSAELANLATYKYKIEKGKFTSV >gi|261748541|gb|ADAD01000030.1| GENE 53 53361 - 54422 1199 353 aa, chain + ## HITS:1 COG:FN1199 KEGG:ns NR:ns ## COG: FN1199 COG0389 # Protein_GI_number: 19704534 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Fusobacterium nucleatum # 1 347 1 347 350 269 46.0 5e-72 MEKIYLHYDMDAFFASVEQRDNPRLKGIPIAVGHGVVTTASYEARKFGVKSAMPTVTAKK LCPHLKLIPVRKNYYSAVGKEIQNLIKKFTDKYEFTSIDEGYIDITEFIKNNNIEKFIIR FKEYIFKNTGLTCSVGIGFSKISAKIASDINKPNNYFIFESKEQFVEYIRDKNLSIIPGI GKKTRELLSLFNISVVSELYKIEKKELTAKFGINRGEYLYNVIRGMQYSEIDTNRKRQSY GHEVTFGQSMNDILELSDELKVQSKKLSKRLKENKEFAKTVTLKIRYSNFMTYTKAKTLK IATDEYKSIFEAAMESFKHLKKKDEVRLIGIHLSSITKSNVIQLSFNDLQINK >gi|261748541|gb|ADAD01000030.1| GENE 54 54448 - 54597 294 49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037387|ref|ZP_06010851.1| ## NR: gi|262037387|ref|ZP_06010851.1| alpha/beta fold family hydrolase [Leptotrichia goodfellowii F0264] alpha/beta fold family hydrolase [Leptotrichia goodfellowii F0264] # 1 49 1 49 49 72 100.0 1e-11 MEFYKRLVIKVLERSSLAEDNPLLKKLKSGHDLTQKEMVLLEELFDSII >gi|261748541|gb|ADAD01000030.1| GENE 55 54641 - 56131 2051 496 aa, chain + ## HITS:1 COG:FN1277 KEGG:ns NR:ns ## COG: FN1277 COG2195 # Protein_GI_number: 19704612 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Fusobacterium nucleatum # 6 491 8 483 486 357 42.0 4e-98 MDVEKLYPEKVFHYFREISKIPRGSKKEKEISDWLVKFAKGRGLQVMQDQNYNVVIKKKA SEGYKDFSPLILQGHMDMVWEKNKDTKFNFETQGIELVEKEGFLKANGTTLGADNGIAVA YALAILDSDDIKHPALEILITTDEEEGMSGAENLDYSVFDGKTLINLDTEEYGEIYVSSA GGGRTTTQFILSPKKIKTEDSTLISIEIKGLSGGHSGAEIHKNFGNSNKIMGEVLYHLSK RYSMRLIHIDGGEKVNAIPREAMVHLNVKLDGDTVKEFEKAAKLAFGNILKDFKAVDKSP IIEVSEISIEKLGKKPAEISQSDTANIISYLHEFPNGIQAMSKQVEGLVETSINLGVAKT ASVGGNIQMTFKALSRSSVSASLQNLISEITDLARKHGANIKVDSAHLPWEYKEHSEIRD 
LIVKSFKKLTGKDAEIKAIHAGLECGIFTNKMENLDVVSIGPNIYGAHTPEERMDIKSVG DTWKLLLKILEDYNIK >gi|261748541|gb|ADAD01000030.1| GENE 56 56147 - 56693 706 182 aa, chain + ## HITS:1 COG:CAC2987 KEGG:ns NR:ns ## COG: CAC2987 COG1658 # Protein_GI_number: 15896239 # Func_class: L Replication, recombination and repair # Function: Small primase-like proteins (Toprim domain) # Organism: Clostridium acetobutylicum # 8 182 2 173 185 139 48.0 3e-33 MENKKIKINEIIVVEGRDDITTVKRVIDAHVVALNGFSGLTKKSLGKLSELAKNNDLILL TDPDFAGKKIRSVVEKRIPDIKHAFISRKNATKKDNIGVENASDESIKEALLHIVTHKNR SDEKYIFTVKDLIENGLCSGEGAKEKRALLGDTLKIGYYNSKQLLNALNSFDISKENFDE AI Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:35:38 2011 Seq name: gi|261748525|gb|ADAD01000031.1| Leptotrichia goodfellowii F0264 contig00093, whole genome shotgun sequence Length of sequence - 16041 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 6, operones - 4 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 521 520 ## COG1663 Tetraacyldisaccharide-1-P 4'-kinase 2 1 Op 2 . + CDS 574 - 1341 829 ## Lebu_2201 hypothetical protein 3 1 Op 3 . + CDS 1359 - 2630 2142 ## COG0172 Seryl-tRNA synthetase 4 1 Op 4 . + CDS 2658 - 3365 702 ## COG3757 Lyzozyme M1 (1,4-beta-N-acetylmuramidase) + Prom 3381 - 3440 8.4 5 2 Tu 1 . + CDS 3484 - 4470 1519 ## COG0191 Fructose/tagatose bisphosphate aldolase + Term 4601 - 4648 3.5 + Prom 4595 - 4654 11.3 6 3 Tu 1 . + CDS 4680 - 6290 2436 ## COG0504 CTP synthase (UTP-ammonia lyase) + Prom 6366 - 6425 8.5 7 4 Op 1 . + CDS 6489 - 6593 75 ## 8 4 Op 2 . + CDS 6627 - 8090 964 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Term 8114 - 8153 -0.6 9 5 Op 1 . 
- CDS 8399 - 8659 496 ## COG1925 Phosphotransferase system, HPr-related proteins - Prom 8688 - 8747 12.4 10 5 Op 2 1/0.000 - CDS 8752 - 10335 1378 ## COG1293 Predicted RNA-binding protein homologous to eukaryotic snRNP - Prom 10381 - 10440 6.3 11 5 Op 3 1/0.000 - CDS 10447 - 11109 756 ## COG1846 Transcriptional regulators 12 5 Op 4 10/0.000 - CDS 11143 - 11778 880 ## COG0036 Pentose-5-phosphate-3-epimerase 13 5 Op 5 7/0.000 - CDS 11804 - 12730 270 ## COG1162 Predicted GTPases 14 5 Op 6 . - CDS 12717 - 13670 1219 ## COG2815 Uncharacterized protein conserved in bacteria - Prom 13701 - 13760 6.4 + Prom 13748 - 13807 6.8 15 6 Op 1 . + CDS 13836 - 14183 493 ## COG0789 Predicted transcriptional regulators 16 6 Op 2 . + CDS 14206 - 15903 2314 ## COG1154 Deoxyxylulose-5-phosphate synthase + Term 15976 - 16010 6.2 Predicted protein(s) >gi|261748525|gb|ADAD01000031.1| GENE 1 3 - 521 520 172 aa, chain + ## HITS:1 COG:FN1130 KEGG:ns NR:ns ## COG: FN1130 COG1663 # Protein_GI_number: 19704465 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Tetraacyldisaccharide-1-P 4'-kinase # Organism: Fusobacterium nucleatum # 1 172 138 315 325 125 43.0 5e-29 DKNIVLIDATNPFGGGEYLPKGRLRESINELKRADEIIITKSNYVKNDVVEMIKGRIKKY NKPILVAVFKENHFYNVKGEIFELDTIKNKNVLIFSSIANPKTFYDTVKKIGPEKIDEIK FADHHSYSEEEIEEISVESRNYDYVVTTEKDIVKVNRDIENLLVLKMEFEII >gi|261748525|gb|ADAD01000031.1| GENE 2 574 - 1341 829 255 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2201 NR:ns ## KEGG: Lebu_2201 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 3 251 5 262 262 179 49.0 8e-44 MEKSKEQKQSLCDIKTFSEFVNQYKNCFEKRKYFVINEKYEITKREPSFLLELGYIYFQT KDKTIENVVEEEFSYTYKEKNKRIDRLSKYEKDRLKESFRRSLVNKDSIHSVKLGNELLH RNKEEFLEIMYKISLISSDCNKLIKTFFVEFLLDEVGDFNKNREQTDEIVRNIINYFVKS ENEYIDYSCENSIEYFINNKTDLLYKKIYNENYDKIVKKYNIQSISKLELEINEKDYDKL SESKKILYNYLKNKK >gi|261748525|gb|ADAD01000031.1| GENE 3 1359 - 2630 2142 423 aa, chain + ## HITS:1 COG:FN0110 KEGG:ns NR:ns ## COG: FN0110 COG0172 # 
Protein_GI_number: 19703458 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Seryl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 423 4 423 424 536 65.0 1e-152 MLEMRYIRENAEKVREYLKNRKSDFDLNGLLALDEERRNVLQEVEMLKKERNESSTLIGK YKKEGKDAAELLDRMQEVSGKIKELDKKLAEIDEKQVKLTFTIPNKLHETTPVGESEDDN VEIRKWGEPKKFDFTPKSHDELGVNLEILDFERGSKLAGSRFTVYKGQAARLERALINFM LDTHTTEHGFEEIMTPQMAKREIMTGTGQLPKFEDDMYKIEEEDLFLIPTAEVTLTNLHN GEILSEEELPKYYCGFTACFRKEAGSGGKDLKGIIRQHQFNKVEMVKIVHPESSYEELEK MVSNAERILQKLELPYRVIALCSGDIGFSASKTYDIEVWVPSQDKYREISSCSNTEDFQA RRAMIKYRSKENGKSYFVHTLNGSGLAVGRTLLAIMENYQQEDGSIKVPEALVPYMGGLT VIK >gi|261748525|gb|ADAD01000031.1| GENE 4 2658 - 3365 702 235 aa, chain + ## HITS:1 COG:mlr7748 KEGG:ns NR:ns ## COG: mlr7748 COG3757 # Protein_GI_number: 13476430 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lyzozyme M1 (1,4-beta-N-acetylmuramidase) # Organism: Mesorhizobium loti # 37 225 18 207 228 144 39.0 1e-34 MKKIFKFLFFLVVLGCAAIYLEFSGYFYHNDIFAKLYKIQGIDISHHQEKINWKKVSKKY KFVIMKATEGKDFLDTDFSYNWNNARLNGFTVGAYHFFSMLSSGNAQADYYISKVPDYDK ALPPIIDLEIPTKYPKKRVLLELRNLIDKLEEHYKKRVIIYVTYYTYKAYIQGEFPENKL WIRDIKFVPQLAEDDRWVMWQFSNRGRVEGIPGFTDKNVLRGNSVEQLIEESRIK >gi|261748525|gb|ADAD01000031.1| GENE 5 3484 - 4470 1519 328 aa, chain + ## HITS:1 COG:TP0662 KEGG:ns NR:ns ## COG: TP0662 COG0191 # Protein_GI_number: 15639649 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Treponema pallidum # 5 319 4 330 332 392 61.0 1e-109 MKYHFKDLGLSNTKEMFAKANKEGYSVPAFNFNNLEQLQGIIEACVEMGSPVILQVSTGA RSYIGKEMLPWLAKAATAYVEASGSDIPVALHLDHGPNFEEAKDCIEYGFSSVMIDASHH PYDENVKESKEVADFAHKHDVTVEAELGVLAGVEDDVVAEKHIYTQPDEVEDFVKKTGVD SLAIAIGTSHGAHKFKPGDKPQIRLDILKEIEQRIPGFPIVLHGSSSVPKKFVDMINKFG GQIADAIGIPDEQLREASKSAVSKINVDTDGRLAFTAGIREVLATKPGEFDPRKYVGPAK NYMKDYYKEKIVSVFGSEGAYKKGTPRK >gi|261748525|gb|ADAD01000031.1| GENE 6 4680 - 6290 2436 536 aa, chain + ## HITS:1 COG:BH3792 KEGG:ns NR:ns ## COG: BH3792 COG0504 
# Protein_GI_number: 15616354 # Func_class: F Nucleotide transport and metabolism # Function: CTP synthase (UTP-ammonia lyase) # Organism: Bacillus halodurans # 8 533 3 532 532 688 60.0 0 MRVKIDKTKFIFVTGGVVSSLGKGIVASSLGRLLKERGYKVTIQKFDPYINVDPGTMSPY QHGEVFVTEDGAETDLDLGHYERFINEDLTKYNNLTTGKIMSKIISKERKGEFLGGTVQT VPHVTDEIKYNIIKAAEESESDIVITEIGGTIGDIESDPFIEAIRQLKREVGRENIAYIH VTLLPYLKAAGELKTKPTQHSVKMLQGLGISPDVIIVRSEHPVDQNIKKKISIFCDIDEE AVIESLDADTLYEIPLTMERLGLADVICRYFKIENKKPALIEWSKMVEKFKNPKKLVKVA VVGKYVELKDAYISINEAIEHAGFNLDTKVEIDYLKAGNFDIEKLSGYDGILVPGGFGDR GIEGKIDAVRYARENKIPFFGICLGMQMACVEFARNVLGHKNATSTEFDRDTEYPIISLM EEQEGLEYLGGTMRLGAYPCILKDESIAAKVYGKTSISERHRHRYEFNSKYKNEFEQAGM DIVGLSPDGNYVEVVEVKEHPYFLAVQYHPEFKSRPTKPHPLFTGWIKAVLKKRWT >gi|261748525|gb|ADAD01000031.1| GENE 7 6489 - 6593 75 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKCRNTENNEEFITGIVNYIKEITDDDYKNKSNM >gi|261748525|gb|ADAD01000031.1| GENE 8 6627 - 8090 964 487 aa, chain + ## HITS:1 COG:CAC0528 KEGG:ns NR:ns ## COG: CAC0528 COG0488 # Protein_GI_number: 15893818 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Clostridium acetobutylicum # 1 487 1 492 492 556 58.0 1e-158 MSLINISNLSFSYDNNYEMIFENVSFQIDTDWKLGFIGRNGKGKTTFLNILMGKYNYRGT VSSNVTFEYFPYEVKNEKLSTMEILREICPDCEEWKFLKEFSLLEIEEDIINHQFFLLSS GEKTKVLLAALFLKENNFLLIDEPTNHLDINARKIVSRYLKGKKSFILVSHDRNLLDSCI DHILSINKTNIEIQKGNFSTWEKNKRQQDELEIYQNRKLKKEIKRLVKSSKRVAVWSDKV EKTKNLGMGDKGYIGHMAAKMMKRAKSIEKRSDRLIEEKSQLLHNIDIDEKLKLSPLIFH SKRLIEFKNVSLFYNNRLICRDINFKINQGEIISLTGKNGSGKSTILKLIIGKNVKYTGN IYCAKNLKISYVSQDTSDLKGNLFEYIEEKQIDKTLFLSILSKMDFNSLQFEKDISDYSE GQKKKILITASLCEKAHLYIWDEPLNFIDIMSRIQIEELILNYRPTLLFVEHDELFNKKV ADRFINL >gi|261748525|gb|ADAD01000031.1| GENE 9 8399 - 8659 496 86 aa, chain - ## HITS:1 COG:FN1794 KEGG:ns NR:ns ## COG: FN1794 COG1925 # Protein_GI_number: 19705099 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related 
proteins # Organism: Fusobacterium nucleatum # 1 86 1 87 87 85 58.0 2e-17 MASKTVVMTNPTGLHTRPGGVFVAKAKEFESDVFVEHDGKKVNGKSLLKLLSIGIKNGSE ITVHAEGPDADKAVEVLGELLATIRD >gi|261748525|gb|ADAD01000031.1| GENE 10 8752 - 10335 1378 527 aa, chain - ## HITS:1 COG:FN0682 KEGG:ns NR:ns ## COG: FN0682 COG1293 # Protein_GI_number: 19704017 # Func_class: K Transcription # Function: Predicted RNA-binding protein homologous to eukaryotic snRNP # Organism: Fusobacterium nucleatum # 1 523 1 531 541 231 35.0 2e-60 MLYMDGIGISFLLKEIKEKIINYKLTKIYQYDKSSLSFYFGKNNLLFQVKDNSTICYLKN EKDLNTDFQSKFLLSLKKYILNSILINIRQEDYDRIVYFDFEKLNQFGDIEKYTLIFEIM GRASNIFLTSQGKILSALYFSSFDEGNRIIMTGAKYVLPFEEKKISPAYLEKENFPFESP EDFINKIEGVGKTFSKECFSDYNIFKKYISEYEAVMYESKKQKIMTYNRFSEYSENDFIS FSSVNEGLNEYFKTTITSNVINEKKKVLLKYIDSQTKKFEKILKNINIDLNKNKNYDNYK KIGDILAANMHVLKQGMKEIKTFDFYNNNEITISLDPLLSPNDNLNFYYNKYNKGKRTIT ALEARFQDIQNELKYYDEVKLFIEKETDFIGIEEIENELKFSNKNKIKLNKAKKRDLLSF EYRGFKIFVGRNNKENEEISFSKGSPTDIWFHVKDIPGSHVLLIKDNKNPDSETLNFAAS LAGKYSKGSEGDKVTVDYCERKFVKKIRNSKPGNVTYSNFNSINIII >gi|261748525|gb|ADAD01000031.1| GENE 11 10447 - 11109 756 220 aa, chain - ## HITS:1 COG:FN0681 KEGG:ns NR:ns ## COG: FN0681 COG1846 # Protein_GI_number: 19704016 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Fusobacterium nucleatum # 4 217 7 223 225 157 43.0 2e-38 MYSKIESLLDEFYKTYYKIEEINLNQVIKCLTTTELHVIEAIGEKSITMNELAEKLGITM GTASVAVNKLTDKYFIERSRSDDDRRKVYVQLSKKGLIAFKYHGNFHSNILEKITTKIPQ KDLDTFIKVFEAIVDNLNKVKKEIQPESILSFEKGDTVQVSSIKGSTAIKKYLNEKGIVM KSLIKILDIDKHIIILLVNGEEKVINIDDATDIMVTKNYI >gi|261748525|gb|ADAD01000031.1| GENE 12 11143 - 11778 880 211 aa, chain - ## HITS:1 COG:FN0680 KEGG:ns NR:ns ## COG: FN0680 COG0036 # Protein_GI_number: 19704015 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Fusobacterium nucleatum # 1 211 1 211 215 261 60.0 7e-70 MSKKIIIAPSLLAADFSDLKNEIEKVEKAGAEYLHLDVMDGTFVPNISFGAPVISSIRKH 
SNLVFDVHLMVENPDRFIKDFVDAGADIITVHVEATKHLNRTVQLIKSYGKKVGISLNPS TPLEMIKHELKNIDMVLIMTVNPGFGGQAFIENMTDKIKELRALDANIDIEVDGGINTET GKKVKESGANILVAGSYIFSGNYKEKIDSLK >gi|261748525|gb|ADAD01000031.1| GENE 13 11804 - 12730 270 308 aa, chain - ## HITS:1 COG:FN0679 KEGG:ns NR:ns ## COG: FN0679 COG1162 # Protein_GI_number: 19704014 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Fusobacterium nucleatum # 15 301 1 278 285 211 42.0 2e-54 MTENNNGGNFIKGKIIRKIKGFYYVLDENCTNLKEENIYECKLRGTLKVKNDKLNCIIGD IVEFDKNEKVITKVEKRKNFLYRPLLSNIDFIGILFSVITPNFDFTVFQKMLLNAGQQDI PAILILSKIDLISDEELQIFLNEIKENFGNTIPVFPVSVEKNIGLDNLLSYIKGKSVTVS GPSGAGKSSLINTFIGENILETNEISEKTSRGRHTTTESRFFKIYADTYLIDTPGFSTLD FPKLKEKKELELLFPEFSEYILQCKFRDCLHINEPGCSIKENVENGNIPQMRYNFYLYSF ENIFSNNK >gi|261748525|gb|ADAD01000031.1| GENE 14 12717 - 13670 1219 317 aa, chain - ## HITS:1 COG:FN0678 KEGG:ns NR:ns ## COG: FN0678 COG2815 # Protein_GI_number: 19704013 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 9 168 37 197 200 74 28.0 2e-13 MAKKFKFNIIKTVITIVCLIILGTIGREVFFRHFFNERHTIIPNVVNLHEKDAVKYLKEA GLNVKIINSKTEKVPLDTVFIQFPLAGKEVKVHRTIQIWVNNGEGQEVPNLVGLELLEAR SKLQEQNIQIEKIDYQPSNQKYNTIISVYPKPGTKLEINQKISILVSSQKILDASVMPNL IGLDINDAKVLLEQIGLSVGGTSKGEDSTLPVNTIISTSPAPGAKISKGQKISLVINVGV KVQPTGPSVEEIINQTNQDLQNQEIENIINDTLNKIDKRDNPGNNQNNNSGQQQNNNQPK SNNNVGSETERTHNDGE >gi|261748525|gb|ADAD01000031.1| GENE 15 13836 - 14183 493 115 aa, chain + ## HITS:1 COG:CAC0766 KEGG:ns NR:ns ## COG: CAC0766 COG0789 # Protein_GI_number: 15894053 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 114 1 114 126 119 52.0 1e-27 MTIKEVSERFDISQDTLRYYERVGMIPPVNRTAGGIRDYNENDLNWVKFAKCMRSAGLQV EAIVEYLKLYREGDATVQARIELLKEQRERLMEQKNEIDTALAKINDKISIYEKI >gi|261748525|gb|ADAD01000031.1| GENE 16 14206 - 15903 2314 565 aa, chain + ## HITS:1 COG:CAP0106 KEGG:ns NR:ns ## COG: CAP0106 COG1154 # 
Protein_GI_number: 15004809 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Clostridium acetobutylicum # 1 560 1 561 586 789 68.0 0 MALEKVNSPEDVKKLSRKELKELAKDVRKALLNRLTKIGGHIGPNFGMVEVTIALHYVFD SPKDKFVFDVSHQCYPHKILTGRKEGFLNPDKFSEISGYTNQNESKHDFFKVGHTSTSVS LATGLAKARDLKGDKENIIAIIGDGSLSGGEAYEGLNNASEQGTNMIIIANDNDMSIAEN HGGLYKTLKELRETDGKSPNNLFKALGLDYIYVKDGHDIDELIKVFESVKDIDHPVVLHI HTIKGKGLPFAETDKEPWHYNAPFDAATGEVKVKSGGKENYSNLTGEYLLKKMEQDPKVV AISSGTPTVLGFTPERRAKAGKQFVDVGIAEEHAVALSSGIAANGGKPVYGVFSTFIQRT YDQLSQDLCINNNPAVILVFMGGMTGMNDITHLGYFDEQILGNIPNMVYLSPTSREEYFA MLEWGIEQQSHPVAIRVPMNVVTSGEEIKPDFSELNKYVIAETGNDVAIIGLGDYYHLGK EVKELLKEKAGMNATLINPRFISGIDEEVLEKLKENHKLVITLESALIDGGFGEKVTRYY GTSNMKVMNFGGKKSLLTEFPQKKY Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:35:48 2011 Seq name: gi|261748523|gb|ADAD01000032.1| Leptotrichia goodfellowii F0264 contig00106, whole genome shotgun sequence Length of sequence - 677 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 234 285 ## FMG_0331 hypothetical protein + Term 428 - 471 3.9 Predicted protein(s) >gi|261748523|gb|ADAD01000032.1| GENE 1 1 - 234 285 77 aa, chain + ## HITS:1 COG:no KEGG:FMG_0331 NR:ns ## KEGG: FMG_0331 # Name: not_defined # Def: hypothetical protein # Organism: F.magna # Pathway: not_defined # 1 77 222 298 298 87 64.0 2e-16 AKKEREAKIDELVNEEKLKEDSKRFIKKSIDKGYVEYAGSELDSILPPTSRRQGAREAKK LSVLEKIRKIVEVFVGV Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:35:51 2011 Seq name: gi|261748520|gb|ADAD01000033.1| Leptotrichia goodfellowii F0264 contig00157, whole genome shotgun sequence Length of sequence - 1221 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 110 - 805 694 ## FN1816 hypothetical protein 2 1 Op 2 . - CDS 822 - 1184 406 ## COG3210 Large exoproteins involved in heme utilization or adhesion Predicted protein(s) >gi|261748520|gb|ADAD01000033.1| GENE 1 110 - 805 694 231 aa, chain - ## HITS:1 COG:no KEGG:FN1816 NR:ns ## KEGG: FN1816 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 231 1 232 233 207 53.0 3e-52 MSELTTKEFNEIYKRNFREYFDKPLKKDGFYKKGTINFYRLNKLGMVEVLNFQRWYDKLT VNCSIKPIFCASTKSSTEVGKRLGKFMNTRNDYWWDVKDDESMKKSMQEMLKVIRKDLYE WFNKMDNEKEILEYISKSYQTIIYRYITQAASMAKFRRYKEILQYIEKVKKEYIEGWTKE EREEREWLKKVLDEALLLEEKIKEGKESIDKYIEEREKQSIIELGLKKLIK >gi|261748520|gb|ADAD01000033.1| GENE 2 822 - 1184 406 120 aa, chain - ## HITS:1 COG:FN1817 KEGG:ns NR:ns ## COG: FN1817 COG3210 # Protein_GI_number: 19705122 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 2 120 2680 2806 2806 99 50.0 2e-21 MYEGKSDIEVKPQGSYKGQQAADYGEKGSVRPDISVKKDGELVETYEVKNYNKENYNSMA SKIGEQAIKRAEELPKTATQKIVIDTRGQKITKEIKESVIKKIIEKSNGKIKKENIIFME Prediction 
of potential genes in microbial genomes Time: Thu Jul 14 02:36:07 2011 Seq name: gi|261748491|gb|ADAD01000034.1| Leptotrichia goodfellowii F0264 contig00029, whole genome shotgun sequence Length of sequence - 39784 bp Number of predicted genes - 28, with homology - 28 Number of transcription units - 11, operones - 7 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 2015 1688 ## COG3711 Transcriptional antiterminator + Prom 2063 - 2122 11.4 2 2 Tu 1 . + CDS 2207 - 2752 427 ## COG0212 5-formyltetrahydrofolate cyclo-ligase - Term 2619 - 2675 3.4 3 3 Op 1 . - CDS 2761 - 3747 1441 ## COG1619 Uncharacterized proteins, homologs of microcin C7 resistance protein MccF 4 3 Op 2 . - CDS 3744 - 4322 768 ## COG1011 Predicted hydrolase (HAD superfamily) - Prom 4343 - 4402 1.8 5 4 Tu 1 . - CDS 4444 - 6174 2851 ## COG0442 Prolyl-tRNA synthetase 6 5 Op 1 . - CDS 6286 - 7173 1144 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 7 5 Op 2 . - CDS 7243 - 9318 2914 ## COG1200 RecG-like helicase - Prom 9412 - 9471 8.8 - Term 9521 - 9578 1.4 8 6 Op 1 . - CDS 9594 - 10763 1593 ## COG1979 Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family 9 6 Op 2 1/0.000 - CDS 10750 - 11511 645 ## COG1349 Transcriptional regulators of sugar metabolism 10 6 Op 3 . - CDS 11526 - 12116 700 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases - Prom 12155 - 12214 6.6 11 7 Op 1 . - CDS 12220 - 13266 1355 ## COG0182 Predicted translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family 12 7 Op 2 . - CDS 13269 - 14597 1969 ## COG0531 Amino acid transporters 13 7 Op 3 . - CDS 14628 - 15368 713 ## COG2820 Uridine phosphorylase - Prom 15459 - 15518 7.6 - Term 15482 - 15527 7.4 14 8 Tu 1 . - CDS 15553 - 22557 10816 ## FN1554 hypothetical protein - Term 22849 - 22906 0.1 15 9 Op 1 . 
- CDS 22917 - 24509 1906 ## COG0038 Chloride channel protein EriC 16 9 Op 2 12/0.000 - CDS 24524 - 25672 1734 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase 17 9 Op 3 2/0.000 - CDS 25709 - 26542 1222 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase - Prom 26597 - 26656 6.3 - Term 26556 - 26588 1.5 18 9 Op 4 . - CDS 26681 - 28111 2219 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific - Prom 28145 - 28204 2.1 19 10 Op 1 . - CDS 28543 - 29292 1366 ## COG0217 Uncharacterized conserved protein 20 10 Op 2 . - CDS 29348 - 29989 923 ## COG0632 Holliday junction resolvasome, DNA-binding subunit 21 10 Op 3 . - CDS 30006 - 30791 922 ## COG0796 Glutamate racemase 22 10 Op 4 . - CDS 30833 - 31903 1612 ## COG1686 D-alanyl-D-alanine carboxypeptidase 23 11 Op 1 . - CDS 32267 - 33448 1308 ## COG0739 Membrane proteins related to metalloendopeptidases 24 11 Op 2 . - CDS 33454 - 34644 1393 ## COG0845 Membrane-fusion protein 25 11 Op 3 . - CDS 34649 - 37264 3103 ## Lebu_0449 molecular chaperone-like protein 26 11 Op 4 . - CDS 37292 - 38128 991 ## COG0515 Serine/threonine protein kinase 27 11 Op 5 . - CDS 38140 - 39348 1215 ## Lebu_0899 hypothetical protein 28 11 Op 6 . 
- CDS 39373 - 39783 603 ## Lebu_0915 hypothetical protein Predicted protein(s) >gi|261748491|gb|ADAD01000034.1| GENE 1 2 - 2015 1688 671 aa, chain - ## HITS:1 COG:SA0321_1 KEGG:ns NR:ns ## COG: SA0321_1 COG3711 # Protein_GI_number: 15926034 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Staphylococcus aureus N315 # 2 497 1 492 504 103 24.0 9e-22 MIVNIRTIELYSYILNNPYKTVKELAQDMSVDERKIRYEIENLNFLLSLNKIEYKILNSV GKISIEKNLKTVEFSKILLSLEKPSQKQRRDIIIFKYLTESKINIKRLTEEFDVSRTTLK KDLKYIEEKIFKEKKIHQPLKMLLKDEHELELRGILVKILLKYRDNLMLKLKSNMVTEFI KDKISLINKKSIKKFLEKISGEKVLKSEFYDFFYSYMLVSILRISSGKSLSTRILNANFL KNTKEYEIITKYISILEKNNKIIFNENEKLQLTDYLIGFFSYSYNTKIFEKWIEVNLIIK NMIYKMEKEMKINFSEDEVLLEGLLNHIKPAIYRAKNKINLEETELYLNESESFDKNLLE KVEEEVKKIENLLDINFKREEIILFIVHFQASMERIAERIKQNKRVLLMCIGGYGTTTIL MYKLKEKYDLKELKPISYMSLSDIDISNYDAVITTVNLKKSIQENLNIKIIKVSPLLVEE DIKRLDSYFNKRRVKRIELPEIIESIEKFVEIKNREGLEKSIEKLLLGKNSDVISVNDLK ENQNLYSYFYSGNVEYIKEKIENWEEIIDIGVNILKNNKKVNEEYSEDIKSLIRNFGPYM VITEDIAIPHSETNKNVYEKGIALLVLKYPVFFPKNKRVKILFFLSSKERKDNLKIIEDI LKLIEQYDFKK >gi|261748491|gb|ADAD01000034.1| GENE 2 2207 - 2752 427 181 aa, chain + ## HITS:1 COG:aq_1731 KEGG:ns NR:ns ## COG: aq_1731 COG0212 # Protein_GI_number: 15606807 # Func_class: H Coenzyme transport and metabolism # Function: 5-formyltetrahydrofolate cyclo-ligase # Organism: Aquifex aeolicus # 1 178 1 176 186 84 28.0 2e-16 MKKDEIRKNMIEKRNLLSKETVEKNSNIIIEKLDIYIRKAENIMIFMNMGNEVEITKIIK LYSDKNFYIPKTLPNKEMKISRYNENELVLHKFGYYESKSDIFSDENILDLIIVPALAFD ISKNRIGFGGGYYDKFLHKIRKNHNNPLIIGICYDFQLLDNIPSEDHDVKVDFVITEKRS L >gi|261748491|gb|ADAD01000034.1| GENE 3 2761 - 3747 1441 328 aa, chain - ## HITS:1 COG:CAC0293 KEGG:ns NR:ns ## COG: CAC0293 COG1619 # Protein_GI_number: 15893585 # Func_class: V Defense mechanisms # Function: Uncharacterized proteins, homologs of microcin C7 resistance protein MccF # Organism: Clostridium acetobutylicum # 33 327 6 303 306 202 40.0 6e-52 MMRIRNLLLVLLLIMANMISGAEKNQIIIPEGLKPGDTIGLIAPANYKGTSADAEVEYLM 
NRGFNIVYGKSFDSAWYGFGGTDEVRAKDINDMFANKQIKAIFAIRGGYGTIRLVDKLDY DLIKKNPKIFSGYSDITTLLIAINEKTGLVTYHGPMSSNFKDIPVVTESSFDKTFINKES LNLAEFDSEYFVLKGGKGEGQITGGNLSLVIASLGTEYEINTDGKILFIEEVNEPTYRVD RMLKQLKLAGKLKNIKGVLLGDFKNAKRAEPGDMSLEDVFEDNFGKMKVPVISGLKSGHV RPFITIPIGAKAKIDTYKNEIIIEQSVK >gi|261748491|gb|ADAD01000034.1| GENE 4 3744 - 4322 768 192 aa, chain - ## HITS:1 COG:CAC3581 KEGG:ns NR:ns ## COG: CAC3581 COG1011 # Protein_GI_number: 15896815 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Clostridium acetobutylicum # 3 190 2 197 201 139 44.0 3e-33 MSIKNIIFDLGNVLINFNPKEYTNNEKLLKEVFASPEWLMLDRGTLTYEEAKEIFKERIP EEAEKINELFDSKFFELLTPIKENTELLPQLKEKYDLYILSNFHKDAFEEIYEKYDFFKC FKGKVVSCYYHLLKPEKEIYEEILNKYSLKPEETVFIDDMKVNIDAAEKLGIKGIHLPDY TKLKEKLEEFLK >gi|261748491|gb|ADAD01000034.1| GENE 5 4444 - 6174 2851 576 aa, chain - ## HITS:1 COG:FN1658 KEGG:ns NR:ns ## COG: FN1658 COG0442 # Protein_GI_number: 19704979 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Prolyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 7 574 1 564 567 667 60.0 0 MKQLKKVRLSKAFLKTYKEAPKEAEIISHQLMLRASMIKQLTRGVYSYLPLGYKVLRKIE NIVREEMNRANAQEIHMPVLQPASLWEESGRWFAYGPELMRLKDRNEREFALGPTHEEVV TDIVRNMVSSYKDLPFNLYQIQTKFRDEMRPRFGLMRGREFLMKDAYSFHIDQKSLDEEY LNMKSAYERVFIRCGLDFRAVEADSGSIGGDSSHEFMVLADSGEDDILYSDVSGYAANIE KASSIIELVESTEEKKEKELVETPDAKTIEGLAEFLNIPKEKTVKAVLMKEVLPDRQNFI LGLIRGDLDINPIKLKNAMGASMELEMMTEEDCEKLGITTGYVGSFEKNDKLKVIMDETV KYMRNFVVGGNKKGYHYINVNLEDLAYDFVADIREARAGDGSPDGKGTLKVARGIEVGHI FKLGDKYSKALNATVLDENGKQQIIAMGCYGIGISRVMAAAIEQNHDEFGIIWPKNIAPY LVDVIIANIKDEAQFSLGEKLYKELCDNGIETVLDDRNERAGFKFKDADLIGIPLKIVVG KGAAEGKVEIKDRRTGESREVKAEEVTDFVRRFIEE >gi|261748491|gb|ADAD01000034.1| GENE 6 6286 - 7173 1144 295 aa, chain - ## HITS:1 COG:AF0788 KEGG:ns NR:ns ## COG: AF0788 COG0697 # Protein_GI_number: 11498394 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction 
only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Archaeoglobus fulgidus # 1 286 38 307 308 124 31.0 2e-28 MKRQQADMGLLLVGILWGLGFVFVKIGLNEGIDPFYLLSIRFLGGFAILYAIFRKKMKKI TLYDMKAGIIIGMFQFFGYTFQTYGMLTTTASNNAFFTAINVVIVPYFFWLIYRERPNKS AFIASILCVAGIGIISFDKDMHLTSLNKGDILTIICAVFFAAQVASTGYYSERVHTLNLV FIQMVTGGVLFLGNLIIFSGVSAIKPLHGMALIAILYLTIFSTAIPFLLQTYFQKWTTST RASIIMTTESLFAPLFAYFLLGEVLTLKVIFGAVLILSSVVVSEIKFGKNEVEKG >gi|261748491|gb|ADAD01000034.1| GENE 7 7243 - 9318 2914 691 aa, chain - ## HITS:1 COG:FN1660 KEGG:ns NR:ns ## COG: FN1660 COG1200 # Protein_GI_number: 19704981 # Func_class: L Replication, recombination and repair; K Transcription # Function: RecG-like helicase # Organism: Fusobacterium nucleatum # 1 685 2 682 689 689 55.0 0 MATYNLLFKSLEEFKIKGVGKVALNKFNKLGVYTLYDLFYFFPRAYEDKANSKNISEILP EEFVAIKGTVVNAVNQYIKSGRTMFRAMLRDETGIIELVWFNNRFIKNSIKIGDELQVYG KVKRAAKFQMVNPEYKKIQSDGMIRGEARHEQILPIYPSTASLRQEAIRKIVNEALLDYG YLLQENIPEDLLKKGSLISRKEAILNIHFPEEFEKKEKAKKRFMIEEILLLEMGILQNRF EMDKENNNIYELEDNKKLVSKFIESLEYELTKAQKKVISEIYKELKSGKIVNRLIQGDVG SGKTIVALIMLLYMAENSYQGVIMAPTEILATQHYLGIVDIFNDLDVRVELLTGSVKGKK REKLLSEIENGLVDIVIGTHALIENDVVFKNLGLIVIDEQHRFGVVQRKLLREKGSIANL IVMSATPIPRSLALTIYGDLDVSIIDELPSGRMPIKTKWIRDLDEKNKMYNFIERKIKEG RQVYVVSPLIEESETLNVKSAQQTFEEYSEIFSNRRLGLIHGRQNYKEKQEVMRKFKKHE IDILISTTVIEVGVNVPNASIMVIRDAQRFGLSSLHQLRGRVGRGKYQSYCFLESETDND LSMKRLQVMEKTTDGFKIAEEDLKLRNSGEILGTRQSGVSDMVLTDIVKNVKEIKAIRDY VVDYLEKNNGKINNEYLKRDIYEKFHKKMVE >gi|261748491|gb|ADAD01000034.1| GENE 8 9594 - 10763 1593 389 aa, chain - ## HITS:1 COG:FN1415 KEGG:ns NR:ns ## COG: FN1415 COG1979 # Protein_GI_number: 19704747 # Func_class: C Energy production and conversion # Function: Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family # Organism: Fusobacterium nucleatum # 1 389 1 385 385 290 40.0 3e-78 MLNFSYKNNTKLIFGKNSIDKLEDELKKYNSKVLMLISGNGKYLKDLGIYNKVKENCEKL NIKLIYNNEVIPNPEVELVRKLVDICKKESVDFILAAGGGSVMDTAKAVAVGALSDEDVW 
EFYLYNKIPEKALNIGVISTLPASGSEASNCSIISNGKYKLGLETDLITPEFAILDPEYT KTLPLYQTSVGICDISSHLIERYYSVVKNTDTTDYMIEGLLKALIINGLKLIKNPDNIDA RAEVFLISIFAHNNILDSGRMADWASHRIEHELSGEYGITHGEGMAVVLVAYTKYMAKKK PEKLAQLVNRVFGIDSFDYSLTEMAEILADKLEDFYKKLHLRTKLSEMNINDTDFEKMAD RATKNGSNKIGHYLPLGKEEIIEILNIAR >gi|261748491|gb|ADAD01000034.1| GENE 9 10750 - 11511 645 253 aa, chain - ## HITS:1 COG:BH1553 KEGG:ns NR:ns ## COG: BH1553 COG1349 # Protein_GI_number: 15614116 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Bacillus halodurans # 1 251 1 248 251 120 32.0 4e-27 MIKIERQNKILHYLNVNGSLSISKISKELKCSEETVRKDLVELERESKLIRIHGGAYIED RYDKGFSSNIRETLLKEEKMYISDIALDFVKDRNTLMLDSSTTCIEFAKKILNSKIYVTI ITNSLQIANLCDFSENAKLILLGGKFRTRNKSFTGYHTTDILKLYNADISFVSYPSIDIK LGLGDNNYEELRVRETMLKHSKVKILLMDHTKFSENASIVFSKISNINCIITDKKINKKW EEFFKREDINVKF >gi|261748491|gb|ADAD01000034.1| GENE 10 11526 - 12116 700 196 aa, chain - ## HITS:1 COG:SMc01621 KEGG:ns NR:ns ## COG: SMc01621 COG0235 # Protein_GI_number: 15965986 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Sinorhizobium meliloti # 1 194 4 196 222 127 34.0 1e-29 MNIRKKLVEAGKRLYNTGLVAGTSGNISMRNLEKKDSYFITPSSLPYNEIKEEDIVEINS KGEPYIRGQKPSSEWQMHISVYEKYNKYNAIIHTHSTFATAFAVIGEKIPLILVEMKPFL GGDLDVAPYKPAGSRELGKVILPYLENRNSCLLANHGTISCGTTLEEAFISAEYVEDASK IYYYAKTAGKPKILDF >gi|261748491|gb|ADAD01000034.1| GENE 11 12220 - 13266 1355 348 aa, chain - ## HITS:1 COG:FN1413 KEGG:ns NR:ns ## COG: FN1413 COG0182 # Protein_GI_number: 19704745 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family # Organism: Fusobacterium nucleatum # 1 348 1 349 349 375 57.0 1e-104 MIRQDYNLPFMLKYENIAWFNEEGYVNILDRRIYPEKTVFVKCNTHEEVAKAVKDMVTQS AGPYTAVGMGMALAAFECKNMSVEKQIEYLEKASYTLSHARPTTANRMSKITEACVEKAK 
KTLENGDNAFESIKNETIDSLNRRYSIMEEVGRYLADKIKDGDKILTQCFGETIIGMLIR ILKQQNKNNVEFYCAETRPYLQGARLTATCFAEAGFKTTVVTDNMIAYLMENIKINLFTS AADTITREGYIANKIGTYQIALLARNFGIPYFVTGIPDIDKYGENDINIEYRDPSSVLGN HTLKNVNAIYPSFDITPPYLISGIVTDKGVYSSYDLGSYFKNDIKRFY >gi|261748491|gb|ADAD01000034.1| GENE 12 13269 - 14597 1969 442 aa, chain - ## HITS:1 COG:AF1612_1 KEGG:ns NR:ns ## COG: AF1612_1 COG0531 # Protein_GI_number: 11499204 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Archaeoglobus fulgidus # 6 327 9 347 445 111 27.0 3e-24 MSGSEKKLSFWELMIIGIGQLIGSGIMVLLCIAMGMTGKGVAFSFIVAAVIVIIPLIAMA ALGSAIPNSGGMYVYVRDLIGKKTGFFYVTLLVFGQFILAQYAIGVGEYAKELWSNVNVG LISMGVMTFVFIVNLIGLKTSVLLQRFVVVILVGSLLAFVIYGFPQVKDVGSFFQASNIL PNGLGSFLAASVLVRFALIGSEFLSEFGGDAKNPGRDIPIAMILSTVCVAVLYVLIAIVA AGVLPIDEVAFKTLGVVAKKILPTWMYFVFMIGGGMFALISSLNAVFAWATKGIKQAIKD GWLPEKLAEENKRFGTAHWLLLMYYLVGMYPILIGQEIKTISVIGTNIGLIFSGFPVVAI MFLAKKRPKEYENAQFKLPKWAMYTIPAISMCIYVIGVLSSWDYLKSQGAITPIIVFCII VAAYTWIREPYVKNKQSLQKEE >gi|261748491|gb|ADAD01000034.1| GENE 13 14628 - 15368 713 246 aa, chain - ## HITS:1 COG:PAE3111 KEGG:ns NR:ns ## COG: PAE3111 COG2820 # Protein_GI_number: 18313830 # Func_class: F Nucleotide transport and metabolism # Function: Uridine phosphorylase # Organism: Pyrobaculum aerophilum # 11 208 16 224 261 130 35.0 2e-30 MAEELLPGIKLRKGEISERVLVCGDPFRAEMLAQRLDDMKCLAKAREYWTYYGKYKGVPI NISSHGVGASGAMLSFISLIKGGAEVIIRLGTCGSLKKDVKAGDIIIATAAAKEDGLSNI YVPVSFPAVADIDVINCLRETGIKQNIKNIHTGIIVTQGAFYGGVIKTNTEELAESGAIG LEMEVAALYTAGSIYGIKTGSILSVDGNALAVLDCAEDNNPDPELLKKTVEKCADIALEA LINVKF >gi|261748491|gb|ADAD01000034.1| GENE 14 15553 - 22557 10816 2334 aa, chain - ## HITS:1 COG:no KEGG:FN1554 NR:ns ## KEGG: FN1554 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 734 2334 1 1582 1582 1409 57.0 0 MSNELTRLEKELRNFAKRCKNIKYTGGLLLSFLFTGMLTFSVNVDSNIVNARKDLNTSIN DLHVSFKQAKRENNRLLKNANLELIQLMEQGDHVVKSPWSSWQIGMNYFYSNWRGAYKGR 
GDKAQKYPYEGIFERSTNAFERYTSPLSPSYEELNGMSNIRNGLNQGYGIASTTPKQEPL AILNVDASIKPKDVFRDAVNAPTVSVQAPNLPSLNVPSLTPPPVSVPTPNVPEKTVSIVK PNASPFTGFFFDGSANAITKDNSQGLIPTGADVTYVATQGVTLYSGVEPSDINANKSADQ ITPAAKTGYIKSDRTIVDVSNTNTSQGNKVGRTTNILYRSGYTNGTPLTLKDLTLHVRGN FDGTDSGGTGTGPTDSYTDVGRGAEGGADTGGPTRGTIGIHTLLNVNIENTTANLYGRAG FVTSETWRNGVVNMVGNNNKVNVYGSENSVFYIMPSAYGTITYGYSNSNRWAFYLGKITG KTNIDIFGTGNNVYLSSGISGARHIENDGVINSYGASNIVYSGMSYVPDWSKSAYQTMGP HPDKMQSKIQLSKVNLYGDENVGMFFGSKMGGNNPKSWEPGHRISEDGAGYIRKASYIGI YQGEIDFAAKIGTQLGANASDTAQASVGNLTGNSDKWVEGGVGVFAQSGQREGIDPIKDL GVPSGLGGDYAQAPTIANDKIHALQVRKLDITFGKNSKNGFMLISKLGSVIDVGNPASSH DYVTGISSTITDGVNGANTNETDASTGTVIAYAEGTWDQAKHQLGSKAAEKTQNDNDAAA VNNGTARKPLTDTAASTAAKLQGLASEINVYPDVVLASKEGIAYMGDNQGIVNAKKTTVA VNHSAIIGFARNKGIVNIDGDITALDAKVTNADDKYKNIAGLATLNTTALSGATVGGTVN INGNVNVNGMGALASGNGSTVNLNKTGNIINIGKEGGLAAVKGGKVNFAGGTINVGADRS SSTPVYADEDPNSSVTFKDGATPTSKPTTINISSGILMTGEDTDYSAAATGVGKYKGMKN VKVVLQADGVVLKTSNGKTVTWNGSTAGSTAVKNAMKLGELDTNNKKFKIFYINGTFNLA TNLNLDDATDEFNKSLGLSNEVFNIGSGVTVSSAAGKGLAMGSNDTATSNASNAYNNDGT INITSGTANTSALNISYGTVDNRNIINVDKGIGVYGINGSRLVNNTNGTITVGTEGVGMA GFSSASTLQIYGTDKKIKDGTLASTDKTLELLNKGTITVNGNNSVGMYGETNKAAGAHAS TNIAASNGLITNSGKITMTGDNAVGIVSKGFGNKIDLSGTGSSDIVTGVNGIGVYAEKSN INLLSNYGMEVKDQGTGIFVKDGSSVSAGTLELKYNGSNTGTGVGIFYEGSSTANMLNKT NVNLVDNTGTTGGLVGLFANNGGILTNTGNIAGDKGYGIITSGTEIVNNGNITLNNPVDN ATKKASVGIYTQGSDKITNNGIITAGENSVGIFGYTVVNTSTVNVGNGGTGIYSTGGNVD LLLGAINTGTNKAVAVYTKGNGQTVTAHNGSTLNIGDNSFGFINEGTGNTINSNISSQSL GTDTVYIYSTDRTGTVTNNTALTSTGSYNYGMYSAGTVTNNADINFGAGLGNVGIYSTHG GTATNSAGRTITIGKSFIDPDDVEKNRYGVGMAAGFTPTDLERLAGKTDYTGNIVNNGTI NVTGQYSIGMYGTGAGTKVYNGTAIGSSATINLGASNTTGMYLDNGAYGYNYGTIKTVRT ALEKVAGVVVRNGSTIENHGDIILDAKSAVGLLAKGDVLGNNLGIIKNYKNMMITGEGSV EQQEDTNNSSFGKGMGGVSIDVPKGSSTGTINVNGKPVVSTLATTTGEEYQNMNVSRIGM YIDTSSKRFTRPIAGLSNLSSLRKADLIIGAEAAKNTTSKYIQLDKKILDPYNDMIKRNT QIKDWKIYSGSLTWMATVAQNQTDGTMENAYMAKIPYTQWAGKEPMPVEVTDTYNFLDGL EQRYGIEGLGTREKTLFDKISGIGKNEQRLFYQAIDEMMGHQYANVQQRINGTGKLLDKE 
ITHLGKEWDTKSKQSNKIKVFGMRDEYKTDTAGIINYTSNAYGFAYLHEDETIKLGNSSG WYAGAVNKSFKFKDIGGSKENTVMLKLGMFKSTAFDNNGSLKWTVSGEGYVARTSMHRKY LVVDEIFEGKSDYNSYGTAIKNEISKEFRTSERTSIKPYGSLKLEYGRFNTIKEKTGEVR LEVKGNDYYSVKPEVGIEFKYRQPMAVKTTFVTTLGLDYENELGKVGDVKNKGRVAYTDA DWFNIRGEKDDRKGNFKADLNIGIENQRFGITLNGGYDTKGKNVRGGLGFRVIY >gi|261748491|gb|ADAD01000034.1| GENE 15 22917 - 24509 1906 530 aa, chain - ## HITS:1 COG:FN1727 KEGG:ns NR:ns ## COG: FN1727 COG0038 # Protein_GI_number: 19705048 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Fusobacterium nucleatum # 24 509 18 501 521 387 43.0 1e-107 MAKGILRELHDINSLKSRRKSTVLIFFCFLTGIVTGGVVASYRLLLDKIAYFRHDFFKDM SLGRVIIGMSIFILIGIAVQFMLSKYPMISGSGIPQVNGLLQKKVGFNWFPELICKFVGG VLAIGTGMSLGREGPSIHLGALIGDGINKLSKRTSVEEKYLVTCGASAGLSAAFNAPLAG AIFALEELHKFFSPLLLICVLVASGAANYVSRTILKTGITFQHNFVLPDGTSPLFAIVTT VIFCVIMAVSGKAFGYFLVYFQKKYKGVKLNKYLKISCFMIIAYLTAVFFREVTGGGHDL IENMFQADVPLKILFVILILKFFYSMLSYSTGFPGGIFLPMLVIGALVGKVYGVILVNYF SASNEFIVHFMLLGMAAYFTAVVKAPITGIILILEMTGNFSYLFLLTITATITYILTEMM KMEPIYEVLFENMFAKSETEKKGSHSPGKENKIVTLLIHVGADSELENKKIKDLNLPKKL LIMSVRANGKEHIPDGETVVRSGNQLVIVTDYKTAQKYAYDLKDKGIKIL >gi|261748491|gb|ADAD01000034.1| GENE 16 24524 - 25672 1734 382 aa, chain - ## HITS:1 COG:VC0994 KEGG:ns NR:ns ## COG: VC0994 COG1820 # Protein_GI_number: 15641009 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Vibrio cholerae # 3 377 4 373 378 345 48.0 1e-94 MIIKNAKIFDGEKFTDKDAVVIEDKKIKKVANTSELSEEELKNHEVVDINGMILSPGFLD LQINGCGGVLFNDDISRKALEIMNETNKRFGCTSFLPTLITSPDEKIEKALDLIKEMQDK EEIGVLGLHIEGPYISVEKKGIHRPEYIRILSDEMIQKIADAGPEVTKIITIAPEKAKVE HLKTLKNAGLNIAVGHTNATYEECMEKKEYFNCATHLYNAMSPLESRNPGVIGFLFNNDT TNCGIIIDGFHMEYPSAEIAKKILKERLYLVTDAVSPAGTDNMTEFMFEGNRVLYENGRC YSPLGTLGGSALVMIDGVKNLVQHVHVSQEEALRMATSYPAKAISVNDRYGYIKEGYIAD LTYFDDNYKVKGTVAKGNLTKY >gi|261748491|gb|ADAD01000034.1| GENE 17 25709 - 26542 1222 277 aa, chain - ## HITS:1 COG:PM0875 KEGG:ns 
NR:ns ## COG: PM0875 COG0363 # Protein_GI_number: 15602740 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Pasteurella multocida # 1 260 1 260 267 352 63.0 6e-97 MRVIILKDKQEIGKWSAYQIAKKILKFNPTADKPFVLGLPTGSTPLGTYKELINLYKEGI ISFENVITFNMDEYVGLSPEHEQSYHYFMHENLFKHIDIKPENVNILDGLAEDLTKECQR YEDKIKSVGGIKLFLGGIGEDGHIAFNEPGSSLASRTRDKELTYDTILVNSRFFDNDVNK VPKLALTVGVGTLMDSEELMILADGYKKARAVYHGIEGGINHLWTVSALQLHRRAILVID ESAASDIKVKTYRYFKEIEAHNLDLEKYKEEIIKLKK >gi|261748491|gb|ADAD01000034.1| GENE 18 26681 - 28111 2219 476 aa, chain - ## HITS:1 COG:PM0876_1 KEGG:ns NR:ns ## COG: PM0876_1 COG1263 # Protein_GI_number: 15602741 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Pasteurella multocida # 1 387 3 396 408 482 66.0 1e-136 MMKYLQRIGKSLMLPVAVLPAAAILLGIGYRLDPVGWGANSPLAAFLIKSGASILDNMAI LFAVGIAIGMSKERDGSSALAGLVGFLVITTLLSKDSVAVMMKVNPEEVSRAFGKINNPF IGILSGIIAGEIYNRFHKTELPMAFAFFSGKRLVPILTATTMLILSPVLFFVWPVIFEAL VGFGESFLKLGAVGAGVFGFFNRLLIPLGLHHALNAVFWFDVAGINDIGNFLGGTGVKGT TGMYQAGFFPIMMFGLPAAAFAMYQAAKPEKKKQIASLMLAAGVTSFVTGVTEPLEFAFM FVAPVLYFVHAVLTGISLFISALFHWTAGFGFSAGLIDFTLTSGMPMANNSFMLLVQGLV FGAIYYLLFKVLIAKFNLMTPGREEDDNKQEMEIRSSNNNHAEVAKIIYDALGGKENVVS IDNCATRLRLEVKDTSVIEDKKIKQVSAGIIKPSKTSAQVVIGPLVEFVAEEMKKF >gi|261748491|gb|ADAD01000034.1| GENE 19 28543 - 29292 1366 249 aa, chain - ## HITS:1 COG:FN1661 KEGG:ns NR:ns ## COG: FN1661 COG0217 # Protein_GI_number: 19704982 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 249 1 249 249 282 58.0 5e-76 MAGHSKWANIQHRKGRQDKQRGKAFTRLGKELMIAARIGGGNVDFNPRLRLAIDKAKAAN MPKDTIERAIKKGTGELEGVDYIEIRYEGYGPGGIAFLVDVVTDNKNRSASTVRMNFSRN DGNLGETGSVAFMFDRKGIFEFEKGKVDEEKLMEIALEAGAEDVADKENSLVVITDPNDF ETVKETLEKEGLKTEEAEITMYPQNEIDITDTETATKILKLYDDLEDNEDVQEIYANFNI SDEVLSRIE 
>gi|261748491|gb|ADAD01000034.1| GENE 20 29348 - 29989 923 213 aa, chain - ## HITS:1 COG:FN1104 KEGG:ns NR:ns ## COG: FN1104 COG0632 # Protein_GI_number: 19704439 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, DNA-binding subunit # Organism: Fusobacterium nucleatum # 1 210 1 191 194 129 37.0 4e-30 MFEYIFGELTVKKIDYVALDINGLAYKIFISLKTFEALGETGQDEKLYVHTHVKEDDISF YGFKTENERELFKALISTSGVGPKLGLAILSTYNVKDVINIVIESNSKLFTKVPGLGDKK AQKIILDLKDKVKKLNIIETYNENGEISKGNPIVSDTSNTKILLMKEDIKLALESLGYLN ADVSKWIKDEELSQINDIGEAIKSILQKIQNKK >gi|261748491|gb|ADAD01000034.1| GENE 21 30006 - 30791 922 261 aa, chain - ## HITS:1 COG:aq_325 KEGG:ns NR:ns ## COG: aq_325 COG0796 # Protein_GI_number: 15605845 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glutamate racemase # Organism: Aquifex aeolicus # 1 254 1 244 254 242 50.0 5e-64 MSIGIFDSGVGGLTVLKEIRKVLPNEKIYYLGDTARVPYGEKTKELIVRYSKQITEFLLE QNVDAIVVACNTATSLALEELKETFRVPIIGVVDAGVKTALYTTKNNKIGVIGTKATINS RKYETEIKNKNSAIEVYSKSCPLFVPVIEEGITKGEIVDLVVKMYLDEFKEKIDSLILGC THYPLLKETIKNLYPNIEIVDPAKETAEDLKKILTDNNLMKNDGEKGKEVRYYVTDGQEK FKEIGTMFLDENIDYVELIKL >gi|261748491|gb|ADAD01000034.1| GENE 22 30833 - 31903 1612 356 aa, chain - ## HITS:1 COG:FN0060 KEGG:ns NR:ns ## COG: FN0060 COG1686 # Protein_GI_number: 19703412 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Fusobacterium nucleatum # 29 262 112 346 368 177 41.0 3e-44 MKNLRKIFIILTIFVISSFAFAEGENYFDYKAILIGDTNGNIAREENSYAVRPLASVTKV MTIMLTLDKVHSGEISLNDRVVVSEAAASVPYGIKLVAGNEYTVRDLLKATAIRSSNNAA YALAEYVSNGNVYEFVNSMNNKARQSGWDSLRFCSPHGLPPSYTGSCMDQGNARDLYKMA MKAVTYKEYMNIAKESVDFIDHGNIKLTATNHLLGKVPGVDGLKTGYHNAAGSNIILTAK RNGERMILVILGSTKAKNRDAIGEKEINDYFINGSKSFKIIDKDTAVAYAKIGKDRYGLY PSEDVTVKSKNGFEKPKLTYKVALKDNVSRKDVGQVVGTFIGTDGENEYTGALIMK >gi|261748491|gb|ADAD01000034.1| GENE 23 32267 - 33448 1308 393 aa, chain - ## HITS:1 COG:AGc3364 KEGG:ns NR:ns ## COG: AGc3364 COG0739 # 
Protein_GI_number: 15889135 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 254 390 282 420 432 125 46.0 2e-28 MKKKINFFKISVYIIVPVIFLVIIYGIYRTDYNENEKVENISGEILKVPNSEGNTGEFEY ESYIVSEEISVTDLALKLKKDVKVLQYNNSGVTKSRLLKPGTRIIVYKEPVIFYKIKDGD TLSVIASKFQVSKESIININSDIIQENLSNKKYLVIKNPVINDSLIVEKGMEEISSLVDL SVELNKGHNKSDLINKETIISDISNNRTKSNRIKNNYSSEDNFENIEVKVITDNESNQSN FEGEEEIKITKEDIRKLDLSKIESYELYWPVVSTKITSEFGNRMHPVLKENRFHRGVDIA SVKGAAVNSGVKGVVTYAGAKGNYGNMIEVRRNDGLKVRYAHLSKIEVRTGQTVQEGDKI GEVGSTGMATGPHLHYEVLIEDIPVDPMKFKYR >gi|261748491|gb|ADAD01000034.1| GENE 24 33454 - 34644 1393 396 aa, chain - ## HITS:1 COG:PA2528 KEGG:ns NR:ns ## COG: PA2528 COG0845 # Protein_GI_number: 15597724 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Pseudomonas aeruginosa # 34 350 60 391 426 73 24.0 4e-13 MKKDKKDNIIWILPPLFIFLVLLMLGMCTPKINNRYIIENAKIETLELFLDMKGEVEAKD LLPIGLDIQLGIDDVYFKEGDRVKKGDIIIKFSDYKSKDLETKLYNQKQELAVKTSQLRY LQEQYKLGTDVTNQIQKLMGEIKALEGELATASKENTLVQRVVMSPIDGYIVKLNAVKGG TTDSLTPVVILAKTHDIKIVSEPFPANKLQYVNLGNKAEIKAINNTETKFEGILYKISDS GIPDLKVLEFLTSSFADVTLNQILKIRLIYQKKENAITVPLNAVVKKKSSKNGEKYIIYL INKDNKVTEKEVQIGINNGERIEIYGTGIKEGLEIVANPDDKIRNNIIVKRRDLITEKKK KEEELSKLEKENEKKQKEIEKNEIEIIRLKRGENKK >gi|261748491|gb|ADAD01000034.1| GENE 25 34649 - 37264 3103 871 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0449 NR:ns ## KEGG: Lebu_0449 # Name: not_defined # Def: molecular chaperone-like protein # Organism: L.buccalis # Pathway: not_defined # 4 871 11 901 901 1226 72.0 0 MRLFYEEELRRKSYYEMYQIAVEERLVELYTSTPTREELINLLLKYRGARPDYCINKYKK DGIAYVQALFDEKLSTRIHHENRIKIPHKIILYKDLNLTREDNYKIIIPDYIGNANVFLV NANNYLCGILQLEKDLNSRDAYYLVSRQEFLRVEDLKNNKFSLIFFKNPELEFIYKFYNV KDGNSFPTYPYKLDYYKVDIENFEVRDLEETNATLCIDFGTTNTALGAYLDRHYVRDLPT NDILNGNITLNEINYVRFNDGERHYREIFPTLAYVDNCSNPDDIKYSFGYDVIKKLEMND 
YIVNGSLFYGLKRWVHDHDEEEKIVDELGNILYIKRREIIKAYLKYVVNRAEYVFKCKFK RIHASSPVRLKEQFLDMFQEIFTTEKNGKKIYEYEIIRKNAMDEAIAVLYNTIENQIRKG KYKEGEEYSALVIDCGGGTTDLAACKYTIKNGRVAYHLDVKTSFENGDENFGGNNLTYRI MQYLKIVLAAKYSGEGVVTIDDLIEYDNDMIYKVIDEHGIEKIFEKIDEEYNKCEKIIPT KFAKFENKMSDEYRKIKNNFYMLWEAAESMKKEFFNRDSRLRTKFDIPRTYEKRNDLHIT QLKAWILHITENGIFKTINEYPNQVFTIKEVEKIVKADIYAMLRKFLNTYYNEGLLFEYS LIKLSGQSTKINTFQEVLKEFVPGKMIEYKELSHRDDYELKLNCLDGSIKYLDYKRFGHM EVDIENEVPLVPYSVWVEKYNGEKERVLQTSRKASILMGYIDKVISAMELKIYLHNAEGE LKKEMVYKNEIEYTEEDAEELIPKFNGVITQDDTDTIQNDTVRFFVYTDMNNWGFFVFPV QRSGDQLYLGKREYFPYEDNLNEISYFDGEH >gi|261748491|gb|ADAD01000034.1| GENE 26 37292 - 38128 991 278 aa, chain - ## HITS:1 COG:DRA0335_1 KEGG:ns NR:ns ## COG: DRA0335_1 COG0515 # Protein_GI_number: 15808039 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Deinococcus radiodurans # 13 229 11 233 459 117 32.0 2e-26 MNELPLKYKLKKKYSIAEYIAESDFSNIYLATYRKKKYVVKECFPAQLVIRDNEYGVFTN KYKTQFGMVVNSFRKEAEILGKFIHKGIVGLEDFFEEKGTVYLILEYCEGKTLKEYILEE DLTEIEIIKIFEEIMEVIEEIHSKDVIHRDIKPSNIIIDKNRNVKIIDFGSAITKEEENG KYVKLTNGYSPMEMYSVKSKNDERTDIYSLSALLYFMLNKQKPMDSVKRFYYPELLYEDT VSENARVFIEKGLEQEMKARYENMQKMKEKFKKLNLTE >gi|261748491|gb|ADAD01000034.1| GENE 27 38140 - 39348 1215 402 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0899 NR:ns ## KEGG: Lebu_0899 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 397 1 389 391 204 38.0 5e-51 MKIGRNKYKTFFFKEYAGMAEKNTYSAYIPSEGKGLWVTVVDSETEENFAKAGVKAVVEK YLENSDFSMENMNTFLNVANKRIVESKISKGKEYSKGFSILAVIAEKNEMIVGNIGKTKF KMFRNNEVFEEVSGDKIKNIILEKNDYILVGSPLFWESINDTEVSDMLISSGSKKELEKL LSEKIEKNEKLRKMSIPFLSVFVEDLEEEKIETVIKTEEKRTNPLKYLLIMLVLLFLFIA IGKTVQNNGYVKKAKEHIALSEEKFKNKRYKESVNEIDAAILYYKKISPQNKKTKEKIDE LQKKKETAEKEEQNLINSVNNIKITVPAVEPPKAPEPVVVEEEVVEKTEVTVADDNVKKK KTVKTKRRENSKNKRQSDLDREIQRNWKALGRDSNGNKKNKN >gi|261748491|gb|ADAD01000034.1| GENE 28 
39373 - 39783 603 136 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0915 NR:ns ## KEGG: Lebu_0915 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 136 116 251 251 122 68.0 4e-27 EKERLEKIAKDIENATNLEIQGDQMFSLKRYTESIQKYSEAKKIFENLKAAGNFNDQTNK IDYLGKKIVKSEGYLYEEQGDTESKNKKWKEAEKKYQMSKDNMELSDVSTEDKKRVEKKL KNATKKANKKWWEFWK Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:37:21 2011 Seq name: gi|261748415|gb|ADAD01000035.1| Leptotrichia goodfellowii F0264 contig00065, whole genome shotgun sequence Length of sequence - 68199 bp Number of predicted genes - 74, with homology - 71 Number of transcription units - 28, operones - 17 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 175 249 ## COG0776 Bacterial nucleoid DNA-binding protein + Prom 308 - 367 12.1 2 2 Op 1 . + CDS 396 - 839 521 ## Lebu_1009 hypothetical protein 3 2 Op 2 . + CDS 856 - 1485 1002 ## COG2323 Predicted membrane protein + Prom 1491 - 1550 12.9 4 3 Op 1 . + CDS 1584 - 2570 1260 ## COG2984 ABC-type uncharacterized transport system, periplasmic component + Prom 2680 - 2739 11.5 5 3 Op 2 . + CDS 2789 - 4255 1812 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen + Term 4457 - 4483 0.3 + Prom 4324 - 4383 10.4 6 4 Op 1 4/0.000 + CDS 4521 - 4715 266 ## PROTEIN SUPPORTED gi|229212474|ref|ZP_04338851.1| LSU ribosomal protein L32P + Term 4723 - 4758 1.1 + Prom 4748 - 4807 9.8 7 4 Op 2 . + CDS 4906 - 5883 1656 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III 8 4 Op 3 . 
+ CDS 5916 - 6320 618 ## Lebu_0618 hypothetical protein
9 4 Op 4 6/0.000 + CDS 6330 - 7232 1465 ## COG0331 (acyl-carrier-protein) S-malonyltransferase
+ Prom 7278 - 7337 9.5
10 4 Op 5 27/0.000 + CDS 7363 - 7587 440 ## COG0236 Acyl carrier protein
+ Term 7604 - 7630 1.0
+ Prom 7614 - 7673 13.2
11 4 Op 6 1/0.250 + CDS 7700 - 8935 2109 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase
+ Term 8955 - 9009 6.0
+ Prom 8947 - 9006 1.9
12 5 Op 1 . + CDS 9029 - 9745 1011 ## COG0571 dsRNA-specific ribonuclease
13 5 Op 2 1/0.250 + CDS 9757 - 10257 347 ## PROTEIN SUPPORTED gi|163764798|ref|ZP_02171851.1| ribosomal protein S19
14 5 Op 3 7/0.000 + CDS 10324 - 11691 1865 ## COG1066 Predicted ATP-dependent serine protease
15 5 Op 4 . + CDS 11688 - 12758 713 ## PROTEIN SUPPORTED gi|163764769|ref|ZP_02171823.1| ribosomal protein L18
16 5 Op 5 . + CDS 12779 - 13696 1626 ## COG3872 Predicted metal-dependent enzyme
17 5 Op 6 . + CDS 13732 - 14223 510 ## Lebu_2038 hypothetical protein
18 5 Op 7 . + CDS 14216 - 14602 346 ## Lebu_2039 hypothetical protein
19 5 Op 8 . + CDS 14592 - 16769 2867 ## COG2217 Cation transport ATPase
20 5 Op 9 . + CDS 16787 - 16981 190 ## gi|262037524|ref|ZP_06010983.1| conserved hypothetical protein
21 5 Op 10 . + CDS 17002 - 17529 443 ## Lebu_2041 hypothetical protein
22 5 Op 11 . + CDS 17548 - 17670 266 ##
+ Prom 18006 - 18065 11.0
23 6 Op 1 2/0.250 + CDS 18124 - 18798 954 ## COG1321 Mn-dependent transcriptional regulator
24 6 Op 2 25/0.000 + CDS 18791 - 19762 1660 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin
25 6 Op 3 42/0.000 + CDS 19802 - 20554 248 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16
26 6 Op 4 10/0.000 + CDS 20559 - 21476 1169 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components
27 6 Op 5 . + CDS 21502 - 22590 1392 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components
+ Term 22595 - 22642 3.1
- Term 22583 - 22630 2.6
28 7 Tu 1 . - CDS 22729 - 22941 423 ## gi|262037553|ref|ZP_06011012.1| conserved hypothetical protein
- Prom 23165 - 23224 17.4
+ Prom 23321 - 23380 3.7
29 8 Op 1 7/0.000 + CDS 23402 - 24616 1090 ## PROTEIN SUPPORTED gi|149002994|ref|ZP_01827905.1| 50S ribosomal protein L33
+ Term 24649 - 24713 -0.9
+ Prom 24649 - 24708 4.1
30 8 Op 2 8/0.000 + CDS 24733 - 25725 1531 ## COG0078 Ornithine carbamoyltransferase
31 8 Op 3 3/0.250 + CDS 25728 - 26663 1359 ## COG0549 Carbamate kinase
32 8 Op 4 . + CDS 26683 - 28080 2208 ## COG1288 Predicted membrane protein
33 8 Op 5 . + CDS 28111 - 29376 1254 ## COG2207 AraC-type DNA-binding domain-containing proteins
34 8 Op 6 . + CDS 29404 - 31131 1969 ## COG0018 Arginyl-tRNA synthetase
35 8 Op 7 1/0.250 + CDS 31175 - 32584 1775 ## COG0531 Amino acid transporters
36 8 Op 8 . + CDS 32603 - 33727 1440 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase
+ Term 33745 - 33794 7.4
- Term 33733 - 33782 7.4
37 9 Op 1 . - CDS 33791 - 34216 623 ## COG2153 Predicted acyltransferase
38 9 Op 2 . - CDS 34238 - 35101 1416 ## COG0501 Zn-dependent protease with chaperone function
- Prom 35238 - 35297 12.1
+ Prom 35223 - 35282 11.8
39 10 Op 1 . + CDS 35317 - 36126 1234 ## COG0501 Zn-dependent protease with chaperone function
40 10 Op 2 . + CDS 36153 - 37142 1457 ## COG0042 tRNA-dihydrouridine synthase
+ Prom 37296 - 37355 10.7
41 11 Tu 1 . + CDS 37397 - 38728 1911 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases
+ Term 38748 - 38792 -1.0
+ Prom 38782 - 38841 11.5
42 12 Op 1 . + CDS 38966 - 39187 130 ## gi|262037495|ref|ZP_06010954.1| putative major facilitator transporter
43 12 Op 2 . + CDS 39203 - 40198 648 ## gi|262037507|ref|ZP_06010966.1| putative membrane protein
+ Term 40259 - 40298 1.5
- Term 40245 - 40288 7.0
44 13 Op 1 . - CDS 40318 - 40512 105 ## gi|262037537|ref|ZP_06010996.1| apicoplast 30S ribosomal protein S4
- Prom 40660 - 40719 6.8
- Term 40582 - 40608 -0.6
45 13 Op 2 . - CDS 40724 - 42214 1752 ## COG0498 Threonine synthase
- Prom 42294 - 42353 7.5
+ Prom 42271 - 42330 9.8
46 14 Op 1 . + CDS 42360 - 42803 431 ## gi|262037534|ref|ZP_06010993.1| putative brix domain containing protein
47 14 Op 2 . + CDS 42691 - 42960 157 ## gi|262037489|ref|ZP_06010948.1| hypothetical protein HMPREF0554_1003
+ Prom 43010 - 43069 8.6
48 15 Tu 1 . + CDS 43089 - 44312 1508 ## Lebu_0228 Sel1 domain protein repeat-containing protein
+ Prom 44331 - 44390 3.5
49 16 Tu 1 . + CDS 44434 - 44541 124 ##
+ Prom 44544 - 44603 7.7
50 17 Op 1 . + CDS 44627 - 45346 1061 ## COG2859 Uncharacterized protein conserved in bacteria
51 17 Op 2 . + CDS 45374 - 45949 959 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A)
+ Term 45968 - 46002 2.2
52 18 Op 1 . + CDS 46306 - 47172 938 ## Lebu_0765 endonuclease/exonuclease/phosphatase
53 18 Op 2 . + CDS 47194 - 48564 1983 ## COG0534 Na+-driven multidrug efflux pump
54 18 Op 3 . + CDS 48579 - 50084 2231 ## Lebu_0763 hypothetical protein
55 18 Op 4 . + CDS 50111 - 51313 1307 ## Lebu_0762 hypothetical protein
+ Term 51364 - 51412 2.0
+ Prom 51365 - 51424 6.5
56 19 Tu 1 . + CDS 51577 - 52317 684 ## bpr_I0857 hypothetical protein
57 20 Tu 1 . - CDS 52533 - 52934 194 ## gi|262037551|ref|ZP_06011010.1| putative NADH dehydrogenase subunit 4
- Prom 52957 - 53016 11.2
+ Prom 52937 - 52996 9.7
58 21 Op 1 . + CDS 53102 - 54562 2019 ## COG1621 Beta-fructosidases (levanase/invertase)
59 21 Op 2 . + CDS 54569 - 55897 1262 ## COG0144 tRNA and rRNA cytosine-C5-methylases
60 21 Op 3 . + CDS 55915 - 57675 2077 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase
61 21 Op 4 1/0.250 + CDS 57700 - 59208 1850 ## COG1288 Predicted membrane protein
62 21 Op 5 . + CDS 59230 - 60666 1950 ## COG2195 Di- and tripeptidases
63 21 Op 6 . + CDS 60698 - 61189 568 ## COG3760 Uncharacterized conserved protein
- Term 61171 - 61221 0.8
64 22 Tu 1 . - CDS 61468 - 62679 1211 ## COG0477 Permeases of the major facilitator superfamily
- Prom 62700 - 62759 7.7
+ Prom 62632 - 62691 6.5
65 23 Op 1 9/0.000 + CDS 62798 - 63460 709 ## COG0673 Predicted dehydrogenases and related proteins
66 23 Op 2 . + CDS 63390 - 63797 534 ## COG0673 Predicted dehydrogenases and related proteins
67 23 Op 3 . + CDS 63877 - 64440 924 ## gi|262037509|ref|ZP_06010968.1| conserved hypothetical protein
+ Prom 64454 - 64513 1.8
68 24 Op 1 . + CDS 64549 - 65304 1131 ## COG1179 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1
69 24 Op 2 . + CDS 65320 - 66186 1041 ## Lebu_1253 hypothetical protein
+ Prom 66223 - 66282 8.7
70 25 Tu 1 . + CDS 66336 - 66740 618 ## Lebu_1139 hypothetical protein
+ Term 66755 - 66798 5.2
- Term 66793 - 66823 -0.4
71 26 Tu 1 . - CDS 66828 - 67112 281 ## spyM18_1747 hypothetical protein
72 27 Op 1 . - CDS 67202 - 67423 226 ##
73 27 Op 2 . - CDS 67424 - 67573 220 ## gi|262037544|ref|ZP_06011003.1| conserved hypothetical protein
- Prom 67597 - 67656 8.3
- Term 67635 - 67675 1.0
74 28 Tu 1 .
- CDS 67704 - 68198 677 ## COG2849 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|261748415|gb|ADAD01000035.1| GENE 1 2 - 175 249 57 aa, chain + ## HITS:1 COG:HI0430 KEGG:ns NR:ns ## COG: HI0430 COG0776 # Protein_GI_number: 16272378 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Haemophilus influenzae # 2 53 84 135 136 58 61.0 3e-09 AKGETVQFVGWGTFGVQKRAARKGRNPQTGKEIKIAAKKVVKFKVGKKLADKVAKGK >gi|261748415|gb|ADAD01000035.1| GENE 2 396 - 839 521 147 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1009 NR:ns ## KEGG: Lebu_1009 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 147 1 147 147 177 64.0 2e-43 MEFYSQGYLQNSKLITDKLIIGAIFVIIIFLLFILTRWFKGSITLKEKQISLLALILVLL FGLSKMSEVQAQNNQEKIYKNTASVIRGLSEKFNVSEDEIYVNTKEITEHTVYKIRDKFY QIHWVDNNILVEEMTVPYVDEIKIFDK >gi|261748415|gb|ADAD01000035.1| GENE 3 856 - 1485 1002 209 aa, chain + ## HITS:1 COG:FN0036 KEGG:ns NR:ns ## COG: FN0036 COG2323 # Protein_GI_number: 19703388 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 5 206 6 208 210 159 43.0 5e-39 MSFIISMAIKLTIGFIALVLFMNLNGKGQLAPLSTGDQIGNYVLGGIIGGVIYNPAITVV QFLMVLLIWGLLMTSINFLKNTSTGVKNIIDGQIIVLAKDGKLITENFAKASMSVTDFYT KLRMKGVYKVSDIEDAFMESNGQLTIIKKEDKFGMVLIAEGKIQENNLMHSGKDNEWLME KLNRKGIENVEDVFLAEISDDDLFIINKE >gi|261748415|gb|ADAD01000035.1| GENE 4 1584 - 2570 1260 328 aa, chain + ## HITS:1 COG:FN2081 KEGG:ns NR:ns ## COG: FN2081 COG2984 # Protein_GI_number: 19705371 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Fusobacterium nucleatum # 36 321 12 297 299 182 38.0 7e-46 MKKIMRGVLLFGMLGGMSVSCGNNENGKVKETQQQEKVYKIGMSQIVDHPALNSAKQGFK DAIEKAGIKVEYDDKVANGEIPTQALIMQQFQADKKDLVYAITTPTSQAAKNKIKDIPVV IAAVTDPKGAGLEGVPNITGTSGAAPINENLELMRQLFPKAKKIGIIYNSSEQNSISELN 
NLKKLAPEKGFEVVDKSITNGTELVAAANILSKQVDIFYAIQDNTISSYFPTLLDILNKA KVPVFATNDVYSDRGGLISQGTTDYDIGYRAGEIAVEILKNGKKPSDIPIETMKKLRIEI NKGNMELLGIKIPEEVLKQAVFTEDKKK >gi|261748415|gb|ADAD01000035.1| GENE 5 2789 - 4255 1812 488 aa, chain + ## HITS:1 COG:MA2369 KEGG:ns NR:ns ## COG: MA2369 COG2865 # Protein_GI_number: 20091201 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Methanosarcina acetivorans str.C2A # 2 374 5 369 510 192 32.0 1e-48 MLVEKLSEGESKTLEYKESVPENSQKYMKTVVAFANGNGGKIVFGVKDKNYEITGIESEN IFELMDSITNAVSDSCEPLIIPDITLQTINDKTVIVMEIDAGKQTPYYIKSSGIKKGTYV RVAGTTRLADDYTIKDLLFEGANRSYDQAPTDIEITEEEINNFCSKLKEIASKNSNSKSE IKDVTKNILLSWGILTEKNNKIIPSNAYILLAENIENHIKIQCGVFKGKTRAIFVDRREF SGSIINQLEEAYKYVLSKINLGAEINGLYRKDVYEFPVESIREIIANAVVHRNYLEPNDI QIALYDDRLEVTSPGGLPKGVTIDKIKIGYSKVRNRAIANVFAYLEIIEKWGSGIPRIFE EFKKYGLREPELIDFEGDFRINLYRNTGKEQDSKETDIINLTNKETNKANITNKAMELSD TDKNIIESIMKNPFITQNEIASDLNLPMSKVKYYIMKLREKNVIKRNGGRQKGEWEINEE NIEYNEKN >gi|261748415|gb|ADAD01000035.1| GENE 6 4521 - 4715 266 64 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212474|ref|ZP_04338851.1| LSU ribosomal protein L32P [Leptotrichia buccalis DSM 1135] # 1 57 1 57 63 107 87 2e-22 MAVPKKRTSKAKRNMRRAHDSIKAPNIIVEADGTVRRPHRLNLETGVYRGRQVLSTESPA ADSE >gi|261748415|gb|ADAD01000035.1| GENE 7 4906 - 5883 1656 325 aa, chain + ## HITS:1 COG:FN0148 KEGG:ns NR:ns ## COG: FN0148 COG0332 # Protein_GI_number: 19703493 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Fusobacterium nucleatum # 3 324 4 327 328 353 53.0 2e-97 MRVGILGTGSYLPEKVMTNDDLSKFVDTNDEWIRTRTGIRERRIAAENEATSDLAYKAAE KAIENAKIDKNEIDLVIVATMSPDHITPATAAIVQDKLGINAAAFDLSAACTGFVYAFTT GYSFVKSGIYKKVLVIGAETMSRILDWEDRTTCVLFGDGAGAVVLGQIENGGYIASHLVS DGSGAEDVIIPAGGSRNPVSKEEIDKREIYFKMKGSDVFKFAVRAFPETVENVLAQGNTT ADDVDMFIPHQANIRIIESIAKRFKQPLDKFYVNLQRYGNTSGASIPLALDEANKEGKLK 
KGDKVVMVGFGGGLTYGSILLEWSI >gi|261748415|gb|ADAD01000035.1| GENE 8 5916 - 6320 618 134 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0618 NR:ns ## KEGG: Lebu_0618 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 134 1 120 121 69 41.0 4e-11 MNKKLLWGIIIILLSIFYGIQALFPEFLSESVMKYIFNYQVILMLIGLFFILKKSKFGWF MFAMGLYLYLQAFLGDYFQKGLPLLTLIGGIVLVTLGAKEIKGDKPKKERKYSKPAEKVE KEQKEEIIDAEEIK >gi|261748415|gb|ADAD01000035.1| GENE 9 6330 - 7232 1465 300 aa, chain + ## HITS:1 COG:FN0149 KEGG:ns NR:ns ## COG: FN0149 COG0331 # Protein_GI_number: 19703494 # Func_class: I Lipid transport and metabolism # Function: (acyl-carrier-protein) S-malonyltransferase # Organism: Fusobacterium nucleatum # 1 298 1 298 299 318 59.0 6e-87 MSKTAFVFPGQGTQYPGMIKKLYDESDDNTKKLIDEIFENIKDENVKKVLFEGSEEELKD TKYAQPAIALLSVIFTKLLKEKGINPDYVAGHSLGEYSSLYAAGVLNEKETLKLIAARGN IMSNAHVDGTMAAILGLPASEVEKICSEINGVIEAVNYNEPKQTVIAGEKEVIEKNSEIF KEKGAKRVIPLAVSGPFHSSLMKPVAEQLKEEFEKYSWKDLKVPVVANTTANILNSSDEI KEELYRQTFGPVKWVDTINKLSENGVTKIYEVGPGKVLAGLIKKINKEIEVINIENIENI >gi|261748415|gb|ADAD01000035.1| GENE 10 7363 - 7587 440 74 aa, chain + ## HITS:1 COG:FN0150 KEGG:ns NR:ns ## COG: FN0150 COG0236 # Protein_GI_number: 19703495 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Fusobacterium nucleatum # 1 74 1 74 75 76 67.0 1e-14 MLDKIKSIVAEQLGVDEDQVTEDASFIDDLGADSLDTVELIMAFEEEFDIEIPDEDAQKI KTVRDVMDYIESKQ >gi|261748415|gb|ADAD01000035.1| GENE 11 7700 - 8935 2109 411 aa, chain + ## HITS:1 COG:BS_yjaY KEGG:ns NR:ns ## COG: BS_yjaY COG0304 # Protein_GI_number: 16078199 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Bacillus subtilis # 2 410 4 412 413 481 60.0 1e-135 MKRVVITGIGLITPLGTGKDKTWKRLLDGECGIEKITAFDTTEYPVHIAGEVRDFNPEDY 
IEKKELKKIGRFSQFAIAASKEALEDAKLEITPENADRVGMIIGSGIGGLEVIEQEIGKL VEKGPKKVSPFYIPAAIANMAAGNASIYLGAKGPNKSVVTACASGTNSIGDAFQTILLGK ADVIIAGGTEGTVTPSGIAGFGNLKALSTNPDPKKASRPFTVDRDGFVLGEGAGILVLEE LEHAKKRGAKIYAEVVGYGETGDAFHMTAPSDGGEGAARAIKMALEQGNVKPEEVGYINA HGTSTPANDKNETKAIKAAFGDHAYKLAVSSTKGATGHLLGGAGGVEAAFLAMAIDEGVL PPTINQDNPDPECDLYYVPNKAEKREIEVGLSNSLGFGGHNAVIAFRKYKG >gi|261748415|gb|ADAD01000035.1| GENE 12 9029 - 9745 1011 238 aa, chain + ## HITS:1 COG:FN0152 KEGG:ns NR:ns ## COG: FN0152 COG0571 # Protein_GI_number: 19703497 # Func_class: K Transcription # Function: dsRNA-specific ribonuclease # Organism: Fusobacterium nucleatum # 3 223 1 220 234 179 46.0 4e-45 MKIKNVEELMEKIDYKFRNESYLKEALTHRSFSNEHEKSKNFDNEKLEFLGDAVLNLITT EYIYNSGKGKNEGELAKLKSQIISEPVFSAIAAEIGLGDYLYLSNGEESSGGRKRKSILG DAFEALIGAVFLDSDYYTAKNTALKFLPDKINNLEDIEGIIDYKTVLQEVFQSKYKKMPE YEILDTKGPDHNKVFEISVKLNNKIIGIGRGKSKKEAEKRAAKEAIEFIENKKRIQKI >gi|261748415|gb|ADAD01000035.1| GENE 13 9757 - 10257 347 166 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 [Bacillus selenitireducens MLS10] # 3 163 4 161 164 138 44 9e-32 MIKTALYPGSFDPITSGHVDIIKRSANLFDKLIIGIFKNSSKTKAWFSDEEKVEMIKEVL KNENINAEVKIFNGLLVDFISKEKVDILIRGLRALSDYEYELQFTLTNKTLAKSEFETIF LSASRKYLYLSSSLVKEIAQNYGDLRTFVPENVEKKLIEKVKQMEL >gi|261748415|gb|ADAD01000035.1| GENE 14 10324 - 11691 1865 455 aa, chain + ## HITS:1 COG:FN0157 KEGG:ns NR:ns ## COG: FN0157 COG1066 # Protein_GI_number: 19703502 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent serine protease # Organism: Fusobacterium nucleatum # 12 453 7 448 452 468 56.0 1e-131 MAAAKSSSKVKYICSECGYTSQKWLGKCPNCDSWGTFEEEIDIKKAFKNIESKEVSISKI SEIEVEKEFRMVTQYEEFDRVLGGGLIKGEVVLITGSPGIGKSTFLLQLSEEYSKIGNVF YVSGEESPRQIKQRAERINVNSGNLYILNDTNIEKIESVILNDKPKVVVIDSIQTLYSEN VNSVPGSVTQIRETTLKLIEIAKKNEISFYIVGHVTKDGKLAGPKLLEHMVDAVLQIEGE ESNYYRIIRSIKNRYGSTNEISIFDMKENGISEVKNPSEFFISDREEKNIGSIIAPIFEG 
SRVFLFEIQSLLSTPNFGIPRRTVEGYDKNRVEILSAVLSRSLNIDVNSKDIYINIPGGI ELKDRSSDLAVVFSLLSSVNKTPVSQKIAAIGELGLRGEVRKVSFIKNRINELEKLGFTG VYLPKSNKADLAKEKTKIKLNYISNISELVERMRI >gi|261748415|gb|ADAD01000035.1| GENE 15 11688 - 12758 713 356 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764769|ref|ZP_02171823.1| ribosomal protein L18 [Bacillus selenitireducens MLS10] # 14 356 16 359 360 279 40 3e-74 MKKKETKKKILEYVFSILAPGTPLRAAIDRIQEASLGAIVVLGNPADLESIMRGGFKLNT PYTPQKLYELSKMDGGIILSEDIETIYGANIQLQPNSNIKTDESGTRHQAAHRVAKQTGK LVITVSERRNKITIYKGEFRYTLHDIGDLLVKSSQAIMALEKYAVAINRNLINLTVSEFD NMVTLYDIVEVIRMYGLLFRMSEELIEYISELGTEGRLVKIQYQEIMLNQNEEFTDLIRD YKKNEEKVEKVIENIRMLNKEELLEDENIANILGFNLKNISLDEIINSRGYRILGTVNKV TKKDIEMLVSEFKEIQSILIATVEEMTQIKGISKLKAEHINKALIRLKNRVMLDRY >gi|261748415|gb|ADAD01000035.1| GENE 16 12779 - 13696 1626 305 aa, chain + ## HITS:1 COG:FN1092 KEGG:ns NR:ns ## COG: FN1092 COG3872 # Protein_GI_number: 19704427 # Func_class: R General function prediction only # Function: Predicted metal-dependent enzyme # Organism: Fusobacterium nucleatum # 4 304 7 304 304 344 56.0 1e-94 MEEKVVVGGQAVVEGVMMRGPKAIATAVRKHDGSIVYKKTEITEKANKWFKVPFIRGVLA LYDAMVVGTKELIFASNQAGLEEEQMTDKQVTFTVATSILLGIGIFMVLPSYVGGLIFKE KTVMANLLEALVKLVLFLGYIWGISFFKDIKRVFEYHGAEHKSIINYEEGTELTPANAKK CTRFHPRCGTSFLLLVMFISILVFSVVDLIFGVTKDNSGMIVFLLYKLVTRVLFVPVVAG ISYELQRWTSYHLNNGIAKMIATPGMWLQKITTSEPDESQLEVAIVALNVALGREVTNAV EVFEK >gi|261748415|gb|ADAD01000035.1| GENE 17 13732 - 14223 510 163 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2038 NR:ns ## KEGG: Lebu_2038 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 161 1 161 161 191 60.0 9e-48 MLPDFYGVIEVKHYQKGRLRLQTQVLKENFELKKEFLENINRIEGIVSADVNSVIGSILI LFDENKIESSFLYLVILKILHLEEEAFKSKPTKIKIFLKNVFEVLDLSVYNKSKGLLDIK TIVAGLLAYYGIESLRGVKSVPSGISLLWWAYILVTEGKNENV >gi|261748415|gb|ADAD01000035.1| GENE 18 14216 - 14602 346 128 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2039 NR:ns ## KEGG: Lebu_2039 # Name: not_defined # Def: hypothetical protein # Organism: 
L.buccalis # Pathway: not_defined # 1 122 1 122 123 142 57.0 3e-33 MFNNLIKSAYLTFNKVKIVHSIPGRLRLLIPGLSDVPEDFKKYEHYITDCILSKKGIKSV EYCFKTSKALIYYNSEILSAEQITDWLNKIWKTVMNHSELYEGKSLEEIEENIDVIYDIL KKGKNDVK >gi|261748415|gb|ADAD01000035.1| GENE 19 14592 - 16769 2867 725 aa, chain + ## HITS:1 COG:FN1190 KEGG:ns NR:ns ## COG: FN1190 COG2217 # Protein_GI_number: 19704525 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Fusobacterium nucleatum # 1 723 1 732 735 912 69.0 0 MSNNNYLLDCEVIHKIPGRIRIKSNALKYLGSLKNEIEDQIKVLKSVKEAVISDITGTIV VKFKNDDLSEENLLSLLQNILNGYLVEIHKNEKKMKPDKYVIERKLQEESPKEIMRKIIA SAVLLLMPGPKGELTGMRRLFNYKTLSTISLALPVLKNGINSIVKNKRPNADTLSSTAIA SSIILGNEKTALTIMILERFAELLTVYTMKKTRGVIKDMLSVGESYVWKQNEDGTVKKVP IEEISKGDSIVVQTGEKISVDGKIIKGNALIDQSSITGEYMPVSKKTGEEVFAGTLIKSG NITVEAQKVGDDRTVSRIIKLVEDASFNKADIQSYADTFSAQLIPLNFLLAAIVYASTRN MQKALSMLVIDYSCGIRLSTATAFSASINTAAKNGILVKGSNYLEELSKSDTVIFDKTGT ITEGKPKVQTLQIFGKNIKEERMLSLAAAAEETSSHPLAVAILNEMKERGLNIPKHKETV IKVSRGMETTVGKDVIRVGSRRYMEESDVELLDSVDAAKRMLNRGEIIIYVARNKNLIGI IGVSDPPRENIKKAMNRLRNQGIDDIVLLTGDLRQQAETIASKMSMDRYESELLPEDKAK DILKFQSIGSKVIMIGDGINDAPALSYANVGIALGSTKTDIAMEAADITITSDDPLLIPG VIGLAKNTMKVIKENFAMAIGVNSFALVLGATGILPAIYSSILHNSITILVVGNSLRLLK YNVNK >gi|261748415|gb|ADAD01000035.1| GENE 20 16787 - 16981 190 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037524|ref|ZP_06010983.1| ## NR: gi|262037524|ref|ZP_06010983.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 64 1 64 64 93 100.0 6e-18 MKKLSLTILHRLPNRIRFKVYPRIRNFKVFEEFLNVDNNMEIRYNTKINTLLVKFDPKKF IYRK >gi|261748415|gb|ADAD01000035.1| GENE 21 17002 - 17529 443 175 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2041 NR:ns ## KEGG: Lebu_2041 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 175 74 248 248 218 62.0 9e-56 MSIQHGMMPVKLIEDYEEKSIDSLSLYSGAAISLSLLHNLIKKNTGNLQTNINHFTAILT 
TGAIVEHAYRETKQKGFFDIEILPALYLLKSYLNSNSIYSLLLMWLTTFGRHLILNNISN KEIRVFRIKDENGKYQYIADIREDNSIGSISDLVYRIFFNNIRSNRSDEKYITLQ >gi|261748415|gb|ADAD01000035.1| GENE 22 17548 - 17670 266 40 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLNLGNGVKKEHLVGAVVGIGAVAAGYYLYKKNQNKVDNF >gi|261748415|gb|ADAD01000035.1| GENE 23 18124 - 18798 954 224 aa, chain + ## HITS:1 COG:SPy0450 KEGG:ns NR:ns ## COG: SPy0450 COG1321 # Protein_GI_number: 15674572 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Streptococcus pyogenes M1 GAS # 1 185 1 187 215 109 36.0 4e-24 MSKSIEDYLKGIYTLKKNKQYSNKKLAEYLNISPASVSEMIKKLVNDDYLKVNGKSVTLT EKGNDFALNVIRKHRVWEVFLYEKLGYGKDEVHPEAEALEHVTSDKLLKKLEKFLFYPKE CPHGSPIFYGIKKFDEENIIKLSEAEEDDEIIILRVEDNIELYDYLRELNISIKEIYKIE RKDPFDGPIYLSSKEKNIKAVAYNAAGMIEVYKKNQNTEETDDD >gi|261748415|gb|ADAD01000035.1| GENE 24 18791 - 19762 1660 323 aa, chain + ## HITS:1 COG:FN0668 KEGG:ns NR:ns ## COG: FN0668 COG0803 # Protein_GI_number: 19704003 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Fusobacterium nucleatum # 6 323 4 312 312 358 64.0 6e-99 MTKKVNLKKSFKFLIMFGIMMIGLLACGKKEGDGEKKDEMTKTETGKKIKVTTTTTMLVD LIKTIGGDKVEVTGLMGEGVDPHLYTASAGDIDKLSNADLVVYGGLHLEGKMVEIFEKLT SQGKVVLNVGDALDKSKIAYVEGNTPDPHVWFDTELWAKEADAVEAELSKMDPSNSSYYK GNLDKYKADLTELTNYVKAKINEIPEKSRVLVTAHDAFGYFAKQFGLQVKAIQGVSTDSE TGSKNISDLANFIVENDIKSIFIESSVPKKSIEALQEAVKAKGKEVKIGGELYSDSLGDE AHNTETYIKTVKANADTISNALK >gi|261748415|gb|ADAD01000035.1| GENE 25 19802 - 20554 248 250 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 1 237 1 237 318 100 27 3e-20 MAEKDITKEIIIKVEDLTVAYEDKPVLWDIELEVKKGVLMAVVGPNGAGKSTLIKAMLNL LKPATGEVHFYGEKYNKVRSKIAYVPQRGSVDWDFPTTVFDVVEMGRYGKVGWLKRIRKI DKEKTEEAIRQVEMEEFKSRQISQLSGGQQQRVFLARALVQDAEIYFMDEPFQGVDSKTE KSIVNILKKLRDDGKTVIVVHHDLQTVKNYFDYVTFINVSVVASGPVNEVFTQENIEKTY KSKFLSEKEA 
>gi|261748415|gb|ADAD01000035.1| GENE 26 20559 - 21476 1169 305 aa, chain + ## HITS:1 COG:FN0670 KEGG:ns NR:ns ## COG: FN0670 COG1108 # Protein_GI_number: 19704005 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Fusobacterium nucleatum # 3 282 4 282 305 248 56.0 1e-65 MNIISLLVSDYTFRTIAMGCMLLGIVSGVLGCFAILRKQSLLGDAVSHASLPGVCLAFMI THIKNTEILLLGALFIGIVCIGLIYLIQNYTKIKFDSALAFILSVFFGLGLVLLSYLNKL PGANKSGLNRFIFGQASTFVERDVKLMFYTGLLLLAIIILFWKEFKIVSFDAEFARTLGF PSKKIGISISVLIVVTVIIGIQAAGVILISAMIISPAVAARQWTDKLSVMVILSGIFGGF AGLTGTLVSITESNLPTGPVIVLIISLIVIFSILFSPKRGIIFKFSMNRKKEKCLVKNLK IKNPN >gi|261748415|gb|ADAD01000035.1| GENE 27 21502 - 22590 1392 362 aa, chain + ## HITS:1 COG:FN0671 KEGG:ns NR:ns ## COG: FN0671 COG1108 # Protein_GI_number: 19704006 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Fusobacterium nucleatum # 1 271 1 271 280 264 60.0 2e-70 MSAGLTIQLIAICVAGSCSILGTFLVLKSMAMVSDAITHTILLGIVIAFFMVHDLSSPFL IVGAGIVGVLTVYLVELLNSTRLVKEDSAIGVVFPLLFSIAVILISKYASNVHLDVDSVL LGELAFAPFNTAKIFGVTVVKGLVTTFGIFLINFLFVVIFFKELKISTFDRVLAATLGMK PILIHYMLMSLVSMTSVASFEAVGSILVVAFMIGPPITAYLLTNNLKMMMFLSVVLGAIS SVIGFFLASAWDVSIAGMIAVVIGAIFIVVLIISPKSGLVSTFRRKKNQKIEFSTKMLLI HILNHKDTAEEKEECGIDTMEYHLRWEKSFFEKISEKAKDRKLIYVDNGIFKISDKGREY IR >gi|261748415|gb|ADAD01000035.1| GENE 28 22729 - 22941 423 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037553|ref|ZP_06011012.1| ## NR: gi|262037553|ref|ZP_06011012.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 70 1 70 70 130 100.0 3e-29 MSLDPKTILSPKGKVENIEIIEQNTDYTIAILSWEGKDTIAARWNPTEENTMGIPQSRGY ATWFNFPILS >gi|261748415|gb|ADAD01000035.1| GENE 29 23402 - 24616 1090 404 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149002994|ref|ZP_01827905.1| 50S ribosomal protein L33 [Streptococcus pneumoniae 
SP14-BS69] # 3 404 6 408 409 424 54 1e-118 MAINVKSEIKPLKKVLLHRPGKELLNLTPDTLERLLFDDIPFLKVAQQEHDAFAQILKEN GVEVVYLEDLAAETISQCPDLKKQFLEQFITEGGVLVPEYREALMKLFESYTDNKELIMK TMEGVKYGELKLNSDSLISKLIDENELILDPMPNLYFTRDPFASIGHGVSLNKMYSVTRN RETIYADYIFKHHNDYKGKVPYFYERDNQFHIEGGDILNLNDKVLAIGISQRTQAAAIDQ IAKNIFDSSDSSIKTILAFKIPETRAFMHLDTVFTQIDHDKFTIHPNIMGPLEVYELTKN GNKIEVKLLQDTLKSILEKYLGVPNVTLIKCGGEDRIAAEREQWNDGSNTLCIAPGVIVV YERNDVTNELLRRAGLKVLEMPSAELSRGRGGPRCMSMPLVRED >gi|261748415|gb|ADAD01000035.1| GENE 30 24733 - 25725 1531 330 aa, chain + ## HITS:1 COG:BB0842 KEGG:ns NR:ns ## COG: BB0842 COG0078 # Protein_GI_number: 15595187 # Func_class: E Amino acid transport and metabolism # Function: Ornithine carbamoyltransferase # Organism: Borrelia burgdorferi # 1 329 1 327 328 385 59.0 1e-107 MPKNLAGRNYLKLLDFTSSDIRYLLDLSKNFKELKLTRTPHKYLEGKNIVLLFEKTSTRT RCSFEVAGMDLGMGVTYLDPGSSQMGKKESIEDTARVLGRMYDGIEYRGFDQTIVEELAA NAGVPVWNGLTNEFHPTQMLADLLTIEENFGYLKGINFVYMGDTRNNMGNSLLIACAKMG LNFTACGPKDLKPTPELIAQAEAIAKESGSKIRFTEDVKEACTDADVIYTDIWVSMGEPD EVWEQRIKELKKFQVNKEAMSYAKDTAIFLHCLPSFHDLKTTIGKEINDKFGLPEMEVTD EVFESPQSKVFDQAENRMHTIKAVMYATLK >gi|261748415|gb|ADAD01000035.1| GENE 31 25728 - 26663 1359 311 aa, chain + ## HITS:1 COG:SA1013 KEGG:ns NR:ns ## COG: SA1013 COG0549 # Protein_GI_number: 15926753 # Func_class: E Amino acid transport and metabolism # Function: Carbamate kinase # Organism: Staphylococcus aureus N315 # 4 310 3 309 310 374 61.0 1e-103 MAKRVVVALGGNALGNSPEEQLQLVKNTAKSIVDMIKEGYEVIIGHGNGPQVGMINLAMD YAANGEVKTPYMPFAECGAMSQGYIGYHLQQAIREELQAQNIKKGVVTLVTQVVVDKNDP AFTNPTKPIGMFYSKEEADKIAAEKGFTFVEDAGRGYRRVVPSPLPKKIVELEEVETLVN NGTVVITVGGGGIPVIEENGHYKGVDSVIDKDKSSSKLAADLKADMLVILTAVDKVYINY NKPDQKELDVINTEEVKKYIDEGHFAKGSMLPKIEACVEYVKNNKNGQAIITSLQNAGSA LAGNTGTVIKF >gi|261748415|gb|ADAD01000035.1| GENE 32 26683 - 28080 2208 465 aa, chain + ## HITS:1 COG:SP2152 KEGG:ns NR:ns ## COG: SP2152 COG1288 # Protein_GI_number: 15901964 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: 
Streptococcus pneumoniae TIGR4 # 4 457 6 494 503 305 40.0 2e-82 MAKKKGIQLSAFSILFLIIFALAILTYLFPQVQDASLATVVMAPFNGFKEAIDVCIFILL LGGFLGVVTKTGALDAGIGALVKKLKGNELVLIVILMILFSIGGTTYGMAEETVAFYVLI CSTMVAAGFDTTVGVATIMLGAGVGVLGSTVNPFAVGVALDALKEKGIQYNSATVIVLGA ILWLTALLMAIFYVLRYAKKVKAEKGSTILSLQEQNDMHEHFGHANFDNIEFTGKTRATL IVFGITFIIMVISLIPWENFGINIFVGWSSFLTGTPLGQWYFGELSMWFLIAAIVIGVIN GFKESEIIDSFMDGVKDILSVVLIIAVARGASVLMAATKFDTFILESASTALKGLSPVIF APASYVLFLVLSFLIPSTSGLASVAFPVLGPLTSTLGYSAEVMVLIFSAASGVINLITPT SGVVMGGLAIGKMQYGTWLKFVGKLIAAIIIADVVILTAAMMFIR >gi|261748415|gb|ADAD01000035.1| GENE 33 28111 - 29376 1254 421 aa, chain + ## HITS:1 COG:PA2337 KEGG:ns NR:ns ## COG: PA2337 COG2207 # Protein_GI_number: 15597533 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 307 417 181 291 301 76 31.0 1e-13 MFNELDLQFQSSYKHYKLYSFDIEYVENPVSPTISSVSKLFFFSKGTGKIKVYEREYEII PNTLLAVFPWEIVEITKVNESLEIMEISYNSIFNSLILNVLNRYKKNKTHIFSLIKEIPF VYFDDKEAKIIKNIFYEIKNEIEIESFSDFDKISDKKKFDEENDFDLSKVYITNKLTELV ILIAKKIISEKDNLPKKKEIIENNIIKYIYAHLNEKLTLNKLSVMFFMSESSITKYLEEV TGFTFNELLRHMRISKALNLLVYTDLNIEEIAYSVGFVDGAHISKLCNKYLQMKPNAYRE YYKNIYRIFNEKEQKLVYEIVNYIQNNYTQEITINDVTNKFNITDTKLNKILMSYSGKRF IDFLNFLRINKACELLVTTDKSIIDISFELGYNTVKTFNNNFFKLKNITPTDLRKNVKAE I >gi|261748415|gb|ADAD01000035.1| GENE 34 29404 - 31131 1969 575 aa, chain + ## HITS:1 COG:FN0506 KEGG:ns NR:ns ## COG: FN0506 COG0018 # Protein_GI_number: 19703841 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Arginyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 575 1 569 569 523 49.0 1e-148 MKPIIIKVEDILNKNIEKIFGKNSQKVEIQYSSKKEFADFQTNFALINSKIIEKNPREIA AEIADKFEDNDIIDKFEITDPGFLNIYIKNSKIIEELKKIGEGKYIFPIDTSKKTIIDYS SPNIAKRMHIGHLRSTIIGDSLKRILEYVGFKVLGDNHLGDWGSHFGKLIVGCNKWLDEE EYKKNPIGELERIYIKFSDEAQLNPELEVSAKEEWEKLKARDKENVKLWKNFVKCSLKEF EKVYNILNVKFDMYNGESFYAEMIPEILESLKNKKIAVLDDGVLTVYFAKEEKITPCILQ RKDGNYISYPSTDLATIKYRKDVLNIENAIYVMDERPKEHFRQVFKISEMAGKEYAYEKT 
HVWFGIMRFEDDIISSATEGNNIRLINILEQAVKEAEKIIDLRNPDLSEKEKKEIGKIVG IGAVKYFDLSQNRMTPISFSWDKVLNFEENTSPYLQYTYVRIMSIFRKIKENNIDYNKNY RQISENFNGEERELATALLKFPYSVMKACEAYKPNLIADYLFETAKIFNTFYATESILKE EDKSRFYMKLLLAEKTAYIIKEGLQLLGIDVVERM >gi|261748415|gb|ADAD01000035.1| GENE 35 31175 - 32584 1775 469 aa, chain + ## HITS:1 COG:L94890 KEGG:ns NR:ns ## COG: L94890 COG0531 # Protein_GI_number: 15674017 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Lactococcus lactis # 5 468 6 485 490 358 42.0 2e-98 MPNKQKRLGIFALTALVISSSLGSGIFGISSDMASSAGPGAALLAWVIVGFGVLMLCLSL TNLNEKRPDLSGIFSYAETGFGPFGGFISGWGYWLSSWLGNVAFATMMMSAVAYFVPAYG QGNNLISILSASVILWLMFYLVNRGVESAAILNAVITVCKLVPLVLFIVIAVISFKADVF SANFWGTVSGNFEFSQVFSQIQKSMMVLMWVFVGIEGAAMMSDRAEKKSIVGKSTILGLT GLLIVYILASILPYGLMTREELSKLGQPAMGYILQALVGTWGAALVNIALIISIFGCWLS WTMLPAETTLLMAKRNLLPKKFGELNEKKAPTFSLLFMTVLTQAFVFTLLFTDKAYSFAY SLCTAAIFVSWLFVTLYQVIYSYENKEWSQLFIGLFGSVFYLWAIWASGIGYFLLCLTVY ILGIYLYAKARKENGISKVFSQKESIVIIFIVIGALLSIYMLFTGQIAV >gi|261748415|gb|ADAD01000035.1| GENE 36 32603 - 33727 1440 374 aa, chain + ## HITS:1 COG:L91456 KEGG:ns NR:ns ## COG: L91456 COG0436 # Protein_GI_number: 15674014 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Lactococcus lactis # 1 371 1 374 376 463 58.0 1e-130 MKIANFGVEEWLNVWENDAVYDIAGSSIASFTLEEIIKIGGKTQEEFFSDLLNKKMNYGW IEGSPEFKKQVSMLYKNISPEQILQTNGATGANFLALYSLIEPGDHVISLYPTYQQLYDI PRSLGAEVDLWHIREEKNWLPDLDELHKMIRPDTKMICINNANNPTGAVMEEEFLKELVE IAKSCGAYILSDEVYKPLEEGLYIPAIADIYDKGISANSLSKTYSIPGIRIGWTAANPEI TDIFRKYRDYTMICMGVFDDYLAAYVLKNKEAVLERNRKIVSENLNIVKEWVKNEPRVSL IFPRHVSTSFIKLDIPLEIEEFCIKLLKEKGVLLVPGNRFDMPGYARLGYCTHTETLKNG LKALSEYLREFDNH >gi|261748415|gb|ADAD01000035.1| GENE 37 33791 - 34216 623 141 aa, chain - ## HITS:1 COG:SPAC11D3.02c KEGG:ns NR:ns ## COG: SPAC11D3.02c COG2153 # Protein_GI_number: 19113711 # Func_class: R General function prediction only # Function: Predicted acyltransferase # 
Organism: Schizosaccharomyces pombe # 1 141 4 149 150 120 45.0 7e-28 MAWICKEFNGLDVNELFLIYKERVAVFVVEQECPYQEVDDNDLIATHLFKIDNNKITAYC RIIPENDGIHLGRVLVAKNERKSGAGRELVSEALKVIKEKWKDMPIHAQAYLENFYNSFG FKAVSDVYLEDNIPHLDMILK >gi|261748415|gb|ADAD01000035.1| GENE 38 34238 - 35101 1416 287 aa, chain - ## HITS:1 COG:AGc5123 KEGG:ns NR:ns ## COG: AGc5123 COG0501 # Protein_GI_number: 15890072 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 283 2 280 321 273 50.0 3e-73 MFENSLKTGLLMTGLVALFVAIGSLFGREGALLGLLFAGGMSFFSYWNSDKMVLSSYRAQ EVTEINNPRLYRMVQNLAKNAGLPMPKVYIVPEQQPNAFATGRDPQHAAVACTQGLLEIM NDEELSGVLGHELGHVKHRDILISTIAATFAGAISNIARFLPYYGAGTRRRRNDDNGGTA ALAMLLSLLAPIGALIIQMSISRKREFMADRAGAEFSGNPLYLRNALVKLETYSQRIQMN NRNPAYSHMFIVNPLAGLSGLANLFRTHPSTEERISELEKMARNKGY >gi|261748415|gb|ADAD01000035.1| GENE 39 35317 - 36126 1234 269 aa, chain + ## HITS:1 COG:SPAP14E8.04 KEGG:ns NR:ns ## COG: SPAP14E8.04 COG0501 # Protein_GI_number: 19114452 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Schizosaccharomyces pombe # 4 263 70 312 337 120 32.0 4e-27 MKKIFKILMILTGVTALVSCTTAPLTGRKQLKLVSDDSLVAESKQQYSEFISKLRSKNMI ANDTPDGRRLAAIGRRISTSVEKFMNENGMSNKVKDLSWEFNLIKSEDINAFALPGGKIA FYTGIMPVLKTDAGVAFVMGHEIGHVIGGHHAESQSGRTAAGLIMLGKEVADALTGGATS VVNNDLVGQGLSVGLLKFSRTQEYEADKYGMIFMAMAGYNPEEAIKAQERTMAMEKGRNV EILSTHPSTEKRIEALKNFLPEAMKYYKK >gi|261748415|gb|ADAD01000035.1| GENE 40 36153 - 37142 1457 329 aa, chain + ## HITS:1 COG:sll0926 KEGG:ns NR:ns ## COG: sll0926 COG0042 # Protein_GI_number: 16329480 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-dihydrouridine synthase # Organism: Synechocystis # 3 308 11 306 334 298 47.0 7e-81 MKNIISIAPMVDRTDRNFRNFVRMINKDVLLYTEMITAQAVINGDSDYILEFDDTEHPIV 
LQLAVTNKKEAYEAIKIAEKYDYDEINLNVGCPSDRVSGNMMGAYLMAFPETVAEIVAGI KKATDKPVSIKHRIGIDGKNILPETFKRTLLDKYEDMLNFVNITQKAGVKKYVVHARIAV LEGLDPKQNREIPPLRYEEVYRLKKERPELHIEINGGIKTTEQIDEHLKYADSVMLGREI YDNPMILSKFGKYYGKNIEIAREEIIEKMILYVEKMERENKRPHLFLMHTHGLFHGVKGS KYWKREINAPQADSKTLRRLLKEVKDFSE >gi|261748415|gb|ADAD01000035.1| GENE 41 37397 - 38728 1911 443 aa, chain + ## HITS:1 COG:MYPU_0230 KEGG:ns NR:ns ## COG: MYPU_0230 COG0446 # Protein_GI_number: 15828494 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Mycoplasma pulmonis # 1 434 23 465 478 373 44.0 1e-103 MKVVVIGCTHAGTAAILNLKRINPDVEITVFERNDNISFLSCGIALYVGGVVKDPQGLFY CSPEKLKELNVDTRMKHDVKNVDIKGKKVRVVNMETGIEFNETFDKLIITSGSWPIIPPI KGIDLNNILLSKNFNHSNEIIERAKHSKKIIVVGAGYIGVELVEAFRDNGKEVVLVDAEE RILSKYLDKEYTDIAEESFRQKGIVIATGEKVVRFEGKNGNVTKVVTDKNEYETDMVIMC VGFVPNTQLFKGQLDMLPNGAIKVDEYMRTSDYDVMAAGDCCSVYYNPLKTYRYIPLATN AVRMGTLAALNLSENKIKHPGTQGTSGIKIYENNMSSTGLTYETAKSEGLDVDFVYAVDN YRPEFMPTYEKVTFKVVYEKSSRRIVGAQLTSKADLTQSINTISVCIQNGMTVEELAFID FFFQPHYNKPWNFINLAGLNSLK >gi|261748415|gb|ADAD01000035.1| GENE 42 38966 - 39187 130 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037495|ref|ZP_06010954.1| ## NR: gi|262037495|ref|ZP_06010954.1| putative major facilitator transporter [Leptotrichia goodfellowii F0264] putative major facilitator transporter [Leptotrichia goodfellowii F0264] # 7 73 1 67 67 103 98.0 4e-21 MKSRNLLKNMKNKVIIYYISQILLFTATSLFNPVSYLFYMDNGLNLSVIGVSISILWIVS GIFEIPFGIFTDK >gi|261748415|gb|ADAD01000035.1| GENE 43 39203 - 40198 648 331 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037507|ref|ZP_06010966.1| ## NR: gi|262037507|ref|ZP_06010966.1| putative membrane protein [Leptotrichia goodfellowii F0264] putative membrane protein [Leptotrichia goodfellowii F0264] # 1 331 1 331 331 520 100.0 1e-146 MILSNVLSICGLGLLIARTNVPTLLICAAIFGVSTAANSGCLSSWIVNSLKIELKDSFKQ EIIQKVFSRSNIFCSIVSLLFSFLMLQIVYEINKNYPIYLSVIAYIICLLFFVFIFNDEY 
KNFNRKNSNLTNVVTQYFRIIDKKFLFISLFFTIPFIMDMGPVNQWSIVYRKTNILGFIQ IIIVGAFVFGNYIVSKTKIKQEKILKLIILDVVIILLLNFFDKKSIYTGIIFFILHVIIN TIILGVNISYLHSDIIKDDANRNTSISTFNFLNSCIVGILLIIQGIFSDKIGMTNTWSLF SVIGAVLYIIMFNRIFIKKGLDVTREEGMYE >gi|261748415|gb|ADAD01000035.1| GENE 44 40318 - 40512 105 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037537|ref|ZP_06010996.1| ## NR: gi|262037537|ref|ZP_06010996.1| apicoplast 30S ribosomal protein S4 [Leptotrichia goodfellowii F0264] apicoplast 30S ribosomal protein S4 [Leptotrichia goodfellowii F0264] # 1 64 1 64 64 90 100.0 4e-17 MSRYTIINGKEYTKIVKKETFIKKKLKAYINLYKKAYENQDIHKNKTICSMSCLQYFHKE LNIH >gi|261748415|gb|ADAD01000035.1| GENE 45 40724 - 42214 1752 496 aa, chain - ## HITS:1 COG:CAC0999 KEGG:ns NR:ns ## COG: CAC0999 COG0498 # Protein_GI_number: 15894286 # Func_class: E Amino acid transport and metabolism # Function: Threonine synthase # Organism: Clostridium acetobutylicum # 3 492 6 494 496 419 46.0 1e-117 MNYKSTRGGELQSSTFAALHGLANNGGLYIPENLPDIKLSYEDLKDLSYQELSEKIIKLF FTEFSDEEINQAVNNAYNSNTFTNEEIVPLHKLNEKISFGELFHGRTLAFKDLALSLFPY LLLLSKEKQNEKKDILILTATSGDTGKAALEGFKDIEGINIVVFYPKNGVSPMQEEQMRK QKGKNVEIVAVNGNFDDAQSAIKTIFSCNDFKKYAEEHDIMFSSANSINIGRLFPQVIYY ISAYVNLVKNGTLKAGEEFNVVVPTGNFGNILAGYIAKKLGIPIRKFISASNKNNVLSDF FQTGTYDKNRDFYMTNSPSMDILLSSNFERYLYYSTGENSKRVNELINSLLSDGKLSVTS EELANIQKEFYGEFADDKATVKAIKGVYNEYKYLMDPHTAVAYAVYDKLDKELDKNIHTV IMSTAHPFKFPIPVAEALGLNKEKDPYEILEEVSSITGIKIPEKLEEIRKSEIRFSKVID KSEISDFVKDYIKNIK >gi|261748415|gb|ADAD01000035.1| GENE 46 42360 - 42803 431 147 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037534|ref|ZP_06010993.1| ## NR: gi|262037534|ref|ZP_06010993.1| putative brix domain containing protein [Leptotrichia goodfellowii F0264] putative brix domain containing protein [Leptotrichia goodfellowii F0264] # 1 147 1 147 147 208 100.0 1e-52 MGDYDIWIIIAMVAVAILGLSFYKERRKMKELDDFSNEKERERKLKIQDNEPKDKILQSR EKFEEKYGIREIKDNKTEKFDENSQENTVSSESSDTGKNEEVSNETENQNNDDKFLFKRN TESEYSEKEKSHMADDYFCCTFYSQIL >gi|261748415|gb|ADAD01000035.1| GENE 47 
42691 - 42960 157 89 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037489|ref|ZP_06010948.1| ## NR: gi|262037489|ref|ZP_06010948.1| hypothetical protein HMPREF0554_1003 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1003 [Leptotrichia goodfellowii F0264] # 46 89 1 44 44 72 100.0 1e-11 MMINFFLKEIQKVSIQKKKRVTWRTIIFVVLSILKFCNSQNEKEIMDRYSQQQEKYEKNQ ENDELKIEILRDNNDCFALNLEVTEGYYI >gi|261748415|gb|ADAD01000035.1| GENE 48 43089 - 44312 1508 407 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0228 NR:ns ## KEGG: Lebu_0228 # Name: not_defined # Def: Sel1 domain protein repeat-containing protein # Organism: L.buccalis # Pathway: not_defined # 35 405 73 446 478 146 31.0 1e-33 MKRSQKEGKNSALKRIMTILVILLMFIQCGAAFMDQEEKNKEYSKLSEIALKYEKKGDFK KAEKYYKMATAYNKYGYYKIAVMYYYNVNRETGIKKMEEAYNIHKVVDAANFLGYKAYEK NDIAGAKKWYTLAAKDGDSDAQYELARIYENEKKFEEAEKWYIEAAENNDSDAMYKVIFL NFLKGNKEKVREWKRRFLGTKGLIGITRSQKTWIDYMTGSEKDEKYFYLSREAENFIDES KYKEAEEKYIEAEKYSERAYFELGAFYYYDVGDKEKGKKVLEEAYNRKLSIAAYALGLIY ESEKQPEEAYKWYKIAADEGDSSAQYMVALHYYKKKNLKEAEKYYILSANQKDAEAMYNL MLLYFEDKKDNQNAKKWANKILNDSGLENLTWDIKDGADAVLENIEK >gi|261748415|gb|ADAD01000035.1| GENE 49 44434 - 44541 124 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCEKSITFLTFKQQVELFKKRGMIIENEEEFKEFC >gi|261748415|gb|ADAD01000035.1| GENE 50 44627 - 45346 1061 239 aa, chain + ## HITS:1 COG:VCA0167 KEGG:ns NR:ns ## COG: VCA0167 COG2859 # Protein_GI_number: 15600937 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Vibrio cholerae # 8 233 35 258 262 59 23.0 5e-09 MKKVQFIIISTILTFGLIVSGALISNALDKVNKSENQITVKGVAERRIKADRAVVRVILT AKNKNLEDAKKSIAEKETAVNEILTSEKIEEENYDKGNLKIKPNFTGDTDKISDYEITQT ISVNLKEVEKIDQIYEKLSELTLTFNNLEVIKPEFYITGIEKYKKDLLIEASKNAESRAF EMLKVNKNKVGEVKSMTQGQFEVLEDREDPKRIDEPAENQMYKKLRSVVTVTYSIDTIK >gi|261748415|gb|ADAD01000035.1| GENE 51 45374 - 45949 959 191 aa, chain + ## HITS:1 COG:FN0720 KEGG:ns NR:ns ## COG: FN0720 COG0231 # Protein_GI_number: 19704055 # Func_class: J Translation, 
ribosomal structure and biogenesis # Function: Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Fusobacterium nucleatum # 1 188 1 184 187 169 49.0 2e-42 MKAAMDLRQGSTYRKDGVPYLILRADRHQSTSGKKARAAEMKFKIKDLMSGKVQEITVLS TEMMDDIILDRNQMQFLYESEGEYFFMDQESFEQIALTEEDLGDAVNFLVEEMVIQVLMY EGTPVGVELPNTVVREVTYTEPGLKGDTIGRATKPATISTGYTLQVPLFVVIGDKIKIDT RTGEYMERANG >gi|261748415|gb|ADAD01000035.1| GENE 52 46306 - 47172 938 288 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0765 NR:ns ## KEGG: Lebu_0765 # Name: not_defined # Def: endonuclease/exonuclease/phosphatase # Organism: L.buccalis # Pathway: not_defined # 1 288 1 291 291 348 68.0 2e-94 MRNIKRGLIFLLLFCFVFSCNNEKNNNKDSERKGTILIASFNAMRLGEKPKNYEVTAKIL SKFDLIGIEEVMHEKGLKKLKAHLIKLTGEEWEYLISDESVGSENYREYYGYIYRKKKFQ EVKKIGFYKEKNKNEFMREPYGAYFKAGNFDFVYVICHSIFGDKEKQRLIEAANYSNVYE YFLNISQESDIIIAGDFNTSADSPAFKNLFEKNNVSYLLNPEENLTTISDTRLASSYDNF LINKDKTKEFTGNYGVYNFIKDNNSEIKKYVSDHLLIFSEYSIEDDSD >gi|261748415|gb|ADAD01000035.1| GENE 53 47194 - 48564 1983 456 aa, chain + ## HITS:1 COG:CAC0847 KEGG:ns NR:ns ## COG: CAC0847 COG0534 # Protein_GI_number: 15894134 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 6 453 10 451 459 279 37.0 1e-74 MDSTVKTVYKKVFTIGLPVSFENMIYSLMNFIDVFMVGVENPVLRLGTAAVAGLGFANQM FMIFMVSLFGMNSGGGILAAQYFGSKDYKNLKKCLGITIIVGFLLSLLFLASGLLIPEAV IGVFSKDTKVINLGARYLRVVAWTYPLIGVGFAFNMQLRAIGQTKYSLYSSIMGLIINMV GNYTLIFGKFGFPALGIEGAAIATIIARIISTSYIIIMIYKMKLPIAGTFSELFTHSWDF VVKMMKISLPVFGHEIMWVSGVSVYVIIYGRMGTEPAAAIQIVKSISNLVFTLIFGLSSG TSAIIGHEIGGGNEENAYRYAVEMLKMSMLIGVIIALFVCVISPLVLRFMGVKPELYALT GKIVLSEGILIIVKTAGTLLIVGILRAGGDTLWTMFADLIPLWLFAIPLTYIAGIKFGLP VALVYLCSGSDEMLKVYPCLRRLKSKKWINNLVINH >gi|261748415|gb|ADAD01000035.1| GENE 54 48579 - 50084 2231 501 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0763 NR:ns ## KEGG: Lebu_0763 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 11 501 11 562 562 675 67.0 
0 MIFENIRKINRYLVTGTAILLMSCSTIKTPPLGANYESPLRDTKTADFHYDLTYLDKDGN IQYDRNIWDATYKVVDNAKDYLIIEMFLFNDIYNKNKERFPEFAKEYTARLVKKQKENPN LKVYAILDENNNMYGAFEHPFITEMKNAGINIIIVDIFKLKDTFPWYSPLWRTVIEPAGN PQNKGWIGNFYGPMWPKLTLRNLFRALNVKADHKKIFLNEKEVVVSSANIHDPSYYHENI AIHTDGEITKDVLDNLQLVAKFSDSEINVDKSSESTENTSKAEEKNSLKEDKEKTDTITD DEGENYKIQFISEAMIGKHLDKDIDSLQAGDELLMGMYFLADKGIIDRLIKAANRGVKIR LILDRSKDAFGMSTNGLPNKPVSKKLKKKTKGKIDIKWYFTNNEQYHTKILLMKKTDGNV IITAGSANFIKKNIRGYIMDADFRILTNKDSKLTKDIYTYFDRLWDNKDGIYTINFEDEP TTSGFSDFMYKILDATQLGSF >gi|261748415|gb|ADAD01000035.1| GENE 55 50111 - 51313 1307 400 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0762 NR:ns ## KEGG: Lebu_0762 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 398 1 384 389 293 49.0 1e-77 MKKKTFFLLLVIIGLFTVSCFKNKEEKKSSDTNGADSTSTTENTTSKTNGDIFNLGKNDG QQQNNQGQIQNLTAEEQQNLINNKIDPAKVSQAIKDAESGNKEAILSLAHLYYGLKDNAK TKKYLQMGVEKNYPEAIYNLAVLLKEEGNIAEANKLMARLPKNSANGQMAPGAEAYNKGI NFVRAKNYKEAKAQFESAYRQGIREADIQVALLNKQMKNYDEAVKWFKLALNRGVKEANL EIGAILFDTGRQVEARSYLMKAYNSGNKGLAMPIAISYHKENNMSEALKWYKIAAKNGDK EAKETVAEIEGNKTANNSGKGVNQFLNVEKPEKSKTNITNTLAQTKKDNNVPSSDSGTKN AVSTNTNAASTMNRKAKQSTAREKQNYNIGIDEVTDNRMN >gi|261748415|gb|ADAD01000035.1| GENE 56 51577 - 52317 684 246 aa, chain + ## HITS:1 COG:no KEGG:bpr_I0857 NR:ns ## KEGG: bpr_I0857 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 9 226 29 249 270 202 54.0 8e-51 MNITKNNGLNSNQIKMIAIIAMTIDHMTWLFFPGYQKIWWIMGLHVIGRLTAPIMWFFIA EGFHYTRNVKKYILRLFTFAIISHFAYDFAGGISFIPDGFFNKTSVMWALAWAVVLMVIF TTDRLPQRFKTILVILICFITFPADWSTIAAICPVYLYLNRGNFRKQSTKMLIWVSIYAI IYFIFLDKVYGIIQMFALLSLPILKQYNGERGKWKGMKWFFYLYYPAHLFVIGLIRIMIG NGSIFP >gi|261748415|gb|ADAD01000035.1| GENE 57 52533 - 52934 194 133 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037551|ref|ZP_06011010.1| ## NR: gi|262037551|ref|ZP_06011010.1| putative NADH dehydrogenase subunit 4 [Leptotrichia goodfellowii F0264] putative NADH dehydrogenase subunit 4 [Leptotrichia 
goodfellowii F0264] # 1 133 1 133 133 140 100.0 3e-32 MLITIFSFLSILMIFVRLYSMNSVFSITKEKNKFSERITVFVLIFEILTIGFFVFFSSFR YLLPPVLLFLKTRQFHYLIFENLSVGLIIYVLIDFFLLSTFYPLNHILKFFNRTVLTFVF LLNILLGAMMFMV >gi|261748415|gb|ADAD01000035.1| GENE 58 53102 - 54562 2019 486 aa, chain + ## HITS:1 COG:BS_sacA KEGG:ns NR:ns ## COG: BS_sacA COG1621 # Protein_GI_number: 16080855 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Bacillus subtilis # 16 483 17 477 480 414 44.0 1e-115 MDFKHIKENEEKSVSEKKEKVSSDYWRQKYHIQGIVGLINDPNGFSQFKGKYHMFYQWNP LKTDHTAKYWGHSVSGDLLHWERKKTALRPETVYSKNGVYSGSGLTINDKLYLFYTGNVK DEKGNRDSYQCIAISEDGENFERIEPVITNQPEGYTRHIRDPKVWEKDGIYYMIIGIQND DLKGKAALFSSENIYDWKFLGEIAGADKGKLGKFGYMWECPDYFQLKDEKTSEIKDILVA CPQGLEAQGDLYNNVFQSGYFIGKMDYKKPEFIIENDFIELDRGHDFYAPQSMEDDKGRR LMIGWMGVPEQEDYPTVKNEWLHCLTLPRELKLKNGKLYQVPIEEMETIRGEKSEFSGSV KGEKEIGKGTVYELKAEFSDISSDFGLKLRVGENSETVLKFDYNEKKFILDRSSGEQPDK SLRKVYLGDIKTLELDIFVDNSSVEVFINGGEEVFSSRIFPEKGADKITVFSENGINVKI EKWEWK >gi|261748415|gb|ADAD01000035.1| GENE 59 54569 - 55897 1262 442 aa, chain + ## HITS:1 COG:FN0313 KEGG:ns NR:ns ## COG: FN0313 COG0144 # Protein_GI_number: 19703658 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Fusobacterium nucleatum # 13 441 2 433 435 288 42.0 2e-77 MNNKKDLKNQKDNIKIDIVNLLDEIMSGKYSNIQLNHYFKTKNYLKKEKLFITNIINITI KNLIYIDYLLEKTVKNIQKRKIKQLLRISVAQLFFTEADNAGVLYEAVEVAKIINNHQSG FVNATLQSVLRNKDEIIENIPKDRKDGILYSYPQWFVNKMKVDYPEDYIDIMKSYKSRSY LSVRYNSKKLTKDRFEKILSEVKSGILFSADEVYYLSNSNIFDTEEYKNGNIIIQDASSY LAVKNLNVEKGDVVLDACAAPGGKSLAILQNFEPELLVAEDIHEHKIKILENMKKKYNFS NLKVVLNDATQIESLNMKFDKILLDVPCSGLGVLRKKPEKIYDLTGEQIKSLKKLQKKIF DSAYNSLKENGIILYSTCTFSINENTNNLEYFLEKYRDLVIEEVVIPENVDIRKDQWGGV YITHKNIYNDGFYIAKLRKKQL >gi|261748415|gb|ADAD01000035.1| GENE 60 55915 - 57675 2077 586 aa, chain + ## HITS:1 COG:FN0243 KEGG:ns NR:ns ## COG: FN0243 COG0617 # Protein_GI_number: 19703588 # Func_class: J 
Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Fusobacterium nucleatum # 7 443 10 449 451 319 42.0 1e-86 MQFKLNSDVKFILEQLNNNGTGFLVGGAVRDLITGKEPYDYDFATDIEYEKLKEIFKNFS PKEVGAHFGILIIKVNGKHYEIAKFRKETGVLNSRHPKSVKFIDTIEEDLSRRDFTVNAM AYNEIRGLIDLFGGKKDIENRIIRFVGKPKIRIEEDALRIMRAFRFISQLGFKLDKKTSE AIYLKKKFLNKISKERIFTELSRILTGPCMKRALRMMKKCGVLEMIIPEFDYAYDFDQNN SYKKDLLFEHIIKVTDLCHPDLITRFAALFHDLGKISTKSIDAKGKFHYYGHEKESVLIA ENCLKELRVPNDFLHSVKKIVLNHMIVYQEISDREIKKLIINLGEKDLARLFDLFNADFL CKNSNEYKKENPVGKLKKRIEEIKSRGYIPSLRELDITGADLISLGFEPINIGEIKNDIY EQVLDETVKNEKEEILKYLSGKYNISKKMKHEKSCGAVIIREKNEEFLIVKMYNGNWGFA KGHTEMNENEEETAIREVKEETGISVKLINGFRETVKYVPNESTLKEVVFFLGTAENEEV KIDKEEIEEFKWCNYEEAMKLITYKLQRDVLDKAVEFIQINNNSKL >gi|261748415|gb|ADAD01000035.1| GENE 61 57700 - 59208 1850 502 aa, chain + ## HITS:1 COG:FN2106 KEGG:ns NR:ns ## COG: FN2106 COG1288 # Protein_GI_number: 19705396 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 2 502 18 518 518 619 68.0 1e-177 MKKKKFEFPSAFTVLFIIMVLSAVLTYLIPSGRFSKLVYDRVSNEFVITDQQGETTAKPA TQEVLKELNIDLPLEKFTKEIIKKPMSIPGSYKRIPQQPQGFMEIIQAPIAGITDSVDIM IFVLILGGIIGIINKIGAFDAGIAALSKKTKGKEFILVFMIFTLITLGGTTFGMAEETIA FYPILMPIFLVSGFDAMTCIAAISLGSIIGSMFSTINPFSMIIASNAAGISFTEGLVFRI IALILAAAITIIYIYRYTEKVKKDRTKSVVYDQEKEIRERFLSNYEEGSKSEFTLRKKLS LLIFAMAFPVMIWGVSIEGWYFGEMSALFLTVAIVIIVVSGLPEKKSVSAFIQGAGDLIG VAMAVGLARAINVIMDNGYISDTLLDFFSHLVAGMNSGIFALMQFGIFSVLGIFIQSSSG LAVLSMPIMAPLADNAGVSREIIINAYSWGQGLMSFITPTGLVLVFLEMVGVTFDKWLKF VLPLFGIIAAFSAVMLVINTMF >gi|261748415|gb|ADAD01000035.1| GENE 62 59230 - 60666 1950 478 aa, chain + ## HITS:1 COG:FN1277 KEGG:ns NR:ns ## COG: FN1277 COG2195 # Protein_GI_number: 19704612 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Fusobacterium nucleatum # 3 478 4 486 486 458 50.0 1e-128 MRKLENIKPERVFYYFEEISKIPRDSYKEKEISDYLVKFGKEHNLECYQDEVYNVVLRKK 
ASPGYENAEKIILQGHMDMVCEKTEDSNHDFTKDPIELIVDGNYLRANKTTLGADNGIAV AMMLAIVEDDSLKHGPLEFLITTSEEIDLGGAMALKPGILQGKMFINLDSEETGILTVGS AGGENIDILLPVKKTELKEGFTYKVKLQGFAGGHSGAEIHKGRENSNKAMNKVLKSLNEK EDIYLVSVSGGSKDNAIPRVAEAVITSIKDIKETVEKTLKEIKESYIKTEPQTELLLEEV LFNGQVFEKEILKKYTELIEDIPTGVNTWMKEYPDIVESSDNLAIVKTEEKYIRIAISMR SSEPEVLEKIKKKMAETAEKHGSNYEFSANYPEWRYRSVSALRDKAVEVWKKLTGEEMKV EVIHAGLECGAIYHNYPDIDFISLGPDMQNVHTPEEKLDIASTEKIYNYVVKLLEELK >gi|261748415|gb|ADAD01000035.1| GENE 63 60698 - 61189 568 163 aa, chain + ## HITS:1 COG:L89201 KEGG:ns NR:ns ## COG: L89201 COG3760 # Protein_GI_number: 15674012 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Lactococcus lactis # 1 158 1 159 163 139 44.0 3e-33 MEEFEKVYEKLNELNISFEIVEHPAATTTEEADKYVEGIEGVLTKSLFLTNDKKTAYYLL IMDDHNKLDMNEFKEIVGAKRLKFASSDSLYKKMKLQPGVVSIFGLLNNPEKDIKLYFDT KILKEKRISFHPNDNTKTIFISINDMLKFIKNIGFEYEEVKFE >gi|261748415|gb|ADAD01000035.1| GENE 64 61468 - 62679 1211 403 aa, chain - ## HITS:1 COG:RSp0310 KEGG:ns NR:ns ## COG: RSp0310 COG0477 # Protein_GI_number: 17548531 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Ralstonia solanacearum # 7 402 45 441 450 132 24.0 1e-30 MTNNNSWKSKISLFLFSQTVSTLGSFVVNFSILWYITLKYSSGTFITILVLCTFVPQILI SLFAGVWADKYNKKFIIMLSDSFIALATFIIVIFFLAGNHSLYIMYAATIVRSIGSGIQT PAISAVIPEITPEDKLLKINGINNTLQSVVALLSPAIGGVILGSLGIIYSLMFDIVTAVI GVGILSFLKIPENQNKTEIHESGYSQLKSGLKYAKNNIAINRMLRFFTVIYILVTPIAFL YPLLIKRVFGDDIGKLTLTEVLWSIGMILGGLAVTFTKNVKNKIKLFLIVYFIIGIDFYI LGLTRDFNVLLITLFLGGIFVIIGDTAEMTFIQENTDPEMMGRVLSMINLIRVFIFPVSI LFFGPFADKIRLDYLIAGTSFLISIFALTKFFNKEFMEMGKRS >gi|261748415|gb|ADAD01000035.1| GENE 65 62798 - 63460 709 220 aa, chain + ## HITS:1 COG:CAC1231 KEGG:ns NR:ns ## COG: CAC1231 COG0673 # Protein_GI_number: 15894514 # Func_class: R General function prediction only # Function: 
Predicted dehydrogenases and related proteins # Organism: Clostridium acetobutylicum # 1 202 5 206 331 234 56.0 9e-62 MRFGIIGTNWITDKLVDAGKEIKDFELTAVYSRKEETARAFADKYGVDTIFTDLEKMAES DKIDGVYIASPNSFHSSQSVLFLKNKKAVLCEKPATSNLRELEEVIKTAKENDTLYMEAM KIPFIPTYEVLKENLYRIGKIRKAVLGYCQYSSRYEDYKRGDVKNAFKPEFSNGALLDIG VYPLFLAISLFGSPENIDAGGINTGKWKRNRCTGEYKYVL >gi|261748415|gb|ADAD01000035.1| GENE 66 63390 - 63797 534 135 aa, chain + ## HITS:1 COG:ygjR KEGG:ns NR:ns ## COG: ygjR COG0673 # Protein_GI_number: 16130982 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Escherichia coli K12 # 12 135 214 333 334 92 41.0 2e-19 MQGGLILENGKGIDAQGNINMYYKDMDVTVLYSKISDSFIPSEIMGENGSLIIEYPSEMD KLYFIDRKNKNEKIDLTVQAKGNRMYYELKHFMELFNEGKKESPVNNFALMQTVMKVMDE ARKKIGIVFPADKIK >gi|261748415|gb|ADAD01000035.1| GENE 67 63877 - 64440 924 187 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037509|ref|ZP_06010968.1| ## NR: gi|262037509|ref|ZP_06010968.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 187 1 187 187 323 100.0 3e-87 MKKAIILMITVFTINGFSAQVLNNKVGSFFTNKEIKNLVFNQYNLDTEIAAVSKIKGKGE DAVSKADADAKKGVQNGARDYAYEILNEYLNGSLVSGPGFNTFKMREFANEVAKEVAPNA QRRGSWTTSKNETVVLYTIDKQLVKSSAERIFNERLAAVIQKLSDYKNNFGQTNRSGGEE VRVETAE >gi|261748415|gb|ADAD01000035.1| GENE 68 64549 - 65304 1131 251 aa, chain + ## HITS:1 COG:CAC0908 KEGG:ns NR:ns ## COG: CAC0908 COG1179 # Protein_GI_number: 15894195 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 # Organism: Clostridium acetobutylicum # 2 249 3 250 251 194 41.0 1e-49 MKHIFSRFSMLVGEKALDKLKNSNIIIFGIGGVGSYAVESFARSGIGSMTIVDYDEISES NINRQLHALHSTVGMSKGEIMKQRMLDINPECNVKLRKELAFKNIDTFFEDNDEKYDFAV DAIDVIYSKIELIEYCYENNINIISSMGFGNKMHPEMIEISTIENTSVCPMARTIRRILR NKNIENIPVVYSKEKALVPDKSDDYSSEEPTDFRMNNELPNKITPGSNAFVPGTAGLIMS SYVIRVLLEIE 
>gi|261748415|gb|ADAD01000035.1| GENE 69 65320 - 66186 1041 288 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1253 NR:ns ## KEGG: Lebu_1253 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 6 286 34 314 322 322 66.0 1e-86 MLKILQQNKIIKIAIIVGILLCIFGLMNKIFEFTIFNFFKNMTKPYLERAYEESKKLFIT LSLLKGTTDVIEGSTVNVNVILGMNIQIGDLVQPIYDIIDILWKISLASVVVLKLETIYY EIFKVKLASILIFISLVTYFPYTFYQNTVTEIFKKISKYSLLALAFVYVLLPGTILVSSA VSDYFEKEYKEPAIVRLNESVNKMNKVKDDLFVMEQSKSIFNIPGQIESTKNKFSNLGNE ISNVSKDLSDYTPVIIGITLLSYIIMPLLLLIFLYKMTKSLLLEKIAK >gi|261748415|gb|ADAD01000035.1| GENE 70 66336 - 66740 618 134 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1139 NR:ns ## KEGG: Lebu_1139 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 134 1 122 122 122 54.0 5e-27 MKKIIMLLMLLVGALTFSALQDGIYYVEKANGGNWKAFAKITVKGNKIIGAQYDRKNGTG ELLSIDQKENEKYKSKFGESFRDSSFSMTRTLVSTQDINSVSGVKNSEALSEFKQLIQFL INKANSGEQGSFKM >gi|261748415|gb|ADAD01000035.1| GENE 71 66828 - 67112 281 94 aa, chain - ## HITS:1 COG:no KEGG:spyM18_1747 NR:ns ## KEGG: spyM18_1747 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_M18 # Pathway: not_defined # 2 94 98 186 186 94 51.0 2e-18 MSDGYIVNSYLLGQLDKNYSEEINENNQISGMQLLTLAYDTLMEAKKIINVRYVWLECED TEKLLNFYKTFGFEEIENFTSVNNLKVLIMKLKK >gi|261748415|gb|ADAD01000035.1| GENE 72 67202 - 67423 226 73 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYIENVKIISLQDLLEELKDKELVIDILKKFRSKYNKDVEDFLHTKAIEFENSGLSSTHL VFDENFILLGFFL >gi|261748415|gb|ADAD01000035.1| GENE 73 67424 - 67573 220 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037544|ref|ZP_06011003.1| ## NR: gi|262037544|ref|ZP_06011003.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 49 1 49 49 69 100.0 9e-11 MATKSFTAEMTFDKKSINSLIKALDNEKSPNRKPVKDGELIQKAFGRKK >gi|261748415|gb|ADAD01000035.1| GENE 74 67704 - 68198 677 164 aa, chain - ## HITS:1 COG:FN2117 KEGG:ns NR:ns ## COG: FN2117 
COG2849 # Protein_GI_number: 19705407 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 7 158 82 237 244 62 29.0 3e-10 VSYEYDPTGVLTSVTSYTNNRKDGKELTVSNGVIVLENNYNNGVLSGLSTSYYLDGTLRS TGNYVNNLRNGEWVWKYPDGTVKLVENYQNGKISGNITGYFPDGNKERVFQVTNGNGSFT QYYDNGKLKAKGSIVNYSSAGDWVFYDKNGNVMTTNPYFEEVVR Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:39:47 2011 Seq name: gi|261748410|gb|ADAD01000036.1| Leptotrichia goodfellowii F0264 contig00197, whole genome shotgun sequence Length of sequence - 1840 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 892 628 ## gi|262037565|ref|ZP_06011023.1| hypothetical protein HMPREF0554_2392 2 1 Op 2 . + CDS 919 - 1365 315 ## gi|262037564|ref|ZP_06011022.1| valine--tRNA ligase 3 1 Op 3 . + CDS 1358 - 1633 230 ## gi|262037843|ref|ZP_06011277.1| hypothetical protein HMPREF0554_2290 4 1 Op 4 . 
+ CDS 1726 - 1840 62 ## gi|262037843|ref|ZP_06011277.1| hypothetical protein HMPREF0554_2290 Predicted protein(s) >gi|261748410|gb|ADAD01000036.1| GENE 1 2 - 892 628 296 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037565|ref|ZP_06011023.1| ## NR: gi|262037565|ref|ZP_06011023.1| hypothetical protein HMPREF0554_2392 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2392 [Leptotrichia goodfellowii F0264] # 1 296 1 296 296 487 100.0 1e-136 YHRNNNGLEDSEDYASFYGRQFERYYDRNFGNNDVAFITERLKYTKEQLGNDWENKVNEG ADNYNKDLTTKRRPNANKSFYTKKEIQDEIQNWFGMNSFNIDWTKYKNNEKYRKQTNYYF FQAKYALRVKSVNKVSDKYVTITQKNGKEMTLKRVPPEESIYHNIELKNGKIYFYFDNRY KKYVQADGYELILDKNNKPVYIPEISGTYNYYTYNMKFEPDWFNHKKDIDLWKDFGAGPN DRTTYEQRLEIGDLSFGSFIQSNYKTLSEMAKSRKDNYFTYEELIILKRTMQTIKD >gi|261748410|gb|ADAD01000036.1| GENE 2 919 - 1365 315 148 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037564|ref|ZP_06011022.1| ## NR: gi|262037564|ref|ZP_06011022.1| valine--tRNA ligase [Leptotrichia goodfellowii F0264] valine--tRNA ligase [Leptotrichia goodfellowii F0264] # 1 148 1 148 148 198 100.0 1e-49 MQIDNLFLALMAFNIFICICFIFGYNIFEMKHDYLIDDIGIYFSLLILLILLNFNVLNHI WKNEYKSAIYLYLPISVVLVLILLFVYYDDIRMIVFGLKTVKPDVWIKEIIDYIIVKKMK YNIVFFLEFYIPLKYWKVKRNINKKTYE >gi|261748410|gb|ADAD01000036.1| GENE 3 1358 - 1633 230 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037843|ref|ZP_06011277.1| ## NR: gi|262037843|ref|ZP_06011277.1| hypothetical protein HMPREF0554_2290 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2290 [Leptotrichia goodfellowii F0264] # 1 89 467 555 627 154 97.0 2e-36 MNKRKNVGINLTFTPGVTDVYRNGKLTPDSVGGVLVGTRIDYSKDDYVAKVKATIGNSVN MTVAGKKADLSGVNRDTSKMVDVLKDRKITL >gi|261748410|gb|ADAD01000036.1| GENE 4 1726 - 1840 62 38 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037843|ref|ZP_06011277.1| ## NR: gi|262037843|ref|ZP_06011277.1| hypothetical protein HMPREF0554_2290 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2290 [Leptotrichia goodfellowii F0264] # 3 38 592 627 627 68 94.0 
2e-10 MSEIVKAVLENEGKDIGLYYRDRLDFEAERDRKERSGE Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:40:21 2011 Seq name: gi|261748408|gb|ADAD01000037.1| Leptotrichia goodfellowii F0264 contig00203, whole genome shotgun sequence Length of sequence - 241 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 240 327 ## gi|262037567|ref|ZP_06011024.1| tRNA-I Predicted protein(s) >gi|261748408|gb|ADAD01000037.1| GENE 1 3 - 240 327 79 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037567|ref|ZP_06011024.1| ## NR: gi|262037567|ref|ZP_06011024.1| tRNA-I [Leptotrichia goodfellowii F0264] tRNA-I [Leptotrichia goodfellowii F0264] # 13 79 1 67 67 108 100.0 1e-22 ETILIDRDKALEMAERAAGRKIDDLQIISTKDPNIKEIVGADAIAGRAYHVGNGKIVIVA DNIGDYAKAMGTIAEEGRH Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:40:33 2011 Seq name: gi|261748390|gb|ADAD01000038.1| Leptotrichia goodfellowii F0264 contig00138, whole genome shotgun sequence Length of sequence - 14548 bp Number of predicted genes - 18, with homology - 17 Number of transcription units - 6, operones - 4 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 81 176 ## - Prom 114 - 173 10.1 2 2 Op 1 1/0.000 - CDS 248 - 904 238 ## PROTEIN SUPPORTED gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 3 2 Op 2 1/0.000 - CDS 901 - 1848 1205 ## PROTEIN SUPPORTED gi|229211635|ref|ZP_04338026.1| (LSU ribosomal protein L11P)-lysine N-methyltransferase 4 2 Op 3 2/0.000 - CDS 1835 - 2614 1097 ## COG1692 Uncharacterized protein conserved in bacteria - Term 2643 - 2691 8.6 5 2 Op 4 . - CDS 2702 - 4276 2575 ## COG1418 Predicted HD superfamily hydrolase 6 2 Op 5 . - CDS 4289 - 4651 201 ## Lebu_0690 hypothetical protein - Prom 4699 - 4758 13.8 7 3 Op 1 . 
- CDS 4779 - 5630 1015 ## COG1560 Lauroyl/myristoyl acyltransferase - Term 5663 - 5698 3.1 8 3 Op 2 . - CDS 5720 - 5983 378 ## PROTEIN SUPPORTED gi|169837801|ref|ZP_02870989.1| SSU ribosomal protein S15P - Prom 6012 - 6071 5.9 9 4 Op 1 . - CDS 6098 - 6586 546 ## COG0526 Thiol-disulfide isomerase and thioredoxins 10 4 Op 2 . - CDS 6662 - 8062 1667 ## Lebu_0686 hypotheticalprotein 11 4 Op 3 1/0.000 - CDS 8099 - 10000 2653 ## COG0143 Methionyl-tRNA synthetase 12 4 Op 4 . - CDS 10030 - 10674 559 ## COG2121 Uncharacterized protein conserved in bacteria 13 4 Op 5 . - CDS 10711 - 11283 948 ## COG1309 Transcriptional regulator 14 4 Op 6 16/0.000 - CDS 11309 - 11785 480 ## COG0262 Dihydrofolate reductase 15 4 Op 7 . - CDS 11812 - 12681 1069 ## COG0207 Thymidylate synthase - Prom 12761 - 12820 2.6 16 5 Op 1 . - CDS 12824 - 13294 466 ## COG0219 Predicted rRNA methylase (SpoU class) 17 5 Op 2 . - CDS 13320 - 13889 879 ## COG0817 Holliday junction resolvasome, endonuclease subunit - Prom 13912 - 13971 9.7 18 6 Tu 1 . - CDS 14040 - 14546 694 ## Lebu_0677 tRNA pseudouridine synthase B Predicted protein(s) >gi|261748390|gb|ADAD01000038.1| GENE 1 3 - 81 176 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSKKEFVEAYAKATGETKKRAEELVN >gi|261748390|gb|ADAD01000038.1| GENE 2 248 - 904 238 218 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 [Treponema pallidum subsp. pallidum str. 
Nichols] # 1 203 35 275 863 96 29 1e-19 MIIAVDGPAGSGKSTISKLVAKELGMIHLDTGAMYRLFTLKLLKEKVSFEDKDKINELLE NLDINIENDRFFLDGKDVSEEIRKSEISENVSEVSAIKEVREEMINLQRKFSSSKDIILD GRDIGTVVFPNADIKIFLVADPKERAERRYKEVIGKGGKASFQEIYESIVNRDRLDTVRE ISPLKKAEDAIEVDTTGKTIEEVKNMILSIYKTKSNDC >gi|261748390|gb|ADAD01000038.1| GENE 3 901 - 1848 1205 315 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229211635|ref|ZP_04338026.1| (LSU ribosomal protein L11P)-lysine N-methyltransferase [Leptotrichia buccalis DSM 1135] # 6 315 1 310 310 468 71 1e-132 MTRFKMEWIKVKVEYFSDNSEISKAKIINIFEEIGIKQIEAVDYFSDNSLDYNENFKSKN DIWSIIGYIINNRFAKSKLKIMKNALEEYSEENENFGYEIYTSECSDDDWKDEWKKYFHT AKITENIVIKPSWDEYEPVGNEKIIEIDPGMAFGTGTHETTSLCVEFLEKYSGNKDKLLD IGCGSGILMLIGKKLGINKVTGIDIDEKVGEVVKENFAKNDIHENFEVIIGNLVNDINEK YDIIVSNILVDVLTELLKDIEKVLLKDSVVIFSGILKEKEEMFLEKTKLYKLEQIDRNEK NNWVSLVFRYKGELL >gi|261748390|gb|ADAD01000038.1| GENE 4 1835 - 2614 1097 259 aa, chain - ## HITS:1 COG:FN1609 KEGG:ns NR:ns ## COG: FN1609 COG1692 # Protein_GI_number: 19704930 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 258 1 259 263 286 55.0 3e-77 MKFLIIGDIVGEPGRNILFKYLKKRKQNYDFIIVNGENSAGGFGITGKIADQIFAHGADV ITLGNHSWDRKEIYSYINEEKRMIRPINFVKEAPGRGYTIVEKKGKKVAVINAQAKVFMP PIACPFLAIDEIIPEISKTADIIILDFHGEATSEKLAMGWNLNGKASLVYGTHTHVQTAD ERVLPGGTAYITDVGMTGGHDGVLGMNKRESLQKFKDGMPSKYSVCEDNIKINGIEVDIN ENNGKAVSIRRINMHYDEV >gi|261748390|gb|ADAD01000038.1| GENE 5 2702 - 4276 2575 524 aa, chain - ## HITS:1 COG:FN1913 KEGG:ns NR:ns ## COG: FN1913 COG1418 # Protein_GI_number: 19705218 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Fusobacterium nucleatum # 16 524 1 508 508 474 57.0 1e-133 MHISIVILLIVIFSFLAFFIAYFFGSSVFKKKYGELSELELRIVDAKRRLESSKKEVERE IESFKKEETLKVKETLLNEKKIADEEIKKMKSEIVSKEERLAKKEETLETKMERLEERES KIERQREKISRKETELNELIQKEEKELERISELTREDASKIILTRLENELDHDKAVLIRD FEYNLDREKDKISKRIISTAIGKASADYVVDSTISVIQLPSEEMKGRIIGREGRNIRAIE 
SATGVDLIIDDTPEAVVLSSFDGVRREVARIALEKLISDGRIHPTKIEEVVQKAQEEVEE SVLDAAEQAILEVGIPTLPREVLRVFGRLKFRTSFGQNILQHSIEVAHIAAALAAEIGAN VDIAKRAGLLHDIGKAFSHEQEGSHAINGGEFLRKFSKENELVINAVEAHHDEVEQLSIE AVLVQAADSISASRPGARRETLSNYLKRLEQLEEIANSHEGIESSYAIQAGRELRLIVHP DRINDDKAVILSREVAKEIEEKMQYPGQIKVTVIRETRAVEYAK >gi|261748390|gb|ADAD01000038.1| GENE 6 4289 - 4651 201 120 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0690 NR:ns ## KEGG: Lebu_0690 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 120 10 133 133 102 48.0 5e-21 MLIFINCTTVNLTLVESDKVNKEVTDTIVELKEAANFNKYEKLKEFFLPTFKNNIILDNI KKYDLSKLTFIFSEVKAVSKNKASGLMIINYGSESNYYIVNWKMTNENGQWKISDVAEKK >gi|261748390|gb|ADAD01000038.1| GENE 7 4779 - 5630 1015 283 aa, chain - ## HITS:1 COG:FN1016 KEGG:ns NR:ns ## COG: FN1016 COG1560 # Protein_GI_number: 19704351 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lauroyl/myristoyl acyltransferase # Organism: Fusobacterium nucleatum # 78 283 1 211 226 103 31.0 4e-22 MKYKPGEIVAGWNAIVFRKVLSVFPLKFRYKFFESVGMAAYYLIKKRRELTIDNIKHAFP EKSKEEIINIAKESYKTMGKMIMTSIYLKEVTSGGNTVLENEELMLKACENDKAVIIVSL HSGGFEAGSIMRNIRKFYAVFRKQKNKKLNDLMAKWREEGGLHSIALRDSETLNDALRNK TIIALASDHYAEDIPVKYFGRETTAVSGPVLLGLKYKVPLVLAYSVFENGKIKIINKEII EIEKKENLKETVKFNMQKIFYKFEEIVKENPGQYMWQHKRWRS >gi|261748390|gb|ADAD01000038.1| GENE 8 5720 - 5983 378 87 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169837801|ref|ZP_02870989.1| SSU ribosomal protein S15P [candidate division TM7 single-cell isolate TM7a] # 1 87 1 87 87 150 82 7e-36 MAMRSKQEIITQYGKNAQDTGSADVQVALLTERINHLTEHLRTHPKDVHSRVGLLKMVGK RRRLLNYVKNRNVDNYRELIEKLGIRK >gi|261748390|gb|ADAD01000038.1| GENE 9 6098 - 6586 546 162 aa, chain - ## HITS:1 COG:BS_yneN KEGG:ns NR:ns ## COG: BS_yneN COG0526 # Protein_GI_number: 16078864 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus subtilis # 20 152 28 159 170 68 
33.0 5e-12 MKKIIFLLVIVLFAIGCGKKSDVKAESDGKISNFSLEDLNGNKYESSKIINNGKKTLFVV AAEWCPHCRAEAPDIQKFYDEYKDKVNVIVVYSEANSSLDKVKEYVKNNEYTFPIYYDSG SEILTGFKVQAFPFNLIMDGNKIIKEHKGELTYDLLVEEFGK >gi|261748390|gb|ADAD01000038.1| GENE 10 6662 - 8062 1667 466 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0686 NR:ns ## KEGG: Lebu_0686 # Name: not_defined # Def: hypotheticalprotein # Organism: L.buccalis # Pathway: not_defined # 1 465 1 460 461 459 62.0 1e-127 MRKSIAGFILFLLILVSCGKTEKGYDSLEKGLLGILEKKEEGYVRKYLEQSAKEENADAF GLAYLYWQNSGEDFFNKFLNKSNGNAEFYKALILKGKSDSEKEVINLLESAANQGNYKAY YMLGNIFQDKLEFTKAQEYFKKGKERGEIYSVYSYDYNKNLTGIYKRIEELNKKLQGGTI NPNEKKELGTLILEKFSNYNAAYDILKEFIVEEYPPALYSKAKKLETEDKDQEAVEIYNQ LFAKNRYYLASFELAYKLVNSSKNYELAIRVLEDTNSDDSLITGYKGYIYENMKNYTKAE ENYLKAVHKNDVDMMTYLGKLYEEKKEVKKAKEIYNKAYSAGSISSGYRLANLIEKTDKE KPEKDKKSSQKKTERNNKSAKKILERLAESGDDYSMVDLSLYYPEKDKNVRILNLKAAAR LNDTAFHNLGVYYYNNKNKDKAKQYFKIAKDYGYHLEPEYEAFISI >gi|261748390|gb|ADAD01000038.1| GENE 11 8099 - 10000 2653 633 aa, chain - ## HITS:1 COG:FN1268_1 KEGG:ns NR:ns ## COG: FN1268_1 COG0143 # Protein_GI_number: 19704603 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 507 1 510 526 649 59.0 0 MSNSFYITSPIYYPNAAPHVGTAYTTIICDVVSRYKRIKGYDVSFMTGVDEHGQKIQEAA EKNGFTPQQWVDKMSLNFTTLWEKLNISNTDYLRTTQERHIESVREIVKRVNEKGDIYRG EYVGKYSVSEETFVPENQLVDGKYMGKEVIDVKEVSYFFRLSKYEDALLKHIEENPDFIK PESKKNEVVAFIKQGLQDLSISRTTFDWGIPLELEKGHVIYVWFDALTVYLTGAGFKKDE KEFGEIWSNGKVTHVIGKDILRFHAIIWPAMLMSAGIKLPDTVAAHGWWTVEGEKMSKSL GNVVNPEEEVKKYGLDAFRYYLMREATFGQDADYSKKAMIQRINSDLANDLGNLLNRTIG MQKKYFNSEVILNKVEDKYDIEIKDLWEKTVEDLDIHMNEFQFSEALKDIWKFIGRMNKY IDECEPWNLAKDGSKKDRLSTVMYNLVDSLYKISMLIYPFMPDTAEKIVRQLGLNIDFNT LKLEDIRKWGSYPAGNKLDEAVPLFPRIEIEPENKKDYNENLHIENPITINDFNKVEIKV VEIIKAEKIKDADKLLKFIVDTGTEKRQIVSGIAKWYPDEKELIGKKVTAVLNLEPVTLK GELSQGMLLTTTEKKKIRLIEISKEVKTGSIVK >gi|261748390|gb|ADAD01000038.1| GENE 12 10030 - 10674 559 214 aa, chain - ## HITS:1 COG:FN1269 KEGG:ns NR:ns ## 
COG: FN1269 COG2121 # Protein_GI_number: 19704604 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 3 196 8 201 209 129 41.0 3e-30 MNKYKFTGIVLHIIYRILSFMTKKEYHYADNVDIEVQKIMVFWHRKIFTVCNATRTIKKK ASMVSSSKDGEILSELLRREGNEIIRGSSNKDNIKSLKESIKFVKKGYSLGIAIDGPKGP IFEPKAGAIYIAQKTGLPIIPVSSFCNKRWIFRNVWDRLEIPKPFSRNVHYVGEPFYISK DISIEEATEIVKENIHKAGRRAYEIYRNKYILKK >gi|261748390|gb|ADAD01000038.1| GENE 13 10711 - 11283 948 190 aa, chain - ## HITS:1 COG:FN1004 KEGG:ns NR:ns ## COG: FN1004 COG1309 # Protein_GI_number: 19704339 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 3 178 4 173 188 79 33.0 4e-15 MPKIKFTKKDIVKATYEIMKNEGIKNISARKIASKFKGSTAPIYANFSTIEELKEEIIKL AEEKLDEYLNGHYSDKWEEDRKILNTAIGFVVFAREEKELYRAIFLDGSKGFKTLFDETI EKLLTKEELLKSFPQLPEEEAKLSVLRLWYFLYGYATLVCTSAILDQTNEAIERRILDIA EHFKQLHGLE >gi|261748390|gb|ADAD01000038.1| GENE 14 11309 - 11785 480 158 aa, chain - ## HITS:1 COG:CAC3004 KEGG:ns NR:ns ## COG: CAC3004 COG0262 # Protein_GI_number: 15896256 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Clostridium acetobutylicum # 1 157 1 152 153 119 40.0 2e-27 MFSIIVAMGENREIGKKNKLLWHIPEDLKNFKKTTTGKTVIMGRKTFESIGKALPDRRNI VLSRTFGQEEARKYEIEVYDNFDDVIKDYYDTDEEVFIIGGEDVYITALKYVKKLYISYI KFSDKEADAYFPEIDYREWGMREEKQFENWNFSVYERL >gi|261748390|gb|ADAD01000038.1| GENE 15 11812 - 12681 1069 289 aa, chain - ## HITS:1 COG:NMB1709 KEGG:ns NR:ns ## COG: NMB1709 COG0207 # Protein_GI_number: 15677556 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate synthase # Organism: Neisseria meningitidis MC58 # 1 289 1 264 264 256 46.0 4e-68 MKEYLELVKHVLDNGVRKENRTEVDTISTFAYTYKVDLSKGFPLLTTKKMYFNSMLHELF WYLTGEEHIKNLRTKTKIWDAWADEEGRLETAYGRFWRRYPVPEISLDGEVFTDENNKWT KREKNGQLVFDQIQYIIDTLKEMKTNPNHKNGRRLIVLAWNPGNASISKLPPCHYTFAFN VLGNRLNCHLTQRSGDIALGIPFNLACYSLLTMMIAKECGYEPGEFSHTIIDAHIYENHI 
EGLKEQLTREPLKLSKITIADKPFNELTFEDIKLEDYESHPVIKFEVAV >gi|261748390|gb|ADAD01000038.1| GENE 16 12824 - 13294 466 156 aa, chain - ## HITS:1 COG:FN0809 KEGG:ns NR:ns ## COG: FN0809 COG0219 # Protein_GI_number: 19704144 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase (SpoU class) # Organism: Fusobacterium nucleatum # 1 153 1 150 150 217 69.0 6e-57 MNIVLLNPEIPYNTGNIGRTCVLTKTKLHLIKPLGFSLDEKAVKRSGLDYWKDVQLYVWE DFEHFFKENIENRNVNLYFATTKTKRRYSDVNYGKNDYVMFGPESRGIPEEILNRYKENN ITIPMLPLGRSLNLSNSAAIILYEALRQNDFEYEKI >gi|261748390|gb|ADAD01000038.1| GENE 17 13320 - 13889 879 189 aa, chain - ## HITS:1 COG:FN0214 KEGG:ns NR:ns ## COG: FN0214 COG0817 # Protein_GI_number: 19703559 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, endonuclease subunit # Organism: Fusobacterium nucleatum # 1 188 1 188 190 214 57.0 9e-56 MRILGIDPGTAIVGYSVIDFEKGKYKVLDYGCIYTDKDEDMPVRLEEIYDGLDNIIKLWE PSDMAIEELYFFKNQKTIVKVGQARGVITLVGQKNNMNIYSYTPLQVKMGIASYGRAEKK QIQEMVKIILGLEEIPKPDDAADALAIAITHINSKNGFGGFSRGDNITKKLEKLNSNKIK LSDYKKLMG >gi|261748390|gb|ADAD01000038.1| GENE 18 14040 - 14546 694 168 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0677 NR:ns ## KEGG: Lebu_0677 # Name: not_defined # Def: tRNA pseudouridine synthase B # Organism: L.buccalis # Pathway: not_defined # 17 168 172 355 355 118 47.0 8e-26 KINSIREIKINSENPKKISFYADVSSGTYIRTLVKDIGDKTGFYATMTKLVRTKIDKFSL EDAVDIGKINEYFNISKIKDKINLKEVEYIFDYEKVITDYEKYKKLKNGMTVLFKKNKFK DIEKINENKKYKLYLHNKTSKDKEFKGIVKIIKKGTDMIYIKRDKYFL Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:40:52 2011 Seq name: gi|261748384|gb|ADAD01000039.1| Leptotrichia goodfellowii F0264 contig00114, whole genome shotgun sequence Length of sequence - 4163 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
+ CDS 2 - 112 87 ## 2 1 Op 2 21/0.000 + CDS 127 - 1656 173 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 3 1 Op 3 2/0.000 + CDS 1656 - 2609 1438 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 4 1 Op 4 2/0.000 + CDS 2612 - 3292 882 ## COG0036 Pentose-5-phosphate-3-epimerase 5 1 Op 5 . + CDS 3304 - 4161 260 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase Predicted protein(s) >gi|261748384|gb|ADAD01000039.1| GENE 1 2 - 112 87 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no GELAVELLMDKILSRKSNVNNIKIPVNLIRRKSTEN >gi|261748384|gb|ADAD01000039.1| GENE 2 127 - 1656 173 509 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 275 487 15 219 245 71 25 1e-12 MNSNPLVEMKSITKKFGVVTALNEISFKINRGEVHVLLGENGAGKSTLMKILSGVYQPTS GKIIINGRDYDFLTPKLSFENKISIIYQELSVIDELSIQENLFVGKLPVKKVLGIETVDY QYMENVTKKILSKIGLDKDPYTTVEELTISEKQQVEIAKALAANADIIIMDEPTTSLTET ETNHLFKIIRQLKLEGKGIVYISHKLKELKEIGDIVTVLKDGKDVGTKDIKTVDIKDLVT MMVGREIKSRYDSNNPELDRKEIIFEVNDLTRSDEKVKDISFKIHKGEILGFAGLVGSGR SELMEAIFGVEPIKKGTIKMFGKESVPKNEYDSIKKGMGFVTENRRETGFFHNFDIKQNI SILPFIKSSKFKGTWGLTDNSKEKEYALKGKNQLNIKCASIDQNITELSGGNQQKVILSK WMMADSSLIIFDEPTKGIDIGSKSEIYTIIRKLSDEGKIIMMISSEMPELLAVCDKIAVF KEGRIAEILPIEKASEEAIMRILTSEGVE >gi|261748384|gb|ADAD01000039.1| GENE 3 1656 - 2609 1438 317 aa, chain + ## HITS:1 COG:alsC KEGG:ns NR:ns ## COG: alsC COG1172 # Protein_GI_number: 16131912 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Escherichia coli K12 # 5 309 19 323 326 319 65.0 4e-87 MKSEFNTLWQKYGTFGILIIVMVVFGIGQPALFFSFDNITQIILQSSVNILIACGEFFAI LIAGIDLSVGSVIALTGMFTGKLLVAGMNPIFAILIGGILAGALIGLLNGFLVNVTELHP FIITLGTQAIFRGVTLIISDARPVFGFSPLFAKMIAGRIVGIPIPAIIAIIVALFLGFLT AKTKLGRNIYALGGNKQAAWFSGIDVKLHTLIVFIISGTCAGIAGVVSTARVGAAEPGAG 
AGFETFAIAAAIIGGTSFFGGKGKIFGVVMGGLIIGVINNGLNLLTVPTYYQQIVMGGLI ILAVTADKIFGGKKKGE >gi|261748384|gb|ADAD01000039.1| GENE 4 2612 - 3292 882 226 aa, chain + ## HITS:1 COG:alsE KEGG:ns NR:ns ## COG: alsE COG0036 # Protein_GI_number: 16131911 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Escherichia coli K12 # 4 216 1 213 231 278 60.0 7e-75 MEKMKFSPSLMCMDLTKFKEQVDILNERADFYHVDIMDGHFVKNITLSPFFVEQLNKITK LPIDVHLMVEFPGDYIDELAKSGATYICPHAETINKDAFRIINKIKSLGCKAGVVLNPST PVEWIKYYIHLVDKITIMTVDPGFAGQPFIPEMLEKIKELKKIKEENNYSYLIEIDGSCN ERTFKRLTSAGAEVLIVGSSGLFNLEEDLVVAWDKMIEIFNREIAE >gi|261748384|gb|ADAD01000039.1| GENE 5 3304 - 4161 260 286 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 8 260 5 283 319 104 27 1e-22 MKEQKDYVLGIDIGGTNFRIGLVSQNYEVEEFQIKPILELQKGDFIDNLLKYIKFYTDLY REKIKAIGIGFPSIVSKDKKYVYSTPNIKNLDNINVTDTLEKKLDIPVYINKDVNFLMLK DVKENNIENDKIAIGLYIGTGFGNAIYINGKIIEGKHGVAGELGHIPVLGSNEVCACGNT GCIEAHASGKALKKICEENFRETDIDNIFSEHRDTKIMEDFIDTLSIPIVTEINILDPDY IIIAGGVPIMKDFPMDKLKKSIYKRARKPYPAEDLNIIVSNHDQKS Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:40:57 2011 Seq name: gi|261748382|gb|ADAD01000040.1| Leptotrichia goodfellowii F0264 contig00199, whole genome shotgun sequence Length of sequence - 304 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 303 307 ## COG3210 Large exoproteins involved in heme utilization or adhesion Predicted protein(s) >gi|261748382|gb|ADAD01000040.1| GENE 1 3 - 303 307 100 aa, chain - ## HITS:1 COG:FN0132 KEGG:ns NR:ns ## COG: FN0132 COG3210 # Protein_GI_number: 19703477 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 4 99 2033 2128 2462 88 52.0 3e-18 KANETTRDDHSSTNVYVESQTIDYALHPAKFKEDVGIAVLEGSATVEGALKKIDNILRGD DNSDISQSEKRRYEEIKENIIRFKTAPDMKLIAEGDLSDP Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:41:00 2011 Seq name: gi|261748369|gb|ADAD01000041.1| Leptotrichia goodfellowii F0264 contig00058, whole genome shotgun sequence Length of sequence - 8372 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 2, operones - 2 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 459 565 ## Lebu_1496 extracellular solute-binding protein family 1 2 1 Op 2 . + CDS 362 - 709 370 ## Lebu_1496 extracellular solute-binding protein family 1 3 1 Op 3 . + CDS 733 - 1644 999 ## COG1175 ABC-type sugar transport systems, permease components 4 1 Op 4 . + CDS 1680 - 1916 201 ## Lebu_1494 binding-protein-dependent transporters inner membrane component 5 1 Op 5 1/0.000 + CDS 1907 - 2509 931 ## COG0395 ABC-type sugar transport system, permease component 6 1 Op 6 . + CDS 2555 - 4774 3327 ## COG3345 Alpha-galactosidase + Term 4816 - 4867 10.1 + Prom 4891 - 4950 9.3 7 2 Op 1 . + CDS 5056 - 5553 569 ## Lebu_1055 hypothetical protein 8 2 Op 2 . + CDS 5572 - 5757 253 ## Lebu_1429 hypothetical protein 9 2 Op 3 . + CDS 5738 - 6148 513 ## Lebu_1427 hypothetical protein 10 2 Op 4 . + CDS 6163 - 6978 1115 ## COG2849 Uncharacterized protein conserved in bacteria 11 2 Op 5 . + CDS 7003 - 7914 979 ## COG1481 Uncharacterized protein conserved in bacteria 12 2 Op 6 . 
+ CDS 7941 - 8370 551 ## COG1663 Tetraacyldisaccharide-1-P 4'-kinase Predicted protein(s) >gi|261748369|gb|ADAD01000041.1| GENE 1 1 - 459 565 152 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1496 NR:ns ## KEGG: Lebu_1496 # Name: not_defined # Def: extracellular solute-binding protein family 1 # Organism: L.buccalis # Pathway: not_defined # 1 150 170 319 405 201 68.0 8e-51 LTKDGIYGLSAPLNFQEGFWNEVYQNEGYIIKDDKSGYNNPATQEAIQWWVDLSLKEKVS PLQKEFDEVEYVQMFTSGKVAMAQLGSWNLPRIEEDKEFAKKVGVTYLPRGKKQATIYNG LGYSVSAKTKYPEEAKKIPSIFSNRKSEFITG >gi|261748369|gb|ADAD01000041.1| GENE 2 362 - 709 370 115 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1496 NR:ns ## KEGG: Lebu_1496 # Name: not_defined # Def: extracellular solute-binding protein family 1 # Organism: L.buccalis # Pathway: not_defined # 15 113 304 402 405 130 60.0 2e-29 MDIQFLQKQNILKKLKKFLQFLATEKANLLQAKYVSAIPAYEGTQQAWVDHNKDMDLKIF IDQLKYGVVYPSASNGGKWRDLGNQIFAPVFAGKVDVKTATEEYAKKMNEMIEAK >gi|261748369|gb|ADAD01000041.1| GENE 3 733 - 1644 999 303 aa, chain + ## HITS:1 COG:BH1865 KEGG:ns NR:ns ## COG: BH1865 COG1175 # Protein_GI_number: 15614428 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 5 301 15 309 309 194 40.0 1e-49 MNNIKSNYMKNKKTENMVWGYFMITPTMLGLFILNILPIFQTLYLSFTKSGAFGKVTFIG LKNYVNLFQDKLVMQSFINTFIYTILTVPAGVFLSLITAVLLNAKIRGKTIYRTLFFLPV VSVPAAVALVWKWIFNSKFGIINTILTAIGLKGVDWLTDSKSAMIAIVIVGIWSMVGYNM IVIFAGLQEIPQTYYEAAEIDGAGPVKKFFSITLPLITPTLFFIIITTFIGCLQVFDLIY MMIGRANVVLPKVQSVVVLFYTYSFERNMKGYGSAIIMMLFLVILLITYIQLKLQKKWVN YIS >gi|261748369|gb|ADAD01000041.1| GENE 4 1680 - 1916 201 78 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1494 NR:ns ## KEGG: Lebu_1494 # Name: not_defined # Def: binding-protein-dependent transporters inner membrane component # Organism: L.buccalis # Pathway: not_defined # 1 72 2 73 277 89 61.0 5e-17 MKKYGYKDKNKIIVHILLIIGALFMIGPFIWTVLTSFKTLPESISVPPTILPKSYKFDNY KKAVQMLPFLIFTLIQWQ >gi|261748369|gb|ADAD01000041.1| GENE 5 1907 - 
2509 931 200 aa, chain + ## HITS:1 COG:lin0219 KEGG:ns NR:ns ## COG: lin0219 COG0395 # Protein_GI_number: 16799296 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 1 200 82 281 282 163 45.0 2e-40 MAVIVVRVIVSIFFAAMAAYAFARLKFPGRDVLFMLVLLQMMVPSQIFVIPQYLIAHKLG ILNSISALVLPGIVTAFGTFLLRQFFMGLPVELEEAAIIGGANQWQIFTQVMLPLARSGM ISLSIFTALFAWKELMWPLIVNTDLDKMPLSAGLAQLIGQFNTNYPVLMAGSVLAIIPMV VIFMIFQKQFISGVALNGGK >gi|261748369|gb|ADAD01000041.1| GENE 6 2555 - 4774 3327 739 aa, chain + ## HITS:1 COG:BH2223 KEGG:ns NR:ns ## COG: BH2223 COG3345 # Protein_GI_number: 15614786 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidase # Organism: Bacillus halodurans # 2 737 3 741 748 839 54.0 0 MIKYSEKSREFYLGNEHISYIIGILQNGHLGQYYFGKKVRYREDFSHLFEKIGGEPAITP STDGKGFSLDMIKQEYPSYGTGDYREPAFAVLQENGSRIVDFKYKSHSIKNEKKKLTGLP AVYANEDDKVETLEIILEDEVIDAELILSYSIFEKYSAIMRNVKIINKGNQKLNIEKIMS ASLDLPDSEYILTQLSGAWGRERYVTEREIKRGITVIDSKRGTSSALNNPFISLKRKNTT ETSGEIIGLNLIYSGNFSTTVEVDQYDVTRVNIGINHFDFNWLLEPGQEFQSPGAVLVYS DEGINGMSLTFHNLYKNNLIRGKYKNKAKPVLLNNWEGTYFNFNEEKILKIAEDAKKLGV ELFVLDDGWFGTRNDDFQGLGDWEPNLEKLPHGIKGLAEKVNQMGMKFGLWFEPEMVNKN SDLYRKHPDWIISTPDRFETPGRHQHTLDLSRKEVADYVYESVARNLRESNIEYVKWDMN RCMTEIYSKGRCACRQKETAHRYILGLYSVLEKITQEFPDVLFESCASGGNRYDPGMLYY MPQTWTSDNTDAVDRLKIQYGTSMVYPVPSMGAHVSITPNHQTSRNTSLDIRGNVAMFGT FGYEMDPKDLSEEEKEKVKKQIEEFKNYRELIAEGDFYRIKSPFEGNDTVWMMVSKDKKE ALVGYYRKSVEVNGGFKRVRLTGLNENLDYTVNKKNKGTLNKVGGDELMNVGLFIGEANT EHRDMQGDYYTELYYLKAE >gi|261748369|gb|ADAD01000041.1| GENE 7 5056 - 5553 569 165 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1055 NR:ns ## KEGG: Lebu_1055 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 2 165 1 181 181 73 32.0 4e-12 MLKKLAFFLLFLIGIISFSLESKDFNTPQAVIYNTNKKMPANFRFEMEEDNWEDTNYIIH VIKDGDKYKVAYSYLKANSPSYYDKDGYPRLDFVTKEFDKVYFTYFSAQPGTYVKEGKKI ERMTYETLTAEQLEEFLKERNAKKVDDNIQKRIKEHLLWLDSAAN 
>gi|261748369|gb|ADAD01000041.1| GENE 8 5572 - 5757 253 61 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1429 NR:ns ## KEGG: Lebu_1429 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 51 1 51 194 73 70.0 3e-12 MNKERLTAFTDAVLAIIMTILVLELEKPSEPTLRAIWELRTSYFSYTLSLFLARNNVGKS A >gi|261748369|gb|ADAD01000041.1| GENE 9 5738 - 6148 513 136 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1427 NR:ns ## KEGG: Lebu_1427 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 121 56 176 193 125 63.0 6e-28 MWVNLHNEWHKVEKISKSVVWWSVILLFFSSWFPYTTSFVNSYFYSSTAQVFYGIIVLAV TYVNIELSKALEKANENNKKLKEKTVKRRNWLHIDILIKIAGLIISVFIYPPAMMLSVFI TSILVLTVFTREKNRK >gi|261748369|gb|ADAD01000041.1| GENE 10 6163 - 6978 1115 271 aa, chain + ## HITS:1 COG:FN2119 KEGG:ns NR:ns ## COG: FN2119 COG2849 # Protein_GI_number: 19705409 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 34 271 43 282 338 77 28.0 2e-14 MKKNYIILTALLFLMAVISYGQDVYADYRIHFNKVYITNYVQTENDNLKLKRVSDSTYEI SNSGNFRIHDAGGRQLDVTLKGGIINGQYNEYYSNGNMFTTGKYDDGKKEGLWKIYTESG LLWKSYEYKNNELNGKYILYYASTGEKETVGNYKNDKLDGAWNEYYSNGNRRKSGDYQDG KKNGIFTEWFSNGIKKSVINYADDEINGKMNVYYENGTLFYEADIVGREGNVKGYYQSGN VSFEGKVSDNKRRGTWNFYDNAGNLSNKVEY >gi|261748369|gb|ADAD01000041.1| GENE 11 7003 - 7914 979 303 aa, chain + ## HITS:1 COG:FN0706 KEGG:ns NR:ns ## COG: FN0706 COG1481 # Protein_GI_number: 19704041 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 299 1 294 299 216 47.0 4e-56 MSFSATLKRELFNKEIENKEEIYAELFGIFISKDIISEKGVNFSTENVSLAKRVYSNLKV ITEMDIYLKYSMSNRLGAHKIYEVKIIVTKNNKKEYDKLLKKLFSHKNFSEEKTEHQLAG IIRGFFISCGYVKSPEKGYALDFFIDTEDSATFLYYLFKQMGKKVSHTDKKSKSLVYLRN SEDILDIIFLIGGVTSFFEFEEVTINKEIRNKINRNMNWEIANETKKLSTSEKQIKMIKY IDSEMGLSELSGILRETAEIRLENEEMSLQELADLLEVSKSGIRNRFRRIEEIYNSLRES NGD >gi|261748369|gb|ADAD01000041.1| GENE 12 7941 - 8370 551 
143 aa, chain + ## HITS:1 COG:FN1130 KEGG:ns NR:ns ## COG: FN1130 COG1663 # Protein_GI_number: 19704465 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Tetraacyldisaccharide-1-P 4'-kinase # Organism: Fusobacterium nucleatum # 14 143 5 134 325 155 56.0 2e-38 MKFLSLLYGFAVFIRNKLYDLKFLKTRGVENVEIICIGNIVAGGSGKTPAVQYFVRKYLS EGEKVGVLSRGYKGKRDKDTMLVRNEKEIVAKPSKSGDEAYLHALNLQVPVVVSKDRYEG AVYLRDKCSVDFIIMDDGFQHRK Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:41:15 2011 Seq name: gi|261748367|gb|ADAD01000042.1| Leptotrichia goodfellowii F0264 contig00153, whole genome shotgun sequence Length of sequence - 283 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 282 294 ## gi|262037608|ref|ZP_06011060.1| hypothetical protein HMPREF0554_2166 Predicted protein(s) >gi|261748367|gb|ADAD01000042.1| GENE 1 3 - 282 294 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037608|ref|ZP_06011060.1| ## NR: gi|262037608|ref|ZP_06011060.1| hypothetical protein HMPREF0554_2166 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2166 [Leptotrichia goodfellowii F0264] # 1 93 1 93 93 104 100.0 3e-21 AEKAKKRQEEERKKQEEKNKEIAEISKKYIGRRIVSKEDYDYLIYLSKNPDLRSEYKDYN IKLINTLEGKYLIEESLGNPKAKASKEWFEEKY Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:41:26 2011 Seq name: gi|261748354|gb|ADAD01000043.1| Leptotrichia goodfellowii F0264 contig00211, whole genome shotgun sequence Length of sequence - 12536 bp Number of predicted genes - 12, with homology - 11 Number of transcription units - 3, operones - 2 average op.length - 5.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 110 - 3277 3630 ## COG1074 ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) 2 1 Op 2 . 
- CDS 3270 - 6029 2863 ## Lebu_1798 hypothetical protein 3 1 Op 3 17/0.000 - CDS 6074 - 6829 191 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 - Prom 6887 - 6946 4.8 4 1 Op 4 21/0.000 - CDS 6979 - 8001 1509 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components - Prom 8198 - 8257 5.1 5 1 Op 5 3/0.000 - CDS 8394 - 9167 878 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 6 1 Op 6 . - CDS 9136 - 9444 544 ## COG0011 Uncharacterized conserved protein - Prom 9681 - 9740 11.8 + Prom 9701 - 9760 11.1 7 2 Tu 1 . + CDS 9831 - 10691 763 ## gi|262037620|ref|ZP_06011071.1| conserved hypothetical protein + Term 10709 - 10750 2.2 - Term 10697 - 10738 3.1 8 3 Op 1 . - CDS 10752 - 11414 894 ## Lebu_1386 hypothetical protein 9 3 Op 2 . - CDS 11494 - 11856 248 ## gi|262037612|ref|ZP_06011063.1| glutamine ABC transporter, permease protein 10 3 Op 3 . - CDS 11918 - 12040 62 ## 11 3 Op 4 . - CDS 12037 - 12267 326 ## gi|262037619|ref|ZP_06011070.1| hypothetical protein HMPREF0554_2420 12 3 Op 5 . 
- CDS 12321 - 12524 306 ## gi|262037621|ref|ZP_06011072.1| conserved hypothetical protein Predicted protein(s) >gi|261748354|gb|ADAD01000043.1| GENE 1 110 - 3277 3630 1055 aa, chain - ## HITS:1 COG:FN1149 KEGG:ns NR:ns ## COG: FN1149 COG1074 # Protein_GI_number: 19704484 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) # Organism: Fusobacterium nucleatum # 5 1054 8 1053 1056 331 31.0 4e-90 MSKKILKASAGTGKTYRLSLEYIANLIKGISYKNIIVMTFTKKATAEIKDRIYDFLYQIA FEKYKFEELEKSLKEIYGFQGGEIDKNSLQNIYFEMIKNKDEIRIYTIDGFTNQIFKNTI APFFGIYGYETLDEEDDGFYEDILVKILNNNEYFEKFSFVFEEKKERKDIKKYVKFIKNI INIRKDFILAGNYKIENDKKANTKFVDYLEEIFDMIGSVAENKDGNVKDFVNTDFRSIYD EIKGVDERISKDKIENKREKIEIIQKNHELFFSGKNIWNGSKIKGKSVENIIYEMKESRD LLSKSFSDYIFVNEVIPLHEKICEAANMIYNIAEEMKFSTKRFTHDDISVYTYKFIFNEE LGFIKDKKVTSDFLELIGGNVETVMIDEFQDTSVLQWKILKLLLNSAKNIICVGDEKQSI YHWRGGEKELFEKLETLINGTVENLDKSYRSYKKVIENVNRIFEKYSEEWSYNPVKYRDD EEYSKGYFEYYLQERKPSYKGDEIRPEAAYEKAIEMIKEGEIQNLGKTCIICRTNSHLNE IAERLNKENIPYTLNSSFSLLEHDVIKPLYKLIKYFVFNNYIYLLEFMRSDLIGCLNSHV KYMLENKHEIERYIREIEEEKFSDFVNSQSENEILPEYKEIDEMKRNNLLFSDILHKIKK LKNLSRSLNSKYLKENFSKKLAEEFYVTDFYSTKSDIKNIFKFFNILKEYSDLFEFVTYI EDEKDKLKQLSSEDSDAVNLMTIHKSKGLEFDTVIYYKKESSGKDNDKDLKVFFDYDEKF EKINKFLVTFPKYEKTFIDSEYSEINEKSEQKEKMESINNDYVALTRAKKNLLLFFEKVV SSKGEVKGELVKRIIDVYKSEIWHSSGKISESKKKEIKISENADFEGLKDIMSYFEDNVL KYTASKYETDLEGEFKRKKGLAMHYYFEHMTNDFENDRKNAESAFLSRYGNMLGKKIITE LLERMKKFIFDNKEIYNAKYKVYTEFEICDRENNKRIIDRINIDEENRKIFIYDYKTGYE PTENEKYKQQLEEYKQILFEKTNGEYEIFTQILEV >gi|261748354|gb|ADAD01000043.1| GENE 2 3270 - 6029 2863 919 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1798 NR:ns ## KEGG: Lebu_1798 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 919 1 955 957 751 52.0 0 MNIKYFGLGSDLKEILFEEFKKNEEILYVFENTSSFFEIKREYLKSGEKLFYNFKLLKLY DFYEKLFQTDKIVLKEEKQVVLFYNSLTDEIKKELKINNYYDVIDIAYNFYGLFSELQEY 
KIDLNKIEIEKWQKNTFETLLKINGQINEKSDEKGLILPYMLRKSENISENFLKKYKKIC FVNKVKFTPLEKEIFDVIEKKGIEIENILQLDKKDFNEEKLKISETFGLPDKEEFEKKYN VNVEIHEFENKFAQLLGIVKKLDKELKKERLSTVENKSLYKIYDAQDENSDNEKDYHLLN QNKITYNLEITMQKTKIYKTLDLIYNVLENIKVVFKSKEDKICLFRMKELHDAFKSRDFL KTFGLKENYRLFQDLVLDDYKYISCEKLKELSENVDRYKHMENEVKSFVNFLEKIEEIYS YKTLSEYAEFLEKIFVQSGEKDVNIRDKYFEALSEMTVLEDLSFDDLWTDFFGLNMSASL LKLFLKYLDKKAVSLDLEKIEEQENERKYTINPFSSISETAKENIMFLNIQDSFPKVKIN NYLFSKIQRAKMGLPISDEEKQIEIFKFYQNVLGAKNIYLSYVKNIDEKIDSAGVIEEIK LKYGIEPVKNEISEEEELNFVKEYFNKSKIQRKIGKFIKSKLEKNKEKMRSEDLSLGFYD FEKMKDFEYGFYIEKMTGEKEIEKIDDKIEALMFGNIIHVLYEKIVMENKEALEKGEFTV SNGKIEKTLNRILDSLEYKIPKEYIIFYRKISFTEIVKSTERFLKELAQYLLMKRNIKIH SEEKIKKKAEKDIAENVRISGVVDLYIETEDSEILVDYKSGKDNQQNKNRAFNQLDYYSV ILPENENKKTEKWVINAWTGEKNTDEDRKENDILGKEDIKEVIENYYKTDYYDLGERKET YISRVYSDIARREEELGDE >gi|261748354|gb|ADAD01000043.1| GENE 3 6074 - 6829 191 251 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 1 209 1 232 563 78 25 3e-14 MSTEKKIKLETKNVSVKFEDKKIIENISVKLYENELVCILGLSGVGKSTLFNVIAGLISP DEGSVEMNGQNITGKSGNISYMLQKDLLLPFKTVIDNVSLPLVIKGESKKTAREKAEKYF EQFGLEGTQKKYPSQLSGGMKQRAALLRTYMFSREIALLDEPFSALDAITKHKMHLWYLD IMKQIKMSTLFITHDIDEAILLSDRIYILAGSPGQITAEIKIEIDRNGNNQEIEMSEKFL KYKKEILEYLK >gi|261748354|gb|ADAD01000043.1| GENE 4 6979 - 8001 1509 340 aa, chain - ## HITS:1 COG:SP2197 KEGG:ns NR:ns ## COG: SP2197 COG0715 # Protein_GI_number: 15902004 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Streptococcus pneumoniae TIGR4 # 33 340 29 335 335 298 49.0 1e-80 MKKFVKILLLCILGILAISCGKSESQNKTEDKAKTGNESPSKKISIVLDWTPNTNHTGLF VAKELGYFKEEGLENVEIVQPPEGSTTALIGAGGVQFGISFQDTLAKSFSTDAPVPVTAV AAIIQHNTSGIISLKDKGIDSPKKMENHKYATWDDEIEKAILKKIITDDGGDFNRVKMIP NTVTDVVTALQTDIDAVWVYYAWDGIATELAGLKTNFLNFADYGKELDYYSPVIITNNDY LKKNPEEAKKVLRAIRKGYEYAIANPEEAAKILVKNAPELKPELAIASQKWLATRYKADA 
TEWGVIDANRWDTFYEWLFKNGLVKKEIPKGYGFSNEYLK >gi|261748354|gb|ADAD01000043.1| GENE 5 8394 - 9167 878 257 aa, chain - ## HITS:1 COG:FN0237 KEGG:ns NR:ns ## COG: FN0237 COG0600 # Protein_GI_number: 19703582 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Fusobacterium nucleatum # 25 255 1 231 241 224 50.0 1e-58 MSKKSRNILNKITDKTAPFIIIVFILVLWQILSMTGIVPKFMLPSPFDVIKAFFSDFGLL MHHTVITLTEAFLGLGLGIILGFVTAVIMDRFEFAYKAIYPVLVITQTVPTVAIAPLLVL WLGYGIMPKITLIVITSFFPITVGLLDGFKSADKDALNLLKTMGATPFQNFIHIKFPSSI GYFFAGLRISVSYSIIGAVVAEWLGGFDGLGVYMTRVRKSYSFDKMFAVIFFVSAISLFL MYAVKKIQKVAMPWERI >gi|261748354|gb|ADAD01000043.1| GENE 6 9136 - 9444 544 102 aa, chain - ## HITS:1 COG:SP2199 KEGG:ns NR:ns ## COG: SP2199 COG0011 # Protein_GI_number: 15902006 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pneumoniae TIGR4 # 5 102 1 93 96 65 43.0 2e-11 MSKEIDASIAIQVLPNVQGNEEIVRIVDEVIAYIKSKNLNTHVGPFETTIEGKYEELMEI VKECQIIAIKAGAPGVMSYVKINYKPKGDVLTIEQKISKHSE >gi|261748354|gb|ADAD01000043.1| GENE 7 9831 - 10691 763 286 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037620|ref|ZP_06011071.1| ## NR: gi|262037620|ref|ZP_06011071.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 286 1 286 286 468 100.0 1e-130 MKKSQILKMFSKYRITTSILIFGIIIWIAYLILNSIGVPEFKNKWKQVCEHNYCNFSEYG YYETAEQNQPNKENCFIFGDCMDLNYTYITKNIIITSSKKCYQSDSDKVLVGNVEILTRN GDKYFASEIIIEGKNILIPGNVKIYRKNKIENLNNISLRNFFPFYNNTFLRLFSEYKNFK LHRDSYDYYSPDTYRSHKFCENNDKIIYRETDNENKKKNTSVETENFKNRGYYYSPNINN NSDKKDRNQLEEEFSSDHVSEYEREILEDYDSIDSGETPYDGDDGY >gi|261748354|gb|ADAD01000043.1| GENE 8 10752 - 11414 894 220 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1386 NR:ns ## KEGG: Lebu_1386 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 213 1 209 211 116 36.0 8e-25 
MRKIRLITGLLMIVFSLTVYSKNEYNIENLKKNDLLIKETLEKCKKNQFTKKDEACINVQ KVQNELAREIWDKNKNEIKNRLVKLYSEIDAGNFDNIFSEMPPKFLKFLSEQASMSEKDL KMVAKEVAKEAFDKYGARTENDKKFKEIKVGRTSVGRTYAIISHKIRVSVNNEVINSEEY ILAFEDRGKWYSMSLNKNFFIIFEIYPDFEEIEFSPSYFE >gi|261748354|gb|ADAD01000043.1| GENE 9 11494 - 11856 248 120 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037612|ref|ZP_06011063.1| ## NR: gi|262037612|ref|ZP_06011063.1| glutamine ABC transporter, permease protein [Leptotrichia goodfellowii F0264] glutamine ABC transporter, permease protein [Leptotrichia goodfellowii F0264] # 1 120 1 120 120 130 100.0 4e-29 MILLFGLTFYKLPYMGLYLLSYFAAVIIFIIIKGKNDPKILRYILFAITLVNFVNVFYFL FSYYAYFIIYQFIIIYLAVKKDKSNNKFENYMYMTLSFNDFMNISVFSSLILRMMANGVI >gi|261748354|gb|ADAD01000043.1| GENE 10 11918 - 12040 62 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRYLLENLFGGYTMRYLLSNLFSNIRTIQIIFKFYKGNFF >gi|261748354|gb|ADAD01000043.1| GENE 11 12037 - 12267 326 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037619|ref|ZP_06011070.1| ## NR: gi|262037619|ref|ZP_06011070.1| hypothetical protein HMPREF0554_2420 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2420 [Leptotrichia goodfellowii F0264] # 1 76 1 76 76 115 100.0 9e-25 MLTEEEKDSILSENIMSYGVENENFWLKSLKRAKNFDEEKSKSLYNKLKKLPFVAFNYSG SYREIDNLILKIKELV >gi|261748354|gb|ADAD01000043.1| GENE 12 12321 - 12524 306 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037621|ref|ZP_06011072.1| ## NR: gi|262037621|ref|ZP_06011072.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 67 1 67 67 114 100.0 3e-24 MIKGVPESIRIGVGTLSKCDELKYLVKDKNLLNKKDDELFELDVKYEIEWDELAEIYNCE VKVRGII Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:42:18 2011 Seq name: gi|261748353|gb|ADAD01000044.1| Leptotrichia goodfellowii F0264 contig00081, whole genome shotgun sequence Length of sequence - 258 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, 
operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 258 167 ## Smon_0558 hypothetical protein Predicted protein(s) >gi|261748353|gb|ADAD01000044.1| GENE 1 3 - 258 167 85 aa, chain - ## HITS:1 COG:no KEGG:Smon_0558 NR:ns ## KEGG: Smon_0558 # Name: not_defined # Def: hypothetical protein # Organism: S.moniliformis # Pathway: not_defined # 2 83 156 237 407 111 67.0 1e-23 VYFTDSDIIHYGLISIIHTFGRDLKWNPHIHAIVSLGGFNKNFDFKKLDYFNVDTIAAQW KYHVLDIISKGNYPNQKIKRLAKIT Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:42:21 2011 Seq name: gi|261748351|gb|ADAD01000045.1| Leptotrichia goodfellowii F0264 contig00137, whole genome shotgun sequence Length of sequence - 422 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 421 562 ## gi|262037624|ref|ZP_06011073.1| conserved hypothetical protein Predicted protein(s) >gi|261748351|gb|ADAD01000045.1| GENE 1 1 - 421 562 140 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037624|ref|ZP_06011073.1| ## NR: gi|262037624|ref|ZP_06011073.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 140 1 140 140 233 100.0 5e-60 DKIRANYKKEVKDDLIAEGKGKLSDQWIVKKWLNDGHGAYTYYSAQLAKDIDGLLRKYQN TGDETAKKKIQEDIENKYIENQERLIELFTQGPALMNDSKGLLAGLGEYNRVENERTYRE GKYQEKNLPQNYITNSLNES Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:42:30 2011 Seq name: gi|261748348|gb|ADAD01000046.1| Leptotrichia goodfellowii F0264 contig00060, whole genome shotgun sequence Length of sequence - 905 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 456 651 ## COG0693 Putative intracellular protease/amidase - Prom 490 - 549 9.6 Predicted protein(s) >gi|261748348|gb|ADAD01000046.1| GENE 1 3 - 456 651 151 aa, chain - ## HITS:1 COG:CAC1629 KEGG:ns NR:ns ## COG: CAC1629 COG0693 # Protein_GI_number: 15894907 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Clostridium acetobutylicum # 4 151 2 148 188 105 44.0 4e-23 MDHKVVLFLVEGFEEIEAMAPIDLMRRAGITVDTVSITKEKNVISSRGVTVLTDKVIDEI NFEEYEMIVLPGGPGTDNYMKSDKLREKLKEFSIEKKVAAICAAPTILSALGLLKGKKAV CFPSCESEIIKDGAILEVKNVVKDGNIITGR Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:42:31 2011 Seq name: gi|261748347|gb|ADAD01000047.1| Leptotrichia goodfellowii F0264 contig00073, whole genome shotgun sequence Length of sequence - 286 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:42:31 2011 Seq name: gi|261748345|gb|ADAD01000048.1| Leptotrichia goodfellowii F0264 contig00101, whole genome shotgun sequence Length of sequence - 340 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 95 - 147 1.4 1 1 Tu 1 . 
- CDS 172 - 339 239 ## COG0776 Bacterial nucleoid DNA-binding protein Predicted protein(s) >gi|261748345|gb|ADAD01000048.1| GENE 1 172 - 339 239 55 aa, chain - ## HITS:1 COG:HI0430 KEGG:ns NR:ns ## COG: HI0430 COG0776 # Protein_GI_number: 16272378 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Haemophilus influenzae # 2 53 84 135 136 60 63.0 8e-10 AKGETVQFVGWGTFGVQKRAARTGRNPQTGKEIKIAAKKVVKFKVGKKLADKVAK Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:42:41 2011 Seq name: gi|261748311|gb|ADAD01000049.1| Leptotrichia goodfellowii F0264 contig00076, whole genome shotgun sequence Length of sequence - 28302 bp Number of predicted genes - 34, with homology - 33 Number of transcription units - 8, operones - 5 average op.length - 6.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 220 321 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases 2 1 Op 2 . + CDS 300 - 1115 1078 ## COG0169 Shikimate 5-dehydrogenase 3 1 Op 3 . + CDS 1134 - 2000 1115 ## Lebu_1669 hypothetical protein 4 1 Op 4 . + CDS 1987 - 2688 989 ## Lebu_1668 hypothetical protein 5 1 Op 5 . + CDS 2704 - 2892 256 ## gi|262037651|ref|ZP_06011096.1| conserved hypothetical protein 6 1 Op 6 2/0.000 + CDS 2903 - 3421 516 ## COG0703 Shikimate kinase 7 1 Op 7 . + CDS 3418 - 3849 525 ## COG0757 3-dehydroquinate dehydratase II 8 1 Op 8 . + CDS 3878 - 4327 727 ## Sterm_2366 HutP family protein 9 1 Op 9 . + CDS 4343 - 5011 945 ## COG0637 Predicted phosphatase/phosphohexomutase 10 1 Op 10 . + CDS 5042 - 5977 1259 ## COG3621 Patatin + Prom 5982 - 6041 8.9 11 2 Op 1 17/0.000 + CDS 6066 - 7319 1665 ## COG0448 ADP-glucose pyrophosphorylase 12 2 Op 2 . + CDS 7334 - 8719 1870 ## COG0297 Glycogen synthase 13 2 Op 3 . 
+ CDS 8736 - 9272 466 ## PROTEIN SUPPORTED gi|116629704|ref|YP_814876.1| acetyltransferase + Prom 9459 - 9518 9.1 14 3 Op 1 7/0.000 + CDS 9539 - 9982 547 ## COG1846 Transcriptional regulators 15 3 Op 2 . + CDS 9966 - 11333 1606 ## COG0534 Na+-driven multidrug efflux pump 16 3 Op 3 . + CDS 11349 - 11624 405 ## Sterm_0542 hypothetical protein 17 4 Tu 1 . - CDS 11725 - 13350 1956 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 13454 - 13513 12.2 + Prom 13432 - 13491 8.7 18 5 Tu 1 . + CDS 13575 - 14921 2018 ## COG1455 Phosphotransferase system cellobiose-specific component IIC + Term 14950 - 15002 2.2 + Prom 14998 - 15057 10.6 19 6 Op 1 . + CDS 15084 - 15854 1216 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family 20 6 Op 2 . + CDS 15924 - 16418 885 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 21 6 Op 3 . + CDS 16454 - 16879 449 ## Lebu_0872 hypothetical protein 22 6 Op 4 . + CDS 16889 - 17578 837 ## COG1451 Predicted metal-dependent hydrolase 23 6 Op 5 . + CDS 17593 - 18498 773 ## CTC01162 membrane associated protein + Prom 18504 - 18563 10.3 24 6 Op 6 . + CDS 18583 - 19179 783 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family + Term 19212 - 19262 5.2 25 7 Tu 1 . - CDS 19216 - 19425 86 ## - Prom 19504 - 19563 9.3 + Prom 19456 - 19515 13.0 26 8 Op 1 . + CDS 19558 - 20286 785 ## Lebu_0800 CRISPR-associated protein Cas6 27 8 Op 2 . + CDS 20305 - 22125 1387 ## Lebu_0799 CRISPR-associated Cst1 family protein 28 8 Op 3 9/0.000 + CDS 22127 - 23044 1028 ## COG1857 Uncharacterized protein predicted to be involved in DNA repair 29 8 Op 4 6/0.000 + CDS 23049 - 23789 687 ## COG1688 Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) 30 8 Op 5 6/0.000 + CDS 23817 - 26249 2391 ## COG1203 Predicted helicases 31 8 Op 6 . 
+ CDS 26276 - 26569 236 ## COG1468 RecB family exonuclease 32 8 Op 7 12/0.000 + CDS 26524 - 26763 172 ## COG1468 RecB family exonuclease 33 8 Op 8 13/0.000 + CDS 26796 - 27788 1210 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair 34 8 Op 9 . + CDS 27763 - 28068 205 ## COG1343 Uncharacterized protein predicted to be involved in DNA repair + Term 28227 - 28284 -0.9 Predicted protein(s) >gi|261748311|gb|ADAD01000049.1| GENE 1 2 - 220 321 72 aa, chain + ## HITS:1 COG:lin2436 KEGG:ns NR:ns ## COG: lin2436 COG1187 # Protein_GI_number: 16801498 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Listeria innocua # 1 71 164 232 233 75 59.0 2e-14 AKVEVIENSEDKVYITITEGKYHQVKRMFKAVNNKVLYLKRVQMGNLKLDDKLKVGKYRE LTEKEINILKNN >gi|261748311|gb|ADAD01000049.1| GENE 2 300 - 1115 1078 271 aa, chain + ## HITS:1 COG:CAC0897_2 KEGG:ns NR:ns ## COG: CAC0897_2 COG0169 # Protein_GI_number: 15894184 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Clostridium acetobutylicum # 4 269 7 271 273 187 42.0 2e-47 MEKFGLLGEKLGHSFSKEIHKAFFKTIQKEAEYSLIEKKYEEIPNFLEELRSGEYKGINV TIPYKVEIMQYLDEISPVAKEIGAVNTVSVKNGKLIGDNTDYFGFIKTLKIKNIKIEGSK VLILGTGGAAKSVYNALIDEGAEHIYVATISENDPFKVRKYDRLLNYSEIRNIKSVNLIV NCTPVGMYPEINMSPLEEVNLIQTDYLVDLIYNPEETVLMKKYRLKGAECVNGFMMLISQ AIKSEEIWNDKIYNENILKEIYNKLVKKLYK >gi|261748311|gb|ADAD01000049.1| GENE 3 1134 - 2000 1115 288 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1669 NR:ns ## KEGG: Lebu_1669 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 279 1 273 278 196 44.0 1e-48 MIKKMIMTVFAVLVSSTGFAITNEQIIAQGAATQKAIFDKYYNRTPSGTSRKDGNLLNSV FPEVITNLDGINRGIYDKEFNRLGEAKEHKRKSTFRKMYVQFNDYILENSKFSRNVFSNF LQEQDDLQAYLYTNIYLSIEAFNLNMNTYLEGRKNSSTIDSNVKTVIDYLYYKGDEKSQE DYMNMSNRELAAIVDKEYMNLNDALDKRIAEGTQVEKKSGNSVKSSFKKLEKLYRQYDKS 
FEEYIDGSELKNEDKEKIKKLVKFENIASLKFMVKSLEKVEGEESEQQ >gi|261748311|gb|ADAD01000049.1| GENE 4 1987 - 2688 989 233 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1668 NR:ns ## KEGG: Lebu_1668 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 7 233 36 269 269 216 51.0 6e-55 MSNNNDIYDEMDNFCAEVLSPEGLLNYMRVRKEYFFEPEEAVEKYFGDSEYKKEIATFGD FFYYYLAKYEKTYLYTFLEKGFTKKFKKLLEDHDIDPKTMDIDWLGMETKEKKYKESLFD ILYAMINYELKKHGLVMFGLNIGLESALYFIVPEDAYTRIDRKAELYTIFDLEYLETIYN EIFEVKRDLGVKGLQVGDFIEKNGQEYCSLFLENNVVIKNINEDDESEVILIL >gi|261748311|gb|ADAD01000049.1| GENE 5 2704 - 2892 256 62 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037651|ref|ZP_06011096.1| ## NR: gi|262037651|ref|ZP_06011096.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 62 1 62 62 92 100.0 1e-17 MFDFEKGDINMMLSLLNMKLRDEFSDLERLADYYSADKNVILKRMEENGYFYNNEINQFK KI >gi|261748311|gb|ADAD01000049.1| GENE 6 2903 - 3421 516 172 aa, chain + ## HITS:1 COG:CAC0898 KEGG:ns NR:ns ## COG: CAC0898 COG0703 # Protein_GI_number: 15894185 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Clostridium acetobutylicum # 3 161 2 161 165 120 47.0 1e-27 MKNIILIGMPASGKTTVGKLLAKKINYEHYDADRYLEKNEEKRISEIFSEKGEEYFRNLE EKYLKELSEKNGVIISTGGGAVKRKENMEELKKNGIIVFLSRKVDDIAKENHKYRPLLQN IENIHKLYGERIELYRKYADITVENDDTLSNVVNKTAKILKEKKFIYQGDSE >gi|261748311|gb|ADAD01000049.1| GENE 7 3418 - 3849 525 143 aa, chain + ## HITS:1 COG:CAC0899 KEGG:ns NR:ns ## COG: CAC0899 COG0757 # Protein_GI_number: 15894186 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate dehydratase II # Organism: Clostridium acetobutylicum # 3 140 2 138 144 172 65.0 2e-43 MKKITVINGPNLNFLGIREKEVYGSENYESLCEYIENYCSEKNIEVEIFQSNSEGQIIDF LQKAYYEKVDGIVINPGAYTHYSYAIFDALKSVSIPTVEVHLSDIYNREDFRKVSVTAPA CVKQIYGKGKDGYIEAIKFLVEK >gi|261748311|gb|ADAD01000049.1| GENE 8 3878 - 4327 727 149 aa, chain + ## HITS:1 COG:no 
KEGG:Sterm_2366 NR:ns ## KEGG: Sterm_2366 # Name: not_defined # Def: HutP family protein # Organism: S.termitidis # Pathway: not_defined # 1 147 1 147 148 214 76.0 1e-54 MEFYDNKSVDVCRIALKMAISSREEEKKLIKEYKLKGIRVAAVDVGGTMPNSRFKFIESA LVAAKRNNIIKDEHVHDGAVIGAMREAISQIETNINGLSVGGKIGLARNGEHLGVAIFLS VGILQFNEVITSVAHRSVAVLPDENDFIE >gi|261748311|gb|ADAD01000049.1| GENE 9 4343 - 5011 945 222 aa, chain + ## HITS:1 COG:alr0288 KEGG:ns NR:ns ## COG: alr0288 COG0637 # Protein_GI_number: 17227784 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Nostoc sp. PCC 7120 # 13 221 9 217 222 135 32.0 6e-32 MSRKLFEELDLFVFDMDGLLFDTETVYVNYGRELSEEKGYIITNEFVEKTTGMTVEEAKN MYFEEFGKDFPYAEISGKVYKYIIEQAEKANIPLMKGAADFLERLHNNKKTLVLATSADR LMATTLIENKGLKKYFSHIITANDVKKGKPDPEVFLLAADKAGISPEKAAVFEDSFNGIR AAHSAGMYPIMIPDKIQPNEEIKKILHKKFNDLTEVIKYFEL >gi|261748311|gb|ADAD01000049.1| GENE 10 5042 - 5977 1259 311 aa, chain + ## HITS:1 COG:VC0178 KEGG:ns NR:ns ## COG: VC0178 COG3621 # Protein_GI_number: 15640208 # Func_class: R General function prediction only # Function: Patatin # Organism: Vibrio cholerae # 1 206 10 244 355 117 34.0 4e-26 MDDSFKILALDGGGARGLFIVSTLKQIEERYNIKYYEYFDLIIGTSTGSIIAAALSSGID IDEVEKLYIEEMDKIFKKDLLKNGIIQSKYDNKYLEKVLKRVLKNKTFENVKTDLMITTT NIVNGEPVLIKNKDTKNMKIVEAILASCAAPVFFDPLVMDEKRIFTDGGLWANNPSLAAI SEALSKTGYNRKIEDIKMLSIGTGEEIFDHKYENKQWGIVNWAMPLIKIVLQLNSKSTHN IVSGLLSENQYVRLDYHAESILDIDTVDKDVQKKSKKAVNENKEKLNKFFEKKAKKVGIF QRIFKKKDKNI >gi|261748311|gb|ADAD01000049.1| GENE 11 6066 - 7319 1665 417 aa, chain + ## HITS:1 COG:CAC2237 KEGG:ns NR:ns ## COG: CAC2237 COG0448 # Protein_GI_number: 15895505 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Clostridium acetobutylicum # 2 389 5 379 380 354 48.0 2e-97 MEILAMILAGGRGSRLDILSEKRVKPSVPFAGKFRIIDFALSNCSNSGIYDVALLTQYLP LSLNEHIGSGKPWDFDRRDSSITMLQPHEKPDGNNWYQGTADAIRQNIEFIKSKNPKYVL 
ILSGDHIYKMNYNWMLADHVKSNAELTIAVQKVPIEDAGRFGIFEVDENKKILNFEEKPK EPKSNLASMGIYIFNTNVLLEYLESLENPDLDFGNHVIPAMIEDDRKVFVHTYDSYWMDV GTYDSYLEANLDLIKKSEEVGINLYDKDWKIYTRSEDLAPVRIGVTGSVLNSLICNGCKI EGRVENSVLGPGVTVRKGSTIKNSIIFSGSYIDENTHLDTMILDKRVYVGKNSLLGHGDD FTANKAKPDLLAKGISVIGKHSRLPEGSIVGRNVRIFGNVELNEGNKIVESGETIEK >gi|261748311|gb|ADAD01000049.1| GENE 12 7334 - 8719 1870 461 aa, chain + ## HITS:1 COG:CAC2239 KEGG:ns NR:ns ## COG: CAC2239 COG0297 # Protein_GI_number: 15895507 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen synthase # Organism: Clostridium acetobutylicum # 1 455 1 473 477 379 45.0 1e-105 MKIVYLASEVFPFFKTGGLADVMYVLPKEMQELGHKVSIIMPKYDKIPLKYLEKMEWVAR LESHGDIFNLVKYPDDKIDFYFIENKALYERGRVYGDNDEDVQYAMFSELALRFLKEIDL QPDILHCNDWQTGPVPYFLNVRYNQDPFYWDMRTVYTIHNLMYQGRFSKYSFERMGYFTD GHDLNFMQIGIGYADIVNTVSPTYAEEIKYPYFSEGLEGITNNKHIYGVLNGIDVDYFNP ETNEDIIHFNKNLLKKKKENKYQLQEKLGLPKSDNMLISMVTRLVEGKGLDLVSLVLENL LQYDAVQIVILGSGDKFYEDYYNYLTMKYPEKFKVYLGYNPNLANELYAGSDAFLMPSRY EPCGLSQMIAMRFGTIPIVRETGGLKDTVKPYNIYTDEGNGFSFTNFNADDMLFTIKVAE GMYYDKPEIWEKLVKRNMDIDFSWDKSAKEYLKLYELVKSW >gi|261748311|gb|ADAD01000049.1| GENE 13 8736 - 9272 466 178 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116629704|ref|YP_814876.1| acetyltransferase [Lactobacillus gasseri ATCC 33323] # 1 174 1 176 181 184 46 7e-46 LKKIEIKPLKKEDLKILYSLEYQGESPQWKKWDAPYFDDYKFISYEEFEKSDLSFYFSDN VNGIYVDKVLTGIVSRYWENKKTRWLEVGIIIYNPDYWYGGYGSEALKLWTTRTFKDFPE LEHVGLTTWSGNIPMMKCAEKLGYKLEAKIRKVRYHLGKYFDSIKYGVLREEWKEKEI >gi|261748311|gb|ADAD01000049.1| GENE 14 9539 - 9982 547 147 aa, chain + ## HITS:1 COG:MTH313 KEGG:ns NR:ns ## COG: MTH313 COG1846 # Protein_GI_number: 15678341 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Methanothermobacter thermautotrophicus # 27 145 26 144 146 63 30.0 9e-11 MCSIKNSVSFKLKLFHKEITELSKEGLQNLGLTYGNYLAMCYIYENPGITQVKLSEISYK DKNVVGRNIDKLEEKEYVKRESDKTDRRVYALYLTEKGKEIVENNRNIILKEEEKILNKI SEKEKKVFLEILDKLVNQKEKLNGKHC >gi|261748311|gb|ADAD01000049.1| GENE 15 9966 
- 11333 1606 455 aa, chain + ## HITS:1 COG:FN0944 KEGG:ns NR:ns ## COG: FN0944 COG0534 # Protein_GI_number: 19704279 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 6 454 6 454 455 560 69.0 1e-159 MENTVKEQNPLGYKPVSKLLASLAIPAVVANLVNALYNIVDQIFIGQGVGYLGNAATNIA FPIVTICLAIGLMIGVGSAANFNLELGRKNPDKAKHVAGTAASLLVIIGVLLYIIILIFL KPMMIAFGATDNILDYAMEYTGITSFGVPFLLFSIGANPLVRADGSPKYSMMSIIIGAVL NTILDPIFMFVFGMGIAGAAWATVISQIISALVLTLYFPKFKTVKFEWKDFIPQFSAMKI IVSLGISSFIFQFSTMIIQITTNNLLKIYGESSIYGSDIPIAVAGIVAKINVIFIAVVIG IVQGAQPIFSYNYGAKNYGRVRQTMRLLLKVTITISSLIFVVFQVFPVQMISLFGSGSDL YFQFGVKYMRVFLFFMFINGVQIGASTFFPSIGKAVKGVIISLMKQIAVLLPLLIILPKF MGVDGIMFATPVTDFISFIVAVVFLVYEFKKMPRD >gi|261748311|gb|ADAD01000049.1| GENE 16 11349 - 11624 405 91 aa, chain + ## HITS:1 COG:no KEGG:Sterm_0542 NR:ns ## KEGG: Sterm_0542 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 1 89 1 89 95 77 51.0 1e-13 MSLNVKQKKATAEELNENYKRLGYSEEKILSDLKFSEEELHNALNITENTNGYNVWKLRD YLVEKLKKERKEVYPFSILKMNIYFPYKKTW >gi|261748311|gb|ADAD01000049.1| GENE 17 11725 - 13350 1956 541 aa, chain - ## HITS:1 COG:FN1301 KEGG:ns NR:ns ## COG: FN1301 COG0488 # Protein_GI_number: 19704636 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Fusobacterium nucleatum # 1 538 1 538 539 816 73.0 0 MISTSNLSVQFGGRKLFDEVDIKFTPGNCYGIIGPNGAGKSTFLKILTGEADSTSGEVII DKNKRLSFLKQDHFAHEEEEVLNVVMMGHAKLYDIMMKKNELYSKSEFTEEDGILASELE GEFAELDGWEAETNAEKYLIGLGIPAELHHKLMKELTEPEKVKVLLAQAIFGNPDILLLD EPTNGLDLKAVKWLEDFLMDLENTTVLVVSHDRHFLNKVCTHIADIDYGKIKMFVGNYDF WYESNKLMQELIRNQNKKLEQKKKELQDFIARFSANASKSKQATSRKKQLEKLQFEDMQI SNRKYPYIEFKPDREAGNNMLRVENLTKTVDGEKVLNNISFTVNTGDKVVILSDNDIAKT TLFQILSGEMEPDSGSFEWGVTTSQSYFPKDNSEYFENVDLSLIDWLRQFSTDQHEEYVR GFLGRMLFSGEEARKMAKVLSGGEKVRCMLSRMMLSGANVLLLDNPTDHLDLEAITSLNK SLMNFPGTVLFTTHDHEFIQTVANKIIEITPNGILEKEMEYDDYINDENVQKKLNELYER N 
>gi|261748311|gb|ADAD01000049.1| GENE 18 13575 - 14921 2018 448 aa, chain + ## HITS:1 COG:lin2856 KEGG:ns NR:ns ## COG: lin2856 COG1455 # Protein_GI_number: 16801916 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Listeria innocua # 1 436 1 446 454 399 50.0 1e-111 MKKFTDWMEQKFVPKAAKIASQRFLVAIKDSFIAIMPITMVGSIAVLLNVFFRDLPNSWG LEGFVKLMTPIININGIVWFASIAILSLAFVISLGYNVSRSYDVNPVAGALVAFASFIAF LPQEAKFDAEINGVTQAVSSWGFINLDYLGAKGLFPAMIIGLVSAIIYSKLMKSKLTIKL PDSVPPAVSKAFASIIPGVVAVYICAILSHLVVAATGSPLNDLIQKYIQAPLLGLSQGMF SVVLLAFLVQLFWFFGLHGHNVLAPIMDGIYQPALLANVEHMAKGGAVKDLPYIWTRGSF DAYLQMGGSGITIALIFAIYLFSKRKEYKTVAKLSTPMAIFNINEPVIFGIPIVLNPIYI IPFLLAPTVCAIIAYTATAIGLIPPVYVVVPWVLPPGIYAFFATGGSVKAALVSLFNVFV AFVIWAPFVIMANKMAEDKKNQENAEEK >gi|261748311|gb|ADAD01000049.1| GENE 19 15084 - 15854 1216 256 aa, chain + ## HITS:1 COG:DR0470 KEGG:ns NR:ns ## COG: DR0470 COG1387 # Protein_GI_number: 15805497 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Deinococcus radiodurans # 6 248 8 241 260 81 25.0 1e-15 MKTNFHTHNYRCGHAIGNVEDYVKVAVKEGYSELGISDHAPVPGYYNDRMKKEELQGYLE EIEEAQKKYKDKIKIYKSLEIEYFPEWQEYYDELKKKLDYIVLGLHAFRKEGETEIHSAW SIKEEEHVLDYGKYMVKAIESGNFDYIAHPDLYMIDYRNWTDGCIKTAHMICEAAEKHDV PLEVNANGIRKTLVRHPDWDRYMYPYKEFWEIAAQYNIRTIIGSDTHNYEEMEDLPMELA RQFAKDLKLNVIETIF >gi|261748311|gb|ADAD01000049.1| GENE 20 15924 - 16418 885 164 aa, chain + ## HITS:1 COG:FN0747 KEGG:ns NR:ns ## COG: FN0747 COG0494 # Protein_GI_number: 19704082 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Fusobacterium nucleatum # 4 161 9 166 171 130 44.0 8e-31 MDGFRFLKGAKMTHPTTGITLEYLDKSDAVCFVLFNETKEKVILVKQFRPGPKDYTLEVV AGLIDAGENPETAAFRELREETGYTKDDITDFKKLEKGLFVSPGYTTENLYFYSAKLKSD 
SIKPKELDLDEGEELEVEWVPVKDIVEKSNDMKTLFGVNYFLGK >gi|261748311|gb|ADAD01000049.1| GENE 21 16454 - 16879 449 141 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0872 NR:ns ## KEGG: Lebu_0872 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 6 136 10 140 141 91 40.0 1e-17 MNIGLFIILYLFIYFLIRKIMKNIKLSFEKLEELGGEFIYTFLNRVFKKEIYFKLEEVKN IFFTRMVIKNDMFGNLMLHIILEDDYTVKLEKKENIIIFFASCKRNNPELYERFLRNTPM GINISAIMDKEIENYENKNRN >gi|261748311|gb|ADAD01000049.1| GENE 22 16889 - 17578 837 229 aa, chain + ## HITS:1 COG:Cj0620 KEGG:ns NR:ns ## COG: Cj0620 COG1451 # Protein_GI_number: 15791980 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Campylobacter jejuni # 9 221 18 211 215 114 35.0 1e-25 MRTEKILGYDIHRKKVKNINLRINQNMEVYISAPLNLHSSYIENFIRSKEDWIKKVLNRI EDVKSKQTQYEYKTGEIHKLIGREYSLIVKQGNTDRVNLSKDTGEIILVTKHENIENKKK IMDKWYYDCAKKIFPDVMEKWLKILGETIEHLSIKPMKTRWGSCNYNKRYINLNTELIKR TPFEIEYVVLHELAHLKYPNHGKGFYNYVENYMPNYKEAEKLLNAKHYY >gi|261748311|gb|ADAD01000049.1| GENE 23 17593 - 18498 773 301 aa, chain + ## HITS:1 COG:no KEGG:CTC01162 NR:ns ## KEGG: CTC01162 # Name: not_defined # Def: membrane associated protein # Organism: C.tetani # Pathway: not_defined # 34 293 28 284 303 292 60.0 1e-77 MKKRIVISLICGILSISCSQKTINDKYDINISQKSNENLKKITYSNLVDKASQEEVRAAL KNAGVSEKSIKDFFEGVNYFNNSVENVSLVKSGFQTMEKTSPVYDEVTIQEKWNKKNPVF IGQNCRITTFELMKDFISVGNPEIKNTEQLFMDQDSLKNFPVKIFTDKEKAQFESLFSSI ETENTKDIATHIRKVKEDWEKKNIKFSNKNKISMISVFFHSEITPKESFLFIGHVGVLVP SSDGKFLFIEKLAFQEPYQALKFDNVLQLNDYLLNKYDVEYGQSNAKPFIMRNGDPILMY P >gi|261748311|gb|ADAD01000049.1| GENE 24 18583 - 19179 783 198 aa, chain + ## HITS:1 COG:FN0342 KEGG:ns NR:ns ## COG: FN0342 COG0652 # Protein_GI_number: 19703685 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Fusobacterium nucleatum # 31 194 1 155 167 175 64.0 5e-44 
MIQMKIRNISVILVMLFMVFSINSESYSKKMKMNVKIKTNKGEINLRLFPENSPVTVASF VNLIKHGYYNGLKFHRVIEDFMAQGGDPTGTGMGGPGYRFEDEVKNGLDFSVPGKLAMAN AGPGTNGSQFFITTVPTEWLTGKHTIFGEVVSDKDLEVVKSLSNGDIMETVTVSGSGIDE FLSKYKDRISEWNKTLGY >gi|261748311|gb|ADAD01000049.1| GENE 25 19216 - 19425 86 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNTTIFFIRLVIVYFISLMFERQEEKQKFLLSLIIYLITGSTKLSKTNFLLLTFLYFKNP ANQSHIDYK >gi|261748311|gb|ADAD01000049.1| GENE 26 19558 - 20286 785 242 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0800 NR:ns ## KEGG: Lebu_0800 # Name: not_defined # Def: CRISPR-associated protein Cas6 # Organism: L.buccalis # Pathway: not_defined # 1 240 1 257 260 234 48.0 3e-60 MRFKIACEITEGEAFSQDYRRNILRMLKTGLDRYEDIFENFFGSNKLKKYCWSVYFKNSI FQGDKIYLQGEKKEFFINFSVFDNGDAMNIFNAFMNIKYREIDISKELRVKVSNVIKLPT KEVSGEFFRAKVMSPIICRDHNQETGKDSYYTGSEEEFIPIIKRNLYLEMREKYGDYVER DIEKLVIETKELKKTVSKFYGKLIDGSIGIIELQGKNYLLNYIYNAGLGSIRGSGYGLLE II >gi|261748311|gb|ADAD01000049.1| GENE 27 20305 - 22125 1387 606 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0799 NR:ns ## KEGG: Lebu_0799 # Name: not_defined # Def: CRISPR-associated Cst1 family protein # Organism: L.buccalis # Pathway: not_defined # 3 603 4 585 589 313 42.0 2e-83 MSEKIEIRMSNWLFNAGLLGFINVLENSNEEIRINDEKRSIEFSPKVLENFEYKYFDFLI LRYNEILPSTRILRFQENLEKIENFKNSEKENEFISKNIKELLQEINYIKNKKYNNINIE KITFEKFEATKKNIKKLDESKYEIKELKENLKNITEIIKIIEENRNSFQLEDIKTIVRYG WEGIAFLDNANIAIRRKNGIDIGSIYKIYQDFFIKDIQEYIQKDKSKFEGQCSISNAPLP IPYDKKENKNYNKTIQFLGDFFNPNDKKSNVWNFENDIYVSPIVYLIYSCVPAGFIYSNY NSKGSGKGIFINLNYNITQLKKVNNGIFSNIFEKNINEEDKLYRNIFKKFELGNQNYKYD ISDIQVIKTFERREIPKKSYPKYRFNILSKRILEFIQKNLNKINSFFGKYYVFKDEKDNP KWDTTVYLYKDIIERLLENRDLFLYIDKLCYVKISERYKYNYSNKNLLDLIYINMNNMWR FRDMKDKVEKSEKEIENEQLTYKDINTIKNNAYHFREKYNKVSNNEDKIKGLQYRLQNAL RTNNVDLFMDILISAHAYAGKEIHKLFLKALDNDDDFKTLGYAFLIGLLKGDKEENTNNK QVKGVE >gi|261748311|gb|ADAD01000049.1| GENE 28 22127 - 23044 1028 305 aa, chain + ## HITS:1 COG:aq_374 KEGG:ns NR:ns ## COG: aq_374 COG1857 # Protein_GI_number: 15605880 # Func_class: L Replication, recombination and 
repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Aquifex aeolicus # 92 297 146 354 357 116 37.0 7e-26 MKIKGLTLTIIFLAESANYGEGIGNVATLKKISRNKGEQYTYISRQALRYSIVELLEMLG ETKAIVTKSPGVVQFEKETTINDSAELDFFGYMKTEGGVSTRSAVVRLSNAISLEIYKGD LDFLTNKGLADRINQNNDIAQSEIHRSYYKYTITIDLDKIGIDEIINEKKEEIIELEKEQ KSERVKKLLDAISILYRDIKGRREDLKPLFVIGGVYDIKNPFFENLVDVKNENILVDKLV SGIHKIIEKDTKCGIIKEQFQNETELREKLKNKNITVQDISEVFNDLKKEIDKYYEIETN KKEAK >gi|261748311|gb|ADAD01000049.1| GENE 29 23049 - 23789 687 246 aa, chain + ## HITS:1 COG:aq_373 KEGG:ns NR:ns ## COG: aq_373 COG1688 # Protein_GI_number: 15605879 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) # Organism: Aquifex aeolicus # 1 162 1 158 219 62 34.0 7e-10 MRAIKLKLYQNMVNYKIATSFYLKETYPLPPYSTVIGMVHSLCDFKEYNPMEISVAGNYF SKVNDYYTKHEFNKSDNIITGPAYVELLVDVNLTIHIIPRDQSDESIQKILNAFEKPKEY LSLGRREDIVVINKVKEVAIEKKKLEKDKRYDKGIFAYIPVNFAVGRLVKAGSKKYGPIS GTRYEITKNYKLQNIGTKNKPKKIRIWEKKEVIYSSNIIGFKNKILPIDNEEDIVFCDED TIFRDL >gi|261748311|gb|ADAD01000049.1| GENE 30 23817 - 26249 2391 810 aa, chain + ## HITS:1 COG:FN1179 KEGG:ns NR:ns ## COG: FN1179 COG1203 # Protein_GI_number: 19704514 # Func_class: R General function prediction only # Function: Predicted helicases # Organism: Fusobacterium nucleatum # 8 792 11 796 812 350 36.0 8e-96 MKNKFLAKSLTNETIEAHTNNLIKNFKLLLKLYPDINVNKRLLLLACIYHDLGKINVKFQ KFQKNLPQIPHGLLSTAFIDSSGLIDNSGFEEYEIRILAHAVALHHERNLSEIEVEDLEN EIELMNKETPDFIKVLKILEEEYNEYINTQGKNEKIKIFNIIDNNIKLEMLSENFYEIGG RMYSEDISMTKQEAMETFKKYIMVKGLLNKIDYAASSGLTVECENNFLEKNMNDFLQNVL KKYDHDNDWNDLQKFMINNRDNNVIVVAQTGFGKTEAGLLWIGNNKGFFTLPLRVAINSI YKRIKEQILKDDITKDEFTDKLGLLHSDFRSEYLKRFEEKKDKEIEENDFTEDRLDEYIG KTKQLSLPLTVCTIDQIFDFVYRAPGFEMKASILAYSKIVIDEIQMYSPNLVASLIYGIK FITDLGGKFAIMTATLPGIIKTLLEKENVSFVTTSPFVNNKIRHKVKIENNEINAEFIKG KYKNNKILVVCNTVKKSKEIYDTLKEIGVLEEEINLLHSKFIKKDRSEKENEITKFASPK 
RFSGEEKKRRKEQNIIEHGIWIGTQVVEASLDIDFDVLITELSDLNGLFQRMGRCYRSRT LKSEDGYNCFVFTDNCSGISKSERSIIDKDIHEKSRERLLNIDGILTEEMKLSMIDEVYS YESLKETQYCQTIEEKLRALKFYIVEYEMSKEKVKEIFRDIHSIDVIPKSVYEENKEMIT ENIKILKKTYKDCSNIERETLKLDKIKARDEINQLKVSVPQYEFDESDYYTVEINKYEEI FVMNCDYSFERGVEIIKKSKKLDFDEDLCF >gi|261748311|gb|ADAD01000049.1| GENE 31 26276 - 26569 236 97 aa, chain + ## HITS:1 COG:FN1178 KEGG:ns NR:ns ## COG: FN1178 COG1468 # Protein_GI_number: 19704513 # Func_class: L Replication, recombination and repair # Function: RecB family exonuclease # Organism: Fusobacterium nucleatum # 3 97 5 99 164 89 52.0 1e-18 MRISGLMINYYFICKRKLWCLAKNINFEETNENVKMGKLIDESRYALETKQIMIEETVNV DFIRNWKVVHEVKKSKAIEEAAIWQVKYYIYFLKKKG >gi|261748311|gb|ADAD01000049.1| GENE 32 26524 - 26763 172 79 aa, chain + ## HITS:1 COG:aq_370 KEGG:ns NR:ns ## COG: aq_370 COG1468 # Protein_GI_number: 15605876 # Func_class: L Replication, recombination and repair # Function: RecB family exonuclease # Organism: Aquifex aeolicus # 9 78 107 176 177 60 45.0 5e-10 MASEVLYLFFEKKGIDIEKGIIDYPEIRERKEVILSKEDEVYLEEILKDIERICQNEISP EVINNKICKKCAYYEYCYI >gi|261748311|gb|ADAD01000049.1| GENE 33 26796 - 27788 1210 330 aa, chain + ## HITS:1 COG:FN1177 KEGG:ns NR:ns ## COG: FN1177 COG1518 # Protein_GI_number: 19704512 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Fusobacterium nucleatum # 1 330 9 338 338 386 60.0 1e-107 MSESYFIFSNGELKRKDNVIRITAVDGRFKDIKVEMTRDIYLFGEVALNTKCLNYLATLK IPVHVFNYYGFYTGTFYPKETNVSGKLFVKQVENYTNNEKRIELAQLIIDAASANILRNL RYYQERGKELEEIIEQIKALKKGIFRTENVEELMGIEGTIRKTYYDSWNIIVNQEIDFEK RVKRPPDNMINTMISFLNTLVYTACLSEIYVSQLNPTISYLHSVGERRFSLSLDIAEVFK PLLADRIIFSLLNKKMITEKDFVKDSNYFYMKEKAQKLILKTFNERLETTIRHRDLNRNV SYRRLMRLEAYKLVKHLMEDKKYEGFKIWW >gi|261748311|gb|ADAD01000049.1| GENE 34 27763 - 28068 205 101 aa, chain + ## HITS:1 COG:FN1176 KEGG:ns NR:ns ## COG: FN1176 COG1343 # Protein_GI_number: 19704511 # Func_class: L Replication, recombination and 
repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Fusobacterium nucleatum # 7 101 12 106 106 105 55.0 3e-23 MKVLKYGGNMYVILVYDISTDDDGGRISRNIFKICKRYLTNVQKSVFEGEITPVLLRKLY LELRGFIRDDKDSVLLFKSRQEKWLEKEFWGLEDDKTSNFF Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:43:28 2011 Seq name: gi|261748309|gb|ADAD01000050.1| Leptotrichia goodfellowii F0264 contig00086, whole genome shotgun sequence Length of sequence - 463 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 461 708 ## gi|262037937|ref|ZP_06011359.1| MhaB1 Predicted protein(s) >gi|261748309|gb|ADAD01000050.1| GENE 1 2 - 461 708 153 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037937|ref|ZP_06011359.1| ## NR: gi|262037937|ref|ZP_06011359.1| MhaB1 [Leptotrichia goodfellowii F0264] MhaB1 [Leptotrichia goodfellowii F0264] # 1 153 53 205 206 254 97.0 2e-66 ITDKTISLQDSKKGVTETVTETRQVRDKKLVHRGGRDGYEREVEYGPSKTVQVQVSRYWD TVNKMNTVSGIIGEGKDTILDSAKDIVLESSDLRAQNDVVLNAKNYLLMFSTVDTEYKFR TQTSTSGGGLRRKKTTTETWIQDNVYANPVEIT Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:43:44 2011 Seq name: gi|261748279|gb|ADAD01000051.1| Leptotrichia goodfellowii F0264 contig00062, whole genome shotgun sequence Length of sequence - 26026 bp Number of predicted genes - 27, with homology - 26 Number of transcription units - 13, operones - 8 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 387 - 446 6.9 2 2 Op 1 . + CDS 472 - 912 545 ## Lebu_1364 hypothetical protein 3 2 Op 2 . + CDS 846 - 1367 512 ## Lebu_1364 hypothetical protein + Prom 1411 - 1470 6.4 4 2 Op 3 . 
+ CDS 1527 - 1814 400 ## gi|262037690|ref|ZP_06011133.1| single-stranded-DNA-specific exonuclease RecJ + Term 1956 - 2024 30.4 + TRNA 1938 - 2013 91.7 # Val TAC 0 0 + TRNA 2021 - 2097 88.2 # Asp GTC 0 0 + TRNA 2101 - 2176 86.3 # Phe GAA 0 0 + Prom 2243 - 2302 7.8 5 3 Tu 1 . + CDS 2355 - 3038 762 ## COG2964 Uncharacterized protein conserved in bacteria + Term 3209 - 3267 1.1 + Prom 3190 - 3249 12.2 6 4 Op 1 . + CDS 3318 - 3404 136 ## 7 4 Op 2 2/0.000 + CDS 3462 - 4655 1301 ## COG0078 Ornithine carbamoyltransferase + Prom 4715 - 4774 11.8 8 5 Op 1 4/0.000 + CDS 4893 - 6104 1356 ## COG1171 Threonine dehydratase 9 5 Op 2 1/0.000 + CDS 6132 - 7445 1900 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases + Prom 7492 - 7551 9.3 10 5 Op 3 2/0.000 + CDS 7630 - 9096 1279 ## COG1953 Cytosine/uracil/thiamine/allantoin permeases 11 5 Op 4 . + CDS 9101 - 10495 1959 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 12 6 Op 1 . - CDS 10696 - 11580 889 ## COG1078 HD superfamily phosphohydrolases 13 6 Op 2 . - CDS 11594 - 11770 136 ## gi|262037671|ref|ZP_06011114.1| putative liporotein - Prom 11954 - 12013 10.9 + Prom 11750 - 11809 9.2 14 7 Op 1 . + CDS 11987 - 13459 1974 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase 15 7 Op 2 . + CDS 13519 - 14577 1551 ## COG0687 Spermidine/putrescine-binding periplasmic protein + Term 14755 - 14811 2.5 + Prom 14869 - 14928 10.3 16 8 Op 1 . + CDS 14950 - 15222 389 ## TDE0506 DNA-damage-inducible protein J, putative 17 8 Op 2 . + CDS 15219 - 15494 233 ## COG3041 Uncharacterized protein conserved in bacteria + Term 15556 - 15601 1.1 - Term 15499 - 15534 4.0 18 9 Tu 1 . - CDS 15543 - 16313 904 ## COG0685 5,10-methylenetetrahydrofolate reductase - Term 16525 - 16571 1.7 19 10 Tu 1 . 
- CDS 16686 - 17492 991 ## COG0351 Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase - Prom 17520 - 17579 8.7 + Prom 17586 - 17645 8.7 20 11 Op 1 41/0.000 + CDS 17742 - 18488 211 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 21 11 Op 2 12/0.000 + CDS 18519 - 19934 2076 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component 22 11 Op 3 24/0.000 + CDS 19974 - 21101 1570 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component + Prom 21323 - 21382 8.3 23 11 Op 4 19/0.000 + CDS 21483 - 22682 1733 ## COG0520 Selenocysteine lyase 24 11 Op 5 . + CDS 22679 - 23113 491 ## COG0822 NifU homolog involved in Fe-S cluster formation + Prom 23219 - 23278 12.3 25 12 Op 1 . + CDS 23318 - 24313 1561 ## Lebu_0390 hypothetical protein 26 12 Op 2 . + CDS 24339 - 25016 814 ## Sterm_3339 hypothetical protein 27 13 Tu 1 . - CDS 25123 - 25587 555 ## COG2707 Predicted membrane protein Predicted protein(s) >gi|261748279|gb|ADAD01000051.1| GENE 1 2 - 328 623 108 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1810 NR:ns ## KEGG: Lebu_1810 # Name: not_defined # Def: autotransporter beta-domain protein # Organism: L.buccalis # Pathway: not_defined # 1 108 3537 3644 3644 149 67.0 3e-35 DYYSVKPEVGIEFKYKQPMAVRTTFTTTLGLGYENELGRVGDVKNKGRVAYTDADWFNIR GEKDDRRGNFKADLNLGIENQRFGVTLNAGYDTKGKNIRGGLGLRVIY >gi|261748279|gb|ADAD01000051.1| GENE 2 472 - 912 545 146 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1364 NR:ns ## KEGG: Lebu_1364 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 14 146 16 142 292 153 59.0 2e-36 MGKNKNLKFNKLLMAVFICLQMSCNSENQSKDVKNDNSQTIQQNEENMKNNIVNPKGTTI ETRYNVPAGYKRVQAEEGSFAQFLRNQKLKPYGEKALFFNGEAKRSEGIYDSVIDVEIGD RDLHQCADAIMLLRGEYFYSKKNMTK >gi|261748279|gb|ADAD01000051.1| GENE 3 846 - 1367 512 173 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1364 NR:ns ## KEGG: Lebu_1364 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: 
not_defined # 13 173 134 292 292 259 77.0 2e-68 MCRCDNAVERRIFLFKKKYDKINFKFVSGFDAKYSKWMEGYRINPEGKGSYYKKTGSSNT YKDFRNFMTIVFAYSGTLSLENEMKPQALKDMKIGDVFIMGGSPGHAVIIVDMAVNDKGE KIFMLAQSYMPAQQTQILINPENSDMKVWYSLKNKDILVTPQWRFPVDKLRSF >gi|261748279|gb|ADAD01000051.1| GENE 4 1527 - 1814 400 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037690|ref|ZP_06011133.1| ## NR: gi|262037690|ref|ZP_06011133.1| single-stranded-DNA-specific exonuclease RecJ [Leptotrichia goodfellowii F0264] single-stranded-DNA-specific exonuclease RecJ [Leptotrichia goodfellowii F0264] # 9 95 1 87 87 149 98.0 6e-35 MEIVYFIIVIAIIIAAWVGLAYFMNFCLQMRIKRVFNSKRMTDERIVKEYKGLDVMSTIF IFYGGPVGFFLAKKKFIPELKKTMEEKMRERNIEF >gi|261748279|gb|ADAD01000051.1| GENE 5 2355 - 3038 762 227 aa, chain + ## HITS:1 COG:YPO0626 KEGG:ns NR:ns ## COG: YPO0626 COG2964 # Protein_GI_number: 16120952 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Yersinia pestis # 14 186 3 175 196 71 30.0 1e-12 MEYCTLKALKKIGESIAEHFGENCEVLIHDIKSDGHDSRIVYIKNGHVSGRHLGDGPSRI ILESSQKKLSKIKDNLCYLTKTKDGKILKSSTLFFSDKTKKRLKYVLSINYDITHLLVVD AAIQTIISTGKDKNKEAIEIPNNVNDLLESLIEKSVELIGKPAPLMTKIEKKKAIKYLNE AGAFLITKSGDRISQYFGISKYTLYSYIDIKNTEKDNKKIISKKEIV >gi|261748279|gb|ADAD01000051.1| GENE 6 3318 - 3404 136 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDFSKIRKDAKKYKKVIIKFLKEIMYFP >gi|261748279|gb|ADAD01000051.1| GENE 7 3462 - 4655 1301 397 aa, chain + ## HITS:1 COG:ECs3743 KEGG:ns NR:ns ## COG: ECs3743 COG0078 # Protein_GI_number: 15832997 # Func_class: E Amino acid transport and metabolism # Function: Ornithine carbamoyltransferase # Organism: Escherichia coli O157:H7 # 1 394 2 394 396 528 64.0 1e-150 MVNIKKYIKILEKLNFKSMYNSDFFLTWEKTGDELDAVFIVADALRYLRENNISSKIFDS GLGISLFRDNSTRTRFSFASACNLLGLEVQDLDEGKSQIAHGETVRETANMISFMADVIG IRDDMYIGKGNKYMHDVVNAVTEGNKDGVLEQKPTLVNLQCDIDHPTQSMADALHLIHEF GGVENLKGKKIAMSWAYSPSYGKPLSVPQGIVGLMTRFGMEVVLAHPKGYEIMPEVQKVA EKNAIESGGTFKVTNDMKEAFKNADIVYPKSWAPFVAMEKRTVLYGNGDTQGIKELEKEL 
LEENSKHKDWECTEEFMKNTKNGKALYLHCLPADISGVSCEKGEVEASVFNRYRIPLYKQ ASFKPYIIAAMIFLSKVKEPEKTLENLMIQNKIRWNF >gi|261748279|gb|ADAD01000051.1| GENE 8 4893 - 6104 1356 403 aa, chain + ## HITS:1 COG:STM1002 KEGG:ns NR:ns ## COG: STM1002 COG1171 # Protein_GI_number: 16764362 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Salmonella typhimurium LT2 # 2 398 3 401 404 434 53.0 1e-121 MEKIKWILNTMSKSNGDRSIEILTEDKVIKVKKFHESFPQYCETPLVGLDNLSKNLGVSG IYIKDESYRFGLNAFKVLGGSYAMAKYLAKKLEKDISELSYEKLISEKVKKELGEITFYT ATDGNHGRGVAWTANKLKQKSVVYMPKGSSKTRLKNILSEGAEASITELNYDDAVRLAAK NAEKNNGVIIQDTAWEGYTDIPTWIMQGYGTMAIEAKKQLEKYGVDRPTHIFIQAGVGSL AGTIQGFFASTYKDNIPITIVVEPNSADCYYRSAVQGDGKPVNVGGDMQTLMAGLACGEP NIIGFEILKNYATAFLSCPDYIAAKGMRISAAPLRGDQQIISGESGAVSLGSLFSILKDD YYKELRKKLKLDENSKILLFNTEGNTDPDKYTDIVWDGEYPSK >gi|261748279|gb|ADAD01000051.1| GENE 9 6132 - 7445 1900 437 aa, chain + ## HITS:1 COG:ECs3745 KEGG:ns NR:ns ## COG: ECs3745 COG0624 # Protein_GI_number: 15832999 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Escherichia coli O157:H7 # 3 431 7 396 403 550 62.0 1e-156 MNYEEIKRKSENYKSDMTKFLRELVQIKGESAEEKGHALRLKEEMEKVGFDKVEIDPQGN VLGYMGNGPRIIAFDGHIDTVGIGEIRNWTFDPYEGYETEEEIGGRGTSDQLGGIVSAVY GAKIMKELELIDPEYTILVVGSVQEEDCDGLCWQYIIKEDKIRPEFVVSTEPTDSGIYRG QRGRMEIRVEVKGISCHGSAPERGDNAIYKMADILQDIRSLNENSVSEGTEIKGLIKMLD KKYNSEWKEANFLGRGTITVSQIFYTSPSRCAVADFCAVSLDRRMTAGETWEYCLEEIRQ LPAVKKYKNDVTILMYDYDRPSYTGLVYPIECYFPTWTLPEDHKVTLSLKEAYESLYGEK RTGSKSIIETRINRPLVDKWTFSTNGVSIMGRNGIPVIGFGPGAEAQAHAPNEKSWKNDL VVCAAVYAALPEIYKTK >gi|261748279|gb|ADAD01000051.1| GENE 10 7630 - 9096 1279 488 aa, chain + ## HITS:1 COG:RSc1233 KEGG:ns NR:ns ## COG: RSc1233 COG1953 # Protein_GI_number: 17545952 # Func_class: F Nucleotide transport and metabolism; H Coenzyme transport and metabolism # Function: Cytosine/uracil/thiamine/allantoin permeases # Organism: Ralstonia 
solanacearum # 28 479 20 491 502 265 33.0 1e-70 MKKFQQDNHGIWDLTKEGLQEIEKYETIYTNDTKPISSKEREWGSWDIAALWIGIMVSIP VYMLASGLIASGMNWWQSISVIVLGHTIVIIPATLLGHSGTKYGISYPLVSKLIFGPKGN IFPTMVRAFLGCFWFGIQCWIGGMAVDSMISAIIPAWIQINGHLFVCYGIFLLLNVYIGY TGSKAVKYMESYAAPVLIIMGLAVIIWAYRVSGGFGKLLTHSVAQGGKNNFWKLFFPSLT AMIAFDGTIALNISDFTRHAKTQKAQISGQFIGAPVMTVFIVFVGICGTVASAISFGSPI WNPAELVSKFKNPLIVILFSIFIILATLTTNVAANLVPPGIIFSNLFAKFLTYKKAIIIV GFLAIVAQPWKVLENPNNYIYEVNGALATFLGPMAGIYLASYWLEYKSNIELVDLYRIDG GKYYYDKGVNKIAIISLFLITLFLYFGKFSDSFYKIFYENSYVLGLLIGMFVYILLIKIF NKNNSIGG >gi|261748279|gb|ADAD01000051.1| GENE 11 9101 - 10495 1959 464 aa, chain + ## HITS:1 COG:ECs3746 KEGG:ns NR:ns ## COG: ECs3746 COG0044 # Protein_GI_number: 15833000 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Escherichia coli O157:H7 # 1 455 7 457 465 457 48.0 1e-128 MLIKNALIATASDLYPGDIYIKDGVIKEIGENLNIDDNEIIDAEGNYVIPGGVDVHTHFS LDVGIAISNDDFRTGTIAAACGGTTSIVDHIGQGPVGSTLKSRIKHYHELADGKAVIDYS FHGVIAYNADEEKIKDMKELIKEGIESYKIYMTYGQRIGDEEALKVLKTAKENNAIVSVH PENHAAVEYLRKYYVENGMTSAEYHPKSRPEECEAEAINRIITLAHIVGDAPIYIVHLTS DMGLDYVKMARKRGQKNIYVETCTQYLVLDEELYKLPRTEGLKYVMSPPLRNKSNQEKLW RGIKNGDIQVVATDHCPFSYEKEKVPMGKDDFTKCPNGAPGVEARIPVLFSEGVMKGRIS INKFVEVTSTNPAKICGMYPKKGSIAVGSDADLVIINKNKKINITKSLFHENVDYTSYEG LELQGYPIMTIVRGKVIVKNNEFIGEEGYGQFIKRYKNNDLVIY >gi|261748279|gb|ADAD01000051.1| GENE 12 10696 - 11580 889 294 aa, chain - ## HITS:1 COG:BH1811 KEGG:ns NR:ns ## COG: BH1811 COG1078 # Protein_GI_number: 15614374 # Func_class: R General function prediction only # Function: HD superfamily phosphohydrolases # Organism: Bacillus halodurans # 4 294 3 299 325 203 35.0 4e-52 MNTVIDELYGEYHVNCIIFEIMETSVFKRLRNIHQSGGGYLINELWNVTRYEHSVGTMIL SLKFGGNVEEQIKSLLHDISHSTFSHVIDYILKNNNEDYHEIIQKDFLKNEELINIFEKY RLNLDTIMSTNYNLLDADLPDLSFDRIDYTFRDLFRQKLITKDDINFLLNYIIVHKNILC FNSIEAGKYFKELYFKETEGYFNDPLNKYINTMLSKLLSKALQEKIIELKDFNFTEEIIL EKIKNSRYKEELKNIVNKKYFDFFLQSNRKVEVKQRYIDPYVFINGVIRRLSGF 
>gi|261748279|gb|ADAD01000051.1| GENE 13 11594 - 11770 136 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037671|ref|ZP_06011114.1| ## NR: gi|262037671|ref|ZP_06011114.1| putative liporotein [Leptotrichia goodfellowii F0264] putative liporotein [Leptotrichia goodfellowii F0264] # 1 58 1 58 58 72 100.0 1e-11 MKKILLTFSAMALLLSCNSNKSETEQKPVTNKNQEKVCYKLVNGEASKLLKKTDSNSK >gi|261748279|gb|ADAD01000051.1| GENE 14 11987 - 13459 1974 490 aa, chain + ## HITS:1 COG:FN1225 KEGG:ns NR:ns ## COG: FN1225 COG0769 # Protein_GI_number: 19704560 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Fusobacterium nucleatum # 4 490 3 476 485 419 47.0 1e-117 MFKIFENISHTVLHKGKEFKIEGIEYDSRKIKENFVFAAMKGSSVDGHEFIQKAIDNGAK MIIVEENIEVSKYDKYEDISFVFVEDVRKHLGIIASNYYGYPQDKIKIIGITGTNGKTTS SYILENILEKTARIGTTGNRILDEEFETVNTTPESLELVKLINESVKRGVEYFIMEVSSH ALEIGRVNMLKFDSAIFTNLTQDHLDFHKTMENYFNAKRKIFSMLRNSGTGTVNIDDGYG KRIIEEKKDAENVCYKTVSIKDKNADLYGEIAEYTNDGMKIKIVHNDKEYFANINLVGEY NLYNVLGCAASALALGIEIEKIIEKLQKMPSVPGRFETVKNNKNARIVVDYAHTDDGLTN IGKTLRKITDNRVITVFGAGGDRDNKKRPKMAKAAAEFSDFIILTSDNPRTENPANILKE VEAGLIEINFPKDKYTVIEDREQAIKYAVQEMTGEGDSLLIAGKGHETYQIIGKEKRHFD DREIAKKYLI >gi|261748279|gb|ADAD01000051.1| GENE 15 13519 - 14577 1551 352 aa, chain + ## HITS:1 COG:FN0618 KEGG:ns NR:ns ## COG: FN0618 COG0687 # Protein_GI_number: 19703953 # Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Fusobacterium nucleatum # 1 352 1 342 342 238 38.0 2e-62 MKKIILLLGIALLMVACGGKSEGSKGSESKELNIYTWTYFIPQEVIDDFQKETGIKVNLS YYDNNDVMIAKLMSGAKGFDIISPSTDYVDILIKNGLIEKLDKSKLGQTFNNLDADNLKL MEHSKIYDPGLDYSIPYSFSATGIAVNKKFMKDYPQSFDIFGLAQYKGKMTMLDDGREVI GATLQYLGYPSDSSNDKELQEAKNKILEWKKNLAKFDATAFGKSVVTGEFYAAHGYAENV YGELEESEYSNYDYFIPKGAMMYIDSMAIVKNAPNKENAYKFLEFLYRPENFIKVYEQFK APSVIKGVEEKSQVKAIVKSKQVVENAKLPGALSDEAKEKQDKIWNEIKLSN >gi|261748279|gb|ADAD01000051.1| GENE 16 14950 - 15222 389 90 aa, chain + ## HITS:1 
COG:no KEGG:TDE0506 NR:ns ## KEGG: TDE0506 # Name: not_defined # Def: DNA-damage-inducible protein J, putative # Organism: T.denticola # Pathway: not_defined # 1 90 1 90 90 104 64.0 1e-21 MANTNLNIRTDIDVKKKANAIFQELGMSMTTAVNLFLKTAIRENGIPFSLKLETPNEMTI KAIEEGRKIAENKEIEGFSNIKDLKKALEV >gi|261748279|gb|ADAD01000051.1| GENE 17 15219 - 15494 233 91 aa, chain + ## HITS:1 COG:SP0276 KEGG:ns NR:ns ## COG: SP0276 COG3041 # Protein_GI_number: 15900210 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pneumoniae TIGR4 # 4 91 3 92 92 73 45.0 7e-14 MKYKVKFTGQFKKDLKLLKKQGKDIDKLFEVIEILANGKKLKEKYREHNLTGNYKGTKEC HVDPDWLLIYEFYDDILVLLLLRTGSHSELF >gi|261748279|gb|ADAD01000051.1| GENE 18 15543 - 16313 904 256 aa, chain - ## HITS:1 COG:Cj1202 KEGG:ns NR:ns ## COG: Cj1202 COG0685 # Protein_GI_number: 15792526 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Campylobacter jejuni # 4 252 26 272 282 212 45.0 5e-55 MTKELASYKPDFISVTYGAGGTTKSGTVEIASYIKNELKIETLAHLTCIGSKKQEIANFL KELKNNNIKNILALRGDIPQGENESIYNRGDYKYASDLIKDIKENEDLSIGAAFYPETHY ENNDLIDLFHLKEKVDLGADFLISQVFFENETFLRFREQAEKLSINVPLVAGIIPVTNAT QIKRITALCNCKIPLKLEKILDKYGSNPDSMKKAGIAYATEQIIELLSNDIRGIHLYTMN KTDTTKEILDNVSFVR >gi|261748279|gb|ADAD01000051.1| GENE 19 16686 - 17492 991 268 aa, chain - ## HITS:1 COG:BS_thiD KEGG:ns NR:ns ## COG: BS_thiD COG0351 # Protein_GI_number: 16080853 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase # Organism: Bacillus subtilis # 3 267 2 267 271 301 58.0 7e-82 MANLKKVVTIAGSDTSGGAGIQADLKTFQERGVYGMNVLTVIVTMDKNWNHKVFPIDMNT IKEQAQTVFEGIGVDGVKTGMLPTVEIIEYAGSLLKKCDSSVIKVVDPVMVCKGTDEVLF PENTIAMQKHLLPYATAVTPNLFEAAQLAGVAPIHSLDDLKEAAKKIYDLGPKNVVIKGG KALSDKISIDVVYNGKDFEIFESEKINTPYTHGAGCTFAASLTAELTKGKSVTEAMNITK KCITDALKESFKLNEYVGPVYHKAFSLK >gi|261748279|gb|ADAD01000051.1| GENE 20 17742 - 18488 211 248 aa, chain + ## PROTEIN 
SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 4 236 12 236 318 85 29 2e-16 MSLLNLKGVKSEVEGKPILKGVDLTIDKGEVHVIMGPNGAGKSTLASILVGHPKHEIVDG EIILDGENINDLEVDERAQRGIFLSFQYPEEIPGLTVEDFLRTAKEAVTGEKQYMMQFHN ELVEKMEKLHINPEYAERHLNVGFSGGEKKKNEILQMAVLEPKLAILDETDSGLDIDATK IVFEGVKSLKTPDTAMLIITHYDKVLEYLQPDFVHILMDGKIVKTGGKELVESIEKHGYS KIKKELGL >gi|261748279|gb|ADAD01000051.1| GENE 21 18519 - 19934 2076 471 aa, chain + ## HITS:1 COG:TP0612 KEGG:ns NR:ns ## COG: TP0612 COG0719 # Protein_GI_number: 15639600 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Treponema pallidum # 8 471 16 479 479 704 70.0 0 MEESRNKTYVADIERGVYDIKDEMHHKYTTGKGLTEEIIRKISGKKNEPEWMLELRLKAL EVFNSKPMPTWGADLSDLDINQIIHYLEPDSKIMSDNWDDVPDYIRATFDRLGIPEAEKQ SLAGVGAQYDSEVVYHSLHKELEEQGVVYTDIETALHTHEDIVKEYFMKVIKMTDHKFAA LHAAVWSGGSFVYVPKGVKVNQPLQSYFRLNAAEAGQFEHTLIIVEEGADLHFIEGCSAP KYRRNALHAGAVELIVRKGARLRYSTIENWSRNLYNLNTKRALIDEDGVMEWVSGSFGSR ATMLYPGTILRGERARCEFTGVTFASTGQYLDTGCKIVHAAPYTTSTVHSKSISKNGGNA FYRGFLHIAKNAHGCKSTVECESLMLDNESHSDTIPIIEINNDSVDIGHEAKIGRISDEA IFYLMSRGISKDEAKAMIVRGFVEPISKELPLEYAVELNKLIELELEGTIG >gi|261748279|gb|ADAD01000051.1| GENE 22 19974 - 21101 1570 375 aa, chain + ## HITS:1 COG:CAC3290 KEGG:ns NR:ns ## COG: CAC3290 COG0719 # Protein_GI_number: 15896535 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Clostridium acetobutylicum # 47 375 36 363 366 213 35.0 7e-55 MLNEASLKNLENSDYRLEFFKKYSALDKPNWKRVGYKYEEPEKYTDFNNTVIKNENQDGL TIKEINNSLEDLTRLQNDNEYGLDEFFKLQIFAFHNAGQFVKVKERKKLEKPVYITYNTN KENNFLIDCNIIEVEDFAEAVIIITYNSEDDTPAYHNGITKIFAGQNSKVKIIKIQTLNT ESSNFESSKIETAGQGIVNYYSIELGAKINGISHKTYLMEDRAETYVWPGYLADGTRQLD LEYSTVFYGRQTIGEIHGRGAVKDTARKVFRGNMYFKRGAAKSEGREGEFAILLDKDVKV 
HAIPTLFCDEDDVIGEHYASIGKVDDAKLFYLMSRGLSEGRAKKLIVESSFRPIFDNIDD ENIREHLLEELERRI >gi|261748279|gb|ADAD01000051.1| GENE 23 21483 - 22682 1733 399 aa, chain + ## HITS:1 COG:CAC3291 KEGG:ns NR:ns ## COG: CAC3291 COG0520 # Protein_GI_number: 15896536 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: Clostridium acetobutylicum # 3 398 5 405 408 439 53.0 1e-123 MDYRKEFPIFENGEKHYLDTAATSQKPKKVLDKVVQYYERYNGNPGRGSHELSMKASEIM DSARETVRQFINAEKTEEIIFTKSTTESINLIAYSYGMEFINEGDEILLGISNHHANIVP WQFVAKKKKAKIKYVELDDDGQFDLNDYKYKLSAKTKIVAVSGVVNVTGVIQPVKEIIEA AHRKNIPVLIDAAQSIVHFKHDVQKLDADFLVFSGHKLFAPMGIGVMYGKKELLDKMPPF LYGGDMIEFVTEQESTFAPLPNKFEGGTQNVGGAVGLKEAIDFIQAIGYDKIDKIGKELD MKALFEIKRLDFVETYCTENVERTGIIAFNVKGVHSHDVAFILDSYHVGVRSGHHCAQPL MNYIGIPSCCRASFGIYNDDEDIAKLVEGLKKVKEVFGI >gi|261748279|gb|ADAD01000051.1| GENE 24 22679 - 23113 491 144 aa, chain + ## HITS:1 COG:TP0615 KEGG:ns NR:ns ## COG: TP0615 COG0822 # Protein_GI_number: 15639603 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Treponema pallidum # 1 144 1 143 147 157 50.0 4e-39 MNFEKIYQQTILEYSNRKDLKKEIENPDYVERGHNPNCGDDLTLEVKLDGNIIEDAAFIG NGCAISSASTAMLIDLIKGKTMEEAEEKVNLFFKMMKQEEKLTSEEQKKLGDAVLMEYVA NMPARIKCATLSWHSLRVITEKRK >gi|261748279|gb|ADAD01000051.1| GENE 25 23318 - 24313 1561 331 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0390 NR:ns ## KEGG: Lebu_0390 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 327 1 327 336 170 35.0 1e-40 MKKFLIAVSMLLILISCGKGKAKDSKSNPFDNLNQGGTKKEVNEVEKYNMYVGIYNDLLN FEDDVNDYFEDAGDKEQFQKPSGSVDASFSDLTSLIKKMKEAVSAKPAMAELDKSTAALT SVIEEASPLTQDMKVYYDGKDYTSDNYKKAQEFHTKFLAIVKKYNEAVVPFRAAMDKKVA EQREKEMKMLQKEGRKISYNKMMVLSVSEEILAEIKKQKLNGANFTTGDVSKFKPLQEKL INAVSEFQNSVNDEAQLKKEGYASHTLSSFLSEATEFKAATATFIERIENKKKVDDFKLS NSFFLETENGTPENVIRSFNELVRAYNSSNR >gi|261748279|gb|ADAD01000051.1| GENE 26 24339 - 25016 814 225 aa, chain + ## HITS:1 COG:no KEGG:Sterm_3339 
NR:ns ## KEGG: Sterm_3339 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 19 225 47 251 251 210 52.0 4e-53 MRKILILITILAALLISCGKKKLTKEQQHKMVYGAVLLTRNDFPYDRLPDKKDCDKCSVE VLSRDWDITDKKSATETLDILLDEGTRGEVDLILPELKSPNALSGEYAEVAATYNNIRDA LVNDYGYTKEEVDNVKTISAWDYDRLINVARFAYDAGYITEEEMWSYIDKTVVKARNDYD SWKSYFAGVMLGRGITFSNNFSENKIVADKLLKDKNSPYNKFSFK >gi|261748279|gb|ADAD01000051.1| GENE 27 25123 - 25587 555 154 aa, chain - ## HITS:1 COG:lin1603 KEGG:ns NR:ns ## COG: lin1603 COG2707 # Protein_GI_number: 16800671 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 3 148 4 149 153 120 56.0 9e-28 MDESLLFLGLILFIGIISKNQSIIIATIFVIILKLLPFTEKVMFQFKTKGINWGVLIITV AILIPIATKEIGFLDLLNAFKSPIGWVAILSGIGVSLLSARGVNLLSGQPEITVALVFGT IIGVVFMRGVAAGPVIASGITYCALQIINIFFKK Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:44:56 2011 Seq name: gi|261748169|gb|ADAD01000052.1| Leptotrichia goodfellowii F0264 contig00067, whole genome shotgun sequence Length of sequence - 122364 bp Number of predicted genes - 112, with homology - 105 Number of transcription units - 39, operones - 26 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 2114 2882 ## BCAH187_A2105 hypothetical protein 2 1 Op 2 . + CDS 2111 - 2278 204 ## gi|262037711|ref|ZP_06011153.1| hypothetical protein HMPREF0554_1047 3 2 Tu 1 . - CDS 2339 - 3091 707 ## COG1414 Transcriptional regulator - Prom 3115 - 3174 13.9 + Prom 3110 - 3169 13.7 4 3 Op 1 . + CDS 3300 - 3416 101 ## + Prom 3423 - 3482 1.6 5 3 Op 2 . + CDS 3523 - 3921 583 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase + Term 3987 - 4032 -0.2 + Prom 3924 - 3983 10.5 6 4 Op 1 . + CDS 4060 - 4767 745 ## COG1011 Predicted hydrolase (HAD superfamily) + Prom 4811 - 4870 8.2 7 4 Op 2 . + CDS 4898 - 6991 3084 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II) + Term 7096 - 7156 -0.5 + Prom 7137 - 7196 10.2 8 5 Op 1 . 
+ CDS 7298 - 8098 1100 ## COG0489 ATPases involved in chromosome partitioning
9 5 Op 2 . + CDS 8142 - 8819 898 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
      + Prom 8825 - 8884 7.5
10 5 Op 3 . + CDS 8914 - 9546 552 ## PROTEIN SUPPORTED gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent
      + Term 9585 - 9618 -0.1
      + Prom 9610 - 9669 10.6
11 6 Op 1 . + CDS 9811 - 10848 1558 ## COG1052 Lactate dehydrogenase and related dehydrogenases
      + Term 10860 - 10896 -0.9
12 6 Op 2 . + CDS 10908 - 10982 71 ##
13 6 Op 3 1/0.000 + CDS 10957 - 12003 1504 ## COG1186 Protein chain release factor B
      + Prom 12075 - 12134 4.8
14 6 Op 4 . + CDS 12165 - 13667 1975 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases
      + Term 13673 - 13726 11.5
      + Prom 13803 - 13862 11.3
15 7 Op 1 . + CDS 13972 - 15375 1356 ## Lebu_1914 hypothetical protein
      + Prom 15384 - 15443 4.4
16 7 Op 2 . + CDS 15464 - 15877 396 ## gi|262037743|ref|ZP_06011185.1| putative exonuclease subunit C
      + Term 15886 - 15927 2.5
      + Prom 15991 - 16050 7.7
17 8 Op 1 5/0.000 + CDS 16103 - 16942 1016 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
      + Prom 16961 - 17020 8.3
18 8 Op 2 5/0.000 + CDS 17083 - 18285 1468 ## COG1887 Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC
19 8 Op 3 . + CDS 18272 - 19021 945 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
20 8 Op 4 5/0.000 + CDS 19077 - 20357 1757 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis
21 8 Op 5 . + CDS 20360 - 21292 1419 ## COG0451 Nucleoside-diphosphate-sugar epimerases
22 8 Op 6 . + CDS 21320 - 24850 4888 ## COG1196 Chromosome segregation ATPases
23 8 Op 7 . + CDS 24869 - 26167 1879 ## Lebu_1377 hypothetical protein
24 8 Op 8 2/0.000 + CDS 26169 - 27170 1490 ## COG2853 Surface lipoprotein
25 8 Op 9 . + CDS 27190 - 27897 374 ## PROTEIN SUPPORTED gi|239830964|ref|ZP_04679293.1| Ribosomal protein L11 methyltransferase
26 8 Op 10 . + CDS 27910 - 28215 425 ## Lebu_1599 hypothetical protein
27 8 Op 11 . + CDS 28234 - 29361 1054 ## COG4292 Predicted membrane protein
28 8 Op 12 . + CDS 29364 - 29447 64 ##
29 8 Op 13 . + CDS 29485 - 32307 3562 ## COG0178 Excinuclease ATPase subunit
      + Prom 32457 - 32516 6.9
30 9 Tu 1 . + CDS 32540 - 33112 576 ## Lebu_0612 hypothetical protein
      + Term 33125 - 33176 6.9
      + Prom 33196 - 33255 9.2
31 10 Tu 1 . + CDS 33296 - 34351 2002 ## COG1064 Zn-dependent alcohol dehydrogenases
      + Term 34367 - 34402 1.8
      + Prom 34372 - 34431 6.4
32 11 Op 1 . + CDS 34473 - 35582 1507 ## COG0836 Mannose-1-phosphate guanylyltransferase
33 11 Op 2 . + CDS 35619 - 36596 1514 ## COG0794 Predicted sugar phosphate isomerase involved in capsule formation
34 11 Op 3 8/0.000 + CDS 36624 - 38081 1503 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid
35 11 Op 4 . + CDS 38124 - 39017 1067 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
36 11 Op 5 . + CDS 39044 - 40480 1149 ## Lebu_2029 hypothetical protein
37 11 Op 6 . + CDS 40428 - 41183 766 ## COG3774 Mannosyltransferase OCH1 and related enzymes
38 11 Op 7 . + CDS 41207 - 41980 985 ## Lebu_2024 endonuclease/exonuclease/phosphatase
39 11 Op 8 . + CDS 42061 - 42213 171 ##
      + Term 42232 - 42289 -0.1
40 12 Tu 1 . - CDS 42282 - 43169 853 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control
      - Prom 43406 - 43465 8.8
      + Prom 43072 - 43131 5.9
41 13 Op 1 . + CDS 43286 - 43438 182 ##
42 13 Op 2 . + CDS 43429 - 44160 851 ## COG4123 Predicted O-methyltransferase
      + Term 44396 - 44462 30.0
      + TRNA 44366 - 44451 55.2 # Ser TGA 0 0
      + TRNA 44469 - 44562 66.8 # Ser GCT 0 0
      + TRNA 44576 - 44652 75.1 # Arg ACG 0 0
      + Prom 44493 - 44552 80.4
43 14 Op 1 . + CDS 44684 - 45694 1423 ## COG2502 Asparagine synthetase A
      + Prom 45770 - 45829 5.4
44 14 Op 2 . + CDS 45928 - 48420 3274 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming)
      + Term 48657 - 48724 31.1
      + TRNA 48620 - 48707 75.1 # Leu TAA 0 0
      + TRNA 48734 - 48809 84.0 # Gly TCC 0 0
      + TRNA 48814 - 48889 92.4 # Lys TTT 0 0
      + TRNA 48908 - 48984 87.6 # Arg TCT 0 0
      + TRNA 48989 - 49065 97.8 # Met CAT 0 0
      + TRNA 49071 - 49145 53.4 # Glu TTC 0 0
      + TRNA 49217 - 49307 54.5 # Ser CGA 0 0
      + TRNA 49321 - 49396 86.3 # Phe GAA 0 0
      + Prom 49322 - 49381 79.3
45 15 Tu 1 . + CDS 49527 - 49667 154 ##
      + Term 49901 - 49965 5.1
      + Prom 50027 - 50086 6.5
46 16 Tu 1 . + CDS 50190 - 51329 1354 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
      + Prom 51382 - 51441 7.0
47 17 Op 1 . + CDS 51489 - 52037 851 ## COG0634 Hypoxanthine-guanine phosphoribosyltransferase
48 17 Op 2 1/0.000 + CDS 52027 - 53331 1930 ## COG0104 Adenylosuccinate synthase
49 17 Op 3 . + CDS 53348 - 54781 1760 ## COG0015 Adenylosuccinate lyase
      + Term 54943 - 55013 22.0
      + TRNA 54921 - 54996 84.4 # Gly GCC 0 0
      - Term 54908 - 54976 31.2
50 18 Tu 1 . - CDS 55193 - 55912 1123 ## COG0217 Uncharacterized conserved protein
      + Prom 56130 - 56189 10.6
51 19 Op 1 . + CDS 56219 - 56965 883 ## COG3177 Uncharacterized conserved protein
52 19 Op 2 . + CDS 56993 - 61309 5548 ## COG2176 DNA polymerase III, alpha subunit (gram-positive type)
53 19 Op 3 . + CDS 61338 - 61763 521 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases
54 19 Op 4 . + CDS 61768 - 62232 603 ## gi|262037798|ref|ZP_06011240.1| hypothetical protein HMPREF0554_1109
55 19 Op 5 . + CDS 62271 - 62852 697 ## LVIS_2035 4-methyl-5(B-hydroxyethyl)-thiazole monophosphate biosynthesis protein
56 20 Tu 1 . + CDS 63180 - 63722 533 ## CbC4_4170 hypothetical protein
      + Term 63818 - 63857 -0.9
      + Prom 63724 - 63783 7.9
57 21 Op 1 . + CDS 63882 - 64763 1218 ## COG1396 Predicted transcriptional regulators
58 21 Op 2 . + CDS 64802 - 65815 953 ## CLOST_1278 hypothetical protein
      + Prom 65826 - 65885 7.0
59 22 Tu 1 . + CDS 66011 - 66616 501 ## gi|262037791|ref|ZP_06011233.1| conserved hypothetical protein
      + Term 66635 - 66666 3.4
60 23 Op 1 . - CDS 66671 - 67528 766 ## Lebu_1019 hypothetical protein
61 23 Op 2 . - CDS 67559 - 68416 597 ## Lebu_1019 hypothetical protein
62 23 Op 3 . - CDS 68431 - 69048 795 ## SSA_1645 hypothetical protein
      - Prom 69164 - 69223 10.9
      + Prom 68975 - 69034 8.1
63 24 Op 1 . + CDS 69222 - 70892 2729 ## COG2759 Formyltetrahydrofolate synthetase
64 24 Op 2 . + CDS 70911 - 72116 1685 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities
      + Prom 72118 - 72177 2.7
65 24 Op 3 . + CDS 72199 - 72537 496 ## COG3042 Putative hemolysin
      + Prom 72575 - 72634 10.7
66 25 Op 1 . + CDS 72693 - 73505 990 ## COG2514 Predicted ring-cleavage extradiol dioxygenase
67 25 Op 2 . + CDS 73527 - 74213 721 ## Sterm_1150 XRE family transcriptional regulator
      + Prom 74285 - 74344 9.2
68 26 Op 1 2/0.000 + CDS 74510 - 75844 1769 ## COG2239 Mg/Co/Ni transporter MgtE (contains CBS domain)
      + Prom 75892 - 75951 8.2
69 26 Op 2 . + CDS 75984 - 77324 1841 ## COG2239 Mg/Co/Ni transporter MgtE (contains CBS domain)
      + Term 77462 - 77517 2.1
      + Prom 77505 - 77564 7.4
70 27 Op 1 . + CDS 77585 - 78118 373 ## PROTEIN SUPPORTED gi|163801060|ref|ZP_02194960.1| 50S ribosomal protein L35
71 27 Op 2 . + CDS 78136 - 78342 318 ## PROTEIN SUPPORTED gi|229212793|ref|ZP_04339152.1| LSU ribosomal protein L35P
72 27 Op 3 . + CDS 78373 - 78717 543 ## PROTEIN SUPPORTED gi|229212792|ref|ZP_04339151.1| LSU ribosomal protein L20P
      + Term 78746 - 78802 1.1
      - Term 78721 - 78759 5.3
73 28 Op 1 . - CDS 78809 - 80008 1511 ## COG0460 Homoserine dehydrogenase
74 28 Op 2 . - CDS 80021 - 81244 1573 ## COG0527 Aspartokinases
      - Prom 81473 - 81532 8.6
      + Prom 81398 - 81457 10.6
75 29 Tu 1 . + CDS 81654 - 81851 321 ## COG1278 Cold shock proteins
      + Term 81876 - 81919 5.6
      + Prom 81895 - 81954 11.1
76 30 Op 1 . + CDS 82113 - 83792 1955 ## COG0692 Uracil DNA glycosylase
77 30 Op 2 . + CDS 83805 - 84548 732 ## COG0300 Short-chain dehydrogenases of various substrate specificities
78 30 Op 3 . + CDS 84575 - 85723 874 ## Lebu_1741 ceramide glucosyltransferase
79 30 Op 4 . + CDS 85716 - 86348 446 ## FN1846 hypothetical protein
80 30 Op 5 . + CDS 86320 - 87600 1461 ## COG1819 Glycosyl transferases, related to UDP-glucuronosyltransferase
81 30 Op 6 3/0.000 + CDS 87597 - 88670 1118 ## COG0451 Nucleoside-diphosphate-sugar epimerases
82 30 Op 7 2/0.000 + CDS 88645 - 89457 739 ## COG0491 Zn-dependent hydrolases, including glyoxylases
83 30 Op 8 1/0.000 + CDS 89441 - 90727 1131 ## COG1541 Coenzyme F390 synthetase
84 30 Op 9 . + CDS 90724 - 91653 1200 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III
      + Term 91665 - 91696 3.1
      - Term 91652 - 91683 3.1
85 31 Tu 1 . - CDS 91747 - 92925 1149 ## COG0025 NhaP-type Na+/H+ and K+/H+ antiporters
      - Prom 93042 - 93101 7.4
      + Prom 93055 - 93114 5.5
86 32 Tu 1 . + CDS 93141 - 94265 1346 ## COG0787 Alanine racemase
      + Prom 94329 - 94388 2.8
87 33 Tu 1 . + CDS 94409 - 94597 428 ## COG1983 Putative stress-responsive transcriptional regulator
      + Term 94630 - 94661 -0.8
      + Prom 94639 - 94698 7.1
88 34 Op 1 . + CDS 94728 - 95141 653 ## Lebu_1048 hypothetical protein
      + Prom 95153 - 95212 3.9
89 34 Op 2 . + CDS 95241 - 97898 3421 ## COG1164 Oligoendopeptidase F
      + Prom 97924 - 97983 9.2
90 35 Op 1 . + CDS 98020 - 99816 1495 ## Lebu_2228 hypothetical protein
91 35 Op 2 . + CDS 99888 - 100400 829 ## COG2190 Phosphotransferase system IIA components
      + Term 100424 - 100466 5.3
      + Prom 100447 - 100506 5.1
92 36 Op 1 . + CDS 100529 - 102991 2581 ## COG1199 Rad3-related DNA helicases
93 36 Op 2 7/0.000 + CDS 103005 - 105059 736 ## PROTEIN SUPPORTED gi|163762592|ref|ZP_02169656.1| ribosomal protein S21
94 36 Op 3 11/0.000 + CDS 105059 - 105523 723 ## COG0319 Predicted metal-dependent hydrolase
95 36 Op 4 . + CDS 105566 - 106390 1250 ## COG0818 Diacylglycerol kinase
96 36 Op 5 . + CDS 106444 - 107367 1308 ## COG1227 Inorganic pyrophosphatase/exopolyphosphatase
97 36 Op 6 . + CDS 107431 - 108432 315 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16
98 36 Op 7 . + CDS 108509 - 109300 693 ## LBUL_0130 ABC transporter permease
99 36 Op 8 . + CDS 109305 - 110078 665 ## COG3694 ABC-type uncharacterized transport system, permease component
100 36 Op 9 . + CDS 110154 - 110501 431 ## Sterm_0612 hypothetical protein
101 36 Op 10 . + CDS 110526 - 111191 832 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family
      + Term 111198 - 111234 0.1
102 37 Op 1 . - CDS 111218 - 111628 138 ## gi|262037787|ref|ZP_06011229.1| sn-glycerol-1-phosphate dehydrogenase
      - Prom 111648 - 111707 9.8
103 37 Op 2 . - CDS 111719 - 112822 1467 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain
      - Prom 112928 - 112987 26.6
      + Prom 112841 - 112900 9.3
104 38 Op 1 10/0.000 + CDS 112958 - 113770 1063 ## COG1349 Transcriptional regulators of sugar metabolism
105 38 Op 2 . + CDS 113849 - 114766 1206 ## COG1105 Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB)
106 38 Op 3 . + CDS 114771 - 114893 238 ##
107 38 Op 4 . + CDS 114903 - 116681 3088 ## COG1299 Phosphotransferase system, fructose-specific IIC component
      + Term 116832 - 116891 7.0
      + Prom 116762 - 116821 16.3
108 39 Op 1 44/0.000 + CDS 116982 - 118022 1561 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component
109 39 Op 2 6/0.000 + CDS 118024 - 118965 731 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9
110 39 Op 3 49/0.000 + CDS 118967 - 119929 1460 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components
111 39 Op 4 5/0.000 + CDS 119940 - 120839 1457 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components
112 39 Op 5 .
+ CDS 120893 - 122363 2183 ## COG0747 ABC-type dipeptide transport system, periplasmic component Predicted protein(s) >gi|261748169|gb|ADAD01000052.1| GENE 1 3 - 2114 2882 703 aa, chain + ## HITS:1 COG:no KEGG:BCAH187_A2105 NR:ns ## KEGG: BCAH187_A2105 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_AH187 # Pathway: not_defined # 1 703 21 720 723 1084 73.0 0 KELMERWGADAVRDSDGTKLDEEIKKLDTKIYTTYFVARAHNEFAEKHIEECQQLYLMSK FNMALSEELSIDFINGYFTEQIKPDYIHDEKIYWEVIDRTSDEIVSVSDWSLDKENNCVN IKKAKPFHEYTVSFLAYMIWDPTQMYNHITNNWGDKPHDIPFDVRQPASNTYMSEYLKEW LKNNPDTDVVRFTTFFYHFTLVFNNLGKEKFVEWFGYGASVSVAALEAFAEAKGYRLRAE DIVDKGYYNTTFRVPTKQYLDYIDFIQEFVASEVKKLVDIVHDSGKEAMMFLGDNWIGTE PYGKYFEKTGLDAVVGSVGGGATLRMIADIPHVKYTEGRFLPYFFPDTFYEGNDPVPETI QNWINARRAVMRNPVDRIGYGGYLSLAYKFPKFVDCVEKIAEEFRNIYDIAKNGKPYCAI KVAILNSWGKLRTWQTHMVAHALHYKQIYSYLGVLEALSGMAVDVSFISFDDVLNTDILN EIDVVINAGDAETAFSGGEVWKNDKLVSKIREWVYNGGGFIGIGEPSAYEYGGRFFQLGQ ILGVDKELGFSLSTDKYFTEMQENHFIKEDLKSGESFDFGESMKNIYSLYEDTEILEYSD GEIHLSSHDFGKGRGIYIAGLPYNAQNARLLMRALYYAAHKENEFKKWNCSNIHCEVHVY PSKNKIVYVNNSMSEQTTIVYNEKGEAKTIVLKADEIIWEELN >gi|261748169|gb|ADAD01000052.1| GENE 2 2111 - 2278 204 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037711|ref|ZP_06011153.1| ## NR: gi|262037711|ref|ZP_06011153.1| hypothetical protein HMPREF0554_1047 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1047 [Leptotrichia goodfellowii F0264] # 1 55 1 55 55 92 100.0 7e-18 MNTTIKICIYLLLFFVFNGLIITGAKQADTEHLLRMIIGLSGLIGLLWVYNRSHK >gi|261748169|gb|ADAD01000052.1| GENE 3 2339 - 3091 707 250 aa, chain - ## HITS:1 COG:BH2137 KEGG:ns NR:ns ## COG: BH2137 COG1414 # Protein_GI_number: 15614700 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 8 250 2 248 251 160 38.0 2e-39 MNPQTNNVQSLDRALAILEILSESEILTLSEITARTGLTKSTVHRLLSALMQNGYIEKIN TGEYRITLKLFKLGNQRIQKIDFINIAKNFAGQLSSETNETVHLVIPDNNEILYIDKFES ENILFKTASKIGKTAPIYSTAVGKALVASYSNQDIVNMWNNFDLKKFTDNTITDLDEFIK 
EIEKVRKNGYAFDNEENEYGTFCIGAAFYNYLRKPVGAVSLSTSAENPDKEKYINKVLIC ANRISGMLGY >gi|261748169|gb|ADAD01000052.1| GENE 4 3300 - 3416 101 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSSFEKYEVLKTLTDSKIVAVIRGNSSKRSDRNRRSMC >gi|261748169|gb|ADAD01000052.1| GENE 5 3523 - 3921 583 132 aa, chain + ## HITS:1 COG:SP0317 KEGG:ns NR:ns ## COG: SP0317 COG0800 # Protein_GI_number: 15900249 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Streptococcus pneumoniae TIGR4 # 1 127 76 206 209 145 54.0 2e-35 MDAETARMAILSGAKFIVSPAFSKEVAALCNRYQIPYIPGCQTVTEILTALEAGCDLIKL FPGNNFDHSYMKAVKAPVPKVKFMVTGGVNVDNIAEWLKAGANAVGVGGNLVKGSKEDII KAAKEYLAKIKE >gi|261748169|gb|ADAD01000052.1| GENE 6 4060 - 4767 745 235 aa, chain + ## HITS:1 COG:YPO2295 KEGG:ns NR:ns ## COG: YPO2295 COG1011 # Protein_GI_number: 16122519 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Yersinia pestis # 2 235 3 222 224 132 33.0 5e-31 MYKLIFIDADDTLFDFRKAQGNAMKKTFEDFDYFEKEGNEEKFDKIKEDYKVINSKLWRE LEKGQIKEDELRVLRFERLFEKNSLKYNTKIFSEKYSERLSEGSFLLEGAEELCKYSSEK YRMIIITNGIKKVQIPRIKNSKINKYIEKIVVSEDTRVSKPNIEIFQYALDLANKKNKVT KQEIIMIGNSQSADIQGGINFGIDTCWINLKNQQKNEDIKAKYTVKSYKELYNFL >gi|261748169|gb|ADAD01000052.1| GENE 7 4898 - 6991 3084 697 aa, chain + ## HITS:1 COG:FN1717 KEGG:ns NR:ns ## COG: FN1717 COG0272 # Protein_GI_number: 19705038 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA ligase (contains BRCT domain type II) # Organism: Fusobacterium nucleatum # 11 691 24 688 696 641 55.0 0 MKLFDNEIKSETVDFEEYKKLREDIERYNRYYYDEDRPLISDKEYDDLIKKLQKIEEKNP ELKNIYTVFSKDNSDISDIEENPEITPTEKIGGTASEKFSKVVHSVPMLSLSNTYNISEI EDFDQRARKITGLDKKLEYILELKLDGLSISLIYEKGMFKQAITRGDGQIGEDVTENVRE ISSVPKKLKDSIDVEVRGEIVLPISNFNKVNEIREEAGEDVFANPRNAAAGTIRQLDSSI VKDRGLDCYLYYLVNPERYEIKTHLESMEFIKKLGFKTTDIFEKYTDFKELEKSIEKWHD EREKLDYETDGLVIKVNDFSYYNELGYTTKSPRWAIAYKFPAEQVKTKLLDVTFQVGRTG 
VITPVAELEPVTLSGSVVKRASLHNFDEIERKEIKIGDNVIIEKAAEIIPQVVNVVFEDR TGKEREIEKPTNCPVCGSELVKEEGQVALKCLNPVCPEKVKREIEYFVSRDAMNISGLGE KIVEKFIELGKIKTVVDIYSLKDYREELENLEKMGKKSVDNLINNIEESKNRGFNKVLYS LGIPFVGKFTANLLVKNFKNIENLKNSEIEELLQIKGVGEKVAISVHNYLNNEDNWGKIN DLKGRGLNFSLENAEEIEIEDNPIKGKNFLATGKLEKYKREEIKEIILSKGGNYLSGVSK NLDFLIAGEKAGSKLEKAEKLNVRVLTEEEFEREFLK >gi|261748169|gb|ADAD01000052.1| GENE 8 7298 - 8098 1100 266 aa, chain + ## HITS:1 COG:MJ0283 KEGG:ns NR:ns ## COG: MJ0283 COG0489 # Protein_GI_number: 15668458 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Methanococcus jannaschii # 9 264 26 285 290 247 47.0 2e-65 MHGNPNVDERKKKINEKLSRIKNKIVVMSGKGGVGKSTVSVNLAYGLYLRGYKVGILDAD LHGPNVPLMLGKEGVKLPALSTPLKIAENLSISSLSFFVPDNDPIIWRGPQKMGAIMEML EGIEWGEMDFLIVDLPPGTGDETLSIAQNIGSDARSIVVTTPQDVSLLDSKRTVKFSRLI NLKLLGIIENMSGFICPDCGKEVNIFKKGGAEKMAAETKQTFLGSIPMEANIVESGDNGL PYISNDSTASRKMNDIINKVLEQLNM >gi|261748169|gb|ADAD01000052.1| GENE 9 8142 - 8819 898 225 aa, chain + ## HITS:1 COG:ML0174 KEGG:ns NR:ns ## COG: ML0174 COG0745 # Protein_GI_number: 15826987 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Mycobacterium leprae # 1 225 1 225 228 123 35.0 3e-28 MKILVCNKNEKSRLVVGQLLKEMSFSVLLVDSEEKVVEEMKSDSLDIVFLDIASFSNFPK ILEKIIKNKKDSYILLSIDSDDRYSKTEALLNGADDYIYNDFRLEELSAKLKSIVRILTK KINNDDSEVLSIYDLTLNPINREVIRDGKEIVLTNKEFLLLEYFLRNKNRVLTRTMISEK IWDIDFITESNIVDVYINFLRAKVDKGFDKKIIKTVRSVGYIVKE >gi|261748169|gb|ADAD01000052.1| GENE 10 8914 - 9546 552 210 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent [Streptococcus pneumoniae TIGR4] # 6 210 7 201 201 217 52 2e-55 MFEQVKLPYAYDALEPNIDAKTMEIHYTKHYAAYTNNLNDALKNNAPEFLNKSIEEILGN LDALPEEIRAGVRNNGGGFYNHNLYFTVIGPNGGGEPTGELAQKINEAFGSFQGFKDEFS 
KAAATRFGSGWAWLIVNKKGELKVRSTANQDNPLIPGATCECSHGTPILGIDVWEHAYYL NYQNRRPDYINAFFNVIDWNAVAKRYEEAK >gi|261748169|gb|ADAD01000052.1| GENE 11 9811 - 10848 1558 345 aa, chain + ## HITS:1 COG:STM1647 KEGG:ns NR:ns ## COG: STM1647 COG1052 # Protein_GI_number: 16764991 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Salmonella typhimurium LT2 # 1 330 1 329 329 348 51.0 1e-95 MKIVVFDAKPYDIEFFEKWNEKYGAKITYFEEKLSLKNVMLTKYQDVVCTFVNDDLNEKV INVLSKNGIRAIAIRAAGYNNVDMKAANDNRVTVFRVPAYSPYAVAEHALALLMTVNRKI HKAYNRTRDGNFSLVGLTGIDLNGKTAGIIGTGKIARIFIKILNGLGMNVIAYDKFPNEQ AAKDENFKYVELDELFAKSDVISLHCPLTPETRHIINGENIDKMKKGVIIINTARGALVD TSILVEALKDKKIGGAGLDVYEGERDYFFDDKSANVLEDDILARLLTFNNVIVTSHQAFL TDEALNNIVETTFDNILKFAKKEVLQNEVWYDEENDKIVEGPRKK >gi|261748169|gb|ADAD01000052.1| GENE 12 10908 - 10982 71 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MELFEAKKVNEQNSTEIEEIRRHL >gi|261748169|gb|ADAD01000052.1| GENE 13 10957 - 12003 1504 348 aa, chain + ## HITS:1 COG:BS_prfB KEGG:ns NR:ns ## COG: BS_prfB COG1186 # Protein_GI_number: 16080582 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Bacillus subtilis # 6 344 22 362 366 378 56.0 1e-105 MRKSGGIFDLDSLNLKIERIEKKTAEPDFWNRENSQETLKELNILKKLIEEYEKISGMNE DVSVMIEFIEAGDDSFEKELDDKIDTVTEEIQNFKTKLLLDEKYDSNNAILTINAGAGGT EACDWTEMLYRMYDRWSNRKDFKVEVLDSLSGEEAGIKSITLNIKGSYAYGYLKGEKGVH RLVRISPFDSNGKRHTSFTAVNVVPEIDDDVEVNIRPEDLKVDTYRASGAGGQHVNTTDS AVRITHIPTNTVVTCQKERSQLKNKETAMKILKSRLFEIELEKREKEMENIKGTESKIEW GSQIRSYVFQPYKMVKDHRTKAEEGNVDKVMDGDIDSFINEYLKFYKN >gi|261748169|gb|ADAD01000052.1| GENE 14 12165 - 13667 1975 500 aa, chain + ## HITS:1 COG:FN1340 KEGG:ns NR:ns ## COG: FN1340 COG0008 # Protein_GI_number: 19704675 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Fusobacterium 
nucleatum # 3 500 12 510 516 692 67.0 0 MSEKRVRVRIAPSPTGDPHVGTAYIGLFNYVFAKHNGGDFLLRIEDTDRTRFSETSEQQI FNMMKWLGLNYDEGPDIGGNSGPYRQSERFSIYKEYAEKLVEKGEAYYCFCTPERLQKLR ERQIAMKQAPGYDGHCRNLSKEEVEAKLAAGEPYVIRLKMPYEGETIVRDGLRGDIVFEN SKIDDQVLLKSDGFPTYHLANIVDDHLMGITHVIRAEEWIASTPKHIQLYKAFGWDEPKW YHMPLLRNADKTKISKRKNPVSMNYYMEEGYLKEGLLNFLALMGWSFGGDKEIFTIDEMI ENFSFDKISLGGPVFDLVKLGWVNNQHMRLKDLDELTKLAIPYFVKAGYYKDEDLSEEEY GKLRRIVEISREGAHTLKELPEIASIYFEDKFELPVVEEGMNKKERKSVEKLRSSLETET GKKSVELFQEKLSKLNENISEEEAKQLLHELQEELGEGPAAVIMPLRAVITGKARGADLY TVIAIIGKERTLNRIKNILK >gi|261748169|gb|ADAD01000052.1| GENE 15 13972 - 15375 1356 467 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1914 NR:ns ## KEGG: Lebu_1914 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 4 467 3 453 466 114 28.0 7e-24 MYSIGYKEVDILWEIANFKIADKNMIMKNYNISKTELSDTLKKINNLLNWEKMNYLLDKR GYILFNNELSERRFKLESIIKFTNKIRREIIYLFLILSEKHFNLTKFTLRFGLGESSKKY VRTDLKAVLKELDIDYAEYRESKNPLEIIKRKIPNFEKTRKEYLINLFIKRFEKSYVYYK EEKDKDGKELNYPSKLIFEIIKEELKIKDLKFIFHAFREFYVKYNKLFSVKEKYEIFIYF ISNLYNEKENTRKRITSKNVKMKEDNDYKLLEEMLDKISENENRRIYSYFKLKLYRLLLK KEKMNFDITPKKSKEIFYNNTESLNIKIAEEYVYQIAEENSKIEYYENFERLKTLFVLDL EYNEIIKNQKYLKDLMQLSNIIEVTDLGELSEKLNGKNEKNFEQVFLISKSEIQNMIKNY SDIPIYFFKINDKDRFGRKTGKLKRRTDSEKQLTEIRETITEYNRIK >gi|261748169|gb|ADAD01000052.1| GENE 16 15464 - 15877 396 137 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037743|ref|ZP_06011185.1| ## NR: gi|262037743|ref|ZP_06011185.1| putative exonuclease subunit C [Leptotrichia goodfellowii F0264] putative exonuclease subunit C [Leptotrichia goodfellowii F0264] # 1 137 1 137 137 209 100.0 6e-53 MTIKTGNLKIDEKVYFLKNIETPVEVQEEIKKFSCENENHNTENGDSEIEGKEISVDLGQ LKFVVCPVKIKNSDIKAKVIKSDKKVEYGEIKINDKAKKFKSDLFFAIYYSSEKKFNFIF EELMEHIIEKIDMYREL >gi|261748169|gb|ADAD01000052.1| GENE 17 16103 - 16942 1016 279 aa, chain + ## HITS:1 COG:alr3062 KEGG:ns NR:ns ## COG: alr3062 COG0463 # Protein_GI_number: 17230554 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
Glycosyltransferases involved in cell wall biogenesis # Organism: Nostoc sp. PCC 7120 # 2 167 3 173 321 61 28.0 2e-09 MKVSLIMPTINVTEELDLFLKSLSEQTYKDFELIAIDQNTGNEAFEIIRKYEDDFEIKYM KSNQSGLSLNRNKGLIMMDGDIVGFPDDDCEYEKDTLEKVVAFFKENKDKRIYSCRTLER GKDYGTGIMLENNDELKVDNIDMTVKSITFFVNYSLEDITLFDENLGVGAYFGSGEETDY VLTLLHKGFKGNYFADDIIYHPAKKGNYDDLERAYKYALGYGALVKKEVKCRKNFFYIFK YWKKLFRNIGGMIVTKNRKYHWFVLKGRLRGYFRYKCQG >gi|261748169|gb|ADAD01000052.1| GENE 18 17083 - 18285 1468 400 aa, chain + ## HITS:1 COG:BS_ggaB_2 KEGG:ns NR:ns ## COG: BS_ggaB_2 COG1887 # Protein_GI_number: 16080621 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC # Organism: Bacillus subtilis # 24 390 13 372 381 105 27.0 2e-22 MDKIKCLVNMIFACILYPFTKSKFKNKRIWLIGGNAGELYVDNGRAMYEYLRTKKEVETY WVINKDSPVFEQIPGNKLIRGSVKDYLYFMNSEVVLFSHSISADIVPYLFAVPFIKGFHY KPIKVFLNHGTVGFKVRMAMNKKTEKIAEELVRSYDINICDSEYESVIKRDAWWNIEGEK IFVTGYPRYDKLYNAEIKENEILFMPTWRNWIKSENLKIEDTEYFKNITGLVTDGKLNEY LERKGILLNLYIHQLMHDYLGNFGSIKLGKNIRLLPKEAEITKELMKARILITDYSSVAY DFYYLDKPIIFFQFDKKEYTEKVGSYVDLDNELFGKEAYTIEECVKDIIEISENNFKYDG VIEEKSEKLKDKFLKYNDKDNCKRVYELILSKIGEKNDKR >gi|261748169|gb|ADAD01000052.1| GENE 19 18272 - 19021 945 249 aa, chain + ## HITS:1 COG:BH3661 KEGG:ns NR:ns ## COG: BH3661 COG0463 # Protein_GI_number: 15616223 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus halodurans # 6 247 11 253 257 226 47.0 3e-59 MIKDKVSVIIPMYNAEKFISRTVESVINQTYKNWELLIINDNSGDKSYEIVKKYSEKDDR IKVITVEKNIGVVKGRNKLIDMAEGQYIAFLDADDYWAENKLERQIAFMKEKNAAISCTE YTRVTEEGKPINRIIIKEKITYTDLLKNNYLGCLTVMLDNEKLGKRYFKERNKNEDYVLW LEVIKETGEIYGLKEDLAFYRVLNNSRSSNKAGAAKVRWEIYRKEEKLSFLKSLYYFINY AIIALKKTK >gi|261748169|gb|ADAD01000052.1| GENE 20 19077 - 20357 1757 426 aa, chain + ## HITS:1 COG:all2854 KEGG:ns NR:ns ## COG: all2854 COG2148 # Protein_GI_number: 17230346 # 
Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Nostoc sp. PCC 7120 # 184 426 227 469 469 265 50.0 2e-70 MAASGVRRNFSYLFGFLTVAMYFLGLFIINRGLSFKNALIIVMALIAYYIANVYNVAVGK YRLRDMATVIVINFILMCITIFVKIFSLNEGIILFGIISIFQIVFRYVVIMGVVEKQKII FVGENDYTKDLSDSIKNDGQYRFLGLLKEDDPKLRETILKIYKVNKPDIIVDFTENLLIE PKFVDKLLQYKLNGLQYYNYREFYETYENKLPISHLSPKWFLENSGFEIYHNNFNLKAKR LLDLFFAFLIGVCVAPIMLIAAIIVKLESKGPVFFIQERIGEGNKKFNIVKFRSMTTDAE KDGPKWATKNDNRVTKFGKIMRLTRIDELPQLWNVFRGEMSFVGPRPEREYFIQQLEKEI PYYNLRHTVKPGLTGWAQVMYPYGASVEDAYRKLQYDLYYIKHHDILFDMKVLLKTVTIV IFGKGR >gi|261748169|gb|ADAD01000052.1| GENE 21 20360 - 21292 1419 310 aa, chain + ## HITS:1 COG:BH3379 KEGG:ns NR:ns ## COG: BH3379 COG0451 # Protein_GI_number: 15615941 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Bacillus halodurans # 1 310 1 306 308 278 48.0 8e-75 MAKKILITGGAGFIGSHIAERFDKENYEIIIVDNLVGGKKENISHLKNIRFYEVDVRDRE SLEKVFEKNKEINYVFHEAAQVSVSVSVENPHYDADENVLGLINVLDMCRKYSVEKVLFA STAAAYGIPKTSVSAEDSKIAPLAPYGLTKVFGEHYIRMYHDLFGLNYVIFRYANVYGPR QSAHGEAGVVSIFNDRMKVEQEIFIDGDGEQTRDFIYVRDIAEANYVCAVESVINKTLNV STNAKTSINELFNYMKKYSGYKKEANYREPRKGDIRDSRLDNTKLKSNTSWNYKYSLEKG LKEYAEYEKK >gi|261748169|gb|ADAD01000052.1| GENE 22 21320 - 24850 4888 1176 aa, chain + ## HITS:1 COG:FN1129 KEGG:ns NR:ns ## COG: FN1129 COG1196 # Protein_GI_number: 19704464 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Chromosome segregation ATPases # Organism: Fusobacterium nucleatum # 1 1175 11 1180 1193 507 35.0 1e-143 MHLKALELAGFKSFADKTVVEFNRGITSIVGPNGSGKSNILDAILWVLGEQSYKNIRAKE SSDVIFSGGKNKKAKSMAEVSLIIENEDGYLDIDFSEIKITRRIYKSGENEYFINNRKAR LKDINNLFMDTGIGKQAYSIIGQGRVERIIGSNPRELKEIIEEAAGVKRAKVEKEESEKK LKEVKSEIEKITYVEKDLETRVKYLKDEGMKARLYKTYTEKIDVHRLMILEYNINEKEIA RKKYTSEKEELKSIIDGIQENLEVKKENLGKLNEKREKAYEKVEIEKNKNFDMFKEIETL 
KNEYSELKNGYSNLETEANEKQKRKAVLEKDIDEKKEMLDSSKIELESITKELTEKEKEK NEVDKKVQDLKDKKDRITADLKEKNQENQDFEIEKFRATADNEDLEKRISTAEHTNKRLI GDKESAEKEFNSINVKKQEFELKRNEKEKEKNELEKEIDNINNGIEILQKRYEGLNKEKN ELNYKYETLNAKKRANENVIESNETFAKSIKYILNENYEGVIGVFINLINIPDGYEEAVQ TLSGGSFQDIVVKNTDIGKKCIDILKEKKIGRASFLPIDGVKVFRITDELPKNEGVIDFA RNIVKYDKSIEKIVHFVYGNSLVVKNIETGTKLLKEGYNDRIVTLEGDIITARGRMTGGY SNKGKNEFLERKKELNTIIGSMAVIEKELKEKDEEALKISERLKQGELKKEEKSSEYKVI SDEYKAFLEEYNSFTADFNKKQREIETLNYEIAENEKTVKERKEKIVENKEIVKNAEENI IKNSIIIKELQNELEKFEDLTEYLNELNKIDTEYKILKVKTENNQLRYKEIEKDYQKLIN EQNELTEFEKKRDSLSEEFQRKISEKKEEIQKKNVENENLNQMIKVLEKDIKKYENEERE LIKEVNDIEMKIFGEKNKHEKVSENLIKAEKDLEYFTGELENIENLSVKENPEYKMIENE TELQNIKRKLSMNEKSRMEMGSVNLAAIEEYERENTRYNELVDQKRDLSESRESLLTLIR DIENEIIEKFSVAFEEINKNFEYMCRTILNGARGVIRLLDSENILETGLELSVKYKNKPE QTLLLLSGGEKSMLAVSFIMAIFMFKPSPFTFFDEIEAALDEENTKKIVKLLNRFIEKSQ FILITHNKETMKGSHRLYGVTMNKEIGESRLISVDV >gi|261748169|gb|ADAD01000052.1| GENE 23 24869 - 26167 1879 432 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1377 NR:ns ## KEGG: Lebu_1377 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 432 1 432 432 581 66.0 1e-164 MLRKVLLIMAFALLTIIGKGEVTDISKIAEDYPYKENAILSTVLGTPSDQWYKFKKAKGP TVKRFKANKAIPAILRQWSDYEYGVWKQKKEAPLMILISGTGSLYNSGLSMFLANVFYER GYNVIAFSSTSTMPYIVSQSKNNYAGYVKDEVPHLYELMAKAISTEKSSGMKIGKIYVGG YSLGGFQSLLIHELDSKSKKIGIDKSLLLNTPVSILTATKQLDKYLIKNGIYNAAGLEKF LDRIFSRIINDKTLELSDIDFTSLNTTLGKLQLTDSDFEALTGLLFRFYSANMTFAGEVF SGRNAIGRLSNKKSYQRFDSVTKEFHEGLSVSFDEYSKEILYPYLKKYKYPDLTMEQFIK DFSLDNSREFIERNSDKIVFITSENDILINEDELAYINKYFTNRVIIPFGGHTGVLWHRD VAKLMVDKMEEN >gi|261748169|gb|ADAD01000052.1| GENE 24 26169 - 27170 1490 333 aa, chain + ## HITS:1 COG:PA3239 KEGG:ns NR:ns ## COG: PA3239 COG2853 # Protein_GI_number: 15598435 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Surface lipoprotein # Organism: Pseudomonas aeruginosa # 120 331 63 265 267 166 41.0 8e-41 MKNKKNKLLTVAALLILATSAEAAKSEVLKPYTEENNTLTQMVEEEVNDDTFVQFVDGEV 
IEEPGKSVDKVEKNNLSTNKKFKNKEYAVNYFLSDKDEYGIIASNIDVLDEQYIISDKVF DLMNVDDSWEPFNRRMYAFNTQFDKKVAYPVSRVYSAVVPQPIRKGIANFYNNFKEIPTA INSLLQLNPKKAVNALGRFAINSTIGILGTNDVATKIGLKKNIETMGDTLGHYGVPTGSY LVLPVLGPSTIRDAIGSLPDAAMEGAVRRVAEKELFFDTKIFDKNIYGFTRPVVTGLNAR SLVDFRYGDLNSPFEYDLVKAIYYNYRKLQVIK >gi|261748169|gb|ADAD01000052.1| GENE 25 27190 - 27897 374 235 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|239830964|ref|ZP_04679293.1| Ribosomal protein L11 methyltransferase [Ochrobactrum intermedium LMG 3301] # 4 215 3 212 245 148 35 1e-34 MSRQNIYDNDIFFQGYKKIREKENNMNDTVEKPMLFSLLPDLKNKKVLDLGCGYGENCVK FIKMGAEKVVGIDISEKMLDIAQKENSDKKVVYLNLAMEDISQINEKFDIIVSSLAFHYV ENYEKLVSDIYNLMNNGGYLVFTQEHPLVTCHSTGERWTKDEEDNKMYANISNYTISGKR ESVWFIDNVTIYHRNFSDLINVLIETGFKMEKIVESIGVDNVHRPDFIAFKALKD >gi|261748169|gb|ADAD01000052.1| GENE 26 27910 - 28215 425 101 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1599 NR:ns ## KEGG: Lebu_1599 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 2 100 4 102 102 109 60.0 4e-23 MDKIKNIENMEKILNNTENIFKEFQAVLDKLEKNQKDYKKLITYYGSEEWFSDVEDSNNN LLPKDLKCGVLSEDAVYNMISDNHELAVKMLEIATEMIKRD >gi|261748169|gb|ADAD01000052.1| GENE 27 28234 - 29361 1054 375 aa, chain + ## HITS:1 COG:SA0341 KEGG:ns NR:ns ## COG: SA0341 COG4292 # Protein_GI_number: 15926054 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Staphylococcus aureus N315 # 7 309 13 316 377 110 29.0 6e-24 MSKILPKKVSMAELFYDLIYVLEVQKVTGVIHHLHHGQISIDTYLKFAFACLVILLLWFN QTIYINKYGTNSYYDIIGMFLHMFGAVYISNNISLDWKEVFVPFNVTAILMSLVIIAEYY LKSREYEKMPYEIKSQILILSLEIAALLLAFVTGPKYRMILAATAYLAAMFFPMYVMPRN DEEVTLNFPHLVERVSLITIIIFGEMVVGLAGVFALGKPSDILPLVSFVAAVLLFGTYVL QIEKMIEHHQKKRGFVLVYSHIGIFMGLSTITTSFGFALNPEVDWKFLIIFRLFGLLVYF VSMFSISVYNKENYRLKIKDVVAYLIFFIIGSLITGFSNKENIFLFNFGMFVMIFMIFMI FMIFGYMFNVVRKRR >gi|261748169|gb|ADAD01000052.1| GENE 28 29364 - 29447 64 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCRIYGNSNKPNEKNEKIERKIIERVK >gi|261748169|gb|ADAD01000052.1| GENE 29 29485 - 32307 
3562 940 aa, chain + ## HITS:1 COG:FN1103 KEGG:ns NR:ns ## COG: FN1103 COG0178 # Protein_GI_number: 19704438 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Fusobacterium nucleatum # 1 940 16 955 960 1335 70.0 0 MNDKIKITGAREHNLKNVDIEIPKNEFVVITGVSGSGKSSLAFDTIYSEGQRRYVESLSA YARQFIGQMQKPELDSIEGLSPAISIEQKSVSKNPRSTVGTMTEIYDYMRLLWAHIGEAH CPVCGQKVEKQSLQEIVDNVITTTQEKDKLIVLAPVIIDKKGTHKNLFLNLQKRGFQRVR VNGDILDLNDTIDLDKNKRHHIEVVVDRLVIKKDDKDFLSRLTEAVETAGGLSDGKIITN INGKDSKYSENFACSDHPEVVFPDVVPRLFSFNAPYGACEACNGLGSTLEVDENKLIVDE NLSLREGGILFPGSTNQKGWNWELFEAMAKAHKIDLDKKVSELTEEERKIIFYGSDKQFK FSWSGDSFSYNGKKEFDGIVRNIERRYRETASESTKEEIEAKYMTEKTCKTCHGKRLKDV VLAITVNGKNIIDLTEVSVTEALKFYENIELTEKQKQIANEILKEIKERLSFMINVGLDY LTLARMTKTLSGGESQRIRLATQIGSRLTGVIYVLDEPSIGLHQRDNEKLLAALKDLKNI GNTLIVVEHDEDTMREADYLIDMGPGAGIYGGEVVAEGTPKQVLKNKKSLTAKYLNGEIG IEVPKKRRKFTKEIILKNAKGNNLKNVTVNIPLEVFTVVTGVSGSGKSTLINQTLFPELH NRLNKGKLYPLENGGIEGLEHLEKVIDIDQSPIGRTPRSNTATYTKIFDDIRDLFAQTKD AQVRGYNKGRFSFNVKGGRCEACGGAGINKIEMNFLPDVYVECEVCKGKRYNRETLEVHY KGKSISDILDMSVEEAYEFFKAVPSLERKLQTLIDVGMNYIKLGQPATTLSGGEAQRIKL ATELSKISRGDTIYILDEPTTGLHFEDIRKLLIVLDRLVEKGNTVLVIEHNLDVIKFADY IIDVGPEGGHRGGQIIAKGTPEQIVKSKKSHTGKFLKKYL >gi|261748169|gb|ADAD01000052.1| GENE 30 32540 - 33112 576 190 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0612 NR:ns ## KEGG: Lebu_0612 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 12 190 3 174 174 166 53.0 5e-40 MKDLIYSGGKMKKNVIIKRILIIMAVLFGIMSCDFLENKTIHVPKKTIQEKTDKKFPVTK NFLVSKVTLKNPKVDFKDEKILIETDYDISLLDDRSKGKMYLSSGIRYDKDKEELYLVDL SVDKITDENGKERVKSKTENTLISLITNYVEMSPVYKYGEEREKREKEGKKKQIKIKNMY IKNCKFYVQT >gi|261748169|gb|ADAD01000052.1| GENE 31 33296 - 34351 2002 351 aa, chain + ## HITS:1 COG:SP0285 KEGG:ns NR:ns ## COG: SP0285 COG1064 # Protein_GI_number: 15900219 # Func_class: R General function prediction only # Function: Zn-dependent alcohol dehydrogenases # Organism: Streptococcus pneumoniae TIGR4 # 1 337 1 336 
339 513 81.0 1e-145 MKAVVVNEKGTGEVQIVEKEVPKVGPGEALVEVEYCGVCHTDLHVANGDFGKVPGRILGH EGIGIVKEVAPDVKTLKVGDRVSIAWFFEGCGVCEYCTTGRETLCRNVKNAGYSVDGGMA EYCLVTADYAVKVPEGLDPAQASSITCAGVTCYKAIKVSKIEPGQWIAIYGCGGLGNLAI QYAKHVFNAHVIAVDINQDKLELAKEVGADYIINGKNVADPAAEIKKITNGGAHAAVVTA VSKVAFNQAIDSVRAAGKVVAVGLPSETMDLPIVKTVLDGIEVIGSLVGTRKDLEEAFQF GAEGKVVPVVQKRALEDAPEIFKEMEEGKIQGRMVLDMKGAGCGCGGHHNH >gi|261748169|gb|ADAD01000052.1| GENE 32 34473 - 35582 1507 369 aa, chain + ## HITS:1 COG:CAC3072 KEGG:ns NR:ns ## COG: CAC3072 COG0836 # Protein_GI_number: 15896323 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannose-1-phosphate guanylyltransferase # Organism: Clostridium acetobutylicum # 6 352 5 336 350 267 42.0 3e-71 MDKVVLIMAGGSGTRFWPLSTNERPKQFLDLVSDKTMIRETVDRVAKMIPAEKIFISTNI DYLDIVKKELPEIPERNIIFEPMARDTAACIGYAALIVKKLYENSIMAVLPSDHLIKKEK EFLESLEFAFEKAKNDVIVTLGIKPSYPETGYGYIEYIKNKNSAKEQSKEKFEIYKVKSF REKPNKEIAEKYIEKGNYLWNSGMFVWKTDFILNEIRKYMDSHKPVTDKIEEMLSNIDLN EFYGKKLSSYVSEEFEKFEKISIDFGVMEHTRAVLVIPVDIDWNDVGSFKSLEDVFPKDK DNNIVKAEKYEEIESEGNIIINKENDKIIATIGLEDIVIVNTKDALLVCHKEKSQEIKKI LTKIKKYKK >gi|261748169|gb|ADAD01000052.1| GENE 33 35619 - 36596 1514 325 aa, chain + ## HITS:1 COG:FN0903_1 KEGG:ns NR:ns ## COG: FN0903_1 COG0794 # Protein_GI_number: 19704238 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted sugar phosphate isomerase involved in capsule formation # Organism: Fusobacterium nucleatum # 1 206 2 206 206 205 57.0 1e-52 MEFDIIKEAQKTFNIEISELERVKNRINENFEKLVYMINGLEHRKVVVTGIGKSGIIGKK IAATLASTGTSAIFINAAEALHGDLGMISEGDIVIAISNSGNSDEILSIMTPIKKIGAEI VAFTGNETSPLAKHAKVVINIGVEKEASNLGTAPMSSTTATLVMGDALASVLMKMRNFTE NDFAKYHPGGSLGKRLLLTVSDLMHSGEELPVLAADENIENVLLVLTKKKMGAVCISETG KENGKLIGIITEGDIRRALVHKEEFFSYKAKDIMISTPVSIGRNAMAMEALKLMENRKSQ INVLPVVENGNVVGIIRVHDLIGLK >gi|261748169|gb|ADAD01000052.1| GENE 34 36624 - 38081 1503 485 aa, chain + ## HITS:1 COG:TM0620 KEGG:ns NR:ns ## COG: TM0620 COG2244 # Protein_GI_number: 15643386 # Func_class: R General function 
prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Thermotoga maritima # 2 444 4 439 479 75 21.0 3e-13 MDNKNLLKGTMVYSLMQIATKMGSFIFLPIITRLLTPAEYGVAGTLGSMITMFTVVLGLG MYNAQMKKYVDLKDDENEMGSYMFSTFLVLFIFNLTVFGFLFTPLSKKLFSYIINLNEIS YHPLITTAVVIAFVNALNTLGVTLSRMKRMYMKVAVGSLISMTTNYILAIFLIGNLRKGI IGFQGANLASSVALLLYYFKDYFGKFRFKAKKEYVKFSLKNGIPLIFIELTDQIVNLSDK LVLLKFIAPGIVGGYNLAADGGRAFSVLKGSFVDSWTPEFYEAMKKDKNNPRITGSIENF IAIISFFCVLCQLFAPEGISLIFSKGYLKAINYMPLILAGIVIQALYCLDYFFHFHENSI YIFYFSVFAMVFNLAGNLIFIPVFPQYGPYIAAWTTILAFLIRAILEMVVIKKKYGISFN YKKFLFYMIIVLNPVILYLSNDNISWTKFGLKIVYLAVVTKLIVNKEVYAKIANIVIKLK NKIVK >gi|261748169|gb|ADAD01000052.1| GENE 35 38124 - 39017 1067 297 aa, chain + ## HITS:1 COG:L17695 KEGG:ns NR:ns ## COG: L17695 COG0463 # Protein_GI_number: 15672198 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Lactococcus lactis # 10 281 12 282 285 171 36.0 2e-42 MKFTVFTPTYNRKELLRKLYESLKNQTYKDFEWLIVDDGSDDNTEKIVKEFISENKLDIK YYYKKNGGKQRAYNFAVEKARGELFICLDSDDEYISDGLETIFKYWKKYEKNDKIIGMGY LSIYPDGKVIGTEFPQNEMIESQFDIYNKYGIKGDKGLMFRTEIIKKYPFPVFEGEKFTT EALVYNRISEKYKMLYINKKIEIKQYHDDGLTAEYNNLLLRNPKGQALYHNERNRHKMSF KQKILNNAVYYKFCKIAGYPFRKIWNESESKGMLLLGLPLGMYMVSKEKSKLKSELR >gi|261748169|gb|ADAD01000052.1| GENE 36 39044 - 40480 1149 478 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2029 NR:ns ## KEGG: Lebu_2029 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 460 1 460 464 628 82.0 1e-178 MRKNGVLMTVIHIFIIVFVFFLKSAIVFPNDAGEMKFLEYLLMGVYIYTFFTAKIYSDWL NSYMIFLYTLFLFNFTRVFLDIVGYKEFGWATKFANYYFFYDVRIEIINVFLLVLLFTHL GFFIGIINETESEMKSRVTLKNRKIYTDFGMFLFIIALPALAYKMFIQLRIILQAGYEAY YTGILKGVDYPFFTKGSGTIMTIGFLIFLISVPSKKKFLTVSSLYLMVKLLDSFKGARAI FLTQLLFIMWYYAKVYGIKIKMKTMGKLVGFTVIFSQLLVSVRSKKVFSLDLINTVYNFL FSQGVSYLVLGYTIDFKSRIVGQGPYPYVFQGIFGFKPQSLETLTTTNSLADKLTYLLNP TAYLKGEGIGSNYIAEMYDLGYLWIIIISVLLGIFIVKYEKYVVKNRFLLLTSYYFIPNL 
FYIPRGSFFGEALIKNMIMLISVYVLIFGFDYMYRKIEEKYNDRKKDLLCVDRECKET >gi|261748169|gb|ADAD01000052.1| GENE 37 40428 - 41183 766 251 aa, chain + ## HITS:1 COG:FN1241 KEGG:ns NR:ns ## COG: FN1241 COG3774 # Protein_GI_number: 19704576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannosyltransferase OCH1 and related enzymes # Organism: Fusobacterium nucleatum # 1 245 1 235 243 219 50.0 3e-57 MIEKKIYYVWIGNAKKPDIFYKCLKSWQENLPDFQITEINEKNFDMETHLKKNRFFRECY ERKLWAYVSDYIRVHYMYEHSGIYVDTDMEIIKDITPLITKENMKFFIGYEDEKHISVGI FGTDRHNEVLKDLTEFYEKEIWEKPLWTIPKIFTYIFEKKYGLTDKRENTLKKGEITIYP KEYFYPYGFKEKYTPQCITENTYGIHWWNDSWTSLKARLFLETKHLTGFSKIVKKMRIVA RYYIKERRIGS >gi|261748169|gb|ADAD01000052.1| GENE 38 41207 - 41980 985 257 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2024 NR:ns ## KEGG: Lebu_2024 # Name: not_defined # Def: endonuclease/exonuclease/phosphatase # Organism: L.buccalis # Pathway: not_defined # 6 257 3 257 257 317 63.0 4e-85 MIKTGNIIRILLFTVIALTVNAKEFRMMTYNIYGGRLANGTKIGQSIKRYKPDFIALQEV DKFTKRSNIRDITKDIADEMGYNFYYFQKSRDYDSGEFGIAFVSKYPIEKILTYELPSIG IEKRQVIAAKIEKSVFGKTVLFINTHLDYKQEAKNDELNSLLIMSEIAEGDIKFLGGDLN MLPTTEYYNQITQNWKDTYLEGDKAGVRKNEDPRIDYILGDFSTNWKLKESFFINDNTQE WTKLSDHLPYMTIVDIK >gi|261748169|gb|ADAD01000052.1| GENE 39 42061 - 42213 171 50 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYVVDDFDIRKDKMIVKYKYQKGNMWLYNADILPSEYYFEKEKDTINTTK >gi|261748169|gb|ADAD01000052.1| GENE 40 42282 - 43169 853 295 aa, chain - ## HITS:1 COG:FN0868 KEGG:ns NR:ns ## COG: FN0868 COG0037 # Protein_GI_number: 19704203 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Fusobacterium nucleatum # 31 286 16 271 277 264 48.0 1e-70 MNNLTNNNIDDTKTSSIGNTSCEAILPTVPLKSLSEIEKSIQKKYRSELWSPFIRGLKEF EMVKDGDKIAVAISGGKDSLLLSKLFQELKRASKTNFELVFISMNPGFNEVNLTNLRKNL EHLEIPCEIYNDNIFEVAEKIAKDYPCYMCAKMRRGSLYTKAASYGCNKLALGHHLDDVI ETTLMSIFYMGKFETMLPKLKADNFDIELIRPLFYVEEKSIIKWIRNNGILAMNCGCTVA 
AGKTSSKRRETKELIADLVKNNPDIKKRIIQSAQNVNLEKILGWKTSEGKFSYLK >gi|261748169|gb|ADAD01000052.1| GENE 41 43286 - 43438 182 50 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGGNMNKIIIFVVAFLVLRFFFKSAKLLIKVAIMAALVILFIKFRDVLWM >gi|261748169|gb|ADAD01000052.1| GENE 42 43429 - 44160 851 243 aa, chain + ## HITS:1 COG:FN0782 KEGG:ns NR:ns ## COG: FN0782 COG4123 # Protein_GI_number: 19704117 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Fusobacterium nucleatum # 16 243 14 243 243 176 42.0 5e-44 MDVIKKIGEETTTVIKRMKIIQRNDFQNFTLDTVLLADFTKINRKTKKVLDIGTGCGIIP ILLAEKSKAEIVGIELQKEMADIAERNVQNYEERINIINDDIKNYQKIFKKDEFDCIVTN PPYFEFKGDINQINNSPQMSLARHNIDLTLEQIIKISAWLLKNSGHFSIVFRSERLVEML KLLTENKLEPKRMRNCYTKRNQDAKICLIEAIKDADKGLKIEMPIYVYEENGEKTEYIKK LYE >gi|261748169|gb|ADAD01000052.1| GENE 43 44684 - 45694 1423 336 aa, chain + ## HITS:1 COG:FN0776 KEGG:ns NR:ns ## COG: FN0776 COG2502 # Protein_GI_number: 19704111 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthetase A # Organism: Fusobacterium nucleatum # 10 336 3 327 327 430 66.0 1e-120 MAKVIVPKNYDAKYGIMETEIAIKLIKDFFESELSKELNLTRISAPLFVRKKTGINDNLN GVERPVSFEMKDYEGEIIEIIHSLAKWKRLALKRYGINTGKGIYTDMNAIRRDEELDNVH SIYVDQWDWEKVITKEERNLDFLKETVKKIYKVFLKTQDMLVNKYPKYKKFLPEEVTFIT SQELENMYPDLTPDERENRFAKEKGAIFIMQIGKILESGEKHDGRAPDYDDWELNGDLIM WDPVLDKALELSSMGIRVDEKSLLKQLEILNLEERKGLEYHKMLLNSELPLTIGGGIGQS RICMFLLQKAHIGEVQASIWDEETIKICEQNGINLL >gi|261748169|gb|ADAD01000052.1| GENE 44 45928 - 48420 3274 830 aa, chain + ## HITS:1 COG:aq_999_1 KEGG:ns NR:ns ## COG: aq_999_1 COG1022 # Protein_GI_number: 15606303 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Aquifex aeolicus # 2 608 3 599 600 449 42.0 1e-125 MFLQKTDRLALVDFENNHINYTELVNKIKYFSENVIELDKNKFGLIIMENRVEWIYSFFA VWDKKSAPITIDSTSNPKEILYVLEDSHPEVIICSNETEKNIKEALLSYGLKDKIKIVNV DNYPIDKEKLEIIKNKEFELYNPEDEDTAVMLYTSGTTGLPKGVMLTYKNLNTEMDGIFA 
KNIFTHEDQILALLPFHHILPLTATVLLMLRHQTSIVFVEKIASKEILEALSRNRVTALI GVPRVFKLFYDGIKQQIDAKFITRFIYKTMSKIKSMKLRRKVFKKVHDKFGGHLDFIVSG GAKLDAEIGEFYEVLGIYSLEGYGLTETSPVIAVNSQKERKIGTVGKKLENIEVKVVDEE LWVKGPIVMKGYYNKPEKTAEVMTEDGWFKTGDLATIDDEGYITIRGRKNSMIVLSNGKN IDPETIENKVIAKSNYLIKEIGVFGHNDKIAAIIVPELVEFRKKGITNINAYIKNVIEDY NLTVHNHEKILDYKLYEEELPKTRVGKVRRFMLPNLYEKNMTEKKKVEEPDSEVYKILKE YIKKLKGIEPLPEENLELEIGMDSLDIVELFAYIENSFGIKLNEEQFSEMSNLKALSEYI NEKATKIEDSEVDWEKIIEEAPAVEEKNRWVTKVLRPLLDVTLKVYFRLKRVNRDKISAE QQIFVSNHQSFIDSLVLGSLLPHKILYNTMFLAIDWYFKKGIMKLLVSNGNVVLIDINKN IKKSVEEIAAHVKAGKNVLIFPEGARTKDGKVGKFKKVFAIIAKELNVDIQCLAIKGGFE AYSRFMKFPKPKKIEVAVLERFKPEGTYDEIVRKAENIIREYVEGNNTQV >gi|261748169|gb|ADAD01000052.1| GENE 45 49527 - 49667 154 46 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYIEGSDVMTNTNATNLRKNLFSFLDAAIDYNDIINVNTKKGMQLL >gi|261748169|gb|ADAD01000052.1| GENE 46 50190 - 51329 1354 379 aa, chain + ## HITS:1 COG:RSc2202 KEGG:ns NR:ns ## COG: RSc2202 COG0463 # Protein_GI_number: 17546921 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Ralstonia solanacearum # 3 147 4 152 256 80 33.0 7e-15 MIKISACILAKNEEENIAECIRSVKPYVDEVIVVDNGSTDNTGEIAESLGSIVLDGSNLL LDSARKLYMEKAKYDWILILDADERFGKLGEVSLKEFLSRIKDNIWGYGILSYQHSGLGK WAEVHILRLIRNNKMIHYNESPIHSSVAPSIFENGAEINDTSLFAIHHLDILIKGRPVPK RKRYRTMLEEILSNKNRNLDKDTENMYKCFLGLEYVAVGEYDKAEKIYEHAVEEDFKYRN FALECLCQLYMYRKKYEKVKKYITDKNILLFRNNAVLGNYYNYFDKEKAADFYENIIKNS NATASDYLNLAYLLKDKDRIKARELLEKAVEKNSYLLKRVSYDLGEKQNLFIIQSNILFE IENVYNLFEDLKMSELING >gi|261748169|gb|ADAD01000052.1| GENE 47 51489 - 52037 851 182 aa, chain + ## HITS:1 COG:VC0585 KEGG:ns NR:ns ## COG: VC0585 COG0634 # Protein_GI_number: 15640606 # Func_class: F Nucleotide transport and metabolism # Function: Hypoxanthine-guanine phosphoribosyltransferase # Organism: Vibrio cholerae # 4 172 3 172 177 199 56.0 3e-51 MFNYSIDTLISKEEIALKVKELARLIDEDFKGEKVLLIGLLRGSVIFLSDLARELETEAT 
LDFMVVSSYGNEMESSRDVKIKKDLEEDVRGRNVVIVEDIIDTGNTLKKVMEILMTRDPK TIKICTLLDKPERRETEISIDYTGFKIPDEFVVGYGIDFAQKHRTLPYIGIVKKEEEQKD EK >gi|261748169|gb|ADAD01000052.1| GENE 48 52027 - 53331 1930 434 aa, chain + ## HITS:1 COG:FN1605 KEGG:ns NR:ns ## COG: FN1605 COG0104 # Protein_GI_number: 19704926 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate synthase # Organism: Fusobacterium nucleatum # 6 423 4 421 425 539 62.0 1e-153 MKNNTFVIVGTQWGDEGKGKIIDVLSPKADYIVRFQGGNNAGHTVVVNDEKFILHLLPSG IINSDGKCIIGAGVVVDIEVLLDEMEKLEKRGKNMDNLFIDERTHIIMPYHIEIDKAKEE AMGENKIGTTQRGIGPCYIDKIARNGIRIGDLLDSERFRDKLEWNVKEKNDMLTRYGKPV FDFDELYDRFIKLGEKIKHRIIDGVVEINEAIEDEKTVLFEGAQALMLDIDYGTYPYVTS SSPTAGGVTVGTGVSPKKINRILGVMKAYTTRVGEGPFPTELNDETGEKLRTVGHEYGAT TGRPRRCGWLDLVIGRYAALIDGLTDIVLTKLDVLTGFEKIKVAVGYEINGKIYNSYPGN FRKSENLKIIYEELDGWKEDITQIKNYDDLPENCKKYVEYIEKKLKCNVSMISVGPERSQ NIYRYDLIDKVKIK >gi|261748169|gb|ADAD01000052.1| GENE 49 53348 - 54781 1760 477 aa, chain + ## HITS:1 COG:FN0368 KEGG:ns NR:ns ## COG: FN0368 COG0015 # Protein_GI_number: 19703710 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate lyase # Organism: Fusobacterium nucleatum # 4 465 6 466 477 604 68.0 1e-172 MDKYSNPLEERYASEEMLYNFSPENKFRTWRKLWIILAEAEKELGLDFISDEQIEELKKF KNDINFEKAAEFEKKLRHDVMAHVHTYGEQATNARKIIHLGATSAYVGDNTDLIQIKEGL IIIRKKLLALIKRMKEFALEYKSLPTLGFTHFQAAQLTTVGKRATLWLHSLLMDFEELEF RLEKLRYRGVKGTTGTQASFKELFDGDFEKVKRLDKLVTEKAGFKIKQGVSGQTYDRKTD TQILNLLSNIAQSAHKFTNDFRLLQHLKELEEPFEKNQIGSSAMAYKRNPMRSERISSLS KFVMSSAMNGALVYSTQWFERTLDDSANKRLSVPQAFLAVDAILIIWLNIMDGVVVYPKV IEANIQKELPFIATENIIMESVKKGMDRQEVHEIMRELSMEETKEIKINGKPNNLIDRII KDGRLGLKSEDMEGILVSKNYTGFAEQQTEDFINEEINPVLEEYKDEIDEERVELRV >gi|261748169|gb|ADAD01000052.1| GENE 50 55193 - 55912 1123 239 aa, chain - ## HITS:1 COG:TP0474 KEGG:ns NR:ns ## COG: TP0474 COG0217 # Protein_GI_number: 15639465 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Treponema pallidum # 5 234 8 236 245 200 50.0 
2e-51 MGRHGTIAGRKEAQDRKRAASFTKFVRLITVAARNGADPEYNVALKHAIEKAKAINMPND NIMRAVKKGSGADGSTSYEALSYEGYGPGGVAIIVEGLTDNKNRTASSVKTAFDRNGGNL GVSGCVSYMFERKGVIEIEKTDKTSEDEIMDIALEAGMEDMQTYDDSFYITTATDSFDSV SSALRNAGYELLESDIEFVPSIEVETLSEGDSEKLRKLIDILENDDDVQKVHHNYAGEL >gi|261748169|gb|ADAD01000052.1| GENE 51 56219 - 56965 883 248 aa, chain + ## HITS:1 COG:jhp0651 KEGG:ns NR:ns ## COG: jhp0651 COG3177 # Protein_GI_number: 15611718 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Helicobacter pylori J99 # 34 225 19 209 234 97 35.0 2e-20 MTSEQLKNLCLIDRLREERESGLRNLIYDYTQTQMSYHTNAIEGSKLTLKQTVDIYDADR LYIDESGIIKIDDIIETKNHFKAFDYIIDNINKELNEDMIKKLHKILMTGTSKEKQEYFK IGDYKLLNNTIGNLVETTEVENVQKEMEELLREYNHIEKKTLENILDFHVKFERIHPFQD GNGRVGRLIMFKECLANNIIPFIVIKEEKEYYIRGLSEYSAEKEYLTDTCLHFQDIYKDK LKYFGLNL >gi|261748169|gb|ADAD01000052.1| GENE 52 56993 - 61309 5548 1438 aa, chain + ## HITS:1 COG:FN0281 KEGG:ns NR:ns ## COG: FN0281 COG2176 # Protein_GI_number: 19703626 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit (gram-positive type) # Organism: Fusobacterium nucleatum # 1 1438 6 1454 1454 1343 50.0 0 MTARYLKLKPDNKFTVKYDMKNFEIEYINLYSQKKEMEIHVIINHYDANKELKELRQLIH KNFGNDLTLKIKIGVKPEIIEKDVTEFIRFIIENYKSESVRHQYIFANYEVESFENEVFI KLPSEYMIEEGRKIEILKELSDRIFDITNKKFNIEFTNGDFEEIQEKIKEKKKHLERDVK AEEVPKIEIKKDETKDQQNGNKVTTAYNGFRRKVENRETSEFSILENLNLGEGIALEGQI FDINISETKNGNLKYEFLITDYTDSIRCVIFPKNNETLKISLNDWVKVNGIFEPDKFTGE FYIRTQKVDKIEPKISKRFDNAEKKRVELHAHTNMSEMSGVVSAKDLAKRAKEFGHSAVA VTDFGVVHSFPFAYKESDENFKIIFGVEAYVVDDEQEMITKPADRLIEDEIYVVFDIETT GLDPYKDKIIEIGAIKLKGKEIIDEFSVFINPEIDIPEEITALTNITNDMVKDAEKVETV LPKFLEFCKDTTVVAHNAKFDVGFINQKAKNLGLEYSPSVIDTLHWARILLPEQKRFGLK YIANYFNVVLDNHHRAVDDAKATAEIFQKFLNMVLSRGVLKLSEINQELQTNIQNADTLN TMILVKNQDGLRDLYELISRSHIEFFGNKKPRIPKTLLNSMRKNLLLASSASAVFGNSGE LVNLYLRGTEREEIEEKAKFYDYIEIQPLSNYADLERDSLIESENAKIVLEMNRYFYDLG KKLDKIVVATGDVHYLEEREAINRSVLVLGSGMGRRVFSYDKKLFFKTTEEMLEEFSYLG 
NDEAYEVVVENTNKINDMIESVRPIPKGFYPPKIEGAEDEVRKMTYEKLHELYGENVPEH LNERIEKELNSIIGNGFAVLYLIAQKLVKKSLDNGYLVGSRGSVGSSIVAYMMGITEVNG LYPHYRCPNCKHSEFTELEGSGVDLEDKICPNCGTKYIKDGHAIPFEVFMGFNGDKVPDI DLNFSGEYQGEIHKYTEELFGSDNVFRAGTISTLAEKNAFGYVKKYFEEVEGTTEISKRK SEIMRIAKGCEGARKTTGQHPGGMIVVPKDKSIYDFCPIQRPANDVNADSKTTHFDYHVM DEQLVKLDLLGHDDPTTLRILQDLTGIDIYTIPLDDKKVMSLFSGTEALGATPEQIGSPV GSSGIPEFGTSFVKQMLVDTKPTTFAELVRISGLSHGTDVWLNNAQEYVRNGTATLSEII TVRDDIMNYLIDNGLDKSVAFTIMEFVRKGQPTKNPEKWKEYSKIMKEHKVKQWYIDSCE KIKYMFPKGHAVAYVMMAVRIAYFKVHYPLEFYTAFLNRKVEDFKMTAMFKPIDELKKSR TELDRKGNLNAKEKQELFLYEILIEMHYRGIELMQIDIYKSDARQFRIENGKIRMPLIAM DGLGEAVAINVINERKNGEFLSIEDLVKRTKLNKTVVDLLKTYNCVPELSATNQQTLF >gi|261748169|gb|ADAD01000052.1| GENE 53 61338 - 61763 521 141 aa, chain + ## HITS:1 COG:mlr7240 KEGG:ns NR:ns ## COG: mlr7240 COG0454 # Protein_GI_number: 13476032 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Mesorhizobium loti # 10 139 14 139 140 74 32.0 5e-14 MYEIEDCNDEEKEYLISKLVEYNLSKVPATQEVDFVNFDKKIKNEKGEIIAGIISRMYCW NCIYVDTLWVSEKYRGKGIGEKLLKEVEKEASINKVYLIHLDTFDFQAKDFYERYGYEVF GVLNDCPENHTRYFMKKFLKG >gi|261748169|gb|ADAD01000052.1| GENE 54 61768 - 62232 603 154 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037798|ref|ZP_06011240.1| ## NR: gi|262037798|ref|ZP_06011240.1| hypothetical protein HMPREF0554_1109 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1109 [Leptotrichia goodfellowii F0264] # 1 154 1 154 154 239 100.0 6e-62 MRERNLKLLAGICLTLLGVIGIVFQIVFIVGILTSPAKDNPILIAFIVTIMIELIFTTIG IVFLRSVFKKDKKSKQLKISGIEKKGKVVDIVKTKYTVNKRHLMKIIVKTDDGKVYHSEG DSEKKIRNYYKIGDKATVLVSSENEKDYEVQLYL >gi|261748169|gb|ADAD01000052.1| GENE 55 62271 - 62852 697 193 aa, chain + ## HITS:1 COG:no KEGG:LVIS_2035 NR:ns ## KEGG: LVIS_2035 # Name: not_defined # Def: 4-methyl-5(B-hydroxyethyl)-thiazole monophosphate biosynthesis protein # Organism: L.brevis # Pathway: not_defined # 1 190 1 181 200 82 31.0 7e-15 
MKKTAVLLYESFCNFEFSVLLEILAINKKPVVFFAKEILPIISEEGLKVIPDIKIEDLDI SEFDSLILTGAADIRKAIEDEEILSFIKKFDERDYIIGAISIAPILLLKAGILSGKSFMA GVNKEELYEEGFSKKDLTLMIDWNESIKNPVPNGYIKDRNIITSVSYEFVRWGIEVAKKL GLNVNPLSFGIKE >gi|261748169|gb|ADAD01000052.1| GENE 56 63180 - 63722 533 180 aa, chain + ## HITS:1 COG:no KEGG:CbC4_4170 NR:ns ## KEGG: CbC4_4170 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_BKT015925 # Pathway: not_defined # 15 176 1 153 172 84 37.0 3e-15 MKFNEKYTEELFFVMVNEKYSDYLRKYDKKIREKEKRPYIGIVLQIKEKQYFAPLSSPKE KYRLMKEQMDLIKLKNGKLGVINLNNMIPVISNNLYIEKIKLSDLRKSKIEKERRYYFLL KEQHEFCIQNKEKILKKAEKVYETFTKEKSELTKREKFISKRVIDFKLMEKLCEKYTEKL >gi|261748169|gb|ADAD01000052.1| GENE 57 63882 - 64763 1218 293 aa, chain + ## HITS:1 COG:SPy1834 KEGG:ns NR:ns ## COG: SPy1834 COG1396 # Protein_GI_number: 15675661 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 1 111 1 104 195 61 32.0 2e-09 MTFAEKLKSFRKQAKMSQEKLSEKLGVSRQAVTKWETGLGIPEIENIRAISELFEISIDE LLSDELEVKKEKDYLFESITEYDIDSIKHYDIKLIGAYKIILTGYSGEKIYVRFASNTLA DIEKDLKLKIDDGKYNIDLDVNYFNQTTKALTKEDLVIFMKLPEKYIEGIELAAAANTVE FHSLSCEQIEFDGKAENVMLDNVNAKVKLNCNLDMNIVCKTLNGNLKINQLFATSKLSVV EGMSFFAKKRGIGNSLLYEKDGKYVESFGDEDSENIIELNGMKSELIISRSLK >gi|261748169|gb|ADAD01000052.1| GENE 58 64802 - 65815 953 337 aa, chain + ## HITS:1 COG:no KEGG:CLOST_1278 NR:ns ## KEGG: CLOST_1278 # Name: not_defined # Def: hypothetical protein # Organism: C.sticklandii # Pathway: not_defined # 5 243 2 245 304 194 47.0 5e-48 MSDKRKDSTEKSIRILKILKRLSKGEIINIEKVSNEYNVHRKTVQRDIESLRAYFLEENS SEIKYSKTKKGYYLINNDQNNFTNEETLAISKIILESRAFNKTELEKLLKKLIKQATNED RKIIEDIIKNEEFNYSPLQHGKNLLSIIWDLSQYIVNKELINIKYTTKDGVQKNYEAKSL SIMFSEYYFYLITYVNGKEEYPAILRIDRISEIKRKNQKFLLPYNERFEDGKSRKYVSFM HSGPVNIEVEKESDEEKLKNLAIKFIEEMSGEKLNERKVKIERQYVMSGEDEKGKSRKYI IDILLGIKNKENDGIVDKYIIIEDKTFTGVHGDQINK >gi|261748169|gb|ADAD01000052.1| GENE 59 66011 - 66616 501 201 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|262037791|ref|ZP_06011233.1| ## NR: gi|262037791|ref|ZP_06011233.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 201 1 201 201 274 100.0 3e-72 MENEINLYVNMNIKKWEKNQYIGFFKFLKEKEKELLKCDECSWGYVPNASGGFMGFWWFP LNDEEFKKIQMENEFLYFQIEQYPVKEKKEKEEKYITKDIIAVKYTVDKPDSDEKKETEG IKIGAEKRRIIYEYFQKKAKEKGEEFKKKAFRSGKYMTVGYLEYDYENYKKKIKCLQEIL ESLRNDEKLLEELQNTENNIR >gi|261748169|gb|ADAD01000052.1| GENE 60 66671 - 67528 766 285 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1019 NR:ns ## KEGG: Lebu_1019 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 284 1 286 294 271 52.0 3e-71 MDNLKVTKRNVTIFTVFVLICGWVGVFIDKITNQAHYENMGTLSNDGTLGMGIFISSPLI LVFILRTFCGDGWKDTGYKPNFRKNRKWYLLSCLIYPTVTVFMLILGYLTKTMVFNPISI SIYLTMLIGQLVIQIIKNFFEESLWRGYLTSKLMKFSWSDWKLYVVSALIWNFWHLPYYL VFLSKEEIEATIPGGRLCFVIVATICILCWNVMYVEIYRATNSIWPLVIMHATEDAVINP LLLFNIVYVIPKWAFAFSLSSGIIPTVIYLCIGLYLRGVRMKKES >gi|261748169|gb|ADAD01000052.1| GENE 61 67559 - 68416 597 285 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1019 NR:ns ## KEGG: Lebu_1019 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 284 1 285 294 253 50.0 6e-66 MDSLKITKRNVTIFTVFVLIYGWVGVFIDKITDQTHYENIGTLSNNGTLIGVGIFLASPL ILVFILRTFCGDGWKDAGYKPNFRKNRKWYLFSCLIYPTVTVFMLILGYLTKTMVFNPIS ISVYLTMLIGQLFIQIIKNFFEESVWRGYLTSKLMKFSWSDWKLYIVSALICNFWHLPYY LVFLSKEEIEATIPVGRLCFVIVATICILCWNVMYVEIYRAANSIWPLVIMHAIKDAVIN PLLLFNIIYVIPKWAFIFSLSVGIIPTVIYLCIGLYLREMRMKKN >gi|261748169|gb|ADAD01000052.1| GENE 62 68431 - 69048 795 205 aa, chain - ## HITS:1 COG:no KEGG:SSA_1645 NR:ns ## KEGG: SSA_1645 # Name: not_defined # Def: hypothetical protein # Organism: S.sanguinis # Pathway: not_defined # 1 202 1 203 208 157 41.0 2e-37 MKFLILGILVIKEFTVYEIRNIIHENFQSMCSDSMGSIQIAIKKLLAENLIVFNEIREKN VTKKLYSITDTGRDEFIKWLKTPIDMSKTKNMEIGKLLFMGMVPVKNRLSLISEIIESQK EELRLLENIKKLRSNEEFNNIIEFHNNNEDYKQGLLKISEQSDIENLGHDINNYELLTLQ 
YGIDNTKFNIKWFENVKKEIKKGNL >gi|261748169|gb|ADAD01000052.1| GENE 63 69222 - 70892 2729 556 aa, chain + ## HITS:1 COG:FN2082 KEGG:ns NR:ns ## COG: FN2082 COG2759 # Protein_GI_number: 19705372 # Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate synthetase # Organism: Fusobacterium nucleatum # 1 556 1 544 544 726 68.0 0 MTDLEISQKAKLEKINVIAEKIGLTEDDYEQYGKYKAKVNLDVLERNADKKDGKLILVTA ITPTPPGEGKSTVTVGLTQALNKFGYKSIAALREPSLGPVFGMKGGATGGGMSQVVPMEE INLHFTGDLHAISAAHNLISTCIDNHINHGNELDIDVNNITWKRVLDMNDRALRNIVIGL GGKINGIPRESSFQITVASEIMAIFCLAESITDLKNRIGEIVFGYNRKGEILKVKQLNVQ GAAASLLKEAIKPNLVQTLENTPVFIHGGPFANIAHGCNSVLATKTALKLSDYVVTEAGF GADLGAEKFLDIKARKANLEPNVIVLVATVRALKHHGSNMDGNKDSIETLKKGMTNLEKH IENMQKYNVPLVVAINKFVTDTDEEIEEITKFCNKKGVEAALCEIWEKGGEGGKELVEKV MDAIEKNEKSKVKFSPLYDEKLPIKEKIEIICREIYGADGVKFMPKALNNIKKYTENGYD KLPICISKTQKSLSDNSNLLGRPEGFEVTINEVRLSAGAGFLVAMAGEIIDMPGLPKKTS AELIDIDDNGVITGLF >gi|261748169|gb|ADAD01000052.1| GENE 64 70911 - 72116 1685 401 aa, chain + ## HITS:1 COG:CAC2970 KEGG:ns NR:ns ## COG: CAC2970 COG1168 # Protein_GI_number: 15896223 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Clostridium acetobutylicum # 5 393 2 382 384 307 40.0 3e-83 MSKKYDFETVISRKGQGAYKWDQMYEKYPDLEDSIVPFSVADMELKPATEITEGLQKYIG EQILGYTGPNKDYFDAVIKWMKKKHDFNIEKDWISCSPGVVSAIYDCVKAYTEENDGVII FTPVYYPFYNAIKFNNRKIIDCGLIEKEGYYTIDFEKFEEFAKDPNNKLLILCSPHNPVG RVWTKEELEKIGKIALKNNLIVVSDEIHFDIIMPGHKHTVFQTLSEELAEITITCTAPTK TFNLAGVGVSNIIIKNKKLREKFRNSQEKSATHVFSPLPYRACEIAYSQCEEWLTLFLEL VDRNQKTVNKFFEEKFIELKAPLIEGTYLQWIDFRALGLKNEELKKFMNEKAKLFFSEGY TFGEKGSGFERINLAVPSAVLEKALDRLYNAIKEDFSKLCK >gi|261748169|gb|ADAD01000052.1| GENE 65 72199 - 72537 496 112 aa, chain + ## HITS:1 COG:msl2463 KEGG:ns NR:ns ## COG: msl2463 COG3042 # Protein_GI_number: 13472236 # Func_class: R General function prediction only # Function: Putative hemolysin # Organism: Mesorhizobium 
loti # 62 110 29 77 89 63 48.0 1e-10 MKILKIMTCLLVLGTVISCTEVSKNTGPEKTISDMRSEKRDHHKGGHKKKVHHPKKDGDR MIGMPNPASVYCTEQGGESVTKKDEQGNEYGICKFKDGKEVDEWQFYRENHE >gi|261748169|gb|ADAD01000052.1| GENE 66 72693 - 73505 990 270 aa, chain + ## HITS:1 COG:AGc1468 KEGG:ns NR:ns ## COG: AGc1468 COG2514 # Protein_GI_number: 15888145 # Func_class: R General function prediction only # Function: Predicted ring-cleavage extradiol dioxygenase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 13 235 53 274 323 166 37.0 6e-41 MKKYGLNFDFITLRVKNIEKMKDFYLKLLKMKVLSEKNEGEKKEIVLGTDTKEIIRLISY GNESIESQEETNVYHIAYLLPTREDLGNFLRNCIKEQIRLDGVGDHDVSEAIYLTDPEGN GIEVYADRDYNTWKWNGNYVIMGTVEVDTEDLLRISDNMPDFVIPEGTKIGHVHMETFNI ENDKEFYIEKLGLDVVSELPRAYFMSVDRYHHHFGMNQWNGNRKIPKKENSTGVEEIYIT MDKEKFSRKFSGDKTVIETPNGIKLVIKSE >gi|261748169|gb|ADAD01000052.1| GENE 67 73527 - 74213 721 228 aa, chain + ## HITS:1 COG:no KEGG:Sterm_1150 NR:ns ## KEGG: Sterm_1150 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: S.termitidis # Pathway: not_defined # 1 226 1 226 229 123 35.0 5e-27 MRKTGEILEEYLRKKMSQVDLAKIIEVSPQYINNIIKNLKSPSENFLEKFYKIFDVSEED RAEIKEYEEFRKLPKKFQEEISALKKSNENDGNVSKNERTQKISLLGKFDNSGFFRYSEN NESVFLPKIETNNELFAVKTEYIDYIPDFYINDTLIFEKKNITRNSDLHGKICLTECNGK IEIKKVEYIDEVIVLKSPGERENMILLTKEKYSQLKIIGILKVHMRIY >gi|261748169|gb|ADAD01000052.1| GENE 68 74510 - 75844 1769 444 aa, chain + ## HITS:1 COG:TP0917 KEGG:ns NR:ns ## COG: TP0917 COG2239 # Protein_GI_number: 15639902 # Func_class: P Inorganic ion transport and metabolism # Function: Mg/Co/Ni transporter MgtE (contains CBS domain) # Organism: Treponema pallidum # 5 443 8 446 449 416 48.0 1e-116 MENSIQNLIKEKKYFEIRKYLNDLNTVEVSELLNQFESSELIMIFRLLSKDRAADVFSYL DTEHQEMIINTMTDVETKNIFDELYFDDIVDIIEEMPSNVVKKILKNTDTKDRHMINQLL KYPDNSAGSIMTTEYVDLKKDMKVSDAIREIRNTVEDKENIYTCYVISADRKLEGVVSLK EIITSDNDVIIEDIMNRNFVSVHTNDDQEEVAEIIKKYDLIVLPVVDIENRLLGIITIDD VMDVIDKEATEDFHKMAGISPVEESYLKTSAFTMARQRIMWLIVLMISATFTGRIIGKYE 
NVLQSVVILASFIPMLMDTGGNAGAQSSTIVVRALALGEVDVKDTFRVLIKEFSISFIVA IVLAAINYLRLIALTKVPLNVALTVSVTLIFVVIISKIIGALLPIGAKVLKMDPAIMAGP LITTILDALTLTIYFKFATVFLKI >gi|261748169|gb|ADAD01000052.1| GENE 69 75984 - 77324 1841 446 aa, chain + ## HITS:1 COG:TP0917 KEGG:ns NR:ns ## COG: TP0917 COG2239 # Protein_GI_number: 15639902 # Func_class: P Inorganic ion transport and metabolism # Function: Mg/Co/Ni transporter MgtE (contains CBS domain) # Organism: Treponema pallidum # 3 445 6 446 449 401 48.0 1e-111 MKEIIETLIKEKKYFEIRKQLNELNTVEISEVINEFEISELVIMIFRLLKKDKASDVFPY LDSEHQEMIIHAATDIETKNIFDELYFDDIVDIIEEMPSDVVKKILKNTDAKDRHLINQL LKYPDNSAGSIMTTEYVDLEKNMKVSEAIEQIRNTGKDKENIYTCYITDEQGKLKGVLSL KELIAKKDDTIIEDIMNRNFISVQTNDDQEAVADMFKKYDLIVMPVTDHENRLLGIITID DVMDVVEQEVTEDFHKMAGITAPAEETYLKTNVFTMAKQRIGWLAVLMISDTISGNIIQG YEKVLAQSIILTAFIPMLMSTGGNVGSQSSTVVIRALALGEISPKDAFKVLKKEFSISIM VSIVLAILNFIRLITLEKIDLMIALTVSVTLVFTVIVSKIVGALLPLGAKLIKADPAVMA TPLITTISDAVTLIIYFKFATLFLKI >gi|261748169|gb|ADAD01000052.1| GENE 70 77585 - 78118 373 177 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163801060|ref|ZP_02194960.1| 50S ribosomal protein L35 [Vibrio campbellii AND4] # 15 177 1 165 166 148 46 1e-34 MFFIKGTNKSDEPRMNERIRAREIRVIGEDGEQFGILSVNEAIALANEKGLELVEISPNA TPPVCKIMDYGKFKYEKTKKEKENKKKQKNVVIKELRIKPHIDEHDKETKISQIEKFIAK EYKVKVSLRLSGREKMHAESAIKVLDEFASHFEETATVEKKYGKEQLQKFIMLSPKK >gi|261748169|gb|ADAD01000052.1| GENE 71 78136 - 78342 318 68 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212793|ref|ZP_04339152.1| LSU ribosomal protein L35P [Leptotrichia buccalis DSM 1135] # 1 68 1 68 68 127 88 3e-28 MPKMKTHRGTKKRVKVTGSGKLVIKHSGKSHILTKKTHKRKKRLGQDAIVPKGAERRIKK LLAGQEGR >gi|261748169|gb|ADAD01000052.1| GENE 72 78373 - 78717 543 114 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212792|ref|ZP_04339151.1| LSU ribosomal protein L20P [Leptotrichia buccalis DSM 1135] # 1 114 1 114 114 213 92 3e-54 MPRVKTGIVRRKRHKKVLKEAKGYRGSVKTNFKKANEAVKKAMAYATEHRKQKKRKMREL WIIRINAAARLNGVSYSKFMNGLKKAGIELDRKVLADLALNNPADFAKLVEKVK 
>gi|261748169|gb|ADAD01000052.1| GENE 73 78809 - 80008 1511 399 aa, chain - ## HITS:1 COG:aq_1812 KEGG:ns NR:ns ## COG: aq_1812 COG0460 # Protein_GI_number: 15606863 # Func_class: E Amino acid transport and metabolism # Function: Homoserine dehydrogenase # Organism: Aquifex aeolicus # 1 359 4 356 435 281 43.0 1e-75 MKIGIIGLGTVGEGVLKVLTTERKSIFEKSTMDIEVKYACDLNINREFSFDFDKSILIDD YKKIINDDEIEIVVELIGGETIAKNIIIEAFRAKKSVVTANKALVAKHGVELFQIAKENK VSFLFEAAVGGGIPIVTPLMESLVANTVTEIRGIMNGTSNYILTKMKEEKLSFAEALSLA SAKGYAEADPTYDVDGIDAGHKINILASLAYGGSIKFKDMQLSGIRDINSIDIFAANQMN YTIKLIAQSKLLSENSVQISVEPTLVQNSEILAKVDDVYNAIETIGSYTGRTLFYGKGAG MDPTASAVVSDIVRIAARIHIESDYFFNSTKIFDVVDINTIKNSYYIRVSSDFDIEKSPF EIYNEIDSFYIITADDISRNDINEYLKNAQEKLILKIMK >gi|261748169|gb|ADAD01000052.1| GENE 74 80021 - 81244 1573 407 aa, chain - ## HITS:1 COG:MT3812 KEGG:ns NR:ns ## COG: MT3812 COG0527 # Protein_GI_number: 15843329 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Mycobacterium tuberculosis CDC1551 # 1 404 1 405 421 335 46.0 8e-92 MALIIQKYGGTSVADAERVKEVAKRVLRYKKEGHDVIVVVSAPAGTTDSLIKRAYEISDT PNKRELDMLLTSGEQISIASLAIAVEALGGKAVSLNAFQVKFKTTEDHTKAEILDIDTEL IEEKLSEGNVVVFAGFQGITENNDITTLGRGGSDTTAVALGAALNADEVEIYTDVDGVYT ADPRVVKSPKKLEFISYQEMLELAVSGAKVLHPRSVEIAARYGINIHLRSSFDDPTGTIV TDEKKGENSMEKVKIVGVTSSKNEGKITLMGVPDKPGIAAKVFSTLAKSKINTDIILQSS SINKEFNNISFTVKTEDFKEALNVSEKLKDELGAQGVIHEEKIAKVSVIGIGLKTHYETT AEIFDTLAENGINIDMISCSEINVSCIIKEDDIEKAVKALHQKFIEV >gi|261748169|gb|ADAD01000052.1| GENE 75 81654 - 81851 321 65 aa, chain + ## HITS:1 COG:L172505 KEGG:ns NR:ns ## COG: L172505 COG1278 # Protein_GI_number: 15672150 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Lactococcus lactis # 3 62 4 63 65 80 63.0 7e-16 MLGKVKWFNDKKGFGFISGEDGNDYFLHFSKINKEGFKSVNEGEEVSFDVEEGPKGPQAT NVVSQ >gi|261748169|gb|ADAD01000052.1| GENE 76 82113 - 83792 1955 559 aa, chain + ## HITS:1 COG:SP1169 KEGG:ns NR:ns ## COG: SP1169 COG0692 # Protein_GI_number: 15901034 # Func_class: L 
Replication, recombination and repair # Function: Uracil DNA glycosylase # Organism: Streptococcus pneumoniae TIGR4 # 328 542 21 211 217 83 30.0 9e-16 MNIILEDSIYHRSGIKDFGVDPVNERIIMTGEKLVFFKNGKIEKEISGKVKNCEIIKYIK EKNQLFVSSTFFVSTGTGKVYKCDSSKKKIVEPVFDSEKLIEFINFTTGGKIIYIENDIL YSYDSNSLELNQAEISEKESRQRGNYKLFTSGENVILKYRELHSQTNIINIFDSKLEKIF EIKTENNHIYAKISDLEYIAGTATGEIEIWNILEKELYNSIKIADSRITYIEKNNGNYFI GTGNGDLIITDWEFKTLKTQSVFKNEIIKICYIEDQIFILSADNKIVTLKIINEENDSKN ASLREKFLEEYNIHSDYYDFFTLDRVIKIDNFIKEMDIKKINYTPSRENIFKVFSDSISS RKVCLIGKDPYFQEGVATGLSFEVNKSSWDDPDINTSLKNIVKLIYKTYTGKSEDISKIR EEIENKKFLILPPNKLIKSWKEQGVLLVSAALTTIVGKAGEHHKFWDPFTRDLLEYISAK NPNIVYFLWGKDAEIFEKNILSGEIIKHNHPAISGNLSNEKDFMNGKSFEKTKNIINWTG FEIKPEKVAENTENTGRLF >gi|261748169|gb|ADAD01000052.1| GENE 77 83805 - 84548 732 247 aa, chain + ## HITS:1 COG:FN1844 KEGG:ns NR:ns ## COG: FN1844 COG0300 # Protein_GI_number: 19705149 # Func_class: R General function prediction only # Function: Short-chain dehydrogenases of various substrate specificities # Organism: Fusobacterium nucleatum # 1 242 1 243 257 235 50.0 6e-62 MSTVLITGAGSGIGKELANLFAERKYDLLLCGRTPEKLEDVQKEILSKYEVNSTVIQYDL NDTDNIEKILENYDTDVLINCAGTGKMGDYESLSLEDEKRMINVNFVAPMILSKIFIKKF MVKKYGTIINVCSTASLYPNPYLNVYSVSKVSLLYYSLALDKEISMKNNNIRVLSVCPGP VNTDFFEKDTREKFSSWKKYEMKVSKVAKAIMQAFDMKKRFTVIGFGNKIFSYIMRVLPI SFSTENS >gi|261748169|gb|ADAD01000052.1| GENE 78 84575 - 85723 874 382 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1741 NR:ns ## KEGG: Lebu_1741 # Name: not_defined # Def: ceramide glucosyltransferase # Organism: L.buccalis # Pathway: Sphingolipid metabolism [PATH:lba00600]; Metabolic pathways [PATH:lba01100] # 3 382 4 391 392 394 54.0 1e-108 MFIYYFFLFTSIFLLVLKLFFAYKYFKNIENQPKSYINEENYTVIQPILSGDLRLEEDLT ENLKNTEKMRFIWLIDKSDRVALETAEKILGNKDFFQRTEIFQMDDVPQEINPKIFKISQ VIEKVKTKYTIILDDDTVMDIKRINEFALYEKRKDEWIATGIPYNYGVRGLFSKLIAAFV NGNSFLTYFPMAYLKESRTLNGMFYMAKTELFRKYNTFDEIKYELCDDLALATFLSEKKV KIIQTTVFCNVRTTVKNSLNYILLMKRWFLFTKIYIKKAFSMKFFVFILLPSLLPGIVLI 
MSLFLGIKFIFVIFGYFVLKAFILYLFRYSLLKKKENMNVILYEIISDLILIPVFIYTLI TPPVIKWRNKKIRVSDGRIHYE >gi|261748169|gb|ADAD01000052.1| GENE 79 85716 - 86348 446 210 aa, chain + ## HITS:1 COG:no KEGG:FN1846 NR:ns ## KEGG: FN1846 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 4 206 3 205 208 194 50.0 3e-48 MNNFNEYLKKLQCLESEKTLIKKERSVILLSGSSNYKNAALSQIQKEFLNIFEKFGYNVV NSNFPYNEDFKHNDFDDVHILKASISNIVYYWHTLYNINFQKEIKRHLSPLFELENAIIV TQSSGLNLLNCVLKDRNIKEKNFRIFCLGPVAQGSEKIEKAIIFKGKKDMYTEILDNHKC DVRVDCKHFDYLKNKEIKEFIYDTLQKDKS >gi|261748169|gb|ADAD01000052.1| GENE 80 86320 - 87600 1461 426 aa, chain + ## HITS:1 COG:YPO1985 KEGG:ns NR:ns ## COG: YPO1985 COG1819 # Protein_GI_number: 16122227 # Func_class: G Carbohydrate transport and metabolism; C Energy production and conversion # Function: Glycosyl transferases, related to UDP-glucuronosyltransferase # Organism: Yersinia pestis # 1 361 7 365 395 201 32.0 2e-51 MTPSKKIKVDVVAPPFSGHLYPLLEMIIPLLNNEKYDIRVYTGIQKKEVAESLGISCKVV LEDYPDAFENIANTPEKTDAVTLYKQFKKNMKIIPIIIKELEEEFEKRGTDIVIADFVAI PAGIACNNLKIPWITSMPTPFALESKSTTPSYLGGWYPKNNIFYKIRDSIGRFVIRSFKR IVCLAVSKELKTLNFKLYNENGEENIYSPYSILGVGMKELEFRDDFPERFIWIGPCSPSF DKKEYTFTDTSKFEKTVFLSNGTHVLWAKERMIKIAENLGKSYPDICFIVSSGDYSKKDE EVKKISENVYVYKYIDYETALPRADYVIHHGGAGIMYKVIKHNKPSVIIPHDYDQFDYAV RARIADIAFVADLKSEDSINKAFQDMLNRKEWKNLEKIHNSFNNYPSADILEKEIERLLK GRENNK >gi|261748169|gb|ADAD01000052.1| GENE 81 87597 - 88670 1118 357 aa, chain + ## HITS:1 COG:FN1847 KEGG:ns NR:ns ## COG: FN1847 COG0451 # Protein_GI_number: 19705152 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Fusobacterium nucleatum # 1 357 2 324 328 390 59.0 1e-108 MKVLVTGATGFLGKYIIDELVENNYKVVAFGRNEKVGKSLENENVRFFKGDFTEKEDIIK IMNSQKSPESKINADLNHENQTKSDLKVSRKYHTEEISAVIHAGGLSTVWGKWQDFYNSN VKGTENILEICRIYKIKKLIFISSPSIYAEPKDQVNVREEEAPEENNLNFYIKSKIMAEK 
KIKEYADIPSVIIRPRGLFGVGDTSVIPRLLKLNRAKGIPLFNDGKQMVDVTCVENAAFA IRLALESENSSGQIYNITNNEPMAFKNILELFFKEFGEKPHFIRKNYNTVKFFVNVIEKI YHLFGITKEPPITMYTLYLMRYSQTLNIQKAEKELNYKPKLSIAEGVKKYVEHNRKN >gi|261748169|gb|ADAD01000052.1| GENE 82 88645 - 89457 739 270 aa, chain + ## HITS:1 COG:FN1848 KEGG:ns NR:ns ## COG: FN1848 COG0491 # Protein_GI_number: 19705153 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Fusobacterium nucleatum # 4 265 1 262 263 310 58.0 1e-84 MSNIIEKIDYFSCGYCTNDMQKIFKGVSSGKRNFHGGVFLIKHKTKGYILYDTGYSMKIF ENNLKYKIYTGLNPVTLKKEDIINEQLKKEGIGTEKINYIILSHLHPDHIGGAKFFSDAQ IIITDECFQEYKKSSFKSLIFKEILPENFEERLKIIEINSHNSAFPYIKSHDLFNDGSLF LSRIDGHAKGQGCLFIPEKNLFLAADVCWGIDLLELTDKIKFIPRLIQNNFSEYKKGVEI LKKLMKNNIDVIVSHDMPERIRGILNEKDS >gi|261748169|gb|ADAD01000052.1| GENE 83 89441 - 90727 1131 428 aa, chain + ## HITS:1 COG:FN1849 KEGG:ns NR:ns ## COG: FN1849 COG1541 # Protein_GI_number: 19705154 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Fusobacterium nucleatum # 1 422 1 422 424 531 66.0 1e-151 MKKIVKTLLTFIQVRWIKKFHSRKELEKYQNKEIKKHLKYLKTKSPYFRKNNIKGDFITD KKFVMENFDELNTLGIKKEEAMNIALESERTRDFGKKYEKFSVGLSSGTSEHRGIFVTTD SEQAVWAGTILAKMLPRGNLFGHRIAFFLRADNELYKTVNSFIIRLEYFDTFKSVNEHIE RLNKLNPTILIAPASMLILLGERKKHGMLKICPKKIISVAEILEKKDEEFISECFGNKII HQIYQATEGFLACTCPCGSLHLNEDIIKFEKEYIDEKRFYPIITDFKRTSQPFVKYRLND ILVENNEPCKCGSIFQRIEKIEGRSDDIFEFYNKKNQIVTVFPDFIRRCMLFSEGIREYQ VFQTEYDRLEIAVPEINENQKEQILKEFEKLFDSLNVENIRIKFVEYNTDNKIKLKRICQ KMKKEGKK >gi|261748169|gb|ADAD01000052.1| GENE 84 90724 - 91653 1200 309 aa, chain + ## HITS:1 COG:FN1850 KEGG:ns NR:ns ## COG: FN1850 COG0332 # Protein_GI_number: 19705155 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Fusobacterium nucleatum # 1 309 1 309 309 448 70.0 1e-126 MRKIRIRGYGVSLPEKTMKFGEQIRYRLTGEESQFNLAVDACKKALENADLSINDVDCIV 
SASAVGVQPIPCTAALIHESIAKGTDIPALDINTTCTSFISSLDIMSYMIEAGRYKRVLI VSSDSASMGLNPSQKESYELFSDGAAAIIIEKTEENKGVISAVQKTWSEGAHSTEIRGGL ANLHPENYSEETKEEFMFDMKGKIILSLSKKKIPEMLTEFLKENNIEIADIDMTVPHQAS SAMPSVMKKLGICEGKYINIVVEYGNMVSASVPFTLCYGLENRMLKTGDTVLLIGTAAGL TTNILLMKL >gi|261748169|gb|ADAD01000052.1| GENE 85 91747 - 92925 1149 392 aa, chain - ## HITS:1 COG:MTH760 KEGG:ns NR:ns ## COG: MTH760 COG0025 # Protein_GI_number: 15678785 # Func_class: P Inorganic ion transport and metabolism # Function: NhaP-type Na+/H+ and K+/H+ antiporters # Organism: Methanothermobacter thermautotrophicus # 3 389 6 393 399 264 45.0 2e-70 MLFSLTLIFLSGLILGSIFNRLKLPQLLGMLLTGIILGPYVLNLLDPKILTISADLRQIS LIIILTRAGLNLDINDLKKVGRPAVLMCFVPATLEILGMILFAPKFLGLNLLDSIILGTV IAAVSPAVVVPKMLKLMEENYGTDKSIPQLIMAGASVDDVYVIVLFTSFTGLASKGTFSF LDFVKIPTSILFGISGGFICGILAVYLFKKLHIRDSVKVIIILCISFMLVTFEHSLKGII GFSGLLAIMSIGIAIQKKNPGLSKRLSVKYSKLWIGAEIILFVLVGAAVNIKYALFSAIP SIILIFLVLIFRMIGVFICLLGTSLSFKERIFCMIAYCPKATVQAAIGSIPLAMGLSSGD TILTVAVLSILITAPLGAFAIDTTYKKLLLKN >gi|261748169|gb|ADAD01000052.1| GENE 86 93141 - 94265 1346 374 aa, chain + ## HITS:1 COG:CAC0492 KEGG:ns NR:ns ## COG: CAC0492 COG0787 # Protein_GI_number: 15893783 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Clostridium acetobutylicum # 6 366 11 371 386 171 32.0 2e-42 MLIKLEINRKNIEKNLEKIKSINKNIICVLKDNAYGLEIKNILPILIENDCFYFAVAYIG EALEIKKIIQKKYPDKADKIKIMTLNYIEEKEIKKTVKNDIELTIFNFEQLESYINASAK GKKLKIHIKLNTGMNRLGFNENEIDKLTEILRGNPNLEIISVFSHISHAENKKETEEQII KYDKMMSVFDKNSIKYKFKHIQASPLLFKYKEKYNYDFVRTGMAIYGMEPLKEKVGLYNT VKLSSKIINIRKVKKGESVSYGNGEPLEKDGKIAIIPVGYAHGLQKQIENKNAYVLIKGK RSYILGEICMDMIITDITDIEDAVIGSEAVIIGKQGSEEISLSKMAEWTGTIQDDVLTKW ERKIKRSVANLPKN >gi|261748169|gb|ADAD01000052.1| GENE 87 94409 - 94597 428 62 aa, chain + ## HITS:1 COG:MA4346 KEGG:ns NR:ns ## COG: MA4346 COG1983 # Protein_GI_number: 20093134 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Putative stress-responsive transcriptional 
regulator # Organism: Methanosarcina acetivorans str.C2A # 1 62 111 171 183 62 48.0 1e-10 MEKKLFKSKSDKKLMGVCGGLGKYFDKDSNLIRIIAVVLGLFTGGAALVAYLIAGFVLPE GE >gi|261748169|gb|ADAD01000052.1| GENE 88 94728 - 95141 653 137 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1048 NR:ns ## KEGG: Lebu_1048 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 136 1 135 135 141 58.0 8e-33 MNKREKKFYERFKSENIDSEQINKAKNMASHLGKVSSKFLLLIKMLNSDLKGEFKIPVID KLKIIGAIVYVITPTDAIPDILPILGFGDDIAVVTYVLTKLNKLISEYEEFENKKTQDKK KSDGPDFENMRVVNEDD >gi|261748169|gb|ADAD01000052.1| GENE 89 95241 - 97898 3421 885 aa, chain + ## HITS:1 COG:FN0887 KEGG:ns NR:ns ## COG: FN0887 COG1164 # Protein_GI_number: 19704222 # Func_class: E Amino acid transport and metabolism # Function: Oligoendopeptidase F # Organism: Fusobacterium nucleatum # 287 884 5 599 600 573 52.0 1e-163 MKKEMFLKQFPKDLEYEVSKLYNSFEIAKEYSVPAYTEEFYTPNIWKKLTEKIENIKIEA NGIFENSDRRQIAFIPEGFYGKNSSGIEEIYSDADGDNKNSAEFPSKLLKIKINSRFREY GHKDFLGSLTGLNIKRELMGDLIFDKETAYVPVSDKISDYILTELKQIGRDKCSVEEEDI KNREIIPEYKYDDKFITVPSKRLDSIVAAITLLSRNKVIEPIEKGKVLVDYYEEKDKSKI IETGSLITIRGYGKYKLFFGTRRNKKRKRKTAHKKIYIGKENKMAEKEIKKEYKWNLSDI YRSYKEWEKDFGKVQKLKDELLMYKGKFSDEKKLSEFLKKQEELDKIAYKLYAYPQLARD LNSSDKEATENLQKIQFLFSEITTELSWVNPELIENRKKIEKYIKKEEFSDYKFGLENLF RLQKHVLNERESKLLSYFGSFFSTPRTVYTEVTVTDVEWPVVKLSTGEKAEATPANYAKV LTKNRNQKDRKLMFDSYYGVYKRKENTIAAIYNSILQKDIAKMKAYEYDSFLLSFLEGNN IPEEVYMNLINTAKENTKPLKRYLKLRKKILGLKKYHNYDGSVNLIEFNKEYEYDDAKNI VLKSVAPLGKDYVKKMKKAVSEGWLDVFEAKGKRSGAYSAGIYGVHPYMLLNYNNTLDSV FTLAHELGHTLHTLYSDENQPFSMSDYTIFVAEVASTFNERLLLDYMLENTDDPKERIAL LEQEIRNITGTFYFQALLAEYEYQAHSLVEKGEPVTADILSKIIEKLFDEYYRKEMEKDE LIYALWARVPHFFNSPFYVYQYATCFASSAILYDKIINEKDKKKKEEALKKYIELLSSGG NDFPMEQLKKAGADLSKKETVKAVSEQFNLLLDKLEKEIEKMDLK >gi|261748169|gb|ADAD01000052.1| GENE 90 98020 - 99816 1495 598 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2228 NR:ns ## KEGG: Lebu_2228 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: 
not_defined # 3 598 4 609 610 305 34.0 5e-81 MTLKERIKKFLPNMKKGIERFPITILFSVVIFGILVYMIHQEGDIASSLEERLNKILTLL GIGLPMTATLELIREKYFPDKNKLFSRVVEGTVTLIFLYIIKLQYLGHAFIESDFIRLFF IGVIFTILFFFIPVIGRKKDAEKYIETVVVNVFVTKVFSVVLYLGLIAIIGAVDVLLINI DNKVYLYTYMFSVFIFAMTFFLSRLKEVNENLEEYQTNIVFKTLLKFIIIPLILGYTLVL YLYLGRILIQMKMPKGVVSHLVLWYTTFSLFVMIMITPMAETDKIISEFKKNFPKISIPL ILLSFVAIFQRIGQYGITESRYYIVAVAVWLLFCIIFYIFRTKVTVIMISAIAVIFVTLY VPFVNAESISLRSQNDRLEKILIKNGILKDGKLIKKTDLDEKTKAEIADIVSYINNKNEV DRKERLKVFGKESYKTADEFKNIIEMDDTWYNSYLLTEDHETRISYKLQNSEYMIRKTYG YEYLTLLSFYDGDNYESEKYKIENKKRSIKLSDKNNKELLKIDIDKILNEIMTKKEATVE DNKDYYITASLPENIMDYTGENENIKYKLLIRNFTVIKTKDEKLKFEDCDVDFYFSGK >gi|261748169|gb|ADAD01000052.1| GENE 91 99888 - 100400 829 170 aa, chain + ## HITS:1 COG:PM0896 KEGG:ns NR:ns ## COG: PM0896 COG2190 # Protein_GI_number: 15602761 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Pasteurella multocida # 1 166 1 162 166 166 55.0 2e-41 MGLFDFLKGNNKETEKKEFEGKIYAPISGKILPLEQVPDEVFAQKMVGDGVAIEPDASGV MLAPVDGKVEKIFDTNHAFSVTTTSGMEVFVHFGMDTVKLDGKGFERIANEGDIVKKGDP LIKYDYEFLKENAKSVITPVIISNYDEFSSLEKKDGNAVAGETLVLNVIK >gi|261748169|gb|ADAD01000052.1| GENE 92 100529 - 102991 2581 820 aa, chain + ## HITS:1 COG:FN0743 KEGG:ns NR:ns ## COG: FN0743 COG1199 # Protein_GI_number: 19704078 # Func_class: K Transcription; L Replication, recombination and repair # Function: Rad3-related DNA helicases # Organism: Fusobacterium nucleatum # 87 818 9 741 741 539 42.0 1e-153 MRIADFISNETAVIMKKEIEESGGNEVFFRGIPDENGFVVDVKVIARGNENSVAALLNMM RKNEVIIHNHPSGVLIPSNEDVTVSSVYGESGGASYIVNNDVDDIYVIVPLKKHNKINID EFFGEKGKIHQKIKKFEPRKEQYKMSKYIEHCVNENKKLIIEAGTGTGKTIAYLLPTLLY ALENNLKLIISTNTINLQEQLINKDIPLIKKIIEKDFTYEIVKGRGNYLCKRKLKNLNVI ESDKDTEEERKEKKILKNILEWDEITETGDRSELKYEIPNYIWEQVNCESDLCTQSHSPD CYFYKARKKIGNADMLIVNHHLFFADLAIRNEIGFNTEYSILPNYDIVVFDEAHNIEDTA RNYFTYEVSRHTLGRLAGNIHNRRATGANNAGALARVMVFLNESLKQEEYVKADEMKDEV IKALNEYYDMGNVILDKIIFPFAQELTSGEIKRRIDKNEIKNSQFWREIMELKKEMKVKY 
GDFLRKTARFLTFIDNFELEDENGIVFDFMRYFERMKQYNANFEFILQGDDENYVYWLNI NASRANIKLYATPFDVAESLEENLFNKMNRMIFTSATLAVDNKFDYYKKSIGLQKEEKKN IEEKIISSPFDYEKQMKVYIPNDTLDPNDINFLDDLEYFTEKIIKKTEGRCFLLFTSYSS MNYMYNRLKNYFNKNEYTFIKQNDYPRHEMIEIFKNSKNPVLFGTDSFWEGVDVQGEQLK SVIIVKLPFKVPNDPVTEAIIEDIRKKGKNPFNDYQVPQAVIKFKQGVGRLIRSKKDTGI ITILDNRVIKKSYGKKFLKSLPKNIIEKSKNEILDLAESE >gi|261748169|gb|ADAD01000052.1| GENE 93 103005 - 105059 736 684 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762592|ref|ZP_02169656.1| ribosomal protein S21 [Bacillus selenitireducens MLS10] # 201 673 243 731 750 288 33 1e-76 MTINIFGREIVFEVKEKNQKEEQKKTSGKKFINRFIWSMIVFFCFGTAIELSKIKANYIV GTVAPTDIIAYKDVVYNVDIMDKGVKDKIMKNTTPEYDKIPSVPEESVVGINNFFRDIKN IDLSNEQSISNFIKENKYNLTVKDIKAITERESVGYVVNLITIVKEIYEIGVVKASDFEK IIAANNIKIDELDKKFLKNFIKPNLKINEKKTAEKIEDNIDSLRNNEIKIYKGDIIVKKG EIIDSDDFEKLERLNLVRGQDKAKKIAGLFITFVVVGAVFYYITKKYCKKIVESKAFYPS LLTVIMINLSYVLFANDEFFIYLLPFATIPIILTVLGDKVFALTFTFFNMIILSRDETWF LVIIAVTLVAVYRADKLTNRSDIVKLGIFTGIFQGIMSLSYGLVNQLEFPMQIVMIMFSV LSGILTGMISLGILPYLENSFDILTDIKLLELSDFSHTLLKQLLVTAPGTFHHSIMVGAL AEAGAEAIGANATFTRVASYYHDIGKMKRPMFFVENQRGGKNPHNTIRPSLSALIITSHT KDGYIIGKQNKLPKEILDIILEHHGTTLVQFFYYKALENGEEVLESDFRYSGPKPRTKES SIILLADTIEAAVRASEDKSREGIENLIRYLIRYKIDDNQLNNSDLTLGEIEKVIQAFLN VLQGAYHERIKYPKLDEKSERREK >gi|261748169|gb|ADAD01000052.1| GENE 94 105059 - 105523 723 154 aa, chain + ## HITS:1 COG:FN0746 KEGG:ns NR:ns ## COG: FN0746 COG0319 # Protein_GI_number: 19704081 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Fusobacterium nucleatum # 13 154 20 161 162 136 62.0 1e-32 MLECDITYEINEIEEFLDEKKIENFIKFILENEFKEEYSENDYYLSLLIADNKIIRKINK EYRNKDLETDVISFAYNETENIGGMNVLGDIIISLERVKSQAEDYGHSEEREFYYVLCHG MLHLLGYDHIEEEDKKVMREKEEKILKQFNYTRD >gi|261748169|gb|ADAD01000052.1| GENE 95 105566 - 106390 1250 274 aa, chain + ## HITS:1 COG:CAC1294_1 KEGG:ns NR:ns ## COG: CAC1294_1 COG0818 # Protein_GI_number: 15894576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
Diacylglycerol kinase # Organism: Clostridium acetobutylicum # 35 192 2 157 157 100 39.0 4e-21 MDKNDKEKTGKKKRKSFLFRKYENYDWDIKSERERDRRIVDSFNFAIEGLTEAIRNEKHM KVHILLAIIVVILAILTNAAKSEILIISVSVSFVMITELINTSVEALVDLISPKRHPLAK LAKDVAAGAVLIAAINALCVGYLIFYDKLLNILDTKNQLHVIAGRKGNIAILILVLIAII VIIIKSFYKKGTPLEGGMPSGHSAIAFATFGILLFMTSDVRVLALVFLMALLVAQSRVKA GIHSFREVFAGGVLGFAVAFIIMFVMIHFGILYN >gi|261748169|gb|ADAD01000052.1| GENE 96 106444 - 107367 1308 307 aa, chain + ## HITS:1 COG:BS_yybQ KEGG:ns NR:ns ## COG: BS_yybQ COG1227 # Protein_GI_number: 16081107 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase/exopolyphosphatase # Organism: Bacillus subtilis # 3 304 4 307 309 332 58.0 7e-91 MSILVFGHKNPDTDTICSAIAYAELKGKTGKKVKPARLGEVNEETKFVLDYFKIQKPELI ETVAGKEIMLVDHNERTQTADGFEEAKVLELVDHHRISNFNVDEPLYVRMEPVGCTATII LKLFKEKNLLPSKETAGLMLSAIISDTLLFKSPTCTEYDVKAGKELAEIAGVNLEEYGLE MLKAGTALGGKSESELLNMDMKIFEIDGKRIGVGQVNTVNEEEILERKEKLISEMNTLNS KENLDFFLFVITNILSNDSIGIVAGNGNEIIEKAFNGKVESNTVLLKGVVSRKKQVIPPL TRAIQGN >gi|261748169|gb|ADAD01000052.1| GENE 97 107431 - 108432 315 333 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 32 333 14 318 318 125 29 7e-28 MNENTENEIIKCEKLNYWYQTYEKSSGIKGTLQDLWKRKHKKAPAVIDVDILINKGEIVG LLGPNGAGKTTLIKLLTGILELKSGEIKCLNKNPYKKEKEYLKNIGVVMGQKSQLIWDLP SMETLKMLKEIYEIEKREFEERLEKLLKLLNLKEKINIPVRKLSLGERIKFELICSLIHK PEILFLDEPTIGLDITSQYAVYDFLKEVNKTENTTIILTSHYMKDIEKLCERVIIILKGE KHFDLTIDELKQKFITKKTYIVESKNKKLPFEENNFIVKKLDNKIFEIYHEQGDIKIEEL NLKDIVSIKENTPELEDIIFELFSNFKEKEQEK >gi|261748169|gb|ADAD01000052.1| GENE 98 108509 - 109300 693 263 aa, chain + ## HITS:1 COG:no KEGG:LBUL_0130 NR:ns ## KEGG: LBUL_0130 # Name: not_defined # Def: ABC transporter permease # Organism: L.delbrueckii_BAA-365 # Pathway: not_defined # 3 262 1 247 248 81 29.0 4e-14 MAIKKYVIIALNEITIKLQYKFNSFVGIFLRFAQLSMLYYIWNGIYNSTKTGKIGNYTKK EMLIYIVLINLTVALFSSSYMMELGKMIHNGKLTTLLLRPINLLGESFAKYTGRKLFLII 
IYLSALIFGRFTGFFKSPVYFLGMLIFIILSYIMFFYMISFMSTVGFWLIEVWPLSAVVN GIYLLLSGIYFPLDLLSENFYNIIKYNPFAIVGYASIHGLQGMFTEEEVFFYIIVIIIWI FVFRAAYHYTFTKGLRKYEGMGT >gi|261748169|gb|ADAD01000052.1| GENE 99 109305 - 110078 665 257 aa, chain + ## HITS:1 COG:FN0881 KEGG:ns NR:ns ## COG: FN0881 COG3694 # Protein_GI_number: 19704216 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Fusobacterium nucleatum # 4 256 6 260 262 67 25.0 3e-11 MNMKVYKVMISNSIQQMFIYRTTSLLIVVFGLIFYLFELFSGFVYFKYTDNIMGWTKWDY FSLITTSTIIMYGYNFFFVWGDSTLDEEILEGKLDYIFLRPINSFWYYSLYRIDFPSLIN ILTGIAVQSYIIYREKIGILQILMYIVSVIMGVWFIFLFNHLTVTVAFWKEKANELTWVP EILTDFSSRPASIYPKIIRFLMIWIIPILTSINLPVDILRGKVNMMSMLWYILFLIVFTI AVYKMWYAGIKKYQSSN >gi|261748169|gb|ADAD01000052.1| GENE 100 110154 - 110501 431 115 aa, chain + ## HITS:1 COG:no KEGG:Sterm_0612 NR:ns ## KEGG: Sterm_0612 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 1 112 1 112 116 95 49.0 5e-19 MAIIKFKKREEPEILFGIKLPSIAVDLSKEIKNKKKVYEIIRNTLNIADGRLINIVDVRD AYDNPASVLVIYDNFVTEREILKMDLEIEVFDFNIFEFDYNNKINIEDVIKRIKN >gi|261748169|gb|ADAD01000052.1| GENE 101 110526 - 111191 832 221 aa, chain + ## HITS:1 COG:FN1800 KEGG:ns NR:ns ## COG: FN1800 COG0652 # Protein_GI_number: 19705105 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Fusobacterium nucleatum # 39 219 29 208 274 132 42.0 7e-31 MKRILLLGILMMMFFNLGNAKAVKIKKTREERKLERQEKKEDSKFKKQMKKFELEAVIKT NKGDISVFLYPEAAPQNVANFVFLAKNNFYNGLSFHRIIGNVLVQSGDPKGDGTGSTGYL VNDEIVDWLTFDSAGKVGMANTGPNTNSSQFFITVSPVSQLNGKYTIIGEIKSREDMSVL RLIRPEDKIVDIEIKGRKVDDFLTYFSTEVAQWEEILKQSR >gi|261748169|gb|ADAD01000052.1| GENE 102 111218 - 111628 138 136 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037787|ref|ZP_06011229.1| ## NR: gi|262037787|ref|ZP_06011229.1| sn-glycerol-1-phosphate dehydrogenase [Leptotrichia 
goodfellowii F0264] sn-glycerol-1-phosphate dehydrogenase [Leptotrichia goodfellowii F0264] # 72 136 1 65 65 117 100.0 3e-25 MPEGFMLARYCRTISGQFFINSFRNVCAHNERLYEHRHTRGIKIKSTILHNLHNLEFKYS LFDLILILVSFMTNFEYSLLFKDLIVSPLALLYSNIGEKRFIKILNMMNFPKNWQEILTI EDSFNKYIDLYEKEKI >gi|261748169|gb|ADAD01000052.1| GENE 103 111719 - 112822 1467 367 aa, chain - ## HITS:1 COG:VC0566 KEGG:ns NR:ns ## COG: VC0566 COG0265 # Protein_GI_number: 15640588 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Vibrio cholerae # 96 367 92 364 456 233 45.0 4e-61 MKLRKTLALTFLLTALGIHAETANPKPNVEVTTKSTTSGDVYRIQDAFSNVYERTKDSVV NIRTKKTIIVNTYNPLEELLFGTSGRRQIKKESGSLGSGFIVSNDGYVMTNNHVIDGANE IFVKLSDGNEYSAKFIGGSPEIDIAILKINSNKKFVPLKFENSDNIKIGHWAIAFGNPLG LNSSMTVGVIGALGRSSLGIEQVENFIQTDAAINQGNSGGPLLDINGNVIGVNTAIFSTT GGSIGISFAIPSNLAQNVKDSIIKTGKFERPYLGISIADLSSIPANQQKPPYSYGVYVNK VFPNSPAAKYGIRDKDVILELNNRRISSASSFIGELAAKKIGDTVNLKVYSDGKEKNVNI VLEKYNN >gi|261748169|gb|ADAD01000052.1| GENE 104 112958 - 113770 1063 270 aa, chain + ## HITS:1 COG:BS_fruR KEGG:ns NR:ns ## COG: BS_fruR COG1349 # Protein_GI_number: 16078502 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Bacillus subtilis # 27 255 1 230 251 162 36.0 4e-40 MKKMIDFEKNKVYYMGAKIKMTGRENMLEIDRFEKIMQELNKKGRLSYRELDDIMEVSSS TVRRDIEKMYSKGLLLKIKGGICQQKKLSFDVEVKDRFRENVEAKKEIAERISKTLKDKD FIFLDAGTTAFYLIEKLRGKNITVVTNGTMHIEELLKYKIKTIILGGEIKESTKAVIGLE AILSLEKYRFDKCFLGVNGINLENGFTTPEINEAMIKKKVLELSEEKYILADKEKFDKMS NVKFAELEECKIVTTKQAIKENNRYKKYFY >gi|261748169|gb|ADAD01000052.1| GENE 105 113849 - 114766 1206 305 aa, chain + ## HITS:1 COG:BB0630 KEGG:ns NR:ns ## COG: BB0630 COG1105 # Protein_GI_number: 15594975 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) # Organism: 
Borrelia burgdorferi # 1 300 1 301 307 238 44.0 1e-62 MIYTLTLNPALDYDMYIKEDLKTEHLNLADKVNYRAGGKGINVSKVLKNLDVESVAIGYT AGFTGNFIIEDLKKDGIKSEFVELDGNTRINVKVNGNDKETEITGVSPVISEEKLKLLME KVSHLKDGDILVLSGSIPESVSKTVYKELSEKIKTDVEIVLDTRGNLLQNNIHNNFFIKP NIHELRDMFDEKLETKQEIVKKCSFFLERGVKNVIISRGGDGALLVNKDFVLEASVPKGE LINSIGAGDSMVAGFIAGYVKGMSTEDSFRLAVAAGSATAYSYGLAEKDFVNKLYEEINI VKEGV >gi|261748169|gb|ADAD01000052.1| GENE 106 114771 - 114893 238 40 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKISDLLIKERINLDLKSEDKKSVIREIAKLHDNTGVLTD >gi|261748169|gb|ADAD01000052.1| GENE 107 114903 - 116681 3088 592 aa, chain + ## HITS:1 COG:SPy0855_3 KEGG:ns NR:ns ## COG: SPy0855_3 COG1299 # Protein_GI_number: 15674888 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Streptococcus pyogenes M1 GAS # 251 587 1 358 360 332 58.0 1e-90 MGALNAREAQSSTALEEGIAIPHAKTEYVKEPALAMGRSSKGIDYDSIDGEKSTLFFMIA APAGANNTHIETLARLTQLLLDDEFKTSLDNAKTPEEVLDIINAKEAEKLAAENAEKEAA NAPIGNDETYIIAATACPTGIAHTYMAEEALKKAATEMGIKIKVETNGTDGRKNELTAED IKAAKGVILAIDRNIEMARFDGKQLIKVGAKEGINNAKGLIQKVLDGNLPKYAASASEEG SASKSNEKSGLYKHLMSGVSYMLPLVISGGILIALAFLVDTLMGNTDAGKDFGSKAALAK MLMTIGGQAFGLFVPILGGYIAYSMADRAALTAGLVAGALASSLGSGFLGAIVGGLFAGI VVKFLIKSLSGLPKSLNGLKMILFYPVLSVLIVGLAMIIVINPIVSIINTGLTNYLNLMN GSSAVLLGLILGGMMAVDMGGPVNKAAYVFGTGTLAATMTTGGSGVMAAVMAGGMVPPLA IALASTLFKNKFTEEEREAGLTNYVMGLSFITEGAIPYAAADPTRVIPANIVGAAIAGAL TMLFGIKIRAPHGGILVMALSNNFVMYFIAIAIGSAISAVLLGLLKKNIKKA >gi|261748169|gb|ADAD01000052.1| GENE 108 116982 - 118022 1561 346 aa, chain + ## HITS:1 COG:BH3640 KEGG:ns NR:ns ## COG: BH3640 COG0444 # Protein_GI_number: 15616202 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Bacillus halodurans # 5 337 4 337 340 454 67.0 1e-128 MEETKTLLKINNLYTSFRIKDEYYPAVDNVTLDLQKNEILAIVGESGCGKSTLATSIIGL 
HNPINTKSEGEILYKGRNLLKLTENEYNDVRGNEIGMIFQDPLSALNPLMRVEEQIEEGL IYHTKLNKDERKKRVQELLVQVGIPKPERVARQFPHELSGGMRQRVMTAIALSCKPEIII ADEPTTALDVTIQAQILDLLKSLQEEIQAGIILITHDLGVVAEMADRVAVMYAGQIVEIA TVDELFNNPKHPYTRSLLNSIPQLDSETDKLHVIQGTVPSLKNLERKGCRFAPRIPWIKE EEHEENPELHEILPGHFVRCTCWKNFYFKEDESNEKNINNLKSEEV >gi|261748169|gb|ADAD01000052.1| GENE 109 118024 - 118965 731 313 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 1 262 11 273 329 286 54 4e-76 MIEIKNLKVHYPIRGGFFNRIVDHVYAVDGIDLNIEKGKTYGLVGESGSGKSTIGKAIIG LEKIKDGTIKYEGELIPKKIGRKSDYSQNVQMIFQDSLSSLNPKKRIIDIIAEPLRNFEK LSEREEKARVMELLEIVGMTEDALYKYPHEFSGGQRQRIGVARAVAIKPKLIIADEPVSA LDLSVQAQVLNYMKEIQNQFGLSYLFISHDLGVVKHMCDHISIMYRGRFVETGEKKDIYE NPQHIYTKRLIAAIPEVNPNHREVNKQKRMAVEKEYKTNELEYYDEKGKVYDLKKLSDTH YAALAHEVKGGEQ >gi|261748169|gb|ADAD01000052.1| GENE 110 118967 - 119929 1460 320 aa, chain + ## HITS:1 COG:BH3638 KEGG:ns NR:ns ## COG: BH3638 COG0601 # Protein_GI_number: 15616200 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus halodurans # 1 320 1 322 322 367 54.0 1e-101 MWKTVLRRILVMIPQLFILSILIFILAKKMPGDPFTGLITPQTSPAVLEELRRKAGFYDP WHVQYIRWMTRAFKGDFGLSYAYKLPVSTLVGQRVNNTFILSLLSMILTYSIAIPLGMIA GRHQNSLIDKGVTFYNYVSYAIPTFVLSLIMVWLFGYTLGWFPTTGSVTIGLTPGSFKYF IDRLHHIMLPAITYALLNTTGTIQYLRNEVIDAKSLDYVKTARSKGVPENVVYSRHIFRN SILPIAAFFGYSITGLLSGSIFIEMIFSYPGMGNLFISSISTRDYSVVTTLILLFGFLTL LGSLLSDIILSIVDPRIRIE >gi|261748169|gb|ADAD01000052.1| GENE 111 119940 - 120839 1457 299 aa, chain + ## HITS:1 COG:BH3637 KEGG:ns NR:ns ## COG: BH3637 COG1173 # Protein_GI_number: 15616199 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus halodurans # 14 299 17 302 302 328 56.0 6e-90 
MSKKKKNDNQNIGEKPTGISIILRELKKDKLAMGSVILLVTLFVIIFVSAALINQEDVMK ISLLDKYARPGEGFWLGADYGGRDILGQLIIGARNSIIIGFTVTVLTSGLGIIVGLIAGY YGGVVDDLIMRLIDFIMILPTLMLVIVFITIVPRYNIFTFIMIMSAFYWVGTARLIRSKA LSEGRRDYVHASKTMGTSDFKIIFREMLPNISSIIIVDSTLAFAGNIGIETGLSFLGFGL PPSIPSLGTLVGYATNPEVLSSKPWIWMPASILILVMMLCINYVGQALKRAADARQRLG >gi|261748169|gb|ADAD01000052.1| GENE 112 120893 - 122363 2183 490 aa, chain + ## HITS:1 COG:BH3636 KEGG:ns NR:ns ## COG: BH3636 COG0747 # Protein_GI_number: 15616198 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Bacillus halodurans # 42 490 64 507 610 360 39.0 4e-99 MKRKNWIMTIILSILVLLTACGPGERKSQGGGEKADTSKFPVATSNNDPAVKDAVLKVAV VKDAPLVGLFNLAFFQDGADGEIIENFFFDQIFEVDENFEVTDTGVATLQVDVPNKKATV KIKDGIKWSDGQPLIADDLIYAYEVIGSKDYTGVRYNEDSEKVVGMKEYHEGKASTISGV KKIDDKTIEVSFTELGQGIYTLGNGLQAIALPKHYLKDIPIKDMEKSEKIRSKIVTLGAY TISNIVQGESLELKANEFYFKGKPKIEKAVVQIVNSNTISSALKSGEYDIVIQIPIDQYK VFKDYDNLQILGRQELYYSYMGFKVGHFDKVKGENVTDPNAKMSDVRLRQALAYGLDVDQ MVNAFYSGLRERATSTVPPVFKKYYPKNLEGYPYNPEKAKKLLDEAGYKDKDGDGYREDK DGKPFEVKIAAMSGGDIAEPLVQFYIQQWKEIGIKAVLTTGRLIEFNSFYDKVQADDPEI DVYFAAWGVG Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:48:01 2011 Seq name: gi|261748167|gb|ADAD01000053.1| Leptotrichia goodfellowii F0264 contig00204, whole genome shotgun sequence Length of sequence - 387 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 386 580 ## COG3210 Large exoproteins involved in heme utilization or adhesion Predicted protein(s) >gi|261748167|gb|ADAD01000053.1| GENE 1 2 - 386 580 128 aa, chain + ## HITS:1 COG:FN1817 KEGG:ns NR:ns ## COG: FN1817 COG3210 # Protein_GI_number: 19705122 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 4 127 2216 2349 2806 73 38.0 8e-14 KANETTKDTHRTTNINIESQTIEYATNPGKFKEDLGKSKDEINDIGRAIKESLNDRGDDN RNFLGQLSEGRLQRTIENIGGERLRKSTTQEDISKTLKDTYKDLGYDINIVFTTPDKAPQ LIDEKGNI Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:48:01 2011 Seq name: gi|261748165|gb|ADAD01000054.1| Leptotrichia goodfellowii F0264 contig00241, whole genome shotgun sequence Length of sequence - 283 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 283 177 ## gi|262037810|ref|ZP_06011250.1| dihydrolipoyl dehydrogenase Predicted protein(s) >gi|261748165|gb|ADAD01000054.1| GENE 1 1 - 283 177 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037810|ref|ZP_06011250.1| ## NR: gi|262037810|ref|ZP_06011250.1| dihydrolipoyl dehydrogenase [Leptotrichia goodfellowii F0264] dihydrolipoyl dehydrogenase [Leptotrichia goodfellowii F0264] # 1 94 1 94 94 164 100.0 2e-39 DSKDNYIVFDDCNNKNTVIISDRKKICKNCKTVIRDQILYGKEHQDSAILIHGAGVIKDK KGFVITGNKRAGKTTTMINFLRNGFDFCSNDLSF Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:48:09 2011 Seq name: gi|261748158|gb|ADAD01000055.1| Leptotrichia goodfellowii F0264 contig00165, whole genome shotgun sequence Length of sequence - 3138 bp Number of predicted genes - 6, with homology - 5 Number of transcription units - 3, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 3 - 135 122 ## gi|262037817|ref|ZP_06011256.1| VIP protein 2 1 Op 2 . - CDS 157 - 816 1084 ## Sterm_1092 hypothetical protein - Prom 855 - 914 14.8 3 2 Op 1 . - CDS 1105 - 1563 436 ## gi|262037812|ref|ZP_06011251.1| conserved hypothetical protein 4 2 Op 2 . - CDS 1581 - 1709 70 ## - Prom 1803 - 1862 5.0 5 3 Op 1 . - CDS 1885 - 2289 436 ## gi|262037815|ref|ZP_06011254.1| hypothetical protein HMPREF0554_2217 6 3 Op 2 . - CDS 2369 - 3136 692 ## gi|262037816|ref|ZP_06011255.1| hypothetical protein HMPREF0554_2218 Predicted protein(s) >gi|261748158|gb|ADAD01000055.1| GENE 1 3 - 135 122 44 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037817|ref|ZP_06011256.1| ## NR: gi|262037817|ref|ZP_06011256.1| VIP protein [Leptotrichia goodfellowii F0264] VIP protein [Leptotrichia goodfellowii F0264] # 1 44 1 44 45 71 100.0 2e-11 MKKILLVLSLILTVNSISYSEGKILKNELPKNQREFFEGGRLID >gi|261748158|gb|ADAD01000055.1| GENE 2 157 - 816 1084 219 aa, chain - ## HITS:1 COG:no KEGG:Sterm_1092 NR:ns ## KEGG: Sterm_1092 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 3 219 2 210 210 139 43.0 1e-31 MKKHILTLLGLLALSTASFAENNIIEFKIGLSPVSKFDVTPSKKAKFSYELGAEYRYLVT NNTEIGAGLSYQNHGKLKKFTDVEDNNLRVEVSDTELYDSVPLYLTAKYNFRNNSDIVPY VKADLGYSFNINGKNSSQYKTYSKATGAVLDEGKLKDFKAENGVYYSVGAGVVYKGFTTG LSYQVNTAKIEGTRYDGLKDKGSANFRRFTLSFGYQFGL >gi|261748158|gb|ADAD01000055.1| GENE 3 1105 - 1563 436 152 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037812|ref|ZP_06011251.1| ## NR: gi|262037812|ref|ZP_06011251.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 152 1 152 152 237 100.0 2e-61 MYGFPQLLITSDINNDGIKEIIIEKNSNDLFVFKFSDEKLVKVFSGFKEINFDNYYKIND EVEKLVITIKDDFGYNKKIFQKLPNLMEIQDSKEGIKDESIYYQLISINNKIIMRRIGSQ NYEIVFDKNLNFKIKAITEKNNELKLSTGYYY >gi|261748158|gb|ADAD01000055.1| GENE 4 1581 - 1709 70 42 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIQKKEEIAKEKMEIMLLTETDMETSEENKLYLYIKKIIKNV 
>gi|261748158|gb|ADAD01000055.1| GENE 5 1885 - 2289 436 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037815|ref|ZP_06011254.1| ## NR: gi|262037815|ref|ZP_06011254.1| hypothetical protein HMPREF0554_2217 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2217 [Leptotrichia goodfellowii F0264] # 1 134 1 134 134 127 100.0 4e-28 MIVMGVINYIIYIFTGLLLLSLKNCGYLKIKNGKLQVIIGTLDILYTITYHIFFGIGMAI LTLETEYSVYKIAFILICLLIIPIGVYKITIIIKKIFNIGKNDLKKYIISIEIIDLSVFL IQNVILVLIDNQIS >gi|261748158|gb|ADAD01000055.1| GENE 6 2369 - 3136 692 255 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037816|ref|ZP_06011255.1| ## NR: gi|262037816|ref|ZP_06011255.1| hypothetical protein HMPREF0554_2218 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2218 [Leptotrichia goodfellowii F0264] # 1 255 1 255 255 462 100.0 1e-129 DTEDYSTFYGRQLEKYYRKHFGDADATFVTEQLKYTKEQLGKDWENDIWVSYNPLKINHV SIYNDKIDRLFSYGEPVPYVKKVYKKKWLLRDFSVGRAENDALQISTLENYLKDARRKKD DVVMFRLNLSDKALARVEKRLEKVTKGFYKFDCPSGEAGCQRKGEYYLMVKTDFLPNKFQ QYSLLRNNCAVFVKRMLFNDVIGSGMNFSLGTVGNLPSNVADWAYNRTSTILDRNRKVTR SNKIIYYDYNNTGKK Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:48:51 2011 Seq name: gi|261748156|gb|ADAD01000056.1| Leptotrichia goodfellowii F0264 contig00139, whole genome shotgun sequence Length of sequence - 356 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 291 418 ## gi|262037819|ref|ZP_06011257.1| putative ciliary WD repeat-containing protein ctxp80 2 1 Op 2 . 
+ CDS 281 - 356 65 ## Predicted protein(s) >gi|261748156|gb|ADAD01000056.1| GENE 1 1 - 291 418 96 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037819|ref|ZP_06011257.1| ## NR: gi|262037819|ref|ZP_06011257.1| putative ciliary WD repeat-containing protein ctxp80 [Leptotrichia goodfellowii F0264] putative ciliary WD repeat-containing protein ctxp80 [Leptotrichia goodfellowii F0264] # 1 96 1 96 96 161 100.0 2e-38 EGNVLINYKGIGKAADNKGVFAQGVNFKTKGQIVGLSDGNIYLQGTKDRIKNIYDSHTVK SFWGVTYKRKSDYISDDKEKYRHSQLYGESGVKNGL >gi|261748156|gb|ADAD01000056.1| GENE 2 281 - 356 65 25 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDSEGKLRIEGVDIQTVGPVFLRGK Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:49:02 2011 Seq name: gi|261748154|gb|ADAD01000057.1| Leptotrichia goodfellowii F0264 contig00129, whole genome shotgun sequence Length of sequence - 213 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 212 371 ## gi|262037821|ref|ZP_06011258.1| hypothetical protein HMPREF0554_1995 Predicted protein(s) >gi|261748154|gb|ADAD01000057.1| GENE 1 2 - 212 371 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037821|ref|ZP_06011258.1| ## NR: gi|262037821|ref|ZP_06011258.1| hypothetical protein HMPREF0554_1995 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1995 [Leptotrichia goodfellowii F0264] # 1 70 1 70 71 64 100.0 2e-09 KEGYILADGITKEEAKKWEVKEKGPEKKEDKKEEEKKDEVKEQTTGVELKADRLDNTKGI IASLGQTTLN Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:49:09 2011 Seq name: gi|261748152|gb|ADAD01000058.1| Leptotrichia goodfellowii F0264 contig00002, whole genome shotgun sequence Length of sequence - 4956 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 4954 7209 ## COG3210 Large exoproteins involved in heme utilization or adhesion Predicted protein(s) >gi|261748152|gb|ADAD01000058.1| GENE 1 1 - 4954 7209 1651 aa, chain + ## HITS:1 COG:FN1817 KEGG:ns NR:ns ## COG: FN1817 COG3210 # Protein_GI_number: 19705122 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 3 1634 614 2344 2806 774 38.0 0 KITSNDLKVTGNTFLNNDKVDATSADFNLQNDFTTKGTVSTDKLIVKSNNVVNDGKILSG SANISVKEKVINNDTVNIGKFDVNAKSLENKKDIRANNLNITTNENTVNTGTITANNVSV NAGSVVNTNKIETESLNIKTTGNTENSGQILSNNLTVNSDNFINKNEVSVDKGTITANNV IKNENKINANELTTNSRSLVNEKKIEVGTGTFTTTEDIENKDLLAGNNLKFKGRNLLNSE GKTIFATNKLEADMAGNVVNKKAEILSQNEIELKADIIDNTVGKIKGTGDVKLTGRKIEN VGEVGDLTKYKLYWETWNGKRFDSHADVATPERGWTIEARETLSGSTEDKESTFEYFLSN FGGADGVSYIFKELDKVVRQKKYTQQGTIAVNSTEYPVVPLMNKIESEAKTNHAVISGKN ITINAGEELNNTDGIISASEINKIDAPKIVHKVTIDENNPIALTDGVEKLTWWKTSHHGK SRYPVNYERYLVPSSRVGYVAGQPSIIEGKEVLIDKTKIVSERFDSAKGQIIKNAGTQNN VSGKNIEIVKNVSGIQDIKNTGIISVNIGSISGINGVNVNTGINGRNGINIGSLYVPSTD PSSRYVIENRTEFITRGNYYGGDYFLNRMGYVEDWDRVKLLGDAYYDNMLIEQTLTEKLG TRYINGLSGEELTKQLIDNATTAKKDLQLRQGVALTKDQINNLKNDIIWYEYETVNGKKT LVPKIYLSKATIEKLEVDGRSKIYGTEKTIIKASGDFENTGIKIGSKTGITEVTGNRIKN ETVTNERAEIVGKKVKVEALTGNIENIGGKIVAVERTELIAKNGDIINDGKKIKEGYYLD SKNHTEYETVSNIGEISAESVRLETNNYNSTGGALVTKNLELQLTGNINENALELNGSDR FGDDKNYQKYKAKEYVGSGIVAEKATGKVGDINLKGSAFVVEDGKGLKVGNVKAESAVNE YDSELKGSIKGPVSRSSSFTQSHIEENAASNFKIGENADIKGTVTGIGSNIYIGDNSFVG GKVTTDSRELHNTYYNEEKKSGFSASAKGTSVSAGYGKSKNTYDEKSTINAKSSLHVGNN TVLNNGASITATDFEHGKIEINNGDVTYGARKDTRDVSTSSKSSYIGVTANVKSPALDRV KQAKEAVDQIKKGDTVGGAVNAVNFVTGTVHGLAGNQGNRQRNYDKNGSVGKQGVKDASA NNNFYANIGLDVGFNTSKSETNSHEESGVVTTIKGIDEESSITYNNVKNINYIGTQAKDT KFIYNEVENINKEAVKLNNSYSSNSKGFGVSAGATVGYGHKLQTTGNGGSVSVTRSNMNT EETLYQNGRFTNVEEVHNGTKNMRLSGFNQEGGKVAGSIENFVEESKQNTSRTTGSSSGV TLGIGSNGVPSSASISGSRTSGSRAYVDNQTTFILGEGSNLTIGKVENTAGIIGVEGNGK 
LKINEYKGKDLYNHDNLTTTGGSIGVDFGKGGAKVNGIGVNNENHRKEGITRHTVIGNVE TGNATGSPINRDRSKANETTRDDHSSTNVYIEGQTIDYATNPSKLKEDIGKAKDEIKDIG RAIKESLNDRGDDNRNFLGQLSEGRLQRTIENIGEERLRKSTTQEDISKTLKDTYKDLGY DVNIVFTTPDKAPQLRDEHGNIKAGTAYVGA Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:49:15 2011 Seq name: gi|261748135|gb|ADAD01000059.1| Leptotrichia goodfellowii F0264 contig00017, whole genome shotgun sequence Length of sequence - 17544 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 6, operones - 5 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 494 - 553 13.8 1 1 Op 1 . + CDS 684 - 1646 1452 ## COG2984 ABC-type uncharacterized transport system, periplasmic component 2 1 Op 2 9/0.000 + CDS 1671 - 2636 1265 ## COG2984 ABC-type uncharacterized transport system, periplasmic component 3 1 Op 3 13/0.000 + CDS 2664 - 3809 1688 ## COG4120 ABC-type uncharacterized transport system, permease component 4 1 Op 4 . + CDS 3810 - 4595 230 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Term 4741 - 4800 3.0 5 2 Tu 1 . - CDS 4819 - 6243 1801 ## Lebu_0576 hypothetical protein - Prom 6270 - 6329 11.0 + Prom 6302 - 6361 10.1 6 3 Op 1 2/0.000 + CDS 6382 - 7134 1002 ## COG0778 Nitroreductase + Prom 7138 - 7197 12.2 7 3 Op 2 . + CDS 7228 - 7989 1111 ## COG0778 Nitroreductase + Term 8014 - 8072 10.4 + Prom 8019 - 8078 9.7 8 4 Op 1 . + CDS 8144 - 9409 1813 ## COG1760 L-serine deaminase 9 4 Op 2 . + CDS 9435 - 10034 586 ## COG3124 Uncharacterized protein conserved in bacteria 10 4 Op 3 . + CDS 10043 - 10756 1013 ## Lebu_0574 hypothetical protein 11 4 Op 4 . + CDS 10784 - 11641 1083 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family 12 4 Op 5 . + CDS 11654 - 12022 362 ## COG1733 Predicted transcriptional regulators + Term 12023 - 12070 -0.7 + Prom 12024 - 12083 4.8 13 5 Op 1 . + CDS 12103 - 12660 473 ## COG1335 Amidases related to nicotinamidase 14 5 Op 2 . 
+ CDS 12666 - 13415 926 ## COG3142 Uncharacterized protein involved in copper resistance - Term 13385 - 13431 12.1 15 6 Op 1 3/0.000 - CDS 13436 - 16933 4400 ## COG1410 Methionine synthase I, cobalamin-binding domain - Prom 16973 - 17032 10.6 16 6 Op 2 . - CDS 17079 - 17465 209 ## COG0646 Methionine synthase I (cobalamin-dependent), methyltransferase domain Predicted protein(s) >gi|261748135|gb|ADAD01000059.1| GENE 1 684 - 1646 1452 320 aa, chain + ## HITS:1 COG:Cgl2198 KEGG:ns NR:ns ## COG: Cgl2198 COG2984 # Protein_GI_number: 19553448 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Corynebacterium glutamicum # 37 317 43 328 330 182 41.0 1e-45 MKKMLLLLGVITVFILSCVGKEGQSSGKKEESKKKVKIGITQITTHPALDSAREGFKEAF KEAGLEVVYDEKNANGEITTANLIANNFVNSKMDLIYAIATNTAQAVSNTTEDIPIVFSA ITDPESVGILKENVTGISDRVDIKQQLELLLKIDSKIKKVGIIYNSSEPNSKIQVEDLKK AAKELNLKVVEKSVSQVSEIPQVTDMLIRESDALYLPTDNLVASVVNLITDKAAVAKKIV FGAEAAHVKGGALITQGVDYYEMGKEAGKIAVEILKNNKKPFEMKYKTMELGEITVNSKT LGKLGIKLPEEIRNKVKFIE >gi|261748135|gb|ADAD01000059.1| GENE 2 1671 - 2636 1265 321 aa, chain + ## HITS:1 COG:FN2081 KEGG:ns NR:ns ## COG: FN2081 COG2984 # Protein_GI_number: 19705371 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Fusobacterium nucleatum # 35 319 13 298 299 209 44.0 7e-54 MKRLLLLVLGCLMILACGKQEEIKEKKNNNKSEGKTYKIGITQIVTHPSLDMVKEGFKKA FEEAGMKADFDETNAEGSIPNANLIANKFKSDKKDLILGIATPSAQALANSITDIPILFS AVTDPVSAKILNKNVTGTSDKLDNVGEQLDLLIKLKPETKKIGVLYNPSEQNSLVQVKEI QEKAKERNLTVELQGITSFNEVAQATKILLGKTDALYLPTDNLVVSAVKLIVSEGISAKK PVISSERSSVDQGALFTMGLNYFDLGKRTGEMAIEILKGKPASEIPFETSKKTTLFLNEK TAQAIGIDIKNPVLEGAEIVK >gi|261748135|gb|ADAD01000059.1| GENE 3 2664 - 3809 1688 381 aa, chain + ## HITS:1 COG:FN2080 KEGG:ns NR:ns ## COG: FN2080 COG4120 # Protein_GI_number: 19705370 # Func_class: R General function prediction only # Function: ABC-type 
uncharacterized transport system, permease component # Organism: Fusobacterium nucleatum # 11 369 2 278 278 158 34.0 1e-38 MNELLVFMQSLPEAFKTGFIYSIMVMGVYITYKILDFPDLSVDGTFPLGAFIFAGFAMSK NGFFGITHPIMGLVLATVGGMMAGYVTGALHTYLNIEGLLAGIIVMTGLYSINFRIIGGA NGFIPDDRSIYEIVSYDKNFIMFTIIFLILLVLKGFYDYKIKKNKYVIRALAVYTIMVIA LIVYVFLSKDIKLMLVALIVFILKMILDYILTSKFGFALRALGSNEQLVISLGVNEKRLK IFGLMLSNGFAALSGALYAQSLKAADLQLGFGVLVMGLAAIILGLGIIKKSQAVNEISIV ILGSLLYYFIINVALMSNNWTRNLYDALGFSDGLKSVLEIKPTDVKVITAIILTVILWNE TAKRIRKSRRKAKLIEKERGI >gi|261748135|gb|ADAD01000059.1| GENE 4 3810 - 4595 230 261 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 228 1 218 245 93 28 1e-18 MIEIKNLYKTFFSELGTEKQVFNNLNLKVEDGDFITIIGSNGAGKSTLLNVLNGQITPDS GEIILNGKNISNVEKYKRSQWISQVYQNPTQGTAPSMTVMENLSMAKNKGKKFNFTFGLD LKNMKFYEKQLESLGLGLEKQLFTQVGLLSGGQRQCLSLIMATLNHPDILLLDEHTAALD PQTSEIILEKTKEIVEKNNITSLMITHNMQDAINYGNRLIMLHAGEIIFDIKEKDKKNLT VEKLLEMFKEKDAKLSDKDVF >gi|261748135|gb|ADAD01000059.1| GENE 5 4819 - 6243 1801 474 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0576 NR:ns ## KEGG: Lebu_0576 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 4 474 2 472 472 720 77.0 0 MKKSKYTILNIIILLSVLSCVSLPEGLSTKSEVYNADNVDFYYDLTYKKDGKIEYERQIW EQAYEILDNAKEFFLMDIFVFNDYLGKGVREKLNPVDIAEEFAQKILEKRKRDPDVEIYL ILDESNTFYGAFDNRTHKKLEEAGVHIGYVDLAKLRDPMPTYSTVWRLFIRPFGNPKNVG KTKNPVYEETDKVTIRSILRALNAKADHRKLIMNENTAMLTSANPHAEGSRHSNVAFKFS SPIIKDIYSAEQSVAKITKPDGSLKQNLPDKDFSKIPFSNNSKLKLQYFTEGQTVLDISN ELANTGTEDKIIIAQFFLADRGIIKSIKKAAKNGADIQIILNNSNSGLPNKAAAGELMKY ARKHNYKIEIRFYNKGEEMYHVKMLSIMKKDYMVTYGGSTNFTRRNMRNYNLENELKITS AYDQNISKSILNYYDRLWTNKDAEFTLPYEKNKNEKVINDLLFRFMEINGFGIF >gi|261748135|gb|ADAD01000059.1| GENE 6 6382 - 7134 1002 250 aa, chain + ## HITS:1 COG:CAC0718 KEGG:ns NR:ns ## COG: CAC0718 COG0778 # Protein_GI_number: 15894006 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium 
acetobutylicum # 1 250 1 246 246 171 39.0 9e-43 MNEVIKQLQNRRSVREFTGEKVKDEDLKLILETAQRYSNWAHGQQTSLIVVRDKEKIKKI AELSGGQKHIEAADVFILILIDFYRVVYAVESIGGEISIPKSVEGLMIGTGDAGIMVNAI QTAAESLGYGTTTIGGVRVNPDAIAEMFDLPEYVFPVFGTTIGVPAENKLKSLKPRVPFE SFAFEEKYDKKKVKEGVEKFEKDYRKWWDENGLPEVPSYKETMKKYYSNYVKENYQKTEE TLRKQGFIQK >gi|261748135|gb|ADAD01000059.1| GENE 7 7228 - 7989 1111 253 aa, chain + ## HITS:1 COG:CAC0718 KEGG:ns NR:ns ## COG: CAC0718 COG0778 # Protein_GI_number: 15894006 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 1 247 1 243 246 177 40.0 1e-44 MNEVIKQLQNRRSVREFTGEKVKDDDLKLILETAQRCPNSVHGQQTSLIVVKDKDKIKKI AELSGGQKQVENADVFVLVLADYYRPYYAAKSIGTEIGIQKSLEGVITGAVDAGIMVEAI QTAAESLGYGTTVIGGIRLNPEAICEMFNLPEYVFPVLGTTIGVPAENKLKTPKPRVSYE SFALDEVYDKAKVEKGVEDFDADYRKWWDENGLSQMPSYKESIKNYYENAANERYKRVSD IVKKQGFDFDKSE >gi|261748135|gb|ADAD01000059.1| GENE 8 8144 - 9409 1813 421 aa, chain + ## HITS:1 COG:FN1106 KEGG:ns NR:ns ## COG: FN1106 COG1760 # Protein_GI_number: 19704441 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Fusobacterium nucleatum # 1 412 1 397 408 388 51.0 1e-107 METLREFYKIGNGPSSSHTMGPERAASLFKKRVEEKYGKDLRYKAELYGSLAATGKGHLT DFIIKKTLENTEKDNIEVIFMPEVVYDFHTNGVRFFVYKKDDINFETPLESALFFSVGGG SILEAKDGNENLNKNADVEHIYPQKNFKEIMEYCEKENITIPQYVERYEGVEIWDYLKTV WHAMDDAVDRGLNTEGYLHGMLHLERKAKMFYEKYLNAKFKDVKGRVYSYALAASEENAS AGKVVTAPTCGASGLIPAVLKTFQQENGLSEEEILNALAVAGVIGNVYKQNASISGAEVG CQGEVGVACSMGSGMAAYIMGGSIQEIEYAAEIGMEHCLGMTCDPMLGYVQIPCIERNAI YAAKAMDCAQYSLMSGGENHLITLDEVVETMLQTGKDLHSNYKETSLAGLAKLKKAKLNL E >gi|261748135|gb|ADAD01000059.1| GENE 9 9435 - 10034 586 199 aa, chain + ## HITS:1 COG:PA4353 KEGG:ns NR:ns ## COG: PA4353 COG3124 # Protein_GI_number: 15599549 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 1 173 1 168 190 64 25.0 1e-10 MNFLGHSMISIEIDEKTDRKTLYGNFTGDFYKGTLEKINLTDELKEGIVLHRIIDDISDR 
ENNFLSDLLREKFGIFKGIVSDMFVDHFLSKNFYRIFNENINDIETKILYNVNQYEKYFP EKFERTLSWISSEKILSGYANIDILERAFYGLSKRVKKGEILNSAIKELKKNYGIFEENS VKEFEYVKNESINKFLDKY >gi|261748135|gb|ADAD01000059.1| GENE 10 10043 - 10756 1013 237 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0574 NR:ns ## KEGG: Lebu_0574 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 70 235 19 188 191 166 59.0 5e-40 MKKTLITFLILAQGIILAENKPKAAVPVQAQKKAPVAAKKADPKVAVQAKPVANTNTKAP AAATQTKPTATTKTAPKATAKKDEVKKFRTLQEAADDYAKAMQEISNYGSREMLRTVNVD VDKYVGARGDQELTRKWEEINTMLLEQFEVSVYKVNENGNSGEAVFLIKGYDEEALSKYL NDNIDKYAKIKRLKGEVDIDIEAYINLQYNYLKNAKKINIATSTVNFVKTGNEWKKA >gi|261748135|gb|ADAD01000059.1| GENE 11 10784 - 11641 1083 285 aa, chain + ## HITS:1 COG:DR0470 KEGG:ns NR:ns ## COG: DR0470 COG1387 # Protein_GI_number: 15805497 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Deinococcus radiodurans # 67 279 46 257 260 106 31.0 6e-23 MLTDYHMHFEYGDYDEEWVSLFFKQAEKKRLHEIGISEHTHAFKEFKDFYYEELILDNSE TGEFQRKWLDNPKSKFVHTLEEYSDFINMLKSKGYPVKLGLEVCNFRNQDRVKEILQKYE WDYLIVSVHFIKGWGFDFSNLKHRFNERNLTDIWKDYAEEIENVANTGFYNILGHPFNLR LFNNIPDKENVGDLLEKTAVCLKNNDMTVDVNTGTVYRYPIKEISPYGDFMEYVKKYNIP VMLSSDAHQPEHVGMKIEEAVEYIKKYGINEIATFNKRKRIIEKI >gi|261748135|gb|ADAD01000059.1| GENE 12 11654 - 12022 362 122 aa, chain + ## HITS:1 COG:CAC3399 KEGG:ns NR:ns ## COG: CAC3399 COG1733 # Protein_GI_number: 15896640 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 115 1 116 116 133 63.0 7e-32 MKLRENYTCPLEFVHDIIKGKWKTIIIFQLRKGKCSFSELHREISGISQKMLLEQLKELR EFGFVDKKSYSGYPLQVEYFLTERGEKILSAIKIMQDIGIEYMIENNMTQFLDKKGIKYD IS >gi|261748135|gb|ADAD01000059.1| GENE 13 12103 - 12660 473 185 aa, chain + ## HITS:1 COG:BS_yddQ KEGG:ns NR:ns ## COG: BS_yddQ COG1335 # Protein_GI_number: 16077574 # Func_class: Q Secondary metabolites biosynthesis, 
transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Bacillus subtilis # 1 180 1 180 180 187 47.0 9e-48 MRKGLLIVDVQNDYFPNGRCELFKAEETLENIKKLLEYFRKNKFPIYYIKHISENEDATF FLPGTDGIEIHKEIKPLNNEKVVIKNYPNSFYNTDLKEILQKDLIDELIICGMMTHMCID TTVRAARDLGYKNILVSDACTTKDLEWNSIKIPAETVQNVYMASLNGKFAHVIKTEEYFK KLCEE >gi|261748135|gb|ADAD01000059.1| GENE 14 12666 - 13415 926 249 aa, chain + ## HITS:1 COG:PM0526 KEGG:ns NR:ns ## COG: PM0526 COG3142 # Protein_GI_number: 15602391 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in copper resistance # Organism: Pasteurella multocida # 6 242 3 237 244 203 44.0 3e-52 MENFTLEICVDSVESAINAEKGGATRLELCSNLIIGGTTPTKSLFEEVKKNVNIPINVLI RPRFGDFLYSDYEINMIKNDIKMFKGLGANAVVIGVLTKDGEIDIENMKKLMEEAEGMSV TFHRAFDVCKDPIKAFQQLKELGVKTVLTSGQEDSCLKGKELLKKLVELSNGNSPEILIG AGLNVGNAEETAKYTGAKAFHLSAKKIKESKMIYKKENVNMGLKEFSEFEILETDENAVR EIYNCLVRV >gi|261748135|gb|ADAD01000059.1| GENE 15 13436 - 16933 4400 1165 aa, chain - ## HITS:1 COG:BH1630_2 KEGG:ns NR:ns ## COG: BH1630_2 COG1410 # Protein_GI_number: 15614193 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I, cobalamin-binding domain # Organism: Bacillus halodurans # 345 1165 29 840 841 764 51.0 0 MSDIKYKDSFYELKKDLNGKIVLLDGAMGTMIQKQNLTAEDFGGEKYEGCNDYLVLTKPE VIKNIHKMYLEAGSDIIETNTFGALDIVLRDYELEDKVFEMNKSAAELVRKAIEEYREEN PNEKRNLYVAGALGPSNKSISVTGGVTFDELIHTYYTATSGLLAGNVDIILFETIQDTRN LKAAYLGLQKAMEEHYNVPLMLSFTIESTGTTLAGQTADAFYYAVNHMHPLSVGLNCATG PEFMTQFLKILNNVSNTYISVYPNAGLPNEEGKYEETPDTLSAKIEPFFQNKYLNIVGGC CGTTPEHIKQIKEKSIKYTPRVIEEHSETKNEVSGLITLNPPQNRPVYVGERTNVIGSRI FKNLIVNEKFDEATEVARLQIKGDADVIDVCLANPDRDEINDMKAFLEKVSKFAKVPIML DSTDINVIKEGLTYLQGKGIINSINLEDGEKKFVDMSELIKKFGASVVVGLIDEEGMAVS LEKKIKVAKRSYELLTKKYGIDERDIIFDTLVFPVATGDQKYIGSAAATIEAIREIKKEM PNVKTILGVSNISFGLPVAGREVLNSYYMQKAYETGLDYAIVNTEKLIPLSEISDKEKEL SEALLFNTNDDTVADFVAFYREKKGVEKKADTTNMTPEEQVSNLVVEGSKKDLIPLLDTL 
LQKYAPLDIINGPLMDGMDEVGKLFNNNDLIVAEVLQSAEVMKASVSHLEQFMEKSESST KGKVIMATVKGDVHDIGKNLVGIIIGNNGYDVVDLGINTPAEKIREAIINEKADFVGLSG LLVKSAAEMVNTVAVLREAGIKIPIFVGGAALTEKFTVNKIEPSYENNIVIYSKDAMTAL ADLNKMIVPEKFEEFKQYLQSRRDMLLLKDKDAIEKLNVRQTVGDVKNADGSFDITKVKL PEYNFEKIYKPSTLNKQIITNIPAKEVFKYLNLQMLIGKHLGMKWVVRELLEKGDPKATK MYNEILDIINHGDEYFDIKAVYKYFPCRKKDGKIEILSDDLKETVETFDFPRQTWGQHLS LNDYIHPAEIDYIGMFVVSAGEKSRIVSNELKEKGEFYKGHLVNSIGLELAESTAEYAHY LMRKDVGIIDDENLTVDEIHRAKYHGKRYSFGYPACPDLSDQDKLFNLLKPERFGIKLTL GHMMYPEASVSAIVFSQPFAKYFNM >gi|261748135|gb|ADAD01000059.1| GENE 16 17079 - 17465 209 128 aa, chain - ## HITS:1 COG:AGc3907_1 KEGG:ns NR:ns ## COG: AGc3907_1 COG0646 # Protein_GI_number: 15889436 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I (cobalamin-dependent), methyltransferase domain # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 113 225 338 340 112 39.0 1e-25 MLSFALKDTGTTFTGQTIQELYDTAYYMNPISLGFNCAVGGEHTEKFIETLNSISNIYTS FYPNAGLPDKNGKYEKTPDVLASETELFFQNNCLNIIGGCCGTTPEHIRQIKEKSIKYLP RSIELPVL Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:49:28 2011 Seq name: gi|261748132|gb|ADAD01000060.1| Leptotrichia goodfellowii F0264 contig00186, whole genome shotgun sequence Length of sequence - 2310 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 436 581 ## gi|262038222|ref|ZP_06011613.1| hypothetical protein HMPREF0554_0421 2 1 Op 2 . 
+ CDS 429 - 2310 1986 ## gi|262037843|ref|ZP_06011277.1| hypothetical protein HMPREF0554_2290 Predicted protein(s) >gi|261748132|gb|ADAD01000060.1| GENE 1 2 - 436 581 144 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038222|ref|ZP_06011613.1| ## NR: gi|262038222|ref|ZP_06011613.1| hypothetical protein HMPREF0554_0421 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0421 [Leptotrichia goodfellowii F0264] # 1 142 34 175 186 237 96.0 2e-61 NIVLNAKRYLKMLSTVDTEFKERTTSHKSRRGLGRSKTTTEHWVEDNVYANNVDLTTDGN VLMNFKGVDEKTGKYISTNNQGVISQGVNFHAKGAIIGFSEGNIYVEGTKDKLNSVYNSH TTKKWLGATYGKASDYVNDTKEVV >gi|261748132|gb|ADAD01000060.1| GENE 2 429 - 2310 1986 627 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037843|ref|ZP_06011277.1| ## NR: gi|262037843|ref|ZP_06011277.1| hypothetical protein HMPREF0554_2290 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2290 [Leptotrichia goodfellowii F0264] # 1 627 1 627 627 1067 100.0 0 MYNGSDISFDSEGKLRVEGVDVQTTGHVYLRGKQGVEVLPGVENSLREEEHKISGLKGSI SVSRGGVSAGVGYGKSSDKIKEVTKEIIANKLQAAGNVDIKSEEGDIDFGPTNGSIGGKL TYTGKDINILDMQSERVMDRETESSYVGVGVNFGVPAISAAQQIWEAGKGLTKARHKEDY INAGFGAFNAGMGAVSALGGNVLASSVSLNYNKSQNSYHREERMSVGSRLHVKGGVEYNG KNLHTVNLNMLNEGDTVYNITGNIIKEAGKSTIKENSDGKSFGLSVSHDTFDSAGKIQNV LKPSNGTVTAGIGRGKSESQGEYYSHNQDITKGTTYYNNSPDVKIRGVDVQTGGIAGKVA SLTVESVQDRTDGKSSGYNISYGIGIGEHAVMRNGTKEMADGLHTTNIGVGYSRGKHTER TTNAIGGFTAEHGVLDVIGKTVQTGSVIGGGFTLNTGTYEKNDLEDVNKRKNVGINLTFT PGVTDVYRNGKLTPDSVGGVLVGTRIDYSKDDYVAKVKATIGNSVNMTVAGKKADLSGVN RDTSNMVDVLKDRKINPVSIDLGTEYWATNYGREKTKGDFAKAGNKIERVVEIIKAVSEN EGKDIGLYYRDRLDFEAERDRKERSGE Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:50:07 2011 Seq name: gi|261748116|gb|ADAD01000061.1| Leptotrichia goodfellowii F0264 contig00110, whole genome shotgun sequence Length of sequence - 18386 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 5, operones - 3 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 74 - 136 70 ## + Prom 177 - 236 6.7 2 2 Op 1 . + CDS 300 - 1277 863 ## gi|262037849|ref|ZP_06011282.1| conserved hypothetical protein 3 2 Op 2 . + CDS 1302 - 5261 3925 ## COG3179 Predicted chitinase 4 2 Op 3 . + CDS 5284 - 5802 467 ## gi|262037852|ref|ZP_06011285.1| 50S ribosomal protein L23 5 2 Op 4 . + CDS 5831 - 7105 1272 ## Lebu_1123 hypothetical protein 6 2 Op 5 . + CDS 7102 - 7833 564 ## Lebu_1124 hypothetical protein 7 2 Op 6 . + CDS 7849 - 8304 512 ## Lebu_1125 hypothetical protein 8 2 Op 7 . + CDS 8318 - 9811 1724 ## Lebu_1119 hypothetical protein + Prom 9888 - 9947 10.0 9 3 Tu 1 . + CDS 10023 - 10700 838 ## Lebu_1837 hypothetical protein + Term 10737 - 10798 4.2 + Prom 10751 - 10810 14.3 10 4 Op 1 . + CDS 10974 - 13139 2559 ## Lebu_1638 hypothetical protein 11 4 Op 2 . + CDS 13187 - 13624 869 ## Lebu_0804 hypothetical protein + Term 13633 - 13674 6.0 + Prom 13639 - 13698 2.6 12 5 Op 1 . + CDS 13727 - 14185 669 ## COG2214 DnaJ-class molecular chaperone 13 5 Op 2 . + CDS 14204 - 15028 1183 ## Lebu_0937 FHA domain containing protein 14 5 Op 3 . + CDS 15096 - 15635 873 ## Lebu_0445 hypothetical protein 15 5 Op 4 . + CDS 15632 - 16483 865 ## COG0631 Serine/threonine protein phosphatase 16 5 Op 5 . 
+ CDS 16499 - 18386 2384 ## COG0631 Serine/threonine protein phosphatase Predicted protein(s) >gi|261748116|gb|ADAD01000061.1| GENE 1 74 - 136 70 20 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRKRKMEKAGSTWKQELKMR >gi|261748116|gb|ADAD01000061.1| GENE 2 300 - 1277 863 325 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037849|ref|ZP_06011282.1| ## NR: gi|262037849|ref|ZP_06011282.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 325 1 325 325 499 100.0 1e-139 MKKRRFYEVLIIFIVLAFNSCGENTDKEKTELNSDNKTNKISSRNNDQHNEIDNKLKNGI TKENITALNDYAYELEKRGELKKSKYLLEKILEKFPDRDVANLNYADVLWKLKEKEKSEK YYNNYIKLLIKGNKIKKLLSNNKEIYFSNTVKLKKVISGYVNSDKNEDYLIEYSNITDDR FDIANGGSGEDIEFTDREYKKGKNTKCLKFLVSVGNDYKIFDNCKVIIPLTLRGYGGRAG SQVYYEIKNNKYIIFDRNMDSNQSFSDYYIEIIYDKNNLIVSQIKSKSYEKVTDTNGNNN IVEREENIHRVNKKIEIFDMSKEGF >gi|261748116|gb|ADAD01000061.1| GENE 3 1302 - 5261 3925 1319 aa, chain + ## HITS:1 COG:STM0907 KEGG:ns NR:ns ## COG: STM0907 COG3179 # Protein_GI_number: 16764269 # Func_class: R General function prediction only # Function: Predicted chitinase # Organism: Salmonella typhimurium LT2 # 27 202 24 197 204 82 31.0 8e-15 MEVTIEQLKKMFPGGKEKLFVSTIDEINKCFKKFNLDSCLAYSHFFAHFRQETDYVSFEE SLNYSQEQLISTFSKFRNNRELAVKYGRSHGKKANQEMIGNIAYGDRKELGNRGINSGDG YKYRGRGLIQVTGRSNYMQATQIYNKVFEENIDFVSNPDLILQPKYAVRASFADLLNKKC ETYVNIIEESKNGIEKYRKAFDLIMDKINKHTKSREKRWKYFKENFMTIFQCSVKDKVVL KNDSNDDNKNINKQSKTLEKSNVKGEAVEDKEGVYIFNGQKFQEVWDNEEYKKDRAENKK APDIVFVNSNEVVLKRLQAKFQDESMYNKDYVKYIETTTAKKVLDENKEDINVILKIYDE ATKNNENIGNVIRQHGFDLLRGRDEINEVLKIQDIMYYVFSYPIPNKSKIESLEYVLKQY SAKYVLFPEQKPLKEKSQNKELSLEKIKNIVNEVNIDNKNISEKKVEKNKLPEKYNKYME GIKDIDEIVIRVTKKLSSEFPKTKCENLIKDLGESDEHGKSYSTDENGVRHFPLARRLID NIIIEFLKVQTGFKEEKEIKYFDNHKYKDSNIVRNYLLWYVEKQNGIDLPSKNENGLYNR LEKLGKEMRVTTILNDWISSKSAIKVRNIRKVSNLIDEVYSEYRDNNDFERDKSIIHNLF KDSKELRDYAANIDSMDLYSFYRKFYDLKNIVKPNQEMFVNDNCILKCTLGQDISRLIIA EDSVTLKGGKQANINDKKIMPFKMCKAIGICKPELLAMWEKNTDVKVRNHPALLDISTIQ 
CKYGGTVSIDDAGQKEMGIAVTENKKTEEAQRDADCEYKLLIDICRDINNNFMQTELKKS SQKYAKWKKYKQETEKRAVEYLKNDVKEVLGLGKADEKKKKEYEKFVKDNKKKIEIAKTE ASETREKELKNKIMSTFVEVNKRVKQLSGPKLDTKGVIKKSGISLCDYERKNFYSAKILP AIVYGYLMKAGNLNMNSQERQTIENELNKDVIGIGTVNIDLTGYGIDSKFKKSKDSQKMK IPSSEIMTWVGIGESLYATVKNIPTEWDIQNKIREQGKSLGRCPFSQMEWNQNHKENKPL KDVPPLSETVTEKDKSGKISSNKGTDKTVPDKKNETGKSSSKDSAESFCKGDNCPHVGVE KKAPWIKIAQEQCEKYKGKKEGNEPLYSQIKNVYFVIAKHEKKDPTKEAWCAAFISYCIN KAGFKNSSYPSVAGYDWGVAPRKGLPRRGWFEGEKTKPFVGAIGVFKWHNGYSHVAILIG KTPKGAHVFLGGNQNNEINVTAYSEKRIDYYMKPKSYTISSEEWNLPIITKYKNSGSIG >gi|261748116|gb|ADAD01000061.1| GENE 4 5284 - 5802 467 172 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037852|ref|ZP_06011285.1| ## NR: gi|262037852|ref|ZP_06011285.1| 50S ribosomal protein L23 [Leptotrichia goodfellowii F0264] 50S ribosomal protein L23 [Leptotrichia goodfellowii F0264] # 1 172 1 172 172 300 100.0 2e-80 MKKYIIIFLILSTLSISEIKYTGNKYEKIKSPKYTTYDTELKSLRRVLLVIANARVNKRY VLEKNDKDIFKGNPYYNNDEYLKQTSKLLLSLTNELKQLNYRHEIFKKANIRGYEYICSL EITEKPTRIGYIEGEIFGTVTCSSEFDDMSRFDEIYLKKVGNSYYIMDWFIS >gi|261748116|gb|ADAD01000061.1| GENE 5 5831 - 7105 1272 424 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1123 NR:ns ## KEGG: Lebu_1123 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 5 423 3 426 429 335 49.0 3e-90 MQGEREREEFEKELLGNSSRELFFELKNETDKRFSEIKEKFFEVQNNNYEELFIETLLVD EKEAYKFGGAFYPIVSENQNISDIFNENIYLENVYLDVPYYQLKDIEEETFNAWIHTDDE TYEIKVVLEKDMRYEKKIKRLYEAFELNGQNWNTINTAHIKRMYRIKVSDYDFEMTDKIY EGVVNKKYKIDYDFEYYNDNVRRGKKLLWNVEEKQIKSTIFIRPTKVDISFEYILKFDEN EKLLVENNDKNDILCCYTENKNELSIISRRDDNSIWKVFSIKSVELCRKIMEMNFNKENL DYIHFTNYREISFIDKIRKHYNNENGIRSKAYLEKKFFDYEFIKNNFELKDVVFNKEKEK NIDVYNCNEFVETDFDLFQKKGKIVLNLFVKCLNRNKYTEDMLSFIVSEIQWRMPEYDCR GYLI >gi|261748116|gb|ADAD01000061.1| GENE 6 7102 - 7833 564 243 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1124 NR:ns ## KEGG: Lebu_1124 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 243 1 275 275 160 40.0 4e-38 
MSKRRREGYEYIWCPKVDISNVIFLIKENASPYLEINSDFINEDFQGENNIIEVNPYIRF NHIFSDFLYTFKNIEEKKGYSDLINKKFKNNLEHNALAYLRELDFVTGTTVADIYCEYIL KDIEKGKFGRNILKYFRRMSEEDRKIAVLLIYNNISTKIENRKSFELGIKYFFTDSIIYM DKFEDKKIIVYINYKKDKINTAKIRVLESLFLPFGTTLKCYWEHHFGVIGVNDTMKIGEV SVF >gi|261748116|gb|ADAD01000061.1| GENE 7 7849 - 8304 512 151 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1125 NR:ns ## KEGG: Lebu_1125 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 151 1 154 154 158 66.0 9e-38 MKYKIKITDNQKKEINISEDEIIEVKYNIGLTEKNDYANNRSSVEMEITGKIISNLESTD SGNLGINDDLYEKNKKNIIDLVSWSESYLGDNDYRNIEVDVDLGSDKKINYKFSDMFVHN FFQELSIEKGIGLFVIKLKQKFHQNHKIEIK >gi|261748116|gb|ADAD01000061.1| GENE 8 8318 - 9811 1724 497 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1119 NR:ns ## KEGG: Lebu_1119 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 200 497 8 313 313 333 61.0 2e-89 MVNWEGMYEGKDISIVLDGKNLNGFILKDFVLEESINEHQKLNLQIETDISQKKNLENTV SKENVEVKIELREMKGKRKIFEGIVDYFEVLDYGSEGCRILIEAFSKSILLDRKYEKKYR VFQDTSWTFANIVEEINKDYVEKKIEIKYSDIAQKSVNSLIIQYDETDWEFLVRISSHLK TGMFVTEDGVITFGIIENSEIKQENRYFSDYSIVREHKNLYYKINSGKAISSGDTVSINN EKDSSDKGSNLTVLKSKIFLKNYILKSNFLATNMESYYIYRKYNEKIKGSAIEVKVEKVF EENNIAKMEVTFFEGLNKIVHEKGKDSENKDRAYNDYGIKRIPLSYQTFYSQTNTGFFCT PEVNDNVEVYFPSEDENFARVSWAINNEGNGRFSDYEKRNFHINGKDFNFNIDKNNVTVN IEESYTRNSKTSNETAESFVNKGTKNLVVVSDDYIGIESIGEMSVYGRSIDIVGKEKEIR METPNEIRIKGKKVHTN >gi|261748116|gb|ADAD01000061.1| GENE 9 10023 - 10700 838 225 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1837 NR:ns ## KEGG: Lebu_1837 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 220 1 212 216 156 43.0 5e-37 MRKIIIFVMLLLSVQLFSEDYKVVLKKNVTLSEQDLKKNNKEIEIAVKRDYNLLKEASLE GFQNVFYQIFQSLVPEDGPDKELILFYQKKTQKFYSDFLSMYMKKTKAKVEKIRYVTNDV VEVTLEMQMPDMTKIMEKAFANMRKDGMLGGVKPEEIERMSQKELIDKVLTREFDEMKKA IDTIKDYSIIKTSIYLEKDNNKWGIAELDEKIQNMKDIYKSAGKK >gi|261748116|gb|ADAD01000061.1| GENE 10 10974 - 13139 2559 721 aa, chain + ## HITS:1 
COG:no KEGG:Lebu_1638 NR:ns ## KEGG: Lebu_1638 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 721 1 721 721 1095 80.0 0 MANKNFINQTNRTILFEEINPEKMDLITLIDDSKDLDSLDDERIIEINKHLLVSSFDEFL TKFEPKVYSYYNAETQSIKYILKKPEGIPEELITEIKIDNGNTFFKMLNTLIEARKSQGN RNVDFKFENILELISPKKVIEDIKQTRKEIAYIYNKYEELDEGNPKKLEYGDRLNAKFEE ASQNYSNVLGMLPLAIEDIKTRLLLGHDENSLKSEKIKLGMLQVGDKGELEVIEYKQEET KDLALIEEKNSTALVEAFREDYNDVAEEPNQYISDLVVRTFVPLAKGVVAIEPSQEVENY NNYLEFYKAAQEDFVKIAKPLIEKLLGVKMFFEQYNAKVSLMKPTLLVTNIKAEMIVKAG NKERLQSFFNTVNAKNDFDNAIWFGIYPNVDLDIKKGEKKVRERFKGAKNEEKSEKNTIE SLTNLMSVLSEYKVQVFFNFEGTYETSFDNLATTGVDKYIEKTRILEDQKYSEYLVPAIP NFTIIPKDKSGVVLDFKMKYKDETGVALSKEKEDLLKFWIEGVYIDASYVAAGIVAAYQC PNYLKDRFNNVSPVYPGVRFDIEAEDNSYKAKTTMSKEISGFTNTIKDKINQFSYGFIFS SDSAHLKKEKIKNIVVYKARTLAKNVDGNYEPLYKTLTTTFIERTLRFVTTDFKEDKLNF FFSTNPESQKSVWIKDEKFVNGIMQKGDDLSHVIDEDSNVCQLNIVFSGNIKNLQVEINK N >gi|261748116|gb|ADAD01000061.1| GENE 11 13187 - 13624 869 145 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0804 NR:ns ## KEGG: Lebu_0804 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 145 1 145 145 205 84.0 5e-52 MGFRLNVKGQGEEILLEKESILDVKYISETPDDSNARATDLGVILEIKGKILAATNGEAA DDTRKLAQWSLIPAESSDAYRHATLEVISAGQMVRKIDMNNVFVVDYIETFGDQAGTGTF TLKIKQKKEKVKEVAIEGGYASQEA >gi|261748116|gb|ADAD01000061.1| GENE 12 13727 - 14185 669 152 aa, chain + ## HITS:1 COG:slr0093 KEGG:ns NR:ns ## COG: slr0093 COG2214 # Protein_GI_number: 16331768 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone # Organism: Synechocystis # 6 69 8 71 332 85 64.0 3e-17 MGETMNYYEILGVPIDADENEIKSKYRKLAMKYHPDRNPDDKKAEEMFKKVSEAYEILGD KEKRKEYDKKISKTGEEKQNSEKKKAGAGGDARKGAEAFYRNFSANPQDIKNMFEKAFNV DEMKNPDKEKMKAHKESMEKSFENFFRPKKNK >gi|261748116|gb|ADAD01000061.1| GENE 13 14204 - 15028 1183 274 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0937 NR:ns ## KEGG: Lebu_0937 # Name: not_defined # Def: FHA domain containing protein # 
Organism: L.buccalis # Pathway: not_defined # 1 274 1 271 271 382 79.0 1e-105 MFWRIRSFLSRMRSIFRPTSIGRGSLYRLESMRRNAESKTKSGKTEEKTVSMSVDKGIKK YEKRQMEAKGERQYNFFNVRNTVIMIILILTFVYIHFSGYSTKSLYIAIFIFIGLTLYLL VVERFKERVEVEEEIKEIKNEREREHNVFLDKVKEIEQVEKNQIENIILKNSEDYDIKVW KVGRATSLLIGKRTPRNRVDIDVSEGIYSNLVSRAHGILNKVNGVWYYEDLGSQNGSGIE KSSDRRKIKIRKNTPVKVESGDIIYLATSKLLLK >gi|261748116|gb|ADAD01000061.1| GENE 14 15096 - 15635 873 179 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0445 NR:ns ## KEGG: Lebu_0445 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 174 1 173 210 275 77.0 6e-73 MKLERCKNGHMYDASRYGATCPYCKSEGLEVEIKEDKINLVEEMNDDDKTTAYWSKDTKV DPAVGWITCIEGPDKGQDFRIVSERNFIGRGDDMDIQIKGDSTISRKNHCSISYNPKQRV FMLSPGQSNGLVYVNNEALYDTRELRAYDMIEIGESKFVFVNLCGENFDWNKEKVSDKE >gi|261748116|gb|ADAD01000061.1| GENE 15 15632 - 16483 865 283 aa, chain + ## HITS:1 COG:lin1935 KEGG:ns NR:ns ## COG: lin1935 COG0631 # Protein_GI_number: 16801001 # Func_class: T Signal transduction mechanisms # Function: Serine/threonine protein phosphatase # Organism: Listeria innocua # 53 272 13 243 252 63 24.0 4e-10 MINILKYGIFMYTVLISILLFLILFGKIRKKGKPEKKQKSKIDITGMKWIGKRKLQEDSY AVFSPSDEEHILIIADGMGGYSEGQLASQFSVNRFLDLYSRGDLKIPEIIKKINGELLNF SESLDLKNRIGTTFLVCEIILNKMRIFSVGDSSAYIVTKEYIRKINKSQNRGSYLTNYVG YEDFDEEGINVTEINSMKLETDEKTGSKKFPVIILMCSDGVDKYMETGTIKRIIEENINK TGKEITELIMNTLKERRRLNQDNASIILIKTDNKYLQKNDVIL >gi|261748116|gb|ADAD01000061.1| GENE 16 16499 - 18386 2384 629 aa, chain + ## HITS:1 COG:CAC1727 KEGG:ns NR:ns ## COG: CAC1727 COG0631 # Protein_GI_number: 15895004 # Func_class: T Signal transduction mechanisms # Function: Serine/threonine protein phosphatase # Organism: Clostridium acetobutylicum # 13 207 26 223 269 68 25.0 3e-11 MRREEAKFTTKFVSEPGTKPKNNDYFGYVQLDNYAVWAIADGYDEEDGAAIASKLAVESA IEYFMLRPRFNPEVIKEMMEYANLKVKEKQEETERYSLMHTSLLIVISNYNSILYGNVGN TRLYHLRGGYVMYQSSDDSIAQLLVEEGALNTRDMKYHRQRNDLLQAIGDFGKIKPNVIK HPVQLQEKDVLCLTTMGFWENIDEREMEVELSRYDDKSKWLRSLERKIIATLRDEVENYT 
FAAVSIEGVAAPEPVEKDKKKFWIKVALISLVILLLILMLTFWNIKKRNDIIKKATTYEQ MADEDLVKKDFNNSMEELKLSIGEYEKLKPKNRGIIGFFVNAKNRRADADKRIDGVKKKI EQTEKLKQAFQDISEGNEFFNGGRYDEASRKYQQAKYNLEQNTYKKDELNTEEILVTLNT RIDSSSKLKEAQSMELAGDSAYTAGNYNLAKESYKTASDIYLVNGRPDYVSSMERKISEI NDKEKTAYSGAMLTENRGDILAPTDSNMSREAYYQARQMYQVLGDSAKTQEIDNKIQELN SRQIANLQTASNMVKEGLDQITANNPSEAIALLTKAKAIYQGLQDTNNASNVDNYIKQAQ EFIKFESDTKTQLANQEEQSKLEIQAKEN Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:51:23 2011 Seq name: gi|261748103|gb|ADAD01000062.1| Leptotrichia goodfellowii F0264 contig00082, whole genome shotgun sequence Length of sequence - 10330 bp Number of predicted genes - 14, with homology - 13 Number of transcription units - 2, operones - 2 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 62 - 121 2.0 1 1 Op 1 . + CDS 158 - 517 431 ## Lebu_2157 hypothetical protein 2 1 Op 2 . + CDS 536 - 1315 965 ## Oant_2750 hypothetical protein 3 1 Op 3 . + CDS 1315 - 2121 943 ## Oant_2751 hypothetical protein 4 1 Op 4 . + CDS 2125 - 2739 282 ## Lebu_2157 hypothetical protein 5 1 Op 5 . + CDS 2736 - 3119 149 ## Lebu_2157 hypothetical protein + Prom 3307 - 3366 10.4 6 2 Op 1 . + CDS 3439 - 4320 459 ## Lebu_2157 hypothetical protein 7 2 Op 2 . + CDS 4287 - 4406 63 ## 8 2 Op 3 . + CDS 4424 - 6499 1385 ## COG0572 Uridine kinase 9 2 Op 4 . + CDS 6496 - 7182 637 ## COG1011 Predicted hydrolase (HAD superfamily) 10 2 Op 5 7/0.000 + CDS 7166 - 7564 364 ## COG2246 Predicted membrane protein 11 2 Op 6 . + CDS 7561 - 8265 593 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 12 2 Op 7 . + CDS 8262 - 8639 151 ## Asuc_0824 acyltransferase 3 13 2 Op 8 . + CDS 8706 - 9410 300 ## Asuc_0824 acyltransferase 3 14 2 Op 9 . 
+ CDS 9437 - 10039 556 ## Psta_4569 hypothetical protein + Term 10163 - 10216 -0.3 Predicted protein(s) >gi|261748103|gb|ADAD01000062.1| GENE 1 158 - 517 431 119 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2157 NR:ns ## KEGG: Lebu_2157 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 26 119 670 762 762 71 42.0 1e-11 MISDNSEKRTNLSIFPQDKIYYLKNIDEIQEIELPYPTKIEYLYTKRVRNYNFNRVKLIG FDKNNQIVIVLKQLNEKQRSYIGFKNDNPNIEISKISFITENDKPAYVIPEIIIGEPLK >gi|261748103|gb|ADAD01000062.1| GENE 2 536 - 1315 965 259 aa, chain + ## HITS:1 COG:no KEGG:Oant_2750 NR:ns ## KEGG: Oant_2750 # Name: not_defined # Def: hypothetical protein # Organism: O.anthropi # Pathway: not_defined # 1 255 3 254 258 286 48.0 4e-76 MKIIIPMTGYGSRFVAAGYKDLKPFIKVQGRPVIEWIVKGMYPNETDFLFICRKEHFDKD LTLREKLMELAPKGEIFEIDDWVKKGPVFDVLRASEKIDDNEQCIINYCDFYMLWDYEKF KEEVAERKSDGAIPCYTGFHPNLLPEKNYYASCLTDGNDDLVEIREKYSFEKNKQKSKHS PGVYYFRTGGLLKKYCKKLVESNQVINGEFYASLPYNFMVKDGLKIWVPLNVDKFCQWGT PEDLKEYLFWTETVKGMMK >gi|261748103|gb|ADAD01000062.1| GENE 3 1315 - 2121 943 268 aa, chain + ## HITS:1 COG:no KEGG:Oant_2751 NR:ns ## KEGG: Oant_2751 # Name: not_defined # Def: hypothetical protein # Organism: O.anthropi # Pathway: not_defined # 1 263 1 263 267 299 51.0 7e-80 MNILIPMAGAGKRFSDKGYKLSKPVIPTIDRRTGNEYPMVVCATMDLPGIKSGGSNVIYV DREFHKKDGVESKIKQFYPKANFITVKELTEGQACTCLLAKEKINNSDELLIAGCDNGMV IDKKKFEKLKMETDVLVFTYRNNEAVLENPNAYGWVKVDENDNITGLSIKKAISDNPMND HAIVATFWFKEGQIFVKAAEKMIKENERVNNEFYVDSVISHVIDLGYTAKVFEIERYIGW GTPKDYEEYQNTVKYWKEFVQSKGYLGE >gi|261748103|gb|ADAD01000062.1| GENE 4 2125 - 2739 282 204 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2157 NR:ns ## KEGG: Lebu_2157 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 3 187 24 211 762 127 39.0 4e-28 MNKKIIWILFFIIGIFIPCFTYIYDNMEKSIKTNFYNLGLTQEITKGFTVSQKIYIPKKI QKFGLMFATYLRENHGKIKVSIVQNNNKTEKIVEMSTLKNGEITNLGLDISKLKKGEAIL TIEGIDGKEGSSVSLYDSSDISLGKINNTEDKGLIFELSYFQMTKVVKLQFMFLLISASL FFYIYKLSNNLIKNNRKIFLQLQV 
>gi|261748103|gb|ADAD01000062.1| GENE 5 2736 - 3119 149 127 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2157 NR:ns ## KEGG: Lebu_2157 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 111 227 337 762 82 35.0 6e-15 MIFCIINVKIPVVSFAAEPYVEVLVNFLINGLNKNIFQNLIIPDAGYWPLFQRIIGLTII KLFGFNLKITVFMLQNIGVFIICMFGSLFVLKEYRKYGTLLFRICISIILGGGGGNINFI SRRILFF >gi|261748103|gb|ADAD01000062.1| GENE 6 3439 - 4320 459 293 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2157 NR:ns ## KEGG: Lebu_2157 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 3 283 466 730 762 120 34.0 7e-26 MTQQFIFLFIPDITNIYNVGFLNITFLILWIGIILVNVYFYVKLKNKESVISLALIFVIL GTIILNILTTGKYFFWNQEYEWLKTFSIINTRHSLFIILSYINIFVLMCYNGMLLFRKKY KINKNLLYILLTFLLVIRFSAFDNNKIKNQFPNSVKVGETLSDWSKYNKFFQKSSYVIPF EPLVAIAKHNKIYTFKENKIIEAPLIGDVFSDVKMFWLASGVEEQIHEINLDDELFIEFI YAERLRVNNYNKIKVIGYNEKDEEVFSLKQLNKTERLFVGRKDCLLDLKMKNL >gi|261748103|gb|ADAD01000062.1| GENE 7 4287 - 4406 63 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFVGFKNEKPIAVKKLKFYNEDGTQAYVKNNLYFGVPSK >gi|261748103|gb|ADAD01000062.1| GENE 8 4424 - 6499 1385 691 aa, chain + ## HITS:1 COG:alr2350 KEGG:ns NR:ns ## COG: alr2350 COG0572 # Protein_GI_number: 17229842 # Func_class: F Nucleotide transport and metabolism # Function: Uridine kinase # Organism: Nostoc sp. 
PCC 7120 # 396 574 5 183 313 122 31.0 2e-27 MKKKLSKFQVFIISMMLLKIILSGLFSSDYQNILFMKFINGFLTELKNGNLLNPYELFSN EVALFPYPPVMFMIEFVGGALSSLFSNIFIKNILFKLPNLVFDLLGLHFLMKMFPNKRRQ VGMLYFASPIIIYSIYMHGQLDIIPTVFLIGSLYYLVEKKRHNKFILFLFLSLASKLHIL AIIPILFIYILKKEGVWIAIKDITLTLFLTVSVILPFLDKSGGFVNMVLLNKEQNGITSL FWEYSTTKVYLAILAIVLIYLRAFSINKINKNLLYSFCGVLFSVFSVLVLPMPGWYVWIV PFITIFFIHVNLDRYTNLIVYLLLNGFYLIYFLFIHKTRYVDLYFLNKDMNFIKMDNIIL TNVVFTLLTAFLIYTTYLVYQMGIINNSIYKRKNVPFTIGITGDSGSGKTTLTKMIRNVF GAKNILFLEGDADHKWERGDENWNYFTSLNPKSNYIYKQSGDLEKLRKGETVHRVIYRHD TGKFTEEYKMRSKPYIVLAGLHSLYLPQTRRNLDLKIYMDVEEKLRRYWKIQRDVYIRSY KLQKVLDTIEFRMKDAQKYIYPQKKYADLIIKYYDDTLTDYLVENHDLKLNLELTLNADI NLERLREYLENNNIMVDYDYDNSLEKQIVTFSGQGLHGNILDFHKVMGEIIVNYDEISQN DMKYQNNIQGIMETVILLLISYKMREELRYL >gi|261748103|gb|ADAD01000062.1| GENE 9 6496 - 7182 637 228 aa, chain + ## HITS:1 COG:VNG1202C KEGG:ns NR:ns ## COG: VNG1202C COG1011 # Protein_GI_number: 15790269 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Halobacterium sp. 
NRC-1 # 3 206 4 209 232 98 27.0 9e-21 MIKAIIFDLDDTLYSYNLLDKLGIEKICEFVCKKLQIDEDKFYSAFNKAKRSVKEQLGNV ASSHNRLLYCQKTMENLNENPFSIALEMYDIYWNYVLENMKLNENALELLKFCKKEKIKI GICTDLTVHIQHRKIKKLKIDEYIDAIVTSEEVGAEKPNFKMYDKILKKLNILSEEAIFI GDSLKKDVEGSFKYGMKALWYSTEKSERYETVQNFGQILEKLKNERSK >gi|261748103|gb|ADAD01000062.1| GENE 10 7166 - 7564 364 132 aa, chain + ## HITS:1 COG:MT3897 KEGG:ns NR:ns ## COG: MT3897 COG2246 # Protein_GI_number: 15843411 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Mycobacterium tuberculosis CDC1551 # 11 127 20 140 142 60 30.0 5e-10 MKEVNKFNKIEILKFLVGGGSAVVTDFIAYKLLMNFGMDRNSAKTISFICGSIVGFIINK YWTFKSSIFSLKEIFKYTILYIVTAFINSLVNKYVMLAVKHELFAFLCATGVSTILNFLG QKFLIFKKGDGK >gi|261748103|gb|ADAD01000062.1| GENE 11 7561 - 8265 593 234 aa, chain + ## HITS:1 COG:MA1184 KEGG:ns NR:ns ## COG: MA1184 COG0463 # Protein_GI_number: 20090050 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Methanosarcina acetivorans str.C2A # 1 150 3 153 314 73 31.0 2e-13 MKLSIVIPCYNEEKNLSLILEKFQEVITDENMEVILVNNGSSDNTAKKLEELLPNFSFAR TVLVPVNQGYGYGILQGLREAKGEFLGWTHADMQTDPKDVIKAYKILEKENWNENVYVKG NRKNRPFTDQFFTFGMSIFETVFMGKFLYDVNAQPNIFSKQFFSTWKNPPKDFSLDLYSL YLAKKRKQKITRFDVLFPERIHGESTWNTGLGAKWKFIKRTIGFSLELKRKGVK >gi|261748103|gb|ADAD01000062.1| GENE 12 8262 - 8639 151 125 aa, chain + ## HITS:1 COG:no KEGG:Asuc_0824 NR:ns ## KEGG: Asuc_0824 # Name: not_defined # Def: acyltransferase 3 # Organism: A.succinogenes # Pathway: not_defined # 18 125 1 95 349 82 47.0 4e-15 MIKIIKNIKNKKNGDSRMIPEVQGLRCIAILMIIYFHLPLILTSKYLEFHQKISRNFILS TGVELLFVLAGFFLMKSLSKMEKYNIPKIVDNSIPQNSNNLKIKIEILFEFILKKFKRLA PASYF >gi|261748103|gb|ADAD01000062.1| GENE 13 8706 - 9410 300 234 aa, chain + ## HITS:1 COG:no KEGG:Asuc_0824 NR:ns ## KEGG: Asuc_0824 # Name: not_defined # Def: acyltransferase 3 # Organism: A.succinogenes # Pathway: not_defined # 1 232 119 349 349 171 44.0 2e-41 
MLKKFFATIIWLRNFEVAFVPTHLGYHWAVSLEFQVFIVFSIFYFIIGKNKTIFLSIFIC IIMMFFRPAINSQYWLFRFDPILWGTIVYYLFEKIDKNFLNTKFKSKKIWIFIISCIFVL ILSSNEVTFGEYKYFKTSISAITSAIMLLLALSGNGYFYSNIFGIRHIIDWVSSRSYSLY CCHIVSWFMIKQLYSFLGIEYNKNGVIYCIIFMIISAEFTYQYIEKILIKSKSK >gi|261748103|gb|ADAD01000062.1| GENE 14 9437 - 10039 556 200 aa, chain + ## HITS:1 COG:no KEGG:Psta_4569 NR:ns ## KEGG: Psta_4569 # Name: not_defined # Def: hypothetical protein # Organism: P.staleyi # Pathway: not_defined # 3 197 13 203 230 185 46.0 8e-46 MKFIAHRINTKKELKNISEQYGAELDLRDSVDGKIYINHDPFILGEDFEEYLKEYNHGTM ILNIKSERIELRVLELLKKYNIKDYFFLDSSFPMIKLLTDNGEKKIALRYSEFEGLDTLE KMQEKVDWVWVDCFTKLPLNNEIYNKIKKMGYKLCLVSPELQGQPEKIEKYAEQIKKEKI IFDSICTKVYNIEKWREILK Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:52:01 2011 Seq name: gi|261748101|gb|ADAD01000063.1| Leptotrichia goodfellowii F0264 contig00147, whole genome shotgun sequence Length of sequence - 206 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 205 232 ## gi|262037874|ref|ZP_06011305.1| outer membrane efflux protein Predicted protein(s) >gi|261748101|gb|ADAD01000063.1| GENE 1 1 - 205 232 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037874|ref|ZP_06011305.1| ## NR: gi|262037874|ref|ZP_06011305.1| outer membrane efflux protein [Leptotrichia goodfellowii F0264] outer membrane efflux protein [Leptotrichia goodfellowii F0264] # 1 68 1 68 68 97 100.0 3e-19 TTKSASQYLKNITTNMRAGQDILLKSQELEINGSFIRAENDLEMDAKKIKLEASADRYKA KSRSIGAN Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:52:10 2011 Seq name: gi|261748085|gb|ADAD01000064.1| Leptotrichia goodfellowii F0264 contig00236, whole genome shotgun sequence Length of sequence - 12723 bp Number of predicted genes - 15, with homology - 14 Number of transcription units - 3, operones - 3 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 91 - 167 89.0 # Met CAT 0 0 - TRNA 173 - 249 81.3 # Met CAT 0 0 - TRNA 279 - 366 75.1 # Leu TAA 0 0 - Term 306 - 343 1.0 1 1 Op 1 . - CDS 482 - 1090 764 ## Lebu_1548 hypothetical protein 2 1 Op 2 13/0.000 - CDS 1099 - 2049 1177 ## COG0457 FOG: TPR repeat 3 1 Op 3 . - CDS 2051 - 2884 1101 ## COG0457 FOG: TPR repeat 4 1 Op 4 6/0.000 - CDS 2903 - 3391 615 ## COG0620 Methionine synthase II (cobalamin-independent) 5 1 Op 5 . - CDS 3428 - 4093 914 ## COG0620 Methionine synthase II (cobalamin-independent) - Prom 4130 - 4189 2.7 6 2 Op 1 . - CDS 4222 - 4563 614 ## EUBREC_3158 hypothetical protein 7 2 Op 2 . - CDS 4539 - 4697 78 ## 8 2 Op 3 . - CDS 4728 - 7316 3844 ## COG1012 NAD-dependent aldehyde dehydrogenases - Prom 7414 - 7473 8.4 - Term 7457 - 7493 3.3 9 3 Op 1 . - CDS 7508 - 7933 819 ## COG0716 Flavodoxins 10 3 Op 2 21/0.000 - CDS 7952 - 9148 1744 ## COG0282 Acetate kinase 11 3 Op 3 . - CDS 9188 - 10195 1624 ## COG0280 Phosphotransacetylase 12 3 Op 4 . 
- CDS 10226 - 11290 533 ## PROTEIN SUPPORTED gi|157804145|ref|YP_001492694.1| 50S ribosomal protein L32 13 3 Op 5 12/0.000 - CDS 11280 - 12020 961 ## COG1214 Inactive homolog of metal-dependent proteases, putative molecular chaperone 14 3 Op 6 1/0.000 - CDS 11989 - 12456 675 ## COG0802 Predicted ATPase or kinase 15 3 Op 7 . - CDS 12480 - 12713 299 ## COG2870 ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase Predicted protein(s) >gi|261748085|gb|ADAD01000064.1| GENE 1 482 - 1090 764 202 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1548 NR:ns ## KEGG: Lebu_1548 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 3 202 8 207 207 223 62.0 4e-57 MDKEKIIKEILEREWVFFQMAQNTGGRASCQDNKKEFIIMRESQWKTLPLNILESYLEDL KIAEGTKQNIVVEKYARMMKYSAPDEYEKIKKFLPEISEEKMKIADQIVEIYLDWERDLI KKYPKLTDRGRPLNSKDDTPEYTSIETYLRGELLSYSLKTVKLYFDYIKECIEKNINLAK INIENIVKEKGYASLEDAEDKI >gi|261748085|gb|ADAD01000064.1| GENE 2 1099 - 2049 1177 316 aa, chain - ## HITS:1 COG:FN0847 KEGG:ns NR:ns ## COG: FN0847 COG0457 # Protein_GI_number: 19704182 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 3 316 293 599 599 313 50.0 4e-85 MDKKIKGLELSKLYFENIYLPELQKNFSELLERMAIGLAGEGSECFGYDDEISRDHDFGP SCCIWLTDSDYNNYGKEIQEKFDKLPKEFMGFKALNISEWGNDRRGVLNMNDWYYKFLGT EAGPGNLYDWRLIPETALASAVNGEVFADNLGEFSKIREKLKKYFPEDIRLNKIATRCMK IGQSGQYNFRRCMKRRETVAAKIAEAEFIDEAVHMIYLLNREYKLFYKWMHRRLKELPIL GEKSYFLIKELSELPVNEVNRKIEIIENICMDIIGELKKQNIVDKHLKSDFLSDYGPYVQ HKIKDEKLRNWSPWLD >gi|261748085|gb|ADAD01000064.1| GENE 3 2051 - 2884 1101 277 aa, chain - ## HITS:1 COG:FN0847 KEGG:ns NR:ns ## COG: FN0847 COG0457 # Protein_GI_number: 19704182 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 17 274 10 269 599 238 52.0 7e-63 MDINEKKIKIKEMAQLRNMYMQQGKTLEEVEVLRKLAMLTEEVYGVESDENIKILNELGG TLKYVGAFDEAVNALLKARDLIEKRYGNENVSYATCNLNLAEVYRFMRKFDETEKLYLDT 
IKIYEKNNLQNDYLYAGVCNNLGLFYQEQGKYENAVKLHKKSLKILENSEKYKLQYATTL SNLVIPYMKIGKKELSEEYLKKSLKLIEKEVGKNHSLYSASLNNLAIHYYNEGNYEKALE FFEESAEICKKSFGENSTNYKNLMSNIEIVRERLEGK >gi|261748085|gb|ADAD01000064.1| GENE 4 2903 - 3391 615 162 aa, chain - ## HITS:1 COG:PAE3655 KEGG:ns NR:ns ## COG: PAE3655 COG0620 # Protein_GI_number: 18314225 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase II (cobalamin-independent) # Organism: Pyrobaculum aerophilum # 24 160 256 386 389 78 37.0 5e-15 MKFANDLLAPVFGELRKHENVLSAMHVCRGNWTCDETVLLEGAYDKLGEFFDALDVDMLA LEFSTERAGEVRQLFANNFLDKKITLGLGCLNPRNTIVETPEQIVKLAEKVLEFLPPEKI WLNPDCGFATFSRRPLNSYSIIEQKLVSMVKAAHILREKYCK >gi|261748085|gb|ADAD01000064.1| GENE 5 3428 - 4093 914 221 aa, chain - ## HITS:1 COG:PAE3655 KEGG:ns NR:ns ## COG: PAE3655 COG0620 # Protein_GI_number: 18314225 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase II (cobalamin-independent) # Organism: Pyrobaculum aerophilum # 6 210 10 206 389 103 31.0 4e-22 MDSRIKPFTTTLMGSMPRSEELLKLKEMCIKNNVHCYEYREKLFSETKKIINMLEKVGID IVISGELARDNYMSYVAEHVPGIKLMTLEEIKSITENNEEFNKSLEEMDAADNSMNSPVC VDRISTDVKLDIEEIDMVKKITDKPFKATLPSPYLLTRSMWLQEITGKVYADRKELGQDV VKLLINEIKRLAAMGAAIIQIDEPILSEVVFKKKKGDNSFY >gi|261748085|gb|ADAD01000064.1| GENE 6 4222 - 4563 614 113 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3158 NR:ns ## KEGG: EUBREC_3158 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 7 113 99 205 205 97 42.0 2e-19 MAQAYDIVIKGTGMGVGEKVLVDNLMNAFIHTVAQRENLPSHIIFYGEGVKLATKGSGCL EDLKALSGKGVKVLSCGICLDYYEITEHIEVGEVTTMGAVVEILSNSNLIIEP >gi|261748085|gb|ADAD01000064.1| GENE 7 4539 - 4697 78 52 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKYLYYLTKGKTGGIFKVEREFKQLIIIKNEKKGKKNLKYRRKQKWHKLMTL >gi|261748085|gb|ADAD01000064.1| GENE 8 4728 - 7316 3844 862 aa, chain - ## HITS:1 COG:CAP0035_1 KEGG:ns NR:ns ## COG: CAP0035_1 COG1012 # Protein_GI_number: 15004739 # Func_class: C Energy production and conversion # Function: 
NAD-dependent aldehyde dehydrogenases # Organism: Clostridium acetobutylicum # 2 446 3 448 448 593 64.0 1e-169 MVKDLKSLKELMEKVRKSQEEFSTFEQEKVDKIFRKVAQKINDERITLAKLAVEETGMGI LEDKVIKNHFASEYIYNRYKDEKTCGVLEEDKSYGIKKIATPIGIIAGVIPTTNPTSTAA FKILLALKTRNAIILSPHPRAKKSTIETAKIALKVAIKYGAPENIIGWIDEPNVELSKEL MANSDLILATGGPGMVKAAYSSGRPAIGVGAGNTPVIIDKSADIKMTVNYTLLSKTFDNG VICASEQSVIVDKSIYDKVRKEFELRGAYILNKDEIEKVRKIMFKDGNLNADIVGQSAYK IAKLAGVKVPEEARVLIGEVKSTGEEEPFAHEKLSPVLAMYKAEDFEDALDKAKKLVELG GLGHTSLLYINLAEKEKIDKFGITMKTGRTLINMPASLGAIGDVFNFKLEPSLTLGCGSW GGNSVSENVGVKHLLNVKTIAERRENMLWFKVPQKVYFKYGSLPTALEELKGEHKKAFIV TDKTLAELGFTSHVTRVLEEIGIDFRIFSEVNEDPTLSSTQAGAKAMLEYNPDVIIALGG GSAMDAAKIMWVLYEYPEIRFRDLAMRFMDIRKRIFEFPKMGLKAKFVAVATSAGTGSEV TPFSVITDDSTGIKYPLADYELTPHVAINDPELMLTMPKGLTVASGIDVLTHAVESYVSV LATEYTKPYSLEAARLVFKYLPESVEKGEEAIKAKEKMANASCLAGMAFANAFLGICHSL AHKLGGKFHVPHGIANAMLLNEVIKFNSVEAPTKMGLFPQYRYPDAMQRYAAMADFLGLG GKTKEEKTDKLIKHINNLKEKIGIPMSIKEYGIPEKDFLDAVDEMSLDAFDDQCTPANPR YPLISELKEIYLKAYYGKEYKK >gi|261748085|gb|ADAD01000064.1| GENE 9 7508 - 7933 819 141 aa, chain - ## HITS:1 COG:CAC0587 KEGG:ns NR:ns ## COG: CAC0587 COG0716 # Protein_GI_number: 15893876 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Clostridium acetobutylicum # 1 140 1 141 142 97 40.0 8e-21 MAKLNIIYFSTSGNTEQMCIYLEQGAKSVGVGAEMISVDAADESSADADFIAFGSPATGS EEVAPEIMEYIESVKDKLKGRKVGIFGSYDWGGGEWLSIWIDDLAKEGIFVVGKGCMVNL APDDDEKIEACKEYGKAVVNQ >gi|261748085|gb|ADAD01000064.1| GENE 10 7952 - 9148 1744 398 aa, chain - ## HITS:1 COG:FN1171 KEGG:ns NR:ns ## COG: FN1171 COG0282 # Protein_GI_number: 19704506 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Fusobacterium nucleatum # 1 398 1 398 398 473 61.0 1e-133 MKVLVINSGSSSLKFELVDMTNEKTLAKGICERIGIADPIFTYKNLVKGTKIDKQQVPME DHSVAIDLVLKTLQDKENGVISGVDEIDAIGHRVVHGGEYYDDSVVVDDQVIKNLEEVAA LAPLHNPANIMGIKVMRKLLPDKKNVVVFDTAFHQSMKPEAYMYALPYEDYKELKVRKYG FHGTSHKYVSETVAELLGKKDAKIIVCHLGNGASISAVKNGKVVDTSMGMTPLAGIMMGT 
RTGDTDPAAAFYVMEKRGLSLKEIDTRMNKKSGILGVYGKSSDFRDLEEGMEAGDERAKL AYDMFCYRVKLYIGAYAAAMDGVDAIAFTGGIGENGADPRETICEGLGYLGLKFDNAKNK ERISGNVELSTPDSKVKVYKIETAEELVIARDTYRLTK >gi|261748085|gb|ADAD01000064.1| GENE 11 9188 - 10195 1624 335 aa, chain - ## HITS:1 COG:FN1172 KEGG:ns NR:ns ## COG: FN1172 COG0280 # Protein_GI_number: 19704507 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Fusobacterium nucleatum # 4 332 6 333 337 316 53.0 5e-86 MADLVASLKEKAKKLGKTVILPETEDERVLRAAEKIIAEGIAKVALVGNEQKLKEEAAKL GVSIEGAIIYDPQNCATIDDMAELLRKRREKKGMTFETAKATLLSDPRFFGAMLISQGRV DGMVAGSNSPTAHVLRAAILVIGPKAGLKTVSSSFVMITKTPEYGDNGVFIYADGGVIPN PTALQLADIAVSSAEKARFTAGIKEPKVAFLSFSTKGSADGESVTKVREAIEILKERNVD FEFDGELQLDAAIVPEVAKSKAPGSKVAGHANVLIFPDLNSGNIGYKLTQRLAGAKALGP LIQGLASPVHDLSRGCSVDDIVEVVAITAVESDQK >gi|261748085|gb|ADAD01000064.1| GENE 12 10226 - 11290 533 354 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157804145|ref|YP_001492694.1| 50S ribosomal protein L32 [Rickettsia canadensis str. 
McKiel] # 54 352 8 300 303 209 37 6e-54 MALKNLFKFGKKEKEHENSEQEIHENIQKQEIIENNEDIEKKEIEEKEAEVKPLKDRLAA PKKGFFGKLKQMLLGKTIDDDLYEELEEVLIQSDIGMNMTMQLVEELEKRVAKKKLKTSE EVYDELKEILKEKFISNNTLHDNIDLDLKDGQLNIILVVGVNGVGKTTSIGKIANKLINE GKKVIIGAGDTFRAAAIEQIEEWGKRTGAEVVKQSHGSDPSAVIFDTVSTAKNRSFDVAI IDTAGRLHNKRDLMNELEKINKIIRQQSGQEKFETLLVIDGTTGQNGLEQAKVFNEIVDL SGIILTKFDGTAKGGIIFPISEELKKPIKFIGVGEGINDLRKFNIKEFVEAMFE >gi|261748085|gb|ADAD01000064.1| GENE 13 11280 - 12020 961 246 aa, chain - ## HITS:1 COG:FN0928 KEGG:ns NR:ns ## COG: FN0928 COG1214 # Protein_GI_number: 19704263 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Inactive homolog of metal-dependent proteases, putative molecular chaperone # Organism: Fusobacterium nucleatum # 1 236 1 212 214 113 34.0 3e-25 MLIFSITTTTKIVSLSLHDGTKIIGEIRVEVAKTHSTGIIDQIDKLFEWTGKKLSDIDNV VVSTGPGSFTGVRIAISVVKGLFYGRNVNIYEVNELDALAYQIYYSAQSTSGEKCSGMKI YSMIDSGKEKIYYSVYTAESFGYLKKDKDDRVSKLDNVIESISLEENEKIYITGDGAFNY KEKITENLKDKVQFSEDKNMKINSSTFAQMLLNGKLVKTDIFNLKPDYLEKSQAERDKKE RGNNGS >gi|261748085|gb|ADAD01000064.1| GENE 14 11989 - 12456 675 155 aa, chain - ## HITS:1 COG:FN0929 KEGG:ns NR:ns ## COG: FN0929 COG0802 # Protein_GI_number: 19704264 # Func_class: R General function prediction only # Function: Predicted ATPase or kinase # Organism: Fusobacterium nucleatum # 4 145 3 144 153 113 49.0 1e-25 MKSQILTFDEIDRLAVKVAENMKKGGCIGLIGDLGAGKTTFTKKICKYYGIEENIKSPTF TYVIGYTSGSVNVYHFDAYRIINPEEIYEIGFEDYVGEDGSVIIVEWANNISDEMPEDTV YIEIEHNDENTRKVSIYKLKNGEKEYADIFHNNNN >gi|261748085|gb|ADAD01000064.1| GENE 15 12480 - 12713 299 77 aa, chain - ## HITS:1 COG:FN0930 KEGG:ns NR:ns ## COG: FN0930 COG2870 # Protein_GI_number: 19704265 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase # Organism: Fusobacterium nucleatum # 1 69 89 157 160 93 75.0 8e-20 MKFVDYTVIFDEKTPEKLLDLLKPDIHVKGGDYKKEDLPETKIVEKNGGEVKILSFVDSF STTDIINKIIDVYSNKS Prediction of potential genes 
in microbial genomes Time: Thu Jul 14 02:52:22 2011 Seq name: gi|261748081|gb|ADAD01000065.1| Leptotrichia goodfellowii F0264 contig00168, whole genome shotgun sequence Length of sequence - 1404 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 8 - 364 456 ## FN2115 hypothetical protein + Term 506 - 542 2.4 + Prom 400 - 459 4.7 2 2 Tu 1 . + CDS 687 - 1404 1094 ## gi|262037892|ref|ZP_06011321.1| putative filamentous hemagglutinin outer membrane protein Predicted protein(s) >gi|261748081|gb|ADAD01000065.1| GENE 1 8 - 364 456 118 aa, chain + ## HITS:1 COG:no KEGG:FN2115 NR:ns ## KEGG: FN2115 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 21 116 47 150 151 77 44.0 2e-13 MAQSLKKIGGAIYNEKYKSGVYEAIKDVVKRPINNKVQFEGITLIIPENTYINQKGGSIV DIKTGYGLPINFNSSGSCTTKKVENKIYGILYNEMIPGVEEIAQKIIKANGFTKTCSK >gi|261748081|gb|ADAD01000065.1| GENE 2 687 - 1404 1094 239 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037892|ref|ZP_06011321.1| ## NR: gi|262037892|ref|ZP_06011321.1| putative filamentous hemagglutinin outer membrane protein [Leptotrichia goodfellowii F0264] putative filamentous hemagglutinin outer membrane protein [Leptotrichia goodfellowii F0264] # 1 239 1 239 239 374 100.0 1e-102 MYSRSDELIIQGGNLKGKETDVKTKKLVLESLQDTSKYREIGVGIGVNAGKGKEKYTYSG NGEVTAVKGDKEWVGTQSSITGTEKIRVKAEEGYLKGGLIGNIDENGKDRGNLTVEIKKL LTEDIENKDKQVGVGIRASITEREKVPTDQQINLKDENGNPLKGKEDKVGTDYGASIEGH DREKVTRATIGNGVLLAGEQSGPLNRDVYRADETLRDVNVKKVNIDYVETKKGWGEIGG Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:52:38 2011 Seq name: gi|261748076|gb|ADAD01000066.1| Leptotrichia goodfellowii F0264 contig00218, whole genome shotgun sequence Length of sequence - 3816 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End 
Score pairs(N/Pv) - Term 1 - 35 0.6 1 1 Tu 1 . - CDS 147 - 1688 2720 ## COG0696 Phosphoglyceromutase - Prom 1805 - 1864 8.9 + Prom 1755 - 1814 7.9 2 2 Tu 1 . + CDS 1835 - 2332 654 ## COG0350 Methylated DNA-protein cysteine methyltransferase + Term 2369 - 2426 12.4 - Term 2365 - 2406 1.5 3 3 Op 1 . - CDS 2489 - 3010 433 ## gi|262037898|ref|ZP_06011326.1| conserved hypothetical protein 4 3 Op 2 . - CDS 3021 - 3815 861 ## gi|262037899|ref|ZP_06011327.1| hypothetical protein HMPREF0554_2431 Predicted protein(s) >gi|261748076|gb|ADAD01000066.1| GENE 1 147 - 1688 2720 513 aa, chain - ## HITS:1 COG:CAC0712 KEGG:ns NR:ns ## COG: CAC0712 COG0696 # Protein_GI_number: 15894000 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglyceromutase # Organism: Clostridium acetobutylicum # 1 510 1 508 510 547 55.0 1e-155 MKKRPVVLIILDGWGMNHHTEQVDAVRLAHPVNFERYKKEYPFTELRADGEFVGLPEGQF GNSEVGHLNIGAGRVVYQLLPKITKEIREGLILDNKPLSDVMNKTKESGKALHIMGLMSD GGVHSHINHIIGLVDMAKKKGLSEVYVHALMDGRDTPPESGAGYLAEMQKALDEIGLGKI VSVIGRYYGMDRDNNWDRIELAYNALFSGEGATAASAEEGIKASYAEGVTDEFVKPLKVV ENGTPAGIIKEGDGVIFANFRPDRARQLTRAIIEDDFKGFARKVHPKVNFVCMAQYDSTF DVPVAYPPQKIVNGFGEVVSKAGLIQVRTAETEKYAHVTFFFNGGVEEPYEGEIRLLSDS PKVATYDLQPEMSAYKVKDRLIEELNTGKVDTVVLNFANPDMVGHTGVVDAVIAACQAVD NCTGQIVNKVLELDGAVLITADHGNADLLVDPETGAPYTAHTVNPVPFILITNDMKDAKL RTDGKLADLTPTMLDLLGLEKPAEMDGETLIIK >gi|261748076|gb|ADAD01000066.1| GENE 2 1835 - 2332 654 165 aa, chain + ## HITS:1 COG:BS_dat KEGG:ns NR:ns ## COG: BS_dat COG0350 # Protein_GI_number: 16078418 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Bacillus subtilis # 6 161 3 161 165 150 50.0 1e-36 MRYIYFYDTKTKELGTIGIAADENHITNLFFEYEIENIKKDKNYILKETFLIKKASEQLF EYLAGKRKDFELPLLRDGTDFQISVWNELIKIPYGETRSYKDIAVAINNEKAVRAVGMAN NRNKISIFIPCHRVIGSDKKLVGYGGGLEIKKFLLNLEKMNLFVQ >gi|261748076|gb|ADAD01000066.1| GENE 3 2489 - 3010 433 173 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037898|ref|ZP_06011326.1| ## 
NR: gi|262037898|ref|ZP_06011326.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 173 1 173 173 280 100.0 3e-74 MEKKLISILILFLAFIFYKSCFEKNVRQKSYYEVSMSPLYNEGETIPNIEEVYIGDNENL IIKWDKVPEGIYLKNIMIIYKNEKIGNIIIEKRIYETINSNEFTYSFKDDLVRILGKENE NYKISSEHYMEDGFIFYIVFEDKKNNIEYKLKREVSIIFRKKGENTRTSITDF >gi|261748076|gb|ADAD01000066.1| GENE 4 3021 - 3815 861 264 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037899|ref|ZP_06011327.1| ## NR: gi|262037899|ref|ZP_06011327.1| hypothetical protein HMPREF0554_2431 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2431 [Leptotrichia goodfellowii F0264] # 1 264 1 264 264 452 100.0 1e-125 SDKKVNPYDFEYAKFDPDSEVTGDKIIVPSIYISKPTIKNKNKVDEAFINAYGEGKFEEF NLQRYWGRIEYQYVAKMNYYYLQANNIVNNLEEMKKGQFRLIPDNEAVLHNIVDGGKKIY IGEKPNKKYVDNNTGKEVVIDSKGKIVRDFWNVGTFNRYTYISDDDPDKNKHFEDVIEWL TFGTGANDKSSMFDRIKIGVIGKIISDNYNDIKEWAKYKGYNAVGYYELIQYEVDRRKDN LMNQYHNLLKFNPKAIAPKIIDLR Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:53:03 2011 Seq name: gi|261748074|gb|ADAD01000067.1| Leptotrichia goodfellowii F0264 contig00202, whole genome shotgun sequence Length of sequence - 901 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 887 1109 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 Predicted protein(s) >gi|261748074|gb|ADAD01000067.1| GENE 1 3 - 887 1109 295 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 294 1 307 407 431 70 1e-122 MAKAKFERSKPHVNVGTIGHVDHGKTTLTAAISKVLSDKGLAEKVDFENIDQAPEERERG ITINTAHIEYETANRHYAHVDCPGHADYVKNMITGAAQMDGAILVVSAADGPMPQTREHI LLARQVGVPYIVVFLNKVDMVDDEELLELVEMEVRELLTEYSFPGDEIPIVKGSALGALN GEGQWEDKIMELMDAVDSYVPTPERPVDQAFLMPIEDVFTITGRGTVVTGRVERGVVKVG EEVEIIGIKPTAKTTVTGVEMFRKLLDSGQAGDNIGALLRGTKKEEVERGQVLAK Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:53:04 2011 Seq name: gi|261748072|gb|ADAD01000068.1| Leptotrichia goodfellowii F0264 contig00014, whole genome shotgun sequence Length of sequence - 1495 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 1349 1400 ## COG2831 Hemolysin activation/secretion protein - Prom 1433 - 1492 6.5 Predicted protein(s) >gi|261748072|gb|ADAD01000068.1| GENE 1 2 - 1349 1400 449 aa, chain - ## HITS:1 COG:FN0131 KEGG:ns NR:ns ## COG: FN0131 COG2831 # Protein_GI_number: 19703476 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Hemolysin activation/secretion protein # Organism: Fusobacterium nucleatum # 1 449 1 447 566 250 37.0 3e-66 MKKIKNLSFFLLLCSVVVNAAPKDEANRLLDKEQQRLEQERQLNEQENRRKEIESSKFAP DIEIKPVEEENVTTFLLKDLQIIDKDKLLSEGDKFQLSKKYKYKEIGVKELNILLNEANA ILVKKGYVTSRINVNTDNIQQGEITFEVITGRIDKIKLNDNSFADKLKKFFNKPKTEGNV LNIRDIDTMTDNFNKNASNNFAVNIEPSDKEGFSNIISKNEIKGKTTVSVGYNNYGDEQG GKNRLKIGLDIESPLGVNDLLSMNIQGVKRKKVDRSWKIPESQLLPGQIAPSGPVPGYDP KIHGLLPPQRQTLLWNITYRIPFRSYSLVLSGNKSIYRRSTYAYNTIYDLSGSSTSLTAN LSKILYRDNKSKLTGNIEIKRKNSKSYFEDVELTNRNLTIGTLGVNYDFALFKGVSGLSV SYSKGMRIFHGEKDIEKGETAPKAEATRY Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:53:12 2011 Seq name: gi|261748047|gb|ADAD01000069.1| Leptotrichia goodfellowii F0264 contig00135, whole genome shotgun sequence Length of sequence - 24005 bp Number of predicted genes - 25, with homology - 24 Number of transcription units - 8, operones - 5 average op.length - 4.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 58 - 663 707 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) 2 1 Op 2 . + CDS 692 - 2020 1755 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) - Term 2213 - 2257 3.2 3 2 Op 1 . - CDS 2260 - 3177 1078 ## COG1897 Homoserine trans-succinylase 4 2 Op 2 . - CDS 3220 - 3813 259 ## PROTEIN SUPPORTED gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 5 2 Op 3 . - CDS 3846 - 4283 501 ## COG1490 D-Tyr-tRNAtyr deacylase - Prom 4393 - 4452 8.7 6 3 Op 1 . - CDS 4550 - 5950 1705 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 7 3 Op 2 . 
- CDS 5977 - 6585 1044 ## COG0491 Zn-dependent hydrolases, including glyoxylases 8 3 Op 3 . - CDS 6587 - 7432 1226 ## COG0668 Small-conductance mechanosensitive channel 9 3 Op 4 . - CDS 7475 - 8578 1918 ## COG0012 Predicted GTPase, probable translation factor 10 3 Op 5 1/0.000 - CDS 8618 - 9163 673 ## COG1573 Uracil-DNA glycosylase 11 3 Op 6 . - CDS 9191 - 9964 1030 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I 12 3 Op 7 . - CDS 10027 - 10467 519 ## Lebu_1114 hypothetical protein 13 3 Op 8 1/0.000 - CDS 10483 - 11625 1184 ## COG1161 Predicted GTPases - Prom 11652 - 11711 5.7 14 3 Op 9 1/0.000 - CDS 12112 - 13059 1091 ## COG4974 Site-specific recombinase XerD 15 3 Op 10 13/0.000 - CDS 13072 - 15627 3141 ## COG0550 Topoisomerase IA 16 3 Op 11 . - CDS 15678 - 16733 1179 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake 17 3 Op 12 . - CDS 16759 - 17559 1179 ## Lebu_1109 hypothetical protein - Prom 17660 - 17719 4.6 18 4 Tu 1 . - CDS 17721 - 19109 2010 ## COG0017 Aspartyl/asparaginyl-tRNA synthetases - Prom 19179 - 19238 11.4 19 5 Tu 1 . - CDS 19450 - 19581 125 ## - Prom 19607 - 19666 15.1 - Term 19646 - 19703 7.9 20 6 Op 1 14/0.000 - CDS 19722 - 20015 457 ## PROTEIN SUPPORTED gi|229211823|ref|ZP_04338213.1| LSU ribosomal protein L27P 21 6 Op 2 14/0.000 - CDS 20018 - 20353 373 ## PROTEIN SUPPORTED gi|229211824|ref|ZP_04338214.1| predicted ribosomal protein 22 6 Op 3 . - CDS 20381 - 20689 449 ## PROTEIN SUPPORTED gi|229211825|ref|ZP_04338215.1| LSU ribosomal protein L21P - Prom 20753 - 20812 7.5 - Term 20702 - 20749 -0.9 23 7 Op 1 . - CDS 20840 - 21583 885 ## COG0220 Predicted S-adenosylmethionine-dependent methyltransferase 24 7 Op 2 . - CDS 21573 - 22760 1232 ## COG1519 3-deoxy-D-manno-octulosonic-acid transferase - Prom 22789 - 22848 3.8 25 8 Tu 1 . 
- CDS 22852 - 24003 862 ## COG0477 Permeases of the major facilitator superfamily Predicted protein(s) >gi|261748047|gb|ADAD01000069.1| GENE 1 58 - 663 707 201 aa, chain + ## HITS:1 COG:FN0525 KEGG:ns NR:ns ## COG: FN0525 COG0744 # Protein_GI_number: 19703860 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Fusobacterium nucleatum # 7 199 2 196 731 198 54.0 5e-51 MLIVAGIGFFAYMIFTLRGETPTELIESYSPISPSIIYDMNGNQLDTITVENRDPISIKD VPLNVQNAFLAIEDRKFRTHYGFDLVRTGRAMFLTLTGKRREGGSTITQQLAKNAFLTPE RTVVRKMKEAILALEIERKYTKDEILENYLNTIYFGRGAYGIKNAALKYFNKEPKDLTIA QAAVLASLPKSPSKYSKNRKC >gi|261748047|gb|ADAD01000069.1| GENE 2 692 - 2020 1755 442 aa, chain + ## HITS:1 COG:FN0525 KEGG:ns NR:ns ## COG: FN0525 COG0744 # Protein_GI_number: 19703860 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Fusobacterium nucleatum # 1 440 212 651 731 278 38.0 1e-74 MRNFGFINEQQYEQAKSEEIAFVNTDSKNKNEEEQISTSNIAPEFTTIVLSEVKKILKVD EDDHNFLFDGYKIYATVDLDMQRAAYKAFSSNYNLKRREKLNGALFSIDPSNGYVKAMVG GKNYKKGEFNRALSALRQPGSSFKPLVYLAALQKGMEMSSVMEDSPLTTPGWSPKNYDGK FRDSMTLLKAIEISNNIIPVKLLQYVGVDSVEKVWRDAGVVGGDFPKDLTLALGSISTKP IDMALFYAALANGGYQVTPQYIYKIENKYGEVIYEAKPKKKQIFKSEDVALLTYMLESAV NYGTGQSAKVFKNGNLIPMAGKTGTTSDYISAWFTGYTPTLATVVYVGYDDNKSMGGGMT GGAAAAPIWKTYMQSVVNIQNYNVGTFEFIDDYIKRRDLSLREIDLKIGLLDTDGVDKRT ALFKAGTEPIESETKFKDGIVF >gi|261748047|gb|ADAD01000069.1| GENE 3 2260 - 3177 1078 305 aa, chain - ## HITS:1 COG:BH2280 KEGG:ns NR:ns ## COG: BH2280 COG1897 # Protein_GI_number: 15614843 # Func_class: E Amino acid transport and metabolism # Function: Homoserine trans-succinylase # Organism: Bacillus halodurans # 1 302 1 302 303 375 58.0 1e-104 MPIKIPHNLPAVNILAKENIFVMDEKRALSQDIRPLKFIIINLMPTKIETETQLLRLLSN TPLQIEITFLKMDSYISKNISAEHMKNFYKTFEDIKNEKFDALIITGAPVETLEFEEVDY WKELTEVMEWSSKNIFSTLHICWGAQAGLYYHYNIPKYSLDKKLFGVFPLEIEDSKTMLL 
RGFDEIFNMPQSRHTGIKEEEIKKNPELEILAKSDLVGASIIRTKDKRKIFVTGHMEYDR LTLANEYNRDINLGKKIDIPYNYYPENNSEKTPLYTWRSHANLFFTNWINYYVYQETPYD LNELK >gi|261748047|gb|ADAD01000069.1| GENE 4 3220 - 3813 259 197 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 [Haemophilus parasuis 29755] # 81 191 67 174 175 104 43 6e-22 MNRKKIIITVGCIVLALSCSSIENNKHSGKGTHSGKTVTNDEQIVSDRFKELRREQDRIM LSGTAEEIDRVILENALLKSYNNWKGTRYAWGGDSSRGIDCSALTRRVYREVFGYELPRV TEDQVKVGRKVSFNDLKPGDILFFRPENRVNHTAVYLGKSLFINASSSKGVVLSSLESSY WGKYFKYAVRVDDARRS >gi|261748047|gb|ADAD01000069.1| GENE 5 3846 - 4283 501 145 aa, chain - ## HITS:1 COG:SP1644 KEGG:ns NR:ns ## COG: SP1644 COG1490 # Protein_GI_number: 15901480 # Func_class: J Translation, ribosomal structure and biogenesis # Function: D-Tyr-tRNAtyr deacylase # Organism: Streptococcus pneumoniae TIGR4 # 1 145 1 144 147 137 45.0 5e-33 MKIIIQRVNKAEMNVNGTFKCKIGKGIVAYIGITHEDGIKDINYCIDKLINLRIFDDSEG KLNLSVQDIKGELLIVSNFTVYGNTKKGRRPDYLNSAPASKAKEIYNMFLEKLSETEVPF ETGEFQADMKIYSENDGPVNLIIES >gi|261748047|gb|ADAD01000069.1| GENE 6 4550 - 5950 1705 466 aa, chain - ## HITS:1 COG:SP2021 KEGG:ns NR:ns ## COG: SP2021 COG2723 # Protein_GI_number: 15901842 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Streptococcus pneumoniae TIGR4 # 3 463 4 469 469 575 61.0 1e-164 MSKLKFPKNFWWGAATSGPQSEGRFNKKNRNIFDYWYDIDKKAFFDEVGPDTASNFYNSY KEDIELYKKIGLNSFRTSIQWSRLVKNFETTELDEDGVRFYNDVINEFEKKGITLVLNLF HFDMPIELQEKYGGWESKHVVELFVKYAEKAFELFGNRVKYWMTFNEPIVPVEAQYMYKF HYPLIVDGKKAMQVLYNTALASAKVIKKFRDMKTEGEIGIILNLTPSYPRSESEEDKRAA EISDVFFNNSFLDPAVYGEFPKLLTDTLKKDNVLWESTEEELKIIKENTVDFLGVNYYQP KRVKARETEFDMSNGWLPDKYFENYEMPGRRMNIYRGWEIYPQAIYDIAKNIRENYKNIK WFISENGMGVEGEDRFKNTDGVIEDDYRIDFYREHLTHLHKALEEGANCFGYHTWTGIDC WSWANAYKNRYGFISVDLATQKKTVKKSGNWIKNVAESNEIEECHF >gi|261748047|gb|ADAD01000069.1| GENE 7 5977 - 6585 1044 202 aa, chain - ## HITS:1 COG:FN1162 KEGG:ns NR:ns ## COG: FN1162 
COG0491 # Protein_GI_number: 19704497 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Fusobacterium nucleatum # 25 199 26 200 207 141 44.0 6e-34 MEVELFRNWDTQGNSYVITKGTDCYVVDPGGKNMAPVIDYIKEKGLNLVAVLLTHGHFDH IIGIPQIIEYKDVPVYISEKDYDFLYDPNLSLSTWIRMDFKLSDDVKVIKMKENDEVFGF KIIETPGHTHGGVCFYDENEKLMISGDTIFKGTYGRTDVPMGSSEDMKNSISRLMKMDGD IIVYPGHGDYTYIKDERKYYNY >gi|261748047|gb|ADAD01000069.1| GENE 8 6587 - 7432 1226 281 aa, chain - ## HITS:1 COG:VC0480 KEGG:ns NR:ns ## COG: VC0480 COG0668 # Protein_GI_number: 15640507 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Vibrio cholerae # 22 277 35 285 287 179 36.0 6e-45 MGNFEKLFDWKAIYRFLETNVLKIIFAFAVFKIAGILKKYVDNTIKIVLEKSKMDKSVAS FLRSSFSVLYYIILGYLLIGFLGINLTSITTFLGAAGIVLGFAFKETLGNFCGGLIILTF KPFKVGHLIEYGKYMGEVKSIELIYTKIKTPQNELVIIPNGMITNSEIRNVTKEKVRRLD IKIGVSYDSDILKVKEVLNEIVNEEIKSEEKLVLKSPAPTIGVLELAESSVVFCVYVYTK SDNYFNLMLKMNEKIKIKFDENNIEIPYPQMDVHISKKGEE >gi|261748047|gb|ADAD01000069.1| GENE 9 7475 - 8578 1918 367 aa, chain - ## HITS:1 COG:FN1365 KEGG:ns NR:ns ## COG: FN1365 COG0012 # Protein_GI_number: 19704700 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted GTPase, probable translation factor # Organism: Fusobacterium nucleatum # 1 367 1 364 364 485 68.0 1e-137 MIGIGIVGLPNVGKSTLFNAITKTQNAEAANYPFATIEPNIGLVSVPDTRLKELEEIINP QRTVGATVEFVDIAGLVKGASKGEGLGNQFLSNIRNTAAICQVVRCFDDDNIIHVEGSVD PVRDIETINGELIFADLDTVERALQKNSKLARGGNAEGKELVAVLEKCKTHLEDFKLLKT LTFTQREAELIRVYQFLTLKPMMFAANISEDDLSKGIENDYVKKVREYAATHESEVVTFS AKVEAELIEIEDEEERQMFIEELGISEPSLNRLIRGGFKLLGLITYFTAGEKEVRAWTIS KGTNAQKGAGEIHTDIEKGFIRAEVVAYDKFIEYKGWQGSKEKGAMRLEGKEYIINDGDV MFFRFNV >gi|261748047|gb|ADAD01000069.1| GENE 10 8618 - 9163 673 181 aa, chain - ## HITS:1 COG:PH1472 KEGG:ns NR:ns ## COG: PH1472 COG1573 # Protein_GI_number: 14591260 # Func_class: L Replication, recombination and repair # Function: 
Uracil-DNA glycosylase # Organism: Pyrococcus horikoshii # 5 152 14 169 197 73 33.0 1e-13 MWNDLETEIGICTKCHLERIRTNPVTGKGSRKAKILFVLESVSKKEDLKNDLLADKKGEY FKKFLEYSKINLSECYFTTLTKCSSHGELIENECIVKCRDYLTTQIALLNPEYIITVGEI PTKNFIKSKEEIRDMVGKIYGYTGGIKIIPVYDLSYLLKAGDKEKWQVVKILEEINRRIS E >gi|261748047|gb|ADAD01000069.1| GENE 11 9191 - 9964 1030 257 aa, chain - ## HITS:1 COG:FN0900 KEGG:ns NR:ns ## COG: FN0900 COG1235 # Protein_GI_number: 19704235 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Fusobacterium nucleatum # 1 230 1 231 260 221 48.0 9e-58 MEFSVLGSGSSGNSSYIEMGKRKFLIDAGFSGKKTIEKLNNIERRIEDIDGIFVTHEHSD HIQGLGVLSRKFDIPVYLHEITYGMIKEKVGKIEKKNINFINEEKITIDDCVINSFEVMH DAEKCLGFTFEHEDKKLAYASDVGCTNNIIKENFKNSDVIVVESNYDYNMLMTGPYHWEL KNRVKGRNGHLSNAEASKLISQVMSDKLKKIYLMHISKDNNTPELAYNSLYEILERENKS HLEIEIVTENETVIYKI >gi|261748047|gb|ADAD01000069.1| GENE 12 10027 - 10467 519 146 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1114 NR:ns ## KEGG: Lebu_1114 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 4 145 5 146 146 129 47.0 4e-29 MAINDEYVTGLNRKKIRKIIVKTIVFITVFITVFFIGVYLFSKRVEKLIKADIQIETVRL ENAVKEFKSKTGVYPDISGKENNLKEVKSPDGRYTFDLFYGTEKIYEIPDNLKKGIMKSN SVNLRKDNKGGWFYNTMTGEIKPNID >gi|261748047|gb|ADAD01000069.1| GENE 13 10483 - 11625 1184 380 aa, chain - ## HITS:1 COG:FN1072 KEGG:ns NR:ns ## COG: FN1072 COG1161 # Protein_GI_number: 19704407 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Fusobacterium nucleatum # 6 378 1 366 366 356 49.0 4e-98 MEEELISKKCTGCGIELQFEDKNREGYVPEEKFITEEDLLCQRCYKIKNYGQNIGNNLKK EDYSKEVANSAKKSDIILPIFDIIDFEGSFTEEILDYLRDYRSIILINKIDLLPDFVHPT EIANWVKERLMEEDIIPDDIGFLSAKSKYGVNGIIRKINNLFPDKKVRAVILGASNVGKS SVINLLLGKNKITTSKYSGTTLKSINNKIPKTDIMIIDTPGLIPEGRISDLISVESGLKL VPSGEISRKTFKPEENRVFMFDAFCRFKILETENSDRGKYKPIFSVYSSKNVKFHLTKED RVEELLKGDFFELMKKEEKENYLENKFVTYNTEIQENEDLVIAGLGWINVKRGPLNIELT VPEYAKVIVRPSLFRGRKYK 
>gi|261748047|gb|ADAD01000069.1| GENE 14 12112 - 13059 1091 315 aa, chain - ## HITS:1 COG:lin2069 KEGG:ns NR:ns ## COG: lin2069 COG4974 # Protein_GI_number: 16801135 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Listeria innocua # 14 313 5 297 297 212 38.0 7e-55 MENSEKNNNLKTYVDKFLYYEEVITGKSYNTIKSYKKDIMQFIDYLNEYEEIDEFENVET ITFRSFIAYLNSTSKENNDERKVVSKRSINRKISALRTFFKYLQEKKIVKTNKVNYITMP KFEKELPNVLGREDINKLRDAVNTSKITGIRDRLIIELLYSSGIRASELIDLNEYMINIE ERELRVIGKGNKERITFFSENSKKWLEKYIEEKKEKYSNYTKDVVFANSKGEKLTTRSLR RLIADYAKKAGLQKEVTPHVFRHTFATELLNNGVDIRYLQELLGHSSISTTQVYTHVSKA LLKDVYMNTHPLARE >gi|261748047|gb|ADAD01000069.1| GENE 15 13072 - 15627 3141 851 aa, chain - ## HITS:1 COG:FN1069_1 KEGG:ns NR:ns ## COG: FN1069_1 COG0550 # Protein_GI_number: 19704404 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Fusobacterium nucleatum # 17 685 12 683 684 687 57.0 0 MRENKYKKFEVRVLAKKLVIVESPSKAKTIEKILGKGYEVTASYGHVIDLPKTKIGIDVE NNFEPKYQVIKGKGEVLKKLKEKSKKASTVYLASDQDREGEAIAWHISNYIKQPEKTKRI EFNEITKTAVNNAIKNPRDINKNLVNAQQARRLLDRIVGYKISPLLWKTINKNASAGRVQ SVALKLICDLEDEIKSFVPQKYWEVGALLENGINLNLVEILKEKVDKIFDEEVVKKLKKE LKGKSLTLDSIEIKKKTQRPPLVFKTSTLQQLASSYLGYGATKTMRIAQQLYEGLSINGE NRGLITYMRTDSTRISMDAINMAKDYITNHYGKEYVGYYVTKNSKSNVQDAHEGIRPSDI NLDPEKIKDSLTNDQYKLYKLIWNRFLVSQFAAMKYEQMQINAVNGDYKFRGTINKVIFD GYYKIFKEEDEIKTADFPELKEGDSYLIKKLNVEEGITKPPTRYSEATLVKKLESEGIGR PSTYASIVETLKTREYVEIIEKRFHPTFLGYEVKNELEKNFEDIMNVKFTAIMEEDLDKI EEGNVQWVELLKKFYDSLEIHLEQYEKEIEKLKDRRIESDVLSSDGSPMLLKTGRFGKYL VSETNPEEKITIKGIAVEAEQIEAGKIFIKEEVEKLQANKKGIPTDFFTENDKRYMLKKG RFGEYLESEDYENDEKRMSLPLPIKQKYKKGTLNEVNGVLQINEEITAIINEDKKIIEEA GVCEKCGRPFEIKMGRFGKFLACTGYPECKNIKAIPKTKSTSEKKKTASKKTKAKKGTVE KTAAKKAASENITVKKVTSKKETAEKTSTKKTVSKTKATTKTKKPSSTKTTAKKSPTAKK TASKKSSSKGK >gi|261748047|gb|ADAD01000069.1| GENE 16 15678 - 16733 1179 351 aa, chain - ## HITS:1 COG:FN1068 KEGG:ns NR:ns ## COG: FN1068 COG0758 # Protein_GI_number: 19704403 # 
Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Fusobacterium nucleatum # 69 351 10 288 288 208 41.0 1e-53 MGWLKLKEIGMKDSHIRKLMEYYRTYEELFLEENFRLFNGELKRMLEKAENIDLSEKYEL YSKKGIRIINANDKEYPKKLKEIVDYPLFLYVKGNANLNSNEGKNIAVVGTRRATKYGKS SCEKIVRELFSYNINLISGLAEGIDTIALTAFSEKNGNTVAVVGTGLDVVYPYENKMLWE KISDTGMIISEYPLGTAPLKWNFPRRNRIIAGLSDGIIIAESFKSGGSLITAELGFSVDR EIFAIPGFINYPSFEGCNNLIKENKAKLVTCAEDIAKEFLWDINKEKSKKTKLNEEEKKV FDYLEEETGLEELMKKIGNEIPVNRILSLLMSLKVKGLITETGTARYMRIV >gi|261748047|gb|ADAD01000069.1| GENE 17 16759 - 17559 1179 266 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1109 NR:ns ## KEGG: Lebu_1109 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 86 266 40 232 232 172 48.0 1e-41 MKKIISLITLILTVAIIVKGDLLGKIQTIDTLIQQGKYDRAEQSARKLLNDPNITPQEKA SVQNLLNEIAQKKNGQKQQQTNQQAQQQQQATQSEIDKIITEATQNGDAANTPQTADTGT VPVTQTDDVSDGSKFGTYNNYEKAALAKRNASTIYQLCQLYFKDGLFERAANLAKKDTSG DIRNLYVVAISSRLMGNYDQSIDYYNRILASSPGEAQARLGIGIAYKGKGDYGKALEYLR SYASSNPDREVTREIAILNEVIANNK >gi|261748047|gb|ADAD01000069.1| GENE 18 17721 - 19109 2010 462 aa, chain - ## HITS:1 COG:FN0040 KEGG:ns NR:ns ## COG: FN0040 COG0017 # Protein_GI_number: 19703392 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl/asparaginyl-tRNA synthetases # Organism: Fusobacterium nucleatum # 2 462 1 461 461 664 68.0 0 MLLELRELQLNTEKYLDKEIELNGWIKKIRSQKNFGFIELNDGTFFNGIQIVIDDTLPNF EEISKLTISSSIKVYGKLVKSEGKGQDYEIKADKVEVYDKSDSDYPLQNKRHTFEFLRTI AHLRPRTNTFFAVFRVRSILAYAIHKFFQEKNFVYVQTPIITGSDAEGAGEMFRLTTLDL EKLPRTEKGDIDFKQDFFGKESNLTVSGQLNVETFCTAFKNTYTFGPTFRAENSNTPKHA AEFWMMEPEIAFADLDVNMDVIEEMIKYIVKYVMEHAKEEMEFFNKFIDKELFERLDVLM NSGFERITYTEAIEILKNAKQDFEYKVEWGIDLQTEHERYLAEKHFKKPVFVTDYPKDIK AFYMKLNKDRKTVRAVDLLAPGIGEIVGGSQREDNYDVLLDKINQMGLNEQDYWWYLDLR KYGSVPHSGFGLGFDRILMYITGMTNIRDVIPFPRTTKNLEF 
>gi|261748047|gb|ADAD01000069.1| GENE 19 19450 - 19581 125 43 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKIFYMLMLILLCMMTNYNSLGKSNKKNFKNSISIRGMLRAY >gi|261748047|gb|ADAD01000069.1| GENE 20 19722 - 20015 457 97 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229211823|ref|ZP_04338213.1| LSU ribosomal protein L27P [Leptotrichia buccalis DSM 1135] # 1 97 1 97 97 180 91 7e-45 MILKLNLQLFASKKGQGSTRNGRDSNPKYLGVKKYDGEAVKAGNIIVRQRGTKFHAGTNM GLGKDYTLFALTDGYVKFENFGKGKKRVSIYAERKEA >gi|261748047|gb|ADAD01000069.1| GENE 21 20018 - 20353 373 111 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229211824|ref|ZP_04338214.1| predicted ribosomal protein [Leptotrichia buccalis DSM 1135] # 1 107 1 108 112 148 63 4e-35 MIKIEIEKRNERIIYFEISGHANFKEYGQDIICSAVSSVSQMTLNGLIEILKLKKLKYTE KEGYIVCDLEKSGLTDEEYERADILTQSMFSYLKEIARQYGKYVNLKIREV >gi|261748047|gb|ADAD01000069.1| GENE 22 20381 - 20689 449 102 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229211825|ref|ZP_04338215.1| LSU ribosomal protein L21P [Leptotrichia buccalis DSM 1135] # 1 102 1 102 102 177 84 6e-44 MFAVIKTGGKQYKVEVGSVLKVEKLAAEVNSDVEIKEVLLVGEGDNVTVGTPLVNSASVV ATVKEHGKGEKKINFKYNKKTYYRKKGHRQQYTTIEVKSINA >gi|261748047|gb|ADAD01000069.1| GENE 23 20840 - 21583 885 247 aa, chain - ## HITS:1 COG:FN1606_2 KEGG:ns NR:ns ## COG: FN1606_2 COG0220 # Protein_GI_number: 19704927 # Func_class: R General function prediction only # Function: Predicted S-adenosylmethionine-dependent methyltransferase # Organism: Fusobacterium nucleatum # 35 245 2 212 214 242 61.0 6e-64 MVIRERKVKLKTEKIENEKQAKEVWEYFFEKPKKNYNKYIYEMVEYPEHLMYDNEKADSY KGRWTDFFGNNNDIFLEIGCGSGNFTVGNAEKYKDRNYIALELRFKRLVLGARKSKKRNI KNILFIRKRGETILDFIGENEISGIYINFPDPWEGEEHKRVISKELFEKLNVILKKDGKL YFKTDHQKYYEDVIELVSEIDGYEVVYHTDDLHNTEKAVENIKTEFEQMFLSKHNMNIKY VEIVKKI >gi|261748047|gb|ADAD01000069.1| GENE 24 21573 - 22760 1232 395 aa, chain - ## HITS:1 COG:FN1606_1 KEGG:ns NR:ns ## COG: FN1606_1 COG1519 # Protein_GI_number: 19704927 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
3-deoxy-D-manno-octulosonic-acid transferase # Organism: Fusobacterium nucleatum # 10 382 16 392 426 224 42.0 2e-58 MYIPIFIISLFNKKTREFFKKRLFQDPENRNFLEKEEKAVFIHMSSVGEFNLSKELIEKL LEKKEKIIISVMTDTGKEAVTKAYGNNKNIKIIFFPLDDYFMLRKVYKNFSVKKTVIIET EIWPNLYQTAEKYSELYIINGRLTEKNMKTYRKIKPLIKKTLNKVKKIMIQSEEDKERYL DLGVKKEKIYVFKNLKYSIKYEILSEKEKKELLDNYTINGRKIIVCGSTRPGEEKIWIEV FKEININSVYQLIIVPRHLDRISEIADEIKQTFGEENFSLLSENKKNDIILVDKMGMLRD FYQLADFVFVGGTLVNIGGHSILEPLYYGKMPIIGEYYQNIEEIVKEAKKMKFIKIVKNK NEITEYLKKSEIIDTSDFFKKNDEIDKILKELYGD >gi|261748047|gb|ADAD01000069.1| GENE 25 22852 - 24003 862 383 aa, chain - ## HITS:1 COG:SA1341 KEGG:ns NR:ns ## COG: SA1341 COG0477 # Protein_GI_number: 15927091 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Staphylococcus aureus N315 # 3 367 25 388 405 93 23.0 6e-19 GAATQFSIFLFPLIAIYLYNTNVMQTSLITFVTFIPNLFLANHAGIFVDRHRKKKILLIC NIISTILIFLLILLIMLNVKNIILFYFAILLSNTVRVFYTLTYGAYIPVLVQKKDLKSAN VVLEIVNSFVQVIFPSLLGILTKYIALPLMTVGYGISHVLSLIGQLFIKDKEEKIIKTDT DSEESCIEEIRKSYIFVFSNELLRPIIICYIVLVFCIGIFTSVQSFYILKELGLDKSKLG LIMGIGNTGFFIGSFLTPILSKKIKPAKLILFSISQYFGGYVCYFIFNSITGLIIGQILI STAIPVYNINVVTLRQSITPIERLGSTTSIFRVCGRGLVPVGALVGGILGGLFTSRYTVL ISAFIMLFSGISIICSKKLMKWE Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:53:26 2011 Seq name: gi|261748045|gb|ADAD01000070.1| Leptotrichia goodfellowii F0264 contig00184, whole genome shotgun sequence Length of sequence - 257 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 255 185 ## Smon_0058 hypothetical protein Predicted protein(s) >gi|261748045|gb|ADAD01000070.1| GENE 1 3 - 255 185 84 aa, chain - ## HITS:1 COG:no KEGG:Smon_0058 NR:ns ## KEGG: Smon_0058 # Name: not_defined # Def: hypothetical protein # Organism: S.moniliformis # Pathway: not_defined # 1 82 156 237 407 110 67.0 2e-23 YFTDSDILHYGLISVIHTFGRDLKWNPHIHAIVSLGGFNKNFDFKKLEYFNVNTIAAQWK YHVLDIISKGNYPNQKIKRLAKIT Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:53:30 2011 Seq name: gi|261748040|gb|ADAD01000071.1| Leptotrichia goodfellowii F0264 contig00112, whole genome shotgun sequence Length of sequence - 4021 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 47 - 637 859 ## COG1739 Uncharacterized conserved protein 2 1 Op 2 . - CDS 649 - 1644 1386 ## Lebu_2083 hypothetical protein 3 1 Op 3 . - CDS 1681 - 2448 1018 ## COG1994 Zn-dependent proteases - Prom 2491 - 2550 8.1 + Prom 2491 - 2550 12.3 4 2 Tu 1 . 
+ CDS 2643 - 4020 1071 ## GTNG_2007 hypothetical protein Predicted protein(s) >gi|261748040|gb|ADAD01000071.1| GENE 1 47 - 637 859 196 aa, chain - ## HITS:1 COG:FN1907 KEGG:ns NR:ns ## COG: FN1907 COG1739 # Protein_GI_number: 19705212 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 185 4 184 195 180 51.0 1e-45 MKTVSKETIIEFEEKKSKFIGYIKPVSSVNEAEKFIKSVRKKHPEATHNVPLYRVMENGQ EYFKYNDDGEPTNTAGKPMAEILNILDVYNVAIVATRYFGGIKLGAGGLIRNYAKTAKLA INEAEITEYVEKSEFILDYDYESTSEVDSFLNINGEKYNIEISERNYSDRVTLKIKANSD IEEKLNEMSRVIVIRL >gi|261748040|gb|ADAD01000071.1| GENE 2 649 - 1644 1386 331 aa, chain - ## HITS:1 COG:no KEGG:Lebu_2083 NR:ns ## KEGG: Lebu_2083 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 270 1 242 293 81 28.0 5e-14 MKKILLILTMLFGISFGANAFTRKEYIQENLSKLGIKQDIIDETIKLDKEIGDTTLNESS EGGTTEKKRTQQLKKAKELFKKDKNNYIILEKILGIAKSKERNEEIKEYENLYLKAKIPE DAKNFFLGEYFFLSGQRDKMNEYFKKVKDSSKNVFYLKMTEFYKMLEMKEEAENNNKDTE KENKRIKEFISTGIIKTGELNKILNNSSLMKESGISDEEVYSRQLDVEILRVFFYVFNDE LEKGIDDYIQNIANKKVSKDVLEYNKEKDVFTMMMTIMTPVFKIMFGALTDTKNKDEINE KSIEKIMEKFIKTEKYRQFEKEGLMDEINLK >gi|261748040|gb|ADAD01000071.1| GENE 3 1681 - 2448 1018 255 aa, chain - ## HITS:1 COG:aq_1853 KEGG:ns NR:ns ## COG: aq_1853 COG1994 # Protein_GI_number: 15606892 # Func_class: R General function prediction only # Function: Zn-dependent proteases # Organism: Aquifex aeolicus # 46 245 7 214 217 138 41.0 1e-32 MNNIKRYYNKFKMMNPKNTELKLAVIITVLGFMLFRGISDFNFSWDIVISLCVFVISMTL HEVAHGYVAYRFGDDTAKRAGRLTLNPLKHLDLFGMLLPILLILSGFPFVIGWAKPVPVN FYRLKPNRLGLFCVAIAGIVVNLIIAGISLIALRTLAHSADFFMSDTLLTVFLYMYIINL ALAFFNLIPITPLDGGRIVYSMAGEKVRSFYNQIEKYGIIIVFIIVYSGVFRGIFMELLD FFINLTNLGIPVDVI >gi|261748040|gb|ADAD01000071.1| GENE 4 2643 - 4020 1071 459 aa, chain + ## HITS:1 COG:no KEGG:GTNG_2007 NR:ns ## KEGG: GTNG_2007 # Name: not_defined # Def: hypothetical protein # Organism: G.thermodenitrificans # Pathway: not_defined # 3 422 35 456 665 
294 38.0 7e-78 MGISEAGIETFRGSLFGSLAKEICQNSLDARLNFSEPVTVEFSKFNIKTDSIPGYSELNE ALKKCLHSDSDEKAKKFFENACKKMENPKISILRASDFNTTGLRGSRESNMSINPWQSLV KSSGISDKSSVSGGSYGIGKSAPFACSDIRTVFYSTFDTENVLATQGVSRLISFPISEDR FTQGIGYYGETEKNTPVFKDLNLDNSFKRKGRRGTDLYILGFREDSKWKEELIFAVLENF LISIYNRNLIVKVDETVISRETLKETIDNLKISEKKKSHFIGDYYRVLLSTSGNNVVDNS FLLGDNFKDLDDFELRILYDDLNRKVLISRSNGMKIFDLDRFPSSLQFSAVFTLKGEKLN GYFREMESPQHDNWEPDRHSDKNAKNMLKELKLFIRNKIVELGKNAIEDQMDADGIGDFL PDDLIFLEKKKEEDKIDKVSDKTKKMELKSVDSVPRKRN Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:53:42 2011 Seq name: gi|261748038|gb|ADAD01000072.1| Leptotrichia goodfellowii F0264 contig00191, whole genome shotgun sequence Length of sequence - 619 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 617 855 ## gi|262037937|ref|ZP_06011359.1| MhaB1 Predicted protein(s) >gi|261748038|gb|ADAD01000072.1| GENE 1 2 - 617 855 205 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037937|ref|ZP_06011359.1| ## NR: gi|262037937|ref|ZP_06011359.1| MhaB1 [Leptotrichia goodfellowii F0264] MhaB1 [Leptotrichia goodfellowii F0264] # 1 205 1 205 206 339 100.0 9e-92 SVPEVIGVTESIRNIGKIEGNGLVYVEGKEVENVAGNIKGKGGTYIKSTEKDIVDKTISL QDSKKGVTETVTETRQVRDKKLVHRGGRDGYEREVEYGPSKTVQVQVSRYWDTVNKMNTV SGIIGEGKDTILDSAKDIVLESSDLRAKNDIVLNAKNYLLMLSTVDTEYKFRTQTSTSGG GLRRKKTTTETWIQDNVYANPVEIT Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:53:53 2011 Seq name: gi|261748036|gb|ADAD01000073.1| Leptotrichia goodfellowii F0264 contig00145, whole genome shotgun sequence Length of sequence - 472 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 471 639 ## FN0254 hypothetical protein Predicted protein(s) >gi|261748036|gb|ADAD01000073.1| GENE 1 3 - 471 639 156 aa, chain + ## HITS:1 COG:no KEGG:FN0254 NR:ns ## KEGG: FN0254 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 3 156 1269 1421 1677 209 67.0 3e-53 IKKWNIYSGSFTWIATATIDTGAATIRNIYLAKKPYTAFAGKEPTPVEVTDTYNFLDGLE QRYGVEGLDTREKTLFNKLNGIGTNEQLLFYQATDEMMGHQYANVQQRMNRTGTLLDKEF THLRKEWDNKSKQSNKLKVFGMRDEYKTDTAGIIDY Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:54:01 2011 Seq name: gi|261748018|gb|ADAD01000074.1| Leptotrichia goodfellowii F0264 contig00049, whole genome shotgun sequence Length of sequence - 17206 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 2, operones - 2 average op.length - 8.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 77 - 877 206 ## PROTEIN SUPPORTED gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 2 1 Op 2 . + CDS 909 - 1805 966 ## COG2267 Lysophospholipase + Prom 1827 - 1886 8.0 3 2 Op 1 . + CDS 1912 - 2511 859 ## Lebu_0306 hypothetical protein 4 2 Op 2 34/0.000 + CDS 2528 - 3277 599 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 5 2 Op 3 15/0.000 + CDS 3244 - 4122 335 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 6 2 Op 4 . + CDS 4136 - 4954 313 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 7 2 Op 5 . + CDS 4988 - 5635 859 ## Lebu_0637 hypothetical protein 8 2 Op 6 . + CDS 5704 - 6846 1566 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes 9 2 Op 7 . + CDS 6939 - 7448 594 ## Lebu_0460 Ser/Thr protein phosphatase family protein 10 2 Op 8 . + CDS 7470 - 9329 1884 ## Lebu_0461 hypothetical protein 11 2 Op 9 . + CDS 9345 - 10133 993 ## COG0561 Predicted hydrolases of the HAD superfamily 12 2 Op 10 . 
+ CDS 10167 - 13622 4014 ## COG0587 DNA polymerase III, alpha subunit 13 2 Op 11 . + CDS 13690 - 14091 581 ## Lebu_1931 biotin/lipoyl attachment domain-containing protein 14 2 Op 12 4/0.000 + CDS 14147 - 15514 2306 ## COG0439 Biotin carboxylase 15 2 Op 13 . + CDS 15566 - 15976 601 ## COG1302 Uncharacterized protein conserved in bacteria 16 2 Op 14 . + CDS 15990 - 16643 1021 ## Lebu_1934 hypothetical protein 17 2 Op 15 . + CDS 16681 - 17091 766 ## COG0781 Transcription termination factor Predicted protein(s) >gi|261748018|gb|ADAD01000074.1| GENE 1 77 - 877 206 266 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 [alpha proteobacterium BAL199] # 3 234 6 226 259 84 31 7e-16 MKTVLITGASSGIGYELSKLYAENGYNLVLAARRTENLSILKDDILKNISSNLLIEIISV DLSEERSAYELFDIIKSKGIEIDVLINNAGSGIYGEFAEYSSEQMQRNDRMINLNVKSVV ELTKLFLDDMIKKNEGSILNVSSIAGFQPGGPLMANYYASKAYILSFSEALREELRKTNV TISVLCPGPTATEFEKSSNLTGSGLFSKLKVMTAKEVAEIAYRDFTKKKRIIIPGFMNKL AVLGAKISPRRLTVRLVRKLQEIKDN >gi|261748018|gb|ADAD01000074.1| GENE 2 909 - 1805 966 298 aa, chain + ## HITS:1 COG:PA3301 KEGG:ns NR:ns ## COG: PA3301 COG2267 # Protein_GI_number: 15598497 # Func_class: I Lipid transport and metabolism # Function: Lysophospholipase # Organism: Pseudomonas aeruginosa # 31 295 30 302 316 105 27.0 9e-23 MEKRYFESFDNTKIPYLYFESNRGEQFKNNIIIFHGVAEPAERYEEFGNFLSANGYNVFI PEIRGHGELKEGEIGDFGKKGIQNIFGDINEFFEKELFPKGIKSENTTLFGHSMGALIVA NWVIENNYKYFILSGFPVKKNSELFGAKLLSFFERMLIFKKKSFFNKEFKKYNSFFAPNQ TKFDWLSRNNEECRKYEESELCGYALSPKVFAGIIKMMSFINKKYKKIRPDANMLIIHGT EDKAMDIDYVNKILSVLRKKKRRINVLANEGGRHESLNEINKYVIYDEILKWLNEREK >gi|261748018|gb|ADAD01000074.1| GENE 3 1912 - 2511 859 199 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0306 NR:ns ## KEGG: Lebu_0306 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 198 1 197 197 193 57.0 4e-48 MKSIKEDFSLMAILLIPVAVAINMAGFGIVKLLQLPIFLDTVGTIFISFLAGPWIGAIAA ILTSIVSGMFDPVYLAYIPTSVFLALVAGFLARFKFGNNVIIKIAVSSLILTTVAVVVSA 
PITVLVFGGSTGNMTSSITGLFLASGKQIWEAVISSTIITEFADKLVTVIICILILKSMS SRYLIKFKFGEKYINKSEK >gi|261748018|gb|ADAD01000074.1| GENE 4 2528 - 3277 599 249 aa, chain + ## HITS:1 COG:CAC3100 KEGG:ns NR:ns ## COG: CAC3100 COG0619 # Protein_GI_number: 15896351 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Clostridium acetobutylicum # 3 226 14 247 267 85 27.0 1e-16 MEKRITKKLYPTTNLLLAIVLIISAFIVPDYRYSYFIFIICGVIVYFYGKLNTYLKQVFR SLILLFIIIFIIQSILIPGKEVLIKLGFISIYKDGFMRAVNLTSKITAFVSAIAMFFQIT EIRDFVISLEKAGLNSKAAYVVMLTLQIIPETMKRSKIIMDSQKARGVETESNIFTRAKA FIPVFFPLILSSIEGTEERAITLEVRGFSSGAKKTRLYDIEKTSYDRIFRILLIIFLILC IIWRKLWNI >gi|261748018|gb|ADAD01000074.1| GENE 5 3244 - 4122 335 292 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 14 286 133 396 398 133 34 7e-31 MHYLEETMEYLKLKDINYKYPLSENKVLKNINLDIKKGEFWAVIGKNGSGKTTFCNLLRG FVPDFYKGELSGEITLENKKLSEYSQKEIVQKIGFVFQNPFTQISGVKETVFEEIAFGLE NLGFEVSYIKDKVNEILQLLGIENLKDKNPYELSGGQGQKVALASIIAMEPEILVIDEPT SQLDPQGTEEIFKIINMLAKKGKTIILVEHKTELIAEYAEKVIVFDEGEIILKGDTEEVL KNPILLEKQIGMPQYAVLAYKLEEKMPEKIKFDEIPVTRAATLKELKKIIES >gi|261748018|gb|ADAD01000074.1| GENE 6 4136 - 4954 313 272 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 8 248 134 375 398 125 32 3e-28 MSFIEIDKVSFIYPDGTTAIDDISLNIEKGEKVAIIGQNGAGKTTTVKMLNRLLKPSKGQ VIVDGWNTKNYTTAQMSRKVGYVFQNPMDQIFHSSVYDEIAFGPKKLKYSESEIKKVVEK AIEMTGLNDYKKENPYNLPYSVRKFVTVAAIIAMETEVIIMDEPTAGQDMKGIKILDELT RELMKQGKTVITITHDMEFVVNNFERIVVMANKKIIGDGNKKDIFKESEMLSQGKIKSPY ISDLANELSMDKSILTINDFVEEFSEKIKQKN >gi|261748018|gb|ADAD01000074.1| GENE 7 4988 - 5635 859 215 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0637 NR:ns ## KEGG: Lebu_0637 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 53 208 
21 176 182 63 31.0 6e-09 MKIKGIIGTIKGNKMKITVMFIFTMIFVSCYAIKNPNPKNKSTTAVEKEDIVNVDMEKVI EEINSEIRKTDKDKNLKRKVKVIKGLGKDSKSEYTGYRTKNNELRKFVVKHFAETGRLIQ EFYVKNGKIYYLYQKKMTYNRPIGWTKAKAKESGDTEYFNLEKSEVEEAKYYFDFDENLV RFISEDGSINDNQEILKVIKNDLKDEYLMNIQNIK >gi|261748018|gb|ADAD01000074.1| GENE 8 5704 - 6846 1566 380 aa, chain + ## HITS:1 COG:MA2718 KEGG:ns NR:ns ## COG: MA2718 COG1104 # Protein_GI_number: 20091542 # Func_class: E Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 2 375 4 376 392 294 42.0 2e-79 MKQVYLDNAASTKMMPKVIEKIAESYGEIYANPSSTHRLGQKAKGIIEKTRNIIAGCLGA EANEIIFTSGGAEGNNLILRGALNAYEYKGKHIITSKIEHSTVLKTCQQLEREGYEVTYI DVDKNGVIDIEQLKNSVRKDTVIVSIMYVNNETGVKQPIEEIGKILEGTSVLFHTDAVQA VGKEIILPKNVRISALTATAHKFYGPKGAGFIFLDKNFLVEKEIWGGSQERNRRAGTENI QGIIGLGTALEEVYKTMYEEKGKEDELHTYMENRLRNEIEKIKINGENAPRIKTITNLCI EGCDVQTLLIALDLRGIYVSGGSACMSGAHENSHVLKAMGLPEKELKSSFRISIGKYTTK EEIDYFIDNLKEIVKIERGE >gi|261748018|gb|ADAD01000074.1| GENE 9 6939 - 7448 594 169 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0460 NR:ns ## KEGG: Lebu_0460 # Name: not_defined # Def: Ser/Thr protein phosphatase family protein # Organism: L.buccalis # Pathway: not_defined # 1 169 37 205 205 269 81.0 3e-71 MDFIMSAGDVSNRYLDYLVSVLDKDIICVNGNHIYHKDFPITFAKVIDGKFIKYKGLRIL GLDGCKVYSFKEHQYTENQMKIKIFKNIFHLLKGVDIVLSHASPEGIHDVDDGVHNGFKI FNKVIKYFKPKLWIHGHIHLPNFMEHQDTQVESTTVSNTFGYRIFFVEK >gi|261748018|gb|ADAD01000074.1| GENE 10 7470 - 9329 1884 619 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0461 NR:ns ## KEGG: Lebu_0461 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 7 581 9 586 590 388 46.0 1e-106 MDNSIYLFEAEGAYKKFLKSSKGFLGLKKRENLKSFGEVQKNENAYNSVYLGIKEVPLSK IVGSVEKYTDFDKNFVPKNNIVKQRWMNIYTGYMAESMLPPVILYKIKDDYYVYDGNHRI SVAKFLNFVSVEAEVEEFLPSKDAADEMIYRESMVFEKETGIKDVILSNPLKYKNLKNEI RSYVNFIHKKKDENIDYKAAAENWNKNIFVPVKILIEKNDILKNFPDSNINDIFLFILDH 
KYYMSEKRDKNTGYFLSTVDFINRVKTNEKRSLSNNCKIEDEETLRACEKLRKIDYELIY SLEETEINEKLFKLTGIDFRYDRVLLEEVEKIGTPEKWYEENYKKITEYFYNKADKLPEK YSRYLQYFEENRIFGYIFEYKCCKNFFENENPEISVLNYIIEVFLLIISSFDDTVSEKEK IIYLYEKIQNQYFYLFRIEKRLVEEGKTTKYEKIIADNLLNIMSFKNEQGYYDIKGILIN RKYEEFLDNLKKPEEFLNIYKKYGESGKYETFTKLFEMLDILGEEKFLKKIKNDLKKMFL SDDILADYKMKDILTEFNNNLGKEKDFYNREKYSFIDFYADILSFTKETAKDEDNGNIDL DIDILDMEMYYREKEKIYI >gi|261748018|gb|ADAD01000074.1| GENE 11 9345 - 10133 993 262 aa, chain + ## HITS:1 COG:FN0391 KEGG:ns NR:ns ## COG: FN0391 COG0561 # Protein_GI_number: 19703733 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Fusobacterium nucleatum # 1 262 1 262 267 144 34.0 2e-34 MKIKAIFTDLDGTLLKNDHTVPENVKEKFKELEEKGVKIFISTGRSFKSSYPFVKELDIK TPVITYNGGRIADPVTEEVIYEKPVSKENVEKIIDISRKKGIHLNLYNDDELYIEEEDEE GTSYAKRVGIPYFLINFDEFRGKTSTKGLFLGDAEILTELKKELEKELSDVNFVFSQPTY LEVLNKDVNKGLAVTEMLKKYDISPDEAMAFGDQWNDLEMLKAVKYGYLMGNAVEELKNI FPEDRITSSNEEDGIYNILKDI >gi|261748018|gb|ADAD01000074.1| GENE 12 10167 - 13622 4014 1151 aa, chain + ## HITS:1 COG:FN1383 KEGG:ns NR:ns ## COG: FN1383 COG0587 # Protein_GI_number: 19704718 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Fusobacterium nucleatum # 1 1150 1 1131 1133 865 46.0 0 MGNEIVHLKVHTEYSLLEGVGKIEEYVERAKFMGIKTLAVTDTSMFGVIEFYKKCIKSGI KPVIGLEVFLDGIVSEGEYSLTLLAKNKKGYRNLSKLSSLSYSRFNRRRNKIKYEELLEY SSDLYILSGGIHSEIIKGISEYKYPEAKKVAVKLAKDFKENFFLEVPAVKRLESVRKSLK EMIKEISVGYLITNDVYYPNKGEAVLQKIMSSIKEGNKIETVQNDIFYDDLYLKSFDEIK ESFLQDEEFFDEGMKNTVILAENCNTDFEFDVFKFPEYELPEGISESFYIRKLVYEGIAK KYLKENIEITTEDQENEIRKKLIAKDREDVFKRADYELEVINSMGYNGYFIIVWDFIKFA KESGVYVGPGRGSAAGSLVSYALNITEIDPLKYNLIFERFLNPERISMPDIDIDFDQEQR EIVIDYVLNKYGNKYVAHIITFGTLKARAAIRDVGRVLNVSLKKVDKAAKLIPFNMDLEK ALDSVGSLREMYNSDNEIKTMIDYSLKIEGRVRHASVHAAGVVISKDVLDEEIPTYSDGK TPILSTQYQMKELEELGILKMDFLGLKNLTILRKTIENIEKNKGEILRTENINLENKKAY KLMTEADTLGIFQCESPGIRKLMKKLKIEKFEDITALLALYRPGPLQSGMVEDFIAAKNA 
EAKIKYPDESLKEILEETYGVILYQEQVMKIASEMAKYSLGEADQLRRAIGKKIPAIIEE NREKFVKKAQENNVSREKADRIYNLIDKFGGYGFNKSHSAAYALIVYWTAYFKANYPLEF FAAIMTTEMYNIERLSLFINEARGKGIEILVPDVNLSDYDFKVEESGIRFGLVAIKGVGG NFVREIMEERESGIFVSYEDFVYRMKQRGLNKKQLESLILSGSLDNLDGNRKEKTESINK VMEWSQKKYESEEDLQMILFGGKSKKIGDFQMDKTEEYSQSILLKNEREYLGIYVSSHPL DEKKDYFEMIEHTKISETEVRNKSGKFDKIRIMGVIKNLNKIVTKLSGEPMAKFELEDAT GAMEVICFPKDFIKFGYKIIEDAVVMIEGHVRSEGNKLSLVAGMINGLDDLEENKFLNLY ILIDDESKEKVSELKKIILKNKGNNQIFLAMNTGESKKEVIKLGSKYDVSLSKKIMTELI KLVGTKKIKIR >gi|261748018|gb|ADAD01000074.1| GENE 13 13690 - 14091 581 133 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1931 NR:ns ## KEGG: Lebu_1931 # Name: not_defined # Def: biotin/lipoyl attachment domain-containing protein # Organism: L.buccalis # Pathway: not_defined # 1 132 1 132 133 100 53.0 3e-20 MELRDIKELMKILKKEEMAEMKVKYGKIKLVLTNSEVSSKEIPQNETKKIEVIEKLENSA KEEVIKSKNVGQILLEKLEKGMEVYKGMKLARIRTIGIDTDIKSQHNGILKEILISDQSN VDYAKPLFVIELL >gi|261748018|gb|ADAD01000074.1| GENE 14 14147 - 15514 2306 455 aa, chain + ## HITS:1 COG:sll0053 KEGG:ns NR:ns ## COG: sll0053 COG0439 # Protein_GI_number: 16331500 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: Synechocystis # 6 449 3 446 448 558 61.0 1e-159 MGGAMFKKILIANRGEIAVRIIRAARELGIKTVAVYSEADADSLHVKLADEAVCIGPASS ADSYLKIPNIISAAQITGSEAIHPGYGFLAENASFAKICAQNNIVFIGPKPELINMMGDK ATARETAIKNKVPITKGSDGIVPDLDEAKKVAEWITYPVMIKATAGGGGKGMRIAHDEKE LSENFIAAQNEAKAAFGNPDVYIEKYVEEPRHVEIQVIGDKFGNVVHLGERDCSIQRRHQ KLIEEAPSAGIDAKTREKMGKFAAKLAKGIGYDSVGTLEFLVDKNMNFYFMEMNTRIQVE HTISEEITGVDLIKEQIKVAAGEKLSFSQKDIEISGHAIECRINAEDSENGFLPSSGTLE KYIPAGGIGVRVDSHSYQDYEIPPYYDSMIAKLVVKGKTREEAIKRMKRALKEFIIEGVD TTIPFHLKVLDNKEFNKGTIYTNFIETHFKDSLGK >gi|261748018|gb|ADAD01000074.1| GENE 15 15566 - 15976 601 136 aa, chain + ## HITS:1 COG:FN1619 KEGG:ns NR:ns ## COG: FN1619 COG1302 # Protein_GI_number: 19704940 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 118 1 115 122 
79 42.0 1e-15 MNELGNISISSDVVATIAESVITEIDGIYSLAGNAPKNEITKFFQSVSSGVNKGIDVEVG ETECTLDLYIIAKLGYQLPALAGEIQTKVVKAITEMTGLKVQEVNVYIQKVVKDTNEEDV PKISASVETATEDSEQ >gi|261748018|gb|ADAD01000074.1| GENE 16 15990 - 16643 1021 217 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1934 NR:ns ## KEGG: Lebu_1934 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 184 1 184 218 191 67.0 2e-47 MIGILAFLARLSIIIGLAGAAFAGISDLIMKTDYLVTVDNTVNLGSVRVQIVLVLLAILY LVIFLLSYINKFTKYSQNKKVKTKTGEIEVSIKTINEVAKDFLSSQEIIKNSKVKSHPKG RAVVIEAVVDTYNVNNLSDKLLKIQEKLSEYVSNSTGITVKKSKVKLKKVLGETIVEKKI IEAPKEEIKVVKADIEKPTEEVKETVVSEVTAAEDKE >gi|261748018|gb|ADAD01000074.1| GENE 17 16681 - 17091 766 136 aa, chain + ## HITS:1 COG:BH2785 KEGG:ns NR:ns ## COG: BH2785 COG0781 # Protein_GI_number: 15615348 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Bacillus halodurans # 1 130 1 127 134 69 30.0 1e-12 MTRREIREEIFKLLFEKELIDNDIDKRINEVIEENKIKKEEQIDFLKSYVTEIVEKEEIL VEEIKSILEGWTYERLGTLEKALLKIAFYEITIKNIGYEIAINEVLEIAKKYSYDDTKEF LNGILAQLVKKNNGTA Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:54:28 2011 Seq name: gi|261748016|gb|ADAD01000075.1| Leptotrichia goodfellowii F0264 contig00024, whole genome shotgun sequence Length of sequence - 283 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 283 359 ## gi|262037959|ref|ZP_06011378.1| filamentous hemagglutinin outer membrane protein Predicted protein(s) >gi|261748016|gb|ADAD01000075.1| GENE 1 1 - 283 359 94 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037959|ref|ZP_06011378.1| ## NR: gi|262037959|ref|ZP_06011378.1| filamentous hemagglutinin outer membrane protein [Leptotrichia goodfellowii F0264] filamentous hemagglutinin outer membrane protein [Leptotrichia goodfellowii F0264] # 1 94 1 94 94 127 98.0 3e-28 AVDNTEGLIKGREVRIGGNLTGNAKGKVESIGALTLDGKIIDNKNGTLKGNIKKIDANKF INDEGKILTNEKLDIAAKEGSNVRGEIFGSEGVK Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:54:39 2011 Seq name: gi|261748001|gb|ADAD01000076.1| Leptotrichia goodfellowii F0264 contig00178, whole genome shotgun sequence Length of sequence - 15277 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 7, operones - 5 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 435 356 ## PM0488 hypothetical protein 2 1 Op 2 . + CDS 454 - 915 583 ## gi|262037965|ref|ZP_06011383.1| hypothetical protein HMPREF0554_2253 3 2 Op 1 . + CDS 1033 - 3369 2910 ## COG3210 Large exoproteins involved in heme utilization or adhesion 4 2 Op 2 . + CDS 3383 - 4078 636 ## FN1816 hypothetical protein 5 2 Op 3 . + CDS 4161 - 5897 2081 ## COG3210 Large exoproteins involved in heme utilization or adhesion 6 2 Op 4 . + CDS 5900 - 6391 483 ## gi|262037973|ref|ZP_06011391.1| conserved hypothetical protein + Prom 6395 - 6454 8.8 7 3 Op 1 . + CDS 6564 - 8186 1943 ## COG3210 Large exoproteins involved in heme utilization or adhesion 8 3 Op 2 . + CDS 8186 - 8767 522 ## gi|262037972|ref|ZP_06011390.1| putative liporotein + Term 8927 - 8983 7.7 + Prom 8959 - 9018 9.7 9 4 Op 1 . + CDS 9138 - 10457 2153 ## COG1160 Predicted GTPases 10 4 Op 2 . + CDS 10478 - 11545 1255 ## COG0628 Predicted permease + Prom 11571 - 11630 3.2 11 5 Tu 1 . 
+ CDS 11661 - 12119 625 ## Lebu_0428 MarR family transcriptional regulator + Prom 12192 - 12251 8.7 12 6 Op 1 5/0.000 + CDS 12302 - 12664 383 ## COG0640 Predicted transcriptional regulators 13 6 Op 2 . + CDS 12697 - 14865 2488 ## COG2217 Cation transport ATPase + Prom 14917 - 14976 11.9 14 7 Tu 1 . + CDS 15012 - 15276 285 ## COG2870 ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase Predicted protein(s) >gi|261748001|gb|ADAD01000076.1| GENE 1 1 - 435 356 144 aa, chain + ## HITS:1 COG:no KEGG:PM0488 NR:ns ## KEGG: PM0488 # Name: not_defined # Def: hypothetical protein # Organism: P.multocida # Pathway: not_defined # 35 137 6 113 115 73 37.0 3e-12 EKQVEKSTLKYNNFDHISKGEVQVKGDKVKVTGGHSTQNGIVVEEKLQEFPGGSYEAKIK IPNPNNPNEYLSKSNNGGKSTMFPDHWTENRIKVEIDSILKNPNNKVGDNVWVGRSSTGV VIRVIYREGKVITAYPIDPKQIVK >gi|261748001|gb|ADAD01000076.1| GENE 2 454 - 915 583 153 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037965|ref|ZP_06011383.1| ## NR: gi|262037965|ref|ZP_06011383.1| hypothetical protein HMPREF0554_2253 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2253 [Leptotrichia goodfellowii F0264] # 1 153 1 153 153 278 100.0 1e-73 MKLIFKYVCHEGWVPKSYIRVCKSFRRVNEISGEINQKDFWDLVLKKNILNEIYLAEYLY DTDLNLAEYVLDKLNDKTIDDYDIGSQGWDVELEEDKVVIGQMCSSEDDKRAYINREEVA YAMLKWKHFLEREFDVPDYQEVIDTEDVYRGEK >gi|261748001|gb|ADAD01000076.1| GENE 3 1033 - 3369 2910 778 aa, chain + ## HITS:1 COG:FN1817 KEGG:ns NR:ns ## COG: FN1817 COG3210 # Protein_GI_number: 19705122 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 1 393 2046 2455 2806 253 42.0 1e-66 MRLSGFNQEGGKVAGSIENFVEESKQNTSRTTGSSSGVTIGVGSNGVPSSASISGSRTNG NRAYVDNQTTFILGEGSNLTIGKVENTAGIIGVEGNGKLKINEYKGKDLYNHDNLTTTGG SIGVDFGKGGAKVSGIGVNNENHRKEGITRHTVIGNVEIGSATGSPINRDRSKANETTRD DHSSTNVYIEGQTIDYATNPSKLKEDIGRAIKESLNDRGDDNRNFLGQLSEGRLQRTIEN 
IGGERLRKSTTQEDISKTLKDTYKDLGYDIEIIFTTPDKAPQLIDEKGKIKAGTAYVGED GKHTIIINTEAKENQDRAGLIGTITEEGSHVIGKVEGRQRKVPEGSEEKGLESTGRATND FFNKTYQEGNIPIQTKSDGKDYSNAKFGEHVGDERGACYKGVRIFINCKIPPTMADFRIL DQEHLKRDKNKLQKIREISKVEKKIEESKTPEEKAALKKRLDKLEGKNKNFSSFISGFKE ETKESIPFGVGIGIAEALHPVIALFFATSTPVNTNEEQLLHGIRKPQYSTKDAEYIKNNY PEMYPEIKGSINPKIKEFNKKAVDVSATKFYGKDEENANEFGKFLGSIVGFEAGRGAANT VINTGVKYYSSSSKSLFEKGHSEEIGKNLNDYASSKSNPNKRYTASESEAELTTRLKEMY KDKPNTTIETQESYKGGENVKYGIKGSVRPENSVVENGIVKETYEIKNYNQENYNNMTSN IGKQAVKRAEELPKTATQKIVIDTRGQKITPEIRQKITEDIIRKSNKIIKKENIIFME >gi|261748001|gb|ADAD01000076.1| GENE 4 3383 - 4078 636 231 aa, chain + ## HITS:1 COG:no KEGG:FN1816 NR:ns ## KEGG: FN1816 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 231 1 232 233 208 52.0 1e-52 MNKLTTKEFNDLYKRNFREYFDKPLKKDGFYKKGTINFYRINKLGMIETLNFQKHREELN VNCAILPIYCGSTNDSITIGLRLGKFMNTRYEHWWDIKDDESMKKSMQEMLNMIQKDLYE WFNKMENEKEIIDFISKSYSTVIYRYITQAASMAKFKRYDEILQYVEKVKKEYMESWSEE ERQKKEWLKKVLDEALLLERKLKEGKESIDQYIIEREKQSLIELGLEKLIK >gi|261748001|gb|ADAD01000076.1| GENE 5 4161 - 5897 2081 578 aa, chain + ## HITS:1 COG:FN0132 KEGG:ns NR:ns ## COG: FN0132 COG3210 # Protein_GI_number: 19703477 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 5 339 1920 2260 2462 302 53.0 1e-81 MNFPNGSRAYVDNQTTFILGEGSNLTIGKVENTAGIIGVEGNGKLKINEYKGKDLYNHDS LTTTGGSIGTSGAGISNENHRKEGITRHTVIGNVEIGSVTGSPINRDRSKANETTRDDHS STNVYVESQTIDYALHPAKFKEDVGIAVLEGSATVEGALKKIDNILRGDDNSDISQSEKR RYEEIKENIIRFKTAPDMKLIAEGDLSDPDVQKRLGIGGRFNPDDPNLPEKIKERIAQVR EQGQEINYFYDKVTGKIYINENADEDEIRAGIGREWGIRNEFDRGRTKPNGEGQEKGTVA GEIAYKEIEDRTKEKGDKKVNPYDFEYAKFDPNSEVTGDKKIVSVFNASAIWFDTKLKQE KVQQSFQETYGKKYSIDWEKYNSGRIVDMVYREKINYYYYQSLNSVKTIEEFKKGKFEAA DEKESVFHNIKGGKIYIGDESNKKFIEYETGKEVVLNNKLTKIIKDPVNIGTFNYYTYVL KGGGISIDKGMHGIDVYNWIRFGTGINDKSTFGERIDIFVLGTIVSTNYKELKEWSKNRK 
IEYIGYKELIEYDEDLKFKIEKERNRIKNRINRISGGK >gi|261748001|gb|ADAD01000076.1| GENE 6 5900 - 6391 483 163 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037973|ref|ZP_06011391.1| ## NR: gi|262037973|ref|ZP_06011391.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 163 1 163 163 268 100.0 9e-71 MKKLIVIVIIIILLFLKACVRHTDRNHYNISSGRFEGNVPNIEIYTSLENVCFESNNFPE NLILKKINLYFNGKIIGTMNINKNIHEMKFTNSTLKEYEMEEELLRILGKNNEKYEIATG YMQKGFVFEIFIKNIDTGEIYKAIRETSISFDKKGFYFWVPNI >gi|261748001|gb|ADAD01000076.1| GENE 7 6564 - 8186 1943 540 aa, chain + ## HITS:1 COG:FN1817 KEGG:ns NR:ns ## COG: FN1817 COG3210 # Protein_GI_number: 19705122 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 5 340 2103 2455 2806 231 45.0 3e-60 MNFPNGSRAYVDNQTTFILGEGSNLTIGKAENTAGIIGVEGNGKLKINEYKGTDLYNHDN LTTTGGSIGTSGAGISNENHRKEGITRNTVIGNVEIGSATGSPINRDRSKANETTKDTHR TTNINIESQTIEYALNPGKFKEDLGKAKDEIKDIGRAIKESLNDRGDDNRNFLGQLSEGR LQRTIENIGGERLRKSTTQEDISKILKDTYKDLGYDVNIVFTTSDKAPQLIDEKGNIKAG TAYVGADGKHTILINTEAKENQDRAGLIGTITEEGSHVIGKVEGRQRKVPEGSEEKGLES TGRATNDFFNKTYQEGNIPIQTKSDGKDYSNDNFGEHVGNQFIIQNAPYVREYLLPKNLK NEGYKYPVLYIDEEYPKEYNRLMKLPKVIDGFEKNLISIENDLIYLISKKNLKPNESYEI PKHPSNMMVPYRLYGEKKELVDLTLAAGFAWGHTHVRTQVKNVIINRTNSGYKINITYVE RLEDTFDDVLDTFNMWKGNQDMPGSKPFIMKTSNKKHSVEVEVKNLREISSELIKKRNKR >gi|261748001|gb|ADAD01000076.1| GENE 8 8186 - 8767 522 193 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037972|ref|ZP_06011390.1| ## NR: gi|262037972|ref|ZP_06011390.1| putative liporotein [Leptotrichia goodfellowii F0264] putative liporotein [Leptotrichia goodfellowii F0264] # 1 193 1 193 193 358 100.0 1e-97 MRRIIKVWFFISILIFLMYIIWACYRYLNPEIHIFKSKKYNLKIIKEGTFGDDWISQNSF YYIDYVLKEDKNINEKKQSIIYEIGCSKLEKVGFDEKKGIFYGKLDFHSGVVSFRKNETD IKKLCSGYFILDLNTVGYESRMTEKKFEEKLLERGIRNDIIDTKKFVKKYGRLVYCLKIS NSAQCDLPNWMFR 
>gi|261748001|gb|ADAD01000076.1| GENE 9 9138 - 10457 2153 439 aa, chain + ## HITS:1 COG:FN0170 KEGG:ns NR:ns ## COG: FN0170 COG1160 # Protein_GI_number: 19703515 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Fusobacterium nucleatum # 1 438 1 440 440 555 64.0 1e-158 MKQVVAIVGRPNVGKSTLFNKLVGDRLSIVKDEPGVTRDRLYRETEWSGKNFILVDTGGL EPKTDDFMMRKIKEQAQVAIDEADVVIFLVDGKAGITGLDEDIANILRKKDKKVVVAVNK IDNYMREQDNILEFYALGFEEVIGISGEHKINLGDLLDAVIAKFDNKKEKQTEEGLCIAI LGRPNAGKSSLINKLLNEERSIVSDIAGTTRDAIDSTLKYDGEIYTLIDTAGIRRKSKIE DSIEYYSVLRAVKSIKRVDVCVLMLDATELLTDQDKRVAGLIYEEKKPIIIAVNKWDLIE KDNNSVKHFTELVKADLPFLDYAPIITMSALTGKRTVSILEQAKFINEEYHKKITTGLLN QILAEMIAQNPVPTRKGRAVKINYGTQVGQAPPKFVFFSNNPDLIHFSYQRYIENKLREY FGFEGCPISIVFNKKNENY >gi|261748001|gb|ADAD01000076.1| GENE 10 10478 - 11545 1255 355 aa, chain + ## HITS:1 COG:BS_yrrI KEGG:ns NR:ns ## COG: BS_yrrI COG0628 # Protein_GI_number: 16079795 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Bacillus subtilis # 16 346 12 347 353 119 27.0 6e-27 MDFYNEEKFKKWKNILVVTVLGLLTVLLFFKVYVYFEKSVQIITGTLFPFILSFVIVYSL MPFIDMLNENLKINRKVAISLVLLIFFIFFIYVILAFIPLVAGQLSSLIEFFIKNQENLQ KNLISFMAQNNINIRDSIINSKEVIFTNFLKVLNSSVSLLTGTFSFLFMTPIFTIMLIFS YDNIDEGIKNLLRKFDREEWIPLIKQMDDAIGKYIKVTVLDSLIVGICSYVIFFFLKMEY SSLFSMIIGAGNVIPFIGPFIGLIPVILYAATKSFKLVVIIIVLITILQTIEANIIKPWL TSKSVDIHPITTLLVVLIGGALFGIGGAFIAIPVYIVLKLGVIFYFEKLKFNKEN >gi|261748001|gb|ADAD01000076.1| GENE 11 11661 - 12119 625 152 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0428 NR:ns ## KEGG: Lebu_0428 # Name: not_defined # Def: MarR family transcriptional regulator # Organism: L.buccalis # Pathway: not_defined # 5 151 3 149 161 176 61.0 2e-43 MEKCSKGETQINNCLYFTISRLFRVVNKVAEESFGRMGICPTHAFLMVLLEEEKNGLSVN KISEALTIAPSTVTRFVDKLVLKGYVERIKSGKQSFTKITDAGLEAMPEIYKSWEIMFGK IERMAGDKEYLGEILKQIRDLTDILEENQKYL >gi|261748001|gb|ADAD01000076.1| GENE 12 12302 - 12664 383 120 aa, chain + ## HITS:1 COG:CAC2242 KEGG:ns NR:ns ## COG: CAC2242 COG0640 # 
Protein_GI_number: 15895510 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 8 119 11 122 122 145 65.0 2e-35 MEKKLPACEGTIIHKEVVEEIKTKIPEEEVLYELGDFFKLLGDSTRIKILSALFHSEMCV CDIASLFDMTQSAISHQLRVLKQGRLVKYRKSGKVVYYSLDDEHVKEIVEQGLNHITEKK >gi|261748001|gb|ADAD01000076.1| GENE 13 12697 - 14865 2488 722 aa, chain + ## HITS:1 COG:BS_yvgW KEGG:ns NR:ns ## COG: BS_yvgW COG2217 # Protein_GI_number: 16080402 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Bacillus subtilis # 9 721 10 700 702 719 53.0 0 MSQRIILNVKGLHCANCAAKIEKKLNEMQNVKEARIDLVGEKIFLTAEEKNGSVLVNAVQ KIADSIEEGVKIFLPSKHNHAHSHDHAHDHSHHHSGEVGEMIKILVIGGIFYFAASYPNL FSAYLSPDTVTFAKIILFLISYIIIGGHVLTTAFKNILKGQFFDENFLMAVATIGAFAIG EYHEAVAVMFFYQVGELFQSMAVNKSRKSIASLMNIRPDYANLKVRNNITKVSPEEVKIN DFIVVKPGEKVPLDGIITEGSSSFDTSALTGESLPVSKNIGEEVLSGSVNKTGLVTIKVT KLFSESTVSKILDLVENANSKKSKTENFITKFARYYTPTVVGIAVLMAIIPPLVLKETTF YEWLYKALVFLVLSCPCALVISIPLGFFGGIGGASRHGILIKGGNYLEALNNVETVVMDK TGTLTKGVFKVTQINPENNITKEELLQYAAYAESFSTHPIAESVIKEYQKNNLQIDKSLI KNYEEISGYGIKTEVNGKSVIAGNIKLMNLENIKTENNSQTGTVVYVAIDGKYAGNLLIS DEIKEDSLRAIEDMKKIGVKKTVMLTGDSKAIGESIAEKLNIDKAYTELLPSDKVEKIEE IFEERKSNGKILFVGDGINDAPVLARADIGVAMGGVGSDAAIEAADVVIMNDEPSKIVTA IKIAKKTRTVVWQNIALALGVKIITLILGVMGFATIWAAIFADVGVALIAILNAARVIRM KV >gi|261748001|gb|ADAD01000076.1| GENE 14 15012 - 15276 285 88 aa, chain + ## HITS:1 COG:rfaE KEGG:ns NR:ns ## COG: rfaE COG2870 # Protein_GI_number: 16130948 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase # Organism: Escherichia coli K12 # 12 86 322 396 477 90 57.0 9e-19 MSKIKGEFRKNLLTQEEMKEKIEELKKNGKKTVFTNGVFDILHIGHLTYLEEARNLGDVL IVGVNSDASVKVNKGDKRPINSQKNRAE Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:55:17 2011 Seq name: gi|261747997|gb|ADAD01000077.1| Leptotrichia goodfellowii F0264 contig00224, whole genome shotgun 
sequence Length of sequence - 878 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 367 420 ## FN2115 hypothetical protein + Prom 403 - 462 3.4 2 2 Tu 1 . + CDS 549 - 876 312 ## gi|262037976|ref|ZP_06011393.1| hypothetical protein HMPREF0554_2441 Predicted protein(s) >gi|261747997|gb|ADAD01000077.1| GENE 1 2 - 367 420 121 aa, chain + ## HITS:1 COG:no KEGG:FN2115 NR:ns ## KEGG: FN2115 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 27 119 51 150 151 78 46.0 8e-14 RTSKTREKNGGGAIDNEKYESGVKEAIKDVAKRPLNKKEQIDGLILILPENTSINTKLGN VIDLKTGYGLPIIISNNRACVEKKIRDNLYYGINYDKYVSGIKEIVQNIIKANGFTKTCS K >gi|261747997|gb|ADAD01000077.1| GENE 2 549 - 876 312 109 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037976|ref|ZP_06011393.1| ## NR: gi|262037976|ref|ZP_06011393.1| hypothetical protein HMPREF0554_2441 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2441 [Leptotrichia goodfellowii F0264] # 1 109 1 109 110 184 100.0 2e-45 MKKWLNDGHGAYTYYSALLAKDIDGLLRKYQNAEDETAKIQEDIENKYIENQERLMELFT RGPALMNDSEGLLKGLGEYNRVENERAYREGKYQGKNLPQNYITNSLNE Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:55:28 2011 Seq name: gi|261747995|gb|ADAD01000078.1| Leptotrichia goodfellowii F0264 contig00187, whole genome shotgun sequence Length of sequence - 1902 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 1900 2939 ## COG3210 Large exoproteins involved in heme utilization or adhesion Predicted protein(s) >gi|261747995|gb|ADAD01000078.1| GENE 1 1 - 1900 2939 633 aa, chain - ## HITS:1 COG:FN0290 KEGG:ns NR:ns ## COG: FN0290 COG3210 # Protein_GI_number: 19703635 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 308 529 37 256 727 62 26.0 2e-09 GVEIVPGTEISSRYEEHLSKGLKFTANKGGVFAGVEKGKNEISTQTIKNVGSVINSKGGG VKVEGDHIISVGSKIGAAGDIILDGKNGVVLKDGENFASIREQNEKIRAGLFATANGKKL SASAGVEAAYSKDYNGRTMITPEKNVLVTNKNILIKSDEGNVLLQGDFGAKENIGVFAEK GKIYIKDSTSEILTDSQSVNARVALAVSLNLGGIKDTFKSYKDYFKAIKELPNIGRVGSF IKDIAKGKDLMESLEGKEDTINALNNYFKGPSSGGVHAGLDLEASVSQGKSASKYIQNIV ANIRAGKDIVLKSNELEIYGGIIKGDNDVFLESNKIKIHGSEDSYENKSRNISVSGKYGI WGTNSGNIGGNVGYSQSKSEGKTYNNSQIIAGNKLYSRSDELIIKGGNLKGKETDVETKK LVLESLQDTSKYREIGVGIGVNAGKGKEKYTYSGNGEVTVVKGDKAWVGTQSSITGTEKI RVKADEGYLKGGLIGNIDENGKDRGNLTVEIKKLLTEDIESKDKQVGVGIRASITEREKV PTDQQIYLKDENGNPIKGKEDKVGTDYGASIEGHDREKVTRATIGNGVLTVGEQNGPLNR DVYRADEILKDVNVKKVNVDYVETKKGWGEIGG Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:55:29 2011 Seq name: gi|261747993|gb|ADAD01000079.1| Leptotrichia goodfellowii F0264 contig00223, whole genome shotgun sequence Length of sequence - 916 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 71 - 916 481 ## COG2801 Transposase and inactivated derivatives Predicted protein(s) >gi|261747993|gb|ADAD01000079.1| GENE 1 71 - 916 481 281 aa, chain - ## HITS:1 COG:pli0076 KEGG:ns NR:ns ## COG: pli0076 COG2801 # Protein_GI_number: 18450358 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Listeria innocua # 19 272 6 257 269 119 32.0 8e-27 DSKKFGLRWLLRRFDIYPNSYYNFLKNRKEQQIKKKDDIKSSIKEIYHSYNGILGHRMIQ VLLFKLKNIKISKTTVHKYMNRELHLYSITRKKRPEYTQTKKHKIFDNLLRQNFRSSLPN TVWCTDFTYIRLVDGSFRYNCTIIDLYDRRVVVSITDKSITSELAIRALEKALNNLSKNR KKIILHSDQGSQFTSKKFVEYCEGHKIIQSMSKAGCPYDNAPMERYFNTLKSELLNHHHY KDEKKLYNSIEEFAYVWYNHVRPHSYNGYKTPYEARYNKVG Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:55:30 2011 Seq name: gi|261747990|gb|ADAD01000080.1| Leptotrichia goodfellowii F0264 contig00104, whole genome shotgun sequence Length of sequence - 3483 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 1608 1612 ## COG2831 Hemolysin activation/secretion protein 2 1 Op 2 . 
+ CDS 1620 - 3483 2234 ## COG2849 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|261747990|gb|ADAD01000080.1| GENE 1 1 - 1608 1612 535 aa, chain + ## HITS:1 COG:PA0040 KEGG:ns NR:ns ## COG: PA0040 COG2831 # Protein_GI_number: 15595238 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Hemolysin activation/secretion protein # Organism: Pseudomonas aeruginosa # 44 535 91 562 562 179 25.0 1e-44 QEEIKEPDANVDTEGLPKFHVKKITLKRPTSSIKPLVNPKKIDKIINEYKDTEINIFDLR ALVKKLNDEYMKKGYITTRVYLEPDQNIQSSGEVKLVVLEGKIEEVVLDKDTGKDKRKIF FAFTNENGKVLNISHIDNGIDNLNRVESNNSKINIVPGTKQGYSKIIIESQKEKPFRFIL NYEDTQTEKQRYKATIEYDNLFGTNDNIYLSYRGDMGKLAKRRNHTDDYSQSYSFGYSFP FKSWSFGLSYNSTKDKSLILGNTTEYTSISKSRQYSLNTSKLLYRDANMKLNLTMGLDVK RERTYLAERRLETQDRNITAGSIGINGMFKPFKGIASYSLSYSKGLKGFRANEDNSFNAG TLPTENIEPSDNRYQFGKVNLNLSYYKPFYFKNQGITLRASFNGQYSKDALFSTEKYSIG GFDTVKGFPVSVSGDMGYSTKLELSYILPSGESKLGQFMYKIRPYMEADLGKVRNNYNEY GDKKGRIGTLSSYSLGLRYYGEKVTLDAGIAKTDKGRSLTKAESHRGYVTVSTTF >gi|261747990|gb|ADAD01000080.1| GENE 2 1620 - 3483 2234 621 aa, chain + ## HITS:1 COG:FN1514 KEGG:ns NR:ns ## COG: FN1514 COG2849 # Protein_GI_number: 19704846 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 144 621 5 502 503 128 27.0 3e-29 MKKILMFLTLLILAISCGTKQDEDVNKTKPQGVEVEDLQVIDEKLYEYGEEKPYSGKVIT RDEDDKIVMIETSKDGSIEGEVKTYYETGKLKEVYNVKDDKIEGKYNWYGKDGSIEINAT YKNNERETETAISTKDKKPYTGTYTETYANGNVSQVIKFNEGKRDGETIYYYDNGKIKER IPYTQGLREGNYFYYNKVGEVIGKGAFVNDKREGQWLVYDEEDKTLIEKIYSNNLEEGPY KVYFEDGKVRAEGTYKGGKLDGMYKAYYLNGNVETEVNYVDGKREGAYKINYENGKVRES GNYKNDKLVGDIAVVYDTGVKLADLHYTQDGKKTGKWIYLYPSGKVQQEFTYENDKPVGN YKKYYESGKVSEEGNYKNGLLEGEVKLYYENGQLASKVNFKRNSKEGEARSYYENGKEKE KGTFKHNKYEGKVNVYYDDGQVAVDQTFKNGKLDGSYKEYYKGNKPKVTATYVNGKEEGE YTVYYESGQKQVVSKFKEGLPEGEWVYYYQNGKESKKMNFVKGLKDGKQSEYYESGNKKL ESEFKNGKESGTWTVYFDNGKISTTFSYLDGQLNGPVVINDDKGVKIVEGNYKGGKEDGK WIFYDESGKVKKEEVYVLGKK Prediction of potential genes in microbial 
genomes Time: Thu Jul 14 02:55:32 2011 Seq name: gi|261747983|gb|ADAD01000081.1| Leptotrichia goodfellowii F0264 contig00124, whole genome shotgun sequence Length of sequence - 5040 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 211 232 ## PROTEIN SUPPORTED gi|229547905|ref|ZP_04436630.1| acetyltransferase including N-acetylase of ribosomal protein family protein + Term 271 - 304 1.3 - Term 125 - 162 -0.6 2 2 Tu 1 . - CDS 307 - 1542 1925 ## COG2309 Leucyl aminopeptidase (aminopeptidase T) - Prom 1623 - 1682 8.3 - Term 1657 - 1719 18.6 3 3 Op 1 11/0.000 - CDS 1769 - 3127 2152 ## COG3037 Uncharacterized protein conserved in bacteria 4 3 Op 2 13/0.000 - CDS 3142 - 3426 525 ## COG3414 Phosphotransferase system, galactitol-specific IIB component 5 3 Op 3 . - CDS 3430 - 3864 450 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) - Prom 3891 - 3950 3.7 - Term 3889 - 3951 4.3 6 4 Tu 1 . 
- CDS 3965 - 4948 1458 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 4976 - 5035 3.8 Predicted protein(s) >gi|261747983|gb|ADAD01000081.1| GENE 1 2 - 211 232 69 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229547905|ref|ZP_04436630.1| acetyltransferase including N-acetylase of ribosomal protein family protein [Enterococcus faecalis ATCC 29200] # 2 61 121 180 204 94 65 2e-19 CKAVIEQLKKDGIPYITATHDINNPRSGEVMKKLGMSYQYSYEEQWQPKDIKVTFRLYRL NFINKNKFF >gi|261747983|gb|ADAD01000081.1| GENE 2 307 - 1542 1925 411 aa, chain - ## HITS:1 COG:BH2245 KEGG:ns NR:ns ## COG: BH2245 COG2309 # Protein_GI_number: 15614808 # Func_class: E Amino acid transport and metabolism # Function: Leucyl aminopeptidase (aminopeptidase T) # Organism: Bacillus halodurans # 1 411 1 410 410 448 53.0 1e-125 MKNLDLKLDKYAQVIVKIGANVQPGQRVWINCTTDSLPLVYKVTEASYKAGASDVHVKLT DDRLTRMHAEYQPTEVYSHVPQWLVDERNEYLDNNVVFIHILSSSPNLLSGIDPKKLGTY TKNMGEAFKYYRSRVMTDANSWTLASYPSADWAKLVFPEEKDSEKAQEKLLDAILKTVRV DREDPVKAWEEHHKILSEKADYLNKKAFSALHYTAPGTDLTVGLPKNHVWIAAGSKNLKG DHFMPNMPTEEVFTAADKYRIDGYVSNKKPLSYQGNIIDNFKLTFKDGKVVDFQAETGYD ILKQLLDTDEGAKSIGEVALVPHDSPISNSGLLYYQTLFDENASNHLALGAAYPTNVKGG KDMSEEELEKAHLNQSITHVDFMIGDAEMDIDGINEDGTREPVFRKGNWAF >gi|261747983|gb|ADAD01000081.1| GENE 3 1769 - 3127 2152 452 aa, chain - ## HITS:1 COG:SPy1949 KEGG:ns NR:ns ## COG: SPy1949 COG3037 # Protein_GI_number: 15675750 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pyogenes M1 GAS # 1 373 1 376 411 464 68.0 1e-130 MKEILLFLRDVLSQPAILLGLVSFVGLIALKKPGHKVLTGTLGPILGYIMLGAGADFIVS NLDPLGKMIEKGFKITGVVPNNEAVVATAQNLLGKETMLILVAGLFVNLIVAKFTKYKYI FLTGHHSFYMACLLSAVLGAMGFSNVALVIIGGFFLGTWSAISPAIGQKYTLKVTDEEEI AMGHFGSLGYYISAWIGGKVGNPENTTENMKIPEKWGFLRNTTIATAITMIVFYLIAGIA AGNEYVSTLSNGVSPYLYVIMLGLKFAVGVAIVYNGVRMILADLIPAFQGIATKIIPNSI PAVDCAVFFTFAQTAVIIGFIFSFIGGIIGMIILGITGGVLIIPGLVPHFFCGATAGIYG NSTGGKRGAMIGAFINGLLLSFLPAALLPVLGKLGFANTTFGDVDFTVIGIVLGTINNFM 
GHLGVYIIIAIIMIVLIIPNFVKTKSETINNI >gi|261747983|gb|ADAD01000081.1| GENE 4 3142 - 3426 525 94 aa, chain - ## HITS:1 COG:SPy1950 KEGG:ns NR:ns ## COG: SPy1950 COG3414 # Protein_GI_number: 15675751 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIB component # Organism: Streptococcus pyogenes M1 GAS # 1 90 1 89 94 85 52.0 2e-17 MLRILTVCGNGIGSSLMLAMKIEEICKEEGMDNITVESSDFNGAQSKEADVIVTVKEIAE QFSENKKVIVIRSYTNKKKIKEDILEELRQYYNN >gi|261747983|gb|ADAD01000081.1| GENE 5 3430 - 3864 450 144 aa, chain - ## HITS:1 COG:SPy1952_2 KEGG:ns NR:ns ## COG: SPy1952_2 COG1762 # Protein_GI_number: 15675752 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Streptococcus pyogenes M1 GAS # 8 142 3 144 154 102 39.0 2e-22 MLYEILKDNIIMQENVSDWEESIKKASDPLEKTGKIEKRYIEAMINDIKQIGFYVVITDK VAMPHSRPENGVKETALSLLKLEKPVNYGEHEVNLVFILAAENKEKHIGVLKELSDFLDS DERIDLLINAKSIEEIEKFLKNRR >gi|261747983|gb|ADAD01000081.1| GENE 6 3965 - 4948 1458 327 aa, chain - ## HITS:1 COG:alsB KEGG:ns NR:ns ## COG: alsB COG1879 # Protein_GI_number: 16131914 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli K12 # 35 327 18 311 311 309 60.0 6e-84 MNRFLKSILMLLGLFLLVSCGGGSKEAEKTESAKSDSGEGKAEVAIILKTLSNPFWVSMK EGIEKEAAAQGIKVDIFAANSEEDVQEQLKLLENLLGKGYKAIGVAPLSPVNMIPGIVEA NKKGIYVVNIDEKIDMNQLQASGGSVLAFVTTDNVKVGAKGAEFIISKLPDGGEVAIIEG KAGNASGEYRKQGATEAFKANSKIKLVASQPADWDRSKALDLAANLIQKYPNLKAIYAAN DTMALGALQAVVNANKQNQIIVVGTDGAPEALDSIKQKGLSATVAQDSANIGAESLKILL EAIKNKPEISPSATPKEVPVESKLVTQ Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:55:44 2011 Seq name: gi|261747946|gb|ADAD01000082.1| Leptotrichia goodfellowii F0264 contig00131, whole genome shotgun sequence Length of sequence - 31754 bp Number of predicted genes - 38, with 
homology - 36 Number of transcription units - 14, operones - 8 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 990 1109 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D - Prom 1075 - 1134 6.5 - Term 1013 - 1046 -0.9 2 2 Op 1 . - CDS 1144 - 1809 978 ## COG0546 Predicted phosphatases 3 2 Op 2 . - CDS 1835 - 2056 370 ## gi|262038013|ref|ZP_06011425.1| conserved hypothetical protein - Prom 2107 - 2166 9.3 + Prom 2060 - 2119 11.3 4 3 Tu 1 . + CDS 2202 - 2576 503 ## Lebu_0663 hypothetical protein 5 4 Op 1 . - CDS 2698 - 3006 455 ## pE33L466_0017 hypothetical protein 6 4 Op 2 1/0.250 - CDS 3032 - 4117 1603 ## COG1435 Thymidine kinase - Prom 4148 - 4207 7.9 7 4 Op 3 . - CDS 4215 - 5123 1331 ## COG0083 Homoserine kinase - Prom 5181 - 5240 10.0 + Prom 5140 - 5199 12.1 8 5 Tu 1 . + CDS 5343 - 5906 582 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) - Term 6044 - 6077 2.2 9 6 Op 1 . - CDS 6087 - 7214 1448 ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog) 10 6 Op 2 1/0.250 - CDS 7278 - 8054 926 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) 11 6 Op 3 1/0.250 - CDS 8094 - 9113 1411 ## COG0240 Glycerol-3-phosphate dehydrogenase 12 6 Op 4 . - CDS 9141 - 9761 839 ## COG0344 Predicted membrane protein 13 6 Op 5 . 
- CDS 9764 - 11224 1781 ## COG0534 Na+-driven multidrug efflux pump 14 6 Op 6 5/0.000 - CDS 11249 - 12385 1267 ## COG0763 Lipid A disaccharide synthetase 15 6 Op 7 5/0.000 - CDS 12378 - 13184 945 ## COG3494 Uncharacterized protein conserved in bacteria 16 6 Op 8 25/0.000 - CDS 13184 - 13963 1135 ## COG1043 Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase 17 6 Op 9 4/0.000 - CDS 13997 - 14422 835 ## COG0764 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases - Prom 14469 - 14528 9.8 - Term 14521 - 14550 -0.7 18 6 Op 10 1/0.250 - CDS 14680 - 15519 1287 ## COG0774 UDP-3-O-acyl-N-acetylglucosamine deacetylase 19 6 Op 11 . - CDS 15534 - 17735 3042 ## COG0210 Superfamily I DNA and RNA helicases - Prom 17828 - 17887 14.2 20 7 Op 1 . - CDS 17897 - 18253 398 ## gi|262038014|ref|ZP_06011426.1| TM2 domain-containing protein - Prom 18297 - 18356 11.9 21 7 Op 2 . - CDS 18367 - 19605 1503 ## COG0053 Predicted Co/Zn/Cd cation transporters - Prom 19634 - 19693 11.7 + Prom 19595 - 19654 6.8 22 8 Tu 1 . + CDS 19674 - 19940 56 ## + Term 20034 - 20075 -0.2 - Term 20041 - 20077 2.3 23 9 Op 1 . - CDS 20183 - 21445 1446 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 24 9 Op 2 1/0.250 - CDS 21445 - 22203 965 ## COG0284 Orotidine-5'-phosphate decarboxylase 25 9 Op 3 19/0.000 - CDS 22223 - 22678 631 ## COG1781 Aspartate carbamoyltransferase, regulatory subunit 26 9 Op 4 . - CDS 22689 - 23624 1165 ## COG0540 Aspartate carbamoyltransferase, catalytic chain 27 9 Op 5 . - CDS 23675 - 23872 103 ## gi|262037997|ref|ZP_06011409.1| ribosomal protein L22 - Prom 23906 - 23965 7.8 - Term 24119 - 24151 -0.2 28 10 Op 1 . - CDS 24277 - 25641 1566 ## COG1032 Fe-S oxidoreductase 29 10 Op 2 . - CDS 25717 - 26130 537 ## gi|262038028|ref|ZP_06011440.1| conserved domain protein - Prom 26238 - 26297 5.9 30 11 Op 1 . - CDS 26350 - 26847 339 ## gi|262038000|ref|ZP_06011412.1| 8 kDa glycoprotein 31 11 Op 2 . 
- CDS 26844 - 27299 600 ## gi|262038015|ref|ZP_06011427.1| hypothetical protein HMPREF0554_2074 - Prom 27326 - 27385 14.0 32 12 Op 1 . - CDS 27445 - 27555 85 ## 33 12 Op 2 . - CDS 27622 - 28251 964 ## COG0461 Orotate phosphoribosyltransferase 34 12 Op 3 . - CDS 28277 - 28942 864 ## Lebu_1655 hypothetical protein 35 12 Op 4 . - CDS 29014 - 29448 341 ## TDE0497 hypothetical protein 36 12 Op 5 . - CDS 29470 - 29823 323 ## COG0789 Predicted transcriptional regulators - Prom 29893 - 29952 8.1 + Prom 29718 - 29777 5.6 37 13 Tu 1 . + CDS 29905 - 30693 846 ## COG0300 Short-chain dehydrogenases of various substrate specificities + Term 30725 - 30755 0.6 - Term 30713 - 30742 0.4 38 14 Tu 1 . - CDS 30752 - 31726 1221 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein Predicted protein(s) >gi|261747946|gb|ADAD01000082.1| GENE 1 3 - 990 1109 329 aa, chain - ## HITS:1 COG:FN0751 KEGG:ns NR:ns ## COG: FN0751 COG0252 # Protein_GI_number: 19704086 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Fusobacterium nucleatum # 8 328 4 308 336 318 51.0 8e-87 MKSENSGKILIINTGGTISMVHSDGNDNKSVLKPSKSWKEVIQNYQFLESLNIDYAQTSK IIDSSDMDYKIWLEIGKIIEENYAAYKGFVILHGTDTMSYTASVLSFMLKNLNKTVILTG AQRPIREMRSDGLQNLLTSIEIIEKQADNPEFKNDEECLPVVPEVCIFFRDHLFRGNRSR KLDSTNYFGFSSPNYLPLGQAGSKIKVYEERLLPQSDRQFYVDYQISTDVLMMDVFPGFN PQILKKIFEDNSNIKGLVLRTYGSGNTPQNEEFLETIRYIINSGVIILNITQCTVGSVEM GLYESNAVLTQLGVVNGYDMTPEAAITKF >gi|261747946|gb|ADAD01000082.1| GENE 2 1144 - 1809 978 221 aa, chain - ## HITS:1 COG:BB0676 KEGG:ns NR:ns ## COG: BB0676 COG0546 # Protein_GI_number: 15595021 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Borrelia burgdorferi # 1 194 1 194 220 89 35.0 6e-18 MKYDLVLFDLDGTLVNTVEAISKSVNCAMEELGLKTYSVEYCYNLIGHGVAGIIDRVFEL 
EKYNPEELNKEKAKEVVRKYYKKYFDYNVFLYPEIDKLMNFLEKNNIKKGIVTNKDQELA TATVKSHLEKWSYADIIGSDDKNYPRKPDSYGVDKISKELGIPKNRILYVGDMEVDVKTA ENSGTDIVYCNWGFGENKKEKNIPENIKVSSVDELIAKIKG >gi|261747946|gb|ADAD01000082.1| GENE 3 1835 - 2056 370 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038013|ref|ZP_06011425.1| ## NR: gi|262038013|ref|ZP_06011425.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 73 1 73 73 99 100.0 7e-20 MEDIVKKINEFSKIAKERELTEEEAKEREKYRRMYIDKFKESVRGHLDSIKVIRVDDDGN PIDEEGNIIPDQA >gi|261747946|gb|ADAD01000082.1| GENE 4 2202 - 2576 503 124 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0663 NR:ns ## KEGG: Lebu_0663 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 123 1 124 129 188 75.0 6e-47 MFCTLICCMDGRFIHIINNYIRNNYRYDYVDTITDAGPVNKIVYEDYLQAVEDKIVLISI NKHKSDHIFVAGHSDCAGCPIDDETQKGYIIQSVKMIHEHLPDIAVTGLFVEESGDIEVL IDFL >gi|261747946|gb|ADAD01000082.1| GENE 5 2698 - 3006 455 102 aa, chain - ## HITS:1 COG:no KEGG:pE33L466_0017 NR:ns ## KEGG: pE33L466_0017 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_ZK # Pathway: not_defined # 2 102 22 122 122 140 73.0 2e-32 MGNESYFDGGLLTFIGVIILGAIITSITFGICYPWALCLVYGWKINHTVIKGRRLKFNGT AIGLFGNWIKWFLLTVITLGIYGLWLHIKLEQWKVKNTSFEN >gi|261747946|gb|ADAD01000082.1| GENE 6 3032 - 4117 1603 361 aa, chain - ## HITS:1 COG:BS_tdk KEGG:ns NR:ns ## COG: BS_tdk COG1435 # Protein_GI_number: 16080759 # Func_class: F Nucleotide transport and metabolism # Function: Thymidine kinase # Organism: Bacillus subtilis # 4 199 2 195 195 216 52.0 7e-56 MEFHLKSNMGTLEVVTGSMFSGKSEELIRRLRRAKYAKQKVIVFKHAIDNRYGEKGIFSH GNDSLEAYPASSTEQMEEIMVKNNDAEVIGIDEVQFFGQSVVDFCKKYVEYGKRVIVAGL DMSFRAEPYHPMPELMGIADQVDKLHAICTVCGKPAYASQRLIDGEPAYYDDPLVMVGAS ENYEARCRRHHIVKYRENKMGKIYFMVGTDIGVGKKSVEKLYSKNILKDKKTKTVVITSG IEKYENNENKLSELREEIEKELMRNDFVFVRIVRGLLLPIEGNYNVLDFMCEYRKNSEII LVSQNKKGVLNQVLLTVDLLRKSDLNFREIVYTNKNVIDKENIEIIEKIKEITKLDYRII D 
>gi|261747946|gb|ADAD01000082.1| GENE 7 4215 - 5123 1331 302 aa, chain - ## HITS:1 COG:CAC1235 KEGG:ns NR:ns ## COG: CAC1235 COG0083 # Protein_GI_number: 15894518 # Func_class: E Amino acid transport and metabolism # Function: Homoserine kinase # Organism: Clostridium acetobutylicum # 6 296 3 295 296 211 41.0 1e-54 MGLKFKVRVPGTSANIGVGYDCLGVALDYLLELEVEESDKIEFLENGAPFSIPIEENLIF EAIKYTEKHLVKNIPSYKVNVIKNEIPIARGLGSSSSAIVAGIMIANKFAGSPLDINKIA KLAVDMEGHPDNVVPAIFGGMVLAAHDKEKVAYSTLLNSDDLYFYVMIPDFQLSTEKARS VLPKSYLVSDAINNMSKLGLLVNAFNKGEYENLRFLLGDKLHQPYRFVLINNSEKIFEAS EKYGALGEYISGAGPTLISLNHDNDEFLENMKEELGKFSDKWTIEKKRINLKGAEIFDEE TV >gi|261747946|gb|ADAD01000082.1| GENE 8 5343 - 5906 582 187 aa, chain + ## HITS:1 COG:NMA0440 KEGG:ns NR:ns ## COG: NMA0440 COG0791 # Protein_GI_number: 15793445 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Neisseria meningitidis Z2491 # 61 182 145 267 280 104 41.0 1e-22 MKRIVILAGALLLSTTILTSKPKNPSDELTNIIGNYYKNDSMSLNVKDNSSKSERSSHTA RAGVRDQIIEFASNKLGSPYVWGATGPNSFDCSGFVGYVFKKAADLNLPRVSSDQATFRP KISSMNMKKGDLVFFETTGKGRISHVGIYMGGRQFIHASSGSRRVTISSLDSDFYSRTFR WAINPFS >gi|261747946|gb|ADAD01000082.1| GENE 9 6087 - 7214 1448 375 aa, chain - ## HITS:1 COG:FN0536 KEGG:ns NR:ns ## COG: FN0536 COG0592 # Protein_GI_number: 19703871 # Func_class: L Replication, recombination and repair # Function: DNA polymerase sliding clamp subunit (PCNA homolog) # Organism: Fusobacterium nucleatum # 2 374 1 381 381 208 39.0 1e-53 MLNVKVDRKELLKAIQIVENAVTENKIREVLSGIYIEAKENKIILKGTDLELSINTEISG EIISEGKIVIKHKLIEEFLKQILDEKIELIEEQGKLVIKTDSTNTEFSLYDAENFPTQSK MELGLEYTFNKNKLLSYIENVKISASADPENLAVNCIRFEIEENKLKLVSSDTYRLTYIE EELEENEKNKEGLSVSIPLKTIEGLIKIMRLIDEENIIFKSDGTKVFFKFSNVEILTRVI ELQFPDYKSILKNAQHDKKVLLNTKDFLSVLKRTLIFVRDNKDAKNGGIFSFENNKLVLT GINENAKIKEELATIQEGEDLKISLNVKFLLDYISILDGKVTELKLMNSKSSVLVKDEES DASLYFTMPLALREG >gi|261747946|gb|ADAD01000082.1| GENE 10 7278 - 8054 926 258 aa, chain - ## HITS:1 
COG:TM1451 KEGG:ns NR:ns ## COG: TM1451 COG0568 # Protein_GI_number: 15644200 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Thermotoga maritima # 9 257 123 373 399 181 44.0 2e-45 MEENNLNLISLYLSDIQKFNLLSKEEEYELLRRIREENDEQARQLLILSNLRLVISTAKK LLGNGLPLIDLISEGNIGLIKAINKFDYEKGHRFSTYAVWWIKQSIKKAIINTGRDIRIP SYKYEQLSKVNKVITDYVAKYGESPPTEYIAKEVDLKESKVILLLNEFQDVISLNETIGD NIYLEDIVGNTDDVEEKIIKEDQLSEMRELLENILTDREREILELRYGLYNNKIHTLKEI GAKLNITRERVRQIEKKR >gi|261747946|gb|ADAD01000082.1| GENE 11 8094 - 9113 1411 339 aa, chain - ## HITS:1 COG:BH1640 KEGG:ns NR:ns ## COG: BH1640 COG0240 # Protein_GI_number: 15614203 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Bacillus halodurans # 1 337 1 331 345 281 43.0 1e-75 MKNILVMGGGSWGTCLSKLLAENGHRVYLWERSEDVREDMRKNHENSTYLPGIKLPEEII LIDDYCEILSNKEKYGKIDVILSAVPTQFIRSVLKRLKNCLDYNIILANVAKGIEVATNK RISEVVAEELDGKEYNYVLLAGPTHAEEVSRKLPSAILSVSKDEKSAIEVQNIFSSPYFR VYTGTDLTGAELGGALKNCLAIAAGIADGMGYGDNSKAALLTRGLNEILEFGKYYNADPK TFMGLSGLGDIIVTCTSKHSRNRFVGEELGKGKKIDEILSNMKMVSEGAATIKAIYSIIK EKNIKSPIFTALYEVIYQGKPVTELASTFMSRDLRSEFI >gi|261747946|gb|ADAD01000082.1| GENE 12 9141 - 9761 839 206 aa, chain - ## HITS:1 COG:FN0537 KEGG:ns NR:ns ## COG: FN0537 COG0344 # Protein_GI_number: 19703872 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 201 1 192 194 149 44.0 5e-36 MKTVLLVIIAYLLGSIPNALWIGKVFKGIDVREHGSKNTGSTNAARVLGAKLGILTLLLD IGKGALPTAAALFLHADMLEKLTGISNIDAIFIGIFAIIGHSFSVFMKFKGGKAVATTVG VFTVIVPKAILVAAIVFFVVFAVSRYVSLSSITAAVSLPIAIYIFYKNIPLTVFGIVIAV LIIVKHKSNIERIKNGTESKFTINKK >gi|261747946|gb|ADAD01000082.1| GENE 13 9764 - 11224 1781 486 aa, chain - ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 6 449 5 433 448 261 
35.0 2e-69 MEKKRDELLNSSVLTLFIKYFIPTFIGSIVVVLYNIVDRFFVGKISEKALAGAGVSFYIV MILIAFSMLIGVGSGTIVSIRLGQGKEDEAEKILGNAVTIFGILGLVLFVLLQINLDTIL LYSGANSETLPYARTYLEIILYAIFPLFFSYGLTNILNAAGTPRIAMFSMILGAITNIIL DYVAVMIMKTGIEGTAYATLIGNVLSALFVMYFITVGKLPFKINLFGYELEDTSSIRLRF GNLKLSSDIVRDIFTIGMSPFLLQLASSGVGLITNKIVETNGGTYGVAVMTIINSYLPIM TMTVYSAAQAMQPIIGFNYGAENYKRVRKSLLTAVFSGFLLSAVFWIMVILFPKNLILFF NEKSTKEGLREGIKALRIYFALIIPASLGIILPNYFQATGRPRYSVTLNLLRQVVIFLLV VVIFSHIWKLDGVWYAQPFTDLMFALVLTVFLFRELRKLKMKEAETENGKNVQKSEKTDK IIERDV >gi|261747946|gb|ADAD01000082.1| GENE 14 11249 - 12385 1267 378 aa, chain - ## HITS:1 COG:FN0597 KEGG:ns NR:ns ## COG: FN0597 COG0763 # Protein_GI_number: 19703932 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipid A disaccharide synthetase # Organism: Fusobacterium nucleatum # 6 371 2 354 356 282 43.0 8e-76 MNNIKKVFVSCGEMSGDLHLSYIIEEIRKKDPNISFYGVVGDKSIAVGANKITHIKNNDI MGFVEALKKYKYFKQKALEYMEYIKNNNIDTVIFVDFGGFNLRFFKLLKKNIPSIKTIYY IPPKIWAWGKKRIETIKKFDDVIVIFPFEKEYFDKIEKKSGLNVKYFGNPLVDKYRFSQK LGKKIMLLPGSRKQEIGKFIPVIVDLIGNEKMKNEKFIMKFADKSHLEYAQNAVKNSNIN LTEIKNLEISFDSIEALRDKCKYAVATSGTVTFELSLTGLPVITVYKTSAVNAFIARKIV KIKYITLTNLNADKEIFPELLQEDFNVEKLSEQCQIMEKQKEKIVEELKKEREKLGGNGV LGKISDYLLEKITDSDRK >gi|261747946|gb|ADAD01000082.1| GENE 15 12378 - 13184 945 268 aa, chain - ## HITS:1 COG:FN0596 KEGG:ns NR:ns ## COG: FN0596 COG3494 # Protein_GI_number: 19703931 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 266 1 266 267 221 44.0 1e-57 MDRVGLIAGNGKLPELFLKQCQKKGIELFSVYLFDSVEESIKNHSNSVKYSVAQPGKIIS HFKRNGLSHIIMLGKVEKDLIFSNLKFDFTATKILLSAKNKKDKNILKAIINYIESEGIT VLPQNYLMDDYMVKQTVYTKYSPSAEEEKTIEIGIEAAKMLTDIDAGQTVVVKNQSVIAL EGIEGTDKAILRGGELAGKNCIVVKMARKNQDYRIDIPTIGLETVKKVAEIKGRGIVVEA DKMLFIDQEEVINYANKNKIFIKGIKYE >gi|261747946|gb|ADAD01000082.1| GENE 16 13184 - 13963 1135 259 aa, chain - ## HITS:1 COG:FN0595 KEGG:ns NR:ns ## COG: FN0595 COG1043 # Protein_GI_number: 19703930 # Func_class: M 
Cell wall/membrane/envelope biogenesis # Function: Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase # Organism: Fusobacterium nucleatum # 5 259 3 257 257 304 57.0 8e-83 MSTNNIHPTAIVAEEAKLGENITVGPYSIIGPEVTIGNGTVVESHVVIEGETIIGENNYI FSFASIGKVPQDLKFKGEKTRTVIGNNNKIREFVTIHRGTDDKYETRIGNNCLIMAYVHI AHDCIIGDNCVLANAATFAGHVEVEDYAVVGGLTAIHQFTRVGRHAMIGGCSAVTQDVVP YMLSEGNKARAVYINIVGLQRRGFSEEQIKTLREVYKIIFKKKLKLEEALQILERDYSHF DEAMKVVEFIRKSKRGITR >gi|261747946|gb|ADAD01000082.1| GENE 17 13997 - 14422 835 141 aa, chain - ## HITS:1 COG:lin2668 KEGG:ns NR:ns ## COG: lin2668 COG0764 # Protein_GI_number: 16801729 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases # Organism: Listeria innocua # 4 141 1 142 144 145 53.0 2e-35 MEPVMKIEDIMKILPHRYPFLLVDKVIEKNGTETLVAIKNVTMNEEFFNGHFPGKPVMPG VLQVEAMAQAVGLLMLEPGKIPLFMSIDKCKFRKGVVPGDQLRIEVEKVKVKRNVIVAKG KCTVDGAVTCEADLMFAIQEI >gi|261747946|gb|ADAD01000082.1| GENE 18 14680 - 15519 1287 279 aa, chain - ## HITS:1 COG:FN0593 KEGG:ns NR:ns ## COG: FN0593 COG0774 # Protein_GI_number: 19703928 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-acyl-N-acetylglucosamine deacetylase # Organism: Fusobacterium nucleatum # 1 275 7 279 283 305 56.0 9e-83 MKRKTVKNEVEISGIGLHKGEQIKLTLKPNTDNSGIVFKRVDVTDKNNIIKVDYRNLFDL ERGTNIRNKDDVKVHTIEHFLSSLSVFGITDILVEISGNELPILDGSSIKFIAKLKEAGT QELESEIEPVVIKEPVIFSDEKAGKYVLALPYDGFKISYTIDFNHSFLKSQYFEISPSLE EYIEKIARCRTFAFDYEIDFLKKNNLALGGSLENAIVVGKDGPLNPEGLRYPDEFVRHKI LDIIGDLYVLGRPLKAHIIAIKAGHYVNAKLTELIAQTL >gi|261747946|gb|ADAD01000082.1| GENE 19 15534 - 17735 3042 733 aa, chain - ## HITS:1 COG:FN0592 KEGG:ns NR:ns ## COG: FN0592 COG0210 # Protein_GI_number: 19703927 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Fusobacterium nucleatum # 1 730 1 731 735 583 48.0 1e-166 MSILDELNSEQKKAAEKTEGPILILAGAGSGKTRTVTYRIAHMIKEIGISPMNILALTFT 
NKAAREMKERAEALIGADSYNLVVSTFHSFSVKLLKTYADRIGFGRNFNIYDVDDQKSII TKIKKDLNTGNDDLTPGKIAGKISKLKEQGIGINELEEEIDLKLPWNKLFYDIYKQYNEV LKANNAMDFSDLLLNARKLLDDPYVLERVQDRYRYIVVDEYQDTNDIQYQIISIIAAKYK NICVVGDEDQSIYAFRGANINNILNFERDYPDALVVKLERNYRSTKRILDVANEVIKNNE SSKGKKLWTDGDEGEKIKIFNAENVYDEADYIVENIRAKHLQGKAYKDMTILYRTNAQSR VLEEKLLASNIPYKIYGGMQFFQRKEIKDILAYLSLLNNKNDNHNFLRIINVPKRSIGDK TLEKINNLAQKKGLSMFDSLKFINEISELRAGVRETLANFYNMMQGIYESLDELSVKEVF DEVMSKTKYIDCIEDNKEDRVKNIEELLNSITEAQKQNEGLSLSEYLDMVSLSTSTDGME DEENFVKLMTVHSSKGLEFDYVFIAGMEDGLFPSCNFETPEEDIEEERRLCYVAVTRAKK ELFISHVSERMVWGQMNFMIKPSRFIYEMKQDNLEYLSEKFAKFTKKMEKTSVIDSKKKT KIENFNPFSIKSVRTSELKHRHDLKYKTGDIVTHIKFGKGKIKSIDSKSMTVDFPVGEKK IALVLVKKILKDK >gi|261747946|gb|ADAD01000082.1| GENE 20 17897 - 18253 398 118 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038014|ref|ZP_06011426.1| ## NR: gi|262038014|ref|ZP_06011426.1| TM2 domain-containing protein [Leptotrichia goodfellowii F0264] TM2 domain-containing protein [Leptotrichia goodfellowii F0264] # 1 118 1 118 118 183 100.0 3e-45 MENMEDEVIVEEFEEKTSSNKKNEYLQEVKEVKLENSEKKYSKVAYCILAFFLGVFGIHL FYAKKTMQGIIFVVFAIIGIITAAFYIGVFIIMILRITSFVQMLIALFKKSDEFGRIS >gi|261747946|gb|ADAD01000082.1| GENE 21 18367 - 19605 1503 412 aa, chain - ## HITS:1 COG:SP1552 KEGG:ns NR:ns ## COG: SP1552 COG0053 # Protein_GI_number: 15901395 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Streptococcus pneumoniae TIGR4 # 137 407 18 287 394 137 31.0 4e-32 MDIIEWSGKFDKNEIGKLKNMDKDFTFVENVDKLFILYLGDKIYGYAVIGLEKVVELKKI FIIPRLRNNGYGTILLKYVINWLINNNYDSLIVVNHKQMNNFLEEQRFVKNEEGYILNNL TQNKKQEKRMLFVSKFAIGINVILAFLKIISGTVFKSTSLLADGINSLSDLITNILVIIG LKVGANPEDKDHPFGHGKIESVFSVIIGTFIILTAFDLIRENMGNLISPENKMIFGFVPV IVTIVAIIIKIFQLMFMKYKTRDYRSQLINSLLKDYKSDIVISSAVLLGIMLSKFNPLFD TFVGMGVAIYIIREGYKLIKENALILLDSQDEKLLESIKKDIFDFEEIENAHDFRMTTSG KDIYIFADIRVNKEMSVHEAHEITNKISKFVRHKYINVKKVLIHLEPVYESE >gi|261747946|gb|ADAD01000082.1| GENE 22 19674 - 19940 56 88 aa, chain + ## HITS:0 COG:no 
KEGG:no NR:no MSFIIKSYIGHLKKFYRLFLKPKAQKQRPCKPEEIDLVLLRSQPDTIHRFSTIQDYFLNT LQFKALTLKSSASPTSPLLKRVVDTGHR >gi|261747946|gb|ADAD01000082.1| GENE 23 20183 - 21445 1446 420 aa, chain - ## HITS:1 COG:all2303 KEGG:ns NR:ns ## COG: all2303 COG0044 # Protein_GI_number: 17229795 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Nostoc sp. PCC 7120 # 8 407 14 427 439 252 36.0 1e-66 MENKEKTLLLKNGKDVYGEKIDILIVGEKIEKISEEISGKDAENLKIIDLKGNLVMPGII DVHTHMREPGITHKEDFETGSRACAKGGITTFYDMPNTVPPTVTLENLIIKKKAAAEKSI VNFGFHFGGSKNNNIEEIRKVLKEKEANTVKIFMNVTTGEMLIEDEELLKEIFRNSDLVP VHAEHEMIDKALLLNKEYGHGLYVCHIPSKEELVKVIEAKKNPEMNNEKHPVYVEVTPHH LFLNEKIRESSERNKMLLRMKPELRTEQDNEFLWKALINGDIDTVGTDHAPHLISEKLEK VTFGMPGVETSLALMLTEYNKGKITLEKIQKVMCENPAKIMKIEKRGKLKEGYYADIIVI DLNKEWIVKNEDCESKCKWSPYENWKLKGKNIMTVVNGNIVYENEKIRENEGKGRNVEVK >gi|261747946|gb|ADAD01000082.1| GENE 24 21445 - 22203 965 252 aa, chain - ## HITS:1 COG:VC1911 KEGG:ns NR:ns ## COG: VC1911 COG0284 # Protein_GI_number: 15641913 # Func_class: F Nucleotide transport and metabolism # Function: Orotidine-5'-phosphate decarboxylase # Organism: Vibrio cholerae # 12 243 5 230 231 174 40.0 2e-43 MNLNIEEKAKSKLIIALDYDDLKEAQESVENLGNNVDIYKVGLEIFLNTNGKIIDYLHEK NKKVFLDLKFHDITNTVKAACEYAIKRNVFMFNIHCSNGSKTMRKVAELVKKHNSNSILI GVTVLTNLSEEDIQEMFKSEMNLKELVLNMATITKNSGMNGIVCSADEAKFIKEKLGNNF VTVCPGIRPRFTLNKEDDQSRIMTPYNAIMNNADFLVVGRPVTKSENPVKAVKFIIEEIS KGLNEKSDKLEK >gi|261747946|gb|ADAD01000082.1| GENE 25 22223 - 22678 631 151 aa, chain - ## HITS:1 COG:PAB1499 KEGG:ns NR:ns ## COG: PAB1499 COG1781 # Protein_GI_number: 14521525 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, regulatory subunit # Organism: Pyrococcus abyssi # 2 146 3 148 152 120 42.0 6e-28 MELQITALKNGIVIDHIPAEKTFPILEILRLREYGEVITVACNLKSSSLGKKGLIKIADR NLGDRELGEIAILAPDVTINIIENFKVVKKIKLSIPNEIVGLIKCNNNKCISNHENIESK FIKITDGDKIKYKCFYCERVIAEHEIELKPL >gi|261747946|gb|ADAD01000082.1| GENE 26 
22689 - 23624 1165 311 aa, chain - ## HITS:1 COG:PAB1498 KEGG:ns NR:ns ## COG: PAB1498 COG0540 # Protein_GI_number: 14521526 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, catalytic chain # Organism: Pyrococcus abyssi # 2 306 4 308 308 321 52.0 1e-87 MKQKDIISMNDMSKEEILSILEIAGKIEKTPEKEKLKFLQGKIVSTLFFEPSTRTKMSFE SAALRLGAEILHFPPLEQTSLKKGESFTDTIKMVESYSDAIVVRHPFDGAARLAANTSKK PVLNAGDGSNQHPSQTLLDLYTILEEKGTLNNISVAFVGDLKYGRTVHSLVKALTHFNPK IYFVSPEILQMPQYLLDDLDKHNIKYEVLRDFRDYLDKIDVFYMTRIQRERFPDIEDYEK VKGVYVINKENIAGKCKEDMIILHPLPRVDEISTDLDDTKYALYFKQAKNGIPVRQAMLM TVLDEIKNHIK >gi|261747946|gb|ADAD01000082.1| GENE 27 23675 - 23872 103 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037997|ref|ZP_06011409.1| ## NR: gi|262037997|ref|ZP_06011409.1| ribosomal protein L22 [Leptotrichia goodfellowii F0264] ribosomal protein L22 [Leptotrichia goodfellowii F0264] # 1 65 1 65 65 95 100.0 9e-19 MLGIFTENNNFKRIFSKLKQKKPDFSFHSVSYFISYNLYKKRKFEIQISDFLTIIDNREN KDILF >gi|261747946|gb|ADAD01000082.1| GENE 28 24277 - 25641 1566 454 aa, chain - ## HITS:1 COG:slr0309 KEGG:ns NR:ns ## COG: slr0309 COG1032 # Protein_GI_number: 16331878 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Synechocystis # 27 402 35 413 473 187 29.0 4e-47 MKVKMILPALTEAESPFWRPIKYSLFPPLGLATLAGYFSEDDEIELQDQHIEELNLDDSP DLVVIQVYITNAYRSYKLADYYRKKGAYVVLGGLHVTSLPEEAIEHADTIMLGPGEDIFP KFLEDLRNKKPQKMYISVHRSLENIPVVRRDLIKRERYLVPNSIVVTRGCPHHCDFCYKD AFYQNGKSFYTRLVDDALKEIESLPGRHLYFLDDHLLGNPKFAKELFEGMKGMNRVFQGA ATIDSILTGDTIEKAAEAGLRSIFVGFETFSPENLKQSNKSQNLQRDYIKVVNRLHSLGI MINGSFVFGLDNDDKDVFKRTVDWGVKNAITTSTYHILTPYPGTRLFKRMEEDGRITTRN WDLYDTRNVVYKTKNLTPQELKEGYNWAYKEFYTWKNIFEAISNHNLVKHKLKHFFYTAG WKKFEPFWNFIIKSRHLNNMTPVLESILSKVKNH >gi|261747946|gb|ADAD01000082.1| GENE 29 25717 - 26130 537 137 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038028|ref|ZP_06011440.1| ## NR: gi|262038028|ref|ZP_06011440.1| conserved domain protein [Leptotrichia goodfellowii 
F0264] conserved domain protein [Leptotrichia goodfellowii F0264] # 1 137 1 137 137 258 100.0 8e-68 MKKSLLILFLLFGAVSLAREVWKERDYCYGGERTELRGVKFSKDGNFALRTYEYCYLGGK MYKNILVDVEIKVGNPDNRYVDNPVYDRYLVATSKSEYNFKAWNDEEIMVVIKGETNDFK NLKLAKKWFLNSKKRLF >gi|261747946|gb|ADAD01000082.1| GENE 30 26350 - 26847 339 165 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038000|ref|ZP_06011412.1| ## NR: gi|262038000|ref|ZP_06011412.1| 8 kDa glycoprotein [Leptotrichia goodfellowii F0264] 8 kDa glycoprotein [Leptotrichia goodfellowii F0264] # 1 165 1 165 165 315 100.0 9e-85 MKEQENKYGYVLAHVNESNVKREWFNGHGKLPDLDNWKWTSKNYKRVSLSDEELRQLFGK NKFCQNVGQQLMLWYENTALGYPDGRGTWRAGLAVALAEYSKQMFPSDLQEEYGELNLSK LWKYADCKVAPQEILNKVYIDRAIFTKGIYDETLWDLARDSFWEY >gi|261747946|gb|ADAD01000082.1| GENE 31 26844 - 27299 600 151 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038015|ref|ZP_06011427.1| ## NR: gi|262038015|ref|ZP_06011427.1| hypothetical protein HMPREF0554_2074 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2074 [Leptotrichia goodfellowii F0264] # 1 151 1 151 151 167 100.0 2e-40 MKNDNEKYEVTLKEDGSVEEIKSVPKSKEELEKELASLEEQLEAKKLEEKIEQMKRRLAG EPEPADIEYIEEVLDKKEKEEKKKSDTIQWAVALLLVVFLFAGGYGYMWYQDKVSKAKAA AAAEEQKRQEEKAKVPEKLRRKKTKRLGRLG >gi|261747946|gb|ADAD01000082.1| GENE 32 27445 - 27555 85 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFLQFLETENLKEQKREDIYIRVKKVTEKMIKLVLL >gi|261747946|gb|ADAD01000082.1| GENE 33 27622 - 28251 964 209 aa, chain - ## HITS:1 COG:lin1945 KEGG:ns NR:ns ## COG: lin1945 COG0461 # Protein_GI_number: 16801011 # Func_class: F Nucleotide transport and metabolism # Function: Orotate phosphoribosyltransferase # Organism: Listeria innocua # 4 207 1 207 209 192 48.0 3e-49 MNDLSLKEKIAKVLFDVKAVKISVNEPFTFASGIKSPIYCDNRYILGFSEERDIIIDGFV EKIDKDVDVIVGVATAGIPWASFIADRMKKPLSYVRNKPKEHGRGKQIEGADIKGKKVVV IEDLITTGKSSLIAVEVLQKEEVAEQEVMAIFSYGFDKARENYEKYNCKFSSLSNFDTLI KILEQLKYLTEKEAEIALDWSRNPEGWER >gi|261747946|gb|ADAD01000082.1| GENE 34 28277 - 28942 864 221 aa, chain - ## HITS:1 COG:no 
KEGG:Lebu_1655 NR:ns ## KEGG: Lebu_1655 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 7 221 1 222 222 227 58.0 3e-58 MKGKEAIMKNFNYKPLIKYISYLNRKYINPDKAGTDKESMQKFKEAGQSARKTFTEIAKS FESKLEDFKMQKVSSWMNQAQIGRPYFWVFFKRPDELRDESGFALRVFGKGKSTGISLEV SFLERGIGKNTIYQQNKVLEIPIKEPLYYFSQIDGENYRMNGNEENRQKLIKDVTDGKVR KVLVKYDVNNIQSFKNLDELTDEFLKGFKLLYPYYEMTKII >gi|261747946|gb|ADAD01000082.1| GENE 35 29014 - 29448 341 144 aa, chain - ## HITS:1 COG:no KEGG:TDE0497 NR:ns ## KEGG: TDE0497 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 1 140 1 139 168 81 34.0 9e-15 MTEIIFSYNRLSHRVLNFMIIVGQFISLILVWLFLYFLGITNYETAPAFFQENPDIAVWI IFLSIPFFMLSTAFIIFKFWSRKEDKACIRLWKEYAILYYRGEEIMINKGEISTEQLTSR KKAIFYDTYILKIQKRKIIFDKCI >gi|261747946|gb|ADAD01000082.1| GENE 36 29470 - 29823 323 117 aa, chain - ## HITS:1 COG:CAP0178 KEGG:ns NR:ns ## COG: CAP0178 COG0789 # Protein_GI_number: 15004881 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 2 115 4 117 123 92 50.0 2e-19 MNISEFSEKTGLTAYTIRYYEKKNLIKVKRDDKNRRIYDESDIEWIKFIKKLKDTGMLLK DIKVYSDLRYKGDGTISERMELLIKHRKYVKNQIKIWENYLSNLDNKIEIYKSKIRN >gi|261747946|gb|ADAD01000082.1| GENE 37 29905 - 30693 846 262 aa, chain + ## HITS:1 COG:CAP0001 KEGG:ns NR:ns ## COG: CAP0001 COG0300 # Protein_GI_number: 15004706 # Func_class: R General function prediction only # Function: Short-chain dehydrogenases of various substrate specificities # Organism: Clostridium acetobutylicum # 4 250 5 251 251 280 60.0 3e-75 MSKKYVTITGASSGIGKAAAKLFSKKGKNLILIARRKNLMENLKNEILSDYPDIDIVIKD FDLTDTEKIQDLYSSLKEYNIEILINNAGFGLYQNVNEHNIEKTRNMLKLNIEALTLLSI LYVKDYHNVEGSQLINISSAGGYTIVPSATVYCATKFYVSSFTEGLARELIKNNSKLKAK VLAPAATKTEFGNIANDVTDYDYDKVFGTYHTSEEIADFLYQLSESDFTVGIVDREIFTF SLSSPVFSHSYNSNKNQNLNND >gi|261747946|gb|ADAD01000082.1| GENE 38 30752 - 31726 1221 324 aa, chain - ## HITS:1 COG:mlr3148 KEGG:ns NR:ns ## COG: mlr3148 COG1744 # 
Protein_GI_number: 13472749 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Mesorhizobium loti # 3 324 2 320 329 136 29.0 5e-32 MKKKLFLMILMFCSIISFGKNINVAVIYSVRGKDDKSRNDFAYEGLMRAKKDFGIEFKEI IQKVSILDAEDQIKYLAKSGKYDLIIGISHETQMAIRNMAREYPEQKFAVAYERLDSKIP NVATILFREEEGGFLAGALAAMMNKTYSVGFIGSTETENVKKYETGFKQGAKYINKNIVT VSSYIDGENPYFDIPLGKKKAEEMIKKYKTGVIFHDASGSGIGVLNAAQENKIFAVSSEQ NEDKLAPGTVLTSVMTDLSTPVYTIVKSVVNGNFQGKEYYFGLSENAIKTTDFTYTKNTI GKEKLQKLEEIKEMIRSGKIKVRD Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:56:55 2011 Seq name: gi|261747944|gb|ADAD01000083.1| Leptotrichia goodfellowii F0264 contig00181, whole genome shotgun sequence Length of sequence - 734 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 732 817 ## gi|262038031|ref|ZP_06011442.1| hemolysin Predicted protein(s) >gi|261747944|gb|ADAD01000083.1| GENE 1 3 - 732 817 243 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038031|ref|ZP_06011442.1| ## NR: gi|262038031|ref|ZP_06011442.1| hemolysin [Leptotrichia goodfellowii F0264] hemolysin [Leptotrichia goodfellowii F0264] # 1 243 1 243 244 246 100.0 8e-64 NNINNTGNILANNLDINSKNLNSNGITADNLKINSETISSNKIEAGNTDIKGINLTTDKL KSRNINLDISRNIINTKNITGSKVNINTGDIKNNEIISDELNLNATKNIDSNAIYSNTAK INAQNLDSNSIEGQNISLAISNNINNKGNILADNLNISSKDITSNEIKANKINIDSKNVT SKKLTSNEAKIKVESFVNNEKTDSKSADFDIQKNFENKGTLAVDNLTLKADNVNNDGKIS TDK Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:57:07 2011 Seq name: gi|261747942|gb|ADAD01000084.1| Leptotrichia goodfellowii F0264 contig00234, whole genome shotgun sequence Length of sequence - 1642 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 
1 . + CDS 2 - 1640 2345 ## COG3210 Large exoproteins involved in heme utilization or adhesion Predicted protein(s) >gi|261747942|gb|ADAD01000084.1| GENE 1 2 - 1640 2345 546 aa, chain + ## HITS:1 COG:XF2775 KEGG:ns NR:ns ## COG: XF2775 COG3210 # Protein_GI_number: 15839364 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Xylella fastidiosa 9a5c # 6 209 2615 2817 3455 69 31.0 1e-11 GITLWGAEGVSAGANYGTMKSEGTMYNNTQIQAGNKLKIKADNMEIRGGRAVGKHVEADI KENLLIESLQDKEEMKQMGVSVGYSMQKGKGKDGKPDNRHNGSLGGNYGRKDKEWVSEQS GIIGTESANVKVGGELTLIGGIIANIDEEGKDKGNLTLSYGSLRTADIKSHDKLINLNGN IEINQRSRDNNTKMILDKNGKNDNVKEVPNRVDETYGVGVEGHEREKITRAEEMLKDVNV QKTEFIFKSEPNSWGDFNKIMSSNAGIIGNFLDDMNEHTGNKVRTNYEDKFRTRTNEIIS KVERPLDKINRFISILPTSGTHGGILEQIVRTVRYDKTPIVKIGIEKNKEDGTVMVGMEE IRKISDYKSKDGSPVKVNTNGIIELKENAVRNTIFKNMTKEDMDRYNRGERVEMLMVYNP TRGAVADIIESALGKMFDGSWSSLGLSIGVNRGAAVAYASRDKNQSYDFSFYSQGNIIGL GAFNILKNNGIKLGNGAGNFNVRMYGTPIAIKSYQNFEGVLGINVLGAAVNEPDFVGSNG KWAGLI Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:57:08 2011 Seq name: gi|261747941|gb|ADAD01000085.1| Leptotrichia goodfellowii F0264 contig00079, whole genome shotgun sequence Length of sequence - 223 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 221 281 ## Predicted protein(s) >gi|261747941|gb|ADAD01000085.1| GENE 1 2 - 221 281 73 aa, chain + ## HITS:0 COG:no KEGG:no NR:no TANVEIIDTVRYEATGTEGIGYTRGGIFSRLRRNSGRNNKNEEKREIRERGVNFNVEASN ITSDGSITIKAEN Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:57:14 2011 Seq name: gi|261747939|gb|ADAD01000086.1| Leptotrichia goodfellowii F0264 contig00121, whole genome shotgun sequence Length of sequence - 1622 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 153 - 209 -0.2 1 1 Tu 1 . - CDS 352 - 1359 962 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 1390 - 1449 9.3 Predicted protein(s) >gi|261747939|gb|ADAD01000086.1| GENE 1 352 - 1359 962 335 aa, chain - ## HITS:1 COG:RSc1813 KEGG:ns NR:ns ## COG: RSc1813 COG2207 # Protein_GI_number: 17546532 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Ralstonia solanacearum # 251 335 218 302 303 75 32.0 1e-13 MEKIEKKLLEENDFIGIFGVNELKKFFSIKEKEKKFGKEYKLISKNEKLKCILERYRLDE DYEVSVLNTIGETDKGFFYPSINMNVYEIMYCLHGECIIFSGNESYSMKKGSILIYKMNN DNIDEYSFKSSNFKKILIFLDIDKLETSLSKSIDKNLILEWKEKVVNIFEDNIFCYGKTN SEIDILSNQIENMNIDNIDDYLEFKMKIFKFLFLILKLKITPVKEVSINEEEVVMKLKGM VDGYAISEIPTIKEMYETVGISNYHLQKAFKKIEGITIYQYVQKKKMDYSKHLLETSDKN IIDIAMEIGYENPSKFSKTFKKYFGILPSKYLKKN Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:57:18 2011 Seq name: gi|261747929|gb|ADAD01000087.1| Leptotrichia goodfellowii F0264 contig00154, whole genome shotgun sequence Length of sequence - 10117 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 2, operones - 2 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 1764 1166 ## Lebu_2157 hypothetical protein 2 1 Op 2 . - CDS 1784 - 3388 856 ## SeAg_B4457 putative inner membrane protein 3 1 Op 3 . 
- CDS 3392 - 4252 899 ## COG1091 dTDP-4-dehydrorhamnose reductase - Prom 4272 - 4331 5.2 4 2 Op 1 . - CDS 4343 - 6232 1930 ## COG3754 Lipopolysaccharide biosynthesis protein 5 2 Op 2 . - CDS 6235 - 7074 755 ## COG1087 UDP-glucose 4-epimerase 6 2 Op 3 26/0.000 - CDS 7100 - 7822 204 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 7 2 Op 4 . - CDS 7844 - 8605 381 ## COG1682 ABC-type polysaccharide/polyol phosphate export systems, permease component 8 2 Op 5 . - CDS 8621 - 9487 1542 ## COG1209 dTDP-glucose pyrophosphorylase 9 2 Op 6 . - CDS 9536 - 10117 535 ## Sterm_2405 glycosyl transferase family 2 Predicted protein(s) >gi|261747929|gb|ADAD01000087.1| GENE 1 3 - 1764 1166 587 aa, chain - ## HITS:1 COG:no KEGG:Lebu_2157 NR:ns ## KEGG: Lebu_2157 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 22 587 30 592 762 340 37.0 2e-91 MKKKRKYNINIEDKTKLFFIILFFISGIVLPSFVLIRKEMKKEVTSKLYSLDLVGQIIKG THFEEKIYLPKHITKYGISFATYGRKNSGTIKLEIIQGKKKITEVIDVSKLKDNDYYYLD VKKLALINGEALLKIEGIDGIEGNAVSMHKTADIIYGELIQNGEPTNRALAHKVEFNDFN KIVKGQIIFFILSIFTYFYFLFLIQNEKKNNMKIYLTTVLLIFFIINIKAPTLTFNAQPF AEQVFNFMFNGTRASFLKNIFISDAGYWPLYQRLIGLVIIKLGFNAYWTGVLTSNAAVLT VALMVSIFTLNYFEKYGNIIFRFIISMLFGTFNIVTYTETHTFIDFSYMNIILLFLIFLI DLKKLNKKQYASLMILTVLLCISKSYYILLLPLALVVLVLFWRKLIKREKIFLATILISC LLQTMYIYSNMHMWVTPQSKKLEILYIIQIGIHQIVQQLMNMFYPSMASNQNILSFNLLF LFVLVLFVIFSVYIFIKNRDRESLIILILLGLIFGISCFNVISRIWGGEIEKLWTETAGA INTRHGLIIKISMILILILVPYCINKLRKDSSNTKIKMEYCRGIILF >gi|261747929|gb|ADAD01000087.1| GENE 2 1784 - 3388 856 534 aa, chain - ## HITS:1 COG:no KEGG:SeAg_B4457 NR:ns ## KEGG: SeAg_B4457 # Name: not_defined # Def: putative inner membrane protein # Organism: S.enterica_Agona # Pathway: not_defined # 21 263 12 242 512 85 26.0 7e-15 MKRGMNDEKTANKKCLIGFYVLCFFLIYIMSSNYPVHTDEYIYSYIFGTNIKITVPIQLL ISLKKMYFEWSGRIVSNFLQQLFIYLGTSFFIISNSMVFILILWYSYKIISNILKIEKTE NNMLKINILLFFFIWFFIPAFSQDFIWMVGSANYAWPLFLNLIFLDKLMRLNEEEKIKIK 
YIILIFIVAVSNESSVIFTGILMLIFLFFEFIKNKKFKKNKLEIFIYFSLCSLAQILAPG NLLRIKNESQVFFPRNTILEKMFYILNLMEFQILIVLSIVLVIILYILKIRIKRMSVVLL ITSILNLVVFIFMLPRNENRALLYPLFGLILFITILIYQVFNKLNNRNQWIMIFGLLLIS SNSLLKIFNYYDVDLRNLENKRKFQLNYYSSTNNKEVVLEKYGSKVDNVSKQHKGYDFLQ TNPNSFSNVHMATFYGFDKLYAVNEGSKLLIVQFSNGSNIEMLRIDTDKNVSIFNFKQSY LSEKNELFLEIPSNSKVINISGENFMINKIKIIKVLEGIEYENNSEVKNIEIKI >gi|261747929|gb|ADAD01000087.1| GENE 3 3392 - 4252 899 286 aa, chain - ## HITS:1 COG:FN1698 KEGG:ns NR:ns ## COG: FN1698 COG1091 # Protein_GI_number: 19705019 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Fusobacterium nucleatum # 4 279 3 295 298 249 49.0 5e-66 MKILLIGSDGQLGYEFKRLFDSLNKEYIATDYQNLDITDEKALNNFFTIHKDITHIINCA AYNDVDKAESEDDKVRLLNTEAPKKLAEISKNINAVYTTYSTDFVFDGEKGKPYIEEDKI NPLCKYAESKAEGEKKVFETYDKSFIIRTSWLFGIGNNNFSKQIINWSKIQDTLKVVDDQ ISAPTYSKDLALFSWKLIQTRKYGMYHISSNGVASKYDQAKYILDKIGWKGKLEKARTFD FKLPAKRSKYSKLDSGKIERLLGEKIPDWKNGIDRFLEEMKEKGEL >gi|261747929|gb|ADAD01000087.1| GENE 4 4343 - 6232 1930 629 aa, chain - ## HITS:1 COG:mlr7559 KEGG:ns NR:ns ## COG: mlr7559 COG3754 # Protein_GI_number: 13476280 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipopolysaccharide biosynthesis protein # Organism: Mesorhizobium loti # 1 566 1 557 644 377 38.0 1e-104 MKRIAIYFFYDKNGVVDKYVNYFLEDLKKNIEELVIVCNGKLNSEGRKEFLKFTDKLIVR ENKGFDVWAYKEGLEYIGWENLKKYDELLLINFTIMGPIYPFKEMFDKMDERKEIDFWGI TKFHKYPVDPWGMIEYGYIPEHIQSHFIAVRQPMLSSYEFKKHWEKMKMINTYFESVSFH EAIFTKKFNDKGFKWTTYIDSDYLKNFTDHPIIDYPREIIRDKRCPIFKRRSFFNPYYDF LSRSSGKSSLNLFNYIKTHTSYDVNLIWDNLLRTENMYDIKNSLHLNYNLSENQVTKKIE NSPKVGLFFHIYFEDLIEECYRYALNMPEYADIFITTDKEEKKEKIEKIFSKMKNKIDIK VIQNRGRDVSAFLIPNKEEILKYDYACFAHDKKTKQLQPEIKGEDFKFRCFENILGSKEL VENIIGLFIENPRLGLLSPPSPNHAEFYGNLGREWGHSGNDNYEETCNLLKELVIEVNVD ISKAPVAPYGTIFWFRPKSLEKLLKKGWKYEDFPKEPNKVDGTLLHAIERVYPFVVQGAG YYSANILSEENAKVEITNTTYMISSLNKSLYATKLFKNFGLQYYLINVIDDTLGKKAGFR KILRRALIAKLPNRVKSVLIRVRNKIKRK >gi|261747929|gb|ADAD01000087.1| GENE 5 6235 
- 7074 755 279 aa, chain - ## HITS:1 COG:aq_1069 KEGG:ns NR:ns ## COG: aq_1069 COG1087 # Protein_GI_number: 15606348 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Aquifex aeolicus # 1 279 1 306 327 73 27.0 4e-13 MKRILVTGANGYIGKNVVHWLVENGYEVLAIDISTDRVDNRAEKIEMNIFEEYKKLLPRL NGIDSCIHLAWRNGFVHNSITHLQDIPLHYEFLKEISKFGIKNINVIGTMHEIGYWEGEI NNDTPTNPVSFYGIAKNSLRQSLRVLEREENINLKWLRVFYIQGDDRNNNSIFSKILQKE EEKSEKFPFTTGLNKYDFINIEDLSQMICKASLQKEMTGIINCCSGTAISLKEKVENFLK ENNLKIKLEYGVFPDREYDSPEIWGNNDKIKKILEREVK >gi|261747929|gb|ADAD01000087.1| GENE 6 7100 - 7822 204 240 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 41 233 17 226 245 83 27 7e-16 MIELKNVSLKYNLNKESILSLKEYMIKFLKRKLNYEIFWAVKNVDLKIEKGEVLGIIGYN GAGKSTLLKIISGIIQPTEGKVNVDGKIAPLIELGAGFDFELTARENVFLNGLILGYNKS FIKENFSKIMEFAEIEDEFIDIPLKNFSSGMVARLGFAVSTAIKPDILIVDEILSVGDFK FQEKSLNKIKEMMQDETTVIMVSHSIEQIEKLCTKVVWLENGTIKKIGMPDEICTEYKNS >gi|261747929|gb|ADAD01000087.1| GENE 7 7844 - 8605 381 253 aa, chain - ## HITS:1 COG:CAC2329 KEGG:ns NR:ns ## COG: CAC2329 COG1682 # Protein_GI_number: 15895596 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate export systems, permease component # Organism: Clostridium acetobutylicum # 3 249 8 254 258 120 33.0 3e-27 MNIFKYKDLLFELVLRDIKIKYKKSILGVLWSVLNPLLMMIVLTIVFSEIFKSSINNFPL YLITGQVIFNFFSEATNMSMMSILGNGSLIKKVYIPKYIFPLSKTVFGLVNLIFSLIAVI IVFLFTKQSIGIATIFIPISLFYTFLFSFGIGLILASTAVFFRDMLHLYGVLLIIWNYLT PVFYPMEIVPDKYINFINFNPLTYYVTYFRKILIYNEIPDINLNLICLSISLISIILGII IFKKNQNKFILYV >gi|261747929|gb|ADAD01000087.1| GENE 8 8621 - 9487 1542 288 aa, chain - ## HITS:1 COG:MTH1791 KEGG:ns NR:ns ## COG: MTH1791 COG1209 # Protein_GI_number: 15679779 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Methanothermobacter thermautotrophicus 
# 1 285 1 285 292 414 66.0 1e-115 MKGIILAGGSGTRLYPLTKAISKQIMPIYDKPMIYYPLSVLMLANIREILIISTPRDLPV FQELLGDGSKLGIKLEYKIQEHPNGLAEAFIIGEEFIGNDNVALILGDNIFYGSGFTGLV EEAAKLEKGAVIFGYPVKDPKAYGVVEFDKNGKAVSLEEKPEKPKSNYAIPGLYFYDNGV IEKAKTIKPSARGELEITTVNERYLEEGILNVKTLGRGIAWLDTGTHDALLEAANYVEAI QKRQGFYVACIEEIAYAKGWINEEQLEELAKPLMKTEYGRYLLDLIKE >gi|261747929|gb|ADAD01000087.1| GENE 9 9536 - 10117 535 193 aa, chain - ## HITS:1 COG:no KEGG:Sterm_2405 NR:ns ## KEGG: Sterm_2405 # Name: not_defined # Def: glycosyl transferase family 2 # Organism: S.termitidis # Pathway: not_defined # 1 193 107 298 298 150 49.0 2e-35 EKIRETEKSSEKEIPLLVHTDSNIMNYGIKTEKKFITEVAENRNFENSFFNFFVQGATCI INKKMKDEILPFSEKFYFHDRSIHLIAEFIGKREYIKESTMFYRQHEKNEIGAKSSVIKK ILTKNYFDKRDRELIQFLYEKYNNSLSKEKKEKIENYLKITDRKISRIKRLWYKYKYKIP MNMKKQVFLFLKG Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:57:40 2011 Seq name: gi|261747914|gb|ADAD01000088.1| Leptotrichia goodfellowii F0264 contig00023, whole genome shotgun sequence Length of sequence - 12760 bp Number of predicted genes - 15, with homology - 14 Number of transcription units - 4, operones - 2 average op.length - 6.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 11 - 184 380 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 2 1 Op 2 . + CDS 177 - 821 905 ## COG1434 Uncharacterized conserved protein + Term 825 - 863 0.3 3 1 Op 3 . + CDS 876 - 1403 678 ## Lebu_1185 hypothetical protein 4 1 Op 4 . + CDS 1440 - 2345 1448 ## COG1210 UDP-glucose pyrophosphorylase 5 1 Op 5 . + CDS 2366 - 2530 180 ## gi|262038057|ref|ZP_06011463.1| hemagglutinin 6 1 Op 6 . + CDS 2539 - 3465 938 ## gi|262038054|ref|ZP_06011460.1| hypothetical protein HMPREF0554_0352 7 1 Op 7 . 
+ CDS 3484 - 5007 628 ## PROTEIN SUPPORTED gi|153833204|ref|ZP_01985871.1| ribosomal protein S15 8 1 Op 8 1/0.000 + CDS 5079 - 6086 408 ## PROTEIN SUPPORTED gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 9 1 Op 9 1/0.000 + CDS 6052 - 6750 832 ## COG1354 Uncharacterized conserved protein 10 1 Op 10 . + CDS 6765 - 8243 533 ## PROTEIN SUPPORTED gi|163803542|ref|ZP_02197411.1| 30S ribosomal protein S20 + Term 8246 - 8305 12.1 11 2 Tu 1 . - CDS 8659 - 10104 1327 ## COG1960 Acyl-CoA dehydrogenases - Prom 10131 - 10190 13.0 + Prom 9969 - 10028 8.8 12 3 Tu 1 . + CDS 10154 - 10273 66 ## - Term 10436 - 10482 4.1 13 4 Op 1 . - CDS 10586 - 10972 378 ## Lebu_1211 hypothetical protein 14 4 Op 2 . - CDS 11022 - 11678 907 ## COG2860 Predicted membrane protein 15 4 Op 3 . - CDS 11699 - 12760 1067 ## Lebu_1207 type III restriction protein res subunit Predicted protein(s) >gi|261747914|gb|ADAD01000088.1| GENE 1 11 - 184 380 57 aa, chain + ## HITS:1 COG:CAC3415 KEGG:ns NR:ns ## COG: CAC3415 COG1132 # Protein_GI_number: 15896656 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Clostridium acetobutylicum # 1 56 571 626 627 96 80.0 1e-20 MKGRTVFVIAHRLSTIRNSDVIMVLDQGRIIERGSHEELLKMKGIYYQLYTGGFELE >gi|261747914|gb|ADAD01000088.1| GENE 2 177 - 821 905 214 aa, chain + ## HITS:1 COG:CAC0441 KEGG:ns NR:ns ## COG: CAC0441 COG1434 # Protein_GI_number: 15893732 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 3 199 56 246 259 111 36.0 1e-24 MNRKNGISENDLKKKLFKFVKIFIVIFLFPFIIIQLLTISEMKKEKDGKDKKVDYVVVLG GRVRGEKPSRSLYKRIEKAAEYLKEHENIKVVASGGKGKGEEISEAEAIKRELVKLGISE NRIITEDKSVNTMENFKFTLEKIKENESENKEKKFKILIVTNDYHVYRSRKIAELLGFEA YGLGAKTPLISLPKSYIRESASIIKYYFTKNSIK >gi|261747914|gb|ADAD01000088.1| GENE 3 876 - 1403 678 175 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1185 NR:ns ## KEGG: Lebu_1185 # Name: not_defined # Def: hypothetical protein # Organism: 
L.buccalis # Pathway: not_defined # 1 175 1 181 181 164 50.0 2e-39 MKKLLLAAILALGVQSFSCEFMKNPDLLLGRVIDKLKSEKKTNDIFCDSDELKMAYYIID NGDYNLNIGIKLGINPQTTNNDFRNDFYKKLTEYTNVLKNVDKKNLNGLPLPDKEVLRFY GYVEPEKNFFYIGKYEYDRKTNKYKMVVNSQGKTIFDQMGLFTGVNVEYSDEIVF >gi|261747914|gb|ADAD01000088.1| GENE 4 1440 - 2345 1448 301 aa, chain + ## HITS:1 COG:FN1266 KEGG:ns NR:ns ## COG: FN1266 COG1210 # Protein_GI_number: 19704601 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose pyrophosphorylase # Organism: Fusobacterium nucleatum # 4 296 9 296 301 342 63.0 5e-94 MKQKKIRKAVIPAAGLGTRVLPATKAQPKEMLAIVDKPALQYLVEELIDSGIEEILIITG RNKASIENHFDYSYELEKTLEENGKKDLLKVVDDISKMSNIYYVRQKKPLGLGHAVGCAE AFVGNEPFVVLLGDDIMYADPSKGELPVTKQLIDKYEELNGGTILGVQEVPEKEVGKYGI INPLKEIDSQTVEVENFIEKPSVEEAPSRLAALGRYILEPEIFEFLKKTKPGKGGEIQLT DAILDMKNSGKKLYAYNFKGLRYDTGDKFGMFVANVEFGLRHPELKDRTKEYLKKLYESI K >gi|261747914|gb|ADAD01000088.1| GENE 5 2366 - 2530 180 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038057|ref|ZP_06011463.1| ## NR: gi|262038057|ref|ZP_06011463.1| hemagglutinin [Leptotrichia goodfellowii F0264] hemagglutinin [Leptotrichia goodfellowii F0264] # 1 54 1 54 54 77 100.0 2e-13 MIKTVKKTILFCMFTVLFCNIAFSLKAENKADMNDEGYKEIEKELKNIKDKEKT >gi|261747914|gb|ADAD01000088.1| GENE 6 2539 - 3465 938 308 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038054|ref|ZP_06011460.1| ## NR: gi|262038054|ref|ZP_06011460.1| hypothetical protein HMPREF0554_0352 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0352 [Leptotrichia goodfellowii F0264] # 3 308 1 306 306 458 100.0 1e-127 MDMTKCTVKTSKTKSNLPVCQNSTAFFQNTTSYKNNKPLKVYPDKDEVLLETPIVVETNT KTDKGNIPSSGKYNKKVKEYKANYDYEINKNTKVGIGLNHKDKELRQYKDQKRHKRDADD NNVSLNTEYEKDGLTYRGEVEFGKGYYYKKGIKNETPQSHYDYINSRGNVSYKKDLGNDT ELVPSAEVEIKNTNERNVGNNGRNYIPAENSDIVTTAGVKLKQKYNDKIKWEVGTEYKQN MKNVITGKTNYNKSLNNDTYERGTFITGANVQYKKDDSVTYKLGYNFEKNRDYNNRVLNF SVTYQIKD >gi|261747914|gb|ADAD01000088.1| GENE 7 3484 - 5007 628 507 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|153833204|ref|ZP_01985871.1| ribosomal protein S15 [Vibrio harveyi HY01] # 7 496 13 502 512 246 33 6e-65 MLLGGNEETRKIDEYAINTLGIPGIVLMENAAASFVKNLDLTADSYLVICGKGNNGGDGY AIARQLNALGKNITVFCIDNQNMSKDCKINYEICKNMNIKISSDIDELDELLINSRGVID GIFGTGLNSPVTGIYREVIEKINRHSQHCKIYSVDIPSGINGNTGEIMGTAVKAFKTVSF VIYKKGFLNRKNKEYFGEIRVENIGVNENSFLHLIKAHYLTEKEIQKIIIGRNEFSHKGD FGKILIFAGSKGFSGAAKIAVNSCVRSGSGLVTLLTYDNILNEVSSKMTEAMTLGINSDF VEENFPEIEKMILNSDVMAIGPGIGKSEKSLVILEKLLSFKKNIKGNDIKLVIDADGLNL LSENKYLFKEIENRAVLTPHSVEFSRLSGFSLEEIEKYRFEICRDFALSNKVILLLKGKN TLITDGNEVYINSTGNPYMANGGAGDCLTGIIASLSGQNYELFESAYTGAFLHGYIADGL FKEQYIINASHIIENIPKYMKKLFCGK >gi|261747914|gb|ADAD01000088.1| GENE 8 5079 - 6086 408 335 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 [Bacillus selenitireducens MLS10] # 37 317 20 305 317 161 35 2e-39 MSMKIITKKEKEVEEFLKFENTSCVNEEELKQSKNIVILGNFDGVHRGHAKLLERAVKKA REKGYKTIVYTFCEYPQKKESRITTPSEKCQLINNFNIDYVYMDNFEDVKNFSPEEFIEK ILIQKLNVREIYCGFNFTLGKGKSGNVRIMENILREKYNNSIVLNIQPSILDDENEIISS TRIRKYIQKSDLSKAKELLGHNFIIMGEVVHGKKLGRTLGFPTANLTFENRIYPELGVYG VYVHIEGDSNIYHGIMNIGKNPTVKSDSLLLNVETNIFDFEQDIYGKIILIEVLEKVGKE KKLNSLEELIQKISDNVSNWRKRIDEKYHNKNKNR >gi|261747914|gb|ADAD01000088.1| GENE 9 6052 - 6750 832 232 aa, chain + ## HITS:1 COG:FN0708 KEGG:ns NR:ns ## COG: FN0708 COG1354 # Protein_GI_number: 19704043 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 6 220 4 219 225 117 39.0 2e-26 MKNTTIKIKIDNFEGPLDLLIHLIEKKQMKITEINISQIIDDYLNYIREQKEENLKIKVE FLIMATDLIEIKAYSILNKEKKDEKIENLERKILEYKLFKEISVLFSEYENEYNIPYKRT GNKNIETAMIEYDISGLTLDSLLNSFRALLEEEDKEKLILNLEEEYSTEDAADEISELME TKDRIGFSELLKNKFTKSRIVSLFLCVLEMFKSGYIDIINENKEFYIQKLSF >gi|261747914|gb|ADAD01000088.1| GENE 10 6765 - 8243 533 492 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803542|ref|ZP_02197411.1| 30S ribosomal protein S20 [Vibrio campbellii AND4] # 1 424 5 439 520 209 30 6e-54 
MFKSSFIVMAINMLSRLLGLIREMIIGSMFGATGLTDAYVSATKIPNFFTTLFGEGSMGT VFIPIYNRGLEEKGVEKTNDFVFSILNLIIAFTSTLSVIMIVFSRQILKITTGFKDPERF ETANNLLKIMAFYFLFIALSGVVSSFLNNYKKFAIAASTGLVFNLTIIIGTLLLSKKIGI YSLGIAYLLSGVFQLGMMLPQFFQIIKKYKFIFNLKDEYVREMFLLMIPTLIGIFGYQIN EIVDNNFATRLSAGTASALNYASRLYLLPIGVFAISLSVVIFPTLSQAVVKNQRKKVKNT IQKGLNMLVFLIIPSSTVLMGYAEPIVTLIYKRGHFSDKGVITTAGALKFYALGLLFFST IHLLTRSHYVYKDRTLPVISSFVAIFINIVLDWLLYEKYQHIGLTIATSFSAMVNYMILL ISLNKRHIRLNNLIYLKFLVLSLVISLAAFYVSNMIKVPSLGKFGILVNITAFAVLYLGI WAIPVMTKKSKK >gi|261747914|gb|ADAD01000088.1| GENE 11 8659 - 10104 1327 481 aa, chain - ## HITS:1 COG:NMA1490 KEGG:ns NR:ns ## COG: NMA1490 COG1960 # Protein_GI_number: 15794390 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Neisseria meningitidis Z2491 # 12 481 55 516 517 609 62.0 1e-174 MLNLIDEALVKMNTGEFLNNIKKAFNDIFSQENINKINLMNYLPEDKWLDIKNYGLLLPF LAEKFGGRKANQFEIQEVLRIAGNYGVPVTLRTGIEGALVLQPLTEFGNNEQISKGLELI LKGEGGGLAITEPETSGSAIAKEMQSYYEYVDENTIHLKSAKYWQGNSQSDFLLVAAKEK KDRKLSKTISLIFVPKEYIKYEVLQSEGLKAVRYAVNNIDAAIPAKYLMKLSESKANSLR EFQNIFIRSRLQLIGMTHGIMEYIVKNIKKYSRNEIKFVEREIDEIEKRYEASKIMYNYT CNNVSPDKSVADKLMEANIIKSLSTDYTYEAAKTAQKLLGAKGFEAGHLMSNIAVDFRPF TIFEGPNDMLYADIYDQFSKATSEEKKSGIRISKEQTLYERFLSDKRFQEIKTSTENLLQ KADDIINFMKNHTLNEIDQIKKVFTGKILSKLFVLSQTDAEDLSTFLIKDIKKDILDFEY C >gi|261747914|gb|ADAD01000088.1| GENE 12 10154 - 10273 66 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTYNNTKILLKKEPRRALKGVYEEKSILLLVKNSITFYN >gi|261747914|gb|ADAD01000088.1| GENE 13 10586 - 10972 378 128 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1211 NR:ns ## KEGG: Lebu_1211 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 118 3 120 122 132 67.0 5e-30 MENKIRYTKTGKKIEIYEFDAKLHQKVIMQKKQFEEHIIKKHPEITLEIIDIILNNPDYV TKRSNSKKEHFYQKKIDKLHYFVVVSSHKNIRHLRFILTAFSVDDKEFLKEKNVYYKYVG YIDYDNTY >gi|261747914|gb|ADAD01000088.1| GENE 14 11022 - 11678 907 218 aa, chain - ## HITS:1 COG:SSO0012 KEGG:ns NR:ns ## COG: SSO0012 COG2860 # 
Protein_GI_number: 15896983 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Sulfolobus solfataricus # 5 215 3 198 205 100 33.0 2e-21 MRFEIFLIICNYVGTVAFAASGVLKGIKHKLDIFGITLLAVITALGGGILRDILVAQIPD VLINPEGLYIAIITSGIMWLFTKKIRKKIKISRRRKMKLYKLITTSNLIFDSIGLAAFTL IAANKSVSLELNTVTTGILAAVTGVGGGIIRDLLVTETPIVLREDIYAVLAFGGGILYHI CIIQFSFLKVPTMAILFIIMLIIRLVIIKYKVNLPNIN >gi|261747914|gb|ADAD01000088.1| GENE 15 11699 - 12760 1067 353 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1207 NR:ns ## KEGG: Lebu_1207 # Name: not_defined # Def: type III restriction protein res subunit # Organism: L.buccalis # Pathway: not_defined # 1 352 720 1087 1090 377 62.0 1e-103 EKINFNSRNTLKDMYDEYKAEIGKSHDDILEISDFDSNMELFTELILKTGSFYNAQIYFE KVDFEKKYPLKNEETEFLAYIEKKIKLVEPFTYLIIKNLLETKFESQNFKNGNINYINSE ILLRNYKNYYNIKSEFKKIYLINRIFSELVEDSILENTLYGHKISEKYEKLFIEKRDEFD KRLKQLLILGLNEFAKNDMEEFNENILLTHKEYMRIDLQILLDSKVPKGTWRAGYANTDK DICLFITNDKSHITQENLKYDNSLHADDIIQWISQPKTFHESSVGQMFIKHKEKGIKVHI FIRKYAFMDGNKTNPFIYLGNADYYKSYGDKPMAILWKLKHKIPQELIYDLYE Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:58:26 2011 Seq name: gi|261747882|gb|ADAD01000089.1| Leptotrichia goodfellowii F0264 contig00096, whole genome shotgun sequence Length of sequence - 29606 bp Number of predicted genes - 31, with homology - 31 Number of transcription units - 11, operones - 5 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 58 - 279 449 ## Lebu_0466 exonuclease VII small subunit 2 1 Op 2 . - CDS 294 - 854 334 ## PROTEIN SUPPORTED gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 3 1 Op 3 . - CDS 869 - 1594 727 ## Lebu_0464 hypothetical protein 4 1 Op 4 . - CDS 1611 - 2633 1276 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) 5 1 Op 5 . 
- CDS 2661 - 3593 1183 ## Lebu_1875 hypothetical protein - Prom 3634 - 3693 2.2 6 2 Op 1 32/0.000 - CDS 3710 - 4813 249 ## PROTEIN SUPPORTED gi|224825464|ref|ZP_03698569.1| protein-(glutamine-N5) methyltransferase, ribosomal protein L3-specific 7 2 Op 2 . - CDS 4806 - 5885 1798 ## COG0216 Protein chain release factor A 8 2 Op 3 . - CDS 5940 - 6482 577 ## COG3981 Predicted acetyltransferase - Prom 6512 - 6571 7.9 + Prom 6497 - 6556 8.0 9 3 Tu 1 . + CDS 6612 - 7892 1044 ## COG0285 Folylpolyglutamate synthase - Term 7889 - 7939 4.5 10 4 Tu 1 . - CDS 7949 - 8545 658 ## Lebu_0530 hypothetical protein - Prom 8676 - 8735 11.5 - Term 8689 - 8740 4.4 11 5 Tu 1 . - CDS 8815 - 9708 1486 ## COG1159 GTPase - Prom 9799 - 9858 9.3 + Prom 9761 - 9820 10.2 12 6 Tu 1 . + CDS 9868 - 11079 1228 ## COG0786 Na+/glutamate symporter 13 7 Op 1 3/0.000 - CDS 11450 - 12442 1443 ## COG0095 Lipoate-protein ligase A 14 7 Op 2 30/0.000 - CDS 12456 - 14204 810 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 15 7 Op 3 24/0.000 - CDS 14219 - 15253 1620 ## COG0508 Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes 16 7 Op 4 28/0.000 - CDS 15268 - 16266 1532 ## COG0022 Pyruvate/2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, eukaryotic type, beta subunit 17 7 Op 5 . - CDS 16342 - 17307 1542 ## COG1071 Pyruvate/2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, eukaryotic type, alpha subunit - Prom 17383 - 17442 11.2 18 8 Tu 1 . - CDS 17566 - 17838 394 ## PROTEIN SUPPORTED gi|229211116|ref|ZP_04337511.1| SSU ribosomal protein S16P - Prom 17865 - 17924 9.0 19 9 Op 1 . - CDS 17993 - 18661 797 ## Lebu_0102 hypothetical protein 20 9 Op 2 . - CDS 18711 - 19100 529 ## gi|262038088|ref|ZP_06011493.1| hypothetical protein HMPREF0554_1495 - Prom 19128 - 19187 12.8 - Term 19288 - 19342 6.0 21 10 Op 1 1/0.000 - CDS 19350 - 19865 915 ## COG0716 Flavodoxins 22 10 Op 2 . 
- CDS 19869 - 21113 1461 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases 23 10 Op 3 . - CDS 21124 - 21906 1082 ## COG0748 Putative heme iron utilization protein 24 10 Op 4 . - CDS 21945 - 22712 972 ## gi|262038079|ref|ZP_06011484.1| hypothetical protein HMPREF0554_1499 25 10 Op 5 14/0.000 - CDS 22750 - 23631 1334 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 26 10 Op 6 35/0.000 - CDS 23657 - 24439 234 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 27 10 Op 7 2/0.000 - CDS 24443 - 25435 1330 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 28 10 Op 8 1/0.000 - CDS 25425 - 27404 2747 ## COG1629 Outer membrane receptor proteins, mostly Fe transport 29 10 Op 9 30/0.000 - CDS 27461 - 27883 495 ## COG0848 Biopolymer transport protein 30 10 Op 10 . - CDS 27913 - 28548 829 ## COG0811 Biopolymer transport proteins - Prom 28585 - 28644 9.9 + Prom 28589 - 28648 19.0 31 11 Tu 1 . + CDS 28712 - 29368 633 ## COG1272 Predicted membrane protein, hemolysin III homolog + Term 29402 - 29465 12.7 Predicted protein(s) >gi|261747882|gb|ADAD01000089.1| GENE 1 58 - 279 449 73 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0466 NR:ns ## KEGG: Lebu_0466 # Name: not_defined # Def: exonuclease VII small subunit # Organism: L.buccalis # Pathway: Mismatch repair [PATH:lba03430] # 1 73 1 73 73 63 80.0 3e-09 MAAKKQSYEENIEEIDKILEKLENQELSLDESIAEYEKAIKLIKESEKLLEIGEGKVLKV LEKNGKLETEEVE >gi|261747882|gb|ADAD01000089.1| GENE 2 294 - 854 334 186 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 [Bacillus selenitireducens MLS10] # 1 186 13 198 199 133 34 1e-30 MRITAGMLKNRKIKSREGRETRPTLERIKEAIFSIIGEQVVEAKFLDLYSGTGNMAIEAL SRGAGRAVMIEQDKEALRIIIENINNLSLENKCRAYKNDVFRAIEILDRKNEKFDIIFMD PPYKENISAQTIEKISESNILSEEGIIISEHSTYEKLENTIGNFVKYDERDYNKKIISFY RLKEND >gi|261747882|gb|ADAD01000089.1| GENE 3 869 - 1594 727 241 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0464 NR:ns ## 
KEGG: Lebu_0464 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 3 241 9 255 255 216 54.0 6e-55 MSKKNFKIIISIFVFLIFSVLSFSENIEKTRFLSVGSLKIVKPETLVIKREDLNITIEKD RKIKVESLYTFQNTGDVNVKSTFMFWLDQSTVNDKSQKYIENIKFVSDFKKNKNSRAVIN FSENIYDNQSSDNIQREWFAVTHTIEPQKTGRLTVNYNLVNTEFQINKKINFSFDLVNNF IDKNKTEILYINVYNKSGIKINSIIYKNYVFKNVSENSQKEHYELLEGNVALDGKMTINF K >gi|261747882|gb|ADAD01000089.1| GENE 4 1611 - 2633 1276 340 aa, chain - ## HITS:1 COG:FN1330 KEGG:ns NR:ns ## COG: FN1330 COG0809 # Protein_GI_number: 19704665 # Func_class: J Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Fusobacterium nucleatum # 3 340 13 351 351 409 64.0 1e-114 MNIAEFDFDLPEELIAQHAVNPRDHSKLLVLNKEKKEIEHKRFYNIIDYLKKDDVLVINR TKVIPARLYGRKDTGTVLECFLLKRYDLYTWEVLLKPAKRLKIGQKIVFSENMLEAELIE IKSDGNRVLKFNYKGNFEEILDKLGEMPLPPYITEKLEDKDRYQTVYAKEGESVAAPTAG LHFTEELLEKIKEKGIIIAEVFLDVGLGTFRPVQTENIHEHIMHSEKYWVPKETAEIVNN AKRKGNRVIAVGTTSVRTLESSVNEKGELIENESETSIFIYGDYKFKIVDAIITNFHLPK STLIMLISAFGGKEFVFSAYEEAVREKYRFYSFGDSMFIY >gi|261747882|gb|ADAD01000089.1| GENE 5 2661 - 3593 1183 310 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1875 NR:ns ## KEGG: Lebu_1875 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 34 309 60 353 357 307 54.0 4e-82 MQKNKINIGKNNIGKMLLFIAILSLSGLTFGASNELENIIKDYYNNKEYDLMNMRKTDNK GGKSLNGKMKSRLSLNITQGELMTIANQIFKNETGGTVENLVDWNAGENFPSLGIGHFIW YKSSLGGPYAESFPQMVSFYRSQGIKLPRILEENTGSPWKSRSELLRKKENGDRDIIELI DFFDRTRDTQILFIFERLEDSLDKMLNASANSENVRKQFYRVANSPNGMYALIDYVNFKG EGISDTNGWGLRQVLENMRGNETGKSALIEFSDSAKYVLEKRVNNSRQHERKWLPGWFKR VDTYKTFQVR >gi|261747882|gb|ADAD01000089.1| GENE 6 3710 - 4813 249 367 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|224825464|ref|ZP_03698569.1| protein-(glutamine-N5) methyltransferase, ribosomal protein L3-specific [Lutiella nitroferrum 2002] # 116 331 51 263 300 100 33 1e-20 
MNNLLDILNKSINYLEKKNIKNARIVTETVFSQILGIERIMLYANFEKVLSDGDIARIKE KLNEITTQNNEISDKEIFEGNSENLKSLIDKSVSYLEKNNINEAKLIAEIIFSHVLEIDR MLLFTKYKENLDKEKTDKIRNFIQKVGKEKFPVQYLLNEQEFYGRKFYVNTGVLIPRQDT EVLVEEAIKTLRSEKTETPKILDIGTGSGAIGITLALEIPDSKIMGTDISDKALEISEKN KSLLNAENIKFFKSDLFENIEYKKFDMIISNPPYIASDETKVMSEDTLLHEPDEALFAED EGLYFYREISFQGKEYLRNGGYMLFETGYKQAETVKKIMEITGYKNVNIIKDMQNIGRVV TGQKLES >gi|261747882|gb|ADAD01000089.1| GENE 7 4806 - 5885 1798 359 aa, chain - ## HITS:1 COG:FN1332 KEGG:ns NR:ns ## COG: FN1332 COG0216 # Protein_GI_number: 19704667 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor A # Organism: Fusobacterium nucleatum # 1 354 9 362 365 428 70.0 1e-120 MFQKLDDVVLKHEELTRLLMDPEVISDPKKIMEYNKALNSIDEVVKKYTYYKSRKEEMES LKEDLKAEKDPEMKEMMLEEIHTIEEEIPGLEEELKILLLPKDPNDDKNVIMEIRAGAGG DEAALFAADVFRMFTRYAERNRWKTEIIDKSEIGVGGLKEVTFLIKGNGAYSRLKFESGV HRVQRVPATEASGRIHTSTITVAVLPEIEDVNEVEINQSDLKIDTYRSSGAGGQHVNTTD SAVRITHLPTGLVVTSQDGRSQIKNREAAMKVLASKLYEMEYEKQRKEVESERRSQVGSG DRSEKIRTYNFPQGRVTDHRIKLTLHRLEGVLDGDLDEMIDALIAYDQAELLKAVGDNE >gi|261747882|gb|ADAD01000089.1| GENE 8 5940 - 6482 577 180 aa, chain - ## HITS:1 COG:DR0763 KEGG:ns NR:ns ## COG: DR0763 COG3981 # Protein_GI_number: 15805789 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Deinococcus radiodurans # 42 155 25 141 158 71 35.0 7e-13 MTNLNFREVSINDGYKEFLFLKSRKKENENGFRIKVPKDFNDFRTIIEELIRDKNRKISE ERVPQTVYWVEKDREIIGLVKIRLVLSEKLKKKGGNIGYFISDKYRRKGYGKELLKNALK FFKDTENVIITCNKDNIASKKVIEYNNGKLMEIFDNSCIYKIIIKNLIDLNEKVQDNFEK >gi|261747882|gb|ADAD01000089.1| GENE 9 6612 - 7892 1044 426 aa, chain + ## HITS:1 COG:FN1014 KEGG:ns NR:ns ## COG: FN1014 COG0285 # Protein_GI_number: 19704349 # Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Fusobacterium nucleatum # 1 416 5 404 415 286 39.0 8e-77 MNIDTILKEIFSFRATKRRENIDDLKLIYDLLGKPCKNNKIIHIAGTNGKGSTASFIENI 
LTAAGYSVCKFTSPHILKYNERIVSDKKMITNEEILQNYSIIKGTVSNYNKSGSNEINLN FFEITTFIALLYFQKQNPDFIVLETGLGGRLDATNIVDSNISVITNITFDHVNILGNTLQ QIAYEKAGIIKNGELCIYSDNLPELKKEIDKKTSKAINVIEKYKNIEIELDKKNYKTVLY FENKRFILPLFGKFQAYNFLIAYEIAKIYNIHDKVIQKGLDNVYWPARFEFYSKNPTIIL DAAHNDDSIKKLSENLKELYKKNEVIIITSLLQTKDFKAVFSELQNISDKIFITSLKNVA YGLTSQEIKDKMKELNISVENIIFEDDINKAYKESLKLLNNTIYNFKAIIVCGSFYEISE FEKIVL >gi|261747882|gb|ADAD01000089.1| GENE 10 7949 - 8545 658 198 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0530 NR:ns ## KEGG: Lebu_0530 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 2 194 9 216 217 178 47.0 1e-43 MELKMTRENFWEYVNGARKKYKDDFEVMSYLTDALTDKANEEIFSFGIIVDEIMLESYNE KLWCASYLVNGDTGEDSFDFFRLWLISQGEKIYNDVIKNPDNLINYIENSNEDEFIQDFY ENEDFFFVAVDAFGKKNRIEAFEEIFEKYLDEFDSYKEKIGYIDTDYPKLRFTWCEKVPK SMKKICPKLFERLYIEEN >gi|261747882|gb|ADAD01000089.1| GENE 11 8815 - 9708 1486 297 aa, chain - ## HITS:1 COG:FN0270 KEGG:ns NR:ns ## COG: FN0270 COG1159 # Protein_GI_number: 19703615 # Func_class: R General function prediction only # Function: GTPase # Organism: Fusobacterium nucleatum # 1 291 1 293 296 317 61.0 1e-86 MKSGFIAIVGRPNVGKSTLMNKLVKEKVAIVSDKAGTTRDQIKGIVNIGENQYIFVDTPG IHKPKHLLGEHMTEIALETLSNVDLIMFMLDGTKEISTGDIFVNEHIRETKTPVIVVINK IDKMSNEEIEEKKTEIKEKLGEFADIITLTAEYAIGVHKIFEVAEKYLSSDVWFYPEDYY TDLPVNKIVVETVREKILHNTKDEIPHSVAVEITDVETKKDIRKYEVNIYVERDSQKGII IGKDGALLKKIGIEARKEIEALIDLKVNLKLWVKVKKKWRKDKQFLNEMGYKIKKKK >gi|261747882|gb|ADAD01000089.1| GENE 12 9868 - 11079 1228 403 aa, chain + ## HITS:1 COG:FN0793 KEGG:ns NR:ns ## COG: FN0793 COG0786 # Protein_GI_number: 19704128 # Func_class: E Amino acid transport and metabolism # Function: Na+/glutamate symporter # Organism: Fusobacterium nucleatum # 1 398 8 390 399 330 48.0 4e-90 MIQAMGLAVILLIIGRKLRSKISFFEKYCIPSPVIGGLIFSIISFIIRRYNIATVLFNTT LQSFFMIMFFTSVGFNASVKVLKKGGKKVFIFLIAAVVLCVFQNAVAVIISKFVGIHPLL ALMTGSTPMTGGHGTSAAIAPTVESLGIKGAGAVAIASATYGLLAGSILGGPIADSLIKK HNLLPKNEAADKTQKEENYEKDIDENLKKKLQPILDGEKFSRAFFQILIAMGIGSYLSLF 
IDWMMPEMHFPVYIGPMIVAAIIRNISDNTNLIDSPTKEISILEDVALNLFLAMALMSMK LWELIDLAGPMIILLLAQTVLIYIYLNLVTFKAMGSDYDAAVIVSGHCGFGLGATPNGIS NMKSVCEKYKYSKIAFFVVPIVGALFIDFVNVSIITVFIGFFK >gi|261747882|gb|ADAD01000089.1| GENE 13 11450 - 12442 1443 330 aa, chain - ## HITS:1 COG:SPy1033 KEGG:ns NR:ns ## COG: SPy1033 COG0095 # Protein_GI_number: 15675030 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate-protein ligase A # Organism: Streptococcus pyogenes M1 GAS # 1 329 1 328 329 434 66.0 1e-121 MIYVVNKSNNPTYNIALEEYCFKNLKDYDQIFILWINEPTIVVGKYQNTIEEINSEYIKE HGIHVVRRISGGGAVYHDLNNLNYTIISNVNKGDEFNFKEFSVPVIETLKELGVKADFTG RNDLEIDGQKFCGNAQAYIKGRVMHHGCLLFDVNLGVLGDALKVSKDKIESKGVKSVRSR VTNILPHLKEKITVEEFADKLMGYMKKHYPEIKEYKLSKEELDIIEKNSKEKHGNWDWVY GQSPEYNITRGKRFPFGKIQIFANVVNSKIKNIKFYGDFFGTKEDLSEIEEKLIDVKYTP EDIKERLKDLDIKEYFAGFTLDEIVEAIVE >gi|261747882|gb|ADAD01000089.1| GENE 14 12456 - 14204 810 582 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 126 578 3 458 458 316 39 9e-86 MATEIIMPKAGIDMTEGQIIKWNKKEGDKVEAGEILLEIMTDKTSMELEAEESGYLIKIL KGEGETVPVTQVIGYIGAEGEAAPAGDAPSAAPASEPAPQKAEPVQKETKSEEVKPAAKA EKGEDEFDVVVIGGGPAGYVAAIRAAQLGGKIAVVEKSELGGTCLNRGCIPTKTFLKNAE IIEGIEMSAKRGIILESEKFKVDMPKVISLKNEIVKTLTNGVQGLLKSNSVKIFKGVGKI NKDKDVVVNGEKVLRTDKIILAGGSKVGSVNIPGIESKRVLTSDDILDLKELPKSLVVIG GGVVGVELGQAYMSFGSEVTVIEMMDRIVPGVDREASETLRKALEKKGMKILTSSKINEI IDQGDKLVIKLEGKEDVIAEKALLSIGRVPDLEAVGELDLEMERGKIKVDKFMETSIKGI YAPGDVNGIKMLAHAAFRMGEIAAENAILGNHRETKLETTPSAIYTIPEVGMVGLTEEQA KEKYDISVGKFAFVGNGRALASGDTTGFVKVITDKKYGEILGVHIVGQSAAEIINEASSL MAMEITVDEVIKTIHGHPTFSEALFEACADVLGEAIHLPKKK >gi|261747882|gb|ADAD01000089.1| GENE 15 14219 - 15253 1620 344 aa, chain - ## HITS:1 COG:SP1162 KEGG:ns NR:ns ## COG: SP1162 COG0508 # Protein_GI_number: 15901027 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes # 
Organism: Streptococcus pneumoniae TIGR4 # 1 344 1 347 347 409 61.0 1e-114 MENDKLRATPAARNLAKRLGVDLFRVQGSGAKGRVHKEDVEIFNYEKRIHISPLAAKIAK DYDINLDNVVGSGHNGKIMRDDILKLIAKPQEKEELIRHEVPKTVEAKQVTEEDIEMIPM SPMRKVISKRMSESYFTAPTFTLNYEIDMTEIKALRTKILDTILENTGKKVTITDIVAFA VVKTLMKHKYINSSLSEDGSQIIFHNYVSLAIAVGMDDGLLVPVIKNADKMSLSELVVNS KEIVSKALAMKLSPTEQSGSTFTISNLGMYGVQSFNPIINQPNSAILGVAGTVDKPVVVN GEIVVRPIMTLSLTIDHRVVDGLAGAKFMQDLKKLLENPISMLV >gi|261747882|gb|ADAD01000089.1| GENE 16 15268 - 16266 1532 332 aa, chain - ## HITS:1 COG:SP1163 KEGG:ns NR:ns ## COG: SP1163 COG0022 # Protein_GI_number: 15901028 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, eukaryotic type, beta subunit # Organism: Streptococcus pneumoniae TIGR4 # 4 332 2 330 330 504 75.0 1e-143 MSDQTKLMTVKEAIITAMSEEMRRDENIFLMGEDVGIFGGDFGTSVGMLEEFGPERIKDT PISESAISGTAIGAAMTGLRPIVDVTFMDFIVYMMDNIVNQAAKTRYMFGGKGQVPVTFR CAAGSGVGSAAQHSQSLESWFCHIPGLKVVAPGTPADVKGLLKSAIRDNNPVIFLEYKAQ YNMKGEVPLDPDFVIPLGKGEIKKEGSDITIVTYGRMLERVMKAAEEVEKEGISVEVVDP RTLIPLDKELILNSVKKTGRVILVNDAHKTNGYIGEIASMICESDAFDFLDSPIVRLASE DVPVPYNHTLETAIVPSVEKIKNAIHKVMNKQ >gi|261747882|gb|ADAD01000089.1| GENE 17 16342 - 17307 1542 321 aa, chain - ## HITS:1 COG:SP1164 KEGG:ns NR:ns ## COG: SP1164 COG1071 # Protein_GI_number: 15901029 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, eukaryotic type, alpha subunit # Organism: Streptococcus pneumoniae TIGR4 # 3 321 4 322 322 472 71.0 1e-133 MELSKNRLLNMYELMLDIRNFDLKVNQLVKRGMVPGMTHLSVGEEAANIGALSALNPDDI ITSNHRGHGQVIGKGIDLNGMMAEIMGKATGTCKGKGGSMHIADLENGNLGANGIVGGGQ GIAVGAAYTQKVKKTGKIVVCFFGDGATNEGSFHETLNLASVWKVPVIFYSINNGYGIST SIKKVMNTEHIYERAAAYGIPGYFIEDGNDVLEVYKKFEEAVKYVREGNGPVLIESVTYR WFGHSSSDPGKYRTKEEVDEWKKKDPILKFGKYLVENKIATQEELDKLDEISKQKIEDAV EFAKNSPEPTIESAFEDIYAD >gi|261747882|gb|ADAD01000089.1| GENE 18 17566 - 17838 394 90 aa, chain - ## PROTEIN 
SUPPORTED ## NR: gi|229211116|ref|ZP_04337511.1| SSU ribosomal protein S16P [Leptotrichia buccalis DSM 1135] # 1 89 1 89 89 156 85 2e-37 MVKLRLTRLGRKKVPFYRIAAMEALTQRDGKAIAYLGTYNPLVEEDKQVVLKEEEILRFL SNGAQPTETVKSILTKAGVWEKFQATKKKK >gi|261747882|gb|ADAD01000089.1| GENE 19 17993 - 18661 797 222 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0102 NR:ns ## KEGG: Lebu_0102 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 9 222 1 220 220 148 47.0 1e-34 MKKIIIVLMTLILSIQAFSETVVKGTYEKSKKRYKELNLKVMGNIKLNSISIDSKNSVTL KLENYNVIMNKNELLNLYNSDNTNKLNKLKNTLTDKEAVKYNLKNKIAELVENNKAIIYD KKNKTELKNIIKVEYHNYIYYDEGRGSGYEGYSFYADDNFEKVIMNFDIVVPGLGIPIHS SLGDNPYTENQKVGKNFEKYNERKELYEKARINPDKIITIKY >gi|261747882|gb|ADAD01000089.1| GENE 20 18711 - 19100 529 129 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038088|ref|ZP_06011493.1| ## NR: gi|262038088|ref|ZP_06011493.1| hypothetical protein HMPREF0554_1495 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1495 [Leptotrichia goodfellowii F0264] # 1 129 1 129 129 192 100.0 9e-48 MSKVILNYKTIPGMKSSDLHLKINGEKSILLSEGKQEIQVDSGENTLRVYENFMMKSNEL KVDIISENTEVVLTFNYTVFLIAFLINIFLVLTAYMQILGNYLFIAVILALIINFILMTT IGFLKLTLK >gi|261747882|gb|ADAD01000089.1| GENE 21 19350 - 19865 915 171 aa, chain - ## HITS:1 COG:FN0772 KEGG:ns NR:ns ## COG: FN0772 COG0716 # Protein_GI_number: 19704107 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Fusobacterium nucleatum # 1 168 1 163 169 107 36.0 1e-23 MSTLVVYSTLTGNTEKVAKAVFEAINGEKELKNVSEVEKTEDFEKFDKIIFGYWVDKGDA DERMKKFMAKVKNKTVGAFGTLGAKPDSDHAKRCLEKVKTFLEGNGNKVEREFICRGAID PKLLDKFRKMTAEGMTGHHAATPEAEKRWSEAAKHPNEEDFQNAKKAFEGF >gi|261747882|gb|ADAD01000089.1| GENE 22 19869 - 21113 1461 414 aa, chain - ## HITS:1 COG:FN0771 KEGG:ns NR:ns ## COG: FN0771 COG0635 # Protein_GI_number: 19704106 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Fusobacterium 
nucleatum # 1 409 1 405 411 350 45.0 3e-96 MFSQRFKSHHDVGGIILQHFGHMKRPVPPTEVEKMLITKAKEEKGAMYVHIPYCDKICSF CNLNRKQLDNDLEDYTDFLIKEFEKYGKTDYMKSKKLDAAFFGGGTPTILREHQIEKILK CIKQNYNFTEDYEFTFESTLHNLNDKKLEILQKNGVNRLSIGIQSFSPKGRDTLNRTYTK EETVKRLKKVKDNFDGLVCIDIIYNYPEETVEEATEDADIIADLKLDSSSFYSLIIFDGS KMSKDIKENRLELDYKLETDRKLHDAFVKRLLSKGDYEILEHTKIVRKNRDKYKYIKLLN SQTDTLPIGVGAGGKLGNFDIFRMSPDKQFFMAASEEERKLKTLSGLLQYPVVYFSEIKK YIPEGTFDKINNLFKIFAEKKYLNLYDDRYEITVDGLFWGNNIDSEVIKICLGG >gi|261747882|gb|ADAD01000089.1| GENE 23 21124 - 21906 1082 260 aa, chain - ## HITS:1 COG:Cj1613c KEGG:ns NR:ns ## COG: Cj1613c COG0748 # Protein_GI_number: 15792918 # Func_class: P Inorganic ion transport and metabolism # Function: Putative heme iron utilization protein # Organism: Campylobacter jejuni # 3 258 4 250 251 145 34.0 7e-35 MKDRILNHMNEDHNDVLALYVRYFNKRDDVKEARLIDVNEEEMTLRVNGNEDVKVKFTKR TEFEHMHLEMVKMAKIARKELGIPAPERYKDKSHLQEEEIKIEINDFIRKFKSVILGTVT EDGEPNVTYAPFLRFRGDDYIFISTTGDHFDNLKNNGKLEVLFIEDEEKSKMISARTRVR YKAVAEFLERNAQFEEIMDEFQQKDSLIKMTRTMKDFYLVKLKLIKGRYIKGIGKAYDIA ENGEIKHITQDGHVYGHKEK >gi|261747882|gb|ADAD01000089.1| GENE 24 21945 - 22712 972 255 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038079|ref|ZP_06011484.1| ## NR: gi|262038079|ref|ZP_06011484.1| hypothetical protein HMPREF0554_1499 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1499 [Leptotrichia goodfellowii F0264] # 1 255 1 255 255 340 100.0 5e-92 MKFYIASIILNIGILFIPAFVLSQNTGKDTNETITVNLDNFVSENLSEESSDKNKQKAHK EEVITAQKPEIKEFNANKPEIKQNNVPINKIQNNHTDNKTTENTRNENAVINNNNSSHQN NSVSQSSGESSGSKGKEARSTGVSENTESAKKGNTTGHSRDNGNVCREGIDFTVVYNPDL QYPVAAERLGTKNTVVVNVRLNFNSNGNVSVIGVSGGNGIFQSEAKKSASRIKVRIKNPE TLKCTITKPFRFNSR >gi|261747882|gb|ADAD01000089.1| GENE 25 22750 - 23631 1334 293 aa, chain - ## HITS:1 COG:FN0767 KEGG:ns NR:ns ## COG: FN0767 COG0614 # Protein_GI_number: 19704102 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Fusobacterium 
nucleatum # 33 292 9 271 275 151 35.0 1e-36 MINVKNKIVKKGFLFFFLLLSVIVSCSKKEKSDNKSGTENFKYKRIIVMDPAVVEMFYLL NSEDKIVGISKLQRSKIWPEEKTEKLESVGTFTNVSVEKVLSLKPDLVIASAFYVPQGFE EVLKSNNIELKKVGANSIDEVFTNFQEIGKMLGKEKEAEKIISEKKEKLENIKKLETKEK KGVFILSSAPMRVFGKGTLPDSIMRLLNIQNISDNMPGNSPILTPEFLIKENPDVILTLV KNPEEIVKANPQIKDVSAIKNRKFIVLDSNQVLRGSPRIVDHIAEVYEKTGKN >gi|261747882|gb|ADAD01000089.1| GENE 26 23657 - 24439 234 260 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 244 1 236 305 94 27 6e-19 MKTDIYTENINFSYDKKELLLNGINLNIKKGEFTGILGPNGCGKSTLLKIILKYLFPQSG IVELQNKKLKDYRQEELSKILSFVPQKSSLSMPMTVEEVVYMGRVPYMKNKWAGFDREDD EKVNEVMKKLELEKFRQRNIFSLSGGEFQRVLLARALAQDTEIIMLDEPTSALDMNYALE IMKLVFNFVKNENMTGIMVMHDLNLASMYCDNLIFLKNGEIRYRGTPEELFKPEIFYEIY GFKCEVAKYNDNYYVIPKKL >gi|261747882|gb|ADAD01000089.1| GENE 27 24443 - 25435 1330 330 aa, chain - ## HITS:1 COG:Cj1615 KEGG:ns NR:ns ## COG: Cj1615 COG0609 # Protein_GI_number: 15792920 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Campylobacter jejuni # 13 326 15 322 328 307 57.0 2e-83 MNSNINMKKIYAGLVISLFIFIFIALGAGDINVPLKDTVGSLLNKSYVDDSTKNIILNIR LPRIIMSVLIGMLLASSGTVVQTVFRNPLADPYIIGISASATFGAIIAFILELPDIMYGI SGFITSVIVILIIFRLSKRKNGTDITSLLIIGIAISSFLGAFTSFSMYLIGQDSFKIVAW MMGYIGTSSWTKVSILLIPLVISVTYFYLKRYELDLLLSGDEEAHSLGLDVGKLKKNMLI ISALIVGFSVAFTGMIGFVGLIAPHTIRLLTKSSSNVKLIPLATLGGGLFLLICDTIGRT ILYPVEIPIGVVTAFFGAPFFLYLAMRKGG >gi|261747882|gb|ADAD01000089.1| GENE 28 25425 - 27404 2747 659 aa, chain - ## HITS:1 COG:FN0768 KEGG:ns NR:ns ## COG: FN0768 COG1629 # Protein_GI_number: 19704103 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Fusobacterium nucleatum # 28 659 21 715 715 315 35.0 2e-85 MMKRIGILSLLAVAALLQAESKYEGKMEESVVTATGFNENVEKQIKNVTVITSNDIKNKG 
YSSVEEILRRTPGVNFVNNGFGEIVDIRGQGPEKASGRVKILVDGVSMNILDLSHGLVPV NSVSVEEIERIEIIPGGGSVLYGNGTAGGVINIITKTPEKEGAHGKIYYENSSYSTNKAG GRAGIKFNDNFSIDLGYENINGKGYREKDKNSNEFLYGGFTLQGEKQKLRFKATRYNEDG ETTNGLSWEELSKNRRQAGENESLTKALRKEYTLEYNFKPVDNLEFTVLGYNQKNERTYD QDYGMGIRNNGLFRDKKNGANLKGNFNYGSGNLVFGYEYVDNKMLRRSVNTMTMRGRTRT LSDTKIELSKKTNSAFLLERHSFTDKLEGTLGYRFESAKYDIFRTDGRSSLKSKSKENNN AYEASLNFKYSDTGNVYAKYERGFRSPSPTELVDKDINKGGAYTLNDVKPETYDTYEIGI KDMVGPSFVSLTGFYTKTKNEIAIKWQGATFMRNWIYRNLEETERKGAELFAEQYLGNFR INESVSYVNAKITKGDKKGQKIAYVPSTKATLGVNYDVVSGLTLKADVNYLSGSVDGNNN KIKGYSTTDLGVSYKHESGWGLDAGVKNVFGKKYNLFQNRDAYTPAGERQYYLGVNYEF >gi|261747882|gb|ADAD01000089.1| GENE 29 27461 - 27883 495 140 aa, chain - ## HITS:1 COG:FN1311 KEGG:ns NR:ns ## COG: FN1311 COG0848 # Protein_GI_number: 19704646 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport protein # Organism: Fusobacterium nucleatum # 31 133 1 96 100 57 36.0 8e-09 MRFTNRRSKKNAEISMLNLIDVIFMLLIFFMITTTFNNYAQFHLAVPKSDIKMENNKNEQ TVEVIIDKDMNYFFKVGGNIKSITPEDLKTELSELSPKLLESVTLTADEHLEYGTIVNVM GRLKDQNVKNVNLNIQKNNK >gi|261747882|gb|ADAD01000089.1| GENE 30 27913 - 28548 829 211 aa, chain - ## HITS:1 COG:FN1834 KEGG:ns NR:ns ## COG: FN1834 COG0811 # Protein_GI_number: 19705139 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Fusobacterium nucleatum # 20 207 3 186 187 127 41.0 1e-29 MLHYFIAGGIFMWGILLASICGLAVILEKAALFLTKEKKLTEEFKRKLYKTLKKGNKKEI LDLCSERKDSVSQILVKTVNNMEIDLDKADHSHKQYVEEIITENILEQTMKLEKGMWLLG AVVNTAPQLGLLGTVTGMIVSFSALSTNNTESARAVASGISEALYTTAFGLMVAIPFLVF YNYFNKKIDDTVLEMERVSLQFLNRFQNTEQ >gi|261747882|gb|ADAD01000089.1| GENE 31 28712 - 29368 633 218 aa, chain + ## HITS:1 COG:TP1037 KEGG:ns NR:ns ## COG: TP1037 COG1272 # Protein_GI_number: 15640021 # Func_class: R General function prediction only # Function: Predicted membrane protein, hemolysin III homolog # Organism: Treponema pallidum # 15 215 32 235 238 146 
40.0 3e-35 MENNIKNTGLEADTSRGEEIANFVSHTVGAGLSIIAFILLTIRASWTRDVPTILSFMIFG FGLIVLYTMSAIYHGLRPGTAKRIFEIFDHSAIYILIAATYTPFLVLVVKSKTGIIILWI QWAVCILGILFKSFFTGKLKMFSTLLYLFMGWMIVFAFNELKQNINPVSFVYLVAGGVLY SLGTIFYTWKICKFNHMIWHIFVILGSLAHFFAVWYLV Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:59:09 2011 Seq name: gi|261747880|gb|ADAD01000090.1| Leptotrichia goodfellowii F0264 contig00126, whole genome shotgun sequence Length of sequence - 404 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 403 586 ## gi|262038095|ref|ZP_06011499.1| hypothetical protein HMPREF0554_1959 Predicted protein(s) >gi|261747880|gb|ADAD01000090.1| GENE 1 1 - 403 586 134 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038095|ref|ZP_06011499.1| ## NR: gi|262038095|ref|ZP_06011499.1| hypothetical protein HMPREF0554_1959 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1959 [Leptotrichia goodfellowii F0264] # 1 134 1 134 134 207 100.0 2e-52 TTEVEIKDVLKATGTEGAAVGGIFSRLRRKPRPQTVKQEKERTYRIDMDDSNITSEGSIY IKAENGDVINKDGGNIEGKKGVRIDAKNVYNTERYIEDEKNKVILAGGGRIKGENVYLKV TGKVVNGTEEYTKG Prediction of potential genes in microbial genomes Time: Thu Jul 14 02:59:20 2011 Seq name: gi|261747860|gb|ADAD01000091.1| Leptotrichia goodfellowii F0264 contig00159, whole genome shotgun sequence Length of sequence - 9272 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 7, operones - 5 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 624 907 ## gi|262038097|ref|ZP_06011500.1| hemolysin 2 1 Op 2 . + CDS 621 - 1148 467 ## gi|262038112|ref|ZP_06011515.1| putative carbon monoxide dehydrogenase 1 3 1 Op 3 . + CDS 1159 - 1674 420 ## gi|262038106|ref|ZP_06011509.1| hypothetical protein HMPREF0554_2187 + Term 1743 - 1785 -0.8 + Prom 1701 - 1760 10.4 4 2 Op 1 . 
+ CDS 1827 - 2522 423 ## gi|262038098|ref|ZP_06011501.1| hypothetical protein HMPREF0554_2188 5 2 Op 2 . + CDS 2542 - 2769 243 ## gi|262038099|ref|ZP_06011502.1| homoserine kinase 6 2 Op 3 . + CDS 2786 - 3055 541 ## Lebu_2111 hypothetical protein + Prom 3064 - 3123 6.8 7 3 Op 1 . + CDS 3145 - 3636 342 ## gi|262038107|ref|ZP_06011510.1| hypothetical protein HMPREF0554_2191 8 3 Op 2 . + CDS 3672 - 4166 478 ## gi|262038100|ref|ZP_06011503.1| conserved hypothetical protein 9 3 Op 3 . + CDS 4229 - 4504 226 ## gi|262038114|ref|ZP_06011517.1| putative NADH dehydrogenase subunit 5 10 3 Op 4 . + CDS 4534 - 5226 580 ## gi|262038113|ref|ZP_06011516.1| conserved hypothetical protein 11 3 Op 5 . + CDS 5292 - 5498 247 ## gi|262038101|ref|ZP_06011504.1| hypothetical protein HMPREF0554_2195 + Prom 5501 - 5560 2.2 12 4 Op 1 . + CDS 5582 - 5830 318 ## gi|262038111|ref|ZP_06011514.1| conserved hypothetical protein 13 4 Op 2 . + CDS 5845 - 6012 297 ## gi|262038102|ref|ZP_06011505.1| conserved hypothetical protein 14 4 Op 3 . + CDS 6023 - 6451 549 ## gi|262038104|ref|ZP_06011507.1| conserved hypothetical protein + Term 6543 - 6595 1.2 + Prom 6683 - 6742 9.4 15 5 Op 1 . + CDS 6912 - 7061 98 ## gi|262038108|ref|ZP_06011511.1| conserved hypothetical protein 16 5 Op 2 . + CDS 7081 - 7641 496 ## gi|262038115|ref|ZP_06011518.1| putative structural maintenance of chromosomes 1 protein + Term 7846 - 7908 2.0 - Term 7837 - 7893 3.7 17 6 Tu 1 . - CDS 7898 - 8245 258 ## Lebu_0662 hypothetical protein - Prom 8386 - 8445 7.3 + Prom 8364 - 8423 9.3 18 7 Tu 1 . 
+ CDS 8455 - 9271 1099 ## COG0249 Mismatch repair ATPase (MutS family) Predicted protein(s) >gi|261747860|gb|ADAD01000091.1| GENE 1 1 - 624 907 207 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038097|ref|ZP_06011500.1| ## NR: gi|262038097|ref|ZP_06011500.1| hemolysin [Leptotrichia goodfellowii F0264] hemolysin [Leptotrichia goodfellowii F0264] # 1 207 1 207 207 322 100.0 1e-86 GKVEGRQRKVPEGSEEKGLESTGRATNDFFNKTYQEGNIPIQTKSDGKDYSKANFGEHVG NLVVVDDVVIVEGALVIVAGIVVYKAATSKEAREFAEILGRKIDDGISNIGNAVAALKLQ MFGVQVKPGDIINTPDTRPGSFKEGKNEGTKGFENVEDGSWWERDIAGARGHGGSVWKRW PKKPNPKNKKNDPKRRTIGEDGKVLRR >gi|261747860|gb|ADAD01000091.1| GENE 2 621 - 1148 467 175 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038112|ref|ZP_06011515.1| ## NR: gi|262038112|ref|ZP_06011515.1| putative carbon monoxide dehydrogenase 1 [Leptotrichia goodfellowii F0264] putative carbon monoxide dehydrogenase 1 [Leptotrichia goodfellowii F0264] # 1 175 1 175 175 263 100.0 4e-69 MKKYLLDFCERKLSADEAEEYFKYDVIEMYKSGKDISIYLEIENDYIKFFKNLFEINNYE VEIVFKNFSIDLNTVLFDEEKIYFEKFLRNLKFEELFFKRVIKLTKFEELEILIKLAYRN LWEYPWLRMEFGKLKIKVDSLYDYCFVMILNRKKDIDKLNRISENLKIYIKDLGN >gi|261747860|gb|ADAD01000091.1| GENE 3 1159 - 1674 420 171 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038106|ref|ZP_06011509.1| ## NR: gi|262038106|ref|ZP_06011509.1| hypothetical protein HMPREF0554_2187 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2187 [Leptotrichia goodfellowii F0264] # 1 171 1 171 171 273 100.0 4e-72 MQKYIVTAFIPDSEEEKEEYLSWSLVNIQRDEKNKIFKRDRQKFTLNFYKENEKMMIKFF EKFFILNDCCAEVYLEPVNTGTKQIFLYNERKKFWNYMKKLSFEQLFLREKIMINNKKDF DFFIKLGYRSDEKQCVKIIFENLDIKSESTGEYSYILFVDESSELKTVFEI >gi|261747860|gb|ADAD01000091.1| GENE 4 1827 - 2522 423 231 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038098|ref|ZP_06011501.1| ## NR: gi|262038098|ref|ZP_06011501.1| hypothetical protein HMPREF0554_2188 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2188 [Leptotrichia goodfellowii F0264] # 1 231 1 231 231 326 100.0 1e-87 
MVLDMIFVLFLIFIGNIIFYFIIYITYKIRNYNAEIIFYQDYLEAKYKKGNKKIFYKNIE TIIVERNLLGSRNILVNYPVIYGYSNVFNNLRKDERYNITLNNVENYEEIIVELIEKTRF DKFSKNQKELERKFPILSLYTVIYIIVLFGFFYKIKYIFTILPLAWIKLFKKNYKFDENS EAIKIYNKRRIIYLKRGNYILENDKLILKIRKKDTLFENEVFIVKKIGKTS >gi|261747860|gb|ADAD01000091.1| GENE 5 2542 - 2769 243 75 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038099|ref|ZP_06011502.1| ## NR: gi|262038099|ref|ZP_06011502.1| homoserine kinase [Leptotrichia goodfellowii F0264] homoserine kinase [Leptotrichia goodfellowii F0264] # 1 75 1 75 75 124 100.0 2e-27 MKFDREKTKVYMLYHINEGKDMKLIGFFSDKEKAGETVKELLSKPGFKDYPQGFKMKAMT IDKSYYAKGFNSKCL >gi|261747860|gb|ADAD01000091.1| GENE 6 2786 - 3055 541 89 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2111 NR:ns ## KEGG: Lebu_2111 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 87 1 91 154 72 47.0 7e-12 MKIYVLSHGYDYGYDYDKEEARFIGVYSTRKKALKALKEYKKIRGFSSHLDGFYIGSYIV DKTLGWIEGYKTGFFDPEIGFYESKNEVE >gi|261747860|gb|ADAD01000091.1| GENE 7 3145 - 3636 342 163 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038107|ref|ZP_06011510.1| ## NR: gi|262038107|ref|ZP_06011510.1| hypothetical protein HMPREF0554_2191 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2191 [Leptotrichia goodfellowii F0264] # 1 163 16 178 178 263 100.0 5e-69 MFFCLLFICVNFKFKSEYLIDSPGTDYEKSNRNYISRIEIFDYNLFIDIPYERISEYYLK NIEIIHNDKMIGIIQINKSMGELECNKNKNLCNYVISEKSVSKILNDKKLYKIIRTRFYE RDKNYFKFKIFISSKKNNQKEIFIVEDVRIRYKKWYDFEIPYF >gi|261747860|gb|ADAD01000091.1| GENE 8 3672 - 4166 478 164 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038100|ref|ZP_06011503.1| ## NR: gi|262038100|ref|ZP_06011503.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 164 1 164 164 266 100.0 5e-70 MKKLIVIVIIIILLFLKACVRHIDRNHYNISSSGSESNVPNIEEIYISDESTCFESNKFP ENLILEKINLYFNKKIIGTMNINKNIHEMRFTNSTLKKYEMEEELLRILGENNEKYQIST GFMQKGFVFEIFIKNTNTGEIYQTIREASINFDKKGFYFWVPYI 
>gi|261747860|gb|ADAD01000091.1| GENE 9 4229 - 4504 226 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038114|ref|ZP_06011517.1| ## NR: gi|262038114|ref|ZP_06011517.1| putative NADH dehydrogenase subunit 5 [Leptotrichia goodfellowii F0264] putative NADH dehydrogenase subunit 5 [Leptotrichia goodfellowii F0264] # 1 91 1 91 91 98 100.0 2e-19 MKEILKNKYLIKIFIGIFLLKMTVWNMILWNMILVDWFILKDIIKNILIFITFFLIKEDK EFKRTVVVIKVICNLILFCKILELIMASSAM >gi|261747860|gb|ADAD01000091.1| GENE 10 4534 - 5226 580 230 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038113|ref|ZP_06011516.1| ## NR: gi|262038113|ref|ZP_06011516.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 230 1 230 230 354 100.0 2e-96 MSDIKLNLKTEKLMLRHKFILIGIFLFFTSFSNDKNIIIFYLIFFSLIFYYRYLVLRALK KYTKITINDEYIDFNSHFMKRKIYFKDMKNITCIENRENIGDIVINIVPGILVYKYIFSL FKENYGFSNTLKYKTFLIPDVVNLKEVFGEIKKKKIGDKKEEFKIINSENLKFEIGEKIM SSTDFKRKSKIVFFEGAKYDSEKRKIIHPSQMGMVPLQYRGIRLPSLKLK >gi|261747860|gb|ADAD01000091.1| GENE 11 5292 - 5498 247 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038101|ref|ZP_06011504.1| ## NR: gi|262038101|ref|ZP_06011504.1| hypothetical protein HMPREF0554_2195 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2195 [Leptotrichia goodfellowii F0264] # 1 68 1 68 68 83 100.0 6e-15 MKEYEKAKRIFLYYNGNYYLMERDGIYSDYKSYEIPKYYEKKWLYEKIIMLLFEIKKENN VRKLLEKI >gi|261747860|gb|ADAD01000091.1| GENE 12 5582 - 5830 318 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038111|ref|ZP_06011514.1| ## NR: gi|262038111|ref|ZP_06011514.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 82 1 82 82 111 100.0 1e-23 MDMFSKVVLAETILDKLNKKDSSKEMGTLKKIISKLLVITQNENVISDDFKDEKGKYLND ITENEVLRRYSNIVKIINNTIL >gi|261747860|gb|ADAD01000091.1| GENE 13 5845 - 6012 297 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038102|ref|ZP_06011505.1| ## NR: 
gi|262038102|ref|ZP_06011505.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 55 1 55 55 83 100.0 4e-15 MGKRKEYIKIIEFGNDIKKIINSWDPLDLLDIAPEDEYETEIREIRDIVIETENY >gi|261747860|gb|ADAD01000091.1| GENE 14 6023 - 6451 549 142 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038104|ref|ZP_06011507.1| ## NR: gi|262038104|ref|ZP_06011507.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 142 1 142 142 206 100.0 4e-52 MTYAIKGIFGNSFSEEYCFDNFTIGEVAEEILKVRRKYSPFEIVINYNKDKDVDIKKIEL YSKIKEIINSWDPLKIMDISLDNEYCYEIKEITGNFNENVNVENLSNLINEVFKNTYNEL YNANKNEEIKIAKELNKIGENR >gi|261747860|gb|ADAD01000091.1| GENE 15 6912 - 7061 98 49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038108|ref|ZP_06011511.1| ## NR: gi|262038108|ref|ZP_06011511.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 49 1 49 49 80 100.0 3e-14 MKNRKIEFINVEEFEKLKIYLENGNIWGVFLRRLKENWSHIGVGAIVKI >gi|261747860|gb|ADAD01000091.1| GENE 16 7081 - 7641 496 186 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038115|ref|ZP_06011518.1| ## NR: gi|262038115|ref|ZP_06011518.1| putative structural maintenance of chromosomes 1 protein [Leptotrichia goodfellowii F0264] putative structural maintenance of chromosomes 1 protein [Leptotrichia goodfellowii F0264] # 1 186 1 186 186 282 100.0 9e-75 MGILDSVCKNKIEKLQGELLTTKRELENFKVKYEKQRKEIRNIFDSYKKRNVKIIGISQN KENDNLYILLFDYDTILLYNDDKYNKSGKMPLLYFKVENVYDKEKGNEIGKRIKIQDLLA REKGIGNGTILLDSLINLAEENKIEEIIGDLSMVDEQDKTNKQRRDQLYKKFGFEINNAK IKKILK >gi|261747860|gb|ADAD01000091.1| GENE 17 7898 - 8245 258 115 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0662 NR:ns ## KEGG: Lebu_0662 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 114 122 237 238 81 46.0 1e-14 
MNNIYTTPDYIGFHTEASVSHCCSNYLSSRPKAYSRKNLKNIAIYLMFFENNKDYWEVER RYRNKKVSNLEEVIILESKRKIGRIGEKQTNIPILKQNSSFIRESLKEISHTTII >gi|261747860|gb|ADAD01000091.1| GENE 18 8455 - 9271 1099 272 aa, chain + ## HITS:1 COG:FN0693 KEGG:ns NR:ns ## COG: FN0693 COG0249 # Protein_GI_number: 19704028 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Fusobacterium nucleatum # 3 269 23 291 896 254 53.0 9e-68 MTETPLMKQYKDIKSDYKDSILFFRLGDFYEMFFEDAIKASKELGLTLTSRNKERDQNVP LAGVPYHSANSYIAKLVSKGYKVAICEQTEDPKAAKGIVKREVVKIITPGTIIDVDSLDS KSNNYLMSIKVTENKIGIAYIDITTGEFKVTEIDEDTEYMQLFSELNKIEPKEVLLTSSF YETIKDKIDDFVQKNDATISIVNKVRDSEKLLTDYFGIVSLESYGIKGKKSVIEAAAIAL DYAMTMQLENDLTVEKIEFLNVSNYADINSIT Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:01:27 2011 Seq name: gi|261747857|gb|ADAD01000092.1| Leptotrichia goodfellowii F0264 contig00160, whole genome shotgun sequence Length of sequence - 4529 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 2381 3549 ## COG1404 Subtilisin-like serine proteases - Term 2572 - 2609 -0.9 2 2 Tu 1 . 
- CDS 2752 - 4302 1783 ## COG2978 Putative p-aminobenzoyl-glutamate transporter - Prom 4463 - 4522 12.6 Predicted protein(s) >gi|261747857|gb|ADAD01000092.1| GENE 1 2 - 2381 3549 793 aa, chain - ## HITS:1 COG:FN1950_1 KEGG:ns NR:ns ## COG: FN1950_1 COG1404 # Protein_GI_number: 19705252 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Fusobacterium nucleatum # 99 566 22 497 498 216 34.0 2e-55 MKKKTILLALLIAVLASCGGGGGGGGAAPQSGGPSPIIPSPGTNPGGNSGSGGNGGNNGS GIIGNGQNPGSGINPQNPSNPGSGLMPQNPNVPDQFPKPTDNRQTTGTGVKLGVLDDDFV SGDAFTQRFYKDPFLLVGTRFDEVLRQEFGNRFEALAKDQGIPGRDDHGLMVATIMAGKS GKGATGSTVYGASFGESNGSVIIDTNKYIELRNKGVKIYNQSFGTPNEFNMPGINYRNEI WNSLNTAGVWTQAQIDQKVNELIDFYKDSVNDGALFVWAAGNRKKVGGNVVTLNNPTIQA GLQEYIPSLYKGWIAVVGVRDDGTEFGPHLARAGAARMWTISANGYCELSGCSEYGSSFA APRVTAAAAKVKEKFPWMTGHELKQTLLTTAKDLGDPGVDGIFGWGLLDEQKALKGPAQF NSELLVGKSGVNAGLKGQFNANITNNLTSIFENDIDGEGGLKKSGNGKLILTGNNSYQGS TDIEEGTLEIYGDNGSNITIKNQGTLITYPKTMIGLKNYNGNVIPKNVENNGGTLENKGS GAVITGNYTATNGSVTKAEIGTKLTVKGAVNLNGGNTLRQTMSGYITAKPLSSTVIEAEK GINGTFDKVETPELINGSATVEGNKVVSTVSRKNVEDYVSTLSLSDTMRNNTAQNLETSF KELDSQIENGNTENVKSFSRSAALIQKMSLPNAAAVLDSLSGQIYASAQALTFQHSQTVN KDLSNRLVMLGTLDNVGDNAGLWVTGIEANGRLRQEGFGVGKTHTYGGQVGIDKAFGNSL ILGTALSYSKSDV >gi|261747857|gb|ADAD01000092.1| GENE 2 2752 - 4302 1783 516 aa, chain - ## HITS:1 COG:FN0470 KEGG:ns NR:ns ## COG: FN0470 COG2978 # Protein_GI_number: 19703805 # Func_class: H Coenzyme transport and metabolism # Function: Putative p-aminobenzoyl-glutamate transporter # Organism: Fusobacterium nucleatum # 5 516 1 509 512 493 52.0 1e-139 MEKNLKKSKEKYSSSFFDQIEKIGNKLPNSVFIFISLAVIVVIASEIVHRMGVTVSYFDA KKSENVVVSAISLMNREGFVYMITSMVKNFTSFAPLGVALVAMIGVGVAEYSGLIDAFLK KMVLSVPNKYITATIVFVGIISNIAADSGYLVVIPLGGIIFLSLKRHPFAGIAAAFAGVS AGFSANLLIGPIDAQLSGITNEALKVAGIDRAISPTANWFFLAVSTILLTIIGTMVTEKI IEPRLGKYEKDTQNFEYSQLTELENKGLFWSGIAVVIFVIALAFMIFPKNAVLKTLDPKT STYTLNTFLNNGIVPIIMMFFLIPGIIYGLITKSIKNTKDVVKGMNKSMETMSVFLVIVF 
FASQFISYFNHSKLGFIISVKGAIFLRNIGFTGLPLIVTFVLLCAFINLFMASASAKWAI MAPIFIPMMYNLNISPEMTQLAYRIGDSSTNIIMPLMSYFPLIVAYMNKYDEDTGVGNLI SLMLPYSLFFLIGWTVMLVLWYLTGLPIGPDAFLHV Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:01:32 2011 Seq name: gi|261747843|gb|ADAD01000093.1| Leptotrichia goodfellowii F0264 contig00047, whole genome shotgun sequence Length of sequence - 12120 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 4, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 336 311 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 2 1 Op 2 . - CDS 365 - 1660 1480 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis - Prom 1815 - 1874 9.9 + Prom 1576 - 1635 11.0 3 2 Op 1 . + CDS 1841 - 3121 841 ## Lebu_2260 O-antigen polymerase + Prom 3131 - 3190 8.3 4 2 Op 2 . + CDS 3213 - 3824 785 ## Lebu_0557 hypothetical protein + Term 3868 - 3903 5.1 + Prom 3887 - 3946 9.9 5 3 Tu 1 . + CDS 4103 - 4648 335 ## COG5522 Predicted integral membrane protein - Term 4799 - 4837 0.0 6 4 Op 1 . - CDS 4925 - 6829 2567 ## COG0445 NAD/FAD-utilizing enzyme apparently involved in cell division 7 4 Op 2 . - CDS 6831 - 7748 1132 ## Sterm_4158 hypothetical protein 8 4 Op 3 . - CDS 7742 - 8512 1285 ## COG1192 ATPases involved in chromosome partitioning 9 4 Op 4 4/0.000 - CDS 8525 - 9892 1983 ## COG0486 Predicted GTPase 10 4 Op 5 16/0.000 - CDS 9916 - 10785 1243 ## COG1847 Predicted RNA-binding protein 11 4 Op 6 18/0.000 - CDS 10790 - 11464 741 ## COG0706 Preprotein translocase subunit YidC 12 4 Op 7 16/0.000 - CDS 11490 - 11699 235 ## COG0759 Uncharacterized conserved protein 13 4 Op 8 . 
- CDS 11699 - 12025 333 ## COG0594 RNase P protein component Predicted protein(s) >gi|261747843|gb|ADAD01000093.1| GENE 1 3 - 336 311 111 aa, chain - ## HITS:1 COG:SPy0787 KEGG:ns NR:ns ## COG: SPy0787 COG0463 # Protein_GI_number: 15674831 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pyogenes M1 GAS # 4 102 3 102 310 100 52.0 6e-22 MEKIDILLATYNGEKYLEEQLYSILNQSYPNINLLIRDDGSKDKTVDIIKKYAQKDERIK FIEDDLGNLGFLKNFEKLLEHSEENYIMFSDQDDIWNPDKIEKSYKKIRET >gi|261747843|gb|ADAD01000093.1| GENE 2 365 - 1660 1480 431 aa, chain - ## HITS:1 COG:all2854 KEGG:ns NR:ns ## COG: all2854 COG2148 # Protein_GI_number: 17230346 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Nostoc sp. PCC 7120 # 189 431 227 469 469 232 41.0 1e-60 MKIEKNTKISGIYIILLYIIYNILLKNFGYSIKYLTNFLFMIFIFVEFVFGQLDFYAERF RYKDVFINLGIDFLFSVVIFVFLRDWDIFLIFSVIFVFQVIFRRQICLKHVKKQRVLIFG SNHIVNNVQEDLINSLDYEYIGYISNNKSRATKYLVGTYDEMEKIINEKEIDALVIVKDI KSESFKEYLKRIFDLKINGLKVLSYEEFNESIQKKIDINQIDEEWLLESNGFDILSNESQ RNIKRGLDLIIAFILMILLSPIALITAIIIKLESPGPVIFKQARIGQNMKKFKIYKFRSM RIHDSTKYPKYTKDNDDRVTPFGKFIRKTRIDELPQLFCIVKGTMSFVGPRPEWDLLAEE YNEKIPYYNLRHMIKPGITGWAQVMYPYGESVEDAKKKLEYDLYYLKNQDLLMDVLTIMK TAKIVLFGKGK >gi|261747843|gb|ADAD01000093.1| GENE 3 1841 - 3121 841 426 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2260 NR:ns ## KEGG: Lebu_2260 # Name: not_defined # Def: O-antigen polymerase # Organism: L.buccalis # Pathway: not_defined # 7 416 12 420 435 306 47.0 1e-81 MKNLYNTGYYVCLLFALSIFLSVKAQNILGILLLLLFIISMFSKENREKLKLLTDKQIIL GLAVFIVTPYIIALIDGGLKARIDMDDYIKFILFLPLVFFLDTEKKFWNFLKSILTGGTV SLIITLFIFIKNYDEWAHPKGFVYPRVFFELDPQDFANIMCLLLLFLISLFLFYKFEKKE DNFKFKLLYFFIIVLNVFILLVNRSKMVYICLIPTVIYILYKKNKKYIAVFLIFCIGGFF LLPHTISDRLQYIVKVQKDPSSNLRLIFWDGAIASIKQSPLIGMKSEQRRDFVENYYKQK GVYDYVVANYEFVKTRNVLDTHNMYLQFLTYFGIIGFISLIFFFFFVIPKKLLSISFYKN 
REKNEEKSEYSKFIALEIALKASYACYLIQGLTEINLNNKSIILAFSVLIYILNFVIHKN FKMKLN >gi|261747843|gb|ADAD01000093.1| GENE 4 3213 - 3824 785 203 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0557 NR:ns ## KEGG: Lebu_0557 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 203 1 204 204 207 55.0 2e-52 MKKLLLLGALISSVAMAQVVEVRIGGDLSNSGKFKGGFSDGANLQKKAIKRGIELSAEYR TPVLENFEIGGGISYKHNKVDSKGYYEHKGVDSVPVYFTARYNFKNSSEVTPYVKANLGY AFNSGSLKWFNNSSYYGEAKFKGGLYSGVGAGIQYKNFVADLSYNWNRMRVDRKGYVAPY RYEDKFTLSHGTLTLGLGYSFGF >gi|261747843|gb|ADAD01000093.1| GENE 5 4103 - 4648 335 181 aa, chain + ## HITS:1 COG:TP0381 KEGG:ns NR:ns ## COG: TP0381 COG5522 # Protein_GI_number: 15639372 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Treponema pallidum # 26 174 77 225 238 84 34.0 1e-16 MGIFVLSFKLLDSVYRVLYQYEPVYNTVPIHLCNFAAIAAGVYLITRKRFFFNLLYFLSF GAAFALVLPGVTYYYYPVYVYIFMIMHALEFVGVIYGFVYLKEKITFKGFIGSCIVLLGL FLFSHFYNMKFNTNAMFINDYIAPMVSFIKPFGLYIFVLIISMLFIMFLMYLPFSKKYSK N >gi|261747843|gb|ADAD01000093.1| GENE 6 4925 - 6829 2567 634 aa, chain - ## HITS:1 COG:FN0007 KEGG:ns NR:ns ## COG: FN0007 COG0445 # Protein_GI_number: 19703359 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: NAD/FAD-utilizing enzyme apparently involved in cell division # Organism: Fusobacterium nucleatum # 1 625 1 622 628 747 62.0 0 MNKNYDVIVVGAGHAGVEAALAASRLGLKTALFTIYLDNIAMMSCNPSIGGPGKSHLVSE LGMLGGEMARHIDNYNLQLKNLNHTKGLASRITRAQADKYRYRIKMREVLEKQENLDIIQ GIVTDLILENGEIRGVEDKLGVKYGAKAVILCTGTFLKGQFIIGDVKYSAGRQGEPSSDE LPDKLAEYGFELDRYQTATPPRIDKRTIDFSKTQELKGEEKPRYFSYETEKEYNPVLPTW LTFTTEKTIETGKEMLKYSPIVTGIVSTKGPRHCPSLDRKILNFPEKTDHQIFLEQESVE SNEIYVNGFTTAMPPFAQEAMLKTIAGLENAKIVRYGYAVEYNFIPAYQLNLNLETKVIK GLYTAGTINGTSGYEEAACQGFIAGVNAARKILGKEEIVIDRSEGYIGVLIDDIINKKTP EPYRVLPSRAEYRLTLRQDNIFIRLFEKSKEIGLLRKEKLEELEKAKNQIDQETERLKTI TVYPTKETNEKLKEIGKIFNQKNNNEEISVNSANSPVSAFEFLARKEITYDNLSKFIETK 
EFSPIVKEQIEINAKYNVFIEREKTQIEKFKKLEEMKIPENIDYEKIRGLSNIAMSGLTY GKPHTVGQASRISGVTYNDIGILIANISAFLEKK >gi|261747843|gb|ADAD01000093.1| GENE 7 6831 - 7748 1132 305 aa, chain - ## HITS:1 COG:no KEGG:Sterm_4158 NR:ns ## KEGG: Sterm_4158 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 6 303 4 297 297 261 56.0 3e-68 MVKKSKNIFEKRNPFSNLNNEKTKNFVKQLEEDENIKTEVLKIKGAEITSINEKILRKID FSDREFINREITFEKIMETEALRNLAESINEIGIINPVYVIRKENGKYKILSGFRRLSAV YYGYENIKDFNPAGTNNLIIIPENAGYEILDKISLHENTLREDLTTLEISMKIWRESRNK KKNAEQIAQEYGISKRTVARYLRVEKYPEQLLEKLDEIKNIRKSDTIFNYLNRTNFENME KNIEKLSAMEISDIEREIKKLALTVDDNKIIVKKGKKSTSFEIQGRLTDEEIEKIKQIIS KRILI >gi|261747843|gb|ADAD01000093.1| GENE 8 7742 - 8512 1285 256 aa, chain - ## HITS:1 COG:TP0272 KEGG:ns NR:ns ## COG: TP0272 COG1192 # Protein_GI_number: 15639264 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Treponema pallidum # 2 253 3 251 253 186 37.0 4e-47 MKKIVIANNKGGVAKTTTAYNLASYYAKKGKRILVIDCDPQGNLTDALGVDPNEVDCTIL NALMKRNIKEAIIKLPQADNFYLIPSNLESERVNINLSSQLNRESVLKKILREVEDEFDM CIMDTSPSLSILTLNAIVAADSIYLTLRSGYFELRGAGLLISTINEIKEDLNDKLQVKGL ILTQYDIRTSLSSDSKEQLEEHFKSTIMKTVIRQNVDLAKAPAHAQDIFDYAPASNGARD YEDLALEVLEREGLEW >gi|261747843|gb|ADAD01000093.1| GENE 9 8525 - 9892 1983 455 aa, chain - ## HITS:1 COG:FN0006 KEGG:ns NR:ns ## COG: FN0006 COG0486 # Protein_GI_number: 19703358 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Fusobacterium nucleatum # 1 455 1 455 455 538 66.0 1e-153 MLFDTIAAISTPKGEGGIGIIRISGDKSFEILSKIFNMKNPNKDLGFYKFNYGFVHDNGK IIDEVMTVRMKAPKTYTCEDIVEINCHGGNLITEKVLELVLKNGARHAEQGEFTKRAFMN GRIDLSQAEAVMDLIQGKTEKSISLSMDQLRGDLKEKINKFKKALLDVTAHVNVVLDYPE EGIDDPLPENLRENLEKVYKEANILIESYDKGKKIKEGIKTVIVGKPNVGKSTLLNSLLK EERAIVTHIPGTTRDIIEEIINIKGIPLVLVDTAGIRKTEDIVENIGVEKSKQFIEKADL ILLVLDASKELEKEDIEVIEKIKKNNKKTIVLLNKIDLERKIDLSNYDLENILEISAKDN 
IGIDEMEEKIYSYIVEEDVENTSEKLIITNIRHKTALEKTKEAVNNIFETIDNGMPMDLI SVDLKEALDALSEITGEISSEDILDHVFGNFCVGK >gi|261747843|gb|ADAD01000093.1| GENE 10 9916 - 10785 1243 289 aa, chain - ## HITS:1 COG:FN0005 KEGG:ns NR:ns ## COG: FN0005 COG1847 # Protein_GI_number: 19703357 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein # Organism: Fusobacterium nucleatum # 155 288 31 162 163 83 50.0 4e-16 MDKIILNAQNREELDKMIKRTLTLEKDETFKITVIKEPKKILFFNLKGKYQIDIVKKYEI EKNEFKNDRYEKKNFEKFDKFEKKDVRKDKRNNRENFKKEKNEKYGFENKKFSGNRANRN FEQNVNTSGNKENEVKEIRNDDSNYDRIRSFMKEFIVNSKLDLKVVDIKRIEERYVVNVD GKDIRYLIGEKGSTLNSVEYLLSSVKNFKNIRVVIDSNDYKQKREESLRDLARKKGKKVL ETGRNVKLNPMSARERKIIHEEISFMKGLKTESVGEEPKRYLIIKRTRD >gi|261747843|gb|ADAD01000093.1| GENE 11 10790 - 11464 741 224 aa, chain - ## HITS:1 COG:FN0004 KEGG:ns NR:ns ## COG: FN0004 COG0706 # Protein_GI_number: 19703356 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YidC # Organism: Fusobacterium nucleatum # 14 221 8 202 205 172 50.0 4e-43 MAKAKGNFLRIPALEALVVTILKAINGIVGNYGIAIIIVTIIVRILIFPLTLKQEKSMKR MRELQPQLDKLKEKYKDDQQMLQQKTMEIYKENNVNPLGGCLPILIQMPIFIALFYAFSG DAIPNDATFLWFHLKHPDNLFKIGTFAVNLLPILNTVVTFFQQKMMSGPSKEQGVNSQMQ SMMYTMPILMLVLFYNMPSGVTLYYLVSGVLSVAQQYIIMKGRD >gi|261747843|gb|ADAD01000093.1| GENE 12 11490 - 11699 235 69 aa, chain - ## HITS:1 COG:FN0003 KEGG:ns NR:ns ## COG: FN0003 COG0759 # Protein_GI_number: 19703355 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 68 1 68 82 94 63.0 6e-20 MKHILLFLIKIYQKFFSGAFGRKCRFYPTCSEYARQAVTKYGVIKGVYLSVKRILKCHPF HKGGYDPLK >gi|261747843|gb|ADAD01000093.1| GENE 13 11699 - 12025 333 108 aa, chain - ## HITS:1 COG:FN0002 KEGG:ns NR:ns ## COG: FN0002 COG0594 # Protein_GI_number: 19703354 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNase P protein component # Organism: Fusobacterium nucleatum # 3 108 1 106 111 
78 50.0 4e-15 MQINKIKKNKDFAFIYNNSKKVYTRYAIIFIKENNKKEQRFGFVASKKTGKAVQRNRIKR LFREFVKLNKSKFKESSDYIFVGKSNLKENIKSLKYKDIEKDMLKAVK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:01:48 2011 Seq name: gi|261747841|gb|ADAD01000094.1| Leptotrichia goodfellowii F0264 contig00176, whole genome shotgun sequence Length of sequence - 251 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 251 302 ## gi|262038134|ref|ZP_06011534.1| response regulator/ggdef domain-containing protein Predicted protein(s) >gi|261747841|gb|ADAD01000094.1| GENE 1 2 - 251 302 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038134|ref|ZP_06011534.1| ## NR: gi|262038134|ref|ZP_06011534.1| response regulator/ggdef domain-containing protein [Leptotrichia goodfellowii F0264] response regulator/ggdef domain-containing protein [Leptotrichia goodfellowii F0264] # 1 83 1 83 83 117 100.0 2e-25 GLVEGDSVTTISGRKVTVKEKLQRKNTLELIGQEGIILKDEIVSGILRLDTKSDLKNAKD IRGIDTLSVTAKNIENEGKLLSD Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:01:55 2011 Seq name: gi|261747837|gb|ADAD01000095.1| Leptotrichia goodfellowii F0264 contig00008, whole genome shotgun sequence Length of sequence - 3333 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 28 - 87 9.1 1 1 Op 1 . + CDS 137 - 1120 1509 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 2 1 Op 2 . + CDS 1150 - 2178 1004 ## COG0859 ADP-heptose:LPS heptosyltransferase 3 2 Tu 1 . 
+ CDS 2280 - 3281 612 ## PROTEIN SUPPORTED gi|229879751|ref|ZP_04499249.1| (SSU ribosomal protein S18P)-alanine acetyltransferase Predicted protein(s) >gi|261747837|gb|ADAD01000095.1| GENE 1 137 - 1120 1509 327 aa, chain + ## HITS:1 COG:lin2113 KEGG:ns NR:ns ## COG: lin2113 COG0667 # Protein_GI_number: 16801179 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Listeria innocua # 1 327 1 327 331 453 68.0 1e-127 MEYTKLGNTGLEVSKICLGCMGFGDTENWIYKWVLEEEESRSVIKKALESGINFFDTANV YSLGRSEEILGKALKDYAKRDEVVIATKVFFEMRQGANVRGLSRKAIMTEIDNSLKRLGT DYVDLYIIHRWDYNTPIEETMSSLHDVVKSGKARYIGASTMYAWQFQKALNTAEKNGWTK FVSMQNHLNLLYREEEREMMPLCKEEKIAVTPYSPLASGRLTRDISAEKTKREETDQTAK GKYDSTMEYDKIIIERVKEIAEKYNRQRAEIALAWLLQKEQVAAPIVGATKFSHIETAVS ALDVKLTEEDIKFLEEPYIPHKVMGAL >gi|261747837|gb|ADAD01000095.1| GENE 2 1150 - 2178 1004 342 aa, chain + ## HITS:1 COG:FN0546 KEGG:ns NR:ns ## COG: FN0546 COG0859 # Protein_GI_number: 19703881 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Fusobacterium nucleatum # 2 337 6 335 335 231 43.0 2e-60 MKILIIRFSSFGDIVLTTPVIKKIKEKYPQADIDFLVYDTFSEAISLNPNIRNLIIFERK ESKNREYIKNIINKLKNENYDYVIDLHSKILSRIIGKSLGNKKTKYLKYKKRKWWKTLLV KTKLITYNADCTIVESYFTALKKLGITFDKSSLQSGKGDELEFYFDSIQEKELTEKYDLL SKPYFVLAPGASKFTKKWPYYNELAKRIIENNDIRIFIIGGKEDYNSVEEKEKIINLCGK ISFKESGIILKYARIAVVNDSGPFHIARALKTKTFVFFGPTDPKLFNFENTTFLIKNPDC PAHSLYGDDKFPKKYEKCMSDISVDEVYNKIMTENKEDMIIQ >gi|261747837|gb|ADAD01000095.1| GENE 3 2280 - 3281 612 333 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229879751|ref|ZP_04499249.1| (SSU ribosomal protein S18P)-alanine acetyltransferase [Slackia heliotrinireducens DSM 20476] # 3 305 440 755 781 240 41 1e-63 MKILAFETSCDETSVAVVENGKNVLSNIISSQIDIHKEYGGVVPEIASRHHIENILPVFN EAMEKAECTLKDIDYIAVTNTPGLIGSLLVGLMFAKSLSYSNDIPVIPVNHIDGHIFSSF IENDIKLPAISLVVSGGHTNLYYINENLETELIGETLDDAVGESYDKVARILDLPYPGGP 
EIEKISTFGEDTLKIKKPDVDNFDFSFSGIKTYVTNFVNREKMKNHKINKEDIAKSFQET VVKVLYDKVIKALAEKKVKTVLVAGGVSANKRLREKFKELPENVEIHFPKFEYCTDNGAM IGAAAYYKLKNKKSFEKNKYDIDAESTKEKNRK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:02:00 2011 Seq name: gi|261747822|gb|ADAD01000096.1| Leptotrichia goodfellowii F0264 contig00098, whole genome shotgun sequence Length of sequence - 16720 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 7, operones - 2 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 703 889 ## COG1738 Uncharacterized conserved protein 2 1 Op 2 . + CDS 721 - 1425 882 ## Lebu_0507 radical SAM protein + Prom 1432 - 1491 9.8 3 2 Tu 1 . + CDS 1540 - 3024 2084 ## COG1012 NAD-dependent aldehyde dehydrogenases + Term 3049 - 3094 -0.9 - Term 3037 - 3082 -0.9 4 3 Tu 1 . - CDS 3099 - 3626 641 ## COG2249 Putative NADPH-quinone reductase (modulator of drug activity B) - Prom 3655 - 3714 8.3 + Prom 3629 - 3688 7.4 5 4 Tu 1 . + CDS 3734 - 4078 419 ## COG1733 Predicted transcriptional regulators + Term 4283 - 4338 1.3 + Prom 4167 - 4226 5.6 6 5 Tu 1 . + CDS 4355 - 4573 192 ## gi|262038147|ref|ZP_06011545.1| hypothetical protein HMPREF0554_1540 + Term 4584 - 4647 2.2 + Prom 4580 - 4639 10.6 7 6 Tu 1 . 
+ CDS 4691 - 6334 2309 ## COG1283 Na+/phosphate symporter + Prom 6862 - 6921 9.2 8 7 Op 1 1/0.500 + CDS 6949 - 10755 4817 ## COG0046 Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain 9 7 Op 2 4/0.500 + CDS 10786 - 11265 801 ## COG0041 Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase 10 7 Op 3 2/0.500 + CDS 11338 - 12069 1098 ## COG0152 Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase 11 7 Op 4 13/0.000 + CDS 12075 - 13448 1696 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase 12 7 Op 5 21/0.000 + CDS 13489 - 14487 810 ## PROTEIN SUPPORTED gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase 13 7 Op 6 10/0.000 + CDS 14475 - 15083 762 ## COG0299 Folate-dependent phosphoribosylglycinamide formyltransferase PurN 14 7 Op 7 . + CDS 15122 - 16642 2068 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) Predicted protein(s) >gi|261747822|gb|ADAD01000096.1| GENE 1 2 - 703 889 233 aa, chain + ## HITS:1 COG:FN1996 KEGG:ns NR:ns ## COG: FN1996 COG1738 # Protein_GI_number: 19705292 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 6 233 4 234 235 192 46.0 3e-49 FQFGTNELLWLGYLILNFTAVILAYRFWGKSGLLAIVPLSIVVANIQVGKMMTLFGVDTT MGNIAFGGIYLASDILSENEGKKYAKKVVSLGFASMLFTTFIMQIVLKIQAAPSDVMQGP LSQVFGFLPRIAIASVCGFAASQAFDIWSYQAIRKIRPSFQDIWIRNNASTMLSQILDNI VFTFLAFTGIYPLDVIIVIIFSTYFLKVLIALLDTPFVYIATMWKNKGKINED >gi|261747822|gb|ADAD01000096.1| GENE 2 721 - 1425 882 234 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0507 NR:ns ## KEGG: Lebu_0507 # Name: not_defined # Def: radical SAM protein # Organism: L.buccalis # Pathway: not_defined # 1 234 1 234 234 306 69.0 5e-82 MIRYNVIKEKNPREIVLLKGHRCAYGKCAFCNYILDNTDDENEMERVNFEALSNVTGDLG TLEVINSGSVFELSENTLKRIKEICDTKNIKILYFEAYFGYLKRLNEIREYFKNQEVRFI IGIETFDNHYRTQILKKNFILNENVLEQLKKEYQTALLLICTEGQTKEQILSDIEQARKN FRETVVSIFINNGTEIRRDEKLVEWFLKEVYPELNKVNNVEILVDNKDFGVYVQ 
>gi|261747822|gb|ADAD01000096.1| GENE 3 1540 - 3024 2084 494 aa, chain + ## HITS:1 COG:FN0454 KEGG:ns NR:ns ## COG: FN0454 COG1012 # Protein_GI_number: 19703789 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Fusobacterium nucleatum # 1 491 1 491 491 622 60.0 1e-178 MDVRLQEKYNLYIDGKWVPASDGATLKVINPATGKEIANISEATSEDVDKAVKAAQKAFD SWKNSSLIERQNLLLKIADIIDENKELLATVETMDNGKPIRETMAADIPLGADHFRYFAG AIRAEEGTANMLDGNTMNIILREPIGVVGQIIPWNFPFLMAAWKLAPALAAGCTVVLKPS SHTSLSVLELMRLIGDVVPKGVINVLTGSGSKSGDFMLKHKGFKKLAFTGSTAVGQEVYS AACSHMIPATLELGGKSANIFFEDCDWDIAMEGAQLGILFNQGQVCSAGSRTFVQDTIYD KFVDSLAKIFDKVKVGNPLDPNTQMGSLIYERHLKDVLNYIEIGKKEGARLVTGGVRIED EEHKDGFYLRPTIFADVKNDMRIAQEEIFGPVVVIEKFHSEEEVIKMANDSIYGLGGGLF TRDLNRAIRVARAIETGRMWVNTYNAFPAGAPFGGYKQSGIGRETHKIILEHYTQMKNII INLNEKTIGMYETK >gi|261747822|gb|ADAD01000096.1| GENE 4 3099 - 3626 641 175 aa, chain - ## HITS:1 COG:FN1233 KEGG:ns NR:ns ## COG: FN1233 COG2249 # Protein_GI_number: 19704568 # Func_class: R General function prediction only # Function: Putative NADPH-quinone reductase (modulator of drug activity B) # Organism: Fusobacterium nucleatum # 1 175 5 180 180 261 73.0 4e-70 MKTLIILAHPDLEKSKVNKRWIEEAEKYLDKFTVHKLYEAYPNEIIDIKNEQELIEKHKG LILQFPIYWFNCPSLLKKWLDEVFTDGWAYGKGGDKLSARNIALTVTAGIDEKNYSKNGK YKYSLKEILIPFEITFDYCSANYKGFYAFYSAEFEATKERIENSLKEYIDFIKSI >gi|261747822|gb|ADAD01000096.1| GENE 5 3734 - 4078 419 114 aa, chain + ## HITS:1 COG:FN1232 KEGG:ns NR:ns ## COG: FN1232 COG1733 # Protein_GI_number: 19704567 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 1 111 1 111 113 179 81.0 1e-45 MKNGSCVKPDIKLSTTGFGYTLSLIGGKYKMTVIYKLYENAPFMRYNELKKSIRIISFKT LTSTLKELENDDLIIRKEYPQIPPRVEYSLSEKGKTLIPILNMMCDWGERNMKI >gi|261747822|gb|ADAD01000096.1| GENE 6 4355 - 4573 192 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038147|ref|ZP_06011545.1| ## NR: gi|262038147|ref|ZP_06011545.1| hypothetical protein HMPREF0554_1540 
[Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1540 [Leptotrichia goodfellowii F0264] # 1 72 1 72 72 130 100.0 4e-29 MIDTFNDSLKTYGRIIQLEGEFEKESDPVVVTDKYYAKNCFIPEQKWTDYINSPALNDYF KMIQDYKNRKKS >gi|261747822|gb|ADAD01000096.1| GENE 7 4691 - 6334 2309 547 aa, chain + ## HITS:1 COG:SP0496 KEGG:ns NR:ns ## COG: SP0496 COG1283 # Protein_GI_number: 15900410 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Streptococcus pneumoniae TIGR4 # 8 547 2 543 543 541 52.0 1e-153 MNNLAVNTINYQQMAFAFLGGLGLFLFSIKYMGEGLQIMAGDRLRYILDKYTTSPFLGVL VGILVTALIQSSSGTSVITIGLVGAGLLSLRQAIGIIMGANIGTTITTFIIGFNITDYAL PILFLGAACLFFTKLRVINNIGRILFGFGGIFFALTLMSGAMQPLKYLPEFRQLMINLSH NSVLGVFIGTLVTVLVQASSATISVLQNIYQENLVTLRGALPILFGDNIGTTITVIIAVI GANTAAKRLAASHVIFNVIGTIIFMIFLTPFTFFIEKMQYMLHLNPKMTIAFAHGAFNTA TTILLFPFIRVLEFLVVKLIKSNKREKEYKTKLDMALLTAPVIALGQVKAEITDMTSLVL ENLKASVDFFHNHNEKLAEEIETTEEGINNLDQEISNYLTLLSGGNFNVKEGEEIGIYLD MCRDVERIGDHAFGIVKDVNYEIKKKMKFSQTAHEEVDQLLATSTLMIENAIEALRNSDK EKSIEVLDLHNKLYAQEKKIRKNNIERMRNQECELRAGLYYIDLISHFTRIGDHARNMVE KVIENRV >gi|261747822|gb|ADAD01000096.1| GENE 8 6949 - 10755 4817 1268 aa, chain + ## HITS:1 COG:CAC1655_1 KEGG:ns NR:ns ## COG: CAC1655_1 COG0046 # Protein_GI_number: 15894932 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain # Organism: Clostridium acetobutylicum # 4 982 7 968 985 994 54.0 0 MNYRIFVEKKKNFRVEAQSLLNDLKDNLSVKNLENVRILNIYDIFNLNEKDLKKLNTSVF SEINADDIYYYLEDVLEETGVKRENEIYFATEFLPGQYDQRADSAVQCINLLSDSENVAV KSGKLIILYGKVSEEDIKRIKKYYINEVEMKEKDLAILIENREKENTEKVPVFKSFREKS EKEMEEFKKNLELAMTVKDLLFIQNYFKNEEKRDPSETEIRVLDTYWSDHCRHTTFETII DDIKIENEKYKDIIEKTLSEYIKSREYVHGEKIDKKSVTLMDLATIFGKEARKKGLLSDL EVSDEINACSIYIDVPVERIGENGNKIVTDEKWILQFKNETHNHPTEIEPFGGASTCIGG AIRDPLSGRTFVYQAMRISGSGDPTEKIENTIKGKLPQKVITQKAAHGFSSYGNQIGLTT TYVNEIYDEGYKAKRMELGLVVGAAPAENIIREKPEKEDVVILLGGRTGRDGIGGATGSS 
KEHTTESAEKCGAEVQKGNAIIERKIQRLFRNKDVTKLIKKCNDFGAGGVSVAIGELADG LEIDLNKVKVKYTGLTGTELAISESQERMAVVIAKENIEKFIEYAEKENLEAYKVAEVTD TNRLVMKYNDEIIADISRDFLNTNGAKSNINVEIENTKKVDFTRKISGNTLKEKFLNNMK DLNVCSQKGLMESFDSTIGSTTVLMPYGGEYQLTPSEVSVQKVSVVNGETNVASMVGYGY NPYVAKQSTFHGGAFAVIESLCKIVASGGNYKNVRFTFQEYFERLGNDPKKWGKPLSALL GTLYVQKEFRLPSIGGKDSMSGTFHDISVPSTLVSFAVSVTDSENVISNEFKKEGSKIYL VKTAYDKNDLPDLKDLKENFDFLTENIKNKKIMSANAVKGGGIAETLGKMTFGNKKGIKV NFDGINADELFKVNYGAFIIETEEDMKYKNAVLIGNVTDDKKITLKIADEIVEIGLEELI EEWEKPLEKIFPTRVSEREKENSSKSLNFEVKKINSAQAVEKFAKPRVFIPVFPGTNCEY DLERAFSKEGGVVKMSIFNNLSYENILNSIDDFAKEIDNSQILMLPGGFSAGDEPDGSAK FIVAVLKNKKITEAVDRFLKRDGLILGICNGFQALIKSGLLPYGEIRELNQDSPTLTHNA ISKHMSKIVRTKIITNNSPWLAGTEVDEIHAIPLSHGEGRVVITEKEYRQLMENSQIATK YVDFDGNPSMETEFNPNGSYYAIEGMLACNGRIFGKMAHSERTGENLYKNIPGNKEQNIF KNGINFFK >gi|261747822|gb|ADAD01000096.1| GENE 9 10786 - 11265 801 159 aa, chain + ## HITS:1 COG:CAC1390 KEGG:ns NR:ns ## COG: CAC1390 COG0041 # Protein_GI_number: 15894669 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase # Organism: Clostridium acetobutylicum # 1 159 1 159 159 228 75.0 3e-60 MKIAIFFGSKSDTEKMRGAANCLKEFGIEYEAYILSAHRVPEKLEEVLTEAEKKGAEVII AGAGLAAHLPGVIASKTVLPVIGVPLNAALGGTDSLYSIVQMPKSIPVATVGIDNSYNAG MLAVEILAVKYEDIKEKLVKFRKEMKEKFIKENEQKVEF >gi|261747822|gb|ADAD01000096.1| GENE 10 11338 - 12069 1098 243 aa, chain + ## HITS:1 COG:FN0988 KEGG:ns NR:ns ## COG: FN0988 COG0152 # Protein_GI_number: 19704323 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase # Organism: Fusobacterium nucleatum # 9 241 1 233 237 340 76.0 2e-93 MIIENGGKMEKKEFIYEGKAKQVYSTDDENLVIIHYKDDATAGNGVKKGTIKDKGIINNK ITAKLFSVLEKNGIRTHFKEMLNERDQLCEKLEIVPLEVIVRNVITGSMAKRVGIKDGTI PKTTIFEICYKNDEYGDPLINDYHAVAMGLTTFDELKYIYETTSKINDLLKKVFDEEGIT LVDFKIEFGKNSKGEILLADEITPDTCRLWDKETGKKLDKDRFRQDLGGIEEAYIEILNR LEA >gi|261747822|gb|ADAD01000096.1| GENE 11 12075 - 13448 1696 457 aa, chain 
+ ## HITS:1 COG:FN0987 KEGG:ns NR:ns ## COG: FN0987 COG0034 # Protein_GI_number: 19704322 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Fusobacterium nucleatum # 14 457 2 446 448 589 66.0 1e-168 MASVKSADKIEEGGVFALYSRKLRTDLAGLAYYGMYALQHRGQESAGFSIADFVSENEVK LKTVKGRGLVADVFSLKDLQSYSGNILVGHLKYATEGGASSHSYQPLRGESIMGKIAIVH NGNLLNTKELKEELMKNGSIFQTKTDTEIILKLLGKNGKFGYDQAILNTLKKLKGSFAIA VIIEDKLIGIRDPLGTRPLCLGMREDGVYVLVSESCALDAVNAEFVRDIEPGEIVVIDKQ GIESIRYANKKKKSFSSFEYIYFARPDSVIDGINVYSSRHEAGKLLYKQNPIEADLVIGV PDSGVPAAIGYSEASGIPYGTALLKNKYVGRTFILPTQELRENAVRVKLNPMKSLIENKR IVVVDDSLVRGTTSKILIKILFEAGAKEVHFRSASPVVISESYFGVNIASENELIGNTMT IDEIRDYIGATSLDYLSIENIKKALQNKDVNLDCFKD >gi|261747822|gb|ADAD01000096.1| GENE 12 13489 - 14487 810 332 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase [Acinetobacter baumannii SDF] # 3 329 12 344 356 316 48 6e-86 MAISYKDSGVDKEEGYKTVEKIKEKVKSTYSSNVMNELGSFGALYKLGDYKKPVLVSGTD GVGTKLKVAFEAGKYDTVGIDCVAMCINDILCHGAKPLFFLDYLACGKLDSDVSSEIIKG VVEGCLQSEASLIGGETAEMPGFYAPGEYDIAGFAVGVVEEDKIVNGSDIKEGNVIIALS SSGAHSNGFSLIRKLFTDLNEEYEGKTVGEYLLTPTKIYVKSIQKLTENIKVNGMAHITG GGLIENILRIIPEGLCANIEKEKIQIHPLFKNDKFKNVNEDEMWGTFNMGVGFTVILDKK DSEKAMEILRENGETAYEIGYISKGDKKICLK >gi|261747822|gb|ADAD01000096.1| GENE 13 14475 - 15083 762 202 aa, chain + ## HITS:1 COG:CAC1394 KEGG:ns NR:ns ## COG: CAC1394 COG0299 # Protein_GI_number: 15894673 # Func_class: F Nucleotide transport and metabolism # Function: Folate-dependent phosphoribosylglycinamide formyltransferase PurN # Organism: Clostridium acetobutylicum # 7 197 3 192 204 171 51.0 6e-43 MSEIKPKIAVLVSGSGSNLQTIINNIENGNLNCEISYVIADRFCYALERAEKHKIKSVLL DRKIYGDKLSDKINEILEKNNEKTSYIILAGYLSILSEEFIEKWEKKIINIHPSLLPKYG GKGMYGMKVHEAVIKNKEKESGCTIHYVDSGIDTGEPIMSIKVRVSEDDTPESLQKKVLE KEHILLTEGIKKLLENEKNERV >gi|261747822|gb|ADAD01000096.1| GENE 14 15122 - 16642 2068 506 aa, chain + ## HITS:1 COG:CAC1395 KEGG:ns 
NR:ns ## COG: CAC1395 COG0138 # Protein_GI_number: 15894674 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Clostridium acetobutylicum # 1 506 1 499 499 658 65.0 0 MKKRALISVFDKRGILEFAKFLYGKDVEIISTGGTYKYLKENGLEVTEISEVTNFKEMLD GRVKTLHPNIHGGILAVRNNKEHMETIEKEGIQTIDFVIVNLYPFFKEVQTDKSFDEKIE FIDIGGPTMLRSAAKSFADVTVICDAEDYTKVREEIESKGETSFETRKRLAGKVFNLTSA YDAAISNFLLNEEYPKYLSLSYEKKFDLRYGENPHQTSAYYVSTTENGSMKNFVQLNGKE LSFNNIRDIDIAWKVVGEFDEIACCAVKHSTPCGTAIGEDVFTVYKKAHDSDPVSIFGGI VAINREVDGKIAEELNKIFLEIVIAPSYSEEALGILKNKKNLRVIKCEVPRSQDKFEYIK VDGGILVQETNTKMIEEMKIVTEKQPTEKEKQDMILGMKVVKHVKSNAIVVVKDGTAKGI GTGQTNRIWATIHALEHAKGEGESLEGAVLASDAFFPFRDCVDEAAKYGIKAIVQPGGSM RDQESVDACNEHGISMVFTGIRHFKH Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:02:17 2011 Seq name: gi|261747802|gb|ADAD01000097.1| Leptotrichia goodfellowii F0264 contig00037, whole genome shotgun sequence Length of sequence - 23592 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 4, operones - 4 average op.length - 4.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 1847 2272 ## COG0249 Mismatch repair ATPase (MutS family) 2 1 Op 2 . + CDS 1847 - 4021 2725 ## Sterm_2742 OstA family protein 3 1 Op 3 . + CDS 4034 - 4768 215 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P + Prom 4876 - 4935 14.3 4 2 Op 1 9/0.000 + CDS 4980 - 7592 3619 ## COG0013 Alanyl-tRNA synthetase 5 2 Op 2 1/0.000 + CDS 7611 - 8027 627 ## COG0816 Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) 6 2 Op 3 31/0.000 + CDS 8081 - 9283 1911 ## COG0342 Preprotein translocase subunit SecD 7 2 Op 4 . + CDS 9283 - 10242 1284 ## COG0341 Preprotein translocase subunit SecF 8 2 Op 5 . + CDS 10268 - 11359 1212 ## COG0787 Alanine racemase 9 2 Op 6 . 
+ CDS 11378 - 11971 675 ## Sterm_2735 colicin V production protein 10 2 Op 7 22/0.000 + CDS 11955 - 13055 1313 ## COG0795 Predicted permeases 11 2 Op 8 1/0.000 + CDS 13060 - 14136 1177 ## COG0795 Predicted permeases + Prom 14297 - 14356 4.8 12 2 Op 9 . + CDS 14405 - 15631 1570 ## COG0612 Predicted Zn-dependent peptidases 13 2 Op 10 1/0.000 + CDS 15633 - 16739 1541 ## COG0772 Bacterial cell division membrane protein 14 2 Op 11 . + CDS 16756 - 17649 906 ## COG0564 Pseudouridylate synthases, 23S RNA-specific + Prom 17670 - 17729 13.3 15 3 Op 1 10/0.000 + CDS 17776 - 18627 1103 ## COG0777 Acetyl-CoA carboxylase beta subunit 16 3 Op 2 . + CDS 18662 - 19636 1573 ## COG0825 Acetyl-CoA carboxylase alpha subunit + Term 19653 - 19698 1.1 + Prom 19638 - 19697 12.6 17 4 Op 1 1/0.000 + CDS 19873 - 21063 1542 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes 18 4 Op 2 . + CDS 21111 - 22073 1321 ## COG0451 Nucleoside-diphosphate-sugar epimerases 19 4 Op 3 . + CDS 22112 - 23359 1611 ## COG1301 Na+/H+-dicarboxylate symporters Predicted protein(s) >gi|261747802|gb|ADAD01000097.1| GENE 1 3 - 1847 2272 614 aa, chain + ## HITS:1 COG:FN0693 KEGG:ns NR:ns ## COG: FN0693 COG0249 # Protein_GI_number: 19704028 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Fusobacterium nucleatum # 1 610 286 890 896 567 52.0 1e-161 NYADINSITRKNLELTKSQREKTVYGSLLWVLDKCKSSMGTRLLKRYINNPLLDIEEITK RQNDVQYFIDNILVREDLKEKLENIYDLERLLGKVIFGSENGKDLTALKNTVKASIEIIQ ILKDTSFFNDININVLIGIYRLIDESIDENAPFTVREGNIIKRGYNAELDEIYKIMNSGK DFLLEIEKKEKEATGIKNMKIKYNKVFGYFIEVTKSNLDMVPDTYIRKQTLTNAERFVTP ELKKYEDTIINSKAKIEDMEYYLFKEVSTKVKECKTDLVKLAEKLAYLDVIVSFSTVAIE NDYIRPEITNDFSIDIKDGRHPVVEKLIGRSDFVSNDTLLTEKERFTVLTGPNMSGKSTY MKQVALISIMAQTGSYVPASSAKLSVVDKYLTRIGASDDILTGQSTFMVEMSEVSNIINN ATERSLIILDEVGRGTSTFDGVSIATAISEYIHEKIKAKTIFATHYHELTELENRYENIV NYRIEVEEKAGKVMFLRNIVKGGADKSYGIEVAKFAGLPKEILHESKKILHRLEQKKELI 
EKTIDVHQLSLFGQVIEAQESYQGYNPDFYTEDKTDKVAEEIIYDIESKDVNNMTPMEAL KFLSEIKAKLEEKK >gi|261747802|gb|ADAD01000097.1| GENE 2 1847 - 4021 2725 724 aa, chain + ## HITS:1 COG:no KEGG:Sterm_2742 NR:ns ## KEGG: Sterm_2742 # Name: not_defined # Def: OstA family protein # Organism: S.termitidis # Pathway: not_defined # 1 724 1 737 742 568 45.0 1e-160 MWKKTAYGVLIVIAGYFLYSVFLKKIDTSPVEKMKQEMNAKNVTYKLKDDAIIKADEQIG TQTDGIIKFKGVVIDLLKKDMLISAKEAEVNTKTSDITLKNTVEGRTKDNKWNIFTEHVD YKKEGDLIISDTRTKIINNEDKTELQADKVQTTVKFEEITGTGNVVYKKEGKELKADKIK YNDVNKQAEAEGNVKYKDEKSDIAANRGVYFIEKKQVDATGNVDYRNKDLNVKANHVFYD EIKQIANADGNGTFSYFPRKSTGTFQSGVYDLVNEILTTDQYYTMNYDDYKMKGTGLVYL FKTGDATLKSNFSVTKQNFTVSGSNGTMNTIVKNIFANNMLMTSVQGDRISSRTGEGSFE KKEFRFDGNINGKIRGNVKNFVTNPTKLVDSEAVHFRGNTAKVYFLSHSNNDMSITRSEI KENVHMIYKELNLDSQYNEIDTSKNLVLARDRVILDFRNETQMTSNFLYLDLNQEIGNAQ NNVKIVSKLPQLVNLNTSSDKATVNMKEKKVTLLGNVVSYQGKTKISSKKAVYDINKKIL ENDGNIKMEYFVQNSAVSTGKTNATDVQAVDEILKKLSVSQNDINNRDKIELPRAMTASN GTNVNIKWSSSNSGFLPVTGKVNKEFLGGNRRNVTLKALLRAGSEEKEKTFNVNIPVETV GEMLERAAKNIYVPESQNNLPSTVKVNVAKGTLDIPVTWERSGSKEKGLVAILRYQGVEY KKQF >gi|261747802|gb|ADAD01000097.1| GENE 3 4034 - 4768 215 244 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 22 226 147 351 398 87 27 8e-17 MGRKISIEADNLKKIYRNREVVKNVSLSMEKGEVVGLLGPNGAGKTTTFYMITGIIKPNN GRVTYNGEDITNFPMYKRARLGLGYLPQEASVFRNLTVEENIISVLEMRGIPKKERIAKM SSLIEEFKLTHVAKSLGYALSGGERRRVEIARTISTNPDFILLDEPFAGVDPIAVEDIQN IIMQLKKRGLGILITDHNVRETLRITERAYIMAEGTILISGTGEEIANNETARKVYLGDN FKLD >gi|261747802|gb|ADAD01000097.1| GENE 4 4980 - 7592 3619 870 aa, chain + ## HITS:1 COG:FN0697 KEGG:ns NR:ns ## COG: FN0697 COG0013 # Protein_GI_number: 19704032 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 869 2 866 867 900 54.0 0 MTGNELRKSFVDFFKSKEHKHFESASLIPDDKSLLLTVAGMVPFKPFFLGEKEAPFKRIT 
TYQKCIRTNDLENVGKTPRHHTFFEMLGNFSFGDYFKKEAIEWSWEYITKVLKLDKERLW VSVFETDDEAYKIWNEEIGVPAERMVRLGEDDNWWAAGPVGSCGPCSEIYYDTQNMGKNN EEINCKPGDEGDRFLEIWNLVFTEWNRLEDGTLVPLPEKNIDTGAGIERIASVIQNKKSN FETDLFMPIIKGIEKVLDIKKEEHDEAVKVIADHIRASVFLIGDGVLPSNEGRGYILRKI IRRAFGVGSVAKEKVFEKEDIFLHKLVSYVVETMKDGYPDLVEKSEYIEKVIKIEEERFS TTLKNGTEMLSEEIAKLKKQNKKKLSSEISFKLYDTFGFPFELTKLILNNEGIEVSEEDF EKRLEEQITRSQRSRVTISDMIKDDFIDEFFEKHGKTEFTGYENFADKGKILYAAKSEGM SGYEMIFDKTPFYAESGGQVSDTGKITAGEFEGRVVDVVKKRDVFIHQVEVIKGIIPAEN TTVELEIDVERRKDIQRNHTATHILHKVLREKLGTHVEQSGSLVESDRLRFDFSHYEPIS KEMIEEIEYEVNNVILSNIKTKIDYENIQEAKNRGAMALFSDKYGDIVRVVEIPGFSIEL CGGAHVKSTGEIGFFNIESETGISSGVRRIIATTGHKSLEYVNGIEEKLAEISATMKSDE NNIVEVLKKYKNEFKDLEKAYMQLQSRLLKYEINEIFEGVEEISGVKVLAKTFENKNIDE LKEIVDRGKEKLQSGIIILGSNNEKAIFVAGVTKDLISKIKAGDIVKVAAQTADGNGGGR PDFAQAGGKNGSAVKEAVEKSKEYIVSQLS >gi|261747802|gb|ADAD01000097.1| GENE 5 7611 - 8027 627 138 aa, chain + ## HITS:1 COG:FN0698 KEGG:ns NR:ns ## COG: FN0698 COG0816 # Protein_GI_number: 19704033 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. 
subtilis) # Organism: Fusobacterium nucleatum # 1 138 1 138 138 126 56.0 1e-29 MKKFIGLDIGDVRIGVAKSDPLGILATGLEVIDRNTVNPVQRIKEILSDEGTKKIVAGLP KSLDGTKKRQAEKVEEFVAELKESIPDIEVIMVDERYTTTEAEHYLKNYSKKNGKERRKV VDMVAASIILQKYLDRIK >gi|261747802|gb|ADAD01000097.1| GENE 6 8081 - 9283 1911 400 aa, chain + ## HITS:1 COG:FN0699 KEGG:ns NR:ns ## COG: FN0699 COG0342 # Protein_GI_number: 19704034 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecD # Organism: Fusobacterium nucleatum # 3 395 2 403 411 362 51.0 1e-100 MKNKTIHYILLLAIIIIPGAILYRNKIKLGLDLRGGTYVVLQAQGKIENDTMDKVRDIVE RRIDSLGVSEPVLQINGKDRLIVELAGIKDPQKAIDLIGTTAKLEFKIKNQDGTYGPTLL EGSAIKNATLVQGQFGQPEVAFELDTKGAETFAKITRENVHKQLAIMLDGKEQSAPVINT EIPGGRGVITTNDPEDAQALTNLLKSGALPVSIKIMETRTVGATLGNESIQQTKMAGVIA MIAISLFMFAVYKIPGLIADIVLAIYGFLVLASFSMIGATLTLPGIAGFILTLGMVVDAN VITFERIKEELRKGYSLEDAVENGFKNGLPAILDGNITTLLVASVLFFFGTGPVKGFAVT LTLGVLITIVTALFITKVLFKLVISVFKVKKEQLFWKGVK >gi|261747802|gb|ADAD01000097.1| GENE 7 9283 - 10242 1284 319 aa, chain + ## HITS:1 COG:FN0700 KEGG:ns NR:ns ## COG: FN0700 COG0341 # Protein_GI_number: 19704035 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecF # Organism: Fusobacterium nucleatum # 2 319 4 317 317 261 47.0 2e-69 MNLRVIQLKKYYLGFSAIMVLISIIFFLTIKLNLGVDFKGGDLLQLHYSQAVNKDTLNGA LDGIINEVPQLKNRRLQYSEDNTIVLRTEQLTDAQKNAVLKQLTEKTGKYELVKYDTVGP TIGKELTSNALTALLIGSLLIIIYITIRFEFVYAVAGVAAVLHDVIIAMGLIAFLKYEIN TPFIAAILTILGYSMNDTIVVFDRIRENDAKEGKTKDFAEIIEESVNQVYMRSLFTSLTT LLSITVLLIFGGDSLRTFNTALFIGMIFGTYSSIFVASPLVYLMRKYRKKRKDTDHKDKS KKYRQKTVNGYDEDEKVLV >gi|261747802|gb|ADAD01000097.1| GENE 8 10268 - 11359 1212 363 aa, chain + ## HITS:1 COG:BS_yncD KEGG:ns NR:ns ## COG: BS_yncD COG0787 # Protein_GI_number: 16078827 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Bacillus subtilis # 4 363 9 383 394 217 35.0 3e-56 
MRCWAEINIKNLYDNIKEIEKITSKEHIMAVIKADGYGHGMEKICEALIKTGIKNFAVAT SEEAHKIREIDDSVNILILGPIENDHANSVSEKNIYFMITDFQEIDYLEKNKCNTEVFIK IDTGMGRVGFQLQEIEKLAETLKRCKYVKPIGVFSHFSSSDSDVDYTDLQIKRFEEVSNK LKKELPTVKYRHLLNSFGSLRFQKNKYDFIRAGIIIYGGVTEEETKPYKFKPVMSLYAKI SYIKTLSEDSYISYGNTYLGKKGETLATVSIGYADGVRRDLSNKGYVYYKGHKCNIVGRV CMDQLVIKLPEKLAKEAAKGDKVEFFGENINVVDVANLCNTISYEILCGISQRVPRIYIN ENE >gi|261747802|gb|ADAD01000097.1| GENE 9 11378 - 11971 675 197 aa, chain + ## HITS:1 COG:no KEGG:Sterm_2735 NR:ns ## KEGG: Sterm_2735 # Name: not_defined # Def: colicin V production protein # Organism: S.termitidis # Pathway: not_defined # 1 197 1 196 196 99 41.0 8e-20 MILDIGFLILLILSFLLGRKRGFTLEFFNVFKYLLILYFMKYTYGAVKVLFKLAEKDSRD QLKIYIIAFAILYISLTIILKLSANFLKSIKLKRLNEFFGGILGIIKTTFVIFIIYIIVL IGSTHSKRLEEIKHQSLAVKGITQYLYVYSEVFPDFIKNDVNRYRKKRAEEKLKRNVLNE LKENNLNEGIKNNENNR >gi|261747802|gb|ADAD01000097.1| GENE 10 11955 - 13055 1313 366 aa, chain + ## HITS:1 COG:FN1031 KEGG:ns NR:ns ## COG: FN1031 COG0795 # Protein_GI_number: 19704366 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 1 348 1 340 359 132 25.0 8e-31 MKIIDKYIYNSLILPSVFGISIFTFILMLNVLIEAMERLFASDLPFLSVVDYFFYVVPGI LVQTIPMGAFLGVMLVYGGLSETNEIIAMEGSGISLFRIIRPAFIFGLILTFIGLGLEIY VNPRALENINKQTKMLLATRPNSLTEEKIFLTNPEKGFGFYIDKVDNKKATASQFLLLNK QGNNPYPVIFLAKSARFDPGVIVLSDVKGYSFDKEGNSQVAAEYKEQNVPVSTFFTSEEG EQRQKKRSEMNLKELREFYNNNIGNPEMKEAALKALIESYQRVIGPLASVFLCWLGVLLS VGNRRSGKGISFGISLIVIFGYIAIVNYAKIMILKNNVPVSIAMWIPNFILFLLCVYFSI KKYRRH >gi|261747802|gb|ADAD01000097.1| GENE 11 13060 - 14136 1177 358 aa, chain + ## HITS:1 COG:FN1030 KEGG:ns NR:ns ## COG: FN1030 COG0795 # Protein_GI_number: 19704365 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 1 358 2 363 363 174 32.0 2e-43 MNKLDKYIILNYIKSFFLGMMMFFLIFILAESINLTGWIMDGKFTLGEAVKYLSYGIPEI VTNTAPLGILLGSLLAISKMAKQLEITAMKTGGISFLRIAKFPLIFSFFVSLSVLYVNVD 
ILGKSNSKKSNMKLLKLEQAEPVKVEKNFILVKIDKNRILYSGYVNKKDGIMREVEIIEM ADEFKNVKTVYTASEGALEKGTDNWTFTNLKEHNIMTNVSKNIDPGVFKLKIPIDDILAD PVKAKNLTMPELREKIVYFTRVGADSIDLLIDFYYRISFALASFVMCFIGLSLGSRYVRG GAAVNIGLSVIIGYSYYGFSTIMKSLASTGTIPVYIACFLPLLIYLGVGIKLFMEAEY >gi|261747802|gb|ADAD01000097.1| GENE 12 14405 - 15631 1570 408 aa, chain + ## HITS:1 COG:FN1029 KEGG:ns NR:ns ## COG: FN1029 COG0612 # Protein_GI_number: 19704364 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Fusobacterium nucleatum # 8 387 10 389 408 259 39.0 5e-69 MPEEIITNRGIRVIFDRLENISTCSVGVFVKTGSKDESDQEEGISHVLEHMIFKGTSKRD YFQISEEVDYLGASINAHTTKEETVFYINALTEFLGKSVDILFDIVTNSLIPEDELKKEK DVIVEEIKMYQDSPDDLVFELNYADCIKGQYSKPIIGTEESVRSFTSEMIKKYYKERYTK DNILIVVSGNFDKKEIIEKIDEYFSKLQENKVDRRENISFEFKEGRETHEKDINQVNICI SFEGKSYNSSERIYTDILANIMGGSMSSRLFQEIREKKGLAYSVYTYNQYYREGGVVTTY IGTNIESYKEAIDITLKEFSKMRKEGITETELQKAKNKYLSKIAFSMENPRSRMSILGNY FVRRGEIIDIDKMKKEIHEVKSENINEFLKSQYLKPNITVLGNIKGDK >gi|261747802|gb|ADAD01000097.1| GENE 13 15633 - 16739 1541 368 aa, chain + ## HITS:1 COG:FN1027 KEGG:ns NR:ns ## COG: FN1027 COG0772 # Protein_GI_number: 19704362 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Fusobacterium nucleatum # 1 365 1 362 366 257 40.0 3e-68 MFQNQRLTEIIKSNIVKMDKMLLLFVYALVMISTVFVYSATRSTKFVVQNLIWISIGTLL WIGISFIDYRDMKKHIWKIYGLSAALLLLVRFAGKKTLGAQRWIKLGPFQLQPSEFVKIA IIVIIAFWIVEKYAKGINNLKDIIGSFLPAIPLILLILLQPDLGTTLITVCSFVFMIFLY GADMKPIWVIAIIVILSAYPVYRFVLSDYQRTRVETFLDPEKDRKGSGWHVTQSKISVGA GGLYGKGVLQGSQSRLEFLPEPQTDFIFSVISEESGFIGSTTVILLYFLLIFNIMRISRL TQDRFARLILYGISGIFFMHVIVNIGMTIGLVPVTGKPLLFLSYGGSSFLSSFIMIGLVE SIKINIED >gi|261747802|gb|ADAD01000097.1| GENE 14 16756 - 17649 906 297 aa, chain + ## HITS:1 COG:FN1026 KEGG:ns NR:ns ## COG: FN1026 COG0564 # Protein_GI_number: 19704361 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S 
RNA-specific # Organism: Fusobacterium nucleatum # 5 292 3 282 289 194 43.0 2e-49 METFEYIVDSESEGMRLDRYLKKTFKNESLSKIFQALRKGDVKINGKKSKENYRLCLGDK ITVKYLYSEKEEKTKKTFDFDEKKYKNMIIFENNDFFIINKPGNVPMHKGTGHKHGLSEI FKKIYENDNINFANRLDYETSGLVIGCKNLKFLRYISEKIRNNEVQKKYVAAVDGEIEKD KFIIENYLKITENGVIQSYVSDKDAKRSITKYKKINNRFFDKILNNIKKEKVTLLDIDLV TGRKHQIRVQLSSIGNPVIGDRKYGKKKEENKLYLCCYSVSFDDYKFKLENENEYFK >gi|261747802|gb|ADAD01000097.1| GENE 15 17776 - 18627 1103 283 aa, chain + ## HITS:1 COG:FN0408 KEGG:ns NR:ns ## COG: FN0408 COG0777 # Protein_GI_number: 19703750 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase beta subunit # Organism: Fusobacterium nucleatum # 1 283 11 304 304 326 57.0 3e-89 MGLFSSKKSKNKYATVTSKSKLTMDVVDDNKWKKCSRCNEIIYNEDLKNNLNICPKCGNY FRLTAFERIELLVDEDTFIEEDMTLNSKDFLKFPGYEEKLENSREKSRMLDGIISGIGKI NGIEVSIAAMEFSFMGGSMGSVVGEKVTRALERGIEKKIPVVIVSSSGGARMQEGIVSLM QMAKTSGAVKRLNEAGLPFISVPVDPTTGGVTASFAMLGDIIVTEPNALIAFAGPRVIEQ TVNQKLPKGFQRAEFLLEHGMVDVISERKDLKMTIYRILEKLI >gi|261747802|gb|ADAD01000097.1| GENE 16 18662 - 19636 1573 324 aa, chain + ## HITS:1 COG:FN0409 KEGG:ns NR:ns ## COG: FN0409 COG0825 # Protein_GI_number: 19703751 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase alpha subunit # Organism: Fusobacterium nucleatum # 1 312 1 311 313 337 59.0 2e-92 MSVKEEIKELEDKIEELRRFSAEQKIDFSKQIGELEKNLEEKYLEFSEKEMDSWARIQIS RNPKRPYTLDYINELTQDFVELHGDRLSKDDHAIIGGLASVDGYNIMIIGHQKGRDLESN MYRNFGMASPEGYRKALRLMRMAERFELPILTLIDTSGAYPGIEAEEKGQGEAIAKNLSE MFSLRVPVVSVVIGEGGSGGALGIGVADSVLMLENSVYSVISPEGCASILFNDATRAAEA AKSLKMDAINLKGLKIIDEIIEEPLGGAHRNLEKTAQNLKAAVLKEFKKIDKLTVEELLE RRYEKFRKMGEYFENEENIEENEK >gi|261747802|gb|ADAD01000097.1| GENE 17 19873 - 21063 1542 396 aa, chain + ## HITS:1 COG:SA0508 KEGG:ns NR:ns ## COG: SA0508 COG0156 # Protein_GI_number: 15926228 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Staphylococcus aureus N315 # 17 396 4 383 
383 543 72.0 1e-154 MSNGVLKKFLSPKLRELKEKGLYNEIDVLDGANGPEIFINGRKLINLSSNNYLGLATRED VKQAEIEASKKYGAGAGAVRTINGTMVIHKELEETIAKFKHTEAAIAFQSGFNCNMAAVS AIMDKQDAILSDELNHASIIDGCRLSGAKIIRCKHQDMNDLRLKAKEAIESGLYNKIMYI SDGVFSMDGDVAKIKEIVEVAKEFGLITYIDDAHGSGVMGEGAGTVKEFGCSADIDIQVG TLSKAVGAVGGYVAGSKELIDWLKVRGRPFLFSTSLTPGAAAAAKKSIEIISTEKDLVKK LWENSRYLKQELKKIGYDTGVSVTPITPVIIGDENKTQLFSKRLIEEGVYAKSITYPTVP LGTGRVRNMPQASHTKEMLDEVVRIYEKVGKELKVI >gi|261747802|gb|ADAD01000097.1| GENE 18 21111 - 22073 1321 320 aa, chain + ## HITS:1 COG:SA0511 KEGG:ns NR:ns ## COG: SA0511 COG0451 # Protein_GI_number: 15926231 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Staphylococcus aureus N315 # 1 317 1 320 321 394 58.0 1e-109 MKKVLITGATGQIGTELTIRLRKDLGIENVIATDIKTGEIADEGIFEILDVKDYDKFLKL AKNNKVDTIIHLASILSATAEKNPLGAWRLNMDGLVNALETAKECHAQLFAPSSIAAFGD DTPKDHTPQDTLMHPNTIYGVTKVSGELLCDYYYTKFGVDTRSVRFPGLISHKTLPGGGT TDYAVHIYYDALTKGEYTSFIDKGTYMDMMYMDDAIDAIINILNADPSQLKHRNSFNITA TSFEPEQLAATIKKYIPEFKLKYDVDPVRQKIADSWPNSLDDTCAREEWHFSPKYDLNRM TETMLKELSKKLDVKVNLNL >gi|261747802|gb|ADAD01000097.1| GENE 19 22112 - 23359 1611 415 aa, chain + ## HITS:1 COG:Cgl2969 KEGG:ns NR:ns ## COG: Cgl2969 COG1301 # Protein_GI_number: 19554219 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Corynebacterium glutamicum # 7 408 5 395 412 296 45.0 7e-80 MLKKIPLLIQIIIAILLGIILGKFASKDLVISKAFSINFVRILVTFSDIFGQLLKFLIPL IILAFIAPGIGALAKGAGKLLGITTAFSYGSTVISGLTAFITASVLYPIILQGQSIANFG NPEEALLKGYFIIEIPAPMQVMSALVLSFVLGLGMSVLKTDALFKVMDEFRIIIENMLEK VIVPILPFYILGVFANMTAAGQITGIINVFIKVFVMILILHLTILIFQYTVAGIMTKQNP FKMIKIMLPAYFTALGTQSSASTIPVTLKSAKKMGVKEEVANFAIPLCATIHLSGSTITL VSCSTAVFFMQHGMLPSWGLMIKFILLLGVTMVAAPGVPGGAVVAALGILNSVLGFDETM LSLMIALYIAQDSLGTACNVTGDGAIAAVTAVFVKNEEEKEKAAQIEKVQINEVI Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:02:31 2011 Seq name: 
gi|261747800|gb|ADAD01000098.1| Leptotrichia goodfellowii F0264 contig00216, whole genome shotgun sequence Length of sequence - 491 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 16 - 75 2.0 1 1 Tu 1 . + CDS 136 - 490 310 ## gi|262038175|ref|ZP_06011571.1| hypothetical protein HMPREF0554_2425 Predicted protein(s) >gi|261747800|gb|ADAD01000098.1| GENE 1 136 - 490 310 118 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038175|ref|ZP_06011571.1| ## NR: gi|262038175|ref|ZP_06011571.1| hypothetical protein HMPREF0554_2425 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2425 [Leptotrichia goodfellowii F0264] # 1 118 1 118 118 196 100.0 4e-49 MKKWLNDGHGAYTYYSAQLAKDIDGLLRKYENTKDKTAKQTIQEDIEKKYIENQERLMEL FTQGPTLMNNSEGLLKGLGEYNRVENERVYGEGKYQGKNLPQNYITNILNESIREGKL Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:02:40 2011 Seq name: gi|261747793|gb|ADAD01000099.1| Leptotrichia goodfellowii F0264 contig00043, whole genome shotgun sequence Length of sequence - 3727 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 1258 1879 ## gi|262038182|ref|ZP_06011577.1| hypothetical protein HMPREF0554_0592 - Prom 1318 - 1377 3.1 2 2 Tu 1 . - CDS 1379 - 1795 367 ## gi|262038181|ref|ZP_06011576.1| conserved hypothetical protein - Prom 1841 - 1900 8.6 3 3 Op 1 . - CDS 2032 - 2229 118 ## gi|262038179|ref|ZP_06011574.1| conserved hypothetical protein 4 3 Op 2 . - CDS 2226 - 2612 437 ## gi|262038178|ref|ZP_06011573.1| hypothetical protein HMPREF0554_0595 5 3 Op 3 . - CDS 2657 - 2911 179 ## gi|262038177|ref|ZP_06011572.1| DNA topoisomerase IV subunit A 6 3 Op 4 . 
- CDS 2982 - 3680 496 ## gi|262038180|ref|ZP_06011575.1| hypothetical protein HMPREF0554_0597 Predicted protein(s) >gi|261747793|gb|ADAD01000099.1| GENE 1 1 - 1258 1879 419 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038182|ref|ZP_06011577.1| ## NR: gi|262038182|ref|ZP_06011577.1| hypothetical protein HMPREF0554_0592 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0592 [Leptotrichia goodfellowii F0264] # 44 419 1 376 376 644 99.0 0 MSLVSEKRGKRVLGRFISLSIILVITFNSAGKIQNPLSASSKTVTPSISREKGETTAEYY SKNEDITKGTTYYNNNPEVKVVGVDVQTKGIEGTVKSLDVISVQDKVDSKNTGYNISYGI GIGDHTVVRDGKGMNANGIYTANIGVGYSRGDVTQRTTNAVGSFTAESGVLNVEGKTRQV GSVIGGNFTLKTKEYEHEDLEDINKSRTIGFNITVTPGVTEVYRNGRPTGEYKGGAAYGT RINYAENDYVAKVKATIGENVSPIIDGKLSELKDVNRDVDNRVEVIKDKEIRPINADLGT EYWATDYAREKARGDFVKAGNKIDRVAEILNAAIKNDNKDIYLYYKDRIAAEEEIARMER EGIDLSIIDKEQAKKIIERAAGGEVDDVQIISSKDPNIKEIIGAEGIRGRAYYTGKGKV >gi|261747793|gb|ADAD01000099.1| GENE 2 1379 - 1795 367 138 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038181|ref|ZP_06011576.1| ## NR: gi|262038181|ref|ZP_06011576.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 138 67 204 204 189 100.0 7e-47 MKETLEYLLKKIGYCEAEKSYNELYIRYSLFEEAYLVYCIVLLILWLCVFEFPQIIIVII IYLIVINIRFFKRKYIMTKYGNKIKIFEKNKIAYIIEDFYEDENSVIIEYNIKQNLLSKH NFYYFSNNILPRKYMKKL >gi|261747793|gb|ADAD01000099.1| GENE 3 2032 - 2229 118 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038179|ref|ZP_06011574.1| ## NR: gi|262038179|ref|ZP_06011574.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 65 4 68 68 87 100.0 4e-16 MKIEYSKFDFNKNDIGLFLFFLFYIQPSLLLSKNMKDIVGGIIFLNIATFYFIYRFYLVW ILKKI >gi|261747793|gb|ADAD01000099.1| GENE 4 2226 - 2612 437 128 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038178|ref|ZP_06011573.1| ## NR: gi|262038178|ref|ZP_06011573.1| hypothetical protein HMPREF0554_0595 [Leptotrichia goodfellowii F0264] 
hypothetical protein HMPREF0554_0595 [Leptotrichia goodfellowii F0264] # 1 128 1 128 128 182 100.0 8e-45 MKYFATIAGGLGVGAGGGFSISGGTSYLTGIDSPDQLKGWGNSLSLNFIGGISGSFSAEF SKGAGGFSISAGTGGPASATVTYQLGYTFVSKNSFDVLDKLPISKKSKEKLRAEFEKRKR ESKKGWRK >gi|261747793|gb|ADAD01000099.1| GENE 5 2657 - 2911 179 84 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038177|ref|ZP_06011572.1| ## NR: gi|262038177|ref|ZP_06011572.1| DNA topoisomerase IV subunit A [Leptotrichia goodfellowii F0264] DNA topoisomerase IV subunit A [Leptotrichia goodfellowii F0264] # 1 84 1 84 84 140 100.0 2e-32 MGTIAEEGRHIYHRNNNGLEDSEDYASFYGRQFERYYYRNFGNNDVAFITERLKYTKEQL GNDWENDVRTLFAAISGEIGYYGH >gi|261747793|gb|ADAD01000099.1| GENE 6 2982 - 3680 496 232 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038180|ref|ZP_06011575.1| ## NR: gi|262038180|ref|ZP_06011575.1| hypothetical protein HMPREF0554_0597 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0597 [Leptotrichia goodfellowii F0264] # 1 232 16 247 247 316 100.0 1e-84 MKIEISENISSIRMTKFNIFILLYFLTCSFILLFIIKNIFFSFLLVILIIAFIIYRKILI KKIKGKIKIFITDSGIEMISPKIKKINYSDIKNIAYIQNNSYGNGDIIINISPLKIENKY IDLFSKNDYTFAESINYRTFIIPDAKNVKEIFQKITEKKEKNEILTKTSEDKKRRIEIYS NNVIAISSKVNNSEIIFYDEAEYNEEKHEIKVKRMGITNKYLNEIGLPSLKL Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:03:51 2011 Seq name: gi|261747764|gb|ADAD01000100.1| Leptotrichia goodfellowii F0264 contig00117, whole genome shotgun sequence Length of sequence - 30621 bp Number of predicted genes - 28, with homology - 27 Number of transcription units - 9, operones - 5 average op.length - 4.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 1891 2639 ## COG0058 Glucan phosphorylase + Term 1914 - 1959 4.7 - Term 1909 - 1938 -0.3 2 2 Tu 1 . - CDS 1946 - 3436 1946 ## COG1640 4-alpha-glucanotransferase - Prom 3466 - 3525 10.8 + Prom 3520 - 3579 12.9 3 3 Tu 1 . + CDS 3600 - 3815 454 ## Lebu_1259 hypothetical protein + Term 3841 - 3886 4.2 + Prom 3890 - 3949 11.5 4 4 Op 1 . 
+ CDS 3999 - 5291 1980 ## COG2067 Long-chain fatty acid transport protein + Prom 5307 - 5366 1.6 5 4 Op 2 . + CDS 5408 - 6466 1188 ## COG1056 Nicotinamide mononucleotide adenylyltransferase + Prom 6472 - 6531 14.2 6 4 Op 3 . + CDS 6580 - 6729 258 ## PROTEIN SUPPORTED gi|169837420|ref|ZP_02870608.1| LSU ribosomal protein L33P + Term 6775 - 6843 30.4 + TRNA 6757 - 6832 85.7 # Trp CCA 0 0 + Prom 6760 - 6819 80.4 7 5 Op 1 . + CDS 6855 - 7070 268 ## Lebu_1898 preprotein translocase, SecE subunit 8 5 Op 2 45/0.000 + CDS 7072 - 7689 1126 ## COG0250 Transcription antiterminator + Prom 7702 - 7761 3.7 9 5 Op 3 55/0.000 + CDS 7781 - 8206 653 ## PROTEIN SUPPORTED gi|229885177|ref|ZP_04504630.1| LSU ribosomal protein L11P 10 5 Op 4 43/0.000 + CDS 8283 - 8990 1065 ## PROTEIN SUPPORTED gi|229210212|ref|ZP_04336609.1| LSU ribosomal protein L1P + Prom 9043 - 9102 6.5 11 5 Op 5 47/0.000 + CDS 9227 - 9766 673 ## PROTEIN SUPPORTED gi|229210211|ref|ZP_04336608.1| LSU ribosomal protein L10P 12 5 Op 6 28/0.000 + CDS 9828 - 10202 517 ## PROTEIN SUPPORTED gi|229210210|ref|ZP_04336607.1| LSU ribosomal protein L12P + Term 10226 - 10262 5.0 + Prom 10226 - 10285 5.6 13 6 Op 1 58/0.000 + CDS 10395 - 13841 596 ## PROTEIN SUPPORTED gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 + Prom 13857 - 13916 2.8 14 6 Op 2 . + CDS 13964 - 17980 5587 ## COG0086 DNA-directed RNA polymerase, beta' subunit/160 kD subunit 15 6 Op 3 4/0.000 + CDS 17982 - 18224 385 ## COG2052 Uncharacterized protein conserved in bacteria 16 6 Op 4 . + CDS 18260 - 18805 839 ## COG0194 Guanylate kinase 17 6 Op 5 . + CDS 18819 - 18998 414 ## Lebu_1888 hypothetical protein 18 6 Op 6 31/0.000 + CDS 19078 - 20886 1723 ## COG0358 DNA primase (bacterial type) 19 6 Op 7 . + CDS 20879 - 22060 1733 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) 20 6 Op 8 . + CDS 22114 - 23058 1348 ## Lebu_1885 hypothetical protein 21 6 Op 9 . 
+ CDS 23069 - 23878 1052 ## COG0327 Uncharacterized conserved protein + Prom 23892 - 23951 13.7 22 7 Tu 1 . + CDS 23982 - 24104 199 ## + Prom 24596 - 24655 9.8 23 8 Op 1 17/0.000 + CDS 24681 - 25529 1234 ## COG0061 Predicted sugar kinase 24 8 Op 2 . + CDS 25569 - 27227 2157 ## COG0497 ATPase involved in DNA repair 25 8 Op 3 . + CDS 27227 - 28174 1077 ## COG4974 Site-specific recombinase XerD 26 8 Op 4 . + CDS 28207 - 28617 356 ## Lebu_0485 hypothetical protein + Prom 28623 - 28682 7.4 27 9 Op 1 . + CDS 28720 - 29310 839 ## COG5506 Uncharacterized conserved protein 28 9 Op 2 . + CDS 29303 - 30571 2169 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase Predicted protein(s) >gi|261747764|gb|ADAD01000100.1| GENE 1 2 - 1891 2639 629 aa, chain + ## HITS:1 COG:FN0857 KEGG:ns NR:ns ## COG: FN0857 COG0058 # Protein_GI_number: 19704192 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Fusobacterium nucleatum # 15 625 185 789 789 766 63.0 0 DEVGKEYFKRVNTENVNAVAYDIPIIGYGNDTVNTLRLWEARSPEGFDLKLFNDQKYLLA SENAVQAEDISRVLYPNDTEKDGKLLRLKQQFFFTSASLQDIVRRYKSLYGNDFSKFHEK VAIQLNDTHPVVSIPELMRILLDYEKLNWGEAWNICKNVFSYTNHTILSEALEKWEISLF QPLLPRIYQIVEEINRRFMEELNEKYPDDWQKKQEMAIIGNGQIRMAWLAIVGSHTVNGV AALHTEILKNSELKEWYELYPEKFQNKTNGITQRRWLLKSNPELAGLITELIGDKWITDL YELKKLEQFIEDDNVLNRLTEIKFRNKQKLAEYIKETTGIDVNPHSIFDVQVKRLHEYKR QLLNVLHIMDLYNKLKENPLLDVEPRTFIFGAKAAAGYRRAKGIIKLINAVAERVNNDPD INGKIKVVFLENYRVSLAEKIFPAADVSEQISTASKEASGTGNMKFMLNGAITMGTLDGA NVEIVEEAGAENEFIFGLKADEVVRLEGYGKYNPMEEYNVVEGLTKVIDQLSDGTYDDNH TGIFRELQASLLYGVDGGRPDVYFVLRDFSSYRNAQTELQKAYKDKRNWAKKMLKNIANA GKFSSDRTIAEYAKDIWNIHPVKVEDYID >gi|261747764|gb|ADAD01000100.1| GENE 2 1946 - 3436 1946 496 aa, chain - ## HITS:1 COG:alr3871 KEGG:ns NR:ns ## COG: alr3871 COG1640 # Protein_GI_number: 17231363 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Nostoc sp. 
PCC 7120 # 2 496 3 497 502 527 53.0 1e-149 MFERSSGILLHPTSLPGKYGIGTLGHEARNFIDFLKKSNQKLWQIFPLGPTGYGDSPYQS FSSFAGNPYLIDFDILISDGLLQNEDFENIFWGENPEYVDYGIIYNQKFPLLKKAYDNFK NGNFDHLKEEFSKFKEKSSSWLEDYSLFISLKNHFNGLPWSKWPEDIKLRTASALENYKK ELEDEMEYQKFIQFLFFKQWYEIKKYAEENNVKIIGDIPIFVAADSADTWANPEIFLFDD ELNPVKVAGVPPDYFSETGQLWGNPLYNWEKLKETDYKWWIERIRTNLSLYDIIRIDHFR GFEAYWAVPFGDETAQNGSWQKGPGIDLFNAIKKELGDLPIIAEDLGHLTEEVINLRNET GFPGMKILGFAFDSNEENDYLPHTYIKNCVVYTGTHDNDTLIGWFEKANESDRNFTREYL NIADHSQINWAILKGAWRSVASIAIAPVQDFLGLGSEARINTPGVASGNWQWRLKDGLLT DELAQRIGRITKIYGR >gi|261747764|gb|ADAD01000100.1| GENE 3 3600 - 3815 454 71 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1259 NR:ns ## KEGG: Lebu_1259 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 2 68 3 69 70 109 89.0 3e-23 MEKVDKDMNIMEAVERFPIIAQVLMRYGLGCVGCIISSAETLGEGIAAHGLNPDIIIEEV NMILEKQQEQG >gi|261747764|gb|ADAD01000100.1| GENE 4 3999 - 5291 1980 430 aa, chain + ## HITS:1 COG:FN1003 KEGG:ns NR:ns ## COG: FN1003 COG2067 # Protein_GI_number: 19704338 # Func_class: I Lipid transport and metabolism # Function: Long-chain fatty acid transport protein # Organism: Fusobacterium nucleatum # 203 430 41 273 273 132 37.0 9e-31 MKLKIAILSLLASTILSAASIDYLSNNSASFFQNPSQTGKISVEGIFYNPAGTAFLEDGT YLNLNMQNSLIEESMTLNGKKYKSNNYAGAPSFNLLYKKNNFSLFGNASVIAGGATLKYK DGVVGVQLAGDAFNELTRGRLGAKLTENSFKGQNRYYQLMVGGAYKVNDQFSVSGGLKYV LGVRKLKGNATYSYNPLVGRAIGLTGNELHLDSERTASGVGGTIGFDYKPTDTLNFAVKY DTPVKLKFKAKATEYKGMSVAGRPVGISLFYPEYADGAEYRRDLPGVLSLGVSKDFGNFT YSAGYIHYFNKSAKLDRFEYKDGHEYNFGVDYRLNEKFTLHAGFNYADTGAPRNTFTDVE YAVNSQIYAAGLTYKPTDSSEWKFGIAHVSYNSSNGEKEKTALPGVNLDKSKVKYDKSIN VFTLGYTHKF >gi|261747764|gb|ADAD01000100.1| GENE 5 5408 - 6466 1188 352 aa, chain + ## HITS:1 COG:PM1387_2 KEGG:ns NR:ns ## COG: PM1387_2 COG1056 # Protein_GI_number: 15603252 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinamide mononucleotide adenylyltransferase # Organism: Pasteurella multocida # 10 180 1 160 160 160 47.0 4e-39 
MKNNNGKKIKNGIIFGKFYPLHTGHVDFIQRAGGLVESLYVIVCTDKKRDIELFKKSKMK KMPTEKDRIRFAEQTFKYQDNIKILHLSEDNIPPYPNGWKEWTKRVKELLSENHLKIDTI FTNETQDVENYKKNFINSDDSYKVFDKELKVETIDILRNNFHISATEVRKNPYHNWQYIP KYVREFFVLKVAIIGSENSGKTNLTNKLANYYNTSCVREYRKKYIEEVLAGKSDNMQYED YSRIAYEHNREITDSGANADKLTFIDTEYTSLQVFSIINTGEEHPVIKDFIKNSKFNIII YIEKDRNKEYDKILKKLLEKYNIKYFTIKNEECNFTKIYNKSIEIIDDFISI >gi|261747764|gb|ADAD01000100.1| GENE 6 6580 - 6729 258 49 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169837420|ref|ZP_02870608.1| LSU ribosomal protein L33P [candidate division TM7 single-cell isolate TM7a] # 1 49 1 49 49 103 100 1e-21 MRVQVILECTETKLRHYVTTKNKKTHPERLEMRKYNPVLKRHSLYREVK >gi|261747764|gb|ADAD01000100.1| GENE 7 6855 - 7070 268 71 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1898 NR:ns ## KEGG: Lebu_1898 # Name: not_defined # Def: preprotein translocase, SecE subunit # Organism: L.buccalis # Pathway: Protein export [PATH:lba03060]; Bacterial secretion system [PATH:lba03070] # 1 70 1 70 71 79 52.0 6e-14 MSKFNVSDAIKNMIDEYKKIYWPNRSEVFHVTIIVLLITLFIALYILVFDNVFDLLLNRL TQILKSFLGGR >gi|261747764|gb|ADAD01000100.1| GENE 8 7072 - 7689 1126 205 aa, chain + ## HITS:1 COG:FN2041 KEGG:ns NR:ns ## COG: FN2041 COG0250 # Protein_GI_number: 19705332 # Func_class: K Transcription # Function: Transcription antiterminator # Organism: Fusobacterium nucleatum # 17 204 7 192 193 194 57.0 8e-50 MSEKMEEVMNDEVKYEKKWYIIHTYSGYEKKVATDLEKRIESLNLTDRVFRILVPEEEVL EEKRGKMVKVPRKLFPSYVMVEMLSVKEENDLGLGYRVDSEAWYVIRNTNGVTGFVGVGS DPIPLSDEEASNLLSKIGINGFENENKFNIDVEIGEKVIVKKDSFLDQEGEVAEIDYEHG RVKVLLEVFGRLTPVEFEYNEIQKR >gi|261747764|gb|ADAD01000100.1| GENE 9 7781 - 8206 653 141 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229885177|ref|ZP_04504630.1| LSU ribosomal protein L11P [Sebaldella termitidis ATCC 33386] # 1 141 1 141 141 256 92 2e-67 MAKEVIGKIKLQLEAGKANPAPPVGPALGQHGVNIPEFCKAFNAQTQDKMGFVIPVEISV YADRSFTFILKTPPASDLLKKAAKTQKGAANSKKDVAGKITKAQLKEIAETKMPDLNAGS VEAAMNIIAGTARSMGIKIEE >gi|261747764|gb|ADAD01000100.1| GENE 10 8283 - 8990 1065 235 aa, chain 
+ ## PROTEIN SUPPORTED ## NR: gi|229210212|ref|ZP_04336609.1| LSU ribosomal protein L1P [Leptotrichia buccalis DSM 1135] # 1 235 1 235 235 414 86 1e-115 MAKRGKRYSDISQKVDKMKVYTPEEALDLVFDTKSAKFVETVELAIRLGVDPRHADQQVR GTVVLPHGTGKTIKILVITSGENIQKALNAGADYAGDDEYISKIQGGWMEFDLVIATPDM MPKLGKLGKTLGTKGLMPNPKSGTVTTDVEKTVEEFKKGKVAFKVDKLGSIHLPIGKVNF DKQAIVDNFKVALDQIIKLKPSASKGQYLRTVAISLTMGPGIKIDPLLAGIFATK >gi|261747764|gb|ADAD01000100.1| GENE 11 9227 - 9766 673 179 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229210211|ref|ZP_04336608.1| LSU ribosomal protein L10P [Leptotrichia buccalis DSM 1135] # 1 169 1 169 171 263 78 7e-70 MPAQSKIEAVEKLTAKLKDAKAMVFVDYRGISVNEDTELRKQARESGVEYFVAKNRLMKI ALKSVGIDTNFDDLLEGTTSFAVGYEDGVAPSKLVFDFGKKLKDKLIIKGGMLSGDRVDT KTVEALAKLPSREELLGQIAYGLLSPVRMLTVGLSNLAEKVESGQPLEAKAEETAEAAE >gi|261747764|gb|ADAD01000100.1| GENE 12 9828 - 10202 517 124 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229210210|ref|ZP_04336607.1| LSU ribosomal protein L12P [Leptotrichia buccalis DSM 1135] # 1 124 1 123 123 203 89 9e-52 MAFNKDQFIEDLKNMTVLELKEVVEAIEETFGVSAQPVAVAGGAAGAGAGSAEEKSEFDV ILVSAGAAKLGVIKEVRAITGLGLKEAKELVEAGGKAVKEGVSKEEAEALKAQLEGAGAT VELK >gi|261747764|gb|ADAD01000100.1| GENE 13 10395 - 13841 596 1148 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 [alpha proteobacterium BAL199] # 975 1136 1230 1391 1392 234 71 6e-61 MNKLIERYSFGKIVDRGEMPHFLEFQINSYEDFLQAKVPPQKRENKGLEGIFNEIFPIES SNGLLKLEYLWYEIHDNDEPLNDELECKKRGKTYSGQLKVRLKLTNKRTQEIQETLVHFG DIPLMTEQATFVINGAERVVVSQLHRSPGVTFNKELNIQTGKDMFIGKIIPYKGTWLEFE TDKNDILNVKIDRRKKVLSTVFLKAVDFFANNAEIMDEFFEEKEVDLKPLYKKYKDEELE DVLKNRLEGSFNKEDILDEETGEFIVESEELIDNIVIEKLIENKIDKITIWEVKPEDRII ANSLIHDNTKTNDEAVIEVFKKLRPGDLVTVESARSLVKQMFFNPQRYDLANVGRYKINK RLKLDLPEDEIVLTKEDVLQTINYVRNLVNGEGFTDDIDNLSNRRVRGVGELLSIQVKGG MLKMAKMVKEKMTIQDITTLTPQSLLNTKPLNALILEFFGSGQLSQFMDQSNPLAELTHK RRISALGPGGLSRERAGFEVRDVHNSHYGRICPIETPEGPNIGLIGSLSTYGKVNKYGFI ETPFVKIENGKANFDEIEYLGADEEEGLFIAQADTLIDEDGNFLTDEVVCRYGEEIVHID 
KSKVDLLDVSPKQLVSVSAGLIPFLEHDDANRALMGSNMQRQAVPLLKTEAPYVGTGLER KVAIDSGAVITSKATGKVTYVDANRIIVTDKSEKEFVHRLLNFEKSNQSMCLHQKPIIDL GDKVKKGDVIADGPSTAGGDLALGKNILLAFMPWEGYNFEDAILISERLRKDDVFTSLHI EEFDIEARTTKLGDEEITREIPNVSEEALRNLDENGIIRIGAHVNPDDILVGKVTPKGES EPPAEEKLLRAIFGEKAKDVRDTSLRLPHGVKGTVVDVLVLSKENGDDLKAGVNKLVRVY IAEKRKIMVGDKMSGRHGNKGVVSRVLPIEDMPHLENGTPIDVCINPLGVPSRMNIGQVL EVHLGLAIGDIDKYIATPVFDGAKEDDVKDYLEEAGYSRTGKVKLIDGRTGEPFDNPVTV GRMYMLKLHHLVEDKMHARAIGPYSLVTQQPLGGKAQFGGQRLGEMEVWALEAYGASNIL QEMLTVKSDDIGGRTKTYEAIVKGQSMPEADAPESFRVLIKEFQSLGLDVTLYDREGEAI ELDKNVDM >gi|261747764|gb|ADAD01000100.1| GENE 14 13964 - 17980 5587 1338 aa, chain + ## HITS:1 COG:FN2035 KEGG:ns NR:ns ## COG: FN2035 COG0086 # Protein_GI_number: 19705326 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta' subunit/160 kD subunit # Organism: Fusobacterium nucleatum # 1 1328 1 1319 1319 1828 70.0 0 MSIRDFDSIQIKLASPEKILEWSHGEITKAETINYRTLKPEMDGLFCERIFGPSKDYECS CGKYKRMRYKGMTCEKCGVEVTTSKVRRERMGHIQLATPIAHIWYSKGTPNKMSLLLGIS TKELESVLYFSRYIVTDKGDTELEKGQILTDREYKMYESQYKNGFTAKMGAEGVLKLLQE IDLQKLEKELEKEMESVNSSQKRKKIIKRLKIVRDLILAGNRPEWMILTVLPVIPADLRP MVQLDGGRFATSDLNDLYRRVINRNIRLNKLMSIKAPEIVIKNEKRMLQEAVDALIDNGR RGKPVVTQNNRELKSLSDMLKGKQGRFRQNLLGKRVDYSGRSVIVVGPNLKMNQCGLPKK MALELYKPFLMRELVKRELATNIKTAKKMVEEEDENVWELIEEIIKNHPVLLNRAPTLHR LSIQAFEPTLIEGKAIRLHPLVCSAFNADFDGDQMAVHLVLSNEAQMEAKLLMLATNNIL APSSGKPIAVPSQDMVMGCYYMTKERKGEKGEGKYFSNKNQLITAYQSKQVDTHALVKVR IDGELIETTPGRLMFNTMLPKEVRNYSKTFGKGELAKLIAELYKKFGFEKTSELIDKIKN FGFHYGTTAGITVGIEDLEIPVTKKAILEKAESDVAEVEEQYKSGEIIDAERYRRTVAIW SEAVDRVTHEMMDNLDEFNPVYMMANSGARGSVAQMRQLGGMRGLMADTQGRIIEIPIKA NFREGLNILEFFMSSHGARKGLADTALRTADSGYLTRRLVDISHEVIVNHDDCGCEEGIV VSDLVDAGNVIEKLSERIYGRYLAEDLVHEGEVIAERNTMIMDDLIKKIEELDIKEVKIR TPLTCKLEKGVCKKCYGLDLSNHKEILKGEAVGVIAAQSIGEPGTQLTMRTFHTGGVATA AAVQSDYKADVSGKVKLKDITTLENEKGIEVVVSQTGRIIVGKHRYEVPSGSVLKVKDGE SVKKGQVLVEFDPYQVPIITSESGKVEFRDIYVRENIDVKYGVTERIAIKPVESSDVNPR IIIYNKNKKVAEYNVPYGAYLMVKEGETVKKGQIITKILKTGEGNKDITGGLPRVQELFE 
ARNPKGKATLSEVSGRVIFSDKKKKGMRLITIEDPETGKPIKEYTVPVGEHLVVTNEMLI EKGAKITDGPVSPHDILKIKGLVAAQQFILESVQQVYREQGVPINDKHIEIIVKQMFQKV KIKEAGDTLFLEDELIDKKLVERENEKLIAKGKRPAIYEPIIQGITKAAVNTESFISASS FQETTKVLANAAIEGKIDRLEGLKENVIIGKKIPGGTGFKDYKHIKVEIKNQGNKVPVIN KEAKPEVEVIEEAEIVEE >gi|261747764|gb|ADAD01000100.1| GENE 15 17982 - 18224 385 80 aa, chain + ## HITS:1 COG:TM1690 KEGG:ns NR:ns ## COG: TM1690 COG2052 # Protein_GI_number: 15644438 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Thermotoga maritima # 4 78 5 79 92 73 49.0 8e-14 MKTLNIGFNNFVIDEHVISIIVPDTAPARRLREDAKNRDSLIDATAGRKTRAIIIMDNGF VILSAVNVETLIQRIEQNND >gi|261747764|gb|ADAD01000100.1| GENE 16 18260 - 18805 839 181 aa, chain + ## HITS:1 COG:FN2033 KEGG:ns NR:ns ## COG: FN2033 COG0194 # Protein_GI_number: 19705324 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Fusobacterium nucleatum # 3 179 4 180 185 206 63.0 2e-53 MKGKLIIVSGPSGSGKSTVTKIVKDRLNIPLSISATTRKPREGEVEGIDYYFLSEDEFKR KISNDEFYEYANVHGNYYGTLKKTVEENLEKGLNVILEIDVQGALIAKDKKKDAVLVFFK TKDTKILEERLRGRKTDSEEVIKKRLENALKELEFENKYDYTVINEDIEDSCQKLIEIIN E >gi|261747764|gb|ADAD01000100.1| GENE 17 18819 - 18998 414 59 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1888 NR:ns ## KEGG: Lebu_1888 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 57 1 57 68 65 77.0 9e-10 MKKDKITIDDLLTKIPNKYELAIVAGKVAKEEFIKGHDKFKIMDNVFRDIMDDEIEVKK >gi|261747764|gb|ADAD01000100.1| GENE 18 19078 - 20886 1723 602 aa, chain + ## HITS:1 COG:FN1319 KEGG:ns NR:ns ## COG: FN1319 COG0358 # Protein_GI_number: 19704654 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Fusobacterium nucleatum # 2 596 3 602 603 360 40.0 4e-99 MYKDTEIQKLIDNLDIVQVIGEYVTLKKAGVNYKGFSPFKDEKTPSFVVSPVKNIFKDFS TGIGGNAISFYMKINNISFYEAVEELSRKYNVSIKKLNINKNIDNPNTKYYEIMREAQTF FKNNIINSEEALKYMENRGYSREEIQKFDIGFSFNSWDSLLKHLKEKGYNESDLLELGLI 
RKNDEGNVFDYFRNRIMFPIYNDTMKLIGFGGRTVENNSDIPKYLNSPDSKIFKKGKELY GLSNRGENIKKKGLAILMEGYLDVLTAQKNGFINSVASLGTAFTEEQAQLLKKYTNNVII AYDNDEAGKNAIIKAGNILKKQDFNVRCLSIEGKEKDPDEYLRKHGRKDFLEILKTSKNY FDFLYDYYSEDLNLNEISGKKEFIQKFKGFFSNVINKTERNLYINKLSVELGIDKDILSE EFVMKKKDFSDRNPIRKKRHETVATKKTKEELNDLLEKETLEFILRYRENGNSEYRKYCE KLESKNFTNIIYGEILEKLKKIEFDMKNLNNSIFEEEETEMITTMILNANTQPVNAEKEY KDHFKGWFDRELNNALDLTDKKDVSYKTKLQRIKTDLKSRYDINEIEKEYEEFKLIRRSD YV >gi|261747764|gb|ADAD01000100.1| GENE 19 20879 - 22060 1733 393 aa, chain + ## HITS:1 COG:FN1318 KEGG:ns NR:ns ## COG: FN1318 COG0568 # Protein_GI_number: 19704653 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Fusobacterium nucleatum # 86 392 23 329 331 395 76.0 1e-110 MSEKKEDLKKRLANLMKQAKEEKIVSYEEINSILSVGFSTQKIDQLIKKLQDDGVNIVDT LKDKQELLKVNKILSEDLEKVEVVDFDDSEDEFVENEIDDSEVDKLLQTDLLKMAESMDV DEPIKMYLREIGQIPLLSYEEEIDYAQRVLKGDEEAKQKLIESNLRLVVSIAKKHTNRGL KMLDLIQEGNMGLMKAVEKFEYEKGFKFSTYATWWIRQAITRAIADQGRTIRIPVHMIET INKIKKESRIILQETGKEPTAEELSKKLEIPVDKVKNILEMNQDPISLETPVGSEEDSEL GDFVEDDKFLNPYDATTRVLLKEQLDEILKTLNEREEMVLRYRYGLDDGSQKTLEEVGKI FNVTRERIRQIEVKALRKLRHPSRRKKLEDYRS >gi|261747764|gb|ADAD01000100.1| GENE 20 22114 - 23058 1348 314 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1885 NR:ns ## KEGG: Lebu_1885 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 314 1 314 314 372 74.0 1e-101 MLKRIIEDIKKKENYSFEKIIEDNNMSDDDFFELLKFIYSEKISEIRTSTENNDFIVLED ENYYVEEKDIIKAYLEDIKEKYEEFDKKMKNESRELIDEYLKVAVRESLLYSKFGFSFLD TVQEATLGVLSGINYYDRIKEISKDPEFFIKSFAVKYILEFQKNLLKDIKASELSYVLYL KVKVDKEMGHSIDEISKQMNVTPEYIEDLEKLFDGVEPEELIQNEQILEKADKITQLYIL ENIPKKLSYLDEKILVMAYGLDDKVYTGKEIAKTLNISLHNVNILKEKAINKLSIDLLKN EFTKNSEKMEYMIN >gi|261747764|gb|ADAD01000100.1| GENE 21 23069 - 23878 1052 269 aa, chain + ## HITS:1 COG:FN1316 KEGG:ns NR:ns ## COG: FN1316 COG0327 # Protein_GI_number: 19704651 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # 
Organism: Fusobacterium nucleatum # 1 268 1 257 258 160 40.0 3e-39 MKLRDIAGELYSIYNPKIAENWDNVELLLGDENCEIKKILVCLDITKEAVEKAVKENVNL IISHHPFIFSDLKRITSETVLGRKILKLIENKIAVYSIHTNADFAINGLNDFIMDKLNLN GKTEILDEADFYDYNYIKNENEKVKGGTARIKILDEEMELEDLISRIKKGLGIEYVRYVG ENRKIRKIGLVTGGGSSFIQNVKEHIDVFLTGDLRYHEALDTREEGGILIDVGHYESEYL FADLMEIQLSRFFDGEIIKYFDSPVFKLG >gi|261747764|gb|ADAD01000100.1| GENE 22 23982 - 24104 199 40 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTKKRFLILVFGMILSLNSCLAVAAGAAAGAATGYYVKNK >gi|261747764|gb|ADAD01000100.1| GENE 23 24681 - 25529 1234 282 aa, chain + ## HITS:1 COG:FN0267 KEGG:ns NR:ns ## COG: FN0267 COG0061 # Protein_GI_number: 19703612 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar kinase # Organism: Fusobacterium nucleatum # 43 282 24 266 267 165 39.0 7e-41 MKAILKDSENSIQKDKKTVNIVRKIKIVKNRYVKEELLTGFYEYIKRNNIIEVVDVKNAD LIVSFGGDGTILVAAKETVKKDIPVLAVNMGTVGYMAEIKPENAVEMLENYQENKCIIDE RAFLEVEYNGEIFYALNELLIIKGGLVSHLINVEVYANDIIVNKYRADGVIVATPTGSTA YSLSAGGSIVHPKLNALSITPLLPQSLTARPIIVNGNDKLSFKVYTRDNDAHLNIDGSEC FRVTDTDEIKATLSEKKVKIIRSENSDYYNILREKLKWGDTF >gi|261747764|gb|ADAD01000100.1| GENE 24 25569 - 27227 2157 552 aa, chain + ## HITS:1 COG:FN0268 KEGG:ns NR:ns ## COG: FN0268 COG0497 # Protein_GI_number: 19703613 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Fusobacterium nucleatum # 1 550 6 555 558 394 46.0 1e-109 MLRELRLNNLAIIKNMDLEFNSNMITLTGETGAGKSIILDGISLLIGERNQAEMIRTGEE SLFAEGVFDLNENQLKRLNKLGFEIEDNELIISRYFDRNSKSKVTVNGMRITVTKLKELM GNVLDLVGQHEHQYLLNKSYHLNLLDKFLDKDGIELYKKIRKNVMGLKSLNEKISEMEAE KSKIIEKKDIFEFQVNEINNLDLKENEDNELEEEYKILFNSGKIGEKLGNSLQSLKDGEY SVLNSLGRVRKNLEQLSGISETYSELKEKIESIVYDVEDISYSIEAFSQETKTDDMRLEA VVHRIDEINKLKLKYGSTIEEILSYRDEIQKKLAFISFENNELEELKKQKREKIEEYYKD CEKLSKLRKNIAKKMESTIDTQLKDLNMGNAQFKVEFSEKSAISPKGKDDAEFMMTTNPG ESFKPLSKIASGGEVSRIMLALKTVFSKVDNISVLIFDEIDTGISGETVRKVAEKLKELS ANVQVICVTHSPQIAAKAEQQFFIKKEIENNLTETKVRELNTEERIREIARIISGDNITE ASIVHAKEIMGL 
>gi|261747764|gb|ADAD01000100.1| GENE 25 27227 - 28174 1077 315 aa, chain + ## HITS:1 COG:BS_ripX KEGG:ns NR:ns ## COG: BS_ripX COG4974 # Protein_GI_number: 16079408 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Bacillus subtilis # 34 309 5 291 296 198 42.0 1e-50 MENNNKIIVEEIEKKNKKENEKESKIERENNLFIKEFSDYLSFEKGSSKNTVAGYERDLK IFFDYIQKSATDIKEEDIYEYIEEISRELKRNSVLRKIASIRTFYKFCYLNKMVKEDPTG MIKSLKREKRLPEVLNLKEVKTIIDNIGHTPEGMRDRLIIKFLIATGARVSEILNLNIKD VENQGYEFIKVLGKGSKYRIVPIYESLENEIKDYISNYRPKLKGSESSFKLFPDTRRENF WKRLKKTAKDAGIEKNVYPHIFRHSVATVLLSNGADIRIVQEILGHANISTTEIYTHVEK SDLKRIYNKIKIGDE >gi|261747764|gb|ADAD01000100.1| GENE 26 28207 - 28617 356 136 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0485 NR:ns ## KEGG: Lebu_0485 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 5 130 8 133 139 72 46.0 4e-12 MAAQKKEENKKNIMLTILIVLWGSIFLLMKMHIIGVYSGMLILILLYLYLNFNLINLYFV SKRTTFKIYIFMLLDLIYLLRESFSLFSILIYFVAMAILIYLIMKDEGRNELPKILGFSG FYTILKIIFISMFVLL >gi|261747764|gb|ADAD01000100.1| GENE 27 28720 - 29310 839 196 aa, chain + ## HITS:1 COG:BH3352 KEGG:ns NR:ns ## COG: BH3352 COG5506 # Protein_GI_number: 15615914 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 29 120 24 115 146 63 35.0 2e-10 MKSNGNESKAELMKNIENFKDRVREIKLEKNIFLGEFKERVLAALTREQVKEKGIYPEIE KALESKEAKKMIISREMDFNDIKKYINLAKSKNIPYKMIDSLLYTGEIGLVVASDDALSE PLENPVVKTKEEKFKEKNLPPIYYKSIGSKICEFHREIIKNELPEYEHSYKEIGFMDSLL GTRCPICEKLGGKKRG >gi|261747764|gb|ADAD01000100.1| GENE 28 29303 - 30571 2169 422 aa, chain + ## HITS:1 COG:FN1520 KEGG:ns NR:ns ## COG: FN1520 COG0766 # Protein_GI_number: 19704852 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Fusobacterium nucleatum # 1 421 1 422 423 487 60.0 1e-137 MVDGFKIKGKTPLNGTIKVSGAKNAALPIIIATLVARGEYTLKNVPNLRDIRILMKLLED 
LGMKTEKVDETTYKIINNGFKRNEASYEIVKQMRASFLVMGPMIANLEESVVSLPGGCAI GSRPVDLHLKGFEALGAEITRVHGYIHAKSDKLKGAEIPLTFPSVGATQNLMMAAVKIPG KTVISNAAREPEIIDLGNFLIKMGAKINGLGTPNIEIEGVDDLHGVEYSIMPDRIEAGTY VIASLITEGDLKIENINLEDLGVFRSELEAMGVKFEYNGNVLTVNGNLKELKPSKIRTMP HPGFPTDMQPQMMLLQTLVNGVSSMEETVFENRFMHVPEFNRMGADISIRHGVAVINGGL PLTGAEVMSSDLRAGAALVLAGLAADGETTVNRVYHIDRGYDKLEEKLNAVGAKIERVKL DI Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:04:12 2011 Seq name: gi|261747762|gb|ADAD01000101.1| Leptotrichia goodfellowii F0264 contig00005, whole genome shotgun sequence Length of sequence - 402 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 400 528 ## gi|262038213|ref|ZP_06011606.1| hypothetical protein HMPREF0554_0028 Predicted protein(s) >gi|261747762|gb|ADAD01000101.1| GENE 1 1 - 400 528 133 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038213|ref|ZP_06011606.1| ## NR: gi|262038213|ref|ZP_06011606.1| hypothetical protein HMPREF0554_0028 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0028 [Leptotrichia goodfellowii F0264] # 1 133 1 133 134 213 100.0 3e-54 KKIRENYGSELKDKEKLKDDWIMKKWLNDGHGAYTYYSAQLAKEIDGLLREYKNAEDETA KKEIQKDIENKYIENQERLMELFTGGPALMNDSKGLLAGLGEYNRVENERVYREGKYQGK NLPQNYITNSLNE Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:04:22 2011 Seq name: gi|261747755|gb|ADAD01000102.1| Leptotrichia goodfellowii F0264 contig00170, whole genome shotgun sequence Length of sequence - 6391 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 227 285 ## gi|262038219|ref|ZP_06011611.1| hypothetical protein HMPREF0554_2228 2 1 Op 2 1/0.000 - CDS 244 - 1626 2230 ## COG5016 Pyruvate/oxaloacetate carboxyltransferase - Prom 1681 - 1740 7.7 3 1 Op 3 . 
- CDS 1752 - 3335 1998 ## COG1866 Phosphoenolpyruvate carboxykinase (ATP) - Prom 3379 - 3438 14.4 + Prom 3378 - 3437 10.7 4 2 Tu 1 . + CDS 3541 - 4263 735 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase - Term 4237 - 4293 3.5 5 3 Tu 1 . - CDS 4344 - 5633 1615 ## gi|262038215|ref|ZP_06011607.1| outer membrane autotransporter barrel domain protein - Prom 5684 - 5743 12.6 6 4 Tu 1 . - CDS 6028 - 6384 206 ## FN2115 hypothetical protein Predicted protein(s) >gi|261747755|gb|ADAD01000102.1| GENE 1 2 - 227 285 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038219|ref|ZP_06011611.1| ## NR: gi|262038219|ref|ZP_06011611.1| hypothetical protein HMPREF0554_2228 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2228 [Leptotrichia goodfellowii F0264] # 1 75 1 75 75 108 100.0 1e-22 MRIIYGDSPISLGDSLWVTIISMLIVFLVLVLISFILSFLKYIPSEKKEIKTVKSSTESN TVVSLSESRKLKPED >gi|261747755|gb|ADAD01000102.1| GENE 2 244 - 1626 2230 460 aa, chain - ## HITS:1 COG:FN1376 KEGG:ns NR:ns ## COG: FN1376 COG5016 # Protein_GI_number: 19704711 # Func_class: C Energy production and conversion # Function: Pyruvate/oxaloacetate carboxyltransferase # Organism: Fusobacterium nucleatum # 1 444 1 444 448 692 75.0 0 MSKVKITETSLRDGHQSLMATRLTTAEILPIVEKMDKAGYYSLEVWGGATFDSAIRFLNE DPWERLREIRKRAKNTKLQMLLRGQNLLGYRHYADDIVDKFVEKSIKNGIDIIRIFDALN DVRNIRQASESTKKYGGHSQLAICYTISPVHTIEYYKNLAMEMQEMGADSIAIKDMAGIL LPYRAYELVKELKKVINVPIEVHTHNTAGLGAMTNLKSIEAGVDIVDTAISPLSGGTSQP TTESLVRTLAGTEYDTGINLEILKEVAEYFKPIRKKYLDKGTLKPQALFTEPNIVEYQLP GGMLSNMLSQLKEQKAEHKYEDVLREIPKVRADLGYPPLVTPLSQMTGTQAVFNVLTGER YKMIPKEIKDYVRGKYGKSPVPVSEEMKKRIIGDEEVFTGRPADLLKNEYEEIKKEAEDL IRNEEDVLTYAMFPQVARTYFEKRNNPEKEVKVQTINVIF >gi|261747755|gb|ADAD01000102.1| GENE 3 1752 - 3335 1998 527 aa, chain - ## HITS:1 COG:FN1120 KEGG:ns NR:ns ## COG: FN1120 COG1866 # Protein_GI_number: 19704455 # Func_class: C Energy production and conversion # Function: Phosphoenolpyruvate carboxykinase (ATP) # Organism: Fusobacterium nucleatum # 1 526 1 524 527 798 71.0 0 
MRTYGLEKLGIINTGKVHRNLSSTELVEIAVSKGEGKLSKTGALVVTTGKYTGRSPEDKF IVDTPGIHDSIAWGKVNRPIEKEKFNAIYGKLISYLQNREIFIFDGLAGADPVCRKKFRI INELASQNLFIHQLLIRPTAEELANYHTQDFTIIAAPGFKCNPKIDGTHSEAAIILDYEA KIGIICGSQYSGEIKKSVFCVMNYLMPKQDVLPMHCSANIDPVTGKTAIFFGLSGTGKTT LSTDPNRKLIGDDEHGWSNHGIFNFEGGCYAKCIDLKEKYEPEIFHAIKFGSLVENVIMD SNTRELDFKDRTLTENTRVGYPINYIDNAQIPGVGGIPSVVIFLTADAFGVLPPISRLDQ NAAMYHFITGFTSKLAGTERGVTEPQPTFSTCFGEPFMPLDPSVYAKMLGEKIERYNTKV YLINTGWSGGPYGIGKRMNLKYTRAMVTAALNGELDSVEYIHDETFNVDVPQSCPGVPSE IMNPADTWDNKEAYDVYAKKLAKMFKENFENKYPNMPENIVNAGPKA >gi|261747755|gb|ADAD01000102.1| GENE 4 3541 - 4263 735 240 aa, chain + ## HITS:1 COG:FN1921 KEGG:ns NR:ns ## COG: FN1921 COG0340 # Protein_GI_number: 19705226 # Func_class: H Coenzyme transport and metabolism # Function: Biotin-(acetyl-CoA carboxylase) ligase # Organism: Fusobacterium nucleatum # 1 235 1 231 234 209 49.0 3e-54 MKIYKFETLTSTNEYMKKNSDKFGNFDIVIAETQTEGKARRENVWFSDKGMALFTFLVKK KGDFDDTEYLKLPLIAGLSVIKGLNRIEKLNYMFKWTNDVYLDEKKLCGILSERNGDNFY IGIGININNHLPEKLKNIAVSLSEKMGKQYDINEIAVYITEEFKTLYTDFINGLWNEILS EINSRNFLKNKKISIKSGKNLKSGIAENINNDGEIEVLINNELHSFSFGEIMNEKIMIEK >gi|261747755|gb|ADAD01000102.1| GENE 5 4344 - 5633 1615 429 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038215|ref|ZP_06011607.1| ## NR: gi|262038215|ref|ZP_06011607.1| outer membrane autotransporter barrel domain protein [Leptotrichia goodfellowii F0264] outer membrane autotransporter barrel domain protein [Leptotrichia goodfellowii F0264] # 1 429 1 429 429 654 100.0 0 MKKLLLLGALIASVNAYSLSFSGVSTGSGASYSGGSISTSAGTFSGNFTINAGDFTLTLN PTDKITSGTSVTVSPGANFKVVNSSGTVIGTIAGGNTFNFTNQVTGNDTTFNTAPTPSPT PTPTPTPTPTPTPVPTPGTPSAKVYIPRSKVDLELTKNITSSGFKSLEDAQNNNGIGLDI QYIGGVGRYTDNSQVIKYDSKSNGVALIGTKNFGNFTLGGGFGYQKSKVKYKETFDGIKE NLDSYQFMLSGKYNFTENVDLASVLTYSHNSHKYENKDKVIIEGVKFNSNIIDFQTRLGS KFSGNIGYVKPYVGLGVTRVEEGAIDKLNASKARGTSGNANVGVYGQVALGSAVDLFGNV EYEHRFNNKSYHRDRDVKGGRKIEGLDYDGGMKLGLGLKYKFSKFNVTTAYELNDDKNSV FKVGFGGEF >gi|261747755|gb|ADAD01000102.1| GENE 6 6028 - 6384 206 118 aa, chain - ## 
HITS:1 COG:no KEGG:FN2115 NR:ns ## KEGG: FN2115 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 22 117 47 150 151 67 36.0 1e-10 MAQSLKKNRGGAIYNEKYKSGVYEAINDIVKRPVNKKVKFEGITLIIPENTEINLESWTL LDSKTGYGIPIGFSNQSGCKQKKIGDKIYSITYNDYISGVKQIGEKLMKINGFKNTCN Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:04:59 2011 Seq name: gi|261747753|gb|ADAD01000103.1| Leptotrichia goodfellowii F0264 contig00033, whole genome shotgun sequence Length of sequence - 646 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 646 772 ## gi|262038222|ref|ZP_06011613.1| hypothetical protein HMPREF0554_0421 Predicted protein(s) >gi|261747753|gb|ADAD01000103.1| GENE 1 1 - 646 772 215 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038222|ref|ZP_06011613.1| ## NR: gi|262038222|ref|ZP_06011613.1| hypothetical protein HMPREF0554_0421 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0421 [Leptotrichia goodfellowii F0264] # 30 215 1 186 186 317 100.0 4e-85 ENDDLYSVTYGPAKKVQVQESTRWDTLNKMKTVSGEIGDGGNTVLDSKRDIILQSSDIRG KDNIVLNAKRYLKMLSTVDTEFKERTTSHKSRRGFGRSKTTTENWVEDNVYANNVDLTTD GNVLMNFKGIDEETGKYISTNNQGVIAQGVNFHAKGAIIGFSEGNIYVEGTKDKLNSVYN SHTTKKWLGATYGKASDYVNDTKEKYRLSQLYNGS Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:06:19 2011 Seq name: gi|261747690|gb|ADAD01000104.1| Leptotrichia goodfellowii F0264 contig00021, whole genome shotgun sequence Length of sequence - 51603 bp Number of predicted genes - 64, with homology - 62 Number of transcription units - 14, operones - 8 average op.length - 7.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 40/0.000 + CDS 3 - 230 344 ## PROTEIN SUPPORTED gi|229212524|ref|ZP_04338900.1| SSU ribosomal protein S10P 2 1 Op 2 58/0.000 + CDS 286 - 912 964 ## PROTEIN SUPPORTED gi|229883874|ref|ZP_04503339.1| LSU ribosomal 
protein L3P 3 1 Op 3 61/0.000 + CDS 937 - 1581 846 ## PROTEIN SUPPORTED gi|229212526|ref|ZP_04338902.1| LSU ribosomal protein L4P 4 1 Op 4 61/0.000 + CDS 1581 - 1871 389 ## PROTEIN SUPPORTED gi|229212527|ref|ZP_04338903.1| LSU ribosomal protein L23P + Prom 1873 - 1932 9.1 5 1 Op 5 60/0.000 + CDS 1957 - 2784 1384 ## PROTEIN SUPPORTED gi|229212584|ref|ZP_04338957.1| LSU ribosomal protein L2P + Prom 2789 - 2848 8.0 6 1 Op 6 59/0.000 + CDS 2908 - 3186 456 ## PROTEIN SUPPORTED gi|229883870|ref|ZP_04503335.1| SSU ribosomal protein S19P 7 1 Op 7 61/0.000 + CDS 3249 - 3584 501 ## PROTEIN SUPPORTED gi|229212586|ref|ZP_04338959.1| LSU ribosomal protein L22P 8 1 Op 8 50/0.000 + CDS 3606 - 4262 1027 ## PROTEIN SUPPORTED gi|229212587|ref|ZP_04338960.1| SSU ribosomal protein S3P 9 1 Op 9 . + CDS 4262 - 4687 725 ## PROTEIN SUPPORTED gi|229212588|ref|ZP_04338961.1| LSU ribosomal protein L16P 10 1 Op 10 . + CDS 4687 - 4881 234 ## PROTEIN SUPPORTED gi|229212589|ref|ZP_04338962.1| LSU ribosomal protein L29P 11 1 Op 11 50/0.000 + CDS 4894 - 5154 421 ## PROTEIN SUPPORTED gi|229212590|ref|ZP_04338963.1| SSU ribosomal protein S17P 12 1 Op 12 57/0.000 + CDS 5188 - 5556 575 ## PROTEIN SUPPORTED gi|229883864|ref|ZP_04503329.1| LSU ribosomal protein L14P 13 1 Op 13 48/0.000 + CDS 5576 - 5965 539 ## PROTEIN SUPPORTED gi|229212592|ref|ZP_04338965.1| LSU ribosomal protein L24P 14 1 Op 14 50/0.000 + CDS 5990 - 6544 820 ## PROTEIN SUPPORTED gi|229212593|ref|ZP_04338966.1| LSU ribosomal protein L5P 15 1 Op 15 50/0.000 + CDS 6575 - 6862 456 ## PROTEIN SUPPORTED gi|229212594|ref|ZP_04338967.1| SSU ribosomal protein S14P 16 1 Op 16 55/0.000 + CDS 6884 - 7279 596 ## PROTEIN SUPPORTED gi|229212595|ref|ZP_04338968.1| SSU ribosomal protein S8P 17 1 Op 17 46/0.000 + CDS 7311 - 7844 801 ## PROTEIN SUPPORTED gi|229212596|ref|ZP_04338969.1| LSU ribosomal protein L6P 18 1 Op 18 56/0.000 + CDS 7858 - 8220 516 ## PROTEIN SUPPORTED gi|229212597|ref|ZP_04338970.1| LSU ribosomal protein L18P 19 1 Op 19 50/0.000 + 
CDS 8239 - 8736 741 ## PROTEIN SUPPORTED gi|229859307|ref|ZP_04478967.1| SSU ribosomal protein S5P 20 1 Op 20 48/0.000 + CDS 8757 - 8936 274 ## PROTEIN SUPPORTED gi|229212599|ref|ZP_04338972.1| LSU ribosomal protein L30P 21 1 Op 21 53/0.000 + CDS 8942 - 9556 919 ## PROTEIN SUPPORTED gi|229212600|ref|ZP_04338973.1| LSU ribosomal protein L15P 22 1 Op 22 . + CDS 9600 - 10898 756 ## PROTEIN SUPPORTED gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 + Term 10902 - 10946 7.1 + Prom 11004 - 11063 7.7 23 2 Tu 1 . + CDS 11195 - 11713 848 ## COG2426 Predicted membrane protein + Prom 11784 - 11843 5.0 24 3 Tu 1 . + CDS 11864 - 12898 1521 ## COG4822 Cobalamin biosynthesis protein CbiK, Co2+ chelatase + Prom 13247 - 13306 8.0 25 4 Op 1 3/0.000 + CDS 13417 - 13758 554 ## COG0310 ABC-type Co2+ transport system, permease component 26 4 Op 2 8/0.000 + CDS 13761 - 14123 530 ## COG0310 ABC-type Co2+ transport system, permease component + Prom 14125 - 14184 6.7 27 4 Op 3 34/0.000 + CDS 14411 - 15103 517 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 28 4 Op 4 . + CDS 15096 - 15902 296 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 29 4 Op 5 1/0.000 + CDS 15899 - 17254 1767 ## COG1797 Cobyrinic acid a,c-diamide synthase 30 4 Op 6 5/0.000 + CDS 17318 - 17962 1017 ## COG2082 Precorrin isomerase 31 4 Op 7 6/0.000 + CDS 17995 - 19128 1396 ## COG1903 Cobalamin biosynthesis protein CbiD 32 4 Op 8 . + CDS 19139 - 19759 909 ## COG2241 Precorrin-6B methylase 1 33 4 Op 9 . + CDS 19784 - 20149 429 ## COG3189 Uncharacterized conserved protein 34 4 Op 10 . 
+ CDS 20165 - 20671 521 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes + Term 20848 - 20899 5.2 + Prom 20991 - 21050 8.8 35 5 Op 1 7/0.000 + CDS 21093 - 21662 854 ## COG2242 Precorrin-6B methylase 2 36 5 Op 2 9/0.000 + CDS 21725 - 22450 1014 ## COG2243 Precorrin-2 methylase 37 5 Op 3 12/0.000 + CDS 22429 - 23190 1229 ## COG2875 Precorrin-4 methylase 38 5 Op 4 6/0.000 + CDS 23207 - 24235 1451 ## COG2073 Cobalamin biosynthesis protein CbiG 39 5 Op 5 . + CDS 24297 - 25040 1246 ## COG1010 Precorrin-3B methylase + Term 25252 - 25292 1.8 + Prom 25232 - 25291 6.6 40 6 Op 1 . + CDS 25360 - 25641 449 ## Swol_2479 ABC transporter periplasmic substrate-binding protein 41 6 Op 2 21/0.000 + CDS 25690 - 26397 982 ## COG1840 ABC-type Fe3+ transport system, periplasmic component + Term 26469 - 26511 -0.9 + Prom 26466 - 26525 9.7 42 6 Op 3 17/0.000 + CDS 26545 - 28161 1769 ## COG1178 ABC-type Fe3+ transport system, permease component 43 6 Op 4 . + CDS 28162 - 29166 1437 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components 44 6 Op 5 . + CDS 29163 - 29732 647 ## COG2087 Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase 45 6 Op 6 . + CDS 29784 - 29912 80 ## + Term 30156 - 30188 1.5 - Term 29840 - 29871 0.0 46 7 Tu 1 . - CDS 29881 - 29979 113 ## - Prom 30021 - 30080 15.5 47 8 Op 1 2/0.000 + CDS 30391 - 31899 1669 ## COG1492 Cobyric acid synthase 48 8 Op 2 9/0.000 + CDS 31900 - 32838 1052 ## COG1270 Cobalamin biosynthesis protein CobD/CbiB 49 8 Op 3 . + CDS 32855 - 33961 1384 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 50 9 Tu 1 . - CDS 34282 - 34971 763 ## COG0588 Phosphoglycerate mutase 1 - Prom 35021 - 35080 8.4 + Prom 35021 - 35080 7.3 51 10 Op 1 . 
+ CDS 35143 - 36354 1378 ## Lebu_1028 hypothetical protein 52 10 Op 2 4/0.000 + CDS 36374 - 37414 1194 ## COG0373 Glutamyl-tRNA reductase 53 10 Op 3 6/0.000 + CDS 37411 - 38361 1394 ## COG0181 Porphobilinogen deaminase 54 10 Op 4 1/0.000 + CDS 38354 - 39886 2087 ## COG0007 Uroporphyrinogen-III methylase + Term 40091 - 40131 1.5 + Prom 39962 - 40021 7.1 55 10 Op 5 . + CDS 40142 - 41434 1918 ## COG0001 Glutamate-1-semialdehyde aminotransferase + Prom 41505 - 41564 5.1 56 11 Op 1 . + CDS 41598 - 41975 530 ## Coch_1309 NAD-dependent epimerase/dehydratase 57 11 Op 2 2/0.000 + CDS 41997 - 43070 1516 ## COG2038 NaMN:DMB phosphoribosyltransferase + Prom 43073 - 43132 4.1 58 11 Op 3 . + CDS 43190 - 43807 643 ## COG0406 Fructose-2,6-bisphosphatase 59 11 Op 4 . + CDS 43847 - 44623 1343 ## COG2099 Precorrin-6x reductase 60 12 Op 1 17/0.000 - CDS 44738 - 45802 1081 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components 61 12 Op 2 21/0.000 - CDS 45822 - 47537 1756 ## COG1178 ABC-type Fe3+ transport system, permease component - Prom 47560 - 47619 5.7 - Term 47548 - 47588 8.1 62 12 Op 3 . - CDS 47630 - 48661 539 ## PROTEIN SUPPORTED gi|167854980|ref|ZP_02477755.1| 50S ribosomal protein L13 - Prom 48723 - 48782 12.3 + Prom 48641 - 48700 12.6 63 13 Tu 1 . + CDS 48885 - 50429 2011 ## COG0554 Glycerol kinase + Term 50466 - 50518 -0.7 - Term 50710 - 50741 -0.2 64 14 Tu 1 . 
- CDS 50763 - 51509 660 ## Selsp_0263 hypothetical protein Predicted protein(s) >gi|261747690|gb|ADAD01000104.1| GENE 1 3 - 230 344 75 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212524|ref|ZP_04338900.1| SSU ribosomal protein S10P [Leptotrichia buccalis DSM 1135] # 2 74 28 100 101 137 90 2e-31 VKKNGSKLAGPLPLPTKTKKYTVLRSVHVNKDSREQFEMRVHRRFVEIKNSSQQIIAALS SLNLPSGVGIEIKQL >gi|261747690|gb|ADAD01000104.1| GENE 2 286 - 912 964 208 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229883874|ref|ZP_04503339.1| LSU ribosomal protein L3P [Sebaldella termitidis ATCC 33386] # 1 208 1 208 208 375 87 1e-103 MILGKKIGMTQIFENEKLIPVTVIEAGPNFVVQKKSEEKDGYNAVTLAYDDKKEKNTIKP EMGIFKKAGISPKKFLKEFKVAELESFELGQEFKVDVLEGIEFVDIVGTSKGKGTAGVMK RHNFGGNRATHGVSRNHRLGGSNAGGAASNSNVPKGKKMAGRLGNEQVTVQNLKVVKFDA ENNLLLVKGAVPGPKNGYLVIKKSVKKY >gi|261747690|gb|ADAD01000104.1| GENE 3 937 - 1581 846 214 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212526|ref|ZP_04338902.1| LSU ribosomal protein L4P [Leptotrichia buccalis DSM 1135] # 1 214 1 214 214 330 75 1e-89 MPVLNTYKLDGSKAGTVEVKEEIFGIEPNKTVMHEVLVAELAGAREGNASTKTRAEVRGG GRKPFRQKGTGRARQGSTRAPHMVGGGVVHGPKPRDYTKKVNKKVRRLALRSALATKINN GDLIVLDDFTLDVPKTKTFVQFAKALNFAGEKQLFIADYNGETFNRDINLEYSVRNLEKV MTLDSRSLSIFWLIKQDKIVVTKEALATIEEVLA >gi|261747690|gb|ADAD01000104.1| GENE 4 1581 - 1871 389 96 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212527|ref|ZP_04338903.1| LSU ribosomal protein L23P [Leptotrichia buccalis DSM 1135] # 1 93 1 93 94 154 82 1e-36 MHITDIIKKPVVNTEKARILMESNEYVFIVDRRANKLQIREAVEKLFNVKVASVNTLNVK PKMERFRMSMYKTPAIKKAIVKLQDGEKIASYEQNL >gi|261747690|gb|ADAD01000104.1| GENE 5 1957 - 2784 1384 275 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212584|ref|ZP_04338957.1| LSU ribosomal protein L2P [Leptotrichia buccalis DSM 1135] # 1 275 1 275 275 537 93 1e-152 MPIKKLKPMTSGTRHMSILVNHELDKVRPEKSLTEPLGSSYGIDNYGHRTGRNRHKGHKR LYRVIDWKRNKIGVPAKVATIEYDPNRTANIALLHYVDGEKRYILAPNGLKKGDTVLAGD SAEIKPGNALKLKDLPIGTVIHNVELMPGKGGQLARSAGTSARLVAKEGTYCHVELPSGE 
LRLLHKECMATIGAVGNSEHSLVSLGKAGRNRHLGKKPHVRGSVMNPVDHPHGGGEGRSP IGRKSPVTPWGKPALGKKTRGKKLSDKFIVRKRKK >gi|261747690|gb|ADAD01000104.1| GENE 6 2908 - 3186 456 92 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229883870|ref|ZP_04503335.1| SSU ribosomal protein S19P [Sebaldella termitidis ATCC 33386] # 1 91 1 91 91 180 93 2e-44 MARSLKKGPFVDAYLLKKVESMGEKKQVIKTWSRRSTIFPQFIGHTFAVYNGKKHIPVYV TEEMVGHKLGEFAPTRTFYGHGKDAKKDAKRK >gi|261747690|gb|ADAD01000104.1| GENE 7 3249 - 3584 501 111 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212586|ref|ZP_04338959.1| LSU ribosomal protein L22P [Leptotrichia buccalis DSM 1135] # 1 110 1 110 112 197 89 1e-49 MAVVAKLRYQRLSPQKARLVADIVRGKNALQALNVLRFTNKKAAKYIEKTLKSAIANAEH NFQMDPDKLFVSRILIDKGPVLKRVNPRAMGRADVIRKPTAHITVEVDERN >gi|261747690|gb|ADAD01000104.1| GENE 8 3606 - 4262 1027 218 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212587|ref|ZP_04338960.1| SSU ribosomal protein S3P [Leptotrichia buccalis DSM 1135] # 1 218 1 218 218 400 88 1e-110 MGQKVDPRGMRIGIVKTWDSRWFAEGKDYLNNFHEDLKIKKYIKKNYYHAGISTIQIERT SPTDITIIIETGKAGILIGRKGQEIEALKVNLEKLTGKKVQVKVQEIKNPNKDAQLIAES IATAIEKRVAYKRAVQQAIQRAEKSGVKGIKIMVSGRLNGAEIARSEWTLSGRVPLHTLR ADVDYATATAHTTYGALGLKVWVFNGEVLPTKKEGGNE >gi|261747690|gb|ADAD01000104.1| GENE 9 4262 - 4687 725 141 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212588|ref|ZP_04338961.1| LSU ribosomal protein L16P [Leptotrichia buccalis DSM 1135] # 1 141 1 141 141 283 96 1e-75 MLIPKRTKYRKQFRGKIGGIATKGNKVDFGEFGLAAKEFGWITSRQIEACRVTINRTFKR EGKIWIRIFPDKPYTKRPEGTRMGKGKGNAEGWVAVVKKGKIMFEVGGVSEEKAKEALRK AGHKLPIKVRFVRREEVGGDK >gi|261747690|gb|ADAD01000104.1| GENE 10 4687 - 4881 234 64 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212589|ref|ZP_04338962.1| LSU ribosomal protein L29P [Leptotrichia buccalis DSM 1135] # 1 60 1 60 63 94 76 9e-19 MTANEIRELSLDELDTQVKELKQELFNLKFQKTLGQLHNTTKIKQVKRDIARMKTIITEK KTVK >gi|261747690|gb|ADAD01000104.1| GENE 11 4894 - 5154 421 86 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212590|ref|ZP_04338963.1| SSU ribosomal protein 
S17P [Leptotrichia buccalis DSM 1135] # 1 86 1 86 86 166 94 2e-40 MENKRNERKVREGIVVSNKMDKTVVVLEETMKLHKLYKKRVKTSKKYKAHDENNECGIGD KVQIMETRPLSKDKRWRVVTILEKAK >gi|261747690|gb|ADAD01000104.1| GENE 12 5188 - 5556 575 122 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229883864|ref|ZP_04503329.1| LSU ribosomal protein L14P [Sebaldella termitidis ATCC 33386] # 1 122 1 122 122 226 94 3e-58 MVQQQTMLNVADNTGAKKIMVIRVLGGSRRRFGKIGDIVVASVKEAIPNGNVKKGDVVKA VIVRTRKELKRPDGSYIKFDDNAAVILNSNLEIRGTRIFGPVARELRAKNFMKIVSLAPE VL >gi|261747690|gb|ADAD01000104.1| GENE 13 5576 - 5965 539 129 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212592|ref|ZP_04338965.1| LSU ribosomal protein L24P [Leptotrichia buccalis DSM 1135] # 1 129 1 127 127 212 82 4e-54 MAKPRIKSIPKKLHVKTGDTVVVISGRSKDLPRNDKAGQDQTGDKGKVGKVLKVFPKTGK IVVEGVNIQKRHVKPNANNPQGTVVEKEMPIFSSKVMLWDEKAKKGTRVRYEVRDNKKVR ISVVSGNEI >gi|261747690|gb|ADAD01000104.1| GENE 14 5990 - 6544 820 184 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212593|ref|ZP_04338966.1| LSU ribosomal protein L5P [Leptotrichia buccalis DSM 1135] # 1 184 1 184 184 320 84 1e-86 MAEKYIPRLQKEYKEEIIPALMKDLDIKNIMQVPKIEKIIVNMGIGEATSNPKLIETAMK ELAQITGQQPVPRAARKSEAGFKLREGQKIGAKVTLRKEKMYEFLDRLISITLPRVRDFE GVSPKAFDGRGNYTLGIKEQIVFPEIEIDKVDKIFGMGITIVTTAKNDEEGRALLKAFGM PFAK >gi|261747690|gb|ADAD01000104.1| GENE 15 6575 - 6862 456 95 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212594|ref|ZP_04338967.1| SSU ribosomal protein S14P [Leptotrichia buccalis DSM 1135] # 1 95 1 95 95 180 92 2e-44 MAKRAMVERNLKREKTVDKYAAKRAELKERAKKGDREAIAELSALPRNASPTRVRNRCQI NGRPRGFMREFGISRVMFRQLAGEGAIPGVKKSSW >gi|261747690|gb|ADAD01000104.1| GENE 16 6884 - 7279 596 131 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212595|ref|ZP_04338968.1| SSU ribosomal protein S8P [Leptotrichia buccalis DSM 1135] # 1 131 1 131 131 234 86 1e-60 MHLTDPIADMLTRIRNGNIAKHSEVKIPFSKIKESMANILKNEGYIAGYEIKEEGTKKDI AVTLKYMDGDAVIKGLKRISKPGRRVYTSVENMPKVLGGLGIAIVSTPKGVITDKECRKH NVGGEILCYVW >gi|261747690|gb|ADAD01000104.1| GENE 17 7311 - 7844 
801 177 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212596|ref|ZP_04338969.1| LSU ribosomal protein L6P [Leptotrichia buccalis DSM 1135] # 1 177 1 177 177 313 88 2e-84 MSRIGRKPITIPAGVEIKQDGNKFIVKGPKGQLERELSSEIKVNIKDGEITFERPNDLPN IRALHGTTRANLNNMITGVSDGFAIKLELVGVGYRVQANGKGLTLALGYSHPVEIEAVDG ITFKVEGNNKITVEGIDKQLVGQIAANIRAKRPPEPYKGKGVKYADEVIRRKEGKKG >gi|261747690|gb|ADAD01000104.1| GENE 18 7858 - 8220 516 120 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212597|ref|ZP_04338970.1| LSU ribosomal protein L18P [Leptotrichia buccalis DSM 1135] # 1 120 1 120 120 203 85 2e-51 MIKKLDRNKLRQKKHKSIRSKIVGTAERPRLSVYRSLKNIFVQIIDDQAGKTLVSASTIE KGAKTENGSNVEAAKKIGEAIAKKALEKGIDTVVFDRSGYVYTGRVKALADAAREAGLKF >gi|261747690|gb|ADAD01000104.1| GENE 19 8239 - 8736 741 165 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229859307|ref|ZP_04478967.1| SSU ribosomal protein S5P [Streptobacillus moniliformis DSM 12112] # 1 165 1 165 165 290 90 1e-77 MAREVKASEYKESLLRISRVSKTVKGGRRISFSVLAAIGDEKGKVGIGLGKANGVPEAIR KAIANAKKNMITVSLKGGTLPHEQLGKYNATSVLLKPASKGTGVIAGSATRELLELAGVT DVLTKIRGSKNKDNVARATLDGLKQLRSVEEVARLRGKSVEEILG >gi|261747690|gb|ADAD01000104.1| GENE 20 8757 - 8936 274 59 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212599|ref|ZP_04338972.1| LSU ribosomal protein L30P [Leptotrichia buccalis DSM 1135] # 1 59 1 59 59 110 93 2e-23 MSKVKVTLIKGINGRKPNHIDTVKSLGLRKISQSVVHNKTADIEGKIKLVSYLLKVEEV >gi|261747690|gb|ADAD01000104.1| GENE 21 8942 - 9556 919 204 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229212600|ref|ZP_04338973.1| LSU ribosomal protein L15P [Leptotrichia buccalis DSM 1135] # 1 204 1 204 204 358 84 3e-98 MNLNELRPAEGSKRERRRIGRGHGSGWGKTAGKGHNGQKQRSGTYVSPVFEGGQMPIIRR IPKRGFSNSAFKKDTIVISLFDIVEKFNDGDVVSLETLVENGIIKNPKFITKYSDAALRE TKGKKAVKQYLLENVESYVKEKDFSSILKIIGNAEVNKKLTVKAHKISESAKELIEKAGG SVEVLEVKSYSAKAGNNKKEDENK >gi|261747690|gb|ADAD01000104.1| GENE 22 9600 - 10898 756 432 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 [alpha proteobacterium BAL199] # 
16 427 19 440 447 295 40 3e-79 MTLTEAIVNRIQSIFKIPELKKRVAFTLVMFAIARIGVHIAVPGINMAAFKNFQNNAIAG FLNLFSGGAVQRASIFSLGIIPYINSSIVFQLLGVIFPKIDEMQKEGGKERDKITQWTRY VTIILAIAQSFGIAITLINQPGLVVEPGPKFIISTMALMTGGTAFLMWISERISIRGIGN GSSMLIFLGIVVNLPQVIQQMVSSNINFIFFGLSIILFVVIIALMVDIQLAERRIPIQYA GKGSLGFGGGQSAVGRRTYLPLKINTAGVMPIIFASVLMAAPPFIVQMLKANAFWQKQFS QTGVLYLLLFAFLIIVFSFFYTLTIAFDPEKVSEDLKQSGGTIPTVRAGKETADYLEKVV TRVTFGSALFLAILGIFPNIWFGYFLHIPVLLGGTSLLILVGVAVELIQQIDSYLAVKKM KGFISTSGKNNR >gi|261747690|gb|ADAD01000104.1| GENE 23 11195 - 11713 848 172 aa, chain + ## HITS:1 COG:PH1658 KEGG:ns NR:ns ## COG: PH1658 COG2426 # Protein_GI_number: 14591427 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Pyrococcus horikoshii # 22 169 8 158 159 66 31.0 3e-11 MKKLAQSIVAFIVGIFGVNAGKIIGIFLISMLPVIELRGSIPVGFSQGLPWYTNMITSII GNMLPVPFILLFVVKVFEFMKKHNIMVKFIEKLEKRALSRSESVANKEFLGLMLFVAIPF PGTGAWTGALIAALLKLNPKKSFFTILLGVVLAAAVVTLGVYGVFGFLKNMV >gi|261747690|gb|ADAD01000104.1| GENE 24 11864 - 12898 1521 344 aa, chain + ## HITS:1 COG:lin1165 KEGG:ns NR:ns ## COG: lin1165 COG4822 # Protein_GI_number: 16800234 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiK, Co2+ chelatase # Organism: Listeria innocua # 1 254 1 259 261 172 38.0 1e-42 MSKGILVTSFGTSHKDTREKCLDSIQKEVEKKYGSEKVERAYTSGVVRRIVERKENIHIY DQEEGLKALKDKGFDEIITMSLHILNGIEYSKLNDKYGKITKPLLTTEEDYKRIVNDTEF NDLEGNDVIIFMGHGSESAADITYEKLQEAYNKAGKNNIFVATVEGKVTIEDVIEKLKQT DYKKVLLKPFMIVAGDHAKNDMASDEEDSWKTILQNNGYEVTPVLKGMGEYKLIRDMFME KLDEAYGKNEKSNTELVVKKEEEPKVEKKLYKSRTDKKITGVCGGLGKYFNVDSTIVRII WAIAFFVGGTGGLAYIVAALIIPEEPVGNSKKSEDIVEAEKAED >gi|261747690|gb|ADAD01000104.1| GENE 25 13417 - 13758 554 113 aa, chain + ## HITS:1 COG:lin1167 KEGG:ns NR:ns ## COG: lin1167 COG0310 # Protein_GI_number: 16800236 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Co2+ transport system, permease component # Organism: Listeria innocua # 20 112 25 117 244 102 56.0 2e-22 
MKKRTSLFLVLTGMFVLFANGYSMHIMEGFLPKQWVVVWFIVCVPFWVIGIKKIQSVSKG SVREKMTLALAGAFIFILSALKIPSVTGSSSHPTGVGLSAILYGPFVTSILGT >gi|261747690|gb|ADAD01000104.1| GENE 26 13761 - 14123 530 120 aa, chain + ## HITS:1 COG:alr3943 KEGG:ns NR:ns ## COG: alr3943 COG0310 # Protein_GI_number: 17231435 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Co2+ transport system, permease component # Organism: Nostoc sp. PCC 7120 # 9 119 134 246 261 107 56.0 7e-24 MIFQAGLLAHGGFTTLGANTFSMAIAGSIVSYLIYKGLAKKNRTLAIFLAAAVGDLATYV VTSFQLALAYPSADGGVMASFIRFGMVFAVTQVPLAIIEGLLTNVIMNILDKYNTKEVEA >gi|261747690|gb|ADAD01000104.1| GENE 27 14411 - 15103 517 230 aa, chain + ## HITS:1 COG:MJ1089 KEGG:ns NR:ns ## COG: MJ1089 COG0619 # Protein_GI_number: 15669277 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Methanococcus jannaschii # 2 179 8 207 268 72 29.0 9e-13 MLIDKISYGNPIKDINPGIKFLLSMTTLIFLLCTENKGVIVFNLVLFNLLLLFVVKVKVS DLLKLNFIPALFILTTVISLFLIKADIWLFLLRSFAGLSVVYFLICSTPIIDLDYIFGKL RFPKIFREMFLLIYRYIFLLFDNKEKLQNSQEVRLGYSSFKNSMKSFPMLVVAILRKTYY YNYNSIKAVESRLGKEFIFSPRRYKKVGIEAVFAISVIIINLYLVVKYYA >gi|261747690|gb|ADAD01000104.1| GENE 28 15096 - 15902 296 268 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 227 1 229 245 118 32 6e-26 MLKLENITFSYDKETEALKDISLNIEKGKKTVFLGENGSGKSTIFSIMNGLLQAQKGSVY LNGEKVIYRKKNLEELRKKVRIVFQDPEIQIFAPLVFQEVSYGPENLGYSKEKVEKNITK AMEEINILDLKDRPCHHLSYGQKKRVSIAAITAMEPDLLILDEPTAWLDSKNTKSVSEIL NNFADAGKTLVVSTHDTDFAYDFADYIYVLDKGKIVRQGSRDEVFEDFKFLKELNLNIPA VLKITRYLKSKNIDVNDYYTFLEENNLL >gi|261747690|gb|ADAD01000104.1| GENE 29 15899 - 17254 1767 451 aa, chain + ## HITS:1 COG:FN0972 KEGG:ns NR:ns ## COG: FN0972 COG1797 # Protein_GI_number: 19704307 # Func_class: H Coenzyme transport and metabolism # Function: Cobyrinic acid a,c-diamide synthase # Organism: Fusobacterium nucleatum # 1 448 
1 444 444 411 48.0 1e-114 MKKILIAGAMSGSGKTTLSSILMSSFENVAPFKVGPDYIDPSYHEVFTGNKSRNLDAFMF DEATLKHIFETGAKGKDIAVVEGVMGLYDGIGHEKDNFSTAHVSRMLDIPVILVVNAKGI STSIAAEVLGFKMFDKNVKIKGVILNNVSSEKLYLNLKEAVERFTGIECLGFLPRNEKLT IESRHLGLKQAFEMNASEELAEKKALFKEIAENNLDLKRIWEIAEDFEPESSMDDFEPLK ELKDRYKGKRVAMAKDGAFSFYYESNLELMRFSGLEIVEFSPVKDKKLPENIDMVYLGGG YPELYYKELSENISMKESIKEAFEKGVKIYGECGGFIYLTKKLNLTDGNSGDFCGLIDAK ISMRNRLNIGRFGYINIETGNGIKTKGHEFHYSEISEDNEKSKGHNHFYKIEKNDGRNWS CGYKKKSLLAGYPHISFYSNIEFFKYILETL >gi|261747690|gb|ADAD01000104.1| GENE 30 17318 - 17962 1017 214 aa, chain + ## HITS:1 COG:FN0970 KEGG:ns NR:ns ## COG: FN0970 COG2082 # Protein_GI_number: 19704305 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin isomerase # Organism: Fusobacterium nucleatum # 1 214 4 217 219 261 59.0 1e-69 MSYIKDPKGIEVRSFEMITEELGDRADRFSEQEKPIVKRLIHTTADFEYADIVEFQNDAV NSAMEALRSGCKIYCDTNMIVTGLNKIGLGKFGASAYCLVNDEDVAREAKERGVTRSMVA IERAVADKDTKIFLIGNAPTALFMLLEKMDKEGENKPKLIAGVPVGFVGCPESKAELSKY DVPFIRTNGRKGGSTVAVAVLHGILYQMYERDRY >gi|261747690|gb|ADAD01000104.1| GENE 31 17995 - 19128 1396 377 aa, chain + ## HITS:1 COG:FN0967 KEGG:ns NR:ns ## COG: FN0967 COG1903 # Protein_GI_number: 19704302 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiD # Organism: Fusobacterium nucleatum # 10 365 4 364 375 268 45.0 1e-71 MEDYVYFQGKRLRYGYTTGSSATAATKAALTYLLENGKNEIPEVTIELPSGQTLTIKINS VKKREDYALASVLKDGGDDPDVTHGLEIFSKVSLRNDAEINIFGGVGVGKVTKKGLPIEP GNSAINPTPMKMIRNTVESILPEGVGADVEIFVPKGEEAAKKTLNSKLGIVGGISILGTM GIVKPMSEEAWKASLAVELKMALAITGTKEAIFLFGNRGKKYLSDHFKDNTSQAVVISNF VGYMFDRACEFEAKKIYFIGELGKFVKVAGGIFHTHSRVSDAKMEILTANCLLVGESMEN LKKIMASNTTEEATKYIEKKEVYNLLAKKAKQKCEEYCRRNGWDLEVETLIISAEREELG SSKNFFEHFKRKDETEV >gi|261747690|gb|ADAD01000104.1| GENE 32 19139 - 19759 909 206 aa, chain + ## HITS:1 COG:FN0966 KEGG:ns NR:ns ## COG: FN0966 COG2241 # Protein_GI_number: 19704301 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-6B methylase 1 # Organism: 
Fusobacterium nucleatum # 2 206 16 222 229 167 41.0 1e-41 MKINVLGLGPGSLDYILPAAIKKLKESEVIVGGKRHIESLGEYAANKEYCYITGDLESVL KFIKENRDKKMSLILSGDTGFYSMLTFMKKHFSDDELEVIPGISSVQYMFAKISDYWYDA FISSAHGKEFDYVAKLEEYGKVGMLTDNKNTPQEIARQLSENGMGESTVYVGENLSYEDE KIYKYKALKLKDVDYKFKMNVVVLKK >gi|261747690|gb|ADAD01000104.1| GENE 33 19784 - 20149 429 121 aa, chain + ## HITS:1 COG:yeaO KEGG:ns NR:ns ## COG: yeaO COG3189 # Protein_GI_number: 16129746 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 4 116 10 119 122 87 41.0 7e-18 MNKLQWKRVYDSVSEDDGFRILIDRLWPRGEKKEAAKIDCWAKEITPTKELRESYHKGMI DYETFSEKYGKELEENTEFKNFVQLIEKELQKENVTMVSAVKEPETSHIPVLRKYIERFL Q >gi|261747690|gb|ADAD01000104.1| GENE 34 20165 - 20671 521 168 aa, chain + ## HITS:1 COG:lin0387 KEGG:ns NR:ns ## COG: lin0387 COG0494 # Protein_GI_number: 16799464 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Listeria innocua # 1 150 1 150 169 83 36.0 2e-16 MEEWDAYNRNGEKLEGTLIRGKSIPEGMYHIVCEVFVEHKDGTYLCTKRAKTKEKFPEYY ETTAGGSALKGEDKYSCIKRELMEETGIVCNDFTQVKRTIVDKESFIMYSFVCTVDCDKN SVKLQPGETEDYKWLTKEEFVELINSDRIIKDQKERFYNYFLGKNIIG >gi|261747690|gb|ADAD01000104.1| GENE 35 21093 - 21662 854 189 aa, chain + ## HITS:1 COG:CAC1378 KEGG:ns NR:ns ## COG: CAC1378 COG2242 # Protein_GI_number: 15894657 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-6B methylase 2 # Organism: Clostridium acetobutylicum # 1 188 1 183 187 157 47.0 1e-38 MFHIKDEEFIRGKAPMTKEEIRAVSLAKLEMKNDDICLDIGGGTGSVTMEMAQFAKDGHV YVIEQKEDAFELIKQNVEKFDLKNVTYIKGEAPEGLSHLDVKFDKIFIGGSGGNLEDIIQ YSYDNLKEKGILVLNFIVLENVFNAIETLKKIPFKDTDICQIIVSKNRKVKDFNMMMAEN PIYVITARK >gi|261747690|gb|ADAD01000104.1| GENE 36 21725 - 22450 1014 241 aa, chain + ## HITS:1 COG:FN0959 KEGG:ns NR:ns ## COG: FN0959 COG2243 # Protein_GI_number: 19704294 # Func_class: H Coenzyme transport and metabolism # 
Function: Precorrin-2 methylase # Organism: Fusobacterium nucleatum # 1 239 9 246 248 266 57.0 4e-71 MENKKGNFYGIGVGVGDEENITLKAVKKLADVDVIILPEAKTGEGSTAFSIVKNYVKDDV EQVFLEFPMVRDLETRKTFRKNNADIINGLLSEGKNVAFLTIGDPMTYSTYTYVLEHILP EIKVETIAGINSFNSMAARLNIPLMIGDEDLKVISVNKKTDIRKEIENNDNVVFMKISRD LDRLREAIVATGNTENIIIVSNCGKESEEIYTDIEKLESVHYFSTLILKKKGINQWKRFI L >gi|261747690|gb|ADAD01000104.1| GENE 37 22429 - 23190 1229 253 aa, chain + ## HITS:1 COG:FN0957 KEGG:ns NR:ns ## COG: FN0957 COG2875 # Protein_GI_number: 19704292 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-4 methylase # Organism: Fusobacterium nucleatum # 2 253 6 257 257 372 75.0 1e-103 MEKVYFIGAGPGDPDLITVKGKKIVEKADVIIYAGSLVNKEIIGCHKEGAEIYNSASMNL DEVIEVTVKAQKEGKLVARVHTGDPSIYGAIREQMDMLDEYGIEYEVIPGVSSFVAAAAA IKKEFTLPDVSQTVICTRLEGRTPVPESESLESLASHKCSMAIFLSVQMIDGVVERLRKH YDINTPIAIIQKATWEDQKIVMGTLENIAELAKKEKITKTAQILVGNFMGNDYSKSKLYD KTFSHEFRKGIKE >gi|261747690|gb|ADAD01000104.1| GENE 38 23207 - 24235 1451 342 aa, chain + ## HITS:1 COG:FN0952 KEGG:ns NR:ns ## COG: FN0952 COG2073 # Protein_GI_number: 19704287 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiG # Organism: Fusobacterium nucleatum # 1 327 1 322 337 241 43.0 2e-63 MKTAIYCVSKNGYNIALSVRNRVYPDSDIYISERVYKLLNLENENIEKLFVITERVPLLL DKIFNEYDLHIFVAATGAVVRLINGKFKSKDTDPAVLTIDEQSNFVISLLSGHLGGANEE CRKIAQELEAIPVVTTATDVSGKIAVDILSQKIKARLEDLEGAKRVTSLIVNGEKVGLHL PKHIVDVNENTAGAIIVSNRKKIEISKIIPQNIFIGIGCKRNTPKEHIVAKLKEVMENQN LEMTSIKTAASAWVKADETGLLEAMEELNIPIKFFDKEEILKVEDLIEDRSNYVKQTIGV YGVSEPCAFLASSGKGKFLVKKIRLGGLTLSIFEEEMTEVKE >gi|261747690|gb|ADAD01000104.1| GENE 39 24297 - 25040 1246 247 aa, chain + ## HITS:1 COG:FN0951 KEGG:ns NR:ns ## COG: FN0951 COG1010 # Protein_GI_number: 19704286 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-3B methylase # Organism: Fusobacterium nucleatum # 5 244 4 244 249 297 60.0 1e-80 MNKTGKIYVVGIGPGKKADMTYRAYEAMEKSDIIVGYKTYVDLIKEYYPGKEMKNSAMTK 
EVDRCIEVLEMAKQGKNVSLISSGDAGVYGMAGIMLEVATEDVEVEIIPGVTATNAAAAI AGAPVMHDYATISLSDLLTDWELIKKRIDLAAQADFVMSIYNPKSRGRATQIEEAREIMM KYKPADTPVAIVKNAGREDETKVITTLEEMLSHEIDMLTIIIIGNSNTFIKNGKMITPRG YKEKYTY >gi|261747690|gb|ADAD01000104.1| GENE 40 25360 - 25641 449 93 aa, chain + ## HITS:1 COG:no KEGG:Swol_2479 NR:ns ## KEGG: Swol_2479 # Name: not_defined # Def: ABC transporter periplasmic substrate-binding protein # Organism: S.wolfei # Pathway: ABC transporters [PATH:swo02010] # 1 93 1 93 347 63 39.0 2e-09 MSKMFKRTMLLAMLVIGFLISCQKSNDTKETKKDGAESASKGTLKVIAAYDAKEKIFEEF TKKTGIKVEFLDISSGEVLSKLNAENGKIGADV >gi|261747690|gb|ADAD01000104.1| GENE 41 25690 - 26397 982 235 aa, chain + ## HITS:1 COG:BMEII0565 KEGG:ns NR:ns ## COG: BMEII0565 COG1840 # Protein_GI_number: 17988910 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, periplasmic component # Organism: Brucella melitensis # 24 234 141 354 358 119 33.0 5e-27 MGNYVSPEEKAMDQRFIGNEYWTGVSIVTAGFLVNTDVLQKKDIPEPKSWADLKNPKYKD ELIMADPAISGTNYAVVYNVLQSMGEEAGWAYLESIKDNIPFYAQRGSEPTAKVKNAEMA VGIIPLGGDTYKMEQEFKVKNIIPSDGLPWVPAGMAIFKNAENKDAAKVFVDWALSKEGQ GFLKTAAPRMMVRSDVTPPAELKGMTPDKLMKMDINGMGTRRDEILKLWKEKMGK >gi|261747690|gb|ADAD01000104.1| GENE 42 26545 - 28161 1769 538 aa, chain + ## HITS:1 COG:PM0956 KEGG:ns NR:ns ## COG: PM0956 COG1178 # Protein_GI_number: 15602821 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Pasteurella multocida # 13 531 144 671 685 233 31.0 9e-61 MEIMDKLFYLLAVIPVIIFILWPITALFMKSIFPEGSFTLELYVGLFKENVPIIVNSIWV CILSTVLAVIIGTVIAVYTSYTSKLKKRLITLLLMLTMISPPFLSSLSYILLFGKRGFIT ADLLHLNVNPYGWHGIVVMQTFSEISLAALILIGGLTTIPFSIIEAGKDLGSKSGDVLWK IILPMLKTSIVAVFFIVFVKNIADFGTPIIVGGNFKMIATEAYKGVISYGEIDKAAAISF LIFLPIVFIFFIYRHELVNSNVMGNFSEKSGQTKEKVYELPKSLKIIFGLITLLFAVYML IQYTSIFVSAVTSNRGNRFHWTLEYVEAFSFTKIPSLVRSVVYSLIAGFVSSFLGVLFSY YIDRRKIRFSRSFGFIAALPYILPGPFFGIAYILAFNNPPLLLTGTGAIVVLNCIFRQMP 
VTTKAASANLYQISSLAEDAARDLGTPKISVFFRVILPQLKPAFLVGFINTFTATMTTVG AIIFLITPSAKVATVELFNVIRDGDYRMASVIASLLILVILAVNILFSILVLREKKVK >gi|261747690|gb|ADAD01000104.1| GENE 43 28162 - 29166 1437 334 aa, chain + ## HITS:1 COG:VCA0687 KEGG:ns NR:ns ## COG: VCA0687 COG3842 # Protein_GI_number: 15601444 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Vibrio cholerae # 2 245 6 244 351 278 50.0 1e-74 MYLELKNLSKKYGENTVVDNFNINVEKGELISILGPSGCGKTTTLRMIAGFINPTGGHIF LSGEDITDFPPEIRPVSTVFQNYALFPHLTVYENIDYGLRYPLKVGEKLNKKQKKERIEK ILELINLKGLEKRKIDQLSGGQQQRVALARSLVLEPKVLLLDEPLSNIDTKLRETVRNEI RKIQKKLGITMIFVTHDQEEAMSISDRIVVMNQGKIEQTGTPREIYRTPESVFVAEFIGK ANIFRENGKTFMVRPENIRLSSEEINDELSGIAEITRKEYRGTVILYELENSKNKEINVI TISNNREYNIGEKIRYSADKKDLYEITSERKEEI >gi|261747690|gb|ADAD01000104.1| GENE 44 29163 - 29732 647 189 aa, chain + ## HITS:1 COG:CAC1383 KEGG:ns NR:ns ## COG: CAC1383 COG2087 # Protein_GI_number: 15894662 # Func_class: H Coenzyme transport and metabolism # Function: Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase # Organism: Clostridium acetobutylicum # 3 189 4 185 185 152 43.0 3e-37 MSVIYVTGGAKSGKSKFAEEMLLGMNNGKQQNIYLATSIVFDEEMEMKVSLHKERRKDKW ITVESYKNFSDSLKNAFSENPKNNILVDCLTNMISNIIFERQDIDWDNPSKKELKLCDVS VEKEVDELIGTFDKFENIVIVSNELGMGIIPGYPLGRYFREIAGKMNQRIADIADEVYFV VSGIPMKIK >gi|261747690|gb|ADAD01000104.1| GENE 45 29784 - 29912 80 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTIEKNFGIIITINEIAFLVTKNPWGTTGVFLFTFLSALQYT >gi|261747690|gb|ADAD01000104.1| GENE 46 29881 - 29979 113 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLSQIAVSLLVSIASNLISYFIIKYIEEHLKK >gi|261747690|gb|ADAD01000104.1| GENE 47 30391 - 31899 1669 502 aa, chain + ## HITS:1 COG:FN0977 KEGG:ns NR:ns ## COG: FN0977 COG1492 # Protein_GI_number: 19704312 # Func_class: H Coenzyme transport and metabolism # Function: Cobyric acid synthase # Organism: Fusobacterium nucleatum # 4 501 2 496 496 523 57.0 1e-148 
MQKKHKNIMLLGTGSNVGKSIITAGLCRMFYQDGYKVSPFKSQNMALNSFITKDGKEMGR AQVVQAEAANIEPEVFMNPILLKPTTDRKSQVIVNGKVYRNMDAREYFAYKHNLKKDIMS AYDHIRENFDICVLEGAGSPAEINLKEDDIVNTGMAEMADSPVVLVADIDRGGVFAAIYG TIMLLEESERNRIKGVIINKFRGDKTLLTPGIEMIEKLTNVPVLGVVPFVPLGIEEEDSL GIDKYNVKKEGKIRISVIKLKHISNFTDIDALSHYNDVSLKYVTKSSDLGNEDIIIIPGS KNTVEDMKDLIEKEINTKIIRLAKEGTVVFGICGGFQMLGQKIMDPEGLESDLKEISGLD LLNMETVMEKSKITTQYKNKIKKTSGLLTGAEGIEIKGYEIHQGYSYPASEEKNTIECIF DDEMLKGAVKDNIIGTYIHGIFDNSEFTNYFLNKVRKLKGLDNIEENFSFSEYKNREYDK LAQVLRENLDMEKIYEIMEVKR >gi|261747690|gb|ADAD01000104.1| GENE 48 31900 - 32838 1052 312 aa, chain + ## HITS:1 COG:FN0975 KEGG:ns NR:ns ## COG: FN0975 COG1270 # Protein_GI_number: 19704310 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CobD/CbiB # Organism: Fusobacterium nucleatum # 4 312 7 324 325 315 55.0 7e-86 MLIIKIWTAYVLDLIFGDPQNIIHPVQIIGKLISFGEKILLKEKYKFFAGIILNLTVLSV TYGVNYIIFRSAKNSGIFAIIEIYLMYTIFSVNSLAREGNRVYGILKEGNIEKARKDLSY LVSRDTGTMDEKMIIRSTMETISENTVDGIVAPMFYMFIGGLPLGMTYKAINTLDSMVGY KNEKYMEFGKFSAKVDDVANFIPARITGIFIIIASFILHYDYKYSFKIFFRDRKNHSSPN SAHAEASVAGALGVQFGGRVSYFGKETEKPTIGDKIKDFELEDIKKNIKIMYVTSFLSLV IFSLIFGIISLI >gi|261747690|gb|ADAD01000104.1| GENE 49 32855 - 33961 1384 368 aa, chain + ## HITS:1 COG:FN0973 KEGG:ns NR:ns ## COG: FN0973 COG0079 # Protein_GI_number: 19704308 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Fusobacterium nucleatum # 2 362 3 352 357 346 55.0 3e-95 MDFHGGNIYKIFREKNIKEILDYSSNINPYGIPESLKKKIVENTEILERYPDPDYVELRE KIALLNKVDISNVIIGNGATEIIFLFMKVIKPEKVLIVSPTFGEYERALRASENKEIKIE YFELEEKDEFKLNIEKLKKELEKKYDLLVMCNPNNPTGKFMKLSETEEVLKKCNKYETKL FVDEAFIEFLEDGYKESIINTGENKRNLFVTRAFTKFFAIPGLRLGYGIFFNKNLEKEIA EKKEPWSVNNIAEMAGITLLEDKEYIEKTLKWITEEKKYMYEKLREIKGIKPYETDVNFI SVKINEDNYSAGLNVKKLREKMLEQGILIRDASNFKYLDDRFFRLAIKDRENNDKVLRVL STFFLPFY >gi|261747690|gb|ADAD01000104.1| GENE 50 34282 - 34971 763 229 aa, chain - ## HITS:1 
COG:FN0729 KEGG:ns NR:ns ## COG: FN0729 COG0588 # Protein_GI_number: 19704064 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglycerate mutase 1 # Organism: Fusobacterium nucleatum # 1 229 1 228 228 329 71.0 2e-90 MKLVLVRHGESEWNLQNRFTGWVDVDLTEKGISEAKAAGKTLKELGYIFDVAFTSFQKRA IKTLNYILEEIDQLYIPVYKSWRLNERHYGALQGLNKAETAKKYGNEQVHIWRRSFDIAP PLINTDDKENYPLFQERYKNIPVEKCPRGESLKDTIHRVLPYWDSHISKEIKEGKNVIIA AHGNSLRALIQYLLKIDNAKILELNLPTGKPLIFEINENLEILSVPELF >gi|261747690|gb|ADAD01000104.1| GENE 51 35143 - 36354 1378 403 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1028 NR:ns ## KEGG: Lebu_1028 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 41 395 105 481 647 148 36.0 4e-34 MEKFVYRIIISFILIGTVVKSEYAIKNGYLYYNFKESNSEYKMTEADIKTLKILDDNYAI DGKHVFFQKDIIENADLNTFKILEDFYSKDRKNVYYREHKISGANPETFEILDDSYSKDR KNVYFMYDVLKGADPKSFRRIPETDFYRDKNNIFYSSEKVGSIKGENIKRINQNLIKNDG SLFMAFWKYDKFKNKKIEIITRNDSSLSYYIKDGKNIYYLARSTEEEVDGKLLKLKEIDY ATFDFIDKEFSTYIKDKNGIYYISRGGANKINADRDTFKIINVIFSKDKNNVFYDNVNLG IHSEGFEILAYVWDSTYYLRDKNNVYYLDAYIGPPAVKIVKGADAKSFIVINNLYSKDKN NIFCDTEILRDADINTFEIASDDGEVTAKDSKNKYDRNCKIIK >gi|261747690|gb|ADAD01000104.1| GENE 52 36374 - 37414 1194 346 aa, chain + ## HITS:1 COG:FN0646 KEGG:ns NR:ns ## COG: FN0646 COG0373 # Protein_GI_number: 19703981 # Func_class: H Coenzyme transport and metabolism # Function: Glutamyl-tRNA reductase # Organism: Fusobacterium nucleatum # 8 332 6 329 329 239 43.0 7e-63 MSEKAENNFYIISFSYKNLNLEERENFVKKGYKEILGNYLEKTLIKGYIAVETCLRIELY MEITNDFNIEMMKRDFGIEKMKVYKDTDGIKYLLKVICGLDSIIKGEDQILVQLKKAYFE ALEKGTTSTFLNIVFNQAIETGKKFRSESKINEKNISLDSIAVKFIKTKFESLENKNIFV IGVGDLSQSILALLHKMNNCKLTMTNRSLRRSIELQKVFPDVETSEFNQKYEIIKESDII ISATSAPHLVVVKEKMKDILNDGKKRFFLDLAVPRDIETSIGQNENTSLYHLEDIWAEYN KNVEKRDEIVEKYFYIIEEQLEKISEKLEKRKKYGNLSKNGLEERR >gi|261747690|gb|ADAD01000104.1| GENE 53 37411 - 38361 1394 316 aa, chain + ## HITS:1 COG:FN0645 KEGG:ns NR:ns ## COG: FN0645 COG0181 # Protein_GI_number: 19703980 # Func_class: H 
Coenzyme transport and metabolism # Function: Porphobilinogen deaminase # Organism: Fusobacterium nucleatum # 1 314 1 294 298 277 51.0 2e-74 MKSNKIIIGTRGSILALAQAEKVKEMLIAKYDELRKKEDFKGISDFNKNESLTIELKIIV TTGDKDLRDFDKIKGTTQKELFVKEIEKEMLENKIDFAVHSLKDMPQQTPEGLLNACFPL REDNRDVIVSKSGKKLNELREKSVIGTGSIRRETELLNLRKDIIIKPIRGNIHTRLKKLD NGEYDAIILAAAGLKRTGLSDRITEYFEDEVFMPAPGQGILCIQCRENDDRIRELLEIIN DKEVGIMCETEREFSRIFDGGCHTPIGCSSTIEGNTLKLKGVYNKDGKRIFSEIEGDKNN PKEIAQKLAKEIKNNE >gi|261747690|gb|ADAD01000104.1| GENE 54 38354 - 39886 2087 510 aa, chain + ## HITS:1 COG:FN0644_1 KEGG:ns NR:ns ## COG: FN0644_1 COG0007 # Protein_GI_number: 19703979 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III methylase # Organism: Fusobacterium nucleatum # 1 250 1 250 251 295 60.0 1e-79 MSKGKVYIAGAGCGDEKLITVKLRNIIKEAECIIYDRLVNESILQYAQPDAELIYMGKAN TEGGELQREINETIVKKGKEGLKVLRLKGGDPFVFGRGGEEIEVLIAENIDFEVIPGITS SIAIPAYAGIPVTHRGINTSFHVFTGHMKIDGNELDFPTIAKLDGTLIFLMGLSNLEKIV TNLIKNGKNPETPVGIIKDGTTAKQKTYVGTMSNIVDIVKENNIKSPVIIIIGEVVNLRE KMKWFEDKPLFGENILVTRNKEKQGDITNKINELGGQALNLPFINIEYIDYEMPDLSKYS AILFNSINSVAGFMRKIKDIRILGNVKIGVVGEKTDEEMRKYKIIPDFYPKKYTVEKLAE ECVNFTKEGDNVLFIVSDISPVNEEEYTKLYKRNCKKLVVYNTKKVKVEKEKAERYVEKS DILMFLSSSTFEAFAESIDLTGNEEMKKILNEKTVASIGPVTTKTIEKYGIKVGIEAEKF TEEGIVDALLKRKQKILSAFLEKKQQKTAD >gi|261747690|gb|ADAD01000104.1| GENE 55 40142 - 41434 1918 430 aa, chain + ## HITS:1 COG:FN0540 KEGG:ns NR:ns ## COG: FN0540 COG0001 # Protein_GI_number: 19703875 # Func_class: H Coenzyme transport and metabolism # Function: Glutamate-1-semialdehyde aminotransferase # Organism: Fusobacterium nucleatum # 1 428 1 428 434 574 65.0 1e-163 MKTENSKQLFEEAKKAIPGGVNSPVRAFGSVKKDYPIFIERAKGSKIYDEDGNEYIDCIG SWGPMILGHNYPEVLECIKKEVEKGTSFGLPTKKEVELAELVKDCFPSIEKLRLTTSGTE AAMAAVRVARAYTGKNKILKFEGCYHGHSDSLMVKAGSGLLTFEHQDSNGITDGVMKDTI TLPFGDIKKVRELFEKEKDIACIILEPIPANMGLIETEKEYLEELRKVTEKENVLLIFDE VISGFRVALGGAQQLYGITPDLTVLGKIIGGGYPVGGFGGREDIMNLISPLGNVYHAGTL 
SGNPVSVAAGIKTISILKDNPNMYTKLNEKVAQFVKKLEELIKKYNISATVNSAGSLFTI FFSDKKVKNLDDAINTSDEMYNIYFDTMTENGIMVPPSKYEAHFISIIHTEEEMQKILEI TEKAFQKMRK >gi|261747690|gb|ADAD01000104.1| GENE 56 41598 - 41975 530 125 aa, chain + ## HITS:1 COG:no KEGG:Coch_1309 NR:ns ## KEGG: Coch_1309 # Name: not_defined # Def: NAD-dependent epimerase/dehydratase # Organism: C.ochracea # Pathway: not_defined # 1 125 164 289 293 139 52.0 3e-32 MAERGKVYLFGKGEYRMNPIHGEDLAEFCVNCIHSSEQELPVGGPEIFTQKELSQVAFKI SGKKEKIVYISDKIRVLILKIGKFLMPKTKFGPVEFFLNAMSIDMVAPEYGKHKIEEYFE ELKNQ >gi|261747690|gb|ADAD01000104.1| GENE 57 41997 - 43070 1516 357 aa, chain + ## HITS:1 COG:CAC1372 KEGG:ns NR:ns ## COG: CAC1372 COG2038 # Protein_GI_number: 15894651 # Func_class: H Coenzyme transport and metabolism # Function: NaMN:DMB phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 17 343 13 330 352 254 46.0 1e-67 MGYSEILNETLEKIVPLNKERMENKQKELNSLLKTPDGLGKLEKLAVQLEGIKENYSPEK KVVTVMAADNGVERENVSKSKRVITQYVVEAMLKGKSSINALGNTFGSDVKVVDLGIDEK TDIKKEIDLTGIISKKIMQEGTGNIAEGPAMSYETALKAIEIGIETVDKFISEGYNLFAT GEMGIGNTTTSSAVLKVFTDFPLGDIVGYGSGISDETLEHKKNIVKKAIEVNNRIEKINI KDTVDVLAKVGGLDIAGMTGVYLGCAKNKVPVVIDGLISSVAALAAYRLCKDVKGYMIPS HLSEEPGMKYIMKELEIEPFLFMNMKLGEGTGAIMMFPVIEAACNITKTVRKYPILP >gi|261747690|gb|ADAD01000104.1| GENE 58 43190 - 43807 643 205 aa, chain + ## HITS:1 COG:FN0808 KEGG:ns NR:ns ## COG: FN0808 COG0406 # Protein_GI_number: 19704143 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Fusobacterium nucleatum # 8 205 1 196 206 114 33.0 2e-25 MDNSGNVLKLYIVRHGETKWNVEKRFQGQTDSDLTEKGKEKVGKTGEELKNILFDAVYTS ELGRAVKSAEIILSKNINKKNRLQKLTELNEVYFGKWQGLSYNEIFEKYPEEANNYFYDI KKYYAKNIEAEELKDALDRFLKGMDKIVKNHKSGNILVVTHGTVLEMFINHVINKNTEDI DERKLIGNGKYRIFTYKNGEFLTAD >gi|261747690|gb|ADAD01000104.1| GENE 59 43847 - 44623 1343 258 aa, chain + ## HITS:1 COG:FN0950 KEGG:ns NR:ns ## COG: FN0950 COG2099 # Protein_GI_number: 19704285 # Func_class: H Coenzyme transport and metabolism # Function: 
Precorrin-6x reductase # Organism: Fusobacterium nucleatum # 1 257 19 264 266 191 43.0 1e-48 MIWIIGGTKDSRDILERLIKETDKDIIVSTATGYGGKLLEEYTENNGNGKREIKVISERL NEEQMKDLIKENNITEIVDASHPYAINVSNSVIKVKNETGIKYIRFERKMLDYGNDKVEK FDSLEEVNNYVKGINGKNILSTLGSNNLGDIKEMGENNNLYVRILPTTDSIKKAEDLGYL PSRIIAVQGPIDKVLNKAMLESYKIDYLITKESGATGGEQEKVEACRENGTTVLVVKRPF VDYGTVYNEIDDLMKEFR >gi|261747690|gb|ADAD01000104.1| GENE 60 44738 - 45802 1081 354 aa, chain - ## HITS:1 COG:AGc3896 KEGG:ns NR:ns ## COG: AGc3896 COG3842 # Protein_GI_number: 15889431 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 7 344 4 340 352 308 46.0 1e-83 MSESKSLIIKNLTKTFITKTRRVEAVKNVNLSINSGEFICLLGPSGCGKTTVLRMIAGFE IPTSGSIEIGGKDIAYLTPDKRGISMVFQNYALFPHMNVYDNIAYGLKLQKKGKQEIDER VKKILQLMKMEDFAERVPSQMSGGQQQRVSLARALIMNSSVLLFDEPLSNLDAKLRLHMR DEIRKLQQEVGITSIYVTHDQAEAMALSDKIILMRDGEIIQVGSPTEIYQKPNSEFVAKF IGRANILNAEILEKHSNFTKIKLLNSEYNVPETTTHNVGDNIKIVVRPESMDFSKNSNTL KVVKSVFMGENHEYEVQSENEIIEIILSNPHGKEIKKVGEMLTFGIDEEAIHIL >gi|261747690|gb|ADAD01000104.1| GENE 61 45822 - 47537 1756 571 aa, chain - ## HITS:1 COG:SMb20364 KEGG:ns NR:ns ## COG: SMb20364 COG1178 # Protein_GI_number: 16264098 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Sinorhizobium meliloti # 22 553 96 643 681 306 34.0 6e-83 MNNDFKLKLNHELKNIRKLINDPVLLLTIIFSLAVVTFFVLVPLWNILLESLKIGKGKIG LENYIESFTASGNFQVILNTILLGFVTSIISLIIGFFFAYVSVYIKIKGKKIFDFIAMLP IISPPFVIALSAIMLFGRQGIITSRMLGMKGFEIYGFHGLVLVQVLSFFPIAYMMLVGLL QMIDPSVEEASRSLGASRLKVFTTITFPLMLPGLANAFLLVFIQSIADYANPFVIGGKFT TIAVKIFQEGVGNYELGIATALSIILLSLSITMFTIQRYYINKKSYVTVTGKVSRERELI DSKKIVTPIFTVLILLTSFVIIMYILIPLSSFVEIFGIKNNFTLRHYQEIFRFKNRIAPI IATTNLSLWATLIASVFSMIIAFLIVRKNFIGKTFIEFTTIMALAIPGTIIGIGYAISYN KVYKIPFTNITLIPTLTGTAFIIVMAFIIRSLPVGVRSGMSALNQIDPSIEEASTILGAS 
SRQTFFKVTIPMIREAFLSGLIYSFARSMTLVSTVVFLISAKWKLLTPEIMNHIDQGRIG IASAYCTILIIIVSTFMIIVKILMSFTNAKK >gi|261747690|gb|ADAD01000104.1| GENE 62 47630 - 48661 539 343 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167854980|ref|ZP_02477755.1| 50S ribosomal protein L13 [Haemophilus parasuis 29755] # 1 332 1 338 346 212 36 4e-54 MKKIFLAILLIILTLSCGASDDSDKKEGKGKLLFYAGLQEDHAALIAQEFEKETGIKTEF VRLSSGETLARLKAEKDNMTASVWYGGPVDGMIAAIDEGLIESYISPEAANIKPEYKDSK GYWTGIYVGYLGFVGNKKMLEEKKIPMPSSWADLLKPEYKGEIVVAHPGSSGTAYTMLAT LVQLMGEEKAMEYFKKFNGQVRQYTKSGTAPGRMVGTGEATIGITFLHDGIKYQKEGYSD IILAAPSEGTGFEIGGVALLKNGPDQENAKKFIDWVLSKKVQELGKTVGSFQFLTNKDAI NPEEAKVVEGAKLIKYNFDWAGKNKKRLVDRFTKDTNTTLPKE >gi|261747690|gb|ADAD01000104.1| GENE 63 48885 - 50429 2011 514 aa, chain + ## HITS:1 COG:CAC1321 KEGG:ns NR:ns ## COG: CAC1321 COG0554 # Protein_GI_number: 15894601 # Func_class: C Energy production and conversion # Function: Glycerol kinase # Organism: Clostridium acetobutylicum # 13 512 2 498 498 755 73.0 0 MDNKKIIQIYGSKKYIIALDQGTTSSRAVIFNKNTDIISSAQREYTQIYPQPGWVEHNPM EIWATQRSVLTEVIAASGISLKDVAAIGITNQRETTIVWDKNTGEPVYNAIVWQCRRTAD ICEKLKEKGLEEYVKENTGLIIDAYFSGTKLKWILDNVEGAREKAEKGELLFGTVDTWLV WKLTAGKVHVTDYTNASRTMLFNIKELKWDKKILKKLNIPESMLPEVKNSSEIYGYTKMG LTMGEESGTNIPIAGIAGDQQAALFGQTCFNKGDIKNTYGTGCFMLMNTGKNCIKSNNGL LTTIAVGINGKVEYALEGSVFIAGAVIQWLRDEMKFFSDAADTEYFATKIEDNGGVYLVP AFVGLGSPYWDMYARGTIVGLTRGSNRNHIIRAALESIAYQSKDLIDAMKEDSGLDLTSL KVDGGAVANNFLMQFQSDILNIEVLRPEIIETTALGAAYLAGLATGFWKDKEEIIKKQKL NRKFLPDLSEELRSKYFKGWKKAVEKAKNWEESE >gi|261747690|gb|ADAD01000104.1| GENE 64 50763 - 51509 660 248 aa, chain - ## HITS:1 COG:no KEGG:Selsp_0263 NR:ns ## KEGG: Selsp_0263 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 65 236 65 237 250 123 34.0 7e-27 MKLQFLKEIPLETLRKNIENNFSNYALPTNEWIFSFFGDVSPFLDFKKEVPDFELNMSYE KPEKSDIENIKILYSSLKDLSVVEATSESLWVGMAHSTFWNYMQYRLYLDRNKSYINNIK NSFFFSKGRKRSLIEHTLSRLWWAGKMAYDENREDPFELLEVFNTDFRTKLLYLFSSNFT 
NNPTITRAFLSSVLDFEKNGVKIKREIFNETIMYLNILGGTYILDYFEESELKDKITKKI DDLLLASD Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:06:58 2011 Seq name: gi|261747678|gb|ADAD01000105.1| Leptotrichia goodfellowii F0264 contig00063, whole genome shotgun sequence Length of sequence - 10802 bp Number of predicted genes - 11, with homology - 10 Number of transcription units - 4, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 2192 2972 ## COG1404 Subtilisin-like serine proteases 2 1 Op 2 . - CDS 2232 - 2636 631 ## Lebu_1215 hypothetical protein 3 1 Op 3 . - CDS 2600 - 3535 792 ## COG0739 Membrane proteins related to metalloendopeptidases - Prom 3666 - 3725 9.2 - Term 3719 - 3776 1.4 4 2 Tu 1 . - CDS 3862 - 4443 733 ## PPSC2_c2539 transcriptional regulator - Prom 4511 - 4570 8.1 + Prom 4589 - 4648 10.6 5 3 Op 1 . + CDS 4678 - 5253 196 ## PROTEIN SUPPORTED gi|163764517|ref|ZP_02171573.1| ribosomal protein L32 + Prom 5260 - 5319 4.4 6 3 Op 2 . + CDS 5355 - 5792 266 ## gi|262038290|ref|ZP_06011679.1| hypothetical protein HMPREF0554_0940 + Term 5846 - 5879 5.4 - Term 5834 - 5867 5.4 7 4 Op 1 . - CDS 5875 - 7428 718 ## PROTEIN SUPPORTED gi|86153404|ref|ZP_01071608.1| 30S ribosomal protein S1 8 4 Op 2 . - CDS 7492 - 7944 331 ## COG1135 ABC-type metal ion transport system, ATPase component 9 4 Op 3 . - CDS 7966 - 8085 172 ## 10 4 Op 4 . - CDS 8163 - 9098 940 ## Lebu_0355 RimK domain protein ATP-grasp 11 4 Op 5 . 
- CDS 9172 - 10428 1634 ## COG1301 Na+/H+-dicarboxylate symporters Predicted protein(s) >gi|261747678|gb|ADAD01000105.1| GENE 1 2 - 2192 2972 730 aa, chain - ## HITS:1 COG:FN1426_1 KEGG:ns NR:ns ## COG: FN1426_1 COG1404 # Protein_GI_number: 19704758 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Fusobacterium nucleatum # 74 500 28 527 528 147 28.0 7e-35 MKKNKKILILALLTVVISCGGGGGGGGGGGNTSNPTPSIPTPTPSVDSSKDSNGNIKWND TTRTYNAGNPNNKTSATTQTGAGVTVGVIDMGFNTTNITYATDMRDKFGSRLVKPSSYTG QTSNDNHGIIVAEIIGGNTSNGIAKNVTIAVSDASKRGSDGKYHPDPTISMYNFLYSKGA RIFNQSYGVDSQVTDSKWNNPYYVEAQMGSDILSFYKTAVNDGSLFVWAAGNNKKDSNPS LEAGLPYRVSELEKGWINVVGLANSSKENPGNTEWSKQERLSGAGVAKNWTVSAVSGITF KTGDNNYQANGSSFAAPMVTGTAALLKEKYPWMDGSLIRQTILSTATDIGKKGVDEDFGW GLLNIDKALKGPALFDKRLALGDNVVADVTSGTYTFSNDIAGDAGIIKNGTGELVLSGKN TFTGGTKVNAGRLKLGNQYVSSLEIAKMGTVETTENANLENGIKNEGTYVNSGSNTVIGG NYVASSSSKYVSELGASAYVKGQAELNGTLEASAVKDGEQQYITARGVKENIITSDNAIK GNFSNVETETLLNADVEKSENSVDVNLSRKNVEDYVSTLSLSDTMRNNVAQNIETSFKEL DSQIENGNTENVKSFSRSAALIQKMSLPNAAAVLDSLSGQIYASAQALTFQHSQTVNKDL SNRLVMLGTLDNVGDNAGLWVTGIGANGRLRQEGFGVGKTHTYGGQVGVDKAFGNSLILG TALSYSKSDV >gi|261747678|gb|ADAD01000105.1| GENE 2 2232 - 2636 631 134 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1215 NR:ns ## KEGG: Lebu_1215 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 4 119 21 133 158 95 52.0 8e-19 MGIFSNNNGKELNVSEGISVISAKTYVKGVIETDTMTQIEGVLEGDIKCDNTVHVSDGGR IVGNIVAKSVFVDGEVSGDILAEKVEIGEKGKVLANVVSTLFVIQEGGVFEGNKKMKKAE TYIEYESSDEDSER >gi|261747678|gb|ADAD01000105.1| GENE 3 2600 - 3535 792 311 aa, chain - ## HITS:1 COG:HP1543 KEGG:ns NR:ns ## COG: HP1543 COG0739 # Protein_GI_number: 15646150 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Helicobacter pylori 26695 # 109 310 101 296 312 155 40.0 8e-38 MEKKRKIELETLNALKEKKMSEVFGDKGKKEKKEPKKKAKKQNQFNEGIIIQVGECGTKS 
FIKKSSSGKKNLRKIFNVVVFGIFLLMAGIIWFLFVEYQSFRLYKTKNKEIVALGKKYEE EQKMKAAELKDLKNLSLDNVQNVPEDKKNLMLSIIPSGSPLLRELFVTSPFGERVHPITH ERKFHHGIDLRVDIGNPVVSTAIGRVSFAGVKGGYGNVVIVDHMYGFQTAYAHLNKILVK EGEIVGKGKIIAEGGSSGHSTGPHLHYEVRYNGTPIQPKNFIDWDSKNFNIIFKNERSVP WEYFLTIMGKN >gi|261747678|gb|ADAD01000105.1| GENE 4 3862 - 4443 733 193 aa, chain - ## HITS:1 COG:no KEGG:PPSC2_c2539 NR:ns ## KEGG: PPSC2_c2539 # Name: not_defined # Def: transcriptional regulator # Organism: P.polymyxa_SC2 # Pathway: not_defined # 1 192 1 191 196 108 32.0 9e-23 MRKIDKKLNGTYEKILQTTIKLIQEEGSKNLTMKKVAEKAKVNKALVNYHFTGKENLIKE AVSVMSKDLWAIFDVLDNYDLDARSRFKKFLTDFLNQLVKFKESVKYLLCEEEFLFDEQR QFSAYVKNVGLPKMKKVLFEIIGEESRVSENLIIMQMLGALALPNLINSKENGVFKFKLP EIEKQVEIIVERL >gi|261747678|gb|ADAD01000105.1| GENE 5 4678 - 5253 196 191 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764517|ref|ZP_02171573.1| ribosomal protein L32 [Bacillus selenitireducens MLS10] # 22 178 4 167 190 80 32 6e-15 MNQNLLINNIVTVKNKHYAIAKNILLVLSGTIFIALMAQVKIFLPSTVIPVTGQTFAVML IGLTYGKKLGTSTLISYITAGSLGLPVFANFKSGFPFFSPSGGFVIGFLFAAFICGYFAD KGATKSYAKLIPVLLLAHFVLYFFGLLQFRLFFPGVNVFEKAFIPFIPGDIIKMVLMIGI LPTVWKFVKEQ >gi|261747678|gb|ADAD01000105.1| GENE 6 5355 - 5792 266 145 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038290|ref|ZP_06011679.1| ## NR: gi|262038290|ref|ZP_06011679.1| hypothetical protein HMPREF0554_0940 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0940 [Leptotrichia goodfellowii F0264] # 1 145 1 145 145 246 100.0 5e-64 MRKLKKENRQNAENANLSNNIDKEKIEKQIRFILLISWILPFFGLFFLHFSRKTLHDKPK EILCKIINMEFTSIIILFILTSLLNTLAIAKVSGIILLFITVIMLGILIYAVISHVIGTL KWSKGEDYSFKYSLEFFKPYENINK >gi|261747678|gb|ADAD01000105.1| GENE 7 5875 - 7428 718 517 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|86153404|ref|ZP_01071608.1| 30S ribosomal protein S1 [Campylobacter jejuni subsp. 
jejuni HB93-13] # 1 511 1 539 556 281 32 2e-75 MSEFNDLFEEMLNDYLPEEKKSGDIVEAVIVRKDNDFGYLDLSGKQEGRILIREIEDFNI GDTIEVKVLRNDEENVIVSKFLLDKAKEFVDFSEGETVTGTIFKKIKGGYSVKVSKNEGF LPFSLSSFERNNDHIGEKFKFIIKEKTRNGLTLSRTDLIKKEEEGYLKSINIGDTLTGEV KKVLDFGIIVNLGVTTGFIHISELGWHHGADALKNYKEGDKIEAKVIDIDNEKKSVKLSV KQLSENPWNSVKEKYKIGDILEKDVSEVFEFGLLIELEKDIEGLLHVSDLAYRRVSNLNS KYKKGDVIKFKIIGFNDEKNRLSLSAKALFDDMWDKIEELYNVGDVVKGKVVNVQEYGIF LEIREGVEVFIHKNEFSWDRNENIKFKVGDEVEFKIISLDKNEKKIGGSIKQLTVSPWKE AAEQYKIGNKVKVPIVEIQENGVLVKLTDRFNGMVPKKELTDEFLKDISEKFSVGDEVEA VITETNEKRKSIILSVKKIKEIEDKKELEELMKIYGV >gi|261747678|gb|ADAD01000105.1| GENE 8 7492 - 7944 331 150 aa, chain - ## HITS:1 COG:STM0247 KEGG:ns NR:ns ## COG: STM0247 COG1135 # Protein_GI_number: 16763636 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, ATPase component # Organism: Salmonella typhimurium LT2 # 1 106 1 106 343 124 53.0 5e-29 MIEISNVKKVFKNKKTEVHALKDVSLKVEKGDIFGIVGYSGAGKSTLLRLVNLLEKPDSG SVKIKGKEIVYLSERELNKLRKNIGMVFQQFNLLEAQTVYQNLKIPNGLKLLWRLITQKN LKNIWKQIIKVCGMYLMRINKKGNSILHFF >gi|261747678|gb|ADAD01000105.1| GENE 9 7966 - 8085 172 39 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKEIQNLQKIFKKEFKWFHKHPELPLEKYETTKQIEEIL >gi|261747678|gb|ADAD01000105.1| GENE 10 8163 - 9098 940 311 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0355 NR:ns ## KEGG: Lebu_0355 # Name: not_defined # Def: RimK domain protein ATP-grasp # Organism: L.buccalis # Pathway: not_defined # 1 311 1 314 315 506 81.0 1e-142 MAKVYIIHENIDWTQHLIKRLEELSVPYEELNLSEGVIDIQKEPEKGVYYNRMSASSHMR GHRYAPELTEQVLTWLENKGAVVINGTSAINLEVSKLKQYILLQKFGIKTPESLGAVGKE QILSAGKKLNKYPFIIKHNRAGKGLGVRLIRSYEELKNYVEGGQFEDSVDGISLIQEYVK PADGHIVRAEFIGGKYFYAVKVDSRDGFELCPADSCQIDDKFCPVGESSSKENKFKIIDR LPEEQILKYEQFLEEHNINVAAIEFIVNEDNEVYVYDINTNTNYNADAEKIAGKYAMLEL AEYLKNELEKL >gi|261747678|gb|ADAD01000105.1| GENE 11 9172 - 10428 1634 418 aa, chain - ## HITS:1 COG:BS_gltT KEGG:ns NR:ns ## COG: BS_gltT COG1301 # Protein_GI_number: 16078086 # Func_class: C Energy 
production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Bacillus subtilis # 9 402 5 407 429 241 37.0 3e-63 MKSFLGFKKLNLLVQITIGGLLGAFAGYLFPSLGKELKILGVLFTNLIQMVIVPLIFPIV VLSIVTIKNSNKFGKLTFKTFSFFFLITTLLITISLVLGKVSGAGSSIHAVNVSTEAIKG VASDININDFLLSIVPKNVVSALSEGNLLPVIFFAVFLGIALIAIGDDNKPVIDFFEAWS KAMFKIVNYAISFAPVGVFGLVASNVAKTGLGDLYLLGQFVLLLYVGYLSTVLIVFPIIA YIYKVPYIRLLSNIKDLILIAFTTGSSSVVLPTLIERSEKNGVPSEISAFAVPLGYSFNL TGACVYISLSVAFITNLYGNPIQWVQLLPLILFLTIITKGIAAVPAEALVVLLATAGQLN LPPEGVALIVAVDFLANAGRTAVNVVGNALIPTIIAKSENEFIEESSDLKEKVLINSN Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:07:23 2011 Seq name: gi|261747676|gb|ADAD01000106.1| Leptotrichia goodfellowii F0264 contig00007, whole genome shotgun sequence Length of sequence - 422 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 44 - 421 249 ## FN2115 hypothetical protein Predicted protein(s) >gi|261747676|gb|ADAD01000106.1| GENE 1 44 - 421 249 125 aa, chain - ## HITS:1 COG:no KEGG:FN2115 NR:ns ## KEGG: FN2115 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 27 123 51 150 151 79 43.0 7e-14 RTSKTREKNGGGAINNEKYESGVKESMKDILKRPLNKREQMDGLTLILPENTSINTKMGN VIDLKTGYGIPIIFSKTNRCSNIFYHKKIGPDSYYSLSYNDYHTLTNEIAQKIIKANGFT KTCSK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:07:28 2011 Seq name: gi|261747668|gb|ADAD01000107.1| Leptotrichia goodfellowii F0264 contig00026, whole genome shotgun sequence Length of sequence - 10973 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 1, operones - 1 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 1719 2281 ## Smon_0125 polysaccharide lyase 8 alpha-helical 2 1 Op 2 . 
- CDS 1734 - 3179 1496 ## gi|262038306|ref|ZP_06011693.1| hypothetical protein HMPREF0554_0365 3 1 Op 3 . - CDS 3149 - 6064 3133 ## Sterm_0013 autotransporter beta-domain protein 4 1 Op 4 14/0.000 - CDS 6085 - 7650 2334 ## COG1653 ABC-type sugar transport system, periplasmic component 5 1 Op 5 7/0.000 - CDS 7699 - 8577 1113 ## COG0395 ABC-type sugar transport system, permease component 6 1 Op 6 . - CDS 8621 - 9535 1085 ## COG4209 ABC-type polysaccharide transport system, permease component 7 1 Op 7 . - CDS 9581 - 10660 1531 ## Lebu_1920 hypothetical protein - Prom 10693 - 10752 9.1 Predicted protein(s) >gi|261747668|gb|ADAD01000107.1| GENE 1 3 - 1719 2281 572 aa, chain - ## HITS:1 COG:no KEGG:Smon_0125 NR:ns ## KEGG: Smon_0125 # Name: not_defined # Def: polysaccharide lyase 8 alpha-helical # Organism: S.moniliformis # Pathway: not_defined # 5 563 4 564 739 507 48.0 1e-142 MKNTVLLMMILIASISSGVTQKNEFDKIRKKWFEIIVGLPEGNDPQIRKDVEKYALQTEN NAEKAIGKLNNEQDKKSLFNDLKDMKNGAHVLSSFENVKAMAKAYMTPGTKYYKNKDLKK TITESLEWLNKNAYHEGLPELGNWWQWEIGIPKSVNEIGIIMYGEIPQSLLNNLMNASKY FQPYATHSGSSPAAKHSTSPKKRVSKGGNRMDTAIISFGRGILTKDKAETMNGVNAVGEI GEIVTKGDGFYSDGSFIQHENVAYSGTYASVLFNGLGTLLYVSEGTGFLPKDTRLENLYK SITKGYSYLIINGGINDSVSGRSISRDNSNDLERGKSLIAAFAIISEGVDSPYKEEIRKL VKKSVSENNHFNIVEKINNPVIKNIVKNIAENNSIKVEEMSGTKIFSSMDRAVQVNKKNG KFLVSMHSSRIANYETMNGENLKGWHTGDGMTYIYGISSSDYVDFWETVDNLHLSGTTES TNDRGPGSGERRVPSSVSPNSWAGGATDGETAMIGMDFVSWNDKTVAKKSWFMLGEEIVA LGSGISSSDGEIHTTLDNKIINDINDKKITVN >gi|261747668|gb|ADAD01000107.1| GENE 2 1734 - 3179 1496 481 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038306|ref|ZP_06011693.1| ## NR: gi|262038306|ref|ZP_06011693.1| hypothetical protein HMPREF0554_0365 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0365 [Leptotrichia goodfellowii F0264] # 1 481 1 481 481 830 100.0 0 MFKVSVYGGIGFKRVFQGNINENREELPSRASNLTIKNSPVNEKYYNSVIPSIGINFEKA GLLFNKVYRFGIGAGYETETGNINKFKYYKLQGLNNEHKVKSANMKNVFYCNIDTVLGLT ENLAITANYKHAKGKEYKSDKITLGFEYKGNIFKNNIFNSIVAKLSNSRKNSRWKKTLGF 
TLEAEDPSDRAYYYGNVLSSGDYSKSADYKPLVWLTLLDTKSKISYYFEGYYKNNSMFQG KNSKEDTASSSRLMAEMRYRNTFSKGNYGFSIGYKIENLKKPKNYPKKDYYVSKSGFYEL KLDGNISYNLGNGLYFTGKTTGMVDFYYKGERKNQIDYKVENQYGFIYNGFSPKWKLQLM YFRDDRRFDNKNNARKYVSEQLRPSVIYNFGNGDSVTFDLRLPLGKGAWYKTAETKERKS EVYEYRYGIKYNKGIVPGLNVFASLAISEIKVKNTDEKSSKYGEKTRNHVFKPSIGINYS F >gi|261747668|gb|ADAD01000107.1| GENE 3 3149 - 6064 3133 971 aa, chain - ## HITS:1 COG:no KEGG:Sterm_0013 NR:ns ## KEGG: Sterm_0013 # Name: not_defined # Def: autotransporter beta-domain protein # Organism: S.termitidis # Pathway: not_defined # 95 908 1273 2077 2276 73 22.0 5e-11 MKKNKLSLFILLAILMCNQGHATLYPDQNNNTDPRRNHLITNKIITPNDLNGDTVISIQP GKYAILLGKEDTDSPVNIDIEHSQNKNKKIININSGITNNNFVINRGIIKGKNKSNFNFE LISIQKGKFRNEGILEISGAKGVGINVTNTAELENKGEISVGNKAKGIFLNYNNSKFENY AKINVTGQESIGITGNKNKKANITSYHNGVIEVRDKATGVSLENKVFFKNEGQINVFSNN RLTSGSEAIGVNVNSEDKNSFENYGIINVKGTGKNPYSGYNAKGIFINNQNGNALNFGII NADNNGIGVISKGYFTNSGTINADGKSVGINILKTNTVPIEKTVNTGVIKASGESYGIQA AGNANFENSENGKIYVSEGATGIVAVKESSETENGNVINKGNIFVSGDKSTGIQANDNRK ITTTGLVEVDAGRYTAGIKASGKNSQLTNTGKIVHKGDEKNTSNAAVSQMKGTIENKGEI EVLSGKLSSGIAVGENSFGINSEAGKIKINEGTGIKIEKNGIGINKGEILLKNRTTTGIS LSNSIFINDTGKIIGDEKGNAVISKPLTNNTVVLKNRNSVNGNIIGNTGIDTLFLENANI DRINVKKYNSLNLIGNSYIKNSEISLESHENARKYSEDIKKSLERESYISNRTGTLTLED STLTLNKMDSTPIYVNGNQITLKGKINLLFKGEGKEFSTSEYLGGIKVDGSQAKFEDTLV WTYRKKGNADVVLSKNRYSDVNGNKNLNDFAEYLEAERNANKTNGDLDKVIGKMEMLKDK EDFNNALAQISGVVHSYVIDSVLQESELFKDLSLKKVFSRDITRNSPLNSVIHNIYYIND TGNLSGTIEAERKEKGIAGIWEKQIINDNKLGFIYGASKGDLNFKKNNAGNAHLETVYFG GYYNYNLNNKISFLTTVGTTYSHASIKREINIGNQNYKFNSSYPVITLGADSKIIYTPIE KICLKYRFTEE >gi|261747668|gb|ADAD01000107.1| GENE 4 6085 - 7650 2334 521 aa, chain - ## HITS:1 COG:AGl3560 KEGG:ns NR:ns ## COG: AGl3560 COG1653 # Protein_GI_number: 15891902 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 25 520 24 519 522 374 43.0 1e-103 
MKVKKILIMALICITAILACKNEKSASDGGSGKKLSGNLITEKPREVTIFAIFQGKAIDG ELPVFKKAFEETNIKLVSVASKNQSDEVQGFNLMLSSGKLPDIVAYELVSDFENLGMEGG LIPLEDLIDKHAPNIKKFFEENPRYKKDAIAADGHIYMIPNYNDYFNLKTTQGYYIRKDW LKKLNLQEPKTVDELYNVLVAFRDKDPNGNGKKDEVPLFLRGNIIRKIMMGLTDIFKARA VWYDDKGTPQFGPAQENYKEAMIQLAKWYKEGLIDKEVFTRGMNSRDYMLSNNLGGFTND WFSSTGTYNDKLASKIPGFDLSIMLPPEYKGNNKTFFVRPTYIGGWGITSAAKNPVELIK YFDFWYSEAGRRLWNFGIEGEDYKMVDGKPVFTDKILKDPQGKNPLAVIRSTGAQYRLGM FQDAEAEKQLSGQNEAVDLYIKSNVIEDDLPVLKYTKEEAKEFLKIDTQLRAYTEEMTQK WILGVSDVQKDWDNYVKRLNEIGLKRAEEIQKTAYARFMKK >gi|261747668|gb|ADAD01000107.1| GENE 5 7699 - 8577 1113 292 aa, chain - ## HITS:1 COG:AGl3561 KEGG:ns NR:ns ## COG: AGl3561 COG0395 # Protein_GI_number: 15891903 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 9 291 11 292 293 241 47.0 9e-64 MKIKTGIDEKIFYTINYILLAVFALMFLYPIVYVFSASVSKPYFIETGAVTLLPKGFNFE AFKSAAAIPGIWRAYANSIFITVVGTAVSMMFTISGAYVLSKPELKFRKIIVILVVVTMW FDPGMIPKYLNFRDLGLINSYTGIILGFAINTFNVIILKTFFEAVPKSLEESARIDGASH FKIMTHIYLPLSTSAITTVSLFYAISRWNGYFWSMVLLTDDAKAPLQVFLKKLIVEKNMA GEATQLITPESLTSPQTIIYAVIVLSLIPILLIYPFIQKFFKKGVAIGAVKG >gi|261747668|gb|ADAD01000107.1| GENE 6 8621 - 9535 1085 304 aa, chain - ## HITS:1 COG:AGl3564 KEGG:ns NR:ns ## COG: AGl3564 COG4209 # Protein_GI_number: 15891904 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 17 304 22 309 309 311 52.0 1e-84 MRAKLKKTKFNKEQLSLYLLLLPFILWYAVFVYKPMYGLIIAFKDYSLFKGILGSDWVGF KNFTEFLTSPYFYVTLKNTLMINLYSLFLEFPFAIMLALMLNEVKNKLFKSFVQTASFIP YFIAIVVAAGIAINILSPSTGVINLLLEKLGMEKIYFLSKPEYFRGIFTGLNIWKNTGFN AVIYLAALTTIDEQLYEAAKIDGADKFKQLRYITIPGIIPTIVIMLVLKVGSMLNVAFET VLLLYQPATYSTSDVISTYVYRTGMLQQDFGLATAVGLFNALVGFILVYSANKWSKKVSS SSLW >gi|261747668|gb|ADAD01000107.1| GENE 7 9581 - 10660 1531 359 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1920 
NR:ns ## KEGG: Lebu_1920 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 15 359 14 370 370 214 36.0 6e-54 MKKIIGLTALMTVISGVVYGNQDVLKNIRVTTEYRQAWTDKDGNEANNIGNAGFKSNNKT FARWRNIVSGELKLSDEWGIDSKFNIIHDNDSTFQKGKRNNIKKESWENNLEFSKKINIG NLETTTTLGWLHKSSMKYKKNESKKTAGISNEIYFGPTFGMKLFGQNITTTLEAVYFKSN GEKNGEYYLSGSGFEGGKTDGWGLNARFRTNGNIYQGKVGKVNYWIDLKNKFRDPHGKIA ATGKEAKSSVKLVYVAGIKYTTPTFAGFTGSISADNEWEKHTAKTGYKNTISVWTELGYS KSFETGVGKVSISPFVKYRVLQRATLKDKNNRGLKNYGYKRTTETNEVRAGLNIGLTVK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:08:17 2011 Seq name: gi|261747659|gb|ADAD01000108.1| Leptotrichia goodfellowii F0264 contig00080, whole genome shotgun sequence Length of sequence - 6607 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 5, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 173 229 ## gi|262038314|ref|ZP_06011700.1| D12 class N6 adenine-specific DNA methyltransferase 2 1 Op 2 . - CDS 200 - 733 539 ## Lebu_0734 hypothetical protein - Prom 760 - 819 11.6 + Prom 716 - 775 12.2 3 2 Tu 1 . + CDS 865 - 1992 1317 ## COG2843 Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) + Term 2006 - 2040 2.8 - Term 1985 - 2034 8.1 4 3 Op 1 . - CDS 2041 - 2757 905 ## COG3022 Uncharacterized protein conserved in bacteria 5 3 Op 2 . - CDS 2754 - 3695 1155 ## lse_2076 lipoprotein, putative - Prom 3718 - 3777 6.1 - Term 3719 - 3774 10.2 6 4 Op 1 . - CDS 3796 - 4800 1613 ## COG0208 Ribonucleotide reductase, beta subunit 7 4 Op 2 . - CDS 4841 - 5467 736 ## HAPS_0790 hypothetical protein - Prom 5529 - 5588 1.6 8 5 Tu 1 . 
- CDS 5590 - 6546 1107 ## COG0270 Site-specific DNA methylase Predicted protein(s) >gi|261747659|gb|ADAD01000108.1| GENE 1 2 - 173 229 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038314|ref|ZP_06011700.1| ## NR: gi|262038314|ref|ZP_06011700.1| D12 class N6 adenine-specific DNA methyltransferase [Leptotrichia goodfellowii F0264] D12 class N6 adenine-specific DNA methyltransferase [Leptotrichia goodfellowii F0264] # 1 57 1 57 57 95 100.0 1e-18 MIIKFENDNSLRVFSRYPEGKISLIGEEKEKYERVLDENPGLRNTDYNDKLKFPDKV >gi|261747659|gb|ADAD01000108.1| GENE 2 200 - 733 539 177 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0734 NR:ns ## KEGG: Lebu_0734 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 9 159 23 182 358 100 40.0 2e-20 MKINRFLFFILFSVLIFSASPKKSLQEIQQSENKETETVEVEIIEKDETDYYKKDDRELI YLDRIVDEEEMKSDKKGKLEKAISRFDKKINPVLKFYVPATYGDWYVIKTTNTQELNYEN IKYNFKQEENGYRITKMYYDPLRKVWNEDKERAWIKEKKEKYIYILKKHVSKALKMK >gi|261747659|gb|ADAD01000108.1| GENE 3 865 - 1992 1317 375 aa, chain + ## HITS:1 COG:NMB1147 KEGG:ns NR:ns ## COG: NMB1147 COG2843 # Protein_GI_number: 15677023 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) # Organism: Neisseria meningitidis MC58 # 164 375 1 211 211 253 59.0 5e-67 MIKRKIIFSILLVTLFYSCGKNSDNVSETKNEKIVESKIQKDNEKKEIAKETVTVIGVGD IMLGSNYPSPSLLPENDADILENTRKILENADITVGNLEGTLFDKGGSPKGCSDPSVCYV FRTLSKYGEYLKNAGFDYLSIANNHSNDFGSEGINQTMKNLENLGIKYSGIKEQAEFALL EKDGIKYGFVSFAPNNRTVSINDYEYASKLIKSTKEKSDILIVMFHGGAEGNGHQNIPRK NEIFYGEDRGNVFKFARMAVDSGADIIFGQGPHVTRAVELYKNKFISYSGGNFATYGKFN LSGPSGISPIFKITVDNKGNFLSGEIIPTKQYKGVKGPFVDKNNEAIKKIISLSKQDFPE GNGLEISSEGKITKK >gi|261747659|gb|ADAD01000108.1| GENE 4 2041 - 2757 905 238 aa, chain - ## HITS:1 COG:FN1762 KEGG:ns NR:ns ## COG: FN1762 COG3022 # Protein_GI_number: 19705081 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 238 1 
245 248 130 38.0 3e-30 MKIIVSPSKTKKISNLRGIEIYEEPCFSEITHKIVKELEKMSVSEIEKKFKLKKEQAEKL SKFYKTFETQDYGHALGTCTGIAFKSIDTNSFNKKDLEFAQEHLLILSALYGVLTPFTKL KEYRLDMTNSVFKEKSLYEVWKKDINDYFKDEKCIINLASKEYSKVIEAENLYDFEFYDR KDGKLKQISTNSKKIRGFTVNYIVKNRITNVRDLKNISLNGYEYNEEMSNEKKYVYVK >gi|261747659|gb|ADAD01000108.1| GENE 5 2754 - 3695 1155 313 aa, chain - ## HITS:1 COG:no KEGG:lse_2076 NR:ns ## KEGG: lse_2076 # Name: not_defined # Def: lipoprotein, putative # Organism: L.seeligeri # Pathway: not_defined # 30 211 41 217 335 76 29.0 2e-12 MKLKNLLLTVLLIAGLIMSCGGKKEDKQEKNNQKYNEYVEFFNDIKMGKFDKFYNIYMGK FSNDSGDYKTPDENSIKEIIGLGENVDYFEKRIEEMEKLVNKKPEFEVDKNLQEFLGAIK AKNEVMSEIIDYYKNGEYKKDNFEKGKQLHLKYVDINKKAMEKYIPYTENMKKLMAKNRE ERLKDLKTKGLKAKYYMVKFIDDTDKFSNKLYESEEGEFAEEDVKELKELSQNLRKTYEE WVKVDAKNITEEKYDLGEYNKLRGNAGMITDLADNIIKKVEAKSDDLYKLAENFSNGHQK LIIDYNKMIGAVK >gi|261747659|gb|ADAD01000108.1| GENE 6 3796 - 4800 1613 334 aa, chain - ## HITS:1 COG:BS_nrdF KEGG:ns NR:ns ## COG: BS_nrdF COG0208 # Protein_GI_number: 16078802 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, beta subunit # Organism: Bacillus subtilis # 4 329 3 326 329 308 50.0 1e-83 MTKKIYEAVNWNTPENDYVEMFWEQNLKQFWIDTEYIPSKDIDSWHSLEPEMKLAYLQVL GGLTLLDTLQSHTGMPKIIDHIESLQCRSVLSYMCMMETIHAKSYSTIFTTVASTREINE TFNWVQENEHLQYKANKIDYYYQKMNNPKATKREIYMALCASVYLETYLFYSGFFLPLWL AGQGEMVASCDIIKKIIADESIHGVFVGLLSQEIYATFSEEEKESVRKELKELMHDLYEN EARYTDEVYGDVGLTADVKEYIRYNANKALMNLGFEEEFEVKDVNPIVLNGLNVETTQHD FFSKKSTNYEKALEVVHLNDEDFEFKKEEIDLEI >gi|261747659|gb|ADAD01000108.1| GENE 7 4841 - 5467 736 208 aa, chain - ## HITS:1 COG:no KEGG:HAPS_0790 NR:ns ## KEGG: HAPS_0790 # Name: not_defined # Def: hypothetical protein # Organism: H.parasuis # Pathway: not_defined # 1 207 41 246 247 283 65.0 4e-75 MHYNLDEIEYVKAVVLNGYKADINVQIQIKLRKAVDTENIQVKLVSNRKGFNQVDKRWLK NYNEMWNIPEDIFKLLQYFTGEIKPYIKNTRENRRMFLDKFSVQEQEKLINWFEKNKIMI LTDVIKGRGEFSAEWVLVAQKVSNNARWVLKNINDVLNHYYGNGKVEVSPKGSIKFCKLT IQRKGGDNGRETANMLQFKLDPAELFDY 
>gi|261747659|gb|ADAD01000108.1| GENE 8 5590 - 6546 1107 318 aa, chain - ## HITS:1 COG:jhp1050 KEGG:ns NR:ns ## COG: jhp1050 COG0270 # Protein_GI_number: 15612115 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Helicobacter pylori J99 # 4 314 5 316 321 310 50.0 3e-84 MEKTFFDFCSGIGGGRLGFEKNGFICVGHCEIDEKADKTYQLFYNDGRNYGDLMKVNPKE LPDFNYLIAGFPCQTFSIVGKREGFNDKRGEIIYGLKNILKEKNTPFFIMENVKGLANHE KGKTLKKIIELLEELDYHVEYKILDSLDYGVSQMRERLYIVGIKKELYKKKFNWKFKKSI SKIEDFLIDVDNKELDINNETFQKYLSNKYNTGKYNIDELMNNDYLVLDTRQSDLRIYNG KVPTLRTGRHGILYIKNRKIKKLSGYEALLLQGIPKELAEKAKNSETADGVILSQAGNAM TVPVIYEIVKELKKYGEI Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:08:36 2011 Seq name: gi|261747656|gb|ADAD01000109.1| Leptotrichia goodfellowii F0264 contig00119, whole genome shotgun sequence Length of sequence - 613 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 346 537 ## gi|262038318|ref|ZP_06011703.1| hypothetical protein HMPREF0554_1941 2 1 Op 2 . 
+ CDS 356 - 611 346 ## gi|262038319|ref|ZP_06011704.1| glucagon isoform 2 Predicted protein(s) >gi|261747656|gb|ADAD01000109.1| GENE 1 2 - 346 537 114 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038318|ref|ZP_06011703.1| ## NR: gi|262038318|ref|ZP_06011703.1| hypothetical protein HMPREF0554_1941 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1941 [Leptotrichia goodfellowii F0264] # 1 114 1 114 114 98 100.0 2e-19 YRQQKEASRPAREEKVKEKKVLRKETPEGNVQEVTLDVEEEAVEEVKPKTPMEKLEYNAA KAKDRVDFYERVVRSVEREEKELNGYDEVIGKKKKVKKVVEKKVREPKKVNKKK >gi|261747656|gb|ADAD01000109.1| GENE 2 356 - 611 346 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038319|ref|ZP_06011704.1| ## NR: gi|262038319|ref|ZP_06011704.1| glucagon isoform 2 [Leptotrichia goodfellowii F0264] glucagon isoform 2 [Leptotrichia goodfellowii F0264] # 1 85 1 85 86 72 100.0 1e-11 MKVRKAIGLLAGLIMLMTQISYSDPAVDRLLKEARKRQAEEAKQQEKVEEVTVEETPVTT ETPNSIQKRAAEIKRESQSKKSQEM Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:08:53 2011 Seq name: gi|261747655|gb|ADAD01000110.1| Leptotrichia goodfellowii F0264 contig00180, whole genome shotgun sequence Length of sequence - 1699 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + SSU_RRNA 153 - 1638 100.0 # FJ577259 [D:1..1486] # 16S ribosomal RNA # Leptotrichia goodfellowii # Bacteria; Fusobacteria; Fusobacteriales; Fusobacteriaceae; Leptotrichia. Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:08:54 2011 Seq name: gi|261747652|gb|ADAD01000111.1| Leptotrichia goodfellowii F0264 contig00179, whole genome shotgun sequence Length of sequence - 2009 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 47 - 715 773 ## gi|262038322|ref|ZP_06011705.1| hypothetical protein HMPREF0554_2266 - Prom 896 - 955 11.8 + Prom 730 - 789 8.7 2 2 Tu 1 . + CDS 894 - 1997 790 ## COG4292 Predicted membrane protein Predicted protein(s) >gi|261747652|gb|ADAD01000111.1| GENE 1 47 - 715 773 222 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038322|ref|ZP_06011705.1| ## NR: gi|262038322|ref|ZP_06011705.1| hypothetical protein HMPREF0554_2266 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2266 [Leptotrichia goodfellowii F0264] # 1 222 1 222 222 355 100.0 1e-96 MKRFFLLFSKIALFFSICFLTVNYIDEASRNYVLNLSKKFYVREKVSYTEVGIILCREDY FYNEMYLTNEYGQVDKLKKIGDLWAGHILSYEESDKKIFTYQHDNAGIYFDMFKDRLKKY KGYFIIGEGVEQFGMSLEEVAEYLKVREIEFKDTPTYYVKKFGYKKDLASEALNKKLAAL YGEENIEKTFIQKKKIRMLLGVIGLGLGILGVIVFRGNKGEE >gi|261747652|gb|ADAD01000111.1| GENE 2 894 - 1997 790 367 aa, chain + ## HITS:1 COG:SA0341 KEGG:ns NR:ns ## COG: SA0341 COG4292 # Protein_GI_number: 15926054 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Staphylococcus aureus N315 # 2 364 8 371 377 112 28.0 1e-24 MKKIIPKKVEMIELFYDLIFVYAISKMTGIIHHLHHGEIGIINFLQYLAVCFIVLQIWLY QTNYINHYGKHSIAENILLYINMFATVFMARNINTDWSVTFKPFNMASIVMCMTIIGQFA LQLNKEKEAKSELLGFISTLAIEVVMISIGLLLGYDTGLWIVLLGAAMGILLPFVITKKH LDVERTNFPHLVERMGLIVIITFGEMIINIVSLFDINLFGIIPILVFSFAVILFTTYVIQ MDKMLEHKQFNQGLVLIYSHFWIVISLGIITVSLGFLNEEHVNKNFHLLFILSGLVIFYL SLFINSVYNKEKYKITTKDIIGYSISFLIGALIMFIGKGSNILFLLGLLVISGFILILLN KKYKSVQ Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:09:07 2011 Seq name: gi|261747649|gb|ADAD01000112.1| Leptotrichia goodfellowii F0264 contig00162, whole genome shotgun sequence Length of sequence - 1870 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 12/0.000 - CDS 1 - 1165 1722 ## COG3210 Large exoproteins involved in heme utilization or adhesion - Prom 1217 - 1276 7.6 
- Term 1270 - 1307 -1.0 2 1 Op 2 . - CDS 1435 - 1869 587 ## COG2831 Hemolysin activation/secretion protein Predicted protein(s) >gi|261747649|gb|ADAD01000112.1| GENE 1 1 - 1165 1722 388 aa, chain - ## HITS:1 COG:FN1817 KEGG:ns NR:ns ## COG: FN1817 COG3210 # Protein_GI_number: 19705122 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 4 357 1 332 2806 179 34.0 1e-44 MKEIRKELNRVIAELLLSLFMSFNIFGAGIEVDPNVPQNVNVDRAPNGVPVINISTPTDK GTSVNSFKEFNVDQRGVEILNNTGVGRGYLSGIVNPNPNLRPGQEARTVVFKVTGANRSE IEGYISALSPRPINLFIANENGIYVNGGGFINVNRAALVTGKINIQDGDVVSFTTRDGKV IIGEKGLDISNVERVDIITRTQELTGKIVGQKDVNIILGQNEVNLAGIVTPIITSDNKPA LALNGGALGSIYSNGQVNIISTEKGVGVNLKSSVLSENDIRMKINGNADVKEIISKNAEI QTEDLKTDKINANNLSIRAKDYENRNEITAQNVNISSSNLKNNELTAGNLTLNTGNTESN RISANSVNIKGNNLKSNILEGQNISLAI >gi|261747649|gb|ADAD01000112.1| GENE 2 1435 - 1869 587 144 aa, chain - ## HITS:1 COG:FN1818 KEGG:ns NR:ns ## COG: FN1818 COG2831 # Protein_GI_number: 19705123 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Hemolysin activation/secretion protein # Organism: Fusobacterium nucleatum # 5 144 412 555 555 94 43.0 9e-20 ASFYWYKPIKNAAFRLTAEMQKSKDVNYGSEKISIGGIGSVPGYQYDNISGDIGYSVLAE LSYTFRYEKQRLIPYISYGIGETKNNHDESEYRLGKVKGGSIGLRFSSTYFDIDVSYAKA FSQSEYVKPKNHEVYASMTVKYSF Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:09:13 2011 Seq name: gi|261747631|gb|ADAD01000113.1| Leptotrichia goodfellowii F0264 contig00072, whole genome shotgun sequence Length of sequence - 19727 bp Number of predicted genes - 18, with homology - 17 Number of transcription units - 5, operones - 5 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 72 - 173 105 ## 2 1 Op 2 . + CDS 136 - 969 844 ## Lebu_1404 pseudouridylate synthase-like protein 3 1 Op 3 . 
+ CDS 1039 - 1461 383 ## Lebu_1403 metal dependent phosphohydrolase 4 1 Op 4 1/0.000 + CDS 1498 - 2538 282 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit 5 1 Op 5 1/0.000 + CDS 2540 - 3157 722 ## COG0164 Ribonuclease HII 6 1 Op 6 . + CDS 3179 - 3538 417 ## COG0792 Predicted endonuclease distantly related to archaeal Holliday junction resolvase 7 1 Op 7 . + CDS 3549 - 3749 251 ## Lebu_1394 hypothetical protein + Prom 3755 - 3814 7.1 8 2 Op 1 . + CDS 3883 - 4443 667 ## COG1040 Predicted amidophosphoribosyltransferases 9 2 Op 2 . + CDS 4440 - 6884 2923 ## COG0366 Glycosidases + Term 7069 - 7105 -0.9 10 3 Op 1 . + CDS 7321 - 7503 164 ## gi|262038343|ref|ZP_06011724.1| conserved hypothetical protein 11 3 Op 2 . + CDS 7478 - 12271 7140 ## FN1554 hypothetical protein + Term 12275 - 12324 9.3 + Prom 12288 - 12347 11.3 12 4 Op 1 . + CDS 12396 - 12944 683 ## COG2807 Cyanate permease 13 4 Op 2 . + CDS 13018 - 13563 814 ## COG2807 Cyanate permease 14 4 Op 3 . + CDS 13591 - 13776 131 ## gi|262038344|ref|ZP_06011725.1| conserved hypothetical protein 15 4 Op 4 . + CDS 13785 - 17102 3764 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits 16 4 Op 5 . + CDS 17118 - 17876 219 ## PROTEIN SUPPORTED gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 + Term 18061 - 18107 3.0 17 5 Op 1 32/0.000 + CDS 18222 - 19283 1394 ## COG1135 ABC-type metal ion transport system, ATPase component 18 5 Op 2 . 
+ CDS 19283 - 19727 649 ## COG2011 ABC-type metal ion transport system, permease component Predicted protein(s) >gi|261747631|gb|ADAD01000113.1| GENE 1 72 - 173 105 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNEVDKLIRGFENTNYFGYLFFVEYDGKKFDFF >gi|261747631|gb|ADAD01000113.1| GENE 2 136 - 969 844 277 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1404 NR:ns ## KEGG: Lebu_1404 # Name: not_defined # Def: pseudouridylate synthase-like protein # Organism: L.buccalis # Pathway: not_defined # 11 271 39 339 340 240 57.0 4e-62 MWSMTEKNLISFDENPEGKSLKSEFRKLLEMNGLKIFKGIQQAGRTDKDVSAKENILYIN SKKYIENFHKEIDGLKIRKIVRTVPFLEFPQMVSKRFYIYEYPQNLIENDEEKIKSNCEK LSGKKNFKKYTSKKGEKLKNHIREVFVRFENGKLYFEGDGFLPQQVRIMSNFILNNKMKS LPGEYLTLEKVEMSEELKKLIFKEVDKNMLMNSKEFLSELGNISEKIENIEKNEYFYVFY VSKKNKSEIIGKNGKNIKKMRKIMGNIIIKEKIQDIM >gi|261747631|gb|ADAD01000113.1| GENE 3 1039 - 1461 383 140 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1403 NR:ns ## KEGG: Lebu_1403 # Name: not_defined # Def: metal dependent phosphohydrolase # Organism: L.buccalis # Pathway: not_defined # 4 140 2 138 138 130 56.0 1e-29 MINYLVKKGKEYFFPKINKELSEEAMEYLTVQERKIFQNMSKYDKFHSLEVYKKLKNTGL KDDRLYLKLALLHDCGKGKVLFITRIFHKIGIKTGLRNHAEKGSEKMKNIDKELSILIRN HHKKNCSRKMKIFQKCDDAS >gi|261747631|gb|ADAD01000113.1| GENE 4 1498 - 2538 282 346 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 76 327 40 281 285 113 32 1e-24 MGRKKEKISDRKNTEIEIVEDFSDDFSEQNEEKAMSEEKISVTVSTEKTNKRVDSFLSEK TDLTRSRIQQLIKDGNVSVNNKSVKSSYKVEENDRIEIIIPEAEKVEIIAENIDINIIYE DKDIAVINKKAGMVVHPAHGNYSRTLVNAILYHIKDLSGINGEIRPGIVHRLDKDTSGLI VVAKNDKAHIKLAEMFQKKEVKKTYLAILKGKLNRESGRLETQIGRDADDRKKMSVLKEN DKGKKAITNYNVILSNELFTLVKVYIETGRTHQIRVHMKYLGYPILGDQVYGRKDSEKRQ MLHAYKLEFLHPVTEKPMKFIGEIPEDFKNALKNTKLEVDLSSLEE >gi|261747631|gb|ADAD01000113.1| GENE 5 2540 - 3157 722 205 aa, chain + ## HITS:1 COG:FN1371 KEGG:ns NR:ns ## COG: FN1371 COG0164 # Protein_GI_number: 19704706 # Func_class: L Replication, recombination 
and repair # Function: Ribonuclease HII # Organism: Fusobacterium nucleatum # 13 204 19 213 215 191 57.0 6e-49 MGSLAEFDNSYNKIIVGVDEAGRGPLAGPVVAAAVIIIQYFKELDEINDSKKLSEKKREQ LFELIKEKCIVGTGIADEKEIDEINILNATFSAMRKAINEVKEKSDFDIVLVDGNHKIRK YEGEQEAVIKGDSKSLSIAAASIVAKVTRDKMMKEMSEEFPEYRFDKHKGYATKLHREIL LKKGPCKYHRKTFLSKILGEKENKK >gi|261747631|gb|ADAD01000113.1| GENE 6 3179 - 3538 417 119 aa, chain + ## HITS:1 COG:YPO3549 KEGG:ns NR:ns ## COG: YPO3549 COG0792 # Protein_GI_number: 16123693 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease distantly related to archaeal Holliday junction resolvase # Organism: Yersinia pestis # 4 119 2 114 117 81 33.0 3e-16 MRKSKREVGFEYEEIAKDYLEERKLLFIESNYYTKYGEIDLIFLEKSSETLIFVEVKYRK NNIYGEAVEAVDKRKQEKIIQSSQIYISKNKWKNSVRYDVIGIIGNKLKNDINWIKNAF >gi|261747631|gb|ADAD01000113.1| GENE 7 3549 - 3749 251 66 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1394 NR:ns ## KEGG: Lebu_1394 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 65 1 65 66 67 55.0 2e-10 MKNNDKCLKCGSTSCEIKTTVLPVKKLGEAKITLDTFYLKICQNCGYTETYSAKVIEKVK KPIKNY >gi|261747631|gb|ADAD01000113.1| GENE 8 3883 - 4443 667 186 aa, chain + ## HITS:1 COG:FN1368 KEGG:ns NR:ns ## COG: FN1368 COG1040 # Protein_GI_number: 19704703 # Func_class: R General function prediction only # Function: Predicted amidophosphoribosyltransferases # Organism: Fusobacterium nucleatum # 4 177 37 199 204 97 36.0 2e-20 MKKLRKLNNIYYIWDYNEQFKKLIFSYKYHRKKSLAKLIAEMIKVEFEFVLKKEKIDYVV SVPVNRKRMNERGYNQVDEILKYLNINYIQLKRVKNTQKMHKLLDEKLREDNIKGSFYIG KNTDFKNKRILVIDDIITTGATLREIKKSILETAKENNESINKSKTEITVFCLAAAREIK INKGEI >gi|261747631|gb|ADAD01000113.1| GENE 9 4440 - 6884 2923 814 aa, chain + ## HITS:1 COG:BS_yvdF KEGG:ns NR:ns ## COG: BS_yvdF COG0366 # Protein_GI_number: 16080515 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus subtilis # 60 673 20 537 589 93 22.0 1e-18 
MILSKMKKIILLLILTNAVVSYSVGNSAKENLTKYSQRDDNYIDTESVYHTESPEYRVEE GKNVRIILETKNNDVYSAEIVYGGKTVMMRSIGNYGGKEIFVGEIPNGSTDYYFRLVDNK VKYFYGKNMVTSQNSVQKFHYEKSDSLTNVPDWSKSSVGYQIYIDSFRNGNPDNDPIFNE FGTDDFKAPTGQIRSGTQKKDLVAAQWGTGAYNTEFTVNDWNGNYETKNVWEENALNEVK NYSRYYGGDLQGIKDKLDYLKELGVEYLILSSPFYSLSNHKYDTIYFNHVDPYFGHLEQT GTNKGLDIKGKVHNKNGDKELNLLIYNSKTGKDLLDENMTDPTTWVWTDSDLELASLVKE AHKKGFKVVLEVAPDITSNRFFARMDSRYKDWYLDESDLRLDLSNKNVRDYIENSMKKWV LGPDGTFKNYSDDDGIDGIRYVYYDNKNKKYLINITESLKKYKQDLLISGEFTNKFGEDI TAGVYDSGADYNIINDLTKYMINTNSNYKIGSVEFATKLNEIYNKYSSERFNATQIFADS LDTDRIYSGVINPNRVFDRNNQSNQGYLNIRPDLYDGNAVNKFKEIISVQMMLPASPVIY YGDEKGMWGADSPRNRKPMLWEDYVPYENETDDINKYKTRLRTLPENVQINEVNKTISYP VTINTEIENHYRNLLKIRKDYKELFKNGKFKVLEVYNDPKTKGRIDAEISRYLSEEKRKA KTYQGKDINPQVPNVDFITYEISNKKDSIIVVINNSGDSYPLSLFVPKLFGFYTNQLNTK EKYSISDKKINLIVKPYEVKVLHSKDTNIFDSFK >gi|261747631|gb|ADAD01000113.1| GENE 10 7321 - 7503 164 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038343|ref|ZP_06011724.1| ## NR: gi|262038343|ref|ZP_06011724.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 60 1 60 60 85 100.0 9e-16 MVGTLSFSDTLTSPEVKNTENVINQTKKELNTSISDLHVAFKQAKRENNSSDYDRSYKRL >gi|261747631|gb|ADAD01000113.1| GENE 11 7478 - 12271 7140 1597 aa, chain + ## HITS:1 COG:no KEGG:FN1554 NR:ns ## KEGG: FN1554 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 25 1597 11 1582 1582 1170 49.0 0 MTGAIKDYEGTGPITVSNDDSVNQAATGTIIGYSEGIWKNSEHKMTSASATALEGKSSEL NIGKDVILSARYKDFGNNKVSRPIGYLAVNKGKINATGNTDAKGYGSVLAYADNSGEIET KAIKAIDEWANVTDRNLLYTNIGAVATNSAKVTINGNAEINGMGAMAIGNGSLVKLNGSN NIINTGTNGALAAKNGGGVNFAGGKIVHKNNEAGDHKSSTPFYADNSSSKVVFTGSTTID MYDGILIPGTEADYEGATPLATAKYKGMNNVTVNLKKDNVVISINDHHAPVTWTSTSGTS SDIKDKMKLAAFNTNGHRYKIYYTNSEFNLDTDVNLNSNSTDAFNNVGLANEKFTILNGR TVSSIDGTGLAMGSLATATDNTTNQYINKGIVNITGGNFATKGSPALSIAYGTINNENEI IVDSGIGAYGINGSSLHNNTNGKISITTEGVGLAAFASAKNSLLPYGTDKKINDGTLAPT 
EKVLEVINDGTINVAGDKSIGIYAETNEVASHSGVPYPLNSRQGLIKNNNKIVMTGDKAV GILSKGNTVELNGTGSSNITVGTNGIGVYAENSPTTLLSDYGFEVKDNGTGIFLKNSFLI PSGKTVEIKYTGSTTGTGVGLFYNAGGTNTTDVKLVNAVNSTGGVVGLYASNGGVLTNNA SVSGDNGYGIIIEGTDIVNNGTVTLNNPIPGNKSSVGLYTRGINDITNTGDITVGGKSVG IYGKSNINNTGNIKVGDSGTGIYSENGNVNLTAGSITVGNQQATVLYTKGIGSIIELNGT SLDIGNGSYGIINDGTGNTVNSRNTAQTVKNDTVYMYSKDATGNINNYTKLTSTGNVNYG LYGAGEINNYADIDFKTGYGNVGIYSSIDINTANTSHTKAVNHKGYKIEVGDSYIDPNGH SENNKYALGMAVGYAHSAQDKIDLNSSNPAVRNRVPRETVGFAVNNGIINVNGEYSVGMY GAHAGSRVENHRTINLNKSNTIGMFLDTGAYGYNYGTIQSNGTGLKNIVGIVVRNGATIE NHGTIQINASNARGFISKRDEGGQNPGIIKNYGNIIITGPGAKDDEHQAKEDLGKSMGGI KIDAPAGSETAVITVTPPTPGVVVQPELVSKIQGERLPEVSTIGMYIDTITPTAPIAGMN NLSGLTNVDLIIGTEAAEHTNSKYILVDPSLLKPYNDAILANPQIERWNIYSGSLTWMAS IAQDQSNNSIKNAYLVKVPYTQWAGKEATPVEVTDTYNFLNGLEQRYNMNELSSREKALF NKLNKIGNNEEILLFQATDEMMGHQYATVQQRMVSTGKALDKEFTHLRKEWDNKSKQSNK IKVFGMRDEYKTDTAGIIDYTSNAYGVAYVHEYETIKLGNSSGWYAGVVNNTFKFKDIGR SKENTTMLKLGVFKSTAFDHNGSLNWTISGEGYIARSDMHRKFLIVDEIFNAKSSYNSYG AAIKNEISKEFRTSERTSIKPYGSLKLEYGRFNTIKEKTGEVRLEVKGNDYYSIKPEVGI EFKYKQPMAVRTTFTTTLGLGYENEFGKVGDVKNKGRVAYTDADWFNIRGEKDDRRGNFK ADLNLGIENQRFGVTLNAGYDTKGKNVRGGLGFRVIY >gi|261747631|gb|ADAD01000113.1| GENE 12 12396 - 12944 683 182 aa, chain + ## HITS:1 COG:lin0946 KEGG:ns NR:ns ## COG: lin0946 COG2807 # Protein_GI_number: 16800015 # Func_class: P Inorganic ion transport and metabolism # Function: Cyanate permease # Organism: Listeria innocua # 12 178 23 190 405 80 33.0 2e-15 MKKFNYSYALLIFIVAFNLRLGISSVPPIQLIIQESLKLSNLEVSLLTGMPVICMGIFAF LVGKVQEKYGMRKSIFALLLVLGIGTLARGFVNDYVFLLITTFCIGFSIAIIGPLLSGFI KKEFPDHGSILIGIYSSAMGIGSLVVSNTTKGITDLWNWQWGLAVWGIISIAAGIIWIIF FS >gi|261747631|gb|ADAD01000113.1| GENE 13 13018 - 13563 814 181 aa, chain + ## HITS:1 COG:DR0260 KEGG:ns NR:ns ## COG: DR0260 COG2807 # Protein_GI_number: 15805292 # Func_class: P Inorganic ion transport and metabolism # Function: Cyanate permease # Organism: Deinococcus radiodurans # 1 175 210 382 382 79 26.0 3e-15 
MFFGVQSGIFYGFSAWLTRFLKDRGIKEEYTISLLTFYVAMQMAFGFIIPVLMHKIGSAR KWGTFSAGCMALGSLMPVLFETNIITAIIIITLMSIGLGGSFPIAMILPLEYSENSEEAG VITGVVQAIGYVLGGIMPLVFGYVVDKTKNYDNLFYQMVIGSVILVIIGMSKIHRKNAVN Q >gi|261747631|gb|ADAD01000113.1| GENE 14 13591 - 13776 131 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038344|ref|ZP_06011725.1| ## NR: gi|262038344|ref|ZP_06011725.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 61 1 61 61 91 100.0 2e-17 MNINFLNLLSEKLNVGSMRSIYLSAMPGRYRARMDLSELDKINSNFSKVFLKIYSAKILL M >gi|261747631|gb|ADAD01000113.1| GENE 15 13785 - 17102 3764 1105 aa, chain + ## HITS:1 COG:PA2728 KEGG:ns NR:ns ## COG: PA2728 COG1112 # Protein_GI_number: 15597924 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Pseudomonas aeruginosa # 651 1055 381 802 886 212 31.0 3e-54 MTNDEKNETVERKIQYLFNENMNMFREEGISSFALGYPILVKQNNKTKSVMKSPLFIWKL DIKRSKTDMNEFIISREENSEVEINKVLLIQLLSDDKISLSDVYSDEIEENEKSVLTFEE IKKIIEKINEKLKIDFSEEFKLEKFPEKAEEIDKKGIEKPFIFYGGILGLFKRQNEGIIQ DFNDLRENFNKFKFNVSDRDDFQMTNKNSSVSTDPSQQNVVETLTNTQYKIIQGPPGTGK SQTLTAIITNALENGANILVVCEKKTALDVIYEKLAELGLGDLTAMLNDPAADRKAIVKR VRDLEEKFPKTEQYDEQKYEYTVNSYKDLKNKYNEHQNLLKKPLKSDDSSSINDTAIKYL EGKNYSDYYYVDTTKNDINSIYKILDRIDNLLSKIGNISNIQKFENIYNEKLGQISSYEI FFSETSDTLENLKKLISIVEQNLEKYGDEFKNSYGANKLKVSVFSIFNAKIREIKNAWNG IYSDSQKVRNYNGKYINKRFSEHDYGVYLKELKEFSDMLNEIINNRKEFISFIDENKNDS ISKEDEAFINNFIKYSEEKGINDKPEFLTTSYYYSLLQNSDLKNQRYKDFGINIPKLIEN DKYITENQRYRIKKYWNDIRKRALSNMGTGENIKMLYNLRKNAKYGKINTLREIVATNVN FFKDMFPIIMTDPNTCSAVFPLQEGIFDLVVFDEASQLKLEEVLPSMMRGKYKIVSGDIH QMPPSNYFGTETEQTGLTENKEIDEETLFLSDSESLLDYVNNLKDDVIMSYLDFHYRSKH PKLINFSNAAFYESRLVPMPEKKEYIPIEYFKVDGIYKARKNDDEADKIVEYIFGEKVLI DGKLPSVGIVTLNLEQKNNIVNKINNYLRKNDNEETKNRYNQLLENNMFVKNLENVQGDE RDIMLLSTTFGKSEEGKFIQNFGPLNNQDKGYKLLNVLITRAKYRFSVFTSIPDENINSW ENEVIKNGNNGRGIFYAYLAYAKAVSENDREAESRILEVLSGDKMKNTNILYEKIEDTDK 
KIIEIIKNEIKVGENDEIVKNYKIGGFTLEYAVKDKNEEKAKILIDLNSVNNFSGNTAYK SIIYRKNMFENMGYEYKLLDMASYI >gi|261747631|gb|ADAD01000113.1| GENE 16 17118 - 17876 219 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 [alpha proteobacterium BAL199] # 1 210 3 217 259 89 30 2e-17 MKRNIFITGASSGIGKAIAYAFGKNGDNLVLCARRLEKLEEIKKDIESKYDVKVDIYKLD VTVYEEVVSVVKEAVNRNIHIDVLINNAGLALGLDKFQDYSISDMEIMLNTNVKGLLYVT REIIPNMISQNSGHIINIGSTAGMYSYANGAVYCATKVAVRFLSDGIRIDTIDKNIKVTT IQPGIVETDFSEVRFYGDKEKAGNVYKGVEALKPEDIADIVLYSANQPKHVQISDITIMA TKQATGFNVHRE >gi|261747631|gb|ADAD01000113.1| GENE 17 18222 - 19283 1394 353 aa, chain + ## HITS:1 COG:FN0660 KEGG:ns NR:ns ## COG: FN0660 COG1135 # Protein_GI_number: 19703995 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, ATPase component # Organism: Fusobacterium nucleatum # 5 353 2 331 335 317 51.0 3e-86 MEKFITLNNLVKIYKTSEGKELKAVDGVNLEIEKGDVYGIMGLSGAGKSTLIRLINRLEE PTSGEILVNHIVKDKKSIEKKDIMYFQQDELREYRKKTGMIFQHFNLLNSRNVAGNIAFP LEISGWGKNEINARVDELLEIVGLTSKKYNYPEQLSGGQKQRVAIARALANNPEILLSDE ATSALDPRTTNSILDLLKDINKKFGITIILITHQMEVIRKICNKAAIMSDGKIIEEGKTR DIFLNPKSSLAKEFVANISHNDDFEVEEKNGGKEKNEGRIKLKLKFTEEQVDKPYIAEII KKYDAEISILGGSIDKLSDTVVGHLTVEIIAEKEKIDVILKWLEMNNVELEVL >gi|261747631|gb|ADAD01000113.1| GENE 18 19283 - 19727 649 148 aa, chain + ## HITS:1 COG:FN0659 KEGG:ns NR:ns ## COG: FN0659 COG2011 # Protein_GI_number: 19703994 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, permease component # Organism: Fusobacterium nucleatum # 20 147 28 154 233 129 60.0 2e-30 MRFNWVKFLQFNNMLIPLWETIYMVAIATAVSLLIGLPIGVLLVVSDSKGVKPNKTLHKI LDMILVNITRSIPFVILMVLLIPLSRMIVHKSYGSIAFIVPLSLGAAPFVARIIEGALKE VDEGLIEASKSMGATTWEIIIKVMIPEA Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:09:59 2011 Seq name: gi|261747629|gb|ADAD01000114.1| Leptotrichia goodfellowii F0264 contig00240, whole genome shotgun sequence Length of 
sequence - 573 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 572 863 ## COG3210 Large exoproteins involved in heme utilization or adhesion Predicted protein(s) >gi|261747629|gb|ADAD01000114.1| GENE 1 2 - 572 863 190 aa, chain + ## HITS:1 COG:FN1817 KEGG:ns NR:ns ## COG: FN1817 COG3210 # Protein_GI_number: 19705122 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 1 190 2188 2385 2806 105 36.0 4e-23 ITRHTVIGNVEIGSTTGSPINRDRSKANETTRDDHSSTNVYIEGQTIDYATNPGKFKEDL GKSKDEINDIGRAIKESLNDRGDDNRNFLGQLSEGRLQRTIENIGGERLRKSTTQEDISK TLKDTYKDLGYDIEIIFTTPDKAPQLIDEKGKIKAGTAYVGEDGKHTIIINTEAKENQTR AGLIGTITEE Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:10:00 2011 Seq name: gi|261747626|gb|ADAD01000115.1| Leptotrichia goodfellowii F0264 contig00074, whole genome shotgun sequence Length of sequence - 1497 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 623 638 ## gi|262038349|ref|ZP_06011728.1| putative liporotein 2 1 Op 2 . 
- CDS 644 - 1495 848 ## gi|262038348|ref|ZP_06011727.1| hypothetical protein HMPREF0554_1203 Predicted protein(s) >gi|261747626|gb|ADAD01000115.1| GENE 1 2 - 623 638 207 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038349|ref|ZP_06011728.1| ## NR: gi|262038349|ref|ZP_06011728.1| putative liporotein [Leptotrichia goodfellowii F0264] putative liporotein [Leptotrichia goodfellowii F0264] # 1 207 8 214 214 311 100.0 2e-83 MMLSLSLSCSKPKIVYNDYSFIEKIDRKFFTKGETVSPENRLELYIIERENSKLVKKTLV NREKEDKDIKDRHSFLLNYNNDKLYLLFYKNVDIGAESRKAKIFYYDKDFNLKPFNDYEY VFDNIDEYKNINKGGKKVKNVDTINYVIENNVYYNISNEGGFKNNERIIKEENYIFDFNK QKKELYLSPNREKEESIYIYNFKNGKI >gi|261747626|gb|ADAD01000115.1| GENE 2 644 - 1495 848 283 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038348|ref|ZP_06011727.1| ## NR: gi|262038348|ref|ZP_06011727.1| hypothetical protein HMPREF0554_1203 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1203 [Leptotrichia goodfellowii F0264] # 1 283 1 283 283 512 100.0 1e-143 DTEDYSTFYGRQLEKYYRKHFGDVDVAFVTEQLKYTKEQLGYDWENDAYWIESTHPGGHG ALMIVNEKYPDGRVYSWSAIGPDHMHYGIEFKGIKNKLGIPYPKFEKINYGALYITNGKN YLLGKYGRHSGKIHKIEGSKEYDNKIVDFYVKFIEGKEKKEITEFNSKSKNEGIGSYYIY KGSERPYRLGAYNCVTVGLSSIYYANGLSLDEFRKNRKIGEKVTEFHTYFNPRSAVRTPV LKGKTYVIVEYKGDNKNGKHTKERENEIDRIMDKLRKDSGVKK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:10:26 2011 Seq name: gi|261747621|gb|ADAD01000116.1| Leptotrichia goodfellowii F0264 contig00182, whole genome shotgun sequence Length of sequence - 2285 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 17 - 442 470 ## gi|262038352|ref|ZP_06011730.1| conserved hypothetical protein 2 1 Op 2 . - CDS 462 - 1259 833 ## PM0490 hypothetical protein - Prom 1298 - 1357 1.7 - Term 1380 - 1413 1.3 3 2 Op 1 . - CDS 1482 - 1859 595 ## gi|262038351|ref|ZP_06011729.1| conserved hypothetical protein 4 2 Op 2 . 
- CDS 1879 - 2283 456 ## PM0488 hypothetical protein Predicted protein(s) >gi|261747621|gb|ADAD01000116.1| GENE 1 17 - 442 470 141 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038352|ref|ZP_06011730.1| ## NR: gi|262038352|ref|ZP_06011730.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 141 1 141 141 228 100.0 1e-58 MKLVFKYKHYDNMYSKGYTKICISYKEINEVSIVTEKKFLDEIFLSSYLDDLDLRMSKWV LEKLKNKNINNDDISSQGWEAFIENDQVLITYIFNNEKDPVAIINREEIIYALEKWKKFL EKEITDPSYKEIIDTDDLYKK >gi|261747621|gb|ADAD01000116.1| GENE 2 462 - 1259 833 265 aa, chain - ## HITS:1 COG:no KEGG:PM0490 NR:ns ## KEGG: PM0490 # Name: not_defined # Def: hypothetical protein # Organism: P.multocida # Pathway: not_defined # 157 248 4 99 116 71 43.0 4e-11 MRDVTNSVKEDKSGGITYTIKVIPKEDSKVNPKTIENISETAIEQSLKRVENNQSESFKN NNEVEQVVVNKHFKEEQKEKSYRGDSSGSESQNNNNSKEKKGINKELHEKINNRGKEEFA NRQADKRVGEATTNYNNFEHITEGEIGVDRRSGNIRVIGGHKAGNRVRVIEKLEEYSGGS YEARIEVQDSNNPNNYIPKTNNDGISTMFPDHWTENRIKVEINNILKNPQNRQTRYKWEG TSSSGVKIRIFLKDGKVTTAYPIKP >gi|261747621|gb|ADAD01000116.1| GENE 3 1482 - 1859 595 125 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038351|ref|ZP_06011729.1| ## NR: gi|262038351|ref|ZP_06011729.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 125 1 125 125 215 100.0 8e-55 MKLIFKYVKFLSGYGPRSVSFNDKHDEELLAEYLTNTNLELAVHELKFLGDKDVKERAIG SEAWDVDILGDIVSIELSNTSYPDKVYIDREVVTYAMSKWKEFLEKEIKDYSYEEIIDTD DIYKK >gi|261747621|gb|ADAD01000116.1| GENE 4 1879 - 2283 456 134 aa, chain - ## HITS:1 COG:no KEGG:PM0488 NR:ns ## KEGG: PM0488 # Name: not_defined # Def: hypothetical protein # Organism: P.multocida # Pathway: not_defined # 31 133 6 113 115 97 50.0 1e-19 TRVNNVEAKFENFEGHIINAEMKNGKVKGGHTTLGNVKVKQVTKRYPSGVYEAEIEIQDP KNPNNFIPKSNNSGKSTMFPEHWTADRIKVEVDIAFKNKTMTGTYMWQGKTPSGVQIEGY IDKNGNITSVYPIR Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:11:03 
2011 Seq name: gi|261747573|gb|ADAD01000117.1| Leptotrichia goodfellowii F0264 contig00061, whole genome shotgun sequence Length of sequence - 42790 bp Number of predicted genes - 50, with homology - 47 Number of transcription units - 19, operones - 10 average op.length - 4.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 16/0.000 + CDS 2 - 715 1131 ## COG1269 Archaeal/vacuolar-type H+-ATPase subunit I 2 1 Op 2 11/0.000 + CDS 772 - 1275 927 ## COG0636 F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K 3 1 Op 3 11/0.000 + CDS 1346 - 1900 872 ## COG1390 Archaeal/vacuolar-type H+-ATPase subunit E 4 1 Op 4 13/0.000 + CDS 1922 - 2923 1351 ## COG1527 Archaeal/vacuolar-type H+-ATPase subunit C 5 1 Op 5 12/0.000 + CDS 2916 - 3224 586 ## COG1436 Archaeal/vacuolar-type H+-ATPase subunit F 6 1 Op 6 16/0.000 + CDS 3273 - 5045 2571 ## COG1155 Archaeal/vacuolar-type H+-ATPase subunit A 7 1 Op 7 16/0.000 + CDS 5038 - 6417 2137 ## COG1156 Archaeal/vacuolar-type H+-ATPase subunit B 8 1 Op 8 . + CDS 6420 - 7064 859 ## COG1394 Archaeal/vacuolar-type H+-ATPase subunit D + Prom 7080 - 7139 10.5 9 2 Tu 1 . + CDS 7288 - 8100 1362 ## COG0561 Predicted hydrolases of the HAD superfamily + Term 8220 - 8263 1.2 - Term 8130 - 8183 -0.9 10 3 Tu 1 . - CDS 8253 - 9680 1557 ## COG4690 Dipeptidase - Prom 9718 - 9777 10.4 + Prom 9872 - 9931 10.5 11 4 Tu 1 . + CDS 9965 - 10069 102 ## + Term 10112 - 10160 -0.6 + Prom 10161 - 10220 10.8 12 5 Op 1 38/0.000 + CDS 10248 - 11879 2482 ## COG0747 ABC-type dipeptide transport system, periplasmic component 13 5 Op 2 49/0.000 + CDS 11908 - 12840 1252 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 14 5 Op 3 44/0.000 + CDS 12850 - 13662 986 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 15 5 Op 4 . + CDS 13650 - 14441 399 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 16 5 Op 5 . 
+ CDS 14438 - 15211 344 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 - Term 15213 - 15249 0.5 17 6 Tu 1 . - CDS 15307 - 15849 424 ## Halsa_1787 GCN5-related N-acetyltransferase - Prom 15896 - 15955 17.1 + Prom 15852 - 15911 13.3 18 7 Op 1 . + CDS 15968 - 17170 1539 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 19 7 Op 2 . + CDS 17198 - 17395 299 ## gi|262038361|ref|ZP_06011738.1| conserved hypothetical protein 20 7 Op 3 . + CDS 17407 - 17631 212 ## gi|262038358|ref|ZP_06011735.1| conserved hypothetical protein 21 7 Op 4 . + CDS 17658 - 19058 1612 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase + Prom 19082 - 19141 10.3 22 8 Op 1 . + CDS 19165 - 19395 264 ## gi|262038401|ref|ZP_06011778.1| hypothetical protein HMPREF0554_0873 23 8 Op 2 . + CDS 19385 - 19840 620 ## gi|262038371|ref|ZP_06011748.1| hypothetical protein HMPREF0554_0874 + Term 19850 - 19891 -0.8 - Term 19909 - 19950 -0.3 24 9 Tu 1 . - CDS 19952 - 20518 511 ## Lebu_1264 hypothetical protein - Prom 20633 - 20692 10.1 + Prom 20572 - 20631 10.4 25 10 Op 1 13/0.000 + CDS 20664 - 21398 980 ## COG0826 Collagenase and related proteases 26 10 Op 2 . + CDS 21496 - 22839 1444 ## COG0826 Collagenase and related proteases 27 10 Op 3 . + CDS 22860 - 23507 636 ## Lebu_1011 hypothetical protein 28 10 Op 4 . + CDS 23549 - 23962 378 ## Lebu_1011 hypothetical protein + Term 23970 - 24007 2.3 + Prom 24248 - 24307 8.5 29 11 Op 1 . + CDS 24334 - 24438 129 ## + Prom 24457 - 24516 1.8 30 11 Op 2 . + CDS 24537 - 24959 239 ## Dhaf_4658 Abi family protein + Prom 24961 - 25020 7.9 31 12 Tu 1 . + CDS 25164 - 25292 137 ## - Term 25187 - 25246 1.6 32 13 Tu 1 . - CDS 25370 - 25732 507 ## Lebu_1804 hypothetical protein - Prom 25765 - 25824 10.1 + Prom 25778 - 25837 8.8 33 14 Tu 1 . + CDS 25864 - 26847 1183 ## Lebu_1817 DNA polymerase III delta + Prom 26900 - 26959 8.8 34 15 Op 1 . 
+ CDS 27036 - 27155 154 ## PROTEIN SUPPORTED gi|229211279|ref|ZP_04337673.1| SSU ribosomal protein S20P 35 15 Op 2 . + CDS 27133 - 27303 234 ## PROTEIN SUPPORTED gi|229211279|ref|ZP_04337673.1| SSU ribosomal protein S20P + Term 27346 - 27382 5.0 - Term 28109 - 28144 0.0 36 16 Tu 1 . - CDS 28170 - 29546 1383 ## COG0534 Na+-driven multidrug efflux pump - Prom 29580 - 29639 17.1 + Prom 30124 - 30183 12.4 37 17 Op 1 1/0.000 + CDS 30205 - 31512 2193 ## COG0019 Diaminopimelate decarboxylase 38 17 Op 2 . + CDS 31584 - 32786 1731 ## COG0527 Aspartokinases 39 17 Op 3 1/0.000 + CDS 32834 - 33679 1305 ## COG0253 Diaminopimelate epimerase 40 17 Op 4 3/0.000 + CDS 33700 - 34584 1301 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 41 17 Op 5 . + CDS 34663 - 35661 1684 ## COG0136 Aspartate-semialdehyde dehydrogenase 42 17 Op 6 16/0.000 + CDS 35662 - 37233 1820 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain 43 17 Op 7 . + CDS 37233 - 37979 529 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 44 17 Op 8 . + CDS 38050 - 38820 1101 ## Lebu_0706 hypothetical protein + Prom 38823 - 38882 8.6 45 18 Op 1 59/0.000 + CDS 38938 - 39372 658 ## PROTEIN SUPPORTED gi|229211385|ref|ZP_04337778.1| LSU ribosomal protein L13P 46 18 Op 2 . + CDS 39385 - 39783 621 ## PROTEIN SUPPORTED gi|229211386|ref|ZP_04337779.1| SSU ribosomal protein S9P + Term 39798 - 39841 8.0 47 19 Op 1 . - CDS 39909 - 40937 974 ## COG0582 Integrase 48 19 Op 2 . - CDS 40918 - 41157 272 ## gi|262038390|ref|ZP_06011767.1| putative UDP-3-O- 49 19 Op 3 . - CDS 41175 - 41930 951 ## Sterm_1235 phage recombination protein Bet 50 19 Op 4 . 
- CDS 41930 - 42790 781 ## gi|262038387|ref|ZP_06011764.1| putative DNA double-strand break repair Rad50 ATPase Predicted protein(s) >gi|261747573|gb|ADAD01000117.1| GENE 1 2 - 715 1131 237 aa, chain + ## HITS:1 COG:FN1741 KEGG:ns NR:ns ## COG: FN1741 COG1269 # Protein_GI_number: 19705062 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit I # Organism: Fusobacterium nucleatum # 1 227 410 637 638 215 53.0 5e-56 SFSVIIWGFLYGSYFGAELPGMWRLINPANEFQKLLIGSMIFGLIHLFFGLAIKAYLLFK AKKPLDALYDVGFWYMALAGGIVYLVFSLMKLSPVVTNISMWIMIVGMIGIVLTGGREAK SIGGKLGGGLYSLYGISGYVGDFVSYSRLMALGLSGGFIAQSMNMIAQMLGGSIYGLVFV PVILVVGHLFNLFLAFLGAYVHTSRLIYVEFFGKFYEGGGKAFKNFRTESKYINLED >gi|261747573|gb|ADAD01000117.1| GENE 2 772 - 1275 927 167 aa, chain + ## HITS:1 COG:FN1740 KEGG:ns NR:ns ## COG: FN1740 COG0636 # Protein_GI_number: 19705061 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K # Organism: Fusobacterium nucleatum # 9 166 3 160 160 160 62.0 9e-40 MNSSTLLVNLSTLFGEQSGLIFAALGAAIAVFLSGVGSAKGVGIVGEVAAGLMAEEPEKF GKSLVLQLLPGTQGLYGFVIGLMVLGRLKPDMSIGEGLYIFMACLPIAVAGYGSAIAQGR VAASGISLLAKNEEQSTKGIIYAVMVETYALLAFVVSIMLLSGVSFK >gi|261747573|gb|ADAD01000117.1| GENE 3 1346 - 1900 872 184 aa, chain + ## HITS:1 COG:FN1739 KEGG:ns NR:ns ## COG: FN1739 COG1390 # Protein_GI_number: 19705060 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit E # Organism: Fusobacterium nucleatum # 1 183 1 182 183 62 31.0 4e-10 MSNLDNLTSKIINDAKEKAAEIEQKAKETAEEKYKSGMKKAEEKKERILETGKRERELLS ERMKSGANLKARNDKLKAKQDAIDKVILRLKEKLVNMSEREYLDYISKNIDVSSFNQNKK LIVKKEYVNKVKERFPNIKVEENEFVNSGFIIEENGIQENYTFEVKLDFMRDELEVEISK LLFS >gi|261747573|gb|ADAD01000117.1| GENE 4 1922 - 2923 1351 333 aa, chain + ## HITS:1 COG:FN1738 KEGG:ns NR:ns ## COG: FN1738 COG1527 # Protein_GI_number: 19705059 # Func_class: C Energy production and conversion # Function: 
Archaeal/vacuolar-type H+-ATPase subunit C # Organism: Fusobacterium nucleatum # 1 332 2 333 334 169 31.0 6e-42 MNRMDYGRSVVTVRVLEKRLLTKNKLERMIEAETPDEVLKLLGETEYSQNMTDIDNSRDY EKILKRETERVFSLVRDMSKDSEIVDILSLKYDYHNLKVLLRGRMAGKDFSNLLIDAGTV KAARMKSDFETKSYDALPKEFVSVISEAEKDFSENKDPQRIDIIADKYYYSHLLRIANSI DTEILRDYVQGLIDFQNIITLLRVKKQNRDFKFLENIIHDGGKISREKIISSLNDSPEGI ANKFKKEKIGVYLTKGMEVFSENGRLSELEKIADNYLLELNKESKYIVFGPEPLFTYLVA KEREINAVRMIMVSKINNINSEKVKERLRDTYA >gi|261747573|gb|ADAD01000117.1| GENE 5 2916 - 3224 586 102 aa, chain + ## HITS:1 COG:FN1737 KEGG:ns NR:ns ## COG: FN1737 COG1436 # Protein_GI_number: 19705058 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit F # Organism: Fusobacterium nucleatum # 1 101 4 104 105 118 60.0 3e-27 MHKIGVVGDKDSILAFRVLGVDVYPAVGKEEARKIIDKLASDKYGIIFVTEQIAQLVEET IERYNRDLIPAIILIPNNQGSLGTGIQKINDYVEKAIGSNIF >gi|261747573|gb|ADAD01000117.1| GENE 6 3273 - 5045 2571 590 aa, chain + ## HITS:1 COG:SPy0154 KEGG:ns NR:ns ## COG: SPy0154 COG1155 # Protein_GI_number: 15674362 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit A # Organism: Streptococcus pyogenes M1 GAS # 1 587 1 591 591 784 66.0 0 MKVGKIIKVSGPLVVAEGMDEANVYDVVRVSDNKLIGEIIEMRGDKASIQVYEETVGIGP GEPVYSTGEPLSVELGPGLLEAMFDGIQRPLKEFQEVAGDYLNKGVEVPSLNREKKWNFE HVMKVGDKVEAGDIIGTVQETSVISHKIMIPLGISGKIKSIEKGSFTVTDTVAVIETDKG DVNIQMMQKWPVRRGRKYKQKLNPEAPLITGQRVIDTFFPVTKGGTACVPGPFGSGKTVV QHQLAKWADAQIIVYVGCGERGNEMTDVLMEFPEIIDPNTGQSLMKRTVLIANTSNMPVA AREASIYTGITIAEYFRDMGYSVAIMADSTSRWAEALREMSGRLEEMPGDEGYPAYLGSR AADFYERSGKVVCLGADNREGALTVIGAVSPPGGDISEPVSQATLRIVKVFWGLDANLAY RRHFPAINWLNSYSLYQTKVDGWMDKNIGPEFSQNRQRAMALLQEESSLQEIVRLVGRDT LSEKDQLKLEVAKSIREDYLQQNAFMESDTYTSLAKQDKMLALVLKFYDEGLRGLDSGVY LKEISAMPVREKIARAKYLPEEELNKIDDISEEITKEIDNLISEGGISNA >gi|261747573|gb|ADAD01000117.1| GENE 7 5038 - 6417 2137 459 aa, chain + ## HITS:1 COG:FN1734 KEGG:ns NR:ns ## COG: FN1734 COG1156 # Protein_GI_number: 19705055 # 
Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit B # Organism: Fusobacterium nucleatum # 1 458 1 458 458 761 81.0 0 MLKEYKTITEIVGPLMIVEGVEGVKYEELVEIETQTGELRRGRVLEVNGDKAMVQLFESS AGINLKDTKVRFLGKPLSLGVSEDMIGRIFDGLGRPKDNGPKIIPEKSLDINGVAINPVA RDYPSEFIQTGVSSIDGLNTLVRGQKLPIFSGSGLPHAELAAQIARQARVLGSGSKFAVV FGAIGITFEEAQFFIDDFTKTGAIDRAVLFMNLADDPAIERIATPRMALTCAEYLAFEKG MHVLVILTDLTNYCEALREVSAARKEVPGRRGYPGYLYTDLSTIYERAGKIKGKEGSITQ IPILTMPEDDKTHPIPDLTGYITEGQIILSRDLYKQNLMPPVDVLPSLSRLKDKGIGKNK TREDHADTMNQLFAAYATGKEAKELAVILGESALSDTDKAFVKFTTAFEDQYVAQGFEKN RTIEETLNLGWELLKILPRTELKRIRDEYLEKYLPSGDE >gi|261747573|gb|ADAD01000117.1| GENE 8 6420 - 7064 859 214 aa, chain + ## HITS:1 COG:FN1733 KEGG:ns NR:ns ## COG: FN1733 COG1394 # Protein_GI_number: 19705054 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit D # Organism: Fusobacterium nucleatum # 1 211 1 209 211 228 67.0 5e-60 MAKLNINPTRMELTKLKIRLKTAVRGHKLLKDKQDELMRQFIDMIKMNKKLREEVEGKIQ DSFKDFLLARGVMSNEMLENAIIYSNEEIGVEIKRKNIMSVHVPQMNFIKNSEGKSASIY PYGYAQTSADLDDAIDGLSKVMDKLLELAEVEKACQLMADEVEKTRRRVNALEYMTIPQL QETIRFIQMKLDENERGSITRLMKVKDMMAKKEA >gi|261747573|gb|ADAD01000117.1| GENE 9 7288 - 8100 1362 270 aa, chain + ## HITS:1 COG:SP0923 KEGG:ns NR:ns ## COG: SP0923 COG0561 # Protein_GI_number: 15900803 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Streptococcus pneumoniae TIGR4 # 1 267 1 269 269 269 51.0 3e-72 MIKLIALDMDGTLLNNDKHIAEAQKKAIKKAAQAGIKIVLCTGRPLFGVEPLYKELDFDE EEYVILNNGCEIRETKNWSLVDSHTLTKDEIVDLYNFGKGYNLDFTLFDEKHYFCVGKPN EYTVRDGNFVYVPITEIDLDEAISGKYTIFKGMYVGDPEEVDRFQKNIPENVKKLYEFVR SQVSILEAMPSGVNKGTALKDFARRLGIDKSEVMALGDGNNDIEMLEYADFGIAMSNGTE AAKKAAKYVTDTNENDGVAKAIYKYVFNEN >gi|261747573|gb|ADAD01000117.1| GENE 10 8253 - 9680 1557 475 aa, chain - ## HITS:1 COG:SPy2066 KEGG:ns NR:ns ## COG: SPy2066 COG4690 # Protein_GI_number: 15675831 # Func_class: E Amino acid 
transport and metabolism # Function: Dipeptidase # Organism: Streptococcus pyogenes M1 GAS # 1 475 1 488 498 486 51.0 1e-137 MKSKKTLFFLILIVLMTQISVYIFACTGIIVGKNLTTDGSFIFGRNEDLEPDHNKTFVVH ERKKNKPGDVFKDESNGFIYKLPAESFKYTALPDVTPKDGIFDEAGFNEYGVIIDATVSA KANEKIQKIDPYVENGLAESALTSVVLPYVKTAREGVMHLAQIIKTQGAAEGNTLVIADK NELWYIEIYSGHQFVAIKYPDDKFSVFPNTFFLGTVDLNDKKNVVASEDIENVAKKAGTY IAENGKIHLTKSYAPPFEDRDRSRQFSGILSLNPDAKITYNDERYDFLQSANKKISLQDV MKTLRNRFEGLGYKPELSKADKGKDGYKYPIGNINTMESHIFQIKGTLPNEIGGVMWLTM AAPKFSPYVPYYGNITATDSSYHNTDTKYNENSFYWVADNVNDILDKSDSKLQNNFLSKI KSYENKKIAEQAQLDKKIAKMSAKEAQKFANELALKNGKEAFDMVKSQEKLLKNK >gi|261747573|gb|ADAD01000117.1| GENE 11 9965 - 10069 102 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTDLKKSRGREAGGIPGLSRNREKIFYFKSDNYP >gi|261747573|gb|ADAD01000117.1| GENE 12 10248 - 11879 2482 543 aa, chain + ## HITS:1 COG:FN1504 KEGG:ns NR:ns ## COG: FN1504 COG0747 # Protein_GI_number: 19704836 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 12 541 3 521 522 610 58.0 1e-174 MKKFLKQIGILLMTEIILGCQSKPNSDSKEGQKANKENKVTLAFNQDPVGNLNPHEYLPS QFITQDMVYDGLVSYGENGEIKPMLAESWTISEDGKTYTFKLRPNVKFSDGSAFDAKNVV KNFDTIFSKENKSKHSWFAFTNHLKSYRAVDDLTFEIVLDTPYTATLYDLAMIRLIRFLG DAGFPEGGDTMKGIKAPIGTGAWVLKEHKNNEYAVFVRNEYYWGEKPAASEIIIKNIPDS ETLALQFESGDIDLIYGNGLISLDRFNSYRQDNGKYTTATSNPMSTRMLLMNTTSSILGD LNVRKALNHAVDKESIAKNIFDGIEKPADTIFAPNVPHTNVKLEKYGYDLAKAEAMLDEA GWKKGADGIREKNGKKMVLSFPYISSKVTDKSIGEYIQGEWKKIGIQVELKAMEEKAYWQ NATAKNYDIMSDFSWGAPWDPHAFLTAMADNSASNTNPDYAAQLGLPMKAQLDKTIKALL VEPNEQKLNEMYTYVLTTLHEQAVYIPISYQAMLSVYRTGELEGVKFMPEENRIPAWSVK KVK >gi|261747573|gb|ADAD01000117.1| GENE 13 11908 - 12840 1252 310 aa, chain + ## HITS:1 COG:FN1503 KEGG:ns NR:ns ## COG: FN1503 COG0601 # Protein_GI_number: 19704835 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease 
components # Organism: Fusobacterium nucleatum # 1 309 4 312 312 263 47.0 5e-70 MIKKIISLFSIFAVISLITFFLVKLSPGDPAENYLRASHVSITAETLASTRKALGLDKPV PVQYMNWAGNLLKGDFGVSYSKKVPVIDLVSEAIVPTFQLGLFSFLILLLTAPILGIISA VNKNSFIDYIVRALSYTGVSVPTFWLGYILIIIFAVSLKILPVSGRGGLENLILPCITLV TPLVAQTTFLIRKSILEQMEKPHVENAVIREVSRKKVIINHLLRNAAIPIVTVLSSNIMF LLTGSVLIEEIFSWPGIGKLFATAVRTGDFPLIQIMLLFFGVMSVIVNEITQIMIKYIDP KLRLKEKTGM >gi|261747573|gb|ADAD01000117.1| GENE 14 12850 - 13662 986 270 aa, chain + ## HITS:1 COG:FN1502 KEGG:ns NR:ns ## COG: FN1502 COG1173 # Protein_GI_number: 19704834 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 265 5 271 271 252 53.0 5e-67 MKKSKLFYFSCTLLVILIFIAIFAPLIAPYDPLYTDISQKLQLPSKIHILGTDHMGRDVF SRLVYGTRLSLTISGIIIVLTLCVSFPVGIAVGWFGGAWDKLFLWFINVLMAFPSFLLSM ALVGILGQGVGNIIIAVTVIEWIYYARILRNMVLSIKQQEYVQAARAIGAPSFYIIRKHI LPFVFKPILIAALMNIGNIILMISGFSFLGIGVQPNIAEWGMMLNDAKPYFRRIPGLVLY PGMAIFITVLAFNLLGETFDGKGIKKLWKD >gi|261747573|gb|ADAD01000117.1| GENE 15 13650 - 14441 399 263 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 17 258 26 269 563 158 39 4e-72 MERLIINSLKTEIDKTEILKGISFKLNKGETLIIIGESGSEKTMLSRLLIGMKPENAIIR GNIYFDDKDLLSMIEKERSNYRGKRIAYIAQNPMAVFNDFQSIESHTVELFQSQLNFSKK ECQEKMISGMKELNLPNSEEIMKKYPFQLSGGMFQRVMFAMMMQLQPELLIADEPTSALD YYNSEKVAEMLKKFQSKNTALIVITHDYDLAEKLGGKVMIMRNGEIVEEGITSEMLKNPE SNYGKELLLRKTHTRYKKRSDNV >gi|261747573|gb|ADAD01000117.1| GENE 16 14438 - 15211 344 257 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 4 231 12 247 329 137 35 1e-31 MTKLMINNVNKYYNKKLILSNINLSVSERECVGIMGESGSGKSTLAKLIVGLEPLDEGDM ILDGISYKSLSKKEMKKIYRKVQMVFQNALGAVNLRFTIEEILLESLRIHYKKTLSYEEM KKRATDLLEKVGLRAEFLSRKATELSGGQLQRVYIARALILEPEIIVFDESVSGLDLVVQ QQILELLAELRETMNLTYVFISHDFEACYYLCDKVVIMESGEIKDVISDLDSPIVINNER VKKIVGKSLNHIEYIEK >gi|261747573|gb|ADAD01000117.1| GENE 17 15307 - 15849 424 180 aa, chain - ## HITS:1 COG:no KEGG:Halsa_1787 NR:ns ## KEGG: Halsa_1787 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: Halanaerobium_sapolanicus # Pathway: not_defined # 15 145 20 142 192 78 41.0 1e-13 MKLIKTKKPKHLMNIYKLYLKAFPKNERKPFPFILLKQWRGTTEILSIKNSTGEFAGLAI AAIHNDLVLLAYFAVSSTQRGGGIGSKALQLLKERYADKKLFLEIENTDTDAPNIEERKK RKNFYLKNGMKEMPFIINFFGTEMEIMTYNSDITFDEYHSVYKKEFGKYISRKVKFVRYK >gi|261747573|gb|ADAD01000117.1| GENE 18 15968 - 17170 1539 400 aa, chain + ## HITS:1 COG:CAC1014 KEGG:ns NR:ns ## COG: CAC1014 COG1473 # Protein_GI_number: 15894301 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Clostridium acetobutylicum # 17 400 10 391 396 289 42.0 7e-78 MNSKEINDFIKQNTGKIYDEIVKIRRKIHMNPELGDEEFETGKTIKDFLKENGIEYEEVI NTGVVATIYNGEGKTVATRADIDALPIFEENDVEYKSKIDGKMHACGHDGHTSVQLGVAK ILADNKDKWKGTVRFFFQPAEETNGGADRMIKAGTLKFKGDENRKIDAFFALHMAPEIET GKIGIKYGKAHATSAMFRLTINGVSAHAAQPQKGVDAILIGAKVLEFLQSIVSRRIDPRE EAVITVGSFKGGEAENVVCDKVDMLGTIRTMSKEIRTFIIETIKRDLPKFVESLGGKADI RIREGYAPVINNEEITKKVEQNIIDLYGKESLEIIKEARMDVEDVSYFLNEINGCFFRLG 
TRNEEKGLIYDLHHPKFNIDEESLKIGIGLQLKNILEFLK >gi|261747573|gb|ADAD01000117.1| GENE 19 17198 - 17395 299 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038361|ref|ZP_06011738.1| ## NR: gi|262038361|ref|ZP_06011738.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 65 1 65 65 96 100.0 7e-19 MNYYSKFDEEKYYENELFKYNTESISERIAIWDEVEYSIDTLDSSYRGTNARDNAFIVSG MKKAE >gi|261747573|gb|ADAD01000117.1| GENE 20 17407 - 17631 212 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038358|ref|ZP_06011735.1| ## NR: gi|262038358|ref|ZP_06011735.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 74 1 74 74 141 100.0 2e-32 MQCTPEYPKQSFFRKIFGLSEELRIDSYNVEIGTGEKIYRKVVNTKEEIKNLFKNFYETG NIPDISDWEDTGIL >gi|261747573|gb|ADAD01000117.1| GENE 21 17658 - 19058 1612 466 aa, chain + ## HITS:1 COG:FN1713 KEGG:ns NR:ns ## COG: FN1713 COG2265 # Protein_GI_number: 19705034 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Fusobacterium nucleatum # 1 466 7 463 464 359 47.0 7e-99 MDINKKKYKIGEKLGIKIEKIVFGGEGLGRIDGFTIFVPMSVPGDVLEIEIISVKKSYAR GLITKIIKPGKNRIDDISKVSFEDFDGCDFGMLKYEKQLEYKNEMLKEVLEKIGEINIEN IELESIIGSDIKTNYRNKTAEPVFKKDGKIMTGFYSKKSHDTFSASESLLKSKIAEDIIN KFLKEINSFSGTKNEFKVYNEITNTGFLKHVIVRNNEKNEVMLIITVSKKSQYKQLVKVL EKLYEENESLKSVYISVKKEANNVILGEETRHLLGNEYLEEEIEEIRFKIYPDSFFQINK SQAIKLYDKAIEFLGESVHKTVIDAFSGTGTIAMALSSKVKKAIGIESVESSVIAANQTA EENNIKNVEFIKGKVERVLPDVLKKEGNSVEAIIFDPPRRGIDEKALKSSVKNKIPKIVY ISCNPSTFARDCKLLMENGYTLKKVSAVDMFPQVNHIEIVGLLERK >gi|261747573|gb|ADAD01000117.1| GENE 22 19165 - 19395 264 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038401|ref|ZP_06011778.1| ## NR: gi|262038401|ref|ZP_06011778.1| hypothetical protein HMPREF0554_0873 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0873 
[Leptotrichia goodfellowii F0264] # 1 76 1 76 76 122 100.0 7e-27 MDILYRKHMNLYGSSFLKEYTRVSKLTEELNKKIGLPIYKVTIDVLNSELKVENRFSGKK KMKFSHSMKGNLNYVR >gi|261747573|gb|ADAD01000117.1| GENE 23 19385 - 19840 620 151 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038371|ref|ZP_06011748.1| ## NR: gi|262038371|ref|ZP_06011748.1| hypothetical protein HMPREF0554_0874 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0874 [Leptotrichia goodfellowii F0264] # 1 151 1 151 151 266 100.0 4e-70 MSDNENQLENIKIENKETEMVERLRDVPKILSQLNGTVEREFSGPLPLEVLQNLSQENKD KLVNNFINQETKEHEANMKVLELIGDKNNKDFSLKRLSLIIGIPGFLILTGICLFSGNKD VLLEILKITFAFLGGTGFGYYIKRDKDSERS >gi|261747573|gb|ADAD01000117.1| GENE 24 19952 - 20518 511 188 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1264 NR:ns ## KEGG: Lebu_1264 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 187 7 188 198 270 67.0 3e-71 MNLNRIIKKIVLLLIPVLFLISCKSIDPAYKWYSAKEVIAKSDQLQPGDILILSKRNTIR SMWGHVAVLNEEKKIVEFPSYSAGYSESPLFVWQKIDRQISVFRLKGIDDDYKNALFEEI NKTVNKPYGLTFHKNFDKRLYCSQFVYLVFKKAGKKVGRNVDLDSNNGGWVMPFDIMRSP LLENVILE >gi|261747573|gb|ADAD01000117.1| GENE 25 20664 - 21398 980 244 aa, chain + ## HITS:1 COG:FN1931 KEGG:ns NR:ns ## COG: FN1931 COG0826 # Protein_GI_number: 19705236 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Fusobacterium nucleatum # 1 238 1 238 720 291 60.0 9e-79 MNIVAPAGNYEKLEAAIKAGANEVYFGLKGFGARRNNENLSIKEILDGIDYAHSRGIKTL IALNTVMKDVEIDAAYKYISKIYEHGLDAVIVQDLGFMSFLKENFPKLALHGSTQMTVAN HVEANKLKELGLSRVVLAREMSFEEIKSIREKTDIELEIFVSGSLCISYSGNCYISSFIG GRSGNRGLCAYSCRKKFEDEEGNKAYFLSPNDQLLQTEEINKLKDIEINAIKVEGRKKIK RICL >gi|261747573|gb|ADAD01000117.1| GENE 26 21496 - 22839 1444 447 aa, chain + ## HITS:1 COG:FN1931 KEGG:ns NR:ns ## COG: FN1931 COG0826 # Protein_GI_number: 19705236 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # 
Organism: Fusobacterium nucleatum # 2 447 280 720 720 223 33.0 7e-58 MDNKLMNFKYSSNFGYFLGARIENSNNFRIDDELILGDGIQFVDSDFEKISGEYVNKLIV NGKKVQKADKDDIVSVGKLPDKTKYIYKNYSKEINDRIIHNIKVSKRFASIDGELFAEKG KEIMLKFSIYNLKGEKIEVLKKGNVIEQDAKKEITKEQIAEKLGELGDTTFELNSLKINY DGKSFIPFSELKALKRECVSELQEKLVNSYRKNLEEKKVKTFKGKSSTEPIFSVLVSNKE QEKACREAGIEKIYHKQYDVAKEKNLHKTDKIKTDSNLASNLYQVAMGEKNNINGQSSDW NLNVFNNYTLDLFSQFPNLETVFLSPELNYKQLKMIKSDKVKIGMIIYGYLKGMYIEHKI FDKEYKELQGEFYDKYKILKNDLDNIELYLNKPMNLIPKLDQIKEFNFDELRLDFTFETA EEVKKIIKSLQTKKGNYNPYSFERGVF >gi|261747573|gb|ADAD01000117.1| GENE 27 22860 - 23507 636 215 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1011 NR:ns ## KEGG: Lebu_1011 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 6 209 1 204 373 206 61.0 4e-52 MKRGFMEIIKNRNYIEKMYMLLGGAIFTHYVLVILLCILLLKDIFVTGKYKKILKDKSLV IVEAVLGLSIITSLFYKNYYGLIAIPMLLCIMVGRYYTRIVSDDFKNYNLELMAKFSGAA FFVSLAECLITKERAGYFAYLNPNYLGNIMMMAAIVNLYLALKKRTKLNFLIFILNFITI VMSGSRSALAAAIIGIFVLLYYFLEKKVFLSVFYF >gi|261747573|gb|ADAD01000117.1| GENE 28 23549 - 23962 378 137 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1011 NR:ns ## KEGG: Lebu_1011 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 131 226 358 373 153 58.0 2e-36 MREDSISKYYGLRKDILKMALRAFKSNILFGHGNFYYFKYTKFYPHSHNAVTELLISYGL IGTVALMTVWLKYIYDIFKDDKSNVLKISILLGVVAHNLTDFPIFWIQTVLLFIMILSCR ENKDEVKKLRFHSRNIK >gi|261747573|gb|ADAD01000117.1| GENE 29 24334 - 24438 129 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSKKAPKTYEEQLNILKNRNLKILNERKAKKFYQ >gi|261747573|gb|ADAD01000117.1| GENE 30 24537 - 24959 239 140 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_4658 NR:ns ## KEGG: Dhaf_4658 # Name: not_defined # Def: Abi family protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 2 129 73 201 317 83 41.0 3e-15 MLFNFDRELRMILLKYILIVERTIKTKISYHFSMKYDDKYLNANNFDYNNRRKNLKIPRL IYNMSKVKRIYSEVNSSIYHYQEIHGKIPLWVLVEKLNFGIISHFFYCLILKDQNAIVKE IFEDYKEEYNYNKKSILNPA >gi|261747573|gb|ADAD01000117.1| GENE 31 25164 - 
25292 137 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MINKINELISKLPALLPNYYKKILNKMGFSNDWENIMLEIIK >gi|261747573|gb|ADAD01000117.1| GENE 32 25370 - 25732 507 120 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1804 NR:ns ## KEGG: Lebu_1804 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 120 1 119 119 95 58.0 4e-19 MSSNIKEQASIGGLRANAAAALINLSFFLPGLGFLISLLALILETKNQFVRNYAKQTLAL GVLLFASILLNVVVVLGTIAFGIITFVLSIVQIIAIIKSFLGQEFEIPYIKKVSELLFVD >gi|261747573|gb|ADAD01000117.1| GENE 33 25864 - 26847 1183 327 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1817 NR:ns ## KEGG: Lebu_1817 # Name: not_defined # Def: DNA polymerase III delta # Organism: L.buccalis # Pathway: Purine metabolism [PATH:lba00230]; Pyrimidine metabolism [PATH:lba00240]; Metabolic pathways [PATH:lba01100]; DNA replication [PATH:lba03030]; Mismatch repair [PATH:lba03430]; Homologous recombination [PATH:lba03440] # 1 327 1 327 327 348 63.0 2e-94 MIYFIGGTKHREFRYFDLLEKIRKENPGISESFFDVDIKEEEKFLEKVSFNSIFSTEELI VLRQAEKLKDLEKTLDYIANLDIVGKEIIIDYFKEDGKTGVKLSKKLEELKKNGKLEVYL FPKEDDGEIKKYIKKELGISEKDTAMLLEMIGNDPFKVKNEVEKIKIYLNGDSFNMQEMK KIVSVEKEYRIYETVDKILNNKAAEVIDYLEKTKEYMGVLYSLYGELEVMYKLSCLRSSG MKFSSNYAAFKNEFEGIKEIFKTNNRLPNPYVIFMKFPKLKNYTAKNLRRLVFRCWEVEK DIKTGKIEMESGIETLIMETADLYVKK >gi|261747573|gb|ADAD01000117.1| GENE 34 27036 - 27155 154 39 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229211279|ref|ZP_04337673.1| SSU ribosomal protein S20P [Leptotrichia buccalis DSM 1135] # 1 34 1 34 88 63 88 1e-31 MAHSKSSKKRIFIGERNASRNQAIKSRVKTFVKKSSFSC >gi|261747573|gb|ADAD01000117.1| GENE 35 27133 - 27303 234 56 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229211279|ref|ZP_04337673.1| SSU ribosomal protein S20P [Leptotrichia buccalis DSM 1135] # 1 56 33 88 88 94 87 1e-31 KKVLSAVEAKNVDEAKTALQVAYKELDKAVTKGVLKKNTASRKKSRLALKVNSLAS >gi|261747573|gb|ADAD01000117.1| GENE 36 28170 - 29546 1383 458 aa, chain - ## HITS:1 COG:BB0473 KEGG:ns NR:ns ## COG: BB0473 COG0534 # Protein_GI_number: 15594818 # Func_class: 
V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Borrelia burgdorferi # 12 456 9 454 454 137 25.0 5e-32 MIFKAQTAEERRKMILNGNILNTLLFLSVPTLLVGAIQAFIPLSDGLFLNNIAGVVTASA VTFSQPVLNIMIALSQGLGAAAMAMIGQLYGRGVIKAVKEVTLQVFVFGFVIGLILIPVC IGAGYLVSANITSEIKHEVFVYISLYSLVMPLVFMTAIYNSAKNAVGRPEVTFIRVFILL FLKIIFNTVFLYFFEMKIVGAVMASLFSYIIISVWMYHDLFIKNTDMKLDLKKYHLNLPI VARLFKIGFPSMLSYMLIQTGFVLINKEVDKYGAISLNAQGIASNINSICFILPSAIGTT VTTMISMNMGIGEIKKAKKIFTYGWITSVVIALLTIGMIVPLAGKITVLFTRNKEVLDIA NHALNIYTYSVIGFGVFSVCQGVFIALGRTKVPLFMAILRIWFFRYLFILFTQKFLGVCA VFWGNLFSNTLAAILFFCLVKKLDWNINPAKIQKKKLK >gi|261747573|gb|ADAD01000117.1| GENE 37 30205 - 31512 2193 435 aa, chain + ## HITS:1 COG:BS_lysA KEGG:ns NR:ns ## COG: BS_lysA COG0019 # Protein_GI_number: 16079395 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Bacillus subtilis # 1 434 1 434 439 411 49.0 1e-114 MKLFGTSKINEKGNLSIGGVDTNDLAKEFETPLYVMDQELIETTIDKMKKAFVSSRFDTH IAYAGKAFFTTGMAKIIEAKDLELDVVSGGELYTAYKAGFPMDKVHMHGNNKSYEELEMA IDLEIKQIVIDNEDEIGKIEKICKEKNKKQAVLLRIDPGIEAHTHHYIKTSGLTSKFGIS LFQKDLLDIIKRINDSEYLEFRGFHTHIGSQIFQSIFFIFALEEIFKYLDKLKKELGIVV HTVNMGGGFGVYYKEGDDPVPVEEVLKEIITYTEAMEIKYKIGFKELCIEPGRSIVGNAG TTLYEVGGIKKTVGGKTYVFVNGGMADNIRPALYQAEYEAGITNKLDKEATNEVTVAGKF CESGDILIEKTKIQEAHVGDILAITTTGAYCYTMSSNYNRFAKPAVVFVKDGKAKLAVKR ETFEDLIRNDEIFEL >gi|261747573|gb|ADAD01000117.1| GENE 38 31584 - 32786 1731 400 aa, chain + ## HITS:1 COG:TM1518 KEGG:ns NR:ns ## COG: TM1518 COG0527 # Protein_GI_number: 15644266 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Thermotoga maritima # 2 400 3 401 401 314 45.0 2e-85 MIIVHKYGGSSVATTEKIMNIAKYLGKIKDKKNDVVVVVSAMGKTTDTLIKLAHEITGDP DKREMDRLMSTGEQQTIALLSIALQSLGYDAISLTGRQAGIRTSGHHTKNRIESIDEKKI KKHLEEGKIVIIAGFQGVNENGDVATLGRGGSDTSAVALAAALNGKCEIYTDVDGVYSVD PRIYPEAKKISYISYDEMMELAFLGAGVMEPRAVELGKKYGVEIYVGKSLGEKNGTVITA KEKIMEEKVITGISVNEDILMVNIEEIPTYAKNVYAILEQAVKFGVNIGVISQNDVSSEH 
GSFAFTCPQSDKASLEKISEILKEKFERISVIINPYVTKVSIVGIGLISNIGIAARVFKV LADNNISFHQVATSEISIGLIVDEVMGKKVAELLAKEFNV >gi|261747573|gb|ADAD01000117.1| GENE 39 32834 - 33679 1305 281 aa, chain + ## HITS:1 COG:MTH1334 KEGG:ns NR:ns ## COG: MTH1334 COG0253 # Protein_GI_number: 15679334 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Methanothermobacter thermautotrophicus # 1 275 4 283 289 185 36.0 6e-47 MLKFEKYQGAGNDFIIVNEKDLIEKGIPDYNQFASEVCCRHFGIGADGLLILKYVANMPF MFYYNSDGSQAPMCGNGIRCFSFYLKNNNVEQEDTFTVKTLSGDLKIETKTENETFFAKV NMGEPVFDVKKLINIEKERFMREKIIIDEKEIEISYIFMGTDHSVIFVNDFSDYNIDDLG SKIENYTEIFPKKVNVNFVKVHDRSNIEVITWERGAGRTLACGTGATASAVLARIYGYVE DRINVTVPGGKLVIDYAGYGSEAYMTGTSEKIAEGQYIFKR >gi|261747573|gb|ADAD01000117.1| GENE 40 33700 - 34584 1301 294 aa, chain + ## HITS:1 COG:aq_1143 KEGG:ns NR:ns ## COG: aq_1143 COG0329 # Protein_GI_number: 15606400 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Aquifex aeolicus # 3 289 2 287 294 277 49.0 1e-74 MKFEGSYVALITPFKENGEVDEEKIRELVNYHIENGTAGIVPCGTTGEAPTLTFSEHEKV IKIVVEEVKGRIKVIAGAGSNNTDRAVELTKYAKELGADAALSTCPYYNKPTQRGLYEHY KKIAEESKFPVMLYNVPGRTGINIEAETVARLAEIPEIVAIKEATGSLEQMIKIQDLCGD RIEILSGEDHLILPMLSIGAKGVVSVVANIMPKEMSDLIASFLNKEYEKAFELHTKLYEV SRNMFIEGNPVTVKAAMNILGQIDNDSVRLPLVSAEHKTKDRLTKFFKERGIIK >gi|261747573|gb|ADAD01000117.1| GENE 41 34663 - 35661 1684 332 aa, chain + ## HITS:1 COG:aq_1866 KEGG:ns NR:ns ## COG: aq_1866 COG0136 # Protein_GI_number: 15606903 # Func_class: E Amino acid transport and metabolism # Function: Aspartate-semialdehyde dehydrogenase # Organism: Aquifex aeolicus # 4 332 3 340 340 376 58.0 1e-104 MKKYNVAVVGATGLVGQTFLKVLKERKFPIEKLYLYASARSAGKKVDFDGKEYTVIELKD ENIKNDIDIALFSAGGSISKEYAPKFRDKGAIVVDNSSAWRMDKDIPLVVPEANPEALDG QNGIIANPNCSTIQVMPVLKVLADKYGLKRVVYSTYQAVAGSGQKGIDDLEANLKGESSK NYPYQIAFNLLPHIDVFLDNGYTKEEMKMVEETRKILGLPDLKITATCVRVPVRMGHAVS 
VNVELEKPFDLKDVFKAFEEKEGVIVKDDVSKNVYPMPIDAADTDEVYVGRIRRDESVEN GLNLWVVADNIRKGAATNTIQIAETLIKKGVI >gi|261747573|gb|ADAD01000117.1| GENE 42 35662 - 37233 1820 523 aa, chain + ## HITS:1 COG:SP0453_1 KEGG:ns NR:ns ## COG: SP0453_1 COG0834 # Protein_GI_number: 15900370 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Streptococcus pneumoniae TIGR4 # 30 306 28 308 325 240 46.0 6e-63 MKLNLLKNKIVLLLCFLSLFVFGYAENGELRVGMECGYAPFNWVQNNDKNGAVKIDSGYC GGYDVEIAKLIAKGLDKKLVVVKSEWDALLGPALTSGNVDIVIAGMSPTAERKQSLSFTK PYYESDLVVVVKKDGKYANAKSINDFAGARITGQLSTLHYNVIDQMKGVKKQQAMENFPA MIVALDSGKIDGYVSEKPGAMAAQMSNPELVFVSFDKENGFKYENSEVDVAVGVKKGDEK LVGEINKILDNISPEQRKQLMEKAISNQPEMKKRDFTGWVLFFLKNNKKAFLIGTATTLG ISLTGTVVGFIIGLGVMLVKNINTDERTSAVKKIGLKIINMIFSIYIAVFRGTPMIVQSM IIFYGSSQVLNWNLSPLGAALFIVSINTGAYMCEIIRGGIDSIDKGQFEAAQAIGMTHFQ MMKNIIFPQVFRIILPSIGNEFIINIKDTSVLSVISVTELFFVSKSVAGTYSKYYEVFII TCVIYFFLTFTLSALLKYVEKKMDGPQNFEILEEASVATGGEK >gi|261747573|gb|ADAD01000117.1| GENE 43 37233 - 37979 529 248 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 7 245 1 242 245 208 45 5e-53 MEKGNKVIEIKNIRKDFGNRTVLKDINFDVYEKEVVSIIGASGSGKSTLLRCINLLETPT GGEILFHGKDVVSGNIPLITLREKVGMVFQQFNLFNNLSVLDNCVIGQTKVLKKSREEAE KTAKELLAKVGMERFIHAKPNQISGGQKQRVAIARALAMEPEVLLFDEPTSALDPEMVGE VLKVMKDLAKSGLTMIVVTHEMDFAHDVSSRVVFMDQGVIVESDKPEIIFDNPKHDRTKE FLSRMLNK >gi|261747573|gb|ADAD01000117.1| GENE 44 38050 - 38820 1101 256 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0706 NR:ns ## KEGG: Lebu_0706 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 3 254 10 261 268 220 60.0 4e-56 MKKLKLTDEEKELLKGNEEGLKQAFINKAALATAEKYEFSDSEKEEIDYFYNNEKTKYFV AKQIEEKISVDADEVVKIYNENKAQFDAQNVPFTQARDIIQRDLLNQQVATLENEEFNKI IEEMGESVSIAKKEIIFSQGNPDVIRNIVLNKVVEEKAKGTDFEKKEKDALKIIKDNVLA 
NFYVDLEIRKKVQVTHEEIVGIYESEKGKLGNVTPNDAYNQIANGLLNNRAVEERQNVIN KLIEEYKIDDLVKENL >gi|261747573|gb|ADAD01000117.1| GENE 45 38938 - 39372 658 144 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229211385|ref|ZP_04337778.1| LSU ribosomal protein L13P [Leptotrichia buccalis DSM 1135] # 1 144 1 144 144 258 84 5e-68 MNKYTVMQKKEEVTRNWYEIDAEGKILGKLATEIAVKLMGKHKPSYTPHVDGGDYVIVTN ATKFAVTGTKMLNKKYYRHSGYPGGLKVRSLEEMLEKKPTEVIRKAVERMLPKNKLGSQM IGRLKIYTGTEHNHEAQKPEKIEL >gi|261747573|gb|ADAD01000117.1| GENE 46 39385 - 39783 621 132 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229211386|ref|ZP_04337779.1| SSU ribosomal protein S9P [Leptotrichia buccalis DSM 1135] # 1 132 1 132 132 243 92 1e-63 MAEKIQYLGTGRRKTSVARVRLIPGTSGIEINGKDMREYFGGRELLAKIVEQPLELTETL NKYGVKVNVNGGGNTGQAGAIRHGVSRALVVADEELRGALKEAGFLTRDSRMVERKKYGK KKARRSPQFSKR >gi|261747573|gb|ADAD01000117.1| GENE 47 39909 - 40937 974 342 aa, chain - ## HITS:1 COG:Cgl1981 KEGG:ns NR:ns ## COG: Cgl1981 COG0582 # Protein_GI_number: 19553231 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Corynebacterium glutamicum # 195 341 160 310 315 70 31.0 5e-12 MKMRNPNGYGSIIKLSGKRRKPFAVRVTIGWNNNGKQIYKYIGYYKTKQEADRQLVYYNE NPYDLDIQKITFSEVYEKWKKEKFETVGHSAQLGYIAAFKSCKTLHYTRFVDLKSSHLQE IISNPGIKYGSKRKIKVLFNQLYAYAMKNDIVSKDYSKYTDTGKNTEESSRKPFTTEEIQ RLWDLVDENDWIDTILILIYTGFRIGELLEIKNTDIDLDNRIIKGGLKTEAGKDRLVPIN SKIYNLIKNRMSSDNEYLIVNFKGEKMKYSNYYREKFEPIMEELGMKHRPHDTRHTFATL LSNADANKTSVKKLIGHNSYTTTEKFYTHKDIEELRKAIEKI >gi|261747573|gb|ADAD01000117.1| GENE 48 40918 - 41157 272 79 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038390|ref|ZP_06011767.1| ## NR: gi|262038390|ref|ZP_06011767.1| putative UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase [Leptotrichia goodfellowii F0264] putative UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase [Leptotrichia goodfellowii F0264] # 1 79 1 79 79 143 100.0 5e-33 MKLVLSMKEACDFLQLNERTVKSGLITGTLNIGSAVVTKIVNKKEKYKYHIPTRRAEAYM GISYDDFLKMRQQNENEKS >gi|261747573|gb|ADAD01000117.1| 
GENE 49 41175 - 41930 951 251 aa, chain - ## HITS:1 COG:no KEGG:Sterm_1235 NR:ns ## KEGG: Sterm_1235 # Name: not_defined # Def: phage recombination protein Bet # Organism: S.termitidis # Pathway: not_defined # 4 224 7 226 238 269 63.0 8e-71 MGRLTQNEANDKLIIFKVGNDEVKLSNNIVKRYLVTGQGNVTDEEIMYFMKLCKARNLNP FVRDAYLIKYSDKDAATIVVAKDAIEKRAIQHPKYNGKEVGLYIIKKETGDLEKRNGTIY LKEKEEIAGAWCTVYRKDWDNPVTVEVNFDEYVGRKKDGTANINWANRPVTMITKVAKAQ ALREAFIEEISGMYEAEEAGINVNDLDDTPIEQKDMQNMPDVTNTVQETEIDDEDLKKEL FDNENKKNPLE >gi|261747573|gb|ADAD01000117.1| GENE 50 41930 - 42790 781 286 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038387|ref|ZP_06011764.1| ## NR: gi|262038387|ref|ZP_06011764.1| putative DNA double-strand break repair Rad50 ATPase [Leptotrichia goodfellowii F0264] putative DNA double-strand break repair Rad50 ATPase [Leptotrichia goodfellowii F0264] # 1 286 1 286 286 308 99.0 4e-82 EEKYGNLVFTEDEKATAKEMRTNLNKLEKQISDERKKIEKEAKVEIDNIINTLKKAEKDV KSLSASIGEQIKNFDENEWKEKLESISEIKEKIYSENKLLEKYFVIDEAWKKKTMTLKKV EEDIKEKFDYWNKRYNFIVSQLVAVNEEIENKLKFEDIQYLMLEEYDNIMKKLVDKKNEI KTTEENIKRKAEEDKQKALAELEAKKEQEKREAIEKAKNETAKEEKTTKDKSINKTQKVS ENTTYFDTTIRFPKAPIEFLKELKTLSDRYGLTYELKENIRLGDDE Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:12:45 2011 Seq name: gi|261747569|gb|ADAD01000118.1| Leptotrichia goodfellowii F0264 contig00230, whole genome shotgun sequence Length of sequence - 4799 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 3553 4252 ## PM0490 hypothetical protein 2 1 Op 2 . + CDS 3563 - 4030 477 ## gi|262038404|ref|ZP_06011780.1| conserved hypothetical protein + Prom 4232 - 4291 1.9 3 2 Tu 1 . 
+ CDS 4331 - 4799 569 ## gi|262038405|ref|ZP_06011781.1| hypothetical protein HMPREF0554_2458 Predicted protein(s) >gi|261747569|gb|ADAD01000118.1| GENE 1 2 - 3553 4252 1183 aa, chain + ## HITS:1 COG:no KEGG:PM0490 NR:ns ## KEGG: PM0490 # Name: not_defined # Def: hypothetical protein # Organism: P.multocida # Pathway: not_defined # 1072 1180 4 111 116 96 44.0 8e-18 ASIPVVGAAKQVWDAGKNLKNAKHKEDYLNAGLGAYSAGMSAASAIGGVFTNPFGVTGSI NVSHNKNVYHREESISVGSNLHVGGGVEYNGESLHTRGLNILNEGDTVYNITGQILKEAG VSTIKENSSSTGFGLSVAKGMVDSNGQINAFKDKNVTITPSISNQKSESEGVYYSKNEDI TKGNSHYNNTGNVKTVGVAVRTGSISGTVNGPHETISVQDSVNSSSRGYSVSLGIGIGPH TEKGKTVNSPYVSSVTVGYSKADVEQKITRNVAEFTAGSGMLDVKGKIIQVGSLIDGGFS LNGQGYEKQDLHDIDKSRKVGVNVTVYPNVTYTKRDEKGNAIYIDGRKENEQGAVYKVGV NYAETDKARDVLSTVGSNVQINQDIAGVNRDTNRQVGEFEGREISPINVDLGTEYWLTRA GRGKAKDIFEDAGRSVEGIKRILTTRDADGNLQILKSIEAETAVQKMMRIGFVETKGKTQ QQVKKELEERFGSLTKKGVKVHFYGTEDIDTSKMDDNTLAKLTANGFAITKDGTVWINKE YVDSGKVIDFNKTTQHEISHLIFGEDSEYQAQYLTRAYGEFLEGIRDNGYLKDGQGIIDY KFSMLTDEDRLKLDGYAYKDMQFIQEMADGIRAAKARLSKCGTNQKCIADNKQIIHNLEV SLKKFGVQEKIQTHKEKGKRINANKPRNIIGKIQKEIEKYKNKKEISKYEKEYQDLDEEF DEELKYVVRDLERQINQGKKKEANKKLKEIAKEIVSKNEIVLSSDYMFNNGIKTEANGKV ILYAGGKDYEKLNKILKEEGILSRITVNYMEKRGADILEKVGRAAAKENKIPNYLRKSDI QQLEEYNGRKKEIFEGRKLYGRGEITESELKKYESTLNYIGASNAITDVAKIGSVVYLSS KYGINTKPEYMKQREILINDGTSTGTVRAKQITVGNRDIVQTKISLETGETLVTTKFKPT GEVLQELRFEAKGTGNVIPYSVGEATKSVLQISDSVNKTFGNMSLANTTKSAVSSNNLLT EYGNIPVVYQQPNPTLLGTMIQINPDRYIQNVTAPINYNHVFNLEINRRGKVVGGHTAFG NVTVEKVIQTYPTGVYKAEIFMQDPNNPSNLLKKSNNDGISTMFPKDWTPNRIKVEVDFA YKNRKAHSDINKALRGMWEGITPSGVWVEGYTNRNTTVYPQPK >gi|261747569|gb|ADAD01000118.1| GENE 2 3563 - 4030 477 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038404|ref|ZP_06011780.1| ## NR: gi|262038404|ref|ZP_06011780.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 155 1 155 155 290 100.0 3e-77 MLKIKFSWRKGEDNNFYPGGQSSFEGDYSNYVEDKIKYFFNEISTQNKLNYKDCLGYLFI 
DDGGVGWKRRIPFIENGIKKIEKVLSNLSEEEDWGGEGFLAEIKKEGVLIYFITDDSYFD IISLKCFYKAMVSWRKFLDTEPNENTIMEIECEED >gi|261747569|gb|ADAD01000118.1| GENE 3 4331 - 4799 569 156 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038405|ref|ZP_06011781.1| ## NR: gi|262038405|ref|ZP_06011781.1| hypothetical protein HMPREF0554_2458 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2458 [Leptotrichia goodfellowii F0264] # 1 156 1 156 156 256 100.0 4e-67 MEDINKSRNIGINVTITPGVVEIYKNGMLTGNAKGGTAYGTRISFNEKNYVAKAKATIGS NVKTIIDGKEEELKEVNRDIGNRIEVIRNEEISPINIDLGTEYWGTDYAREKARGDFVKA GNKIDRVAEILNAAIKNDNKDIYLYYKDRIAAEEEI Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:13:18 2011 Seq name: gi|261747567|gb|ADAD01000119.1| Leptotrichia goodfellowii F0264 contig00108, whole genome shotgun sequence Length of sequence - 568 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 56 - 568 542 ## COG5421 Transposase Predicted protein(s) >gi|261747567|gb|ADAD01000119.1| GENE 1 56 - 568 542 170 aa, chain - ## HITS:1 COG:MA2942 KEGG:ns NR:ns ## COG: MA2942 COG5421 # Protein_GI_number: 20091761 # Func_class: L Replication, recombination and repair # Function: Transposase # Organism: Methanosarcina acetivorans str.C2A # 14 163 370 519 521 71 31.0 8e-13 VENNVENEKRVEDYRKYFDIKVEKGNIIAVAKDDIIEKHMKKYGYFSLISNENLEAREIL SIYRQKDVAEKAFHNIKDRLDARRLRVSSKPTMDGKIFVTFVSLVMLSYIKNKMSEKELY KKYTTQELLDELDLIESYERCNEKLKLGEVTKKQKEIFKYMDIKFPEELL Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:13:37 2011 Seq name: gi|261747504|gb|ADAD01000120.1| Leptotrichia goodfellowii F0264 contig00115, whole genome shotgun sequence Length of sequence - 65119 bp Number of predicted genes - 63, with homology - 61 Number of transcription units - 23, operones - 15 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 9 - 68 11.2 1 1 Op 1 . 
+ CDS 266 - 562 575 ## COG0513 Superfamily II DNA and RNA helicases 2 1 Op 2 . + CDS 568 - 2022 2049 ## COG0513 Superfamily II DNA and RNA helicases + Prom 2044 - 2103 7.0 3 2 Tu 1 . + CDS 2133 - 2777 828 ## COG2077 Peroxiredoxin + Prom 2780 - 2839 8.3 4 3 Op 1 . + CDS 2899 - 3234 585 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily 5 3 Op 2 . + CDS 3185 - 5110 2437 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily 6 3 Op 3 . + CDS 5135 - 6046 1155 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase + Prom 6128 - 6187 11.8 7 4 Op 1 . + CDS 6244 - 7833 2159 ## COG1543 Uncharacterized conserved protein + Prom 7841 - 7900 3.0 8 4 Op 2 . + CDS 7945 - 8208 94 ## gi|262038448|ref|ZP_06011822.1| hypothetical protein HMPREF0554_1786 + Prom 8316 - 8375 8.9 9 5 Op 1 7/0.000 + CDS 8623 - 9771 1562 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes + Prom 9773 - 9832 2.4 10 5 Op 2 . + CDS 9885 - 11072 1674 ## COG0301 Thiamine biosynthesis ATP pyrophosphatase 11 5 Op 3 . + CDS 11089 - 12402 1671 ## COG1295 Predicted membrane protein 12 5 Op 4 . + CDS 12413 - 13780 457 ## PROTEIN SUPPORTED gi|163739394|ref|ZP_02146805.1| 50S ribosomal protein L36 13 5 Op 5 . + CDS 13749 - 14789 1674 ## COG2876 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase 14 5 Op 6 . + CDS 14813 - 15409 723 ## PROTEIN SUPPORTED gi|229210324|ref|ZP_04336721.1| acetyltransferase, ribosomal protein N-acetylase + Prom 15424 - 15483 8.9 15 6 Op 1 . + CDS 15508 - 15762 427 ## Dtox_4301 prevent-host-death family protein 16 6 Op 2 . + CDS 15755 - 16012 202 ## COG4115 Uncharacterized protein conserved in bacteria 17 6 Op 3 3/0.000 + CDS 16035 - 17117 1343 ## COG0337 3-dehydroquinate synthetase 18 6 Op 4 5/0.000 + CDS 17139 - 18395 1585 ## COG0128 5-enolpyruvylshikimate-3-phosphate synthase + Prom 18742 - 18801 10.1 19 7 Op 1 . + CDS 18830 - 19927 1518 ## COG0082 Chorismate synthase 20 7 Op 2 . 
+ CDS 19972 - 20103 257 ## + Term 20125 - 20163 -0.3 + Prom 20110 - 20169 3.4 21 8 Tu 1 . + CDS 20226 - 20882 722 ## BMD_0900 hypothetical protein + Prom 20939 - 20998 9.4 22 9 Tu 1 . + CDS 21191 - 22753 2076 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific + Term 22818 - 22852 -0.7 + Prom 23157 - 23216 3.4 23 10 Op 1 38/0.000 + CDS 23238 - 24803 2454 ## COG0747 ABC-type dipeptide transport system, periplasmic component 24 10 Op 2 49/0.000 + CDS 24825 - 25748 966 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 25 10 Op 3 1/0.000 + CDS 25757 - 26275 535 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 26 10 Op 4 44/0.000 + CDS 26335 - 26565 275 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 27 10 Op 5 . + CDS 26558 - 27295 316 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 + Prom 27334 - 27393 7.8 28 11 Op 1 . + CDS 27437 - 28027 663 ## FN1814 hypothetical protein 29 11 Op 2 25/0.000 + CDS 28027 - 28926 1380 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin 30 11 Op 3 . + CDS 28928 - 29140 302 ## COG1121 ABC-type Mn/Zn transport systems, ATPase component 31 11 Op 4 42/0.000 + CDS 29131 - 29616 170 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 32 11 Op 5 12/0.000 + CDS 29658 - 30551 1070 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 33 11 Op 6 . + CDS 30577 - 31452 1248 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin 34 11 Op 7 . + CDS 31479 - 31898 563 ## FN1808 hypothetical protein 35 11 Op 8 . + CDS 31963 - 32763 1258 ## COG5266 ABC-type Co2+ transport system, periplasmic component + Term 32769 - 32805 5.0 + Prom 32893 - 32952 8.9 36 12 Tu 1 . 
+ CDS 32983 - 33366 368 ## gi|262038420|ref|ZP_06011794.1| NADH dehydrogenase subunit 5 + Prom 33392 - 33451 9.6 37 13 Tu 1 . + CDS 33599 - 34963 1494 ## Lebu_0877 hypothetical protein + Term 35053 - 35101 3.4 - Term 34969 - 35011 0.5 38 14 Op 1 . - CDS 35027 - 35785 700 ## FN0484 lipase (EC:3.1.1.3) 39 14 Op 2 . - CDS 35815 - 36609 691 ## COG1521 Putative transcriptional regulator, homolog of Bvg accessory factor 40 14 Op 3 . - CDS 36626 - 38263 1641 ## COG1297 Predicted membrane protein - Prom 38327 - 38386 17.1 + Prom 38315 - 38374 10.2 41 15 Op 1 . + CDS 38571 - 38678 157 ## 42 15 Op 2 11/0.000 + CDS 38719 - 40950 3413 ## COG1882 Pyruvate-formate lyase + Term 40976 - 41032 2.1 + Prom 40965 - 41024 5.3 43 15 Op 3 . + CDS 41061 - 41786 750 ## COG1180 Pyruvate-formate lyase-activating enzyme 44 16 Op 1 . - CDS 42083 - 42568 577 ## COG2131 Deoxycytidylate deaminase 45 16 Op 2 . - CDS 42583 - 43923 1207 ## COG1570 Exonuclease VII, large subunit 46 16 Op 3 . - CDS 43940 - 44290 553 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family - Prom 44321 - 44380 2.3 47 17 Tu 1 . - CDS 44389 - 46362 2258 ## COG0556 Helicase subunit of the DNA excision repair complex + Prom 46716 - 46775 5.9 48 18 Op 1 . + CDS 46801 - 47655 1154 ## COG4667 Predicted esterase of the alpha-beta hydrolase superfamily 49 18 Op 2 . + CDS 47682 - 48953 1042 ## Lebu_1924 hypothetical protein 50 18 Op 3 . + CDS 49012 - 49695 663 ## Lebu_1923 hypothetical protein + Prom 49719 - 49778 5.6 51 19 Op 1 1/0.000 + CDS 49812 - 50708 1139 ## COG0583 Transcriptional regulator 52 19 Op 2 . + CDS 50750 - 51334 822 ## COG0279 Phosphoheptose isomerase + Term 51367 - 51425 1.3 + Prom 51424 - 51483 10.5 53 20 Op 1 5/0.000 + CDS 51670 - 53682 1572 ## COG3711 Transcriptional antiterminator 54 20 Op 2 . 
+ CDS 53700 - 54134 445 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 55 20 Op 3 10/0.000 + CDS 54166 - 55491 1905 ## COG3775 Phosphotransferase system, galactitol-specific IIC component 56 20 Op 4 . + CDS 55493 - 55777 469 ## COG3414 Phosphotransferase system, galactitol-specific IIB component 57 20 Op 5 . + CDS 55782 - 56618 959 ## COG0191 Fructose/tagatose bisphosphate aldolase 58 20 Op 6 . + CDS 56618 - 57307 915 ## COG0149 Triosephosphate isomerase 59 20 Op 7 . + CDS 57304 - 57819 555 ## COG0432 Uncharacterized conserved protein + Term 57931 - 57982 -0.8 60 21 Op 1 . - CDS 58047 - 59903 1364 ## FN0723 hypothetical protein 61 21 Op 2 . - CDS 59907 - 62960 2612 ## COG2319 FOG: WD40 repeat - Prom 63080 - 63139 13.8 + Prom 63076 - 63135 12.5 62 22 Tu 1 . + CDS 63163 - 64524 545 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 + Term 64637 - 64669 1.7 + Prom 64631 - 64690 8.4 63 23 Tu 1 . + CDS 64828 - 65119 280 ## gi|262038433|ref|ZP_06011807.1| conserved hypothetical protein Predicted protein(s) >gi|261747504|gb|ADAD01000120.1| GENE 1 266 - 562 575 98 aa, chain + ## HITS:1 COG:AF2254 KEGG:ns NR:ns ## COG: AF2254 COG0513 # Protein_GI_number: 11499835 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Archaeoglobus fulgidus # 4 97 9 103 368 103 53.0 9e-23 MQRFEDYGLSQEVLNALEKKGFEEPSDIQKLVIPELLKERTHLIGQAQTGTGKTAAFGIP ILETLEADKTVKALILAPTRELANQVADEIYSLKGKKI >gi|261747504|gb|ADAD01000120.1| GENE 2 568 - 2022 2049 484 aa, chain + ## HITS:1 COG:FN1975 KEGG:ns NR:ns ## COG: FN1975 COG0513 # Protein_GI_number: 19705271 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Fusobacterium nucleatum # 1 416 108 524 528 425 54.0 1e-118 
MAVYGGASIENQIKKLKSGVDIVVGTPGRVMDLMRKKVLKVNNLDYFVLDEADEMLNMGF IEDIELILEETNDEKKMLFFSATIPKTIMSIAKKFMNDYKLLKVKKEELTTDLTEQIYYE VKQEDKFEALCRVLDYTQNFYGIVFCRTKSEVDEVTNKLKARNYDAECIHGDITQGLRQK ALDLFKKKILTILVATDVAARGIDVSNLTHVINYSIPQEAESYVHRIGRTGRAGHKGIAI TFVTPREARTLSQIKRVTKTDIKRETIPNVEEILNAKKEALIAYIDEIIKEEDYTSYEGL ADKLIEGRDPKQVLSSLLRHFYEDEFLPENYSEIEDVKVKIDDKTRLFIALGSKDGYNPG RLLDLLNKKAKTPGRKVKDIKIMDKYSFITVPLQEAEYIMRALNSKKDSKPLVEKATGGQ SGGGSSEGKRTRGRKSEGESKRRRKSDDKTSKKAKKSGDKKVSSKTKEKSKKTEKASAKK KKSK >gi|261747504|gb|ADAD01000120.1| GENE 3 2133 - 2777 828 214 aa, chain + ## HITS:1 COG:BS_ytgI KEGG:ns NR:ns ## COG: BS_ytgI COG2077 # Protein_GI_number: 16080001 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Bacillus subtilis # 48 205 3 160 167 122 40.0 5e-28 MKRILTVILAGIVIVSCGKTKQNEQKMSGTNEQYTQYLDTLKSENELKITMGGKPITIVG KETKTGDKLKEIPLVISPKLEEKNILEDKAVKVIYTAPSLDTKVCSIQTKMLNTAAEKFT DVKFYSVTVDTPFAQERFCSSNGINSLKPVSDYKYHQFGIQNGFFIKEAGLLTRALMIVD ENNIVKYIEYVKEEGDEADVQKAMKFLEEKVIKK >gi|261747504|gb|ADAD01000120.1| GENE 4 2899 - 3234 585 111 aa, chain + ## HITS:1 COG:PA3339_1 KEGG:ns NR:ns ## COG: PA3339_1 COG1752 # Protein_GI_number: 15598535 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Pseudomonas aeruginosa # 45 107 25 87 308 97 65.0 4e-21 MKKMIYVVFMFILLGNLIFSENETPKENQKVKVENNDENRKDKHIGLVLSGGTAKGLAHI GILKVLEEEKVPVEYVTGTSMGSIVGGLYSVGYTPDEIEKIATEMDWLSLF >gi|261747504|gb|ADAD01000120.1| GENE 5 3185 - 5110 2437 641 aa, chain + ## HITS:1 COG:XF0066 KEGG:ns NR:ns ## COG: XF0066 COG1752 # Protein_GI_number: 15836671 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Xylella fastidiosa 9a5c # 52 261 170 368 395 109 31.0 2e-23 MRLKKLRQKWTGFLFFDDKIERKEKGLSRNLIEDRNTVALPTEKFIPKIPSGAVGGKSAS EKLNELFYGAEGVENFKKFPKKFALVATDLNTGEGVMIDKGSIATAIRSSLSLPSVFNPV 
ESGERLYVDGGVVRNLPVQDVKVLGADYTIGVNVGEGFSKRNPEKLNIVDVISDSMTIAG RQEVERQIRMLDLYMAPNLEKIGSYDFQKVKDIIAAGEKVARENIESIRKLSDPVKFAEL EEKRKEFRKSWKEEYYIKNIEIKGNKKYKNNYFDRYIPKNLGKMNRKDMEKIVNDLYKNG NFSTVYFEIKNSDTLVINVQEKAGNYLTLSGNVNNEDLAISNIGVQGNRTINNIDTRYML RATIANEHGINGVGIMSMGKDNRMLLIGTFDFKKDIIKNQYHNGNKYSFNNRRFKTNLGI GVELSRSTLLILSGGYQTSHVNGNPDEKSDQKIRFPYFEASVTQDNRDSIMFPTKGTYFK AEYTTSNSKNADFNALYVKGEVNIPLGKNFTLTPSAVYVTSRGDKVPETYKPKAGGFQED VYSLEFAGLPTDKMRANSIFIGKLNFQYQISRFFFAGVNASVASFSDKSFSFGKERKESY GLGFGVRTPLGPGYIGVAKTPGEGVKYFLNFGYEPKAFNEN >gi|261747504|gb|ADAD01000120.1| GENE 6 5135 - 6046 1155 303 aa, chain + ## HITS:1 COG:BH0676 KEGG:ns NR:ns ## COG: BH0676 COG1597 # Protein_GI_number: 15613239 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Bacillus halodurans # 7 237 1 229 295 120 32.0 3e-27 MSDKNVLKKALLVYNPKSGNADMILNNFDLIASMLLEKGITLTLYSIRKKYDILTEILKK EKYDILILSGGDGTLSRSLSLLYKENIDFPDVAIFPTGTSNDLAKSLDLGENIEKWLDNI IHGTAKPVDFGLINEKTIFLSSYAGGLFTKISYNTDKNLKKVIGKAAYHITGLGELTNIK KFDLNITLDNGEVISEKAILYMILNGKSVGGFDGVIDNADVSDGLMNIIIVKNIENPFDI PQILFDLINSNLVNNDYVRTLTAKRCIIEKVNEEIGVSIDGEEGVNEQVEVKFIGNKLKI FRK >gi|261747504|gb|ADAD01000120.1| GENE 7 6244 - 7833 2159 529 aa, chain + ## HITS:1 COG:sll0735 KEGG:ns NR:ns ## COG: sll0735 COG1543 # Protein_GI_number: 16332203 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Synechocystis # 3 525 4 527 529 539 51.0 1e-153 MNGYLSLVLHAHLPYVRHPEHKEFLEEDWLYEAITETYIPLLEMFENLTRDRIQWNLTLT MSGTLVNMLNDSLLRERYIRHMDKLIEFCELEIERLKEYPDMLKVANHNLWFNKRAKSVY EDKYGRDLVGAFRKFQDQGNLEIIPVTATHGFLPLMKDYPEAVNAQIYMAKKDYQKNFNR DPKGIWLAECAYYPGQDKFLEKHGIRYFLVDAHGIMHSDPRPVYGIYSPVYTKNGVAAFA RDLESSEQVWSSEIGYPGDGAYREFHKDAGYELDYEYVKPYLHSDGIRRNMGIKYHAITD KKGSFKACYDPDLAYGRAKEHAYNFVHNRSKQIEFLASKMKHRKPIVISPYDAELYGHWW YEGPIFLEWVFRATAESNFSTITPYQYLERYPTNQIVDVSMSSWGANGYYDVWIDGSNDY 
VYRHLHKATKKMIELANEREPVNELEYRALNQAARELLMAQTSCWEFIMFTGTMVGYAHK KISDHIHRLFKIYEDFKNGMLDESWINEIEYRDNIFPEIEYRMYRSDRL >gi|261747504|gb|ADAD01000120.1| GENE 8 7945 - 8208 94 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038448|ref|ZP_06011822.1| ## NR: gi|262038448|ref|ZP_06011822.1| hypothetical protein HMPREF0554_1786 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1786 [Leptotrichia goodfellowii F0264] # 1 87 1 87 87 135 100.0 1e-30 MLVKVENSPEGSNIDFYLANYFFFREKFECWGRYRKKEKENFEILQVLLLFKKYINPFVI IILSVLIFMVLLEIEIQVIKNKRKKNE >gi|261747504|gb|ADAD01000120.1| GENE 9 8623 - 9771 1562 382 aa, chain + ## HITS:1 COG:CAC2972 KEGG:ns NR:ns ## COG: CAC2972 COG1104 # Protein_GI_number: 15896225 # Func_class: E Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes # Organism: Clostridium acetobutylicum # 1 382 2 378 379 261 42.0 2e-69 MIYLDNAASTKPKEEVVNTMIEVMKNSYANPDAIHEFSHEIFLKIKNSRKTVGNFLGVLP ERVYFTAGGSDGNNLVLQGIIEANSKTRKHLITTKIEHPSVYETFRNYEKKGFDVDFLDV DKNGYVDLEQLKNIIREDTLLVSIGTVNSEVGSIQNLKEIAQIVKGKSKDIYFHTDFVQG FGCTDIKFDKIPVDAITVSGHKIYASKGIGAVYVADGVKLTNVIYGSNSENGIAKRTMPT ELILGFAKAVELLDKNYKKDMEYLQNLKAEFAKKIEENISDIRINSLFDIEKSSPKVLNV SFRGTKGEVLTHFLGMNEIYVSTGSACSSKKGNSRVLSCMGLSQSELDGAIRFSFSYENT LEEIDKVVEVLKKSVEQIRKMR >gi|261747504|gb|ADAD01000120.1| GENE 10 9885 - 11072 1674 395 aa, chain + ## HITS:1 COG:CAC2971 KEGG:ns NR:ns ## COG: CAC2971 COG0301 # Protein_GI_number: 15896224 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine biosynthesis ATP pyrophosphatase # Organism: Clostridium acetobutylicum # 4 385 1 379 384 293 43.0 4e-79 MDYMNKSLLNSVGLSYGELSLKGKNRGQFEKMLKNKINKALSGYEFKLNDDLSKLYVDIK SEDMEKVIEKLKKVFGIVGLNPSVKVDRNDEIIKEKVLEIVNFAYENGARTFKVAVNRSN KGFEKKSMDYAKELGAHILVNSPFEQVKMKEPDIQINVDIRKNVYIYTEKIKTYGGLPIG STGKGLVLLSGGIDSPVASFLMAKRGMRINFVTFHSFPFTSKQALEKIKELTEILSIYTG KSRLYAVNILKIQEAINTKTKKELATILTRRAMMRLAEKLANSMQYQALITGESLGQVAS 
QTLGGLTCTNASVEKLPVFRPLIGADKTEIIEIAKEIGTYEKSIEPHEDSCVIFAPKHPV TNPKLEDVLNEEKRIENYDELLEEVFSEKEYFNIG >gi|261747504|gb|ADAD01000120.1| GENE 11 11089 - 12402 1671 437 aa, chain + ## HITS:1 COG:FN1154 KEGG:ns NR:ns ## COG: FN1154 COG1295 # Protein_GI_number: 19704489 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 33 387 1 351 396 157 28.0 6e-38 MGECMKLKFDISGMSEKLSKENIKKHIREAFQMINLIYRNYSDGETQILACTLTYFSMIA VFPILALVLGITKGFGLDKLVIHKIFELVPQNEGMVRTVLDIANKLLASTQGGILTGVGV VILINSAIKVLMMLEDAFNKIWHISKNRSMTRRIVDYVAIIFLGPILIIVIVATNSFVVE KMSTMLFGGTVIVNIFLHVFGPLFYVFLFTLLFYVIPNTNVKLKPAFISGIITVLLCYVL KVAFTWLQSSITKYNAIYGSLALVPIFLTWVQYMWVTILLGAQIAFSIQTSDEFLYNEKV EMPIKLRKEAGILILTLIIARFKEKKEPYTYLELTKKLGVEAQFVKDILSELEKMGLVNE VIIDRNEDIKYQIAYDPESLSFEEFLDKFEGKNFDYYSDIFDDLEETEKELLVKIRENII LKNGRLLKNMEAEIKNN >gi|261747504|gb|ADAD01000120.1| GENE 12 12413 - 13780 457 455 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739394|ref|ZP_02146805.1| 50S ribosomal protein L36 [Phaeobacter gallaeciensis BS107] # 121 444 84 384 409 180 33 2e-44 MGNEDLQILKNQISGNYLIIMILSLVIVIFCVMIFIYLFKSGKRQQEEIIYLKDEMEKIL TEGFVASTEKNISETGKKLNEMEKTITLNTEKSLLRGISDLNKELSGNNEKLLVRFTDFG TKLGSTMNENNQKLSENINRFKDEFKKGINDDFEMLNGKIEKRLDMMNTKVEERLSKGFE ETTKTFGNVLERLSKIDEAQKKIEALSSNVVSLQDVLTDKKSRGIFGEVQLYQILVSVFG ERNDKIYQRQYKLSNGTIVDSIIFAPEPMGNIGVDSKFPLENYRKMYDIELTAIERENAR KEFSINLKKHIDDISSKYIIPGETSEQAVLFLPAEAVFAEINAYHTDIIEYAYRKNVRIT SPTTLMSVLTTLQVIMTNIERDKYAHVIQEELMKLNKEFDRYQERWNVLEKDIEKVSKDV KNITTTSNKISKRFSEISNVNLIENNGRERKNDDN >gi|261747504|gb|ADAD01000120.1| GENE 13 13749 - 14789 1674 346 aa, chain + ## HITS:1 COG:CAC0892 KEGG:ns NR:ns ## COG: CAC0892 COG2876 # Protein_GI_number: 15894179 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Clostridium acetobutylicum # 8 343 1 335 337 357 55.0 2e-98 MVGKGKMMIIKVDGNVSLETLKKMIARLESENKVKVELIKGEEYLVLGLVGDISTIDIKH 
IQSLEYVIDVQRIQEPYKRASRKFKNDDTTVKVGNTEIGGNNLVMMAGPCSVESEEQIIA TAKAVKKAGANILRGGVVKPRTSPYAFQGLGMEGIELMKKAKEETGMPIICEVMSIEQLH EFGPHLDMIQIGARNMQNFDLLKEVGKTKIPVLLKRGLSATIEEWLMSAEYILAGGNENV VLCERGIRTYETAYRNVMDLNAVPMIKKLTHLPIIVDSAHGTGKYWMVKPLAMAGIAAGA DGLMVEVHPEPDKAFSDGPQSLKPEVFEDLMKDVEKIASVLGKSFK >gi|261747504|gb|ADAD01000120.1| GENE 14 14813 - 15409 723 198 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229210324|ref|ZP_04336721.1| acetyltransferase, ribosomal protein N-acetylase [Leptotrichia buccalis DSM 1135] # 1 198 1 199 200 283 69 2e-75 MENETQKMYIKKITGDNVYLSPISLDDTVQYTHMVNDLKVSVGIGHPAYTNIMDYERERE FLGSIKDDKTFAVRLIENDELLGNIELFNVNILQKNAVLGIMLGNPEYQRKGYGKEAINL ILDYGFSFLNLYSVSLTVFEYNEVAYNLYKKVGFKEIGRLRKRVEIMGKRYDEIIMDILK EEFESVYIKREIEGRYRL >gi|261747504|gb|ADAD01000120.1| GENE 15 15508 - 15762 427 84 aa, chain + ## HITS:1 COG:no KEGG:Dtox_4301 NR:ns ## KEGG: Dtox_4301 # Name: not_defined # Def: prevent-host-death family protein # Organism: D.acetoxidans # Pathway: not_defined # 1 84 1 84 84 89 65.0 4e-17 MLAANFTTLRNNLKNYCDEVSDNNETVIVTRKKEKNIVILSLEKYNELEKAAKNAEYLAM IDRRMEKYLLGKYQQHELIEDKNE >gi|261747504|gb|ADAD01000120.1| GENE 16 15755 - 16012 202 85 aa, chain + ## HITS:1 COG:SA2195 KEGG:ns NR:ns ## COG: SA2195 COG4115 # Protein_GI_number: 15927985 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Staphylococcus aureus N315 # 1 83 4 88 88 82 54.0 2e-16 MNKLWVDEAWEEYLYWQVIDKKIVKKINELLKDISRNGVSEGIGEPESLKYRKAWSRRIN KKHRLVYFIENSNIVILSCKGHYEE >gi|261747504|gb|ADAD01000120.1| GENE 17 16035 - 17117 1343 360 aa, chain + ## HITS:1 COG:CAC0894 KEGG:ns NR:ns ## COG: CAC0894 COG0337 # Protein_GI_number: 15894181 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate synthetase # Organism: Clostridium acetobutylicum # 1 360 1 351 356 291 47.0 1e-78 MQILNVNLGENSYDIVIGTNFYNEFSEYIKNIYRGKKLFVITDSNVNRIYEEKYNKMFKD FDYNVYVLEAGEENKHIGVMQGIYSAMVKAEIKRSDMVVAFGGGVVGDIAGFAAASYLRG 
VNFIQIPTTIISQVDSSVGGKVGVDLPEGKNLVGAFYQPKLVLIDSCFLNTLSDRYFYDG FAEIVKYGCIYDRDLFEKLENIVNSSEIAKSNKKELREHLMKYINGIIYRSCEIKKEVVE KDEKESNLRMILNFGHTIGHAIEQYTNYKKYSHGEAISAGMVDIIKIGEKKGITEKGCSE KVELLLKTLNLPTNIEYDKLKITEIMKRDKKSIDGGINFIFLKNIGNVEIIKMNGEEIFE >gi|261747504|gb|ADAD01000120.1| GENE 18 17139 - 18395 1585 418 aa, chain + ## HITS:1 COG:CAC0895 KEGG:ns NR:ns ## COG: CAC0895 COG0128 # Protein_GI_number: 15894182 # Func_class: E Amino acid transport and metabolism # Function: 5-enolpyruvylshikimate-3-phosphate synthase # Organism: Clostridium acetobutylicum # 3 414 4 410 428 399 52.0 1e-111 MKVKIKPGILNGTIEIPPSKSYSHRAVIAAALASGEKSKIDNLKFSVDITTTTDIMENWG AKVIRGENFLEIVGNDGKVVPRDKYTQCNESGSTIRFLIPIGITSENELIFDGKGKLVDR PLDTYYKIFDKQKIFYKTEKGKLPLEIKGKLQSGIYEIDGNISSQFITGLLYALPLLEGN SKIVINKSLESKGYIDLTLEILKIAGIEIINNGYKSFDIRGNQKYKSFDYTVEGDYSQVA FWIVAGIISSNRDNKIKCLHVNKNSLQGDREIIEIVQRMGANLEIYDDYVIVKPSKTKGT VIDVSQCPDIAPILTVLGALSEGETQIINGERLRIKESDRITSIKTELNKLGAKVEEKGD DLVIQGVDGFKGGVTVSAWNDHRIAMSLAIASTRCEKEIVIEEAESVKKSYPHFLGRV >gi|261747504|gb|ADAD01000120.1| GENE 19 18830 - 19927 1518 365 aa, chain + ## HITS:1 COG:CAC0896 KEGG:ns NR:ns ## COG: CAC0896 COG0082 # Protein_GI_number: 15894183 # Func_class: E Amino acid transport and metabolism # Function: Chorismate synthase # Organism: Clostridium acetobutylicum # 1 355 1 354 356 388 54.0 1e-107 MGSNFGKNYCISIFGESHGTALGINIEGIPAGTELDLDFIREEMKRRAPGRSELATPRVE KDEFEILSGFINGKTTGTPLAMIIRNADQRSKDYSEIAKKPRPGHADWSGMNRYNGFNDI RGSGHFSGRITAPLVFAGAIAKQLLKEKGILIAAHIKSIKDIKDRDFEESDITQKNIDKW RKMILPVLNEEIIPKMEETILKAKENKDSIGGTVEITVMGMKPGIGNPFFESMESELARI MFSVPSIKGIEFGAGFDIAKMTGYEANDEMYYDEKGEVKSYTNNNGGLIGGITNGMPINF KVAIKPPASIGKRQKTVNLETEKDDFLEVTGRHDPAIVPRAIVVLEAATAIVILDQLLEA RKYNI >gi|261747504|gb|ADAD01000120.1| GENE 20 19972 - 20103 257 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNPKVDLGYAIKDWRKALDKIKRQDIRIDNDDEEKKFWKRYIV >gi|261747504|gb|ADAD01000120.1| GENE 21 20226 - 20882 722 218 aa, chain + ## HITS:1 COG:no KEGG:BMD_0900 NR:ns ## KEGG: BMD_0900 # Name: 
not_defined # Def: hypothetical protein # Organism: B.megaterium_DSM319 # Pathway: not_defined # 1 216 77 276 283 120 34.0 3e-26 MEIGPGWGNYTFDLAELSCEMTCLDLSSDNLEYIKTVTDLKNIKNIKYINKKVEEANIDS YDLIFGYNCFYRMYDLDKVLEEIDKKSKKIAMIGMATGVEPEYNNLFEEKLKLPMDWREN DYIYLTMILYSKGVDVNQIVIKNKRKYTFDSIESILKHESVRIDYKTLPLDLFSKNKTKE EEKLYSEIKDVLLQYYKYEDGKYVYEYEFNSMLLYWTK >gi|261747504|gb|ADAD01000120.1| GENE 22 21191 - 22753 2076 520 aa, chain + ## HITS:1 COG:lin1223_2 KEGG:ns NR:ns ## COG: lin1223_2 COG1263 # Protein_GI_number: 16800292 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Listeria innocua # 92 514 1 381 403 273 41.0 6e-73 MKSYKDEAKELLKLIGGKENVVSVTHCATRLRFALADDAKASPKDIQKLDSVKGTFTNAG QFQVIIGNDVGNFYNDFIGVSGIESGSKEDVKKAAMTKQPFLQRMVAHLAEIFVPLIPAL VAGGLMLGFGNFLSQGMKSLGGKSLVDIYPLAANLKLYTDWIGGSVFGMLPVLVTWSTVK KFKGNEALGIVLGLMLVAGVMLNAYVYGNISSSRELSDAGVHLKDIIINGSAKYTHLFKD AQNPMGKDPFGIEAGKYILNLGFAKIMMIGYQAQVLPAMFAGIAMCYIQKFIDKRTPEVL KLVWVPFATLVITGFLTILFIGPAARTLGEWLTNLFQFLFKTPGLRYIGALIFGTTYAPL VITGLHHTFIAVDLQLASSAEGGTFIWPLIAISNIAQSGAVLATYFLYKKDKKQESVSLS ATVSAWFGITEPAMFGANLKYMYPFYASLIGSAVGAVICTAFNVLASGIGVGGLALAFLS ISKNRGAFWIASLAAFGLAFGLTFFFSRFKKLNKGSLTGE >gi|261747504|gb|ADAD01000120.1| GENE 23 23238 - 24803 2454 521 aa, chain + ## HITS:1 COG:FN1652 KEGG:ns NR:ns ## COG: FN1652 COG0747 # Protein_GI_number: 19704973 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 17 518 3 497 499 439 44.0 1e-123 MKKENYSGANTPNLKLKMNKMLLMMMAVLLVFISCSKGKGEGDKKGASEPKTVKTVNVGQ GYVASGLDPGDGGTGWALTSHGISENLYFVNKKGELDTRLVESISQKNDNEWEVKLKKDI LFSDGTKVDAKAVSDALNRTNEKNAIARGTAGALKFSPIDEFTVNIKSEKPTKIMKAVLA EWTNVIYKVNDKGEFIFTGPFAVESLTPNTEVKLAPNKYFPDAEKRPNVTVKEFKDPNAL KLAFENNELDIAVSIPSEFAESLQKSGHTLKPMEVGYQYFAFVNLTNKVLSNAKVREALD LAINRDEMIASLHGGKKPTGFFAGYFPFTGKADISSDVNKAASLLDEAGWKLNTQGIREK 
DGKPLSLTLVTYPQRPDLVTLMQVMSSQLKKMGIDAKTEISENIGEVGASKKFDIMLYAQ HTASTGVPTFSLNQFFRPKASNNYTGYSSKEFEDVMKKLDATGDQQEMIKLAVEAQDILK KDRPVLFLVDPVWYVGLSDKVKDYEVWGADYYIVRADLTVK >gi|261747504|gb|ADAD01000120.1| GENE 24 24825 - 25748 966 307 aa, chain + ## HITS:1 COG:FN1651 KEGG:ns NR:ns ## COG: FN1651 COG0601 # Protein_GI_number: 19704972 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 7 303 5 301 307 223 47.0 4e-58 MKYTQLLKETVKILYLLFVISVVVFFFIRLLPSSPEETYLRSTGVPVTAENLKNLRAEWE LDRPLIEQYLKWIGNFIKFNWGTSLITKNDIKTEMLTRLPYSLGIGLGGVLISAFLSFFL GYLSSLKRNGFFDRFTRRLSLMSQTVPSFILAIIIIYFFSIKWKLTTFFFNKDLSNFFLS IFIISLYLMGKFSRIVRIHFREQMTKTYILAAISRGFSEKYVLFRHAARPVIYGLISAVI SEFSWAIGGTAVLEVIFIIPGISTFLVESIKYRDYNVIQSYILIVVIWMIAVKIFFNIIL KYLDIKE >gi|261747504|gb|ADAD01000120.1| GENE 25 25757 - 26275 535 172 aa, chain + ## HITS:1 COG:FN1650 KEGG:ns NR:ns ## COG: FN1650 COG1173 # Protein_GI_number: 19704971 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 33 160 12 139 238 118 50.0 5e-27 MKKISKKYIVYLCVLVLIIIISCLFPVYDTDIINLNEKFLKMSGKHILGTDYLGRDIYSL MIKGGMRTLEVIVIASSFSFTTGIFAGMITGYTDSPLNVIPDFLIDLFMIIPTFILALVI TSQFGLTPLNAGLSIGIGSIGGYYNQTKKLVQEIKKRNLLSLQECSVQETAE >gi|261747504|gb|ADAD01000120.1| GENE 26 26335 - 26565 275 76 aa, chain + ## HITS:1 COG:FN1650 KEGG:ns NR:ns ## COG: FN1650 COG1173 # Protein_GI_number: 19704971 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 2 65 174 237 238 75 56.0 3e-14 MSGISLQYASLTFIGLGSDINNPDWGTMLYQYRIYLFDYPMMLLWPSLGIFLIAFVSNRL FDDREINNIKQGTIYD 
>gi|261747504|gb|ADAD01000120.1| GENE 27 26558 - 27295 316 245 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 26 238 38 242 329 126 34 3e-28 MTDNIILEVKNLVVSTVNENKKNTILDNVSFTLEKGERLGVIGDSGTGKSVLMLSLTGLL PKNNFEVTGEILYKGKTDLLKFDRKNMRKFTSEKINMILQDSMNILNPYRKISDQITETL IFKEKISKKEAYKKAVKILKDLNISAAEKRIKDYPYQFSGGMKQRIGIALSLLGKADILI ADEPTTSLDAINQEKILDKINRHIKENNISLIYISHDMRVISKMSDRIIVMEKGKIIKEK LAVDL >gi|261747504|gb|ADAD01000120.1| GENE 28 27437 - 28027 663 196 aa, chain + ## HITS:1 COG:no KEGG:FN1814 NR:ns ## KEGG: FN1814 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 193 1 192 192 119 37.0 6e-26 MIKIVTEKLKPASSKKENVILIITVVLLILTAALLIKLRKTEAPKQKIKENEVSSYSDFS NIEESIYADLINFVSEIKMKGKDEELPTIKQLEDELMPPFTKDITWEERGKLTWERIDKN KKTYYVGLCENVMTSGNFLVSVDNENKENTKILFIKTHIHKDEVEFAIDENFPGWKEIVP YTGENERQKFEGKGDN >gi|261747504|gb|ADAD01000120.1| GENE 29 28027 - 28926 1380 299 aa, chain + ## HITS:1 COG:FN1812 KEGG:ns NR:ns ## COG: FN1812 COG0803 # Protein_GI_number: 19705117 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Fusobacterium nucleatum # 2 299 3 301 302 349 58.0 3e-96 MKKFLLVISMIFMSVISYGKMKVGVTLQPYYSYVSNIVGDKMEVIPVIRGDLYDSHSYRP RPEDIKKMSTLNILVVNGVGHDEFVYDIVKAAKMNGKVKIINANKGVSLMSVSGMRGKTK IVNPHTFISITTSIQQVYTIAKELGQIDPANKDYYNKNAQAYAQRLRNLKAAAIKKVSHL KNLDMRIATSHAGYDYLLSEFGLKVRAVIEPAHGVEPSASDIKAMIDIMKRDKIDVLFVD AQVQNKYSTTIQKATGVRIKSLSHMSSGPYTKDSFEKFMQYNLDSLTNAMLEVAKAKGK >gi|261747504|gb|ADAD01000120.1| GENE 30 28928 - 29140 302 70 aa, chain + ## HITS:1 COG:FN1811 KEGG:ns NR:ns ## COG: FN1811 COG1121 # Protein_GI_number: 19705116 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn/Zn transport systems, ATPase component # Organism: Fusobacterium nucleatum # 1 69 1 69 229 89 63.0 1e-18 
MSGILIEIKNLTLTLSNTKILKNINLTVEPGKIHCLVGPNGGGKTSLLRSILGQVPYEGE IFISYEKTKK >gi|261747504|gb|ADAD01000120.1| GENE 31 29131 - 29616 170 161 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 1 137 82 213 309 70 32 6e-16 KKIGYVPQFLDFERTLPITVENFLSIIYQKKPCFLGVSRKYKKTIDDLLKKIGMYEKRTR LVGNLSGGEKQRLLLAQAIHPEPDLLILDEPFTGIDKLGEEYFKSVIKDLKNKGVTILWI HHNLKQVIEMADTVTCIKKEIQFSGDPKKVLDEEKILTVFS >gi|261747504|gb|ADAD01000120.1| GENE 32 29658 - 30551 1070 297 aa, chain + ## HITS:1 COG:FN1810 KEGG:ns NR:ns ## COG: FN1810 COG1108 # Protein_GI_number: 19705115 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Fusobacterium nucleatum # 1 297 1 297 297 307 68.0 1e-83 MLELIRNFFIDMVNRQILPDYFKYAFVINSLICTLFIGTILGGIGTMVVTKKMAFFSEAI GHAALTGIALGVLVGEPYNAPYVMLFTYCILFGLLINYTKNRTKMSTDTLIGIFLSISIA LGGSLLIFVSSKANSHMLENVMFGSILTVSDFDIKVLIVTMLILGIVLIPLFNQMLLSSF NVNIATVKGVNVKLTEYIFIIVITVVTIVSVKIIGAALVEALLLIPAASAKNLSKSMKSF FFYSIFFSLISCVLGVLVPLHFNISIPSGGAIILTASCIFFITMIIKNISRKFSEGD >gi|261747504|gb|ADAD01000120.1| GENE 33 30577 - 31452 1248 291 aa, chain + ## HITS:1 COG:FN1809 KEGG:ns NR:ns ## COG: FN1809 COG0803 # Protein_GI_number: 19705114 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Fusobacterium nucleatum # 17 290 9 281 283 214 41.0 2e-55 MKKTIILMILLTVNILYGKEKVLTSIQSVYSIAQNITKNTDIEVYSIFDSDISMDYGKSA FDNKDLDLAVAKNAVAVIDVAKVWENDYLYEYVRRQNIRIIEIDASYSFSGDDYSALSLL NYKNGERNPYVWMSLQNTIKMADIIAADLTRLFPKNEKVIATNLKKFTEELKDLENNYLR ETLNAKSLSVIALTGNLDYLFNELNIFANHVEFSQMTSENIDKIMKENGSKIIVSDKWLK KEVISEIEKKGGKFIVLDIFNIPRELNEKMDPDGYLKGMKENLDKLAKALK >gi|261747504|gb|ADAD01000120.1| GENE 34 31479 - 31898 563 139 aa, chain + ## HITS:1 COG:no KEGG:FN1808 NR:ns ## KEGG: FN1808 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: 
not_defined # 15 139 2 125 125 152 60.0 6e-36 MKKIMLICLFLISSVIFAHAPLLSVDDNKDGTIYIEAGFSNGEKADGMELIIVKDKAYNG PEDTYEGKLIIFKGKFDAKSSMTIIKPLAAKYEVIFNGGPGHVTSKKGPKLEESEMSKWK ENIQKAYYLGEWKEKMTQK >gi|261747504|gb|ADAD01000120.1| GENE 35 31963 - 32763 1258 266 aa, chain + ## HITS:1 COG:FN1807 KEGG:ns NR:ns ## COG: FN1807 COG5266 # Protein_GI_number: 19705112 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Co2+ transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 264 1 262 264 193 42.0 3e-49 MKKFLGVLAVLTAATVNLYAHNQFIYTDTLNVTGKSSVPFKVIFGHPYHGGEDKPIPVGK VKEKTHLAEKVFAVHDGQKIDLTAKVKEGQLKTDKASGRTLDFVIDSELKGAGDWVIIAV PGLTVDDSISYSFNGIVKTVITKEGSKGDDWKKRVADGYYEIIPFTNPSMVNVNSVFKGQ LVDKKGNPMKNTDISVNYVNGKLDAGKGTFTGKLQNEKVSMSTYTDDNGYFVLSFPHKGL WSIRGKAFVDREKKYVEDTSLLIEVK >gi|261747504|gb|ADAD01000120.1| GENE 36 32983 - 33366 368 127 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038420|ref|ZP_06011794.1| ## NR: gi|262038420|ref|ZP_06011794.1| NADH dehydrogenase subunit 5 [Leptotrichia goodfellowii F0264] NADH dehydrogenase subunit 5 [Leptotrichia goodfellowii F0264] # 1 127 1 127 127 143 100.0 5e-33 MEDIKIMTLIITTAILIWEIFKWRSLIKIKFELSSFIFWENLISAGIVALFSWFLWYWTL FNEYHLFSSGYITIALILTIFIIIFISKMVIFSLRKSGVKIKTIFCILTTIMEILVFVIY PIEDIVF >gi|261747504|gb|ADAD01000120.1| GENE 37 33599 - 34963 1494 454 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0877 NR:ns ## KEGG: Lebu_0877 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 430 1 430 454 161 31.0 6e-38 MKKAILFILLILAVISCDLEKAQKEYEKGQYIQSIETTLKYFDKNENRIKKVNSKVKDEI VSKFSKITEYYSDMASNFGNEEQKINANINLFKIYIMLEERNYTKGFTDFTSKYKGEDFY SKANKLILKKYRDYFDSGLYDKAAEELGNYDGFSQILSRMNKMKFSKYERIYESSLKTKA DNLIKIAEKYEEMKEYRKAEKSYFQVSETYKNYEKNYRNSYDKYTENKNKADLADAEKYY EMGKQALKESSQKEKYRKAYEYFKKSDSLVKGYKDAVNLANLYYKQGFFRYKIVGDKKYK NVIEEALKDIGYRDDSNPELIIEYTERNYYISEPMTYIEKLSEKTPSGIGNDMQILYREN PFDKIITSVKEKIRTDYRIRIDGVQYKDSYSNYLIEENNSEKIRYKGPNVPLAYRDENKG KELGKEEMQKLIDKKVKEDIIQRIKIMEKRLERI 
>gi|261747504|gb|ADAD01000120.1| GENE 38 35027 - 35785 700 252 aa, chain - ## HITS:1 COG:no KEGG:FN0484 NR:ns ## KEGG: FN0484 # Name: not_defined # Def: lipase (EC:3.1.1.3) # Organism: F.nucleatum # Pathway: Glycerolipid metabolism [PATH:fnu00561]; Metabolic pathways [PATH:fnu01100] # 18 251 5 240 240 259 56.0 8e-68 MVILAIITFLSLKITKILLYHEYGVKKFGTFNNDVVITFHGIYGEYKDMEVIDKALEKEG YSGINIQYPTTEGTIEEISEKYIAPNVQNILKTVEEDNKKRGKQGLPEKKVHFVVHSMGT GILRYYLKTHRINDLGKVVFISPPSHGSQLPDNPISDILKDTLGEAVAQFKTSLNSFINS LGEPDYPCYVMIGNKSNNFLYSILIPGVDDGMVPFKTSRLENCKYKVMENDTHTSILEDK RTLDEIIEYFKN >gi|261747504|gb|ADAD01000120.1| GENE 39 35815 - 36609 691 264 aa, chain - ## HITS:1 COG:FN0761 KEGG:ns NR:ns ## COG: FN0761 COG1521 # Protein_GI_number: 19704096 # Func_class: K Transcription # Function: Putative transcriptional regulator, homolog of Bvg accessory factor # Organism: Fusobacterium nucleatum # 1 255 1 256 256 206 45.0 4e-53 MILGFDIGNTHIVPIFYDNNGKIKASFRIPTRLSFTEDTLFSMLKTLADNNNINIYDITD IIVSSVVPHINEIFEYLGQNYFNIQPEFISLDTIDDEIKLLDGMERGLGADRIADILAAK KIFPEKEFVIIDFGTATTFDAVKNSTYMGGCILPGIELSINTLFNNTAKLPKIKFEKPDT VFGIDTVTQINAGIFYGNVGTIKELISQYKKGMPEAYIISTGGQGRKISEYISEIDEYIP KLGEKGIFEFYKLRKNNRFIKEKK >gi|261747504|gb|ADAD01000120.1| GENE 40 36626 - 38263 1641 545 aa, chain - ## HITS:1 COG:Cj0204 KEGG:ns NR:ns ## COG: Cj0204 COG1297 # Protein_GI_number: 15791591 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Campylobacter jejuni # 9 536 11 615 665 109 24.0 1e-23 MTKYSGRSLTPASILIGTIGAILVAASSFYVVLKFGALPWPTIMVTLISMSALGLFGRKD SGEITVTHTIMSAGSMVAGGVAFTVPGYLLLGGKLENIDKSLFLITILTGSVLGAVLSFI FRKKLIEEDKLEFPIGDAAYNLVKSGKNKKNMKTVTLGMLFSTVVAVLRDYSFQKGKAPF IPTVYSIKNGLFSFYVSPLLLGVGYILGFMNTFIWFLGGAVTYLIAQPLSVYYNIKDFDI MKNSFGMGFVIGIGTSVIVKIIISTKRENKYKRIKNDNINIKIILGVFSVAAVALISFVY KLPLFLSFVLILICILCTVIAGYTTGKTGINPMEIYAIVTILVISFLNVLLNGLNIGGIR FSTNINLLILFLLACIVAVACGLAGDILNDFKSGFKMKVSPGEQFAGELIGAVVSSFVVA FLFFIFFDVYKNIGPKENSELIVLQASIVASIIKGIPFINIFWIGFGAGFTLNMLNLPVL 
TFGIGIYLPFYLTLPVFAGGLLSLIASRISQKFSSNMLLFSNGLMSGEAIVGVILSILAY IKLFI >gi|261747504|gb|ADAD01000120.1| GENE 41 38571 - 38678 157 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVRLKKSKKPKKQKKLKKTFDFLDKISYNKLVNKN >gi|261747504|gb|ADAD01000120.1| GENE 42 38719 - 40950 3413 743 aa, chain + ## HITS:1 COG:FN0262 KEGG:ns NR:ns ## COG: FN0262 COG1882 # Protein_GI_number: 19703607 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Fusobacterium nucleatum # 1 742 1 742 743 1283 82.0 0 MDAWRGFKEGNWTDKIDVTDFIKQNYTEYLGDESFLEGPTDATVQLWDSLKEKFKVEREK GIYDVETKVPSQIDAYGAGYINKDLEKIVGLQTDAPLKRAIFPNGGLRMVKNSLEAFGYK LDPETEEIFTKYRKTHNDGVFSAYTDSIKKARHTGIITGLPDAYGRGRIIGDYRRAALYG LDRLIAEREQIFKTQDPEEMTEDVIRQREEITEQIKALKALKRMAEAYGYDISRPAETAQ EAIQWTYFAYLAATKDQNGAAMSIGRVSTFLDIYIERDLKEGKITEKEAQEFMDHFVMKL RLIRFLRTPEYDALFSGDPVWVTESIGGMGLDGRSLVTKNSFRILHTLYNMGTSPEPNLT VLWSESLPENWKKFCAKVSIDTSSVQYENDDIMRPQFGDNYGIACCVSPMTIGQQMQFFG ARVNLPKALLYAINGGKDEKSKMQVTPEGKFPKIEGDYLEYDEVWEKFDKLLDWLAETYV KALNIIHYMHDKYSYEALEMALHDIDIKRTEAFGIAGLSIIADSLAAIKYGKVKIVRDED GDAVDYINEGEYVPFGNNDDKTDELAVKVVKVFMDKIRSHKMYRNATPTQSVLTITSNVV YGKKTGNTPDGRRAGAPFGPGANPMHGRDTRGAVASLASVAKLPFEDANDGISYTFAITP ETLGKNEVEKKGNLVGLLDGYFNQTGHHLNVNVFGRELLEDAMEHPENYPQLTIRVSGYA VNFVKLTKEQQLDVINRTISDKF >gi|261747504|gb|ADAD01000120.1| GENE 43 41061 - 41786 750 241 aa, chain + ## HITS:1 COG:FN0261 KEGG:ns NR:ns ## COG: FN0261 COG1180 # Protein_GI_number: 19703606 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Fusobacterium nucleatum # 1 237 1 236 243 350 69.0 1e-96 MEGYIHSFESFGTKDGPGIRFVLFMQGCPLRCLYCHNVDTWNVKDKKFMMTPEEVMKEIL KVKGFIRTGGVTVSGGEPLLQPEFITELFKLCKENDIHTAVDTSGYMFNNKVKEVLKWTD LVLLDIKHINPNKYKKLTSVSLEHTLEFAKYLSEINKNAWIRYVLVPGYSDDEEDLHEWA KFVSQFKNVTRVDILPFHQMGGYKWKEVGKEYKLADVKPPAREEVKKVEEIFKSYGLNVG I >gi|261747504|gb|ADAD01000120.1| GENE 44 42083 - 42568 577 161 aa, chain - ## HITS:1 COG:FN1902 KEGG:ns NR:ns ## 
COG: FN1902 COG2131 # Protein_GI_number: 19705207 # Func_class: F Nucleotide transport and metabolism # Function: Deoxycytidylate deaminase # Organism: Fusobacterium nucleatum # 2 161 13 172 174 218 60.0 4e-57 MNKRDDYLSWDEYFMGIAFLSGMRSKDPSTQVGACIIDEDKKIIGIGYNGFPMGSSDDNM PWNKEGDFLNTKYPYVVHAELNAILNSIKSLKNAIIYVTHFPCNECAKAIVQSGIKKVIY FSDKHKSLDATKASRKIFENAGVETVHLEIDKKEILIRFGD >gi|261747504|gb|ADAD01000120.1| GENE 45 42583 - 43923 1207 446 aa, chain - ## HITS:1 COG:FN1066 KEGG:ns NR:ns ## COG: FN1066 COG1570 # Protein_GI_number: 19704401 # Func_class: L Replication, recombination and repair # Function: Exonuclease VII, large subunit # Organism: Fusobacterium nucleatum # 3 424 2 403 404 325 46.0 1e-88 MEQTVFSVSEVNRAVKQFLEGTSTFKNILIQGELSNITYYRSGHLYFTLKDDKASVKCAI FKYIYKNIPTDLKEGDQVKITGSATIYEASGSFQIIAEALEKTDKLGSLYEKLERLKKLY LEKGYFDESNKKPLPLLPVNIGVVTAETGAAIRDIINTAHKRFRNINIYLYPAKVQGEGS AAEVSAGIEFFNKMKKEEQLDTDVLIVGRGGGSIEDLWSFNEEAVIEAVFHSEIPVISAV GHEIDNLLSDLVADKRAATPTQAAEILIPVKEELIRNLEIRKNNVNKLLLNKVTLMKKEL EQRKNNYYIKNYVGILNDKKLQLMEKEQRITRELKRIVERAREQFDYRKKKFEKTDLKMI LNDKKDKLNTFEKKLNNKISEILRELKKDLEYKTAKLSKYSINDILKQGYTITRKNGKVV KRGIELSKSDNIEIQFSDMKVKSTVK >gi|261747504|gb|ADAD01000120.1| GENE 46 43940 - 44290 553 116 aa, chain - ## HITS:1 COG:BH3485 KEGG:ns NR:ns ## COG: BH3485 COG1393 # Protein_GI_number: 15616047 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Bacillus halodurans # 1 113 3 115 119 108 50.0 2e-24 MRIYYYPSCSTCKKALKWLDNNEIAYEKKHIVDAKLTTKEIEDIFKKSGLPIQKLFNTSG RVYKELNLKDKLKDMTEKEKLELLASDGMLLKRPILVNRTFALIGFKEDEYKEKLV >gi|261747504|gb|ADAD01000120.1| GENE 47 44389 - 46362 2258 657 aa, chain - ## HITS:1 COG:FN0224 KEGG:ns NR:ns ## COG: FN0224 COG0556 # Protein_GI_number: 19703569 # Func_class: L Replication, recombination and repair # Function: Helicase subunit of the DNA excision repair complex # Organism: Fusobacterium nucleatum # 3 647 6 653 663 863 70.0 0 
MDFKIHSKFQPTGDQPQAIDKIVENIENGITDQILLGVTGSGKTFTVANVIERINRPALI MAPNKTLAAQLYNEYKQFFPENAVEYFVSYYDYYQPEAYIMQTDTYIEKDSSVNDEIDKL RHAATAALLNRRDVIIVASVSAIYGLGSPEAYKKRSIPIDVSTGFDRNELIKRLISLRYE RNDIAFERGKFRVKGDVLDLHPSYQDTGYRFEFFGDDLESIAEINTLTGQKIREIQRLTI MPATHYLSTEDSEKMFESIKSELHDRINFFERQNKLLEAQRIKQRTEYDLEMIAEIGYCK GVENYSRYLTGKNEGEAPDTLIDYFPDDMVVFLDESHISVPQINGMYKGDRARKESLVNN GFRLPSAFDNRPLKFEEFFAKVPQVVYISATPSDYELEQSKEEIIEQLVRPTGIVEPDIE IRPTKNQIDDLMDEIKIRTAKKERVLVTTLTKKMAEELTDYYLEFGIKVKYMHSDIDTLE RTEIIRGLRKGEFDVLVGINLLREGLDIPEVSLVAILEADKEGYLRSRRSLIQTMGRAAR NVNGQVILYADTVTGSMKEAIDEVNRRREVQEKYNIDNNINPRTIEREIAESIVDYEIIK EDSVDKVKKDYKNQAEIEKEIKRLNKEIKKAAEELNFEEAIKLRDKMNELKKLILEL >gi|261747504|gb|ADAD01000120.1| GENE 48 46801 - 47655 1154 284 aa, chain + ## HITS:1 COG:CAC2424 KEGG:ns NR:ns ## COG: CAC2424 COG4667 # Protein_GI_number: 15895690 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Clostridium acetobutylicum # 4 283 2 281 283 250 47.0 2e-66 MKENKTGLVLEGGGLRGIFTAGVLDFFLEKNVEFDGCIGVSAGACHACSYLSKQHKRAYN VSVDYLDDKRYCSFYSLIKTGDLFGVDFVYGEIPDVLNPIDNETYLKGKTKFQAVITNCE TGEAEYPYVKDMKKDVDYIRASSSLPFLSRMVKINGGLYLDGGISDSIPIKKSIENGNTK NVIIMTRDKNYRKAQSKLGKISAIRYKKYPKLVELMNTRYSRYNEILEYIYGQEKTGNVF IIQPEKPLNLGRIEKNREKLTAVYNEGYKEAEKRYEAMMKYLSE >gi|261747504|gb|ADAD01000120.1| GENE 49 47682 - 48953 1042 423 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1924 NR:ns ## KEGG: Lebu_1924 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 423 1 420 421 550 69.0 1e-155 MKVQKKWYKLDAFAKTYSSIISEGRTTCFRLSVLLSEEIDVELLKKVALILQKKYPFYNS ELKKGIFWNYLQHKKEDFTVEPETTYPCTDIKKKNPLRIIYYKKKISIEIAHFLTDGVGA AEFFKDLIEKYLRMKYFSKKIEDKHEKNEKCNSETAVLKENAENTDYTDLYSKYMKKVNK EKTVKSAYHLPVKILEKGQYHITTGEISVEDIKNESRKYRTTVGRYLLAVYFKVLLDKYP NIKKQIVVGVPVDLRKIFSEKTYRNFFINITPTVDPSLGNYTLEEIIEYLENYFKMKVNK KEFYKSIYKAINPMRNVLVKSIPYFIKRVFYPFVFDYYGEKGYTTGFSNLGILKLDKKYS 
EYIKGFRFLPPPSKRCKIKIGIISDENKVYLTFGNLSTNHEIERDYFIYLRKRGIRAKIS TNY >gi|261747504|gb|ADAD01000120.1| GENE 50 49012 - 49695 663 227 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1923 NR:ns ## KEGG: Lebu_1923 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 226 1 226 227 252 67.0 7e-66 MYCVKCGAELDEGVEKCPLCETPVLKMTDEENVIYDPEYPIININLYELKIKKVKKAVFL SFFTISIISILEVFFQNMIVYGEMKWGYYAIPSILIFDLMLFVLLNSHTLRQNLFLIFTG LTVYFLILDYGNKELTWSLRTGIPIAGAYYFVSFIFSFVWDKHKSDRIKIMNFFLFFVGV FLLILEFIISRRLSWSIWASVPLFVLNVMLRYAYKAYKEEFKKRLHL >gi|261747504|gb|ADAD01000120.1| GENE 51 49812 - 50708 1139 298 aa, chain + ## HITS:1 COG:FN0503 KEGG:ns NR:ns ## COG: FN0503 COG0583 # Protein_GI_number: 19703838 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 1 290 8 298 302 202 38.0 5e-52 MDIHHLKIFYEASNEKSFTKAAKKLYISQSAVSIQIKKLEHTLGLQLIERNSKNFKLTFA GKELFRMSRDIFEKISRMENEMRKILQYKKGKISIGATHNIGEPVLPGIMIEFRKNNPEI EFDLYIKNKESLIKHLKEGTVDIALMEEYFIEDKEIKVIETEEYPFVVVSGTEVRDYREL IDIPLLKRDTMLTNKYLDLFEKIIGFNLDKRISINGSIETMKNLIKNGLGFAVLPYYSVY EEIKKGTLKVIYSFEKSEDKFQIVFIRENEEKEGITKFVEFLESYKINRELFTDRRMR >gi|261747504|gb|ADAD01000120.1| GENE 52 50750 - 51334 822 194 aa, chain + ## HITS:1 COG:FN0502 KEGG:ns NR:ns ## COG: FN0502 COG0279 # Protein_GI_number: 19703837 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoheptose isomerase # Organism: Fusobacterium nucleatum # 8 194 5 191 194 241 68.0 5e-64 MLKEHIKNSYFTAYETVKNFVENDENIEKTVKISEELAQAYKNGNKSLIAGNGGSNCDAM HFAEEFTGRFRKDRKALPSLSISDSSHITCVGNDYGFNFIFAKGIEAFGQKGDFFFGIST SGNSQNIIEAMKVAKEKGMKTVGLLGKDGGKMKGMCDYEFIIPGETSDRIQEVHMIILHI IIEGVERILFPENY >gi|261747504|gb|ADAD01000120.1| GENE 53 51670 - 53682 1572 670 aa, chain + ## HITS:1 COG:FN0198 KEGG:ns NR:ns ## COG: FN0198 COG3711 # Protein_GI_number: 19703543 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Fusobacterium nucleatum # 22 525 11 520 660 97 25.0 1e-19 
MILNEREVKILEKFYKNKTVLLSELSKEFSVSERMIRYNINEINTLLEFIKIAPIKKIGK SKYILENNSTRLLDVIKELEPINKSRRQTLIQMYLMFSEEKVNIKYLMEKFQLTRVSINS DIKEINKTLNIRGLKIENNKGLSIIGNTDNIKKYKIFILSEQLELLFKEDYSEYVNQIKE IIFKIISEKTLKKIKVFVDEVISENKIKLDDLNYKYFYSQVLNIFYEDSYKNRNCHDSDY KEYVEKKLKELKLLKELNVKKVNELSNLVSWIKSYESYEDFFQDLLSIEVLVKNIIKTVE NKISIQITKDKLLEEFLIQHLKALIYRTQKGYKLKDSIIVEERDEDKLYMTIKDSLKLIS QQLGNEIENNEIHLLKIHFLASIERINKLKTVPLEIAIVTSLGSGSNKILIDNIKSKFLV NVVYIGPLYNLDSVLKKHKNIKYILTTIDIEENKYKSKIIKINTILSFEDKQKLDLLGFR SNNNKINLSNLIEVISENCKIGNKNNLIENLMNNFNDRIINDISEFAGNEDILLSKNIIF DYKAENIMDAIRECCKNLEKDYTNSTYTEEVLDVFINNQRQIIRYNGVMLPHAVNKDNVH KNGVSILKLKNPILMENTKEKIDTIVCFAIKDKKNISNEVSNVINRVLKRQFKKLLRERD KISIINYLIN >gi|261747504|gb|ADAD01000120.1| GENE 54 53700 - 54134 445 144 aa, chain + ## HITS:1 COG:lin2202 KEGG:ns NR:ns ## COG: lin2202 COG1762 # Protein_GI_number: 16801267 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Listeria innocua # 1 132 6 138 155 73 30.0 1e-13 MFQKEDIYLDVDAKNYIELFEKMAKVFKEKRYVKETYLKSVLEREKKYPTGFEFDGYNIG LPHTDAEHISVQKIILIRLKNEIEYREVVTNKKIPIKIFIMLLIKNGEDQTTILGKLIEI IGEKEFYRKVSEAKKVEDLKELYE >gi|261747504|gb|ADAD01000120.1| GENE 55 54166 - 55491 1905 441 aa, chain + ## HITS:1 COG:SA0238 KEGG:ns NR:ns ## COG: SA0238 COG3775 # Protein_GI_number: 15925950 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIC component # Organism: Staphylococcus aureus N315 # 7 399 11 403 419 352 48.0 6e-97 MFILQWFINLGATVMIPLLLIIFAILLGTKPTRAVKAGITIGIGFIGLNLVIGLLSESLG TAAQGMIANFGLRLDVIDVGWPATSAIAYSTALGTMAIPIGIVVNMVLLIFGLTKTLDID LWNFWHGAFIGSIIYATTNSFLLGILTIISYYLFIFWIADYMEPVISEFYGFPNITFPHG TSAPGYIIAKPLNYIFDRIPGFKNWDISPEQIQKKFGLFGDSTVMGLIIGLIMGILEGYD VSKILNLGMQTAAVMMLMPRMVSLLMEGLEPISEAASEFIKKRFPTRDIYLGMDSALSVG HPAVLSSSLILVPITILLAVILPGNRVLPFGDLATIPFIVCLMAAVFKGNIIRTVIGGTM 
YMGLGLYIATWISPLFTKMAQNANFDFKGNVNISSLADGSLWPTALFVFFAKYAAYIGVI GFGIIMTGLLIYQSKLKKGEN >gi|261747504|gb|ADAD01000120.1| GENE 56 55493 - 55777 469 94 aa, chain + ## HITS:1 COG:lin2815 KEGG:ns NR:ns ## COG: lin2815 COG3414 # Protein_GI_number: 16801876 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIB component # Organism: Listeria innocua # 3 88 5 88 93 60 40.0 9e-10 MKKILIMCGTGVATSTVVVNKVKKWLEANKLSEYVKIYQGKIAEEINRFNNYDIIISTTL VPDEYKDKIINGMPLLTGVGINEMYEKIKKEIEG >gi|261747504|gb|ADAD01000120.1| GENE 57 55782 - 56618 959 278 aa, chain + ## HITS:1 COG:ECs4017 KEGG:ns NR:ns ## COG: ECs4017 COG0191 # Protein_GI_number: 15833271 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Escherichia coli O157:H7 # 8 265 9 275 286 179 37.0 4e-45 MRNKMYKMFEIAMKENFAIPSVNFVDREMIRAYAEVVNETKLPIIFSIAESHLIYIDLKE AFLLAEYYIKKYNLNAVIHLDHGQSTEIINEAMELGFDSIMIDGSGLPFEENVKITKRVV EFAHSKNIFVESEIGHVGSGELTGISCSADDDTVYTSKEEAIKFAKLTSTDSLAISIGTV HGSYKGKPKINFERLQEIRKELSIPLVLHGGSSSGDENLNKCAKNGINKINIYTDFIIAA QNANDSKMGYFDNKNAMKEAIKNKLRHYFKVFDTKEMK >gi|261747504|gb|ADAD01000120.1| GENE 58 56618 - 57307 915 229 aa, chain + ## HITS:1 COG:SSO2592 KEGG:ns NR:ns ## COG: SSO2592 COG0149 # Protein_GI_number: 15899321 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate isomerase # Organism: Sulfolobus solfataricus # 1 221 1 224 227 129 33.0 4e-30 MKKPFFVVNPKSYLYGERLEKLALKADEIAKKYDIKIYFTAPFVELKNISQITENIILTA QHMDETKIGRGMGHIIGEMLINSGVKAVVLNHAEHKLNNEKLELTIERAKELNLDTIVCA GTYEEVKYISSLKPFSILCEPTELIGTGITSGDDYIEKSSEIIREIDSNIYVMQAAGVTT PEDVYNIILKGADATGCTSGIVLADDPEDMMEKMVTALKKAYQERECKQ >gi|261747504|gb|ADAD01000120.1| GENE 59 57304 - 57819 555 171 aa, chain + ## HITS:1 COG:TM1872 KEGG:ns NR:ns ## COG: TM1872 COG0432 # Protein_GI_number: 15644615 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 8 167 5 130 
132 64 30.0 1e-10 MKYKHHDIVVKTVSNRVSYHCITKETKEIIEKSEIKNGIVVLSSSHTTCSLFFEEYMHDK NYYGDEYIQVDINNIMDKIVPKCNTETQYFSPGPEHIDFGLGLVDPAYPQEKWTMLNTDA HIKSSIFGNNSLTFIIKDGEIRLGSLGKIYFADWDQLRERTRKVNVLVMGD >gi|261747504|gb|ADAD01000120.1| GENE 60 58047 - 59903 1364 618 aa, chain - ## HITS:1 COG:no KEGG:FN0723 NR:ns ## KEGG: FN0723 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 618 1 618 618 806 68.0 0 MKKFFKTLKHHFNKNLWIIYLLGLSLSLIGSFQVYHGKYDSVIKEISVIFISVLKLFLFS PLEGFTKQNPLAYELAIWFAPVTMFFAIFSIFAKLYNDIKLKFIHFRKKHIIVMGYNNYS LTFMKNFINSKNKERLLCILPEDIQESNIKSLNKSGILTCTLDYSSGLSNENIRTASEYD FASVNTVICFEEEPKNYGYLKLISELIDVNKKDKNNKVKVYVSIVNKYIREIIQHQMDKI KIFDIKYFNIFDLISYNLINEKNFKLYETKELKYDWNEPKKSLKNNFSFDDFSGLIGNPH LLLIGFKDRGKSLFELAVNQTAINVKEKMKITVVDRKINDLAEEYKATVRELEKVADIDF IDGDISHISTQNKIKKIHSKNPFSAVIFSTKNCSENLVFMDLVGDELFKNVNIALYSENI RENGPLIESIAFKYPNITVFGELSHLLNLETIANEPLDIKAKNFNAYYNKVTADIMNFPT EDLSPDKQWNSLSNIKKESSRNQCMHQNIKKVLLGKIAEIEGFSSAKELLQSWKNQIDNL SPTEQVNIIENNPFMNYMTALEHKRWNNFYYMKNFVYSDEKNEVNRTHNSLIDDWDEFLR SDQSDKAVYDFISVLSAE >gi|261747504|gb|ADAD01000120.1| GENE 61 59907 - 62960 2612 1017 aa, chain - ## HITS:1 COG:FN0722_2 KEGG:ns NR:ns ## COG: FN0722_2 COG2319 # Protein_GI_number: 19704057 # Func_class: R General function prediction only # Function: FOG: WD40 repeat # Organism: Fusobacterium nucleatum # 265 1017 1 757 757 883 65.0 0 MPENKEDLQNKKNENSERKYKYDAFISYRHTEPDFTIAKNLHSMIEKFKVPKHLSENSSN ETREFRVFRDREELSTKDLSTMIQEGLKESENLIVICSRRTPLSPWCRKEVQLFKEMHGS DNIIPVLIEGTPDESFIDELKNLKMTFINSNNEEEEKDIELLAADLRPEEIKSSSFKGYE TLQNEKAPELNELAKKSLNILKKSEIYRIMASMLNVNYGDLKLRHQERRMKRIIYASISA VLVMFLFIVSVMTLYLKSVVSERKANEQSALMTLNMANDANSQGNRALAMLIAKEAMKNS TSKMDQYDKLIAQYTNILNNSLITLPFSNEFILPTESETAAFSISSDSKTLVSPGSFNNA NIWNLENGGIIKTLTFDAPVTSLALSPDNKKIYAGTASNKLFEVNTDNYQIKEAFEASQL PVNAIIVSKNNKYLYVLRGLLIFDVFDIENHKKLHSFSFDFENRITGFKENPVTNNFFIL KKDNSITEYDINTGQVAAIHAPATPPESSFRRELEISDNGILFYSDMENNSVKIVMKNLQ TGQINTANNLKVSTFDIETDKETNVLYTHSFGNFVTRFDLSNLKPNEEINTPQRMMYLNI 
ENNESIKNIKLSPDDNTLAVILANRTVGAFTNLKNMPQNSVSQFILNEKSTHKNTVNIIK FTPDSKKIITSAADSTIRVMNTKAYLGEIQSLNGKIVSSSRDKNSILILSGDQLSKYSFA DNKEVPLAHLHPKYLHILQMFATTNDVSLVALSPGNSTSADVFDVKQNKKVYTTKPHAVK PGNIPVLSKISFSNDGKFLFTLGPDSCLFVHDAKTGKFLFSLEDKENGIAASFVLSNDDN FVALNYTTGKSTIFSLETKKIVQKLDGEILAVNSENKKIKIIYGQVENKLFYSTSNNKII KYADNKIKTATGTTKFNTVNISFDGKYYISGIPKNNTVITDLNTGETIRTLYTDTNKYFV SLPVINKDNRKIAYNNDENRIIITDMYSLEELSKKADEMLKGRKMTESELNSIGRRK >gi|261747504|gb|ADAD01000120.1| GENE 62 63163 - 64524 545 453 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 1 445 1 444 458 214 30 1e-54 MKKYDAIIIGFGKGGKTLAGFLAGKGQNVAVIEKSDKMYGGTCINVGCIPSKKLVDSTKV LKNKGLSNIEDKKNFYAESINNKNNLIEALRGKNYEMLASKDTIDIYNGAGSFVSKKVVN IHSNGENTQIEGEKIFINTGSTTVIPDIKGLKESKHIYTSTTLMDLEQLPEKLVIIGAGY IGLEFASMYSEFGSEVTVIDTAEKLLPREDEEIADRVKTILEGKGIKFLLKEKIEEIFDK DGKGYVKVSGEEVEADAILVAIGRKPNTEGLNLEAAGVKTDEKGTVMVNETLQTSIDNIW AMGDVKGGLQFTYISLNDFRIIRDNVYGNGNRTVNDRNVIPYSVFINPPLSRAGMTESEA VSKGYEVKTGRLEAMAIPKAKIEGQTDGLLKTVIDAKTDKILGCTLLCNTSHEMINVVAA AMKAEQKYTFLKDMIFTHPTMSEALNDLFGSVK >gi|261747504|gb|ADAD01000120.1| GENE 63 64828 - 65119 280 97 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038433|ref|ZP_06011807.1| ## NR: gi|262038433|ref|ZP_06011807.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 97 1 97 97 135 100.0 1e-30 MKNYFLSKEAFKYRFSIEHVNVYTNEEELYNYFKLNLESYYNVNKNIINSEGMDVFCVVD DLKKYIDMFDNSEIINYVSASNHEHFNKKYFSENGVD Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:14:56 2011 Seq name: gi|261747502|gb|ADAD01000121.1| Leptotrichia goodfellowii F0264 contig00092, whole genome shotgun sequence Length of sequence - 6230 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 6038 6931 ## FN1554 hypothetical protein - Prom 6147 - 6206 3.4 Predicted protein(s) >gi|261747502|gb|ADAD01000121.1| GENE 1 2 - 6038 6931 2012 aa, chain - ## HITS:1 COG:no KEGG:FN1554 NR:ns ## KEGG: FN1554 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 712 2012 11 1326 1582 1123 54.0 0 MSNNLKKMEKDLRAFAKRSKDVKYTKGLLFSFLLMGMLTFSDTLTSPEVKSTENAINQTR KELNTSINDLHVAFKQAKRDNNRLLKRANLELIQLMEQGDHVVKSPWSSWQYGLNYFYSD WRGTYKGRGDKAEKYPYEGIFTRSKDPFERYTSPESPNYGLLPTSTDPYSATTSSRKGLR SGYGIASTTPKQEPLTVLNVDASIKPKDVYRDPVTAPTVDVKAPVLQALNVPNLLPPSLD IPTPQAIVAPNKTPNIAINPTVQYNFGGRSITGESPWDPSHPAKNQDGLGLNYWSGWNPN TNTTDGSKSVWYRNGGTVVESANRVPNLFYLNAVDRSQVTGIRWTFKNANADVAGNPSWN PKGTSAMHTVWDGDITNVNVRLHGYATFIAAETWHNGNVTINNSTVTIENEYNSVFFGYP GSYYTMGINNNDYHSYNQRGGYSGNLKVNINANNNYIHSVMGVQGSFKLDHTGDYDIKGN GNLAYLGIGYTPNWQNLKGSGDVTSNPNTNMTPSIQLGSGTGKINVDGNSNVGLFFENRY SNFQTGGLLPFDTRGTFSADNWRKSVIGIYQGEVAVGMNVGANSASKGNVGVYSRSGQRE GIVPERDLGTPTAAQRTEAKGYAGTQSRGLPDYNIDKIHNLEIANTKIYFGKYAVDSIMF AADKGSVIDVARPDKADKNLGRTAYTEVTPSTEIRDAATDVTSSYDDNTNQAATGTVIAY STGVWNNTNMPGGTAAGLTNKPSEINLYKPVVMTGRAKLDASNQLNRSVALIGDNKGIVN AKEKVTALGYSSIVALAQNNGVVNAEKDIIAKDGAAAQDAATKPYLYNNIGAYAGINGKV NVTGNADIYGIGAMASGANAVANLNGTNNKIRTGKSGALVATNGGVVNFGGGTITHSENF SGDHDSSTPFNADSSSHINFKGATTLNISHGVLIPGTKDDYAAAPGTTKKYNGMSNVTVN LTGDNVVLASNNGIHKIWDGTTIANLVKSTMKVAAFNANGHSYKIYYINGQFDIDSNIDV GNASDDFNKVGLSREVVTINAGKTVSSTVGKGLAMGSNDSANADGNNSKTQFINNGTVDI KGGSLSAGTIGLNISYGQIHNRNIVNVENGIGAYGINGSTLTNDTTGKINITTQGVGMAA FTSAGSLQTYGTDKKISNGTLTAADKTFEIINKGQITVNGNKSVGLYGDTTKATSPLLSA SNGVITNSGKLTLTGDESVGIVSKRATVGLTGTGSSDIVVGKKGIGVYAEKSPVTINSNY GIEVKDGGTGIFIKNDGSTLTSGTHTFELKYSGSNTGTGVGLFYEGGTGANIVNRTNVKL TDTVGTTAGLIGVYTAGGGKLTNNAKITGDKGYGIISNGAEVENTGDITLTNALTSSKPS VGILTQAGDKITNTGTVTVGNNSVGIFGKEIVQKGTVTVGNGGTGLYSEGGNVTLDSTSK INTGSNKAVGVFTKGAGQTITANSGSSMAIGDSSFGFLNEGTGNTINSNVTSQTLGNDGT YIYSSDKTGTVNNNTTLTSTGSYNYGLYSAGTVTNNADINFGAGKGNVGVYSTHGGRATN LAGRNITVGASYIDPSNSLNNRYAVGMAAGFNGDGNPAKAYTGSVINEGTINVNGEYSIG 
MYGTEAGTKVYNGTAVGSTATINLGASNTTGMYLDNGAYGYNYGTIRSVGSGLKKLAGVV VKNGSTIENHGKIELSAEDAVGILSKGNAAGANPGIIKNYGTFNINGKTDPNDSSVVKKD SGGQDLGKSMGGVKIDVPSGSTVGTITVNGKPVVPTLATTSAKEYKDMEISKIGMYIDTS NKRFTNPINGLSALSRLKTADLIMGNEAAQSTTSKYIQVDQKILKPYNEMIKKNPQIKKW NIYAGALTWMATVSQDQNDGTMKSAYLAKIPYTQWAGNESTPVDKKDTYNFLDGLEQRYG AEAIGTRENQLFQKLNGIGKNEQILFFQATDEMMGHQYANVQQRMNRTGVLLDKEFTHLR KEWDNKSKQSNKLKVFGMRDEYKTDTAGIIDY Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:15:23 2011 Seq name: gi|261747501|gb|ADAD01000122.1| Leptotrichia goodfellowii F0264 contig00103, whole genome shotgun sequence Length of sequence - 384 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:15:24 2011 Seq name: gi|261747499|gb|ADAD01000123.1| Leptotrichia goodfellowii F0264 contig00201, whole genome shotgun sequence Length of sequence - 343 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 342 432 ## COG3210 Large exoproteins involved in heme utilization or adhesion Predicted protein(s) >gi|261747499|gb|ADAD01000123.1| GENE 1 3 - 342 432 113 aa, chain + ## HITS:1 COG:FN0291 KEGG:ns NR:ns ## COG: FN0291 COG3210 # Protein_GI_number: 19703636 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 4 113 36 144 1881 81 49.0 3e-16 IGTRVTKTASGVDQIDIAAPNKNGTSYNSLKELQVSEQGLILNNNKNVVINTQIAGLVVR NRNLDNGIEANLIITEVTGKNKTNINGIVEVAGKRADLVMANRNGIFVNGGGF Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:15:26 2011 Seq name: gi|261747492|gb|ADAD01000124.1| Leptotrichia goodfellowii F0264 contig00010, whole genome shotgun sequence Length of sequence - 5554 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 978 980 ## COG5324 Uncharacterized conserved protein 2 1 Op 2 . + CDS 997 - 2214 1527 ## COG0438 Glycosyltransferase + Prom 2216 - 2275 5.3 3 2 Op 1 17/0.000 + CDS 2320 - 2979 755 ## COG0569 K+ transport systems, NAD-binding component 4 2 Op 2 . + CDS 3022 - 4359 1165 ## COG0168 Trk-type K+ transport systems, membrane components + Prom 4405 - 4464 11.1 5 3 Tu 1 . 
+ CDS 4492 - 5379 1390 ## COG2822 Predicted periplasmic lipoprotein involved in iron transport Predicted protein(s) >gi|261747492|gb|ADAD01000124.1| GENE 1 1 - 978 980 325 aa, chain + ## HITS:1 COG:FN1603_3 KEGG:ns NR:ns ## COG: FN1603_3 COG5324 # Protein_GI_number: 19704924 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 12 277 2 267 273 191 43.0 2e-48 DKAKNSKEINLTKDDEVNKLIISKLINVKNTKPNLYSLNFTRNAFKRKLWNDSTIKARGL FVDKITGEVKMRSYNKFFNYGENNFSSKKYLEKNLSFPVTAYEKYNGFLGILGVINDEFI FATKSTTEGEHARYFRKLFEKVNEKNKKKLFELCKNKNVSVIFEVVSNEDRHIVDYFMNE NLFLLDVIENNLVINGIHSDITLSEKYIKELNINDDVIKVKKAVSICNDFEEVLDLMGKY GDIPNVEGLVFVDTKGLMFKYKAEHYNTWKRRRTLVELYRKNGDINLKQCKNEEEKDFIK WIISLDKGYVENTHIVDLMNKYMEK >gi|261747492|gb|ADAD01000124.1| GENE 2 997 - 2214 1527 405 aa, chain + ## HITS:1 COG:MTH450 KEGG:ns NR:ns ## COG: MTH450 COG0438 # Protein_GI_number: 15678478 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Methanothermobacter thermautotrophicus # 112 328 123 349 411 85 27.0 2e-16 MKVLHVLAQLPLKTGSGVYFTNVIDGLKEDSNIEQACIYATTAEYDINILNKENQYEAVF ESENLPFPIAGMSDIMPYPNTLYKNMTDEMMSVWKKEFLKQLNAAKEKFNPDVILTHHLW ILSSMVREVFPDKKVVAVCHNTDIRQSKQNPHIKEKHLQALKNVDKVLALSNKQIPEISE LFGIDEKKMINIGGGYNEKIFYPPEKYPEKEKVELMYAGKFDDSKGFYELIEAFKKIEEK DENVTIDLIGAITEKNKDRIEKAVGNSKNIKIYPPVDQKTLAEIMRKKDIFILPSYFEGI ALIAVEALGSGLRVVATKIEALMELLGDELNDSEVIEYVDMPTIYDTDKAVEEEKPAFIL RLAGKIEKMIERAREKREINKLLTEKIQKNSWKSKIEKIKEILAE >gi|261747492|gb|ADAD01000124.1| GENE 3 2320 - 2979 755 219 aa, chain + ## HITS:1 COG:FN1724 KEGG:ns NR:ns ## COG: FN1724 COG0569 # Protein_GI_number: 19705045 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Fusobacterium nucleatum # 1 219 1 218 218 182 47.0 3e-46 MEGYLIIGLGRFGKSIAKTLYNHNKTVLGLDIDEEIVQQVIDNEIISEAIVIDATNENEL KKIIDDDFDTAFVCIGDNIQVSILVTLILKELGVKTVICKAKTKMQGKVLEKIGASSVVY 
PEEAMGEKIALNVLQSNITEYFKFSDKYGIFEFKAPKSFIGKNLMELDLRKKYGINIIVI KSENKEINFTPNPLTVIEENDTLFAMAEQEKMSFLSNII >gi|261747492|gb|ADAD01000124.1| GENE 4 3022 - 4359 1165 445 aa, chain + ## HITS:1 COG:FN1725 KEGG:ns NR:ns ## COG: FN1725 COG0168 # Protein_GI_number: 19705046 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Fusobacterium nucleatum # 1 445 7 448 448 296 43.0 6e-80 MRKKLHISPQAMILISFCMIILISGTILSLPVSTVNGKGIRWIDGIFTATSAVCVTGLSV NDMNSVFNLFGKTLILILIQLGGLGLITISSLFIILLSRKINYYTKKAVEEDLNTDKIFN IQKYVKNVIFTVFLIEFSGAVFLLFEFSKIFELKRAAYYSVFHSISAFCNAGFSLFSNNL SDFKSSFILNLVISLLIILGGIGFSTLLNIYRYFSKKDKKLTITSKLSINISLILILIGT ISIFIIEYSNEKTIGNLSFLQKITASIFQSVSVRTAGFNTISLVDLKDASTFLFIILMFI GASPGSTGGGIKTTTIGIIFLGIRAILFNKENLEFSRRRIEWNNFNKAIVLFFISIIYIA IILFLMMITEKNIEFIDLLFEIVSAFGTVGLSKGVTSLLSDISKSFIIFTMFIGRVGPLT VTLVLLPKKIKSTNYKYPAENIIIG >gi|261747492|gb|ADAD01000124.1| GENE 5 4492 - 5379 1390 295 aa, chain + ## HITS:1 COG:SA0331 KEGG:ns NR:ns ## COG: SA0331 COG2822 # Protein_GI_number: 15926044 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted periplasmic lipoprotein involved in iron transport # Organism: Staphylococcus aureus N315 # 46 290 40 284 284 210 51.0 3e-54 MKNMKKIALLTLIATLVIVGCDKKEDKESNKDTTTQTTAQDNATVDLSAETSEYKKFVEE QIGMLLKDTENFAQLLKDGKLEEAKKVYPLIRMAYERSEPIAESFGESDVNIDFRLVDFK EEHNTEEGWRRFHRIERILWEQNTTKGTEKYAEQLVNDIKELKAKIATVEVTPDIMLTGA VDLLNEVATSKITGEEEVYSHTDLYDFRANIEGAQKIFELFRAKLEKKDSKLVATLDTEF KAINGLLDKYMTDDKNYKLYTDLTKEDTKALAEAVTKLGEPLSQMGIILDIKEGK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:16:06 2011 Seq name: gi|261747353|gb|ADAD01000125.1| Leptotrichia goodfellowii F0264 contig00113, whole genome shotgun sequence Length of sequence - 127356 bp Number of predicted genes - 139, with homology - 139 Number of transcription units - 44, operones - 30 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 96 - 155 5.9 1 1 Op 1 
. + CDS 300 - 497 342 ## gi|262038565|ref|ZP_06011934.1| hypothetical protein HMPREF0554_1624 2 1 Op 2 . + CDS 527 - 1597 928 ## COG0582 Integrase + Term 1622 - 1687 -0.7 - TRNA 1731 - 1814 56.7 # Leu GAG 0 0 + Prom 1927 - 1986 10.9 3 2 Tu 1 . + CDS 2057 - 2803 1014 ## FN0557 hypothetical protein 4 3 Op 1 1/0.167 - CDS 2922 - 5141 2998 ## COG2217 Cation transport ATPase 5 3 Op 2 . - CDS 5165 - 5641 539 ## COG3682 Predicted transcriptional regulator - Prom 5888 - 5947 9.7 + Prom 5732 - 5791 10.0 6 4 Tu 1 . + CDS 5875 - 6765 932 ## NT05HA_1979 hypothetical protein + Term 6850 - 6911 0.3 - Term 6752 - 6785 3.4 7 5 Op 1 . - CDS 6861 - 7673 913 ## gi|262038497|ref|ZP_06011866.1| hypothetical protein HMPREF0554_1631 8 5 Op 2 . - CDS 7703 - 8212 227 ## PROTEIN SUPPORTED gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 9 5 Op 3 . - CDS 8209 - 9531 1718 ## COG1253 Hemolysins and related proteins containing CBS domains 10 5 Op 4 . - CDS 9552 - 10502 486 ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 - Prom 10539 - 10598 11.2 11 6 Op 1 . - CDS 10612 - 10920 385 ## gi|262038612|ref|ZP_06011981.1| putative ribose operon repressor 12 6 Op 2 . - CDS 10956 - 11456 706 ## SP670_2008 hypothetical protein 13 7 Op 1 . - CDS 11795 - 12481 620 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III 14 7 Op 2 38/0.000 - CDS 12494 - 13318 749 ## COG0395 ABC-type sugar transport system, permease component 15 7 Op 3 35/0.000 - CDS 13342 - 14232 981 ## COG1175 ABC-type sugar transport systems, permease components 16 7 Op 4 6/0.000 - CDS 14255 - 15556 1734 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 15582 - 15641 5.8 17 7 Op 5 . - CDS 15662 - 16663 974 ## COG1609 Transcriptional regulators - Prom 16723 - 16782 10.8 + Prom 16801 - 16860 13.9 18 8 Op 1 . + CDS 17002 - 17970 1149 ## COG0180 Tryptophanyl-tRNA synthetase 19 8 Op 2 . 
+ CDS 17996 - 18175 214 ## Clocel_3263 hypothetical protein + Prom 18274 - 18333 8.9 20 9 Tu 1 . + CDS 18395 - 18793 302 ## COG1434 Uncharacterized conserved protein 21 10 Tu 1 . - CDS 19099 - 19329 457 ## gi|262038581|ref|ZP_06011950.1| MAP1-LC3 domain-containing protein - Prom 19395 - 19454 7.3 22 11 Op 1 . - CDS 19535 - 20056 875 ## COG0663 Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily 23 11 Op 2 30/0.000 - CDS 20062 - 20805 735 ## COG0336 tRNA-(guanine-N1)-methyltransferase 24 11 Op 3 . - CDS 20806 - 21315 770 ## COG0806 RimM protein, required for 16S rRNA processing 25 11 Op 4 . - CDS 21329 - 21742 670 ## COG3412 Uncharacterized protein conserved in bacteria 26 11 Op 5 . - CDS 21765 - 22328 646 ## COG4752 Uncharacterized protein conserved in bacteria - Prom 22431 - 22490 11.2 + Prom 22418 - 22477 10.7 27 12 Tu 1 . + CDS 22504 - 23067 534 ## gi|262038593|ref|ZP_06011962.1| 235 kDa rhoptry protein - Term 23061 - 23105 8.1 28 13 Op 1 . - CDS 23161 - 24039 1315 ## COG0682 Prolipoprotein diacylglyceryltransferase 29 13 Op 2 . - CDS 24040 - 25131 1655 ## COG0075 Serine-pyruvate aminotransferase/archaeal aspartate aminotransferase 30 13 Op 3 . - CDS 25181 - 26599 2288 ## COG1012 NAD-dependent aldehyde dehydrogenases 31 13 Op 4 . - CDS 26632 - 27468 1150 ## COG2877 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase - Prom 27495 - 27554 9.6 + Prom 27460 - 27519 9.5 32 14 Op 1 . + CDS 27692 - 28942 1739 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase 33 14 Op 2 . + CDS 28976 - 29272 421 ## COG2350 Uncharacterized protein conserved in bacteria + Term 29283 - 29328 1.4 - Term 29271 - 29316 1.4 34 15 Op 1 12/0.000 - CDS 29326 - 29745 345 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 35 15 Op 2 . - CDS 29708 - 30148 478 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 36 15 Op 3 . 
- CDS 30166 - 30756 315 ## PROTEIN SUPPORTED gi|162456259|ref|YP_001618626.1| putative ribosomal protein 37 15 Op 4 . - CDS 30776 - 31522 922 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase 38 15 Op 5 . - CDS 31536 - 32123 630 ## Lebu_1699 regulatory protein RecX 39 15 Op 6 . - CDS 32120 - 33268 2078 ## COG0468 RecA/RadA recombinase - Prom 33359 - 33418 11.4 - Term 33392 - 33446 10.7 40 16 Op 1 . - CDS 33468 - 34619 1674 ## Lebu_1986 hypotheticalprotein 41 16 Op 2 2/0.000 - CDS 34654 - 35361 1157 ## COG2171 Tetrahydrodipicolinate N-succinyltransferase 42 16 Op 3 . - CDS 35387 - 36115 1114 ## COG0289 Dihydrodipicolinate reductase - Prom 36145 - 36204 8.2 43 17 Tu 1 . - CDS 36250 - 36801 907 ## COG1859 RNA:NAD 2'-phosphotransferase - Prom 36910 - 36969 7.8 44 18 Op 1 . - CDS 36971 - 38137 1592 ## Lebu_0649 hypothetical protein 45 18 Op 2 . - CDS 38210 - 38599 723 ## Lebu_0648 ATP synthase F1 subunit epsilon 46 18 Op 3 42/0.000 - CDS 38618 - 40015 2216 ## COG0055 F0F1-type ATP synthase, beta subunit 47 18 Op 4 42/0.000 - CDS 40034 - 40894 1109 ## COG0224 F0F1-type ATP synthase, gamma subunit 48 18 Op 5 41/0.000 - CDS 40908 - 42413 2164 ## COG0056 F0F1-type ATP synthase, alpha subunit 49 18 Op 6 38/0.000 - CDS 42433 - 42960 813 ## COG0712 F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) 50 18 Op 7 37/0.000 - CDS 42965 - 43459 744 ## COG0711 F0F1-type ATP synthase, subunit b 51 18 Op 8 40/0.000 - CDS 43523 - 43759 540 ## COG0636 F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K 52 18 Op 9 . - CDS 43798 - 44676 1212 ## COG0356 F0F1-type ATP synthase, subunit a - Prom 44704 - 44763 9.1 53 19 Op 1 . - CDS 44767 - 45165 357 ## Lebu_1791 ATP synthase I 54 19 Op 2 . - CDS 45183 - 45509 493 ## Lebu_1792 hypothetical protein 55 19 Op 3 . - CDS 45532 - 46425 1102 ## gi|262038548|ref|ZP_06011917.1| putative regulatory protein ArsR 56 19 Op 4 . 
- CDS 46446 - 47171 1111 ## COG0500 SAM-dependent methyltransferases 57 19 Op 5 7/0.000 - CDS 47199 - 48572 2037 ## COG1921 Selenocysteine synthase [seryl-tRNASer selenium transferase] 58 19 Op 6 . - CDS 48589 - 50466 2512 ## COG3276 Selenocysteine-specific translation elongation factor 59 19 Op 7 . - CDS 50495 - 50794 289 ## gi|262038513|ref|ZP_06011882.1| conserved hypothetical protein 60 19 Op 8 . - CDS 50802 - 51389 880 ## TDE0677 hypothetical protein 61 19 Op 9 . - CDS 51428 - 52336 1080 ## COG0709 Selenophosphate synthase - Prom 52487 - 52546 13.5 - TRNA 52626 - 52718 26.6 # SeC(p) TCA 0 0 62 20 Tu 1 . - CDS 52772 - 53395 883 ## TDE2415 hypothetical protein - Prom 53506 - 53565 15.8 + Prom 53411 - 53470 12.8 63 21 Tu 1 . + CDS 53548 - 54720 966 ## COG0477 Permeases of the major facilitator superfamily + Prom 54844 - 54903 12.2 64 22 Op 1 . + CDS 54993 - 55313 361 ## FMG_0119 putative phosphate ABC transporter substrate-binding protein 65 22 Op 2 . + CDS 55295 - 55594 414 ## COG0226 ABC-type phosphate transport system, periplasmic component 66 22 Op 3 . + CDS 55616 - 55900 283 ## FMG_0119 putative phosphate ABC transporter substrate-binding protein 67 22 Op 4 38/0.000 + CDS 55965 - 56780 633 ## COG0573 ABC-type phosphate transport system, permease component 68 22 Op 5 41/0.000 + CDS 56770 - 57642 571 ## COG0581 ABC-type phosphate transport system, permease component 69 22 Op 6 . + CDS 57599 - 58249 208 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 + Term 58459 - 58493 -0.9 70 23 Op 1 . - CDS 58372 - 58845 544 ## ELI_1489 acetyltransferase 71 23 Op 2 . 
- CDS 58883 - 60586 2006 ## FMG_0256 hypothetical protein - Prom 60641 - 60700 12.7 + Prom 61013 - 61072 10.1 72 24 Op 1 2/0.000 + CDS 61109 - 62392 1690 ## COG1653 ABC-type sugar transport system, periplasmic component 73 24 Op 2 35/0.000 + CDS 62426 - 63715 1806 ## COG1653 ABC-type sugar transport system, periplasmic component 74 24 Op 3 38/0.000 + CDS 63762 - 64622 857 ## COG1175 ABC-type sugar transport systems, permease components 75 24 Op 4 . + CDS 64637 - 65482 564 ## COG0395 ABC-type sugar transport system, permease component 76 24 Op 5 . + CDS 65500 - 66207 573 ## Pjdr2_5268 hypothetical protein 77 24 Op 6 . + CDS 66234 - 67508 1949 ## COG1653 ABC-type sugar transport system, periplasmic component - Term 67593 - 67627 3.0 78 25 Op 1 . - CDS 67644 - 68090 533 ## gi|262038553|ref|ZP_06011922.1| conserved hypothetical protein 79 25 Op 2 . - CDS 68059 - 68268 274 ## gi|262038486|ref|ZP_06011855.1| conserved hypothetical protein - Prom 68412 - 68471 9.4 - Term 68477 - 68517 7.4 80 26 Op 1 . - CDS 68519 - 69778 1434 ## COG1301 Na+/H+-dicarboxylate symporters 81 26 Op 2 . - CDS 69834 - 70901 1963 ## COG2055 Malate/L-lactate dehydrogenases 82 26 Op 3 . - CDS 70959 - 72218 1711 ## COG0477 Permeases of the major facilitator superfamily 83 26 Op 4 . - CDS 72220 - 72999 653 ## CAR_c00510 hypothetical protein 84 26 Op 5 . - CDS 72999 - 75998 4765 ## COG0074 Succinyl-CoA synthetase, alpha subunit - Prom 76031 - 76090 11.8 + Prom 76144 - 76203 15.1 85 27 Op 1 . + CDS 76329 - 76991 708 ## COG1802 Transcriptional regulators 86 27 Op 2 . 
+ CDS 77032 - 77523 468 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Term 77420 - 77470 3.1 87 28 Op 1 44/0.000 - CDS 77621 - 78343 415 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 88 28 Op 2 4/0.000 - CDS 78359 - 79924 417 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 89 28 Op 3 38/0.000 - CDS 79939 - 80901 895 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 90 28 Op 4 1/0.167 - CDS 80924 - 82594 2209 ## COG0747 ABC-type dipeptide transport system, periplasmic component 91 28 Op 5 . - CDS 82632 - 83540 1043 ## COG0583 Transcriptional regulator - Prom 83577 - 83636 2.1 92 29 Op 1 . - CDS 83691 - 83927 369 ## Elen_1686 hypothetical protein 93 29 Op 2 . - CDS 83924 - 85018 1862 ## Ccur_00560 hypothetical protein 94 29 Op 3 . - CDS 85061 - 86215 1524 ## COG0520 Selenocysteine lyase 95 29 Op 4 . - CDS 86202 - 86444 64 ## gi|262038544|ref|ZP_06011913.1| conserved hypothetical protein - Prom 86465 - 86524 9.7 - Term 86451 - 86503 9.3 96 30 Tu 1 . - CDS 86527 - 86916 540 ## Amet_3596 GrdX protein - Prom 86941 - 87000 6.9 - Term 86951 - 86983 4.0 97 31 Op 1 . - CDS 87017 - 88150 1659 ## COG3949 Uncharacterized membrane protein 98 31 Op 2 . - CDS 88181 - 88423 366 ## CD1741 sarcosine reductase complex component B subunit alpha (EC:1.21.4.3) 99 31 Op 3 . - CDS 88451 - 89500 1653 ## Clos_0958 selenoprotein B (EC:1.21.4.2) 100 31 Op 4 . - CDS 89513 - 90799 1829 ## CDR20291_1637 sarcosine reductase complex component B subunit alpha - Prom 90844 - 90903 8.1 101 32 Tu 1 . - CDS 91385 - 92809 867 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 - Prom 93008 - 93067 9.8 - Term 93051 - 93103 6.1 102 33 Op 1 . - CDS 93124 - 94269 1853 ## CDR20291_2237 glycine/sarcosine/betaine reductase complex component C subunit alpha 103 33 Op 2 . 
- CDS 94285 - 95811 2402 ## CD2349 glycine/sarcosine/betaine reductase complex component C subunit beta (EC:1.21.4.2) 104 33 Op 3 . - CDS 95893 - 96126 385 ## Amet_3591 selenoprotein B (EC:1.21.4.2) 105 33 Op 4 . - CDS 96154 - 97203 1801 ## CD2351 glycine reductase complex component B gamma subunit (EC:1.21.4.2) 106 33 Op 5 . - CDS 97234 - 97545 526 ## TDE0745 glycine/sarcosine/betaine reductase complex protein A (EC:1.21.4.3 1.21.4.4 1.21.4.2) 107 33 Op 6 . - CDS 97561 - 97695 168 ## CDR20291_2240 glycine/sarcosine/betaine reductase complex protein A 108 33 Op 7 . - CDS 97758 - 99044 2061 ## CDR20291_2241 glycine reductase complex component B subunits alpha/beta 109 33 Op 8 . - CDS 99068 - 99385 566 ## COG0526 Thiol-disulfide isomerase and thioredoxins 110 33 Op 9 5/0.000 - CDS 99460 - 100629 1490 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 111 33 Op 10 1/0.167 - CDS 100645 - 101937 1538 ## COG0477 Permeases of the major facilitator superfamily 112 33 Op 11 . - CDS 101960 - 102904 519 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 113 33 Op 12 . - CDS 102934 - 103305 488 ## CD2357 putative glycine reductase complex component - Prom 103545 - 103604 9.1 - Term 103692 - 103741 9.4 114 34 Tu 1 . - CDS 103763 - 104896 1546 ## COG1960 Acyl-CoA dehydrogenases - Prom 105126 - 105185 10.4 + Prom 105061 - 105120 17.4 115 35 Op 1 10/0.000 + CDS 105156 - 105824 778 ## COG2391 Predicted transporter component 116 35 Op 2 10/0.000 + CDS 105862 - 106089 376 ## COG0425 Predicted redox protein, regulator of disulfide bond formation 117 35 Op 3 1/0.167 + CDS 106104 - 106658 832 ## COG2391 Predicted transporter component + Prom 106687 - 106746 3.4 118 36 Op 1 1/0.167 + CDS 106799 - 108199 1684 ## COG2897 Rhodanese-related sulfurtransferase 119 36 Op 2 . + CDS 108221 - 108775 677 ## COG2391 Predicted transporter component + Prom 108814 - 108873 3.0 120 37 Tu 1 . 
+ CDS 108895 - 109839 435 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 + Term 109867 - 109903 4.2 + Prom 109882 - 109941 6.6 121 38 Tu 1 . + CDS 109987 - 110202 275 ## Closa_1519 SirA family protein 122 39 Op 1 1/0.167 - CDS 110195 - 111079 775 ## COG0583 Transcriptional regulator 123 39 Op 2 . - CDS 111106 - 112467 2250 ## COG1109 Phosphomannomutase 124 39 Op 3 40/0.000 - CDS 112495 - 114087 1875 ## COG0642 Signal transduction histidine kinase 125 39 Op 4 . - CDS 114099 - 114773 944 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 114797 - 114856 7.6 - Term 114816 - 114850 2.4 126 40 Op 1 1/0.167 - CDS 114957 - 116231 1646 ## COG0826 Collagenase and related proteases 127 40 Op 2 16/0.000 - CDS 116296 - 117747 2066 ## COG0305 Replicative DNA helicase 128 40 Op 3 . - CDS 117754 - 118209 573 ## PROTEIN SUPPORTED gi|229210387|ref|ZP_04336784.1| LSU ribosomal protein L9P 129 40 Op 4 1/0.167 - CDS 118248 - 119738 1723 ## COG2812 DNA polymerase III, gamma/tau subunits - Prom 119803 - 119862 8.9 - Term 119858 - 119915 7.4 130 41 Op 1 . - CDS 119922 - 120614 1016 ## COG0689 RNase PH 131 41 Op 2 . - CDS 120645 - 121877 2052 ## COG0112 Glycine/serine hydroxymethyltransferase - Prom 121903 - 121962 9.5 132 42 Op 1 . - CDS 121990 - 122754 1142 ## COG1058 Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA 133 42 Op 2 4/0.000 - CDS 122779 - 123219 531 ## COG0346 Lactoylglutathione lyase and related lyases 134 42 Op 3 . - CDS 123255 - 123662 364 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Prom 123688 - 123747 2.2 135 43 Op 1 . - CDS 123755 - 125455 2387 ## COG0366 Glycosidases 136 43 Op 2 . - CDS 125480 - 125929 491 ## COG4405 Uncharacterized protein conserved in bacteria 137 43 Op 3 . - CDS 125951 - 126280 433 ## Lebu_0807 branched-chain amino acid transport 138 43 Op 4 . 
- CDS 126273 - 126983 965 ## COG1296 Predicted branched-chain amino acid permease (azaleucine resistance) - Prom 127052 - 127111 4.7 139 44 Tu 1 . - CDS 127183 - 127347 266 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains Predicted protein(s) >gi|261747353|gb|ADAD01000125.1| GENE 1 300 - 497 342 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038565|ref|ZP_06011934.1| ## NR: gi|262038565|ref|ZP_06011934.1| hypothetical protein HMPREF0554_1624 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1624 [Leptotrichia goodfellowii F0264] # 1 65 1 65 65 73 100.0 4e-12 MSITELLEKLEKIYDFDEENPLEDEEPDEITEIPITELFKEVINDEENLFKDDEPDNDKK TSYLN >gi|261747353|gb|ADAD01000125.1| GENE 2 527 - 1597 928 356 aa, chain + ## HITS:1 COG:BH3551 KEGG:ns NR:ns ## COG: BH3551 COG0582 # Protein_GI_number: 15616113 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Bacillus halodurans # 9 346 11 362 378 100 28.0 4e-21 MRTRKIGKKWYYNFDIYIDGKRKIIEKVGGNTKKEAIEKGEIAKKDYEQTSNSFSQSIAT LSDLIEDYRKNYINIFLRENTKRIRNYYFNIIIKDIGDTKLQKMNVRFFQDYIVKTKDPN NIKKFLNSLFNYAVRVELISKNPIKNVIVPKAEKIDFDLITPEDIEIIKNMEIREVLKDF ILFIYLTGVRPSEGLALEFKNINFKDKTISIERSINKYKRNLFEPLKNNSSKRKILFNDI IEDIFTRRKSAIEKLIDKSEGYYKSDLIFANSKGNTFIISDFNKEKYKIKEKINKNFHFR ALRHMHTTLLIEKGVNIKAVQERLGHADIKTTLEVYAHVTEKMKKEVIPVLDDLLK >gi|261747353|gb|ADAD01000125.1| GENE 3 2057 - 2803 1014 248 aa, chain + ## HITS:1 COG:no KEGG:FN0557 NR:ns ## KEGG: FN0557 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 243 1 244 244 263 62.0 4e-69 MKRILSGIFVLLGILTFSTEISATQKIKRGESMELVFIMDRSGSMGGLESDTIGGYNSML NKQKKEEKGEVYVTTVLFDDQYELLHDRISISKVKPITDKDYYVRGSTALLDAIGKTVAK VKAEQNKLGKEKSKKVLFIIITDGMENASKEYSVAAVKNLIETQKKKDKWEFLFLGANID AIGTAESFGIESSKAANYRSDSQGTQNNYKVLNEAIMEIRSGKELKDTWKKEIEEDYNNR EQKNNQKK >gi|261747353|gb|ADAD01000125.1| GENE 4 2922 - 5141 2998 739 aa, chain - ## HITS:1 COG:CAC3655 KEGG:ns NR:ns ## COG: CAC3655 COG2217 # Protein_GI_number: 
15896888 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Clostridium acetobutylicum # 8 737 78 810 818 755 56.0 0 MKESIYNVTGMSCAACARTVENVLNKNENIEAHVNIATEKVNIKYDEKKYDFEKIKEIVE NSGYGLIETLSEEEKMQIYENRIKSLRNRLILSIIFIIPLFYISMGHMVRLFLPNVINPE KNALIYAVAQLILTLPIVYAGRDFFIHGFKNLLRKSPTMDSLIAIGSSAAIFYSLYATYM IAIGDGEHHMNLYYESAGTIITLILLGKLLEARTKGQTSSAIKKLIGLQPKKAKIIENGQ EKEVLIENIKTGNIIIVRPGEKIPVDGRIIKGSTSIDESMITGESIPVTKNEGDKVIGGS INKNGSIEFEATEIGKDTVLSQIIRLVEEAQGSKAPISRMADIVAGYFVPAVIFIAVVTG SVWYIGGSGLTTALTFFISVLVIACPCALGLATPTSIMVGTGKGAEKGILIKSGEALETA HKIKTVVLDKTGTITKGKPVLTDLKIYGNYNENEVLQLAASAESKSEHPLAEAIVNKAEE KNIELKKHEKFRAMPGYGIRVQMDEKEIQIGNRKLMTSKKIDINQAEKDYEILSDEGKTP IFISVNNELAGLAGVSDVIKETSKEAVERFHKLGLEVIMLTGDNEKTAKYIAKEVGIDKV IAGILPFQKSEEIKKLQSQGKFTAMVGDGINDSPALAQANVGIAIGSGTDIAIESADIVL IRNDLKDVAEAIELSRATITNIKENLFWAFFYNVIGIPFAAGIFYAFFNGPKLDPMIAAF AMSLSSVSVLLNALRLKLR >gi|261747353|gb|ADAD01000125.1| GENE 5 5165 - 5641 539 158 aa, chain - ## HITS:1 COG:SPy1717 KEGG:ns NR:ns ## COG: SPy1717 COG3682 # Protein_GI_number: 15675568 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Streptococcus pyogenes M1 GAS # 7 146 3 142 144 92 32.0 4e-19 MEDNNCHITGAEWEVMRVVWANEEVTSKFVSQVLGEKMQWKHTTVKTLLNRLLEKNVLKK RESGNKYIYYTEYNEQEIAEKYVTETFDRICKTKVGGMIKELIEKSLLSRKDLENIEKIV KEKKKNAPEKIKCMCIEGQCNCSEHEHKHGDEKSCCKE >gi|261747353|gb|ADAD01000125.1| GENE 6 5875 - 6765 932 296 aa, chain + ## HITS:1 COG:no KEGG:NT05HA_1979 NR:ns ## KEGG: NT05HA_1979 # Name: not_defined # Def: hypothetical protein # Organism: A.aphrophilus # Pathway: not_defined # 5 295 1 293 296 360 63.0 4e-98 MSKKIRKFYISIILFISVFSLSYSQKTVPLSSYIYGITVDDSWDGKIKTEQIIEAIKAMS IKPTVRIVMSKDVSPKEYKELFSKIHKVAYIMATPVDSYEMKKYSVSGYLKRFKDSYETL SEYTDIWEIGNEVNGENWLGNDPKLISEKIYTAYKYIRNNNGKTALTSYYFVPGEQKIPM NDWLKKYIPPDMKNNLDYVLVSYYEDDNNGYHPEWKKVFEDLESIFPNSKLGIGECGSTK KGAEIASKVKKARHYYTMPKYVKNYIGGYFWWYWVQDSVPYKNSKVWEAINESISK >gi|261747353|gb|ADAD01000125.1| GENE 7 
6861 - 7673 913 270 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038497|ref|ZP_06011866.1| ## NR: gi|262038497|ref|ZP_06011866.1| hypothetical protein HMPREF0554_1631 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1631 [Leptotrichia goodfellowii F0264] # 1 270 1 270 270 401 100.0 1e-110 MSIRRKALLINIFMIIGLSILLGIVIFIMFFVLQKEYSIIDYLLFAGVFLIIDRIITKIQ KKMWFRLYSKYMNVLNEELDPEKFIKLTESEYSENRNRRYRNYMRMNFCAGYSSLGEFEK AYEYLSEIDFSKRNAFRKQDMIMYYFNKALLLDSLEREEEAKEVYEKNLLNEKKKFTNNK KINEINALIDSLEGFLFYKDDNEKMIEILSKILREIKVKRYVIAMKYELAVCKEKIGEIY EAKSLYEEVAREGNKLYIVNEARKKLEILS >gi|261747353|gb|ADAD01000125.1| GENE 8 7703 - 8212 227 169 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 [Bacillus selenitireducens MLS10] # 3 142 11 144 236 92 32 1e-17 MTLEDIIMRILVAGIIGALIGNERKKSGKPAGIATHCVVCLGAALLAMIQSKIDENILKA VAVNPKIIGAFKVDTARIPAQIVSGVGFLGGGVILHSKNTISGITTAATIWITAGLGMGA GFGYFEIVIPTAIFLLGALYVLKGHGLFSEIEENHHSEDKTNNSTDDKK >gi|261747353|gb|ADAD01000125.1| GENE 9 8209 - 9531 1718 440 aa, chain - ## HITS:1 COG:CAC0460 KEGG:ns NR:ns ## COG: CAC0460 COG1253 # Protein_GI_number: 15893751 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Clostridium acetobutylicum # 8 438 9 440 443 411 55.0 1e-114 MDTTTGSNIFLQFLIIIILTGINAFFSGAEMAIVSINKNKLKMLVEEGNKKAILLENLLK EPSKFLSTIQVGITLAGFFASASAATGLAQYLSLYMKKIHIPYSNQLSMILITFVLSYIT LVFGELIPKRIALRSSEKMALSSVNVIVFISKVFSPFVKILTLSTNSVLTILKMKEDNLE EQVSKEEIRSLVEVGREHGIINESEKEMIENIIEFDEKIAREIMIPRTKVFLIDKDISVK ELFEKKETGKYSRIPVYEKEADNIVGVLHMKDLMMEAYKQGFENIKIEDIIQEAYFVPET KNVNELFNELQIEKKHIAILIDEYGGFSGIVTLEDLIEEVMGNISDEFDDEDYSIKKLAL NKYLISGELSLNDINDYFHIELESKHYDTLSGLLIEHLGYIPEDDQEIEPIVIDDIIFKP QRVKDKKIERILVTFNKEKK >gi|261747353|gb|ADAD01000125.1| GENE 10 9552 - 10502 486 316 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 [Bacillus selenitireducens MLS10] # 4 313 5 312 323 191 33 1e-47 
MNYYVGMDLGGTNTKIGLVDEGGNIIFTTIVKTESMEGFEKTIERLSKILIEQVKGSNIN YDDVKGVGLGVPGPVVNERVVKLWANFPWPKEVDLAGEFEKHLNRKVKVDNDVNVITLGE MWKGAAQGYKHVLGLAIGTGIGGGIIVDTKLVSGKNGGGGEVGHTKVEKEGKLCGCGQNG CWEAYASATGLIREAKSRLTVHKNNKLYEKITSMGRELEAKDIFDAAKEGDEFSLNLVDY EAEYIALGLGNLLNTLDPEIIVIGGGVALAGDILFNRINEKLHKYALSSTLEGLKILPAQ LGNDAGIIGAAYLGMN >gi|261747353|gb|ADAD01000125.1| GENE 11 10612 - 10920 385 102 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038612|ref|ZP_06011981.1| ## NR: gi|262038612|ref|ZP_06011981.1| putative ribose operon repressor [Leptotrichia goodfellowii F0264] putative ribose operon repressor [Leptotrichia goodfellowii F0264] # 1 102 1 102 102 165 100.0 1e-39 MAENQNINALILNSDFYAVFAKKITNEIGKKIYVTSFDGTSLLKMASGNIIHIKQPFEEM GKVSAETLLKKINQEKYQKKRYLNYELIDKDYRNEVKEKETF >gi|261747353|gb|ADAD01000125.1| GENE 12 10956 - 11456 706 166 aa, chain - ## HITS:1 COG:no KEGG:SP670_2008 NR:ns ## KEGG: SP670_2008 # Name: not_defined # Def: hypothetical protein # Organism: S.pneumoniae_670-6B # Pathway: Glycerophospholipid metabolism [PATH:snb00564] # 1 166 44 208 286 105 35.0 6e-22 MIEVDVIKSTDGIYYAFHYGNEKRLLDRDKNIKEMPSEEIEEKKYINSIGERTGYKVERL DSILKYFEGKDVLINIDRAWEYFEELLTFFDKYNVKKQLVIKSSPKKEFLDIFEKHNVKY MYMLIVKDKKEIETALEYNNVNIVGFEIIAKDEKDDFFDDGFIKEI >gi|261747353|gb|ADAD01000125.1| GENE 13 11795 - 12481 620 228 aa, chain - ## HITS:1 COG:CAC0364 KEGG:ns NR:ns ## COG: CAC0364 COG1234 # Protein_GI_number: 15893655 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Clostridium acetobutylicum # 6 159 4 164 167 103 37.0 4e-22 MKKLKFTGIGSAFNPILGNTNAYFTHNDDFYLIDCGESVFGKIWNLKEMTESNNIFILIT HFHCDHVGSLGSLISYLYLKLGKVAYVIHPNKKIIEYLDIVGIERKFYVYEKELPEISEI KNDVIEVKHADDMKCYAYEIIFPDETVYYSGDAVEIPDKILQKYFSGKIKEMYQDTCSYV SENPSHGNIFYLEEIIPVEKRKNVYCIHLDKDFRDIIQEKGFGVIEDI >gi|261747353|gb|ADAD01000125.1| GENE 14 12494 - 13318 749 274 aa, chain - ## HITS:1 COG:lin0761 KEGG:ns NR:ns ## COG: lin0761 COG0395 # Protein_GI_number: 
16799835 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 4 273 6 291 291 156 36.0 4e-38 MNIIKKGTDSVITVIAFVIAFIWLIPLIWMIGTAFTEPSFSMTLFPKTGFTLKNIIYVWN AVPFKTYYINTLILVTVTFIIQIFTSTMAAYALAVMDLKWAKYVFIIIFMQIIIPNDVLI TPNFMTLKDLNLINTKLGIMIPFLGTAFGIFLLRQHFKTIPKSLSEAARIDGANTWQIIW KIYMPCAKPAYLSFGVVSISYHWNNYLWPLIVTNSTQNRTLTVGLALFAKSKEATMQWAN VSAATFIIIFPLIVIFFILQKRFINSFISSGIKE >gi|261747353|gb|ADAD01000125.1| GENE 15 13342 - 14232 981 296 aa, chain - ## HITS:1 COG:XF2447 KEGG:ns NR:ns ## COG: XF2447 COG1175 # Protein_GI_number: 15839038 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Xylella fastidiosa 9a5c # 10 282 3 280 293 168 34.0 1e-41 MNIKLSRKVKENITGFLLLLPSIIFMTGFTVLPVFKSFYLSFTKYNLGMKAPKLIGLNNY IALFKSELFWKVMYNTIFFSVITVVPSMILGLALAFLVNRRSKAVGILRTVYFYPVVMPM IAVASIWMFIYMGKNGLLDQWLSHFGFKPLNVLSNKNTVLPFMAIMYVWKESGYLMIFFL AGLQGISEDLFESARIDGAGFWVMLKNITLPLLKPTMIFVSTIALTNCFKLVDHIVIMTE GAPNNASTMLLYYIYQQGFTNFNYGKSSALTVIMLFLLLFVSLPRFFKQDKEAYYN >gi|261747353|gb|ADAD01000125.1| GENE 16 14255 - 15556 1734 433 aa, chain - ## HITS:1 COG:TM1120 KEGG:ns NR:ns ## COG: TM1120 COG1653 # Protein_GI_number: 15643877 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Thermotoga maritima # 3 433 2 432 436 222 34.0 9e-58 MKKKIFVLLFLSLILVISGCSGNKEEKKDGKVKLVFYYPVNVGGPVAKIVEQLTNDFNKE NPDIEVEAVYTGNYDDTVTKIQTAAQGGNPPDLFVSLATQRFTMASSEMAMPLDELIAED GEEGKKYIDDFLESFIEDSYVDGKIYSIPFQRSTMVLYYNKDIFKEAGLDPEKAPETWEE LVEIAKKVTNDKRKGVGIALNSGSAQWAFTGFALENSIDGKNLMSDDGKKVFFNTPENVE ALQFWIDLQKKYKVMADGIVQWTDLPTQFLAGEVVMIYHTTGNLSNIAKNAKFNYGVAFL PGHKRKGAPTGGGNFYISSGISKEKQKAAWKFIKYLTTPERAAQWSVDTGYVATRKSAFE TDIMKKYYQERPQAKVAFEQLKFAKPELTTYNAAEIWRILNDNIQSAITGEKSAKDALDN AQNQATEVLKDIN >gi|261747353|gb|ADAD01000125.1| GENE 17 15662 - 16663 974 333 aa, chain - ## HITS:1 
COG:SP1821 KEGG:ns NR:ns ## COG: SP1821 COG1609 # Protein_GI_number: 15901650 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 3 330 7 332 333 199 32.0 7e-51 MKKNMTIKDVAEKCGVSTQTISRVINESTNVRESTRQFVKKKIEELGYKPNLYAKNLSNR KTKNILVSIRRNKGHTATIWTNILVSEIFAYNKDKNVSLFMEQYYDDEDLKNSLLNTSNT FIDGAIIFYEKKNDKRIQILRKEKIPFIVLGKSYSEENVYVSNDDYNSVFKATEYLFGKN IENIVFITANPTPMNIERKNGIIEAYQKNNKSLKTLKIMEKMNNQKEIYKLVKELERKNL LPEAFFVSGDEKAIAVLKALNDLKINIPDEISVLGLDNIPISEFFSPGLTTLALNYKKIS ERVYEKLINMMNGKKEYSEEIPGEIVERESVRK >gi|261747353|gb|ADAD01000125.1| GENE 18 17002 - 17970 1149 322 aa, chain + ## HITS:1 COG:FN0405 KEGG:ns NR:ns ## COG: FN0405 COG0180 # Protein_GI_number: 19703747 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Tryptophanyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 2 320 3 321 325 431 66.0 1e-121 MRSLSGIQPSGNLHIGNYFGAIKQFVEFQDKYEGLYFLANYHALTSSPKGEDLKRNTVNA ILDYLALGLDPEKSIIFLQSDVPEHTELSWILANVAPMGLLERAHSYKDKTAKGIKPNVG LFTYPILMAADILMYDPDVVPVGKDQKQHVEITRDIAIKFNETYGKEVFKLPEEKIVENV AVVPGVDGDKMSKSYGNVINMFIPKKEIKKQIMGIVTDSTPLEEPKNPDNNITKLYSLLA TEAEVEEMKRKFSEGNYGYGHAKNELFEKFIDYFNPFIEKREKLEKNIDYVYDVLKEGAF KARSIAQKKMEEVRSTVGLLKI >gi|261747353|gb|ADAD01000125.1| GENE 19 17996 - 18175 214 59 aa, chain + ## HITS:1 COG:no KEGG:Clocel_3263 NR:ns ## KEGG: Clocel_3263 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulovorans # Pathway: not_defined # 1 59 1 59 264 82 61.0 8e-15 MNKNIPNWINILNEFCGKRDIPALTSYELNKKYCFSQADIMVLFGGTALCGGDILAQAI >gi|261747353|gb|ADAD01000125.1| GENE 20 18395 - 18793 302 132 aa, chain + ## HITS:1 COG:ECs2020 KEGG:ns NR:ns ## COG: ECs2020 COG1434 # Protein_GI_number: 15831274 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 4 116 136 251 266 80 32.0 5e-16 MLDLLNEHKIPFKNIIISQDATMQLRMDVTLRKYVSNKIEIINFATYSVKVSEKDGNLTY DKNIPGMWNINHFITLLMGEIPRLTDDENGYGPKGKNYLAHIDIPDNVQNAFSELKKIYA 
NSVRQANPLYSS >gi|261747353|gb|ADAD01000125.1| GENE 21 19099 - 19329 457 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038581|ref|ZP_06011950.1| ## NR: gi|262038581|ref|ZP_06011950.1| MAP1-LC3 domain-containing protein [Leptotrichia goodfellowii F0264] MAP1-LC3 domain-containing protein [Leptotrichia goodfellowii F0264] # 1 76 1 76 76 112 100.0 9e-24 MLEVFLFNVTVGGLILSVAGMIGLELSENFKKEYKIVKRKSKISKNEKDKRQSNIKIAKN EGIIDIRMSKSGEEVA >gi|261747353|gb|ADAD01000125.1| GENE 22 19535 - 20056 875 173 aa, chain - ## HITS:1 COG:SMc04241 KEGG:ns NR:ns ## COG: SMc04241 COG0663 # Protein_GI_number: 15965664 # Func_class: R General function prediction only # Function: Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily # Organism: Sinorhizobium meliloti # 2 166 3 169 176 162 45.0 3e-40 MIYKLDGIKPKITGEVFVAESADVSGNVELSDGVNIWFGAVLRGDIEKITIGKNSNVQDN STVHTDFGLPCIVGENVTVGHNVILHSCEIGDNVIVGMGSTVLNGTKIAPNCLIGANSLV THKIPYEEGVLILGSPAKIIRKLTEEELEHIKKNAAHYVENGKKFSKTLKKVL >gi|261747353|gb|ADAD01000125.1| GENE 23 20062 - 20805 735 247 aa, chain - ## HITS:1 COG:TM1569 KEGG:ns NR:ns ## COG: TM1569 COG0336 # Protein_GI_number: 15644317 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-(guanine-N1)-methyltransferase # Organism: Thermotoga maritima # 1 238 1 244 245 218 44.0 8e-57 MKFTVLTLFPELFEMYLSQTIIQRAADAGIVNYDIVNIRDYSGNKHNQMDDIPFGGGAGM VLKPEAYWNFFNKKCSDTEKPYIIFVSPQGKTLTHKRVTELTEKENIVIISGRYEGLDQR VIDKFVDEEISVGDYVLSSGDLPALIVMDSIIRIKEGVIKKESFETDSFYNGLLGFPQYT RPVEIDGMSVPEVLRSGNHAKINEYRYFKSIEKTSENRKDLLEKKLESDEEFRKLYQKFL KSKNGKD >gi|261747353|gb|ADAD01000125.1| GENE 24 20806 - 21315 770 169 aa, chain - ## HITS:1 COG:FN0284 KEGG:ns NR:ns ## COG: FN0284 COG0806 # Protein_GI_number: 19703629 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RimM protein, required for 16S rRNA processing # Organism: Fusobacterium nucleatum # 4 169 3 168 173 97 40.0 1e-20 MENLVNIGTITGTHHLTGNVKINSIFQETDLIIGEKVLLEKEDKKKILTVKKIKRLNEKK 
VIAEFEEIDNIDSAKELNGFQIKIRRDLLPEKNENEFYLKDLLGVEVFEGNGKIGDVIDV METAAHNILIIEDTVNKKEIMVPLVDEFVKNIDFANNKIEVELIEGMRE >gi|261747353|gb|ADAD01000125.1| GENE 25 21329 - 21742 670 137 aa, chain - ## HITS:1 COG:FN1842 KEGG:ns NR:ns ## COG: FN1842 COG3412 # Protein_GI_number: 19705147 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 8 135 3 134 136 75 37.0 2e-14 MRKAKKGLAGIVVVSHSNKLAEEVINFSKVLQQTEFNIENGGNVNQEVYGTNVENVKEAV KRADSGNGVLIFVDMGSSLFHSLKAMEDLKGQIEVEIADAPLVEGVISAVAANFDGMTLK ELKEIAEESRNFKKIKK >gi|261747353|gb|ADAD01000125.1| GENE 26 21765 - 22328 646 187 aa, chain - ## HITS:1 COG:FN0282 KEGG:ns NR:ns ## COG: FN0282 COG4752 # Protein_GI_number: 19703627 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 187 1 187 187 249 63.0 2e-66 MRNNIYVGLVHYPVYNRNGETVATSVTNFDIHDISRTCRTYDIKKYFIITPVDAQQELTN RIINYWVEGDGIEFNKNRNEAFENTDLEDSVQSAAEMILKAEGKMPKIITTSAKIFPNTV SYAELGKEIVEDESPYLILFGTGWGLTEEIMNMSYKILEPVRGKTKYNHLSVRSAVSIIL DRLLGEN >gi|261747353|gb|ADAD01000125.1| GENE 27 22504 - 23067 534 187 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038593|ref|ZP_06011962.1| ## NR: gi|262038593|ref|ZP_06011962.1| 235 kDa rhoptry protein [Leptotrichia goodfellowii F0264] 235 kDa rhoptry protein [Leptotrichia goodfellowii F0264] # 1 187 1 187 187 221 100.0 2e-56 MKMSIKEYMKEKNVSRRTIYNRIEKGQLEVIKEKNKTYIIKENLNINTKNTKIDIPELKE LKEMFQLIQNSNELLKNFDYSFLRDRISSIEKSLINFTMDISRANIDMKEKMNTFSQNIN EKTENIQKKNNNFMENILLINEQNSNFFKESIENLQAKIENLEAKFESIESKLDEIIKKQ DKKGLFR >gi|261747353|gb|ADAD01000125.1| GENE 28 23161 - 24039 1315 292 aa, chain - ## HITS:1 COG:FN0489 KEGG:ns NR:ns ## COG: FN0489 COG0682 # Protein_GI_number: 19703824 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Prolipoprotein diacylglyceryltransferase # Organism: Fusobacterium nucleatum # 1 288 1 285 288 246 49.0 5e-65 
MKPYLFKIGKFELRIYSLMYIIAFLSGIFIASKDEVAKKRGIKDKKIIEDFTFWAMISGL IGARLYYVFFKFGEYAGDPLSIFQVWKGGLAIHGGILGGLIGSYLYAKKNKLNIWVLTDM GVGPLLFGQFLGRFGNLANGEIHGVPTFTPWSVILSGKFSQWWASYQSMSIEMQSKFKEL VPWGLVFPKDTAAGIEFPDYPLHPAMLYEAFLNLIGFLLLWFYFRKKEYNPGVLSMIYLI MYAVIRVFVSTFRAEDLTVYGIRAPYLISIVMLIGAFVGIKYFNRNNKIKTN >gi|261747353|gb|ADAD01000125.1| GENE 29 24040 - 25131 1655 363 aa, chain - ## HITS:1 COG:TM1400 KEGG:ns NR:ns ## COG: TM1400 COG0075 # Protein_GI_number: 15644152 # Func_class: E Amino acid transport and metabolism # Function: Serine-pyruvate aminotransferase/archaeal aspartate aminotransferase # Organism: Thermotoga maritima # 6 342 10 342 384 241 38.0 1e-63 MSSKLLLTPGPTNIPERYLKIFGEDIIHHRTPEFRRILKESNENLKKVFKTKNDVAVITS SGTGAMEAAIVNFFSKGDKVIVINTGYFGERFRKISEIYGLNVINLGYEFGDGYKLEDVK KALAENTDVKGILVTHSETSVGILNDVKALGELTKDTDMLLVVDTISGLVANDFDFDGWH VDVAIAGSQKAFLIPPGLSFIAISDKAKKAMEKSDLPKYYFSIKQYQKYFDESSETPYTP AIALILALHESLKDLINKGIDTTIKEKYELRKLIEEKAQNIGFNLLVKEEKNRTNTLVSV YREGVIIKDIINALEDQGYTVTGGKGKYGESLMRIGILGEITKEQINDFFILFEKELKKQ LGE >gi|261747353|gb|ADAD01000125.1| GENE 30 25181 - 26599 2288 472 aa, chain - ## HITS:1 COG:SP1119 KEGG:ns NR:ns ## COG: SP1119 COG1012 # Protein_GI_number: 15900986 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Streptococcus pneumoniae TIGR4 # 2 472 3 473 474 700 74.0 0 MKYKNLLNGEWKESEKEITIYSPINDEKLGTVPAMTREEVDYAMETAKIALDKWKSMAVV ERANYLYKAAEILERDKEIIGEILAKEVSKGIKAAIGEVTRTADLIRYSAEEGMRTVGEI VEGGSFEAASKNKIALVRKEPMGLVLAIAPFNYPVNLSASKIAPALIGGNVVIFKPPTQG SISGLLLVKAFYEAGIPESVINTVTGKGSEIGDYLIGHPLVDFINFTGSTPVGKRIGELA GMRPIMLELGGKDGAIVLDDADLEKAAKDIVSGAFSYSGQRCTAIKRVLVMENAADRLAE LITEEVKKLKVGDPFDNADITPLIDNRAADFIEGLVIDAEEKGAKQLTEYKRERNLLWPV LFDHVTLDMRIAWEEPFGPVLPLIRIKSMEEAIEICNGSEYGLQTSVFTKDIVKAFDIAG KLEVGTVQINNKTQRGPDNFPFLGIKGSGVGVQGIKYSIQSMTKVKSIVIDL >gi|261747353|gb|ADAD01000125.1| GENE 31 26632 - 27468 1150 278 aa, chain - ## HITS:1 COG:FN1224 KEGG:ns NR:ns ## COG: FN1224 COG2877 # Protein_GI_number: 
19704559 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase # Organism: Fusobacterium nucleatum # 1 277 9 285 286 392 66.0 1e-109 MIINKVNEVKITENIKVGNGKMFLIAGPCVIESEDLVNEVAAKMKEITDKLGIQYVFKAS FDKANRSSISSFRGPGIEKGLEILSRVKEKYGVALATDIHEPWHCKVAAEVIDLLQIPAF LSRQTDLLIAAGETGKAVNIKKGQFLAPWDMKNVVKKFEEIGNKNIMLCERGASFGYNNL VVDMRGLLEMRKFGYPVVFDATHSVQIPGGQGETSGGNSVYVYPLARAALAVGVDGIFAE VHPDPSVAKSDGPNMLKLEDVENILKHLLKYDELTKGY >gi|261747353|gb|ADAD01000125.1| GENE 32 27692 - 28942 1739 416 aa, chain + ## HITS:1 COG:NMB1476 KEGG:ns NR:ns ## COG: NMB1476 COG0334 # Protein_GI_number: 15677330 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Neisseria meningitidis MC58 # 1 416 5 421 421 642 72.0 0 MPKETLNPFEIAQKQIKTACDRLNADPAVYEILKNPMRVLEVSFPVKMDDGSIKSFTGYR SQHNNAVGPYKGGIRFHQNVTRDEVKALSTWMTFKCSVVGIPYGGGKGGITIDPKEYSQA ELERISKAYAAAISPLIGEKVDIPAPDVNTNGQIMSWMIDSYEKIAGKSAPGVFTGKPLG FGGSLARTEATGYGVSLSAKKALEKIGKNINSATFAVQGFGNVGFYTAYYAHKNGAKIVA ISNVDTAFYNENGIDMEKVIKEVEEKGFVTNNGYGKEIPHNELLELEVDVLAPCALENQI TSENADRIKAKVIVEGANGPTTPEADEILFKKGIIVVPDILANSGGVAVSYFEWVQNLQN YYWEFDEVQQKEDALLSKAFEEVWALSEKYKTDLRNASYMKSIERIAKAMKLRGWY >gi|261747353|gb|ADAD01000125.1| GENE 33 28976 - 29272 421 98 aa, chain + ## HITS:1 COG:CAP0038 KEGG:ns NR:ns ## COG: CAP0038 COG2350 # Protein_GI_number: 15004742 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 4 97 1 94 96 91 46.0 4e-19 MKNIFIAILTYKKPLDEVVKFRPAHLDFLDEYYNAGKFIVSGRQTPPVGGVIVAHNVTKD ELESILKEDPFYKNELADFEIVEFTPAKYAEKFENFVK >gi|261747353|gb|ADAD01000125.1| GENE 34 29326 - 29745 345 139 aa, chain - ## HITS:1 COG:FN1345 KEGG:ns NR:ns ## COG: FN1345 COG0596 # Protein_GI_number: 19704680 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: 
Fusobacterium nucleatum # 6 134 143 271 275 129 49.0 2e-30 MKSLTKILEKLDKEERQKFSELAVIADERTYKRYIEDIKEGNQVANMEFIKKLLEKYKFS FEPDELLREIRFENPVLFIGGRQDNIVGYRDLGKLSEDYPRCSYSVIDTAGHNLQTEQPE IFECLVKDWIKRVEINLKN >gi|261747353|gb|ADAD01000125.1| GENE 35 29708 - 30148 478 146 aa, chain - ## HITS:1 COG:FN1345 KEGG:ns NR:ns ## COG: FN1345 COG0596 # Protein_GI_number: 19704680 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Fusobacterium nucleatum # 1 142 1 144 275 138 49.0 4e-33 MFYEKDNLKLHYKVIGEGKPIILLHGLACHMKLMENCMEPVFKNKNYKRIYVDLPGMGKS QVKTDSPSSDFILESLISFIEFITDENFLLAGESYGGYIARGILAKMHERVDGMLLVCPV AVPEYGKRDLPVKNIEIFDENFRKTG >gi|261747353|gb|ADAD01000125.1| GENE 36 30166 - 30756 315 196 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|162456259|ref|YP_001618626.1| putative ribosomal protein [Sorangium cellulosum 'So ce 56'] # 1 189 6 197 207 125 36 8e-28 MKIILATKNEGKIKEFEKLTEGMNIEVLSILDNIDFPDVVEDGKTFEENSAKKAKEIAKY TGITTVSDDSGLCVDILNGEPGIYSSRYSGENATDASNMEKLLKNLSNIQKEKRKAHFVS VVSIAFPDGSVKSFRGETEGEILFEKEGNNGFGYDPIFYSYDLKKSFGNAMPEEKKSVSH RGRAFQKLKKEVLEKL >gi|261747353|gb|ADAD01000125.1| GENE 37 30776 - 31522 922 248 aa, chain - ## HITS:1 COG:TM1693 KEGG:ns NR:ns ## COG: TM1693 COG0204 # Protein_GI_number: 15644441 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Thermotoga maritima # 3 248 10 244 247 140 36.0 2e-33 MRTLFYHFIIVGTFIYGTIVHYWYLLLNKGEKRYRYVCKVGKNWGKNLIWASGSQVKTLY KNGSEEEINKIRETGEAVILISNHQSNVDIPTLLGYLPLEFSFIAKKEMKKWPMIGRWMR SFDCIFLDRKNVRQGMKDMKEAISKIKNGHSYVIFPEGSRSKDGTIEEFKKGSFKLATDT GAKIVPITLVGTYEVQNRKSLKITPNKDIKIIVDKPLDLKELSKEEQKEVHEIVNKIIKN NFEEYKNR >gi|261747353|gb|ADAD01000125.1| GENE 38 31536 - 32123 630 195 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1699 NR:ns ## KEGG: Lebu_1699 # Name: not_defined # Def: regulatory protein RecX # Organism: L.buccalis # Pathway: not_defined # 1 186 1 199 202 162 61.0 1e-38 
MIIEKIQKNKLYLSTEEIMDVSPLIRQKYCLKVNDNIESLYDDISYEASLEKGIFLLSLK DRTKKELQMKLNEKYHNEKMVEKAVLKLAELGYIDDLNYAVSYINSRKYGKQRITYNLLQ KGISKEKIEQAYINIQEETEKDVEDEKLKKAILKNIKKEEKKLVQYLVRQGFELDKIFRK LKEYKDFDGFENFEE >gi|261747353|gb|ADAD01000125.1| GENE 39 32120 - 33268 2078 382 aa, chain - ## HITS:1 COG:FN0547 KEGG:ns NR:ns ## COG: FN0547 COG0468 # Protein_GI_number: 19703882 # Func_class: L Replication, recombination and repair # Function: RecA/RadA recombinase # Organism: Fusobacterium nucleatum # 19 343 23 344 381 392 71.0 1e-109 MARKKAVEEKDDNKVQSEKERVLNLALTQIQKEFGEGSIMKLGENQKMNVKSISTGSINL DMALGIGGVPRGRIIEIYGAESSGKTTLALHVIAEAQKEGGVAAFIDAEHALDPVYAKAL GVNIDELLISQPDTGEQALEIADMLVRSGAMDVIVVDSVAALVPKAEIEGEMGDQQMGLQ ARLMSKALRKLTANISKSSTVMIFINQIREKIGGFSFTPGPQTTTSGGRALKFFSTVRME VKRVGSVKQGEEVIGNEVLVKVTKNKVAPPFKEARFNVMYGMGISRIGEILDAALELGVA SKSGAWFSYGDERLGQGRVNVEKMLQENKELYERMEKDVLDAIRPKTEEEKEPEVVETAK PKKETAKKSAAKAEKEEVEIEE >gi|261747353|gb|ADAD01000125.1| GENE 40 33468 - 34619 1674 383 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1986 NR:ns ## KEGG: Lebu_1986 # Name: not_defined # Def: hypotheticalprotein # Organism: L.buccalis # Pathway: not_defined # 1 383 1 379 379 350 57.0 9e-95 MKKLLLVLGILSISALGFADAKSDFENAQKLAGQKKINEAVKVLEGVANSSDKAYSAKAS FQLGAYYLQNNNKANAKKYLTMAWNGYDVAKEEALEASKLLYLVALQEKNVKDAEKYITW ANDKTGGKDVDAVSSLIIFYFDNNMQSKGQAKYAEASRNTNKEFTSELNFNIGQYYLGKN NTAQAKKYLQDSYNQSPNAVLPAGFLLAQIALSEKNNAQAEKILLEMNSKTGSKDAEVLG MLGSYYLQVNNLTKAEDYLKKSAAANRDNTDARLLLLALYESKNDTAKANTMYNEIKSTK GINQKTLNKEIGFYFAQLGNGNAAEKYLKKSINEDKDNESKLILGQVYYGMGKKAEAVSI LKEAVNNKVKGAAEVLKQVESGK >gi|261747353|gb|ADAD01000125.1| GENE 41 34654 - 35361 1157 235 aa, chain - ## HITS:1 COG:BS_ykuQ KEGG:ns NR:ns ## COG: BS_ykuQ COG2171 # Protein_GI_number: 16078482 # Func_class: E Amino acid transport and metabolism # Function: Tetrahydrodipicolinate N-succinyltransferase # Organism: Bacillus subtilis # 14 235 9 235 236 233 61.0 2e-61 MEDIMTELEKSEAIIKFIANAEKTTPVELYTDEEVADKGLCRVIGKDGLKLIIGDWKDVE 
EVIKNNNLKNYYLKNDRRNSGVPMLDIKNINARIEPGAIIRDKVTIGDKAVIMMGAVINI GAEIGEGTMIDMNVVLGGRAKVGKNCHIGAGAVLAGVIEPPSADPVIVEDDVVIGANAVV LEGVKIGKGSVVAAGAVVTENVPEKVVVAGMPAKIIKNVDDKTASKTGIVEDLRK >gi|261747353|gb|ADAD01000125.1| GENE 42 35387 - 36115 1114 242 aa, chain - ## HITS:1 COG:CAC2379 KEGG:ns NR:ns ## COG: CAC2379 COG0289 # Protein_GI_number: 15895645 # Func_class: E Amino acid transport and metabolism # Function: Dihydrodipicolinate reductase # Organism: Clostridium acetobutylicum # 1 241 2 250 250 189 42.0 4e-48 MKIIVYGAGVMANYVKEAIINSGNEFEGFVDPLGKGNFETLKNENRDYDAIIDFSHFSLL EEILEVGIQKGVPVLIATTGHSKEQIGKIEKASEKIPIIRATNTSIGVNVLNEITAYATR LLKGFDIEVIEKHHNRKLDAPSGTANTLIEVINENLDEKYETVYGREGHKKRTEKEIGVH SIRAGNIVGEHTVIYSKNDEIIEIKHEALSRRIFAEGAVRAAVYLKDKKPGLYTMKDVLN IK >gi|261747353|gb|ADAD01000125.1| GENE 43 36250 - 36801 907 183 aa, chain - ## HITS:1 COG:FN1102 KEGG:ns NR:ns ## COG: FN1102 COG1859 # Protein_GI_number: 19704437 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNA:NAD 2'-phosphotransferase # Organism: Fusobacterium nucleatum # 6 182 2 179 179 191 58.0 5e-49 MAKHKDENVKLGRFISYILRHHPEAVNLELDENGWADTEELINKINLSGSAIDFAKLNEI VKTNDKKRYEFSDDYKKIRACQGHSINVDLNLKPVKPPQYLYHGTALKNVEIIKKEGLRK ISRQYVHLSIDYETAYNVGKRHGKPYIIKVESGKMYDNGHVFYLSKNKVWLSEDINSEYL IFE >gi|261747353|gb|ADAD01000125.1| GENE 44 36971 - 38137 1592 388 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0649 NR:ns ## KEGG: Lebu_0649 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 115 388 60 332 332 318 64.0 2e-85 MKYLDEMTKEELINCAKKEKIFMGLSFNKEKIMKKLEKKGYKYSQAFGENIEKNHETEEK KNIIVEKIEEFAEKAAEKFEEIKEEFIDRKSQNEETAIREVEEIDFSVEGIFNYRTRSYY RNYPNKKLMRTFRKKYRKYKSDFQEEFDFNYKGIDDDQIRKREKETEINRLKFAKGAEYD GESKEDIYFDKAPLPAAYFVDEVVLMPKNPTTLYVYWEIRDDTFEKLSENKDIVDNIVIK LFKGGNEYKKIIRHERIGSHYIGDVDTNQNYEVFIGYEDKYGNFSEVAHSSEAIAPSDKL SNNLDLVWGTVKEDKNTNQIIKYVNSPVPTEENREFIELMKNPVADDDEFTVEVLERLLK VGASENLIEREVRKAKPEKLIMTGVRSS >gi|261747353|gb|ADAD01000125.1| GENE 45 38210 - 38599 723 
129 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0648 NR:ns ## KEGG: Lebu_0648 # Name: not_defined # Def: ATP synthase F1 subunit epsilon # Organism: L.buccalis # Pathway: Oxidative phosphorylation [PATH:lba00190]; Metabolic pathways [PATH:lba01100] # 1 129 1 129 129 162 72.0 5e-39 MATEFILKAVTPEGLVFEKPVLFVKVRTENGDIGILAKHANFVSSLGAGEMIVREKDNVE TTYYLEGGFLEVRQDKVVILGEDMMEASGAEAVKLAKKAAIELAKQHKIREEQDILGTKK RIQENLRRK >gi|261747353|gb|ADAD01000125.1| GENE 46 38618 - 40015 2216 465 aa, chain - ## HITS:1 COG:FN0358 KEGG:ns NR:ns ## COG: FN0358 COG0055 # Protein_GI_number: 19703700 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: Fusobacterium nucleatum # 3 462 2 460 462 701 78.0 0 MENKGTLVQVIGPVVDVRFPDKLPDIYNALEIHTEDGKKIVAEVHSHVGNNVVRAIAMSG TDGLRRGMEVIDKEGPIQVPVGRKTLGRIFNVLGETVDAGEELIPDAVDSIHKEAPSFES QGTDSEILETGIKVVDLLAPYLKGGKIGLFGGAGVGKTVLIQELINNIAKGHGGLSVFAG VGERTREGRDLYNEMKESGVLNKTALVYGQMNEPPGARLRVALTALTMAEYFRDKEGQNV LLFIDNIFRFTQAGAEVSALLGRMPSAVGYQPNLATEMGTLQERITSTSSGSITSVQAVY VPADDLTDPAPATTFAHLDATTVLSRSIASLGIYPAVDPLDSTSRILEPEIVGNEHYKVA RETQRVLQRYKELQDIIAILGMDELSDEDKLIVNRARKIQRFFSQPFSVAEQFTGMKGKY VPLRETIRGFKEILEGLHDDLPEQAFLYVGGIDEAVAKSRELLGE >gi|261747353|gb|ADAD01000125.1| GENE 47 40034 - 40894 1109 286 aa, chain - ## HITS:1 COG:FN0359 KEGG:ns NR:ns ## COG: FN0359 COG0224 # Protein_GI_number: 19703701 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, gamma subunit # Organism: Fusobacterium nucleatum # 5 285 4 282 282 258 47.0 6e-69 MAANMKEIKERIDSVRNTSQITNAMNIVSSTKFKRFQELTLKSRSYSDTLDIAFDNLVAS LTGKKHVIFDGKPEVERIGIVIMTSDRGLCGSFNTNMLRRMEEMIKDFKKENKEVSVITI GKKARDYCKSKNISVDSEYVQLIPETMFEKAKRISEDIVEFYLNDFYDEVYLLYSKFVSA IEYNVQLEKILPIEKKEGLPTQEYIFEPSEDEVLRAFIPKVLNIKLYQALLESSASEHSA RMGAMRQANDNADEMIKKLTVQYNRERQGQITQELSEIVGGSEALK >gi|261747353|gb|ADAD01000125.1| GENE 48 40908 - 42413 2164 501 aa, chain - ## HITS:1 COG:FN0360 KEGG:ns NR:ns ## COG: FN0360 COG0056 # Protein_GI_number: 19703702 # 
Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: Fusobacterium nucleatum # 1 499 1 499 500 729 73.0 0 MRIKPEEVSKIIRTEIENYKKTLDISNTGTVLEVGDGIARIYGLNNAMSGELLEFENGTT GMVLNLEENNIGAVIFGETRGIKEGGIVKGTGKIAQVPAGEEILGRVVNSLGEPIDGKGS ITAEKYMDIESPAYGIIDRKPVSEPLQTGIKAIDGMVPIGRGQRELIIGDRQTGKTAIAI DAIINQKSNDVYCIYVAIGQKRSTVAQIYKKLEEEGAMEYTTIVAATASEAAPLQYLAPY AGVAMAEYFMLQGKHVLIVYDDLSKHAVSYREMSLLLRRPPGREAYPGDVFYLHSRLLER AAKLSDELGGGSITALPIVETKAGDISAYIPTNVISITDGQIFLETDLFNSGFRPAINPG VSVSRVGGSAQIKAMKQVSSKVKIELAQYNELLAFAQFGSDLDKATRDQLNRGAKIMEVL KQKQYNPYKVEEEVVSFYCVTNGYFDDVPTEKVSNFEHDLIDSLRTNSEILNKIREEEKL NDEVKKELDEYIVGYKQDYVW >gi|261747353|gb|ADAD01000125.1| GENE 49 42433 - 42960 813 175 aa, chain - ## HITS:1 COG:FN0361 KEGG:ns NR:ns ## COG: FN0361 COG0712 # Protein_GI_number: 19703703 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) # Organism: Fusobacterium nucleatum # 1 174 1 174 174 97 41.0 1e-20 MDKSGIAKRYASAIYDIAKSSNKISEIREVLSLMMEKYEEEEEFKKYLSDPVITVEEKED FLKRAFDFVSEEALRIVNYIVKKGRLSLIPYIKEEFLKIYYTENDKLPITAIFAKELSEE QKDKLIKKLESKYKKKVVLNVKVDESLIGGGIVKIRNEVIDGSIKHQIEALKKTI >gi|261747353|gb|ADAD01000125.1| GENE 50 42965 - 43459 744 164 aa, chain - ## HITS:1 COG:FN0362 KEGG:ns NR:ns ## COG: FN0362 COG0711 # Protein_GI_number: 19703704 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit b # Organism: Fusobacterium nucleatum # 7 164 3 161 163 73 40.0 1e-13 MSEGSNLINIDFTMAIQIINFLILVYIFWKVFAKKIEKVIEERKHLALSEMEIVEKEKEK LEEQKTASEKLKKESKRRANEILIKAERQADERKEQIISTAMTNRERMMMKAEADIEKMR QNAKFELQKEVGEMALELAEKIIKENIKDKEDEIIDNFIDEIGD >gi|261747353|gb|ADAD01000125.1| GENE 51 43523 - 43759 540 78 aa, chain - ## HITS:1 COG:MPN603 KEGG:ns NR:ns ## COG: MPN603 COG0636 # Protein_GI_number: 13508342 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type 
H+-ATPase, subunit K # Organism: Mycoplasma pneumoniae # 8 78 34 104 105 61 49.0 5e-10 MEGLVQAAALLGAGIAAIGGIGAGLGQGMATAYAVEAVSRQPEAKQDIMQTLIIGLAITE STGIYALVISFLLIFLKG >gi|261747353|gb|ADAD01000125.1| GENE 52 43798 - 44676 1212 292 aa, chain - ## HITS:1 COG:FN0364 KEGG:ns NR:ns ## COG: FN0364 COG0356 # Protein_GI_number: 19703706 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit a # Organism: Fusobacterium nucleatum # 62 287 1 215 218 161 48.0 1e-39 MLKKVTVFVLLLVGLTLVVNLILSLISTFLPVKFVMPESLIEAPTYYNFVIGSVNITISQ TVLNTWAIMALLAFIVKKGTDKLSTTNPGKLQIILEEYYHFIENMFLGTFGKYKKKFMPF FSALFAFILFSNLSLFLFPFIITVTKGKNGGFDVKPFFRTPTADPNTTIGLSLVVVVLFV SIAIKRGGVTGYIKSLMHPAWFMLPINIVGELSKPLNTSMRLFGNMFAGLVIAGLMYSLA RGHFSLSVGWPGILQLYFDGFVGMIQAFVFTVLSSVYVGEVLGEDTEGEENL >gi|261747353|gb|ADAD01000125.1| GENE 53 44767 - 45165 357 132 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1791 NR:ns ## KEGG: Lebu_1791 # Name: not_defined # Def: ATP synthase I # Organism: L.buccalis # Pathway: not_defined # 1 127 1 127 133 89 48.0 6e-17 MFKDLPELIKKIYIISLFLTVISFIVGIILKRPELYLGFFTGSVISMINVYLTIRGAYKA VYERNKSRFGSMFQYLKRIAIYCGGMYFVILISKKYFNSRVTGNIIGTGLGFLNFKISIL VSTFLKKKKEKT >gi|261747353|gb|ADAD01000125.1| GENE 54 45183 - 45509 493 108 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1792 NR:ns ## KEGG: Lebu_1792 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 5 108 19 122 122 108 55.0 1e-22 MSNKSNKKKSDEMTEKEKKLYELEKKLGRHDEEKEAKKKKDRLLLKYFLIATNMIYTLAG PILLMLGLYLLLEKFVFKDKQPIVLIVLLIIGAFAGYWTLIRQVLDTK >gi|261747353|gb|ADAD01000125.1| GENE 55 45532 - 46425 1102 297 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038548|ref|ZP_06011917.1| ## NR: gi|262038548|ref|ZP_06011917.1| putative regulatory protein ArsR [Leptotrichia goodfellowii F0264] putative regulatory protein ArsR [Leptotrichia goodfellowii F0264] # 1 297 1 297 297 558 100.0 1e-157 MEKEKSYIELEKEREKFESTIGEAKIVNSVDISDSKKEIVKYSTVTYKNMVKWIVGIVIG 
LVLIPFTDWGMFITVFSIVMCIPVLLTILFNGIKTVELKNHTFIFSDEKKEREYKSLEYF ILTTYSGDFLAYRDYKGYWLKIPMIAFNSSLIKNFLKNYLLDKLPVSEANLNIEGEEVFY VRDEEDIKADYRKREMFGKHRVRVHRAMRNGSIPLERLQKIADNLYKSENRIIIKKEALI INGNSYRWGGLQPVKLSKTLGGVLEIKTKDDQTIFSSRATAITRVPLFEALYNSMIK >gi|261747353|gb|ADAD01000125.1| GENE 56 46446 - 47171 1111 241 aa, chain - ## HITS:1 COG:BS_yqeM KEGG:ns NR:ns ## COG: BS_yqeM COG0500 # Protein_GI_number: 16079615 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Bacillus subtilis # 1 240 2 241 247 100 26.0 2e-21 MHKEFAEIYDVFMKHVDYKGWYKFLRSYIKTEGEVLDLGCGTGEFIYRFLKDGFSVRGVD ISEDMLKISKEKIESKNLKNNNYELIKENLVNYEDGSEADYIICNFDTVNYLKNEKEFEK FLEKSNKNLKKDGYLIFDIVTEEIFDEIFENGIFLDEEPEYTSIWRYEQLSEKKYFVEID LFIRQDKNDNLFRKYNEQHNKFIYKPEWVVEIAREKGFEIFDTAKNPEFGESRIFFVFKK L >gi|261747353|gb|ADAD01000125.1| GENE 57 47199 - 48572 2037 457 aa, chain - ## HITS:1 COG:aq_1031 KEGG:ns NR:ns ## COG: aq_1031 COG1921 # Protein_GI_number: 15606324 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine synthase [seryl-tRNASer selenium transferase] # Organism: Aquifex aeolicus # 1 457 1 449 452 352 43.0 1e-96 MNELLSKIPSINKILLTEELKQLLEEYPEIFVKDIVKKEVEDIKNDILMSSIKEVPSMEE IVERVSFQVKKKDRLSLRRVINATGTILHTNLGRSLLSEHVKENLTEIAFNYSNLEFNID EKARGSRYTHLIDIIKRLTGAEDVLVVNNNAAAVLLTLNTLTKGKEIVVSRGELVEIGGA FRIPEIIKLSGGISCEIGTTNKTHLKDYENAINENTGVLLKVHTSNYKISGFTKAVTYEE VAALAHEKGLIAINDLGSGQFVDFRPYGLPYEPTVKEVLDSGMDIVTFSGDKLLGGPQAG IIVGKKKYIEEMKKNQLTRALRVDKMTIATLEATLKLYLDEKTALEEIPTLNMISASIEK LREKAEKFTEIIKKSRFVAEIIEDKAEVGGGSYPGSQLESIAVKLTHSEKTSTEIERILL SEDIPIITRIKENSIILDMRTLREKEFGIVAGSLDKI >gi|261747353|gb|ADAD01000125.1| GENE 58 48589 - 50466 2512 625 aa, chain - ## HITS:1 COG:PA4807 KEGG:ns NR:ns ## COG: PA4807 COG3276 # Protein_GI_number: 15600001 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Selenocysteine-specific translation elongation factor # Organism: 
Pseudomonas aeruginosa # 4 623 1 625 641 283 29.0 9e-76 MSNVIIGTAGHIDHGKTTLIKALSGIETDTTAEEKERGMSINLGFAYFDLPSGKRCGVVD VPGHEKFIKNMLAGVSGINLVLLLVDSREGIMPQTKEHADILSLLGVENYIIVMTKIDLA EKEYREMVKEEIESYIKGTPLEGSPIIEVDSVSKKGIDTLLKEIDRKIENIAEIKEGKNA RLNVDRAFQVKGFGTVVTGTLTEGTVSVGDELEIYPENLKTKVRNIQVHKQDVKTAHAGQ RTAISLTNVKIDDVGRGRTLATPGTLTKTYMLDTEIKIIDNSNFTLELWDRVRVYTGTSE VMARIVPLGSEVLESGKGGFAQLRLEEEVSVKNYDRFIIRTYSPMITVGGGVILDVSPKK HSRFNEEILNKLKVQSKGNSKELISNYILTGHAYLITAEEIAKTLEQSLENVEKDLAELV QDGALYKTKSGYIHVKKYEDILEKVSTLVGDYHKKFRLKRGISKAELFSKFKINQKELAV IIDLFVSNNVLKIQNNFISLYDFEVKYDANQLKEKNNIEKILLESKFTPPNIKDLTKNNK GLNELLSSLAGDTIIILNENIAIHVKTFEEAKNKVLKHFENNKKLTLAEFRDFTGSSRKY SLPILEYMDKQGITKRVEDYRVLGK >gi|261747353|gb|ADAD01000125.1| GENE 59 50495 - 50794 289 99 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038513|ref|ZP_06011882.1| ## NR: gi|262038513|ref|ZP_06011882.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 99 1 99 99 187 100.0 2e-46 MIKTKDEGIVVFHNTHDAIKADNISMSKQIKAGLIPVHPSISLGCGFMLKTEWEDFSNLI KTLDDENINYKALYYSQKRGMKREVELLYEDKEDITTEK >gi|261747353|gb|ADAD01000125.1| GENE 60 50802 - 51389 880 195 aa, chain - ## HITS:1 COG:no KEGG:TDE0677 NR:ns ## KEGG: TDE0677 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 5 195 3 197 197 142 44.0 7e-33 MKKYELNAKGLACPIPVVKTKKLLEEYDTVETTVDNFTATQNLTKLAEQLDYNIDVKTVS EEEYVVTISAKEESKKAEKTSQAEDDSYIVVINKQIMGYGSEELGKKLMKAFLYTLTEQK VLPKKVIFYNGGALLVDKTRSHVLEELKELEDNGVEIMCCGACIDYYNVELALGNPSNMY FIVEEMRNTNKVVRP >gi|261747353|gb|ADAD01000125.1| GENE 61 51428 - 52336 1080 302 aa, chain - ## HITS:1 COG:Cj1504c KEGG:ns NR:ns ## COG: Cj1504c COG0709 # Protein_GI_number: 15792818 # Func_class: E Amino acid transport and metabolism # Function: Selenophosphate synthase # Organism: Campylobacter jejuni # 5 301 13 307 308 216 41.0 4e-56 MGYESSDDAAVYKISEDKAIIQTLDFFTTMINDPYLYGQIAATNALSDIYAMGGEVISAL 
NIVAFPEKMDMEILHQILKGGAEKVHEAGGVLAGGHSIHDSTPKYGLSVTGIIHPDKILM NNNCKEGDALIVTKPLGVSIINSAHMIGECSEETFAKCIKQMTTLNKYAAEIMKNYPVNS CTDITGFGFLGHLYEMLDGKYSAEIIGKDIPMLPEAYRCAGDFIITSGGQLNRNYLQDKV DFKIKDFPLEEIMYDPQTSGGLLISLPSEYAEELLEKLNKLEIKSAVVGKVIKKQEKDII IV >gi|261747353|gb|ADAD01000125.1| GENE 62 52772 - 53395 883 207 aa, chain - ## HITS:1 COG:no KEGG:TDE2415 NR:ns ## KEGG: TDE2415 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 2 204 5 207 210 258 58.0 1e-67 MKNPNSKNGKSILFTCQNAKSLDILERDGRFINKKEYIQEHLEDVAPMILKSYDWFVSAA SKRIKKPDDVQYQIWCAVGDEVCMKPIETELVYILEVPDKEIIYFNSYKWDYVINHHYVP ENNEDLKKYEQEMEAKGFANTYEILTGRYAGKFPMEERKIKDSWERVFDIDNWNVRQVQA NLWEIKKEWVKRILKPGEPMPEEYYVR >gi|261747353|gb|ADAD01000125.1| GENE 63 53548 - 54720 966 390 aa, chain + ## HITS:1 COG:L125116 KEGG:ns NR:ns ## COG: L125116 COG0477 # Protein_GI_number: 15672105 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Lactococcus lactis # 3 385 5 380 387 89 23.0 1e-17 MDKKNINRMVLLMFLCIIVYNFAHPATPGLIELRNWKKSISGEFLAVMSTAMFISSPYLG ALADNIGMKKIFIFMPLMYGTAQLFFGFVNYLPIIFLARAFSGFASGGTYAVAFGYVSRL SSKETKSKNIAKVSSAAVIGGAIGQKTGGMIARIDTRYPFALQFICGATVSLLIFLILKE IVIKHDETEIKEKKNLNPFATFKYIFELDNYSKFFCFIIFLSSIGIYSYASALNYFLKFY IKVNSDTIGTFVMFSSLSAFFGTSVLLVKLIAKYKEITIHKTMIFLGIVLMSVILYRLNL GAVSYVLMAFYTMTYEIVRSLGNSIIAHRYKEEQGKILGVASAVGFLGNAVGSLLSGHLL NAGSHLPFIVNICIMSVVFLLLLLNLSKRL >gi|261747353|gb|ADAD01000125.1| GENE 64 54993 - 55313 361 106 aa, chain + ## HITS:1 COG:no KEGG:FMG_0119 NR:ns ## KEGG: FMG_0119 # Name: not_defined # Def: putative phosphate ABC transporter substrate-binding protein # Organism: F.magna # Pathway: ABC transporters [PATH:fma02010]; Two-component system [PATH:fma02020] # 1 93 1 98 310 93 54.0 3e-18 MKKIFMLIAVAALFFTTAISCGNKSTEGNGGETKKSSDVNTQISFSGSSTLAPVISKIST 
NFIEKYTSWDKVDPSLPAENITIYVSLGGSGAGGTVSQTSECLPVI >gi|261747353|gb|ADAD01000125.1| GENE 65 55295 - 55594 414 99 aa, chain + ## HITS:1 COG:BH2994 KEGG:ns NR:ns ## COG: BH2994 COG0226 # Protein_GI_number: 15615556 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Bacillus halodurans # 3 95 113 207 335 59 37.0 1e-09 MLARDIKDGEKEKIKDLKAFTLGLDALTISVNPQNKFIQLKGGNITKEEIIKIFSGEYKK WSDLDKSLPDEEIVVVTRDLSGGAHEVFQKNIMKDINVR >gi|261747353|gb|ADAD01000125.1| GENE 66 55616 - 55900 283 94 aa, chain + ## HITS:1 COG:no KEGG:FMG_0119 NR:ns ## KEGG: FMG_0119 # Name: not_defined # Def: putative phosphate ABC transporter substrate-binding protein # Organism: F.magna # Pathway: ABC transporters [PATH:fma02010]; Two-component system [PATH:fma02020] # 1 91 218 308 310 138 80.0 8e-32 MGALVEKIIENKNAIGYASFGITNQNAGKLTPLKVDGVEPTKANILNKSYYISRPLIIMK KGELTKTEKIFVDLLKSDEGKKVIKSMGFIPVSE >gi|261747353|gb|ADAD01000125.1| GENE 67 55965 - 56780 633 271 aa, chain + ## HITS:1 COG:VCA0071 KEGG:ns NR:ns ## COG: VCA0071 COG0573 # Protein_GI_number: 15600842 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Vibrio cholerae # 4 265 68 325 327 132 31.0 5e-31 MFILAVLLIYLFAEALPFFKEVKIPDFIFGKVWRASEPNQSFQIFNILYAGFYISILACI ISFPFSYGISMYICFYSKGIFKKMIIWAVNILSGIPSIVYGFFGMTVILKILEKTFRMST GESVLAGSLILSVMIIPFFVANCIEHIETIKEKYGKDSDAMGITKEYFIRKIVFRETRFS VLTAFILAFARATGETMAVMTVIGNSPQFPKLLTKAETVPALIALEIGMSEVGSKHYSAL FASAFVLLTTVMIINIIFFILNKMRRIYIEK >gi|261747353|gb|ADAD01000125.1| GENE 68 56770 - 57642 571 290 aa, chain + ## HITS:1 COG:VCA0072 KEGG:ns NR:ns ## COG: VCA0072 COG0581 # Protein_GI_number: 15600843 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Vibrio cholerae # 5 268 16 280 289 113 30.0 5e-25 MKNNDLIIKTWIYISTITVISVIFGIFYFIISKGIGRVNLDFIFKNPEGMSLGSEGGVKN 
AIIGSIFLMFIAILFSVLLGVVCAIYNTIYCKSKIIGTVINLTVQCIASIPSIIIGLFVY GFFIVTFNVPRSMLTAGIALGIMVFPFVEVRIEKAVLNMDRQFIKDSYSLGIEKDYMCRK LILPVIKKEIVSTGILAGSHAIGATAPLLLIGAIFIGGTSDKLLSPVMALPFHLHMLLGQ TALHDKAYATAFVLICILIILHILSEIIISGLGGKIIEYIGNKRYWYILR >gi|261747353|gb|ADAD01000125.1| GENE 69 57599 - 58249 208 216 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 4 216 12 218 318 84 29 2e-15 MNILEIKDIGISYDNKEILKNININVKKNEILCIMGQSGCGKSILLSLINGFFEENGGKY TGDVFLNGQNIKSIKLLNLRRKVSTLFQDSKPFPLSIESNILYPIEFYEGKIKDKKERVE KYLKDVNLYNEVKDNLKMSAVKLSGGQKQRLCIARMLTTNPDILIFDEPCSSLDKENSLI IENLIKKLSGKYTIILTTHNPEQAERIADRIFTVGK >gi|261747353|gb|ADAD01000125.1| GENE 70 58372 - 58845 544 157 aa, chain - ## HITS:1 COG:no KEGG:ELI_1489 NR:ns ## KEGG: ELI_1489 # Name: not_defined # Def: acetyltransferase # Organism: E.limosum # Pathway: not_defined # 1 157 1 157 160 186 59.0 2e-46 MDYRIREMEKDEYPLLKNFLYEAIYIPEGEKIPDKSIVNLSELQIYYKDFGSFKDDIALA ADVKDKVIGAAWVRIMDDYGHIDNETPSLAMSLFKEYRNLGIGTVLLNELLSELKIRGYL KVSLSVQKDNYAQKMYKKAGFKVIKENEEEYIMLAEL >gi|261747353|gb|ADAD01000125.1| GENE 71 58883 - 60586 2006 567 aa, chain - ## HITS:1 COG:no KEGG:FMG_0256 NR:ns ## KEGG: FMG_0256 # Name: not_defined # Def: hypothetical protein # Organism: F.magna # Pathway: not_defined # 23 520 17 506 558 351 40.0 7e-95 MKFYNSTHNFYRDDEKHNLYKFNSNTEFEKIEVVKNEEFGFFVTIDLPCESCLNINKDLD LPWWGLSKRYRLEVNSQLTSVYPSLVDYVKDDDGTEIADIILAEQSKVYPQGEVPVFIQG VVPEDFKGKSVSINIKLYKNNGYEKEKLIEEKNVNIDVIDYCLKNNDFYLDLWQHPCSWA RVYNLKYFSEEHFTVIENYLKAMSKLGQKVIDLIVSDFPWAGQKCFDVEKNASRLYEYNI IKVYRKNNELELDFTYLDRYVDICFKLGINEEINLFGLIGNWHGFDFGSPLTDYSDPIRI SVYDEDKGINDYIRTKEELAVYIKLLFRHFKEKGYLPITKIIGDEPNPIKKFEEYSSFLN SCTEEKLQYKYAILHPEFFEEYEGDFESFSVITLVLSDFYKDGNIIHKKYKENSSKMTWY ACCMPDTFNTFIKSPLIEARLTGLYTYLFRMKGMLRWAYGIYVEDIYSGISYKTEKWSAG DMFFAYPGKNMKPVHSLREKNMLYGIQDFNIFKQLEEKFGNLEGKLKEKLAIHEVIKRNG GDIDLGTYPAINEYRKVRNEIVKDSLL >gi|261747353|gb|ADAD01000125.1| GENE 72 61109 - 62392 1690 427 aa, chain + ## 
HITS:1 COG:SMb21221 KEGG:ns NR:ns ## COG: SMb21221 COG1653 # Protein_GI_number: 16264473 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 85 411 74 402 412 87 23.0 3e-17 MIIKNKLLFCLVSLFLVFMLLSCQGKKEDTAGTKENIEKNETEITLMIPDWGTPTEEMLA EFKKETGITVKVLPTSWDDIKSKVSIAAAGKKAAADVIEVDWSWVGEFNSTGWLESLEID EATQKDIPSISYFKVKDNIYAMPYANGIRLAYINTEMFSKAGIKEYPKTWSEMEAAFDKL KESKTIDYPMLFSLNAEEITTTSFLTVAYTRNGKIFNDDDTLNKESALDILQLLRKYLDK GYINPASISTPGIDIFRGINNAQGAFLIGPTSHITSVNDPKVSKVVGKVTTIPVMGKDAP AKETITFTEAVGVSTYSENKEAAKKFVEWFSRPETQLKLNKAINNMPTRTSVIEQMVKDG TIKTPGSIIEQSKIVVSPFPNGVPKYYTKMSTEIFNIINQMGQGKLTPEQAADKMEQKVN ELAKENK >gi|261747353|gb|ADAD01000125.1| GENE 73 62426 - 63715 1806 429 aa, chain + ## HITS:1 COG:mlr3639 KEGG:ns NR:ns ## COG: mlr3639 COG1653 # Protein_GI_number: 13473140 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 86 419 74 408 412 107 24.0 4e-23 MLKKNLNLTILLILLAAIIMGCGSKKEAETKTAENKDRTKENTEITLMLPDYGLPSQEML DEFKKETGITVKTLPTSWDDIKSKVSIAAAGKKAAADIIEVDWSWVGEFQSAGWLEPLEI DEATQKDILSLEYFKIDNKYYAMPHFNDLRIAYINKEMLKKAGVEKLPETWSELEKVMDQ LKAKNIIKNPLLFPLGAEENATTSFLTLVYTRSGEVFNSDNTLNKANALDALQLLKKYLD KGYISPENVSASGGDIYRGIINANGAFLIGPTFYVTKADDPKSSKVVGQITTLPIPGKEK PATKTVAFTEAMGISAYSTNKEAAKKYIAWISKPENQLKLGKTLDITPTRTSVIQQMTDE KSIPGGPGSLIEQAKIVGSPFPNGVPKYYTKMSTEIFNIINQMAQGKLTPEQAADKMEQK VNELAKENK >gi|261747353|gb|ADAD01000125.1| GENE 74 63762 - 64622 857 286 aa, chain + ## HITS:1 COG:SMb20326 KEGG:ns NR:ns ## COG: SMb20326 COG1175 # Protein_GI_number: 16264060 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Sinorhizobium meliloti # 9 279 33 316 328 137 30.0 3e-32 MKKDTKGIFLLFPSLILLLISVIFPVILTFRYSLKNYNLTEPYNEKFIWFDNYIKIFKDA HFYNALYNSVIILILVMIIGMTFSIIIGVVLNRKSKINPLLTALVVIPWAMPPIVNGIMW 
KFIFFPGYGFMNKILLKLHLINTPISWTDNRYLFLIVISIVVAWRIIPFSSLVILANLQN IPESYYDTVQVFGGTKLQSFFYVTLPLLLPSIGVVLINLTTTALNIFDEVIAISGYQFEI QTLLVYNYSNTFNFLDFGYGSAISYVIMLISGIFGYFYVKNMAYEK >gi|261747353|gb|ADAD01000125.1| GENE 75 64637 - 65482 564 281 aa, chain + ## HITS:1 COG:SMb21105 KEGG:ns NR:ns ## COG: SMb21105 COG0395 # Protein_GI_number: 16264432 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Sinorhizobium meliloti # 8 280 15 287 288 140 27.0 2e-33 MKETKKDKFIYYITVFFIILFSVGPIFWCFLISITPEGDLLKNNTRLLPETITFINYKNL FSIGSKESETLLNGLSNSMYLSFVTVLIGMPLSVITGYTLARYKFKYKNLIIGFILLTIV IPVFTTIIPIYSFFMEHSMLDSMFWTSVIFISAFIPLNIWIIMNYFRELPEDLWEAAAIE GANERQLFFQIALPLAMPIVLTSSLIIFLMSWKQYIIPMLLLSSHNNKVLTLIMSEFMTR DAVNYSVIAMSGIIVIIPPLLASMVFRKYLVSGLTAGSVKE >gi|261747353|gb|ADAD01000125.1| GENE 76 65500 - 66207 573 235 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_5268 NR:ns ## KEGG: Pjdr2_5268 # Name: not_defined # Def: hypothetical protein # Organism: Paenibacillus # Pathway: not_defined # 10 232 61 274 276 83 26.0 6e-15 MIKTENMSEKEMYDFVLNKSKTYTKELYKKENLSDDLYSDNMKDIDIWTNKYENINGTKG LYKEHFNWIDSILEMRVIKLDRLQFELLDKTDKNFDLIPEKHKKNSILVNVHIREDGKLT PSACEASYKTAWNYYKSKGLSFEEIIFTCYSWLLNSDLKLLLPENSNIIQFQKKYKFLSF KENEEKSQIIERVFGLGNEKLENCSQNTILQKNLKKFWEKGYKFPMVKGYFIYKI >gi|261747353|gb|ADAD01000125.1| GENE 77 66234 - 67508 1949 424 aa, chain + ## HITS:1 COG:SMb21221 KEGG:ns NR:ns ## COG: SMb21221 COG1653 # Protein_GI_number: 16264473 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 57 414 53 408 412 86 24.0 9e-17 MLKKMKLIALLTLIMSILIGCQAKKEEAKKESKESTEITLMIPDWGAPTEEMLAEFKAET GISVKVLPTAWDDTKNKISIAAAGKKAVADVVEVDWSWVGEFQSAGWLEPLEVDDATQKD IPSISYFKVKDKIYAVPYANGLRLAYINDEMLSKAGIKRVNEAWNGIDGIEETFDKLKKS KVVDYPMLFPLNAEEKTTTSFLTLAYTRNGKIFNDDDTLNKESALDALQLIKKFIDKGYI NPSSVSTPGIDTFRGINNAQGAFLIGPTSFITSSNDPKVSKVVGKITTIPVPGKDGAAKQ 
TITFTEAVGVSAYSQNKEAAKKFVEWFSRPETQLKLNKAINNTPTRTSVIEQMVKEGIIK TPGSIVEQSKIVASPFPNGVPKYYTKMSTEIFNIINQLGQGKLSPEEAANKMEEKVNELA KQNK >gi|261747353|gb|ADAD01000125.1| GENE 78 67644 - 68090 533 148 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038553|ref|ZP_06011922.1| ## NR: gi|262038553|ref|ZP_06011922.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 7 148 1 142 142 233 100.0 4e-60 MPKKSKMTTNRRNDNGMREILLQKQEQYSGIIPHPSIVEGYERNCPGATDRILAMTENQL KCNQELAKMEQENINECRKEALKSEVENIKRGQILGFVILLTMIIGGFILIIFGKDAGGY GTIVASIMLGISSVIWNNKSKKEQEEKQ >gi|261747353|gb|ADAD01000125.1| GENE 79 68059 - 68268 274 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038486|ref|ZP_06011855.1| ## NR: gi|262038486|ref|ZP_06011855.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 69 1 69 69 114 100.0 3e-24 MKKINAIVCGVASIFSSASVVDTSNIFKEYPTTHRKNVKEALQNSWITVGTAVKGAIDKY AEEVKNDNK >gi|261747353|gb|ADAD01000125.1| GENE 80 68519 - 69778 1434 419 aa, chain - ## HITS:1 COG:RP176 KEGG:ns NR:ns ## COG: RP176 COG1301 # Protein_GI_number: 15604051 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Rickettsia prowazekii # 10 411 1 397 399 160 26.0 6e-39 MSEKKKLLNLSLTSQILIATFGGIIFGGVVGPWASNLKVFGDIFLRLIQMSVIMLVMSAV IVAVGRGSTGDAGKMGFHTFKWIIFFTLTSAFMGVVLSYYIQPGIGVAHMTEVGKNAVTA NALQDTSLKDTILAFFTTNIFSSMASSAMVPCIIFSLFFGIAMSQHIRDTGKTIVLDWIL SVNEIIVNIIKNVMKVSPIGIFCLLADVTGSKGFTVIIPMIKFLIILIIGDILQLVIYGL ITAFICKVNPLKMPKKFAKMSIIALTTTSSAISLPTKMEDTVIKFGVSRKVSDFTGPITM SMNSSGWALCNVTAIFFIAQFSGVQLTPYQMVMAVILSCLMCMGSVVVPGGAIIVFTFLA TSMGLPLDSIVVLIGIDWFSGMFRTLNNVDIDVLVAMLVANRIDEFDRDVYNNKKVVEY >gi|261747353|gb|ADAD01000125.1| GENE 81 69834 - 70901 1963 355 aa, chain - ## HITS:1 COG:ylbC KEGG:ns NR:ns ## COG: ylbC COG2055 # Protein_GI_number: 16128501 # Func_class: C Energy production and conversion # Function: 
Malate/L-lactate dehydrogenases # Organism: Escherichia coli K12 # 7 352 3 348 349 406 54.0 1e-113 MNDVVIVSPDKLHNLIKNKLEKAGLKSEHADEVAKHLVFADSCGIHSHGAVRVDYYAERI AKKGVTIDPKMSFEKTGPCTGIFHGDNGIGQYVAEKALKYAIEMAKENGVAIVGVEKLSH SGTMAYYLNEIAENDLIGLSMCQSDPMVVPFGGREPYYGTNPIGFGAPASEGLPIVFDMA TSVQAWGKILDARSKNMEIPDTWAVDRKGVPTTNPHDVGALLPISGPKGYGLMMIVDILS GLMLGLPFGGHVSSMYDKMSEGRDLGHTFILIDPSRFMDINKFKENITKTKKELHSIKAA EGFKQVFYPGEVSQITYEKYQREGIPVEKGIYEYLVSDIVHYDKFGGQSAFASKK >gi|261747353|gb|ADAD01000125.1| GENE 82 70959 - 72218 1711 419 aa, chain - ## HITS:1 COG:STM0520 KEGG:ns NR:ns ## COG: STM0520 COG0477 # Protein_GI_number: 16763900 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Salmonella typhimurium LT2 # 18 412 11 402 415 231 34.0 3e-60 MSSMLEKRYKAMNVSERYWWKVVALCFAGWILMYADRTILGPVMGNIAEQFNLNNTQLGA VNSIFFLTYAIMQIPFGIVGDRIGRRLVITFGFVLFGVTTFLSGIASGLGVFMIYRAITG IGEGAYYGPQYALSGEAIPQKSLALGTAIINSGMAFGTSGGYLLSSYLVLQKGHHWSLPF FIMSVPTIIVGLLFMTLKERVIKPEDAGKEVIQEVKEEIKKEKTSGLSVFKNKNLMCAFL LLFCSIYANFVIITWLPEFLKQERGFAGTSVGFIASLVPWSSIPGALLFARLSDKLKKTK IFVYSLVPLALISTFSMAYVTNRTVLILMLILYGLTGKLALDPILVAYVTKNSPKEALGT ALSAYNFIGMSAAITAPFITGWLKDKFGSMQSGFYLASVLLLVGMLIFSAASENGNKKV >gi|261747353|gb|ADAD01000125.1| GENE 83 72220 - 72999 653 259 aa, chain - ## HITS:1 COG:no KEGG:CAR_c00510 NR:ns ## KEGG: CAR_c00510 # Name: not_defined # Def: hypothetical protein # Organism: Carnobacterium_17-4 # Pathway: not_defined # 17 256 13 281 288 103 33.0 5e-21 MNKSHEKYGEDGSISELMLLYLKNQKIGYIHSIFKSSLNIRFGENLIHISGDNKGLTCFG CCITGKKIKNIILNADIDDIVIKKGNNLLFYTNSGVREIDILKLKKVNLKIENIKISEKI LEEIFGHLKNINFEEKTGIENKEVIKWLQDGISEESQRYLTGRGKGLTPSGDDILVGFAL IQHLCTGNVELKCGDLTTDISRQYFKAFNEGYTNQYLIELFSGNIEKSICNITQIGHTSG YDLLFGIFLGIKKFLKWRK >gi|261747353|gb|ADAD01000125.1| GENE 84 72999 - 75998 4765 999 aa, chain - ## HITS:1 COG:STM0529 KEGG:ns 
NR:ns ## COG: STM0529 COG0074 # Protein_GI_number: 16763909 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, alpha subunit # Organism: Salmonella typhimurium LT2 # 1 579 1 554 554 394 40.0 1e-109 MLYTVLKENSYQDSINLMLLTNHISTMKGVDKVQVMMGTDANKDIFNTAGLLSKEAETAK PSDMVVVVDSNDEVVVKEVMEAIDKFLADLSVKKDSGVQQSVKNWQQVKEVMSDVNMALI SVPGIYAALEIENAIDNGLHAFVFSDNVPKDEEVRLKKKAHEKGLLVMGPDCGTGIVSNV PLAFTNVVRSGNIGIVGASGTGIQEVSTIIEKLGGGVIHAIGTGGRDLSTEVGAITTKDA LTGLAHHETTDVIVVVSKPPAKEVKDDVVKLLHSLGKPVVAIFLGEKPEYHEGNVYFAHT LEECAKIAVDLAKGEKVKTNYQNKSDIQPKDDEISGKTVKGLYSGGTLAYEAAMLVSEAL KLSSSTKEEGYMLKTDGFEIMDLGDDIYTQGKPHPMIDPSVRIEKLREFGADPKTGVILL DVVLGYGAHEDMAGQLAPVIKEILKKAESEKRKLYIIGTVCGTKGDPQNYEKSQKVLEQA GMLVKESNTSALRTALNLMGTDIDETDKEFKEYKGEKKSLTEVSEAIKDLLLTKPRVINI GVGGFTEPVRQYGGKCVQFEWKPIAGGNQKLIKILQQLKNLETLEKENNLVVEAMKNAAP YLIDVVPAYTVIPEINGKVLLHAGPPITYAEMTGPMQGSCIGAALFEGWVENEEEARKLL ETGEVKFIPCHHVKAVGPMGGITSANMPVLVVENRLTGNRSYCTLNEGIGKVLRFGAYSE EVVNRLQWMKDVLGPVLGQAAKQVQGGINLNVIIAKAITMGDEFHQRNIAASLLFLKEVT PIIVTLDIDDKKKKDVIQFLANTDQFFLNIMMATGKAVVDGARVNKAGTIVTTMTRNGKD FGIRISGLGDEWFTAPVNTPKGLFFTGFTQDDANPDIGDSAITETVGVGGMTMIAAPGVT RFVGAGGFKDALKVSDEMSEICTIHNPNFAIPTWDFKGAPLGIDIRKVVETGITPVINTG IAHKNAGVGQVGAGTVRAPLACFEKALVAYAKHIGLDVE >gi|261747353|gb|ADAD01000125.1| GENE 85 76329 - 76991 708 220 aa, chain + ## HITS:1 COG:Cgl0642 KEGG:ns NR:ns ## COG: Cgl0642 COG1802 # Protein_GI_number: 19551892 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Corynebacterium glutamicum # 10 186 19 198 240 65 23.0 6e-11 MKQKLDILAYEYIKDKILNNEFKAKDTISEKQIAEELSISKTPVKEALSQLENENFVIIN PRKSVSVTEVDLKLIRDVFQVRSRIEPLLVELTIFFLNKDELKTALSEFKKRFEKMGKKE KVSGKEFDKLYDGYRYFFAGNCGNLFFARQMNLVYDHLHRIRRVLYGENNRRLEAIEEHI SIIDLILNENPPIKIIRELCEKHVEAAQMDFFKNLNNLNI >gi|261747353|gb|ADAD01000125.1| GENE 86 77032 - 77523 468 163 aa, chain + ## HITS:1 COG:CAC2751 KEGG:ns NR:ns ## COG: CAC2751 COG0454 # Protein_GI_number: 15896008 # Func_class: K Transcription; R General function 
prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 7 163 4 166 167 111 39.0 6e-25 MSTPNIRNADLSDLKEIMDIYEYARKFMKQTGNRNQWANKFPPENLIKEDIEKKQLYIIE KSGFICGVFAFIIGNDPTYSIIKNGEWLSYEKYGTVHRVASNGKSKGIFNEIITFCESKI SHLRIDTHKDNKIMQHLIEKNGFYKCGIIYVTDGTSRLAYEKI >gi|261747353|gb|ADAD01000125.1| GENE 87 77621 - 78343 415 240 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 7 236 11 232 329 164 38 2e-39 MAEKDTVLSIKNLKKIYKSDSNYFFKKDYVKVIDKFDLEVKRGECVGISGKSGSGKTTVV KCILNLEEPDEGEIIIDNETVFDSSKKINLFKDKKKSFNIRKKIQLIMQDPGSALDSKLR IGKLLEYTLKNYNPDYESEQIKKEINDTFKLCGLGENVYNKYPYQLSGGQKQRVCIARAL ILNPEILICDEITASLDISLQKKILELLQSLKEKLKIAIIFISHDRKVVSKFCDRICYIS >gi|261747353|gb|ADAD01000125.1| GENE 88 78359 - 79924 417 521 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 279 513 34 262 329 165 37 1e-39 MNKKLFKNKYFLTGGTVIMILLLLCFFAPVLTKYSYSDINLTNINKGPSLKHLLGTDELG RDVWTRILYGGRVSVTVGIVATAVQMFIGITLGLTAGYKGGIFDFIIMRIIDIIMCFPMF LAAIAISSVIGQSFINLILIISLLSWTEVARVVRAETLSLKTRDFVTASKVSGFSDFKII MIHILPNIFSTVVVAMTVSMATVILMESSLSFFGLGIKDPMPSWGNLISSAQNLRTFTSY WWTWLPAGIIIILFVLSINFIGKALTMYYSPNKEALKEEVEIVKDFDLQINKGERIAIIG ESGSGKSLTSLSMLGLNDKKLKVNGKIIFNNENLLSLNEEELRKKRAKNISIIFQEPMTA LNPLVTVGKQVAEVFEIHTDYSKKEIKKKVSELFEELGFENIEELYNKYPNELSGGMRQR VVIAMAIALKPELLIADEITTSIDYSLKVSVLNLIKELSIKYNMAVILITHDLDLLQGFA DRVVVMRNGRIMEIAETKTFFSNPKNDYSKEMIEIFQSLTS >gi|261747353|gb|ADAD01000125.1| GENE 89 79939 - 80901 895 320 aa, chain - ## HITS:1 COG:CAC0177 KEGG:ns NR:ns ## COG: CAC0177 COG0601 # Protein_GI_number: 15893470 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Clostridium acetobutylicum # 1 320 1 321 322 217 39.0 2e-56 
MLKYLIKRLIISFFVLFGVSISIFYLINKQPGNPYLHMINPGVPPEIVKKKLIELGYYDP FFIKYVKWLQRVIFLDLGYSIKYSEKVTKVIASRLGNTFILMGTSLFLSSISGIFLGVVS AIKKNTVIDEIITVLSFAGLSIPTFFVSLLLIKVFSYDLRLFPASGMYDSINGSEGITVF YNLFRHMILPTTVLAIMQSVIFIRYTRSAVIEQMSKEYMMTAMAKGLTFKRAVFGHALKN ALLPIITVFFLQLPTIFSGALITETVFVWPGIGRLGYEAISNRDYPLIMGILTVTAIIII LSNLFADIFYMKIDKRIKLQ >gi|261747353|gb|ADAD01000125.1| GENE 90 80924 - 82594 2209 556 aa, chain - ## HITS:1 COG:CAC3179 KEGG:ns NR:ns ## COG: CAC3179 COG0747 # Protein_GI_number: 15896427 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Clostridium acetobutylicum # 39 543 60 552 567 247 33.0 4e-65 MKNKITKLTGLLIILVFVVLACSGKKEEKIEDKGAGKIKDTIVYGISSSPTGIFNPLISD TVYDDAVNTMVYDSLLKLDKEQKLVPSMAESYEVSKDGLELTFKLKDNIKFSDGTPVTAK DVEFTLTSLADKDYPGELGDYISKVVGVNEYKSGAEKSVKGIQVIDDKNIKITFTEIYAP ALTNLGTVGIIPKHIWEKVPVANWKKEKDLLSKPVGSGPYKLAMFTEGQEVKFVKNENFE RKPKTENIVFKVINEDTVIADLKNGQVDIVNVSNLKKDDREALKKENYQLFTHSNNLFQY MGLNYRNEIFKDLKVRQAFIYAIDRKGMVDNILEGNGVVTNVPLLTSSWAYPKDSAGIEK YEYNPEKAKQLLKEAGYTEKDGVLTNAAGKQLSFKLDVPTGNKIRELAAQIIQENLKQIG VKVELNKMEFPALMQKVVANHDFDLYMMGNNLPTDPDLTAYWTSSSVSNEKGKMGWNISG FATPELDGILKEGISTFDLAKRGEIYGKFSKYMNENIPWIYLFEQEIVIAANPNLKGFEP SVFRDFADAENWEISQ >gi|261747353|gb|ADAD01000125.1| GENE 91 82632 - 83540 1043 302 aa, chain - ## HITS:1 COG:PA3398 KEGG:ns NR:ns ## COG: PA3398 COG0583 # Protein_GI_number: 15598594 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Pseudomonas aeruginosa # 3 293 5 290 308 138 30.0 1e-32 MDFKQLEVFVRLAENKSFSATANDLKISQPTVSLHIKQLEEELDAPLFVRSTRELKITLT GEKLYRQVKELLEKKESIVKSFSNKRKKEMILGVSTISANYIMPPLIKKFNEEFPDVYVN ISEKNSAETIKKVSDYKVDIGIVGMKIHDENCEFYPIYKDEFVFIAPNTDYYRKLKESNP SLKELVKEPFILREDGSGVKRNTELIFQSQNVNPASINTVASVNGIEVMKQLVARGVGTS LISKIAVEAWVKRGELLEIEIKENPHQYRQLYLVWNKKITLPTHVQEFLKIAKRGDIWDK SK >gi|261747353|gb|ADAD01000125.1| GENE 92 83691 - 83927 369 78 aa, chain - ## HITS:1 COG:no KEGG:Elen_1686 NR:ns ## KEGG: Elen_1686 # 
Name: not_defined # Def: hypothetical protein # Organism: E.lenta # Pathway: not_defined # 1 75 1 75 79 76 41.0 4e-13 MREKKEKIILTFHTTSESMRVEKIFKEYGIEGRLIPVPRQISAGCGIAWCSDIELKEKII EVIEEKKIEMEGIHEFVM >gi|261747353|gb|ADAD01000125.1| GENE 93 83924 - 85018 1862 364 aa, chain - ## HITS:1 COG:no KEGG:Ccur_00560 NR:ns ## KEGG: Ccur_00560 # Name: not_defined # Def: hypothetical protein # Organism: C.curtum # Pathway: not_defined # 1 356 1 357 381 339 55.0 9e-92 MEIANSKKGLLMGILMGLVAVMLAIAGNPKNMAICLACFIRDMSGAMKFHTTPVVQYFRP EIVGMVLGAFIMSIFSKEYKATGGSSPVIRFFSGMAMMIGALVFLGCPTRMILRMASGDM SSYIGLVGFLAGIATGSFFIKKGYSLEKRVEIKKENGYIFPIVMAVLLVLSAVTTGLFAV SAKGPGSIHAPLILSLLGGLIVGAISQKFRVCFTGAFRNIILLKNFEVFSVIFGMFITVM IYNVVTGQFKFVPYGPVAHAEALWNILGLYVVGFAGILLGGCPLRQVILAGQGSSDSVMT VIGMFVGAAAAHNFKLAAGPAAKATGTTPAAIGGPGINGKIMVIVCIIYLFYVAITGLKK EKTE >gi|261747353|gb|ADAD01000125.1| GENE 94 85061 - 86215 1524 384 aa, chain - ## HITS:1 COG:CAC2354 KEGG:ns NR:ns ## COG: CAC2354 COG0520 # Protein_GI_number: 15895621 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: Clostridium acetobutylicum # 1 381 1 376 379 325 44.0 1e-88 MIYLDSAATSYYKPEEVATAVYNAIKTMGNSSRGVNKASLSATRIIFETREKLAKLFNIK NSSQIAFMQNSTEALNVAVNGIFTGKEHIITTEAEHNSVLRPLYNLQKKGLELTILKCDK NGECDFSEIEKNIKDNTKAFICTHASNLTGNIMNIKKAGETCKKNNILFVVDASQTAGVI PIDVSENNIDILCFTGHKSLMGPQGTGGIYVKEGIELIQTKVGGSGTHTYEKEHPSEMPE RLEAGTLNGHSIAGLNAALDFLEKTGIENIHKKEQELMWKFYEGIKDIKGVRIYGKFYEG DKKVNRAPIVTINLNNVDSSEICDILDSEYGIVTRCGGHCAPLMHKALGTDKTGAVRFSF SYFNSEKDIEEAVKAVKEISEYDF >gi|261747353|gb|ADAD01000125.1| GENE 95 86202 - 86444 64 80 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038544|ref|ZP_06011913.1| ## NR: gi|262038544|ref|ZP_06011913.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 80 1 80 80 130 100.0 3e-29 MILPLIDKIHILQCKNISVSVTKISFFCFHSSYAFLNATTLDKVKKVGYSIYRKTFKYIK KINKCLNIKIRKGDDLIDIS >gi|261747353|gb|ADAD01000125.1| GENE 96 
86527 - 86916 540 129 aa, chain - ## HITS:1 COG:no KEGG:Amet_3596 NR:ns ## KEGG: Amet_3596 # Name: not_defined # Def: GrdX protein # Organism: A.metalliredigens # Pathway: not_defined # 6 125 2 121 127 80 39.0 2e-14 MWNLEKAILVTNNDKVYTKYKGELQCIFVEKYEDVLIKVRDLVYDRHILLTHPQASSLKP NQTPYRSVMVYPKEKEDNMKDILLIEKCLETFRQWQEIAKTPENYQSKISEDFKTIDLSV IDNIIPRIY >gi|261747353|gb|ADAD01000125.1| GENE 97 87017 - 88150 1659 377 aa, chain - ## HITS:1 COG:SA0609 KEGG:ns NR:ns ## COG: SA0609 COG3949 # Protein_GI_number: 15926331 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Staphylococcus aureus N315 # 36 377 1 323 326 99 25.0 9e-21 MGEDSNIINWKRVLILAGAIIAFTIGSGFATGQEIVQYYTAYGMSGLLVVAVFAITFLYY NFSFAKAGAEENFDRPNDIYKYYCGKYIGTFFDYYSTIFCYMSFWVMVGGAASTLNQGYG LPLWAGAVILTVITVVTVIGGLNSLVDAIGIVGPLIVVLCIAIGLITCIRDGVNIPQGLE IIKTSAYEGAKAGETIKNAGANWLISGLSYAGFVLLWFASFTTALAGKNSKKNLKYGIIG GTLAVCVAIVLVSFGQIANINLGSEGQYVWNAPIPNLILAAKIWKPFSAIFALVVFAGIY TTAVPLLYNPAVCFSKEGTLQFKILTISLALIGLVVGLFLDFRTLVNIIYVLNGYVGAVL IVFMLWKDIKVLKDKKK >gi|261747353|gb|ADAD01000125.1| GENE 98 88181 - 88423 366 80 aa, chain - ## HITS:1 COG:no KEGG:CD1741 NR:ns ## KEGG: CD1741 # Name: grdF # Def: sarcosine reductase complex component B subunit alpha (EC:1.21.4.3) # Organism: C.difficile # Pathway: not_defined # 1 76 359 434 434 103 69.0 2e-21 MVKGIEKYGIPVVHMATVVPISLTIGANRIIPGVGIPYPLGDPTQGEVDSKKIRKRMVRR ALKALQTPVTEQTVFEKDDF >gi|261747353|gb|ADAD01000125.1| GENE 99 88451 - 89500 1653 349 aa, chain - ## HITS:1 COG:no KEGG:Clos_0958 NR:ns ## KEGG: Clos_0958 # Name: not_defined # Def: selenoprotein B (EC:1.21.4.2) # Organism: A.oremlandii # Pathway: not_defined # 1 349 1 349 435 419 59.0 1e-116 MKKYKVVHYINQFFAGIGGEEKADHKPELREGSVGPGQALQEALGEEFEIVATIICGDNY FGENLEVATDTVLEMVKKYNPDVFVAGPAFNAGRYGVACGTVCKAVEERLGVPSVTAEYE ENPGVDMFRKDVIIMKTGNSAADMRKAVKSISNIVKKLAKGEEILGPAIEGYHERGIRVN YFAEERASERAIKMMLKKLKGEEFETDLPMPKFDRVDPAEPIEDIKKAKIAVVTSGGVVP TGNPDHIESSNATKYGSYSIEGMSELSPNDFTTIHGGYDRQFVMANPNLVVPLDVLREME 
KDGEFGELYNTFFATTGTGTATGSAAKFGTEIGQKLIDEHVDAVILVST >gi|261747353|gb|ADAD01000125.1| GENE 100 89513 - 90799 1829 428 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1637 NR:ns ## KEGG: CDR20291_1637 # Name: grdG # Def: sarcosine reductase complex component B subunit alpha # Organism: C.difficile_R20291 # Pathway: not_defined # 1 428 1 428 428 621 68.0 1e-176 MKLELGKIKINDIQFSDKTYVENHILYVNKEEVEKLVLEDDKLIGCSLDIARPGEKTRIT PVKDVIEPRVKVSGGEMFPGVIGKVSPTVGSGRTHALDGCCVITVGRIVGFQEGVIDMSG PAAEYCPFSGTVNLCVVIEPKEGLETHVYEKAGRMAGLKVAAFLGETARNLEPDTLEVFE TKPIFEQAAMYPDLPKIGYVHMLQSQGLLHDTYYYGVDAKQFVPTFMYPTEIMDGAIVSG NCVAPCDKVTTYHHFHNPVIEDCYKHHGKDINFMGVILTNENVFLADKERHSDMVAKLAE WMQLDGVLITEEGYGNPDTDLMMNCRKVERKGVKVVLITDEFPGKDGKSQSLADTCEEAT ALASCGQGNATLQFPVMDKIIGTMEYIESQIGGWAGCVNEDGSFEAEIQIIIASTIANGF NKLAARGY >gi|261747353|gb|ADAD01000125.1| GENE 101 91385 - 92809 867 474 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 9 473 6 453 456 338 41 7e-92 MGYEKLVDLINTINSFIASNILMWGLLGAGAFLTLLLGFPQITKMSRAFGMVFGGLFKKK ANSEEGSMSSFQALATAIAAQVGTGNVAGVATAITAGGPGAVFWMWISAFLGMGTIFTEA VLAQKYRKKIHGELVGGPAYYISYGLKKMGILSKFLAGFFAVSIILALGFMGNAVQSNSI ASGIKGISGLENINPGIIGVIVAVLAALIFWGGMQRIAKFAELVVPAMAAIYILASIIIL IKFHTEIIPTVSWIFESAFTPHAAVGGIAGSIVKVAVQKGIARGLFSNEAGMGSTPHAHA VAHVKHPVEQGLSAIVGVFIDTILVCSATALSILVTGAYNLKGANGKYLVGAQLTQGAFR NAFQEPGAILLAICLAFFAFTTIVGWYYFGESNIKYLFGKSALLPYRIIVIVCIIVGALQ EVDIVWSLADIFNSLMVIPNLIAIVWLSGEVKELLADYNKKYEKNDVYYDYEDK >gi|261747353|gb|ADAD01000125.1| GENE 102 93124 - 94269 1853 381 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_2237 NR:ns ## KEGG: CDR20291_2237 # Name: grdD # Def: glycine/sarcosine/betaine reductase complex component C subunit alpha # Organism: C.difficile_R20291 # Pathway: not_defined # 1 381 1 381 381 477 65.0 1e-133 MSKKIISEVLLELADAIETGNFEKKIKVGVTTLGSEHGSENIIKGAIAAKNDLFDIVLIG KGHEDFESYETQSEDEAHKIMEELLDKGEIASCVTMHYNFPIGVSTVGRVVTPGKGNEMF 
IATTTGTSATDRVEAMVRNAIYGIAAAKSVGIKKPTVGIINIEGARQTEKLLSELQKNGY DFEFTESQRADGGAVMRGNDLLMGTPDVMVTDSLTGNLLMKIFSSFTTGGDYEAQGYGYG PGVGENYDRRILILSRASGSPVVAKALKYAYEVATGEVNEKAREEYKKAQAAGLDKIFAE LKNKKQESKPSEEVKVPEKEAVTSQIAGVDIMDLEDAAKAIWKHGIYAESGMGCTGPIIL VSKANKEKAKEILKADGFLSE >gi|261747353|gb|ADAD01000125.1| GENE 103 94285 - 95811 2402 508 aa, chain - ## HITS:1 COG:no KEGG:CD2349 NR:ns ## KEGG: CD2349 # Name: grdC # Def: glycine/sarcosine/betaine reductase complex component C subunit beta (EC:1.21.4.2) # Organism: C.difficile # Pathway: not_defined # 1 505 1 509 510 780 76.0 0 MNYPVFKGAGYVLVHTPDMIENGSTCSIERETNPDSEFLKEIKNHIRSYEEVVNYLPNQV YIGRKTPEELNTFPMPWYNIKDQKGDRHGKFGEIVPQDEFLGIMQISDAFDLVKLSEEFV ASVRPKIEENYKELAPFFGQLKGSDISEIDPKDQKLIHEGKVVGYVKRAHDVDPNLNAHV MAENLVVKASGVISALQLLRNAKINPEEIDYVIECSEEACGDINQRGGGNFAKSIAEIAG LTNATGSDLRGFCAAPTHALISGASLVKAGIYKNVMVVAGGATAKLGMNAKDHVKKGIPV LEDVLGGFAVLISENDGVNPVIRTDLVGKHNVGTGSSPQKVISALVTPGLDKAGLKITDI DVFSVEMQNPDITMAAGAGNVPEANYKMIGALAVMRGDIAKADLKSFIEEKGLPGWAPTQ GHIPSGVPYVGFLREDLTTGNKNRAMIVGKGSLFLGRMTNLFDGVSFVAERNTGEKSSEE GAVSKEEIRKIIAESIKKLAQNIADAEE >gi|261747353|gb|ADAD01000125.1| GENE 104 95893 - 96126 385 77 aa, chain - ## HITS:1 COG:no KEGG:Amet_3591 NR:ns ## KEGG: Amet_3591 # Name: not_defined # Def: selenoprotein B (EC:1.21.4.2) # Organism: A.metalliredigens # Pathway: not_defined # 1 77 360 435 436 128 84.0 6e-29 MVKEIERAGIPVVHMCTVVPISLTVGANRIVPTIAIPHPLGNPKLDSIEEERKIRRKLID KALKALTTEVDGQTVFE >gi|261747353|gb|ADAD01000125.1| GENE 105 96154 - 97203 1801 349 aa, chain - ## HITS:1 COG:no KEGG:CD2351 NR:ns ## KEGG: CD2351 # Name: grdB # Def: glycine reductase complex component B gamma subunit (EC:1.21.4.2) # Organism: C.difficile # Pathway: not_defined # 1 349 1 349 435 528 75.0 1e-148 MSKIRAVHYINQFFAGVGGEEKAHIEPELRKELPPISQQLQAQLGDEFEIVATVVCGDSY FNENLEKAQATLLEMIKGLEPQLFIAGPAFNAGRYGVAAGTITKLIQDELHIPALTAMYI ENPGVDMYRKDVYIVEASDSAAGMRKVLPKLAKLATKLAKGEEILSPAEEGYIPKGVRVN FFHKETGSKRAVDMLIKKIKGEPFETEYPMPNFDRVDPNPAVKDLTKATIALVTSGGIVP 
KGNPDHIESSSASKYGEYSIAGVNDLTAETFETAHGGYDPVYANEDADRVLPVDIMRELE KAGKIGKLHDKFYTTVGNGTAVASAKGFAAEYAQKLKADGVDAVILTST >gi|261747353|gb|ADAD01000125.1| GENE 106 97234 - 97545 526 103 aa, chain - ## HITS:1 COG:no KEGG:TDE0745 NR:ns ## KEGG: TDE0745 # Name: grdA # Def: glycine/sarcosine/betaine reductase complex protein A (EC:1.21.4.3 1.21.4.4 1.21.4.2) # Organism: T.denticola # Pathway: not_defined # 1 101 50 150 157 131 71.0 8e-30 MDLENQKRVKEAAEKFGNENVAVLLGASEAEAAGLAAETVTVGDPTFAGPLAGVALNLAV YHIVEEAVKEQVDPAVYEEQVGMMEMVLDVDAISEEMKEVRGD >gi|261747353|gb|ADAD01000125.1| GENE 107 97561 - 97695 168 44 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_2240 NR:ns ## KEGG: CDR20291_2240 # Name: grdA # Def: glycine/sarcosine/betaine reductase complex protein A # Organism: C.difficile_R20291 # Pathway: not_defined # 1 44 1 44 156 75 86.0 1e-12 MSSLKNKKVIIIGDRDGIPGLAIEECVKTIEGTEVVFSSTECFV >gi|261747353|gb|ADAD01000125.1| GENE 108 97758 - 99044 2061 428 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_2241 NR:ns ## KEGG: CDR20291_2241 # Name: grdE # Def: glycine reductase complex component B subunits alpha/beta # Organism: C.difficile_R20291 # Pathway: not_defined # 1 428 1 428 428 682 79.0 0 MRLELGKIFIKDIQFASESKIENNVLYVNKEELIKAIWDDEAIVSVDLDIAKPGESVRIT PVKDVIEPRVKVEGRGGIFPGILSKVDTVGEGKTIALKGIAVVTTGKIVGFQEGIIDMTG PGAEYTPFSKLNNLVIIAEPAEGIKQHEHEKAVRYIGFKAAKYIGELAKDLKPEETKVYE TKPLLEQIAQYPDLPKVGYVYMLQTQGLLHDTYVYGVDAKQIVPTLLYPTEIMDGAIVSG NCVSACDKNPTYVHLNNGVIEELYEKHGKEINFLGVIITNENVYLADKERSSNWTAKFTK YLGLDAAIVSQEGFGNPDTDLIMNCKKIENEGVKTVIVTDEYAGQNGASQSLADADPKAD AVVTGGNANQLITLPKLDKVIGHIDVVNVIAGGNHDSLQADGSIVLEIQAITGATNETGF GYLSAKTY >gi|261747353|gb|ADAD01000125.1| GENE 109 99068 - 99385 566 105 aa, chain - ## HITS:1 COG:alr0052 KEGG:ns NR:ns ## COG: alr0052 COG0526 # Protein_GI_number: 17227548 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Nostoc sp. 
PCC 7120 # 3 87 6 90 107 73 37.0 9e-14 MLELNKDNFEAEVLQADGYVFVDFWSQGCEPCKALMPDVHKLAEKYEGKMKFTSLDTTTA RRLAIKQRVLGLPTLAVYKGGEKIDEVTKEDATVENVEALIKKYI >gi|261747353|gb|ADAD01000125.1| GENE 110 99460 - 100629 1490 389 aa, chain - ## HITS:1 COG:PH1043 KEGG:ns NR:ns ## COG: PH1043 COG1473 # Protein_GI_number: 14590880 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Pyrococcus horikoshii # 4 385 5 385 387 284 42.0 3e-76 MSNVLDKAKKYEDHIIKIRRKLHENPELSCQEFETKKLIVEELKKLGLEIKETSGTSLIA VLKGGKPGKTIALRADFDALPIIEQSDVEFISKNHGVMHACGHDGHTAMLLGAANILSEM KEEINGEIRFFFQEGEEIFAGAKKIIEAGGMKGVDACFGMHGMPIPTGTVNIESGYRLSG CDTIYVNFEGVSGHGSAPHLAKDTIHPACLFVTDLQSIITKNLNPQEAVVVSVGKFNGGT KANIISKYTELEISMRYFNPEARKTVHEAIIRHTKSIAEAYEIKVDVRIEESTLSLYNSP EMVELAKKVATEILGENCNVPIPRPMASEDMSYYFEHAKGVYIFIGYKNEEKGSIYFPHH EKFKLDEDYFKYGAAIFVEFALDYLKSNL >gi|261747353|gb|ADAD01000125.1| GENE 111 100645 - 101937 1538 430 aa, chain - ## HITS:1 COG:MA4166 KEGG:ns NR:ns ## COG: MA4166 COG0477 # Protein_GI_number: 20092959 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Methanosarcina acetivorans str.C2A # 18 386 3 385 444 187 33.0 5e-47 MNTNDIEKSRQIKKVESYRWIVWLILAATYVFVTFHRMSAGVVRSDLEEAFKIGAAQFAN IGSMYFYAYFIMQIPSGILADKIGPKKTVFIFSIIAALGSICFGLAPALGVAYISRFFVG IGVSVVFVCLIKIQSRWFYSRNFALMIGFAGLAANLGAIIAQTPLVIAVETFGWRRTFIY MGIIMVIFAILTLLFVRDDPTEMGLPGMDEIENRPIATSNINVMKALGSILSNPKTWTIS LVYIGLYTGYTVFLGTFGVSFLMSNYSITKIQAANYIIATVIGSAVSGLVIGYLSDKIKR RKSILIFSSIATLILWIVLIYVKLPLGLLTPYLFVFGFLMTAFTLCWTVGNEVNDRRLSG MSTGVVNCIGFLGAAVIPVIMGKVLDTYKNMPEIGYKKAYFVLIILVGLSTVFSFFSTET YATNVYKEKE >gi|261747353|gb|ADAD01000125.1| GENE 112 101960 - 102904 519 314 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae 
SP6-BS73] # 5 309 2 303 306 204 38 2e-51 MAEHFDLVIIGSGPAGLSSALYASRGKLKTLVLEKGQNGGQAAITHLIENYPGAIEDPTG PKLTQRMLEQAKNFGTEVRKEEVVEVDFSGKEKVIKCKDNEYTAKAVVIATGATPRKLDA PGIKELSGKGISYCATCDADFFKGLEVYVVGGGNSAVEEAIYLTKFARQVHIVHMLDNFQ CENITLEKAKSVPNLDIRLRTVVQEVKGDGILESIVFKNLDTNEVYEVEADEEDGTMGLF VFIGYQPQTELFKGKVEINQYGYIVAGENTETSVPGVFVAGDCRVKEVRQVITAAADGAV SAIAAEKYISKEFE >gi|261747353|gb|ADAD01000125.1| GENE 113 102934 - 103305 488 123 aa, chain - ## HITS:1 COG:no KEGG:CD2357 NR:ns ## KEGG: CD2357 # Name: grdX # Def: putative glycine reductase complex component # Organism: C.difficile # Pathway: not_defined # 1 121 1 120 123 120 53.0 2e-26 MKLITNNPNFLNYKKKDIEVDYRDVDYLKILEIARDYIHVNYELLTHPLYGSVKPNETIY RSIVLKSNDNLDHNSVVMISEAIETFVKFRKNKETPLWTDTVKEDFGVIDYDLITNAIER IIK >gi|261747353|gb|ADAD01000125.1| GENE 114 103763 - 104896 1546 377 aa, chain - ## HITS:1 COG:AGl2707 KEGG:ns NR:ns ## COG: AGl2707 COG1960 # Protein_GI_number: 15891460 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 8 375 55 421 421 120 27.0 3e-27 MKDYYKWAKEFADKNIAPYAEEIDKTGKFPEEIFKKISQEGFFKIVIPEELGGEGQGIYA HAKVTQAFAEACATAGLCYTMHNVALKFTLTFGREEIKQQVVKDIIENNKFMTLARSEFG TGVHVFNSQLEVENHDDYSVITGAKSMITSANYASYYLISVPADEKRGPVNWLIPYGAEG LSFKESDWNGLGMRGNNSCPMFMEKMKLDNKYAIYIDRQKANQKYPVNIDVIFFMAGLAS VYVGLSKTIYEAAKEHALNRKYPGDKSLADIETVQLHLGQLYSNAFAGESMIDAATRAIE NEEKDGFEKIMSARVVSSLNCVDSATLGMRIGGGQAYNNKGPLARLLRDAFAAPVMFPSV DVLRNWIARVITGKGLA >gi|261747353|gb|ADAD01000125.1| GENE 115 105156 - 105824 778 222 aa, chain + ## HITS:1 COG:YPO0274 KEGG:ns NR:ns ## COG: YPO0274 COG2391 # Protein_GI_number: 16120613 # Func_class: R General function prediction only # Function: Predicted transporter component # Organism: Yersinia pestis # 13 216 188 387 404 84 28.0 2e-16 MNTETKIKTRTQRKPFKSQIPFALALTVLVIAFGFYLKDVKLVTFWIFGLAFGIILQRSR FCFTAAFRDPILTGSTSLTRGVLLAIAIGTIGFTAISFGSVLSGGKFIGVDSVEPLSLLT IAGGIIFGIGMVMAGGCASGTLMRFGEGFELQWVSFIFFMLGSVAGAWAMGFLEPVFSNK 
KFAIYLPEHLGWIGALVFQFSIILILYIVALKWQKKKIGSDK >gi|261747353|gb|ADAD01000125.1| GENE 116 105862 - 106089 376 75 aa, chain + ## HITS:1 COG:SA1849 KEGG:ns NR:ns ## COG: SA1849 COG0425 # Protein_GI_number: 15927619 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted redox protein, regulator of disulfide bond formation # Organism: Staphylococcus aureus N315 # 4 73 3 73 74 57 40.0 9e-09 MKEYTLDCLGEACPIPLIKAQKKFETMETGDVLIINVDHSCAIKNIPEWAQEVGYNCEVE EVSEGEWNIIVEKSK >gi|261747353|gb|ADAD01000125.1| GENE 117 106104 - 106658 832 184 aa, chain + ## HITS:1 COG:PA3631 KEGG:ns NR:ns ## COG: PA3631 COG2391 # Protein_GI_number: 15598827 # Func_class: R General function prediction only # Function: Predicted transporter component # Organism: Pseudomonas aeruginosa # 30 176 25 166 408 82 35.0 6e-16 MEFFNKLSQNETYKRIMKTPLSYGAGAMLLGIAATAHLVIFKKAWGVTGSITVWGAKLFN LVGIDTTKWAYFIAHKGLGKSIATPILKDGGSLRNIGIIVGATLATLYASEFKIKKIKGK KQLLGGIIGVFFMGFGARLAAGCNIAALFSALSALSLTGWFFAASLLIGAIIGSKILVGY IMNE >gi|261747353|gb|ADAD01000125.1| GENE 118 106799 - 108199 1684 466 aa, chain + ## HITS:1 COG:ynjE KEGG:ns NR:ns ## COG: ynjE COG2897 # Protein_GI_number: 16129711 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Escherichia coli K12 # 10 439 12 432 440 249 34.0 8e-66 MKIKSKSILLTLIIIILGVFTACNNNSVKEIKTEELIKNLDNPEYVIIDTREDSLYNGFK DGKAERGGHIKNAIQFSCSWMNYIQDDKFESFAAGKGITKKKTLVFYDTNLDNLNCVSAE FASRGYKVRTYTDFVNYANDKNNPMESFPNFELSVSPQWLNTLLQGGKPETYNNDKFMLF EVSWGPLEKSKGYVQHITGAYHFDTDWIENGPVWNLSDPKVIEQNLLKNGITKDKTVILY SAENQLAAYRVFWALKWAGVQDIRVLDGNLVTWMDAGLPTETKVNTPKPETNFGTQIPAD TSVTITTPEQAMEMQQKEGLKLISNRSWDEYTGKVSGYDYIPGKGEPQGAIWGFAGTDSS NVADYYDPDGTLRNPKEIFELWATQNIHQNDKLAFYCGTGWRATVPWFMTQMAGWKNSYI YDGGWNAWQMDSKFPVQKGAPNNMKKPDAKNDYGKANKKPGGSCKA >gi|261747353|gb|ADAD01000125.1| GENE 119 108221 - 108775 677 184 aa, chain + ## HITS:1 COG:STM1965 KEGG:ns NR:ns ## COG: STM1965 COG2391 # Protein_GI_number: 16765303 # 
Func_class: R General function prediction only # Function: Predicted transporter component # Organism: Salmonella typhimurium LT2 # 31 175 45 184 421 88 37.0 7e-18 MTFFNKLSQNEIYKRIMKTPLSYSAGAALLGITATAHLIIFKEAWGVTGPITVWGAKIFQ LMGINTARWSYFVAHPGLKRGIMTPILRSGGSIRNIGIIVGATLATLYASEFKIKKIKGK RQLIGAVIGGFLMGFGARLAAGCNIAALFSALSALSLTGWFFAASLLIGAIIGSKILVTY IINK >gi|261747353|gb|ADAD01000125.1| GENE 120 108895 - 109839 435 314 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 8 303 2 296 306 172 36 9e-42 MEKIKEFYDVIIIGGGSAGLTAGIYCGRSKLNTLILEKSLIGGLATYTNEIENYPGFPEG ITGLDLMNLFHKQAKKFGVKFKQVPVKSVSLSNEKKVIETFRNIYEAKAVIIATGGKPKL TNSLNEEKFLYGKGISFCATCDAAANIDKTVMVVGNGDSAIEEGIFLTKFAKKVIVSVTR DTGNLKCHKAAREQALNNKKMEFIWNTCVHSFESEGELLEKVILKNTKTEELMPFNIDTC FLFIGYNPDTEIFKNSVNMNSNGYIITNEKMETNINGVFCAGDVRDKPLRQVATAVSDGA VAAYYAEKCMSESE >gi|261747353|gb|ADAD01000125.1| GENE 121 109987 - 110202 275 71 aa, chain + ## HITS:1 COG:no KEGG:Closa_1519 NR:ns ## KEGG: Closa_1519 # Name: not_defined # Def: SirA family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 70 1 70 73 73 45.0 2e-12 MIKINCMGYNCPLPVIMLREKYPLIMSGKKILLVTDHSLAVQNIEYYCSLNNLEYKSEEV LKGVWEITVSK >gi|261747353|gb|ADAD01000125.1| GENE 122 110195 - 111079 775 294 aa, chain - ## HITS:1 COG:all3953 KEGG:ns NR:ns ## COG: all3953 COG0583 # Protein_GI_number: 17231445 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Nostoc sp. 
PCC 7120 # 16 270 21 282 337 112 30.0 9e-25 MLNEIYEFFLKVSEMGSISKAADKLYISQPALSQQLKNLEKELDASLFIRSNKGIELTGE GKIVYKYFSMSEKLLQEMKKEIRDKKNNVKKIRISAVPTMCNYSLPCMIYHLKQHYPNTS VELYSRSDSFKIEEELVSERCDIGFVAEDIKNDLVLTSKKIFEEEILLVASNISINYPDF ITERELSRYDLINMATENEITRTVSRYINDFESYKITYNLESIDAIKSCVVNGYGLAFLP YSTIKKELYNKEMKIINIDNLSIVQNISLVKRVSETDGLKKIISYIEKHVSKFI >gi|261747353|gb|ADAD01000125.1| GENE 123 111106 - 112467 2250 453 aa, chain - ## HITS:1 COG:FN0366 KEGG:ns NR:ns ## COG: FN0366 COG1109 # Protein_GI_number: 19703708 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Fusobacterium nucleatum # 4 452 3 451 452 506 58.0 1e-143 MARKYFGTDGMRGEANKDLTIELVTNLGLALGYYLKKNKEGTVKPKIILGTDTRISGYMI RSALSAGLTSMGVHIDFVGVLPTPGVSYLTRKLNADAGIMISASHNPVKDNGIKIFSQNG YKLPDSVEEELETLMENTDELLKHQVSGDDLGRFKYVEDDMRIYLDFLSSTVKKSFKGTK IVIDAANGAAYRVASKIFQRLGADIIVINNIPNGKNINVDCGSTHPELLQEIVQVYKADL GLAYDGDADRLIAVDHTGKIIDGDLIIGIIAKNFKSKGLLNNNKVVTTVLSNMGFEKYLE ENGIGLIRANVGDRYVLEKMKEFGLNLGGEQSGHILMLDYNTTGDGVLSSIQLVSAMIES GKTLSELVKDIKLWPQDMKNITVSKEKKSDWKNNKALTEFIKEKEQEINGKGRILVRASG TEPLIRVMVEAETEEIVDKYVNELVNKVKEELL >gi|261747353|gb|ADAD01000125.1| GENE 124 112495 - 114087 1875 530 aa, chain - ## HITS:1 COG:BH0373 KEGG:ns NR:ns ## COG: BH0373 COG0642 # Protein_GI_number: 15612936 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 216 510 160 452 459 175 33.0 2e-43 MKKVKLEDRISISYVLLFLALILISNFVLVYILKKENTKSIQMAATAKEEELNEYLDKWE IYSRSDQFQWESPAGQLQGERIVYLRKPFNPGDNEYLYLMQVSSEDSRQILINSITPVEI LETGIDRENMVTKGKEIIDVVEKNNIKDNSIKGKDIKLEDKTSENEKTNDIYHVFKVSRT IKSRARDYKVEIYVLKNISQETKIYSRLQWLIVLFSLIGVIVTVLVSSFLSKRILRPVNS IIKTAKTINTDDLSKRIDVPKTEDELQNLTLIINEMLDRLEFSFDNQSKFVSDASHELRT PLAIIKGYAEIIKKRRLTDEEIFEESIDSIINEAENMKSLVQKLLFLAKGEITKINTNFS EIDAGELVHQIYTDTVVSVKDHVFNLEKSEEYKIKADTTLLQQAIRALIENATKYSDKNT NIYLISQKEGDFGRITVKDEGVGMTPEDSKRVFERFYRVDVSRTKATGGTGLGLAIVKRI VEIHNGKIEVNSELGKGTEISIILPLVKEKTEEKPKSGKSILKIGNNKNK 
>gi|261747353|gb|ADAD01000125.1| GENE 125 114099 - 114773 944 224 aa, chain - ## HITS:1 COG:BS_ykoG KEGG:ns NR:ns ## COG: BS_ykoG COG0745 # Protein_GI_number: 16078390 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus subtilis # 5 224 6 227 228 226 50.0 3e-59 MREKILVIEDDPKISRLLEIELKFEGFDVFFAYDGKEGLNMAKYGSYDLILLDVMLPKMS GMEVCKRIREGSQVPIIMLTAKDEISDKIVGFDYGADDYMTKPFSNEELLARIKALLRRT KKTVDHKGIFEFEDLKINYSTYEVFRGETLISLSKREFELLDFLVLNKGIVLSRDKILEE VWGFDYIGNDNILDLYIKYLRDKVDKPYERKFIQTVRGIGFIFK >gi|261747353|gb|ADAD01000125.1| GENE 126 114957 - 116231 1646 424 aa, chain - ## HITS:1 COG:FN1826 KEGG:ns NR:ns ## COG: FN1826 COG0826 # Protein_GI_number: 19705131 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Fusobacterium nucleatum # 5 408 5 410 410 468 55.0 1e-131 MQQKRKVELLAPAGNMEKLKTAFHFGADACFVGGSAFNLRGMSSNFKNSELRQAIDYVHS LGKKIFVTLNIFAHNSEIEYMPRFIKLLDEYGADAVIVADLGVFQLVREHASNLPIHVST QANNVNWMSVKTWRDMGAKRVILAREMSLKEIKTIKEKVPDVEIEVFIHGAMCMAISGRC LLSNYFTSRDANRGICAQDCRWNYKVIAEGHEETGAHDIIEEHGSTFIFNAKDLCTIEFI DKVLETGVDSLKIEGRMKSIYYNSTVVKQYRQALDSYYSGNYKYDPKWLYELQTISHRLY SKGFYLGVTTEEDQNYNTGSSYSQTYQLVAKVAEKLDDNKYVWQIRNRVFASNELELVRP HQDPVKFKIKNFLNTKNNEYIDVAHPNTVAIIETDVEMEPMDLIRVKLPEGKSDSDMEMG ENTL >gi|261747353|gb|ADAD01000125.1| GENE 127 116296 - 117747 2066 483 aa, chain - ## HITS:1 COG:lin0047 KEGG:ns NR:ns ## COG: lin0047 COG0305 # Protein_GI_number: 16799126 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Listeria innocua # 12 472 7 441 450 349 42.0 6e-96 MALFDESKSENDLKIPYSIEAEEALLGSIFIKPDVIGDIVEIITPNDFYKNNYRIIFSEM LNVYNTGKIVDALVIKDSLEYQNLLDEIGGEDILYGLTDVVPTAANAVTYAQIIKERSVQ RQLIETGERITRMAYRGYDDVEKMLDKAESMIFKIAESKQKKDVVSLKELAGLKISSLDE MSKYKGGLRGLSSGFTDYDSLTSGFHGSDLIILAARPAMGKTAFALNLALNVAKRGKHVL 
IFSLEMGNEQLFERLLSIDSKIKLKSIKDGTLADDDYTLLGNSMGRLSELPLYISDSSSV NILEIKAVARRLKAEGKLDFLLIDYLQLITPSEGSKKSREQEISEISRSLKIIAKELDIP VITLSQLSRGVESRNDKRPILSDLRESGAIEQDADMVMFLYREAYYASKVGMQPNGQNPN ESNMPQQYTSQPTVSSTPELEEVEVIIGKHRSGPTGTIRLGFRPSYQQFVNIISDARVNQ YPE >gi|261747353|gb|ADAD01000125.1| GENE 128 117754 - 118209 573 151 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229210387|ref|ZP_04336784.1| LSU ribosomal protein L9P [Leptotrichia buccalis DSM 1135] # 1 150 1 150 150 225 73 9e-58 MKVKVILKENIKGVGKKDEIVEVKDGYANNFLFAQNKGIPATPENINKLKSKNEKIKKTH DNDVKKANELKELLSSKEILLKVKAGNNGKVFGSVGGKEIADAIKEQLNLEVDKKKVSTD ARMKELGEHFIEIKLHPEVKAVIKVKLEGQE >gi|261747353|gb|ADAD01000125.1| GENE 129 118248 - 119738 1723 496 aa, chain - ## HITS:1 COG:FN1830 KEGG:ns NR:ns ## COG: FN1830 COG2812 # Protein_GI_number: 19705135 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Fusobacterium nucleatum # 1 490 1 482 484 357 42.0 2e-98 MNITLYRKYRPQNFDEIAGQEFVTRAIKNSLREDKLSHAYLFTGPRGVGKTTIARLIAKG VNCLNNGITDNPCGVCDNCREIAQGISMDMIEIDAASNRGIDEIRELKEKINYQPVKGRR KIYIIDEVHMLTKEAFNALLKTLEEPPAHVIFILATTEIDKIPDTVISRCQRYDFLPIDE KDITKLLKEVAEKENITIDDESLDLIYRKSEGSARDSFSIFEQVISNFNGEDIDISKTQK ALGVVPDIVLNQFLELIQKGNKEELLDFMDKIWEEGLVIETFLKDFAYYLKEQFKKDRGL SIDFILNTVSSIFFILNEFRYEEDKRLLGYVLVHEFYKDKVKKTVYTEQPVQKSESISSI NPVMSDKSEKISEKKSENVITEQQNHDISLFESKWKEIKKGIKKESVMIEALMADTYPEK LEGNTLTIRFPEGHKFHSSRIMETENKLKIEKIINKLCNSDILISTTFGGEDGGSSEDEF VNKVIDFFEGTIIDSK >gi|261747353|gb|ADAD01000125.1| GENE 130 119922 - 120614 1016 230 aa, chain - ## HITS:1 COG:BS_rph KEGG:ns NR:ns ## COG: BS_rph COG0689 # Protein_GI_number: 16079889 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNase PH # Organism: Bacillus subtilis # 5 229 6 231 245 285 64.0 4e-77 MRVNRKNDEMREVKVTKNYIMHPEGSVLIEFGNTKVICNATVEEKVPPFLRGTNSGWITA EYSMLPRATNNRVQREANKGRLSGRTMEIQRLIGRALRAAVNLEKLGERTVIIDCDVIQA DGGTRTASITGGYLALELAIEKLIDKGKLNEIPINSRVAAVSVGKIKNEILLDLEYEEDF 
RADVDMNIIMNDKGEFIELQGTGEEATFTQEELLKFIEISKKGFEKLFNL >gi|261747353|gb|ADAD01000125.1| GENE 131 120645 - 121877 2052 410 aa, chain - ## HITS:1 COG:TM0720 KEGG:ns NR:ns ## COG: TM0720 COG0112 # Protein_GI_number: 15643483 # Func_class: E Amino acid transport and metabolism # Function: Glycine/serine hydroxymethyltransferase # Organism: Thermotoga maritima # 3 410 4 417 427 542 68.0 1e-154 MSYIKEFDSEVYNAIINEEKRQEEGIELIASENFVSKAVMEAAGSVFTNKYAEGYPEKRY YGGCRNADTVEQLAINRLKEIFGAKYANVQPHSGSQANMGVYVALLEPGDTILGMGLSSG GHLTHGYKVNFSGKNYKGIEYGLHPETEMIDYEAVRNLALENKPKIIVAGASAYSRIIDF KKFKEIADEVGAYLMVDMAHIAGLVAAGEHPNPLKYADVVTSTTHKTLKGPRGGIILTNN EEIAQKIDKVIFPGIQGGPLMHIIAAKAVAFKEALSPEFKEYQRQVVKNAEVLSEELVKG GLRIVSGGTDNHLMLVDLRPKGVTGKLAEEKLEEAGITCNKNAIPNDPEKPFITSGIRLG TPAITARGMKEKETAEIARMILKVLENVNDKEKIKEVKNEVYELTKKFPL >gi|261747353|gb|ADAD01000125.1| GENE 132 121990 - 122754 1142 254 aa, chain - ## HITS:1 COG:CAC3586_1 KEGG:ns NR:ns ## COG: CAC3586_1 COG1058 # Protein_GI_number: 15896820 # Func_class: R General function prediction only # Function: Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA # Organism: Clostridium acetobutylicum # 1 228 1 229 245 207 46.0 2e-53 MKAEIICVGTELLVGDIVNTNAQYISEKLTSIGVDLYYQTTVGDNFERVKNCIEAAFNRV DLVITTGGLGPTGDDITKEVIAEYFNEELEVSESHYKEIVKIYEDRGYEVSEGSEKEASI LKNSVLLKNQNGYAPGFFYEKENKKIIVLPGPPKELTWMVDNEVLPLLSKYSDSILLMKT LEITGVPEGEINDRLKKYCDMSNPTVAPYAKNKCVHMRIAMKGPRNNIENIKKEIERIAE EIKNIYPHVVEIDK >gi|261747353|gb|ADAD01000125.1| GENE 133 122779 - 123219 531 146 aa, chain - ## HITS:1 COG:FN0974 KEGG:ns NR:ns ## COG: FN0974 COG0346 # Protein_GI_number: 19704309 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Fusobacterium nucleatum # 1 141 1 140 142 218 74.0 3e-57 MNKITCICLGVRDMEKAVKFYRDKLGFQTEEKSDYPPVIFFNTPGTKFELYPLDLLVTDI SEANDIKIKKGFGGFTLAYNVPNKEDVDLTIELVRKAGGTIVKEPQEVFWGGYHAYFSDL DGYYWEVVWGPNFEFDENGLLVIQKD >gi|261747353|gb|ADAD01000125.1| GENE 134 123255 
- 123662 364 135 aa, chain - ## HITS:1 COG:SP0256 KEGG:ns NR:ns ## COG: SP0256 COG0454 # Protein_GI_number: 15900191 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Streptococcus pneumoniae TIGR4 # 14 125 13 123 138 65 33.0 3e-11 MIIFKEFSYDDFHENVKQIYEKEEWSAYLGNDLKLKRAFDNSLYKLGAFDGNRLIGFIRC VGDGEHVVLIQDLIIDNEYRKQGTGSELFRRIYKKYKDVRMLVLITDIADEAANHFYQSL NMKKLEKGGMISYFR >gi|261747353|gb|ADAD01000125.1| GENE 135 123755 - 125455 2387 566 aa, chain - ## HITS:1 COG:BH0872 KEGG:ns NR:ns ## COG: BH0872 COG0366 # Protein_GI_number: 15613435 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus halodurans # 7 562 3 556 559 713 62.0 0 MKKDFREQWWHKSTVYQIYPKSFNDTTGNGQGDIKGIIEKLDYLKELGVEVLWLTPMYKS PQADNGYDISDYYNIDENYGTMEDFEKLLEEAHKRGLKIVMDIVVNHSSTENEWFKKSEA GDPEYKDFYIWKDAVDGKEPTNWQSKFGGNAWKYSEKRGQYYLHLFDVTQADLNWENENV RKKVYEMIRFWLNKGVDGFRLDVINLISKDQRFLNDDGSDKRFVPDGRRFYTDGPRIHEF LKELNKEAFGGEELITVGEMSSTSIDNCIRYSNPDEKELSMAFSFHHLKVDYPNGEKWVK APFDFVELKKIFSKWQKGMYDGNGWNATFWNNHDQPRAISRFGNDGKYHNESGKMLATVL HGLQGTPYVYQGEEFGMTNPYFDNINKYRDVESHNIYKIKEKEGLSDKEILDILMQKSRD NSRTPMQWNDSKNAGFSEGTPWIGIPENYKFINAEAALKDKNSIFYHYKKLIELRKNEDL LVTGKYEDIDLENKNVYAYKRTGDNGELVVISNFYENEVPFELKNNGINDLEKAEILISN YETNPEFKDGKIVLKPYESIIFKKVF >gi|261747353|gb|ADAD01000125.1| GENE 136 125480 - 125929 491 149 aa, chain - ## HITS:1 COG:SP0796 KEGG:ns NR:ns ## COG: SP0796 COG4405 # Protein_GI_number: 15900689 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pneumoniae TIGR4 # 24 145 21 140 146 108 48.0 4e-24 MNINKIIEKYIETLTEEERKNIYISKFSFGDENDIQMQNYLAQLVLKGEKTATTSLYSLY DFENERIPQVGDVNVILDGNSNEVCVTVNTKVYRLPFKDISEEYAYKEGEGDKSLEYWKK VHKDFFIKEAEGNFNESMEVLCEEFELLK >gi|261747353|gb|ADAD01000125.1| GENE 137 125951 - 126280 433 109 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0807 NR:ns ## KEGG: Lebu_0807 # Name: not_defined # 
Def: branched-chain amino acid transport # Organism: L.buccalis # Pathway: not_defined # 8 105 1 98 101 68 48.0 8e-11 MNNFTIVLLILVTALITAFIKLAPLFIKIPDDNPIINKFFEALPYTVLTILIFPDIFTST GTGTFNLIKVLAGIVIIVWLSLKKFSLGIVVPVSIAVIFMFDIVKLFIK >gi|261747353|gb|ADAD01000125.1| GENE 138 126273 - 126983 965 236 aa, chain - ## HITS:1 COG:BH2910 KEGG:ns NR:ns ## COG: BH2910 COG1296 # Protein_GI_number: 15615473 # Func_class: E Amino acid transport and metabolism # Function: Predicted branched-chain amino acid permease (azaleucine resistance) # Organism: Bacillus halodurans # 4 233 9 236 237 85 26.0 8e-17 MGHKGKLENYLKGVKDGLGIGIAYIPSGMTLGLISNSFGMKSILMALMSLTLYAGSAQTI LLKALYISKSTFIEIAVSVFMINLRYSLLNLIIYREIKNHTTLGERIFVGLGLTDETVAY LRIRKAKNPYYMMGVNTLPYLLFGAGSVIGSLFGDLIPEIFSASMNFILYAAFLSLLVSA LKANFKYIKVVLIVIVFKKIFEYIPVSNGWAMIFIMFLSSLVYALMTYKEGVKENE >gi|261747353|gb|ADAD01000125.1| GENE 139 127183 - 127347 266 54 aa, chain - ## HITS:1 COG:FN0705_2 KEGG:ns NR:ns ## COG: FN0705_2 COG0749 # Protein_GI_number: 19704040 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Fusobacterium nucleatum # 1 54 448 501 501 59 51.0 1e-09 MLLQVHDELIFEVKDDAIEKYTDKIKEIMENTVNFEDIKLSANGSVAKDWGALK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:19:54 2011 Seq name: gi|261747329|gb|ADAD01000126.1| Leptotrichia goodfellowii F0264 contig00189, whole genome shotgun sequence Length of sequence - 22109 bp Number of predicted genes - 24, with homology - 23 Number of transcription units - 10, operones - 5 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 11 - 343 430 ## COG3070 Regulator of competence-specific genes 2 1 Op 2 . - CDS 401 - 502 107 ## 3 1 Op 3 . - CDS 517 - 993 456 ## COG3467 Predicted flavin-nucleotide-binding protein 4 1 Op 4 . - CDS 1019 - 1660 853 ## COG1051 ADP-ribose pyrophosphatase 5 1 Op 5 . 
- CDS 1683 - 2858 1929 ## COG1015 Phosphopentomutase - Prom 2878 - 2937 5.1 6 2 Op 1 . - CDS 3491 - 4081 558 ## COG0500 SAM-dependent methyltransferases 7 2 Op 2 7/0.000 - CDS 4129 - 4794 1213 ## COG0274 Deoxyribose-phosphate aldolase 8 2 Op 3 . - CDS 4831 - 6135 2123 ## COG0213 Thymidine phosphorylase - Term 6155 - 6185 0.3 9 2 Op 4 . - CDS 6220 - 7263 1691 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein - Prom 7297 - 7356 16.4 10 3 Tu 1 . - CDS 7388 - 8149 976 ## TDE0523 hypothetical protein 11 4 Op 1 26/0.000 - CDS 8282 - 9151 1568 ## COG1079 Uncharacterized ABC-type transport system, permease component 12 4 Op 2 24/0.000 - CDS 9152 - 10216 1521 ## COG4603 ABC-type uncharacterized transport system, permease component 13 4 Op 3 . - CDS 10209 - 11729 167 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 14 4 Op 4 . - CDS 11779 - 12798 1183 ## Lebu_0605 hypothetical protein 15 4 Op 5 5/0.000 - CDS 12838 - 14619 1902 ## COG0322 Nuclease subunit of the excinuclease complex 16 4 Op 6 . - CDS 14641 - 15522 1052 ## COG1660 Predicted P-loop-containing kinase - Prom 15701 - 15760 7.4 + Prom 15559 - 15618 8.4 17 5 Tu 1 . + CDS 15693 - 16202 566 ## Lebu_0809 hypothetical protein 18 6 Tu 1 . - CDS 16264 - 17283 977 ## Gura_3044 helix-turn-helix, type 11 domain-containing protein - Prom 17355 - 17414 7.6 19 7 Op 1 . - CDS 17435 - 18661 1268 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase - Term 18681 - 18724 6.2 20 7 Op 2 . - CDS 18740 - 19417 564 ## gi|262038644|ref|ZP_06012012.1| hypothetical protein HMPREF0554_2322 - Prom 19439 - 19498 5.6 21 8 Tu 1 . - CDS 19524 - 20504 392 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 - Prom 20556 - 20615 8.2 22 9 Op 1 . - CDS 20680 - 21342 427 ## gi|262038643|ref|ZP_06012011.1| hypothetical protein HMPREF0554_2324 23 9 Op 2 . 
- CDS 21327 - 21446 201 ## gi|262038646|ref|ZP_06012014.1| hypothetical protein HMPREF0554_2325 - Prom 21471 - 21530 8.2 - Term 21660 - 21710 13.0 24 10 Tu 1 . - CDS 21785 - 22108 487 ## COG0747 ABC-type dipeptide transport system, periplasmic component Predicted protein(s) >gi|261747329|gb|ADAD01000126.1| GENE 1 11 - 343 430 110 aa, chain - ## HITS:1 COG:SP0951 KEGG:ns NR:ns ## COG: SP0951 COG3070 # Protein_GI_number: 15900829 # Func_class: K Transcription # Function: Regulator of competence-specific genes # Organism: Streptococcus pneumoniae TIGR4 # 1 75 1 75 75 85 56.0 3e-17 MPSSKEYLEFILEQLSDLEEITYRYMMGEYVIYYRKKVIGGIYDDRFLIKVTNASKNLLS DAKLELPYERGSKMILIDDPENKEQYRELFEKMYEELPAPKVKKKKAVKK >gi|261747329|gb|ADAD01000126.1| GENE 2 401 - 502 107 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MILNVYEYKLENNIVCKVRSPKEDMVIKRYRNQ >gi|261747329|gb|ADAD01000126.1| GENE 3 517 - 993 456 158 aa, chain - ## HITS:1 COG:FN0030 KEGG:ns NR:ns ## COG: FN0030 COG3467 # Protein_GI_number: 19703382 # Func_class: R General function prediction only # Function: Predicted flavin-nucleotide-binding protein # Organism: Fusobacterium nucleatum # 1 157 1 157 158 159 52.0 2e-39 MRRKDREITDNEKIKKIISECDCCRLGLNDNGKVYIVPLNFGFTEENGKYTFYFHGAKTG RKYDILKVNNYVGFELDTNHKIYFKNEDVACTYTSAFQSVIGNGRVTFIEDYEEKLKGLL ELMKHNTEKSEWKFDERMINSVCVFKLEVEELSCKEHE >gi|261747329|gb|ADAD01000126.1| GENE 4 1019 - 1660 853 213 aa, chain - ## HITS:1 COG:BH3089 KEGG:ns NR:ns ## COG: BH3089 COG1051 # Protein_GI_number: 15615651 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Bacillus halodurans # 1 201 1 195 207 156 47.0 3e-38 MEDREQWLKWAMELQSLAQAGLNYAKDVFDVERYTRIREISAEILTKKGDVSLEKVNMMF CNEVGYQTPKMDTRAAVFKNDKILMVKERDKKWALPGGWVDVYLSIKENTEKEVKEEAGV KVTAKKIIAVQDGFKNNFGCGLGQVPYGISKIFVLCSLDEENEGEFVKNIETSERKYFSL DELPELSEIRNTEKQIKMCFKAYKSDTWEAVFD >gi|261747329|gb|ADAD01000126.1| GENE 5 1683 - 2858 1929 391 aa, chain - ## HITS:1 COG:BS_drm KEGG:ns NR:ns ## COG: BS_drm COG1015 # 
Protein_GI_number: 16079407 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphopentomutase # Organism: Bacillus subtilis # 5 375 7 376 394 393 52.0 1e-109 MKDINRITLIVLDSVGAGELPDAADFDDVGSNTLGNMAKAAGGMSLPNMGKLGLGNITKI EGTPAVKEAEGAYGKAVEVSIGKDSTTGHWEIAGVPLERPFPNYSNGFSDEVIKEFEEKT GRKVLWNRPASGTVIIDKFGEEQIKTGDWIVYGSADPVFQIAANEDVIPLEELYKGCEIA LEICNRISPVARVIARPYVGKKVGEFKRTANRHDFSIDPPKESMLERLEKAGLDVVGIGK TSDLFNGKGITDNRKANQDNLDGIKKTVAALKENTKGLIFTNLVDFDAVYGHRRNVQGYV NALIEFDNWLPEIEKNLRDDEILIITADHGNDPTYKGTDHTREYIPILVTGKKVKKNVNI GVRKTFADIAATIEEILLGTEKEGSFAKEIL >gi|261747329|gb|ADAD01000126.1| GENE 6 3491 - 4081 558 196 aa, chain - ## HITS:1 COG:MA3783 KEGG:ns NR:ns ## COG: MA3783 COG0500 # Protein_GI_number: 20092579 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Methanosarcina acetivorans str.C2A # 30 188 34 191 249 132 40.0 5e-31 MKNKNDNEKIREKVDEFYQKIFDKEYKTHISPEKISESLGYNQEMLKQLPEEADTGLSCG NPLENLILNDGETLIDLGCGTGKDIFLTRMKYPNSGILYGMDRLEDIIKKAEKIRDMKKF KNIEFRRGTLTNMPFDSNSIDKAISNCVINLEPDKQKTYDELYRILKSGGKFYISDIILK KDLPKEWRKSEKMHCT >gi|261747329|gb|ADAD01000126.1| GENE 7 4129 - 4794 1213 221 aa, chain - ## HITS:1 COG:L63310 KEGG:ns NR:ns ## COG: L63310 COG0274 # Protein_GI_number: 15673422 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Lactococcus lactis # 1 214 1 214 220 219 57.0 2e-57 MEINKYIDHTILKATAQKKDIKKLCDEAKEYKFFSVCVNGANVKYAYEQVKDSDVKVAAV VGFPLGAMSKDVKVFEAKKAIEDGASEIDMVINVAALKDGEYAYVEQEIREIKEAIGNNV LKVIIETCYLTDEEKKKACELSLNAKADFVKTSTGFGTGGATFDDVKLMKSIVGDKAQVK ASGGVKDLITAQKYIELGATRLGTSSGIEIIKGLEVEEGKY >gi|261747329|gb|ADAD01000126.1| GENE 8 4831 - 6135 2123 434 aa, chain - ## HITS:1 COG:BH1533 KEGG:ns NR:ns ## COG: BH1533 COG0213 # Protein_GI_number: 15614096 # Func_class: F Nucleotide transport and metabolism # Function: Thymidine phosphorylase # Organism: Bacillus 
halodurans # 1 433 1 432 433 384 48.0 1e-106 MRVVDLIEKKREKQELTKEEISFLLSEYLKGNVPDYQMSSFLMATYFNDMTADELLEFTT IMRDSGDTIKFDEIDKFLVDKHSTGGVGDKVTVVLAPVLAALGMGTAKLSGKGLGHTGGT IDKFESIDGFKFSNTRDELVSVANKTGIGLMGYSDKIVPLDKKLYSLRDVTGTVPSIPLI ASSIMSKKLAIQSDVIILDVKVGDGAFMKDIEHAKELAKRMIEIGKDAGRKVKVVLSNMD EPLGYSIGNANEITEAIETLKGNGPEDLKEIVYAIASLALKAKGEIKELSEGKSKIEEVI NNGSALKKLSEFISESGGNGNLVNDYNLFPQPKSMTEVFSEKDGYVSKIKAEEIGKAAMN IGAGRATKEDIIDHAVGIKILKKVGDKVNKGEKIAEIYYNDDKNVNNSRSMILDAYVIQS DKVDKQKAILEIIE >gi|261747329|gb|ADAD01000126.1| GENE 9 6220 - 7263 1691 347 aa, chain - ## HITS:1 COG:AGc200 KEGG:ns NR:ns ## COG: AGc200 COG1744 # Protein_GI_number: 15887481 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 46 343 31 326 337 227 45.0 2e-59 MKKIVTIFSIMATLFLFVACGGSKPAEGEQKKDGGETAQKADAAKKVAIVYSTGGKGDKS FNDAAFRGLERAKKELGITFDEYEPKDPATEAKDALTKFAETGEYQLIIGVGFTMKDSVM AVAQQFPDQKFAIIDEKIENVPNIASLSFKEHEGSFLVGALAAMMSKSGTIGFVGGMESP LIQKFQAGFEQGAKYVNPNIKTLSVYIGGNSAFNDPASAKTKTETLIQQKADVVYHAAGA SGQGVFQAAKEKNVYAIGVDSNQDGIAPGTILTSMMKYVDNAVFNEVKDTLEGKYQPTIQ EFGIKEDGVGTTEFEFTKDKIGEENIKKLEQIKQDIKDGKIVVKPTL >gi|261747329|gb|ADAD01000126.1| GENE 10 7388 - 8149 976 253 aa, chain - ## HITS:1 COG:no KEGG:TDE0523 NR:ns ## KEGG: TDE0523 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 54 249 3 197 197 278 69.0 1e-73 MNGGMKVRTEKQIITPEWLKRYQEIKDLLVSDTNYGELFEKEEIFGKKLFHIDMGEVNFP TGKILVRDPLVYLNKRELPYFQEVSTGKYKLVTLVAEIEEDHYRYTLSRVKFNDEKAVVY REAMIGNEAVEDDLEKGSFFGFAVDAGLATIVDVKTRDAYADFEKEWYEKNPDGNIYDDF FAEEFKKSYNENPEFQREDGDWINFKIPGTDLTVPMVQTGFGDGVYPVYFGYDKNNEVCE VIIEFIFLGEDEE >gi|261747329|gb|ADAD01000126.1| GENE 11 8282 - 9151 1568 289 aa, chain - ## HITS:1 COG:CAC0705 KEGG:ns NR:ns ## COG: CAC0705 COG1079 # Protein_GI_number: 15893993 # Func_class: R General function prediction only # Function: Uncharacterized 
ABC-type transport system, permease component # Organism: Clostridium acetobutylicum # 14 289 14 307 310 242 50.0 8e-64 MNILAILIFLIKQTLIIAPPILITAVGACISERSGVVNIGLEGIMLSGAFATTVVNLSTG NPYLGIIVGVITGGLISLIHAVISINLKGDQIISGVALNLFAVAMTSFLIKTIYRVAGST PSAQKLADKYVVLAIIYILAILTHFLVFKTVLGLRIRAVGEHPLAADTVGINVYKIRYIS VILSGMFAGLGGAYLTAVMLPAFSNNMSAGRGFMAMAAMIFGKWNPIGAILASLLFAFGQ AFADYAKTINLPIPQQFLSMIPYVLTLLALVGFVGKSKAPKDSGQPYEK >gi|261747329|gb|ADAD01000126.1| GENE 12 9152 - 10216 1521 354 aa, chain - ## HITS:1 COG:TM0104 KEGG:ns NR:ns ## COG: TM0104 COG4603 # Protein_GI_number: 15642879 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Thermotoga maritima # 1 354 1 343 344 181 38.0 2e-45 MTKKLKEFLPSIIAVFIALVIGGIVMMTKGVNPLLAYTDMAKAAFYRSSARAPFMGGLAK TLFTATPLIFSALAAMVAFKAGLFNIGAQGQMIAGGLAATFWAVTFRNYFLGNTIVVLIV AMAAGFLWAGIAGYLKSKFGVNEVISTIMLNYIMVDFQNYLLNGILKDPTSQNTQSEKVF EGARLPLLFAKITKQNLNFGFIIAILVVVGIYFFFKYMKKGYEIKAVGLSETVAENAGIN PKNMMFLAMGLAGICAGLGGAERVLGGSAQYAYTELIMGDAGFTGLAVALLGKNNPFGIL IAAIFYAALDVGGQTLQLRYQVDKEIVLIIQALIIIFVASENLFKFFISKRKEK >gi|261747329|gb|ADAD01000126.1| GENE 13 10209 - 11729 167 506 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 272 481 16 214 311 68 25 3e-11 MSDYILEMKNIRKEFLGGKIVANDDITLKIKKGEIHAIVGENGAGKSTLMKILNGLYAPT SGEIFYKGKKTDITSPSVAANLGIGMVYQHFMLVEPLTVAENMVLGFEPRKVGAFFDLAT ARKQVIEVSEKYNLNINPNAKVSDLSVGIQQRIEILKILFKGAELLIFDEPSAVLTPQEV IELYAIMRNLVKEGKTIIFITHKLHEVLELSDNITVIRKGKDVGNLKTSEATKEKIANMM VGRVVLFEVKKPDVKLGETTVKVDNITVRGENGIEKVKGVSFDIKEGEVLGIAGVEGNGQ TELIEALAGLAKIESGTYSVAGESLENKTPKFIKEKGLSHIPENRHKRATIDDFTIEENM ALGLQDKYSKGALMDYPVIEKNTKDNMEKYDIRPLDGKIKFGGLSGGNQQKVVVARELER ENKFIIAAQPTRGVDIGAIEMIHNTILNEKTKKKAILLVSAELSEIMALSDRIAVMYSGK IVGVLDRKDATMEKLGILMAGGKLDD >gi|261747329|gb|ADAD01000126.1| GENE 14 11779 - 12798 1183 339 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0605 NR:ns ## KEGG: 
Lebu_0605 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 338 1 338 338 355 64.0 2e-96 MKKLILISMIMMNMLGFGIFDKIRSEIKLKEEEEPKFKEVKVIINGKEKTRKVIPGRYEI KLFELNYPTDIFGKNDFYNEFNRVLEKSDKNKKYSVLLEKRFYDDIQKIKIDEMSEEEKV LELPSSSLEEYDRQELTGLSKGLNVNQYAGTDQEYLNRQIYVLYKLLGKKDFNPDNVYSE LDKEFLISELKQKRKRTVQNFENSKFEYFVKESYNNVDFNKFEDDSVYKFDKDIVISEKI EDQALLPKIKNNLYLTDFPEKIIEDLSKSKLKLKREILLLENNQFHGYTYNENNTVIFIG GKTGKYYFNDYKIDVVLRKVNVSELLKTNSDYYVSDFFY >gi|261747329|gb|ADAD01000126.1| GENE 15 12838 - 14619 1902 593 aa, chain - ## HITS:1 COG:FN1090 KEGG:ns NR:ns ## COG: FN1090 COG0322 # Protein_GI_number: 19704425 # Func_class: L Replication, recombination and repair # Function: Nuclease subunit of the excinuclease complex # Organism: Fusobacterium nucleatum # 12 590 7 585 589 568 53.0 1e-161 MEKIKPPVEYKNMPNSPGVYLMKNINGKIIYVGKAKNLKNRVSSYFKNINSHNVKTLELV KNIDDIEFFICKTEVEALILENNLIKKYRPKYNILLKDEKTYPYIKFTREKFPKIEIVRS TKKLNEKADYFGPYPSGIFFAVKTLLRIFPVRDCKRSMEKVTIPCLKYFMNTCTAPCKYK DIEEEYNENVDNFKNFLKGQRTEIFDVLEKRMKKFSDNMEFERALAERNKIEALKKLLQT QIIEYSKEIDEDVFVFEETSEMVFLCVLNIREGKVINKNHIKISLERGHEDNLFERLVTS YYEKRNIPKNIISDVKYNENEELIKEWAKIEKEKDIKLHFPKINSRRQQLLEMGYLNLKE EVGKHFRQKKIVQEGLQKLKIELQLKEQPQRIECFDISNIQGKDAVAAMTVAINGEVEPR EYRHFKITVKDTPDDFLMMREALTRRYSKLSEEEFPNLILIDGGKGQLGIAVDVLEKLGK IEFTDIISIAKREEEIFKSYESEPYIFEKTDETLKILQRLRDEAHRFGITHHRKLRSKRN IKSALDDIEGIGPKRKKELINKFGTIGNIRNATIEELIEIVPEKVAVSIKEML >gi|261747329|gb|ADAD01000126.1| GENE 16 14641 - 15522 1052 293 aa, chain - ## HITS:1 COG:BH3569 KEGG:ns NR:ns ## COG: BH3569 COG1660 # Protein_GI_number: 15616131 # Func_class: R General function prediction only # Function: Predicted P-loop-containing kinase # Organism: Bacillus halodurans # 4 286 2 289 295 224 42.0 2e-58 MEKNVKKQLVIITGMSGAGKSQAMDFFEDRGYFCIDNFPLNLFQYLNEIFISSEKRDKVA VAIDIRNQEFITQFHKQLKLLDDLKIDYTIIYLDARTDVLLSRYELSRRKHPLNEHDTLL ENIEEERNIIKNFMLKADIIIDTSSTTVKEFQEILQKEFSEKMTKLSINLTSFGFKYGIP LDMHLMFDLRFLPNPYYIEDLKKKTGNHPDVSNYVMGLEESREFYKILYDMLVYLIPKYE 
KDGKSHLRIGIGCSGGQHRSATFVNRLYEDLSKRFDYKITKFHREVGDELEIS >gi|261747329|gb|ADAD01000126.1| GENE 17 15693 - 16202 566 169 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0809 NR:ns ## KEGG: Lebu_0809 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 18 169 18 169 169 174 61.0 1e-42 MKNFFKLIVFLLVLLIGGGTYLYFQNQVKVPNSMIQAAVNSKFPMEKSYPLGKIKLYNPK SHFENDKLIIEADYMNDALNDKISGTMTFETDLKYDLMDAKLYLNDFKLIKVTKEGKEID MDKKPIIRTVLNFAFGQLEKKELLNLKQVEKFQMIKDIKIENNKVVVIK >gi|261747329|gb|ADAD01000126.1| GENE 18 16264 - 17283 977 339 aa, chain - ## HITS:1 COG:no KEGG:Gura_3044 NR:ns ## KEGG: Gura_3044 # Name: not_defined # Def: helix-turn-helix, type 11 domain-containing protein # Organism: G.uraniumreducens # Pathway: not_defined # 10 338 2 334 346 187 36.0 5e-46 MGFSKEYKNEIKNYIIEIIDKGQNPYQKVPDKYQVSRQTLSKYIKEFLEVDIIEKKSKNI FELKNYVMFSELLENNKLEEDVIYEKLISNYEKEKKENVIRILNYAFTEMLNNAIEHSNG KEISILYAENYKRIFICIEDNGIGIFKKIKESHNLENENQAIFELQKGKLTSDIKRHSGE GIFFVSKVVNFFRIKSFDKEFHTGDKHNIYSFEKIQKDKQVKGTEVLFIIEKDTDREISE VFKEYTNDYIFDKTTITVHLAKEYMGQKFISRSEARRVLLNVDKFKTVVLDFSMVENIGQ GFADEIFRVYADKNSNITILPINQNENVDFMIKRALNSK >gi|261747329|gb|ADAD01000126.1| GENE 19 17435 - 18661 1268 408 aa, chain - ## HITS:1 COG:FN0297 KEGG:ns NR:ns ## COG: FN0297 COG2256 # Protein_GI_number: 19703642 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Fusobacterium nucleatum # 1 405 1 405 407 404 51.0 1e-112 MNLFENLYDEKKPLAFRYRPRTLEDFYGQEKIVGEKGVLRKIIEKGTFMNSIFWGSPGTG KTTLAEIIANRMNYNYEYLNAIKSSVSDIKELSERAKRIFGIEGRQTLLFFDEIHRFNKL QQDSLLQDLENGNIILIGATTENPYYSLNNALLSRCMAFEFKKLNKEDLFKILKNINDKE KFDFSEEILEYISEIIEGDARQAINVLELLSNTGVNFTLEEVKEVLNTKKSYHKTEDKYD TISAMIKSIRGSDPDAAVYWIAKMLSGGEDPMYLARRLVILSSEDIGLANPQALAIATAG MQATKEIGMPEARIILSEVAIYLASSPKSNSAYLAIDNAISHIENDKIQEVPKHLTKLGA KDYKYPHSYPEHFVKQDYMTKKIKFYEHGDNKFENAANERLEKLWGKK >gi|261747329|gb|ADAD01000126.1| GENE 20 18740 - 19417 564 225 aa, chain - ## 
HITS:1 COG:no KEGG:no NR:gi|262038644|ref|ZP_06012012.1| ## NR: gi|262038644|ref|ZP_06012012.1| hypothetical protein HMPREF0554_2322 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2322 [Leptotrichia goodfellowii F0264] # 1 225 1 225 225 335 100.0 1e-90 MEELKKSLSGKYLEMGEYFNKAFELFPKVMQKEVILVGIMVLLSITASFTMKTPLRYIIP VFSGIVTLLLLQKVIYSIDKIGDLSFEEEKTNNNMIKCIIIALFSTVPLVSLIFVIFCLI YPYFMVSYLSENMNFSQAKEYAGIVSPGNRMRIILPGFIITICISLITMPLIYIGGIGLV IAGIVSALLSILFVSIHSIIYLNVKYMNEKNGNQNNNQEVIEIEQ >gi|261747329|gb|ADAD01000126.1| GENE 21 19524 - 20504 392 326 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 23 322 1 303 306 155 35 2e-37 MDFNLNLGALEEGKLRKIDENELYDVIVIGTGPAAMSASIYAVRKGLSVAQIGLKVGGQI LDTNEIENIIGTPKTTGAEFAASMEKHMKEYEIAFKEGHLVKEIKEGGKDKVLVTDDGKS YKTKTVIIATGAEWKQLNIPGENEYIGKGVHYCSTCDGPFYKNLDVAVIGGGNSGVEAAL DLSGIAKHVTLLEFMPELKADMVLQDKLSERDNIKVITNAQTTAVEGTQFAEHLKYLDRA TNTEKDLKIDGVFIEIGLSPKSGLVKDLVELNKMGEIIIDEYNMTSNKGIFAAGDVTAVK QKQIVVAIGEGAKAALGAFEYIIKKY >gi|261747329|gb|ADAD01000126.1| GENE 22 20680 - 21342 427 220 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038643|ref|ZP_06012011.1| ## NR: gi|262038643|ref|ZP_06012011.1| hypothetical protein HMPREF0554_2324 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2324 [Leptotrichia goodfellowii F0264] # 10 220 1 211 211 318 100.0 2e-85 MVFLLATGLMFIEFGENIPYLDILIMVLGFIVMSIARDLFYRNIANKIENKNDADNGIKK VLIFNGIAILVLLGNGIIFYCILISGLENILTLYRMLIIICVIIIIQVFIYCLFFIYFTP LYLIRNLTFSESLKYNFHLCKSNKARIFFPMLITEIPILAVNSIFNRIPYLGSLLNIGLS VFGNIFIAALMTIIYLNVEYMDRKKDVEKTEKDISDEELI >gi|261747329|gb|ADAD01000126.1| GENE 23 21327 - 21446 201 39 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038646|ref|ZP_06012014.1| ## NR: gi|262038646|ref|ZP_06012014.1| hypothetical protein HMPREF0554_2325 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2325 [Leptotrichia goodfellowii F0264] # 1 39 1 39 39 67 100.0 4e-10 
MEGLRELLESKYLGMGTYFSEAFDIFTKILVKEKNWFFC >gi|261747329|gb|ADAD01000126.1| GENE 24 21785 - 22108 487 107 aa, chain - ## HITS:1 COG:BH3636 KEGG:ns NR:ns ## COG: BH3636 COG0747 # Protein_GI_number: 15616198 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Bacillus halodurans # 1 106 509 610 610 71 35.0 3e-13 DPDPYGTAGRKAMFNYTRFASDENDKLMAEIASPKTLEDPNYKAEALIKWQEYYINQAVE VPLTYRYQLYPVNKRVKNFYVGYDAEKLGKMVHLVELTADAPIKAKN Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:20:48 2011 Seq name: gi|261747317|gb|ADAD01000127.1| Leptotrichia goodfellowii F0264 contig00225, whole genome shotgun sequence Length of sequence - 12090 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 5, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 434 345 ## Lebu_1140 MORN variant repeat protein - Prom 536 - 595 10.9 + Prom 407 - 466 9.9 2 2 Op 1 . + CDS 570 - 1529 793 ## gi|262038652|ref|ZP_06012019.1| putative membrane protein 3 2 Op 2 . + CDS 1587 - 3521 2205 ## Lebu_1143 hypothetical protein 4 2 Op 3 . + CDS 3549 - 4097 589 ## Lebu_1144 hypothetical protein + Term 4140 - 4173 -0.5 + Prom 4100 - 4159 3.4 5 2 Op 4 . + CDS 4181 - 4978 1210 ## COG3023 Negative regulator of beta-lactamase expression + Prom 5000 - 5059 5.7 6 3 Op 1 . + CDS 5105 - 6307 1565 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 7 3 Op 2 . + CDS 6330 - 7034 802 ## COG1636 Uncharacterized protein conserved in bacteria + Term 7108 - 7148 1.2 - Term 7096 - 7135 2.4 8 4 Tu 1 . - CDS 7170 - 7571 692 ## COG1970 Large-conductance mechanosensitive channel - Prom 7595 - 7654 10.9 + Prom 7608 - 7667 7.8 9 5 Op 1 . + CDS 7704 - 9500 2317 ## COG0481 Membrane GTPase LepA 10 5 Op 2 . + CDS 9516 - 10103 720 ## Sterm_2117 exonuclease RNase T and DNA polymerase III 11 5 Op 3 . 
+ CDS 10133 - 11812 1463 ## COG1061 DNA or RNA helicases of superfamily II Predicted protein(s) >gi|261747317|gb|ADAD01000127.1| GENE 1 2 - 434 345 144 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1140 NR:ns ## KEGG: Lebu_1140 # Name: not_defined # Def: MORN variant repeat protein # Organism: L.buccalis # Pathway: not_defined # 3 144 6 149 308 120 45.0 2e-26 MKKTLKKSILILFSCIVLSSLSFSENINMNPKRSQDYSSLYTEAMNSVYSTPETVSLSNN FAITKEDSAPFTGKLVEFNKNSEIKSVKNFKNGLYDGKMYFYFDNGNLLKILEYSQGVQT GEEIEFYSDGISKTVKNYKNGLLN >gi|261747317|gb|ADAD01000127.1| GENE 2 570 - 1529 793 319 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038652|ref|ZP_06012019.1| ## NR: gi|262038652|ref|ZP_06012019.1| putative membrane protein [Leptotrichia goodfellowii F0264] putative membrane protein [Leptotrichia goodfellowii F0264] # 1 319 1 319 319 526 100.0 1e-148 MEEIKEKLSENYLKIGDYFNISFEVLKKIISQNKLLTGGLFLAVFILATVNQKIIESLGV IKDMIEVGSDYYPENLSQYYGMNFLITPCAIATTVILVVIMKKAAILIEEKSKFNLVEVI KKFFIVYLFDIIVYMIVNIIPFILMITGATIFLSKVSAHDPREEILAVLIVFGIILFIYY ITVCTVIWLKLLYFRPLFYLRNVKLKEAVYYNFHLCKGNRLRIILPGIILFVFQWFFFIP FQVINFITLNNFQIAFILLSSILTTFLSTFSCVLTVVIYLNVEYMDLKKFNDGQSDGNDK IENKKNLPKNNLFENNSEN >gi|261747317|gb|ADAD01000127.1| GENE 3 1587 - 3521 2205 644 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1143 NR:ns ## KEGG: Lebu_1143 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 2 637 39 665 668 509 48.0 1e-142 MKIFNLFKRSVEKEDQKPEILENIMNENQEISYLDEFGKEQKILKSEWVNHHLKPAVEKN WNDVEELYSIVMDAFSKNVYYEAKDACLRIYSLDNNKERRTNVLGIYYIKNKEYDQAIDL YKKYTFYEKPSEAIYSNYAKVLEKKGNHEEAEKLYLKALNLNPNSSNAFKKYFDNIKKRN IAEYNTKLEEFAHLEDTWRAKLTLAVKYFKSNEIEKGKELLNKALKESEYDSEVMSMASG IYGMNSLFEEFENEILPNFNPEKHSAYTAMNILEYYKRNEDFEKGLELCKFASKFAWFEF TDNFINYEKIFLNMKIRADKEIEEENFYIEKPQEKFFTTNIPIWYYEFNNPAWILNKKKR TSPNLLILSFTSIGDYDKTSRNLAVSIPLFINEVLHYKTDLNYQLAVAYEGDYLKIANKR YSTDYMKLIKSQNPHLDYVLSGNVVKNVNEDKEEIFEIEIYLYDCSNEMKIKLIDASYKK DEVYKVQKDVVNKFNNFFECIDKNFEVTETNSENLLLYSPKIKFLIGNPKYKEFLSWKYK 
KLLFEQIKVVLDDRENELKKVNLVALLYEISKTNSQLLKCEKPSIYNMVSRGIFKSKVSR LLVPIIFNVYNDEANYDAYMETLNEENPLYIDWVNKFLDYVNED >gi|261747317|gb|ADAD01000127.1| GENE 4 3549 - 4097 589 182 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1144 NR:ns ## KEGG: Lebu_1144 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 6 182 6 209 221 69 35.0 8e-11 MNITSKKSISIIIFLCYIISDLLFLKTADRDYANIILLFSSTILFVFEVLFWGMLFLSSD GRERKSSVELLFLGTLAGVGLSRIFLISSPYINDLLNANIVLAYIIGIIRVAFIFAAIMN IFYFFDTKNIFLIIISILNLVCAILIWVDFDSGINGIIRLIIGISAIIFMIMSKNKTFGE SD >gi|261747317|gb|ADAD01000127.1| GENE 5 4181 - 4978 1210 265 aa, chain + ## HITS:1 COG:FN0164 KEGG:ns NR:ns ## COG: FN0164 COG3023 # Protein_GI_number: 19703509 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Fusobacterium nucleatum # 2 264 15 286 288 236 46.0 3e-62 MSKDIQRPPAVKKTTEVISNDAGKITLDREYNSKGQNFRERFIIVHYTALDRNRSLEVLT TQEVSAHFLISDEEKDPVYALVDENRRAWHAGVSEWKTSKNLNDSSLGIEIVNPGYIKEN EFVPYKDFQIKELAVLIKYLSEKYQIPATNILGHSDIAPQRKQDPGPLMPWKKLYTDYNI GMWYDDAVKESFKNQYIQTFYTLPVLEVQKELKKFGYSIEITNQWDKQTQNVIRAFQHHF RPELFDGKMDLETYSILKALNEKYK >gi|261747317|gb|ADAD01000127.1| GENE 6 5105 - 6307 1565 400 aa, chain + ## HITS:1 COG:CAC1001 KEGG:ns NR:ns ## COG: CAC1001 COG0436 # Protein_GI_number: 15894288 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Clostridium acetobutylicum # 1 395 1 393 395 485 59.0 1e-137 MEFSNRILSMQYSPIRKLVPYADAAKKEGVRVYELHIGQPDVQTPDTFFDGLNNYKEKIV KYANSAGLAELIDSFSKSYISSGMNILPGEILITNGGSEAIQITLQTICNPGDEVLVPEP YYSNYDSFLKSAGAVLVPIKTTIENNYHLPEKEEIVKLITPKTKAIMFSNPSNPTGIVFT NKEMEMIKEISLEHNLYIITDEVYRQFIYDEDIEYKSFMTVPEVSDRTILIDSISKHYSA CGARIGVVASKNKEFIAQALKLCQARLAVSTIEQFASTNLINTLDTYIDNVKLEYKVRRD LMYNHLKEMPGVVSFKPDSAFYIFAELPVDDIEKFVVWLLTEYRYKNQTLMFAPGPGFYS EEGKGMKEARFSFCTHNLTEIENGMKVLKKALEEYKNVKK >gi|261747317|gb|ADAD01000127.1| GENE 7 6330 - 7034 802 234 aa, chain + ## HITS:1 COG:L6208 KEGG:ns 
NR:ns ## COG: L6208 COG1636 # Protein_GI_number: 15673349 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Lactococcus lactis # 7 231 2 228 241 316 66.0 3e-86 MLDEKDIKDAKDILSRMNPNQKINYHTILTKLISDWKSRELRPKILLHSCCAPCSTYTLE FMTQYADVTILFANSNIHPEAEYRKRALVQKEFIEEFNKRTGNNVGFIEDEYKPVDFFRK VRGLEREKEGGARCTVCFQMRLDIVARRAQELGYDYFGSALTLSPKKNSELINNIGLEVQ EIFDVKYLPSDFKKNNGYSRSVEMCKEYDVYRQCYCGCVFAAMDQGIKFEKNEI >gi|261747317|gb|ADAD01000127.1| GENE 8 7170 - 7571 692 133 aa, chain - ## HITS:1 COG:VCA0612 KEGG:ns NR:ns ## COG: VCA0612 COG1970 # Protein_GI_number: 15601370 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Large-conductance mechanosensitive channel # Organism: Vibrio cholerae # 1 133 3 133 136 131 52.0 4e-31 MFKEFKEFISKGNVMDLAVGVIIGGAFGKIITSLVDDMIMPVLGIILGKINFSALKLIIT PAEGDKPEVAVLYGSFIQNVVNFLIMAFVIFLMVKAINKLRKPAKEVAEEIVEEVPSKEE KLLTEIRDILKSK >gi|261747317|gb|ADAD01000127.1| GENE 9 7704 - 9500 2317 598 aa, chain + ## HITS:1 COG:FN0777 KEGG:ns NR:ns ## COG: FN0777 COG0481 # Protein_GI_number: 19704112 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane GTPase LepA # Organism: Fusobacterium nucleatum # 4 598 7 604 604 920 78.0 0 MLDQKNKRNFSIIAHIDHGKSTIADRLLEITGTVNQREMKDQLLDSMDLEREKGITIKAQ AVTLNYKSKNGETYEFNLIDTPGHVDFIYEVSRSLAACDGALLVVDAAQGIEAQTLANVY LALENDLEILPIINKIDLPSADPDKVKLEIEDVIGIPSDNAALVSGKTGLGIEDLLESII EYIPAPKGDVNAPLKALIFDSHYDDFRGVITYIRIVEGKLSKGDKIKIMSTEKEFDVLEV GIFSPKMKEVPELTVGSVGYIITGIKSIKDTQVGDTITHVKNPTAIALAGYRPALSMVFA GVYPISTDDYEDLREALEKLQLNDASLTYTPETSLALGFGFRCGFLGLLHMEIIVERLRR EFNIDLISTAPSVEYHVTPEVGEMMIIDNPAEFPAGKKYIEEPYVKGTIIVPKDYVGNVM ELCQEKRGTFLTMNYLDDTRTMISYDLPLAEIVIDFYDKLKSRTKGYASFEYEMIGYKES DLVKVDILVSGNPVDAFSFIAHKDNAYYRGRAIVEKLKEVIPRQQFEIPLQAALGTKIIA RETIKALRKNVLAKCYGGDITRKKKLLEKQKEGKKRMKAIGNVEIPQEAFLSVLKLND >gi|261747317|gb|ADAD01000127.1| GENE 10 9516 - 10103 720 195 aa, chain + ## HITS:1 COG:no KEGG:Sterm_2117 NR:ns ## KEGG: Sterm_2117 # Name: not_defined # 
Def: exonuclease RNase T and DNA polymerase III # Organism: S.termitidis # Pathway: Purine metabolism [PATH:str00230]; Pyrimidine metabolism [PATH:str00240]; Metabolic pathways [PATH:str01100]; DNA replication [PATH:str03030]; Mismatch repair [PATH:str03430]; Homologous recombination [PATH:str03440] # 1 194 1 195 195 284 73.0 2e-75 MKKNIIFFDVETNGKMGSSVLSISAIKVSYDFEKNSWEKVSEYDRFYFRNEGEPINFGAI NVNGLTDEVIAEKRENADYPSTFKEDLDSFYLYCQDTKHFVAHNIKFDRSFIPFPLKNQF DTMMENIDIVRAGINEYGSYKWPKLMECAKFYNVPMDEEQLHESLYDVLITFRVFYKMSK NPFGKPRIAGFLEKD >gi|261747317|gb|ADAD01000127.1| GENE 11 10133 - 11812 1463 559 aa, chain + ## HITS:1 COG:VC0812_2 KEGG:ns NR:ns ## COG: VC0812_2 COG1061 # Protein_GI_number: 15640830 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA or RNA helicases of superfamily II # Organism: Vibrio cholerae # 270 540 29 284 790 194 38.0 5e-49 MGILLEKGNKNGEIINIENYSEVLKFEISEYFSNNISSLIEKNDYQRAMEYIENQLDKKF EIPLRIDEKRVSNDKLNTLIINEKAKFMNFFVHLKKELLSCKKFYFIVSFIKYSGIQLLI SILDELEEKGIQGEIITSVYLNITDPKALKKLLSYKNIKVKIYNNSNESFHTKAYLFEKE KYHTCIIGSSNISQSALYSAEEWNVKLVNSSFLDIYEKSLVQFQKLWNSNEAVKLSEDFI EKYDDYRNNNKPQNTFDYKKLQVQKKNFEPNSMQKEILEKLKITRKNGNKKGLVVAATGT GKTYLAAMDIKEFFKSKNIIGYNRKKARFLFIAHREELLENAAKVFSNILEVAEDDFGKI FGGKKESDFEMIFASIQSLRSCYKEFRKDKFDYIIIDEFHHASADSYNKVINYFSPEFLL GLTATPERMDGKDILALCDYNLVGEIGLKKAMEKDLIAPFHYFGVNDETVDYEEIPYKNG VYNEEILLENLSKSVRTDYIVEKIEKFGYDGNKMSGIAFCQNIEHASYMKNEFIKKGYKS KVLTAKTNKTERSKIFRKL Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:21:23 2011 Seq name: gi|261747315|gb|ADAD01000128.1| Leptotrichia goodfellowii F0264 contig00207, whole genome shotgun sequence Length of sequence - 666 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 665 1099 ## COG4625 Uncharacterized protein with a C-terminal OMP (outer membrane protein) domain Predicted protein(s) >gi|261747315|gb|ADAD01000128.1| GENE 1 2 - 665 1099 221 aa, chain + ## HITS:1 COG:FN1950_2 KEGG:ns NR:ns ## COG: FN1950_2 COG4625 # Protein_GI_number: 19705252 # Func_class: S Function unknown # Function: Uncharacterized protein with a C-terminal OMP (outer membrane protein) domain # Organism: Fusobacterium nucleatum # 1 221 221 430 432 144 40.0 1e-34 FDRYGGESKGDGFGISLYGRLGNKEVPYYLQGRIGIGFITSNVERDILLGSGDISRAKIE HRDKVFSGYLESGYDTKIGSLTITPYVGLSHDTVERGAFSEENSQFGLTADKKRYNQTSA LLGLRLGKSVNWANGSKTTFQGYVTQYIGFKKQDLSFEAAYSGLSNARFKVEGIGLSKNS TWAGIGVLTEVNPRFAWYVNYDAKMEKNKLNNNVFTTGFRF Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:21:24 2011 Seq name: gi|261747313|gb|ADAD01000129.1| Leptotrichia goodfellowii F0264 contig00142, whole genome shotgun sequence Length of sequence - 396 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 95 138 ## 2 1 Op 2 . 
- CDS 106 - 396 404 ## Lebu_1474 hypothetical protein Predicted protein(s) >gi|261747313|gb|ADAD01000129.1| GENE 1 2 - 95 138 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKIVMLGLLAIGSIVMAESAQDVLRKARED >gi|261747313|gb|ADAD01000129.1| GENE 2 106 - 396 404 96 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1474 NR:ns ## KEGG: Lebu_1474 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 96 33 128 128 84 61.0 2e-15 VEAKFNDLLEKEAQKKREFEAQKAQLEAEVEDLKAKEQGKEKLFEKLKKDSEVRWLRDKY KQVLNNYDTYYKNIAKMIREKEQKISELEAMLSVMN Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:21:32 2011 Seq name: gi|261747305|gb|ADAD01000130.1| Leptotrichia goodfellowii F0264 contig00091, whole genome shotgun sequence Length of sequence - 6169 bp Number of predicted genes - 8, with homology - 7 Number of transcription units - 5, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 286 317 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases - Prom 323 - 382 1.9 2 2 Op 1 . - CDS 391 - 477 161 ## 3 2 Op 2 . - CDS 509 - 1438 1164 ## COG0077 Prephenate dehydratase 4 2 Op 3 . - CDS 1476 - 1682 320 ## gi|262038668|ref|ZP_06012032.1| chorismate mutase - Prom 1714 - 1773 7.8 + Prom 1535 - 1594 9.3 5 3 Tu 1 . + CDS 1829 - 2563 1043 ## COG2071 Predicted glutamine amidotransferases + Term 2734 - 2777 5.2 6 4 Tu 1 . - CDS 2764 - 3267 771 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family - Prom 3353 - 3412 7.4 + Prom 3312 - 3371 12.6 7 5 Op 1 . + CDS 3403 - 5508 2214 ## COG3968 Uncharacterized protein related to glutamine synthetase + Prom 5513 - 5572 4.8 8 5 Op 2 . 
+ CDS 5604 - 5942 363 ## Lebu_1552 hypothetical protein Predicted protein(s) >gi|261747305|gb|ADAD01000130.1| GENE 1 1 - 286 317 95 aa, chain - ## HITS:1 COG:SP0280 KEGG:ns NR:ns ## COG: SP0280 COG1187 # Protein_GI_number: 15900214 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Streptococcus pneumoniae TIGR4 # 1 93 65 157 240 109 56.0 1e-24 MNKPDGVISSTEDDEHRTVIDLLENKYRTYSVFPIGRLDIDTEGLLILTNDGILTHKLLA PNKHVDKKYYVELKNPVLKSDIEKLENGIELENGF >gi|261747305|gb|ADAD01000130.1| GENE 2 391 - 477 161 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRLDRFLANSGVGTRKEVKEILKKEKLK >gi|261747305|gb|ADAD01000130.1| GENE 3 509 - 1438 1164 309 aa, chain - ## HITS:1 COG:VC0705_2 KEGG:ns NR:ns ## COG: VC0705_2 COG0077 # Protein_GI_number: 15640724 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydratase # Organism: Vibrio cholerae # 47 306 8 269 278 175 35.0 1e-43 MDSSKEYQKFKIGVSEEYINDLEFDGKKLGYTGVPGAYAYEVMINLMKNNEISNGKTTDE NILNFNSHKELIEAVEAGKADFGILPIENSIAGEVTDSIDLINKRNIHIVGEVRHKIEHN LLGIKGSKIEDIKRIYSHEQALMQCSDFLEKHSYWKKEKVANTALAAKYIKDTESKENGC IANMRAKEMYDLELLEKNINNEKENYTRFFIISNKNLISENSKKVSIITGTKNESGALME LLKIFSVYGLNMVSLKSRPKPNKPWEYYFYIDFEGNLKEEKVKKALEEIRIKSIYLQVLG NYKIYNSEI >gi|261747305|gb|ADAD01000130.1| GENE 4 1476 - 1682 320 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038668|ref|ZP_06012032.1| ## NR: gi|262038668|ref|ZP_06012032.1| chorismate mutase [Leptotrichia goodfellowii F0264] chorismate mutase [Leptotrichia goodfellowii F0264] # 1 68 1 68 68 73 100.0 7e-12 MNNGKKYNLSEIREKIDKIDKEIVELIEKRLEIVKEVALYKKENNMKVFDSKREKEVLEK NLLNIKKC >gi|261747305|gb|ADAD01000130.1| GENE 5 1829 - 2563 1043 244 aa, chain + ## HITS:1 COG:FN0505 KEGG:ns NR:ns ## COG: FN0505 COG2071 # Protein_GI_number: 19703840 # Func_class: R General function prediction only # Function: Predicted glutamine amidotransferases # Organism: Fusobacterium nucleatum # 1 242 
1 241 243 266 52.0 2e-71 MDKKPIIGISGSILIAEGGVFTGYQRAYVNHAYVESVIKAGGIPFIIPFNTDKEVTKEQI KYVDGLILSGGHDIFPQLFGEEPKQHIGETFLDRDNFDILLLKTAVDQKKPVLGICRGHQ VINVTFGGTMYQDLSYNKDIYIKHSQATKWDRPTHTVEIKENSFLSEIFGKEGLVNSFHH QVVNKVADDFKVTALSKDGVVEGIENISDDKFILGVQWHPESMIHTDKNSAKLFKRFVER VKKN >gi|261747305|gb|ADAD01000130.1| GENE 6 2764 - 3267 771 167 aa, chain - ## HITS:1 COG:TM0564 KEGG:ns NR:ns ## COG: TM0564 COG1853 # Protein_GI_number: 15643330 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Thermotoga maritima # 24 163 17 156 159 73 31.0 2e-13 MEFHEIKIQDFNLNPFTSTKEWCLVTAGDEKGINTMTVTWGMMGYLWGGPVLSVYIRPER YTKKFVDEQKFFSLSFFSKEHRPALRTLGVKSGRDGDKIKEVDFHPVFLDGVPAFEEADL VIVAEKLYEDNIKPEKFLDESIPAKLYNNEGAHIIYTGRVKKIYVKE >gi|261747305|gb|ADAD01000130.1| GENE 7 3403 - 5508 2214 701 aa, chain + ## HITS:1 COG:CAC2658 KEGG:ns NR:ns ## COG: CAC2658 COG3968 # Protein_GI_number: 15895916 # Func_class: R General function prediction only # Function: Uncharacterized protein related to glutamine synthetase # Organism: Clostridium acetobutylicum # 1 701 1 696 696 663 49.0 0 MENAMKVFGENVFTESNLKKRVPKSVFKEFQASQLGETELSKESAEVIANAMKDWATKRG ATHYCHWFQPLTDLTAEKHESFLEPTGDKDIIYKFSGSSLIKGESDASSFPNGGLRSTFE ARGYTVWDTSSYPFIRENKNGVTLYIPTAFISFNGEALDKKVPLLRTMKYINKQSLRILK ALGNTEAKHVFNTLGIEQEYFLVNKDLFEARDDLLLTGRTLFGAPAPKGQELSDHYYGKI KEKIINFMSDVDVELWKLGVPAKTRHNEVAPNQFEIAPLFSVANLASDQNQIIMETIEKI ALRHGLVALLHEKPFAGVNGSGKHNNWSLSTDDGKNLFSPGKDPKTNKRFLLFVSAVIEA IDRYYPLLRVTTASATNDHRLGGHEAPPAIISIFLGDELTSIFENIALKKKLPVSEMSKL NLGVDVLPTLNTDSGDRNRTSPFAFTGNKFEFRMPGSSSTPATSAAAINATVGKVLKEYA DKLEKTQKKDLDKTITEIITNAYKKHGRIIFNGNGYSDEWKKEAEKRGLTNELSSNLALR KHLDSEILQMTEETGMLSKQESIARYNAYAERYVTHLVIESKTLIDIANKDILPAGLKYA NLLADNIEKVSKFNKNYAKEQELLLKEVIMNVTNLRKEIKSLEKKIEKVKNEKDLGKKTD LAAMYLATGLESLRVPCDNLEKVVDNSIWALPTYKDLLFKL >gi|261747305|gb|ADAD01000130.1| GENE 8 5604 - 5942 363 112 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1552 NR:ns ## 
KEGG: Lebu_1552 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 109 1 109 111 82 43.0 7e-15 MKKAKSFTTVLSVLLLMTGINYISSAKTLKKVQKEVSITAYNCGNENIKVTYPTKKTAKV TDSEGKVYNLRLAKSAGGTRYTNIKGIEFFTDGRNTIYTGPSGIENTCTVSQ Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:21:45 2011 Seq name: gi|261747299|gb|ADAD01000131.1| Leptotrichia goodfellowii F0264 contig00167, whole genome shotgun sequence Length of sequence - 3253 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 19 - 56 5.0 1 1 Op 1 . - CDS 200 - 730 461 ## gi|262038672|ref|ZP_06012035.1| putative liporotein 2 1 Op 2 . - CDS 732 - 1064 506 ## gi|262038674|ref|ZP_06012037.1| hypothetical protein HMPREF0554_2221 - Prom 1153 - 1212 3.8 3 2 Op 1 . - CDS 1251 - 1799 416 ## gi|262038673|ref|ZP_06012036.1| putative liporotein 4 2 Op 2 . - CDS 1801 - 3252 1711 ## gi|262038676|ref|ZP_06012039.1| hypothetical protein HMPREF0554_2223 Predicted protein(s) >gi|261747299|gb|ADAD01000131.1| GENE 1 200 - 730 461 176 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038672|ref|ZP_06012035.1| ## NR: gi|262038672|ref|ZP_06012035.1| putative liporotein [Leptotrichia goodfellowii F0264] putative liporotein [Leptotrichia goodfellowii F0264] # 1 176 1 176 176 347 100.0 2e-94 MKKIIMVVLSLMFIVSCSFFLQDSSEWSGPHAKKGQADGDTDFGEEEILGRRRVKQQISG YSITMPEDMEFKYGSTVYGTDKLFGKKRKYLYDNRTKTGTNIYLIDEVIDNLKGDESATV FKDTKKNGRYEMVAKAFSSGVFIEIKPNLYVACNSNNRNYMGTRPCEAIVKVMKSQ >gi|261747299|gb|ADAD01000131.1| GENE 2 732 - 1064 506 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038674|ref|ZP_06012037.1| ## NR: gi|262038674|ref|ZP_06012037.1| hypothetical protein HMPREF0554_2221 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2221 [Leptotrichia goodfellowii F0264] # 1 110 1 110 110 174 100.0 2e-42 MQVSTQQNLPKEFQEKIIGSIDEIREARKDAMVEKWMRGVDLIDRNEIKDFRLNALYDEY 
KENLDAQWREKENIAGLVNIYNQKINQTIPRKNPTQSIDDMVENLRKQVR >gi|261747299|gb|ADAD01000131.1| GENE 3 1251 - 1799 416 182 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038673|ref|ZP_06012036.1| ## NR: gi|262038673|ref|ZP_06012036.1| putative liporotein [Leptotrichia goodfellowii F0264] putative liporotein [Leptotrichia goodfellowii F0264] # 1 182 1 182 182 340 100.0 3e-92 MKKIVMIMLSLMFIMSCSFFLQDSSEWSGPHAKKGQADGDTDFGEEDILGRRKVKQQISG YSITMPEDMEFRYGSTVYDTDKLFGKKRKYLYDNRTKTGTNIYLIEETIGKLISYSENDA YIWKKESKKNGRYKMEAKAYSSSGGVLIEIKSGLYLTCRAKELNSYVVGNTCEAIVKVMK SQ >gi|261747299|gb|ADAD01000131.1| GENE 4 1801 - 3252 1711 483 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038676|ref|ZP_06012039.1| ## NR: gi|262038676|ref|ZP_06012039.1| hypothetical protein HMPREF0554_2223 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2223 [Leptotrichia goodfellowii F0264] # 1 483 1 483 483 891 100.0 0 GVIGKFADDVNEKLLNNEPKNIQGEWTEKVYKTIRRIEDFVDKKLYNESLALIPTSGQHG GMLEGIQMYFKKDKFDIYDIVLEEDKNGKAVITGKKLKNASEITNNEILINGMTEESEGS IANAINQLVSMDEIKNKMKKEPVKITLIYSKTRGGIIDGAEAVLGKMFDGSKTNFHLTTG TSRGLADVLMQLDPKREFNITTYSQANIIMMGALNYLNNKGKTLNGNITLYHTGTPVAPM VFDNLSRKMNFINGVSLINYLDQVGSEKGKWNGIKGLISETRDYDEKPRDKRNSYSGIPI FGPSTMRDQRYVFMTLPTIDEFNKEKNGKYAYAKDEIMRNNNLKNDKEFKKFIENNHRWY RIEDRKVVEKLLEYSAKYNPYIPGKEQMNIVNSIKSIREARKDAMVEKWMRGVDLIDRNE IKDFRLDALYDEYKGNLDAQWREKENIAGLVNIYNQKINQTIPRKNPTQSIDDMIENLRK QVR Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:22:32 2011 Seq name: gi|261747297|gb|ADAD01000132.1| Leptotrichia goodfellowii F0264 contig00107, whole genome shotgun sequence Length of sequence - 1389 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 3 - 71 0.7 1 1 Tu 1 . - CDS 105 - 251 68 ## - Prom 489 - 548 6.0 + Prom 669 - 728 2.4 2 2 Tu 1 . 
+ CDS 759 - 1268 180 ## Smon_0058 hypothetical protein Predicted protein(s) >gi|261747297|gb|ADAD01000132.1| GENE 1 105 - 251 68 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYNRTERTFKFNETLRDIPAGEEVIDRVYNVFKMICADIIFLAKIYGI >gi|261747297|gb|ADAD01000132.1| GENE 2 759 - 1268 180 169 aa, chain + ## HITS:1 COG:no KEGG:Smon_0058 NR:ns ## KEGG: Smon_0058 # Name: not_defined # Def: hypothetical protein # Organism: S.moniliformis # Pathway: not_defined # 1 169 234 406 407 135 44.0 6e-31 MKALETVSKLYKENKRFFFNVGDGDINSTKGIIRYLGRYLARSPIAEYKITEIDDDKVTF FFNDLANNKKKTFITMSAEKFISQILIHLPPKNFKMVNRYGFYSRHISDKLKKAMQPFKK NIVVSKYSFYQRQMYITFGMNPFFCPECKIRMIVWEFYHYLYPPLKKYH Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:22:41 2011 Seq name: gi|261747294|gb|ADAD01000133.1| Leptotrichia goodfellowii F0264 contig00048, whole genome shotgun sequence Length of sequence - 5545 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 5020 6364 ## FN0254 hypothetical protein 2 1 Op 2 . 
- CDS 5083 - 5544 176 ## PROTEIN SUPPORTED gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 Predicted protein(s) >gi|261747294|gb|ADAD01000133.1| GENE 1 1 - 5020 6364 1673 aa, chain - ## HITS:1 COG:no KEGG:FN0254 NR:ns ## KEGG: FN0254 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 448 1671 1 1264 1677 1033 56.0 0 MSNNLKKMEKDLRAFAKRSKDVKYTKGLLFSFLLMGMLTFSDTLTSPEVKSTENAINQTR KELNTSISEMHTAFKQAKRENNRLLKRANLELIQLMEQGDHVVKSPWSSWQMGMNYFYSN WRGTYKGRGDKAEKYPYEGILERDSNEFNRYVSPDSKFYGSLPTSSNPRSAASNARQGLS TQYGIASTQPVPEPIVDLQLSAGINPKIVNKADLNLVPKSANVPNLPEPVKFQPINPKIT IPADPPLPTPPNFAIVLGGDCNSGCDSYDWRNGGGTTPRQNTKSGFTPPAGTNQNVTNIL HYTWRDGLATPAERSYAFKMLWEAPNGSTTSYKLEDKPQEIFFNSYNYGYGRAANGGEWA SDVADSQNIPSDTQEKNHQYFLIGGSRALEVDNVYSTNTEYGVASGRTLQLGGIFTLGVV SQENGTILVNEGTITDEKENQDSYIRNMPTDPGKDYLTIHGPTQDYQISKNGNGYVGYKV GIAQVEENEGVSQFHMDQQRLQNRSGGKINFFGNRSIGMYVYLPRHTTYAKMENQAGAEI NMSGAESYGMRIAAKSDDAATMLNAGTINLRKINNQGADTSAGMALMLDSSVVNGVSLKR GNAKNTGKITLTGVKNSIGAYVNIDSDITNEGTISINSTIAKNNDPTKQEVNVGMRADKT ATAEVINEGTGKITMDGSYAMGMLANGSKLVNKGTISSTNVLNGVGIAGINNAKVKNSGS IKVLGSGNTNNIGVFLKTNSEGEIGTPAAGVNQEVEVTGDNSTGVLVSDNSKLTMNGNVT VSGKSVSGVVVNNGSQVTSNGGTVKVDNSGNHAEPIGDKGSYGIVVKGATSKYTGATTEV TAKVTSDKSMGLYSEGKLEVNKANIDAKDGAINFYANGGEIKINNGGNSVTGQKSLLFYT AGAGKFNLAGGTLSATIKGGTTPSTRGTAFYYEHTGPTYGSFDTAALTNFASTKFQNTLN HLNLTMEAGSRLFVASNVEMNLSDTNPSGLVGALNLGGITGSDYKVFMLYLSRLKVNQAA NLDDANDLYNKLEIANSSITNENTMTGSQNRQVAMAQENGSNPSGTPYAANKVTLINEAN GKINLTGEETTGMYAKRGVIENNGEISVGKKSTAIYSEDDDAITAGTSATAENKGKITLG EDSTGIYYKNGVSAAGGGAYNQGKIESSANNVIGMTFDTGSNSKVFRNHGNGEINLTGDK STAMYATGNGTYTAENNAKITLGNSSDANNPNVGMFTDKSQITLKNTGTIAAGDKTAAMY GYGINTDASSDISVGSGATGIYSKGGNVHLRGKLATGGNEAVGVYTVGAGQTITNDATNV NIGDGSYGFVIKNAAGGNTFTSNTANVSLGNNVVYAYSTDTTGTMTNNTKLNSTGSENYG LYGAGIITNNADINFGSGIGNVGIYSIKGGTATNTAGKTITIGASNVSGQKYGIGMAAGY ENSDTGKIINKGTINVNGKNSIGMYATGNGSVATNDGTINLGADGTQGMYLDNGAKGINN GTITTVGNPNGAIGVTVRGGAEITNNGTININSSNGYAFAQLKGGIIKNYGTFTVGNGAQ 
KLYEPKAKPTGKGVAGVEINAPAGATNATITVNGKPVTPVTISSTAGQRAPLTSSIGMYI DTLRGTNPIGGLIPSGEADLIIGSEAAQKTNSKYIEVNGDIIKPYNKAITNNP >gi|261747294|gb|ADAD01000133.1| GENE 2 5083 - 5544 176 153 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 [Kordia algicida OT-1] # 49 150 246 347 347 72 34 7e-13 TLAFRKLTTTQIRENSIRINALELDEIDITKVKGPKEVTVVLDERSLLFDFDKSNVKPQY YGILQNLKEYIEVNDYDVTIIGHTDSKGTNEYNMKLGMRRAESVKAKMIEFGVPASRIVG VESRGEEEPVATNDTAEGRALNRRIEFKLVRKN Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:23:25 2011 Seq name: gi|261747234|gb|ADAD01000134.1| Leptotrichia goodfellowii F0264 contig00087, whole genome shotgun sequence Length of sequence - 52868 bp Number of predicted genes - 59, with homology - 58 Number of transcription units - 27, operones - 14 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 201 - 260 11.0 1 1 Tu 1 . + CDS 374 - 1894 1970 ## COG0606 Predicted ATPase with chaperone activity + Prom 1989 - 2048 8.8 2 2 Op 1 . + CDS 2079 - 2954 939 ## COG0657 Esterase/lipase 3 2 Op 2 . + CDS 2976 - 3653 754 ## COG1059 Thermostable 8-oxoguanine DNA glycosylase + Prom 3680 - 3739 13.9 4 3 Tu 1 . + CDS 3776 - 4093 667 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 4104 - 4150 7.0 + Prom 4134 - 4193 8.8 5 4 Op 1 . + CDS 4249 - 6414 185 ## PROTEIN SUPPORTED gi|16801118|ref|NP_471386.1| 30S ribosomal protein S1 6 4 Op 2 . + CDS 6489 - 8063 2549 ## COG1620 L-lactate permease 7 4 Op 3 16/0.000 + CDS 8118 - 10916 4400 ## COG0060 Isoleucyl-tRNA synthetase 8 4 Op 4 1/0.000 + CDS 10928 - 11407 670 ## COG0597 Lipoprotein signal peptidase 9 4 Op 5 . + CDS 11430 - 12299 1115 ## COG0752 Glycyl-tRNA synthetase, alpha subunit 10 4 Op 6 . + CDS 12348 - 13601 1155 ## Lebu_0363 ABC transporter ATP-binding protein 11 4 Op 7 . + CDS 13588 - 14121 465 ## Lebu_0364 hypothetical protein 12 4 Op 8 . 
+ CDS 14160 - 16193 2718 ## COG0751 Glycyl-tRNA synthetase, beta subunit + Prom 16380 - 16439 12.7 13 5 Tu 1 . + CDS 16497 - 17585 1473 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins + Prom 17626 - 17685 8.9 14 6 Op 1 . + CDS 17733 - 18122 567 ## Lebu_1960 hypothetical protein 15 6 Op 2 . + CDS 18125 - 18520 605 ## Lebu_1959 hypothetical protein + Prom 18532 - 18591 10.2 16 7 Op 1 1/0.000 + CDS 18708 - 18947 377 ## PROTEIN SUPPORTED gi|229211537|ref|ZP_04337929.1| LSU ribosomal protein L31P + Term 18956 - 18993 2.1 + Prom 18954 - 19013 5.9 17 7 Op 2 . + CDS 19033 - 19656 1028 ## COG0035 Uracil phosphoribosyltransferase + Term 19709 - 19753 1.1 - Term 19647 - 19686 5.0 18 8 Op 1 7/0.000 - CDS 19767 - 21437 1162 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 19 8 Op 2 . - CDS 21467 - 22243 873 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 20 8 Op 3 13/0.000 - CDS 22244 - 22717 578 ## COG0526 Thiol-disulfide isomerase and thioredoxins 21 8 Op 4 2/0.000 - CDS 22748 - 23392 853 ## COG0785 Cytochrome c biogenesis protein 22 8 Op 5 . - CDS 23427 - 24545 1436 ## COG0225 Peptide methionine sulfoxide reductase - Prom 24618 - 24677 16.9 23 9 Tu 1 . - CDS 24679 - 25683 1262 ## COG2141 Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases - Prom 25713 - 25772 13.1 - Term 25818 - 25874 2.1 24 10 Op 1 . - CDS 25921 - 27093 1448 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities 25 10 Op 2 . - CDS 27117 - 28076 1254 ## COG1482 Phosphomannose isomerase - Prom 28173 - 28232 8.4 + Prom 28062 - 28121 8.1 26 11 Tu 1 . 
+ CDS 28219 - 28887 675 ## gi|262038714|ref|ZP_06012074.1| hypothetical protein HMPREF0554_1348 + Prom 28915 - 28974 6.4 27 12 Op 1 12/0.000 + CDS 29029 - 29658 1005 ## COG0563 Adenylate kinase and related kinases 28 12 Op 2 . + CDS 29679 - 30452 1278 ## COG0024 Methionine aminopeptidase 29 12 Op 3 . + CDS 30472 - 30834 471 ## COG3759 Predicted membrane protein + Term 30900 - 30961 10.6 - Term 30881 - 30918 -0.9 30 13 Op 1 . - CDS 31045 - 31419 533 ## Clocel_1051 pyridoxamine 5'-phosphate oxidase-related FMN-binding 31 13 Op 2 . - CDS 31437 - 32393 1063 ## COG0657 Esterase/lipase 32 13 Op 3 . - CDS 32417 - 32839 494 ## FN0106 hypothetical protein - Prom 32871 - 32930 8.2 + Prom 32833 - 32892 7.7 33 14 Tu 1 . + CDS 32937 - 33683 843 ## COG2378 Predicted transcriptional regulator + Prom 33742 - 33801 13.9 34 15 Op 1 . + CDS 33828 - 34046 267 ## PROTEIN SUPPORTED gi|15610598|ref|NP_217979.1| translation initiation factor IF-1 35 15 Op 2 . + CDS 34067 - 34180 196 ## PROTEIN SUPPORTED gi|229210335|ref|ZP_04336732.1| LSU ribosomal protein L36P + Prom 34191 - 34250 11.5 36 16 Op 1 48/0.000 + CDS 34355 - 34720 555 ## PROTEIN SUPPORTED gi|229883849|ref|ZP_04503314.1| SSU ribosomal protein S13P 37 16 Op 2 36/0.000 + CDS 34743 - 35132 644 ## PROTEIN SUPPORTED gi|229210337|ref|ZP_04336734.1| SSU ribosomal protein S11P 38 16 Op 3 26/0.000 + CDS 35161 - 35748 932 ## PROTEIN SUPPORTED gi|229210338|ref|ZP_04336735.1| SSU ribosomal protein S4P 39 16 Op 4 50/0.000 + CDS 35782 - 36750 1294 ## COG0202 DNA-directed RNA polymerase, alpha subunit/40 kD subunit 40 16 Op 5 . + CDS 36784 - 37173 516 ## PROTEIN SUPPORTED gi|229210341|ref|ZP_04336738.1| LSU ribosomal protein L17P + Term 37239 - 37296 1.5 + Prom 37281 - 37340 6.9 41 17 Tu 1 . + CDS 37365 - 38678 1852 ## COG1114 Branched-chain amino acid permeases 42 18 Op 1 . - CDS 38768 - 39496 767 ## COG3700 Acid phosphatase (class B) - Prom 39520 - 39579 9.5 43 18 Op 2 . 
- CDS 39619 - 40347 763 ## COG3700 Acid phosphatase (class B) - Prom 40402 - 40461 13.6 + Prom 40324 - 40383 9.7 44 19 Tu 1 . + CDS 40487 - 40825 549 ## COG3695 Predicted methylated DNA-protein cysteine methyltransferase + Term 40844 - 40904 4.0 - Term 40838 - 40886 2.6 45 20 Tu 1 . - CDS 40899 - 41390 748 ## Clos_0633 C_GCAxxG_C_C family protein - Prom 41516 - 41575 8.6 + Prom 41485 - 41544 10.6 46 21 Op 1 . + CDS 41578 - 42501 1079 ## COG1275 Tellurite resistance protein and related permeases 47 21 Op 2 . + CDS 42575 - 43903 1656 ## Lebu_2011 hypothetical protein 48 21 Op 3 . + CDS 43909 - 44142 381 ## Lebu_2011 hypothetical protein + Term 44182 - 44246 3.2 + Prom 44212 - 44271 19.8 49 22 Op 1 . + CDS 44309 - 44722 339 ## gi|262038725|ref|ZP_06012085.1| hypothetical protein HMPREF0554_1372 50 22 Op 2 . + CDS 44744 - 45499 681 ## gi|262038701|ref|ZP_06012061.1| hypothetical protein HMPREF0554_1373 51 22 Op 3 . + CDS 45514 - 46083 550 ## gi|262038726|ref|ZP_06012086.1| hypothetical protein HMPREF0554_1374 + Term 46138 - 46183 0.2 - Term 46117 - 46179 2.2 52 23 Tu 1 . - CDS 46374 - 46829 503 ## Lebu_0540 septicolysin - Prom 46905 - 46964 12.4 + Prom 46898 - 46957 10.5 53 24 Tu 1 . + CDS 47170 - 48054 1000 ## COG0739 Membrane proteins related to metalloendopeptidases - Term 48069 - 48126 -0.4 54 25 Tu 1 . - CDS 48130 - 48192 80 ## + Prom 48435 - 48494 9.1 55 26 Op 1 6/0.000 + CDS 48519 - 48983 614 ## COG0054 Riboflavin synthase beta-chain 56 26 Op 2 16/0.000 + CDS 49017 - 50165 1279 ## COG1985 Pyrimidine reductase, riboflavin biosynthesis 57 26 Op 3 15/0.000 + CDS 50174 - 50845 786 ## COG0307 Riboflavin synthase alpha chain 58 26 Op 4 . + CDS 50869 - 52080 1624 ## COG0807 GTP cyclohydrolase II + Term 52265 - 52293 1.4 + Prom 52092 - 52151 3.5 59 27 Tu 1 . 
+ CDS 52308 - 52835 536 ## Clocel_3820 hypothetical protein Predicted protein(s) >gi|261747234|gb|ADAD01000134.1| GENE 1 374 - 1894 1970 506 aa, chain + ## HITS:1 COG:FN1614 KEGG:ns NR:ns ## COG: FN1614 COG0606 # Protein_GI_number: 19704935 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATPase with chaperone activity # Organism: Fusobacterium nucleatum # 1 505 1 497 497 476 50.0 1e-134 MAISVLSCSYLGIESYIVEAEVDISNGLPIFNIVGMGDLAIIESKERIRSCFKNIGLEFP VKRVLVNLSPADIRKKGSHFDLSIFLGILANTGKITNIENFKNYLILGEISLNGNIKPIK GAINAAILAKEKEMKGVIIPVENYNEAKLISGVEIIPVKTIKEAVGFINGKIGREELYEN VEKSGKNSAEDKENQKEIVDFSDVKGQFLAKRALEIAAAGGHNIFLMGDPGSGKSMLAKR FITILPDMTEKEIIETTKIYSVSGMLSAMEPVIMKRPFRAPHHSATQTALVGGSTRVGEI TLALNGVFFMDELGEFGIKTLEALRQPLEDGSITISRANLIVTYPVNNIMIAASNPTPGG FFPDDPQCKDSLRDIKNYQKKFSGPLLDRIDLYVEMRRLKKEELFSDTLSEPSEKIKERV IKARNIQKKRFSSELLNAKMSRKQISEYCKINDATKEVFEKAVDELKLSVRMYDKILKIS RTIADLEDSESIETEHLLEALNYRKK >gi|261747234|gb|ADAD01000134.1| GENE 2 2079 - 2954 939 291 aa, chain + ## HITS:1 COG:SPy1308 KEGG:ns NR:ns ## COG: SPy1308 COG0657 # Protein_GI_number: 15675257 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Streptococcus pyogenes M1 GAS # 15 291 43 322 327 164 33.0 2e-40 MSLLSDIAVPVAKLVNAKRGNEKAMENPRRDTDFFNRNNFDKSFQTDEEFIDGFQVITVK TEKSLNRHVIFLHGGGYVLRAVRSHKNIVERMVKKYNLKVTFVDYPLAPEDTVDKAHDVL MKAYKSVAEKNREDDFYFFGDSAGGGLALAFLQVARDEKIVPFPKKTVLMSPWLDVSMTN EEIKDFEEKDPLLPVHSLIKAGKQFAGNMDVKSPLVSPIYGNMDNLGEIMLLFGTNEVLY PDCMKLSDMLDTAFGTTVELYVGENLCHDWILAPLKETDEALDAIGKFYLE >gi|261747234|gb|ADAD01000134.1| GENE 3 2976 - 3653 754 225 aa, chain + ## HITS:1 COG:FN0622 KEGG:ns NR:ns ## COG: FN0622 COG1059 # Protein_GI_number: 19703957 # Func_class: L Replication, recombination and repair # Function: Thermostable 8-oxoguanine DNA glycosylase # Organism: Fusobacterium nucleatum # 18 225 9 217 217 224 59.0 7e-59 MTKKDKFEKKLNKQLHEEIVKIYKKIKKDIDKAVKGYKKAWNGSEKKVFAEVAFCILTPQ 
SKAKNAWQAITNLVNNGLLFDGQPEEIAEYLNIVRFKNNKSRYLVELRELMTVDGKLQPK KILSDKGNTLEKREWIFKNIKGMGMKEANHVLRNLGFGDEIAILDRHILRNLVQLNIIDE IPKSITEKKYYEIEEKMKEYSEYSEITMGELDLVLWYKEAGEVFK >gi|261747234|gb|ADAD01000134.1| GENE 4 3776 - 4093 667 105 aa, chain + ## HITS:1 COG:Cj0147c KEGG:ns NR:ns ## COG: Cj0147c COG0526 # Protein_GI_number: 15791535 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Campylobacter jejuni # 1 103 1 102 104 99 45.0 2e-21 MSKVVHYAGEDFQTETLVQNGVTLVDFFATWCGPCQMLGPVLDELSNSADYKIVKVDVDQ ANELAVQFGVRSVPTMVIFKDGQPVETLVGFMTQDEIDKKVQAYK >gi|261747234|gb|ADAD01000134.1| GENE 5 4249 - 6414 185 721 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|16801118|ref|NP_471386.1| 30S ribosomal protein S1 [Listeria innocua Clip11262] # 644 719 182 256 381 75 46 5e-13 MDILKNVATELNLKVSQIENTMNLFDEGATVPFIARYRKEVTGNLDEEQIREVIEKVTYY RNLEKRKEEVLRLIEEQGKLTEELKTSIVNAVKLQEVEDLYLPYKKKKKTKADIAKDQGL EPLSIFALLPDTTFDSLKAEAEKYITEEVPSVEAAIEGVHLIIAQNISEDPKIREFLRER IAKYGILTSKVIEKNKAEDVKGVYQDYYDHSELIEKMASNRVLASNRGEKEKILKVDIDI DEKTEEFIMNFILKSFGNKNLTDFYKEIIRDSLDRLAYPSIKNEVRNIYTEKAEEEAINI FSENLEKLLLQPPLAKKTLMGLDPGYRTGCKMVIINKDGFYETNDVLYLVEEMHNSRQLR EAKEKILNYIDKYDVDIIAIGNGTASRETESFVAKIIKEAEKKVSYLIVSEAGASVYSAS KLAIEEFPDLDVTARGAISIARRIQDPMAELVKIDPKSIGVGMYQHDVNQKKLNETLEQT IEHVVNNVGVNINTASWALLSFVSGIKKNVAKNLVDYRHENGDFKDRKQLKKVKGLGDKA FEQMAGFIVVPDSDNPLDNTIIHPESYHIAETVLKEAGCKVSDLKENLNEVRQKLKTINL DKIIKENDFGTQTAKDVYDALLKDRRDPRDEFEKPLLRSDILSMDDLKEGMILEGTVRNV AKFGVFVDIGLKNDALIHISQLSDKFVADPTKIMSVGKIIKAKILSVDKERARVSLTIKG I >gi|261747234|gb|ADAD01000134.1| GENE 6 6489 - 8063 2549 524 aa, chain + ## HITS:1 COG:BB0604 KEGG:ns NR:ns ## COG: BB0604 COG1620 # Protein_GI_number: 15594949 # Func_class: C Energy production and conversion # Function: L-lactate permease # Organism: Borrelia burgdorferi # 4 518 6 499 500 267 37.0 5e-71 MSLFLIGLVPIVLFLVLLAVLKKSAMFSAYASLVTAIVLTFLVPAWRMPIQGILASIIEG 
FAVAWMPIGFVVIAALFAYDLSVKTGKIETVKKMLGNITTDKRAQALVLAWGFGGFIEGI AGYGTAVAIPAAIMVSLGFNPLNAALICLIANSTPTAFGTVGLPVTTMVSKLGLDKAGNF TPLFTSLLLLILTCIIPFIIVKMANKEIENGKNPAFGKGIIFVILASILGYLIQPFIAFN MGAELTTILSSLVAMILMIISIKIFIKNDEGFVKADVTGKDALLAWLPYILMIVLIVGTS PVVAHIHELLEPTTTKFNFALGNQAAWFRDGGDPSVTFKWLLAPAAPLFLATVLAGFVQR AKIKDMGSVLWHTTVHKIPSLTVIMGIVALSVVMKHSGMIQSIADGFVNLTGKGFPLISP FLGTIGTFVTGSDLSSNLLFGDLQHSVAEGLKPGSDMLKSLFIASNTAGATGGKMISPQN IAIAASTVGLVGQEGDMLKFTIKYSVVYAIILGVLVFIGSGMIA >gi|261747234|gb|ADAD01000134.1| GENE 7 8118 - 10916 4400 932 aa, chain + ## HITS:1 COG:FN0067 KEGG:ns NR:ns ## COG: FN0067 COG0060 # Protein_GI_number: 19703419 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Isoleucyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 11 927 5 933 933 1177 62.0 0 MSEEKNVEKPDYAGTLNLPKTSFKMKANLAQKEPLTLRDWKKTNIYEKSLKDEGQYFVLH DGPPYANGDIHIGHALNKVLKDIILKYKRLRGYNAPYIPGWDTHGLPIEWKIMEEYGEKA RNMTPLQIRQECKKYALKWVEKQKAEFIRLGILGNWDDPYITLKPEYEAEQLKVFKEIYE NGYVYKGLKPVYWSPTTETALAEAEIEYKDVVSPSIYVKMEASKDLLDKLGTDEASIVIW TTTPWTLPANLGIALNKDFDYGLYKTEKGNLVLAKELAEKAFGEMKLSYELIKEFKGNKL ERTHYKHPFLDREGLVILGEHVTLDAGTGAVHTAPGHGVDDYVVGQAYGIGILSPVDDQG HMTKEAGKYEGLFYKKASKVIVEDIEASGHLMAYSEFTHSYPHDWRSKKPVIYRATEQWF ISVDESDVRQNALKALEDVEFVPEWGKNRINAMLETRPDWTISRQRVWGVPIPLFYNAET NEVIYEPEIMDKVIEAVKKKGTDIWWKYEAEEIIGEELLEKYNLKGVKLRKERSIMDVWF DSGASHRGVLIPRGLHRPADLYLEGSDQHRGWFQSSLLTSIASTKDAPYKRILTHGFVMD GQGRKMSKSLGNTIVPKDIIEKYGADILRLWVSSVDYREDVRISENILQQMSDAYRRIRN TARFLMGNLNDFDYENKKVDYEEMYEIDKWALHKLEELKEKTTEYYDKYEFYSLFQEITY FCSIDMSAFYLDIVKDRLYCEEKDSIERRSAQTVLTEVLKVLVRVISPVLSFTAEEIWER IPESLKKEESVHLSEWVELEPKYKNNELAKKWDKIQNLRKEVNKKLEAARQEGLIGHSLD ARVLLNVNNDDYSFLKDYTEDEVSDLFIVSQTKFVNDELGASEISGISIGVIKALGKKCE RCWKYHEEVGNDPEHPDVCPRCAKVLKFAGNK >gi|261747234|gb|ADAD01000134.1| GENE 8 10928 - 11407 670 159 aa, chain + ## HITS:1 COG:FN0068 KEGG:ns NR:ns ## COG: FN0068 COG0597 # Protein_GI_number: 19703420 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, 
secretion, and vesicular transport # Function: Lipoprotein signal peptidase # Organism: Fusobacterium nucleatum # 1 152 14 163 165 121 45.0 5e-28 MLYIIIIIVLTGIDQYTKYEMFRIAQGQHGYSIPVLKDFFNFTYVENHGGIFGIFQGKIN LFTIASIILIAYVVFTEFKNFKNYTKWTRIGVSVIAAGALGNMIDRIFRGFVIDMIDFRG IWGFVFNVADMYVHIGIYIIVVDYLLKKRKKKSGLQKIK >gi|261747234|gb|ADAD01000134.1| GENE 9 11430 - 12299 1115 289 aa, chain + ## HITS:1 COG:FN0069 KEGG:ns NR:ns ## COG: FN0069 COG0752 # Protein_GI_number: 19703421 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase, alpha subunit # Organism: Fusobacterium nucleatum # 1 289 1 289 290 491 78.0 1e-139 MTFQEIILTLQKFWGEKGCIISNPYDVETGAGTFNPDTFLMSLGPEPWNVAYVEPSRRPK DGRYGENPNRVYQHHQFQVIMKPSPENIQELYLESMVALGIDPKKHDIRFVEDNWESPTL GAWGLGWEVWLDGMEITQFTYFQQVGGLEVEITPAELTYGLERIALYLQDKDNVYDLEWT KGVKYGERRSQYEYEMSKYSFEVADVSMNFQLFDMYEKEALNCLEHKLVLPAYDYVLKCS HTFNNLDARGAISTTERMSYILRVRDLAKKCAEQFVEVRRGLGFPLLKK >gi|261747234|gb|ADAD01000134.1| GENE 10 12348 - 13601 1155 417 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0363 NR:ns ## KEGG: Lebu_0363 # Name: not_defined # Def: ABC transporter ATP-binding protein # Organism: L.buccalis # Pathway: not_defined # 1 416 1 427 427 278 43.0 3e-73 MELELKNIAKIKHAKVEINGITVIAGENNTGKSTVGKVLWSVLNGFCQIDKKVYNEKVKE LEKIISNIMEKNGYELFTSNDFLKGIFDRTEREIAIEILKSKKDNYSQKDVITLINKYKN DLKIRDLSSYVNSINRTINIEVDEIVKIIISKVLNEEFYNQINTISFQGQKKKGNISFRL QNRTMQLEIKENEISNILGIFPLTKETIYIDNPFILDSYDYEKENHQAHLANRIFGEDKG NAISKINTKRKLQKVYEKLNIVLNGEIIENNNFEFIYTENEQELNLKNLSTGLKTFAIIK MLLQNGILEENGTIILDEPEIHLHPEWQIIFAELIVLLQKEFGMHILLTTHSPYFLKAIE VYSKKYDIKEKCKYYISQNQEKEATIIDKTDKIEDIYYKLAIPFENLMNEEEIYEDR >gi|261747234|gb|ADAD01000134.1| GENE 11 13588 - 14121 465 177 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0364 NR:ns ## KEGG: Lebu_0364 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 176 1 176 176 213 71.0 2e-54 MKTDKIIEKCKSALKKTSRDNKNDNMTESNLEVIDFDCLKDKYLGKIQFKGVGIRSNDAL 
FTRKKKYIFIEFKNGEIKNIKEYELHKKIYDSLLIFCDIFGKTLRETREEMDYILVANFS NEKKSDEVLLEKFLKSAKIEERLRVGLRNFKGYCFKNVHTYSKEEFEEKFVKIYENV >gi|261747234|gb|ADAD01000134.1| GENE 12 14160 - 16193 2718 677 aa, chain + ## HITS:1 COG:FN0070 KEGG:ns NR:ns ## COG: FN0070 COG0751 # Protein_GI_number: 19703422 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase, beta subunit # Organism: Fusobacterium nucleatum # 1 677 1 682 686 531 46.0 1e-150 MDFLFEIGLEELPSRYVDEAESNLKKLTENELKDERINFSEVESFSTPRRIAIIVKDISE KQEDLDKKSVGPSVDIAFKGGELTKAGEGFVKSQGAEKEDIKIIENEKGKYISVEKFIKG KETKEILPDILSNIVKKIEFEKSMKWGEKSFRFARPIKWFVTLLGNEILDFEFEGIKGSN KSRGMRYFASQDVAISNPSEYEKVLKENFVIAKKDKRKEEILKSIKENCENDGEKAIINN YLLEEVINLVEYPYAIKGEYNKDYLKLPEDITTITMETHQRYFPVRDKDGKLTNKFILIR NAPEYSETVKKGNEKVIEPRLADAKFFFDEDLKHKFSDNVEKLKEVTFQKDMGTIYDKIK RSQKIAEYLINETGLQEEKENILRTVELAKADLVTNVIAEKEFTKLQGFMGSVYAEKQGE NKNVALGIFEHYLPRYQGDSLPTTIEGALAGIADKIDTVTGCFSVGLKPTSSKDPYALRR AVQGIIYVTLNSKLKYDYKKLIEKAYEIFAADKKVLEKNVVEDVTEFFKQRIINVLSEKY SKELINYEVNLESNVVNLDEKLEVLLKLSKTEKFEKLINLLKRIKNILKEEKGNAVSINE NLFSTEEEKKLYSFAEELENVENKGFFICTETLIENEEIINNFFDKVKINDENPEIKNNR ISLLKKLEKISDKIINL >gi|261747234|gb|ADAD01000134.1| GENE 13 16497 - 17585 1473 362 aa, chain + ## HITS:1 COG:PA1041 KEGG:ns NR:ns ## COG: PA1041 COG2885 # Protein_GI_number: 15596238 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Pseudomonas aeruginosa # 238 350 95 208 210 76 41.0 7e-14 MKKLVILGTALLALNAVASEFQIKGGYDFNRKFKGGASTEDLKLKAGPALGLEYIFDNQG EFEWGLGAEYKLSPKSGKLVARHDKNDKLVRVAPVYALGKFNLITTDSGNDVLYVLGRAG YAFANESKELGRKGINVSGGLYTAAGIGTEFGPVSLEAIYERANIKSNVVGSSDKERGRI DSFGLRLGYRFGKLQNDRSPKVVTQRVVEEVVVQPKQPEIKEINTKTAVFPFNCSVDEKK CVIRGFKVDGRVPNQAEAKDLGTIANVINQFADGGTIDFVGHTDSTGSNAYNQKLSVARA QNVARLLREYGLKNSISYGTITGKGESQPADTNNTVQGRYNNRRVELFFQNVDFQNVRLI NQ >gi|261747234|gb|ADAD01000134.1| GENE 14 17733 - 18122 567 129 aa, chain + ## HITS:1 COG:no 
KEGG:Lebu_1960 NR:ns ## KEGG: Lebu_1960 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 127 1 142 143 97 48.0 1e-19 MKKLGILFVLLSAVSFAAERQVATVDAKFNKLEAEYNQLVNMENQQYSKLRANAENASQQ LQEKQALKAAIEEKMSKLEQTRNTKYFKGEYENLVKEYKNVVKALDADIAGLSKVVDNFN KVEELKGGN >gi|261747234|gb|ADAD01000134.1| GENE 15 18125 - 18520 605 131 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1959 NR:ns ## KEGG: Lebu_1959 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 131 1 133 133 91 58.0 1e-17 MKKKLAILAGVLVLSSMSFGAAAPAKKSSLEDSLNNLEKQLQRLDQMEQQKFNEQAALAQ AAQQRLDKYLAMQATIDQRIADIEANADKSIFGKEFKQKMSEFKSLRSELDKEIKKEQKV LEDFELLKSLR >gi|261747234|gb|ADAD01000134.1| GENE 16 18708 - 18947 377 79 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229211537|ref|ZP_04337929.1| LSU ribosomal protein L31P [Leptotrichia buccalis DSM 1135] # 1 79 1 79 79 149 88 2e-35 MRKDIHPDYRLVVFEDTSNGDRFLGKSTKSSKETVTFEGQEYPVIKVATSSTSHPFYTGK SKFVDETGRVDKFKKKYNL >gi|261747234|gb|ADAD01000134.1| GENE 17 19033 - 19656 1028 207 aa, chain + ## HITS:1 COG:FN0483 KEGG:ns NR:ns ## COG: FN0483 COG0035 # Protein_GI_number: 19703818 # Func_class: F Nucleotide transport and metabolism # Function: Uracil phosphoribosyltransferase # Organism: Fusobacterium nucleatum # 1 207 8 214 214 281 64.0 8e-76 MAVYELKHPLIEHKLTNLRNKNTDTKLFRESLNEIAGLMVYEATKHLQLKEIEIETPIQK TKTKVLEEPVTLVPILRAGLGMVEGILQLLPNAKVGHLGVYRNEETLEPVYYYAKMPVNV AESQVFITDPMLATGGSMIYTIDYLKEKGVKNITMLCIIGAPEGIKKVIEKHPDVDLYIA AIDKGLNDKAYIYPGLGDAGDRIFGTK >gi|261747234|gb|ADAD01000134.1| GENE 18 19767 - 21437 1162 556 aa, chain - ## HITS:1 COG:FN0190 KEGG:ns NR:ns ## COG: FN0190 COG2972 # Protein_GI_number: 19703535 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Fusobacterium nucleatum # 10 553 6 550 552 448 49.0 1e-125 MKLFEKSIRKSMTFKIIIYYLTGNFLFILFLSSIFYYSSKYIIMKKEIESTRKNAERSAE 
YVSLYVDKLKNLINLLSVNTDIENTLINNSESSKIEAEKLIQMIIRENKGNIKSITVIGK NGNIISNEKNVNMDISADMMKENWYISAINNTDNPVLNPIRKKQTSIDMTSWVLSISKDI KDRNGENLGVIVFDIKYQALNEYLKDISMGEQNDNLIIDNKNNIIYYKDVNCFINKKCIE KFLSQNNKSYKNKNIALTKVHIENTDWQLISFSEMNDLVLLRKSFFHVIIAIFLISLGFT TVVTFLIIRKFINPLRKLENHIQKFENSLSEFNADKNTSYEIEVLIKHFNSMISKIKYLR EYEIKALHSQINPHFLYNTLDTIIWMTEFGDSEQVITITKSLANFFRLSLSNGNEKISLK DELTHIKEYLFIQKQRYEDKLTYNFNIDESLLSYEVPKIIIQPIVENSIYHGIKNIQGTG IINIDVYKNNNDIYIAVKDNGIGFKESKKFKKSKIGGVGIKNVDKRIKFYYGNEYGIEIK DVAEGSLVILKLPILK >gi|261747234|gb|ADAD01000134.1| GENE 19 21467 - 22243 873 258 aa, chain - ## HITS:1 COG:FN0189 KEGG:ns NR:ns ## COG: FN0189 COG4753 # Protein_GI_number: 19703534 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Fusobacterium nucleatum # 1 258 1 257 261 236 54.0 3e-62 MYSILIVDDEPIIRRGIKTFIDFNKYGISNVYEAEDGNSASKVFSEILPDLVLLDINMPF KDGLTIAEEFKNIKKETKIAIITGYDYFEYAQKALKIGVEDYILKPVSKKDINEIITKLI YELNQEKKHIEVEKIINKISNNENTDSQTVNSKYRDIILKKFEENYSNVSFNLNSLADDM NLSSGYLSSLFKSLFGIPFQDYLNNIRMEKAKLLLLTTDLKNYEISDQIGFDNVYYFNSK FKKTFGVTPKEFKKNITK >gi|261747234|gb|ADAD01000134.1| GENE 20 22244 - 22717 578 157 aa, chain - ## HITS:1 COG:HI1453 KEGG:ns NR:ns ## COG: HI1453 COG0526 # Protein_GI_number: 16273359 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Haemophilus influenzae # 1 153 1 151 156 198 60.0 3e-51 MKKIIALLLMILSFNILAAGGTSLNGIQLKDLNNKTVSLSKYKGKKVYIKMWASWCPICL SGLQEINTLSGDKNKNFEVITIVSPGQKGEKPKDKFIQWYKGLNYKNITVLIDEKGEVLK KAQIVGYPSNIILDANGNIAKVLPGHLNIAQIKGAIN >gi|261747234|gb|ADAD01000134.1| GENE 21 22748 - 23392 853 214 aa, chain - ## HITS:1 COG:HI1454 KEGG:ns NR:ns ## COG: HI1454 COG0785 # Protein_GI_number: 16273360 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c biogenesis protein # 
Organism: Haemophilus influenzae # 1 212 1 212 213 264 76.0 9e-71 MFEQQLLIGSVFLGGLVSFFSPCIFPIVPIYLGILSKGKRTVINTFLFILGLSLTFVSLG FGFGFLGKIFFSDTVRIVAGIIVIILGIHQMGILKFKQLEKTKLVAFNTAGKSSSLEALL LGLTFSLGWTPCVGPILTSVLALSGDKGSAVYGGMMMFIYVLGLATPFVLFSFFSNELLK RTKSLNKHLDKFKIIGGVLIILMGILLIANRFHF >gi|261747234|gb|ADAD01000134.1| GENE 22 23427 - 24545 1436 372 aa, chain - ## HITS:1 COG:HI1455_1 KEGG:ns NR:ns ## COG: HI1455_1 COG0225 # Protein_GI_number: 16273361 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptide methionine sulfoxide reductase # Organism: Haemophilus influenzae # 60 224 40 204 205 251 72.0 2e-66 MKHRSLLLLFAVSATLFSAAMVMKNSTVSKSVKNTETKNMNKKDGMMMDDKKMTNNDAKE DIREIYLAGGCFWGIEAYMERIYGVKDATSGYANGKTTETNYQMIHGTDHAETVHVKYDA NKISLEKLLKYYFQVIDPTSINKQGNDRGRQYRTGIYYTNVKDKVAIVKEIQEQQRKYSD KIQVEVEPLRNYILAEEYHQDYLKKNPNGYCHIDLEQANRIIIDPNDYPKPSDKELQAKL TPLQYSVTQKKNTEHSFSNEYWDNHEAGLYVDITTGEPLFSSKDKYDSGCGWPSFTKPIV KEVVTYAEDTSFNMIRTEVLSRSGKAHLGHVFNDGPKDKGGLRYCINSASIKFIPLKNME KEGYGYLINLVK >gi|261747234|gb|ADAD01000134.1| GENE 23 24679 - 25683 1262 334 aa, chain - ## HITS:1 COG:BS_yddN KEGG:ns NR:ns ## COG: BS_yddN COG2141 # Protein_GI_number: 16077571 # Func_class: C Energy production and conversion # Function: Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases # Organism: Bacillus subtilis # 5 333 2 330 339 290 45.0 2e-78 MSENKNIKLSVLNLIPKFEGDTDIQAIQRAVDLIKIVEKLGYYRYWVAEHHNFKGVLSSA TAIIIQHLLANSEKIRVGAGGVMLPNHTPLQVAETYGTLATLYPDRVDLGLGRAPGTDSD TAALIYRVQYVQTAKFIEAIGDLQRFMGAEAEQSIVSAYPGINTNVPIYILGSSVNSAHV AGELGLPYSFAGHFSPDAAEEAIQIYRNTFVPSKYLKEPYVILGLLVHGADSDEEAERLY TATQQGMLKIIRGEKGLYPLPDKNFSDKLTSAEKIILQSKMGINLMGTKGTMKKKWDEIK RKYNPDEIMAVSYMSEIEQLRTSYEILAQVVKEK >gi|261747234|gb|ADAD01000134.1| GENE 24 25921 - 27093 1448 390 aa, chain - ## HITS:1 COG:YPO3006 KEGG:ns NR:ns ## COG: YPO3006 COG1168 # Protein_GI_number: 16123185 # Func_class: E Amino acid transport and metabolism # 
Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Yersinia pestis # 1 390 1 392 393 385 45.0 1e-107 MKYNFDEIIDRKSNHSTKYNELMKKFGTEDVIPLWIADMDFRTAEPVVKALKEKAEHGLF GYVYRPDEYFEAFINWQKRHHNWNVDKELLSFSIGVVPALAALVKQFSEKGDKILIQTPV YSEFYDINHDNERVVIENKFIEKDGEYSLDLEDFENKLKEQPKLFICCNPQNPIGRVWSY DELKAMGDLCVKYNIPVISDEIHADLTLWDNKHIPFASVSREIAENTITCTATGKAFNLA GLQCATVIFNNLQGKNKFDRFWKDLEVHRNNPFNLVATIAAYNEGEEYLEQLKKYLEDNI MFVHDYFKKNIPQIKPNIPQATYLIWLDCRDLGFSQEELEKFMLKKAKLGLNPGRAFQKD LEGFMRLNAACPRSVLEKALGQLKKAVEEL >gi|261747234|gb|ADAD01000134.1| GENE 25 27117 - 28076 1254 319 aa, chain - ## HITS:1 COG:BS_ydhS KEGG:ns NR:ns ## COG: BS_ydhS COG1482 # Protein_GI_number: 16077654 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Bacillus subtilis # 1 314 1 311 315 200 38.0 2e-51 MLYPLKFKKVFIEKVWGGREFETKLDMKLPENRKIGESWEISAHPNGMSIVENGELAGKT LQEVYDEYKEKLVGNKVYNEYKDRFPLLIKYLDVNDRLSIQVHPDDETALKNHNELGKSE SWYIMEASPDAVLIMGMKPGITKEQFLTKAENNDFEGLFEEKTVKKGDFIDIVPGTVHAS LKGSIVFAEIQENSDITYRIYDFDRTENGVKRKLHIKESADTIDFDKKVDIVNTDFKENE TRRNLIRKKYYSIDKIKINNTFTDINDESMIIYSILEGNGTINQENYSLEIKKGESVLVP PHISVTLKGDFEILRTVIE >gi|261747234|gb|ADAD01000134.1| GENE 26 28219 - 28887 675 222 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038714|ref|ZP_06012074.1| ## NR: gi|262038714|ref|ZP_06012074.1| hypothetical protein HMPREF0554_1348 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1348 [Leptotrichia goodfellowii F0264] # 1 222 1 222 222 372 100.0 1e-101 MYEYDEEKLKKERMEKRIEKILNIVLFCLPFLIRNFLITLPIQLSILGYWLYKAFVEIRL KRMAKKNIEKKEESWIKKSFIVEKIISILFVILATYFVFLSLEIPRKIRAVTETEKIIKN TETFYAKGEKIDNSEEIKKWLLKEINIWKVIGNKKVGKLSGNWEENYTLVNQDNQKIGIN AVSCLYMLIKLDNGLPPMVIPLELYEKPKEKELCKSITGMKE >gi|261747234|gb|ADAD01000134.1| GENE 27 29029 - 29658 1005 209 aa, chain + ## HITS:1 COG:FN1298 KEGG:ns NR:ns ## COG: FN1298 COG0563 # Protein_GI_number: 19704633 # Func_class: F Nucleotide transport and 
metabolism # Function: Adenylate kinase and related kinases # Organism: Fusobacterium nucleatum # 1 207 4 210 211 274 72.0 7e-74 MNIVLFGAPGAGKGTQAKELTKKYEIPQISTGDILREAIAKKTPLGLEAKKLMDGGNLVS DDIVNGLVEARLQEADCEKGFILDGFPRTVAQAEALDKILEKFNKKIEKVIALDVSDDEI IERITGRRVSKKTGKIYHIKYNPPVDEKEEDLEQRVDDNKETVMKRLEVYNKQTAPVLEY YKKQNKVYSVDGAKKLEEITKDIIDILEK >gi|261747234|gb|ADAD01000134.1| GENE 28 29679 - 30452 1278 257 aa, chain + ## HITS:1 COG:FN1297 KEGG:ns NR:ns ## COG: FN1297 COG0024 # Protein_GI_number: 19704632 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Fusobacterium nucleatum # 1 257 1 254 254 326 65.0 3e-89 MIIYKTLDEIKKIKKANEIIARLFEDILPKYIKAGISTHELDQISEDYIKSQGAVPGTKD YDIGRPYPPYPASTCISVNDTVVHGIPDKKIILKDGDIVSVDTVTVLDGYFGDAAITYAV GEIDEESKKLLEVTEKSRDLGITYAKEGNRIGDVSHAIQEYVESFGFSLVRDFAGHGVGK EMHEDPMVPNYGKPGTGAKIEDGMVIAIEPMVNVGKYPVKIMKDMWTVKTKDGSRSAHFE HSVAIVDGKPLILSVKD >gi|261747234|gb|ADAD01000134.1| GENE 29 30472 - 30834 471 120 aa, chain + ## HITS:1 COG:AGl3168 KEGG:ns NR:ns ## COG: AGl3168 COG3759 # Protein_GI_number: 15891701 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 120 22 135 137 84 42.0 7e-17 MSGLTKILVLLVAFEHLYIMVLEMYFNESRSAQKSFNLEVDFLKDERVKKMMMNQGLYNG FLAMGLFWSLLETGIFQIEIVLFFLLCVICAAIYGAITVSPKIFLLQGLPGIAAMASLFI >gi|261747234|gb|ADAD01000134.1| GENE 30 31045 - 31419 533 124 aa, chain - ## HITS:1 COG:no KEGG:Clocel_1051 NR:ns ## KEGG: Clocel_1051 # Name: not_defined # Def: pyridoxamine 5'-phosphate oxidase-related FMN-binding # Organism: C.cellulovorans # Pathway: not_defined # 1 124 1 124 124 171 66.0 1e-41 MFNEKFFEVLKYDGVVSIVTWGNEEPSITNTWNSYLQIKDENRILIPAAGFTRAEKDINV NNKVKVALGSKEIMGNNNYQGTGFALEGTGKFLTSGEDFDFMREKFPFLTRVFEITVTSA KQLL >gi|261747234|gb|ADAD01000134.1| GENE 31 31437 - 32393 1063 318 aa, chain - ## HITS:1 COG:DR0133 KEGG:ns NR:ns ## COG: DR0133 COG0657 # Protein_GI_number: 15805172 # Func_class: I 
Lipid transport and metabolism # Function: Esterase/lipase # Organism: Deinococcus radiodurans # 31 315 28 290 296 126 29.0 5e-29 MKKILILLFTAIITIISAEGNKTLEVKLTKPSIKMLSGITYSQGTNSTGIGTFKLEMDIL KPRAKQTKPLPLVVFITGGGFIGAPKENYIQQRLAIAEAGYIVASIEYRAAPNGVFPEPL EDVKSAIRYLKANADKYGIDKTRVAVMGDSAGGYLAALTGTTNTMKQFDKGDNLNENSNI LAVADLYGASDLTKIGADFSEKVKEAHNSPAIPEALLINGLPLYNVGGSVNSRPEKAKAA NPINYISKNTSPFLLMYGDKDNLVSPSQTEILYKALKEKGVSVTKYVVKNAGHGDEYWTQ PEIIEIIINFFDKNLKNK >gi|261747234|gb|ADAD01000134.1| GENE 32 32417 - 32839 494 140 aa, chain - ## HITS:1 COG:no KEGG:FN0106 NR:ns ## KEGG: FN0106 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 139 1 140 140 101 42.0 8e-21 MDVKDKFRELMKVSNDIALASSSSDNQPNVRIVRFYYDEKDNLLYFLTLKNSQKTSEFEE NNKVAFTTVPKNSLQHVKAKGIVTKSKKSVQDLKETFIEKIPSIKMNIEQGEAFMDLYEI SFSKVTVTLDQKHVENIEII >gi|261747234|gb|ADAD01000134.1| GENE 33 32937 - 33683 843 248 aa, chain + ## HITS:1 COG:lin0383 KEGG:ns NR:ns ## COG: lin0383 COG2378 # Protein_GI_number: 16799460 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Listeria innocua # 1 242 1 240 315 237 50.0 2e-62 MNKSERINDMIMYLNDKEFFNLKDIMQKYSISRNTALRDIRSLEEIGMPVYSTTGRNGKY GILKNKLLSPVMFNIDEMYALYFTMITLKGYKTTPFYLDIEKLKEKFKACLPRKQIEEIS KMETVLSFESSVHYKESPYLKDILKFAIKEKVCKILYKEEKAERAYFIQFFKITSAYGQW FTIGYDFKNSEIKNFQCDKIIKLSENDKFKSVSLKKLSSLNPKVYKIFDYSMADFEIEIE KNKTIDKL >gi|261747234|gb|ADAD01000134.1| GENE 34 33828 - 34046 267 72 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15610598|ref|NP_217979.1| translation initiation factor IF-1 [Mycobacterium tuberculosis H37Rv] # 1 72 1 73 73 107 69 1e-22 MAKQDVLELEGEILEALPNAMFQVRLENGHEVLGHISGKMRMNYIKILPGDKVTVEVSPY DLSRGRIVYRKK >gi|261747234|gb|ADAD01000134.1| GENE 35 34067 - 34180 196 37 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229210335|ref|ZP_04336732.1| LSU ribosomal protein L36P [Leptotrichia buccalis DSM 1135] # 1 37 1 37 37 80 89 2e-14 MKVKASIKPICDKCKVIKRHGKVRIICENPKHKQIQG 
>gi|261747234|gb|ADAD01000134.1| GENE 36 34355 - 34720 555 121 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229883849|ref|ZP_04503314.1| SSU ribosomal protein S13P [Sebaldella termitidis ATCC 33386] # 1 121 1 121 121 218 90 6e-56 MARIAGVDIPRNKRVEISLTYIFGIGRSTSNEILEKAGVDKDIKVKDLTEEQVGKIRTIV EEYKIEGELRKEIRLNIKRLLDIKSYRGLRHRNGLPVRGQKTKTNARTRKGPVKMAIAKK K >gi|261747234|gb|ADAD01000134.1| GENE 37 34743 - 35132 644 129 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229210337|ref|ZP_04336734.1| SSU ribosomal protein S11P [Leptotrichia buccalis DSM 1135] # 1 129 1 129 129 252 97 3e-66 MAKKPAVSKKKKLKNIPNGIAYIHSTFNNTVVTITDSEGKVVIWKSGGTSGFKGTKKGTP FAAQIAAEQAAQVAIENGMKQIEIKIKGPGSGREASIRSIQATGLEVTRIVDITPVPHNG ARPPKKRRP >gi|261747234|gb|ADAD01000134.1| GENE 38 35161 - 35748 932 195 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229210338|ref|ZP_04336735.1| SSU ribosomal protein S4P [Leptotrichia buccalis DSM 1135] # 1 195 1 195 195 363 91 1e-99 MARDRQPVLKKCRNLGLDPSVLGVNKKSNRNIRPNANRKLTEYGTQMREKQKARFVYGVM EKQFYKLYEEATRKEGVTGELLLQYLERRLDNVVYRLGFGATRRQARQIVSHGHILINGK RVNIASYRVKQGDVISIKENSRELALIKESVGQKTVPGWLELDEAALTAKVLENPGRDSV DFEINEAMIIEFYSR >gi|261747234|gb|ADAD01000134.1| GENE 39 35782 - 36750 1294 322 aa, chain + ## HITS:1 COG:FN1283 KEGG:ns NR:ns ## COG: FN1283 COG0202 # Protein_GI_number: 19704618 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, alpha subunit/40 kD subunit # Organism: Fusobacterium nucleatum # 1 318 17 337 342 381 63.0 1e-106 MLNIEKIAKNIKLTEEKESRYTAKYTLEPLYRGYGNTIGNALRRILLSSIPGSAIKGLRI DGVLNEFSTIPGVKEAVTDIILNVKEIVVELDEPGEKKMVLSVKGPKVITAADIKPDPGI KIINPEQVIATVTTDKEINMEFLVDSGEGFVVSDEIDTEGWPIGYLAVDAIYTPIKRVTY SVEDTMVGRVTNYDKLILEISTDGSIEIHDALSYAVELLILHVKPFTNIGNSMSKFRGNE DENSSLNSETESNIEDMKIEELDFTVRSYNCLKKAGVNTISDLTSMTYVELLKIKNLGRK SLNEIIDKMKELGYDLSEEAGN >gi|261747234|gb|ADAD01000134.1| GENE 40 36784 - 37173 516 129 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229210341|ref|ZP_04336738.1| LSU ribosomal protein L17P [Leptotrichia buccalis DSM 1135] # 1 129 
1 124 124 203 80 2e-51 MNHNKSYRKLGRRSDHRLAMLKNMTISLVKAERIETTVTRAKELRKFVEKVITLGKKYNN FDLENKEERVKAIHLRRQAFSFLRNEEAVAKVFKEIAPKYMDRNGGYTRIIKTDVRRGDS AELAIIELV >gi|261747234|gb|ADAD01000134.1| GENE 41 37365 - 38678 1852 437 aa, chain + ## HITS:1 COG:CAC1610 KEGG:ns NR:ns ## COG: CAC1610 COG1114 # Protein_GI_number: 15894888 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid permeases # Organism: Clostridium acetobutylicum # 1 427 1 429 440 290 40.0 3e-78 MEKLKKKELVMIGIMIFSLFFGAGNLIFPPIIGKQSGTNMFTVMLFFSITAIIFPVLGII AVTKSDGLKKLSNRVDHVFSIIFTTAVYLAIGPALAMPRAGTVPFEIAISPYLPEGTGVK LVLFIYTTAFFGIVYWLSLSPHKLLDRISKITSPAFLVLVFMLFIGTFIKPMGGYIQPEE LYAKNFGLQGFLDGYLTLDALAGLNYGLVVVYVIKSKIGSEDKKLTKTVAITGIVAGVML FAVYMMLAHVGAASASMFPQTKNGAEILTRAIRYSYGNFGAVLLAFMFIIACLNIGVGLT ISLSQYFNSLIKKVPYKVWATIWAFWSYVLANLGFQKIMEYSIPVLLAIYPASLVLIILA LLDKQIKSNKIIYRATVYPTVMISIINTLEKINIKIPILTEFTQKLPLQSADLGWVSVAL IFFIISFGLVKLRKVEN >gi|261747234|gb|ADAD01000134.1| GENE 42 38768 - 39496 767 242 aa, chain - ## HITS:1 COG:SPy1113 KEGG:ns NR:ns ## COG: SPy1113 COG3700 # Protein_GI_number: 15675094 # Func_class: R General function prediction only # Function: Acid phosphatase (class B) # Organism: Streptococcus pyogenes M1 GAS # 18 242 24 243 243 215 53.0 6e-56 MKKLLLALAILSSSTLFAAGPKVPYTHEGFYSTDKVQKAVHFISVEDIKKSLEGKGPINV SFDIDDTLLHSSGYFIYGQHYFQIPGDKRGAISYLYNQKFWDYVAENGDEHSIPKQSAKD LIKMHLERGDNVFFITGRTKHSKDKNYTSTKLSKTLQRYFDLPKEVYVEYTADTPTGGFK YDKSFYIKKHNVSIHYGDSDDDILAARELGIRGIRVQRAYNSTNPQKLNGGYGEEVLINS AW >gi|261747234|gb|ADAD01000134.1| GENE 43 39619 - 40347 763 242 aa, chain - ## HITS:1 COG:SPy1113 KEGG:ns NR:ns ## COG: SPy1113 COG3700 # Protein_GI_number: 15675094 # Func_class: R General function prediction only # Function: Acid phosphatase (class B) # Organism: Streptococcus pyogenes M1 GAS # 18 242 24 243 243 204 51.0 1e-52 MKKLLLVLTILSSSTLFAAGPKVPYTHEGFYSTEKVQKAVHFISVEDIKKSLEGKGPINV SFDIDDTLVHSSGYFIYGQYHFQIPGDERGERSYLKNQKFWDYVAENGDEHSIPKQSAKD 
LIKMHLERGDHVFFITGRTKHSKDKNYTSTKLSKTLQRFFDLPKEVYVEYTAATPTGGFK YDKSFYIKKHNISIHYGDSDDDILAAREVGIRGIRVERAYNSTNRQNLNGGYGEEVLINS AW >gi|261747234|gb|ADAD01000134.1| GENE 44 40487 - 40825 549 112 aa, chain + ## HITS:1 COG:lin0580 KEGG:ns NR:ns ## COG: lin0580 COG3695 # Protein_GI_number: 16799655 # Func_class: L Replication, recombination and repair # Function: Predicted methylated DNA-protein cysteine methyltransferase # Organism: Listeria innocua # 16 109 8 97 98 75 43.0 2e-14 MKGYVHMKNKKDNIFEEIYKIVKKIPCGKVATYGQIAIMIGNPRLSRVVGYAMSSCPYKD VPCHRVVNRFGELAKTFGENGSEEQKIRLENEEVYVEESGCVDLKEYIWNGK >gi|261747234|gb|ADAD01000134.1| GENE 45 40899 - 41390 748 163 aa, chain - ## HITS:1 COG:no KEGG:Clos_0633 NR:ns ## KEGG: Clos_0633 # Name: not_defined # Def: C_GCAxxG_C_C family protein # Organism: A.oremlandii # Pathway: not_defined # 1 162 1 162 164 238 66.0 7e-62 MGKIDTKNYTKEEIIAKVKEEAEELYENGTFFCSEAVITVLNNYLGQPYPPEVVKMASGF PIGMGKAGCLCGAVSGGQMALGIVYGRTQGEAMQEKMFEMSKSLHDYIIKEYGSCCCRVM TRKWRGDDFKSPERKRHCVEITGKVAEWIAEQLINDNQIETKH >gi|261747234|gb|ADAD01000134.1| GENE 46 41578 - 42501 1079 307 aa, chain + ## HITS:1 COG:L181867 KEGG:ns NR:ns ## COG: L181867 COG1275 # Protein_GI_number: 15672360 # Func_class: P Inorganic ion transport and metabolism # Function: Tellurite resistance protein and related permeases # Organism: Lactococcus lactis # 4 302 9 310 324 130 33.0 4e-30 MKKFLEKYPVPISGLALGMAALGNLLKSYGETPRNLLGIIAFVIILLLTVKILSNIEGFK KAMESPVISGSFATYSMVIIILSAYLPNKAVGIVFWYIGILIHVLLIVWFSVKYIPKRNI ETIFTTWFIVYVGIVTVSVTAPWYKMPVFGKAAFWFGFVTYIVLLPIVLYRVFKVKKIPE PAIPTFAVFAAPAALLLAGYISSFPEKNFIIFNLLIFLTVLFYVMVLLSMPKLLKLKFYP SFSAFTFPLVISGLALKLTTGVLKKAGVNVSVLAKVVNLAEIIALVAVIYVFVKYLQFIF QKEQIEK >gi|261747234|gb|ADAD01000134.1| GENE 47 42575 - 43903 1656 442 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2011 NR:ns ## KEGG: Lebu_2011 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 5 441 4 456 537 562 67.0 1e-159 
MKRKKLKLMLFLFGISSLVAAGATTQTVNTGKTDKNMSVSENKQEIKIQPSNKTEKGYDL NFNTQNYTKKTAEVNGKTIEYRAYENIVYVKNPVDVNYQTINIYIPEEYFNGKSVGKYNA KNAPIFFPNSIGGYMPGAAGTVGMDKRSGKENASLIALSKGYVIAAPGARGRTLKDEQGS YTGKAPAAIVDLKAAVRYLHYNDKTMPGNADRIISNGTSAGGALSALLGATGNNKDYEPY LKELGAAEAKDDIFAVSAYCPITNLDNANEAYEWMFNGVNDYKKINITMLDYNVQRKEVA GKLTEDEIKRSADLKVLFPQYLNSLKLKDKSGKLLTLDKNGNGSFKEQIKKYYIDSANKA LKAGTDLSTFKFLTIKNNKVVDLNFDEYVKYMGRMKTPGAFDNVDLSTGENNLFGDKTVD NKHFTKYMLEHSTTNGQMAEKI >gi|261747234|gb|ADAD01000134.1| GENE 48 43909 - 44142 381 77 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2011 NR:ns ## KEGG: Lebu_2011 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 76 461 533 537 121 73.0 9e-27 MMNPMYYIGTKGATISKHWRIRHGAIDRDTSLAIPAILALKLENTGKNVDFASPWATGHS GDYDLDELFIWVDGIVR >gi|261747234|gb|ADAD01000134.1| GENE 49 44309 - 44722 339 137 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038725|ref|ZP_06012085.1| ## NR: gi|262038725|ref|ZP_06012085.1| hypothetical protein HMPREF0554_1372 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1372 [Leptotrichia goodfellowii F0264] # 1 137 1 137 137 250 100.0 3e-65 MKKLIILIMLVIAVPSLSCEVTINMNTKNVIAFEPYLTIFMFSGRTSSNIVKTDCNIEKR WYIGVNINFNVIRKNNISWSLNNKNGGYFRNGNSEFSAIFNSDPFGNTLFLRKGSPYVTF KGISKVRYTQDKHLMLD >gi|261747234|gb|ADAD01000134.1| GENE 50 44744 - 45499 681 251 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038701|ref|ZP_06012061.1| ## NR: gi|262038701|ref|ZP_06012061.1| hypothetical protein HMPREF0554_1373 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1373 [Leptotrichia goodfellowii F0264] # 1 251 1 251 251 421 100.0 1e-116 MEEKQKNPWYKSTPGLVLIVLFWYVLVPVVIYQSKMEKKKKIMYGSIYIFAMILFYGSIF LSENKSSVSKREETKNNEIAKSEELQKLSQEKAKLEEEKRNLEIRKNVETEKNRIINEIV RTWDKADKVLAEYLESTKTNVDPYDFYYKGKESLTITQETLSEFQNLNCNITGDENFDNK CKKTITSGIKVLKTRIVYFQKATDAMYKILTDSTPDKEEINQISSSANDTLDEAVKLFQN FKFEIHDLKNY >gi|261747234|gb|ADAD01000134.1| GENE 51 45514 - 46083 550 189 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|262038726|ref|ZP_06012086.1| ## NR: gi|262038726|ref|ZP_06012086.1| hypothetical protein HMPREF0554_1374 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1374 [Leptotrichia goodfellowii F0264] # 1 189 1 189 189 334 100.0 2e-90 MSKNKIWRINMKKQQEILFNENNRNSEVILNDDILNNYIGLGWSSKEGFDKDIQEKDFEV EEIVNKIRAELNKEVDLKKRNDDLKHFNTDIQSIFIEGMKIEDFVWIKFGKIYKLGKITG ECRYNLGSKYIPGKYQIGFYRKVKILNRDFDELEIPKKVIASFKVRRAVQRIEDETDGSI IKHCESCYN >gi|261747234|gb|ADAD01000134.1| GENE 52 46374 - 46829 503 151 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0540 NR:ns ## KEGG: Lebu_0540 # Name: not_defined # Def: septicolysin # Organism: L.buccalis # Pathway: not_defined # 1 151 1 151 152 149 64.0 4e-35 MKLAEALILRADLQKRIEQLRIRLNNNAKVQENDTPTENPVTLISELNNCIDELTSLIKK INKTNCTATSNGITLVDLIAERDTLTLKANIMRTFLQQASQKVELYSSKEIKILSTVDVP SLQKEVDNLSKKIREIDTKLQQANWLTDLIE >gi|261747234|gb|ADAD01000134.1| GENE 53 47170 - 48054 1000 294 aa, chain + ## HITS:1 COG:NMA2172 KEGG:ns NR:ns ## COG: NMA2172 COG0739 # Protein_GI_number: 15795043 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Neisseria meningitidis Z2491 # 133 288 231 388 430 115 42.0 8e-26 MGLIKKIEGYKENFCKKSKFSEKFKNIENKNTIVFIGLIAVFLMVFIFQTSHTENIYATS KKQNVYKTESNNTRENVKNSLHEETEKLKDNKIQEETGQKDLKISIGEKIERETKRNIIS IRNGSATRKAYAKNENRNSAIYYTKYNTRVDNYEREDIRSQEISEKFYWPVGSQNITSKY GERIHPILKERKFHRGIDIGEKFGAEVRSAVKGIVTYSGEKGNYGKMVEVTSEKGIVTRY AHLSKISVEEGEIVSQGYILGNVGSTGMSTGPHLHFEIMIDGKPLDPMEFEYYQ >gi|261747234|gb|ADAD01000134.1| GENE 54 48130 - 48192 80 20 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLIKARGLYRRSGISPCPEV >gi|261747234|gb|ADAD01000134.1| GENE 55 48519 - 48983 614 154 aa, chain + ## HITS:1 COG:FN1505 KEGG:ns NR:ns ## COG: FN1505 COG0054 # Protein_GI_number: 19704837 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase beta-chain # Organism: Fusobacterium nucleatum # 1 153 5 157 157 219 75.0 1e-57 
MKKFEGKFNGKDLKIGIVAGRFNEFITSKLVEGATDVLKRNEVLEESIEIAWVPGAFEIP LIVKKMINAKRYDAIITLGAVIKGSTPHFDYVCAEVSKGVAQLSLQYDIPVIFGVLTTNN IEEAIERAGTKAGNKGSEAAFAAIEMANLIKEII >gi|261747234|gb|ADAD01000134.1| GENE 56 49017 - 50165 1279 382 aa, chain + ## HITS:1 COG:FN1506_2 KEGG:ns NR:ns ## COG: FN1506_2 COG1985 # Protein_GI_number: 19704838 # Func_class: H Coenzyme transport and metabolism # Function: Pyrimidine reductase, riboflavin biosynthesis # Organism: Fusobacterium nucleatum # 159 374 1 220 223 218 57.0 1e-56 MNKEINKKDCDNNISKEDAEYMKMALELAKKGIGRVNPNPLVGAVIVKNGKIIGEGYHKM FGEAHAEVHAIENAGEETKNATIYVTLEPCSHYGKTPPCAEKIIKAGIKRCVIGSGDPNP EVAGKGIEMLKNSGIEVTENVLKKECDELNQVFFKYIKSKIPYLFLKCGITLDGKIALSN GESKWITNEKAREKVQYYRNKFMGIMVGINTVLYDNPSLTARIENGVDPFRIIIDPNLRI TEDYNIIKNNSDGKTVIVTSVINKNSDKVEIFKERKKLKFIFLKGKNFLMRDILEEIGKE GIDSVLLEGGEKLISKAFSEKVIDGGEIFISNKIFGDKTAKSFISGFTKINTDEAITLNN IKYNIYDNNIGVEFYFKKYDES >gi|261747234|gb|ADAD01000134.1| GENE 57 50174 - 50845 786 223 aa, chain + ## HITS:1 COG:FN1507 KEGG:ns NR:ns ## COG: FN1507 COG0307 # Protein_GI_number: 19704839 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase alpha chain # Organism: Fusobacterium nucleatum # 1 223 34 251 251 252 63.0 4e-67 MFTGLIEGPGEVINVLKKPSGMEITIKGSNVAEKVEIGGSIAVNGVCLTVTSFSKNTFTA DVMFETIKRSGLKRLKTGEKVNLEKSVTLSTFLGGHLVMGDVDCEAEILSIISKGIAKIY IFHLEEKDKKYMQYIVEKGRITIDGASLTVIDVNDEKRTFSVSLIPHTIENIILGGKKSG DFVNVETDLFGKYVEKILKFSEYEKEQNRKSKITPEFLRENGF >gi|261747234|gb|ADAD01000134.1| GENE 58 50869 - 52080 1624 403 aa, chain + ## HITS:1 COG:BH1556_2 KEGG:ns NR:ns ## COG: BH1556_2 COG0807 # Protein_GI_number: 15614119 # Func_class: H Coenzyme transport and metabolism # Function: GTP cyclohydrolase II # Organism: Bacillus halodurans # 208 399 1 192 197 233 57.0 5e-61 MDSKFNTIEEALEDIKKGKVIVIVDDEARENEGDLFLPAQSATYKMINFMINKAKGLMCV PLSRRRAEELELDPMNRVNTDHHETAFTTSVDAYEGTVTGISVADRLKTITDLGNLEKKA KDFRKPGHIFPLIAKEKGVLERKGHTEAAVDLSRIAGFSEVGVIMEILNEDGTMARRDEL 
FEFCTKYNLKIITIDDLILYRKNNEKLVKKEAEVNIFTKFGNFDFVGYSDKIENKEYIAV IKGDIRNKENVSVRLHSECLTGDVFGSKRCDCQEQLHRSLKEIEEKGEGLLIYLRQEGRG IGIINKLKAYKLQDEGYDTVEANHQLGFQEDLRDYAVAAQIIKDLKVKSISLKTNNPLKI EGLKKYGIKVVARESIEIKFNEFDKKYLKTKKEKMGHIFIQEL >gi|261747234|gb|ADAD01000134.1| GENE 59 52308 - 52835 536 175 aa, chain + ## HITS:1 COG:no KEGG:Clocel_3820 NR:ns ## KEGG: Clocel_3820 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulovorans # Pathway: not_defined # 1 172 35 211 244 81 36.0 2e-14 MKLYVVDDDYIEYLKKFDNHVLNYSGKDYKNKRKYLGILLNINECKYIAPLSSPNKKTDY LEIGEIRKSIVPIIRIVSNENLLGTIKLSSMIPVYDETVISYYNIKDEKDIKYKNLVSKE YKFININKEYIIKNALKLYTQKINNMSMKYVQETVNFKLLEEKAKLYMNKKIFEV Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:24:59 2011 Seq name: gi|261747227|gb|ADAD01000135.1| Leptotrichia goodfellowii F0264 contig00077, whole genome shotgun sequence Length of sequence - 4321 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 2 - 859 447 ## PROTEIN SUPPORTED gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 2 1 Op 2 . - CDS 872 - 1450 920 ## COG1713 Predicted HD superfamily hydrolase involved in NAD metabolism 3 1 Op 3 . - CDS 1469 - 2014 636 ## Lebu_0295 hypothetical protein 4 1 Op 4 . - CDS 2056 - 3099 1519 ## COG0860 N-acetylmuramoyl-L-alanine amidase 5 1 Op 5 . - CDS 3199 - 3456 384 ## Lebu_0297 preprotein translocase, YajC subunit - Prom 3486 - 3545 9.2 6 2 Tu 1 . 
- CDS 3575 - 4318 753 ## COG1489 DNA-binding protein, stimulates sugar fermentation Predicted protein(s) >gi|261747227|gb|ADAD01000135.1| GENE 1 2 - 859 447 286 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 [Clostridium acetobutylicum ATCC 824] # 16 284 19 291 730 176 36 2e-44 MKELIYLKKLLEDHELTFQEIIQMLEWTPKKRKLYKQILSSWEDEGEIYLKRNGKYTLPE KAGLIKGEISIAGGNFGFLDVVGENSIFIPGHYLNTAMNGDTVLVRILKNNGNSDKSREG EVYKVVKRARDVIVGIFEHNLSFGFVRPRNSGRDIYVSKKKIRGAKTGDLVAVKIYFWGD EEKKPEGEIVSILGNPEDTEALISALLINNGIQEKFPNEVIKETGKIGDDFSNEIEKRKD LRHLDIITIDGSDAKDLDDAVYVEKTETGYKLIVSIADVSYYVKNG >gi|261747227|gb|ADAD01000135.1| GENE 2 872 - 1450 920 192 aa, chain - ## HITS:1 COG:FN0607 KEGG:ns NR:ns ## COG: FN0607 COG1713 # Protein_GI_number: 19703942 # Func_class: H Coenzyme transport and metabolism # Function: Predicted HD superfamily hydrolase involved in NAD metabolism # Organism: Fusobacterium nucleatum # 1 180 1 178 193 119 39.0 3e-27 MGINIDKIKTNVKKYLDEKRYRHVERVAAAAKELAEIYNVPAEDAVAAAYLHDVAKFFEI RKMIDLVRGKYPEVENKISQTTAILHGFAGAEFIRNNYDMFGIDNEEILDAVKYHTIGSE NMSTLSKIIYLADAIEAGRTWDGVEKARELAKKDLDKALIYEIKTKLEYLLSIENIIHPN IILFRNSLIYKG >gi|261747227|gb|ADAD01000135.1| GENE 3 1469 - 2014 636 181 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0295 NR:ns ## KEGG: Lebu_0295 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 11 181 12 182 182 168 52.0 8e-41 MGNGVKTDSQSEKIKGFFKKNLLLMILVLLSAGAITANYYDKKNSQTINVTVDPKLVEKS SSSAEQTQKINIFVYDSASKAIESKEVFIPKQLNIIEGDFINEIIKDSPYVTKEMKFQSA YTLNMDDKNTTIVKLNSQFAGLKSDKVLFDGFSQAVTQTIMKNFPNVQAVVIQIDGETTA Q >gi|261747227|gb|ADAD01000135.1| GENE 4 2056 - 3099 1519 347 aa, chain - ## HITS:1 COG:FN1334 KEGG:ns NR:ns ## COG: FN1334 COG0860 # Protein_GI_number: 19704669 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Fusobacterium nucleatum # 128 346 116 336 338 169 44.0 1e-41 
MLLLISTVVFSETLEKVTYRNGVYTATFKEKRKINVSATFNQSQSVLALDFQNVTVKDGI PNNLRVNDQYVDNVSITEVAGISTISFYLKAGTQYRIVTRNGEVQVTFSGQNNNGNSVVK NQPSKNQNTQSGKKKKYTIVVDPGHGGKDSGALGNGYREKDFALAIGLKLANNLKKDYNV IMTRSTDVFIPLQTRAKIANDANADFFVSIHLNSGGASANGAEAFYYSKKESAYAAEVAK FENSVDSNYKDIPLSDFIINDIFYRINQQKSAAVATDALDSIISNFGLRRRGVYGANFAV LRGTNAPAILVEVGFITSYGDIDQYLSESGKERLAAGIANAIRKHFN >gi|261747227|gb|ADAD01000135.1| GENE 5 3199 - 3456 384 85 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0297 NR:ns ## KEGG: Lebu_0297 # Name: not_defined # Def: preprotein translocase, YajC subunit # Organism: L.buccalis # Pathway: Protein export [PATH:lba03060]; Bacterial secretion system [PATH:lba03070] # 2 85 4 87 88 71 54.0 1e-11 MNYFQMVIIYIVIMAVVLGPTYFTNKKKKQKQQEMINSLKAGDKITTIGGIKGTVVEVLT DTVEIKIDKTARMTILKSAVSSVSK >gi|261747227|gb|ADAD01000135.1| GENE 6 3575 - 4318 753 247 aa, chain - ## HITS:1 COG:MTH1521 KEGG:ns NR:ns ## COG: MTH1521 COG1489 # Protein_GI_number: 15679518 # Func_class: R General function prediction only # Function: DNA-binding protein, stimulates sugar fermentation # Organism: Methanothermobacter thermautotrophicus # 22 228 15 216 240 162 40.0 5e-40 MKSKSESNKNIYTIDYDETVIFKERITRFTVNFDFKEKSENSKSENLAHLHDSGRLTELL VKGNELLIKKAENENRKTKWDVVGVKVDDETILINTAYHRYIAESILKNEKLSPFGKVKS IKPEVKYNKSRIDFYLETEKDKIYLEVKGCTLIENKVATFPGAPSVRATKHLNELMELKK EGYRAAVLILVFRKSDIFRPRHEIDKDFAEAFYKAKRKGVEVYPMLLSYKNKNIYFEKKL EILRKSF Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:25:04 2011 Seq name: gi|261747225|gb|ADAD01000136.1| Leptotrichia goodfellowii F0264 contig00174, whole genome shotgun sequence Length of sequence - 329 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 323 311 ## COG3210 Large exoproteins involved in heme utilization or adhesion Predicted protein(s) >gi|261747225|gb|ADAD01000136.1| GENE 1 2 - 323 311 107 aa, chain - ## HITS:1 COG:FN0132 KEGG:ns NR:ns ## COG: FN0132 COG3210 # Protein_GI_number: 19703477 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 5 107 1817 1919 2462 84 56.0 3e-17 MNIRGKGDTRPNSPSVSISANRSNSNTEQTIYQNGRFTNVEEVHNGTKNMRLSGFNQEGG KVTGSIENLVEESKQNISRTTGSSSGINVNLSSKGVPTGGSVNYSHT Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:25:09 2011 Seq name: gi|261747209|gb|ADAD01000137.1| Leptotrichia goodfellowii F0264 contig00133, whole genome shotgun sequence Length of sequence - 16229 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 6, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 379 550 ## Lebu_0160 hypothetical protein 2 1 Op 2 . - CDS 399 - 1142 1157 ## COG0670 Integral membrane protein, interacts with FtsH 3 1 Op 3 . - CDS 1173 - 3083 3092 ## COG0441 Threonyl-tRNA synthetase 4 1 Op 4 . - CDS 3126 - 6767 4340 ## Lebu_0425 OstA family protein - Prom 6801 - 6860 7.3 + Prom 6815 - 6874 9.3 5 2 Tu 1 . + CDS 6915 - 7631 1052 ## COG0860 N-acetylmuramoyl-L-alanine amidase + Term 7637 - 7685 5.4 - Term 7623 - 7673 1.8 6 3 Op 1 . - CDS 7679 - 8104 447 ## SMU.442 hypothetical protein 7 3 Op 2 . - CDS 8121 - 8534 549 ## Vpar_1626 hypothetical protein - Prom 8723 - 8782 11.4 + Prom 8538 - 8597 14.6 8 4 Tu 1 . + CDS 8669 - 9103 458 ## COG1846 Transcriptional regulators + Term 9105 - 9164 17.8 - Term 9094 - 9151 5.5 9 5 Op 1 . - CDS 9170 - 10135 1366 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 10 5 Op 2 . 
- CDS 10155 - 11237 1407 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold - Prom 11275 - 11334 4.1 - Term 11330 - 11382 6.0 11 6 Op 1 . - CDS 11398 - 12879 2133 ## COG1190 Lysyl-tRNA synthetase (class II) 12 6 Op 2 . - CDS 12896 - 13264 366 ## Lebu_0135 hypothetical protein 13 6 Op 3 . - CDS 13276 - 13686 783 ## Sterm_0021 hypothetical protein 14 6 Op 4 1/0.000 - CDS 13702 - 14172 556 ## COG1576 Uncharacterized conserved protein - Prom 14222 - 14281 8.2 15 6 Op 5 . - CDS 14283 - 16229 2264 ## COG0323 DNA mismatch repair enzyme (predicted ATPase) Predicted protein(s) >gi|261747209|gb|ADAD01000137.1| GENE 1 1 - 379 550 126 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0160 NR:ns ## KEGG: Lebu_0160 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 124 1 124 128 194 82.0 7e-49 MKKQIKSVEEFHKIYKLGNSEKPIGKLESNKEKLRFELMKEENEEYLEAAKNGDIVEVAD ALGDMMYILCGTIIEHGMQHIIEEVFDEIHRSNLSKLDENGNPIYREDGKVIKGPNYFPP DIKKII >gi|261747209|gb|ADAD01000137.1| GENE 2 399 - 1142 1157 247 aa, chain - ## HITS:1 COG:ybhL KEGG:ns NR:ns ## COG: ybhL COG0670 # Protein_GI_number: 16128754 # Func_class: R General function prediction only # Function: Integral membrane protein, interacts with FtsH # Organism: Escherichia coli K12 # 36 246 26 234 234 128 42.0 1e-29 MNYDDFENEVYGNNTSQMTYEKLEKLVASKVLGSIGWMIVGLTITGIVGFITVYSLLTGS LSFQTIQSIYVPATIIELVLVFAFTAISFKAKAGTLRFVFILYSALNGFTLSILGIIYTE GSLIFAFIGTLVLFIVLGLYGYFTKEDLSKYGTILKVGLIALIIMSLINMFMASDQLMWI SSILGVIVFIVFIAYDINRIKNSIISYAVYEDASILERIEIIGALRLYLDFINLFIYILR IIGRRRR >gi|261747209|gb|ADAD01000137.1| GENE 3 1173 - 3083 3092 636 aa, chain - ## HITS:1 COG:FN0611 KEGG:ns NR:ns ## COG: FN0611 COG0441 # Protein_GI_number: 19703946 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 22 634 3 618 620 827 63.0 0 MLEMLLPDGNVRKIEQPMTVTEFAKTISLSLGKETIGAIIDGVQVDPSHIIEKSGSIKII 
TSTSEEGVAIIRHSAAHIMAQAVQKLFPGTKVTIGPVIENGFFYDFDPEKPFTEEDLTKI EEEMKKIVKENYEFKRSEMSAEDAKKLFAEMGENYKVEIIDDLGVDKVSIYQQGEFVDLC RGTHIPSTGYLKAFKLMSTAGAYWRGNSDNKMLQRIYGVAFGSKKELEDYLTMLEEAERR DHRKLGKQMDLFFLDEHGPGFPFFMPKGMELMNKLQEIWRKEHKKRKYEEIRTPVMLDKE LWEISGHWFNYRENMYTSEIDEKVYAIKPMNCPGSIIAYKNNLHSYKDLPLKYGEMGLVH RHEFSGALHGLMRVRAFTQDDAHVFCTKEQIEEQIIEIIDLYDKFYTLFGFDYNIELSTK PEKAIGSDEIWAVAEADLASALEKKGIKYKLNPGDGAFYGPKIDFKMRDSIGRIWQCGTI QLDFNLPARFEMSYIGADGEKHEPVMIHRAMYGSMERFIGILIEHYAGAFPTWLAPVQAR ILTISKEQVPFAEKLYEELQEAGIRVDIDTRDEKIGYKIREANGDQKIPVQLIIGKNEVE NNEVNVRRFGSQDSQNVSVKEIVDMLLEESKVPFKN >gi|261747209|gb|ADAD01000137.1| GENE 4 3126 - 6767 4340 1213 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0425 NR:ns ## KEGG: Lebu_0425 # Name: not_defined # Def: OstA family protein # Organism: L.buccalis # Pathway: not_defined # 1 1213 1 1226 1226 1462 62.0 0 MKNNKYKVLFLFLFCLLFAANVNAEETVIELETEKSIIDIEKEAIEATDGVILKYGDIII RADSVKKLGGKNVLFAYGNVVFTQGTQTVKANELVFDMDTKKAKILSSESYDTKLKLRFG GEQTLSEPNKITIKNGWFTTSPYEEPNYKVNAEELLIYPNRKIVAKKIGVEVGGKTWFKF PYYVASLKPESQRATLFPYIGSDSERGLFGIWGFDYDKGKYAQGFVDFELSAKKKLALKI SNDYTLWPGSSGNVFVKRFVVPIGNRIREWDFEWNHNIVNTPKKEKSERRFYDLGYGLWN FNYKNITTNLVRAADGVLLKDDYTSYVDTYKKIGFYDFKINQELGQNGEFNLDYYWTQNP DALRELTKINDFIVDRDEIDPRKTDVDLYKTLKYTNGNSDIAIKIDNEKFTDINPGYIGD LNSFRNKKNYSIDFKGPKIKLEYLDSNKDEYKEILNLKERDDADNVSLEDSKRWVQTTAY DHRKEVSLTLGNYYPFRQNEFFGYKPKTLSQYLTNNFYFGVETKHSDIKKKEYEYDFTRD NPAFNNLFLDSTTDDNSRIYKVYEDTDKIKRAKKIIYEKYRSQKINIGNDKIELPIRNSY LSFNFGFENRDYSEVYVPEFFKGRKIEDVDSKTGYKILRDAAGNEIKKQPSLNISTLNTK LFTTIFDNTAKINNKYDIKVTNEANLTLQKVNASGAMYNGNDIIDIPTNTLGINNNFNFY LGNMTFNYNFTNFHNRHFSGNWLKNQYVRNYFKFDIANKRFVSFDFISNDEYEFEDFKSE RNLSREVQYGYLSDAGNSFLYKYTDKNRQYFPYNEDMGWNRKDNKEFVRDRMFSINYNEW GIDYTNIKSKINDIFGTSLNFGKPALKLQKETHKIGFVYDTSKMKNKKFESDHYFRISYG FGKKTYRDLNNTPLTISDDRYSSGSDYTTISLLYRYENNVKPKYEGRDSEGNIVKESNQK DFSVENANKNIIDNIEIKSDDRLFLNSEEEQAYKSYVEEETYKQNKFNLNDFNSKLQDLR KKKKYFQIGLDMEIDGSNSMQQTDLKGFNRLNDLTFKVEAGYLEKFFVRYAFVMEKPDRI 
YRKDPNRYSQYNFRRHDFETKYMFGPDPDKPWWIGAKVQYVQDGAPKSSDPEIFESSSAA RRVNKITLGMATVSHRFENLEWEIGAGMKWDKPNNKKLGYYPVVTLKFGIVTFPEKNVQF NYTKGSPSFGAGL >gi|261747209|gb|ADAD01000137.1| GENE 5 6915 - 7631 1052 238 aa, chain + ## HITS:1 COG:MT3820 KEGG:ns NR:ns ## COG: MT3820 COG0860 # Protein_GI_number: 15843338 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Mycobacterium tuberculosis CDC1551 # 32 235 28 234 241 76 30.0 3e-14 MFKYVLKITTLISVFAVVFTTFGAEVKPGQTLICIDPGHQGKGNRGLEEIAPGSSKKKVK VADGTAGIVTKKGEYELTLEIGLKLRDAFKSKGYKVLMTRETHNVNISNKERSLMTNKAG CAAYIRLHADGSSNRSLTGVSVLTSSAKNPYTQKVQKTSDKLSKDVLSEFVKATGAKNRG ISYRDDLTGTNWSTVPNTLIEMGFMSNPDEDRKMASKEYQEKMVRGMVNGIEKYLRER >gi|261747209|gb|ADAD01000137.1| GENE 6 7679 - 8104 447 141 aa, chain - ## HITS:1 COG:no KEGG:SMU.442 NR:ns ## KEGG: SMU.442 # Name: not_defined # Def: hypothetical protein # Organism: S.mutans # Pathway: not_defined # 1 128 2 129 136 175 60.0 6e-43 MKYWLGVVSKEHVLRGVEGEFCQVCHGKAVPLKRMKKGDYLIYYSPKISMDSDIKCQEIT AIGKMKDERIYQFQMSENFIPYRRNVKFLKLKGKCSINEFREHPEWLKYSKQLRYGHFQV SEEFFEFICSFFIQDSAIEVG >gi|261747209|gb|ADAD01000137.1| GENE 7 8121 - 8534 549 137 aa, chain - ## HITS:1 COG:no KEGG:Vpar_1626 NR:ns ## KEGG: Vpar_1626 # Name: not_defined # Def: hypothetical protein # Organism: V.parvula # Pathway: not_defined # 1 137 1 135 135 163 61.0 2e-39 MEFSFKIKVNAKKEEVWKYYSDIEKWYIWEEDLKNISLNGKFQTGTEGIMELEKMPPMKY ILTSVKENAEFWDKTETPLGDIYFGHEIFEDKGGFVEIKHTVRLESEEINEENIKFLKQI FSDVPHSMLLLKKAIEK >gi|261747209|gb|ADAD01000137.1| GENE 8 8669 - 9103 458 144 aa, chain + ## HITS:1 COG:BS_ydcH KEGG:ns NR:ns ## COG: BS_ydcH COG1846 # Protein_GI_number: 16077544 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 13 144 13 143 147 74 34.0 6e-14 MFPSKYKNDSEDSTGLLFMRTYNKWHTIIKNELKKLNITHPQFVVLTSLSYLSEKEKEVT QIMISKISGIDVMTVSQILNLLEKNDFIKRKVHSKDTRAKSVFLTLKGRKTAEKSVPIIE SIDEKFFGVLNTEENIFKSFLKKL >gi|261747209|gb|ADAD01000137.1| GENE 9 9170 - 
10135 1366 321 aa, chain - ## HITS:1 COG:all1225 KEGG:ns NR:ns ## COG: all1225 COG0667 # Protein_GI_number: 17228720 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Nostoc sp. PCC 7120 # 11 306 13 311 315 169 37.0 5e-42 MKKTNIQELEVPKVALGTWSWGFGEIAGGDTIFGNQLGKEELKPVFDRAMELGLTLWDTA TVYASGASETILGEFVKNRKDAVISTKFTPQLAEGRGDNAIFEFLDESLKRLNKEVIDIY WLHNTVDMEKWAPKLVDLLKSEKVKKVGVSNHNLEQIKHVDKILKDAGYKLHAIQNHYSL LYRTIETAGILDYCKENDIAVFSYMVLEQGALSGKYTKDNPLPSGTRRGEAFPPEILGKL EPLFYEMKILGGKYQATVPEIATAWAIAKGTVPIIGVTKTAQVDDASRAANVELSAEDIE LLDRVATDTGVTVKGEWESSM >gi|261747209|gb|ADAD01000137.1| GENE 10 10155 - 11237 1407 360 aa, chain - ## HITS:1 COG:slr0619 KEGG:ns NR:ns ## COG: slr0619 COG2159 # Protein_GI_number: 16331820 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Synechocystis # 136 360 98 336 348 82 25.0 1e-15 MSNNPVNDKITKITGEFMKEYVKNPDLIKFDFNENGKGLFENRKKNLKIDAFAHILTPKF YEDLKELNPKIPQIFSTVLNPSLLDVEERRKFQKEYSDVKQIISMFNLNPEDFLRAENEE ERKRNAEKTLQITWDANNELTETVKNNKDLFIGAVATVPMNNIEGALKIINNQVSENDEL AGIQLFSKALGKSIASDEYLPIFEAAEKNDITVFLHPVYDMTKNDNNIVFSWEYEQSQAM NEIVLAGIFKKYPNIKILVHHAGAMVPFFANRLPLVMPKEYAEDFKKFYVDTAIIGNVKA VEMAVDYFGEDKVVFGTDAPVGIQPSGPTKIVIESIEKLNKGEDIKEKIYRKNVEQLLKI >gi|261747209|gb|ADAD01000137.1| GENE 11 11398 - 12879 2133 493 aa, chain - ## HITS:1 COG:SA0475 KEGG:ns NR:ns ## COG: SA0475 COG1190 # Protein_GI_number: 15926194 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Lysyl-tRNA synthetase (class II) # Organism: Staphylococcus aureus N315 # 1 489 1 495 495 514 53.0 1e-145 MSNQTNESNIISEKLKKVEELKEAGIEPYGRKYEKINNIEEVNQYDETSDKVFKTAGRIV AFRRMGKNGFGHIQDPTGKLQYYVKKDEVGEEQYEIYKKLGLGDFIGIEGTLFRTQTGEL TLRAKSFEVLSKNIRPLPEKFHGLTNVETRYRQRYVDLVMNSEVMDTMKKRFQIVRFFRK YLEEKGFTEVETPMMHPIAGGATARPFTTHHNALDMELFLRVAPELYLKRLLVGGFEKVF 
EINRSFRNEGISIKHNPEFTMMELYQAYADYVDMMNITEDLISKLTFELHGTYEIEYEGK KINMESPWRRVKMKDIVKEVTGFDFDTVTSDEDAVTKAKELQIPLEKDRTYTKYGILNLI FEEKVESTLINPTFIIEYPKEISPLSKNKKGETDWVDRFELFISGREFANAYSELNDPRD QKERFEEQVKMKEAGDDEAQNMDLDYIRALEYGMPPAGGLGIGIDRLVMLMTNSASIRDV ILFPTLRKEDIEL >gi|261747209|gb|ADAD01000137.1| GENE 12 12896 - 13264 366 122 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0135 NR:ns ## KEGG: Lebu_0135 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 122 1 122 123 104 63.0 9e-22 MKIKIKSSDLHGQNYEQNFEVKNFLNSENKKEIHYEDEYGKCKILKFDDFVEIYRYGQIN SKQIFKQNKKTFFTYFTKEFKGKYEIFTKIIIMENEKMYLEYDIINNNEIFNSIKLEITE LN >gi|261747209|gb|ADAD01000137.1| GENE 13 13276 - 13686 783 136 aa, chain - ## HITS:1 COG:no KEGG:Sterm_0021 NR:ns ## KEGG: Sterm_0021 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: Purine metabolism [PATH:str00230]; Pyrimidine metabolism [PATH:str00240]; Metabolic pathways [PATH:str01100]; RNA polymerase [PATH:str03020] # 1 136 1 130 130 88 52.0 9e-17 MYSVGEEFEKLIDDDIEYFTCLANITVKEKEYLICENEAGVKRVFWYDTNEEDLVILDED EEDEVLEIWEEEYYGTDKDYMYWNEDFGEYDKVVKEDEDLSDLDDVGIDEMDDLEEIGSF EEDEEDIDEFLDNFLD >gi|261747209|gb|ADAD01000137.1| GENE 14 13702 - 14172 556 156 aa, chain - ## HITS:1 COG:FN0463 KEGG:ns NR:ns ## COG: FN0463 COG1576 # Protein_GI_number: 19703798 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 2 156 1 155 155 145 57.0 3e-35 MIKISIICIGKIKDGYIKEGISEFSKRLSKYVNLNIVELTEEDDNKGVENAVMKETDRII DTINKRNQSYNILLDLKGKVRTSEEMAKDIERLSLTNSEISFIIGGSNGVNDKLRKIVDF RLNFSSFTFPHQLMRLILIEQIYRWISINKNIKYHK >gi|261747209|gb|ADAD01000137.1| GENE 15 14283 - 16229 2264 648 aa, chain - ## HITS:1 COG:FN0462 KEGG:ns NR:ns ## COG: FN0462 COG0323 # Protein_GI_number: 19703797 # Func_class: L Replication, recombination and repair # Function: DNA mismatch repair enzyme (predicted ATPase) # Organism: Fusobacterium nucleatum # 4 647 10 643 643 546 49.0 1e-155 
VGYIKILDESVSNIIAAGEVVENPASMIKEMIENSLDAKATIIKIEVFKGGMDVKINDNG IGMDKDDTLLSIERHATSKISTKDDVFNLNTYGFRGEALASIAAVSKLTITTRSENSSTG YKIGSYGGVVRKFEEVSRNVGTEIEVRDLFYNTPARKKFLRKESTEYNKIKDIVLKEALA NSDVAFTLEFDGKQSVNTSGKGIENAILELFGKSVLRNLKKFEYGYLGNVEILRSSKDYI FTYVNKRYVKSNTIERAVIDGYYTKLMKGKYPFVIIFFDIDPKEIDVNVHPSKKIVKFSN DKIVYKQIKDAIDEYFYQSDREGWQPNIDLLKKSINVENKEEKIKDLFSDEVIKGESQKF FSFETHDGDFSAGNKIQERTLEEIIGSNAENNEIKENKEIESSRLLHDTDDYIIEEMNKN EKVFEKSDKIKEELEIKEESIRETGNDYKVGTFEKHTGEQFTYDVLGQIFDTYILVRKNN ELEIYDQHIIHERILYEELKEKFSGKEHKSQHLLLPQKMEVSEVEKSIIFDNIEVFNEFG FDIDEFSNNEIVFRAVPAFDFRDSIQNVFLKLLSDLKNESEIKDLRENIIISMSCKGAVK AGQKLNMNEMQNMVRRLHEVGKYTCPHGRPIIVKLTKDDLDKMFGRKK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:25:40 2011 Seq name: gi|261747208|gb|ADAD01000138.1| Leptotrichia goodfellowii F0264 contig00123, whole genome shotgun sequence Length of sequence - 314 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 314 468 ## COG3210 Large exoproteins involved in heme utilization or adhesion Predicted protein(s) >gi|261747208|gb|ADAD01000138.1| GENE 1 2 - 314 468 104 aa, chain - ## HITS:1 COG:FN0132 KEGG:ns NR:ns ## COG: FN0132 COG3210 # Protein_GI_number: 19703477 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 1 103 2131 2234 2462 104 52.0 4e-23 VQKRLGIGGRFNPDDPNLPEKIKDRIAQVREQGQEINYFYDKVTGKIYINENADEDEIRA GIGREWGIRDEFDRGRTKPNGEGQEKGTVAGEIAYKEIEDRTKG Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:25:42 2011 Seq name: gi|261747205|gb|ADAD01000139.1| Leptotrichia goodfellowii F0264 contig00149, whole genome shotgun sequence Length of sequence - 2474 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 1 - 417 615 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases + Term 453 - 497 -0.8 + Prom 489 - 548 6.8 2 1 Op 2 . + CDS 568 - 2316 2065 ## COG0366 Glycosidases 3 2 Tu 1 . 
- CDS 2323 - 2472 114 ## Predicted protein(s) >gi|261747205|gb|ADAD01000139.1| GENE 1 1 - 417 615 138 aa, chain + ## HITS:1 COG:BS_glvA KEGG:ns NR:ns ## COG: BS_glvA COG1486 # Protein_GI_number: 16077885 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Bacillus subtilis # 5 137 309 441 449 216 75.0 1e-56 AEKGTAKDSKLHVDDHASYIVDLARAIAYNTKERMLLIVENDGALSNFDPTAMVEIPCLV GSKGPEKIVQGKIPQFQKGLMEQQVSVEKLTVEAWIEGSYQKLWQAITLSRTVPSASVAK AILDDLIEANKDFWPVLK >gi|261747205|gb|ADAD01000139.1| GENE 2 568 - 2316 2065 582 aa, chain + ## HITS:1 COG:BH3868 KEGG:ns NR:ns ## COG: BH3868 COG0366 # Protein_GI_number: 15616430 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus halodurans # 6 546 1 526 553 449 46.0 1e-125 MDTKKLDKKWWKKEVGYQIYPRSFYDSNNDGIGDLNGITEKLDYLKNLGITLIWVCPIFK SPMDDNGYDISDYYDVNPEFGTKEDLERLISEAEKRGIKIILDLVINHTSDEHEWFLEAL KNPESKYRNYYIFKRGKNGLPPTNWRSHFGGSAWEKVEGEADENGNEMYYLHLFTKKQPD LNWENPEVREELYEMVNYWLEKGIAGFRVDAINSIKKDKDYLDLPVDGADRLAHNIKYTL NQPGIEEFLGELAEKTFKKYNCMTVAETPLLEYERYNDFIGEDGFFSMIFDFSYSDLDMA KEGFYYSVQDVKINELRNKIFESQLTQQKYGWGAPFLENHDLPRSLNKFLGKKANEINAK LLATLFFFLRGTPFIYQGQEIGMDNFERTDIAQFDDIASKDQYQRALGEKFSPEEALYFV NKRSRDNSRTPFQWNSNKNAGFSKDENVKAWIELTGSHKKVNAESQINNENSIFAHYKNM IELRQNGKYSDCLIYGKFIPVPLKNDEIIAYIRKYENQKVLCINNFSEKKQEVELNEIAK SISEEEIKLVDILINNYENVENNGKILVLEGYQSILVEFQIL >gi|261747205|gb|ADAD01000139.1| GENE 3 2323 - 2472 114 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no KLKLSNIPIFIIYLRSCVKDFKYRIFIRIEDSFFTLCNSQITDYVLLFL Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:25:54 2011 Seq name: gi|261747182|gb|ADAD01000140.1| Leptotrichia goodfellowii F0264 contig00050, whole genome shotgun sequence Length of sequence - 21695 bp Number of predicted genes - 24, with homology - 22 Number of transcription units - 4, operones - 3 average op.length - 7.7 N Tu/Op Conserved S Start End Score 
pairs(N/Pv) 1 1 Op 1 . - CDS 455 - 985 859 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins 2 1 Op 2 . - CDS 995 - 1447 588 ## COG1051 ADP-ribose pyrophosphatase 3 1 Op 3 . - CDS 1434 - 2753 1709 ## COG1253 Hemolysins and related proteins containing CBS domains 4 1 Op 4 . - CDS 2798 - 3283 753 ## COG0511 Biotin carboxyl carrier protein 5 1 Op 5 . - CDS 3311 - 4162 1267 ## COG0190 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase 6 1 Op 6 . - CDS 4164 - 4412 380 ## Lebu_0168 redox-sensing transcriptional repressor REX 7 1 Op 7 . - CDS 4318 - 4821 713 ## COG2344 AT-rich DNA-binding protein 8 1 Op 8 26/0.000 - CDS 4850 - 5782 1363 ## COG0223 Methionyl-tRNA formyltransferase 9 1 Op 9 4/0.000 - CDS 5787 - 6299 892 ## COG0242 N-formylmethionyl-tRNA deformylase 10 1 Op 10 1/0.000 - CDS 6327 - 8531 2404 ## COG1198 Primosomal protein N' (replication factor Y) - superfamily II helicase 11 1 Op 11 . - CDS 8561 - 10525 2792 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 - Prom 10557 - 10616 2.8 - Term 10547 - 10598 -0.7 12 2 Op 1 . - CDS 10622 - 12055 2197 ## COG0064 Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog) 13 2 Op 2 . - CDS 12072 - 12761 623 ## gi|262038778|ref|ZP_06012132.1| putative DNA double-strand break repair Rad50 ATPase 14 2 Op 3 . - CDS 12743 - 13180 299 ## gi|262038784|ref|ZP_06012138.1| putative liporotein 15 2 Op 4 . - CDS 13164 - 13268 117 ## 16 2 Op 5 . - CDS 13331 - 13435 169 ## 17 2 Op 6 31/0.000 - CDS 13472 - 14935 396 ## PROTEIN SUPPORTED gi|163737840|ref|ZP_02145257.1| 30S ribosomal protein S4 - Prom 15008 - 15067 8.0 - Term 14946 - 14999 10.2 18 3 Op 1 . - CDS 15150 - 15452 543 ## COG0721 Asp-tRNAAsn/Glu-tRNAGln amidotransferase C subunit 19 3 Op 2 . 
- CDS 15442 - 16113 751 ## COG4359 Uncharacterized conserved protein, possibly involved in methylthioadenosine recycling 20 3 Op 3 12/0.000 - CDS 16123 - 16809 1122 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases 21 3 Op 4 1/0.000 - CDS 16796 - 17422 857 ## COG1386 Predicted transcriptional regulator containing the HTH domain 22 3 Op 5 . - CDS 17483 - 18523 1769 ## COG1077 Actin-like ATPase involved in cell morphogenesis 23 3 Op 6 . - CDS 18594 - 21056 2891 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins - Prom 21136 - 21195 10.7 24 4 Tu 1 . - CDS 21271 - 21693 434 ## gi|262038780|ref|ZP_06012134.1| conserved hypothetical protein Predicted protein(s) >gi|261747182|gb|ADAD01000140.1| GENE 1 455 - 985 859 176 aa, chain - ## HITS:1 COG:alr4582 KEGG:ns NR:ns ## COG: alr4582 COG0503 # Protein_GI_number: 17232074 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Nostoc sp. 
PCC 7120 # 9 176 3 172 172 176 51.0 1e-44 MKTEEKKIVEGLIRTIPDFPEKGIVFRDITTALKDKKGLNIVIKDFTERYKDKGIDYVLG ADARGFIFGAAIAYNIGAGFVPARKVGKLPAETVKIEYELEYGKNSIEIHKDSFKKGDKV LIVDDLLATGGTAAAMVKLVEMLGASIYELAFMIELEDLKGRDVLKGYEVYSQLQY >gi|261747182|gb|ADAD01000140.1| GENE 2 995 - 1447 588 150 aa, chain - ## HITS:1 COG:PA5081 KEGG:ns NR:ns ## COG: PA5081 COG1051 # Protein_GI_number: 15600274 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Pseudomonas aeruginosa # 2 105 3 111 158 66 33.0 2e-11 MEKIRPRVRVAGILIENERILLIEHSKNDKKYWLVPGGGVDWGESTAESLIREYKEETNL DIEVESFLFLSETIAPDKEKHVINLYFKVKRKDTSKEDLKLGNEEMLTDLKFFEKEKIKN IKLYPNIKEQIIKLLNKEEIVPYLGLLWDK >gi|261747182|gb|ADAD01000140.1| GENE 3 1434 - 2753 1709 439 aa, chain - ## HITS:1 COG:FN1486 KEGG:ns NR:ns ## COG: FN1486 COG1253 # Protein_GI_number: 19704818 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Fusobacterium nucleatum # 20 433 17 426 426 310 43.0 3e-84 MEEQRIWVKIILLIILLFLSAFFSAAEAALISLKRIHIGDIKEKNSKEGTLLKLWLKNPN ELLTTLLLGKTAVYVLSVSTAVMLAESYYNSFFGHMNRSFYMFSAFVIIVIIMLIITEIT PRVFAKNSAAKISRTLIVPLNTLRIILRPIILIFLEISKLIIKLFGIKVKEQMFEITEED IKTFVKEGTEVGVIEEGEEEMIHSIFEFSDTTVKEILTPRTTVFALDMEKTIGEVWDEVV EQGFSRIPVYNETIDKIVGIVHMKDMIKYNKEEHSDMKIKELMKEPYFVPTTKTLVELLE EFKKKQSHMAIIIDEYGGTLGIVTIEDLLEEIVGEIRDEFDQEEESIQQIKETIFDIRGD TVIEELNEELDINMPVSEEYDTVSGYVQDELGKVAEEGDQVKGDGFILKVMEVDNKRIEK LRIIITEKNNEREADDGKD >gi|261747182|gb|ADAD01000140.1| GENE 4 2798 - 3283 753 161 aa, chain - ## HITS:1 COG:BH2788 KEGG:ns NR:ns ## COG: BH2788 COG0511 # Protein_GI_number: 15615351 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Bacillus halodurans # 4 161 3 166 169 95 37.0 3e-20 MKEKIKFIRELAQSMNENKIEEVQYEENNFEIQLRKKGKEKRTVIYGGAPVNVQPEASNQ VSSEAVIDNISTVAEKVETTAEISGTKIESPMVGTFYIAPSPTSAPFIKEGDNVTEGQTL CIVEAMKLMNEVKSSVSGKVKKIMAKDGDAIKKGQVLVIIE >gi|261747182|gb|ADAD01000140.1| 
GENE 5 3311 - 4162 1267 283 aa, chain - ## HITS:1 COG:aq_1898 KEGG:ns NR:ns ## COG: aq_1898 COG0190 # Protein_GI_number: 15606923 # Func_class: H Coenzyme transport and metabolism # Function: 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase # Organism: Aquifex aeolicus # 4 280 5 281 291 286 53.0 4e-77 MIKIDGKDISQKILKYIEKEHKLLRDKYERTAGLAVIMAGNNPASEVYVKNKIKACESVK FYSEIVRLDENVTEADFIKEIERFNKNDKIDGILIQLPLPEHINELKVINAVSPEKDVDG FHVVNIGKMMTGDKSGFWPCTPYGIIQLLEEYDIDPAGKDVVIVGRSNIVGKPLALMLIE KSATVQVCNTKTKNIREKLKNADIVIAAAGVPNLISAEDIKEGAVVIDVGINRVNGKLCG DVNYEEVSEKAGYITPVPGGVGPMTIASLIKNTFKSYINKVEK >gi|261747182|gb|ADAD01000140.1| GENE 6 4164 - 4412 380 82 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0168 NR:ns ## KEGG: Lebu_0168 # Name: not_defined # Def: redox-sensing transcriptional repressor REX # Organism: L.buccalis # Pathway: not_defined # 1 76 136 215 221 87 66.0 1e-16 MEEVEDFIKNSSEAVETAILAVVKDQAQIAAEKLIRSGIKAILNMTTYKLELPKNIVVVD IDISAKLQELNFWRLNINDREE >gi|261747182|gb|ADAD01000140.1| GENE 7 4318 - 4821 713 167 aa, chain - ## HITS:1 COG:TM0169 KEGG:ns NR:ns ## COG: TM0169 COG2344 # Protein_GI_number: 15642943 # Func_class: R General function prediction only # Function: AT-rich DNA-binding protein # Organism: Thermotoga maritima # 8 126 5 124 208 90 41.0 1e-18 MKLKKNDISDRVVQRLTNYLSILKEVRKYEEGINSIELSKIMDTTSAQVRKDLSTFGEFG VRGKGYDIEKLIEIIEEILGINKVNDVIIVGHGKMGEMITSNSKVLGKGFKIVGVFDKDK KKIGEKIPDLKMEIKKCGRSRRFYKKFFGSGRDSYIGSCKRSGSDCC >gi|261747182|gb|ADAD01000140.1| GENE 8 4850 - 5782 1363 310 aa, chain - ## HITS:1 COG:FN1489 KEGG:ns NR:ns ## COG: FN1489 COG0223 # Protein_GI_number: 19704821 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA formyltransferase # Organism: Fusobacterium nucleatum # 1 309 8 316 317 320 55.0 2e-87 MKTIFMGTPEFAIPSLETVYKNTDLKLIFTKEDKRNARGNKIIFSPVKQFGLDNNIEIIQ PKRLKDAEITEKIREINPDLIVVVAYGKIIPKEIIDIPKYGIINVHSSLLPKYRGASPIH SAILNGDKETGVSIMYIEEELDAGDVILKEYCEINEDDTLGTLHDKLKELGATGLEKTLK 
LIEDGNVKTEKQDDTKATFVKPISKEQAKIDWTDTKENVYNKIRGLNPFPAAYTFNEKNE NIKVYKGEKIDKIYDGEFGEIVEIINKKGPVVKVQNGGIILTEVKFEGKKIQKGTDVING RKLLQGEKLI >gi|261747182|gb|ADAD01000140.1| GENE 9 5787 - 6299 892 170 aa, chain - ## HITS:1 COG:FN1157 KEGG:ns NR:ns ## COG: FN1157 COG0242 # Protein_GI_number: 19704492 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Fusobacterium nucleatum # 14 169 18 174 174 134 51.0 7e-32 MNIVLYEHPTLRTKSTEVDIVDDELRKILDEMVETMRKANGVGLAANQVDIPKRFFVLEV ENKVKKIVNPEIIESSDEIIEYEEGCLSIPGIYKKVNRPSEIKVKYLNEKGEEVIEELKE MWARAFQHELDHLDGVLFIDRISVLNKRLISKKLELMKKEFAKGKKYREE >gi|261747182|gb|ADAD01000140.1| GENE 10 6327 - 8531 2404 734 aa, chain - ## HITS:1 COG:FN1156 KEGG:ns NR:ns ## COG: FN1156 COG1198 # Protein_GI_number: 19704491 # Func_class: L Replication, recombination and repair # Function: Primosomal protein N' (replication factor Y) - superfamily II helicase # Organism: Fusobacterium nucleatum # 1 734 1 766 766 528 42.0 1e-149 MYYYQMYVENNKNIYTYKSQDKYEIGEWCIVNFANRNKMALVLSEISETELTIDTAKIKF ISDKAPVLSVPSVIMELVKWIKDYYLSDYNSVIKAAYPGALKLNYSKKAVYVKDLEIKEK KEKLFEKEESGKSESIEKFNEYMKKKKEVTFQTLQKNFSEKIINKAIREKAVTIEKKIIT KSKIKENTELKSEILKDSVVLNDEQQKAVNRIENGNNKFYLIKGVTGSGKTEIYVNLIKN AMKENCGSIFLVPEISLTPQMIERLERQFSGSVAVLHSKLTDVEKRAEWSFIRRGEKKIV IGARSAIFAPVENLKYIIIDEEHENTYKQDTNPRYHVKNAAIKRALIEENVKVIFGSATP SFESYYQAKKGDLELIELKERFNNAKIPEYKVVDLNETPNNFSDELLKEISNTLAKKEQV LLVLNRKAFSNLLKCKECGNIPTCPNCSISLSYYKYDNKLKCSYCGYEERFDKKCKSCGS DKMIQIGSGTEKIEEELKEIFYDGKIVRIDSESVKTKKDYENLYNDFKNQKYNIMLGTQI IAKGFHFPNVTLVGIVNSDIILNFPDFRAAEKTFQLLVQSAGRAGREEKEGQVIIQTFNK ENEVIKKTVENDYEGYFEKEMEIRKLLNYPPYGRLIIIVLSSDEENGLEEKVKIFYNKIM SSVKRKIQPEKNEFISEPFKAPIYKINTRYRYQIFIKFNRNNITKIKNIIRATLNDYNEK NMRISVDVDPVSML >gi|261747182|gb|ADAD01000140.1| GENE 11 8561 - 10525 2792 654 aa, chain - ## HITS:1 COG:FN1155 KEGG:ns NR:ns ## COG: FN1155 COG0768 # Protein_GI_number: 19704490 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell 
division protein FtsI/penicillin-binding protein 2 # Organism: Fusobacterium nucleatum # 3 654 42 711 711 275 30.0 2e-73 MKERNSFRFRLMILVLFISAGFLWVIINLFYIQIVKGSEWQKTGEMQYKSEFTVKSKRGR IITNDGEVLAYDGETYDLILDPTLIDPENIDKLMELLKKNIISFDVNKVKNDISDKMRQN KKYLKLDYILGYNEKRAILDELSKDKSLKSGVFFETNYIRHYIRNKAFQETIGYMNTENK GVYGIEKTYNDQLTGVDGLIEGFRIPRKFTTITSLKDTKEVVLAKNGDNVILTVDSVLQY ALDEELRKTFEEYSAVSTMGILMEVETGKILAMSSYPKAENNAEVKNRPITDLFEPGSIF KPITVSTGLQLGVINQNSMIESSGHIRVADRVIKDHDSSTTGSLNLENLIAKSGNVGMVK IAQMMKTKDFYSYLEKVGLGKKTGIDTYSETTQKLLPLKDMTEVKKSNIAFGQGIAMTQM QIMMALNTVVNHGKLMKPYIVDHIEDDDGNIISQNSPVVLENVFSEEVSRLNRYYMEAVV TKGTGRGAQIEGYNIGGKTGTAQKSGTRGYETGKYFSSFFAFFPVENPKYAILITVNEPH GAYYGAAVALPSVKQVLEKLIKYKGINPQGIITAQKQQIAINSYTKKDLKKISEDFGKNL MPDLKGISMRELLSVYPQKKFPKYNIAGSGRVTDQWPAAGAKLDKNTEIKIVLE >gi|261747182|gb|ADAD01000140.1| GENE 12 10622 - 12055 2197 477 aa, chain - ## HITS:1 COG:FN0753 KEGG:ns NR:ns ## COG: FN0753 COG0064 # Protein_GI_number: 19704088 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog) # Organism: Fusobacterium nucleatum # 1 476 1 480 481 591 64.0 1e-168 MSMEYETVIGLEVHCQLKTKTKVWCSCNADYDNEVPNVSTCPVCTGQPGALPKLNEEVLN YAIKAALALDCKINGESQFDRKNYFYPDSPKDYQITQYFKPYAENGELHIVTNSGKETKV GIERIQIEEDTAKSIHTTSESLLNFNRASIPLIEIISKPEIKNAEEAYAYLNTLKDRLKY TKVSDVSMELGSLRCDANVSVRKKGDTALGTRTETKNLNSFKAVVRAIEYETNRQIEVIE NGGRVVQETRLWDEEQGVTRPMRSKEEAMDYRYFPEPDLPKVIISEERLENVRKEMPEFA DEKAKRFINDYKLNEKEAVILAGEPELAEYYEEVVTKSGEPKLSANWMLTEILRVLKEKN TGIEKFSVSAENIAKLITLIKSNVISSKIAKEVFEMLLSENKDPEIIVKEKGLVQITDNS EIEKIVEQVLEENKQSVEDYRAGKSNALKYLVGQAMRLSKGKANPQMINELILEKLG >gi|261747182|gb|ADAD01000140.1| GENE 13 12072 - 12761 623 229 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038778|ref|ZP_06012132.1| ## NR: gi|262038778|ref|ZP_06012132.1| putative DNA double-strand break repair Rad50 ATPase [Leptotrichia goodfellowii F0264] putative DNA double-strand break repair Rad50 ATPase [Leptotrichia goodfellowii F0264] # 
1 229 1 229 229 272 100.0 1e-71 MAKKEVKYKNEELDKAAEKFLSSLEEKIRNIEEIKQYYKNGKYKEDNFEKGKNLSEQYLS NRKKVEGSFKKFSNQRNIVKYITMKNTIGVLKDIEGKEVESDIIKLKLLFEMLNEKLFGT KLYLDTSKPFIIEEKDEVQYVNELKSIQKTIKNLLSDMKKTDVSKLEKENINSENYKKSV KEIEKISRETGKIIDNIEKRKNENLNEMISDYSKEINSVLEKLNSNLAN >gi|261747182|gb|ADAD01000140.1| GENE 14 12743 - 13180 299 145 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038784|ref|ZP_06012138.1| ## NR: gi|262038784|ref|ZP_06012138.1| putative liporotein [Leptotrichia goodfellowii F0264] putative liporotein [Leptotrichia goodfellowii F0264] # 1 145 1 145 145 210 100.0 3e-53 MILKNRKYLFLLVILLLVFGCGSKKSSEKNSKKNNAVKAKSYDEIPEIDFSNKDIDTATL QNIIIGKMENPELSKSYRNLDISYRLTFFPDSFHNDYEGTFLDNQNNFIIPNEEKINTFS KEIEGIGYEELIVKMNKVKEWQKKK >gi|261747182|gb|ADAD01000140.1| GENE 15 13164 - 13268 117 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQRQENDFSLEKYKEAEIVETTEEYKEKNYDIEK >gi|261747182|gb|ADAD01000140.1| GENE 16 13331 - 13435 169 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYKLIDLFAISEKNITEYNILLAGFPYQAFSLAR >gi|261747182|gb|ADAD01000140.1| GENE 17 13472 - 14935 396 487 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163737840|ref|ZP_02145257.1| 30S ribosomal protein S4 [Phaeobacter gallaeciensis BS107] # 7 459 8 435 468 157 30 7e-38 MELYKKTASEIAEMIKSKEITSEEVTKHFLERINLLEDKIGAFSSVLEEKALESAKIYDN DNNEEKRKNYDNTSLFGVPVALKDNILSKGDLTTASSKILANYEGIYDATVVERLKKAGV PIVGKANMDEFAMGSSNENSAIKSVSNPWDLERVPGGSSGGSAAAVAAQMIPIALGTDTG GSIRQPACLTGTVGIKPTYGRVSRYGLMAFGSSLDQIGALAKSTEDLARIMKIIAGYDEK DPTTADVEVPDYLKSINNDIKGLRIGLPKEYFAEGLDKNIKEVVNKAVEQLKGLGAEIKE VSLPYVEYAISTYYIISSAEAASNLSRYDGVRYGVRKSDDSVEDMYVKSRSEGFGPEVKR RIMIGNYVLSSGFYDAYYKKASQVRRLIKDDFERVLTEVDIILTPTSPTAAFKKGEKNTD PVQMYLADIYTVSINMAGVPAICVPAGFVDGLPVGIQLIGNYFKEDLLFNASHKFEEVRG KIEYPEI >gi|261747182|gb|ADAD01000140.1| GENE 18 15150 - 15452 543 100 aa, chain - ## HITS:1 COG:RC0195 KEGG:ns NR:ns ## COG: RC0195 COG0721 # Protein_GI_number: 15892118 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln 
amidotransferase C subunit # Organism: Rickettsia conorii # 1 93 52 150 151 73 42.0 6e-14 MLTKEDVLKIAKLSKLEFQEDEIEKFQTDLNKILEHMEILNNVDTTGVEPLFNVLDLKDR LRKDEVQSVDIKKELLKNAPNKDDDFIIVPKIVGGGNTDN >gi|261747182|gb|ADAD01000140.1| GENE 19 15442 - 16113 751 223 aa, chain - ## HITS:1 COG:SPAC823.14 KEGG:ns NR:ns ## COG: SPAC823.14 COG4359 # Protein_GI_number: 19114753 # Func_class: E Amino acid transport and metabolism # Function: Uncharacterized conserved protein, possibly involved in methylthioadenosine recycling # Organism: Schizosaccharomyces pombe # 4 209 3 218 229 66 27.0 4e-11 MEIEKKKLIFLIDFDITISKKDSTDMLLETHNPKYKIKLREQYKNKEISMREFVIFGLES LNITKEEYIKTLQDKVDIDVSFLDFIKSGVEFRIVSAGTRLNIQGTLLKYNVVLENEKII SNDINFEGKKITITNPFLDKEMYYGVDKKEAVENFKRQGYKVIFVGDGPSDYRAIETADF SFIRKNTRAIDFCKENNIKFKEFDNFNEILSYYNGKEVSKNAD >gi|261747182|gb|ADAD01000140.1| GENE 20 16123 - 16809 1122 228 aa, chain - ## HITS:1 COG:FN0756 KEGG:ns NR:ns ## COG: FN0756 COG1187 # Protein_GI_number: 19704091 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Fusobacterium nucleatum # 1 228 1 234 234 195 47.0 6e-50 MRLNKYMADAGVCSRRKADEMIKEGRVTVNKKEAVLGMEISSGDIVRVDGEKIKLNTVYE YYMLNKPKRVICSNEDRFGRRLAIDYIKSKKRLFTYGRLDYMTEGLIIISNDGEVYNHVM HPRKKLYKSYIAKLSREVEEKDMEAWKHGVVIDGKRTAPAKVKKIDKKEIRIAIFEGRNR QIRKMVEILGYTVESLKRVKVGELTLGYLQPGDYRALTEEEVSYLKSL >gi|261747182|gb|ADAD01000140.1| GENE 21 16796 - 17422 857 208 aa, chain - ## HITS:1 COG:FN0757 KEGG:ns NR:ns ## COG: FN0757 COG1386 # Protein_GI_number: 19704092 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing the HTH domain # Organism: Fusobacterium nucleatum # 6 168 2 164 181 130 46.0 2e-30 MDINENLEDRVETIVFLSKEQLTVEELAKFYEIELSKMEEILLNLKEKRKTSGINVKIEN GMIMLVSNPLYGEDVKRFFNPEMKIKKLTRSTMETLAIIAYKGPITKTEIEQIRGVSVEK TMANLLEKNLVYISGKKKTIGTPNLYEVTEDFYSYLSINDKTELPGFEQYEKIELLYKMK EEENGEIPESIMEKLEKKAETEVEDETE 
>gi|261747182|gb|ADAD01000140.1| GENE 22 17483 - 18523 1769 346 aa, chain - ## HITS:1 COG:FN0758 KEGG:ns NR:ns ## COG: FN0758 COG1077 # Protein_GI_number: 19704093 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Fusobacterium nucleatum # 16 346 20 353 353 315 49.0 9e-86 MKFFNFFKFSTKPTRDIAIDLGTANTVVYVKDEGILINEPTYVAINVKTDEVEHIGEKAK EIMGRTAKHTQIIRPLKNGVISDYEVTEKMLAEFLKRIRKDKFQSSRVIICVPSGVTQVE RRAVVEVVKEAGAKEVYLIEEPIAAAIGAGIDLFEPKGHLIVDIGGGTTEIAFIVSGGAA SSRSVKIAGDHLNEDIIEYVKEKHNLLIGERTAEELKVNTISLKDKNTEFEIRGRELGKG LPKSIKIVASEIDGAIDKNIDLIIDEIKLTMEGIEPEIAADIFETGIYISGGGAGIRILK EKIEEELKLQVTVCDEAIHAVVRGIAKVLDDFDSYKNIIISPTNEY >gi|261747182|gb|ADAD01000140.1| GENE 23 18594 - 21056 2891 820 aa, chain - ## HITS:1 COG:CAC1812 KEGG:ns NR:ns ## COG: CAC1812 COG1674 # Protein_GI_number: 15895088 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Clostridium acetobutylicum # 299 809 225 748 765 415 49.0 1e-115 MNKKIRGILLFVLGIFLIYLLINKNGIQASTKGQENFLAIFLNASSLLFGKMSWFVSILI TCWGIFDFFTRNIKLKFHKSKIAALIILFIADSMLLIKKSVVSPLPDSFTEAGRKLLEIG FNRQSGGLAGALTSMPLYGFMHLHIVEVILKVVTVICILVLLKEVLVLIYEILAGLVKYY SSEDYKKKLKILKAKKQAEKQEYIVSKKQRREELRERLIESRKQKLSFEISKKPKDSFLQ KTEMYSEAELIDKEKEWLELLENQKNKNRQKEEKIPEKRENIKPEKEEKHFDKSIEIEKN KDENKEKPISKNDNFEEKLEIKSEKVENEDLEELKKDFQEKVETEIKKESEIIQKSQENE NNDEYEELLKKSIEEIFKSKAMDPSKKKEIEKSIVENVSHLESVLKEFGINAKVVNYEYG PTITRYEVTIPKGVKVSKVTSLTDDIAMNLAAESIRIEAPIPGKNTIGIETPNKIKEPVH FSNIIRNPQLEKGALNVILGKNIVGQDRIIDIAKMPHLLIAGQTGSGKSVAVNTLISTLI TKKSEKEVRFIMVDPKMVELMPYNGIPHLLVPVIIDPQQAAIALRWAVNEMDNRYRQLME NGVRNIVGYNSLGYVEKMPYIVIIIDELADLMMVAAGSVEESIARIAQKARAVGIHLVVA TQRPSTDVITGMIKANLPSRISFALRSQIDSRTILDTPGAEKLLGQGDMLLLENGSSKLE RIQGAFISDDEVMKLTTALKTNKKVSYMEEILIETVEKGKETDPLFENAIDVIKQEGRVS ISLLQRKLNVGFNRASRIYEQLKENGIISDDNQLLIDDFD >gi|261747182|gb|ADAD01000140.1| GENE 24 21271 - 21693 
434 140 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038780|ref|ZP_06012134.1| ## NR: gi|262038780|ref|ZP_06012134.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 5 140 1 136 136 234 99.0 1e-60 EFRFLTEQFYKDYCNCGEMEKKKLRPYAVLCIVNYQGLTFAIPIRHNIKYNYAVKTVNNQ GLDLSKTVVIKNDKYIDNTKVAVINSVEYKILLSKKYFIEQKLEKYIKEYKKGLKNLSDS KYRNLCGKSCLKYFHAELGI Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:26:33 2011 Seq name: gi|261747181|gb|ADAD01000141.1| Leptotrichia goodfellowii F0264 contig00221, whole genome shotgun sequence Length of sequence - 326 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 326 347 ## COG3210 Large exoproteins involved in heme utilization or adhesion Predicted protein(s) >gi|261747181|gb|ADAD01000141.1| GENE 1 2 - 326 347 108 aa, chain - ## HITS:1 COG:FN0132 KEGG:ns NR:ns ## COG: FN0132 COG3210 # Protein_GI_number: 19703477 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 1 103 1812 1914 2462 97 58.0 6e-21 AGATVGYGHKLQTTGNGGSVSVTRSNMNTEETLYQNGRFTNVEEVHNGTKNMRLSGFNQE GGKVTGSIENLVEESKQNISKTTGSSSGVTLGIGSNGVPSSASISGSR Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:26:34 2011 Seq name: gi|261747180|gb|ADAD01000142.1| Leptotrichia goodfellowii F0264 contig00039, whole genome shotgun sequence Length of sequence - 213 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 16 - 92 88.2 # Ile GAT 0 0 + TRNA 116 - 191 93.8 # Ala TGC 0 0 Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:26:36 2011 Seq name: gi|261747171|gb|ADAD01000143.1| 
Leptotrichia goodfellowii F0264 contig00064, whole genome shotgun sequence Length of sequence - 7673 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 2, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 1588 2412 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) 2 1 Op 2 . - CDS 1626 - 2360 770 ## COG0666 FOG: Ankyrin repeat - Prom 2415 - 2474 8.9 3 2 Op 1 . - CDS 2485 - 2934 660 ## COG1225 Peroxiredoxin 4 2 Op 2 . - CDS 2951 - 5233 1243 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 5 2 Op 3 . - CDS 5294 - 5431 82 ## gi|262038797|ref|ZP_06012148.1| heavy neurofilament protein 6 2 Op 4 . - CDS 5428 - 5676 201 ## gi|262038798|ref|ZP_06012149.1| tRNA(Ile)-lysidine synthase 7 2 Op 5 1/0.000 - CDS 5657 - 6580 921 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control 8 2 Op 6 . - CDS 6597 - 7547 1052 ## COG1559 Predicted periplasmic solute-binding protein - Prom 7575 - 7634 3.4 Predicted protein(s) >gi|261747171|gb|ADAD01000143.1| GENE 1 1 - 1588 2412 529 aa, chain - ## HITS:1 COG:FN1793 KEGG:ns NR:ns ## COG: FN1793 COG1080 # Protein_GI_number: 19705098 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) # Organism: Fusobacterium nucleatum # 4 528 11 534 579 592 58.0 1e-169 MKRLTGIGASEGVSIGKVLLFIEEELVIPQEKSESTIESELTKLDEGLKKSKTQLIAIRE KVREKMGEDKASIFDGHIMLLEDEDLIMEVQEKIKGEGMPAAKALNEGIEEYCEMISQLD DPYLRERAADLKDIGKRWLKNLLGMKIYDLSNLEPGTIVVTYDLTPSDTAQLDLENCAGF ITEIGGKTAHSAIMARSLELPAVVGVKGVLGEAKDGDTVVMDGEAGELFLNPPAEIITEY KEKREVIKKEKEELKKLINEEAVTPDGRKVDIWGNIGKPDDVDAVIEAGATGIGLYRTEF LFMNSDHFPTEEEQYRAYRVVAEKMKGKPVTIRTMDIGGDKELPYLDLPKELNPFLGYRA IRISLENKDMFKTQLRAILRASQYGQIKIMYPMISSINEIRKANAILEECKKELDEIGKV YDRNIKVGIMVETPSTAIIAYKFAKEVDFFSIGTNDLTQYFLAVDRGNEMVSTLYNSFNP 
AVLEAIQKVIDAAHDANINVSMCGEFAGDKKATKLLLGMGLDSFSMSAS >gi|261747171|gb|ADAD01000143.1| GENE 2 1626 - 2360 770 244 aa, chain - ## HITS:1 COG:FN0178 KEGG:ns NR:ns ## COG: FN0178 COG0666 # Protein_GI_number: 19703523 # Func_class: R General function prediction only # Function: FOG: Ankyrin repeat # Organism: Fusobacterium nucleatum # 102 232 28 155 326 63 35.0 4e-10 MFLFVTVNIFSAKENTNNGKTIDFVEKFRLEQIIDEKLKFFFSAIRNHNNKYVKMLLNTE ENRQAIKKRELKLTDEKNNKRVYGVDLELEVFPLYDEKEVLIDVNSRDRYGYTPIIVAIE SGNNEILKLLIENGADLREKHPVFGRLTLHTACYFQNEEAAEILLKADPSLVNAKSGTDG WTPLQDATLRSNTRLVKLLLSYGANPMIKDDNGGTAIDMATQFGKGEIVKLLRDRVKEIR KENY >gi|261747171|gb|ADAD01000143.1| GENE 3 2485 - 2934 660 149 aa, chain - ## HITS:1 COG:BH0948 KEGG:ns NR:ns ## COG: BH0948 COG1225 # Protein_GI_number: 15613511 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Bacillus halodurans # 2 149 6 154 154 180 59.0 9e-46 MKAPDFELPADNGEIIKLMDYKGKTVVLYFYPKDSTPGCTQEACSFRDNFARITAKGVAV LGISKDSIKKHKNFKEKNELPFLLLSDENNNVCEKYQVWKEKMNFGKKYFGIERSTFLID NKGNIVKEWRKVKVNGHVDEVLKEIENLK >gi|261747171|gb|ADAD01000143.1| GENE 4 2951 - 5233 1243 760 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. 
McKiel] # 43 677 8 634 636 483 42 1e-136 MDDKNRDDIKKRLEELRKQNNRKDNNNKSPFGGSHIFFSIIILIVLSILFFWNGSIPNYF QQKKEVPYSEFITKVKNGTFKEVTEKDDKMLSEVKENKKSTIYITRKITERIGQDPNIIQ AVENKKIALNVAEPSTSGLVLAFIVNVLPLILMIGISIYFIRRIMGNSQGGGPGNIFGFG KSRADRLDKKPDVKFDDVAGVDGAKEELREVVDFLKNPEKYTKAGARVPKGVLLLGRPGT GKTLLAKAVAGESGASFYTISGSEFVEMFVGVGASRVRDLFEKAKSSTPSIIFIDEIDAI GRRRSTGKNSGSNDEREQTLNQLLVEMDGFETDTKVIVIAATNREDILDPALLRAGRFDR RIQVDAPDLQGRIAILKVHAKNKKLAADVKLEDIAKITPGFVGADLANLLNEAAILAARK SSDTIVMEDLDEAVDKIGMGLGQKSKIIKPEEKKLLAYHEGGHAIMSELTPGADPVHKVT IIPRGDAGGFMMPLPEERIVMGSKQILAKIKVAFGGRAAEELVLDDISTGAYSDIKHATM LARRYVESVGMSKKLGPVNLENSDDEFSFVSNKSNETAREIDLEIRKILSEEYENTLNTL RENRDKLDRVAGLLLKKETITGDEVRKIIAGATIEEVLNGEKNTEEKEEQNSENNDNVQV DHENKQQENNQAEKKTDENSEDVLLENNAEETYLQSTEVAENIIENDNLEEIVKSEEVSD INEANQKDDEDENDSHFGDNDEVFTEEKEEKKIEIPNFMK >gi|261747171|gb|ADAD01000143.1| GENE 5 5294 - 5431 82 45 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038797|ref|ZP_06012148.1| ## NR: gi|262038797|ref|ZP_06012148.1| heavy neurofilament protein [Leptotrichia goodfellowii F0264] heavy neurofilament protein [Leptotrichia goodfellowii F0264] # 1 45 1 45 45 65 100.0 1e-09 MIDEKIPKQKRDEIPVIECKKTVGNSEINQILAVGDIKISEYLKK >gi|261747171|gb|ADAD01000143.1| GENE 6 5428 - 5676 201 82 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038798|ref|ZP_06012149.1| ## NR: gi|262038798|ref|ZP_06012149.1| tRNA(Ile)-lysidine synthase [Leptotrichia goodfellowii F0264] tRNA(Ile)-lysidine synthase [Leptotrichia goodfellowii F0264] # 1 82 1 82 82 136 100.0 5e-31 MSENQDNYIKILKINQSIEWYNYKVYLCDKSEINNQFFLKDGIRYTFLEFSDADNKEIII RKRKEGDAILLSNLGHKKIKKF >gi|261747171|gb|ADAD01000143.1| GENE 7 5657 - 6580 921 307 aa, chain - ## HITS:1 COG:FN1977 KEGG:ns NR:ns ## COG: FN1977 COG0037 # Protein_GI_number: 19705273 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Fusobacterium nucleatum # 15 302 12 319 448 151 37.0 1e-36 
MFNFGKLKEKIYEIEKSNLIENGDKILLAFSGGPDSVFLYRILSFFQEKYSLELALMYVN HNLRHDVEKDLEFVKRFSEENGVQLYTESVNVNDYSSKNRKSTELAARELRYSALTEKAD DVGYDKIATGHNLDDNVETFIFRLLRGTSIKGLKGIPEKRENIIRPILSFEKKEILDCLK ANNEEYIKDYTNEQNDYTRNYIRNDIFPMFLRINPNFRRKIDELISEIKNKENGENKKAD FVKYLEENDIEINRRKINRIYSNIFDEEGNIDKKGTREFDLGQNRFLRKEYGNIEIVKKN RNVGKSR >gi|261747171|gb|ADAD01000143.1| GENE 8 6597 - 7547 1052 316 aa, chain - ## HITS:1 COG:FN1976 KEGG:ns NR:ns ## COG: FN1976 COG1559 # Protein_GI_number: 19705272 # Func_class: R General function prediction only # Function: Predicted periplasmic solute-binding protein # Organism: Fusobacterium nucleatum # 23 310 20 308 310 177 39.0 3e-44 MKKVLVAFNVFISIFIIVNLIFVYSFFMTKKEYKNVNVNIVKGTSFQQIFKDLKLNYGII DKIYLKMTGNSSNLKIGTYKFNGKYSRYEIIRKIQNKDSIGIKLTIPEGFTKRQVYNRIT ALGLGDEKEIDAALKEIDFPYPHENNNFEGYFYPETYIFTEGTTTKQVMKTILNEFLKKF PPEKYPDKKKFYEQLKLASIVEAEVSDMADKPKVAGVFLKRLEIGMKLESDATLKYELGR QALRDELKSNESVYNSYKVKGLPPTPIGNPPVETFKAVEKAEITDDLFFFTYKGKTYYSK THEEHLKKRKETGQLK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:26:47 2011 Seq name: gi|261747169|gb|ADAD01000144.1| Leptotrichia goodfellowii F0264 contig00210, whole genome shotgun sequence Length of sequence - 266 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 243 249 ## gi|262038806|ref|ZP_06012156.1| conserved hypothetical protein Predicted protein(s) >gi|261747169|gb|ADAD01000144.1| GENE 1 1 - 243 249 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038806|ref|ZP_06012156.1| ## NR: gi|262038806|ref|ZP_06012156.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 80 1 80 80 104 100.0 2e-21 RLFLEEGKTKKSISTEYNVSVASISNWVKQFRNECQNNEKANNEYNYMKENLRLRKELEE VKKENDFLKKAAAFFAKEID Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:27:01 2011 Seq name: gi|261747140|gb|ADAD01000145.1| Leptotrichia goodfellowii F0264 contig00097, whole genome shotgun sequence Length of sequence - 27234 bp Number of predicted genes - 28, with homology - 28 Number of transcription units - 7, operones - 6 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 95 - 154 10.9 1 1 Op 1 . + CDS 179 - 1054 707 ## Sgly_1266 hypothetical protein 2 1 Op 2 . + CDS 1064 - 3046 1731 ## gi|262038816|ref|ZP_06012165.1| putative viral A-type inclusion protein 3 1 Op 3 . + CDS 3067 - 3756 858 ## gi|262038832|ref|ZP_06012181.1| CRISPR-associated RAMP protein, Csm3 family 4 1 Op 4 . + CDS 3772 - 4722 934 ## gi|262038822|ref|ZP_06012171.1| hypothetical protein HMPREF0554_1510 5 1 Op 5 . + CDS 4715 - 5521 796 ## gi|262038808|ref|ZP_06012157.1| hypothetical protein HMPREF0554_1511 + Prom 5541 - 5600 6.4 6 2 Op 1 . + CDS 5630 - 7132 2037 ## CLK_3052 hypothetical protein 7 2 Op 2 . + CDS 7132 - 7863 673 ## COG1180 Pyruvate-formate lyase-activating enzyme 8 2 Op 3 . + CDS 7885 - 8253 543 ## CLJU_c23160 putative glyoxalase 9 2 Op 4 . + CDS 8278 - 8934 704 ## COG4912 Predicted DNA alkylation repair enzyme 10 2 Op 5 1/0.000 + CDS 8995 - 10278 739 ## PROTEIN SUPPORTED gi|149915191|ref|ZP_01903719.1| 50S ribosomal protein L27 11 2 Op 6 . + CDS 10319 - 11233 1022 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase 12 2 Op 7 . 
+ CDS 11254 - 11712 742 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) + Prom 11782 - 11841 6.3 13 3 Tu 1 . + CDS 11879 - 12574 797 ## COG2964 Uncharacterized protein conserved in bacteria + Term 12578 - 12637 11.4 + Prom 12699 - 12758 11.1 14 4 Op 1 . + CDS 12792 - 13121 471 ## COG1447 Phosphotransferase system cellobiose-specific component IIA 15 4 Op 2 . + CDS 13108 - 13518 393 ## CD3050 hypothetical protein 16 4 Op 3 10/0.000 + CDS 13552 - 13860 603 ## COG1440 Phosphotransferase system cellobiose-specific component IIB + Prom 13864 - 13923 7.4 17 4 Op 4 1/0.000 + CDS 13943 - 15280 2008 ## COG1455 Phosphotransferase system cellobiose-specific component IIC 18 4 Op 5 . + CDS 15308 - 16396 1766 ## COG3589 Uncharacterized conserved protein + Term 16407 - 16443 2.0 - Term 16625 - 16667 2.2 19 5 Op 1 . - CDS 16682 - 18019 240 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 20 5 Op 2 . - CDS 18059 - 18697 706 ## COG0572 Uridine kinase - Prom 18723 - 18782 10.3 + Prom 18673 - 18732 6.4 21 6 Op 1 . + CDS 18871 - 20406 2026 ## COG2317 Zn-dependent carboxypeptidase 22 6 Op 2 . + CDS 20443 - 21849 1853 ## COG1362 Aspartyl aminopeptidase 23 6 Op 3 . + CDS 21877 - 22740 1028 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase 24 6 Op 4 7/0.000 + CDS 22755 - 23294 853 ## COG2059 Chromate transport protein ChrA 25 6 Op 5 . + CDS 23291 - 23824 638 ## COG2059 Chromate transport protein ChrA + Term 23922 - 23972 -0.4 + Prom 23832 - 23891 8.4 26 7 Op 1 30/0.000 + CDS 24095 - 25198 1557 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components 27 7 Op 2 36/0.000 + CDS 25199 - 26068 918 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I 28 7 Op 3 . 
+ CDS 26068 - 26859 1006 ## COG1177 ABC-type spermidine/putrescine transport system, permease component II Predicted protein(s) >gi|261747140|gb|ADAD01000145.1| GENE 1 179 - 1054 707 291 aa, chain + ## HITS:1 COG:no KEGG:Sgly_1266 NR:ns ## KEGG: Sgly_1266 # Name: not_defined # Def: hypothetical protein # Organism: S.glycolicus # Pathway: not_defined # 9 291 62 345 355 225 42.0 2e-57 MIDDNKKNIENIIKNYVMYEKELINQLNFQIEHSTTIGGFREEIWKSFFERIVPKKFKVE RSVFIMDSEGSFSNEVDLAIFDEQYTPYVFKYGQIKFIPIEAVAAVIECKSTNFKNDKIR EWIDKIDNLSTSNDSIARIVSGVYNPMFDDRELAQTATAPIKIFCHTPKSKMRKERIEKL KDMDIIIEAIKSEKTEKQEMNIIFNNYSLENWYEKINHKKDEENERKILKFPENLKDKDL KKDYEIRNEQEKSIPLLSFIFQFNQLLMLLNNPMFFPHRSYVKLFNTYNEK >gi|261747140|gb|ADAD01000145.1| GENE 2 1064 - 3046 1731 660 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038816|ref|ZP_06012165.1| ## NR: gi|262038816|ref|ZP_06012165.1| putative viral A-type inclusion protein [Leptotrichia goodfellowii F0264] putative viral A-type inclusion protein [Leptotrichia goodfellowii F0264] # 1 660 1 660 660 950 100.0 0 MNSKLIAISVEKIQKYIYQRIDDMTSQGLNDEKTLSSICSASKTIADDILEVVSKKFEIN KEDYILEVSGKYIFKKDIEEDVLKRIEKEIFQEVYRKYDGQIFLKYTHFKAEKVDDDIEY IKKAIESLKCPKVKSKIIEENQDIIFNFLELGKIKEGNEEDKIESDSNEREEKIEYKHFV KSLDDLVSDKINTRGKIAVVKADLNNMGKIFEEIKSYNKYDRLSKILEETIKLKKFSEYL NQFNLENKILPFYIEGDDIFFAVQIGSLLNSIILLKVLIKELRERLEIEEIGKEINLTLS IGVTFVDNHQPVRYYRKSVEEELSKMKKIMKSEKKKDKAILGVSVAGVNLFYYDGNKGEG ETDGFSKFKEDVERLEYLKISNSYFHNLLQSIEREKDEKAQIRVLLYRLLPENIIGRKGK NELALKYYLLSQVLEKEDKSKEKSPRKLKLEKIKSKLIPKLRLFMLLTDERYSEKKEIDK EKIQKYINEKIMLKIKSDLFNKPVSYLYDNEIEKRESTILKLFIKNSKITKEIKDKNGKT KIKNMWVYKKAPFQTSFFYRIKNIIETEKEKSKKLNKVKTVFENINNTIKNNENRERKDN KKLGNKENTNKIDYKLNFNFDKFKELIEEEIDKTDWLDVLILFYSYTQQKIKYLTVNKKK >gi|261747140|gb|ADAD01000145.1| GENE 3 3067 - 3756 858 229 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038832|ref|ZP_06012181.1| ## NR: gi|262038832|ref|ZP_06012181.1| CRISPR-associated RAMP protein, Csm3 family [Leptotrichia goodfellowii F0264] CRISPR-associated RAMP protein, Csm3 family 
[Leptotrichia goodfellowii F0264] # 1 229 1 229 229 413 100.0 1e-114 MRKIKVKIETKSNCLIGNHTQSFSIGGVDQCTTVDNDGKPLIQASAFKGSTRFIVKSEGS DMEETKKFYKNYLEDLLKKYEKRKLENNLNSQNLEEIINKIKECINDLKPEYLFGIEGIN TTPRLYFSDLKVVDGRKNTEEYFIIDSKTAILEDGDEVVSNPRTYKAIRPGVKFEGEIIL RDFKNFVDIKKIAAELVEKLELFNDGYYRMGNSKSRGYGKISIETEIVE >gi|261747140|gb|ADAD01000145.1| GENE 4 3772 - 4722 934 316 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038822|ref|ZP_06012171.1| ## NR: gi|262038822|ref|ZP_06012171.1| hypothetical protein HMPREF0554_1510 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1510 [Leptotrichia goodfellowii F0264] # 1 316 1 316 316 513 100.0 1e-143 MKKYLINLKVASDITSFPDSQKIFGWLIYQIKKYESEENITKLVNNIYEKKIKCMISNVL PKGFIPMPKEYVIDKFGDKSKEIYEKIKKIDFIKKEDMKKILNNEVELKNFKDYLRQKQS YIQKFRLENQFHNLAGLENKAYTVPIVKIINESTEEIVTDFIIMIKTDSDMIIKWLENIK NAQENEGNDEEVYLGPKGSQGYNRFIRGKVEIKEEKNSEKANFYLNVGMLLPNSINYKKS YIDLHISDRKAFEITEEAKKVIGFINVGSVIYSEGENFNIGKSIQNKYNILYKNAIVFGN SYLEALDSEKFGGENE >gi|261747140|gb|ADAD01000145.1| GENE 5 4715 - 5521 796 268 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038808|ref|ZP_06012157.1| ## NR: gi|262038808|ref|ZP_06012157.1| hypothetical protein HMPREF0554_1511 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1511 [Leptotrichia goodfellowii F0264] # 1 268 1 268 268 392 100.0 1e-107 MNKYNLKFKSISKIVVSPRSLKILYKGVDFREDDLVKPYSNLEKEESNKITVIYPFYNYK DKLDEMIEKPEYYLPGSSIKGSIFPKEEDIRCDDIKLGKEDIVLDLLYKVIDYDEEKCQK SEKDLKIPKYKNFFENLGVEMINFGKEFETDIISVKKKNILEMLKKANEKTIDKLNRFTN KIEKVIFCLKKIKEKEAINELNKLEKILNEIKKIIKKYSKEKDKMITFIGGFKGKVGTIY QMKEDIKTGLYFDSKTYLPYGLVEIKIS >gi|261747140|gb|ADAD01000145.1| GENE 6 5630 - 7132 2037 500 aa, chain + ## HITS:1 COG:no KEGG:CLK_3052 NR:ns ## KEGG: CLK_3052 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A3_LochMaree # Pathway: not_defined # 3 497 2 495 498 670 65.0 0 MSDQILSIIKNRVLTYEQKLRTLAGAAEGTLSVLNITPDIQEYRDKGIICDLFEGNAPYR ARYITPDYEKFMKQGSKFLELEPAKDIWEATTNLLILYNHVPSITSYPVYLGNIDMLLEP 
YIKNEEEAYKAIKLFLLQIDRTITDSFCHANLGPEKTKAGEMILRATVELNTMTPNLTLK YDEDITPDDFAIQCVETSLKVAKPSFSNYKMFQKDFYGKDYAIASCYNGLSIGGGAYTLV RLNLAKLADEAKDEEDLLNNKLPDAVNKMARFMDERIRFLVEESGFFESSFLAKEDLIYR DRFSGMFGMVGLAECVNKIIGAEKKEDKFGWSKYADDLGVKIIDKLQELVHNYKVKYCEI TDNHYVLHAQVGIDTDHGISPGCRIPIGDEPVLPKHIKQTARFHPYFKSGIGDIFPFDEM ASKNPASILDIIKGAFKEGMRYFSVYSTDADVIRITGYLVKKSEMEKLYKNEQVMQQTVV FGLGARENSRILERKVRGNE >gi|261747140|gb|ADAD01000145.1| GENE 7 7132 - 7863 673 243 aa, chain + ## HITS:1 COG:HI0520 KEGG:ns NR:ns ## COG: HI0520 COG1180 # Protein_GI_number: 16272464 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Haemophilus influenzae # 5 243 11 261 262 144 33.0 2e-34 MKGIVNKIIPFSNVDGPGNRLSIFFQGCNFDCLYCHNPETIEVFGENKVPEEISVMEIDD ILKEIEEVAPFISGITVSGGECSLQWKFLTELFKAVKKRWERMTCFVDSNGSIPLWTEDK KEFLSVTDKIMLDIKAFDEKDHILMVGVSNENVIKNFKFLVEIGKIYEVRTVIVPEIIDN EKTVDNISKLIAEYDKNLKYKLLRFRQNGVRRDVLVAYTPNDDYMNNLKNIATKNGLTDV VII >gi|261747140|gb|ADAD01000145.1| GENE 8 7885 - 8253 543 122 aa, chain + ## HITS:1 COG:no KEGG:CLJU_c23160 NR:ns ## KEGG: CLJU_c23160 # Name: not_defined # Def: putative glyoxalase # Organism: C.ljungdahlii # Pathway: not_defined # 7 122 1 116 121 154 71.0 1e-36 MDSKLNIKDFDNFFLYVDDLKKAKEYYENKLGLKVKFDFSEMGMVAFNVGNNEPAIILKD KKKYPDSKPVIWFVVDDVLKEYEKMSIENIKFLSEPFSIRTGNTVEFEDPFGNRFGITDY KK >gi|261747140|gb|ADAD01000145.1| GENE 9 8278 - 8934 704 218 aa, chain + ## HITS:1 COG:Cgl0917 KEGG:ns NR:ns ## COG: Cgl0917 COG4912 # Protein_GI_number: 19552167 # Func_class: L Replication, recombination and repair # Function: Predicted DNA alkylation repair enzyme # Organism: Corynebacterium glutamicum # 19 218 23 208 208 139 41.0 3e-33 MNFDNLFEEMMKHKNEKEAEKMSAYMLNKFKYIGIKTPERRKIFKNFFKEYKKEEIINWD FINKCWENEYRELQYSAIDYLKEMKNFLTEKDIPKIKKLIITKSWWDTVDGIDVIVGETA LKYPEINKTLIKWSKDKNIWLRRIAIDHQLLRKEKTNTELLSEIIENNLNGTEFFINKAI GWALRDYSKTNPDWVTDFIEKNKDKMSKLSIKEASKYI >gi|261747140|gb|ADAD01000145.1| GENE 10 8995 - 10278 
739 427 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915191|ref|ZP_01903719.1| 50S ribosomal protein L27 [Roseobacter sp. AzwK-3b] # 2 346 3 339 345 289 44 2e-77 MFIDESVITVISGRGGDGAATFRREKFVQFGGPDGGDGGKGGDIAFLTDPNINTLVDFKS SKKFQAGDGERGAAARSTGKSGNDLIIKVPVGTMIRDFETNRLLLDLDRPNEKVIFLKGG DGGRGNIHFKSSVRKAPRIAESGREGAELKIKLELKLLADAALVGYPSVGKSSFINKVSA ANSKVASYHFTTLKPKLGVVRIGDEESFVIADVPGLIEGAHEGIGLGDRFLKHIERCKLI IHIVDISGIDGRSPEEDFLKINRELENYSEKLAKKPQIVVANKIDMLYDDKKYDSFEKFV KDRGIEYVYPVSVIANEGIKPVLMKAWELIKEIPREELEEEYSVDELLREMNKKDDWIIN KLEDHVFEVDGRIVDDVLKKYVFTGDEGIINFLQVMRTLGMEVELENAGVEEGDMIIIAG YEFEYVI >gi|261747140|gb|ADAD01000145.1| GENE 11 10319 - 11233 1022 304 aa, chain + ## HITS:1 COG:FN1917 KEGG:ns NR:ns ## COG: FN1917 COG0324 # Protein_GI_number: 19705222 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Fusobacterium nucleatum # 1 297 1 295 303 257 51.0 2e-68 MNDLLKGIVISGPTGVGKTDISIILAKKIHSDIISADSTQIYREMDIGTAKVTTEEMDGI KHYMLDIINPDEDYSVGDFEKEANRILKEKEKNKENIIITGGTGLYIKALTEGFSDLPSK DEKLRKELGKKTVEELCEELEKIDKKAYNDIDRNNKVRIVRALEVCILTGKKFSEIKTEN IKNNNYDFLKIFLTRNREEMYERINKRVDIMINKGLPDEVRKIYDKYGKNRHKITAIGYK ELFDYFDGAISLEKAVEEIKKESRRYAKRQMTWFRKEKDYITYNLSEKSQKETIEEILKK WEKI >gi|261747140|gb|ADAD01000145.1| GENE 12 11254 - 11712 742 152 aa, chain + ## HITS:1 COG:FN1491 KEGG:ns NR:ns ## COG: FN1491 COG1762 # Protein_GI_number: 19704823 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Fusobacterium nucleatum # 1 149 10 157 162 98 38.0 5e-21 MKIIDYLTEDRVKINLESRTKDGILGEMAQLFLKGNIVDPENMEQFINDLKDREKLSSTG LQDGIAIPHAKSEAVNKIALAVGTISEGAEFDCMDGEMSKIFFMIAAPESVKNEHLDLLA EISKLSYDEELLEKLENSATTEELMSLLKSFD >gi|261747140|gb|ADAD01000145.1| GENE 13 11879 - 12574 797 231 aa, chain + ## HITS:1 COG:YPO0626 KEGG:ns NR:ns ## COG: YPO0626 COG2964 # Protein_GI_number: 16120952 # 
Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Yersinia pestis # 16 198 1 180 196 112 40.0 7e-25 MNNNDTNSKLQKYIPLVDFFAAVLGKNSEVVLHDLTDPDHSIIAIRNNHISNREVGGPAT DLVLKILKESDREERSFIANYVGIGKFNKALKSSTYFIREEGELIGMICVNTDEAVFDGL FASMKKLQETFIKEKMETDEVQAPENLSRSIEEVAREAISEVLSTQNVSIEYLKQQDKLN IIEVLYLKGIFLLKGAVVEIAKALEMSEASVYRYVQMVKRQEQEEQKRKRK >gi|261747140|gb|ADAD01000145.1| GENE 14 12792 - 13121 471 109 aa, chain + ## HITS:1 COG:CAC2965 KEGG:ns NR:ns ## COG: CAC2965 COG1447 # Protein_GI_number: 15896218 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIA # Organism: Clostridium acetobutylicum # 1 103 1 103 104 79 42.0 2e-15 MDKEKLELIVFDIVNSAGTAKGLAYEALGEAEKGNYEEAEKLLKEADKSLLAAHNIQTEI IQAETSGDNMEVSVLFVHAQDHLMTAVEAKSLIECMIKMYRRIENLEKK >gi|261747140|gb|ADAD01000145.1| GENE 15 13108 - 13518 393 136 aa, chain + ## HITS:1 COG:no KEGG:CD3050 NR:ns ## KEGG: CD3050 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 33 131 2 97 102 79 41.0 4e-14 MKKNKNRGGCMKRKIKKYNAKQEEKLKEKEEKEKNDQKKKDSEIKPFSLQSAIGIILSSL AGSIVFPFLLSLVGVKDVRLGILLGNVFISSFGFTWVRHFIDSKKGFCKSFWIQYATFAV IFGIISYLWFYWAKFV >gi|261747140|gb|ADAD01000145.1| GENE 16 13552 - 13860 603 102 aa, chain + ## HITS:1 COG:VC1281 KEGG:ns NR:ns ## COG: VC1281 COG1440 # Protein_GI_number: 15641294 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Vibrio cholerae # 2 100 3 100 101 66 38.0 1e-11 MKILFVCSQGMSSAIAVNALEKEAKAKGVEIEVKAVSTQQFEEEVKNGYDAAMVAPQVRH RFDTLSAQAKEAGVPCEMIQPQAYSPLGGPKLLKQIQALIAK >gi|261747140|gb|ADAD01000145.1| GENE 17 13943 - 15280 2008 445 aa, chain + ## HITS:1 COG:VC1282 KEGG:ns NR:ns ## COG: VC1282 COG1455 # Protein_GI_number: 15641295 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # 
Organism: Vibrio cholerae # 10 445 23 433 446 258 35.0 2e-68 MLEKLSMQMAKLSEQRHLRAIRDGIIATLPLIIVGSIFLIVAFPPFPVDWGISLWAKKNI AQILLPYRMTMFIMALYAVMGIGNSLAKSYKLDGITGAILATSGFLLTIVPTIVKEILVV TQVIDGKEVQIIAEKGTEGATVLQEAAGFVMPMTNLGSAGLFVGIIIAIFAVEVYRFTTT TGFRIKMPDSVPASVARSFEALTPTLIVVLVVATVTYWLHINLHGIMKIIIEPLVKATDS WFSVIIIVFLITFFWSFGIHGVSIVGSLVRPLWLTLLDQNTAALADGKVIPHIAAEPFYQ WFIWIGGSGATIGLAILLATVSKSSYGKTLGRTSIVSSIFNINEPIIFGAPIVLNPILIP PFIIVPVVNSSIAYLCTSMGLVNKVTSTPPWTLPGPIGSFLATNGDYRAAILNIVLIISS AVIYYPFFKAYDKKLLEDEKNGEAE >gi|261747140|gb|ADAD01000145.1| GENE 18 15308 - 16396 1766 362 aa, chain + ## HITS:1 COG:L176316 KEGG:ns NR:ns ## COG: L176316 COG3589 # Protein_GI_number: 15672155 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Lactococcus lactis # 1 358 1 357 364 343 47.0 3e-94 MKKLGISIYPEKTDEKTLKQYIDKTAETGFSRIFSCLLSVNDEKEKIKEDFRKINSYAKS KGFEIILDVNPGVFSKFGITHDDLSFFREIEADGLRLDMGFSGLQESIMTYKGNGLKIEI NMSMDTHYIDTIMDYRPDKSNLIGCHNFYPHEYTGLGMDFFNKCTENFTKYGLRTAAFIT SQAENTFGPWPVTQGLPTLEMHRHLPIETQFRHFAALETIDDIIISNCYPTDEELEKLKN VRKDMASFAVELVPDIPEIEKKIVLEEFHFNRGDVSGNLIRSSNSRVKYKGHNFKLFNAP EIIKRGDIIIESSEYGHYAGELMIALKDMKNTGKSNVVGKIVDHEQFILDYIKPWQKFAF SM >gi|261747140|gb|ADAD01000145.1| GENE 19 16682 - 18019 240 445 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 86 444 119 456 458 97 26 1e-19 MKIVVIGGGAAGMMFSTQYKKANPEHEVFLFEKSPYVAWAGCPTPYYIADELSINDVVLG TAEDFIKRGVNVKIHHEVTGIDFKNKTLNVTGNEINGIFSYDKLVLAVGAKSFIPDIKGY SPELSNVFILSHAENAIEIKKYINENKATLKNALIAGAGFIGLETAESFNKLGLNVTVVE KSGEIFPSVSENLKKGIYSEIEKRGVSLKLNAGVAEIISGNNVAKAVKLDNGETLNFDIA LFSIGITPNIGFISDELETEKGKIVVNDKFETNISDVYAIGDCIFNKYYNTDRNLYAPFG DVANKHGILLSKYLSGENIHWKGLLRSYATSFYDIKLAQTGLSLDEATALGYNAEKIDMK AMYKNSGFADSVPAQVEIIYDKDKKVLLGGTMVGREAVAQFIDQIAIVITLETPIEKFID IDFAYSPTNASVWNPLLVTYRKVIK >gi|261747140|gb|ADAD01000145.1| GENE 20 18059 - 18697 706 212 aa, chain - ## HITS:1 COG:BH1275 KEGG:ns NR:ns ## COG: BH1275 
COG0572 # Protein_GI_number: 15613838 # Func_class: F Nucleotide transport and metabolism # Function: Uridine kinase # Organism: Bacillus halodurans # 1 207 1 205 211 204 48.0 1e-52 MDYKTIIVGIAGGTGSGKTSVTKAILEELNKTHINSILLEQDSYYKRHDELTYEERVKLN YDHLDSIDFDLLEEHILSLREGKPIEKPIYDFRIYNRVDETEHIEPANLIIVEGILILAV EKIRNLFDAKIFVDTDDDERLLRRIERDLKERGRSFDSIKKQYIATVKPMHLEFVEPSKR YADIIIPRGKENKVGIKMVSSRLKYLLNNLSD >gi|261747140|gb|ADAD01000145.1| GENE 21 18871 - 20406 2026 511 aa, chain + ## HITS:1 COG:BS_ypwA KEGG:ns NR:ns ## COG: BS_ypwA COG2317 # Protein_GI_number: 16079266 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent carboxypeptidase # Organism: Bacillus subtilis # 11 508 8 499 501 381 43.0 1e-105 MKNMTEKEKMKKIREVLENNKALGHAITLLHWDLETEAPKDAVERISKTLGYLTGENYSM IINDEFKNLVYSVNSEELDEVDKKIIKELKKEYFEKLEKIPKEDYMKYSELKVVSTKKWE EAKNKDDFNIFKPYLSEVIEYNKKFIKYRGYEGNPYDTLLDDYEPEITVKELDEFFNNIK NELSPFIKKIVEKKKSDKEIEDKKKLSEMTFDITKQKEMSLYALNLIKFDFNKGILKESE HPFTTNVNNKDVRITTHYYEHDLLSSVYSTVHEGGHALYEQHIDDKITDTILGTGVSMGI HESQSRIYENMFGRNKEFLKFIFPKIDEMFGLSKQGIDFESFYKLTNEVQTSFIRTDADE LTYPIHILIRYELEKDIFSNLDEKTDVDKLAEKWADKYEEYLGIRPESCKEGILQDVHWS DGLFGYFPSYALGSAYAAQIYDAINQKINIDKELEKGDFSKINEILKEGIHKYGKMKTPK ELIKNITGTSFDSKHYINYLKEKFSKIYDIK >gi|261747140|gb|ADAD01000145.1| GENE 22 20443 - 21849 1853 468 aa, chain + ## HITS:1 COG:CAC0607 KEGG:ns NR:ns ## COG: CAC0607 COG1362 # Protein_GI_number: 15893896 # Func_class: E Amino acid transport and metabolism # Function: Aspartyl aminopeptidase # Organism: Clostridium acetobutylicum # 41 466 7 432 433 483 55.0 1e-136 MGMSRNLLPFLNQYSVRKMIQTNKKIEKIQKRFEDMEVKSFAREVVEFIDDSPSTYHVVK NCSDVLEENGFERLNPKEKWELKKGGRYYLKRSNSTVIAFTLGENINLKKGFKIFGSHTD SPGFRIKPQPEIVTENLIRLNTEVYGGPILSTWFDRPLSIAGRVVVRSNNLFSPKTVGMK IDEPLMTIPNLAIHQNREVNNGVKIDKQTDTLPVIGLINDEFEKENYLLNLILDRTNVKK EDVLDFDLFVYSIEKGTLLGANEEFISAPKLDNLVSVYAGLLGLIEAEDIHDQINVFVGF DNEEIGSATKQGADSNYLLNTLERIVCSLGYGRSEFLQMLSCSFMLSADGAHAAHPAHME KTDPTSRGKINEGISVKISANQKYTSDGFSIAVLKQIIENTDIKIQPFVNQSNERGGSTI 
GPISSTHLDIDAIDLGVPMLAMHSVRELCGVYDVFYLKELAKEFFNKN >gi|261747140|gb|ADAD01000145.1| GENE 23 21877 - 22740 1028 287 aa, chain + ## HITS:1 COG:lin0819 KEGG:ns NR:ns ## COG: lin0819 COG0656 # Protein_GI_number: 16799893 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Listeria innocua # 3 283 2 270 274 300 54.0 2e-81 MKNSDKTVILLNGVKMPILGFGTWKIEDEKEAFNSVKEAIETGYTHIDTASFYKNEESVG SGIKEGLKSKGLKREDIFVTTKVWNTEQGYENTLEAFERSLKKLDTGYVDLYLIHWPVTK AYENEWRTKIKETWKAMEKLHKEGKIKAIGVSNFLVHHLEELLSDCEVKPMVDQIEFHPG HNQKETVEFCRKHNIAVEAWSPLGRGVVLDNEFLAEIAAKYNKTVAQICLRWIVQQGIAA LPKSTKKERIQSNFHIFDFELSEEDMKKITNMKPTGYSGSDPNLGRY >gi|261747140|gb|ADAD01000145.1| GENE 24 22755 - 23294 853 179 aa, chain + ## HITS:1 COG:FN0712 KEGG:ns NR:ns ## COG: FN0712 COG2059 # Protein_GI_number: 19704047 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 6 179 8 181 186 158 48.0 5e-39 MKIYFELFWIFFKIGAFTLGGGYAMVPLIQNEIVDKKKWIEKNEFIDLLALAQSAPGVLA INMAVFVGYKIKKYAGVFATVLGATLPAFLVIVLIAALFENFQDNPYVIKAFKAVRPMVV ALIGASVYTIGKQAKINRYTLIIVIIIALSVAYFKFPPILVIILGAVSGNIWMNRRAKK >gi|261747140|gb|ADAD01000145.1| GENE 25 23291 - 23824 638 177 aa, chain + ## HITS:1 COG:FN0713 KEGG:ns NR:ns ## COG: FN0713 COG2059 # Protein_GI_number: 19704048 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 5 177 2 173 176 128 50.0 6e-30 MNMGTLITLYTVFFKIGLFSFGGGYAILPLIQKEVVEIHNWITVSQFTDIVAVSQVTPGP ISINASTYVGYLVTGDVIGATVATLGLITPSIIIMIIFSRFFLKFKDNKYISNAFTGLRV VVVGLILSATLLLMDKNNFIDFKSIIIFLVTIVLFLKWKINPIIMTILAAITGILIY >gi|261747140|gb|ADAD01000145.1| GENE 26 24095 - 25198 1557 367 aa, chain + ## HITS:1 COG:FN1797 KEGG:ns NR:ns ## COG: FN1797 COG3842 # Protein_GI_number: 19705102 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase 
components # Organism: Fusobacterium nucleatum # 2 366 3 371 376 399 58.0 1e-111 MKKNLEIKNVGKTFGKEEILKNINIEIKKGEFFSLLGPSGCGKTTLLRMIAGFIRPDKGS ISINEKVIDHLNPNERNVNTVFQNYALFPNMTVFENVAFPLKIKKVSKEEIEKEVLKYLE LVGLKEHKDKMPSNLSGGQKQRVSIARALISKPDVLLLDEPLSALDAKLRQKLLIELDTI HDEVGITFVFVTHDQEEALSVSDRIAVMNKGEVLQIGTPNEIYEMPANDFVADFIGETNF IEGKVTEVYDKYGYAENEELGRFKIELDKPVKVGDNVKLTLRPEKIKVDTEPRYTDNDNY KVIKGTVDEVIYTGFQSKLFIKIDGVNRIINAYDQHRKFLTEEELFEWKEKVYFYWHYED AYLVEVN >gi|261747140|gb|ADAD01000145.1| GENE 27 25199 - 26068 918 289 aa, chain + ## HITS:1 COG:FN1798 KEGG:ns NR:ns ## COG: FN1798 COG1176 # Protein_GI_number: 19705103 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Fusobacterium nucleatum # 19 289 11 278 284 275 56.0 5e-74 MGKRTVKKINEKGPLKWVYYLPISVWMTLFFGLPTLIIIAFSFLKKGAYGGIAMPLKYTM AAYNLLFTSNDIIKVVVKTLNISVWITAVTLFLAIPVSYYISRSKYKNLWLLLIVIPFWT NFLVRIFSFIAILGNNGIVNQFLMKVFGLKTPLALLYNKNAVIIISVYVFLPYAILPLYS AIEKFDFSLLDAASDLGANKFQALMRVFIPGIKSGIVTAVIFTLVPAIGSYAVPDLVGGT DGIMLGNIIANRMFQLRDWPTASAISTVFIVITTFGVWLSMKMEKEEEE >gi|261747140|gb|ADAD01000145.1| GENE 28 26068 - 26859 1006 263 aa, chain + ## HITS:1 COG:FN1799 KEGG:ns NR:ns ## COG: FN1799 COG1177 # Protein_GI_number: 19705104 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component II # Organism: Fusobacterium nucleatum # 4 259 10 260 264 253 58.0 3e-67 MNEKRRISLFFFVLVMIFFYLPIVMLVVLSFNSSKSFSWTGFSLKWYIELFQKSPDLWQA FGRSLIIGITSSLFSVAVATFGAIGIKWYNSRIKNYVKFLNYIPLILPDLIIGVSLLIFF NMQIGIYKGVELGMTTIFIAHTTFNIPFSLFIIMARLDEFDYSIVEAARDLGASEVQTLF KVILPTMMPGIISAFLMAMTLSLDDFVITSFVTGTNSNTLPVQIYSMIKFGVSPVINALS TILIIGTIVLSLSSRKLQKYMLS Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:28:34 2011 Seq name: gi|261747124|gb|ADAD01000146.1| Leptotrichia goodfellowii F0264 contig00041, whole genome shotgun sequence Length of sequence - 17287 bp Number of 
predicted genes - 15, with homology - 15
Number of transcription units - 4, operones - 4 average op.length - 3.8
    N  Tu/Op     Conserved    S         Start      End   Score
                 pairs(N/Pv)
    1  1 Op 1    .            -  CDS       74 -    721     811  ## COG4146 Predicted symporter
    2  1 Op 2    1/0.000      -  CDS      755 -   1483     906  ## COG2188 Transcriptional regulators
                              -  Prom    1519 -   1578    12.9
    3  2 Op 1    .            -  CDS     1641 -   2669    1253  ## COG0524 Sugar kinases, ribokinase family
    4  2 Op 2    .            -  CDS     2688 -   3470     823  ## EF0454 hypothetical protein
                              -  Prom    3506 -   3565     8.3
    5  3 Op 1    3/0.000      -  CDS     3567 -   4670    1861  ## COG3839 ABC-type sugar transport systems, ATPase components
                              -  Prom    4707 -   4766     5.0
    6  3 Op 2    9/0.000      -  CDS     4769 -   5563     194  ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17
    7  3 Op 3    .            -  CDS     5585 -   6427    1171  ## COG3717 5-keto 4-deoxyuronate isomerase
    8  3 Op 4    .            -  CDS     6466 -   7299    1016  ## MPTP_0398 hypothetical protein
    9  3 Op 5    .            -  CDS     7334 -   9148    1846  ## COG4289 Uncharacterized protein conserved in bacteria
   10  3 Op 6    .            -  CDS     9163 -  10122     490  ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33
   11  3 Op 7    .            -  CDS    10156 -  12315    2189  ## COG2207 AraC-type DNA-binding domain-containing proteins
                              -  Prom   12337 -  12396    10.1
                              -  Term   12326 -  12375     2.8
   12  4 Op 1    .            -  CDS    12398 -  13588    1402  ## Smon_0127 glycosyl hydrolase family 88
   13  4 Op 2    .            -  CDS    13631 -  15031    1597  ## COG3119 Arylsulfatase A and related enzymes
   14  4 Op 3    .            -  CDS    15056 -  16627    2292  ## COG1653 ABC-type sugar transport system, periplasmic component
   15  4 Op 4    .
- CDS 16648 - 17286 751 ## Hfelis_14140 polysaccharide lyase family protein 8 Predicted protein(s) >gi|261747124|gb|ADAD01000146.1| GENE 1 74 - 721 811 215 aa, chain - ## HITS:1 COG:PM0597 KEGG:ns NR:ns ## COG: PM0597 COG4146 # Protein_GI_number: 15602462 # Func_class: R General function prediction only # Function: Predicted symporter # Organism: Pasteurella multocida # 1 212 1 212 559 282 75.0 3e-76 MFTVLLSFLFFTGLVALISWWKTKNDDLSTSKGYFLAGRGLTGLVIGCSMVLTSLSTEQL IGINAESYKGNFSIMAWTVQSVIPLCILAKFLLPKYIREGYTTIPEFFEKRYDKSTRLIM STLFLFFYLFIAIPVALYTGAIAFNQVFNIETLTGLSYAQSMWITIWAIGIVGGIYAIFG GLKAVAVSDTLNAIILVIGGFLVPLFGLRFLGTEV >gi|261747124|gb|ADAD01000146.1| GENE 2 755 - 1483 906 242 aa, chain - ## HITS:1 COG:BS_yvoA KEGG:ns NR:ns ## COG: BS_yvoA COG2188 # Protein_GI_number: 16080556 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 3 238 1 235 243 117 31.0 1e-26 MELNINYKSKKPIYLQIVHSIKKEIKEGKIKEGEKLVSENTLCKDYYISRNTVRQALDIL EKEGLIKKIKGKGSFVLSPKIYQNRTKLSRFYDDIKMSGLVPYSKILHSGKEIPDKNVKD KMKLNDDDFIYIINWVRYGNDEPLIYETIHLNYSLVSGIEKIDLNDDIKLYKILEKEFGI KPQNGQEQIYPCKLNAKEAKFLKMRQEDLGMRIERVLYKDKKIFEYTKSVIRGDRFIYTI DF >gi|261747124|gb|ADAD01000146.1| GENE 3 1641 - 2669 1253 342 aa, chain - ## HITS:1 COG:TM0067 KEGG:ns NR:ns ## COG: TM0067 COG0524 # Protein_GI_number: 15642842 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Thermotoga maritima # 7 329 3 325 339 209 38.0 6e-54 MSKMKSVLSLGEMLLRLSPENHKMIIQSGTNLLEMFYGGAELNVSIALANFGVKSGYITK VPDNSLGDKGIKYLKQYGVNTENIQIGGERIGIYFLENGYSVRPSCVTYDRKYSSFSEME MTDEEIRKALSDYDVLFISGITLSISEKTFKLSKQFIKIAKELEKEIIFDCNYRSKLISL EEASKRYREIIDYADVLFAGYLDFINIFKIENPEYDGKEDYTEYYKKLYTKVYEKYNFKY IISSIRKSVSSNRNEYSGIITDGKETIKGREYHIEIQDRVGTGDAFTSGAIYGYLSGKDK EGIINFAIGSGALKHTIHGDVSEFGVEDILQIINSKCFDIAR >gi|261747124|gb|ADAD01000146.1| GENE 4 2688 - 3470 823 260 aa, chain - ## HITS:1 COG:no KEGG:EF0454 NR:ns ## KEGG: EF0454 # Name: not_defined # Def: hypothetical 
protein # Organism: E.faecalis # Pathway: not_defined # 2 259 3 259 259 160 42.0 4e-38 MIILIYTVIIILATTLGAISGLGGGVIIKPLFDAVGFHNMSTIGFYSSVAVFTMSIVSII KQLRKGFSFEIKTVFWISFGSLVGGIIGENVFNEVTRAYENSTVKIIQAVCLGITLIGIL IYTLNKNKMKSYCLKNILYIFASGFFLGSISVFLGIGGGPLNIAVLMLLFSYEMKEATVY SIATIFFAQISKLGSVLVGGKLMNYDLSLLPFICFAAVIGGFMGTLINQKLNSSKIEKIY LGIIIGLIFVSIYNAYSAIV >gi|261747124|gb|ADAD01000146.1| GENE 5 3567 - 4670 1861 367 aa, chain - ## HITS:1 COG:YPO0609 KEGG:ns NR:ns ## COG: YPO0609 COG3839 # Protein_GI_number: 16120935 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Yersinia pestis # 1 365 2 369 372 439 60.0 1e-123 MSGVVLKKVEKQYPNGFKAVHGIDLEIKDGEFMVFVGPSGCAKSTTLRMIAGLEEITGGE IYIGDKLVNDVPPKDRGIAMVFQNYALYPHMTVYQNMAFGLKLKKTPKDEIDRRVREAAE KLEITELLDRKPKEMSGGQRQRVALGRAIVRKPEVFLFDEPLSNLDAKLRVSMRVRITQL HQELKTTMIYVTHDQVEAMTMGDRITVMRAGKIMQVDTPLNLYHYPANKFVAGFIGSPTM NLVDGVLKEKEGKVYIDIDGAEIELSHEKGEKVKGHLGKKVTFGIRPENISVTEHEDSIS KTGEISVVEQMGNEEYIYFTLNGHQMTCRINIEHVGDSVSKKGKRVFRFDTNKAHIFDVE TEENISL >gi|261747124|gb|ADAD01000146.1| GENE 6 4769 - 5563 194 264 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 10 254 4 238 242 79 25 2e-14 MDKLFDLTGKIALVTGASYGIGFSMAAGLAKAGARIVFNDINGELVEKGIKAYKESGIDA KGYVCDVTDEKAVNSMIAQIEKEVGIIDILVNNAGIIKRIPMTEMSVEDFRKVIDVDLNA PFIMSKAVIPGMIKKGHGKIINICSMMSELGRETVSAYAAAKGGLKMLTRNICSEFGEKN IQCNGIGPGYIATPQTAPLREKQPDGSRHPFDSFIIAKTPAGRWGTPEDLTGPAVFLASD ASNFVNGHILYVDGGILAYIGKQP >gi|261747124|gb|ADAD01000146.1| GENE 7 5585 - 6427 1171 280 aa, chain - ## HITS:1 COG:kduI KEGG:ns NR:ns ## COG: kduI COG3717 # Protein_GI_number: 16130747 # Func_class: G Carbohydrate transport and metabolism # Function: 5-keto 4-deoxyuronate isomerase # Organism: Escherichia coli K12 # 1 280 1 278 278 313 51.0 3e-85 MDIRYSVNQKDFKRYTTEELRNEFLIEKLYVENEIKAVYSHIDRMVTLGCMPTTEKVSID 
KGINSWENFGTDYFLERREIGIFNLGGKGKIEVDEKVYELDYKDCLYITKGAKKVYFTSD NKDKPAKFYMISAPAHCSYETTFISIKDAAKRELGSMETANKRTIYQFIHPDVIKTCQLS MGMTILEKGSVWNTMPCHTHERRMEIYTYFEVPEDNIVMHFMGEPTETRNVVIKNEEAVI CPSWSIHAGAGTSNYSFIWAMGGENIAFDDMDHLKPTELK >gi|261747124|gb|ADAD01000146.1| GENE 8 6466 - 7299 1016 277 aa, chain - ## HITS:1 COG:no KEGG:MPTP_0398 NR:ns ## KEGG: MPTP_0398 # Name: not_defined # Def: hypothetical protein # Organism: M.plutonius # Pathway: not_defined # 2 276 3 276 279 278 51.0 1e-73 MNISIEILDRNNEIKLGKNDYEKELRPMQVKGEDLVYLATQVMSIQSGDKICVKVEKEGQ YLFVKLDETLNESLIYLSGTEWIYEPLFVLNGLEAAPEEAFKAKRHYIQVRLAKDYEINA YRNLALNTHDQKEDSGAYPHAYANVETRNDSTFFAKNAIDGIFANNNHGSYPYQSWGINR QKDAALTIDFGRNVKLNQIGLTLRADFPHDSYWKQVTVSFDSGEKMIFDLKKESKPQYFS FSDIITKKIVLENLIKSEEDDSPFPALTQIECFGINA >gi|261747124|gb|ADAD01000146.1| GENE 9 7334 - 9148 1846 604 aa, chain - ## HITS:1 COG:SMb20536 KEGG:ns NR:ns ## COG: SMb20536 COG4289 # Protein_GI_number: 16264263 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Sinorhizobium meliloti # 54 566 60 572 617 376 37.0 1e-104 MSFDRKTFHFNTRKEIEDSLWKTIGPLEDFYKNRKGRLNIGSHGTVYKKDTRNIEAFLRI LWGLGSFWAQNPSEKWLDIYIKGIVAGTDPTNEDYFGDVTDYDQLLVEMTSIVVTFMLNK EKIWDRLTAKEQTKIANWLRQINEHNMPANNWHFFRILVNVCLKKMGVRFSQEKLDEDLE LINSFYVDNGWYYDGIKTQRDYYIPWAFHYYGLIYSVFMSEDDPVRSKLFKERAAEFAKS FAYMFDKNGNAIPFGRSLTYRFAQCAFWSALVFADVEALPWDQIKGLYVRNMESWYNKEI FTTDGILSVGYYYQNLIMAEGYNAPGSPYWALKSYLLLAVPDDHPFWKAVVKETKLSEEK YLNVSGRSMYVHTGNSNHAQMFPFGQSINYQPHCAAKYSKFVYSSVFGFSVPKSTYYYYE GAFDNVLAVSEDGLYFRPKEKDEMFEVNEDYLICKWKPYSDVLIYSTIIPCGDYHVRIHE INSERNLIACEGGFSNLYDKEERIEDSKSAKYISKIGTSEIFSVKGFDKGDVIRPEPNTN LLYERTLLPILTANLSKGKHILISIVGGIPKGNTAQKPDVKREDNQVVIITENKRISVNI NNKF >gi|261747124|gb|ADAD01000146.1| GENE 10 9163 - 10122 490 319 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 [Bacillus selenitireducens MLS10] # 4 313 5 312 323 193 34 8e-49 MRCFAGIDVGGTNSKIGLLDENGNILITESIKTESNKGPQDTIERIWKTVEKLAKEISVN 
IEDIEGVGVGIPGPVVDESIVKIAANFSWGNDFPAKKMFEEITKKRVKIGNDVKVIALGE QLYGAGKGYKNSITIPIGTGIAAGIIIDGRILAGTTGAGGEFGHIVINKKGHKCGCGLTG CLETYCSATGIVREAKIILKDNKESTLLEVVDNDLDKLEAYHIFEEAKKKDKIAMKIVDD FCDNLAHGIGTLLNIVNPEIIIFAGGVSKAGSIITDGVKKHLDKYALKMTTEDLKFAFSE LGNDAGIKGAAALIINDLK >gi|261747124|gb|ADAD01000146.1| GENE 11 10156 - 12315 2189 719 aa, chain - ## HITS:1 COG:BH0483 KEGG:ns NR:ns ## COG: BH0483 COG2207 # Protein_GI_number: 15613046 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 265 711 297 763 769 103 24.0 1e-21 MELKNIQELCVIHSKKMNNIIISILSCSLFIFMLFNIVNKISRINEEEKKNASNIFLNIN RKTQFIMSTLNDLKFDDKFNEYISKNENDQEYQYLKSALHRKIISGNNIYGSIGYLINFS TNNLNYYMTSTGIVDKKTFYKNNKFPYESIEKKIKSVSDDGNYINIFINDKGYLGLNKSF WVIKLDKNSFFYEIEDKKDWYLRIDNTISLSESENKNSEIIKNINHTEKKNSAFKENLKL KGRNLHVFYLSEIDAELYYYQPRINLIMIIAIEIIKSLLFFGLLYFTVAAVINLIIKPIK ELAEKIGYGNEKGTAEIDYIETKIEELSTTNVELTGKIKELSEIQKRKKLKDSLLGIPNE QNVEKILWEKEYRVILMEIDDVDSVKNIYDNFRLSKEFVMKYFLNEVAFEVIDIDFKSIV IVIENFPKEEMEPILDNLIFHIERNFSLYFVAAVTDKYNDLKDLPLAYRTAKRIMDYKYI YKQYKVIFQNIMNENQLNKYYYPIQLEAKLITKTLSSNTSEVKKIIQEIFGNDIEQKLIN KKAVKEFNGLLYNTLNQISIQLDEMNIGEINRSENFEKFLKITEIKELKEVFSERLIEIC KFIKENSEIDIRTIKNSIEKYMTENYMKDISLENLSEYLGYSFKYTSILFKKIMGDNFKN YLSLYRIKKAKELMEEKEYKIKELAELVGYNSSNTFIRMFRKYEGVSPGKYFFDTENEV >gi|261747124|gb|ADAD01000146.1| GENE 12 12398 - 13588 1402 396 aa, chain - ## HITS:1 COG:no KEGG:Smon_0127 NR:ns ## KEGG: Smon_0127 # Name: not_defined # Def: glycosyl hydrolase family 88 # Organism: S.moniliformis # Pathway: not_defined # 1 396 1 394 394 569 68.0 1e-161 MELNEKTMKRYSADTELSKEMLYGALHNALSKIDKNIEYFINLFPRPASVNNIYPGILNG GEWDDWTSGFWTGILWLAYEITKEEKYKKVAEFQLKSYKERIKNKNAVDHHDLGFLYIPS AVAQYKTTKSEEAKTTGIMAAEHLITRFREKGEFIQAWGVLGNPDNHRLIIDCNLNIPLL FWASEVTGEKKYREIAEKHVNTAAGTVVREDGSTFHTYYFDVETGKPLRGSTAQGSSDDS AWARGQAWGVYGFPLAYKYLKNEKFVSLYKKVTNYFLNKLPEDDICYWDLAFDDTSGEEK DTSAAAIAICGILEMDKNLSDTDKNKKVYMNAAKHMMKSLIEKYTTKDLRESNGLLKEAV YSKPHGIGVNECCIWGDYFYMEALVRFLKQDLKIYW 
>gi|261747124|gb|ADAD01000146.1| GENE 13 13631 - 15031 1597 466 aa, chain - ## HITS:1 COG:YPO0829 KEGG:ns NR:ns ## COG: YPO0829 COG3119 # Protein_GI_number: 16121138 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Yersinia pestis # 3 452 39 480 517 311 38.0 1e-84 MNKNMNRPNLLFLFADQWRRNSVGFMEKEDVITPNIDLFSKEALAFENAVSGCPLCSPSR ASILTGTYPITHGVWTNCKTGLDDVALKEDAVTITDILKKNGYNTGYIGKWHLDSPEMNK SENPVSGARDWDAYTPPGKKRHGVDYWYSYGAYDNHLKPHYWSNDEKMIEVDKWSVEHET DKAIEFLEKNKDKNVPFSLFVSWNPPHTPLDLVPEKYIKMYENKKLKVNPNVILKNVIDH TESMKEKLNFTEEEYQTVMKKYFAAITGIDENFGRIINYLKENDLYENTVIILTADHGEM LCGHGLWSKHVWFEESIGIPFLIKYGSKCLSGRTDTVISSVDIMPTILSLMNLDIPDTVE GTDLKNTLLEKEKNENKAFISSYPGQIPAIKEFEKINEDNKKYGWRAVKTKTHTYVINKG YKPGRETERLLYDNINDEYQLNPLKLNDSSENKLSEELEKLLKKLG >gi|261747124|gb|ADAD01000146.1| GENE 14 15056 - 16627 2292 523 aa, chain - ## HITS:1 COG:AGl3560 KEGG:ns NR:ns ## COG: AGl3560 COG1653 # Protein_GI_number: 15891902 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 29 519 26 516 522 352 41.0 1e-96 MKKKITYLLLIIAILLVACGKSQSKGDNKDSGKKLEGHLISENPKELTVFAIHLGKALNP DAPVYKKAAEMTNIKLKNVASQNQTDQKEAYNLLVSSGELPDIVAYEFTEDLEGLGIDGG LVPLEDLIDKYAPNIKKFWEENPRYKQDAIAADGHIYMIPNYYDYFNFMPSTGYYIRKDW IKKLNLKDPVTTDDLYNVLVAFRDKDPNGNGKKDEVPLFSRGDTVAKVLQPLADIFKARA FWYDDKDAVKFGPAQAEYKEAMKQLAKWYREGLIDKEIFTRGINARDYMLSNNLGGFTID WFGSTSSYNNKLGKTIQGFDFGIIAPATFNGDNKTYQARTTYLGGWGISSKSKNAIEAIK YFDFWYSPEGRRLWNFGIEGDDYKLVDGKPVFTDKILKNPQGKTPLAVIRESGAQFRLGM PQDSEYEKQWYVKEAVDAIDFYLKNNYVHELMPILKYTKDESKEFIKINTQLNSYVEEMS QKWIMGVSDVDKDWDSYIKRLNDIGLAKAEKIQLEAYKRFTKK >gi|261747124|gb|ADAD01000146.1| GENE 15 16648 - 17286 751 212 aa, chain - ## HITS:1 COG:no KEGG:Hfelis_14140 NR:ns ## KEGG: Hfelis_14140 # Name: not_defined # Def: polysaccharide lyase family protein 8 # Organism: H.felis # Pathway: not_defined # 11 212 548 747 748 120 34.0 4e-26 
VTGIKKVPGTKGTYINFTDKKTNENIGYKILNTPILTINIEKRKGSWKEIGGKSQDIKEK TYFKVYAEHGKNPKDSKYAYIVLPMFSEEEVKNYNEDNIKIIRMDNDVHAIEDKQRKITG INFWGSKEITIAGIKAVSPISIVKTEKGNEIILAVSDPTQLSKYETIIELDGEYSLLSQN NSVKLNVKNGKTEIKVNLVNQGKSVVINLKKI Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:28:58 2011 Seq name: gi|261747103|gb|ADAD01000147.1| Leptotrichia goodfellowii F0264 contig00012, whole genome shotgun sequence Length of sequence - 15550 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 6, operones - 4 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 85 - 160 85.4 # Thr TGT 0 0 1 1 Op 1 . - CDS 214 - 477 448 ## Lebu_0475 integral membrane protein 2 1 Op 2 . - CDS 546 - 962 500 ## SPJ_1866 metal dependent phosphohydrolase, HD region 3 1 Op 3 . - CDS 949 - 1158 325 ## gi|262038854|ref|ZP_06012201.1| hypothetical protein HMPREF0554_0161 4 1 Op 4 . - CDS 1133 - 1723 567 ## SEQ_1757 phage Mu protein F like protein 5 1 Op 5 . - CDS 1777 - 2445 731 ## COG0560 Phosphoserine phosphatase 6 1 Op 6 . - CDS 2465 - 3571 1390 ## COG2804 Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB 7 1 Op 7 . - CDS 3606 - 4076 496 ## Lebu_2016 hypothetical protein 8 1 Op 8 1/0.000 - CDS 4083 - 4688 769 ## COG0009 Putative translation factor (SUA5) 9 1 Op 9 11/0.000 - CDS 4681 - 5670 1642 ## COG0462 Phosphoribosylpyrophosphate synthetase 10 1 Op 10 . - CDS 5671 - 7011 1944 ## COG1207 N-acetylglucosamine-1-phosphate uridyltransferase (contains nucleotidyltransferase and I-patch acetyltransferase domains) - Prom 7040 - 7099 16.2 11 2 Op 1 . - CDS 7134 - 7955 1197 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control - Term 7984 - 8032 5.2 12 2 Op 2 . - CDS 8037 - 8666 1015 ## COG0457 FOG: TPR repeat - Prom 8785 - 8844 12.0 + Prom 8795 - 8854 9.9 13 3 Tu 1 . 
+ CDS 8883 - 9338 299 ## Lebu_1811 hypothetical protein - Term 9339 - 9382 7.5 14 4 Op 1 1/0.000 - CDS 9523 - 10824 2009 ## COG1158 Transcription termination factor 15 4 Op 2 . - CDS 10853 - 12178 572 ## PROTEIN SUPPORTED gi|227372256|ref|ZP_03855738.1| SSU ribosomal protein S12P methylthiotransferase - Prom 12230 - 12289 5.2 - Term 12250 - 12293 4.2 16 5 Op 1 . - CDS 12311 - 12745 689 ## Lebu_1971 hypothetical protein 17 5 Op 2 . - CDS 12771 - 13409 806 ## Lebu_1972 hypothetical protein 18 5 Op 3 . - CDS 13419 - 14129 240 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 19 5 Op 4 . - CDS 14158 - 14631 282 ## PROTEIN SUPPORTED gi|223039866|ref|ZP_03610150.1| 30S ribosomal protein S16 - Prom 14661 - 14720 6.7 20 6 Tu 1 . - CDS 14722 - 15549 1155 ## COG0726 Predicted xylanase/chitin deacetylase Predicted protein(s) >gi|261747103|gb|ADAD01000147.1| GENE 1 214 - 477 448 87 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0475 NR:ns ## KEGG: Lebu_0475 # Name: not_defined # Def: integral membrane protein # Organism: L.buccalis # Pathway: not_defined # 1 86 1 86 87 131 86.0 1e-29 MDKKKINTIVGSIGAFIGIFVFIAYIPQIIANIQGVKAQPWQPLFASFSCLIWVVYGWTK EPQKDYILIIPNTAGVILGFLTFITAF >gi|261747103|gb|ADAD01000147.1| GENE 2 546 - 962 500 138 aa, chain - ## HITS:1 COG:no KEGG:SPJ_1866 NR:ns ## KEGG: SPJ_1866 # Name: not_defined # Def: metal dependent phosphohydrolase, HD region # Organism: S.pneumoniae_JJA # Pathway: not_defined # 14 136 13 135 137 114 50.0 2e-24 MQLIKAFILCIIKHWKQKDKAGKRYVYHPIYVMLNVKGYKEKVVALLHDIVEDTDVTIDK LKKAGFSKEIVKAVEVITKKKNQEYFRYLTKVKNNKIAKSVKIEDLKHNMNLKRIKNISG KDYKRFEKYKEALKILEN >gi|261747103|gb|ADAD01000147.1| GENE 3 949 - 1158 325 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038854|ref|ZP_06012201.1| ## NR: gi|262038854|ref|ZP_06012201.1| hypothetical protein HMPREF0554_0161 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0161 [Leptotrichia goodfellowii F0264] # 1 69 1 69 69 107 100.0 3e-22 
MGKYYLTSEKRFLVKIVKDIAWEYNFITKKWNENFYWWDKIFVDSYTDFEEITEEQAQKI IKGERYAVN >gi|261747103|gb|ADAD01000147.1| GENE 4 1133 - 1723 567 196 aa, chain - ## HITS:1 COG:no KEGG:SEQ_1757 NR:ns ## KEGG: SEQ_1757 # Name: not_defined # Def: phage Mu protein F like protein # Organism: S.equi_equi # Pathway: not_defined # 16 189 348 521 521 133 43.0 5e-30 MTERDELEKLYNDFVLKEPKITEEIEEIVKKSRGFLTGLENKIKTKESLLRKIEIETLKE EITEYKALKKIQDILRYTVILNLENFVEDYYSIVSLLSKKNYILIKVGNTWKNGNVYKGI NTVLEKDDIKIEIQYHTEESYNLKEKILHKLYEEYRDTSTVKSRKKELQKEMKKISLKIK NPKGIGDINGEILFNK >gi|261747103|gb|ADAD01000147.1| GENE 5 1777 - 2445 731 222 aa, chain - ## HITS:1 COG:CAC2227 KEGG:ns NR:ns ## COG: CAC2227 COG0560 # Protein_GI_number: 15895495 # Func_class: E Amino acid transport and metabolism # Function: Phosphoserine phosphatase # Organism: Clostridium acetobutylicum # 7 222 4 213 213 64 30.0 2e-10 MGKKINIIVYDFDKTIYGGESGTNFFTYYFKKYPIKASVFSFIYLKEVFLYLIKVINLKQ LKERFYVFLEKYSNEEIQEIIDGFWKEYSKKNYKWTGEELKKNKAESDIVIVSSATPLFL IEKFVLSLGYDAVFGTDFEKGKNGEFVSKIKGENNKGIEKVHKLSRWAKDNNIEYRITKF YSDSLADKPLFDIAEKKYWIKKGEKIEGMPEKRTIIDVLFWK >gi|261747103|gb|ADAD01000147.1| GENE 6 2465 - 3571 1390 368 aa, chain - ## HITS:1 COG:PA0686 KEGG:ns NR:ns ## COG: PA0686 COG2804 # Protein_GI_number: 15595883 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB # Organism: Pseudomonas aeruginosa # 4 331 105 433 469 276 41.0 4e-74 MKNVISYVNVILETGIAELASDIHIKYDNEESEIKFRIDGILTEISKLYDKIGKQILTKN IIEIISRIKILSEMNVAEKRKPQDGSFTFIFNNENYDIRAAFMPTVNGESIVLRILKSYL ENTELETLGFSEKSRANLEKILKRKYGLILVSGPTGSGKSTTLMSMINMLNDGKKKIITV EDPVENKIKGIVQVQVNEEIGVTFSEILKNTLRNDPDIIVISEIRDEVTAEIAVRAALTG HLVIATVHTNDAVSTVTRLADMGIQKYLILDSVIGVVGQRLVGRKCDYCSGEGCEKCRKG YNGRISVNEILIINEEVKNILKSENSNQEYKNKLKKLQNNEFIDFEEDIKEKIEKNLIFE KDVFDFLN >gi|261747103|gb|ADAD01000147.1| GENE 7 3606 - 4076 496 156 aa, chain - ## HITS:1 COG:no KEGG:Lebu_2016 NR:ns ## KEGG: 
Lebu_2016 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 156 1 156 157 141 60.0 9e-33 MAKMINPNTINDMTLMNAKVQIRMNELLQKIGRGKRKVKVTLSKSTRSYLNKLTEEMKKQ MKDYEKQRPNLFQFFNYLEKETAVTKANKKEKTKEITLSYEELDFLKFQIKETVKGIDNT RSKLKWYNFLKKGLYKTLRKQNEVTLEELGKTSVSR >gi|261747103|gb|ADAD01000147.1| GENE 8 4083 - 4688 769 201 aa, chain - ## HITS:1 COG:FN1993 KEGG:ns NR:ns ## COG: FN1993 COG0009 # Protein_GI_number: 19705289 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Fusobacterium nucleatum # 8 188 21 202 217 113 40.0 3e-25 MNNEIKKVTEILLEGGVAVFPTDTVYGIGALPREKSVIKIYEIKKRDFSKKIIALIEDIS MLNRIINETPENIEKISPVLKKFWPGELTVIFKANTEFTGKFDRELETIGVRIPKNETAL SIIKNTGGILLTTSANISGEEAVTELDKLSKEIREKADYVIGDDSKLTGKPSTIIKYENG NIELLRKGNISSEDIKKAMKG >gi|261747103|gb|ADAD01000147.1| GENE 9 4681 - 5670 1642 329 aa, chain - ## HITS:1 COG:FN1992 KEGG:ns NR:ns ## COG: FN1992 COG0462 # Protein_GI_number: 19705288 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Fusobacterium nucleatum # 8 316 7 315 316 390 63.0 1e-108 METSSEQIKIFAGSSSKALAEKIAQHLDMKLSSVQLIRFADGEIFVKPNESVRGCKVFVV QSTSGPVNENIMELLIFIDALRRSSAKEIVAVIPYYGYARQDRKASPREPITSKLVANLL TVAGATRVITMDLHARQIQGFFDIPVDHMEALPILAKHFISYGFHPEDTVVVSPDVGGVK RARGLARWLHTPLAIIDKRRPKANESEVMNIIGDVQGKKAIIIDDMIDTAGTICNAAKAL KEKGAVEVYACATHAIFSGPAIERLKNAPLNKIVVTDSIELPEEKIFDKLRILTTSKMFA ETIKRIYWNEAISDLFEMPAEERETKVNE >gi|261747103|gb|ADAD01000147.1| GENE 10 5671 - 7011 1944 446 aa, chain - ## HITS:1 COG:FN1991 KEGG:ns NR:ns ## COG: FN1991 COG1207 # Protein_GI_number: 19705287 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylglucosamine-1-phosphate uridyltransferase (contains nucleotidyltransferase and I-patch acetyltransferase domains) # Organism: Fusobacterium nucleatum # 1 446 1 446 446 494 59.0 1e-139 
MISLILAAGKGTRMKSEKPKVLHEVNGTPMLKRVLKTLENTGIEKNVFILGHKKDAVLEA MGNLEYVEQKEQLGTGHAVLIAKEKIEKYKDDVLITYGDAPLLREETINRMKEIFYEKDL DCILLSCKMKDPFAYGRIIKKDGKVVDIIEEKEATEEQKKIKEVNTGVYIFKYKSLVEAV EKIDNNNIKGEYYLTDTIKILSGAGYKLESYQIEDEDEVLGVNSKAQLAQAGKILRDRKN LELMDDGVILIDPETTYIEEQVKIGEDTVIYPNVIIQGETEIGKNCKILGNTRIENSVIA DNVKIESSLIEQSRLEEGVTVGPFAHLRPKAHLKKNVHVGNFVEIKNSVLEEGVKSGHLT YLGDAEVGKNTNIGAGTITCNYDGKNKHKTIIGENAFIGSNSTIVAPAEIGEKAFTAAGS TITKKVPEKALAFGRAKQTNKEGWNK >gi|261747103|gb|ADAD01000147.1| GENE 11 7134 - 7955 1197 273 aa, chain - ## HITS:1 COG:FN0868 KEGG:ns NR:ns ## COG: FN0868 COG0037 # Protein_GI_number: 19704203 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Fusobacterium nucleatum # 15 273 16 274 277 360 66.0 1e-99 MVNLVCEAIIPDGPMKEVKEIENSIRTTYKKSIWGKFVKAVNDFDLIEDGDKIAIGVSGG KDSLLLVKLFHELKKDRSRNFEFKAVSLNPGFRESDLSNFKRNLENLNIDCAIIDTNIWE IANEKAKDYPCFLCAKMRRGILYKKVEELGYNKLTLGHHFDDVIETTMINMFYAGTLKTM TPKVKSTSGNLSLIRPLIYVREADIIEYTKENGIRPMNCGCTIEAGRTSSKRREVKDLLA SLEEKNPGIKQSIFNSMRNINLDYVFGYSGGDK >gi|261747103|gb|ADAD01000147.1| GENE 12 8037 - 8666 1015 209 aa, chain - ## HITS:1 COG:MA3704 KEGG:ns NR:ns ## COG: MA3704 COG0457 # Protein_GI_number: 20092504 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Methanosarcina acetivorans str.C2A # 15 188 284 438 1004 61 27.0 9e-10 MLETLIFREIRYNNNNEEMLEKFKNKIIDNPDDVESLQTLASIYHALKKNNKAIEIYKKL VKLEPENHEIRAFLGYLYYENEELDKAEENLNEALDISSGEPFVLFLLGNVYARRGKISE AVDCYDLAIFLDFDMYTAHIDFARKYEHMGRHGRALKEFKAAYEIDSRDEGLVEKIKYIE NKCKAKGKCECEFDDHKLLITADKLALNI >gi|261747103|gb|ADAD01000147.1| GENE 13 8883 - 9338 299 151 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1811 NR:ns ## KEGG: Lebu_1811 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 35 146 43 155 155 112 45.0 3e-24 MKNKVMLPFIFCFLLLSFNIFSSERISFKDFSQEIEKNIKDGAMEYIKNQQCFVSYNNIV 
FKNFYINHVKLYATCIEIDPAKRIEWDYPKIPENAEVKTIYIFGEDMAEKIRDDLKRKKM GFLFTKLAYYKTSHEKYKKNEYFYFESSAVK >gi|261747103|gb|ADAD01000147.1| GENE 14 9523 - 10824 2009 433 aa, chain - ## HITS:1 COG:BH3781 KEGG:ns NR:ns ## COG: BH3781 COG1158 # Protein_GI_number: 15616343 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Bacillus halodurans # 28 431 5 410 423 398 52.0 1e-110 MKSDEKIEKANNEEVTKNEVKSDDDRLIKEILGMKITSLKKKAKEYGIPNFSVMNKSDLV NYVLVEKGNEEGKIYGFGTLDIIGEGNYGFLRNTSVGPDVYVSLSQIKRFFLRNGDVVFG ELRVPIAAEKNYGILKVLLVNGDLAEKSLERPFFDDLIPSYPDEKINLGEGELASRIIDL VSPIGKGQRGLIVAPPKAGKTVLLSTLANDIIKYNPEIDVWILLIDERPEEVTDIKENVK DAEVYAATFDENPSVHTQVTENVLEMAKREVERGKDILILMDSLTRLARSYNITIPSSGK LISGGIDPNALYYPKRFLGAARNIKKGGSLTIIATALIETGSRMDEVIFEEFKGTGNMEI MLNRALEQLRIFPAIDVMKSGTRREELLIPKNQLEKIWKLRRELSKESEVEGMKKLIELV KKYKSNEELLENI >gi|261747103|gb|ADAD01000147.1| GENE 15 10853 - 12178 572 441 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227372256|ref|ZP_03855738.1| SSU ribosomal protein S12P methylthiotransferase [Veillonella parvula DSM 2008] # 1 389 1 389 449 224 33 2e-58 MGKRATIITYGCQMNVNESAKMKQMLQTMGYEITEDINESDLVFLNTCTVREGAAVKVYG KLGDLKRIKEEKNGNMIIGVTGCLAQEVRDEFIKKTPYVDLVLGNQNIGRIPDILERIEK GEETHIVMVEDEDELPKRVDADFGDDIVASISITYGCNNYCTFCIVPYVRGMERSVPLHE VLRDVEFYTKRGYKEILFLGQNVNSYGSDLADENDNFATLLNESAKIEGDFWIKYVSPHP KDFNDEVIEAIANNSKISRMLHLPLQSGSTKILNAMNRGYTKEEFIQLAKKIKERIPDIG LTTDIIVGFPGETDKDFQDTMDVVEEIGFENAFMFMYSKRTGTPAATMEEQVNEQTKNER LQQLMRLQNRKAKEESQKYLGKVVKVLVEGPSRKNPEMLTGRTSTHKIVLFRSDRKDLKG QFVHTKIYEAKTWTLYGELVE >gi|261747103|gb|ADAD01000147.1| GENE 16 12311 - 12745 689 144 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1971 NR:ns ## KEGG: Lebu_1971 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 12 140 12 133 146 104 45.0 1e-21 MKKIAVFLLMCMMFTLGSAATKKTTSGRKKVTDKEYSGQYIKATNTFTYKKDNLIFKDKL LTYQLQDNSGLLEGIYEFIGQNYGVTDDDIIGLNVKGRVSGDTLIVTKITNYRIPEDKLH KGEDVDPLAVPSDTTQDQSENVGQ >gi|261747103|gb|ADAD01000147.1| GENE 17 12771 - 13409 
806 212 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1972 NR:ns ## KEGG: Lebu_1972 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: DNA replication [PATH:lba03030] # 1 212 1 210 210 262 67.0 9e-69 MSKKPKIYAYLLTETNEKGIIYTWEECQNKVKGKKARYKSFKTEDEASAWLESGAEYESK EKKDEKFDELISKLDRKAVYFDAGTGRGKGVEVRITDFNGESLLYKVLDKKKINDFGNYY LSKGRTNNFGELTGLYAALKYALKYDINKICGDSSLIIDYWSKGRYNKDGLEEDTINLIK KVTLMRSEFEKKKGKIEKISGDVNPADLGFHK >gi|261747103|gb|ADAD01000147.1| GENE 18 13419 - 14129 240 236 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 2 232 1 238 242 97 30 7e-20 MLKGKIALVTGGARGIGKEIVLRFAENGATVISGDLIDPDYSHENVSHVKLNVTDRENIK EVAAQLKEKYGKLDILVNNAGITRDSLLQRMKEQDWDLVVDINLKGVYNVMQGMVSLLLK SNGASVINMASVVGLDGNAGQTNYSATKGGVIAMAKTWAKEFGRKNLRSNAIAPGFIKTD MTHELPEKVVESVLENTPLRKMGEASDVADAALFLASDLSGFITGQVIRVDGGLNL >gi|261747103|gb|ADAD01000147.1| GENE 19 14158 - 14631 282 157 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|223039866|ref|ZP_03610150.1| 30S ribosomal protein S16 [Campylobacter rectus RM3267] # 16 156 25 167 286 113 40 9e-25 MKKIITILMGLILTLFISLPIKGNNKKLTDDLKKSENKIKTTENRDDIYDEEIKEELAKY SYFQTGTASFYGGKWHGRKTANGEIFDTYKLTAAHKTLPFGTRVRVTNLSNGKSVVVRIN NRGPYSKGRIIDLSQAAFSKIENMSKGVTKVKLEIVK >gi|261747103|gb|ADAD01000147.1| GENE 20 14722 - 15549 1155 275 aa, chain - ## HITS:1 COG:CAC3017 KEGG:ns NR:ns ## COG: CAC3017 COG0726 # Protein_GI_number: 15896269 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Clostridium acetobutylicum # 64 259 85 282 282 135 43.0 8e-32 ITVLSLGMIGYSGGIVFNSVKGNLDKSSLEKDIKKLDESIKEKSTLLAEVKAKDTELRKH YNQIRQEKGIKVVYLTFDDGPTPNNTPRILEILKKNNIKATFFVIGQNPDMYKQIVDQGH AIAIHTYSHEYKQVYASEDAFFQDLYKLRDLIIEKTGVNPKVTRFPGGSSTTRVSKPIKQ AIINRLTKEGYVYQDWNCDSTDASGNKVPVEKLVKNGVCNVREVNLLMHDAYAKTTTVEA LQQIIDAYRAKGYIFETLTVDSPKFQHVKQPENVD Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:29:25 2011 Seq 
name: gi|261747101|gb|ADAD01000148.1| Leptotrichia goodfellowii F0264 contig00052, whole genome shotgun sequence
Length of sequence - 447 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0
    N  Tu/Op     Conserved    S         Start      End   Score
                 pairs(N/Pv)
                              +  Prom      28 -     87     5.8
    1  1 Tu 1    .            +  CDS      119 -    403     271  ## gi|262038874|ref|ZP_06012220.1| conserved hypothetical protein
Predicted protein(s)
>gi|261747101|gb|ADAD01000148.1| GENE 1 119 - 403 271 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038874|ref|ZP_06012220.1|
## NR: gi|262038874|ref|ZP_06012220.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 94 1 94 94 112 100.0 9e-24
MSDNNKKNLNYIGVLIGRVNELNSKFSFQNEISIIKFFIIMFFLIVYYYIISFFINPKLN
IIEIIFVIAVAFMVIIVLPIYSILWGNRKRLRKN
Prediction of potential genes in microbial genomes
Time: Thu Jul 14 03:29:38 2011
Seq name: gi|261747078|gb|ADAD01000149.1| Leptotrichia goodfellowii F0264 contig00056, whole genome shotgun sequence
Length of sequence - 20387 bp
Number of predicted genes - 22, with homology - 21
Number of transcription units - 10, operones - 4 average op.length - 4.0
    N  Tu/Op     Conserved    S         Start      End   Score
                 pairs(N/Pv)
    1  1 Tu 1    .            -  CDS       42 -    572     655  ## Lebu_2211 hypothetical protein
                              -  Prom     713 -    772    15.6
                              +  Prom     701 -    760    11.8
    2  2 Op 1    .            +  CDS      795 -   1562    1096  ## COG0561 Predicted hydrolases of the HAD superfamily
    3  2 Op 2    .            +  CDS     1677 -   1958     471  ## Lebu_0346 cupin
                              +  Prom    1976 -   2035     7.7
    4  3 Op 1    .            +  CDS     2104 -   3306    1272  ## COG1323 Predicted nucleotidyltransferase
    5  3 Op 2    .            +  CDS     3331 -   4410    1476  ## COG1073 Hydrolases of the alpha/beta superfamily
    6  3 Op 3    3/0.000      +  CDS     4439 -   5194     859  ## COG2211 Na+/melibiose symporter and related transporters
    7  3 Op 4    .            +  CDS     5199 -   5753     807  ## COG2211 Na+/melibiose symporter and related transporters
    8  3 Op 5    12/0.000     +  CDS     5758 -   6579    1554  ## COG3959 Transketolase, N-terminal subunit
    9  3 Op 6    .
                              +  CDS     6592 -   7518    1639  ## COG3958 Transketolase, C-terminal subunit
   10  3 Op 7    2/0.000      +  CDS     7567 -   8478    1273  ## COG2177 Cell division protein
   11  3 Op 8    .            +  CDS     8488 -   9888    1749  ## COG0739 Membrane proteins related to metalloendopeptidases
   12  3 Op 9    .            +  CDS     9906 -  11129    1875  ## COG2195 Di- and tripeptidases
                              +  Term   11146 -  11204     4.2
                              +  Prom   11180 -  11239     4.0
   13  4 Tu 1    .            +  CDS    11308 -  12414    1022  ## COG3525 N-acetyl-beta-hexosaminidase
   14  5 Tu 1    .            -  CDS    12453 -  13307     819  ## COG1284 Uncharacterized conserved protein
                              -  Prom   13340 -  13399     7.1
                              +  Prom   13304 -  13363    10.2
   15  6 Op 1    .            +  CDS    13567 -  15036    2237  ## COG0516 IMP dehydrogenase/GMP reductase
   16  6 Op 2    .            +  CDS    15062 -  15655     741  ## Lebu_1686 hypothetical protein
   17  6 Op 3    .            +  CDS    15674 -  16384     881  ## COG3382 Uncharacterized conserved protein
                              +  Term   16399 -  16438     3.0
                              +  Prom   16517 -  16576    11.3
   18  7 Tu 1    .            +  CDS    16607 -  17653    1839  ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases
                              +  Prom   17714 -  17773    10.6
   19  8 Op 1    .            +  CDS    17842 -  18090     504  ## gi|262038890|ref|ZP_06012235.1| hypothetical protein HMPREF0554_0787
   20  8 Op 2    .            +  CDS    18110 -  19075     992  ## jhp0940 hypothetical protein
                              -  Term   19090 -  19131     1.1
   21  9 Tu 1    .            -  CDS    19158 -  20126     344  ## PROTEIN SUPPORTED gi|145632364|ref|ZP_01788099.1| ribosomal protein L11 methyltransferase
                              -  Prom   20214 -  20273     4.7
                              +  Prom   20161 -  20220     9.3
   22  10 Tu 1   .
+ CDS 20241 - 20385 299 ## Predicted protein(s) >gi|261747078|gb|ADAD01000149.1| GENE 1 42 - 572 655 176 aa, chain - ## HITS:1 COG:no KEGG:Lebu_2211 NR:ns ## KEGG: Lebu_2211 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 175 1 193 194 175 59.0 6e-43 MKKTLLGLFLVLGAASFAASKVELKGAWDTGGKYYFKDSSTKAKGNAGEVGAEYRYEVIP GLELGGGTSYQFHSKLKKDRGTLYNSVPVYATAKYNFDTGTTVKPYVKGDLGYSINTNKA KDGLYYGVGGGINFNNVNVDVMYKENKGKYTVGQKDHKADYRRVSLGIGYDFNLGY >gi|261747078|gb|ADAD01000149.1| GENE 2 795 - 1562 1096 255 aa, chain + ## HITS:1 COG:Z5347 KEGG:ns NR:ns ## COG: Z5347 COG0561 # Protein_GI_number: 15804418 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Escherichia coli O157:H7 EDL933 # 1 253 40 303 305 149 35.0 3e-36 MCNIVVSDLDGTLLNSDQEVSEISKKVIKKLIEKGKKFYIATGRAYPDAKNIMESIGIKI PLISANGAVINDAEGKEIYRNNLDEKYKDIMLDIDYLRISDSIHINVYSDNRWFLTDKER KINPFEEEPEYMYEVVTPEEIRKKDITKIFYIGRHNDLLKLEKVIFDRTNGEVNVKFTHP ECLEIFDINATKANAIKKLGLIDGFNDYEMLKEAKKGFIMKNAHYSLKEALPHLEIVSSN SKNGVAKKLIELYNI >gi|261747078|gb|ADAD01000149.1| GENE 3 1677 - 1958 471 93 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0346 NR:ns ## KEGG: Lebu_0346 # Name: not_defined # Def: cupin # Organism: L.buccalis # Pathway: not_defined # 1 93 1 93 93 110 63.0 3e-23 MYGTVKNEAGMLFSGNNFKVVKKILKTGELIPSHNHEEEEIVFSVLKGKMEMFLNDTEKH VLVPGDILNFDGVNFIKGIALEDSEINVTLVKK >gi|261747078|gb|ADAD01000149.1| GENE 4 2104 - 3306 1272 400 aa, chain + ## HITS:1 COG:BS_ylbM KEGG:ns NR:ns ## COG: BS_ylbM COG1323 # Protein_GI_number: 16078570 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferase # Organism: Bacillus subtilis # 3 347 4 352 415 186 33.0 6e-47 MRIGIVAEYNPFHNGHLYQIKKIKEIFGRDIFLVVIISGDFVQRGELSFLNKWEKTEIAL ENGVDLVVELPLYCSVQNAEIFSRTATEILDYLEVDMQVFGAEEENIQKLEEVIKLQSKQ SYKKKLTDFIKSGNSYSTSQKLVLNEYGYENIVKSNNILGLEYIRAIKKRKLKIKPYAIK REVSQYNEEKIEKNRIDNMVSASFIRKEIEENNFEKIKRFVPENTYEILKEKVKERKEKG 
INDAILKNDIFKMVKYKFLTYSKKEIIKIYDTTEEIYVRIYDRLKESSSYNEFVKNVKSR NFSTKRIERIIMNIFLNVTLKSADFKVDYVRVLGFNSKGREYLKNMKDKFGNQKKIFVNW KDIEKDEKISRKKINIEKNGFLVKELLFEEKERLNPVVIE >gi|261747078|gb|ADAD01000149.1| GENE 5 3331 - 4410 1476 359 aa, chain + ## HITS:1 COG:L123536 KEGG:ns NR:ns ## COG: L123536 COG1073 # Protein_GI_number: 15672103 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Lactococcus lactis # 72 314 70 308 311 63 25.0 8e-10 MGILITFNVLFMIFFFVIAYISIRYFLNQIEKYPRITLDEVYNNKKLRQKYNVEEKVNPM DYGFTYKEVAYKSGKIQLYGWLIDNKKNDKTVIISHGRGVNRLAVLQYLQIFKDVGLENE YSFFIPDLRNSGKSDEAKTKMGYCFGQDIFHTMEMLHEKHGKSSFILYGFSQGGMGSAIA AKLYSNELRKKGIKVEKLILDSSISNVRKRVKQDAAKRKVPKFIVSLVTRVFNLRVGNKL DKLRFSYLLRRIPALIIQTKSDKATTYGMLMEEYNDIAQYKNIYLKVFERGSHTRIYPDY KEEYTDAVGRFLKGELSENENIVKNDENPENNKEIKNPNEEKETGYYENFSDFDYDEEN >gi|261747078|gb|ADAD01000149.1| GENE 6 4439 - 5194 859 251 aa, chain + ## HITS:1 COG:FN0222 KEGG:ns NR:ns ## COG: FN0222 COG2211 # Protein_GI_number: 19703567 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Fusobacterium nucleatum # 1 241 1 239 448 196 42.0 4e-50 MSKLTKRTYLIYGMGVSYFILDQLYNQWLSYYYLPPGTEHSLKPLLKPSYLVLAYLFARL IDAISDPLVGYWSDNSKSKFGRRSFFMMIGGLPLGILMVMYFFPPKDSQMYTLLYLSVIG GLFFTAYTLVGGPYNALIPDLARTKEERLNLSTVQSTFRLIFTGIALVLPGYMISILGKG NTEWGIRKTVIILTVFAIIGIYICVFFLKEKELVKDNENHESIGFKSSLKYLMRKEILLY FAGFFFFFQRI >gi|261747078|gb|ADAD01000149.1| GENE 7 5199 - 5753 807 184 aa, chain + ## HITS:1 COG:FN0222 KEGG:ns NR:ns ## COG: FN0222 COG2211 # Protein_GI_number: 19703567 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Fusobacterium nucleatum # 1 145 252 397 448 102 39.0 3e-22 MRGVLTYYLTIIMELPIKQMTVISAILFGMAGICFPITNMLGKKFSYKKILVLDIMLLIA GTVGLLFVNSSISWLAYIMLVICGTGLSGSAFIFPQAMLSEISAKISETEKVSLEGFMFG IQGLFLKLAFLVQQSVVSLVIIAGSIPDVQGRKSATGLGVRSTLIVALVLFGISLVFYKL KKED 
>gi|261747078|gb|ADAD01000149.1| GENE 8 5758 - 6579 1554 273 aa, chain + ## HITS:1 COG:FN0294 KEGG:ns NR:ns ## COG: FN0294 COG3959 # Protein_GI_number: 19703639 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Fusobacterium nucleatum # 6 272 3 269 270 390 71.0 1e-108 MEVFMEIKELKEKAKEIRKDIVEMIYRAKSGHPGGSLSIADIMAVLYWSEMRVDPQNPKD EKRDRFVLSKGHAAPALYATLIEKGYVSKDLIPTLRRWGSPLQGHPDMKKLSGVEMSTGS LGQGLSVANGMALSSKIYNNDYRVYTILGDGELQEGQIWEAAMTAAHYKLDNLVAIVDYN NLQIDGKVSDVMDVYPVVDKFKAFNWNVIEIDGHNYEEIIKAFEKAREVKGQPTVIVAKT VKGKGVSFMENNAGFHGAAPNDEEYKKAMEELQ >gi|261747078|gb|ADAD01000149.1| GENE 9 6592 - 7518 1639 308 aa, chain + ## HITS:1 COG:FN0295 KEGG:ns NR:ns ## COG: FN0295 COG3958 # Protein_GI_number: 19703640 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Fusobacterium nucleatum # 1 307 1 307 309 414 68.0 1e-115 MEKKSTRQAYGEALVKLGKENKDVVVLEADLSKSTMTVFFKKEFPERHINVGIAEADLMG TAAGIATTGKIPFASTFAHFAAGRAFDQIRNSIVYPKLNVKICPTHAGISLGEDGGSHQS IEDMALMRSLPGMVVLSPADAVETEKAVMAAAKYEGPVYIRLGRLNIPVLFDDSYNFEIG KAVTLSEGNDVAIIATGLMVYEAVEAAKLLEKEGIKARVINMSTIKPLDKDAVLKAAKEC KFIVTSEEHSVVGGLGSAVSEYLSEVHPTKIIKHGIYDVFGQSADGETMLNNYKLRAKDI AEVVLNNK >gi|261747078|gb|ADAD01000149.1| GENE 10 7567 - 8478 1273 303 aa, chain + ## HITS:1 COG:BS_ftsX KEGG:ns NR:ns ## COG: BS_ftsX COG2177 # Protein_GI_number: 16080578 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Bacillus subtilis # 5 236 9 246 296 72 22.0 8e-13 MFSPIQETFRNISKEKGLFFSSLISLITIFVLLDTFIFGVFNLNDFKAKMENSNQAIVYV KTMTEDEISAFQGKLLKVNGIQTIKYVSKESALQLLEKELKVDLSDEENPLQDSFYVYID KNANVNALKEELLKNPEVTELDMRTQTIERTNQFSKNLDKLVLFGGVGSIIIAAILIMNI TSFGVRLRRREIRDLVATGVSGMAIKITYFLEGLILVLFSSIIGFGIFWKLQKFIVEGIN LLRSGIIGNSTDKELLGIYLISLLVGVIITFFSNFIGLHGYYRVKDKKVKPVSNNKNTEN EKK >gi|261747078|gb|ADAD01000149.1| GENE 11 8488 - 9888 1749 466 aa, chain + ## HITS:1 COG:AGc3124 
KEGG:ns NR:ns ## COG: AGc3124 COG0739 # Protein_GI_number: 15889007 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 344 466 438 561 562 87 40.0 4e-17 MKKNILKISLFICTIFLIFADQIEENKKRIQQIDSQVNQNNQKINKNKTEISKAKNTENI ISAQVKKLDKDIARLQAEYNEAERKYTEILKQIGVNNENIRKNISEINRDTEIINVNKED LYKKIKTWDKIRRNRDMAAIVGTSNSAEKVKMTHDLKILLNKQTDYIQGVEDSKKGVESK KQREESVRARNQEEASKVKAARAELESKNRQLNAAKKEKNVLIAQLRGKQKVLNTENKKI ESNNSQLISEKRRLNAQIQAIIQRAIRERELAMQRAAEEKRKQEEAERIRRDAEAQRKAN SERQNNDLAGQKSTPSKNTGTASNKTTTRTTTTTKTPAVQVPKGTGNLMMPINGSIVVGY GQEKVAGLKSNGIEIRGSAGQAVKAADSGIVIYAGSLNNLGGVVIIDHGGLVTVYGNLAG VSVSKGSKVNKGQTVGTLGRDQTTHQPNLYFETRRGVNIVNPMSYL >gi|261747078|gb|ADAD01000149.1| GENE 12 9906 - 11129 1875 407 aa, chain + ## HITS:1 COG:FN0733 KEGG:ns NR:ns ## COG: FN0733 COG2195 # Protein_GI_number: 19704068 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Fusobacterium nucleatum # 2 407 6 411 412 624 78.0 1e-179 MYDTLKERFLRYVKFETRSDENSETIPSTPSQTEFAKMLKKELEEIGLENIFINDACFVN GTLPGNIDKKVPVIGFIAHMDTADFNAVNVNPQIIENYDGKDIVLNKELGITMSVEEFPD LKKYVSQTVITTDGTTLLGSDDKSGIVEIIEAMKYFIAHPEIKHGTVKVAFGPDEEIGRG ADNFNVEEFGADFAYTMDGGPVGELEYESFNAAQVIYKIKGKSVHPGTAKGKMKNASLIA VDLATMFPADEVPEKTEGYEGFYLLDGMTSNCEDARVEYILRDHDRKKFEQKKEFAKNIA EKLNEKYGKNTVEVFMKDQYYNMGDIIKDHMYVVDIAREAMENLGIKPIIKPIRGGTDGS KISFMGLPTPNIFAGGENFHGKYEFVTLESMEKATDTIIEIIKLNAK >gi|261747078|gb|ADAD01000149.1| GENE 13 11308 - 12414 1022 368 aa, chain + ## HITS:1 COG:SP0057_2 KEGG:ns NR:ns ## COG: SP0057_2 COG3525 # Protein_GI_number: 15900002 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Streptococcus pneumoniae TIGR4 # 28 339 450 770 782 102 27.0 2e-21 MINRVLKITVMMLGISMISLGVTKEIGVNLDLARQYYSVQVIKKFIDNISSSGGNFLHLH ISDDENYGIESEILGQTTAEAKLEKGIYINKKTGKKFLSYAQLKEIIFYAKTKNIELIPE 
IDSPAHMKAIFELLEMKKGNFYVKNIKSDTEDEINMEKSESINFMKQLIGEVTDIFGGKL RHFHIGGDEFEYSEAKNSEFVKYINDLSNYLISKGIKPRMWNDGITKADINKINKEIEIT YWSYDGDVEDVKQTGKRRKIRVSVPELLEKGFKVLNYNSYYLYYVPKESKNLKKDIEYSV NDIKKNWNLTVWDGENSENAVKNEDNIIGAAISIWGEGPYKFTDSEIQKYSEPLVKTFIL KTQNYKKN >gi|261747078|gb|ADAD01000149.1| GENE 14 12453 - 13307 819 284 aa, chain - ## HITS:1 COG:CAC0496 KEGG:ns NR:ns ## COG: CAC0496 COG1284 # Protein_GI_number: 15893787 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 7 283 2 279 279 209 41.0 4e-54 MKNKFAKILKEYFFISIGVTLLAFGLHFFLFPNKIASGGITGFALIINHIFGISNSIIIA IGNVILFSLGFILIGGSFGIKSIYASTYLSVILLIFENFFPHYSLTQNLTLATIFGSVFC ALGSTVVFTYEASTGGTSIIGKIINKYFGIRLGMANFIADATVTVMAMFAFGVELALFGL LSVYISGYMVDKFIDGFNSRKQIFIITENKEIIIKYILRDFQRGCTVFNGRGGYSGASRD VLMTILDRRQFINLRKFLKTNDPTAFVTVTETTKVFGLGFDQLH >gi|261747078|gb|ADAD01000149.1| GENE 15 13567 - 15036 2237 489 aa, chain + ## HITS:1 COG:FN1231_3 KEGG:ns NR:ns ## COG: FN1231_3 COG0516 # Protein_GI_number: 19704566 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Fusobacterium nucleatum # 204 486 1 284 285 397 71.0 1e-110 MSQKSKVIMDGLTFDDVLLIPQASEVLPHEVSLKTKVTKKLELNIPVMSAAMDTVTESQL AIAIAREGGIGFIHKNMTIERQADEVEKVKRYESGMITNPITLEEHSMLKEANDLMKTYK ISGLPVIDKKGNLKGIITNRDLKYRENLTVKVEEVMTKENLITAPVGTTLDEAKAILLEH RIEKLPIVQRKKLKGLITIKDIDNKINYPNACKDAHGRLRVGAAVGIGNDTLKRVEALVE AGVDIITVDSAHGHSKGVIKKIKEIRKAFPDLDLIGGNIVTKEAALDLIKAGVNAVKVGV GPGSICTTRVVSGVGVPQITAILEIAEVCEKKSIGLIADGGIKLSGDIVKAIAAGADCVM LGGLLAGTNEAPGEEIIYNGRKFKTYAGMGSLAAMKRGSSDRYFQLESATEKLVPEGIES MVPFKGALKDTIYQLCGGLRSGMGYCGTPTIEKLKSDGKFVKITNAGLKESHPHDVIITK EAPNYNASN >gi|261747078|gb|ADAD01000149.1| GENE 16 15062 - 15655 741 197 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1686 NR:ns ## KEGG: Lebu_1686 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 195 1 195 195 149 55.0 7e-35 
MKSLRKTVFLAILVLSIGSLSFSFEDYYKKVYIVKISKKEIYKAVNANQNQQKKLSKIFD EYQKKAEKIEKQLKSFEDKKSQIGKIEKDRYFKIAEVLSFEQLKEFNKYTNQKKLEFEEK NDKIKSLMDDLNLENEQKAEILKLDRDFKRAVGRLKEERLSEENFASEYESLKKVRNEKI RALLNEEQIKVVDSYKF >gi|261747078|gb|ADAD01000149.1| GENE 17 15674 - 16384 881 236 aa, chain + ## HITS:1 COG:SPy1551 KEGG:ns NR:ns ## COG: SPy1551 COG3382 # Protein_GI_number: 15675447 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pyogenes M1 GAS # 1 235 1 234 238 254 51.0 7e-68 MSKFIVNESFWNIFPEANIGVLLLENIDNSKESSENIVKLLEESNKEAKKHLTEDQLSKN PVIAIWREAYQKFKTKKGVRCSIEALLKRVEKGSGVSTISPLVDIYNAASLKYGLPVGAE DIETFQGNLVLGVTEGNNEFYALGDEENSPTLEGEICYKDDKGAVCRCFNWRDGQRTMIT EKTRKAFVVIEYVNGERNEDIQKALEMIAEYAEKELGAKKTSLKILNKNENSINLI >gi|261747078|gb|ADAD01000149.1| GENE 18 16607 - 17653 1839 348 aa, chain + ## HITS:1 COG:BS_ydjL KEGG:ns NR:ns ## COG: BS_ydjL COG1063 # Protein_GI_number: 16077691 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Bacillus subtilis # 4 347 1 345 346 430 68.0 1e-120 MAKMKAARWYKQRDVRVEEIEIPQMGENQIKIAVKYCGICGSDLHEYLGGPIFIPVEAPH PYTNEKAPITMGHEFSGEVVEVGSGIKKFKVGDRVTVEPIFAKDGLKGKYNLDPNLTFIG LGGGGGGFSEFVVVNEDQAHKLPDEIDYEQGALTEPAAVALYAVRQSKFRTGDKVAVFGC GPIGLLIIDALRAGGATEIYAVELSHERQEKAAKLGAIIIDPSKVDAVKEIKERTNGGVN VSYEVTGVPKVLEQALESAEKDGELMVVSIWESSAPIHPNEIVIQERAMKGVIAYRDVFP STLDLMKKGYFSKDLLVTKHIKIDDIVTEGFEALVKEKSQVKILVSPK >gi|261747078|gb|ADAD01000149.1| GENE 19 17842 - 18090 504 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038890|ref|ZP_06012235.1| ## NR: gi|262038890|ref|ZP_06012235.1| hypothetical protein HMPREF0554_0787 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0787 [Leptotrichia goodfellowii F0264] # 1 82 1 82 82 147 100.0 2e-34 MEIHNEIKIDFELTNKLKRTIEKLERVFWVAQHYDEESKEYSKLDGKFLILCDDLEIDAK MGARAGYITWEQVDLLMAKYRF >gi|261747078|gb|ADAD01000149.1| GENE 20 18110 - 19075 992 
321 aa, chain + ## HITS:1 COG:no KEGG:jhp0940 NR:ns ## KEGG: jhp0940 # Name: not_defined # Def: hypothetical protein # Organism: H.pylori_J99 # Pathway: not_defined # 14 317 15 291 325 152 33.0 3e-35 MEIKDYTGFKEGNKFYGGNAGHKISIIENGVKWLLKFPKSTYNLENVDMSYTSAPLSEYI GSKIYEILGHDVHETKLGIYDSKLVVACKDFTTKDYDFFELHEVKNSYLGKNEAKREYIM RKSKTSRDEVNLEELLFVLDNYEKLTKIPGIKERFWDMFVIDNFINNNDRHNGNWGILSN ESGMKLAPVYDNGAGFFSKHDTNKIKKILSDEERFNQIVENGRTPYIYDGKKIDSSTVIK KLSFEKGENIAENMDLKKAVLRNVPKIELNRIFKLIDEIPEKEKGIKVMSDLEKVFYKKF LKERYEKILLPSYERIVNSKK >gi|261747078|gb|ADAD01000149.1| GENE 21 19158 - 20126 344 322 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145632364|ref|ZP_01788099.1| ribosomal protein L11 methyltransferase [Haemophilus influenzae 3655] # 2 321 27 352 353 137 29 8e-32 MKHPIKERQSKMKIYISPMAGYTDYSYRKILKKFNPDLLFTEMVSAHLLNENDEATEKEL LKCDDKENEGVQVFGSDKNELIKAFLKLENKGFKNLNLNMGCPQTKIIKNGAGSALLPQT DFVDSLLSELKSKLSDKTKLSIKIRIGYRDFSSPELYIDLANKYNLDFICIHGRTREQLY NGKADWDIVSELSNRPRKIDFIGNGDLFEAKDIVSLIKKSNLDGIMLSRGIIGNPWLISQ IRELLNLGEITTFPDFDTVKNAVLEHISYLEENKGKIVAALEINKYLQPYFKRFETEDLK NNLKEIIIERDIDKKIKDIEKL >gi|261747078|gb|ADAD01000149.1| GENE 22 20241 - 20385 299 48 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKRIFALLMIVLLVAVSCTTAPDGTKKVSKTGVGAGIGAAAGAVLGQA Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:30:09 2011 Seq name: gi|261747063|gb|ADAD01000150.1| Leptotrichia goodfellowii F0264 contig00066, whole genome shotgun sequence Length of sequence - 11325 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 4, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 24 - 58 0.4 1 1 Op 1 8/0.000 - CDS 201 - 989 971 ## COG3639 ABC-type phosphate/phosphonate transport system, permease component 2 1 Op 2 9/0.000 - CDS 1058 - 1858 809 ## COG3639 ABC-type phosphate/phosphonate transport system, permease component 3 1 Op 3 15/0.000 - CDS 1861 - 2628 313 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| 
ribosomal protein L9 - Prom 2655 - 2714 3.4 - Term 2637 - 2672 -0.8 4 1 Op 4 . - CDS 2748 - 3749 1412 ## COG3221 ABC-type phosphate/phosphonate transport system, periplasmic component 5 1 Op 5 . - CDS 3778 - 4668 897 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 4697 - 4756 9.2 6 1 Op 6 . - CDS 4759 - 5418 657 ## COG2378 Predicted transcriptional regulator - Prom 5465 - 5524 13.3 + Prom 5311 - 5370 8.7 7 2 Op 1 . + CDS 5511 - 5927 679 ## SSA_2134 hypothetical protein 8 2 Op 2 . + CDS 5943 - 6503 529 ## PROTEIN SUPPORTED gi|157164512|ref|YP_001467500.1| 50S ribosomal protein L24 (BL23; 12 kDa DNA-binding protein; HPB12) - Term 6494 - 6524 1.0 9 3 Tu 1 . - CDS 6533 - 7375 803 ## COG4823 Abortive infection bacteriophage resistance protein - Prom 7499 - 7558 12.8 10 4 Op 1 1/0.000 - CDS 8003 - 8611 751 ## COG2184 Protein involved in cell division - Prom 8637 - 8696 13.4 11 4 Op 2 . - CDS 8718 - 9593 860 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 12 4 Op 3 10/0.000 - CDS 9596 - 10042 651 ## COG0691 tmRNA-binding protein 13 4 Op 4 . 
- CDS 10064 - 11323 816 ## PROTEIN SUPPORTED gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 Predicted protein(s) >gi|261747063|gb|ADAD01000150.1| GENE 1 201 - 989 971 262 aa, chain - ## HITS:1 COG:ECs5086 KEGG:ns NR:ns ## COG: ECs5086 COG3639 # Protein_GI_number: 15834340 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate/phosphonate transport system, permease component # Organism: Escherichia coli O157:H7 # 68 258 88 276 276 96 31.0 6e-20 MTKSGIYLWGTLGICFIISIFKLATMDTGNVNMGQAIIEFFKNMKLMFLQPRLSERYTLL EIFQSLGITISLAIFTTLIGAFIGLFFSFFAAKNLSNEKISKAIKIAVSFVRSIPTILWV MVFSVVAGIGVEAAVIGICFHSTGYLIKAYSESIEEIDNGIIEALKATGASKGQVIFQGV LPASVTSILSWTFVRLEMNFTNAVVVGAAAGVGGIGYEMFMAGKMYFDSREIGFFVYLIF GTALILESISIYLRRKYIVKSL >gi|261747063|gb|ADAD01000150.1| GENE 2 1058 - 1858 809 266 aa, chain - ## HITS:1 COG:YPO1184 KEGG:ns NR:ns ## COG: YPO1184 COG3639 # Protein_GI_number: 16121479 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate/phosphonate transport system, permease component # Organism: Yersinia pestis # 19 266 1 263 266 97 27.0 2e-20 MKKNEMEKDILKKRLQSKIIFVILFLIVYISFAQILEFEGIMSFLSIPEGIIWLFKNFIP TQNSLKYLPIILKTGVHTILTAVTATTISGFFALIAAVLGSETIGINRFMKIFTKTVALF FRNMPVVAWSLILLFSFKQSEFTGFLALFFVSFGYLMRTFTETVDEVGEGVAEALKSTGA SYFQIVFQGIIPSIASQLISWLLYYIENGIREATLIGVLTGTGIGFVFDLYYKSNRYDGA GLVILVILIIVIIIELLSNKIRKEMM >gi|261747063|gb|ADAD01000150.1| GENE 3 1861 - 2628 313 255 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 1 222 23 232 329 125 34 2e-28 MKGNFMLLKVSNVSKQYSNGIKALNDVSFEAEKGQFISVVGPSGSGKSTLLRTINRLIDV THGNILFEDVQIEKLKNNEIKKLRRKIGMIFQNYNLVERLSVIENVLHGRLGYKSLMSGV LGIYTEEEKERAFELLNKVDMGKYAYQKCSELSGGQKQRVGIARALMQEPLLLLCDEPVA SLDPKTSERVMDYLRKITDEMRITCIVNLHQTDIAAKYSDRIIGLNQGKKVFDGETSKFT EDIMNSIYSDNNEGL >gi|261747063|gb|ADAD01000150.1| GENE 4 2748 - 3749 1412 333 aa, chain - ## HITS:1 COG:BH0468 KEGG:ns NR:ns ## COG: BH0468 
COG3221 # Protein_GI_number: 15613031 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate/phosphonate transport system, periplasmic component # Organism: Bacillus halodurans # 35 211 58 219 323 68 29.0 2e-11 MKKNFWKIIVIVLMSIILISCGKKAKNDVIKIVFLPNESNESLKKSREEFAAAIQKATGK KVEIVTTTDYNIALESIISGKAQIAYIGAETYLSANERSKDVQAVLTNSGESGTLKDALY YSFIAVRGEDADKYKTDGNYDLKKINGKVMAFVTNTSTSGFVIPAKAIVKEFGLKDTDEV LQEGKVFSKVIFGSSHPGTQVTLFKGDADVATFAIPKAFTIYELIGGEENKVGATYKVKQ GAIAPFGDYAGKSFTVIKSIAVPNGPIVFNVKTLSKEDQEKIKKGLLSKEVTDNPYIFSP KTSKVRGLFLKENPNTGFVEANTEWYKQIKNGE >gi|261747063|gb|ADAD01000150.1| GENE 5 3778 - 4668 897 296 aa, chain - ## HITS:1 COG:FN0354 KEGG:ns NR:ns ## COG: FN0354 COG0697 # Protein_GI_number: 19703696 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 70 293 1 224 224 139 37.0 6e-33 MKKINLKFDNNRKAELLAFITVFFWAVSVVFTKVGTEYYDSTTLAVLRYSFAFLMLVIFM VVKKIKLPKLKDFPMFFLAGGVGIAFHMITFNKGVSTLSSATSSILLACTPIFTSVLSVI FLKEKINLYCWISIFISFSGIIVLTLWEGVFSFNEGIFWMLLAAFLLAVYNIMQKSFSDK YSGTEATIYSISAGAIVLMVYSPGSLREIPKMNINQFSVIFFMAFLATVVAYICWGKALE MADSAGEIANFMFLGPFISTLTGIIFLNEKLMLSTIVGGILILIGITGFNKFKQTK >gi|261747063|gb|ADAD01000150.1| GENE 6 4759 - 5418 657 219 aa, chain - ## HITS:1 COG:lin0383 KEGG:ns NR:ns ## COG: lin0383 COG2378 # Protein_GI_number: 16799460 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Listeria innocua # 1 205 1 207 315 170 43.0 2e-42 MTKSERINDMIIFLSKKNYFNLKDITERYSISKRTALRDVQSLEKLGLAIYSEVGRYGKY KVLNNKLLSPVLFTADEISALYFGMLTLDAYEMCPLNLNKDKLKEKFEACLSKKQSEKIF MIEKIMKTEMKKSNENSRILKDIVEVITENKVCKISYKGNEKKEEVQFLGVFAEQGRWNA KVLSYKNGKNKVLKCNKIISVEKTDIKPFKTLEEIIVGL >gi|261747063|gb|ADAD01000150.1| GENE 7 5511 - 5927 679 138 aa, chain + ## HITS:1 COG:no KEGG:SSA_2134 NR:ns ## KEGG: SSA_2134 # Name: not_defined # Def: 
hypothetical protein # Organism: S.sanguinis # Pathway: not_defined # 2 138 18 154 154 118 44.0 1e-25 MDIKKEFETIMKDQAQIALATSVNDIPNVRITNFFYCGDSKALFFGSAKNSKKIEEFASN SHVAFTTVPKNESKYVRAQGIVKKSEKTVFDVKEAFVAKNPKMEFIIDNFGDNLDLFEIP LSEVTVTLENQEVHELKL >gi|261747063|gb|ADAD01000150.1| GENE 8 5943 - 6503 529 186 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164512|ref|YP_001467500.1| 50S ribosomal protein L24 (BL23; 12 kDa DNA-binding protein; HPB12) [Campylobacter concisus 13826] # 6 186 3 184 185 208 58 2e-53 MENNNRCGWAKGEKDILYHDTEWGVPSHDDGYIFEMLILEGFQAGLSWNTILQKRENFRK AFDDFDYKKIAEYDENKLNELLQNEGIIRNRLKIYSAVTNAIAFMKVQKEFGSFSDYIWN FTDNKRIINNWKTLSEVPATSELSDKISKDLKKRGFKFVGSTIIYSFLQAIGIIDDHLVS CPYKNK >gi|261747063|gb|ADAD01000150.1| GENE 9 6533 - 7375 803 280 aa, chain - ## HITS:1 COG:FN1184 KEGG:ns NR:ns ## COG: FN1184 COG4823 # Protein_GI_number: 19704519 # Func_class: V Defense mechanisms # Function: Abortive infection bacteriophage resistance protein # Organism: Fusobacterium nucleatum # 1 277 22 295 299 141 35.0 2e-33 MIVENKEKAAEKLVFINYYKIKEASLPFLHDGMYEKNTKFEDIVFRFYEDRNLRLYFLRI VEKIEVSLKTNFSRILGREFEDTGYLDFKKWTDSEEYCKHFIKYKQKEFFRRIDKNIKQS KNKLIQEYKNQSEIPVWLIVDVLTFGEILDLYKLMKKEYKEEIAENHNITVSIFVSWLEN INLIRNLSAHNSNVLDILFRTKPKILNRWKDKIIVNEKNNRSVDKICKTILIMEHLIMNI NKDFPGNAVRNCLMRLYKRDKRILKQTGFKTIESIKEIKI >gi|261747063|gb|ADAD01000150.1| GENE 10 8003 - 8611 751 202 aa, chain - ## HITS:1 COG:HI0977 KEGG:ns NR:ns ## COG: HI0977 COG2184 # Protein_GI_number: 16272915 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Protein involved in cell division # Organism: Haemophilus influenzae # 18 192 12 186 191 186 54.0 4e-47 MSKYDWKDNELKRMVSEKEEEMLSKKRAKELYDKKIMENFEVGTFKGLQQIQEYLFQDVF EDAGKIREKNFSKGGVRFASAIFLKDNLKTLDKIPESTFENIIDKYVEMNILHPFNEGNG RSMRIWLDRMFIKNLGICIDWSKINPAEYLQAMERSPINTLELRTLLEKNITKDIDNREI YMHGINQSYKYEKMYEYDIEKI >gi|261747063|gb|ADAD01000150.1| GENE 11 8718 - 9593 860 291 aa, chain - ## HITS:1 COG:FN0354 KEGG:ns NR:ns ## COG: FN0354 COG0697 
# Protein_GI_number: 19703696 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 66 289 1 224 224 151 40.0 1e-36 MRKLSNDDKAKIAAFATIFFWSSSFMFIKIALNYYDSMTLSVLRYIISSFVLIFYMIMKK MRMPELKDIPIFFCSGFTGFALYMITLNKGLTTLSSSTVSVIMAFAPILTSVLSVIFLKE KINLYGWICIFVSFAGIAVLMLWEGVLSVNEGVFLILTSALLLAIYNIIQKKLMRRYTAS EATTYSIFTGTILLVLYSPKSAMEIFHMSSAGFIIVFYLAVFVGVVGYILWSKALSLAEN TGEIANLLFLNPFIATIMGIIILHEKLTIPTIIGGTMILSGILGYNKFKGK >gi|261747063|gb|ADAD01000150.1| GENE 12 9596 - 10042 651 148 aa, chain - ## HITS:1 COG:FN0609 KEGG:ns NR:ns ## COG: FN0609 COG0691 # Protein_GI_number: 19703944 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: tmRNA-binding protein # Organism: Fusobacterium nucleatum # 1 148 1 148 148 154 61.0 8e-38 MVLARNKKAFHDYFIEDKLEAGIELVGTEVKSVKAGKTSIKESFIRIIKNEVFIMNMHIT PYEFGNINNLPESRVRKLLLNRKEIEKWASKIKEQGYTIIPLSVYTKKRLVKMEIGLAKG KKLHDKRESLKRKDQEKDMKKIQKNFGR >gi|261747063|gb|ADAD01000150.1| GENE 13 10064 - 11323 816 419 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 [Clostridium acetobutylicum ATCC 824] # 2 401 296 697 730 318 39 8e-87 ELDKEALKRGNSIYLVDRVIPMLPRKLSNNLCSLNPNEDKLTFSVEIDFDNNGKVVKNDF YKSVIKSKYRMTYSGVNEIFEGNEEFIKKYSKIYEMLQNMLSLSKTIRNIKKRRGSIDFE LPEIKVVLDENKLVQDIVLRSRGEAERLIEDFMVAANEVVAEKLFWEEIPAIYRVHEDPD KAKISALNESLIKFGYHLKNTDELHPGKFQTIIEKTTGLPEGYLIHKLILRAMQRARYSN KNLGHFGLASKYYLHFTSPIRRYSDLVVHRMLEKSITKFMKEKEKQTYSENFDIVASTIS KTERVADKLEEDSVKIKLIEYMQDKIGNVYIGRLSGMNKNKVFMELENHIEVVYNVLTSR DDFIYDEENFKITDTKTGESYTMGNTLKVIITGASYQKMEIEVIPYREKTAEILPETEE Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:30:14 2011 Seq name: gi|261747058|gb|ADAD01000151.1| Leptotrichia goodfellowii F0264 contig00045, whole genome shotgun sequence Length of sequence - 4441 bp Number of 
predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 482 524 ## PPSC2_c3489 protein 2 1 Op 2 . - CDS 512 - 2560 1987 ## COG2189 Adenine specific DNA methylase Mod 3 1 Op 3 . - CDS 2570 - 3283 756 ## Lebu_0147 hypothetical protein 4 1 Op 4 . - CDS 3284 - 4441 1081 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family Predicted protein(s) >gi|261747058|gb|ADAD01000151.1| GENE 1 2 - 482 524 160 aa, chain - ## HITS:1 COG:no KEGG:PPSC2_c3489 NR:ns ## KEGG: PPSC2_c3489 # Name: not_defined # Def: protein # Organism: P.polymyxa_SC2 # Pathway: not_defined # 1 160 1 160 333 126 45.0 3e-28 MEKTVKQAFQEFLENSVNLNRKATEDARKSRDNLKKNISEFGSDEDFFTLYEDFNIDFGS FARKTKCRELDDIDMMIGISANYATYNSEDSWDNTRIYANKSDVIQNECMNDDGTLNSKM VVNKFKEKLKKVNEYSKAEIKRNYEAVVLNLKSKTWNFDI >gi|261747058|gb|ADAD01000151.1| GENE 2 512 - 2560 1987 682 aa, chain - ## HITS:1 COG:PM0698 KEGG:ns NR:ns ## COG: PM0698 COG2189 # Protein_GI_number: 15602563 # Func_class: L Replication, recombination and repair # Function: Adenine specific DNA methylase Mod # Organism: Pasteurella multocida # 5 638 4 602 636 341 36.0 2e-93 MSENKIKQQINNVANDNLEALKQLFPSAVKDGQVDFEALKEELGEFEEVKDEKYEFRWAG KQNAKKAAQTDVGNRTLKYIPEDSKDADTTQNLYIEGDNLEVLKLLRQNYYGAIKMIYID PPYNTGNDFIYNDNFSMSKEESEKVENRLSEDGERLQKNPKEGNSKYHTKWLNMMYPRLK VARDLLTDDGVIFISIDDNEQANLKKICDEIFGGNNKVAILPTIMNLKGNQDEFAFAGTH EYTVVYSKKYDDLNISSLKVDEEEVILEWQEDEEGYYKKGASLISTGQNAPRELRPNLWF PILYKEDKLYLPSEEFVNLLFDEKTKLFNDILLKEYIEKKEKESYVIILPFVNGKKASWR WSYKKIQQQLKDIIITKTKNGISLNKKQRPELNDLPSKKMKSLLYKPEYSSGNGTNELAN IFEMKRIFSNPKPINLLKDLFSIGTKEDSTILDFFSGSATTAHAVMKLNSEDNGNRKYIM VQLPEETDEKSEAFKAGYKNICEIGKERIRRAGDKIKEEIEKENSNLKLGEEPKKVPDIG FKVFRVTDTNIRWKEMEEINETDILNSPDMLDFKVGTKDADVVYELMLRQPGVPLSETLE KLTEIGNRTYLYADSYLVCLETEITEEIIEKVAEINPLPFRFIFRDSAFGDNIGLKDEIF RRLKSLIEKNFGSGDGYRVEFI >gi|261747058|gb|ADAD01000151.1| GENE 3 2570 - 3283 756 237 aa, chain - ## HITS:1 COG:no 
KEGG:Lebu_0147 NR:ns ## KEGG: Lebu_0147 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 236 1 237 237 173 49.0 7e-42 MIKLPQKYEKNKKIPIKDFVSKSFKPEERRKIREIVKEVRLTHQIDGEEIPSIKNEIYNI EVIQFFDFKISDIKNSAFLTGLYQKLIKNPAALRIYDENSEMYSFALKRLNQNDKNEIVV TDEFATDIFYLTMSNKKVEFEEITDYENIVNKINKLNFYTEMYIKSYVLKNENIYKDLKS FLKKPIWYDIEKMNSFYKTLKELVNYREELKKISIDSRKIEINKKIKETILKLKEYE >gi|261747058|gb|ADAD01000151.1| GENE 4 3284 - 4441 1081 385 aa, chain - ## HITS:1 COG:FN0414 KEGG:ns NR:ns ## COG: FN0414 COG0553 # Protein_GI_number: 19703756 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Fusobacterium nucleatum # 4 385 626 1014 1014 310 45.0 3e-84 EIDELEKIVLDKIRNTPYNSGNKKILIFTAFADTVEYLYNELRERLRDENVNLACITGQV VKTTNKNVRAEFNKVLSAFSPVSKIKKEIPESEQIDILIGTDCISEGQNLQDCDTVINFD IHWNPVSLIQRFGRVDRIGSKNKNIQMINFFPNMELNDYLKLEQRVKGKMTTLNIVSTGD EDVLTPEMNDFNFRKRQLERLKDEVIDLEDTNENISLTDLNMNEYLYELSEYVKTNKEIN KIPKGIYSVAEGEKKGVLFCFKHSKENEKPKSDSSLYPYYLIYMNNDGEVLYGNVQARET VKLFRKLCYGKDKPIKELFKIFDEKTKNKKDMRFYSNLLNKGIASIKGSEEEKAIQSVFD FGGFENSFENDTTEDFELISFLVVE Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:30:22 2011 Seq name: gi|261747056|gb|ADAD01000152.1| Leptotrichia goodfellowii F0264 contig00229, whole genome shotgun sequence Length of sequence - 403 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 106 - 402 409 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 Predicted protein(s) >gi|261747056|gb|ADAD01000152.1| GENE 1 106 - 402 409 98 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 98 310 407 407 162 76 3e-41 GSVKPHTSFKSEVYVLTKDEGGRHTPFFTGYKPQFYFRTTDITGEVNLPDGVEMVMPGDN IEMSVELIHPIAMEEGLRFAIREGGRTVASGVVATIVK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:30:25 2011 Seq name: gi|261747034|gb|ADAD01000153.1| Leptotrichia goodfellowii F0264 contig00019, whole genome shotgun sequence Length of sequence - 8420 bp Number of predicted genes - 23, with homology - 19 Number of transcription units - 13, operones - 4 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 331 220 ## SAOUHSC_02217 phi ETA orf 22-like protein - Prom 375 - 434 4.9 2 2 Tu 1 . - CDS 536 - 610 57 ## - Prom 675 - 734 4.0 + Prom 607 - 666 14.4 3 3 Tu 1 . + CDS 775 - 1608 400 ## gi|262038937|ref|ZP_06012278.1| hypothetical protein HMPREF0554_0254 + Term 1667 - 1725 7.6 4 4 Tu 1 . - CDS 1614 - 1790 185 ## gi|262038935|ref|ZP_06012276.1| hypothetical protein HMPREF0554_0255 - Prom 1870 - 1929 5.8 + Prom 1677 - 1736 12.3 5 5 Tu 1 . + CDS 1866 - 2096 248 ## gi|262038927|ref|ZP_06012268.1| hypothetical protein HMPREF0554_0256 6 6 Tu 1 . - CDS 2130 - 2315 358 ## gi|262038930|ref|ZP_06012271.1| mobilization protein B - Prom 2415 - 2474 8.9 + Prom 2310 - 2369 5.4 7 7 Tu 1 . + CDS 2392 - 2463 86 ## + Term 2606 - 2647 -0.4 8 8 Tu 1 . - CDS 2493 - 2714 287 ## gi|262038941|ref|ZP_06012282.1| transcriptional regulator - Prom 2742 - 2801 8.9 - Term 2770 - 2806 1.2 9 9 Op 1 . - CDS 2808 - 3236 503 ## COG1598 Uncharacterized conserved protein 10 9 Op 2 . - CDS 3273 - 3467 217 ## MGAS9429_Spy0565 phage protein - Prom 3490 - 3549 3.5 11 9 Op 3 . 
- CDS 3566 - 3970 363 ## Cthe_2456 nuclease - Prom 4003 - 4062 8.5 + Prom 3962 - 4021 10.0 12 10 Tu 1 . + CDS 4046 - 4234 237 ## Cthe_2457 hypothetical protein 13 11 Op 1 . - CDS 4236 - 4664 386 ## PPSC2_c2958 hypothetical protein 14 11 Op 2 . - CDS 4723 - 4854 173 ## 15 11 Op 3 . - CDS 4947 - 5378 350 ## gi|262038940|ref|ZP_06012281.1| hypothetical protein HMPREF0554_0265 16 11 Op 4 . - CDS 5390 - 5815 535 ## gi|262038932|ref|ZP_06012273.1| transcriptional regulator, AraC family - Prom 5890 - 5949 11.3 + Prom 5897 - 5956 10.5 17 12 Op 1 . + CDS 6006 - 6317 484 ## gi|262038921|ref|ZP_06012262.1| putative esterase 18 12 Op 2 . + CDS 6330 - 6533 247 ## gi|262038939|ref|ZP_06012280.1| hypothetical protein HMPREF0554_0268 19 12 Op 3 . + CDS 6597 - 6776 279 ## gi|262038936|ref|ZP_06012277.1| hypothetical protein HMPREF0554_0269 20 12 Op 4 . + CDS 6856 - 7014 263 ## gi|262038938|ref|ZP_06012279.1| hypothetical protein HMPREF0554_0270 + Prom 7025 - 7084 2.5 21 13 Op 1 . + CDS 7188 - 7670 409 ## gi|262038922|ref|ZP_06012263.1| conserved domain protein 22 13 Op 2 . + CDS 7652 - 8299 706 ## COG5377 Phage-related protein, predicted endonuclease 23 13 Op 3 . 
+ CDS 8309 - 8420 135 ## Predicted protein(s) >gi|261747034|gb|ADAD01000153.1| GENE 1 1 - 331 220 110 aa, chain - ## HITS:1 COG:no KEGG:SAOUHSC_02217 NR:ns ## KEGG: SAOUHSC_02217 # Name: not_defined # Def: phi ETA orf 22-like protein # Organism: S.aureus_NCTC8325 # Pathway: not_defined # 4 82 6 84 256 70 41.0 2e-11 MKTIMKKNDNFTIIANNLILDNSISWKAKSILIYMLSRPKGWSYNAAEISKHSKDGINAV YSGLKELVKAKYVSRKKLADGSLCYYIFENKSENNIIDCYEKIANQENPN >gi|261747034|gb|ADAD01000153.1| GENE 2 536 - 610 57 24 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTFQEFLKGCKQYKDFILIEEYGN >gi|261747034|gb|ADAD01000153.1| GENE 3 775 - 1608 400 277 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038937|ref|ZP_06012278.1| ## NR: gi|262038937|ref|ZP_06012278.1| hypothetical protein HMPREF0554_0254 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0254 [Leptotrichia goodfellowii F0264] # 1 277 1 277 277 434 100.0 1e-120 MKDFNKFNKIWDNPQLKHIFETQQQMYQMIEHTGLTKILENNLKFSKPLLEATKNLKINL PNNLMSSTANPKNIMNSLTLNNYELQNTLASVTASNNAAISMINREIFDSFYSINSVIKD LSIYSQQHNLVLKSLAPNFFHMNKILNQVINMYNISNKDEVEVVVENISVVIMKSNLQEY IEIFKNLKIPDAIAVKLSFISYFILKFAPLIYVLPALIPFMGFLLKLFIIYGKPYIKALS KLKDPEVFAEKAADNTFSILLSSIISILGTLFINKKK >gi|261747034|gb|ADAD01000153.1| GENE 4 1614 - 1790 185 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038935|ref|ZP_06012276.1| ## NR: gi|262038935|ref|ZP_06012276.1| hypothetical protein HMPREF0554_0255 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0255 [Leptotrichia goodfellowii F0264] # 1 58 1 58 58 71 100.0 2e-11 MDNLLRQGLISLGYVVSALIYYKYSKKIRKIINKILPFLVFQYFVGTLFAVIIRENLK >gi|261747034|gb|ADAD01000153.1| GENE 5 1866 - 2096 248 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038927|ref|ZP_06012268.1| ## NR: gi|262038927|ref|ZP_06012268.1| hypothetical protein HMPREF0554_0256 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0256 [Leptotrichia goodfellowii F0264] # 1 76 1 76 76 134 100.0 3e-30 MTDELKNKINEFKKFYSNYPPAIKTKIIQIPGGQFLTISIFNPSGSLISSASKYVKTDEL 
NTEIENLTIEAIERCL >gi|261747034|gb|ADAD01000153.1| GENE 6 2130 - 2315 358 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038930|ref|ZP_06012271.1| ## NR: gi|262038930|ref|ZP_06012271.1| mobilization protein B [Leptotrichia goodfellowii F0264] mobilization protein B [Leptotrichia goodfellowii F0264] # 1 61 1 61 61 89 100.0 1e-16 MGTGKDIDKMTFEEIKAYVKELENYKRTVESFGTEKVKEYKDLLIKSRIEKDPFRPLKKY F >gi|261747034|gb|ADAD01000153.1| GENE 7 2392 - 2463 86 23 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTDNEKKIDKIIDLLRKIIKKIK >gi|261747034|gb|ADAD01000153.1| GENE 8 2493 - 2714 287 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038941|ref|ZP_06012282.1| ## NR: gi|262038941|ref|ZP_06012282.1| transcriptional regulator [Leptotrichia goodfellowii F0264] transcriptional regulator [Leptotrichia goodfellowii F0264] # 1 73 1 73 73 112 100.0 8e-24 MDRKEKIAIDIYSKINIELKRKKLSQKIIAKKIDMTPQTFSDNMKRLANGIFPKLDFLLD VQSELKIDLGLNF >gi|261747034|gb|ADAD01000153.1| GENE 9 2808 - 3236 503 142 aa, chain - ## HITS:1 COG:SP1786 KEGG:ns NR:ns ## COG: SP1786 COG1598 # Protein_GI_number: 15901615 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pneumoniae TIGR4 # 1 135 1 143 150 80 32.0 1e-15 MLVVYPAIFHKAVEGGYVVVFPDLNNGATQGETLEEAIEMAQDYIGTYLYDDFLNGKELP KASRIENIKINFEEGEKEYYRENKSFTSLVNLDMKRYVDECKSQTVRKNVSIPSWLNEKA KRNNINFSNILQEALKQELGID >gi|261747034|gb|ADAD01000153.1| GENE 10 3273 - 3467 217 64 aa, chain - ## HITS:1 COG:no KEGG:MGAS9429_Spy0565 NR:ns ## KEGG: MGAS9429_Spy0565 # Name: not_defined # Def: phage protein # Organism: S.pyogenes_MGAS9429 # Pathway: not_defined # 1 63 28 88 88 67 61.0 1e-10 MPMTSKEMIKFLKKNGFVEIQGGKGSHRKFINHSTGKITIVPCHSKELKKGMEQVILKQA GLKK >gi|261747034|gb|ADAD01000153.1| GENE 11 3566 - 3970 363 134 aa, chain - ## HITS:1 COG:no KEGG:Cthe_2456 NR:ns ## KEGG: Cthe_2456 # Name: not_defined # Def: nuclease # Organism: C.thermocellum # Pathway: not_defined # 3 134 7 140 140 142 54.0 4e-33 
MSYRLFISHAWKHNDEYYNLIDMLDRKAYFNWTNYSVPEHDPFDTEDDLKEELRQQIRPV NAVLIIAGMYALYSNWIKFEIEFAEKISKPIIVIRPRGQEKVPVYLQEIANKSGNTIVNW NTDSIVEAIREYSI >gi|261747034|gb|ADAD01000153.1| GENE 12 4046 - 4234 237 62 aa, chain + ## HITS:1 COG:no KEGG:Cthe_2457 NR:ns ## KEGG: Cthe_2457 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 1 62 4 65 68 96 79.0 3e-19 MKARKFRKKTVIIEAYQTDKEMIIQTLEGPLKASIGDWIITGVHGEKYPCKPDIFKKTYE EV >gi|261747034|gb|ADAD01000153.1| GENE 13 4236 - 4664 386 142 aa, chain - ## HITS:1 COG:no KEGG:PPSC2_c2958 NR:ns ## KEGG: PPSC2_c2958 # Name: not_defined # Def: hypothetical protein # Organism: P.polymyxa_SC2 # Pathway: not_defined # 1 142 3 144 198 116 46.0 2e-25 MNSETYLKERVEDQINWYDKKSSENKKRYYKFKLFEIIAGMLVSILSIFLKGGYLKYIIS ILSFLITGINSISFFLKCQENWIKYRETSEILKQEKYMFLASGGVYNIEDENKFKLFVER CETVISSENINWAQINNKEKKK >gi|261747034|gb|ADAD01000153.1| GENE 14 4723 - 4854 173 43 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDDKTKKDNFKGIGESGKKPYAPPQIKSNPKEELKKEKIKEKN >gi|261747034|gb|ADAD01000153.1| GENE 15 4947 - 5378 350 143 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038940|ref|ZP_06012281.1| ## NR: gi|262038940|ref|ZP_06012281.1| hypothetical protein HMPREF0554_0265 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0265 [Leptotrichia goodfellowii F0264] # 1 128 1 128 143 194 100.0 1e-48 MSQEEMYKDFIKRYDKLYDEMNYYEREKIKYILLACGGGIASLFPFLLAEKKVEMYKEIK TMFLLIFITGIINIFILIFSQKSLEKALENESEIISLKLKNEISESLENKIRDKNYYRKI IQKLDTLVFITLIFILIIVIKIF >gi|261747034|gb|ADAD01000153.1| GENE 16 5390 - 5815 535 141 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038932|ref|ZP_06012273.1| ## NR: gi|262038932|ref|ZP_06012273.1| transcriptional regulator, AraC family [Leptotrichia goodfellowii F0264] transcriptional regulator, AraC family [Leptotrichia goodfellowii F0264] # 1 141 1 141 141 200 100.0 3e-50 MEKLGVTLKKLRESRNLTITELADKAGLGRGTVGDIETGKNKSTIATIDLLSKTLKLTKK EREQLFASMLPKDIGEKLLSDKSDEFLDSLSVLLKLVGAEEQKNILNYIIEKIEYLSMKN 
GKIGEVEKLIEITKKEINNLK >gi|261747034|gb|ADAD01000153.1| GENE 17 6006 - 6317 484 103 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038921|ref|ZP_06012262.1| ## NR: gi|262038921|ref|ZP_06012262.1| putative esterase [Leptotrichia goodfellowii F0264] putative esterase [Leptotrichia goodfellowii F0264] # 1 103 1 103 103 135 100.0 1e-30 MKQYAGTGYDRKYEAAASDWRDKQYDDYCAAQDEYDRQIQEKFEELFNELEVYHTSYYDK LSENEWIELKDYFRDYEYNLETTFSDVIESYQEELIEKYNFMI >gi|261747034|gb|ADAD01000153.1| GENE 18 6330 - 6533 247 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038939|ref|ZP_06012280.1| ## NR: gi|262038939|ref|ZP_06012280.1| hypothetical protein HMPREF0554_0268 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0268 [Leptotrichia goodfellowii F0264] # 1 67 1 67 67 109 100.0 8e-23 MKWIETVKTMEDINMFSLAARIKEKLNEKSFQLRGNGELIIGNDKLMMIRKNSEIYRVDF AITKRVN >gi|261747034|gb|ADAD01000153.1| GENE 19 6597 - 6776 279 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038936|ref|ZP_06012277.1| ## NR: gi|262038936|ref|ZP_06012277.1| hypothetical protein HMPREF0554_0269 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0269 [Leptotrichia goodfellowii F0264] # 1 59 1 59 59 82 100.0 1e-14 MKNRIKILISIILILLAQYGAIIENGYWGLGGNALVPILCWLLFWIFPELIRELKKELK >gi|261747034|gb|ADAD01000153.1| GENE 20 6856 - 7014 263 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038938|ref|ZP_06012279.1| ## NR: gi|262038938|ref|ZP_06012279.1| hypothetical protein HMPREF0554_0270 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0270 [Leptotrichia goodfellowii F0264] # 1 52 1 52 52 69 100.0 8e-11 MKTEQEIREMIRKNELEIERAKSDNDWGLLSTYILVLEERNKLLREILGDDK >gi|261747034|gb|ADAD01000153.1| GENE 21 7188 - 7670 409 160 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038922|ref|ZP_06012263.1| ## NR: gi|262038922|ref|ZP_06012263.1| conserved domain protein [Leptotrichia goodfellowii F0264] conserved domain protein [Leptotrichia goodfellowii F0264] # 1 160 1 160 160 266 100.0 5e-70 
MEKENVLEIEFLPVWDKWAWKISKNKIKNNHLKDLDINTEIWVDPMLKATVDLFKDDSFL IDTDSLINDEIKKRLENIVEKINEKYGTPKRWRAEKGGQYFYIDTFGEISSDTEYDLSED SESYEFGNYFRTIAEAEKYRDRIKEILLNRETEEECNSEK >gi|261747034|gb|ADAD01000153.1| GENE 22 7652 - 8299 706 215 aa, chain + ## HITS:1 COG:BH3544 KEGG:ns NR:ns ## COG: BH3544 COG5377 # Protein_GI_number: 15616106 # Func_class: L Replication, recombination and repair # Function: Phage-related protein, predicted endonuclease # Organism: Bacillus halodurans # 10 208 14 200 320 87 34.0 2e-17 MQFREISYSSEEEWHDIRNKHIGGSDCATIMGYNEYKNIVDLWKEKTGRKKQDDLSDNEA IQRGVRSEDLLIEHFRINNPEYSVGKFGKTLVSIKYPFMAANLDGVLENKSREKGILEIK TATCHSYAMYKQKWKDNIPIEYYLQVQHYLMVTGWKYAILYADIKLVFADNKHEIRQYHI ARDNEDMFEIYKKEVEFYGYLERDEEPPYIKKLII >gi|261747034|gb|ADAD01000153.1| GENE 23 8309 - 8420 135 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MANELQVLELKVEKITPASIVSNVDKLEPFIEKVKEK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:32:17 2011 Seq name: gi|261747023|gb|ADAD01000154.1| Leptotrichia goodfellowii F0264 contig00054, whole genome shotgun sequence Length of sequence - 15290 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 2, operones - 1 average op.length - 9.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 31 - 61 2.0 1 1 Op 1 4/0.000 - CDS 182 - 649 385 ## COG4687 Uncharacterized protein conserved in bacteria 2 1 Op 2 13/0.000 - CDS 709 - 1623 1160 ## COG3716 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID 3 1 Op 3 13/0.000 - CDS 1637 - 2446 1234 ## COG3715 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC 4 1 Op 4 . - CDS 2462 - 3445 1348 ## COG3444 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB 5 1 Op 5 . - CDS 3518 - 5143 2105 ## COG1686 D-alanyl-D-alanine carboxypeptidase 6 1 Op 6 . - CDS 5161 - 5835 815 ## COG0177 Predicted EndoIII-related endonuclease 7 1 Op 7 . 
- CDS 5844 - 6482 344 ## Lebu_0421 TraX family protein 8 1 Op 8 . - CDS 6501 - 6770 447 ## COG2388 Predicted acetyltransferase 9 1 Op 9 . - CDS 6845 - 7594 1073 ## Lebu_0420 hypothetical protein - Prom 7679 - 7738 12.0 - Term 7678 - 7728 12.5 10 2 Tu 1 . - CDS 7756 - 15054 9860 ## FN2047 hypothetical protein - Prom 15218 - 15277 12.7 Predicted protein(s) >gi|261747023|gb|ADAD01000154.1| GENE 1 182 - 649 385 155 aa, chain - ## HITS:1 COG:CAP0069 KEGG:ns NR:ns ## COG: CAP0069 COG4687 # Protein_GI_number: 15004773 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 35 155 11 129 135 78 37.0 4e-15 MKILRNFSNLEDVEKVIFYILGHSLLFFNVRGGIKLAVSLNTKALFVTKGNFLSGGFGNK RGDILIGDRAFEFYNIRNTEDCIQIPWEEIEKVRAQIFFNDRYIRGFFIDTKKSGSFNFV VVKAGKSLKVMRDFLENEKIVRSKPLFSLKKLFRK >gi|261747023|gb|ADAD01000154.1| GENE 2 709 - 1623 1160 304 aa, chain - ## HITS:1 COG:L147466 KEGG:ns NR:ns ## COG: L147466 COG3716 # Protein_GI_number: 15673690 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID # Organism: Lactococcus lactis # 1 303 1 306 307 463 75.0 1e-130 MAENKIKLSKSDRRSVMLRSQFLQGSWNYERMQNGGWAYSLIPALKKLYPDRNDASAALK RHLEFFNTHPYIAAPILGVTLALEEERANGIPIDDAAIQGVKIGMMGPLAGIGDPVFWFT VRPILGAIAASLATGGSVIAPLFFFIVWNIIRIGFLWYTQEFGYQKGAEITKDLSGGLLQ TVTKGASILGMFVMGILVQRWTSINFPMIISKVPLSEGAFIKFPQENVNGAELQRILSEM MSGVSLTPEKITTLQDNLNQLVPGFAALLLTFLCMWLLKKKVNPILIIFGLFAVGILGHL VGIF >gi|261747023|gb|ADAD01000154.1| GENE 3 1637 - 2446 1234 269 aa, chain - ## HITS:1 COG:L146623 KEGG:ns NR:ns ## COG: L146623 COG3715 # Protein_GI_number: 15673689 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC # Organism: Lactococcus lactis # 1 269 1 270 270 273 67.0 3e-73 MDFNIITVVFVLIVAFLAGMEGILDQFQFHQPIIACSLIGLATGHLPECIMLGGALQLIA 
LGWANIGAAVAPDAALASVASAIIFVKAGNFSADGRNAAIVAAITLATVGLVLTMVVRTL SVVIVHQADRAAESGNFKGVEFWHIIALLCQGLRIAVPAVLLLFIPSEIIQGALSSLPKW FTEGMTIGGGFVVAVGYAMVINLMATKEVWPFFFLGFALAPLKELTLIATGIIGVCLAII YLNLSKKSNSGGGGASSSDDPLGDILDNY >gi|261747023|gb|ADAD01000154.1| GENE 4 2462 - 3445 1348 327 aa, chain - ## HITS:1 COG:L1762179_2 KEGG:ns NR:ns ## COG: L1762179_2 COG3444 # Protein_GI_number: 15673688 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB # Organism: Lactococcus lactis # 163 318 1 156 161 207 70.0 2e-53 MIGIIIASHGEFAAGIKQSASMILGDLEQVESVVFMPNEGPDDLYGKLQNAVTNLGTEEI IFLVDLWGGSPFNQSNRLFEEAPEKRAIVAGLNLPMLLEACSSRDDTDKSHKLAEIITSA AKDEVKVKPEELQPAEKKQEVTKSISQGAIPEGTVIGDGKLKIVLARVDTRLLHGQVATS WTKATNPDRIIVVSDTVSKDELRKKLIEQAAPPGVKAHVIPLDKLVEVSKDTRFGNTKAL LLFENPQDALKVIEKGVEIKELNIGSMAHSVGKVMVSTSLSMDQNDVETYKKLADLGVKF DVRKVVADKSGDLFKMISAKSNEGLKL >gi|261747023|gb|ADAD01000154.1| GENE 5 3518 - 5143 2105 541 aa, chain - ## HITS:1 COG:FN0060 KEGG:ns NR:ns ## COG: FN0060 COG1686 # Protein_GI_number: 19703412 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Fusobacterium nucleatum # 96 427 9 343 368 151 33.0 2e-36 MLKKIIKTVLMLSLFTLLVFGEGSHSDNDGGSRDDDIGNIIKNSSIENSQSEKERKKQER EEKKRLKKEEKERKKREKQGIFLPEDNNETDIDNGTNDKNNQNNIDGKENISNTNNKNNT INTTVPKNNNEQKNNTNINNNVTNGGTDIVNEDISKILEENRKKEEQKKPKIDKRKPVVQ AIDKEKEEASQYKDKLLKFIATKDGKIIKKELETQQHPIASLTKVMNILVALDEVDKGNV SLDDKVCFTPQNANVGGSWLNVKVGDCYLLRDLLRSEIIYSANNSAYLVAYHVGKGNIEY FVKLMNQKAKELGMNNTEFHTPAGLPTTMTGKGMDVSTAYDMFLMGKKAIEDKRIREWAS EPELVLVNPAGEQVIYKSRNHLLGQYGIYGLKTGFHVQAGYNIIVTGKMGNIEIISVVLG HKTHNQRTKDQLEEFSQIKNRLKKIHTMGEEIGEFKIKDSPKRKIKGVLAENVYQLDDTN YDFKTVGIDAKAKINKGDVIGKMEVLSNGNVVSSVDILAIEEAEELSWFGKLLRFISFGL L >gi|261747023|gb|ADAD01000154.1| GENE 6 5161 - 5835 815 224 aa, chain - ## HITS:1 COG:FN0057 KEGG:ns NR:ns ## COG: FN0057 COG0177 # Protein_GI_number: 19703409 # Func_class: L 
Replication, recombination and repair # Function: Predicted EndoIII-related endonuclease # Organism: Fusobacterium nucleatum # 14 214 1 201 201 256 63.0 2e-68 MTKKERFNLIFPYLQERYGKPKCALDFETSYQLMIAVILSAQCTDARVNIVTKELFKVVK TPEDIHNMDLETLEKYIKSTGFYRNKAKNIKLNAEQVLNEYNGKIPKKMDELVKLAGVGR KTANVVLGEVWGISEGIVVDTHVKRLSKRMGLTKSDNPEIIERELMKIVPKKYWFVFSHY LILYGREVSTAINPKCDICIINKYFNYCEKEKAEKQRKKVAKKK >gi|261747023|gb|ADAD01000154.1| GENE 7 5844 - 6482 344 212 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0421 NR:ns ## KEGG: Lebu_0421 # Name: not_defined # Def: TraX family protein # Organism: L.buccalis # Pathway: not_defined # 1 207 1 218 219 141 53.0 2e-32 MDLFILKIIGIITMFTDHWYHVIGGSEVLNIIGRTAFPIFAFSLGEGYVHTKNLKKYLLR LFLFAIIIHLMIFYMNYSLNIFFTLFTGLLIISLYHSKKINILPKIIIIGILCYISVKFD FDYGIYGILTIMIFHFFRQDKIKITISFLLLNIAAPFISDISKIQIYSMFGLIPIFLYNG KKGRNMKYFFYLFYPLHFLVLKGIKLLLKNGF >gi|261747023|gb|ADAD01000154.1| GENE 8 6501 - 6770 447 89 aa, chain - ## HITS:1 COG:FN1391 KEGG:ns NR:ns ## COG: FN1391 COG2388 # Protein_GI_number: 19704723 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Fusobacterium nucleatum # 1 88 1 88 89 102 60.0 2e-22 MEIKHIENKGFYIYDEKGEVVAEMTYKKDGNNLIFDHTYVSPLLRGQGVADKLFSTGVEF AEKNNYKIVPVCSYIVKKFESGKYDYIKA >gi|261747023|gb|ADAD01000154.1| GENE 9 6845 - 7594 1073 249 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0420 NR:ns ## KEGG: Lebu_0420 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 125 235 19 128 134 76 36.0 1e-12 MNLKITGGMCMKKYFLLMLVSGILLGANEKELTIEKILNNENVVAAATASAKQSTAKKEE KKAAAVPKKTVNTATNNKAATPQNNAKTTVNENTKAKQTTEVKKAAVTKNTGAVNPQKIT DTVIDRDRTKQFNDPKNLEKGSYCNKPVGRHQFQEINDKGEYRADIKFQYCMINNEKKEN VTIGIIGTDLNAISGYIKKYAYLHLKNTDKLHLNADGKSYYYDGVWAWSNSSDYPMSAWE EKANTNKGK >gi|261747023|gb|ADAD01000154.1| GENE 10 7756 - 15054 9860 2432 aa, chain - ## HITS:1 COG:no KEGG:FN2047 NR:ns ## KEGG: FN2047 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 
798 2432 1 1630 1630 1804 65.0 0 MSKNLRKIEKNLRTFAKRCKNIKYTKELLFSFLLMGMLSFSDTILSSQLESTENAIKQSR KELNISIKDMHMSFKTAKKENNRLLRNSTLELIQLMEQGDQVVKSPWSSWQFGMGYSYSS WRGIYKGRGDKKEKYPYEGIFKRSNDLFLRNISPESDAYERYTASSTENFDNPATTSDRK KRKKLNDSYGLESSADQQEPIAQIELGASVKPREIVKNPITVNPPRIKINSVSPLNTPVA PTPPAAPNIGIDKFNPVAPDDITVSLPKPPTFNIKLGSYRNYMEQNTLGQIDGGRHSGDG KSYNSSSNADIDGNTLNYTAIYAWASPSTALRGAQGFNSALLKAYFDYTNRGGTGGGTAT VRGNITIDSIRGNINDSDPGARPWNNQPFLVGGSRVATLDNANGGGTIVNRATVNMVGPL VTGYEIQNDDAGSGKREVINEGELTDRAEEGYRGTEGLGGLHVGKAGGQTEASNSTNLKL APYLGGNEYPRGIDISRTPDIVDGAGNVLKRGGYTGYKIGLILTQEFDDPNPGNNYYRLI NNGKISFMGKSSIGIQVYAKPTNSPNTVIDVINGEGTGSKEITMGGIESYGMKLSSRILQ SAAGRSSVFENRGTINISGGDGSGSSLSSGMAVLEDNDMTSNKAIRAYTGMVKNKGTINV SGGTGNSGMILKVKDNDDVTNDTGGTINVTGTKNIGMRVDLGSVVTENSPANITPKAINK GTININDGDGNIGMAANNSEKNAAGEITHKAVAENTKDILFNNTSKNGIGMFSQDGGEIV NKGNIKGIGDKLEKTIGMVIQKALTTAGNNNTASSGVNAGKIELRGKQVTGVFNQGKFTM TGGSILTSGEKSISLYAKGTTATTKIVKGDITAENKALGLYADNTEIELGDAANANTVNL TAKGAGTLLFYNYTKNAGGTYTPSGKFKLLNNISGTIKDGATAFYYRDSATAATISQRLN DMFNDTGSTAGKKLKLKMEDNSTLFVLENTTPSTTAINLSSVDPSQINNYLGKRVEIDST SSKNFKAYKVSKGTLNIDSDVNLDNHSGAAIDKYYRVEFINSAVTIASGKKISGTDSGKL DQVIAQANYEGAMSRDNINVVNEGTIDYSKKGATGIVVDYGKATNKGLIKMDADNGNNQN STALFGASNSLLLNDTTGTIQLGNKGVGIWGANKIGSSIGTWSKNINITNKGKIAGINGK KGLFGIYADNKEAGATSTISHTGTIDFSKVTSSTGIYAVKGKVTSSGNISVKEGSVGINA KNSEVEINGGTYTIGKNSAGFNLTTDSGTSKFIGNSGNISLAGKGSVAYLIKGGSYNSSS NFKDNLTLTSTDGYTYMSITEAFMQYKNTKVINNDDSLFVNAKNSTVNLDNGTDISSTKN RVTGIYSDGGVASNNGKISLTGDKSSALYGKDASMANGIDGKITIGTNGSGIYTVDGNGH NNGQITVGSGSVGMRAENGNVYNRATGKIESTGENAIGMSQSGNSGEIENHGNINLVGDK SIGMHSEGATNAGHTMLNKGNVTVGDSTRADSPSIGIYSNNGLNSTIRNEGKVQAGARST GIYGRNVTLTSSSETTAGDGGIGVYSKEGTVNIEQNAKISVGKSLGASQEGAGVYLAGNG QTLNSDTDKMTIGKGSFGYVMTGQGNTVRTGLLGTTGVTTLSDDSVFMYSADRTGTITNY SNLRSTGSLNYGIYASGEVKNYGTIDFRRGIGNVGAYSYVKGGTSTPKAIKNYGIINVSG SDISTNPDDKKYGIGMAAGYSEESPAGSGNKVTRGIGNIENHGIIRVTTPNSIGMYATGK GSRILNGVNGRIELSGSKRNIGIFAENGAEVINEGTITTVGTGNVGQIGIAVTSGATLDN RGKIHIDASKGYGLLVAGGIIKNYGEFNITTGSGATKIKEVKAADTSKTLGDAGLDRIRM 
YAPAGSPNATITRNGQIQKPHLARVQAIANRKPSEIPTSSVGMYIDTSGINYTKPINNIG ALAGLTQGDLILGSEATKYTNAKDIQLGDDIIKPYNDMIRRSGIEKWSIYSGALTWMATI TQLPDYTIRNAYLSKIPYTVFAGDKKTTRDTYNFTDGLEQRYGVDGLETREKALFNKLNK IGNNERILLQQAFDEMMGHQYANVQQRVSSTGQTLDKEINGLRKNWNNKSKQSNKIAAFG SRGEYKTDTAGIIDYTSNAYGVAYVHEDETIKLGNSSGWYAGVITNIFKLKDIGRSKETT TMLKLGVFKSKAFDHNGSLQWTVSGEGYIARSDMHRKYLVVDEIFNAKAGYNTYGVAVKN EISKEFRTGEKFSIRPYGNLKLEYGRFDSIKEKAGEVRLEVKGNDYYSVKPEVGIEFKYK QPMAVKTTFTTTLGLGYENELGKVGNVKNKARVAYTKADWFNIRGEKDDRKGNFKADLNL GLENQRFGVTLNAGYDTKGKNIRGGLGFRVIY Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:32:58 2011 Seq name: gi|261747021|gb|ADAD01000155.1| Leptotrichia goodfellowii F0264 contig00237, whole genome shotgun sequence Length of sequence - 550 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 549 846 ## COG3210 Large exoproteins involved in heme utilization or adhesion Predicted protein(s) >gi|261747021|gb|ADAD01000155.1| GENE 1 3 - 549 846 182 aa, chain - ## HITS:1 COG:FN1817 KEGG:ns NR:ns ## COG: FN1817 COG3210 # Protein_GI_number: 19705122 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 5 181 154 308 2806 71 33.0 9e-13 SDRVTLTTGSLQMKDGDLVAIDVSQGHIGIGEKGIDALSLTDLELLGKTIDIAGVIKASK ETRVMVSAGGQTYQYKTKEVKSKGETYSGIAVDGKAAGSMYAGKIDIISNDKGAGVNTKG DLVSVDDVVLTANGDITTNKVNAGKKVVYKTPKKVRIKGETTSGKKVQIKAKETEIDAKV IT Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:33:02 2011 Seq name: gi|261747009|gb|ADAD01000156.1| Leptotrichia goodfellowii F0264 contig00070, whole genome shotgun sequence Length of sequence - 9356 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 5, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 
1 Op 1 . - CDS 1 - 469 508 ## COG0125 Thymidylate kinase
2 1 Op 2 . - CDS 506 - 952 239 ## gi|262038961|ref|ZP_06012299.1| conserved hypothetical protein
- Prom 1004 - 1063 4.4
3 2 Tu 1 . - CDS 1080 - 1379 242 ## gi|262038957|ref|ZP_06012295.1| TctB protein
- Prom 1461 - 1520 9.5
+ Prom 1393 - 1452 12.0
4 3 Tu 1 . + CDS 1499 - 2050 190 ## PROTEIN SUPPORTED gi|52081538|ref|YP_080329.1| ribosomal protein S2
- TRNA 2121 - 2195 49.2 # Cys GCA 0 0
- Term 2341 - 2388 7.1
5 4 Tu 1 . - CDS 2440 - 3243 1026 ## COG3315 O-Methyltransferase involved in polyketide biosynthesis
- Prom 3482 - 3541 13.6
+ Prom 3346 - 3405 13.7
6 5 Op 1 2/0.000 + CDS 3494 - 4609 1349 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component
7 5 Op 2 20/0.000 + CDS 4637 - 5758 1698 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component
+ Prom 5806 - 5865 6.5
8 5 Op 3 24/0.000 + CDS 5885 - 6781 1151 ## COG0559 Branched-chain amino acid ABC-type transport system, permease components
9 5 Op 4 19/0.000 + CDS 6793 - 7836 1348 ## COG4177 ABC-type branched-chain amino acid transport system, permease component
10 5 Op 5 18/0.000 + CDS 7852 - 8610 237 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9
11 5 Op 6 .
+ CDS 8645 - 9355 241 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 Predicted protein(s) >gi|261747009|gb|ADAD01000156.1| GENE 1 1 - 469 508 156 aa, chain - ## HITS:1 COG:FN1323 KEGG:ns NR:ns ## COG: FN1323 COG0125 # Protein_GI_number: 19704658 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate kinase # Organism: Fusobacterium nucleatum # 1 156 1 154 225 174 56.0 6e-44 MGKLIIIEGTDGSGKQTQTELIYKRLCELKGKEKIKKISFPNYESKASEPVKMYLAGEFG KRVDSVNAYAASLFYSVDRYASFKKDWEDFYNNGGIIVSDRYTTSNMVHQAPKIGNEKER EKYLEWLVDLEWEKMPIPEPDLVFFLDVPFEFSQKL >gi|261747009|gb|ADAD01000156.1| GENE 2 506 - 952 239 148 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038961|ref|ZP_06012299.1| ## NR: gi|262038961|ref|ZP_06012299.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 148 1 148 148 259 100.0 6e-68 MAKGTRKILIKDYLNYIIAEKINRVEEGYEEKFIVKRDIKSIFEFVIGILSFLAILGPVV DKIFSDCPGNIYENPSGTYEISLTKDYIKIKDVIYNRNENKIFFNKWNNFIILDLEGKIL QQIYFNNITRPDLFLNLMKYFFAEKEDL >gi|261747009|gb|ADAD01000156.1| GENE 3 1080 - 1379 242 99 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038957|ref|ZP_06012295.1| ## NR: gi|262038957|ref|ZP_06012295.1| TctB protein [Leptotrichia goodfellowii F0264] TctB protein [Leptotrichia goodfellowii F0264] # 1 99 1 99 99 153 100.0 4e-36 MKTDKKIERLIEKNNISKEGLKLVLITKKSLFQKIFTFYIGFLVIMGITVCFFSVLQGQW PKFLNLETILLFIFLAVLFVFNLSRSPIIYFYNDGFIVG >gi|261747009|gb|ADAD01000156.1| GENE 4 1499 - 2050 190 183 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|52081538|ref|YP_080329.1| ribosomal protein S2 [Bacillus licheniformis ATCC 14580] # 48 174 49 174 174 77 37 3e-14 MINLIQNIDISCIEFLYKIQHNLKSDIINKTMIFFTNLGDNGIIWIMISLILLCTKKYRK SGIISVISLIVCSVTVNLILKPLVHRPRPFNEIAHIVLLIKAPKDFSFPSGHTAASFTAV YIFFKNMKKYFFPVLIIALFIAFSRLYLTVHYPSDVFAGILIGLFSGFAGEKIFYRFLNK NSS >gi|261747009|gb|ADAD01000156.1| GENE 5 2440 - 3243 1026 267 aa, chain - ## HITS:1 COG:FN0388 KEGG:ns NR:ns ## COG: FN0388 
COG3315 # Protein_GI_number: 19703730 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: O-Methyltransferase involved in polyketide biosynthesis # Organism: Fusobacterium nucleatum # 1 264 5 269 269 326 66.0 4e-89 MKIKLDGVMETLLITLYARGEDAKKEKSVLEDKKSEEIMSMIDYDFSKFKSGWLSYYGIL ARAKTMDDQVKKFISENPESVIVSVGCGLDTRFNRIDNGKIEWYNLDFPEVIEKRKLFFP ENERVKNISKSALDPAWTQDVKVNGKKLLIISEGVLMYFDENEVKDFLKILTDNFDEFEA QFDLISKGALKMQKEHDTLKKMNAKFKWAVKDGSEVVKLNPLIKQTGLINFTAEMKKILP FTKKWIIPFMYLYNNRLGIYEYKKKND >gi|261747009|gb|ADAD01000156.1| GENE 6 3494 - 4609 1349 371 aa, chain + ## HITS:1 COG:FN1432 KEGG:ns NR:ns ## COG: FN1432 COG0683 # Protein_GI_number: 19704764 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Fusobacterium nucleatum # 1 367 1 377 383 265 42.0 1e-70 MKRILLTALLLIIFIVSCGEKKSADSNVIKIGVIAPLTGNYAQYGTAVKDGVELKVNEIN KAGGINGKKIELIIADSKGDVQESVNAFKKMVSQDKVTAVIGEVVSATTQAISDLAQTAK VPLISATATSIDVTKGKDYVFRTTFTDPYQGTATAKYIHSKGIKSVAILTNSSNDYSVGV ANAFKEQAAKDGINIVEEKYTGDDKNFKAILTKLKGQNTEAMFIPDYYNTIGLIISQARD LGINTRYFGGDGWDGIQTDFGEVAEGAVFASQFSAEDTSEIVQKFINSYKTEYKKEPTMF AALGYDTVEILGTALKSSKDLTGSSIKEALNNVNGIELITGKLKFDADRNPEKSVTFIEI KGGKLTLKEKF >gi|261747009|gb|ADAD01000156.1| GENE 7 4637 - 5758 1698 373 aa, chain + ## HITS:1 COG:FN1432 KEGG:ns NR:ns ## COG: FN1432 COG0683 # Protein_GI_number: 19704764 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Fusobacterium nucleatum # 2 369 3 377 383 288 43.0 9e-78 MKKLLLLMSLLMVFILSCGGKKEAAQDKDSIKIGVIGPLTGDIAQYGTSTIDGFKLRVKE INAAGGIKGKKIELVIADSKGDPQEAISIFKKMVSQDKVDFIVGEVASTASLAISDFAQK AKVPMLTPTSTLFDITKGKDYVFRVTFTDPYQGVAMAKYVKERGIKNITILINTSNDYDV ALAQAFKEQAEKDGIKISEEKYTKDDKDFKSVLTKVKAQNPEAVFIPDYYNTVGLILTQA NEIGLKTQFLGGDGWDGIQTNFGAVAEGGIFASQFAPDDPSEIVQKFIKAYKAEYNTEPI 
IFSALGYDAGTVVEAALNSAKDLSRESIKEAFKASNIDNLVTGSLKFDENRNPEKKVSFI EVKNGKLTLKDKF >gi|261747009|gb|ADAD01000156.1| GENE 8 5885 - 6781 1151 298 aa, chain + ## HITS:1 COG:FN1431 KEGG:ns NR:ns ## COG: FN1431 COG0559 # Protein_GI_number: 19704763 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid ABC-type transport system, permease components # Organism: Fusobacterium nucleatum # 5 298 16 308 308 252 52.0 8e-67 MFKNFIEQTINGLQTGSIYALIALGYTMVYGIVKLINFAHGDILMVGAYAALVAVSNGMP FSLALILSIVFCSILGVVIDYFAYRPLRNSPRISALITAIGMSFMLESLALVIFGATPKV INTNLLPVFLSPKVKINLGFISVSMLTIFVIIVTLICMLVLNLFIKNTKLGKATRAVSQD TGAAKLMGINVNLTIAITFAIGSGLGALGGVMYAIMYPTIEPYMGMLPGLKAFIAAVFGG IGSIPGAMVGGYVLGIIESYTKGYISSTWANPIVFGVLILILIFKPNGLFGKNMKEKV >gi|261747009|gb|ADAD01000156.1| GENE 9 6793 - 7836 1348 347 aa, chain + ## HITS:1 COG:FN1430 KEGG:ns NR:ns ## COG: FN1430 COG4177 # Protein_GI_number: 19704762 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport system, permease component # Organism: Fusobacterium nucleatum # 60 345 2 285 285 233 49.0 5e-61 MQDTKQITEDRWKDLNNFNKINIRNYSLSFLFIIVLYFILRLSFDPSDPFNYTGGIYINI LIAILFSFSLNIAVGIMGQLSLGHAGFIAVGAYTGAVLSKALLPYNLPPILQLTITGVAG GIIAGLFGFFVGGSTLRLRGDYLAIITLAFGEIIKYVIQNMDFLGGATGLKNIPNIVTFD NVYLISIISMLIMGMIMISRKGREIQSIRENEIAAENIGIHINKVKLYGFALSAFFAGVG GSLYAHNVGVLTPDKFGFMFSIEILVMVVFGGLGSITGSILSAVLLTLLNEQLRQASEYR YLVYAVILIILMIYRPDGVFGKKEITIPRFINKFKKIKNNWNNKKNN >gi|261747009|gb|ADAD01000156.1| GENE 10 7852 - 8610 237 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 14 242 33 252 329 95 27 1e-19 MSLLKTTDLGISFGGLRAVDNVNIEINNQELVGLIGPNGAGKTTIFNLLTGVYKPTDGDI FVNETNINKKSTPQIVSLGVARTFQNIRLFKDLSVLDNVKIALNNSMSYSTLDAVFRLPK FWKEEKEVTDKALDLLDIFEMAEMGNITAGNLSYGQQRKLEIARALATSPKLLLLDEPAA GMNPNETKDLMNTISFIREKFKIAILLIEHDMDLVMGICERLYVLNFGKIIAEGLPEEIQ NNKEVITAYLGE >gi|261747009|gb|ADAD01000156.1| GENE 11 8645 - 9355 241 
237 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 3 229 1 231 245 97 26 3e-20 MDILDVNDLNIYYGGIHAIKNISFNIKKGEIVSLIGANGAGKTSTLHAVSGLIPIKSGEI SLNGINTTNTEAHKLVKLGMAHVPEGRRIFTELTVTENLEMGAYTRNDKDSVKKDMENQF KLFPRLAERKKQLAGTMSGGEQQMLAMARALMSRPALLLLDEPSMGLAPLLVQEIFKIIE KINKDHGVTILLVEQNANMALSIANRGYVLETGKIILEGTGKELLSNPEIKKAYLGG Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:33:17 2011 Seq name: gi|261747007|gb|ADAD01000157.1| Leptotrichia goodfellowii F0264 contig00068, whole genome shotgun sequence Length of sequence - 256 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 120 - 254 71 ## Predicted protein(s) >gi|261747007|gb|ADAD01000157.1| GENE 1 120 - 254 71 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no IIDAMFIANVGVINRAYCNVNTKGYFHVVNGLNVINRAYCNVNL Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:33:25 2011 Seq name: gi|261746993|gb|ADAD01000158.1| Leptotrichia goodfellowii F0264 contig00188, whole genome shotgun sequence Length of sequence - 11684 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 4, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 296 - 334 -0.5 1 1 Op 1 4/0.000 - CDS 534 - 1694 1323 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) - Prom 1715 - 1774 2.5 2 1 Op 2 . - CDS 1790 - 3247 1988 ## COG3119 Arylsulfatase A and related enzymes 3 1 Op 3 . - CDS 3234 - 3374 243 ## gi|262038982|ref|ZP_06012318.1| conserved hypothetical protein 4 1 Op 4 . - CDS 3376 - 4851 2070 ## COG0591 Na+/proline symporter - Prom 4872 - 4931 7.1 5 2 Op 1 . 
- CDS 4937 - 5788 1392 ## COG0191 Fructose/tagatose bisphosphate aldolase 6 2 Op 2 2/0.000 - CDS 5867 - 6304 668 ## COG2893 Phosphotransferase system, mannose/fructose-specific component IIA 7 2 Op 3 . - CDS 6335 - 7519 1558 ## COG2222 Predicted phosphosugar isomerases - Prom 7577 - 7636 9.4 8 3 Op 1 . - CDS 7777 - 8160 364 ## t1803 glucose-1-phosphatase/inositol phosphatase 9 3 Op 2 . - CDS 8188 - 8499 365 ## gi|262038976|ref|ZP_06012312.1| glucose-1-phosphatase - Prom 8519 - 8578 5.4 10 3 Op 3 . - CDS 8581 - 9585 1156 ## COG1609 Transcriptional regulators - Prom 9664 - 9723 15.9 + Prom 9663 - 9722 8.4 11 4 Op 1 . + CDS 9789 - 10529 725 ## COG2188 Transcriptional regulators 12 4 Op 2 . + CDS 10593 - 11666 934 ## COG0820 Predicted Fe-S-cluster redox enzyme Predicted protein(s) >gi|261746993|gb|ADAD01000158.1| GENE 1 534 - 1694 1323 386 aa, chain - ## HITS:1 COG:PM1678 KEGG:ns NR:ns ## COG: PM1678 COG0641 # Protein_GI_number: 15603543 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Pasteurella multocida # 4 353 16 351 374 229 36.0 9e-60 MRALNLLIKPYSSGCNLRCKYCFYYDVADNRIIKNYGPMKFDVLEKLVKEAFAYADTIVN FMFQGGEPTLVGIEYYRKFHEYVEIYNKNNIRTAFFMQTNGTLLNKEWIELYKKYNYLIG ISIDGYKEIHDVFRLSAKNKGTFEQVIKGAELLKNNNIEFNVLCVINKLVAENGKKVYSF FKEKDFRYMQFIPCIDSFNEKNENEDYTLTAKDYGIFLNDTFSLWYEDFIKGNFISIRYF DNLIRILLGEPPEACDMMGFCSVNGVVESNGDIYPCDFYVLDEYKIGNIVNDKFENILFS ENAVKFYTSSLKMSEKCKKCKYLKICRSGCRRYKNFDGTENLYENKFCDAYMSFFGKNLE KLIEVAKITRKIRYENIRSSQNINRL >gi|261746993|gb|ADAD01000158.1| GENE 2 1790 - 3247 1988 485 aa, chain - ## HITS:1 COG:PM0598 KEGG:ns NR:ns ## COG: PM0598 COG3119 # Protein_GI_number: 15602463 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pasteurella multocida # 19 426 2 447 467 135 26.0 1e-31 MKKSNVLLIVADDLGAWALGCYGNKDAITPNIDKLAKEGKIFENFFCVSPVCSPARASIF TGRIPSQHGIHDWLDEWENGTTTEDYLKGQSTFVDILSKNNYICCMSGKWHMGLAETPQK 
GFGYWYSHQKGGGPYYKAPMYKDGNLVHEEEYITDKITDYAIEFLDKIYENEKPFFLSLN YTAPHSPWDRKNHQEEILKLYENCKFESCPRDPYHPWKIAETFEGNEEERREILRGYFAA LTSMDFNIGRVLDELEKKNILEDTLVIFTSDNGMNMGHHGIFGKGNGTSPLNMYDTSVKV PFIIYKKGETDADKVGNVLSHYDVRATLLEYLKLEDIKDETVNYPGKSFAEILNNEQKID DENVVIYDEYGPTRMIRNKKYKYVHRYPDGPYEFYDLEKDPEERTNEIHNKIYYNIIDEM RKELEIWFLNHVNKEIDGAVLPIYGAGQKKLAGKWGNYARGSFGRYHSRFIFSSDEELKR KKRRN >gi|261746993|gb|ADAD01000158.1| GENE 3 3234 - 3374 243 46 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038982|ref|ZP_06012318.1| ## NR: gi|262038982|ref|ZP_06012318.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 46 1 46 46 66 100.0 5e-10 MSILILFWKFLIVFGLIWYVATIVIVGIKGFRNIKDMLEAVSHEKK >gi|261746993|gb|ADAD01000158.1| GENE 4 3376 - 4851 2070 491 aa, chain - ## HITS:1 COG:PA1418 KEGG:ns NR:ns ## COG: PA1418 COG0591 # Protein_GI_number: 15596615 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Pseudomonas aeruginosa # 8 455 3 435 463 111 27.0 3e-24 MLNFNWFLDGSIVGVYILLSLIVGLIIKKYVKNVDDFLVAGRSVDLYVGMASLAATEFGI ITCMAASQLGYKYGFSGATVGLLMSIIMFTVGKTGFCIEPLRRAGVVTIPEFFEKKFGRK IRWASGVVIVLGGLLNMGVFLRTGGEFLVYVAGLNPKYLEITMTILLLIVAVYTIFGGML SVLVTDYMQFLLMSIGLVLTTFTIFYKLGWNKLFDTVIEKYGEGGLNPFVHPQLGWQYVL YTALVLSATVLTWQTMISRVLSAKDEKVAKKIYTRTSMFYIVRSLIPVLWGVAALSLISL DSVGGAPIKAMPKMLSTLLPSGIIGILVAAMLAADMSTDSSYLLGWASVIYNDILVIFHK NSWEEKKAIFVNRLLVALIGIFLLLYGLWYPLKSDLWVYMTLTATIYSVSVSTLLIAACY WKKANDWGAYASIIVGAIIPISFLIAQQVDPLKPIAKAIGPYYSGISAFVFAWIAMIAGS YMKNIFVKDVR >gi|261746993|gb|ADAD01000158.1| GENE 5 4937 - 5788 1392 283 aa, chain - ## HITS:1 COG:YPO0844 KEGG:ns NR:ns ## COG: YPO0844 COG0191 # Protein_GI_number: 16121152 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Yersinia pestis # 2 282 3 283 284 362 61.0 1e-100 MIVSTREMLLKAQKEKFAVPAFNFHNLETIQTIVEGAAEMRSPVILAGTPGTFDYGGRDY 
LQAIIEVASNKYSIPVAIHLDHHETFESIKKSVDLGTKSVMIDASHHPFEENIKLVKEVV EYAHRYDATVEAELGRLGGQEDDLVVDEKDSFYTNPDAAVEYAERTNIDSLAVAIGTAHG LYKSEPKLDFDRLAEIREKVSIPLVLHGASGVSHDDVRRCISLGITKVNIATELKIPFSN ELRKYLIEHPEANDPRKYMAKAKEEMKKVVIEKIKMCMSDGKY >gi|261746993|gb|ADAD01000158.1| GENE 6 5867 - 6304 668 145 aa, chain - ## HITS:1 COG:ECs4014 KEGG:ns NR:ns ## COG: ECs4014 COG2893 # Protein_GI_number: 15833268 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose-specific component IIA # Organism: Escherichia coli O157:H7 # 1 145 1 144 144 103 40.0 1e-22 MIGIILTGHGKIAEGIESSIKLVFGKPEKFYSINFTEDITPELLEKEIEEKIDELNGEEG VLIFTDIAGGTPFKTASVLSLKKEKVKVISGMNLPMVLETVCEREGYDSINELYLNAVET GKSQITGFELKDTKSKNEGEFEEGI >gi|261746993|gb|ADAD01000158.1| GENE 7 6335 - 7519 1558 394 aa, chain - ## HITS:1 COG:SPy0716 KEGG:ns NR:ns ## COG: SPy0716 COG2222 # Protein_GI_number: 15674774 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted phosphosugar isomerases # Organism: Streptococcus pyogenes M1 GAS # 19 392 16 389 399 282 41.0 1e-75 MNLLLGIDENFLKEKKSYITAKEILQQPELWKETLEIFKNSRKQLKEFLKKIDFNETFDV IFTGAGTSEYVGNILEPLLRKESKTEFKSFATTDILNNPLNYFKKEKKTLLVSFARSGDS PESMAVVDIANENVDNVYHLFITCNKEGALAKSSVNNEKAFLILMPEKSNDKGFAMINSF SCMLLTGILVFGKNSEDVIEDMQKIIVLAENELKEKYEKIKELADYDNKRIVILGSGILK GLAQELSLKVMELSAGKIVSVNNTTLGFRHGPKAIIDKETIVFDLVNQDDYAKKYDEGLL EEMFSDGTADKIIAYNITENKKISENTDEVIIPESKKIQEIKDKELCSLFIYLIYGQMYA FFKSQYLGNTTDNPFPTGEVNRVVKKFEVHKFNK >gi|261746993|gb|ADAD01000158.1| GENE 8 7777 - 8160 364 127 aa, chain - ## HITS:1 COG:no KEGG:t1803 NR:ns ## KEGG: t1803 # Name: agp # Def: glucose-1-phosphatase/inositol phosphatase # Organism: S.typhi_Ty2 # Pathway: Glycolysis / Gluconeogenesis [PATH:stt00010]; Microbial metabolism in diverse environments [PATH:stt01120] # 1 122 287 411 413 129 49.0 3e-29 MVKFIYKDIFDSERRITVTVGHDSNIAALFSTLDIKKHTLEKQYEMFPVGGKIFFQIWKD RVSKKRKVKIEYVYQSTEQLKNGEQLGLKNPPMRKVLEMEECPVDKNGFCSYEKFEEVLK NARNKKY 
>gi|261746993|gb|ADAD01000158.1| GENE 9 8188 - 8499 365 103 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038976|ref|ZP_06012312.1| ## NR: gi|262038976|ref|ZP_06012312.1| glucose-1-phosphatase [Leptotrichia goodfellowii F0264] glucose-1-phosphatase [Leptotrichia goodfellowii F0264] # 1 103 1 103 103 195 100.0 1e-48 MNKDTIKNFELEKVFIFSRHGIRTPVLPPESRLYKITPHKWKEWDRKPGHLTKKGTLIVD ALLLQLYEGEKIENIVNGNIDLEEKWKRLNEIKDTYIQTLFGN >gi|261746993|gb|ADAD01000158.1| GENE 10 8581 - 9585 1156 334 aa, chain - ## HITS:1 COG:SP1821 KEGG:ns NR:ns ## COG: SP1821 COG1609 # Protein_GI_number: 15901650 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 2 332 4 332 333 202 34.0 1e-51 MIQKSKSTIKDVAKKTGYSIQTVSRVINKSPDVKESTRKIIESAIKELEYKPNFYARNLN SKKNVNVLISIRREKGHDATIWLNNLVNEIIVVNNTPKISIFVEQFYEEKELKKSLLYTT SNFIDGAVIFNKLENDSRISYLKNNNVPYVIFGKSSKDEDIFVASDDYNSFITGTKYLFK KGVRKIDFITGNDTFKEDDRERGIADTYRKKNVDLKYFNVIRKMKNQEEIYKEVIKKIEN NQLPEAFFISGDEKAAGVLRALNEKSIKIPEEIMILGYDNIPISKYYFPSLSTIDLNYKK IADKLFRKIINLINGKEEKSEFVAGELIIRNSTK >gi|261746993|gb|ADAD01000158.1| GENE 11 9789 - 10529 725 246 aa, chain + ## HITS:1 COG:CAC0189 KEGG:ns NR:ns ## COG: CAC0189 COG2188 # Protein_GI_number: 15893482 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 1 241 1 242 243 131 34.0 2e-30 MKSTDKLPAIPLYYRIYEHFKTLITNEKLSEGEALPPERDLADTFKVSRATIRQALQKLQ EDNLVYKLHGNGTFVSHKTVKQELTSFYSFYEETVKAGKIPSSRVLKHEVISSDKEFAEI FKIPLTVNILHITRLRLINDEPIMYEDTYIPLNRFENFDPELLNEKPMYSIFKEQYNVSF DKATESFSSLIIKDKEILDNLGYKEKSSCMLIKRLTYEKNRVIEYTVSYARGDKYEYKVI LNNIEK >gi|261746993|gb|ADAD01000158.1| GENE 12 10593 - 11666 934 357 aa, chain + ## HITS:1 COG:FN0526 KEGG:ns NR:ns ## COG: FN0526 COG0820 # Protein_GI_number: 19703861 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Fusobacterium nucleatum # 8 354 4 358 358 313 49.0 3e-85 
MTATDTKDKIDILGFNLEKLQNIFADTGLKKFNANQVYDWLHNKLVFDFDKFTNISKHDR EILKKKFALPKLVHRSHQISEDRDTEKFLFELKDRRLIESVLISHKNRHTLCVSSQIGCL IGCDFCATATMKYERNLDASEILMQFYHIQNYLKEKNEKLGNVVFMGMGEPFLNYDNVIE SINILNSDKGQNFSKRNFTISTSGIVPVINKFTEDENQINLAISLHSVKDDIRSELMPIN KTYKVKELKEALINYQKKTKNRITFEYILIDDLNCETKDAFELMNFLHSFSCLVNLIPYN PVAGKPYSTPSKKKQREFYTLLKDKNVNVTLRETKGQDIAAACGQLKAKKEMDNGKQ Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:33:45 2011 Seq name: gi|261746975|gb|ADAD01000159.1| Leptotrichia goodfellowii F0264 contig00001, whole genome shotgun sequence Length of sequence - 15745 bp Number of predicted genes - 18, with homology - 17 Number of transcription units - 9, operones - 6 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 35/0.000 - CDS 1 - 1696 1944 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 2 1 Op 2 . - CDS 1689 - 3419 215 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 3 1 Op 3 . - CDS 3426 - 4169 917 ## COG4947 Uncharacterized protein conserved in bacteria 4 1 Op 4 . - CDS 4203 - 5402 1504 ## Lebu_1198 hypothetical protein - Prom 5432 - 5491 8.8 + Prom 5472 - 5531 8.3 5 2 Tu 1 . + CDS 5552 - 6043 476 ## gi|262038988|ref|ZP_06012323.1| envelope glycoprotein + Term 6047 - 6107 0.3 + Prom 6079 - 6138 12.0 6 3 Op 1 . + CDS 6158 - 6571 446 ## gi|262038999|ref|ZP_06012334.1| ribosomal protein L7/L12 7 3 Op 2 . + CDS 6586 - 6999 435 ## gi|262038995|ref|ZP_06012330.1| phosphoprotein + Prom 7018 - 7077 4.3 8 4 Tu 1 . + CDS 7145 - 7474 325 ## gi|262038985|ref|ZP_06012320.1| hypothetical protein HMPREF0554_0008 + Term 7484 - 7536 7.1 - Term 7475 - 7520 3.2 9 5 Op 1 . - CDS 7684 - 9135 1715 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase 10 5 Op 2 . - CDS 9151 - 10299 1209 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 10409 - 10468 9.9 + Prom 10288 - 10347 10.8 11 6 Op 1 . 
+ CDS 10449 - 11198 847 ## Lebu_1633 lipopolysaccharide biosynthesis protein 12 6 Op 2 . + CDS 11218 - 12162 1152 ## COG4989 Predicted oxidoreductase + Prom 12191 - 12250 16.2 13 7 Op 1 . + CDS 12319 - 12522 112 ## COG1724 Predicted periplasmic or secreted lipoprotein 14 7 Op 2 . + CDS 12559 - 12960 435 ## PTH_0968 hypothetical protein + Term 13042 - 13097 -0.6 15 8 Op 1 . - CDS 13119 - 14045 1181 ## COG0524 Sugar kinases, ribokinase family 16 8 Op 2 . - CDS 14075 - 14362 145 ## 17 8 Op 3 . - CDS 14320 - 14715 282 ## gi|262038989|ref|ZP_06012324.1| conserved hypothetical protein - Prom 14876 - 14935 4.4 18 9 Tu 1 . - CDS 14950 - 15744 971 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase Predicted protein(s) >gi|261746975|gb|ADAD01000159.1| GENE 1 1 - 1696 1944 565 aa, chain - ## HITS:1 COG:CAC3415 KEGG:ns NR:ns ## COG: CAC3415 COG1132 # Protein_GI_number: 15896656 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Clostridium acetobutylicum # 5 565 4 564 627 688 62.0 0 MNKSENKENKGKQFKALLRLIGYMFNLYKFHFAAVVIFILLSSLSMVKGTMYTKQLIDGY IVPNIGNPNIDFTPLVKIILSMIGVYSVGILCSYLSGRFMVVTAQGTLKQLRDDVFIHME KLPIKYFDTHAHGDIMSVYSSDIDTLRDMITESLAQIISAIITIVSVLASMFILSVPLTI FAVFMIVLIITTTKIISEKSGKNFKEQQENIGSVNGYIEEMLEGLKVVKVFSHEEESKER FDDLNEKLFVSSDKANKYGNILGPVVGNLGNINFVLTASIGSFLALGGIGGFTLGGLASF LQFTRTLNQPIAQTTQQINSVMLAAAGALRVFKLLDENPEFDEGYVTLVNAEFDSENNIR ETEKHTGMWAWKHPHEDGNVTYEKLLGDVVFEHVDFGYNENKIILHDINLYAEPGQKIAF VGATGAGKTTITNLINRFYDIQNGKIRYDGINIKKINKSYLRKSLGIVLQDTQLFSGTVA DNIRYGKLNASDEEVYAAAKLANAHHFIKHLPKGYDTYLANGGANLSQGQRQLLSIARAA IADPPVLILDEATSSIDTRTEKIVQ >gi|261746975|gb|ADAD01000159.1| GENE 2 1689 - 3419 215 576 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 346 546 34 240 329 87 28 6e-17 MLKKLLSYVKEYKIPSILSPVFISVEVMLEILIPFLMASIIDDGLEKGNMQHIYFIGFIT LIVAMVSLYTGFSAGKYAAKASAGFAKNIRKAVFYKIQDFSFTNIDKFSTAGLITRFTTD 
IANIQNSYQMILRIFVRAPLMLIFATLMTIYINAKLSLIFIGAIVIFGIVMTVVILLVYP IFTKAMRKYDNINSSLQENINGIRVVKAYVREDYEINKFEKATDELKNTLLSAEKIAVFV SPAMLFCMYGCIILLSWFGAKMIVVNELTTGQLMSLFSYTANIIISILMVTMIIVRLTIS RASAERIVEVLNEEPTIKNPENPVYEVKDGSISFENVNFSYSNNPDILNLKNINLEIKQG ETIGIIGGTGSAKSALVQLIPRLYDVLNGTVKVGGINVKDYDIKTLRDNVAMVLQKNILF SGTIKENLMWGNRNATEEEMIHACKLAQADEFIQKFPDKYDTYIEQGGSNISGGQKQRLC IARALLKNPKILILDDSTSAVDTKTDRLIREAFKNEIPNITKIIIAQRISSIKEADKIIV LDDGNISGTGIHEELIISNNIYKEVHDSQEEGGKNE >gi|261746975|gb|ADAD01000159.1| GENE 3 3426 - 4169 917 247 aa, chain - ## HITS:1 COG:mll2788 KEGG:ns NR:ns ## COG: mll2788 COG4947 # Protein_GI_number: 13472478 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Mesorhizobium loti # 1 244 10 258 267 204 42.0 2e-52 MYTEYKKDYSNYLGREFEFKRYGNSGKPCLVFPPQDGRYHDYEDFKMVETLSDYIEKGEL QLFCVVSIDSETWSDRNGNPRERIEKQEKWFKYMSEEFIPRIYEWTGRRDLIVTGCSMGG SHAGILFFRRPDLFETLISLSGAYNPSMFFGDYMDNLIYDNSPVHFLRNMPSDHYYLDLY RQKNIIICIGQGAWEEDLIPGNKEMEVILKEKNVPAWVDFWGYDVAHDWNWWQIQIKYFM EKVLRKG >gi|261746975|gb|ADAD01000159.1| GENE 4 4203 - 5402 1504 399 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1198 NR:ns ## KEGG: Lebu_1198 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 397 1 397 401 556 71.0 1e-157 MNFIYISPHFPKTNWNFCDRLKQNGVTVLGIADVSYDELDERLRNSLTEYYKVSSLENYD EVLKAVGYFTFKYGKIDWLESNNEYWLEQDARLRTDFNINTGIKSDEILNIKEKSLMKNS YKKAGVGTAHYHTVSTLEEGKEFVKKVGYPVVVKPDNGVGASNTYRIRNEKELTEFYKNL PNVKYIMEEYVSGDLVSYDAIIDGEGNPIFESGITWEPTIMDIVNEGLDLYYYVRKELPP KLVDAGRRTVKGFGVKSRFIHTEFFRLKEDKEGLGKKGDYIGLEVNMRPAGGYTPDMYNY ANNTDVYQIWADMVAFGKIVNAPLNAELEKYFCVYVSRRDTRNYVHTHEEIIEKYNNQLV MYERMPDLYSAAMGNNMYTAKFLLKEDMDEFVDFVHKIV >gi|261746975|gb|ADAD01000159.1| GENE 5 5552 - 6043 476 163 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038988|ref|ZP_06012323.1| ## NR: gi|262038988|ref|ZP_06012323.1| envelope glycoprotein [Leptotrichia goodfellowii F0264] envelope glycoprotein [Leptotrichia goodfellowii F0264] # 1 163 1 163 163 
257 100.0 2e-67 MSFEDDVRKKLNDDEKEEKKELLEFENTPAPLNQYESSYQGKGSSVPEVKLKLDEMMKKD LKMIALYNKIIGVLMMIQGVLTAISLIGIPLIFIALKLFDSSKAIERLLYSNDEIALREY FNAQGKYSKYSLIYLAVYLIFMLLMIFFFIVIVGVAANSYNQY >gi|261746975|gb|ADAD01000159.1| GENE 6 6158 - 6571 446 137 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038999|ref|ZP_06012334.1| ## NR: gi|262038999|ref|ZP_06012334.1| ribosomal protein L7/L12 [Leptotrichia goodfellowii F0264] ribosomal protein L7/L12 [Leptotrichia goodfellowii F0264] # 1 137 1 137 137 209 100.0 7e-53 MNTEISNEEITELSEYSSNGYENEPNNSNEISFELDDLTVKNINSISLILKIWGVLNIII GIMECLTIIKIISGILKIIAAFSLFEVAKFLKQSLYSKDERNIKGYFNAASKVLVLTVII FILEAAFSIFRIITHLI >gi|261746975|gb|ADAD01000159.1| GENE 7 6586 - 6999 435 137 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038995|ref|ZP_06012330.1| ## NR: gi|262038995|ref|ZP_06012330.1| phosphoprotein [Leptotrichia goodfellowii F0264] phosphoprotein [Leptotrichia goodfellowii F0264] # 1 137 1 137 137 218 100.0 9e-56 MSIADDVRKKLNSKDKEIPEYFSDDSETQRKITFNVDDLIIKNSNTISLTSNIIGIIYIM LGILFCITVIGAIAGIPLINVALKIFDSAKHLKKGISQKNGEKIKTYFDILAKALKLLII VTVLEIILIILYFRAFR >gi|261746975|gb|ADAD01000159.1| GENE 8 7145 - 7474 325 109 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262038985|ref|ZP_06012320.1| ## NR: gi|262038985|ref|ZP_06012320.1| hypothetical protein HMPREF0554_0008 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0008 [Leptotrichia goodfellowii F0264] # 5 109 1 105 105 129 100.0 5e-29 MDTEMVKALKFLSIVCKIFGVLITVGGVIYALFLIGIPLIFIGIKVYKTGEYIDNAIINK NGENLRLSILTIAKAIKYYLILYVGIIALMFFIFFLIMSFGLIAALLGN >gi|261746975|gb|ADAD01000159.1| GENE 9 7684 - 9135 1715 483 aa, chain - ## HITS:1 COG:CAC1435 KEGG:ns NR:ns ## COG: CAC1435 COG2265 # Protein_GI_number: 15894714 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Clostridium acetobutylicum # 1 483 4 456 456 397 46.0 1e-110 MKRGEEIEIEVTGIEFPNKPYGVYGEKKVYPVGNYIIGHKLKGTVTKIRNKKSELKKIEI 
LEKAANEIEPFCPHFNICGGCTFQNMNYGDQTELKSSLVLNILKRAANYEFEYEKIIKSP KEFEYRNKMEFSFGNEIIDGPLTLGMHKKGSFHDIITVNECKLMDIDFRKILTATADYFR KKEEEGELSFYHRIQHIGYLRNFVIRKGEKTGEISINLITTSQIDFDLSEWKEIMLNLEL KNKIYGIIHTINDNLSDSVQSDEENILYGNRDINERIFDLNFKISPYSFFQTNSDGVELL YGKVMEYIDVISENEELNNSSIAITVIKNPQNVKDCVVFDLFSGTGTIGQIVSKKAKQVY GIELIEEAVKKANETAKFNNIKNAEFIAGDVFEKLDELEERDVKPDIIILDPPRPGVGEK TITKLLKYNVSNIIYVSCNPKTLAQDLAIFHNNGYRLVKSCPVDMFPQTPHVEVVNLLVK NSI >gi|261746975|gb|ADAD01000159.1| GENE 10 9151 - 10299 1209 382 aa, chain - ## HITS:1 COG:CAC0375 KEGG:ns NR:ns ## COG: CAC0375 COG0436 # Protein_GI_number: 15893666 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Clostridium acetobutylicum # 3 382 5 384 386 262 38.0 1e-69 MYINKTVKEIQISDIRKIYEKMQTHENPLNMSLGEPDIDVPDKVKEAVAYHALNTRIKYS PVGGIPELRGKIAEFYNKNFEGSFTMDNVLVTVGSTEGLASVMKAVIAEGDEVLMPTPAY VGYGSLIKMTGGVPKYIDLRENDFNLTKEILEKNVTEKTKLIILTYPNNPSGAILHEEEM EKIAEFLKDKEIYLLSDEIYGSITFGKYTSFGKYSEILKKQLIIISGFSKSHSMTGYRIG YIITNPELQLQVKKVSQYNVTSTSTLSQYGALTALEKCSDRKQISEIYRKRVEYFLKELE KMRFKCIKPEGAFYIFAGYENIDKLKNMKSLDFALDLLEKTGLAIVPGSTFQVEKYVRFS IVHDIPVLEEAVKRLKEYVENL >gi|261746975|gb|ADAD01000159.1| GENE 11 10449 - 11198 847 249 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1633 NR:ns ## KEGG: Lebu_1633 # Name: not_defined # Def: lipopolysaccharide biosynthesis protein # Organism: L.buccalis # Pathway: not_defined # 1 248 1 248 250 263 64.0 5e-69 MSNEKQQSINLNIIIKILYKNLVMIVLITAMITALGAVYAFTNKSYKSEINLYGNDRVLN EIGETSQYSLNSFDFFLFIKKNSKTLKNTGLSDEKFLKDMSSRLNAQSETNNPTIKVKFS TKSKTEGEEFSKEYVQLATEYLGNKENKFLDTQIKLLEEQYNFITKNTDIRTTKDSLSDT LVSRLAYYRLLKNDTSPVVKLISLNTKPALSKKIILAGALFLGLFLGVFAAFIKEFSKTL DWNEIKDKN >gi|261746975|gb|ADAD01000159.1| GENE 12 11218 - 12162 1152 314 aa, chain + ## HITS:1 COG:lin0643 KEGG:ns NR:ns ## COG: lin0643 COG4989 # Protein_GI_number: 16799718 # Func_class: R General function prediction only # Function: Predicted oxidoreductase # Organism: Listeria innocua # 1 314 1 305 305 
407 63.0 1e-113 MKTIDLGKVGLPVSEISLGCMRIADIPYEKGEKAVLTSIDKGINFFDHADIYGRGKSEEI FGKIIKSNNIKRENIFIQTKCGIIVKDYPDGTYKTHFDFSKEHIIKSANESLKRLNTDYI DVFLLHRPDTLMEPEEVATAFDTLYKEGKVKYFGVSNQNPAQMELLQKYVDQKLIANQLQ FSIMHSGIIDNGINVNLKNERASDRDGSILEYCRLKDITIQAWSPFQYGFFEGVFLNNDK FRDLNIIINKIAQKYGVSDTAIATAWILRHPAKIQVIVGTMNSNRISDISTASNITLSRE EWYDIYKAAGNILP >gi|261746975|gb|ADAD01000159.1| GENE 13 12319 - 12522 112 67 aa, chain + ## HITS:1 COG:CC3184 KEGG:ns NR:ns ## COG: CC3184 COG1724 # Protein_GI_number: 16127414 # Func_class: N Cell motility # Function: Predicted periplasmic or secreted lipoprotein # Organism: Caulobacter vibrioides # 6 67 3 61 62 58 46.0 3e-09 MKSYISREIIKILKDNGWEETGESRGSHHYFVNPDKPESGKVTVPHPRKDIPIATVKAIF KQAGIQF >gi|261746975|gb|ADAD01000159.1| GENE 14 12559 - 12960 435 133 aa, chain + ## HITS:1 COG:no KEGG:PTH_0968 NR:ns ## KEGG: PTH_0968 # Name: not_defined # Def: hypothetical protein # Organism: P.thermopropionicum # Pathway: not_defined # 1 132 1 132 137 154 53.0 9e-37 MTKDSYLYPAIFKYGEDGITITFPDLPGCISCGKNDEEALYMARDVLGGWMYQIERAKEK IPKSSSLNMINLNFDEKVLLIDVWMPSVRKSIRNKAVKKTLTIPQWLNERAIEKNLNFSH ILQEALKEELGIK >gi|261746975|gb|ADAD01000159.1| GENE 15 13119 - 14045 1181 308 aa, chain - ## HITS:1 COG:PA1950 KEGG:ns NR:ns ## COG: PA1950 COG0524 # Protein_GI_number: 15597146 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Pseudomonas aeruginosa # 3 299 4 301 308 201 38.0 1e-51 MKKALVIGSLNMDMTAKVENLPKLGETIFSNEFYESCGGKGANQAVAMSKLGMTTEMIGM VGNDSQGDRLINNLIKHNVKADNIIKSNDLTGRAIITVDKRGNNSIIVIPGCNFKITEED IQNKIEVIAENDIIVLQNEIPSEVVEYTLVKAKELGKTTILNPAPARKLNDKIYKNIDYL ILNETEMEEIFDIGIHDKVYIGQIFHKKKECNIKNIILTLGENGSVLFDENDNVKKYDAY EVEAVDTTAAGDSFIGAFALKICETNNPDEAIKYATATSAIVVTRQGAQDSIPSPEEIEE FIEKHPIR >gi|261746975|gb|ADAD01000159.1| GENE 16 14075 - 14362 145 95 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCRDETFKERKISVIIQNFILFLLIGLPYILKLKILSLFFSKIFTIPITIHTLMLINPDA DLKGMTVIWFFMQISFLGGILLSNIVKSVFKKIYL 
>gi|261746975|gb|ADAD01000159.1| GENE 17 14320 - 14715 282 131 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262038989|ref|ZP_06012324.1| ## NR: gi|262038989|ref|ZP_06012324.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 131 1 131 131 168 100.0 2e-40 MKKVDEDKFFPLIKILLITTCAAVILLKDIAMLLILVSLFFLLGVFLNQKKFEKYIIKLK PISAVFSQIIFFPFIAFAISMIVGIPLAALFSVVSKNEKITLELMKVIYVTLEIIFGICV GMKLSKKEKYL >gi|261746975|gb|ADAD01000159.1| GENE 18 14950 - 15744 971 264 aa, chain - ## HITS:1 COG:yeiK KEGG:ns NR:ns ## COG: yeiK COG1957 # Protein_GI_number: 16130100 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Escherichia coli K12 # 2 264 49 309 313 199 38.0 6e-51 NNALRICDYFDLKTPVYRGMSEPLVRKNQIIATDFHGETGLDGIKLGKTDRLPEKEHGVN YIINTLLNSNEKITLVPVGPLTNIGMALKLEPKIKEKIEKIVLMGGSCKGGNVTPYAEFN IYADPEAASIVFSSGVSIFMMGLEVTNKTAPDEKIVKKIQSLKTKAADFLNQGLHFPERY DEKGNFIYHTLHDVVTLIYLIDENTVKLEKINCEIETKDEEKYGQTICSNCKDNNRVTNS NIQTAVEIDLNKFWEIIFGVIKKY Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:35:04 2011 Seq name: gi|261746911|gb|ADAD01000160.1| Leptotrichia goodfellowii F0264 contig00116, whole genome shotgun sequence Length of sequence - 72633 bp Number of predicted genes - 64, with homology - 61 Number of transcription units - 31, operones - 21 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 10 - 102 154 ## 2 1 Op 2 . + CDS 151 - 519 623 ## COG4770 Acetyl/propionyl-CoA carboxylase, alpha subunit 3 1 Op 3 . + CDS 539 - 1657 1941 ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit + Term 1666 - 1713 -0.4 + TRNA 1796 - 1882 67.5 # Ser GGA 0 0 + Prom 2199 - 2258 15.1 4 2 Tu 1 . + CDS 2433 - 2507 147 ## + Term 2620 - 2678 3.0 + Prom 2714 - 2773 11.0 5 3 Op 1 40/0.000 + CDS 2835 - 3848 1605 ## COG0016 Phenylalanyl-tRNA synthetase alpha subunit 6 3 Op 2 . 
+ CDS 3910 - 6309 3775 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit 7 3 Op 3 . + CDS 6335 - 6994 787 ## COG1564 Thiamine pyrophosphokinase + Prom 7004 - 7063 8.2 8 4 Op 1 . + CDS 7132 - 7260 268 ## gi|262039050|ref|ZP_06012384.1| putative liporotein 9 4 Op 2 . + CDS 7313 - 7444 162 ## gi|262039035|ref|ZP_06012369.1| hypothetical protein HMPREF0554_1851 + Term 7558 - 7611 -0.0 + TRNA 7857 - 7931 73.2 # Gln TTG 0 0 + Prom 7859 - 7918 80.4 10 5 Op 1 . + CDS 8091 - 9554 1844 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen 11 5 Op 2 . + CDS 9560 - 11305 1644 ## COG1835 Predicted acyltransferases + Term 11306 - 11359 -0.5 + Prom 11328 - 11387 10.0 12 6 Tu 1 . + CDS 11435 - 12790 1972 ## COG0166 Glucose-6-phosphate isomerase + Prom 12826 - 12885 4.2 13 7 Op 1 . + CDS 12905 - 13507 615 ## gi|262039045|ref|ZP_06012379.1| putative myosin tail 1 protein + Term 13508 - 13542 -0.3 14 7 Op 2 1/0.250 + CDS 13584 - 14957 1867 ## COG1362 Aspartyl aminopeptidase 15 7 Op 3 . + CDS 15021 - 18089 3746 ## COG0366 Glycosidases 16 7 Op 4 . + CDS 18113 - 18883 764 ## COG1349 Transcriptional regulators of sugar metabolism + Term 18920 - 18957 -1.0 + Prom 19034 - 19093 9.8 17 8 Op 1 . + CDS 19182 - 21389 2596 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 18 8 Op 2 . + CDS 21467 - 22552 1638 ## COG0371 Glycerol dehydrogenase and related enzymes 19 8 Op 3 . + CDS 22594 - 23343 886 ## COG1349 Transcriptional regulators of sugar metabolism + Term 23360 - 23425 2.7 - Term 23352 - 23405 -0.8 20 9 Tu 1 . - CDS 23434 - 24150 812 ## COG0176 Transaldolase - Prom 24377 - 24436 13.1 + Prom 24352 - 24411 13.2 21 10 Op 1 11/0.000 + CDS 24463 - 25260 827 ## COG1180 Pyruvate-formate lyase-activating enzyme 22 10 Op 2 . + CDS 25298 - 27703 3298 ## COG1882 Pyruvate-formate lyase + Term 27773 - 27827 3.1 + Prom 27850 - 27909 10.7 23 11 Op 1 . 
+ CDS 27946 - 29169 2185 ## COG0126 3-phosphoglycerate kinase + Prom 29171 - 29230 8.2 24 11 Op 2 . + CDS 29280 - 31058 1404 ## PROTEIN SUPPORTED gi|68248811|ref|YP_247923.1| NAD nucleotidase + Term 31201 - 31242 2.2 + Prom 31255 - 31314 7.1 25 12 Op 1 . + CDS 31417 - 31821 610 ## Lebu_1964 MazG nucleotide pyrophosphohydrolase 26 12 Op 2 . + CDS 31878 - 32447 945 ## gi|262039044|ref|ZP_06012378.1| conserved hypothetical protein + Prom 32463 - 32522 9.0 27 13 Tu 1 . + CDS 32602 - 33066 665 ## gi|262039061|ref|ZP_06012395.1| putative liporotein 28 14 Op 1 6/0.000 - CDS 33210 - 33758 479 ## COG1045 Serine acetyltransferase 29 14 Op 2 . - CDS 33783 - 34694 851 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 - Prom 34748 - 34807 5.9 30 15 Tu 1 . - CDS 34822 - 35205 495 ## COG1733 Predicted transcriptional regulators - Prom 35323 - 35382 9.1 + Prom 35203 - 35262 5.2 31 16 Op 1 . + CDS 35290 - 35922 761 ## COG2249 Putative NADPH-quinone reductase (modulator of drug activity B) 32 16 Op 2 . + CDS 35935 - 36231 433 ## COG2350 Uncharacterized protein conserved in bacteria + Term 36246 - 36295 7.8 33 17 Op 1 . - CDS 36316 - 37488 1457 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 34 17 Op 2 . - CDS 37503 - 39275 2017 ## COG1164 Oligoendopeptidase F - Prom 39325 - 39384 7.2 35 18 Tu 1 . - CDS 39387 - 41057 197 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Prom 41125 - 41184 7.9 36 19 Tu 1 . - CDS 41193 - 42401 493 ## Sterm_3700 initiator RepB protein - Prom 42600 - 42659 6.5 + Prom 42528 - 42587 4.9 37 20 Op 1 13/0.000 + CDS 42615 - 43100 665 ## COG3444 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB 38 20 Op 2 13/0.000 + CDS 43122 - 43919 1129 ## COG3715 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC 39 20 Op 3 . 
+ CDS 43909 - 44706 949 ## COG3716 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID + Term 44719 - 44769 12.5 + Prom 44818 - 44877 15.0 40 21 Op 1 49/0.000 + CDS 44966 - 45946 1357 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 41 21 Op 2 44/0.000 + CDS 45946 - 46929 1236 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 42 21 Op 3 44/0.000 + CDS 46987 - 48009 1540 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component 43 21 Op 4 . + CDS 48011 - 48952 364 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Term 48953 - 48991 0.1 44 22 Op 1 . - CDS 49055 - 49432 335 ## gi|262039002|ref|ZP_06012336.1| hypothetical protein HMPREF0554_1887 45 22 Op 2 . - CDS 49482 - 50315 660 ## gi|262039030|ref|ZP_06012364.1| hypothetical protein HMPREF0554_1888 - Prom 50370 - 50429 3.5 46 23 Tu 1 . - CDS 50436 - 51086 986 ## Lebu_1910 signal peptide - Prom 51183 - 51242 11.3 + Prom 51142 - 51201 13.1 47 24 Op 1 . + CDS 51228 - 52820 2426 ## COG4166 ABC-type oligopeptide transport system, periplasmic component 48 24 Op 2 . + CDS 52842 - 53504 1015 ## COG0778 Nitroreductase + Term 53531 - 53577 5.2 + Prom 53551 - 53610 9.3 49 25 Op 1 . + CDS 53645 - 55531 3010 ## COG0445 NAD/FAD-utilizing enzyme apparently involved in cell division 50 25 Op 2 . + CDS 55588 - 56034 482 ## COG0346 Lactoylglutathione lyase and related lyases 51 25 Op 3 4/0.250 + CDS 56093 - 56797 766 ## COG0357 Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division + Prom 56936 - 56995 8.2 52 26 Op 1 . + CDS 57038 - 57895 1049 ## COG1475 Predicted transcriptional regulators 53 26 Op 2 . + CDS 57918 - 62522 5972 ## Lebu_1778 hypothetical protein 54 26 Op 3 5/0.250 + CDS 62538 - 64607 3156 ## COG4775 Outer membrane protein/protective antigen OMA87 55 26 Op 4 . 
+ CDS 64671 - 65675 1439 ## COG1044 UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase + Prom 65732 - 65791 12.7 56 27 Op 1 . + CDS 65821 - 66462 903 ## Lebu_1775 hypothetical protein + Prom 66496 - 66555 2.8 57 27 Op 2 . + CDS 66575 - 67519 1279 ## COG1281 Disulfide bond chaperones of the HSP33 family 58 27 Op 3 . + CDS 67546 - 68475 1430 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes + Prom 68502 - 68561 2.7 59 28 Tu 1 . + CDS 68581 - 69060 821 ## COG1854 LuxS protein involved in autoinducer AI2 synthesis + Term 69137 - 69190 6.7 + Prom 69109 - 69168 6.5 60 29 Tu 1 . + CDS 69266 - 69361 84 ## + Term 69510 - 69553 3.9 + Prom 69436 - 69495 9.4 61 30 Op 1 . + CDS 69572 - 70078 518 ## Lebu_1767 hypothetical protein 62 30 Op 2 1/0.250 + CDS 70065 - 70823 1149 ## COG0084 Mg-dependent DNase + Term 70850 - 70894 1.6 + Prom 70832 - 70891 5.4 63 31 Op 1 . + CDS 70931 - 71296 642 ## COG0736 Phosphopantetheinyl transferase (holo-ACP synthase) 64 31 Op 2 . + CDS 71318 - 72571 1297 ## Lebu_2206 hypothetical protein Predicted protein(s) >gi|261746911|gb|ADAD01000160.1| GENE 1 10 - 102 154 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLAAVSVAVMEAAGETENSYIRVKSIREIR >gi|261746911|gb|ADAD01000160.1| GENE 2 151 - 519 623 122 aa, chain + ## HITS:1 COG:ML0726 KEGG:ns NR:ns ## COG: ML0726 COG4770 # Protein_GI_number: 15827304 # Func_class: I Lipid transport and metabolism # Function: Acetyl/propionyl-CoA carboxylase, alpha subunit # Organism: Mycobacterium leprae # 37 121 512 597 598 65 44.0 2e-11 MIKVYKIKIGEKVYEVEVEAVSEKEGKIETESAKEAKEEKPKETVKTSVSEGDTKVEAPM QGLVVSVDVSAGQKVKVGETLVVLEAMKMENPIVAPVDGTVAGIHVSKGDTVETGTLMVS LS >gi|261746911|gb|ADAD01000160.1| GENE 3 539 - 1657 1941 372 aa, chain + ## HITS:1 COG:SPy1177 KEGG:ns NR:ns ## COG: SPy1177 COG1883 # Protein_GI_number: 15675149 # Func_class: C Energy production and conversion # Function: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit # Organism: Streptococcus pyogenes M1 GAS # 
1 371 2 375 376 429 68.0 1e-120 MELLKTLYSTTGLSMLTWQQFIMIIVALILLYLAIKRGYEPYLLLPISFGMLLVNLPGVP KEGLMEPGGLLYYLYQGVKFGIYPPLIFLAIGASTDFGPLIANPKSLLLGAAAQLGIFLA FTGAILLGLTGREAASIGIIGGADGPTAIYLTTKLAPDLLGPIAVAAYSYMALVPIIQPP IIKLLTTKEERKIKMVQLREVSKTEKILFPIAVTIVVTLLIPSAVPLIGMLMLGNLIKES GVVGNLVEHVKGAMLYCITIVLGTTVGATANAQTFLNVTTLKILALGLFAFAFGTVGGVL FGKLMCRLSGGKVNPMIGAAGVSAVPMAARVVQKVGQEENPSNFLLMHAMGPNVAGVIGS AVAAGVLLIIFK >gi|261746911|gb|ADAD01000160.1| GENE 4 2433 - 2507 147 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKLLMLAILAMAVASCAPYYWWW >gi|261746911|gb|ADAD01000160.1| GENE 5 2835 - 3848 1605 337 aa, chain + ## HITS:1 COG:FN2123 KEGG:ns NR:ns ## COG: FN2123 COG0016 # Protein_GI_number: 19705413 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase alpha subunit # Organism: Fusobacterium nucleatum # 2 337 3 337 338 421 58.0 1e-117 MEKLSRLKEEVLETLNNVGNLEELNDLKVKILGKKGEFTTIMKGMADIAAEKRAEFGKVT NELKTVLQDKFDEKLNTLKEKAKQERLKNETIDITLPGRKADEGSLHPLTKTVAEIKEIV SDMGFDIVDGPEIEYTKYNFDALNIPETHPSRELSDTFYIQNDVLLRTQTSGMQIRYMLD RKPPFRMVSIGKVYRPDYDVSHTPMFHQMEGLMIGEDVSFSNFKAILENIVKKIFGKERN VRFRPHFFPFTEPSAEMDVECGVCRGAGCRVCKGTGWLEILGSGMVNPKVLQGVGIDPQK YQGFAFGLGLERITMLKYGIDDLRAFFENDVRFLDQF >gi|261746911|gb|ADAD01000160.1| GENE 6 3910 - 6309 3775 799 aa, chain + ## HITS:1 COG:FN2122_2 KEGG:ns NR:ns ## COG: FN2122_2 COG0072 # Protein_GI_number: 19705412 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Fusobacterium nucleatum # 148 799 3 653 653 576 46.0 1e-164 MLISLNWLKQYIDLEGKEVGELEKALTMIGQEVEKIEIKGDNLDNVVVGHLIEVKKHPNA DSLTLCKVDNGKEILQIVCGATNHKTGDKVALAQVGAKLREDFTIKKGKIRGEESNGMLC SEDELGIGSDKDGIIILPEDAPVGTPLKEYLGINDTVFELEITPNRPDCLSHIGIARELA AYYGKELKYPKTVINKEVQEKTSDNIKVKIEDSNLSRRYVTRILKGVTVKESPKWLKERI EAVGLRSINNIVDVSNFILMEMNHPNHVFDLDKIEGNEIIVKTANKGDVLVTLDDQERKL ENEDIVIADSKKILAVGGVMGGFDSQVTDSTKNVLLEVAHFNPQNVRKTSRRLTLFSDSS 
YRFERGIDVENAVNVINRLANLIQEVAGGEILSGYAEQYPVPHENRTTGLNLERLHRFVG KVIPKKEVIKILEGLEIKVVDKGENLELTAPSYRGDLELEQDYFEEVIRMYGFDNIENVL PKVNINESSTLDTTKLTDTVKNIAASTGLKEVINYSFIPKDGLEKIKFTGVKGDKLIDIS NPITEDFVTMRPTLLYSLIKNAKENINRNVSDIRFFEVSRTFEKAEELAKEEIKLGIILA GEKDKTLWNPKPVPYDFYDLKGIVEEVFSKLKFKNYSIKRSVQTEFHPGRSADVFVGNEC VGSFGEIHPDVLENFDLNRKSVLVGEFNIDLIKKYIRKPFVYQGPVKYPAVPRDIALVMN ENVLVGDVLKTVEKIDKKVEKVELFDIYRGLGVEPGKKSVAISVLLRDNSKTLEEKEIND IMEKILNKVKKDYMAELRQ >gi|261746911|gb|ADAD01000160.1| GENE 7 6335 - 6994 787 219 aa, chain + ## HITS:1 COG:FN0890 KEGG:ns NR:ns ## COG: FN0890 COG1564 # Protein_GI_number: 19704225 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine pyrophosphokinase # Organism: Fusobacterium nucleatum # 1 215 1 209 209 105 32.0 5e-23 MKKCVIFLNGEYTYSQQFIDDLFDENTICLCADGGVNSAHKYNKKPLYIIGDLDSVNAEI LEDYKKQGVNIVKYNPEKDYTDFELILQKVEELEKDGDFKFDSMSILGALGKRIDLTLNN IFLMEKYKNIKILTENEEIFYTESSFSLNNKKGYGFSIIPLDNIIGNLTLKGFKYELDSV NVERKSSRLVSNIIEKEVCSVSFTRGKMIVILRKNVTED >gi|261746911|gb|ADAD01000160.1| GENE 8 7132 - 7260 268 42 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039050|ref|ZP_06012384.1| ## NR: gi|262039050|ref|ZP_06012384.1| putative liporotein [Leptotrichia goodfellowii F0264] putative liporotein [Leptotrichia goodfellowii F0264] # 1 42 1 42 42 63 100.0 7e-09 MSRMVKVLVIMALMMSLFSCNSALRCELGSRTACIDYGNGLK >gi|261746911|gb|ADAD01000160.1| GENE 9 7313 - 7444 162 43 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039035|ref|ZP_06012369.1| ## NR: gi|262039035|ref|ZP_06012369.1| hypothetical protein HMPREF0554_1851 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1851 [Leptotrichia goodfellowii F0264] # 1 43 1 43 43 77 100.0 2e-13 MNKIKFIVLVVLSVMMLNSCFALWSASQAGTGKYCGIRKCPKE >gi|261746911|gb|ADAD01000160.1| GENE 10 8091 - 9554 1844 487 aa, chain + ## HITS:1 COG:UU038 KEGG:ns NR:ns ## COG: UU038 COG2865 # Protein_GI_number: 13357594 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an 
uncharacterized domain shared with the mammalian protein Schlafen # Organism: Ureaplasma urealyticum # 1 483 2 463 463 387 46.0 1e-107 MKLSEYVAEDSLCEKKEAVEKNKPKSWLKTVSAFANGTGGVLIFGISDKDELKGLKDYKK DGEYISEIIKSKIEPVPEIRMENISEEGKNYIFLYVNEGKETPYYFVDSGNRIAYVRIGN ESVTAKNSDLKQLILKGMNMTYDSLNTNIKIDNMSFSKLRSVYYLKTGNEFEESDFVSFG LLNNDKSLTNAGILFSDMPSIRHSRVFCTRWNGLDKASGRDEALDDKEFKGSLLLLLENS IDFIKNNSKKKWKKAKDTRIEMPDYPERAVQEAIVNALIHRDYTELGSEIHIDMFDDRME IYSPGGMPDGSTVQNSDIKNIPSKRRNPVIADLFNRMNLMERRGSGLKKILSDYYNAVNY SEDKNVEFKSTNKDFWVILKNLNYVVKDVVKDVVKDVVKDVVKDVVKDVVKDVVKDVVKI NDRYKRILREMSKNPDITAKKLSNILEVSERTVQRDINKLKSDKKIMREEGLKKGKWRVK AEKDREN >gi|261746911|gb|ADAD01000160.1| GENE 11 9560 - 11305 1644 581 aa, chain + ## HITS:1 COG:FN2029 KEGG:ns NR:ns ## COG: FN2029 COG1835 # Protein_GI_number: 19705320 # Func_class: I Lipid transport and metabolism # Function: Predicted acyltransferases # Organism: Fusobacterium nucleatum # 10 581 11 604 604 353 39.0 5e-97 MKKNRRIEEIDVLRAAALIMICLYHWFTYKGMYVGVIIFFALSGYLFTSSLLSRDFSCFN VIKRRMSKIYPALLTVILVSTIVLIIMNGGLEEKYKNSAFFSVIGLNNIYQIISGISYFD SYNIILPLTHIWALSFQIQMYVFFPFILKGLKKLKLKDELIGIIFFGISVLSALLMGYKY YKGADISRIYYGTDTRAFTFFVGAAVAMFYNKREISKESEKIKVLLAGMSGIFLSVIFAL TVDYKNPLGYYGLLYFISVLSAFSTVLFTKIKLRKLQIPFLSGMKKVLSLLGRREYHYYL WQYPLMIFMREIFKWSKVGFLSQFILEILILIVISEISYFIFEKKNLKVPNYAMPGIIIM LLLFAPLYKNKDLEEMKTVQAQIKEETVKEIEKETVTETEEKTEKTDEKKEEKKSEEEKQ KTENIDDRNILFIGDSVLEMTKPELEKKYPNAIIDVKIGRQFSELAGLLTEFKNSGKLGK TVVIALGTNGPILDEDAKKAMAILEGHDVYFVNSVVARPWEKRVNREIANIAKNYGNVKI IDWYSYAKGEKEYFYKDGVHPKPAAAKKYVNLVYSAVSKNK >gi|261746911|gb|ADAD01000160.1| GENE 12 11435 - 12790 1972 451 aa, chain + ## HITS:1 COG:BH3343 KEGG:ns NR:ns ## COG: BH3343 COG0166 # Protein_GI_number: 15615905 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate isomerase # Organism: Bacillus halodurans # 2 448 4 450 450 602 63.0 1e-172 MKLNFSYQFAKNFFNEAELDQIKPYTELANEVLTKKSGAGNDFLGWISLPEDYDKEEFGR IKKAAEKIKNDSEVLIVIGIGGSYLGAKAAIEFLSHSFYNNLAKEKRKTPEIYFAGTNMS 
STYLNHLIDLLGNRDFSVNVISKSGTTTEPAIAFRVFKKLLEEKYGKEEAAKRIYATTDK SKGALKTLATTEGYETFTVPDNVGGRFSVLTAVGLLPIAAAGINIDELMAGAKDAMNDYS NGSFEENQTLQYAAIRNILHRKGKDVEILVNYEPRLHYFSEWWKQLFGESEGKDGKGLYP SSVDFSADLHSLGQYIQEGKRTMFETVVSIGSPESEYTIESDKDNLDGLNFIAGKTLDYV NKKATDGVILAHIDGGVPNLTVNIPEVTPYHLGYAFYFFEKACGVSGYLLGVNPFDQPGV EAYKKNMFALLGKPGYEEEGKKLEEKLKNMK >gi|261746911|gb|ADAD01000160.1| GENE 13 12905 - 13507 615 200 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039045|ref|ZP_06012379.1| ## NR: gi|262039045|ref|ZP_06012379.1| putative myosin tail 1 protein [Leptotrichia goodfellowii F0264] putative myosin tail 1 protein [Leptotrichia goodfellowii F0264] # 1 200 1 200 200 308 100.0 1e-82 MEKALIEKMNKDMTEIKNLWEKVSEGNDREKEKIGSEKSLFAEKIKNYAQMYKDFGTFFS EYNMGIVTHEDFSKVRNIVNKLYKEIKKDTFDKSVIENTVNEFQEIRITGSGKLSEEIIK KSTEDFEKLEGIIKNKSWEKETEFIHERYKHYNKIKEWFDGNILRDLELYSSIFDTMVNF DISVKQMAKNKDPYIFISED >gi|261746911|gb|ADAD01000160.1| GENE 14 13584 - 14957 1867 457 aa, chain + ## HITS:1 COG:BB0366 KEGG:ns NR:ns ## COG: BB0366 COG1362 # Protein_GI_number: 15594711 # Func_class: E Amino acid transport and metabolism # Function: Aspartyl aminopeptidase # Organism: Borrelia burgdorferi # 1 454 1 456 458 465 50.0 1e-130 MTRENLWKSYSEKEKKQIFKFAEEYKSYLNAAKTEREFVTITEKELIENGFTDINKKKTL KKGDKVYYNNRDKNIVAVIIGKDIKSGINMIVSHVDSPRLDLKPNPIKEDEEFALLNTHY YGGIKKYQWAATPLALHGVVYLKDGKKVTLAIGEDENDPVFSVPDLLPHLAHKIQDDRKA RETIQGEELKLLFGNMPVNDKNVKQQTKQMIMDKLKKDYGIEEDDFFTAELEIVPAGKLR DVGLDRSMIGGYGQDDRICAYTSLRAIYEVKNPEKTAMVYLTDKEEVGSEGSTSLKATLP ELIIGKMLSMTEKNYNDQILRETLWNSKALSSDVTAAMNPVFKAVHDGENVARLSYGLAF AKYTGSRGKVSANDADAEYLQEIRQLFEKNKIKYQAGGFGKVDEGGGGTVAKYLAHYGIK TVDAGPAVLSMHSLFEISSKADLYETYRAYRVFFDLK >gi|261746911|gb|ADAD01000160.1| GENE 15 15021 - 18089 3746 1022 aa, chain + ## HITS:1 COG:DR1141 KEGG:ns NR:ns ## COG: DR1141 COG0366 # Protein_GI_number: 15806161 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Deinococcus radiodurans # 547 984 124 475 498 197 28.0 1e-49 
MNCSILIYKNNKFYAKKTLKQIKNGTYIYKWSNIEAGSYFFELEDEKGTVHGITYNHTAP FATDFEAVAEENTPPKSITGFQNGIDILVSYIPNKKLFMLSKKKFFRMRLDIADFGLEKA DTIEISGNFNNWTPEEEPVHHMDGTIYEVVLAIPEGVYEYKYLIDGKWFPENENKKLIIG ENGALFHAGDIGTGKFSYETIDKNIDLKAIVHNYKSLRYFNKLSDREYEFTIRTQANDVE RAFISIVLHEEDNYEMIYELERFKDNTNGFDYFKRIIDFGRKVEKISYFFVLEDGGVKAY FNGDLSYKKPKRISVRTSRDDIEIFSIPEWSKEAVWYNIFPDRFYNGNPYNDPIFNEFGP EAYKINELHENGFIERYKWNKNNNLYGKFDRNRWTADFSRQVTWEKLGEQGINYSLKYAR MYGGDLEGIKKKIPYLKDLGVNAVWLNPVFFSYQNHKYGANDFRHISPDFGTIRTSGEKH DVEIGKNNKYGNKSYIDVLGKNAVNSSELKLLEVNLKGENKGKNGFGETADPSTWVWTES DLIMADLIKEFHKNGIRVIFDGVFNHSSDRHWTFNMVMADGENSEYKEWYKFNDFSQYVP VTDSMSDEQAYETMIENKKRIKYNAWAGFNSLPEFNTFNQQYKEYIFNITRKWMYGPDGE TKENWQEDDGIDGWRLDVPNCLENQNFWLEWREVVKESKPDSYITAELWGNASGDINGGN KFDTVMNYEWLKTVIGFFINQSKEGGDKYKLKATDFFNELREKRTWYPFQSLQASQNLNG SHDTDRLYSRIVNDGIGRNLEEGKQLEKGYNGIRPDLASNYHPNTTVNWRDSEIKPKDIL KLISIFQMTYIGAPMLFYGDEAGMWGATDPYCRKPMLWEEFMYDNEKNPSYINQGEEYPQ IRDNDLYQWYKKLIKIRRENRVLVYGKFKELVADNEREIIAYERANEGRSIITAINNSFS DCVGIEIQTDYPNEKYMDLIRGNIVKTSSTGKMKIELKAKQGTILKRWINGSNINGMKMF RY >gi|261746911|gb|ADAD01000160.1| GENE 16 18113 - 18883 764 256 aa, chain + ## HITS:1 COG:CAC0113 KEGG:ns NR:ns ## COG: CAC0113 COG1349 # Protein_GI_number: 15893409 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Clostridium acetobutylicum # 1 256 1 253 253 155 41.0 9e-38 MFLDERLEKILEILSKEKKVKVNDLAHKFKVSEVIIRKDLKRLEIEGKLRRTHGGAILLK ELVHTIALEDRIINRTKQKEKIAEKIISHIKEKEVIFFDVSSINYMVAERLSKIDKSLKI ITNMPSIAALFNKNSQIEIIIAGGDYHKEIGGIIGSEAINNISRYNADKAFIGCAGIDID TGRVMNFDANDGNTKRKIMDISGQNFLVTESKKINETGSFNFSNLKEFTGIITEKNKLEY SNDIRKKLSEDDTDIL >gi|261746911|gb|ADAD01000160.1| GENE 17 19182 - 21389 2596 735 aa, chain + ## HITS:1 COG:SA0233_1 KEGG:ns NR:ns ## COG: SA0233_1 COG1263 # Protein_GI_number: 15925945 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # 
Organism: Staphylococcus aureus N315 # 10 437 4 422 437 413 51.0 1e-115 MEKIKKESRIFLTLQTIGRAFFLPVSILPVAGLLLGLGASFTNEKTIAYYHLEKLLAQGT FLNYILTIMNNVGLAVFANLPLIFAMAVAFGMAKKERGVAVLSAGLSFIIMHTTIKTLLV FNGTISPSGEISEKVLNGAIGTVLGIQTLEMGVFGGIITGLGVAFLHNKFYKTKLPAAIS FFSGVRFVPIICTFSYIIVGAVSFLIWPFIQQGIYAIGSVVMSTGDVGIFIFGFMERILI PFGLHHIWYIPFWQTGLGGSAVVDGVMQYGAQNIFFAELASPNTKHFSIEAAKFMTGKYS FMMAGLPGAALAMYHCAKPEKRKIVGGLLFSAALTSFLTGITEPIEFTFLFVAPFVFVIH CIFAGLSFALMSILKIAIGTTFSCGLIDLTLFGILPGNSKTNWLMILPVFVLYFTVYYFF FKFVILKWNLKTPGREDDTEEVKLYTKNDYNTMRDERKKGVVLEDTVSQAIINGLGGLDN FGDVTCCATRLRMQINNMDLVNDGILKQTGAMGIIKKGNGIQVVYGPTVSVIKSNLTEYI DKIKEHGMEASFAEEKSIKNVDLKVLSLCEDCILEDYESLEEINKIVAPFKGKIFDVENI NDKIFNTKVLGDGFAIEIEGNEILAPTNGKIINIYDTEHAFIIEDNYKHNILVHVGLGTV GLNGKGIKLFKKVGDEVKKGEKIGEIDKKIITESGFSLVSPVTFLNVDKARYNIQVEKEG NVEAGEEAIIFIIKK >gi|261746911|gb|ADAD01000160.1| GENE 18 21467 - 22552 1638 361 aa, chain + ## HITS:1 COG:STM4108 KEGG:ns NR:ns ## COG: STM4108 COG0371 # Protein_GI_number: 16767374 # Func_class: C Energy production and conversion # Function: Glycerol dehydrogenase and related enzymes # Organism: Salmonella typhimurium LT2 # 1 361 1 360 367 414 58.0 1e-115 MSKIINSPGKYIQGAKELENLAKYYKIKGDKVAYILVDKFVFDNFKNKITESFEKENIPY HIEVFGGECSQNEIDRNIKILKEKNCDVMLGIGGGKTLDAAKAISYYENIPVLIVPTIAS TDAPCSALSVIYTPAGEFEKYLFLKSNPDMVIMDTEVIVNAPVRLLVAGIGDALSTYFEA KACVDSNATSIAGGKATKAALAIAELCLNTLFEDGLKAKIAVENKVCSKAVENIIEANTY LSGIGFESGGLAGAHAIHNGLTILEEGHHMYHGEKVAFGTIVQLVLENRSLEEINEVVEL CKSVGLPTTLKELGLENVSNERLYKVAEAATVPGETIHNMPFEVTADDVYAAILVADKLG K >gi|261746911|gb|ADAD01000160.1| GENE 19 22594 - 23343 886 249 aa, chain + ## HITS:1 COG:AGc4939 KEGG:ns NR:ns ## COG: AGc4939 COG1349 # Protein_GI_number: 15889977 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 5 247 8 250 266 136 32.0 5e-32 MLKKERQDLILMKLNNKGKVVVGELAVDLGVSEDTIRRDLTEMDNRLLLKKVFGGALPLN 
RYALNYTERENFEPELKYELALKGVNLLKNGQLVAIDGSTTNLQLARAIPVNLKLTVITN SLSIAGELCNHKNIEVVMIGGNLFRKIMTNVGDSAVQQLKEYYPDICFMGAYAIHPLMGV TSPYEKEISVKRQFIKSSGRVVTLIIPNKFNVIMPYKVCDMKDITTIITDKRVSSGILEE YEKIGIECI >gi|261746911|gb|ADAD01000160.1| GENE 20 23434 - 24150 812 238 aa, chain - ## HITS:1 COG:ECs0903 KEGG:ns NR:ns ## COG: ECs0903 COG0176 # Protein_GI_number: 15830157 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Escherichia coli O157:H7 # 1 234 25 244 244 169 36.0 5e-42 MEYFLDTADINKIKKINEIFPIAGVTTNPSIIAKEKRDFKLIINDIYDIIGKDKILHAQV AGNTADIIIKEVNRLRDLFGENFYTKIPVTPEGIKAMKTLSKDGHKITATGILSSQQIIM AAEAGAEYMAPYINRSDDIGESGIEIVKDAYRILTINKKSEEECLKKYKRIFEPKILGAS FKNVRQVHETMLAGAKSVTVSPDVFERFIYHPYTDWSMDKFNSDWEEIYGNKNLLDLL >gi|261746911|gb|ADAD01000160.1| GENE 21 24463 - 25260 827 265 aa, chain + ## HITS:1 COG:SPy2055 KEGG:ns NR:ns ## COG: SPy2055 COG1180 # Protein_GI_number: 15675825 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Streptococcus pyogenes M1 GAS # 6 247 5 241 257 168 36.0 1e-41 MIAMEALVTAIERCATFDGPGIRTVVYFKGCPLRCKWCSNPETQKIENEVFYNEKEISPI TGEYPKVAKLMTLNEIFDIVMKDEKFYRNSGGGVTLSGGEILVNVEFAIKLFEKLKEEYI NTAIETTGYGNYKEFENLAKLTDTVLFDIKHMNSQKHKEYTAVSNELILENLEKLSKWHK NIIMRFPLIKGVNDDDVNVEETAKFLKKINLIDVDVLPYHTMGVEKYRKLRRPYPMKTLE KHTKEELEHVIELMRGVGIRARINN >gi|261746911|gb|ADAD01000160.1| GENE 22 25298 - 27703 3298 801 aa, chain + ## HITS:1 COG:ybiW KEGG:ns NR:ns ## COG: ybiW COG1882 # Protein_GI_number: 16128791 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Escherichia coli K12 # 8 801 10 810 810 723 45.0 0 MKEYVLKSERVKKLRESALSKFPGVSVERGRLLTKAYKEHAGKSKYIVRAYAVKEILENM TIYIKDGELIVGNQSVDERTAPFFPEYAVDWIADELKEKGNFDSRNGDKFRIPEEEIPEM LEICEWWKGKTLKDKAHAQMPKEIKEAGTVKIIHGEGNMTSGDGHIVPSFEKVLEKGLRG IIEEARESREKVDITVYGGYNKVDFLKSVEIVAEAVINFAHRYADLALKLAQTEKDDTRK 
KELLQIYENCMNVPENPAKTFWEGVQAIWFVHLVIQIESNGHSASLGRVDQYLYSLYKKD VLEGNTDREFAKEMLQCLWVKLYSVIKVRSTSHSGYGAGYPTYQNVTLGGSKPNGKDCTN ELSYLILESVGENKLTQPNLAVRYHANSPEKFIRECASVAATGYGMPAMHTDEIIIPALL NKGVDFRDAYNYTMVGCVEVAVPGKWGYRCTGMTFLNLVKATELVLNDGYDNRTGLQMLK GQGKLTDFETYDDLWKAWENHIKHYTKLTVALDALADTHLEEFPDILLSSLVDSCIERGL TAKEGGAVYDIVSGLQVGIANAANSLYALKTVVYDNKILTREEVYNALQNDYAGEEGERI RKILLEVPKYGNDIDEVDEFAEKVYWSYINEIQKYHNTRYGRGPKNGGYGVSTSGISSNV PMGTVSGATPDGRKAYTPAAEGGSPTQGTDTNGPTAVLNSVNKLPTMMITGGQLLNQKYS PELVRTPVQFEKFVDVIKSFIASKGWHIQFNIISSDTLKEAQVEPEKHRDIIVRVAGYCA QFVTLDKTTQCDIISRTEQRL >gi|261746911|gb|ADAD01000160.1| GENE 23 27946 - 29169 2185 407 aa, chain + ## HITS:1 COG:lin2552 KEGG:ns NR:ns ## COG: lin2552 COG0126 # Protein_GI_number: 16801614 # Func_class: G Carbohydrate transport and metabolism # Function: 3-phosphoglycerate kinase # Organism: Listeria innocua # 1 400 1 396 396 572 77.0 1e-163 MAKKTLKDLDVKGKKVLVRVDFNVPIKDGVITDDNRIIAALPTLKYILENGGKVIAFSHL GKVKAEEDKVSKTLAPVAKRLEEVLGKPVKFIPETRGAKLEAAINELKDGEILMFENTRF EDLDGKKESKNDPELGKYWASLGDVFVNDAFGTAHRAHASNVGIASNLKESAVGFLVEKE IDFIGGAVDNPVRPLVAILGGAKVSDKIGVIENLLDKADKVIIGGGMMFTFLKAQGKNTG ASLLEEDKVELAKSLIEKAKTKGVELVLPIDTVVAKEFKNDTEFKTVSVDDIEDGWMGLD IGEKSIALFKEKLNGAKTVVWNGPMGVFEMPNFAKGTIGVCEAIANLSDAKTIIGGGDSA AAAIQLGYAEKFSHISTGGGASLEYLEGKELPGVAAISEKKSCCCGH >gi|261746911|gb|ADAD01000160.1| GENE 24 29280 - 31058 1404 592 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|68248811|ref|YP_247923.1| NAD nucleotidase [Haemophilus influenzae 86-028NP] # 10 590 12 593 603 545 47 1e-154 MKKLLVIGLIAAMSLTANAKAVKKGAVQKSRPLEVNIIHINDHHSHLEEEKMDLVLDGKK TTVHIGGFPRINQRINELMKTGKNNLVLHGGDALSGTLYYTLFKGKADAALMNAGKFDLF TLGNHEFDDGNKVLKDFLDELKIPTISANVVPDKGSILEGKWVPYVIKKIGGQDVAIVGI DVVGKTVRSSSPGKDIKFLDEIETAKKVVEELQAKGINKIIFLSHAGYERNLEMAEKVSG IDVIITGDTHYLLGEEFKEYGLKPVAEYPKKIMSPAGEPVYVAEAWSYSYLVGNLKAKFD EKGVITELKANPTIIIGDDLFERKNDKGEVYKLEGKEKEDVIKYVNSRKDIKFVKEEANS KKVLERYKAEKNALGKKMIGKIKEEIPGGSANRIPNDKNPQGSIATRLVSETVLHKLRNM 
GTGEIDLVILNSGGIRMNLSPGDFSYDNAYTLLPFTSNTIYILNMSGAEIKQVIEDSLDF ALSGGSSGAFPYGAGIRFEATKAGQLGTRVKKVEVFDVKANKWVPIDMKKMYNVGTNSYV AKGKDGYTTFGKVNEQRPGVDTHLGVETAFIEYLKEKGEIAKPDSSNVIFKY >gi|261746911|gb|ADAD01000160.1| GENE 25 31417 - 31821 610 134 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1964 NR:ns ## KEGG: Lebu_1964 # Name: not_defined # Def: MazG nucleotide pyrophosphohydrolase # Organism: L.buccalis # Pathway: not_defined # 1 124 1 124 135 129 66.0 3e-29 MSEKKELKEDMTIKEFQFLIKHIEQGSLKDEEIGKNPEDKKENSRRLILKLIEEFGELAE NVRKNIRYDGGNIKGTIEEELFDIFYYIAAIANDYGIDLEDIFYIKDKVNIVKYKREFTI EEGREKYRKIKEKV >gi|261746911|gb|ADAD01000160.1| GENE 26 31878 - 32447 945 189 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039044|ref|ZP_06012378.1| ## NR: gi|262039044|ref|ZP_06012378.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 189 1 189 189 320 100.0 2e-86 MRKKGAVFLLLMLSAVTGYGATVKKTVVKVLPKVEIEGKFNEEKAILNYLKKEGYTQKSN MRKTTDKTMEEEVKEIEAEKNGVTYTLYLKADEASGIVVTVDSSVREYYDILESVRKEHN RVLSIEAATYARGIDINLYPREHISPDEIRTIGTKIANKIRAVNPSPGRLTLIIYDIYDH SRILYKDNF >gi|261746911|gb|ADAD01000160.1| GENE 27 32602 - 33066 665 154 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039061|ref|ZP_06012395.1| ## NR: gi|262039061|ref|ZP_06012395.1| putative liporotein [Leptotrichia goodfellowii F0264] putative liporotein [Leptotrichia goodfellowii F0264] # 1 154 1 154 154 173 100.0 3e-42 MKINKKVMAGIIGALLAVSCASGGSNKTAVTMDKGSSVPVTVSESTASSDMTANKSTDTN MMQNDTTASESQTTADSNKKTFTCAKETVMVEYNSSADSVSLTNSKGTYELKRAKAASGE LYKDAKGLSIHMKGDEAVYKTSAKAKDVSCKAAE >gi|261746911|gb|ADAD01000160.1| GENE 28 33210 - 33758 479 182 aa, chain - ## HITS:1 COG:BS_cysE KEGG:ns NR:ns ## COG: BS_cysE COG1045 # Protein_GI_number: 16077161 # Func_class: E Amino acid transport and metabolism # Function: Serine acetyltransferase # Organism: Bacillus subtilis # 5 173 3 172 217 209 56.0 3e-54 MSKIFKWLKNEINNIKEKDPAVKSKIEVLLYPSFHAVINHKISNFFYRKKLFFLARLISQ 
SSRFFTGIEIHPGATLGERIFFDHGMGIVIGETAVIGNDCVIFHGVTLGGVNSSKTKRHP TLKNNVIVGTGAKILGNITIGENVKIGANSVVLKDIPDNAVAVGMPARVIKKDSVDYFMW HI >gi|261746911|gb|ADAD01000160.1| GENE 29 33783 - 34694 851 303 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 2 303 3 305 308 332 57 3e-90 MIYENILELIGNTPVVKLNLEKEENIADVYIKLEKYNLGGSVKDRAALGMIEAAEKDGKL QKGGTIVEPTSGNTEISLALIGKLKGYKVIIIMPDTASVERRDIIKAFGAELILTDGAKG MKGSIAEAERIVAENPGYFLPQQFENPANPQKHYDTTANEILADFPVLDAFVAGIGTAGT LVGVAKKMKEERPDTKIFGVEPVKSPVISGGEPGKHVIQGIGAGFVPKNYDGSVVDEIIQ VTDEEALEYGLRASKENGLFLGISSGAAIAAAYKVAKQLGKGKKVLAVAPDGGEKYLSVA LYK >gi|261746911|gb|ADAD01000160.1| GENE 30 34822 - 35205 495 127 aa, chain - ## HITS:1 COG:lin1899 KEGG:ns NR:ns ## COG: lin1899 COG1733 # Protein_GI_number: 16800965 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Listeria innocua # 1 120 1 120 120 143 60.0 6e-35 MKQYILGIEATIEIFGGKWKALIVYMLMLGTKRTGELQRLIPGISQKVLIEQLRELEKDD IIERRTYDEMPPKVEYYLTEYGSTFSEILYAMCYWGRKNIKLRRERGENVMLLDKKEALK LKKKELL >gi|261746911|gb|ADAD01000160.1| GENE 31 35290 - 35922 761 210 aa, chain + ## HITS:1 COG:lin1898 KEGG:ns NR:ns ## COG: lin1898 COG2249 # Protein_GI_number: 16800964 # Func_class: R General function prediction only # Function: Putative NADPH-quinone reductase (modulator of drug activity B) # Organism: Listeria innocua # 29 208 1 177 177 177 49.0 8e-45 MVTLLCVFTKKNFKFIMIYKNFEERKMIMKTLVILVHPDMENSKANKRWKEELLKYPDEI EIHELYKEYPDWNINVKKEQELIEKYDHIIFQFPLYWFSCPPLLKKWFDDVFEYNWAYGP EGNKLKGKKMGLAVTTGGRKEYYTHGGENKFTLDELFIPFEAAINYAKAEYLPCFSVFAV SPNVKGPTEEEIEKNVKEYIAHIQKTGNWK >gi|261746911|gb|ADAD01000160.1| GENE 32 35935 - 36231 433 98 aa, chain + ## HITS:1 COG:CAP0038 KEGG:ns NR:ns ## COG: CAP0038 COG2350 # Protein_GI_number: 15004742 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 95 1 94 96 87 53.0 4e-18 
MFIANLKYKKSLEEVNKVLESHLEYLDKYYKKGKFICSGRRFFPEMGGIILFNSDNLEEA KKIMYEDPFYTEEIADYEIIEFEAKIYDEKFSDFVTSK >gi|261746911|gb|ADAD01000160.1| GENE 33 36316 - 37488 1457 390 aa, chain - ## HITS:1 COG:FN1063 KEGG:ns NR:ns ## COG: FN1063 COG1473 # Protein_GI_number: 19704398 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Fusobacterium nucleatum # 1 390 1 394 394 359 48.0 5e-99 METKNSVNNIMKDVVEWRRYLHRHPETGFDLENTVRFVCEKLDEMGIEYETNVGSKCSII AHINKGKSGKCIALRADMDALPVKEITNLEFSSENDNMHACGHDAHTAGLLGVCKLLKER ENELNGSVKFIFQPAEEIGTGAIGIIEKGVLDNVDEIIGLHVGNIYPEGAKGNLVFKKGP MMASMDKFIIKVKGQGSHGAYPNLSKDPVVTASHIVAGIQEILGREINPVEPAVVTIGTI HGGSAFNIIPETVELTGTARAVNNETREYLHKRIGEIASNIAAAFRCETEYEFFYQPPPL INDENATIKVMEVAKKLYPGTVEEMKAPVMGGEDFAWYLKKIPGTFFFLHNPLEIDGKVW PHHNPRFAIDEDYLDRGIAVMTEYVSEFLK >gi|261746911|gb|ADAD01000160.1| GENE 34 37503 - 39275 2017 590 aa, chain - ## HITS:1 COG:FN1145 KEGG:ns NR:ns ## COG: FN1145 COG1164 # Protein_GI_number: 19704480 # Func_class: E Amino acid transport and metabolism # Function: Oligoendopeptidase F # Organism: Fusobacterium nucleatum # 27 583 1 555 559 540 49.0 1e-153 MTQKKVADNGNIRAIHPYKRHLCIYFMNFNDYKYERVNIDTTKEQFGSLIGKFTNASNVE EQCKYMDEIIKLRNHVETMSTLVSVRHSIDTVDEFYDKENDYYDENGPVLQSFVMDFYKA LTTSKFRKELEEKYGKFLFDLAECSLKTFDEKVIPDLQEENKLVSKYQKLIASAKIEFDG GEKNLSEMVPYAQSKDRNIRKDAAKKVAQFFSENKNEFDDIYDKLVKVRNGIAHKLGFKN YIELAYALMSRLEYNAADVAGYRKQVLENIVPVHSELRKRQAKRLGLDKLTFYDEPIKFN SGNADPHGEPEWILNHGKTMYHELSKETDEFFTFMTENNLLDLLSKKGKMSGGYCTYIPD YKSPFIFANFNGTAHDVDVLTHEAGHAFQVYESRKYEVPEYLWPSYEACEIHSMSMEFLT WPWMKLFFENDTDKYKFIHLSESLLFIPYGVTVDEFQHWVYENPEATPEQRRNQWIEIEK KYLPTRDYGEIDELKEGIFWFRQGHIFASPFYYIDYTLAQVCAFQFLIKSVEDRENAWKE YLALCRLGGSKPFFELMKAANLKNPFTEGTLSSVIPKIREILNGFNDTEM >gi|261746911|gb|ADAD01000160.1| GENE 35 39387 - 41057 197 556 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 326 546 2 228 245 80 27 2e-14 
MKTVWKKHIKDILILSVFSLLDIFGEIYTAILFGQIIDSASSRNLSVMTFLIIKTLIFTI GVVIIFWLYQVYMKKFIYLMVRDTKTKVFSDIQNMSINQFFKEETGNKISLLTNDMSILE KDYFQSIVLVVRSVILFIFSIMTVFIKSYQIGIFLLTLTLISFFLPKFFERKLSSYKKEY SDAQSEYTARISEYLNGFDTIKSFNITETVKNIFFNNADNIVKTGTIYEKHYNFVRAVSI FLGSLTFMGGFLLGGYLAAKGIISLGTMVVCIQLTNHITNPVYTFVDRLSSFKSVSKILE KIENLSSGFEKNINANKICKNLKENINLKSVNFSYENKEILKNISYTFEKNKKYAIVGLS GSGKSTILKLLLGKTKATSGNITIDNTNINDLSEDSILSLYSYISQNVFLFKGSIFDNVT LYNDYPKEKVENILKKVGLEKFTKEMDNPDFVGENGVNLSGGEKQRISIARALIRETDIL VADEILSNLDNETALKIEKELLNLKNITLISVTHRLFEETLTNFDEIIVLNEGCIVEKGT FNELIEKKSFFTKYTD >gi|261746911|gb|ADAD01000160.1| GENE 36 41193 - 42401 493 402 aa, chain - ## HITS:1 COG:no KEGG:Sterm_3700 NR:ns ## KEGG: Sterm_3700 # Name: not_defined # Def: initiator RepB protein # Organism: S.termitidis # Pathway: not_defined # 1 401 1 402 403 381 62.0 1e-104 MEKIFDKNQIFLSFNKQLSKNENIIFKLFAEHIISNNISYIELSFKKLNTLLKLDQDNNI KNFFENFMKKKIIYKYNKFNFVKIEGHLYIISSYKIIDERFVISFSEDFFNIFNKDFNDF KTYKFNSLLQFDSVVKRNFFMMLIQKDIINNFLDISLEELKNILEIENNYSRFYDFEKYI LIPVIKEINYIQNIKIQYEKLKIRENGKIIGLRLFTKDKLDIQIHEQTQILMKEIETNVQ NYYEIQKFLKKYIQLKGIDYVKKNIYYSKLHSKNKFDLFLIESMKYDYFSRRFKNKLKND YKLIFFLDKVFKNTQHLRETFFQILKDSRFLHLSDIGPVLEETLDFINRKSDKLIKNFIP FYNEFIINLENTNECKYEDNIFVIFIEFNGIYDSHIYIFQKV >gi|261746911|gb|ADAD01000160.1| GENE 37 42615 - 43100 665 161 aa, chain + ## HITS:1 COG:ECs4018 KEGG:ns NR:ns ## COG: ECs4018 COG3444 # Protein_GI_number: 15833272 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB # Organism: Escherichia coli O157:H7 # 4 152 5 153 158 185 65.0 4e-47 MGLNILLTRIDNRLVHGQVGITWTKSTGANLIVVVDDEAANDDVQQQLMKMTAESSGADI RFFTVEKTISVIHKASDRQKIFIVCKTPEIVRKLVDGGVPVDKLNVGNMHSSPGKRQITK KVYVDDKDMEDLKYLKEKGIDIFIQDVPDDTKIPIDNFINS >gi|261746911|gb|ADAD01000160.1| GENE 38 43122 - 43919 1129 265 aa, chain + ## HITS:1 COG:agaC KEGG:ns NR:ns ## COG: agaC COG3715 # Protein_GI_number: 16131031 # Func_class: G Carbohydrate transport and 
metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC # Organism: Escherichia coli K12 # 1 263 1 262 267 298 58.0 7e-81 MHQITLVQGLLLAIMAIIVGVDFYLEVLFIFRPIIVGTLSGIILGDIKTGVLAAGLVELA FAGLTPAGGTQPPNPILAAIMTVVLSYTTGADVKTTIGLSLPFSFLMQYIILFYYSTFSL FVAKFDKYSSEADTKSYKKLSMITISIVALSYGIIVFLCAYMAQEPMRILVNSMPEWLAH GFEIAGGILPAIGFGMLLKIMFKIEYFPYLIIGFLVATFLNFSNLLPVALIGFAIAGYKF FEEKQNEEKMKVFKMVEREEEEDGI >gi|261746911|gb|ADAD01000160.1| GENE 39 43909 - 44706 949 265 aa, chain + ## HITS:1 COG:agaD KEGG:ns NR:ns ## COG: agaD COG3716 # Protein_GI_number: 16131032 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID # Organism: Escherichia coli K12 # 7 265 5 263 263 317 61.0 2e-86 MESKRILTNDDLKKLAFRSSLLQSAFNYERMQGIGWTHSMLPALEKIYKNDKEGLAKAMV DNSTFINTSPPIVTFLMGLLLSLEEKKEDRNLINGLKVALFGPLAGIGDALFWFTILPIV GGISASIASQGSIIGPVLFFIVYILIFLSRLYTVKLGYHTGVKAISTLKDVTRLITKSST ILGMTVIGGLIASYVHIEVLTKIPINKEHSISLQQDFFDKIIPNLLPLCYTFLMFYLMRY KKINPVTLIVITFVLTIVSSFFGIL >gi|261746911|gb|ADAD01000160.1| GENE 40 44966 - 45946 1357 326 aa, chain + ## HITS:1 COG:CAC3644 KEGG:ns NR:ns ## COG: CAC3644 COG0601 # Protein_GI_number: 15896877 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Clostridium acetobutylicum # 4 320 1 304 306 297 48.0 2e-80 MNNVLKFLVNRILMGLVTLWLVITITFFLLHLLPGDPFQSEKEVSPQIKENLMAKYHLDK PLGVQYVEYLKNVTKGDLGLSMKERGRTVNEIIADSFPTSADLGMRAVLFALITGIPLGI IAALNRGKYQDKLAMIIAVIGISVPSFVMAGLMQRYFVDIHNNVLIENRFLPEFLRIKIS GWDEPSKRILPVVALGLYTVALIARLLREKMIEVMGQDYIRLAVAKGVKPSNIVWKHALR NAILPVVTVMGPTIAAVLTGSFVIENMFTIPGLGKYYIDSINERDYTMVLGVTIFYAAFL ILMMIITDIVYVLVDPKIKLGKGAEV >gi|261746911|gb|ADAD01000160.1| GENE 41 45946 - 46929 1236 327 aa, chain + ## HITS:1 COG:SA0846 KEGG:ns NR:ns ## COG: SA0846 COG1173 # 
Protein_GI_number: 15926576 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Staphylococcus aureus N315 # 22 326 26 355 356 251 42.0 1e-66 MENTQFEKDTVLKNDIENTGFIPASEFQIVGADTAQSEVIYKPSLTFWQDGWRRFKKNKL ALTFLGIMLIFIFLAIFGQKMTKFLYSDQNLPEKFLNPFQGFKKGHYLGTDALGRDLFAR LSQGIRVSVELSVITAIICVIFGTIYGAISGYLGGIVDVIMTRIVEVIMTIPSMIYIILL MVVMGNSIKTIIIAMSLTRWLNYSLLVRGEVLKIKQNEFVLASKSLGGNFVWITLKHLIP NTLSVIIIRLTTDIPNIIFTEAFLSFIGLGVPIPQASLGNLVYDGNANMVSYPYLFIIPA VVISLITLSFNIVGDAINDALNPKLRS >gi|261746911|gb|ADAD01000160.1| GENE 42 46987 - 48009 1540 340 aa, chain + ## HITS:1 COG:BS_oppD KEGG:ns NR:ns ## COG: BS_oppD COG0444 # Protein_GI_number: 16078211 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Bacillus subtilis # 4 332 6 334 358 417 62.0 1e-116 MNEKILEVKDLSVSFNTYAGEVKALRNISFSVSRGETLAIVGESGSGKSVTVQTVMRLIP MPPGEIKSGEILFDGEDLVKVSDERMRQLRGGKIGMIFQDPMTSLNPSIRVGKQIMEGIL IHKKVSKKEAEKQAVEMLRKVGIPKPEERFKQYPHEYSGGMRQRAVIAIALSCEPDLLIC DEPTTALDVTIQAQILDLINELKKELNIGVILITHDLGVVAETSDRVVVMYAGEKLEEAP VKELFKNPKHPYTWGLLKSLPRLDMASNERLSSIPGTPPDLLNPPVGDPFAPRSEYAMKI DYEKKPPMIDLGNGHFVKSWLYVKGAPEIKSPFEKEKGGK >gi|261746911|gb|ADAD01000160.1| GENE 43 48011 - 48952 364 313 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 8 256 1 244 245 144 34 1e-33 MSNTKEKMIEIINLKKYFPMKKRQVLKAVDNVTMDVYKGEVLSLVGESGSGKTTLGRTLS ILYPKTAGKIILDGRSTDDYKTKEFTKKVQMIFQDPQASLNPRMTVGDIIAEGIDIHKLA SSKKERMEKVYNLLEIVGLNREHASRFPHEFSGGQRQRIGIARALAVDPEVLICDEPISA LDVSIQAQVVNLLKDLQKERNLTLLFIAHDLSMVKYISDRVAVMYRGKVVELGEPDAVYN DPIHSYTKSLLSAVPVADPTYNKREKVVMEEDYLRDPVGDISDINKIPEKPELTEYKPGH FVETSFLEEHNLI >gi|261746911|gb|ADAD01000160.1| GENE 44 49055 - 49432 335 125 aa, chain - ## HITS:1 
COG:no KEGG:no NR:gi|262039002|ref|ZP_06012336.1| ## NR: gi|262039002|ref|ZP_06012336.1| hypothetical protein HMPREF0554_1887 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1887 [Leptotrichia goodfellowii F0264] # 33 125 1 93 93 158 98.0 1e-37 MKQLINNPFFYSCIVFTLLFSTCLTPGIIAIILLIKQNIRKKTNGAFHIENGFLYTHEII SKKILISEIKNIKIDYSIFRGRKYIIRITTIYNETAGIMITHDLDNEFKRFKSELKENGY EIKDN >gi|261746911|gb|ADAD01000160.1| GENE 45 49482 - 50315 660 277 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039030|ref|ZP_06012364.1| ## NR: gi|262039030|ref|ZP_06012364.1| hypothetical protein HMPREF0554_1888 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1888 [Leptotrichia goodfellowii F0264] # 1 277 1 277 277 420 100.0 1e-116 MFKKRLNILILVLIVITLFSAVILTPIFIFKKNIFTSDTRSRKTFTKEINSNEIKNLSIN TDVANLHIKKHNKENFYIEYYAYKNSKFSVEENSLTELKIETLSKQNNSQNVYMKVYCPV SFKGNLKVKNDVGSTEIDGNYQNLNTVNNVGTMSVKGNYTSVVLQSDLGDISGDFSADSF DISGNTGDIDIKLLKISEGIYNIKSDIGDISLTLPKNPDINLTANTKIGDIHVNKGFIKS LSSYNIKNSNNELIIKNKKGKANININSNSSGDIKIK >gi|261746911|gb|ADAD01000160.1| GENE 46 50436 - 51086 986 216 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1910 NR:ns ## KEGG: Lebu_1910 # Name: not_defined # Def: signal peptide # Organism: L.buccalis # Pathway: not_defined # 1 216 1 216 217 338 79.0 7e-92 MKKFLLLIFMTLGLNIFANVNYNVFVTLDNAAKQNVESISAGLKNEGIDSLYSKGYVIHM TLYLTEYKPEALKKIKEIVNKIGKETRPFDVNFYRLRKTGGNWFMLDAENNDTIQGLADE ITVSLNKYRAKDAHVPDWAKSIPEKVKSFNLYGSPNVFTSFDPHITLLTPEDPAKIDAFV SKYTFKPFKARVTGIGIAQVDDLGQAKNVIYSVKFK >gi|261746911|gb|ADAD01000160.1| GENE 47 51228 - 52820 2426 530 aa, chain + ## HITS:1 COG:BS_dppE KEGG:ns NR:ns ## COG: BS_dppE COG4166 # Protein_GI_number: 16078361 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Bacillus subtilis # 20 512 44 533 549 287 33.0 5e-77 MKKILLILTIVIFVLSCGGSKTGKGSTLVLNSGSSPDSIDPQLTTGISGGTVDDLIMQGL LRKDKDGKSVAGLAEKWDVSPDGLKWTFHLRKGLKWSNGDPITANDFRAGWIRALDPKTA 
AGNANMLFMIKNAEEFNGGKANADQVGIKVIDNETLEVELSKPTPYFDDLLTFKAYMPLN EKFYKETGEKYFTEADKTISSGPYILKSWKSGKDAILEKNPDYWDAANVKVDKIKLKFIE DTNASFNAFKNKELDVTKVTYQQAKEMNSDPRLANANDGGVWYLLFNTKVKPLNNAKIRK AIVMAINRKELVENILENSEIYTEKFVPAGIGIKGLEKDFSEEVPTVAPEFNVEEAKKLL AEGLKEEGLSRFPDTEIIFGDSGNVKLIAEYIQESLRKNLGVEFKLSGMTGKERVARSKK RDYAITIHNWTGDFRDPITYLDLFDSKNPNNRGDFTSAKYDALVNIAKSTADPKVRIPAM IEMEKIISEEVPIGILFQRKKTYLVNPKVKGLGFVAIGGEFNFSDLIIEK >gi|261746911|gb|ADAD01000160.1| GENE 48 52842 - 53504 1015 220 aa, chain + ## HITS:1 COG:SA2311 KEGG:ns NR:ns ## COG: SA2311 COG0778 # Protein_GI_number: 15928102 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Staphylococcus aureus N315 # 1 217 1 217 223 223 48.0 3e-58 MKEKNKEIYDVFNYRYACKKFNKDKKVSDEDFATIIESARLSPSSFGLEPWKFLLLKNEK MKEDFKEFAWGGINSLNGASHIVIVLAKKGVTADSKHMAHMLKDVKKVSGEVENIIKEEF HDFQKNQFKLLENERALFDWASKQTYIALGNMMTTAAFLGIDSCAIEGFEKERAEKYLSE KGLLNTDEYGMSYMVSFGYRDEEQPVKIRQQLSEVFEVIE >gi|261746911|gb|ADAD01000160.1| GENE 49 53645 - 55531 3010 628 aa, chain + ## HITS:1 COG:FN1723 KEGG:ns NR:ns ## COG: FN1723 COG0445 # Protein_GI_number: 19705044 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: NAD/FAD-utilizing enzyme apparently involved in cell division # Organism: Fusobacterium nucleatum # 1 624 1 629 633 725 58.0 0 MREYDVIVIGAGHAGIEASLASARLGMKTAVFTITLDNIGVMSCNPSVGGPAKSHLVKEV DALGGEMGRNMDKSFVQMRILNTKKGPAVRSLRAQSDRKIYAKEMKKTLENQENLDIIQD IVTEFTVENGEIKGIKTKTGLEFKAKAVITATGTFLRGLMYIGEKRIKGGRMGELSSEEL TDSLKSLGFKMDRFKTGTPPRIDIRTLDVSKLEEQPGEKGVPLKFSMRTPDEEVLEKPQL SCYLTRTNLKAHKIILDNLDKAPMYNGSISSTGPRYCPSIEDKVVKFHEKDSHHLFLEPE GFDTAEVYISGLSTSYPAEYQQKIVNTIEGLENAHIMRYGYAVEYDIVDPGELDYTLETK RVKGLYLAGQINGTSGYEEAAAQGIIAGINAALKIKGEEPFILDRESSYIGTMIDDLINK ELIEPYRMFTARSEFRLILREDNADIRLSEKAYKIGLLDKKYYDKIEEKKKNVSETIEKL ENIKLGTSNGRLVEILDKYYESLKSGTTLKEILRRPKVTYQDIKYIAEIIENVSNLSFDE ETEYQIEVQVKYEGYIAKAKQIMDRQKKLDDKKIPKNFDYGVMKGITREAKQRLEEKRPY NVGQASRISGVTPADISVLLMYLEGVLK 
>gi|261746911|gb|ADAD01000160.1| GENE 50 55588 - 56034 482 148 aa, chain + ## HITS:1 COG:CAC2466 KEGG:ns NR:ns ## COG: CAC2466 COG0346 # Protein_GI_number: 15895731 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Clostridium acetobutylicum # 21 147 1 129 132 105 43.0 4e-23 MQYRIACKIEKRNMNLWRNDMKYNDLIPEMVVSDIEKSKEFYINHLGFKIEYEREEDKFV FLSLNEIQLMLEQGTDEELSMMKYPFGNGVNFSFGVENVEKIYKKLKEKNYPIKRDIEKR YFRVNDQILTPTEFSILDPDGYYIRITD >gi|261746911|gb|ADAD01000160.1| GENE 51 56093 - 56797 766 234 aa, chain + ## HITS:1 COG:FN1722 KEGG:ns NR:ns ## COG: FN1722 COG0357 # Protein_GI_number: 19705043 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division # Organism: Fusobacterium nucleatum # 4 234 1 232 232 213 54.0 3e-55 MKEIKEYFENLLEKAEIEAEKDKIEKMLQFSELLYEKNKFMNLTAIRDKKEIIEKHFIDS LLLTKIIKEDEKKFIDVGTGAGFPGLVLSIYYPEKEFLLVDSVRKKVEFINEVIEKLDLK NVKTSSERAEELIKNKRESFDTALCRGVANLRIILEYMIPFLKINGRFLPQKQNLNEIEE AGNALKVLNAQLENVYKFNLPESNDDRIILEIKKMKKTDGKYPRKTGVPAKKPL >gi|261746911|gb|ADAD01000160.1| GENE 52 57038 - 57895 1049 285 aa, chain + ## HITS:1 COG:BS_spo0J KEGG:ns NR:ns ## COG: BS_spo0J COG1475 # Protein_GI_number: 16081148 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 5 281 25 278 282 176 38.0 4e-44 MKLLKLPINKIITNPNQPRKYFDDEKMEELKESIKNNGLIQPIVVRKLESGKYEIIAGER RFRACRELGLESIEVLKINAGNSKSYEFSVLENIQRENLNPVEEAESYIMLMEVYGYTQE KLAEKLGKTRSSISNKTRILKLPEKVKEMVKKGDLSYGHARTLLGINDTKEITDLAKKII DKKYSVREVEKIVKKYSEKIEKKSEKKEKNNENVVKLNDKEVTENYDKYDENNDEKVFIE EKLREFFETKVSIKGDLQKQGKIEIDFFDYEDLSRIISLLNIEFE >gi|261746911|gb|ADAD01000160.1| GENE 53 57918 - 62522 5972 1534 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1778 NR:ns ## KEGG: Lebu_1778 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 4 1534 3 1526 1526 1429 53.0 0 
MNNYLKKTLAVFVPLLMSLFLIKGYISTNNFKDNLGKILKSTGLNVEFDKVRLKGLSKIE IDNLVVKDQSGNAVIKAKKATATVNLLLPSRLARVDAYDGEVFLERYKNNKFNMFNILPP SDPNKKTVDKASRIGKIYVHNAILHYSDTSFEQKISKDLTEVSGFLDVSKSRGFSLEAAG KGDGQEKISIKLGQLVNNFESLKSMFDTKKNTDKSKKEFNLGFVFENVNVTEQLGQYVPL DMIKAKKGLLNGTLNLTDGNPEKKIRATGKLTVKNGTLSYVDYEGDIENADAVIDMKGNT ITVDAVKMFNNDPLTFKMNADIDEGKLNIKLNAVNTPFQEISRYKLLKDVKAFGNVTGNL NADINTKTKETTLDGKFSSPNIKLGGYNFRDIKTGIAMNKDQILTVENTSFNFDETIGGF KIKDYVSSKKFVYNVKEKNGSGDYTISNRGSDYEIAEINGTGKIDKNNTITGNFNSNKID GNFVIEPSKKLMTVNADGKDYVGVKYGGQTYEVNPDVKNLVLNFGEKNILESGTINAKLK GGQNKYFDSINAKIDINKGAYNVNADINTGGQTIHAKGMTTREMYHSYTVTSAGKSTFDA AKLLRQYGYNLKGLDTAKLPIVLNAHISGKTDSLSGTYDIYSPFGKYIVEYEELHAKGKI RNLLSLNLDVNATMEELWLGYQRFKNVTGELDIRNNILNIVDVHNGKLNAKGQYNLKTGK MNINSDLNDYVLYNTSKPEVNVYVDSMTMNLDGKLDNLSGNITLNPAKTTINSQFIGNTK GIIDIKNSVLNFRDFGLRENSVQGTYDLKTELADISLNLNEPDVPKLFEFKDLTFGTFSV LNLKGDLNKFDLTGKIIFDNISYKGFRLPQVATQMEYSDGNVDKFFKYGTFDIKEFMLLG DNGEELFKTNTKFDLENIDIDYKLENQKFTLDSVQDLKEKGYSGDIDLNFIFKGKPEDFF TDVKIDSEKLVLGGFPVNNLNIDVQANNKGLNIGQFYLEYENNPLLVNGYLDFIPIKYNM SVLAKDFNLAFLEVGKDIEKASGIANIDILFSNEQTTGKILLDNFYYKTKDKLTNVENIN ADINVVNRKLNINRLDGGYNGGTFKVEGDLDVPGVPADFMRTKRLELGKFELNADLNRVG VRYGKDIDLVLSGDVKFTENNLFGNLTVNSGEIRAIPSFGGDKSENLSAEAQEKKLKEKT IVEGIVEEVIDKIVKQYTVDINLQSTGDLKLNIPSISVANISLLKNIKGDVVGGSRIFYD AGDVNLLGSYTLKKGSFVLNNNKFNLKNVEIRFTDPLAKMSELNPFVVFEASSNIKGERI EISMNSYLKEANLTFKSESGLTKEQILSLLAFNETGKTGDNPNGSTQTETALIGSALNLA LNQLIFSPVTDKIGETLGLTNVSVKTDFKKSEGTGKYSGATTLYIQDNIYKDKLFWNFEV KFPFQAKEKETNNSNPLGYNAWVNYNVAEGLELRLGGETVTKRNKDYINNGTSIKNIKNE MNYYVGVDFSARADTFRDLIRKIFRKKKLDILTK >gi|261746911|gb|ADAD01000160.1| GENE 54 62538 - 64607 3156 689 aa, chain + ## HITS:1 COG:FN1911 KEGG:ns NR:ns ## COG: FN1911 COG4775 # Protein_GI_number: 19705216 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein/protective antigen OMA87 # Organism: Fusobacterium nucleatum # 28 689 3 678 678 337 32.0 5e-92 MKRKIYALIVTAALFLSVNGLVMAEGKDLRVGTIQFSHLNQLPEDFLMEKLPVKSGETYS 
NKSLSDIYLALKRLSYISNVNVYPKVEGDTVNLVIEVDEAGNALELAQREESAQDLGKKT EFKVSSVDISGIKTLNKEDFLKDIPVKVGEYFTPQDAINGAAKIFQSGYFSSVDPKVDRK TDNTISILYVVEENPAVQSVTVKGNTLFTQDELVKALGLKQGEVLNGNLLDPDRNGIIRH YSQAGYSLARIETIAVSPSGDINIELTEGIVDSVSFKKVSSKKDNERTTESSANLRTKPY VFERSQAVKPGEVFEAKNIEATIRELYRTGVFTSIEPVLSGKENDPNARVVEFLVEERPT TTINGSISYGTSVGLVGGIKLSDSNFLGRGQEAAFNIEASNKGDKTLELSLFDPWIKGTE RIQGGGSIYWKETADDDAADNEVSKVRKIGTRWTIGKGLNDKIFVRGSLRFDHYKEILGS KQINDRYNLVAISPTLVYDSRDNAFAPTKGIYSTLSYEFGDLIKDSRKYSQFEADLRGYH RTFFKDKNVMAYRVVWGSTGSGTPEALRFSIGGAETLRGYDYGKYDGFNKFHATIENRTQ INQYIQLVAFFDIGNAWQNVTTVNGKKVYSPNRKDANGFKDLKKGVGIGVRLNTPIGPLR FDYGWPLDPEEKGGKKTGGKFYFSFGQTF >gi|261746911|gb|ADAD01000160.1| GENE 55 64671 - 65675 1439 334 aa, chain + ## HITS:1 COG:FN1909 KEGG:ns NR:ns ## COG: FN1909 COG1044 # Protein_GI_number: 19705214 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase # Organism: Fusobacterium nucleatum # 2 330 3 331 332 362 58.0 1e-100 MYNIKDVAKLINGDIKGNDSLSFIGLSPFFQAKENEITFAADEKMLKKIEECKAGAVIVP FVEGLPENKTYIVVKNNPRELMPILLSYFKPEIKPFEKSREDSAQISEGANISPINTYIG HNVKIGKNTVVYPNVSIFEGAEIGDNCIIYSNVTIREFTKIGNGSIIQPGAVIGSDGFGF IKVNGNNVKIEQIGKVIIEEEVEIGANTCVDRGTIGDTVIKKGTKIDNLVHIAHNDIIGE NCFIVAQTGISGSVEVGNNTTLAGQVGVAGHLKIGNNVVIAARSGVTNDVPDGKQMSGYP LRDHMEDLRIKMAMGKVPELVKKFRKMEKEMGKD >gi|261746911|gb|ADAD01000160.1| GENE 56 65821 - 66462 903 213 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1775 NR:ns ## KEGG: Lebu_1775 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 206 1 208 214 163 51.0 5e-39 MKKLLLGVLMFILVISCGEKPETVVSKFIDNIKEKKFDEASKYSLNKDFTKDLKLEYNNK MQQLFFETLFKNMKYEIVGKEKQGDMTIVTVSVENVDVQKVFLMIYQALLKDTFSENSAP NIEDEFKKILESKDVPKQKNTTKFVVVKTKEGNKVNVTAENIDVLFGKINTTFSNLNTLG ADTDDENDTQVRQEGPSAGKDQKLTEPKLDNNK >gi|261746911|gb|ADAD01000160.1| GENE 57 66575 - 67519 1279 314 aa, chain + ## HITS:1 COG:FN1610 KEGG:ns NR:ns ## COG: FN1610 COG1281 # Protein_GI_number: 19704931 # Func_class: O 
Posttranslational modification, protein turnover, chaperones # Function: Disulfide bond chaperones of the HSP33 family # Organism: Fusobacterium nucleatum # 6 314 3 282 285 234 43.0 2e-61 MKEKSKLIRGTSKCARFFVCDTTQIVQEGKDIHNLDPVETTLFGKLLTITAIMGKDLKGE NDLITLRVSGDGPYGSMLATGNKKGEVKGYTGTLEEEKLNKIINEKGEFITDETGQVRFL GNGTLQVIKDMGLKEPFVGLTQLDGEDIADTMAHYFLISEQIKSVVTLGVKLTADGNVEK AGGYVVQLLPGVEERFIDKLEDKLRQIRSITELLSGGFSPERIVELLYEDISTVEDEAEM SGAHVKKYVEDYEILEKSKLEYKCNCTKEKFYKGIITLGKEEINQILEEEGKIQVECHFC GKKYTFEKDDFKNM >gi|261746911|gb|ADAD01000160.1| GENE 58 67546 - 68475 1430 309 aa, chain + ## HITS:1 COG:PM0144 KEGG:ns NR:ns ## COG: PM0144 COG1181 # Protein_GI_number: 15602009 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Pasteurella multocida # 6 303 9 307 309 183 36.0 3e-46 MSKLLKVGVISGGVSTERDVSLNTGREIVKNLNRGKYEVFDIVINSENEVFEKLKDLNLD FVYIALHGAFGEDGRIQAILESLGIAYGGPGVMSSSVCMDKELTKKIVATYGVRVAKGLS VRRGETADFKTISEKLGKRIIVKPNSGGSSIGVSFVESEEQLQEALKLVFTMDKEALVEE VLSGTEISVPVIDGKVYPTLKIEAVAGDYFDYKSKYSDGGAREFVFEFEKEIQQEIDKFA YDSYYAMKCEGFSRVDFMVVGDKPYFMEVNTLPGMTAASLLPKSTASKGYSYSETLDLLI EASLKVDRK >gi|261746911|gb|ADAD01000160.1| GENE 59 68581 - 69060 821 159 aa, chain + ## HITS:1 COG:CAC2942 KEGG:ns NR:ns ## COG: CAC2942 COG1854 # Protein_GI_number: 15896195 # Func_class: T Signal transduction mechanisms # Function: LuxS protein involved in autoinducer AI2 synthesis # Organism: Clostridium acetobutylicum # 1 159 1 158 158 209 61.0 2e-54 MERIASFQVDHIRLNRGLYVSRIDEVNGNYLTSFDIRMKLPNREPVINIAELHTMEHLGA TFLRNHPVWKNEIVYFGPMGCRTGFYLILKGKLESKDIIDLMKELYKFMAEFEGEVPGAT AIECGNYLDQNLPMAKFEAKKYLEETLNNLGEENLNYPE >gi|261746911|gb|ADAD01000160.1| GENE 60 69266 - 69361 84 31 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDKVIYISNSIDVELDKNLLNKFRENGIEVE >gi|261746911|gb|ADAD01000160.1| GENE 61 69572 - 70078 518 168 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1767 NR:ns ## KEGG: Lebu_1767 # Name: not_defined # Def: hypothetical 
protein # Organism: L.buccalis # Pathway: not_defined # 2 155 66 219 234 111 57.0 1e-23 MIKYIETKNLTTNKENFKISYPDILLNEKIYEKTIIEIKKINFNSKNKLEVTYEERVPNM FDLMFSEELNKLLEDKIKEKFGYKLDEEKIEKLSIEEKNKLNIIILSIAREEMLKRLETD KFKYIVSKTKAIFKKENGKWELKDEETLDIENFDIPFLKMKEGNNENN >gi|261746911|gb|ADAD01000160.1| GENE 62 70065 - 70823 1149 252 aa, chain + ## HITS:1 COG:FN1343 KEGG:ns NR:ns ## COG: FN1343 COG0084 # Protein_GI_number: 19704678 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Fusobacterium nucleatum # 1 252 7 257 258 286 56.0 2e-77 MKIIDTHTHIYDERFEEDFDIIMKNIEEQMEGIVSIGFDLESSKKSIDLANKYSFVHAVI GVHPVDIKKLNDEAEKELEKLALTEKKVVAIGEIGLDYHWMEDPEEVQKEGFRKQIELAR RVKLPIVIHTREALQDTLDILKEYKNVGGILHCYPGSYEAAKPFLDRYYIGVGGTVTFKN NRKTKELVKKLSLEKIVLETDCPYLTPVPFRGKRNEPGYTKYVAEEIARIKEISVEEVIN ITTENAKKIYGM >gi|261746911|gb|ADAD01000160.1| GENE 63 70931 - 71296 642 121 aa, chain + ## HITS:1 COG:FN1342 KEGG:ns NR:ns ## COG: FN1342 COG0736 # Protein_GI_number: 19704677 # Func_class: I Lipid transport and metabolism # Function: Phosphopantetheinyl transferase (holo-ACP synthase) # Organism: Fusobacterium nucleatum # 3 119 2 121 122 105 54.0 2e-23 MEIYGIGTDIIEISRIKEAIHRTESFKKKVYTEKEIEYIEKKKNPYASFAGRFAGKEAVS KALGTGVRGFSLKDVEILNNELGKPDVILYNELLNIAKDMKIQISISHSKEYAVSTVIIY K >gi|261746911|gb|ADAD01000160.1| GENE 64 71318 - 72571 1297 417 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2206 NR:ns ## KEGG: Lebu_2206 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 13 396 20 410 431 550 75.0 1e-155 MLKKILEKKVIKIILFTLMFTVGSLSFAVDKVKIEYFGRKDCKNCANLEKFLGQLSSERN DFEYIEYKIDENKENKSFFDEVTTKLKLVKGTPVIYVNENIIQGFNTPETTGQEIIKLID NGKQKNSIPSLAEYVKNGKYTNISNNGAVCEGDGVCEIPGFTKDASKQVIINIPFINKSI DLTDYSLPVMSLILGTVDGFNPCAMWVLVLFLTALIAVGSKTKMFRVAGLFIFAEAVMYF LILNAWLYTWDFVGLDKWITPIVGVVGIVGGLFFIKNYLKKGDELSCEVTDFKKRAKISN QIKDIANKPFTILTALGIIGLALSVNVIEFACSVGIPQTYTKILQINNISFVQRQLYTFI YIIGYMIDDIIVFGLALLSVNKLQLTTKYSKWVNLFGGILMVILGLILLLKPSLLVV Prediction 
of potential genes in microbial genomes
Time: Thu Jul 14 03:37:18 2011
Seq name: gi|261746854|gb|ADAD01000161.1| Leptotrichia goodfellowii F0264 contig00088, whole genome shotgun sequence
Length of sequence - 52343 bp
Number of predicted genes - 56, with homology - 54
Number of transcription units - 18, operones - 10 average op.length - 4.8

   N Tu/Op      Conserved S       Start      End  Score
                pairs(N/Pv)
   1   1 Tu  1  .         + CDS       1 -    495    223  ## PROTEIN SUPPORTED gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1
                          + Term    553 -    603    7.5
                          - Term    548 -    583    2.0
   2   2 Tu  1  .         - CDS     630 -    899    241  ## COG3326 Predicted membrane protein
                          - Prom   1027 -   1086   11.2
                          + Prom    971 -   1030    9.1
   3   3 Tu  1  .         + CDS    1105 -   1257    159  ##
                          + Term   1358 -   1395    3.0
                          + Prom   1286 -   1345    4.8
   4   4 Tu  1  .         + CDS    1451 -   3115   3215  ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase
                          + Term   3229 -   3264   -0.6
                          + Prom   3345 -   3404    6.5
   5   5 Tu  1  .         + CDS    3441 -   4664   1705  ## COG1171 Threonine dehydratase
                          + Prom   4666 -   4725    2.2
   6   6 Op  1  32/0.000  + CDS    4781 -   6499   2575  ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase]
   7   6 Op  2  .         + CDS    6492 -   6980    791  ## COG0440 Acetolactate synthase, small (regulatory) subunit
   8   6 Op  3  .         + CDS    7060 -   8565   2280  ## COG0119 Isopropylmalate/homocitrate/citramalate synthases
   9   6 Op  4  1/0.000   + CDS    8593 -   9432   1207  ## COG1408 Predicted phosphohydrolases
  10   6 Op  5  .         + CDS    9438 -  10298   1126  ## COG1408 Predicted phosphohydrolases
  11   6 Op  6  30/0.000  + CDS   10368 -  11768   2221  ## COG0065 3-isopropylmalate dehydratase large subunit
  12   6 Op  7  .         + CDS   11768 -  12343    823  ## COG0066 3-isopropylmalate dehydratase small subunit
  13   6 Op  8  .         + CDS   12372 -  12803    537  ## gi|262039091|ref|ZP_06012424.1| putative transcriptional regulatory protein
  14   6 Op  9  .         + CDS   12818 -  13255    574  ## gi|262039075|ref|ZP_06012408.1| integral membrane sensor signal transduction histidine kinase
  15   6 Op 10  1/0.000   + CDS   13292 -  14089   1045  ## COG0500 SAM-dependent methyltransferases
  16   6 Op 11  .         + CDS   14135 -  15193   1546  ## COG0473 Isocitrate/isopropylmalate dehydrogenase
  17   6 Op 12  .         + CDS   15213 -  16016    284  ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein
  18   6 Op 13  .         + CDS   16067 -  17077   1638  ## COG0059 Ketol-acid reductoisomerase
                          + Term  17117 -  17170    0.4
                          - Term  17103 -  17158    0.8
  19   7 Tu  1  .         - CDS   17161 -  17652    547  ## COG0526 Thiol-disulfide isomerase and thioredoxins
                          - Prom  17682 -  17741   10.2
                          + Prom  17621 -  17680   11.1
  20   8 Op  1  3/0.000   + CDS   17820 -  18770   1330  ## COG0679 Predicted permeases
                          + Prom  18780 -  18839    5.1
  21   8 Op  2  .         + CDS   18869 -  20509   2169  ## COG0281 Malic enzyme
  22   8 Op  3  .         + CDS   20581 -  21090    569  ## Lebu_1752 hypothetical protein
  23   8 Op  4  .         + CDS   21044 -  21169     82  ##
                          + Prom  21300 -  21359   13.8
  24   9 Op  1  .         + CDS   21559 -  23388   3035  ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains
  25   9 Op  2  .         + CDS   23427 -  24062    396  ## Lebu_2080 hypothetical protein
  26   9 Op  3  .         + CDS   24125 -  24376    472  ## gi|262039108|ref|ZP_06012441.1| conserved hypothetical protein
                          + Term  24387 -  24430    3.7
                          + Prom  24462 -  24521   10.9
  27  10 Op  1  .         + CDS   24595 -  25050    818  ## Lebu_1704 hypothetical protein
  28  10 Op  2  .         + CDS   25075 -  25752   1009  ## COG0775 Nucleoside phosphorylase
  29  10 Op  3  .         + CDS   25798 -  26538    996  ## Lebu_2131 sporulation domain protein
  30  10 Op  4  .         + CDS   26541 -  27557    184  ## PROTEIN SUPPORTED gi|149007035|ref|ZP_01830704.1| 50S ribosomal protein L31 type B
  31  10 Op  5  .         + CDS   27637 -  27849    268  ## gi|262039089|ref|ZP_06012422.1| aspartyl protease family protein
  32  10 Op  6  .         + CDS   27849 -  28163    432  ## Lebu_2128 hypothetical protein
  33  11 Tu  1  .         - CDS   28247 -  28828    400  ## Sterm_1582 TetR family transcriptional regulator
                          - Prom  28853 -  28912   11.2
                          + Prom  28802 -  28861   11.2
  34  12 Op  1  .         + CDS   28954 -  30051   1654  ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain
  35  12 Op  2  .         + CDS   30068 -  30412    669  ## SSUBM407_1759 hypothetical protein
                          + Prom  30414 -  30473    4.5
  36  12 Op  3  .         + CDS   30493 -  31422    890  ## COG0598 Mg2+ and Co2+ transporters
                          - Term  31417 -  31455    4.1
  37  13 Op  1  .         - CDS   31574 -  31990    532  ## Lebu_0717 hypothetical protein
  38  13 Op  2  .         - CDS   32023 -  32598    587  ## COG2249 Putative NADPH-quinone reductase (modulator of drug activity B)
  39  13 Op  3  35/0.000  - CDS   32632 -  34365    230  ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9
  40  13 Op  4  .         - CDS   34359 -  36089    262  ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34
                          - Prom  36266 -  36325   13.8
                          + Prom  36218 -  36277   10.8
  41  14 Tu  1  .         + CDS   36302 -  37192   1155  ## COG2378 Predicted transcriptional regulator
                          + Term  37322 -  37357   -0.6
  42  15 Op  1  .         + CDS   37871 -  38470    696  ## COG0693 Putative intracellular protease/amidase
  43  15 Op  2  .         + CDS   38484 -  38948    627  ## COG3708 Uncharacterized protein conserved in bacteria
  44  15 Op  3  .         + CDS   39014 -  39550    502  ## COG4283 Uncharacterized conserved protein
  45  15 Op  4  .         + CDS   39627 -  40394   1001  ## gi|262039105|ref|ZP_06012438.1| putative liporotein
                          + Term  40458 -  40495    2.3
                          + Prom  40497 -  40556   10.6
  46  16 Op  1  2/0.000   + CDS   40576 -  41496   1137  ## COG2103 Predicted sugar phosphate isomerase
  47  16 Op  2  3/0.000   + CDS   41465 -  42301   1011  ## COG1737 Transcriptional regulators
  48  16 Op  3  .         + CDS   42320 -  43732   2371  ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific
  49  16 Op  4  .         + CDS   43737 -  44525    993  ## COG0627 Predicted esterase
  50  16 Op  5  .         + CDS   44537 -  46333   2474  ## COG1680 Beta-lactamase class C and other penicillin binding proteins
  51  16 Op  6  .         + CDS   46349 -  47425   1072  ## COG3589 Uncharacterized conserved protein
  52  17 Op  1  11/0.000  - CDS   47828 -  48460    860  ## COG3634 Alkyl hydroperoxide reductase, large subunit
  53  17 Op  2  .         - CDS   48486 -  49052   1009  ## COG0450 Peroxiredoxin
                          - Prom  49111 -  49170    8.8
  54  18 Op  1  .         - CDS   49242 -  50672   1824  ## COG1488 Nicotinic acid phosphoribosyltransferase
  55  18 Op  2  .         - CDS   50692 -  51030    469  ## gi|262039087|ref|ZP_06012420.1| hypothetical protein HMPREF0554_1439
  56  18 Op  3  .         - CDS   51027 -  52265    958  ## Smon_0471 hypothetical protein

Predicted protein(s)
>gi|261746854|gb|ADAD01000161.1| GENE 1 1 - 495 223 164 aa, chain + ## PROTEIN SUPPORTED
## NR: gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 [Kordia algicida OT-1] # 17 160 206 345 347 90 36 2e-17
GKDTKGTLIGTAGGAAVGAVIGNIFDRQEKELKNRLDGSGVKVERTGEGEIKLTAPENIT
FDTNSSVIKSRFTGSLNSVADVLKKYPDSNIIVSGHTDSTGNDSINNPLSVNRAASVKSY
LVGAGVPSSRITSVGYGSKEPIASNSTANGRAQNRRVEIKIVAK
>gi|261746854|gb|ADAD01000161.1| GENE 2 630 - 899 241 89 aa, chain - ## HITS:1 COG:lin0836 KEGG:ns NR:ns
## COG: lin0836 COG3326 # Protein_GI_number: 16799910 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 6 77 3 74 85 65 48.0 2e-11
MQKTFIIISLILINIITFLLFVIDKRKAERHQWRIPETTLLGFSLIGGALGGIIGMMLYH
HKVRKPHFYIGLPMILIIQISLYLNFFQI
>gi|261746854|gb|ADAD01000161.1| GENE 3 1105 - 1257 159 50 aa, chain + ## HITS:0 COG:no KEGG:no NR:no
MFLISILVILLLIIIQILGIIVVKISQLSPLRASKNDFDIKSVIKKNDRY
>gi|261746854|gb|ADAD01000161.1| GENE 4 1451 - 3115 3215 554 aa, chain + ## HITS:1 COG:PAB0895 KEGG:ns NR:ns
## COG: PAB0895 COG0129 # Protein_GI_number: 14521553 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Pyrococcus abyssi # 3 550 2 548 551 656 60.0 0
MARSNNLTKGAARAPHRSLLKGLGFTSDEMNKPLIGIANSFNEIIPGHVHLKTLVQAVKD
GIRNAGGVPMEFNTIGICDGLAMNHIGMKYSLVTRNIIADSIEATAMATPFDALVFIPNC
DKVVPGMLIAAARLNIPSLFISGGAMLAGVYKGKKVGLSNVFEAVGEYEAGLITRKELNM VEDMACPTCGSCSGMYTANTMNCLTEALGMGLPGNGTVPAVFSERIRLAKKAGMQIVEVL KKDLRPKDIMTREAFENAVAVDMALGGSSNTALHLPAIAYEAGVKLTLDDFNEIAKKTPQ LCKLSPSGEYFIEDLYRAGGVTGVMKRMYENGRLHGDAKTVALETQVELVKGAYINDEDV IKPWNKPAYETGGIAVLKGNLAPDGSVVKEGAVDREMLVHSGPAKVFNNEEDAVEAIRGG KIVAGDVVVIRYEGPKGGPGMREMLAPTAMIAGMGLDKDVALITDGRFSGATRGASIGHV SPEAASGGTIGVVKDGDIIEIDIPQRKLNVKLSDEEIAERKAEMEPFKIEVKGYLKKYAM HVSSAAEGAIEILD >gi|261746854|gb|ADAD01000161.1| GENE 5 3441 - 4664 1705 407 aa, chain + ## HITS:1 COG:FN1411 KEGG:ns NR:ns ## COG: FN1411 COG1171 # Protein_GI_number: 19704743 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Fusobacterium nucleatum # 1 402 1 402 404 485 69.0 1e-137 MHKLYDFMEARERVGTVIVKTKLIHSNVFSEEAGNDVYIKPENLQVTGSFKVRGAYNKMS KLTEEEKKKGVIASSAGNHAQGVALAAQKLGIKAVIVMPKHTPLIKVEATKKYGAEVILS GEVYDEAYQKALELQEKEGYVFIHPFNDEDVIEGQGTIALEVLEELPDTDIVMVPLGGGG LISGVAAAAKLKNPQIKIIGVEPEGAASAKASIDKDEVVELKETSTIADGAAVKKIGNID FDYIKKYVDEIITVSDYELMEAFLLLVEKHKIVAENAGILSIAGLKKLKDKNKKVISILS GGNIDVLTISSMINKGLVVRGRIFKFSVDLPDKPGQLVAVSQILAKENANVIRLDHNQFK NLDRFHEVELQVTVETNGEEHIKKIIESFKKEGYVVKRLNSRETMEE >gi|261746854|gb|ADAD01000161.1| GENE 6 4781 - 6499 2575 572 aa, chain + ## HITS:1 COG:MA3792 KEGG:ns NR:ns ## COG: MA3792 COG0028 # Protein_GI_number: 20092588 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Methanosarcina acetivorans str.C2A # 2 563 5 559 564 634 54.0 0 MSIEMINGARILLECLHRIGVTDIFGYPGGAVIPIYDEIYKYDGINHYFARHEQGAAHEA DGYARVSGKIGVCLATSGPGATNLVTGIMTAHMDSIPMLAITGQVSSSLLGKDAFQESDI VGITVPITKMNYLVQDIKDLPRIIKEAYYIATTGRPGPVLIDIPKDIQLQEITYEEFNKL YEKDFKLEGYDPTYKGHQGQIKRALKLIKEAKKPLIIAGAGVLKAKASEELREFADKTQT PVAMTLLGLGTFPGDHELSLGMLGMHGTVYANYATDEADLVIAAGIRFDDRITGNPDTFC 
PKAKIIHIDIDPAEIGKNKDIDVPIVGDLKNVLTEINKEIGKLSHSEWIEKVKGWKNEYP LVYRKVSEDKLIPQEVLEELNNILKGDAIIVTDVGQHQMWTAQFMTYQNPDSIVTSGGAG TMGFGLPAAIGAQVAMPDKKVVLIVGDGGFQMTFQELMLIRQYKLPVKVLIINNSFLGMV RQWQEIFNDKRYSFVDLTYNPDFMKIGEAYGIKTVKIENKEELKNKMEELIKSDEGIVIN CIVEKEENVYPMIPAGTSVAQMVGKRGVLEDE >gi|261746854|gb|ADAD01000161.1| GENE 7 6492 - 6980 791 162 aa, chain + ## HITS:1 COG:MTH1443 KEGG:ns NR:ns ## COG: MTH1443 COG0440 # Protein_GI_number: 15679440 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Methanothermobacter thermautotrophicus # 5 161 9 166 168 135 41.0 3e-32 MNKEHEILIIAKNTNGIVARIMSLFNRRGYFVNKMTAGVTNKTGYARLTLTVAGDEKSLD QIQKQVYKIVDVVKVKVFPANGVIRRELMLIKVKSDSDTRSQIVQIADIYRGKVLDVSPT SLIIELTGDVQKLRGFVEIMRDYGILEMAKTGVTAMSRGEKL >gi|261746854|gb|ADAD01000161.1| GENE 8 7060 - 8565 2280 501 aa, chain + ## HITS:1 COG:aq_2090 KEGG:ns NR:ns ## COG: aq_2090 COG0119 # Protein_GI_number: 15607049 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Aquifex aeolicus # 2 497 7 504 524 485 51.0 1e-136 MKKIKIFDTTLRDGEQTPRVNLNAEEKLRIAKQLEAMGVDIIEAGFAVASLGDFEAVKMI AENVEKSTVTSLARAVKKDIEVAAGALKKAKKPRIHTFIATSPIHREYKLKMTKEEILKK VTEMVTYAKSFVDDVEFSAEDATRTEKEYLVEVYETAIKAGATTLNVPDTVGYRTPDEMY ELIKYLKENVKGIENVDISVHCHDDLGLSVANSIAAIKAGATQIECTINGLGERAGNTSL EEVVMIVKTRKDIFDEYETAIDTKQLYPVSKLVSLLTGVTAQPNKAIVGANAFAHESGIH QHGVLANPETYEIMKPETVGRNTDSLVLGKLSGKHAFVQKLSALGFEDMADDKVEELFSE FKKLADKKKYVLDEDIISLVTGDAAKIEGKLSLEHFEITRKDKKPKAEITISINKEKKSG EAFGDGPVDAAYNAVNKILNDSFVLEEYKLDSITGDTDAQAQVVVILEKEGKRFMGRGQS TDIVEASINAYINGINRLYKN >gi|261746854|gb|ADAD01000161.1| GENE 9 8593 - 9432 1207 279 aa, chain + ## HITS:1 COG:yaeI KEGG:ns NR:ns ## COG: yaeI COG1408 # Protein_GI_number: 16128157 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Escherichia coli K12 # 51 276 27 246 247 95 32.0 8e-20 
MNRKIIKNSVILLLCVIVVFLIYAHYEYRHIKIRTIEIASKDVPEEFNGKKILYVADFQY DTMMRFNKKQLKKAIDLINEQKKDIILLGGDYSTWEKYRPHFYKEAENLKIPQYGVYAIY GNHEYPNIEGSNESLKKLGYNLLINENKKITINNQSIYIAGVEDLWHGKPDAEKVLNVVK KEDFVIFMTHNPEYFEEMTESQKERADMTLAAHTHGGQVTFFGKIVFAPIKHKDKYGYGM KEYDGHKIYTTSGVGGSFLEMFIRFFAPPEIVIFELKRV >gi|261746854|gb|ADAD01000161.1| GENE 10 9438 - 10298 1126 286 aa, chain + ## HITS:1 COG:BS_ykuE KEGG:ns NR:ns ## COG: BS_ykuE COG1408 # Protein_GI_number: 16078469 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Bacillus subtilis # 29 286 30 287 287 85 31.0 1e-16 MGIFKKETLKKMKYIFLTLFVLGIIALIYAHYEYTQIKIKTLEIASKDIPEEFDGKTIIF AADFQLDTYARFNKKQSDRIINLINEQEKDLIILGGDYTNWTGKIPRFYKEMEKLKKPEY GIYAVLGNHDYNSAPKNIENLKRLGYKVLINENDKITINNRSIYISAVDDLLKGKPDAEK TLEGIKKEDFNIYITHNPDYFEDMTEEQKNRSDITLAGHTHGGQITLFGLILLAEIKHPW KYGYGLKEYDGHKIYTTSGVGGSAFELFIRFFAQSEIVVLKLKKIN >gi|261746854|gb|ADAD01000161.1| GENE 11 10368 - 11768 2221 466 aa, chain + ## HITS:1 COG:lin2096 KEGG:ns NR:ns ## COG: lin2096 COG0065 # Protein_GI_number: 16801162 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase large subunit # Organism: Listeria innocua # 7 463 3 455 462 657 69.0 0 MGKNKSKTLFDKVWEKHVITGNPGEAQLLYIDLHLIHEVTSPQAFSGLRLAGRKVRRPDL TFGTMDHNTPTIPEQRMNILDKISKAQLDALKANCEEFGIELLDMFNDKNGIVHMVGPEL GLTQPGKTIVCGDSHTATHGAFGALAFGIGTSEVEHVLATQTIWQKKPKTMGIEITGKLQ KGVYAKDIILYIIKTYGIGLGNGYAFEFFGDTIRGLSMEERMTICNMAIEGGGKSGIIAP DETTFEYVKGRKYAPQGEELERKIAEWRELYTDSVEAFDKYIKIDVSNLVPQVTWGTNPE MGMDITEVFPDIKDHNDEKAYEYMGLEPGNSPKDIKLKHVFIGSCTNGRLSDLEIVANIV KGKKVHPNITAVVVPGSQTVKRLAEEKGIDKILKDVGFQWREAGCSTCLGMNPDQIPGGE HCASTSNRNFEGRQGKGARTHLVSPAMAAAAAIHGHFVDVRFEEGV >gi|261746854|gb|ADAD01000161.1| GENE 12 11768 - 12343 823 191 aa, chain + ## HITS:1 COG:SA1865 KEGG:ns NR:ns ## COG: SA1865 COG0066 # Protein_GI_number: 15927635 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase small subunit # Organism: Staphylococcus 
aureus N315 # 1 188 4 188 190 234 57.0 7e-62 MKPFTKYEGTIVPIMNDNIDTDQLIPKQYLKSIEKTGFGDYVFDEWRYNEDGTDNMNFNL NKPEYKKGTILITGDNFGCGSSREHAAWALQDYGFHVIVAGGYSGIFYMNWLNNGHLPIT LSERERLELAALSGNEKVIVDLENNKLSANGKEYTFELEETWKQRLLKGLDSIGLTLEYE NEIRAFEEKRK >gi|261746854|gb|ADAD01000161.1| GENE 13 12372 - 12803 537 143 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039091|ref|ZP_06012424.1| ## NR: gi|262039091|ref|ZP_06012424.1| putative transcriptional regulatory protein [Leptotrichia goodfellowii F0264] putative transcriptional regulatory protein [Leptotrichia goodfellowii F0264] # 1 143 1 143 143 241 100.0 2e-62 MAFKKLDIKGNIEKYFRENGINYDKNYLLLARRKLFWNIWDVYSQIEEELSKACVVLVKK DEVEFFSCPIKMNGFSSMTMKIGEKRFKLNKNEILSFEIKKILIAYKIRIETKDSVEVFF VEGKIFGKNWIDENVEYIKELLK >gi|261746854|gb|ADAD01000161.1| GENE 14 12818 - 13255 574 145 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039075|ref|ZP_06012408.1| ## NR: gi|262039075|ref|ZP_06012408.1| integral membrane sensor signal transduction histidine kinase [Leptotrichia goodfellowii F0264] integral membrane sensor signal transduction histidine kinase [Leptotrichia goodfellowii F0264] # 1 145 1 145 145 240 100.0 3e-62 MNSDRNNEQDESIIKFFEENNIIYEKNYLFVMGFVKVEPMDYFIRAITFRFMPLHYIVLF EKDEIKFYGFKSFKFHPNPELIIKKSEIAEIKLTNNNFLGKLKIFTKSSEEKIKTYDFKV IVGNRKWTKENLEYLRNSEWSSKMI >gi|261746854|gb|ADAD01000161.1| GENE 15 13292 - 14089 1045 265 aa, chain + ## HITS:1 COG:BH2887 KEGG:ns NR:ns ## COG: BH2887 COG0500 # Protein_GI_number: 15615450 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Bacillus halodurans # 19 265 8 260 261 167 37.0 2e-41 MENMENRKKKVQVTEDVIKKGKHWNTEKYEKNARFVSDYGIDVISWLDPKPHEYILDLGC GDGVLTKKIIEFGCKVLGVDGSLEFVNAAKKIGVNAVQGDGENLNFEEEFDAIFSNAALH WMTDQEKVISGVSKGLKKGGRFVVEMGGAGNLEKIQKVITETVSEYGYKIKKCWFLPTES EERKLLEKYGLKVNKMSFFKRPTSLPTGIKEWLWTITVPLLGNVPEELHGEIIEKIAKKL ESRLEYKNDKYIADYVRLRFIAYKE >gi|261746854|gb|ADAD01000161.1| GENE 16 
14135 - 15193 1546 352 aa, chain + ## HITS:1 COG:PA3118 KEGG:ns NR:ns ## COG: PA3118 COG0473 # Protein_GI_number: 15598314 # Func_class: C Energy production and conversion; E Amino acid transport and metabolism # Function: Isocitrate/isopropylmalate dehydrogenase # Organism: Pseudomonas aeruginosa # 1 348 1 353 360 426 59.0 1e-119 MDYKIALLKGDGIGPEIVDEAVKVLDKIGQKYGHKFEYTQGYLGGESIDRYGVPLSKETT EICKSSDSVLLGAVGGPEWDNIEPEKRPEKGLLAIRKELGVYTNLRPAILFNVLKDASPL KSEIIGNGLDIMIVRELTGGLYFGQREYSEEKAHDTLSYTRAEIERIAKKAFEIAKLRNK KLTSVDKRNVLDTSKLWRKIVNEISKDYPEVEVSHMYVDNAAIQLIANPGQFDVILTENM FGDILSDEASMLTGSLGMLPSASLGEGKIGLYEPSHGSAPDIAGQNIANPIATVLSAAMM LRYSFGLPEEADTIEKAVDKALQDGYRTADIYTEGTKKVGTKEMGDRIAERI >gi|261746854|gb|ADAD01000161.1| GENE 17 15213 - 16016 284 267 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 20 241 15 225 311 114 35 1e-24 MNRNIKEKEILHYENVTFKRDGKIILDNVNWHIDSGENWVLLGLNGSGKSTILGMIPAYI FPTRGEVRVFGHKFGNYVWENIKNRVGFVSSSLNSFLSTLNWEKAEDVVISGKFSSIGIY REITEKDREKARKIIENFKLSYIKDSYFSTLSQGEQRRILLARAFMNEPDLLILDEPCSG FDVKSREYFLKVLQENAEMKNSTPFIYVTHQIEEIIPSITHVALLSEGKIKAKGKKKDIL NDELLSQIFEINVRVEWENERPWLIVK >gi|261746854|gb|ADAD01000161.1| GENE 18 16067 - 17077 1638 336 aa, chain + ## HITS:1 COG:RSc2075 KEGG:ns NR:ns ## COG: RSc2075 COG0059 # Protein_GI_number: 17546794 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Ketol-acid reductoisomerase # Organism: Ralstonia solanacearum # 10 336 3 329 338 413 59.0 1e-115 MAGNILGTTVYYDADCNLNKLAGKKITILGYGSQGHAHALNLKESGMDVTVGVVKGSKSW ERAEEAGFTVKETSEAVKNADIVMILIPDELQADVYAKDVAPNLKEGVYLGFGHGFNIHF GKIKPREDINVFMVAPKGPGHLVRRTFQEGSGVPCLIAVEKDPSGDTKEVALAWASGIGG GRSGILETTYKQETETDLFGEQAVLCGGVVELMKTGFEVLTEAGYDPVNAYFECLHEMKL IVDLIYEGGLATMRNSISNTAEYGDYITGPKIITPETKKAMQGILADIQSGKFADDFLAD SKAGQPFLKAKREEFAKHEVEKVGAELRNLMSWIKK >gi|261746854|gb|ADAD01000161.1| GENE 19 17161 - 17652 547 163 aa, chain - ## HITS:1 
COG:BS_ykvV KEGG:ns NR:ns ## COG: BS_ykvV COG0526 # Protein_GI_number: 16078448 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus subtilis # 2 162 4 164 165 69 29.0 2e-12 MKRLLVIFSMFMVFFSGYAKEININPEVGAKIPNLKLENLDERKVNSRRIFNNGKETLIV MAAEWCPHCHQELPEIQKFYEENKDKINVIVIFTNRKTSLENTKQYVKDSKFTFPAFFDS DDSIFKSFKIKNVPTNFKVKNSEIEEIVQDIMKYDDLSEMFER >gi|261746854|gb|ADAD01000161.1| GENE 20 17820 - 18770 1330 316 aa, chain + ## HITS:1 COG:CAC0366 KEGG:ns NR:ns ## COG: CAC0366 COG0679 # Protein_GI_number: 15893657 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Clostridium acetobutylicum # 7 316 4 300 301 87 26.0 2e-17 MIFFKSLESIFPIIFMIAIGYILRKKDWFHDSFSENVSKVITNIALPASIYVAVSRNLTL ETLVSMSDRLIYTFASFIIGYIIAVFMVKIFKIRPGRRGIFINAFVNANTIFIGMPLNIE LFGEQSLPYYLMYYITNTVSIWTLGAFFVSNDTLDIDNKKKGVNWKKIFSPPLIGFIVAL LLLAFNIKMPKMINSTMQYIGNIVTPLSLMYIGIVLADARLRNIRFDKDTILALVGRFVL SPVVMVVLLMIGMSLGGNLTPLDTKTYVIQSAAPVFAVLPILANEADGDVKYATNVVTTS TILFALVIPILMMFLK >gi|261746854|gb|ADAD01000161.1| GENE 21 18869 - 20509 2169 546 aa, chain + ## HITS:1 COG:L121483 KEGG:ns NR:ns ## COG: L121483 COG0281 # Protein_GI_number: 15672882 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Lactococcus lactis # 3 541 2 540 540 729 66.0 0 MKKSYEILNDPFLNKGTAFSKEERAKFGLNGLLPPYIQTIDEQAKQIYVQFEKKSSLLEK RHFLMEIFNTNRTLFYYLFSEHVVEFMPIVYDPVIAESIEQYSELFVNPQNAAFLSINEP ENVEETLKNAADGRNIRLIVVTDAEGILGIGDWGTNGVDISIGKLMVYTAAAGINPESVL PVVLDVGTNRETLLKDPFYLGNRHERIRGDRYYDFIDKFVQIAEKLFPDLYLHWEDFGRS NAANILNKYKDKIATFNDDIQGTGIITLAAILGALNITGEKLTDQKYMCFGAGTAGAGIA KRVYEEMIENGLSEEEAYKRFYLVDRQGLLFDDMEDLTPEQRPFARKRTEFPNSKELTNL TAAVKAVKPTILVGTSTVPGTFTKEIVEEMASYIERPMIFPLSNPTKLAEATAQDLLKWT DGKALIATGIPYDPIKYNGATYEIGQANNALIYPGLGLGVLASGAKILTDKMISAAAHSL GGIVDVSKPGAAVLPPVAKLTEFSETVALAVGKCALEEKQNRKPVDDIKQAIDNLKWKPE YADLSF 
>gi|261746854|gb|ADAD01000161.1| GENE 22 20581 - 21090 569 169 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1752 NR:ns ## KEGG: Lebu_1752 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 168 1 167 168 200 65.0 1e-50 MNAMQYKIKLPENFDMDVIRKRVNDNGFKTDGFEELLFKAYLIKDTESEKEYSPLYIWKD SKGMNKFIFDGFYDNILNSFGWQHINIGIPLEISLNDNFSKAKYLLELENHIEPTKNLSR PKFSGFKNENSGKVLIYNPDKWKYTEFYFCEEKPDVNNGNVYDILHISV >gi|261746854|gb|ADAD01000161.1| GENE 23 21044 - 21169 82 41 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLIMVMFMIYFIFPFKIFDVCSKKRSGNTGLFLKFLLIYKA >gi|261746854|gb|ADAD01000161.1| GENE 24 21559 - 23388 3035 609 aa, chain + ## HITS:1 COG:CAC0158 KEGG:ns NR:ns ## COG: CAC0158 COG0449 # Protein_GI_number: 15893453 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Clostridium acetobutylicum # 1 609 1 608 608 679 58.0 0 MCGIVGYIGTQNAQDFVLDGLEKLEYRGYDSAGIAVNTGDEKFSIVKKVGRLKNLADVLE KTPLKGSMAIGHTRWATHGKPSDENSHPHFNKDETLVVVHNGIIENYLELKKELIEKGYK FNSETDTEVVTHLLDELYEGDLLEATKKLIKIIKGAYALGIMSVTEPDRIIAVRKESPLI VGLGENENFIASDIPAILKYTRDVYLIENNEIVEVKKDSVKVMNAEGQEIERDITHIEWD LEAASKGGYEFFMEKEIHEQPEVLIETLNSRVDENNNINFDNAGLTKEYLDGINSIYIVA CGTAYHAGLVGKYIIEKKTRVKVDVDIASEFRYRNPVIDDKTLVIVLSQSGETLDTLEAL KEAKRNGARVVAITNVVGSSIAREADHVIYTWAGPEIAVASTKAYTTQMVILYLLATDMA YKFGEITREEYEYDIKSLYGLKKNIEKMLEYSDRIEATADKIKDRTSMFYLGRGLDYVIA VEGALKSKEISYIHSEAFASGELKHGTIALIEDGTPVVINITQSDLFEKSVSNIKEVTAR GAYVIAVAKEGNTLVEEVADEVFYIPDVKDDYTGFLSIIIHQLLAYYLSKLKGNDVDKPR NLAKSVTVE >gi|261746854|gb|ADAD01000161.1| GENE 25 23427 - 24062 396 211 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2080 NR:ns ## KEGG: Lebu_2080 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 206 1 206 207 236 62.0 4e-61 MKYIKKLPKENKEYSQKLLDDGWKKLKEPRLSVAMLIGLPIGILSALLNIKYFFMFFPNF KEIINYSDRGFAIEIRLDILKSVLYFVITIMFFILHEFIHILFMPDFIKSKKTFWGMKAL 
YFFAYTEESFSKRRAIIIFAAPLVILSFVLPFILNMAGLMSGFIVFLCVFNAAGSYMDIF YIILILFKVPNGATITNNGTETFYKEDNFKK >gi|261746854|gb|ADAD01000161.1| GENE 26 24125 - 24376 472 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039108|ref|ZP_06012441.1| ## NR: gi|262039108|ref|ZP_06012441.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 83 1 83 83 87 100.0 2e-16 MNFDDLKKGTEDVLHKTAEEAKDLADKAVEKTKEFADSEIAQNIKKGAEDALHKTVEGAK DLADKAVDGAKDLADKVADKFKK >gi|261746854|gb|ADAD01000161.1| GENE 27 24595 - 25050 818 151 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1704 NR:ns ## KEGG: Lebu_1704 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 151 1 151 151 175 79.0 5e-43 MKKMIMITGAVLILSSCGVVGAAGSVVGGTVKAAGTVTGAVIKTTGKIIGGIIGGNDGEI EAKGVKYKFSKAKVENDGNTTVVTGTLSHNGSKKENVSIQIPCFDKKGDKVGDAVDNIEY LEKNKKWEFRAVLNTGDVKACKVKDAYVYAE >gi|261746854|gb|ADAD01000161.1| GENE 28 25075 - 25752 1009 225 aa, chain + ## HITS:1 COG:FN1015 KEGG:ns NR:ns ## COG: FN1015 COG0775 # Protein_GI_number: 19704350 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside phosphorylase # Organism: Fusobacterium nucleatum # 3 224 8 233 237 185 44.0 7e-47 MTGIIGAVPEEAQVIKKEMINISEETIGGLTFFKGKFYDEDIVFVQSGVGKVNAAMTATI LITRYNVDKVIFSGVAGSLDRKVKVGDVVIGTEMIQHDADATEFGYEIGQIPQMDEWKFK SAPKLLEKSKNIKNDKFELFFGRILTGDQFISKKDEKKRLGEKFEALCVDMESAAVAQVC YRLNTDFLILRSISDSLTDESGMEYDVFVELAANNSKEILKEILK >gi|261746854|gb|ADAD01000161.1| GENE 29 25798 - 26538 996 246 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2131 NR:ns ## KEGG: Lebu_2131 # Name: not_defined # Def: sporulation domain protein # Organism: L.buccalis # Pathway: not_defined # 1 246 1 238 238 83 37.0 7e-15 MRFRLNFFKILRLIVIVAFAVYAFQMAGKFMKEKNKKTDEVAKTFDIRKNDFHNSENKGK KTEEEQKAAENTQTNNIQAINTDTAQNNQVQNTENKVEEQKTDAEKEAAAKAEADKKAAE QKNKELEAQKIQKELSEQKKKEVEAAKKAQEKAEAEKKAKEAAAQKTAGRRYLQVATLQT EAAAKKVAAQLGGNFTVQAIKGSSGRTMYRVVSASTDNPQTLSAMESQVKSKLGGSYKYI VRTAGK 
>gi|261746854|gb|ADAD01000161.1| GENE 30 26541 - 27557 184 338 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149007035|ref|ZP_01830704.1| 50S ribosomal protein L31 type B [Streptococcus pneumoniae SP18-BS74] # 13 315 5 289 311 75 25 6e-13 MLENNNRGNVTLEEIKNEILKAENIILTTHINPDGDALGSVSAFLLMINEYNKKFAKKSL DMKKVRIIVDDELPKYMEKFEEGPLMERYSSSLENERADLFISLDCANAERYGNVIEIKK NCKKSINIDHHISNTEHAQMNYVGDVSSTCELIYQFLELFNIELTKEIADFLYLGIINDT GNFRHDNVTQNTFAVCSELMKAGADNHKVANIIFGMSRKKVNLFGDVYKNNKMNDKYKFI YYYLTKDKIKEMEVSKGDSDGIAELLLKIEDTEISLFTRDDENGFLKGSLRCNDKYNVNE IASIFNGGGHIKAAGFKTDLSFPEVIEKIVEKLEEYEK >gi|261746854|gb|ADAD01000161.1| GENE 31 27637 - 27849 268 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039089|ref|ZP_06012422.1| ## NR: gi|262039089|ref|ZP_06012422.1| aspartyl protease family protein [Leptotrichia goodfellowii F0264] aspartyl protease family protein [Leptotrichia goodfellowii F0264] # 1 70 1 70 70 108 100.0 1e-22 MFNMFSYLQLKGFDNSDFAKYFEKIDEMNENINKVLIENPRAVLKGIKITFLDKNKEQIH FDIDIEVVNN >gi|261746854|gb|ADAD01000161.1| GENE 32 27849 - 28163 432 104 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2128 NR:ns ## KEGG: Lebu_2128 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 9 104 4 99 100 106 56.0 3e-22 MDMKNNLLKIKNYVFLFTFAFLISCSSVGKRTVPESEVLSKDAVVQIGIQGVEKKFGETV NSENVGVYKRGYNNWKIILYGKNNFYLVFVTEDGKIVSVEQGSY >gi|261746854|gb|ADAD01000161.1| GENE 33 28247 - 28828 400 193 aa, chain - ## HITS:1 COG:no KEGG:Sterm_1582 NR:ns ## KEGG: Sterm_1582 # Name: not_defined # Def: TetR family transcriptional regulator # Organism: S.termitidis # Pathway: not_defined # 1 191 1 190 194 209 59.0 7e-53 MNKITVKKNNVIEKSAKLFYYRGYKNTGLSDILTECKIPKGSFYYYFKNKEDLLIYVIKY HTDKLIKFFSKVVDDLSIFKLKTFFYQYFTNIEHNKFHGGSPLGNLAVELGDINENVRKE LLSSYYQIELRFSFFLSSLKRDYPQKYSHIEPEIYARILISLLEGTMLKIKIEKNSNAID DFLSFFDKIFKPY >gi|261746854|gb|ADAD01000161.1| GENE 34 28954 - 30051 1654 365 aa, chain + ## HITS:1 COG:SA1449 KEGG:ns NR:ns ## COG: SA1449 COG0482 # Protein_GI_number: 15927201 # 
Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Staphylococcus aureus N315 # 1 365 1 363 372 462 63.0 1e-130 MRNKKVVLGMSGGVDSSVAAILLKEQGYDVIGVFMKNWEEKDDNGVCMSEEDYKDVVAVA EQLEIPYYSVNFVKEYWDKVFTYFLDEYKKGRTPNPDVMCNKEIKFKAFLDYAMKIGADH VATGHYARIIHEEKDGKIKSVMLRGIDDNKDQTYFLCQLNQKQLEKVLFPIGEYTKPQIR EIAEKYNLKTAKKKDSTGICFIGERDFNEFLSKYLPAKTGNIVDTKGNTLGKHNGLMYYT IGQRKGIGIGNTKEGTGEPWFVVDKNLDTNELIVTQGDNSVLYSKGLIATDFNFINKDEL QFPLECTVKFRYRQKDAGAVINKLNDNEYEIIFDEPQKAITLGQIAVVYKGEECLGGGVI DKVIK >gi|261746854|gb|ADAD01000161.1| GENE 35 30068 - 30412 669 114 aa, chain + ## HITS:1 COG:no KEGG:SSUBM407_1759 NR:ns ## KEGG: SSUBM407_1759 # Name: not_defined # Def: hypothetical protein # Organism: S.suis_BM407 # Pathway: not_defined # 1 114 3 116 131 140 68.0 2e-32 MKVELTRFKVLEGKSETVDEWMKFLNDNMKEVLLTLDGEKMYVETILREVLNGEEYLYWY SIQGEGGIEVEDSESPIDKKHLEYWKECIDKTFRPVDLGVEVVMIPERIQEMMK >gi|261746854|gb|ADAD01000161.1| GENE 36 30493 - 31422 890 309 aa, chain + ## HITS:1 COG:lin1052 KEGG:ns NR:ns ## COG: lin1052 COG0598 # Protein_GI_number: 16800121 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Listeria innocua # 3 308 10 314 314 150 31.0 3e-36 MINRNIIQLENGNMEWVNTVNITAENKKVLNEKDKLSTEFLEYATDADESPRSEYDDLNR IKLLCFDVPYYDRIMESLATVPLVFIIRENTLYTFIEKNEDYEYLNSLLENTVTEKVYDS MYHLLFSVVYKFCLIYHDKLKVINKERADIKKSFRKSVKNTDIYKLLNIEQGLTYLSTSL KANRLALNTLKRHWNVNIKKLSEVEEEKLEDVMIEIDQAMEMTEIITTIAEKEKTTYSTV IDNNLNTTMKFLTVFTILLEIPSMIFGFFGINTNVPFQNMKDGWIYVVLITGIICTMFTL GLWKKRFLK >gi|261746854|gb|ADAD01000161.1| GENE 37 31574 - 31990 532 138 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0717 NR:ns ## KEGG: Lebu_0717 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 138 12 149 154 105 37.0 8e-22 MKKIILICLFAITIIGFSKPERDNRGILTMNENEWYQMFGDNTKTNGKCSFIGASIMQLA 
YINDGKKLETTQENALSSLEALNRQIYSEGLRHPSNDNSLLFEYYYVKNCRKLTNKDFDL VGSPSFKTVFEEIYNTYK >gi|261746854|gb|ADAD01000161.1| GENE 38 32023 - 32598 587 191 aa, chain - ## HITS:1 COG:SMc00379 KEGG:ns NR:ns ## COG: SMc00379 COG2249 # Protein_GI_number: 15964052 # Func_class: R General function prediction only # Function: Putative NADPH-quinone reductase (modulator of drug activity B) # Organism: Sinorhizobium meliloti # 4 145 5 148 196 102 33.0 3e-22 MKTVIFAHPWHGSFNKAILDSVTKKFEEMKESYTIIDLHKDNFDPVMREEDLKLYSEGKF NDPLVGKYQEILKKSDGLVFIFPIWWSTMPAILKGFLDKVFLLNFAYNHGNTGITGHLKN IKSVKIITTAGSPKFLIKLFLGNPIGWTFGTGTLRATGMKGVKWLHCYISKKDSKEKLEK FLQKTETFVGK >gi|261746854|gb|ADAD01000161.1| GENE 39 32632 - 34365 230 577 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 343 560 34 257 329 93 32 3e-18 MLKDIKTLSGKEFKLLKKPVFFLVIDSLFYMMNYMMFYFTIIDLITNTFTLKKIIIYSVI MLIANILRYSFNRIGYTGVQTQGARIIQDLRLRMGDHLRNLNLGYFNKHNIGNIINIMTN DLQDFEHVLTHSTSEIIKLSILSAYLLLITFVISPLLGILQVIIAGLGIIFIIAGMKKSS KIALKKKHTMDNVVSRMVEYISGMELFKSYNLVGEKFKRLKDSFHNLKKESINTEIALAP YVLIFQLTVDISFALLLLLSTKLFITHSINKIEFFSYIIISLSLSNVLKAFSSQYVLFQY MKLATDKLINVYKEKEISYEFESMPFENYNIKFDNVSFYYEEDKPVLKNISFEAKQGTAT ALVGSSGSGKTTVTNLIARFWDSQSGNVTIGGIDIRKIYPEELLTNISMIFQDVYLVNDT VENNIKLGNPDASREEVIEAAKNAHCHDFITELENGYDTIIGEGGSTLSGGEKQRISIAR ALLKNTPIILLDEATASLDADNEHEIRNSLDKLTKNKTVITIAHKLNTIKNYDQIIVMSD GIIEEIGNHEKLMKNKGRYYIMYTEMRKAQSEGCSLI >gi|261746854|gb|ADAD01000161.1| GENE 40 34359 - 36089 262 576 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 334 559 2 232 245 105 31 5e-22 MNNLKFLLKLSGQHKIKLIFSALFSIISTTLSAVPYILIYNIILELFKEAVDYNKIQNLV FITIIFIIIRVVTFVLSGIFSHVSAFNILYKIRIDLIKHMSKLNMGFFKKNTSGKLKKII NEDVEKLENFIAHQIPDLSSAFATPLIFLGIMIYYNWKLSLVLFIPVILGIIAQTGMMKS FMSRVDHFYKLVAKLNSTIMEYINAMNVMKAFNLSAKSFKDYRDNTQEYADYWIELTELS VPYYSAFLCLVDGGLFFIIPAGGIMLLNHQISIPVYILFLLMSTIFLNSLKSLFELSEKF 
SFLLKGMEKIIEIFDEKEQISGNIEFPQNFSQSLKYENITFAYNKTKVINNFSLDIKVGS TVALVGPSGSGKTTIGLLAGRFWDISEGKITVDGIDIKDISYDSLMDKISFVFQDTFMLH DTIYENIKMGKNYNSEQIEDAAKKAQIHDFIMSLPDKYETKIGEGGVKLSGGEQQRISIA RAILKDTPIVILDEVTSYSDIENETKIQAALKTLLKGKTALIIAHRLYTIKNADNIVFMN KGKIVEQGTHEELLKNKADYWHLWSLYNEETEGQKC >gi|261746854|gb|ADAD01000161.1| GENE 41 36302 - 37192 1155 296 aa, chain + ## HITS:1 COG:CAC3494 KEGG:ns NR:ns ## COG: CAC3494 COG2378 # Protein_GI_number: 15896731 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Clostridium acetobutylicum # 1 296 1 299 300 233 43.0 4e-61 MQINRLFETVHMLLNKKSMTARELAEHFEVSIRTVYRDIETLTFAGIPVYSTRGKNGGIK LLDEYVLDKSIISKDEQNDILYALQSLKAANYPEVEETLEKLSVIFNKTSGNWIEVDFSE YGNEQKELFENIKNAIINKKVIRFEYYNSQGIKSERSAEPLKLKFKVRAWYLSAFCRKSN EMRFFKIRRIKRLSVTEEIFERTSENIEIPDNETKAPKVKITMTIDKSQAYRVYDEFSEK DIKVSENGDFEIISELIENEYLYGYLLSFGEYIKIIGPTRLKKILEGKVEKMKKNY >gi|261746854|gb|ADAD01000161.1| GENE 42 37871 - 38470 696 199 aa, chain + ## HITS:1 COG:AGc1616 KEGG:ns NR:ns ## COG: AGc1616 COG0693 # Protein_GI_number: 15888226 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 169 12 169 188 92 37.0 4e-19 MDKFADWEVVFLSTTLNNKRVARDYIVKYASTDKEIKTSMGNLKVLPDMTIDEISDDIKG IVLIGADGSWRNLNNEMNDKIMNLVQRFKKNGKVVGVICDAVYYLAVNGLLNDCKHTANS LEEIENNKNYKNRENYIQTDEITAVTDGKTVTASGTAPFGFAVNVLKVLEDISEKNINFL KDMYTEGFTKAFEKYNEIK >gi|261746854|gb|ADAD01000161.1| GENE 43 38484 - 38948 627 154 aa, chain + ## HITS:1 COG:CAC3493 KEGG:ns NR:ns ## COG: CAC3493 COG3708 # Protein_GI_number: 15896730 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 154 1 146 146 74 34.0 9e-14 MNYEIVELEGKTLVGLKMRVKDDKTMYEKIGNLWKDFYSQDNVQDKINDNAVAVYYNYNN KNGFEYDFFIGNEVESGNDKIPENMTKLEISKGKYAKFKIFGNPKREVPKFWQEFWKEFG KETSEIRAYTYDFEEYVTGNDYENTEINIYISIK >gi|261746854|gb|ADAD01000161.1| GENE 44 39014 - 39550 
502 178 aa, chain + ## HITS:1 COG:FN1248 KEGG:ns NR:ns ## COG: FN1248 COG4283 # Protein_GI_number: 19704583 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 165 25 199 204 175 56.0 3e-44 MARPTTKKDLCEAAETQFQKLENLIDGMSEEEQKGTFLFEDRDKNLRDVLIHLYEWHNLV INFVESNMRGKEKGFFPEPYNWKTYPKLNVEFWEKHQSTPLEKAKEMLKKTHKKIMKLIE TRTNEELFSKGIYKWTKGSTLGAYFVSSTSSHYNWAMKKLKKHIKMFRIEKNERIEVL >gi|261746854|gb|ADAD01000161.1| GENE 45 39627 - 40394 1001 255 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039105|ref|ZP_06012438.1| ## NR: gi|262039105|ref|ZP_06012438.1| putative liporotein [Leptotrichia goodfellowii F0264] putative liporotein [Leptotrichia goodfellowii F0264] # 1 255 1 255 255 375 100.0 1e-102 MNFRKILKLSLVTLIIFIVQSCITVETDMKVNSDFSGSTNSKIAIVKGVITEEQLKIEIS KLGIEKYTLKKEKSSDEQTDRYSVDINWKTEEDLKKILKFIGSGGMDLSTAESSAGKQGT DTNKKAETEPAKIFTKEKGEVIVDMGTSKIARLTVKVDGKIIPEENQSGTVSESKKEITF YQGDQVHFKYKTGGGLFENAVKIVLGLVILAILGLIGKNVLSKSKKKNENSEETDKTVAV EEVEIVEENSNNEEK >gi|261746854|gb|ADAD01000161.1| GENE 46 40576 - 41496 1137 306 aa, chain + ## HITS:1 COG:L144334 KEGG:ns NR:ns ## COG: L144334 COG2103 # Protein_GI_number: 15673112 # Func_class: R General function prediction only # Function: Predicted sugar phosphate isomerase # Organism: Lactococcus lactis # 1 292 1 292 297 291 54.0 9e-79 MIDLSKLSTEQNNPNSKNIELQDSYEIVKRINEEDKKVAFCVEKELPNISKLIDAIMSRY KEETRIVYVGAGTSGRLGILDASECPPTYGVSFEKVQGIIAGGNEAIFKAKENAEDSKEE GKQDLIDINLTENDVVIGLAASGRTPYVLGAIEYANSIGAVTGSITCSENSELSKVSQYP IEVTVGPEIVTGSTRMKAGTAQKMILNMISTTVMIKIGKVFSGYMVDVKTSNQKLVERAK RIIMNTTGSDYETTSRVLAEASNEVKTAIAMILLSLDKETAKEKLKEYNNNVARLIHEYS HKIERK >gi|261746854|gb|ADAD01000161.1| GENE 47 41465 - 42301 1011 278 aa, chain + ## HITS:1 COG:CAC0191 KEGG:ns NR:ns ## COG: CAC0191 COG1737 # Protein_GI_number: 15893484 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 4 274 5 274 283 134 31.0 3e-31 
MSILIKLRENKDFTANEEDIAKYVIKYFRSIRELDTNIIAAKTYTSNASVTRMCKKIGFN GFQDFKMKMIEEVVSLENQEINFDNSDIDKKDSTAIIIEKLNKLSISSLQETKLLQEVSM IDSVVDLITEKEVIDFYGIGASHIVCLDAQYKFMRAGKIVNTFEGSDLQHVQAVNSNLKH LAILISYSGLTKEILDIAEILQEKDIETVSITGYGNNKLVKKCNYNLFVTSREALIRSAA IYSRMSMLNLIDVLYFVYYNKNYDYVSEKIIKTKINKI >gi|261746854|gb|ADAD01000161.1| GENE 48 42320 - 43732 2371 470 aa, chain + ## HITS:1 COG:BH3574_2 KEGG:ns NR:ns ## COG: BH3574_2 COG1263 # Protein_GI_number: 15616136 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Bacillus halodurans # 97 467 1 354 355 251 42.0 2e-66 MDAKKTAKEIYDILGGRENIVSNAVCMTRLRVKVRKDPNLSELKKVEGVLNVVEADTLQI VLGPGKVNSVGQEFSKLTGISLGFSDNNVEDVAKENKKANKQKYDGPVQRFLQKIANIFV PLLPGIIAAGLIMGITNVINVTTKNAYNTMWWFAAIKSLGFVMFGYLAIYVGMNAAKEFG GTAVLGGIIGALFIANDKLPLLAKTGDAGIILPITNKPFTPGIGGLLAALFMGMIVAYLE KQIRKIIPSMLDTFLTPLLTLIISVFIALLIIQPLGTGITKGIYVILDFVYNKLGLLGGY ILAAGFLPLVSVGLHQALTPIHVLLNNPDGPTKGINYLLPILMMAGGGQVGAGLAIYVKT KNEKLRTMVRNALPVGILGIGEPLMYAVTLPLGKPFITACLGSGLGGFLAALFHLGTISQ GVSGLFGLLIVVPGTWIFFIIAMLGAYAGGFVLTYFFGVDDERIEEIYGE >gi|261746854|gb|ADAD01000161.1| GENE 49 43737 - 44525 993 262 aa, chain + ## HITS:1 COG:PM1451 KEGG:ns NR:ns ## COG: PM1451 COG0627 # Protein_GI_number: 15603316 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Pasteurella multocida # 1 259 1 263 269 354 66.0 8e-98 MKKILLVLFGICSFITFPFTEVNTKVYSKAMKKDVPITVVLPNGYSNSKKYNTIYVLHGW SGSNKNYPEKTSIGKLSDEFGIVYISPDGNYDSWYIDSPLKKDSKYYTFVSKELVEYVDN NYSVYKEKEHRAITGLSMGGFGAFYIGIKNQNVFGNIGSMSGGMDPEQYKGNWGIDKVLN SNWSEYNIKDIAHQLIGTKSNIIIDCGVDDFFIQPNRELHKKLLDLNIKHDYIERDGAHS WAYWDNSIKYQTFFFNQKFNGK >gi|261746854|gb|ADAD01000161.1| GENE 50 44537 - 46333 2474 598 aa, chain + ## HITS:1 COG:yfeW KEGG:ns NR:ns ## COG: yfeW COG1680 # Protein_GI_number: 16130355 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # 
Organism: Escherichia coli K12 # 114 516 51 450 463 347 45.0 3e-95 MKKTLLILLNMLIFLNVFSYQIEKSFPVDKDSFDDSILTNDMFEFKAFKNQGYIILDYEG LNRADIYINGVKLKIDDLKGKGTTKIDISKITKNDKNIFQISNLEGKVSVKIPYPTITEN KKIKSYNSETIKFLDEFVQAEIKAGLPSAQIAVIKDGNLEILSSYGYVNNYNQDGSEIKD KIKVTDETVYDLASNTKMYATNYAIMKLVSEKKLNLDDYVYKFYPEFTGNNKEKVQISDL LKHQAGFPPDPQYFNDKYDKDDGISNGKNDLYAIGKENVRKAIMKTPLIYEPKTSTKYSD VDYMLLGLIVEKVTSKDLDTYLKENFYNKLNLKKTTFNPLKYGIAKNVVAATELNGNTRD NTIDFVNARKYTIQGEVHDEKAYYSMGGVSGHAGLFSNAYEVAKLAQIIINEGGYDNIKF FDKTTLDNFVKPKDINASYGLGWRRQGDFIYKWAFSGLASRETVGHTGWTGTLTVIEPSQ NLVIVLLTNAKNSRVIDPAKKPNDFYGNHYYTTNYGVISSIIIDAYSNMNNKKDTNLRMN NILEDLINGKYNLIKSDSDYKNSADIKDTVELINLLEKRIGKLNSEYQKIREELQKMQ >gi|261746854|gb|ADAD01000161.1| GENE 51 46349 - 47425 1072 358 aa, chain + ## HITS:1 COG:BH3573 KEGG:ns NR:ns ## COG: BH3573 COG3589 # Protein_GI_number: 15616135 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 6 345 2 340 351 149 31.0 9e-36 MKKNFETGFSVYLSTDIEKNRNIINKFENSGAKYVFTSLNISEEKVNKISELEKIIEFCS EKNLNLIVDINSTTKNLINVNAENVYLRIDDGLTLDEILKLSEKNKIVLNASMITEENLD YFQKKGIDFSKVLSLHNFYPKRFTGISREYLQKQNLKYKKYGIKTMAFVKGDELRGPVYE GLPTVEKHRNKRFLTSCLDLLLLDTDIVLVGDIDISDKNMKDFKYLNKGIVPLRNFNNIL TDRVFKDRIDFSEYIIRAVVDVNIGESRKSFCEYITKELEKNRINLKEISCNGPIKKGDI LVSNEKYLRYEGELEIALQDLGIDEKRCVVSKIIDEDMELLDYVRVVDRFLFEKSDKK >gi|261746854|gb|ADAD01000161.1| GENE 52 47828 - 48460 860 210 aa, chain - ## HITS:1 COG:STM0609 KEGG:ns NR:ns ## COG: STM0609 COG3634 # Protein_GI_number: 16763986 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Alkyl hydroperoxide reductase, large subunit # Organism: Salmonella typhimurium LT2 # 4 207 2 195 521 160 41.0 1e-39 MAFLDTNIIEQLKGYFNKISEPIEIVSFLDDSQKSVELQAFLTEIDEISDKVNYTKKSFG EDASLEEKLEITRPVSFSLLKNGEKTGINFCGIPGGHEFNSFILAVLGLAGLGRKLEGDQ LSKVAAVDKHLNIETFISLSCTKCPEVVQALNLIASNNPNITSSLVDGGVYPEEVSEKNI QGVPVVYINGNQAAIGEQTLEQLIGLVVSA >gi|261746854|gb|ADAD01000161.1| GENE 53 48486 - 49052 1009 188 aa, 
chain - ## HITS:1 COG:SA0366 KEGG:ns NR:ns ## COG: SA0366 COG0450 # Protein_GI_number: 15926082 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Staphylococcus aureus N315 # 1 188 1 189 189 261 63.0 5e-70 MSLIGKKLENFTAQAYQNEEFREVSFEKDMLGKWNVFMFYPADFTFVCPTELEDLEDHHE ELKKLGFNVYSVSTDTHFTHKAWHDHSEAIKKVTYTMIGDPTMAVSRIFEVLNEETGLAF RGTFIVNPEGKIVAYEVNDEGIGRDASELVRRAKAAKFVAENPGLVCPAKWKEGEATLKP GLDLVGKI >gi|261746854|gb|ADAD01000161.1| GENE 54 49242 - 50672 1824 476 aa, chain - ## HITS:1 COG:slr0788 KEGG:ns NR:ns ## COG: slr0788 COG1488 # Protein_GI_number: 16331895 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid phosphoribosyltransferase # Organism: Synechocystis # 3 459 4 448 462 468 52.0 1e-131 MKNLILATDSYKMTHPKQYPAGITYMHDYIESRGGLYGYTKFFGLQYYLKEYLTKPITKE MIEEAEEICTLHGLPFFKEGWNYIVEKLDGKLPLRIRAVPEGAVVKNHNVLVTVESTDPN VPWIVGWVETLLLKIWYPITVATFSFKVKQIIEHFLKETSDNVKEEIPFKFHDFGYRGVS SEESAGIGGLAHLTNFSGTDTLYSLLFARKYYHDKMAGFSIPASEHSTMTSWTKKHEKGA YENMLDSFPTGTIAIVLDSYNYFNAVENIIGKELHDKISERNGTLVIRPDSGDAITNILF ALESLEHNFGCTVNSKGYKVINKVRVIQGDGINENTVWDIYKALKDNGYSAENVSLGCGG ALLQGNSKSSINRDTHKFAMKCSCIKIGDKIIDVFKNPVTDRGKVSKKGRLDLIKTDNGD YRTVNISHLKENEYHKNSVLELVYENGEIKKDYTLSEVKNNENLLFSPELIRDCIK >gi|261746854|gb|ADAD01000161.1| GENE 55 50692 - 51030 469 112 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039087|ref|ZP_06012420.1| ## NR: gi|262039087|ref|ZP_06012420.1| hypothetical protein HMPREF0554_1439 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1439 [Leptotrichia goodfellowii F0264] # 1 112 1 112 112 194 100.0 1e-48 MKNIKIVSKIVNEITSFYLEKNSENFQIKIEKNPLKDGYIIFTQGFTVLSDKEIEETEKL LSIHHDAEYDEYWELMGEGDASNKLLLVARICDSIKMDYNDGILKLALKKNI >gi|261746854|gb|ADAD01000161.1| GENE 56 51027 - 52265 958 412 aa, chain - ## HITS:1 COG:no KEGG:Smon_0471 NR:ns ## KEGG: Smon_0471 # Name: not_defined # Def: hypothetical protein # Organism: S.moniliformis # Pathway: not_defined # 3 409 2 397 409 283 44.0 
1e-74 MTKEKKQEYLIISLLPFILLIYSIILSTPYKLWTGMNNILLSDGILITDYFIVGGKEAAI FNASIITLINIYLLYKMDMKINGLIISGIFLMLGFSFMGKNILNIIPFYIGAWLYAKVSG KKFKTVIIVCMFSTSLSPFVSVIAEFFGYSPLGIGIALISGTVLGFIMPLISAQVLPIHG GYSLYNTGFAAGLVAIVSYSILKAAKVTVSFKKDFIQTIDYRLLALFTAVFSFYIVYGFI KNGYSFNGLKNLLSHSGKLVSDFTVTEGFPVVVINMGLLGFFCLGLIFLLFPMFNGPILS GMITVVAFAGFGKHIKNIFPVIIGVIIAYYLFGQNTSLTAFAVILFFSTTLAPISGKFGI FAGILAGFLNYCLVLNIGSAHGGLNLYNTGLAAGIVASVGVPVLQTFQGGNK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:38:52 2011 Seq name: gi|261746852|gb|ADAD01000162.1| Leptotrichia goodfellowii F0264 contig00136, whole genome shotgun sequence Length of sequence - 240 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 240 183 ## gi|262039123|ref|ZP_06012455.1| hypothetical protein HMPREF0554_2124 Predicted protein(s) >gi|261746852|gb|ADAD01000162.1| GENE 1 3 - 240 183 79 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039123|ref|ZP_06012455.1| ## NR: gi|262039123|ref|ZP_06012455.1| hypothetical protein HMPREF0554_2124 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2124 [Leptotrichia goodfellowii F0264] # 1 79 1 79 80 125 100.0 1e-27 ERRPILGTGKEDIARLKATIKDSKLASEIPVLKAILREEPGAVLPLPKLVSEEEKAKYES NKSEWLQKNGYSMPITSSD Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:38:59 2011 Seq name: gi|261746849|gb|ADAD01000163.1| Leptotrichia goodfellowii F0264 contig00030, whole genome shotgun sequence Length of sequence - 980 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 366 383 ## FN2115 hypothetical protein + Prom 401 - 460 4.7 2 2 Tu 1 . 
+ CDS 524 - 919 460 ## gi|262039528|ref|ZP_06012828.1| hypothetical protein HMPREF0554_2462 Predicted protein(s) >gi|261746849|gb|ADAD01000163.1| GENE 1 1 - 366 383 121 aa, chain + ## HITS:1 COG:no KEGG:FN2115 NR:ns ## KEGG: FN2115 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 2 119 30 150 151 77 41.0 2e-13 TKEMRKKNVSGAILNNHYKEKVEESIKDIDRRNIDKRVKFENITLLIPSNTEINFKNGTI IDLRTGYGLPIYFVKDDHCNKIEFTKKVNGEYYRISYYGANVNNLAQKIIRANGFTKTCS K >gi|261746849|gb|ADAD01000163.1| GENE 2 524 - 919 460 131 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039528|ref|ZP_06012828.1| ## NR: gi|262039528|ref|ZP_06012828.1| hypothetical protein HMPREF0554_2462 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2462 [Leptotrichia goodfellowii F0264] # 8 131 679 800 800 207 91.0 2e-52 MGSINSNVYESAEISEQIYNKFEEYKSANDTRKEEILGEVIELYKKDQALKLDLAMNGPA ILDNSPFLLKARDEYNRAELIREYNAGRYQDKGLPGIMKIGMPEEERGERKELKVSDITS YLRKLRQGVEK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:39:09 2011 Seq name: gi|261746846|gb|ADAD01000164.1| Leptotrichia goodfellowii F0264 contig00141, whole genome shotgun sequence Length of sequence - 527 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 212 341 ## FN2049 hypothetical protein - Prom 252 - 311 7.4 2 2 Tu 1 . 
- CDS 395 - 526 148 ##

Predicted protein(s)
>gi|261746846|gb|ADAD01000164.1| GENE 1 2 - 212 341 70 aa, chain - ## HITS:1 COG:no KEGG:FN2049 NR:ns ## KEGG: FN2049 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 70 1 70 82 69 59.0 4e-11
MKKIAIALGLGALAVSCTNAKLVNYNTDRLDNIEAYLKENKFVKPSDNLEKLKNEGQIEY
TTQYKSLERE
>gi|261746846|gb|ADAD01000164.1| GENE 2 395 - 526 148 43 aa, chain - ## HITS:0 COG:no KEGG:no NR:no
NVEIQRIKKRVDEINNNIETFHKTNEALEKMEERLEGIQSKMK

Prediction of potential genes in microbial genomes
Time: Thu Jul 14 03:39:16 2011
Seq name: gi|261746844|gb|ADAD01000165.1| Leptotrichia goodfellowii F0264 contig00144, whole genome shotgun sequence
Length of sequence - 203 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0

N  Tu/Op   Conserved    S  Start  End  Score
           pairs(N/Pv)
1  1 Tu 1  .            - CDS 2 - 203 265 ## gi|262039131|ref|ZP_06012460.1| 30S ribosomal protein S13

Predicted protein(s)
>gi|261746844|gb|ADAD01000165.1| GENE 1 2 - 203 265 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039131|ref|ZP_06012460.1| ## NR: gi|262039131|ref|ZP_06012460.1| 30S ribosomal protein S13 [Leptotrichia goodfellowii F0264] 30S ribosomal protein S13 [Leptotrichia goodfellowii F0264] # 2 67 1 66 67 92 98.0 1e-17
GLVEGDTVTTLNGKKITIKEKMQRKNTLELIGQEGIILKDEIVSGILRLDTKSDLKNAKD
IRGLDTL

Prediction of potential genes in microbial genomes
Time: Thu Jul 14 03:39:22 2011
Seq name: gi|261746841|gb|ADAD01000166.1| Leptotrichia goodfellowii F0264 contig00100, whole genome shotgun sequence
Length of sequence - 1182 bp
Number of predicted genes - 3, with homology - 2
Number of transcription units - 1, operones - 1 average op.length - 3.0

N  Tu/Op   Conserved    S  Start  End  Score
           pairs(N/Pv)
1  1 Op 1  1/0.000      - CDS 1 - 352 427 ## COG1774 Uncharacterized homolog of PSP1
2  1 Op 2  .            - CDS 349 - 1023 689 ## COG2003 DNA repair proteins
3  1 Op 3  .
- CDS 1064 - 1180 122 ## - TRNA 1080 - 1156 71.3 # Arg TCG 0 0 Predicted protein(s) >gi|261746841|gb|ADAD01000166.1| GENE 1 1 - 352 427 117 aa, chain - ## HITS:1 COG:FN0908 KEGG:ns NR:ns ## COG: FN0908 COG1774 # Protein_GI_number: 19704243 # Func_class: S Function unknown # Function: Uncharacterized homolog of PSP1 # Organism: Fusobacterium nucleatum # 3 117 26 137 312 57 37.0 6e-09 MKILNIKFRKTKKVYPFLIGKYENYQKGDHVIVDTIRGEQIGIVIGMTDKFGEESEEKDD VKIREVKRKLTDKEVEKLKELDEKANDAYFKCKKIVKSILPEMNLVIGEYTFDENKL >gi|261746841|gb|ADAD01000166.1| GENE 2 349 - 1023 689 224 aa, chain - ## HITS:1 COG:FN0909 KEGG:ns NR:ns ## COG: FN0909 COG2003 # Protein_GI_number: 19704244 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Fusobacterium nucleatum # 3 224 5 232 232 170 46.0 2e-42 MGENEGHRERLRKRFLMSGISGFHDYEVLELLLSYVIVRRDCKKIAKDLLNKYGDLYSLL KQSKDELETNSFITERIAIFLKVLFSMLEDQLYRKIHNERIVISSNVQLLDYLEFSLLKR DIEIFKVLFLNTQNELIREEELFQGTLDRSTIYVRELMKKVLKYNAKSVILVHNHPSGSL KPSQSDIILTKKIKEIFENVEIRLLDHIIISEKGYFSFLEGGIL >gi|261746841|gb|ADAD01000166.1| GENE 3 1064 - 1180 122 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no NVIKLLRGAFVAQLDRASDFGSEGLGFDSSRVRHVFIY Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:39:27 2011 Seq name: gi|261746839|gb|ADAD01000167.1| Leptotrichia goodfellowii F0264 contig00222, whole genome shotgun sequence Length of sequence - 213 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 212 356 ## gi|262037821|ref|ZP_06011258.1| hypothetical protein HMPREF0554_1995

Predicted protein(s)
>gi|261746839|gb|ADAD01000167.1| GENE 1 2 - 212 356 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262037821|ref|ZP_06011258.1| ## NR: gi|262037821|ref|ZP_06011258.1| hypothetical protein HMPREF0554_1995 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_1995 [Leptotrichia goodfellowii F0264] # 1 70 1 70 71 64 95.0 3e-09
KEGYILADGITKEEAKKWEVKEKAPEKKEDKKEESKKDEVKEQTTGVELKADRLDNTKGI
IASLGQTTIN

Prediction of potential genes in microbial genomes
Time: Thu Jul 14 03:39:34 2011
Seq name: gi|261746834|gb|ADAD01000168.1| Leptotrichia goodfellowii F0264 contig00128, whole genome shotgun sequence
Length of sequence - 3706 bp
Number of predicted genes - 4, with homology - 4
Number of transcription units - 2, operones - 1 average op.length - 3.0

N  Tu/Op   Conserved    S  Start  End  Score
           pairs(N/Pv)
1  1 Tu 1  .            - CDS 1 - 445 482 ## COG0646 Methionine synthase I (cobalamin-dependent), methyltransferase domain
                        - Prom 479 - 538 11.0
                        + Prom 636 - 695 12.0
2  2 Op 1  .            + CDS 763 - 1002 414 ## COG0526 Thiol-disulfide isomerase and thioredoxins
                        + Term 1016 - 1071 -0.8
                        + Prom 1067 - 1126 4.0
3  2 Op 2  18/0.000     + CDS 1153 - 1527 591 ## COG1780 Protein involved in ribonucleotide reduction
4  2 Op 3  .
+ CDS 1520 - 3658 2464 ## COG0209 Ribonucleotide reductase, alpha subunit Predicted protein(s) >gi|261746834|gb|ADAD01000168.1| GENE 1 1 - 445 482 148 aa, chain - ## HITS:1 COG:BH1630_1 KEGG:ns NR:ns ## COG: BH1630_1 COG0646 # Protein_GI_number: 15614193 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I (cobalamin-dependent), methyltransferase domain # Organism: Bacillus halodurans # 14 147 6 140 305 122 48.0 3e-28 MPDNIKYKNSFYQLKKDLESKIVLLDGAMGTMIMNFDNFREEKYKGCIDYLVLRKPETVK NIHKMYLEAGSDIIETNTFGALDITLKDYGLENKAFEINKAAALLARKAVKEYRKENPFE SRNLYVAGALGPSNKSLSSGNITFEEMA >gi|261746834|gb|ADAD01000168.1| GENE 2 763 - 1002 414 79 aa, chain + ## HITS:1 COG:BS_yosR KEGG:ns NR:ns ## COG: BS_yosR COG0526 # Protein_GI_number: 16079061 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus subtilis # 3 76 2 76 80 64 42.0 5e-11 MRKLIKFEKEDCNPCAMVSDLLDKSGVEYEKVNPFNNPELAMKYKVRTVPTTILVESDQE VKRTIGFNPEELREIITMI >gi|261746834|gb|ADAD01000168.1| GENE 3 1153 - 1527 591 124 aa, chain + ## HITS:1 COG:BS_yosM KEGG:ns NR:ns ## COG: BS_yosM COG1780 # Protein_GI_number: 16079066 # Func_class: F Nucleotide transport and metabolism # Function: Protein involved in ribonucleotide reduction # Organism: Bacillus subtilis # 5 124 3 119 129 96 46.0 9e-21 MFIYYDSKTGNVQRFVDKVKKERPEWNYIKISEDTDIENEGHLITFTTKFGEVPEKTEKF MMRNNNKNYIKSVSSSGNMNWGTLFALAADKISEKYNVPVLMKFELSGTHPQVEYLINYI EKND >gi|261746834|gb|ADAD01000168.1| GENE 4 1520 - 3658 2464 712 aa, chain + ## HITS:1 COG:BS_nrdE KEGG:ns NR:ns ## COG: BS_nrdE COG0209 # Protein_GI_number: 16078801 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, alpha subunit # Organism: Bacillus subtilis # 14 712 1 700 700 830 58.0 0 MINYKKDFRKERAIKMNKAKKWIYLNNEIMVKENGEFQLNKDKEAVYSYFVDYVNKNTVF FHNLKEKMDYLIENDYYINFYDMYKFEEIKQVFELVYNKKFRFASFMSASKFYQSYALRD 
DSGEKFLERYEDRIAIVSLYLAQGDLSKAMEYAEMLINQEYQPATPTFLNSGKKRSGELV SCFLDEMGDNLSGIGYAFDSAMKLSSIGGGVSINLSKVRARGESIKGVEGRASGVLPIMK IMEDIFSYANQLGQRAGAGAVYLNVFHSDINEFLDCKKINVDEKIRIKSLSIGVIVPDKF MELAKEDEICYTFNPHTVFLEYGQYLDEMDMNEMYEKLVDNPNVKKKKINARELLIKISQ TQKESGYPYLFFKDNANKEHALKEIGNVKFSNLCTEIMQLSEVSDINPYFEEDTIRRGIS CNLGSLNIVTVMENNRIREATRAAIDSLTTVSDLTNIDIVPSIKKANEELHSVGLGAMNL HGFLAKNFIMYESKEALDFCNVFFMMMNYYSLERSMEIARERGEVFKDFDKSEYANGNYF NKYIEKSYFPQTEKIKELFDGIYVPTKEDWAKLKEDVMKYGVYNAYRMAIAPNQSTSYIM NSTASVMPIVDTIEVREYGDSTTFYPMPYLTNDNYFFYKSAYDMDQKNILKLVSVIQRHV DQGISTILFNKSTDTTRDLAKLYIYGHRLGLKSLYYTRTRKATIEECLSCSV Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:39:35 2011 Seq name: gi|261746831|gb|ADAD01000169.1| Leptotrichia goodfellowii F0264 contig00025, whole genome shotgun sequence Length of sequence - 1242 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 496 492 ## gi|262039144|ref|ZP_06012469.1| hypothetical protein HMPREF0554_0362 2 1 Op 2 . 
+ CDS 480 - 1242 709 ## GTNG_2006 hypothetical protein Predicted protein(s) >gi|261746831|gb|ADAD01000169.1| GENE 1 2 - 496 492 164 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039144|ref|ZP_06012469.1| ## NR: gi|262039144|ref|ZP_06012469.1| hypothetical protein HMPREF0554_0362 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0362 [Leptotrichia goodfellowii F0264] # 1 164 1 164 164 291 100.0 1e-77 KTAPKSREEFNESSETGNFTEGTPDAGDAPFGIDEVEENNLEKEFSIIKDENGFLKVKKI SLVSDADIRLILLNKKLRKYKLVITPKKNIPDACVRISLSGEQSNLRARIRNAYENDNVA SPLRIKQDKIYMNNLYENQKFSLSFILNYSDDCSMEVELYEYRV >gi|261746831|gb|ADAD01000169.1| GENE 2 480 - 1242 709 254 aa, chain + ## HITS:1 COG:no KEGG:GTNG_2006 NR:ns ## KEGG: GTNG_2006 # Name: not_defined # Def: hypothetical protein # Organism: G.thermodenitrificans # Pathway: not_defined # 8 249 41 279 315 109 26.0 1e-22 MNIEYKLYPYPVLNYFSDDYINSVFTSNLKLDKLGNGIIFELTANTDDEGLKELIGKGLA EYVFHIECSSTSYRNIVKSATGTETVTIPESKLNNRVNVCFFIVAKKEIENYSNVNFNPD YEGISFTIEKANILAIAKQYNVEIEKEKDNLAQVPSIFLIIKRESDRKDGIEIEMLQDKL QISLSKKDYEHYALLSKGNYQALLHSSIIFPALIYVFENLKKSDLEIYEDFNWFKTIKKV LNNIDIELDKESLE Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:39:48 2011 Seq name: gi|261746830|gb|ADAD01000170.1| Leptotrichia goodfellowii F0264 contig00028, whole genome shotgun sequence Length of sequence - 207 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 145 63 ##

Predicted protein(s)
>gi|261746830|gb|ADAD01000170.1| GENE 1 2 - 145 63 47 aa, chain + ## HITS:0 COG:no KEGG:no NR:no
FLIGFTLHNALLKTKTLTILNIFFTTFTLHNALLKTTRLQLVDIMRT

Prediction of potential genes in microbial genomes
Time: Thu Jul 14 03:39:53 2011
Seq name: gi|261746828|gb|ADAD01000171.1| Leptotrichia goodfellowii F0264 contig00134, whole genome shotgun sequence
Length of sequence - 848 bp
Number of predicted genes - 1, with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0

N  Tu/Op   Conserved    S  Start  End  Score
           pairs(N/Pv)
1  1 Tu 1  .            + CDS 3 - 846 1308 ## gi|262039147|ref|ZP_06012470.1| putative adhesin/hemagglutinin

Predicted protein(s)
>gi|261746828|gb|ADAD01000171.1| GENE 1 3 - 846 1308 281 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039147|ref|ZP_06012470.1| ## NR: gi|262039147|ref|ZP_06012470.1| putative adhesin/hemagglutinin [Leptotrichia goodfellowii F0264] putative adhesin/hemagglutinin [Leptotrichia goodfellowii F0264] # 1 281 1 281 282 458 100.0 1e-127
GVEIVPGTEISSRYEEHLSKGLKFTANKGGVFAGVEKGKNEISTQTIKNVGSIINSKGGG
VKVEGDHIISVGSKIGAAGDIILDGKNGVVLKDGENFASIREQNEKIRAGLFATANGKKL
SASAGVEAAYSKDYNGRTMITPEKNVLVTNKNILIKSDEGNVLLQGDFGAKENIGVFAEK
GKIYIKDSTSEILTDSQSVNARVALALNLNLGGIKDTFKSFKEGYKTLKEIPNVGRVGSF
IKDIVKGKSLLESLEGREETINAMSTLFQGPSSGGVSAGLG

Prediction of potential genes in microbial genomes
Time: Thu Jul 14 03:40:07 2011
Seq name: gi|261746826|gb|ADAD01000172.1| Leptotrichia goodfellowii F0264 contig00083, whole genome shotgun sequence
Length of sequence - 1882 bp
Number of predicted genes - 2, with homology - 1
Number of transcription units - 2, operones - 0 average op.length - 0.0

N  Tu/Op   Conserved    S  Start  End  Score
           pairs(N/Pv)
1  1 Tu 1  .            - CDS 134 - 523 350 ##
                        - Prom 637 - 696 8.9
                        + Prom 579 - 638 9.9
2  2 Tu 1  .
+ CDS 694 - 1882 1139 ## COG0639 Diadenosine tetraphosphatase and related serine/threonine protein phosphatases Predicted protein(s) >gi|261746826|gb|ADAD01000172.1| GENE 1 134 - 523 350 129 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIYKNKLILESGEMRKILLMLFIMIAVPSFSYYNYYGAIAINQSTGASGYSYNYSDESSA INTALNQCGYNCIATVSFANTCASVAWSPSTGAYGWYSSRDRYLNNRLSKSYCGYSDCYV IADVCTSWY >gi|261746826|gb|ADAD01000172.1| GENE 2 694 - 1882 1139 396 aa, chain + ## HITS:1 COG:FN1603_2 KEGG:ns NR:ns ## COG: FN1603_2 COG0639 # Protein_GI_number: 19704924 # Func_class: T Signal transduction mechanisms # Function: Diadenosine tetraphosphatase and related serine/threonine protein phosphatases # Organism: Fusobacterium nucleatum # 175 393 1 220 236 221 53.0 2e-57 MRTLLLLRGAPGSGKSTFVRENRLEQYTLEADRFRTLVSNPVLNEKGDFTISQRYDKISW KMLMECLEERMIKGDFTIIDATHSTKQSVKSYEDLAEKYKYSVYYYQFDTPYEICLENNL KRDVFKQVSENVIKRMYNSVQSSVLPNRFKRIYDINEIINFYVDDITGKYENVKIIGDIH SCYTALSKILEDFSEKTLYIFLGDYLDRGIEHKKTLDLFLGLWKNPNVILLEGNHEIHLR NFASDLPVKSREFMEATLPAISENVTNTEEFKKQIRMFCKKMRQCYSFKFHNHKILCTHG GLSAVPDMALISAQDMIKGVGKYETEIGIVYEENYRLGKCQDYTQVHGHRGIESTEHSIC LEGEVEFGGELKFLNILPEKIELNSVKNDVYDRDYL Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:40:15 2011 Seq name: gi|261746824|gb|ADAD01000173.1| Leptotrichia goodfellowii F0264 contig00075, whole genome shotgun sequence Length of sequence - 430 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 429 758 ## COG3210 Large exoproteins involved in heme utilization or adhesion Predicted protein(s) >gi|261746824|gb|ADAD01000173.1| GENE 1 3 - 429 758 142 aa, chain - ## HITS:1 COG:FN1817 KEGG:ns NR:ns ## COG: FN1817 COG3210 # Protein_GI_number: 19705122 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 1 142 149 277 2806 73 36.0 2e-13 NSDRVTLTTGSLQMKDGDLVAIDVSQGHIGIGEKGIDALSLTDLELLGKTIDIAGVIKAS RETRVMVSAGGQTYQYKTKEVKSKGETYSGIAVDGKAAGSMYAGKIDIISNDKGAGVNTK GDLVSVDDVVLTANGDITTNKV Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:40:30 2011 Seq name: gi|261746768|gb|ADAD01000174.1| Leptotrichia goodfellowii F0264 contig00040, whole genome shotgun sequence Length of sequence - 46595 bp Number of predicted genes - 57, with homology - 53 Number of transcription units - 23, operones - 14 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 768 1109 ## COG1577 Mevalonate kinase - Prom 792 - 851 4.2 - Term 788 - 854 6.8 2 2 Op 1 . - CDS 870 - 2102 377 ## PROTEIN SUPPORTED gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 3 2 Op 2 32/0.000 - CDS 2139 - 3065 1047 ## COG0575 CDP-diglyceride synthetase 4 2 Op 3 . - CDS 3055 - 3750 1118 ## COG0020 Undecaprenyl pyrophosphate synthase - Prom 3850 - 3909 8.8 + Prom 3753 - 3812 11.0 5 3 Tu 1 . + CDS 3910 - 4998 960 ## COG3172 Predicted ATPase/kinase involved in NAD metabolism + Term 5005 - 5064 4.9 - Term 5000 - 5043 -0.8 6 4 Op 1 . - CDS 5053 - 6000 943 ## Lebu_2020 hypothetical protein 7 4 Op 2 . - CDS 6043 - 6222 205 ## gi|262039187|ref|ZP_06012507.1| transferase family protein - Prom 6301 - 6360 5.4 8 5 Tu 1 . - CDS 6704 - 6826 65 ## - Prom 6879 - 6938 6.8 - Term 6839 - 6901 0.6 9 6 Op 1 . - CDS 6940 - 7131 330 ## gi|262039190|ref|ZP_06012510.1| conserved hypothetical protein 10 6 Op 2 . 
- CDS 7122 - 7469 278 ## gi|262039192|ref|ZP_06012512.1| hypothetical protein HMPREF0554_0509 11 6 Op 3 . - CDS 7473 - 8033 963 ## COG3545 Predicted esterase of the alpha/beta hydrolase fold - Term 8043 - 8084 2.2 12 7 Op 1 40/0.000 - CDS 8089 - 9423 1603 ## COG0642 Signal transduction histidine kinase 13 7 Op 2 . - CDS 9466 - 10056 973 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 14 7 Op 3 . - CDS 10062 - 10142 116 ## - Prom 10168 - 10227 8.4 - Term 10192 - 10243 4.1 15 8 Op 1 3/0.000 - CDS 10270 - 10596 640 ## COG3212 Predicted membrane protein - Prom 10620 - 10679 14.7 16 8 Op 2 . - CDS 10681 - 11046 596 ## COG3212 Predicted membrane protein 17 8 Op 3 . - CDS 11124 - 11483 661 ## Lebu_1603 propeptide PepSY amd peptidase M4 - Prom 11608 - 11667 9.9 - Term 11588 - 11642 3.1 18 9 Op 1 1/0.000 - CDS 11715 - 12260 811 ## COG0110 Acetyltransferase (isoleucine patch superfamily) - Prom 12290 - 12349 1.9 19 9 Op 2 4/0.000 - CDS 12351 - 12737 690 ## COG0346 Lactoylglutathione lyase and related lyases 20 9 Op 3 . - CDS 12764 - 13360 701 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family - Prom 13387 - 13446 7.2 21 10 Op 1 . - CDS 13451 - 14287 1160 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family 22 10 Op 2 . - CDS 14309 - 15049 692 ## Lebu_0183 hypothetical protein 23 10 Op 3 . - CDS 15086 - 16507 2116 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 24 10 Op 4 . - CDS 16522 - 16977 316 ## Trebr_0061 pyridoxamine 5'-phosphate oxidase-related FMN-binding protein 25 10 Op 5 2/0.000 - CDS 17008 - 18411 2292 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases 26 10 Op 6 . - CDS 18735 - 19061 639 ## COG1447 Phosphotransferase system cellobiose-specific component IIA - Prom 19093 - 19152 13.1 27 11 Op 1 . 
- CDS 19166 - 20044 1052 ## gi|262039201|ref|ZP_06012521.1| putative tetratricopeptide repeat-containing domain protein 28 11 Op 2 . - CDS 20066 - 20434 494 ## Lebu_1349 propeptide PepSY amd peptidase M4 29 11 Op 3 40/0.000 - CDS 20424 - 21869 1469 ## COG0642 Signal transduction histidine kinase 30 11 Op 4 . - CDS 21862 - 22593 833 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 22634 - 22693 7.4 - Term 22630 - 22665 3.1 31 12 Tu 1 . - CDS 22703 - 23041 482 ## gi|262039191|ref|ZP_06012511.1| putative chromosome segregation protein SMC - Prom 23103 - 23162 6.9 - Term 23147 - 23193 7.0 32 13 Op 1 . - CDS 23236 - 23532 612 ## Lebu_0314 hypothetical protein 33 13 Op 2 27/0.000 - CDS 23583 - 26621 4681 ## COG0841 Cation/multidrug efflux pump 34 13 Op 3 13/0.000 - CDS 26649 - 27692 1572 ## COG0845 Membrane-fusion protein 35 13 Op 4 2/0.000 - CDS 27726 - 29006 2064 ## COG1538 Outer membrane protein 36 13 Op 5 7/0.000 - CDS 29021 - 29605 723 ## COG1309 Transcriptional regulator - Prom 29656 - 29715 8.3 37 13 Op 6 . - CDS 29803 - 30435 753 ## COG1309 Transcriptional regulator - Prom 30462 - 30521 10.5 + Prom 30474 - 30533 15.1 38 14 Op 1 . + CDS 30613 - 31185 593 ## gi|262039176|ref|ZP_06012496.1| hypothetical protein HMPREF0554_0537 39 14 Op 2 . + CDS 31208 - 31666 623 ## Sterm_0207 hypothetical protein + Term 31695 - 31744 3.2 - Term 31792 - 31832 6.2 40 15 Tu 1 . - CDS 31855 - 32895 1276 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 - Prom 32963 - 33022 9.3 + Prom 32888 - 32947 8.3 41 16 Tu 1 . + CDS 33003 - 33968 778 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 33987 - 34043 4.1 - Term 33975 - 34031 4.1 42 17 Op 1 . - CDS 34063 - 34548 812 ## Sterm_3318 hypothetical protein 43 17 Op 2 2/0.000 - CDS 34567 - 35250 891 ## COG0325 Predicted enzyme with a TIM-barrel fold 44 17 Op 3 . 
- CDS 35274 - 36425 1194 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases 45 17 Op 4 . - CDS 36429 - 37121 775 ## COG1794 Aspartate racemase 46 17 Op 5 . - CDS 37155 - 38873 2798 ## COG1109 Phosphomannomutase - Prom 38903 - 38962 11.3 - Term 38970 - 39005 4.4 47 18 Tu 1 . - CDS 39032 - 39391 616 ## COG0718 Uncharacterized protein conserved in bacteria - Prom 39414 - 39473 9.0 + Prom 39347 - 39406 6.6 48 19 Op 1 . + CDS 39464 - 39601 85 ## - TRNA 39465 - 39549 74.8 # Tyr GTA 0 0 - TRNA 39564 - 39639 85.4 # Thr TGT 0 0 49 19 Op 2 . + CDS 39651 - 39842 287 ## - TRNA 39652 - 39728 88.2 # Asp GTC 0 0 - TRNA 39734 - 39809 91.7 # Val TAC 0 0 - TRNA 39830 - 39904 53.4 # Glu TTC 0 0 + Prom 39972 - 40031 21.3 50 20 Tu 1 . + CDS 40087 - 41874 2063 ## COG0616 Periplasmic serine proteases (ClpP class) + Term 41881 - 41917 -0.3 - Term 41868 - 41901 5.1 51 21 Op 1 1/0.000 - CDS 42071 - 42880 1282 ## COG0561 Predicted hydrolases of the HAD superfamily 52 21 Op 2 1/0.000 - CDS 42908 - 43270 630 ## COG0784 FOG: CheY-like receiver 53 21 Op 3 . - CDS 43291 - 44094 1235 ## COG0561 Predicted hydrolases of the HAD superfamily - Prom 44209 - 44268 8.8 + Prom 44162 - 44221 14.6 54 22 Tu 1 . + CDS 44246 - 44719 701 ## Lebu_0281 thiol-disulfide isomerase and thioredoxins + Term 44737 - 44784 4.1 - Term 44610 - 44660 1.1 55 23 Op 1 1/0.000 - CDS 44795 - 45544 744 ## COG4123 Predicted O-methyltransferase 56 23 Op 2 . - CDS 45547 - 46479 1112 ## COG1774 Uncharacterized homolog of PSP1 57 23 Op 3 . 
- CDS 46442 - 46594 114 ## Lebu_2142 PSP1 domain protein Predicted protein(s) >gi|261746768|gb|ADAD01000174.1| GENE 1 3 - 768 1109 255 aa, chain - ## HITS:1 COG:SPy0876 KEGG:ns NR:ns ## COG: SPy0876 COG1577 # Protein_GI_number: 15674901 # Func_class: I Lipid transport and metabolism # Function: Mevalonate kinase # Organism: Streptococcus pyogenes M1 GAS # 1 255 6 260 297 253 53.0 3e-67 MNEKKGSGTAHSKIILIGEHSVVYGYPAIAIPLENITVKCEIIPSHSFFIHNPEDTLSTA IYEAMKYLGKEKEKIKYNIISQIPEKRGMGSSAAVSIAAVRAVFDYFGKNIDDKLLEKIV QRAEMVAHTTPSGLDFRTCISEKAVKFIKGAGFFPLDLNLGAYLVIADSGVQGKTKETVT AVREMGENARPMLLKLGELTDETEKFISEKDIVNIGKNMTKAHEELSKLGLNIEETEMLV KEALKNGALGAKISG >gi|261746768|gb|ADAD01000174.1| GENE 2 870 - 2102 377 410 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 [Phaeobacter gallaeciensis BS107] # 51 405 41 410 418 149 28 2e-35 MSFTNEAEIKKEVERQFDILKRGCDEIINESEFKKKLEKSISTNTPLRIKLGIDPSGTDL HFGHAVPLRKLKQFQDLGHQVFFLIGTFTGRIGDPTGKSETRKMLSAEQVQENIKTYLDQ VSLILDLDKIKVVYNGDWLEKLNLEDVLRLLSQFTVSQMISREDFAKRLSENKPVSLIEF MYPILQGYDSVELRADVELGATEQKFNLLRGRDLQKNAGQEQQVCMIMPILEGLDGVEKM SKSLNNYIGVKDSPNDMFGKVMSVSDDLMYRYYEVITEVAQEEIAKMKEEINNGTLHPME AKKRLGEEVVKIYHNEEESKKAREWFENVFSKKNLDVDLPEVELEYKEIGVIDLLVRETG LLSSTSEARRLVEQGGFKINDEAVKDVKAVVKIQNGMVIRAGKKKIVKVK >gi|261746768|gb|ADAD01000174.1| GENE 3 2139 - 3065 1047 308 aa, chain - ## HITS:1 COG:FN1325 KEGG:ns NR:ns ## COG: FN1325 COG0575 # Protein_GI_number: 19704660 # Func_class: I Lipid transport and metabolism # Function: CDP-diglyceride synthetase # Organism: Fusobacterium nucleatum # 3 307 5 290 294 123 33.0 5e-28 MLSRIFVIVLFVPFLLWIFLKGDVMFLVFTMVIIGVSLHEFYKMLKDKGFQVANRIGMGL GLFLPLAIYFQQNSKNIFSFLRLKFFQQINFDMGGFIVFALMLIAFRQILKVKIQGAMAE IAYTLFGIIYVSYFFSYILLIKYEFPNGNVIVAMTFILIWACDISAYLVGKFIGGKIFKQ RLAPKISPNKSIEGAIAGILGVFLVISFFDKIYLFIANFVCSITLISKTCSLDYNYIAIS GWQAFLLALGIGVFAELGDLVESKIKRELEIKDSGNLLLGHGGFLDRFDSALFVLPIVYV FMKYVAHM >gi|261746768|gb|ADAD01000174.1| GENE 4 3055 - 3750 1118 231 aa, chain - ## 
HITS:1 COG:FN1326 KEGG:ns NR:ns ## COG: FN1326 COG0020 # Protein_GI_number: 19704661 # Func_class: I Lipid transport and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Fusobacterium nucleatum # 1 228 1 227 230 266 59.0 4e-71 MDLKIPKHVAIIMDGNGRWAKEKGKIRLEGHRQGAINLEKIIEHSVKIGVKYLTVYAFST ENWKRPEKEVNGLMELFAKYLDTKKKSLKKQGIRLLVTGTKENISEKLLKKIEETEKYLE DCEKMIFNIAFNYGGRKEIIDAINKIIKDKKETVDEKEFARYLYRPEIPDPELVIRTSGE FRISNFLLWEIAYSEFYVTDVYWPDFTPEEYDKAILAFNKRDRRYGGLNVK >gi|261746768|gb|ADAD01000174.1| GENE 5 3910 - 4998 960 362 aa, chain + ## HITS:1 COG:STM4580_3 KEGG:ns NR:ns ## COG: STM4580_3 COG3172 # Protein_GI_number: 16767821 # Func_class: H Coenzyme transport and metabolism # Function: Predicted ATPase/kinase involved in NAD metabolism # Organism: Salmonella typhimurium LT2 # 168 351 1 184 185 209 52.0 9e-54 MKIGIVVGRFLPLHTGHVNLIQRASGLVDKVYVVVSYSDKGDTEMISNSRFIKEITPKDR LRFVKQTFKHQDTISSFLFDESNCPPFPEGWEIWSSLLKKEIESREPDIDWENDVLFISN RKDDAKYNLKYFNAKTKSIDPEYLEYPVNSWEIRENPSKYWEFLPREVREHLIPIITICG GESSGKSVMIDKLANAFNTSSAWEYGREYVFEKLGGDEDSLQYSDYEKIVFGHQSNVLYA ARNANKFALIDTDYITTLAFCLTYEKKDNPIIREFVQNYKFDLTILLENNVKWVNDGLRS IGESERRKKFQDLLKELYKEYKIPYIIVKSDNYEKRYLACKEIIKAYLNGQDNLQEIADS FI >gi|261746768|gb|ADAD01000174.1| GENE 6 5053 - 6000 943 315 aa, chain - ## HITS:1 COG:no KEGG:Lebu_2020 NR:ns ## KEGG: Lebu_2020 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 311 1 312 321 150 36.0 6e-35 MKQILKNIDKNSLIGIYRFKENDFIVGNIIKLSDDYLFLNSCDIFGKYNGIKIVDVNIID RLIIKSDYIDNLNELRKNENKENKKIELYKIKSVEDFYKKIIDDKMLLSIELEDESIETG YMKKKTEDKFYFDFINEDMKVISAEIIKESYIKRIKLLEKIEDITKTDKENNIKKIVMNT GEICFGNIVQTIGEYLIFREKDEFRENRQISIIKTDKIEEITELISFDNMKKTEIGNLFK NIDFFEILKASMENKLVISIDNEDYEETKVGIIIEMKKDTLKLKRFDKYRQFSEISIIPY SEIQLLYVYNYEVFE >gi|261746768|gb|ADAD01000174.1| GENE 7 6043 - 6222 205 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039187|ref|ZP_06012507.1| ## NR: gi|262039187|ref|ZP_06012507.1| transferase family protein [Leptotrichia 
goodfellowii F0264] transferase family protein [Leptotrichia goodfellowii F0264] # 1 59 1 59 59 74 100.0 3e-12 MDAINNKYEETLYNEQNYELILDNGKKYHNKIEKGKFKKILNELIQEDYKNIERRYLGY >gi|261746768|gb|ADAD01000174.1| GENE 8 6704 - 6826 65 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKIGKKVILIIIASFLCISCMKNELNFELKIKSKYENFSE >gi|261746768|gb|ADAD01000174.1| GENE 9 6940 - 7131 330 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039190|ref|ZP_06012510.1| ## NR: gi|262039190|ref|ZP_06012510.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 63 1 63 63 110 100.0 2e-23 MEVRIFINFYCSDKNPTRAYLDSEGKDCEPEGVLKVNYQGKEKSMIVHVYDESEEVLKKY VKK >gi|261746768|gb|ADAD01000174.1| GENE 10 7122 - 7469 278 115 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039192|ref|ZP_06012512.1| ## NR: gi|262039192|ref|ZP_06012512.1| hypothetical protein HMPREF0554_0509 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0509 [Leptotrichia goodfellowii F0264] # 1 103 1 103 115 179 100.0 7e-44 MKAKKKINIPMILLLLYMFVFILFFIVLAITSKDDLVENMEVDRIRRDIFRLLNKDTEVY VEGVKLEGKEREKVLRIFNGEDVGGLPFSPLYYPREMQGRQEVKVKLKKRIKKWK >gi|261746768|gb|ADAD01000174.1| GENE 11 7473 - 8033 963 186 aa, chain - ## HITS:1 COG:BS_ydeN KEGG:ns NR:ns ## COG: BS_ydeN COG3545 # Protein_GI_number: 16077593 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha/beta hydrolase fold # Organism: Bacillus subtilis # 1 181 1 181 190 168 44.0 4e-42 MAKQLYIIHGYGASPREHWFPWLKNTMEEKGWTVSILEMPDSEHPKFDKWLKAMKENIKN INENTYFVAHSLGTITTLQYLSQYENLPKFGGFILVSGFDEGIAGFEELRPFVKEKPDYK KINEKAELRAVVAAKDDHIVPIKLSEKVAKKLNGKFYPVEKGGHFLDREGFTELPLVKEI LEKNEI >gi|261746768|gb|ADAD01000174.1| GENE 12 8089 - 9423 1603 444 aa, chain - ## HITS:1 COG:FN0586 KEGG:ns NR:ns ## COG: FN0586 COG0642 # Protein_GI_number: 19703921 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Fusobacterium nucleatum # 
3 441 4 443 445 379 47.0 1e-105 MGKKVAKIWGNLPITVKITLWYTTFVIILMSAMLIISFTVTDKISKDKNQRELTEAVNEL ISDNDFDDFDDGIYFVRYNNEGIESGGMSPRGFDLTLKLEEDTIGAYGKNGEKFYYYDRK MNNSDGDWIRGIIPVRKMSDEINKLFLILTVLSPILLVIIVYGGYRIVKKTLKPVVKISD TALEISKNGNFSKRIDIDDGKDEIHKMANTFNEMPNSLESFYLREKQFSSDVSHELRTPV SVILTESQYSLEYVDNMEEAKDSFGVIQRQAKRMSELINQIMELSKIEKQISILPEKIDF SEITEKILGDYKNLLEEKNIKIFPEIEKNISIHGDKVMIERLLDNLLNNAMKFTKDEINV KVYSENENCVLEVKDNGIGISEKEKELIWGRFYQVNDSRNKKINKGFGLGLSLVSKIVEN HNATVNVESEPNKGAKFTVKFKKY >gi|261746768|gb|ADAD01000174.1| GENE 13 9466 - 10056 973 196 aa, chain - ## HITS:1 COG:FN0585 KEGG:ns NR:ns ## COG: FN0585 COG0745 # Protein_GI_number: 19703920 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Fusobacterium nucleatum # 1 195 30 224 224 260 71.0 1e-69 MFDGEEALEYLEYGEYDLVILDVMMPKINGFEVVKELRSKGNNTSVLMLTARDSADDKVK GLDMGADDYLIKPFDFNELSARIRAVVRRKFGNSSNKIEVKDLVLDTTEKSVTRAGKKID LTGKEYEILEYLVQSKNRILSRDQIIERVWGYEYEGDSNIIDVLIKNIRKKIDIEDGKQI IFTKRGMGYVVKEDDE >gi|261746768|gb|ADAD01000174.1| GENE 14 10062 - 10142 116 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKILFVEDEIDLNNIVTKYLKKTVTV >gi|261746768|gb|ADAD01000174.1| GENE 15 10270 - 10596 640 108 aa, chain - ## HITS:1 COG:MA0533 KEGG:ns NR:ns ## COG: MA0533 COG3212 # Protein_GI_number: 20089422 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Methanosarcina acetivorans str.C2A # 18 106 114 201 206 64 39.0 6e-11 MKNKNLKLVVLGLAMFGVLGFAYRENSNKVSQYAVNSTNSQVITESRAKAIALKEVPRAN ESNIVEMELDRDHGRMVYEGEIYYKGLEYDFDIDAVTGEVLKWHVDRD >gi|261746768|gb|ADAD01000174.1| GENE 16 10681 - 11046 596 121 aa, chain - ## HITS:1 COG:MA0533 KEGG:ns NR:ns ## COG: MA0533 COG3212 # Protein_GI_number: 20089422 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Methanosarcina acetivorans str.C2A # 49 114 137 201 206 66 51.0 1e-11 
MKNRILNVMILGIMMLGVAGTANSASKSKNSKTGNRDVEVQVRNVTAGSYIGVARAKAIA LKKVPGANDSHVTEVHLDREDGRMVYETEIRYNNTKYEFDIDATTGEIVKWKMKKAKNKN Y >gi|261746768|gb|ADAD01000174.1| GENE 17 11124 - 11483 661 119 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1603 NR:ns ## KEGG: Lebu_1603 # Name: not_defined # Def: propeptide PepSY amd peptidase M4 # Organism: L.buccalis # Pathway: not_defined # 1 119 1 121 121 117 57.0 2e-25 MKSKTLKLAVLGIGMLGILGIAYGVSRNNVSNKTEYVVNNGANTQNTQVIMKERAEAIAL KEVPGANESNITEMELDREHGRMIYEIEIQYNNTEHEFDIDATTGEILKKEVKTYYKNY >gi|261746768|gb|ADAD01000174.1| GENE 18 11715 - 12260 811 181 aa, chain - ## HITS:1 COG:MA0513 KEGG:ns NR:ns ## COG: MA0513 COG0110 # Protein_GI_number: 20089402 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Methanosarcina acetivorans str.C2A # 1 181 9 189 199 261 68.0 7e-70 MTEKEKMLAGMIYDANYNEELAEERRKAKDLCYEYNQIYPSQEEKQREVMRKILGKTEKN FMIVAPFWCDYGYNIEIGENFYANHNMIILDGGKVTFGDNVFIAPNCGFYTAGHPLDTES RNAGLEYAHPIKVGNNVWIGAGVSVLPGVTIGNNTVIGAGSVVTKDIPDNVVAVGNLCRV I >gi|261746768|gb|ADAD01000174.1| GENE 19 12351 - 12737 690 128 aa, chain - ## HITS:1 COG:FN1050 KEGG:ns NR:ns ## COG: FN1050 COG0346 # Protein_GI_number: 19704385 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Fusobacterium nucleatum # 1 128 1 127 127 142 57.0 2e-34 MKINHIALYTNDLERIREFYEKYFGAKANTKYHNKNTGLQTYFLGFEGSDTRLEIMTRPE ISQREERTVNEGFIHLAFSTGSDSEVDRLTEMLVKDGYKCLSGPRTTGDGYYESVIEDPD GNLIEITV >gi|261746768|gb|ADAD01000174.1| GENE 20 12764 - 13360 701 198 aa, chain - ## HITS:1 COG:BS_ydfE KEGG:ns NR:ns ## COG: BS_ydfE COG1853 # Protein_GI_number: 16077605 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Bacillus subtilis # 2 192 11 203 207 154 39.0 8e-38 MFKKIEKNGFYYGFPVILMTTKDKETGKNNVSPLSSSWVLGKTMVIGIGLGSKGFRNIEE 
GSDLTFNIPDEKLFENIKRIEKFTGMPEVPEAKQKLGYSYCEDKFEVAEFTEIEGETVKA VRIKECPIHIEAKVADIVKKDWFAIVTCEIQTIFMSEEILKNDSQIDTEKWEPLIYKFRE YVGTGERLGLNFNFQEAL >gi|261746768|gb|ADAD01000174.1| GENE 21 13451 - 14287 1160 278 aa, chain - ## HITS:1 COG:L37351 KEGG:ns NR:ns ## COG: L37351 COG1387 # Protein_GI_number: 15673198 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Lactococcus lactis # 4 272 5 259 269 133 33.0 4e-31 MLADYHIHCRYSDDSEEEPEKIVESAVSKNFDEICFTDHVDYGIKIDHDVYNKMNDSEKE KYKNLLNVDYPEYFREIGELREKSKDKITIRQGLEFGMQTHTIADFQKIFDKYDLDFVIL SCHQVNNEEFWNKVFQKGKTPDEYNYKYYEEIYKVIQKYSDYSVLGHLDHIQRYNETIYP FEKSREIITEILKKVIKDGKGIEVNTSSFRYGLKDLTPERNILKLYHELGGKIITIGSDA HKAENVGDHIPYIQSELKKTGFTHICTFDKMKPIFHGL >gi|261746768|gb|ADAD01000174.1| GENE 22 14309 - 15049 692 246 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0183 NR:ns ## KEGG: Lebu_0183 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 13 233 6 227 242 265 65.0 1e-69 MEDKILNKNNSILEKIRVFSGAQLKYIAFLSMLADHVNKALIYPILTGEGVLQQISNVFD IFGRVAFPLFSFFLVEGFFKTKSRKKYLLYLLTFGVISEVPFDMFQSGVFFNPDSNNVLF SLALSLVTIWTIDMLKRKTGKISKIMWYPVSFIIVIIMSIIAVFLSVDYEYHAVLIGYFF YIFYEKPLLSAVFGYVSIFKEVWSILGFGLTLTYNGKRGKQYKILNYCFYPAHLLVLGLL RMYFNL >gi|261746768|gb|ADAD01000174.1| GENE 23 15086 - 16507 2116 473 aa, chain - ## HITS:1 COG:L121426 KEGG:ns NR:ns ## COG: L121426 COG2723 # Protein_GI_number: 15673653 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Lactococcus lactis # 1 472 1 474 477 717 71.0 0 MGFRKDFLWGGATAANQLEGAYNEDGRGLANVDLTPVGEARLSVITGKRKMLEFEEGYFY PAKGAIDFYHRYKEDIALFAEMGFKTYRMSIAWTRIFPKGDEETPNEKGLEFYENVFKEC RKYGIEPLVTITHFDFPIHLIKEYGGWRNRKIVDFYKKLCEVIFKRYKGLVKYWLTFNEI NILLHAPFMGAGIVFEEGENEKQALYAAAHHELVASAWATKIAHEIDPENKVGCMLASGK YYPLTAKPEDVWEALEKDRENYFFIDVQVRGYYPNYALKFFERNNLKIDITEEDRKILKE 
NTVDFVSFSYYTTRCISKEAADLGEGNLLESMRNPYIEVTDWGWGLDPLGFRTTINEIYD RYQKPLFVVENGLGAVDTPDENGYVEDDYRIDYLRAHIKAMKDAVELDGVDLLGYTTWGP IDLVSAGTGEMKKRYGFIYVDRDNAGNGTLKRSKKKSFEWYKKVIASNGEDLE >gi|261746768|gb|ADAD01000174.1| GENE 24 16522 - 16977 316 151 aa, chain - ## HITS:1 COG:no KEGG:Trebr_0061 NR:ns ## KEGG: Trebr_0061 # Name: not_defined # Def: pyridoxamine 5'-phosphate oxidase-related FMN-binding protein # Organism: T.brennaborense # Pathway: not_defined # 11 148 87 224 227 84 34.0 1e-15 MKEFDEKCNFLFEQLGKGKKMVLSTSSDDRVTSRMMCIGIIDNQFYFQTDRNFRKYEQLK VNPNVALCFENIQIEGICKETGKPLENEKFCNLYKEVFRNSYENYSSLENERLFIINPVY IQKWIYENGEPYLEKFDFVKNTYKKEKYIQK >gi|261746768|gb|ADAD01000174.1| GENE 25 17008 - 18411 2292 467 aa, chain - ## HITS:1 COG:lin0540 KEGG:ns NR:ns ## COG: lin0540 COG1486 # Protein_GI_number: 16799615 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Listeria innocua # 3 439 2 440 440 630 67.0 1e-180 MAKKGVKIVTIGGGSSYTPELIEGFIKRIKELPVKEIWLVDIEEGKEKLEIVGKLAQRMV KAAGIDCKVHLTLDRRKALKGADFVTTQFRVGLLDARIKDERIPFENGLLGQETNGAGGI FKAFRTIPVILDIVKDMKELAPDAWLINFTNPAGMVTEAVLKYGNFDKVVGLCNVPVNLM MDCAKIYEKNKDDFIFQFAGLNHFVWYNVYDKKGNDLTYDLWERLQDKEDIGVKNIVNFN FDYDQIKNLGMLPCPYHRYYYLQDTMLQQSLKDYKEKGTRGQIVKKVEEELFELYKNAEL HEKPKQLEQRGGAYYSDAACEIINGIYNNKGTIMVVNTKNKGAISDLPYDAAVEISSYIT GAGPKPIAFGRFPNAGQRGWIQVMKAMEELTIEAAVTGDYATALQAFTTNPLIPGTEIAR KVLNELLDAHKKYLPQFKEYYKNPEKYKIKESKEKAVKKSTKKTVKK >gi|261746768|gb|ADAD01000174.1| GENE 26 18735 - 19061 639 108 aa, chain - ## HITS:1 COG:lin2833 KEGG:ns NR:ns ## COG: lin2833 COG1447 # Protein_GI_number: 16801893 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIA # Organism: Listeria innocua # 6 102 2 98 100 85 52.0 2e-17 MAENFDVETVAMTLIGHAGESKSLAYQALNEAKKGNFEEAQKFMDQAGEEMLKAHSMQTD LIMKEANGEKIDIGLIMVHSQDHLMGAILFKDLAKEFIDLYKTIYNKK >gi|261746768|gb|ADAD01000174.1| GENE 27 19166 - 
20044 1052 292 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039201|ref|ZP_06012521.1| ## NR: gi|262039201|ref|ZP_06012521.1| putative tetratricopeptide repeat-containing domain protein [Leptotrichia goodfellowii F0264] putative tetratricopeptide repeat-containing domain protein [Leptotrichia goodfellowii F0264] # 1 292 1 292 292 427 100.0 1e-118 MIKKKFGNLLTVVFLILGIQILSQARAATIGQIEQYKLKLKKYNIKKESSDILVKALQTE NPSEVEKLLLKAVEADKRNYIAYELLGAYYQNEKYGNDIKKALEYYEKALEVNPDEEKIY LQLGENYIRANDYDKAANIYGQMKSRFSENSVKAEEMAVASAEYARSSKMAEKTHEEQAL YSKKSDSFAGFVAPSAAFESGANGYTKERYDFEEDKKKAKEEILASLSYFENKQYEKGFQ SFYENYSRYADVLDDGLIEETIKIIKKYNEEIKNTNKKLYRKNLKKFKELEI >gi|261746768|gb|ADAD01000174.1| GENE 28 20066 - 20434 494 122 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1349 NR:ns ## KEGG: Lebu_1349 # Name: not_defined # Def: propeptide PepSY amd peptidase M4 # Organism: L.buccalis # Pathway: not_defined # 1 108 1 111 203 79 47.0 4e-14 MKNKILRIVLTGMAIFTIGVFAEGSKKGVKFANPNVKISTGKAKEIMLQHAGIHLGQKVR ITKIMLQKENRKYFYVIEFFTEDKKYKYNIDAGKGNVLNFSQENRKRAENKSKGTFWDII GI >gi|261746768|gb|ADAD01000174.1| GENE 29 20424 - 21869 1469 481 aa, chain - ## HITS:1 COG:FN0586 KEGG:ns NR:ns ## COG: FN0586 COG0642 # Protein_GI_number: 19703921 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Fusobacterium nucleatum # 21 467 12 445 445 208 32.0 1e-53 MTKATKIRKNTIGNEKKSSGKLSIKMRIAFWYTGLIIGITALFIITMLNIGSLNVRNAVH NRLKRTVEKSFERIDFIDGEIVLDNNLDTSAGDIFISVYDKNGDFIYGDLYLDLETGSSF KENNKVKTVKQGNLKWYVYEIKRNFEGYGDLWIRGVTPISQPEKSAEMTIFIFLALFPFL IIISASGGYIITKKAFEPIEDITKTAEKINKGNDLSERINIGNGKDEIFTLANTFDEMFD RLQGSFDREVQFTSDVSHELRTPISVISTQSEYGLKYLDINEETKEIFQSILEETKKMSG LVSNLLMLARMDKGYQKLNVENTNLSETAEIAIETQRPNAEKRNITIKSNISGNVYADID ESMIMRVFINLLSNAVFYGKDGGNVSVDLFAKEDKVICKITDDGIGISEEHIDKIWNRFY RADFSRTGDNSGLGLSMVKGIIEAHKGKIQVQSELHKGTVFTFELPVVFSGGNGGSMYNE K >gi|261746768|gb|ADAD01000174.1| GENE 30 21862 - 22593 833 243 aa, chain - ## HITS:1 COG:FN0585 KEGG:ns NR:ns ## COG: FN0585 COG0745 # 
Protein_GI_number: 19703920 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Fusobacterium nucleatum # 17 239 1 221 224 209 51.0 5e-54 MLKYAIIKKNRLSGEKMRILVVEDEKKINDVIVKILKQEKYGVDSCYDGQEALDYIFSVD YDAVILDIMLPKKDGFEVLKKIREKNIKTPVLFLTARDRVEDRVKGLDYGADDYLIKPFA FEELLARVRVLLRKSSDTSQTGNIFKVENLTVNCNDHTVFRDETSIKLSAKEFAILEYLI RNKGKVVSKEKIEEHVWDFDYEGGSNIVEVYIKFLRKKIDDNFSPKLIHTIRRVGYILKV END >gi|261746768|gb|ADAD01000174.1| GENE 31 22703 - 23041 482 112 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039191|ref|ZP_06012511.1| ## NR: gi|262039191|ref|ZP_06012511.1| putative chromosome segregation protein SMC [Leptotrichia goodfellowii F0264] putative chromosome segregation protein SMC [Leptotrichia goodfellowii F0264] # 1 112 1 112 112 145 100.0 8e-34 MKKFAILISTAISLQSFAADYQTTKEENVGFSKQEIRKNNAQIESEIKRSIENIVSPVSK YEIQEISYMSETNAEVKIGLKETLEKQKENKKPLVIKFSKEEGNWSMEETVS >gi|261746768|gb|ADAD01000174.1| GENE 32 23236 - 23532 612 98 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0314 NR:ns ## KEGG: Lebu_0314 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 98 1 98 98 138 62.0 9e-32 MKRLEVYFDSFFLEKVKEELNEYGLEKYVLVPEVFSDWGKHLKHFNSHLWPGTDSILIAY VEDDQGEEIMRVIKKMKIDVGHMISMGAVLIPIDDMIL >gi|261746768|gb|ADAD01000174.1| GENE 33 23583 - 26621 4681 1012 aa, chain - ## HITS:1 COG:FN1275 KEGG:ns NR:ns ## COG: FN1275 COG0841 # Protein_GI_number: 19704610 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 1001 1 1002 1020 668 39.0 0 MTIAEFATKRVVSTTMIIIFMIFAGYSAMKNMKQELIPDFNFPFVVVQTKWTGATSEDVD TQITKRIEEASLNVDGIKNITTTSAYGTSVVVIQFNFGSNTDTKKVQIQSEIDKIKNDLP NDADSPVISGANTRGGNSSMALFIILKGADEATLTSFVKETMKPRLQRNRGIGDIRVTGS TEREIKVELDPYKLKAFNLSPSEIYSKIKAANTITPAGTVTDGGKKFILKVSGEIENLEQ VENIILSNDNGQTLKLADIAKVSYSTKDRETYTRVNGKDAVGVIIEKTKDGNIVEIADTA 
KKQLEEMKPLFPKGSTYELITDNSIMVKDSIANVTSSGLQALVIAAIVLMVFLKDIRASI FISLSIPISTMFTLFLLGTQGISLNMVSLMGLALAVGSLVDNSVVTLDNIFDHMQEYKEP PNVAAVRGTNEVVMPMIASTMTSVCVFLPIVIFEGFTKAVFKSIAFSMMFALSASIIVAV LFIPMVSSRFLNLQKILDSKDKSKYYNAFKEKYKKAIAAALDNKWKVVIGTFVGFVATMV IFGPMVKTAFFPQIDNKEYSVVASLATGLDLEKSYEITKKIEEAVKADKNTKSYYSISQT DAAIVNVKVDKDTAKAMDRIREKLKDLPDVNLAVLFEETTSANANKDYSFEIEGENEDEL NRIANSIIGEIKKENWMKDVKSSAEGGYPQAQLEINREKAESYGINVTNLTTMLNMSVLG IAPIEITEGTEKLDVTLQLEEQYRNSLEKILDLEIKASDGAYVRIGDIAKLSNVEGTSEI TKYNGSRIVTVGFNLDSSKGLSDAAKFVNAAFKKARPAQGYKLATAGNARSQEEMGGEMS RALGLSIVLIYIVLAIQLESFILPFIIMTSLPLSIIGVVIGMVITRVQLSMFVMIGIIML MGMVVNNAIVLLDFVANMRQRGVGIREALIESGGSRLRPILMTTLTTVLGWIPMALAIGG GTAGYYQGMAIAVMFGLSFSTLLTLIFIPVLYLIVEEAKEKRAKKKRKKIHN >gi|261746768|gb|ADAD01000174.1| GENE 34 26649 - 27692 1572 347 aa, chain - ## HITS:1 COG:FN1274 KEGG:ns NR:ns ## COG: FN1274 COG0845 # Protein_GI_number: 19704609 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Fusobacterium nucleatum # 16 347 8 366 370 157 29.0 4e-38 MKNGKKSYYNKYILGLLVLLIFAVSCGKKKEQKVEEDLRPVKTVLLNQSDMSLGYTAGAE IKGKEEIPYTAVVSGELTVLNVKDGDTVHAGQIILGIDNQAARSNVQSASANYNAARIQY EKYSQLYQKRLITETDYLNAKTNYESASANLALANDSNSKSAIKADIDGVISGLTLERHQ QISAGQTLFKIVNESEMEIKVGVSPNAISKIKVGTNAKIKIDELNKEVSGKVYEISGTAD TKTRQFIVKIRIPNPEKEIKSGMYGTASIDTGIEAGIIIPKEAIVVRGVQQIIYIVRDGK AMAIPIKILNQNEKYAAVEGEGLTAGSELIIDGQNVVQDGEKLKKVN >gi|261746768|gb|ADAD01000174.1| GENE 35 27726 - 29006 2064 426 aa, chain - ## HITS:1 COG:FN1273 KEGG:ns NR:ns ## COG: FN1273 COG1538 # Protein_GI_number: 19704608 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Fusobacterium nucleatum # 21 425 5 412 413 180 29.0 6e-45 MRIKQIGVAVTMMLFALPVFVFGQKITVQEAAEMAIKNNKDIKIGMLEVEKGKLDVNKAW KSAYFKVTYNATANKYFKDVKSPFTGKYNQAYGQNVTLAQPIYTGGAIKSGIEIGKNYLS LMELSLDKIRKDTMLSVIQAYIDVYEAENTLEVLKKSKEALSRNYEEQKEKYKLRLVTKP 
EFIEAERSLKAIEADVIQQTSNIEIAKEALGNMIGVKDSEKIEIVPFTVEEKFSKAVNLK DDLAKLTTRNTEYQMALKQKEISRKNIDLEKADLRPKVTGVVTYGTSDRTKFSEVSKTKN YNGTIGLNLSWQIFDWGENKLDVEKAKRNHEIKEIEAEKALDDLKVGMKKVYYQLQALEK SLEAKKIAVEKAEEVYELEQERYSYRLITMRDLLNAESNLRQSRTDYISSKLRYYYLVSR YGAFLD >gi|261746768|gb|ADAD01000174.1| GENE 36 29021 - 29605 723 194 aa, chain - ## HITS:1 COG:MA0364 KEGG:ns NR:ns ## COG: MA0364 COG1309 # Protein_GI_number: 20089261 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Methanosarcina acetivorans str.C2A # 4 158 9 162 192 60 29.0 2e-09 MKKEDKRQMIIEKSIELFSENGYHKTKVEDITKALEISKGNFYTYFDSKEEVLYEILDRI KEEKIKALKEIDVNANPKEILKEFVNKRSNIFFKYMKKTNMQNVDAFLKDQKIVNYLNEI QIISVNFIEENIINRGGGGKKYNSRFISEFILLSIEGFFLDEVLSEGIDCERGYTMEREK KIEQIIEFINNALK >gi|261746768|gb|ADAD01000174.1| GENE 37 29803 - 30435 753 210 aa, chain - ## HITS:1 COG:FN1034 KEGG:ns NR:ns ## COG: FN1034 COG1309 # Protein_GI_number: 19704369 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 7 204 9 205 205 159 46.0 4e-39 MKNKIKEDRYHHGDLRYALIEKGIEIINLEGESNFSLRKVATRCGVSNAAPYAHFKNKEE LLDAMRLHVIDQLAEELEKAKKEYYGTPDVLMKMGKSYVLFFYKYPVYHNFLFPKKNLKM DFSLEFTEKTENKALEILKKTALDIFDELKVSSKIIQDKIIAMWALVEGLTVIITMSDIK YDENWDKRIEEILNSVMIFDGNENLLEKVQ >gi|261746768|gb|ADAD01000174.1| GENE 38 30613 - 31185 593 190 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039176|ref|ZP_06012496.1| ## NR: gi|262039176|ref|ZP_06012496.1| hypothetical protein HMPREF0554_0537 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0537 [Leptotrichia goodfellowii F0264] # 1 190 1 190 190 297 100.0 3e-79 MKMKINKFLITALMLLISAYSFSDNETPAETETTVNNTETKIENSEEATEPIVKPVTTSV TTKESKKAKSSATFKKKEVANILWRLTKISDKDLASSEDEQEKKEIITLYLSSNGGMNGV TGDNGYFGNYKLDGNSISISLFGNSAVYDGSTELETEYLNLLGRVSSISVSGNILTLRAG KDLLIFQRVQ >gi|261746768|gb|ADAD01000174.1| GENE 39 31208 - 31666 623 152 aa, chain + ## HITS:1 COG:no KEGG:Sterm_0207 NR:ns ## KEGG: Sterm_0207 # Name: not_defined # Def: hypothetical protein 
# Organism: S.termitidis # Pathway: not_defined # 4 151 3 151 153 96 38.0 3e-19 MKTKIILSLISILFIFACTTANTASQNNKPGKVKTDIAAKLLNTSWELTEINGQEPDFGV AYSLKPVITIEFKDNGMNGRSVVNGYFSSYKINDDNINFGGIGATRMGGPKEFMDLEYKY FNILDKVNKVELPDENTLILKSNTDTLKFTKK >gi|261746768|gb|ADAD01000174.1| GENE 40 31855 - 32895 1276 346 aa, chain - ## HITS:1 COG:FN1322 KEGG:ns NR:ns ## COG: FN1322 COG0750 # Protein_GI_number: 19704657 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Fusobacterium nucleatum # 1 346 1 338 339 252 43.0 6e-67 MSIIFTIIILGIIIFLHELGHFMTAKYYKMPVLEFAIGMGPKVFSKKINETAYSIRLLPL GGFVNIGGMQPEDDPEKQVKDGFYTKSPFSRFVVLIAGIMMNFISSIIAIFIMLSVTGGV PAGYIKPIVGSVNENSAAKNVLQVNDRITEINGKKIKNWEDLANAIYKINEKGYNGENIS LKIMRDNKEINTDIKLTYSEELKINALGIVAAQAKISFFQKISASFYTFGNYFKVMADGL KMLITGKVSVKEVTGPVGLPKYVGQAYKDGGGIGLLNIFILLSINIGLMNLLPIPALDGG RLLFVIPEFFGIKVNKKIEERIHMIGMLLLLGLMVFIVFNDVMKYF >gi|261746768|gb|ADAD01000174.1| GENE 41 33003 - 33968 778 321 aa, chain + ## HITS:1 COG:FN1498 KEGG:ns NR:ns ## COG: FN1498 COG0697 # Protein_GI_number: 19704830 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 4 291 2 285 299 194 44.0 3e-49 MKKNNLILGGFLVVIAAIMWGLDGVLLTPSYFSKFHFYDVNFIVFIAHAIPTLILSVLFT KQYKMLKEFTKNDYIFFLLIALFGGSIGTLSIVKALQLSEYSKFSIVILIQKSQPIFAVI LAFLILKERPSKKFYIVAIICMISIYLLTFEFKSPTLLPKNNLLAAMYSLLAAFSFGSST VFGKKIVGKFSFLTSTFYRFFFTTVIMLVFILFSGNTEKSIITFTGNKSLMSLALFIAVY GLCAILIYYNGLKNIPASIATFCELAYPLTSVFAEAVILKRFLSPVQFVSAMILVGSILY LNLSKVDKEFKNIDKVGDSEI >gi|261746768|gb|ADAD01000174.1| GENE 42 34063 - 34548 812 161 aa, chain - ## HITS:1 COG:no KEGG:Sterm_3318 NR:ns ## KEGG: Sterm_3318 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 5 154 3 150 152 154 61.0 2e-36 
MKLGKKFLEFFGDVDEEEVIEEEEVVVKPKPQVKTHTVNESEQQPEEKEKRVMSIFGGGK KEEIGKMKVSYVSIIRPKTFEDSRLIADSIKEKKVVTFSLEFLEFEVGQRVIDFVSGAAY AMDAHLSKVTDKVLTSIPNGVGYEDIDTSLEEERKRNSDLL >gi|261746768|gb|ADAD01000174.1| GENE 43 34567 - 35250 891 227 aa, chain - ## HITS:1 COG:FN0561 KEGG:ns NR:ns ## COG: FN0561 COG0325 # Protein_GI_number: 19703896 # Func_class: R General function prediction only # Function: Predicted enzyme with a TIM-barrel fold # Organism: Fusobacterium nucleatum # 8 227 3 223 223 167 48.0 2e-41 MELTREKIHQNYEKIKEDIKKYSPYPEKVKILFVTKYFNFEEQKKIIDMGYNYFGENKAQ VYRDKVKNFSEDKYKNLKWDFIGRLQKNKIKYIIKNVNLIHSIDSYELLEEINKKAAENN RIVNGLIQINISKEESKTGVYVESFEKEYEKYFSMDNVKIKGFMTMAPLEAESHEIKEYF FNMLRLKEKYEKKYDYLDELSMGMSNDYIEALESGSTIIRIGSKLFK >gi|261746768|gb|ADAD01000174.1| GENE 44 35274 - 36425 1194 383 aa, chain - ## HITS:1 COG:FN0560 KEGG:ns NR:ns ## COG: FN0560 COG0635 # Protein_GI_number: 19703895 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Fusobacterium nucleatum # 29 383 8 365 365 317 49.0 2e-86 METIKEFVKIKSEQKKVKIEEQKNIDALYLHIPFCEKKCEYCDFCTFINMSREYEKYTKA LIKELKMYPEYEYDTVYFGGGTPSLLPVEMTAEIMSNIRYKENSEITLELNPNDMTAEKL KELRKTGINRLSIGIQSFQNHVLKFIGRLHSGEDAVRVFRDARKAGFENISIDLMFGIPN QSFEDLKKDLENILLLSPDNISIYSLIWEEGTVFWSKLKKGILSEMDQDLEAEMYEEIID FLKKNGYSHYEISNFSKKGRGGIHNLKYWRNKEFIGVGLSAATYFNGKRYSNVRTFNKYY KSLEENILPVDEKSVEIIDETEKEKLKNMLGLRLTEEGIKYFQNEKVESLLERGLLERFD SGKRMRLTRKGILLANEVFVEFI >gi|261746768|gb|ADAD01000174.1| GENE 45 36429 - 37121 775 230 aa, chain - ## HITS:1 COG:Cj0085c KEGG:ns NR:ns ## COG: Cj0085c COG1794 # Protein_GI_number: 15791473 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Aspartate racemase # Organism: Campylobacter jejuni # 1 230 1 230 231 248 57.0 9e-66 MKTIGLIGGMSWESTELYYRIINETVKENLGGLHSAKCLLYSVNFEEIEKYQSTDQWEKS AEVLTEIAKKLEDAGADFIGICTNTMHKTVPYMEKSINIPILHIAEATVEEIKAQNIKKV ALLGTKYTMIEDFYKEKILESGIEVIIPNEDDIKIINDIIYDELCLGKVEFKSKEKYLKI 
IEKMKNQGVKGVILGCTEIGLLISQKDVDIKVFDTTKIHAKKIALNVIKN >gi|261746768|gb|ADAD01000174.1| GENE 46 37155 - 38873 2798 572 aa, chain - ## HITS:1 COG:FN0559 KEGG:ns NR:ns ## COG: FN0559 COG1109 # Protein_GI_number: 19703894 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Fusobacterium nucleatum # 3 567 20 574 580 591 53.0 1e-168 MEFMEKYEYWLNSDSVDEKDKEELRSLKDNPKEIEDRFFKDLSFGTGGIRGVRGIGTNRI NKYVIRKATQGLANYMLKYNEKEAREKGIIIAHDCRIGSREYALNTARVMTANGIKAYIY PDLRSTPELSFGVRYKGCLAGIVVTASHNPVEYNGYKVYWEDGAQVVDPHATGIVEEVNK IKTLEEIKVMCEKEAREKGLIIELDGKIDDDYLAEIKKQTLKTNIPGKENFKIVYTPLHG TGGRPMKRILSDFGYSFEVVKEQIEPDGNFPTVVYANPEEVAAFKLGVKLADEIGAKLVM ANDPDADRIGIAVKDDSDNWYYPNGNQMGLLLLQYLLNNKKDIPANAKVITTIVSTPMID VVAPAKNVGVMKTLTGFKYIGEKIREFETGKLDGSYLFGFEESYGYLIGTHARDKDALVT SMVIAEMAAYYNSIGSSIYKELQKLYKEFGYYLEGIKSVTLKGKDGIEKMTALMSDLREN IKDTLIGKKIKIKRDFDSHKEYNLETGEEKEVKLPKENVLQFVLEDNTFITARPSGTEPK IKFYFSVNADSDEKVKEKLENTMEEFLKILKL >gi|261746768|gb|ADAD01000174.1| GENE 47 39032 - 39391 616 119 aa, chain - ## HITS:1 COG:FN1270 KEGG:ns NR:ns ## COG: FN1270 COG0718 # Protein_GI_number: 19704605 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 30 119 1 88 88 70 48.0 7e-13 MVRKLKGANSGKSGSQQDIIKQAQVMQQEMLKIQEGLKDKFVETSVAGGGITVKANGQKK IVDLSISMDILKDAVDEGDTSIVSDVIINAINEILEKAEEMAEKEMEVITGGVSIPGLF >gi|261746768|gb|ADAD01000174.1| GENE 48 39464 - 39601 85 45 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVVREGFEPSKAEPADLQSAPFGHSGTSPHSLKVVPLVGLEPTTY >gi|261746768|gb|ADAD01000174.1| GENE 49 39651 - 39842 287 63 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAEVTRLELATSCVTGKHSNQLSYTSINGGHNWARTNDPLLVRQVLSQLSYATDISVFLN GTP >gi|261746768|gb|ADAD01000174.1| GENE 50 40087 - 41874 2063 595 aa, chain + ## HITS:1 COG:FN0873 KEGG:ns NR:ns ## COG: FN0873 COG0616 # Protein_GI_number: 19704208 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, 
and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Fusobacterium nucleatum # 85 588 11 487 494 290 37.0 4e-78 MKFFDFIKRFLLFTIKEIYSFFLKLSLLFFVFFILLSSLIGYIISKAKDEDSISKNYNYV LLNVSSISEDKLDTSLFGEAKKYNISYMDVLNSLEDIKNNDNIKGVIINLDQTNISSVKS EEISKKLQEIKNKNKKVYAFGAYMDNNNYPLASVANEIIMVPSASGSVSLAGYHYSDLYY KKLLSNVGVDMEVVRIGDFKSYGENYTSDTMSSGLRNELTRILESRFNSFLDKVSKARRL DKNKLNADILNGDNTNLTPSAARDKNFVDTLEYFNDLMTKLQINEDNIVDIYDYYADNGK RIEEQNQDKGTIAVIFAEGPIVYNEEAQGIYISPDNMAEKLKELSKIKDLKGVVLRVNSP GGSALASEMIYQMLSKINVPVYVSMSEVAASGGYYISMSGKKVFANDATITGSIGVVSMF PKFYNAQNKYGVTSNSISKGKYTDTFDPFVPLSAESRNKIIESMNATYDEFKSRVSKNRN MAPQVLENYAQGKIWLGSEAKKINLVDGIATLDETVKTLARDLNLGDNYRVENIYAKKDF KETLKLLSSYIFEKFQLTSQLETKLPGSSKIMDEYKLIEQNKNKPMYYLPYQIKF >gi|261746768|gb|ADAD01000174.1| GENE 51 42071 - 42880 1282 269 aa, chain - ## HITS:1 COG:FN0391 KEGG:ns NR:ns ## COG: FN0391 COG0561 # Protein_GI_number: 19703733 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Fusobacterium nucleatum # 2 268 3 263 267 179 43.0 5e-45 MYKAVVTDLDGTLLNDEHKVSEFTKETVRKIIDKGIKFYIATGRNYGLAKVVKDELGINI PLISSNGARINDEDGKVIYEDGLGRKEIDAILSIDYKSFGKDIHLNIFSGDDWIITKGTL EEVIKRDGETFPLDMIEVPENELGKREILKFFYIGKHENLIELEKEILKKTDNNVSVVFV SDDCMEVFSKTANKANAAKFLLKRDGINSEETVSFGDGENDYELLTTMGKGHAMGNAIYR LKNLLPKDFEIIGKNSDDGEAKKLIELFL >gi|261746768|gb|ADAD01000174.1| GENE 52 42908 - 43270 630 120 aa, chain - ## HITS:1 COG:CAC2218 KEGG:ns NR:ns ## COG: CAC2218 COG0784 # Protein_GI_number: 15895486 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Clostridium acetobutylicum # 6 117 5 116 119 91 44.0 4e-19 MTKTALVADDASYIREDIKDILEDQGYEVYEAADGMEAFEMYKKVKPTIVTMDINMPRVH GIKATQLITDYDPQAKVMLCSTMITFQNYIKMGKEAGAKGFLSKPFTDEEFMNELSKLFL >gi|261746768|gb|ADAD01000174.1| GENE 53 43291 - 44094 1235 267 aa, chain - ## HITS:1 COG:PM1629 KEGG:ns NR:ns ## COG: PM1629 COG0561 # Protein_GI_number: 15603494 # Func_class: R General 
function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Pasteurella multocida # 2 266 6 269 272 167 37.0 2e-41 MYKAVISDLDGTLLDENHQINDFTIETVKKVVEKGIKFYIATGRSYFGAKEIMDKINLKI PLITSNGARIMDSDGNEIYINNIEKKYLDEIYKVDYKFVGQNIILNGYSGSNWYIVEDVV DYYRKKRPDRTFLPEVVSWKEFMEKEYTKIFFLGPHDELLKLEKILKKVTNEEVNAVFVS EGSLEVFDKNSDKAVAAKYLLEKDGIKLSETVAFGDGYNDYELLKEAGKGYLMGNSLYRL LEGLPDYEVIGTNENNGEAEKLRELFL >gi|261746768|gb|ADAD01000174.1| GENE 54 44246 - 44719 701 157 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0281 NR:ns ## KEGG: Lebu_0281 # Name: not_defined # Def: thiol-disulfide isomerase and thioredoxins # Organism: L.buccalis # Pathway: not_defined # 1 157 1 158 158 165 57.0 6e-40 MSTFQDYLKFSDNEEYSQKQLKIIDKITLSEETEKAVKSINKEIKILCLAQVYCPDCRAI VPFMKKFSELNSNIKVNYLPREGNEELLKKLTETVKIPTLVYEKENNMSVFLLEFPNVVQ KAMKENPENYDEIKYNFRTGKYNQEIEKELVNYLISL >gi|261746768|gb|ADAD01000174.1| GENE 55 44795 - 45544 744 249 aa, chain - ## HITS:1 COG:CAC0306 KEGG:ns NR:ns ## COG: CAC0306 COG4123 # Protein_GI_number: 15893598 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Clostridium acetobutylicum # 22 236 6 215 244 77 26.0 2e-14 MGSLKNNKKDGFLKESYEIEAVDSYTERLESVDKIMTVSEIGLKITEDSLLLANFIKNKL EKKSGKSNRTFFHNRNMLEIGAGQGIISLLVSGLPNISKIYAVEVQKEVFGNLIGNVEKN SLQSKILVLNEDIRNVVGEYDYIFSNPPYKKVNSGKLPQDKTEAISKYEVLLTLEELFFN TKRLLKNYGEFFVIVPEERLNDSFRYIYKNELQILSLNINQYKKKKLIIIHGKKGGKINS GIEIEYNLK >gi|261746768|gb|ADAD01000174.1| GENE 56 45547 - 46479 1112 310 aa, chain - ## HITS:1 COG:FN0908 KEGG:ns NR:ns ## COG: FN0908 COG1774 # Protein_GI_number: 19704243 # Func_class: S Function unknown # Function: Uncharacterized homolog of PSP1 # Organism: Fusobacterium nucleatum # 33 293 29 291 312 244 50.0 1e-64 MHLENMGKRYTGNENELDSGIINKINEDRTIDVYFETFQKRYFFLDNPDFRVKINDKVLV ETQMGLGIGKVIGIKSGRVEEKESLKRIIRVADDKDLEKNKELKEDSVKAGFIFRNKLKK YNLNLKLVATEYTFDKKKLIFYFASEERVDFRDFVKDLAAIFKVRIELRQIGVRDYAKMV GDCGSCGKTLCCKSFINKFDSVSIKMARDQGVVVTPSKITGVCGRLKCCIGFENDQYNEV 
RENYPAVGQTVTTEEGKGSVISMNILNDLIFVNIHGKGIGRYGLAEIQFDAEEKREIEKR QNCCNFIEGN >gi|261746768|gb|ADAD01000174.1| GENE 57 46442 - 46594 114 50 aa, chain - ## HITS:1 COG:no KEGG:Lebu_2142 NR:ns ## KEGG: Lebu_2142 # Name: not_defined # Def: PSP1 domain protein # Organism: L.buccalis # Pathway: not_defined # 1 50 119 168 168 82 90.0 6e-15 FTAENRLDFRELVKEVNKTFKKRVEFYQIKQNDEGRILSAFGKYGKEIYW Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:42:10 2011 Seq name: gi|261746765|gb|ADAD01000175.1| Leptotrichia goodfellowii F0264 contig00156, whole genome shotgun sequence Length of sequence - 651 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 35 - 277 421 ## FN2049 hypothetical protein - Prom 298 - 357 7.1 2 2 Tu 1 . - CDS 519 - 650 97 ## Predicted protein(s) >gi|261746765|gb|ADAD01000175.1| GENE 1 35 - 277 421 80 aa, chain - ## HITS:1 COG:no KEGG:FN2049 NR:ns ## KEGG: FN2049 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 80 1 80 82 73 55.0 3e-12 MKKIAIALGLGALAVSCTNAKLVNYNTDRLDNIEAYLKENKFVKPSDNLEKLKNEGQIEY TTQYKSLEREADAWKESQEQ >gi|261746765|gb|ADAD01000175.1| GENE 2 519 - 650 97 43 aa, chain - ## HITS:0 COG:no KEGG:no NR:no DVEIQRIKKRVDEINNNIETFHKTNEALEKMEERLEGIQSKMK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:42:33 2011 Seq name: gi|261746705|gb|ADAD01000176.1| Leptotrichia goodfellowii F0264 contig00192, whole genome shotgun sequence Length of sequence - 54720 bp Number of predicted genes - 57, with homology - 57 Number of transcription units - 19, operones - 14 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.600 + CDS 3 - 200 353 ## COG0125 Thymidylate kinase 2 1 Op 2 . + CDS 226 - 1464 1687 ## COG0124 Histidyl-tRNA synthetase 3 1 Op 3 . 
+ CDS 1502 - 2104 604 ## gi|262039223|ref|ZP_06012541.1| putative appr-1-p processing domain-containing protein 4 1 Op 4 . + CDS 2124 - 3155 1043 ## COG2110 Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 5 1 Op 5 . + CDS 3170 - 4945 2955 ## COG0173 Aspartyl-tRNA synthetase 6 1 Op 6 . + CDS 4965 - 5186 315 ## Lebu_1872 hypothetical protein + Term 5229 - 5287 0.9 7 2 Op 1 5/0.200 + CDS 5546 - 7612 1921 ## COG3711 Transcriptional antiterminator 8 2 Op 2 2/0.200 + CDS 7636 - 8091 591 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 9 2 Op 3 . + CDS 8116 - 8379 439 ## COG1925 Phosphotransferase system, HPr-related proteins 10 2 Op 4 10/0.000 + CDS 8409 - 8684 479 ## COG3414 Phosphotransferase system, galactitol-specific IIB component 11 2 Op 5 . + CDS 8717 - 10048 2057 ## COG3775 Phosphotransferase system, galactitol-specific IIC component 12 2 Op 6 . + CDS 10076 - 10735 1020 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 13 2 Op 7 10/0.000 + CDS 10753 - 11760 1469 ## COG2376 Dihydroxyacetone kinase 14 2 Op 8 . + CDS 11776 - 12405 899 ## COG2376 Dihydroxyacetone kinase + Term 12427 - 12473 10.8 + Prom 12448 - 12507 7.3 15 3 Op 1 10/0.000 + CDS 12545 - 12853 408 ## COG1440 Phosphotransferase system cellobiose-specific component IIB 16 3 Op 2 . + CDS 12884 - 14221 2147 ## COG1455 Phosphotransferase system cellobiose-specific component IIC + Term 14266 - 14310 3.8 + Prom 14340 - 14399 10.7 17 4 Op 1 35/0.000 + CDS 14488 - 15816 1959 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 15871 - 15907 0.3 + Prom 15822 - 15881 4.3 18 4 Op 2 38/0.000 + CDS 15942 - 16844 1114 ## COG1175 ABC-type sugar transport systems, permease components 19 4 Op 3 . + CDS 16853 - 17689 1197 ## COG0395 ABC-type sugar transport system, permease component 20 4 Op 4 . 
+ CDS 17734 - 18033 271 ## gi|262039264|ref|ZP_06012582.1| hypothetical protein HMPREF0554_2348 21 4 Op 5 4/0.200 + CDS 18033 - 18941 434 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase 22 4 Op 6 . + CDS 18963 - 19877 1493 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase + Prom 19937 - 19996 6.1 23 5 Op 1 2/0.200 + CDS 20048 - 21151 1788 ## COG3839 ABC-type sugar transport systems, ATPase components 24 5 Op 2 . + CDS 21170 - 21982 892 ## COG1737 Transcriptional regulators + Term 22088 - 22124 1.1 + Prom 22070 - 22129 6.3 25 6 Op 1 . + CDS 22186 - 22602 364 ## gi|262039238|ref|ZP_06012556.1| putative membrane protein 26 6 Op 2 . + CDS 22618 - 23268 616 ## Lebu_0727 hypothetical protein 27 6 Op 3 . + CDS 23234 - 23650 159 ## gi|262039224|ref|ZP_06012542.1| hypothetical protein HMPREF0554_2355 + Term 23736 - 23776 3.1 - Term 23718 - 23764 1.9 28 7 Op 1 . - CDS 23810 - 24769 1374 ## COG2355 Zn-dependent dipeptidase, microsomal dipeptidase homolog 29 7 Op 2 . - CDS 24785 - 25735 906 ## COG1619 Uncharacterized proteins, homologs of microcin C7 resistance protein MccF 30 7 Op 3 . - CDS 25797 - 27047 1012 ## COG1301 Na+/H+-dicarboxylate symporters - Prom 27075 - 27134 18.4 + Prom 27153 - 27212 9.8 31 8 Tu 1 3/0.200 + CDS 27245 - 27940 1035 ## COG3010 Putative N-acetylmannosamine-6-phosphate epimerase + Prom 27942 - 28001 5.3 32 9 Op 1 . + CDS 28155 - 29723 2417 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 33 9 Op 2 . + CDS 29720 - 30178 700 ## COG2731 Beta-galactosidase, beta subunit + Prom 30265 - 30324 9.7 34 10 Op 1 . + CDS 30383 - 31300 1213 ## Sterm_0363 hypothetical protein + Term 31310 - 31354 1.8 + Prom 31318 - 31377 6.4 35 10 Op 2 . + CDS 31400 - 31762 526 ## Spico_1517 PRD domain-containing protein 36 10 Op 3 . + CDS 31774 - 32124 612 ## CAR_c15350 hypothetical protein 37 10 Op 4 . + CDS 32156 - 33493 2218 ## Spico_1515 hypothetical protein 38 10 Op 5 . 
+ CDS 33512 - 34387 1295 ## COG1735 Predicted metal-dependent hydrolase with the TIM-barrel fold 39 10 Op 6 . + CDS 34402 - 35637 1508 ## COG1680 Beta-lactamase class C and other penicillin binding proteins 40 10 Op 7 . + CDS 35657 - 36760 1455 ## Sterm_0357 hypothetical protein 41 10 Op 8 2/0.200 + CDS 36767 - 37930 1476 ## COG3457 Predicted amino acid racemase 42 10 Op 9 . + CDS 37953 - 39167 1898 ## COG1015 Phosphopentomutase + Term 39337 - 39383 2.0 43 11 Tu 1 . - CDS 39394 - 40023 647 ## COG2910 Putative NADH-flavin reductase - Prom 40049 - 40108 14.6 + Prom 40093 - 40152 9.6 44 12 Tu 1 . + CDS 40187 - 40537 404 ## COG1733 Predicted transcriptional regulators - Term 40557 - 40612 -0.6 45 13 Op 1 . - CDS 40640 - 41167 631 ## COG2249 Putative NADPH-quinone reductase (modulator of drug activity B) 46 13 Op 2 . - CDS 41172 - 41801 992 ## COG2910 Putative NADH-flavin reductase - Prom 41925 - 41984 15.7 + Prom 41925 - 41984 13.0 47 14 Tu 1 . + CDS 42047 - 42385 304 ## COG1733 Predicted transcriptional regulators + Prom 42407 - 42466 13.9 48 15 Op 1 . + CDS 42515 - 44332 2607 ## COG2461 Uncharacterized conserved protein 49 15 Op 2 . + CDS 44397 - 44972 751 ## COG2096 Uncharacterized conserved protein + Term 44991 - 45049 9.7 + Prom 45066 - 45125 9.4 50 16 Op 1 12/0.000 + CDS 45195 - 46163 1152 ## COG1125 ABC-type proline/glycine betaine transport systems, ATPase components 51 16 Op 2 . + CDS 46156 - 47673 1865 ## COG1732 Periplasmic glycine betaine/choline-binding (lipo)protein of an ABC-type transport system (osmoprotectant binding protein) + Term 47775 - 47808 1.4 + Prom 48026 - 48085 8.3 52 17 Op 1 1/0.600 + CDS 48196 - 50331 1656 ## PROTEIN SUPPORTED gi|62291006|ref|YP_222799.1| polynucleotide phosphorylase/polyadenylase 53 17 Op 2 . + CDS 50315 - 50860 275 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase 54 17 Op 3 . 
+ CDS 50889 - 51158 256 ## Lebu_1454 hypothetical protein + Term 51165 - 51219 7.1 55 18 Op 1 . + CDS 51234 - 52781 1726 ## COG0232 dGTP triphosphohydrolase 56 18 Op 2 . + CDS 52832 - 53656 987 ## Sterm_1982 TrmB family transcriptional regulator + Prom 54041 - 54100 10.7 57 19 Tu 1 . + CDS 54149 - 54719 998 ## COG0058 Glucan phosphorylase Predicted protein(s) >gi|261746705|gb|ADAD01000176.1| GENE 1 3 - 200 353 65 aa, chain + ## HITS:1 COG:FN1323 KEGG:ns NR:ns ## COG: FN1323 COG0125 # Protein_GI_number: 19704658 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate kinase # Organism: Fusobacterium nucleatum # 2 61 160 219 225 59 53.0 2e-09 INKITGEEKKDIHESDKEYLKNAYNLAKELAEKYRWIIISCVKNGKLRTIEEINDEITEK ILYNI >gi|261746705|gb|ADAD01000176.1| GENE 2 226 - 1464 1687 412 aa, chain + ## HITS:1 COG:FN0298 KEGG:ns NR:ns ## COG: FN0298 COG0124 # Protein_GI_number: 19703643 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Histidyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 407 3 409 413 375 49.0 1e-104 MINVLKGMKDRYFDDVKKYDFIVDTAKKVFTKYGFERIITPILEETELFRRSVGDETDVV SKEMYDFKDKGERNVTLRPEGTAGVVRAYLEAGLHKSDPVVKWFYNGPMYRYEAPQKGRY REFHQIGAEIFGIRSPYLDAEVIKMGCEFLEKLGITGLTVELNSLGNLESRKKYIDDLKE FMAQRLDKLSEDSKKRYEKNPLRALDSKDKGDQEQFKDAPKLYDYLDEESKKYFEDTKRY LEILGVKYVENPKLVRGLDYYSDTVFEIKSDKLGAQATVLAGGRYDRLLEILDNVKIPAI GFAAGMERAAMLMDESLLEQKEEKIYVVYFDETKEYFIKTVEELRKKGIKVNFDYNPKSF SAQMKKANKINAEYVLILGEEEQKENVVTLKRFSTGEQEKYNLKEVIEILNK >gi|261746705|gb|ADAD01000176.1| GENE 3 1502 - 2104 604 200 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039223|ref|ZP_06012541.1| ## NR: gi|262039223|ref|ZP_06012541.1| putative appr-1-p processing domain-containing protein [Leptotrichia goodfellowii F0264] putative appr-1-p processing domain-containing protein [Leptotrichia goodfellowii F0264] # 1 200 1 200 200 318 100.0 2e-85 MWIHHLTHIDNLESILERGLISRNQLKQFNIKFNDTADKRIIGERNDLNNYIPFHINYLQ KKYSIPYNWKVLKNQSSENMIFLNYNIDDFSNNELDFYLYHPISKSQNNAIKINDLMGFK 
EKLLKEEKKLQNDMGYLDFSDNKTQQFLMSEVLIKDRIYLSEKWKIGVFSFAQKQIVEKK LEKNNLKIEVFIDDKRYFYR >gi|261746705|gb|ADAD01000176.1| GENE 4 2124 - 3155 1043 343 aa, chain + ## HITS:1 COG:MT0066 KEGG:ns NR:ns ## COG: MT0066 COG2110 # Protein_GI_number: 15839437 # Func_class: R General function prediction only # Function: Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 # Organism: Mycobacterium tuberculosis CDC1551 # 6 208 7 229 352 126 34.0 6e-29 MEIRQGNIFESETQALINPVNTQGIMGKGLAYQFKIKYPNNFQNYEQKCLIKEVDVGKDL IYTEEKDKIIINFPTKKNWKEKSKIEYIEIGLKKLEELLEKLRIKSVSIPPIGAGNGKLD WKLVKNEIEKFEKRVSKSTNVIIYEPAISESKLSKGHYLIAYTLVKAEEKGIKKEITDLV LQKLIYLGDRNNYFKFKEESEGPFSKLISIQYQKLKEYSKLNNKKLVEIEKELLKEEITK NLQKEKENIENAIQIYINMKNFYSISVKEEIENKVELLSTVIYILKNKKDDEISNNNIYE EMKSWNKRKDKKYSLEDIGQMFDFLEKEKILKKDIFNKYIMKY >gi|261746705|gb|ADAD01000176.1| GENE 5 3170 - 4945 2955 591 aa, chain + ## HITS:1 COG:FN0299 KEGG:ns NR:ns ## COG: FN0299 COG0173 # Protein_GI_number: 19703644 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 580 2 583 592 704 62.0 0 MYRNYKLNELRIENVGEEVTLSGWISKIRDKGHFVFIDLRDRYGVTQIFINEEVSGKELQ EKVKKFKNEWVIKVKGVVVERSSKNPNIPTGDIEIQAKEIEVLSQSKQLPFEIDETGNLN ENIRLTYRYLDIRRPKMLNNIIKRNDMLFSIRKFMNENGFLDVDTPILAKATPEGARDFV VPSRINKGDFYALPQSPQLFKQILMVAGIDKYYQLAKCFRDEDLRADRQPEFTQLDVEMS FVEQEDVLNVMEKLAKKVFEDVTGEKVTETFERMTYDDAMNNYGSDKPDLRFDMKLIDLS KEVENCGFGVFENAVKEGGNVKALVAPKGEIFSRKYIKDLEDYVKTYFKAKGLAYIKIDE NGEINSPIAKFFTEETMQKIIQKLNIKNNETALILADKYKTVHDGLGALRLKLGEELGLI DKDAYKFLWVVDFPMFEWSEEENRYKAQHHPFTSIKVEDRKYLDTNELDKIKTDSYDMVL NGYEIGGGSIRIHDEELQEKVFEKLGLGKEEQQEKFGFFLEVLKYGVPPHGGLAFGIDRW LMAMLKENSIKEVIPFPKTNKGQDLMTGAPAGVEEKVLEEDLNLKLLEIKE >gi|261746705|gb|ADAD01000176.1| GENE 6 4965 - 5186 315 73 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1872 NR:ns ## KEGG: Lebu_1872 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 69 1 69 70 98 59.0 1e-19 
MFIVFGTKTITKDMGVVGTYECSRCHNVSEWRFMQYRHWFTLFFIPVIPVSGKHEYLQCP VCSQAYSVPKDGE >gi|261746705|gb|ADAD01000176.1| GENE 7 5546 - 7612 1921 688 aa, chain + ## HITS:1 COG:BH0220_1 KEGG:ns NR:ns ## COG: BH0220_1 COG3711 # Protein_GI_number: 15612783 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus halodurans # 2 489 3 497 556 112 23.0 2e-24 MLNNREIKILDIFSLKEEVSIKELKESFNISERMVRYDIEKINFVLSIFNLKPIYRSKTG FFKLDKNNKINKVKLIIEETEPITIEKRKKLIELSLATSRKKINISTLMDKFDTSRITIN NDLNNIKKEFLRKNIRIDNRKGLSLEGELSDLIKVRVEKISESLKYLKNKKNKTNYNLKI EEIFKENLNIKDINILSKFLQDVLNKLNFSATDENYKLLLSYIILLQSENYHEEIIKFIH IDASMIKDWEEYKIIKSQVENSKLGEIFNENDIFIITDLIVGMVSYNKTSQFYENWIDID VLVKSLIENVNNKLDVDISEDKLLFQYLRQHLKPLFYRIKNDYSLNEKFVQDLKFSKNSL FYLIKDSLEILSNILKKDIPDEEIFLLMLHFQASIERAENNVSNIKRIIIVSTLGYGISQ ILADSISSLFNVEIVSVGPYFKIKEMLQKNKNIDYIITTIDIDNKASLFIPVIKVNPIFT FEDKKKMLELGFLPNNKKILMSEILQIIGEESTILNKKKLIKKLENKMKGKIINDISDEN KVENMISKDNIIFDYKAENIEKAIKYTCELLEREGSIEKSYTENVIDIFKNHSSYLLIHN GIILPHARNTGNVFKTSGILLALQNPVMFNETEVKYIFTFAIKDKNSELNKVSKIINQVF KDDLIKILSTKNVEKTFRYFLGGVRENI >gi|261746705|gb|ADAD01000176.1| GENE 8 7636 - 8091 591 151 aa, chain + ## HITS:1 COG:lin0503 KEGG:ns NR:ns ## COG: lin0503 COG1762 # Protein_GI_number: 16799578 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Listeria innocua # 3 148 5 149 155 86 34.0 2e-17 MKDYFKEENVFLKFEAENRDEFFSKIGEILLEKKYVKEDFVDSIKEREKNYPTGLNFGEY NIAIPHTNPEFVNEEGIVTVRLKNPVIFRDMGMDENDLEVLLVFVLLIKKGEEQVNTLMK LMSLLEKKDIYEKLIKAEEKDDILKILKKNF >gi|261746705|gb|ADAD01000176.1| GENE 9 8116 - 8379 439 87 aa, chain + ## HITS:1 COG:Z4879 KEGG:ns NR:ns ## COG: Z4879 COG1925 # Protein_GI_number: 15804017 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Escherichia coli O157:H7 EDL933 # 1 87 1 87 89 65 41.0 2e-11 
MVMKTLEVTNKTGLHARPVSQIIKLISKYKSKTVFKVGNKEVKNKSVLEFLKLGAKYGSK VEIEIDGEDENELLKELEYLFYSNFGE >gi|261746705|gb|ADAD01000176.1| GENE 10 8409 - 8684 479 91 aa, chain + ## HITS:1 COG:lin2201 KEGG:ns NR:ns ## COG: lin2201 COG3414 # Protein_GI_number: 16801266 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIB component # Organism: Listeria innocua # 1 91 1 91 91 100 62.0 5e-22 MKRIVVACGSGVATSQTVASKINNMLEDEGISANVEAVDIKSLDSIIDQVDVYVTIVPGS KTYDKPMINGIKFLTGMGMAEEFEKLKKLLK >gi|261746705|gb|ADAD01000176.1| GENE 11 8717 - 10048 2057 443 aa, chain + ## HITS:1 COG:lin2200 KEGG:ns NR:ns ## COG: lin2200 COG3775 # Protein_GI_number: 16801265 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIC component # Organism: Listeria innocua # 4 443 6 441 449 404 51.0 1e-112 MFGNILNYILGLGAAIFLPIIMILIGLGIKMKFKRAVTSGLTLGVAFTGMNVILGFMFDT ISPVAGAFVEKTGIQLNIIDVGWSPMSAIAWAWPYALFMFPLQIGINLLMLVFKQTDILN VDLWNVWGKIFTATMVAAITGNIALGFVAAAIQVVAELKIGEATQKRTQEITGIPGVTCT HYMVLQCVIMEPVNKLLDYIPIFKKENSNADKLKDKIGIFGENSVMGFIIGGLMAILAGY TVKDTLNVAIKVGTALVLFPMVAKLFMQALAPIADAASEFMKSKFKDREIYIGLDWPFMA GQSELWVVAILLVPIELLLAVVFSKMGISNVLPLAGIVNIIVVVPAMIVSKKNLLKMLIL SIIYTPIYLLVSTSFAPYATELAQKTNAIKIPAGQMITYFGVEAPFFRWAIANGLAMRIP GIIALVLFFGLFYMFVKNMKKQS >gi|261746705|gb|ADAD01000176.1| GENE 12 10076 - 10735 1020 219 aa, chain + ## HITS:1 COG:STM2974 KEGG:ns NR:ns ## COG: STM2974 COG0235 # Protein_GI_number: 16766279 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Salmonella typhimurium LT2 # 5 193 6 187 215 120 37.0 2e-27 MLENLKKEVIKAAQEGQRLDLCKHKSGNFSIRDNKTGYVVVTPSGVNREELTVDDICVLT VDGEIIEVKNGRKPSSETMMHLEIYKTREDIKAVAHTHSKMATSFAVLNKPIPAIIYEVV TFGLKDAVVPVAPYARPGTVELAKSVIEPVKRADIFLLEKHGVVACGSDIYEAFLKAQYV EELAEIYYYTLLINKGNEPESFSAEELESWKYPEKLDKK >gi|261746705|gb|ADAD01000176.1| GENE 13 10753 - 11760 1469 335 
aa, chain + ## HITS:1 COG:mll7280 KEGG:ns NR:ns ## COG: mll7280 COG2376 # Protein_GI_number: 13476064 # Func_class: G Carbohydrate transport and metabolism # Function: Dihydroxyacetone kinase # Organism: Mesorhizobium loti # 2 321 6 325 337 290 46.0 3e-78 MKKIINNAENFVEETVRGIVAAHGDKVELLNDDFRILIRKDNPKKDKVGIVTGGGSGHLP TFLGYVGEGMLDGCTIGNVFASPSSQKMFDMIKACDFGKGVLCLYGNYGGDKMNFNMACE LAEFEDIKTENVLVKDDVASASFDKKDTRRGVAGMLYAYKIAGAAADEGYELEKVADITR RALENIKTIGVALSPCVVPEVGKPTFNIEENKIEIGMGIHGEKGIEVRDMMTANEIADLI FDKLNEELNLQENEEVSVMVNGLGATPLEELYIIYNRIHEILKKKNINIIKPHIGEFATS MEMAGLSITIFKLEEGTKQLLLKTAITPFYTNINK >gi|261746705|gb|ADAD01000176.1| GENE 14 11776 - 12405 899 209 aa, chain + ## HITS:1 COG:FN1841 KEGG:ns NR:ns ## COG: FN1841 COG2376 # Protein_GI_number: 19705146 # Func_class: G Carbohydrate transport and metabolism # Function: Dihydroxyacetone kinase # Organism: Fusobacterium nucleatum # 8 192 4 188 202 93 31.0 2e-19 MNLYDVKQIISNISILMKENKDYLLELDSKFGDGDLGISMVQGFDGINEFTEKSEEKDLG KLFFGMSRVFNEMAPSSLGTIISFWMLGIASNLKGKEEASKEEIAKALEAGISKIKERAG SKENEKTILDALVPATEKFEKAIKSDKTISEILKETYDKAAKGSENTKEMKAVHGRAAYH DEKTLGHIDGGAYVGKLIFEGIYNFYKTK >gi|261746705|gb|ADAD01000176.1| GENE 15 12545 - 12853 408 102 aa, chain + ## HITS:1 COG:VC1281 KEGG:ns NR:ns ## COG: VC1281 COG1440 # Protein_GI_number: 15641294 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Vibrio cholerae # 1 101 1 101 101 104 58.0 4e-23 MKKILLCCAAGMSTSLLVNKMKAASESKGMEVEIWAEPLDKAPEEIPKSDVVLLGPQVKY ALPELKKIADEHGKKIEAINMTDYGMMNGAKVLEAALKLLND >gi|261746705|gb|ADAD01000176.1| GENE 16 12884 - 14221 2147 445 aa, chain + ## HITS:1 COG:BS_licC KEGG:ns NR:ns ## COG: BS_licC COG1455 # Protein_GI_number: 16080909 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Bacillus subtilis # 5 444 1 440 452 441 50.0 1e-123 
MANFMDKFTKALEEKLMPIAAKVANQRHLVAIRDGIIITLPFVIVGSIFLILANLPIPMI AKFYETPLGGMIQRWLSYPVSVTFGLLAIITCLGIGYRLSSSYKLDGISGAILGLVSFLL VTPFEIVFTHEKYGELSTSGIPLGLMGSAGLFVAMIMAILSVEIFRIVVKRNLVIKMPDM VPPAVSKSFAAMIPGFFIVTAALLIRIGFESTPFKSIHNVVSMILTKPLTAVGGSYFGAI LITLLIHLLWTCGLHGANIILGIVDPALYVLMDQNRAAFEAGMRGSQLPNVVTRQFFDVF QSMGGSGATLSFVVMMLFWAKSKQLKEIGKLSIAPGLFNINEPVLFGLPIVMNPMLIIPF ILAPLACVTISYWSMKLGIVAKPTGIAIPWTTPPVIAGWLVTGDYKGGVVQIVNFFVTGI IYYPFFKMWDKKKLEEEKETDVSED >gi|261746705|gb|ADAD01000176.1| GENE 17 14488 - 15816 1959 442 aa, chain + ## HITS:1 COG:SPy0252 KEGG:ns NR:ns ## COG: SPy0252 COG1653 # Protein_GI_number: 15674433 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Streptococcus pyogenes M1 GAS # 19 442 19 439 439 303 40.0 4e-82 MNKNFMKKILIGVLFAFVLTACGGGKKESSDEKKLDPNQKVTIKYWSFPNFTSDGEFKTP EEYDAALIKAFEEKNPNIKVEYQKIDFTDGPAKIETAIQAKSNPDIVIDAPGRIIDWAKK GYLAPFEKADVSKYSKVIADASSFENKLYLYPLGTAPFVMAVNKKLTDKLGVTDLLPLNK PGRNWTVEEFEKFLKAVKAKDSSVDPVLFYTKSQAGDQGPRAFVSNLYNSWITDEGVTKY TINDENGVKSLTWIKKAYDEGLLGKGVSAEAKDALEAFRSGKAVATILYSPGLKGQDKEA ITKGDLEPVFLPFPNESGQAKFEFLLAGAAVFDNGDPAKVAAAQKFVDFIVNDPVWGERS LKATNNFSPSGKTGIYGDDSEIKYLESLVGFYGPYYNTIDGYAQMRPLWFNMVQAILNGQ TAPKAGLDKFAEDANKTISDAK >gi|261746705|gb|ADAD01000176.1| GENE 18 15942 - 16844 1114 300 aa, chain + ## HITS:1 COG:SPy0254 KEGG:ns NR:ns ## COG: SPy0254 COG1175 # Protein_GI_number: 15674434 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Streptococcus pyogenes M1 GAS # 20 300 39 319 320 287 55.0 2e-77 MDNSIAGKYKWNWKKVDYSAYLFILPVLVFFLSFVLYPMIKGIHLSLYRFRGRNTTFVGL KHYADLLKNDIFLKSATNTVFITFVALPVVVVFSIFVAYVIYEKNATVRSFFRGVFYIPA ISSVVSITVVWNWIYHPKYGILNYVFQKAHLISKPVDWLGDPKTAIFAIIAILITTSVGQ PIILYVAALGNVPKDLTEAARIDGANEWQTFRNITWPLVMPTTLYIVVVTTINSFQIFAL IQLLTHGGPNYSTSTVMYLVYQTAIVEGRFGVSSAMGIILAVIIGIISVLQFKFLSKDID >gi|261746705|gb|ADAD01000176.1| GENE 19 16853 - 17689 1197 278 
aa, chain + ## HITS:1 COG:SPy0255 KEGG:ns NR:ns ## COG: SPy0255 COG0395 # Protein_GI_number: 15674435 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Streptococcus pyogenes M1 GAS # 4 278 3 276 276 297 59.0 1e-80 MKNKEKKSIFSIISMVILVMLTIFFIFPFYWIATGAFKVQEVAISIPPEWFPLKPTLENF DKLLIPLTTRWFFNSVFVSLSTTILVCITASLAGYALAKKQFPGAGLIFIVFVAAMALPK QVILIPLLKFVTELGWIDSYKALVLPAVGWPFGVFLMKQFSHSVPNELLESARIDGCGEL KTFINIVLPIIKPGIGALAIFTFIASWNDYFSQLIFTNSEMMKTLPLGVAAMAQSAEFSL NYGLLMAGALLASLPMIIVFLMFQNYFTQGVTMGAVKG >gi|261746705|gb|ADAD01000176.1| GENE 20 17734 - 18033 271 99 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039264|ref|ZP_06012582.1| ## NR: gi|262039264|ref|ZP_06012582.1| hypothetical protein HMPREF0554_2348 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2348 [Leptotrichia goodfellowii F0264] # 1 99 1 99 99 150 100.0 3e-35 MIENFRIMIIGWFYYGILFIIGSIVVTALLNRVFNKLYIPPLIVNAVSVILLFIGLKLNM KNPGYALYFNYIPTVAASVTYNFIIFIVRKLQKRTDVKC >gi|261746705|gb|ADAD01000176.1| GENE 21 18033 - 18941 434 302 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 4 290 6 313 319 171 34 6e-42 MYYICIDIGGTSIKYGVLSETGEQLVNGMINTRVTSKENYILADIKKVIKDILEQYSMYK IEGICVSSAGVVDTDKGEIAYAGPTIPKYTGTKIKEELEKEFSLPCEVENDVNCAGLGEY WKGAGKGSKSMVCLTIGTGIGGSIILDGKLLNGVGYTAGEIGYMDVNGKYIQDIASSKYL VQKVREEKKEKEGIDGKINGLSIFELAKQGDKICIDAIDEMISNLSVGIRNIIYLLNPEI VVIGGGITAQKEYLEEKINKKVNDNMISDTFRKTEIRLAKQGNQAGMLGALYHFLSKRNK IK >gi|261746705|gb|ADAD01000176.1| GENE 22 18963 - 19877 1493 304 aa, chain + ## HITS:1 COG:SP1676 KEGG:ns NR:ns ## COG: SP1676 COG0329 # Protein_GI_number: 15901511 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Streptococcus pneumoniae TIGR4 # 4 297 3 305 305 270 45.0 3e-72 
MKRDLGKFKGIFMAMYSAYDDNGNVDKERVKKLARYYADKKVKGLYVGGSSGEGVLQSEE ERKQVVEAVMEEVKGELTIIVHVGANSTAESVRLAQHAEKMGADAVSSIPAVYYRLSPES VKVHWQAMIDSTSLPFIIYHIPQTTGFNLPMCLFEEMAKQEKVIGIKCSSESTFELQQFK AVGGKDFLVFNGPDEQFVAGRAIGADAGIGGTYGVMPELFMKLDEYMKKQEVEKARELQD KVNEIIKGLLSVGSLYGACKYILDLRGIKTGVPRLPLLPITDDKKKEFLKELNEKILKLV EEVK >gi|261746705|gb|ADAD01000176.1| GENE 23 20048 - 21151 1788 367 aa, chain + ## HITS:1 COG:YPO0609 KEGG:ns NR:ns ## COG: YPO0609 COG3839 # Protein_GI_number: 16120935 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Yersinia pestis # 1 365 2 369 372 437 59.0 1e-122 MSGVVLKKVEKQYPNGFKAVHGIDLEIRDGEFMVFVGPSGCAKSTTLRMIAGLEEITGGE IYIGDKLVNDVPPKDRGIAMVFQNYALYPHMTVYQNMAFGLKLKKTPKDEIDRRVREAAE KLEITELLDRKPKEMSGGQRQRVALGRAIVRKPEVFLFDEPLSNLDAKLRVSMRVRITQL HQELKTTMIYVTHDQVEAMTMGDRITVMRAGRIMQVDTPLNLYHYPANKFVAGFIGSPTM NLVDGVLKEKEGKVYIDIDGAEIELSHEKGEKVKGHIGKKVTFGIRPENISVAEHEDSIS KTGEISVVEQMGNEEYIYFTLNGHQMTCRINIEHVGDSVSKKGKRVFRFDTNKAHIFDAE TEENISL >gi|261746705|gb|ADAD01000176.1| GENE 24 21170 - 21982 892 270 aa, chain + ## HITS:1 COG:L192289 KEGG:ns NR:ns ## COG: L192289 COG1737 # Protein_GI_number: 15673158 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Lactococcus lactis # 1 269 1 271 273 197 43.0 1e-50 MKYINIIESFYPSLSKQEKKIADYIFEKKGQISYQSLQEIAKQIKVGEATIVRFVKKIGF NGFQDLKLHIAKEDFPIIETGYEDYIDNIQANINSAITNTKSLIDRKHLDKSIAAIGKAE RLFLYGVGASGIVARELQNKFLRFGKAGIAYTDSHFQIMNAAITTNKDVIIAVSLSGSTN DIVESLEIAKKNKAKIIAITNHILSPVAQLADYVLLTAGRETLLDGGSLIAKISQLYVAD ILCTGYALKNKEEALKMKHNTAQAVLSKNK >gi|261746705|gb|ADAD01000176.1| GENE 25 22186 - 22602 364 138 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039238|ref|ZP_06012556.1| ## NR: gi|262039238|ref|ZP_06012556.1| putative membrane protein [Leptotrichia goodfellowii F0264] putative membrane protein [Leptotrichia goodfellowii F0264] # 1 121 1 121 138 153 100.0 4e-36 MTFERGTKKQMTDYGLIILILSVSFFSWECFKLARLRKTEFKGKVIITIDNIISIFIAII 
LIFRFFEFWNLYIGGNIKESGISDEDYLFSALESIVIVVFYVVKVVFTSVSGNGKVRKCF FIISAIIDIIILKILELL >gi|261746705|gb|ADAD01000176.1| GENE 26 22618 - 23268 616 216 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0727 NR:ns ## KEGG: Lebu_0727 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 148 1 176 297 89 35.0 8e-17 MKIEYSKFDFNRADIFGIVPGNIVYLIYRFIFTLFLKNRTQINFGKVKITVNGLFMTKEI YYDDIKFISVRKNFLNSYDLIINFDETATIFRYLCNYISDTFSSGINKKNTFVICELKNV DEVLKELLANTGFDLEKLNSADKKVIYSGQFHRKYDFSKLKNFQNKEIIKISKNSESIYV IDKFQEDDKSIILDEYSKNILYKLVYGPMPGKYNKL >gi|261746705|gb|ADAD01000176.1| GENE 27 23234 - 23650 159 138 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039224|ref|ZP_06012542.1| ## NR: gi|262039224|ref|ZP_06012542.1| hypothetical protein HMPREF0554_2355 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2355 [Leptotrichia goodfellowii F0264] # 22 138 1 117 117 147 100.0 2e-34 MDRCRESIINYKLESEKIGGYMKEKLLVLFFINVIWQFIKWKYRKIRTVYYLDIVMSAII ILFLLYVIYVWIIFYDIISRGVDKNDLQMLLSLSGLVLFYITKIISYFGKKKFLVSKKIC LIVSIIIELLFNLYPILY >gi|261746705|gb|ADAD01000176.1| GENE 28 23810 - 24769 1374 319 aa, chain - ## HITS:1 COG:PH1026 KEGG:ns NR:ns ## COG: PH1026 COG2355 # Protein_GI_number: 14590865 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent dipeptidase, microsomal dipeptidase homolog # Organism: Pyrococcus horikoshii # 1 318 12 320 320 172 35.0 1e-42 MFFDMHADVWTDNLWEYEKGKNDVIRRKYKDKFIKGGLFGGIFVIYMDAFNTPDVEEVFF KSLRAMSEELYHSRDLVHIIKDYEDFEKAEKQNKFGVILGIEGLPGIGSNLDYLYLLKRT GVRHIGMTWNEENAFATGQRGDVNRGLTDLGIKAVEIIENLGILLDVSHANDKTFWDIAK YSKKPFFASHSNARSLCPSMRNLTDEQILCIGERNGMIGMNSYHGFVSKNEKEKNLDMLL NHMEYIAEKIGLDKVGFGFDFAEYYSTPDDEDEGLQGVHDVTEIGNVKKALEKRGYSKEE IENIAYKNFVSFFERVRGK >gi|261746705|gb|ADAD01000176.1| GENE 29 24785 - 25735 906 316 aa, chain - ## HITS:1 COG:CAC0293 KEGG:ns NR:ns ## COG: CAC0293 COG1619 # Protein_GI_number: 15893585 # Func_class: V Defense mechanisms # Function: Uncharacterized proteins, homologs of microcin C7 
resistance protein MccF # Organism: Clostridium acetobutylicum # 5 315 4 300 306 197 39.0 3e-50 MNFPKPLKKEDNVFLLCPSSPVIAEEDIKKCSEIIENMGYNAVIGKSLYENYGGYMAGSA KIRVDDLHEAFSREDIKGIFCVKGGYSASQLLDKLDYNLIANNPKVFAGYSDITNLHIVF NQKCNLGTYHGPMVKSNMFNDFNEYTEKSFFEAIECSEWKYKEPENKSLSLLNAKNINDS VSHIKGILTGGNLAIIMASLGTPYEIDTKDKILFLEDVDEKIGSLDRMLTHLKYSGKLDD CNGIILGNFADCKNTYETINNQIYELDELLKDFFGNYDKPVIYGMESGHAKPYMATLPLG AECKINIKTKEITFTK >gi|261746705|gb|ADAD01000176.1| GENE 30 25797 - 27047 1012 416 aa, chain - ## HITS:1 COG:CPn0528 KEGG:ns NR:ns ## COG: CPn0528 COG1301 # Protein_GI_number: 15618439 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Chlamydophila pneumoniae CWL029 # 21 400 7 393 414 150 28.0 5e-36 MEKSEINKKSFWKNYSFPILLLTGIAIGSIIGIILGEKATVLKPFGDIFINLMFTAVVPL VFITISSAVGSMINMKRLGMILGYMLLTFVVTGAIASIIIIIAVKIFPPAAGVNISIPTP EAVKPVSLGEQFVKAITVSDFSQLLSKSNMLPLIIFSVLFGTCVSLVDDEKKSIAAGLEK FSKVMMKLIELIMYYAPIGLGAYFAALIGEFGPQLLGSYAKAMIIYYPICIFYFFAAFSV YSFYAGGKNGVKVFFGNILNPSITSLATQSSIATLPVNLEAAKNMGVPKDIREIILPIGA TMHMDGTCLSSILKISFLFGIFGKDFSGIGTYLIAIVVSVLGGVVMSGVPGGGLIGEMLI INLFGFPMEAFPIIATIGFLVDPPATWLNSTGDAVASMVVTRMIEGKNWIFGKKID >gi|261746705|gb|ADAD01000176.1| GENE 31 27245 - 27940 1035 231 aa, chain + ## HITS:1 COG:SPy0251 KEGG:ns NR:ns ## COG: SPy0251 COG3010 # Protein_GI_number: 15674432 # Func_class: G Carbohydrate transport and metabolism # Function: Putative N-acetylmannosamine-6-phosphate epimerase # Organism: Streptococcus pyogenes M1 GAS # 3 231 7 234 234 254 58.0 1e-67 MNKKEIIEKLKGKLIVSCQALPGEPLYIENDTMMPLMAVAAKRAGAAGIRTNGERDVKEI KKTVDLPVIGLIKKEYEGFFQYITVTMKEIDTLVKAGADIIALDCTMRERGDGKTVNGFI KEIKEKYPDIILMADISTFEEGVNAEKAGVDMVGTTLSGYTPYSEKTDGPDFKLVEKLSK ELDIPVIAEGKIHEPQQAKKMIELGAHAVVVGGAITRPLEIAQRFVKAVEN >gi|261746705|gb|ADAD01000176.1| GENE 32 28155 - 29723 2417 522 aa, chain + ## HITS:1 COG:BBB29_1 KEGG:ns NR:ns ## COG: BBB29_1 COG1263 # Protein_GI_number: 11497021 # Func_class: G Carbohydrate transport and metabolism # 
Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Borrelia burgdorferi # 1 428 22 452 469 358 50.0 2e-98 MLPIAVLPMAGILLGVGGSFTNPVLIKTYNLTFLNEGTPLNYLMQLFSNTGLFVFANLPL LFAVGVAIGLANKNKETAALSAVLGFLLFHTIIGTVLSFQGITPDSVTYDALIGKGLSEA AARGTAALYAKELGIFTLQTGVFGGIICGIVASAITNKFSDKVLPDYLAFFSGNRFVPVM TIILFIPVAVIFPFVWPTIFMGIVKAGEMFAATGAVGTFFYGFTMRILNVFGLHHAIYPL FWYTQLGGYQEVAGKMVAGGQNIFFAQLADPSVKHFSAAATKTMTGGFLPMMFGLPAAAL AMYRTADDKNKAAVKGILISAALTSFLTGITEPIEFTFLFVAPALYVIHAILEGLAYMLM YVLNVAVGITFSRGIIDFTFFGLLQGPAKTSYYWILILGPVYAIAYYFIFKTLILKFNIP TPGRGDSENKLYTRKDYNESKGEKEFIDEIVDSLGGIDNIENIDACITRLRVTVKDPSTV SDDVKWKELQAKGVIRSGKGVQIVYGTQAETYKNKIREKYKI >gi|261746705|gb|ADAD01000176.1| GENE 33 29720 - 30178 700 152 aa, chain + ## HITS:1 COG:CAC0836 KEGG:ns NR:ns ## COG: CAC0836 COG2731 # Protein_GI_number: 15894123 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Clostridium acetobutylicum # 8 150 4 152 152 82 33.0 3e-16 MIFTNLNDELQNKSLSEKIQKCIEFTKENNFEEYEPGSYDVPGTDIKVNVGNYDTKPENE CAWESHLKYIDVQIMIKGREYIAFNNIRNLKKTKVDEEKDFIGHEGPELFRVFLEDGDVL ILYPEDGHMPGIKSEESIPVKKIVYKIDVNTV >gi|261746705|gb|ADAD01000176.1| GENE 34 30383 - 31300 1213 305 aa, chain + ## HITS:1 COG:no KEGG:Sterm_0363 NR:ns ## KEGG: Sterm_0363 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 9 305 7 303 303 223 40.0 8e-57 MNNSEEKRLNKKNRSVLFMARELLFSKVGERIAPVTEYSEKYGISVGLIQRAFVLLQNEG AIKLDKRGVLGSFVKEINNEILLEKSDLGFLVGVMPLPYSKRYEGLATGIKNNFQNYNLN YYFAYMSGSGVRLNLLRNGIYDFAVVSKLAYEIERERNGDIESVFEFGAKSYVSKHVLLK APGVKKIVKIGVDRNSEDQKFLTKEFLGDSNYEFVEIDYNETLKLLENKIVDGIIWNYDE IEEKSIKMEYDELPENKILKKANEAVLVISRRNDNLRKLSRKIIDVDYIREIQQKVLTNQ ILPTY >gi|261746705|gb|ADAD01000176.1| GENE 35 31400 - 31762 526 120 aa, chain + ## HITS:1 COG:no KEGG:Spico_1517 NR:ns ## KEGG: Spico_1517 # Name: not_defined # Def: PRD domain-containing protein # Organism: S.coccoides # Pathway: not_defined # 
7 116 6 113 120 68 39.0 8e-11 MEKLKFRLDILKNSEVIDEEIYNKVISLIEHLDKKWEITLTEENGAMFITHLSMALKRIK QNENVNNIDKTVFEEVMVSDKLDKIEEIYEDIEKNIFIEKLPEEEKKYILVNLLLLEEDK >gi|261746705|gb|ADAD01000176.1| GENE 36 31774 - 32124 612 116 aa, chain + ## HITS:1 COG:no KEGG:CAR_c15350 NR:ns ## KEGG: CAR_c15350 # Name: not_defined # Def: hypothetical protein # Organism: Carnobacterium_17-4 # Pathway: not_defined # 2 116 3 118 121 117 56.0 2e-25 MKIVVGGQIDKENVNNLLKKYIPESEIIIKSDIDAAMDVKAGNADYYFGACNTGGGGALA MAIAIIGADKCATLAMPGNILDEEKIKEEVNSGKIAFGFTPQAAEQVIKTVAEYIK >gi|261746705|gb|ADAD01000176.1| GENE 37 32156 - 33493 2218 445 aa, chain + ## HITS:1 COG:no KEGG:Spico_1515 NR:ns ## KEGG: Spico_1515 # Name: not_defined # Def: hypothetical protein # Organism: S.coccoides # Pathway: not_defined # 3 440 4 428 428 414 57.0 1e-114 MEKFLISALLSGFASILANLGVAVFNDGLRPMMPEFLEGRMDRKALAATSFALSFGLVIG FGIPFSIGSTIILIHSILLGTDIIGTSTPRNNVGLALAGIIGALYGVGLVYGLEIVINIF QKMPIDFLPSLAKVGAPIIVGFAVFPALVVGYQYGTKKGAFTLIVTLLVRQIVAVFGKIP VAEKVSITLNPDGMALLVSVVIMLIFAIFDKETEKTNSNEMLVGIFSARVARVKKNIIPL SIMGGLIAAACSLRVVAGDPISLKLLASTLKDTAGGDAKNFEAGLVALARSIGFVPLVAT TAITTGVYGPAGMTLVFVVGIFVKNPIISFVLGAAILAVEILLLELIAKSLDRFPGVKAC GDQIRTAMTKVLEVSLLVGGMIAANEMAGSNGLGFLFVSGFYLLNKTSKKPLVDMAVGPV ATILFGIILNILYLLHLYVIPVAAK >gi|261746705|gb|ADAD01000176.1| GENE 38 33512 - 34387 1295 291 aa, chain + ## HITS:1 COG:php KEGG:ns NR:ns ## COG: php COG1735 # Protein_GI_number: 16131257 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Escherichia coli K12 # 6 291 7 292 292 283 47.0 2e-76 MSLKNGYTLMHEHIFIDLSGIKKLDDCRLDCKNETIEEFKELYKNGVRNVVEVTNIGMGR DISYIQEVSEKSGINIICATGFYKEPFYPKFVYEKNEKELSEIMKKDILEGINSTGKKAG IIGEIGSSKEIITETELKVFKAAIIAHLETGVPITTHTSLGTMGHEQVKIFKEYKVDLNK IVIGHVDLTGNVEYILEMLYNGVYVEFDTIGKNNYLADDIRVEMLKEIEKRDLIDRVFLS LDITRKSNMKYNGGIGYNYIFDVFIPKLKEAGIKDDSIEKMLKSNPERFFK >gi|261746705|gb|ADAD01000176.1| GENE 39 34402 - 35637 1508 411 aa, chain + ## HITS:1 COG:yfeW 
KEGG:ns NR:ns ## COG: yfeW COG1680 # Protein_GI_number: 16130355 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Escherichia coli K12 # 20 379 65 424 463 242 38.0 9e-64 MRKILLVILFMLSFNVFSENIENLNKIDEVTKKYMENETIPGSVVLVSKNGKIVYLKAFG YAQLFDRDKKMEKPLLMNENTIFDLASLTKVMGTTQAIMKLCSEKKINVEDKVSKYIKGF EKNGKENIKIKDLLTHTSGLTPWQPIYYHSKNPKETLEYIKNMKLEYKTGTERKYSDFSF IILGFIVEKVTGQKLDDYLEKNIYLPLGMKNTKFNPKKKGITKNIAATSNGNPFEERMIK DDNFGYKVNERFEEFKNWRMYTLVGEVNDGNSFYANKGVAGHAGLFSNVKDLYILGEVLL NGGIYKGKRIYSKEVIDMFTSVQSSFGHGYGWEINRGGGESGYMGRFSDEYFVGHTGFTG THIVYDMKNKMQIIILTNKQNYGVDKDTKYKSTWSYAREIMNIVGETFKEN >gi|261746705|gb|ADAD01000176.1| GENE 40 35657 - 36760 1455 367 aa, chain + ## HITS:1 COG:no KEGG:Sterm_0357 NR:ns ## KEGG: Sterm_0357 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 1 364 1 364 364 484 64.0 1e-135 MKTYPLKSISISEAQEKQFRLVDIITRNMSGEEFLELGDLGVKKGTNRPLKTEKVEKIIA EFFNAEDCMFTRGAGTQAIRWGILAGIVPNEKILVHDAPVYPTTEVNLKSMNLKIIKYDF NDLSKISEVLKDEELKFCLLQHTRQKPDDSYDLAEVIKKIKEIRKDIVILTDDNYAAMKV DKIGCEVGAEMSAFSAFKLLGPVGIGCLVGKKEFVEKVKKLNYSGGGQVQGYEAMEVLRG LVYAPVSLAIQAQENEKSVERLNNKKKYPYIKEAFLANAQSKVLLVEFEENVAEKIIEAA QTLGALPNPVGAESKYEIPPLFYKVSGTFLKTDPTLKERMIRINPNRSGAETIIRILEKA YEKVTKE >gi|261746705|gb|ADAD01000176.1| GENE 41 36767 - 37930 1476 387 aa, chain + ## HITS:1 COG:yhfX KEGG:ns NR:ns ## COG: yhfX COG3457 # Protein_GI_number: 16131259 # Func_class: E Amino acid transport and metabolism # Function: Predicted amino acid racemase # Organism: Escherichia coli K12 # 1 385 1 385 387 353 45.0 2e-97 MFLDTLKRRNEKLINTAFQFHQKGYVMPDSYIIDVDTFSDNAKKILKEAEKHNIKLYYMT KQIGRNPYLAKKLEEIGYSGAVVVDFKEAELFINHNLKIGNVGHLVQIPKNMIEKVVKSN PEIMTVYSYDKVKEISEAALKSGKIQDIMLKVIDSSSKIYPGQEAGFESKDIEKEIEKIM NLKGVNINGLTSFPCFLYDKETLSIKPTENTKTIIEVKKILNEKFGKEIAQINMPSVTSV ENIKLISQLGGTQGEPGHSLTGTIPISGDKDIEEIPAYVYVSEISHNFRGKGYFYGGGYY RRSNIKKALVGTDYNTCCETEVNKMEPENIDYYMEMKKEGKVGDTVLSCFRAQMFVTRST 
VVLVEGIQKGKSEIVGKYTSSGEQIKD >gi|261746705|gb|ADAD01000176.1| GENE 42 37953 - 39167 1898 404 aa, chain + ## HITS:1 COG:yhfW KEGG:ns NR:ns ## COG: yhfW COG1015 # Protein_GI_number: 16131258 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphopentomutase # Organism: Escherichia coli K12 # 4 403 2 401 408 461 54.0 1e-129 MEQGRFIVIVLDSYGVGYMDDVLEVRPRDYGANTCKHIIDKVSDLKLENLERLGLMNALG EDYGKMKMNPEAVFGKAKLMHHGGDTFLGHQEIMGTRPVKPLIAPFSYYIDRVFDELIKE GYKVEKIGKEAKYLWVNDCVAVGDNLETDLGQVYNVTTTFQKISFEEELKIAQIVRKIVE VERVIVFGGTEATIESIKAAAEEKEGKYIGINAPKSKVYEKGYMVRHLGYGINPETQVPT ILGKAGIPVVLVGKVADIVFNEQGKSFINLVDTSKIFDITLSEIDKMKKGFIAVNIQETD LAGHAENAERYAQILQISDKYIGKIIEKLNEKDILVITADHGNDPTIGHSQHTRENVPIL IYKKGLEGINIGHRETMSDIGATVADYFKVETPENGKSFLSILE >gi|261746705|gb|ADAD01000176.1| GENE 43 39394 - 40023 647 209 aa, chain - ## HITS:1 COG:SP1627 KEGG:ns NR:ns ## COG: SP1627 COG2910 # Protein_GI_number: 15901463 # Func_class: R General function prediction only # Function: Putative NADH-flavin reductase # Organism: Streptococcus pneumoniae TIGR4 # 1 207 1 207 209 202 46.0 3e-52 MKIAIIGSNGKVGKQIVKEALNKKHDVTVIVRNENQSEANKIIQKDLFDLKMSDLSNFDA VINAAAFWTPETLSDYTKSLKYLSDILSGKNIYLLIVGGTGSLFVDKEHKVQLYQTSDYP EKYRPLAAAKSESFNELLKRNDILWTYVCPPKNFDEKGKRTGKYLLRGDEFTLNDKGESY ISYADFAVALIDEAEKREHNRQKISILGI >gi|261746705|gb|ADAD01000176.1| GENE 44 40187 - 40537 404 116 aa, chain + ## HITS:1 COG:FN0589 KEGG:ns NR:ns ## COG: FN0589 COG1733 # Protein_GI_number: 19703924 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 5 108 3 106 107 130 66.0 6e-31 MKLNKNKLPNCYVETTLMLISNKWKVLIIKELFEGVKRFGELKKSLNNISHKVLTYNLRE MEEDGLITRKVYPEIPPKVEYMLTDTGKSLYIILGEMRKWGKEYTEKIIIKKGNEK >gi|261746705|gb|ADAD01000176.1| GENE 45 40640 - 41167 631 175 aa, chain - ## HITS:1 COG:lin1898 KEGG:ns NR:ns ## COG: lin1898 COG2249 # Protein_GI_number: 16800964 # Func_class: R General function prediction only # Function: Putative NADPH-quinone reductase (modulator of 
drug activity B) # Organism: Listeria innocua # 7 175 1 176 177 164 43.0 8e-41 MKGMIEMKTLVILAHPDIENSRINKKWKEELEKYPDEIKVHELYKEYPDWNINVEKEHKL IEGYDNIIIQFPMYWCSCPPLLKKWFDDVWTFNWAYGPEGDKLKNRKIGLAISAGSLEES YTLPVNEILSPFKASTIHVGAEFLPYFSLFGTVHDMSDEKVAQSSVEYVEYIKNI >gi|261746705|gb|ADAD01000176.1| GENE 46 41172 - 41801 992 209 aa, chain - ## HITS:1 COG:SP1627 KEGG:ns NR:ns ## COG: SP1627 COG2910 # Protein_GI_number: 15901463 # Func_class: R General function prediction only # Function: Putative NADH-flavin reductase # Organism: Streptococcus pneumoniae TIGR4 # 1 209 1 209 209 230 54.0 1e-60 MKIAVIGANGQAGKLIVKEGIERGFNVTAIARGENKAGAENFIQKNLYDLTKEDVKDFDA VVSAIAIWSDEITDEHKKATEFLADLLSGAKTRLLIVGGAGSLYTDNTLTTTIAETPDFP KEYIPVATSMKIALDALRKRNDVNWTYISPAGEFDAEGAKTGDYILAGEVFAVNDKGESY ISYTDYAVAMIDEIEKGNHNKQRISVLGK >gi|261746705|gb|ADAD01000176.1| GENE 47 42047 - 42385 304 112 aa, chain + ## HITS:1 COG:FN0589 KEGG:ns NR:ns ## COG: FN0589 COG1733 # Protein_GI_number: 19703924 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 5 108 3 106 107 161 80.0 4e-40 MIINKKELPACPVETTLLLISNKWKILIIRDLSNGTKRFGELKKSINNISQKVLTSNLRE MEENGLLTRKIYPEVPPRVEYTLTEIGESLNPILEEMDKWGTGYKEKIQSGI >gi|261746705|gb|ADAD01000176.1| GENE 48 42515 - 44332 2607 605 aa, chain + ## HITS:1 COG:FN1655 KEGG:ns NR:ns ## COG: FN1655 COG2461 # Protein_GI_number: 19704976 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 5 515 1 510 512 366 43.0 1e-101 MGKDMKDHLDIDLAMIEKMTEIKRDYIEGKTDYETTKALIKQNFTRITPAEFAYSEQKIK DLGFDDNTVHDKMNDVLGLFDDIIVRETADLPEGHPINTYLMENFVVKGLIAEMKEEAEK KFIKNRWLELYDDLYKFNIHLSRKQHQLFSMLERKGFDRPSRIMWSFDNAVRDSISKARK LLNEDKIDEFMEHQKLVWELTLDIMNKEEEVLYPTSLKMITEEEFRGMRPGDDEIGYCLI DKPEGFYPEVKEEPQIPENTEILQGNQNQAGFMNDLASLLSKYNMGNEAKKENEVLDVKQ GKLTLDQINLIYKHMPVDLSFVDENEIVKFYTDTKHRVFPRSAGVIGRDVKNCHPRESVS TVMQIIDAFRKGEQEEAEFWLETNGKFIYITYTAVRDENGKFKGVLEMMQDVTRIRSLTG 
ERRLLMWESSAKNSEGKEGESKSEYGLNESSVIGDIIDKYPYIKEFMPTLAPEYTKLLDP VQYMIMSKVATLDMIAERGGFTVPELIGKIEERIKREENKTENISGESAYGLTSDSIIGN IIDKYPYIKEFMPTLSPVYNRLLNPVQYMIMSKVATLDMIAARGGFEVADLIKKIEDKIK EEENK >gi|261746705|gb|ADAD01000176.1| GENE 49 44397 - 44972 751 191 aa, chain + ## HITS:1 COG:lin1172 KEGG:ns NR:ns ## COG: lin1172 COG2096 # Protein_GI_number: 16800241 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 1 191 1 187 188 184 49.0 1e-46 MKIYTKYGDKGFTRLYGGDRVSKTHVRVEAYGTLDELSALLGSTIAKLENYKEMSDIREE CENIQQQLFDCGSDLATPRELRPYKQTEEDVKWLEERMDQYIPLLPKLQCFVIPGGSKIV AEFHILRTVARRLERRIIAVIEADEAVNEAGLRYINRLSDYFFVISCLINLRLGKPETIY KRSAKVFRDTK >gi|261746705|gb|ADAD01000176.1| GENE 50 45195 - 46163 1152 322 aa, chain + ## HITS:1 COG:lin1460 KEGG:ns NR:ns ## COG: lin1460 COG1125 # Protein_GI_number: 16800528 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport systems, ATPase components # Organism: Listeria innocua # 1 240 1 240 327 310 61.0 4e-84 MIEFLNVEKTYPNGNKGIKNMNLNIEEGKITVFIGPSGSGKTTALKMINRLEDATSGEIR ISGKNIMEYNIHELRWDMGYVLQQVALFPHMNVEQNISIVPELKGWKKDQTDKRIDELLN TVGLEAEKYRKRMPAELSGGEAQRIGIARALAANPRIILMDEPFSALDPVTRTGLQADIK KLQEKINKTVVFVTHDIEEALLLGNKICIIKDGELIQCGTKKDLTEHPVNNFVREFLASG RNHTICDDIVKDILKNGFYKKIDDFEKENNEITMKINDSSDKLFKILEKNEYVIIINEKN EKLMVKRKNVFSCLSQESDKNE >gi|261746705|gb|ADAD01000176.1| GENE 51 46156 - 47673 1865 505 aa, chain + ## HITS:1 COG:lin1461_2 KEGG:ns NR:ns ## COG: lin1461_2 COG1732 # Protein_GI_number: 16800529 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic glycine betaine/choline-binding (lipo)protein of an ABC-type transport system (osmoprotectant binding protein) # Organism: Listeria innocua # 233 505 28 299 299 325 61.0 2e-88 MNNFFVTLLERKEDIIKSITEHIEISFLALVIALLIAVPLGIYLSYHKKIAEIVIGVTAV LQTVPSLALLGLLIPFVGIGTVPAVIALVIYALLPILRNTYTGISEVDPVYLLASRALGM NKVQQLLKIQIPLAMPVIMAGIRTATVLIIGTATLASLIGAGGLGSLILLGLDRNNNNLI 
LLGAVPAALLAVFFDQVLKRLEKKDWKKTLIYMFAAMTLFFAGNFGADKMMYKDKLVIAG KLGTEPEILINMYKILIEENMKNIHVELKPGFGKTSFVFNALKSKEVDIYPEFTGTAVFT FLKEKPVSNNEIEVYNQAKKGLAEKYNMILLEPMKYNNTYALAVPRKFAEDNNLMKISDL EKVKDKIKAGFTREFNDREDGYKGLKKLYNFEIPDVKELEPKLRYTAIQNGEINLIDAYS TDSELEKYDLAVLEDDRKLFPPYQGAPLLRKEILEKYPELEKILGMLKGRITDADMREMN YEVGVNGKRAEDVAKEYLKKNGIIK >gi|261746705|gb|ADAD01000176.1| GENE 52 48196 - 50331 1656 711 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|62291006|ref|YP_222799.1| polynucleotide phosphorylase/polyadenylase [Brucella abortus bv. 1 str. 9-941] # 1 704 1 699 714 642 48 0.0 MFDERTYGFNLGTQQVKLSTGKIARQAGGSVMVQCGGTVLLVTATRSKDVKEGQDFFPLT VDYIEKFYASGKFPGGFVKRESRPSTDEILISRLVDRPIRPLFPEGFLNSVHIVITVVSY DEINQPENLATIGVSVALGLSDIPFAGTVAGVTVGYVDGEYVLNPTAEQLEKSEIHLSVA GTKDAVTMVEAGAKEVSEEVMLEAIMFGHERIKEICAEQDKFLSQFDVKKYEFEKKEVAP EIKEFIDSFEKNVEEAVMTPGKLEKYEAIDNLEIELFEKYVAKLESEAKEIDENLEKEFK NYYREIEKKVVRDAILYKKYRADGRTTTEIRPLDVEVDTLPVPHGSALFTRGETQALVTV TLGSKADEQIVDGIEDESRKKFFLHYNFPPYSVGEAGFLRAPGRRELGHGNLAERALKYV MPDQETFPYTVRLVSEITESNGSSSQASICGGSLALMAAGVPIKSTVAGIAMGLIKEGDT FTVLTDIQGLEDHLGDMDFKVAGTKKGITAIQMDIKIEGITREIMDIALKQALEGRYFII DKMEEVISEPRPEIAENAPKIELMKIDPSKIAGLIGPAGKVIKAIIEETGVSIDIEDDGS VSIFGKDPEMMKKAIELVKRQTQSVELNGIYDGKVTKLMKFGAFVEVLPGKEGLLHISEI SHKRVAKVEDVLKEGQEIKVKVITMEDENKFNLSMKALISKEEGVKDESAQ >gi|261746705|gb|ADAD01000176.1| GENE 53 50315 - 50860 275 181 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 5 179 484 669 904 110 38 2e-23 MNLPNKLATLRMILIIPFVIVMGAALSTDNDILSIFMRILACIIFVGASITDYYDGQIAR KYNLVTNLGKLIDPLADKLLVISALTVLTKYDKISLWIVLIIIFRELMITGLRAIVAADG TVIAAETLGKWKTATQMVALTIIILFPLSYTMNNILMIIPLILTILSGAEYVLKSKDVLN K >gi|261746705|gb|ADAD01000176.1| GENE 54 50889 - 51158 256 89 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1454 NR:ns ## KEGG: Lebu_1454 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 5 82 1 78 85 76 57.0 3e-13 
MENILYIIYKIIDLYSIIILISVLGSWVDGRNQSPFFRFINKLTNPYLKLFRIIIPAGNM NIDISPIIGITVLNLLKSIIGRLYYFSVV >gi|261746705|gb|ADAD01000176.1| GENE 55 51234 - 52781 1726 515 aa, chain + ## HITS:1 COG:lin2806 KEGG:ns NR:ns ## COG: lin2806 COG0232 # Protein_GI_number: 16801867 # Func_class: F Nucleotide transport and metabolism # Function: dGTP triphosphohydrolase # Organism: Listeria innocua # 6 514 1 464 465 382 44.0 1e-106 MKKYKMNWKNLLSVNSQRPRSNKRGNRQEEPEEAFNRSKEKPYSDLRSDFERDYHRILSS ASFRRLQDKTQVFPLEKNDFIRTRLTHSIEVSSFARSLAQSVADEIIRRNLDNNFGHEEA NGITNILASSGLLHDIGNPPFGHFGEDTVRAWFIKNLEEIKIKDSQGEYKKLNEILNTQM CHDFINFEGNAQAIRVVSKLHFLVDENGMNLTFALLNTLIKYPVDSLHIDKKSGDIKTKK MGYYYSEKELFENIVKSTGTYNEETGEIYRHPLTFLLEAADDIAYCTADIEDGMKKGFIS FENLIESLKVNVPEDERLYKNLITYKKYAKEKSYDSPELYAVQRWIVSIQGIFISSVVNS FIENYEQIMNGEFKTDLFKGTEAEKLLKVLKNLAFEEVFESKAILKMEIAANNIINYFLT NFVNSVLYWDTPYEDEMAGIDSKYIAILSENQRHIYKYYSELFEKNIKSQDLKAYKKEEI FAYKLYLRLLLVTDYISGMTDSFIKTLYQELTGID >gi|261746705|gb|ADAD01000176.1| GENE 56 52832 - 53656 987 274 aa, chain + ## HITS:1 COG:no KEGG:Sterm_1982 NR:ns ## KEGG: Sterm_1982 # Name: not_defined # Def: TrmB family transcriptional regulator # Organism: S.termitidis # Pathway: not_defined # 1 273 1 258 258 167 37.0 4e-40 MDIISELQKFGFSKVEAEVYMEVLNVPMSNGTQISKKTDISRSAIYNALERLCEKGYIYT VPTEEDKKNYMATEPMEIISKLKEEWQNKAEFLEKEFLKIRNKTGETRFYELYSEESLIL KIKEMILNASDEVYISTNIDIYLFREEILKMVKAGIKIFIFNYGEMIVDFNKEVSESGNI FENYNVYNLKIYNAYNKFFKKEEKEIIMVSDIVKGFSCEEAGEGFVGMSTENKILVKVLA ENIHNNIYVNKIERVYGEEVFEETFIESVFEKKK >gi|261746705|gb|ADAD01000176.1| GENE 57 54149 - 54719 998 190 aa, chain + ## HITS:1 COG:FN0857 KEGG:ns NR:ns ## COG: FN0857 COG0058 # Protein_GI_number: 19704192 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Fusobacterium nucleatum # 1 184 1 182 789 185 48.0 4e-47 MKIDKSQLKDSILRKLRRQYGKTIEEAHEYEIYYAASRATLDYIVENWYNTKKTYAKKQV KQMYYFSAEFLMGRYLGNNLINLQINDAVKETLEELGVDINKVEDQEMDAGLGNGGLGRL AACFLDSLATLKLPGHGYGLRYKYGMFDQRIENGFQVEYPDDWTKFGDPWSIKRMDRVFE 
IKFGGQIEVH Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:43:53 2011 Seq name: gi|261746661|gb|ADAD01000177.1| Leptotrichia goodfellowii F0264 contig00059, whole genome shotgun sequence Length of sequence - 29222 bp Number of predicted genes - 46, with homology - 38 Number of transcription units - 7, operones - 4 average op.length - 10.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 3 - 62 7.5 1 1 Tu 1 . + CDS 305 - 457 140 ## - Term 425 - 483 -0.6 2 2 Op 1 . - CDS 492 - 674 245 ## gi|262039312|ref|ZP_06012629.1| homeobox protein chox-7 - Prom 785 - 844 10.8 - Term 730 - 770 -0.2 3 2 Op 2 . - CDS 918 - 1001 162 ## - Prom 1052 - 1111 10.2 - Term 1084 - 1114 -0.4 4 3 Op 1 . - CDS 1115 - 1681 489 ## gi|262039276|ref|ZP_06012593.1| hypothetical protein HMPREF0554_0810 5 3 Op 2 . - CDS 1693 - 1929 224 ## gi|262039297|ref|ZP_06012614.1| hypothetical protein HMPREF0554_0811 6 3 Op 3 . - CDS 1986 - 2330 554 ## gi|262039288|ref|ZP_06012605.1| hypothetical protein HMPREF0554_0812 7 3 Op 4 . - CDS 2343 - 2879 510 ## COG0860 N-acetylmuramoyl-L-alanine amidase 8 3 Op 5 . - CDS 2892 - 3449 570 ## Sterm_2817 hypothetical protein - Prom 3473 - 3532 9.2 9 4 Tu 1 . - CDS 3651 - 4616 644 ## COG4973 Site-specific recombinase XerC + Prom 4458 - 4517 8.7 10 5 Tu 1 . + CDS 4654 - 4989 167 ## 11 6 Op 1 . - CDS 4986 - 5972 857 ## Sterm_2818 hypothetical protein 12 6 Op 2 . - CDS 5976 - 6635 684 ## gi|262039301|ref|ZP_06012618.1| conserved domain protein 13 6 Op 3 . - CDS 6628 - 7749 1129 ## Arcpr_1480 baseplate J family protein 14 6 Op 4 . - CDS 7751 - 8107 462 ## gi|262039291|ref|ZP_06012608.1| hypothetical protein HMPREF0554_0819 15 6 Op 5 . - CDS 8110 - 8583 484 ## gi|262039303|ref|ZP_06012620.1| hypothetical protein HMPREF0554_0820 16 6 Op 6 . - CDS 8580 - 9563 1149 ## gi|262039275|ref|ZP_06012592.1| hypothetical protein HMPREF0554_0821 17 6 Op 7 . - CDS 9566 - 9931 344 ## gi|262039313|ref|ZP_06012630.1| hypothetical protein HMPREF0554_0822 18 6 Op 8 . 
- CDS 9928 - 10515 621 ## gi|262039272|ref|ZP_06012589.1| hypothetical protein HMPREF0554_0823 19 6 Op 9 . - CDS 10518 - 12839 3390 ## Haur_1369 TP901 family phage tail tape measure protein 20 6 Op 10 . - CDS 12836 - 12934 96 ## 21 6 Op 11 . - CDS 12949 - 13293 424 ## gi|262039296|ref|ZP_06012613.1| hypothetical protein HMPREF0554_0825 22 6 Op 12 . - CDS 13306 - 13722 479 ## gi|262039311|ref|ZP_06012628.1| putative flagellar basal body rod protein FlgG 23 6 Op 13 . - CDS 13779 - 14954 1320 ## gi|262039308|ref|ZP_06012625.1| hypothetical protein HMPREF0554_0827 24 6 Op 14 . - CDS 14947 - 15450 375 ## gi|262039295|ref|ZP_06012612.1| conserved hypothetical protein 25 6 Op 15 . - CDS 15434 - 15796 556 ## gi|262039306|ref|ZP_06012623.1| hypothetical protein HMPREF0554_0829 26 6 Op 16 . - CDS 15799 - 16260 536 ## EpC_17640 phage related-protein 27 6 Op 17 . - CDS 16257 - 16829 176 ## gi|262039299|ref|ZP_06012616.1| conserved domain protein 28 6 Op 18 . - CDS 16841 - 18586 2306 ## COG4653 Predicted phage phi-C31 gp36 major capsid-like protein 29 6 Op 19 . - CDS 18596 - 19306 945 ## gi|262039309|ref|ZP_06012626.1| phage Mu protein F like protein 30 6 Op 20 . - CDS 19333 - 20601 1108 ## COG4695 Phage-related protein 31 6 Op 21 . - CDS 20612 - 21928 1358 ## COG1783 Phage terminase large subunit 32 6 Op 22 . - CDS 21931 - 22203 402 ## gi|262039289|ref|ZP_06012606.1| tRNA pseudouridine synthase A 33 6 Op 23 . - CDS 22208 - 22639 529 ## gi|262039286|ref|ZP_06012603.1| hypothetical protein HMPREF0554_0837 34 6 Op 24 . - CDS 22626 - 22742 115 ## 35 6 Op 25 . - CDS 22746 - 23363 637 ## gi|262039307|ref|ZP_06012624.1| putative pbsx phage terminase small subunit 36 6 Op 26 . - CDS 23377 - 24648 1350 ## COG0863 DNA modification methylase 37 6 Op 27 . - CDS 24748 - 25155 385 ## gi|262039282|ref|ZP_06012599.1| prolactin receptor - Prom 25261 - 25320 5.0 38 7 Op 1 . - CDS 25352 - 25591 126 ## 39 7 Op 2 . - CDS 25639 - 26430 800 ## Sterm_2849 hypothetical protein 40 7 Op 3 . 
- CDS 26412 - 26582 185 ## gi|262039273|ref|ZP_06012590.1| conserved hypothetical protein 41 7 Op 4 . - CDS 26625 - 27254 561 ## Sterm_0817 phage regulatory protein, Rha family 42 7 Op 5 . - CDS 27313 - 27672 390 ## Ilyop_2060 hypothetical protein 43 7 Op 6 . - CDS 27688 - 27996 228 ## Lebu_0946 helicase 44 7 Op 7 . - CDS 28002 - 28175 150 ## 45 7 Op 8 . - CDS 28175 - 28924 535 ## COG1484 DNA replication protein 46 7 Op 9 . - CDS 28836 - 29066 128 ## - Prom 29147 - 29206 4.6 Predicted protein(s) >gi|261746661|gb|ADAD01000177.1| GENE 1 305 - 457 140 50 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKLEEKQIKNLKEMKIDNLKEKEIDKLEEKKIDFLNEVPEKKIKSKKRK >gi|261746661|gb|ADAD01000177.1| GENE 2 492 - 674 245 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039312|ref|ZP_06012629.1| ## NR: gi|262039312|ref|ZP_06012629.1| homeobox protein chox-7 [Leptotrichia goodfellowii F0264] homeobox protein chox-7 [Leptotrichia goodfellowii F0264] # 1 60 1 60 60 90 100.0 5e-17 MGAKKGRPKPIGSGKKATGRVRTENFGVKLTVLEKEFLMKKLNETGEKSNTDALLKLLGY >gi|261746661|gb|ADAD01000177.1| GENE 3 918 - 1001 162 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLFHQWVQTLANIATIILVIYTITKDK >gi|261746661|gb|ADAD01000177.1| GENE 4 1115 - 1681 489 188 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039276|ref|ZP_06012593.1| ## NR: gi|262039276|ref|ZP_06012593.1| hypothetical protein HMPREF0554_0810 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0810 [Leptotrichia goodfellowii F0264] # 1 188 1 188 188 289 100.0 7e-77 MESTKNILLYIENHGLSLIITVMLILAVWKYIVPYIKEQTEILEEVKKFLKNFNTGAISG KALGLMLELQAKSLRWSIENKYVFFIQNNNIKNRYNNIIFEIDNYISTKMLKFEDELKDI TDKITFKVFFDMFQSSISDLKKELNTVLKALKEENTEPSDYDIAIRTVKQHMEHFQNNLI KKIKELTD >gi|261746661|gb|ADAD01000177.1| GENE 5 1693 - 1929 224 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039297|ref|ZP_06012614.1| ## NR: gi|262039297|ref|ZP_06012614.1| hypothetical protein HMPREF0554_0811 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0811 [Leptotrichia goodfellowii 
F0264] # 1 78 1 78 78 129 100.0 7e-29 METVEKVVEKVTNKVNADDIIDKVNSTTFSGTILKQWEEKDKKINYNRSHLYGDIAYHTN FKDKNELVARAGILYYLK >gi|261746661|gb|ADAD01000177.1| GENE 6 1986 - 2330 554 114 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039288|ref|ZP_06012605.1| ## NR: gi|262039288|ref|ZP_06012605.1| hypothetical protein HMPREF0554_0812 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0812 [Leptotrichia goodfellowii F0264] # 1 114 1 114 114 167 100.0 2e-40 MLKEILRNKVILEGLSVLIPIIVTYILAKLKTNSTIRKFIPEGVVFADSLDTSNENKLIR AVTQIELLVLSVTPALFKPIVDFLINPKYIVKMIERYLTKKKVEKQELLEKENV >gi|261746661|gb|ADAD01000177.1| GENE 7 2343 - 2879 510 178 aa, chain - ## HITS:1 COG:BS_yqiI KEGG:ns NR:ns ## COG: BS_yqiI COG0860 # Protein_GI_number: 16079475 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus subtilis # 9 177 37 205 206 82 35.0 4e-16 MRKICVIIGHGGNDSGAVNIHTGDTELKYNTGLSVMVADLLRNRGYNVDIYNRGYARVEN VPELNAKKYDLFISLHCNSFNEQANGTEMLYWSTSSRSKKLAQSLQDEVVKTFSLTDRGI KPKVNGDRGAYLLKKTNAPCVILEPFFIDNMHDLEVGKAKKQEYSQAIVNGIDKYFNN >gi|261746661|gb|ADAD01000177.1| GENE 8 2892 - 3449 570 185 aa, chain - ## HITS:1 COG:no KEGG:Sterm_2817 NR:ns ## KEGG: Sterm_2817 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 5 178 4 171 174 67 28.0 4e-10 MPFKIYLYDKDGKLIGIYIAPSKEDFEADKQKYCSEYIEGENYISYEEVKNPIIDNGNIR EMKTSELIRSGKITLSDGQYLDGEEIKSIPKPNEYSKWDENTHEWVEDKAEKLQYFKDLR YTKQQEYIKYKKELEEKEDEKSEFESLGFDTTETEERITEIKSEMDLLKTEISKLSKEIT LLSKK >gi|261746661|gb|ADAD01000177.1| GENE 9 3651 - 4616 644 321 aa, chain - ## HITS:1 COG:xerC KEGG:ns NR:ns ## COG: xerC COG4973 # Protein_GI_number: 16131663 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerC # Organism: Escherichia coli K12 # 80 296 66 273 298 63 23.0 6e-10 MELEVTKNRNGLIYMEYLNSCIAKNAATARTTYKTYFNNMKLFVEYLREYENNRYLLSKD TLKFIVSILERYIRFCREVKGNNAQTINNKITAISSFYIWAIKRDLIATHPFREKLDRLK 
VTDMEKRRKSYYLSIKEVIEINIKMELEHKKFDLQDRIIFNLIIDTACRISALQSIKIKN IDIDNGLILGIVEKEQKLVEFAIFKDTIKLIKEWLKCRNNKGIEDEYLLITKYEKEYRQM SKSTIRDRVKKIGKLIDIENLYPHSLRKTSINLLANAGSLELASEFANHSGIDVTKKHYI KKDTGSEKRNKILNVRKKAGF >gi|261746661|gb|ADAD01000177.1| GENE 10 4654 - 4989 167 111 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MENLYKVQQAKLYVHSEATGLGRTTCNVVQKVGNVVTIVFDSGDTLRSINDNTTIFSIPE GYRPKSFLSVNASQFNGISGTIYIQPDGTAKWRGSTVTTASIIFSVSYIVD >gi|261746661|gb|ADAD01000177.1| GENE 11 4986 - 5972 857 328 aa, chain - ## HITS:1 COG:no KEGG:Sterm_2818 NR:ns ## KEGG: Sterm_2818 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 1 327 1 349 349 69 24.0 2e-10 MSKKFTNVVDRGRVVSNKYNLVNNGDGTALITDIENNINVQGTPLDKNLFNPMQEGLIFT VETTHTIENGTDVYELDIDGLQGVNNLNGLPLFNGLSLNIKVNKENTTNVVKVKISGNKY DLAKENNDTLENLKIGELKNKYHKIIYNGVRFVLFTGVGLEYSKWLETIGGQFGGYVSKV ANKEAGKLYINDVYDNKLYVCVTAHSSTSFDITKYRDITNNGISNKLENLSKIETKIIDK GIALLKSGNIVILTWDSNYYLQGSLPRGTVLATLPAGYRPILDISAPVTFWNSNNSGTIK IRNNGQITWESSINLQGTIYVNASFLIS >gi|261746661|gb|ADAD01000177.1| GENE 12 5976 - 6635 684 219 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039301|ref|ZP_06012618.1| ## NR: gi|262039301|ref|ZP_06012618.1| conserved domain protein [Leptotrichia goodfellowii F0264] conserved domain protein [Leptotrichia goodfellowii F0264] # 1 219 1 219 219 399 100.0 1e-109 MYNNYKYLNSKIPYILKATDTNQLFVKSIANVFDLVDKYMDMLENYWLIDKAKGEFLDDL GALVEENRNNDVDDNYRKRIKLKFQALDIVPTLDNILKLIKSFTGLFPEIREGWKVDGEA GRYDIDFIAEKDYNFSLIDVIDLESIIGGGIKINTRKCLENYTEAYYSGDVQSGDKLFPF YSERKADCNFSFNDIPYSKEINSGDKLYLSDDLINFERR >gi|261746661|gb|ADAD01000177.1| GENE 13 6628 - 7749 1129 373 aa, chain - ## HITS:1 COG:no KEGG:Arcpr_1480 NR:ns ## KEGG: Arcpr_1480 # Name: not_defined # Def: baseplate J family protein # Organism: A.profundus # Pathway: not_defined # 77 286 97 302 391 83 30.0 2e-14 MARIDILELNDIIGTMGENIKSIQSNFSIDKRSVWYLMIGYPVGRLAQQKLYRIQYIADK ANIYKCENEELDDLLNGNFNFSRKQPSYAKLFVTFNAVNGTTVGIGELGVKTSTGIEFFN 
TNTATATNNLISLEFECETLGSIGNVGSNEITKFITTVQGILSIQASTEGQGGQDKETDI KYRDRWFNSRFKSFWNIDGIKSALMDLDGVESAYVNENDQPVAVNGVEQKSVIIVVDGGI NEQIANVIFQKKDQAIKSVGDIVAYATDTSGIKREIRFYRPTEIKIEAQYISIPGNYAVE NKNKIDVLINEYIKSKGVNGFISVYECFVENIRPAISETDLKHLDLSFKIHNSRLSFKTN LQLGIKEKGVLYV >gi|261746661|gb|ADAD01000177.1| GENE 14 7751 - 8107 462 118 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039291|ref|ZP_06012608.1| ## NR: gi|262039291|ref|ZP_06012608.1| hypothetical protein HMPREF0554_0819 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0819 [Leptotrichia goodfellowii F0264] # 1 118 1 118 118 192 100.0 1e-47 MDLKLGNINHGELEVKNNDLVLMNDANMEIIQMIALMLQIKAGELEFDTNYGLDWAYWET GNKDLVEENIRNKILYYFKEVNKINSITSEFVTKEKRKLKVFISLDINNQTYEKSLEV >gi|261746661|gb|ADAD01000177.1| GENE 15 8110 - 8583 484 157 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039303|ref|ZP_06012620.1| ## NR: gi|262039303|ref|ZP_06012620.1| hypothetical protein HMPREF0554_0820 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0820 [Leptotrichia goodfellowii F0264] # 1 157 1 157 157 295 100.0 6e-79 MNEVVPTTTLGKITRDYGDGFYAIQPLGTIKGVEWQPIPRVPMCQLGNKNINQIFPFNVG DIVPIAFLSFSQSNYLESDDEGDLDSDLTNSFADCIAFPFVVPTASNSLNADTITINGNV EQSGTLTNDETTSSGIVLTTHVHGGITKGSDKTEKAE >gi|261746661|gb|ADAD01000177.1| GENE 16 8580 - 9563 1149 327 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039275|ref|ZP_06012592.1| ## NR: gi|262039275|ref|ZP_06012592.1| hypothetical protein HMPREF0554_0821 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0821 [Leptotrichia goodfellowii F0264] # 1 327 1 327 327 520 100.0 1e-146 MVDNNRFIIGELFNESALIIIRTASKDIEIPYQYWDPNNTEQDREIRGYDISVSYKDSEN NDLSSGEITIFNLAQSDIDLIREKDTINVKMGFGRDIGEVFTGTITEVVQLEYELKIKFL EATKTFNDRVSIGLEPTKASKVIKEIADSIGYVVKKCDLKIDKQYKGGFYLSPYDVPLRK IIQIVDDCDSKINLKYDEIYIYSKENNDTEKIILNKTSGLLGEPKKYVKPEKTSAKNRGK ATKKTKKEDKKKRKSKKKKVAKTKSPENRKVEYDYSLNCLLIHYLKKTDNVIIESKTFNG KAKIVALSIKDFVMELKIKVLNEVKKK >gi|261746661|gb|ADAD01000177.1| GENE 17 9566 - 9931 344 121 aa, chain - ## HITS:1 
COG:no KEGG:no NR:gi|262039313|ref|ZP_06012630.1| ## NR: gi|262039313|ref|ZP_06012630.1| hypothetical protein HMPREF0554_0822 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0822 [Leptotrichia goodfellowii F0264] # 1 121 1 121 121 199 100.0 7e-50 MRINLDKSLIPLKFTLRVLDENFQLHFKEHKMLINDDELNPIFKSRLYLDIYNEDNKLIL KNEKLVYGVPVGLYLSRDKNNNTSTDFPNAYIFPFSNDGIEKEVNFDNLNNTVFIEFIER E >gi|261746661|gb|ADAD01000177.1| GENE 18 9928 - 10515 621 195 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039272|ref|ZP_06012589.1| ## NR: gi|262039272|ref|ZP_06012589.1| hypothetical protein HMPREF0554_0823 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0823 [Leptotrichia goodfellowii F0264] # 1 195 1 195 195 271 100.0 1e-71 MIDLKQINNQISGYKNQYDNYRKDVSTKLNTYKKKYLEKYREGVYINNIRLDWCRISETQ KGDMKDSPLDPSDIPNQISTNLHISDKEINLEAKFNIDNRQKKELFEEIKQLFLKKQRIN ITTSNEIIENLIIISISKELDKNNYAFSLSMRQFQTAKITSTGEVKSGEQTQVNGTTTVG TQGTTQSNVSGGYLK >gi|261746661|gb|ADAD01000177.1| GENE 19 10518 - 12839 3390 773 aa, chain - ## HITS:1 COG:no KEGG:Haur_1369 NR:ns ## KEGG: Haur_1369 # Name: not_defined # Def: TP901 family phage tail tape measure protein # Organism: H.aurantiacus # Pathway: not_defined # 40 453 4 420 1347 200 33.0 2e-49 MSDEMIIDVKIKNNKSAIDELDKTMEGLTSTTKKTSKGVDELGNEVKKTGKRKSELDKVK EGLKGVEKGAKEAKSGVNVLAGGFKSLASAMLPILSVAAVVGFVKKSLDAFEDFEKGMNS IFTLLPKKSAEAEEAMGKKVRNMAKTYGVEMSDATDAIYNALSAGVSEDNVFGFVETGIK ASKAGMASLSDSTATLNTIMNNYRSDSLDVNNVSDLLFATIKKGVTSFPELASSIGDVLP STAAANVSFEQTAATIATLTATMGKGSTAKAGTSMRAMFEELNNSGSKTYKMFKQLNGGV DFKTFMKNGGTVSQALGMIEKKAQSTGKTVADMFMSVESKKAVNILTSNKKVFDENLDEF KNVAGATDEAYEKMNRGWGATTARLKAGMTDVMIGLGDAIAPVAGLIGGALIEALSLVTP AFDLLGQGINSVIKPLSALGEAWGLITGTSSTAGIEETNKKFAELSPLAQQLVKPLAELN QAFNDLINKLIGAFAPATERVQEFFNSITGNTNTKDILVGLVEGVTWGVETITPLLEGLG SWWSTQFNVMLNVVGIVGNFFKGVMEGMGVDTQTLGQFISDLYVITGATFKGFSTAVSTA WGITKPIFDFLAETLGKIIGLLAKVSFEPLQKGASFLSGLLGGKKKNALGTDNFGGGTTT ISEQGKELFATPSGLVGISPNSRSEMMLPKGTQIFSNKKTEKIMNMAKNIYNNQVSLPQG 
YGNSYDINIPISIQQVAQDKIEKIKALRPLITSLIENILSDKESDMAYKWGDM >gi|261746661|gb|ADAD01000177.1| GENE 20 12836 - 12934 96 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGSINEILNMDNDTFLEMKIARDEYIKEVSKK >gi|261746661|gb|ADAD01000177.1| GENE 21 12949 - 13293 424 114 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039296|ref|ZP_06012613.1| ## NR: gi|262039296|ref|ZP_06012613.1| hypothetical protein HMPREF0554_0825 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0825 [Leptotrichia goodfellowii F0264] # 1 114 1 114 114 201 100.0 1e-50 MPKLKLTNVYAKDETGKGYKVYEELEIEFQDNGDDEKLAKIINSNLEGTADRLETYDALA EDMIISPEGARSHKFFGKNAAAVVNSLLPFLLKYGNEDMISANKKLKVVEVAGE >gi|261746661|gb|ADAD01000177.1| GENE 22 13306 - 13722 479 138 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039311|ref|ZP_06012628.1| ## NR: gi|262039311|ref|ZP_06012628.1| putative flagellar basal body rod protein FlgG [Leptotrichia goodfellowii F0264] putative flagellar basal body rod protein FlgG [Leptotrichia goodfellowii F0264] # 1 138 1 138 138 226 100.0 3e-58 MKDTKKVSLVLTSPTGRTRNIIGVSVNPSQVNQNFTLSDPDMNGEHVTIMNGSTATTYEI VVRQNSGNFTFLNNFVQDCLDEGASGNGLFKNTSVKGKPETHVLVGVTIQKKESGQHDNS NVDATFTIQAESVTRNNL >gi|261746661|gb|ADAD01000177.1| GENE 23 13779 - 14954 1320 391 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039308|ref|ZP_06012625.1| ## NR: gi|262039308|ref|ZP_06012625.1| hypothetical protein HMPREF0554_0827 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0827 [Leptotrichia goodfellowii F0264] # 1 391 1 391 391 755 100.0 0 MSNILSQNINDVKITVIREYISNFNVDLGVHRLVTIGKNIPLTKLEPNTALEFMKTETSK GGLGLSSTDNIYKMVELFLSQTIESGGTTIKGDHFWIQGIEFNPVTDDLTLKLTDKIENT KEDADNYFWVFDIQSVAFNEWLSKFLTRNYNFGLIEKGESTVSDLEKSDRIFAIANPKVD VHVDKTNTIEFLNPKGSVVSALGGGIFTRLATAGFGMRVKHKTLQGIRTYNTPLFEHNVP LTNVDLNTYKSKNIATYENAWGAGMVSLSKTIGGDVYSDERIGLDYIIFVITGSIHKLYN QQIGIPYDDGGINLIENKLNDCMVQVGDEGWLARKSTKARDYSFKVSVPERTSIPNQKVV DRILDNTAIDFTLAGQIENVNVTLNWKTTLV >gi|261746661|gb|ADAD01000177.1| GENE 24 14947 - 15450 375 167 aa, chain - ## HITS:1 
COG:no KEGG:no NR:gi|262039295|ref|ZP_06012612.1| ## NR: gi|262039295|ref|ZP_06012612.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 167 1 167 167 274 100.0 1e-72 MIEKFRKLLNSFSNKKWQIVNGEVLAKTPDYPFVEMFVINLAPDFHNQSVEIKKRNENVL IEENLKTYFTMLQFNCRHSTMTEAVTLANDLFRLINYEKRKKILNEGIGIRKMSNIRNLN YEVAGKWNYCYSFDVEISYDVIEEREIETIETIKANINNKQEVTIDE >gi|261746661|gb|ADAD01000177.1| GENE 25 15434 - 15796 556 120 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039306|ref|ZP_06012623.1| ## NR: gi|262039306|ref|ZP_06012623.1| hypothetical protein HMPREF0554_0829 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0829 [Leptotrichia goodfellowii F0264] # 1 120 1 120 120 210 100.0 2e-53 MDNIKIPERFFKILTLSTSPGEWEDGELILEKRDLKFKGALFDLRNSDYEKFKSQGTSLG FEDRKLYIKENIKIDLKDEVIDHLGNKFRVVGKEDYRQNGHADLIICYLERLKNKDDRKI >gi|261746661|gb|ADAD01000177.1| GENE 26 15799 - 16260 536 153 aa, chain - ## HITS:1 COG:no KEGG:EpC_17640 NR:ns ## KEGG: EpC_17640 # Name: not_defined # Def: phage related-protein # Organism: E.pyrifoliae # Pathway: not_defined # 11 152 1 151 152 72 32.0 5e-12 MIMGITIKFKLDEYKKAKEILEYLTKHKIRIGFIGNETGENGTKVSEYAFYVEFGKGKGN IPRPFFRNATKEIQEMLNEKLKPLIMEAIKSKANGEVVLNTIGIEAVGLIQKSINEGIYA PNKESTLKHKKGSKPLIDTGTMLQSVNFKIEGV >gi|261746661|gb|ADAD01000177.1| GENE 27 16257 - 16829 176 190 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039299|ref|ZP_06012616.1| ## NR: gi|262039299|ref|ZP_06012616.1| conserved domain protein [Leptotrichia goodfellowii F0264] conserved domain protein [Leptotrichia goodfellowii F0264] # 1 190 1 190 190 356 100.0 6e-97 MNSIITLKQYEKLTGETMEQEKSSFIETLIRVASDMIESYIGYDLEKQDRTEIIQKKINI SRLWIKYPPINSVKNISINKKNIGRQHYIHTTKKIEFTDYFCSCNCRCSFTFNDRIILEY NSGYKFGDDGNVPYDLQYYVAMLVKSLFLLSQDDDAQKYSSYKINDIAYSYKENETFTKH IVPILKRLLW >gi|261746661|gb|ADAD01000177.1| GENE 28 16841 - 18586 2306 581 aa, chain - ## HITS:1 COG:CC2783 KEGG:ns NR:ns ## COG: CC2783 COG4653 # Protein_GI_number: 16127015 # 
Func_class: R General function prediction only # Function: Predicted phage phi-C31 gp36 major capsid-like protein # Organism: Caulobacter vibrioides # 309 569 59 327 341 64 24.0 4e-10 MADKLFQKEVGGIITKADIETGTLEGVLIKGEVLDSYEDYFLKEATENFKTKNGSNVVFM LHQHKKDSEIGVMELYAEGSDLKMKARLDLSKDTNGNFINKEAAKIYSLMKLGAHYDLSV GGRVLKGEIGYVPTEKGEVRAYIIKEFEVWEGSLVVKGAVPGSNVTTFKNYNENEEDNMS VNLEKQFNDYKESVEKQIKSLEEMLKEKDLTDEMKKSVEAVKGDMEIKLKEYQEAFEKSI DDKLNDFAKEFKSIKETEKELKEADLEKSIMDFMKEVNSDTFSKKSYAQYVVEKATSTTS GAVPTDFLPLLQRTILRRAQEVKNIWAYISKFSMKEGSTKIPRETLGSTEVKFIGETAAR TETTINVLDQVEIELHQVYALPIFTNKMLATDVVGFVALVLERVAENFTKKISEKILFGS GTGEPHGILNNSEVLANALTFGTTGEMDYETFTKAKYNLKEDYANRAIVIMNRKSAPAII NLKDKNDRPIFIEAYKDGKSDTISSLPVVYDDTMPIFDSANTGDVTVLIADMSRYLGATH TDYNIKMKDDITNKGFTAYYFETMVGGNVLLPEAFIPIKKK >gi|261746661|gb|ADAD01000177.1| GENE 29 18596 - 19306 945 236 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039309|ref|ZP_06012626.1| ## NR: gi|262039309|ref|ZP_06012626.1| phage Mu protein F like protein [Leptotrichia goodfellowii F0264] phage Mu protein F like protein [Leptotrichia goodfellowii F0264] # 1 236 10 245 245 414 100.0 1e-114 MLQQAKRLRQARGKATKFIKKKVDSNFNQLADSVNVGDNDIIEIDFKTFKTNMGKTIVLT HRVATTETINVIDEMYGLSEKVPYFQDIVDKRLNDFNKKNAGETVKKIDKVTKEKINKII TERQADGINAKQIAKEVKENVKDMTKSRSLTIARTETAKATAYANWMLAHETLVNTKVWI HVGGGKKHRENHLNMDGEERPLDELFSNGLQYAHEVGASAGEVINCYCTTTYKFKV >gi|261746661|gb|ADAD01000177.1| GENE 30 19333 - 20601 1108 422 aa, chain - ## HITS:1 COG:BMEI1349 KEGG:ns NR:ns ## COG: BMEI1349 COG4695 # Protein_GI_number: 17987632 # Func_class: S Function unknown # Function: Phage-related protein # Organism: Brucella melitensis # 53 382 57 386 397 67 21.0 4e-11 MFNFFKKKEENKTVEENKIIIKSFDDFLNYLKAFNISPYNINISKMIKQIPENPFISSSL DRMQKGFYPIKWSVYEGSKKDKKEKTNNIVYRSLEQPNALMDTTDFLYYCYLYWSIYGEF LIQKVKLFNKYDLWIYNPNEYTINYHKNNLLLGIESIELGIGKKIVGKELENFTYKKIPN LYSKTSGFNQITSLVLLHDYYCLISRWNNNLLKNSGKRQFLILLDQLGTGETIEKIEDKI SESSGADGVDKPIILTGFDEKSKIHNLDFSPRDFDYMEAVTEIRNITSNVLNVPDLLIGG 
KDNAKYNNMKKAKKALYTENIIPACEQIKSAIKRLFYKDLGVKEFIDFDISNIEVLKDEQ IDLINALNNSEIHTINEKREKLNLDRIDGADEILVKGLPNTLKDVLSGEIEPKESNVSEE DI >gi|261746661|gb|ADAD01000177.1| GENE 31 20612 - 21928 1358 438 aa, chain - ## HITS:1 COG:BS_yqaT KEGG:ns NR:ns ## COG: BS_yqaT COG1783 # Protein_GI_number: 16079672 # Func_class: R General function prediction only # Function: Phage terminase large subunit # Organism: Bacillus subtilis # 7 435 5 420 431 261 39.0 2e-69 MKVELDISEHFQKFIDEETSDIYFLIGSYGSGKSYNIATKLIVNSFREKRKVLGIRKVYR DVRDSVFTDLVDVISELELEGYFKIITGRLEIVNKITGSKFIFRGLDEVGRLKSIKGITD IWIEEANQCDRNDFKQLRYRLRTPNIKMHMYISTNPAEPDSASNWTYWFLTDYMNVTEET LYDVKEFIKSIEDKETGYIQRIYVNHSTYKENKFLPESAVAELNMETDPYLISIAKEGKF GYHGEFVYYNIESENNQYVDEQIKRIGIEWHIAGMDFGFSISYTAVVRAAIDYENNILYV YDEFYNKGLTNAQILQEDFLYDVAEEGIVIYADYAEPKTIQEFKANGVLMAKADKMTGNT LGRIGKIQSFNKVIIAKRCENTYRELKNLKYKKDENGVVVVGDKKKMFNFDPHTKDALDY ALSRYRQRDLKNRYKEKI >gi|261746661|gb|ADAD01000177.1| GENE 32 21931 - 22203 402 90 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039289|ref|ZP_06012606.1| ## NR: gi|262039289|ref|ZP_06012606.1| tRNA pseudouridine synthase A [Leptotrichia goodfellowii F0264] tRNA pseudouridine synthase A [Leptotrichia goodfellowii F0264] # 1 90 1 90 90 118 100.0 2e-25 MVKKNNRTEEIQEKKEIKKVAETEVTKEWLDKEIERGNFRIVKNYIFKDGGKKLQVDYKI ADRNYFEKDNPLIEYFKEKNQHQKFIILED >gi|261746661|gb|ADAD01000177.1| GENE 33 22208 - 22639 529 143 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039286|ref|ZP_06012603.1| ## NR: gi|262039286|ref|ZP_06012603.1| hypothetical protein HMPREF0554_0837 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0837 [Leptotrichia goodfellowii F0264] # 1 143 1 143 143 183 100.0 4e-45 MTKNNDEVTKSKNEDEVKKTEEVVVEETTKTIDNDRNKILEEFAKKYLDGLTYDKDTKLK ISGEEMEFGISTNNILYDADLTEEEFNKKVEFLHNKKSQDTVGVKFQYRVITRKEEKFVI KFKSDSYVVGAGRVGRTKEIMEV >gi|261746661|gb|ADAD01000177.1| GENE 34 22626 - 22742 115 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDIERLKNEDEVMKDTKVEFNFKTDKVKELEEKKNDKE >gi|261746661|gb|ADAD01000177.1| GENE 35 
22746 - 23363 637 205 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039307|ref|ZP_06012624.1| ## NR: gi|262039307|ref|ZP_06012624.1| putative pbsx phage terminase small subunit [Leptotrichia goodfellowii F0264] putative pbsx phage terminase small subunit [Leptotrichia goodfellowii F0264] # 1 205 1 205 205 269 100.0 1e-70 MTNEDVKALVKKEYENGAGVTELSQKYKLSINTVKSWRKRDNWKKKQRNAPSTNAPPKKK NAPKIKKGANEREIKIQQDILEGKTPKEVMEQNDISRTTYYRKSKNARQIRLERTEEHLK EIIDEVYPDLSEYLVKMSKAKKNTITRIINSTIETNIDDKQLNKLGKHLELVLKAERELL RTGKMLTSYELLEIDKQLNEEELQQ >gi|261746661|gb|ADAD01000177.1| GENE 36 23377 - 24648 1350 423 aa, chain - ## HITS:1 COG:SPy0679_2 KEGG:ns NR:ns ## COG: SPy0679_2 COG0863 # Protein_GI_number: 15674743 # Func_class: L Replication, recombination and repair # Function: DNA modification methylase # Organism: Streptococcus pyogenes M1 GAS # 190 419 2 243 246 117 33.0 5e-26 MRIVSLFSLVFPNFLYIEFKEERLGWWEIKKAGGNMKIEKIDIDKIIEYSGNVKEHPEWQ IEQIKNSIKEFGFNDPIAIDENGIIIEGHGRLIALKELGYKEVECIRLEHLTEEQKVAYA IVHNKLTMNTDFDIEMLKYEINKLELADFDLDILGFDEVELEEILFEDIENDTEDTEMID KKDEIDEDVEIKIKKGQIWKLGNHYLMCGDSTKKDNFEKLLKDIDINLCLTDPPYGINIV KNGKIGAENAAKTTEYKKVKGDETTETAQKSFELIKQYSEKVILFGGNYFTAFLPFSDGW IVWDKRKDMNSNNFADGELAWCNFHTPVRIYKQLWNGMIREGERGKRVHPTQKPIRMLGE ILQDFSKENDNILDVFGGSGSTLIACEETGRNCYMIEYEEHYCNVILKRWEDLTGEQGIL YKK >gi|261746661|gb|ADAD01000177.1| GENE 37 24748 - 25155 385 135 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039282|ref|ZP_06012599.1| ## NR: gi|262039282|ref|ZP_06012599.1| prolactin receptor [Leptotrichia goodfellowii F0264] prolactin receptor [Leptotrichia goodfellowii F0264] # 5 135 1 131 131 229 100.0 5e-59 MVIEMFGNKKTIQKYLINLKFNNYAGFLDFQGKRVFLNEDGRILAVFKNNYEILNIKDYN VSVKLPTGSEITFTSLVNGALPIASTYHYYFVFAKKESNKWERQYEFIDEEITNSFIEWL KYYRDIGILENYIEY >gi|261746661|gb|ADAD01000177.1| GENE 38 25352 - 25591 126 79 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVLNIILYLISIFLTVGISLIIILLFYRICKRYKYNFKYKKFNNIYEIIGFILGTAINLF IFYILIFLLIFFMRLITNI >gi|261746661|gb|ADAD01000177.1| GENE 39 25639 - 26430 800 
263 aa, chain - ## HITS:1 COG:no KEGG:Sterm_2849 NR:ns ## KEGG: Sterm_2849 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 1 253 1 258 263 219 47.0 9e-56 MAYVEVDIENGKIKWFWATNKPAKRIEEFEELLNNLPIEVIPQNTITLDQMKLCYALFRQ FGDEKGWTVGDTKEYFKETFSFAYEIGSFSLSPNKKDALTLEQATEFIQFIIEFSIQEDV NLYILDTKTRIKHHIREIVPDIQRYVIVCLRKRVCCVCGQIHDFENGKIVDLEHFDNVNR IGGYEQDDGLQTRFLSLCRQHHMEIHAAPKDGFLEKYHLQAVYLNPQLVYELLDIYPNHF KLFRKRLKEGYYDKIIIRKKKNE >gi|261746661|gb|ADAD01000177.1| GENE 40 26412 - 26582 185 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039273|ref|ZP_06012590.1| ## NR: gi|262039273|ref|ZP_06012590.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 56 1 56 56 82 100.0 1e-14 MRKETKRTFELVRKVLLTGFEITEDDGEFLRRNPELFANFKFKKVRRADPKWRMLR >gi|261746661|gb|ADAD01000177.1| GENE 41 26625 - 27254 561 209 aa, chain - ## HITS:1 COG:no KEGG:Sterm_0817 NR:ns ## KEGG: Sterm_0817 # Name: not_defined # Def: phage regulatory protein, Rha family # Organism: S.termitidis # Pathway: not_defined # 5 128 1 121 241 103 49.0 4e-21 MNELMNLGQKKTMTSLEVAEITGKRHDNILADIRDEINKLGLERAVLIFQEGYYLDKNNQ TRPKFDLNYRGILQLGARYSAETRFKLIQKAEELSNQDKPMTIEDMIILQANEMKTVKTK VDILENKIDNEIRIDSGEQRKLQRAVSVRVYQRLEVINSDSKYLFPAIYRDLKDRFGIAS YRDLKRKDLKEALAYIQNWIEKADLREAN >gi|261746661|gb|ADAD01000177.1| GENE 42 27313 - 27672 390 119 aa, chain - ## HITS:1 COG:no KEGG:Ilyop_2060 NR:ns ## KEGG: Ilyop_2060 # Name: not_defined # Def: hypothetical protein # Organism: I.polytropus # Pathway: not_defined # 2 119 6 123 126 162 68.0 3e-39 MIFISGNVPSSKNSKQWTGKYLISSKTVRNYIKKHCDEWCKNTEKFKEMIKGKEKPYKVG FYFIRDSKRKFDYINAAQLPLDLMQDYDWIEDDDSSNIIPVFLGYEVDKRNAGVRIEIL >gi|261746661|gb|ADAD01000177.1| GENE 43 27688 - 27996 228 102 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0946 NR:ns ## KEGG: Lebu_0946 # Name: not_defined # Def: helicase # Organism: L.buccalis # Pathway: not_defined # 1 94 407 501 2131 67 38.0 1e-10 
MITNKNIDEVLKKQYFQEIKDFYKRNNSAEERISYIRDLYGHSGWGIPDKGDFIDGMHCD SSGIEFIKTNSNWKQETMKMKWNKVAERLQKIIENESQLSLF >gi|261746661|gb|ADAD01000177.1| GENE 44 28002 - 28175 150 57 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDKNQIDKKIKELQKENQKKAKKREKIADEDKKLHLEIMKNISEIEKLKKLKNKSES >gi|261746661|gb|ADAD01000177.1| GENE 45 28175 - 28924 535 249 aa, chain - ## HITS:1 COG:CAC1933 KEGG:ns NR:ns ## COG: CAC1933 COG1484 # Protein_GI_number: 15895206 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Clostridium acetobutylicum # 65 248 92 277 282 69 29.0 6e-12 MEEINRIADRYKTNIQKNQTIQTVSKTGINDIVETVDITEIKEKEKIIEYRRYSKITVDD MKAVFQNEKIIHPTEKEYSLSFQRFCKNIEKIKEKGLGILMFGNPGTGKSFYSNCIMNTL NEKYKVYRTSLYNILEEIRQTYKKNANDDDTFIFNRLGIADLTIFDDLGNEFITDWGKEK MFLIFEFLYKYRKSFIINTNLDEKQLSTFLKINGSRKLIDRIRERCKKYEFNWGSRRGEL HKKDFEELY >gi|261746661|gb|ADAD01000177.1| GENE 46 28836 - 29066 128 76 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTEGAVKRMLKRIEKMTGGNNENYAIKLLNKSEDKCWLDIYETKEKENGRNKQNSRQIQD KHTEEPDYTNSFKDWN Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:48:27 2011 Seq name: gi|261746659|gb|ADAD01000178.1| Leptotrichia goodfellowii F0264 contig00215, whole genome shotgun sequence Length of sequence - 316 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 314 494 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins

Predicted protein(s)
>gi|261746659|gb|ADAD01000178.1| GENE 1 2 - 314 494 104 aa, chain - ## HITS:1 COG:FN2048 KEGG:ns NR:ns ## COG: FN2048 COG2885 # Protein_GI_number: 19705338 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Fusobacterium nucleatum # 4 104 22 122 151 140 67.0 8e-34
TKVKGPKEVTVVLDERSLLFDFDKSNVKPQYYGILQNLKEYIEVNDYDVTIIGHTDSKGT
NEYNMKLGMRRAESVKAKMIEFGVPASRIVGVESRGEEEPVATN

Prediction of potential genes in microbial genomes
Time: Thu Jul 14 03:48:30 2011
Seq name: gi|261746647|gb|ADAD01000179.1| Leptotrichia goodfellowii F0264 contig00177, whole genome shotgun sequence
Length of sequence - 9749 bp
Number of predicted genes - 10, with homology - 10
Number of transcription units - 6, operones - 3 average op.length - 2.3

N Tu/Op Conserved S Start End Score pairs(N/Pv)
+ Prom 83 - 142 6.4
1 1 Op 1 . + CDS 291 - 1166 569 ## gi|262039328|ref|ZP_06012643.1| putative dynein beta chain
2 1 Op 2 . + CDS 1126 - 1905 509 ## gi|262039321|ref|ZP_06012636.1| normocyte-binding protein 1
3 1 Op 3 . + CDS 1938 - 3203 1644 ## COG0151 Phosphoribosylamine-glycine ligase
4 2 Op 1 1/0.000 - CDS 3324 - 4196 1237 ## COG3007 Uncharacterized paraquat-inducible protein B
5 2 Op 2 . - CDS 4211 - 4516 475 ## COG3007 Uncharacterized paraquat-inducible protein B
- Prom 4540 - 4599 11.2
+ Prom 4517 - 4576 11.5
6 3 Tu 1 . + CDS 4803 - 6014 2033 ## FN1449 hypothetical protein
+ Prom 6070 - 6129 8.8
7 4 Op 1 1/0.000 + CDS 6170 - 6358 263 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase
8 4 Op 2 . + CDS 6420 - 7115 1016 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase
9 5 Tu 1 . - CDS 7588 - 8487 1027 ## COG0628 Predicted permease
- Prom 8596 - 8655 6.0
+ Prom 8886 - 8945 18.2
10 6 Tu 1 .
+ CDS 9003 - 9747 875 ## Lebu_0816 hypothetical protein Predicted protein(s) >gi|261746647|gb|ADAD01000179.1| GENE 1 291 - 1166 569 291 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039328|ref|ZP_06012643.1| ## NR: gi|262039328|ref|ZP_06012643.1| putative dynein beta chain [Leptotrichia goodfellowii F0264] putative dynein beta chain [Leptotrichia goodfellowii F0264] # 1 291 1 291 291 441 100.0 1e-122 MEISNKKYQGFSSISKRIFEKNLLQFLCLYNNRSEFKEYGKILNILEEENIKNRIFITQK EKNDIINKIQDKIYYKLMNEMIYLYRGNLLIFLLSYIEHNRRFTTELINSINGSVPTLNK NNFTNIIINELNKSAKKNEIITGIVNEFLGMEKEYLENGIDLLIKENSFTISDVEITEKE YKDFLSYIYVIWYVRYIHISFTIGPDKNKELKFLDKDVNISKYNSLYRNDFITNQIENLV SYKDNFFQEDLEKINNLIEKYFNFNLADIDLLISKLKKDENDIFFEDKVIF >gi|261746647|gb|ADAD01000179.1| GENE 2 1126 - 1905 509 259 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039321|ref|ZP_06012636.1| ## NR: gi|262039321|ref|ZP_06012636.1| normocyte-binding protein 1 [Leptotrichia goodfellowii F0264] normocyte-binding protein 1 [Leptotrichia goodfellowii F0264] # 18 259 1 242 242 384 100.0 1e-105 MKMIYFLKIRLFFKERIMKLAECDENRADSIIRYFKYVSQKRNYGSFQSRKNNKILKKPI IEIYEKYIVIKKLLIYTLELFLLNPKTIESSDNEFKKQLEYLSQRKNNQFEEEVFNEIKK YFSTSEIIVKKDIGEKDIKGVVPPGQIDILCLYRKNIIVIECKNLDLKTNHRETINQIKK IRGEFEKKLVEKVEYIKNNKSKVLNYLENREVENYSEYNVEGVIVFKNYTVELATEEKVK RPISRITWEKLCDWIKEKG >gi|261746647|gb|ADAD01000179.1| GENE 3 1938 - 3203 1644 421 aa, chain + ## HITS:1 COG:FN0981 KEGG:ns NR:ns ## COG: FN0981 COG0151 # Protein_GI_number: 19704316 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylamine-glycine ligase # Organism: Fusobacterium nucleatum # 1 420 1 418 426 471 58.0 1e-132 MKILIVGAGGREHAIAWKLSQNDKVKKIYIAPGNVGSELINIAENVELSNIEEYISFAKK NSIDLTIVGSEELLVQGIVDEFRKNNLKIFGPDKKAALLEGSKAYAKDFMKKYNIKTATY EIFDKYEDAINYLDNYGEKNFPVVVKASGLAAGKGVIIAENLKEAKEAVTDIMVNKKFAS SGNEIVIEEYLEGKEASILSFTDCNVIIPLLSAKDHKKIGENETGLNTGGMGVISPNPYV TDEVFEEFQKKIMKPTLEGLKSENMDFEGVIFFGLMITKKGVYLLEYNMRLGDPETQAVL PLLETDLMDIIENSFEKRLSEVKVEWKPLYSCCVVGVAGGYPESYNKGDVIKGIKEFEEN 
ERNILFIAGAKKENENFVTSGGRVLNVVCLGDSPEEARKSAYDRIDKIKFNNIYYRKDIG K >gi|261746647|gb|ADAD01000179.1| GENE 4 3324 - 4196 1237 290 aa, chain - ## HITS:1 COG:CAC0462 KEGG:ns NR:ns ## COG: CAC0462 COG3007 # Protein_GI_number: 15893753 # Func_class: S Function unknown # Function: Uncharacterized paraquat-inducible protein B # Organism: Clostridium acetobutylicum # 3 290 111 398 398 283 51.0 2e-76 MGDAFSNEMKEDVIKYIKEEFGGKIDLLIYSLASAVRTDPKDGVTYRSALKSTEKEIVGP SINLEKEEIEETVMGVATPEEIHSTVKVMGGEDWKLWVEALDEAGVIDKGFKTVAYSYLG PKVTYGIYKDGTIGAAKRDLEHTSDTLNDFLKKKYNGEAYVSLSKALMTRASAVIPIFPL YAALLYRVMKEKGLHEGTIEQKHRLLKDMVYGNKPEIDSERRLRPDNWEMREDVQAEVEA LWDKVTPENFKEISDYKGAREEFMNLSGFGFDNVDYDTDIDLDELAKLQP >gi|261746647|gb|ADAD01000179.1| GENE 5 4211 - 4516 475 101 aa, chain - ## HITS:1 COG:YPO4104 KEGG:ns NR:ns ## COG: YPO4104 COG3007 # Protein_GI_number: 16124212 # Func_class: S Function unknown # Function: Uncharacterized paraquat-inducible protein B # Organism: Yersinia pestis # 1 75 1 76 399 89 55.0 2e-18 MVIKPRLKGSLALVNHPMGAYEFVKRQIDYVKSQDKYTGPKKVLIIGASSGYGLASRISL AFGAGADTIGVAYEKVLKEKEQEVQVGGILSLSMKLQKKKD >gi|261746647|gb|ADAD01000179.1| GENE 6 4803 - 6014 2033 403 aa, chain + ## HITS:1 COG:no KEGG:FN1449 NR:ns ## KEGG: FN1449 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 403 2764 3165 3165 524 69.0 1e-147 MTSLAGSLTWVATPVLDNHGQIKGVAMAKLAYTSFVRTTDNAYNFTDGLEQRYDMNTLDS AEKRIFNKLNGIGKNEETLLTQAFDEMMGHQYANVQQRTASTGRALDKEITHLRKEWDNK TKNSNKIKVFGMKDEYNTDTAGIIDYTSNAYGVAYVHENETIKLGNSSGWYAGAVNNTFK FKDIGRSKENTTMLKLGAFKSTAFDHNGSLNWTVSGEGYIARSDMHRKFLVVDEIFNAKS SYNTYGAAIKNEISKEFRTGEKFSIRPYGSLKLEYGRFNTIKEKDGEMRLEIKGNDYYSV KPEVGIEFKYKQPMAVKTMFTTTLGLGYENELGKVGDTKNKGRVAYTDADWFDIRGEKDD RKGNFKADLNIGIENQRFGVTLNGGYDTKGKNIRGGLGIRVIY >gi|261746647|gb|ADAD01000179.1| GENE 7 6170 - 6358 263 62 aa, chain + ## HITS:1 COG:SA2078 KEGG:ns NR:ns ## COG: SA2078 COG1957 # Protein_GI_number: 15927863 # Func_class: F Nucleotide transport and metabolism # Function: 
Inosine-uridine nucleoside N-ribohydrolase # Organism: Staphylococcus aureus N315 # 4 60 3 59 313 80 56.0 5e-16 MKKRRIYLNHDSGVDDLISLYLLLKMEDAELVGVSVMDADCYIQPAASATRKIIEKFGSE KR >gi|261746647|gb|ADAD01000179.1| GENE 8 6420 - 7115 1016 231 aa, chain + ## HITS:1 COG:SA2078 KEGG:ns NR:ns ## COG: SA2078 COG1957 # Protein_GI_number: 15927863 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Staphylococcus aureus N315 # 1 231 80 311 313 286 60.0 3e-77 MHAFVVDALPILNEKKPLTVEESNIKAHEDIIRVLKESDEKTDLVFIGPLTDLARALDKD SSIEEKIGKLYWMGGTFLEHGNVEEPEHDGTAEWNVYWDPFAAKRVWESNIKIELVALES TRMVPLTAEVRDMWASQRKYEGVDFLGQCYATVPPLTHFVTNSTYFLWDVLTTASFGKEN LVKREVVNSDVITEGPSRGRTVRKENGRPVDLVYHVDRDSFFEYITDLAKQ >gi|261746647|gb|ADAD01000179.1| GENE 9 7588 - 8487 1027 299 aa, chain - ## HITS:1 COG:BH0463 KEGG:ns NR:ns ## COG: BH0463 COG0628 # Protein_GI_number: 15613026 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Bacillus halodurans # 41 254 163 371 372 112 31.0 1e-24 MTIDQSHIYLGYLPKGFTEQIKNSVAGFISMGSSKFGRFINGLIYFVTSIPTIIIYICIT VLSTFFISLDKKEILNFLEQQLPESWLRKVYNIKREMFTVLGSYVKAQIILMTICFFELL ISFNILFFLEFNVSYPLLISIVICLIDALPILGAGAVLLPWAGISFILGDIKLGIALILI YLLVLSVRQMLEPKLISQNIGVHPLVTLISMYSGFKFFGVMGFLIGPVVMIILKNVFSKE LEIGFFREIFSDTSEDSDDSASGNSKDNSPKENNNPIIDNMPENLKKLMEDEFIDDAKC >gi|261746647|gb|ADAD01000179.1| GENE 10 9003 - 9747 875 248 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0816 NR:ns ## KEGG: Lebu_0816 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 9 248 6 247 422 227 61.0 3e-58 MLKAIPEMKNKKFQLLLLFFVSALSYGATVDDVISEYEQKSYTTKINQANLRTYDIKDKA LKKGDWNTIKLTSENSYEKNKTYDGVSMENTASFGMLYYKNGYNFTNKEFTENKVGVSKT LNDFFYSDNKYNTNINNISRDIQKISNETNKNTEIRNLIDLYKNYKNKEKELEQSKISLQ GKKKDYDILARKYQLGTASKFDYDLAKNEYETTDLKKQNTERELKILGENFMIYNVTLPK KEKLEDLK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:49:10 2011 Seq name: gi|261746643|gb|ADAD01000180.1| 
Leptotrichia goodfellowii F0264 contig00150, whole genome shotgun sequence Length of sequence - 2531 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 1412 1517 ## gi|262039332|ref|ZP_06012646.1| hypothetical protein HMPREF0554_2155 + Prom 1446 - 1505 2.9 2 2 Tu 1 . + CDS 1536 - 1628 65 ## + Prom 1649 - 1708 3.5 3 3 Tu 1 . + CDS 1818 - 2348 581 ## gi|262039331|ref|ZP_06012645.1| putative liporotein + Term 2492 - 2529 5.0 Predicted protein(s) >gi|261746643|gb|ADAD01000180.1| GENE 1 3 - 1412 1517 469 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039332|ref|ZP_06012646.1| ## NR: gi|262039332|ref|ZP_06012646.1| hypothetical protein HMPREF0554_2155 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2155 [Leptotrichia goodfellowii F0264] # 1 469 1 469 469 903 100.0 0 GIIGKFADDVNEKLLNNEPKNIQGEWTEKVYKTIRRIEDFVDKKLYNESLALIPTSGQHG GILDAFKYFGKKDQLDIYHIVVQQDENGKRIITGQKLKDPSQIKTKEIFVNGMSEELEGS INNVINQLSSAADRKRIDNGGKFEVTLIYNKTRGYYPDVVESVIGKMADGSKTNFRMTTG VSRGLSDILTRLDSSREYNITTYSQANIIMLGALNYLNNKGKTLNGNITLYHTGTPIAPK AFDALTGKLNFINGVSMINTLDPVGSERGKLMGLKGIVSETRDFDQKPRTKRDALSPKIA FIGPSMLRDQVYPMLTLKKKEKYTPEEISSISQAHGFKKYKQFTGYAMNYHGRYMYEDRE FVARELEQLNSGKSMDEIGKEIDKIRAERQNVMLDKMKYGPVVAPEQIIDFRTNILFDEF IENLGAQWRERERIPENRRAGDLHINGNTKRSATQSIDDMVANLRRRVR >gi|261746643|gb|ADAD01000180.1| GENE 2 1536 - 1628 65 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFKNRCLLSEIDKIRAKRQKVISDKMKVAL >gi|261746643|gb|ADAD01000180.1| GENE 3 1818 - 2348 581 176 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039331|ref|ZP_06012645.1| ## NR: gi|262039331|ref|ZP_06012645.1| putative liporotein [Leptotrichia goodfellowii F0264] putative liporotein [Leptotrichia goodfellowii F0264] # 1 176 1 176 176 286 100.0 6e-76 MKKNIILILSLIILSSCITDMSQLTGPIPKKGYSDYGAEFTEKDILERRKVVAQISGYNI KMPENMIFKYGKSATTDELFGKKRKYLYDEVTKTGTSVFLIEESIEKMKEYEKKRGGSYI 
FSEERKRYNIRISILEGVSVMIEIKPGLYIACEAEEIGSMKAVPVCEAIVKVMKSQ Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:49:57 2011 Seq name: gi|261746595|gb|ADAD01000181.1| Leptotrichia goodfellowii F0264 contig00130, whole genome shotgun sequence Length of sequence - 46606 bp Number of predicted genes - 49, with homology - 46 Number of transcription units - 16, operones - 13 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 3 - 1067 1264 ## COG2837 Predicted iron-dependent peroxidase 2 1 Op 2 . + CDS 1045 - 2733 2203 ## COG0672 High-affinity Fe2+/Pb2+ permease 3 1 Op 3 . + CDS 2756 - 3466 822 ## COG0805 Sec-independent protein secretion pathway component TatC 4 1 Op 4 . + CDS 3470 - 3673 435 ## Lebu_1816 sec-independent translocation protein MttA/Hcf106 + Prom 3680 - 3739 14.3 5 2 Op 1 2/0.000 + CDS 3795 - 5021 1881 ## COG1840 ABC-type Fe3+ transport system, periplasmic component + Prom 5057 - 5116 9.6 6 2 Op 2 4/0.000 + CDS 5158 - 5862 901 ## COG0378 Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase 7 2 Op 3 . + CDS 5852 - 6868 1382 ## COG1136 ABC-type antimicrobial peptide transport system, ATPase component + Term 7071 - 7105 3.6 + Prom 6870 - 6929 6.5 8 3 Tu 1 . + CDS 7121 - 7642 548 ## Clocel_3820 hypothetical protein + Term 7733 - 7774 -0.9 - Term 7768 - 7798 1.9 9 4 Tu 1 . - CDS 7807 - 8478 409 ## COG3619 Predicted membrane protein - Prom 8692 - 8751 8.7 - Term 9137 - 9168 -0.7 10 5 Op 1 . - CDS 9175 - 9648 334 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases 11 5 Op 2 . - CDS 9678 - 9842 63 ## 12 5 Op 3 . 
- CDS 9864 - 10595 675 ## COG1811 Uncharacterized membrane protein, possible Na+ channel or pump - Prom 10640 - 10699 12.1 + Prom 10668 - 10727 7.7 13 6 Op 1 5/0.000 + CDS 10817 - 11167 532 ## PROTEIN SUPPORTED gi|229885453|ref|ZP_04504902.1| LSU ribosomal protein L19P + Term 11174 - 11204 1.1 14 6 Op 2 1/0.000 + CDS 11219 - 12769 2058 ## COG0681 Signal peptidase I 15 6 Op 3 . + CDS 12788 - 13246 514 ## COG0456 Acetyltransferases 16 6 Op 4 . + CDS 13317 - 15968 3803 ## COG0525 Valyl-tRNA synthetase 17 6 Op 5 . + CDS 16043 - 16159 196 ## 18 6 Op 6 26/0.000 + CDS 16188 - 16613 561 ## COG1585 Membrane protein implicated in regulation of membrane protease activity 19 6 Op 7 . + CDS 16641 - 17561 1349 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs + Term 17589 - 17628 2.7 20 7 Op 1 . - CDS 17641 - 18393 1054 ## COG1878 Predicted metal-dependent hydrolase 21 7 Op 2 . - CDS 18338 - 18448 94 ## - Prom 18468 - 18527 9.0 + Prom 18409 - 18468 13.7 22 8 Op 1 . + CDS 18582 - 19325 1078 ## COG1212 CMP-2-keto-3-deoxyoctulosonic acid synthetase 23 8 Op 2 . + CDS 19372 - 20823 2260 ## COG0366 Glycosidases + Prom 20830 - 20889 3.2 24 9 Op 1 . + CDS 20979 - 22127 1468 ## COG0263 Glutamate 5-kinase 25 9 Op 2 . + CDS 22169 - 22966 1116 ## FN0994 hypothetical protein + Prom 22992 - 23051 1.6 26 10 Op 1 . + CDS 23089 - 23679 580 ## FN0995 hypothetical protein 27 10 Op 2 2/0.000 + CDS 23721 - 24977 1812 ## COG0014 Gamma-glutamyl phosphate reductase 28 10 Op 3 . + CDS 25039 - 25845 1035 ## COG0345 Pyrroline-5-carboxylate reductase + Term 25889 - 25925 -1.0 + Prom 26069 - 26128 9.2 29 11 Op 1 11/0.000 + CDS 26154 - 27191 1835 ## COG4213 ABC-type xylose transport system, periplasmic component 30 11 Op 2 11/0.000 + CDS 27210 - 28760 188 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 31 11 Op 3 . 
+ CDS 28729 - 29874 1747 ## COG4214 ABC-type xylose transport system, permease component 32 11 Op 4 11/0.000 + CDS 29905 - 31227 2028 ## COG2115 Xylose isomerase + Prom 31328 - 31387 5.2 33 11 Op 5 . + CDS 31446 - 32918 1985 ## COG1070 Sugar (pentulose and hexulose) kinases 34 11 Op 6 2/0.000 + CDS 32931 - 34394 1856 ## COG1757 Na+/H+ antiporter 35 11 Op 7 . + CDS 34387 - 35562 1231 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities + Prom 35569 - 35628 3.9 36 12 Op 1 2/0.000 + CDS 35652 - 37223 1866 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 37237 - 37278 0.6 + Prom 37225 - 37284 8.1 37 12 Op 2 . + CDS 37366 - 38511 210 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase + Term 38518 - 38580 8.9 + Prom 38513 - 38572 10.3 38 13 Op 1 . + CDS 38613 - 38906 397 ## Lebu_1678 hypothetical protein 39 13 Op 2 . + CDS 38890 - 39159 188 ## Lebu_1677 plasmid stabilization system + Term 39167 - 39200 1.1 + Prom 39179 - 39238 9.4 40 14 Op 1 . + CDS 39285 - 39689 521 ## Lebu_0620 hypothetical protein 41 14 Op 2 . + CDS 39693 - 40151 452 ## Lebu_0621 peptidase A24A prepilin type IV 42 14 Op 3 . + CDS 40167 - 40595 374 ## Lebu_0622 hypothetical protein 43 14 Op 4 . + CDS 40604 - 41074 546 ## Lebu_0623 hypothetical protein 44 14 Op 5 . + CDS 41090 - 41653 621 ## Lebu_0624 hypothetical protein 45 14 Op 6 . + CDS 41676 - 42851 1290 ## Lebu_0625 hypothetical protein 46 14 Op 7 . + CDS 42848 - 43399 619 ## Lebu_0626 hypothetical protein + Term 43446 - 43485 -0.5 + Prom 43554 - 43613 4.5 47 15 Tu 1 . + CDS 43725 - 44882 1376 ## COG1450 Type II secretory pathway, component PulD + Prom 44899 - 44958 10.5 48 16 Op 1 . + CDS 44995 - 45993 1454 ## COG2870 ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase 49 16 Op 2 . 
+ CDS 46025 - 46606 197 ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 Predicted protein(s) >gi|261746595|gb|ADAD01000181.1| GENE 1 3 - 1067 1264 354 aa, chain + ## HITS:1 COG:SA0332 KEGG:ns NR:ns ## COG: SA0332 COG2837 # Protein_GI_number: 15926045 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted iron-dependent peroxidase # Organism: Staphylococcus aureus N315 # 2 354 21 371 371 376 51.0 1e-104 SFYGEHQSGIATPVQSNVYFAVLDLHSENKEEIIQMFKEWTAYTEKLMKGELVSPELKNH LLPPVDTGEALGLNPYRLTVTFGVSPAFLDKLKLDSKKIEEFKDLPHFPRDQIKEQYKGG DICIQACADDPQVAFHAVRNLVRKGRTLITLKWSQAGFLPIGNRKETPRNLFGFKDGTEN PKNENDFKNVVWYDKNNWLKDGTFLIVRRIQMHLETWDRTNLQEQENTFGRYKESGAPFG EKDEFAVIDINKKGPDGKPILPIDSHVFLAKKAEAKICRRAFSYSDGINEISGQFDAGLL FICFQKHPEQFIKIQNSLGSEDKLNEYITHIGTGIFACFGGVKKGEYIGQKLFE >gi|261746595|gb|ADAD01000181.1| GENE 2 1045 - 2733 2203 562 aa, chain + ## HITS:1 COG:SA0333 KEGG:ns NR:ns ## COG: SA0333 COG0672 # Protein_GI_number: 15926046 # Func_class: P Inorganic ion transport and metabolism # Function: High-affinity Fe2+/Pb2+ permease # Organism: Staphylococcus aureus N315 # 1 561 1 571 571 280 31.0 7e-75 MGKNYLSKIGIILLICFTFFSANIIAKESYSELYIKITDATTALKNNDKKQTEKLIAEIR ESFKSAKNADSPQGKKVQESLAKIGEITENDLREITKALLAFEKEQNPVDEVQVKKDFKA KVYPALDTLEKVIKSKDVEAMKKEYLKYSGVWTRNEAVIRDRDTAYYGKVETAMAFLRSS MEMEPFDYDNAISSFNNLKTVIQEFLDGKKIEQVSTEGITLKQAIELLKEGLEAFKSGDK SAGQSKVRKFIEIWPTVEGDVSTRNASLYTKVETETPVIMVKGGEKQYQEKLQNLITELS QIDTKAEYTFVDAMFILLREGVEALLIVLALVSGLKAANQKKGLKWVYAGAVAGILASIV IAVVLQKLFPAVSSGTNREIIEGFVGIFAVIMMIGIGVWLHSKSSLKAWKDYMDRKMNVV LSTGSFISMFALSFLAVFREGAETILFYAGILPLISVQNLITGISAAVIILIIIALALTY ASSKIKVHRVFFILTWMIYFLAFKMLGVSIHMLQVVGVIPLHVIHFIPTVEILGIYANVE VFISQLILILIIAGITLKRKKK >gi|261746595|gb|ADAD01000181.1| GENE 3 2756 - 3466 822 236 aa, chain + ## HITS:1 COG:BS_ycbT KEGG:ns NR:ns ## COG: BS_ycbT COG0805 # Protein_GI_number: 16077333 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Sec-independent protein 
secretion pathway component TatC # Organism: Bacillus subtilis # 3 234 2 236 245 177 41.0 1e-44 MSEEKNQSLIEHLGEFRKRLIMTIIFFLFAFVASFVFCADIYRLLTYPFSKKLLVLGPDE VLGIYVTLAGICALSFTLPFASYQLWAFIKPGLKEKEAKMILTYIPATFILFVGGLAFGF FVITPALLNILLSIGEDLFNVQVTARNYLEFVLHTSLPIAVVFELPVIAAFLTSLHILTP MFLTKNRRYGYFILLVLAVVLTPADFISDLAMTVPLVLIYEISISVSKYIYKKREG >gi|261746595|gb|ADAD01000181.1| GENE 4 3470 - 3673 435 67 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1816 NR:ns ## KEGG: Lebu_1816 # Name: not_defined # Def: sec-independent translocation protein MttA/Hcf106 # Organism: L.buccalis # Pathway: Protein export [PATH:lba03060]; Bacterial secretion system [PATH:lba03070] # 1 65 1 66 66 84 74.0 1e-15 MGIFRDIGTPGLIVIILGALLIFGPKRLPELGEAVGKMFKEFKKSMSDITTDGDEKNSEK KESDKSE >gi|261746595|gb|ADAD01000181.1| GENE 5 3795 - 5021 1881 408 aa, chain + ## HITS:1 COG:FN0128 KEGG:ns NR:ns ## COG: FN0128 COG1840 # Protein_GI_number: 19703473 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, periplasmic component # Organism: Fusobacterium nucleatum # 95 404 8 311 314 291 46.0 2e-78 MNKFFSLNDTLYQIVQKYPEALDFLIANGFDQLKNKQMFESMAKNINLNMALKAKKINPE LFEEKLVTFLEKDSETDISLEGRKEDSTGDILIEGVLPCPIRIPLLEGIKGWVNKRNDEV DYKIAYELQSANLGLDWVVDKVKTGDPDKVSDILLSAGFELFFDKELMGQYMEKGIFETY LDEMNKDFCNDKIDLRDPKKQYAIMGVVPAVFLINKTVLGDRKPPQTWEDILSEEFEDSV ALPMNDLDLFNALIINIYKEHGSEGIQKLARSYKKNLHPAQMVKAKGRSGDAPAVSIIPY FFTQMLMGATDLEAVWPKDGALLSPIFMITKKAKKDKIQPFIDFFMSEEIGTLFSANGKF PSTNPNVDNHLTEDQKFKWIGWDFIHGNDVGGLVRKCEDEFNEAIMKL >gi|261746595|gb|ADAD01000181.1| GENE 6 5158 - 5862 901 234 aa, chain + ## HITS:1 COG:FN0129 KEGG:ns NR:ns ## COG: FN0129 COG0378 # Protein_GI_number: 19703474 # Func_class: O Posttranslational modification, protein turnover, chaperones; K Transcription # Function: Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase # Organism: Fusobacterium nucleatum # 1 228 1 228 231 366 75.0 1e-101 
MNLVIVSGPPSSGKTSVILKTVDALKKQGMSVGVVKFDCLYTDDDILYEKAGIPVKKGIS GALCPDHFFVSNIEEVVQWGRSLGLSMLITESAGLCNRCSPYIKDIKGICVIDNLSGINT PKKIGPMLKAADIVIITKGDIVSQAEREVFASRVNSVNPTAITMHVNGLTGQGAYELSTL LYEKEKEIESVQGKKLRFPMPSALCSYCLGETRIGEKYQMGNVRKMNLGGKDEK >gi|261746595|gb|ADAD01000181.1| GENE 7 5852 - 6868 1382 338 aa, chain + ## HITS:1 COG:FN0130 KEGG:ns NR:ns ## COG: FN0130 COG1136 # Protein_GI_number: 19703475 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, ATPase component # Organism: Fusobacterium nucleatum # 77 333 4 259 268 310 65.0 3e-84 MKNKLVDEFKISELILKYPFAENFFNENAININGHEDKTFSEFLERFTEEEIEDMALDTE KIKEGLVEYIEQMKLFLGMEEDKGVEVLTIIAGQDKSGNPEGFNRLDIHKSEIISIVGPT GSGKSRLLADIEWTAQKDTPTQREILINEELPDKKWRFSSNNKLVAQLSQNMNFVMDLSV KDFLELHAKSRMIENVEEVTDKIIQEANKLAGEQFNLDTPVTALSGGQSRALMIADTAIL SSSPIVLIDEIENAGIDRKKALNLLVSADKIVLMATHDPTLALIADKRIIIKNGGIAKII ETSSEEKKILLKLEEMDNTIQQMRTALRNGEILSDSNF >gi|261746595|gb|ADAD01000181.1| GENE 8 7121 - 7642 548 173 aa, chain + ## HITS:1 COG:no KEGG:Clocel_3820 NR:ns ## KEGG: Clocel_3820 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulovorans # Pathway: not_defined # 1 173 35 208 244 81 37.0 2e-14 MKLYEIDENYINYLKFFDSKVLNHSGSNYKKTRKYIGLLLRVNDCDYIAPLSSPNKKSDY DQNGKIKKSNNFIIRIVASQSKRNILLGTIKISNMIPVFDRSVIKYYDIHKEQDEYYKLL IIRELRFIYVNKEKINRTASRLYNQKINNMSMDYIKHTVDFELLEEKAKLYKK >gi|261746595|gb|ADAD01000181.1| GENE 9 7807 - 8478 409 223 aa, chain - ## HITS:1 COG:lin0467 KEGG:ns NR:ns ## COG: lin0467 COG3619 # Protein_GI_number: 16799543 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 7 221 3 215 221 169 45.0 4e-42 MWGYNKQKKQTSDTFRLGLLLCLVGGFTDAYTFIMRGKVLANAQTGNMVFFALRLTERRW REALFYFLPIFAFALGILIAEFIKRKFKHDKIHWRQIVVLMEIVVLFTSSFVPIGELNVF VNIAISFVCSLQVQGFRKINGNISATTMCTGNLRSGTENLFHYLITKDKEFKHKFMTYFG LIFFFMTGAVTGSFFSEILKEKAPLVCCLILFSAFIIMFKDEI >gi|261746595|gb|ADAD01000181.1| GENE 10 9175 - 9648 334 157 aa, chain - ## HITS:1 COG:FN1901 KEGG:ns NR:ns 
## COG: FN1901 COG0664 # Protein_GI_number: 19705206 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Fusobacterium nucleatum # 6 150 66 209 217 100 41.0 1e-21 MQKFGGDTITIDFIKANEIIAPAFIFGNMRVFPVDLISVENSKLIFLNKKDFVEVMQEDK RLLLNFLDEISNKSQLLSKRIWFNFINKTINDKVLSYIRENQKDNIIIFKPNISELSKKF EVTRPSLSREISILCDKGILIKLQNNKYKVDFSKIEI >gi|261746595|gb|ADAD01000181.1| GENE 11 9678 - 9842 63 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKPEELYKKVSEISLFKGFNIHETENILKKINFQTKNYKKMNPSFSEEIKSSIL >gi|261746595|gb|ADAD01000181.1| GENE 12 9864 - 10595 675 243 aa, chain - ## HITS:1 COG:AF1248 KEGG:ns NR:ns ## COG: AF1248 COG1811 # Protein_GI_number: 11498847 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, possible Na+ channel or pump # Organism: Archaeoglobus fulgidus # 2 241 4 223 228 132 34.0 8e-31 MGIFTDFLAIIIGSLLGLLIGKKFKENMRNIIMDCIGLFIIISGIKSTLNSNRDITVLIY LITGSIIGQIIDIDLQIKKLSTFVENKFSEISSVSYKNKNLVDSVENKEEYSNFAKGLSV TTILYCVGAMAIVGPINSGLTGDNKIMNIKAILDGITGIVFASIYGIGVIFSAISAFLYQ GTIYLFARQIKSFLTPNAISDLNFLGGIMICALGINVVLKKNIKVANMIPAIFIPIIVEI FVK >gi|261746595|gb|ADAD01000181.1| GENE 13 10817 - 11167 532 116 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229885453|ref|ZP_04504902.1| LSU ribosomal protein L19P [Sebaldella termitidis ATCC 33386] # 1 116 1 116 116 209 90 2e-53 MKEKLIELVEKEYLKSDIPQFKSGDTVAVHYKVKEGNKERIQVFEGVVIRVSGGGMAKNF TVRKVSSGIGVERILPINSPLIEKIEVKRIGKVRRSRLYYLRNLSGKAARIKEIRK >gi|261746595|gb|ADAD01000181.1| GENE 14 11219 - 12769 2058 516 aa, chain + ## HITS:1 COG:FN0370 KEGG:ns NR:ns ## COG: FN0370 COG0681 # Protein_GI_number: 19703712 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Fusobacterium nucleatum # 206 494 59 281 286 127 33.0 6e-29 MNIILWGIFYLIMTVILLYFFIREKQVVNFIRKKEDELLKKIPVGDKGEEKRIKIGNIIT 
VIGILVSVAFYFLVDKEPDPVIRIKIYGIYGMFVLNLSVLVMREQHEWAFLANLVMLFLS KLMFNILDTNFYIYLFINMILAFPLIFLYRRQSKKRITESSLRNDLKPKEFNTELEKVKK ETGLETEEARKEMVELEKRKQSTLGKAFARLDTTVTAVILVLIIQAFYLGNYVIPTGSME PTIKVKDRAFANMVKYKFGKPKVGDIIAFKEPVTDKIMYTKRITGTPGQTLQIKDISLMN LTDPSSDRTEATNAGRIYLDDKISEVLNRPYSVEGLVQDKKIYIPKKGDKVKLDKIIMIS KLHENEEKKETELLKDNFLLSQADWDGYKEDKNYLEVTPEEFLSRVNTNRDFKNIIGISD KFRKDDPKLDVYYTFTLKVEGRDELVLPIQDFKYNDEKFMKLLKGETVTLDKNYYMAMGD NTTNSLDSRFFGYVSEDRIKGNLLFRWWPLNRIGLL >gi|261746595|gb|ADAD01000181.1| GENE 15 12788 - 13246 514 152 aa, chain + ## HITS:1 COG:aq_567 KEGG:ns NR:ns ## COG: aq_567 COG0456 # Protein_GI_number: 15606021 # Func_class: R General function prediction only # Function: Acetyltransferases # Organism: Aquifex aeolicus # 7 148 1 140 154 67 32.0 8e-12 MLEKTEIIKICEVREEIVAELCKIHNENFDKKVESKYFEEILSNTQYTVYYMKKSEKVSG YIIYYDTFDSFDLFEIAIKKEFQNKGLGNELLEKTIDRLFNEEKYKKTVFLEVNENNFSA VKLYKKNNFKEISLRKNYYGIGQNAIIMVRNL >gi|261746595|gb|ADAD01000181.1| GENE 16 13317 - 15968 3803 883 aa, chain + ## HITS:1 COG:FN2011 KEGG:ns NR:ns ## COG: FN2011 COG0525 # Protein_GI_number: 19705307 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Valyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 2 883 1 886 887 1117 60.0 0 MIKELSKTYSPKEIEEKWYKIWEEKGYFNAQHNDEKPGYSIVIPPPNVTGILHMGHMLDN AIQDTIIRYKRMSGFDTLWVPGTDHAGIATQNKVERMLKDEGTSKEEIGREEFLKRTWEW KEKHGGLITKQLRRLGVSLDWSRERFTMDEGLSKAVKEVFIKLYNDGLIYRGEYIVNWCP HDKTALADDEVNHVEKNGKIWEIKYRIKDSDEFVVIATTRPETMLGDTGVAVNPKDERYS HLVGKTVILPLMNREIPVVADEYVDMEFGTGVVKMTPSHDPNDFEVAKRTGLAFLNIFTE DAKVNENGGKYCGLDRFEARKAVLKDLEEQGLLVSVKDHKNAVGHCYRCDSVIEPRVSTQ WFVKMQPLAERALEVVKNGQVKITPQRWEKVYYNWLENIRDWTISRQIWWGHRIPAYYAE DGTVFVARDIEEAKAQARKKFGKDVDLREETDVLDTWFSSALWPFSTMGWPEKTKDLEKF FPTDTLVTGADILFFWVARMIMMSLYIMDEIPFNYVYLHGLIRDEIGRKMSKSLGNSPDP LNLIDKYGADAIRYSFLYNTSQGQDIHFSEKLMEMGSTFANKIWNVSKFVLSNLEDFNTE TSVTDLEFKLEDTWILSKLQSAAAKINEYMNGYELDNAAKAVYEFFRGEFCDWYVEIAKT RVYGGEGTDKVTAQWVLRHVLDSGLKMLHPFMPFITEEIWQKLDFDEETVMLSDFPKEDK 
LLINKEAEKEFDYLKEVITAVRNIRGEANVSPSKKIEVIFKTSSESEKKILTDNPKILDK LANIEKYGFAPDVEIPELVGFRLVETTEIYVPLADLIDKEKEIAKLEKDIEKTQKELDKV LGKLSNGAFLGKAPQAVIEKENAIKEELETKIAKFRESINLYK >gi|261746595|gb|ADAD01000181.1| GENE 17 16043 - 16159 196 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKTIIAILLGLTLMSCSAGVGVNVGKHGVNGHAGISF >gi|261746595|gb|ADAD01000181.1| GENE 18 16188 - 16613 561 141 aa, chain + ## HITS:1 COG:FN1548 KEGG:ns NR:ns ## COG: FN1548 COG1585 # Protein_GI_number: 19704880 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Membrane protein implicated in regulation of membrane protease activity # Organism: Fusobacterium nucleatum # 1 140 1 138 138 93 36.0 1e-19 MGAIFWAVSASVFAILEIIIPGLVTIWLALAALVVTVFAGLINNAYIEFFIFAVLSLIFI LFTRPVLQNYLKKKIKHDFNSNMTGSEIKIEKVVNSDKAKKEYEVKFKGSIWTGISEEFF KVGDMVRIKSFEGNKIVLERK >gi|261746595|gb|ADAD01000181.1| GENE 19 16641 - 17561 1349 306 aa, chain + ## HITS:1 COG:FN1549 KEGG:ns NR:ns ## COG: FN1549 COG0330 # Protein_GI_number: 19704881 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Fusobacterium nucleatum # 6 295 4 293 294 399 76.0 1e-111 MGFVLLPLIIIIITIALIIVFKAIKIVPESRVYIIEKLGKYDQSLESGLNFINPFFDKVS RVVSLKEQVVDFPPQPVITKDNATMQIDTIIYFQITDPKLYTYGIERPISAIENLTATTL RNIIGDMTVDQTLTSRDVINTNMRVELDEATDPWGIKVNRVELKSIIPPADIRSAMEKEM KAEREKRANILEAQARRESAILVAEGEKQAAILRAEAKKEQQIKEAEGEAEAILSIQKAK AEALRLLRESDPTAEVLALKGMETFEKVADGKSTKIIIPSNMQNLASMVTAFAELNEKQV NKPETK >gi|261746595|gb|ADAD01000181.1| GENE 20 17641 - 18393 1054 250 aa, chain - ## HITS:1 COG:SA0343 KEGG:ns NR:ns ## COG: SA0343 COG1878 # Protein_GI_number: 15926056 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Staphylococcus aureus N315 # 5 249 6 247 250 223 44.0 2e-58 MENNLWELLKVLKNHKWVDLTHEITNDSPYWQGMPEGVLELNNTIIDFPEMNLNIQTHKF 
PGQFGTHIDYPGHFIKNARLAGDFKVEDTVLPLVVIDLSEKVKENNDYEISIDDIKEFEK KYGTVPEGSFVVFRSDWSKRWPCIVSLTNADKNGNAHSPGWPVSTLEFLFDERNIAGVGH ETLDTDAAVTCAKNQDLIGERYILQKDKFQVEAMANLDKLPPVGAVIFISAPRIIHANGL PVRAWAVIPE >gi|261746595|gb|ADAD01000181.1| GENE 21 18338 - 18448 94 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIYYKVVNKFILNERCDKCGKQFMGTAESFEKSQMG >gi|261746595|gb|ADAD01000181.1| GENE 22 18582 - 19325 1078 247 aa, chain + ## HITS:1 COG:FN0807 KEGG:ns NR:ns ## COG: FN0807 COG1212 # Protein_GI_number: 19704142 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: CMP-2-keto-3-deoxyoctulosonic acid synthetase # Organism: Fusobacterium nucleatum # 1 247 1 245 245 257 59.0 1e-68 MKILGVIPARYASTRFEGKPLKEINGSPMIEWVYKRAENSGIDKLVVATDDERIFNAVKS FGGNAVMTSTEHENGTSRIIEVINNPEYNDFDFVINIQGDEPLIDIKSINMIADNFRQKK SEIVTLKKEFKEKEEIKNPNIVKVITDFNDNAIYFSRSVIPYERNEVKDFKYYKHIGIYG YTSKFLNELKNLKNGVLEKIESLEQLRFIENGHKIKVLETTSEVIGVDTEEDLKEVIEYI NKNNITL >gi|261746595|gb|ADAD01000181.1| GENE 23 19372 - 20823 2260 483 aa, chain + ## HITS:1 COG:SP1382 KEGG:ns NR:ns ## COG: SP1382 COG0366 # Protein_GI_number: 15901236 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Streptococcus pneumoniae TIGR4 # 1 482 1 483 484 458 48.0 1e-129 MENGVMIQYFEWNLPDDGKHWERLKNDAKHLSEIGVSAVWIPPAYKGTSSEDVGYGAYDL WDLGEFNQKGTVRTKYGTKQELAEAIEELHKYGINVYLDAVLNHKAGADETERFLAVEVA SDNRNEEVSEPYEIEGWTKFTFPGRNDKYSAFKWNYNLFTGVDFNNENKKNAIYKIVGEN KDWAKSVDSEMGNYDYLMNADINYAHPEVKKEVINWGKWVVNELKLDGFRMDAVKHISEE FIKEFLDEVRNVYGENFYSVGEYWKDDLEVLKEYLESLEYKTDLFDVGLHFNFHEASIKA KEFDLSTILDNTIMVSDSMQAVSFVDNHDSQKGSALESQIESWFKPHAYAIILLSEHGYP CLFYGDYYGVGGNESEHRWIIDQLLYTRRNFAYGSQVSYFDSPNVIAIYRTGRENDINTG CIAVLSNSDEEGEKVIEAGQNRTGQTWKEITGSGFDNVVIDENGFGTFKVQAGKISVWVP ETE >gi|261746595|gb|ADAD01000181.1| GENE 24 20979 - 22127 1468 382 aa, chain + ## HITS:1 COG:HI0900 KEGG:ns NR:ns ## COG: HI0900 COG0263 # Protein_GI_number: 16272838 # Func_class: E Amino acid transport and metabolism # Function: Glutamate 5-kinase # 
Organism: Haemophilus influenzae # 14 375 2 364 368 313 47.0 2e-85 MKNKENKREEILKNIQKVVVKVGTSTLTKEDGNLNVEKIKKIVLELSNLSDKGYDIVLVT SGAIGAGMGKLNMTERPKTLHEKQALASVGQVALTHLYQMLFHEYNKTMGQLLLTKGDFS DRRRYLNARNVCNTLLKNKIIPVINENDAVVSDEIKVGDNDTLSALVAGLIDADLLIILS DVQGLYNKNPQKYRDATLMEIVGDINEDIKNMAGGEGSKFGTGGMITKIIAAEMATKIGT HLVISSGENPQNITKIVEKENVGTLFVKKHKKISSKKYWLAYGTNKKGILTIDEGAESAL YKGKSLLPVGIKSVEGAFNKGSVVKIENMKKEVIAMGISNYSSEEINLIKGQHSEDIENI LGHKYADEAVHIDNVAKINKIV >gi|261746595|gb|ADAD01000181.1| GENE 25 22169 - 22966 1116 265 aa, chain + ## HITS:1 COG:no KEGG:FN0994 NR:ns ## KEGG: FN0994 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 36 265 16 241 241 228 52.0 3e-58 MENYGKPKKTVGMKLYPKTLTILNFLLFIFFMLGLTKEFINNFERIYERNGISSVFFLIG FDLVLLLIALGIPYIMIKLYPKIYYYEDGFTVGKNKEKVLYEKVDYFFIPNQHPALYAMN RFTTIWYKNNDNKWKFISAMGYPKKGFDLFQEDFVKINYPKAMKEIENGGTVEFLFNNPK KLIPALGKKKFIEKKLNQTMKIKVSKENIVFDNEVYEWDKYKITTNLGTIVVKDKDDKAI LALPSRALIHKVNLLVAIIDKLGNN >gi|261746595|gb|ADAD01000181.1| GENE 26 23089 - 23679 580 196 aa, chain + ## HITS:1 COG:no KEGG:FN0995 NR:ns ## KEGG: FN0995 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 3 195 5 207 208 123 41.0 3e-27 MQKFMTFKKDENKLLTLLGGMAGFTVSTFILFLIARNGGATLYVCLFALAGPLLGVLGAN YLKRETKSDKEDTWDKNFDTGKVQKSKFSPDSNYELTGFGTVTVFVFAYLSIYLSEVLNL TKFFQEQNPDRKFSELLMLVAGNIFNDSEFGGFLISYWLGLTYLVVCIIIGTIVGFFIRK KKEQEEKEKRNKSKFQ >gi|261746595|gb|ADAD01000181.1| GENE 27 23721 - 24977 1812 418 aa, chain + ## HITS:1 COG:CAC3254 KEGG:ns NR:ns ## COG: CAC3254 COG0014 # Protein_GI_number: 15896499 # Func_class: E Amino acid transport and metabolism # Function: Gamma-glutamyl phosphate reductase # Organism: Clostridium acetobutylicum # 2 418 5 418 418 414 51.0 1e-115 MNYIEEMGKKAKEASKKLLVLDTETKNRALTMIAEELINKKDEIKKANKKDLEKGKKDGL SFALLDRLELTDARIEAMAQSLREIAAFTDPVGEIVTGWKHKNGMTIEKKRVPLGVLGII YESRPNVTVDSAGLGIKSSNAVILRGSASAINSNIYLSRLFNEIGTKGGLPENSVQLIED 
TDRELVNSMVKMNKYIDVLIPRGGKGLKKFIIENATIPVIETGAGVCHVFVDESAKINIA LSVIENAKTQRPSTCNSIETVLIHKNIAEKILPDLTEMLLKDGVELRYSKEALDIVGNKA EIKLTNEDDFGAEYLDMIMSLKLVNDVNEAVEYINEHSTQHSDSIITESIENAEKFLNEV DSAAVYLNASTRFSDGGEFGFGGEIGISTQKLHARGPMGVRELTTTKYVIRGNGQIRK >gi|261746595|gb|ADAD01000181.1| GENE 28 25039 - 25845 1035 268 aa, chain + ## HITS:1 COG:ECs0437 KEGG:ns NR:ns ## COG: ECs0437 COG0345 # Protein_GI_number: 15829690 # Func_class: E Amino acid transport and metabolism # Function: Pyrroline-5-carboxylate reductase # Organism: Escherichia coli O157:H7 # 2 267 4 269 269 239 45.0 3e-63 MKIGFIGTGNMGSAIINGLVVSKFVSEKEINIFDLNDEKAKKIYEKYSINILQNETEIVE NSDIIILAVKPDVYSGVLGKIKDKMKKDKIILTIAAGVSISDVENIIGKDKKIVRTMPNT PAQVMEGMTAIAFNKNINSEEKEMIFRLLNSFGESIEIEEKLMHVFTAISGSLPAYVYMF AEALSDGGVLEGMPREKTYKIIAQAIKGSAEMMLKTGKHPGILKDEVTSPGGTTIEALKV LENGNFRGTLINAVSKCTEKSKKMGNNK >gi|261746595|gb|ADAD01000181.1| GENE 29 26154 - 27191 1835 345 aa, chain + ## HITS:1 COG:YPO4037 KEGG:ns NR:ns ## COG: YPO4037 COG4213 # Protein_GI_number: 16124157 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type xylose transport system, periplasmic component # Organism: Yersinia pestis # 36 343 24 331 331 409 68.0 1e-114 MKKIFLLLGLFTLFLVSCGGNGGGDSKAGTDGAKKKKIKIGMSIDDLRLERWQKDRDLFK KMAESLGAEVVVISANGDSQKQLTDSENLLSQGIDVLVVIPNNGQVMAPIVDQAHKAGVK VLAYDRLITDSDVDFYISFDNVKVGELQAQTIVEKAPKGNYFLMGGSPTDNNARMFREGQ MKVLKPLIDKGDIKVVGDQWVKDWLPEEALKIMENALTANHNDITAVVASNDSTAGGAIQ ALNAQGLAGKVPISGQDADLAAIKRIVAGTQTMTVYKPIKTLAEKAAEIAIKLGNGETIE SNGEVNNGKINVKSYLLDPISVTNENIKETVIKDGFQSESDVYGK >gi|261746595|gb|ADAD01000181.1| GENE 30 27210 - 28760 188 516 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 272 502 16 233 311 77 25 2e-13 MDSREILDMRHITKDFSGVKALDDIAIKVKKGHVHALCGENGAGKSTLMKVLSGIYPYGT YNGEIYFDGTILKNRGIKDSEEKGISIIHQELNLVDELSIMENIFLGNFITKNGIVDYYK MYQKTADILKELKMDVAPDTLVKELGIGHKQLVEIAKALSKKSKLIIFDEPTASLTEKET 
DVLLNIIKELKEKGVTSIYISHKLEEVIKISDEITVIRDGKFIECKKTSDYTKNDIVKAM VGRELSSFYPEKNNKIGDIIFEIKNYNVFDNSGKQKVKDINLTLKKGEILGISGLIGSGR TELISSIFGTYEGYQKGECYFKGEKINIKSTKEALKMGISMVPEDRKKDGIIEDMSVEKN MTLSNLDNYTKKLNYVDVYNEDKDVKKYIEILKIKTSGTDLEIKNLSGGNQQKVVLAKNL MVNPKIIILDEPTRGVDVGAKYEIYKNIVELAEQGISIIMVSSELPEILGISDRIVVMHE GEVKGEFINKNITQEMIMETAIGGDKNEFGNEIKES >gi|261746595|gb|ADAD01000181.1| GENE 31 28729 - 29874 1747 381 aa, chain + ## HITS:1 COG:YPO4035 KEGG:ns NR:ns ## COG: YPO4035 COG4214 # Protein_GI_number: 16124155 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type xylose transport system, permease component # Organism: Yersinia pestis # 9 377 24 392 394 337 52.0 2e-92 MSLVTKLKKAKNSNTFIIFLALLVIWAVFISVTGGNFINSRNISNLLRQMSITGILSIGM IFIIISGEIDLSVGSQMALLGGIAAIMDVNLKMPFILTIIVTLILGFAFGIWNGYWTANK GVPSFIVTLSGMLVFRGVLIGITGGKTISPISKSFEILGQSYIPKSLSYIIVIIYIGYMI YSKYSERKAKTVNGIEVPTLREDLTGIIISGILMFLGIIILNSYEGIPTPVLLMGLLIAV FSFVSNKTAYGRIVYAIGGNINAVKYSGIDTKKIKMIIYIVNGFLVSIAGLVLTSRLGAG SVSAGTNAELDTIAACVIGGASLSGGKGKVTGAILGALIMASLDNGMSMLNVEPSWQYVV KGLILLFAVLFDVQSQKNKKN >gi|261746595|gb|ADAD01000181.1| GENE 32 29905 - 31227 2028 440 aa, chain + ## HITS:1 COG:TM1667 KEGG:ns NR:ns ## COG: TM1667 COG2115 # Protein_GI_number: 15644415 # Func_class: G Carbohydrate transport and metabolism # Function: Xylose isomerase # Organism: Thermotoga maritima # 1 436 1 436 444 567 63.0 1e-161 MKEFFPEIKEIKYEGAESKNDLAFKYYNKDEVLGGKTMKEHLRFAMSYWHTLKAQGVDMF GGETMDREWNKYENVLERAKARANAGFEFMQKLGLEYFCFHDRDIIDESMMLADSNKLLD EIVDHIEELMKKTGRKLLWGTTNAFSHPRFVHGASTSPNADVFAYAAAQVKKAMDITNRL GGENYVLWGGREGYETLLNTNSELEYDNFARFLKMVVDYKEKIGFKGQLLIEPKPKEPTK HQYDFDTATVLAFLRKYNLDKYYKVNIEANHATLAGHTFQHELNLARINGVLGSIDANQG DMLLGWDTDQFPTNIYDTTLAMYEVVKNKGLGSGGLNFDAKVRRGSFEDKDLFLAYIAGM DTFAKGLKIAYRLYEDKVFEDFIDKRYESYKTGIGKDIIDGKVGFEELSKYAETLTEVKN NSGRQEMLESKLNQYIFEVK >gi|261746595|gb|ADAD01000181.1| GENE 33 31446 - 32918 1985 490 aa, chain + ## HITS:1 COG:TM0116 KEGG:ns NR:ns ## COG: TM0116 COG1070 # Protein_GI_number: 
15642891 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Thermotoga maritima # 2 485 3 491 492 367 39.0 1e-101 MLYLGIDLGTSGMKIIVINEKGEIKVSASKEYEVNYSNNGWSDQNPEDWVKAMKEGIKEI SQKININEVKGIGISGQMHGLVILDENKKVLYPVILWNDQRTVEESDYLNNEIGIEKLVK CTSNISFTGFTASKLLWIRKNEPEIFSKIKYIMLPKDYLGYVLTEEIFTDVSDASGTLFF DVKNRKWSEEMTEILGIKEESLPKSFESYEIIGKLTENIKDEFGITEDIPVVAGGGDNAC GAVGAGVINEDKILISLGTSGVVFIPQEKWKLPEKNSMHTFCDANGKYHFMGVILSAASC LKWWVEEIQDGNYTVLLDEALKSPIGSNNLFFLPYLTGERTPHSDPYARGSFIGLNASHT RGDMTRAVLEGVVFALYDSYKLADNLNPDTIRIIGGGAKSELIRKIITDLFNIKSEVLTV QEGPSYGAALLAMFGVEKIENRNGKLRELIKITDILEPNSENHKEYKRRYEVYKGLYSSL KDEYKKINSL >gi|261746595|gb|ADAD01000181.1| GENE 34 32931 - 34394 1856 487 aa, chain + ## HITS:1 COG:STM1556 KEGG:ns NR:ns ## COG: STM1556 COG1757 # Protein_GI_number: 16764900 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Salmonella typhimurium LT2 # 2 463 8 468 483 471 56.0 1e-132 MKRKVSFIEAVMPIIFMILLLGIGYGIYGLRAEILMLISASLASIIAVKNGYTWDDIMNS IVNKLSKTMPAILILIIVGLLIGSWMIGGTIPMMIYYGLKIISPKFIIITSFLVTAFVSI CTGTSWGSAGTVGVALMGVAAGIGAPLPVVAGAVVSGAYFGDKMSPLSDTTNLAPIAAGT TLYEHIGHMVWTTGPAFILASIVYTLIGFNISSGNAVTPEKVTIILDTLNSIFNWNLLVL LPPVIVLFGSIKKKPTIPVMLISSAIALFNAVVFQHFTLQQAFEATIDGFNISMVSGLNT ENIIADIPRLLNRGGMNSMLGTVLIAFCAYGFAGAIAVTDSLDIVLKKLLKNVKSTGGLI LSTLISCFVAVCVTSNGQLSILIPGEMFKNKYLSKKLHPKNLSRTLEDGATVIEPLVPWT AAGVYMATTLGVKTLEYLPWAILCYSGCIFAIIWGYSGKFIAKLDEKDKIYQEYLEEEKN KGGIENE >gi|261746595|gb|ADAD01000181.1| GENE 35 34387 - 35562 1231 391 aa, chain + ## HITS:1 COG:CAC2970 KEGG:ns NR:ns ## COG: CAC2970 COG1168 # Protein_GI_number: 15896223 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Clostridium acetobutylicum # 1 386 1 382 384 337 42.0 3e-92 MSNFDRIIDRYGTNSAKWDSFIQGKKGEYLYMAVADMDFRSPEEMINKMNEIVNHGVFGY 
TILPESYYQSVINWFSKKHNWEIDKDWIVYSPRIGISISLIIQNITEVGDGIILQTPAYP MLNDVIVKNKRKIIRNPLILKNGKYEMNLSKLEEKIDKNTKIFLLCNPHNPTGKVFLREE LEEIVEFCKKHNLILISDEIHCDLVFSPYKHIPITSISKDAENLGIVCNSITKTFNVPGV IASNLIIPNKQIREQVSEILDKAIIHNPNIFGAGITEEAYNKCEYWVEEVMNYIKKNKEY LEKYISKKIPQLEVIKSEGTYLVWIDYRKLNIPEEELNRILLEKAKLKIYTGSHFGEEGK GFIRVNLATPKKNIEEFLNRFYNVLKEEKLI >gi|261746595|gb|ADAD01000181.1| GENE 36 35652 - 37223 1866 523 aa, chain + ## HITS:1 COG:AGl3560 KEGG:ns NR:ns ## COG: AGl3560 COG1653 # Protein_GI_number: 15891902 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 40 523 31 519 522 369 42.0 1e-101 MSEKIIFKKILFILIVVLFFSCEKEKFKMNENKNLKGYLITEKETELSIFAMDSGKALNT EWPIFKKATEMTNIRLKNVASQNQTNQTEAYNLLVSSGQLPDIVSYMDTAVLERLGMEGG LVPLEDLIEKYAPNIKKFWEQNPRYKKDAIAADGHIYVIPNYYDYFNSMPSTGYFIRKDW LKKLNLKEPETTEELYNVLTAFRDKDPNGNGKKDEIPFFFRSESPKDTIRLLVDVFKART FWYEDSAGKIKFGAVQPEFREAIKNIAQWYKEGLIDKEIFTRGLVSRDYMLSNNIGGYTN DWFGSTASYNEKLGKNIPNFDLAFIAPPKYKNNNQTYQARTTEYGAWGISSKSKKVIEAI KYFDFWYSEEGRRLWNYGIEGSEWVLKDGKPLFTDKVLNNKEKKTALQVLKETGAQFRLG VAQDAEYERQWYPKEAVEAIDTYIKNNYVHELLPSLKYTKEEAEDFSKINTQLNAYVEEM SQKWIMGVSDVDKDWEEYIKRIEQIGLKKAEQIQQKAYDRFMK >gi|261746595|gb|ADAD01000181.1| GENE 37 37366 - 38511 210 381 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 78 362 6 313 319 85 25 5e-16 MELVPNNYHKKILSYIYSQKKVTKVQLSKYIDVTIPTVTLYLNELVKIGLIKESGVINSE SGRKPVIFEINPENIYTVGIEIRQESLTLIILNLNLCTIYKLQEKYNINNLEEELKNIIY KSIVHSKITIDQILGIGIAFPGIVNDRKLKLEESPIVNIKDYSLEGLKETFKMPIYIGNE ADYAAYAENLIGSSKQYRNSIYLSIYEGIGGGIIMENSIYSGSLQHAGEIGHMVIDYKGR QCECGRKGCWDKYVSSKIIDKLIKENNLKGLDELIDIYEKKSNNNIFNQMNEYFDYLATG IMNLFLIFDLDCVIVGGIWAPYENKIKPILIEKIKKENCKLGKNSEKIIFPKLLTKASAI GAGIIPFSNIYDFNLILRGME >gi|261746595|gb|ADAD01000181.1| GENE 38 38613 - 38906 397 97 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1678 NR:ns ## KEGG: Lebu_1678 # Name: not_defined # 
Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 11 97 12 98 98 92 60.0 6e-18 MAKEVYREGMLRKNITINSDDFYIVDRFAKKIGISFSELVRKAAVNYVKEQEELDLSAFL RAHCSTVPEDEEYEIVEAMKNKDKKDKGKEIKIEDLL >gi|261746595|gb|ADAD01000181.1| GENE 39 38890 - 39159 188 89 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1677 NR:ns ## KEGG: Lebu_1677 # Name: not_defined # Def: plasmid stabilization system # Organism: L.buccalis # Pathway: not_defined # 1 89 1 89 90 74 49.0 1e-12 MKISYKVKYLKKAEKFMKKNKDFGKKFFKNFSELSEDFYNNFQNFDIEYYRSIPNSFRMR IGKYRALFTVEDEIRIIEVIEINSRGDIY >gi|261746595|gb|ADAD01000181.1| GENE 40 39285 - 39689 521 134 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0620 NR:ns ## KEGG: Lebu_0620 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 134 1 135 135 173 74.0 2e-42 MEKKKYNQKKSMKKGFTLIEVILVVAIITIISAIAVPQVGKYLNKANRSKVIGAVAELNN SSTSWSIDHGGDSPKNLQDVLTEQGNLQKLGIGMNTSGNFKIGNIKGTMIYDKGEVYAKI DADSKAFPGEEIRK >gi|261746595|gb|ADAD01000181.1| GENE 41 39693 - 40151 452 152 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0621 NR:ns ## KEGG: Lebu_0621 # Name: not_defined # Def: peptidase A24A prepilin type IV # Organism: L.buccalis # Pathway: not_defined # 14 146 13 143 151 95 54.0 5e-19 MYVTLLNIIEYVSLAYIFIIDIKRKIIPERGFLILLAMGFLKALYFNYLTEWYLGICAFS MPLLFLYIVEDYVKKELIGFGDIKLMMAVGGLLRYENISETVRFYIFLYILGGIFSIFLS AYLKIRKKKTEYIAFGPFIIMTYIVFGYLNII >gi|261746595|gb|ADAD01000181.1| GENE 42 40167 - 40595 374 142 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0622 NR:ns ## KEGG: Lebu_0622 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 3 140 2 147 147 120 52.0 2e-26 MKRRNKGETLVESLISMFFVITVLVPVSDIFLKTFKVNIKTDIKNSINNENKNIIEILKT KKYDEIINFKGKHSISDLNGFYNVFGIEERYKVLNGKLADDKSKEIEIKQTENFYINEKG DKEYILEISAGNIKDYYFPNLK >gi|261746595|gb|ADAD01000181.1| GENE 43 40604 - 41074 546 156 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0623 NR:ns ## KEGG: Lebu_0623 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: 
not_defined # 2 155 4 157 157 114 54.0 1e-24 MKKYNKNKGYLLLEILICMFLFSVLVFVISVFLKRVVLIEKEKKENQLMYENVYFISDKI IEDIKNRDMEAFGYEGNTDNIHIKNEEIIMKKDGKFYKLEYSKGKLYVSDGDDITNLGTK SIIGQYDNVEFKRIDDLLVINLKYKKNNEVKVLNLL >gi|261746595|gb|ADAD01000181.1| GENE 44 41090 - 41653 621 187 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0624 NR:ns ## KEGG: Lebu_0624 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 5 183 8 185 186 122 41.0 1e-26 MGEIRENMKKGVSLIYVLIVLSMLSVFSLAFIYFVKEKADIVSLKSRKESSVSENYLINK ERQNSLRILTKGIENIGVSVFPEDENQYFNSKMEINPTGNKEIERIVFSGKNIGSIGGFQ IEKITDSSGNEYSLPLDKNTVYNDLEAVYFKDMLGGKILFIEKLTFKRINPTLVEVKSDE GRFSYDE >gi|261746595|gb|ADAD01000181.1| GENE 45 41676 - 42851 1290 391 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0625 NR:ns ## KEGG: Lebu_0625 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 152 391 225 464 466 231 59.0 6e-59 MIENIKIYMRSNSKIYVCIDEEVFYHENESLEEVLEVLKEEYEFTGNKNPLKFILHFSYF TFNDKELNVKEDISTNEENRKILKNIYNKDFLDDFVRKNINSENKKLYLNHLENKFINVY LSRKKTSELKRVCGKFGFEILSIKIDFVSVYNFYKEENIEIIQIGEEKSLRFLIEDSKIK EIEKLDLVMEDISDIDNFDFGNMEVFTVDEENVKIVFQGSDLYSEPNFAKKNEIISMESI KNVKMKDIIILGILAMSYFLLCGFIPLEKEIKKNEKIKEETKNLEKEYLNKKNDKIPDYS KELQTLNEIDSSLKRKEYFSVIKFLIENSEYGIDYTKIGYENKKWLIQGEMQDFGNFEKF EKNILNKYSKAELGYIKDNDTATVFEYNILE >gi|261746595|gb|ADAD01000181.1| GENE 46 42848 - 43399 619 183 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0626 NR:ns ## KEGG: Lebu_0626 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 7 178 1 172 182 176 61.0 4e-43 MKNGVKMSKLNLKNIIIILLVTLMFLGLNNKYTKYRSVLKDREMLVENKKQLESKIDEII SEQEKKKKEIKSDYDQIQVLTGKLAALSMKNESDFKKMIYVFAKESNLKMKEISKSEKIW ERNGYKLKYIHFTLYGGLSDFGKFLYYVNKSKSFIDTSKMYIELTGDAFKISLGFIEKEK IVL >gi|261746595|gb|ADAD01000181.1| GENE 47 43725 - 44882 1376 385 aa, chain + ## HITS:1 COG:FN2086 KEGG:ns NR:ns ## COG: FN2086 COG1450 # Protein_GI_number: 19705376 # Func_class: N Cell motility; U Intracellular trafficking, secretion, 
and vesicular transport # Function: Type II secretory pathway, component PulD # Organism: Fusobacterium nucleatum # 126 382 129 402 402 127 33.0 3e-29 MTKIIKNRFINILFTVFYFVTGIYIFPAKITDYVNKEDAKKANKIFIYKDNRENINPNQE INKTTNQKENNGVNSGTTGVNGNNQNVQNQKKDEKIEEINLEYRDVKEVSEKLNGFNNLK LVGIDNKIIISGEMKEIEQIRKIIKSLDRPKDQIIIKGTIIDTSSNLFEKLGVDWNINSN NPEPAKSNLVAKFLNGEITIGSIFASGGKFLGIDFNLLRENGDIKIEAMPTLLIMEDEEG ELKVTEEVLVGEKKTTKNNTDYTEPIFSEAGIVFKILPEIRKINGEKKILLKIDTEISNF KLTSNYSSGSGAKQKNQTKTIITLNNGGSTFIGGLKQNVTKETIKKVPFFSSIPIIGPLF KYKRNNKEVRDIYIEIEAVIQEMGN >gi|261746595|gb|ADAD01000181.1| GENE 48 44995 - 45993 1454 332 aa, chain + ## HITS:1 COG:FN1786 KEGG:ns NR:ns ## COG: FN1786 COG2870 # Protein_GI_number: 19705091 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase # Organism: Fusobacterium nucleatum # 7 325 4 318 323 303 52.0 3e-82 MISVERLEEILNRFDKVKIAVVGDMMLDEYLIGKVNRISPEAPVPVVNIEEERFVPGGAS NVANNLRSLDGKVSVYGVIGKDDNGEKFLKELSLKKIDSSTIVKDETRPTIIKSRVLSQG QQLLRLDWEKDTDISKDIQDKIIKNLEENIEKIDAILLSDYNKGVLTKYLSEKIINIAKK HNKKIVVDPKPQNFKNYKGATSMTPNRKEILDYFGMKKFESEEEIAQKMKELKEELELEN VVLTRSEEGISLFEKEHKRIPTVAREVYDVTGAGDTFISTFLLAVSAGADLYEAGVIANM ASGIVVGKIGTATATRKEIIEFYHNKMEKENY >gi|261746595|gb|ADAD01000181.1| GENE 49 46025 - 46606 197 194 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 [Bacillus selenitireducens MLS10] # 3 186 4 191 323 80 30 2e-14 MAVIAAVEAGGTKFICGLGTEEGKIIEKINIPTTNPEETMKRVIEYFKNKKFDVMGVGSF GPIDPVKGSETYGYITKTPKPYWSDYNMIGELKKHYDVPMEFDTDVNGAALAESWWGAGK GLKNLIYITVGTGIGAGAVVNGTMLQGLTHPEMGHIFLKRHPEDKFEGRCPFHKDCLEGM AAGPAIEDRWGKKG Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:50:58 2011 Seq name: gi|261746580|gb|ADAD01000182.1| Leptotrichia goodfellowii F0264 contig00053, whole genome shotgun sequence Length of sequence - 11023 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 7, operones - 4 
average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 522 706 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 598 - 657 9.5 + Prom 523 - 582 12.7 2 2 Op 1 8/0.000 + CDS 635 - 1129 466 ## COG2207 AraC-type DNA-binding domain-containing proteins 3 2 Op 2 . + CDS 1039 - 1512 248 ## COG2207 AraC-type DNA-binding domain-containing proteins 4 2 Op 3 . + CDS 1555 - 2115 621 ## COG0406 Fructose-2,6-bisphosphatase 5 3 Op 1 . - CDS 2200 - 2811 873 ## COG0716 Flavodoxins 6 3 Op 2 . - CDS 2873 - 3349 564 ## COG3439 Uncharacterized conserved protein - Prom 3429 - 3488 5.0 7 4 Op 1 . - CDS 3537 - 4679 1767 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 8 4 Op 2 . - CDS 4681 - 5637 1559 ## COG1446 Asparaginase 9 4 Op 3 . - CDS 5715 - 6233 728 ## TepRe1_2248 selenoprotein B, glycine/betaine/sarcosine/D-proline reductase family - Prom 6287 - 6346 1.9 - Term 6293 - 6331 3.5 10 5 Op 1 10/0.000 - CDS 6348 - 7694 1960 ## COG1455 Phosphotransferase system cellobiose-specific component IIC 11 5 Op 2 . - CDS 7697 - 8023 426 ## COG1440 Phosphotransferase system cellobiose-specific component IIB 12 5 Op 3 . - CDS 8062 - 9069 1668 ## CbC4_2169 hypothetical protein - Prom 9122 - 9181 8.8 + Prom 9577 - 9636 6.0 13 6 Tu 1 . + CDS 9675 - 10478 550 ## Sterm_3349 initiator RepB protein + Term 10489 - 10526 -0.3 - Term 10519 - 10564 3.7 14 7 Tu 1 . 
- CDS 10606 - 11022 638 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases Predicted protein(s) >gi|261746580|gb|ADAD01000182.1| GENE 1 3 - 522 706 173 aa, chain - ## HITS:1 COG:lin0220 KEGG:ns NR:ns ## COG: lin0220 COG1653 # Protein_GI_number: 16799297 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Listeria innocua # 31 167 30 171 418 102 35.0 4e-22 MRKILILMVCLLLVLSCGRKENKDGSKSGGKKELTYLIWDKGQEAGMKAIIDEFEKENPG VKIKLQIVGWEEYWTKLETSATGGTLPDIFWMHSERFYDYASNGMLMEVDKNIDEGFKHF PQDLVKLYQFNGKQYAVPKDFDTIGVFYNKELFDKAGVPYPDGNWDWAQYLET >gi|261746580|gb|ADAD01000182.1| GENE 2 635 - 1129 466 164 aa, chain + ## HITS:1 COG:mll6442 KEGG:ns NR:ns ## COG: mll6442 COG2207 # Protein_GI_number: 13475388 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Mesorhizobium loti # 20 127 77 179 351 60 31.0 2e-09 MLKKEKEWILNIFGFKKPFPIVTMYGIGRHKITDPDYYWDNMKRPQDGNYCLLQYTVSGH GEIKIDNKIYRIGKNEAFFVKVPGKHVYYLPKNASWEVLYLEFSVEAENFRKQIVKNAGS EILKFSPDSKSISLLWEFFMPQKVTKSTIFLNVQNMRTIFFLNS >gi|261746580|gb|ADAD01000182.1| GENE 3 1039 - 1512 248 157 aa, chain + ## HITS:1 COG:all3171 KEGG:ns NR:ns ## COG: all3171 COG2207 # Protein_GI_number: 17230663 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Nostoc sp. 
PCC 7120 # 50 146 209 305 306 79 37.0 2e-15 MGIFYAAKSNEIYNIFECSKYAYNLLFELLNEFTKVNKKIDSPTVESIKSYINKNYAKPL AIEDLSNHVNLSKYHLTRKFKAEAGISPGQYLIKVRLKKATSLLLNSNMTIEEISEKVGF SCGNYFAKAFKHNFKISPTKYRNYYKHFCKLNILSNE >gi|261746580|gb|ADAD01000182.1| GENE 4 1555 - 2115 621 186 aa, chain + ## HITS:1 COG:lin1208 KEGG:ns NR:ns ## COG: lin1208 COG0406 # Protein_GI_number: 16800277 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Listeria innocua # 1 183 8 190 199 214 60.0 7e-56 MRHGQTLFNFRKKIQGACDSPLTEEGKKQAEIAGLYFKANEIKFDHAYSSTQERASDTLE IVTDFKMPYERLKGIKEWNFGLFEGESEDLNPPRESYSEFFVKYKGESGKQVEQRMTETL TEIMERENHETVLAVSHGGACYTFMLKHAPDFPFTGIPNCSIFKYEYENGKFTLIELVKH DFEREI >gi|261746580|gb|ADAD01000182.1| GENE 5 2200 - 2811 873 203 aa, chain - ## HITS:1 COG:CAC3417 KEGG:ns NR:ns ## COG: CAC3417 COG0716 # Protein_GI_number: 15896658 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Clostridium acetobutylicum # 48 199 2 155 161 82 35.0 7e-16 MKKIQIKLITILSLFTLAILGCAKESTAKTMKDNMKSDMMKDGMMTKKTLVVYYSLTGTT EKVAEMIKEKLGADILKIEPEKEYDTSSVSKLEALVKKQMANKEKVAIKKINKDISEYDT IIIGTPAWFSEVALPVQTYLSEQNLSNKKVAVFTTYGGTYGNLLTNFEKNVKAKQIKKGI AFSGKQVKEGIDKELNAWIETLK >gi|261746580|gb|ADAD01000182.1| GENE 6 2873 - 3349 564 158 aa, chain - ## HITS:1 COG:SSO0995 KEGG:ns NR:ns ## COG: SSO0995 COG3439 # Protein_GI_number: 15897871 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sulfolobus solfataricus # 39 144 5 109 123 78 41.0 4e-15 MKQKFLVSVILILCFGLSFANMKMNMKDNMMIDSEYIQKKSSYNFKETVETVKKRIIEKG GTIFYISDHKKNADEVHLELRPATVIIFGNPSVGTFLMQENQKSAFELPLRILIYENEKG ETEVIYYNVKLWEKKFGIKNKKILANIANLYDYIVPSK >gi|261746580|gb|ADAD01000182.1| GENE 7 3537 - 4679 1767 380 aa, chain - ## HITS:1 COG:CAC2723 KEGG:ns NR:ns ## COG: CAC2723 COG0624 # Protein_GI_number: 15895980 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and 
related deacylases # Organism: Clostridium acetobutylicum # 8 191 3 185 465 167 43.0 3e-41 MSAEKKGKIINEVKNIQKDLISSIRDLVSIYSAESTPSENAPFGDGPLEALRKVLNIAEK MGFHTENIDNKIGYAQYGESENDEYIGIFGHVDVVPLGEGWKHEPLKGEIENNRIYGRGV LDNKGPILANLFALFILKKFGITFDVPVRIVFGTNEETGFNCVKHYLTKEKPPVFGWTPD CKWPVVYGERGRLKVRVSAENKYAEELYNFVNDYILSAPNNGVKLGINFKDEDFGEMIMR GYKLGTHENRSYFEWAMSYPAICKKDELIKLIKEKLSDNLEIEEIANWDPILYDKTSKYV KTLQKVYNDVTGFDAKPVTTTGGTYAKIIPNIIAYGPSFPGQKDIAHLPDEWMDLDDLEK ITEIYALALYEISKLKNKKE >gi|261746580|gb|ADAD01000182.1| GENE 8 4681 - 5637 1559 318 aa, chain - ## HITS:1 COG:CC2359 KEGG:ns NR:ns ## COG: CC2359 COG1446 # Protein_GI_number: 16126598 # Func_class: E Amino acid transport and metabolism # Function: Asparaginase # Organism: Caulobacter vibrioides # 3 277 20 309 327 174 36.0 2e-43 MWGIIATWTMAKDGAEEGKKILENKGEAGEAIEKAIKSVEDFPYYKSVGYGGLPNEEMEV ELDAAYMDGTTLDFGAVCAIKDFANPISVAKKLSKLNENSVLVGAGAEAYAHKEGFERKN MLTDRAKIHYKNRKKDIINLELKPYSGHDTVGMVCLDSFGNVVSGTSTSGLFMKKKGRVG DSPIIGSGLYADSDIGGASATGLGEDIMKGIISYEIVKLMETGLSPQKACEEAVSNFEKK MIKRRGKLGDISVIAMNNKGEWGVATNIDNFSFVVANENSDICIYRAKRKGRETIYEPAT KEWVDNYVEERMKPLEEK >gi|261746580|gb|ADAD01000182.1| GENE 9 5715 - 6233 728 172 aa, chain - ## HITS:1 COG:no KEGG:TepRe1_2248 NR:ns ## KEGG: TepRe1_2248 # Name: not_defined # Def: selenoprotein B, glycine/betaine/sarcosine/D-proline reductase family # Organism: Tepidanaerobacter_Re1 # Pathway: not_defined # 1 164 1 164 173 153 45.0 3e-36 MRIILFFDQIQSGTGGKEGSNVELAVEKGGIGSYLMFEEYIKEIGGTVIATTYCSDNYFK DNEETVLSKMEGLVNKVKADILLCGPCFNYHNYAEMSSVLADYIQKKTECKPIVVCSEEN ADVIEKYKNDIVMVKMPKKGGTGLRESLKNMAKVIKKVHNNEKLAVEDNIYL >gi|261746580|gb|ADAD01000182.1| GENE 10 6348 - 7694 1960 448 aa, chain - ## HITS:1 COG:BS_licC KEGG:ns NR:ns ## COG: BS_licC COG1455 # Protein_GI_number: 16080909 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Bacillus subtilis # 5 426 8 433 452 310 39.0 4e-84 
MFEKLEKVLGPVAVKLSSNKVLTAIRDGFLVGTPLLIVASIFLVVGNFPVPGYSEFIAQF FGEGWENYLDAIINATFGVVALIGVIGIGYYYGKAKGIEGIAGAAAALVAFLILSPQSHP LYVNADGKTFGGFAFRNLGTAGLFVAMITALVSVTIFAAIKNKGWTIKLPEGVPPAVSNS FAALIPSMFVMIFFFIIRLIFNFTEYKYAHDFVYKILQTPLMGFGRSIIFEPVYQFLSTL FWFFGINGPAVTNTVFGPIHLALTTENLAAFTAHQPLPNIFTGPFGDFFGNFGGGGSTLS LVLLMIFAGKSERMKKLGKLSIVPGIFGINEMVIFGLPVVLNPIILIPFLLVPLVNIGLS TAATLVGLIPYTTGASLPWTTPLFFSGWLSTGSIIAGLFQILLVAIGCVIYYPFFRVLDE QYLKDETKPVEENNDDLDDISLDDISFD >gi|261746580|gb|ADAD01000182.1| GENE 11 7697 - 8023 426 108 aa, chain - ## HITS:1 COG:CAC0384 KEGG:ns NR:ns ## COG: CAC0384 COG1440 # Protein_GI_number: 15893675 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Clostridium acetobutylicum # 1 102 1 100 102 81 40.0 4e-16 MKKFYLFCAAGMSTSLVAKKMQDVADSHKLPIEVKAFPDHKLDIIVEELHPDIILLGPQV KFKYEETKEKYEPKGIPVAVIDLNDYGNVDGERILKRAIKILKEKESK >gi|261746580|gb|ADAD01000182.1| GENE 12 8062 - 9069 1668 335 aa, chain - ## HITS:1 COG:no KEGG:CbC4_2169 NR:ns ## KEGG: CbC4_2169 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_BKT015925 # Pathway: not_defined # 40 335 2 297 297 341 58.0 2e-92 MKIKKNFFDKIKIIGYKNNIKNAENLKMKIAEMEMEIMPKRLLNCFASDFKKMTKEELIN AIKASEGRTVLSENVADRRTVTGDVTNSELARAFGADLILLNGLDVYNPVIASLPESDEP IKLLKKLTGRPVGLNLEPVDLNADMLENLEVIPEGRMCSEKTLKKATEMGFDFVCLTGNP GTGVTNKETVNGIKLAKKYFDGMIIAGKMHGAGVNERIVNLDVIKQFLESGADVIMLPAV GTIPGFTHDEMIEAVKYIKENGGLVMSAIGTSQESSTRETIREIAIMNKIAGVDIQHIGD AGYAGVAPYENIMELSMAIRGVRHTIRMIAASNDR >gi|261746580|gb|ADAD01000182.1| GENE 13 9675 - 10478 550 267 aa, chain + ## HITS:1 COG:no KEGG:Sterm_3349 NR:ns ## KEGG: Sterm_3349 # Name: not_defined # Def: initiator RepB protein # Organism: S.termitidis # Pathway: not_defined # 1 250 140 394 398 124 34.0 3e-27 MSESFSFSLYNNILKNFENSKEIILPISSLKMYLNSDNKYSRFFDFEKYILKKAVSDINI FTTFNISYKKIKEGTKLTNKITSITFFVEKSKHMNTSQDNTVYKMMELVKDRIKNPEKIY QLFLLYIAKRGYKYVYENIKYLKHTSSENFDKKLRKALLLDLASNSLKLYISINKSVDSS 
IVLFSILSRNINDIKRYFPKIEKLLQTSELKNINSISYFKDGNIFEFVNEDIEIYVEYYI NKPSLIKIYLPDNIIKKLEKNKPRLRS >gi|261746580|gb|ADAD01000182.1| GENE 14 10606 - 11022 638 138 aa, chain - ## HITS:1 COG:CAC0533 KEGG:ns NR:ns ## COG: CAC0533 COG1486 # Protein_GI_number: 15893823 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Clostridium acetobutylicum # 2 137 305 440 441 228 79.0 3e-60 TEKGSSEGTKLHVDDHASYIVDLARAIAYNTHERMLLIVENDGAISNFDSTAMVEVPCIV GSNGPERICQGKIPQFQKGLMEQQVSVEKLTVEAWIEGSYQKLWQAITLSRTVPSASVAK LILDDLIEANKGYWPELK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:51:13 2011 Seq name: gi|261746577|gb|ADAD01000183.1| Leptotrichia goodfellowii F0264 contig00217, whole genome shotgun sequence Length of sequence - 1120 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 400 477 ## COG3210 Large exoproteins involved in heme utilization or adhesion 2 1 Op 2 . 
+ CDS 416 - 1111 621 ## FN1816 hypothetical protein Predicted protein(s) >gi|261746577|gb|ADAD01000183.1| GENE 1 2 - 400 477 132 aa, chain + ## HITS:1 COG:FN1817 KEGG:ns NR:ns ## COG: FN1817 COG3210 # Protein_GI_number: 19705122 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 1 132 2667 2806 2806 105 50.0 2e-23 ESEAELTTRLKEMYKDKPNTTIETQESYKGGENVKYGIKGSVRPENSVVENGIVKETYEI KNYNQENYNSMTSNIGKQAIKRAEELPETATQKIVIDTRGQKITPEIRQKITEDIIRKSN KIIKKENIIFME >gi|261746577|gb|ADAD01000183.1| GENE 2 416 - 1111 621 231 aa, chain + ## HITS:1 COG:no KEGG:FN1816 NR:ns ## KEGG: FN1816 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 231 1 232 233 214 54.0 3e-54 MRELTTKEFNDLYKRNFREYFDKPLKKDGFYKKGTINFYRINKLGMIETLNFQKHREELN VNCAILPIYCGATKESITIGLRLGKFMNTRYTYWWDIKDDESMEKNMQEMLNVIQTDLYN WFNKMDNEKEILKYISISYQTIINRYITQAASMAKFKRYDEILQYVEKVKKEYMESWSEE ERQKKEWLKKVLDEALLLERKLKEGKESIDQYIIEREKQSLIELGLEKLIK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:51:19 2011 Seq name: gi|261746569|gb|ADAD01000184.1| Leptotrichia goodfellowii F0264 contig00122, whole genome shotgun sequence Length of sequence - 5552 bp Number of predicted genes - 8, with homology - 7 Number of transcription units - 3, operones - 1 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 91 92 ## 2 1 Op 2 . + CDS 72 - 533 599 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 3 1 Op 3 . + CDS 555 - 833 496 ## Sterm_1465 phosphotransferase system lactose/cellobiose-specific IIsubunit beta 4 1 Op 4 1/0.000 + CDS 858 - 2126 1866 ## COG3037 Uncharacterized protein conserved in bacteria 5 1 Op 5 12/0.000 + CDS 2149 - 2958 1167 ## COG3959 Transketolase, N-terminal subunit 6 1 Op 6 . 
+ CDS 2962 - 3897 1067 ## COG3958 Transketolase, C-terminal subunit + Term 3908 - 3968 6.2 + Prom 3956 - 4015 6.0 7 2 Tu 1 . + CDS 4072 - 4503 759 ## COG0698 Ribose 5-phosphate isomerase RpiB + Term 4506 - 4548 -0.5 + Prom 4579 - 4638 6.2 8 3 Tu 1 . + CDS 4676 - 5552 992 ## COG1609 Transcriptional regulators Predicted protein(s) >gi|261746569|gb|ADAD01000184.1| GENE 1 2 - 91 92 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no IEKIENKGMLVELINEIGNEVKNIGRNNQ >gi|261746569|gb|ADAD01000184.1| GENE 2 72 - 533 599 153 aa, chain + ## HITS:1 COG:BH0221 KEGG:ns NR:ns ## COG: BH0221 COG1762 # Protein_GI_number: 15612784 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Bacillus halodurans # 3 124 2 123 147 112 39.0 2e-25 MAEIISKKLINLDLEAENWEEAIMKGGEILIKEGYIEKKYVETLLKISRDEGPYYIITKN IALPHVRPEEGVKKRGFSFIRLKRPLDFGSKENDPVKYIIFLMALNSEEHINLIKFISKI IEDKNFFELVEKNRQNNEKNREIVYKYLNNIIV >gi|261746569|gb|ADAD01000184.1| GENE 3 555 - 833 496 92 aa, chain + ## HITS:1 COG:no KEGG:Sterm_1465 NR:ns ## KEGG: Sterm_1465 # Name: not_defined # Def: phosphotransferase system lactose/cellobiose-specific IIsubunit beta # Organism: S.termitidis # Pathway: Ascorbate and aldarate metabolism [PATH:str00053]; Phosphotransferase system (PTS) [PATH:str02060] # 1 92 1 92 94 127 77.0 2e-28 MKRGLVVCRTGMGSSMMLRIKLEQVIGENKFPLELEHDVLSAISNYDVDCVITMMDLVEQ IKDEAKYVIGINDLMNKVEMKEKIEKFFEENK >gi|261746569|gb|ADAD01000184.1| GENE 4 858 - 2126 1866 422 aa, chain + ## HITS:1 COG:YPO2782 KEGG:ns NR:ns ## COG: YPO2782 COG3037 # Protein_GI_number: 16122986 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Yersinia pestis # 1 413 1 413 418 366 47.0 1e-101 MELMKIVVFDLLGSAPILVGLMALLGLALQKKSPEKIISGTLKAIVGFLIFAGGAGIGVT ALNNFQSLFSAGFGLKGVLPLAEAVTALAQTKFATVVSLIMVLGFFCNLLVARFTKFKYI FLTGQHNLYLAALLTVVLKALNFSNVTTVILGGIILGIAAALYPALAQPYMRKVTGNDDI 
AMGHYVTIAYALSGWLGGKVGNPDESTEKLKLPGWLSIFKDYIVSVSISIIVFFYIASFA AGKETVEKLSGGVSWLVYPLFQSLTFTASLYIIITGVRMLLGEIVPAFVGISEKLIPNAK PALDCPVVFPYAPTATVVGFLSAYAGGLLCMIVLAMLGMTVIIPVAVPYFFIGATAGVFG NATGGWKGAVAGGFVTGILIAVGPALLYPIMEIIGLKGTTFPETDFVALGLVIYYLGKMF GR >gi|261746569|gb|ADAD01000184.1| GENE 5 2149 - 2958 1167 269 aa, chain + ## HITS:1 COG:FN0294 KEGG:ns NR:ns ## COG: FN0294 COG3959 # Protein_GI_number: 19703639 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Fusobacterium nucleatum # 4 265 3 269 270 262 48.0 5e-70 MEYDLELLKKKSQKYRKAIIELIYRAKAGHPGGSLSCIDILNYLYTYKIDFNNENRSRLV LSKGHVTPAIYAVFMDLGFINDDEIDTFRRVNSRLQGHPDRNKIPELDANTGLLGQGLSI GIGMALGKKLKKDNSKVYVIIGDGEMQEGQIWEGLMSGAHYNLNNLTVFLDYNKLSSKND VNKTMNLEPIKDKIEAFGWNAIEINGHEFNEIKKSIEFSEKSEDKPTFIIAHTVKGKGVS FMENNPKWHSSALTQEEYDIAIKDIERGN >gi|261746569|gb|ADAD01000184.1| GENE 6 2962 - 3897 1067 311 aa, chain + ## HITS:1 COG:MJ0679 KEGG:ns NR:ns ## COG: MJ0679 COG3958 # Protein_GI_number: 15668860 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Methanococcus jannaschii # 6 308 8 309 316 223 40.0 5e-58 MKKYEYISPRDYIGDILIELGEKNDKITVIDSDLASSVTTHKFQQRFPERFFEMGIAEQN SLGVTAGLATEGFIPFYVNFAIFSTGTVWTQLRQSCYANLNIKIIGTHPGMDNGPDGASH HALEDIALSRVLPNLKVFNPLDLEDLKAAIKRAIEIKGPVYIRVARDIVPVIHKEPLIFK EGVSELLFNEGNDYLLIFEGTAAKQAVEGFELLKEKGFKNKLLNIKSIKPLDKEGILNEV KNMKGIITIENHTVNAGLGGAISEMIAESGMGVPLRRVGIQDTFTESGKTGDVKQKYGLS GEKVLETVEKF >gi|261746569|gb|ADAD01000184.1| GENE 7 4072 - 4503 759 143 aa, chain + ## HITS:1 COG:rpiB KEGG:ns NR:ns ## COG: rpiB COG0698 # Protein_GI_number: 16131916 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Escherichia coli K12 # 2 140 3 141 149 177 66.0 4e-45 MKIAIGSDHVGLELKPVIVEYLKELGHEVEDFGAYSKERTDYPIYGEKVAKAVIEGNYDC GIVMCGTGVGISIAANKIKGIRAVVCSEPYSAKLSKEHNNTNILAFGSRVIGSELAKMIV KEWLEAEFEGGRHANRVNMLDNM >gi|261746569|gb|ADAD01000184.1| 
GENE 8 4676 - 5552 992 292 aa, chain + ## HITS:1 COG:BH3727 KEGG:ns NR:ns ## COG: BH3727 COG1609 # Protein_GI_number: 15616289 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 6 291 1 281 331 187 37.0 2e-47 MKNKKVVSIKELAEMSEVSIATVSRVINKKGGYSKETEEKILKLAESKSYQQNVNARSLR TKKSQTIGVIVPDISNEFFAKIIQAVEKQAIKYNYSVFVCNTDENIEIEKRQLNNLIGQF VDGIIYIGGGVQLGNETQALKIPMIYIDRYIDDKEIYIESDNFHGGYLAGQELIQSGCRK IAVMKDIRKISSAHKRYLGFLKALKDSKVGFDEKLLCDVTVINYKEAKEKTLELLDSGEV FDGVFATNDTLALGVMTALNERRIRIPNEVKIVGFDNISASEIAGIPLTTIN Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:51:26 2011 Seq name: gi|261746567|gb|ADAD01000185.1| Leptotrichia goodfellowii F0264 contig00143, whole genome shotgun sequence Length of sequence - 332 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 331 397 ## gi|262039408|ref|ZP_06012717.1| hypothetical protein HMPREF0554_2147 Predicted protein(s) >gi|261746567|gb|ADAD01000185.1| GENE 1 1 - 331 397 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039408|ref|ZP_06012717.1| ## NR: gi|262039408|ref|ZP_06012717.1| hypothetical protein HMPREF0554_2147 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2147 [Leptotrichia goodfellowii F0264] # 1 110 1 110 110 151 100.0 1e-35 GDKKVNPYDFEYAQFDPDSEVSGDFTKAHIDKFFAGLGEAGARLRKGDIKGGLGKVAETF GKTGKVLGKDTKNVIAGGDKRKKTIEESKNAIKEKAAEVKARNEKKRKAA Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:51:34 2011 Seq name: gi|261746565|gb|ADAD01000186.1| Leptotrichia goodfellowii F0264 contig00022, whole genome shotgun sequence Length of sequence - 442 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 441 525 ## gi|262039410|ref|ZP_06012718.1| conserved hypothetical protein Predicted protein(s) >gi|261746565|gb|ADAD01000186.1| GENE 1 3 - 441 525 146 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039410|ref|ZP_06012718.1| ## NR: gi|262039410|ref|ZP_06012718.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 146 1 146 147 251 100.0 1e-65 GNVDIKSEEGDIDFGPTNGSIGGKLTYTGKHINILDMHSERTIDRVTKNTYVGVGITASI PVVGAAKQVWDAGKNLKNAKHKEDYLNAGLGAYSAGMSAASAIGEMFTSPFGVTGSVNVS HNKNVYHREESISVGSNLHVGGGVEY Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:51:49 2011 Seq name: gi|261746547|gb|ADAD01000187.1| Leptotrichia goodfellowii F0264 contig00042, whole genome shotgun sequence Length of sequence - 16116 bp Number of predicted genes - 18, with homology - 17 Number of transcription units - 4, operones - 2 average op.length - 8.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 69 - 130 1.5 1 1 Op 1 . - CDS 156 - 2081 2391 ## Lebu_1818 hypothetical protein 2 1 Op 2 . - CDS 2099 - 2551 463 ## gi|262039425|ref|ZP_06012732.1| conserved hypothetical protein 3 1 Op 3 . - CDS 2397 - 2813 113 ## 4 1 Op 4 . - CDS 2852 - 3418 903 ## COG1704 Uncharacterized conserved protein 5 1 Op 5 . - CDS 3440 - 4036 701 ## COG0218 Predicted GTPase 6 1 Op 6 . - CDS 4051 - 4845 1000 ## COG1316 Transcriptional regulator 7 1 Op 7 . - CDS 4870 - 5493 742 ## COG0237 Dephospho-CoA kinase 8 1 Op 8 . - CDS 5530 - 6309 722 ## Sterm_1364 hypothetical protein 9 1 Op 9 2/0.000 - CDS 6373 - 7854 1767 ## COG0457 FOG: TPR repeat - Prom 7935 - 7994 9.2 - Term 7932 - 7973 0.2 10 1 Op 10 . 
- CDS 8045 - 9085 1566 ## COG1077 Actin-like ATPase involved in cell morphogenesis 11 1 Op 11 33/0.000 - CDS 9132 - 9689 1008 ## COG0233 Ribosome recycling factor - Prom 9719 - 9778 7.2 12 1 Op 12 24/0.000 - CDS 9818 - 10528 1148 ## COG0528 Uridylate kinase - Term 10538 - 10575 3.1 13 1 Op 13 38/0.000 - CDS 10578 - 11459 552 ## PROTEIN SUPPORTED gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts 14 1 Op 14 . - CDS 11483 - 12277 1039 ## PROTEIN SUPPORTED gi|229883428|ref|ZP_04502896.1| SSU ribosomal protein S2P - Prom 12380 - 12439 8.3 15 2 Op 1 . - CDS 12516 - 13622 1633 ## Lebu_1920 hypothetical protein - Prom 13738 - 13797 14.5 - Term 13782 - 13822 1.8 16 2 Op 2 . - CDS 13839 - 14918 1466 ## Lebu_1920 hypothetical protein - Prom 15075 - 15134 9.8 17 3 Tu 1 . - CDS 15185 - 15640 567 ## gi|262039421|ref|ZP_06012728.1| putative liporotein - Prom 15731 - 15790 7.4 18 4 Tu 1 . - CDS 15819 - 16115 409 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 Predicted protein(s) >gi|261746547|gb|ADAD01000187.1| GENE 1 156 - 2081 2391 641 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1818 NR:ns ## KEGG: Lebu_1818 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 27 610 27 612 647 551 52.0 1e-155 MEYKSCRKQKKILVILFTFFSFFLINSKALKAETIKNYNVTIQINKDATLTVNEVIDYEF DVMKHGIYRDIPLRSRKEGFDIYRSYVRMNSVKRNGEPEHYSSKSFDEGIRYRVGSANTY VNQGLNTYEFNYTIYNAVFEKDGIYQIYYNAIGQFWEVPIEKATVTVKFPPYGSQIEQKE IEKLEVYTGSYGEKSDNYNIDENSGEIKISTKNELSPDSGLTFMLNLKTDKINPTFLDKV KLIYYTNPIIILGPVILIILTIYAFVTWLLFGRDPARKAVIPEFNLPKDMSAMFAAYMDG ARDPKEIMNVGVLSLLSKGYVTAEDKDGDGKNVKYNKNMEKTNNEKKELYSEEIKLYNAL SSSKDNIFKDGDALYSTGISILKDLGDKYNKIIYRHNYGFLVPFILAIPVLIIFSSGGGS SIIRMENGFYFIIFFYILGVFGSILHSVFRGGNKIVILAGIATIALLASLSMGKVVFIMA ICFLVLFFGYMKLIGKYTNEGIRKKEYLQGMKMYIKTAEENQIKKFNDVDEMVSYFKGIL PFAVALGVKNEAIKLMRKAIKMNHFSENDFNRNQTLDLWVYNDYGLRSSLSREYNTAQEK IYKEKFSSSRGGSSGGFGGGGFSGGGGFSGGGSGGGGGGSW >gi|261746547|gb|ADAD01000187.1| GENE 2 2099 - 2551 463 150 
aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039425|ref|ZP_06012732.1| ## NR: gi|262039425|ref|ZP_06012732.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 150 1 150 150 238 100.0 9e-62 MKRKRAVFFFTCQTFLLAVEIIFSVGLPSLISILITALFIIYIKVLGKYTPEGIEQLKII EGIRKYIKSKEKIIFNKEEDLINHFGVLLSYTAAMNVEKETLELIEKDIEISVFKERRDY IREKLCINYYENRKEYEKEYIKYIFPKIYR >gi|261746547|gb|ADAD01000187.1| GENE 3 2397 - 2813 113 138 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDSVIKMGIITVVILIIYCFVTWLLFGKEYDYKKEIISVYEAPKELSSIFVLYILKKNKV LSFELFYIGVLSLIEKGFLTVENTAIFDEKKKSSIFLYLPDFSFSSRNYIFSRVTFINFD IDYRTFYYLYKSIRKIYT >gi|261746547|gb|ADAD01000187.1| GENE 4 2852 - 3418 903 188 aa, chain - ## HITS:1 COG:SMc00853 KEGG:ns NR:ns ## COG: SMc00853 COG1704 # Protein_GI_number: 15964606 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sinorhizobium meliloti # 5 188 1 182 182 160 49.0 1e-39 MPILMIILAIVAVLILVGVGAYNKYIRLKNLNEEGWSGIEVYLQKRLDLIPNLVNTVKGY AAHEKETLENVVRLRSQMMSIDTKDIENIEKIQKLENEMTKTLRSIMMLQENYPDLKANE NFISLQSQLSQIESEIQNSRKYYNGTARDRNTFVETFPNNILGGFLGFKKAEFFNAAEGA ERAPEVKF >gi|261746547|gb|ADAD01000187.1| GENE 5 3440 - 4036 701 198 aa, chain - ## HITS:1 COG:FN2013 KEGG:ns NR:ns ## COG: FN2013 COG0218 # Protein_GI_number: 19705309 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Fusobacterium nucleatum # 1 193 1 194 194 247 68.0 8e-66 MEIKKSEFVKSAVYEKDYPEKKGIEFSFIGRSNVGKSSLINSLTNRRNLARTSKTPGRTQ LINYFLIDSEIYFVDLPGYGFAKVPEAVKRNWGTTIETYLKSERDKVVFLLLDLRRIPSG EDMEMLRWLEHYNIEYYIIFTKSDKLSNNEKAKQLKEIKKKLEFNNEDVFFSSALKNTGK EELLDFIYEKVKNYYKNK >gi|261746547|gb|ADAD01000187.1| GENE 6 4051 - 4845 1000 264 aa, chain - ## HITS:1 COG:SA2103 KEGG:ns NR:ns ## COG: SA2103 COG1316 # Protein_GI_number: 15927890 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Staphylococcus aureus N315 # 22 261 68 311 315 140 33.0 2e-33 
MKVFRRLVLLAVILCLGWLSIPFNVLVLGSDARPNQPLKGSRSDGIIVLKVTPLLAKIQM ISIPRDTYTDVPCEKGGKVDKINHSYAFGGRECTIKAVEELLDTKINYSVLFRFDDVIAL TDIIGGVDIVSNHTFTQDNQSFVQGQAYNIKGERALAYTRHRKTDSAFKRDERQRQVIQS IAKKLVSPAGWQYIPGVRNYMQEKMEVSFNPLRIISTLPAILINKTNFEQHEIKGDGKMI KGVWYFIPEQSSLDEAKKDFKIYF >gi|261746547|gb|ADAD01000187.1| GENE 7 4870 - 5493 742 207 aa, chain - ## HITS:1 COG:lin1598 KEGG:ns NR:ns ## COG: lin1598 COG0237 # Protein_GI_number: 16800666 # Func_class: H Coenzyme transport and metabolism # Function: Dephospho-CoA kinase # Organism: Listeria innocua # 3 206 5 198 200 128 40.0 9e-30 MIIGLTGGIGTGKSTVSNIFRQKGIPVVDTDVIAREVIDYPEVVNEIIRNFGTEILEEET QQEQGQNKFKKKKISRNKLGQIVFKDEKKVGILNSIMHPLIIKVMKEQTEKLKKDNKIIV ADVPLLFEIHLEKEFDITVLVYADKETQIKRIMKRDKRTLEQAEDIINSQMDIEEKKKKS NYIIYNNGDFEKLTEETEKFLKSLKNM >gi|261746547|gb|ADAD01000187.1| GENE 8 5530 - 6309 722 259 aa, chain - ## HITS:1 COG:no KEGG:Sterm_1364 NR:ns ## KEGG: Sterm_1364 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 22 258 20 256 264 83 30.0 7e-15 MKRNTLKIILFLILSVCSFSINYDEYLRKELTDFGVRKNSIDTFFQGKKALEEMAPINEI EALFFKAIELDKRNYMAYQYLGTEALIHDRDKNKSIEYYKKSLKINPKNFEIYLNMGYIY EEEGNYEMAFKEYEKLKKIVPNSPESYYAAASLNFKLGKFSNMLNDAQKALKLYESPSYN LDMKEKYIMDAQFLILVSFFEQEKYEEALNYFFIVGPNMKKNNFDSFPKLLNVATEVINT KLESKNKKIYEEKLKKIKR >gi|261746547|gb|ADAD01000187.1| GENE 9 6373 - 7854 1767 493 aa, chain - ## HITS:1 COG:alr3807 KEGG:ns NR:ns ## COG: alr3807 COG0457 # Protein_GI_number: 17231299 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. 
PCC 7120 # 28 279 316 572 604 68 23.0 2e-11 MKKRIILGIIVLIATFNTFSEELSSEELFNTGLKYHTEKNYKEAEKYYKESFLKKADGDT AYNLGVLYDTTKDVKQAEKYYKEAVRLGQVKAYYKLGYLYSKNKNKNLAIENYKQSVEKT KNVDAMYNLGLIYEETNQKNEAIKYFQMAADKGDKDSYDNLGNLYMETGNLALAEKNYKI AADKYNNAEAAYNLGVLYEKKNDNKNAIKYFEKAMSAGIKGGYYRLGLLYDEVKDSKKAE QYYKLAVDKENDDTAAYNLAVLYEKSGKLSEAEKYFLSAYNKNKTGKSALNLGTIYEKQK KYDLAEKYYKEAMNSGEKQKAAYNLIYLYQEQNKGEKAAEILKNMDGVNEDPELLYNIAL SYDTANNKVEAEKYYLKAIEKGKLTKAMNNLAILYYEQKKLQESLKYYKILVDDYGKTEN AYNTGLIYSQLKQPKEAIKYFEIALNKVGNKNALYNLGVSYYDLGDKNTAKKYLKQAAGN GDKDAQLMLDGME >gi|261746547|gb|ADAD01000187.1| GENE 10 8045 - 9085 1566 346 aa, chain - ## HITS:1 COG:FN1577 KEGG:ns NR:ns ## COG: FN1577 COG1077 # Protein_GI_number: 19704898 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Fusobacterium nucleatum # 11 345 6 341 342 357 55.0 1e-98 MSRFTRFINFFRTKKNIAIDLGTSNVLIYDKQRKKIVLNEPSILVKDKKSGEIVAIGTEA KEMLGKTSDSLVVIKPLSEGVISDIDATREMLNIFVKRIYGGSPFRPEIMICVPAQVTSI ERRAIFDAAIGAKKIYLIEEGRAAVLGSGVNISLPEGNTVIDIGGGSTDIAILSLDEIIT SKSIRVAGNNFDQDIIKYVKKKFDLLIGDKTAENLKKEIGTAMVVSGDENITTTIKGRDL TTGIPKVIEINSNQVREAIEDSVNEIITALKEVLEQCPPELSADILNNGVVLTGGGSLLR NFDKLIEERVQIPINRSTNSLESVVMGGGLAFDNKKLMRTLEMREI >gi|261746547|gb|ADAD01000187.1| GENE 11 9132 - 9689 1008 185 aa, chain - ## HITS:1 COG:FN1623 KEGG:ns NR:ns ## COG: FN1623 COG0233 # Protein_GI_number: 19704944 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome recycling factor # Organism: Fusobacterium nucleatum # 3 185 6 190 190 205 64.0 3e-53 MLDTILKETEDRMHKALENTKDKFSHVRAGRASVSMLDGVSVEAYGVMTPLNQVGTVSAP EARLLTIDPWDKSIIPAIEKVILQANLGFNPSNDGKIIRLVVPELTEDRRKEYVKLVKKE AEEGKVAIRNVRKDVNNKIRKLEKDSEITEDELKSGEEKVQKMTDKFIAEIDDVLAKKEK ELLKV >gi|261746547|gb|ADAD01000187.1| GENE 12 9818 - 10528 1148 236 aa, chain - ## HITS:1 COG:FN1622 KEGG:ns NR:ns ## COG: FN1622 COG0528 # Protein_GI_number: 19704943 # Func_class: F Nucleotide transport and 
metabolism # Function: Uridylate kinase # Organism: Fusobacterium nucleatum # 4 235 6 237 239 317 67.0 1e-86 MLKYKRILLKLSGEALAGKKEFGFSDEVLESFARQIKEVHDKRVEIAIVIGGGNIFRGAT GTSKGVDRVTGDTMGMLATIMNGLALQNAIEHFDIPTRVLTAVQMPQVAEPFIRRRAIRH LEKGRVVIFAGGTGNPYFTTDSSGALRALEINADILAKGTKVDGIYDKDPMKNKNAVKYE TVTYDEAISKNLGVMDTAALSLCKENKMPIIVFNALEEGNILKMVEGEKIGTLVVD >gi|261746547|gb|ADAD01000187.1| GENE 13 10578 - 11459 552 293 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts [Haemophilus influenzae R2866] # 3 292 4 280 283 217 44 5e-56 MAVTTALIKELRERTGAGMLDCKKALEENGGDIEKAIDWLREKGIAKAAKKSGRVAAEGL VFGAVSADRKKGAILEFNSETDFVAKNDEFKSFGEKLVQLSLTHDVTSEDELKALEVEGK KIEDVLTELIAKIGENMNIRRLKLVKTDGFVETYIHLGGKIGVLVNVAGEATPENVEKAK GVAMHIAAMDPRYLDASQVTPEDLEREKEIARHQLEQEGKPANIVEKILEGKMRKFYEEN CLVNQKYVRDDSLTIEKFIAPLSIVSFDRFKVGEGIEKDDVDFAAEVAAMAGN >gi|261746547|gb|ADAD01000187.1| GENE 14 11483 - 12277 1039 264 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229883428|ref|ZP_04502896.1| SSU ribosomal protein S2P [Sebaldella termitidis ATCC 33386] # 1 262 1 256 263 404 77 1e-112 MAVITMKQLLEVGAHFGHQAKRWNPKMKPYIFTERNGIHILDLHQTLAATEKAYEFVREI SENGGKVLFVGTKKQAQDAIREEAERAGGFYVNHRWLGGLLTNLNTIKTRVKRLKELEEM DADGTLDTAYTKKEAGLLRKEMAKLSKNIGGIKNMSKLPAALFVVDIKKEFLALEEAKKL GIPVIALIDTNVDPDLVTYKIPANDDAIRSVKLFAQVIANAAVEGNGGVEGFVEEEDGSE EVLQETDVVEFEDEAVTEEEAEIN >gi|261746547|gb|ADAD01000187.1| GENE 15 12516 - 13622 1633 368 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1920 NR:ns ## KEGG: Lebu_1920 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 15 368 14 370 370 265 41.0 3e-69 MKKLVGLLALTLAISGVAYSDQDVIKSIKVTSEVRQGWQDKDGDKANGIGNAGFKKNNRA RTRWRNAVSGNINLVDEWGLEASFLIRNDQDTNRNKFDSSSRNITNLGTYNKREGWYTNL ELSKSLKLGALDTNTIFGWTHEASRQRIDPLTGKESLTGQKQTKGILNEIYFGPKFDVKL FGQNISTTVQGVYFNIKGNKSGDYHFSGTDFVNGRADGWGLNLDLSTSGNIFEGGFGKVG YNVDLTHHFRDAKGKVTATGEDAKSNVYIDYYTAVHYTTSSFAGFYGKITVDNEWEKYTA 
VNGWNNYFSIWTDLGYKASFDTSVGTISVNPFVRYRPLHRETAKDHSGRTTIEINEVRAG LSIGLTAK >gi|261746547|gb|ADAD01000187.1| GENE 16 13839 - 14918 1466 359 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1920 NR:ns ## KEGG: Lebu_1920 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 15 359 14 370 370 272 46.0 1e-71 MKKLLGLLALTLAVSGIASADQDVLKSIKATSELRQGWTDKDGDKANEIGNAGFRKHNKT RLRWRNVVTGELNLNDEWGLDAKFKVQNDKDRFYNYNANGENTSSAARKGAWETNLEFGK ALNIGSLETRTALGWTHKSSYAGTKESHKGTTGHSNEIYFGPTFGVNVFGQSISTTLQAV YFTANDGKDADYAHPGKAFERGKTEGWGLNAELATDGKIFDGNAGSLGYYVDLTHKFRAP KGKLAATGEKAGSNVYLDYVVGATYTTPSFAGFYGQLNVENEWEKHTAVTGYTNNFSVWT NAGYKAAFDTAVGEVSVNPFIKYRPLHRETTYNKHDENKKVTTETNELRAGVSVGLTVK >gi|261746547|gb|ADAD01000187.1| GENE 17 15185 - 15640 567 151 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039421|ref|ZP_06012728.1| ## NR: gi|262039421|ref|ZP_06012728.1| putative liporotein [Leptotrichia goodfellowii F0264] putative liporotein [Leptotrichia goodfellowii F0264] # 1 151 1 151 151 225 100.0 1e-57 MKKFKNILMVVLSVTSITACTGIGVGVGIPIGPVNVGLGTTIGLPKGSDSSSKAGKVNGR YVINGDGLSKMVTLKGEPVEIIGSSNRLIIKGRAVSINITGTNNIVEVDSTENVNIEGEN NKVSYKTSITETKRPNISISGAGSEVYKVQE >gi|261746547|gb|ADAD01000187.1| GENE 18 15819 - 16115 409 98 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 98 310 407 407 162 76 2e-39 GSVKPHTSFKSEVYVLTKDEGGRHTPFFTGYKPQFYFRTTDITGEVNLPDGVEMVMPGDN IEMSVELIHPIAMEEGLRFAIREGGRTVASGVVATIVK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:52:43 2011 Seq name: gi|261746541|gb|ADAD01000188.1| Leptotrichia goodfellowii F0264 contig00158, whole genome shotgun sequence Length of sequence - 5583 bp Number of predicted genes - 6, with homology - 5 Number of transcription units - 3, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 81 122 ## - Prom 207 - 266 11.7 - Term 88 - 140 2.2 2 2 Tu 1 . 
- CDS 303 - 2990 4025 ## COG0474 Cation transport ATPase - Prom 3079 - 3138 8.1 - Term 3088 - 3138 3.3 3 3 Op 1 . - CDS 3157 - 3600 617 ## Lebu_0717 hypothetical protein - Prom 3620 - 3679 4.2 4 3 Op 2 . - CDS 3683 - 4393 886 ## COG0813 Purine-nucleoside phosphorylase 5 3 Op 3 . - CDS 4415 - 4843 580 ## COG0295 Cytidine deaminase 6 3 Op 4 . - CDS 4860 - 5570 1158 ## COG0813 Purine-nucleoside phosphorylase Predicted protein(s) >gi|261746541|gb|ADAD01000188.1| GENE 1 3 - 81 122 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSKKEFVEAYAKATGETKKRSEELVN >gi|261746541|gb|ADAD01000188.1| GENE 2 303 - 2990 4025 895 aa, chain - ## HITS:1 COG:SPy0623 KEGG:ns NR:ns ## COG: SPy0623 COG0474 # Protein_GI_number: 15674699 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Streptococcus pyogenes M1 GAS # 2 871 10 867 893 720 46.0 0 MWFTKSQEEVLRELNVNPKTGLTNDEVNKRLEKYGQNKLKGKPKKSIFQLFLGQLQDVLI YVLIGAAVINIVAHGLEGVTDAIIILAVVLINAIVGVVQESKAEKALEALQQMTTPKSAV RRNGEIIEINSEDLVPGDILVIDAGRFIPADIRLIESANLQIEESALTGESVPTEKNADF IAEDEKIPLGDKENMAFMSTMATYGRGEGVVVATAMDTEIGKIAKILDEDENTLTPLQIK LDELGKTLGYMAIGICLFIFVIGLFQGRNWIDMLMTSISLAVAAIPEGLVAIVAIVLSMG VTRMSKKHAIVRKLPAVETLGAVNIICSDKTGTLTQNKMTVVKIYTLDNHRDVPSEGRDF EANKDEKELIRSFVLCSDASIDGEQDVGDPTEVALVVLGDRFNLEKNTLNTEYKRVGENP FDSDRKLMSTLNEEENGYRVHTKGAIDNILTKSDRIFVNGEIIPLTEEMKNKILKAAEEM SDTALRVLGVAFKDTDSIISAEEMEKDLVVVGIVGMIDPPRTEVKASIVEAKKAGITPVM ITGDHKNTAVAIAKELGIATDISQSLTGAEIDEIPEDKFAEDINKYRVFARVSPEHKVKI VKAFKDHGNIVSMTGDGVNDAPSLKFADIGVAMGITGTDVSKGASDMILTDDNFTTIVTA IEEGRNIYNNIKKTIMFLLSCNLGEVMCVFAATLFNWPLPLLPIQLLWINLVTDTLPAIS LGVDPGDKEVMTRKPRNPKESFFAEGAGMRAVIAGILIGSLTLFSFYIGINEHGFGISEL FNNNTPEAETALTYGRTMAFIVLTVSQLFYSLTMRNSKKTIFEVGFFKNKFLILSIITGI VLQVGLTSIPSISNIFKVTQIKLVDWDIVILFALIPFAVNEIIKIISRKRNIIQK >gi|261746541|gb|ADAD01000188.1| GENE 3 3157 - 3600 617 147 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0717 NR:ns ## KEGG: Lebu_0717 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: 
not_defined # 5 145 12 148 154 66 30.0 3e-10 MLKKMKVLLTLFFVIISVVSFGEMKINDDGILVGESSEDWEEFFGDDYYKTGNICTVIGT TIMQMSYNKDGKGDKLSNPDNDVKAMLNDINEALDEMGEKNPKKGKNYLYESYYVKNCKK LTEADYKLANSKTFRDTFKKMFSTYGK >gi|261746541|gb|ADAD01000188.1| GENE 4 3683 - 4393 886 236 aa, chain - ## HITS:1 COG:DR2166 KEGG:ns NR:ns ## COG: DR2166 COG0813 # Protein_GI_number: 15807160 # Func_class: F Nucleotide transport and metabolism # Function: Purine-nucleoside phosphorylase # Organism: Deinococcus radiodurans # 5 234 70 299 305 291 60.0 7e-79 MGTPHIEAKRGEIAETILLPGDPLRAKHIAETFFENVMQYNGVRGMSGFTGTYKGKKVSV QGTGMGTPSTGIYSHELITEYRVKNLIRVGTAGSFQKDLKVRDIVMAISSSTESNINKLR FNGADYAPTASPDLLFRTYKTAKLKKINIRAGNILSTDTFYDDEPEHWKKWAKFGVLCVE METAQLYTTAAKFGVNALTLLTICDSLVTGEATSSKERQSDFNEMVELALESILEQ >gi|261746541|gb|ADAD01000188.1| GENE 5 4415 - 4843 580 142 aa, chain - ## HITS:1 COG:BS_cdd KEGG:ns NR:ns ## COG: BS_cdd COG0295 # Protein_GI_number: 16079584 # Func_class: F Nucleotide transport and metabolism # Function: Cytidine deaminase # Organism: Bacillus subtilis # 17 142 5 130 136 127 47.0 6e-30 MKELAGKIKLTEKEISDYIDEANETLDRAYVPYSKFPVAALLIDQNGKKFKGVNVENASY GVGICAERNVIPTAVTEGMKKIKLLVVTGGTPEPISPCGACRQFISEFSDKDTVIILTNR DKKYKIWSIDELLPYSFGPEDL >gi|261746541|gb|ADAD01000188.1| GENE 6 4860 - 5570 1158 236 aa, chain - ## HITS:1 COG:DR2166 KEGG:ns NR:ns ## COG: DR2166 COG0813 # Protein_GI_number: 15807160 # Func_class: F Nucleotide transport and metabolism # Function: Purine-nucleoside phosphorylase # Organism: Deinococcus radiodurans # 5 236 70 301 305 322 65.0 4e-88 MATPHIGAKKGDIAETILLPGDPLRAKYIAETFLQDIVQYNNVRGMLGFTGTYKGKKVSV QGTGMGVPSIGIYSHELINEYGCKNLIRVGTAGSFQESVKIRDVVIAMAASTDSAINKLR FNGADYAPTASADLLFKAHEVGKAKGLSMKAGNVLTSDTFYGDEPEAWKKWAKFGVLCVE METAQLYTTAAKFGVNALTLLTISDSLVTGEATSAEERQLTFNDMIEVALESALNL Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:52:51 2011 Seq name: gi|261746539|gb|ADAD01000189.1| Leptotrichia goodfellowii F0264 contig00205, whole genome shotgun sequence Length of sequence - 
704 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 702 773 ## gi|262039436|ref|ZP_06012741.1| conserved hypothetical protein Predicted protein(s) >gi|261746539|gb|ADAD01000189.1| GENE 1 3 - 702 773 233 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039436|ref|ZP_06012741.1| ## NR: gi|262039436|ref|ZP_06012741.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 233 1 233 234 238 100.0 3e-61 NNINNTGNILANNLDINSKSLNSNRITAGKATINAEDTVSNKIEGNNINITGNTLKTNVI EGKDVNLNLNGNISNTGNIVAGTLNVSAKDFDNKEIVANNLNLNTSGKIQSNKINSNTSN IKGQDIISNEIETGILKLQGNDIQSNKITGNEVEIKGNNLKTNVIEGKNVDLTVNNNIHN TQNITAGKLKINSSNLINNEINASELNINTGQNIDSNNIRAAKATINATDIKS Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:53:14 2011 Seq name: gi|261746504|gb|ADAD01000190.1| Leptotrichia goodfellowii F0264 contig00046, whole genome shotgun sequence Length of sequence - 35189 bp Number of predicted genes - 34, with homology - 33 Number of transcription units - 12, operones - 6 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 1810 2253 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit - Prom 1840 - 1899 13.4 + Prom 1977 - 2036 11.4 2 2 Op 1 . + CDS 2258 - 2920 931 ## Lebu_0371 hypothetical protein 3 2 Op 2 . + CDS 2932 - 3069 61 ## gi|262039454|ref|ZP_06012758.1| NADH dehydrogenase subunit 5 4 2 Op 3 . + CDS 3100 - 5784 2266 ## COG0608 Single-stranded DNA-specific exonuclease 5 2 Op 4 . + CDS 5840 - 6382 451 ## Lebu_1546 hypothetical protein 6 2 Op 5 . + CDS 6383 - 7669 1300 ## COG0658 Predicted membrane metal-binding protein + Prom 7710 - 7769 8.1 7 3 Op 1 1/0.000 + CDS 7824 - 8447 935 ## COG4399 Uncharacterized protein conserved in bacteria 8 3 Op 2 . 
+ CDS 8455 - 9522 1539 ## COG2255 Holliday junction resolvasome, helicase subunit 9 3 Op 3 5/0.000 + CDS 9552 - 10259 779 ## COG1385 Uncharacterized protein conserved in bacteria 10 3 Op 4 . + CDS 10246 - 11607 339 ## PROTEIN SUPPORTED gi|227996069|ref|ZP_04043107.1| SSU ribosomal protein S12P methylthiotransferase 11 3 Op 5 . + CDS 11604 - 12608 1329 ## Sterm_0175 hypothetical protein 12 3 Op 6 . + CDS 12625 - 12969 552 ## COG0799 Uncharacterized homolog of plant Iojap protein 13 3 Op 7 1/0.000 + CDS 12993 - 14948 2696 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 14 3 Op 8 1/0.000 + CDS 14986 - 16860 2613 ## COG0595 Predicted hydrolase of the metallo-beta-lactamase superfamily 15 3 Op 9 . + CDS 16862 - 17188 300 ## PROTEIN SUPPORTED gi|229884332|ref|ZP_04503793.1| predicted RNA-binding protein containing KH domain, possibly ribosomal protein 16 3 Op 10 . + CDS 17185 - 17646 898 ## COG1963 Uncharacterized protein conserved in bacteria + Prom 17649 - 17708 9.0 17 4 Op 1 1/0.000 + CDS 17749 - 18486 1219 ## COG1189 Predicted rRNA methylase 18 4 Op 2 1/0.000 + CDS 18506 - 19810 1833 ## COG0793 Periplasmic protease 19 4 Op 3 1/0.000 + CDS 19827 - 20528 700 ## COG0313 Predicted methyltransferases 20 4 Op 4 . + CDS 20530 - 21408 1322 ## COG1161 Predicted GTPases 21 4 Op 5 25/0.000 + CDS 21439 - 22695 1518 ## COG0438 Glycosyltransferase 22 4 Op 6 . + CDS 22598 - 23614 1180 ## COG0438 Glycosyltransferase 23 5 Tu 1 . - CDS 23714 - 24730 635 ## COG2342 Predicted extracellular endo alpha-1,4 polygalactosaminidase or related polysaccharide hydrolase - Prom 24755 - 24814 7.2 + Prom 24683 - 24742 10.5 24 6 Tu 1 . 
+ CDS 24877 - 25479 727 ## gi|262039457|ref|ZP_06012761.1| hypothetical protein HMPREF0554_0640 + Term 25580 - 25647 30.2 + TRNA 25558 - 25634 86.8 # Pro TGG 0 0 + TRNA 25650 - 25725 90.8 # Gly TCC 0 0 + TRNA 25754 - 25829 78.1 # His GTG 0 0 + TRNA 25833 - 25908 92.4 # Lys TTT 0 0 + TRNA 25929 - 26012 70.2 # Leu TAG 0 0 + Prom 25939 - 25998 80.4 25 7 Op 1 . + CDS 26166 - 26927 965 ## COG0639 Diadenosine tetraphosphatase and related serine/threonine protein phosphatases + Prom 27029 - 27088 10.8 26 7 Op 2 . + CDS 27166 - 27915 639 ## COG0101 Pseudouridylate synthase 27 7 Op 3 3/0.000 + CDS 27936 - 28409 624 ## COG0456 Acetyltransferases + Prom 28421 - 28480 10.2 28 8 Op 1 . + CDS 28562 - 28996 504 ## COG0735 Fe2+/Zn2+ uptake regulation proteins 29 8 Op 2 . + CDS 29038 - 29541 792 ## COG1528 Ferritin-like protein + Prom 29740 - 29799 15.2 30 9 Op 1 . + CDS 29824 - 31647 2581 ## COG0760 Parvulin-like peptidyl-prolyl isomerase + Prom 31650 - 31709 14.5 31 9 Op 2 . + CDS 31740 - 33044 2285 ## COG0148 Enolase + Term 33063 - 33114 4.3 + Prom 33113 - 33172 12.8 32 10 Tu 1 . + CDS 33220 - 33735 886 ## COG2077 Peroxiredoxin + Term 33774 - 33804 2.6 - Term 33753 - 33798 4.1 33 11 Tu 1 . - CDS 33804 - 34943 1541 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain - Prom 34973 - 35032 7.5 + Prom 34927 - 34986 10.5 34 12 Tu 1 . 
+ CDS 35110 - 35188 75 ## Predicted protein(s) >gi|261746504|gb|ADAD01000190.1| GENE 1 1 - 1810 2253 603 aa, chain - ## HITS:1 COG:FN1170_1 KEGG:ns NR:ns ## COG: FN1170_1 COG0674 # Protein_GI_number: 19704505 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Fusobacterium nucleatum # 1 403 1 403 410 639 77.0 0 MAKVMETMDGNQAAAHAAYAFTEVAGIYPITPSSTMAEYTDQWAAYGKTNLMGAPVKVVE MQSEAGAAGTVHGSLQAGALTTTFTASQGLLLKIPNMYKIAGELLPGVMHVAARSISAQA LSIFGDHQDIYAVRMTGWAMMATSSVQEVMDLAGVAHLAAIKSRVPMLHFFDGFRTSHEI NKVEVMDYEVFDRLLDKKALQEFRERALNPENPVTRGSAQNDDIYFQAREAQNRFYDAVP DIVNNYMKEITKETGRHYAPFVYYGAEDAERVIVAMGSVNETIKETVDFLAEKGVKVGLL TVHLYRPFSKKYFFEAMPETVKKIAVLDRTKEPGALGEPLYMDVRSLYYGKENPPIIVGG RYGLSSKDTTVEQILAVYKNLDQSEPKDHFTIGIVDDVTFSSLALEEPVFAGNKDVKACL FYGLGSDGTVGANKNSIKIIGDKTDLYAQGYFAYDSKKSGGVTRSHLRFSKDPIRSTYLV TKPSFVACSVPAYLGKYDMISGLREGGIFLLNTIWDKDKLVKNIPNEIKRELARKKAKFF IINATKLALEVGLGNRTNTIMQSAFFYLTEVIPYEEAKKYMKEYAEKTYGKKGRDVVEKN WAA >gi|261746504|gb|ADAD01000190.1| GENE 2 2258 - 2920 931 220 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0371 NR:ns ## KEGG: Lebu_0371 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 220 1 215 215 165 41.0 1e-39 MKKLGILLMALTIVSVSYGAKEKEDTTFTFQNYIPSAPDVVKEIARTKNLDGSVVAYPVF TGNLKVVEKMNKTVTKFVNTFKGHKDKTYKVEYQVVGSNDRFVSILFTIEKNDKKNNVIT KYNDAITFNVKDGKEMGIKDIFVQGYETALNAAINDKIKQFGLTVNETGKNKFKGAAKGT KFYMEDDSIVLFYNQGEGLEFADGQLFIPFITRDLIGIIK >gi|261746504|gb|ADAD01000190.1| GENE 3 2932 - 3069 61 45 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039454|ref|ZP_06012758.1| ## NR: gi|262039454|ref|ZP_06012758.1| NADH dehydrogenase subunit 5 [Leptotrichia goodfellowii F0264] NADH dehydrogenase subunit 5 [Leptotrichia goodfellowii F0264] # 1 45 1 45 45 67 100.0 2e-10 MKKSFRIVYFGAIFYCKKSGNVEKCFGEIVNKKLKKYILITFIIW >gi|261746504|gb|ADAD01000190.1| GENE 4 3100 - 5784 2266 894 aa, chain + ## HITS:1 COG:FN0374 
KEGG:ns NR:ns ## COG: FN0374 COG0608 # Protein_GI_number: 19703716 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-specific exonuclease # Organism: Fusobacterium nucleatum # 56 726 49 715 844 496 43.0 1e-140 MRNTKWVIKSVPKDKDIKINDIPVDKDILKILSSRGIKSAKDIKNFLNPKLENIQSPYGL YDMEKTAEEIEKAVNKQKNIWIYGDYDVDGITSTSILYMALKELGAENVNYYIPVRDEGY GLNNDALKKIKDSGGELVITVDCGITAFTEVEYANFLGLSVIITDHHNLHGNKIPKALAV INPKRSENSFSFNALAGVGTIFTVILALFERKGIKEKAYKYLDLVAIGTVADIVPLLEEN RIFTKFGLEKLPFSENKGLSFLLYKLFNNEANSANNPKVEYSTYDVGFIIAPVFNAAGRL KDAKMVVKLLISDNDREIEIIVKELINKNYERKELQNNIVEMVEKNIEKQHINEDFVIID YSPEYHHGVIGIAAAKIVDTYYKPVIIMEVKEDEGIAVGSCRSIENFNILEALQSMPELF VKFGGHSGAAGFTIPIKNIELFRKKINDYAKKILKEEDFVKVINIDKQIPIQKISYEFFQ VIELLKPFGFGNPSPTFQTKNILLENIKFIGESKNHLMFDLKQKGFSSRNAVWFGAGEYF KELSQNLVYDIVYKMKVESFQDRYYTKVYIDDIKVSDLKDDTLSYYHSLYNTSFPLKSVF YTNIDLEKDSPLTVKIEFDQISLFQGRKFVGRLDYNVSNLLILLNEYYNQNFKIKIENIR KTGNHNIVDILIKRDYNFECYDYTETSIFRKIKGFLIGNMEYDSYTKKLLSVFFKQNKNL MIKTEKSSQNILNNFLLTTGIYYMKQTGRKSKIIIKNVKNSPYDNPFIKTYFDIMPKYKS DADCPFAFFCDEQAEFEKYIGEENSETRFCYASNDVSEKNFVEKSEQILSAEMIEDVEEL PENIVLLNSLKKSELKDLKNIFVEYLPLKEKIRLKKLFRDGETIYSDETVKEIL >gi|261746504|gb|ADAD01000190.1| GENE 5 5840 - 6382 451 180 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1546 NR:ns ## KEGG: Lebu_1546 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 3 180 4 181 181 239 76.0 4e-62 MRKMKNNGFTLVEVLVYMSILAILFAIVFNVLQNQKQKQEFTIQKRNISQFIRKIQQYAQ YNKKEYVLDFKISGNTAYFLNEENGKTETIGKLEISGNISYMTNNTNKNADFKRRTTNEG NFEKGFSIYLLDKKGKKIYYRISTNTINAAKYPIISIYRAKTPIDVNDDYTKSVLWEEEL >gi|261746504|gb|ADAD01000190.1| GENE 6 6383 - 7669 1300 428 aa, chain + ## HITS:1 COG:FN0223 KEGG:ns NR:ns ## COG: FN0223 COG0658 # Protein_GI_number: 19703568 # Func_class: R General function prediction only # Function: Predicted membrane metal-binding protein # Organism: Fusobacterium nucleatum # 61 421 32 378 378 121 29.0 2e-27 MESEKIFLQKEINFSDILKFGILLIFLILLFIFYTDFVTFKEKLTGEKVLYVKMDGNNGS 
VLKVNNKYLKKQAIIKNTKNLSYGFYWIKYKIKNVKEKNGFITIEGKISGYKESGLNGVR RYVLNIFDELFLTEDNLYAFSKAAILGEKSDVSKDMNDKFKYTGLAHLIVISGSHISLVI MGIVKILDSVNVGYKLKYVLSLAALTFYCTLIGMSPGILRAYIMGAMMILARILFEQEDS KKSLMISAIIIIVLNPYSLFDISLQLSYAAVIAIIFIYPIFEKLLQGKYFDEMKDGIVKD VLKLTLLSLVIQMTSIPLFLYYFDKLPLFSFLLNTVGVPVGTVLIELLFGVTLLNILQIK ILNPLLVTVSKIIFNAFEGFIYAGSRLPLLQIGISVKINLLFVFIYYGMLFFLCLKLNKK DEKSENGS >gi|261746504|gb|ADAD01000190.1| GENE 7 7824 - 8447 935 207 aa, chain + ## HITS:1 COG:FN1218 KEGG:ns NR:ns ## COG: FN1218 COG4399 # Protein_GI_number: 19704553 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 7 204 3 199 200 149 45.0 4e-36 MKHLIIQFVIMVSVGTLIGWFTNYLAIKLLFRPHREINFLFFKIQGLIPKRRDEISENIA GVVEQELISVSDIAERLKGSNLDEEIVDELVDKIIGVKLQKSILEKNPLLKMIVNDSLMD KLKSYFKKAILENKEEILAEILKVVEEKIDFKEIMVEKMTNFSLDEIENIILKISKKELK HIEIIGGVLGGIIAVFQFLLMMLLKQV >gi|261746504|gb|ADAD01000190.1| GENE 8 8455 - 9522 1539 355 aa, chain + ## HITS:1 COG:FN1217 KEGG:ns NR:ns ## COG: FN1217 COG2255 # Protein_GI_number: 19704552 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, helicase subunit # Organism: Fusobacterium nucleatum # 6 328 2 325 332 452 73.0 1e-127 MLMENERILASEELGEDNIQKTLRPKTFSEYIGQEDLKEKMNIFIKAAKMRNEAMDHILL YGPPGLGKTTLAGVIATEMGVNLKITTGPVLEKAGDLAAILTSLEENDILFIDEIHRLNT SVEEILYPAMEDGELDILIGKGPSARSIRIELPKFTLIGATTKAGQLSTPLRDRFGVTHR MEYYKLEELKEIIRRGANIFQVSYDEDGITEIAKRSRGTPRIANRLFKRARDFALVEGKG ILDKASVDGILKLLGVDESGLDELDRNILKSIINVYNGGPVGIETLSLLLGEDKRTIEEV YEPYLVKIGYIKRTQRGRVVTEHGYRHLGLEKILGEKIKGENDSDLKNDDENTLF >gi|261746504|gb|ADAD01000190.1| GENE 9 9552 - 10259 779 235 aa, chain + ## HITS:1 COG:FN1215 KEGG:ns NR:ns ## COG: FN1215 COG1385 # Protein_GI_number: 19704550 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 234 1 234 235 189 48.0 3e-48 MLTVIADKENITEKEITITDKSDCNHVQNVFRLGKGDGLRVIDGEFEYLTEITNIAKKEI 
KLKIIKKNTDNYSLNINIDAAIGILKNEKMNMTIKKLTEIGVSKIIPLQTERVVVKINEK KEKWDITVREALKQCKGVKFTEIVPVTKLQSINYELYDKIVYAYENSDNTEPLVNIVNKN DKNILYIVGPEGGITEEEVNFLKEKGATEISLGKRILRAETAAIVIGGILANVYN >gi|261746504|gb|ADAD01000190.1| GENE 10 10246 - 11607 339 453 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227996069|ref|ZP_04043107.1| SSU ribosomal protein S12P methylthiotransferase [Kytococcus sedentarius DSM 20547] # 154 446 215 515 526 135 31 4e-31 MSTIRDTQKEKAEDIKNNAERTVAFYTLGCKVNQYETEIIRKDFLDHNYKEVDFDEKADV YIVNTCTVTNVADKKNRKMLRRAKNTNPDSLVVATGCYAQTNLDDLKEMKEIDFIIGNSK KENVFNIINKNVSHYQVDNIFDEKEYSSNKYTILREKARAFVKIQDGCSKFCSYCKIPYA RGLSRSRATEHVLEEINYLGEQGYKEVVLTGINMSEYGLDLEPKTDFDTLLEKILAVKSV ERVRVSSVYPDTITDKFLGMLKNNPKLMPHLHVSVQTLDDKILRLMRRNYKAEFVVNTLE KVKREVPEVALTADIIVGFPQEEEENFANTMKNLDSLGFADLHVFPYSDREKTTAMLLDG KIDAVEKKRRVKKVEELNNIKYAEFRKKTVGSKQRVYIEEIVEDKAFGYTENYLKVFIDL KKGEKNNLDVKVSDLVNTKIVDFDGILLEGDII >gi|261746504|gb|ADAD01000190.1| GENE 11 11604 - 12608 1329 334 aa, chain + ## HITS:1 COG:no KEGG:Sterm_0175 NR:ns ## KEGG: Sterm_0175 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 1 334 1 321 321 220 40.0 7e-56 MKKFLLVMISLLFFVSCFENAEKLKSENRIFRTNDKYYVYYDGNTFELPKDLYLTKTKKV EQYFTSGMLKQLNIGTLSKAELLNDLNKYFPSGIKYITENETPVGSVLIPVTSVGSEKHV DSIKFEKMLANMPKAQEVKKDDETQEETVVNTNPAETLKGKKFEILNANGIDGFAKNIGE KLKAKFGIEYNAENYTKPETMNYVIVRKLNDTEIEAILAEAGLKYAKILEDKTVKPDADF VLITGNDSQINFPVEIVTLGEKSVTADKIKGYNVKNIKSSEYKGEKLDKIIDVKIVYNPS DVYTAKLLAKQIGGGVKLIADNEINGKIIIVSKN >gi|261746504|gb|ADAD01000190.1| GENE 12 12625 - 12969 552 114 aa, chain + ## HITS:1 COG:mll4005 KEGG:ns NR:ns ## COG: mll4005 COG0799 # Protein_GI_number: 13473415 # Func_class: S Function unknown # Function: Uncharacterized homolog of plant Iojap protein # Organism: Mesorhizobium loti # 4 110 10 120 125 79 32.0 1e-15 MEENKELKKEIDSIISIIEDKKGQDIKVFDMKGRSPFFDYSILCTGSSTRNIEAIANDIK KSSENIRSIEGLEECNWVLIDSGDIIISIFSKDAREYYQLDAFYEGVNQEGSII >gi|261746504|gb|ADAD01000190.1| GENE 13 12993 - 14948 2696 651 
aa, chain + ## HITS:1 COG:FN1211 KEGG:ns NR:ns ## COG: FN1211 COG0768 # Protein_GI_number: 19704546 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Fusobacterium nucleatum # 7 634 15 606 657 367 38.0 1e-101 MRELDKEERNPRYVMFILLVASVFTVLVARLFSLQILNASTYEERALQNRIRTNVIKATR GEIYDREGKLLAKNTTGYKLIHTDTRQLSSNDIELLRKIQNLDENQLEEALSRQKKQKAE GLKETIEDIRTISQITGYTTDYIITRFSKQPRIGIDKTILVIEDLDKNIALKAVEKIKNN RINIVEYNKRYYPEDSIASHVIGNVKPISEKEYNELKKEGYQNDDLIGKKGVEKEYDKEM KGQDGVEDVEVDVHGNVIKEIKNVSSITGKNIYLSIDLDLQKYMTQAFAGKSGAFIAMEV KTGKIITFVSYPEISLNLLSSRIPDDQWNELVNSKAKPLVNKGIAGLYPPGSTFKAITGL GILESGISPYDTVMSTGQYKFGKLIFRDSSSRGYGITNFNKSIEHSVNTYYYVFSQRAGK DNIIKYAKEMGVGEKTGIDIPGEQAGVLPTPEWKKKRFKKKQDQIWLPGDLINMSIGQGY VLMTPVQVLSAYQIIANNGVMIKPTVVDRFVSYDGKVGKNEPKILRKIKVSDKNLKLMQN ALRLPVSSYGGTARVLYFPNFPVSAKTGTAQNTGFRDNHSWIAGYFPSDNPQIVFVSIIE GAGYGGVASGQLARTFIEKYRDKYEFKKNILPEQQNQIADNTGKKKKRRKE >gi|261746504|gb|ADAD01000190.1| GENE 14 14986 - 16860 2613 624 aa, chain + ## HITS:1 COG:FN1210 KEGG:ns NR:ns ## COG: FN1210 COG0595 # Protein_GI_number: 19704545 # Func_class: R General function prediction only # Function: Predicted hydrolase of the metallo-beta-lactamase superfamily # Organism: Fusobacterium nucleatum # 62 624 40 608 608 599 55.0 1e-171 MSEKENKKGNEKADHPFKRDYSKNILNKVKSKIKNSGLNREGRLNEGNSLQFFPDKSEGI NLKNENKKKEKEEKMYVIPLGGIEEVGKNMTAFQYKDEIIIVDAGLTFPEDEHLGIDVII PDFSYLESNREKIKGLLLTHGHEDHIGAIPYLYQKLGTEDIPMYGGRLTLELAKAKFERK DAKLPKEKIIKGRSILKISKYFTVEFISVTHSIADCYAICIKTPAATVLHSGDFKVDLTP VDGEGFDFARFAQLGEEGVDLLLSDSTNAQIPGFTLSERTVGESLKEEFAKAKGRIILAA FASHVHRLQQIIYIAEKNNRKIAIDGRSMVKIFEICSNLGYLKIPKGIMIDIDKVETLPA NKVLILCTGTQGEPLAALSRIANGSHKYITLRDGDTVVISASPIPGNEKAAYKNINQLMK RHANVVFEKGIGIHVSGHGCQEEQKLMLNLVKPKFFMPVHGEYVMLKKHKDSAEAVGIPS QNIILAENGMKLELTKSSFKAVGKVPSGVTLIDGFGIGDIGNAVLKDRQNLADDGIVIIS VSQYKTGTFNKQIELVTRGFVYNKDAESLLSETKELIKLELENMETQGIKETGKIKQRLK IKVGEFLNKKTDREPIILPIIMEV >gi|261746504|gb|ADAD01000190.1| GENE 15 16862 - 
17188 300 108 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229884332|ref|ZP_04503793.1| predicted RNA-binding protein containing KH domain, possibly ribosomal protein [Sebaldella termitidis ATCC 33386] # 1 107 1 106 106 120 56 1e-26 MELSSKERAFLKKLAHGIDPVVRIGKDGIDENVIKSIADAVKKRELIKVKILQNSQEEIG REPGTELASKTKSVFVDSIGKIMIFFKPDNKNGKITKEFNEFRKKGKK >gi|261746504|gb|ADAD01000190.1| GENE 16 17185 - 17646 898 153 aa, chain + ## HITS:1 COG:L26878 KEGG:ns NR:ns ## COG: L26878 COG1963 # Protein_GI_number: 15672981 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Lactococcus lactis # 6 146 7 145 147 122 46.0 3e-28 MSGGIIFGNRLLDVAAISCFSAQFYKVFYPLLKKEKIQWVRMFQTGGMPSSHASTVVSLA TSVCLLKGANSIEFAIAMVFSGIVLYDATGVRRQAGKHAKALNTLVDSIEKRDGIEIISE EFKEFLGHTPLEVFWGSILGIVIGLLFRGYIAG >gi|261746504|gb|ADAD01000190.1| GENE 17 17749 - 18486 1219 245 aa, chain + ## HITS:1 COG:lin1403 KEGG:ns NR:ns ## COG: lin1403 COG1189 # Protein_GI_number: 16800471 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase # Organism: Listeria innocua # 2 240 5 242 274 242 52.0 4e-64 MKKRLDLILVERGIFETREKAKREIMAGNIIVNEHAVTKAGTNFKDDEKLLIRVKDRLKY VSRGGLKLEKAIEVWNLDFSDKTVLDIGASTGGFTDCALQNGAKKVYSVDVGTNQLDWKL RNDSRVVSIENTHIKDLKPENLENEKADFTVIDVSFISLTKVIPYLRKFLKDKGKVIMLI KPQFEVGKKKIGKNGIVMEEQYHDEAIKKIISFIKENDYELVGVEESPIKGSKGNKEFLA MIVSN >gi|261746504|gb|ADAD01000190.1| GENE 18 18506 - 19810 1833 434 aa, chain + ## HITS:1 COG:FN1205 KEGG:ns NR:ns ## COG: FN1205 COG0793 # Protein_GI_number: 19704540 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Fusobacterium nucleatum # 27 426 4 420 427 294 44.0 2e-79 MKKNKLLYLILGFILISIPVYCAATIIKSARNDKNSNANYNTDSTELNRIVDVINIIDNR FVGKETPNKDELYKAAVTGVVNRLNDPYSEYLSQEDLKNFSEDIEGEYVGVGMSIQKKKG EALEVTSPFIGSPAEKVGIKIGDKIIKVDDKDILPLTSTDTVKLLKGKEGTKVSVEIVRA GKKEPFKVTMTRAKIKLETVESKMLGNGIGYVSLLRFGNHAGEDVQKAVEGLQKQGMKGL 
ILDLRLNPGGSLQEAQDISSLFLKEDLIVSLKYKDGQEKKYNRTGKYLGDFPLIVLVNKG SASASEIVTGAIKDYKRGTIIGEKTFGKGIVQQVLPLRTGDAVKLTIAQYFTPKGNYIHE KGIEPDIKVPMEELITLKGYTNDSEQARKNREKEIEEILIKEKGAEEAKKIIAAGDVQLK RAIEEMNKKLGTKK >gi|261746504|gb|ADAD01000190.1| GENE 19 19827 - 20528 700 233 aa, chain + ## HITS:1 COG:FN1204 KEGG:ns NR:ns ## COG: FN1204 COG0313 # Protein_GI_number: 19704539 # Func_class: R General function prediction only # Function: Predicted methyltransferases # Organism: Fusobacterium nucleatum # 1 220 1 221 235 289 69.0 4e-78 MFYVVGTPIGNLEDITFRALRILKEVDYIFAEDTRVTKKLLNHYEIEKTVYQYHEHNKFH QIENILNLLKDNKNIALVTDAGTPCISDPGFELVNEILKENIKVSGIPGPSSIITGGSIS GLDMRRIAYEGFLPKKKGRQTLFNKLKEEERMIIILESPNRILKTLKDIKEYLGERYIVI TRELTKIYEEVIRGNVSEIIEKLEKKPVKGEIVLFIRASEDNGIYLKKEVKGE >gi|261746504|gb|ADAD01000190.1| GENE 20 20530 - 21408 1322 292 aa, chain + ## HITS:1 COG:FN1203 KEGG:ns NR:ns ## COG: FN1203 COG1161 # Protein_GI_number: 19704538 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Fusobacterium nucleatum # 3 287 6 287 289 247 51.0 2e-65 MNINWYPGHMKKTKDLIVENLKIIDVVIEILDARIPLSSKNPDISKLAKSKQKIVVLNKV DLVDTKDLKKWEEYFLKNEISDYFVALSVEKGTNFNELRKITDKIYSDKLEKMKKKGLRK TEVRAMIVGIPNVGKSKFINKFVNKNKAKVGNTPGFTRGKQWIKINDKIELLDTPGVLWP KFEDENIAYNLAITGSIKDNVLPIEEVVMKFFDKLKILNKIENLVEVYNLEEYTNKEEIK DTENHKILEMLEKRLGIFKNEEHNYETVARRVLKDYRNGKTGKFFLELPEEF >gi|261746504|gb|ADAD01000190.1| GENE 21 21439 - 22695 1518 418 aa, chain + ## HITS:1 COG:lin2700 KEGG:ns NR:ns ## COG: lin2700 COG0438 # Protein_GI_number: 16801761 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Listeria innocua # 1 384 1 384 427 270 38.0 4e-72 MRVGIFTDTYRPQVNGVVSSILTLEKELRKKGHKVYIITTTDPDAPMVEPNVLRLPSMEF KPLPQYRLGMLYSSRIIKKIKRLELDIIHSQTEWGVGTFARFAAINLEIPLVHTYHTLYE YYTHYITRGHFTVPAKKLAAAISKFYCEKCNALIVPTRKVEDILYSYDVDKNMNIIPTGI ELNKFYRENYTDEEIKFMRESFNIQDSDFLCVYIGRIAKEKSIDVLIDMFSKIKDETFKF MIVGRGPVVDELKNQAENLGISDRVIFAGEVPHDKVPVYYQMGDVFLNASVSETQGLTFV 
EAMAAKTPVNARYDLNLEDLLVKNEAGLVYKNEEEFISNIMLLKQNKKLREKIIENAYNV SQDFTAEKFGERVEAVYKKTIEEYDSRESFTIFRGEKYIQQIRRWTSIKSSGRSPWSK >gi|261746504|gb|ADAD01000190.1| GENE 22 22598 - 23614 1180 338 aa, chain + ## HITS:1 COG:L189090 KEGG:ns NR:ns ## COG: L189090 COG0438 # Protein_GI_number: 15674119 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Lactococcus lactis # 1 336 1 331 332 230 40.0 3e-60 MKVLLFSEGKNTFSKSGVGQALNHQVEALGANNVEYTLNPDEEYDIVHINTLGLKSWKML KKAKKMKKPVIYHTHTTYEDFKGSIKWSDQLAPVIRYWAKKSYNGADYLISPTEYTKKLV SEKYLKNEKEIRVISNGVNTEKISKNEELKKKFLEKYKIFKPLVITAGLPFERKGIKEFV KLAERCSDYQFLWFGSSSIKPMLPKKIQKIIDNPPKNLSFPGFVEKEELIGAFSAAKLFL FMTYEENEGIVVLESLSSKLPLVVRDIPVYEDWLQDGVNCFKARNNDEFYKKMVSIVNDK VKNIDDIIENAYKIAEERDLKNIGKKYKEYYEYILDKK >gi|261746504|gb|ADAD01000190.1| GENE 23 23714 - 24730 635 338 aa, chain - ## HITS:1 COG:FN0386 KEGG:ns NR:ns ## COG: FN0386 COG2342 # Protein_GI_number: 19703728 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted extracellular endo alpha-1,4 polygalactosaminidase or related polysaccharide hydrolase # Organism: Fusobacterium nucleatum # 90 331 1 242 254 292 62.0 5e-79 MKKQLLLIFIFFISGFSKEDPLKLSSLKESENKIFSNNERNTSNEVYRNRMKDFIKEIRN NTSKNRIIITQNGNGLYFNNGKLDKNFFNITDGTTQESLYYGDILKYNTPTQKKSRQELL NLLIPLRNSGKPVFVINYAKGKSKKDFLIKEDMKTKFISEMLLSFGAKDLYESIHGYNTK NIDSLNQVKNFLCLLNPEKFKDINQYFEFLKNTDFDLLLIEPSHNGVFFTKEQIKQLKTK KSGGKRLVVAYFSIGEAEDYRNYWDTSWNKKNPEWIVAENTDWKGNYIVKYWHPKWKKII KNYQKNLDEIGVDGYLLDTVDSYYYFEDKAEQNRKISD >gi|261746504|gb|ADAD01000190.1| GENE 24 24877 - 25479 727 200 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039457|ref|ZP_06012761.1| ## NR: gi|262039457|ref|ZP_06012761.1| hypothetical protein HMPREF0554_0640 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0640 [Leptotrichia goodfellowii F0264] # 1 200 1 200 200 310 100.0 4e-83 MKKLILFVFFIFSITAFSGTDKEAEEKIRKEMEKVINYHKTGGKEEREKLLLVNRLDRDS VSYFIMNKIGSVMSEGHKKSEYKINNIEIKGNEAVIDIDYKGPDLSQGVEKFQKDFEKSP 
EIVEKIKKMSKNDSKKYIAEEFEKYMIKDLKSGSFKYYNKKRMKLKIKKIGNWDNLNLDR EFMNMLSLGIMDKFHSIAGL >gi|261746504|gb|ADAD01000190.1| GENE 25 26166 - 26927 965 253 aa, chain + ## HITS:1 COG:TM0742 KEGG:ns NR:ns ## COG: TM0742 COG0639 # Protein_GI_number: 15643505 # Func_class: T Signal transduction mechanisms # Function: Diadenosine tetraphosphatase and related serine/threonine protein phosphatases # Organism: Thermotoga maritima # 36 230 3 182 209 97 33.0 3e-20 MTDKIKKIFGMFKKSEKSKNIIRIEKIHEKDFERIFVTSDIHGYYSLFLKLLDKIQLTKK DLLIIMGDSCDRGPQSYELYKKYMELSEQGYNIKHILGNHEDMLYKAVQSGDDAHWYRNG GEKTDISFSENLGITLEEWKEKDGMKNLEWFVNWIEQLPLIIEGDKNIFVHAAYDTTKSI DEQEHRFLVWSRDDFWTNNKTGKAIYFGHTPSKDGKIRYYVNDVCCIDTGSYNTKVLGCI NLNSKEEIYVKED >gi|261746504|gb|ADAD01000190.1| GENE 26 27166 - 27915 639 249 aa, chain + ## HITS:1 COG:FN1600 KEGG:ns NR:ns ## COG: FN1600 COG0101 # Protein_GI_number: 19704921 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Fusobacterium nucleatum # 1 249 2 247 247 190 45.0 2e-48 MTKEKNIKMVYQYDGSKFYGFQRQKNKKTVQGEIEKVILKNFSQKINMISSGRTDKGVHA LGQVSNFFIDEKIPLEAIKRQINKNLYGEIKILSVEEIKREFNSRFDAKSRTYLYIMKKE EEITPFESNYITGIKNDINIEQFQKIMEIFKGKYDFSSFMKKDKAIRNTMREIYNIKCEY EEREKKIYVEICGSSFLKTMIRIMIGSAMAIYFGKEREDYIKMRLENPDADKPKILAPSE GLYLYRVDY >gi|261746504|gb|ADAD01000190.1| GENE 27 27936 - 28409 624 157 aa, chain + ## HITS:1 COG:FN2046 KEGG:ns NR:ns ## COG: FN2046 COG0456 # Protein_GI_number: 19705336 # Func_class: R General function prediction only # Function: Acetyltransferases # Organism: Fusobacterium nucleatum # 12 152 8 145 149 67 35.0 8e-12 MGEFFIRELDSEKDSELLFQIVKHEDEVFQKASIGNFNIKPFAKYGKVFAMLHKEDKMEE PVSIIEVLRSFNGEKGYLYGVSAVPKYEKQGHTRKLLEYVINYLSENSINIIELTVGVNN ERAIKMYKKSGFKIEKVLTNEYKDGEKRYLMKYENKG >gi|261746504|gb|ADAD01000190.1| GENE 28 28562 - 28996 504 144 aa, chain + ## HITS:1 COG:FN2045 KEGG:ns NR:ns ## COG: FN2045 COG0735 # Protein_GI_number: 19705335 # Func_class: P Inorganic ion transport and metabolism # Function: 
Fe2+/Zn2+ uptake regulation proteins # Organism: Fusobacterium nucleatum # 1 130 5 134 142 145 54.0 3e-35 MHIENVGEYLKENGIKPSIQRIKIFEFLLEHHIHPTVDDIFQTLSAEIPTLSKTTVYNTL NLFVGNHIVQEVIIEENEVRYDVVTGVHGHFKCKVCGEIKDFDVDLSKLDLSKLGDVEIE ETHFYLKGTCAEWLKNRNTTVVKN >gi|261746504|gb|ADAD01000190.1| GENE 29 29038 - 29541 792 167 aa, chain + ## HITS:1 COG:TM1128 KEGG:ns NR:ns ## COG: TM1128 COG1528 # Protein_GI_number: 15643885 # Func_class: P Inorganic ion transport and metabolism # Function: Ferritin-like protein # Organism: Thermotoga maritima # 1 164 2 163 164 125 43.0 3e-29 MKLTKKTEKLLNDQVNMELGAAYQYQAMAAYFESLGLEGFAKWMDNQAKEEIEHSRKFYD YLLSRNGKVVLEALGKPKGNFKTVKEVFEDSLKHEQSVTKSIEAIYENVRKEKDYGAEVF LNWFVSEQEEEEETVQKIIDKITLLNVDKDSVALYMFDKEMGERTEE >gi|261746504|gb|ADAD01000190.1| GENE 30 29824 - 31647 2581 607 aa, chain + ## HITS:1 COG:FN1320 KEGG:ns NR:ns ## COG: FN1320 COG0760 # Protein_GI_number: 19704655 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Parvulin-like peptidyl-prolyl isomerase # Organism: Fusobacterium nucleatum # 248 607 5 354 356 104 28.0 6e-22 MSVRKHKNIIKGISIALIIAFALSMVYAGWNFLKNNVFIERKKVIAEVNGEKIYADDMEK SYADLSSRLDSIVAQRKQQIAQLGGNPDNFKSLPEELLREYLLKGLIDQKLLLSSAKDLK VKVSSADIDKIVENDQKQAGGKENFIRLLGTNGYNLTTYKEFIRDKMLLEKVAEKIESSS KISDEELKKAYERYKYFNFQDQTFEEAKPQLIETLNGDNEQMLISSYLVKAWKNAKIKIK NDDVKKIDYKKMYENITKTIVEKDGYKFEGGSLNEKIITMFVNSENGYSEALREEAKKSL ESDLNKLIEIAKKAKAAGIKASPEAAGIQELNDYAKKYYSYLIDTYKPTEQAMLERFNSK RETYNIQNTIAGQVVGDYFQPSKADFDEVKKKAEELIKTVNVENFGQKAKELSQDPGSKD NGGQLGVIDLSGMVPEFAEAVKKAEKGKIVGPVKTQFGYHIIYVEDKDSSNSDKAKVSHI LLTPSVSEATKQELIKKMKTLKDELTAKKVTWQQVNTQDKYKFEIKEQFKKLTKSQPIPG VGKYDSELSNKLFASKVGDIIEHQTEYGYFLLSKISEVPFKEVTFDSVKERIRLELAFEH VNEEIEK >gi|261746504|gb|ADAD01000190.1| GENE 31 31740 - 33044 2285 434 aa, chain + ## HITS:1 COG:FN1764 KEGG:ns NR:ns ## COG: FN1764 COG0148 # Protein_GI_number: 19705083 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Fusobacterium nucleatum # 1 
434 1 434 434 616 76.0 1e-176 MTRIEDIYAREILDSRGNPTVEVEVFLEGGAMGRASVPSGASTGEHEAVELRDGDKSRYL GKGVLKAVENVNTVLAENLIGMDALDQVAIDKAMIELDGTPNKGKLGANAILGVSLAVAK AAANQLGIPLYRYLGGVNAKELPVPMMNILNGGSHADSAVDVQEFMVQPVGAKTYKEALR MGAEIFHHLGKLLKANGDSTNVGNEGGYAPSNINGTEGALDIISKAVEAAGYKLGEEITF AMDAASSEFATKEGDKYVYTFKREGGVVRTSEEMVEWYAGLCAKYPIISIEDGLAEDDWA GFKLLTEKLGKKVQLVGDDLFVTNTERLSRGIKEGIANSILIKVNQIGTLTETLDAIEMA KKAGYTAVVSHRSGETEDDTIADIAVATNAGQIKTGSASRTDRMAKYNQLLRIEDDLAEE AVYEGKAAFYNIYK >gi|261746504|gb|ADAD01000190.1| GENE 32 33220 - 33735 886 171 aa, chain + ## HITS:1 COG:BS_ytgI KEGG:ns NR:ns ## COG: BS_ytgI COG2077 # Protein_GI_number: 16080001 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Bacillus subtilis # 8 171 4 166 167 192 58.0 3e-49 MTERKNVVTFKGTPITLLGEEVKVGDKAKDFTVLANDLKEVKLSDYKGKVVIISVFPSVD TGVCALQTTRFNQEAAKFPEDIQLLTVSADLPFALGRFCADKGIENALTTSDHKELDFGL KYGFVIKELRLLARGTVIVDKEGNVKYVEYVSEIGEHPDYDKALEVAKSLV >gi|261746504|gb|ADAD01000190.1| GENE 33 33804 - 34943 1541 379 aa, chain - ## HITS:1 COG:BMEI0783 KEGG:ns NR:ns ## COG: BMEI0783 COG0265 # Protein_GI_number: 17987066 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Brucella melitensis # 52 378 42 373 474 248 40.0 1e-65 MKKNKAFIILLLSIILSCSKPAEQKKDVSQNNSQNNTVSQKEVEKSNKNALETQGAFINV YKEAKDSIVNIRTKKTITVNTYNPLEELLFGSSGGQEKKESGSLGSGFVVSEDGYIVTNN HVVNNADEIYVKFTDGREYLTKLVGTSPEVDIAILKIESSEKFKPLEFADSDKIQIGQWS IAFGNPLGLNDSMTVGIISAAGRSSLGIEEIENFIQTDAAINQGNSGGPLVDINGKVIGV NTAILSTSGGSVGLGFAIPSNLVAVVKDSIISTGKFERPYVGLYLDNLDSQKVKNLNIKS GNGVYIAQVVPGSPAEKAGLKANDVIIGVNDKPINSAGSFIGELAAKKIGQTVNLQIIRN SQTMNVNVTLESSPKIQQR >gi|261746504|gb|ADAD01000190.1| GENE 34 35110 - 35188 75 26 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIKLILLDVDGTLTDGGIYRGNKGEE Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:53:50 2011 Seq name: 
gi|261746501|gb|ADAD01000191.1| Leptotrichia goodfellowii F0264 contig00219, whole genome shotgun sequence Length of sequence - 4009 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 1974 1800 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family - Prom 2042 - 2101 8.4 + Prom 1882 - 1941 9.3 2 2 Tu 1 . + CDS 2093 - 3064 753 ## COG0582 Integrase + Term 3251 - 3287 -0.3 Predicted protein(s) >gi|261746501|gb|ADAD01000191.1| GENE 1 3 - 1974 1800 657 aa, chain - ## HITS:1 COG:FN0414 KEGG:ns NR:ns ## COG: FN0414 COG0553 # Protein_GI_number: 19703756 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Fusobacterium nucleatum # 55 655 4 620 1014 347 38.0 4e-95 MSEELINKINEILENKKDAVINIVNDRLTISVFTLLEKNLKNVKEINFIIRGMELIPKNN EISHEFEMTPNDVLYNSYEIKEKNKLNHFAKAKAMHDFIEKNVNIRKVNSGIKVNGNLLT IDNDFMIQGSSSLEILKKNNVFNFDTYISEMMDKSQIEKMLSNYKTLWNSNQHTVDYKKE LLESLEFVYKEHSPEFLYYFTLNELFGHQLDNGIEYFEKNSDKFKRTEIWNSLFDFQKDC VVSAINKLQKYNGCIIADSVGLGKTYEALAVIKYFELRNDNVLVLTPKKLYENWNSFTGN YTDGNLDEIFNYKIMYHTDLSRYKGESKTGQDLSRFNWSNFDLIVIDESHNFRNRNDRYD DENKLIMNRYSRLLQEVIKKGRNTKVLLLSATPVNNSLVDLKNQLSIITGDRDFAFEESG INSVNYTLKKSSEVINSWEKENNHKKEELLDKLPSDFYKLLEMMTISRSRKHITNYYGNS DVGKFPNKNKPKTFYTKIDKNEKFLDFKTTNESLEELTLAVYTPIRYIKSEYIKSYTEKY DWGRMNFTTQSKGTVMLHRFNLFKRLESSVFSFSETLKRLLNRIDSTIKILEKDAVIKEN MEMEEDEVQEEIYIENKYEINTKHLRKEHFIEDLINDRNVIQMIYDETKKVLDEKRD >gi|261746501|gb|ADAD01000191.1| GENE 2 2093 - 3064 753 323 aa, chain + ## HITS:1 COG:L48477 KEGG:ns NR:ns ## COG: L48477 COG0582 # Protein_GI_number: 15672029 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Lactococcus lactis # 12 314 69 386 394 93 25.0 5e-19 MATKTKSKDIKFEKLADLWLKRERNFLKISTYSTYKNLYETHIKPVFGNKFIGNISSESL QQFIFDKLKNGRLDGNGGLSRKTVKDIMTIIKISINYAIRENIIKEKSLKYKIPKTDKIN 
EINTFNQEEQSLLFKYIASNLTSKSVGILLAMSIGLRIGELCGLKWKDIDLQNEILIVNK TLQRIYFRVKKGGKSEIIISVPKSKNSYRIIPLSRNLVNFLKILQIDDKSNYFLSNNEKY IEPKTYRRYYYKILEKLKIRKLKFHSLRHTFATTAIESGIDYKTVSEILGHASVNTTLEL YTHPKIEHKKKCIELIFQNFEKK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:54:03 2011 Seq name: gi|261746456|gb|ADAD01000192.1| Leptotrichia goodfellowii F0264 contig00013, whole genome shotgun sequence Length of sequence - 43484 bp Number of predicted genes - 46, with homology - 42 Number of transcription units - 15, operones - 9 average op.length - 4.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 103 94 ## - Prom 157 - 216 8.8 2 2 Op 1 44/0.000 - CDS 237 - 1040 571 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 3 2 Op 2 5/0.000 - CDS 1033 - 1809 409 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 4 2 Op 3 5/0.000 - CDS 1819 - 3384 2488 ## COG0747 ABC-type dipeptide transport system, periplasmic component 5 2 Op 4 49/0.000 - CDS 3416 - 4240 1192 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 6 2 Op 5 . - CDS 4243 - 5181 1176 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components - Prom 5297 - 5356 9.3 + Prom 6085 - 6144 8.7 7 3 Op 1 . + CDS 6170 - 7531 1627 ## COG0593 ATPase involved in DNA replication initiation 8 3 Op 2 9/0.000 + CDS 7560 - 7781 295 ## COG2501 Uncharacterized conserved protein 9 3 Op 3 . + CDS 7813 - 8907 1295 ## COG1195 Recombinational DNA repair ATPase (RecF pathway) 10 3 Op 4 . + CDS 8897 - 9883 1045 ## Lebu_0005 hypothetical protein + Prom 9893 - 9952 10.1 11 4 Op 1 . + CDS 9973 - 10239 325 ## Sterm_0005 hypothetical protein 12 4 Op 2 . + CDS 10251 - 11267 1254 ## COG0457 FOG: TPR repeat 13 4 Op 3 . 
+ CDS 11282 - 12106 1094 ## Lebu_0008 hypothetical protein 14 4 Op 4 24/0.000 + CDS 12137 - 14119 3301 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit + Prom 14189 - 14248 3.7 15 4 Op 5 . + CDS 14476 - 17037 4138 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit + Term 17045 - 17089 4.0 + Prom 17070 - 17129 4.1 16 5 Op 1 . + CDS 17155 - 17910 1119 ## Lebu_2298 hypothetical protein 17 5 Op 2 . + CDS 17926 - 18375 663 ## COG0622 Predicted phosphoesterase 18 6 Tu 1 . + CDS 18433 - 19314 1111 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 19320 - 19361 -0.6 - Term 19300 - 19357 5.1 19 7 Op 1 . - CDS 19470 - 19781 362 ## gi|262039496|ref|ZP_06012798.1| peptidase propeptide/YPEB domain protein 20 7 Op 2 . - CDS 19836 - 20195 318 ## gi|262039486|ref|ZP_06012788.1| putative threonine rich protein 21 7 Op 3 . - CDS 20116 - 20313 268 ## - Prom 20337 - 20396 6.3 22 8 Op 1 40/0.000 - CDS 20404 - 21729 1298 ## COG0642 Signal transduction histidine kinase - Term 21749 - 21785 3.1 23 8 Op 2 . - CDS 21794 - 22471 743 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 24 8 Op 3 . - CDS 22492 - 23565 1037 ## COG1619 Uncharacterized proteins, homologs of microcin C7 resistance protein MccF - Prom 23606 - 23665 14.5 + Prom 23703 - 23762 12.6 25 9 Tu 1 . + CDS 23789 - 24037 170 ## Lebu_0133 hypothetical protein + Term 24039 - 24083 1.6 26 10 Op 1 . - CDS 24200 - 24571 450 ## lmo1117 hypothetical protein 27 10 Op 2 . - CDS 24577 - 25026 393 ## BBR47_09460 hypothetical protein - Prom 25061 - 25120 10.4 + Prom 25081 - 25140 7.4 28 11 Tu 1 . + CDS 25160 - 25843 664 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 26006 - 26060 -0.0 29 12 Tu 1 . - CDS 25952 - 26809 882 ## Lebu_2291 hypothetical protein - Prom 26852 - 26911 13.6 + Prom 26839 - 26898 8.3 30 13 Op 1 . 
+ CDS 26924 - 27592 725 ## COG2094 3-methyladenine DNA glycosylase 31 13 Op 2 . + CDS 27637 - 27744 58 ## 32 13 Op 3 . + CDS 27834 - 28733 1360 ## Lebu_0266 hypothetical protein 33 13 Op 4 . + CDS 28770 - 29180 537 ## Smon_1263 hypothetical protein 34 13 Op 5 . + CDS 29208 - 30683 1838 ## Lebu_0265 hypothetical protein 35 13 Op 6 . + CDS 30694 - 30939 323 ## COG1188 Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) + Prom 31029 - 31088 9.8 36 14 Tu 1 . + CDS 31122 - 31397 504 ## COG2088 Uncharacterized protein, involved in the regulation of septum location + Term 31439 - 31488 8.1 + Prom 31448 - 31507 10.6 37 15 Op 1 . + CDS 31551 - 31643 65 ## 38 15 Op 2 . + CDS 31647 - 32213 701 ## COG0193 Peptidyl-tRNA hydrolase 39 15 Op 3 . + CDS 32226 - 33107 273 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit 40 15 Op 4 . + CDS 33135 - 35885 3260 ## COG0574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase 41 15 Op 5 . + CDS 35887 - 37011 1083 ## COG5379 S-adenosylmethionine:diacylglycerol 3-amino-3-carboxypropyl transferase 42 15 Op 6 . + CDS 36998 - 38329 1171 ## Sterm_1951 phosphatidate cytidylyltransferase 43 15 Op 7 . + CDS 38358 - 39380 1134 ## Sterm_1952 GCN5-related N-acetyltransferase 44 15 Op 8 . + CDS 39421 - 41025 1838 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II 45 15 Op 9 . + CDS 41049 - 42071 1265 ## Sterm_1948 UbiA prenyltransferase 46 15 Op 10 . 
+ CDS 42135 - 43142 1682 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase + Term 43173 - 43232 12.0 Predicted protein(s) >gi|261746456|gb|ADAD01000192.1| GENE 1 1 - 103 94 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTKRTYQPNKRKRKKDHGFRSRMKTKSGRKVLKR >gi|261746456|gb|ADAD01000192.1| GENE 2 237 - 1040 571 267 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 26 256 35 265 329 224 46 7e-58 MYKENDIILETKNLTKIYEKSDNKKVIACNNVNLQIHKGKTLGIVGESGSGKSTLVNMLM DLEKPTSGEILYHGKDISKFTKEEVWLNRQNIQIVFQDPWSAFNPKMNVMQILTEPLMNY NRLKKSERKEKAVELLKMVDLPEEFVTKYPQNMSGGQRQRLGIARAISLEPEILICDEAT SALDVSIQKTVVELLVRLQKEKGITMIFICHDIALIESFAHQIAVMYHGDVVEFIEGGQI SEKAKHPYTKALLNAIFPVHGKKETAE >gi|261746456|gb|ADAD01000192.1| GENE 3 1033 - 1809 409 258 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 3 245 13 258 563 162 33 3e-85 MLEIKIQYEDKDPVVENFSMSLKKGQIITIVGESGSEKSTVLSSILGLLPNDGKVVSGDL IYNGKSLLNKDINEWRKLRGTEITMISQDSGGTLNPIRKIGKQFVEYIQTHSNLSTKEAE EKSKSLFSKVNLPDPDIIMKSYPHQLSGGMKQRVGIAMALTFEPKIILADEPTSALDVIT QEQIVKEIMNLKEISDTSIIMVTHNLGVAAYISDWIIVMQKGKVVNEGTAEDVINNPKSD YTKQLLKAVPEIGGERLV >gi|261746456|gb|ADAD01000192.1| GENE 4 1819 - 3384 2488 521 aa, chain - ## HITS:1 COG:FN1523 KEGG:ns NR:ns ## COG: FN1523 COG0747 # Protein_GI_number: 19704855 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 6 519 6 523 526 704 68.0 0 MLKSQKSKFLILIALLLTLLVACGGGDKGNSGASGASNELVVGVTSFEDTLEPTDQYFSW VVTRYGIGENLTKFDEKGNLQPLLAESWQLSDDKLEWTFKIRDNVTFSNGNPLTAEAVKK SIERVFEKNKRAESFFTYTAMEANGQNLKIKTKEPVAILPAALADPLFLIVDTTANTEEF AQKGPITTGPFVVQEFKPGGQTVVVRNEKYWDGKPKLAKVTFKDINDQNTRTLALKAGEI DVAYNLKVGNKSDFEGDKNIVINELKSLRTTYAFMNQTKGLKDKALRQAIIRGADRENYT QSLLQGGATPGKAPVPPTLDYGFNELKDENAYNPESAKKILADAGYKDKDGDGFLERPDG 
SKLDLTFVIYTSREELGVYAQALQANMKDIGIKVTLKPVSYETTLSMRDASNFDLLIWNV LAANTGDPEKYLNENWYSKAKTNQVGYSNPEVDKLLTDLSKEFDAAKRKDMIVKIQQLIM NDAATLFFGYETTFLYSNKSVEGLKMYPMDYYWITKDVSKK >gi|261746456|gb|ADAD01000192.1| GENE 5 3416 - 4240 1192 274 aa, chain - ## HITS:1 COG:FN1522 KEGG:ns NR:ns ## COG: FN1522 COG1173 # Protein_GI_number: 19704854 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 274 1 274 276 347 67.0 2e-95 MDKIKFMKKNTQLTIFLTLAIIVLLIAIFAPAIVPKDPFLTVRNNNLKPLSAAHIFGTDS LGRDLFSRVIYGSRYSIFMTLTLVFVIFLIGTFLGIIAGYFGGITDTIIMRIGDMMIAFP GLILAIAIAGLLGPSVINSIIVISAVTWTKYARLSRSMVLKIKQELYVEAAKVTRSRDYS ILLKYIMPNMITTMIVTAVSDLGTLMLEIAALSFLGFGAQPPIPEWGAMLNEGRTYLSRA PWLMIYPGFAIVTVVIIFNMLGDSVRDIVDIKSE >gi|261746456|gb|ADAD01000192.1| GENE 6 4243 - 5181 1176 312 aa, chain - ## HITS:1 COG:FN1521 KEGG:ns NR:ns ## COG: FN1521 COG0601 # Protein_GI_number: 19704853 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 309 1 309 312 392 70.0 1e-109 MKNKKLLKYIIQILIVLFGISFFTFCLVYISPGDPAETMLTECGNIPTPELLAQTRAELG LDKPFYIQYGRWLFSVLRGDFRKSYSLRIPVIQKIASAFMPTLSLAFLALFLMCIISIPL GILAAIKENKWQDYLVRMMTFMGMSVPSFWLGLIFLSVFGVTLKLVSVSGGKADFRSIIL PAFTIAIAMSAKYTRQIRHIFLEELNKSYVTGARMRGIKENVILWRHVLPNAMLPIITLL GLSLGSLLGGTAVVEIIYNWPGMGSMAVKAITFSDYSLIQAYVLIIAFLYLIVNITVDIS YKYLDARVEEVI >gi|261746456|gb|ADAD01000192.1| GENE 7 6170 - 7531 1627 453 aa, chain + ## HITS:1 COG:CAC0001 KEGG:ns NR:ns ## COG: CAC0001 COG0593 # Protein_GI_number: 15893299 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication initiation # Organism: Clostridium acetobutylicum # 1 450 1 442 446 287 38.0 3e-77 MDVNASNLWDKILKIFKNNTSMETEYNAFFKNVKADDFSNNILSLNCNSPLMKEKMEKYK 
NEIEETLNDIFILSNEKIHIVFNIKKEEEPGEISYRIKEYRSTNQNSSMKTGLNVRNRLD NFIVGDNSRMAYNACLAVLENEAPVYNPLFIYGGSGLGKTHLMQAVGNAILERNPEKRVL YTTTEEFSNEFIAAIKEGRIKNFRDTFRNLDVLLLDDIQFFERIFGRGMGDTEEEFFHTF NKLQESGKQIIMISDRYPQDIKNLSKRLESRFISGLSTEILEPGYETRKAILENIVEIKN IEIDDNILEYIAESVSSNVRELEGILTLINARAKLLNEKITLQQVQDELSTRMRSQQSKI TAEKIIEIVSQEYSIPVSEMKARKKKQEIVDARQTAMFLIKNILDLNLTTIGGLFGGKDH STVISSIRKIEGKIEENIAFKKELDRIKQKIVK >gi|261746456|gb|ADAD01000192.1| GENE 8 7560 - 7781 295 73 aa, chain + ## HITS:1 COG:CAC0003 KEGG:ns NR:ns ## COG: CAC0003 COG2501 # Protein_GI_number: 15893301 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 6 72 1 67 68 73 61.0 1e-13 MKKEEIEEIEINTEFIKLDQFLKWTNFVISGAEAKLFIQEGQVKVNGDTETRRGKKLYSG DIVEFKGEKVKIK >gi|261746456|gb|ADAD01000192.1| GENE 9 7813 - 8907 1295 364 aa, chain + ## HITS:1 COG:FN2128 KEGG:ns NR:ns ## COG: FN2128 COG1195 # Protein_GI_number: 19705418 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair ATPase (RecF pathway) # Organism: Fusobacterium nucleatum # 1 363 1 364 369 258 46.0 9e-69 MYLKQLSYSNFRCLEDTKTELDRNFNLIYGKNGQGKTSFIEAVHFLATGKSFRTKKTKEL FRYNKNRVIVFGKYINKNEEENILAIDVNEEKKDFYINRNKNKYIDYVGLLNIISFIPED IEIIVGNPSIRRNFFNYEISQAKKDYLKSIVDFEKILKTRNKLIKEKKTREEIYSIYNEK FMEEGTNIIIHRREFIKNISILLNLNYRKLFDPKSELKLKYDCFLGDIDKKTKEEIKEKF SENIKRKAEREKILGYSLTGPQKDDFIFELNGKNAKSFSSQGEKKSIIFSLKVSEIDMLV KEKNEYPLFIMDDIASYFDEVRKKSILDYFINKKIQCFITSTEDLNIKGKKFIIEKGKVI TDEK >gi|261746456|gb|ADAD01000192.1| GENE 10 8897 - 9883 1045 328 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0005 NR:ns ## KEGG: Lebu_0005 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 5 326 4 340 347 221 41.0 4e-56 MKSNIKSTENIVSDITKTKKSLFNNENYILWKIKKNWINITGEEIGVKSFPKYLYNKKMT LNVEDSVIHHSILVHTGVIIEKINDFIQKEAVNQLEIRKIEKKPRRNIVKNLTEDNEKNI ENSQEKPNEKTDFEENIELSAFETEKIKKSISKIDKKYKDIGEKLEKIALNRKKKDIYLL 
SKGYIRCKNCGDIFYPSGKNEICPYCCEKEENEKLEKMSAIITGNPFIGENEAVKISGTD RYTYYKVRDILAQRAYNDLLYFYITKNIEIKYSEDYESEIRKEANTDFEIYVKNYIDYKV GTDNKEIYNIERKKVISGLRKRKEYQKR >gi|261746456|gb|ADAD01000192.1| GENE 11 9973 - 10239 325 88 aa, chain + ## HITS:1 COG:no KEGG:Sterm_0005 NR:ns ## KEGG: Sterm_0005 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 1 84 1 82 83 67 53.0 1e-10 MYLYIENNIFINLKEIELLMDYKDFISNGNNKKIMEKGKRKILDLTENEKKRRTLIFTEK FIYISSYTNRALKMRADEYDKLVNSIVF >gi|261746456|gb|ADAD01000192.1| GENE 12 10251 - 11267 1254 338 aa, chain + ## HITS:1 COG:BB0195 KEGG:ns NR:ns ## COG: BB0195 COG0457 # Protein_GI_number: 15594540 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Borrelia burgdorferi # 23 172 216 365 379 64 30.0 3e-10 MKIQNLLEEAVKSYEEMKYDDTVLYLETVLEIDKNNYEALILLTRVYTGSGLFKEALQYC ERIYKNYPEDSLVLFNMGYINQSLGKPKKAIFYYDKYLKYEEDYHVLLNIGLSYMDMKYY KKAMSVIEKAIKMEPENSDGYLDKAECFTKQGKYNDAIKIYEERLKNPENNIEEYYIYTK IADVKERAGDIEEALKNYNIAINCENVDELVYEAFYEFLLRADRKDEIELMLINYANSPI PRERSLNLEGRYAAYIEDFERARKVCEKLLILNPDNPLHYFNSAYIWEMLKDFDKALDFI KKVEKKVDDKELIKNARKRIMKSRREYMKSLNKAENKR >gi|261746456|gb|ADAD01000192.1| GENE 13 11282 - 12106 1094 274 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0008 NR:ns ## KEGG: Lebu_0008 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 2 273 1 304 304 209 50.0 1e-52 MIKEENYKIKEGYDKEILSDVLKNFDKTGSFVVQGARNQIKKFVIINNGKEKEINIKRFG RKNILTELIYKFFRASKAKRSYEFGNRLLQKNIKTPEPIAYFDEYTDEKTGEKRSFYISE ELKYEFTCREVFWDEKTSTEIDELILKDQDKIIREFAEFTFDLHEKGIKFEDYSPGNVLI KREKDGKYGFYLVDLNRMSFEKNLDFNSRMKNVSRMMEFKKYAEKFSEEYAKLYKKPYEE VFKKLYYYITIHKYRVLFKDNTRFLREIFKSKRK >gi|261746456|gb|ADAD01000192.1| GENE 14 12137 - 14119 3301 660 aa, chain + ## HITS:1 COG:FN2126 KEGG:ns NR:ns ## COG: FN2126 COG0187 # Protein_GI_number: 19705416 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # 
Organism: Fusobacterium nucleatum # 4 660 6 639 639 811 65.0 0 MANNYGAENIKVLEGLEAVRKRPGMYIGSTSSRGLHHLVWEIVDNSVDEALAGVCDNIIV KILKDNVIEVSDNGRGIPFATHETGKSTLEVVMTVLHAGGKFDNDNYKVSGGLHGVGVSV VNALSEWLEVTVTRDGQIVRQTYKRGEPVSDVEKIGEASPEAHGTTTKFKADPEIFETTV YEFSVLESRLKELAYLNKGLTITLIDERDQEKIKEEKFLFEGGIIDFLNEIADEEKITDE VIYMSDTYEVEAAKEVEIVDDNGNTVKKMRAAKFVEVEIAMSYTISQRENVYSFVNNINT HEGGTHVSGFRTALTRTINDIAKQMNIVKDKDGTFQGTDVREGLVCVISVKIPEPQFEGQ TKTKLGNSEVTGIVSNIVGNNLKFYLEDHPKEAEKIIEKMTMSKRAREAAKKARELVLRK NTLEVGSLPGKLADCSSKDPSESEIFIVEGNSAGGSAKQGRDRRFQAILPLRGKILNVEK SGMHKLLENAEIRAMITAFGAGFGDDMDLEKLRYHKIIIMTDADVDGAHIRTLMLTFFYR QLRELINEGYIYIAQPPLYKVQAGKAIKYAYSDEQMKKITSVLERDNRRYTIQRYKGLGE MNPEQLWETTLDPEVRTLLKVTMEDASYADKMFNILMGDKVEPRRQFIEENANYVRNLDV >gi|261746456|gb|ADAD01000192.1| GENE 15 14476 - 17037 4138 853 aa, chain + ## HITS:1 COG:FN2125 KEGG:ns NR:ns ## COG: FN2125 COG0188 # Protein_GI_number: 19705415 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Fusobacterium nucleatum # 36 849 1 810 811 957 64.0 0 MSDDIRDDEEKQEDIIEETNDDDSDIVVEEIKETNMTNEVNIYIEDEIKASYLDYSMSVI VSRALPDVRDGLKPVHRRILFAMNEMGMTHDKPFKKSARIVGEVLGKYHPHGDSSVYNAM VRLAQDFNMRYLLVDGHGNFGSVDGDEAAAMRYTEARMAKITAELLADIDKNTIDFRKNF DESLDEPTVLPAKLPNLLLNGSTGIAVGMATNIPPHNLSEICDGIVALIDNREISVDELI NYIKGPDFPTGGIINGKQGIYEAYRTGRGKIKVAGKVKIETSKTGKESIIVTELPYQVNK ARLIEKIAELVRHKKLTGISDLRDESDREGIRIVIELKKGEESELILNSLYKFTDLQNTF GIIMLALVNNAPRVLNLKQILENYLEHRYNVITRRVQFELNKAENRAHILEGFKIALDNI DEVIRIIRGSKDANEAREKLIASFGFSEIQAKAILDMRLQRLTGLERDKIEQEYRELMLL IEELRSILADDSKKYRIIKDEVIKLKEDFGDKRRTEIKEARVEIGIEDLIKDEDVVVTLT EKGYVKRTSIDTYHSQRRGGIGVNATNTVEDDVIKDMYIAKNLDTLLIFTTKGKVFSMKV YEIPEAGKQARGKLISNLIKLSEDERVSTVIRVREFEKDKAIFFITKDGVVKKTDLTLFA NINKTGIRALTLRDEDELKFAGLTSGSEKDEIFIATRNGISIRFCEEDVRVMGRTAAGVK GITLRDKDYVVAAVIINPEKISEELSVMTITEEGYGKRTALSEYKVQSRGGKGIINLKIN EKTGKIVDVKIVDDKTEIMLITSEGTLIRTKVDTVSVIGRATSGVRIMKVRNEEKVASAV KIAENPEEEKELS 
>gi|261746456|gb|ADAD01000192.1| GENE 16 17155 - 17910 1119 251 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2298 NR:ns ## KEGG: Lebu_2298 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 251 1 253 253 284 64.0 2e-75 MLMKKALFLVSSESEIETLKEFGKKFREKYNVETDALYVKDVLKYEIFPVTIEGIGVNIG SNYAFKEYMELEERNFNNVKEKLGSEFSKVYSEDGETIETALNELKKYDLIVVVKNEKVS PYLKELLRSNFKPLIILPNITEFSFDKILLLDDGAYNANKTLYTFFYMFDEQKVDVLRVN VDSKDYLKERFGDNCNLVEKEGDPFKTIMEESEKYDFILMGDLRYTIMVERITRKLGVKL LENLKKPIFIV >gi|261746456|gb|ADAD01000192.1| GENE 17 17926 - 18375 663 149 aa, chain + ## HITS:1 COG:FN2124 KEGG:ns NR:ns ## COG: FN2124 COG0622 # Protein_GI_number: 19705414 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Fusobacterium nucleatum # 2 137 3 138 153 112 41.0 2e-25 MKILICSDSHTRLDYFQKVIDLEEPEMIIFGGDHSTDAIDMSLVYSEIPFKIVRGNTDYF DKDTRDIEIFEVNGKKVFLTHGHLFGVKSNLNEIEKKAVSENADICIFGHTHREYMKEID GVTYLNPGALQDRKYVIYDGKKFEQKVLK >gi|261746456|gb|ADAD01000192.1| GENE 18 18433 - 19314 1111 293 aa, chain + ## HITS:1 COG:YPO2151 KEGG:ns NR:ns ## COG: YPO2151 COG0697 # Protein_GI_number: 16122384 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Yersinia pestis # 19 289 16 290 373 146 32.0 5e-35 MGAKSKVNIQKRYINHYYAVLAALFYALNVSVAKILLKDINEVILAGLLYLGAGMGMTGV VFFGKKNVSEKEKPFEKKDVKYIFGMIVLDIFAPIFLMTGLNKALPENVSLLNNFEIVVT SLIAYFIFREKISKRLATGLFLVTLSTIILSFQGIKSFSFSEGSLFVILACCCWGLENNC TRMLSQSDPKKIVIIKGIGSGLTAFLIGVFLRYPLPSLMTMIYSLCLGFVSYGLSVYYYV KAQRYLGASRTSAYYALTPFIGVILSLVLFREMPTVNFWIALVVMGIGIFFTN >gi|261746456|gb|ADAD01000192.1| GENE 19 19470 - 19781 362 103 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039496|ref|ZP_06012798.1| ## NR: gi|262039496|ref|ZP_06012798.1| peptidase propeptide/YPEB domain protein [Leptotrichia goodfellowii F0264] peptidase propeptide/YPEB domain 
protein [Leptotrichia goodfellowii F0264] # 1 103 12 114 114 155 100.0 1e-36 MILFSAVTFGIGKGKNMFSTDFSYKRSNADLKNKISSEKAKEIALNHARVSKNNAKFKKI GLHKKKDVLIYKIEFYVDKTRYQYEIDARNGIILKAKNNHKNK >gi|261746456|gb|ADAD01000192.1| GENE 20 19836 - 20195 318 119 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039486|ref|ZP_06012788.1| ## NR: gi|262039486|ref|ZP_06012788.1| putative threonine rich protein [Leptotrichia goodfellowii F0264] putative threonine rich protein [Leptotrichia goodfellowii F0264] # 1 119 1 119 119 140 100.0 3e-32 MFPTVISILMHKIRKQKTAVKKAKSTKGKKIVINGASITKTVTLNGESLEITGASNHITV KGNVSSLTVNGADNTVTLDSVSSINVYGTSNKIYYKTAPSKSGKPSISVTGADSTVSKR >gi|261746456|gb|ADAD01000192.1| GENE 21 20116 - 20313 268 65 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKNFIKLTVLTLVTSALLNAGTVKVGNQIKIKTPGSTSVNVSDGNININAQDSETKNSSK KSKID >gi|261746456|gb|ADAD01000192.1| GENE 22 20404 - 21729 1298 441 aa, chain - ## HITS:1 COG:FN0586 KEGG:ns NR:ns ## COG: FN0586 COG0642 # Protein_GI_number: 19703921 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Fusobacterium nucleatum # 5 437 24 445 445 214 34.0 4e-55 MGLMTGLVIFFLSIIFYISDNLIRNSIHEDLKNTVNSVFGQIEISDNNEPDFDESLVLLR NNIEISIYNKDLEFIYGSALSDFNFENSPFHENTIRTIRKGDEKWYVYENKKYIPNYGDL WIRGVIPSSLAEKAIETIISISLIILPFFLLFAGISGYIITKNGFKPIEKIRLTAENINA GNDLSRRINLGQQRKDEIHTLANTFDTMFNRLQTSFENEVQFTSDVSHELRTPISVIISQ SEYGLNHTESKEKMENSLRSVLKESIKMSQMISQLLMLSKMDKGHQKLNIEKVNLSELLD IIIDTQQIAADKKNIKINSVISPDIVIPADEILIMRLFINLITNAVNYGKENGYINIELF KFKDKIISKISDNGIGIAQENINKIWTRFYQVDSSRSSDSSGLGLSMVKWIVEAHKGSIS VKSELNKGTSFTVELPLNNKL >gi|261746456|gb|ADAD01000192.1| GENE 23 21794 - 22471 743 225 aa, chain - ## HITS:1 COG:FN0585 KEGG:ns NR:ns ## COG: FN0585 COG0745 # Protein_GI_number: 19703920 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Fusobacterium nucleatum # 1 225 1 224 224 207 49.0 
1e-53 MRILVVEDEKSLNEVIVKRLEDEHYGVDWCYNGKDALDNILITEYDAIILDIMLPELNGY EVLKTMRSKNIDTPVLFLTAKDSIEDRVKGLDSGANDYLTKPFAFEELLARIRVMLRKDS NSSGNVFTVANLTVDTNSHSVFRDDIPIKLSKREFTILEYMIRNRGKILSKDKIEQHIWN YDYEGGSNVIEVYIRYLRKKIDADFSPKLIHNIYGVGYILKVEDE >gi|261746456|gb|ADAD01000192.1| GENE 24 22492 - 23565 1037 357 aa, chain - ## HITS:1 COG:SP0182 KEGG:ns NR:ns ## COG: SP0182 COG1619 # Protein_GI_number: 15900119 # Func_class: V Defense mechanisms # Function: Uncharacterized proteins, homologs of microcin C7 resistance protein MccF # Organism: Streptococcus pneumoniae TIGR4 # 12 353 4 342 343 361 47.0 1e-99 MNKPKSLKRGDTVAVVSLSRGVLGESFCNHQKLLGTKRLEKMGFNVVFMPNSLKGIEYLE NHPEKRAEDLKNAFKNPDIKGILCAIGGIDGYKIFPYLMEDEEFISLVKENPKLFTGFSD TTVHHLMFHRLGMQSFYGPSFLTDIAELDDNLLPYTEKYWNIYSGSALNEINSSDIWYEE RTDFSEKSLGIPRISHKENKEFELLQGKENFSGKLLGGCIESIYNVISGERFSEQKEICD KYNIFPTKDEWKEKILFIETAEGKSSPEKYKKMLEAIKEKGVFEMINGIIVGKPQDEAYY EEYKSVLKEVVDNPELPVLYNVNFGHAYPRCILPYGLKMEYNHSERKIIFKENIFSE >gi|261746456|gb|ADAD01000192.1| GENE 25 23789 - 24037 170 82 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0133 NR:ns ## KEGG: Lebu_0133 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 82 1 82 83 66 49.0 3e-10 MKKLVLCLFLTLGVLSFSERFVEECKILSSSYSHMRCQSLDSGKIFHFSYPRNTLGIGQV YKVWFTGSGYRNLQLSYFEYLY >gi|261746456|gb|ADAD01000192.1| GENE 26 24200 - 24571 450 123 aa, chain - ## HITS:1 COG:no KEGG:lmo1117 NR:ns ## KEGG: lmo1117 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes # Pathway: not_defined # 1 117 3 117 118 106 48.0 4e-22 MKIKSVAIGIPVNNLEKSAKWYKDIFEPEEESVPSVDSPIIEYKTGPVWIQLFEGKTEDS DNIVNFEVENLEKEYNRLKEMSVINDEEITDIPDVIKYLEFKDPDGNKLTFVEIYSEAEL KSK >gi|261746456|gb|ADAD01000192.1| GENE 27 24577 - 25026 393 149 aa, chain - ## HITS:1 COG:no KEGG:BBR47_09460 NR:ns ## KEGG: BBR47_09460 # Name: not_defined # Def: hypothetical protein # Organism: B.brevis # Pathway: not_defined # 22 146 21 145 162 95 41.0 7e-19 
MNKIIHCKVKLKCDTKYAFEMFTKNELISTWLCKYADIEPVINGKYELFWDDNKEINSTI GCKITAIEPSRLLCFEWKGSLEFSEFMNYSDPLTHVCVLFSQENNDSCEVNLIHSGWKEG ENWETARKWFENCWNNLFNVLQEKIEKKG >gi|261746456|gb|ADAD01000192.1| GENE 28 25160 - 25843 664 227 aa, chain + ## HITS:1 COG:BH0851 KEGG:ns NR:ns ## COG: BH0851 COG2207 # Protein_GI_number: 15613414 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 21 221 21 251 261 92 27.0 6e-19 MDIKIIAEYMREKLESYDENEKFSIKEIAEKFGYTKYEFSRKFKKETGFSAKEFVSVLKL EKSLQKLIKEDKSVILAQLESGYESSGSFSNIFRKNTGLSPREYRKNIVKLQNYKSEMTF VGLFKTPIPSHAPVIGKAVISKKVNNRCTFKNIPDGKYYILACSVEKNGKIFRYFDLKNC LRGKVEKQLIFPSEKEEKFEILFREAIPEDPPILINLPNLLFKVIGK >gi|261746456|gb|ADAD01000192.1| GENE 29 25952 - 26809 882 285 aa, chain - ## HITS:1 COG:no KEGG:Lebu_2291 NR:ns ## KEGG: Lebu_2291 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 8 284 4 286 287 347 63.0 4e-94 MKNPNFLSKRKLIVIISVLAIFSAIYAVSPKKSSTAKNKVAAAKTIISPTKKELDIIADK IFQNEAGGVKNNLVYWNTGENFPSLGIGHFIWYKENEKGIFEESFPPLIEYFKMKNVKLP EILEKNKYAPWLDREELIRLKNGKNPDIEKLIDFLYNTKDIQIEFIYKRMEASLDKMSAI SGNKENVIKQFYRVANSPNGLYPLIDYVNFKGEGTNPEERYNGKGWGLLQVLENMKGEGS GKAALEEFSNSAKFVLQRRVNNSDPSKNEKKWLAGWFKRCDTYKE >gi|261746456|gb|ADAD01000192.1| GENE 30 26924 - 27592 725 222 aa, chain + ## HITS:1 COG:SA2134 KEGG:ns NR:ns ## COG: SA2134 COG2094 # Protein_GI_number: 15927924 # Func_class: L Replication, recombination and repair # Function: 3-methyladenine DNA glycosylase # Organism: Staphylococcus aureus N315 # 10 220 2 202 202 176 45.0 3e-44 MQKNKNKSKNFFEQDTVSLAKELLGKLIVVKSREEILSGYITETEAYLGVVDKACHGYNG KRTPKVEALYQEAGTVYIYTMHTHKMLNIVSCEKDNPQAVLIRGIEPDTGKEIMEENRGK TGVLITNGPGKLTKAMGINDKFNMSKVEILRGKIDIKEMKENTIYIDFEKSKNPLKIEVS ARIGIPDKGVWTKKPLRFYVAGNGYVSGMRKSEYTDKCWKDL >gi|261746456|gb|ADAD01000192.1| GENE 31 27637 - 27744 58 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSYTLLKFIENIKTEQSPNLIGKFQFFGHKLKISI >gi|261746456|gb|ADAD01000192.1| GENE 32 27834 
- 28733 1360 299 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0266 NR:ns ## KEGG: Lebu_0266 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 289 1 292 298 250 51.0 6e-65 MKNKIKILIVTVLCILALACGNTGGKSGKVVFESEDKKIKVYESEVNFQLERELEAAGAE LKDIPKEQLEQMKLDIVKNIATTRAFALKAKEKKLDKDKKYTEGVEFTKENFLASIAMLD RLNSVNVTDDEAKKIYEANLKNFERPEDSVRLQLIVTPASEKDKAEAALKEAKANPVKFG ELVRKYTGVQNGSTGETDEIPMSALAEKYAPISEAVKNAQKGQIIDNVIIVGDEAYIVKV LEKNPKGVVAFDVVKDQIKAQMKASKRQEETQKFMEDVTREFKLDKINKDTVKLPETKK >gi|261746456|gb|ADAD01000192.1| GENE 33 28770 - 29180 537 136 aa, chain + ## HITS:1 COG:no KEGG:Smon_1263 NR:ns ## KEGG: Smon_1263 # Name: not_defined # Def: hypothetical protein # Organism: S.moniliformis # Pathway: not_defined # 41 136 113 199 203 62 39.0 6e-09 MRKSYFIMLLISIGMISCTSGDPYAGSSSSGTTSGSRVESKPAPNKVTPSKSNSGGEIDN KFDATDEKIVAFIDQKVYEDAAQIRQVTVDSLNKLLKEKNIRMTKREFLARTYQIIRDNN MNSFYLAAGRLLNQLK >gi|261746456|gb|ADAD01000192.1| GENE 34 29208 - 30683 1838 491 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0265 NR:ns ## KEGG: Lebu_0265 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 457 1 466 500 294 45.0 7e-78 MKFLKFLLVLVIIAVIFALGGVIFSNRIIAHLTKTDSSRFENLKYSIKDGEIVFDNFILN GEKLGKGKAKIKIVRGGKFGISPEVKLESMELQEVKSELLYNAADPQIDSFIEKINVPAE HEKIVKTTKSYIKETTARINNLDKNIDDFFNMKKGNIEAVNKLKQDYGLSTDLKEKSSKL IELNKELKNFNEQINTEKEKVNKEISEIEAERSIMIENISGDLDKLEKIISLNDVENLNS YIFLERGREIGISLNKTLKAVKFIKKIRDNSGLSISSISVNNGEIVFSGLEKGKKTSKGQ VTLGDNIKADVTETDKGYEISYNKDDIALKTLFGEGIISTVEYSKKDLLEGKALNLISEL IFENNNFKNLNKTVLSEKDKKMLEEKINGMKSERYNEIMKEYEDQTKSLEGLIDDIYAKE GKLDKLQRGLLSLNTILNISDYISGDTAGTSTVNTNTEVKSEENNGNENKNSASGDSTAK DVKDQLKKIFN >gi|261746456|gb|ADAD01000192.1| GENE 35 30694 - 30939 323 81 aa, chain + ## HITS:1 COG:BS_yabO KEGG:ns NR:ns ## COG: BS_yabO COG1188 # Protein_GI_number: 16077127 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-associated heat shock protein implicated in the recycling of the 50S 
subunit (S4 paralog) # Organism: Bacillus subtilis # 1 76 1 76 86 63 44.0 9e-11 MRLDKFLKITRIIKRRTVAKELADNGNISVNGEEKKSSYNIKKGDILDIKYFNKNIKVKI KDIPPENLKKDFIDEYIELIS >gi|261746456|gb|ADAD01000192.1| GENE 36 31122 - 31397 504 91 aa, chain + ## HITS:1 COG:FN0022 KEGG:ns NR:ns ## COG: FN0022 COG2088 # Protein_GI_number: 19703374 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized protein, involved in the regulation of septum location # Organism: Fusobacterium nucleatum # 1 91 1 90 93 102 51.0 2e-22 MKITDIRLRLGKGSEEGGKLKAYVDITFDECFVIHGLKVIEGQNGLFVAMPSRRMPNGEF KDIAHPITPELRSELTRVILENYEKENVEAE >gi|261746456|gb|ADAD01000192.1| GENE 37 31551 - 31643 65 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRAHCCPHTPDTCVLRLRRMRKCKGMNNDL >gi|261746456|gb|ADAD01000192.1| GENE 38 31647 - 32213 701 188 aa, chain + ## HITS:1 COG:FN1597 KEGG:ns NR:ns ## COG: FN1597 COG0193 # Protein_GI_number: 19704918 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptidyl-tRNA hydrolase # Organism: Fusobacterium nucleatum # 1 188 1 186 191 166 48.0 2e-41 MKLIVGLGNPGEQYKLTRHNIGFIFIDEYLKENNINDMREKYKSEFIQTAYKGDKVFYQK PLTFMNSSGEAVGEAVRFFKIDPETELFVIYDDMDMEFGKVKVKKDGRSAGHNGIKSIIQ HVGEKFVRIKYGIGKPKSKDETIGFVLGKFSPEEKETLKESREKIFSLIDDIKNDMTLER LMNKYNTK >gi|261746456|gb|ADAD01000192.1| GENE 39 32226 - 33107 273 293 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 73 278 82 279 285 109 34 2e-23 MKKFKLEKQYQRMKISQYLREVQNYSGRSLRNVEVFLNGKQVRTTKKLPSSGTLRVIEKE KGTDIKPIKLDLDIVYEDDDLLVVNKEPFLLTHPTQKKADFTLANGIVYYFKEKYEKETV PRFYNRLDMNTSGLIIIAKNSFAQAFLQNFSNFEKKYLAIVDGIMETDEEIVIEKPIYRD GDKLERIIDERGQYAKTIVKPLKKYPEKNVTLAECELFTGRTHQIRVHLKSVGHTIVGDE LYGKGSDEKRGIKRQFLHAYKVKFTHPVKKEEIELEIPLFNDMKEFLGEDILK >gi|261746456|gb|ADAD01000192.1| GENE 40 33135 - 35885 3260 916 aa, chain + ## HITS:1 COG:CAC0801 KEGG:ns NR:ns ## COG: CAC0801 COG0574 # Protein_GI_number: 15894088 # Func_class: G 
Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: Clostridium acetobutylicum # 20 915 15 851 856 343 32.0 1e-93 MKYIQEMKNYRGNSGIFEEKTGKKAQNLLELSNQGFNVPNFSVIDNRYFKEVILKEIQES NRNKGINGEINDWKSIFAENAEEKIENIIKVIKEHEPEKEFIEEIEKVIGKDEYYAVRSS SVEEDSDNFSFAGQFETFLYVRKENIIEKIKEVWLSSFSSHVMKYRKEGNINNEINVPAV IIQEMVNSQKAGVAFSVNPVNGNAAELVISGTYGLGTSIVDGDENGDLYIYNKNTQEIKK EIKTKKIRQVLDFENKKIKTEEINIDDEILNEKEVEELAENIINIEKCYGRPQDIEWAYE KGELYILQSRPITTLKKDTDNMFNTIIWDNSNIVESYPSISLPLTFSFIRTAYSEVYKRF SEITGVPPKVVESYQVVYDNMLGLLKGRVYYNLINWYKLLLLFPNAKNNSKFMEQMMGVK KELSQENLSENLLQANEKMSFLEKTVNKVQKLKAGLAIFLNMFLIEKKAEKFYKIINQNL NNEEINLSDMNITELKKYYRFLENKFLKNWEIPIINDFLVMIWFGISKKVAEKYIKENAD EVHNILIAQEGQEMISVEPSKYIEKLSEMLRKNEVLKDEVKSIIENAEREKNSETGEKTL FGNFYNISSLTKDTEFNRVLNEYMEKFGDRTVQELKLESLTLKEEPLFFIKMIYSLSGVK SEHEHSKRNIADEQKKIYDGLKTGFIKKYILKKSVFYAKKFIRLRENLRYERTKVFGTVR KIMKQIGIYLKEDNITENERDIFYLTIEEIFGLIDGAVIDVNLKELVKLRKEEYKKYEEG IILPDRFLTKGFLGEDFYFEDLSAVGQSNEELKGTGCSKGIVKGKVKVVLDLVNDEVKEG DIVVTKSTDPSWVMVFPLLKGLIVEKGSLLSHSAIISREMNIPAIVGVQGATGLLKTGDY IQFDGSTGIIKKLEEV >gi|261746456|gb|ADAD01000192.1| GENE 41 35887 - 37011 1083 374 aa, chain + ## HITS:1 COG:mlr1574 KEGG:ns NR:ns ## COG: mlr1574 COG5379 # Protein_GI_number: 13471564 # Func_class: I Lipid transport and metabolism # Function: S-adenosylmethionine:diacylglycerol 3-amino-3-carboxypropyl transferase # Organism: Mesorhizobium loti # 12 321 57 381 432 92 23.0 2e-18 MKSEVRENNVDFSLIRYSQCWEDTEVLLEALDIRENDVCLGILSAGDNVFSMLTKNPEKI VALDISFPQVALVKLKMEAFKRFSYEEMLKFIGVKESDKRAEMYEKIRENLEEKVREYWD FNKEAVQNGVIHMGKFEKFFRLFRKRILPFVHSKKRVEEFLSEKSETERIDYYNKKWNNF RWKMMFKLFFSNYIVGKFGRDKEFFKYVEKDISKVMTERSRYASCELSPYENPYLNYILT GNYRSDCLPYVLREENFEKIKKSLHKIEIVQSSMEEYLDGTDLKINRFNLSDIFEYMSLG NYEKLMEKIYEKSADEAKLVYWNLVVERNAGLIKDRKEKRFERLERLDKELHKKDKTFFY TDFVVEKVIKNGNS >gi|261746456|gb|ADAD01000192.1| GENE 42 36998 - 38329 1171 443 aa, chain + ## HITS:1 COG:no KEGG:Sterm_1951 NR:ns ## KEGG: Sterm_1951 # Name: 
not_defined # Def: phosphatidate cytidylyltransferase # Organism: S.termitidis # Pathway: not_defined # 13 344 14 347 444 182 38.0 4e-44 MEIVKMIIVLIIFMLFFLLLNKLEKSEKFNSELIRKILHIGSGFGGLILPFLFKRKISVI VLGIIFLILLIGIRIIKNKIMGLKQVIETKNRKTLGDIYFIISILGLWIVSGDNKVTYAL PLVILMFSDAFAALIGEFYSKFKFDTGFGTKSVEGSVVFFLTTYFVCINFFLIFSNLKNI NIVLLSLLLSILTMILEVISWNGLDNLFVPFFVYLFLKLNMNLGTKELMYKFWVIVVLFI IIILNRKKTTLTRIAQTGSLFFLYLVMIIGGIKWLIPPLIMYLGYYHFTPKVEGQVKDSL KGLLAIAFSTFVWLVLATILDKEKLYQVYIFSFSLHFGLINLIRDNAGNVNRETFRMRFL AGSTGKALLFFGCNYLFLSMIRDFKMLAGVIILLIGGIFLYETSMKIYYIIEKEKELSGE SKVFIASGTVFLCSLLLTGIGML >gi|261746456|gb|ADAD01000192.1| GENE 43 38358 - 39380 1134 340 aa, chain + ## HITS:1 COG:no KEGG:Sterm_1952 NR:ns ## KEGG: Sterm_1952 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: S.termitidis # Pathway: not_defined # 45 336 35 320 325 179 35.0 2e-43 MKMKRYVIMKDNQTENIVLEKYIAENNISEEEIQKIKIERPSEITVFYSETKEIAGSLYL WHDRPDYKGQKTSYIGNVTVNEPFREKGEEIFEEIFKELKNEKVQVIIGPLNGTTWNTYR YVTEKRERPKFLMEPWNEEYYPDLFEKTGFVSLARYISSISENMKKSENIKRKMEKIKKF DFYNHIKVESVENKDLNEVLNNVYDLTIEAFKNNFLYTELDREIFLKMYMSYKDKLVKQF FKLVYLKEELVGYVFGIPDYAELYYKEKVDTVILKTIAVAPKYNGKGIGYILIDEFIKEA QNNGYSNIIYALMHESNVSKNTGLLLGEKLREYTLFIKEL >gi|261746456|gb|ADAD01000192.1| GENE 44 39421 - 41025 1838 534 aa, chain + ## HITS:1 COG:XF2276 KEGG:ns NR:ns ## COG: XF2276 COG0318 # Protein_GI_number: 15838867 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Xylella fastidiosa 9a5c # 30 527 59 553 563 126 21.0 1e-28 MTIADKIKELKDRHPYKTALVDLKTGDKANFNHMDVKSDKICTYLENKGFKKGDKIVVFV PIGVEFYIILLAIFKMGLQAVFIDPYADTEHINKCCEMISPEGIIGSGKTILKGFFLKGI RKIAKKINYKKMLEQSESLPPIKKRKPKKYEREEEHINENTPALISFTSGSTGFPKIIMR THGFLMGQHKVLEKNLKFEKETSIYSSFPIFLLSHMATGATVFIPDIDMSSPMEANPKEV TEQIKKNNIQNIILPPVILENIVNYVEKNNMTLKNVQTVYTGGAPVFPKLMQKIDKIFEN VKVRALYGASEAEPISILNYEDITGEDIENMKNGYGLLVGRKVDEIDLKIEEISEKADEK 
MNESENEGFEKKSEAENKKSENKEDKNTGIKGEILVKGENVLKGYLNIPESPDKKWHKTG DTGYINEKGQLVLLGRVKGRIRINDNIYYPFSIETAFSFCEALKKSVLTSKDDKLYLITE RNPDFEGNLEENEEIKKLKEKFGIFKIVEDEIPVDKRHNSKTDYKKLEELKEKL >gi|261746456|gb|ADAD01000192.1| GENE 45 41049 - 42071 1265 340 aa, chain + ## HITS:1 COG:no KEGG:Sterm_1948 NR:ns ## KEGG: Sterm_1948 # Name: not_defined # Def: UbiA prenyltransferase # Organism: S.termitidis # Pathway: Ubiquinone and other terpenoid-quinone biosynthesis [PATH:str00130]; Metabolic pathways [PATH:str01100]; Biosynthesis of secondary metabolites [PATH:str01110] # 11 326 3 294 300 181 38.0 6e-44 MNNKIMVTERIKNFKIYLDERFPARKNSFFVLMFTLSAYVYTGLLYNMKILGMTGNDPDR IPMPLHKVIPLFIIILMFFFQLRLTDEFKDYEEDLKYRPYRPVQRGIITLKTLRNIGIVT VIMQIISAFLINPKILIYMSFVWIYMFLMAKEFFIKEWLTKRIVIYALSHVVIMIFINLV IIKAAEFIMTAQGTPGLPDSIISKIYAGIVPFLMLGYLNGMVLEIGRKTRKSDEEEHGVE TYSKLWGRKKAVYILCGLYVLDYILVMFGLLQTNEKYFFLGMAVLTIVLAISVYFMIKFL KKNLSGKVSENVSGLWILFSCLTTGLFQYTVFYIFSLLKS >gi|261746456|gb|ADAD01000192.1| GENE 46 42135 - 43142 1682 335 aa, chain + ## HITS:1 COG:FN0652 KEGG:ns NR:ns ## COG: FN0652 COG0057 # Protein_GI_number: 19703987 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Fusobacterium nucleatum # 1 335 1 335 335 530 81.0 1e-150 MAVKVAINGFGRIGRLALRLMANDPEFDVVAINDLTDAAMLAHLFKYDTAQGRFDGEIQV KENAFVVNGKEIKTFADADPENLPWGQLGVDVVLECTGFFTSKEKAEKHIKAGAKKVVIS APGSGEMKTVVFNVNNNILDGSETVISAASCTTNCLAPMAKVLQDKFGIEVGSMTTIHAY TGDQNTLDAPHRKGDFRRARAAAANIVPNTTGAAKAIGLVIPELAGKLDGAAQRVPVPTG SLTELISVLNKKVTKDEVNAAMKAASNESFGYTEEPLVSSDIVGIKFGSLFDATQTKVIE SGDKQLVKTVSWYDNEMSYTAQLVRTLKYFVELAK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:55:39 2011 Seq name: gi|261746450|gb|ADAD01000193.1| Leptotrichia goodfellowii F0264 contig00105, whole genome shotgun sequence Length of sequence - 5671 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 2 average op.length - 
2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.000 - CDS 2 - 266 425 ## COG0747 ABC-type dipeptide transport system, periplasmic component 2 1 Op 2 3/0.000 - CDS 259 - 1464 1682 ## COG0747 ABC-type dipeptide transport system, periplasmic component 3 1 Op 3 3/0.000 - CDS 1510 - 3309 2593 ## COG0747 ABC-type dipeptide transport system, periplasmic component - Prom 3344 - 3403 17.2 - Term 3386 - 3432 3.1 4 2 Op 1 3/0.000 - CDS 3513 - 5303 2768 ## COG0747 ABC-type dipeptide transport system, periplasmic component 5 2 Op 2 . - CDS 5347 - 5670 483 ## COG0747 ABC-type dipeptide transport system, periplasmic component Predicted protein(s) >gi|261746450|gb|ADAD01000193.1| GENE 1 2 - 266 425 88 aa, chain - ## HITS:1 COG:BH3636 KEGG:ns NR:ns ## COG: BH3636 COG0747 # Protein_GI_number: 15616198 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Bacillus halodurans # 1 88 420 507 610 125 69.0 2e-29 MDEAGYKDVDGDGYREDKNGKPLEIKIAAMSGGDIAEPLAQYYIQQWKQIGIKGVLATGR LIEFNSFYDKVEADDPEIDVYFAAWGVG >gi|261746450|gb|ADAD01000193.1| GENE 2 259 - 1464 1682 401 aa, chain - ## HITS:1 COG:BH3636 KEGG:ns NR:ns ## COG: BH3636 COG0747 # Protein_GI_number: 15616198 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Bacillus halodurans # 46 399 68 416 610 249 36.0 6e-66 MKNAGSVLIMIIMCVLMTACGPGEKKSNGGNGDKKEFKFPTETSNDGKAVSGGTLKVAMV KDSPIVGIFNYAMYKDEYDGDILDWFVGGYILDIDENFEVTDTGIATLAVDVPNKKVTIK IKDNMKWSDGQPVVADDVIYAYEVLGNKDYTGIRYNDESTKIVGMEEYHAGKTPNISGIK KVDDKTVEITFKQLGQGIYTATTNGLLRYALPKNYLQDVPIKDLEKSDKIRKNIVTVGPY TISNSVQGESLELKANEHYFRGKPKIEKVIIQIVNSQTIGAAMKAGEYDVALNIPTDLYK TYKDFDNLEVLGRQELYYSYMGFKMGHFDKAKGENVVDPDKKMSDLRLRQALAYGINTEE MINAFYEGLREKATSSVPPVFKKYYPKDFKGFEYNPEKAFG >gi|261746450|gb|ADAD01000193.1| GENE 3 1510 - 3309 2593 599 aa, chain - ## HITS:1 COG:BH3636 KEGG:ns NR:ns ## COG: BH3636 COG0747 # 
Protein_GI_number: 15616198 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Bacillus halodurans # 43 595 64 607 610 407 37.0 1e-113 MKMKNAGSTLIMVIMCILMVACGPGNKKSNAEKDAVDASKFPIETSNDDSTVKDAVLKVA IVKDSPLVGIFYSELNKDGFDQDLVSSFFSNLIFDVDENFEVTDTGPATLTVDAPNKKAT IKIKDGIKWSDGQPLTADDVIYSYEVIGNKDYTGPRYDDDSVKIVGMKEYHEGKANTISG LKKIDDKTVEVSFNELGQGVYTIGNGLQGSALPKHYLKDIPIKDLEKSEKVRSKIISLGA YNIVNAVQGESLELKANEYYFKGKPKIEKVIVQVVNSNTITAALKAGEYDMVIQMPSDQY NIYKDYDNLQILGRQELYYSYVGFKMGHFDKAKNENITDPNAKMADIRLRQALIYGLDVN QMVKAFYGGLRERATSSVPPAFKKYFSKDIEGFPYNTEKAKKLLDEAGYKDINGDGYRED KNGKPFEVKIAAMAGGDIAEPLVQFYIQQWKEIGIKGVLSTGRLMEFNSFYEKLEADDPE IDVFFAAWAVGTNLNPVESAGRRSQFNLTRFGSDENDKLMAETSSPKTLEDPNYKAEAYK KWQEYYINQAVEVPLTYRYAIIPVNKRVKNYYIGRDSAKKGEGIHKWELTAKDPIKSSK >gi|261746450|gb|ADAD01000193.1| GENE 4 3513 - 5303 2768 596 aa, chain - ## HITS:1 COG:BH3636 KEGG:ns NR:ns ## COG: BH3636 COG0747 # Protein_GI_number: 15616198 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Bacillus halodurans # 40 594 64 609 610 417 38.0 1e-116 MRKIGTVMTLFILMLLISCGPGKKRSNAGDEAVDASKFPIETSNKDAAVKDAVLKVAIVK DSPLVGVFYEELYSDGFDGKIAETFFNNQIFEVDENFEVTDTGMATLSVDAANKKATIKI KDGIKWSDGQPLTADDVIYSYEVVGNKDYTGIRYNDDSTKIVGMKEYHEGKAPNISGLKK IDDKTVEVSFTELGQGVYTLGNGLRGTALPKHHLKDVPIKDLEKSELIRSKVVTLGAYSI SNSVQGESLELKANENYFKGKPKIEKAVVQVVNSNTISQALKSGEYDMALRIPTDQYKVY KDYDNLEVLGRQELYYSYMGFKVGHYDKEKGENIPDPNAKMADVRLRQALAYGLDVDQMV KAFYNGLRERATSSVPPIFKKYFSKDIKGFPYDPEKAKKLLDEAGYKDVNGDGYREDKDG KPFEVRIASMAGGDIAEPLVQFYIQQWKEIGIKGVLSTGRLIEFNSFYEKVKADDPEIDV FFAAWTVGTNLNPIEGSGRKSMFNYPRFASDENDKLMAETASSKTLEDPNYKAEAYKKWQ EYFIPQAVEVPLTYRYEMVPVNKRLKNFYIGFDEATKGEGIHKWELTAKEPIKASK >gi|261746450|gb|ADAD01000193.1| GENE 5 5347 - 5670 483 107 aa, chain - ## HITS:1 COG:BH3636 KEGG:ns NR:ns ## COG: BH3636 COG0747 # Protein_GI_number: 15616198 # Func_class: E Amino acid transport and metabolism # 
Function: ABC-type dipeptide transport system, periplasmic component # Organism: Bacillus halodurans # 1 105 509 609 610 70 35.0 1e-12 NLNPIENAGRKSQFNFPRFVSDENDKLIAEISSPKTLEDPNYKAEAFKKWQEYFIPQAIL VPLTYRYAVTPVNKRLKNFYIGLDYAKKGEGVHKWELTAKEPIKASK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:55:40 2011 Seq name: gi|261746447|gb|ADAD01000194.1| Leptotrichia goodfellowii F0264 contig00235, whole genome shotgun sequence Length of sequence - 2514 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 2404 3338 ## COG3210 Large exoproteins involved in heme utilization or adhesion 2 1 Op 2 . + CDS 2404 - 2512 102 ## Predicted protein(s) >gi|261746447|gb|ADAD01000194.1| GENE 1 2 - 2404 3338 800 aa, chain + ## HITS:1 COG:XF2196 KEGG:ns NR:ns ## COG: XF2196 COG3210 # Protein_GI_number: 15838787 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Xylella fastidiosa 9a5c # 6 184 2615 2798 3442 66 31.0 2e-10 GITFWGAEGVSVGANYGTMKSEGTLYNNTQIQAGNKLKIKADNMEIRGGRAVGKHVEADI KENLLIESLQDKEDMKQIGVNVGYSMQKGKGKDGKPDNRHNGSLGGSYGRKDKEWVSEQS GIIGTESADVKVGKELALIGGIIANIDEEGKDKGNLTLSYGSLRTANIESHDKLINLNGN IEINQRSRDNNTELVLNNKKGKNKTRDNVKEVPNRVDETYGVGVEGHEREKITRATIGNG VINSGRAIEVGVNRDITRAEELLKDVNVQKTEFIFKSEPNSWGDFNKIMSSNAGIIGNFL DDMNEHTGNKVRTNYEDKFRTKTSSAIASVESKLVKLNNMTGSLMPLQEHHGGLLEQIVR TVRKDKAPIIEVTLQKGKNGEMLMGMVEKRRLSEVGIDKDGNRLKNVQAFINGITEVKSN ATRNAIFKNMTEENKKRFANGEEVKIALVYNPSRGMVSDLFESFMGKVFDGSLSSLGLST GISRGTAIALASGAEGVNYDLGLYSQGNIVGLGAFNILKNNGIKLGDGQGSYKVGMYGTP IRRNTIQEFENSLGITFRGAALNTPDFVANSKKIFGLSGESRLINALNVNSVKEHSWKDN TGNWWFAGKAIFREQRMPGLFIPVLPGDKKILKEDYLYSKVVTNDEIDKINNNYRPIVGG DLTEKIINSLISNPHGTYVYESAEISEQIYNKFEEYKGASNTRKEEILGEVIELYKKDQA LKLDLAMNGPAILDNSPFLLKARDEYNRAELIREYNAGRYQDKGLPGIMNVGMSVEEKRK ELKVSDITSYLRKLRQGVER 
>gi|261746447|gb|ADAD01000194.1| GENE 2 2404 - 2512 102 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no
MEKKVMIVIGLMILLGSCGELRYGIEKTVFEMEQDK

Prediction of potential genes in microbial genomes
Time: Thu Jul 14 03:55:57 2011
Seq name: gi|261746407|gb|ADAD01000195.1| Leptotrichia goodfellowii F0264 contig00006, whole genome shotgun sequence
Length of sequence - 43335 bp
Number of predicted genes - 40, with homology - 36
Number of transcription units - 13, operones - 10
average op.length - 3.7

N Tu/Op Conserved S Start End Score pairs(N/Pv)
+ Prom 14 - 73 5.1
1 1 Op 1 5/0.000 + CDS 168 - 2210 1827 ## COG3711 Transcriptional antiterminator
2 1 Op 2 8/0.000 + CDS 2194 - 2628 615 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type)
3 1 Op 3 7/0.000 + CDS 2642 - 2950 587 ## COG1445 Phosphotransferase system fructose-specific component IIB
4 1 Op 4 2/0.000 + CDS 3027 - 4208 1681 ## COG1299 Phosphotransferase system, fructose-specific IIC component
5 1 Op 5 . + CDS 4198 - 6912 2492 ## COG0383 Alpha-mannosidase
6 2 Tu 1 . - CDS 7042 - 7812 753 ## THEYE_A1956 hypothetical protein
- Prom 7844 - 7903 10.9
+ Prom 7868 - 7927 7.8
7 3 Tu 1 . + CDS 7964 - 8797 1164 ## COG4820 Ethanolamine utilization protein, possible chaperonin
+ Prom 8902 - 8961 17.1
8 4 Op 1 . + CDS 8995 - 10287 1111 ## COG1508 DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog
9 4 Op 2 . + CDS 10307 - 13186 2903 ## COG1221 Transcriptional regulators containing an AAA-type ATPase domain and a DNA-binding domain
10 4 Op 3 . + CDS 13250 - 13870 780 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase
11 4 Op 4 13/0.000 + CDS 13886 - 14350 644 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type)
12 4 Op 5 10/0.000 + CDS 14360 - 14641 457 ## COG3414 Phosphotransferase system, galactitol-specific IIB component
13 4 Op 6 . + CDS 14660 - 16015 1897 ## COG3775 Phosphotransferase system, galactitol-specific IIC component
14 4 Op 7 . + CDS 16030 - 17175 1419 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily
+ Term 17190 - 17247 0.3
- Term 17025 - 17064 4.2
15 5 Op 1 . - CDS 17243 - 17713 611 ## COG0394 Protein-tyrosine-phosphatase
16 5 Op 2 . - CDS 17739 - 19388 1921 ## COG0616 Periplasmic serine proteases (ClpP class)
- Prom 19416 - 19475 12.2
- Term 19443 - 19492 2.1
17 6 Op 1 . - CDS 19536 - 20267 549 ## COG0846 NAD-dependent protein deacetylases, SIR2 family
18 6 Op 2 . - CDS 20281 - 21213 1040 ## COG0167 Dihydroorotate dehydrogenase
- Prom 21233 - 21292 12.9
19 6 Op 3 . - CDS 21312 - 22169 730 ## COG1737 Transcriptional regulators
- Prom 22202 - 22261 15.2
+ Prom 22238 - 22297 13.2
20 7 Op 1 17/0.000 + CDS 22331 - 24007 1835 ## COG1178 ABC-type Fe3+ transport system, permease component
21 7 Op 2 7/0.000 + CDS 24026 - 25123 229 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16
22 7 Op 3 . + CDS 25140 - 26183 450 ## PROTEIN SUPPORTED gi|167854980|ref|ZP_02477755.1| 50S ribosomal protein L13
23 7 Op 4 . + CDS 26204 - 27613 1763 ## COG0362 6-phosphogluconate dehydrogenase
24 8 Op 1 . - CDS 27778 - 28383 469 ## gi|262039545|ref|ZP_06012844.1| hypothetical protein HMPREF0554_0052
25 8 Op 2 . - CDS 28411 - 28851 453 ## Spico_0842 Glyoxalase/bleomycin resistance protein/dioxygenase
- Prom 28969 - 29028 8.2
- Term 28986 - 29027 1.5
26 9 Tu 1 . - CDS 29100 - 29330 104 ## gi|262039541|ref|ZP_06012840.1| hypothetical protein HMPREF0554_0054
+ Prom 29215 - 29274 9.3
27 10 Op 1 . + CDS 29485 - 30306 837 ## gi|262039566|ref|ZP_06012865.1| hypothetical protein HMPREF0554_0056
+ Prom 30310 - 30369 2.8
28 10 Op 2 . + CDS 30390 - 30473 98 ##
29 10 Op 3 . + CDS 30480 - 30623 199 ## gi|262039556|ref|ZP_06012855.1| protein RhuM
30 10 Op 4 . + CDS 30596 - 30706 133 ##
+ Prom 30713 - 30772 4.5
31 11 Op 1 2/0.000 + CDS 30926 - 32929 1371 ## COG3593 Predicted ATP-dependent endonuclease of the OLD family
32 11 Op 2 .
+ CDS 32929 - 34917 1840 ## COG0210 Superfamily I DNA and RNA helicases 33 11 Op 3 . + CDS 34942 - 35520 409 ## gi|262039547|ref|ZP_06012846.1| M protein + Prom 35522 - 35581 4.5 34 12 Op 1 2/0.000 + CDS 35612 - 37192 1641 ## COG0286 Type I restriction-modification system methyltransferase subunit 35 12 Op 2 . + CDS 37185 - 38183 896 ## COG3943 Virulence protein 36 12 Op 3 . + CDS 38243 - 38371 103 ## 37 12 Op 4 . + CDS 38392 - 38523 106 ## + Prom 38525 - 38584 5.6 38 13 Op 1 4/0.000 + CDS 38606 - 39784 875 ## COG0732 Restriction endonuclease S subunits + Prom 39982 - 40041 8.0 39 13 Op 2 11/0.000 + CDS 40169 - 40489 190 ## COG0732 Restriction endonuclease S subunits 40 13 Op 3 . + CDS 40543 - 43333 2684 ## COG0610 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases Predicted protein(s) >gi|261746407|gb|ADAD01000195.1| GENE 1 168 - 2210 1827 680 aa, chain + ## HITS:1 COG:FN0198 KEGG:ns NR:ns ## COG: FN0198 COG3711 # Protein_GI_number: 19703543 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Fusobacterium nucleatum # 5 586 6 579 660 125 25.0 3e-28 MINSKNEITIKDIADLYNMTERSIRYDINELNIFFKEKNRRDIIEINNNKLKILYEKKEI EKVIHKISIGEYYLSEDERIDILFYEIFLLKNEFILQYFSEKYGLSKTTIRYSLKEVDHI IQEYNLTIEMSNKGYKVTGNEINIRKYIINILRKYLKIIKGKNIEYNPYMNIVEQFLSKK NIEFSKILVGEILNKTGKTVSDEAFETLQLFILTSILRNEKNLKIKEDAENKIFLLKTSE FIKIKEILKKIENIEEKDIYYFVDFFLGSYSYNPDYSYFLNWVLMESLIEQFLVKLSREL KINLATDVILRKELLNHLKPAIYRMKNKFKLTESILKEVKAQYTDLYDKTTMSLKIISDF INIPFDEDEAAFITVIVQRAVLRNNSSLLSKKDPRILIVCGLGYSSSRFLYENINSHFQV NIIDIIPFNQLESYNYLKKADIVISTLDFSLDGIDVITVNPIINENDIVKLKNYGLEERK NKIKLSELINFVKNISDEKKLKKKLIKNFGENIYDDTKKEKDSGKSFVNLLSEENIKLNT GVKNLNELIEFTGQTMIDSGLVKKEYIKELKNQVAQYGKYILIGEKTILPHGQLLKNVKK TGFSLITLKEGIDFFGSEVRIVICLASRHTEEHLQAILELNGYLKNSDFEKELLNKKTPT EILNYFKTLNKQEKKNEKFY >gi|261746407|gb|ADAD01000195.1| GENE 2 2194 - 2628 615 144 aa, chain + ## HITS:1 COG:lin0376 KEGG:ns NR:ns ## COG: lin0376 COG1762 # Protein_GI_number: 
16799453 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Listeria innocua # 1 142 3 146 148 81 36.0 5e-16 MKNFINVSNIILDLETENKNLIINKMVETIPEGKLFDKEKFIADVLKREETENTVVGFKV AIPHGKSEYIKSPQIVFAKLKQEIFWGDPEEKVKYIFLLGVPSISAGDHIEILMKLSKKI LDEKFREKLGSTNVKEELLNIILE >gi|261746407|gb|ADAD01000195.1| GENE 3 2642 - 2950 587 102 aa, chain + ## HITS:1 COG:ECs3267 KEGG:ns NR:ns ## COG: ECs3267 COG1445 # Protein_GI_number: 15832521 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system fructose-specific component IIB # Organism: Escherichia coli O157:H7 # 2 98 3 99 108 106 60.0 1e-23 MKKIVAVCVCPMGLAHTFMAADSLEKAAKELGVEIKIETQGADGVQNELTKKDIAEADAV ILALAITPQGMERFDDCDPYEITLKEAIREGKEILEEIVNEL >gi|261746407|gb|ADAD01000195.1| GENE 4 3027 - 4208 1681 393 aa, chain + ## HITS:1 COG:ECs3266 KEGG:ns NR:ns ## COG: ECs3266 COG1299 # Protein_GI_number: 15832520 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Escherichia coli O157:H7 # 8 391 28 413 415 401 56.0 1e-111 MAQLKRRKNSFWQEFYKHLMTGISYMIPVLIMGGLIGAFSQVIPYVIFKIEPNVSILDAI KTGQYTGMNLQLLKLASLMESFGFTLFGFAIPMFAAFVANSIGGKTALAAGFIGGYVANK PISVVGMVDGAWDKITPVPSGFLGAFIIALIVGYFVKYLNKNINLPKNWLAFKSTFLIPL LASLFCMIIMVFIITPLGGWVNLQIRALLEAAGAAGQFAYALVLSATTAFDLGGPVNKAA GFVALGFTTEKVLPITARTIAIVTPSIGLGLSTLIDRKLVGRKVYNSEFYDAGKTSIFLA FMGISEGAIPFALENPGFTIPLYVVSSVIGAFSGIALGAVQWFPESAIWAWPLVKNLPSY ILGIAIGSIIIAVCNVLYRNKLIKDGKLEIDEY >gi|261746407|gb|ADAD01000195.1| GENE 5 4198 - 6912 2492 904 aa, chain + ## HITS:1 COG:lin0424 KEGG:ns NR:ns ## COG: lin0424 COG0383 # Protein_GI_number: 16799501 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Listeria innocua # 13 842 8 832 875 603 40.0 1e-172 MSINKEKKYNFPVIPHTHWDREWYFTTSRSKIYSLNHFKEIFDVLENNDNFKHFLLDAQL 
SIIDDYLEFHPYDEERIKRLVSKGKLIIGPWYTQTDQMVISGESIIRNLYYGIERAKYFG NYMKTGYMPDSFGQSAQMPQILNGFDINYNTFKRGLSDRHYPKNEFYWEAPDGSKVFNMY LDRYGNFVYFTSEEDSLNNLMEKLKEETDRRSLVGTLTLYNGEDQRPIRKNLPEIIEKLN KLHPDSNFYISTIDEVMKKTENNGFHYDTVKGEMTAGQFSRVHKSIYSTRADLKIKNNKN ENYIVNIAEPLNTIAYKLGFQYENKVFEKVWKLMAENAAHDSIGMCNSDKTNKSIEYRCD TVKSLTENSVELKMREIGSSIPEKDIFQFQVYNLLPYERSGILKAEIFTPSIEFEICDTD DNVYEIEILKSEKLAERIKNKMKSEVGFNTNDNPRWIKENVEIYKTEVLIYIKNLHPMGY KTLFIREKSLKISKSSIEKNNVVENENLKVTVCENGSIIIKNKKENFERKGFLILENSGD EGDTYDYSEPYKDRIITSENSTIEVLEIKNNSLLHEIKYSLKMNIPYNLKMRKKNEDEVI NEFFVTLSLEKESFLLKVDIEVENKAIEHRTRVLFKTEIESKESIADQQFGIIKRPVYLP EVENWRENGWNEKPRTIEAMQSFVSLSDKCGTVSIMTDCVREYQVIGEKYDTIALTLFRS VPEMGKADLQDRPGRASGMADYLTPDARLLKKLNFNFAIFISKNEFSASDMANLSKEYLT PFQYYQAAEFKNVDIFFLMNKPEIQSTPFSYSLFSFENKDCVLSTVKKAEKENSLIIRIY NPDDIKATDFSMIYNEETDKADLVKFNEETIMGGIDFSFEKSGEKSRIIINEVKTCQALT IRMK >gi|261746407|gb|ADAD01000195.1| GENE 6 7042 - 7812 753 256 aa, chain - ## HITS:1 COG:no KEGG:THEYE_A1956 NR:ns ## KEGG: THEYE_A1956 # Name: not_defined # Def: hypothetical protein # Organism: T.yellowstonii # Pathway: not_defined # 15 250 5 224 235 210 50.0 6e-53 MTQREAVILTIEKLGGIATLGQLYQEVLKIKECIWRTKTPFASIRKIVQEYKVSEAKKEI YKIKPGLYSLIKFKKANEENGIFEETEKNKNSKEMEIFNHSYYQGLLLNLGNLRNFDTFC PNQDKNKKFIDKTSLGNLRTLNNIPNFSYPELVKRSSTIDVIWFNYGFMELRMPDSFFEI EHSTDIQNSLLKFNDLRNFYTRMIIIADKSREEEFKTKIEYQAFQNLKERVKFLNYDELV KQYEKTIEILNCNVIL >gi|261746407|gb|ADAD01000195.1| GENE 7 7964 - 8797 1164 277 aa, chain + ## HITS:1 COG:FN1783 KEGG:ns NR:ns ## COG: FN1783 COG4820 # Protein_GI_number: 19705088 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein, possible chaperonin # Organism: Fusobacterium nucleatum # 5 273 1 270 274 249 48.0 4e-66 MSKKITYDYCNKLVKEFEKVVKNPIKERSSFYYTGVDLGTSCVVIAVLDENKRPVAGAYR YASVVKDGMIVDYVGAVKIVRELKEEIEKKLGVTLTYAAAAIPPGTDKLDGGVVKNVVEA AGFELTDLSDESTAANEVLQIKNGVIVDVGGGTTGISVFKDGKVVYIADEPTGGTHFSLV IAGAYRMPFEEAELFKRNKKNYNEVFGLLVPVIEKVASIIETHIKDYKIEEICLVGGTTC 
LDEIEKIIENKTGIKTTKPKNPMFVTPLGIALSCKHK >gi|261746407|gb|ADAD01000195.1| GENE 8 8995 - 10287 1111 430 aa, chain + ## HITS:1 COG:lin2555 KEGG:ns NR:ns ## COG: lin2555 COG1508 # Protein_GI_number: 16801617 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog # Organism: Listeria innocua # 14 427 18 445 447 171 29.0 2e-42 MEIKQQLINKLVNKQAISQKQLYSLEILKMSEQELFEFLKKEKEENPLLEIENEESFFSS FWKSEKNDKAFEVKDETVASKNLIENIKLQINNQNFSYIENEVLIYILNSIDESGFLDLG IKEIRKKFDIDNKKFKYILELIKEAEPNGIGTKNTKDFFKYQLLKKRIKDKKLIILIDNY LEKLTNQKELCKELDIFEGKLSEYIGIIKGLKLKPIENINEKIEYIFPDIIVKKINNEWK VMLNEGFYKNIKINREYSNLLEKSEDQEMINYINKKIKRIDFIIKCIEQRRQTLYKIGNV ILEKQKKYFEKNEDIIPMKQKDIALELNVHESTISRAIKNKYLESLRGIVKIKSFFSIGY DKKNISISSLTIKKYISEIISNEKNGVIYSDEKIRKFINEKYEINISRRTVAKYRKELKI YNKNVRKISR >gi|261746407|gb|ADAD01000195.1| GENE 9 10307 - 13186 2903 959 aa, chain + ## HITS:1 COG:lin1832_1 KEGG:ns NR:ns ## COG: lin1832_1 COG1221 # Protein_GI_number: 16800899 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulators containing an AAA-type ATPase domain and a DNA-binding domain # Organism: Listeria innocua # 52 461 5 424 432 277 38.0 9e-74 MIEEIIKKEDKKSPLTDEEISEKLGISREIVTKIRNEKEIPNSRDRNIDVLKEEIENILK KEGKISNRKLTKRLNEIGYKIGKYAVNRIYQQIDSSSLSITEKEIKTFKRQEIKENVFKN FIGYDKSQKNNIEKLKAAVMYPPKGLHTLIYGESGVGKSYLAELAHSYAVMTDNFSDNAP YFEFNCADYADNPQLLVSQLFGYSKGAFTGAEEDKKGIVELCDNGILFLDEVHRLPSEGQ EILFSLIDKGKYRRLGETDTNRKSGIMIIAATTETPDSALLLTFRRRIPMSIKIPPLNER TLEEKFEFIKFFLHEESVRLKKKIRVKKEVIENFLEIIYPGNAGQLKSEIQVSCAKAFLE AKIENKEEIEIKKEFLIDISRRRKLEPEAIKLIKEEYIIYPNEKDKWDSNDNISNIKDMN IYQKIEEKYNELKEKGIETEKINDILEEEVKKEFFENILKFSNQDFNYRELQQIVGEKIL TSVMTAYEKAKMSFTDLNSKIIFPLSIHMKTSADRIKEGKQLPTSVSAGFKMSNKKETEV AKQMLAIINEKCYLNFPDSEAGFIAMYLKEFRKNEKYNEKIGLIVLSHGKVARGMVDVAN KVLNEDYAVGLEMDFSDTPQYMLEKTVNLVRQTDRGKGCIILADMGSLLNFEEKIIEKTG INVKIVGRVDTLMVIECLRKILYTSENIGEIVKEIDEKSSYYKSNNELKKKKIILCLCIT 
GEGAAQNLKEYLKERLKSVLEGVEILTKGYIEGENTEDIIKNISKEYEILAVIGTLETDR TINELNYISVEEAYKLAGIKKIREILKRKNIFSKNNLNEILNIDFIELTDLKYKENILDM MIGKMRQKGIVKDEFLLSVYKRESSVATYLNGGIAIPHGESTFVNKSCIFVTKLDKPVIW DGINMADIVLLLALKEDSKKYFEQLYKMISNESIVNSIRNAKTKEEILKILCKITEPVN >gi|261746407|gb|ADAD01000195.1| GENE 10 13250 - 13870 780 206 aa, chain + ## HITS:1 COG:BH3723 KEGG:ns NR:ns ## COG: BH3723 COG0800 # Protein_GI_number: 15616285 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Bacillus halodurans # 9 205 14 209 214 118 31.0 1e-26 MKIEEFPKITVILRGYNYDQVRLITELLVKYGIKSLEITMNTENAFEIIKKIKEEFGKKI FIGAGTVINMENTKEAVKAGAEFILSPIMLEKEILEYCKSKEVLTVPGAFSPTEITESLR NGADIVKIFPAERLGSKYISDITAPLGELPLMVVGGINKNNVNDYFKKGAKYAGIASGIF EKEDILNEDREKLENTLKSFINNLEI >gi|261746407|gb|ADAD01000195.1| GENE 11 13886 - 14350 644 154 aa, chain + ## HITS:1 COG:lin0503 KEGG:ns NR:ns ## COG: lin0503 COG1762 # Protein_GI_number: 16799578 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Listeria innocua # 5 140 4 139 155 90 36.0 9e-19 MSEQIKFSKELILRKKEETGQDEILLEMAELLKNKGFVKETYGEALLQREKEYPTGLETG ETNVAIPHVDIKHVNSAAIAVGILEKPIEFHKMEEPEKSVNVGIIIMLALDKPHAHIDML GKIIEMIKNKEVLKDIINEKNEENNYKIISKYLL >gi|261746407|gb|ADAD01000195.1| GENE 12 14360 - 14641 457 93 aa, chain + ## HITS:1 COG:BH0191 KEGG:ns NR:ns ## COG: BH0191 COG3414 # Protein_GI_number: 15612754 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIB component # Organism: Bacillus halodurans # 2 92 4 93 94 69 49.0 1e-12 MKKILVACGNGIATSTVVSSKIKEACEINGIQIMISQCKLLEVESKYKDYDLLVTTGKFT GGEVTIPTIGAISLLTGIGEEETMQEILEQLKN >gi|261746407|gb|ADAD01000195.1| GENE 13 14660 - 16015 1897 451 aa, chain + ## HITS:1 COG:SA0238 KEGG:ns NR:ns ## COG: SA0238 COG3775 # Protein_GI_number: 15925950 # Func_class: G Carbohydrate transport 
and metabolism # Function: Phosphotransferase system, galactitol-specific IIC component # Organism: Staphylococcus aureus N315 # 1 418 1 416 419 317 44.0 2e-86 MNFLYTVFDTLLKAGPTVLLPVIITVIGLIFGVRLSKAFRSGLTIAIGFIGIKLVINLLS SNLGPAAKAMVENFGIKLDTIDVGWGAISAVTWSSPIIAILIFAILLTNIVMLILKATDT LDVDIWNYHHLAIVGIMVFEVTHNLVYAIAASIVMAIITFKLSDWTAPLVEDYFGLPGVS LPTMSALSSVIIAAPLNALLDKIPGIDKINFSVKDAKKYLGFFGEPMIMGLILGGIIGVL AKYSITDICNLAVNMAAVMMLIPKMTSLFMEGLMPISEAAKKFTQEKFKGRKFLIGLDAA VVVGNPDVITTALIVIPLTILMSVVLPGNRMLPFADLAVVPFRVALVVALTRGNLFKNII IGLACTGAILLAGTATAPVLTKLAINIGLDLSASGGTYISSFAATSLTVSYLVYKVFTDN LFISVPILLISFSGIWYYLEKVKRIRKSSVI >gi|261746407|gb|ADAD01000195.1| GENE 14 16030 - 17175 1419 381 aa, chain + ## HITS:1 COG:SMb20510 KEGG:ns NR:ns ## COG: SMb20510 COG4948 # Protein_GI_number: 16264240 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Sinorhizobium meliloti # 1 381 1 382 382 466 56.0 1e-131 MKISNIIIYTVKPRWIFIKIVTDNGIEGWGEMVSGTKTETVVAGAKELSKRILGKNPLEI EKIWQELFRVFFRGGPINMTIISGIETALWDIKGKFYNAPIYELLGGSAREKIKVYSWIG GDRPADVVKDAQDRVDRGFDSVKMNATEELHYIDSFKKVDKVVERVAALRETFGNDLNIG VDFHGRVHKPMAKVLAKELEKYRPMFLEEVVLTENAEAFKEVAEHVVTPLATGERMYTRW GFKEILKSGYIDIIQPDVALAGGISEARKIIAMAEAYDIAAAPHAPYGPIALAATLQVDV CSPNVFIQEQSLGIHYNKGFDLLDFVENKEIFQYKDGFVDIPTKPGLGLIINEEKVKEVS LEGLNWSNPNWKNYDGTKAEW >gi|261746407|gb|ADAD01000195.1| GENE 15 17243 - 17713 611 156 aa, chain - ## HITS:1 COG:lin0937 KEGG:ns NR:ns ## COG: lin0937 COG0394 # Protein_GI_number: 16800007 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Listeria innocua # 1 149 1 151 152 145 50.0 2e-35 MVQVLFVCLGNICRSPMAEAVFRKMVDNEGLSRQIYIDSAATSSWEHGNPVHKGTRNRLA KEGISTVGMYSRILNDDDLSANYIIGMDQSNIENIEKFLKNRYSGRVQRLLEYAGEDRDI LDPWYTDDFDTTYKDVVKGCEALLSFIKKYDFNMEI >gi|261746407|gb|ADAD01000195.1| GENE 16 17739 - 19388 1921 549 aa, chain - ## HITS:1 COG:FN0873 
KEGG:ns NR:ns ## COG: FN0873 COG0616 # Protein_GI_number: 19704208 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Fusobacterium nucleatum # 59 548 3 492 494 545 61.0 1e-154 MFILNTIIQVLLTVALYLVIGIIIYKKLLKGKKKKEISKKEAKTVVFDVSDVKEDEIGSA IEINSKISFYDVLQGLKNLTTDKNIEKIIIDVDKLNLPLAKWEELSEIFDEIGKNKELVA IGTFFDERKYRYAIIASKVFMLNTRQSTVCFRGYEYKEPYWKSFLAKFGIKMNILHIGDY KVAGENYSHDKMSPEKKQSILNIKESLFRNFIKSVENKRGVNIENEILNGDFIFVGTDKA LESKLIDGVADYEEIGINYKEDTVSFEDYLSIYKEKKNKSKDTIAIINLEGIIEPKKSNK VNITYKNVCEKLDKLEDIKNLKGLVLRINSPGGSALESEKIHQKLKKLDVPIYISMGDVC ASGGYYIASAGKKIFADSMTLTGSIGVVLMYPELSETLNKIDVNIEGFEKGKGFDIFNIF ETLSEESKEKIIHTMNEVYSEFKSHVIAAREMSEEELEKIAGGRVWLGSEAVNINLIDEI GSLEKSVETMVKDLKLEKYKVEIIELKKSLKETLTDIKAPLISEELEDKIRFMQNNVNRI LYYENDFEL >gi|261746407|gb|ADAD01000195.1| GENE 17 19536 - 20267 549 243 aa, chain - ## HITS:1 COG:CAC0284 KEGG:ns NR:ns ## COG: CAC0284 COG0846 # Protein_GI_number: 15893576 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Clostridium acetobutylicum # 1 242 4 244 245 287 57.0 1e-77 MDKINQLQKIIDESKKTVFFGGAGVSTESGIPDFRSADGLYSIKINRHFSPEQLVSHTMY LKYPEEFYQFYKTHLIYPDAEPNFAHYYLAKLEQQEKLSAVITQNIDCLHEKAGSRKVLK LHGSVDKNTCIDCGKKYNLEEFLELYHNGIPHCPECNGIIKPDVTLYEETPDMSVFDEAI RHLSRADTLIIGGTSLVVYPAASLIQYFRGNNLILINKSETSQDNYADLVIHDRIGEVFK QLK >gi|261746407|gb|ADAD01000195.1| GENE 18 20281 - 21213 1040 310 aa, chain - ## HITS:1 COG:SP0764 KEGG:ns NR:ns ## COG: SP0764 COG0167 # Protein_GI_number: 15900658 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Streptococcus pneumoniae TIGR4 # 1 310 1 310 311 457 73.0 1e-128 MSSTKTTIGNYEFENCLMNAAGVYCYDRNELEQIINSDAGTFVTKSATLNPREGNPLPRY YDTKLGCINSMGLPNLGIDYYLDYLLELQKTHPDRTFFLSLTGLSSEEIHTLLNKVYESD FNGLTELNLSCPNVPGKPQMAYDLEATEKLLNKVFEYFKKPLGVKLPPYFDIVHFDEAAK 
VFNKFPLTFVNCINSIGNGLVINDEHVVIKPKQGFGGIGGEYIKPTALANVHAFYQRLNP SIQIIGTGGILTGRDVFEHILCGAGIVQIGTTLQKEGPEVFSRITRELEEIMDEKGYKTL DDFKGKLKYL >gi|261746407|gb|ADAD01000195.1| GENE 19 21312 - 22169 730 285 aa, chain - ## HITS:1 COG:BH2675 KEGG:ns NR:ns ## COG: BH2675 COG1737 # Protein_GI_number: 15615238 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 10 278 15 283 287 202 40.0 5e-52 MKEKNMKFTIKTIYNDLSKKEKLIADYILKNINKSTNLTIADLAENIGVVNSTIFSFTKK LGFKGFREFRDSLLRDQFNSNINVHEKISKNDSITSIIEKVFNSNIKSLEDTKKIAEKKS FSTALKILTQVDIVHFFGVGGSSVIAHDAYHKFLRSPLKCRFDSDFHLQLMQASLLTKND CAFIISHSGMTKESIEIADIAKERKAKIIVLTSYPLSILAKKADITFISTAEEIEYRSEA LSSRIAQLSILDTIFILVMLNNPTKTQISLNKIRTTIAKTKRKGE >gi|261746407|gb|ADAD01000195.1| GENE 20 22331 - 24007 1835 558 aa, chain + ## HITS:1 COG:VCA0686 KEGG:ns NR:ns ## COG: VCA0686 COG1178 # Protein_GI_number: 15601443 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Vibrio cholerae # 35 552 166 693 700 331 36.0 2e-90 MNKIIAKNNKFLKDPILLFTILLVILSLFLFIVLPLYEVFKQSIFDSVGKLTFEAYKGIF TSRVNIEAVKNTLILATTVGITSTFIGYIFAYTEAYIKMRGKKIFNLIALLPIISPPFAI SLSIILLFGSRGLITKELLKIQNFNIYGFQGLVMVQTLSFFPMAYLLLNGVLKTIDPSME EASENLGGTRKDTFFKVTFPLTKSAIMNSFLLVFIKSISDFGNPIAIGGDYSTLAVQIYQ QALGNYDMAGAAALSMVLLDISLILFILSKYYFDKKSYVTVTGKAAKKRELIEDKIIKLP LNIFCGILSFIIMILYILIPIGSFIKLWGINYSFTTEHYIYALNVGGKAILDTVIFSVMA APVTGILAIIISFLIVRKRFIGKNFIDFTSILGIAVPGTVIGIGYVLAFNTPPIVLTGTA TIIVIAFIARTIPVGIKSGINTLQQIDPAIEEAAQDLGANSFKVFTTITLPMIKSAFLGG MVYSFIKSMTSLSAVIFLISAKYNLLTIAVLDQVEVGKFGVASAFSTILILIVYVVITVM QKIVNRQKGGKTAEILLS >gi|261746407|gb|ADAD01000195.1| GENE 21 24026 - 25123 229 365 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 1 293 12 307 318 92 27 3e-18 MEIKGLKLKNISKEFKTDNSKGLFKAVDNVDIEINPGEMVTLLGPSGCGKTTILRMIAGF EKPTIGEIYIGEEDVTELPANKRDTAMVFQSYALFPHYTVTENIEYGLKIKKVGKEERKR 
RVDKILGLIGLTEFKDRYPGQLSGGQQQRVALARALVTEPAVLLFDEPLSNLDAKLRIHM RTEIRKIQKELKLTSVYVTHDQAEAMTLSDRVIIMNKGIIEQVGTPREIYQYPESEFVAD FIGEANFIDGKVTEISVKNENEKCVNIDILGNKKEVILKNSKINKGDNVRVVLRPEVISI GEGKKYEGIVLSSVFMGEVQEYIIKIEDKNLFAKVINPKGKKVYSVDDIVNFDFEGIDIH ILRKK >gi|261746407|gb|ADAD01000195.1| GENE 22 25140 - 26183 450 347 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167854980|ref|ZP_02477755.1| 50S ribosomal protein L13 [Haemophilus parasuis 29755] # 1 340 2 338 346 177 33 7e-44 KNMKKKMKYLGLLILFVVILSGCGNKDEKKEDKKDEPKELTMYIGVVEEQALKIAQEFEK DTGIKVNFVRMSGGEILSRIRAEKDAPKASIWYGGPSDSFVAAKKENLLQPYISGNAKII PENMKDKEGYWTGIYNEILGFVLDDRWFTEKNIAKPKTWDDLLKPEYKGQITVANPGASG TAFLFLSGFVQQRGEDKGLEYLAKLNENVKQYTKSGTAPAKSVILGESAIGITFIHNGLR YKEEGYDNVSVVVPEDGTWYSTAAVGIIAGAPELEAAKKFVDWALTKNAQEIGQKFKSYQ FPTNPEANIPDIVKPYQNFKMNEYNLEWSGEHRAELVEKWDKMLKSK >gi|261746407|gb|ADAD01000195.1| GENE 23 26204 - 27613 1763 469 aa, chain + ## HITS:1 COG:BS_gntZ KEGG:ns NR:ns ## COG: BS_gntZ COG0362 # Protein_GI_number: 16081060 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconate dehydrogenase # Organism: Bacillus subtilis # 6 468 5 466 468 475 50.0 1e-134 MNESILGVYGLGVMGSSIAQNLLKHGYNISVYSKDLQETEKFRLKNIENVKVFDKDNIDL FLKSLEKPKKVFLMITAGEVVDKVIEELLNYLEPGDIIIDGGNSFYKDTNRRYEYLKKKG LDYIGVGISGGEKGALNGPSMMPSGEADIYKKVENIFNDISAKSKDNKPCCTYIGKEGAG HYVKMVHNGIEYADIEIICEAYLIMKNICGLGIEKIQKIFEKWNKGPLKSYLIEITANIL KKKDEETGKYLVEVISDRAGQKGTGKWTAVEGLEMNVPIPSVVEAVFARSISDLKEKRIE AEKKLKLDKMSGNIGSEDIFINNLEKAVYLSKICSYAQGFDLLSNVSKKYGWNLNLEEIA LIWREGCIIKAEFLEEIASEFKNEKIENLMLGKKFSKELIDNHTKWRDVVCKAIKAGIYI PGMSSALEYFDGYRNSESSANLLQAQRDYFGAHTYERNDKKGHFHTEWE >gi|261746407|gb|ADAD01000195.1| GENE 24 27778 - 28383 469 201 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039545|ref|ZP_06012844.1| ## NR: gi|262039545|ref|ZP_06012844.1| hypothetical protein HMPREF0554_0052 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0052 [Leptotrichia goodfellowii F0264] # 1 201 1 201 201 309 100.0 1e-82 
MKKLILFFLFLFTITELSAVDKDLEIKVKNLFEKKMEAAKNRNNTQISDSFNINKIGTDM YSDPKLYEILSKIYKQMEYSVKSVKIKGNTAEIHLDFKIPDAGRQVAYSIVNLTDKSKTL KYKSLSGKDLEKAYYMDIYSDILNKLNNKTIMYENKKNMKVIIDKAGNWDNVNFDFDTVY GKQFVTSYLGGFYIFKERLGK >gi|261746407|gb|ADAD01000195.1| GENE 25 28411 - 28851 453 146 aa, chain - ## HITS:1 COG:no KEGG:Spico_0842 NR:ns ## KEGG: Spico_0842 # Name: not_defined # Def: Glyoxalase/bleomycin resistance protein/dioxygenase # Organism: S.coccoides # Pathway: not_defined # 5 141 5 142 151 156 54.0 2e-37 MNPKIDHIHITVKDINRAKKFYDKLLPIIGFDLSLKEYADVPEHEYKIIEYHNKNFSFGI VNERLQYAHEAVNRRKPGALHHLAFHAETKEEVDILYQKILEIPAAIVQPPRYYIEYCKD YYAFFFKDSEGIEYEIVNFQRNKYFY >gi|261746407|gb|ADAD01000195.1| GENE 26 29100 - 29330 104 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039541|ref|ZP_06012840.1| ## NR: gi|262039541|ref|ZP_06012840.1| hypothetical protein HMPREF0554_0054 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0054 [Leptotrichia goodfellowii F0264] # 1 76 1 76 76 126 100.0 6e-28 MELNTAIAGKYEGKRIFSFFGNLYILIGFMKWKKLNFDSFKRYKILKENREGKFVYIKFH TGEELAIYINNRFTDI >gi|261746407|gb|ADAD01000195.1| GENE 27 29485 - 30306 837 273 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039566|ref|ZP_06012865.1| ## NR: gi|262039566|ref|ZP_06012865.1| hypothetical protein HMPREF0554_0056 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0056 [Leptotrichia goodfellowii F0264] # 1 273 1 273 273 411 100.0 1e-113 MDIFKKYRKPENNPFEKWFLEKMKEDLTFVYEKIYKYKKKDIGRMLNYKGDIRYEKWLLS DNSEIKINSGFIFGYGYNKNTPDFTLIEDSMKNGTLNQKIIFYCIFSKNGNGYIRETTVK KLLSFNNLPEWSIPYILERCSDYVYNIVKIIYDYSTLENIEKYKRMFNLNPKRLALSYDK MISYWWEWNRYLSENEDLRKEKGLEIKYTDNTEYEDKEEKTYKKSVNYENYKGYKIFNEF FGYNDDFKYNEKSGKIELKKRIRRRKNKENQKL >gi|261746407|gb|ADAD01000195.1| GENE 28 30390 - 30473 98 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRVLIYKLLEENVVIEPVVKDETIWLI >gi|261746407|gb|ADAD01000195.1| GENE 29 30480 - 30623 199 47 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039556|ref|ZP_06012855.1| ## NR: 
gi|262039556|ref|ZP_06012855.1| protein RhuM [Leptotrichia goodfellowii F0264] protein RhuM [Leptotrichia goodfellowii F0264] # 1 47 1 47 47 74 100.0 3e-12 MGDLSEYSVDNTGYFDYIEDLIERENTFTMEKFISVNKFWHSDDTKF >gi|261746407|gb|ADAD01000195.1| GENE 30 30596 - 30706 133 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAFRRYEILKRKVKILSKQTKKKAKDNTTNLTRCRK >gi|261746407|gb|ADAD01000195.1| GENE 31 30926 - 32929 1371 667 aa, chain + ## HITS:1 COG:PA1939 KEGG:ns NR:ns ## COG: PA1939 COG3593 # Protein_GI_number: 15597135 # Func_class: L Replication, recombination and repair # Function: Predicted ATP-dependent endonuclease of the OLD family # Organism: Pseudomonas aeruginosa # 297 582 312 636 665 100 26.0 8e-21 MFLKSIKINNFRKFRKEGNKVEFTNSKNYQTNVNVASNTTLIVGKNNSGKTTIINALEKL INKNINVSDFNFHYLKEILEKGKIEIPSIEFNIIIGLEEETENKEIKNNDNLNNLLPFMT LKDLKDKEIEITIKYEIEEEEIFKEEIKKIKKIRNKESRFLKFLKILEEARYRFKYYRED KTEVENFDLGKLIEIKTIKANNIEGEKSLQVAFSKIIKYRYKELIENNEEKENIEKKFQE INNNLTEQINEKHGKILNKALNKIFSKKNLEVSLKADLSFDNLINYLLKYEYVDEDLFIP ENQFGLGYTNLMMIVANLIDYMEKYPETSFNSKINLICIEEPETYMHPQMQEMFIKYIND VISELLKSHKKNINSQLIISTHSSHILNSKIQFGGSFDSINYIISLDKKIEVIPLKDELI SFSGRREDEDFKFIKKHIKFKVSDLFFSDAVILVEGITEYNLLPYYVDKDENLKKYYISL FNINGDYALVYKKLLKLLKIPVLIFTDLDIERIEDEKKSFSQIESLENKTTTNPTLTKFN NSKNISNLPEYLSEENISIYFQRKINEYYPTSFEEALILTNFNNPLLNTVLREIKPQIYE EIVKSKFENNKHNSYKWQKKLKNEKSEFSNKILYNLIITDNNDLKLPNYMEDGLKDLLKR LEKNGRK >gi|261746407|gb|ADAD01000195.1| GENE 32 32929 - 34917 1840 662 aa, chain + ## HITS:1 COG:MA2140 KEGG:ns NR:ns ## COG: MA2140 COG0210 # Protein_GI_number: 20090983 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Methanosarcina acetivorans str.C2A # 19 330 13 307 612 78 28.0 4e-14 MLKNISKENIKQEKEILKNLFSNIDNYQSTLFNAGAGAGKTYALIECLKYIIKKYEKELK IHNRKIICITYTNVAAKEIKERLGNTELIKVSTIHERIWELIKIYQKELVEIHLENIKIE IKKIEEKIEIKKEYRDKILEIKGNFYSNENLKSKEYRKKMEQLGIPDGLLKNVNNFKILA KNIIKVENYKLCIQKIEKNEYKEITYDPLYNSDRLYKMRISHDTLLKYGYLIIEKYSLLR 
KLIIDQYPYIFIDEYQDTDEKIVKIMNLLDIESKRINNPVFIGYFGDYVQNIYENGVGRI ENDHLELKKINKIYNRRSYSEIIEVANKIRNDEVNQKSIYTDSFGGNVEFYKGSKEKLEE FLRKNKRIYNTTKNKPLHCLLLTNRMIASFAGFEKLYEFFSKSEYYSGVNYNQLNTEFLS NEAEKMGEIPRLLYRIMRFKNKIENSSTFLKEIIIIKELHERNIVQLRELINLFSLQKVE TFLEYIDLICKIYDETNNNDYKVIIGKIFDIKEISAQGIRSYLLESLYPNLEDKTEKIIE IDDILKENINEYENWFDYIEDNIKSDIIYHTYHGTKGLEFENVIVVMENNFGRDKDYFKR FFKYYNKRDKLDENDREKFEKARNLLYVSVTRAIKNLKIFYVDDISEFEDEIKKIFGEIK IF >gi|261746407|gb|ADAD01000195.1| GENE 33 34942 - 35520 409 192 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039547|ref|ZP_06012846.1| ## NR: gi|262039547|ref|ZP_06012846.1| M protein [Leptotrichia goodfellowii F0264] M protein [Leptotrichia goodfellowii F0264] # 1 192 1 192 192 267 100.0 3e-70 MKIFSFICKKEKELEEQLDNIIKKLEDVEDKYKKERKQIENIFNGNFINNRDNLYNRIDV KIVGYSQNKNGHNVFILLYNNNEILLYNNFYNKIEVMPNLYFKRQDMYDKNSQKIGEKIK IIGLFAEKDKNIGNGSILLKALIKFAKEEKIKKITGELPRGNKANNDKQLRKHFYEKFGF KINEEQIELLLD >gi|261746407|gb|ADAD01000195.1| GENE 34 35612 - 37192 1641 526 aa, chain + ## HITS:1 COG:NMA1038 KEGG:ns NR:ns ## COG: NMA1038 COG0286 # Protein_GI_number: 15793994 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Neisseria meningitidis Z2491 # 13 524 7 513 514 653 63.0 0 MENNSKESKEGIQRSELHRKIWAIADEVRGAVDGWDFKQYVLGILFYRFISENMVTFFNS AEHEAGDLEFDYSKISDEEAERDFRPNTVEDKGFFILPSQLFENVVKNAAKNENLNTDLA NIFKSIEASAIGFASENDIKGLFEDVDTTSNRLGGTVAEKNKRLTDILTGISEINFGKFE ENDIDAFGDAYEYLISNYASNAGKSGGEFFTPQTVSKLLARLVMEGKTSINKVYDPTCGS GSLLLQMKKQFEEHIIDEGFFGQEINMTNFNLARMNMFLHNINYNNFSIKRGDTLLNPLH SEEKPFDAIVSNPPYSIKWIGDGDPTLINDERFAPAGKLAPKSYADYAFIMHSLSYLSSK GRAAIVCFPGIFYRKGAEQTIRKYLVDNNFIDCVIQLPENLFFGTSIATCILVMAKNKTE NKVLFIDASKEFKKETNNNILEEKNIENIVEEFKNRSDKEYFSRYVDKSEIEENDYNLSV STYVEKEDTREIIDIKVLNKEIEETVIKIDALRASINEIVKELEDE >gi|261746407|gb|ADAD01000195.1| GENE 35 37185 - 38183 896 332 aa, chain + ## HITS:1 COG:NMA1039 KEGG:ns NR:ns ## COG: NMA1039 COG3943 # Protein_GI_number: 15793995 # Func_class: R General 
function prediction only # Function: Virulence protein # Organism: Neisseria meningitidis Z2491 # 5 332 3 330 336 328 53.0 7e-90 MSNEIIIYNTDDGKSEIELHLENNTVWLNQTELAELFQTSKQNISKHIKAIFDDGELYED STVNYKLTVQKEGNRNVSREVAFYNLDMILAIGYRVRSPRGIQFRNYASTVLKEYLIKGF AMNDERLKEFGGGTYFKELLERIRDIRSSEKVFYRQVLDLFATSIDYNSKSDEAKKFFAT VQNKIHYAIHHNTASELVYNRVGSEKEFMGLTSFKGNLPTKKEAETAKNYLSEKELRGLN QLVSGYLDFAERQAEREVQMTMKDWINHVDNILKATGEDLLEGNGKITRQKMKEKVSEEY KKFQQKTLSQVEKDYLKEIKEIEKRAKEKEKR >gi|261746407|gb|ADAD01000195.1| GENE 36 38243 - 38371 103 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLKFLLTIIYNLYIIYVSDKLNSTRGEQNPYNSTRGVRVFVY >gi|261746407|gb|ADAD01000195.1| GENE 37 38392 - 38523 106 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKQNFDSLKRKFKKHLEKLEMELSDNPEALKKIKISRLKLPLD >gi|261746407|gb|ADAD01000195.1| GENE 38 38606 - 39784 875 392 aa, chain + ## HITS:1 COG:HI0216 KEGG:ns NR:ns ## COG: HI0216 COG0732 # Protein_GI_number: 16272178 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Haemophilus influenzae # 2 389 6 384 385 292 44.0 9e-79 MTKIEELLKNEKVEWKKLGEVIDYEQPTKYIVNSTQYDDKFKTPVLTAGQTFILGYTNEI EGIYKASKEDPVIIFDDFTASNHWVDFEFKVKSSAMKILKPKNQFVNLRYCYHYIKTINF DVTEHKRIWISKYSQLEVPILSLEIQEKIVKILDKFTNYVTELQSELQSRTKQYNYYRDK LLSEQYLNKISEKIDKFEDKEYKLRVTTLGEIGEIKMCKRILKEQTSTKGTVPFYKIGTF GKKADSFISREIFEEYKKKYSYPKKGEVLISASGTIGRTVIFDGEDCYFQDSNIVWLSHN ESKVLNKYLYYYYQIVNWNPSSGGTIKRMYNYNLVNMKIFLPPIEIQDKVVKVLDKFQEL LKDTKGLLPQEIEQRQKQYKYYREKLLTFDEK >gi|261746407|gb|ADAD01000195.1| GENE 39 40169 - 40489 190 106 aa, chain + ## HITS:1 COG:XF2726 KEGG:ns NR:ns ## COG: XF2726 COG0732 # Protein_GI_number: 15839315 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Xylella fastidiosa 9a5c # 47 102 350 405 409 73 57.0 1e-13 MALIRINTNVALPKYIIYVLQSNEFKNSQINKWLEASSMKNLTMENIRKFKFLLPSLKVQ EYIVSILDKFDTLVNDIKNGLPKEIELRQKQYEYYREKLLDFPKEN >gi|261746407|gb|ADAD01000195.1| GENE 40 40543 - 43333 2684 930 aa, chain + ## HITS:1 COG:HI0218 
KEGG:ns NR:ns ## COG: HI0218 COG0610 # Protein_GI_number: 16273673 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Haemophilus influenzae # 13 929 7 942 1026 1017 59.0 0 MSEEVAKYNISAIAEMTNGIILANYKKDDRIQEPSFQTETQLENDMIKNLVSQGYEKLEA TSNEDLYKNLKVQIERLNKVTFTQEEWRRFLVEYLDSPNDSMTEKTRKIQENHIYDFIFD DGHLKNIKIIDKENIHNNILQVVHQINQKGKHNNRYDVTVLVNGLPLVHIELKKRGVNIH EAFNQIHRYSKESFNAENSLYKYVQIFVISNGTYTRYFANTTAQSKNNYEFTCEWADAKN KVIRDLEDFTKTFFERRTILEVLTKYCVFDASNTLLIMRPYQIAATERILWKIKSSYESK KAGKTEAGGFIWHTTGSGKTLTSFKTARLATELDYIDKVFFVVDRKDLDYQTMKEYQKFQ PDSVNGSKDTKELKRSIEKQDNKIVVTTIQKLNEFVKKNPNHEIYGKHCVLIYDECHRSQ FGDAQKNIRKAFKYYYQFGFTGTPIFPENSLGGDTTSGVFGAQLHSYVITDAIRDGKVLK FKVDYNNISAKFKSAETETDEKELKKLERKILLHPDRISEITKHILRVFDVKTHRNEMYS VKQRQLSGFNAMFAVQSVEAAKLYYEEFQKQQENLSEEKRLKIATIYSFSANEEQTAIGE IFEENFEPNALDSTAKEFLEKVINDYNGYFKTNFSTNGNEFQNYYKDLSKKVKNKEIDLL IVVGMFLTGFDAPTLNTLFVDKNLRYHGLIQAFSRTNRILNKVKTFGNIVCFRNLEKATQ DAIKTFGDENSVNIILEKSYEEYINGFKDEETGKTVKGYIDICKEIIEKFPEPTEIVLEA DKKEFVQLFGELLKSQNILRNFDEFENFESGISDRQMQDMKSVYVDIREEILNSKSHENN SNSSQIDFSDVEFQIDLLKTDEINLDYILTLILEKSKESEDIESLKSEIRRVIRSSLGTR AKEELIIEFINKTKLSELKNNSDILESFYK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:57:10 2011 Seq name: gi|261746405|gb|ADAD01000196.1| Leptotrichia goodfellowii F0264 contig00003, whole genome shotgun sequence Length of sequence - 427 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 426 688 ## COG3210 Large exoproteins involved in heme utilization or adhesion Predicted protein(s) >gi|261746405|gb|ADAD01000196.1| GENE 1 3 - 426 688 141 aa, chain - ## HITS:1 COG:FN1817 KEGG:ns NR:ns ## COG: FN1817 COG3210 # Protein_GI_number: 19705122 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 5 141 154 277 2806 72 36.0 3e-13 SDRVTLTTGSLQMKDGDLVAIDVSQGHIGIGEKGIDALSLTDLELLGKTIDIAGVIKASR ETRVMVSAGGQTYQYKTKEVKSKGEIYSGIAVDGKVAGSMYAGKIDIISNDKGAGVNTKG DLVSVDDVVLTANGDITTNKV Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:57:12 2011 Seq name: gi|261746403|gb|ADAD01000197.1| Leptotrichia goodfellowii F0264 contig00036, whole genome shotgun sequence Length of sequence - 6704 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 66 - 129 5.5 1 1 Tu 1 . 
- CDS 214 - 6513 6949 ## FN0254 hypothetical protein - Prom 6621 - 6680 3.6 Predicted protein(s) >gi|261746403|gb|ADAD01000197.1| GENE 1 214 - 6513 6949 2099 aa, chain - ## HITS:1 COG:no KEGG:FN0254 NR:ns ## KEGG: FN0254 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 426 2099 1 1677 1677 2205 79.0 0 MSNNLKKMEKDLRAFAKRSKDVKYTKGLLFSFLLMGMLTFSDTLTSPEVKSTENAINQTR KELNTSINDLHVAFKQAKRDNNRLLKRANLELIQLMEQGDHVVKSPWSSWQMGMNYFYSD WRGTYKGRGDKAEKYPYEGIFQRDEDEFNRYIAPNSENYGLLPTSTNPRSAASNQRTNLN GYGIASTQPVPEPPAVTEVNAGINPKSVNKVPLNIAAKTANSPTLPEAVKFAPIDPKIVI PADPDLPEAPKFSVVLGADCNEGCESSASTPRQNTKGNFLGSEENQSKQNINTILHYTWP NNTGAEKSYAFKMYKEGDTTNLPAPSSGNTYYFNSYNFGGTKEFANGVVASQGDNRNNQR FFIGGSRFWEIDNISSGTFEFPNGKKLNLGGILTLGLVSQENGTELKNSGTITDEEEKDE KWIKEMPYDPGKDYLTIKGPTEDYKVKRSGEGYVGYKVGIAQVQENGSNSYGGVNQKMTN SPTGKINFFGERSIGMYVYLPGSTTYAIMKNEGEINLRGSESYGMKIAAKSADRAEMVNT STGKITLGKNGKDSASNSIAMALTADNTVANGVSLKRGNARNEGTITLKDVQNSIGTYVN VDSNITNTSTGKININSTIAKLTEAEKNAGKKQAFNMGMRADSHAGAEIINNGEITIDGS YAIGMLAKGSKLVNTKTISSTNVKNGIGIVGMNGATVENRGTVKVVGTGNTNNIGVLINS GSTGTVGTMAPGASAPSVEVSGDNSTGVLVTGAGSSLTLKGNVKVSGNSITGIVADGTNV KLEDDATVTVDDNGSEAGEIDKKGSYGIVVRGSAGKFEGTGTTVNAKVKTDKSIGLYSEG NLTVKKANITATDGAINFFAKDGKINITGGGTTVTGQKSLLFYTSGTGKVTLGGAMTSTI KGGTTPNTRGTAFYYVSPTRYGTFDKTAIQNYFNNTFGNGTSTLNNLTLNMEQGSRLFVA SNVGMNLSDTAATNLMSGITNAPTINGSNYKTFMLYLSKLTINQPVDLNDSNSNYAQLEI ANSSIDNANQMTGSQNRQVAMAQENGNDTSGNGYNPGEVTLTNTATGTINLTGEETTGIY AKRGIIRNEGTISVGKKSTAIYLVEDDRGPADGSKAANDATGTIKLGENSTGIFYKVNSA GTHTGTAGGVSNAGRIESTANNVTGISIDNPHSTARIFRNESSGVIDLQGQESTGIFATG TGTYEASNEGKINLASSTNINKPNIGMYTDKSSITLNNNGTITGGNKTVGIYGYNVNLGS TSVAKTGTGGTIVYSKGGNVVANGKLSVGEDGAEGANDAVGVYYVGQGGTVTSNASEINI GNSAYGFVIQNEHGAGVTLTTNTANVTLKNDSVYSYSNNRAGRVINNTVLNSTGNGNYGI YSAGTVTNNRDINFGTGIGNVGMYSIFGGTGTNNATITVGASDTAREKFGIGMAAGYKTR DSANIINSPTGVINVNGKNSIGMYATGASSTAINKGTINLRGENSVGMYLDNGATGINEG TITTVGSPKGAKGVVLSNNSKLINRAGAKININSAEGFGIFRVNSEETNITIANYGDITV SGGATASGKFDPTGGKELEKTAGGVTLKAPKGTTDVNITSKGKPVPNVEKITDPIGHRGN 
ALISRLGMYVDTLRGTNPINGLEKLNVKKAELLYGVEAAQNSNSKYFEVSGKILKPYQDA VKKATGMKWSHNSASFTWMALPSVDGNGVPLKVGMAKIPYTEFAGKKPTPVEVTDTYNFL DGLEQRYDKNALESREKLLFNKLNGIGKNEAVLFYQAVDEMMGHQYANVQQRIYGTGRMI DKEISHLSKEWDTKSKQSNKIKLFGMRDEYNTDTAGVIDYTSNAYGIAYLHEDETVKLGN RSGWYAGAVYNRFKFKDIGRSEENQTMLKLGIFKTMSPVADHNGSLQWTISGEGYVSRND MHRKYLVVDEIFNANSDYTTYGVAVKNELGYNIRTSERTSIRPYGSLKLEYGRFNTIKEK TGEVRLEVKGNDYYSVKPEVGVEFTYRQPMAVRTTFTAGLGLGYETELGQVGNVKNKGRV AYTDADWFNIRGEKDDRRGNFKADLNLGIENQRFGVTLNAGYDTKGKNIRGGLGLRVIY Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:57:40 2011 Seq name: gi|261746399|gb|ADAD01000198.1| Leptotrichia goodfellowii F0264 contig00094, whole genome shotgun sequence Length of sequence - 3775 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 409 608 ## COG1778 Low specificity phosphatase (HAD superfamily) 2 1 Op 2 . + CDS 419 - 2956 3242 ## COG1461 Predicted kinase related to dihydroxyacetone kinase 3 1 Op 3 . 
+ CDS 2997 - 3725 732 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase Predicted protein(s) >gi|261746399|gb|ADAD01000198.1| GENE 1 2 - 409 608 135 aa, chain + ## HITS:1 COG:FN0213 KEGG:ns NR:ns ## COG: FN0213 COG1778 # Protein_GI_number: 19703558 # Func_class: R General function prediction only # Function: Low specificity phosphatase (HAD superfamily) # Organism: Fusobacterium nucleatum # 1 132 32 163 168 132 51.0 2e-31 FNVKDGYIIVNSQKVGIDFGIITGRESELVKIRSEELKIKYLYQGISDKTLILEEIMEKT LLKKDEIAYMGDDLNDLNIMKKVGLKGAPQDAVPEVKEVADFVSSKNGGDGAVREFTEFI LKKDGIWEKFLKNLK >gi|261746399|gb|ADAD01000198.1| GENE 2 419 - 2956 3242 845 aa, chain + ## HITS:1 COG:FN1927_1 KEGG:ns NR:ns ## COG: FN1927_1 COG1461 # Protein_GI_number: 19705232 # Func_class: R General function prediction only # Function: Predicted kinase related to dihydroxyacetone kinase # Organism: Fusobacterium nucleatum # 14 558 5 560 560 436 45.0 1e-122 MTKKNNCLGGIMGIKYLDAKRLRVILMGGGKWVIKHEDILNELNVYPVPDGDTGSNMAMT LNSMVTELENNTNEKTSMEDLIGVVEEAVLMGARGNSGTILSQVITGFLKEIGNKIKLLP VDVAQAFESAKKTAYGAVSEPVEGTMLTVIRKIADKAMEIAPSMDDLVIFLKEITEEANR AVEETPELLPKLKEAGVVDAGGKGLFFLFEGFYKVATELNLLAELQKAQVKENEFDKTIA NIDHDPESIKFQYCTEYIILNGDFDVEEYKRRVLELGDSAVFAQTSKKFKTHIHTNHPGK AFEIALEYGPLEKMKVENMKLQHDNLQIFSEKDEAKIFKNNKINKTENAYIILADSENLK DEFLKEGADVVILGGQSKNPSVQEILSAIDKVEKTNVYVLPNNKNVITTAKMAAEKSGKN TIVYETKTMLEGYYCLKNRQDGIEEIKNNAKRNYSIEITKAVRDTKVDELTIEKDNYIGL VNGKIKYTNSSLKELTEEILNSLITPNTVKVVAVEGNTKDEEVKQFILDKLGNIKVTYIN GNQENYYYYIYIENKDPNMPEIAILTDSVSDLSKEDIEGLPVKIVPLKIDINGEIFKDGE EISKTEFWKDMTEKELEIKTSQPSPQEFLNAYNRLFEKGYKKIISIHPSAKFSGTLQAAR VGRSLTNRENDIELIDSMGASLLEGFLVIEAAKKSIKRESYGEIINWVNSFKNKGKLLIV VPDLKYLEKGGRIGKASSVIAGAFQLKPILTVSQGEITVEKKVLGERNAQKYIEKYVKDE SKKQSLIVFTGWGGGPNELESIVKIHSEIGESSRISFPILNRQVGAVIGAHAGPVYGVFV FPRLS >gi|261746399|gb|ADAD01000198.1| GENE 3 2997 - 3725 732 242 aa, chain + ## HITS:1 COG:FN1921 KEGG:ns NR:ns ## COG: FN1921 COG0340 # Protein_GI_number: 19705226 # Func_class: H Coenzyme transport and metabolism # Function: 
Biotin-(acetyl-CoA carboxylase) ligase # Organism: Fusobacterium nucleatum # 1 210 1 203 234 122 36.0 5e-28 MKFTVFEEIDSTNDYLRKNHKIEEFEVIIAKRQTAGRGKRGRVWISDEGAALFSFSVKDK ENLQEKITIFSGYTVYEILKKYIDENLKNDDFSENLKFKWPNDIYYQDKKICGILCEKIR ENIIIGIGININNNDFGIFQERAVSLSEICGIKFSVEEIIKEIVFLFEKEFTNLNKEWER ILEKINEKSYLKDKIIKIKKENKLGEKIYRFLRIDRAGKICLIGKGDTEESKFESLDFEV IL Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:57:44 2011 Seq name: gi|261746386|gb|ADAD01000199.1| Leptotrichia goodfellowii F0264 contig00183, whole genome shotgun sequence Length of sequence - 11927 bp Number of predicted genes - 13, with homology - 12 Number of transcription units - 7, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 323 373 ## COG0640 Predicted transcriptional regulators - Prom 356 - 415 8.7 + Prom 336 - 395 10.4 2 2 Tu 1 . + CDS 545 - 784 145 ## SMU.1237c hypothetical protein - Term 961 - 996 -0.2 3 3 Op 1 . - CDS 1000 - 1254 327 ## COG2026 Cytotoxic translational repressor of toxin-antitoxin stability system 4 3 Op 2 . - CDS 1254 - 1469 384 ## SSUBM407_1018 hypothetical protein - Prom 1499 - 1558 6.6 - Term 1554 - 1600 5.5 5 4 Op 1 12/0.000 - CDS 1629 - 2177 640 ## COG0602 Organic radical activating enzymes 6 4 Op 2 . - CDS 2209 - 4332 3040 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase - Prom 4355 - 4414 4.5 7 4 Op 3 . - CDS 4425 - 4544 76 ## gi|262039589|ref|ZP_06012884.1| hypothetical protein HMPREF0554_2280 - Prom 4601 - 4660 4.7 + Prom 4852 - 4911 10.5 8 5 Op 1 21/0.000 + CDS 4944 - 5996 1380 ## COG1840 ABC-type Fe3+ transport system, periplasmic component 9 5 Op 2 8/0.000 + CDS 6028 - 7710 1812 ## COG1178 ABC-type Fe3+ transport system, permease component 10 5 Op 3 . + CDS 7726 - 8796 1367 ## COG3839 ABC-type sugar transport systems, ATPase components + Term 8798 - 8861 3.1 + Prom 8831 - 8890 9.8 11 6 Op 1 . 
+ CDS 8926 - 10470 1572 ## COG0519 GMP synthase, PP-ATPase domain/subunit + Prom 10488 - 10547 4.3 12 6 Op 2 . + CDS 10597 - 11649 678 ## Vpar_1397 hypothetical protein - Term 11653 - 11701 2.0 13 7 Tu 1 . - CDS 11828 - 11926 96 ## Predicted protein(s) >gi|261746386|gb|ADAD01000199.1| GENE 1 3 - 323 373 106 aa, chain - ## HITS:1 COG:BS_yczG KEGG:ns NR:ns ## COG: BS_yczG COG0640 # Protein_GI_number: 16077456 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 6 98 6 98 104 93 47.0 9e-20 MDKLLHPNIEDVTLEGLLHALSDPIRLEIFRRLYSSEEEGKCGCFVDLGKKNNLSHHFKV LRESGVIKVRIDGRNRFISTRKDEINKKFPGLLSSIYNGCEENKLL >gi|261746386|gb|ADAD01000199.1| GENE 2 545 - 784 145 79 aa, chain + ## HITS:1 COG:no KEGG:SMU.1237c NR:ns ## KEGG: SMU.1237c # Name: not_defined # Def: hypothetical protein # Organism: S.mutans # Pathway: not_defined # 6 77 44 115 116 75 47.0 6e-13 MPEIKGKDNIISFFKTFFGRNEEAKHLWTTVISEDKLKQVNWGVVCKRKNGELFTLTGTD YFKIKNNKIVYLEVVSNKK >gi|261746386|gb|ADAD01000199.1| GENE 3 1000 - 1254 327 84 aa, chain - ## HITS:1 COG:SP1223 KEGG:ns NR:ns ## COG: SP1223 COG2026 # Protein_GI_number: 15901085 # Func_class: J Translation, ribosomal structure and biogenesis; D Cell cycle control, cell division, chromosome partitioning # Function: Cytotoxic translational repressor of toxin-antitoxin stability system # Organism: Streptococcus pneumoniae TIGR4 # 1 83 1 84 84 103 63.0 9e-23 MYKLVPTPYFAKQFKKLDKFTQKQIKSYLENIVNNPRAKGKMLIANRSGQWRYRIGSYRV IVNIQDEELIILALEVGHRKEIYK >gi|261746386|gb|ADAD01000199.1| GENE 4 1254 - 1469 384 71 aa, chain - ## HITS:1 COG:no KEGG:SSUBM407_1018 NR:ns ## KEGG: SSUBM407_1018 # Name: not_defined # Def: hypothetical protein # Organism: S.suis_BM407 # Pathway: not_defined # 1 71 1 71 83 71 63.0 1e-11 MAVITLKISEVEKKFLQSMAKFEGKTLSELIREKTLNTLEDEYDAKVSDIRLAEYEEYLA SGGEVLKWDDL >gi|261746386|gb|ADAD01000199.1| GENE 5 1629 - 2177 640 182 aa, chain - ## HITS:1 COG:VCA0512 KEGG:ns NR:ns ## COG: VCA0512 COG0602 # 
Protein_GI_number: 15601272 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Vibrio cholerae # 22 159 11 150 155 116 41.0 3e-26 MDSKDKKKDFTLRILKTFKETIVDGVGFRYSLYFAGCIHKCPGCHNEKSWNPDNGELVSY EMLQEIADEINENSILDGITISGGDPLFNPVDMLKVLKFLKEKTGKNIWLYTGYTLENIK LDKDRSKCLEYIDVLVDGPFIKQLYAPDLEFRGSSNQRIIKKSEFEKYSCGKNMQPQNMY VQ >gi|261746386|gb|ADAD01000199.1| GENE 6 2209 - 4332 3040 707 aa, chain - ## HITS:1 COG:CAC1209 KEGG:ns NR:ns ## COG: CAC1209 COG1328 # Protein_GI_number: 15894492 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Clostridium acetobutylicum # 2 705 93 689 699 365 34.0 1e-100 MQISGTLKNIIDGIVTVEKNDINNENANMSSMTPAGQMMRFASEVSKMYALENLISPKFA EAHQKGEIHIHDLDYYPSKTTTCLQYDLADMFEHGFQTKHGYIREAKSISTYATLATIIF QTNQNEQHGGQAIPAFDFYMAKGVLKSFRRHFRYRILSFLSLDYVDETNKKVKAFINENI HTILPDDSVAGKISEHFNIEKAQIKRLLEIAYDDTRLETYQAMEGFLHNLNTMHSRGGNQ VVFSSINYGTDFSEEGRMVIRELLKATEDGLGKRETPIFPIQIFKVKEGLSYSEEDYKFA MENIDKIDDMLNGKIKFKTPNFDLLLLACRTTSRRLFPNFLFLDTSFNSHERWDINDPMK YKYEVATMGCRTRVFENLNGEKSSLGRGNLSFTSINFPRIAIETRKKIENIISTMTFDSE QEKEDKKNEMLKKEFQSKVKEKTYLVAEQLLERYRFQQTALAKQFPFMRANNLWKGMGEV DQNSELYDALNTGSLSIGFVGGANAMYALFDAEHGSSEIAYNTLYETVEMMNEIAEELRK THKLNYSILATPAESLAGRFLKIDRKEFGEIENVTDKDYYVNSFHIDVRNKVGVFEKIKK EAPFHKLTSGGHITYVELDGEARKNIGVILKIVKTMKDSGIGYGSINHPVDRCRTCGTET IIDDKCPVCGSHDISKIRRITGYLTGDLDCWNSAKKAEEKDRVKHGL >gi|261746386|gb|ADAD01000199.1| GENE 7 4425 - 4544 76 39 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039589|ref|ZP_06012884.1| ## NR: gi|262039589|ref|ZP_06012884.1| hypothetical protein HMPREF0554_2280 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2280 [Leptotrichia goodfellowii F0264] # 1 39 1 39 39 63 100.0 3e-09 MLEMGNVFGKIDSGLFNIPKFLNLKNTKYCDIFKKRNNI >gi|261746386|gb|ADAD01000199.1| GENE 8 4944 - 5996 1380 350 aa, chain + ## HITS:1 COG:FN0308 KEGG:ns NR:ns ## COG: FN0308 COG1840 # 
Protein_GI_number: 19703653 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 348 1 351 352 536 82.0 1e-152 MKKKFLLLSLALVMLILTSCGKKGDSAASDGKNAEIKGKIVIYTSMYEDIIDNVKEKLKK EFPNLEVEFFQGGTGTLQSKIVAEMQANKIGADMLMVAEPSYSLELKEKGVLHAYLSKNA ENLALEYDKEGYWYPVRILNMVLAYNPDKYKKEDLALTFDDFAKKESLKGKISIPDPLKS GTALAAVSALTDKYGEGYFESLSKQNAVVESGSVAVTKLETGEAAEIMILEESILKKRQE EGSKLEVIYPTDGIIAIPSTIMTIKEDMSPNKNIKAAEALTDWFLSPAGQEAIVEGWMHS VLKNPEKAPFDALSTPEILKAAMPINWDKTYHDRENLRQIFEKHITKAKK >gi|261746386|gb|ADAD01000199.1| GENE 9 6028 - 7710 1812 560 aa, chain + ## HITS:1 COG:FN0309 KEGG:ns NR:ns ## COG: FN0309 COG1178 # Protein_GI_number: 19703654 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Fusobacterium nucleatum # 1 560 1 560 560 844 86.0 0 MTGRKKYNIDIKWIVILAVVAFLVIFEVLPLSYLLIRSLFPKGSFSFESFKRVYTYDLNW TALINTLVISGLTTIFGVVLAFPLAFLVGRTDMYGKKFFRTLFVTTYMVPPYVGAMAWLR LLNPNAGILNKFLMQVFSLSKAPFNIYTVGGIVWVLTCFYYPYAFITISRAMEKMDPSLE EASKISGASPLKTLMTITIPMMTPSIIAAGLLVFVASASAFGIPSIIGAPGQIYTVTTRI IDFVHIGSEEGLNDAMVLAVFLMVIANIVLYITTFVIGKRQYITMSGKSTRPNIVELGKW RLPITIIISVFSFFVVILPFITVALTSFTVNMGKPLGLSNMSLSAWEKVFSRASILSSTK NSIIAGLAAAFFGIMISCIMAYLLQRTNVKGKRIPDFLITLGSGTPSVTIALALIISMSG KFKINIYNTLTIMIIAYMIKYMLMGMRTVVSAMSQVHPSLEEAAQISGADWLRMLKDVTL PLIGASIVAGFFLIFMPSFYELTMSTLLYSSQTKTIGYELYIYQTYHSQQVASALATAIL IFVIAVNYLLSKLTKGQFSI >gi|261746386|gb|ADAD01000199.1| GENE 10 7726 - 8796 1367 356 aa, chain + ## HITS:1 COG:FN0310 KEGG:ns NR:ns ## COG: FN0310 COG3839 # Protein_GI_number: 19703655 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Fusobacterium nucleatum # 1 356 1 356 356 592 83.0 1e-169 MASVTITGVTKSFGNVQVLQEFNQKFEDGEFITLLGPSGCGKTTMLRLIAGFEKPSSGEI YIGDKIVSGKDSFVSPEKREIGMVFQSYAVWPHMNVYNNIAYPLKIKKASKSEIEEKVNK 
ALKIVHLEQYKDRFPSELSGGQQQRVALGRALVAQPEILLLDEPLSNLDAKLREEMRYEI KEITKKLKITVIYVTHDQIEAMTMSDRIVLINKGEIQQIGTPQEIYSRPNNIFVANFVGK VDFIKGKVQDGNILLNGSDDQTLPNKSDLNGNVIVAIRPENVILSDDGEIKGKVFSKFYL GDCNDLRVEVGNGNILRVTARASTYDTLKVGEEVRLKVLDYFIFEDDGEDKTKIMT >gi|261746386|gb|ADAD01000199.1| GENE 11 8926 - 10470 1572 514 aa, chain + ## HITS:1 COG:FN1444_2 KEGG:ns NR:ns ## COG: FN1444_2 COG0519 # Protein_GI_number: 19704776 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Fusobacterium nucleatum # 194 514 1 318 318 479 71.0 1e-135 MKEKILIVDFGSQYSQLIARRIREMEVYCEIVPAIDIDNIKSGKEKVRGIIFSGGPASVY EKDSPTVDKAVFDLGIPILGICYGMQLITHLNGGKVEKAESREFGKAILEVIDGKNPLLL GIPKTSSIWMSHNDHITVLPDGFEIIAETDSSIAAITNNNGIYALQFHPEVVHSECGTEI ISNFIFNICKCEKNWKIAGFIEEKAKSIKETIGNEHVLLALSGGVDSSVAAVLINNAIGK QLTCMFVDTGLLRKDEGKKVLEYYKEHFDLNIVFVDAKDRFLNKLKGVDEPEAKRKIIGN EFIQVFNEEIRKLKGKEGAKFLAQGTIYPDVIESQSIKGPSHTIKSHHNVGGLPEDLQFE LLEPLKELFKDEVRKVGHELGLPDTIINRHPFPGPGLGIRVIGEVTADKVKILQEADDIF INELMAQGLYSKVDQAFVTLLPVKTVGVMGDQRTYEYVAAIRSVNTMDFMTATWSKLPYE FLEDVSNKIINKVNGINRIVYDISSKPPGTIEWE >gi|261746386|gb|ADAD01000199.1| GENE 12 10597 - 11649 678 350 aa, chain + ## HITS:1 COG:no KEGG:Vpar_1397 NR:ns ## KEGG: Vpar_1397 # Name: not_defined # Def: hypothetical protein # Organism: V.parvula # Pathway: not_defined # 12 345 1 328 332 229 40.0 2e-58 MQNSKNKYGGFMTNIWRLQTNTASSDGQSIAPFLIEKRIAAIGWSLLDKDLERLSKGDSR KFEEIKNKRNLIKNFNDYENFYNKYKKQLYKDINCVRNLANSVKSGDLIWIRDRGIYFLG KVKENSKYNYCYEEEYLNKDAANYINSIDWIKIGDESSIPGSLTTAFIRGRTLQKIHKKG IPEFSKLEYNRLIDEKNIKDKKYNFDNFENDCETFFNYLSPSDCEDLVCMYLFYKKGYIC IPSTNKNSTELYECVLLNYKTGKTAYIQVKNGEINLETKNYEHLINDNNEVYILTTKGKV TFSKEQHKNYIKEITSENLYHFVQNKDLKIQNIIPKNIKKWINFLSKKSS >gi|261746386|gb|ADAD01000199.1| GENE 13 11828 - 11926 96 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no KKEREAKIDKLVNEEKLKENSKRFIKKSIDKR Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:58:16 2011 Seq name: gi|261746347|gb|ADAD01000200.1| Leptotrichia 
goodfellowii F0264 contig00009, whole genome shotgun sequence Length of sequence - 42291 bp Number of predicted genes - 39, with homology - 37 Number of transcription units - 13, operones - 7 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 266 - 748 642 ## COG1525 Micrococcal nuclease (thermonuclease) homologs - Prom 784 - 843 4.9 2 2 Op 1 . - CDS 856 - 1992 1713 ## COG0626 Cystathionine beta-lyases/cystathionine gamma-synthases 3 2 Op 2 . - CDS 2020 - 2448 377 ## gi|262039606|ref|ZP_06012900.1| conserved hypothetical protein - Prom 2615 - 2674 6.4 - Term 2462 - 2494 2.4 4 3 Tu 1 . - CDS 2676 - 3854 1563 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain 5 4 Tu 1 . - CDS 3955 - 4827 988 ## COG2017 Galactose mutarotase and related enzymes - Prom 4850 - 4909 6.6 - Term 4876 - 4908 3.2 6 5 Op 1 29/0.000 - CDS 4932 - 6764 2934 ## COG0443 Molecular chaperone - Prom 6799 - 6858 8.2 7 5 Op 2 21/0.000 - CDS 7003 - 7614 960 ## COG0576 Molecular chaperone GrpE (heat shock protein) 8 5 Op 3 . - CDS 7640 - 8647 1281 ## COG1420 Transcriptional regulator of heat shock gene 9 5 Op 4 . - CDS 8720 - 8842 83 ## 10 5 Op 5 . - CDS 8897 - 9535 637 ## COG0637 Predicted phosphatase/phosphohexomutase 11 5 Op 6 . - CDS 9522 - 9743 223 ## CDR20291_2973 putative glycosyl hydrolase 12 5 Op 7 . - CDS 9794 - 11779 2094 ## COG1554 Trehalose and maltose hydrolases (possible phosphorylases) 13 5 Op 8 38/0.000 - CDS 11776 - 12600 938 ## COG0395 ABC-type sugar transport system, permease component 14 5 Op 9 35/0.000 - CDS 12614 - 13423 625 ## COG1175 ABC-type sugar transport systems, permease components - Prom 13454 - 13513 4.5 15 5 Op 10 . - CDS 13516 - 14697 1438 ## COG1653 ABC-type sugar transport system, periplasmic component 16 5 Op 11 . 
- CDS 14720 - 15007 364 ## Rahaq_0135 conserved hypothetical protein CHP00022 - Prom 15044 - 15103 5.5 17 6 Op 1 1/0.000 - CDS 15161 - 18118 2558 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 18140 - 18199 7.6 18 6 Op 2 . - CDS 18318 - 19298 1059 ## COG1609 Transcriptional regulators 19 6 Op 3 . - CDS 19302 - 19694 494 ## COG0692 Uracil DNA glycosylase 20 6 Op 4 . - CDS 19651 - 19968 263 ## COG0692 Uracil DNA glycosylase 21 6 Op 5 . - CDS 19979 - 20827 1177 ## COG1912 Uncharacterized conserved protein - Prom 21052 - 21111 11.9 22 7 Op 1 34/0.000 - CDS 21376 - 22209 967 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 23 7 Op 2 3/0.000 - CDS 22209 - 23912 2042 ## COG1122 ABC-type cobalt transport system, ATPase component 24 7 Op 3 . - CDS 23985 - 24530 991 ## COG4720 Predicted membrane protein - Prom 24727 - 24786 10.8 + Prom 24508 - 24567 13.0 25 8 Op 1 . + CDS 24698 - 25174 260 ## Lebu_1940 hypothetical protein 26 8 Op 2 . + CDS 25212 - 26297 1494 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase + Term 26316 - 26365 8.4 27 9 Tu 1 . - CDS 26405 - 28042 1622 ## FN1654 hypothetical protein - Prom 28076 - 28135 2.7 28 10 Tu 1 . - CDS 28148 - 28228 65 ## - Prom 28269 - 28328 8.0 - Term 28235 - 28276 -0.9 29 11 Op 1 . - CDS 28411 - 28848 624 ## Lebu_1622 hypothetical protein 30 11 Op 2 . - CDS 28848 - 29525 791 ## Lebu_1623 MafB1 31 11 Op 3 . - CDS 29539 - 31074 2140 ## COG1109 Phosphomannomutase 32 11 Op 4 1/0.000 - CDS 31113 - 32084 1030 ## COG0470 ATPase involved in DNA replication 33 11 Op 5 . - CDS 32095 - 33126 1528 ## COG1077 Actin-like ATPase involved in cell morphogenesis 34 11 Op 6 . - CDS 33159 - 34643 1904 ## COG0215 Cysteinyl-tRNA synthetase - Prom 34849 - 34908 8.5 - Term 34888 - 34939 4.1 35 12 Tu 1 . - CDS 35019 - 37358 3036 ## COG1193 Mismatch repair ATPase (MutS family) - Prom 37389 - 37448 13.0 + Prom 37375 - 37434 12.2 36 13 Op 1 . 
+ CDS 37497 - 38195 238 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 37 13 Op 2 . + CDS 38192 - 39031 698 ## DSY1434 hypothetical protein 38 13 Op 3 . + CDS 39046 - 41835 3856 ## COG2352 Phosphoenolpyruvate carboxylase 39 13 Op 4 . + CDS 41920 - 42183 390 ## PROTEIN SUPPORTED gi|229882702|ref|ZP_04502176.1| LSU ribosomal protein L28P Predicted protein(s) >gi|261746347|gb|ADAD01000200.1| GENE 1 266 - 748 642 160 aa, chain - ## HITS:1 COG:HI1296 KEGG:ns NR:ns ## COG: HI1296 COG1525 # Protein_GI_number: 16273209 # Func_class: L Replication, recombination and repair # Function: Micrococcal nuclease (thermonuclease) homologs # Organism: Haemophilus influenzae # 5 159 11 159 178 99 40.0 3e-21 MKKILILIVMLFLSMPIFSENYRVLYISDGDTIAVKKTENGKVTGKLIKVRLFGIDAPEL KQDYGYESKQALMNFIKNKDVKIEGKKKDRYGRLIGVVYLNNENINEKMVKTGNAWWYEQ YDRKNLKMKQYQESAQKSRIGLFAKKGYVTPWEFRKRKRR >gi|261746347|gb|ADAD01000200.1| GENE 2 856 - 1992 1713 378 aa, chain - ## HITS:1 COG:CAC0391 KEGG:ns NR:ns ## COG: CAC0391 COG0626 # Protein_GI_number: 15893682 # Func_class: E Amino acid transport and metabolism # Function: Cystathionine beta-lyases/cystathionine gamma-synthases # Organism: Clostridium acetobutylicum # 1 377 1 377 377 388 51.0 1e-108 MKFETKAIHGIRKDDGKHSGWKSTINMASTAQIENFGEEQEFEYARVSNPTRSELEKIMA KLENGKHAFAFSSGMAAITSLFTKFKAGDHIVLGTDIYGGTYRIMTDIFGKYNLEYTFVD TTDLQNIEKAVKDNTVAIFIETPSNPLLDVTDIKGVVEIAKKYELLTITDNTFMTPYLQR PLDLGINIVVHSATKFLSGHHDILAGMVIVNSDELAEEVWFAQKAIGAILSPFDSWLLMR SLKTLKVRIEAAQKNTLKLIEFFKNHKAVGKVYFPTEENNKGKKIHESQATGGGAVFSFV LKDENKVKPFFDNLKVALSAASLGGVETLVTHPHTITHAEMPEDEKNARGITKSLIRVAV GIEHIDDLIEDFKNALEK >gi|261746347|gb|ADAD01000200.1| GENE 3 2020 - 2448 377 142 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039606|ref|ZP_06012900.1| ## NR: gi|262039606|ref|ZP_06012900.1| conserved hypothetical protein [Leptotrichia goodfellowii F0264] conserved hypothetical protein [Leptotrichia goodfellowii F0264] # 1 142 1 142 142 200 100.0 3e-50 
MEFKFLTENFYEKYENYKEVEKKKNRPYTVVYIIEYNNLLFAIPLRHNINHRYKISTVDN KGLDLSKVLVITDRKYIDNKKVFINDEEYKLLKNKERKIVSELQKYIKLYKKALKKPEIK RNKSLIEKSCLQYFHKELGIEK >gi|261746347|gb|ADAD01000200.1| GENE 4 2676 - 3854 1563 392 aa, chain - ## HITS:1 COG:STM0013 KEGG:ns NR:ns ## COG: STM0013 COG0484 # Protein_GI_number: 16763403 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Salmonella typhimurium LT2 # 1 388 1 374 379 332 49.0 1e-90 MAKKDYYEILGVPKNATDQEIKKAYRTMAKKYHPDMNKDNKEAEAKFKEVQEANEVLSDP QKRAAYDQYGHAAFENGGQGGFGQGGFGGFGGFSSEGFGNFGGFEDINLGDIFGDFFGGG SSRRSQGPRVKEGADLRYNMTLTLEEVAFGVEKEIKYKRKGKCKTCNGSGAEPGYNMKTC DNCNGSGQIRMQQRSIFGIQTVIHECDKCHGTGKIPEKECHTCHGTGIEKETVERKVRIP SGVEKGQRLVVRDGGDAGENNGLFGDLYIFIDVKEHPIFKRKGYDIYCKVPISMTTAILG GEVEVPTLEGKRTIKISEGTQSGKELKLRDKGIRTSNGTGSEIIEIKIETPINLTEKQKR ILKEFEDSLNKKNYKESNSFLDKMKKFFKGEE >gi|261746347|gb|ADAD01000200.1| GENE 5 3955 - 4827 988 290 aa, chain - ## HITS:1 COG:CAC3032 KEGG:ns NR:ns ## COG: CAC3032 COG2017 # Protein_GI_number: 15896283 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Clostridium acetobutylicum # 5 287 5 295 298 205 37.0 9e-53 MENTLKYGNVEIEVLSTGAELSSYKVNGNEFIWERNPEFWPNSSPVLFPFVGILKDGKYI FKNKEYSMTTRHGIARYEDFDLVEKGENFLKFKFSSDEETLKKYPFEFDFFMTYTIIDNS LEIKYEVINKNNEDMYFSLGAHPAFALEINENIKLEDYFLEFEKEETAQIYQLRTDALIL NEKKDYLKNEKIIKLNKNIFDNDAIIFKDLNSTKVALKCNKNERKLTMDYGKFPFIAFWS KPAAPFVCIEPWFGISDFANCSGKLEEKTGILKLDKNKSFTAKLLLTGSL >gi|261746347|gb|ADAD01000200.1| GENE 6 4932 - 6764 2934 610 aa, chain - ## HITS:1 COG:FN0116 KEGG:ns NR:ns ## COG: FN0116 COG0443 # Protein_GI_number: 19703464 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Fusobacterium nucleatum # 1 610 1 607 607 838 78.0 0 MSKIIGIDLGTTNSCVAVMEGGSFAIIPNSDGGRTTPSVVNIKENGEIIVGEIAKRQAIT 
NPDSTVISIKTNMGSDYKVDINGKSYTPQEISAMILKKLKKDAEAYLGEEVKEAVITVPA YFTDAQRQATKDAGEIAGLTVKRIINEPTAAALSYGLDKKKEEKVLVFDLGGGTFDVSIL EIGDGVVEVISTSGNNHLGGDNFDQKIIDWLADEFKKETGIDLRNDKMAIQRLKDAAEDA KKKLSTTLETSISLPFITMDATGPKHLEKKLTRAAFDELTKDLVEATKGPVKQALDDANL SPNEINEILLVGGSTRIPAVQEWVKSYFGKEPNKGINPDEVVAAGAAIQGGILMGDVKDV LLLDVTPLSLGIETLGGVFTKIIDRNTTVPVKKSQVFSTAADNQPAVSIVVLQGERAKAA DNHRLGEFNLEGIPAAPRGIPQIEVTFDIDANGIVHVSAKDLGTGKENQVTISGSSNLSK DDIEKMKKDAEAHEAEDAKFKELVETRNQADQLVLATEKTISENADKLQGTEKEDIEKAI EELKKVKDGDDLEAIRKGVEELSKVSQGFATRMYQEATQKAQAEQAANGNTSGNTSSDSN DDVQDAEVVD >gi|261746347|gb|ADAD01000200.1| GENE 7 7003 - 7614 960 203 aa, chain - ## HITS:1 COG:FN0114 KEGG:ns NR:ns ## COG: FN0114 COG0576 # Protein_GI_number: 19703462 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone GrpE (heat shock protein) # Organism: Fusobacterium nucleatum # 56 203 53 199 199 117 58.0 1e-26 MSEKDYEKEIADESTENFQEENEKETNENVNNSDEEKAPENGDSDKKEEADSPEMKIKKL ELELQEWKNSYTRKLAEFQNFTKRKEAEVSEMKKYASENIIVKLLDNIDNLERAMDASKE SKNFDSLVEGVNMILNNLKYLLKEEGVEEIETENKKFDPYEHQAMMTEQKEELENDDIVQ VFQKGYKLKGKVIRPAMVTVNKK >gi|261746347|gb|ADAD01000200.1| GENE 8 7640 - 8647 1281 335 aa, chain - ## HITS:1 COG:FN0113 KEGG:ns NR:ns ## COG: FN0113 COG1420 # Protein_GI_number: 19703461 # Func_class: K Transcription # Function: Transcriptional regulator of heat shock gene # Organism: Fusobacterium nucleatum # 1 333 14 341 351 244 43.0 1e-64 MNDRERLILKAIIKHYLEFGESVGSRTLEKKYEIGVSSATIRNTMADLEDKGLIVKTHTS SGRIPTSEGYRIYVEELIKIRDISTEAKAKVIEAYNKKMSQIDKIFEETSRLLSKISQYA GVVLEPAIRQENVKKVKLIHINDTGILAVAVMDSFLTKSFNIFLENPMSEEEVEKINIQL NEKIKNSPEAFTLSDLGEFFMNMDLLMPEELKEEQDENETKLFFEGGTNLLESNVSDVMK VIDRVKLFNNPNDMKQIFSQFIQTDQYKDGEVNVIFGEDLDIAGLEDFSFVFSVYTMGNA RGIMGVIGPKRMEYSKTVGLVEYVSEEVKQLLNKK >gi|261746347|gb|ADAD01000200.1| GENE 9 8720 - 8842 83 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVYKKISKFIFLKSKTKVLSNPIFSMSKKIKKNYKKSLTL >gi|261746347|gb|ADAD01000200.1| GENE 10 8897 - 9535 637 212 aa, 
chain - ## HITS:1 COG:CAC2614 KEGG:ns NR:ns ## COG: CAC2614 COG0637 # Protein_GI_number: 15895872 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Clostridium acetobutylicum # 5 212 4 203 215 153 43.0 2e-37 MGMFKGYIFDLDGVVCHTDKFHYSAWKKISDDLGIKFSENINNLLRGVSRKESLEIILSF SDKKISDAEKEKIIKSKNDLYVKSLEKINKDFLDDGVEEVLRNLKEKNKKVALASSSKNA KLILNKLEIISYFTIIIDGNNITYSKPNPEIFEKAVSSLGLDKSDCLVIEDADSGIKAAK IAGIKVCGLGNNFTEEVNYKLNHIKELFKLID >gi|261746347|gb|ADAD01000200.1| GENE 11 9522 - 9743 223 73 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_2973 NR:ns ## KEGG: CDR20291_2973 # Name: not_defined # Def: putative glycosyl hydrolase # Organism: C.difficile_R20291 # Pathway: not_defined # 1 55 701 755 796 67 50.0 2e-10 MHAASAGGVWQSLILGFAGMSIEKGELQFSPKLPKKWKEIEFSIIHKSKINKVNITSNNK VKIKEKGMINGNV >gi|261746347|gb|ADAD01000200.1| GENE 12 9794 - 11779 2094 661 aa, chain - ## HITS:1 COG:all1058_1 KEGG:ns NR:ns ## COG: all1058_1 COG1554 # Protein_GI_number: 17228553 # Func_class: G Carbohydrate transport and metabolism # Function: Trehalose and maltose hydrolases (possible phosphorylases) # Organism: Nostoc sp. 
PCC 7120 # 18 661 14 657 798 298 29.0 2e-80 MIRDRFSDNIKKYLSEDEWIVVEDGYNSLENLNYETIFAMASGKMGLRATHPEGWVKKTL PANYMHGVFDRSEAFQRELCNLPNFNILKIYYKTEPISVEYGAKTEDYIRVLDLKNGILA KHYINESYDGRKTLVETLQFLSRKHPNSGLLKYFITPLNYSGKMEFENIIDGSVTNFYDF PRFRVKHMNIKEVGKFSNNGIYINSETRDFKLQVTTSSKVKIIDKKTQFQIKSHGEYAIE FFDCDFVENETLIIEKYSCVRNGNETDNTKLSSEKELEEIYVAGFENELEEHIMIYNEMW EMADIKIKGDPESEKAVKFEIFHLMSTPNPESSMTNIGAKMIHGEEYGGHAFWDTELFIL PYFIWTFPKIAKNLVEYRYNLLDGARRNARKYGYKGARFPWESADTGDEECPDWTIEGDG TCYECVVAKQEIHVTSAVVYGGYQYYKVTEDKDYFYNKFLEILAETSRYFIDRLEYSKEK DCYELTNVMGPDEWHENVSNNVYTNYIVKWHLNLANELLMNHMTNEIVKNMLNKINMSYE EVINFKKISEKIYLPLDDKIIEQFDGYFNLKDIEIYQWDKNNMPLLPKELKEIPREQTTI NKQADVVMVMFLFPENFSEEVQRKNFDFYEKRTLQRSSLSPSIFSIVGNRVGSGDRAYDY F >gi|261746347|gb|ADAD01000200.1| GENE 13 11776 - 12600 938 274 aa, chain - ## HITS:1 COG:SMb20969 KEGG:ns NR:ns ## COG: SMb20969 COG0395 # Protein_GI_number: 16264842 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Sinorhizobium meliloti # 8 274 3 270 270 263 48.0 2e-70 MRLSLKTKFVIYFLCVVFSIYFLFPFIWMILSALKTDVEVFSYPPKFFPEKFNFKNFYDA WHSQKFSVFLVNSLIITAFTTLGQVISGSLAAYGFARYKFKGDKILFSLMLATMMLPWDV TVIPQYMQFNILGWIDTLKPLIIPALFGSAYYIYLLRQFLETLPSDFAEAAKIDGANEFQ IYRMIYLPLMKPSIILVAVLNIIVVWNDYLGPLVFTNSSDKYTLALGLAAFKGVHSDAII YTMCISIIMCIPPIIIFFMAQKNIIEGISGGVKG >gi|261746347|gb|ADAD01000200.1| GENE 14 12614 - 13423 625 269 aa, chain - ## HITS:1 COG:SMb20970 KEGG:ns NR:ns ## COG: SMb20970 COG1175 # Protein_GI_number: 16264843 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Sinorhizobium meliloti # 1 263 43 305 319 239 49.0 6e-63 MGPLIMSLIMSFFNWPVIGERVFTGFSNYIELFTKDTQFYKSLFITFKFTVIFVPVNIVL SLLLALLISQDLKGMSFFRIIFYLPTVVSSVAISIIWGWILNGEYGILNYFLSILKITGP DWLNSEKYSIFALIIASGWTVGVMMLVFYTALKSIPYELYESAIIDGANNIVIFFKITLP LITPTLLFNLVTAIIGALQNLALVILLTNGGPLNSTYMYGLFVYNNAFKKSRLGYASASA WVMFIFILIITGVIFKSSKKWVYNYNNKE 
>gi|261746347|gb|ADAD01000200.1| GENE 15 13516 - 14697 1438 393 aa, chain - ## HITS:1 COG:SMb20971 KEGG:ns NR:ns ## COG: SMb20971 COG1653 # Protein_GI_number: 16264844 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 29 374 29 385 410 276 38.0 5e-74 MFKWKRILLVVIFAIILLSCGKKDNRVKIRFATWDNAETLEFQKKMVEKFNKSQDKIQVL IEAYGENYDTKITASMGSSDAPDVMYMWNFPKYSEGLLPLDEMIKKEGDSYKNNFYEALW NYNSVKGNVYGLPVGYTTHVLYYNKDLFDKAGIEYPNNNWTWEDVEKAAAKLTNKTEKIY GFAFSQKPDPYDYEMFAWSNGGSFNPDMILDDKSIEPFKFFQKMIREGNAISTEDGGEKS FLLGKIGMFINGAWSLQKLQDKKLNFGVEVLPKFGSESSQSIISSSGVSISKTTKHPEEA WEFIKYWTSEEMNKNRLNYELPVLKSVAKSENLTEDKIKGKFYKMLEQSQKNMPTSFVIS NWSDIGEKIGLALEKIFNKNSLVEPKEALFEVK >gi|261746347|gb|ADAD01000200.1| GENE 16 14720 - 15007 364 95 aa, chain - ## HITS:1 COG:no KEGG:Rahaq_0135 NR:ns ## KEGG: Rahaq_0135 # Name: not_defined # Def: conserved hypothetical protein CHP00022 # Organism: Rahnella_Y9602 # Pathway: Galactose metabolism [PATH:rah00052]; Other glycan degradation [PATH:rah00511]; Metabolic pathways [PATH:rah01100] # 24 95 72 143 150 62 40.0 8e-09 MKTNLNMSGRSLKDIKDIWIYTIFLEGESEIEINLKKNLELYKQYEDITDTEYYTGKGEK INLKKGCILVAEINEAIKFLKDENTQKIVVKLTVE >gi|261746347|gb|ADAD01000200.1| GENE 17 15161 - 18118 2558 985 aa, chain - ## HITS:1 COG:ebgA KEGG:ns NR:ns ## COG: ebgA COG3250 # Protein_GI_number: 16130971 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Escherichia coli K12 # 1 980 13 1009 1042 1004 48.0 0 MNKWEDFNILSENTMEPRNHFFRYENLEKAKTFNRYYSENFLLLNGIWKFKYFKNPHEVL EEYHSQITADREIQVPNMWQFEGYGQPHYTDEGFPFPIDFPYVPTDNPTGLYQRYFYLSK EFKNKDIILKAEGIESYYELYINGHYVGLNKGSRLITEFDITDFVNISKENLISIKVLQW SDSTYLEDQDMWWISGIFRDIYIYSKNRFCIEKYVVKTEMKNDYKDSDLIIDLELNTEKF LGKNLEIEFTLYNEKEKIFSEKIVPKKKNIKLKKSIKKIIEWNPENPYLYDLFIILRKDK KIIEVIPQKVGFRYIEVKKGLMYLNGKYFMMHGVNRHDNDHISGRTVGFERMEKDIILMK QNNINSVRTAHYPNSPRFYELCDKYGLMVLSEMDIETHGFVNTDDFNMLINNEKYKKMFV 
SRVVRQVISQINHPSIIIWSLGNESGYGVNVTHMTNEIKKVDTTRLIHYEEDRYAEDVDI ISTMYSRVQMMDNFGRFPSDKPRIICEYAHAMGNGPGGLKEYQEVFYKYPSIQGHFVWEW CDHGIYDKERDIYKYGGDYEDYPNNLNFCMDGLIFSNQKEGPGLKEYKQVISPVLIKKIS DIEYKVINRYWFSNLKDIKIYYEIFNGKKLTECNELENIENGSIIKFPRKIVNASIINFK VYKKNKTDYSAENHLLSVYQFQLEEKKQTYEKDNKIPEFHKITENNNEIVITGKNFSIKF SKLNGKLISYNYSGLELIKKPGSINLYKPVIDNHKKENLKWWLPFYFSVIQEHFRKMKIV KEKNKVKISVNSILAPPVYDFGLNCEYEYEIYGNGQINIKLKGKKYGKFLGIIPKIGFEI GLNKDLQNVKYYGRGENENYTDSKEASIIGFYKTTVDNMFINYPYPQDNGNHQDTKELYL TNHFGQGMSVKSENGLNFSAWNYTKENIDKAKHPDELIKSDFITLNLDYKILGLGSNSWG SEVLETYRVYLEDFEYGFSINTYND >gi|261746347|gb|ADAD01000200.1| GENE 18 18318 - 19298 1059 326 aa, chain - ## HITS:1 COG:ebgR KEGG:ns NR:ns ## COG: ebgR COG1609 # Protein_GI_number: 16130970 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 322 1 327 327 167 31.0 2e-41 MITLSEISKIAGVSISTVSRVLSEDETLKVTDEIRNKIIKIADENQYKRKRKKSKNLNEY EEVMILKNYDEETEIEDAYYYYLRTKLENKLKKNNIKIVVERYSKKINLKKNNILIGSYP EKILSDLSKKRCNLILCDSYSDKENIDCVTFNFKNSVYKVLDYIINRGHSNIGFIGGKDS EEKQDFREKYFTKYMKKFNMYKKENIYIENFDFKTGYNKTKEILSLEEYPTALFVATDEI ALGCYKAAKEFNKKIPEDISIIGFNNLDMSEYMIPSLTSVQVYMDNIINETVNLLKERMI HNKNYTKKILLETKIIERDSVKQLLL >gi|261746347|gb|ADAD01000200.1| GENE 19 19302 - 19694 494 130 aa, chain - ## HITS:1 COG:FN1226 KEGG:ns NR:ns ## COG: FN1226 COG0692 # Protein_GI_number: 19704561 # Func_class: L Replication, recombination and repair # Function: Uracil DNA glycosylase # Organism: Fusobacterium nucleatum # 1 130 92 221 226 182 65.0 1e-46 MYKEISEEYGYEMSKNGYLVPWAKQGVLLLNTALTVIADNANSHSKIGWEIFTDNVIEYL NEREEPLIFILWGNNARSKKRLINTGKHYILEAAHPSPLSASRGFFGCGHFKKANEILKE LGKEEIDWRM >gi|261746347|gb|ADAD01000200.1| GENE 20 19651 - 19968 263 105 aa, chain - ## HITS:1 COG:lin1190 KEGG:ns NR:ns ## COG: lin1190 COG0692 # Protein_GI_number: 16800259 # Func_class: L Replication, recombination and repair # Function: Uracil DNA glycosylase # Organism: Listeria innocua # 1 84 1 84 224 117 59.0 7e-27 
MVNIGNDWDEVFKGEFEKEYYQNLRKFLINEYRTKRIYPKADEIFTAFKLTSYKDCKIVL LGQDPYHGVNQAHGLAFSVKEGIKTSSVITEYVQGNKRRIRLRNE >gi|261746347|gb|ADAD01000200.1| GENE 21 19979 - 20827 1177 282 aa, chain - ## HITS:1 COG:SP0481 KEGG:ns NR:ns ## COG: SP0481 COG1912 # Protein_GI_number: 15900396 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pneumoniae TIGR4 # 2 280 12 289 291 323 57.0 2e-88 MNNMLVLQTDFGLQDGAVAAMYGVSLGVNPDLKIYNLTHEIETYNIWDASYRLVQTLQYW PEGTVFVSVVDPGVGSSRKSIVAKTKDKKYIVTPDNGTLTHIHSNIGIEEIREIDETKNR LPYSGESYTFHGRDIYAFTGARLASGVISYEEVGPKISLDEMVCLNTTEVTVDEDGMIKG TIDILDIRFGNLWTNIPKEIFDKVGINYGDTLELNIKHDTRNVFNSSVKFVKSFAEVHVG RTLLYVNSLNKIGVAINQGSFSRAHQVESGYQWKVELKKISK >gi|261746347|gb|ADAD01000200.1| GENE 22 21376 - 22209 967 277 aa, chain - ## HITS:1 COG:SP0484 KEGG:ns NR:ns ## COG: SP0484 COG0619 # Protein_GI_number: 15900399 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Streptococcus pneumoniae TIGR4 # 6 277 5 276 276 255 48.0 6e-68 MAGNFLLEYIEKDSVVHRMNGAAKLISFLLWTTAIMLTYDTRFLIFLTVFGFFLFKVSKI RFKEVKIVFYLISVFLLLNLMMIFLFSPLEGTKIYGTQHDIFRIAGRYVLTWEQLFYEFN IFVKYISVIPVALLFIVTTHPSEFASSVNRIGVPYKFSYAVSIALRYIPTVQEDFMTINK AQQAKGVDVSKKVGFIKRIKNVSYTLMPLIFSSIEKIDVISNAMLLRGFGKKNKRTWYCE KKMKSRDIFGILLTLVFVIISALLIKINNGRFYNPFR >gi|261746347|gb|ADAD01000200.1| GENE 23 22209 - 23912 2042 567 aa, chain - ## HITS:1 COG:SP0483 KEGG:ns NR:ns ## COG: SP0483 COG1122 # Protein_GI_number: 15900398 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, ATPase component # Organism: Streptococcus pneumoniae TIGR4 # 6 566 2 555 560 545 50.0 1e-155 MEISGKKEAVSFENFTFQYESQSEPTLYDINLKIYEGEKVLIIGPSGSGKSTLGHCINGL APYFYKGETKGKVEIYNKEIKGIFGHSEYVGTVLQDSDAQFVGLSVAEDIAFALENNNVK TDMMKKQVHEIADFVKIGDLLTMKPHDLSGGQKQKVSLAGIMVEDTKIVLYDEPLANLDP LSGKYAIELIDELHCRKNLTTVIIEHRLEDVLHRKVDRVVVVDKGKIIVDDTPDNILKSD 
ILKKINIREPLYISALKYSGVNLNEYAEISDIEKMNLSDVKDNINNWIKYNPPKEKENNS ETILKLENIDFSYNEKRKILKNINIDIKEGEMVSIVGSNGAGKSTLSKVIAGFERQDIGK IFYKGEEITDESITDRAEKIGFVLQNPNAMISKVTVFDEVALGLKIRGVSPEEIEKRVFK ALDICKLKPFRNWPIKVLSYGQKKRVTIASILALEPKIIIVDEPTAGQDLFHYKEIMEFL KQLNDYGITILFITHDMHLMLEYTDKAYAFNDGEIIKEGKPSEVLADKKVLEEANLKETS LHYLAEAVGANPQDLIKTFVYYEKADK >gi|261746347|gb|ADAD01000200.1| GENE 24 23985 - 24530 991 181 aa, chain - ## HITS:1 COG:SA2477 KEGG:ns NR:ns ## COG: SA2477 COG4720 # Protein_GI_number: 15928271 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Staphylococcus aureus N315 # 3 180 5 184 184 154 45.0 1e-37 MNETGIKKVVAAGIGAAIYIVLARFATIPSPIPNTMIQITFAFLALMAFIYGPVTGLAIG FIGHTINDITAYGSPWFSWIIVTGFFGFSMGVLGKYIGLEDFNLTKIVKFIIGDIVISAI GWGILAPTLDIVLYKEPASKVFAQGMVAGTSNALTVAVLGTILILGYSKTLVNKGSLRKE D >gi|261746347|gb|ADAD01000200.1| GENE 25 24698 - 25174 260 158 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1940 NR:ns ## KEGG: Lebu_1940 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 155 1 150 155 107 48.0 2e-22 MFGKIKMYILSEIIHNVITNISNQFSKNDYSSLFSESNFVNFEIFFSIYNKNKKILLNKS DNFFKFLDKQISNLDFDLSFFEKIDKKYISILSKKFEQLCSNDCIDFMTNDIRKKVIEKF KNIINCMSLLYTVFTSILLNYSSFSSNIRIVRAPPRRC >gi|261746347|gb|ADAD01000200.1| GENE 26 25212 - 26297 1494 361 aa, chain + ## HITS:1 COG:FN0664 KEGG:ns NR:ns ## COG: FN0664 COG2070 # Protein_GI_number: 19703999 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Fusobacterium nucleatum # 1 361 1 361 382 568 72.0 1e-162 MKEFKGIKIGKHYIEKPLVQGGMGVGLSWDQLAGNVSKNGCLGTISAICTGYYQNMRFAK KLVNGRPFGTENTYSREALFEIFKNARKICGDKPLACNVLRAINDYERVVTDALDAGADI IVTGAGLPLELPRLTKDYPGVAIVPIVSSARALKVICKKWKSEGRLPDAVIVEGPKSGGH QGAKYDELFSPEHQLEYVLPLVKEERDKWGDFPIIAAGGIWDSDDIRKMMELGADAAQMG TRFVGTYECDASMELKKVLLNAKEEDIAIVSSPVGYPGRAIRTDLIKNLVPDDKRIKCIS NCVFPCGRGQGARRVGYCIADSLGDAYLGRLQTGLFFSGANGYRLKEIVHVKDLIDELMS N >gi|261746347|gb|ADAD01000200.1| GENE 27 
26405 - 28042 1622 545 aa, chain - ## HITS:1 COG:no KEGG:FN1654 NR:ns ## KEGG: FN1654 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 2 543 30 569 571 499 52.0 1e-139 MKTLPIGKSDFKEIIEYNSYYLDKTLLIEELLKENFKVQLFTRPRRFGKTLNMSMLKYFF DIENAEENKKLFKSLAIEKSEYFKEQGQYPVVFISFKDLEANNWEEMIIRIKNLLSEVFS NYKHLVKGLDEFDLPIFKKIINEELDISNLKSVLRFLTKILYEKYDKKVVLLIDEYDKPI ISAYEHGYYNKAISFFKSFYSLTLKDNSYLRTGVMTGILQVAKEGIFSGLNNLMVHNILE NRYTTYFGLTEEEVKNTLTDYDLKENISDVKKWYNGYKFGNSEIYNPWSILNYLYGKELK AYWINTSSNELIYEVLEKSEEDVFNELRELFEDKSIQKTINASFNFQDIRNLRGIWQLFV YSGYLKIDESLGNNIYSLKIPNYEVKSFFEESFINHFLGNQDSFREMLIGLKKKDIREFE RKLQQVFKLSISYNDIGKEEKYYHNLLLGMILAMQNEYNIDSNREDGYGRYDIFIEPKNK LDTGFILEFKVAESEEELEKKSLEAIEQVKEKEYFTSMKNNGIKDILVLGIVFYKKKIKV SYEKI >gi|261746347|gb|ADAD01000200.1| GENE 28 28148 - 28228 65 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYSNFNFLQQDWGALAKIGEMREYIL >gi|261746347|gb|ADAD01000200.1| GENE 29 28411 - 28848 624 145 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1622 NR:ns ## KEGG: Lebu_1622 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 145 1 138 138 115 51.0 7e-25 MKIIFGYYDDEDFGLTEYPEIIQSEKNEEVPDEEYEQEKEKVEQYRKYLVTGIMEDFQYP ESCDEVLIAIKDIENGKSFGTEWDGQAFQHEITPHYVQFVHTIFGDSEEYPVWTCRLKEY KKVLHTWKSFLELPKDLESRIEVSI >gi|261746347|gb|ADAD01000200.1| GENE 30 28848 - 29525 791 225 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1623 NR:ns ## KEGG: Lebu_1623 # Name: not_defined # Def: MafB1 # Organism: L.buccalis # Pathway: not_defined # 58 222 40 198 200 200 73.0 3e-50 MDIKKIFKKIVEILLIVILCVAVTRFFINKEQKQKKDTAVSQESSGKEKSDKNYAQYKNN KKSNKKKKKEYNTETNNNNQNSDSEKTVSSGNRKYKIDYDHIMGGDINSSGEKVTGGHTL LRGDVRILKKIGSPASNGVYRATVEIKRPDGKWQKKTSNGGVNTMFPENWDEARVIDEIN SAWENRKELKGRDSNMWQGISKSGVIIRGYKSPRITAYPIYEGER >gi|261746347|gb|ADAD01000200.1| GENE 31 29539 - 31074 2140 511 aa, chain - ## HITS:1 COG:lin1985 KEGG:ns NR:ns ## COG: lin1985 COG1109 # Protein_GI_number: 16801051 # Func_class: G Carbohydrate transport and 
metabolism # Function: Phosphomannomutase # Organism: Listeria innocua # 3 498 9 495 503 420 45.0 1e-117 MELKHLVSGTDIRGIVSEFEGKEINLSEKEVKFIALGFSRWIKRKYERKADEENRKIKVS VGYDARLTGPKFAEIIREELKQEGIDVYDCKMSITPSLFMSTVFKNYKADGAIMITASHL PSYYNGIKFFTAEGGFEKSDVLDMLEMAGRRKCQCEQNLKKAMGIKDKKGRSSEKNLAED YADYLCKFIIKETGGEENPLKGLKVVIDAGNGAAGFFADKVIEKLGGDSTGSQFLNPDGS FPNHVPNPEAKEAIDSLKKAVLDNKADFGIIFDADGDRSAFVDKSGREINRNRLIALLSD ILIKQKDGATIVTDSVTSMGLKKFIEDRGGKHHRFQRGYKNVINEAKKLNKKGIYTPLAI ETSGHAAFMDNYFLDDGAYMAALLLIQLVKSKKAGISFTDTLNELKDPMEEKEIRFSIKA ADFRESGNKVLDKLPEYVGKINGWELEKPNYEGVRVICDGNNWFLLRLSLHEPLLCLNIE TEEKGKVKDIEKKLYEFLKNYDEVDSSVLNN >gi|261746347|gb|ADAD01000200.1| GENE 32 31113 - 32084 1030 323 aa, chain - ## HITS:1 COG:FN1576 KEGG:ns NR:ns ## COG: FN1576 COG0470 # Protein_GI_number: 19704897 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication # Organism: Fusobacterium nucleatum # 6 191 2 182 289 111 43.0 2e-24 MKSSEISEKLKNEAKQNKKSASYLFYGDKRVDLLFYALEFCKIIMTSGIDENSEEYKSII KRIDNFQYPDIEIINKENKNIKIDEVREIIYSAIESSYSSPKKIFILSGIESLRKESSNA LLKILEEPPKDVYFILLSRSMNIIPTIKSRAIKFHIEAENNEELDVSKEIYYFFDGNENN IRKWKERNISLGEYDRYVSTSREALDYIIKMKKYMEDENTENEETETDELDLIIKYNKSI EYIAKKIKFFDLKEVYTIINEIEKEFKQEREKLTEFLTKIIINAKNSINGDKLKKLINLK NSIRSNVNTRSVLFNFFDLLQEA >gi|261746347|gb|ADAD01000200.1| GENE 33 32095 - 33126 1528 343 aa, chain - ## HITS:1 COG:FN1577 KEGG:ns NR:ns ## COG: FN1577 COG1077 # Protein_GI_number: 19704898 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Fusobacterium nucleatum # 10 342 6 341 342 376 58.0 1e-104 MAIFDFFKIFRVNKSISIDLGTANMLIYDKQKKKIVLNEPSVVARDRKTGKIIAVGKEAR EMLGKTPASIEAIKPLQDGVIADIDSTKEMISYFLHKIYGNSLFKPEVMICVPIEVTSVE RKALFDSVNGAKKIYIIEEGRAAIIGSGIDISQPEGNMVVDIGGGSTDIAILSLDEVISS KSIKTAGNKFDEDIVKYIKKKYNLLVGDRTAEKIKKELGTALKVADPEVMTIKGRDLAVG IPNTIELNANEVYEAIEDSLYQIVNSSKEVLEKCPPELAADILDNGIVMTGGGSLIRNFV 
ELMENEIGIKVYLAENPLDSVVIGGGKAFDNKKLLKTLQMRES >gi|261746347|gb|ADAD01000200.1| GENE 34 33159 - 34643 1904 494 aa, chain - ## HITS:1 COG:FN1579 KEGG:ns NR:ns ## COG: FN1579 COG0215 # Protein_GI_number: 19704900 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Cysteinyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 489 2 470 473 499 52.0 1e-141 MNLYNTMSNKIEEFKPIEEGKVKMYVCGPTVYNYIHLGNARPVVIFDILARYFKYRGYDV TYVQNFTDVDDKIIKRANEEGISPKEVTEKYIQGFFEDTELLNILQDIKRPKVTENMPEI IETIEKLIENGYAYEIDGNVFFDVKKYEDYGKLSNQKIDELEIGARVDIMEIKNNPLDFA LWKKKKEGEPYWDSPWGQGRPGWHIECSAMSKKYLGETFDIHGGGQDLIFPHHENEIAQS KCAYHGHFAKYWVHNGFVQVNGDKMSKSTGNFFLLREILGKFSGNVIRLFILSTHYRKPV NFSFEDMENTKKTLQNVISSIRRFEKSLEKLSSGIKSEDVSDEKINEINKVFKEKIKSFD VKFIGAMDEDMNTPQALSTIFDQIRETNRFCDDIEKILDKERNDSSFLLFEALELSYKNL KTKIEDVLGVLLMDEKKENKDDLSKNLIELLIKIRTEARKEKNFSLSDEIRDELKKIGVE IKDNKDGTTSYTLI >gi|261746347|gb|ADAD01000200.1| GENE 35 35019 - 37358 3036 779 aa, chain - ## HITS:1 COG:FN1581 KEGG:ns NR:ns ## COG: FN1581 COG1193 # Protein_GI_number: 19704902 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Fusobacterium nucleatum # 2 779 3 778 778 666 50.0 0 MEKNYDVLEFYKIINELINLSKLEKTKEKFIDTDIIKDKSVLDKELMLMTEMIDFYKFDD GFELTGMSDIQRYINSIELIGSYLNAEDLADLKKNLAVYRISKSRAKNVRDKYKIIWGLF SDTEEVKDIEQFIGDAVNDDGNIKDDASIGLRDIRRQKQNINANIKEKFDELMSGKDTQK AIQERIITQRNDRYVIAVKTDFKGLVKGIEHDRSATGSTVYIEPLNVVSLNNKLREYEAR EREEIRKILLRLTELIRTKKEEIIRIKEILERLDFLNAKTAYSLNKKCTVPKIINKEYLK LVEARHPLIDENAVVPINFELGNNENIMLITGPNTGGKTVTLKVAGLLTLMALSGIPVPA HEKTEIGHFGNVLADIGDEQSIEQNLSSFSGHVKKIKEIVEQVNSKSLVLMDELGSGTDP MEGAAFAMAIIDYLNGKNIKSIITTHYSEVKAYAFNTTGIKSASMEFNVETLSPTYKLLE GIPGESNALIIAGKYGISEQIINSAKSYISEDNQKVEQMLKSIKEKNDELEVLKFELENT KRELEDQKNSYEQKIIQVENEKNEVIKKAYEEADNYLKEVQSKAKNLIDRISQDEIKKEE AKNAQRSLNMLRESFIADKKQNVKERKIVARNIDIQEGEEVLVKTLNQNGKVLRIIPDTN NVQVQAGILKLVVSLDNIVKIQKKKTNRFKSFASLKSTQVRGEIDLRGKNADEAIAELEV 
YLDRAMLTGYHEVYIIHGKGTMILRKKIQEFLKTSKYVTEFKDANQNEGGIGCTVATLK >gi|261746347|gb|ADAD01000200.1| GENE 36 37497 - 38195 238 232 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 1 223 7 238 318 96 28 3e-19 MEKPILECKNLTKNYEHETALDNINLTISPGKIVGLLGPNGSGKTTLIKLSNDLLKPSSG EILINGEKPGVNSKKIISYLPDKTYFSNSHKINELFKLFSAFYPDFNEEKARKMLEGLKI DPKSKFKTLSKGMKEKVQLLLVMSRDAKLYLLDEPIAGVDPVAREYILNTIIKNYNREAS IIISTHLISDVEIVLDEAVFINRGKILLHEDVEKIRTVHNKSVDEYFREVFR >gi|261746347|gb|ADAD01000200.1| GENE 37 38192 - 39031 698 279 aa, chain + ## HITS:1 COG:no KEGG:DSY1434 NR:ns ## KEGG: DSY1434 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 3 279 4 265 265 73 27.0 8e-12 MIKLFKYEFGSTARLLFPLYIVVLGIGFLNKIVTTFGKNTGIEKLFLNDSSLNWRGLDTA VFSFQVFSYFLAALIVAGIFVLTYFSIINRYYNSMYGNEGYLTNTLPLKTHELILAKFFN FLFWIFIGSVVSLTSIMLLFPTITFKEIWAFAVGLISQIDSTDYIDYITRIILIVINHFV NLCLTIMIIFLSVTITNFFNNYKLIAGIVSYFVIRTILNIIYFFLINLPTFLMAINKDPE YLFKGSTLNILFAVFLAIIEWIIIFSIINYTHKNKLNLE >gi|261746347|gb|ADAD01000200.1| GENE 38 39046 - 41835 3856 929 aa, chain + ## HITS:1 COG:SP1068 KEGG:ns NR:ns ## COG: SP1068 COG2352 # Protein_GI_number: 15900937 # Func_class: C Energy production and conversion # Function: Phosphoenolpyruvate carboxylase # Organism: Streptococcus pneumoniae TIGR4 # 53 929 40 898 898 777 47.0 0 MKNLPVTPLPTVKLDKQQETLLEHNRLLGTLLGETIFTYAGLHIFKVTESLREASISYYK TNKQESRKDLSLFCSELGDSEILRVIRGFTYFSVLANIAEDVYQVYQHRSMKFSNKTEMG TLEKSLENLKKKGISIEKILEAMKKVSVVPVLTAHPTQVQRKSILDLMRKITDTLSKYEN VQIHQIDESEWIDELNSKIKILWQTSMLRTSQLRVTDEISNALSYYNITFFNEIPELTNK FQEISEKLGENKFEAEKLIPLTMGMWIGGDRDGNPYVTVDTLETSAQAQALTLFQFYLDE IESIYRDLSMSNTMTEVSKDLEKLSEASGKVSPHRVNEPYRRALTAINDRLLATAYKLCE NKEVLPPKRKDSLDIPYSSPEEFTKDLEIVANSLIENNSEALVKGKLRNLISGTKIFGFH LATIDLRQDSSIHELCVAELLKSANIMEDYLSLPENARCEMLLRELEHDPRILSDPTISQ SELLKGELAIFRKVKSLRDRFDSRIIEKHLISHATSVSDMLEIAILLKEANLAKGDKNSF CDIQIVPLFETVEDLEASSEILEKWFSLPLVQKWMEKSGRKQEVMLGYSDSNKDGGYLSS 
SWSLYKAQKKLTALGNKFDVQISFFHGRGGTVGRGGGPSYEAILAQPEGGADGTIRLTEQ GEVIGAKYGNPDLGFKNLEALVSAALESGALTVEDAKWDEYEKIIEEISESSYKVYRELV YNTEGFTDFFFEITPINAIAGLNIGSRPSSRKKKQSLESLRAIPWVFSWSQSRIMLPGWY GLGASFTEWINKNPDNLSILQKMYREWPFFRSVISNADMVLSKSDLRIASEYVKLAQNQE VAQKIFSEIVKEWELTLDVLKKITGNDVLLADNPELASSLRNRLAYFDSMNYLQIELLKR LREGDESEDLRKALHISINGLATGLRNSG >gi|261746347|gb|ADAD01000200.1| GENE 39 41920 - 42183 390 87 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229882702|ref|ZP_04502176.1| LSU ribosomal protein L28P [Sebaldella termitidis ATCC 33386] # 1 85 1 85 87 154 82 6e-37 MQICEVFGKRVGHGNMVSHSHRATKRIWRPNLQTMTLNVDGSEIKVRVCTKAMKTLKGKN VDQVKKILLENKDNLSPRISKVLFSAK Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:59:03 2011 Seq name: gi|261746344|gb|ADAD01000201.1| Leptotrichia goodfellowii F0264 contig00051, whole genome shotgun sequence Length of sequence - 1876 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 125 - 183 3.5 1 1 Op 1 . - CDS 295 - 1059 958 ## COG1496 Uncharacterized conserved protein 2 1 Op 2 . 
- CDS 1061 - 1876 1057 ## COG0142 Geranylgeranyl pyrophosphate synthase Predicted protein(s) >gi|261746344|gb|ADAD01000201.1| GENE 1 295 - 1059 958 254 aa, chain - ## HITS:1 COG:FN1561 KEGG:ns NR:ns ## COG: FN1561 COG1496 # Protein_GI_number: 19704893 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 3 252 9 238 242 143 34.0 4e-34 MFKEKENYYFIEEFEKYGIKAVYSKKNAGNMSDYCKMEGQEEKEQKKNRNKLLKELGLED RKTLMSYQTHTNNVEIIDENTENYVFKDIDGFVTGRKDVALFTFYADCLPIFVYDKKNEV IGVWHSGWPGTYKEIMKNGLEVMKNKFGTEPENVLMALGIGIQQENYEVGQEFFEKFVDK FGKESELIVKSFKLNENTQKYHFSNTIFNKITALNLGIKEENLIVSEEDTWDEKFYSYRR EGKRAGRATAMIAF >gi|261746344|gb|ADAD01000201.1| GENE 2 1061 - 1876 1057 271 aa, chain - ## HITS:1 COG:FN1327 KEGG:ns NR:ns ## COG: FN1327 COG0142 # Protein_GI_number: 19704662 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Fusobacterium nucleatum # 1 269 22 295 297 254 51.0 2e-67 ELEKYKYPEELSEAMKYAVMNGGKRIRPILMYMICDLFGKKYDKIEDMATALEFIHCYSL VHDDLPAMDNDMYRRGKLTTHVKYGESTAILVGDVLLTEAFNVIAESEKTDDTAKAKIIA KLSEYAGFYGMIGGQFMDIKSEHTRVSYDTLKYIHANKTGKLITAAIELPLIALDIEESK REKLVGYSELIGTAFQIKDDILDIEGKFEETGKQASDEKLEKTTYPSLFGLEKSKELLNE CISKAKKILEENFENNGLLIELTDYFGNRGA Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:59:04 2011 Seq name: gi|261746343|gb|ADAD01000202.1| Leptotrichia goodfellowii F0264 contig00146, whole genome shotgun sequence Length of sequence - 3121 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + LSU_RRNA 1829 - 2202 94.0 # AJ307978 [D:1..2925] # 23S ribosomal RNA # Propionigenium modestum # Bacteria; Fusobacteria; Fusobacteriales; Fusobacteriaceae; Propionigenium. + LSU_RRNA 2290 - 2852 92.0 # AJ307978 [D:1..2925] # 23S ribosomal RNA # Propionigenium modestum # Bacteria; Fusobacteria; Fusobacteriales; Fusobacteriaceae; Propionigenium. 
Prediction of potential genes in microbial genomes Time: Thu Jul 14 03:59:11 2011 Seq name: gi|261746319|gb|ADAD01000203.1| Leptotrichia goodfellowii F0264 contig00038, whole genome shotgun sequence Length of sequence - 23250 bp Number of predicted genes - 23, with homology - 23 Number of transcription units - 8, operones - 5 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 79 - 117 -0.4 1 1 Op 1 12/0.000 - CDS 164 - 1126 1614 ## COG3958 Transketolase, C-terminal subunit 2 1 Op 2 1/0.000 - CDS 1127 - 1978 902 ## COG3959 Transketolase, N-terminal subunit 3 1 Op 3 11/0.000 - CDS 2015 - 3370 2333 ## COG3037 Uncharacterized protein conserved in bacteria 4 1 Op 4 13/0.000 - CDS 3387 - 3656 194 ## PROTEIN SUPPORTED gi|148984431|ref|ZP_01817719.1| PTS system, IIB component, putative 5 1 Op 5 5/0.000 - CDS 3713 - 4144 690 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 6 1 Op 6 . - CDS 4138 - 6195 2443 ## COG3711 Transcriptional antiterminator 7 1 Op 7 . - CDS 6245 - 6421 79 ## gi|262039640|ref|ZP_06012931.1| putative CagA protein - Prom 6460 - 6519 9.2 8 2 Tu 1 . + CDS 6532 - 9102 1873 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 + Term 9138 - 9181 1.4 - Term 9126 - 9169 2.1 9 3 Op 1 . - CDS 9196 - 9648 277 ## Dhaf_4658 Abi family protein 10 3 Op 2 . - CDS 9522 - 10124 232 ## Dhaf_4658 Abi family protein - Prom 10250 - 10309 11.5 - Term 10136 - 10181 3.1 11 4 Op 1 . - CDS 10324 - 10581 386 ## Lebu_0560 hypothetical protein - Prom 10602 - 10661 6.5 12 4 Op 2 . - CDS 10675 - 11538 992 ## COG2207 AraC-type DNA-binding domain-containing proteins 13 4 Op 3 . - CDS 11617 - 12687 1224 ## FN1986 hypothetical protein - Prom 12799 - 12858 10.9 - Term 12919 - 12973 4.9 14 5 Op 1 17/0.000 - CDS 12976 - 14424 1585 ## COG0168 Trk-type K+ transport systems, membrane components 15 5 Op 2 . 
- CDS 14449 - 15804 1772 ## COG0569 K+ transport systems, NAD-binding component 16 5 Op 3 1/0.000 - CDS 15879 - 16853 1378 ## COG0113 Delta-aminolevulinic acid dehydratase 17 5 Op 4 . - CDS 16870 - 17346 751 ## COG1648 Siroheme synthase (precorrin-2 oxidase/ferrochelatase domain) - Prom 17375 - 17434 2.4 18 6 Op 1 . - CDS 17436 - 18458 1221 ## COG2603 Predicted ATPase 19 6 Op 2 6/0.000 - CDS 18480 - 19094 772 ## COG0406 Fructose-2,6-bisphosphatase 20 6 Op 3 . - CDS 19091 - 19819 938 ## COG0368 Cobalamin-5-phosphate synthase 21 6 Op 4 . - CDS 19875 - 21266 1575 ## COG1488 Nicotinic acid phosphoribosyltransferase - Prom 21343 - 21402 11.1 + Prom 21287 - 21346 10.3 22 7 Tu 1 . + CDS 21385 - 22599 814 ## gi|262039654|ref|ZP_06012945.1| hypothetical protein HMPREF0554_0496 + Term 22699 - 22736 -1.0 23 8 Tu 1 . - CDS 22731 - 23174 375 ## gi|262039636|ref|ZP_06012927.1| sporozoite and liver stage asparagine-rich protein Predicted protein(s) >gi|261746319|gb|ADAD01000203.1| GENE 1 164 - 1126 1614 320 aa, chain - ## HITS:1 COG:STM2340 KEGG:ns NR:ns ## COG: STM2340 COG3958 # Protein_GI_number: 16765667 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Salmonella typhimurium LT2 # 11 318 10 317 317 420 66.0 1e-117 MFKVYKGEPKKDDIEMRKVYSSKLSELMKKDDRVVALEADLMSSMTMDKVQKENPDKVIN CGIMEANMIGVAAGMSIAGKYPFAHTFTAFASRRCFDQLFMSGAYQKNNIKVIGSDAGVT SVHNGGTHMSFEDMGIMRGLADTTVMEMTDAVMFENILEQIALKDGFYWIRTMRKNAATI YEKGSTFKIGKGNVLKDGKNITLIANGIMVIEALKAAEKLEKEGNSVAVIDMFTLNPIDK ELIIEYGNKTGKIVTCENHSVHNGLGSAVAEVIAESGNAVLKRIGIQERYGQVGTLDFLM EEYGLTSEHIYKAALKLLEK >gi|261746319|gb|ADAD01000203.1| GENE 2 1127 - 1978 902 283 aa, chain - ## HITS:1 COG:STM2341 KEGG:ns NR:ns ## COG: STM2341 COG3959 # Protein_GI_number: 16765668 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Salmonella typhimurium LT2 # 10 279 3 272 276 404 68.0 1e-112 
MKNCNCEEKLKEIEKLASKIRINTLKSLTHLGFGHYGGSLSVVEVLAVLYGGVMNINPED PNWKGRDYFVLSKGHAGPALYATLALKGFFPVEEIYTLNQNGTNLPSHPDRLKTKGVDAT TGSLGQGISIATGIATALKMDKKSNRVFSIIGDGEINEGQCWEAFQFIAHHNLNNLTVFL DYNKQQLDGTLEEIIKPFSFENKLKAFGFDTVTIKGDDIKGIYEMVKKPRKENEKPLFVI LDTVKGQGVEYIEKMKNSHHLRLTEELKKEIEEVIEKLEREAE >gi|261746319|gb|ADAD01000203.1| GENE 3 2015 - 3370 2333 451 aa, chain - ## HITS:1 COG:STM2342 KEGG:ns NR:ns ## COG: STM2342 COG3037 # Protein_GI_number: 16765669 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Salmonella typhimurium LT2 # 5 448 7 450 463 548 73.0 1e-156 MKTLLTFIVDILKVPSILVGLIAMIGLMVQKKPVADIVKGTIKTILGFLVLGGGAGLVVS SLAPMGTMFEQGFHVQGIVPNNEAIVSLALKEYGTITALIMALGMFWNIFIARFTKLKYI FLTGHHTLYMACMIGIILKVGGFEGPLLIFIGSLTLGLVMAIFPALAQPYMRKITGSDDV GFGHFSTIGYVLSGIVGSLIGKGSKSTEEMKLPKNLSFLRDSSISISLTMMIIYLILALV AGKDFVEKELSGGDNYLVFAVIQAITFAAGVYIILSGVRLVLAEIVPAFVGFSEKLVPNA KPALDCPIVFPYAPNAVLIGFLCSFLGGIVGLFMLGMLKWTLILPGVVPHFFCGATAGVF GNATGGRRGASWGAFANGLLLTFLPVILLPVLGNLGFANTTFSDADFVTVGIVLGNMAKS VSPMIVSAIIVGITVVLVALGFVPSKKKEEK >gi|261746319|gb|ADAD01000203.1| GENE 4 3387 - 3656 194 89 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148984431|ref|ZP_01817719.1| PTS system, IIB component, putative [Streptococcus pneumoniae SP3-BS71] # 1 84 2 85 94 79 51 2e-14 MKIMAVCGSGLGSSFMVEMNIKKVLKKIGVEAEVEHSDLASAVSGAADLFVMAKDIAESS SVPAEKRIVLDSIISMPELEEKLTNYFKK >gi|261746319|gb|ADAD01000203.1| GENE 5 3713 - 4144 690 143 aa, chain - ## HITS:1 COG:SPy1952_2 KEGG:ns NR:ns ## COG: SPy1952_2 COG1762 # Protein_GI_number: 15675752 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Streptococcus pyogenes M1 GAS # 10 140 5 142 154 110 42.0 1e-24 MLEEILDGNIQTVDFVKDWEESIKIASKPLLEKNIIEKKYVEAMIESIKKLGFYVVLREN LAMPHARPEEGAIDTGISLLKLKQPVNYGDEKVYIVIVLASKDSDSHTEILMKLVELFQD DESIEELIKAETVEEIRRIVGKY >gi|261746319|gb|ADAD01000203.1| GENE 6 
4138 - 6195 2443 685 aa, chain - ## HITS:1 COG:SA0321_1 KEGG:ns NR:ns ## COG: SA0321_1 COG3711 # Protein_GI_number: 15926034 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Staphylococcus aureus N315 # 1 485 1 482 504 131 24.0 5e-30 MLNYKEIEILNSLIKGEKCSFRAISEKYGISDRAARYYIDNIDYILKILGYKITEKDRNI IFLETDQDFSKIFEILKKAQKLSVDDRVNILKLILFFDEKGLNITKISEELQISRTTIKK DMKLLSEELGKDGIKLNYKNNIGYKPEGNNGEILIKQIELIEEILELSKGKDDFKIVKTQ ILNYFYKYIKEENIENTKEFIVEVEKVMELNINGESYNKIFSYMLILLNFDGKYGNSQDS VSKKFLAHTDEYEKIKEILENILKATVKQEMLIEMTDLITGITINSFENNSFEDWINEEL IIKKMISKVSKIVKTDLTRDEILYNGLLYHIKPAMYRIKNNIQITNSVFKELILAKDPVL NVVNEGIKEIEELFGVKFPEDEIALIGFHIKASIERNTSEKTKKVILICGLGYGSSKVLE QSLKENYDLDIVDVLPYYLIEDTLPNYKNIDLILSTVDLKHKYGDVPVVKINPLLKEDDF VKLSKYGIKKGNRKISMKKLMKIIKNNTTMENSDKLIEELKQELENSIVDDLSDAGTILR EILNKENVRFAEKVQDWKEAITKSGNILVENKVVKSGYVEEMIKLVKKHGAYIVIEEGIA IPHAGISENVLKTGVSLLIVKEKVFFPSGRGANIFLSFATINKNDHLNILNDLFELITKY NFIEEISKISKYEELEKYFRKEFVC >gi|261746319|gb|ADAD01000203.1| GENE 7 6245 - 6421 79 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039640|ref|ZP_06012931.1| ## NR: gi|262039640|ref|ZP_06012931.1| putative CagA protein [Leptotrichia goodfellowii F0264] putative CagA protein [Leptotrichia goodfellowii F0264] # 12 58 1 47 47 63 97.0 3e-09 MVANKNNISYFLIYVNTFFEKNNFFVNAENVNKIKDKIDSEEKQGVTLFQIYLKISAN >gi|261746319|gb|ADAD01000203.1| GENE 8 6532 - 9102 1873 856 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 4 849 5 806 815 726 45 0.0 MEEKFTQKSMEALSEAHNFAVRYKSSDMKVEHLLLALIGQMNGLIPNILKKMGINTIELT KKLEDKLNRMPKIEGGTSDPRPNGELNRVIVGAEDYAKKMGDAYISTEHLFLASYDNNSF LRENGIVKNQFENVLNEIRGGRKIMSDNPENSYEALEKYGKDLVELARKGKLDPIIGRDQ EIRRAIQILSRRNKNNPILIGEPGVGKTAIAEGIAQRILKGDVPENLKDKTIFSLDMGAL ISGAKYRGEFEERLKAVLDELENSDGRIILFIDEVHNIVGAGKTEGSMDAGNLLKPMLAR GEIKVIGATTLDEYRKYIEKDAALERRFQPIMVNEPTVEDTISILRGLKEKFEIFHGIRI TDNAIVTAATMSDRYINDRFLPDKAIDLIDEAAAKVKTEINSMPTELDEVTRRLMQLEIE 
KVALTKEKDQASKDRLESLEKEIAELKEEETKLKSQWENEKQEAGRLTKINEEIEKVNLE IQEAERKSDYNKLAELKYGKLATLEKERENEEEKWKDRNEGNGSKLLKQEIDSEEIAEIV GKWTGIPVSKLLQGEKEKILNLAEHMKARVIGQDEAIETISDTIIRSRAGLKDPNRPIGS FIFLGPTGVGKTYLTKTLAHNLFDDENNIVRIDMSEYMDKFSVTRLIGAPPGYVGYEEGG QLTEAVRRKPYSVILFDEIEKAHPDVFNVLLQLLDDGRLTDGKGKVVDFKNTIVIMTSNI GSEIILEDPKLSNVTKEAVLDEMKHRFKPEFLNRIDDIIVFKSLGKEHVKNIISLILKDI NKKLKDQFIKIEFTDKALDYIVDEAYDPAYGARPLKRFVQKDIETNLSKMILSNEIPENS TVLIDSDGEKLTYKVK >gi|261746319|gb|ADAD01000203.1| GENE 9 9196 - 9648 277 150 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_4658 NR:ns ## KEGG: Dhaf_4658 # Name: not_defined # Def: Abi family protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 13 147 175 314 317 68 33.0 9e-11 MDINKAIYVWYYSFFYSSLTDDIKDKISSEISSEYSHEYKNMKINLSKEELENIIHFLNF LRNKCAHNERFFNTNKKKTAIVYPHSSEIFKGRLFDAVLLLKLFLFKKDFNIFRKELKIE IDKINKELNTGIFNKVLIEMGFPKNWEERI >gi|261746319|gb|ADAD01000203.1| GENE 10 9522 - 10124 232 200 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_4658 NR:ns ## KEGG: Dhaf_4658 # Name: not_defined # Def: Abi family protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 3 177 2 181 317 137 43.0 3e-31 MSKPFKTYRQQLSILRNRGMEIKDGGKVIKILKRENYYSIINGYKDIFIDNNSTIEKFKM GTTFEDIYCIFEFDRNLRNIFLKNILKIESLVNTKISYYFSERNQKEFSYLNINNFHPSK LEENTGLIANISNVIKDKSKSNKSNQISHHLKNHQDLPLWILIKQFTFGITAFFILLLQM ILRIKFQAKYQVNIPMNIRT >gi|261746319|gb|ADAD01000203.1| GENE 11 10324 - 10581 386 85 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0560 NR:ns ## KEGG: Lebu_0560 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 85 1 85 85 151 94.0 7e-36 MSEKNLKILGWIGTFLSVIMYVSYIPQIMGNLHGNKTPFLQPLAAAINCTIWTSYGLLKE KKDYPLSAANLPGIIFGLLATITAF >gi|261746319|gb|ADAD01000203.1| GENE 12 10675 - 11538 992 287 aa, chain - ## HITS:1 COG:FN0315_1 KEGG:ns NR:ns ## COG: FN0315_1 COG2207 # Protein_GI_number: 19703660 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Fusobacterium nucleatum # 1 129 1 129 130 177 69.0 3e-44 
MNIIKAFNQTIGYIEENLTEKLDEKIIQQLSGYSFPLFSRIFSIMVGYPLSEYIRFRKLS CAAVDLREKDERVIDIAFKYGYESPDSFTAAFKKFHGYTPNEVKKGKTFKIFSPILLSLN VSGGKNMDIKIEKKKGFKIAGIRKEVNENTNFPEVWNKLFSKVDSRILENLGNGQSYGAC CNFKNKIEVLCEKNIFDYMAGYDVKDMNKAKEAGLEILEVPEAEYAVIKLKGAIPECIHE GWKYVTNVFFPEQGYKHAGTPDFEVYSEGNPESIDYEMELWVPVEKL >gi|261746319|gb|ADAD01000203.1| GENE 13 11617 - 12687 1224 356 aa, chain - ## HITS:1 COG:no KEGG:FN1986 NR:ns ## KEGG: FN1986 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 42 355 2 336 336 256 48.0 1e-66 MSIFGNDWDFYFTGKQIVPFEISETAIKKEIIKINIEGKEAVVNVKFIFDSPVKSKKMIG FVAPTIKNMQNPGLDKLNVRDFETKVNGKKVNSNVEKLDYFLKKGIMSEDKLKKYTDEEY PDNYVYYFNANFVKGENVVEHSYKRDYIYDFEYVLTTISKWKNKKVDDFELIIEPGNKFI QLQYSFWNDQKMINWEIVGEGKLTAITPEMVEKAEGSFGQKINENIYARLKNGYIRYKAE NFAPDKDFYMKYIENIEQMEDELPREKVGNYIFTDKLFMIAAYGNEKEMRKLSNKDLDIV KNYLYAIAGYDFSKKQLKDYFSKFVWYVPINKDVKIDDYNKLTEKINKIKKERSRK >gi|261746319|gb|ADAD01000203.1| GENE 14 12976 - 14424 1585 482 aa, chain - ## HITS:1 COG:FN0993 KEGG:ns NR:ns ## COG: FN0993 COG0168 # Protein_GI_number: 19704328 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Fusobacterium nucleatum # 1 480 1 480 483 438 52.0 1e-122 MNKKMIGFIVGRILLLEAGLMILPLIVSFIYGEDWKYRISYFSVIMMLVILGFILSGKSP KNMSIQGREGFVIVSLSWILLSAFGALPFVLTKEIPSFIDAFFETVSGFTTTGSSIITDL SKISHSNLFWRSFTHFVGGMGVLVLALAVFPKNSPSSVHVMKAEVPGPTFGKLVSKLSAT ARVLYQIYLVMTIVVMILLMFGGLDLFESSLLAFGTAGTGGFGLRNGSILPYHSAYIEMV LAIGMLVFGINFNIYYFILIKKAKEALANEELKYYLMIVIFSTALIMINIIPMYKSLWQC FRDAFFSVSSVITTTGYSTADFGMWPVFPQTILLILMFSGACAGSTAGGLKVSRVVMMGK MFFSEIKQMISPNRVVTVKYENKPLDIKIQKSVAAYFMVYMILFTAILLIISVNSQDFLT AFSAVAATFNNIGPGLGKVGPAFSFAEYNDFSKTVLSFAMLAGRLEIYPMLILFAPATWK IK >gi|261746319|gb|ADAD01000203.1| GENE 15 14449 - 15804 1772 451 aa, chain - ## HITS:1 COG:FN0242 KEGG:ns NR:ns ## COG: FN0242 COG0569 # Protein_GI_number: 19703587 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport 
systems, NAD-binding component # Organism: Fusobacterium nucleatum # 1 451 1 451 452 343 45.0 4e-94 MKIVIVGAGVVGESLCSELSEEGNDVILIEKREEVLNRIIDTNDITGLVGNGASYENLFE AGADKADVFIAVTEADELNIISCIIAQKIGAKYTIARVRNPEYSADKKFVREGLGISMMI NPEEEAAKSIMNKLKFPNATSVDSIFSNRATILELEITKESSLKGIMLKDLDKVTSEKII IFIVKRGNEVFIPSGNFVLEEDDSIYVTGTSDAVVKFYNEMGYQHKNINSAMLIGGGTIS HYLTERLLKIKKQVKIIESDRERAEKLSRSYPNAIVIKGDETDQELLLNEGIKNYDAVLA LTDKDEENIVISMFAESVNEGKVITKMSRTLLLPILEEKGLYSVIVPKKVIADIIIRVVR SKINVKGSKMNTLHRLVDNNIEAIVFEVSPQSKIIGIPLKDLKVISDLLIVCILRDEELI YPGGDDIIQAKDKVMIVTLKRTIEDIDDILE >gi|261746319|gb|ADAD01000203.1| GENE 16 15879 - 16853 1378 324 aa, chain - ## HITS:1 COG:FN0460 KEGG:ns NR:ns ## COG: FN0460 COG0113 # Protein_GI_number: 19703795 # Func_class: H Coenzyme transport and metabolism # Function: Delta-aminolevulinic acid dehydratase # Organism: Fusobacterium nucleatum # 1 321 1 321 322 422 64.0 1e-118 MYKRHRKLRNNSMIRNLVKDVYLEKNDLIYPVFIEEGENIKNEIKSMPGIFRYSLDKIDE ELEELVKLGINSILLFGIPEHKDEYATESYSKEGIIQRAIRYIKSKYPNFLIIGDVCCCE YTSHGHCGILDEHGYVKNDETLEILAKIALSYAKAGVDIIAPSDMMDGRVEKISKILAKN NFANIPVMSYSVKYSSSFYGPFRDAADSAPKFGDRKSYQMDFRYAGDALSEVKADLKQGA DIVIVKPALAYLDILTKVKETFGVPVVAYNVSGEYSMVKAAAINGWINEKNIVMEQMYAF KRAGANAIITYFAKDVAHYLNEES >gi|261746319|gb|ADAD01000203.1| GENE 17 16870 - 17346 751 158 aa, chain - ## HITS:1 COG:FN0539 KEGG:ns NR:ns ## COG: FN0539 COG1648 # Protein_GI_number: 19703874 # Func_class: H Coenzyme transport and metabolism # Function: Siroheme synthase (precorrin-2 oxidase/ferrochelatase domain) # Organism: Fusobacterium nucleatum # 2 152 5 152 152 84 35.0 1e-16 MFFPMFVNVENKEILVIGGGKIGARKAELLKEYGGNVTVYSEVVTEEALTKIEGIKIVNK NLSHDENEIKELVKKYFVVVAATNDEELNDKIAHICIAENVLVNNVSSKTEMNSMFGAIV KNKEFQIGISSYGKSCRRSKALKGRVQKVIDEIAAMSE >gi|261746319|gb|ADAD01000203.1| GENE 18 17436 - 18458 1221 340 aa, chain - ## HITS:1 COG:Cj0500 KEGG:ns NR:ns ## COG: Cj0500 COG2603 # Protein_GI_number: 15791864 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: 
Campylobacter jejuni # 1 317 1 306 332 122 29.0 7e-28 MIIAQSYEELLKKNKKLVFIDVRSPKEYKEAHIPDAVNIPVFSDKEREIIGTLYKKEGKK EAIREALKRVGPKVYDIYTEMEKYVERDAEIVVYCARGGMRSSAVVSFFKEFSLPLVKLA KGYKGYRSYLNEKLPEMVAQCEYTALYGKTGSGKTKILKVLEKRGYDVLDLEECANNRGS LLGSIGLGEKYSQKFFESQVLKRFLSFKTKKIIIEGESKRIGNIVMPKYLYEAVINSEKL LVETELAKRIDIIKEEYLKESYEKQEIIETLKKLGRYIGEKQINEYIEKIEKEEYDFVIK ELIEKYYDKVYTTKNKVFQKTFYNKNEEECADEIVKYIFG >gi|261746319|gb|ADAD01000203.1| GENE 19 18480 - 19094 772 204 aa, chain - ## HITS:1 COG:FN0911 KEGG:ns NR:ns ## COG: FN0911 COG0406 # Protein_GI_number: 19704246 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Fusobacterium nucleatum # 8 204 4 190 191 142 43.0 5e-34 MKNRQTNVIFIRHGETDMNKENLYFGHLDPELNETGIYQLKKTRKLLKYFEKNINIVYSS DLKRCMESTGILKIGAKIKKIPLNEFREMNFGIFEGKTYEEISTEFPEEVEKMNKDWREY RVPQGESLKEVMERAVEKLEELTKKHKNKTIVIVSHAGVIKSIVSYYLYGNLDGYWKIKV DNGSMTKMCILEDGFTYFDYINRL >gi|261746319|gb|ADAD01000203.1| GENE 20 19091 - 19819 938 242 aa, chain - ## HITS:1 COG:FN0912 KEGG:ns NR:ns ## COG: FN0912 COG0368 # Protein_GI_number: 19704247 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin-5-phosphate synthase # Organism: Fusobacterium nucleatum # 2 241 1 272 278 145 36.0 9e-35 MIKQFLILIQFMTRIPVFVNVEYDEEKLGKSIKYFPLVGAVIGIFLFGINFLAGKVTENR QIIAVIIVIAEIFITGILHIDGLADTFDGLFSYAKKERMLEIMKDSRIGTNGAVVLILYF ITKIILLSEVKSEYILLYPIISRLSTSMNAGLGEYARKDGMSNGIIEKNGPKEVVISVVI TAVLSFLILKIKGLGVLAAAIVFILIFMRNVKKKIGGITGDTMGASLELTSILILLTGVI LK >gi|261746319|gb|ADAD01000203.1| GENE 21 19875 - 21266 1575 463 aa, chain - ## HITS:1 COG:FN0348 KEGG:ns NR:ns ## COG: FN0348 COG1488 # Protein_GI_number: 19703691 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid phosphoribosyltransferase # Organism: Fusobacterium nucleatum # 2 427 6 459 501 355 45.0 1e-97 MLMDKFFKFLNSDRYQYTESEIYLKNNMENQIAVFDIFLRTEKEKDYAIVYGISDVLELI DTLNSTSYEEKKKYLSKILKNEDLTEYIANMKFTGSVKGVRDGETVFTDEPILTIIAPLI 
QGKILETPILNILNYQILTATVTSKITLAAKDKEVLFFGTRRVPGFEASMAMTKASYIAG CISHSNIMGEYFYDLKSTGTMTHGYVQSFGTEKDSEYRAFDTFIKTCKDEKLPLIMLTDT YSVLKSGIHSTIRAFRDNNINDDYEGTYGIRIDSGNLLNLSKKCREILDDEGFRKAKIIL TGGIDEGKIRRLMKHGVQADMFGVGDAIALPKKEISTVYKMSRIEETDVMKVSDDKSKMS YPGDKEIYRTYKNNNFFDVVTLKDEKEEMEILKKEKYRKLTLDFIIDGEKVEENIELLGL TETKRYYENNVFYLKEIYNETQKRKRIRFSKGIKKLKEELTKF >gi|261746319|gb|ADAD01000203.1| GENE 22 21385 - 22599 814 404 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039654|ref|ZP_06012945.1| ## NR: gi|262039654|ref|ZP_06012945.1| hypothetical protein HMPREF0554_0496 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0496 [Leptotrichia goodfellowii F0264] # 1 404 1 404 404 701 100.0 0 MKRKLIFSLLISLLIAVIYFLNVFRIPYKKDISYLLMNDKYLISKPDYGYLNNSLIFLDN REMTENGKDNEINTCLTVTKMAMIKDYYIGYVTGENSESCKGYFYVRKSVPENLIGLTEK EIEEKFGKNIKYENPSDFMNKYGIDEHGQEYYSENLKTVSIILFFFCFLIIETGFIIYSN IFKIIKNIKEKEMNYKNKVRIILTLTLAAGFLSGFSYFTYNILTGNYKTNWYIHVNNQSS IVRKISEFYSSVILLNPEEYTGIFLDDIEEVKNLYKKPASDRYKDKCGEISKIAMIGDYY VGYREDSDNKNCGGYFYVKKGALESATGLSEQQIEQKFRKNIKYEDPSKFINKNGSGGDY INDFSYLLIMSIYLFLHYLFIVGLVLFGCNFIIDFIKKLISLKK >gi|261746319|gb|ADAD01000203.1| GENE 23 22731 - 23174 375 147 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262039636|ref|ZP_06012927.1| ## NR: gi|262039636|ref|ZP_06012927.1| sporozoite and liver stage asparagine-rich protein [Leptotrichia goodfellowii F0264] sporozoite and liver stage asparagine-rich protein [Leptotrichia goodfellowii F0264] # 1 147 1 147 147 221 100.0 1e-56 MCVKYEPKNSYYIYSDSENLNEEKLKEIYISDDETMFINSDEINNNLILRKISLMYNNSK LGEAVVDKKIKDILEGNSYYISDDIIKILGKDYNKIRQNSEEYLDGKEFEIILILENTDK NKSYEIKFQNVNLYYRRKGFDIWILNI Prediction of potential genes in microbial genomes Time: Thu Jul 14 04:00:01 2011 Seq name: gi|261746310|gb|ADAD01000204.1| Leptotrichia goodfellowii F0264 contig00004, whole genome shotgun sequence Length of sequence - 7779 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 4, operones - 2 average op.length - 3.0 N Tu/Op 
Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 - CDS 3 - 888 1320 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases 2 1 Op 2 . - CDS 909 - 2288 2082 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 3 1 Op 3 . - CDS 2304 - 2483 324 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific + Prom 3066 - 3125 7.5 4 2 Tu 1 . + CDS 3245 - 4009 469 ## Sterm_3349 initiator RepB protein + Prom 4079 - 4138 10.8 5 3 Tu 1 . + CDS 4203 - 4814 780 ## COG3212 Predicted membrane protein - Term 5261 - 5312 1.9 6 4 Op 1 . - CDS 5332 - 6381 1242 ## Lebu_2187 hypothetical protein 7 4 Op 2 . - CDS 6405 - 6998 725 ## Lebu_2188 hypothetical protein 8 4 Op 3 . - CDS 7048 - 7536 544 ## Lebu_2188 hypothetical protein - Prom 7646 - 7705 11.5 Predicted protein(s) >gi|261746310|gb|ADAD01000204.1| GENE 1 3 - 888 1320 295 aa, chain - ## HITS:1 COG:CAC0533 KEGG:ns NR:ns ## COG: CAC0533 COG1486 # Protein_GI_number: 15893823 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Clostridium acetobutylicum # 1 295 1 296 441 505 78.0 1e-143 MKKFSIVVAGGGSTFTPGIVLMLLENLDKFPIRQIKFYDNDKERQEIIAKACEILIKEKA PNINFVYTVDPEEAFTDIDFVMAHIRVGKYAMREKDEKIPLKHGVLGQETCGPGGISYGM RSIGGVLEIVDYMEKYSPDAWMLNYSNPAAIVAEATRRLRPNSKILNICDMPIGIEIRMA EMLGLKSRKEMFIRYFGLNHFGWWTDIRDKEGNDLMPALREKVARVGYNVEIEGENVEAS WSETFTKAKDVYALDPATMPNTYLKYYYFPDYVVAHSNPNHTRANEVMEGREKFV >gi|261746310|gb|ADAD01000204.1| GENE 2 909 - 2288 2082 459 aa, chain - ## HITS:1 COG:BS_glvC_1 KEGG:ns NR:ns ## COG: BS_glvC_1 COG1263 # Protein_GI_number: 16077887 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Bacillus subtilis # 1 370 66 439 452 479 65.0 1e-135 
MPLLFAIGLPVGLAKKTNARACLETFALYTTFNYFVNALIKLWGFGVDINATGPTSGLTD IAGIRTLDTSIIGSIIIAGIVVYLHNKYFDTKLPDFLGIFQGSVFVYIIGFIVMIPCALI TVLVWPKIQAGIFALQGLLVTSGVIGVWIYTFLERILIPTGLHHFIYGPFIFGPAVVEGG ITNYWLAHAPQFGASTEALKTLFPQGGFSLHGNSKIFGCTGIALALYATAKPEKKKRIAG LLIPAVLTAVVSGITEPLEFTFLFIAPLLFAVHSVLAATMAAVMYTFGVVGNFGGGLLDF LFQNWIPMTKNHSNVMFTQIFIGLIFTVIYFFVFKFLIEKMNLKTPGREEDDVETKLFTK ADYKAKKNMEANQNDENDMYMDQAIIILEALGGKDNIEELNNCATRLRVSVKDEKLLQPD SVFKQAGAHGVLKQGKAIQVIIGLSVPQVRDRVEELLKK >gi|261746310|gb|ADAD01000204.1| GENE 3 2304 - 2483 324 59 aa, chain - ## HITS:1 COG:CAC0532_1 KEGG:ns NR:ns ## COG: CAC0532_1 COG1263 # Protein_GI_number: 15893822 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Clostridium acetobutylicum # 1 59 1 59 453 77 59.0 9e-15 MLKKIQRFGGAMFTPVLFFTFTGVVVGITGIFKNPQIMGSIANEGTGWWKFWQLIEEGG >gi|261746310|gb|ADAD01000204.1| GENE 4 3245 - 4009 469 254 aa, chain + ## HITS:1 COG:no KEGG:Sterm_3349 NR:ns ## KEGG: Sterm_3349 # Name: not_defined # Def: initiator RepB protein # Organism: S.termitidis # Pathway: not_defined # 4 239 158 398 398 117 36.0 5e-25 MSDNKKEFNLPLSSLKRYLNSDDKYKRFFDFEKHILKKAVFDINTFTDFNIEYKKIKSGT KASNKIVSINFKIRKSKQIPILYDDTIYKMIETIKDKIENHEKIYNLLLLYSIKRGHKYV YDNIEYLKNDSSENFERKLKRALLLDLATNKLKLYISINEFVKSPTKLYNILVQNINLIK TYYPKIDIPEYTTGLTNLRNVSFLEDQEIFELSNNEIELYIKYYSKKKSVIKIYLSNNII KIIRKAKQNDDSDI >gi|261746310|gb|ADAD01000204.1| GENE 5 4203 - 4814 780 203 aa, chain + ## HITS:1 COG:MA0533 KEGG:ns NR:ns ## COG: MA0533 COG3212 # Protein_GI_number: 20089422 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Methanosarcina acetivorans str.C2A # 4 193 3 196 206 67 27.0 1e-11 MKNKMLITMLLSIIILGTVLTAATAKKKTVSKTKTAAKKREPQTTKIKYKKVVGENPFVT FVKAEEIALKHAGLDKETGDITKILMEQKDKELVYKIALHSDYNDYSYIIGSKEGKVLNY SLRSRDYLLSNKDYIGNDKVRDIILEKMPEIKDSDIYEIKSEKENNLPVYKVRVIRSGKE YKFIIDALKGKILKSEESEFTSL >gi|261746310|gb|ADAD01000204.1| GENE 6 5332 
- 6381 1242 349 aa, chain - ## HITS:1 COG:no KEGG:Lebu_2187 NR:ns ## KEGG: Lebu_2187 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 349 1 353 353 310 55.0 6e-83 MKKIITLLFAVCSIFSFGDNASEKRIQVRGTATREIAPDSAKIHFEIITKNENLEKASRE NADILAKYKNLLKKTNTKYEKIESTNYSTYKNYNWESIKVNEGKKEFRTVLTVEVSPFDI NLLKNFMTVLVQKGVYNIIRNDNGSYSFNVISQNADNKTSYQNAVNDFNDIQKKLAANGF NSNNIKISGYDNNEVNMESNKDVKKEVYTVSHSIEVSTRDMKKLGNIINLAHSLEIASTG IIEYDIDNKQKIEDQLYEQAYKEALKKAENILNKTELKLRKPVTITDNSNGVIRPFYSYF NRNYVDDYNLDILKESDAKIIEKSEHNAIIINPQKSSFSKTVYIEFEMN >gi|261746310|gb|ADAD01000204.1| GENE 7 6405 - 6998 725 197 aa, chain - ## HITS:1 COG:no KEGG:Lebu_2188 NR:ns ## KEGG: Lebu_2188 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 4 197 176 373 373 177 53.0 3e-43 MNNALSKTLSKFNSVKSKLQSSGISANDIVFSNYKVIENQNVQGKNEKDVFVVTEEFTIS TKNLKNLNDIISLADDNSINVSGSIYFDISDKDRIASEMYNEAFNQAKSKALSILKSSNM TLSSPLVVSEDIAFQQKMIDRIDQDWAIAAEAPEAYGLKETRYVNNAAIQKRSLPKVDYT PKPIKLSQNISVLYEIK >gi|261746310|gb|ADAD01000204.1| GENE 8 7048 - 7536 544 162 aa, chain - ## HITS:1 COG:no KEGG:Lebu_2188 NR:ns ## KEGG: Lebu_2188 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 161 1 155 373 121 49.0 8e-27 MKKIFTALFLILSAFSFGENVVRKISVTGNSEREVLPDTAKISFKIQMKNQNLNTATKEL KEKIEKFKSGLRAKKIELSNFETVSFYSNKTKDYNDYYDDYRYVEDEQAVYDVKGKTKTD KDKKPASYTIQMQVIVKNTTFDKISKLIEFSGNDAVKNIKKI Prediction of potential genes in microbial genomes Time: Thu Jul 14 04:00:26 2011 Seq name: gi|261746283|gb|ADAD01000205.1| Leptotrichia goodfellowii F0264 contig00034, whole genome shotgun sequence Length of sequence - 26601 bp Number of predicted genes - 26, with homology - 26 Number of transcription units - 14, operones - 5 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 328 571 ## Lebu_0887 autotransporter beta-domain protein + Term 342 - 394 7.7 + Prom 353 - 412 9.4 2 2 Tu 1 . 
+ CDS 508 - 1923 2231 ## COG1823 Predicted Na+/dicarboxylate symporter + Term 1934 - 1985 4.5 + Prom 1963 - 2022 7.5 3 3 Op 1 . + CDS 2054 - 2878 1132 ## COG1968 Uncharacterized bacitracin resistance protein 4 3 Op 2 . + CDS 2882 - 4327 1923 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases 5 3 Op 3 . + CDS 4359 - 4943 628 ## Lebu_2229 hypothetical protein + TRNA 5076 - 5162 69.0 # Leu CAA 0 0 + Prom 5078 - 5137 80.3 6 4 Op 1 . + CDS 5262 - 6296 1618 ## COG1363 Cellulase M and related proteins + Term 6348 - 6389 5.1 + Prom 6422 - 6481 11.6 7 4 Op 2 . + CDS 6506 - 7537 1201 ## COG2008 Threonine aldolase + Prom 7634 - 7693 12.5 8 5 Tu 1 . + CDS 7739 - 8611 1122 ## COG1737 Transcriptional regulators + Term 8625 - 8677 9.1 + Prom 8662 - 8721 14.5 9 6 Tu 1 . + CDS 8780 - 10039 1826 ## CD0209 putative sugar-phosphate kinase + Prom 10125 - 10184 12.1 10 7 Op 1 8/0.000 + CDS 10220 - 10714 713 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 11 7 Op 2 7/0.000 + CDS 10745 - 11062 647 ## COG1445 Phosphotransferase system fructose-specific component IIB 12 7 Op 3 . + CDS 11128 - 12186 1859 ## COG1299 Phosphotransferase system, fructose-specific IIC component 13 8 Tu 1 . - CDS 12306 - 13073 823 ## COG1349 Transcriptional regulators of sugar metabolism - Prom 13165 - 13224 18.1 + Prom 13168 - 13227 10.4 14 9 Op 1 11/0.000 + CDS 13280 - 15577 3015 ## COG1882 Pyruvate-formate lyase 15 9 Op 2 . + CDS 15620 - 16399 932 ## COG1180 Pyruvate-formate lyase-activating enzyme 16 9 Op 3 . + CDS 16401 - 17117 1015 ## COG0176 Transaldolase + Term 17203 - 17246 1.4 + Prom 17265 - 17324 12.7 17 10 Tu 1 . 
+ CDS 17479 - 18540 1756 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold + Prom 18589 - 18648 5.5 18 11 Op 1 11/0.000 + CDS 18669 - 20177 2162 ## COG3037 Uncharacterized protein conserved in bacteria + Prom 20184 - 20243 5.3 19 11 Op 2 13/0.000 + CDS 20265 - 20546 566 ## COG3414 Phosphotransferase system, galactitol-specific IIB component 20 11 Op 3 8/0.000 + CDS 20598 - 21047 723 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) + Term 21128 - 21188 -0.9 + Prom 21156 - 21215 7.7 21 11 Op 4 9/0.000 + CDS 21253 - 21915 1101 ## COG0269 3-hexulose-6-phosphate synthase and related proteins 22 11 Op 5 8/0.000 + CDS 21965 - 22840 1276 ## COG3623 Putative L-xylulose-5-phosphate 3-epimerase 23 11 Op 6 . + CDS 22840 - 23541 1220 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 24 12 Tu 1 . + CDS 23592 - 24005 527 ## Sterm_0148 XRE family transcriptional regulator + Term 24228 - 24267 -1.0 - Term 23985 - 24025 2.1 25 13 Tu 1 . - CDS 24203 - 25936 1838 ## COG0018 Arginyl-tRNA synthetase - Prom 26033 - 26092 13.6 + Prom 25993 - 26052 9.2 26 14 Tu 1 . 
+ CDS 26264 - 26539 393 ## gi|262039674|ref|ZP_06012963.1| hypothetical protein HMPREF0554_0448 Predicted protein(s) >gi|261746283|gb|ADAD01000205.1| GENE 1 2 - 328 571 108 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0887 NR:ns ## KEGG: Lebu_0887 # Name: not_defined # Def: autotransporter beta-domain protein # Organism: L.buccalis # Pathway: not_defined # 1 108 2724 2831 2831 150 68.0 2e-35 DYYSVKPEVGIEFKYKQPMAVRTTFTTTLGLGYENELGRVGDVKNKGRVAYTDADWFNIR GEKDDRKGNFKADLNLGIENQRFGVTLNAGYDTKGKNIRGGLGFRVIY >gi|261746283|gb|ADAD01000205.1| GENE 2 508 - 1923 2231 471 aa, chain + ## HITS:1 COG:Cj0025c KEGG:ns NR:ns ## COG: Cj0025c COG1823 # Protein_GI_number: 15791424 # Func_class: R General function prediction only # Function: Predicted Na+/dicarboxylate symporter # Organism: Campylobacter jejuni # 25 462 21 458 461 211 33.0 2e-54 MYIKILNKIYLGGKMTSLVFSSIWIGILILLFFLLVHINVNLKKKFKFSVRVIISTVLGF AVGIVFQSTLSMVNAEQVPEVIKNVTTAASLVGRGFTSLLKMIVIPLVGVSVYNSIINSK NNEDLRKLTQKSVIYYSVTVAISAIIGISIAMLFNLGVGMKLPEGMEAWTGKGEYKGLVD VIVSFIPSNIFKAMTETSVIGVVIFAVFLGFATNRVSLKNPEKIKPLKDVTAALFAVLTS VTTTIIKIIPYGVAALMFDLTASYGLEVFKNLVTYVIVMFSAMLIVVIMQSLNLAIHGIS PIQYYKKAMAPLVLAFTTTSSMGTLPVTIETLEKGVGVSSGTANFTATLGTTIGMNACAG VFPGVVAVMIANMNGIAITPVFLITLVAVIVLGSFGMAGVPGTAYIAATVVLGGMGLPFD PVALILPIDSIIDMGRTMVNVNGAMTISTVVDKEMGTFSESILKEERTIEA >gi|261746283|gb|ADAD01000205.1| GENE 3 2054 - 2878 1132 274 aa, chain + ## HITS:1 COG:FN1702 KEGG:ns NR:ns ## COG: FN1702 COG1968 # Protein_GI_number: 19705023 # Func_class: V Defense mechanisms # Function: Uncharacterized bacitracin resistance protein # Organism: Fusobacterium nucleatum # 8 257 2 254 266 202 54.0 6e-52 MIFDILKVVILSLVEGLTEFIPVSSTGHMIIVEKFIKLSENKSFVNAFEIIIQLGAILAV VVYFWHKLWPFSKNLSKAQSRGIILKWIKIIIAVLPAVVLGLLFDDIIDKYLFKVTVVAG TLIFYGVVLIWIESGKRKGGEISKVKDIPISKVIGIGLFQCLAMVPGTSRSAVTIIGGVL LGLNRVLATEFSFFLAIPTMMGATLLKVIKMGTKMTGYEWFLIGLGFVLSFVFAYGVIKV FMDYIKKHDFKVFGYYRIILGIIVFALYFTGVIK >gi|261746283|gb|ADAD01000205.1| GENE 4 2882 - 4327 1923 481 aa, chain + ## HITS:1 
COG:FN1612 KEGG:ns NR:ns ## COG: FN1612 COG0635 # Protein_GI_number: 19704933 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Fusobacterium nucleatum # 1 466 2 460 469 379 47.0 1e-105 MINCNIKINENKLEEFVRVLLPEHYDILDGNSEQKIYINVTENLENIEVETISGDNKIRL RYEKIGQKYSDQGEVMAKTSLMKLFGKEKDYKWGTLIGVRPTKIVGRFLKMGLSYEKIRE ILKNIYLVSDEKVNLLINIVKRQIPYLDKETIGIYIGIAYCPTKCTYCSFPAYLLKGKYA ERYDEYYESLLKEIKEIGTLTKNLDLKISTIYIGGGTPSILSAKEIETLLDTVKSSFELG NLKEFTFEAGRIDTLNEKKLEIIKKSGVNKISINPQSFNERTLKLVNRYHNRKQFDSIYD IAKRIGLEINMDLILGLPGETTEDILYTLDEVKKYNPENLTVHNLAIKNASKLNEENYRH EIELDYEKIFEKLDKITESKKLYPYYMYRQKNSFQWGENLGYSVDGAESVYNIEMIEENK MIIGIGAGAITKLIWNEGEKNNIKRLVNPKDPLVWINELDDRLENKKKEITYVVENLIDK K >gi|261746283|gb|ADAD01000205.1| GENE 5 4359 - 4943 628 194 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2229 NR:ns ## KEGG: Lebu_2229 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 194 1 193 193 171 58.0 1e-41 MKIKYIIIFVILLVLGNFLRLFIEDHNMPDIKINEEISYQKENAKKENDLSKTKEKFDIN IVGYDELLKLGFPKSKAEKLITFRDETGIISDMEELKNISKFGEAGIKQAKKYLFVDKEK IKNPEQNYNGRTYTVFNINDADEERLKMIGFTKREIKKLLPEISKGNIHSNIDLEKIIGS ERYEKVEKRIRFSE >gi|261746283|gb|ADAD01000205.1| GENE 6 5262 - 6296 1618 344 aa, chain + ## HITS:1 COG:SP0627 KEGG:ns NR:ns ## COG: SP0627 COG1363 # Protein_GI_number: 15900534 # Func_class: G Carbohydrate transport and metabolism # Function: Cellulase M and related proteins # Organism: Streptococcus pneumoniae TIGR4 # 3 343 4 344 345 545 78.0 1e-155 MSTLKYLEKLIEIPSPTGYTREVTEYIVKEVRDMGYEAVKTNKGGVIVTLKGKDDTKHRA ITAHVDTLGAIVRAVKSDGRLKMDKIGGFPWNMIEGENCLVHVASTGKTVSGTILIHQTS THVYRDAGTAERTQDNMEVRLDEKVESKEETKALGIEVGDFISFDPRMTITESGFVKSRF LDDKVSVAILLNLLKIYKKENIELPNTTHFMFSCFEEVGHGANSSIPAEVTEYLAVDMGA MGDDQQTDEYTVSICVKDASGPYNYEFRQHLVKLAKENNIPFKLDIYPYYGSDASSAMRA GAEVKHALLGAGIESSHSYERTHKDSIEATERLVDVYLRNGLVK >gi|261746283|gb|ADAD01000205.1| GENE 7 6506 - 7537 1201 343 aa, chain + ## HITS:1 COG:SA1154 
KEGG:ns NR:ns ## COG: SA1154 COG2008 # Protein_GI_number: 15926897 # Func_class: E Amino acid transport and metabolism # Function: Threonine aldolase # Organism: Staphylococcus aureus N315 # 5 343 4 341 341 409 58.0 1e-114 MKLFFMNDYGKGAHPKVLQNLIDNNESVQVGYGFDELSSRAKEKIRKACKRNNAEIYFLT GGTQTNAVVINSLLRSYEGVISANTGHINVHEAGAIEITGHKVIELPHEDGKLTSKEVKE YLDAFHKDENKSHMVMPGMVYISFPTEYGTLYSKQELNSLYQLCQEYRIPLFIDGARLGY GLAAPECDMDLPTLAGLCDVFYIGGTKIGTLSGEAVVFSNPEIMPDNYVTIIKQMGALLA KGRLLGAQFDALFTDDLYFEISKNAIKTAHLLKQALKEKGYRFYIDSPTNQQFIIIDNDK MDELHKKVQFAYWEKYDENHTIIRFVTDWSTKEEDVRKLIELL >gi|261746283|gb|ADAD01000205.1| GENE 8 7739 - 8611 1122 290 aa, chain + ## HITS:1 COG:CAC0191 KEGG:ns NR:ns ## COG: CAC0191 COG1737 # Protein_GI_number: 15893484 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 10 266 7 258 283 160 38.0 2e-39 MSLYENYEHKIKEIYSELRKSEKKVADYILANKIKIEKMGLEEIAENSKVSTPTVIRFTK ALGYEGFKDFKTELLKSGRQNQNDYDNIDLLLDLHITKNDKLEDIPIKLVGLTIKALEET LKFLNYEIYEEAIKLITNANTIDIYGVGNSGSIGNDFASKLLRIGLNCRAYPDNHLQQLC ACHLGKKDLAIAISHSGETKDTVDALRIAKESGAKTLVLTNFKASIITKYADISLFTGDT ESTFYSETMSSRMSQLALVDMLYMGVLLSDYNKYAKRLDKINNLTIGKIY >gi|261746283|gb|ADAD01000205.1| GENE 9 8780 - 10039 1826 419 aa, chain + ## HITS:1 COG:no KEGG:CD0209 NR:ns ## KEGG: CD0209 # Name: not_defined # Def: putative sugar-phosphate kinase # Organism: C.difficile # Pathway: not_defined # 1 417 1 418 421 497 59.0 1e-139 MSKMTLTEVVKKGLKMKKEERASMLGIGPMSKTLIKASILLAKEKDFPLIFIASRNQVDS KELGGGYVCNWDQKAFSNAIKEIANEAGFDGLYYLCRDHGGPWQRDKERKDHLPEEEAMK LGKQSYIADLENGFDLLHIDPTKDPYIVGKVIDVNVVLRRTVELIEYVEKERIARKLPEI SYEVGTEETNGGLTSVESYEFFIQELLKELDKKGLPHPCFIVGQTGTLTRLTENVGHFNA KASKELSDVARKYEVGLKEHNGDYQDEAILLAHPALGITAMNVAPEYGTVETRAYLKLVE LEEMLFAEGIISKKSNLKNCIRKEAVASRRWEKWMTDDTVSKSTEELLRDENIINTITDI SGHYTFNNDNVKKEIENLFENLAEAGVNAEEYVIYKLKESLDRYVECFNLKGYTSRLKK >gi|261746283|gb|ADAD01000205.1| GENE 10 10220 - 10714 713 164 aa, chain + ## HITS:1 COG:BH0828_1 KEGG:ns NR:ns ## COG: BH0828_1 COG1762 
# Protein_GI_number: 15613391 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Bacillus halodurans # 1 151 1 144 165 85 34.0 3e-17 MVNVSDLINENLIYLNFDAADKEAVLSGLAKIISEEKKLCDEKYGEKEALEGYIGSLHER EETFSTAVGFSFAIPHGKCKYVKQACLAYARLNNEISWAEDESAKHIFMIGVSEENAGNE HLEILIKLSTAILEDDFREKLNKAETESEVMKLLKDYSSKERTV >gi|261746283|gb|ADAD01000205.1| GENE 11 10745 - 11062 647 105 aa, chain + ## HITS:1 COG:STM4113 KEGG:ns NR:ns ## COG: STM4113 COG1445 # Protein_GI_number: 16767378 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system fructose-specific component IIB # Organism: Salmonella typhimurium LT2 # 2 101 3 102 106 113 64.0 8e-26 MKIVGITACPSGVAHTYMAAEALKKAAIARGHEIKVETQGQIGLENEITQEEANAADVVV LTTDIGLKNTERFNGKPIVRIGISDLVKKAPALVEKIEEALKSRK >gi|261746283|gb|ADAD01000205.1| GENE 12 11128 - 12186 1859 352 aa, chain + ## HITS:1 COG:STM4112 KEGG:ns NR:ns ## COG: STM4112 COG1299 # Protein_GI_number: 16767377 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Salmonella typhimurium LT2 # 1 328 4 335 359 308 57.0 9e-84 MLGLLKNTKQHLMTGVSYMIPFVVAGGVLLALAVMISGKAAVPETGFLKNMSDIGIAGLT LFIPVLGGFIAYSMVDRPGIGPGMIGAYLANSKGGGFLGGIIAGLIAGIVVYYLKKIKVP KVMSSVMPIFIIPLVGTFIVGMIILLFIGEPIGAFMKGLEVWLSGMQNSSKIVLGIILGA MIAFDMGGPLNKTAFFFAVAMIPTNPTLMAAVATAVCTPPLGLALATFVAKNKFTVAEQE SGKAALIMGCIGITEGAIPFAAADPIRVLPSIMAGGSAAAVTSLLLGATNQAAWGGLIVL PVVTNRIGYIIAVIVGSVVTALVVAVLKKTQVNEVVVQENTKNEDLELDITF >gi|261746283|gb|ADAD01000205.1| GENE 13 12306 - 13073 823 255 aa, chain - ## HITS:1 COG:CAC0113 KEGG:ns NR:ns ## COG: CAC0113 COG1349 # Protein_GI_number: 15893409 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Clostridium acetobutylicum # 1 255 1 253 253 172 43.0 5e-43 
MFVSERLDEIIYLINQEGKVEVNKLSKKFNVSKDLIRKDLSKLEENGILERTYGGAVKKR KLAETVTITSRISKNLDTKQKIAQKALKLLKPKESIFLDISSINYILAHELIKNNWEITV ITNMIDIMHLISLNPDSKVKLIGIGGTCNNIIDGFVGISSINQINNFNVDRCFIGTIGIN TYNGNVSTYEQDDGLTKVAIMNTSKNKYLITEISKIDQDGKYNFSNLKMFDCLITDNNIS DLITKELKKYKIKIV >gi|261746283|gb|ADAD01000205.1| GENE 14 13280 - 15577 3015 765 aa, chain + ## HITS:1 COG:STM4114 KEGG:ns NR:ns ## COG: STM4114 COG1882 # Protein_GI_number: 16767379 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Salmonella typhimurium LT2 # 1 765 1 765 765 975 59.0 0 MNERVEFLKKELFKNKREISTERAVLYTQSHKETVGEPEIIRRAKATKNILEKMEISIRE KELIAGNRTIKPRSGIISPEMDPYWIMKEIDTMETRPQDQFVFTEKDKKIFTENLYPYWK DKSLKDSLNGIIPEYIRKAVNEKIVNLNQTDKGQGHIIPDFETVLKRGYGQIRNEVKEKL EKNPENEFYKALLITLEATISHIKRYEKLAREMALQEKDESRKKELQKIEEISGKVATEP AGTFLEALQLLWYTSIVLQMESNASSISLGRIDQYLYPYYKNDIEKGEDKEKLKEYLEAF YIKTNDVVLLRSEHSAKFFAGFPSGYTALLGGVNIYGQSAVNELSYLCLEAYHDIRLPQP NLGIRVNDIEPKKFIKKTCETIALGTGIPQLFNDEVVIPSFLSRGVSLEDARDYSVVGCV ELSIPGKTYGLHDIAMLNILKIMEKVLYNNENSENLTLEKINGEIKESISYYVKLMAEGS NIVDEGHKKYAPVPLLSTIVKDSLEKGKDITAGGARYNFSGVQGIGLPNLCDSLMIIKKF VFDEKKYTFKEVIEALKNNYEGNVYEKMKNEFISDETKYGNDIDEVDNISAEILRYYCKE VEKYKNPRGGIFIPGSYTVSAHIPLGEVVGATPDGRLSGEQLADGGLSPMFGRDIFGPTA VLKSLSKLDNVLLTNGSLLNVKLSPSAVKTEEGINKFVNFIYAYMKLKLTHIQFNVIGVE TLKKAQVEPEKYGNLVVRVAGYSAFFSELNKKIQDDIIHRTEHGL >gi|261746283|gb|ADAD01000205.1| GENE 15 15620 - 16399 932 259 aa, chain + ## HITS:1 COG:STM4115 KEGG:ns NR:ns ## COG: STM4115 COG1180 # Protein_GI_number: 16767380 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Salmonella typhimurium LT2 # 8 259 24 292 292 221 40.0 8e-58 MNNRKGLIFNIQRYSLNDGEGIRTIVFFKGCPLRCPWCSNPESQSFETEYMKSNVNGNIK TIGKWYTVDEIIKEALKDEVFFNTSGGGVTLSGGEVLAQGEFIEELLKELKENDINTAIE TCGYGNINVLKKILPYVDTVFFDLKITDNQKSKEIIKGDFDTIKKNFIESTKSNRVIPRV PYIPGYTDDIENIDEILNIVKKCNLKEIHVLPYHNYGMSKYEGLGREYELKNTEIPKKET MEKIKKYIENKGFKIIIGG 
>gi|261746283|gb|ADAD01000205.1| GENE 16 16401 - 17117 1015 238 aa, chain + ## HITS:1 COG:ECs0903 KEGG:ns NR:ns ## COG: ECs0903 COG0176 # Protein_GI_number: 15830157 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Escherichia coli O157:H7 # 1 234 25 244 244 165 36.0 8e-41 MEYFLDTADIEAIKRINEIFPLKGVTTNPSIIAKEKRDFKDIINDIYNIIGKEKVVHAQA VGSTADIIVKEVNLLRDTFGENIYAKIPVTFEGIKAMKILSKDGHKITATGILSSQQIIM AAEAGAEYMAPYINRSDNIGESGVEIVRDAYKILEMFKKSDEECLKRYKRVFELKILGAS FKNVRQVHETMLAGSKSVTISPDVFERLIYHPYTDWSMDTFNSDWEKIYGDKTLIDLL >gi|261746283|gb|ADAD01000205.1| GENE 17 17479 - 18540 1756 353 aa, chain + ## HITS:1 COG:STM4382 KEGG:ns NR:ns ## COG: STM4382 COG2220 # Protein_GI_number: 16767628 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Salmonella typhimurium LT2 # 1 353 1 354 354 492 62.0 1e-139 MAKLDEITRESWVLSTFPEWGTWLNEEIEETKVEQGTVAMWWLGNMGLWVKTEGNANICM DLWVATGKRSGKNKLMKPKHQHQRAVGCVALQPNLRTTPCVIDPFAIKDLDALLATHSHS DHIDQNVAAAVLKNCPEAKFIGPKTCTDIWRKWGVPEERLVTVRPGDELTIKDAKIKMLE SFDRTMLLTVAEDVVLKDKLPPDMDDMAVNYLIETTGGNIYNAGDSHHSNYFVKHGNENK VDVAFVGYGENPRGMTDKLTSSDVLRVAEELKTQVVIPLHHDIWSNFMADPKEITLLWNY RKDRMKYKFKPYIWQPGGKFVFPDNKDDMEYMYPRGFEDAFTIEPDLPFKSFL >gi|261746283|gb|ADAD01000205.1| GENE 18 18669 - 20177 2162 502 aa, chain + ## HITS:1 COG:SP2038 KEGG:ns NR:ns ## COG: SP2038 COG3037 # Protein_GI_number: 15901859 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pneumoniae TIGR4 # 1 497 1 481 485 511 57.0 1e-144 MNILLLIGTWFGKNILTKPEFFVGLLVFIGYLFMGKKIYEAVGGFIKATVGYMILNVGAG GLVTTFRPILVALKTKYNINAAVIDPYFGLQAAEAALKTIGQTTAAVMSALLVGFLLNIL LVLLRKFTKIRTLFITGHIMQQQAQTATWMLFFLFPQFRNVWGVLLIGIFSGVYWAVGSN LSVEPTQRLTGNAGFAIGHQQMFAVWVADKLAPKLGNPKKKLDDLKLPKWLSMLHDDIIA TGLIMIIFFGAIMLLLGPEFFTAKFGKCVVENGMQTCPVVNPNGVAAGAFDPKKLSFGTY VISTTLSFAVYLTILKTGVRMFVSELTLSFQGISNKLLPGSFPAVDCAATYGFGSPSAVL 
FGFLVGTIAQFISIAGLLIFKSPIFIITGFVPVFFDNATIAVFADKRGGARAAFILSALS GVLQVLCGAAAVMLFGLGEFGGWHGNIDQSTLWLAQGFIMKYLALPGYALVIILMLLIPQ IQYIKAKNKEQYYEGTVDLVEE >gi|261746283|gb|ADAD01000205.1| GENE 19 20265 - 20546 566 93 aa, chain + ## HITS:1 COG:SP2037 KEGG:ns NR:ns ## COG: SP2037 COG3414 # Protein_GI_number: 15901858 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIB component # Organism: Streptococcus pneumoniae TIGR4 # 1 89 1 90 93 84 54.0 4e-17 MLKVLAVCGSGMGTSMIMKMKVSQVLKKLNVDADVNSCSMGEAKSGLANYDLVLASTHII KDLKGGPNTKMVGLLNLLDANELEAKLKEVGIG >gi|261746283|gb|ADAD01000205.1| GENE 20 20598 - 21047 723 149 aa, chain + ## HITS:1 COG:SP2036 KEGG:ns NR:ns ## COG: SP2036 COG1762 # Protein_GI_number: 15901857 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Streptococcus pneumoniae TIGR4 # 1 146 1 146 161 146 47.0 2e-35 MTLLDSLKENNSVGLKKEAKTWEEAIEACMQPLLDKGTVQRKYVDAIIERTKELGPFYIL APGLAMPHERPEMGVNKDCFSFVTLKEPVTFPDGQEVDILIGLAATNTDIHNGEAIPQIV MLFDDDSVFDEIRAAKVPEDIYKMIESKL >gi|261746283|gb|ADAD01000205.1| GENE 21 21253 - 21915 1101 220 aa, chain + ## HITS:1 COG:SPy0177 KEGG:ns NR:ns ## COG: SPy0177 COG0269 # Protein_GI_number: 15674382 # Func_class: G Carbohydrate transport and metabolism # Function: 3-hexulose-6-phosphate synthase and related proteins # Organism: Streptococcus pyogenes M1 GAS # 4 219 5 220 220 369 88.0 1e-102 MARPLLQVALDHSDLQGAIKAAVSVGHEVDVIEAGTVCLLQVGSELVEVLRSLFPEKIIV ADTKCADAGGTVAKNNAVRGADWMTCICSATIPTMKAALKAIKEVRGDRGEIQVELYGDW TYEQAQQWLDAGISQAIYHQSRDALLAGETWGEKDLNKVKKLIEMGFRVSVTGGLSTETL KLFKGVDVFTFIAGRGITEAADPAQAARDFKAEIAKYWAE >gi|261746283|gb|ADAD01000205.1| GENE 22 21965 - 22840 1276 291 aa, chain + ## HITS:1 COG:sgbU KEGG:ns NR:ns ## COG: sgbU COG3623 # Protein_GI_number: 16131453 # Func_class: G Carbohydrate transport and metabolism # Function: Putative 
L-xylulose-5-phosphate 3-epimerase # Organism: Escherichia coli K12 # 9 291 17 296 297 341 58.0 8e-94 MKDLNKLNLGIYEKALPKDIDWIERIKLVKECEYDFVEMSVDETDERLARLDWSDEEINK IHEALVNTGVRIPSMCFSGHRRFPMGSMDEKTREKAMEFMQKAIIFADKLGIRTIQMAGY DVYYEEGSEQTKKYFTENLKKAVEWASSYNITLAIEIMDHPFINSITKYMEYSEIIKSPW LKVYPDVGNLTAWPENDTLKELELGIKQGEIVAIHLKDTLAVTDTFPGKFKEVPFGEGCV DFPKVFAKLKELNYKGPFLIEMWTEKSDNPIEEVKKAKEWMLDKMKKGGFI >gi|261746283|gb|ADAD01000205.1| GENE 23 22840 - 23541 1220 233 aa, chain + ## HITS:1 COG:VCA0244 KEGG:ns NR:ns ## COG: VCA0244 COG0235 # Protein_GI_number: 15601012 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Vibrio cholerae # 1 231 1 230 230 333 67.0 2e-91 MLEKLKEEVYKANMELPAKGLVLFTWGNVSAIDREKGLVVIKPSGVDYDKMKAEDMVVVD LDGKVVEGELNPSSDTPTHVELYKKFPNIGGIVHTHSTNATIWAQSGRDIPAYGTTHADY FYGPIPCTRKMTPEEIKGEYEKETGTVIIETFEKRNIDPKFVPAVVVHSHGPFTWGKNAA EAVYNSVVLEELSKMAIYTEQINKDIKPMQQELLDKHFLRKHGENAYYGQKKK >gi|261746283|gb|ADAD01000205.1| GENE 24 23592 - 24005 527 137 aa, chain + ## HITS:1 COG:no KEGG:Sterm_0148 NR:ns ## KEGG: Sterm_0148 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: S.termitidis # Pathway: not_defined # 3 135 2 134 134 153 67.0 3e-36 MDEKQDIKTNPNLVIALGYYIKNKRLQKNIGLREMADLLKISPAYLSNLESGKHSMTNPL LLKKISKILEVDHLKLFKIIGYTDKDMSDLKKEIMSELVEEISDIEIGKIIEELMKMEPE KVFLVKQYIELLNNKNA >gi|261746283|gb|ADAD01000205.1| GENE 25 24203 - 25936 1838 577 aa, chain - ## HITS:1 COG:FN0506 KEGG:ns NR:ns ## COG: FN0506 COG0018 # Protein_GI_number: 19703841 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Arginyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 577 1 569 569 635 57.0 0 MELLNIQLKKLFEKNIQNIFKEDFSDKIDIQNSTKKEFGDFQTNFAMVTSKILGKNPREI ANEIIENFEKNDIIEKMEIAGPGFINIYLKNSFLDNETKKIGNEKYDFSFLDSDKTVIID YSSPNIAKRMHVGHLRSTIIGDSLKRILQFLGFKKVLGDNHVGDWGTQFGKLIVGYNLWL NREAYEKDPIEELERIYVMFSDEAKKDPSLEDVAREELRKLQSGDEVNNALWREFIDISI 
KEYNKIYKRFDITFDYYNGESFYNDLMPVVLEELKEKNIAKQDEDALVVFFGEDTKLHPC IVQKKDGSFLYSTSDLATIKYRKDNLDVDLAIYVVDERQQDHFKQVFSIAKMIGAPYDYD KVHVWFGIMRFANGVILSTRGGNVIRLIDLLDEAKKHVKKVIDEKNPDIPEPEKEIIADI VGTGAIKYFDLSQNRTSPILFDWEKVLSFEGNTGPYLQYTYVRIMSILRKMESENISINK NGNIIFDNMQDVERELAVALLRFPQVVVKSYESYRPNVIADYLFELAKLFNNFYNSKPIL KEENAETMQARILLALKTAEILKQGLSLLGIQTVDRM >gi|261746283|gb|ADAD01000205.1| GENE 26 26264 - 26539 393 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039674|ref|ZP_06012963.1| ## NR: gi|262039674|ref|ZP_06012963.1| hypothetical protein HMPREF0554_0448 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_0448 [Leptotrichia goodfellowii F0264] # 13 91 13 91 91 110 100.0 4e-23 MSNNYRNSYRNNHKKGISSGTKNVILLFIFSVILTGASVFFMQDKIPALKSKVEMLKAEK QKKEDELKTVESTLAEKEKSYNELKSKVDKK Prediction of potential genes in microbial genomes Time: Thu Jul 14 04:00:49 2011 Seq name: gi|261746279|gb|ADAD01000206.1| Leptotrichia goodfellowii F0264 contig00132, whole genome shotgun sequence Length of sequence - 977 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 364 464 ## FN2115 hypothetical protein + Prom 404 - 463 4.7 2 2 Tu 1 . 
+ CDS 527 - 916 415 ## gi|262039528|ref|ZP_06012828.1| hypothetical protein HMPREF0554_2462 Predicted protein(s) >gi|261746279|gb|ADAD01000206.1| GENE 1 2 - 364 464 120 aa, chain + ## HITS:1 COG:no KEGG:FN2115 NR:ns ## KEGG: FN2115 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 2 119 30 150 151 75 41.0 1e-12 EKEATKKDGPGAVSRDKYRGGVEEIVKDISKRPINKRVQFEGVALIIPEGTVINSKQGNI IDMKTGYGIPITFSGSSSGCIAKKVKENIDYGIFYNKTIPEISKIAQKIININGFKNTCN >gi|261746279|gb|ADAD01000206.1| GENE 2 527 - 916 415 129 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262039528|ref|ZP_06012828.1| ## NR: gi|262039528|ref|ZP_06012828.1| hypothetical protein HMPREF0554_2462 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2462 [Leptotrichia goodfellowii F0264] # 8 129 679 800 800 220 97.0 2e-56 MGSINSNVYESAEISEQIYNKFEEYKSANNTRKEEILGEVIELYKKDQALKLDLAMNGPA ILDNSPFLLKTRDEYNRAELIREYNAGRYQDKGLPGIMNVGMSVEEKRKELKVSDITSYL RKLRQGVER Prediction of potential genes in microbial genomes Time: Thu Jul 14 04:01:01 2011 Seq name: gi|261746277|gb|ADAD01000207.1| Leptotrichia goodfellowii F0264 contig00164, whole genome shotgun sequence Length of sequence - 3675 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 3675 4644 ## COG3210 Large exoproteins involved in heme utilization or adhesion

Predicted protein(s)
>gi|261746277|gb|ADAD01000207.1| GENE 1 3 - 3675 4644 1224 aa, chain - ## HITS:1 COG:FN1817 KEGG:ns NR:ns
## COG: FN1817 COG3210 # Protein_GI_number: 19705122 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum
# 7 1212 659 1981 2806 502 36.0 1e-141
EININKNIINNNEIASNDLTLKTENLINNKNITAGKANITATKDTTNNGLINGNDITVSS
QNIINNKKIEADKLTTKATGITENKGELSANEYTSTTTDFVNGNIVSVGKGTINATNTIR
NDNKINANELTATSKTFVNNKKVETGKGNITATENIENNDLIAANDLTLKGKNIINAKGK
TIFTTNELTANASESIVNNGAEILSQGKINLTAATVDNKVGKIKGSGLVDITANVVNNIG
KAGDLTKYKKYWEAWDGTIYTDETKMMQRNQPGSWERNYYADPSNEGDKWQVFHNWLTGL
KKPEEPSYIAKNMEDIVKNRMRANEDGIMLNSAEFPIIPLVNKIESEAQTEFAVISGGDV
KITSTGEVNNIDGIISSDGLTKVDAPKIVNRVTIDTNNPVQLRDGMENLKWWYHKSKHSK
KSTYYVGYYRFLSPTIRTAYVAGQPSVIEGKILSVDTSKIVGESYEEAKGKLSTNPTYSK
NVDKQNIEIVKNISGITEVKNTGIIPINISGINTGNTGISQRNGINIGSLYVPSTDPSSR
YVIENRTEFITRGNYYGGDYFLSRMGYVEDWDRVKLLGDAYYDNMLIEQTLTEKLGTRYI
NGLSGEELTKQLIDNATTAKKDLQLRQGVALTKDQINNLKNDIIWYEYETVNGKKTLVPK
IYLSKATIEKLEVDGRSKMYGTDYTLVNVKGDYENKGLKIGSTTGVTLVKANKVRNETVA
NERAEITGRQVRVTALEGNIENVGGKIVSVERTELIAKNGDIINNGTKVTKGYDLGENFR
SKYEELGNIGEISGPAVYLEANNYNSTGGALATKNLELQLTGNINENALKLNGDDRFGQN
SSNYQTYKSKEHIGSGIVAEKATGKVGDINLKGSAFVVEDGKGLKVGNVKAESLVNEYDV
EKRTKEKGILSKRESLTQSHTEENVSSNFKIGENADIKGTVTGIGSNIYIGDNSYVGGKV
TTDSRELHNTYYNEEKKSGFSASAKGTSVSAGYGKSKNTYDEKSTINAKSSLHVGNNTVL
NNGASITATDFEHGKIEINNGDVTYGARKDTRDVSTSSKSSYIGVTASVKSPALDRVKQA
KEAVDQIKKGDTVGGAVNAVNFVTGTVSGMRGNITRPDGGRATRKDIEAGNYKSNNDFYV
QGSVNVGFNTSKSETKSHEESAVVTTIKGIDGNSSITYNNVKNINYIGTQAKDTKFIYNN
VENINKEAVELHKSYESKSKGFGI

Prediction of potential genes in microbial genomes
Time: Thu Jul 14 04:01:01 2011
Seq name: gi|261746276|gb|ADAD01000208.1| Leptotrichia goodfellowii F0264 contig00196, whole genome shotgun sequence
Length of sequence - 268 bp
Number of predicted genes - 1,
with homology - 1
Number of transcription units - 1, operones - 0 average op.length - 0.0

N Tu/Op Conserved pairs(N/Pv) S Start End Score
1 1 Tu 1 . - CDS 2 - 266 433 ## gi|262037843|ref|ZP_06011277.1| hypothetical protein HMPREF0554_2290

Predicted protein(s)
>gi|261746276|gb|ADAD01000208.1| GENE 1 2 - 266 433 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262037843|ref|ZP_06011277.1|
## NR: gi|262037843|ref|ZP_06011277.1| hypothetical protein HMPREF0554_2290 [Leptotrichia goodfellowii F0264] hypothetical protein HMPREF0554_2290 [Leptotrichia goodfellowii F0264]
# 1 88 7 94 627 140 92.0 3e-32
IKFDSEGKLKVEGVDVQTAGHVYLRGKKGVEVLPGVENSLREEEHKISGIKGSISVSWGG
VSAGIGYGKSSDKIKEVTKEIIANKLQA